Wei Chen
- Papers: 46
Authored papers (46)
MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources
arXiv 2026
Switch-KD: Visual-Switch Knowledge Distillation for Vision-Language Models
arXiv 2026
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation
arXiv 2026
Thyme: Think Beyond Images
arXiv 2025
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
arXiv 2025
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
arXiv 2025
MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning
arXiv 2025
R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
arXiv 2025
Kwai Keye-VL 1.5 Technical Report
arXiv 2025
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning
arXiv 2025
Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model
arXiv 2025
Ark: An Open-source Python-based Framework for Robot Learning
arXiv 2025
Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph
arXiv 2024
FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing
arXiv 2024
LDM: Large Tensorial SDF Model for Textured Mesh Generation
arXiv 2024
FaceVid-1K: A Large-Scale High-Quality Multiracial Human Face Video Dataset
arXiv 2024
Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks
arXiv 2024
The NPU-ASLP-LiAuto System Description for Visual Speech Recognition in CNVSRC 2023
arXiv 2024
MC-CoT: A Modular Collaborative CoT Framework for Zero-shot Medical-VQA with LLM and MLLM Integration
arXiv 2024
STKDRec: Spatial-Temporal Knowledge Distillation for Takeaway Recommendation
arXiv 2024
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
arXiv 2024
Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark
arXiv 2024
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
arXiv 2024
Can Graph Learning Improve Planning in LLM-based Agents?
arXiv 2024
decoupleQ: Towards 2-bit Post-Training Uniform Quantization via decoupling Parameters into Integer and Floating Points
arXiv 2024
DreamLIP: Language-Image Pre-training with Long Captions
arXiv 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
arXiv 2024
PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
arXiv 2024
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator
arXiv 2024
On-Device Language Models: A Comprehensive Review
arXiv 2024
DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning
arXiv 2023
Large Language Models for Generative Information Extraction: A Survey
arXiv 2023
DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation
arXiv 2023
MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding
arXiv 2023
UrbanCLIP: Learning Text-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining from the Web
arXiv 2023
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification
arXiv 2023
BAFFLE: A Baseline of Backpropagation-Free Federated Learning
arXiv 2023
Towards Enhancing Relational Rules for Knowledge Graph Link Prediction
arXiv 2023
Topic-oriented Adversarial Attacks against Black-box Neural Ranking Models
arXiv 2023
DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services
arXiv 2023
Combinatorial Bandits for Maximum Value Reward Function under Max Value-Index Feedback
arXiv 2023
TO-FLOW: Efficient Continuous Normalizing Flows with Temporal Optimization adjoint with Moving Speed
arXiv 2022
Elucidation of Relaxation Dynamics Beyond Equilibrium Through AI-informed X-ray Photon Correlation Spectroscopy
arXiv 2022
Certified Robustness to Word Substitution Ranking Attack for Neural Ranking Models
arXiv 2022
CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation
Findings (ACL) 2021 8
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
arXiv 2020
Frequent co-authors
10 from 46 papers