Wenwei Zhang

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

arXiv 2025

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

arXiv 2025

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

arXiv 2025

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

arXiv 2025

Pre-Trained Policy Discriminators are General Reward Models

arXiv 2025

Rethinking Verification for LLM Code Generation: From Generation to Testing

arXiv 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

ICCV 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

arXiv 2025

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

arXiv 2024

Are Your LLMs Capable of Stable Reasoning?

arXiv 2024

OMG-Seg: Is One Model Good Enough For All Segmentation?

CVPR 2024

CriticEval: Evaluating Large Language Model as Critic

arXiv 2024

4D Contrastive Superflows are Dense 3D Representation Learners

arXiv 2024

AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

arXiv 2024

InternLM-Law: An Open Source Chinese Legal Large Language Model

arXiv 2024

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

arXiv 2024

F-LMM: Grounding Frozen Large Multimodal Models

CVPR 2025

Can AI Assistants Know What They Don't Know?

arXiv 2024

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

arXiv 2024

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

arXiv 2023

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

arXiv 2023

MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

arXiv 2023

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

NeurIPS 2023

OV-PARTS: Towards Open-Vocabulary Part Segmentation

NeurIPS 2023

T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

arXiv 2023

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

CVPR 2023

CLIM: Contrastive Language-Image Mosaic for Region Representation

arXiv 2023

Fake Alignment: Are LLMs Really Aligned Well?

arXiv 2023

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

arXiv 2023

Evaluating Hallucinations in Chinese Large Language Models

arXiv 2023