Wenwei Zhang

Papers
33

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

33

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

ICCV 2025

2025

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

arXiv 2025

2025

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

arXiv 2025

2025

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

arXiv 2025

2025

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

arXiv 2025

2025

Pre-Trained Policy Discriminators are General Reward Models

arXiv 2025

2025

Rethinking Verification for LLM Code Generation: From Generation to Testing

arXiv 2025

2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

arXiv 2025

2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

2025

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

ICCV 2025

2025

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

arXiv 2024

2024

Are Your LLMs Capable of Stable Reasoning?

arXiv 2024

2024

OMG-Seg: Is One Model Good Enough For All Segmentation?

CVPR 2024 1

2024

F-LMM: Grounding Frozen Large Multimodal Models

CVPR 2025 1

2024

Can AI Assistants Know What They Don't Know?

arXiv 2024

2024

CriticEval: Evaluating Large Language Model as Critic

arXiv 2024

2024

4D Contrastive Superflows are Dense 3D Representation Learners

arXiv 2024

2024

AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data

arXiv 2024

2024

InternLM-Law: An Open Source Chinese Legal Large Language Model

arXiv 2024

2024

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin

arXiv 2024

2024

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

arXiv 2024

2024

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

arXiv 2023

2023

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

arXiv 2023

2023

MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

arXiv 2023

2023

Evaluating Hallucinations in Chinese Large Language Models

arXiv 2023

2023

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

CVPR 2023 1

2023

CLIM: Contrastive Language-Image Mosaic for Region Representation

arXiv 2023

2023

Fake Alignment: Are LLMs Really Aligned Well?

arXiv 2023

2023

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

NeurIPS 2023 11

2023

OV-PARTS: Towards Open-Vocabulary Part Segmentation

NeurIPS 2023 11

2023

T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step

arXiv 2023

2023

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

arXiv 2023

2023

RTMDet: An Empirical Study of Designing Real-Time Object Detectors

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 33 papers