Xiaohan Wang

Papers
24

Attribution

Affiliations & profile
Semantic Scholar
24 papers

Authored papers

24

V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think

arXiv 2026

2026

ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning

arXiv 2026

2026

Tool Verification for Test-Time Reinforcement Learning

arXiv 2026

2026

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

preprint

2025

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

CVPR 2025 1

2025

Video Action Differencing

arXiv 2025

2025

Temporal Preference Optimization for Long-Form Video Understanding

arXiv 2025

2025

UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist

arXiv 2025

2025

Transductive Visual Programming: Evolving Tool Libraries from Experience for Spatial Reasoning

arXiv 2025

2025

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

arXiv 2025

2025

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

CVPR 2025 1

2025

DeepSeek-V3 Technical Report

arXiv 2024

2024

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

arXiv 2024

2024

Why are Visually-Grounded Language Models Bad at Image Classification?

arXiv 2024

2024

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

arXiv 2024

2024

How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?

arXiv 2023

2023

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

arXiv 2023

2023

Whitening-based Contrastive Learning of Sentence Embeddings

arXiv 2023

2023

LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities

arXiv 2023

2023

Describing Differences in Image Sets with Natural Language

CVPR 2024 1

2023

Bird's-Eye-View Scene Graph for Vision-Language Navigation

ICCV 2023 1

2023

Clustering based Point Cloud Representation Learning for 3D Analysis

ICCV 2023 1

2023

JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery

ICCV 2023 1

2023

CenterCLIP: Token Clustering for Efficient Text-Video Retrieval

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 24 papers