Xiyao Wang
- Papers: 12
Authored papers (12)
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
arXiv 2025
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training
arXiv 2025
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
arXiv 2025
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning
arXiv 2025
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
arXiv 2025
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
arXiv 2025
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
arXiv 2024
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss
arXiv 2024
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension
ICCV 2025
TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning
arXiv 2023
DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization
arXiv 2023
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy
arXiv 2022
Frequent co-authors
10 from 12 papers