Jieyu Zhang
Authored papers
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
arXiv 2026
WildDet3D: Scaling Promptable 3D Detection in the Wild
arXiv 2026
Theory of Space: Can Foundation Models Construct Spatial Beliefs through Active Exploration?
arXiv 2026
Video-Based Reward Modeling for Computer-Use Agents
arXiv 2026
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems
arXiv 2025
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective
arXiv 2025
SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning
arXiv 2025
Spatial Mental Modeling from Limited Views
arXiv 2025
MolmoAct: Action Reasoning Models that can Reason in Space
arXiv 2025
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base
arXiv 2025
Adaptive In-conversation Team Building for Language Model Agents
arXiv 2024
TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
arXiv 2024
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
arXiv 2024
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
arXiv 2024
Template Matters: Understanding the Role of Instruction Templates in Multimodal Language Model Evaluation and Training
arXiv 2024
DataComp: In search of the next generation of multimodal datasets
NeurIPS 2023
SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality
EcoAssistant: Using LLM Assistant More Affordably and Accurately
arXiv 2023
When to Learn What: Model-Adaptive Data Augmentation Curriculum
ICCV 2023
Subclass-balancing Contrastive Learning for Long-tailed Recognition
ICCV 2023
Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
arXiv 2023
A Survey on Programmatic Weak Supervision
arXiv 2022
WRENCH: A Comprehensive Benchmark for Weak Supervision
arXiv 2021
Frequent co-authors
10 from 24 papers