Wenhao Chai

Papers
28

Authored papers

28

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

arXiv 2026

2026

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

arXiv 2026

2026

Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling

arXiv 2026

2026

BabyVision: Visual Reasoning Beyond Language

arXiv 2026

2026

FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale

arXiv 2026

2026

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

arXiv 2025

2026

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

arXiv 2025

2025

Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark

arXiv 2025

2025

Science-T2I: Addressing Scientific Illusions in Image Synthesis

CVPR 2025

2025

TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action

arXiv 2025

2025

FrontierCS: Evolving Challenges for Evolving Intelligence

arXiv 2025

2025

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

arXiv 2025

2025

VideoNSA: Native Sparse Attention Scales Video Understanding

arXiv 2025

2025

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

arXiv 2025

2025

An Empirical Study of GPT-4o Image Generation Capabilities

arXiv 2025

2025

EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments

arXiv 2025

2025

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

arXiv 2025

2025

Next-Embedding Prediction Makes Strong Vision Learners

arXiv 2025

2025

SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory

2024

MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection

CVPR 2025

2024

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

CVPR 2024

2023

Five A$^{+}$ Network: You Only Need 9K Parameters for Underwater Image Enhancement

arXiv 2023

2023

DiffFashion: Reference-based Fashion Design with Structure-aware Transfer by Diffusion Models

arXiv 2023

2023

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

arXiv 2023

2023

Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation

ICCV 2023

2023

StableVideo: Text-driven Consistency-aware Diffusion Video Editing

ICCV 2023

2023

PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation

arXiv 2023

2023

Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 28 papers