Yi-Fan Zhang

Papers: 23

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

arXiv 2026

2026

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

arXiv 2026

2026

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

arXiv 2026

2026

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

arXiv 2026

2026

Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling

arXiv 2026

2026

Can MLLMs Read Students' Minds? Unpacking Multimodal Error Analysis in Handwritten Math

arXiv 2026

2026

VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting

arXiv 2025

2025

Aligning Multimodal LLM with Human Preference: A Survey

arXiv 2025

2025

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

arXiv 2025

2025

MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

arXiv 2025

2025

Thyme: Think Beyond Images

arXiv 2025

2025

Kwai Keye-VL 1.5 Technical Report

arXiv 2025

2025

OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

arXiv 2025

2025

RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

arXiv 2025

2025

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

arXiv 2025

2025

MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models

arXiv 2025

2025

Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

arXiv 2024

2024

Debiasing Multimodal Large Language Models

arXiv 2024

2024

OneNet: Enhancing Time Series Forecasting Models under Concept Drift by Online Ensembling

2023

AdaNPC: Exploring Non-Parametric Classifier for Test-Time Adaptation

arXiv 2023

2023

Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation

arXiv 2022

2022

Evaluating Step-by-Step Reasoning through Symbolic Verification

arXiv 2022

2022

Towards Principled Disentanglement for Domain Generalization

CVPR 2022

2021

Affiliations

No known affiliations.

Frequent co-authors

from 23 papers

Chaoyou Fu

Liang Wang

Zhang Zhang

Yang Shi

Qingsong Wen

Tieniu Tan

Yuanxing Zhang

Haotian Wang

Pengfei Wan

Rong Jin