Yi Zhang

Papers
32

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

32

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

arXiv 2026

2026

Code2Worlds: Empowering Coding LLMs for 4D World Generation

arXiv 2026

2026

LongCat-Flash-Thinking-2601 Technical Report

arXiv 2026

2026

SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?

arXiv 2026

2026

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

arXiv 2026

2026

Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation

arXiv 2026

2026

Grid2Matrix: Revealing Digital Agnosia in Vision-Language Models

arXiv 2026

2026

CLARE: Continual Learning for Vision-Language-Action Models via Autonomous Adapter Routing and Expansion

arXiv 2026

2026

DreamO: A Unified Framework for Image Customization

arXiv 2025

2025

SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

arXiv 2025

2025

Quadratic Interest Network for Multimodal Click-Through Rate Prediction

arXiv 2025

2025

Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing

arXiv 2025

2025

Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset

arXiv 2025

2025

M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis

arXiv 2025

2025

Qwen3Guard Technical Report

arXiv 2025

2025

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

arXiv 2025

2025

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning

arXiv 2025

2025

GLM-TTS Technical Report

arXiv 2025

2025

Language Representations Can be What Recommenders Need: Findings and Potentials

arXiv 2024

2024

CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models

arXiv 2024

2024

Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding

arXiv 2024

2024

Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology

CVPR 2024 1

2024

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

arXiv 2024

2024

Right this way: Can VLMs Guide Us to See More to Answer Questions?

arXiv 2024

2024

3D-Aware Neural Body Fitting for Occlusion Robust 3D Human Pose Estimation

ICCV 2023 1

2023

NatCS: Eliciting Natural Customer Support Dialogues

arXiv 2023

2023

PairingNet: A Learning-based Pair-searching and -matching Network for Image Fragments

arXiv 2023

2023

Clinical Prompt Learning with Frozen Language Models

arXiv 2022

2022

Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue Systems

arXiv 2022

2022

The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation

ICCV 2023 1

2022

What Makes Convolutional Models Great on Long Sequence Modeling?

arXiv 2022

2022

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

ACL 2022 5

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 32 papers