Kai Yu
Authored papers
Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders
arXiv 2026
HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer
arXiv 2025
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
arXiv 2025
NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering
arXiv 2025
URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models
arXiv 2025
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
arXiv 2024
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
arXiv 2024
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training
arXiv 2024
MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation
arXiv 2024
FakeSound: Deepfake General Audio Detection
arXiv 2024
UrFound: Towards Universal Retinal Foundation Models via Knowledge-Guided Masked Modeling
arXiv 2024
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding
arXiv 2024
A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds
arXiv 2024
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
arXiv 2024
Large Language Models Are Semi-Parametric Reinforcement Learning Agents
SciEval: A Multi-Level Large Language Model Evaluation Benchmark for Scientific Research
arXiv 2023
CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset
arXiv 2023
Towards Instance-adaptive Inference for Federated Learning
ICCV 2023
DiffDub: Person-generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-encoder
arXiv 2023
Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning
ICCV 2023
Mobile-Env: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction
arXiv 2023
Frequent co-authors
10 from 21 papers