Yike Guo
- Papers: 25
Authored papers (25)
UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass
arXiv 2026
Learning While Staying Curious: Entropy-Preserving Supervised Fine-Tuning via Adaptive Self-Distillation for Large Reasoning Models
arXiv 2026
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation
arXiv 2026
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing
arXiv 2026
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
arXiv 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens
arXiv 2025
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis
arXiv 2025
Audio-FLAN: A Preliminary Release
arXiv 2025
YuE: Scaling Open Foundation Models for Long-Form Music Generation
arXiv 2025
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging
arXiv 2025
ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning
arXiv 2025
ChatMusician: Understanding and Generating Music Intrinsically with LLM
arXiv 2024
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
arXiv 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
CVPR 2025
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
arXiv 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
arXiv 2024
ComposerX: Multi-Agent Symbolic Music Composition with LLMs
arXiv 2024
Discovering symbolic expressions with parallelized tree search
arXiv 2024
You Know What I'm Saying: Jailbreak Attack via Implicit Reference
arXiv 2024
Importance Weighting Can Help Large Language Models Self-Improve
arXiv 2024
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
arXiv 2023
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
arXiv 2023
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
arXiv 2023
A Survey of Reasoning with Foundation Models
arXiv 2023
Label Dependent Attention Model for Disease Risk Prediction Using Multimodal Electronic Health Records
arXiv 2022
Frequent co-authors
10 from 25 papers