Zhiyong Wu
- Papers: 37
Authored papers (37)
LeVo: High-Quality Song Generation with Multi-Preference Alignment
arXiv 2025
Seed1.5-VL Technical Report
arXiv 2025
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
arXiv 2025
Advances in Speech Separation: Techniques, Challenges, and Future Trends
arXiv 2025
$φ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation
arXiv 2025
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
arXiv 2025
Implicit Search via Discrete Diffusion: A Study on Chess
arXiv 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows
arXiv 2025
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
arXiv 2024
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
arXiv 2024
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
arXiv 2024
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
arXiv 2024
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond
arXiv 2024
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models
arXiv 2024
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
arXiv 2024
Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis
arXiv 2024
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
arXiv 2024
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation
arXiv 2024
Foundation Models for Music: A Survey
arXiv 2024
SpeechCraft: A Fine-grained Expressive Speech Dataset with Natural Language Description
arXiv 2024
MuCodec: Ultra Low-Bitrate Music Codec
arXiv 2024
SCNet: Sparse Compression Network for Music Source Separation
arXiv 2024
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling
arXiv 2023
UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons
arXiv 2023
Can We Edit Factual Knowledge by In-Context Learning?
arXiv 2023
Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
arXiv 2023
AdaMesh: Personalized Facial Expressions and Head Poses for Adaptive Speech-Driven 3D Facial Animation
arXiv 2023
Compositional Exemplars for In-context Learning
arXiv 2023
In-Context Learning with Many Demonstration Examples
arXiv 2023
How Vocabulary Sharing Facilitates Multilingualism in LLaMA?
arXiv 2023
DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
arXiv 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
arXiv 2022
A Survey on In-context Learning
arXiv 2022
NeuFA: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism
arXiv 2022
Lexical Knowledge Internalization for Neural Dialog Generation
ACL 2022 5
Affiliations
Frequent co-authors
10 from 37 papers