Zhaoyang Liu
- Papers: 17
Authored papers
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent
arXiv 2026
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing
arXiv 2026
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
arXiv 2025
Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution
arXiv 2025
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
arXiv 2025
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents
arXiv 2025
CuES: A Curiosity-driven and Environment-grounded Synthesis Framework for Agentic RL
arXiv 2025
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
arXiv 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv 2025
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows
arXiv 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
arXiv 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
CVPR 2025
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
arXiv 2024
Paths of A Million People: Extracting Life Trajectories from Wikipedia
arXiv 2024
Data-Juicer: A One-Stop Data Processing System for Large Language Models
arXiv 2023
ControlLLM: Augment Language Models with Tools by Searching on Graphs
arXiv 2023
MotionBERT: A Unified Perspective on Learning Human Motion Representations
ICCV 2023
Affiliations
Frequent co-authors
10 from 17 papers