ZiHao Wang
- Papers: 43
Authored papers (43)
Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation
arXiv 2026
FeatCal: Feature Calibration for Post-Merging Models
arXiv 2026
Do Reasoning Models Enhance Embedding Models?
arXiv 2026
YuE: Scaling Open Foundation Models for Long-Form Music Generation
arXiv 2025
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
arXiv 2025
Seed1.5-VL Technical Report
arXiv 2025
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
arXiv 2025
Enhancing Transformers for Generalizable First-Order Logical Entailment
arXiv 2025
Open-World Skill Discovery from Unsegmented Demonstrations
arXiv 2025
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
arXiv 2025
SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model
arXiv 2025
ACE: Anti-Editing Concept Erasure in Text-to-Image Models
CVPR 2025 1
UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering
arXiv 2025
Diff-V2M: A Hierarchical Conditional Diffusion Model with Explicit Rhythmic Modeling for Video-to-Music Generation
arXiv 2025
LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning
arXiv 2025
Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training
arXiv 2025
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
arXiv 2025
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
arXiv 2025
UQ: Assessing Language Models on Unsolved Questions
arXiv 2025
LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
arXiv 2025
Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models
ICCV 2025
Generative Evaluation of Complex Reasoning in Large Language Models
arXiv 2025
Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects
arXiv 2024
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
arXiv 2024
Towards Foundation Model for Chemical Reactor Modeling: Meta-Learning with Physics-Informed Adaptation
arXiv 2024
LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing
arXiv 2024
A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis
arXiv 2024
Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models
arXiv 2024
Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks
arXiv 2024
ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting
CVPR 2025 1
SaMoye: Zero-shot Singing Voice Conversion Model Based on Feature Disentanglement and Enhancement
arXiv 2024
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget
arXiv 2024
MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music
arXiv 2024
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
arXiv 2023
Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors
arXiv 2023
Towards Evaluating Generalist Agents: An Automated Benchmark in Open World
arXiv 2023
GROOT: Learning to Follow Instructions by Watching Gameplay Videos
arXiv 2023
Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction
CVPR 2023 1
Real-Time Machine-Learning-Based Optimization Using Input Convex Long Short-Term Memory Network
arXiv 2023
Integrating Knowledge Graph embedding and pretrained Language Models in Hypercomplex Spaces
arXiv 2022
Upper Limb Movement Recognition utilising EEG and EMG Signals for Rehabilitative Robotics
arXiv 2022
Frequent co-authors
10 from 43 papers