ZiHao Wang

Papers
43

Authored papers

43

Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation

arXiv 2026

2026

FeatCal: Feature Calibration for Post-Merging Models

arXiv 2026

2026

Do Reasoning Models Enhance Embedding Models?

arXiv 2026

2026

YuE: Scaling Open Foundation Models for Long-Form Music Generation

arXiv 2025

2025

Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

arXiv 2025

2025

Seed1.5-VL Technical Report

arXiv 2025

2025

From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery

arXiv 2025

2025

Enhancing Transformers for Generalizable First-Order Logical Entailment

arXiv 2025

2025

Open-World Skill Discovery from Unsegmented Demonstrations

arXiv 2025

2025

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

arXiv 2025

2025

SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model

arXiv 2025

2025

ACE: Anti-Editing Concept Erasure in Text-to-Image Models

CVPR 2025 1

2025

UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering

arXiv 2025

2025

Diff-V2M: A Hierarchical Conditional Diffusion Model with Explicit Rhythmic Modeling for Video-to-Music Generation

arXiv 2025

2025

LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning

arXiv 2025

2025

Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training

arXiv 2025

2025

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

arXiv 2025

2025

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

arXiv 2025

2025

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

arXiv 2025

2025

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

arXiv 2025

2025

UQ: Assessing Language Models on Unsolved Questions

arXiv 2025

2025

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

arXiv 2025

2025

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

ICCV 2025

2025

Generative Evaluation of Complex Reasoning in Large Language Models

arXiv 2025

2025

Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects

arXiv 2024

2024

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

arXiv 2024

2024

Towards Foundation Model for Chemical Reactor Modeling: Meta-Learning with Physics-Informed Adaptation

arXiv 2024

2024

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

arXiv 2024

2024

A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis

arXiv 2024

2024

Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models

arXiv 2024

2024

Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks

arXiv 2024

2024

ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting

CVPR 2025 1

2024

SaMoye: Zero-shot Singing Voice Conversion Model Based on Feature Disentanglement and Enhancement

arXiv 2024

2024

SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget

arXiv 2024

2024

MuChin: A Chinese Colloquial Description Benchmark for Evaluating Language Models in the Field of Music

arXiv 2024

2024

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

arXiv 2023

2023

Rethinking Complex Queries on Knowledge Graphs with Neural Link Predictors

arXiv 2023

2023

Towards Evaluating Generalist Agents: An Automated Benchmark in Open World

arXiv 2023

2023

GROOT: Learning to Follow Instructions by Watching Gameplay Videos

arXiv 2023

2023

Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

CVPR 2023 1

2023

Real-Time Machine-Learning-Based Optimization Using Input Convex Long Short-Term Memory Network

arXiv 2023

2023

Integrating Knowledge Graph embedding and pretrained Language Models in Hypercomplex Spaces

arXiv 2022

2022

Upper Limb Movement Recognition utilising EEG and EMG Signals for Rehabilitative Robotics

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10

from 43 papers