Huan Wang
- Papers: 42
Authored papers (42)
Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey
arXiv 2026
RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution
arXiv 2026
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs
arXiv 2026
DICE: Diffusion Large Language Models Excel at Generating CUDA Kernels
arXiv 2026
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
arXiv 2025
HoliTom: Holistic Token Merging for Fast Video Large Language Models
arXiv 2025
Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
arXiv 2025
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View
arXiv 2025
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
arXiv 2025
Autoregressive Image Generation with Randomized Parallel Decoding
arXiv 2025
OBS-Diff: Accurate Pruning For Diffusion Models in One-Shot
arXiv 2025
UserRL: Training Interactive User-Centric Agent via Reinforcement Learning
arXiv 2025
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels
arXiv 2025
UserBench: An Interactive Gym Environment for User-Centric Agents
arXiv 2025
CoDA: Coding LM via Diffusion Adaptation
arXiv 2025
When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios
arXiv 2025
OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
arXiv 2025
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics
arXiv 2025
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
arXiv 2025
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
arXiv 2025
RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning
arXiv 2025
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
arXiv 2025
Boosting Large Language Models with Mask Fine-Tuning
arXiv 2025
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models
arXiv 2025
TARS: MinMax Token-Adaptive Preference Strategy for MLLM Hallucination Reduction
arXiv 2025
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
arXiv 2024
DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
CVPR 2025
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
arXiv 2024
TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
arXiv 2024
Image as Set of Points
arXiv 2023
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
arXiv 2023
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
arXiv 2023
Improved Online Conformal Prediction via Strongly Adaptive Online Learning
arXiv 2023
Frame Flexible Network
CVPR 2023
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
arXiv 2023
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution
ICCV 2023
Real-Time Neural Light Field on Mobile Devices
CVPR 2023
Converse: A Tree-Based Modular Task-Oriented Dialogue System
arXiv 2022
Rethinking Adam: A Twofold Exponential Moving Average Approach
MNN: A Universal and Efficient Inference Engine
arXiv 2020
What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective
arXiv 2020
Frequent co-authors
10 from 42 papers