Haonan Lu
- Papers: 15
Authored papers
X-OmniClaw Technical Report: A Unified Mobile Agent for Multimodal Understanding and Interaction
arXiv 2026
PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding
arXiv 2026
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning
arXiv 2026
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
ICCV 2025
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
CVPR 2025
Improved Visual-Spatial Reasoning via R1-Zero-Like Training
arXiv 2025
Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers
arXiv 2025
H2VU-Benchmark: A Comprehensive Benchmark for Hierarchical Holistic Video Understanding
arXiv 2025
Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens
arXiv 2025
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
arXiv 2024
TLCM: Training-efficient Latent Consistency Model for Image Generation with 2-8 Steps
arXiv 2024
GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation
arXiv 2023
PEA-Diffusion: Parameter-Efficient Adapter with Knowledge Distillation in non-English Text-to-Image Generation
arXiv 2023
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
arXiv 2023
Prompt Space Optimizing Few-shot Reasoning Success with Large Language Models
arXiv 2023
Frequent co-authors
10 from 15 papers