Di Huang
- Papers: 26
Authored papers
QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization
arXiv 2026
Is Diversity All You Need for Scalable Robotic Manipulation?
arXiv 2025
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
arXiv 2025
OmniCellTOSG: The First Cell Text-Omic Signaling Graphs Dataset for Joint LLM and GNN Modeling
arXiv 2025
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
CVPR 2025
POSS: Position Specialist Generates Better Draft for Speculative Decoding
arXiv 2025
Towards Training-free Anomaly Detection with Vision and Language Foundation Models
CVPR 2025
CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization
CVPR 2025
StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion
arXiv 2025
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
arXiv 2024
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
arXiv 2024
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
arXiv 2024
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
CVPR 2025
InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization
CVPR 2024
I4VGen: Image as Free Stepping Stone for Text-to-Video Generation
arXiv 2024
PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection
arXiv 2024
Depth Any Video with Scalable Synthetic Data
arXiv 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
arXiv 2024
3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling
arXiv 2024
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
arXiv 2024
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
arXiv 2023
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
CVPR 2024
Denoising Diffusion Autoencoders are Unified Self-supervised Learners
ICCV 2023
ANPL: Towards Natural Programming with Interactive Decomposition
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping
arXiv 2023
DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration
ICCV 2023
Frequent co-authors
10 from 26 papers