Di Huang

Authored papers

26

QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization

arXiv 2026

2026

Is Diversity All You Need for Scalable Robotic Manipulation?

arXiv 2025

2025

Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation

arXiv 2025

2025

OmniCellTOSG: The First Cell Text-Omic Signaling Graphs Dataset for Joint LLM and GNN Modeling

arXiv 2025

2025

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

CVPR 2025 1

2025

POSS: Position Specialist Generates Better Draft for Speculative Decoding

arXiv 2025

2025

Towards Training-free Anomaly Detection with Vision and Language Foundation Models

CVPR 2025 1

2025

CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization

CVPR 2025 1

2025

StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion

arXiv 2025

2025

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

arXiv 2024

2024

PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines

arXiv 2024

2024

AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer

arXiv 2024

2024

ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems

CVPR 2025 1

2024

InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

CVPR 2024 1

2024

I4VGen: Image as Free Stepping Stone for Text-to-Video Generation

arXiv 2024

2024

PS-TTL: Prototype-based Soft-labels and Test-Time Learning for Few-shot Object Detection

arXiv 2024

2024

Depth Any Video with Scalable Synthetic Data

arXiv 2024

2024

FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model

arXiv 2024

2024

3D$^2$-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling

arXiv 2024

2024

InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct

arXiv 2024

2024

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

arXiv 2023

2023

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

CVPR 2024 1

2023

Denoising Diffusion Autoencoders are Unified Self-supervised Learners

ICCV 2023 1

2023

ANPL: Towards Natural Programming with Interactive Decomposition

2023

BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping

arXiv 2023

2023

DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration

ICCV 2023 1

2023

Affiliations

No known affiliations.

Frequent co-authors

10, from 26 papers