Hao Jiang

Papers
27

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

27

Towards Customized Multimodal Role-Play

arXiv 2026

2026

Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

arXiv 2026

2026

LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models

arXiv 2026

2026

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

arXiv 2025

2025

Fast-Slow Thinking for Large Vision-Language Model Reasoning

arXiv 2025

2025

Does Hearing Help Seeing? Investigating Audio-Video Joint Denoising for Video Generation

arXiv 2025

2025

Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations

arXiv 2025

2025

DiffSemanticFusion: Semantic Raster BEV Fusion for Autonomous Driving via Online HD Map Diffusion

arXiv 2025

2025

Streaming Video Question-Answering with In-context Video KV-Cache Retrieval

arXiv 2025

2025

Towards Universal Soccer Video Understanding

CVPR 2025 1

2024

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

CVPR 2025 1

2024

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

arXiv 2024

2024

Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation

arXiv 2024

2024

RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

arXiv 2024

2024

CursorCore: Assist Programming through Aligning Anything

arXiv 2024

2024

Pyramidal Flow Matching for Efficient Video Generative Modeling

arXiv 2024

2024

MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance

arXiv 2024

2024

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

arXiv 2024

2024

HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models

arXiv 2024

2024

A Comprehensive Survey of Direct Preference Optimization: Datasets, Theories, Variants, and Applications

arXiv 2024

2024

Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation

arXiv 2024

2024

DoNet: Deep De-overlapping Network for Cytology Instance Segmentation

CVPR 2023 1

2023

Multi-Modal Experience Inspired AI Creation

arXiv 2022

2022

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline

NAACL 2022 7

2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video

CVPR 2022 1

2021

Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking

arXiv 2021

2021

Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need

arXiv 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 27 papers