Pheng-Ann Heng
- Papers: 16
Authored papers
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
arXiv 2026
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
arXiv 2026
EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning
arXiv 2025
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
arXiv 2025
Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark
arXiv 2025
Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
arXiv 2025
Thinking-while-Generating: Interleaving Textual Reasoning throughout Visual Generation
arXiv 2025
A Survey on Inference Optimization Techniques for Mixture of Experts Models
arXiv 2024
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners
arXiv 2024
Unveiling the Generalization Power of Fine-Tuned Large Language Models
arXiv 2024
A Survey of Reasoning with Foundation Models
arXiv 2023
3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation
arXiv 2023
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following
arXiv 2023
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning
arXiv 2023
RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction
CVPR 2023
Acknowledging the Unknown for Multi-label Learning with Single Positive Labels
arXiv 2022
Frequent co-authors
10 from 16 papers