Jaehong Yoon
Papers: 26
Authored papers
EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance
arXiv 2025
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
arXiv 2026
Self-Refining Video Sampling
arXiv 2026
PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation
arXiv 2026
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
arXiv 2026
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories
arXiv 2026
Are Video Reasoning Models Ready to Go Outside?
arXiv 2026
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective
arXiv 2025
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization
arXiv 2025
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
arXiv 2025
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
arXiv 2025
Planning with Sketch-Guided Verification for Physics-Aware Video Generation
arXiv 2025
MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation
arXiv 2025
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
arXiv 2025
Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning?
arXiv 2025
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
arXiv 2024
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
arXiv 2024
RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
arXiv 2024
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
arXiv 2024
Adapt-$\infty$: Scalable Lifelong Multimodal Instruction Tuning via Dynamic Data Selection
arXiv 2024
Glider: Global and Local Instruction-Driven Expert Router
arXiv 2024
Progressive Fourier Neural Representation for Sequential Video Compilation
arXiv 2023
Forget-free Continual Learning with Soft-Winning SubNetworks
arXiv 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
arXiv 2023
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
arXiv 2022
Personalized Subgraph Federated Learning
arXiv 2022
Frequent co-authors
10 from 26 papers