Bin Lin
- Papers: 20
Authored papers
Manifold-Aware Exploration for Reinforcement Learning in Video Generation
arXiv 2026
iFSQ: Improving FSQ for Image Generation with 1 Line of Code
arXiv 2026
ImgEdit: A Unified Image Editing Dataset and Benchmark
arXiv 2025
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
arXiv 2025
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
arXiv 2025
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
arXiv 2025
SwapAnyone: Consistent and Realistic Video Synthesis for Swapping Any Person into Any Video
arXiv 2025
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
arXiv 2025
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
arXiv 2025
Can Understanding and Generation Truly Benefit Together -- or Just Coexist?
arXiv 2025
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
arXiv 2025
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
arXiv 2024
Open-Sora Plan: Open-Source Large Video Generation Model
arXiv 2024
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
arXiv 2024
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
arXiv 2024
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
CVPR 2025
Next Patch Prediction for Autoregressive Visual Generation
arXiv 2024
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
arXiv 2023
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
arXiv 2023
Frequent co-authors
10 from 20 papers