Kun Gai
- Papers: 26
Authored papers (26)
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
arXiv 2026
VINO: A Unified Visual Generator with Interleaved OmniModal Context
arXiv 2026
ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation
arXiv 2026
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
arXiv 2026
WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing
arXiv 2026
RecGOAT: Graph Optimal Adaptive Transport for LLM-Enhanced Multimodal Recommendation with Dual Semantic Alignment
arXiv 2026
Scaling Image and Video Generation via Test-Time Evolutionary Search
arXiv 2025
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework
arXiv 2025
Monet: Reasoning in Latent Visual Space Beyond Images and Language
arXiv 2025
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder
arXiv 2025
Agentic Entropy-Balanced Policy Optimization
arXiv 2025
Visual Generation Tuning
arXiv 2025
Kwai Keye-VL 1.5 Technical Report
arXiv 2025
ASPO: Asymmetric Importance Sampling Policy Optimization
arXiv 2025
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
arXiv 2025
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
arXiv 2025
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers
arXiv 2025
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation
arXiv 2025
Leanabell-Prover: Posttraining Scaling in Formal Reasoning
arXiv 2025
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
arXiv 2024
RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
arXiv 2024
Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention
arXiv 2024
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
arXiv 2023
Multi-Task Recommendations with Reinforcement Learning
arXiv 2023
Two-Stage Constrained Actor-Critic for Short Video Recommendation
arXiv 2023
Deep Interest Network for Click-Through Rate Prediction
arXiv 2017
Affiliations
Frequent co-authors
10 from 26 papers