Yukang Chen
- Papers: 24
Authored papers
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression
arXiv 2026
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
arXiv 2026
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation
arXiv 2026
StreamingVLM: Real-Time Understanding for Infinite Video Streams
arXiv 2025
Scaling RL to Long Videos
arXiv 2025
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
arXiv 2025
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
arXiv 2025
TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation
arXiv 2025
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
arXiv 2025
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
arXiv 2025
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
arXiv 2024
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
VisionZip: Longer is Better but Not Necessary in Vision Language Models
CVPR 2025
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
ICCV 2025
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
arXiv 2024
SEED-Story: Multimodal Long Story Generation with Large Language Model
arXiv 2024
LISA: Reasoning Segmentation via Large Language Model
CVPR 2024
Spherical Transformer for LiDAR-based 3D Recognition
CVPR 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
arXiv 2023
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
CVPR 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation
ICCV 2023
Denoising Diffusion Step-aware Models
arXiv 2023
FocalFormer3D: Focusing on Hard Instance for 3D Object Detection
arXiv 2023
Focal Sparse Convolutional Networks for 3D Object Detection
CVPR 2022