Boyi Li
Authored papers (14)
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
arXiv 2026
V_1: Unifying Generation and Self-Verification for Parallel Reasoners
arXiv 2026
Toward Cognitive Supersensing in Multimodal Large Language Model
arXiv 2026
Describe Anything: Detailed Localized Image and Video Captioning
ICCV 2025
FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
arXiv 2025
Adaptive Graph Pruning for Multi-Agent Communication
arXiv 2025
Scaling Vision Pre-Training to 4K Resolution
CVPR 2025
Atlas: Multi-Scale Attention Improves Long Context Image Modeling
arXiv 2025
Wolf: Captioning Everything with a World Summarization Framework
arXiv 2024
Extrapolated Urban View Synthesis Benchmark
ICCV 2025
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models
arXiv 2023
Interactive Task Planning with Language Models
arXiv 2023
Language-driven Semantic Segmentation
On Feature Normalization and Data Augmentation
CVPR 2021
Affiliations
Frequent co-authors
10 from 14 papers