Juan Carlos Niebles
- Papers: 18
Authored papers (18)
Future Optical Flow Prediction Improves Robot Control & Video Generation
arXiv 2026
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
arXiv 2025
Re-thinking Temporal Search for Long-Form Video Understanding
CVPR 2025
Unifying Specialized Visual Encoders for Video Language Models
arXiv 2025
Exploring Diffusion Transformer Designs via Grafting
arXiv 2025
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation
ICCV 2025
Taming generative video models for zero-shot optical flow extraction
arXiv 2025
Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas
arXiv 2025
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
arXiv 2024
TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
arXiv 2024
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
arXiv 2024
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
arXiv 2024
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
arXiv 2023
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
arXiv 2023
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer
arXiv 2022
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
CVPR 2022
Frequent co-authors
10 from 18 papers