Silvio Savarese
Chief Scientist at Salesforce / Salesforce AI Research; Stanford professor on leave; co-founder of CTRL-Labs (sold to Meta).
- Role: researcher
- Currently at: Stanford University
- Scholar: scholar.google.com/citations
- Papers: 36
Authored papers (36)
Future Optical Flow Prediction Improves Robot Control & Video Generation
arXiv 2026
OSWorld-Verified: A Cleaner, More Reliable Computer-Use Benchmark
blog
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset
arXiv 2025
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
arXiv 2025
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
arXiv 2025
LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
arXiv 2025
Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics
arXiv 2025
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
arXiv 2025
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
arXiv 2025
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
arXiv 2025
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels
arXiv 2025
UserRL: Training Interactive User-Centric Agent via Reinforcement Learning
arXiv 2025
UserBench: An Interactive Gym Environment for User-Centric Agents
arXiv 2025
CoDA: Coding LM via Diffusion Adaptation
arXiv 2025
CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions
arXiv 2025
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
arXiv 2025
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
NeurIPS
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
arXiv 2024
GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation
arXiv 2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
arXiv 2024
AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System
arXiv 2024
TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
arXiv 2024
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback
arXiv 2024
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models
arXiv 2024
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
arXiv 2023
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
arXiv 2023
XGen-7B Technical Report
arXiv 2023
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
arXiv 2023
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
arXiv 2023
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
arXiv 2022
3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
arXiv 2018
Joint 2D-3D-Semantic Data for Indoor Scene Understanding
arXiv 2017
Active Learning for Convolutional Neural Networks: A Core-Set Approach
Frequent co-authors
10 from 36 papers