Sai Rajeswar

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

arXiv 2025

2025

Grounding Computer Use Agents on Human Demonstrations

arXiv 2025

2025

The Promise of RL for Autoregressive Image Editing

arXiv 2025

2025

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

arXiv 2025

2025

StarFlow: Generating Structured Workflow Outputs From Sketch Images

arXiv 2025

2025

GenRL: Multimodal-foundation world models for generalization in embodied agents

arXiv 2024

2024

VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text

arXiv 2024

2024

RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content

arXiv 2024

2024

StarVector: Generating Scalable Vector Graphics Code from Images and Text

CVPR 2025

2023

Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

arXiv 2022

2022

Choreographer: Learning and Adapting Skills in Imagination

arXiv 2022

2022

Touch-based Curiosity for Sparse-Reward Tasks

arXiv 2021

2021

Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

David Vazquez

researcher

Christopher Pal

Perouz Taslakian

Aaron Courville

Juan A Rodriguez

Bart Dhoedt

Nicolas Chapados

researcher

Pietro Mazzaglia

Spandana Gella

Tim Verbelen