Siva Reddy
- Papers: 34
Authored papers (34)
Forecasting Downstream Performance of LLMs With Proxy Metrics
arXiv 2026
Structured Distillation of Web Agent Capabilities Enables Generalization
arXiv 2026
LLM2Vec-Gen: Generative Embeddings from Large Language Models
arXiv 2026
LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs
arXiv 2026
The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents
arXiv 2026
Humans and LLMs Diverge on Probabilistic Inferences
arXiv 2026
SafeArena: Evaluating the Safety of Autonomous Web Agents
arXiv 2025
The Promise of RL for Autoregressive Image Editing
arXiv 2025
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
arXiv 2025
How to Get Your LLM to Generate Challenging Problems for Evaluation
arXiv 2025
The Markovian Thinker
arXiv 2025
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
arXiv 2025
DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning
arXiv 2025
Exploiting Instruction-Following Retrievers for Malicious Information Retrieval
arXiv 2025
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
arXiv 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
arXiv 2024
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
arXiv 2024
The BrowserGym Ecosystem for Web Agent Research
arXiv 2024
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations
arXiv 2024
Universal Adversarial Triggers Are Not Universal
arXiv 2024
Are self-explanations from Large Language Models faithful?
arXiv 2024
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
arXiv 2023
The Impact of Positional Encoding on Length Generalization in Transformers
The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
arXiv 2023
Faithfulness Measurable Masked Language Models
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Combining Modular Skills in Multitask Learning
arXiv 2022
Image Retrieval from Contextual Descriptions
ACL 2022 5
Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
arXiv 2022
Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment
Findings (ACL) 2022 5
Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval
EMNLP 2021 11
An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models
ACL 2022 5
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
arXiv 2021
StereoSet: Measuring stereotypical bias in pretrained language models
ACL 2021 5
Frequent co-authors
10 from 34 papers
Nicholas Meade
Xing Han Lù
Vaibhav Adlakha
Amirhossein Kazemnejad
Arkil Patel
Parishad BehnamGhader
Benno Krojer
Marius Mosbach
Alessandro Sordoni
Andreas Madsen