Siva Reddy

Papers
34

Authored papers

34

Forecasting Downstream Performance of LLMs With Proxy Metrics

arXiv 2026

2026

Structured Distillation of Web Agent Capabilities Enables Generalization

arXiv 2026

2026

LLM2Vec-Gen: Generative Embeddings from Large Language Models

arXiv 2026

2026

LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

arXiv 2026

2026

The Blind Spot of Agent Safety: How Benign User Instructions Expose Critical Vulnerabilities in Computer-Use Agents

arXiv 2026

2026

Humans and LLMs Diverge on Probabilistic Inferences

arXiv 2026

2026

SafeArena: Evaluating the Safety of Autonomous Web Agents

arXiv 2025

2025

The Promise of RL for Autoregressive Image Editing

arXiv 2025

2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

arXiv 2025

2025

How to Get Your LLM to Generate Challenging Problems for Evaluation

arXiv 2025

2025

The Markovian Thinker

arXiv 2025

2025

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

arXiv 2025

2025

DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning

arXiv 2025

2025

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval

arXiv 2025

2025

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

arXiv 2024

2024

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

arXiv 2024

2024

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

arXiv 2024

2024

The BrowserGym Ecosystem for Web Agent Research

arXiv 2024

2024

Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

arXiv 2024

2024

Universal Adversarial Triggers Are Not Universal

arXiv 2024

2024

Are self-explanations from Large Language Models faithful?

arXiv 2024

2024

Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering

arXiv 2023

2023

The Impact of Positional Encoding on Length Generalization in Transformers

2023

The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents

arXiv 2023

2023

Faithfulness Measurable Masked Language Models

arXiv 2023

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

TMLR

2022

Combining Modular Skills in Multitask Learning

arXiv 2022

2022

Image Retrieval from Contextual Descriptions

ACL 2022

2022

Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model

arXiv 2022

2022

Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment

Findings (ACL) 2022

2022

Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval

EMNLP 2021

2021

An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models

ACL 2022

2021

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining

arXiv 2021

2021

StereoSet: Measuring stereotypical bias in pretrained language models

ACL 2021

2020

Affiliations

No known affiliations.
