Daniel Khashabi
- Papers: 34
Authored papers
A Very Big Video Reasoning Suite
arXiv 2026
Steered LLM Activations are Non-Surjective
arXiv 2026
Many-Tier Instruction Hierarchy in LLM Agents
arXiv 2026
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback
arXiv 2025
BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases
arXiv 2025
World-in-World: World Models in a Closed-Loop World
arXiv 2025
Jointly Reinforcing Diversity and Quality in Language Model Generations
arXiv 2025
Science Hierarchography: Hierarchical Organization of Science Literature
arXiv 2025
ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers
arXiv 2025
Generative World Explorer
arXiv 2024
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
arXiv 2024
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning
arXiv 2024
Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell
arXiv 2024
Benchmarking Language Model Creativity: A Case Study on Code Generation
arXiv 2024
Dated Data: Tracing Knowledge Cutoffs in Large Language Models
arXiv 2024
Efficient Large Multi-modal Models via Visual Context Compression
arXiv 2024
AnaloBench: Benchmarking the Identification of Abstract and Long-context Analogies
arXiv 2024
GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution
arXiv 2023
The Trickle-down Impact of Reward (In-)consistency on RLHF
arXiv 2023
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR 2023
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
arXiv 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
arXiv 2022
Self-Instruct: Aligning Language Models with Self-Generated Instructions
arXiv 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
arXiv 2022
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
arXiv 2022
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
NAACL 2022 7
Hey AI, Can You Solve Complex Tasks by Talking to Agents?
Findings (ACL) 2022 5
Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts
NAACL 2022 7
GooAQ: Open Question Answering with Diverse Answer Types
Findings (EMNLP) 2021 11
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
arXiv 2021
Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models
NAACL 2021 4
UnifiedQA: Crossing Format Boundaries With a Single QA System
Findings (EMNLP) 2020
ParsiNLU: A Suite of Language Understanding Challenges for Persian
arXiv 2020