Valentina Pyatkin

Research scientist at Allen Institute for AI; co-leads OLMo and Tülu open-language-model projects; ACL Theme Paper Award winner.

Role
Research scientist
Papers
22

Authored papers

22

RewardBench 2: Advancing Reward Model Evaluation

preprint

2025

Olmo 3

arXiv 2025

2025

IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance

arXiv 2025

2025

Tulu 3: Pushing Frontiers in Open Language Model Post-Training

preprint

2024

2 OLMo 2 Furious

arXiv 2024

2024

OLMo: Accelerating the Science of Language Models

arXiv 2024

2024

RewardBench: Evaluating Reward Models for Language Modeling

arXiv 2024

2024

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

arXiv 2024

2024

Superlatives in Context: Modeling the Implicit Semantics of Superlatives

arXiv 2024

2024

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

arXiv 2024

2024

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

arXiv 2024

2024

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

arXiv 2023

2023

PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

arXiv 2023

2023

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

arXiv 2023

2023

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

arXiv 2023

2023

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

arXiv 2023

2023

QASem Parsing: Text-to-text Modeling of QA-based Semantics

arXiv 2022

2022

ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

arXiv 2022

2022

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

arXiv 2022

2022

Asking It All: Generating Contextualized Questions for any Semantic Role

EMNLP 2021 11

2021

The Possible, the Plausible, and the Desirable: Event-Based Modality Detection for Language Processing

ACL 2021 5

2021

QADiscourse -- Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

arXiv 2020

2020

Frequent co-authors

10

from 22 papers