Iryna Gurevych

Papers
64


Authored papers

64

SciCoQA: Quality Assurance for Scientific Paper–Code Alignment

arXiv 2026

2026

Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring

arXiv 2026

2026

MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors

arXiv 2025

2025

From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning

arXiv 2025

2025

LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews

arXiv 2025

2025

The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities

arXiv 2025

2025

PeerQA: A Scientific Question Answering Dataset from Peer Reviews

arXiv 2025

2025

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human

arXiv 2025

2025

GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval

arXiv 2025

2025

Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions

arXiv 2025

2025

ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding

arXiv 2025

2025

NeoQA: Evidence-based Question Answering with Generated News Events

arXiv 2025

2025

Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors

arXiv 2024

2024

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

arXiv 2024

2024

Variational Learning is Effective for Large Deep Networks

arXiv 2024

2024

Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models

arXiv 2024

2024

Triple-Encoders: Representations That Fire Together, Wire Together

arXiv 2024

2024

RIRAG: Regulatory Information Retrieval and Answer Generation

arXiv 2024

2024

FIRE: Fact-checking with Iterative Retrieval and Verification

arXiv 2024

2024

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

arXiv 2024

2024

IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators

arXiv 2024

2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

arXiv 2024

2024

Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions

arXiv 2024

2024

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

arXiv 2024

2024

MixGR: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity

arXiv 2024

2024

DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs

arXiv 2024

2024

Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded Dialogues

arXiv 2024

2024

Localizing and Mitigating Errors in Long-form Question Answering

arXiv 2024

2024

M2QA: Multi-domain Multilingual Question Answering

arXiv 2024

2024

SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models

arXiv 2024

2024

Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization

arXiv 2024

2024

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

arXiv 2023

2023

Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

arXiv 2023

2023

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

arXiv 2023

2023

UKP-SQuARE v3: A Platform for Multi-Agent QA Research

arXiv 2023

2023

DAPR: A Benchmark on Document-Aware Passage Retrieval

arXiv 2023

2023

AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification

arXiv 2023

2023

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection

arXiv 2023

2023

Are Emergent Abilities in Large Language Models just In-Context Learning?

arXiv 2023

2023

Model Merging by Uncertainty-Based Gradient Matching

arXiv 2023

2023

Opportunities and Challenges in Neural Dialog Tutoring

arXiv 2023

2023

Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly

arXiv 2023

2023

Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals

arXiv 2023

2023

Learning From Free-Text Human Feedback – Collect New Datasets Or Extend Existing Ones?

arXiv 2023

2023

How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study

arXiv 2023

2023

Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking

arXiv 2022

2022

Mining Legal Arguments in Court Decisions

arXiv 2022

2022

TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation

arXiv 2022

2022

TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning

arXiv 2021

2021

TWEAC: Transformer with Extendable QA Agent Classifiers

arXiv 2021

2021

BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

arXiv 2021

2021

GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval

NAACL 2022 7

2021

What to Pre-Train on? Efficient Intermediate Task Selection

EMNLP 2021 11

2021

xGQA: Cross-Lingual Visual Question Answering

Findings (ACL) 2022 5

2021

MetaQA: Combining Expert Agents for Multi-Skill Question Answering

arXiv 2021

2021

Learning to Reason for Text Generation from Scientific Tables

arXiv 2021

2021

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning

EMNLP 2021 11

2021

AdapterHub: A Framework for Adapting Transformers

EMNLP 2020 11

2020

Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks

NAACL 2021 4

2020

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

ACL 2021 5

2020

MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale

EMNLP 2020 11

2020

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts

EMNLP 2021 11

2020

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

2019

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

2018
