Iryna Gurevych
- Papers: 64
Authored papers (64)
SciCoQA: Quality Assurance for Scientific Paper-Code Alignment
arXiv 2026
Themis: Training Robust Multilingual Code Reward Models for Flexible Multi-Criteria Scoring
arXiv 2026
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors
arXiv 2025
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning
arXiv 2025
LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews
arXiv 2025
The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities
arXiv 2025
PeerQA: A Scientific Question Answering Dataset from Peer Reviews
arXiv 2025
GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human
arXiv 2025
GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval
arXiv 2025
Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
arXiv 2025
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
arXiv 2025
NeoQA: Evidence-based Question Answering with Generated News Events
arXiv 2025
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
arXiv 2024
OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs
arXiv 2024
Variational Learning is Effective for Large Deep Networks
arXiv 2024
Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models
arXiv 2024
Triple-Encoders: Representations That Fire Together, Wire Together
arXiv 2024
RIRAG: Regulatory Information Retrieval and Answer Generation
arXiv 2024
FIRE: Fact-checking with Iterative Retrieval and Verification
arXiv 2024
Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs
arXiv 2024
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
arXiv 2024
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
arXiv 2024
Are Large Language Models Good Classifiers? A Study on Edit Intent Classification in Scientific Document Revisions
arXiv 2024
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
arXiv 2024
MixGR: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity
arXiv 2024
DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs
arXiv 2024
Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded Dialogues
arXiv 2024
Localizing and Mitigating Errors in Long-form Question Answering
arXiv 2024
M2QA: Multi-domain Multilingual Question Answering
arXiv 2024
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models
arXiv 2024
Dive into the Chasm: Probing the Gap between In- and Cross-Topic Generalization
arXiv 2024
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
arXiv 2023
Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning
arXiv 2023
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers
arXiv 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research
arXiv 2023
DAPR: A Benchmark on Document-Aware Passage Retrieval
arXiv 2023
AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classification
arXiv 2023
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection
arXiv 2023
Are Emergent Abilities in Large Language Models just In-Context Learning?
arXiv 2023
Model Merging by Uncertainty-Based Gradient Matching
arXiv 2023
Opportunities and Challenges in Neural Dialog Tutoring
arXiv 2023
Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly
arXiv 2023
Exploring Jiu-Jitsu Argumentation for Writing Peer Review Rebuttals
arXiv 2023
Learning From Free-Text Human Feedback – Collect New Datasets Or Extend Existing Ones?
arXiv 2023
How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study
arXiv 2023
Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking
arXiv 2022
Mining Legal Arguments in Court Decisions
arXiv 2022
TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation
arXiv 2022
TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
arXiv 2021
TWEAC: Transformer with Extendable QA Agent Classifiers
arXiv 2021
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
arXiv 2021
GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
NAACL 2022 7
What to Pre-Train on? Efficient Intermediate Task Selection
EMNLP 2021 11
xGQA: Cross-Lingual Visual Question Answering
Findings (ACL) 2022 5
MetaQA: Combining Expert Agents for Multi-Skill Question Answering
arXiv 2021
Learning to Reason for Text Generation from Scientific Tables
arXiv 2021
Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning
EMNLP 2021 11
AdapterHub: A Framework for Adapting Transformers
EMNLP 2020 11
Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
NAACL 2021 4
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
ACL 2021 5
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale
EMNLP 2020 11
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts
EMNLP 2021 11
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Frequent co-authors: 10 from 64 papers