Minjoon Seo
- Papers: 39
Authored papers (39)
Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams
arXiv 2026
The Coverage Principle: A Framework for Understanding Compositional Generalization
arXiv 2025
Reasoning Models Better Express Their Confidence
arXiv 2025
Let's Predict Sentence by Sentence
arXiv 2025
On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning
arXiv 2024
RouterRetriever: Routing over a Mixture of Expert Embedding Models
arXiv 2024
LangBridge: Multilingual Reasoning Without Multilingual Supervision
arXiv 2024
Aligning Large Language Models by On-Policy Self-Judgment
arXiv 2024
Generative Prompt Internalization
arXiv 2024
Rethinking the Role of Proxy Rewards in Language Model Alignment
arXiv 2024
How language models extrapolate outside the training data: A case study in Textualized Gridworld
arXiv 2024
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
arXiv 2024
Aligning to Thousands of Preferences via System Message Generalization
arXiv 2024
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards
arXiv 2024
INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models
arXiv 2024
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models
arXiv 2024
SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs
arXiv 2024
Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Utilization
arXiv 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
arXiv 2024
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
arXiv 2023
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
arXiv 2023
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
arXiv 2023
Exploring the Benefits of Training Expert Language Models over Instruction Tuning
arXiv 2023
Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
arXiv 2023
Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision
arXiv 2023
Gradient Ascent Post-training Enhances Language Model Generalization
arXiv 2023
Aligning Large Language Models through Synthetic Feedback
arXiv 2023
KTRL+F: Knowledge-Augmented In-Document Search
arXiv 2023
A Bayesian Approach To Analysing Training Data Attribution In Deep Learning
Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
arXiv 2023
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
arXiv 2022
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
arXiv 2022
Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts
arXiv 2022
Towards Continual Knowledge Learning of Language Models
MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
A Diagram Is Worth A Dozen Images
arXiv 2016