
Nanyun Peng


Authored papers

48

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

arXiv 2026

2026

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

arXiv 2026

2026

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

arXiv 2026

2026

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

arXiv 2025

2025

Do "New Snow Tablets" Contain Snow? Large Language Models Over-Rely on Names to Identify Ingredients of Chinese Drugs

arXiv 2025

2025

CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners

arXiv 2025

2025

FronTalk: Benchmarking Front-End Development as Conversational Code Generation with Multi-Modal Feedback

arXiv 2025

2025

DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation

arXiv 2025

2025

MMGR: Multi-Modal Generative Reasoning

arXiv 2025

2025

Matryoshka Query Transformer for Large Vision-Language Models

arXiv 2024

2024

On Prompt-Driven Safeguarding for Large Language Models

arXiv 2024

2024

Weak-to-Strong Extrapolation Expedites Alignment

arXiv 2024

2024

New Job, New Gender? Measuring the Social Bias in Image Generation Models

arXiv 2024

2024

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization

arXiv 2024

2024

Re-ReST: Reflection-Reinforced Self-Training for Language Agents

arXiv 2024

2024

Verbalized Representation Learning for Interpretable Few-Shot Generalization

ICCV 2025

2024

VDebugger: Harnessing Execution Feedback for Debugging Visual Programs

arXiv 2024

2024

Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

arXiv 2024

2024

On the Loss of Context-awareness in General Instruction Fine-tuning

arXiv 2024

2024

Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks?

arXiv 2024

2024

Adaptable Logical Control for Large Language Models

arXiv 2024

2024

Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

arXiv 2024

2024

ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

arXiv 2024

2024

Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding

arXiv 2024

2024

Evaluating Cultural and Social Awareness of LLM Web Agents

arXiv 2024

2024

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

arXiv 2023

2023

Tractable Control for Autoregressive Language Generation

arXiv 2023

2023

"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters

arXiv 2023

2023

Evaluating Large Language Models on Controlled Generation Tasks

arXiv 2023

2023

ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

arXiv 2023

2023

Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge

arXiv 2023

2023

RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment

arXiv 2023

2023

AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model

arXiv 2023

2023

ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

arXiv 2023

2023

Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems

arXiv 2023

2023

Identifying Informational Sources in News Articles

arXiv 2023

2023

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

TMLR

2022

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone


2022

Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension

arXiv 2022

2022

Controllable Text Generation with Neurally-Decomposed Oracle

arXiv 2022

2022

Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction

ACL 2022

2022

GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles

arXiv 2022

2022

Generalized Decoding for Pixel, Image, and Language

CVPR 2023

2022

Re3: Generating Longer Stories With Recursive Reprompting and Revision

arXiv 2022

2022

NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge

arXiv 2022

2022

DEGREE: A Data-Efficient Generation-Based Event Extraction Model

NAACL 2022

2021

On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark

Findings (ACL) 2022

2021

Socially Aware Bias Measurements for Hindi Language Representations

NAACL 2022

2021

Affiliations

No known affiliations.
