Chien-Sheng Wu

Papers
21

Authored papers

21

The Illusion of Certainty: Decoupling Capability and Calibration in On-Policy Distillation

arXiv 2026

2026

Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding

arXiv 2025

2025

CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions

arXiv 2025

2025

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

arXiv 2025

2025

AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation

arXiv 2025

2025

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

arXiv 2024

2024

Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses

arXiv 2024

2024

Evaluating Cultural and Social Awareness of LLM Web Agents

arXiv 2024

2024

ReIFE: Re-evaluating Instruction-Following Evaluation

arXiv 2024

2024

LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

arXiv 2023

2023

Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

arXiv 2023

2023

XGen-7B Technical Report

arXiv 2023

2023

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

arXiv 2023

2023

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

arXiv 2022

2022

Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization

arXiv 2022

2022

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

arXiv 2022

2022

Exploring Neural Models for Query-Focused Summarization

Findings (NAACL) 2022 7

2021

QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

NAACL 2022 7

2021

MixQG: Neural Question Generation with Mixed Answer Types

Findings (NAACL) 2022 7

2021

A Simple Language Model for Task-Oriented Dialogue

NeurIPS 2020 12

2020

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 21 papers