Kai-Wei Chang

Papers
64

Attribution

Affiliations & profile
Semantic Scholar

Authored papers

64

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

arXiv 2026

2026

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

arXiv 2026

2026

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

arXiv 2026

2026

OpenThoughts: Data Recipes for Reasoning Models

arXiv 2025

2025

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation

arXiv 2025

2025

LaViDa: A Large Diffusion Language Model for Multimodal Understanding

arXiv 2025

2025

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

arXiv 2025

2025

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

arXiv 2025

2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

arXiv 2025

2025

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

arXiv 2025

2025

When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning

arXiv 2025

2025

Visualized Text-to-Image Retrieval

arXiv 2025

2025

MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

arXiv 2025

2025

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

arXiv 2025

2025

FronTalk: Benchmarking Front-End Development as Conversational Code Generation with Multi-Modal Feedback

arXiv 2025

2025

DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation

arXiv 2025

2025

TrustLLM: Trustworthiness in Large Language Models

arXiv 2024

2024

Matryoshka Query Transformer for Large Vision-Language Models

arXiv 2024

2024

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

arXiv 2024

2024

Codec-SUPERB: An In-Depth Analysis of Sound Codec Models

arXiv 2024

2024

On Prompt-Driven Safeguarding for Large Language Models

arXiv 2024

2024

Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

arXiv 2024

2024

ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

arXiv 2024

2024

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

arXiv 2024

2024

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization

arXiv 2024

2024

Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

arXiv 2024

2024

Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding

arXiv 2024

2024

Verbalized Representation Learning for Interpretable Few-Shot Generalization

ICCV 2025

2024

VDebugger: Harnessing Execution Feedback for Debugging Visual Programs

arXiv 2024

2024

LLMs in Biomedicine: A study on clinical Named Entity Recognition

arXiv 2024

2024

Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

arXiv 2024

2024

The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention

arXiv 2024

2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

arXiv 2024

2024

Enhancing Large Vision Language Models with Self-Training on Image Comprehension

arXiv 2024

2024

TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

arXiv 2024

2024

Re-ReST: Reflection-Reinforced Self-Training for Language Agents

arXiv 2024

2024

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension

arXiv 2024

2024

MetaKP: On-Demand Keyphrase Generation

arXiv 2024

2024

What's "up" with vision-language models? Investigating their struggle with spatial reasoning

arXiv 2023

2023

Agent Lumos: Unified and Modular Training for Open-Source Language Agents

arXiv 2023

2023

IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models

arXiv 2023

2023

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

arXiv 2023

2023

Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

arXiv 2023

2023

KPEval: Towards Fine-Grained Semantic-Based Keyphrase Evaluation

arXiv 2023

2023

"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters

arXiv 2023

2023

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

arXiv 2023

2023

VideoCon: Robust Video-Language Alignment via Contrast Captions

CVPR 2024 1

2023

CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning

ICCV 2023 1

2023

Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

arXiv 2023

2023

Red Teaming Language Model Detectors with Language Models

arXiv 2023

2023

UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

arXiv 2023

2023

Efficient Shapley Values Estimation by Amortization for Text Classification

arXiv 2023

2023

Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems

arXiv 2023

2023

A Survey of Deep Learning for Mathematical Reasoning

arXiv 2022

2022

GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles

arXiv 2022

2022

Representation Learning for Resource-Constrained Keyphrase Generation

arXiv 2022

2022

Controllable Text Generation with Neurally-Decomposed Oracle

arXiv 2022

2022

Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction

ACL 2022 5

2022

How Much Can CLIP Benefit Vision-and-Language Tasks?

arXiv 2021

2021

DEGREE: A Data-Efficient Generation-Based Event Extraction Model

NAACL 2022 7

2021

BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation

arXiv 2021

2021

Unified Pre-training for Program Understanding and Generation

NAACL 2021 4

2021

Socially Aware Bias Measurements for Hindi Language Representations

NAACL 2022 7

2021

GPT-GNN: Generative Pre-Training of Graph Neural Networks

arXiv 2020

2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 64 papers