Le Sun

Beyond Text-Dominance: Understanding Modality Preference of Omni-modal Large Language Models

arXiv 2026

Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

arXiv 2026

MetaphorVU: Towards Metaphorical Video Understanding

arXiv 2026

ConsistentChat: Building Skeleton-Guided Consistent Dialogues for Large Language Models from Scratch

arXiv 2025

SAISA: Towards Multimodal Large Language Models with Both Training and Inference Efficiency

arXiv 2025

DeepRAG: Thinking to Retrieval Step by Step for Large Language Models

arXiv 2025

LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

arXiv 2025

The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models

arXiv 2025

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

ICCV 2025

PaperRegister: Boosting Flexible-grained Paper Search via Hierarchical Register Indexing

arXiv 2025

Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models

arXiv 2025

PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

arXiv 2025

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

arXiv 2025

Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch

arXiv 2025

StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

arXiv 2024

Beyond Correctness: Benchmarking Multi-dimensional Code Generation for Large Language Models

arXiv 2024

StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation

arXiv 2024

READoc: A Unified Benchmark for Realistic Document Structured Extraction

arXiv 2024

Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation

arXiv 2024

Aligning Large Language Models via Self-Steering Optimization

arXiv 2024

Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching

arXiv 2024

Academically Intelligent LLMs Are Not Necessarily Socially Intelligent

arXiv 2024

SoFA: Shielded On-the-fly Alignment via Priority Rule Following

arXiv 2024

Transferable Post-training via Inverse Value Learning

arXiv 2024

Self-Retrieval: End-to-End Information Retrieval with One Large Language Model

arXiv 2024

ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases

arXiv 2023

Toward Unified Controllable Text Generation via Regular Expression Instruction

arXiv 2023

Offline Pseudo Relevance Feedback for Efficient and Effective Single-pass Dense Retrieval

arXiv 2023

Benchmarking Large Language Models in Retrieval-Augmented Generation

arXiv 2023

DBCopilot: Natural Language Querying over Massive Databases via Schema Routing

arXiv 2023

The Life Cycle of Knowledge in Big Language Models: A Survey

arXiv 2023