Yixin Liu
- Papers: 29
Authored papers (29)
References Improve LLM Alignment in Non-Verifiable Domains
arXiv 2026
Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing
arXiv 2026
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models
arXiv 2026
Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning
arXiv 2025
NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes
arXiv 2025
EfficientLLM: Efficiency in Large Language Models
arXiv 2025
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks
arXiv 2025
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
CVPR 2025 1
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
arXiv 2025
PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving
arXiv 2025
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
arXiv 2025
M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models
arXiv 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
arXiv 2024
Evaluating Mathematical Reasoning Beyond Accuracy
arXiv 2024
ReIFE: Re-evaluating Instruction-Following Evaluation
arXiv 2024
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
arXiv 2024
TrustLLM: Trustworthiness in Large Language Models
arXiv 2024
Understanding Reference Policies in Direct Preference Optimization
arXiv 2024
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
arXiv 2023
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
arXiv 2023
On Learning to Summarize with Large Language Models as References
arXiv 2023
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents
arXiv 2023
QTSumm: Query-Focused Summarization over Tabular Data
arXiv 2023
Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization
arXiv 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
arXiv 2023
BRIO: Bringing Order to Abstractive Summarization
ACL 2022 5
FOLIO: Natural Language Reasoning with First-Order Logic
arXiv 2022
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
arXiv 2022
SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization
ACL 2021 5
Affiliations
Frequent co-authors
10 from 29 papers