Xin Liu
- Papers: 54
Authored papers (54)
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
arXiv 2026
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters
arXiv 2026
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation
arXiv 2026
UltraEval-Audio: A Unified Framework for Comprehensive Evaluation of Audio Foundation Models
arXiv 2026
T^2PO: Uncertainty-Guided Exploration Control for Stable Multi-Turn Agentic Reinforcement Learning
arXiv 2026
Beyond Test-Time Memory: State-Space Optimal Control for LLM Reasoning
arXiv 2026
GeoMotionGPT: Geometry-Aligned Motion Understanding with Large Language Models
arXiv 2026
FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning
arXiv 2025
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
arXiv 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
preprint
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts
arXiv 2025
Seed1.5-VL Technical Report
arXiv 2025
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
CVPR 2025 1
HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation
arXiv 2025
MMHCL: Multi-Modal Hypergraph Contrastive Learning for Recommendation
arXiv 2025
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
ICCV 2025
BIG-Bench Extra Hard
arXiv 2025
Medical Hallucinations in Foundation Models and Their Impact on Healthcare
arXiv 2025
AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning
CVPR 2025 1
ZeroMerge: Parameter-Free KV Cache Compression for Memory-Efficient Long-Context LLMs
arXiv 2025
IHEval: Evaluating Language Models on Following the Instruction Hierarchy
arXiv 2025
Step-GUI Technical Report
arXiv 2025
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
arXiv 2025
"What's Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets
arXiv 2025
Learning to Optimize Multi-Objective Alignment Through Dynamic Reward Weighting
arXiv 2025
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
arXiv 2025
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
arXiv 2025
RADAR: Benchmarking Language Models on Imperfect Tabular Data
arXiv 2025
Discriminative Finetuning of Generative Large Language Models without Reward Models and Human Preference Data
arXiv 2025
WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance
arXiv 2025
DeepSeek-V3 Technical Report
arXiv 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
arXiv 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
arXiv 2024
WeatherQA: Can Multimodal Language Models Reason about Severe Weather?
arXiv 2024
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models
arXiv 2024
Re-Attentional Controllable Video Diffusion Editing
arXiv 2024
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation
arXiv 2024
Persona Knowledge-Aligned Prompt Tuning Method for Online Debate
arXiv 2024
Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function
arXiv 2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
arXiv 2024
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
arXiv 2024
What Are the Odds? Language Models Are Capable of Probabilistic Reasoning
arXiv 2024
MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models
arXiv 2023
Edit Temporal-Consistent Videos with Image Diffusion Model
arXiv 2023
Objects do not disappear: Video object detection by single-frame object location anticipation
ICCV 2023 1
MSECNet: Accurate and Robust Normal Estimation for 3D Point Clouds by Multi-Scale Edge Conditioning
arXiv 2023
AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph
arXiv 2023
From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models
arXiv 2023
CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering
arXiv 2023
NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video: Dataset, Methods and Results
arXiv 2022
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
arXiv 2022
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization
arXiv 2022
MMChat: Multi-Modal Chat Dataset on Social Media
LREC 2022 6
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
arXiv 2021
Frequent co-authors
10 from 54 papers