Qian Liu

Papers
50

Authored papers

50

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations

arXiv 2026

2026

Proxy Compression for Language Modeling

arXiv 2026

2026

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

arXiv 2025

2025

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

arXiv 2025

2025

General-Reasoner: Advancing LLM Reasoning Across All Domains

arXiv 2025

2025

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

arXiv 2025

2025

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

arXiv 2025

2025

SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development

arXiv 2025

2025

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

arXiv 2025

2025

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

arXiv 2025

2025

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

arXiv 2025

2025

Diffusion Language Models are Super Data Learners

arXiv 2025

2025

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

arXiv 2025

2025

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

arXiv 2025

2025

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

arXiv 2025

2025

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

arXiv 2025

2025

When Attention Sink Emerges in Language Models: An Empirical View

arXiv 2024

2024

Sailor: Open Language Models for South-East Asia

arXiv 2024

2024

Scaling up Masked Diffusion Models on Text

arXiv 2024

2024

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

arXiv 2024

2024

Faithful Logical Reasoning via Symbolic Chain-of-Thought

arXiv 2024

2024

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

arXiv 2024

2024

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

arXiv 2024

2024

Mercury: A Code Efficiency Benchmark for Code Large Language Models

arXiv 2024

2024

EVOR: Evolving Retrieval for Code Generation

arXiv 2024

2024

Beyond Memorization: The Challenge of Random Memory Access in Language Models

arXiv 2024

2024

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

arXiv 2024

2024

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

arXiv 2024

2024

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

arXiv 2024

2024

MANTIS: Interleaved Multi-Image Instruction Tuning

arXiv 2024

2024

RegMix: Data Mixture as Regression for Language Model Pre-training

arXiv 2024

2024

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

arXiv 2024

2024

Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning

arXiv 2024

2024

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

arXiv 2024

2024

Bootstrapping Language Models with DPO Implicit Rewards

arXiv 2024

2024

GrainGrasp: Dexterous Grasp Generation with Fine-grained Contact Guidance

arXiv 2024

2024

SantaCoder: don't reach for the stars!

arXiv 2023

2023

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

arXiv 2023

2023

Active Retrieval Augmented Generation

arXiv 2023

2023

OctoPack: Instruction Tuning Code Large Language Models

arXiv 2023

2023

OpenAgents: An Open Platform for Language Agents in the Wild

arXiv 2023

2023

Reasoning Implicit Sentiment with Chain-of-Thought Prompting

arXiv 2023

2023

Bag of Tricks for Training Data Extraction from Language Models

arXiv 2023

2023

Generative Table Pre-training Empowers Models for Tabular Prediction

arXiv 2023

2023

From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning

arXiv 2023

2023

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

arXiv 2023

2023

Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination

arXiv 2023

2023

Lemur: Harmonizing Natural Language and Code for Language Agents

arXiv 2023

2023

OpenFE: Automated Feature Generation with Expert-level Performance

arXiv 2022

2022

TAPEX: Table Pre-training via Learning a Neural SQL Executor

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 50 papers