Qian Liu
- Papers: 50
Authored papers (50)
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
arXiv 2026
Proxy Compression for Language Modeling
arXiv 2026
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
arXiv 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
arXiv 2025
General-Reasoner: Advancing LLM Reasoning Across All Domains
arXiv 2025
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
arXiv 2025
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
arXiv 2025
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
arXiv 2025
Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization
arXiv 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
arXiv 2025
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle
arXiv 2025
Diffusion Language Models are Super Data Learners
arXiv 2025
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
arXiv 2025
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
arXiv 2025
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
arXiv 2025
NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
arXiv 2025
When Attention Sink Emerges in Language Models: An Empirical View
arXiv 2024
Sailor: Open Language Models for South-East Asia
arXiv 2024
Scaling up Masked Diffusion Models on Text
arXiv 2024
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
arXiv 2024
Faithful Logical Reasoning via Symbolic Chain-of-Thought
arXiv 2024
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
arXiv 2024
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
arXiv 2024
Mercury: A Code Efficiency Benchmark for Code Large Language Models
arXiv 2024
EVOR: Evolving Retrieval for Code Generation
arXiv 2024
Beyond Memorization: The Challenge of Random Memory Access in Language Models
arXiv 2024
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
arXiv 2024
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
arXiv 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
arXiv 2024
MANTIS: Interleaved Multi-Image Instruction Tuning
arXiv 2024
RegMix: Data Mixture as Regression for Language Model Pre-training
arXiv 2024
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
arXiv 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
arXiv 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
arXiv 2024
Bootstrapping Language Models with DPO Implicit Rewards
arXiv 2024
GrainGrasp: Dexterous Grasp Generation with Fine-grained Contact Guidance
arXiv 2024
SantaCoder: don't reach for the stars!
arXiv 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
arXiv 2023
Active Retrieval Augmented Generation
arXiv 2023
OctoPack: Instruction Tuning Code Large Language Models
arXiv 2023
OpenAgents: An Open Platform for Language Agents in the Wild
arXiv 2023
Reasoning Implicit Sentiment with Chain-of-Thought Prompting
arXiv 2023
Bag of Tricks for Training Data Extraction from Language Models
arXiv 2023
Generative Table Pre-training Empowers Models for Tabular Prediction
arXiv 2023
From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning
arXiv 2023
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models
arXiv 2023
Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination
arXiv 2023
Lemur: Harmonizing Natural Language and Code for Language Agents
arXiv 2023
OpenFE: Automated Feature Generation with Expert-level Performance
arXiv 2022
TAPEX: Table Pre-training via Learning a Neural SQL Executor
Affiliations
Frequent co-authors
10 from 50 papers