Ning Ding
Tsinghua researcher known for parameter-efficient fine-tuning, UltraFeedback, and OpenBMB open-source LLM tooling.
- Role: researcher
- Currently at: Tsinghua University
- GitHub: github.com/ningding97
- Scholar: scholar.google.com/citations
- Papers: 45
Authored papers (45)
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
arXiv 2026
Post-Trained MoE Can Skip Half Experts via Self-Distillation
arXiv 2026
AI Can Learn Scientific Taste
arXiv 2026
Toward Efficient Agents: Memory, Tool learning, and Planning
arXiv 2026
MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling
arXiv 2026
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads
arXiv 2026
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
arXiv 2025
TTRL: Test-Time Reinforcement Learning
arXiv 2025
MiniCPM4: Ultra-Efficient LLMs on End Devices
arXiv 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
arXiv 2025
Farseer: A Refined Scaling Law in Large Language Models
arXiv 2025
MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe
arXiv 2025
SSRL: Self-Search Reinforcement Learning
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
P1: Mastering Physics Olympiads with Reinforcement Learning
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones
arXiv 2025
RLPR: Extrapolating RLVR to General Domains without Verifiers
arXiv 2025
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
arXiv 2025
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
arXiv 2025
JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
arXiv 2025
UltraIF: Advancing Instruction Following from the Wild
arXiv 2025
UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset
arXiv 2024
Advancing LLM Reasoning Generalists with Preference Trees
arXiv 2024
UltraMedical: Building Specialized Generalists in Biomedicine
arXiv 2024
How to Synthesize Text Data without Model Collapse?
arXiv 2024
Free Process Rewards without Process Labels
arXiv 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
arXiv 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
arXiv 2024
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
arXiv 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
arXiv 2024
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
EMNLP
UltraFeedback: Boosting Language Models with High-quality Feedback
ICML
GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?
arXiv 2023
Tool Learning with Foundation Models
arXiv 2023
OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models
arXiv 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
arXiv 2023
Exploring the Impact of Model Scaling on Parameter-Efficient Tuning
arXiv 2023
Sparse Low-rank Adaptation of Pre-trained Language Models
arXiv 2023
KoLA: Carefully Benchmarking World Knowledge of Large Language Models
arXiv 2023
OpenPrompt: An Open-source Framework for Prompt-learning
ACL 2022 5
Few-NERD: A Few-Shot Named Entity Recognition Dataset
ACL 2021 5
Tool contributions (1)
Affiliations
Previously
Frequent co-authors
10 from 45 papers
Bowen Zhou (professor)
Zhiyuan Liu (professor)
Ganqu Cui (researcher)
Kaiyan Zhang
Maosong Sun (professor)
Yuxin Zuo
Xingtai Lv
Xuekai Zhu
Ermo Hua
Lifan Yuan (grad-student)