Ning Ding

Tsinghua researcher known for parameter-efficient fine-tuning, UltraFeedback, and OpenBMB open-source LLM tooling.

Role
researcher
Papers
45

45 papers · 1 tool contribution

Authored papers

45

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

arXiv 2026

2026

Post-Trained MoE Can Skip Half Experts via Self-Distillation

arXiv 2026

2026

AI Can Learn Scientific Taste

arXiv 2026

2026

Toward Efficient Agents: Memory, Tool learning, and Planning

arXiv 2026

2026

MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling

arXiv 2026

2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

arXiv 2026

2026

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

arXiv 2025

2025

TTRL: Test-Time Reinforcement Learning

arXiv 2025

2025

MiniCPM4: Ultra-Efficient LLMs on End Devices

arXiv 2025

2025

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

arXiv 2025

2025

Process Reinforcement through Implicit Rewards

arXiv 2025

2025

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

arXiv 2025

2025

Farseer: A Refined Scaling Law in Large Language Models

arXiv 2025

2025

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

arXiv 2025

2025

SSRL: Self-Search Reinforcement Learning

arXiv 2025

2025

FlowRL: Matching Reward Distributions for LLM Reasoning

arXiv 2025

2025

A Survey of Reinforcement Learning for Large Reasoning Models

arXiv 2025

2025

P1: Mastering Physics Olympiads with Reinforcement Learning

arXiv 2025

2025

Towards a Unified View of Large Language Model Post-Training

arXiv 2025

2025

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

arXiv 2025

2025

RLPR: Extrapolating RLVR to General Domains without Verifiers

arXiv 2025

2025

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

arXiv 2025

2025

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization

arXiv 2025

2025

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

arXiv 2025

2025

UltraIF: Advancing Instruction Following from the Wild

arXiv 2025

2025

UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset

arXiv 2024

2024

Advancing LLM Reasoning Generalists with Preference Trees

arXiv 2024

2024

UltraMedical: Building Specialized Generalists in Biomedicine

arXiv 2024

2024

How to Synthesize Text Data without Model Collapse?

arXiv 2024

2024

Free Process Rewards without Process Labels

arXiv 2024

2024

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

arXiv 2024

2024

Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process

arXiv 2024

2024

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

arXiv 2024

2024

Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

arXiv 2024

2024

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

EMNLP

2023

UltraFeedback: Boosting Language Models with High-quality Feedback

ICML

2023

GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?

arXiv 2023

2023

Tool Learning with Foundation Models

arXiv 2023

2023

OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

arXiv 2023

2023

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model

arXiv 2023

2023

Exploring the Impact of Model Scaling on Parameter-Efficient Tuning

arXiv 2023

2023

Sparse Low-rank Adaptation of Pre-trained Language Models

arXiv 2023

2023

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

arXiv 2023

2023

OpenPrompt: An Open-source Framework for Prompt-learning

ACL 2022

2021

Few-NERD: A Few-Shot Named Entity Recognition Dataset

ACL 2021

2021

Tool contributions

1

Affiliations

Currently at

Tsinghua University

researcher · university lab

Frequent co-authors

10

from 45 papers