Attribution
Better LLM Reasoning via Dual-Play
arXiv 2025
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
arXiv 2024
from 3 papers
Chengyu Huang
Claire Cardie
professor
Aochong Oliver Li
Dan Zhao
Gabriele Oliaro
Qing Li
Xupeng Miao
Yong Jiang
Zhihao Jia