Attribution
The Road Less Scheduled
arXiv 2024
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
Learning-Rate-Free Learning by D-Adaptation
arXiv 2023
Optimal Linear Decay Learning Rate Schedules and Further Refinements
from 4 papers
Aaron Defazio
Ashok Cutkosky
Harsh Mehta
Ahmed Khaled
Hao Mark Chen
Hongxiang Fan
Ka Fai Cedric Yiu
Rui Li
Stylianos I. Venieris
Wayne Luk