Attribution
Don't be lazy: CompleteP enables compute-efficient deep transformers
arXiv 2025
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
arXiv 2023
from 2 papers
Joel Hestness
Bin Claire Zhang
Blake Bordelon
Boris Hanin
Bowen Yang
Cengiz Pehlevan
Chen
Daria Soboleva
Faisal Al-Khateeb
Hemant Khachane