Attribution
Analysing The Impact of Sequence Composition on Language Model Pre-Training
arXiv 2024
Focused Transformer: Contrastive Training for Context Scaling
NeurIPS 2023
from 2 papers
Piotr Miłoś
Szymon Tworkowski
Henryk Michalewski
researcher
Mikołaj Pacek
Pasquale Minervini
Wei Liu
Yu Zhao
Yuanbin Qu
Yuhuai Wu
Yuxiang Wu