MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies
arXiv 2023
Mark Dredze
Mohit Bansal
Ozan İrsoy
Shijie Wu
Shiyue Zhang
Steven Lu