Attribution
How Transformers Learn Causal Structure with Gradient Descent
arXiv 2024
Fine-Tuning Language Models with Just Forward Passes
from 2 papers
Eshaan Nichani
Jason D. Lee
Danqi Chen
professor
Sadhika Malladi
Sanjeev Arora
Tianyu Gao