NEFTune: Noisy Embeddings Improve Instruction Finetuning

Injecting noise into embedding vectors during language model fine-tuning significantly enhances performance across various modern instruction datasets.

Year
2023
Venue
arXiv 2023
Authors
13

Attribution

Abstract & full text
arxiv.org/abs/2310.05914v2
TL;DR
Semantic Scholar

Abstract

We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instruct see a 10% improvement, with ShareGPT an 8% improvement, and with OpenPlatypus an 8% improvement. Even powerful models further refined with RLHF such as LLaMA-2-Chat benefit from additional training with NEFTune.
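The core operation, adding noise to the embedding vectors during training only, can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not specify the noise distribution or its scale; the sketch assumes uniform noise in [-1, 1] scaled by alpha / sqrt(L * d), where L is the sequence length and d the embedding dimension, as described in the full paper. The function name `add_neftune_noise`, the list-of-lists representation, and the default `alpha=5.0` are illustrative choices.

```python
import math
import random


def add_neftune_noise(embeddings, alpha=5.0, training=True, rng=None):
    """Return a noisy copy of a (seq_len x dim) embedding matrix.

    Assumption: noise is Uniform(-1, 1) scaled by alpha / sqrt(seq_len * dim).
    `embeddings` is a list of equal-length lists of floats (hypothetical layout).
    """
    # At evaluation time (or for empty input) the embeddings pass through unchanged.
    if not training or not embeddings:
        return [row[:] for row in embeddings]

    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])

    # Longer sequences and wider embeddings receive smaller per-entry noise.
    scale = alpha / math.sqrt(seq_len * dim)

    return [
        [value + scale * rng.uniform(-1.0, 1.0) for value in row]
        for row in embeddings
    ]


# Usage: perturb a toy 3-token, 4-dimensional embedding matrix.
toy = [[0.0] * 4 for _ in range(3)]
noisy = add_neftune_noise(toy, alpha=5.0, rng=random.Random(0))
```

In a real fine-tuning loop the same perturbation would be applied to the output of the model's embedding layer on each forward pass during training, and switched off for evaluation and generation.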
