Revisiting Instruction Fine-tuned Model Evaluation to Guide Industrial Applications

LLM-based metrics are shown to suit the evaluation requirements of Instruction Fine-Tuning (IFT) and are used to study task-specialization strategies and their trade-offs in industrial settings.

Year: 2023
Venue: arXiv 2023
ArXiv: arxiv.org/abs/2310.14103
Authors: 4
Hosting: Abstract only

Attribution

Abstract & full text: arxiv.org/abs/2310.14103
TL;DR: Semantic Scholar

Abstract

Instruction Fine-Tuning (IFT) is a powerful paradigm that strengthens the zero-shot capabilities of Large Language Models (LLMs), but in doing so induces new evaluation metric requirements. We show LLM-based metrics to be well adapted to these requirements, and leverage them to conduct an investigation of task-specialization strategies, quantifying the trade-offs that emerge in practical industrial settings. Our findings offer practitioners actionable insights for real-world IFT model deployment.

Authors

Manuel Faysse, Céline Hudelot, Pierre Colombo, Gautier Viaud