SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog

Participants in SemEval-2025 Task 5 developed LLM-based systems for automated subject tagging of scientific and technical records using the GND taxonomy, with evaluations showing the effectiveness of LLM ensembles and synthetic data generation.

Year: 2025
Venue: arXiv 2025
ArXiv: arxiv.org/abs/2504.07199
Authors: 5
Hosting: Abstract only

Attribution

Abstract & full text: arxiv.org/abs/2504.07199v3
TL;DR: Semantic Scholar


Abstract

We present SemEval-2025 Task 5: LLMs4Subjects, a shared task on automated subject tagging for scientific and technical records in English and German using the GND taxonomy. Participants developed LLM-based systems to recommend top-k subjects, evaluated through quantitative metrics (precision, recall, F1-score) and qualitative assessments by subject specialists. Results highlight the effectiveness of LLM ensembles, synthetic data generation, and multilingual processing, offering insights into applying LLMs for digital library classification.
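The quantitative metrics named in the abstract can be illustrated with a minimal sketch of precision, recall, and F1 computed over the top-k recommended subjects for a single record. This is not the task's official scorer: the abstract does not specify the k values, the averaging scheme, or the matching rule, so the function name `precision_recall_f1_at_k`, the exact-match comparison of subject identifiers, the choice to divide precision by k, and the placeholder identifiers below are all assumptions made for illustration.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1_at_k(
    ranked: Sequence[str], gold: Set[str], k: int
) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 over the top-k predicted subjects.

    Assumption: a prediction counts as correct only if its identifier
    exactly matches one of the gold subject identifiers.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Drop duplicate predictions while keeping rank order, then cut at k.
    top_k = list(dict.fromkeys(ranked))[:k]
    hits = sum(1 for subject in top_k if subject in gold)
    # Assumption: precision is divided by k even if fewer than k
    # subjects were predicted (the usual precision@k convention).
    precision = hits / k
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical subject identifiers, not real GND entries.
ranked_subjects = ["gnd:A", "gnd:B", "gnd:C", "gnd:D"]
gold_subjects = {"gnd:B", "gnd:D", "gnd:E"}

for k in (2, 4):
    p, r, f = precision_recall_f1_at_k(ranked_subjects, gold_subjects, k)
    print(f"k={k}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A corpus-level score would then typically be obtained by averaging these per-record values, although whether the task averages per record or pools counts across records is not stated here.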

Authors

Jennifer D'Souza, Sameer Sadruddin, Holger Israel, Mathias Begoin, Diana Slawig