Online education platforms are powered by various NLP pipelines that use models like BERT to aid content curation. Since the introduction of pre-trained language models like BERT, there have been many efforts to adapt them to specific domains. However, to the best of our knowledge, no model has been adapted specifically to the education domain (particularly K-12) across subjects. In this work, we propose to train a language model on a corpus that we curated from various sources, covering multiple K-12 subjects. We also evaluate our model, K12-BERT, on downstream tasks such as hierarchical taxonomy tagging.
K-12BERT: BERT for K-12 education
A domain-specific language model, K12-BERT, is trained for K-12 education across multiple subjects and evaluated for hierarchical taxonomy tagging.
- Year: 2022
- Venue: arXiv 2022
- Authors: 6
- Hosting: Abstract only
Attribution
- Abstract & full text: arxiv.org/abs/2205.12335
- TL;DR: Semantic Scholar