This paper presents NorNE, a manually annotated corpus of named entities which extends the annotation of the existing Norwegian Dependency Treebank. Covering both official standards of written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000 tokens and annotates a rich set of entity types including persons, organizations, locations, geo-political entities, products, and events, in addition to a class corresponding to nominals derived from names. We present details on the annotation effort, guidelines, and inter-annotator agreement, together with an experimental analysis of the corpus using a neural sequence labeling architecture.
NorNE: Annotating Named Entities for Norwegian
A neural sequence labeling architecture is evaluated using NorNE, a manually annotated corpus of named entities for Norwegian, encompassing Bokmål and Nynorsk.
- Year
- 2019
- Venue
- norne-annotating-named-entities-for-norwegian-1
- Authors
- 5
Attribution
- Abstract & full text
- arxiv.org/abs/1911.12146v2
- TL;DR
- Semantic Scholar