Linguistic annotation helps corpus users retrieve relevant examples efficiently. It also supports the identification of latent patterns in the data by encoding generalizations such as parts of speech. Since manual annotation is very time-consuming, many projects use off-the-shelf tools to annotate or at least preprocess their data. This article discusses the pros and cons of such automatic annotation, using part-of-speech tagging as an example case. It argues that errors made by annotation tools are systematic in nature and hence predictable to a certain extent. In addition, the article addresses the issue of the descriptive adequacy of tagsets. In particular, it discusses how well the Stuttgart-Tübingen Tagset (STTS) describes German parts of speech. Finally, the article briefly addresses normalization, an additional preprocessing step that is sometimes required before automatic annotation tools can be applied.
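The claim that tagger errors are systematic can be illustrated with a minimal sketch (not the article's own method): a toy unigram tagger that assigns each word form its most frequent tag from hypothetical training counts. Because such a tagger can learn only one tag per word form, every minority reading of an ambiguous form is mistagged in exactly the same way. The German word "sein", tag names from STTS (PPOSAT for the possessive pronoun, VAINF for the infinitive of "to be", NN for common nouns), and the tiny training list are all illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (word, STTS tag) pairs.
# "sein" occurs both as a possessive pronoun (PPOSAT) and as the
# infinitive "to be" (VAINF); a unigram tagger can only learn one
# tag per word form, so one reading will always lose.
training = [
    ("sein", "PPOSAT"), ("sein", "PPOSAT"), ("sein", "VAINF"),
    ("Hund", "NN"), ("bellt", "VVFIN"),
]

counts = defaultdict(Counter)
for word, tag_ in training:
    counts[word][tag_] += 1

# Majority-vote lexicon: each word form gets its most frequent tag.
lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(tokens):
    # Unknown words fall back to NN, a common default heuristic.
    return [(t, lexicon.get(t, "NN")) for t in tokens]

# Every occurrence of "sein" receives PPOSAT -- including its verbal
# uses, which are therefore *systematically* (hence predictably) wrong.
print(tag(["sein", "Hund"]))  # [('sein', 'PPOSAT'), ('Hund', 'NN')]
```

Real statistical taggers condition on context as well, but the same effect persists wherever the model's view of the data underdetermines the correct tag, which is why such errors are predictable to a certain extent.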