Finding the balance between strict defaults and total openness: Collecting and managing metadata for spoken language corpora with the EXMARaLDA Corpus Manager
This paper presents the metadata model of the EXMARaLDA system and its implementations. It first reviews existing metadata schemes for transcriptions of spoken language as well as for written texts, highlighting their advantages and disadvantages. It then justifies the decision not to adopt any of the existing models, which led to a new data model that prescribes only a few metadata items and relies on XML files. The paper concludes with a brief outlook on ongoing efforts to standardize metadata.