Historical written artefacts are multi-dimensional objects with several modalities that are typically analysed separately by dedicated computational systems. These modalities arise as research data from the study of the artefacts and include digital images, measurements of material properties, and metadata on historical contexts. In most cases, these modalities are interrelated and interdependent. Understanding the relationships between the different modalities, and learning to associate them, can therefore be essential for a holistic understanding that goes beyond the textual contents of historical written artefacts. Recent advances in research on multimodal models offer the possibility of analysing the different modalities of historical artefacts and modelling the relationships between them. Such models can support scholars in tasks such as text-based image retrieval and visual question answering. This work aims to explore the potential of multimodal models by expressing the different modalities of research data on historical written artefacts in image and text formats, so that vision-language models can be employed.
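To make the intended use concrete, the following is a minimal sketch of text-based image retrieval with an off-the-shelf vision-language model (here CLIP via the Hugging Face transformers library); the model checkpoint, the example query, and the image file names are illustrative assumptions and not part of the work described above.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed off-the-shelf checkpoint; any compatible CLIP model could be substituted.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical digital images of written artefacts (the visual modality).
image_paths = ["manuscript_page_001.jpg", "manuscript_page_002.jpg"]
images = [Image.open(p) for p in image_paths]

# A textual modality, e.g. a catalogue description, used as the retrieval query.
query = "a parchment page with two columns of Latin text"

inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Similarity scores between the query and each image, ranked highest first.
scores = outputs.logits_per_text.squeeze(0)
for rank, idx in enumerate(scores.argsort(descending=True).tolist(), start=1):
    print(rank, image_paths[idx], float(scores[idx]))
```

In this sketch both modalities are expressed in the formats the vision-language model expects (an image and a short text), which is the general strategy pursued in this work; other modalities, such as material measurements or historical metadata, would first have to be rendered as images or verbalised as text before they can be handled in the same way.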