
Publication

Applications of artificial intelligence for the resource-scarce cultural heritage domain

Book - Dissertation

Subtitle: from language and image processing to multi-modality
The cultural heritage domain, mainly represented by Galleries, Libraries, Archives, and Museums (GLAM), is massively digitising its collections, which leads to an increasing amount of raw digital material. Such material is slow and expensive to annotate, as it requires the intervention of highly skilled professionals who are difficult to find and costly to train. Artificial intelligence (AI) has made great progress over the last decade and now offers an excellent opportunity to automate data annotation in the GLAM sector. However, due to their limited budgets, cultural institutions can provide only small annotated datasets, which are insufficient to train modern AI models from scratch. In this thesis, we investigate and develop AI models that operate in the resource-scarce cultural heritage domain, which differs significantly from the general field of AI. We focus on artistic metadata and present six case studies, divided into three parts, that address different aspects of this domain. Part I is devoted to a subfield of natural language processing, namely neural machine translation, where we investigate the translation of artwork titles. In this context, we compare character-level and subword-level models, which are fine-tuned in two stages. To further improve translation quality, we propose to enrich the artwork titles with the corresponding textual definitions of Iconclass codes. Iconclass is an iconographic thesaurus that is widely used in the GLAM sector to describe various objects represented in visual artworks. In Part II, we apply computer vision models to visual reproductions of artworks containing musical instruments. First, we collect a small benchmark dataset for image classification and object detection. Next, we investigate style transfer as a data augmentation technique on the collected dataset. Finally, Part III focuses on multi-modal matching of Iconclass definitions with the corresponding artworks. We modify a cross-modal matching framework to take both textual and visual features into account. We then reimplement this model to exploit the transfer learning paradigm and compare the framework with models from the general AI domain.
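
To illustrate the title-enrichment idea described for Part I, the sketch below concatenates an artwork title with the textual definitions of its Iconclass codes before the text would be passed to a translation model. This is a minimal illustration only, not the thesis implementation: the enrich_title helper, the "[CTX]" separator, and the example title and definition are hypothetical, and the thesis may combine the fields differently.

    # Minimal sketch of Iconclass-based title enrichment (illustrative only,
    # not the thesis code). An artwork title is extended with the textual
    # definitions of its Iconclass codes, so that a translation model receives
    # extra iconographic context alongside the often short, ambiguous title.

    def enrich_title(title: str, iconclass_definitions: list[str]) -> str:
        """Append Iconclass definition texts to a title as added context.

        The "[CTX]" separator is a hypothetical choice; any marker the
        translation model is fine-tuned with would serve the same purpose.
        """
        context = "; ".join(d.strip() for d in iconclass_definitions if d.strip())
        return f"{title} [CTX] {context}" if context else title


    if __name__ == "__main__":
        # Hypothetical record: a Dutch still-life title and one Iconclass definition.
        title = "Stilleven met bloemen in een glazen vaas"
        definitions = ["flowers in a vase"]  # e.g. the definition text of an Iconclass code

        print(enrich_title(title, definitions))
        # -> "Stilleven met bloemen in een glazen vaas [CTX] flowers in a vase"

Keeping the title and the Iconclass context in a single input sequence is one straightforward way to expose the extra information to a fine-tuned translation model; whether it helps depends on the model and data, which is what the thesis evaluates.
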
Number of pages: 156
Publication year: 2022
Keywords: Doctoral thesis
Accessibility: Open