Representation Learning for Sign Language Translation Using Linguistic and Knowledge-based Constraints
This PhD project is an integral part of the Horizon 2020 research project SignON, which unites 17 European partners with the aim of facilitating the exchange of information among deaf, hard-of-hearing, and hearing individuals across Europe by developing automatic sign language translation tools. Recent machine learning methods based on neural transformer architectures have greatly improved the state of the art in natural language processing, both for translation and for general natural language understanding. The same architectures have been applied to the multi-modal problem of sign language translation, with similarly promising results. However, due to the inherent complexity of the task, most neural approaches do not translate end-to-end (i.e., directly from sign to text), but first transform the signs into an intermediate, gloss-based transcription (sign to gloss), and in a second step translate this intermediate representation into verbal language (gloss to text).

Using glosses as an interface for sign language translation is fairly successful, but it also poses a number of problems. Gloss annotations are an imprecise representation of sign language: they are often impoverished and do not do justice to the complex multi-channel production of sign language. This project will therefore focus on the intermediate representation that functions as an interface between sign language and verbal language in the context of sign language translation. Research will be carried out along two tracks. Firstly, the project will consider the development of a multi-faceted interlingual representation for sign language translation that can function as a sufficiently rich interface between sign language and verbal language, and that is tailored towards machine learning methods.
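The two-stage pipeline described above can be sketched schematically. The lookup tables and sign identifiers below are purely illustrative stand-ins for the trained recognition and translation models a real system would use; only the overall sign → gloss → text structure reflects the pipeline itself.

```python
from typing import List

# Toy stand-ins for trained models (illustrative only):
# stage 1 would be a sign recognition model, stage 2 a gloss-to-text translator.
SIGN_TO_GLOSS = {
    "sign_001": "HOUSE",
    "sign_002": "GO",
    "sign_003": "I",
}
GLOSS_TO_TEXT = {
    ("I", "HOUSE", "GO"): "I am going home.",
}

def sign_to_gloss(sign_ids: List[str]) -> List[str]:
    """Stage 1 (sign to gloss): map recognized signs to gloss labels."""
    return [SIGN_TO_GLOSS[s] for s in sign_ids]

def gloss_to_text(glosses: List[str]) -> str:
    """Stage 2 (gloss to text): translate the gloss sequence into verbal language."""
    # Fall back to the raw gloss string when no translation is known.
    return GLOSS_TO_TEXT.get(tuple(glosses), " ".join(glosses).lower())

def translate(sign_ids: List[str]) -> str:
    """End-to-end translation via the intermediate gloss representation."""
    return gloss_to_text(sign_to_gloss(sign_ids))

print(translate(["sign_003", "sign_001", "sign_002"]))  # I am going home.
```

The gloss sequence is the single interface between the two stages, which is precisely why its expressiveness (or lack thereof) matters: any information the glosses fail to encode is lost before the second stage begins.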
Crucially, the representation needs to be sufficiently rich to capture the intricacies of elaborate, multi-channel sign language, but at the same time lenient enough to be incorporated into the classification-based optimization objective that is inherent to machine learning approaches. Secondly, the project will examine how the resulting representations can be exploited as soft constraints to improve the output predictions of the neural machine translation architecture for sign language. By augmenting the network output with representation-based constraints, modeled as a priori distributions over the neural network's output distribution, possible discrepancies due to the scarcity of resources can be mitigated. The results of the project will be integrated into an overarching system for sign language translation, in cooperation with the European partners of the SignON project.
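The soft-constraint idea described above can be illustrated with a minimal sketch: a representation-based prior is combined with the model's output distribution by log-linear interpolation, down-weighting tokens the prior deems implausible without hard-masking them. The vocabulary, logits, prior values, and interpolation weight below are arbitrary assumptions chosen for illustration, not part of the project's actual design.

```python
import numpy as np

def constrained_distribution(logits: np.ndarray,
                             prior: np.ndarray,
                             weight: float = 1.0) -> np.ndarray:
    """Combine model logits with a representation-based prior as a soft
    constraint, via log-linear interpolation in log space."""
    log_model = logits - np.logaddexp.reduce(logits)   # log-softmax of the model
    log_prior = np.log(prior + 1e-12)                  # epsilon guards against log(0)
    combined = log_model + weight * log_prior          # soft constraint, not a hard mask
    combined -= np.logaddexp.reduce(combined)          # renormalize to a distribution
    return np.exp(combined)

# Toy example: four output tokens with assumed scores.
vocab = ["HOUSE", "GO", "I", "<unk>"]
logits = np.array([1.0, 2.0, 0.5, 1.5])
# Prior assigning near-zero mass to tokens the representation rules out.
prior = np.array([0.4, 0.4, 0.19, 0.01])
p = constrained_distribution(logits, prior, weight=1.0)
```

Because the prior enters as a weighted additive term in log space, its influence can be tuned continuously: with `weight=0` the model's distribution is returned unchanged, while larger weights push probability mass away from tokens the representation disfavours, which is the sense in which the constraint is "soft".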