Multimodal Learning Methods for Commonsense Enriched Learning

Machine understanding of natural language remains a difficult task in natural language processing research. A common stumbling block is how to incorporate world knowledge and common sense into models, since this knowledge is often left implicit in natural language. This PhD research investigates novel techniques for integrating commonsense knowledge into the representation learning of natural language by using data from other modalities, such as images and video. This is in line with how humans acquire knowledge and understand language: we often understand language by visualizing what is meant. The main focus of this PhD research is the learning of cross-modal representations involving natural language and visual data. The research will investigate generative methods that use artificial neural networks and attention mechanisms to achieve this goal. The commonsense knowledge captured in the learned representations can improve machine understanding of natural language, which is evaluated in tasks such as the generation of 3D models from language. These results might contribute to real-world applications such as the automated generation of video from written scripts, but the research itself is fundamental. Conversely, by translating the learned commonsense-enriched representations of visual data back to natural language, we can explain and describe the learned commonsense knowledge in natural language. This will contribute to the interpretability of the learned neural representations. This PhD study is part of the DEEPTEMPL project (FWO-SNSF G078618N) and the CALCULUS project (Horizon 2020, ERC-2017-ADG, 788506).

Date:1 Sep 2021 → Today
Keywords:Natural Language Processing, Computer Vision, Multimodal, Artificial Neural Networks, Generative Models, Common Sense
Disciplines:Data visualisation and imaging, Modelling and simulation, Natural language processing, Knowledge representation and reasoning
Project type:PhD project