Representation Learning in the Context of Natural Language Understanding
I will be working with deep learning models that try to understand human language and to construct answers and sentences in it. More specifically, I will be looking at multimodal systems, which make use of several types of data, such as visual and linguistic data: for example, images or videos as the visual input, and questions, conversations, or summaries as the linguistic input. My aim is to build models that understand language well enough to reason about images and perform tasks such as visual question answering.