

Understanding the Multi-Modal World: Towards Multi-Modal Semantics, Information Search and Retrieval.

As more and more data are now available in more than one modality (e.g., text, images, video) rather than as text only, there is a pressing need to move research from unimodal text environments to multi-modal environments, and to jointly model information that naturally comes from multiple modalities. This project focuses on the textual and visual modalities. One essential step is learning representations of multi-modal data from both textual and visual/perceptual inputs. As one part of the project, I propose to work towards multi-modal and cross-modal models of information search and retrieval based on these joint representations (e.g., retrieving images that are relevant to a textual query and vice versa, or retrieving information based on a multi-modal query comprising both textual and visual information). Another part of the project addresses the recently started initiative on multi-modal semantics, where the goal is to learn semantic concept representations from both linguistic and visual input. I plan to investigate whether including visual information leads to improved models of semantic representation, with an emphasis on cross-lingual models of semantic similarity and association. Different languages use different words for the same concept, which may nevertheless be depicted by the same image: for instance, elephant (EN), olifant (NL), and slon (HR) are words in three different languages that refer to the same image of an elephant. I therefore plan to study whether shared visual input can help to bridge the lexical chasm across languages.
Date: 1 Oct 2014 → 30 Sep 2015
Keywords: Multi-modal information retrieval, Cross-modal modeling, Multi-modal data representation, Multi-modal semantics, Cross-linguality, Semantic similarity, Semantic association
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences, Artificial intelligence, Cognitive science and intelligent systems