Project

Structured Machine Learning for Mapping Natural Language to Spatial Ontologies

Natural language understanding is one of the fundamental goals of artificial intelligence. An essential function of natural language is to talk about the location and translocation of objects in space. Understanding spatial language is important in many applications, such as geographical information systems, human-computer interaction, the provision of navigational instructions to robots, visualization and text-to-scene conversion.
Due to the complexity of spatial primitives and notions, and the challenges of designing ontologies for formal spatial representation, the extraction of spatial information from natural language still has to be placed in a well-defined framework. Machine learning has not been applied systematically to the task, and no established corpora are available. In this thesis I study the problem from cognitive, linguistic and computational points of view, with a primary focus on establishing a supervised machine learning framework.
This thesis makes five main research contributions. The first is the design of a spatial annotation scheme that bridges natural language and formal spatial representations. The scheme applies universal, commonly accepted cognitive spatial notions and multiple well-known qualitative spatial reasoning models.
The second is the definition of a novel computational linguistic task that utilizes the annotation scheme to map natural language to spatial ontologies. For this task I have built rich annotated corpora and an evaluation scheme.
The third is a detailed investigation of the linguistic features and structural characteristics of spatial language that aid the use of machine learning in extracting spatial roles and relations from annotated data. The learning methods used are discriminative graphical models and statistical relational learning.
The fourth is the proposal of a unified structured output learning model for ontologies. The ontology components are learnt while taking into account the ontological constraints and linguistic dependencies among the components. The ontology includes roles and relations, and multiple formal semantic types.
The fifth is the proposal of an efficient inference approach based upon constraint optimization. It can deal with a large number of variables and constraints, and makes building a global structured learning model for ontology population feasible. To test the approach I have performed an empirical investigation using my spatial ontology.
The application of my proposed unified learning model for ontology population is not limited to the extraction of spatial semantics; it could be used to populate any ontology. I argue therefore that this work is an important step towards automatically describing text with semantic labels that form a structured ontological representation of the content.
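As a toy illustration of what inference under ontological constraints means, the sketch below picks the best joint labelling of a three-word phrase. It is not the model or the solver used in the thesis: the role names, the scores and the single constraint are assumptions made up for this example, and the search is plain brute force rather than an efficient constraint-optimization method.

```python
from itertools import product

# Hypothetical role inventory, chosen for illustration only.
ROLES = ("trajector", "landmark", "indicator", "none")


def satisfies_constraints(labels):
    """Toy constraint: either no relation at all, or exactly one indicator
    with exactly one trajector and at most one landmark."""
    n_indicator = labels.count("indicator")
    n_trajector = labels.count("trajector")
    n_landmark = labels.count("landmark")
    if n_indicator == 0:
        return n_trajector == 0 and n_landmark == 0
    return n_indicator == 1 and n_trajector == 1 and n_landmark <= 1


def best_assignment(tokens, scores):
    """Enumerate every joint labelling and keep the best feasible one.

    scores[i][role] is the (made-up) local score for giving tokens[i]
    that role. Brute force is exponential in len(tokens); it is used
    here only to keep the example short.
    """
    best_labels, best_score = None, float("-inf")
    for labels in product(ROLES, repeat=len(tokens)):
        if not satisfies_constraints(labels):
            continue
        total = sum(scores[i][role] for i, role in enumerate(labels))
        if total > best_score:
            best_labels, best_score = labels, total
    return dict(zip(tokens, best_labels)), best_score


# Invented local scores for the phrase "book on table".
tokens = ["book", "on", "table"]
scores = [
    {"trajector": 0.6, "landmark": 0.3, "indicator": 0.0, "none": 0.1},
    {"trajector": 0.0, "landmark": 0.0, "indicator": 0.9, "none": 0.1},
    {"trajector": 0.5, "landmark": 0.4, "indicator": 0.0, "none": 0.1},
]

assignment, total_score = best_assignment(tokens, scores)
print(assignment)
```

With these invented scores, labelling each word independently would mark both "book" and "table" as trajectors, which violates the constraint. The joint search instead labels "table" as the landmark, which shows why roles are decided together rather than one at a time.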

Date: 27 Oct 2008 → 1 Jul 2013
Keywords: Relational Learning for Text and Image
Disciplines: Applied mathematics in specific fields
Project type: PhD project