< Terug naar vorige pagina


Subgraph pattern matching over uncertain graphs with identity linkage uncertainty

Boekbijdrage - Boekhoofdstuk Conferentiebijdrage

There is a growing need for methods that can represent and query uncertain graphs. These uncertain graphs are often the result of an information extraction and integration system that attempts to extract an entity graph or a knowledge graph from multiple unstructured sources. Such an integration typically leads to identity uncertainty, as different data sources may use different references tothe same underlying real-world entities. Integration usually also introduces additional uncertainty on node attributes and edge existence. In this paper, we propose the notion of a probabilistic entity graph (PEG), a formal model that uniformly and systematically addresses these three types of uncertainty. A PEG is a probabilistic graph model that defines a distribution over possible graphs at the entity level. We introduce a general framework for constructing a PEG given uncertain data at the reference level and develop efficient algorithms to answer subgraph pattern matching queries in this setting.Our algorithms are based on two novel ideas: context-aware path indexing and reduction by join-candidates, which drastically reduce the query search space. A comprehensive experimental evaluation shows that our approach outperforms baseline implementations by orders of magnitude.
Boek: Proceedings of the IEEE 30th International Conference on Data Engineering (ICDE)
Pagina's: 904 - 915
Jaar van publicatie:2014
Authors from:Higher Education