< Back to previous page

Publication

Subgraph pattern matching over uncertain graphs with identity linkage uncertainty

Book Contribution - Book Chapter Conference Contribution

There is a growing need for methods that can represent and query uncertain graphs. These uncertain graphs are often the result of an information extraction and integration system that attempts to extract an entity graph or a knowledge graph from multiple unstructured sources. Such an integration typically leads to identity uncertainty, as different data sources may use different references tothe same underlying real-world entities. Integration usually also introduces additional uncertainty on node attributes and edge existence. In this paper, we propose the notion of a probabilistic entity graph (PEG), a formal model that uniformly and systematically addresses these three types of uncertainty. A PEG is a probabilistic graph model that defines a distribution over possible graphs at the entity level. We introduce a general framework for constructing a PEG given uncertain data at the reference level and develop efficient algorithms to answer subgraph pattern matching queries in this setting.Our algorithms are based on two novel ideas: context-aware path indexing and reduction by join-candidates, which drastically reduce the query search space. A comprehensive experimental evaluation shows that our approach outperforms baseline implementations by orders of magnitude.
Book: Proceedings of the IEEE 30th International Conference on Data Engineering (ICDE)
Pages: 904 - 915
ISBN:9781479925544
Publication year:2014
BOF-keylabel:yes
IOF-keylabel:yes
Authors from:Higher Education
Accessibility:Open