Project

Querying distributed dynamic data collections (R-4211)

The ability to seamlessly query data residing in several distinct, distributed data sources has been a major driving force in database research and led to the development of the first distributed database systems as early as the 1980s. Today, our information society is characterized by data that is inherently dispersed but does not necessarily reside in interconnected databases. Consider, for instance, the many, mostly independently operated data collections in the life sciences, where connections among data sources can either be hard-coded as explicit links or be implicit in the semantics of the data. While each such collection can be queried through various portals, there are no centralized portals that provide an integrated interface to all of the data. An important reason for the lack of such centralized systems is the continuing growth in the number of data sources on the Web, stemming from the ease with which scientists can make data available online. The combined data therefore forms a network of linked databases in which data sources can be added and removed dynamically, and whose contents can be changed, copied, transformed, or even revoked without notice. The aim of this proposal is to study and develop techniques for querying such dynamic distributed data collections. Our approach is based on three pillars: (1) the study of navigational query languages for linked data; (2) the study of distributed computing methods for distributed query evaluation; and (3) the use of provenance as a mechanism for monitoring alterations to data sources.
Date: 1 Jan 2013 → 31 Dec 2016
Keywords: DATABASE THEORY
Disciplines: Applied mathematics in specific fields