
Project

Advanced Data Modeling and Mapping Support for NoSQL Databases

In recent years, the database landscape has become highly diverse, as database systems must handle increasingly large volumes of heterogeneous data. This requirement exposes several limitations of traditional relational database systems, which are too rigid and unable to scale horizontally. In response, “not only SQL” or NoSQL databases emerged, an umbrella term for database systems that typically offer more flexible data models, specialized functionality, and high availability through horizontal and elastic scalability.

Each class of NoSQL database systems targets a different use case and data model, with a set of functionalities associated with that data model; currently over 225 such technologies exist. The challenges and domains examined in this research are motivated by the compelling benefits of NoSQL and mainly stem from the novelty, heterogeneity, and distributed nature of these systems. We address three main challenges: (i) database selection and technology comparison, (ii) improving developer adoption via object-database mappers that offer a uniform interface and data model abstraction, and (iii) using and configuring these databases efficiently (e.g. DB schema selection) for a given application case.

The first research challenge is selecting a NoSQL technology for an intended use case. In practice, database benchmarks serve as a reference for comparing NoSQL technologies. Given the heterogeneity of NoSQL databases, we investigate which benchmarks exist, how they deal with the large disparity between systems, and whether they are adequate to support selection.

Secondly, adopting NoSQL databases can be challenging for developers who program against low-level native DB APIs, which poses the risk of vendor lock-in and hinders technology migration. Object-database mappers allow developers to access databases through a familiar (e.g. standardized) interface and object data model. We investigate how these mappers, when implementing NoSQL support, have adapted to the challenges of (a) the novel and numerous dimensions in data modeling, and (b) the mapping of (uniform) application queries to diverse (non-standardized) APIs.
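
As an illustration of the mapping problem, the following is a minimal sketch of how one object model might be mapped onto two different native APIs. Everything in it is hypothetical: the `User` class, the two in-memory backends (stand-ins for a key-value store and a document store), and the mapper classes are assumptions made for this example and do not correspond to any framework covered by the survey.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    """Hypothetical application object."""
    id: str
    name: str
    email: str


class KeyValueBackend:
    """In-memory stand-in for a key-value store (opaque string values)."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentBackend:
    """In-memory stand-in for a document store (dicts in named collections)."""
    def __init__(self):
        self._collections = {}

    def insert(self, collection, doc):
        self._collections.setdefault(collection, {})[doc["_id"]] = doc

    def find_by_id(self, collection, doc_id):
        return self._collections.get(collection, {}).get(doc_id)


class KeyValueMapper:
    """Maps objects to serialized values under a 'Type:id' key."""
    def __init__(self, backend):
        self.backend = backend

    def save(self, obj):
        self.backend.put(f"{type(obj).__name__}:{obj.id}", json.dumps(asdict(obj)))

    def load(self, cls, obj_id):
        raw = self.backend.get(f"{cls.__name__}:{obj_id}")
        return cls(**json.loads(raw)) if raw is not None else None


class DocumentMapper:
    """Maps objects to documents, renaming 'id' to the store's '_id' field."""
    def __init__(self, backend):
        self.backend = backend

    def save(self, obj):
        doc = asdict(obj)
        doc["_id"] = doc.pop("id")
        self.backend.insert(type(obj).__name__, doc)

    def load(self, cls, obj_id):
        doc = self.backend.find_by_id(cls.__name__, obj_id)
        if doc is None:
            return None
        data = dict(doc)
        data["id"] = data.pop("_id")
        return cls(**data)


# Application code is identical for both stores: only the mapper differs.
user = User(id="u1", name="Ada", email="ada@example.org")
for mapper in (KeyValueMapper(KeyValueBackend()), DocumentMapper(DocumentBackend())):
    mapper.save(user)
    restored = mapper.load(User, "u1")
```

Both mappers expose the same `save` and `load` calls, so application code stays unchanged when the underlying store is swapped. The sketch covers only lookup by identifier; mapping richer queries to each native API is where the non-standardized interfaces become a real difficulty.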

The third area of research concerns (initial) schema design in NoSQL, which, in contrast to relational databases, favors a much larger degree of data redundancy. Among other things, the design involves deciding which data to copy and co-locate because it is frequently requested together, in order to improve performance. This is a multi-dimensional modeling problem, and in practice developers have to rely on experience or expertise, guidelines, heuristics, or design support tools.
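
To make the co-location decision concrete, the sketch below shows one simple workload-based heuristic: embed a child entity in its parent document when most reads of the child also read the parent. This is an illustrative assumption, not the algorithm used by the schema recommender described in this project; the `Query` type, the `co_access_ratio` and `recommend` functions, the 0.7 threshold, and the sample workload are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A read query: the entity types it touches and how often it runs."""
    entities: frozenset
    frequency: float


def co_access_ratio(workload, parent, child):
    """Share of the child's read frequency that also reads the parent."""
    joint = sum(q.frequency for q in workload
                if parent in q.entities and child in q.entities)
    child_total = sum(q.frequency for q in workload if child in q.entities)
    return joint / child_total if child_total else 0.0


def recommend(workload, relationships, threshold=0.7):
    """Suggest 'embed' or 'reference' for each (parent, child) relationship."""
    return {
        (parent, child): (
            "embed" if co_access_ratio(workload, parent, child) >= threshold
            else "reference"
        )
        for parent, child in relationships
    }


# Hypothetical workload: frequencies are relative query counts.
workload = [
    Query(frozenset({"Order", "OrderLine"}), 90),
    Query(frozenset({"OrderLine"}), 10),
    Query(frozenset({"Customer", "Order"}), 20),
    Query(frozenset({"Customer"}), 80),
]
plan = recommend(workload, [("Order", "OrderLine"), ("Customer", "Order")])
```

With this sample workload, order lines are read together with their order in 90 of 100 reads, so the heuristic suggests embedding them, while orders are read with their customer in only 20 of 110 reads and are kept as references. A realistic recommender would also have to weigh write costs, document size, and update anomalies caused by the added redundancy, which this sketch ignores.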

This research presents contributions on all three fronts. First, we provide an assessment of the benchmarks available to compare and evaluate NoSQL databases on performance and on their suitability to different applications. Secondly, regarding object mappers, we contribute a systematic and exhaustive technology survey of existing object-NoSQL database mapper (ONDM) frameworks. Our third contribution is a design support tool, namely a workload-driven document database schema recommender (DBSR) that simplifies the complex task of schema design.

Date: 8 Dec 2016 → 20 Jan 2022
Keywords: NoSQL databases, Federated storage
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences
Project type: PhD project