Project

Automated harmonisation strategies for real-world multiple sclerosis data sources (R-12679)

Multiple sclerosis (MS) is an autoimmune disease of the central nervous system that mostly manifests early in life and remains, to this day, treatable but not curable. Although an estimated 2.8 million people worldwide live with diagnosed MS, real-world data on MS are mostly collected at the local or national level. The resulting case numbers are often too low to support statistically significant conclusions in certain areas of interest, e.g. treatment options and their safety profiles or the influence of co-morbidities, especially in view of the diversity of the disease. One way to overcome the "predicament" of distributed data sources is to establish data sharing initiatives. An important part of data sharing is enhancing distributed, mostly heterogeneous real-world data by establishing a FAIR (findable, accessible, interoperable, reusable) data structure (including aligned data sets, common data models, central vs. local infrastructures) and a general infrastructure (including network principles, processes, GDPR, ethics, governance). The basis for fully exploiting heterogeneous, distributed real-world data lies at the core of the to-be-federated network: the data. Numerous efforts to establish networks for large-scale analysis of distributed data already exist. For health data, one example of a large initiative is EHDEN. Launched in 2018, EHDEN sets out to gain insight and evidence from real-world clinical data by establishing a federated network in which source data are mapped to the OMOP common data model (CDM). The OMOP CDM originates from the Observational Medical Outcomes Partnership (OMOP) and is now implemented and updated by the Observational Health Data Sciences and Informatics (OHDSI) community.
This model is a suitable candidate for analyses of health data across the globe, but by default it is not yet fit for purpose for observational, disease-specific data from registries or cohorts, especially for MS. To date, there is no established solution for standardising real-world MS data worldwide. Therefore, with my PhD I aim to accelerate real-world MS data research by developing tools that automate harmonisation strategies for real-world MS datasets, as a first step towards a federated MS data network.
Date: 1 Jan 2022 → 31 Dec 2023
Keywords: common data model, data harmonisation, multiple sclerosis, real world data, registries
Disciplines: Health informatics