Project
Applied mathematics for proteomics
Mass Spectrometry Imaging (MSI) is an exploratory scanning technique for analyzing chemical compounds within biological tissues. A laser is fired at a grid of (x, y) coordinates to ionize the molecules at each grid point, and the mass-to-charge (m/z) ratio of the resulting ions is measured. This produces very large data cubes with dimensions x, y, and m/z. A single dataset already occupies tens of gigabytes of memory, and dataset size continues to grow as the resolution of the scan technology increases. The matrix-assisted laser desorption/ionization (MALDI) scan technique easily measures thousands of m/z values per grid point, which is far more information than the hematoxylin and eosin (H&E) staining classically used by pathologists provides. Investigating these large datasets manually is very time-consuming, and it becomes practically infeasible when multiple tissues need to be analyzed rapidly.

This research will tackle both the size and the interpretation obstacles by taking a closer look at dimensionality reduction methods for proteomics. Unsupervised dimensionality reduction (DR) has many facets and can be divided into multiple subcategories: linear DR, multilinear DR, non-linear DR, and others such as topological data analysis. The research aims to investigate known techniques within these categories and to search for new algorithms that improve on current standards for the analysis of MSI and other omics data, helping pathologists discover new biomedical insights.
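To make the data layout concrete, the following is a minimal sketch of linear DR applied to a data cube with dimensions x, y, and m/z. It is an illustration under stated assumptions, not the project's method: the cube is synthetic random data, the function name pca_on_cube and its parameters are hypothetical, and principal component analysis (PCA) via a singular value decomposition is only one of many possible linear DR techniques.

```python
import numpy as np


def pca_on_cube(cube, n_components=3):
    """Unfold an (x, y, m/z) cube into a pixels-by-m/z matrix and apply PCA.

    Hypothetical helper for illustration only.
    """
    nx, ny, n_mz = cube.shape

    # Unfold the cube: each row is the spectrum measured at one grid point.
    spectra = cube.reshape(nx * ny, n_mz).astype(float)

    # Center every m/z channel so the SVD yields principal components.
    centered = spectra - spectra.mean(axis=0)

    # Thin SVD: rows of vt are the principal directions in m/z space.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)

    # Scores are the coordinates of each pixel in the reduced space.
    scores = u[:, :n_components] * s[:n_components]

    # Fold the scores back onto the grid: one image per component.
    score_images = scores.reshape(nx, ny, n_components)

    loadings = vt[:n_components]
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return score_images, loadings, explained


# Synthetic stand-in for an MSI dataset (assumption: 20 x 30 grid, 500 m/z bins).
rng = np.random.default_rng(0)
cube = rng.random((20, 30, 500))

score_images, loadings, explained = pca_on_cube(cube, n_components=3)
print(score_images.shape, loadings.shape)
```

Each of the three score images can be viewed as a spatial map, and the matching row of loadings indicates which m/z channels drive that map. For real datasets of tens of gigabytes, a full SVD in memory is unlikely to be practical, so an incremental or randomized variant would probably be needed.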