Publications
Multiple nested reductions of single data modes as a tool to deal with large data sets KU Leuven
The increased accessibility and concerted use of novel measurement technologies give rise to a data tsunami with matrices that comprise both a high number of variables and a high number of objects. As an example, one may think of transcriptomics data pertaining to the expression of a large number of genes in a large number of samples or tissues (as included in various compendia). The analysis of such data typically implies ill-conditioned ...
Combining multiway principal component analysis (MPCA) and clustering for efficient data mining of historical data sets of SBR processes Universiteit Gent
A methodology based on Principal Component Analysis (PCA) and clustering is evaluated for process monitoring and process analysis of a pilot-scale SBR removing nitrogen and phosphorus. The first step of this method is to build a multi-way PCA (MPCA) model using the historical process data. In the second step, the principal scores and the Q-statistics resulting from the MPCA model are fed to the LAMDA clustering algorithm. This procedure is ...
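The first step described above (building an MPCA model and extracting scores and Q-statistics) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes batch-wise unfolding of a three-way array of shape (batches, time points, variables), autoscaling of each unfolded column, and PCA via the singular value decomposition. The function name `mpca_scores_and_q`, the synthetic data, and the number of components are hypothetical. The LAMDA clustering step is omitted because it is not available in a standard library; the returned scores and Q-statistics are what would be passed on to it.

```python
import numpy as np

def mpca_scores_and_q(batches, n_components):
    """Sketch of multiway PCA on data of shape (n_batches, n_times, n_vars).

    Returns the principal scores and the Q-statistic (squared prediction
    error) of each batch. Unfolding and scaling choices are assumptions.
    """
    n_batches = batches.shape[0]
    # Batch-wise unfolding: every batch becomes one long row vector.
    X = batches.reshape(n_batches, -1)
    # Autoscale each unfolded column (guard against zero variance).
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - mean) / std
    # PCA via SVD: rows of Vt are the loading vectors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T           # loadings, shape (n_features, n_components)
    T = Z @ P                         # scores, shape (n_batches, n_components)
    residual = Z - T @ P.T            # part of Z not explained by the model
    q = np.sum(residual ** 2, axis=1)  # Q-statistic per batch
    return T, q

# Synthetic stand-in for historical process data: 20 batches, 30 time points, 4 variables.
rng = np.random.default_rng(0)
batches = rng.normal(size=(20, 30, 4))
scores, q_stat = mpca_scores_and_q(batches, n_components=3)
```

A batch with an unusually large Q-statistic is poorly described by the model, which is why the Q-statistics are useful inputs for a subsequent clustering step alongside the scores.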
Nonlinear projection methods for visualizing Barcode data and application on two data sets Universiteit Antwerpen
Developing tools for visualizing DNA sequences is an important issue in the Barcoding context. Visualizing Barcode data can be framed in a purely statistical context, namely unsupervised learning. Clustering methods combined with projection methods have two closely linked objectives: visualizing the data and finding structure in it. Multidimensional scaling (MDS) and Self-organizing maps (SOM) are unsupervised statistical tools for data visualization. Both ...
How to optimize intracardiac blood flow tracking by echocardiographic particle image velocimetry? Exploring the influence of data acquisition using computer-generated data sets KU Leuven
Echocardiographic particle image velocimetry (EPIV) has been used for tracking contrast-enhanced intracavitary blood flow. Little is known, however, about how basic imaging parameters (line density, frame rate, contrast bubble density) affect the quality of such tracking results. Our study aimed to investigate this using simulated echo data sets.
A data structure to represent data sets with more than one order relation like polygons Universiteit Gent
An effective over-sampling method for imbalanced data sets classification Universiteit Gent
Imbalanced data sets in real-world applications have a majority class with normal instances and a minority class with abnormal or important instances. Learning from such data sets usually generates biased classifiers that have a higher predictive accuracy over the majority class but a poorer predictive accuracy over the minority class. The Synthetic Minority Over-sampling Technique (SMOTE) is specifically designed for learning from ...
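For orientation, the basic SMOTE idea named in the abstract can be sketched as below: each synthetic minority sample is placed at a random point on the line segment between a minority instance and one of its k nearest minority neighbours. This is a minimal sketch of standard SMOTE only, not the over-sampling method proposed in the paper, whose details are not given in the listing. The function name `smote`, its parameters, and the synthetic data are hypothetical, and the sketch assumes at least two minority instances.

```python
import numpy as np

def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic samples by interpolating between minority
    instances and their k nearest minority neighbours (basic SMOTE sketch)."""
    rng = np.random.default_rng(seed)
    n = minority.shape[0]
    k = min(k, n - 1)  # cannot have more neighbours than other instances
    # Pairwise Euclidean distances within the minority class.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    # Pick a base instance and one of its neighbours for each new sample.
    base = rng.integers(0, n, size=n_synthetic)
    pick = neighbours[base, rng.integers(0, k, size=n_synthetic)]
    # Interpolate with a random factor in [0, 1).
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[pick] - minority[base])

# Synthetic stand-in for a minority class: 10 instances with 2 features.
rng = np.random.default_rng(1)
minority = rng.normal(size=(10, 2))
synthetic = smote(minority, n_synthetic=25, k=3)
```

Because every new sample lies between two existing minority instances, the synthetic points stay inside the region already occupied by the minority class.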