Publication

High-dimensional Techniques for ECG Analysis

Book - Dissertation

The electrocardiogram, or ECG, is a key diagnostic tool in clinical practice because it records the electrical activity of the heart in a simple and non-invasive way. In clinical examinations, the ECG is mostly captured using 12 leads to obtain a multi-view graphical representation of the cardiac cycle. These signals (i.e., time series) are recorded from different angles to characterize the electrical activity of the different regions of the heart. ECG analysis allows the identification of abnormalities that can be associated with different cardiac diseases. The presence of some of these abnormalities can be linked to a high risk of arrhythmia and mortality. Therefore, it is crucial to detect these aberrant patterns in time to ensure treatment and prevent life-threatening events.

One of the main challenges in cardiology is the prediction of ventricular arrhythmia. To that end, several ECG biomarkers have been proposed in the literature, and algorithms have been developed for their automatic detection. However, many of these biomarkers do not make use of the complete 12-lead information, either limiting their analysis to one of the leads or deriving 12 independent features that need to be jointly interpreted by clinicians. One of the challenges of using the 12-lead ECG is that, while the 12 leads provide different views with complementary information to characterize the entire cardiac cycle, they also contain redundant information. Robust algorithms are therefore required for the analysis of these multi-lead signals, algorithms that can discriminate risk biomarkers from irrelevant information while maximally exploiting the content of the 12 leads. Some of these approaches rely on dimensionality reduction techniques to translate the information contained in the 12-lead ECG into simpler data structures. Additionally, powerful models for arrhythmia or mortality risk prediction are multivariate, i.e., they combine several biomarkers or features that together contribute to predicting the outcome. These models can also benefit from dimensionality reduction techniques to decrease the number of features, thereby increasing their interpretability and accuracy.

This thesis presents advanced algorithms for the analysis of multi-dimensional data, both for feature-based approaches and for time series analysis. The main motivation in the development of these algorithms was to create reliable tools usable in clinical practice, which can support further diagnosis. In particular, these methods are designed for use in ECG analysis, for the identification of arrhythmia risk predictors.

The first part of this thesis proposes two robust algorithms for the extraction of ECG features for arrhythmia risk prediction. These algorithms include strategies to reduce the impact of noise on the analysis of the signal and the influence of ambiguous annotations. First, a heartbeat classification algorithm for single-lead ECG was proposed. It achieved results comparable to the state of the art, while facing the same challenges related to the accurate detection of under-represented classes. This triggered an exhaustive analysis of the database and the protocol used for training and evaluating heartbeat classification methods, highlighting their limitations. The second algorithm proposed for ECG analysis focuses on an improved quantification of QRS fragmentation (fQRS). This method was trained and evaluated on multi-center data, exploring the impact of including different fQRS definitions in the training process and hence tackling the problem of inter-observer variability. Results indicate that models trained on a combination of two different fQRS definitions achieved results comparable to those of models trained and evaluated on a single definition. Therefore, this study confirms the relevance of using multi-center data in the development of these algorithms, to make them more flexible and applicable to clinical practice.

The ECG analysis algorithms mentioned above, as well as most state-of-the-art methods, analyze each ECG lead independently. Therefore, the second part of the thesis aims to maximally exploit the information contained in the multi-lead ECG by defining a new framework for ECG dimensionality reduction based on Laplacian Eigenmaps (LE). This data-driven approach emphasizes the irregularities in the data, and it was evaluated for the characterization of structural and arrhythmogenic pathologies from 12-lead ECGs. This framework was compared against the vectorcardiogram (VCG), which is a 3-lead signal that can be derived from the 12-lead ECG. Results suggest that the LE framework is more robust to outliers than the approach based on the VCG. This confirms the advantages of data-driven approaches for 12-lead ECG dimensionality reduction, setting the path for more efficient and exhaustive analysis of multi-lead signals. The resulting reduced-dimensionality signals preserve and enhance abnormalities that are often missed by both visual and current automated analysis.

The last part of this thesis focuses on a feature-based approach and presents a novel unsupervised non-parametric feature selector: the U2FS algorithm. This user-friendly method, based on the utility metric, was evaluated using simulated and benchmark databases and was shown to obtain results in line with the state of the art, while requiring less computation time.
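
As an illustration of the kind of dimensionality reduction described in the second part, the following is a minimal sketch of a generic Laplacian Eigenmaps embedding applied to a multi-lead signal. It is not the framework developed in the thesis: the function name `laplacian_eigenmaps`, the fully connected Gaussian affinity, the median-distance kernel width, and the synthetic 12-channel signal are all assumptions made for this example.

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=3, sigma=None):
    """Embed the rows of X (n_samples x n_leads) into n_components dimensions.

    Generic textbook formulation; NOT the thesis framework.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between time samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    if sigma is None:
        # Assumed heuristic: kernel width = median nonzero pairwise distance.
        sigma = np.sqrt(np.median(d2[d2 > 0]))
    # Fully connected Gaussian affinity matrix without self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Convert to solutions of the generalized problem L v = lambda D v,
    # then drop the first (trivial, constant) eigenvector.
    V = d_inv_sqrt[:, None] * eigvecs
    return V[:, 1:n_components + 1], eigvals[1:n_components + 1]

# Synthetic stand-in for a 12-lead recording: 3 latent sources mixed into
# 12 channels plus a little noise (hypothetical data, not real ECG).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
sources = np.stack(
    [np.sin(2 * np.pi * t),
     np.cos(2 * np.pi * t),
     np.exp(-((t - 0.5) ** 2) / 0.002)],
    axis=1,
)
mixing = rng.normal(size=(3, 12))
ecg = sources @ mixing + 0.01 * rng.normal(size=(200, 12))

embedding, eigvals = laplacian_eigenmaps(ecg, n_components=3)
print(embedding.shape)  # (200, 3)
```

Each time sample is treated as a point in the 12-dimensional lead space, so the embedding is a 3-channel signal of the same length as the input. A practical pipeline would likely use a sparse nearest-neighbour graph instead of the dense affinity used here, which scales quadratically with the number of samples.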
Publication year: 2022
Accessibility: Open