
Publication

Denoised Kernel Spectral Data Clustering

Book contribution - Book chapter, conference contribution

© 2016 IEEE. Kernel Spectral Clustering (KSC) solves a weighted kernel principal component analysis problem in a primal-dual optimization framework. It builds an unsupervised model on a small subset of the data using the dual solution of the optimization problem. This gives KSC a powerful out-of-sample extension property, leading to good cluster generalization with respect to unseen data points. However, in the presence of noise that causes the data to overlap, the technique often fails to generalize well. In this paper, we propose a two-step process for clustering noisy data. We first denoise the data using kernel principal component analysis (KPCA) with a recently proposed Model selection criterion based on point-wise Distance Distributions (MDD) to recover the underlying information in the data. We then apply the KSC technique to the denoised data to obtain good-quality clusters. One advantage of model-based techniques is that the same training and validation sets can be used for both denoising and clustering. We found that the kernel bandwidth parameter obtained from MDD for KPCA also works well for KSC, in combination with the optimal number of clusters k, to produce good-quality clusters. We compare the proposed approach with standard KSC and with KSC preceded by KPCA tuned using a heuristic method based on reconstruction error, on several synthetic and real-world datasets, to demonstrate the effectiveness of the proposed approach.
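The two-step process described in the abstract (denoise with KPCA, then cluster the denoised data with the same kernel bandwidth) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: scikit-learn's `SpectralClustering` stands in for KSC, which scikit-learn does not provide; the bandwidth `gamma` and the number of components are fixed by hand rather than selected by MDD; the pre-image is computed by scikit-learn's learned inverse map; and the whole dataset is used instead of a training subset with out-of-sample extension. The function name `denoise_then_cluster` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA


def denoise_then_cluster(X, n_clusters, gamma, n_components, random_state=0):
    """Step 1: KPCA denoising. Step 2: spectral clustering on the result.

    The same RBF bandwidth `gamma` is reused in both steps, mirroring the
    idea in the abstract. `gamma` is hand-picked here (assumption), not
    chosen by the MDD criterion.
    """
    # Step 1: project onto the leading kernel principal components and
    # map back to input space (approximate pre-image) to suppress noise.
    kpca = KernelPCA(
        n_components=n_components,
        kernel="rbf",
        gamma=gamma,
        fit_inverse_transform=True,
        alpha=0.1,  # ridge regularization of the learned inverse map
    )
    X_denoised = kpca.inverse_transform(kpca.fit_transform(X))

    # Step 2: cluster the denoised points. SpectralClustering is a
    # stand-in for KSC (assumption), using the same RBF bandwidth.
    clusterer = SpectralClustering(
        n_clusters=n_clusters,
        affinity="rbf",
        gamma=gamma,
        random_state=random_state,
    )
    labels = clusterer.fit_predict(X_denoised)
    return X_denoised, labels


# Synthetic noisy data: two interleaved half-moons.
X, _ = make_moons(n_samples=300, noise=0.15, random_state=0)
X_denoised, labels = denoise_then_cluster(
    X, n_clusters=2, gamma=5.0, n_components=4
)
```

In practice the number of retained components and the bandwidth strongly affect how much structure survives the denoising step, which is the model selection problem the abstract says MDD addresses.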
Book: Proc. of the International Joint Conference on Neural Networks
Pages: 3709 - 3716
ISBN: 9781509006199
Year of publication: 2016
BOF key label: yes
IOF key label: yes
Authors from: Higher Education
Accessibility: Closed