Search | FRIS research portal

On the disentanglement and robustness of self-supervised speech representations Universiteit Gent

Yanjue Song, Doyeon Kim, Nilesh Madhu, Hong-Goo Kang

This paper conducts an analysis of latent embeddings generated by a range of pre-trained, self-supervised learning (SSL) models. Departing from conventional practices that predominantly focus on examining these embeddings within the realm of speech recognition tasks, our study investigates the characteristics associated with speakers and their behavior under the influence of input distortions. We establish a controlled setting with varying ...

Spatially selective speaker separation using a DNN with a location dependent feature extraction Universiteit Gent

Alexander Bohlender, Ann Spriet, Wouter Tirry, Nilesh Madhu

Deep neural networks (DNNs) have proven themselves as an effective means to separate clean speech from noisy mixtures. When there are multiple concurrent talkers, however, unambiguously defining the target output is not trivial, especially if the mixture is single-channel and the talkers are not known in advance. Although this problem can be addressed with permutation invariant training or deep clustering, the performance still suffers in this ...

Robust detection of background acoustic scene in the presence of foreground speech Universiteit Gent

Siyuan Song, Yanjue Song, Nilesh Madhu

The characterising sound required for the Acoustic Scene Classification (ASC) system is contained in the ambient signal. However, in practice, this is often distorted by, e.g., foreground speech of the speakers in the surroundings. Previously, based on the iVector framework, we proposed different strategies to improve the classification accuracy when foreground speech is present. In this paper, we extend these methods to deep-learning (DL)-based ...

Aiding speech harmonic recovery in DNN-based single channel noise reduction using cepstral excitation manipulation (CEM) components Universiteit Gent

Yanjue Song, Nilesh Madhu

Weak harmonics of voiced speech segments are often lost during the process of noise suppression – especially at low SNRs. This leads to a distortion in the harmonic structure, and an accompanying loss in quality. In this paper, inspired by previous work on speech harmonic enhancement using statistical methods, we present a loss function component we term cepstral excitation manipulation (CEM) loss, which is constructed based on the fundamental ...

Investigating spherical head models to simulate binaural room impulses for training deep neural networks Universiteit Gent

Jasper Maes, Siyuan Song, Stijn Kindt, Pieter-Jan Maes, Bruno Masiero, Nilesh Madhu

Influence of lossy speech codecs on hearing-aid, binaural sound source localisation using DNNs Universiteit Gent

Siyuan Song, Stijn Kindt, Jasper Maes, Alexander Bohlender, Nilesh Madhu

Hearing aids are typically equipped with multiple microphones to exploit spatial information for source localisation and speech enhancement. Especially for hearing aids, a good source localisation is important: it not only guides source separation methods but can also be used to enhance spatial cues, increasing user awareness of important events in their surroundings. We use a state-of-the-art deep neural network (DNN) to perform binaural ...

Ad hoc distributed microphones clustering : a comparative analysis on using coherence and signal-specific features Universiteit Gent

Stijn Kindt, Martijn Meeldijk, Nilesh Madhu

It is often useful to cluster ad hoc distributed microphones according to the dominant source each captures. For example, in a recently proposed source separation approach, inter- and intracluster information is aggregated to enhance the dominant source at each cluster. To generate the features for this blind clustering, spectro-temporal characteristics of the signals are usually exploited to be clustered by the fuzzy C-means algorithm. A recent ...

Comparative study of LC3plus and Lyra codec on DNN-based source localisation for hearing aids Universiteit Gent

Siyuan Song, Stijn Kindt, Jasper Maes, Alexander Bohlender, Nilesh Madhu

Lossy codecs are often used to exchange audio data in bandwidth-constrained applications. However, this can have a detrimental effect on the subsequent signal processing stages - especially with regard to multichannel source localisation and enhancement. Understanding and circumventing these effects is, therefore, crucial. We contrast the effect of LC3plus (developed for Bluetooth Low Energy (BLE) communications) against Lyra, a recently proposed ...

CRNN-based multi-DOA estimator : comparing classification and regression Universiteit Gent

Pieter Cooreman, Alexander Bohlender, Nilesh Madhu

Detecting translocation of DNA nanostructures through nanopores : first steps towards structural barcode readout Universiteit Gent

Pratima Upretee, S Santermans, K Martens, J Gevers, S Marion, W Van Den Bosch, Jan Fostier, Nilesh Madhu

Nanopore sequencing works on the principle of detecting the patterns in the current as a biomolecule translocates through an electrically charged nanopore. Robustly detecting such translocations is the first, key step in nanopore signal analysis. As the current changes in a step-wise manner when a molecule translocates, state-of-the-art approaches rely on straightforward thresholding of the signal to identify the start and end of translocation ...
