
Project

Deep-Learning based Joint Audio, Video Processing for Augmented Listening

Augmented listening involves extracting the desired audio signal(s) from a distorted capture. Just as human speech perception combines visual and acoustic cues, we aim to improve this extraction by augmenting the audio with visual information. A side application is the detection of inconsistencies between the two streams, which can hint at deepfakes or otherwise compromised content.
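To make the idea of joint audio-video processing concrete, below is a minimal sketch of a late-fusion mask-estimation network in PyTorch. The architecture, layer sizes, and the class name `AudioVisualEnhancer` are illustrative assumptions for this sketch only and do not describe the project's actual model.

```python
# Illustrative sketch only: a minimal late-fusion audio-visual speech
# enhancer. All names and dimensions are hypothetical, not the project's.
import torch
import torch.nn as nn


class AudioVisualEnhancer(nn.Module):
    """Estimates a mask for a noisy audio spectrogram, conditioned on
    per-frame visual features (e.g. lip-region embeddings)."""

    def __init__(self, n_freq=257, visual_dim=512, hidden=256):
        super().__init__()
        self.audio_proj = nn.Linear(n_freq, hidden)
        self.visual_proj = nn.Linear(visual_dim, hidden)
        # Fuse the two modalities frame by frame, then model time context.
        self.fusion = nn.GRU(2 * hidden, hidden, batch_first=True)
        self.mask_head = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_spec, visual_feats):
        # noisy_spec:   (batch, frames, n_freq)   magnitude spectrogram
        # visual_feats: (batch, frames, visual_dim) time-aligned video features
        a = self.audio_proj(noisy_spec)
        v = self.visual_proj(visual_feats)
        fused, _ = self.fusion(torch.cat([a, v], dim=-1))
        mask = self.mask_head(fused)       # values in (0, 1)
        return mask * noisy_spec           # masked (enhanced) spectrogram


if __name__ == "__main__":
    model = AudioVisualEnhancer()
    spec = torch.rand(1, 100, 257)   # 100 frames of a noisy spectrogram
    vis = torch.rand(1, 100, 512)    # matching visual embeddings
    print(model(spec, vis).shape)    # torch.Size([1, 100, 257])
```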

Date: 1 Oct 2022 → Today
Keywords: Augmented reality, Deep learning, Joint audio-video processing
Disciplines: Machine learning and decision making, Audio and speech computing, Computer vision, Audio and speech processing, Image processing