Project
Deep-Learning based Joint Audio, Video Processing for Augmented Listening
Augmented listening aims to extract the desired audio signal(s) from a distorted capture. As in human speech perception, where acoustic and visual cues jointly contribute to understanding, we aim to improve this extraction by augmenting the audio with visual information. A side application is the detection of inconsistent audio-video streams, which can hint at deepfakes or otherwise compromised material.
Date: 1 Oct 2022 → Today
Keywords: Augmented reality, Deep learning, Joint audio-video processing
Disciplines: Machine learning and decision making, Audio and speech computing, Computer vision, Audio and speech processing, Image processing
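The core idea, conditioning audio-signal extraction on visual features, can be sketched as a time-frequency masking pipeline. Everything below (the toy dimensions, the random features, the single linear layer) is a hypothetical stand-in for a trained audio-visual network, not the project's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (assumptions, not taken from the project)
T, F, Dv = 100, 257, 64   # time frames, frequency bins, video-embedding size

audio_feat = rng.standard_normal((T, F))    # e.g. log-magnitude spectrogram
video_feat = rng.standard_normal((T, Dv))   # e.g. per-frame lip-region embedding

# Fuse the two modalities by concatenating along the feature axis
fused = np.concatenate([audio_feat, video_feat], axis=1)   # shape (T, F + Dv)

# A single random linear layer stands in for a trained DNN mask estimator
W = rng.standard_normal((F + Dv, F)) * 0.01
mask = 1.0 / (1.0 + np.exp(-(fused @ W)))   # sigmoid -> mask values in (0, 1)

# Apply the time-frequency mask to the noisy input to estimate the target
enhanced = mask * audio_feat
print(enhanced.shape)   # (100, 257)
```

In a real system the mask estimator would be a deep network trained on paired audio-video data; the visual stream helps most when the acoustic mixture alone is ambiguous, e.g. two overlapping speakers.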