Project

Solving inverse problems in room acoustics using physical models, sparse regularization and numerical optimization

Reverberation is a complex acoustic phenomenon that occurs inside rooms. Many audio signal processing methods, for tasks such as source localization and signal enhancement, assume that reverberation is absent. Consequently, reverberant environments are considered challenging, as state-of-the-art methods can perform poorly in them. The acoustics of a room can be described using a variety of mathematical models, among which physical models are the most complete and accurate.

The use of physical models in audio signal processing methods is often non-trivial since it can lead to ill-posed inverse problems. These inverse problems require proper regularization to achieve meaningful results and involve the solution of computationally intensive large-scale optimization problems. Recently, however, sparse regularization has been applied successfully to inverse problems arising in different scientific areas. The increased computational power of modern computers and the development of new efficient optimization algorithms make it possible to tackle inverse problems in the context of room acoustics as well. This thesis explores this novel framework by applying the latest sparse regularization methods and optimization algorithms to develop new audio signal processing methods that are more robust against reverberation and noise. The inverse problems these methods face naturally lead to joint formulations of multiple tasks that are typically treated separately, enabling, e.g., simultaneous source localization, sound field reconstruction and dereverberation.

The first part of the thesis is dedicated to optimization algorithms that are particularly suited to the type of inverse problems under consideration. These are called proximal gradient (PG) algorithms and are capable of minimizing the nonsmooth cost functions that typically arise in optimization problems involving sparse regularization. In addition, PG algorithms can be accelerated using quasi-Newton methods and combined with matrix-free operators, allowing them to tackle large-scale problems and reduce the computational burden of many signal processing methods.
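
As an illustration of the basic PG iteration described above, the following is a minimal sketch in Python with NumPy. It assumes the simplest possible setting: a small dense matrix `A`, a least-squares data term and an l1-norm regularizer, for which the proximal step reduces to soft thresholding. The function names, the fixed step size and the synthetic data are assumptions made for this example; the sketch does not include the quasi-Newton acceleration or the matrix-free operators mentioned above and is not the implementation used in the thesis.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cost(A, y, lam, x):
    """Cost function: 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def proximal_gradient(A, y, lam, n_iter=500):
    """Minimize the cost above with a plain (unaccelerated) PG iteration."""
    # Fixed step size 1/L, where L = ||A||_2^2 is the Lipschitz
    # constant of the gradient of the least-squares term.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                         # gradient of the smooth term
        x = soft_threshold(x - step * grad, step * lam)  # proximal (nonsmooth) step
    return x

# Synthetic example (assumed data): recover a sparse vector from 40 measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = proximal_gradient(A, y, lam=0.1)
```

In a large-scale setting, the products `A @ x` and `A.T @ y` would typically be replaced by function calls that apply the operator and its adjoint without forming the matrix, which is the idea behind the matrix-free operators mentioned above.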

The second part of the thesis addresses acoustic modeling and focuses on sweeping echoes, a particular physical phenomenon that typically does not occur in regular rooms. This phenomenon is studied and attributed to the idealized cuboid geometries employed by many acoustic models. A variation of the image method (IM), a popular acoustic model that is usually restricted to rectangular rooms, is proposed to produce perceptually realistic simulations without sweeping echoes.

The third part of the thesis covers inverse problems that utilize the finite-difference time-domain (FDTD) method. This method, which aims at solving the wave equation numerically, requires precise knowledge of the room geometry and of the acoustic impedances that model the acoustic properties of the walls of the room. Firstly, the problem of acoustic impedance estimation is addressed, which itself results in an inverse problem. Secondly, source localization is jointly formulated with source reconstruction by proposing a two-step method. Once the original sound source location is identified by solving an inverse problem that exploits the spatial sparsity of the sound sources in the room, a reconstruction of the original source signal is performed. Finally, the use of the FDTD method for multi-zone sound field control is envisaged. Solving an inverse problem regularized with spatial sparsity makes it possible to optimally control and place a set of loudspeakers to reproduce a specific sound field inside a highly reverberant room while keeping part of the room silent.
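
To give an idea of how an FDTD scheme solves the wave equation numerically, the following is a minimal sketch in Python with NumPy. It is a strong simplification and rests on several assumptions made for this example: a one-dimensional domain instead of a room, ends where the pressure is fixed to zero instead of walls modeled by acoustic impedances, and an impulse as initial condition. The function name and all parameter values are hypothetical and are not taken from the thesis.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=300, c=343.0, dx=0.05):
    """Leapfrog FDTD update of the 1D wave equation with p = 0 at both ends."""
    dt = 0.9 * dx / c               # time step chosen so that the Courant number is 0.9
    lam2 = (c * dt / dx) ** 2       # squared Courant number (must be <= 1 for stability)
    p_prev = np.zeros(n_cells)      # pressure at time step n - 1
    p = np.zeros(n_cells)           # pressure at time step n
    p[n_cells // 2] = 1.0           # impulse in the middle of the domain
    for _ in range(n_steps):
        p_next = np.zeros(n_cells)  # end points stay at zero (assumed boundary condition)
        # Second-order central differences in time and space.
        p_next[1:-1] = (2.0 * p[1:-1] - p_prev[1:-1]
                        + lam2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]))
        p_prev, p = p, p_next
    return p

pressure = fdtd_1d()
```

Each time step is a linear operation on the pressure field, so the whole simulation acts as a linear operator on the source signals. This is what allows the sound field to enter an inverse problem as a forward model.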

The fourth part of the thesis is dedicated to the use of wave decomposition models, which can represent the sound field of a room only in a limited portion of space but, contrary to the FDTD method, do not require knowledge of the room geometry or of the acoustic impedances. Firstly, the problem of room impulse response (RIR) interpolation is addressed. Since measuring RIRs over a wide space is time-consuming, an effective interpolation of these measurements is useful in a number of applications. It is shown that the combination of spatio-temporal sparse regularization with a time-domain wave decomposition model can substantially reduce the number of microphones needed to perform RIR interpolation. The fourth part concludes with the description of a novel method capable of performing source localization and dereverberation jointly. This is also achieved through sound field interpolation, obtained by solving an inverse problem that employs a particular combination of a wave decomposition model with sparse regularization. Once a proper sound field interpolation is achieved, the direction of arrival (DOA) of a moving sound source and dereverberated signals can be obtained simultaneously in challenging acoustic environments.

Date: 13 May 2013 → 29 Aug 2018

Keywords: numerical optimization, room acoustics, sparse regularization

Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences, Modelling, Biological system engineering, Signal processing, Control systems, robotics and automation, Design theories and methods, Mechatronics and robotics, Computer theory

Project type: PhD project
