Speaker Recognition in Noisy Environments

A Toolbox for Speaker Recognition in Noisy Environments

The integration of voice control into connected devices is expected to improve the efficiency and comfort of our daily lives. However, the underlying biometric systems often impose constraints on the individual or the environment during interaction (e.g., quiet surroundings). Such constraints have to be overcome in order to recognize individuals seamlessly. In this paper, we propose an evaluation framework for speaker recognition in noisy smart living environments. To this end, we designed a taxonomy of sounds (e.g., home-related, mechanical) that characterize representative indoor and outdoor environments where speaker recognition is adopted. Then, leveraging our taxonomy, we devised an approach for off-line simulation of challenging noisy conditions in vocal recordings originally collected under controlled conditions. Our approach adds one sound, or a combination of sounds, belonging to the target environment to the current vocal example. Experiments on a large-scale public dataset and two state-of-the-art speaker recognition models show that adding certain background sounds to clean vocal audio leads to a substantial deterioration of recognition performance. Our findings reveal that, in several noisy settings, a speaker recognition model might end up making unreliable decisions. Our framework is intended to help system designers evaluate performance deterioration and develop speaker recognition models that are more robust to smart living environments.
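The off-line simulation described above (adding one or more background sounds to a clean vocal example) can be illustrated with a minimal sketch. This is not the code released with the framework: the function names `fit_length` and `add_background` are hypothetical, and scaling the background to a target signal-to-noise ratio (`snr_db`) is an assumption made here for illustration, since the abstract does not state how the mixing level is chosen.

```python
import numpy as np


def fit_length(signal, length):
    """Loop a (non-empty) background sound, then crop it to `length` samples."""
    signal = np.asarray(signal, dtype=np.float64)
    reps = int(np.ceil(length / len(signal)))
    return np.tile(signal, reps)[:length]


def add_background(speech, sounds, snr_db):
    """Add a combination of background sounds to a clean speech waveform.

    Assumption (not from the paper): the summed background is rescaled so
    that the speech-to-background power ratio equals `snr_db` decibels.
    """
    speech = np.asarray(speech, dtype=np.float64)

    # Sum all background sounds after fitting each to the speech length.
    background = np.zeros(len(speech))
    for sound in sounds:
        background += fit_length(sound, len(speech))

    speech_power = np.mean(speech ** 2)
    background_power = np.mean(background ** 2)
    if background_power == 0.0:
        return speech.copy()  # nothing audible to add

    # Gain that brings the background to the requested SNR.
    gain = np.sqrt(speech_power / (background_power * 10 ** (snr_db / 10)))
    return speech + gain * background


# Usage with synthetic stand-ins for real recordings (16 kHz, 1 second).
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
hum = rng.normal(size=4000)    # shorter than the speech, so it is looped
chatter = rng.normal(size=20000)  # longer than the speech, so it is cropped
noisy = add_background(clean, [hum, chatter], snr_db=5.0)
```

Lower `snr_db` values make the background louder relative to the voice, which is one way to sweep from mild to challenging conditions when evaluating a speaker recognition model.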

Background Sound Taxonomy

Here you can find some details about the taxonomy.

53 Classes

The background sounds we collected come from different contexts (e.g., indoor, outdoor).

> 150 Sound files

Each sound file is associated with a source (e.g., mechanical, human).
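As a rough sketch of how such a taxonomy could be represented and queried when picking sounds for a target environment, consider the structure below. The `SoundFile` record, the `select` helper, the file paths, and the class names are all hypothetical and do not reflect the layout of the released taxonomy data; only the context labels (indoor, outdoor) and source labels (mechanical, human) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundFile:
    path: str         # hypothetical location of the audio file
    sound_class: str  # one of the taxonomy classes (names here are made up)
    context: str      # e.g., "indoor" or "outdoor"
    source: str       # e.g., "mechanical" or "human"


def select(sounds, context=None, source=None):
    """Return the sounds matching the given context and/or source."""
    return [
        s for s in sounds
        if (context is None or s.context == context)
        and (source is None or s.source == source)
    ]


# Illustrative entries only; not taken from the actual dataset.
catalog = [
    SoundFile("sounds/a.wav", "vacuum_cleaner", "indoor", "mechanical"),
    SoundFile("sounds/b.wav", "conversation", "indoor", "human"),
    SoundFile("sounds/c.wav", "car_engine", "outdoor", "mechanical"),
]

indoor_mechanical = select(catalog, context="indoor", source="mechanical")
```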

Download

Here you can find the code and the data used throughout our experiments.

Taxonomy Data

Models

Code

License

The taxonomy data is available for download for commercial and research purposes under a Creative Commons Attribution 4.0 International License. The copyright remains with the original owners. A complete version of the license can be found here.

Please contact the authors below if you have any questions regarding the taxonomy.

Publications

Please cite the following if you make use of our framework.

G. Fenu, R. Galici, M. Marras
Evaluation Framework for Context-aware Speaker Recognition in Noisy Smart Living Environments
In: KDAH-CIKM'20: Knowledge-driven Analytics and Systems Impacting Human Quality of Life (KDAH-CIKM-2020).
fenu@unica.it, r.galici1@studenti.unica.it, mirko.marras@unica.it