Utility-Preserving Privacy-Enabled Speech Embeddings for Emotion Detection
Sanjiv Das, William and Janice Terry Professor of Finance and Business Analytics
"Utility-Preserving Privacy-Enabled Speech Embeddings for Emotion Detection" (with Chandrashekhar Lavania, Xin Huang, and Kyu Han), conference proceedings of Interspeech 2023, 20-24 August, Dublin, Ireland.
Audio privacy has typically been pursued through adversarial task training or GAN-based adversarial models. These models also suppress the scoring of other attributes (e.g., emotion), yet their embeddings still retain enough information to bypass speaker privacy. We use feature-importance methods from the explainability literature to modify embeddings obtained from adversarial task training, providing a simple and accurate approach to generating embeddings that preserve speaker privacy without attenuating utility for related tasks (e.g., emotion recognition). This enables better adherence to privacy regulations around biometrics and voiceprints while retaining the usefulness of audio representation learning.
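
The general idea of using feature importance to edit embeddings can be illustrated with a toy sketch. This is not the procedure from the paper: the synthetic embeddings, the Fisher-style importance score (a simple stand-in for an explainability-based attribution method), the nearest-centroid probe, and all function names below are assumptions made for illustration only.

```python
import random
from statistics import mean, pvariance

N_SPEAKERS, N_EMOTIONS, DIMS = 3, 3, 8

def make_data(rng, reps):
    """Toy embeddings (assumed layout): dims 0-2 encode speaker,
    dims 3-5 encode emotion, dims 6-7 are pure noise."""
    xs, speakers, emotions = [], [], []
    for s in range(N_SPEAKERS):
        for m in range(N_EMOTIONS):
            for _ in range(reps):
                x = [rng.gauss(0.0, 0.5) for _ in range(DIMS)]
                x[s] += 3.0
                x[3 + m] += 3.0
                xs.append(x)
                speakers.append(s)
                emotions.append(m)
    return xs, speakers, emotions

def fisher_score(values, labels):
    """Between-class variance over within-class variance for one dimension."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    overall = mean(values)
    n = len(values)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values()) / n
    within = sum(len(g) * pvariance(g) for g in groups.values()) / n
    return between / (within + 1e-12)

def importance(embeddings, labels):
    """Per-dimension importance of the embedding for predicting the labels."""
    return [fisher_score([x[d] for x in embeddings], labels) for d in range(DIMS)]

def select_speaker_dims(speaker_imp, emotion_imp, k):
    """Pick the k dims that matter most for speaker and least for emotion."""
    gap = [s - e for s, e in zip(speaker_imp, emotion_imp)]
    return set(sorted(range(len(gap)), key=gap.__getitem__, reverse=True)[:k])

def mask(embeddings, drop):
    """Zero out the selected dimensions."""
    return [[0.0 if d in drop else v for d, v in enumerate(x)] for x in embeddings]

def centroid_accuracy(train_x, train_y, test_x, test_y):
    """Nearest-centroid probe: fit class centroids on train, score on test."""
    groups = {}
    for x, y in zip(train_x, train_y):
        groups.setdefault(y, []).append(x)
    centroids = {y: [mean(col) for col in zip(*g)] for y, g in groups.items()}

    def predict(x):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

    return mean(1.0 if predict(x) == y else 0.0 for x, y in zip(test_x, test_y))

rng = random.Random(0)
train_x, train_spk, train_emo = make_data(rng, reps=30)
test_x, test_spk, test_emo = make_data(rng, reps=15)

drop = select_speaker_dims(importance(train_x, train_spk),
                           importance(train_x, train_emo), k=3)
train_m, test_m = mask(train_x, drop), mask(test_x, drop)

speaker_before = centroid_accuracy(train_x, train_spk, test_x, test_spk)
speaker_after = centroid_accuracy(train_m, train_spk, test_m, test_spk)
emotion_before = centroid_accuracy(train_x, train_emo, test_x, test_emo)
emotion_after = centroid_accuracy(train_m, train_emo, test_m, test_emo)

print("masked dims:", sorted(drop))
print("speaker accuracy:", speaker_before, "->", speaker_after)
print("emotion accuracy:", emotion_before, "->", emotion_after)
```

In this toy setting the speaker and emotion signals sit in disjoint dimensions, so masking is clean. Real speech embeddings entangle the two, which is why the choice of importance method and the amount of masking would need to be tuned against both a speaker probe and an emotion probe.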