Fighting AI with AI: Fake Speech Detection Using Deep Learning

Malik, Hafiz; Changalvala, Raghavendar

AES E-Library

Voice cloning technologies have found applications in a variety of areas, ranging from personalized speech interfaces to advertising, video gaming, and so on. Existing voice cloning systems are capable of learning speaker characteristics from few samples and generating perceptually indistinguishable speech. These advances pose new security and privacy threats to voice-driven interfaces. This paper presents a deep learning-based framework for learning cloned speech synthesis models and the bona fide speech production processes. To this end, a convolutional neural network is trained and tested on spectrograms estimated from input audio recordings. Performance of the proposed method is evaluated on cloned and bona fide audio recordings. Experimental results indicate that the proposed method is capable of detecting bona fide and cloned audio with close to perfect accuracy.
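The abstract does not state how the spectrograms are estimated, so the following is only a minimal sketch of the general idea: turning a waveform into a log-magnitude time-frequency array of the kind a convolutional network could take as input. The function names (`hann`, `log_spectrogram`), the frame length, the hop size, and the test tone are all assumptions chosen for illustration and are not taken from the paper. The sketch uses a naive DFT from the Python standard library for self-containment; a real pipeline would likely rely on an FFT library.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (hypothetical helper, not from the paper)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(samples, frame_len=64, hop=32, eps=1e-10):
    """Return a list of frames; each frame holds log-magnitudes in dB
    for frequency bins 0 .. frame_len // 2.

    frame_len and hop are illustrative assumptions, not the paper's settings.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    # Slide a windowed frame over the signal in steps of `hop` samples.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for bin k (O(frame_len) per bin; fine for a sketch).
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            # eps avoids log10(0) on silent bins.
            row.append(20 * math.log10(abs(acc) + eps))
        frames.append(row)
    return frames


# Usage: a synthetic 1 kHz tone sampled at 8 kHz (assumed test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = log_spectrogram(tone)

# Strongest frequency bin in the first frame; bin spacing is
# sample_rate / frame_len = 125 Hz, so a 1 kHz tone falls in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The resulting frames-by-bins array can be treated like a single-channel image, which is what makes convolutional architectures a natural fit for this kind of input.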

Authors: Malik, Hafiz; Changalvala, Raghavendar
Affiliation: University of Michigan - Dearborn, Dearborn, MI, USA
AES Conference: 2019 AES International Conference on Audio Forensics (June 2019)
Paper Number: 29
Publication Date: June 8, 2019
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20479

E-Library Location: /conf/2019/forensics/2019_audio_forensics_paper_29.pdf
