AES E-Library


Fighting AI with AI: Fake Speech Detection Using Deep Learning


Voice cloning technologies have found applications in a variety of areas, ranging from personalized speech interfaces to advertising and video gaming. Existing voice cloning systems are capable of learning speaker characteristics from a few samples and generating perceptually indistinguishable speech. These advances pose new security and privacy threats to voice-driven interfaces. This paper presents a deep learning-based framework for learning cloned speech synthesis models and the bona fide speech production processes. To this end, a convolutional neural network is trained and tested on spectrograms estimated from input audio recordings. Performance of the proposed method is evaluated on cloned and bona fide audio recordings. Experimental results indicate that the proposed method is capable of detecting bona fide and cloned audio with close-to-perfect accuracy.
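
The abstract does not say how the spectrograms are estimated, so the following is only a minimal, dependency-free sketch of one common approach: a Hann-windowed short-time Fourier transform converted to log magnitude. The function name `log_spectrogram`, the frame length, the hop size, and the test tone are all illustrative assumptions, not details taken from the paper.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32, eps=1e-10):
    """Return a log-magnitude spectrogram as a list of frames (time-major).

    Assumed parameters (not from the paper): frame_len, hop, Hann window.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of one frame at frequency bin k.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            row.append(20.0 * math.log10(abs(coeff) + eps))  # magnitude in dB
        spectrogram.append(row)
    return spectrogram


# Hypothetical input: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(512)]
spec = log_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time frame and each column one frequency bin, so the result can be treated as a two-dimensional image, which is the kind of input a convolutional network typically consumes. In practice an FFT-based routine from a numerical library would replace the naive DFT loop used here.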

Permalink: https://www.aes.org/e-lib/browse.cfm?elib=20479





AES - Audio Engineering Society