AES E-Library

Comparison of Audio Spectral Features in a Convolutional Neural Network

Time-frequency transformations and spectral representations of audio signals are commonly used in machine learning applications. Typically, the Mel-Spectrogram is used to create the network's input features, a choice justified by the Mel scale’s basis in the human auditory system. In this paper, we compare several spectral features in a speech-based gender detection model and show that the Mel-Spectrogram is not always the best choice of input features.

Open Access

Authors:
Affiliations:
AES Convention: Paper Number:
Publication Date:
Subject:
Permalink: https://www.aes.org/e-lib/browse.cfm?elib=21963


This paper is Open Access, which means you can download it for free.


AES - Audio Engineering Society