In Search of a Perceptual Metric for Timbre: Dissimilarity Judgments among Synthetic Sounds with MFCC-Derived Spectral Envelopes
Because the spectral envelope of a sound is a crucial aspect of timbre perception, the authors propose a quantitative model of spectral envelope perception using a set of orthogonal basis functions, analogous to the three primary colors in vision. The goal is find a quantitative mapping between the physical description of the spectral envelope and its perception. This allows for a meaningful and reliable way of controlling timbre in sonification. This paper presents a quantitative metric to describe the multidimensionality of spectral envelope perception, i.e., the perception that is specifically related to the spectral element of timbre. Mel-frequency cepstral coefficients (MFCC) were chosen as a metric for spectral envelope perception because of their linearity, orthogonality, and multidimensionality. Quantitative data from two experiments illustrate the linear relationship between the subjective perception of spectrally-varied synthetic sounds and the MFCC.
Click to purchase paper or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $20 for non-members, $5 for AES members and is free for E-Library subscribers.