In Search of a Perceptual Metric for Timbre: Dissimilarity Judgments among Synthetic Sounds with MFCC-Derived Spectral Envelopes
Because the spectral envelope of a sound is a crucial aspect of timbre perception, the authors propose a quantitative model of spectral envelope perception built on a set of orthogonal basis functions, analogous to the three primary colors in vision. The goal is to find a quantitative mapping between the physical description of the spectral envelope and its perception. Such a mapping would provide a meaningful and reliable way of controlling timbre in sonification. The paper presents a quantitative metric that describes the multidimensionality of spectral envelope perception, i.e., the part of timbre perception that depends specifically on the spectrum. Mel-frequency cepstral coefficients (MFCC) were chosen as this metric because of their linearity, orthogonality, and multidimensionality. Quantitative data from two experiments illustrate a linear relationship between the subjective perception of spectrally varied synthetic sounds and the MFCC.
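The properties named above (linearity and orthogonality of the MFCC basis) can be illustrated with a minimal sketch. It is not the authors' synthesis procedure, which the abstract does not describe. It assumes the common DCT-II cosine basis over mel-spaced bands and the widely used mel formula 2595 log10(1 + f/700). The band count, the frequency range, and all function names are hypothetical choices made for illustration.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def band_centers_hz(n_bands, f_min=0.0, f_max=8000.0):
    # Band centers spaced evenly on the mel axis (hypothetical range).
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / n_bands
    return [mel_to_hz(lo + (n + 0.5) * step) for n in range(n_bands)]

def dct_basis(k, n_bands):
    # k-th DCT-II cosine basis function sampled at the band centers.
    return [math.cos(math.pi * k * (n + 0.5) / n_bands) for n in range(n_bands)]

def envelope_from_mfcc(coeffs, n_bands):
    # Log-magnitude envelope as a weighted sum of cosine basis functions.
    env = [0.0] * n_bands
    for k, c in enumerate(coeffs):
        basis = dct_basis(k, n_bands)
        for n in range(n_bands):
            env[n] += c * basis[n]
    return env

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

N_BANDS = 32  # hypothetical number of mel bands

# Orthogonality: distinct basis functions have a (numerically) zero inner product.
cross = dot(dct_basis(1, N_BANDS), dct_basis(2, N_BANDS))

# Linearity: the envelope of summed coefficients equals the sum of the envelopes.
env_a = envelope_from_mfcc([0.0, 1.0, 0.0], N_BANDS)
env_b = envelope_from_mfcc([0.0, 0.0, 0.5], N_BANDS)
env_sum = envelope_from_mfcc([0.0, 1.0, 0.5], N_BANDS)
max_linearity_error = max(abs(s - (a + b)) for s, a, b in zip(env_sum, env_a, env_b))
```

Under these assumptions, each coefficient controls one independent shape component of the log-spectral envelope. This is the sense in which the abstract compares the basis functions to primary colors.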