A Mid-Level Approach to Local Tonality Analysis: Extracting Key Signatures from Audio
We propose a new method for automatically detecting key signature changes. In automatic music transcription, sections in distantly related keys may lead to scores that are hard to read because of the large number of notated accidentals. The problem of key changes is commonly addressed by finding the correct local key among the 24 major and minor keys. However, choosing the right mode (major or minor) is not necessary for providing the best matching key signature, since relative major and minor keys share the same signature; we therefore estimate only the underlying local diatonic scale. After extracting chroma features and a beat grid from the audio data, we calculate local probabilities for the different diatonic scales. For this purpose, we present a multiplicative procedure that shows promising results for visualizing complex tonal structures. From the resulting diatonic scale estimates, we identify candidates for key signature changes. By clustering similar segments and applying minimum segment length constraints, we obtain the tonal segmentation. We test our method on a dataset of 30 hand-annotated pop songs. To evaluate the results, we calculate scores based on the number of correctly annotated frames, compute F-measures for the segment borders, and perform a cross-validation study. Our rule-based method yields up to 90% class accuracy and an F-measure of up to 70% for segment borders. These results are promising and suggest that the approach is suitable for automatic music transcription.
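To make the idea of diatonic scale estimation from chroma features concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not specify. It assumes that a "multiplicative" score can be illustrated as the product of the chroma energies of the seven pitch classes belonging to each of the 12 diatonic scales, normalized across scales per frame. The function name `diatonic_scale_scores`, the smoothing constant `eps`, and the input layout (12 pitch classes by frames, for example beat-synchronous frames) are all hypothetical choices made for illustration.

```python
import numpy as np

# Pitch-class offsets of a diatonic (major) scale relative to its tonic, C = 0.
DIATONIC_STEPS = np.array([0, 2, 4, 5, 7, 9, 11])


def diatonic_scale_scores(chroma, eps=1e-6):
    """Illustrative multiplicative scale score (an assumption, not the paper's formula).

    chroma: array of shape (12, n_frames) with non-negative energies,
            row 0 = C, row 1 = C#, ..., row 11 = B.
    Returns an array of shape (12, n_frames); row k scores the diatonic
    scale whose major tonic is pitch class k. Each column sums to 1.
    """
    chroma = np.asarray(chroma, dtype=float)
    # Normalize every frame so that overall loudness does not affect the score.
    frame_sums = np.maximum(chroma.sum(axis=0, keepdims=True), eps)
    chroma = chroma / frame_sums
    # Work in the log domain; eps keeps silent pitch classes finite.
    log_chroma = np.log(chroma + eps)

    scores = np.empty((12, chroma.shape[1]))
    for k in range(12):
        in_scale = (DIATONIC_STEPS + k) % 12
        # Product of the seven in-scale energies: a single missing scale
        # degree strongly penalizes the candidate scale.
        scores[k] = np.exp(log_chroma[in_scale].sum(axis=0))

    # Normalize across the 12 candidate scales for each frame.
    return scores / scores.sum(axis=0, keepdims=True)


if __name__ == "__main__":
    # Synthetic input: equal energy on the seven pitch classes of C major.
    c_major = np.zeros((12, 4))
    c_major[DIATONIC_STEPS] = 1.0
    best = diatonic_scale_scores(c_major).argmax(axis=0)
    print(best)  # index of the highest-scoring scale for each frame
```

Under these assumptions, a scale index k maps to a position on the circle of fifths via (7 * k) mod 12, so k = 7 (the G major / E minor scale) corresponds to one sharp. Segmentation, clustering, and the minimum segment length constraints mentioned in the abstract are not covered by this sketch.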