Phoneme Mappings for Online Vocal Percussion Transcription
A. Delgado, C. Saitis, and M. Sandler, "Phoneme Mappings for Online Vocal Percussion Transcription," AES Convention 151, Paper 10529 (October 2021).
Abstract: Vocal Percussion Transcription (VPT) aims at detecting vocal percussion sound events in a beatboxing performance and classifying them into the correct drum instrument class (kick, snare, or hi-hat). To do this in an online (real-time) setting, however, algorithms are forced to classify these events within just a few milliseconds after they are detected. The purpose of this study was to investigate which phoneme-to-instrument mappings are the most robust for online transcription purposes. We used three different evaluation criteria to base our decision upon: frequency of use of phonemes among different performers, spectral similarity to reference drum sounds, and classification separability. With these criteria applied, the recommended mappings would potentially feel natural for performers to articulate while enabling the classification algorithms to achieve the best performance possible. Given the final results, we provided a detailed discussion on which phonemes to choose given different contexts and applications.
@inproceedings{delgado2021phoneme,
author={Delgado, Alejandro and Saitis, Charalampos and Sandler, Mark},
booktitle={Audio Engineering Society Convention 151},
title={Phoneme Mappings for Online Vocal Percussion Transcription},
year={2021},
month={October},
note={Paper 10529},
url={http://www.aes.org/e-lib/browse.cfm?elib=21493},
abstract={Vocal Percussion Transcription (VPT) aims at detecting vocal percussion sound events in a beatboxing performance and classifying them into the correct drum instrument class (kick, snare, or hi-hat). To do this in an online (real-time) setting, however, algorithms are forced to classify these events within just a few milliseconds after they are detected. The purpose of this study was to investigate which phoneme-to-instrument mappings are the most robust for online transcription purposes. We used three different evaluation criteria to base our decision upon: frequency of use of phonemes among different performers, spectral similarity to reference drum sounds, and classification separability. With these criteria applied, the recommended mappings would potentially feel natural for performers to articulate while enabling the classification algorithms to achieve the best performance possible. Given the final results, we provided a detailed discussion on which phonemes to choose given different contexts and applications.},
}
TY  - CPAPER
TI  - Phoneme Mappings for Online Vocal Percussion Transcription
AU  - Delgado, Alejandro
AU  - Saitis, Charalampos
AU  - Sandler, Mark
PY  - 2021
DA  - 2021/10/13
T2  - Audio Engineering Society Convention 151
N1  - Paper 10529
UR  - http://www.aes.org/e-lib/browse.cfm?elib=21493
AB  - Vocal Percussion Transcription (VPT) aims at detecting vocal percussion sound events in a beatboxing performance and classifying them into the correct drum instrument class (kick, snare, or hi-hat). To do this in an online (real-time) setting, however, algorithms are forced to classify these events within just a few milliseconds after they are detected. The purpose of this study was to investigate which phoneme-to-instrument mappings are the most robust for online transcription purposes. We used three different evaluation criteria to base our decision upon: frequency of use of phonemes among different performers, spectral similarity to reference drum sounds, and classification separability. With these criteria applied, the recommended mappings would potentially feel natural for performers to articulate while enabling the classification algorithms to achieve the best performance possible. Given the final results, we provided a detailed discussion on which phonemes to choose given different contexts and applications.
ER  - 
Open Access
Authors:
Delgado, Alejandro; Saitis, Charalampos; Sandler, Mark
Affiliations:
Roli Ltd., London, UK; Queen Mary University of London, London, UK (See document for exact affiliation information.)
AES Convention:
151 (October 2021)
Paper Number:
10529
Publication Date:
October 13, 2021
Subject:
Audio Signal Processing
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=21493