Note-Intensity Estimation of Piano Recordings Using Coarsely Aligned MIDI Score
D. Jeong, T. Kwon, and J. Nam, "Note-Intensity Estimation of Piano Recordings Using Coarsely Aligned MIDI Score," J. Audio Eng. Soc., vol. 68, no. 1/2, pp. 34-47, (2020 January.). doi: https://doi.org/10.17743/jaes.2019.0049
Abstract: Automatic Music Transcription (AMT) is the process of inferring score notation from audio recordings, and it depends on subtasks such as multipitch estimation, onset detection, and tempo estimation. Dynamics are one of the main elements that characterize a performance, but they have not yet been thoroughly investigated in the context of automatic music transcription. This report proposes a system for estimating the intensity of individual notes from piano recordings. The algorithm is based on a score-informed nonnegative matrix factorization (NMF) that takes the spectrogram of an audio recording and a corresponding MIDI score as inputs and factorizes the spectrogram into a set of spectral templates and their activations. The intensity of each note is obtained from the maximum activation of the corresponding pitch template around the onset of the note. The authors improved their system by employing an NMF model that can learn the temporal progression of the timbre of piano notes. While the previous research was evaluated only with perfectly aligned scores, this paper also presents an evaluation with coarsely aligned scores. The results show that this approach is robust to alignment errors within 100 ms.
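The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain NMF with Euclidean multiplicative updates rather than the paper's model of temporal timbre progression, and it represents the MIDI score as a binary pitch-by-frame mask. The function names, the `half_window` parameter, and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def score_informed_nmf(V, mask, n_iter=200, eps=1e-9, seed=0):
    """Factorize V (freq x time) as W @ H, with H constrained by a score mask.

    mask (pitch x time) is 1 where the score allows a pitch to sound, else 0.
    Entries of H that start at zero stay at zero under multiplicative updates,
    which is how the score information is imposed in this sketch.
    """
    rng = np.random.default_rng(seed)
    n_freq, _ = V.shape
    n_pitch = mask.shape[0]
    W = rng.random((n_freq, n_pitch)) + eps   # one spectral template per pitch
    H = rng.random(mask.shape) * mask         # activations, zeroed outside the score
    for _ in range(n_iter):
        # Standard multiplicative updates for the Euclidean cost
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Normalize each template to unit sum and move the scale into H
        norm = W.sum(axis=0, keepdims=True) + eps
        W /= norm
        H *= norm.T
    return W, H

def note_intensity(H, pitch_idx, onset_frame, half_window=5):
    """Maximum activation of one pitch in a window around the note onset."""
    lo = max(0, onset_frame - half_window)
    hi = min(H.shape[1], onset_frame + half_window + 1)
    return float(H[pitch_idx, lo:hi].max())
```

Searching a window around the onset, instead of reading the activation at a single frame, is what lets this kind of estimate tolerate a score that is only coarsely aligned, provided the mask is wide enough to cover the true onset. The returned value is a relative activation, so mapping it to a scale such as MIDI velocity would need a separate calibration step.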
@article{jeong2020note-intensity,
author={Jeong, Dasaem and Kwon, Taegyun and Nam, Juhan},
journal={Journal of the Audio Engineering Society},
title={Note-Intensity Estimation of Piano Recordings Using Coarsely Aligned MIDI Score},
year={2020},
volume={68},
number={1/2},
pages={34-47},
doi={https://doi.org/10.17743/jaes.2019.0049},
month={january},
}
TY - paper
TI - Note-Intensity Estimation of Piano Recordings Using Coarsely Aligned MIDI Score
SP - 34
EP - 47
AU - Jeong, Dasaem
AU - Kwon, Taegyun
AU - Nam, Juhan
PY - 2020
JO - Journal of the Audio Engineering Society
IS - 1/2
VO - 68
VL - 68
Y1 - January 2020
Authors:
Jeong, Dasaem; Kwon, Taegyun; Nam, Juhan
Affiliation:
Graduate School of Culture Technology, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
JAES Volume 68 Issue 1/2 pp. 34-47; January 2020
Publication Date:
February 5, 2020
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=20716