MIDI renderings of audio are traditionally regarded as lifeless and unnatural, lacking in expression. However, MIDI is simply a protocol for controlling a synthesizer. Lack of expression is caused either by an expressionless synthesizer or by the difficulty of setting the MIDI parameters to produce expressive output. We have developed a system that constructs an expressive MIDI representation of an audio signal, i.e. a representation that uses tailored pitch variations in addition to the base note pitch parameters that audio-to-MIDI systems usually estimate. A pitch envelope is estimated from the original audio, and a genetic algorithm is then used to estimate pitch modulator parameters from that envelope. These pitch modulations are encoded in a MIDI file and re-rendered using a sampler. We present some initial comparisons between the final output audio and the estimated pitch envelopes.
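The envelope-to-parameter step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which modulator model, parameter ranges, or genetic operators were used. The sketch assumes a single sinusoidal vibrato with two parameters (rate in Hz, depth in cents), a mean-squared-error fitness against the estimated pitch envelope, and a plain genetic algorithm with elitism, tournament selection, blend crossover, and Gaussian mutation. All names (`vibrato`, `fit_vibrato`, `BOUNDS`) and numeric settings are hypothetical.

```python
import math
import random

# Hypothetical search ranges: vibrato rate (Hz) and depth (cents).
BOUNDS = [(3.0, 8.0), (0.0, 100.0)]


def vibrato(t, rate_hz, depth_cents):
    """Assumed modulator model: sinusoidal pitch deviation in cents at time t (s)."""
    return depth_cents * math.sin(2.0 * math.pi * rate_hz * t)


def mse(params, times, target):
    """Mean squared error between the modelled and the target pitch envelope."""
    rate_hz, depth_cents = params
    total = sum((vibrato(t, rate_hz, depth_cents) - y) ** 2
                for t, y in zip(times, target))
    return total / len(times)


def fit_vibrato(times, target, pop_size=40, generations=40, seed=0):
    """Search for (rate, depth) that best reproduce `target`.

    Returns the best parameter tuple and the best error seen at each
    generation (plus the final population's best error).
    """
    rng = random.Random(seed)

    def error(p):
        return mse(p, times, target)

    pop = [tuple(rng.uniform(lo, hi) for lo, hi in BOUNDS)
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        next_pop = pop[:2]  # elitism: the two best survive unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            parent_a = min(rng.sample(pop, 3), key=error)
            parent_b = min(rng.sample(pop, 3), key=error)
            child = []
            for (lo, hi), gene_a, gene_b in zip(BOUNDS, parent_a, parent_b):
                w = rng.random()
                gene = w * gene_a + (1.0 - w) * gene_b   # blend crossover
                gene += rng.gauss(0.0, 0.02 * (hi - lo))  # Gaussian mutation
                child.append(max(lo, min(hi, gene)))      # keep within bounds
            next_pop.append(tuple(child))
        pop = next_pop
    best = min(pop, key=error)
    history.append(error(best))
    return best, history


if __name__ == "__main__":
    # Synthetic stand-in for an estimated pitch envelope (1 s at 100 frames/s).
    times = [i / 100.0 for i in range(100)]
    target = [vibrato(t, 5.5, 30.0) for t in times]
    best, history = fit_vibrato(times, target)
    print("rate (Hz), depth (cents):", best)
    print("final error:", history[-1])
```

Because the two best individuals are carried over unchanged, the best error never increases from one generation to the next. In a full system the fitted parameters would then be mapped to whatever modulation controls the target sampler exposes; that mapping is outside this sketch.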
https://www.aes.org/e-lib/browse.cfm?elib=15057