Feature Extraction for Voice-driven Synthesis
This paper explores the singing voice from an unusual perspective: not as a musical instrument but as a musical controller. A set of spectral processing algorithms extracts features from the input voice. These features fall into four groups: excitation, vocal tract, voice quality, and context. The extracted values are then transmitted as Open Sound Control (OSC) messages to an external synthesis engine. We first give a technical description of the algorithms and then detail the components of the system. A practical example of voice-driven synthesis using PureData (Pd) is also presented.
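To make the transmission step concrete, the following is a minimal sketch of how extracted feature values could be packed into OSC 1.0 messages and sent over UDP. It is an illustration under stated assumptions, not the paper's implementation: the address names (such as /voice/excitation/pitch), the host, the port 9000, and the helper names are all hypothetical, and only float32 arguments are handled.

```python
import socket
import struct


def osc_string(text: str) -> bytes:
    """Encode an OSC-string: ASCII, null-terminated, zero-padded to a multiple of 4 bytes."""
    raw = text.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def osc_message(address: str, *values: float) -> bytes:
    """Build an OSC 1.0 message whose arguments are all big-endian float32."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + osc_string(type_tags) + payload


def send_features(features: dict, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one OSC message per feature over UDP (host and port are assumed values)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for address, value in features.items():
            sock.sendto(osc_message(address, value), (host, port))


if __name__ == "__main__":
    # Hypothetical addresses, one per feature group named in the abstract.
    send_features({
        "/voice/excitation/pitch": 220.0,
        "/voice/vocaltract/formant1": 700.0,
        "/voice/quality/breathiness": 0.3,
        "/voice/context/onset": 1.0,
    })
```

A synthesis engine listening on the same UDP port, for example a Pd patch, could then map each address to a synthesis parameter. Sending one message per feature keeps the receiver's routing simple; bundling all features of an analysis frame into a single message would be an equally plausible design.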