Last Updated: 20060821, mei
P8 - Posters: Analysis and Synthesis
Friday, October 6, 11:00 am — 12:30 pm
P8-1 The Singing Tutor: Expression Categorization and Segmentation of the Singing Voice—Oscar Mayor, Jordi Bonada, Alex Loscos, Pompeu Fabra University - Barcelona, Spain
Computer evaluation of singing interpretation has traditionally been based exclusively on tuning and tempo. This paper presents a tool for the automatic evaluation of singing voice performances that considers not only tuning and tempo but also the expression of the voice. To this end, the system performs analysis at the note and intra-note levels. Note-level analysis outputs traditional note pitch, note onset, and note duration information, while intra-note-level analysis locates and categorizes the expression of the note’s attacks, sustains, transitions, releases, and vibratos. Segmentation is done with an algorithm based on untrained Hidden Markov Models (HMMs), whose probabilistic models are built from a set of heuristic rules. A graphical tool for the evaluation and fine-tuning of the system is presented; its interface gives feedback about analysis descriptors and rule probabilities.
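The untrained-HMM segmentation described in the abstract can be illustrated with a minimal Viterbi decode over hypothetical intra-note states, where the emission probabilities come from heuristic rules on the frame-wise pitch slope rather than from training. The state set, slope rules, and data below are assumptions for illustration, not the paper's actual models.

```python
import math

# Hypothetical sketch: Viterbi decoding over intra-note states
# ("attack", "sustain", "release") with heuristic, untrained
# emission probabilities derived from the pitch slope per frame.

STATES = ["attack", "sustain", "release"]

def emission_prob(state, pitch_slope):
    """Heuristic rule: attacks rise, sustains are flat, releases fall."""
    mean = {"attack": 2.0, "sustain": 0.0, "release": -2.0}[state]
    return math.exp(-0.5 * (pitch_slope - mean) ** 2)

def transition_prob(i, j):
    """Left-to-right topology: stay in a state or advance to the next."""
    if j == i:
        return 0.8
    if j == i + 1:
        return 0.2
    return 0.0

def viterbi(slopes):
    # Log-probabilities; force the path to start in "attack".
    prev = [math.log(emission_prob(STATES[0], slopes[0])),
            float("-inf"), float("-inf")]
    back = []
    for t in range(1, len(slopes)):
        cur, ptr = [], []
        for j in range(3):
            best_i, best = 0, float("-inf")
            for i in range(3):
                p = transition_prob(i, j)
                if p > 0.0 and prev[i] + math.log(p) > best:
                    best, best_i = prev[i] + math.log(p), i
            cur.append(best + math.log(emission_prob(STATES[j], slopes[t])))
            ptr.append(best_i)
        prev = cur
        back.append(ptr)
    # Backtrack from the most probable final state.
    j = max(range(3), key=lambda k: prev[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    return [STATES[k] for k in reversed(path)]

# A note whose pitch rises, holds, then falls (slopes in semitones/frame):
slopes = [2.1, 1.8, 0.1, -0.1, 0.0, -1.9, -2.2]
print(viterbi(slopes))
# → ['attack', 'attack', 'sustain', 'sustain', 'sustain', 'release', 'release']
```

The same mechanism extends to more states (transitions, vibratos) by adding heuristic emission rules per state.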
Convention Paper 6897 (Purchase now)
P8-2 Expert System for Automatic Classification and Quality Assessment of Singing Voices—Pawel Zwan, University of Technology Gdansk - Gdansk, Poland
The aim of the research presented is an automatic singing voice quality recognition system. For this purpose a database of singers’ sample recordings is constructed, and parameters are extracted from the recorded voices of trained and untrained singers of different voice types. Parameters designed specifically for the analysis of the singing voice are analyzed, and a feature vector is formed. Each singer’s voice sample is judged by experts, yielding information about voice quality. The extracted parameters are used to train a neural network, and the effectiveness of automatic voice quality classification is tested by comparing the automatic recognition results with the subjective expert judgments. Finally, the results are discussed and conclusions are drawn.
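The training loop described above can be sketched in miniature: a single-neuron "network" (logistic regression) fit by gradient descent to map a small feature vector onto expert quality labels. The feature names, data, and network size below are invented for illustration; the paper's parameter set and network are more elaborate.

```python
import math

# Hypothetical sketch: logistic regression trained to reproduce
# expert quality judgments from a small singing-voice feature vector.

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, labels, epochs=2000, lr=0.5):
    """samples: feature vectors; labels: 1 = trained voice, 0 = untrained."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def classify(w, b, x):
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0

# Invented features: [vibrato regularity, singer's-formant energy]
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.3], [0.3, 0.1]]
y = [1, 1, 0, 0]  # expert judgments (toy data)
w, b = train(X, y)
print([classify(w, b, x) for x in X])  # → [1, 1, 0, 0]
```

Evaluating on held-out samples, rather than the training set as here, is what the comparison with expert judgments in the paper amounts to.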
Convention Paper 6898 (Purchase now)
P8-3 Facilities Used for Introductory Electronic Music: A Survey of Universities with an Undergraduate Degree in Audio—Joseph Akins, Middle Tennessee State University - Murfreesboro, TN, USA
This paper reports on the facilities used for introductory electronic music at United States universities that offered an undergraduate degree in audio production and technology in fall 2005. The population comprised 54 programs listed in the Audio Engineering Society’s Directory of Educational Programs. Through an online questionnaire, each university reported on the first hands-on electronic music course offered at its institution. With a response rate of 81 percent, respondents reported on specific hardware, software, purposes, and curricular application. For example, 93 percent of respondents reported using Mac OS, while 20 percent reported using Microsoft Windows.
Convention Paper 6899 (Purchase now)
P8-4 Improvements to a Sample-Concatenation Based Singing Voice Synthesizer—Jordi Bonada, Merlijn Blaauw, Alex Loscos, Universitat Pompeu Fabra - Barcelona, Spain
This paper describes recent improvements to our singing voice synthesizer based on the concatenation and transformation of audio samples using spectral models. Improvements include robust automation of the singer database creation process, previously a lengthy and tedious task involving recording script generation, studio sessions, audio editing, spectral analysis, and phonetic-based segmentation; and enhancement of the synthesis technique, improving the quality of sample transformations and concatenations and discriminating between phonetic intonation and musical articulation.
Convention Paper 6900 (Purchase now)
P8-5 Modeling Musical Articulation Gestures in Singing-Voice Performances—Esteban Maestre, Jordi Bonada, Oscar Mayor, Universitat Pompeu Fabra - Barcelona, Spain
We present a procedure for automatically describing the musical articulation gestures used in singing voice performances. We detail a method to characterize the temporal evolution of fundamental frequency and energy contours by a set of piece-wise fitting techniques. Based on this, we propose a meaningful parameterization that allows contours to be reconstructed from a compact set of parameters at different levels. We test the characterization method by applying it to the fundamental frequency contours of manually segmented transitions between adjacent notes, and we train several classifiers with manually labeled examples. We show the recognition accuracy for different parameterizations and levels of representation.
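The piece-wise characterization described above can be sketched as fitting one least-squares line per contour segment, giving a compact (slope, intercept) parameterization from which the contour is reconstructed. The fixed segment boundaries and synthetic f0 data here are illustrative assumptions; the paper's fitting and parameterization are richer.

```python
# Hypothetical sketch: piece-wise linear fitting of an f0 contour
# around a note transition, and reconstruction from the parameters.

def fit_line(xs, ys):
    """Closed-form least-squares line fit; returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    slope = num / den
    return slope, my - slope * mx

def piecewise_fit(f0, boundaries):
    """Fit one line per segment; boundaries are frame indices."""
    params = []
    for a, b in zip(boundaries[:-1], boundaries[1:]):
        xs = list(range(a, b))
        params.append(fit_line(xs, f0[a:b]))
    return params

def reconstruct(params, boundaries):
    out = []
    for (m, c), a, b in zip(params, boundaries[:-1], boundaries[1:]):
        out.extend(m * x + c for x in range(a, b))
    return out

# Synthetic transition: hold at 220 Hz, glide upward, hold at 247 Hz.
f0 = [220.0] * 5 + [220.0 + 5.4 * i for i in range(1, 6)] + [247.0] * 5
boundaries = [0, 5, 10, 15]
params = piecewise_fit(f0, boundaries)
approx = reconstruct(params, boundaries)
err = max(abs(a - b) for a, b in zip(f0, approx))
print(round(err, 2))  # → 0.0 (this toy contour is exactly piece-wise linear)
```

The fifteen-frame contour is thus summarized by three (slope, intercept) pairs plus the boundaries; real contours would incur a small reconstruction error.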
Convention Paper 6901 (Purchase now)
P8-6 Automatic Tonal Analysis from Music Summaries for Version Identification—Emilia Gómez, Beesuan Ong, Perfecto Herrera, Universitat Pompeu Fabra - Barcelona, Spain
Identifying versions of the same song by means of automatically extracted audio features is a complex task for a computer, even though it may seem very simple to a human listener. Designing a system to perform this task provides an opportunity to analyze which features are relevant to music similarity. This paper focuses on the analysis of tonal and structural similarity and its application to identifying different versions of the same piece. It describes the situations in which a song is versioned and several musical aspects are transformed with respect to the canonical version. A quantitative evaluation is made using tonal descriptors, including chroma representations and tonality, combined with the automatic extraction of summaries of the pieces through music structural analysis.
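One ingredient of tonal similarity for version identification can be sketched as comparing two songs' chroma (pitch-class) profiles with invariance to transposition, since a cover is often sung in a different key. The 12-bin profiles below are invented data, and this maximum-over-rotations cosine measure is an illustrative assumption, not necessarily the paper's measure.

```python
# Hypothetical sketch: transposition-invariant comparison of
# 12-bin chroma profiles, as used in tonal similarity for covers.

def rotate(v, k):
    """Transpose a chroma profile by k semitones (circular shift)."""
    return v[k:] + v[:k]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def version_similarity(a, b):
    """Best cosine similarity over all 12 possible transpositions."""
    return max(cosine(a, rotate(b, k)) for k in range(12))

# A C-major-ish profile, and the same piece transposed up 2 semitones:
c_major = [1.0, 0.1, 0.5, 0.1, 0.8, 0.6, 0.1, 0.9, 0.1, 0.4, 0.1, 0.3]
d_major = rotate(c_major, 10)  # shift by 10 = up 2 semitones
print(round(version_similarity(c_major, d_major), 3))  # → 1.0: same piece
```

Comparing profiles of automatically extracted summaries, rather than whole songs, also makes the measure robust to structural changes such as added or dropped sections.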
Convention Paper 6902 (Purchase now)
P8-7 Groovator—An Implementation of Real-Time Rhythm Transformations—Jordi Janer, Jordi Bonada, Sergi Jordà, Universitat Pompeu Fabra - Barcelona, Spain
This paper describes a real-time system for the rhythm manipulation of polyphonic audio signals. A rhythm analysis module extracts tempo and beat-location information. Based on this rhythm information, we apply different transformations: tempo, swing, meter, and accent. This type of manipulation is generally referred to as content-based transformation. We address characteristics of the analysis and transformation algorithms. In addition, user interaction plays an important role in this system. Tempo variations can be controlled either by tapping the rhythm on a MIDI interface or by using an external audio signal, such as percussion or the voice, as tempo control. We conclude by pointing out several use cases, focusing on live performance situations.
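The tap-along tempo control described above can be sketched as deriving a BPM estimate from tap timestamps (e.g., MIDI taps): the median inter-tap interval gives a tempo that a time-stretcher could then follow. The function and data below are illustrative assumptions, not the Groovator implementation.

```python
# Hypothetical sketch: tempo estimation from tap times, using the
# median inter-tap interval for robustness against one late tap.

def tap_tempo(tap_times):
    """tap_times: ascending tap timestamps in seconds; returns BPM."""
    intervals = sorted(b - a for a, b in zip(tap_times, tap_times[1:]))
    n = len(intervals)
    if n % 2:
        median = intervals[n // 2]
    else:
        median = 0.5 * (intervals[n // 2 - 1] + intervals[n // 2])
    return 60.0 / median

# Taps roughly every 0.5 s, one of them late:
taps = [0.00, 0.50, 1.01, 1.50, 2.12, 2.50]
print(round(tap_tempo(taps)))  # → 120
```

Dividing the tapped BPM by the analyzed source tempo would yield the stretch ratio for a tempo transformation; the same estimator could run on onsets detected in an external percussion or voice signal.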
Convention Paper 6903 (Purchase now)