AES Berlin 2014
Paper Session P10
P10 - Human Factors
Monday, April 28, 16:30 — 18:00 (Room Paris)
Chair: Hyunkook Lee, University of Huddersfield - Huddersfield, UK
P10-1 A New Algorithm for Vocal Tract Shape Extraction from Singer's Waveforms—Rebecca Vos, University of York - Heslington, York, UK; Jamie Andrea Shyla Angus, University of Salford - Salford, Greater Manchester, UK; Brad H. Story, University of Arizona - Tucson, AZ, USA
This paper presents a new algorithm for extracting the vocal tract shape from speech or singing. Based on acoustic sensitivity functions, it removes the ambiguity from which conventional methods suffer. We describe acoustic sensitivity functions and how the necessary formant frequencies are extracted from the acoustic waveform. Results are presented for a variety of male and female singers singing a range of vowels and notes. The results are good, and the system has applications not only in voice training but also in the control of games or music synthesis.
Convention Paper 9073
P10-2 Participatory Amplitude Level Adjustment of Gesture Controlled Upper Body Garment Sound in Immersive Virtual Reality—Erik Sikström, Aalborg University Copenhagen - Copenhagen, Denmark; Morten Havemøller Laursen, Aalborg University Copenhagen - Copenhagen, Denmark; Kasper Søndergaard Pedersen, Aalborg University Copenhagen - Copenhagen, Denmark; Amalia de Götzen, Aalborg University Copenhagen - Copenhagen, Denmark; Stefania Serafin, Aalborg University Copenhagen - Copenhagen, Denmark
Gesture-controlled sound from virtual clothes in immersive virtual environments is a relatively unexplored topic. This paper describes an experiment aiming to find the range between the highest and lowest acceptable amplitude levels for the sounds of an upper-body garment. Participants were asked to set these two amplitude levels for the sound of the virtual clothes, generated by their own gesture input, in relation to other sound sources with predefined levels (footsteps and ambient sounds). The task was performed while walking around a virtual park area. The results yielded two dynamic ranges, placed differently depending on whether the sound was initially presented at the highest or the lowest possible level.
Convention Paper 9074
P10-3 Audio Information Mining – Pragmatic Review, Outlook, and a Universal Open Architecture—Philip J. Duncan, University of Salford - Salford, Greater Manchester, UK; Duraid Y. Mohammed, University of Salford - Salford, Greater Manchester, UK; Francis F. Li, University of Salford - Salford, Greater Manchester, UK
An immense amount of audio data is currently available whose content is unspecified, and its classification and the generation of metadata pose a significant and challenging research problem. We present a review of past and current work in this field, specifically in the three principal areas of segmentation, feature extraction, and classification, and give an overview and critical appraisal of the techniques currently in use. One of the major impediments to progress in the field has been specialism and the inability of classifiers to generalize. We therefore propose a non-exclusive, generalized open-architecture framework for the classification of audio data that accommodates third-party plug-ins and works with a multi-dimensional feature/descriptor space as input.
Convention Paper 9075
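The open-architecture idea in P10-3 — a framework that accepts third-party classifier plug-ins, takes a multi-dimensional feature vector as input, and merges their outputs non-exclusively — can be pictured with a minimal sketch. This is an illustrative interpretation, not the authors' implementation; all names (`AudioClassifierFramework`, `speech_plugin`, etc.) and the max-confidence merge rule are assumptions.

```python
from typing import Callable, Dict, List

# A classifier plug-in maps a multi-dimensional feature vector to
# zero or more (label, confidence) pairs. Hypothetical interface.
Classifier = Callable[[List[float]], Dict[str, float]]

class AudioClassifierFramework:
    """Sketch of a non-exclusive, plug-in based audio classifier."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Classifier] = {}

    def register(self, name: str, plugin: Classifier) -> None:
        """Register a third-party classifier plug-in under a name."""
        self._plugins[name] = plugin

    def classify(self, features: List[float]) -> Dict[str, float]:
        """Run every plug-in and keep the highest confidence per label.

        The merge is non-exclusive: several labels can co-exist in the
        result, rather than forcing a single winning class.
        """
        merged: Dict[str, float] = {}
        for plugin in self._plugins.values():
            for label, conf in plugin(features).items():
                merged[label] = max(merged.get(label, 0.0), conf)
        return merged

# Toy plug-ins operating on an assumed 2-D feature vector
# [zero_crossing_rate, energy] -- purely illustrative thresholds.
def speech_plugin(f: List[float]) -> Dict[str, float]:
    return {"speech": 0.9 if f[0] > 0.5 else 0.2}

def music_plugin(f: List[float]) -> Dict[str, float]:
    return {"music": 0.8 if f[1] > 0.5 else 0.1}

framework = AudioClassifierFramework()
framework.register("speech", speech_plugin)
framework.register("music", music_plugin)
result = framework.classify([0.7, 0.6])  # both labels scored, non-exclusively
```

The key design point mirrored here is that the framework owns only the plug-in registry and the merge policy; segmentation, feature extraction, and the classifiers themselves remain replaceable components supplied from outside.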