AES Journal

Journal of the AES

2018 June - Volume 66 Number 6

Guest Editors’ Note

Authors: György Fazekas, Mathieu Barthet, and George M. Kalliris

Page: 412

Papers

Qualitative Evaluation of Media Device Orchestration for Immersive Spatial Audio Reproduction

Open Access

Authors: Francombe, Jon; Woodcock, James; Hughes, Richard J.; Mason, Russell; Franck, Andreas; Pike, Chris; Brookes, Tim; Davies, William J.; Jackson, Philip J. B.; Cox, Trevor J.; Fazi, Filippo M.; Hilton, Adrian
Affiliation: Institute of Sound Recording, University of Surrey, Guildford, Surrey, UK; Acoustics Research Centre, University of Salford, Salford, UK; Institute of Sound and Vibration Research, University of Southampton, Southampton, UK; BBC Research and Development, MediaCityUK, Salford, UK; Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, Surrey, UK
Page: 414

The challenge of installing and setting up dedicated spatial audio systems can make it difficult to deliver immersive listening experiences to the general public. However, the proliferation of smart mobile devices and the rise of the Internet of Things mean that there are increasing numbers of connected devices capable of producing audio in the home. “Media device orchestration” (MDO) is the concept of utilizing an ad hoc set of devices to deliver or augment a media experience. In this paper, the concept is evaluated by implementing MDO for augmented spatial audio reproduction using object-based audio with semantic metadata. A system that augmented a stereo pair of loudspeakers with an ad hoc array of connected devices is described. The MDO approach aims to optimize aspects of the listening experience that are closely related to listener preference rather than attempting to recreate sound fields as devised during production. A thematic analysis of positive and negative listener comments about the system revealed three main categories of responses: perceptual, technical, and content-dependent aspects. MDO performed particularly well in terms of immersion/envelopment, but the quality of listening experience was partly dependent on loudspeaker quality and listener position.

User-independent Accelerometer Gesture Recognition for Participatory Mobile Music

Authors: Roma, Gerard; Xambó, Anna; Freeman, Jason
Affiliation: University of Huddersfield, Huddersfield, UK; Queen Mary University of London, London, UK; Georgia Institute of Technology, Atlanta, GA, USA
Page: 430

With the widespread use of smartphones that have multiple sensors and sound processing capabilities, there is great potential for increased audience participation in music performances. This paper proposes a framework for participatory mobile music based on mapping arbitrary accelerometer gestures to sound synthesizers. The authors describe Handwaving, a system based on neural networks for real-time gesture recognition and sonification on mobile browsers. Based on a multiuser dataset, results show that training with data from multiple users improves classification accuracy, supporting the use of the proposed algorithm for user-independent gesture recognition. This illustrates the relevance of user-independent training for multiuser settings, especially in participatory music. The system is implemented using web standards, which makes it simple and quick to deploy software on audience devices in live performance settings.
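
As a rough illustration of what user-independent evaluation can mean in practice, the sketch below splits a multiuser gesture dataset so that each user is held out in turn and a classifier would be trained only on the remaining users. This is an assumption made for illustration, not the procedure reported in the paper; the function name `leave_one_user_out` and the `(user_id, features, label)` tuple layout are hypothetical.

```python
def leave_one_user_out(samples):
    """Yield (held_out_user, train, test) folds.

    `samples` is a list of (user_id, features, label) tuples; this layout
    is a hypothetical choice for the sketch, not taken from the paper.
    """
    users = sorted({user for user, _, _ in samples})
    for held_out in users:
        # Train on every user except the held-out one.
        train = [s for s in samples if s[0] != held_out]
        # Test only on the held-out user's gestures.
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Toy accelerometer samples: (user, [x, y, z], gesture label).
data = [
    ("ana", [0.1, 0.9, 0.0], "shake"),
    ("ben", [0.8, 0.1, 0.2], "tilt"),
    ("ana", [0.7, 0.2, 0.1], "tilt"),
    ("cat", [0.2, 0.8, 0.1], "shake"),
]

for user, train, test in leave_one_user_out(data):
    print(user, len(train), len(test))
```

Because no gesture from the held-out user appears in the training fold, accuracy measured on the test fold reflects how well a model generalizes to a performer or audience member it has never seen.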

Augmenting a MIDI Keyboard Using Virtual Interfaces

Authors: Desnoyers-Stewart, John; Gerhard, David; Smith, Megan L.
Affiliation: University of Regina, Regina, Canada
Page: 439

With the ongoing development of virtual reality (VR) systems, such as the HTC Vive and Oculus Rift, there is a need to develop new interfaces that maximize immersion in VR. Several VR interfaces have been successfully developed and implemented for a mixed reality (MR) MIDI keyboard. MR exists along a continuum between a pure real environment and a pure virtual environment. This paper presents a collection of virtual interfaces used to augment a MIDI keyboard synchronized in physical and virtual space. Several virtual interfaces are developed and evaluated. Some utilize the tactility offered by the keyboard’s surface, while others rely on the improved presence offered by the keyboard. The interfaces are evaluated with respect to learnability and utility to identify their successes and failures. With few exceptions, the interfaces developed offer functional and immersive control in a VR environment. Some mimic ordinary real interfaces without the need for fixed sensors, offering significantly improved flexibility and functionality while taking advantage of the tactility of the keyboard. A number of other interfaces are built upon this immersion to take advantage of the virtual environment.

Measurement, Recognition, and Visualization of Piano Pedaling Gestures and Techniques

Open Access

Authors: Liang, Beici; Fazekas, György; Sandler, Mark
Affiliation: Centre for Digital Music, Queen Mary University of London, London, UK
Page: 448

When playing the piano, pedaling is one of the important techniques that lead to expressive performance, comprising not only the onset and offset information that composers often indicate in the score, but also gestures related to the musical interpretation by performers. This research examines pedaling gestures and techniques on the sustain pedal from the perspective of measurement, recognition, and visualization. Pedaling gestures can be captured by a dedicated measurement system where the sensor data is simultaneously recorded alongside the piano sound under normal playing conditions. Recognition comprises two separate tasks on the sensor data: pedal onset/offset detection and classification by technique. The onset and offset times of each pedaling technique were computed using signal processing algorithms. Based on features extracted from every segment when the pedal is pressed, the task of classifying the segments by pedaling technique was undertaken using machine-learning methods. High accuracy was obtained by cross validation. The recognition results can be represented using novel pedaling notations and visualized in an audio-based score-following application.
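
To make the onset/offset detection task concrete, here is a minimal sketch that marks pedal segments wherever a normalized pedal-position signal is at or above a fixed threshold. The abstract does not specify the signal processing algorithms used, so the thresholding approach, the function name `detect_pedal_segments`, and the 0.5 threshold are all assumptions for illustration.

```python
def detect_pedal_segments(samples, threshold=0.5):
    """Return (onset_index, offset_index) pairs for pressed-pedal segments.

    `samples` holds normalized pedal positions (0 = released, 1 = fully
    pressed). The fixed threshold is a hypothetical simplification.
    """
    segments = []
    onset = None
    for i, value in enumerate(samples):
        pressed = value >= threshold
        if pressed and onset is None:
            onset = i  # pedal has just crossed the threshold going down
        elif not pressed and onset is not None:
            segments.append((onset, i))  # pedal released at sample i
            onset = None
    if onset is not None:
        # Pedal still pressed when the recording ends.
        segments.append((onset, len(samples)))
    return segments


signal = [0.0, 0.1, 0.7, 0.9, 0.8, 0.2, 0.0, 0.6, 0.6, 0.1]
print(detect_pedal_segments(signal))  # [(2, 5), (7, 9)]
```

Each returned segment is the kind of unit from which features could then be extracted for classification by pedaling technique; real sensor data would likely need smoothing or hysteresis before thresholding.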

Speech Emotion Recognition for Performance Interaction

Authors: Vryzas, Nikolaos; Kotsakis, Rigas; Liatsou, Aikaterini; Dimoulas, Charalampos A.; Kalliris, George
Affiliation: Aristotle University of Thessaloniki, Thessaloniki, Greece
Page: 457

This research explores the relevance of machine-driven Speech Emotion Recognition (SER) as a way to augment theatrical performances and interactions, such as controlling stage color/light, stimulating active audience engagement, actors’ interactive training, etc. It is well known that the meaning of a speech utterance arises from more than the linguistic content. Emotional affect can dramatically change meaning. As the basis for classification experiments, the authors developed the Acted Emotional Speech Dynamic Database (AESDD), which contains spoken utterances from 5 actors with 5 emotions. Several audio features and various classification techniques were implemented and evaluated using this database, as well as comparing results with the Surrey Audio-Visual Expressed Emotion (SAVEE) database. The trained classifier was integrated into a novel application that performed live SER, fitting the needs of actor training.

Sounding Out Ethnography and Design: Developing Metadata Frameworks for Designing Personal Heritage Soundscapes

Authors: Chamberlain, Alan; Bødker, Mads; Papangelis, Konstantinos
Affiliation: University of Nottingham, Nottingham, UK; Dept. Digitalization, Copenhagen Business School, DK; Xi’an Jiaotong-Liverpool University, PRC
Page: 468

Ethnography has long been used within a variety of settings in order to articulate and understand the everyday worlds of work and leisure. This paper explores the use of auto-ethnography as a method for soundscape design in the fields of personal heritage and locative media. Specifically, the authors explore possible connections between digital media, space, and “meaning making,” suggesting how auto-ethnographies might help discover design opportunities for merging digital media and places. These are methods that are more personally relevant than those typically associated with more system-based design approaches that often are less sensitive to the way that emotion, relationships, memory, and meaning come together. As digital technologies are increasingly ubiquitous, there are new possibilities that allow people to self-design experiences that can be social, located, or mobile, spanning modalities and times. There is a suggestion that tangible interactive technologies might contribute to community-based (or intersubjective) narratives and foster participatory sense-making around such merging of place with media. As physical space and digital media become ever more intertwined, together forming and augmenting meaning and experience, there is a need for methods to explore possible ways in which physical places and intangible personal content can be used to develop meaningful experiences.

Touch the Sound: Design and Development of a Tangible System for Sound Experimentation

Author: Cuadrado, Francisco
Affiliation: Universidad Loyola Andalucía, Seville, Spain
Page: 478

“Touch the Sound” is a technological tool specifically designed for training children in sound experimentation. It is based on the interaction with physical and tangible elements (passive objects that children can organize and move inside a two-axis space). A tablet-based app uses computer-vision procedures to recognize each object and its position and movement to play back audio files and modify different sound parameters in real time. The technological approach and design adopted according to the philosophy and objectives of the Touch the Sound project have proved to be effective, despite the drawbacks and technical problems encountered during the development process. Programming based on native Android OS tools has enabled the design of a system that achieves optimal performance results on devices with limited hardware resources. The system contributes to the media literacy of children, making them aware of the narrative possibilities of sound, and teaching them to control and modify its parameters.

Ecosystems of Visual Programming Languages for Music Creation: A Quantitative Study

Authors: Pošcic, Antonio; Krekovic, Gordan
Affiliation: Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia; Unaffiliated
Page: 486

Visual programming provides a way to construct and communicate concepts to a computer by manipulating graphical objects and by using symbols, spatial arrangements, and visual expressivity instead of text. In the context of music creation, it offers an intuitive yet comprehensive approach. To avoid using textual programming, some composers, performers, and multimedia artists employ visual languages to support their creative processes. This research explores contextual aspects related to the discovery, learning, and use of different visual languages for music. The authors conducted a survey of 218 participants and quantitatively analyzed relations between relevant dimensions. The resulting interpretation of the analyzed data formed guidelines for educators, visual language developers, and end users. Educators can use this research to improve how they transfer knowledge and mentor their students. Developers are provided with empirical evidence gathered through rigorous quantitative research that can indicate the existence of certain phenomena related to users of their tools. End users can engage in continuous and unstructured exploration and experimentation.

Features

Preview of Spatial Reproduction Conference, Tokyo

Page: 496

Preview of Audio for Virtual and Augmented Reality Conference, Redmond

Page: 498

Loudspeaker Design: Optimization and Efficiency

Author: Rumsey, Francis
Page: 501

When loudspeakers are a commodity product, manufactured in large numbers at the lowest possible price, finding ways of ensuring optimum performance can be challenging. Various acoustical and signal processing methods can be used to extract the optimum performance out of them, and testing regimes need to be sensitive to the real implications of performance variations.

Departments

Section News

Page: 506

Book Reviews

Page: 508

Products

Page: 510

AES Conventions and Conferences

Page: 512
