Talking Soundscapes: Automatizing voice transformations for crowd simulation
Jordi Janer, Music Technology Group, Universitat Pompeu Fabra
Roland Geraerts, Department of Information and Computing Sciences, Utrecht University
Wouter G. van Toll, Department of Information and Computing Sciences, Utrecht University
Jordi Bonada, Music Technology Group, Universitat Pompeu Fabra
The addition of a crowd in a virtual environment, such as a game world, can make the environment more realistic. While researchers have focused on the visual modeling and simulation of a crowd, its sound production has received less attention. We propose generating the sound of a crowd by retrieving a very small set of speech snippets from a user-contributed database, and transforming and layering voice recordings according to the character localization in the crowd simulation. Our proof-of-concept integrates state-of-the-art audio processing and crowd simulation algorithms. The novelty resides in exploring how we can create a flexible crowd sound from a reduced number of samples, whose acoustic characteristics (such as people density and dialogue activity) could be modeled in practice by means of pitch, timbre and time-scaling transformations.
The Future of Adaptive Game Music: The Continuing Evolution of Dynamic Music Systems in Video Games
David M. Young, David M. Young Music
This paper examines what the future may hold for adaptive music in video games. Discussions are focused on technical developments in music production software, game audio middleware, and gaming interfaces, and what these could mean for dynamic music systems. Specifically, the heralding of an industry-standard interactive audio transferable file type, the increasingly standardized functionality and appearance of game audio middleware, the blurring of the lines between DAW and middleware, improved real-time audio effects, generative music and MIDI-based capabilities in game engines, and the use of new player-state-based data input streams to inform and personalize music experiences on a player-by-player basis, are explored.
Can Interactive Procedural Audio Affect the Motorical Behaviour of Players in Computer Games with Motion Controllers
Niels Bøttcher, Medialogy(AAU-CPH), Aalborg University Copenhagen
This paper presents the design and implementation of a procedural sword sound model controlled with the Nintendo Wii remote. A prototype of a first person sword game was developed in order to test if the use of procedural audio in comparison to pre-recorded audio could potentially change the motorical behavior of the players. A test indicated that some of the test persons were influenced by the procedural audio, but no common measures could be found in the test.
Preliminary Investigation of Self-reported Emotional Responses to Approaching and Receding Footstep Sounds in a Virtual Reality Context
Erik Sikström, Niels Christian Nilsson, Rolf Nordahl, and Stefania Serafin, Department of Architecture, Design and Media Technology, Aalborg University Copenhagen
The emotional impact of approaching and receding sound sources has previously been studied in seated laboratory experiments with and without accompanying visual stimuli. This paper investigates the emotional responses to approaching and receding footstep sounds in an interactive virtual reality using a head-mounted display, 24-channel surround audio and a novel walking-in-place device utilizing acoustic detection of the user’s input. Based on self-reports using the Self-Assessment Manikin, the subjects gave post-experiment evaluations of 7-second-long footstep sequences approaching and receding from outside of the participants’ field of view. The participants’ sensation of presence is also studied using a SUS questionnaire. The results showed that approaching footstep sequences at the beginning of the experiment elicited a higher level of arousal than receding footsteps at the beginning of the experiment and than the periods when there were no footstep sequences.
Auditory Feedback to Improve Navigation in a Maze Game
Kevin Dahlstrøm, Nicolai Gajhede, Søren K. Jacobsen, Nicklas S. Jakobsen, Søren Lang, Magnus L. Rasmussen, Erik Sikström and Stefania Serafin, Medialogy, Aalborg University Copenhagen
In this paper we investigate whether sound design guidelines used to improve navigation in a train station can be applied in providing navigation guidelines in a maze game. For this purpose, we designed a maze game and augmented it with auditory cues useful for the navigation of the maze. A between-subject experiment showed that auditory cues significantly reduce the time needed to complete the maze.
Rhythm-Action Games: The sonic interaction perspective
Cumhur Erkut, Department of Signal Processing and Acoustics, Aalto University
Hüseyin Hacihabiboğlu, Informatics Institute, Middle East Technical University
This paper provides game audio researchers and practitioners a short background on rhythmic interaction, with application to rhythm-action games. Based on our previous experiments and observations, we point out technical challenges and our current solutions. A special focus is on how these concepts can be used in education, reflecting on the relevant sessions in the IEEE SPS Summer School Game Audio (September 3-6 2012, Ankara, Turkey). The technical and educational implications of rhythmicity for game audio are provided.
Modular Architecture for Virtual-World Parametric Spatial Audio Synthesis
Tapani Pihlajamaki, Mikko-Ville Laitinen, and Ville Pulkki, Department of Signal Processing and Acoustics, Aalto University School of Electrical Engineering
An adaptation of a parametric spatial audio coding method, Directional Audio Coding (DirAC), has been previously developed and validated for virtual-world applications. Although the quality in most cases is very good, it was noticed that some auditory scenes, e.g., ones containing multiple sources in acoustically dry conditions, are not produced optimally. In this paper, the architecture of this virtual-world DirAC is restructured and modified to avoid previous problems. In addition, these modifications achieve better scalability for the algorithm.
Integrating Custom 3D Audio Rendering into Game Sound Engines
Fritz Menzer, MN Signal Processing
While basic positional 3D audio for multichannel loudspeaker setups and headphone playback is provided by currently available game sound engines and specialized APIs, the needs of some game developers can go beyond just placing sounds in space, requiring also the simulation of diverse acoustical environments such as rooms, caves, or forests. This paper explores the possibility of addressing the game developers’ needs by adding custom 3D audio rendering into game sound engines by using their plugin systems and evaluates the performance of different plugin topologies.
Virtual Sound Source Positioning by Differential Head Related Transfer Function
Dominik Štorek, Dept. of Radioelectronics, Czech Technical University in Prague
This article presents a new approach to virtual sound source positioning. The usual method, which applies a Head Related Transfer Function to both stereo channels, is replaced by a method that processes only one channel of the stereo signal. The proposed method rests on the claim that only the difference in spectral features between the two channels is essential for perceiving the sound source position in the horizontal plane. This makes it possible to halve both the positioning data usually required and the number of computing operations. In this paper, the proposed method is introduced, compared to the standard method, and verified by a listening test.
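One possible reading of the single-channel idea is sketched below in Python. It is an illustrative assumption, not the author's implementation: the "differential" filter is taken to be the per-frequency ratio of the left and right HRTFs, and it is applied to the left channel only while the right channel is left untouched. The function names, the regularisation constant `eps`, and the use of a naive DFT with circular convolution are all hypothetical choices made to keep the example self-contained.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative filters)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def differential_response(hrir_left, hrir_right, eps=1e-9):
    """Per-bin ratio H_left / H_right, regularised by eps to avoid division by zero."""
    h_left, h_right = dft(hrir_left), dft(hrir_right)
    return [hl * hr.conjugate() / (abs(hr) ** 2 + eps)
            for hl, hr in zip(h_left, h_right)]

def position_source(mono, hrir_left, hrir_right):
    """Filter only the left channel; the right channel is the unprocessed input.

    Uses circular convolution, so mono and both HRIRs must have equal length.
    """
    diff = differential_response(hrir_left, hrir_right)
    left = idft([x * d for x, d in zip(dft(mono), diff)])
    return left, list(mono)
```

Under this assumption only one filter per source position is stored and applied, which matches the halving of data and operations described in the abstract, while the spectral ratio between the two output channels stays the same as with conventional two-channel HRTF filtering.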
A Framework for the Development of Accurate Acoustic Calculations for Games
Panagiotis Charalampous and Panos Economou, P.E Mediterranean Acoustics Research and Development
Despite the rapid development in acoustics calculation software during the last couple of decades, such advances have not been achieved uniformly. Various demands in different disciplines have shifted the focus to a number of different aspects of the calculations. Methods in game development have focused on speed and optimized calculation times to achieve interactive sound rendering, whilst engineering methods have concentrated on achieving accuracy for reliable predictions. This paper presents a flexible, expandable and adjustable framework for the development of fast and accurate acoustics calculations both for game development and engineering purposes. It decomposes the process of acoustic calculations for 3D environments into distinct calculation steps and allows third party users to adjust calculation methodologies according to their needs.
Use of 3D Head Shape for Personalized Binaural Audio
Philip J. B. Jackson and Naveen K. Desiraju, CVSSP, Dept. of Electronic Engineering, University of Surrey
Natural-sounding reproduction of sound over headphones requires accurate estimation of an individual’s Head-Related Impulse Responses (HRIRs), capturing details relating to the size and shape of the body, head and ears. A stereo-vision face capture system was used to obtain 3D geometry, which provided surface data for boundary element method (BEM) acoustical simulation. Audio recordings were filtered by the output HRIRs to generate samples for a comparative listening test alongside samples generated with dummy-head HRIRs. Preliminary assessment showed better localization judgements with the personalized HRIRs by the corresponding participant, whereas other listeners performed better with dummy-head HRIRs, which is consistent with expectations for personalized HRIRs. The use of visual measurements for enhancing users’ auditory experience merits investigation with additional participants.
Geometric and Wave-Based Acoustic Modelling Using Blender
Jelle van Mourik and Damian Murphy, AudioLab, University of York
Geometric and wave-based acoustic algorithms have been shown to be appropriate for the auralisation of room acoustic models. In particular they hold significant potential to be used in interactive virtual environments as a means of real-time sound rendering, with possible applications ranging from aiding architectural acoustic design to enhancing computer game audio. This paper presents a tool for developing acoustical scenes in Blender, an open source 3D development programme, based on 3D acoustic modelling using ray-tracing and/or FDTD methods. With the potential for real-time interaction and walk-through auralisation by means of the Blender Game Engine, we demonstrate how Blender can be used as part of the acoustical design process.
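For readers unfamiliar with FDTD, the following Python sketch shows the standard leapfrog update for the wave equation in one dimension. It is a generic textbook scheme reduced to 1D for brevity, not the Blender tool described in the paper; the function name, the impulse excitation, and the fixed zero-pressure boundaries are assumptions made for illustration.

```python
def fdtd_1d(num_cells, num_steps, source_cell, courant=1.0):
    """Propagate a pressure impulse along a 1D grid with a leapfrog FDTD scheme.

    Update rule for interior cells:
        p_next[i] = 2*p[i] - p_prev[i] + courant**2 * (p[i+1] - 2*p[i] + p[i-1])
    The two end cells are held at zero pressure (an illustrative boundary choice).
    In 1D the scheme is stable for courant <= 1.
    """
    prev = [0.0] * num_cells
    curr = [0.0] * num_cells
    curr[source_cell] = 1.0  # unit impulse as the initial excitation
    for _ in range(num_steps):
        nxt = [0.0] * num_cells
        for i in range(1, num_cells - 1):
            laplacian = curr[i + 1] - 2.0 * curr[i] + curr[i - 1]
            nxt[i] = 2.0 * curr[i] - prev[i] + courant ** 2 * laplacian
        prev, curr = curr, nxt
    return curr
```

A room-acoustics solver applies the same update on a 3D grid, with boundary conditions derived from the scene geometry and surface materials.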
Plausible Mono-to-Surround Sound Synthesis in Virtual-World Parametric Spatial Audio
Tapani Pihlajamaki and Mikko-Ville Laitinen, Department of Signal Processing and Acoustics, Aalto University School of Electrical Engineering
The control of diffuseness of sound is a tool for the sound designer to synthesize surrounding sound from a monophonic signal in virtual-world audio. Virtual-world Directional Audio Coding offers this with a specific diffuseness parameter. However, the diffuseness parameter value measured from sound recordings often has spurious short-term fluctuations which have to be synthesized to obtain natural reproduction. This data is not readily available when upmixing a monophonic signal into a multi-channel setup. In this paper, a method is proposed for estimating the fluctuation of the diffuseness parameter from a monophonic signal and synthesizing a multi-channel output based on it. This algorithm is based on the estimation of the reverberant energy in a signal. A formal listening test was performed to compare the relative quality of the proposed method to constant diffuseness cases. The results show that the proposed method increases the perceptual quality of the synthesis.
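To illustrate what a time-varying diffuseness value controls, the Python sketch below splits a mono signal into a direct and a diffuse stream using energy-preserving gains, the convention commonly used in Directional Audio Coding synthesis. How the diffuseness values are estimated from the signal is the subject of the paper and is not reproduced here; the function name and the per-sample diffuseness input are assumptions for illustration.

```python
import math

def split_direct_diffuse(samples, diffuseness):
    """Split a mono signal into direct and diffuse streams.

    diffuseness holds one value per sample in [0, 1] (values are clamped).
    Gains sqrt(1 - psi) and sqrt(psi) keep the summed energy of the two
    streams equal to the energy of the input sample.
    """
    direct, diffuse = [], []
    for x, psi in zip(samples, diffuseness):
        psi = min(1.0, max(0.0, psi))
        direct.append(math.sqrt(1.0 - psi) * x)
        diffuse.append(math.sqrt(psi) * x)
    return direct, diffuse
```

In a full renderer the direct stream would typically be panned to the source direction and the diffuse stream decorrelated and spread over all loudspeakers.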
Modeling and Real-Time Generation of Pen Stroke Sounds for Tactile Devices
Hanwook Chung, Institute of New Media and Communication, Department of Electrical Engineering and Computer Science, Seoul National University
Hoon Heo, Music and Audio Research Group, Seoul National University
Dooyong Sung, Music and Audio Research Group, Seoul National University
Yoonchang Han, Music and Audio Research Group, Seoul National University
In a real-world situation, pen strokes produce specific sounds that help to make interactions more natural in a virtual environment, such as using tactile input devices for education or games. In this paper, we describe a method for modeling and generating pen stroke sounds in real time. Since the proposed method is based on recorded signals, not only a specific pen sound but also various sound sources can be used. The difference in sound due to the change of speed of a pen movement is modeled by real-time resampling which is a simple and practical method. Acoustical resonant characteristics of the body below the surface and the pen are also identified. We conducted an experiment by implementing the proposed method on a tactile device and verified the performance.
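The speed-dependent resampling mentioned above can be sketched as a variable-rate read through a recorded stroke. This Python example is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the `reference_speed` parameter, and the choice of linear interpolation are hypothetical.

```python
def resample_by_speed(recorded, speeds, reference_speed=1.0):
    """Read through a recorded stroke at a rate proportional to pen speed.

    recorded: samples of a stroke captured at reference_speed.
    speeds: one pen-speed value per output sample (same units as reference_speed).
    Output stops early if the recording is exhausted.
    """
    out = []
    pos = 0.0  # fractional read position in the recording
    for speed in speeds:
        i = int(pos)
        if i + 1 >= len(recorded):
            break
        frac = pos - i
        # Linear interpolation between the two neighbouring samples.
        out.append((1.0 - frac) * recorded[i] + frac * recorded[i + 1])
        pos += speed / reference_speed
    return out
```

A faster stroke advances the read position in larger steps, which raises the pitch and shortens the sound, while a slower stroke does the opposite.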
Granular Analysis/Synthesis for Simple and Robust Transformations of Complex Sounds
Jung-Suk Lee, Music Technology Area, Schulich School of Music, McGill University; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT); Broadcom Corporation
François Thibault, Audiokinetic Inc.
Philippe Depalle, Music Technology Area, Schulich School of Music, McGill University; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT)
Gary P. Scavone, Music Technology Area, Schulich School of Music, McGill University; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT)
In this paper, a novel and user-friendly granular analysis/synthesis system particularly geared towards environmental sounds is presented. A granular analysis component and a grain synthesis component were intended to be implemented separately so as to achieve more flexibility. The grain analysis component segments a given sound into many ‘grains’ that are believed to be microscopic units that define an overall sound. A grain is likely to account for a local sound event generated from a microscopic interaction between objects. Segmentation should be able to successfully isolate these local sound events in a physically or perceptually meaningful way. The second part of the research was focused on the granular synthesis that can easily modify and re-create a given sound. The granular synthesis system would feature flexible time modification with which the user could re-assign the timing of grains and adjust the time-scale. Also, the system would be capable of cross-synthesis given the target sound and the collection of grains obtained through an analysis of sounds that might not include grains from the target one.
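The split between an analysis stage and a synthesis stage, and the idea of re-assigning grain timing, can be illustrated with the Python sketch below. It is a deliberate simplification and not the system described in the paper: grains here are fixed-length, Hann-windowed frames, whereas the paper aims to segment at meaningful sound events. All function and parameter names are hypothetical.

```python
import math

def split_into_grains(signal, grain_len, hop):
    """Analysis stage: cut the signal into overlapping Hann-windowed grains."""
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / grain_len)
              for n in range(grain_len)]
    grains = []
    for start in range(0, len(signal) - grain_len + 1, hop):
        grains.append([signal[start + n] * window[n] for n in range(grain_len)])
    return grains

def overlap_add(grains, hop_out):
    """Synthesis stage: place grains hop_out samples apart and sum the overlaps.

    hop_out equal to the analysis hop reconstructs the original timing;
    a larger hop_out stretches the sound, a smaller one compresses it.
    """
    if not grains:
        return []
    grain_len = len(grains[0])
    out = [0.0] * (hop_out * (len(grains) - 1) + grain_len)
    for g, grain in enumerate(grains):
        offset = g * hop_out
        for n, value in enumerate(grain):
            out[offset + n] += value
    return out
```

For example, calling `overlap_add(split_into_grains(signal, 16, 8), 16)` spaces the same grains twice as far apart, and replacing the grain list with grains taken from a different sound gives a crude form of cross-synthesis.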
Individualized HRTFs Simulation Using Multiple Source Ray Tracing Method
Dooyong Sung, Music and Audio Research Group, Seoul National University
Nara Hahn, Institute of New Media and Communication, Department of Electrical Engineering and Computer Science, Seoul National University
Kyogu Lee, Music and Audio Research Group, Seoul National University
Head-related transfer functions (HRTFs) capture the spatial auditory characteristics of humans and can be used in various applications such as spatial audio and 3D games. Since non-individualized HRTFs cause high elevation localization error and front/back confusion, individualized HRTFs are required for more precise three-dimensional localization. However, measuring HRTFs for each individual is expensive and time-consuming. In this paper, we use ray tracing techniques to simulate individualized HRTFs. Ray tracing techniques, however, show limited performance in simulating the diffraction of sound. To solve this problem, the Kirchhoff-Helmholtz integral is applied to ray tracing, an approach called Multiple Source Ray Tracing. We conducted experiments using binaural and spectral cues as performance measures, and verified that the proposed method yields performance comparable to the measured HRTFs.