AES San Francisco 2012
Game Audio Track Event Details

Friday, October 26, 10:00 am — 11:30 am (Foyer)

Poster: P3 - Audio Effects and Physical Modeling

P3-1 Luciverb: Iterated Convolution for the Impatient
Jonathan S. Abel, Stanford University - Stanford, CA, USA; Michael J. Wilson, Stanford University - Stanford, CA, USA
An analysis of iteratively applied room acoustics used by Alvin Lucier to create his piece "I'm Sitting in a Room" is presented, and a real-time system allowing interactive control over the number of rooms in the processing chain is described. Lucier anticipated that repeated application of a room response would bring out room resonances and smear the input sound over time. What was unexpected was the character of the smearing, turning a transient input into a sequence of crescendos at the room modes, ordered from high frequency to low frequency. Here, a room impulse response convolved with itself L times is shown to have energy at the room modes, each with a roughly Gaussian envelope, peaking at the observed L/2 times the frequency-dependent decay time.
Convention Paper 8691 (Purchase now)
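The heart of such a chain, the L-fold self-convolution of a room impulse response, can be sketched in a few lines of Python (the function names and the direct convolution are illustrative, not from the paper):

```python
def convolve(a, b):
    """Direct linear convolution of two short sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def iterate_room(ir, n_rooms):
    """Pass a signal through the same room n_rooms times in series,
    i.e., convolve the impulse response with itself n_rooms-fold."""
    out = list(ir)
    for _ in range(n_rooms - 1):
        out = convolve(out, ir)
    return out
```

A real-time implementation would use FFT-based partitioned convolution instead of this O(n²) loop; the repeated convolution is what progressively concentrates the response onto the room modes.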

P3-2 A Tilt Filter in a Servo Loop
John Lazzaro, University of California, Berkeley - Berkeley, CA, USA; John Wawrzynek, University of California, Berkeley - Berkeley, CA, USA
Tone controls based on the tilt filter first appeared in 1982, in the Quad 34 Hi-Fi preamp. More recently, tilt filters have found a home in specialist audio processors such as the Elysia mpressor. This paper describes a novel dynamic filter design based on a tilt filter. A control system sets the tilt slope of the filter, in order to servo the spectral median of the filter output to a user-specified target. Users also specify a tracking time. Potential applications include single-instrument processing (in the spirit of envelope filters) and mastering (for subtle control of tonal balance). Although we have prototyped the design as an AudioUnit plug-in, the architecture is also a good match for analog circuit implementation.
Convention Paper 8692 (Purchase now)
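The servo idea can be illustrated with a minimal sketch, assuming a simple energy-based definition of the spectral median and a proportional control update (both are stand-ins for whatever the authors actually use):

```python
def spectral_median(freqs, mags):
    """Frequency below which half of the spectral energy lies."""
    energies = [m * m for m in mags]
    half = sum(energies) / 2.0
    acc = 0.0
    for f, e in zip(freqs, energies):
        acc += e
        if acc >= half:
            return f
    return freqs[-1]

def servo_tilt_update(slope_db_per_oct, measured_median, target_median, gain=1e-3):
    """One control-loop step: tilt darker (more negative slope) when the
    filter output is brighter than the user-specified target."""
    return slope_db_per_oct - gain * (measured_median - target_median)
```

Run at control rate with a smoothing constant derived from the user's tracking time, this loop servos the output's spectral median toward the target, as described above.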

P3-3 Multitrack Mixing Using a Model of Loudness and Partial Loudness
Dominic Ward, Birmingham City University - Birmingham, UK; Joshua D. Reiss, Queen Mary University of London - London, UK; Cham Athwal, Birmingham City University - Birmingham, UK
A method for generating a mix of multitrack recordings using an auditory model has been developed. The proposed method is based on the concept that a balanced mix is one in which the loudness of all instruments is equal. A sophisticated psychoacoustic loudness model is used to measure the loudness of each track, both in quiet and when mixed with any combination of the remaining tracks. Such measures are used to control the track gains in a time-varying manner. Finally, we demonstrate how model predictions of partial loudness can be used to counteract energetic masking for any track, allowing the user to achieve better channel intelligibility in complex music mixtures.
Convention Paper 8693 (Purchase now)
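As a toy version of the idea, with plain RMS standing in for the paper's psychoacoustic loudness model, computing equal-loudness track gains might look like:

```python
import math

def rms(track):
    """Crude loudness proxy; the paper uses a psychoacoustic model instead."""
    return math.sqrt(sum(x * x for x in track) / len(track))

def equal_loudness_gains(tracks, target=0.1):
    """Gain per track so every track lands at the same (proxy) loudness."""
    return [target / max(rms(t), 1e-12) for t in tracks]
```

The real system recomputes these measures in a time-varying manner and, crucially, measures each track's partial loudness in the presence of the others rather than in isolation.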

P3-4 Predicting the Fluctuation Strength of the Output of a Spatial Chorus Effects Processor
William L. Martens, University of Sydney - Sydney, NSW, Australia; Robert W. Taylor, University of Sydney - Sydney, NSW, Australia; Luis Miranda, University of Sydney - Sydney, NSW, Australia
The experimental study reported in this paper was motivated by an exploration of a set of related audio effects comprising what has been called “spatial chorus.” In contrast to a single-output, delay-modulation-based effects processor that produces a limited range of results, complex spatial imagery is produced when parallel processing channels are subjected to incoherent delay modulation. In order to develop a more adequate user interface for control of such “spatial chorus” effects processing, a systematic investigation of the relationship between algorithmic parameters and perceptual attributes was undertaken. The starting point for this investigation was to perceptually scale the amount of modulation present in a set of characteristic stimuli in terms of the auditory attribute that Fastl and Zwicker called “fluctuation strength.”
Convention Paper 8694 (Purchase now)

P3-5 Computer-Aided Estimation of the Athenian Agora Aulos Scales Based on Physical Modeling
Areti Andreopoulou, New York University - New York, NY, USA; Agnieszka Roginska, New York University - New York, NY, USA
This paper presents an approach to scale estimation for the ancient Greek Aulos with the use of physical modeling. The system is based on manipulation of a parameter set that is known to affect the sound of woodwind instruments, such as the reed type, the active length of the pipe, its inner and outer diameters, and the placement and size of the tone-holes. The method is applied to a single Aulos pipe reconstructed from the Athenian Agora fragments. A discussion follows on the resulting scales and the system’s advantages and limitations.
Convention Paper 8695 (Purchase now)

P3-6 A Computational Acoustic Model of the Coupled Interior Architecture of Ancient Chavín
Regina E. Collecchia, Stanford University - Stanford, CA, USA; Miriam A. Kolar, Stanford University - Stanford, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
We present a physical, modular computational acoustic model of the well-preserved interior architecture at the 3,000-year-old Andean ceremonial center Chavín de Huántar. Our previous model prototype [Kolar et al. 2010] translated the acoustically coupled topology of Chavín gallery forms to a model based on digital waveguides (bi-directional by definition), representing passageways, connected through reverberant scattering junctions, representing the larger room-like areas. Our new approach treats all architectural units as “reverberant” digital waveguides, with scattering junctions at the discrete planes defining the unit boundaries. In this extensible and efficient lumped-element model, we combine architectural dimensional and material data with sparsely measured impulse responses to simulate multiple and circulating arrival paths between sound sources and listeners.
Convention Paper 8696 (Purchase now)

P3-7 Simulating an Asymmetrically Saturated Nonlinearity Using an LNLNL Cascade
Keun Sup Lee, DTS, Inc. - Los Gatos, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
The modeling of a weakly nonlinear system having an asymmetric saturating nonlinearity is considered, and a computationally efficient model is proposed. The nonlinear model is the cascade of linear filters and memoryless nonlinearities, an LNLNL system. The two nonlinearities are upward and downward saturators, limiting, respectively, the amplitude of their input for positive or negative excursions. In this way, distortion noted in each half of an input sinusoid can be separately controlled. This simple model is applied to simulating the signal chain of the Echoplex EP-4 tape delay, where informal listening tests showed excellent agreement between recorded and simulated program material.
Convention Paper 8697 (Purchase now)
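A minimal sketch of such a cascade follows; the tanh saturators and one-pole filters are placeholder choices, not the paper's measured Echoplex characteristics:

```python
import math

def upward_saturator(x, limit=1.0):
    """Soft-limit positive excursions only; negative samples pass unchanged."""
    return [limit * math.tanh(v / limit) if v > 0 else v for v in x]

def downward_saturator(x, limit=1.0):
    """Soft-limit negative excursions only; positive samples pass unchanged."""
    return [limit * math.tanh(v / limit) if v < 0 else v for v in x]

def one_pole_lowpass(x, a=0.5):
    """Simple linear block standing in for each 'L' stage."""
    y, out = 0.0, []
    for v in x:
        y = (1.0 - a) * v + a * y
        out.append(y)
    return out

def lnlnl(x):
    """Linear -> upward saturator -> linear -> downward saturator -> linear."""
    x = one_pole_lowpass(x)
    x = upward_saturator(x)
    x = one_pole_lowpass(x)
    x = downward_saturator(x)
    return one_pole_lowpass(x)
```

Because each half-wave is limited by its own nonlinearity, the distortion of the positive and negative halves of a sinusoid can be tuned independently, which is the asymmetry the model exploits.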

P3-8 Coefficient Interpolation for the Max Mathews Phasor Filter
Dana Massie, Audience, Inc. - Mountain View, CA, USA
Max Mathews described what he named the “phasor filter,” which is a flexible building block for computer music, with many desirable properties. It can be used as an oscillator or a filter, or a hybrid of both. There exist analysis methods to derive synthesis parameters for filter banks based on the phasor filter, for percussive sounds. The phasor filter can be viewed as a complex multiply, or as a rotation and scaling of a 2-element vector, or as a real valued MIMO (multiple-input, multiple-output) 2nd order filter with excellent numeric properties (low noise gain). In addition, it has been proven that the phasor filter is unconditionally stable under time varying parameter modifications, which is not true of many common filter topologies. A disadvantage of the phasor filter is the cost of calculating the coefficients, which requires a sine and cosine in the general case. If pre-calculated coefficients are interpolated using linear interpolation, then the poles follow a trajectory that causes the filter to lose resonance. A method is described to interpolate coefficients using a complex multiplication that preserves the filter resonance.
Convention Paper 8698 (Purchase now)
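The fix can be sketched as follows: rather than linearly interpolating the complex coefficient (which cuts across the chord of the constant-radius arc and shrinks the pole radius), step it along the arc with one complex multiply per control tick. This sketch assumes the coefficient is represented as a single complex pole:

```python
import cmath

def arc_coeff_path(p_start, p_end, n_steps):
    """Interpolate a complex coefficient by rotation rather than by linear
    crossfade, so the pole radius (and thus the resonance) is preserved
    when both endpoints share the same radius."""
    step = (p_end / p_start) ** (1.0 / n_steps)  # constant incremental rotation
    path, p = [], p_start
    for _ in range(n_steps):
        p *= step  # one complex multiply per control-rate tick
        path.append(p)
    return path
```

With linear interpolation, the midpoint of two radius-0.99 coefficients has a radius below 0.99, which is exactly the loss of resonance described above; the multiplicative path keeps the radius constant throughout.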

P3-9 The Dynamic Redistribution of Spectral Energies for Upmixing and Re-Animation of Recorded Audio
Christopher J. Keyes, Hong Kong Baptist University - Kowloon, Hong Kong
This paper details a novel approach to upmixing any n channels of audio to any arbitrary n+ channels of audio using frequency-domain processing to dynamically redistribute spectral energies across however many channels of audio are available. Although primarily an upmixing technique, the process may also help the recorded audio regain the sense of “liveliness” that one encounters in concerts of acoustic music, partially mimicking the effects of sound spectra being redistributed throughout a hall due to the dynamically changing radiation patterns of the instruments and the movements of the instruments themselves, during performance and recording. Preliminary listening tests reveal listeners prefer this technique 3 to 1 over a more standard upmixing technique.
Convention Paper 8699 (Purchase now)
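One frame of such frequency-domain redistribution can be sketched like this (a toy DFT-based version; a real system would use an overlap-add STFT and smooth, loudspeaker-aware gain trajectories rather than a random per-bin assignment):

```python
import cmath, math, random

def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]

def redistribute_frame(frame, n_out, rng):
    """Assign each frequency bin of one frame to one of n_out channels.
    The assignment would drift slowly over time in a real upmixer."""
    spec = dft(frame)
    outs = [[0j] * len(spec) for _ in range(n_out)]
    for k, bin_value in enumerate(spec):
        outs[rng.randrange(n_out)][k] = bin_value
    return [idft(ch) for ch in outs]
```

Because the per-bin spectra of the output channels sum to the input spectrum, downmixing the upmixed channels reconstructs the original frame, so the redistribution moves energy around the array without inventing or discarding it.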

P3-10 Matching Artificial Reverb Settings to Unknown Room Recordings: A Recommendation System for Reverb Plugins
Nils Peters, International Computer Science Institute - Berkeley, CA, USA; University of California Berkeley - Berkeley, CA, USA; Jaeyoung Choi, International Computer Science Institute - Berkeley, CA, USA; Howard Lei, International Computer Science Institute - Berkeley, CA, USA
For creating artificial room impressions, numerous reverb plugins exist and are often controllable by many parameters. To efficiently create a desired room impression, the sound engineer must be familiar with all the available reverb setting possibilities. Although plugins are usually equipped with many factory presets for exploring available reverb options, it is a time-consuming learning process to find the ideal reverb settings to create the desired room impression, especially if various reverberation plugins are available. For creating a desired room impression based on a reference audio sample, we present a method to automatically determine the best matching reverb preset across different reverb plugins. Our method uses a supervised machine-learning approach and can dramatically reduce the time spent on the reverb selection process.
Convention Paper 8700 (Purchase now)
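Stripped to its essence, preset recommendation is a search in a feature space; the preset names and two-dimensional features below are made up for illustration, and the paper trains a supervised model rather than using a plain Euclidean distance:

```python
import math

def best_matching_preset(reference_features, presets):
    """Return the name of the preset whose feature vector lies closest
    (Euclidean distance) to the features of the reference recording."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    name, _ = min(presets, key=lambda nf: dist(nf[1], reference_features))
    return name
```

Here `presets` might be `[("hall", (2.5, 0.3)), ("room", (0.6, 0.5)), ...]`, with each tuple holding, say, a reverberation time and a brightness measure extracted from the preset's impulse response.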


Friday, October 26, 11:00 am — 12:30 pm (Room 123)

Game Audio: G1 - A Whole World in Your Hands: New Techniques in Generative Audio Bring Entire Game Worlds into the Realms of Mobile Platforms

Presenter:
Stephan Schütze


Abstract:
"We can't have good audio; there is not enough memory on our target platform." This comment is heard far too often, especially considering that it's incorrect. Current technology already allows complex and effective audio environments to be made with limited platform resources when developed correctly, and we are just around the corner from an audio revolution.

The next generation of tools being developed for audio creation and implementation will allow large and complex audio environments to be created using minimal resources. While the new software apps being developed are obviously an important part of this coming revolution, it is the techniques, designs, and overall attitudes to audio production that will be the critical factors in successfully creating the next era of sound environments.

This presentation will break down and discuss this new methodology independent of the technology and demonstrate some simple concepts that can be used to develop a new approach to sound design. All the material presented in this talk will benefit development on current and next gen consoles as much as development for mobile devices.


Friday, October 26, 2:00 pm — 3:30 pm (Room 123)

Game Audio: G2 - The Future Is Now: Mind Controlled Interactive Music

Presenters:
Adam Gazzaley, Neuroscience Imaging Center, UCSF - San Francisco, CA, USA
Jim Hedges, Zynga
Kyle Machulis, Nonpolynomial Labs
Nicolas Tomasino, IGN Entertainment
Richard Warp, Leapfrog


Abstract:
If one thing is clear from the games industry over the last 20 years, it is that consumers are seeking an ever-more immersive environment for their gaming experience, and in many ways biofeedback is the "final frontier," where a player’s emotions, reactions, and mood can directly influence the gameplay. Whether the feedback comes from autonomic processes (stress or arousal, as in Galvanic Skin Response) or cognitive function (EEG signals from the brain), there is no doubt that these "active input" technologies, which differ from traditional HCI inputs (such as hardware controllers) in their singular correspondence to the individual player, greatly enhance the contextual responsiveness and "reality" of a game. These technologies are already robust enough to be integrated via audiovisual mappings into the immersive world of gaming. Things are about to get a lot more real.


Friday, October 26, 4:00 pm — 6:00 pm (Room 133)

Game Audio: G3 - How to Use the Interactive Reverberator: Theoretical Bases and Practical Applications

Chair:
Steve Martz, THX Ltd. - San Rafael, CA, USA
Presenters:
Toshiki Hanyu, Nihon University - Funabashi, Chiba, Japan
Takumi Higashi, CAPCOM Co., Ltd. - Osaka-shi, Osaka-fu, Japan
Tomoya Kishi, CAPCOM Co., Ltd. - Osaka-shi, Osaka-fu, Japan
Masataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan


Abstract:
The interactive reverberator, which applies realistically computed acoustic responses to video game scenes in real time, is an important technology for in-game sound processing. The presenters have developed an interactive reverberator whose acoustical properties can be adjusted easily even after the calculated results are given. It has already been implemented in Capcom's middleware, MT Framework, and a trial run has been conducted successfully. How should the initial parameters of the interactive reverberator be set up? Do they need to be re-adjusted by listening? How can differences in reverb timbre between interactive scenes and cut scenes be minimized? The workshop introduces the algorithm and basic functions of the interactive reverberator the authors developed and also shows practical operation with 5.1-channel run-time demos.


Saturday, October 27, 9:00 am — 10:00 am (Room 120)

Product Design: PD4 - Audio in HTML 5

Presenters:
Jeff Essex, AudioSyncrasy
Jory K. Prum, studio.jory.org - Fairfax, CA, USA


Abstract:
HTML 5 is coming. Many expect it to supplant Flash as an online rich-media player, as Apple has made abundantly clear. But audio support is slow in coming, and there are currently marked differences between browsers. From an audio content standpoint, it's the Nineties all over again. The W3C's Audio Working Group is developing standards, but this is a fast-moving target. This talk will provide an update on what's working and what isn't.


Saturday, October 27, 9:00 am — 11:00 am (Room 123)

Game Audio: G4 - Education Panel—New Models for Game Audio Education in the 21st Century

Chair:
Steve Horowitz, The Code International Inc., MPA
Panelists:
Matt Donner, Pyramind - San Francisco, CA, USA
Steve Horelick, macProVideo
Scott Looney, Academy of Art University - San Francisco, CA, USA
Stephan Schütze
Michael Sweet, Berklee College of Music - Boston, MA, USA


Abstract:
Steve Horelick from macProVideo will run down the new ways that the internet and social media are changing the face of game audio education.

Formal game audio education programs are just starting to take root and sprout up all across the country and the world. From full-fledged degree programs and one-year certificate programs to single-class offerings, the word on the street is out: game audio education is becoming a hot topic and a big money-maker for schools. This panel brings together department heads from some of the country's top public and private institutions to discuss the current landscape and offerings in audio education for interactive media. Students looking to find the right institute will get a fantastic overview of what is out there and available. This is a must for students who are trying to decide which programs are right for them as they weigh their options for getting a solid education in sound and music for games and interactive media.


Saturday, October 27, 9:00 am — 11:00 am (Room 133)

Workshop: W4 - What Does an Object Sound Like? Toward a Common Definition of a Spatial Audio Object

Chair:
Frank Melchior, BBC R&D - Salford, UK
Panelists:
Jürgen Herre, International Audio Laboratories Erlangen - Erlangen, Germany; Fraunhofer IIS - Erlangen, Germany
Jean-Marc Jot, DTS, Inc. - Calabasas, CA, USA
Nicolas Tsingos, Dolby Labs - San Francisco, CA, USA
Matte Wagner, Red Storm Entertainment - Cary, NC, USA
Hagen Wierstorf, Technische Universität Berlin - Berlin, Germany


Abstract:
At the present time, several concepts for the storage of spatial audio data are under discussion in the research community. Besides the distribution of audio signals corresponding to a specific speaker layout or encoding a spatial audio scene in orthogonal basis functions like spherical harmonics, several solutions available on the market are applying object-based formats to store and distribute spatial audio scenes. The workshop will cover the similarities and differences between the various concepts of audio objects. This comparison will include the production and reproduction of audio objects as well as their storage. The panelists will try to find a common definition of audio objects in order to enable an object-based exchange format in the future.


Saturday, October 27, 9:00 am — 12:30 pm (Room 121)

Paper Session: P8 - Emerging Audio Technologies

Chair:
Agnieszka Roginska, New York University - New York, NY, USA

P8-1 A Method for Enhancement of Background Sounds in Forensic Audio Recordings
Robert C. Maher, Montana State University - Bozeman, MT, USA
A method for suppressing speech while retaining background sound is presented in this paper. The procedure is useful for audio forensics investigations in which a strong foreground sound source or conversation obscures subtle background sounds or utterances that may be important to the investigation. The procedure uses a sinusoidal speech model to represent the strong foreground signal and then performs a synchronous subtraction to isolate the background sounds that are not well-modeled as part of the speech signal, thereby enhancing the audibility of the background material.
Convention Paper 8731 (Purchase now)
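For a single stationary partial, the synchronous-subtraction idea reduces to estimating amplitude and phase by correlation and subtracting the resynthesized sinusoid. This one-partial sketch shows the principle only; the paper tracks a full time-varying sinusoidal model of the foreground speech:

```python
import math

def remove_partial(x, freq, sr):
    """Estimate one sinusoidal partial by correlation over the frame,
    then subtract it, leaving the background residual."""
    n = len(x)
    w = 2.0 * math.pi * freq / sr
    c = sum(x[t] * math.cos(w * t) for t in range(n)) * 2.0 / n
    s = sum(x[t] * math.sin(w * t) for t in range(n)) * 2.0 / n
    return [x[t] - c * math.cos(w * t) - s * math.sin(w * t) for t in range(n)]
```

Anything in the frame that is not well modeled by the subtracted sinusoids, such as quiet background sounds, survives in the residual, which is the enhancement effect described above.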

P8-2 Transient Room Acoustics Using a 2.5 Dimensional Approach
Patrick Macey, Pacsys Ltd. - Nottingham, UK
Cavity modes of a finite acoustic domain with rigid boundaries can be used to compute the transient response for a point source excitation. Previous work, considering steady state analysis, showed that for a room of constant height the 3-D modes can be computed very rapidly by computing the 2-D cross section modes. An alternative to a transient modal approach is suggested, using a trigonometric expansion of the pressure through the height. Both methods are much faster than 3-D FEM but the trigonometric series approach is more easily able to include realistic damping. The accuracy of approximating an “almost constant height” room to be constant height is investigated by example.
Convention Paper 8732 (Purchase now)

P8-3 Multimodal Information Management: Evaluation of Auditory and Haptic Cues for NextGen Communication Displays
Durand Begault, Human Systems Integration Division, NASA Ames Research Center - Moffett Field, CA, USA; Rachel M. Bittner, New York University - New York, NY, USA; Mark R. Anderson, Dell Systems, NASA Ames Research Center - Moffett Field, CA, USA
Auditory communication displays within the NextGen data link system may use multiple synthetic speech messages replacing traditional air traffic control and company communications. The design of an interface for selecting among multiple incoming messages can impact both performance (time to select, audit, and release a message) and preference. Two design factors were evaluated: physical pressure-sensitive switches versus flat panel “virtual switches,” and the presence or absence of auditory feedback from switch contact. Performance with stimuli using physical switches was 1.2 s faster than virtual switches (2.0 s vs. 3.2 s); auditory feedback provided a 0.54 s performance advantage (2.33 s vs. 2.87 s). There was no interaction between these variables. Preference data were highly correlated with performance.
Convention Paper 8733 (Purchase now)

P8-4 Prototype Spatial Auditory Display for Remote Planetary Exploration
Elizabeth M. Wenzel, NASA-Ames Research Center - Moffett Field, CA, USA; Martine Godfroy, NASA Ames Research Center - Moffett Field, CA, USA; San Jose State University Foundation; Joel D. Miller, Dell Systems, NASA Ames Research Center - Moffett Field, CA, USA
During Extra-Vehicular Activities (EVA), astronauts must maintain situational awareness (SA) of a number of spatially distributed "entities" such as other team members (human and robotic), rovers, and a lander/habitat or other safe havens. These entities are often outside the immediate field of view and visual resources are needed for other task demands. Recent work at NASA Ames has focused on experimental evaluation of a spatial audio augmented-reality display for tele-robotic planetary exploration on Mars. Studies compared response time and accuracy performance with different types of displays for aiding orientation during exploration: a spatial auditory orientation aid, a 2-D visual orientation aid, and a combined auditory-visual orientation aid under a number of degraded vs. nondegraded visual conditions. The data support the hypothesis that the presence of spatial auditory cueing enhances performance compared to a 2-D visual aid, particularly under degraded visual conditions.
Convention Paper 8734 (Purchase now)

P8-5 The Influence of 2-D and 3-D Video Playback on the Perceived Quality of Spatial Audio Rendering for Headphones
Amir Iljazovic, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Florian Leschka, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Bernhard Neugebauer, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany; Jan Plogsties, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Algorithms for processing of spatial audio are becoming more attractive for practical applications as multichannel formats and processing power on playback devices enable more advanced rendering techniques. In this study the influence of the visual context on the perceived audio quality is investigated. Three groups of 15 listeners are presented with audio-only, audio with 2-D video, and audio with 3-D video content. The 5.1 channel audio material is processed for headphones using different commercial spatial rendering techniques. Results indicate that a preference for spatial audio processing over a downmix to conventional stereo can be shown, with the effect being larger in the presence of 3-D video content. Also, the influence of video on perceived audio quality is significant for 2-D and 3-D video presentation.
Convention Paper 8735 (Purchase now)

P8-6 An Autonomous System for Multitrack Stereo Pan Positioning
Stuart Mansbridge, Queen Mary University of London - London, UK; Saorise Finn, Queen Mary University of London - London, UK; Birmingham City University - Birmingham, UK; Joshua D. Reiss, Queen Mary University of London - London, UK
A real-time system for automating stereo panning positions for a multitrack mix is presented. Real-time feature extraction of loudness and frequency content, constrained rules, and cross-adaptive processing are used to emulate the decisions of a sound engineer, and pan positions are updated continuously to provide spectral and spatial balance with changes in the active tracks. As such, the system is designed to be highly versatile and suitable for a wide range of applications, including both live sound and post-production. A real-time, multitrack C++ VST plug-in version has been developed. A detailed evaluation of the system is given, where formal listening tests compare the system against professional and amateur mixes from a variety of genres.
Convention Paper 8736 (Purchase now)

P8-7 DReaM: A Novel System for Joint Source Separation and Multitrack Coding
Sylvain Marchand, University of Western Brittany - Brest, France; Roland Badeau, Telecom ParisTech - Paris, France; Cléo Baras, GIPSA-Lab - Grenoble, France; Laurent Daudet, University Paris Diderot - Paris, France; Dominique Fourer, University Bordeaux - Talence, France; Laurent Girin, GIPSA-Lab - Grenoble, France; Stanislaw Gorlow, University of Bordeaux - Talence, France; Antoine Liutkus, Telecom ParisTech - Paris, France; Jonathan Pinel, GIPSA-Lab - Grenoble, France; Gaël Richard, Telecom ParisTech - Paris, France; Nicolas Sturmel, GIPSA-Lab - Grenoble, France; Shuhua Zang, GIPSA-Lab - Grenoble, France
Active listening consists of interacting with the music as it plays; it has numerous applications from pedagogy to gaming and involves advanced remixing processes such as generalized karaoke or respatialization. To get this new freedom, one might use the individual tracks that compose the mix. While multitrack formats lose backward compatibility with popular stereo formats and increase the file size, classic source separation from the stereo mix is not of sufficient quality. We propose a coder/decoder scheme for informed source separation. The coder determines the information necessary to recover the tracks and embeds it inaudibly in the mix, which is stereo and has a size comparable to the original. The decoder enhances the source separation with this information, enabling active listening.
Convention Paper 8737 (Purchase now)


Saturday, October 27, 11:00 am — 1:00 pm (Room 123)

Game Audio: G5 - Careers Panel—Getting a Job in the Game Industry

Chair:
Steve Horowitz, The Code International Inc., MPA
Panelists:
Charles Deenen, Electronic Arts - Los Angeles, CA, USA
Jesse Harlin, LucasArts
Adam Levenson, Levenson Artists
Richard Warp, Leapfrog


Abstract:
From AAA titles to social media, the game industry offers a lot of opportunity for the audio practitioner. In this event our panel will break down the current state of the industry.

Everyone wants to work in games; just check out the news. The game industry is larger than the film industry, and the growth curve keeps going up and up and up. So, what is the best way to get that first gig in audio for games? How can I transfer my existing skills to interactive media? Should I go to school? What are the pros and cons of a degree program versus just getting out there on my own? Good questions! We will take a panel of today’s top creative professionals, from large game studios to indie producers, and ask them what they think you need to know when looking for work in the game industry. So, whether you are already working in the game industry, thinking of the best way to transfer your skills from film, TV, or general music production to interactive media, or a complete newbie to the industry, this panel is a must!


Saturday, October 27, 2:00 pm — 3:30 pm (Room 130)

Game Audio: G6 - Building a AAA Title—Roles and Responsibilities

Presenters:
Justin Drust, Red Storm Entertainment - Cary, NC, USA
Fran Dyer, Red Storm Entertainment - Cary, NC, USA
Chris Groegler, Red Storm Entertainment - Cary, NC, USA
Matt McCallus, Red Storm Entertainment - Cary, NC, USA
Matte Wagner, Red Storm Entertainment - Cary, NC, USA


Abstract:
Look behind the curtain of a AAA title and into the world of game audio development from multiple perspectives— the Producer, Audio Director, Sound Designers, and Programmer. See the inner workings of the Red Storm Audio Team as they collaborate with multiple Ubisoft studios to create the Tom Clancy's Ghost Recon: Future Soldier Multiplayer experience. Discover the tips, tricks, and techniques of a major AAA title’s audio design process from conception to completion in this postmortem.


Saturday, October 27, 2:30 pm — 6:00 pm (Room 122)

Paper Session: P12 - Sound Analysis and Synthesis

Chair:
Jean Laroche, Audience, Inc.

P12-1 Drum Synthesis via Low-Frequency Parametric Modes and Altered Residuals
Haiying Xia, CCRMA, Stanford University - Stanford, CA, USA; Electrical Engineering, Stanford University - Stanford, CA, USA; Julius O. Smith, III, Stanford University - Stanford, CA, USA
Techniques are proposed for drum synthesis using a two-band source-filter model. A Butterworth lowpass/highpass band-split is used to separate a recorded “high tom" drum hit into low and high bands. The low band, containing the most salient modes of vibration, is downsampled and Poisson-windowed to accelerate its decay and facilitate mode extraction. A weighted equation-error method is used to fit an all-pole model—the “modal model”—to the first five modes of the low band in the case of the high tom. The modal model is removed from the low band by inverse filtering, and the resulting residual is taken as a starting point for excitation modeling in the low band. For the high band, low-order linear prediction (LP) is used to model the spectral envelope. The bands are resynthesized by feeding the residual signals to their respective all-pole forward filters, upsampling the low band, and summing. The modal model can be modulated to obtain the sound of different drums and other effects. The residuals can be altered to obtain the effects of different striking locations and striker materials.
Convention Paper 8762 (Purchase now)

P12-2 Drum Pattern Humanization Using a Recursive Bayesian Framework
Ryan Stables, Birmingham City University - Birmingham, UK; Cham Athwal, Birmingham City University - Birmingham, UK; Rob Cade, Birmingham City University - Birmingham, UK
In this study we discuss some of the limitations of Gaussian humanization and consider ways in which the articulation patterns exhibited by percussionists can be emulated using a probabilistic model. Prior and likelihood functions are derived from a dataset of professional drummers to create a series of empirical distributions. These are then used to independently modulate the onset locations and amplitudes of a quantized sequence, using a recursive Bayesian framework. Finally, we evaluate the performance of the model against sequences created with a Gaussian humanizer and sequences created with a Hidden Markov Model (HMM) using paired listening tests. We are able to demonstrate that probabilistic models perform better than instantaneous Gaussian models, when evaluated using a 4/4 rock beat at 120 bpm.
Convention Paper 8763 (Purchase now)
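The Gaussian baseline the authors compare against is easy to state in code; the recursive Bayesian model replaces these independent draws with empirical, history-dependent distributions learned from real drummers. The sigma values below are arbitrary:

```python
import random

def gaussian_humanize(onsets, velocities, t_sigma=0.01, v_sigma=5.0, rng=None):
    """Instantaneous Gaussian humanizer: jitter each quantized onset time (s)
    and note velocity with an independent draw, ignoring playing history."""
    rng = rng or random.Random()
    new_onsets = [t + rng.gauss(0.0, t_sigma) for t in onsets]
    new_velocities = [max(1.0, v + rng.gauss(0.0, v_sigma)) for v in velocities]
    return new_onsets, new_velocities
```

Because each draw is independent, this baseline cannot reproduce correlated articulation patterns (for example, consistently pushing or dragging the beat), which is the limitation motivating the probabilistic model above.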

P12-3 Procedural Audio Modeling for Particle-Based Environmental Effects
Charles Verron, REVES-INRIA - Sophia-Antipolis, France; George Drettakis, REVES/INRIA Sophia-Antipolis - Sophia-Antipolis, France
We present a sound synthesizer dedicated to particle-based environmental effects, for use in interactive virtual environments. The synthesis engine is based on five physically-inspired basic elements (that we call sound atoms) that can be parameterized and stochastically distributed in time and space. Based on this set of atomic elements, models are presented for reproducing several environmental sound sources. Compared to pre-recorded sound samples, procedural synthesis provides extra flexibility to manipulate and control the sound source properties with physically-inspired parameters. In this paper the controls are used simultaneously to modify particle-based graphical models, resulting in synchronous audio/graphics environmental effects. The approach is illustrated with three models that are commonly used in video games: fire, wind, and rain. The physically-inspired controls simultaneously drive graphical parameters (e.g., distribution of particles, average particles velocity) and sound parameters (e.g., distribution of sound atoms, spectral modifications). The joint audio/graphics control results in a tightly-coupled interaction between the two modalities that enhances the naturalness of the scene.
Convention Paper 8764 (Purchase now)

P12-4 Knowledge Representation Issues in Audio-Related Metadata Model DesignGyörgy Fazekas, Queen Mary University of London - London, UK; Mark B. Sandler, Queen Mary University of London - London, UK
In order for audio applications to interoperate, some agreement on how information is structured and encoded has to be in place within developer and user communities. This agreement can take the form of an industry standard or a widely adopted open framework consisting of conceptual data models expressed using formal description languages. There are several viable approaches to conceptualize audio-related metadata, and several ways to describe the conceptual models, as well as encode and exchange information. While emerging standards have already proven invaluable in audio information management, it remains difficult to design or choose the model that is most appropriate for an application. This paper facilitates this process by providing an overview, focusing on differences in conceptual models underlying audio metadata schemata.
Convention Paper 8765 (Purchase now)

P12-5 High-Level Semantic Metadata for the Control of Multitrack Adaptive Digital Audio EffectsThomas Wilmering, Queen Mary University of London - London, UK; György Fazekas, Queen Mary University of London - London, UK; Mark B. Sandler, Queen Mary University of London - London, UK
Existing adaptive digital audio effects predominantly use low-level features in order to derive control data. These data do not typically correspond to high-level musicological or semantic information about the content. In order to apply audio transformations selectively on different musical events in a multitrack project, audio engineers and music producers have to resort to manual selection or annotation of the tracks in traditional audio production environments. We propose a new class of audio effects that uses high-level semantic audio features in order to obtain control data for multitrack effects. The metadata is expressed in RDF using several music- and audio-related Semantic Web ontologies and retrieved using the SPARQL query language.
Convention Paper 8766 (Purchase now)

P12-6 On Accommodating Pitch Variation in Long Term Prediction of Speech and Vocals in Audio CodingTejaswi Nanjundaswamy, University of California, Santa Barbara - Santa Barbara, CA, USA; Kenneth Rose, University of California, Santa Barbara - Santa Barbara, CA, USA
Exploiting inter-frame redundancies is key to performance enhancement of delay-constrained perceptual audio coders. The long term prediction (LTP) tool was introduced in the MPEG Advanced Audio Coding standard, especially for the low delay mode, to capitalize on the periodicity in naturally occurring sounds by identifying a segment of previously reconstructed data as prediction for the current frame. However, speech and vocal content in audio signals is well known to be quasi-periodic and to involve small variations in pitch period, which compromise the LTP tool's performance. The proposed approach modifies LTP by introducing a single parameter of "geometric" warping, whereby past periodicity is geometrically warped to provide an adjusted prediction for the current samples. We also propose a three-stage parameter estimation technique, where an unwarped LTP filter is first estimated to minimize the mean squared prediction error; then filter parameters are complemented with the warping parameter and re-estimated within a small neighboring search space to retain the set of S best LTP parameters; and finally, a perceptual distortion-rate procedure is used to select, from the S candidates, the parameter set that minimizes the perceptual distortion. Objective and subjective evaluations substantiate the proposed technique's effectiveness.
Convention Paper 8767 (Purchase now)
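The first, unwarped stage of LTP estimation (a lag-and-gain search minimizing mean squared prediction error) can be sketched as follows; the geometric warping and perceptual selection stages are the paper's contribution and are not reproduced here:

```python
import numpy as np

def ltp_search(past, frame, min_lag, max_lag):
    """Unwarped long-term prediction search: find the lag and scalar
    gain minimizing the mean squared prediction error of the current
    frame against previously reconstructed samples.
    Assumes min_lag >= len(frame) so the segment lies fully in 'past'."""
    best_lag, best_gain, best_mse = min_lag, 0.0, np.inf
    n = len(frame)
    for lag in range(min_lag, max_lag + 1):
        seg = past[len(past) - lag : len(past) - lag + n]
        denom = float(seg @ seg)
        gain = float(seg @ frame) / denom if denom > 0.0 else 0.0
        mse = float(np.mean((frame - gain * seg) ** 2))
        if mse < best_mse:
            best_lag, best_gain, best_mse = lag, gain, mse
    return best_lag, best_gain, best_mse

# Toy quasi-periodic "vocal" with a 100-sample pitch period.
t = np.arange(1000)
x = np.sin(2.0 * np.pi * t / 100.0)
lag, gain, mse = ltp_search(x[:900], x[900:964], min_lag=80, max_lag=120)
```

On this stationary toy signal the search recovers the true pitch period; the paper's warping parameter addresses the realistic case where the period drifts within a frame.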

P12-7 Parametric Coding of Piano SignalsMichael Schnabel, Ilmenau University of Technology - Ilmenau, Germany; Benjamin Schubert, Ilmenau University of Technology - Ilmenau, Germany; Fraunhofer IIS - Erlangen, Germany; Gerald Schuller, Ilmenau University of Technology - Ilmenau, Germany
In this paper an audio coding procedure for piano signals is presented based on a physical model of the piano. Instead of coding the waveform of the signal, the compression is realized by extracting relevant parameters at the encoder. The signal is then re-synthesized at the decoder using the physical model. We describe the development and implementation of algorithms for parameter extraction and the combination of all the components into a coder. A formal listening test was conducted, which shows that we can obtain a high sound quality at a bit rate lower than conventional coders. We obtain a bit rate of 11.6 kbps for the proposed piano coder. We use HE-AAC as a reference codec at a gross bit rate of 16 kbps. For low and medium chords the proposed piano coder outperforms HE-AAC in terms of subjective quality, while the quality falls below HE-AAC for high chords.
Convention Paper 8768 (Purchase now)

 
 

Saturday, October 27, 3:30 pm — 5:30 pm (Room 132)

Student / Career: SC4 - Speed Counseling with Experts—Mentoring Answers for Your Career


Abstract:
This event is especially suited for students, recent graduates, young professionals, and those interested in career advice. Hosted by SPARS in cooperation with the AES Education Committee, Women's Audio Mission, G.A.N.G., and Manhattan Producers Alliance, career-related Q&A sessions will be offered to participants in a speed group mentoring format. A dozen students will interact with 4-5 working professionals in specific audio engineering fields or categories every 20 minutes. Audio engineering fields/categories include gaming, live sound/live recording, audio manufacturing, mastering, sound for picture, and studio production.

Moderator: Kirk Imamura, President, Society of Professional Audio Recording Services (SPARS)
Mentors (subject to change) include: Elise Baldwin, Dren McDonald, Paul Lipson, Tom Salta, Bob Skye, Deanne Franklin, Lauretta Molitor, David Hewitt, Chris Estes, Dave Hampton, Andrew Hollis, Chris Spahr, Richard Warp, David Glasser, Michael Romanowski, Piper Payne, Scott Hull, Steve Horowitz, Eric Johnson, Kyrsten Mate, Leslie Mona-Mathus, Shawn Murphy, Chris Bell, Mark Rubel, David Bowles, Pat McMakin, Neil Dorfsman

 
 

Saturday, October 27, 4:15 pm — 5:45 pm (Room 131)

Game Audio: G7 - Loudness Issues in Games

Chair:
Steve Martz, THX Ltd. - San Rafael, CA, USA
Panelists:
Mike Babbitt, Dolby Labs - San Francisco, CA, USA
Richard Cabot, Qualis Audio - Lake Oswego, OR, USA
Tom Hays, Technicolor Creative Services
Mark Yeend, Microsoft - Redmond, WA, USA


Abstract:
If it's too loud …

Loudness wars in games have been hotly debated but without significant progress. Other industries have taken steps to rein in the content delivered to consumers. Are there parallels that can be applied to games? A panel of industry experts will review the broadcast industry's implementation of the Commercial Advertisement Loudness Mitigation (CALM) Act of 2010, whose rules took effect in 2012, and investigate its potential application to the games industry. The panel will also discuss current attempts to address this issue among publishers and developers.

 
 

Sunday, October 28, 9:00 am — 11:30 am (Room 121)

Paper Session: P14 - Spatial Audio Over Headphones

Chair:
David McGrath, Dolby Australia - McMahons Point, NSW, Australia

P14-1 Preferred Spatial Post-Processing of Popular Stereophonic Music for Headphone ReproductionElla Manor, The University of Sydney - Sydney, NSW, Australia; William L. Martens, University of Sydney - Sydney, NSW, Australia; Densil A. Cabrera, University of Sydney - Sydney, NSW, Australia
The spatial imagery experienced when listening to conventional stereophonic music via headphones is considerably different from that experienced in loudspeaker reproduction. While the difference might be reduced when stereophonic program material is spatially processed in order to simulate loudspeaker crosstalk for headphone reproduction, previous listening tests have shown that such processing typically produces results that are not preferred by listeners in comparisons with the original (unprocessed) version of a music program. In this study a double blind test was conducted in which listeners compared five versions of eight programs from a variety of music genres and gave both preference ratings and ensemble stage width (ESW) ratings. Out of four alternative postprocessing algorithms, the outputs that were most preferred resulted from a nearfield crosstalk simulation mimicking low-frequency interaural level differences typical for close-range sources.
Convention Paper 8779 (Purchase now)
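The crosstalk-feed idea (each ear receives an attenuated, delayed copy of the opposite channel, as it would from a loudspeaker pair) can be sketched broadband as below; the preferred near-field variant in the paper uses frequency-dependent interaural level differences rather than the single gain assumed here, and the gain/delay values are illustrative:

```python
import numpy as np

def simulate_crosstalk(left, right, atten_db=6.0, delay_samples=13):
    """Feed each headphone channel an attenuated, delayed copy of the
    opposite channel. 13 samples at 48 kHz is roughly the 0.27 ms
    interaural delay of +/-30 degree loudspeakers; both numbers are
    illustrative assumptions, not the paper's fitted parameters."""
    gain = 10.0 ** (-atten_db / 20.0)
    pad = np.zeros(delay_samples)
    to_left = np.concatenate([pad, right])[: len(left)] * gain
    to_right = np.concatenate([pad, left])[: len(right)] * gain
    return left + to_left, right + to_right

fs = 48000
n = np.arange(fs // 10)                   # 100 ms of audio
L = np.sin(2.0 * np.pi * 440.0 * n / fs)  # tone panned hard left
R = np.zeros_like(L)
outL, outR = simulate_crosstalk(L, R)
```

A hard-panned source thus leaks into the opposite ear quieter and later, narrowing the headphone image toward what loudspeakers would produce.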

P14-2 Interactive 3-D Audio: Enhancing Awareness of Details in Immersive Soundscapes?Mikkel Schmidt, Technical University of Denmark - Kgs. Lyngby, Denmark; Stephen Schwartz, SoundTales - Helsingør, Denmark; Jan Larsen, Technical University of Denmark - Kgs. Lyngby, Denmark
Spatial audio and the possibility of interacting with the audio environment is thought to increase listeners' attention to details in a soundscape. This work examines if interactive 3-D audio enhances listeners' ability to recall details in a soundscape. Nine different soundscapes were constructed and presented in either mono, stereo, 3-D, or interactive 3-D, and performance was evaluated by asking factual questions about details in the audio. Results show that spatial cues can increase attention to background sounds while reducing attention to narrated text, indicating that spatial audio can be constructed to guide listeners' attention.
Convention Paper 8780 (Purchase now)

P14-3 Simulating Autophony with Auralized Oral-Binaural Room Impulse ResponsesManuj Yadav, University of Sydney - Sydney, NSW, Australia; Luis Miranda, University of Sydney - Sydney, NSW, Australia; Densil A. Cabrera, University of Sydney - Sydney, NSW, Australia; William L. Martens, University of Sydney - Sydney, NSW, Australia
This paper presents a method for simulating the sound that one hears from one's own voice in a room acoustic environment. Impulse responses from the mouth to the two ears of the same head are auralized within a computer-modeled room in ODEON, using higher-order ambisonics to model the directivity pattern of an anthropomorphic head and torso. These binaural room impulse responses, which can be measured for all possible head movements, are input into a mixed-reality room acoustic simulation system for talking-listeners. With the system, "presence" in a room environment different from the one in which one is physically present is created in real time for voice-related tasks.
Convention Paper 8781 (Purchase now)

P14-4 Head-Tracking Techniques for Virtual Acoustics ApplicationsWolfgang Hess, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
Synthesis of auditory virtual scenes often requires the use of a head-tracker. Virtual sound fields benefit from continuous adaptation to a listener's position while presented through headphones or loudspeakers. This task requires position- and time-accurate, continuously robust capture of the position of the listener's outer ears. Current head-tracker technologies solve this task with inexpensive and reliable electronics. Environmental conditions have to be considered to find an optimal tracking solution for each surrounding and each field of application. A categorization of head-tracking systems is presented: inside-out describes tracking stationary references from sensors inside a scene, whereas outside-in is the term for capturing from outside a scene. Marker-based and marker-less approaches are described and evaluated by means of commercially available products, e.g., the MS Kinect, and proprietary systems.
Convention Paper 8782 (Purchase now)

P14-5 Scalable Binaural Synthesis on Mobile DevicesChristian Sander, University of Applied Sciences Düsseldorf - Düsseldorf, Germany; Robert Schumann Hochschule Düsseldorf - Düsseldorf, Germany; Frank Wefers, RWTH Aachen University - Aachen, Germany; Dieter Leckschat, University of Applied Sciences Düsseldorf - Düsseldorf, Germany
The binaural reproduction of sound sources through headphones in mobile applications is becoming a promising opportunity to create an immersive three-dimensional listening experience without the need for extensive equipment. Many ideas for outstanding applications in teleconferencing, multichannel rendering for headphones, gaming, or auditory interfaces implementing binaural audio have been proposed. However, the diversity of applications calls for scalability of quality and performance costs so as to use and share hardware resources economically. For this approach, scalable real-time binaural synthesis on mobile platforms was developed and implemented in a test application in order to evaluate what current mobile devices are capable of in terms of binaural technology, both qualitatively and quantitatively. In addition, the audio part of three application scenarios was simulated.
Convention Paper 8783 (Purchase now)

 
 

Sunday, October 28, 9:30 am — 10:30 am (Room 133)

Game Audio: G8 - Audio Shorts: Resources

Presenters:
Charles Deenen, Electronic Arts - Los Angeles, CA, USA
Tom Salta, Tom Salta Music
Stephan Schütze


Abstract:
This hour-long session will be split into three twenty-minute segments. Each segment will go in depth into a subject that is near and dear to the presenter. Audio Shorts is designed to pack in as much usable information in as short a period of time as possible. It's like the Reader's Digest of game audio tutorials. You won't want to miss this one.

Shorty #1: Tools, Tips, and Techniques, Tom Salta, presenter
Shorty #2: Sound Libraries, Stephan Schütze, presenter
Shorty #3: My Favorite Plug-in!, Charles Deenen, presenter

 
 

Sunday, October 28, 10:45 am — 12:45 pm (Room 133)

Game Audio: G9 - Demo Derby

Panelists:
Paul Gorman, Electronic Arts
Jesse Harlin, LucasArts
Paul Lipson, Microsoft
Jonathan Mayer, Sony Computer Entertainment America
Dren McDonald, Loot Drop


Abstract:
The Demo Derby is now at AES. Bring your best demo material and have it reviewed by the Pros. Let’s see if you have what it takes to make it in games.

Music:
Attendees submit 60 seconds of their best work for a detailed critique and feedback from a team of leading audio directors and professionals and participate in an active discussion with fellow panelists and audience members. The Derby facilitates game audio practitioners of all levels and is suited for producers, composers, audio directors, and anyone interested in music for games and interactive entertainment.

Sound Design:
Attendees submit 120 seconds of their best work for a detailed critique and feedback from a team of leading audio directors and professionals and participate in an active discussion with fellow panelists and audience members. The Derby facilitates game audio practitioners of all levels and is suited for producers, composers, audio directors, and anyone interested in music for games and interactive entertainment.

Submissions:
Demos are to be on a CD/DVD that is clearly labeled with your name and contact information. Each disc should contain only 1 demo track. This disc will be played on a disc player, not a computer. Please author the disc so that your demo “auto-plays” immediately after it loads.

Submissions will be collected 30 minutes before the session begins.

 
 

Sunday, October 28, 1:00 pm — 2:00 pm (Room 124)

TC Meeting: Audio for Games


Abstract:
Technical Committee Meeting on Audio for Games

 
 

Sunday, October 28, 2:15 pm — 3:45 pm (Room 122)

Game Audio: G10 - Game Audio in a Web Browser

Presenters:
Owen Grace, Electronic Arts
Roger Powell, Electronic Arts
Chris Rogers, Google Inc.
Guy Whitmore, PopCap Games


Abstract:
Web browser-based computer games are popular because they do not require client application installation, can be played by single or multiple players over the internet, and are generally capable of being played across different browsers and on multiple devices. Audio tool support for developers is varied, with sound engine software typically employing the Adobe Flash plug-in for rendering audio, or the simpler HTML5 <audio> element. This session will focus on a research project to create a game sound engine in JavaScript based on the W3C Web Audio API draft proposal. The sound engine was used to generate 3-D spatialized rich audio content within a WebGL-based graphics game framework. The result, a networked multi-player arena combat-style game, rivals the experience of playing on a dedicated console gaming device.

 
 

Sunday, October 28, 4:00 pm — 5:45 pm (Room 122)

Engineering Brief: EB2 - eBrief Presentations—Lectures 1

Chair:
Lance Reichert, Sennheiser Electronic Corporation - San Francisco, CA, USA

EB2-1 A Comparison of Highly Configurable CPU- and GPU-Based Convolution EnginesMichael Schoeffler, International Audio Laboratories Erlangen - Erlangen, Germany; Wolfgang Hess, Fraunhofer Institute for Integrated Circuits IIS - Erlangen, Germany
In this work the performance of real-time audio signal processing convolution engines is evaluated. A CPU-based implementation using the Integrated Performance Primitives Library and two GPU-based implementations using CUDA and OpenCL are compared. The purpose of these convolution engines is auralization, e.g., the binaural rendering of virtual multichannel configurations. Any multichannel input and output configuration is supported, e.g., 22.2 to 5.1, 7.1 to 2.0, vice versa, etc. This ability results in a trade-off between configurability and performance. Using a 5.1-to-binaural setup with continuous filter changes due to simulated head-tracking, GPU processing is more efficient when 24 filters of more than 1.92 seconds duration each @ 48 kHz sampling rate are convolved. The GPU is capable of convolving longer filters in real time than a CPU-based implementation. Comparing the two GPU-based implementations, negligible performance differences between OpenCL and CUDA were measured.
Engineering Brief 60 (Download now)
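The FFT-based block convolution at the heart of such engines can be illustrated with a minimal single-partition overlap-add loop; production engines partition long filters across many channels, which is where the GPU advantage measured above appears. This is a sketch of the general technique, not the brief's IPP/CUDA/OpenCL code:

```python
import numpy as np

def fft_convolve_block(block, filt, tail):
    """One step of block-based FFT convolution (overlap-add).
    'tail' carries the overlap from previous blocks; each audio block
    costs a few FFTs instead of a full direct convolution."""
    n = len(block) + len(filt) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two >= n
    y = np.fft.irfft(np.fft.rfft(block, nfft) * np.fft.rfft(filt, nfft), nfft)[:n]
    y[: len(tail)] += tail
    return y[: len(block)], y[len(block):]  # (playable samples, new tail)

rng = np.random.default_rng(1)
x = rng.standard_normal(512)   # incoming audio stream
h = rng.standard_normal(128)   # impulse response (e.g., one HRIR partition)
tail = np.zeros(0)
blocks = []
for i in range(0, len(x), 128):
    out, tail = fft_convolve_block(x[i : i + 128], h, tail)
    blocks.append(out)
stream = np.concatenate(blocks + [tail])
```

The streamed result is sample-identical to direct convolution, so the engineering question reduces to where (CPU or GPU) the FFTs and complex multiplies run fastest.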

EB2-2 Multichannel Audio Processor which Adapts to 2-D and 3-D Loudspeaker SetupsChristof Faller, Illusonic - Uster, Switzerland
A general audio format conversion concept is described for reproducing stereo and surround audio content on loudspeaker setups with any number of channels. The goal is to improve localization and to generate a recording-related spatial impression of depth and immersion. It is explained how with these goals signals are processed using a strategy that is independent of a specific loudspeaker setup. The implementation of this general audio format conversion concept, in the Illusonic Immersive Audio Processor, is described.
Engineering Brief 61 (Download now)

EB2-3 A Comparison of Recording, Rendering, and Reproduction Techniques for Multichannel Spatial AudioDavid Romblom, McGill University - Montreal, Quebec, Canada; Catherine Guastavino, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
The objective of this project is to compare the relative merits of two different spatial audio recording and rendering techniques within the context of two different multichannel reproduction systems. The two recording and rendering techniques are "natural," using main microphone arrays, and "virtual," using spot microphones, panning, and simulated acoustic delay. The two reproduction systems are the 3/2 system (5.1 surround) and a 12/2 system, where the frontal L/C/R triplet is replaced by a 12-loudspeaker linear array. Additionally, the project seeks to determine whether standard surround techniques can be used in combination with wavefront reconstruction techniques such as Wave Field Synthesis. The Hamasaki Square was used for the room effect in all cases, exhibiting the startling quality of increasing the depth of the frontal image.
Engineering Brief 62 (Download now)

EB2-4 The Reactive Source: A Reproduction Format Agnostic and Adaptive Spatial Audio EffectFrank Melchior, BBC R&D - Salford, UK
Spatial audio has become a more and more active field of research and various systems are currently under investigation on different scales of effort and complexity. Given the potential of 3-D audio systems, spatial effects beyond source positioning and room simulation are desirable to enhance the creative flexibility. This paper describes a new adaptive spatial audio effect called reactive source. The reactive source uses low-level features of the incoming audio signal to dynamically adapt the spatial behavior of a sound source. Furthermore, the concept is format agnostic so that the effect could easily be applied to different 3-D audio reproduction methods using the same interaction method. To verify the basic concept, a prototype system for multichannel reproduction has been developed.
Engineering Brief 63 (Download now)

EB2-5 Teaching Critical Thinking in an Audio Production CurriculumJason Corey, University of Michigan - Ann Arbor, MI, USA
The practice of sound recording and production can be characterized as a series of decisions based primarily on subjective impressions of sound. These subjective impressions lead to equipment choices and use, not only for artistic effect but also to accomplish technical objectives. Nonetheless, the ability to think critically about recording techniques, equipment specifications, and sound quality is vitally important to equipment choice and use. The goal of this paper is to propose methods to encourage critical thinking among students in an audio production curriculum and to consider topics that might be included in coursework to help aspiring audio engineers evaluate audio equipment and processing.
Engineering Brief 64 (Download now)

EB2-6 Sync-AV – Workflow Tool for File-Based Video ShootingsAndreas Fitza, University of Applied Science Mainz - Mainz, Germany
The Sync-AV workflow eases the sorting and synchronization of video and audio footage without the need for expensive special hardware. It supports preproduction and shooting as well as post-production. It consists of three elements: a script-information- and metadata-gathering iOS app, synchronized with a server back-end, that can be used on several devices at once to exchange information on set; a server database with a web front-end that can sort files by their metadata, show dailies, and distribute and manage information during preproduction; and a local client that synchronizes and renames the files and applies the metadata.
Engineering Brief 65 (Download now)

EB2-7 Audio over IP Kieran Walsh, Audinate Pty. Ltd. - Ultimo, NSW, Australia
Developments in both IP networking and the attitude of professional audio to emerging technologies have presented the opportunity to consider a more abstract and all-encompassing approach to the ways that we manage data. We will examine this paradigm shift and discuss the benefits presented both in practical terms and creatively.
Engineering Brief 66 (Download now)

 
 

Sunday, October 28, 4:15 pm — 5:45 pm (Room 132)

Game Audio: G11 - Getting into Sound Design

Presenters:
Elise Baldwin, Electronic Arts/Maxis - Redwood City, CA, USA
Shaun Farley, Teleproductions International - Chantilly, VA, USA
Ann Kroeber, Sound Mountain Sound Effects Service - Richmond, CA, USA
Kyrsten Mate, Skywalker Sound
Nathan Moody, Stimulant


Abstract:
A cross-section of industry experts (games, film, TV) discuss entering the effects editing and sound design field. In addition to Game Audio, the panel will discuss the broader industry as a whole, different mediums where work can be found, and how they got their start. “Things no one told me,” skills development, continuing education, and personality all contribute to a successful career. Have you been doing everything you should be?

 
 

Sunday, October 28, 5:00 pm — 6:00 pm (Room 123)

Product Design: PD9 - Audio for iPad Publishers

Chair:
Jeff Essex, AudioSyncrasy


Abstract:
Book publishers are running to the iPad, and not just for iBooks, or one-off apps. They're building storefronts and creating subscription models, and the children's book publishers are leading the way. Through two case studies, this talk will explore how to build the audio creation and content management systems needed to produce multiple apps in high-volume environments, including VO production, concatenation schemes, file-naming conventions, audio file types for iOS, and perhaps most important, helping book publishers make the leap from the printed page to interactive publishing.

 
 

Monday, October 29, 9:30 am — 11:00 am (Foyer)

Engineering Brief: EB3 - eBrief Presentations—Posters 2

EB3-1 Implementation of an Interactive 3-D Reverberator for Video Games Using Statistical AcousticsMasataka Nakahara, ONFUTURE Ltd. - Tokyo, Japan; SONA Corp. - Tokyo, Japan; Tomoya Kishi, CAPCOM Co., Ltd. - Osaka-shi, Osaka-fu, Japan; Kenji Kojima, CAPCOM Co., Ltd.; Toshiki Hanyu, Nihon University - Funabashi, Chiba, Japan; Kazuma Hoshi, Nihon University - Chiba-ken, Japan
An interactive reverberator, which applies realistic computed acoustic responses interactively to video game scenes, is a very important technology for the processing of in-game sounds. The core of the interactive reverberator the authors developed is designed on statistical acoustics theory, so that it computes fast enough for real-time processing in fast-changing game scenes. Though statistical reverbs generally do not provide a high level of reality, the authors have achieved a quantum leap in sound quality by applying Hanyu's algorithm to conventional theories. The reverberator features: (1) no pre-computing jobs, including room modeling, are required; (2) three-dimensional responses are generated automatically; (3) complex factors such as a room's shape, open-air areas, and the effects of neighboring reverberation are expressed. The authors implemented the reverberator into Capcom middleware experimentally and have verified that it runs effectively. In this paper the algorithm, background theories, and implementation techniques are introduced.
Engineering Brief 67 (Download now)
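A minimal illustration of why statistical room acoustics suits real-time games: reverberation time follows directly from room volume and absorption, with no geometric pre-computation per scene. This sketch uses the classic Sabine relation, not Hanyu's refined algorithm, and the room figures are invented:

```python
import math

def sabine_rt60(volume_m3, surface_m2, absorption_coeff):
    """Sabine's statistical reverberation time, T60 = 0.161 * V / A.
    No ray tracing or room modeling needed, so a game can re-evaluate
    it every frame as the player moves between spaces."""
    total_absorption = surface_m2 * absorption_coeff  # sabins
    return 0.161 * volume_m3 / total_absorption

def decay_envelope(t_sec, rt60):
    """Amplitude envelope of the statistical tail: 60 dB of decay
    over RT60 corresponds to exp(-6.91 * t / RT60)."""
    return math.exp(-6.91 * t_sec / rt60)

# Hypothetical hall: 5000 m^3, 1900 m^2 of surfaces, alpha = 0.25.
rt60 = sabine_rt60(volume_m3=5000.0, surface_m2=1900.0, absorption_coeff=0.25)
```

Shaping noise with such an envelope yields a plausible (if geometry-blind) reverb tail, which is the baseline the brief improves upon.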

EB3-2 Printable Loudspeaker Arrays for Flexible Substrates and Interactive SurfacesJess Rowland, University of California, Berkeley - Berkeley, CA, USA; Adrian Freed, University of California, Berkeley - Berkeley, CA, USA
Although planar loudspeaker drivers have been well explored for many years, a flat speaker array system that may flex or fold freely remains a current challenge to engineer. We will demonstrate a viable technique for building large loudspeaker arrays that allow for diffused fields of sound transduction on flexible membranes. Planar voice coils are made from machine-cut copper sheets, or by inkjet printing and electroless copper plating, on paper, thin plastic, or similar lightweight material. We will present various ways of attaching thin magnets to these membranes, including a novel alternative strategy of mounting magnets in gloves worn by the listener. This creates an engaging experience for listeners in which gestures can control sounds from the speaker array interactively.
Engineering Brief 68 (Download now)

EB3-3 Nonlinear Distortion Measurement in Audio Amplifiers: The Perceptual Nonlinear Distortion ResponsePhillip Minnick, University of Miami - Coral Gables, FL, USA
A new metric for measuring nonlinear distortion in audio amplifiers is introduced: the Perceptual Nonlinear Distortion Response (PNDR). The metric accounts for the auditory system's masking effects. Salient features of previously developed nonlinear distortion measurements are considered in the development of the PNDR. A small group of solid-state and valve audio amplifiers was subjected to various benchmark tests, and a listening test was created to assess the perceptibility of the nonlinear distortions generated in the amplifiers. Analysis of the results showed the PNDR to be more successful than traditionally used distortion metrics. This perceptually grounded tool could provide the audio industry with more accurate test methods, facilitating product research and development.
Engineering Brief 69 (Download now)

EB3-4 EspGrid: A Protocol for Participatory Electronic Ensemble PerformanceDavid Ogborn, McMaster University - Hamilton, ON, Canada
EspGrid is a protocol developed to streamline the sharing of timing, code, audio, and video in participatory electronic ensembles, such as laptop orchestras. An application implementing the protocol runs on every machine in the ensemble, and a series of “thin” helper objects connect the shared data to the diverse languages that live electronic musicians use during performance (Max, ChucK, SuperCollider, PD, etc.). The protocol/application has been developed and tested in the busy rehearsal and performance environment of McMaster University’s Cybernetic Orchestra, during the project “Scalable, Collective Traditions of Electronic Sound Performance” supported by Canada’s Social Sciences and Humanities Research Council (SSHRC), and the Arts Research Board of McMaster University.
Engineering Brief 70 (Download now)

EB3-5 A Microphone Technique for Improved Stereo Image, Spatial Realism, and Mixing Flexibility: STAAG (Stereo Technique for Augmented Ambience Gradient)Jamie Tagg, McGill University - Montreal, Quebec, Canada
While working on location, recording engineers are often challenged by insufficient monitoring. Poor (temporary control room) acoustics or headphone monitoring can make judgments regarding microphone choice and placement difficult. These choices often lead to timbral, phase, and stereo image problems. We are often forced to choose between the improved spatial imaging of near-coincident techniques and the acoustic envelopment from spaced omni-directional mics. This poster proposes a new technique: STAAG (Stereo Technique for Augmented Ambience Gradient), which aims to improve stereo image, acoustic realism, and flexibility in the mix. The STAAG technique allows for adjustment of the acoustic envelopment once in a proper monitoring environment.
Engineering Brief 71 (Download now)

 
 

Monday, October 29, 11:00 am — 12:30 pm (Room 132)

Game Audio: G12 - Doing More with Less: How Games Immersively Simulate Audio on a Budget

Presenter:
Scott Selfon, Microsoft


Abstract:
How do games pack tens or hundreds of hours of experience onto a disc, hard drive, or the web? This talk covers some of the many techniques used (and the tradeoffs incurred) to make seemingly infinite, unique, and dynamic sounds and music—often with only a single content creator and a part-time programmer. Topics will include 3-D spatial simulation, compression, basic and advanced variation, and physical modeling techniques as applied to interactive media, with focus on topics that broadly apply to the full spectrum from mobile to console.

 
 

Monday, October 29, 1:00 pm — 5:00 pm (Tech Tours)

Technical Tour: TT8 - Electronic Arts


Abstract:
The world’s largest video game publisher is opening their doors to AES attendees with this rare tour of the 22-acre campus-like environment featuring a number of high-tech development and production studios. Wear your walking shoes, because this tour will visit each of EA’s four buildings and includes a presentation in their state-of-the-art auditorium, a "Visceral Studio" demonstration illustrating the audio focus of game development, and a Web Audio API sound engine demo. Utopia keyboardist Roger Powell will participate in his current role as EA's senior producer on emerging music technologies.

This event is limited to 44 tickets.

Technical Tours are made available on a first come, first served basis to anyone with an All Access badge. Tickets can be purchased during normal registration hours at the convention center.

Price: Members $40/Nonmembers $50

 
 



EXHIBITION HOURS: October 27th 10 am – 6 pm; October 28th 10 am – 6 pm; October 29th 10 am – 4 pm
REGISTRATION DESK: October 25th 3 pm – 7 pm; October 26th 8 am – 6 pm; October 27th 8 am – 6 pm; October 28th 8 am – 6 pm; October 29th 8 am – 4 pm
TECHNICAL PROGRAM: October 26th 9 am – 7 pm; October 27th 9 am – 7 pm; October 28th 9 am – 7 pm; October 29th 9 am – 5 pm