
AES Section Meeting Reports

Pacific Northwest - May 18, 2023

Meeting Topic: Unreal Engine Metasounds

Moderator Name:

Speaker Name: Phil Popp (Epic Games)

Meeting Location: DigiPen (hybrid, also on Zoom)

Summary

PNW Section's May 2023 meeting featured Phil Popp from Epic Games, who gave an overview of the current state of Unreal Engine's Metasounds: how they work and how sound designers and game developers can use them. Phil's talk also covered some of the lower-level details of Metasounds for audio programmers. This was a hybrid meeting, with the live gathering at DigiPen also available on Zoom; about 20 people (6 AES members) attended at DigiPen and 37 (26 AES members) joined on Zoom.

Metasounds is an audio scripting system built into Unreal Engine (UE) and specialized for making interactive and procedural audio for video games and interactive media. Metasounds run in real time, working side by side with UE's other systems, including physics, graphics, and game design, and they run on their own audio engine. Popp described how an audio engine has much in common with a software sampler: both handle voice, memory, streaming, and state management. Metasounds also need to create subtle variations in parameter space, or "performance" data, so that repeated sounds do not feel mechanically identical.
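As a rough illustration of that last point, here is a minimal sketch of per-voice "performance" variation in plain C++. This is not Metasounds or UE code, and all names are invented for illustration; it simply shows the common game-audio technique of nudging pitch and gain within a small range each time a sound is triggered:

    #include <cmath>
    #include <cstdio>
    #include <random>

    struct VoiceParams {
        float PitchRatio; // playback-rate multiplier, 1.0 = original pitch
        float Gain;       // linear amplitude
    };

    class VariationGenerator {
    public:
        // PitchCents / GainDb are the +/- ranges to randomize within.
        VariationGenerator(float PitchCents, float GainDb)
            : PitchDist(-PitchCents, PitchCents), GainDist(-GainDb, GainDb) {}

        VoiceParams NextVoice() {
            // Convert a random cents offset to a pitch ratio and a
            // random dB offset to linear gain.
            const float Cents = PitchDist(Rng);
            const float Db = GainDist(Rng);
            return { std::pow(2.0f, Cents / 1200.0f),
                     std::pow(10.0f, Db / 20.0f) };
        }

    private:
        std::mt19937 Rng{std::random_device{}()};
        std::uniform_real_distribution<float> PitchDist;
        std::uniform_real_distribution<float> GainDist;
    };

    int main() {
        // E.g., footsteps varied by up to +/-50 cents and +/-2 dB.
        VariationGenerator Footsteps(50.0f, 2.0f);
        for (int i = 0; i < 4; ++i) {
            const VoiceParams P = Footsteps.NextVoice();
            std::printf("voice %d: pitch x%.3f, gain x%.3f\n",
                        i, P.PitchRatio, P.Gain);
        }
        return 0;
    }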

Unlike the traditional approach to game audio, which plays back pre-rendered PCM audio assets, Metasounds are not played as raw samples. Instead, they are constructed as "object-based" designs driven by runtime data: each Metasound is designed as a topology that defines how different sounds are generated and interact within Unreal Engine.

Popp emphasized that one of the primary concerns of game audio professionals is memory consumption: managing runtime memory and disk footprint is an important focus for a successful game release. Streaming audio through compression codecs is a key part of these memory management systems; these codecs include perceptual, time-domain, and hardware codecs, all of which involve managing CPU resources. Another challenge is that games must support multiple platforms, with new ones launching and old ones being retired all the time. These platforms have vastly different computational power, which makes managing memory for each of them even more complex. For a tool like Metasounds to work, it needs to scale to meet the demands of this highly dynamic, fast-paced environment.

The inspiration behind Metasounds came from wanting to bring procedural audio generation to game audio. Tools like Max/MSP, Pure Data, SuperCollider, ChucK, and even modular synthesizers informed its creation: the creativity these real-time, procedural audio tools give artists, combined with the unique limitations of audio development for games, shaped its design.

Like "patcher" programming languages such as Max/MSP, Metasounds uses different functions (known as "nodes") that are interconnected with virtual patch cables. Metasounds in many ways resembles UE's already existing Blueprint visual scripting language. Metasound's various nodes can generate audio as well as process audio in various ways. Nodes can also be used to handle audio math and manage audio assets. Popp gave a live demonstration of Metasounds showcasing sound designs that featured sound synthesis and audio samples. Popp's demonstrations showed how a sound designer can modify sounds directly with Metasounds. Popp also provided a musical demo closely resembling modular synthesis patch programming.

The final portion of the presentation focused on how Metasounds work at a lower level. Popp noted that Metasounds had to be interactive, fast, robust, and expressive. These requirements come down to low-level architectural decisions, such as the decision to build Metasounds on a digital signal processing flow-graph programming paradigm: a Metasound is a synchronous data-flow audio DSP graph. Nodes modify data, edges express data flow and node dependencies, and inputs and outputs determine how the graph interacts with external code. Metasound graphs are directed, so data always flows in a specific direction, and acyclic, so audio feedback loops are not allowed, due to the need for speed, safety, and stability. Game developers can extend Metasounds by creating their own custom nodes; nodes are modular and can be written in C++ or composed from other nodes.
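A directed acyclic data-flow graph like this is typically executed by sorting the nodes topologically and running each one once per render block. The sketch below (a simplified, assumed illustration, not Epic's implementation) shows such an ordering, including the cycle check that rejects feedback:

    #include <queue>
    #include <stdexcept>
    #include <utility>
    #include <vector>

    // Returns an execution order for NumNodes nodes given directed edges
    // (From -> To); throws if the graph contains a cycle (i.e., feedback).
    std::vector<int> TopologicalOrder(
            int NumNodes, const std::vector<std::pair<int, int>>& Edges) {
        std::vector<std::vector<int>> Adj(NumNodes);
        std::vector<int> InDegree(NumNodes, 0);
        for (const auto& [From, To] : Edges) {
            Adj[From].push_back(To);
            ++InDegree[To];
        }
        // Start with nodes that depend on nothing (graph inputs).
        std::queue<int> Ready;
        for (int N = 0; N < NumNodes; ++N)
            if (InDegree[N] == 0) Ready.push(N);

        std::vector<int> Order;
        while (!Ready.empty()) {
            const int N = Ready.front();
            Ready.pop();
            Order.push_back(N);
            // A node becomes ready once all of its upstream nodes have run.
            for (int Next : Adj[N])
                if (--InDegree[Next] == 0) Ready.push(Next);
        }
        if (static_cast<int>(Order.size()) != NumNodes)
            throw std::invalid_argument("cycle detected: feedback is not allowed");
        return Order;
    }

    int main() {
        // 0: oscillator -> 1: filter -> 2: output, with 3: LFO -> 1: filter.
        const auto Order = TopologicalOrder(4, {{0, 1}, {1, 2}, {3, 1}});
        // Each render block, execute the nodes in Order.
        return 0;
    }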

Popp explained the importance of understanding the vast differences in data speeds between L1, L2, and L3 cache, RAM, and disk for audio processing: each successive level of this hierarchy is larger but slower, and audio that still resides on disk cannot be played back quickly; it must first be streamed into faster memory. For audio processing, Metasounds uses sample-accurate synthesis processed in render blocks. With UE's Audio Stream Cache, Metasounds can access audio samples very quickly: audio assets are deeply integrated with the system and are compressed and streamed to support playing many sounds at once.
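The sketch below (again a simplified assumption, not Epic's code) illustrates what "sample-accurate synthesis in render blocks" means: the engine renders audio in fixed-size blocks, which keeps the working set small and cache-friendly, yet parameter changes still land at exact sample offsets within a block:

    #include <cstdint>
    #include <vector>

    constexpr int kBlockSize = 256;

    struct Event {
        std::int64_t SampleTime; // absolute sample position of the change
        float NewGain;
    };

    // Renders one block starting at BlockStart. Events (assumed sorted by
    // time) split the block so each gain change lands on its exact sample.
    // Writing the current gain value stands in for real per-sample synthesis.
    void RenderBlock(std::int64_t BlockStart, const std::vector<Event>& Events,
                     float& Gain, float* Out) {
        int Cursor = 0;
        for (const Event& E : Events) {
            const std::int64_t Offset = E.SampleTime - BlockStart;
            if (Offset < Cursor || Offset >= kBlockSize) continue; // not in this block
            while (Cursor < Offset) Out[Cursor++] = Gain; // render up to the event
            Gain = E.NewGain;                             // apply exactly here
        }
        while (Cursor < kBlockSize) Out[Cursor++] = Gain; // render the rest
    }

    int main() {
        std::vector<float> Block(kBlockSize);
        float Gain = 1.0f;
        // A gain change scheduled mid-block, at absolute sample 100.
        RenderBlock(/*BlockStart=*/0, {{100, 0.25f}}, Gain, Block.data());
        return 0;
    }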

Looking ahead, Epic is planning to add the ability to make sub-mix effects, source effects, spatializers, particle VFX, and control modulation, as well as to make Metasounds even faster. Epic Games has already released various free example Unreal projects, including Lyra and Ancient Game, with many pre-made Metasounds that people can check out on their own and modify.

Written By:

