Object-based audio (OBA) program material is challenging to distribute over low-bandwidth channels and costly to render on thin clients. This research proposes a dynamic object-grouping solution that represents a complex object-based scene as an equivalent, reduced set of object groups while maintaining perceptually transparent rendering quality. This solution is a type of spatial coding. This paper introduces a real-time greedy simplification technique that addresses limitations of previous approaches by modeling spatial release from masking and distributing input objects into multiple output groups. The core algorithm is extended to preserve other types of artistic metadata beyond object position. Results of perceptual tests show that this solution can achieve a 10:1 reduction in object count while maintaining high-quality audio playback and rendering flexibility at the endpoint. Spatial coding does not require perceptual coding of the objects’ audio essence but can be further combined with audio coding tools to deliver OBA content at low bit rates. This makes spatial coding a key component of an OBA production and distribution workflow. Object-based content creation, distribution, and rendering workflows require novel methods to process, combine, encode, and simplify complex auditory scenes to allow endpoint rendering flexibility, efficiency, and adaptability, as well as the means to cater for personalized experiences.