Perceptual Compression Methods for Metadata in Directional Audio Coding Applied to Audiovisual Teleconference
In teleconferencing application of Directional Audio Coding, the transmitted data consists of monophonic audio signal and directional metadata measured in frequency bands depending on time. In reproduction, each frequency channel of the signal is reproduced to corresponding direction with corresponding diﬀuseness. This paper examines methods for reducing the data rate of the metadata. The compression methods are based on psychoacoustic studies about the accuracy of directional hearing, and further developed and validated. Informal tests with one-way reproduction, as well as usability testing where an actual teleconference was arranged, were utilized for this purpose. The results indicate that the data rate can be as low as approx. 3 kbit/s without a signiﬁcant loss in the reproduced spatial quality.
Click to purchase paper or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $20 for non-members, $5 for AES members and is free for E-Library subscribers.