This paper presents ideas for managing all the information sources present in a generic speech or audio coder. The task becomes more pressing as coding structures grow more complex: organizing and processing this information appropriately is key to an efficient implementation, in terms of both complexity and quality. First, a data structure inspired by classic comprehension theories is proposed, which sorts the information into three hierarchical levels. Based on this structure, a block diagram of a global sound encoder is described. The model draws on blackboard models, which are commonly applied in speech recognition. Finally, an MPEG-2/4 AAC-LC coder is shown to be a particular case of the proposed model.
https://www.aes.org/e-lib/browse.cfm?elib=12655