We propose a new biologically-inspired paradigm for universal audio coding based on neural spikes. Our approach is based on the generation of sparse 2-D representations of audio signals, dubbed as spikegrams. The spikegrams are generated by projecting the signal onto a set of overcomplete adaptive gammachirp (gammatone with additional tuning parameters) kernels. A masking model is applied to the spikegrams to remove inaudible spikes and to increase the coding efficiency. The paradigm is a first step towards the implementation of a high-quality audio encoder by further processing acoustical events generated in the spikegrams. Upon necessary optimization, our coding system operating at 1 bit/sample for sound sampled at 44.1kHz, is expected to deliver high quality audio for broadcasting, archiving, etc.
https://www.aes.org/e-lib/browse.cfm?elib=14063
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.
Learn more about the AES E-Library
Start a discussion about this paper!