Comments on DRAFT REVISED AES41-xxxx

last updated 2010-05-31

Comments to date on DRAFT REVISED AES41-xxxx, AES standard for digital audio - Recoding data set for audio bit-rate reduction,
published 2009-12-21 for comment.

Comment received from Steve Lyman, 2010-03-11

This comment seeks to expand the metadata capacity of AES41, allowing it to respond to the very real and immediate needs of the industry.

There are two possible approaches to the proposed changes to the document. The first is simply to provide suggestions and new language that leave the scope of the document essentially untouched. The second, and in my opinion the more constructive, is to enlarge the scope to acknowledge the need for a method of transporting various types of metadata, and to expand the set of Dolby E metadata beyond the minimal set currently proposed.

The following comments will address specific issues, as applicable, and will propose a broader approach and the supporting documents necessary. I hope that the committee will recognize the need for the broader approach and adopt it. If so, Dolby will be pleased to participate in the drafting and development of changes to the current proposal and, if necessary, to any supporting documents needed.


The title should be revised to include metadata (or ancillary data): "AES standard for digital audio - Recoding data set for audio bit-rate reduction and metadata transport".

1 Scope
The Scope does mention transmission of additional ancillary data, but if the committee decides to adopt the broader approach, the Scope should acknowledge the range of data supported. I will supply specific wording pending that decision.

2 Normative References
The current list only includes documents necessary to deal with the recoding data. As some Dolby E metadata is being proposed, this section should include a definition of the metadata. That is provided by SMPTE RDD6. An RDD is not a SMPTE Engineering document; it is a Registered Disclosure Document that, by SMPTE rules, cannot be used as a normative reference. If the AES situation is the same, RDD6 should be included as an Informative reference or annex.

3 Definitions, Symbols and Abbreviations
Definitions (available from RDD6) of the minimal set of Dolby E metadata should be included, or if the more general approach is accepted, RDD6 in its entirety could be referenced by this section.

4 Information for Transparency
Is “data” a collective noun, or is it a plural? There are several instances of “…the coding decision data become corrupted…” and “…coder control data are added…” in this section that may need correction.

6 Specification of the audio coder control data bit-stream syntax
There are several key items that could change substantially if the committee decides to take the broader view of the revision. These are addressed in the following section.

7 Semantics of the audio coder control data bit-stream syntax
The maximum length of a recoding data frame is limited to 255 bytes by the size of the length field of the data frame header. Experience in SMPTE (cf. SMPTE 2020-2-2008, Vertical Ancillary Data Mapping of Audio Metadata - Method A) has shown that metadata packets may exceed the 255-byte limit. The solution in 2020-2 was to use one bit in a “payload descriptor byte” as a double packet flag, which allowed the metadata to run over into a second packet.

It would be wise to provide a similar or more flexible method of dealing with long metadata packets in this revision. Perhaps one or more reserved values in the length field, possibly in combination with certain type fields of the header, could point to a location in the recoding data that carries the size of extended packets.
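As an illustration only, the reserved-length-value idea might be sketched as follows. The layout is hypothetical and is not the AES41 header: it assumes a one-byte type field, a one-byte length field, and a reserved length value (0xFF, named ESCAPE here) meaning that a 16-bit length follows the header. The names pack_frame and unpack_frame are likewise invented for this sketch.

```python
import struct

# Hypothetical reserved value of the 8-bit length field meaning
# "the real length follows as a 16-bit big-endian integer".
ESCAPE = 0xFF


def pack_frame(data_type: int, payload: bytes) -> bytes:
    """Build a frame: type byte, length byte, optional extended length, payload."""
    if not 0 <= data_type <= 0xFF:
        raise ValueError("data_type must fit in one byte")
    if len(payload) < ESCAPE:
        # Short frame: the 8-bit length field is sufficient (0..254 bytes).
        return struct.pack(">BB", data_type, len(payload)) + payload
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 16-bit extended length")
    # Long frame: length field carries ESCAPE, true size follows in 16 bits.
    return struct.pack(">BBH", data_type, ESCAPE, len(payload)) + payload


def unpack_frame(frame: bytes) -> tuple[int, bytes]:
    """Inverse of pack_frame; returns (data_type, payload)."""
    data_type, length = struct.unpack_from(">BB", frame, 0)
    offset = 2
    if length == ESCAPE:
        (length,) = struct.unpack_from(">H", frame, offset)
        offset += 2
    payload = frame[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return data_type, payload
```

The cost of this choice is that short frames lose one length value (254 bytes maximum instead of 255), while existing short frames keep their two-byte header.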

The size of the data type field of the header presents a similar problem. Again, experience in SMPTE with the five-bit data type field used in SMPTE-338-2008 (Format for Non-PCM Audio and Data in AES3 - Data Types) has demonstrated that five bits are too few. That standard is only two years old, and all 32 data type numbers have been claimed. A recent request for a new value has sent that committee off into a study group to try to figure out how to solve the problem. Better to address a potential problem in AES41 now than wait for the inevitable.
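The arithmetic behind the exhaustion is simple: an n-bit field can distinguish 2^n values, so five bits give 32. A minimal sketch, in which the TypeRegistry class and the type names are hypothetical and stand in for no real registry:

```python
def field_capacity(bits: int) -> int:
    """Number of distinct values an unsigned field of the given width can hold."""
    return 1 << bits


class TypeRegistry:
    """Hypothetical registry that hands out data type numbers in order."""

    def __init__(self, bits: int) -> None:
        self.capacity = field_capacity(bits)
        self.assigned: dict[str, int] = {}

    def claim(self, name: str) -> int:
        # Once every value is taken, a new type cannot be registered
        # without changing the field itself.
        if len(self.assigned) >= self.capacity:
            raise OverflowError("no data type numbers left")
        number = len(self.assigned)
        self.assigned[name] = number
        return number
```

With bits set to 5, the thirty-third call to claim fails, which is the situation described above.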

I believe that the Dolby E audio minimal metadata set presented in the proposed revision to AES41 is too restricted. SMPTE 2020 carries the entire metadata stream at the insistence of many large users, and the standard is supported by at least four major equipment suppliers. Dolby has recently received a request from a group representing another large group of broadcasters to help map at least all of the static Dolby E metadata parameters into an MXF file.

8 Method of signaling the audio coder control data bit stream on the AES3 interface
The first paragraph of this section is a bit confusing. It could be modified as shown.

Three signaling methods are defined for use with the AES3 interface, depending on the resolution (bits/sample) of the systems in use. The methods signal the audio coder control data bit stream by altering the parity of the audio samples. Even parity indicates a data bit with value logic 0, odd parity indicates a value of logic 1.
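The parity rule in that paragraph can be illustrated with a short sketch. It assumes that each sample is held as a non-negative integer word and that parity is altered by flipping the least significant bit; that choice of bit is an assumption made here for illustration, not something stated in the text above, and the function names are invented.

```python
def parity(word: int) -> int:
    """Return 0 for an even number of 1 bits in the word, 1 for an odd number."""
    return bin(word).count("1") % 2


def embed_bit(sample: int, bit: int) -> int:
    """Return the sample with its parity set to `bit`.

    Assumption: parity is changed by flipping the least significant bit.
    """
    return sample if parity(sample) == bit else sample ^ 1


def embed(samples: list[int], bits: list[int]) -> list[int]:
    """Signal one data bit per audio sample."""
    return [embed_bit(s, b) for s, b in zip(samples, bits)]


def extract(samples: list[int]) -> list[int]:
    """Recover the data bits: even parity reads as 0, odd parity as 1."""
    return [parity(s) for s in samples]
```

Under this assumption each sample changes by at most one least significant bit, and a receiver needs only the parity of each word to recover the stream.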

9.3 Dolby E
I realize that the document is concerned primarily with audio coder control data, but this section should refer to audio metadata instead.

Respectfully submitted,
Steve Lyman
March 11, 2010

Reply from John Grant, 2010-04-16

As chair of SC-02-02 it falls to me to reply formally to your comments on the draft revised AES41.

Several of the proposed changes would require substantial redrafting, and to incorporate them at this stage would significantly delay publication of the document, and thus of the other revisions which it contains.

Accordingly we propose we defer these more substantial changes to a further, more comprehensive, revision, on which work can begin immediately. We would, of course, greatly appreciate your input to this further revision.

I attach your comments with our reply to each item added in colour [summarized below, referring to numbered comments].

  • Title: Accepted.
  • 1 Scope: Deferred to a more comprehensive revision.
  • 2 Normative references: We will add it as an informative reference.
  • 3 Definitions, Symbols and Abbreviations: Clause 3 is chiefly intended for the definition of terms that are generally applicable throughout the document. Definitions specific to the Dolby E data set are in 7.2.3.
  • 4 Information for Transparency: The AES house style is that data are plural.
  • 6 Specification of the audio coder control data bit-stream syntax: Deferred to a more comprehensive revision.
  • 7 Semantics of the audio coder control data bit-stream syntax: Deferred to a more comprehensive revision.
  • 8 Method of signaling the audio coder control data bit stream on the AES3 interface: The change to the first sentence is accepted. Deletion of the other two sentences would require the meaning of “as required” at the end of each of the next three paragraphs to be made explicit.
  • 9.3 Dolby E: The term "audio coder control data" is used extensively throughout the document, and we would prefer to leave the modification of it to a more comprehensive revision.

    Whilst the data could be considered as metadata, equally, it is almost always used to control a Dolby AC-3 encoder, in the sense that the parameters in the Dolby AC-3 bit stream would be derived from the AES41 data. As such it could still be regarded as "audio coder control data".
  • A proposal: Deferred to a more comprehensive revision.

Comment received from Steve Lyman, 2010-05-11

Dear John,

I read your proposal, but I'm afraid I can't agree with it.

Most importantly, the set of "Dolby E audio minimal metadata" does not include any dynamic range control information, which effectively eliminates much of the functionality of the AC-3 system and risks delivering unacceptable sound quality. (I note that the ISO/IEC 14496 metadata set does include dynamic range control data.)

I'm also disappointed that the Title and Scope will not be changed. The proposed additions are not coder control data. The dialnorm, cmixlev and surmixlev data is applied to the baseband signal at the decoder output. The descriptive text, acmod and lfeon parameters have no relation to re-coding data.

I accept your proposed change to Clause 8.

I will agree to the publication of the proposed version of AES41 if 1) there is a pressing need to deliver it to the industry before the comprehensive revisions you suggest can be made, and 2) the items related to Dolby E are removed from the present version before it is published.

In this case, it will not be necessary to add an informative reference to SMPTE RDD6.

Yours truly,


Comment received from Steve Lyman, 2010-05-19

Dear John, I feel that it would be in everyone's best interests to modify my comments of May 11.

Specifically, the publication of AES41 may go ahead with the items related to Dolby E included, per the current call for comment (aes41-xxxx-cfc-091221) but with the addition of a reference to SMPTE RDD6c-2008 (SMPTE REGISTERED DISCLOSURE DOCUMENT for Television - Description and Guide to the Use of the Dolby E Audio Metadata Serial Bitstream).

I still feel, however, that it is in the best interests of the industry to carry a full set of audio metadata.

All the best,


AES - Audio Engineering Society