SC-02-08 meeting, San Francisco, 2004-10-27

Report of the meeting of the SC-02-08 working group on audio-file transfer and exchange of the SC-02 Subcommittee on Digital Audio, held on 2004-10-27 in San Francisco, CA, US.

The meeting was convened by the chair, M. Yonge. Note that the identity of this group has been changed from SC-06-01 following the recent reorganisation of SC-06.

The agenda and the report of the previous meeting on 2004-05-09 in Berlin, Germany, were agreed as written.

Open Projects

AES31-1-R: Review of AES31-1-2001: AES standard for network and file transfer of audio - Audio-file transfer and exchange - Part 1: Disk format.
No action was requested or required.

AES31-3-R: Review of AES31-3-1999, AES standard for network and file transfer of audio - Audio-file transfer and exchange - Part 3: Simple project interchange, including maintenance of annex F.
A draft revision was circulated. Two notable additions to the 1999 document are sampling frequency scaling and edit automation as discussed in previous meetings.

In clause, the sequence header now contains an additional parameter identified as the system sampling rate factor, "SYS_SAMP_RATE_FACTOR", which scales the sampling frequency of the timeline in the ADL by a factor of two or four, defaulting to one. This provides a similar utility to the double-rate and quadruple-rate sampling frequencies defined in AES5-2003 and allows projects to operate at 96 kHz or 192 kHz where required. U. Henry felt it was important to clarify that this factor applies to the sequence or timeline sampling frequency (seq_samp_rate) and not to the individual clip sample rate, which is flagged in the corresponding TCF value.
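The scaling described above can be sketched as follows. This is only an illustration of the arithmetic, not text from the draft: the function name and the constant ALLOWED_FACTORS are hypothetical, and only SYS_SAMP_RATE_FACTOR and seq_samp_rate are names taken from the discussion.

```python
# Illustrative sketch only; not part of the AES31-3 draft.
# ALLOWED_FACTORS and timeline_sample_rate are hypothetical names.
ALLOWED_FACTORS = (1, 2, 4)  # default, double-rate, quadruple-rate


def timeline_sample_rate(seq_samp_rate, sys_samp_rate_factor=1):
    """Return the effective timeline sampling frequency in Hz.

    The factor applies to the sequence (timeline) rate only; the
    sample rate of an individual clip is not affected by it.
    """
    if sys_samp_rate_factor not in ALLOWED_FACTORS:
        raise ValueError("SYS_SAMP_RATE_FACTOR must be 1, 2 or 4")
    return seq_samp_rate * sys_samp_rate_factor


if __name__ == "__main__":
    for factor in ALLOWED_FACTORS:
        print(factor, timeline_sample_rate(48000, factor))
```

With a 48 kHz sequence rate, factors of two and four give the 96 kHz and 192 kHz timelines mentioned above.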

A new major clause, 7, specifies edit automation based on U. Henry's latest draft with some editorial changes. Subsequent clauses from the 1999 standard have been renumbered.

J. Bull felt that Fig 9 should show more clearly the minimum one-sample transition between different gain values. He suggested using "t" and "t+1" to indicate a timeline increment.

Unlike the original draft, pan-point parameters are expressed in simple numeric values rather than percentage values. In 3rd para of it was proposed to replace "percentage" by "linear" to avoid confusion with non-linear pan laws, which this positional interchange format is not intended to emulate. It was also proposed to add a linear scale to the side of Fig 11. It was felt that the text should make clear that there are no negative values of front-rear panning.

The meeting felt that a new figure - 11b, labelled "pan of track 1" for example - should illustrate a series of pan moves and the example in to show the same information for comparison. There was felt to be no need for an additional surround pan diagram example.

Henry noted that the document did not include the introduction from his original draft. This was needed to clarify the purpose of edit automation and avoid any confusion with mixing-console automation, for example. This will be added as a general introduction to clause 7.

Annex C has been amended to accommodate fader and pan lists.

S. Aoki observed that mentioned the use of a URL for locating a separate file; however, this was not specified in sufficient detail. Henry also wished to clarify file URLs because a number of implementations in the field use characters that are illegal in a URL.

Bull proposed a new key letter to avoid confusion. Henry felt that the key letter 'F' should be more clearly defined, and a new key letter 'U' would refer to a URI (rather than URL) to be defined clearly. Henry offered to draft some text for this. Aoki observed that delimiters are different for URIs compared with AES31-3. Discussion showed that this constraint could be handled by the URI spec and there was no need to change the AES31-3 specification - however a note would be prudent.
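The problem of illegal characters can be illustrated with percent-encoding, the general URI mechanism for carrying characters that may not appear literally. This is a sketch only: it uses the Python standard library's urllib.parse.quote, the function name filename_to_uri_segment is hypothetical, and nothing here represents the text Henry offered to draft.

```python
# Illustrative sketch only; not the wording proposed for AES31-3.
from urllib.parse import quote, unquote


def filename_to_uri_segment(name):
    """Percent-encode a single filename for use as one URI path segment.

    safe="" means that "/" is also encoded, so the result is always a
    single segment. Non-ASCII characters are encoded as UTF-8 bytes first.
    """
    return quote(name, safe="")


if __name__ == "__main__":
    original = "My Clip #1.wav"
    encoded = filename_to_uri_segment(original)
    print(encoded)                       # My%20Clip%20%231.wav
    print(unquote(encoded) == original)  # True
```

The space and "#" in the example are typical of filenames found in the field that cannot appear unescaped in a URI.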

R. Caine asked whether there was a need to adopt a modern equivalent for "ASCII", such as "ISO 646"? Discussion revealed no strong will to do so.

A new revision of the draft will be prepared by Yonge, to include a number of details previously raised by B. Harris.

AES46-R: Review of AES46-2002: AES standard for network and file transfer of audio - Audio-file transfer and exchange - Radio traffic audio delivery extension to the broadcast-wave-file format
No action was requested or required.

Development Projects

AES-X066: File Format for Transferring Digital Audio Data Between Systems of Different Type and Manufacture
The apparent divergence of Broadcast Wave Format (BWF) implementations around the world was discussed in New York and Berlin. There is currently no single recognised normative reference for BWF. At a recent SMPTE meeting in Basingstoke, it became clear that this was a problem for SMPTE, MXF, and AAF. It is also a problem for the ITU-R, who would like to see harmony in this area, and for the EBU themselves. These bodies have all requested the AES to coordinate a definitive solution.

It is also necessary to identify a stable reference for the Wave part of the BWF.

A new draft document for AES-X066 dated 041023 was introduced for discussion. It was based on EBU Tech.3285 V1, as the original Broadcast Wave Format specification and for compatibility with the greatest number of implementations in the field.

R. Chalmers wished to see "BWF" in the title for clarity. Others preferred not to use "BWF" anywhere because of the strong and erroneous temptation to use it as a three-letter filename extension. After discussion, the meeting agreed that Broadcast Wave Format is an adjective that can be applied to a File. The abbreviation BWFF (Broadcast-Wave-Format File) is acceptable.

The circular set diagram in the EBU introduction was felt to be unnecessary. There may be a better way to illustrate the interdependence of chunks in a RIFF file.

Should the support for MPEG-coded files found in the EBU document be included in this AES document? In discussion it was acknowledged that this AES working group had consistently held the view that only PCM coding should be used in a production and post-production context, but was sensitive to the EBU, who used the file format for distribution as well as production. Chalmers expressed concern that an AES specification that did not support established EBU usage could obstruct acceptance by EBU users. Henry proposed that the core content of this document should be kept to the minimum required for implementation, such that other elements could then be added without breaking the fundamental structure. Bull suggested that it could be possible to indicate that extensions of BWFF may include MPEG coding and to point to an external reference.

It was proposed to move the description of other chunk types to a separate informative annex. Although these chunks are not required in this interchange specification, an implementor will need to know about them.

R. Caine suggested that, for the other chunks referred to in 4.1, "shall be considered private" should become "are outside the scope of this standard". It was proposed to refer to "other types of chunk included in this file" to suggest that other chunks will not be unusual.

Henry mentioned some arguments for ordering the RIFF chunks so that the audio data was at the end of the file. Bull pointed out that there were established arguments against this because, in some operations, metadata chunks need to be written as the last item before the file is closed. For example, the level chunk needs to be created after the operator presses 'stop' on a recording.

It was emphasised once more that reading applications must accommodate chunks in any position in the file, as the RIFF format requires, although writing applications may choose a specific chunk order for their convenience. It was generally preferred that, wherever practical, the Format and BEXT chunks should be before the data and that this should be indicated in an informative annex. Caine felt that the order rules should be made very clear to avoid misunderstandings.
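The requirement that readers accept chunks in any position can be illustrated by a reader that walks the RIFF chunk list and indexes chunks by identifier instead of assuming an order. This is a minimal sketch based on the general RIFF layout (4-byte identifier, 32-bit little-endian size, body padded to an even length); all function names are hypothetical, and error handling for truncated files is omitted.

```python
# Illustrative sketch only; function names are hypothetical.
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, body) for each chunk of a RIFF/WAVE byte string."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    riff_size = struct.unpack_from("<I", data, 4)[0]
    end = min(len(data), 8 + riff_size)
    pos = 12
    while pos + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body_start = pos + 8
        yield chunk_id, data[body_start:body_start + size]
        # Chunk bodies are padded to an even number of bytes.
        pos = body_start + size + (size & 1)


def index_chunks(data):
    """Map chunk id to body, whatever order the writer used.

    If an identifier occurs more than once, the last occurrence wins.
    """
    return {chunk_id: body for chunk_id, body in iter_riff_chunks(data)}


def make_chunk(chunk_id, body):
    """Build one chunk (helper for the example below)."""
    pad = b"\x00" if len(body) % 2 else b""
    return chunk_id + struct.pack("<I", len(body)) + body + pad


def make_riff(chunks):
    """Wrap chunks in a RIFF/WAVE container (helper for the example)."""
    payload = b"WAVE" + b"".join(chunks)
    return b"RIFF" + struct.pack("<I", len(payload)) + payload


if __name__ == "__main__":
    fmt = make_chunk(b"fmt ", b"\x01" * 16)
    bext = make_chunk(b"bext", b"\x00" * 3)
    audio = make_chunk(b"data", b"\x02" * 4)
    metadata_first = make_riff([fmt, bext, audio])
    metadata_last = make_riff([audio, fmt, bext])
    print(index_chunks(metadata_first) == index_chunks(metadata_last))
```

A writer remains free to place the Format and BEXT chunks before the data; a reader built this way is indifferent to that choice.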

It was suggested that the format chunk should be before the bext chunk. (note: EBU Tech.3285-2001 and 1997 have it the other way around!)

BEXT data structure. The data fields listed in 4.3 can be considered as two groups. Some fields represent engineering, or 'essential', metadata, including Date, Time, Time-ref high, Time-ref low, Version, and UMID. All these fields are machine readable and need to be interchanged unambiguously. Other, descriptive, fields are intended for human readability where machine-readable interchange is of secondary importance. These include Description, Originator, Originator-reference, and Coding history. It was felt in Berlin that the limited number of fixed-length fields in the bext chunk is no longer the best way to carry descriptive metadata. Other techniques are much better suited to this task and can also support multi-byte character sets more easily. It is possible to consider a case where there is no human-readable data in the bext chunk at all. The meeting felt that the greatest priority is to define the machine-readable data and to consider the issue of human-readable metadata as a separate issue.

Chalmers asked whether a UMID is actually required for a valid BWFF. It was felt not, but if a UMID is used it should comply with the standard. There is a need to find wording to support the option of no UMID. It was noted that the UMID should be shown as a BYTE variable. A list of data-type definitions needs to be added to the document.

Is the content of the descriptive fields important for base-level interchange? Fields such as Originator-reference need some content even if this is no more than a null string, or a single null character, to satisfy the machine. There was general agreement with a proposal from Aoki to fill these fields with nulls.
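The split between machine-readable and descriptive fields, with the descriptive fields filled with nulls, can be sketched as below. The field sizes are an assumption: they follow the bext layout as commonly published for EBU Tech.3285 with a UMID field (602 bytes of fixed fields before the coding history), not the AES draft under discussion. The function name pack_bext and the date and time strings are hypothetical.

```python
# Illustrative sketch only. Field sizes are ASSUMED from the published
# EBU Tech.3285 bext layout with UMID; they are not taken from the AES draft.
import struct

# Description[256], Originator[32], OriginatorReference[32],
# OriginationDate[10], OriginationTime[8], TimeReferenceLow (DWORD),
# TimeReferenceHigh (DWORD), Version (WORD), UMID[64], Reserved[190]
BEXT_FIXED = struct.Struct("<256s32s32s10s8sIIH64s190s")


def pack_bext(date, time, time_reference, version=1, umid=b"",
              description=b"", originator=b"", originator_reference=b"",
              coding_history=b""):
    """Pack a bext chunk body.

    Descriptive fields default to empty and are therefore null-filled:
    struct pads an 's' field with null bytes. An empty umid gives an
    all-null UMID field, i.e. no UMID.
    """
    low = time_reference & 0xFFFFFFFF
    high = (time_reference >> 32) & 0xFFFFFFFF
    fixed = BEXT_FIXED.pack(description, originator, originator_reference,
                            date.encode("ascii"), time.encode("ascii"),
                            low, high, version, umid, b"")
    return fixed + coding_history


if __name__ == "__main__":
    # One hour at 48 kHz, expressed as a sample count since midnight.
    body = pack_bext("2004-10-27", "12:00:00", 48000 * 3600)
    print(len(body))                         # 602
    print(body[:320] == b"\x00" * 320)       # descriptive fields are nulls
```

Only the date, time, time reference and version carry content in this example, which corresponds to the case of a bext chunk with no human-readable data at all.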

Henry felt that there should be some guidance for the use of the filename and the filename extension. Some filenames will not transfer from one system to another, and this failure would break interchange. For example, different systems impose different constraints on permissible characters, on path separator characters, and on filename length. For optimum cross-platform interchange, filename length should be 31 characters or fewer, including any extension. It was noted that multi-byte characters are not legal in a URI. Henry will write a proposal for discussion covering filename length and a table of permitted characters, possibly based on 4.2 of AES31-3.
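A check of the kind discussed might look like the sketch below. The 31-character limit is the figure mentioned in the meeting; the set of permitted characters is purely an assumption for illustration, since the table of permitted characters has yet to be proposed. All names are hypothetical.

```python
# Illustrative sketch only. ASSUMED_SAFE_CHARS is a placeholder: the
# actual table of permitted characters has not yet been proposed.
import string

MAX_FILENAME_LENGTH = 31  # including any extension, as discussed
ASSUMED_SAFE_CHARS = frozenset(string.ascii_letters + string.digits + "-_.")


def filename_problems(name):
    """Return a list of reasons why a filename may not interchange well."""
    problems = []
    if len(name) > MAX_FILENAME_LENGTH:
        problems.append("longer than %d characters" % MAX_FILENAME_LENGTH)
    bad = sorted({ch for ch in name if ch not in ASSUMED_SAFE_CHARS})
    if bad:
        problems.append("characters outside the assumed set: " + "".join(bad))
    return problems


if __name__ == "__main__":
    for candidate in ("take_01.wav", "mix/final:1.wav", "a" * 40 + ".wav"):
        print(candidate, filename_problems(candidate))
```

Path separators such as "/" and ":" fall outside the assumed set, which reflects the point that separator characters differ between systems.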

Bit depth. Aoki and Bull opined that the minimum PCM bit-depth should be 16-bit, so lesser legacy specifications need not be included. An EBU document exists to this effect (EBU-R84-1996). The document should also show 24-bit examples. Caine requested clarification that a value of 8000 will be permitted as an audio value (this value had sometimes been used for signalling in some older applications).

Number of channels. Bull pointed out that previously we had considered only mono files in AES31 because of the need for flexibility in post production. If this is to change, more discussion is needed than was possible in this meeting. Henry observed that multichannel files are already being used and there is a structure in AES31-3 to handle them. Caine and Chalmers pointed out the substantial usage of stereo BWF files in existing operations, for example in radio broadcasting. It would be difficult to get acceptance from the EBU if this option is excluded. Bull proposed a note on usage to separate post production operations from distribution (which could use multichannel files). It was proposed to discuss these issues - including multichannel field origination files - further on the reflector.

Chalmers mentioned that the EBU were examining ways to avoid the current file size limit which had become a greater restriction with the introduction of multi-channel files. There was a possibility of a new EBU 64-bit audio file specification, although the format may be an incompatible departure from the RIFF structure of BWFF. Henry noted that many servers currently will not support 64-bit files so they will not be useful for general interchange for some time. The meeting felt that it would be appropriate to stick with BWFF for this edition and, possibly, consider 64-bit file formats for some future revision. It was noted that an EBU Link chunk (EBU Tech.3285, Supplement 4, 2003) already exists for concatenating smaller files to build a larger project.
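The effect of channel count on the size limit can be shown with a short calculation. This sketch assumes the limit in question is the one imposed by the 32-bit size fields of RIFF (just under 4 GiB), and it ignores the small overhead of headers and metadata chunks. The names are hypothetical.

```python
# Illustrative sketch only. Assumes the limit comes from RIFF's 32-bit
# size fields and ignores header and metadata overhead.
RIFF_MAX_BYTES = 2**32 - 1


def max_duration_seconds(sample_rate, channels, bits_per_sample):
    """Approximate longest PCM recording that fits in one RIFF file."""
    bytes_per_sample = (bits_per_sample + 7) // 8
    bytes_per_frame = channels * bytes_per_sample
    return RIFF_MAX_BYTES / (sample_rate * bytes_per_frame)


if __name__ == "__main__":
    for channels in (1, 2, 6):
        hours = max_duration_seconds(48000, channels, 24) / 3600
        print("%d channel(s): about %.2f hours" % (channels, hours))
```

At 48 kHz and 24 bits, a mono file can run for a little over eight hours, while a six-channel file reaches the limit in one sixth of that time.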

A revised PTD incorporating these notes will be posted for further discussion as soon as possible.

AES-X068: A Format for Passing Edited Digital Audio Between Systems of Different Type and Manufacture That Is Based on Object Oriented Computer Techniques
The emerging AAF Edit Protocol was a potential candidate for this part of the AES31 specification. Because of its complexity, it may be preferable to refer to an AAF document rather than attempt to publish in parallel, provided this met the necessary criteria of AES standards. It is expected to discuss this further with the AAF Association during the early months of 2005.

AES-X071: Liaison with SMPTE Registration Authority
No action was requested or required.

AES-X128: Liaison with AAF Association
See notes for AES-X068.

AES-X149: Format and Recommended Usage of the Direct Stream Digital Interchange File Format (DSDIFF)
A draft document is awaited.


DVD Forum. J. Yoshio mentioned that the DVD Forum is now making DVD for BWF and welcomes progress on the file format definition.

New Projects

There were no new projects.

New Business

There was no new business.

The next meeting is scheduled to take place in conjunction with the AES 118th Convention in Barcelona, Spain, 2005-05.

AES - Audio Engineering Society