AES New York 2007
Audio Content Management
Paper Session P12
Saturday, October 6, 3:30 pm — 5:30 pm
Chair: Rob Maher, Montana State University - Bozeman, MT, USA
P12-1 Music Structure Segmentation Using the Azimugram in Conjunction with Principal Component Analysis—Dan Barry, Mikel Gainza, Eugene Coyle, Dublin Institute of Technology - Dublin, Ireland
A novel method for segmenting stereo music recordings into formal musical structures such as verses and choruses is presented. The method performs dimensionality reduction on a time-azimuth representation of the audio, yielding a set of time activation sequences, each of which corresponds to a repeating structural segment such as a verse or chorus. The approach rests on the assumption that each segment type has a unique energy distribution across the stereo field. It can be shown that these energy distributions, together with their time activation sequences, are the latent principal components of the time-azimuth representation.
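The decomposition can be sketched on a toy example. In the snippet below, the time-azimuth matrix, the two azimuth energy profiles, and the verse/chorus layout are all synthetic illustrations (not the authors' data or code); PCA via an SVD of the mean-centered matrix recovers a time activation sequence that separates the two segment types:

```python
# Illustrative sketch (not the authors' implementation): PCA on a
# synthetic time-azimuth matrix. Rows = azimuth bins, columns = time frames.
import numpy as np

rng = np.random.default_rng(0)
n_azimuth, n_frames = 32, 200

# Two hypothetical segment types with distinct azimuth energy profiles.
bins = np.arange(n_azimuth)
profile_a = np.exp(-0.5 * ((bins - 8) / 3.0) ** 2)
profile_b = np.exp(-0.5 * ((bins - 22) / 3.0) ** 2)

# Alternating "verse"/"chorus" blocks of 50 frames each.
activation = (np.arange(n_frames) // 50) % 2
azimugram = np.outer(profile_a, 1 - activation) + np.outer(profile_b, activation)
azimugram += 0.01 * rng.standard_normal(azimugram.shape)

# PCA via SVD of the mean-centered matrix: the right singular vectors
# are the time activation sequences of the principal components.
centered = azimugram - azimugram.mean(axis=1, keepdims=True)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Thresholding the first component's time activation sequence labels
# each frame with its segment type (sign of the component is arbitrary).
time_activation = vt[0]
seg_labels = (time_activation > 0).astype(int)
```

On this toy matrix the first principal component alone separates the two segment types, which is the behavior the abstract describes for repeating structures with distinct stereo-field energy distributions.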
Convention Paper 7235
P12-2 Using the Semantic Web for Enhanced Audio Experiences—Yves Raimond, Mark Sandler, Queen Mary, University of London - London, UK
In this paper we give a brief overview of some key Semantic Web technologies, which allow us to overcome the limitations of the current web of documents and create a machine-processable web of data, where information is accessible by automated means. We then detail a framework for dealing with audio-related information on the Semantic Web: the Music Ontology. We describe some examples of how this ontology has been used to link together heterogeneous data sets dealing with editorial, cultural, or acoustic data. Finally, we explain a methodology for embedding such knowledge into audio applications (from digital jukeboxes and digital archives to audio editors and sequencers), along with concrete examples and implementations.
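The linking idea can be illustrated with a minimal sketch. The snippet below uses plain Python tuples rather than a real RDF stack, and the URIs and property names are illustrative stand-ins for Music Ontology and FOAF terms, not excerpts from the paper:

```python
# Toy triple store (subject, predicate, object). A shared subject URI is
# what lets editorial and cultural data sets be linked together.
triples = [
    # Editorial data set (URIs and literals are hypothetical).
    ("http://example.org/artist/1", "rdf:type", "mo:MusicArtist"),
    ("http://example.org/artist/1", "foaf:name", "Example Artist"),
    # Cultural data set, referring to the same artist URI.
    ("http://example.org/artist/1", "mo:similar_to", "http://example.org/artist/2"),
]

def objects(subject, predicate):
    """Return all objects matching a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Because both data sets use the same subject URI, a query for that artist can draw on facts contributed by either one, which is the machine-processable linking the abstract describes.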
Convention Paper 7236
P12-3 Content Management Using Native XML and XML-Enabled Database Systems in Conjunction with XML Metadata Exchange Standards—Nicolas Sincaglia, DePaul University - Chicago, IL, USA
The digital entertainment industry has developed communication standards to support the distribution of digital content using XML technology. Recipients of these data communications are challenged when transforming and storing the hierarchical XML data structures into more traditional relational database structures for content management purposes. Native XML and XML-enabled database systems provide possible solutions to many of these challenges. This paper will consider several data modeling design options and evaluate the suitability of these alternatives for content data management.
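The transformation challenge can be sketched with Python's standard library. The XML fragment below is hypothetical (its element names are not taken from any particular industry standard); it shows a hierarchical release feed being flattened into relational-style rows, the kind of mapping the paper says recipients must perform:

```python
# Illustrative sketch: flatten a hierarchical XML metadata document
# into flat rows suitable for a relational table.
import xml.etree.ElementTree as ET

feed = """
<release id="R1">
  <title>Example Album</title>
  <track number="1"><title>Opening</title></track>
  <track number="2"><title>Closing</title></track>
</release>
"""

root = ET.fromstring(feed)
rows = [
    {
        "release_id": root.get("id"),
        "track_number": int(t.get("number")),
        "track_title": t.findtext("title"),
    }
    for t in root.findall("track")
]
```

Note that the flattening repeats the release identifier on every track row; a native XML or XML-enabled database avoids this restructuring by storing and querying the hierarchy directly.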
Convention Paper 7237
P12-4 Music Information Retrieval in Broadcasting: Some Visual Applications—Andrew Mason, Michael Evans, Alia Sheikh, British Broadcasting Corporation Research - Tadworth, Surrey, UK
The academic research field of music information retrieval is expanding as rapidly as the MP3 collection of a stereotypical teenager. This is no coincidence: the benefit of an automated genre classifier increases when the music collection contains several thousand tracks. Of course, there are other applications of music information retrieval. Here we highlight a few that make use of a simple visual representation of an audio signal, based on three easy-to-calculate audio features. The applications range from simple navigation around consumer recordings of broadcasts, to a music video production planning tool, to a short-term "Listen Again" eye-catching display.
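The idea of a feature-driven display can be sketched as follows. The abstract does not name its three features, so the choice below (RMS energy, zero-crossing rate, spectral centroid) is an assumption chosen only because each is cheap to compute per frame:

```python
# Sketch: three easy-to-calculate per-frame audio features mapped to an
# RGB triple for a visual display. Feature choice is an assumption, not
# taken from the paper.
import numpy as np

def frame_features(frame, sr=44100):
    rms = np.sqrt(np.mean(frame ** 2))                      # energy
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0    # zero-crossing rate
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    centroid = (freqs * spectrum).sum() / (spectrum.sum() + 1e-12)
    return rms, zcr, centroid

def to_rgb(rms, zcr, centroid, sr=44100):
    # Crude normalization of each feature into [0, 1] for display.
    return (min(rms * 4.0, 1.0), min(zcr, 1.0), min(centroid / (sr / 2.0), 1.0))
```

Painting one such color per frame along a timeline gives a compact visual summary of a recording, which is the kind of representation the navigation and display applications above rely on.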
Convention Paper 7238
Last Updated: 20070820, mei