AES Munich 2009
Thursday, May 7, 09:00 — 10:00
Acoustics Event Details
Acoustics and Sound Reinforcement
Thursday, May 7, 12:00 — 13:30
Please join us as the AES presents special awards to those who have made outstanding contributions to the Society in such areas as research, scholarship, and publications, as well as other accomplishments that have contributed to the enhancement of our industry. The awardees are:
Bronze Medal Award:
• Ivan Stamac
• Martin Wöhr
Board of Governors Award:
• Jan Berg
• Klaus Blasquiz
• Kimio Hamasaki
• Shinji Koyano
• Tapio Lokki
• Jiri Ocenasek
• John Oh
• Jan Abildgaard Pedersen
• Joshua Reiss
This year’s Keynote Speaker is Gerhard Thoma. Thoma has been leading the department of acoustics projects at BMW for more than 20 years. His speech will highlight many aspects of perception and acoustics from an unusual point of view: What does a driver in a car need to hear, what should the driver not hear, and how can the acoustics and sounds of a car help to significantly enhance driving pleasure and safety?
Thursday, May 7, 16:00 — 17:00
Hearing and Hearing Loss Prevention
Thursday, May 7, 18:30 — 19:30
The Richard C. Heyser distinguished lecturer for the 126th AES Convention is Gunnar Rasmussen, a pioneer in the construction of acoustic instrumentation, particularly of microphones, transducers, vibration and related devices. He was employed at Brüel & Kjær Denmark as an electronics engineer immediately after his graduation in 1950. After holding various positions in development, testing, and quality control, he spent one year in the United States working for Brüel & Kjær in sales and service.
After his return to Denmark in the mid-1950s he began the development of a new measurement microphone. This resulted in superior mechanical stability and improved temperature and long-term stability. The resulting one-inch pressure microphone soon became the de facto standard microphone for acoustical measurements, replacing the famous W.E. 640AA standardized microphone.
The optimized mechanical design of the new generation of measurement microphones opened up the possibility of reducing the size of the microphones, first to a ½” microphone and then to ¼” and 1/8” microphones with essentially the same superior mechanical, temperature, and long-term stability. Notably, the ½” microphone is still the most widely used measurement tool today. Since the beginning of the 1960s, this microphone design has been preferred for all types of acoustic measurements and has formed the basis for the IEC 1094 series of international standards for measurement microphones.
Gunnar Rasmussen received the Danish Design Award in 1969 for his novel design of the microphones that were exhibited at the New York Museum of Modern Art. He also developed the first acoustically optimized sound level meter, where the shape of the body was designed to minimize the effect of reflections from the casing to the microphone. This type 2203 Sound Level meter was for many years seen as the archetype of sound level meters and its characteristic shape became the symbol of a sound level meter.
Other major inventions and designs include the Delta Shear accelerometer, the dual piston pistonphone calibrator for precision calibration, the face-to-face sound intensity probe and hydrophones, occluded ears, artificial mouth, etc. Rasmussen is also the author of numerous papers on acoustics and vibration and has served as chairman and vice-chairman of various international organizations and standard committees. In 1990 he received the CETIM medal for his contribution to the field of intensity techniques. He is also a Fellow of the Acoustical Society of America.
In 1994 Rasmussen started his own company, G.R.A.S. Sound and Vibration. Originally a company specializing in precision outdoor microphones for permanent noise monitoring around airports, it is now one of the world’s leading companies in acoustic front-ends and transducers, offering a wide range of general purpose and specialized microphones, electro-acoustic measurement devices such as ear couplers, precision calibration tools, and multi-dimensional sound intensity probes. The title of his lecture is, “The Reproduction of Sound Starts at the Microphone.”
Microphones may be developed for many specific purposes: for communication, recording, or precision measurements. Quality may have different meanings for different applications. Price may be a dominating factor. Carbon microphones dominated until the 1950s. Electret microphones have taken the place of carbon microphones with great improvement in quality and performance at low prices. MEMS microphones are on the way.
The challenge in high quality microphone development is to match or exceed the human ear in perception of sound for measurement purposes. Without measurements we cannot qualify our progress. We are still trying to match the frequency band, the dynamic range, and the phase linearity of the human ear and to obtain very good reproducibility in all situations where humans are involved. We need microphones for development, for standardized measurements, and for legally relevant measurements. Where are we today?
Friday, May 8, 09:00 — 10:00
Friday, May 8, 09:00 — 11:30
P8 - Room Acoustics
Chair: Ronald M. Aarts
P8-1 Phase Velocity and Group Velocity in Cylindrical and Spherical Waves—Ian M. Dash, Australian Broadcasting Corporation - Sydney, NSW, Australia; Fergus R. Fricke, University of Sydney - Sydney, NSW, Australia
Closed-form expressions are derived for phase velocity and group velocity in cylindrical and spherical sound waves. These are plotted and compared for orders 0, 1, and 2, but the expressions are general and may be applied to waves of any order. Dispersion characteristics of these waves are examined and discussed. The implications for thermodynamic applicability of the wave equation and for application of Huygens’ principle are discussed.
Convention Paper 7693
P8-2 Selection of Loudspeaker Positions for Reverberation Time and Sound Field Measurements—Elena Prokofieva, Napier University - Edinburgh, UK
According to various building standards, the source loudspeakers and receiving microphones used for internal noise level measurements can be placed “in any convenient position,” with only some distance restrictions from the nearest reflecting surfaces. In rooms of different shapes and volumes the location of the source and receiver microphone may significantly affect the measured results. If the difference between the reverberation times or noise levels measured for two positions in the same room exceeds 10 percent, they cannot be averaged. A simulation program was created to recommend the most suitable locations for the microphone and loudspeakers in the tested room for reverberation time measurements. The results of a series of tests are analyzed to confirm the results of the simulation.
Convention Paper 7694
P8-3 A Rehearsal Hall with Virtual Acoustics for Symphony Orchestras—Tapio Lokki, Jukka Pätynen, Helsinki University of Technology - Espoo, Finland; Timo Peltonen, Olli Salmensaari, Akukon Consulting Engineers Ltd. - Helsinki, Finland
A solution for constructing a small rehearsal hall whose acoustics resemble those of the stage of a large concert hall is presented. The implemented system was evaluated both objectively with measurements and subjectively by collecting feedback from musicians. The subjective opinions were very positive and encouraging and the main objective was achieved. The electroacoustically enhanced rehearsal space sounded like a much bigger hall, although the sound pressure level increased by less than one decibel. The presented solution is applicable in all spaces that are not very reverberant by nature and where the height of the room is at least twice the standard room height.
Convention Paper 7695
P8-4 Sound Field Characterization and Absorption Measurement of Wideband Absorbers—Soledad Torres-Guijarro, Laboratorio Oficial de Metroloxía de Galicia (LOMG) - Ourense, Spain; Antonio Pena, Alfonso Rodríguez-Molares, Norberto Degara-Quintela, Universidad de Vigo - Vigo, Spain
Wideband absorbers are a fundamental part of non-environment control rooms. They consist of huge angled hanging panels in conjunction with a multilayer wall or ceiling. Their absorption capacity is very noticeable, mostly in the low frequency range. In this paper the mechanisms of absorption of the wideband absorbers of the rear wall of the control room at the Universidad de Vigo will be studied. Conclusions will be drawn from the analysis of pressure, velocity volume, and intensity measurements performed in the vicinity of the panels, and from the computation of the normal specific acoustic impedance and the normal absorption coefficient.
Convention Paper 7696
P8-5 Temporal Matching of 2-D and 3-D Wave-Based Acoustic Modeling for Efficient and Realistic Simulation of Rooms—Jeremy J. Wells, Damian T. Murphy, Mark Beeson, University of York - York, UK
Methods for adapting the output of a two-dimensional Kirchhoff-variable digital waveguide mesh to better match that of a 3-D mesh, both of which are intended to model the same acoustic space, are presented. Details of the methods, including quality of output and computational demands, are given along with the details of how they are incorporated into the hybrid system within which they are employed.
Convention Paper 7697
Friday, May 8, 09:00 — 12:30
P9 - Signal Analysis, Measurements, Restoration
Chair: Jan Abildgaard Pedersen
P9-1 Some Improvements of the Playback Path of Wire Recorders—Nadja Wallaszkovits, Phonogrammarchiv Austrian Academy of Sciences - Vienna, Austria; Heinrich Pichler, Audio Consultant - Vienna, Austria
The archival transfer of wire recordings to the digital domain is a highly specialized process that incorporates a wide range of specific challenges. One of the basic problems is the format incompatibility between different manufacturers and models. The paper discusses the special design philosophy, using the tone control network in the record path as well as in the playback path. This tone control circuit causes additional phase and group delay distortions. The influence and characteristics of the tone control (which was not a priori present with every model) is discussed and analog phase correction networks are described. The correction of phase errors is outlined. As this format has been obsolete for many decades, a high quality archival transfer can only be reached by modifying dedicated equipment. The authors propose some possible main modifications and improvements of the playback path of wire recorders, such as signal pickup directly after the playback head, introducing a high quality preamplifier, followed by analog phase correction and correction of the amplitude characteristics. Alternatively signal pickup directly after the playback head, introducing a high quality preamplifier, followed by digital signal processing to optimize the output signal is discussed.
Convention Paper 7698
P9-2 Acoustics of the Crime Scene as Transmitted by Mobile Phones—Eddy B. Brixen, EBB-consult - Smorum, Denmark
One task for the audio forensics engineer is to extract background information from audio recordings. A major problem is the assessment of analyzed telephone calls in general and mobile phones (LPC-algorithms) in particular. In this paper the kind of acoustic information to be extracted from a recorded phone call is initially explained. The parameters used for the characterization of the various acoustic spaces and events in question are described. It is discussed how the acoustical cues should be assessed. The validity of acoustic analyses carried out in the attempt to provide crime scene information like reverberation time is presented.
Convention Paper 7699
P9-3 Silence Sweep: A Novel Method for Measuring Electroacoustical Devices—Angelo Farina, University of Parma - Parma, Italy
This paper presents a new method for measuring some properties of an electroacoustical system, for example a loudspeaker or a complete sound system. Coupled with the already established method based on the Exponential Sine Sweep, this new Silence Sweep method provides a quick and complete characterization of the nonlinear distortions and noise of the device under test. The method is based on the analysis of the distortion products, such as harmonic distortion products or intermodulation effects, occurring when the system is fed with a wide-band signal. By removing a small portion of the whole spectrum from the test signal, it becomes possible to collect and analyze the nonlinear response and the noise of the system in that “suppressed” band. Continuously changing the suppressed band over time yields the Silence Sweep test signal, which allows for quick measurement of noise and distortion over the whole spectrum. The paper explains the method with a number of examples. The results obtained for some typical devices are presented and compared with those obtained with a standard, state-of-the-art measurement system.
Convention Paper 7700
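As background for the P9-3 abstract, the Exponential Sine Sweep it builds on is commonly written as a sine whose instantaneous frequency rises exponentially from a start to a stop frequency. The following is a minimal sketch of that sweep only, not of the Silence Sweep signal, which the abstract does not specify in detail. The function names, the 20 Hz to 20 kHz range, and the 48 kHz sample rate are illustrative assumptions.

```python
import math


def exponential_sine_sweep(f_start, f_stop, duration_s, sample_rate):
    """Return samples of a sine sweep whose frequency rises exponentially.

    Phase: 2*pi*f_start*T/ln(f_stop/f_start) * (exp(t/T * ln(f_stop/f_start)) - 1)
    """
    log_ratio = math.log(f_stop / f_start)
    scale = 2.0 * math.pi * f_start * duration_s / log_ratio
    n_samples = int(round(duration_s * sample_rate))
    return [
        math.sin(scale * (math.exp((n / sample_rate) / duration_s * log_ratio) - 1.0))
        for n in range(n_samples)
    ]


def instantaneous_frequency(f_start, f_stop, duration_s, t):
    """Time derivative of the sweep phase divided by 2*pi, in Hz."""
    return f_start * math.exp(t / duration_s * math.log(f_stop / f_start))


# Illustrative parameters (assumptions, not taken from the paper).
sweep = exponential_sine_sweep(20.0, 20000.0, 1.0, 48000)
```

Because the frequency grows exponentially, each octave occupies the same amount of time in the sweep, which is one reason this signal is often used for separating harmonic distortion orders after deconvolution.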
P9-4 Pitch and Played String Estimation in Classic and Acoustic Guitars—Isabel Barbancho, Lorenzo Tardón, Ana M. Barbancho, Simone Sammartino, Universidad de Málaga - Málaga, Spain
In classic and acoustic guitars that use standard tuning, the same pitch can be produced on different strings. The aim of this paper is to present a method based on the time and frequency-domain characteristics of the recorded sound to determine not only the pitch but also the string of the guitar that has been played to produce that pitch. This system will provide information not only about the pitch of the notes played, but also about how those notes were played. This specific information can be valuable for identifying the style of the player and can be used in teaching the guitar.
Convention Paper 7701
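The ambiguity that the P9-4 abstract addresses can be illustrated with a short sketch that lists every string and fret combination producing a given pitch in standard tuning. This only enumerates the candidates; it is not the authors' estimation method. The MIDI note numbers for the open strings are the usual standard-tuning values, while the function name and the 19-fret limit are assumptions made for illustration.

```python
# Open-string MIDI note numbers in standard tuning,
# from string 6 (low E2 = 40) to string 1 (high E4 = 64).
STANDARD_TUNING = {6: 40, 5: 45, 4: 50, 3: 55, 2: 59, 1: 64}


def candidate_positions(midi_note, max_fret=19):
    """Return (string, fret) pairs that produce midi_note in standard tuning.

    max_fret=19 is an assumed fretboard length, not a value from the paper.
    """
    positions = []
    for string, open_note in STANDARD_TUNING.items():
        fret = midi_note - open_note
        if 0 <= fret <= max_fret:
            positions.append((string, fret))
    return positions


# E4 (MIDI 64) can be played on five different strings.
e4_positions = candidate_positions(64)
```

A pitch detector alone returns the same answer for all of these positions, which is why additional time and frequency-domain cues are needed to identify the played string.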
P9-5 Statistical Properties of Music Signals—Miomir Mijic, Drasko Masovic, Dragana Sumarac-Pavlovic, Faculty of Electrical Engineering - Belgrade, Serbia
This paper is concerned with the results of a complex approach to statistical properties of various music signals based on 412 musical pieces classified in 12 different genres. The analyzed signals contain more than 24 hours of music. For each piece the time variation of the signal level was found using an rms calculation with a 10 ms integration period and 90 percent overlap, yielding a new signal representing the level as a function of time. For each piece the statistical analysis of the signal level has been performed by its statistical distribution, cumulative distribution, effective value within the complete duration of the piece, mean level value, and level value corresponding to the maximum of the statistical distribution. The parameters L1, L10, L50, and L99 were extracted from cumulative distributions as numerical indicators of dynamic properties. The paper contains detailed statistical data and averaged data for all observed genres, as well as quantitative data about dynamic range and crest factor of various music signals.
Convention Paper 7702
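The level analysis described in the P9-5 abstract can be sketched roughly as follows: an rms level track with a 10 ms window and 90 percent overlap, followed by percentile levels read from the sorted track. This is a minimal sketch under stated assumptions. The function names are hypothetical, levels are in dB relative to full scale 1.0, and L_N is taken here to mean the level exceeded in N percent of the frames, which is the common convention in noise statistics but is not confirmed by the abstract.

```python
import math


def level_track(samples, sample_rate, window_s=0.010, overlap=0.9):
    """Short-term rms level in dB for overlapping frames."""
    win = max(1, round(window_s * sample_rate))
    hop = max(1, round(win * (1.0 - overlap)))
    levels = []
    for start in range(0, len(samples) - win + 1, hop):
        frame = samples[start:start + win]
        mean_square = sum(x * x for x in frame) / win
        # Floor the mean square to avoid log10(0) on digital silence.
        levels.append(10.0 * math.log10(max(mean_square, 1e-20)))
    return levels


def exceedance_level(levels, percent):
    """Level exceeded in `percent` of the frames (assumed meaning of L_N)."""
    ordered = sorted(levels, reverse=True)
    index = min(len(ordered) - 1, int(len(ordered) * percent / 100.0))
    return ordered[index]


# Synthetic test signal (an assumption): 0.1 s at full scale, then 0.1 s at -20 dB.
signal = [1.0] * 4800 + [0.1] * 4800
levels = level_track(signal, 48000)
l1 = exceedance_level(levels, 1)
l99 = exceedance_level(levels, 99)
```

With this convention the difference between L1 and L99 gives one simple numerical indicator of the dynamic range of a piece.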
P9-6 Multi-Band Generalized Harmonic Analysis (MGHA) and its Fundamental Characteristics in Audio Signal Processing—Takahiro Miura, Teruo Muraoka, Tohru Ifukube, University of Tokyo - Tokyo, Japan
One of the main problems in sound restoration of valuable historical recordings is noise reduction. We have been proposing and continuing to improve a noise reduction method based on inharmonic analysis such as GHA (Generalized Harmonic Analysis). The GHA frequency extraction algorithm enables us to extract arbitrary frequency components. In this paper we aim at more accurate frequency identification from noisy signals by dividing the analyzed frequency range into multiple bands before analysis; this algorithm is named Multi-Band GHA (MGHA). The simulation of frequency analysis in a noise-free condition indicated that MGHA is more effective than GHA for the extraction of low frequency components when both the window length is short and the number of frequency components is small. Outside that case, however, GHA identifies frequency components more precisely. Furthermore, the result of frequency analysis in a condition with steady noise shows that MGHA can be more effectively applied to the case of short window length, many frequency components, and low S/N.
Convention Paper 7703
P9-7 Automatic Detection of Salient Frequencies—Joerg Bitzer, University of Applied Science Oldenburg - Oldenburg, Germany; Jay LeBoeuf, Imagine Research, Inc. - San Francisco, CA, USA
In this paper we present several techniques to find the most significant frequencies in recorded audio tracks. These estimated frequencies could be used as a starting point for mixing engineers in the EQing process. In order to evaluate the results, we compare the detected frequencies with a list of reported salient frequencies from audio engineers. The results show that automatic detection is possible. Thus, one of the more boring tasks of a mixing engineer can be automated, which gives the mixing engineer more time to do the artistic part of the mixing process.
Convention Paper 7704
Friday, May 8, 09:30 — 13:30
TT1 - BMW Group Dept. of Acoustics & Vibrations
The program starts with a guided tour of test stands that are employed for the refinement of acoustics and vibrations. The development process for a variety of individual noise sources such as powertrain or mechatronic components as well as the quality assessment in the interior of a vehicle will be demonstrated. Further, the participants will have the opportunity to share a ride in the driving simulator to virtually experience different acoustical set-ups for the engine noise. Finally, a lecture about BMW's acoustical philosophy will be presented, including the fundamentals of sound engineering and BMW's psychoacoustic approach to achieving maximum customer satisfaction.
Price: EUR 20
Friday, May 8, 10:00 — 13:30
TT3 - Herkulessaal der Residenz
A comparison of microphone settings for a live broadcast of a symphonic concert in 5.1 and stereo at the Bavarian Radio will be presented. The participants will have the opportunity to compare the settings themselves at the console and to gain their own experience with a 5.1 mix via multitrack recording using different ambience microphone arrays. Presenters are Wolfram Graul and Klemens Kamp. The tour is limited to 10 people and transportation is not provided.
Friday, May 8, 11:30 — 13:30
T7 - Binaural Audio Technology—History, Current Practice, and Emerging Trends
Robert Schulein, RBS Consultants
During the winter and spring of 1931-32, Bell Telephone Laboratories, in cooperation with Leopold Stokowski and the Philadelphia Symphony Orchestra, undertook a series of tests of musical reproduction using the most advanced apparatus obtainable at that time. The objectives were to determine how closely an acoustic facsimile of an orchestra could be approached using both stereo loudspeakers and binaural reproduction. Detailed documents discovered within the Bell Telephone archives will serve as a basis for describing the results and problems revealed while creating the binaural demonstrations. Since these historic events, interest in binaural recording and reproduction has grown in areas such as sound field recording, acoustic research, sound field simulation, audio for electronic games, music listening, and artificial reality. Each of these technologies has its own technical concerns involving transducers, environmental simulation, human perception, position sensing, and signal processing. This tutorial will cover the underlying principles germane to binaural perception, simulation, recording, and reproduction. It will include live demonstrations as well as recorded audio/visual examples.
Friday, May 8, 13:30 — 17:30
TT4 - Müller BBM Research / Environment
Müller-BBM is a leading consulting engineering company that provides consulting, testing, and planning services in the fields of buildings, environment, technology, and products. They form an interdisciplinary team, made up of engineers in various fields, architects, chemists, geologists, and physicists, who can provide complete, single-source solutions to many different problems. Since 1962, Müller-BBM has done business as an “acoustical consultancy.” Their job is to quantify, evaluate, and modify the effects of sounds, vibrations, heat, moisture, odors, pollutants, and electromagnetic waves on people, machinery, and the environment. They design measures to provide protection against disturbing influences, such as excessive noise from loud roads, railways, or industrial plants, and they also develop measures for the purposeful “forming” of sounds, for example by constructing vehicles or technical facilities so that they produce agreeable sounds. You will see room acoustics and media technology illustrated by international concert hall and opera house projects. Metrological evidence of the noise insulation and sound absorption of different structural elements will also be shown. Active sound design in vehicle acoustics is also an important theme.
Price: EUR 20
Friday, May 8, 13:30 — 17:30
TT5 - Stadtmuseum Musikinstrumente
Museum of the City of Munich Instruments
The extraordinary collection of the Sammlung Musik-Münchner Stadtmuseum presents exhibits highlighting the construction of musical instruments from different cultures as well as a wide survey of the musical activities of mankind. On show are about 1,500 musical instruments from Africa, Asia, the precolonial Americas, and Europe out of a total of 6,000 objects. During the guided tour of the collections visitors have the opportunity to play the complete gamelans from the Indonesian islands of Java and Bali.
Price: EUR 20
Friday, May 8, 13:30 — 17:30
TT6 - Herkulessaal der Residenz
A comparison of microphone settings for a live broadcast of a symphonic concert in 5.1 and stereo at the Bavarian Radio will be presented. The participants will have the opportunity to compare the settings themselves at the console and to gain their own experience with a 5.1 mix via multitrack recording using different ambience microphone arrays. Presenters are Wolfram Graul and Klemens Kamp. The tour is limited to 10 people and transportation is not provided.
Friday, May 8, 15:00 — 16:00
Human Factors in Audio Systems
Friday, May 8, 16:30 — 18:00
P15 - Hearing
P15-1 Psychoacoustics and Noise Perception Survey in Workers of the Construction Sector—Marcos D. Fernández; Bálder Vitón; José Antonio Ballesteros, Samuel Quintana, Isabel González, Escuela Universitaria Politécnica de Cuenca, Universidad de Castilla-La Mancha - Cuenca, Spain
Noise levels alone are not enough to assess the influence of noise completely. Therefore, psychoacoustics and perception surveys should be taken into account. The noise that construction workers produce in their tasks is recorded with a HATS (head and torso simulator). Later, those recordings are processed to derive different parameters: spectrum, weighted equivalent levels, and the main psychoacoustic parameters. A specific survey was then developed to assess the perception of such activity noises during working time and to correlate adjectives of perception with the parameters mentioned. The survey has been designed to be answered by the workers who are exposed to the noise, so that conclusions can be drawn about the feelings and annoyance that the noise can cause.
Convention Paper 7734
P15-2 On the Design of Automatic Sound Classification Systems for Digital Hearing Aids—Enrique Alexandre, Lorena Álvarez-Perez, Roberto Gil-Pita, Raúl Vicen-Bueno, Lucas Cuadra, University of Alcalá - Alcalá de Henares, Spain
The design of digital hearing aids able to carry out advanced functions (for instance, classifying the acoustic environment and automatically selecting the best amplification program for the user’s comfort) is very difficult. Since hearing aids have to work at a very low clock frequency in order to minimize power consumption and maximize battery life, the number of available instructions per second is very small. This forces the design of efficient algorithms with a reduced number of instructions. In particular, this paper will focus on three closely related topics: (1) the design of low-complexity features; (2) the use of automatic feature selection algorithms to optimize the performance of the classifier; and (3) the critical analysis of a variety of classification algorithms, based on their complexity and performance, to determine whether or not they are feasible to implement.
Convention Paper 7735
P15-3 Pruning Algorithms for Multilayer Perceptrons Tailored for Speech/Non-Speech Classification in Digital Hearing Aids—Lorena Álvarez, Enrique Alexandre, Manuel Rosa-Zurera, University of Alcalá - Alcalá de Henares, Spain
This paper explores the feasibility of using different pruning algorithms for multilayer perceptrons (MLPs) applied to the problem of speech/non-speech classification in digital hearing aids. A classifier based on MLPs is considered the best option in spite of its presumably high computational cost. Nevertheless, its implementation has been proven to be feasible: it requires some trade-offs involving a balance between reducing the computational demands (that is, the number of neurons) and the quality perceived by the user. In this respect, this paper will focus on the design of three novel pruning algorithms for MLPs, which attempt to converge to the minimum complexity network (that is, the lowest number of neurons in the hidden layer) without degrading its performance. The results obtained with the proposed algorithms will be compared with those obtained when using another pruning algorithm proposed in the literature.
Convention Paper 7736
P15-4 Evolutionary Optimization for Hearing Aids of Computational Auditory Scene Analysis—Anton Schlesinger, Marinus M. Boone, Technical University of Delft - Delft, The Netherlands
Computational auditory scene analysis (CASA) provides an excellent means to improve speech intelligibility in adverse acoustical situations. In order to utilize algorithms of CASA in hearing aids, sets of algorithmic parameters need to be adjusted to the individual auditory performance of the listener and the acoustic scene in which they are employed. Performed manually, the optimization is an expensive procedure. We therefore developed a framework in which algorithms of CASA are automatically optimized by the principles of evolution, i.e., by a genetic algorithm. By using the speech transmission index (STI) as an objective function, the presented framework presents a holistic routine that is solely based on psychoacoustical and physiological models to improve and to assess speech intelligibility. The initial listening test revealed a discrepancy between the objective and subjective assessment of speech intelligibility, which suggests a review of the objective function. Once the objective function is in accordance with the individual perception of speech intelligibility, the presented framework could be applied in the optimization of all complex speech processors and therewith accelerate their assessment and application.
Convention Paper 7737
P15-5 Enhanced Control of On-Screen Faders with a Computer Mouse—Michael Hlatky, Kristian Gohlke, David Black, Hochschule Bremen (University of Applied Sciences) - Bremen, Germany; Jörn Loviscach, Fachhochschule Bielefeld (University of Applied Sciences) - Bielefeld, Germany
Input devices of the audio studio that formerly were physical have mostly been converted into virtual controls on the computer screen. Whereas this transition saves space and cost, it has reduced the performance of these controls, as virtual controls adjusted using the computer mouse do not exhibit the accuracy and accessibility of their physical counterparts. Previous studies show that interaction with scrollable timelines can be enhanced by an intelligent interpretation of the mouse movement. We apply similar techniques to virtual faders as used for audio control, leveraging such approaches as controllable zoom levels and pseudo-haptic interaction. Tests conducted on five such methods provide insight into how to decouple the fader from the mouse movement to improve accuracy without impairing the speed of the interaction.
Convention Paper 7738
P15-6 Modeling of External Ear Acoustics for Insert Headphone Usage—Marko Hiipakka, Miikka Tikander, Matti Karjalainen, Helsinki University of Technology - Espoo, Finland
Although acoustics of the external ear has been studied extensively for auralization and hearing aids, the acoustic behavior with insert headphones is not as well known. Our research focused on the effects of outer ear physical dimensions, particularly on sound pressure at the eardrum. The main parameter was the length of the canal, but eardrum’s damping of resonances was also studied. Ear canal simulators and a dummy head were constructed. Measurements were also performed from human ear canals. The study was carried out both with unblocked ear canals and when the canal entrance was blocked with an insert earphone. Special insert earphones with in-ear microphones were constructed for this purpose. Physics-based computational models were finally used to validate the approach.
Convention Paper 7739
Friday, May 8, 19:30 — 21:30
Isar Bräu, München Pullach
This year the Banquet will take place in a small old railway station, above the valley of the River Isar. The railway opened in 1891 and steam trains took people from the city to many beautiful places in the south of Munich. Today the steam trains have been replaced and the line is now part of the S-Bahn, so the old station is not needed anymore and has been turned into a traditional Bavarian style restaurant with its own micro-brewery. What could be more natural than making this location a pleasant place for a “get together” in a lovely atmosphere?
The welcome beer from the micro brewery and other drinks will be followed by a fine buffet with Bavarian delicacies. At the end of a long day at the Convention, these “Schmankerl” will be a good way to relax and enjoy the evening with old and new friends and colleagues. Come and savour Munich’s lifestyle. The ticket price includes all food and drinks and the bus to the restaurant and back.
55 Euros for AES members; 65 Euros for nonmembers
Tickets will be available at the Special Events desk.
Saturday, May 9, 09:00 — 11:00
P17 - Room Acoustics & Loudspeaker Interaction
Chair: Eddy B. Brixen
P17-1 Effects of Loudspeaker Directivity on Perceived Sound Quality—A Review of Existing Studies—William Evans, University of Surrey - Guildford, Surrey, UK; Jakob Dyreby, Søren Bech, Bang & Olufsen A/S - Struer, Denmark; Slawomir Zielinski, Francis Rumsey, University of Surrey - Guildford, Surrey, UK
The directivity of a loudspeaker system is often regarded as a prominent factor in the overall subjective quality of the reproduced sound experience. Much literature is available on the topic, and currently a broad field of opinion exists among designers. This paper provides an overview of the available literature, as well as an extended investigation into listener-based research. Results indicate that for such a widely debated topic, conclusive measurement data with regard to human listeners is limited, and, therefore, a proposal for more informative listening tests is presented.
Convention Paper 7745
P17-2 Subjective Validity of Figures of Merit for Room Aspect Ratio Design—Matthew Wankling, Bruno Fazenda, University of Huddersfield - Huddersfield, West Yorkshire, UK
Attempts have long been made to classify a room’s low frequency audio reproduction capability with regard to its aspect ratio. Common metrics used have relied on the homogeneous distribution of modal frequencies and from these a number of “optimal” aspect ratios have emerged. However, most of these metrics ignore the source and receiver coupling to the mode shapes—only a few account for this in the derivation of a figure of merit. The subjective validity of these attempts is tested and discussed. Examples are given of supposedly good room ratios with bad performance and vice versa. Subjective assessment of various room scenarios is undertaken and a ranking order has been obtained to correlate with a proposed figure of merit.
Convention Paper 7746
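The "distribution of modal frequencies" that aspect-ratio metrics rely on can be illustrated with the textbook formula for an ideal rigid-walled rectangular room, f = (c/2)·sqrt((nx/Lx)² + (ny/Ly)² + (nz/Lz)²). The sketch below is only an illustration of that standard formula, not the figure of merit proposed in the paper; the function name `room_modes`, the speed of sound of 343 m/s, and the example dimensions are assumptions.

```python
import itertools
import math


def room_modes(lx, ly, lz, max_order=2, c=343.0):
    """Return sorted (frequency_hz, (nx, ny, nz)) pairs for an ideal
    rigid-walled rectangular room with dimensions in meters.

    c is an assumed speed of sound in m/s.
    """
    modes = []
    for nx, ny, nz in itertools.product(range(max_order + 1), repeat=3):
        if (nx, ny, nz) == (0, 0, 0):
            continue  # the all-zero index is not a mode
        f = (c / 2.0) * math.sqrt(
            (nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2
        )
        modes.append((f, (nx, ny, nz)))
    return sorted(modes)


if __name__ == "__main__":
    # Hypothetical 5 m x 4 m x 3 m room: list the ten lowest modes.
    for freq, index in room_modes(5.0, 4.0, 3.0)[:10]:
        print(f"{freq:7.2f} Hz  {index}")
```

A list like this only describes where the modes lie in frequency. As the abstract notes, it says nothing about how strongly a given source and listener position couple to each mode shape.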
P17-3 A Study of Low-Frequency Near- and Far-Field Loudspeaker Behavior—John Vanderkooy, University of Waterloo - Waterloo, Ontario, Canada, B&W Group Ltd., Steyning, West Sussex, UK; Martial Rousseau, B&W Group Ltd. - Steyning, West Sussex, UK
Low-frequency loudspeaker measurements are difficult. Room reflections, mediocre anechoic chambers, and random noise play havoc with the quest. Diffraction is different in nearfield and farfield. This paper covers a range of topics that bear on these problems, such as boundary element diffraction simulations, an approximate theory for low frequencies, methods to shorten the impulse response, and nearfield characteristics. A few points are illustrated with measurements. An earlier simplified diffraction theory of Kessel is checked for axisymmetric cylindrical and rectangular boxes by boundary-element simulations, in an attempt to pin down the diffractive 4pi to 2pi transition. It turns out to have a strong connection to the acoustic center of a loudspeaker. Some measurements are made under various conditions. Shortening methods are used to minimize the deleterious effect of truncating room reflections from the impulse response.
Convention Paper 7747
P17-4 Subwoofers in Symmetrical and Asymmetrical Rooms—Juha Backman, Nokia Corporation - Espoo, Finland
A theoretical study of behavior of single and multiple subwoofers, taking also geometrical and acoustical asymmetry of practical listening environments into account, is presented. The results indicate that configurations aimed at precise cancellation of individual modes have a high sensitivity to deviations from the ideal. However, with multiple subwoofers it is possible to find robust placements that both reduce the spatial variation of the sound field and the frequency variation of the response. This, however, requires loudspeaker placements where also the height of the source from the floor is varied.
Convention Paper 7748
Saturday, May 9, 09:30 — 13:30
TT7 - Bavarian Broadcast System BR Radio
State of the Art Broadcasting Studio in Germany
Bayerischer Rundfunk [Bavarian Broadcasting] (BR) is the public broadcasting authority for the German Freistaat (Free State) of Bavaria, with its main offices located in Munich. On3, the young brand of Bavarian Radio, presents a completely new radio and internet world for the youth of Bavaria. The contents come from the most modern broadcasting studio in Germany. The new website and the Digital Radio “on3-radio,” the live broadcast “on3-südwild” on Bavarian television, and the music program “on3-start ramp” are produced and broadcast from a specially designed multimedia studio environment. The heart of the on3-Studios is an entertainment area suited for live acts as well as for radio and television recordings. The target audience is young people who value sound journalism and music offerings outside of the mainstream. They are invited to inform themselves and to participate on the innovative multimedia platform for listening to radio and downloading audio and video. For the first time, listeners may arrange their own personal radio program in public service quality according to their own wishes. www.on3-radio.de Furthermore, you will see the big recording studio of the BR-Symphonic Orchestra with its modern control room.
Price: EUR 20
Saturday, May 9, 09:30 — 13:30
TT8 - Bavarian Broadcast System BR Television
On this tour you will see the postproduction facility including various transfer rooms, sound design suites, and dubbing stages where every kind of production for BR-TV happens. We will visit a large TV studio used for live programs, whose control rooms were constructed in 2006. It is configured by default so that 5.1 productions require no additional effort. Equipped with a StageTec AURUS mixing console, it is used both as a music studio and as a distribution center during big events with many venues. During the “UEFA European (Football) Championship,” separate signals from stadiums in Austria and Switzerland were mixed as a 5.1 transmission for the German TV ARD.
Price: EUR 20
Saturday, May 9, 10:30 — 13:00
LS1 - Neumann & Müller and d & b
Sound System Design and Commissioning in Critical Acoustic Environments
The interaction of sound systems with special attention to the excitation of diffuse sound will be examined in theory and practical demonstrations using speech intelligibility measurements as an indicator. Software-aided line and subwoofer array designs will be discussed followed by a live demonstration of the tuning and
Saturday, May 9, 10:30 — 12:00
P18 - Assessment, Evaluation
P18-1 WhisPER—A New Tool for Performing Listening Tests—Simon Ciba, André Wlodarski, Hans-Joachim Maempel, Technical University of Berlin - Berlin, Germany
A software tool is presented for performing experiments in the field of perceptual audio evaluation and psychoacoustic measurement, controlling the interaction with both the subject and the playback environment. For this purpose a repertoire of test procedures has been implemented, including popular qualitative and quantitative approaches. By using OpenSound Control commands, not only traditional multichannel reproduction is supported, but also advanced spatial audio reproduction such as dynamic binaural synthesis or wavefield synthesis. WhisPER has been written in MATLAB to facilitate its further development within the scientific community. As opposed to existing libraries it provides a coherent graphical user interface system allowing easier access and configuration also for users without advanced programming experience.
Convention Paper 7749
P18-2 Psychoacoustic Assessment of the Noise Emitted by the Machines. The Case of the Grinders—Marcos D. Fernández, José Antonio Ballesteros, Iván Suárez; Samuel Quintana; Isabel González, Escuela Universitaria Politécnica de Cuenca, Universidad de Castilla-La Mancha - Cuenca, Spain
Sound quality is understood here as the suitability of the sound emitted by a machine, depending on the characteristics of that sound and on the perceptual sensation received, which reflects the degree of acceptance of the machine by the user. In order to evaluate the sound quality of the grinders under study, binaural recordings of the emitted sounds are required to determine the objective psychoacoustic parameters, followed by subjective tests in which a representative number of people report the impression made by that particular sound.
Convention Paper 7750
P18-3 Investigations of the Effects of Nonlinear Distortions on Psychoacoustical Measures—Stephan Herzog, Technical University Kaiserslautern - Kaiserslautern, Germany
The perception of nonlinear distortions of audio devices, in particular the perception of nonlinear distortions of digital audio, is only insufficiently described by typical measures like THD. To provide a better insight into the perceptual effects of nonlinear distortions, their audibility and their impact on psychoacoustical measures like loudness and sharpness are examined. For this purpose a test method has been developed. The first step in the test is the measurement of the frequency response of the device under test with an efficient method to enable the separation of linear and nonlinear processing. The second step of the test consists of the computation of the psychoacoustical measures and the thresholds for the audibility of nonlinear distortions. Both computations are based on the same psychoacoustical model to obtain consistent results. Results for several types of distortion obtained with simulations and measurements on analog circuits are presented.
Convention Paper 7751
Saturday, May 9, 13:30 — 17:30
TT10 - Staatsoper München
Munich’s “first” opera house, the Nationaltheater, shows its audio equipment. One studio for live-sound, one for production and broadcast, and one for recording are integrated in a digital network. Additionally, the back-stage installations will be shown.
Price: EUR 20
Saturday, May 9, 14:00 — 15:00
Perception and Subjective Evaluation of Audio Signals
Saturday, May 9, 15:30 — 17:00
The Career Fair will feature several companies from the exhibit floor. All attendees of the convention, students and professionals alike, are welcome to come talk with representatives from the companies and find out more about job and internship opportunities in the audio industry. Bring your resume!
Saturday, May 9, 16:30 — 18:00
P23 - Psychoacoustics and Perception
P23-1 Influence of the Listening Room in the Perception of a Musical Work—Nelia Valverde, Marcos D. Fernández, José Antonio Ballesteros, Leticia Martínez, Samuel Quintana, Isabel González, Escuela Universitaria Politécnica de Cuenca - Cuenca, Spain
Listening to the same musical composition generates a unique perception for every listener but, at the same time, the specific acoustic conditions of the chosen room have a decisive influence on that perception. In order to evaluate such differences depending on the listening room, a musical work for choir has been composed and recorded with a HATS in an anechoic room, in a reverberant room, and in a normal room. With those recordings, surveys of professional musicians and non-expert listeners have been carried out after they had heard the recordings over headphones. Finally, the answers obtained have been evaluated in order to determine the influence of the listening room on the perception of the musical work.
Convention Paper 7775
P23-2 Comparison of Methods for Measuring Sound Quality through HATS and Binaural Microphones—José Antonio Ballesteros, Marcos D. Fernández, Samuel Quintana, Isabel González, Laura Rodríguez, Escuela Universitaria Politécnica de Cuenca, Universidad de Castilla-La Mancha - Cuenca, Spain
Sound quality techniques are currently becoming more important as they take into account the human perception of sound. As yet, there are no well-established international standards for measuring sound quality and no widely recognized reference index for its assessment. In practice, either a HATS or a pair of binaural microphones can be used for measuring the typical sound quality parameters. A set of measurements, under the same conditions, has been carried out using both devices to assess the differences and the possible variation in the results. As a consequence, guidance is given for choosing the device that best fits each measurement context.
Convention Paper 7776
P23-3 Improving Perceived Tempo Estimation by Statistical Modeling of Higher-Level Musical Descriptors—Ching-Wei Chen, Markus Cremer, Kyogu Lee, Peter DiMaria, Ho-Hsiang Wu, Gracenote, Inc. - Emeryville, CA, USA
Conventional tempo estimation algorithms generally work by detecting significant audio events and finding periodicities of repetitive patterns in an audio signal. However, human perception of tempo is subjective and relies on a far richer set of information, causing many tempo estimation algorithms to suffer from octave errors, or “double/half-time” confusion. In this paper we propose a system that uses higher-level musical descriptors such as mood to train a statistical model of perceived tempo classes, which can then be used to correct the estimate from a conventional tempo estimation algorithm. Our experimental results show reliable classification of perceived tempo class, as well as a significant reduction of octave errors when applied to an array of available tempo estimation algorithms.
Convention Paper 7777
P23-4 Perceptually-Motivated Audio Morphing: Softness—Duncan Williams, Tim Brookes, University of Surrey - Guildford, Surrey, UK
A system for morphing the softness and brightness of two sounds independently from their other perceptual or acoustic attributes was coded. The system is an extension of a previous one that morphed brightness only and was based on the Spectral Modeling Synthesis additive/residual model. A Multidimensional Scaling analysis of listener responses to paired comparisons of stimuli generated by the morpher showed movement in three perceptually-orthogonal directions. These directions were labeled in a subsequent verbal elicitation experiment that found that the effects of the brightness and softness controls were perceived as intended. A Timbre Morpher, adjusting additional timbral attributes with perceptually-meaningful controls, can now be considered for further work.
Convention Paper 7778
P23-5 Resolution of Spatial Distribution Perception with Distributed Sound Source in Anechoic Conditions—Olli Santala, Ville Pulkki, Helsinki University of Technology - Espoo, Finland
The resolution of directional perception of spatially distributed sound sources was investigated with a listening test in an anechoic chamber using various sound source distributions. Fifteen loudspeakers were used to produce test cases that included sound sources with varying widths and wide sound sources with gaps in the distribution. The subjects were asked to distinguish which loudspeakers emitted sound according to their own perception. Results show that small gaps in the sound source were not perceived accurately and wide sound sources were perceived narrower than they actually were. The results also indicate that the resolution for fine spatial details was worse than 15 degrees when the sound source was wide.
Convention Paper 7779
P23-6 Perceived Roughness—A Recent Psychoacoustic Measurement—Robert Mores, Thorsten Smit, Jana-Marie Wiese, University of Applied Science - Hamburg, Germany
This paper relates to an investigation on perceived roughness from Aures in 1984 where findings are based on psychoacoustic tests with synthetic sounds and a small group of people. The related results have repeatedly been used for modeling roughness perception since then, for instance in the context of noise perception. Roughness is again an issue when investigating the perceived quality or timbre of musical sounds. In this context roughness is one among some ten mid-level features to be extracted. Here, perceived roughness is measured again, but on a wider basis than in the earlier investigation. This paper outlines the psychoacoustic investigation, basically following the method of Aures, but modifying some of the issues under question. The results are reasonable and differ from the earlier findings in various aspects.
Convention Paper 7780
P23-7 A Physiological Auditory Model—Václav Vencovsky, Czech Technical University in Prague - Prague, Czech Republic
A physiological auditory model is described. The model simulates the processing of sound by the outer, middle, and inner ear. A nonlinear inner ear model comprises a cochlear frequency selectivity model and an inner hair cell model designed according to mammalian physiological data. The capability of the auditory model to simulate human psychophysical masking data is verified.
Convention Paper 7781
Saturday, May 9, 18:30 — 19:00
The band featured in the Live Sound Workshop LS2, Rauschenberger, will continue to play after the workshop finishes, in a concert open to all attendees.
The band "Rauschenberger" is an up-and-coming group from Hannover built around singer and leader Rauschenberger, who has a splendid and very characteristic voice.
Sunday, May 10, 10:30 — 12:30
LS3 - Yamaha and d & b
Line Source to Point Source Transformation—Technical Concept of the d&b T-Series
Increasing demands on flexibility, scalability, and efficiency in sound reinforcement applications encouraged the development of one loudspeaker that can be configured for point and line source applications by an easy mechanical modification. Several implemented design technologies will be discussed before the listening demonstration of the performance of the system under critical acoustic conditions, in comparison with Q-Series line arrays.
Modern IT-Compatible Audio Networks
IT-compatible audio networks and standards: CobraNet and EtherSound. Both formats will be discussed with regard to their advantages and limitations. Advanced network strategies like VLAN programming offer high channel counts and a wide range of additional services via Gigabit audio networks. Now video, intercom, remote control, and DMX services may be included in a modern network infrastructure. Of course, such a network has to be as safe and stable as possible, so the redundancy concepts developed by the IT industry, like link aggregation / trunking / spanning tree, will be discussed.
Sunday, May 10, 10:30 — 12:00
P26 - Room Acoustics and Loudspeaker Interaction
P26-1 Acoustic Design of Classrooms—Suthikshn Kumar, PESIT - Bangalore, India
Acoustic principles, when used effectively in classroom design, can dramatically improve the audibility of the professor. A cost-effective enhancement of the acoustics serves several purposes: less speaking effort on the part of the lecturer; students can hear the lecturer more clearly; improved communication and, hence, an improved learning experience. Several improvements can be made to the classroom architecture to enhance the signal-to-noise ratio and to reduce reverberation and background noise. We propose an innovative way of providing parabolic reflectors near the platform for amplifying the lecturer’s voice. This paper focuses on the cost-effective, energy-efficient acoustic design of classrooms.
Convention Paper 7796
P26-2 Epidaurus: Comments on the Acoustics of the Legendary Ancient Greek Theater—Christos Goussios, Christos Sevastiadis, Kalliopi Chourmouziadou, George Kalliris, Aristotle University of Thessaloniki - Thessaloniki, Greece
The ancient Greek theaters and especially the well preserved theater of Epidaurus are of great interest because of their legendary acoustic characteristics. In the present paper the history and the construction characteristics of the specific theater are presented. The differences between the ancient and modern use of it are explained. Important acoustic parameters calculated using in situ measurements are presented. The conclusions show the relation between its excellent acoustic performance and the obtained results.
Convention Paper 7797
P26-3 A Matlab Toolbox for the Analysis of Ando’s Factors—Dario D'Orazio, Paolo Guidorzi, Massimo Garai, University of Bologna - Bologna, Italy
Autocorrelation and crosscorrelation function analysis is well known in the literature and has produced remarkable results in different scientific fields. The analysis of the autocorrelation function (ACF) and the interaural crosscorrelation function (IACF) in architectural acoustics is known thanks to Y. Ando's work. The Toolbox presented in this work has been developed in order to compute Ando's significant and spatial factors (as the factors obtained from ACF and IACF are called), to subjective preference functions and to investigate further applications.
Convention Paper 7798
Sunday, May 10, 13:00 — 16:00
P28 - Psychoacoustics and Perception
Chair: Florian Wickelmaier
P28-1 Localization of Consecutive Sound Events in Reverberant Environment—Marko Takanen, Antti Jylhä, Tapani Pihlajamäki, Juha Holm, Ilkka Huhtakallio, Ville Pulkki, Helsinki University of Technology - Espoo, Finland
A listening test was conducted to assess the localization of consecutive sound events in simulated reverberant conditions. The stimuli consisted of two sound events, which were reverberant wideband harmonic sounds reproduced in a multichannel anechoic chamber. Localization threshold for the latter sound event was measured as the direct-to-reverberant sound level ratio with an adaptive transformed up-down method. The studied factors affecting the localization threshold were the time interval and pitch difference between the two sound events and the time gap between the direct sound and reverberation. The results indicate that all factors have a significant effect on localization.
Convention Paper 7807
P28-2 The Contrasting and Conflicting Definitions of Envelopment—Jan Berg, Luleå University of Technology - Luleå, Sweden
In spatial audio, the term envelopment is not unambiguously defined and the different de facto definitions both overlap and contradict one another. This lack of clarity may pose a problem where the sensation of being surrounded by sound is the subject of investigation and analysis. This paper reviews the different concepts of envelopment in order to point to where possible problems may occur. A tentative suggestion for a terminology that can serve the different contexts of enveloping sounds is also given.
Convention Paper 7808
P28-3 Apparent Source Width in ITU Surround—Jorge Medina Victoria, Thomas Görne, Hamburg University of Applied Sciences - Hamburg, Germany
Apparent Source Widths (ASW) of phantom images in an ITU-R BS.775-1 standard surround loudspeaker configuration have been investigated for different signals by means of a randomized blind test. Test signals were generated from anechoic recordings by amplitude panning between adjacent channels. The listening test showed that an increase of Apparent Source Width coincides with the increase of localization uncertainty at the side and back areas of the ITU setup. The largest ASW values were found between the RS and LS channels.
Convention Paper 7809
P28-4 A New Methodological Approach to the Noise Threat Evaluation Based on the Selected Physiological Properties of the Human Hearing System—Jozef Kotus, Bozena Kostek, Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland
A new way of assessing noise-induced harmful effects on the human hearing system is presented in this paper. The method takes into consideration selected physiological properties of the human hearing system. On the basis of the results of hearing examinations and noise measurements, and of the performance of a psychoacoustical noise dosimeter, new indicators of noise harmfulness were proposed. The evaluation of the proposed indicators was conducted on the basis of hearing examinations in real noise exposure situations and also on the basis of simulation results using standard test signals (such as white, pink, and brown noise). The analysis performed and the results obtained confirmed the practical usefulness and correctness of the proposed indicators.
Convention Paper 7813
P28-5 Octave-Band Analysis on ITU-R Listening Test Data—Ian M. Dash, Australian Broadcasting Corporation - Sydney, NSW, Australia
Listening test data collected in 2003 on 49 audio program samples were used to formulate the ITU-R BS.1770 program loudness prediction algorithm. The validity of this data at low frequencies was unproven. Octave-band analysis has therefore been performed on the test samples to test for audibility in each band. Results suggest that further listening tests may be needed to obtain reliable low-frequency data. A multiple regression analysis was also performed on the octave-band data to obtain a least-squares weighting curve for comparison with the BS.1770/RLB2 weighting curve. Results suggest that while the BS.1770 curve performs well, there is still room for improvement.
Convention Paper 7811
P28-6 Windowed Sine Bursts: In Search of Optimal Test Signals for Detecting the Threshold of Audibility of Temporal Decays—Andrew Goldberg, Helsinki University of Technology - Espoo, Finland
A slow decay in an audio signal is perceived as ringing and is commonly caused by room modes. This affects the perception of intelligibility, clarity, definition, and spatial rendering. A method has previously been devised to find the threshold of audibility of the decay in low-frequency narrow-band signals. One of the test signals in the large-scale listening test will be a low-frequency sine burst, but spectral spreading at the start and end of the test signal acts as an additional non-modal cue. This effect is removed by windowing, for example a half Hann. The aim of this paper is to determine the window length required (threshold) to render the end of the test signal free from audible spectral spreading. The Parameter Estimation by Sequential Testing (PEST) method and calibrated headphones (to remove factors associated with the listening environment) are used in subjective listening tests. The window length threshold is found to be constant above 200 Hz but rises exponentially toward low frequencies, and is replay level dependent. Threshold may be related to the absolute threshold of hearing, masking curves and/or auditory filter bandwidth.
Convention Paper 7812
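The windowing idea in the abstract above can be sketched as follows: a sine burst whose start and end are shaped by the rising and falling halves of a Hann window, so that the abrupt onset and offset (and the spectral spreading they cause) are smoothed. This is a minimal sketch only; the function name, the 48 kHz sample rate, and the choice to apply the same fade length at both ends are assumptions, and the window lengths actually found in the paper are not reproduced here.

```python
import numpy as np


def windowed_sine_burst(freq_hz, duration_s, fade_s, sample_rate=48000):
    """Sine burst with half-Hann fade-in and fade-out.

    fade_s is the length of each half-Hann ramp in seconds and must
    not exceed half of duration_s. sample_rate is an assumed value.
    """
    n = int(round(duration_s * sample_rate))
    n_fade = int(round(fade_s * sample_rate))
    if 2 * n_fade > n:
        raise ValueError("fades are longer than the burst")

    t = np.arange(n) / sample_rate
    burst = np.sin(2.0 * np.pi * freq_hz * t)

    # Split one symmetric Hann window into a rising and a falling half.
    hann = np.hanning(2 * n_fade)
    envelope = np.ones(n)
    envelope[:n_fade] = hann[:n_fade]
    envelope[n - n_fade:] = hann[n_fade:]
    return burst * envelope
```

For example, `windowed_sine_burst(100.0, 0.5, 0.05)` returns a 0.5 s burst at 100 Hz with 50 ms ramps. Varying `fade_s` corresponds to the window-length parameter whose audibility threshold the listening test is looking for.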
Sunday, May 10, 13:30 — 15:00
P29 - Signal Analysis, Measurements, Restoration
P29-1 Evaluation and Comparison of Audio Chroma Feature Extraction Methods—Michael Stein, Benjamin M. Schubert, Ilmenau University of Technology - Ilmenau, Germany; Matthias Gruhne, Gabriel Gatzsche, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany; Markus Mehnert, Ilmenau University of Technology - Ilmenau, Germany
This paper analyzes and compares different methods for digital audio chroma feature extraction. The chroma feature is a descriptor, which represents the tonal content of a musical audio signal in a condensed form. Therefore chroma features can be considered as an important prerequisite for high-level semantic analysis, like chord recognition or harmonic similarity estimation. A better quality of the extracted chroma feature enables much better results in these high-level tasks. In order to discover the quality of chroma features, seven different state-of-the-art chroma feature extraction methods have been implemented. Based on an audio database, containing 55 variations of triads, the output of these algorithms is critically evaluated. The best results were obtained with the Enhanced Pitch Class Profile.
Convention Paper 7814
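As a rough illustration of what a chroma feature is, the sketch below folds the power spectrum of a single frame into 12 pitch classes. It is a deliberately basic variant and is not claimed to be any of the seven methods compared in the paper, nor the Enhanced Pitch Class Profile. The function name, the 440 Hz tuning reference, and the frequency limits are assumptions.

```python
import numpy as np


def simple_chroma(signal, sample_rate, f_min=55.0, f_max=2000.0):
    """Fold one frame's power spectrum into 12 pitch classes (C = 0).

    Assumes equal temperament with A4 = 440 Hz. f_min and f_max are
    assumed analysis limits in Hz.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(signal.size)
    power = np.abs(np.fft.rfft(signal * window)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate)

    keep = (freqs >= f_min) & (freqs <= f_max)
    # Map each bin frequency to the nearest MIDI note, then to a pitch class.
    midi = 69.0 + 12.0 * np.log2(freqs[keep] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12

    chroma = np.bincount(pitch_class, weights=power[keep], minlength=12)
    total = chroma.sum()
    return chroma / total if total > 0 else chroma
```

With a pure 440 Hz tone as input, almost all of the energy ends up in pitch class 9 (A). For real music, harmonics and noise spread energy into other classes, and handling that is where more elaborate extraction methods differ from one another.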
P29-2 Measuring Transient Structure-Borne Sound in Musical Instruments—Proposal and First Results from a Laser Intensity Measurement Setup—Robert Mores, Hamburg University of Applied Sciences - Hamburg, Germany; Marcel thor Straten, Consultant - Seevetal, Germany; Andreas Selk, Consultant - Hamburg, Germany
The proposal for this new measurement setup is motivated by curiosity in transients propagating across arched tops of violins. Understanding the impact of edge construction on transient wave reflection back to the top of a violin or on conduction into the rib requires single-shot recordings, possibly without statistical processing. The signal-to-noise ratio should be high although mechanical amplitudes at distinct locations on the structure surface are in the range of a few micrometers only. In the proposed setup, the intensity of a laser beam is directly measured after passing a screen attached to the device under test. The signal-to-noise ratio achieved for one micrometer transients in single-shot recordings is significantly more than 60 dB.
Convention Paper 7815
P29-3 Evaluating Ground Truth for ADRess as a Preprocess for Automatic Musical Instrument Identification—Joseph McKay, Mikel Gainza, Dan Barry, Dublin Institute of Technology - Dublin, Ireland
Most research in musical instrument identification has focused on labeling isolated samples or solo phrases. A robust instrument identification system capable of dealing with polytimbral recordings of instruments remains a necessity in music information retrieval. Experiments are described that evaluate the ground truth of ADRess as a sound source separation technique used as a preprocess to automatic musical instrument identification. The ground truth experiments are based on a number of basic acoustic features, while using a Gaussian Mixture Model as the classification algorithm. Using all 44 acoustic feature dimensions, successful identification rates are achieved.
Convention Paper 7816
P29-4 Improving Rhythmic Pattern Features Based on Logarithmic Preprocessing—Matthias Gruhne, Christian Dittmar, Fraunhofer Institute for Digital Media Technology IDMT - Ilmenau, Germany
In the area of Music Information Retrieval, the rhythmic analysis of music plays an important role. In order to derive rhythmic information from music signals, several feature extraction algorithms have been described in the literature. Most of them extract the rhythmic information by auto-correlating the temporal envelope derived from different frequency bands of the music signal. Using the auto-correlated envelopes directly as an audio-feature is afflicted with the disadvantage of tempo dependency. To circumvent this problem, further postprocessing via higher-order statistics has been proposed. However, the resulting statistical features are still tempo dependent to a certain extent. This paper describes a novel method, which logarithmizes the lag-axis of the auto-correlated envelope and discards the tempo-dependent part. This approach leads to tempo-invariant rhythmic features. A quantitative comparison of the original methods versus the proposed procedure is described and discussed in this paper.
Convention Paper 7817
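The reasoning behind a logarithmic lag axis can be sketched briefly: a tempo change stretches the autocorrelation along the lag axis, and on a logarithmic axis a stretch becomes a plain shift. The code below resamples an autocorrelation onto log-spaced lags and then takes the magnitude of a Fourier transform along that axis, which is insensitive to shifts. The second step is an assumed stand-in for "discarding the tempo-dependent part" and may differ from the paper's procedure; all function names and default parameters are hypothetical.

```python
import numpy as np


def log_lag_acf(acf, min_lag=8.0, max_lag=1024.0, bins_per_octave=12):
    """Resample an autocorrelation (index = lag in samples) onto
    logarithmically spaced lags by linear interpolation.

    max_lag should not exceed len(acf) - 1.
    """
    acf = np.asarray(acf, dtype=float)
    n_octaves = np.log2(max_lag / min_lag)
    n_bins = int(round(n_octaves * bins_per_octave)) + 1
    log_lags = min_lag * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    return np.interp(log_lags, np.arange(acf.size), acf)


def tempo_invariant_feature(acf, **kwargs):
    """Magnitude spectrum along the log-lag axis (assumed approach).

    A shift along the log-lag axis only changes the phase of this
    transform, so the magnitude is approximately tempo independent
    as long as the pattern stays inside the analyzed lag range.
    """
    return np.abs(np.fft.rfft(log_lag_acf(acf, **kwargs)))
```

With these defaults, halving the tempo doubles every lag and moves the pattern by exactly `bins_per_octave` bins on the log-lag axis, while the output of `tempo_invariant_feature` stays nearly unchanged.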
P29-5 Further Developments of Parameterization Methods of Audio Stream Analysis for Security Purposes—Pawel Zwan, Andrzej Czyzewski, Gdansk University of Technology - Gdansk, Poland
The paper presents an automatic sound recognition algorithm intended for application in an audiovisual security monitoring system. The distributed character of security systems does not allow for simultaneous observation of multiple multimedia streams, thus an automatic recognition algorithm must be introduced. In the paper a module for the parameterization and automatic detection of audio events is described. Spectral analysis of the sounds of a breaking window, a gunshot, and a scream is performed, and parameterization methods are proposed and discussed. Moreover, a sound classification system based on the Support Vector Machines (SVM) algorithm is presented and its accuracy is discussed. The practical application of the system with the use of a monitoring station is shown. The plan of further experiments is presented and conclusions are drawn.
Convention Paper 7818
P29-6 Estimating Instrument Spectral Envelopes for Polyphonic Music Transcription in a Music Scene-Adaptive Approach—Julio J. Carabias-Orti, Pedro Vera-Candeas, Nicolas Ruiz-Reyes, Francisco J. Cañadas-Quesada, Pablo Cabañas-Molero, University of Jaén - Linares, Spain
We propose a method for estimating the spectral envelope pattern of musical instruments in a musical scene-adaptive scheme, without having any prior knowledge about the real transcription. A musical note is defined as stable when variations between its harmonic amplitudes are held constant during a certain period of time. A density-based clustering algorithm is used with the stable notes in order to separate different envelope models for each note. Music scene-adaptive envelope patterns are finally obtained from similarity and continuity of the different note models. Our approach has been tested in a polyphonic music transcription scheme with synthesized and real music recordings obtaining very promising results.
Convention Paper 7819