AES Warsaw 2015
Poster Session P2

P2 - (Poster) Education and Perception

Thursday, May 7, 10:30 — 12:30 (Foyer)

P2-1 Effects of Ear Training on Education on Sound Quality of Digital Audio for Non-Technical Undergraduates
Akira Nishimura, Tokyo University of Information Sciences - Chiba-shi, Japan
This paper demonstrates the effectiveness of ear training in lectures on audio processing conducted over the 2013 and 2014 academic terms. Student understanding of the lecture content was assessed by comparing scores on written tests that covered the sound quality of perceptual audio codecs and other topics. The tests were administered after lectures with and without ear training on identifying the bit rates of sound files. The same approach was applied to a lecture on audio digitizing, with ear training on identifying sampling frequencies. Test scores on the sound quality of perceptual audio codecs were significantly higher among students who had participated in ear training than among those who had not. In contrast, no significant difference was found between the group scores of participants tested after ear training on identifying sampling frequency. Why the effectiveness of ear training was limited to perceptual coding was investigated in terms of prior knowledge of the technical terms.
P2-2 Evaluation of the Low-Delay Coding of Applause and Hand-Clapping Sounds Caused by Music Appreciation
Kazuhiko Kawahara, Kyushu University - Fukuoka, Japan; Yutaka Kamamoto, NTT Communication Science Laboratories - Kanagawa, Japan; Akira Omoto, Kyushu University - Fukuoka, Japan; Takehiro Moriya, NTT Communication Science Labs - Atsugi-shi, Kanagawa-ken, Japan
Recent improvements in network resources make it possible to distribute content in real time. This paper presents low-delay coding of applause and hand-clapping sounds with fewer parameters, achieved by synthesizing these sounds at the receiver site. We found that the number of people clapping their hands corresponds to the sound volume of the applause. In other words, no one considers who is clapping. For hand-clapping sounds, the time interval between claps is also important. Based on this information, preliminary experiments confirm that our approach, which synthesizes applause and hand-clapping sounds from a few parameters, successfully generates natural applause and hand-clapping sounds.
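To illustrate the idea of regenerating applause at the receiver from a few transmitted parameters, the following is a minimal sketch and not the authors' method. It assumes only two parameters, a clapper count and a mean clap interval, and models each clap as a short decaying noise burst. The function name, the burst model, the jitter ranges, and the default sample rate are all hypothetical choices made for illustration.

```python
import math
import random


def synthesize_applause(num_clappers, mean_interval_s, duration_s=1.0,
                        sample_rate=8000, seed=0):
    """Return a list of samples approximating applause (illustrative only).

    num_clappers and mean_interval_s stand in for the "few parameters"
    mentioned in the abstract; everything else is an assumption.
    """
    rng = random.Random(seed)            # seeded so the output is repeatable
    n = int(duration_s * sample_rate)
    out = [0.0] * n
    burst_len = int(0.01 * sample_rate)  # assume each clap lasts about 10 ms

    for _ in range(num_clappers):
        # Each clapper gets a slightly different tempo and start time.
        interval = mean_interval_s * rng.uniform(0.8, 1.2)
        t = rng.uniform(0.0, interval)
        while t < duration_s:
            start = int(t * sample_rate)
            for k in range(burst_len):
                if start + k >= n:
                    break
                # Exponentially decaying white-noise burst for one clap.
                envelope = math.exp(-5.0 * k / burst_len)
                out[start + k] += rng.uniform(-1.0, 1.0) * envelope
            # Small timing jitter between successive claps.
            t += interval * rng.uniform(0.95, 1.05)
    return out


if __name__ == "__main__":
    crowd = synthesize_applause(num_clappers=50, mean_interval_s=0.25)
    print(len(crowd), "samples synthesized")
```

In this sketch the overall level rises with the clapper count because more bursts overlap, which mirrors the abstract's observation that the number of people clapping corresponds to the applause volume.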
P2-3 Subjective Evaluation of High Resolution Audio under In-Car Listening Environments
Mitsunori Mizumachi, Kyushu Institute of Technology - Kitakyushu, Fukuoka, Japan; Ryuta Yamamoto, Digifusion Japan Co., Ltd. - Hiroshima, Japan; Katsuyuki Niyada, Hiroshima Cosmopolitan University - Hiroshima, Japan
High resolution audio (HRA) is becoming increasingly popular both in music production and among consumers. It makes it possible to record a music performance in a wide-band, precise digital audio format. Its perceptual advantage, however, remains unclear in some listening environments. In this study listening tests were carried out inside cars, where 34 participants listened to the same music in four different audio formats. The participants chose the audio format with better quality in paired comparisons among 192 kHz/24-bit PCM, 48 kHz/16-bit PCM, and two kinds of lossy-compressed MPEG audio formats. The participants, who are familiar with HRA and live music performance, could significantly discriminate among the audio formats.
P2-4 Investigating Factors that Guitar Players to Perceive Depending on Amount of Distortion in Timbre
Koji Tsumoto, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan; Atsushi Marui, Tokyo University of the Arts - Tokyo, Japan; Toru Kamekawa, Tokyo University of the Arts - Adachi-ku, Tokyo, Japan
Typical electric guitar timbre can be classified into three classes according to the amount of distortion. Timbre with less distortion is called "Clean" and heavily distorted timbre is called "Distorted." Timbre between "Clean" and "Distorted" is called "Crunch." To investigate the factors that guitar players perceive depending on the amount of distortion, semantic differential analysis using eight bipolar adjective scales was employed. Twenty guitar players, including six professionals, played their instruments through a guitar amp with nine different distortion level settings. Two factors were found in factor analysis, and "Clean" and "Distorted" were located opposite each other. "Crunch" was located in the middle of the latent factors and of each anchoring adjective used in the evaluation. In addition, the regression analysis indicated that the "Activeness Factor" was the reliable factor corresponding to the amount of distortion.
P2-5 Perception of Timbre Changes vs. Temporary Threshold Shift
Bartlomiej Kruk, Wroclaw University of Technology - Wroclaw, Poland; Maurycy Kin, Wroclaw University of Technology - Wroclaw, Poland
The paper presents the results of research on the influence of Temporary Threshold Shift (TTS) on the detection of changes in the timbre of musical samples. The experiment was carried out under conditions that normally exist in a studio when sound material is recorded and mixed. The level of sound exposure that represents the noise signal is 90 dB, which is the average sound level in a control room. This musical material may be treated as noise, so the TTS phenomenon may occur after several exposure durations: 60, 90, and 120 minutes. Ten subjects participated in the main part of the experiment, and all of them had normal hearing thresholds. The stimuli contained the musical material with introduced changes in timbre of up to ±6 dB in the low (100 Hz), middle (1 kHz), and high (10 kHz) frequency regions. Listening to music at an exposure level of 90 dB for 1 hour was found to influence the hearing thresholds in the middle frequency region (about 1–2 kHz), and this was reflected in the perception of timbre changes: after 1 hour of listening, spectral changes in the middle-frequency region were perceived with a threshold of 3 dB, while changes in the low and high ranges of the spectrum were perceived with thresholds of 1.8 and 1.5 dB, respectively. After longer exposure, the thresholds shifted up to 3.5 dB for all investigated stimuli.
P2-6 Hybrid Multiresolution Analysis of “Punch” in Musical Signals
Steven Fenton, University of Huddersfield - Huddersfield, West Yorkshire, UK; Hyunkook Lee, University of Huddersfield - Huddersfield, UK; Jonathan Wakefield, University of Huddersfield - Huddersfield, UK
This paper presents a hybrid multiresolution technique for the extraction and measurement of attributes contained within a musical signal. Decomposing music into simpler percussive, harmonic, and noise components is useful when detailed extraction of signal attributes is required. The key parameter of interest in this paper is punch. A methodology is explored that decomposes the musical signal using a critically sampled constant-Q filterbank of quadrature mirror filters (QMF) before adaptive windowed short-time Fourier transforms (STFT). The proposed hybrid method offers accuracy in both the time and frequency domains. Following the decomposition process, attributes are analyzed. It is shown that analysis of these components may yield parameters that would be of use in mixing/mastering as well as in audio transcription and retrieval.
P2-7 Five Aspects of Maximizing Objectivity from Perceptual Evaluations of Loudspeakers: A Literature Study
Christer Volk, DELTA SenseLab - Hørsholm, Denmark; Aalborg University, Department of Electronic Systems - Aalborg East, Denmark; Søren Bech, Bang & Olufsen a/s - Struer, Denmark; Aalborg University - Aalborg, Denmark; Torben H. Pedersen, DELTA SenseLab - Hørsholm, Denmark; Flemming Christensen, Aalborg University - Aalborg, Denmark
A literature study was conducted focusing on maximizing the objectivity of results from listening evaluations aimed at establishing the relationship between physical and perceptual measurements of loudspeakers. The purpose of this study was to identify and examine factors influencing the objectivity of data from the listening evaluations. This paper addresses the following subset of aspects for increasing the objectivity of data from listening tests: the choice of perceptual attributes, the relevance of perceptual attributes, the choice of loudness equalization strategy, optimum listening room specifications, and loudspeaker listening in situ vs. listening to recordings of loudspeakers over headphones.
P2-8 Modding Game Audio for Education
Ricardo Bragança, United Arab Emirates University - Al Ain, Abu Dhabi, UAE
Worldwide there is no formal curriculum for game audio. This paper discusses what can be done to change the status quo. We intend to shed some light on possible solutions and guidelines that schools can use to raise awareness of how to implement game audio successfully in a university’s curriculum. We believe that, due to the interdisciplinary nature of game audio, cross-faculty cooperation and corporate partnerships are advisable and will promote a better understanding of how to tackle the topic. Constructivist teaching methods and a student-centric, inquiry-based learning approach are suggested to enhance the learning experience and ensure adequate content absorption.
P2-9 The Acoustic Properties of Different Types of Earplug Used by Sound Engineers
Bartlomiej Kruk, Wroclaw University of Technology - Wroclaw, Poland; Michal Luczynski, Wroclaw University of Technology - Wroclaw, Poland
The main aim of this paper is to test various types of earplugs used by sound engineers. At live events, when sound engineers need to use earplugs for health reasons, it is very important that they maintain correct hearing perception. A linear frequency response helps avoid mistakes when working with sound. Earplugs were tested for attenuation as a function of frequency. The authors tested the earplugs using two methods: subjectively, using pure tone audiometry, and objectively, using a purpose-built ear canal model. The research made it possible to choose appropriate earplugs for sound engineering purposes.
P2-10 Psychoacoustic Annoyance Monitoring with WASN for Assessment in Urban Areas
Jaume Segura-Garcia, Universitat de Valencia - Burjassot, Valencia, Spain; Polytechnic University of Valencia; Santiago Felici, Universitat de Valencia - Burjassot, Spain; Maximo Cobos, Universitat de Valencia - Burjassot, Spain; Ana Torres, Polytechnic University School of Cuenca - Cuenca, Spain; Juan M. Navarro, Universidad Católica San Antonio - Murcia - Guadalupe (Murcia), Spain
The assessment of the subjective annoyance caused by noise pollution in cities is a matter of major importance, as its influence is growing in urban areas. Different methods and techniques have been used to model this annoyance in terms of several psychoacoustic parameters, which describe different aspects of the acoustic effect of noise pollution on human behavior. In this paper we describe a monitoring system based on a wireless acoustic sensor network that measures and computes the psychoacoustic metrics following Zwicker's annoyance model, in a distributed way and at different points simultaneously in urban areas. The nodes of this network run complex algorithms to compute these metrics. These nodes are single-board computer platforms, in particular the Raspberry Pi.
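For orientation, the annoyance metric can be sketched as follows, assuming the commonly cited form of Zwicker's psychoacoustic annoyance model, PA = N5 (1 + sqrt(wS^2 + wFR^2)). The function and parameter names are hypothetical, and the inputs (percentile loudness N5 in sone, sharpness in acum, fluctuation strength in vacil, roughness in asper) are assumed to have been computed elsewhere. This is not the implementation that runs on the authors' sensor nodes.

```python
import math


def psychoacoustic_annoyance(n5, sharpness, fluctuation, roughness):
    """Combine four psychoacoustic metrics into one annoyance value.

    n5          : loudness exceeded 5% of the time, in sone
    sharpness   : in acum
    fluctuation : fluctuation strength, in vacil
    roughness   : in asper
    """
    if n5 <= 0:
        return 0.0  # no loudness, no annoyance (assumed edge-case handling)

    # Sharpness only contributes above 1.75 acum.
    if sharpness > 1.75:
        w_s = (sharpness - 1.75) * 0.25 * math.log10(n5 + 10)
    else:
        w_s = 0.0

    # Joint weighting of fluctuation strength and roughness.
    w_fr = 2.18 / n5 ** 0.4 * (0.4 * fluctuation + 0.6 * roughness)

    return n5 * (1 + math.sqrt(w_s ** 2 + w_fr ** 2))


if __name__ == "__main__":
    # Illustrative input values only, not measurements from the paper.
    print(psychoacoustic_annoyance(20.0, 2.0, 0.5, 0.3))
```

With sharpness at or below 1.75 acum and no fluctuation or roughness, the value reduces to N5 itself, so loudness acts as the baseline that the other metrics scale up.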
P2-11 The Advanced Sound System Listening Room at Dolby
Sunil G. Bharitkar, Dolby Laboratories - San Francisco, CA, USA
A listening room at Dolby has been designed to test the spatial and timbre performance of next-generation audio formats recommended in the new ITU-R BS.2051-0 (Advanced sound system for program production). The room has been designed to conform as closely as possible to the new ITU-R BS.1116-2 (Methods for the subjective assessment of small impairments in audio systems) specification for testing the performance of next-generation audio codecs. Detailed physical and acoustical measurements, conducted using international standards, demonstrate that the room satisfies elements of both of these international recommendations; they are presented in the paper. Subjective testing is ongoing, and some preliminary feedback is included as well.
