Journal of the AES
2012 January/February - Volume 60 Number 1/2
Guest Editor’s Note: Special Issue on Perceptual Quality of Systems (PQS)
It is difficult to describe the quality of systems that focus on human perception. An engineering approach to quality includes the consideration of how a system is perceived by its users and how the needs and expectations of the users develop. Thus, quality assessment and prediction have to take into account the relevant factors related to human perception and judgment. The perceptual evaluation of sound quality and its modeling require the understanding of a number of diverse disciplines including engineering, psychology, statistics, music, aesthetics, and language. Descriptions are given for the papers in this special issue.
Sound quality is a complex and multilayered phenomenon. When analyzing or modeling the formation process of sound-quality judgments, a variety of quality elements and quality features have to be taken into account, whereby the actual relevance and salience of each of them is situation dependent. In the following we present some ideas with the aim of structuring the quality-formation process into different layers according to the amount of abstraction involved. Depending on the amount of abstraction, different sets of references and evaluation and assessment methods have to be employed.
The Semantic Space of Vehicle Sounds: Developing a Semantic Differential with Regard to Customer Perception
Evaluating the sound quality of a vehicle is a complex process. Physical and psychoacoustical measures cannot sufficiently describe this process with only superficial cues. Customers' quality evaluation is based on their perceptions, interpretations, and expectations. This study generated a semantic space for vehicle sound. In other words, we elicited numerous attributes related to the perception and quality of vehicle sound. We sought to determine customers’ common language that appropriately describes vehicle sound quality. This study developed and applied a novel systematic approach, which includes a free verbalization interview, a test of participants’ understanding of acoustic attributes, and participant evaluation of the ability of these attributes to describe perceptible vehicle sound properties. In this manner we created a complete semantic database to describe vehicle sounds and tested the relevance and redundancy of these attributes. At the end of the investigation we developed two sets of 28 attributes for interior and exterior driving conditions.
Emoacoustics: A Study of the Psychoacoustical and Psychological Dimensions of Emotional Sound Design
Even though traditional psychoacoustics has provided indispensable knowledge about auditory perception, it has, in its narrow focus on signal characteristics, neglected listener and contextual characteristics. To demonstrate the influence of the meaning the listener attaches to a sound on the resulting sensations, we used Fourier-time-transform processing to reduce the identifiability of 18 environmental sounds. In a listening experiment, 20 subjects listened to and rated their sensations in response to, first, all the processed stimuli and then all the original stimuli, without being aware of the relationship between the two groups. Another 20 subjects rated only the processed stimuli, which were primed by their original counterparts. This manipulation was used to see how the resulting sensation differs when the subject can tell what the sound source is. In both tests subjects rated their emotional experience for each stimulus on the orthogonal dimensions of valence and arousal, as well as perceived annoyance and perceived loudness. They were also asked to identify the sound source. It was found that processing substantially reduced correct identification, while priming recovered most of it. While original stimuli induced a wide range of emotional experience, reactions to processed stimuli were emotionally neutral. The priming manipulation reversed the effects of processing to some extent. Moreover, even though the 5th percentile Zwicker loudness (N5) value of most of the stimuli was reduced after processing, neither perceived loudness nor auditory-induced emotion changed accordingly. This indicates the importance of considering factors other than the physical sound characteristics in sound design.
The perceived sound quality of small loudspeaker systems with and without digital optimization was empirically evaluated in a listening experiment. Further, it was investigated how the presentation order in the performed paired comparisons influenced the results, as well as whether a self-evaluation was of potential use for variance reduction. The systems were optimized by means of FIR filters. The two versions of each loudspeaker system were rated in a paired comparison test for music stimuli. For the purpose of analysis a linear Gaussian model was applied, resulting in an interval scale revealing interesting information about certainty and discrimination ability of the listeners. The test investigated whether linear pre-compensation of small and inexpensive loudspeaker systems results in a significant improvement of the perceived audio quality in a typical listening situation. The results indicated a significant preference for the optimized version, and a significant dependency on the presentation order was detected. The self-evaluation was found to be uncorrelated with the test results.
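As a rough illustration of how paired-comparison counts can be mapped to an interval scale under a Gaussian assumption, the sketch below uses Thurstone Case V scaling. This is a stand-in technique chosen for brevity, not the linear Gaussian model of the paper, whose details are not given here. The function name `thurstone_case_v`, the clamping limits, and the example counts are all hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(wins):
    """Estimate interval-scale values from paired-comparison counts.

    wins[i][j] is the number of times item i was preferred over item j.
    The scale is defined only up to an arbitrary origin and unit.
    """
    n = len(wins)
    std_normal = NormalDist()
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            # Assumption: a pair that was never compared counts as a tie.
            p = wins[i][j] / total if total else 0.5
            # Clamp so unanimous preferences do not map to infinite z-scores.
            p = min(max(p, 0.01), 0.99)
            z_sum += std_normal.inv_cdf(p)
        scale.append(z_sum / (n - 1))
    return scale


# Hypothetical counts: 30 of 40 comparisons favored the optimized version.
wins = [[0, 30], [10, 0]]
scale = thurstone_case_v(wins)
```

With these made-up counts the first item receives a positive scale value and the second the mirrored negative value; the distance between them grows as the preference becomes more one-sided.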
In our daily lives, we usually perceive an event via more than one sensory modality (e.g., vision, hearing, touch). Therefore, multimodal integration and interactions play an important role when we use objects and recognize events in our environment. A virtual environment (VE) is a computer simulation of a realistic-looking and interactive world. VEs should take into account the multisensory nature of humans and communicate with the user not only through vision but also through other modalities. In addition to vision, hearing and touch are the most commonly used communication channels. Recently, a variety of products with additional tactile input and output capabilities have been developed (e.g., Apple iPhone and other touch-screen devices, Nintendo Wii, etc.). Some of these devices provide new possibilities for interacting with a computer, including the auditory modality. Binaural synthesis and rendering are becoming key technologies for multimedia products. Virtual environments are no longer limited to academic research; they have commercial applications, particularly in the medicine, game, and entertainment industries. Thus, the quality of VEs is becoming increasingly important. User interaction with a VE is a key issue in the perception of its quality. Several studies have discussed the quality of displays, input and output devices (for different modalities) as well as software and hardware issues; however, multimodal user interaction should also be examined. This paper focuses on the parameters that influence the quality of audio-tactile VEs.
In this study experiments were conducted to determine whether a person could distinguish percussive audio loops with their fingertips using audio-driven tactile feedback. The audio signal was adapted to generate a vibration signal (tactile feedback) taking into account the limited capabilities of the tactile modality. A systematic approach to find the different adaptation parameters is discussed. The vibrations were created by an electrodynamic shaker mounted behind a touch-sensitive screen. Results indicate percussive loops are best distinguished if the source features (e.g., frequency spectrum) and sequence features (e.g., rhythm) are maintained.
Perceptual Evaluation of Headphone Compensation in Binaural Synthesis Based on Non-Individual Recordings
The headphone transfer function (HpTF) is a major source of spectral coloration observable in binaural synthesis. Filters for frequency response compensation can be derived from measured HpTFs. Therefore, we developed a method for measuring HpTFs reliably at the blocked ear canal. Subsequently, we compared non-individual dynamic binaural simulations based on recordings from a head and torso simulator (HATS) directly to reality, assessing the effect of non-individual, generic, and individual headphone compensation in listening tests. Additionally, we tested improvements of the regularization scheme of an LMS inversion algorithm, the effect of minimum phase inverse filters, and the reproduction of low frequencies by a subwoofer. Results suggest that while using non-individual binaural recordings the HpTF of the individual used for the recordings – typically a HATS – should be used for headphone compensation.
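To make the idea of regularized inversion concrete, the following minimal sketch computes a compensation filter per frequency bin as C[k] = conj(H[k]) / (|H[k]|^2 + beta). This is one common least-squares formulation with a constant regularization weight; the regularization scheme actually tested in the paper is not described here and may differ (for example, it may be frequency dependent). The function name `regularized_inverse`, the value of `beta`, and the example bins are hypothetical.

```python
import math


def regularized_inverse(h_bins, beta=0.01):
    """Return C[k] = conj(H[k]) / (|H[k]|^2 + beta) for each frequency bin.

    h_bins: complex samples of a measured transfer function.
    beta:   assumed constant regularization weight; larger values limit
            the gain applied at deep notches of the response.
    """
    return [h.conjugate() / (abs(h) ** 2 + beta) for h in h_bins]


# Hypothetical response: a flat bin, a mildly attenuated bin, a deep notch.
h_bins = [1.0 + 0.0j, 0.5 + 0.5j, 0.001 + 0.0j]
c_bins = regularized_inverse(h_bins, beta=0.01)

# The gain of this filter can never exceed 1 / (2 * sqrt(beta)).
max_gain = max(abs(c) for c in c_bins)
gain_bound = 1.0 / (2.0 * math.sqrt(0.01))
```

An unregularized inverse would boost the notch bin by a factor of 1000; with the regularization term the gain at that bin stays near 0.1, which is the trade-off such schemes make between exact equalization and limiting boost at notches that vary with headphone fit.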
This article investigates the influence of test duration on user fatigue and the reliability of user ratings in the context of subjective Quality-of-Experience (QoE) assessment. The goal is to provide empirically grounded guidance for the design of lab-based quality experiments, particularly as concerns the overall duration of test sessions. Since subjective user tests tend to be time-consuming and costly, aspects of user workload and fatigue are relevant as they relate to a fundamental challenge: how to maximize test duration without compromising results quality by overly exhausting test participants? In order to address this challenge, we investigate the relationships between test duration, user fatigue, and rating behavior. Our analysis is grounded on measurements and observations made during three typical QoE lab studies with mixed audio, video, and web task profiles that assessed the impact of different network conditions on perceived quality. We measured participant workload and fatigue in two complementary ways: subjectively by means of a questionnaire and objectively by performing physiological measurements in terms of eye blink rates (EBR) as well as electrocardiograms (ECG). Our results show that even after 90 minutes of active testing, participants’ quality gradings were still reliable despite the presence of measurable signs of fatigue. Thus, for comparable QoE lab user experiments, we recommend staying within this limit in order to achieve a good balance between results quantity and results quality.
Standards and Information Documents
AES Standards Committee News
Audio connectors; audio measurements
132nd Convention Preview, Budapest
At the 131st Convention held last October, archiving and preservation of audio material was a prominent topic in the program. A tutorial laid down the why, how, and what of audio preservation, followed by fascinating workshops on the practical exploitation of record company archives in the form of reissues of some of the greatest acts in the history of sound. The challenges of getting the best out of recorded material stored in warehouses and vaults were thoroughly examined, and we discover that an ounce of prevention is probably more worthwhile than a pound of cure when it comes to treating the “patient” that is the valuable historical asset of a record company or radio station’s archive.
Technology Trends in Audio Engineering
133rd Call for Papers, San Francisco