Many existing perceptual audio codec standards define only the bit stream syntax and associated decoder algorithms, leaving many degrees of freedom to the encoder design. For a systematic optimization of encoder parameters, as well as for the education and training of experienced test listeners, it is instrumental to provoke and subsequently assess individual coding artifact types in isolation and with controllable strength. The approach presented in this paper combines a pre-selection of suitable test audio content with forcing a specially modified encoder into uncommon operation modes to deliberately generate controlled coding artifacts. Finally, subjective listening tests were conducted to assess the subjective quality for different parameters and test content.
Click to purchase paper as a non-member or login as an AES member. If your company or school subscribes to the E-Library then switch to the institutional version. If you are not an AES member and would like to subscribe to the E-Library then Join the AES!
This paper costs $33 for non-members and is free for AES members and E-Library subscribers.
One of the biggest challenges still encountered with speech communication via a mobile phone is that it is sometimes very difficult to understand what is said when listening in a noisy place. In this paper a novel approach based on two models is introduced to increase speech intelligibility for a listener surrounded by environmental noise. One model perceptually optimizes the speech in the presence of simultaneous background noise; the other modifies the speech towards a more intelligible, naturally elicited speaking style. The two models are combined to provide more understandable speech even in a loud, noisy environment, including cases where the speech volume cannot be increased. The improvements in perceptual quality and intelligibility are shown by Perceptual Objective Listening Quality Assessment and Listening Effort Mean Opinion Score evaluation.
A method of down-sample-rate conversion is discussed that exploits processes of spectral-domain matching and pseudo non-linear convolution applied to discrete data frames as an alternative to conventional convolutional filter and sub-sampling techniques. Spectral-domain matching yields a complex sample sequence that can subsequently be converted into a real sequence using the Discrete Hilbert Transform. The method is shown to result in substantially reduced time dispersion compared to the standard convolutional approach, and it circumvents the choice of filter symmetry, such as linear phase or minimum phase. The formal analytic process is presented and validated through simulation, then adapted to digital-audio sample-rate conversion by using a multi-frame overlap-and-add process. It has been tested in both LPCM-to-LPCM and DSD-to-LPCM applications, where the latter can be simplified using a look-up code table.
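To illustrate the general idea of reducing the sample rate in the spectral domain rather than with a convolutional filter, here is a minimal sketch. It is not the paper's method: it uses plain spectral truncation of a single frame, with no spectral-domain matching, no Discrete Hilbert Transform step, and no overlap-and-add. The function name `decimate_frame` and the integer `factor` parameter are assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np

def decimate_frame(frame, factor):
    """Reduce the sample rate of one frame by an integer factor.

    Hypothetical helper: keeps only the spectral bins that fit below the
    new Nyquist frequency and resynthesizes a shorter real sequence.
    """
    frame = np.asarray(frame, dtype=float)
    n_in = frame.size
    if n_in % factor != 0:
        raise ValueError("frame length must be a multiple of factor")
    n_out = n_in // factor

    # One-sided spectrum of the real input frame.
    spectrum = np.fft.rfft(frame)

    # Keep bins 0 .. n_out/2; everything above the new Nyquist is discarded.
    kept = spectrum[: n_out // 2 + 1]

    # irfft normalizes by n_out, while the input was analyzed over n_in
    # samples, so rescale to preserve amplitude.
    return np.fft.irfft(kept, n=n_out) * (n_out / n_in)


n = np.arange(64)
low = np.sin(2 * np.pi * 5 * n / 64)     # 5 cycles per frame, kept
high = np.sin(2 * np.pi * 20 * n / 64)   # 20 cycles per frame, removed
result = decimate_frame(low + high, 4)
```

In this example the low-frequency component survives as if it had been sampled at every fourth point, while the component above the new Nyquist frequency is removed instead of aliasing. Because the frame is treated as periodic, real audio would need windowing and overlapping frames, which the sketch leaves out.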
Automatic detection of piano pedaling techniques is challenging because it depends on subtle nuances of piano timbre. In this paper we address this problem on single notes using decision-tree-based support vector machines. Features are extracted from harmonics and residuals based on physical acoustics considerations and signal observations. We consider four distinct pedaling techniques on the sustain pedal (anticipatory full, anticipatory half, legato full, and legato half pedaling) and create a new isolated-note dataset consisting of different pitches and velocities for each pedaling technique plus notes played without pedal. Experiments show the effectiveness of the designed features and the learned classifiers for discriminating pedaling techniques in the cross-validation trials.
Music production is a highly subjective task, which can be difficult to automate. Simple session structures can quickly expose complex mathematical tasks which are difficult to optimize. This paper presents a method for the reduction of masking in an unknown mix using genetic programming. The model uses results from a series of listening tests to guide its cost function. The program then returns a vector that best minimizes this cost. The paper explains the limitations of using such a method for audio as well as validating the results.
Previous research polled employers, new hires, and educators in the audio industry to identify what skills were most important, what skills new hires had, and what skills educators focused on in Audio Recording Production (ARP) programs. The Skills Students Learned (SSL) Survey used in this study polled 40 students from the U.S. and abroad to identify skills learned at ARP programs. Students reported their skill level before and after attending a formal ARP program via an online mixed-methods survey instrument. In the quantitative section, students reported an improvement in all skill levels upon completing their ARP training. In the qualitative section, students reported that communication skills and in-depth technical skills were missing from their programs and personal skill sets. This study recommends infusing these skills into existing ARP curricula.
This paper considers the various challenges, implications and pedagogical opportunities presented via a small-scale audio archiving project: School of Music RePlayed. Housed in the Australian National University’s School of Music, this historical archive of more than 1200 recital and concert tape recordings features multiple recordings of historical significance, yet presents with a number of issues pertaining to storage and tape deterioration. This paper first considers the challenges presented in the digitization of such an archive before focusing on the pedagogical opportunities afforded by such a unique project. Developed and run in conjunction with the National Film and Sound Archive of Australia, this unique project addresses both technological and pedagogical matters of preservation, heritage and digitization.
Since the late 1990s and early 2000s, the changing nature of the music industry has led to the demise of recording studios, which have decreased dramatically in number. This decline has led to a corresponding disappearance of the “teaboy” route, the traditional route whereby engineers, producers, and mixers (EPM) learned their craft. In the training vacuum that the demise of recording studios creates, how do EPM professionals now learn the skills and knowledge necessary to succeed in the music industry? Through primary research and in-depth interviews with leading EPM professionals and online education providers, this paper assesses the skills needed to become a successful EPM and explores whether the internet can ever replace the traditional teaboy route in educating the next generation of professionals. It concludes that there are currently significant limitations to internet learning of EPM skills, some of which might be overcome by new technological developments such as virtual reality.
In recent years, differences in auditory impression arising from differences in disc materials and media have been discussed. To investigate the causes of these differences, we used wavelet analysis to compare the sound pressure levels and interaural time differences [1] of three different Compact Discs. The analyses detected objective differences in sound even though the discs, made of different materials, hold the same data. They also indicate that the new Compact Disc called the “Ultimate Hi Quality Compact Disc”, made of photopolymer with a special alloy as the reflection film, reproduces more of the master sound than the conventional Compact Disc. We present the analysis method, evaluate these differences, and consider its application to various sound quality evaluations.
Recommendation ITU-R BS.1770 is now established throughout most of the broadcast industry. Program loudness is measured by summing K-weighted energy, and this summation typically involves material that is broadband in nature. We undertook listening tests to investigate the performance of the K-weighting filter in relation to the perceived loudness of narrower-band stimuli, namely octave-band pink noise and individual stems of a multitrack session. We propose two alternative filters based on the discrepancies found and evaluate their performance using different measurement window sizes. The new filters yield better accuracy for both pink noise stimuli and certain types of multitrack stem. Finally, we propose an informed set of parameters that may improve loudness prediction in automatic mixing systems.
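For readers unfamiliar with the baseline measurement, the following is a minimal sketch of K-weighted loudness for a single channel, under several assumptions: a 48 kHz sample rate, the two-stage biquad coefficients tabulated in BS.1770 for that rate, a channel weight of 1.0, and no gating. It illustrates the standard K-weighting only, not the alternative filters proposed in the paper. The function names `biquad` and `ungated_loudness` are chosen for this example.

```python
import math

# K-weighting coefficients for 48 kHz sampling, as tabulated in BS.1770.
# Stage 1: high-frequency shelving filter. Stage 2: high-pass filter.
SHELF_B = (1.53512485958697, -2.69169618940638, 1.19839281085285)
SHELF_A = (1.0, -1.69065929318241, 0.73248077421585)
HIGHPASS_B = (1.0, -2.0, 1.0)
HIGHPASS_A = (1.0, -1.99004745483398, 0.99007225036621)

def biquad(samples, b, a):
    """Direct-form I second-order IIR filter; a[0] is assumed to be 1."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def ungated_loudness(samples):
    """Ungated K-weighted loudness (LKFS) of one channel at 48 kHz."""
    if len(samples) == 0:
        raise ValueError("need at least one sample")
    weighted = biquad(biquad(samples, SHELF_B, SHELF_A),
                      HIGHPASS_B, HIGHPASS_A)
    mean_square = sum(v * v for v in weighted) / len(weighted)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite loudness
    return -0.691 + 10.0 * math.log10(mean_square)


# Two seconds of a full-scale 997 Hz sine, the usual calibration tone.
tone = [math.sin(2.0 * math.pi * 997.0 * n / 48000.0) for n in range(96000)]
tone_loudness = ungated_loudness(tone)
```

A full-scale 997 Hz sine in one channel should read close to -3.01 LKFS, which is a convenient sanity check. A narrow-band stimulus such as octave-band noise passes through the same fixed weighting curve, which is the behavior the listening tests above examine.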