AES E-Library

AES E-Library Search Results

Search Results (Displaying 1-10 of 103 matches)

Improved Psychoacoustic Model for Efficient Perceptual Audio Codecs

Since early perceptual audio coders such as mp3, the underlying psychoacoustic model that controls the encoding process has not undergone many dramatic changes. Meanwhile, modern audio coders have been equipped with semi-parametric or parametric coding tools such as audio bandwidth extension. As a result, the initial psychoacoustic model used in a perceptual coder, which considers only added quantization noise, has become partly unsuitable. We propose the use of an improved psychoacoustic excitation model based on an existing model proposed by Dau et al. in 1997. Because this modulation-based model calculates an internal auditory representation, it is essentially independent of the input waveform. Using the example of MPEG-H 3D Audio and its semi-parametric Intelligent Gap Filling (IGF) tool, we demonstrate that we can successfully control the IGF parameter selection process to achieve overall improved perceptual quality.

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


On the Influence of Cultural Differences on the Perception of Audio Coding Artifacts in Music

Modern audio codecs are used all over the world, reaching listeners with many different cultures and languages. This study investigates if and how cultural background influences the perception and preference of different audio coding artifacts, focusing on musical content. A subjective listening test was designed to directly compare different types of audio coding and was performed with Mandarin Chinese and German speaking listeners. Overall comparison showed largely consistent results, affirming the validity of the proposed test method. Differential comparison indicates preferences for certain artifacts in different listener groups, e.g., Chinese listeners tended to grade tonality mismatch higher and pre-echoes worse compared to German listeners, and musicians preferred bandwidth limitation over tonality mismatch when compared to non-musicians.

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


Perception of Phase Changes in the Context of Musical Audio Source Separation

This study investigates the perceptual consequences of phase change in conventional magnitude-based source separation. A listening test was conducted in which the participants compared three different source separation scenarios, each with two phase retrieval cases: phase from the original mix or from the target source. The participants’ ratings of similarity to the reference showed that (1) the difference between the mix phase and the perfect target phase was perceivable in the majority of cases, with some song-dependent exceptions, and (2) use of the mix phase degraded the perceived quality even in the case of perfect magnitude separation. The findings imply that there is room for perceptual improvement by attempting correct phase reconstruction in addition to achieving better magnitude-based separation.
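
The two phase retrieval cases can be illustrated with a small numerical sketch. This is not the authors' test setup: it assumes a single DFT frame instead of an STFT, two hypothetical synthetic sources that overlap in one frequency bin, and an oracle (perfect) target magnitude. All signal parameters and function names are made up for illustration.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def with_phase(mag_spec, phase_spec):
    """Combine the magnitudes of one spectrum with the phases of another."""
    return [abs(m) * cmath.exp(1j * cmath.phase(p)) for m, p in zip(mag_spec, phase_spec)]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

N = 64  # frame length (hypothetical)
# Hypothetical sources: both have energy in bin 4, with different phases.
target = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
other = [0.8 * math.sin(2 * math.pi * 4 * t / N + 1.0)
         + 0.5 * math.sin(2 * math.pi * 9 * t / N) for t in range(N)]
mix = [a + b for a, b in zip(target, other)]

target_spec = dft(target)
mix_spec = dft(mix)

# Case 1: perfect magnitude combined with the target's own phase.
est_target_phase = idft(with_phase(target_spec, target_spec))
# Case 2: perfect magnitude combined with the phase of the mix.
est_mix_phase = idft(with_phase(target_spec, mix_spec))

err_target_phase = rms_error(est_target_phase, target)
err_mix_phase = rms_error(est_mix_phase, target)
print(f"RMS error with target phase: {err_target_phase:.2e}")
print(f"RMS error with mix phase:    {err_mix_phase:.3f}")
```

In this toy case the waveform error with the target phase is at the level of floating-point rounding, while the mix phase leaves a clear residual error even though the magnitude is exact. Whether such a difference is audible is a perceptual question that a waveform error cannot answer, which is what the listening test addresses.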

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


Method for Quantitative Evaluation of Auditory Perception of Nonlinear Distortion: Part II – Metric for Music Signal Tonality and its Impact on Subjective Perception of Distortions

In the first part of the paper we noted that the impact of audible nonlinear distortions on subjective listener preference is strongly dependent on the spectral structure of a test signal. In the second part we propose a method for considering the spectral characteristics of a test signal in the evaluation of the subjective perception of audible nonlinear distortions. To describe the tonal structure of a music signal, a qualitative characteristic, tonality, is taken as a metric, and a tonality coefficient is proposed as a measure of this characteristic. Subjective listening tests were performed to estimate how the auditory perception of nonlinear distortions depends on the tonal structure of a signal and the spectral distribution of the noise-to-mask ratio (NMR).
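
For readers unfamiliar with the NMR, the following minimal sketch shows the usual per-band definition: the ratio of distortion (noise) energy to the masking threshold, expressed in dB. The band energies and thresholds are invented numbers, not data from the paper, and the paper's tonality coefficient is not reproduced here.

```python
import math

def noise_to_mask_ratio_db(noise_energy, mask_threshold):
    """Per-band NMR in dB: 10 * log10(noise energy / masking threshold)."""
    return [10.0 * math.log10(n / m) for n, m in zip(noise_energy, mask_threshold)]

# Hypothetical per-band values in linear energy units.
noise = [1e-6, 4e-5, 2e-4]   # distortion energy in each band
mask = [1e-5, 4e-5, 5e-5]    # masking threshold in each band

nmr_db = noise_to_mask_ratio_db(noise, mask)

# Bands with NMR above 0 dB have distortion exceeding the masking threshold.
bands_above_threshold = [band for band, value in enumerate(nmr_db) if value > 0.0]

for band, value in enumerate(nmr_db):
    print(f"band {band}: NMR = {value:+.1f} dB")
```

With these numbers the three bands come out at roughly -10 dB, 0 dB, and +6 dB, so only the last band would be expected to carry potentially audible distortion.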

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


Developing a Method for the Subjective Evaluation of Smartphone Music Playback

To determine the preferred audio characteristics for media playback over smartphones, a series of controlled double-blind listening experiments was run to evaluate the subjective playback quality of six high-end smartphones. Listeners rated products based on their audio quality preference and left comments categorized by attribute. The devices were tested in different orientations in level-matched and maximum-volume scenarios. Positional variation and biases were accounted for using a motorized turntable, and audio playback was controlled remotely with remote-access software. Test results were compared to spatially averaged measurements made using a multitone stimulus and demonstrate that the smoothness of the frequency response is the most important aspect in smartphone preference. Low-frequency extension, decreased levels of nonlinear distortion, and higher maximum playback level did not correlate with higher phone ratings.

Open Access

This paper is Open Access which means you can download it for free.


Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results

Subjective experiments are a cornerstone of modern research, with a variety of tasks being undertaken by subjects. In the field of audio, subjective listening tests provide validation for research and aid fair comparison between techniques or devices such as coding performance, speakers, mixes, and source separation systems. Several interfaces have been designed to mitigate biases and to standardize procedures, enabling indirect comparisons. The number of different combinations of interface and test design makes it extremely difficult to conduct a truly unbiased listening test. This paper resolves the largest of these variables by identifying the impact the interface itself has on a purely auditory test. This information is used to make recommendations for specific categories of listening tests.

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


Musical Instrument Synthesis and Morphing in Multidimensional Latent Space Using Variational, Convolutional Recurrent Autoencoders

In this work we propose a deep-learning-based method for musical instrument synthesis: variational, convolutional recurrent autoencoders (VCRAE). This method utilizes the higher-level time-frequency representations extracted by the convolutional and recurrent layers to learn a Gaussian distribution in the training stage, which is later used to infer unique samples through interpolation of multiple instruments in the usage stage. The reconstruction performance of VCRAE is evaluated by proxy through an instrument classifier and provides significantly better accuracy than two other baseline autoencoder methods. The synthesized samples for the combinations of 15 different instruments are available on the companion website.

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


Music Enhancement by a Novel CNN Architecture

This paper is concerned with music enhancement by removal of coding artifacts and recovery of acoustic characteristics that preserve the sound quality of the original music content. To achieve this, we propose a novel convolutional neural network (CNN) architecture called FTD (Frequency-Time Dependent) CNN, which utilizes correlation and context information across the spectral and temporal dependencies of music signals. Experimental results show that both subjective and objective sound quality metrics are significantly improved. This unique way of applying a CNN to exploit global dependency across frequency bins may effectively restore information that is corrupted by coding artifacts in compressed music content.

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.


The New Dynamics Processing Effect in Android Open Source Project

The Android “P” Audio Framework’s new Dynamics Processing Effect (DPE) in the Android Open Source Project (AOSP) provides developers with controls to fine-tune the audio experience using several stages of equalization, multi-band compressors, and linked limiters. The API allows developers to configure the DPE’s multichannel architecture to exercise real-time control over thousands of audio parameters. This talk additionally discusses the design and use of DPE in the recently announced Sound Amplifier accessibility service for Android and outlines other uses for acoustic compensation and hearing applications.

Open Access

This paper is Open Access which means you can download it for free.


On the Physiological Validity of the Group Delay Response of All-Pole Vocal Tract Modeling

Magnitude-oriented approaches dominate the voice analysis front-ends of most current technologies addressing, e.g., speaker identification, speech coding/compression, and voice reconstruction and re-synthesis. A popular technique is all-pole vocal tract modeling. The phase response of all-pole models is known to be non-linear and highly dependent on the magnitude frequency response. In this paper we use a shift-invariant phase-related feature that is estimated from signal harmonics in order to study the impact of all-pole models on the phase structure of voiced sounds. We relate that impact to the phase structure that is found in natural voiced sounds to draw conclusions about the physiological validity of the group delay of all-pole vocal tract modeling. Our findings emphasize that harmonic phase models are idiosyncratic, and this is important in speaker identification and in fostering the quality and naturalness of synthetic and reconstructed speech.
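
The coupling between magnitude and phase in an all-pole model can be seen in a small sketch. It is not the paper's harmonic phase feature: it assumes a hypothetical two-pole resonator (a single formant-like peak) and estimates the group delay by a numerical derivative of the phase. The pole radii and angle are arbitrary illustration values.

```python
import cmath
import math

def all_pole_response(a, w):
    """Frequency response of H(z) = 1 / sum_k a[k] * z^-k at angular frequency w."""
    z_inv = cmath.exp(-1j * w)
    return 1.0 / sum(ak * z_inv ** k for k, ak in enumerate(a))

def group_delay(a, w, h=1e-5):
    """Group delay in samples: minus the derivative of phase with respect to w.

    The phase difference is taken from the ratio of two nearby responses,
    which avoids phase-wrapping problems.
    """
    ratio = all_pole_response(a, w + h) / all_pole_response(a, w - h)
    return -cmath.phase(ratio) / (2.0 * h)

def resonator(r, theta):
    """Denominator coefficients for a conjugate pole pair at r * exp(+/- j*theta)."""
    return [1.0, -2.0 * r * math.cos(theta), r * r]

theta = math.pi / 4                      # resonance angle (hypothetical)
grid = [math.pi * k / 512 for k in range(1, 512)]

sharp = resonator(0.95, theta)           # narrow, high-gain resonance
broad = resonator(0.70, theta)           # wide, low-gain resonance

peak_mag_freq = max(grid, key=lambda w: abs(all_pole_response(sharp, w)))
peak_gd_freq = max(grid, key=lambda w: group_delay(sharp, w))
gd_sharp = group_delay(sharp, theta)
gd_broad = group_delay(broad, theta)

print(f"magnitude peak at {peak_mag_freq:.3f} rad, group delay peak at {peak_gd_freq:.3f} rad")
print(f"group delay at resonance: sharp = {gd_sharp:.1f}, broad = {gd_broad:.1f} samples")
```

In this sketch the group delay peaks at essentially the same frequency as the magnitude, and a sharper resonance gives a larger group delay, so the model's phase structure is dictated by its magnitude fit and is not free to follow the phase of a natural voice.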

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.

AES - Audio Engineering Society