Prediction of timbral, spatial, and overall audio quality with independent auditory feature mapping
Y. Qiao, N. Zacharov, and P. F. Hoffmann, "Prediction of timbral, spatial, and overall audio quality with independent auditory feature mapping," Express Paper 48, AES Convention 153 (2022 October). doi:
Abstract: Listening tests are regarded as the “gold standard” in evaluating the perceptual quality of audio systems. With the surge of applications in virtual and augmented reality, the demand for audio quality evaluations that are more efficient than listening tests has greatly increased. Auditory models are an attractive tool for this purpose, and can greatly complement listening tests. A machine-learning-based model for predicting timbral, spatial, and overall audio quality is presented. When both timbral and spatial attributes are considered, existing models (e.g., MoBi-Q [1]) often assume minimum interaction between the two attributes, and combine their respective quality predictions into a single overall quality judgement. To validate such an assumption, a listening test with various timbral and spatial distortions was conducted. Results revealed a strong correlation between the two quality attributes when moderate distortion is present. Based on this observation, the proposed model preserves the original front-end of MoBi-Q for feature extraction and uses a simple neural network as the decision module that independently maps auditory features to timbral, spatial, and overall quality scores with no explicit assumptions. Using available third-party datasets, our proposed model showed a significantly higher correlation with subjective scores than MoBi-Q for timbral and overall quality. The assessment of spatial audio quality is still ongoing.
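To make the described architecture concrete, the following is a minimal sketch (in PyTorch, and explicitly not the authors' implementation) of a decision module of the kind the abstract describes: a simple feed-forward network that maps the same auditory-feature vector independently to timbral, spatial, and overall quality scores, with no hand-coded combination rule between the attribute predictions. The feature dimension, hidden size, and 0-100 score scale are illustrative assumptions; the paper's front-end (MoBi-Q features), loss, and hyperparameters are not reproduced here.

```python
# Illustrative sketch only -- not the model from the paper. It shows a
# "decision module" that maps a fixed-length vector of auditory features
# to three independently predicted quality scores (timbral, spatial,
# overall). Feature dimension, hidden size, and score scale are assumed.
import torch
import torch.nn as nn

N_FEATURES = 32  # assumed size of the pooled auditory-feature vector


class QualityDecisionModule(nn.Module):
    def __init__(self, n_features: int = N_FEATURES, hidden: int = 16):
        super().__init__()

        # One small head per attribute: the three mappings share no
        # weights, so no explicit relation between timbral, spatial,
        # and overall quality is imposed by the architecture.
        def head() -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(n_features, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
                nn.Sigmoid(),  # squash to (0, 1), rescaled below
            )

        self.timbral = head()
        self.spatial = head()
        self.overall = head()

    def forward(self, features: torch.Tensor) -> dict:
        # Rescale to a 0-100 range, as in MUSHRA-style quality scores.
        return {
            "timbral": 100 * self.timbral(features),
            "spatial": 100 * self.spatial(features),
            "overall": 100 * self.overall(features),
        }


if __name__ == "__main__":
    model = QualityDecisionModule()
    x = torch.randn(4, N_FEATURES)  # a batch of 4 feature vectors
    scores = model(x)
    for name, s in scores.items():
        print(name, s.squeeze(1).tolist())
```

Training such heads against mean opinion scores with a regression loss (e.g., MSE) would be one straightforward choice; whether the paper uses that loss, or a single shared trunk rather than separate heads, is not specified here.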
@article{qiao2022prediction,
author={Qiao, Yue and Zacharov, Nick and Hoffmann, Pablo F.},
journal={Journal of the Audio Engineering Society},
title={Prediction of timbral, spatial, and overall audio quality with independent auditory feature mapping},
year={2022},
volume={},
number={},
pages={},
doi={},
month={October},
abstract={Listening tests are regarded as the “gold standard” in evaluating the perceptual quality of audio systems. With the surge of applications in virtual and augmented reality, the demand for audio quality evaluations that are more efficient than listening tests has greatly increased. Auditory models are an attractive tool for this purpose, and can greatly complement listening tests. A machine-learning-based model for predicting timbral, spatial, and overall audio quality is presented. When both timbral and spatial attributes are considered, existing models (e.g., MoBi-Q [1]) often assume minimum interaction between the two attributes, and combine their respective quality predictions into a single overall quality judgement. To validate such an assumption, a listening test with various timbral and spatial distortions was conducted. Results revealed a strong correlation between the two quality attributes when moderate distortion is present. Based on this observation, the proposed model preserves the original front-end of MoBi-Q for feature extraction and uses a simple neural network as the decision module that independently maps auditory features to timbral, spatial, and overall quality scores with no explicit assumptions. Using available third-party datasets, our proposed model showed a significantly higher correlation with subjective scores than MoBi-Q for timbral and overall quality. The assessment of spatial audio quality is still ongoing.},}
TY - JOUR
TI - Prediction of timbral, spatial, and overall audio quality with independent auditory feature mapping
SP -
EP -
AU - Qiao, Yue
AU - Zacharov, Nick
AU - Hoffmann, Pablo F.
PY - 2022
JO - Journal of the Audio Engineering Society
IS -
VL -
Y1 - October 2022
AB - Listening tests are regarded as the “gold standard” in evaluating the perceptual quality of audio systems. With the surge of applications in virtual and augmented reality, the demand for audio quality evaluations that are more efficient than listening tests has greatly increased. Auditory models are an attractive tool for this purpose, and can greatly complement listening tests. A machine-learning-based model for predicting timbral, spatial, and overall audio quality is presented. When both timbral and spatial attributes are considered, existing models (e.g., MoBi-Q [1]) often assume minimum interaction between the two attributes, and combine their respective quality predictions into a single overall quality judgement. To validate such an assumption, a listening test with various timbral and spatial distortions was conducted. Results revealed a strong correlation between the two quality attributes when moderate distortion is present. Based on this observation, the proposed model preserves the original front-end of MoBi-Q for feature extraction and uses a simple neural network as the decision module that independently maps auditory features to timbral, spatial, and overall quality scores with no explicit assumptions. Using available third-party datasets, our proposed model showed a significantly higher correlation with subjective scores than MoBi-Q for timbral and overall quality. The assessment of spatial audio quality is still ongoing.
Authors:
Qiao, Yue; Zacharov, Nick; Hoffmann, Pablo F.
Affiliations:
Princeton University, NJ, USA; Reality Labs, Meta, Redmond, WA, USA; Reality Labs, Meta, Redmond, WA, USA (See document for exact affiliation information.)
Express Paper 48; AES Convention 153; October 2022
Publication Date:
October 19, 2022
Subject:
Spatial Audio
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=21931