A Machine Learning Approach to Detecting Sound-Source Elevation in Adverse Environments
Citation & Abstract
H. O'Dwyer, E. Bates, and F. M. Boland, "A Machine Learning Approach to Detecting Sound-Source Elevation in Adverse Environments," Paper 9968, (2018 May). doi:
Abstract: Recent studies have shown that Deep Neural Networks (DNNs) are capable of detecting sound-source azimuth direction in adverse environments to a high level of accuracy. This paper expands on these findings by presenting research that explores the use of DNNs in determining sound-source elevation. A simple machine-hearing system is presented that is capable of predicting source elevation to a relatively high degree of accuracy in both anechoic and reverberant environments. Speech signals spatialized across the front hemifield of the head are used to train a feedforward neural network. The effectiveness of Gammatone Filter Energies (GFEs) and the Cross-Correlation Function (CCF) in estimating elevation is investigated, as well as that of binaural cues such as Interaural Time Difference (ITD) and Interaural Level Difference (ILD). Using a combination of these cues, it was found that elevation could be predicted to within 10 degrees with an accuracy upward of 80%.
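The abstract names a feature set (GFEs, CCF, ITD, ILD) feeding a feedforward network. The following is a minimal sketch, not the authors' code, of how such binaural cues might be extracted from one stereo (left/right ear) frame and passed to a small feedforward classifier. The Butterworth band-pass filterbank standing in for a gammatone filterbank, the band count, the +/- 1 ms ITD search range, and the network sizes are all illustrative assumptions, not values from the paper.

# Minimal sketch (assumed parameters, not the paper's configuration).
import numpy as np
from scipy.signal import butter, sosfilt, correlate
from sklearn.neural_network import MLPClassifier

FS = 16000          # sample rate (assumed)
MAX_ITD_S = 1e-3    # physiologically plausible ITD search range (+/- 1 ms)

def band_energies(x, fs=FS, n_bands=32, f_lo=80.0, f_hi=7500.0):
    """Per-band log energies: a stand-in for Gammatone Filter Energies (GFEs),
    here a simple Butterworth band-pass filterbank on a logarithmic scale."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    energies = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        y = sosfilt(sos, x)
        energies.append(np.log(np.mean(y ** 2) + 1e-12))
    return np.array(energies)

def binaural_cues(left, right, fs=FS):
    """Cross-Correlation Function (CCF), ITD, and ILD for one frame."""
    max_lag = int(MAX_ITD_S * fs)
    ccf_full = correlate(left, right, mode="full")
    mid = len(ccf_full) // 2
    ccf = ccf_full[mid - max_lag: mid + max_lag + 1]         # restrict to +/- 1 ms
    itd = (np.argmax(ccf) - max_lag) / fs                    # lag of the CCF peak
    ild = 10.0 * np.log10((np.mean(left ** 2) + 1e-12) /
                          (np.mean(right ** 2) + 1e-12))     # level ratio in dB
    return ccf / (np.max(np.abs(ccf)) + 1e-12), itd, ild

def frame_features(left, right):
    """Concatenate per-ear band energies, the CCF, ITD, and ILD into one vector."""
    ccf, itd, ild = binaural_cues(left, right)
    return np.concatenate([band_energies(left), band_energies(right),
                           ccf, [itd, ild]])

# Feedforward network mapping feature vectors to elevation classes
# (e.g. 10-degree bins over the front hemifield); layer sizes are placeholders.
net = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=500)
# Training would then look like:
#   X = np.stack([frame_features(l, r) for l, r in binaural_frames])
#   net.fit(X, elevation_bin_labels)

Framing elevation as classification into 10-degree bins is one reasonable reading of the reported "to within 10 degrees" accuracy; a regression network over continuous elevation angles would be an equally plausible setup.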
@article{odwyer2018a,
  author={O'Dwyer, Hugh and Bates, Enda and Boland, Francis M.},
  journal={Journal of the Audio Engineering Society},
  title={A Machine Learning Approach to Detecting Sound-Source Elevation in Adverse Environments},
  year={2018},
  month={May},
}
TY - paper
TI - A Machine Learning Approach to Detecting Sound-Source Elevation in Adverse Environments
AU - O'Dwyer, Hugh
AU - Bates, Enda
AU - Boland, Francis M.
PY - 2018
JO - Journal of the Audio Engineering Society
Y1 - May 2018
Authors:
O'Dwyer, Hugh; Bates, Enda; Boland, Francis M.
Affiliation:
Trinity College, Dublin, Ireland
AES Convention:
144 (May 2018)
Paper Number:
9968
Publication Date:
May 14, 2018
Subject:
Posters: Modeling
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=19485