On the Use of Bottleneck Features of CNN Auto-Encoder for Personalized HRTFs
Citation & Abstract
G. W. Lee, J. M. Moon, C. J. Chun, and H. K. Kim, "On the Use of Bottleneck Features of CNN Auto-Encoder for Personalized HRTFs," Paper 10023, (May 2018).
Abstract: The most effective way of providing immersive sound effects is to use head-related transfer functions (HRTFs). HRTFs are defined by the path from a given sound source to the listener's ears. However, sound propagation by HRTFs differs slightly between people because the head, body, and ears differ for each person. Recently, a method for estimating HRTFs using a neural network has been developed, where anthropometric pinna measurements and head-related impulse responses (HRIRs) are used as the input and output layer of the neural network. However, it is inefficient to accurately measure such anthropometric data. This paper proposes a feature extraction method for the ear image instead of measuring anthropometric pinna measurements directly. The proposed method utilizes the bottleneck features of a convolutional neural network (CNN) auto-encoder from the edge detected ear image. The proposed feature extraction method using the CNN-based auto-encoder will be incorporated into the HRTF estimation approach.
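The abstract describes a pipeline: edge-detect the ear image, feed it to a CNN auto-encoder, and use the bottleneck activations as a compact feature vector in place of hand-measured pinna dimensions. As a rough illustration of that data flow only (the paper trains a CNN auto-encoder; the sketch below substitutes a fixed Sobel edge detector and simple average pooling for the learned encoder, and every name in it is hypothetical):

```python
def sobel_edges(img):
    """Approximate edge magnitude of a 2-D grayscale image (list of lists),
    standing in for the paper's edge-detection preprocessing step."""
    h, w = len(img), len(img[0])
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal Sobel kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical Sobel kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def bottleneck_features(edge_img, grid=4):
    """Average-pool the edge map to a grid x grid vector. In the actual
    method this low-dimensional code comes from the auto-encoder's trained
    bottleneck layer, not from fixed pooling."""
    h, w = len(edge_img), len(edge_img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            ys = range(gy * h // grid, (gy + 1) * h // grid)
            xs = range(gx * w // grid, (gx + 1) * w // grid)
            vals = [edge_img[y][x] for y in ys for x in xs]
            feats.append(sum(vals) / len(vals))
    return feats

# Toy 16x16 "ear image": a bright square on a dark background.
img = [[1.0 if 4 <= y < 12 and 4 <= x < 12 else 0.0 for x in range(16)]
       for y in range(16)]
feats = bottleneck_features(sobel_edges(img))
print(len(feats))  # prints 16: a 16-dimensional feature vector
```

Such a fixed-size vector would then serve as the input layer of the HRTF-estimation network, replacing the anthropometric pinna measurements used in prior work.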
@inproceedings{lee2018on,
  author    = {Lee, Geon Woo and Moon, Jung Min and Chun, Chan Jun and Kim, Hong Kook},
  title     = {On the Use of Bottleneck Features of {CNN} Auto-Encoder for Personalized {HRTFs}},
  booktitle = {Audio Engineering Society Convention 144},
  number    = {10023},
  year      = {2018},
  month     = {May},
}
TY - CPAPER
TI - On the Use of Bottleneck Features of CNN Auto-Encoder for Personalized HRTFs
AU - Lee, Geon Woo
AU - Moon, Jung Min
AU - Chun, Chan Jun
AU - Kim, Hong Kook
T2 - Audio Engineering Society Convention 144
PY - 2018
Y1 - May 2018
ER -
Authors:
Lee, Geon Woo; Moon, Jung Min; Chun, Chan Jun; Kim, Hong Kook
Affiliations:
Korea Institute of Civil Engineering and Building Technology (KICT), Goyang, Korea; Gwangju Institute of Science and Technology (GIST), Gwangju, Korea (See document for exact affiliation information.)
AES Convention:
144 (May 2018)
Paper Number:
10023
Publication Date:
May 14, 2018
Subject:
Posters: Spatial Audio
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=19419