Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks
C. J. Chun, S. H. Jeong, S. Y. Park, and H. K. Kim, "Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks," Paper 9400, (October 2015).
Abstract: In this paper we propose a method of extending monaural sound into stereophonic sound based on deep neural networks (DNNs). First, the monaural signals are assumed to be the mid signals of the extended stereo signals. The residual signals are then obtained by performing linear prediction (LP) analysis, and the LP coefficients of the monaural signals are converted into line spectral frequency (LSF) coefficients. The LSF coefficients are taken as the DNN features, and the features of the side signals are estimated from those of the mid signals. The performance of the proposed method is evaluated using a log spectral distortion (LSD) measure and a multiple stimuli with hidden reference and anchor (MUSHRA) test. The performance comparison shows that the proposed method provides a lower LSD and a higher MUSHRA score than a conventional method using a hidden Markov model (HMM).
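The mid/side framing and the LSD measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the mid/side convention (mid = (L + R) / 2, side = (L - R) / 2) and the per-frame LSD definition (root-mean-square difference of log power spectra, in dB) are assumptions, and all function names are hypothetical. The LP analysis, LSF conversion, and DNN stages are not shown.

```python
import cmath
import math


def mid_side_encode(left, right):
    """Assumed convention: mid = (L + R) / 2, side = (L - R) / 2."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def mid_side_decode(mid, side):
    """Inverse of mid_side_encode: L = mid + side, R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2 (illustration only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def log_spectral_distortion(reference, estimate, eps=1e-12):
    """Assumed definition: RMS over bins of the dB difference between
    the power spectra of one reference frame and one estimated frame."""
    p_ref = power_spectrum(reference)
    p_est = power_spectrum(estimate)
    diffs = [10.0 * math.log10((a + eps) / (b + eps))
             for a, b in zip(p_ref, p_est)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    left = [1.0, 0.5, -0.25, 0.0]
    right = [0.0, 0.5, 0.25, -1.0]
    mid, side = mid_side_encode(left, right)
    # In the setting described above, only `mid` is available and
    # `side` is what has to be estimated.
    print(mid_side_decode(mid, side))
    print(log_spectral_distortion(left, left))
```

Under this convention, a mono-to-stereo extension amounts to estimating `side` from `mid` and then decoding; the LSD of an estimated frame against itself is zero, and larger values indicate a larger spectral mismatch.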
@article{chun2015extension,
author={Chun, Chan Jun and Jeong, Seok Hee and Park, Su Yeon and Kim, Hong Kook},
journal={Journal of the Audio Engineering Society},
title={Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks},
year={2015},
volume={},
number={},
pages={},
doi={},
month={October},
}
TY - paper
TI - Extension of Monaural to Stereophonic Sound Based on Deep Neural Networks
SP -
EP -
AU - Chun, Chan Jun
AU - Jeong, Seok Hee
AU - Park, Su Yeon
AU - Kim, Hong Kook
PY - 2015
JO - Journal of the Audio Engineering Society
IS -
VO -
VL -
Y1 - October 2015
Authors:
Chun, Chan Jun; Jeong, Seok Hee; Park, Su Yeon; Kim, Hong Kook
Affiliations:
Gwangju Institute of Science and Technology (GIST), Gwangju, Korea; City University of New York, New York, NY, USA (See document for exact affiliation information.)
AES Convention:
139 (October 2015)
Paper Number:
9400
Publication Date:
October 23, 2015
Subject:
Signal Processing
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=17957