Synthesis of Spatially Extended Virtual Source with Time-Frequency Decomposition of Mono Signals
T. Pihlajamäki, O. Santala, and V. Pulkki, "Synthesis of Spatially Extended Virtual Source with Time-Frequency Decomposition of Mono Signals," J. Audio Eng. Soc., vol. 62, no. 7/8, pp. 467-484, (2014 July.). doi: https://doi.org/10.17743/jaes.2014.0031
Abstract: Auditory displays, driven by nonauditory data, are often used to present a sound scene to a listener. Typically, the sound field places sound objects at different locations, but the scene becomes aurally richer if the perceived sonic objects have a spatial extent (size), called volumetric virtual coding. Previous research in virtual-world Directional Audio Coding has shown that spatial extent can be synthesized from monophonic sources by applying a time-frequency-space decomposition, i.e., randomly distributing time-frequency bins of the source signal. This technique does not guarantee a stable size and the timbre can degrade. This study explores how to optimize volumetric coding in terms of timbral and spatial perception. The suggested approach for most types of audio uses an STFT window size of 1024 samples and then distributes the frequency bands from lowest to highest using the Halton sequence. The results from two formal listening experiments are presented.
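The distribution step named in the abstract can be illustrated with a minimal sketch. Only the 1024-sample STFT window and the use of the Halton sequence over frequency bands ordered from lowest to highest come from the abstract; the base-2 sequence, the 60-degree extent, the one-band-per-STFT-bin grouping, and the function names `halton` and `assign_azimuths` are assumptions made for illustration and are not taken from the paper.

```python
def halton(index, base=2):
    """Radical inverse of a positive integer index in the given base.

    Returns a value in the open interval (0, 1).
    """
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def assign_azimuths(window_size=1024, extent_deg=60.0, base=2):
    """Assign one azimuth (in degrees) to each STFT bin, lowest bin first.

    Assumptions (not from the paper): one band per bin, a base-2 Halton
    sequence, and a source extent centered on 0 degrees.
    """
    n_bins = window_size // 2 + 1  # non-negative frequency bins of a real FFT
    return [(halton(k + 1, base) - 0.5) * extent_deg for k in range(n_bins)]


azimuths = assign_azimuths()
print(len(azimuths), azimuths[:4])
```

Because the Halton sequence is a low-discrepancy sequence, consecutive bands land far apart within the extent rather than clustering, which is one plausible reason to prefer it over purely random assignment. A real renderer would still need to map each azimuth to loudspeaker gains, which this sketch leaves out.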
@article{pihlajamäki2014synthesis,
author={Pihlajamäki, Tapani and Santala, Olli and Pulkki, Ville},
journal={Journal of the Audio Engineering Society},
title={Synthesis of Spatially Extended Virtual Source with Time-Frequency Decomposition of Mono Signals},
year={2014},
volume={62},
number={7/8},
pages={467-484},
doi={https://doi.org/10.17743/jaes.2014.0031},
month={july},}
TY - paper
TI - Synthesis of Spatially Extended Virtual Source with Time-Frequency Decomposition of Mono Signals
SP - 467
EP - 484
AU - Pihlajamäki, Tapani
AU - Santala, Olli
AU - Pulkki, Ville
PY - 2014
JO - Journal of the Audio Engineering Society
IS - 7/8
VO - 62
VL - 62
Y1 - July 2014
Open Access
Authors:
Pihlajamäki, Tapani; Santala, Olli; Pulkki, Ville
Affiliation:
Aalto University, Department of Signal Processing and Acoustics, Helsinki, Finland
JAES Volume 62 Issue 7/8 pp. 467-484; July 2014
Publication Date:
August 22, 2014
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=17339