Categorization of Broadcast Audio Objects in Complex Auditory Scenes
J. Woodcock, W. J. Davies, T. J. Cox, and F. Melchior, "Categorization of Broadcast Audio Objects in Complex Auditory Scenes," J. Audio Eng. Soc., vol. 64, no. 6, pp. 380-394 (June 2016). doi: https://doi.org/10.17743/jaes.2016.0007
Abstract: Because object-based audio is becoming an important framework for the representation of complex sound scenes, this research describes a series of experiments to determine a categorization framework for broadcast audio objects. Categorization is a fundamental human strategy for reducing cognitive load, and knowledge of these categories should be beneficial for the development of perceptually based representations and rendering strategies for object-based audio. In this study, 21 expert and non-expert listeners took part in a free card sorting task using audio objects from a variety of different types of program material. Hierarchical agglomerative clustering suggests that there are 7 general categories, which relate to sounds indicating actions and movement, continuous background sound, transient background sound, clear speech, non-diegetic music and effects, sounds indicating the presence of people, and prominent attention-grabbing transient sounds. A three-dimensional perceptual space calculated via multidimensional scaling suggests that these categories vary along the dimensions of semantic content, continuous-transient, and presence-absence of people. The position of an audio object along the dimensions of the perceptual space relates to its perceived importance.
@article{woodcock2016categorization,
  author  = {Woodcock, James and Davies, William J. and Cox, Trevor J. and Melchior, Frank},
  title   = {Categorization of Broadcast Audio Objects in Complex Auditory Scenes},
  journal = {Journal of the Audio Engineering Society},
  year    = {2016},
  month   = jun,
  volume  = {64},
  number  = {6},
  pages   = {380--394},
  doi     = {10.17743/jaes.2016.0007}
}
TY  - JOUR
TI  - Categorization of Broadcast Audio Objects in Complex Auditory Scenes
SP  - 380
EP  - 394
AU  - Woodcock, James
AU  - Davies, William J.
AU  - Cox, Trevor J.
AU  - Melchior, Frank
PY  - 2016
JO  - Journal of the Audio Engineering Society
IS  - 6
VL  - 64
Y1  - 2016/06//
DO  - 10.17743/jaes.2016.0007
ER  - 
Open Access
Authors:
Woodcock, James; Davies, William J.; Cox, Trevor J.; Melchior, Frank
Affiliations:
Acoustics Research Centre, University of Salford, Salford, United Kingdom; BBC R&D, Dock House, MediaCityUK, Salford, United Kingdom (see document for exact affiliation information)
JAES Volume 64 Issue 6 pp. 380-394; June 2016
Publication Date:
June 27, 2016
Permalink:
http://www.aes.org/e-lib/browse.cfm?elib=18297