The Uncanny Valley of Spatial Voice
In computer animation there is a known dip in comfort as a function of the fidelity and likeness of a human image. This paper investigates the same subjective phenomenon in spatial voice. Because voice signals are very familiar, they are likely to exhibit a similar comfort trajectory. One possible theoretical explanation is that large errors tend to create a sense of distance and are accepted as degradation in the channel; for example, we tolerate low fidelity in most remote voice communication. As the channel improves and the errors become smaller, they may instead be associated with the source, that is, the person speaking, and such errors may trigger a sense of unease. In any communications system, practical considerations often lead to distortion in the capture, transport and reproduction of voice, and attempts to disguise or mask that distortion may lead the subject to perceive disturbing abnormalities. This paper combines a literature review in this area of perception, some hypotheses, and some development experiences related to deliberate and adverse distortion of the acoustic and spatial aspects of a sound field containing voice. An analysis framework is suggested for considering the relationship between the nature of a disturbance and its potential to produce disturbing or uncanny experiences. The aspects that could be investigated are numerous; the focus of this paper is to present a framework that may help to understand and estimate the potential impact of the uncanny, and to present examples for possible further investigation.