The Uncanny Valley of Spatial Voice
In computer animation there is a known dip in comfort level as a function of the fidelity and likeness of a human image. This paper investigates the same subjective phenomenon in the area of spatial voice. Because voice signals are very familiar, they are likely to exhibit a similar trajectory of comfort. One possible theoretical explanation is that large errors tend to create a sense of distance and are accepted as degradation in the channel; for example, we tolerate low fidelity in most remote voice communication. As the channel improves and the errors become smaller, they may instead be associated with the source or person speaking. Such errors may trigger a sense of unease. In any communications system, practical considerations often lead to distortion in the capture, transport and reproduction of voice; attempts to disguise and mask distortion may lead the subject to perceive disturbing abnormalities. This paper combines a literature review in this area of perception, some hypotheses, and some development experiences related to deliberate and adverse distortion of acoustic and spatial aspects of a sound field containing voice. A suggested analysis framework is presented for considering the relationship between the nature of a disturbance and the potential for disturbing or uncanny experiences. The potential aspects to be investigated are numerous; the focus of this paper is to present a framework that may help to understand and estimate the potential impact of the uncanny, and to present examples for possible further investigation.