Blind upmix denotes the process of converting audio content to a higher number of output channels without the aid of any prior spatial information. It is often needed when upmixing legacy monophonic recordings to modern multichannel audio formats. Applause plays a vital role, especially in live recordings. However, creating a convincing blind upmix of applause signals is a demanding task. Applause can be interpreted as a superposition of distinctive, individually perceivable foreground claps and a more noise-like background. While the background signal can be upmixed by decorrelating it and distributing it across channels, the foreground claps must be spatially distributed in a perceptually meaningful and plausible manner. This paper investigates the effect of the spatial, temporal, and timbral structure of foreground claps on the perceived plausibility of applause signals. The assessment was carried out by means of two listening tests. Results show that, especially for sparse applause, plausibility is significantly reduced if the natural timbral and temporal structure of the applause is corrupted.