Program: Paper Session 6


2019 AES International Conference on Immersive and Interactive Audio. March 27-29, 2019. York, UK.


Paper Session 6 - Spatial capture and measurement for interactive audio

Chair: Jez Wells


P6-1: "Automatic statistical analysis of ear shape anthropometrics of an international ear shape database"

Shoken Kaneko, Tsukasa Suenaga, Hiraku Okumura and Sungyoung Kim

We present a statistical analysis of ear-related anthropometric data measured from 162 subjects, and of subsets of that data divided by gender and race. To analyze the data efficiently, we have developed a semi-automatic measurement technique that can therefore scale to larger data sets. The results show that the ear dimensions of Asian subjects tend to be larger than those of non-Asian subjects. Statistical tests confirmed significant differences in ear dimensions between gender and racial categories. These findings suggest the importance of taking into account the subject’s demographic information, such as gender or race, when deriving generalized or individualized HRTF data from ear shape modeling for immersive audio applications.


P6-2: "Obtaining Dense HRTF Sets from Sparse Measurements in Reverberant Environments"

Christoph Pörschmann and Johannes Mathias Arend

The paper describes a method for obtaining spherical sets of head-related transfer functions (HRTFs) based on a small number of measurements in reverberant environments. For spatial upsampling, we apply HRTF interpolation in the spherical harmonics (SH) domain. However, the number of measured directions limits the maximal accessible SH order, resulting in order-limitation errors and a restricted spatial resolution. Thus, we propose a method which reduces these errors by a directional equalization based on a spherical head model prior to the SH transform. To enhance the valid range of a subsequent low-frequency extension towards higher frequencies, we perform the extension on the equalized dataset. Finally, we apply windowing to the impulse responses to eliminate room reflections from the measured HRTF set. The analysis shows that the method for spatial upsampling influences the resulting HRTF sets more than degradations due to room reflections or due to distortions of the loudspeakers.
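The limit that the number of measured directions places on the SH order can be illustrated with a small sketch. It is not taken from the paper: it only uses the general fact that an SH expansion of order N has (N + 1)^2 coefficients, so at least that many measured directions are needed to solve for them. The function name `max_sh_order` is a hypothetical helper, and the bound is a necessary condition only, since the achievable order also depends on how the directions are distributed on the sphere.

```python
import math


def max_sh_order(num_directions: int) -> int:
    """Largest SH order N such that (N + 1)**2 <= num_directions.

    Hypothetical helper for illustration. An order-N expansion has
    (N + 1)**2 coefficients, so fewer directions than that cannot
    determine them. The sampling grid may reduce the usable order further.
    """
    if num_directions < 1:
        raise ValueError("at least one measured direction is required")
    return math.isqrt(num_directions) - 1


if __name__ == "__main__":
    for q in (4, 38, 2702):
        print(q, "directions -> order <=", max_sh_order(q))
```

Under this bound, a sparse set of a few dozen directions supports only a low SH order, which is the order-limitation problem the abstract refers to.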


P6-3: "Direct to Reverberant Ratio Measurements in Small and Mid-sized Rooms"

Sebastian Csadi, Francis M. Boland, Luke Ferguson, Hugh O'Dwyer and Enda Bates

Direct to Reverberant Ratio (DRR) is measured for three non-idealised rooms of different sizes, using a variety of methods. DRR values calculated from binaural room impulse responses are compared to the DRR calculated from an omnidirectional room impulse response. Consistent differences were found in the absolute DRR values calculated from each type of impulse response, and also when the source is positioned close to a room boundary. As expected, DRR decreased with distance, but certain room features produced some inconsistencies. The binaural DRR is also shown to vary with source angle, particularly for nearfield sources. DRR values are calculated with a variety of Direct Sound integration window sizes. The results suggest that in smaller rooms, a smaller window size produces more consistent changes in DRR.
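As a rough illustration of how the Direct Sound integration window enters a DRR calculation, the sketch below uses one common energy-ratio definition. It is an assumption, not the authors' procedure: the direct sound is taken as the energy within a symmetric window around the largest peak of the impulse response, and everything after that window counts as reverberant. The function name `drr_db` and the parameter `half_window_ms` are hypothetical.

```python
import math


def drr_db(ir, fs, half_window_ms=2.5):
    """Direct-to-reverberant ratio in dB for one impulse response.

    Assumed definition (not from the paper): direct energy is the sum of
    squared samples within +/- half_window_ms of the absolute peak, and
    reverberant energy is the sum of squared samples after that window.
    """
    peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
    half = int(round(half_window_ms * 1e-3 * fs))
    start = max(0, peak - half)
    end = min(len(ir), peak + half + 1)
    direct = sum(x * x for x in ir[start:end])
    reverb = sum(x * x for x in ir[end:])
    if reverb == 0:
        raise ValueError("no energy after the direct sound window")
    return 10.0 * math.log10(direct / reverb)


if __name__ == "__main__":
    # Synthetic impulse response: direct sound, one early reflection, a tail.
    ir = [0.0, 1.0, 0.0, 0.5, 0.0, 0.1]
    for w in (1.0, 2.0):
        print(f"half window {w} ms: {drr_db(ir, fs=1000, half_window_ms=w):.2f} dB")
```

In this synthetic example, widening the window moves the early reflection from the reverberant part into the direct part, so the computed DRR rises. This is the kind of window-size dependence the abstract examines.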


P6-4: "Acoustically hard 2D arrays for 3D HOA"

Svein Berge

The acquisition of higher-order ambisonic signals presents a technical challenge which has been met by several authors proposing a number of different array geometries and sensor types. The current paper presents a class of arrays whose performance has not been assessed before: arrays comprising only pressure-sensitive sensors on both sides of an acoustically hard plate. The combined acoustical and signal processing system is analyzed, and numerical experiments on an optimized third-order array provide a performance comparison with a more conventional hard-shell spherical microphone array. The model is verified by measurements.


P6-5: "Robust hypercardioid synthesis for spatial audio capture: microphone geometry, directivity and robustness"

Miguel Blanco Galindo, Philip Coleman and Philip J. B. Jackson

Frequency-invariant beamformers are useful for spatial audio capture since their attenuation of sources outside the look direction is consistent across frequency. In particular, the least-squares beamformer (LSB) approximates arbitrary frequency-invariant beampatterns with generic microphone configurations. This paper investigates the effects of array geometry, directivity order and regularization for robust hypercardioid synthesis up to 15th order with the LSB, using three 2D 32-microphone array designs (rectangular grid, open circular, and circular with cylindrical baffle). While the directivity increases with order, the frequency range is inversely proportional to the order and is widest for the cylindrical array. Regularization results in broadening of the mainlobe and reduced on-axis response at low frequencies. The PEASS toolkit was used to perceptually evaluate beamformed speech signals.


AES - Audio Engineering Society