A Multi-Angle, Multi-Distance Dataset of Microphone Impulse Responses

A new publicly available dataset of microphone impulse responses (IRs) has been generated. The dataset covers 25 microphones, including a Class-1 measurement microphone and polar pattern variations for seven of the microphones. Microphones that were included had omnidirectional, cardioid, supercardioid, and bidirectional polar patterns; condenser, moving-coil, and ribbon transduction types; single and dual diaphragms; multiple body and head basket shapes; small and large diaphragms; and end-address and side-address designs. Using a custom-developed computer-controlled precision turntable, IRs were captured quasi-anechoically at incident angles from 0° to 355° in steps of 5° and at source-to-microphone distances of 0.5, 1.25, and 5 m. The resulting dataset is suitable for perceptual and objective studies related to the incident-angle-dependent response of microphones and for the development of tools for predicting and emulating on-axis and off-axis microphone characteristics. The captured IRs allow generation of frequency response plots with a degree of detail not commonly available in manufacturer-supplied data sheets and are also particularly well-suited to harmonic distortion analysis.


INTRODUCTION
The incident-angle-dependent response of a microphone plays an important role in sound recording. Indeed, microphone manufacturers often purposely tailor their designs to give specific polar patterns that are desirable for particular applications. However, the achieved polar pattern is rarely invariant across the entire frequency spectrum, e.g., it might be a cardioid at high frequencies (HF) but omnidirectional at low frequencies (LF), meaning that angle-dependent changes to timbre will occur in addition to the intended angle-dependent changes in level. This can result in sound captured off-axis suffering a degree of timbral coloration, perhaps becoming muffled or duller. This can happen, for example, when a large ensemble is being recorded simultaneously and not all instruments can be positioned on-axis or when a lively singer periodically moves left and right of the microphone. There are also situations in which such off-axis coloration can be exploited by recording engineers to achieve a desired sonic result [1,2]. Acoustic phase-shift networks, diaphragm size and number, body size and shape, and head-basket design can all have an impact on a microphone's directional response characteristics [3][4][5][6].
Most of the research undertaken on this topic involves the technical assessment of one or more microphones at several incident angles to the sound source, comparing the resulting signals, in most cases, in terms of frequency response. Less commonly, the transient response is also analyzed. Olive [7] detailed the main disparities in the frequency response and directivity index of a set of microphones as the incident angle was changed, using angular resolutions of 15° and 30°. Research undertaken by Woszczyk [5] focused on the differences in the frequency response and transient response due to diffraction and the variation of the incident angle. Schneider [6] also performed a directional assessment of microphone response, describing the changes that single-diaphragm and double-diaphragm microphones exhibit in terms of directivity at LF in the near and far fields. Olive [7] also performed a subjective assessment of timbral changes, for which a single overall-quality attribute was rated, when comparing recordings made on-axis and at 60° from the source.
Culloo and Ronan [8] investigated the perceptual effect of microphone angle, specifically on electric guitar recordings, using angles of 0°, 30°, and 60°. Timbral changes between microphones were investigated comprehensively by Pearce et al. [9], who also ordered the observed timbral variations in terms of perceptual magnitude; in this study, microphones were used on-axis only. Ziegler et al. [10] developed a method for interactive visualization of microphone directivity characteristics, using a 15° angular resolution.
From the above, it can be seen that there is an appetite for investigation of microphone characteristics in terms of technical measurement, perceptual evaluation, and visualization, but it can also be seen that, to date, rather coarse angular resolutions tend to have been employed. It would potentially be useful to more fully measure and document the incident-angle-dependent responses of a wide range of microphones. This could (i) provide a useful resource for microphone comparison and selection, (ii) allow correlations to be identified between microphone design parameters (e.g., capsule size, transduction mechanism, and body shape) and incident-angle-dependent response characteristics (e.g., frequency response, transient response, and harmonic distortion), (iii) feed into perceptual studies to identify the specific timbral variations that result from a particular incident-angle-dependent response pattern, and (iv) feed into further studies to develop predictive models of these timbral variations, in terms of microphone response characteristics or microphone design parameters.
The work documented in this report therefore aimed to capture a large set of microphone impulse responses for multiple incident angles. A finer angular resolution than most previous studies was used at several source-to-microphone distances for a wide range of microphones having a diversity of design characteristics.
The report has been structured as follows. Sec. 1 describes the main considerations related to the design of the measurement procedure, such as the angular resolution, source-to-microphone distances, recording environment, measurement equipment, set of microphones, stimulus and measurement method employed, and precision turntable used for rotating the microphone. In Sec. 2, the measurement procedure is described in detail, including the interaction between the components of the measurement setup, automation of the procedure, post-processing procedure followed to prepare the material for its delivery, and format and organization of the resulting dataset. Finally, a summary is presented in Sec. 3.

MEASUREMENT EQUIPMENT AND DESIGN
A measurement procedure was designed to obtain the impulse responses of several microphones while changing their incident angle with respect to a sound source. Each microphone was assessed individually, being rotated at a fixed angular resolution in an iterative process to record a stimulus signal at each step until a full 360° rotation was completed. This section describes a number of variables and parameters related to the procedure.

Incident Angle Resolution
As summarized in Sec. 0, incident-angle-dependent changes in frequency response, and occasionally transient response, have been measured in previous studies. These studies have used different angular resolutions, depending on the aim of the experiment and level of detail needed. Olive [7] used a 15° increment for the front hemisphere and 30° for the rear hemisphere, performing both individual and averaged evaluations of the changes in the frequency response. In another study, Woszczyk [5] used angular resolutions of 5°, 15°, and 45°. According to Woszczyk, the 5° resolution offers a detailed view of the specific and average performance of the microphone frequency response. Shulman [11] explored the comb filtering effects at off-axis angles of highly directional microphones using an angle resolution of 15°. Furthermore, BS EN Standard 60268-4:2014 [12] recommends an angular resolution of either 10° or 15° for measuring the directional response of a microphone.
An angular resolution of 5° was chosen for the study documented here, to provide measurements at a finer resolution than most of the previous studies and than the frequency response plots contained in the datasheets of most microphones. This angle increment is also a divisor of the 10° and 15° resolutions recommended in BS EN Standard 60268-4:2014, guaranteeing that all possible angles with these increments are included.
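The divisibility argument can be checked directly. The short Python sketch below builds the 5° measurement grid and confirms that it contains every angle of the 10° and 15° grids; the variable names are illustrative and are not part of the dataset tooling.

```python
# Measurement grid used in this study: 0 to 355 degrees in 5-degree steps.
STEP_DEG = 5
angles = list(range(0, 360, STEP_DEG))

# Grids recommended in BS EN 60268-4:2014 (10 or 15 degrees).
grid_10 = set(range(0, 360, 10))
grid_15 = set(range(0, 360, 15))

# True when every recommended angle is also a measured angle.
covers_10 = grid_10.issubset(angles)
covers_15 = grid_15.issubset(angles)

print(len(angles), covers_10, covers_15)
```

The grid holds 72 angles, which is the number of recordings per full rotation quoted later in the report.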

Source-to-Microphone Distance
Three distances were chosen for the measurement procedure: 0.5, 1.25, and 5 m. The 0.5-m distance was chosen to explore the changes in the LF region related to the proximity effect when the incident angle is varied, as noted by authors such as Eargle [4] and Torio [13]. Distances of less than 0.5 m were not used because they would have caused larger microphones to collide with the sound source on rotation. As indicated by Rumsey [14], the rise at LF starts appearing when the sound source is positioned at a distance of less than 1 m. The extent of the LF emphasis, and consequently the rate of angular variation of the proximity effect, will depend on the polar pattern of the microphone and, more specifically, on the path length between the front and rear faces of the diaphragm. The 1.25-m distance was selected to assess the mid-field response, considering that it is commonly encountered in various recording scenarios and is also close to the majority of the distances used for studies related to the incident-angle-dependent response of microphones and for the measurements made by manufacturers to build their technical data sheets. Finally, a 5-m distance allows identification of certain disparities that might appear in terms of directivity at LF and HF, as reported by Schneider [6], in a scenario for which plane-wave conditions are met.
It is acknowledged that some of the chosen microphone/distance combinations might not be representative of common recording practice (e.g., a Shure SM58 at 5 m or a shotgun microphone at 0.5 m). However, all chosen microphones were measured at all three distances wherever physically possible, in order to facilitate comprehensive inter-microphone comparison in future research. This will also allow future investigation of the way each microphone might respond to spill from off-axis secondary sources, which could be at any distance from the microphone.

Recording Environment and Equipment
The measurements were carried out in Studio 1, at the Institute of Sound Recording (University of Surrey, UK). This is a large recording/performance room with an approximate floor area of 250 m² and reverberation time (RT60) of 1.1 s. Studio 1 is a live room commonly used as a studio and/or concert hall for recording ensembles, from a solo singer to a full orchestra and choir ensemble. The Noise Rating measured for the studio is NR23. A noise floor of 23.1 dBA ± 0.1 dB has been measured. Fig. 1 shows the setup used for the measurements. It is worth mentioning that only one source-to-microphone distance was in use at any one time. Fig. 2 shows the setup in one of the configurations, with the loudspeaker and an AKG C414 microphone positioned on-axis at 1.25 m. All doors remained closed during the measurements.
The microphones were positioned at 3 m above the floor and 3.7 m from the lighting gantry. These distances are relevant because measurements were performed quasi-anechoically, by applying a time window to remove the impulse response from the first reflection onward. It was therefore important to maximize the distance to the closest reflective surface (in this case, the floor), in order to maximize the length of the reflection-free part of the impulse response and, consequently, LF accuracy.
In order to rotate the microphone at the desired angular resolution, a turntable system was designed and manufactured. The characteristics of this device are presented in Sec. 1.6. The Bluetooth connection between the computer and control unit of the turntable was maintained by placing the computer and audio interface on a desk within the room. Because the direct sound from the laptop could vary with the incident angle of the microphone, measures needed to be taken to make sure that any noise coming from the laptop was primarily reverberant. Hence, the microphone-to-laptop distance had to be greater than the critical distance of the room, to ensure that the laptop was located in the reverberant field and any noise from the laptop would remain unaffected by the incident angle of the microphone. The critical distance was estimated using Eqs. (1)-(3), as stated by Howard and Angus [15]. From measurement of the room: RT60 = 1.10 s; surface area S = 901 m²; volume V = 1,600 m³. Eq. (2) then gives room constant R = 316 m², and Eq. (3) gives the average absorption coefficient α = 0.259. All values are given to three significant figures. The directivity factor Q of the laptop fan would be equal to 1 if it were an omnidirectional sound source, and Eq. (1) would give critical distance D_C = 2.51 m (3 s.f.). However, two large acoustic baffles that surround the control desk make Q significantly less than 1 in the direction of the turntable. Therefore, the critical distance is less than 2.51 m. The minimum microphone-to-laptop distance (measured with the turntable positioned at a source-to-microphone distance of 0.5 m) is 3.6 m, which ensures that the laptop is in the reverberant field with respect to the microphone. Additionally, a 5-m cable was used between the turntable and its power supply unit (PSU) to minimize the chance of picking up any PSU-generated acoustic noise and ensure that the microphone was in the reverberant field with respect to the PSU (cf. laptop noise).
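The equations themselves are not reproduced in this text. As a hedged reconstruction, the Python sketch below assumes Sabine's relation for the average absorption coefficient, the usual room-constant definition, and the common critical-distance expression D_C = sqrt(QR/(16π)). These forms are assumptions: they were chosen because they reproduce the quoted values of R and D_C, with the absorption coefficient agreeing to within rounding. The function names are illustrative.

```python
import math

def absorption_sabine(volume_m3, surface_m2, rt60_s):
    """Average absorption coefficient from Sabine's equation (assumed form)."""
    return 0.161 * volume_m3 / (surface_m2 * rt60_s)

def room_constant(surface_m2, alpha):
    """Room constant R = S * alpha / (1 - alpha), in square meters."""
    return surface_m2 * alpha / (1.0 - alpha)

def critical_distance(q, r_const):
    """Critical distance D_c = sqrt(Q * R / (16 * pi)), in meters (assumed form)."""
    return math.sqrt(q * r_const / (16.0 * math.pi))

# Room values quoted in the text.
alpha = absorption_sabine(volume_m3=1600.0, surface_m2=901.0, rt60_s=1.10)
R = room_constant(surface_m2=901.0, alpha=alpha)
d_c = critical_distance(q=1.0, r_const=R)

print(f"alpha = {alpha:.3f}, R = {R:.0f} m^2, D_c = {d_c:.2f} m")
```

Under these assumed forms, any Q below 1 shortens the critical distance further, which is the direction of the argument made for the baffled control desk.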

Microphones
Twenty-four microphone models that are commonly found in recording studios were selected. The selection was intended to include both diversity and similarity. A dataset that includes microphones that differ in all or the majority of their design characteristics allows for potentially interesting comparisons, e.g., a small diaphragm fixed-pattern condenser in cardioid mode vs. a large dual-diaphragm multipattern condenser in cardioid mode, a large diaphragm multi-pattern condenser microphone in figure eight vs. a ribbon microphone, etc. Inclusion of microphones that are similar for most of their design characteristics allows for evaluation of response consistency between seemingly similar microphones.
Variations in all of the following design characteristics were included: diaphragm size, shape of the microphone, shape of the head basket, transduction mechanism, directivity, number of diaphragms, and polar pattern variability.
A Class-1 measurement microphone was also included for possible future use as a reference. The full list of microphones is shown in Table 1.
For this paper, diaphragm size is defined as "large" if the diaphragm diameter is equal to or greater than the wavelength in air of a 20-kHz tone (17 mm) and as "small" otherwise.

Stimulus and Sound Source
The exponential sine-sweep method was used for performing each impulse response measurement. This method is an effective and robust procedure for measuring the impulse response [16,17].
The stimulus and deconvolution technique separate the non-linear products of the audio system under assessment and position them at the very beginning, followed by the linear response of the system [16]. This method is used to measure the transfer function of "weakly non-linear, approximately time-invariant" systems [16]. Using a single, long sine sweep is preferred here to synchronous averaging of multiple shorter sweeps because it is more robust to time variance of the acoustic path. Additionally, it allows for distortion-free analysis of the linear response and separation of higher-order harmonics, and it provides an improvement in the signal-to-noise ratio of 60 dB or more compared with the generation of a single impulse with the same maximum amplitude. The exponential sine sweep used ran from 20 Hz to 20 kHz and was 7 s long, with an additional 3 s of silence to ensure that the tail of the sweep was properly captured for every distance.
The exponential sine sweep considered for this experiment is the synchronized version proposed by Novak [18]. Novak found that, in comparison with the regular exponential sine sweep, the synchronized version offers phase synchronization of the higher-harmonic frequency responses (HHFRs), which improves the estimation of both their amplitude and phase. This is particularly useful for the assessment of individual harmonic distortion, which consists of the separation and individual assessment of the higher harmonics across the frequency spectrum. Furthermore, the deconvolution stage is performed in the frequency domain using the analytical expression for the inverse of the stimulus, which, as stated by Novak, leads to a larger frequency bandwidth for the HHFRs. The expression of the synchronized exponential sine sweep used is defined in Eq. (4). L is the rate at which the frequency increases, and it is given by Eq. (5), where f_1 is the initial frequency of the sweep, f_2 is the final frequency, and T̂ is the time length of the stimulus.
The duration of 7 s provides an appropriate trade-off between minimizing the total duration of the measurements and minimizing any influence of time variation on the estimated response [16]. The sine sweep was followed by a silence of 3 s, which ensures that the late part or tail of the stimulus is fully captured, i.e., the exponential sine sweep (including the direct sound and first reflections) is captured in its entirety for every source-to-microphone distance.
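Eqs. (4) and (5) are not reproduced in this text. As an illustration only, the Python sketch below generates a swept sine in one published form of the synchronized sweep, x(t) = sin(2π f_1 L exp(t/L)) with L = T/ln(f_2/f_1). That form is an assumption here, and the rounding conventions of the equations actually used may differ. The function names are hypothetical.

```python
import math

def synchronized_sweep(f1, f2, duration_s, fs):
    """Return (samples, L) for x(t) = sin(2*pi*f1*L*exp(t/L)), L = T/ln(f2/f1)."""
    L = duration_s / math.log(f2 / f1)
    n = int(round(duration_s * fs))
    samples = [math.sin(2.0 * math.pi * f1 * L * math.exp(i / (fs * L)))
               for i in range(n)]
    return samples, L

def instantaneous_frequency(f1, L, t):
    """Phase derivative divided by 2*pi: f(t) = f1 * exp(t / L)."""
    return f1 * math.exp(t / L)

FS = 48000                                 # sample rate used for the recordings
sweep, L = synchronized_sweep(20.0, 20000.0, 7.0, FS)
stimulus = sweep + [0.0] * (3 * FS)        # 7-s sweep followed by 3 s of silence
```

With this parameterization the instantaneous frequency starts at f_1 and reaches f_2 at t = T, so the 7-s sweep plus 3 s of silence gives a 10-s stimulus per incident angle.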

Precision Turntable
For the purposes of this study, a custom microphone rotation system was developed. It was specifically designed such that a wide range of microphones could be mounted, with the option to align the microphone capsule to the center of rotation.
The rotation system was developed around a NEMA 17 stepper motor, with a custom-designed and 3D-printed mount, allowing it to be mounted on a standard microphone stand. A thrust bearing was used for interfacing between the motor and rotating assembly. This design consideration ensured that the weight of the microphone was not borne directly by the motor shaft, consequently increasing the maximum load. The adjustability was aided by the use of a K&M standard stereo bar that could slide back and forth for precise alignment of the capsule to the center of the rotation axis (Fig. 3).
The motor was controlled by an Arduino Uno microcontroller through a TMC2100 stepper driver. As a result, the turntable can rotate with a precision of ±0.1°. The system was controlled via Bluetooth (through an HC-06 Bluetooth module), thus eliminating the need for additional cables. The communication to the system was established by serial messages containing the direction of movement and angle in degrees. This design consideration ensured a straightforward way of controlling the device using a variety of programming languages and development platforms.
During the present research, the system was controlled using MATLAB, through a custom-developed library that enabled the precise rotation of the device. Rotation was achieved through simple commands, e.g.,
    turntable.rotate(5)    % rotates 5 degrees clockwise
    turntable.rotate(-5)   % rotates 5 degrees counter-clockwise
Fig. 4 presents a flowchart of the communication between MATLAB and the microphone rotation system.

MEASUREMENT PROCEDURE
Measurements were made across multiple recording sessions. The general procedure comprised a calibration stage, run at the beginning of each session, and a measurement stage, which comprised a series of steps, including the mechanical rotation of the microphone, reproduction of the stimulus, and capture of the measured responses of each microphone.

Calibration
A calibration procedure was run at the beginning of each session. An NTi XL2 sound level meter with an NTi M2211 reference microphone was used to set the desired sound pressure level at the measurement position, when reproducing a pink noise signal through the Genelec 8040 loudspeaker. The sound pressure level was set well above the noise floor. The level was set to 83 dB(A) at source-to-microphone distances of 0.5 and 1.25 m and to 75 dB(A) at 5 m.
Additionally, before each single measurement (which consisted of a full 360° rotation of one microphone with a 5° angle increment), the microphone amplifier input gain was adjusted to produce the same digital level of -15 dBFS ± 1 dB. This was done in order to ensure optimal and consistent digital levels in the ADC stage and the recording itself.

Measurements
Every microphone was evaluated individually by mounting it on the turntable. Once the Bluetooth connection between the computer and control unit of the turntable was established, a rotation over the full circle was completed for each measurement with an angular resolution of 5°. A MATLAB script was developed to automate the measurement procedure. The script controlled the step-by-step rotation, the reproduction of the stimulus, and the capture of the measured signals, and it saved the recordings made at each single position. A 10-s delay was included between rotation and measurement to ensure that the microphone had fully stabilized. A sample rate of 48 kHz and bit depth of 16 bits were used for recording the signals. A MATLAB (.mat) file was generated for each measurement, containing the array with all the recordings and full set of variables.
The turntable was initially positioned at 0.5 m from the loudspeaker. After completing each full rotation, the microphone was swapped for another model from the list, and the digital level calibration and a new measurement were performed. The same procedure was followed at the other source-to-microphone distances, i.e., 1.25 and 5 m.
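The automated loop for one microphone at one distance can be outlined as follows. This is a minimal Python sketch rather than the MATLAB script used for the study: `rotate` and `play_and_record` are hypothetical callables standing in for the turntable command and the audio playback/capture step, and the sleep function is injectable so that the loop can be exercised without hardware.

```python
import time

def run_rotation(rotate, play_and_record, step_deg=5, settle_s=10.0,
                 sleep=time.sleep):
    """Capture one recording per incident angle over a full 360-degree turn.

    rotate(step_deg) and play_and_record() are hypothetical stand-ins for the
    turntable command and the stimulus playback/capture step.
    """
    recordings = {}
    for angle in range(0, 360, step_deg):
        sleep(settle_s)                        # let the microphone stabilize
        recordings[angle] = play_and_record()  # one recording at this angle
        rotate(step_deg)                       # advance; last step returns to 0
    return recordings

# Dry run with stubs: no hardware, no waiting.
moves = []
result = run_rotation(moves.append, lambda: [0.0], sleep=lambda s: None)
print(len(result), sum(moves))
```

In the dry run the dictionary holds 72 recordings, keyed by angle from 0 to 355, and the accumulated rotation is 360°, so the microphone finishes on-axis and ready for the next model.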

Post-Processing
The post-processing stage for delivering the impulse responses consisted of two main stages: the calculation of the impulse response and time-windowing procedure. As mentioned in Sec. 1.3, time windowing was necessary to remove the impulse response from the first reflection onward.
The impulse response at each angle was computed using the method described by Farina [16] and Novak [18]. This involved deconvolving the measured signal (the raw recording) with the stimulus (the exponential sine sweep), i.e., applying the inverse of the stimulus to the recording.
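The principle of the deconvolution step can be illustrated with a toy example. Note the substitution: the Python sketch below uses regularized spectral division with a small hand-written radix-2 FFT, not the analytical inverse of the stimulus described by Novak [18]. The sweep parameters, the regularization constant, and all names are illustrative assumptions.

```python
import cmath
import math

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def deconvolve(recorded, stimulus, n_fft, rel_eps=1e-6):
    """Estimate an impulse response as Y * conj(X) / (|X|^2 + eps)."""
    x = fft(list(stimulus) + [0.0] * (n_fft - len(stimulus)))
    y = fft(list(recorded) + [0.0] * (n_fft - len(recorded)))
    eps = rel_eps * max(abs(v) ** 2 for v in x)   # avoids division by ~0
    h = [yv * xv.conjugate() / (abs(xv) ** 2 + eps) for xv, yv in zip(x, y)]
    return [v.real / n_fft for v in fft(h, inverse=True)]

# Toy system: a pure delay of 30 samples with a gain of 0.5.
fs, f1, f2, dur = 1000, 20.0, 400.0, 0.5
rate = dur / math.log(f2 / f1)
sweep = [math.sin(2 * math.pi * f1 * rate * math.exp(i / (fs * rate)))
         for i in range(int(fs * dur))]
recorded = [0.0] * 30 + [0.5 * s for s in sweep]
ir = deconvolve(recorded, sweep, n_fft=1024)
peak_index = max(range(len(ir)), key=lambda i: abs(ir[i]))
```

The recovered impulse response peaks at the sample matching the delay of the toy system. The peak is lower than the true gain because the sweep, and therefore the estimate, is band-limited.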
Each impulse response was windowed in order to exclude the room reflections. A Hanning window was applied, multiplying it by the impulse response in the time domain. Schuller [19] defines rectangular, Hamming and Hanning windows as common alternatives for short-time analysis. From these, the Hanning window is described as the narrowest in the time domain, being often the preferred option for audio signal processing in that domain. The Hanning window is commonly used for pitch and harmonic analysis, and it is also used by Farina [16,17] for performing the Fast Fourier Transform of impulse response measurements. One of the windows applied to the impulse responses is shown in Fig. 5.
To assist with determination of appropriate window lengths, first reflection arrival times were estimated. Fig. 6 shows a side view of the measurement setup. The measurement distance d was 0.5, 1.25, and 5 m. The additional path length of the first reflection, relative to the direct sound, was calculated trigonometrically as 2d_R - d, from which the arrival time of the first reflection was estimated. Captured impulse responses were then examined in the region of each of these estimated reflection times to locate the first reflections more exactly. The observed first-reflection times were 16.6, 14.7, and 8.00 ms. These are the periods corresponding to frequencies of 60.2, 68.0, and 125 Hz, respectively. Thus, at these frequencies and above, the measurement procedure can capture at least one full period of direct sound before the first reflection arrives. Below these frequencies, the accuracy of the measurement will reduce. The LF threshold of 60.2 Hz at 0.5 m is particularly important, because this allows for an appropriate assessment of the LF phenomena that might appear because of proximity effect, when the incident angle is modified.
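The arithmetic in this paragraph can be reproduced with a few lines of Python. The sketch below makes several assumptions that are not stated in the text: the first reflection comes from the floor, source and microphone are both 3 m above it, d_R is the distance from the source to the reflection point, and the speed of sound is 343 m/s. The observed times are those reported above.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
HEIGHT_M = 3.0           # assumed height of source and microphone above the floor

def floor_reflection_delay(distance_m, height_m=HEIGHT_M, c=SPEED_OF_SOUND):
    """Delay of the floor reflection relative to the direct sound, in seconds."""
    d_r = math.hypot(height_m, distance_m / 2.0)  # source to reflection point
    return (2.0 * d_r - distance_m) / c

# First-reflection times reported in the text, keyed by distance in meters.
observed_ms = {0.5: 16.6, 1.25: 14.7, 5.0: 8.00}

for d, t_ms in observed_ms.items():
    estimate_ms = 1000.0 * floor_reflection_delay(d)
    lf_limit_hz = 1000.0 / t_ms   # frequency whose period equals the delay
    print(f"{d} m: estimated {estimate_ms:.1f} ms, observed {t_ms} ms, "
          f"LF limit {lf_limit_hz:.1f} Hz")
```

Under these assumptions the geometric estimates fall within about half a millisecond of the observed times, and the reciprocals of the observed times give the quoted LF limits.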
The first-reflection times listed above were doubled to determine the corresponding window lengths. The adopted window lengths were therefore 33.2, 29.4, and 16.0 ms for the measurements at 0.5, 1.25, and 5 m, respectively. The direct sound part of the impulse response was centered with respect to the window, to ensure that it remained unaffected and that only the reverberant component was cut.
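A minimal Python sketch of this windowing step is given below. It assumes that the direct sound can be located as the largest absolute peak of the impulse response, which is an illustrative choice and not necessarily the alignment method used for the dataset. The toy impulse response mimics the 5-m case, with a reflection arriving 8 ms (384 samples at 48 kHz) after the direct sound.

```python
import math

def hann(n):
    """Symmetric Hann (Hanning) window of n samples."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def window_direct_sound(ir, window_s, fs):
    """Multiply ir by a Hann window centered on its largest absolute peak."""
    n = int(round(window_s * fs))
    peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
    start = peak - n // 2          # window start, may fall before sample 0
    win = hann(n)
    out = []
    for k in range(n):
        i = start + k
        out.append(ir[i] * win[k] if 0 <= i < len(ir) else 0.0)
    return out

# Toy impulse response: direct sound at sample 1000, reflection 384 samples later.
fs = 48000
ir = [0.0] * 4000
ir[1000] = 1.0
ir[1384] = 0.4
windowed = window_direct_sound(ir, window_s=0.016, fs=fs)
```

Because the window is twice the first-reflection time and centered on the direct sound, the reflection falls at the far edge of the window, where the Hann weighting has decayed to zero, while the direct sound sits at the window maximum.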

Dataset Size and Format
The impulse responses of 25 microphones for a full rotation in the horizontal plane with an angular resolution of 5° were measured. Since some microphones have a variable polar pattern (either switchable or featuring interchangeable grids), 38 measurements were performed per source-to-microphone distance. Because of the physical size of the Røde NTG8, it was not possible to measure this microphone at 0.5 m. Therefore, 113 measurements were made in total. Each measurement comprised 72 recordings, which is the number of angles in a 360° rotation with the angle increment used. As a result, the dataset contains 8,136 impulse responses in total.
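The totals follow from simple counting, as the short Python check below shows (the constant names are illustrative).

```python
MEASUREMENTS_PER_DISTANCE = 38   # microphone/polar-pattern configurations
DISTANCES = 3                    # 0.5, 1.25, and 5 m
SKIPPED = 1                      # Røde NTG8 could not be measured at 0.5 m
ANGLES = 360 // 5                # 72 positions per full rotation

measurements = MEASUREMENTS_PER_DISTANCE * DISTANCES - SKIPPED
impulse_responses = measurements * ANGLES

print(measurements, impulse_responses)   # prints "113 8136"
```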
The impulse responses were rendered in .wav format, using a sample rate of 48 kHz and bit-depths of 24 and 32 bits, and named using the following convention:

Example Measurements
The raw and windowed impulse responses of an AKG C451 microphone with a CK1 cardioid capsule are shown in Fig. 7. The frequency response of this microphone and capsule, shown in Fig. 8, exhibits the fine level of detail achievable with the angular resolution of 5° that has been used for the measurements. This fine resolution allows for the identification of features in the response that are not observable at wider angle increments, e.g., the greater attenuation at 160° than at 180°, between 1 and 5 kHz, seen for the small-diaphragm cardioid condenser microphone in Fig. 9; the dramatic difference that an incident angle change of just 5° can make to the response of a ribbon microphone near to its 90° null (Fig. 10); and the markedly less-flat response that a moving-coil supercardioid microphone can have at 130°, compared with its response at 0° and 180° (Fig. 11).

SUMMARY
This report has detailed the generation of a dataset of directional microphone impulse responses of 25 microphones, including a Class-1 measurement microphone and polar pattern variations for seven microphones at an angle resolution of 5°, from three different source-to-microphone distances (0.5, 1.25, and 5 m). The measurement procedure was carried out under quasi-anechoic conditions, maximizing the distance between source/receiver and reflective surfaces and using time windowing to isolate the direct sound component of the captured impulse response. A distance of 3 m from the nearest reflective surface establishes a lower frequency limit, below which the accuracy of frequency-domain analysis reduces, of 60.2 Hz at 0.5 m mic-to-source, 68.0 Hz at 1.25 m mic-to-source, and 125 Hz at 5 m mic-to-source. This dataset will be useful for future research projects related to microphone comparison, correlation of microphone design parameters with response characteristics, timbral evaluation and modeling of microphones, and microphone emulation.

ACKNOWLEDGMENT
This work has been funded by the University of Surrey (Guildford, UK), through the Vice-Chancellor's Studentship Award. Special thanks are due to Dr. Hyunkook Lee and the Applied Psychoacoustics Laboratory at the University of Huddersfield (Huddersfield, UK) for their support in the manufacturing of the precision turntable used for the measurements. The data gathered during the course of this study are available at http://doi.org/10.5281/zenodo.4633508.