Authors:Lee, Chung; Horner, Andrew; Wu, Bin
Affiliation:Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Previous studies of MP3 compression quality have generally not considered the effect on the timbre space of sustained musical instruments. For example, do the sounds of a clarinet and an oboe become more or less similar as a result of the compression? In this research, subjects were asked to rate the dissimilarity of pairs of eight original instrument tones from various instrument families. These included tones from a bassoon, clarinet, flute, horn, oboe, saxophone, trumpet, and violin. Although the results showed strong correlations between the dissimilarity scores of the original and compressed tones, there were subtle perceptual changes, especially at low bit rates. The perceptual difference between the saxophone and bassoon was significantly reduced at 32 kbps compression.
Affiliation:Dolby Laboratories, McMahons Point, Australia
This study compares interaural intensity differences (IIDs) of a real source and those resulting from a phantom source created by pair-wise amplitude panning in an anechoic environment with a listener situated in the sweet spot. The results indicate that the translation of panning gain ratios to IIDs depends on the source frequency, the individual’s HRTFs, the loudspeaker angle, and the source direction angle. For small loudspeaker angular apertures, the IIDs of a phantom source are typically larger in absolute value than those of a real source from the direction predicted by the panning laws under test, especially above 1 kHz. The most conservative panning law (sine-cosine) generally results in the best correspondence between phantom and real source IIDs. For wider loudspeaker angular apertures, the IID of the phantom source is either smaller or larger than the IID of the real source, depending on the sound source frequency and the panning law under test.
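As a rough illustration of what a panning gain ratio is, the sketch below computes loudspeaker gains under a sine-cosine law. The abstract does not define the law, so the common constant-power form (gains equal to the cosine and sine of an angle between 0 and 90 degrees) is assumed here, and the function names and the pan range of -1 to +1 are hypothetical. Note that the value computed is the level difference between the two loudspeaker feeds, not the IID at the listener's ears, which the study shows also depends on frequency, HRTFs, and geometry.

```python
import math


def sine_cosine_gains(pan):
    """Return (left, right) gains for pan in [-1, 1] (assumed convention).

    -1 is fully left, 0 is centre, +1 is fully right. The gains satisfy
    left**2 + right**2 == 1 (constant power), which is the assumed form
    of the sine-cosine law.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie in [-1, 1]")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def channel_level_difference_db(pan):
    """Level difference between the loudspeaker feeds, left re right, in dB."""
    g_left, g_right = sine_cosine_gains(pan)
    if g_right == 0.0:
        return math.inf
    if g_left == 0.0:
        return -math.inf
    return 20.0 * math.log10(g_left / g_right)


# Print the gains and feed level difference for a few pan positions.
for pan in (-0.5, 0.0, 0.5):
    g_l, g_r = sine_cosine_gains(pan)
    print(pan, round(g_l, 4), round(g_r, 4),
          round(channel_level_difference_db(pan), 2))
```

Comparing this feed level difference with a measured IID for each frequency and loudspeaker aperture is the kind of translation the abstract describes.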
Authors:Laitinen, Mikko-Ville; Disch, Sascha; Pulkki, Ville
Affiliation:Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland; Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Conventional wisdom incorrectly assumes that changes to the phase spectrum of an audio signal are not perceptually relevant. The results of formal listening tests with synthetic harmonic complex signals showed that human beings are not “phase deaf.” The perceived difference resulting from randomization of the phase spectrum can be larger than that from randomizing the magnitude spectrum. Although the mechanism for phase perception is somewhat local in frequency, there are some influences on the perception of neighboring frequencies. The phase of a component at a certain frequency affects the perception of frequencies about one octave above and below. Signals for which the phase between the harmonics is aligned can be described as having a strong low pitch and “buzzy” quality, whereas random-phase signals are perceived to be colored, thinner, and lacking the buzzy quality.
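The difference between phase-aligned and random-phase harmonic complexes can be seen in a minimal sketch such as the one below. The fundamental frequency, number of harmonics, and sample rate are illustrative assumptions, not the stimuli used in the listening tests. Both signals have identical harmonic magnitudes, so their RMS levels match, but the aligned version concentrates its energy into a sharp peak once per period. The crest factor used here is only a waveform statistic, not a perceptual measure.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz (assumed)
F0 = 100.0           # fundamental in Hz (assumed)
N_HARMONICS = 20     # equal-amplitude harmonics (assumed)
N_SAMPLES = 80       # exactly one period of F0 at SAMPLE_RATE


def harmonic_complex(phases):
    """Sum of unit-amplitude cosine harmonics of F0 with the given phases."""
    signal = []
    for n in range(N_SAMPLES):
        t = n / SAMPLE_RATE
        signal.append(sum(
            math.cos(2.0 * math.pi * F0 * k * t + phases[k - 1])
            for k in range(1, N_HARMONICS + 1)))
    return signal


def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def crest_factor(x):
    """Ratio of peak magnitude to RMS level."""
    return max(abs(v) for v in x) / rms(x)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
aligned = harmonic_complex([0.0] * N_HARMONICS)
randomized = harmonic_complex(
    [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N_HARMONICS)])

print("RMS aligned / random:", rms(aligned), rms(randomized))
print("crest aligned / random:",
      crest_factor(aligned), crest_factor(randomized))
```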
Authors:Bliem, Tobias; Galdo, Giovanni Del; Borsum, Juliane; Craciun, Alexandra; Zitzmann, Reinhard
Affiliation:Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Digital audio watermarking has become a popular technique in recent years to reduce illegal file sharing of music and to identify listener broadcast preferences. The added watermarking signal is made inaudible by exploiting the frequency and temporal masking effects of the human auditory system. Watermarking codes are embedded in the time-frequency domain using a sparse multicarrier approach. Because the watermarking is often decoded in an enclosed environment, it must be robust against such impairments as reverberation, microphone self-noise, movement, and interference from extraneous sound sources. Subjective and objective testing showed that both robustness and inaudibility can be obtained.
Authors:Faccenda, Francesco; Squartini, Stefano; Principi, Emanuele; Gabrielli, Leonardo; Piazza, Francesco
Affiliation:3MediaLabs - DII - Università Politecnica delle Marche, Ancona, Italy
To facilitate communications among passengers in a large vehicle, an appropriate system with microphones, loudspeakers, and amplifiers is needed. However, a signal processing algorithm is required to avoid feedback and instability. Borrowing from speech-reinforcement research, the authors use a room-modeling adaptive feedback-cancellation approach that combines the Prediction Error Method and adaptive filtering. By including a suppressor filter, the system can be extended to a dual-channel scenario that supports bidirectional communications, where additional feedback paths must be considered compared with the single-channel case. To achieve low latencies and real-time processing, the partitioned block frequency domain adaptive filter algorithm has been adopted. Voice-activity and double-talk detectors have been included as well. Computer simulations in various acoustic conditions have shown the effectiveness of this approach.
Affiliation:AMS Neve Ltd., Burnley, United Kingdom
Several types of triode tubes that are popular in audio preamplifiers were tested under typical operating conditions to determine the relationships between noise and operating conditions. In most cases the total noise in the audio band was dominated by the flicker effect. Most triodes exhibit an optimum anode current at which the equivalent input-noise voltage reaches a minimum, and this current value is relatively consistent between samples of the same type. In general, high-gm triode types are likely to exhibit superior noise figures compared to low-gm types for a given mean anode current. Reduced heater voltage tends to reduce equivalent input noise at small anode currents, but tends to increase it at higher anode currents. A general formula for estimating the equivalent input-noise voltage for any conventional small-signal audio triode, at rated heater voltage, is presented.
Authors:Clifford, Alice; Reiss, Joshua D.
Affiliation:Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
Multiple microphones are often used to record a single source in live and studio productions. Because such microphones are often at different distances from the source, the sum of their signals creates a comb filter response with a flanging effect. These effects can be avoided if there is automated delay compensation. This article analyzes the accuracy of the Generalized Cross Correlation with Phase Transform (GCC-PHAT) as a delay-estimation technique when applied to arbitrary music signals. The authors show that the window function used in the GCC-PHAT calculation influences the interference between frequency components with different amplitudes, which results in spectral leakage and errors in the GCC-PHAT calculation. This interference is greatest when the input signal is narrowband and when the window function has high-amplitude side lobes.
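For readers unfamiliar with GCC-PHAT, a minimal sketch of the delay estimate is given below. It is a toy version under stated assumptions and not the authors' implementation: it uses a naive DFT from the standard library rather than an FFT, a short frame of 64 samples, and a circularly shifted noise signal so that the unwindowed estimate is exact. The function names and the optional `window` argument are hypothetical. The window is applied to both frames before the transform, which is the step whose influence the article analyzes.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x, y, window=None, eps=1e-12):
    """Estimate the delay of y relative to x in samples (positive: y lags x)."""
    n = len(x)
    if window is not None:
        x = [a * w for a, w in zip(x, window)]
        y = [a * w for a, w in zip(y, window)]
    spec_x, spec_y = dft(x), dft(y)
    # Cross-spectrum, then the phase transform: keep phase, discard magnitude.
    cross = [sy * sx.conjugate() for sx, sy in zip(spec_x, spec_y)]
    phat = [c / max(abs(c), eps) for c in cross]
    corr = [v.real for v in dft(phat, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    # Indices above n/2 correspond to negative lags.
    return peak if peak <= n // 2 else peak - n


rng = random.Random(0)  # fixed seed so the sketch is repeatable
frame = 64
true_delay = 5
x = [rng.gauss(0.0, 1.0) for _ in range(frame)]
y = [x[(t - true_delay) % frame] for t in range(frame)]  # circular shift

print("estimated delay:", gcc_phat_delay(x, y))
```

With real recordings the shift is not circular and the signal is rarely broadband noise, so longer frames and a window are needed in practice. Passing a list of window coefficients as `window` is one way to experiment with how the window choice affects the estimate on narrowband input.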
Loudspeaker and headphone design challenges span a range from the highest quality to the smallest, cheapest drivers and boxes. Measurements that correspond more closely to perception are gaining ground as it is discovered that certain kinds of distortion in loudspeakers can be very high without being annoying. Headphones often need equalization to make them sound natural.