144th AES CONVENTION Poster Session P19: Posters: Audio Processing/Audio Education

AES Milan 2018

P19 - Posters: Audio Processing/Audio Education

Friday, May 25, 13:15 — 14:45 (Arena 2)

P19-1 Combining Fully Convolutional and Recurrent Neural Networks for Single Channel Audio Source Separation
Emad M. Grais, University of Surrey - Guildford, Surrey, UK; Mark D. Plumbley, University of Surrey - Guildford, Surrey, UK
Combining different models is a common strategy for building a good audio source separation system. In this work we combine two powerful deep neural networks for single-channel audio source separation (SCSS): fully convolutional neural networks (FCNs) and recurrent neural networks, specifically bidirectional long short-term memory recurrent neural networks (BLSTMs). FCNs are good at extracting useful features from the audio data, and BLSTMs are good at modeling the temporal structure of the audio signals. Our experimental results show that combining FCNs and BLSTMs achieves better separation performance than using either model individually.
Convention Paper 9990

P19-2 A Group Delay-Based Method for Signal Decorrelation
Elliot K. Canfield-Dafilou, Center for Computer Research in Music and Acoustics (CCRMA), Stanford University - Stanford, CA, USA; Jonathan S. Abel, Stanford University - Stanford, CA, USA
By breaking up the phase coherence of a signal broadcast from multiple loudspeakers, it is possible to control the perceived spatial extent and location of a sound source. This so-called signal decorrelation process is commonly achieved using a set of linear filters and finds applications in audio upmixing, spatialization, and auralization. Allpass filters make ideal decorrelation filters since they have unit magnitude spectra and therefore can be perceptually transparent. Here, we present a method for designing allpass decorrelation filters by specifying group delay trajectories in a way that allows for control of the amount of correlation as a function of frequency. This design is efficiently implemented as a cascade of biquad allpass filters. We present statistical and perceptual methods for evaluating the amount of decorrelation and audible distortion.
Convention Paper 9991
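
As a rough illustration of the allpass property that the P19-2 abstract relies on, the following sketch builds a cascade of second-order (biquad) allpass sections and evaluates its magnitude response and group delay. It is not the authors' design method: the abstract describes going from a specified group delay trajectory to filters, whereas this sketch goes the other way, from hypothetical pole positions to the resulting group delay. The function names and the `poles` values are assumptions made for illustration only.

```python
import cmath
import math

def biquad_allpass(radius, angle):
    """Denominator coefficients (a1, a2) of a biquad allpass whose conjugate
    pole pair sits at radius * exp(+/- j * angle), with 0 <= radius < 1."""
    a1 = -2.0 * radius * math.cos(angle)
    a2 = radius * radius
    return a1, a2

def cascade_response(poles, omega):
    """Complex frequency response of the cascade at omega (rad/sample)."""
    z1 = cmath.exp(-1j * omega)  # z^-1 evaluated on the unit circle
    h = complex(1.0, 0.0)
    for radius, angle in poles:
        a1, a2 = biquad_allpass(radius, angle)
        # Allpass: the numerator is the denominator with reversed coefficients.
        h *= (a2 + a1 * z1 + z1 * z1) / (1.0 + a1 * z1 + a2 * z1 * z1)
    return h

def cascade_group_delay(poles, omega):
    """Group delay of the cascade in samples, summed over every pole."""
    total = 0.0
    for radius, angle in poles:
        for pole_angle in (angle, -angle):  # the pole and its conjugate
            total += (1.0 - radius ** 2) / (
                1.0 - 2.0 * radius * math.cos(omega - pole_angle) + radius ** 2
            )
    return total

# Hypothetical pole positions (radius, angle in rad/sample), illustration only.
poles = [(0.9, 0.3), (0.8, 1.2), (0.95, 2.4)]

for omega in (0.1, 0.3, 1.2, 2.4, 3.0):
    magnitude = abs(cascade_response(poles, omega))
    delay = cascade_group_delay(poles, omega)
    print(f"omega={omega:.2f}  |H|={magnitude:.6f}  group delay={delay:.2f} samples")
```

The magnitude stays at one for every frequency, while the group delay peaks near each pole angle, and more sharply as the pole radius approaches one. That frequency-dependent delay is the quantity a decorrelation design would shape.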

P19-3 Designing Quasi-Linear Phase IIR Filters for Audio Crossover Systems by Using Swarm Intelligence
Ferdinando Foresi, Università Politecnica delle Marche - Ancona, Italy; Paolo Vecchiotti, Università Politecnica delle Marche - Ancona, Italy; Diego Zallocco, Elettromedia s.r.l. - Potenza Picena, Italy; Stefano Squartini, Università Politecnica delle Marche - Ancona, Italy
In sound reproduction systems the audio crossover plays a fundamental role. Nowadays, digital crossovers based on IIR filters are commonly employed, and their non-linear phase is a relevant issue. For this reason, solutions that aim at IIR filters approximating a linear phase behavior have recently been proposed. One of the latest exploits Fractional Derivative theory and uses Evolutionary Algorithms to explore the solution space in order to perform the IIR filter design: the IIR filter phase error is minimized to achieve a quasi-linear phase response. Nonetheless, this approach is not suitable for crossover design, since the transition band behavior of each individual filter is not predictable. This led the authors to propose a modified design technique that includes suitable constraints, such as the amplitude response cut-off frequency, in the ad hoc Particle Swarm Optimization algorithm that explores the space of IIR filter solutions. Simulations show that not only can better-performing filters be obtained, but crossovers with a fully flat response can also be achieved.
Convention Paper 9992
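
The general idea of folding a design constraint into a Particle Swarm Optimization search, as the P19-3 abstract describes, can be sketched with a generic PSO loop and a penalty term. This is a minimal sketch under stated assumptions, not the authors' algorithm. The cost function `toy_cost`, its constraint, the swarm parameters, and all function names are hypothetical stand-ins. A real filter design would replace the toy objective with a phase error measure and the toy constraint with a cut-off frequency condition.

```python
import random

def pso_minimize(cost, bounds, n_particles=30, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best particle swarm minimizer over box bounds."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost

def toy_cost(x):
    """Hypothetical objective plus a quadratic penalty for violating x[0] >= 1.5."""
    objective = (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    violation = max(0.0, 1.5 - x[0])
    return objective + 1000.0 * violation ** 2

best, value = pso_minimize(toy_cost, [(-5.0, 5.0), (-5.0, 5.0)])
print("best position:", best, "cost:", value)
```

Without the penalty the minimum would sit at (1, -2). The penalty pushes the first coordinate toward the constraint boundary at 1.5, in the same way that a cut-off frequency constraint would steer a filter search away from solutions with an unsuitable transition band.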

P19-4 Graduate Attributes in Music Technology: Embedding Design Thinking in a Studio Design Course
Malachy Ronan, Limerick Institute of Technology - Limerick, Ireland; Donagh O'Shea, Limerick Institute of Technology - Limerick, Ireland
Student acquisition of graduate attributes is an increasingly important consideration for educational institutes, yet embedding these attributes in the curriculum is often challenging. This paper recounts the process of embedding design thinking in a studio design course. The process is adapted to suit music technology students and delivered through weekly interactive workshops. Student adaptation to design thinking is assessed against the characteristics of experienced designers to identify issues and derive heuristics for future iterations of the course.
Convention Paper 9993

P19-5 Dynamic Range Controller Ear Training: Analysis of Audio Engineering Student Training Data
Denis Martin, McGill University - Montreal, QC, Canada; CIRMMT - Montreal, QC, Canada; George Massenburg, Schulich School of Music, McGill University - Montreal, Quebec, Canada; Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) - Montreal, Quebec, Canada; Richard King, McGill University - Montreal, Quebec, Canada; The Centre for Interdisciplinary Research in Music Media and Technology - Montreal, Quebec, Canada
Eight students from the McGill University Graduate Program in Sound Recording participated in a dynamic range controller ear training program over a period of 14 weeks. Analysis of the training data shows a significant improvement in the percentage of correct responses over time. This result agrees with previous findings by other researchers and demonstrates a positive effect of this technical ear training program.
Convention Paper 9994
