AES Section Meeting Reports

Beijing - January 14, 2018

Summary

Shusen WANG downloaded and shared 6 of the 11 files that Jan collected for this magazine.
At 10:00, Dr Tianshu QU began presenting his reading guide to the members of the AES Beijing WeChat group:

AES (Beijing) Online Magazine January 2018
The First Reading Guide
Time: January 14, 2018

Wang Shusen:
@ Everyone
Good morning, current AES Beijing members! Today's guided reading will start at 10:00 sharp. Please participate actively!
Qu Tianshu: (The host of this reading guide)
Okay, let's start now.
For convenience, you can open this site: https://secure.aes.org/members/
The theme this time is sound quality evaluation.
Having read a few of the articles, I think the focus is on introducing objective sound quality evaluation methods.
Wang Jing: (The hostess of the reading guide)
Hello everyone. This is our first guided reading, and I look forward to discussing it with you.
Qu Tianshu: (The host of this reading guide)
The main organizer of this AES feature is Jon Francombe.
I don't know how familiar everyone is with him; the website has a short introduction:
Jon Francombe is a research fellow in the Institute of Sound Recording, University of Surrey. His research background is in perceptual audio quality evaluation. He has worked on methods for understanding listener perception of novel audio technologies (including personal sound zones and new spatial audio reproduction methods), and producing and developing predictive models of important perceptual attributes.
Roughly translated: Jon Francombe is a researcher in sound recording at the University of Surrey. His research background is in perceptual sound quality evaluation. He has studied how listeners perceive leading-edge audio technologies, including personal sound zones and spatial audio reproduction, and has worked on objective prediction models of perceptual attributes.
The University of Surrey has done a lot of work on audio, and many of the articles introduced here came from there. I also found his profile on LinkedIn, at https://www.linkedin.com/in/jon-francombe
My research background is in audio perception, particularly designing, running, and analysing the results of listening experiments using quantitative and qualitative methods. I'm currently working on various aspects of immersive audio technology as part of the audio team in BBC R&D.
That is, he now works in the BBC R&D team.
I have a BMus in Music and Sound Recording from the University of Surrey, and a PhD from the same institution as part of the POSZ project (www.posz.org). I've also worked on the S3A Future Spatial Audio project (www.s3a-spatialaudio.org).
So his academic career has been entirely at the University of Surrey, and he also took part in the S3A Future Spatial Audio project.
There are some articles on his LinkedIn page; if you want to know more, you can visit https://www.linkedin.com/in/jon-francombe. Now let's move on to the introduction to this topic, Sound Quality Prediction.
The description on the topic's web page reads: In almost all areas that the AES is involved with, sound quality is crucial. Objective measurements often tell only part of the story; perceptual quality judgments made by human assessors are the gold standard. There's been a great deal of work on listening test methodologies and statistical testing to ensure that perceptual measurements are reliable and consistent. However, performing such tests is expensive, time consuming, and requires expertise. Consequently, researchers in industry and academia have worked on developing objective models for the prediction of sound quality (and of important aspects of sound perception such as loudness or speech intelligibility). These prediction models enable quick, repeatable measurements to be made, while maintaining perceptual validity.

Wang Jing: (The hostess of the reading guide)
The University of Surrey does a lot of research in speech and audio signal processing, and in particular has a large project on three-dimensional audio reproduction.
Qu Tianshu: (The host of this reading guide)
In brief: in almost all areas in which the AES is involved, sound quality is crucial. Objective measurements capture only part of the picture; subjective evaluation is the gold standard. Researchers have done a great deal of work on listening test methods and statistical tests to ensure the reliability and consistency of perceptual measurements. However, running such tests is expensive, time-consuming, and demanding. As a result, researchers in industry and academia have devoted themselves to developing objective models for predicting sound quality, including important aspects of sound perception such as loudness or speech intelligibility. These prediction models enable fast, repeatable measurements while staying consistent with perception.
The project Wang Jing refers to is the S3A Future Spatial Audio project just mentioned. In other words, subjective evaluation is the gold standard of sound quality evaluation, but it is time-consuming, difficult, and has a high barrier to entry. It is therefore easy to see why objective evaluation methods are essential, and why the results of an objective model should match the subjective results as closely as possible.
Wang Jing: (The hostess of the reading guide)
Domestic companies generally regard subjective evaluation as laborious and do not want to invest much in it. This makes it very hard for them to compete internationally.
Haiyan XIANG:
I see. If this R&D moves any faster, we will soon be out of work, friends! Thank you for the reading guide! Thank you, Chair!
Wang Jing: (The hostess of the reading guide)
Subjective quality evaluation is rigorous, scientific, and standardized; it is not just a matter of finding a few people to listen. If you want to know more, please read the book <The Manual of the Listening Tester> written by Prof. Wang Zexiang.
Qu Tianshu: (The host of this reading guide)
The web page offers resources on sound quality evaluation, including journal and conference papers, videos, and material on model training, testing, and applications; links to profiles of AES-related groups and individuals; and links to external resources, followed by two introductions. Note that this time we focus on the articles, that is, the E-library part.
Wang Jing: (The hostess of the reading guide)
An objective evaluation model evaluates sound quality by computer, drawing on psychology, physiology, signal processing, data mining, and machine learning. There are currently some applied standards for the objective evaluation of speech, but there is not yet a reliable objective method for three-dimensional audio, which the academic community is still studying.
Qu Tianshu: (The host of this reading guide)
The other parts are also important, but for lack of time I will not go through them here. @ Wang Jing - North Polytechnic It is getting late, so let's start introducing the first article.
The first article is: Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part II-Perceptual Model. You can open the paper on your phone or computer.
Wang Shusen:
@ Everyone
Anyone who follows the guide carefully and promptly writes up the contents of the teachers' guided reading can earn a reward. Please send your write-up to me, and I will share it in the group as appropriate.
Qu Tianshu: (The host of this reading guide)
Everyone is also welcome to raise questions for discussion.
Authors: Beerends, John G.; Schmidmer, Christian; Berger, Jens; Obermann, Matthias; Ullmann, Raphael; Pomy, Joachim
The title of the first article is: Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement Part II-Perceptual Model
That is, it covers the perceptual model (Part II) of POLQA, the third-generation ITU-T standard for end-to-end speech quality measurement.
Published by the AES in 2013. Affiliations: TNO, Delft, The Netherlands; OPTICOM GmbH, Erlangen, Germany; SwissQual AG, Zuchwil, Switzerland.
Wang Shusen:
The affiliations include German and Swiss companies.
Qu Tianshu: (The host of this reading guide)
This article mainly introduces the POLQA speech quality evaluation method, the new third-generation method. We are probably more familiar with the previous generation's method, PESQ.
Wang Jing: (The hostess of the reading guide)
POLQA is specified in the English-language standard document ITU-T P.863.
Qu Tianshu: (The host of this reading guide)
POLQA is the third generation perceptual objective speech quality measurement algorithm, standardized by the International Telecommunication Union (ITU-T) as Recommendation P.863 in 2011.
Wang Jing: (The hostess of the reading guide)
POLQA is an upgraded standard based on PESQ. Its output corresponds to the subjective MOS score and is closer to the subjective MOS than PESQ is. At present, most domestic and foreign carriers and network operators that evaluate voice quality are gradually upgrading to POLQA.
Qu Tianshu: (The host of this reading guide)
Compared with PESQ, its main enhancements cover distortions outside the scope of PESQ, such as linear frequency response distortions, time stretching/compression as found in Voice-over-IP, certain types of codec distortions, reverberations, and the impact of playback volume.
Wang Shusen:
PESQ, Perceptual Evaluation of Speech Quality;
POLQA, Perceptual Objective Listening Quality Assessment;
Qu Tianshu: (The host of this reading guide)
In short, the enhancements cover linear frequency response, time stretching and compression, codec distortion, reverberation, and playback volume.
Wang Jing: (The hostess of the reading guide)
Compared with the PESQ algorithm, POLQA improves the modeling of human auditory perception, handles a wider range of distortion types, and is better suited to VoIP.
Qu Tianshu: (The host of this reading guide)
The article's own assessment is this: POLQA outperforms PESQ in assessing any kind of degradation making it an ideal tool for all speech quality measurements in today's and future mobile and IP based networks.
In other words, it surpasses PESQ in every respect and is an ideal tool for future related applications. (They rate their own work quite highly.)
The article is structured as follows: This paper (Part II) provides an overview of the subjective [...] including the performance of the new standard (Sections 3 and 4) and the most important conclusions (Section 5). The temporal alignment, including the model requirements and basic modeling approach, are given in Part I.
Wang Jing: (The hostess of the reading guide)
Most of the modules in this algorithm were already present in PESQ, but many details have been improved; the time alignment algorithm, for example, is more accurate.
Qu Tianshu: (The host of this reading guide)
Of course, there is also Section 0, which mainly covers the development of speech quality evaluation, the features and shortcomings of PESQ, and the main advantages of POLQA.
Section 1 mainly introduces the subjective experiments, because an objective model must take the subjective experimental results as its reference.
Wang Jing: (The hostess of the reading guide)
The other point is that the perceptual model has been greatly improved and is closer to human hearing. In practical use, POLQA is indeed found to be closer to the subjective MOS than PESQ is, although its applicability to some more specific types of distortion still needs improvement. Most domestic companies doing voice evaluation have purchased a POLQA license.
Qu Tianshu: (The host of this reading guide)
Section 2 describes the implementation of the model in 14 subsections, including the steps that Wang Jing introduced.
Sections 3 and 4 present the evaluation results. MOS-LQO is the result given by the model, and MOS-LQS is the subjective evaluation result; please take a look. Finally, in Section 5, the authors give a summary. Everyone can read the parts that interest them, and then we can discuss together.
Wang Jing: (The hostess of the reading guide)
POLQA provides significant improvement over PESQ for narrowband (300-3400 Hz) as well as for wideband (50-7000 Hz) speech quality measurements. Also, POLQA allows quality assessment using super-wideband (20-14000 Hz).
Wang Shusen:
Could you explain in more detail why MOS-LQO is the result given by the model and MOS-LQS is the subjective evaluation result?
Wang Jing: (The hostess of the reading guide)
Note that PESQ can measure only up to wideband, while POLQA can also handle super-wideband.
Qu Tianshu: (The host of this reading guide)
Take Figure 6 as an example.
Wang Jing: (The hostess of the reading guide)
MOS-LQS stands for Mean Opinion Score, Listening Quality Subjective. It is obtained with the subjective test methods in the ITU-T P.800 standard, such as ACR. By analogy, the O in MOS-LQO stands for Objective, meaning the score is predicted by a model.
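As a minimal illustration of how a subjective score such as MOS-LQS is formed in an ACR test: each listener rates a sample on the five-point ACR scale (5 = Excellent, 4 = Good, 3 = Fair, 2 = Poor, 1 = Bad), and the ratings are averaged. The helper name and the ratings below are hypothetical and are not data from the paper.

```python
# Five-point ACR listening-quality scale used in ITU-T P.800.
ACR_SCALE = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of ACR ratings (hypothetical helper)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for r in ratings:
        if r not in ACR_SCALE:
            raise ValueError(f"invalid ACR rating: {r}")
    return sum(ratings) / len(ratings)


# Made-up ratings from five listeners for one speech sample.
ratings = [4, 4, 5, 3, 4]
print(mean_opinion_score(ratings))
```

A real listening test involves far more than averaging (listener screening, calibrated playback, balanced sample design), which is why the speakers stress that subjective evaluation is costly.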
Qu Tianshu: (The host of this reading guide)
This is a case where the subjective and objective results agree well. The horizontal axis is the objective result and the vertical axis is the subjective result. The solid line is the fitted curve, and the fit statistics are shown in the upper left corner: the MOS error (RMSE) is 0.078, the fitted line is MOS-LQS = 0.6 * MOS-LQO + 1.15, and the correlation is 0.94.
Wang Jing: (The hostess of the reading guide)
In studies of objective evaluation methods, the results are usually shown as a scatter plot. The more scattered the points, the lower the correlation between the subjective and objective results. The slope of the fitted straight line indicates how consistent the two are; if the points lie exactly on the diagonal, the subjective and objective results agree. RMSE indicates the spread between the subjective and objective results (the lower the better), and R is their correlation (the higher the better).
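The three quantities discussed here (the fitted line, the correlation R, and the RMSE) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the score lists are made up, the function names are hypothetical, and the statistics in the paper may be computed somewhat differently (for example, on per-condition averages or after a different mapping).

```python
import math


def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical scores: objective (model) versus subjective (listeners).
mos_lqo = [1.5, 2.1, 2.8, 3.4, 4.0, 4.4]
mos_lqs = [1.8, 2.4, 2.9, 3.3, 3.6, 3.9]

slope, intercept = linear_fit(mos_lqo, mos_lqs)
mapped = [slope * x + intercept for x in mos_lqo]
print(f"fit: MOS-LQS = {slope:.2f} * MOS-LQO + {intercept:.2f}")
print(f"R = {pearson_r(mos_lqo, mos_lqs):.2f}")
print(f"RMSE after mapping = {rmse(mapped, mos_lqs):.3f}")
```

A slope near 1 with an intercept near 0 would put the points on the diagonal, which is the "exact agreement" case described above.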
Qu Tianshu: (The host of this reading guide)
In contrast, Figure 7 shows a poorer fit: its MOS error is 0.284 and its correlation is 0.85. A small note: in Figure 6, the reported statistics do not match the plot, so something must be wrong. To correct myself: it is the picture in Figure 6 that is wrong and the text that is right; Figure 6 duplicates Figure 10.
Figure 6 is for narrowband signals, 100-3500 Hz; Figures 8 and 9 are for wideband signals, 50-7000 Hz; Figures 10 and 11 are for super-wideband signals, 50-14000 Hz. Figure 12 compares PESQ and POLQA: the upper plot shows the PESQ result and the lower plot the POLQA result, and they can be compared using the numbers in the upper left corner of each plot. Figure 13 is similar. The last section gives the conclusions. Please take a look and see whether you have any questions about this article.
Wang Shusen:
Thank you to the two teachers for the wonderful introduction. "The first person to eat crab is always the bravest." If any members have questions, please raise them, and the teachers will reply when they can. Thank you all for participating; there will be opportunities to continue the discussion online and offline. The first reading guide is now finished. Please share any comments and suggestions on this format so that we can improve next time. Members who write up the contents of this guide will be rewarded appropriately. Thank you for your active participation! Goodbye!
