
AES Journal Forum

Perceptual Effects of Dynamic Range Compression in Popular Music Recordings



The belief that the use of dynamic range compression in music mastering deteriorates sound quality needs to be formally tested. In this study normal hearing listeners were asked to evaluate popular music recordings in original versions and in remastered versions with higher levels of dynamic range compression. Surprisingly, the results failed to reveal any evidence of the effects of dynamic range compression on subjective preference or perceived depth cues. Perceptual data suggest that listeners are less sensitive than commonly believed to even high levels of compression. As measured in terms of differences in the peak-to-average ratio, compression has little perceptual effect other than increased loudness or clipping effects that only occur at high levels of compression. One explanation for the inconsistency between data and belief might result from the fact that compression is frequently accompanied by additional processing such as equalization and stereo enhancement.
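As a rough illustration of the measure mentioned in the abstract, the peak-to-average ratio (crest factor) can be taken as the peak absolute sample value divided by the RMS level, expressed in dB. The sketch below is not the authors' analysis code; the function name `crest_factor_db`, the test tone, and the use of hard clipping as a stand-in for heavy compression are all assumptions made for illustration.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a sequence of samples, in dB."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

# Hypothetical test signal: one second of a 100 Hz sine sampled at 8 kHz.
sine = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]

# Hard clipping at half amplitude, a crude stand-in for heavy limiting.
clipped = [max(-0.5, min(0.5, s)) for s in sine]

print(round(crest_factor_db(sine), 2))  # about 3.01 dB for a pure sine
print(crest_factor_db(clipped) < crest_factor_db(sine))  # True
```

A lower value after processing indicates that peaks have been brought closer to the average level, which is the sense in which the abstract speaks of "higher levels of dynamic range compression".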

JAES Volume 62 Issue 1/2 pp. 37-41; January 2014



Comments on this report

Harry Dymond


Comment posted February 7, 2014 @ 15:18:29 UTC

I would like to thank the authors for this report; I am surprised by the results. However, there are a few aspects of the experiment that concern me:

1.) The duration of the music "clips". The authors used 15 excerpts from commercial recordings, each 15 seconds long. 15 seconds strikes me as being too short. Obviously there is a requirement to ensure that the overall experiment does not occupy the volunteers for an excessive amount of time, but I would have preferred to see longer clips (or entire tracks) being used. Perhaps the volunteers could have been divided into groups with each group listening to fewer, but longer, recordings.

2.) A common complaint regarding dynamic range compression is that it can be "fatiguing". This aspect of the effect of dynamic range compression is not addressed by this study.

3.) Most concerning to me is the use of these commercial recordings. As the authors themselves note, whilst these "re-masters" involve dynamic range compression, this is usually accompanied by several other forms of processing, each of which could affect the sound quality perception of listeners. The experiment is therefore controlled in such a way that I fail to see how any conclusions relating to dynamic range compression can be drawn. The tracks being compared should have had one difference and one difference only - the degree of dynamic range compression.

Thank you again to the authors and I hope that the report stimulates further debate.


Author Response
Jens Hjortkjær


Comment posted February 20, 2014 @ 14:46:59 UTC

Many thanks for these comments. We were ourselves quite surprised by the results.

1) With longer stimuli, e.g. entire tracks, equivalent segments in the two versions would be far apart in time, and the comparison task would arguably become more difficult and more reliant on memory. We judged that 15 seconds was long enough for listeners to get an impression of the mix and short enough to allow direct comparison. It would indeed be interesting to test this with additional groups and longer stimuli.

2) The possible 'fatiguing' aspect of compression is indeed relevant, and we are currently testing this in a different study. We find it quite likely that higher levels of compression may cause fatigue even if listeners are not aware of this.

3) We wanted to test the actual use of compression in commercially issued music, so possible effects of additional processing are not controlled for in this study. Simple brick-wall limiting without any additional processing was used in a study by Croghan et al. (JASA, Aug 2012), who also found little effect below ~10 dB when RMS was equalized, but this way of compressing is unlikely to be found in actual masterings. If the compression used in the masterings that we examined here had a strong effect on subjective preference, then we would have expected to find an effect of this across the different mixes. But the interaction with other processing would naturally have to be investigated further in future experiments.
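The idea of limiting with "RMS equalized" can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used by Croghan et al. or by the authors: per-sample hard clipping stands in for a brick-wall limiter (a real limiter would normally apply smoothed gain reduction), and the helper names `hard_limit` and `match_rms`, the ceiling value, and the test signal are all hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def hard_limit(samples, ceiling):
    """Clamp each sample to +/- ceiling (crude stand-in for a brick-wall limiter)."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

def match_rms(samples, reference):
    """Scale samples with a single gain so their RMS equals that of reference."""
    gain = rms(reference) / rms(samples)
    return [s * gain for s in samples]

# Hypothetical signal: a decaying 200 Hz tone burst, 0.5 s at 8 kHz.
original = [math.exp(-n / 1000.0) * math.sin(2 * math.pi * 200 * n / 8000)
            for n in range(4000)]

limited = hard_limit(original, ceiling=0.3)
matched = match_rms(limited, original)

peak_original = max(abs(s) for s in original)
peak_matched = max(abs(s) for s in matched)

print(abs(rms(matched) - rms(original)) < 1e-9)  # RMS levels now agree
print(peak_matched < peak_original)              # but the peaks are lower
```

After matching, the two versions have the same RMS level and differ in how far their peaks rise above it, which is the kind of difference such a comparison isolates.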


Robert Katz


Comment posted March 7, 2014 @ 17:20:58 UTC

Evaluating listener response to compression is a very difficult, perhaps impossible task. We already know that for a successful double-blind test, playing short segments repeatedly is the best way to train listeners on the sonic differences under test. And that the longer the segments, the more difficult it is to obtain repeatable and statistically-meaningful results.

However, when it comes to compression there are several things to consider:

1) Macro dynamic differences. That is, crescendos, decrescendos, and overall loudness differences between segments of a song or song cycle. These things are extremely difficult to test in a double blind environment, because the changes usually occur over several minutes of time within a piece of music and therefore acoustic memory (which is short) is taxed and would require a long-term study taking days to weeks to months in order to produce statistically valid results! But macrodynamic differences are VERY important to music, none of us deny it, and this study does not even begin to address them.

Furthermore, how would you adjust the loudness between two different long-form segments when one of them is more compressed than the other? Suppose that you have a song with a long, soft passage which builds to a crescendo and a climax. One version is more compressed, so its soft passage is louder than in the less-compressed or uncompressed version. What is the fair method of setting the levels for the comparison? If you adjust it so the soft passages are equal, then the uncompressed version's loud passage will be much louder than the compressed version's. If you adjust it so the loud passages are equal, then the uncompressed version's soft passage will be softer. If you take an average loudness approach, how do you measure the average, and is it really meaningful to the test to do so?

So you cannot switch between the two types of recordings double blind and be fair at all! Instead, you have to devise some kind of a long-form listening test and we all know that statistically these kinds of tests just fall apart!

Bottom line: We all know that macrodynamics are VERY important to the appreciation of music, but we cannot easily, or perhaps EVER, test for them. It MIGHT be possible to find a 30 second segment with a portion of a soft passage that leads to a subito-forte. But there will be unsolvable arguments over how to adjust the loudness of the two comparisons. I could make a test that would favor the uncompressed or the compressed version depending on where I equalize the relative loudnesses!
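The level-matching dilemma described above can be made concrete with a small worked example. The passage levels below are invented for illustration (they are not measurements from any recording), the function name `offsets_after_matching` is hypothetical, and the "mean" strategy is simply the arithmetic mean of the two dB values, which is an assumption and not a standardized loudness measure.

```python
# Hypothetical passage levels in dB relative to full scale; not measured data.
uncompressed = {"soft": -30.0, "loud": -10.0}
compressed = {"soft": -20.0, "loud": -10.0}

def offsets_after_matching(a, b, strategy):
    """Apply one gain to b chosen by strategy, then return b-minus-a per passage.

    strategy is a passage name ("soft" or "loud") to align on, or "mean"
    to align the arithmetic mean of the passage levels.
    """
    if strategy == "mean":
        gain = sum(a.values()) / len(a) - sum(b.values()) / len(b)
    else:
        gain = a[strategy] - b[strategy]
    return {name: (b[name] + gain) - a[name] for name in a}

for strategy in ("soft", "loud", "mean"):
    print(strategy, offsets_after_matching(uncompressed, compressed, strategy))
```

With these numbers, aligning the soft passages leaves the compressed version's loud passage 10 dB below the uncompressed one, aligning the loud passages leaves its soft passage 10 dB above, and aligning the mean splits the difference at 5 dB each way. No single gain removes the level difference in both passages at once.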

2) What about microdynamics? I define microdynamics as that which is measurable by the peak-to-loudness ratio of the material. It is a measure of the amount and types of compression processing which softens the transients and the peaks. This type of processing also affects the stereo depth, soundstage and width as well as the transient response. In order for this test to be successful, the listening room will need a reflection-free zone, a set of monitors with good transient response and low diffraction, as well as trained listeners. It will also be necessary to obtain the original, clean mix of a recording and an overcompressed master of the same recording. Comparisons between two released commercial versions of a recording are simply uncontrollable. At least by having the mix and the master of a recording, the provenance will be completely known, as well as the processes which were used to produce the master.

HOWEVER, even if we make the appreciation of the transients (microdynamics) the goal of the test, the personal preference of the listener and the type of music and its sound will be uncontrollable factors! For example, some rock music sounds better more compressed (with fewer transients) than the original. Mastering engineers and producers could easily agree that the compressed master sounds better than the uncompressed mix.

3) Where does this leave us? In my opinion, you should leave this subject alone. Stop trying to test precepts which have been long accepted by music and audio professionals for hundreds of years:

Good dynamic range sounds good. Squashed dynamic range may or may not sound as good, but often sounds worse. Please let sleeping dogs lie and get back to making tests that can be validated, please!!!!!


Author Response
Mads Walther-Hansen


Comment posted May 7, 2014 @ 14:00:24 UTC

Many thanks for your comments Robert.

1) You are absolutely right that this study does not address macro-dynamic differences. This was never the purpose of the test and we agree that this cannot be measured easily – and surely not with the same methods we use in this test. As described in the response to Harry, we are currently testing the correlation between DRC and listening fatigue. This test involves measuring the physiological response of subjects exposed to highly compressed music over longer periods of time. Here the method to normalize the clips is something we must consider carefully.

2) Personal preferences, types of music etc. are of course always in play in just about any test of musical preferences. In this test we picked re-masters that were widely contested for their inferior sound quality (lacking depth etc.) compared to the original. Some of them are mentioned in AES articles and some are debated on different Internet forums. We were, in fact, expecting to verify that originals were better sounding and had more depth than the heavily compressed re-masters, and we think we provided test subjects (musicology students) with the best possible conditions to assess the differences. We did not use subjects trained in sound engineering, as we would expect them to recognize that differences were caused by compression and they would then know the ‘right’ answer. Further tests are clearly needed, and we do have reservations about the results, as mentioned in the conclusions.

3) We clearly disagree here. If these precepts are long accepted, it is about time to test them. Also, sound quality preferences change, and we may speculate that listeners are less sensitive to high levels of compression today than just a few decades ago.


Harry Dymond


Comment posted March 9, 2014 @ 15:37:11 UTC

Robert,

Thank you for your interesting and detailed post; you make several good points. I would like to question though if this is really a "sleeping dog"? If music and audio professionals know that squashed dynamic range often sounds bad, why do most modern mainstream recordings have such low dynamic range? Do we not need hard evidence of the harm that squashed dynamic range does to recordings, in order to try and de-escalate the Loudness War?



AES - Audio Engineering Society