Manually setting the level of each track of a multitrack recording is often the first step in the mixing process. In order to automate this process, loudness features are computed for each track and gains are algorithmically adjusted to achieve target loudness values. In this paper we first examine human mixes from a multitrack dataset to determine instrument-dependent target loudness templates. We then use these templates to develop three different automatic level-based mixing algorithms. The first is based on a simple energy-based loudness model, the second uses a more sophisticated psychoacoustic model, and the third incorporates masking effects into the psychoacoustic model. The three automatic mixing approaches are compared to human mixes using a subjective listening test. Results show that subjects preferred the automatic mixes created from the simple energy-based model, indicating that the complex psychoacoustic model may not be necessary in an automated level setting application.
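The level-setting step described above (compute a loudness feature per track, then adjust gain toward a target) can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes RMS level in dB as a stand-in for the simple energy-based loudness feature. The function names and target values are hypothetical and are not the templates derived in the paper.

```python
import math

def rms_db(samples):
    """RMS level in dB relative to full scale (1.0); an assumed stand-in
    for an energy-based loudness feature."""
    mean_square = sum(s * s for s in samples) / len(samples)
    # Floor the energy to avoid log10(0) on silent tracks.
    return 10.0 * math.log10(max(mean_square, 1e-12))

def level_gains(tracks, targets_db):
    """Linear gain per track that moves its measured level to its target."""
    gains = {}
    for name, samples in tracks.items():
        diff_db = targets_db[name] - rms_db(samples)
        gains[name] = 10.0 ** (diff_db / 20.0)  # dB difference -> amplitude gain
    return gains

def apply_gains(tracks, gains):
    """Scale every sample of each track by that track's gain."""
    return {name: [gains[name] * s for s in samples]
            for name, samples in tracks.items()}

# Toy signals and hypothetical instrument targets (illustrative values only).
tracks = {
    "vocals": [0.5, -0.5, 0.5, -0.5],
    "bass": [0.1, -0.1, 0.1, -0.1],
}
targets_db = {"vocals": -10.0, "bass": -14.0}

gains = level_gains(tracks, targets_db)
leveled = apply_gains(tracks, gains)
```

After the gains are applied, each track's measured level matches its target. A psychoacoustic variant would replace `rms_db` with a perceptual loudness model while keeping the same gain-adjustment structure.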