AES E-Library

AES E-Library Search Results

Forensic Musicology: An Overview

The current evolution of the music industry towards digital means of recording and disseminating material has expanded the skill set that legal professionals require of experts. Expert testimony for forensic musicology supports a broad spectrum of legal issues, including the authentication and differentiation of published compositions and musical recordings, performance rights, and legal determinations regarding copyright infringement. While legal cases involving music and performance infringement date back as far as the 19th century, the field of forensic musicology has no stated methodology by which an objective forensic determination can be made. Expert opinions based merely on subjective impression or resulting from the “golden ear” syndrome are pseudo-scientific and not objectively based. This paper proposes scientific methods and recommendations for analysis based on stated criteria, with the goal of controlling examiner bias. Considerations include analyses of composition, performance, and acoustical features, and factors such as melody, harmony, rhythm, and orchestration; pitch, tone, vibrato, and embellishment; metadata analysis; recording technologies; and digital signal processing, including “effects.” By engaging in a series of structured categorizations, the forensic expert can establish a consistent, replicable, and objectively verifiable means of determining whether or not a recorded piece of music has been misappropriated.

Authors: Begault, Durand R.; Heise, Heather D.; Peltier, Christopher A.
Affiliations: Audio Forensic Center, Charles M. Salter Associates, San Francisco, CA, USA; Cerami Associates; Thornton School of Music, University of Southern California, Los Angeles, CA, USA; Cerami and Associates, New York, NY, USA (See document for exact affiliation information.)
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 1-1
Publication Date: June 12, 2014
Subject: Forensic Musicology

This paper costs $33 for non-members and is free for AES members and E-Library subscribers.

A Study of f0 as a Function of Vocal Effort

In forensic speaker comparison, f0 is one of several parameters examined. However, f0 data must be validated carefully, because f0 changes as a function of vocal effort. This paper presents the results of an experiment in which six male and six female subjects uttered the same sentence in five speaking modes: whisper, casual, normal, loud, and shout. Only voiced speech was subsequently analyzed. As a result, the baseline f0 (fb) and the mean f0 are determined as a function of vocal effort. With reference to the subjects’ “normal” speech level, fb was estimated from corresponding data obtained with casual voice, loud voice, and shout.

Author: Brixen, Eddy B.
Affiliation: EBB-consult, Smørum, Denmark
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 2-1
Publication Date: June 12, 2014
Subject: Forensic Speaker Recognition

Forensic Voice Comparisons in German with Phonetic and Automatic Features Using Vocalise Software

In this article, we present a novel forensic speaker recognition system that provides the capability to perform comparisons using both ‘traditional’ forensic phonetic parameters and ‘automatic’ spectral features in a semi- or fully automatic way. We evaluate this approach with simulated and real forensic case data in German, which ranges from high quality laboratory audio data to real telephone intercepts. We examine how the forensic expert can use his or her knowledge of the linguistic and phonetic content of the speech and combine it with ‘automatic’ acoustic analysis of the speech. This approach is shown to provide a level of validation and safeguard against misleading or incorrect identification results. We demonstrate that processing phonetic data will be in many ways complementary and will offer insights into the voice comparison analysis that the classical automatic methods cannot.

Authors: Jesse, Michael; Alexander, Anil; Forth, Oscar
Affiliations: Department of Speaker Identification and Audio Analysis, Bundeskriminalamt, Germany; Oxford Wave Research Ltd., Oxford, UK (See document for exact affiliation information.)
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 2-2
Publication Date: June 12, 2014
Subject: Forensic Speaker Recognition

Evaluation Results of Speaker Verification for VoIP Transmission with Packet Loss

In this contribution we show the influence of Voice over IP (VoIP) transmission for the task of forensic speaker verification if packet loss occurs. We particularly tested the influence of packet loss up to 20% on two modern automatic speaker verification systems with different feature sets. One of the systems was our own development to control features and the classifier parameters. The other one was a commercial system used in the forensic field. Furthermore, a phonetic acoustic analysis of the formant deviation and its influence on speaker verification is given. The results indicate that packet loss up to 20% has only minor influence on the performance of modern computer-based speaker verification tools and that the changes of the formant frequencies are not significant. However, some dynamic formant transitions get lost at high packet loss rates.

Authors: Bitzer, Joerg; Rollwage, Christian; Neumann, Martin
Affiliations: Project Group Hearing, Speech and Audiotechnology, Fraunhofer Institute for Digital Media Technology, Oldenburg, Germany; University of Applied Science Wilhelmshaven/Oldenburg, Oldenburg, Germany (See document for exact affiliation information.)
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 2-3
Publication Date: June 12, 2014
Subject: Forensic Speaker Recognition

Room Identification Using Roomprints

Identification and verification of the location in which a recording was made are important yet understudied topics in audio forensics. The recently introduced concept of ‘roomprints’ provides some first steps towards tackling these tasks. A roomprint is defined as a quantifiable description of an acoustic environment which can be measured under controlled conditions and estimated from a monophonic recording made in that space. The various types of information which could be included in a roomprint are reviewed based on their expected reliability and the feasibility of extracting them from a recording. Frequency-dependent reverberation time is identified as a particularly promising feature in both regards. A room identification experiment was conducted using room impulse responses from 22 rooms. Depending on the frequency resolution and lower frequency extent of the roomprint, identification rates of up to 97% were achieved.

Authors: Moore, Alastair H.; Brookes, Mike; Naylor, Patrick A.
Affiliation: Imperial College London, London, UK
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 3-1
Publication Date: June 12, 2014
Subject: Dereverberation
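
The abstract above singles out reverberation time as a promising roomprint feature. As a rough illustration only, and not the authors' method, the sketch below estimates a broadband reverberation time from an impulse response using Schroeder backward integration and a line fit between -5 dB and -25 dB. The function name, the fit range, and the synthetic exponential decay are all assumptions made for this example; a frequency-dependent version would apply the same estimate to band-filtered copies of the response.

```python
import math


def rt60_from_impulse_response(h, fs, start_db=-5.0, end_db=-25.0):
    """Estimate reverberation time (seconds) by Schroeder backward integration."""
    # Energy decay curve: energy remaining from each sample to the end.
    edc = []
    total = 0.0
    for x in reversed(h):
        total += x * x
        edc.append(total)
    edc.reverse()
    # Normalize to 0 dB at the start; zero-energy tail samples are dropped.
    edc_db = [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]
    # Least-squares line through the points between start_db and end_db.
    pts = [(i / fs, d) for i, d in enumerate(edc_db) if end_db <= d <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    num = sum((t - mean_t) * (d - mean_d) for t, d in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    slope = num / den  # decay rate in dB per second (negative)
    return -60.0 / slope


# Hypothetical test signal: an exponential decay that drops 60 dB in 0.5 s.
fs = 8000
target_rt60 = 0.5
decay = 3.0 * math.log(10.0) / target_rt60  # amplitude falls by 10**-3 over RT60
h = [math.exp(-decay * n / fs) for n in range(int(1.5 * fs))]
estimate = rt60_from_impulse_response(h, fs)
```

For this idealized decay the estimate should land very close to the 0.5 s target. Real recordings add noise floors and non-exponential decays, which this sketch does not handle.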

Evaluation of an Extended Reverberation Decay Tail Metric as a Measure of Perceived Reverberation

In this work an extended Reverberation Decay Tail (R_DT) metric is proposed and developed as a means of calculating the level of reverberation distortion in speech recordings. The metric, based on earlier research, is extended to operate on wideband speech and incorporates an improved perceptual model. The effectiveness of the measure as a predictor of reverberation, its robustness to common types of noise, and performance comparisons with the original metric are evaluated experimentally through simulations. As an intrusive quality measure, potential applications include use as a research tool by audio forensic and signal processing experts working on the development of speech dereverberation algorithms.

Authors: Javed, Hamza A.; Naylor, Patrick A.
Affiliation: Imperial College London, London, UK
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 3-2
Publication Date: June 12, 2014
Subject: Dereverberation

Exploiting Sparsity for Source Separation Using the Sliding Ratio Signal Algorithm

An algorithm for exploiting sparsity of the underlying source signals in either the time or time-frequency domain is introduced. Utilizing the sliding ratio signal (SRS) derived from at least two observed mixture signals, the number of sources and the mixing matrix (or overcomplete dictionary), which are often estimated in separate steps, are detected simultaneously for reduced computational load. For instantaneous mixtures, the observed signals are directly processed by the SRS algorithm, which detects the major modes of ratio signals when the relative time delays of a source are equalized in both mixtures. For convolutive mixtures, the sliding discrete Fourier transform (SDFT) window is used to facilitate instantaneous de-mixing in the time-frequency domain. The sliding Goertzel algorithm is used for pre-processing the convolutive mixtures to reduce the room impulse response inter-symbol interference effects. The time-frequency signals, at a frequency bin of choice, are then used by the SRS algorithm to learn the mixing process such that sparse decay algorithms can separate the underlying source signals. The SDFT approach and sliding Goertzel algorithm greatly decrease the computational load compared to most time-frequency based methods, which tend to suffer from permutation and scaling ambiguities of the estimated sources. Simulation results are provided to illustrate the performance of the proposed algorithm.

Authors: Gower, Ephraim; Hawksford, Malcolm
Affiliations: Botswana International University of Science and Technology, Palapye, Botswana; University of Essex, Colchester, UK (See document for exact affiliation information.)
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 3-3
Publication Date: June 12, 2014
Subject: Dereverberation
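
The abstract above relies on the sliding discrete Fourier transform (SDFT) to track a single frequency bin cheaply. The sketch below shows only the textbook SDFT recurrence for one bin, X_k[n] = e^{j2πk/N} (X_k[n-1] - x[n-N] + x[n]); it is not the SRS algorithm or the authors' code, and the function name and test signal are assumptions made for this example.

```python
import cmath
import math
from collections import deque


def sliding_dft_bin(x, N, k):
    """Yield bin k of the DFT of the most recent N samples after each new sample."""
    twiddle = cmath.exp(2j * cmath.pi * k / N)
    window = deque([0.0] * N, maxlen=N)  # assumes silence before the first sample
    X = 0j
    for sample in x:
        oldest = window[0]      # sample that leaves the window
        window.append(sample)   # maxlen drops the oldest automatically
        X = twiddle * (X - oldest + sample)
        yield X


def direct_dft_bin(block, k):
    """Reference: bin k of the DFT of one block, computed directly."""
    N = len(block)
    return sum(v * cmath.exp(-2j * cmath.pi * k * m / N) for m, v in enumerate(block))


# Hypothetical test signal: two sinusoids, 64 samples.
signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]
sliding_last = list(sliding_dft_bin(signal, 16, 3))[-1]
direct_last = direct_dft_bin(signal[-16:], 3)
```

Each update costs one complex multiply and two additions per bin, instead of recomputing a full transform for every new sample. Rounding error accumulates slowly in the recurrence, so long-running implementations commonly add a damping factor, which is omitted here.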

Quantization Level for Forensic Media Authentication

The authentication of digital multimedia evidence can play important roles in both judicial and extrajudicial cases. Over the past decade or so, various techniques and methodologies have been proposed for forensic audio and video authentication. This paper proposes a new method for forensic media authentication based on quantization level (QL) analysis of a digital audio signal. Five methods for calculating QL are provided: QL histogram, QL length, greatest common divisor, Fast Fourier Transform (FFT), and Cepstrum. Examples of QL analysis are provided to demonstrate its use in detecting a recording’s native QL versus the file format bit depth, detecting signal processing, and analyzing consistency with reference recordings (recorder attribution).

Authors: Grigoras, Catalin; Smith, Jeff M.
Affiliation: University of Colorado Denver, Denver, CO, USA
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 4-1
Publication Date: June 12, 2014
Subject: Audio Authentication
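
Of the five QL methods named in the abstract above, the greatest-common-divisor idea is the easiest to illustrate. The sketch below is one possible reading of it, not the authors' implementation: it assumes integer PCM samples and infers how many low-order bits of the container are never used. The function name and the sample values are hypothetical.

```python
from functools import reduce
from math import gcd


def effective_bit_depth(samples, container_bits):
    """Estimate the native quantization level of integer PCM samples.

    Assumption: if every sample is a multiple of 2**k, the lowest k bits
    of the container carry no information.
    """
    g = reduce(gcd, (abs(s) for s in samples), 0)
    if g == 0:
        return 0  # all-zero signal: nothing can be inferred
    # Number of trailing zero bits in the GCD.
    unused_bits = (g & -g).bit_length() - 1
    return container_bits - unused_bits


# Hypothetical 16-bit sample values, then the same values padded into 24 bits.
native = [1200, -345, 7, 32767, -32768]
padded = [v << 8 for v in native]

depth_native = effective_bit_depth(native, 16)
depth_padded = effective_bit_depth(padded, 24)
```

In this toy case both calls report 16 bits, even though the second set is stored in a 24-bit container. Dither, gain changes, or resampling applied after padding would fill the low bits and defeat this simple check.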

Forensic Authenticity Analyses of the Metadata in Re-Encoded WAV Files

Detailed digital data analyses were conducted of the metadata contained in forty (40) WAV files produced on nine (9) different digital audio recorders and then re-encoded with four common audio editing programs, for a total of 200 separate WAV files. The purpose of this research was to identify changes made by the re-encoding processes as they related to forensic audio authenticity examinations. The research found that the re-encoding produced clearly discernible differences, compared to the original WAV files, with the exception of one recorder’s files when they were re-encoded using two of the editing programs. The RIFF WAV metadata format structures, the procedures followed, the changes identified in the metadata of the re-encoded files, and a discussion of the authenticity implications are listed.

Authors: Koenig, Bruce E.; Lacey, Douglas S.
Affiliation: BEK TEK LLC, Clifton, VA, USA
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 4-2
Publication Date: June 12, 2014
Subject: Audio Authentication
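
The abstract above compares RIFF WAV metadata before and after re-encoding. As a minimal sketch of what such a comparison can start from, the code below lists the top-level chunks of a RIFF/WAVE stream. It is not the procedure used in the paper; the function name is hypothetical, and the two in-memory files, including the added LIST chunk, are invented for illustration.

```python
import io
import struct


def list_riff_chunks(stream):
    """Return (chunk_id, size) pairs for the top-level chunks of a RIFF/WAVE stream."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError("stream too short for a RIFF header")
    riff_id, _riff_size, form = struct.unpack("<4sI4s", header)
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    chunks = []
    while True:
        head = stream.read(8)
        if len(head) < 8:
            break
        chunk_id, size = struct.unpack("<4sI", head)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        # Chunk data is padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunks


def build_wav(extra=b""):
    """Build a tiny mono 16-bit PCM file in memory (illustrative values only)."""
    fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    data = b"\x00\x00" * 4
    body = (b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(data)) + data + extra)
    return b"RIFF" + struct.pack("<I", len(body)) + body


# Hypothetical LIST/INFO chunk of the kind an editing program might append.
info = b"INFO" + b"ISFT" + struct.pack("<I", 6) + b"Edit\x00\x00"
list_chunk = b"LIST" + struct.pack("<I", len(info)) + info

original_chunks = list_riff_chunks(io.BytesIO(build_wav()))
reencoded_chunks = list_riff_chunks(io.BytesIO(build_wav(list_chunk)))
```

Comparing the two chunk lists exposes the extra chunk immediately. A fuller examination would also compare the contents of the `fmt ` chunk and the byte offsets of each chunk.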

A Multi-Codec Audio Dataset for Codec Analysis and Tampering Detection

In this paper, a multi-codec tampered dataset is presented. The doctored speech content does not contain audible artifacts or changes in semantic meaning, but it does contain tampered regions that can be detected via lossy compression traces, e.g. using framing grid analysis. Possible applications, the content and annotations included, and the steps required to generate the dataset are described. The dataset can be accessed online and is meant to be used for evaluation purposes by anyone interested. Moreover, the creation of derived modified datasets is encouraged and supported by the choice of a corresponding Creative Commons license.

Authors: Gärtner, Daniel; Cuccovillo, Luca; Mann, Sebastian; Aichroth, Patrick
Affiliation: Fraunhofer IDMT, Ilmenau, Germany
AES Conference: 54th International Conference: Audio Forensics (June 2014)
Paper Number: 4-3
Publication Date: June 12, 2014
Subject: Audio Authentication
