AES New York 2007
Paper Session P18

P18 - Audio Forensics

Sunday, October 7, 4:00 pm — 6:00 pm
Chair: Durand R. Begault, Charles M. Salter Associates - San Francisco, CA, USA

P18-1 Experiment in Computational Voice Elimination Using Formant Analysis
Durand R. Begault, Charles M. Salter Associates - San Francisco, CA, USA
This paper explores a computational approach to eliminating a known voice exemplar from an unknown voice exemplar in a forensic voice elimination protocol. A subset of voice exemplars from 11 talkers, taken from the TIMIT database, was analyzed using a formant tracking program. Intra- versus inter-speaker mean formant frequencies are analyzed and compared.
Convention Paper 7272

P18-2 Applications of ENF Analysis Method in Forensic Authentication of Digital Audio and Video Recordings
Catalin Grigoras, Forensic Examiner - Bucharest, Romania
This paper reports on the electric network frequency (ENF) method as a means of assessing the integrity of digital audio/video evidence. A brief description is given of the different ENF types and the phenomena that determine ENF variations, analysis methods, stability across different geographical locations in continental Europe, interlaboratory validation tests, uncertainty of measurement, real case investigations, the effects of different compression algorithms on ENF values, and possible problems to be encountered during forensic examinations. By applying the ENF method in forensic audio/video analysis, one can determine whether and where a digital recording has been edited, establish whether it was made at the time claimed, and identify the date and time at which it was recorded.
Convention Paper 7273
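
As a rough illustration of the ENF idea described in the abstract above, the sketch below estimates the mains-hum frequency in successive frames of a signal. It is not the paper's method: the function name estimate_enf, the 400 Hz sample rate, the 2-second frame length, the 0.01 Hz search grid, and the synthetic test tones are all assumptions made for this example. Only the 50 Hz nominal value reflects the continental European grid mentioned in the abstract. Matching the resulting frequency track against a reference ENF database is not shown.

```python
import math

def estimate_enf(frame, fs, nominal=50.0, span=0.5, step=0.01):
    """Return the candidate frequency (Hz) near `nominal` whose single-bin
    DFT has the largest power over the frame (hypothetical helper)."""
    best_f, best_p = nominal, -1.0
    n_candidates = int(round(2 * span / step)) + 1
    for k in range(n_candidates):
        f = nominal - span + k * step
        w = 2 * math.pi * f / fs
        # Correlate the frame with a cosine and a sine at the candidate frequency.
        re = sum(x * math.cos(w * n) for n, x in enumerate(frame))
        im = sum(x * math.sin(w * n) for n, x in enumerate(frame))
        p = re * re + im * im
        if p > best_p:
            best_f, best_p = f, p
    return best_f

# Synthetic stand-in for recorded hum: three 2-second frames whose
# frequency drifts slightly around 50 Hz (values chosen for illustration).
fs = 400
true_enf = [49.98, 50.00, 50.03]
frames = [[math.sin(2 * math.pi * f * n / fs) for n in range(2 * fs)]
          for f in true_enf]

estimated = [round(estimate_enf(frame, fs), 2) for frame in frames]
print(estimated)
```

In practice the frame-by-frame track would be compared with a logged reference from the power grid; longer frames give finer frequency resolution at the cost of time resolution.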

P18-3 Quantifying the Speaking Voice: Generating a Speaker Code as a Means of Speaker Identification Using a Simple Code-Matching Technique
Peter S. Popolo, National Center for Voice and Speech - Denver, CO, USA, and University of Iowa, Iowa City, IA, USA; Richard W. Sanders, National Center for Voice and Speech - Denver, CO, USA, and University of Colorado at Denver, Denver, CO, USA; Ingo R. Titze, University of Iowa - Iowa City, IA, USA, and University of Colorado at Denver, Denver, CO, USA
This paper looks at a methodology for quantifying the speaking voice, by which temporal and spectral features of the voice are extracted and processed to create a numeric code that identifies speakers, so that those speakers can be searched for in a database much like fingerprints. The parameters studied include: (1) the average fundamental frequency (F0) of the speech signal over time, (2) the standard deviation of the F0, (3) the slope and (4) the sign of the F0 contour, (5) the average energy, (6) the standard deviation of the energy, (7) the spectral energy contained from 50 Hz to 1,000 Hz, (8) the spectral energy from 1,000 Hz to 5,000 Hz, (9) the Alpha Ratio, (10) the average speaking rate, and (11) the total duration of the spoken sentence.
Convention Paper 7274
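
A few of the parameters listed in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the helper names f0_features and alpha_ratio_db, the 10 ms hop between F0 samples, and the example contour and band energies are hypothetical. The abstract does not define the Alpha Ratio, so it is assumed here to be the ratio of band (8) to band (7) energy, expressed in dB. How the features are quantized into a speaker code is not specified in the abstract and is not attempted.

```python
import math
from statistics import mean, pstdev

def f0_features(f0_hz, hop_s):
    """Parameters (1)-(4): mean, standard deviation, least-squares slope,
    and slope sign of an F0 contour sampled every `hop_s` seconds."""
    times = [i * hop_s for i in range(len(f0_hz))]
    t_bar, f_bar = mean(times), mean(f0_hz)
    # Ordinary least-squares slope of F0 against time, in Hz per second.
    slope = (sum((t - t_bar) * (f - f_bar) for t, f in zip(times, f0_hz))
             / sum((t - t_bar) ** 2 for t in times))
    sign = (slope > 0) - (slope < 0)  # +1 rising, -1 falling, 0 flat
    return {"f0_mean": f_bar, "f0_sd": pstdev(f0_hz),
            "f0_slope": slope, "f0_slope_sign": sign}

def alpha_ratio_db(e_low, e_high):
    """Parameter (9), ASSUMED to be 10*log10(E[1-5 kHz] / E[50 Hz-1 kHz])."""
    return 10 * math.log10(e_high / e_low)

# Hypothetical inputs: a falling F0 contour and two band energies.
contour = [120.0, 118.0, 116.0, 114.0, 112.0]
features = f0_features(contour, hop_s=0.01)
features["alpha_ratio_db"] = alpha_ratio_db(e_low=1.0, e_high=0.1)
print(features)
```

The F0 contour and band energies would normally come from a pitch tracker and a spectral analysis of the recorded sentence; they are hard-coded here to keep the sketch self-contained.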

P18-4 Further Investigations into the ENF Criterion for Forensic Authentication
Eddy Brixen, EBB-consult - Smørum, Denmark
In forensic audio, one important task is the authentication of audio recordings. In the field of digital audio and digital media, no single complete methodology has yet been demonstrated. However, the ENF (Electric Network Frequency) Criterion has shown promising results and should be regarded as a major tool in that respect. By tracing the electric network frequency in the recorded signal, a unique timestamp is obtained. This paper analyzes a number of situations in order to provide further information for the assessment of this methodology. The topics are: ways to establish reference data, the spectral content of the electromagnetic fields, low bit rate codecs’ treatment of low-level hum components, and tracing ENF harmonic components.
Convention Paper 7275

