Authentication of audio recordings is an important task in audio forensics. Splicing is the manipulation of recorded audio by replacing a segment of the original track with external sound or inserting external sound into it. Because digital audio recordings are easy to splice, forgery and tampering, whether with criminal intent or to undermine a recording's integrity, are common. This paper describes a methodology for splicing detection in digital audio recordings, together with a comparative analysis of the effectiveness of different feature sets and classifiers. Conventional, chroma, and reverberation-based feature sets are evaluated, compared, and combined to improve classification accuracy. Extensive experiments account for factors such as the duration of the attack, noise, and compression. The Analytic Hierarchy Process is used to evaluate the performance parameters and, based on the priority weights assigned to them, identify the most suitable machine learning classifier for splicing detection. Results indicate that a Long Short-Term Memory classifier with a feature set containing Mel-Frequency Cepstral Coefficients and Decay Rate Distribution features performs best among the classifiers and feature sets compared.
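To illustrate how priority weights in the Analytic Hierarchy Process can drive classifier selection, the following is a minimal Python sketch, not the paper's implementation. It assumes three hypothetical performance parameters (accuracy, precision, recall), a made-up pairwise comparison matrix, and placeholder classifier scores; none of these values come from the paper. The weights are derived with the row geometric mean approximation, one common way to estimate AHP priorities, and the function names `ahp_weights` and `rank_classifiers` are illustrative.

```python
import math


def ahp_weights(pairwise):
    """Estimate AHP priority weights from a reciprocal pairwise
    comparison matrix using the row geometric mean approximation."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_classifiers(scores, weights):
    """Combine per-parameter scores into one weighted score per
    classifier and return (name, score) pairs, best first."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical judgments: entry [i][j] states how much more important
# parameter i is than parameter j (order: accuracy, precision, recall).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)

# Placeholder scores in [0, 1]; these are NOT results from the paper.
scores = {
    "classifier_a": [0.90, 0.80, 0.85],
    "classifier_b": [0.80, 0.90, 0.90],
}
ranking = rank_classifiers(scores, weights)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these assumed judgments, accuracy receives the largest weight, so the classifier with the higher accuracy ranks first even though it scores lower on the other two parameters. A full AHP analysis would also check the consistency of the pairwise judgments, which this sketch omits.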