AES E-Library

Audio Pattern Recognition of Baby Crying Sound Events

Infants can communicate their internal state (such as pain, hunger, fear, fatigue, or stress) by the nature of their crying. Experts in linguistics suggest that the cry constitutes the first manifestation of speech. This article describes a design methodology for classifying baby crying sound events according to the pathological status of the infant. Such an automated system can aid an attending physician in performing a diagnosis. To address this challenge, a wide range of audio parameters (Perceptual Linear Prediction, Mel Frequency Cepstral Coefficients, Perceptual Wavelet Packets, Teager Energy Operator, Temporal Modulation) were considered. Classification techniques, including Multilayer Perceptron, Support Vector Machine, Random Forest, Reservoir Network, Gaussian Mixture Model, and Hidden Markov Model, were customized. The goal is to provide an automatic and noninvasive framework for monitoring infants and for helping inexperienced or trainee pediatricians, parents, and baby caregivers identify the baby’s pathological status.
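
As a rough illustration of one of the audio parameters listed above, the discrete Teager Energy Operator is commonly defined as psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below is not the paper's implementation: the function names `teager_energy` and `frame_mean_teo`, the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration only.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the two edge samples have no neighbours.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_mean_teo(x, frame_len, hop):
    """Hypothetical frame-level feature: mean Teager energy per frame."""
    feats = []
    for start in range(0, len(x) - frame_len + 1, hop):
        teo = teager_energy(x[start:start + frame_len])
        feats.append(sum(teo) / len(teo))
    return feats


# Synthetic stand-in for an audio signal (assumed values, not cry data):
# a 400 Hz cosine sampled at 8 kHz with amplitude 0.5.
fs = 8000
f0 = 400
amp = 0.5
signal = [amp * math.cos(2 * math.pi * f0 * n / fs) for n in range(800)]

features = frame_mean_teo(signal, frame_len=200, hop=100)
```

For a pure tone the operator returns the constant amp^2 * sin^2(2 * pi * f0 / fs), so it tracks both amplitude and frequency; this is one reason it is often used as an energy-style feature. In a pipeline like the one the abstract outlines, such frame-level values would presumably be combined with other features before being passed to a classifier.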

JAES Volume 63 Issue 5 pp. 358-369; May 2015
AES - Audio Engineering Society