Thesis
Audio Segmentation Using Flattened Local Trimmed Range for Ecological Acoustic Space Analysis
Author
Vega Viera, Giovany
Corrada Bravo, Carlos J. (Advisor)
Institution
Abstract
The acoustic space in a given environment is filled with footprints arising
from three processes: biophony, geophony and anthrophony. Bioacoustic
research using passive acoustic sensors can result in thousands
of recordings. An important component of processing these recordings
is to automate signal detection. In this thesis we describe a new
spectrogram-based approach for extracting individual audio events.
Spectrogram-based audio event detection (AED) relies on separating
the spectrogram into background (i.e., noise) and foreground (i.e.,
signal) classes using a threshold such as a global threshold, a per-band
threshold, or one given by a classifier. These methods are either too
sensitive to noise, designed for an individual species, or require prior
training data. Our goal is to develop an algorithm that is not sensitive
to noise, does not need any prior training data and works with any type of audio event.
To do this, we propose a spectrogram filtering method, the Flattened
Local Trimmed Range (FLTR) method, which models the spectrogram
as a mixture of stationary and non-stationary energy processes
and mitigates the effect of the stationary processes; and an unsupervised
algorithm that uses this filter to detect audio events.
We measured the performance of this algorithm using a set of six
thoroughly validated audio recordings and obtained a sensitivity of
94% and a positive predictive value of 89%. These values are high,
given that the validated recordings are diverse and were obtained
under field conditions. The algorithm was then
used to extract audio events in three datasets. Features of these audio
events were plotted and showed the unique aspect of the three acoustic
communities.
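
The two performance figures reported above follow the standard definitions: sensitivity is the fraction of validated events that the detector finds, and positive predictive value (PPV) is the fraction of detections that correspond to a validated event. The sketch below only illustrates those definitions. The function names and the event counts are hypothetical and are not taken from the thesis, and it assumes detections have already been matched against the validated events.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of validated events that were detected: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no validated events to score against")
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of detections that match a validated event: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("the detector produced no detections")
    return tp / (tp + fp)


# Hypothetical counts for illustration only (NOT the thesis data):
# 90 validated events detected, 10 missed, 30 spurious detections.
tp, fn, fp = 90, 10, 30

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV         = {positive_predictive_value(tp, fp):.2f}")
```

Reporting both values matters because a detector can trivially raise sensitivity by flagging everything, at the cost of a low PPV.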