Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species

dc.date.accessioned	2019-01-29T22:19:50Z
dc.date.accessioned	2023-05-30T23:27:34Z
dc.date.available	2019-01-29T22:19:50Z
dc.date.available	2023-05-30T23:27:34Z
dc.date.created	2019-01-29T22:19:50Z
dc.date.issued	2017
dc.identifier	19326203
dc.identifier	http://repositorio.ucsp.edu.pe/handle/UCSP/15785
dc.identifier	https://doi.org/10.1371/journal.pone.0179403
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/6477598
dc.description.abstract	Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this specific information through the non-negative decomposition of bird sound spectrograms. It consists of two stages: short-time feature extraction and temporal feature integration. In the first stage, which aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, two methods are evaluated. In the first method, cepstral-like features (NMF_CC) are extracted by using a filter bank that is automatically learned by applying Non-Negative Matrix Factorization (NMF) to bird audio spectrograms. In the second method, the features (H_CC) are directly derived from the activation coefficients of the spectrogram decomposition obtained through NMF. The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that the use of NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC. © 2017 Ludeña-Choez et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
dc.language	eng
dc.publisher	Public Library of Science
dc.relation	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85020903270&doi=10.1371%2fjournal.pone.0179403&partnerID=40&md5=2ffdb02b008e1d9dd3cd7db2ef308770
dc.rights	info:eu-repo/semantics/restrictedAccess
dc.source	Repositorio Institucional - UCSP
dc.source	Universidad Católica San Pablo
dc.source	Scopus
dc.subject	acoustic analysis
dc.subject	analytic method
dc.subject	animal experiment
dc.subject	Article
dc.subject	audiometry
dc.subject	bird
dc.subject	controlled study
dc.subject	decomposition
dc.subject	hidden Markov model
dc.subject	kernel method
dc.subject	mel frequency cepstral coefficients
dc.subject	nonhuman
dc.subject	sound detection
dc.subject	species difference
dc.subject	speech analysis
dc.subject	support vector machine
dc.subject	task performance
dc.subject	vocalization
dc.subject	animal
dc.subject	classification
dc.title	Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species
dc.type	info:eu-repo/semantics/article
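
The two-stage front-end described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the NMF cost function, the number of bases, how the H_CC features are derived from the activations, or which statistics are used for temporal integration. The sketch therefore assumes Euclidean-cost multiplicative updates, 16 bases, a log followed by a DCT for both feature types, and per-segment mean and standard deviation. All function names are hypothetical, and a random matrix stands in for a real magnitude spectrogram.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    Assumption: multiplicative updates for the Euclidean cost;
    the abstract does not say which cost the paper uses.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral bases
    H = rng.random((rank, n_frames)) + eps  # activation coefficients
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def dct2(x, n_coeffs):
    """Unnormalized DCT-II along axis 0, keeping the first n_coeffs rows."""
    n = x.shape[0]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return basis @ x

def cepstral_like(W, V, n_coeffs=8, eps=1e-9):
    """NMF_CC-style features: the columns of W act as a learned filter bank."""
    energies = W.T @ V  # (rank x frames) filter-bank outputs
    return dct2(np.log(energies + eps), n_coeffs)

def integrate(features, seg_len):
    """Temporal integration: mean and std over non-overlapping segments.

    Assumption: the abstract only says "several statistical measures".
    """
    n_seg = features.shape[1] // seg_len
    rows = []
    for s in range(n_seg):
        seg = features[:, s * seg_len:(s + 1) * seg_len]
        rows.append(np.concatenate([seg.mean(axis=1), seg.std(axis=1)]))
    return np.array(rows)

# Synthetic stand-in for a magnitude spectrogram (64 bins x 100 frames).
V = np.random.default_rng(1).random((64, 100))

W, H = nmf(V, rank=16)
nmf_cc = cepstral_like(W, V)              # stage 1, first method
h_cc = dct2(np.log(H + 1e-9), 8)          # stage 1, second method (assumed form)
seg_feats = integrate(nmf_cc, seg_len=20)  # stage 2
```

Each row of `seg_feats` is one segment-level vector that could be passed to a classifier such as an SVM. In practice the bases `W` would be learned on training spectrograms and then held fixed when features are extracted from new recordings.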

This item belongs to the following institution

Universidad Católica San Pablo (Perú)