bachelorThesis
Application of audio segmentation and automatic dialect recognition technologies to obtain information from dialogues contained in audio
Date
2017-05-11
Author
Sigcha Quezada, Erik Alejandro
Institution
Abstract
The interest of the scientific community in identifying audiovisual content has grown considerably in recent years, driven by the need to automatically classify and monitor the growing volume of content broadcast by media such as television, radio, and the internet. This article proposes an architecture for extracting information from audio, intended for the analysis of television content in the Ecuadorian context. Two services are defined for this purpose: an audio segmentation service and a transcription service. The segmentation service identifies and extracts audio segments containing speech, music, or speech with a musical background. The transcription service then recognizes the speech segments and returns their content as text. Both services, and the tools that compose them, were evaluated to measure their performance and, for the tools, to determine which best fits the proposed architecture. The evaluation results show that a speech recognition system built from existing open source tools offers higher precision than a generally available transcription service.