bachelorThesis
Analysis and application of image processing techniques for old Ecuadorian newspapers to improve the text recognition (OCR) process.
Date
2023-07-26
Authors
Ochoa Arevalo, Kevin Ismael
Quituisaca Suconota, Lucia Carolina
Institution
Abstract
Around the world, projects are being carried out to digitize historical documents with the aim of preserving the information they contain. Many of these projects use Optical Character Recognition (OCR). However, there are currently no such projects in Ecuador. During digitization, challenges arise that degrade the quality of the information obtained through OCR, owing to problems in the image itself, such as stains, folds, and lighting, among others. It is therefore necessary to find solutions that counteract these problems and yield better-quality information.
This research work analyzes image processing techniques to improve OCR on images of old newspapers from Ecuador. The data obtained from OCR are compared and analyzed, focusing on the number of words correctly recognized in treated and untreated images, with the objective of identifying improvements in the results. For ease of analysis, the processing techniques are divided into three groups: traditional techniques, segmentation techniques, and super-resolution techniques.
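As a rough illustration of the comparison described above, the following Python sketch counts the words in an OCR output that also appear in a reference transcription. The abstract does not state how correct words are matched, so the case-insensitive bag-of-words matching, the existence of a manual reference transcription, and the function names `tokenize`, `correct_word_count`, and `word_accuracy` are all assumptions made for illustration, not the authors' procedure.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into runs of word characters."""
    return re.findall(r"\w+", text.lower())


def correct_word_count(ocr_text: str, reference_text: str) -> int:
    """Count OCR words that also occur in the reference, respecting multiplicity.

    Assumption: word order is ignored (bag-of-words matching).
    """
    ocr_counts = Counter(tokenize(ocr_text))
    ref_counts = Counter(tokenize(reference_text))
    # Counter intersection keeps the minimum count of each shared word.
    return sum((ocr_counts & ref_counts).values())


def word_accuracy(ocr_text: str, reference_text: str) -> float:
    """Fraction of reference words recovered by the OCR output."""
    ref_words = tokenize(reference_text)
    if not ref_words:
        return 0.0
    return correct_word_count(ocr_text, reference_text) / len(ref_words)
```

Running `word_accuracy` on the OCR output of an untreated image and of the same image after a given treatment, against the same reference, gives two numbers that can be compared directly. An order-aware metric such as word error rate would be a stricter alternative.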
The results show that super-resolution processes, in particular the LAPSRN technique, significantly improve OCR results. These findings have important implications for the preservation of, and access to, historical information in Ecuador.