CONFERENCE PAPER
A Comparative Evaluation of Preprocessing Techniques for Short Texts in Spanish
Date
2020
Record in:
978-303039441-7
2194-5357
10.1007/978-3-030-39442-4_10
Authors
Orellana Cordero, Marcos Patricio
Trujillo, Andrea
Cedillo Orellana, Irene Priscila
Institution
Abstract
Natural Language Processing (NLP) is used to identify key information, generate predictive models, and explain global events or trends. NLP also supports the process of creating knowledge. It is therefore important to apply refinement techniques in major stages such as preprocessing, where data is frequently produced and processed with poor results. This paper analyzes and measures the impact of combinations of preprocessing techniques and libraries on short texts written in Spanish. The techniques were applied to tweets for sentiment analysis, and the evaluation took into account the evaluation parameters of the analysis, the processing time, and the characteristics of the techniques in each library. The experiments give readers insight for choosing an appropriate combination of techniques during preprocessing. The results show an improvement of 5% to 9% in classification performance.
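As an illustration of what "combinations of preprocessing techniques" can mean for Spanish tweets, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the technique names, the tiny stopword list, and the helper functions `preprocess` and `all_combinations` are assumptions made for this example, and the abstract does not name the techniques or libraries that were actually compared.

```python
import re
import unicodedata
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"de", "la", "el", "que", "y", "en", "a", "los", "un", "una", "es"}


def lowercase(text):
    return text.lower()


def remove_urls_mentions(text):
    # Drop http(s) links and @user mentions, which are common in tweets.
    return re.sub(r"https?://\S+|@\w+", " ", text)


def remove_punctuation(text):
    # \w is Unicode-aware in Python 3, so accented letters are kept.
    return re.sub(r"[^\w\s]", " ", text)


def strip_accents(text):
    # Decompose characters and drop combining marks (note: ñ becomes n).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def remove_stopwords(text):
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


# Dictionary order is the order in which selected techniques are applied.
TECHNIQUES = {
    "lowercase": lowercase,
    "remove_urls_mentions": remove_urls_mentions,
    "remove_punctuation": remove_punctuation,
    "strip_accents": strip_accents,
    "remove_stopwords": remove_stopwords,
}


def preprocess(text, names):
    """Apply the named techniques in the given order, then collapse whitespace."""
    for name in names:
        text = TECHNIQUES[name](text)
    return " ".join(text.split())


def all_combinations():
    """Yield every non-empty combination of technique names."""
    names = list(TECHNIQUES)
    for r in range(1, len(names) + 1):
        yield from combinations(names, r)


if __name__ == "__main__":
    tweet = "¡Qué BUENA la película! @user https://t.co/x"
    for combo in all_combinations():
        print(combo, "->", preprocess(tweet, combo))
```

In an evaluation of the kind the abstract describes, each combination would presumably feed a sentiment classifier, with classification performance and processing time recorded per combination; that step is omitted here because the abstract does not specify the classifier or the metrics.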