Topic modeling: a systematic review of the scientific literature on Latent Dirichlet Allocation (LDA).

Nagua Domínguez, Roger Andrés

dc.contributor	Pilacuán Bonete, Luis Manuel
dc.creator	Nagua Domínguez, Roger Andrés
dc.date.accessioned	2023-04-19T18:55:42Z
dc.date.accessioned	2023-05-22T19:53:59Z
dc.date.available	2023-04-19T18:55:42Z
dc.date.available	2023-05-22T19:53:59Z
dc.date.created	2023-04-19T18:55:42Z
dc.date.issued	2023-03-24
dc.identifier	http://repositorio.ug.edu.ec/handle/redug/67264
dc.identifier.uri	https://repositorioslatinoamericanos.uchile.cl/handle/2250/6328917
dc.description.abstract	The purpose of this research is to carry out a bibliometric analysis of the scientific literature in the Scopus, ScienceDirect and Web of Science databases, extracting the articles from the last 10 years that deal with the probabilistic model Latent Dirichlet Allocation (LDA), and to apply that same model to the resulting dataset using the LDAShiny program. The most relevant bibliometric indicators of the results were identified, along with the 18 topics with the highest probabilistic coherence within the corpus, whose main terms were machine learning, heart health and text extraction. In addition, 18 topics were proposed according to the text analysis based on the Phi matrix, which gives the posterior probability per topic and per word. Finally, the Data Science dendrogram was presented, which shows the hierarchical clustering of the topics according to the optimal number of clusters defined.
dc.language	spa
dc.publisher	Universidad de Guayaquil. Facultad de Ingeniería Industrial. Carrera de Ingeniería Industrial.
dc.relation	;BINGI06491
dc.rights	openAccess
dc.subject	NEW PROJECTS
dc.subject	BIBLIOMETRIC ANALYSIS
dc.subject	LATENT DIRICHLET ALLOCATION
dc.subject	MACHINE LEARNING
dc.title	Modelado de tópicos, una revisión sistemática de la literatura científica de Latent Dirichlet Allocation LDA.
dc.type	Thesis

This item belongs to the following institution

Universidad de Guayaquil (Ecuador)