Fake news detection on Twitter using a data mining framework based on explainable machine learning techniques

Puraivan, E.; Godoy, E.; Riquelme, F.; Salas, R.

Journal article

Registered at:

10.1049/icp.2021.1450

https://hdl.handle.net/20.500.12536/1799

https://repositorioslatinoamericanos.uchile.cl/handle/2250/8438014

Author

Puraivan, E.

Godoy, E.

Riquelme, F.

Salas, R.

Institution

Universidad Viña del Mar (Chile)

Abstract

Online social networks are a powerful communication and information dissemination tool, particularly useful in complex scenarios such as social crises, natural disasters, and pandemics. However, one of the main problems, especially in socio-political crises, is the automatic detection of fake news. This problem is usually addressed, with varying success, using supervised machine learning techniques. In this work, we propose a mixed approach that uses unsupervised learning for feature extraction and supervised learning for the prediction of fake news on microblogging networks. We consider Twitter news described by linguistic and network features. To identify hidden patterns in the data, we use Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE). The results show that the data are better separated by non-linear than by linear methods. Moreover, when using Extreme Gradient Boosting (XGBoost), an accuracy of 99.26% is obtained, and the most relevant features are identified.
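
As an illustration of the unsupervised feature-extraction step named in the abstract, the following is a minimal sketch of a Principal Component Analysis projection computed with NumPy. It is not the authors' implementation: the data are synthetic, the five columns are hypothetical stand-ins for linguistic and network features, and the function name `pca_project` is an assumption introduced here.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (via SVD)."""
    # PCA operates on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing
    # singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Share of the total variance captured by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    Z = X_centered @ Vt[:n_components].T
    return Z, explained[:n_components]


# Synthetic stand-in for a tweet feature matrix (assumption, not real data):
# 200 samples, 5 hypothetical features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

Z, ratio = pca_project(X, n_components=2)
print("projected shape:", Z.shape)
print("variance captured by 2 components:", ratio.sum())
```

In a pipeline like the one the abstract describes, a low-dimensional projection of this kind would be inspected for class structure before the labeled data are passed to a supervised classifier such as XGBoost.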

Subjects

Twitter

Fake news

Social crisis

Explainable machine learning

Data mining
