TubeSpam: filtragem automática de comentários indesejados postados no YouTube

Alberto, Túlio Casagrande

dc.contributor	Almeida, Tiago Agostinho de
dc.contributor	http://lattes.cnpq.br/5368680512020633
dc.contributor	http://lattes.cnpq.br/0353538405905082
dc.creator	Alberto, Túlio Casagrande
dc.date.accessioned	2017-10-03T19:07:37Z
dc.date.available	2017-10-03T19:07:37Z
dc.date.created	2017-10-03T19:07:37Z
dc.date.issued	2017-02-03
dc.identifier	ALBERTO, Túlio Casagrande. TubeSpam: filtragem automática de comentários indesejados postados no YouTube. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/ufscar/9137.
dc.identifier	https://repositorio.ufscar.br/handle/ufscar/9137
dc.description.abstract	YouTube has become an important video-sharing platform. Many users regularly produce video content and make this activity their main livelihood. However, this success also attracts malicious users who post undesired comments and videos for self-promotion or to disseminate malicious links that may lead to malware and viruses. Since YouTube offers limited tools for blocking spam, the volume of such messages is increasing sharply, harming users and channel owners. Besides being a naturally online problem, comment spam filtering on YouTube differs from traditional email spam filtering because the messages are very short and often rife with spelling errors, slang, symbols, and abbreviations. This manuscript presents a performance evaluation of traditional online classification methods, aided by lexical normalization and semantic indexing techniques, when applied to automatically filtering YouTube comment spam. The performance of MDLText, a promising text classification method based on the minimum description length principle, was also evaluated. Statistical analysis of the results indicates that MDLText, Passive-Aggressive, Naïve Bayes, MDL, and Online Gradient Descent obtained statistically equivalent performance. The results also indicate that lexical normalization and semantic indexing techniques are effective for this problem. Based on these results, TubeSpam, an online tool for automatically filtering undesired comments posted on YouTube, is proposed and designed.
dc.language	por
dc.publisher	Universidade Federal de São Carlos
dc.publisher	UFSCar
dc.publisher	Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
dc.publisher	Câmpus Sorocaba
dc.rights	Open access
dc.subject	Youtube (Recurso eletrônico)
dc.subject	Aprendizado do computador
dc.subject	Spam (Mensagens eletrônicas)
dc.subject	Comentários indesejados
dc.subject	Spam (Electronic mail)
dc.subject	Machine learning
dc.subject	Undesired comments
dc.title	TubeSpam: filtragem automática de comentários indesejados postados no YouTube
dc.type	Thesis

This item belongs to the following institution

Universidade Federal de São Carlos (Brazil)