dc.contributorAlmeida, Tiago Agostinho de
dc.contributorhttp://lattes.cnpq.br/5368680512020633
dc.contributorhttp://lattes.cnpq.br/0353538405905082
dc.creatorAlberto, Túlio Casagrande
dc.date.accessioned2017-10-03T19:07:37Z
dc.date.available2017-10-03T19:07:37Z
dc.date.created2017-10-03T19:07:37Z
dc.date.issued2017-02-03
dc.identifierALBERTO, Túlio Casagrande. TubeSpam: filtragem automática de comentários indesejados postados no YouTube. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/ufscar/9137.
dc.identifierhttps://repositorio.ufscar.br/handle/ufscar/9137
dc.description.abstractYouTube has become an important video-sharing platform. Many users regularly produce video content and make this activity their main livelihood. However, this success also attracts malicious users who post undesired comments and videos for self-promotion or to spread malicious links that may lead to malware and viruses. Since YouTube offers limited tools for blocking spam, the volume of such messages is increasing sharply, harming users and channel owners. Besides being an inherently online problem, comment spam filtering on YouTube differs from traditional email spam filtering because the messages are very short and often rife with spelling errors, slang, symbols and abbreviations. This manuscript presents a performance evaluation of traditional online classification methods, aided by lexical normalization and semantic indexing techniques, applied to the automatic filtering of YouTube comment spam. It also evaluates the performance of MDLText, a promising text classification method based on the minimum description length principle. Statistical analysis of the results indicates that MDLText, Passive-Aggressive, Naïve Bayes, MDL and Online Gradient Descent achieved statistically equivalent performance. The results also indicate that lexical normalization and semantic indexing are effective for this problem. Based on these results, TubeSpam, an online tool for automatically filtering undesired comments posted on YouTube, is proposed and designed.
dc.languagepor
dc.publisherUniversidade Federal de São Carlos
dc.publisherUFSCar
dc.publisherPrograma de Pós-Graduação em Ciência da Computação - PPGCC-So
dc.publisherCâmpus Sorocaba
dc.rightsOpen access
dc.subjectYoutube (Recurso eletrônico)
dc.subjectAprendizado do computador
dc.subjectSpam (Mensagens eletrônicas)
dc.subjectComentários indesejados
dc.subjectSpam (Electronic mail)
dc.subjectMachine learning
dc.subjectUndesired comments
dc.titleTubeSpam: filtragem automática de comentários indesejados postados no YouTube
dc.typeThesis