Conference proceedings
Artificial Neural Networks For Content-based Web Spam Detection
Registered in:
1601322186; 9781601322180
Proceedings of the 2012 International Conference on Artificial Intelligence, ICAI 2012, v. 1, p. 209-215, 2012.
2-s2.0-84875128798
Authors
Silva R.M.
Almeida T.A.
Yamakami A.
Institution
Abstract
Web spam has become a serious problem for Internet users, causing personal injury and economic losses. Although some approaches have been proposed to detect and avoid this problem automatically, the speed at which spammers improve their techniques requires classifiers that are more generic, efficient, and highly adaptive. Although it is widely accepted in the literature that neural-based techniques have a high capacity for generalization and adaptation, as far as we know no work has explored such methods to avoid web spam. Given this scenario, and to fill this important gap, this paper presents a performance evaluation of different artificial neural network models used to automatically classify and filter real samples of web spam based on their content. The results indicate that some of the evaluated approaches have great potential, since they are well suited to the problem and clearly outperform the state-of-the-art techniques.
George Mason Univ., Bioinformatics Comput. Biol. Program, HST Harvard Univ. MIT, Biomed. Cybern. Lab., University of Minnesota, Minnesota Supercomputing Institute, Center for Cyber Defense, NCAT, Argonne's Leadersh. Comput. Facil. Argonne Natl. Lab.