Towards Web Spam Filtering With Neural-based Approaches

Silva R.M.; Almeida T.A.; Yamakami A.

Conference proceedings

Registered in:

9783642346538

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer Verlag, v. 7637 LNAI, p. 199-209, 2012.

3029743

10.1007/978-3-642-34654-5_21

http://www.scopus.com/inward/record.url?eid=2-s2.0-84906718052&partnerID=40&md5=53c990d4c929dabfe2cdf7ecdefa868f

http://www.repositorio.unicamp.br/handle/REPOSIP/97105

http://repositorio.unicamp.br/jspui/handle/REPOSIP/97105

2-s2.0-84906718052

http://repositorioslatinoamericanos.uchile.cl/handle/2250/1245971

Authors

Silva R.M.

Almeida T.A.

Yamakami A.

Institution

Universidade Estadual de Campinas (Brazil)

Abstract

The steady growth and popularization of the Web increases competition between websites and creates opportunities for profit in several segments. Thus, there is great interest in keeping a website well positioned in search results. The problem is that many websites use techniques to circumvent search engines, which degrades the search results and exposes users to dangerous content. Given this scenario, this paper presents a performance evaluation of different artificial neural network models for automatically classifying web spam. We conducted an empirical experiment using a well-known, large, public web spam database. The results indicate that the evaluated approaches outperform state-of-the-art web spam filters. © Springer-Verlag Berlin Heidelberg 2012.

et al., Sociedad Colombiana de Computacion (SCo2), Universidad de Caldas, Universidad Nacional de Colombia, Universidad Tecnologica de Bolivar en Cartagena, Universidad Tecnologica de Pereira

Publisher: Springer Verlag

Shengen, L., Xiaofei, N., Peiqi, L., Lin, W., Generating new features using genetic programming to detect link spam (2011) 2011 Intl. Conf. on Intelligent Computation Technology and Automation (ICICTA 2011), pp. 135-138. IEEE

Araujo, L., Martinez-Romo, J., Web spam detection: new classification features based on qualified link analysis and language models (2010) IEEE Transactions on Information Forensics and Security, 5 (3), pp. 581-590

Egele, M., Kolbitsch, C., Platzer, C., Removing web spam links from search engine results (2011) Journal in Computer Virology, 7 (1), pp. 51-62

Shen, G., Gao, B., Liu, T., Feng, G., Song, S., Li, H., Detecting link spam using temporal information (2006) 6th Intl. Conf. on Data Mining (ICDM 2006), pp. 1049-1053. IEEE

Gan, Q., Suel, T., Improving web spam classifiers using link structure (2007) 3rd Intl. Work. on Adversarial Information Retrieval on the Web (AIRWeb 2007), pp. 17-20. ACM

Ntoulas, A., Najork, M., Manasse, M., Fetterly, D., Detecting spam web pages through content analysis (2006) 15th Intl. Conf. on World Wide Web (WWW 2006), pp. 83-92. ACM

Silva, R.M., Almeida, T.A., Yamakami, A., Artificial neural networks for content-based web spam detection (2012) 14th Intl. Conf. on Artificial Intelligence (ICAI 2012), pp. 1-7

Bíró, I., Siklosi, D., Szabo, J., Benczur, A.A., Linked latent Dirichlet allocation in web spam filtering (2009) 5th Intl. Work. on Adversarial Information Retrieval on the Web (AIRWeb 2009), pp. 37-40. ACM

Abernethy, J., Chapelle, O., Castillo, C., Graph regularization methods for web spam detection (2010) Machine Learning, 81 (2), pp. 207-225

Castillo, C., Donato, D., Gionis, A., Know your neighbors: web spam detection using the web topology (2007) 30th Annual Intl. ACM SIGIR Conf. on Research and Development in Information Retrieval (SIGIR 2007), pp. 423-430. ACM

Svore, K.M., Wu, Q., Burges, C.J., Improving web spam classification using rank-time features (2007) 3rd Intl. Work. on Adversarial Information Retrieval on the Web (AIRWeb 2007), pp. 9-16. ACM

Noi, L.D., Hagenbuchner, M., Scarselli, F., Tsoi, A., Web spam detection by probability mapping GraphSOMs and graph neural networks (2010) ICANN 2010, Part II. LNCS, 6353, pp. 372-381. Diamantaras, K., Duch, W., Iliadis, L.S. (eds.) Springer, Heidelberg

Largillier, T., Peyronnet, S., Using patterns in the behavior of the random surfer to detect webspam beneficiaries (2011) WISE Workshops 2010. LNCS, 6724, pp. 241-253. Chiu, D.K.W., Bellatreche, L., Sasaki, H., Leung, H.-f., Cheung, S.-C., Hu, H., Shao, J. (eds.) Springer, Heidelberg

Haykin, S., (1998) Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall, New York

Bishop, C.M., (1995) Neural Networks for Pattern Recognition, Oxford University Press, Oxford

Hagan, M.T., Menhaj, M.B., Training feedforward networks with the Marquardt algorithm (1994) IEEE Transactions on Neural Networks, 5 (6), pp. 989-993

Kohonen, T., The self-organizing map (1990) Proceedings of the IEEE, 78 (9), pp. 1464-1480

Orr, M.J.L., (1996) Introduction to Radial Basis Function Networks, Center for Cognitive Science, UK

Becchetti, L., Castillo, C., Donato, D., Leonardi, S., Baeza-Yates, R., Using rank propagation and probabilistic counting for link-based spam detection (2006) Workshop on Web Mining and Web Usage Analysis (WebKDD 2006), pp. 1-10. ACM

Shao, J., Linear model selection by cross-validation (1993) Journal of the American Statistical Association, 88 (422), pp. 486-494

Witten, I.H., Frank, E., (2005) Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn. Morgan Kaufmann, San Francisco
