dc.creatorAlbañil Sánchez, Misael Andrey
dc.creatorGalpin, Ixent
dc.date.accessioned2022-07-22T19:15:19Z
dc.date.accessioned2022-09-23T18:47:30Z
dc.date.available2022-07-22T19:15:19Z
dc.date.available2022-09-23T18:47:30Z
dc.date.created2022-07-22T19:15:19Z
dc.identifierhttp://hdl.handle.net/20.500.12010/27756
dc.identifierhttp://expeditio.utadeo.edu.co
dc.identifier.urihttp://repositorioslatinoamericanos.uchile.cl/handle/2250/3507467
dc.description.abstractThroughout the world, the provision of online goods and services has increased significantly over the last few years. We consider the case of Tango Discos, a small company in Colombia that sells entertainment products through an e-commerce website and receives customer messages through various channels, including a webform, email, Facebook and Twitter. The dataset comprises 29,970 messages collected from 2019 to 2021. Each message can be categorized as a sale, a request or a complaint. In this work we evaluate different supervised classification models to automate the task of classifying the messages, viz. decision trees, Naive Bayes, linear Support Vector Machines and logistic regression. As the dataset is unbalanced, the models are evaluated in combination with various data balancing approaches to obtain the best performance. In order to maximize revenue, the management is interested in prioritizing messages that may result in potential sales. As such, the best model for deployment is one that minimizes false positives in the sales category, so that sales messages are processed in a timely fashion. By this criterion, the best-performing model is found to be the linear Support Vector Machine combined with the Random Over Sampler balancing technique. This model is deployed in the cloud and exposed through a RESTful interface.
dc.languageeng
dc.publisherUniversidad de Bogotá Jorge Tadeo Lozano
dc.publisherMaestría en Ingeniería y Analítica de Datos
dc.relationAdaji, I., Kiron, N., Vassileva, J.: Evaluating the susceptibility of e-commerce shoppers to persuasive strategies. a game-based approach. In: International Conference on Persuasive Technology. pp. 58–72. Springer (2020)
dc.relationAlghoul, A., Al Ajrami, S., Al Jarousha, G., Harb, G., Abu-Naser, S.S.: Email classification using artificial neural network (2018)
dc.relationBlackSip, Vtex, Nielsen, PayU, Credibanco, MercadoLibre, Rappi, emBlue, Icommkt: BlackIndex: reporte del ecommerce en Colombia. BlackSip (2019)
dc.relationBusemann, S., Schmeier, S., Arens, R.G.: Message classification in the call center. arXiv preprint cs/0003060 (2000)
dc.relationConfecamaras: https://confecamaras.org.co (January 13, 2022)
dc.relationDuan, L., Li, A., Huang, L.: A new spam short message classification. In: 2009 First International Workshop on Education Technology and Computer Science. vol. 2, pp. 168–171. IEEE (2009)
dc.relationFang, W., Luo, H., Xu, S., Love, P.E., Lu, Z., Ye, C.: Automated text classification of near-misses from safety reports: An improved deep learning approach. Advanced Engineering Informatics 44, 101060 (2020)
dc.relationManning, C., Raghavan, P., Schütze, H.: Introduction to information retrieval. Natural Language Engineering 16(1), 100–103 (2010)
dc.relationMansoor, R., Jayasinghe, N.D., Muslam, M.M.A.: A comprehensive review on email spam classification using machine learning algorithms. In: 2021 International Conference on Information Networking (ICOIN). pp. 327–332. IEEE (2021)
dc.relationMasterov, D.V., Mayer, U.F., Tadelis, S.: Canary in the e-commerce coal mine: Detecting and predicting poor experiences using buyer-to-seller messages. In: Proceedings of the Sixteenth ACM Conference on Economics and Computation. pp. 81–93 (2015)
dc.relationMenini, S., Moretti, G., Corazza, M., Cabrio, E., Tonelli, S., Villata, S.: A system to monitor cyberbullying based on message classification and social network analysis. In: Proceedings of the third workshop on abusive language online. pp. 105–110 (2019)
dc.relationMohammed, R., Rawashdeh, J., Abdullah, M.: Machine learning with oversampling and undersampling techniques: overview study and experimental results. In: 2020 11th international conference on information and communication systems (ICICS). pp. 243–248. IEEE (2020)
dc.relationNkansah, E.A.: Kayayo: An e-commerce site with recommendations and text messaging (2013)
dc.relationOzel, S.A., Saraç, E., Akdemir, S., Aksu, H.: Detection of cyberbullying on social media messages in Turkish. In: 2017 International Conference on Computer Science and Engineering (UBMK). pp. 366–370. IEEE (2017)
dc.relationWebster, J.J., Kit, C.: Tokenization as the initial phase in NLP. In: COLING 1992 volume 4: The 14th international conference on computational linguistics (1992)
dc.relationWirth, R., Hipp, J.: CRISP-DM: Towards a standard process model for data mining. In: Proceedings of the 4th international conference on the practical applications of knowledge discovery and data mining. vol. 1, pp. 29–39. Manchester (2000)
dc.relationZois, D.S., Kapodistria, A., Yao, M., Chelmis, C.: Optimal online cyberbullying detection. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). pp. 2017–2021. IEEE (2018)
dc.rightsinfo:eu-repo/semantics/openAccess
dc.rightsOpen (Full Text)
dc.sourceinstname:Universidad de Bogotá Jorge Tadeo Lozano
dc.sourcereponame:Expeditio Repositorio Institucional UJTL
dc.subjectE-Commerce
dc.titleClassifying incoming customer messages for an e-commerce site using supervised learning

