Clustering binary data by application of combinatorial optimization heuristics

Trejos Zelaya, Javier; Amaya Briceño, Luis Eduardo; Jiménez Romero, Alejandra; Murillo Fernández, Alex; Piza Volio, Eduardo; Villalobos Arias, Mario Alberto

Working paper

Date

2019-08-09

Registered at:

https://arxiv.org/abs/2001.01809

https://hdl.handle.net/10669/81593

https://repositorioslatinoamericanos.uchile.cl/handle/2250/4517844

Authors

Trejos Zelaya, Javier

Amaya Briceño, Luis Eduardo

Jiménez Romero, Alejandra

Murillo Fernández, Alex

Piza Volio, Eduardo

Villalobos Arias, Mario Alberto

Institution

Universidad de Costa Rica

Abstract

We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters. Five new methods are introduced, using neighborhood-based and population-based combinatorial optimization metaheuristics: the first three are simulated annealing, threshold accepting, and tabu search; the other two are a genetic algorithm and ant colony optimization. The methods are implemented, with the parameters of the heuristics calibrated to ensure good results. On a set of 16 data tables generated by a quasi-Monte Carlo experiment, the methods are compared, for one of the aggregation criteria using L1 dissimilarity, with hierarchical clustering and with a version of k-means: partitioning around medoids (PAM). Simulated annealing performs very well, especially compared to the classical methods.
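To make the idea in the abstract concrete, the following is a minimal sketch of simulated annealing applied to partitioning binary vectors under an L1 dissimilarity. It is not the authors' implementation. The criterion used here (the sum, over clusters, of pairwise L1 dissimilarities divided by cluster size) is an assumption chosen for illustration, and the names `l1`, `within_cost`, `anneal`, and the parameters `t0`, `cooling`, `steps_per_temp`, and `t_min` are hypothetical.

```python
import math
import random


def l1(a, b):
    """L1 dissimilarity between two binary vectors (equals the Hamming distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def within_cost(data, labels, k):
    """Assumed compactness criterion: for each cluster, the sum of pairwise
    L1 dissimilarities divided by the cluster size; lower is more compact."""
    cost = 0.0
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        n = len(members)
        if n:
            pair_sum = sum(
                l1(members[i], members[j])
                for i in range(n)
                for j in range(i + 1, n)
            )
            cost += pair_sum / n
    return cost


def anneal(data, k, t0=1.0, cooling=0.95, steps_per_temp=50, t_min=1e-3, seed=0):
    """Simulated annealing over partitions into at most k clusters (k >= 2).
    A move reassigns one randomly chosen object to a different cluster."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in data]  # random initial partition
    cost = within_cost(data, labels, k)
    best_labels, best_cost = labels[:], cost
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            i = rng.randrange(len(data))
            old = labels[i]
            labels[i] = rng.choice([c for c in range(k) if c != old])
            new_cost = within_cost(data, labels, k)
            delta = new_cost - cost
            # Metropolis rule: always accept improvements, and accept a
            # worsening move with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_labels, best_cost = labels[:], cost
            else:
                labels[i] = old  # reject: undo the move
        t *= cooling  # geometric cooling schedule
    return best_labels, best_cost
```

A call such as `anneal(data, k=2)`, where `data` is a list of equal-length 0/1 lists, returns the best partition found and its cost. The sketch recomputes the full criterion at every move for clarity; a practical version would update the cost incrementally, and the cooling parameters would need calibration, as the abstract notes for the heuristics.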
