Master's Thesis
An interpretable autoencoder for semi-supervised anomaly detection
Date
2021-10
Record at:
1006864
Author
MEDINA PEREZ, MIGUEL ANGEL; 388892
Aguilar, Diana Laura
Institution
Abstract
Anomaly detection is a continuing concern in the machine learning community, and several attempts have been made to address it. Most research, nonetheless, has focused only on accuracy and has not taken interpretability into account. An interpretable model can provide the explanations behind its classification decisions.
According to the literature, interpretability grows in importance when the application domain is high-stakes, that is, when decisions can severely impact people's lives. This is the case in many application domains of anomaly detection. This dissertation addresses that need by proposing an interpretable autoencoder for semi-supervised anomaly detection. As far as is known, it is the first interpretable autoencoder based on decision trees to be used for this purpose.
This study assesses the performance of the proposed autoencoder against other state-of-the-art one-class classifiers on two types of data: nominal and numerical. The evaluation used 123 datasets, of which 37 were nominal and the remaining 86 were numerical. The nominal experiments included nine one-class classifiers, while the numerical experiments included 13 state-of-the-art classifiers. The area under the ROC curve (AUC) was used as the evaluation criterion, and statistical tests were conducted to check for significant differences.
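As an illustration of the AUC criterion, the following is a minimal sketch of how AUC can be computed from the anomaly scores of a one-class classifier. The function name `auc_from_scores` and the toy labels and scores are assumptions made for illustration only; they are not taken from the thesis, whose actual evaluation pipeline may differ.

```python
def auc_from_scores(labels, scores):
    """Compute AUC as the probability that a randomly chosen anomaly
    receives a higher score than a randomly chosen normal object.
    Ties count as half a win (Mann-Whitney U formulation).

    labels: 1 for anomaly, 0 for normal (hypothetical convention).
    scores: higher means "more anomalous".
    """
    anomalies = [s for y, s in zip(labels, scores) if y == 1]
    normals = [s for y, s in zip(labels, scores) if y == 0]
    if not anomalies or not normals:
        raise ValueError("AUC needs at least one object of each class")

    wins = 0.0
    for a in anomalies:
        for n in normals:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalies) * len(normals))


# Toy data, invented for illustration: three normal objects, two anomalies.
toy_labels = [0, 0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

In this toy case, four of the six anomaly/normal pairs are ordered correctly, so the AUC is 4/6. An AUC of 1.0 would mean every anomaly outscores every normal object, and 0.5 corresponds to random ranking.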
The results show that the proposal achieves competitive performance against its counterparts from the literature on nominal data. Furthermore, according to the statistical tests, there were no significant differences between the results of the proposal and those of the best-ranked benchmark classifier. On numerical data, however, the performance of the proposal is not satisfactory, as it was outperformed by several benchmark classifiers.