Master's Thesis
Assessing the Reliability of Visual Explanations of Deep Models through Adversarial Perturbation
Date
2019-03-27
Author
Dan Nascimento Gomes do Valle
Institution
Abstract
The increasing interest in complex deep neural networks for new applications demands transparency in their decisions, which leads to the need for reliable explanations of such models. Recent works have proposed new explanation methods to present interpretable visualizations of the relevance of input instances. These methods calculate relevance maps that often focus on different pixel regions and are commonly compared by visual inspection, which means that evaluations are based on human expectation instead of actual feature importance. In this work, we propose an effective metric for evaluating the reliability of model explanations. This metric is based on changes in the network's output resulting from adversarial perturbations of the input images. The perturbations take into account every relevance value as well as its inversion (irrelevance), so that the metric exhibits characteristics of both precision and recall. We also propose a direct application of this metric to filter relevance maps in order to create more interpretable images without any loss of essential explanatory information. We present a comparison of several widely known explanation methods and their results under the proposed metric. We also extend the results into a discussion of visualization techniques and the amount of information that is lost to make them more interpretable, and then show the results of our filtering method, which tackles this problem. Finally, we present an in-depth analysis of the properties that make the metric appropriate for a variety of tasks: the importance of using irrelevance, its robustness to random values and misclassified images, and the correlation between the metric and the loss of the evaluated model.
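The sketch below illustrates the general idea of a perturbation-based reliability score with precision- and recall-like components; it is not the metric defined in the thesis. The function name `perturbation_score`, the fixed `fraction` of perturbed pixels, the baseline-replacement perturbation (standing in for the adversarial perturbation the thesis actually uses), and the toy linear model are all illustrative assumptions.

```python
import numpy as np

def perturbation_score(model, image, relevance, fraction=0.1, baseline=0.0):
    """Hypothetical sketch: score an explanation by perturbing pixels.

    model:     callable mapping an image to the class score of interest.
    relevance: per-pixel relevance map with the same shape as `image`.
    """
    original = model(image)

    flat = relevance.flatten()
    k = max(1, int(fraction * flat.size))
    order = np.argsort(flat)  # ascending relevance

    # Perturb the k MOST relevant pixels: a reliable map should cause
    # a large drop in the score (precision-like behaviour).
    perturbed = image.copy().reshape(-1)
    perturbed[order[-k:]] = baseline
    drop_relevant = original - model(perturbed.reshape(image.shape))

    # Perturb the k LEAST relevant pixels (the "irrelevance"): a
    # reliable map should barely change the score (recall-like).
    perturbed = image.copy().reshape(-1)
    perturbed[order[:k]] = baseline
    drop_irrelevant = original - model(perturbed.reshape(image.shape))

    # Higher is better: relevant pixels matter, irrelevant ones do not.
    return drop_relevant - drop_irrelevant

if __name__ == "__main__":
    # Toy demonstration with a linear "network" and gradient*input
    # relevance; both choices are assumptions for illustration only.
    rng = np.random.default_rng(0)
    w = rng.normal(size=(8, 8))
    model = lambda x: float((w * x).sum())
    x = rng.normal(size=(8, 8))
    print(perturbation_score(model, x, relevance=w * x))
```

In this sketch, scoring both directions is what gives the metric its precision/recall flavour: perturbing highly relevant pixels tests whether the map points at features the model truly uses, while perturbing the irrelevant remainder tests whether it missed any features that matter.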