Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

Riccardo Guidotti, A. Monreale, S. Matwin, D. Pedreschi

Published 2019 in ECML/PKDD

ABSTRACT

We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows the user how the exemplars can be modified either to stay within their class or to become counterfactuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification and the areas that push it toward another class. We present the results of an experimental evaluation on three datasets and two black box models. We show that the proposed method, besides providing the most useful and interpretable explanations, outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
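
The pipeline described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The encoder, decoder, and black box below are hypothetical stand-ins for a trained adversarial autoencoder and a trained image classifier, a depth-1 tree (a stump) stands in for the decision tree, and the saliency map is assumed to be the pixel-wise difference between the median exemplar and the input image. All function names are invented for illustration.

```python
import random
import statistics

# Hypothetical stand-ins (assumptions, not the paper's components): a real
# pipeline would use a trained adversarial autoencoder and image classifier.
def encode(image):
    """Toy encoder: 4-pixel image -> 2-D latent point (mean of each pixel pair)."""
    return [(image[0] + image[1]) / 2, (image[2] + image[3]) / 2]

def decode(z):
    """Toy decoder: 2-D latent point -> 4-pixel image."""
    return [z[0], z[0], z[1], z[1]]

def black_box(image):
    """Toy black box: class 1 if the first two pixels outweigh the last two."""
    return int(image[0] + image[1] > image[2] + image[3])

def fit_stump(Z, y):
    """Depth-1 decision tree: pick the (feature, threshold) split with the
    highest training accuracy, labelling each side by majority vote."""
    best = None
    for f in range(len(Z[0])):
        for t in sorted({z[f] for z in Z}):
            left = [lab for z, lab in zip(Z, y) if z[f] <= t]
            right = [lab for z, lab in zip(Z, y) if z[f] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            l_lab = max(set(left), key=left.count)
            r_lab = max(set(right), key=right.count)
            correct = left.count(l_lab) + right.count(r_lab)
            if best is None or correct > best[0]:
                best = (correct, f, t, l_lab, r_lab)
    return best[1:]

def explain(image, n_samples=300, scale=0.5, seed=0):
    rng = random.Random(seed)
    z0 = encode(image)
    label = black_box(image)
    # 1. Sample a neighborhood around the instance in the latent space.
    Z = [[v + rng.gauss(0, scale) for v in z0] for _ in range(n_samples)]
    # 2. Label each latent point by decoding it and querying the black box.
    y = [black_box(decode(z)) for z in Z]
    # 3. Fit the interpretable surrogate on (latent point, label) pairs.
    stump = fit_stump(Z, y)
    f, t = stump[0], stump[1]
    side = z0[f] <= t  # the branch (local rule) that the instance satisfies
    # 4. Exemplars satisfy the same rule and keep the label;
    #    counter-exemplars fall on the other branch and change the label.
    exemplars = [decode(z) for z, lab in zip(Z, y)
                 if (z[f] <= t) == side and lab == label]
    counter = [decode(z) for z, lab in zip(Z, y)
               if (z[f] <= t) != side and lab != label]
    # 5. Saliency (assumed form): median exemplar minus the input, per pixel.
    median_img = [statistics.median(pixels) for pixels in zip(*exemplars)]
    saliency = [m - x for m, x in zip(median_img, image)]
    return stump, exemplars, counter, saliency

if __name__ == "__main__":
    stump, exemplars, counter, saliency = explain([0.9, 0.8, 0.2, 0.1])
    print("rule: latent feature", stump[0], "<=", round(stump[1], 3))
    print(len(exemplars), "exemplars,", len(counter), "counter-exemplars")
    print("saliency:", [round(s, 3) for s in saliency])
```

Sampling and fitting the surrogate in the latent space rather than in pixel space means every generated neighbor decodes to a plausible image, which is the motivation the abstract gives for using the autoencoder.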

PUBLICATION RECORD

Publication year: 2019
Venue: ECML/PKDD
Publication date: 2019-09-16
Fields of study: Computer Science
Identifiers: DOI 10.1007/978-3-030-46150-8_12; arXiv 2002.03746
Source metadata: Semantic Scholar

REFERENCES

Generative Adversarial Networks (2021, cited by this paper)
Interpretable Machine Learning (2019, cited by this paper)
Investigating Neighborhood Generation Methods for Explanations of Obscure Image Classifiers (2019, cited by this paper)
Enhancing the Robustness of Deep Neural Networks by Boundary Conditional GAN (2019, cited by this paper)
Explaining Multi-label Black-Box Classifiers for Health Applications (2019, cited by this paper)
Assessing the Stability of Interpretable Models (2018, influential reference)
A Survey of Methods for Explaining Black Box Models (2018, influential reference)
Local Rule-Based Explanations of Black Box Decision Systems (2018, influential reference)
Maximally Invariant Data Perturbation as Explanation (2018, influential reference)
Towards Robust Interpretability with Self-Explaining Neural Networks (2018, influential reference)
Contrastive Explanations with Local Foil Trees (2018, cited by this paper)
This looks like that: deep learning for interpretable image recognition (2018, influential reference)
iml: An R package for Interpretable Machine Learning (2018, cited by this paper)
Towards an Interpretable Latent Space – An Intuitive Comparison of Autoencoders with Variational Autoencoders (2018, cited by this paper)
Distilling a Neural Network Into a Soft Decision Tree (2017, cited by this paper)
Interpretable Explanations of Black Boxes by Meaningful Perturbation (2017, cited by this paper)
Deep Learning for Case-based Reasoning through Prototypes: A Neural Network that Explains its Predictions (2017, influential reference)
Explainable and Interpretable Models in Computer Vision and Machine Learning (2017, cited by this paper)
Towards A Rigorous Science of Interpretable Machine Learning (2017, influential reference)
Not Just a Black Box: Learning Important Features Through Propagating Activation Differences (2016, cited by this paper)
“Why Should I Trust You?”: Explaining the Predictions of Any Classifier (2016, cited by this paper)
Inducing Interpretable Representations with Variational Autoencoders (2016, cited by this paper)
Deep Residual Learning for Image Recognition (2015, cited by this paper)
Adversarial Autoencoders (2015, influential reference)
Distilling the Knowledge in a Neural Network (2015, cited by this paper)
On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation (2015, cited by this paper)
Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps (2013, cited by this paper)
Visualizing and Understanding Convolutional Networks (2013, cited by this paper)
Prototypes Vs Exemplars in Concept Representation (2012, influential reference)
Prototype selection for interpretable classification (2011, cited by this paper)
Et al (2008, cited by this paper)
Random Forests (2001, cited by this paper)

CITED BY

Axiomatic Foundations of Counterfactual Explanations (2026, cites this paper)
Explanations Go Linear: Interpretable and Individual Latent Encoding for Post-hoc Explainability (2025, cites this paper)
Explainable AI in Time-Sensitive Scenarios: Prefetched Offline Explanation Model (2025, influential citation)
Anomaly Detection in Event-Triggered Traffic Time Series via Similarity Learning (2025, cites this paper)
Explainable Artificial Intelligence in Biomedical Image Analysis: A Comprehensive Survey (2025, cites this paper)
Towards Transparent Healthcare: Advancing Local Explanation Methods in Explainable Artificial Intelligence (2024, cites this paper)
Towards Explainable Artificial Intelligence (XAI): A Data Mining Perspective (2024, cites this paper)
Counterfactual and Prototypical Explanations for Tabular Data via Interpretable Latent Space (2024, cites this paper)
Transparent Neighborhood Approximation for Text Classifier Explanation by Probability-Based Editing (2024, influential citation)
Interpretable machine learning for dermatological disease detection: Bridging the gap between accuracy and explainability (2024, cites this paper)
Explaining Siamese networks in few-shot learning (2024, cites this paper)
Advancing Dermatological Diagnostics: Interpretable AI for Enhanced Skin Lesion Classification (2024, influential citation)
SCGAN: Sparse CounterGAN for Counterfactual Explanations in Breast Cancer Prediction (2023, cites this paper)
M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models (2023, cites this paper)
Explaining any black box model using real data (2023, cites this paper)
Explainable AI for Medical Data: Current Methods, Limitations, and Future Directions (2023, influential citation)
Understanding Any Time Series Classifier with a Subsequence-based Explainer (2023, influential citation)
Explaining text classifiers through progressive neighborhood approximation with realistic samples (2023, cites this paper)
A multiorder feature tracking and explanation strategy for explainable deep learning (2023, cites this paper)
Transparent Latent Space Counterfactual Explanations for Tabular Data (2022, cites this paper)
Causality-Aware Local Interpretable Model-Agnostic Explanations (2022, cites this paper)
Explaining classifiers by constructing familiar concepts (2022, influential citation)
Exploiting auto-encoders for explaining black-box classifiers (2022, influential citation)
An Open-Source Software Library for Explainable Support Vector Machine Classification (2022, cites this paper)
Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box (2022, cites this paper)
Decomposing Counterfactual Explanations for Consequential Decision Making (2022, cites this paper)
Exemplars and Counterexemplars Explanations for Skin Lesion Classifiers (2022, cites this paper)
Stable and actionable explanations of black-box models through factual and counterfactual rules (2022, cites this paper)
Hub-VAE: Unsupervised Hub-based Regularization of Variational Autoencoders (2022, cites this paper)
CALIME: Causality-Aware Local Interpretable Model-Agnostic Explanations (2022, cites this paper)
Analogies and Feature Attributions for Model Agnostic Explanation of Similarity Learners (2022, cites this paper)
A survey on outlier explanations (2022, cites this paper)
A Survey of Algorithmic Recourse: Contrastive Explanations and Consequential Recommendations (2022, influential citation)
Explainable Deep Image Classifiers for Skin Lesion Diagnosis (2021, influential citation)
Evaluating local explanation methods on ground truth (2021, cites this paper)
Benchmarking and survey of explanation methods for black box models (2021, influential citation)
Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications (2021, influential citation)
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations (2021, cites this paper)
Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond (2021, cites this paper)
Understanding Prediction Discrepancies in Machine Learning Classifiers (2021, cites this paper)
Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics (2021, cites this paper)
Geometric Deformation on Objects: Unsupervised Image Manipulation via Conjugation (2021, cites this paper)
NoiseGrad: enhancing explanations by introducing stochasticity to model weights (2021, cites this paper)
Interpretable Summaries of Black Box Incident Triaging with Subgroup Discovery (2021, cites this paper)
XPROAX-Local explanations for text classification with progressive neighborhood approximation (2021, cites this paper)
Principles of Explainable Artificial Intelligence (2021, cites this paper)
Generating Realistic Natural Language Counterfactuals (2021, cites this paper)
Exemplars and Counterexemplars Explanations for Image Classifiers, Targeting Skin Lesion Labeling (2021, cites this paper)
Explainable process trace classification: An application to stroke (2021, cites this paper)
Interpretable Nearest Neighbor Queries for Tree-Structured Data in Vector Databases of Graph-Neural Network Embeddings (2020, cites this paper)
Social Mining & Big Data Ecosystem Research Infrastructure (2020, influential citation)
ProtoPShare: Prototypical Parts Sharing for Similarity Discovery in Interpretable Image Classification (2020, cites this paper)
Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations (2020, cites this paper)
Data-Agnostic Local Neighborhood Generation (2020, cites this paper)
Explaining Any Time Series Classifier (2020, influential citation)
Neural Prototype Trees for Interpretable Fine-grained Image Recognition (2020, cites this paper)
Explaining Differences in Classes of Discrete Sequences (2020, cites this paper)
MAIRE - A Model-Agnostic Interpretable Rule Extraction Procedure for Explaining Classifiers (2020, cites this paper)
Contextualizing Support Vector Machine Predictions (2020, cites this paper)
Explaining Explanation Methods (2020, cites this paper)
Explaining Sentiment Classification with Synthetic Exemplars and Counter-Exemplars (2020, cites this paper)
A survey of algorithmic recourse: definitions, formulations, solutions, and prospects (2020, cites this paper)
Counterfactual Explanation Based on Gradual Construction for Deep Networks (2020, influential citation)
On quantitative aspects of model interpretability (2020, cites this paper)
Explaining Predictions by Approximating the Local Decision Boundary (2020, cites this paper)
Cracking the Black Box: Distilling Deep Sports Analytics (2020, cites this paper)
Explaining Image Classifiers Generating Exemplars and Counter-Exemplars from Latent Representations (2020, influential citation)
A robust algorithm for explaining unreliable machine learning survival models using the Kolmogorov-Smirnov bounds (2020, cites this paper)
A Generic and Model-Agnostic Exemplar Synthetization Framework for Explainable AI (2020, cites this paper)
Explaining A Black-box By Using A Deep Variational Information Bottleneck Approach (2019, cites this paper)
XAI in Healthcare (year unknown, cites this paper)