Explaining Visual Classification using Attributes

Muneeb Ul Hassan 1 Philippe Mulhem 1 Denis Pellerin 2 Georges Quénot 1
1 MRIM - Modélisation et Recherche d’Information Multimédia [Grenoble]
Inria - Institut National de Recherche en Informatique et en Automatique, LIG - Laboratoire d'Informatique de Grenoble
GIPSA-DIS - Département Images et Signal
Abstract: The performance of deep Convolutional Neural Networks (CNN) has been reaching or even exceeding the human level on a large number of tasks, such as image classification, mastering the game of Go, and speech understanding. However, their lack of decomposability into intuitive and understandable components makes them hard to interpret, i.e. no information is provided about what makes them arrive at their prediction. We propose a technique to interpret a CNN's classification decisions and to justify the classification result with a visual explanation and visual search. The model consists of two sub-networks: a deep recurrent neural network that generates the textual justification and a deep convolutional network that performs the image analysis. This multimodal approach generates a textual justification of the classification decision. To verify the textual justification, we use visual search to retrieve similar content from the training set. We evaluate our strategy on the CUB dataset, which provides ground-truth attributes, and use these attributes to further strengthen the justification.
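The visual-search step described in the abstract, retrieving similar content from the training set to support the textual justification, can be sketched as a nearest-neighbour lookup over CNN feature vectors. This is a minimal illustration only: the function name and the choice of cosine similarity are assumptions, not necessarily the paper's exact method.

```python
import numpy as np

def retrieve_similar(query_feat, train_feats, k=3):
    """Return indices of the k training images whose CNN feature
    vectors are most similar (by cosine similarity) to the query's.

    query_feat:  1-D feature vector of the query image
    train_feats: 2-D array, one feature vector per training image
    """
    # L2-normalize so the dot product equals cosine similarity
    q = query_feat / np.linalg.norm(query_feat)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = t @ q                    # similarity to every training image
    return np.argsort(-sims)[:k]    # indices of the k best matches
```

In the paper's setting, `train_feats` would hold features extracted by the convolutional sub-network; the retrieved images serve as visual evidence supporting the generated justification.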

Contributor: Georges Quénot
Submitted on: Wednesday, October 16, 2019
Last modification on: Thursday, October 24, 2019
HAL Id: hal-02318323, version 1


Muneeb Ul Hassan, Philippe Mulhem, Denis Pellerin, Georges Quénot. Explaining Visual Classification using Attributes. Content-Based Multimedia Indexing, Sep 2019, Dublin, Ireland. ⟨hal-02318323⟩


