CN114723733A - Class activation mapping method and device based on axiom interpretation - Google Patents
- Publication number
- CN114723733A (application CN202210450336.2A)
- Authority
- CN
- China
- Prior art keywords
- class activation
- gradient
- image
- score
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a class activation mapping method and device based on axiom interpretation. First, an electrical equipment image is input into a trained CNN model; the feature map of a target convolutional layer in the model is extracted, the target class score is obtained, and the gradient of the score with respect to the feature map is calculated by back propagation. The gradient is then optimized, a global average pooling operation is performed on the optimized gradient to obtain a weight, and finally the weight is linearly combined with the feature map; up-sampling and normalization operations yield an initial class activation map. The initial class activation map is point-multiplied with the input image; a smoothing operation is applied to the product, which is fed into the model, where a softmax operation generates N scores that are averaged to obtain a score. Finally, the obtained score is multiplied with the initial class activation map, and the final class activation map is obtained after a ReLU operation. The class activation map generated by the method offers better visual interpretation in terms of clarity, object localization and the like.
Description
Technical Field
The invention relates to the field of computer vision and deep learning interpretability, in particular to a class activation mapping method and device based on axiom interpretation.
Background
Driven by deep neural networks (DNNs), deep learning has made major breakthroughs in related fields such as natural language processing, image processing and speech recognition. One key factor in the success of deep neural networks is their depth: complex combinations of a large number of non-linear network layers can characterize the original data at various levels of abstraction. However, most deep learning models have high complexity, many parameters and low transparency, and are commonly treated as black boxes: people cannot understand the mechanism by which such an "end-to-end" model makes a decision, nor judge whether the decision is reliable. To improve model transparency, a number of interpretability methods have been proposed, mainly comprising gradient-based methods, class-activation-mapping-based methods and perturbation-based interpretation methods. When a class activation mapping method computes gradients by back propagation, a small amount of scattered point noise remains in the corresponding saliency map, which degrades the visualization quality of the final class activation map.
The back propagation method is based on the chain rule: the brighter a position in the generated initial class activation map, the larger the absolute value of its gradient. These positions highlight the features in the input space that are related to the model output, and the stronger the correlation, the more prominent the corresponding features in the class activation map. However, computing gradients by back propagation has limitations: the resulting class activation map contains a small amount of scattered point noise, indicating that some irrelevant features are attended to. This scattered noise appears as large gradient values in some local regions and small values elsewhere, with a disordered spatial distribution.
To solve these problems and to generate class activation maps with more concentrated and clearer salient regions by removing scattered point noise, the invention proposes a new method called SA-CAM (Smooth Absolute-value Class Activation Mapping), which combines class activation mapping with another gradient-based interpretation idea. The method adds noise obeying a Gaussian distribution after the initial class activation map is point-multiplied with the input image, and smooths the result by averaging over multiple class activation maps, thereby removing the noise in the class activation map and obtaining better visual interpretation in terms of clarity, object localization and the like.
Disclosure of Invention
The invention mainly solves the problem that, when a class activation mapping method computes gradients by back propagation, the corresponding saliency map still contains a small amount of scattered point noise. As shown in FIG. 1, the invention provides a class activation mapping method based on axiom interpretation, SA-CAM; the generated class activation map offers better visual interpretation in terms of clarity, object localization and the like.
The technical problem of the invention is mainly solved by the following technical scheme:
a class activation mapping method based on axiom interpretation is characterized by comprising the following steps:
inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
In the class activation mapping method based on axiom interpretation, extracting the feature map of the target convolution layer, outputting the target class score, and calculating the back propagation gradient specifically include:
Step 1.1, extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
Step 1.2, outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
Step 1.3, calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
In the above class activation mapping method based on axiom interpretation, obtaining the initial class activation map through optimization specifically includes:
Step 2.1, gradient optimization: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
Step 2.2, obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
Step 2.3, obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak));
where U denotes the up-sampling operation and S denotes the normalization operation.
In the above class activation mapping method based on axiom interpretation, point-multiplying the initial class activation map with the input image and smoothing the result specifically includes:
Step 3.1, point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (4)
Step 3.2, smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
Step 3.3, averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
In the above class activation mapping method based on axiom interpretation, obtaining the final class activation map specifically includes:
Step 4.1, the score Yc is multiplied with the initial class activation map M0, and a ReLU operation is applied to obtain the final class activation map.
An axiom interpretation-based class activation mapping device, comprising:
a first module: inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
a second module: optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
a third module: point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
a fourth module: multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
Therefore, the invention has the following advantages:
1. Noise obeying a Gaussian distribution is added after the initial class activation map is point-multiplied with the input image, and the result is smoothed by averaging multiple class activation maps, thereby removing the noise present in the class activation map.
2. The class activation map generated by SA-CAM offers better visual interpretation in terms of clarity, object localization and the like.
Drawings
FIG. 1 is a diagram of a class activation mapping method framework based on axiom interpretation in accordance with the present invention;
Detailed Description
The technical scheme of the invention is further specifically described by the following embodiments and the accompanying drawings.
Example (b):
The invention relates to a class activation mapping method based on axiom interpretation; the algorithm flow chart, shown in FIG. 1, can be divided into four parts: 1) extracting the feature map of the target convolutional layer, outputting the target class score, and calculating the back propagation gradient; 2) optimizing the gradient and performing global average pooling to obtain the weights and the initial class activation map; 3) point-multiplying the initial class activation map with the input image, smoothing, and averaging the scores; 4) obtaining the final class activation map.
Step one: extracting the feature map of the target convolutional layer, outputting the target class score and calculating the back propagation gradient, as follows:
A. extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
B. outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
C. calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
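The sub-steps of step one can be sketched in NumPy. The toy model below replaces the CNN with a GAP-plus-linear-softmax head so that the gradient of the class score with respect to the feature maps can be written out analytically; every shape and weight here is an illustrative assumption, not the patent's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy stand-in for the CNN: K feature maps A (K x H x W) from the target
# conv layer l, followed by global average pooling and a linear classifier.
K, H, W, C = 4, 7, 7, 3                    # channels, spatial size, classes
A = rng.standard_normal((K, H, W))          # feature maps A_k of layer l
Wfc = rng.standard_normal((C, K))           # hypothetical classifier weights

pooled = A.mean(axis=(1, 2))                # global average pooling
scores = softmax(Wfc @ pooled)              # class probabilities after softmax
c = int(scores.argmax())                    # target class c'

# Analytic gradient dY_c'/dA_k(i, j) for this softmax(Wfc @ GAP(A)) head:
# dY_c/dz_m = Y_c * (delta_cm - Y_m) and dz_m/dA_k(i, j) = Wfc[m, k] / (H*W)
dYdz = scores[c] * ((np.arange(C) == c) - scores)
grad = np.broadcast_to(((dYdz @ Wfc) / (H * W))[:, None, None], (K, H, W))
print(grad.shape)                           # one gradient value per (k, i, j)
```

In a real network this gradient would instead be obtained by automatic differentiation (back propagation) rather than a closed-form expression.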
Step two: optimizing the gradient, obtaining weights by global average pooling, and obtaining the initial class activation map, as follows:
A. optimizing the gradient: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
B. obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
C. obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak)), where U denotes the up-sampling operation and S denotes the normalization operation.
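Under the same kind of toy assumptions (random placeholder feature maps and gradients, a 7x7 layer upsampled to 224x224, nearest-neighbour in place of the usual bilinear upsampling), step two reduces to a few array operations:

```python
import numpy as np

rng = np.random.default_rng(0)
K, H, W = 4, 7, 7                          # illustrative sizes, not from the patent
A = rng.standard_normal((K, H, W))         # feature maps A_k (placeholder values)
grad = rng.standard_normal((K, H, W))      # gradients dY/dA_k (placeholder values)

# A: square the gradients (the squaring optimization strategy).
g2 = grad ** 2

# B: global average pooling of the optimized gradients gives one weight
# per feature-map channel.
alpha = g2.mean(axis=(1, 2))               # weights alpha_k, shape (K,)

# C: weighted linear combination of the feature maps, upsampled to the
# input resolution (np.kron as a nearest-neighbour stand-in for U),
# then min-max normalized to [0, 1] (the S operation).
cam = np.tensordot(alpha, A, axes=1)       # sum_k alpha_k * A_k, shape (H, W)
cam_up = np.kron(cam, np.ones((32, 32)))   # U: 7x7 -> 224x224
M0 = (cam_up - cam_up.min()) / (cam_up.max() - cam_up.min() + 1e-8)  # S
print(M0.shape)
```

Note that squaring discards the gradient sign, so the weights alpha_k in this sketch are always non-negative.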
Step three: point-multiplying the initial class activation map with the input image, smoothing, and averaging the scores, as follows:
A. point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (10)
B. smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
C. averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
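Step three can likewise be sketched with a NumPy toy setup; the linear scoring head, channel count, noise level sigma and sample count N below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

H, W, C, N = 8, 8, 3, 16                 # illustrative sizes and sample count
X0 = rng.random((H, W))                  # input image (single channel for brevity)
M0 = rng.random((H, W))                  # initial class activation map
Wlin = rng.standard_normal((C, H * W))   # hypothetical linear "model" head
c = 1                                    # assumed target class index

M1 = M0 * X0                             # point (Hadamard) multiplication

# Add Gaussian noise N times, score each noisy image, and average: this is
# the smoothing step that suppresses scattered point noise.
sigma = 0.1
per_sample = []
for _ in range(N):
    M2 = M1 + rng.normal(0.0, sigma, size=M1.shape)   # one noisy image
    per_sample.append(softmax(Wlin @ M2.ravel())[c])  # its softmax score
Yc = float(np.mean(per_sample))          # smoothed score for class c
```

Averaging over the N noisy copies is what makes the final score, and hence the final map, robust to isolated gradient noise.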
Step four: obtaining the final class activation map, as follows:
A. the score Yc is multiplied with the initial class activation map M0, and a ReLU operation is applied to obtain the final class activation map.
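Step four reduces to a scale-and-ReLU; a minimal sketch with made-up values:

```python
import numpy as np

M0 = np.array([[0.2, -0.1],              # toy initial map values (assumed)
               [0.8,  0.5]])
Yc = 0.7                                 # toy smoothed class score (assumed)

final = np.maximum(Yc * M0, 0.0)         # scale by the score, then ReLU
```

The ReLU zeroes out positions with negative evidence, so only regions that support the target class survive in the final class activation map.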
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (6)
1. A class activation mapping method based on axiom interpretation, characterized by comprising the following steps:
inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and finally averaging them to obtain a score; and
multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
2. The axiom interpretation-based class activation mapping method according to claim 1, wherein extracting the feature map of the target convolution layer, outputting the target class score, and calculating the back propagation gradient specifically comprise:
step 1.1, extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
step 1.2, outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
step 1.3, calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
3. The axiom interpretation-based class activation mapping method according to claim 1, wherein the optimization processing to obtain the initial class activation map specifically comprises:
step 2.1, gradient optimization: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
step 2.2, obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
step 2.3, obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak));
where U denotes the up-sampling operation and S denotes the normalization operation.
4. The axiom interpretation-based class activation mapping method according to claim 1, wherein point-multiplying the initial class activation map with the input image and smoothing the result specifically comprises:
step 3.1, point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (4)
step 3.2, smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
step 3.3, averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
6. An axiomatic interpretation-based class activation mapping apparatus, using the method of any one of claims 1 to 5, comprising:
a first module: inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
a second module: optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
a third module: point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210450336.2A CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210450336.2A CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114723733A true CN114723733A (en) | 2022-07-08 |
Family
ID=82246091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210450336.2A Pending CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114723733A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117233723A (en) * | 2023-11-14 | 2023-12-15 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
- 2022-04-26 CN CN202210450336.2A patent/CN114723733A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117233723A (en) * | 2023-11-14 | 2023-12-15 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
CN117233723B (en) * | 2023-11-14 | 2024-01-30 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108062756B (en) | Image semantic segmentation method based on deep full convolution network and conditional random field | |
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN110335290B (en) | Twin candidate region generation network target tracking method based on attention mechanism | |
CN111260740B (en) | Text-to-image generation method based on generation countermeasure network | |
CN108985317B (en) | Image classification method based on separable convolution and attention mechanism | |
CN111861906B (en) | Pavement crack image virtual augmentation model establishment and image virtual augmentation method | |
CN112489164B (en) | Image coloring method based on improved depth separable convolutional neural network | |
CN113284100B (en) | Image quality evaluation method based on recovery image to mixed domain attention mechanism | |
CN111046917B (en) | Object-based enhanced target detection method based on deep neural network | |
CN106339753A (en) | Method for effectively enhancing robustness of convolutional neural network | |
KR101888647B1 (en) | Apparatus for classifying image and method for using the same | |
CN112348870A (en) | Significance target detection method based on residual error fusion | |
CN111553462A (en) | Class activation mapping method | |
CN109447897B (en) | Real scene image synthesis method and system | |
CN115565043A (en) | Method for detecting target by combining multiple characteristic features and target prediction method | |
CN113205103A (en) | Lightweight tattoo detection method | |
CN114723733A (en) | Class activation mapping method and device based on axiom interpretation | |
CN111260585A (en) | Image recovery method based on similar convex set projection algorithm | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN116912501A (en) | Weak supervision semantic segmentation method based on attention fusion | |
CN114723049A (en) | Class activation mapping method and device based on gradient optimization | |
CN111667401A (en) | Multi-level gradient image style migration method and system | |
CN115858808A (en) | Knowledge graph representation method for hierarchical relation of key neurons in deep neural network | |
CN114155560B (en) | Light weight method of high-resolution human body posture estimation model based on space dimension reduction | |
CN113111906B (en) | Method for generating confrontation network model based on condition of single pair image training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||