CN114723733A - Class activation mapping method and device based on axiom interpretation - Google Patents
- Publication number
- CN114723733A (application CN202210450336.2A)
- Authority
- CN
- China
- Prior art keywords
- class activation
- gradient
- image
- score
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a class activation mapping method and device based on axiom interpretation. First, an electrical equipment image is input into a trained CNN model; the feature map of a target convolutional layer in the model is extracted, the target class score is obtained, and the gradient of the score with respect to the feature map is calculated by back propagation. The gradient is then optimized, a global average pooling operation is performed on the optimized gradient to obtain a weight, and finally the weight is linearly combined with the feature map; up-sampling and normalization operations yield an initial class activation map. The initial class activation map is point-multiplied with the input image; a smoothing operation is applied to the product, which is fed into the model, where a softmax operation generates N scores that are averaged to obtain a score. Finally, the obtained score is multiplied with the initial class activation map, and the final class activation map is obtained after a ReLU operation. The class activation map generated by the method offers better visual interpretation in terms of clarity, object localization and the like.
Description
Technical Field
The invention relates to the field of computer vision and deep learning interpretability, in particular to a class activation mapping method and device based on axiom interpretation.
Background
Driven by deep neural networks (DNNs), deep learning has made major breakthroughs in related fields such as natural language processing, image processing and speech recognition. One key factor in the success of deep neural networks is their depth: complex combinations of a large number of non-linear network layers can characterize the original data at various levels of abstraction. However, most deep learning models have high complexity, many parameters and low transparency, and are commonly treated as black boxes: people cannot understand the mechanism by which such an "end-to-end" model makes a decision, nor judge whether the decision is reliable. To improve model transparency, a number of interpretability methods have been proposed, mainly comprising gradient-based methods, class-activation-mapping-based methods and perturbation-based interpretation methods. When a class activation mapping method computes gradients by back propagation, a small amount of scattered point noise remains in the corresponding saliency map, which degrades the visualization quality of the final class activation map.
The back propagation method is based on the chain rule: the brighter a position in the generated initial class activation map, the larger the absolute value of its gradient. These positions highlight the features in the input space that are related to the model output, and the stronger the correlation, the more prominent the corresponding features in the class activation map. However, computing gradients by back propagation has limitations: the resulting class activation map contains a small amount of scattered point noise, indicating that some irrelevant features are attended to. This scattered noise appears as large gradient values in some local regions and small values elsewhere, with a disordered spatial distribution.
To solve these problems and to generate class activation maps with more concentrated and clearer salient regions by removing scattered point noise, the invention proposes a new method called SA-CAM (Smooth Absolute-value Class Activation Mapping), which combines class activation mapping with another gradient-based interpretation idea. The method adds noise obeying a Gaussian distribution after the initial class activation map is point-multiplied with the input image, and smooths the result by averaging over multiple class activation maps, thereby removing the noise in the class activation map and obtaining better visual interpretation in terms of clarity, object localization and the like.
Disclosure of Invention
The invention mainly solves the problem that, when a class activation mapping method computes gradients by back propagation, the corresponding saliency map still contains a small amount of scattered point noise. As shown in FIG. 1, the invention provides a class activation mapping method based on axiom interpretation, SA-CAM; the generated class activation map offers better visual interpretation in terms of clarity, object localization and the like.
The technical problem of the invention is mainly solved by the following technical scheme:
a class activation mapping method based on axiom interpretation is characterized by comprising the following steps:
inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
In the class activation mapping method based on axiom interpretation, extracting the feature map of the target convolution layer, outputting the target class score, and calculating the back propagation gradient specifically include:
Step 1.1, extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
Step 1.2, outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
Step 1.3, calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
In the above class activation mapping method based on axiom interpretation, obtaining the initial class activation map through optimization specifically includes:
Step 2.1, gradient optimization: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
Step 2.2, obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
Step 2.3, obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak));
where U denotes the up-sampling operation and S denotes the normalization operation.
In the above class activation mapping method based on axiom interpretation, point-multiplying the initial class activation map with the input image and smoothing the result specifically includes:
Step 3.1, point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (4)
Step 3.2, smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
Step 3.3, averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
In the above class activation mapping method based on axiom interpretation, obtaining the final class activation map specifically includes:
Step 4.1, the score Yc is multiplied with the initial class activation map M0, and a ReLU operation is applied to obtain the final class activation map.
An axiom interpretation-based class activation mapping device, comprising:
a first module: inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
a second module: optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
a third module: point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
a fourth module: multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
Therefore, the invention has the following advantages:
1. Noise obeying a Gaussian distribution is added after the initial class activation map is point-multiplied with the input image, and the result is smoothed by averaging multiple class activation maps, thereby removing the noise present in the class activation map.
2. The class activation map generated by SA-CAM offers better visual interpretation in terms of clarity, object localization and the like.
Drawings
FIG. 1 is a diagram of a class activation mapping method framework based on axiom interpretation in accordance with the present invention;
Detailed Description
The technical scheme of the invention is further specifically described by the following embodiments and the accompanying drawings.
Example (b):
The invention relates to a class activation mapping method based on axiom interpretation; the algorithm flow chart, shown in FIG. 1, can be divided into four parts: 1) extracting the feature map of the target convolutional layer, outputting the target class score, and calculating the back propagation gradient; 2) optimizing the gradient and performing global average pooling to obtain the weights and the initial class activation map; 3) point-multiplying the initial class activation map with the input image, smoothing, and averaging the scores; 4) obtaining the final class activation map.
Step one: extracting the feature map of the target convolutional layer, outputting the target class score and calculating the back propagation gradient, as follows:
A. extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
B. outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
C. calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
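The sub-steps of step one can be sketched in NumPy. The toy model below replaces the CNN with a GAP-plus-linear-softmax head so that the gradient of the class score with respect to the feature maps can be written out analytically; every shape and weight here is an illustrative assumption, not the patent's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy stand-in for the CNN: K feature maps A (K x H x W) from the target
# conv layer l, followed by global average pooling and a linear classifier.
K, H, W, C = 4, 7, 7, 3                    # channels, spatial size, classes
A = rng.standard_normal((K, H, W))          # feature maps A_k of layer l
Wfc = rng.standard_normal((C, K))           # hypothetical classifier weights

pooled = A.mean(axis=(1, 2))                # global average pooling
scores = softmax(Wfc @ pooled)              # class probabilities after softmax
c = int(scores.argmax())                    # target class c'

# Analytic gradient dY_c'/dA_k(i, j) for this softmax(Wfc @ GAP(A)) head:
# dY_c/dz_m = Y_c * (delta_cm - Y_m) and dz_m/dA_k(i, j) = Wfc[m, k] / (H*W)
dYdz = scores[c] * ((np.arange(C) == c) - scores)
grad = np.broadcast_to(((dYdz @ Wfc) / (H * W))[:, None, None], (K, H, W))
print(grad.shape)                           # one gradient value per (k, i, j)
```

In a real network this gradient would instead be obtained by automatic differentiation (back propagation) rather than a closed-form expression.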
Step two: optimizing the gradient, obtaining weights by global average pooling, and obtaining the initial class activation map, as follows:
A. optimizing the gradient: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
B. obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
C. obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak)), where U denotes the up-sampling operation and S denotes the normalization operation.
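Under the same kind of toy assumptions (random placeholder feature maps and gradients, a 7x7 layer upsampled to 224x224, nearest-neighbour in place of the usual bilinear upsampling), step two reduces to a few array operations:

```python
import numpy as np

rng = np.random.default_rng(0)
K, H, W = 4, 7, 7                          # illustrative sizes, not from the patent
A = rng.standard_normal((K, H, W))         # feature maps A_k (placeholder values)
grad = rng.standard_normal((K, H, W))      # gradients dY/dA_k (placeholder values)

# A: square the gradients (the squaring optimization strategy).
g2 = grad ** 2

# B: global average pooling of the optimized gradients gives one weight
# per feature-map channel.
alpha = g2.mean(axis=(1, 2))               # weights alpha_k, shape (K,)

# C: weighted linear combination of the feature maps, upsampled to the
# input resolution (np.kron as a nearest-neighbour stand-in for U),
# then min-max normalized to [0, 1] (the S operation).
cam = np.tensordot(alpha, A, axes=1)       # sum_k alpha_k * A_k, shape (H, W)
cam_up = np.kron(cam, np.ones((32, 32)))   # U: 7x7 -> 224x224
M0 = (cam_up - cam_up.min()) / (cam_up.max() - cam_up.min() + 1e-8)  # S
print(M0.shape)
```

Note that squaring discards the gradient sign, so the weights alpha_k in this sketch are always non-negative.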
Step three: point-multiplying the initial class activation map with the input image, smoothing, and averaging the scores, as follows:
A. point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (10)
B. smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
C. averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
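Step three can likewise be sketched with a NumPy toy setup; the linear scoring head, channel count, noise level sigma and sample count N below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

H, W, C, N = 8, 8, 3, 16                 # illustrative sizes and sample count
X0 = rng.random((H, W))                  # input image (single channel for brevity)
M0 = rng.random((H, W))                  # initial class activation map
Wlin = rng.standard_normal((C, H * W))   # hypothetical linear "model" head
c = 1                                    # assumed target class index

M1 = M0 * X0                             # point (Hadamard) multiplication

# Add Gaussian noise N times, score each noisy image, and average: this is
# the smoothing step that suppresses scattered point noise.
sigma = 0.1
per_sample = []
for _ in range(N):
    M2 = M1 + rng.normal(0.0, sigma, size=M1.shape)   # one noisy image
    per_sample.append(softmax(Wlin @ M2.ravel())[c])  # its softmax score
Yc = float(np.mean(per_sample))          # smoothed score for class c
```

Averaging over the N noisy copies is what makes the final score, and hence the final map, robust to isolated gradient noise.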
Step four: obtaining the final class activation map, as follows:
A. the score Yc is multiplied with the initial class activation map M0, and a ReLU operation is applied to obtain the final class activation map.
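Step four reduces to a scale-and-ReLU; a minimal sketch with made-up values:

```python
import numpy as np

M0 = np.array([[0.2, -0.1],              # toy initial map values (assumed)
               [0.8,  0.5]])
Yc = 0.7                                 # toy smoothed class score (assumed)

final = np.maximum(Yc * M0, 0.0)         # scale by the score, then ReLU
```

The ReLU zeroes out positions with negative evidence, so only regions that support the target class survive in the final class activation map.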
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (6)
1. A class activation mapping method based on axiom interpretation, characterized by comprising the following steps:
inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and finally averaging them to obtain a score; and
multiplying the obtained score with the initial class activation map and obtaining the final class activation map after a ReLU operation.
2. The axiom interpretation-based class activation mapping method according to claim 1, wherein extracting the feature map of the target convolution layer, outputting the target class score, and calculating the back propagation gradient specifically comprise:
step 1.1, extracting the feature map of the target convolutional layer: a given electrical equipment image X0 is fed into the model Y, and the feature map A of the target convolutional layer l in model Y is extracted, the kth feature map of A being denoted Ak;
step 1.2, outputting the target class score: after the softmax operation, model Y outputs the score Yc'(X0) predicting the image class;
step 1.3, calculating the back propagation gradient: the gradient of the score Yc'(X0) with respect to spatial position (i, j) of the kth feature map Ak is computed by back propagation, i.e. ∂Yc'(X0)/∂Ak(i, j).
3. The axiom interpretation-based class activation mapping method according to claim 1, wherein the optimization processing to obtain the initial class activation map specifically comprises:
step 2.1, gradient optimization: the gradient ∂Yc'(X0)/∂Ak(i, j) is optimized with a squaring strategy, further highlighting the gradients in each layer that are positively correlated with the network output;
step 2.2, obtaining weights by global average pooling: a global average pooling operation is performed on the optimized gradient to obtain the weight αk;
step 2.3, obtaining the initial class activation map: the weights αk and the feature maps Ak are linearly combined, and up-sampling and normalization operations are performed to obtain the initial class activation map M0 = S(U(Σk αk·Ak));
where U denotes the up-sampling operation and S denotes the normalization operation.
4. The axiom interpretation-based class activation mapping method according to claim 1, wherein point-multiplying the initial class activation map with the input image and smoothing the result specifically comprises:
step 3.1, point-multiplying the initial class activation map with the input image: the initial class activation map M0 and the input image X0 are point-multiplied to obtain M1;
M1 = M0 ⊙ X0 (4)
step 3.2, smoothing operation: a smoothing operation is performed on M1; specifically, Gaussian noise is added to M1 to generate N noisy images M2;
step 3.3, averaging the scores: the images M2 are sent into the model; after softmax, N scores are generated and averaged to a final score Yc.
6. An axiomatic interpretation-based class activation mapping apparatus, using the method of any one of claims 1 to 5, comprising:
a first module: inputting the electrical equipment image into a trained CNN model, extracting a feature map of a target convolutional layer in the model, simultaneously obtaining a target class score, and calculating by back propagation the gradient of the score with respect to the feature map;
a second module: optimizing the obtained gradient with a squaring strategy, performing a global average pooling operation on the optimized gradient to obtain a weight, and finally linearly combining the weight with the feature map and performing up-sampling and normalization operations to obtain an initial class activation map;
a third module: point-multiplying the obtained initial class activation map with the input image, performing a smoothing operation on the product, sending it into the model, performing a softmax operation to generate N scores, and averaging them to obtain a score;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210450336.2A CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210450336.2A CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114723733A true CN114723733A (en) | 2022-07-08 |
Family
ID=82246091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210450336.2A Pending CN114723733A (en) | 2022-04-26 | 2022-04-26 | Class activation mapping method and device based on axiom interpretation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114723733A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117233723A (en) * | 2023-11-14 | 2023-12-15 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
- 2022-04-26 CN CN202210450336.2A patent/CN114723733A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117233723A (en) * | 2023-11-14 | 2023-12-15 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
CN117233723B (en) * | 2023-11-14 | 2024-01-30 | 中国电子科技集团公司第二十九研究所 | Radar tracking envelope extraction method based on CNN class activation diagram |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108062756B (en) | Image semantic segmentation method based on deep full convolution network and conditional random field | |
CN113240580B (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN110335290B (en) | Twin candidate region generation network target tracking method based on attention mechanism | |
CN111260740B (en) | Text-to-image generation method based on generation countermeasure network | |
CN108985317B (en) | Image classification method based on separable convolution and attention mechanism | |
CN111861906B (en) | Pavement crack image virtual augmentation model establishment and image virtual augmentation method | |
CN112489164B (en) | Image coloring method based on improved depth separable convolutional neural network | |
CN113284100B (en) | Image quality evaluation method based on recovery image to mixed domain attention mechanism | |
CN111046917B (en) | Object-based enhanced target detection method based on deep neural network | |
CN106339753A (en) | Method for effectively enhancing robustness of convolutional neural network | |
KR101888647B1 (en) | Apparatus for classifying image and method for using the same | |
CN112348870A (en) | Significance target detection method based on residual error fusion | |
CN111553462A (en) | Class activation mapping method | |
CN109447897B (en) | Real scene image synthesis method and system | |
CN115565043A (en) | Method for detecting target by combining multiple characteristic features and target prediction method | |
CN113205103A (en) | Lightweight tattoo detection method | |
CN114723733A (en) | Class activation mapping method and device based on axiom interpretation | |
CN111260585A (en) | Image recovery method based on similar convex set projection algorithm | |
CN114581789A (en) | Hyperspectral image classification method and system | |
CN116912501A (en) | Weak supervision semantic segmentation method based on attention fusion | |
CN114723049A (en) | Class activation mapping method and device based on gradient optimization | |
CN111667401A (en) | Multi-level gradient image style migration method and system | |
CN115858808A (en) | Knowledge graph representation method for hierarchical relation of key neurons in deep neural network | |
CN114155560B (en) | Light weight method of high-resolution human body posture estimation model based on space dimension reduction | |
CN113111906B (en) | Method for generating confrontation network model based on condition of single pair image training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||