CN109087264B - Method for making network notice important part of data based on deep network - Google Patents
- Publication number
- CN109087264B (application CN201810891937.0A)
- Authority
- CN
- China
- Prior art keywords
- feature map
- original
- similarity matrix
- characteristic diagram
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T5/94
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Abstract
The invention discloses a deep-network-based method for making a network notice the important parts of data. The method comprises: A1, vectorizing the original feature map, i.e. representing each pixel of the original feature map by a vector; A2, obtaining a similarity matrix through self-comparison learning and normalizing it to obtain a reconstructed feature map X*; A3, comparing the reconstructed feature map X* with the original feature map and then performing iterative processing. After vectorization, the original feature map is reconstructed to obtain the reconstructed feature map X*, and convergence is achieved through iterative processing. No parameters need to be introduced, the important regions become very salient, and the recognition capability of the network is improved.
Description
Technical Field
The invention relates to the field of computer vision, and in particular to a method for making a network notice the important parts of data based on a deep network, also called an attention mechanism in the field of computer vision.
Background
Human vision acquires a target region needing attention by rapidly scanning the global image, then devotes more attention resources to that region and suppresses other useless information. Although this is a human instinct, a neural network has no such ability to judge: it treats every pixel equally, which limits the network's expressive ability.
Introducing an attention mechanism lets a neural network learn to notice important information. An attention mechanism can act on spatial pixels, so that the network attends to important spatial regions; it can act on feature-map channels, so that the network learns category semantics; and it can act in the time dimension to capture behaviors, actions, and so on. Acting on spatial pixels, a weight in the interval 0 to 1 is learned for each pixel, and the pixel is then multiplied by its weight to give a new pixel value. Important pixels learn larger weights and less important pixels learn smaller weights, which amplifies important regions and suppresses the others, simulating the human attention function.
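To illustrate the per-pixel weighting just described, here is a minimal sketch (not code from the patent; the shapes and weight values are made up for illustration):

```python
import numpy as np

# Illustrative 4x4 single-channel feature map and a spatial attention map
# with one weight in [0, 1] per pixel (values here are hypothetical).
feature_map = np.arange(16, dtype=float).reshape(4, 4)
weights = np.full((4, 4), 0.1)
weights[1:3, 1:3] = 0.9  # pretend the center region is "important"

# Each pixel is multiplied by its weight: important pixels are amplified
# relative to the rest, unimportant ones are suppressed.
attended = feature_map * weights
```

Because the weights lie in (0, 1), the weighted map keeps the important center pixels near their original values while damping everything else.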
Although attention mechanisms have had considerable success in the field of computer vision, existing approaches have the following disadvantages:
1) The network design is very complicated and generality is poor. For example, although the network in "Residual Attention Network for Image Classification" performs well, the process of learning the attention weights is so complicated and heavyweight that other researchers can hardly reuse the result.
2) In all current attention mechanisms, the process of learning the weight values needs to introduce additional parameters. Increasing the parameter count causes various problems, such as easy overfitting, high hardware requirements for training, and inability to migrate to mobile devices.
Therefore, learning attention weights from the information contained in the data itself, without introducing additional parameters, is an urgent problem to be solved: deep networks already have large parameter counts, overfit easily, and yield models too large to be transplanted to a mobile terminal, yet the important image information must still be obtained.
Disclosure of Invention
The invention aims to solve the prior-art problem of how to obtain important image information in a neural network without introducing additional network parameters, and provides a method for making a network notice the important parts of data based on a deep network, i.e. a parameter-free attention-mechanism design method based on a deep network.
In order to solve the technical problem, the following technical scheme is adopted in the application:
a method for making a network aware of important parts of data based on a deep network, comprising the steps of:
A1, vectorizing the original feature map (X), i.e. representing each pixel of the original feature map (X) by a vector;
A2, obtaining a similarity matrix (W) by self-comparison learning and normalizing it to obtain a reconstructed feature map (X*);
A3, comparing the reconstructed feature map (X*) with the original feature map (X) and performing iterative processing to make the important regions salient.
Preferably, in the step A1, vectorizing the original feature map specifically means expressing it as X = [x_1, x_2, …, x_N]^T, where N = h × w, h is the length of the original feature map, w is its width, each vector has c elements, each pixel is a vector of length c, T is the transpose operation, and x_k represents the kth pixel in the feature map.
Preferably, in the step A2, the similarity between any two pixels is compared by multiplying the original feature map by the transpose of the original feature map, that is, W = XX^T, obtaining the similarity matrix (W).
Preferably, in the step a2, the normalization of the similarity matrix (W) refers to the normalization of the similarity matrix (W) by a softmax function.
Preferably, in the step A2, the reconstructed feature map (X*) is obtained by normalizing the similarity matrix (W) and then multiplying the normalized similarity matrix with the original feature map (X).
Preferably, the similarity matrix (W1) of the reconstructed feature map (X*) is obtained by multiplying the original feature map (X) with the transpose of the reconstructed feature map, i.e. W1 = XX*^T.
Preferably, the similarity matrix (W1) of the reconstructed feature map (X*) is iterated with the similarity matrix (W) of the original feature map (X) until the algorithm converges, obtaining the similarity matrix (W0) in which the important-region weight information is embedded.
Preferably, the (X*) obtained from the previous iteration is sent to the next iteration as input to obtain the next (X*), and this process is repeated until the algorithm converges.
Preferably, in the first iteration the original feature map (X) is compared with itself.
Preferably, the algorithm converges after 4 iterations.
Compared with the prior art, the invention has the beneficial effects that:
the method for making network notice important part of data based on deep network of the invention is to convert original characteristic diagram X into vector quantity and then convert the vector quantity into original characteristic diagram XPerforming product to obtain a similarity matrix W of the original characteristic diagram, and normalizing the similarity matrix W to obtain a reconstructed characteristic diagram X*Then, the feature map X is reconstructed*The similarity matrix is iterated with the similarity matrix W of the original characteristic diagram to obtain the similarity matrix embedded with the importance information weight, network parameters do not need to be additionally introduced, the important area becomes very obvious, and the network identification capability is improved.
Further, compared with a scalar, a vector can transmit and express more information, carrying more of the importance features of the original feature map X.
Further, the similarity matrix is learned by self-comparison within the feature map without introducing extra parameters, which prevents overfitting and keeps the model from becoming too large.
Furthermore, the method can be designed as a universal module that can be inserted into any layer of any convolutional network, giving it very strong generality.
Drawings
FIG. 1 is a schematic diagram of the algorithm structure of the present invention;
fig. 2 is another algorithm flow diagram of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings. It should be emphasized that the following description is merely exemplary in nature and is not intended to limit the scope of the invention or its application.
The invention relates to a method for making a network notice important parts of data based on a deep network, which comprises the following steps:
A1, vectorizing the original feature map X, i.e. representing each pixel of the original feature map X by a vector;
A2, obtaining a similarity matrix W by self-comparison learning and normalizing it to obtain a reconstructed feature map X*;
A3, comparing the reconstructed feature map X* with the original feature map X and performing iterative processing to make the important regions salient.
The algorithm flow chart of the method for making the network notice the important part of the data based on the deep network of the embodiment is shown in fig. 1, wherein,
On the left is the first iteration of the algorithm, in which the original feature map is compared with itself for learning.
On the right is the remaining iteration process, in which the reconstructed feature map is compared with the original feature map.
Because the importance information is embedded into the weights during comparative learning and introduced into the similarity matrix W, reconstructing the feature map with W is in effect a feature re-selection: the important regions in the reconstructed feature map become very salient, improving the network's recognition capability.
Specifically, the method for designing the non-parameter attention mechanism based on the deep network in this embodiment is performed according to the following steps:
step A1, vectorizing the original characteristic diagram X,
let the original feature map X be a feature map of [ h, w, c ] dimension, where h and w represent the length and width dimensions of the original feature map and c is the number of channels.
According to the length h and width w of the original feature map X, it has h × w pixels, each with c channels. The original feature map X can therefore be seen as composed of h × w vectors, each with c elements. It is expressed as X = [x_1, x_2, …, x_N]^T with N = h × w, where T is the transpose operation, x_k represents the kth pixel in the original feature map, and each pixel is a vector of length c.
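The vectorization step amounts to a reshape; a hedged sketch (the dimensions are example values, not taken from the patent):

```python
import numpy as np

h, w, c = 4, 4, 8                 # example feature-map dimensions
X_hwc = np.random.rand(h, w, c)   # original feature map of shape [h, w, c]

# View the map as N = h*w pixel vectors of length c each:
# X = [x_1, x_2, ..., x_N]^T, where x_k is the k-th pixel.
N = h * w
X = X_hwc.reshape(N, c)
```

Row k of X is exactly the pixel at position (k // w, k % w) of the original map, so no information is lost by the reshape.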
Step A2, obtaining a similarity matrix W by self-comparison learning and normalizing it to obtain a reconstructed feature map X*, comprises the following steps:
2.1 self-comparison learning to obtain a similarity matrix W
W = XX^T (1)
where X = [x_1, x_2, …, x_N]^T represents the original feature map and X^T represents the transpose of the original feature map X.
The meaning of equation (1) is: the original feature map X is multiplied with its own transpose, i.e. starting with the first pixel, each pixel of X takes the inner product with every other pixel of the feature map. Because x_i · x_j = |x_i||x_j| cos θ, for any two vectors the smaller the angle between them, the larger (more similar) the result of the inner product. The result of XX^T therefore compares the similarity between any two pixels, giving the similarity matrix W.
The ith row in the similarity matrix W represents the result of the similarity comparison of the ith pixel with all the pixels.
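Equation (1) is a single matrix product; a minimal sketch with illustrative shapes:

```python
import numpy as np

N, c = 16, 8
X = np.random.rand(N, c)  # vectorized feature map: N pixel vectors of length c

# Self-comparison: W[i, j] = <x_i, x_j> compares pixel i with pixel j,
# so row i of W holds pixel i's similarity to every pixel in the map.
W = X @ X.T
```

Since W[i, j] and W[j, i] are the same inner product, the similarity matrix is symmetric, and its diagonal holds each pixel's squared norm.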
2.2 normalization processing
Each column of the similarity matrix W is normalized with a softmax function (the 0 in softmax(W, 0) denotes column normalization), giving the normalized matrix of the original feature map X, namely
P = softmax(W, 0) (3)
where each element p_ij of the matrix is a number between 0 and 1, indicating the similarity ratio between pixels.
As can be seen from equation (3), the closer two pixels are, the larger the weight of the corresponding element in the normalized matrix.
2.3 Reconstructing the feature map X*
The original feature map X is reconstructed using the similarity ratios between pixels, giving a reconstructed feature map X* with salient features, i.e.
X* = softmax(W, 0) X (4)
Specifically, the weight values of the elements in the normalized matrix express the interrelation between pixels: the larger the weight, the closer two pixels are and the more they contribute to each other during reconstruction. The relevance and importance between pixels are thus embedded in the elements of the normalized matrix, and reconstructing the feature map X* is a process of re-selecting features using the similarity matrix W. The features of the reconstructed feature map X* therefore become stronger relative to the original feature map X, and the important regions represented in X* are given more weight and highlighted.
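Steps 2.2 and 2.3 together are a column-wise softmax followed by one matrix product; a hedged sketch (the helper name `softmax_cols` is my own, not from the patent):

```python
import numpy as np

def softmax_cols(W):
    # softmax(W, 0) in the patent's notation: normalize each column of W
    # into a probability distribution (max subtracted for numerical stability).
    e = np.exp(W - W.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

N, c = 16, 8
X = np.random.rand(N, c)
W = X @ X.T              # similarity matrix, equation (1)
P = softmax_cols(W)      # normalized matrix, each p_ij in (0, 1)
X_star = P @ X           # reconstructed feature map, equation (4)
```

Every column of P sums to 1, so the reconstruction mixes pixels in proportion to their similarity weights without introducing any learned parameters.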
Step A3, after comparing the reconstructed feature map X* with the original feature map X, iterative processing is performed to make the important regions salient:
W = W_old + XX*^T (6)
X* = softmax(W, 0) X (7)
The X*_{i-1} obtained in the (i-1)th iteration is sent to the ith iteration as input to obtain the ith X*_i, and this process is repeated until the algorithm converges.
Considering that the similarity matrix W is obtained by self-comparison learning and is not always correct, a problematic W would make the reconstructed feature map X* problematic too; and if X* were then only compared with itself, the comparison results would drift further and further wrong. This application therefore compares the reconstructed feature map X* with the original feature map X, so that X plays a supervisory role, similar to the transition matrix of the PageRank algorithm.
In general, this process iterates 3 to 4 times to achieve convergence.
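Putting steps A1-A3 together, the full iterative procedure can be sketched as follows (a non-authoritative reading of equations (1), (4), (6), and (7); the function name and the fixed iteration count are assumptions):

```python
import numpy as np

def reselect_features(X, num_iters=4):
    """Parameter-free iterative feature re-selection for X of shape [N, c]."""
    def softmax_cols(W):
        # Column normalization, softmax(W, 0) in the patent's notation.
        e = np.exp(W - W.max(axis=0, keepdims=True))
        return e / e.sum(axis=0, keepdims=True)

    # First iteration: compare the original map with itself (eq. 1),
    # then reconstruct (eq. 4).
    W = X @ X.T
    X_star = softmax_cols(W) @ X
    # Remaining iterations: accumulate the comparison of X with the current
    # reconstruction into W (eq. 6) and reconstruct again (eq. 7); the
    # original map X acts as supervision.
    for _ in range(num_iters - 1):
        W = W + X @ X_star.T
        X_star = softmax_cols(W) @ X
    return X_star

X = np.random.rand(16, 8)
X_star = reselect_features(X)
```

Note the whole procedure uses only the feature map itself: no weights are trained, which is the claimed parameter-free property.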
Specifically, steps A1 to A3 in this embodiment are processed by the following algorithmic procedure, shown in fig. 2, with the following specific steps:
A11, start;
A12, vectorizing the original feature map X, X → [x_1, …, x_N]^T;
A13, initializing the variable i, 0 → i;
A14, self-comparison learning, XX^T → W;
A15, normalization processing, softmax(W, 0) → P;
A16, reconstructing the feature map, PX → X*;
A17, iterative processing, i + 1 → i;
A18, sending X*_{i-1} to the ith iteration as input to obtain the ith reconstructed feature map X*_i, W_old + XX*^T → W;
A19, normalizing the reconstructed feature map, softmax(W, 0) → P, where the 0 denotes column normalization of W;
A20, multiplying the newly learned weights P with the original feature map to reconstruct a new feature map, PX → X*;
A21, determining whether the number of iterations exceeds the set value (i > 4); if yes, continue; if not, return to A17;
A22, outputting the converged reconstructed feature map;
A23, end.
Steps A11-A12 correspond to step A1, steps A13-A16 correspond to step A2, and the remaining steps correspond to step A3.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and the invention is not to be considered limited to these specific details. For those skilled in the art to which the invention pertains, several equivalent substitutions or obvious modifications can be made without departing from the spirit of the invention, and all variants with the same properties or uses are considered to be within the scope of the invention.
Claims (8)
1. A method for making a network aware of important parts of image data based on a deep network, comprising the steps of:
A1, vectorizing the original feature map X of the image, i.e. representing each pixel of the original feature map X by a vector;
A2, obtaining a similarity matrix W by self-comparison learning and normalizing it to obtain a reconstructed feature map X*;
A3, comparing the reconstructed feature map X* with the original feature map X and performing iterative processing to make the important regions salient;
wherein the original feature map X is multiplied by the transpose of the reconstructed feature map X*, i.e. W1 = XX*^T, obtaining the similarity matrix W1 of the reconstructed feature map X*;
and the similarity matrix W1 of the reconstructed feature map X* is iterated with the similarity matrix W of the original feature map X until the algorithm converges, obtaining a similarity matrix W0 in which the important-region weight information is embedded.
2. The method according to claim 1, wherein in the step A1, the original feature map is vectorized, specifically: the original feature map is expressed as X = [x_1, x_2, …, x_N]^T, where N = h × w and each pixel x_k is a vector of length c.
3. The method according to claim 1, wherein in the step A2, the similarity between any two pixels is obtained by multiplying the original feature map by the transpose of the original feature map, i.e. W = XX^T, giving the similarity matrix W.
4. The method according to claim 1, wherein in the step a2, the normalization of the similarity matrix W is performed by normalizing the similarity matrix W with a softmax function.
5. The method according to claim 1, wherein in the step A2, the reconstructed feature map X* is obtained by normalizing the similarity matrix W and then multiplying the normalized similarity matrix with the original feature map X.
6. The method of claim 1, wherein the X* obtained from the previous iteration is sent to the next iteration as input to obtain the next X*, and this process is repeated until the algorithm converges.
7. The method of claim 1, wherein the first iteration is to compare the original profile X with itself.
8. The method of claim 1, wherein the algorithm converges after 4 iterations.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810891937.0A CN109087264B (en) | 2018-08-07 | 2018-08-07 | Method for making network notice important part of data based on deep network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810891937.0A CN109087264B (en) | 2018-08-07 | 2018-08-07 | Method for making network notice important part of data based on deep network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109087264A CN109087264A (en) | 2018-12-25 |
CN109087264B true CN109087264B (en) | 2021-04-09 |
Family
ID=64834210
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810891937.0A Active CN109087264B (en) | 2018-08-07 | 2018-08-07 | Method for making network notice important part of data based on deep network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109087264B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106780468A (en) * | 2016-12-22 | 2017-05-31 | 中国计量大学 | View-based access control model perceives the conspicuousness detection method of positive feedback |
CN108334901A (en) * | 2018-01-30 | 2018-07-27 | 福州大学 | A kind of flowers image classification method of the convolutional neural networks of combination salient region |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106506901B (en) * | 2016-09-18 | 2019-05-10 | 昆明理工大学 | A kind of hybrid digital picture halftoning method of significance visual attention model |
- 2018-08-07: CN application CN201810891937.0A filed; patent CN109087264B granted, status Active
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106780468A (en) * | 2016-12-22 | 2017-05-31 | 中国计量大学 | View-based access control model perceives the conspicuousness detection method of positive feedback |
CN108334901A (en) * | 2018-01-30 | 2018-07-27 | 福州大学 | A kind of flowers image classification method of the convolutional neural networks of combination salient region |
Also Published As
Publication number | Publication date |
---|---|
CN109087264A (en) | 2018-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Liu et al. | Learning converged propagations with deep prior ensemble for image enhancement | |
Liu et al. | Connecting image denoising and high-level vision tasks via deep learning | |
CN110084296B (en) | Graph representation learning framework based on specific semantics and multi-label classification method thereof | |
CN109033095B (en) | Target transformation method based on attention mechanism | |
Brifman et al. | Turning a denoiser into a super-resolver using plug and play priors | |
Rubinstein et al. | Dictionary learning for analysis-synthesis thresholding | |
CN111461322B (en) | Deep neural network model compression method | |
US20160283858A1 (en) | Multimodal Data Fusion by Hierarchical Multi-View Dictionary Learning | |
CN111325165B (en) | Urban remote sensing image scene classification method considering spatial relationship information | |
CN107392865B (en) | Restoration method of face image | |
Hyvärinen | Statistical models of natural images and cortical visual representation | |
CN113220886A (en) | Text classification method, text classification model training method and related equipment | |
Trujillo et al. | Evolving estimators of the pointwise Hölder exponent with genetic programming | |
Zhang et al. | Robust alternating low-rank representation by joint Lp-and L2, p-norm minimization | |
Fu et al. | Learning dual priors for jpeg compression artifacts removal | |
CN113283524A (en) | Anti-attack based deep neural network approximate model analysis method | |
CN115457183A (en) | Training method, reconstruction method and device for generating and reconstructing serialized sketch model | |
CN113208641B (en) | Auxiliary diagnosis method for lung nodule based on three-dimensional multi-resolution attention capsule network | |
Seyedi et al. | Elastic adversarial deep nonnegative matrix factorization for matrix completion | |
CN109087264B (en) | Method for making network notice important part of data based on deep network | |
Mukherjee et al. | Learned reconstruction methods with convergence guarantees | |
CN112905894A (en) | Collaborative filtering recommendation method based on enhanced graph learning | |
US20200380364A1 (en) | Adversarial Probabilistic Regularization | |
Yao et al. | Multiscale residual fusion network for image denoising | |
CN113298232B (en) | Infrared spectrum blind self-deconvolution method based on deep learning neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||