CN111860681B - Deep network difficulty sample generation method under double-attention mechanism and application - Google Patents
Deep network difficulty sample generation method under double-attention mechanism and application
- Publication number: CN111860681B
- Application number: CN202010749955.2A
- Authority: CN (China)
- Prior art keywords: attention, channel, attention mechanism, spatial, feature
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Abstract
The invention discloses a novel method for generating difficult samples for deep learning, together with its applications, and designs a dual-attention mechanism for automatically generating difficult samples, which helps a deep model escape local optima and makes it more robust. The attention mechanism of the method not only emphasizes the foreground target but also suppresses the influence of background clutter to some extent, so that the occluded region is more concentrated, the generated difficult samples are more challenging, and the target-recognition accuracy of the deep network is improved.
Description
Technical Field
The invention relates to a method for generating deep network difficult samples under a dual-attention mechanism, and belongs to the technical field of artificial intelligence.
Background
Deep learning methods need diverse training samples to improve model generalization and alleviate overfitting.
However, collecting a large dataset is difficult for most recognition tasks. Data augmentation is an attractive alternative that avoids manually annotating large numbers of images. Typical data augmentation methods include random cropping, mirroring, and image jittering [1-2]. Recently, the development of generative adversarial networks (GANs) has given models a reliable way to introduce additional augmented data [3-4].
In general, however, data augmentation with a GAN requires additional datasets, because the new samples for the training set are generated using information, such as style or pose, taken from those additional datasets. To increase not only the number of samples but also the constraint they place on the network, research on difficult-sample generation has begun. Reference [5] randomly erases regions in the network's input images, which prevents overfitting and makes the network more robust.
On the other hand, randomly occluding samples may introduce noise into training and thus slow the network's convergence.
Reference [6] uses a sliding window to generate a pool of difficult samples, selects the most difficult ones according to the trained model's accuracy on each, expands the training set with the selected samples, and retrains the model; however, this procedure is rather complicated.
Disclosure of Invention
The invention aims to provide a novel method for generating difficult samples for deep learning, and designs a dual-attention mechanism for automatically generating difficult samples, which helps a deep model escape local optima and makes it more robust. The technical scheme comprises the following steps:
S01, acquire the training-set images I_k, k=1,2,...,N, input them into the base network ResNet-50, and extract the feature F ∈ R^{H×W×C} from the conv5_3 layer, where H and W are the height and width of the feature F, C is the number of channels of F, and N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
S04, apply the channel attention vector a^c and the spatial attention weight matrix a^s in turn to generate the dual-attention-reinforced feature, completing the fusion of the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
S05, process the dual-attention-reinforced feature F^{sc} with Grad-CAM to obtain the attention heat map corresponding to I_k, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism.
Further, step S02 specifically comprises:
(1) extracting the feature F from the conv5_3 layer;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) of the feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s.
Further, step S03 specifically comprises:
(1) for the feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
The invention also provides application of the deep network difficult sample generation method in the fields of surveillance image recognition and heat map recognition.
The invention also provides application of the deep network difficult sample generation method in the field of financial and economic quantitative analysis.
The invention also provides application of the deep network difficult sample generation method in the fields of drug discovery, disease analysis and medical image analysis.
The invention also provides application of the deep network difficult sample generation method in the field of network security.
The invention also provides application of the deep network difficult sample generation method in the field of spam filtering.
The invention also provides application of the deep network difficult sample generation method in the field of DNS malicious domain name analysis.
The invention has the following beneficial effects:
The invention designs a method for generating deep network difficult samples based on a dual spatial-and-channel attention mechanism. The attention mechanism not only emphasizes the foreground target but also suppresses the influence of background clutter to some extent, so that the occluded region is more concentrated, the generated difficult samples are more challenging, and the target-recognition accuracy of the deep network is improved.
Drawings
FIG. 1 is a flow chart of a method for generating deep network difficulty samples under a dual-attention mechanism;
FIG. 2 is a diagram of a dual attention model network framework;
FIG. 3 is an example of a heat map of interest and a difficult sample.
Detailed Description
The present invention will be described in detail below with reference to the embodiments shown in the drawings; however, the embodiments do not limit the invention, and functional, methodological, or structural equivalents and alternatives derived from these embodiments by those skilled in the art fall within the scope of protection of the invention.
Fig. 1 is a schematic flow chart of the method for generating deep network difficult samples under a dual-attention mechanism according to the present invention. In this embodiment, the method comprises the following steps:
S01, input the training-set images I_k, k=1,2,...,N into the base network ResNet-50 and extract the feature F from the conv5_3 layer, where N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
As shown in fig. 2 (a), the step S02 specifically comprises:
(1) extracting the aggregated feature F ∈ R^{H×W×C}, where H and W are the height and width of the aggregated feature F and C is the number of channels of F;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) in the aggregated feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s.
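The spatial attention of step S02 can be sketched in a few lines of NumPy. The patent does not spell out the per-location score that feeds the softmax, so the channel-mean activation at each location is used here as a hypothetical stand-in; the function name and shapes are likewise illustrative.

```python
import numpy as np

def spatial_attention(F):
    """Minimal sketch of the spatial attention of step S02.

    F: feature map of shape (H, W, C), e.g. from the conv5_3 layer.
    Returns a_s, an (H, W) weight matrix that sums to 1 over all
    spatial locations (softmax across locations). The channel-mean
    score below is an assumption, not the patent's exact formula.
    """
    score = F.mean(axis=2)               # (H, W): one scalar per location l=(x, y)
    e = np.exp(score - score.max())      # subtract max for numerical stability
    return e / e.sum()                   # softmax over the H*W locations

F = np.random.rand(7, 7, 2048)           # conv5_3-like feature of ResNet-50
a_s = spatial_attention(F)
print(a_s.shape)                         # (7, 7)
```

The weights are non-negative and sum to one, so a_s can be read as a distribution of attention over the spatial grid.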
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
As shown in fig. 2 (b), the step S03 specifically comprises:
(1) for the aggregated feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
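Step S03 can be sketched as follows. The actual convolution layer W_c is learned during training; a fixed dense matrix `Wc` stands in for it here, which is an assumption made purely for illustration:

```python
import numpy as np

def channel_attention(F, Wc, bc):
    """Sketch of step S03: per-channel average pooling, a learned
    linear layer u' = Wc*u + bc (standing in for the convolution
    layer of the patent), then a Sigmoid.

    F:  feature map of shape (H, W, C)
    Wc: (C, C) weight matrix (illustrative stand-in)
    bc: (C,) bias term
    Returns a_c, the channel attention vector of length C.
    """
    u = F.mean(axis=(0, 1))                 # (C,): average pooling of each channel F_i
    u_prime = Wc @ u + bc                   # reinforced channel feature u'
    return 1.0 / (1.0 + np.exp(-u_prime))   # Sigmoid -> attention value per channel

C = 8
F = np.random.rand(7, 7, C)
rng = np.random.default_rng(0)
a_c = channel_attention(F, rng.standard_normal((C, C)), np.zeros(C))
print(a_c.shape)                            # (8,)
```

Each entry of `a_c` lies in (0, 1) and scales its channel in the fusion step that follows.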
S04, apply channel attention and spatial attention in turn to generate the dual-attention-reinforced feature, fusing the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
As shown in fig. 2 (c), the step S04 specifically comprises:
(1) multiplying the attention value a^c_i of channel i by the aggregated feature F_i of that channel to obtain the channel-attention-reinforced feature F^c_i = a^c_i · F_i;
(2) for each channel i=1,2,...,C, element-wise multiplying the spatial attention weight matrix a^s by the reinforced feature F^c_i of channel i to obtain the dual-attention-reinforced feature:
F^sc_i = a^s ⊙ F^c_i
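With NumPy broadcasting, the two-stage reinforcement of step S04 reduces to a pair of element-wise products. The random `F`, `a_c`, and `a_s` below merely stand in for the outputs of steps S01-S03:

```python
import numpy as np

# Sketch of step S04: apply channel attention, then spatial attention.
H, W, C = 7, 7, 8
F = np.random.rand(H, W, C)     # aggregated feature (stand-in for step S01)
a_c = np.random.rand(C)         # channel attention vector (stand-in for S03)
a_s = np.random.rand(H, W)      # spatial attention weight matrix (stand-in for S02)

F_c = F * a_c[None, None, :]    # F^c_i = a^c_i * F_i, scales each channel
F_sc = F_c * a_s[:, :, None]    # F^sc_i = a^s (element-wise) F^c_i
print(F_sc.shape)               # (7, 7, 8)
```

Because both products are element-wise, the order of the two reinforcements does not change the result here; the patent applies them channel-first.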
S05, visualize the region of interest of I_k, k=1,2,...,N with Grad-CAM to obtain the attention heat map, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism.
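Steps S05-S06 (binarization of the heat map and erasure of the largest connected region) can be sketched in plain NumPy. The Grad-CAM heat map is replaced by a toy blob, and the simple flood-fill labeling below is an illustrative stand-in for whatever connected-component routine a real implementation would use:

```python
import numpy as np

def otsu_threshold(heat):
    """Otsu's method on a heat map scaled to 0..255 (step S05)."""
    h = (255 * (heat - heat.min()) / (np.ptp(heat) + 1e-12)).astype(np.uint8)
    hist = np.bincount(h.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * p[:t]).sum() / w0
        m1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return h >= best_t                          # binary attention map B_k

def largest_component(B):
    """Largest 4-connected foreground component of binary map B (step S06)."""
    labels = np.zeros(B.shape, dtype=int)
    cur = 0
    for seed in zip(*np.nonzero(B)):
        if labels[seed]:
            continue
        cur += 1
        stack = [seed]
        labels[seed] = cur
        while stack:                            # flood fill from each unlabeled pixel
            y, x = stack.pop()
            for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                if (0 <= ny < B.shape[0] and 0 <= nx < B.shape[1]
                        and B[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = cur
                    stack.append((ny, nx))
    if cur == 0:
        return np.zeros_like(B)
    sizes = np.bincount(labels.ravel())[1:]     # pixel count of each component
    return labels == (1 + int(sizes.argmax()))

# Toy heat map with one strong blob; the blob is erased from the image.
heat = np.zeros((8, 8)); heat[2:5, 2:5] = 1.0
B = otsu_threshold(heat)
mask = largest_component(B)
image = np.full((8, 8), 128)
image[mask] = 0                                 # difficult sample J_k
print(int(mask.sum()))                          # 9
```

In the patent the erased region comes from the network's own attention, so the occlusion targets the most discriminative part of the object rather than a random patch.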
An example of an attention heat map and the corresponding difficult sample generated by the invention is shown in fig. 3. The generated difficult samples are used for pedestrian re-identification: 30% of them are randomly selected to replace the original training samples, the model is retrained, and experiments are conducted on the public Market-1501 and DukeMTMC-reID datasets; the experimental results are shown in Table 1.
Table 1 comparison of experimental results
Example 2
The method for generating the deep network difficult sample is applied to the field of monitoring image recognition and heat map recognition.
Example 3
The deep network difficult sample generation method is applied to the field of financial and economic quantitative analysis.
Example 4
The deep network difficult sample generation method is applied to the fields of drug discovery, disease analysis and medical image analysis.
Example 5
The deep network difficult sample generation method is applied to the field of network security.
Example 6
The deep network difficult sample generation method is applied to the field of spam filtering.
Example 7
The deep network difficult sample generation method is applied to the field of DNS malicious domain name analysis.
While the invention has been described with reference to the preferred embodiments, it is not limited thereto, and various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.
References:
[1] Fadaee M, Bisazza A, Monz C. Data augmentation for low-resource neural machine translation[J]. arXiv preprint arXiv:1705.00440, 2017.
[2] Perez L, Wang J. The effectiveness of data augmentation in image classification using deep learning[J]. arXiv preprint arXiv:1712.04621, 2017.
[3] Frid-Adar M, Diamant I, Klang E, et al. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification[J]. Neurocomputing, 2018, 321: 321-331.
[4] Lim S K, Loo Y, Tran N T, et al. DOPING: Generative data augmentation for unsupervised anomaly detection with GAN[C]// 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018: 1122-1127.
[5] Zhong Z, Zheng L, Kang G, et al. Random erasing data augmentation[J]. arXiv preprint arXiv:1708.04896, 2017.
[6] Huang H, Li D, Zhang Z, et al. Adversarially occluded samples for person re-identification[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
Claims (1)
1. A method for generating deep network difficult samples, characterized by comprising the following steps:
S01, acquire the training-set images I_k, k=1,2,...,N, input them into the base network ResNet-50, and extract the feature F ∈ R^{H×W×C} from the conv5_3 layer, where H and W are the height and width of the feature F, C is the number of channels of F, and N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
S04, apply the channel attention vector a^c and the spatial attention weight matrix a^s in turn to generate the dual-attention-reinforced feature, completing the fusion of the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
S05, process the dual-attention-reinforced feature F^{sc} with Grad-CAM to obtain the attention heat map corresponding to I_k, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism;
the step S02 specifically comprises:
(1) extracting the feature F from the conv5_3 layer;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) of the feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s;
the step S03 specifically comprises:
(1) for the feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010749955.2A CN111860681B (en) | 2020-07-30 | 2020-07-30 | Deep network difficulty sample generation method under double-attention mechanism and application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860681A CN111860681A (en) | 2020-10-30 |
CN111860681B true CN111860681B (en) | 2024-04-30 |
Family
ID=72945119
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010749955.2A Active CN111860681B (en) | 2020-07-30 | 2020-07-30 | Deep network difficulty sample generation method under double-attention mechanism and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111860681B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114005078B (en) * | 2021-12-31 | 2022-03-29 | 山东交通学院 | Vehicle weight identification method based on double-relation attention mechanism |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948658A (en) * | 2019-02-25 | 2019-06-28 | 浙江工业大学 | The confrontation attack defense method of Feature Oriented figure attention mechanism and application |
CN110020682A (en) * | 2019-03-29 | 2019-07-16 | 北京工商大学 | A kind of attention mechanism relationship comparison net model methodology based on small-sample learning |
CN110675406A (en) * | 2019-09-16 | 2020-01-10 | 南京信息工程大学 | CT image kidney segmentation algorithm based on residual double-attention depth network |
CN110991311A (en) * | 2019-11-28 | 2020-04-10 | 江南大学 | Target detection method based on dense connection deep network |
CN114818920A (en) * | 2022-04-26 | 2022-07-29 | 常熟理工学院 | Weak supervision target detection method based on double attention erasing and attention information aggregation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10929708B2 (en) * | 2018-12-10 | 2021-02-23 | International Business Machines Corporation | Deep learning network for salient region identification in images |
- 2020-07-30: application CN202010749955.2A filed in China; granted as CN111860681B (status: Active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948658A (en) * | 2019-02-25 | 2019-06-28 | 浙江工业大学 | The confrontation attack defense method of Feature Oriented figure attention mechanism and application |
CN110020682A (en) * | 2019-03-29 | 2019-07-16 | 北京工商大学 | A kind of attention mechanism relationship comparison net model methodology based on small-sample learning |
CN110675406A (en) * | 2019-09-16 | 2020-01-10 | 南京信息工程大学 | CT image kidney segmentation algorithm based on residual double-attention depth network |
CN110991311A (en) * | 2019-11-28 | 2020-04-10 | 江南大学 | Target detection method based on dense connection deep network |
CN114818920A (en) * | 2022-04-26 | 2022-07-29 | 常熟理工学院 | Weak supervision target detection method based on double attention erasing and attention information aggregation |
Non-Patent Citations (2)
Title |
---|
Object detection based on difficult sample mining under residual networks; Zhang Chao; Chen Ying; Laser & Optoelectronics Progress; 2018-05-11 (No. 10); full text *
Application of deep neural networks fusing dual attention to UAV target detection; Zhan Zheqi; Chen Peng; Sang Yongsheng; Peng Dezhong; Modern Computer; 2020-04-15 (No. 11); full text *
Also Published As
Publication number | Publication date |
---|---|
CN111860681A (en) | 2020-10-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |