CN111860681B - Deep network difficulty sample generation method under double-attention mechanism and application - Google Patents
Deep network difficulty sample generation method under double-attention mechanism and application
- Publication number: CN111860681B
- Application number: CN202010749955.2A
- Authority: CN (China)
- Prior art keywords: attention, channel, attention mechanism, spatial, feature
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Abstract
The invention discloses a novel method for generating difficult samples for deep learning, together with its applications, and designs a dual-attention mechanism for automatically generating difficult samples, which helps a deep model escape local optima and makes it more robust. The attention mechanism of the method not only emphasizes the foreground target but also suppresses the influence of background clutter to some extent, so that the occluded region is more concentrated, the generated difficult samples are more challenging, and the target-recognition accuracy of the deep network is improved.
Description
Technical Field
The invention relates to a method for generating deep network difficult samples under a dual-attention mechanism, and belongs to the technical field of artificial intelligence.
Background
Deep learning methods need diverse training samples to improve model generalization and alleviate overfitting.
However, collecting a large dataset is difficult for most recognition tasks. Data augmentation is an attractive alternative that avoids manually annotating large numbers of images. Typical data augmentation methods include random cropping, mirroring, and image jittering [1-2]. Recently, the development of generative adversarial networks (GANs) has given models a reliable way to introduce additional augmented data [3-4].
In general, however, data augmentation with a GAN requires additional datasets, because the new samples for the training set are generated using information, such as style or pose, taken from those additional datasets. To increase not only the number of samples but also the constraint they place on the network, research on difficult-sample generation has begun. Reference [5] randomly erases regions in the network's input images, which prevents overfitting and makes the network more robust.
On the other hand, randomly occluding samples may introduce noise into training and thus slow the network's convergence.
Reference [6] uses a sliding window to generate a pool of difficult samples, selects the most difficult ones according to the trained model's accuracy on each, expands the training set with the selected samples, and retrains the model; however, this procedure is rather complicated.
Disclosure of Invention
The invention aims to provide a novel method for generating difficult samples for deep learning, and designs a dual-attention mechanism for automatically generating difficult samples, which helps a deep model escape local optima and makes it more robust. The technical scheme comprises the following steps:
S01, acquire the training-set images I_k, k=1,2,...,N, input them into the base network ResNet-50, and extract the feature F ∈ R^{H×W×C} from the conv5_3 layer, where H and W are the height and width of the feature F, C is the number of channels of F, and N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
S04, apply the channel attention vector a^c and the spatial attention weight matrix a^s in turn to generate the dual-attention-reinforced feature, completing the fusion of the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
S05, process the dual-attention-reinforced feature F^{sc} with Grad-CAM to obtain the attention heat map corresponding to I_k, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism.
Further, step S02 specifically comprises:
(1) extracting the feature F from the conv5_3 layer;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) of the feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s.
Further, step S03 specifically comprises:
(1) for the feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
The invention also provides application of the deep network difficult sample generation method in the fields of surveillance image recognition and heat map recognition.
The invention also provides application of the deep network difficult sample generation method in the field of financial and economic quantitative analysis.
The invention also provides application of the deep network difficult sample generation method in the fields of drug discovery, disease analysis and medical image analysis.
The invention also provides application of the deep network difficult sample generation method in the field of network security.
The invention also provides application of the deep network difficult sample generation method in the field of spam filtering.
The invention also provides application of the deep network difficult sample generation method in the field of DNS malicious domain name analysis.
The invention has the following beneficial effects:
The invention designs a method for generating deep network difficult samples based on a dual spatial-and-channel attention mechanism. The attention mechanism not only emphasizes the foreground target but also suppresses the influence of background clutter to some extent, so that the occluded region is more concentrated, the generated difficult samples are more challenging, and the target-recognition accuracy of the deep network is improved.
Drawings
FIG. 1 is a flow chart of a method for generating deep network difficulty samples under a dual-attention mechanism;
FIG. 2 is a diagram of a dual attention model network framework;
FIG. 3 is an example of a heat map of interest and a difficult sample.
Detailed Description
The present invention will be described in detail below with reference to the embodiments shown in the drawings; however, the embodiments do not limit the invention, and functional, methodological, or structural equivalents and alternatives derived from these embodiments by those skilled in the art fall within the scope of protection of the invention.
Fig. 1 is a schematic flow chart of the method for generating deep network difficult samples under a dual-attention mechanism according to the present invention. In this embodiment, the method comprises the following steps:
S01, input the training-set images I_k, k=1,2,...,N into the base network ResNet-50 and extract the feature F from the conv5_3 layer, where N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
As shown in fig. 2 (a), the step S02 specifically comprises:
(1) extracting the aggregated feature F ∈ R^{H×W×C}, where H and W are the height and width of the aggregated feature F and C is the number of channels of F;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) in the aggregated feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s.
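The spatial attention of step S02 can be sketched in a few lines of NumPy. The patent does not spell out the per-location score that feeds the softmax, so the channel-mean activation at each location is used here as a hypothetical stand-in; the function name and shapes are likewise illustrative.

```python
import numpy as np

def spatial_attention(F):
    """Minimal sketch of the spatial attention of step S02.

    F: feature map of shape (H, W, C), e.g. from the conv5_3 layer.
    Returns a_s, an (H, W) weight matrix that sums to 1 over all
    spatial locations (softmax across locations). The channel-mean
    score below is an assumption, not the patent's exact formula.
    """
    score = F.mean(axis=2)               # (H, W): one scalar per location l=(x, y)
    e = np.exp(score - score.max())      # subtract max for numerical stability
    return e / e.sum()                   # softmax over the H*W locations

F = np.random.rand(7, 7, 2048)           # conv5_3-like feature of ResNet-50
a_s = spatial_attention(F)
print(a_s.shape)                         # (7, 7)
```

The weights are non-negative and sum to one, so a_s can be read as a distribution of attention over the spatial grid.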
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
As shown in fig. 2 (b), the step S03 specifically comprises:
(1) for the aggregated feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
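Step S03 can be sketched as follows. The actual convolution layer W_c is learned during training; a fixed dense matrix `Wc` stands in for it here, which is an assumption made purely for illustration:

```python
import numpy as np

def channel_attention(F, Wc, bc):
    """Sketch of step S03: per-channel average pooling, a learned
    linear layer u' = Wc*u + bc (standing in for the convolution
    layer of the patent), then a Sigmoid.

    F:  feature map of shape (H, W, C)
    Wc: (C, C) weight matrix (illustrative stand-in)
    bc: (C,) bias term
    Returns a_c, the channel attention vector of length C.
    """
    u = F.mean(axis=(0, 1))                 # (C,): average pooling of each channel F_i
    u_prime = Wc @ u + bc                   # reinforced channel feature u'
    return 1.0 / (1.0 + np.exp(-u_prime))   # Sigmoid -> attention value per channel

C = 8
F = np.random.rand(7, 7, C)
rng = np.random.default_rng(0)
a_c = channel_attention(F, rng.standard_normal((C, C)), np.zeros(C))
print(a_c.shape)                            # (8,)
```

Each entry of `a_c` lies in (0, 1) and scales its channel in the fusion step that follows.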
S04, apply channel attention and spatial attention in turn to generate the dual-attention-reinforced feature, fusing the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
As shown in fig. 2 (c), the step S04 specifically comprises:
(1) multiplying the attention value a^c_i of channel i by the aggregated feature F_i of that channel to obtain the channel-attention-reinforced feature F^c_i = a^c_i · F_i;
(2) for each channel i=1,2,...,C, element-wise multiplying the spatial attention weight matrix a^s by the reinforced feature F^c_i of channel i to obtain the dual-attention-reinforced feature:
F^sc_i = a^s ⊙ F^c_i
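With NumPy broadcasting, the two-stage reinforcement of step S04 reduces to a pair of element-wise products. The random `F`, `a_c`, and `a_s` below merely stand in for the outputs of steps S01-S03:

```python
import numpy as np

# Sketch of step S04: apply channel attention, then spatial attention.
H, W, C = 7, 7, 8
F = np.random.rand(H, W, C)     # aggregated feature (stand-in for step S01)
a_c = np.random.rand(C)         # channel attention vector (stand-in for S03)
a_s = np.random.rand(H, W)      # spatial attention weight matrix (stand-in for S02)

F_c = F * a_c[None, None, :]    # F^c_i = a^c_i * F_i, scales each channel
F_sc = F_c * a_s[:, :, None]    # F^sc_i = a^s (element-wise) F^c_i
print(F_sc.shape)               # (7, 7, 8)
```

Because both products are element-wise, the order of the two reinforcements does not change the result here; the patent applies them channel-first.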
S05, visualize the region of interest of I_k, k=1,2,...,N with Grad-CAM to obtain the attention heat map, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism.
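Steps S05-S06 (binarization of the heat map and erasure of the largest connected region) can be sketched in plain NumPy. The Grad-CAM heat map is replaced by a toy blob, and the simple flood-fill labeling below is an illustrative stand-in for whatever connected-component routine a real implementation would use:

```python
import numpy as np

def otsu_threshold(heat):
    """Otsu's method on a heat map scaled to 0..255 (step S05)."""
    h = (255 * (heat - heat.min()) / (np.ptp(heat) + 1e-12)).astype(np.uint8)
    hist = np.bincount(h.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (np.arange(t) * p[:t]).sum() / w0
        m1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return h >= best_t                          # binary attention map B_k

def largest_component(B):
    """Largest 4-connected foreground component of binary map B (step S06)."""
    labels = np.zeros(B.shape, dtype=int)
    cur = 0
    for seed in zip(*np.nonzero(B)):
        if labels[seed]:
            continue
        cur += 1
        stack = [seed]
        labels[seed] = cur
        while stack:                            # flood fill from each unlabeled pixel
            y, x = stack.pop()
            for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                if (0 <= ny < B.shape[0] and 0 <= nx < B.shape[1]
                        and B[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = cur
                    stack.append((ny, nx))
    if cur == 0:
        return np.zeros_like(B)
    sizes = np.bincount(labels.ravel())[1:]     # pixel count of each component
    return labels == (1 + int(sizes.argmax()))

# Toy heat map with one strong blob; the blob is erased from the image.
heat = np.zeros((8, 8)); heat[2:5, 2:5] = 1.0
B = otsu_threshold(heat)
mask = largest_component(B)
image = np.full((8, 8), 128)
image[mask] = 0                                 # difficult sample J_k
print(int(mask.sum()))                          # 9
```

In the patent the erased region comes from the network's own attention, so the occlusion targets the most discriminative part of the object rather than a random patch.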
An example of an attention heat map and the corresponding difficult sample generated by the invention is shown in fig. 3. The generated difficult samples are used for pedestrian re-identification: 30% of them are randomly selected to replace the original training samples, the model is retrained, and experiments are conducted on the public Market-1501 and DukeMTMC-reID datasets; the experimental results are shown in Table 1.
Table 1 comparison of experimental results
Example 2
The method for generating the deep network difficult sample is applied to the field of monitoring image recognition and heat map recognition.
Example 3
The deep network difficult sample generation method is applied to the field of financial and economic quantitative analysis.
Example 4
The deep network difficult sample generation method is applied to the fields of drug discovery, disease analysis and medical image analysis.
Example 5
The deep network difficult sample generation method is applied to the field of network security.
Example 6
The deep network difficult sample generation method is applied to the field of spam filtering.
Example 7
The deep network difficult sample generation method is applied to the field of DNS malicious domain name analysis.
While the invention has been described with reference to the preferred embodiments, it is not limited thereto, and various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.
References:
[1] Fadaee M, Bisazza A, Monz C. Data augmentation for low-resource neural machine translation[J]. arXiv preprint arXiv:1705.00440, 2017.
[2] Perez L, Wang J. The effectiveness of data augmentation in image classification using deep learning[J]. arXiv preprint arXiv:1712.04621, 2017.
[3] Frid-Adar M, Diamant I, Klang E, et al. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification[J]. Neurocomputing, 2018, 321: 321-331.
[4] Lim S K, Loo Y, Tran N T, et al. DOPING: Generative data augmentation for unsupervised anomaly detection with GAN[C]// 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018: 1122-1127.
[5] Zhong Z, Zheng L, Kang G, et al. Random erasing data augmentation[J]. arXiv preprint arXiv:1708.04896, 2017.
[6] Huang H, Li D, Zhang Z, et al. Adversarially occluded samples for person re-identification[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
Claims (1)
1. A method for generating deep network difficult samples, characterized by comprising the following steps:
S01, acquire the training-set images I_k, k=1,2,...,N, input them into the base network ResNet-50, and extract the feature F ∈ R^{H×W×C} from the conv5_3 layer, where H and W are the height and width of the feature F, C is the number of channels of F, and N is the number of training-set images;
S02, establish a spatial attention mechanism to obtain the spatial attention weight matrix a^s;
S03, establish a channel attention mechanism to obtain the channel attention vector a^c;
S04, apply the channel attention vector a^c and the spatial attention weight matrix a^s in turn to generate the dual-attention-reinforced feature, completing the fusion of the spatial and channel attention mechanisms to obtain the dual-attention-reinforced feature F^{sc};
S05, process the dual-attention-reinforced feature F^{sc} with Grad-CAM to obtain the attention heat map corresponding to I_k, and binarize the heat map with the OTSU algorithm to obtain the binary attention map B_k, k=1,2,...,N;
S06, for each B_k, k=1,2,...,N, select the largest connected component and set to 0 the gray values of the corresponding pixels in the original image I_k, k=1,2,...,N, thereby generating the difficult sample J_k, k=1,2,...,N, under the dual-attention mechanism;
the step S02 specifically comprises:
(1) extracting the feature F from the conv5_3 layer;
(2) extracting the spatial feature f_l ∈ R^C at spatial location l=(x, y) of the feature F;
(3) obtaining the attention weight a^s_l at spatial location l=(x, y) with a softmax operation, thereby obtaining the spatial attention weight matrix a^s;
the step S03 specifically comprises:
(1) for the feature F_i of each channel i=1,...,C, applying average pooling to give u_i, thereby obtaining the channel feature u = [u_1, u_2, ..., u_C];
(2) following the average-pooling layer with a convolution layer to learn the reinforced feature of each channel, namely:
u' = W_c * u + b_c
where * denotes the convolution operation, W_c the weights, and b_c the bias term;
(3) applying a Sigmoid operation to u' = [u'_1, u'_2, ..., u'_C] to generate the channel attention vector a^c, i.e., the attention value a^c_i of channel i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010749955.2A CN111860681B (en) | 2020-07-30 | 2020-07-30 | Deep network difficulty sample generation method under double-attention mechanism and application |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111860681A CN111860681A (en) | 2020-10-30 |
CN111860681B true CN111860681B (en) | 2024-04-30 |
Family
ID=72945119
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010749955.2A Active CN111860681B (en) | 2020-07-30 | 2020-07-30 | Deep network difficulty sample generation method under double-attention mechanism and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111860681B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114005078B (en) * | 2021-12-31 | 2022-03-29 | 山东交通学院 | Vehicle weight identification method based on double-relation attention mechanism |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948658A (en) * | 2019-02-25 | 2019-06-28 | 浙江工业大学 | The confrontation attack defense method of Feature Oriented figure attention mechanism and application |
CN110020682A (en) * | 2019-03-29 | 2019-07-16 | 北京工商大学 | A kind of attention mechanism relationship comparison net model methodology based on small-sample learning |
CN110675406A (en) * | 2019-09-16 | 2020-01-10 | 南京信息工程大学 | CT image kidney segmentation algorithm based on residual double-attention depth network |
CN110991311A (en) * | 2019-11-28 | 2020-04-10 | 江南大学 | Target detection method based on dense connection deep network |
CN114818920A (en) * | 2022-04-26 | 2022-07-29 | 常熟理工学院 | Weak supervision target detection method based on double attention erasing and attention information aggregation |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10929708B2 (en) * | 2018-12-10 | 2021-02-23 | International Business Machines Corporation | Deep learning network for salient region identification in images |
- 2020-07-30: application CN202010749955.2A filed in China; granted as CN111860681B (status: Active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948658A (en) * | 2019-02-25 | 2019-06-28 | 浙江工业大学 | The confrontation attack defense method of Feature Oriented figure attention mechanism and application |
CN110020682A (en) * | 2019-03-29 | 2019-07-16 | 北京工商大学 | A kind of attention mechanism relationship comparison net model methodology based on small-sample learning |
CN110675406A (en) * | 2019-09-16 | 2020-01-10 | 南京信息工程大学 | CT image kidney segmentation algorithm based on residual double-attention depth network |
CN110991311A (en) * | 2019-11-28 | 2020-04-10 | 江南大学 | Target detection method based on dense connection deep network |
CN114818920A (en) * | 2022-04-26 | 2022-07-29 | 常熟理工学院 | Weak supervision target detection method based on double attention erasing and attention information aggregation |
Non-Patent Citations (2)
Title |
---|
Object detection based on difficult sample mining under residual networks; Zhang Chao; Chen Ying; Laser & Optoelectronics Progress; 2018-05-11 (No. 10); full text *
Application of deep neural networks fusing dual attention to UAV target detection; Zhan Zheqi; Chen Peng; Sang Yongsheng; Peng Dezhong; Modern Computer; 2020-04-15 (No. 11); full text *
Also Published As
Publication number | Publication date |
---|---|
CN111860681A (en) | 2020-10-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |