CN115601583A - Deep convolutional network target recognition method with a dual-channel attention mechanism - Google Patents

Deep convolutional network target recognition method with a dual-channel attention mechanism

Info

Publication number: CN115601583A
Application number: CN202211090432.7A
Authority: CN (China)
Prior art keywords: neural network, attention mechanism, target, channel, network
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 王俊杰, 赵立业, 黄程韦
Current and original assignee: Southeast University
Application filed by Southeast University; priority/filing date 2022-09-07; published as CN115601583A on 2023-01-13
Classifications

    • G06V10/764 — Image or video recognition or understanding using pattern recognition or machine learning: classification, e.g. of video objects
    • G06N3/084 — Computing arrangements based on biological models; neural networks; learning methods: backpropagation, e.g. using gradient descent
    • G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning: neural networks


Abstract

The invention discloses a deep convolutional network target recognition method with a dual-channel attention mechanism, comprising the following steps: constructing a convolutional neural network that takes a single sample pair as input and extracts high-dimensional feature maps; constructing spatial attention and channel attention mechanism modules that take the two high-dimensional feature maps extracted by the neural network as input, compute the correlation between feature pixels in the spatial dimension (and between feature channels in the channel dimension), and add the result to the original features element by element; stacking the outputs of the spatial and channel attention modules along the channel dimension to obtain the final feature representation of the model; constructing training sample pairs, where the number of same-class pairs is increased through data augmentation and different-class targets are paired directly; and computing a cross-entropy loss and learning the network parameters through stochastic gradient descent to obtain a neural network model able to distinguish target classes. The method improves the accuracy of visual target image recognition in single-sample scenarios and for target classes that did not participate in training.

Description

Deep convolutional network target recognition method with a dual-channel attention mechanism
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a deep convolutional network target recognition method based on a dual-channel attention mechanism.
Background
In the last decade, deep learning has achieved great success in the field of computer vision, and more and more researchers have begun to focus on the application of neural networks to object recognition.
Although neural network models achieve excellent results on most target recognition tasks, practical production environments still pose challenges such as a wide variety of target types, insufficient training samples, fine-grained intra-class variation, and growing numbers of classes. A neural network is a typical supervised learning algorithm that relies on a large-scale labeled training data set, and the cost of such data is not negligible, so enough images cannot be collected for every target class for training. In addition, when classes change frequently, a typical neural network classifier cannot effectively handle classes that did not participate in training. These are among the problems to be solved before the technology can be applied in practice.
Disclosure of Invention
To solve these problems, the invention discloses a deep convolutional network target recognition method with a dual-channel attention mechanism, which achieves automatic classification of visual targets even when each target class has only one training image sample.
To achieve this, the technical scheme of the invention is as follows:
A deep convolutional network target recognition method with a dual-channel attention mechanism comprises the following steps:
Step 1: construct a convolutional neural network that takes an image sample pair as input and extracts high-dimensional feature maps;
Step 2: construct a spatial attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature pixels in the spatial dimension, and adds the result to the original features element by element;
Step 3: construct a channel attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature channels in the channel dimension, and adds the result to the original features element by element;
Step 4: stack the outputs of the spatial attention mechanism module and the channel attention mechanism module along the channel dimension to obtain the final feature representation of the model;
Step 5: construct training sample pairs, where the number of same-class pairs is increased through data augmentation and different-class targets are paired directly;
Step 6: compute the cross-entropy loss and learn the network parameters through stochastic gradient descent to obtain a neural network model able to distinguish target classes.
Further, in the present invention, step 1 comprises the following steps:
Step 1-1: construct a convolutional neural network containing 17 convolutional layers. The head convolutional layer consists of 64 convolution kernels of size 7×7 with stride 2, so that the input image is downsampled by a factor of 2 and the number of feature-map channels is raised to 64; a max-pooling layer with a 3×3 window and stride 2 downsamples the feature map by a further factor of 2. Apart from the head convolutional layer, every 2 convolutional layers with 3×3 kernels are joined by a shortcut connection into a residual module, giving 8 residual modules in total; the first convolutional layer of each downsampling residual block has stride 2 and the remaining layers have stride 1, and the number of convolution kernels keeps increasing with network depth. The network finally yields a high-dimensional feature map whose spatial size is 1/32 of the input image and whose channel dimension is raised to 512. The network weights are obtained by random initialization and continuously updated through backpropagation during training;
Step 1-2: construct two identical paths of the convolutional neural network described in step 1-1; each path receives one image of the image sample pair as input and outputs a high-dimensional feature map, $F_1$ and $F_2$ respectively.
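For concreteness, the following is a minimal PyTorch sketch of such a two-path (Siamese) backbone. It assumes a ResNet-18-style layout — the stride pattern below is chosen so that the output is 1/32 of the input resolution with 512 channels, as stated above — and all module and variable names are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 conv layers joined by a shortcut connection (step 1-1)."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection so the shortcut can be added when the shape changes
        self.short = (nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                                    nn.BatchNorm2d(out_ch))
                      if stride != 1 or in_ch != out_ch else nn.Identity())

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.short(x))

class Backbone(nn.Module):
    """17 conv layers: one 7x7 head conv + 8 residual blocks (2 convs each)."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False),  # H/2
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1))                  # H/4
        chans   = [64, 64, 128, 128, 256, 256, 512, 512]
        strides = [1, 1, 2, 1, 2, 1, 2, 1]   # overall H/32, 512 channels
        blocks, in_ch = [], 64
        for c, s in zip(chans, strides):
            blocks.append(ResidualBlock(in_ch, c, s))
            in_ch = c
        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        return self.blocks(self.head(x))

# Siamese use (step 1-2): the same network processes both images of a pair
backbone = Backbone()
x1, x2 = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
F1, F2 = backbone(x1), backbone(x2)   # each: (1, 512, 7, 7)
```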
Further, in the present invention, step 2 comprises the following steps:
Step 2-1: feed the original high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the convolutional neural network in step 1, where H, W and C denote the height, width and number of channels of the feature map respectively, into three groups of 1×1 convolutional layers to obtain three new feature maps $F_a$, $F_b$, $F_c$, and flatten their width and height dimensions, i.e. $\{F_a, F_b, F_c\} \in \mathbb{R}^{C\times(H\times W)}$. Subsequently, multiply the transpose of $F_a$ with $F_b$ and apply a Softmax function to obtain the spatial attention matrix $M_s \in \mathbb{R}^{(H\times W)\times(H\times W)}$, specifically

$$M_s^{(i,j)} = \frac{\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}{\sum_{i=1}^{H\times W}\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}$$

where $M_s^{(i,j)}$ denotes the correlation between the feature pixels at the i-th and j-th positions, T denotes transposition, and $F_a$, $F_b$ are the feature maps output by the 1×1 convolutional layers.
Step 2-2: multiply $F_c$ by $M_s$ and add the result element by element to the original high-dimensional feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature $F_s$, specifically

$$F_s = \eta_s \sum_{j=1}^{H\times W}\big(M_s^{(i,j)} F_c^{(j)}\big) + F$$

where $\eta_s$ is a trainable scale factor initialized to 0 so that the attention term $\sum_j M_s^{(i,j)} F_c^{(j)}$ does not dominate early in training, and j is the subscript of the spatial position.
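For illustration, a minimal PyTorch sketch of this spatial attention module follows, continuing the imports of the backbone sketch above. It implements the DANet-style position attention that the equations describe; the softmax axis and the channel widths of the three 1×1 branches are assumptions, as the patent does not pin them down.

```python
class SpatialAttention(nn.Module):
    """Spatial attention (step 2): correlate feature pixels across positions."""
    def __init__(self, ch):
        super().__init__()
        self.to_a = nn.Conv2d(ch, ch, 1)   # the three 1x1 conv branches: Fa, Fb, Fc
        self.to_b = nn.Conv2d(ch, ch, 1)
        self.to_c = nn.Conv2d(ch, ch, 1)
        self.eta = nn.Parameter(torch.zeros(1))  # trainable scale, initialized to 0

    def forward(self, F):
        B, C, H, W = F.shape
        Fa = self.to_a(F).flatten(2)               # (B, C, HW)
        Fb = self.to_b(F).flatten(2)
        Fc = self.to_c(F).flatten(2)
        # attention over positions: softmax(Fa^T @ Fb), shape (B, HW, HW)
        Ms = torch.softmax(Fa.transpose(1, 2) @ Fb, dim=1)
        out = Fc @ Ms                              # weighted sum over positions
        return self.eta * out.view(B, C, H, W) + F # element-wise residual add
```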
Further, in the present invention, step 3 comprises the following steps:
Step 3-1: for the high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the neural network in step 1, where H, W and C denote the height, width and number of channels of the feature map respectively, flatten the width and height dimensions of the feature map and take the product of the result and its own transpose to obtain the channel attention matrix $M_t \in \mathbb{R}^{C\times C}$. Let i, j denote the i-th and j-th channels (the matrix is C×C, so its indices run over channels) and let T denote the transposition operation; specifically

$$M_t^{(i,j)} = \frac{\exp\!\big(F^{(i)} (F^{(j)})^T\big)}{\sum_{i=1}^{C}\exp\!\big(F^{(i)} (F^{(j)})^T\big)}$$

Step 3-2: multiply $M_t$ by $F$ and add the result element by element to the original feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature $F_t$, specifically

$$F_t = \eta_t \sum_{i=1}^{C}\big(M_t^{(i,j)} F^{(i)}\big) + F$$

where $\eta_t$ is a trainable scale factor initialized to 0 so that the attention term does not dominate early in training.
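A matching PyTorch sketch of the channel attention module follows, under the same assumptions as above; note that it has no 1×1 convolution branches, since the equations correlate the flattened feature map directly with its own transpose.

```python
class ChannelAttention(nn.Module):
    """Channel attention (step 3): correlate feature channels."""
    def __init__(self):
        super().__init__()
        self.eta = nn.Parameter(torch.zeros(1))  # trainable scale, initialized to 0

    def forward(self, F):
        B, C, H, W = F.shape
        Ff = F.flatten(2)                                    # (B, C, HW)
        Mt = torch.softmax(Ff @ Ff.transpose(1, 2), dim=1)   # (B, C, C)
        out = Mt @ Ff                                        # weighted sum over channels
        return self.eta * out.view(B, C, H, W) + F           # element-wise residual add
```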
Further, in the present invention, step 4 comprises the following steps:
Step 4-1: for the attention mechanism modules described in step 2 and step 3, stack the resulting $F_s$ and $F_t$ along the channel dimension to form a dual-attention module;
Step 4-2: pass the two high-dimensional feature maps $F_1, F_2 \in \mathbb{R}^{C\times H\times W}$ output by the neural network in step 1 through the dual-attention module respectively to obtain the improved feature representations $F'_1, F'_2 \in \mathbb{R}^{(2\times C)\times H\times W}$.
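Combining the two modules as steps 4-1 and 4-2 describe is then a single concatenation along the channel dimension; a brief sketch (names illustrative):

```python
class DualAttention(nn.Module):
    """Dual-attention (step 4): concatenate spatial and channel outputs."""
    def __init__(self, ch):
        super().__init__()
        self.spatial = SpatialAttention(ch)
        self.channel = ChannelAttention()

    def forward(self, F):
        # stack along the channel dimension: (B, C, H, W) -> (B, 2C, H, W)
        return torch.cat([self.spatial(F), self.channel(F)], dim=1)
```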
Further, in the present invention, step 5 comprises the following steps:
Step 5-1: for images of different target classes, directly form sample pairs for training the network;
Step 5-2: for images of the same target class, apply random scaling, rotation, affine transformation and adjustments of brightness, saturation and contrast, so that the two images in each same-class sample pair differ from each other, and form sample pairs such that the number of same-class sample pairs matches the number of different-class sample pairs. Assuming there are E classes of visual targets to be identified, the single-sample training set described in this patent contains E target images; after the sample pairs are constructed according to the method described in step 5, the total number of sample pairs $N_{pairs}$ is

$$N_{pairs} = 2\sum_{n=1}^{E-1} n = E(E-1)$$

where n is the summation index.
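A small Python sketch of the pair construction follows. The augmentation magnitudes are illustrative assumptions — the patent names the transformation types but not their parameter ranges — and `build_pairs` is a hypothetical helper name.

```python
import random
from itertools import combinations
from torchvision import transforms

# Augmentations for same-class pairs (step 5-2); parameter values are assumed.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.8, 1.2)),
    transforms.ColorJitter(brightness=0.3, saturation=0.3, contrast=0.3),
])

def build_pairs(images):
    """images: one PIL image per class (the single-sample training set)."""
    pairs = []
    # different-class pairs: direct pairing, label 0 -> E(E-1)/2 pairs
    for i, j in combinations(range(len(images)), 2):
        pairs.append((images[i], images[j], 0))
    n_diff = len(pairs)
    # same-class pairs: two distinct augmentations of one image, label 1,
    # matched in number to the different-class pairs
    for _ in range(n_diff):
        img = random.choice(images)
        pairs.append((augment(img), augment(img), 1))
    random.shuffle(pairs)
    return pairs
```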
Further, in the present invention, step 6 comprises the following steps:
Step 6-1: pass the pair of feature maps $F'_1, F'_2 \in \mathbb{R}^{(2\times C)\times H\times W}$ output by the neural network after dual-attention processing through global average pooling to obtain a pair of feature vectors $f_1, f_2 \in \mathbb{R}^{2\times C}$. We compute the distance between $f_1$ and $f_2$ and map it into the range [0,1] via a Sigmoid function to obtain the final output $y_i$ of the neural network. Subsequently, the cross-entropy loss function is defined, specifically

$$loss = -\sum_{i}\big[t_i \log y_i + (1 - t_i)\log(1 - y_i)\big]$$

where i denotes the i-th pair of output features, $y_i$ is the output of the neural network, and $t_i \in \{0,1\}$ is the ground-truth label of the i-th sample pair (1 for a same-class pair, 0 for a different-class pair).
Step 6-2: using loss as the loss function and the sample pairs of step 5 as input, train the neural network of steps 1 to 4 with the adaptive moment estimation algorithm, which dynamically adjusts the learning rate of each parameter using the first and second moment estimates of the gradient. The weight decay of the adaptive moment estimation algorithm is set to 5e-5, 32 samples are input as a mini-batch, the learning rate is initialized to 4e-3 and decays to half its value every 40 iteration epochs, and 200 epochs are iterated to obtain the neural network model able to distinguish target classes.
The invention has the beneficial effects that:
under the condition that each type of target only has one training image, the neural network is trained by constructing a data enhancement sample pair to expand training data; the neural network structure enables the model to have the capability of training by using a small number of samples and identifying target classes which do not participate in training; the double-attention mechanism improves the distinction degree between intra-class compactness and inter-class compactness and improves the identification accuracy; the cross entropy loss function avoids the punishment degree imbalance caused by manually setting the margin in the training process, and the identification accuracy is improved.
Drawings
FIG. 1 is a schematic overall flow diagram of the process of the present invention;
FIG. 2 is a diagram of a convolutional network structure for image feature extraction in the present invention;
FIG. 3 is a schematic view of a spatial attention mechanism module of the present invention;
FIG. 4 is a schematic diagram of a channel attention mechanism module of the present invention;
FIG. 5 shows the results of ablation experiments with the method of the present invention on an experimental data set.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention.
As shown in fig. 1, the overall flow of the deep convolutional network target recognition method with a dual-channel attention mechanism provided by the present invention comprises the following steps:
Step 1: construct a convolutional neural network that takes an image sample pair as input and extracts high-dimensional feature maps.
Step 2: construct a spatial attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature pixels in the spatial dimension, and adds the result to the original features element by element.
Step 3: construct a channel attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature channels in the channel dimension, and adds the result to the original features element by element.
Step 4: stack the outputs of the spatial attention mechanism module and the channel attention mechanism module along the channel dimension to obtain the final feature representation of the model.
Step 5: construct training sample pairs, where the number of same-class pairs is increased through data augmentation and different-class targets are paired directly.
Step 6: compute the cross-entropy loss and learn the network parameters through stochastic gradient descent to obtain a neural network model able to distinguish target classes.
As shown in fig. 2, two identical convolutional neural network paths are constructed; each path receives one image of the image sample pair as input and outputs a high-dimensional feature map, $F_1$ and $F_2$ respectively. The constructed network contains 17 convolutional layers: the head convolutional layer consists of 64 convolution kernels of size 7×7 with stride 2, downsampling the input image by a factor of 2 and raising the number of feature-map channels to 64; a max-pooling layer with a 3×3 window and stride 2 downsamples the feature map by a further factor of 2; apart from the head convolutional layer, every 2 convolutional layers with 3×3 kernels are joined by a shortcut connection into a residual module, giving 8 residual modules in total, where the first convolutional layer of each downsampling residual block has stride 2 and the remaining layers have stride 1, and the number of convolution kernels keeps increasing with network depth. The network finally yields a high-dimensional feature map whose spatial size is 1/32 of the input image and whose channel dimension is raised to 512; the network weights are obtained by random initialization and continuously updated through backpropagation during training.
As shown in fig. 3, the high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the convolutional neural network, where H, W and C denote the height, width and number of channels respectively, is fed into three groups of 1×1 convolutional layers to obtain three new feature maps $F_a$, $F_b$, $F_c$, whose width and height dimensions are flattened, i.e. $\{F_a, F_b, F_c\} \in \mathbb{R}^{C\times(H\times W)}$. Subsequently, the transpose of $F_a$ is multiplied with $F_b$ and a Softmax function is applied to obtain the spatial attention matrix $M_s \in \mathbb{R}^{(H\times W)\times(H\times W)}$, specifically

$$M_s^{(i,j)} = \frac{\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}{\sum_{i=1}^{H\times W}\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}$$

where $M_s^{(i,j)}$ denotes the correlation between the feature pixels at the i-th and j-th positions.
$F_c$ is then multiplied by $M_s$ and the result is added element by element to the original feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature

$$F_s = \eta_s \sum_{j=1}^{H\times W}\big(M_s^{(i,j)} F_c^{(j)}\big) + F$$

where $\eta_s$ is a trainable scale factor initialized to 0 so that the attention term does not dominate early in training. $F_s$ aggregates features selectively according to the spatial attention matrix, so that strongly correlated features reinforce each other; this improves intra-class compactness and semantic consistency and lets the network better distinguish target images of different classes.
As shown in figure 4, a channel attention mechanism module is constructed. The width and height dimensions of the high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the neural network, where H, W and C denote the height, width and number of channels respectively, are flattened, and the result is multiplied by its own transpose to obtain the channel attention matrix $M_t \in \mathbb{R}^{C\times C}$, specifically

$$M_t^{(i,j)} = \frac{\exp\!\big(F^{(i)} (F^{(j)})^T\big)}{\sum_{i=1}^{C}\exp\!\big(F^{(i)} (F^{(j)})^T\big)}$$

$M_t$ is then multiplied by $F$ and the result is added element by element to the original feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature

$$F_t = \eta_t \sum_{i=1}^{C}\big(M_t^{(i,j)} F^{(i)}\big) + F$$

where $\eta_t$ is a trainable scale factor initialized to 0 so that the attention term does not dominate early in training. The feature $F_t$ produced at each location is thus a weighted sum of the features of all channels added to the original features; it models the dependencies between feature-map channels, enhances inter-class distinguishability and feature discriminability, and lets the network highlight feature representations of fine-grained variations in the target image.
As shown in fig. 5, to verify the beneficial effect of the proposed deep convolutional network target recognition method with a dual-channel attention mechanism, the following experiment was performed:
Ablation experiments were carried out on a target recognition data set, performing single-sample visual target recognition with four configurations: a single-path convolutional network; a convolutional neural network with sample-pair construction; a convolutional neural network with sample-pair construction and the dual-attention mechanism; and a convolutional neural network with sample-pair construction, the dual-attention mechanism and the proposed loss. Top-1 accuracy, Top-5 accuracy and the average F1 score were selected as evaluation indices, where Top-1 accuracy is the proportion of samples whose top-ranked predicted class matches the ground truth, Top-5 accuracy is the proportion of samples whose five top-ranked predicted classes contain the ground truth, and the F1 score is defined as

$$F_1 = \frac{1}{N}\sum_{c=1}^{N}\frac{2\,TP_c}{2\,TP_c + FP_c + FN_c}$$

where, for class c, $TP_c$, $FP_c$ and $FN_c$ are respectively the numbers of positive samples judged positive (true positives), negative samples judged positive (false positives) and positive samples judged negative (false negatives), and N is the number of classes.
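For reference, a short sketch of the macro-averaged F1 score defined above, assuming NumPy arrays of integer class labels:

```python
import numpy as np

def macro_f1(y_true, y_pred, num_classes):
    """Macro-averaged F1 over N classes, per the definition above."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives for class c
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))
```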
It can be observed that the full configuration of the present invention (convolutional neural network + sample-pair construction + dual-attention mechanism + loss) obtains the best recognition results on both data sets; the convolutional neural network structure together with the large-scale training sample pairs it requires plays the decisive role in the performance improvement, while the dual-attention mechanism and the loss each improve the results to a certain extent.
It should be noted that the above-mentioned contents only illustrate the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and it will be apparent to those skilled in the art that several modifications and embellishments can be made without departing from the principle of the present invention, and these modifications and embellishments fall within the protection scope of the claims of the present invention.

Claims (7)

1. A deep convolutional network target recognition method with a dual-channel attention mechanism, characterized by comprising the following steps:
Step 1: construct a convolutional neural network that takes an image sample pair as input and extracts high-dimensional feature maps;
Step 2: construct a spatial attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature pixels in the spatial dimension, and adds the result to the original features element by element;
Step 3: construct a channel attention mechanism module that takes the two high-dimensional feature maps extracted by the neural network as input, computes the correlation between feature channels in the channel dimension, and adds the result to the original features element by element;
Step 4: stack the outputs of the spatial attention mechanism module and the channel attention mechanism module along the channel dimension to obtain the final feature representation of the model;
Step 5: construct training sample pairs, where the number of same-class pairs is increased through data augmentation and different-class targets are paired directly;
Step 6: compute the cross-entropy loss and learn the network parameters through stochastic gradient descent to obtain a neural network model able to distinguish target classes.
2. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 1, characterized in that step 1 specifically comprises the steps of:
Step 1-1: construct a convolutional neural network containing 17 convolutional layers, in which the head convolutional layer consists of 64 convolution kernels of size 7×7 with stride 2, so that the input image is downsampled by a factor of 2 and the number of feature-map channels is raised to 64; a max-pooling layer with a 3×3 window and stride 2 downsamples the feature map by a further factor of 2; apart from the head convolutional layer, every 2 convolutional layers with 3×3 kernels are joined by a shortcut connection into a residual module, giving 8 residual modules in total, where the first convolutional layer of each downsampling residual block has stride 2 and the remaining layers have stride 1, and the number of convolution kernels keeps increasing with network depth; the network finally yields a high-dimensional feature map whose spatial size is 1/32 of the input image and whose channel dimension is raised to 512, with the network weights obtained by random initialization and continuously updated through backpropagation during training;
Step 1-2: construct two identical paths of the convolutional neural network described in step 1-1, each path receiving one image of the image sample pair as input and outputting a high-dimensional feature map, $F_1$ and $F_2$ respectively.
3. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 2, characterized in that step 2 specifically comprises the steps of:
Step 2-1: feed the original high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the convolutional neural network in step 1, where H, W and C denote the height, width and number of channels of the feature map, into three groups of 1×1 convolutional layers to obtain three new feature maps $F_a$, $F_b$, $F_c$, and flatten their width and height dimensions, i.e. $\{F_a, F_b, F_c\} \in \mathbb{R}^{C\times(H\times W)}$; subsequently, multiply the transpose of $F_a$ with $F_b$ and apply a Softmax function to obtain the spatial attention matrix $M_s \in \mathbb{R}^{(H\times W)\times(H\times W)}$, specifically

$$M_s^{(i,j)} = \frac{\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}{\sum_{i=1}^{H\times W}\exp\!\big((F_a^{(i)})^T F_b^{(j)}\big)}$$

where $M_s^{(i,j)}$ denotes the correlation between the feature pixels at the i-th and j-th positions, T denotes transposition, and $F_a$, $F_b$ are the feature maps output by the convolutional layers;
Step 2-2: multiply $F_c$ by $M_s$ and add the result element by element to the original high-dimensional feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature $F_s$, specifically

$$F_s = \eta_s \sum_{j=1}^{H\times W}\big(M_s^{(i,j)} F_c^{(j)}\big) + F$$

where $\eta_s$ is a trainable scale factor initialized to 0 so that the attention term does not dominate early in training, and j is the subscript of the spatial position.
4. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 3, characterized in that step 3 specifically comprises the steps of:
Step 3-1: for the high-dimensional feature map $F \in \mathbb{R}^{C\times H\times W}$ extracted by the neural network in step 1, where H, W and C denote the height, width and number of channels of the feature map respectively, flatten the width and height dimensions of the feature map and multiply the result by its own transpose to obtain the channel attention matrix $M_t \in \mathbb{R}^{C\times C}$; letting i, j denote the i-th and j-th channels and T the transposition operation, specifically

$$M_t^{(i,j)} = \frac{\exp\!\big(F^{(i)} (F^{(j)})^T\big)}{\sum_{i=1}^{C}\exp\!\big(F^{(i)} (F^{(j)})^T\big)}$$

Step 3-2: multiply $M_t$ by $F$ and add the result element by element to the original feature $F \in \mathbb{R}^{C\times H\times W}$ to obtain the output feature $F_t$, specifically

$$F_t = \eta_t \sum_{i=1}^{C}\big(M_t^{(i,j)} F^{(i)}\big) + F$$

where $\eta_t$ is a trainable scale factor initialized to 0 so that the attention term does not dominate early in training.
5. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 4, characterized in that step 4 further comprises the steps of:
Step 4-1: for the attention mechanism modules described in step 2 and step 3, stack the respective outputs $F_s$ and $F_t$ along the channel dimension to form a dual-attention module;
Step 4-2: pass the two high-dimensional feature maps $F_1, F_2 \in \mathbb{R}^{C\times H\times W}$ output by the neural network in step 1 through the dual-attention module respectively to obtain the improved feature representations $F'_1, F'_2 \in \mathbb{R}^{(2\times C)\times H\times W}$.
6. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 5, characterized in that step 5 further comprises the steps of:
Step 5-1: for images of different target classes, directly form sample pairs for training the network;
Step 5-2: for images of the same target class, apply random scaling, rotation, affine transformation and adjustments of brightness, saturation and contrast, so that the two images in each same-class sample pair differ from each other, and form sample pairs such that the number of same-class sample pairs matches the number of different-class sample pairs; assuming there are E classes of visual targets to be identified, the single-sample training set described in this patent contains E target images, and after the sample pairs are constructed according to the method described in step 5, the total number of sample pairs $N_{pairs}$ is

$$N_{pairs} = 2\sum_{n=1}^{E-1} n = E(E-1)$$

where n is the summation index.
7. The deep convolutional network target recognition method with a dual-channel attention mechanism according to claim 6, characterized in that step 6 further comprises the steps of:
Step 6-1: pass the pair of feature maps $F'_1, F'_2 \in \mathbb{R}^{(2\times C)\times H\times W}$ output by the neural network after dual-attention processing through global average pooling to obtain a pair of feature vectors $f_1, f_2 \in \mathbb{R}^{2\times C}$, compute the distance between $f_1$ and $f_2$ and map it into the range [0,1] via a Sigmoid function to obtain the final output $y_i$ of the neural network; subsequently, define the cross-entropy loss function, specifically

$$loss = -\sum_{i}\big[t_i \log y_i + (1 - t_i)\log(1 - y_i)\big]$$

where i denotes the i-th pair of output features, $y_i$ is the output of the neural network, and $t_i$ is the ground-truth label of the i-th sample pair;
Step 6-2: using loss as the loss function and the sample pairs of step 5 as input, train the neural network of steps 1 to 4 with the adaptive moment estimation algorithm, which dynamically adjusts the learning rate of each parameter using the first and second moment estimates of the gradient; the weight decay of the adaptive moment estimation algorithm is set to 5e-5, 32 samples are input as a mini-batch, the learning rate is initialized to 4e-3 and decays to half its value every 40 iteration epochs, and 200 epochs are iterated to obtain the neural network model able to distinguish target classes.


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116106856A (en) * 2023-04-13 2023-05-12 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Identification model establishment method and identification method for thunderstorm strong wind and computing equipment
CN116106856B (en) * 2023-04-13 2023-08-18 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Identification model establishment method and identification method for thunderstorm strong wind and computing equipment
CN116416479A (en) * 2023-06-06 2023-07-11 江西理工大学南昌校区 Mineral classification method based on deep convolution fusion of multi-scale image features
CN116416479B (en) * 2023-06-06 2023-08-29 江西理工大学南昌校区 Mineral classification method based on deep convolution fusion of multi-scale image features
CN116579616A (en) * 2023-07-10 2023-08-11 武汉纺织大学 Risk identification method based on deep learning
CN116579616B (en) * 2023-07-10 2023-09-29 武汉纺织大学 Risk identification method based on deep learning


Legal Events

Code: PB01 — Publication
Code: SE01 — Entry into force of request for substantive examination