CN110334765B - Remote sensing image classification method based on attention mechanism multi-scale deep learning - Google Patents


Info

Publication number: CN110334765B (application CN201910603799.6A)
Authority: CN (China)
Prior art keywords: remote sensing; neural network; training; convolutional neural; image library
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN110334765A
Inventors: 唐旭, 马秋硕, 马晶晶, 焦李成
Current and original assignee: Xidian University
Application filed by Xidian University; priority to CN201910603799.6A

Classifications

    • G06F 18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 — Pattern recognition; classification techniques
    • G06N 3/045 — Neural network architectures; combinations of networks
    • G06T 2207/10032 — Image acquisition modality: satellite or aerial image; remote sensing
    • G06T 2207/20081 — Special algorithmic details: training; learning
    • G06T 2207/20084 — Special algorithmic details: artificial neural networks [ANN]
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a remote sensing image classification method based on attention-mechanism multi-scale deep learning, which mainly addresses the low classification accuracy of the prior art. The scheme is as follows: establish a remote sensing image library and its corresponding categories, normalize the images, and randomly select 80% of the remote sensing image samples from each category to construct a training image library; construct a convolutional neural network comprising a convolutional network module, an attention module, SCDA modules and a fully connected layer; input the training samples of the training image library into the convolutional neural network to obtain their classification results and determine the network's loss function; iteratively minimize the loss function by gradient descent until the loss value stabilizes, yielding a trained convolutional neural network; normalize the remote sensing picture to be classified and input it into the trained network to obtain the classification result. The method offers high classification accuracy and strong robustness, and can be applied to the analysis and management of remote sensing image data.

Description

Remote sensing image classification method based on attention mechanism multi-scale deep learning
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a remote sensing image content classification method which can be applied to analysis and management of remote sensing image data.
Background
With the continuous improvement of the resolution of satellite and aerial remote sensing images, more useful data and information can be obtained from them. However, different applications impose different requirements on remote sensing image processing, so to analyze and manage remote sensing image data effectively, semantic tags need to be attached to the images according to their content. Scene classification is an important way to solve this problem: it refers to distinguishing images with similar scene features from a set of images and classifying them correctly. Compared with natural images, remote sensing images have their own characteristics: because of limited spatial resolution and the phenomena, caused by the complexity of remote sensing imagery, of the same object appearing with different spectra and of different objects sharing the same spectrum, classification results are often wrong. How to classify remote sensing images more accurately therefore remains a challenge.
Classification based on a convolutional neural network inputs the pictures to be trained into the network in batches and reduces a target optimization loss function through repeated training on large batches of data, thereby achieving classification. A number of mature, well-known convolutional neural networks have been proposed; for example, in 2012 Alex Krizhevsky proposed the deep convolutional network model "AlexNet".
Although a conventional convolutional neural network can perform picture scene classification, it still has two defects when learning the semantic information of a picture: first, the complexity of remote sensing images causes inaccurate localization of the classification information; second, the network usually falls into a locally salient region during training, as shown in fig. 1. These two defects cause poor robustness and frequent misclassification when classifying actual scenes.
Disclosure of Invention
The invention aims to provide a remote sensing image classification method based on attention-mechanism multi-scale deep learning that addresses the problems in the prior art, so as to reduce the probability that the classification target of a remote sensing image falls into a local area, enlarge the attention region of the convolutional network, and improve the classification accuracy of remote sensing images.
The technical idea of the invention is as follows: the convolution neural network is used for obtaining convolution characteristics of the picture, useful information which is beneficial to classification is obtained through the attention mechanism according to the attention mechanism principle, multi-scale convolution layer characteristics are extracted from the useful information, and image classification is achieved through the full-connection layer network.
According to the above concept, the implementation steps of the invention include the following:
(1) Establish a remote sensing image library {I_1, I_2, …, I_n, …, I_N} and the corresponding categories {Y_1, Y_2, …, Y_n, …, Y_N} of the image library, and normalize the established remote sensing image library, where n denotes the nth sample index in the image library, n ∈ [0, N], and N is the number of pictures in the remote sensing image library;
(2) Randomly select 80% of the samples from each class of normalized images to construct a training image library {T_1, T_2, …, T_j, …, T_M}, where M < N, T_j denotes the jth picture in the training image library, j ∈ [0, M], and M is the total number of training samples;
(3) Constructing a convolutional neural network comprising a convolutional network module, an attention module, an SCDA module and a full connection layer;
(4) Determine a loss function of the convolutional neural network:
(4a) Input the training image library {T_1, T_2, …, T_j, …, T_M} into the convolutional neural network with pre-trained weights and output the last-layer convolutional feature F;
(4b) Input the last-layer feature F into the attention module of the convolutional neural network and output the convolutional feature A; input A into several SCDA modules of the network with different average thresholds and output T groups of mask convolution features {M_1, M_2, …, M_T}, where T is the number of SCDA modules;
(4c) After global average pooling, input the T groups of mask convolution features into the fully connected layer of the convolutional neural network and output the classification results of the training data, obtaining the loss function of the network:

loss_op = λ_r·loss_1 + λ_s·loss_2 + η·‖W‖_2

where loss_1 is the cross entropy between the output classification result and the actual result, loss_2 is the sum of absolute values of the cross entropies between the classification results output after the T groups of mask convolution features pass through the fully connected layer and the actual result, ‖W‖_2 is the L2 norm of the convolutional neural network weight vector, and λ_r, λ_s, η are the hyper-parameters of loss_1, loss_2 and ‖W‖_2 respectively;
(5) Set the iteration number to P and iteratively train the convolutional neural network by gradient-descent optimization until the loss function loss_op no longer decreases or the iteration number is reached, obtaining a trained convolutional neural network;
(6) The user normalizes the remote sensing image I' to be classified and inputs the normalized image into the trained convolutional neural network to obtain the classification result, completing the image classification.
Compared with the prior art, the invention has the following advantages:
1. Based on the attention mechanism principle, the invention can quickly find salient features in a remote sensing image and concentrate the features used for classification into a region with clear semantic information, enhancing the accuracy of remote sensing scene classification;
2. The SCDA module enlarges the receptive field of the convolutional neural network and reduces the probability that the classification target of a remote sensing image falls into a local area, enhancing the accuracy and robustness of remote sensing image classification;
3. The invention designs a loss function that further constrains the classification task and improves the accuracy of remote sensing image classification.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a diagram of a convolutional neural network constructed in the present invention;
FIG. 3 is a sample view of a remote sensing image used in the simulation of the present invention.
Detailed Description
The effects of the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation steps of the invention are as follows:
step 1, establishing a remote sensing image library to obtain a training sample and a test sample.
1a) Download the UC Merced images from the official dataset website and establish a remote sensing image library {I_1, I_2, …, I_n, …, I_N} and the corresponding categories {Y_1, Y_2, …, Y_n, …, Y_N} of the image library, where I_n denotes the nth image in the library, Y_n the category corresponding to the nth image, n the sample index in the library, and n ∈ [0, N];
1b) Normalize the established remote sensing image library according to the following formula:

I'_n = (I_n − V_min) / (V_max − V_min)

where V_max is the maximum pixel value over all images in the remote sensing image library, V_min the minimum pixel value, {I'_1, I'_2, …, I'_n, …, I'_N} the normalized remote sensing image library, I'_n the nth sample of the normalized image set, n ∈ [0, N], and N the number of pictures in the normalized library;
1c) Randomly select 80% of the remote sensing images from each class of the normalized library as the training sample set {T_1, T_2, …, T_j, …, T_M} and take the remaining 20% as the test sample set {t_1, t_2, …, t_d, …, t_m}, where T_j denotes the jth training sample, j ∈ [0, M], t_d the dth test sample, d ∈ [0, m], M the total number of training samples, m the total number of test samples, and m < N, M < N.
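The normalization of 1b) and the random split of 1c) can be sketched in a few lines of NumPy. This is a hedged illustration: the array below is a random stand-in for the UC Merced images, and the shapes are illustrative assumptions.

```python
import numpy as np

def normalize_library(images):
    """Min-max normalize every pixel into [0, 1] using the
    library-wide minimum V_min and maximum V_max, as in step 1b)."""
    v_min, v_max = images.min(), images.max()
    return (images - v_min) / (v_max - v_min)

def split_library(images, train_fraction=0.8, seed=0):
    """Randomly select 80% of the samples for training and keep
    the remaining 20% for testing, as in step 1c)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))
    cut = int(train_fraction * len(images))
    return images[idx[:cut]], images[idx[cut:]]

# Illustrative stand-in for the remote sensing image library:
library = np.random.default_rng(1).integers(0, 256, size=(10, 8, 8, 3)).astype(float)
norm = normalize_library(library)
train_set, test_set = split_library(norm)
```

Note that the per-class stratification of the real split is omitted here for brevity; the sketch splits the library as a whole.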
And 2, constructing a convolutional neural network.
Referring to fig. 2, this step is implemented as follows:
2a) Set a convolutional network module composed of the five sequentially connected convolutional layers {conv1, conv2, conv3, conv4, conv5} of a pre-trained AlexNet network;
2b) Set an attention module composed of a global average pooling layer, a first fully connected layer, a Relu activation layer, a second fully connected layer and a Sigmoid function; the structure of the attention module is shown in FIG. 3;
the global average pooling layer: given an input convolutional feature of size W×H×C, it averages each of the C feature maps of size W×H and outputs a 1×1×C feature;
the first fully connected layer: its weight size is set to C′ × C′/16, where C′ is the dimension of the feature input to the first fully connected layer;
the second fully connected layer: its weight size is set to C′/16 × C′, restoring the output to the dimension C′ of the feature input to the attention module;
the Relu and Sigmoid activation functions are respectively:

Relu(x) = max(0, x)
Sigmoid(x′) = 1 / (1 + e^(−x′))

where x is the input of the Relu activation function and x′ is the input of the Sigmoid activation function;
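As a hedged sketch, the attention module of 2b) behaves like a squeeze-and-excitation block: global average pooling, a bottleneck fully connected layer to C′/16, ReLU, a second fully connected layer back to C′, and a Sigmoid gate that reweights the channels. The random weight matrices below are illustrative stand-ins for the learned fully connected layers.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_module(feat, w1, w2):
    """Sketch of step 2b): global average pool the WxHxC feature to a
    C-vector, pass it through FC (C -> C/16), ReLU, FC (C/16 -> C) and
    Sigmoid, then reweight each channel of the input feature map."""
    squeeze = feat.mean(axis=(0, 1))           # global average pooling: 1x1xC
    excite = sigmoid(relu(squeeze @ w1) @ w2)  # per-channel weights in (0, 1)
    return feat * excite                       # channel-wise reweighting

rng = np.random.default_rng(0)
feat = rng.standard_normal((6, 6, 32))  # illustrative WxHxC feature
w1 = rng.standard_normal((32, 2))       # C -> C/16
w2 = rng.standard_normal((2, 32))       # C/16 -> C
att_out = attention_module(feat, w1, w2)
```

Since the Sigmoid gate lies strictly in (0, 1), the output feature is an attenuated copy of the input, channel by channel.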
2c) Set an SCDA module for outputting convolution mask features.
Referring to fig. 2, the SCDA module works as follows:
input the three-dimensional convolutional feature output by the second fully connected layer of the attention module into the SCDA module, sum it over the third (channel) dimension to obtain a two-dimensional convolutional feature, and compute the mean of that two-dimensional feature;
build a convolution mask from the mean: compare each value of the two-dimensional convolutional feature with the mean, coding it as 1 if it is greater than the mean and as 0 if it is smaller;
extract the convolution mask feature: multiply the convolution mask by the set average-value threshold E, add 1 to the resulting mask values, and multiply element-wise by the three-dimensional convolutional feature input to the SCDA module to obtain the mask feature; average the mask feature over its first two (spatial) dimensions to obtain and output the convolution mask feature;
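A minimal NumPy sketch of the SCDA masking just described, assuming the (1 + E·mask) reweighting reading of the text; the feature map and threshold value are illustrative.

```python
import numpy as np

def scda_mask_feature(feat, threshold_scale):
    """Sketch of step 2c): sum the WxHxC feature over channels, threshold
    the resulting WxH map at its mean to get a binary mask, reweight the
    input feature by (1 + E*mask), and average over the spatial dims.
    threshold_scale plays the role of the average-value threshold E."""
    agg = feat.sum(axis=2)                   # channel-wise sum -> W x H
    mask = (agg > agg.mean()).astype(float)  # 1 above the mean, 0 otherwise
    weighted = feat * (1.0 + threshold_scale * mask)[..., None]
    return weighted.mean(axis=(0, 1))        # spatial average -> C-vector

rng = np.random.default_rng(0)
feat = rng.standard_normal((6, 6, 32))  # illustrative WxHxC feature
masked = scda_mask_feature(feat, threshold_scale=0.8)
```

Running several such modules with different thresholds (e.g. 1.0, 0.8, 0.6, as in the simulation section) yields the T groups of mask convolution features.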
2d) Set a fully connected block consisting of three fully connected layers with weight sizes of 512×1024, 1024×1024 and 1024×21 respectively;
2e) Sequentially connect the convolutional network module, the attention module, the SCDA module and the fully connected layer to obtain the convolutional neural network.
Step 3, determining a loss function of the convolutional neural network:
3a) Input the training sample set {T_1, T_2, …, T_j, …, T_M} into the convolutional network module of the convolutional neural network and output the last-layer convolutional feature F;
3b) Input the last-layer feature F into the attention module of the convolutional neural network and output the convolutional feature A; input A into several SCDA modules of the network with different average thresholds and output T groups of mask convolution features {M_1, M_2, …, M_T}, where T is the number of SCDA modules;
3c) Input the T groups of mask convolution features into the fully connected layer of the convolutional neural network and output the classification results of the training data, obtaining the loss function loss_op of the network:

loss_op = λ_r·loss_1 + λ_s·loss_2 + η·‖W‖_2

where ‖W‖_2 is the L2 norm of the convolutional neural network weight vector and λ_r, λ_s, η are the hyper-parameters of loss_1, loss_2 and ‖W‖_2 respectively;

loss_1 = −Σ_j o_j · log(y_j)

denotes the cross entropy between the output classification result and the actual result, where y_j is the predicted class probability of T_j in the training image library and o_j is the actual class label of T_j;

loss_2 = Σ_{m=1}^{T} Σ_{n=m+1}^{T} |loss_m − loss_n|

denotes the sum of absolute values over the cross entropies of the classification results output after the T groups of mask convolution features pass through the fully connected layer, where T is the number of SCDA modules, loss_m is the loss_1 of T_j in the training image library under the mth convolution mask feature, and loss_n is the loss_1 of T_j under the nth convolution mask feature.
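A hedged NumPy sketch of loss_op as described in 3c). Two details are assumptions: the pairwise |loss_m − loss_n| form of loss_2 (one reading of claim 5), and using the first branch's cross entropy as a stand-in for loss_1 (in the patent, loss_1 comes from the fused output).

```python
import numpy as np

def cross_entropy(probs, label):
    """loss_1: cross entropy between the predicted class probabilities
    y_j and the one-hot actual class label o_j."""
    return -float(np.sum(label * np.log(probs + 1e-12)))

def total_loss(branch_probs, label, weights, lam_r=0.7, lam_s=0.3, eta=1e-4):
    """Sketch of loss_op = lam_r*loss_1 + lam_s*loss_2 + eta*||W||_2,
    with loss_2 as the sum of pairwise absolute differences between
    the per-mask cross entropies (assumed reading)."""
    losses = [cross_entropy(p, label) for p in branch_probs]
    loss1 = losses[0]  # illustrative stand-in for the fused-output loss
    loss2 = sum(abs(losses[m] - losses[n])
                for m in range(len(losses)) for n in range(m + 1, len(losses)))
    l2 = float(np.sqrt(sum(np.sum(w ** 2) for w in weights)))
    return lam_r * loss1 + lam_s * loss2 + eta * l2

label = np.array([1.0, 0.0, 0.0])               # one-hot actual class
branch_probs = [np.array([0.7, 0.2, 0.1]),      # per-mask-branch predictions
                np.array([0.6, 0.3, 0.1])]
loss_value = total_loss(branch_probs, label, weights=[np.ones((2, 2))])
```

The default hyper-parameters match the simulation section (λ_r = 0.7, λ_s = 0.3, η = 0.0001).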
Step 4, iteratively training the convolutional neural network.
Existing methods for iteratively training a convolutional neural network include the gradient descent optimization algorithm, the Nesterov accelerated gradient method and the Adagrad method. The invention adopts, but is not limited to, the gradient descent algorithm, implemented as follows:
4a) Set the iteration number to P, the initial learning rate of training to L and the decay rate to β, and input the training image library {T_1, T_2, …, T_j, …, T_M} into the convolutional neural network constructed in step 2 in G batches, the number of pictures Q input each time being:

Q = M / G

where M is the total number of samples in the training image library;
4b) Set the learning rate l corresponding to each input batch as:

l = L·β^G

4c) Update the parameters of the convolutional neural network G times by the following formula to obtain the updated weight vector W_new:

W_new = W − l·(∂loss_op/∂W)

where W is the weight vector of the convolutional neural network's parameters;
substitute the updated weight vector W_new into the loss function in 3c) to obtain the loss function loss_op after the weight update;
4d) Input the next training picture into the convolutional neural network and update the loss function loss_op with the updated weight vector, so that the value of loss_op keeps decreasing;
4e) Repeat 4d): if the loss function loss_op stops decreasing while the current training round is still less than the set iteration number P, stop training the network to obtain the trained convolutional neural network; otherwise, stop training when the training round reaches the set iteration number P, obtaining the trained convolutional neural network.
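The per-batch update of step 4 can be sketched on a toy model. Two assumptions are made: the learning rate decays per batch index g as l = L·β^g (the text writes β^G), and the model is a simple linear least-squares fit rather than the patent's network, so the update W_new = W − l·∂loss/∂W stays self-contained.

```python
import numpy as np

def train(X, y, L=1e-2, beta=0.9, epochs=5, batches=2):
    """Gradient-descent sketch of steps 4a)-4e): split the data into G
    batches of Q = M/G samples, decay the learning rate per batch, and
    apply W_new = W - l * dLoss/dW on a toy squared-error loss."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(X.shape[1]) * 0.01
    per_batch = len(X) // batches           # Q = M / G samples per update
    for _ in range(epochs):                 # iteration rounds, up to P
        for g in range(batches):
            lr = L * beta ** g              # assumed per-batch decay l = L*beta^g
            xb = X[g * per_batch:(g + 1) * per_batch]
            yb = y[g * per_batch:(g + 1) * per_batch]
            grad = xb.T @ (xb @ w - yb) / len(xb)  # squared-error gradient
            w = w - lr * grad                       # W_new = W - l * dLoss/dW
    return w

X = np.random.default_rng(2).standard_normal((20, 3))
y = X @ np.array([1.0, -2.0, 0.5])  # illustrative targets
w_fit = train(X, y, epochs=50)
```

In the patent the same loop runs over loss_op with the convolutional network's weight vector; the structure of the updates is identical.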
and 5, classifying the remote sensing scene pictures input by the user.
5a) The user normalizes the remote sensing image to be classified, namely acquiring the maximum value V 'of pixel points of the remote sensing image to be classified' max And minimum value V 'of pixel point' min And dividing the values of all pixel points of the remote sensing image to be classified by V' max And V' min Obtaining the remote sensing image to be classified after normalization processing;
5b) And inputting the remote sensing image after the normalization processing into a trained convolution network model to obtain a classification result.
The effects of the present invention can be further illustrated by the following simulations:
1. Simulation conditions
This embodiment completed the scene classification simulations of the invention and of the existing remote sensing image classifiers on an HP Z840 Workstation with a Xeon(R) CPU E5-2630, a GeForce TITAN XP, 64 GB RAM, the Ubuntu system and the TensorFlow platform.
The simulation parameters were set as follows: iteration number P = 100, learning rate 0.00001, λ_r = 0.7, λ_s = 0.3, η = 0.0001, number of input batches G = 6, decay rate β = 0.9; three groups of SCDA modules were used, with average thresholds p_1 = 1.0, p_2 = 0.8 and p_3 = 0.6; the training data were augmented to four times the original amount by random rotation. In each training iteration, the class-label discriminator and the classification-difference optimizer were trained together.
2. Simulation content
Download the UC Merced remote sensing image set shown in FIG. 3 and normalize it: obtain the maximum pixel value V″_max and the minimum pixel value V″_min of the UC Merced image set, subtract V″_min from the value of every pixel and divide by (V″_max − V″_min) to obtain the normalized UC Merced image set;
randomly select 80% of the remote sensing images from the normalized UC Merced set as the training sample set D_T and take the remaining 20% as the test sample set D_t.
Under the above simulation conditions, the training sample set D_T was used to train the invention and three currently representative image classification models, and the test sample set D_t was used to compare their classification accuracy; the results are shown in Table 1.
The images in the training and test sample sets belong to 21 classes: agricultural, airplane, baseballdiamond, beach, buildings, chaparral, denseresidential, forest, freeway, golfcourse, harbor, intersection, mediumresidential, mobilehomepark, overpass, parkinglot, river, runway, sparseresidential, storagetanks, tenniscourt.
TABLE 1. Performance evaluation of the invention's classification model and existing remote sensing image classifiers

Method          Test sample accuracy
The invention   0.9849
MSCP            0.9782
SHHTFM          0.9789
DCA             0.9690
In table 1, MSCP is the existing remote sensing image classification method based on multi-stack covariance pooling, SHHTFM is the existing remote sensing image classification method based on isomorphic heterogeneous sparsity, and DCA is the existing remote sensing image classification method based on depth feature fusion.
As can be seen from Table 1, when the training sample set D_T comprises 80% of the UC Merced image set, the convolutional neural network trained by the method of the invention achieves, on the remaining 20% test sample set D_t, a classification accuracy higher than that of the currently representative remote sensing image classification models.
In conclusion, the remote sensing image classification effect of the invention is obviously better than that of other remote sensing image classification models.

Claims (6)

1. A remote sensing image classification method based on attention-mechanism multi-scale deep learning, characterized by comprising the following steps:
(1) Establish a remote sensing image library {I_1, I_2, …, I_n, …, I_N} and the corresponding categories {Y_1, Y_2, …, Y_n, …, Y_N} of the image library, and normalize the established remote sensing image library, where I_n denotes the nth image in the library, Y_n the category corresponding to the nth image, n the sample index in the library, n ∈ [0, N], and N the number of pictures in the remote sensing image library;
(2) Randomly select 80% of the remote sensing image samples from each class of normalized remote sensing images to construct a training image library {T_1, T_2, …, T_j, …, T_M} and take the remaining 20% of the remote sensing images as the test sample set {t_1, t_2, …, t_d, …, t_m}, where T_j denotes the jth picture in the training image library, j ∈ [0, M], t_d the dth test sample, d ∈ [0, m], M the total number of training samples, m the total number of test samples, and m < N, M < N;
(3) Construct a convolutional neural network comprising a convolutional network module, an attention module, an SCDA module and a fully connected layer;
(4) Determine a loss function of the convolutional neural network:
(4a) Input the training image library {T_1, T_2, …, T_j, …, T_M} into the convolutional network module of the convolutional neural network and output the last-layer convolutional feature F;
(4b) Input the last-layer feature F into the attention module of the convolutional neural network and output the convolutional feature A; input A into several SCDA modules of the network with different average thresholds and output T groups of mask convolution features {M_1, M_2, …, M_T}, where T is the number of SCDA modules;
(4c) After global average pooling, input the T groups of mask convolution features into the fully connected layer of the convolutional neural network and output the classification results of the training data, obtaining the loss function of the network:

loss_op = λ_r·loss_1 + λ_s·loss_2 + η·‖W‖_2

where loss_1 is the cross entropy between the output classification result and the actual result, loss_2 is the sum of absolute values of the cross entropies between the classification results output after the T groups of mask convolution features pass through the fully connected layer and the actual result, ‖W‖_2 is the L2 norm of the convolutional neural network weight vector, and λ_r, λ_s, η are the hyper-parameters of loss_1, loss_2 and ‖W‖_2 respectively;
(5) Set the iteration number to P and iteratively train the convolutional neural network by gradient-descent optimization until the loss function loss_op no longer decreases or the iteration number is reached, obtaining a trained convolutional neural network;
(6) The user normalizes the remote sensing image I' to be classified and inputs the normalized image into the trained convolutional neural network to obtain the classification result, completing the image classification.
2. The method of claim 1, wherein the remote sensing image library in (1) is normalized by the following equation:

I'_n = (I_n − V_min) / (V_max − V_min)

where V_max is the maximum pixel value over all images in the remote sensing image library, V_min the minimum pixel value, {I'_1, I'_2, …, I'_n, …, I'_N} the normalized remote sensing image library, I'_n the nth normalized remote sensing image sample, and n ∈ [0, N].
3. The method of claim 1, wherein the convolutional network module, attention module, SCDA module and fully connected layer forming the convolutional neural network in (3) are set as follows:
the convolutional network module is composed of the five sequentially connected convolutional layers {conv1, conv2, conv3, conv4, conv5} of a pre-trained AlexNet network;
the attention module is composed of a global average pooling layer, a first fully connected layer, a Relu activation function, a second fully connected layer and a Sigmoid function;
the SCDA module is composed, in order, of a convolutional-channel summation layer and a mask layer.
4. The method of claim 1, wherein the cross entropy loss_1 between the output classification result and the actual result in (4c) is given by:

loss_1 = −Σ_j o_j · log(y_j)

where y_j is the predicted class probability of T_j in the training image library and o_j is the actual class label of T_j.
5. The method of claim 1, wherein the sum of absolute values over the cross entropies of the classification results output after the T groups of mask convolution features pass through the fully connected layer in (4c) is given by:

loss_2 = Σ_{m=1}^{T} Σ_{n=m+1}^{T} |loss_m − loss_n|

where T denotes the number of SCDA modules, loss_m denotes the loss_1 of T_j in the training image library under the mth convolution mask feature, and loss_n denotes the loss_1 of T_j under the nth convolution mask feature.
6. The method of claim 1, wherein (5) the convolutional neural network is iteratively trained by gradient descent optimization, which is implemented as follows:
(5a) Set the initial training learning rate to L and the decay rate to β, and input the training image library {T_1, T_2, …, T_j, …, T_M} into the constructed convolutional neural network G times, where the number of pictures Q input each time is:

Q = M / G

wherein M is the total number of samples in the training image library;
(5b) Set the learning rate l corresponding to each input picture as:

l = L · β^G
(5c) Update the parameters of the convolutional neural network G times through the following formula to obtain the updated weight vector W_new:

W_new = W − l · ∂loss_op/∂W

wherein W is the weight vector of the parameters of the convolutional neural network;
(5d) Input the next training picture into the convolutional neural network and update the weight vector according to the loss function loss_op, so that the value of the loss function loss_op keeps decreasing;
(5e) Repeat (5d): if the loss function loss_op converges while the current number of training rounds is less than the set iteration number P, stop training the network to obtain the trained convolutional neural network; otherwise, stop training when the number of training rounds reaches the set iteration number P, obtaining the trained convolutional neural network.
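Steps (5a)-(5e) can be sketched as plain gradient descent with an exponentially decayed learning rate. The toy quadratic objective and its gradient below are illustrative assumptions standing in for loss_op:

```python
# Sketch of claim 6: learning rate l = L * beta**g decayed per round and the
# update W_new = W - l * d(loss_op)/dW, stopping at convergence or after P
# rounds. The toy objective loss_op(w) = ||w||^2 (gradient 2w) is an assumption.
import numpy as np

def train(w, grad_fn, L=0.1, beta=0.9, P=100, tol=1e-8):
    for g in range(P):                  # at most P training rounds, step (5e)
        lr = L * beta ** g              # decayed learning rate, step (5b)
        step = lr * grad_fn(w)
        w = w - step                    # weight update W_new, step (5c)
        if np.linalg.norm(step) < tol:  # stop early once loss_op stabilizes
            break
    return w

w_final = train(np.array([1.0, -2.0]), lambda w: 2.0 * w)
```

Because the step size shrinks geometrically, the updates are guaranteed to stop changing the weights even if the iteration cap P is never reached.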
CN201910603799.6A 2019-07-05 2019-07-05 Remote sensing image classification method based on attention mechanism multi-scale deep learning Active CN110334765B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910603799.6A CN110334765B (en) 2019-07-05 2019-07-05 Remote sensing image classification method based on attention mechanism multi-scale deep learning

Publications (2)

Publication Number Publication Date
CN110334765A CN110334765A (en) 2019-10-15
CN110334765B true CN110334765B (en) 2023-03-24

Family

ID=68144267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910603799.6A Active CN110334765B (en) 2019-07-05 2019-07-05 Remote sensing image classification method based on attention mechanism multi-scale deep learning

Country Status (1)

Country Link
CN (1) CN110334765B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866494B (en) * 2019-11-14 2022-09-06 三亚中科遥感研究所 Urban group extraction method and system based on optical remote sensing image
CN111046962B (en) * 2019-12-16 2022-10-04 中国人民解放军战略支援部队信息工程大学 Sparse attention-based feature visualization method and system for convolutional neural network model
CN111178304B (en) * 2019-12-31 2021-11-05 江苏省测绘研究所 High-resolution remote sensing image pixel level interpretation method based on full convolution neural network
CN111339862B (en) * 2020-02-17 2021-04-27 中国地质大学(武汉) Remote sensing scene classification method and device based on channel attention mechanism
CN111275192B (en) * 2020-02-28 2023-05-02 交叉信息核心技术研究院(西安)有限公司 Auxiliary training method for improving accuracy and robustness of neural network simultaneously
CN111723674B (en) * 2020-05-26 2022-08-05 河海大学 Remote sensing image scene classification method based on Markov chain Monte Carlo and variation deduction and semi-Bayesian deep learning
CN111861880B (en) * 2020-06-05 2022-08-30 昆明理工大学 Image super-fusion method based on regional information enhancement and block self-attention
CN111738124B (en) * 2020-06-15 2023-08-22 西安电子科技大学 Remote sensing image cloud detection method based on Gabor transformation and attention
CN111797941A (en) * 2020-07-20 2020-10-20 中国科学院长春光学精密机械与物理研究所 Image classification method and system carrying spectral information and spatial information
CN112101190B (en) * 2020-09-11 2023-11-03 西安电子科技大学 Remote sensing image classification method, storage medium and computing device
CN112580557A (en) * 2020-12-25 2021-03-30 深圳市优必选科技股份有限公司 Behavior recognition method and device, terminal equipment and readable storage medium
CN112926380B (en) * 2021-01-08 2022-06-24 浙江大学 Novel underwater laser target intelligent recognition system
CN113191285B (en) * 2021-05-08 2023-01-20 山东大学 River and lake remote sensing image segmentation method and system based on convolutional neural network and Transformer
CN113177523A (en) * 2021-05-27 2021-07-27 青岛杰瑞工控技术有限公司 Fish behavior image identification method based on improved AlexNet
CN113505651A (en) * 2021-06-15 2021-10-15 杭州电子科技大学 Mosquito identification method based on convolutional neural network
CN113435531B (en) * 2021-07-07 2022-06-21 中国人民解放军国防科技大学 Zero sample image classification method and system, electronic equipment and storage medium
CN113449712B (en) * 2021-09-01 2021-12-07 武汉方芯科技有限公司 Goat face identification method based on improved Alexnet network
CN114286113B (en) * 2021-12-24 2023-05-30 国网陕西省电力有限公司西咸新区供电公司 Image compression recovery method and system based on multi-head heterogeneous convolution self-encoder
CN114092832B (en) * 2022-01-20 2022-04-15 武汉大学 High-resolution remote sensing image classification method based on parallel hybrid convolutional network

Citations (3)

Publication number Priority date Publication date Assignee Title
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN108830296A (en) * 2018-05-18 2018-11-16 河海大学 A kind of improved high score Remote Image Classification based on deep learning
WO2018214195A1 (en) * 2017-05-25 2018-11-29 中国矿业大学 Remote sensing imaging bridge detection method based on convolutional neural network


Similar Documents

Publication Publication Date Title
CN110334765B (en) Remote sensing image classification method based on attention mechanism multi-scale deep learning
CN108647742B (en) Rapid target detection method based on lightweight neural network
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN110516596B (en) Octave convolution-based spatial spectrum attention hyperspectral image classification method
CN107066559B (en) Three-dimensional model retrieval method based on deep learning
CN109344736B (en) Static image crowd counting method based on joint learning
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
CN108108751B (en) Scene recognition method based on convolution multi-feature and deep random forest
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN109840560B (en) Image classification method based on clustering in capsule network
CN111753828B (en) Natural scene horizontal character detection method based on deep convolutional neural network
CN109684922B (en) Multi-model finished dish identification method based on convolutional neural network
CN110598029A (en) Fine-grained image classification method based on attention transfer mechanism
CN107832797B (en) Multispectral image classification method based on depth fusion residual error network
CN110716792B (en) Target detector and construction method and application thereof
CN113033520A (en) Tree nematode disease wood identification method and system based on deep learning
CN111833322B (en) Garbage multi-target detection method based on improved YOLOv3
CN113221956B (en) Target identification method and device based on improved multi-scale depth model
CN112364747B (en) Target detection method under limited sample
CN112364974B (en) YOLOv3 algorithm based on activation function improvement
CN110427819A (en) The method and relevant device of PPT frame in a kind of identification image
CN111914902A (en) Traditional Chinese medicine identification and surface defect detection method based on deep neural network
CN111079837A (en) Method for detecting, identifying and classifying two-dimensional gray level images
CN111222545B (en) Image classification method based on linear programming incremental learning
CN110516700B (en) Fine-grained image classification method based on metric learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant