CN114708496A - Remote sensing change detection method based on improved spatial pooling pyramid - Google Patents


Info

Publication number
CN114708496A
CN114708496A
Authority
CN
China
Prior art keywords
layer
output
convolution
level feature
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210240578.9A
Other languages
Chinese (zh)
Inventor
邵攀
高梓昂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges University CTGU filed Critical China Three Gorges University CTGU
Priority to CN202210240578.9A priority Critical patent/CN114708496A/en
Publication of CN114708496A publication Critical patent/CN114708496A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • G — PHYSICS
        • G06 — COMPUTING; CALCULATING OR COUNTING
            • G06F — ELECTRIC DIGITAL DATA PROCESSING
                • G06F18/00 — Pattern recognition
                    • G06F18/20 — Analysing
                        • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                            • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
            • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N3/00 — Computing arrangements based on biological models
                    • G06N3/02 — Neural networks
                        • G06N3/04 — Architecture, e.g. interconnection topology
                            • G06N3/045 — Combinations of networks
                            • G06N3/047 — Probabilistic or stochastic networks
                            • G06N3/048 — Activation functions
                        • G06N3/08 — Learning methods
                            • G06N3/084 — Backpropagation, e.g. using gradient descent
            • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T7/00 — Image analysis
                    • G06T7/20 — Analysis of motion
                        • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
                        • G06T7/254 — Analysis of motion involving subtraction of images
                        • G06T7/277 — Analysis of motion involving stochastic approaches, e.g. using Kalman filters
                • G06T2207/00 — Indexing scheme for image analysis or image enhancement
                    • G06T2207/10 — Image acquisition modality
                        • G06T2207/10032 — Satellite or aerial image; Remote sensing
                    • G06T2207/20 — Special algorithmic details
                        • G06T2207/20081 — Training; Learning
                        • G06T2207/20084 — Artificial neural networks [ANN]
                        • G06T2207/20212 — Image combination
                            • G06T2207/20224 — Image subtraction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

A remote sensing change detection method based on an improved spatial pooling pyramid comprises the following steps. Step 1: preprocess the two-period remote sensing images, with X1 and X2 denoting the first- and second-period images, and generate a difference image, denoted DI; then cascade X1, X2 and DI and input the result into a residual neural network with integrated hole (atrous) convolution. Step 2: down-sample the input Xin and extract features, obtaining the high-level feature XH and the low-level feature XL. Step 3: extract features from the high-level feature XH with the improved spatial pooling pyramid and cascade the extracted features into a feature pyramid, denoted FPYR. Step 4: upsample FPYR, cascade the low-level feature XL output in step 2 to the mirrored up-sampled feature layer through a short connection, and then obtain a change probability map through a SoftMax layer. Step 5: train the network parameters by back-propagation using an improved cross-entropy loss function based on the change probability map and the true change map, and perform change detection with the trained parameters. Remote sensing change detection is carried out through the above steps.

Description

Remote sensing change detection method based on improved spatial pooling pyramid
Technical Field
The invention belongs to the technical field of remote sensing, relates to a remote sensing change detection technology, and particularly relates to a remote sensing change detection technology based on an improved spatial pooling pyramid.
Background
With the continuous development of deep learning technology, the deep learning technology has been widely applied in the field of remote sensing. Compared with the traditional machine learning technology, the remote sensing change detection based on deep learning has higher precision, especially for high-resolution remote sensing images.
However, conventional deep-learning-based change detection techniques, for example "End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet++", take only the two-period images as input and cannot fully exploit the difference image space. In addition, change detection usually suffers from a serious class-imbalance problem: the changed area is generally much smaller than the unchanged area. To solve these problems and to effectively exploit the multi-scale characteristics of high-spatial-resolution remote sensing images, the invention provides a remote sensing change detection technique based on an improved spatial pooling pyramid.
Disclosure of Invention
The invention provides a high-resolution remote sensing image change detection method based on an improved spatial pooling pyramid, aiming at the problems that the existing deep learning change detection technology is easily affected by pseudo changes and the detection of the edge of a change area is incomplete.
A remote sensing change detection method based on an improved spatial pooling pyramid comprises the following steps:
step 1: preprocess the two-period remote sensing images, with X1 and X2 denoting the first- and second-period images, and generate the difference image of X1 and X2 by the difference method, denoted DI; then cascade X1, X2 and DI, denoted Xin:
Xin = X1 ⊕ X2 ⊕ DI
where ⊕ denotes the cascade operation; finally, input Xin into the residual neural network with integrated hole convolution;
step 2: use the residual neural network with integrated hole convolution to down-sample Xin and extract features, obtaining the high-level feature XH and the low-level feature XL;
And step 3: for high-level feature XHRespectively extracting features of different scales by using convolution, a position and channel attention mechanism, cavity convolution of three different cavity rates and global maximum pooling, cascading the extracted features to form a feature pyramid, and recording the feature pyramid as FPYR;
step 4: upsample FPYR, cascade the low-level feature XL output in step 2 to the mirrored up-sampled feature layer through a short connection, and then obtain the change probability map through a SoftMax layer;
step 5: train the network parameters by back-propagation using an improved cross-entropy loss function based on the change probability map and the true change map, and perform change detection with the trained network parameters;
and carrying out remote sensing change detection through the steps.
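Step 1's difference image and cascade admit a direct sketch. Assuming channel-first (C, H, W) arrays, a minimal NumPy illustration (not the patent's own code; the function name is illustrative) is:

```python
import numpy as np

def make_network_input(x1, x2):
    """Build DI = |X1 - X2| and the channel cascade X_in = X1 ⊕ X2 ⊕ DI.

    x1, x2: co-registered two-period images, shape (C, H, W).
    Returns the difference image and the cascaded network input."""
    di = np.abs(x1 - x2)
    # cascade (⊕) is concatenation along the channel axis
    x_in = np.concatenate([x1, x2, di], axis=0)
    return di, x_in
```

For two 3-channel images the cascaded input therefore has 9 channels, which is the tensor fed into the residual network.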
Step 2 specifically comprises the following steps:
step 2-1: feed the network input Xin into the convolution layer Conv to preliminarily extract features and increase the number of feature channels;
step 2-2: pass the output of Conv sequentially through a batch-normalization layer, an activation layer and a global max-pooling layer so that the network has nonlinear expression capability;
step 2-3: obtain the low-level feature XL and the high-level feature XH through the residual neural network with integrated hole convolution.
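The hole (atrous) convolution used throughout the network can be illustrated independently of any framework. The naive NumPy sketch below (illustrative only, not the patent's implementation) shows how the hole rate dilates the kernel's receptive field without adding parameters:

```python
import numpy as np

def hole_conv2d(x, kernel, rate=1):
    """Naive 2-D hole (atrous/dilated) convolution with 'valid' padding.

    x:      (H, W) input feature map
    kernel: (kh, kw) convolution kernel
    rate:   hole rate; rate=1 reduces to an ordinary convolution
    """
    kh, kw = kernel.shape
    # effective kernel extent once (rate - 1) zeros sit between taps
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1
    H, W = x.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input with stride `rate` under the kernel footprint
            patch = x[i:i + eh:rate, j:j + ew:rate]
            out[i, j] = np.sum(patch * kernel)
    return out
```

With a 3 × 3 kernel, rate 2 gives an effective 5 × 5 receptive field, which is why a stack of such layers can aggregate context without extra down-sampling.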
In step 3, multi-scale feature extraction is performed on the high-level feature XH, specifically:
features of different scales are extracted using convolution, position and channel attention mechanisms, hole convolutions with three different hole rates, and global max pooling;
the output of the position attention mechanism is denoted PAM(XH), the output of the channel attention mechanism CAM(XH), the output of the convolution (comprising a convolution layer, a batch-normalization layer and a ReLU activation layer) Conv1(XH), the outputs of the three hole convolutions (each comprising a convolution layer, a batch-normalization layer and a ReLU activation layer) AConv1(XH), AConv2(XH) and AConv3(XH), and the output of the global max pooling Pool(XH); these outputs are cascaded to obtain the cascaded feature pyramid FPYR, namely:
FPYR = Conv1(XH) ⊕ PAM(XH) ⊕ CAM(XH) ⊕ AConv1(XH) ⊕ AConv2(XH) ⊕ AConv3(XH) ⊕ Pool(XH)
PAM(XH) is obtained as follows: the high-level feature XH obtained in step 2 is first processed by the position attention mechanism, and the number of feature channels is then adjusted with the convolution layer ConvP;
CAM(XH) is obtained as follows: the high-level feature XH obtained in step 2 is first processed by the channel attention mechanism, and the number of feature channels is then adjusted with the convolution layer ConvC.
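At the shape level, building FPYR is plain channel concatenation of the seven branch outputs. The NumPy sketch below is a stand-in: the real branches are Conv1, PAM, CAM, AConv1–3 and global max pooling, replaced here by placeholder 1 × 1 projections purely to illustrate the bookkeeping; all names and sizes are assumptions:

```python
import numpy as np

def cascade(*feats):
    """Channel-wise cascade (⊕): concatenate (C, H, W) maps along axis 0."""
    return np.concatenate(feats, axis=0)

def branch(x, out_ch, seed):
    """Placeholder branch: a random 1x1 projection (C, H, W) -> (out_ch, H, W).
    Stands in for Conv1 / PAM / CAM / AConv1..3 / pooling in this sketch."""
    c = x.shape[0]
    w = np.random.default_rng(seed).standard_normal((out_ch, c))
    return np.einsum('oc,chw->ohw', w, x)

C, Cp, H, W = 8, 4, 16, 16
x_h = np.random.default_rng(0).standard_normal((C, H, W))   # high-level feature X_H
# seven branches, cascaded into the feature pyramid
fpyr = cascade(*[branch(x_h, Cp, s) for s in range(7)])
```

Because each branch keeps the spatial size and emits Cp channels, the cascaded pyramid has 7 · Cp channels at the same resolution as XH.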
The residual neural network with integrated hole convolution consists of several sequentially connected hole-convolution residual blocks. Denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block.
In step 5, the proposed network structure is trained using an improved cross-entropy loss function:
L = -(1/N) · Σi Σk∈{u,c} wk · yik · log(pik)
where N is the total number of pixels, u and c denote the unchanged and changed classes respectively, pik is the probability predicted by the model that pixel i belongs to class k, and yik is a Boolean variable equal to 0 or 1, computed from the class li to which pixel i belongs:
yik = 1 if li = k, otherwise yik = 0
wk is the weight of class k, k ∈ {u, c}, determined by a formula [shown only as an image in the original] involving au and ac, which denote the proportions of the unchanged class u and the changed class c respectively, and the balance coefficient β.
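The loss can be implemented directly from the definitions above. The sketch below is a NumPy version of the class-weighted cross-entropy; the `class_weights` rule is only one plausible instantiation of the wk formula (the patent gives wk only as an image), chosen so the rarer changed class receives the larger weight:

```python
import numpy as np

def weighted_ce(prob, labels, w):
    """Class-weighted cross-entropy.

    prob:   (N, 2) predicted probabilities, column 0 = unchanged (u), 1 = changed (c)
    labels: (N,) ground-truth class indices l_i in {0, 1}
    w:      (2,) class weights (w_u, w_c)
    """
    N = labels.shape[0]
    y = np.eye(2)[labels]                  # y_ik = 1 iff l_i = k, else 0
    return -np.sum(w * y * np.log(prob)) / N

def class_weights(a_u, a_c, beta=1.0):
    """ASSUMED weighting rule, not the patent's exact formula: weight each
    class by the *other* class's proportion raised to beta, then normalize,
    so the scarce changed class is up-weighted."""
    w = np.array([a_c ** beta, a_u ** beta])
    return w / w.sum()
```

With unit weights the expression reduces to ordinary cross-entropy, so the weighting is the only "improvement" this sketch models.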
The network structure based on the improved spatial pooling pyramid for remote sensing image processing comprises an improved spatial pooling pyramid module, which consists of a convolution unit, a position and channel attention mechanism unit, three hole-convolution units with different hole rates, and a global max-pooling unit. The input of the improved spatial pooling pyramid module is connected to the first output of the hole residual module; the first output of the hole residual module is the high-level feature XH and the second output is the low-level feature XL. The output of the improved spatial pooling pyramid module is connected to the input of the first up-sampling module; the low-level feature XL is cascaded with the features output by the first up-sampling module to obtain a new cascaded feature map.
The hole residual module consists of several sequentially connected hole-convolution residual blocks. Denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block.
The input of the hole residual module comprises the first-period remote sensing image X1, the second-period remote sensing image X2 and their difference image DI; the three are cascaded to form the network input Xin, Xin = X1 ⊕ X2 ⊕ DI.
The network input Xin passes sequentially through the convolution layer Conv, the batch-normalization layer, the activation layer and the global max-pooling layer; the output of the global max-pooling layer is connected to the input of the hole residual module.
The size of the feature map output by the first up-sampling module is the same as that of the low-level feature XL; the new cascaded feature map is input into a second up-sampling module after passing through at least one convolution layer, and the size of the feature map output by the second up-sampling module is the same as that of the originally input remote sensing image.
Compared with the prior art, the invention has the following technical effects:
the technical scheme provided by the invention simultaneously considers the original image space and the differential image space of the two-stage high-resolution remote sensing image; the multi-scale characteristic of the high-resolution remote sensing image is considered by improving the spatial pooling pyramid, so that the detection capability of a large change target is ensured; enhancing the detection capability of detail change information by using an attention mechanism in parallel; and a novel loss function is provided to adapt to the unbalance problem of the changed class and the unchanged class. Through the measures, the invention can obtain a better change detection result.
Drawings
The invention is further illustrated by the following examples in conjunction with the accompanying drawings:
FIG. 1: the main framework of the network proposed by the invention.
FIG. 2: schematic diagram of the residual block structure of the residual neural network with integrated hole convolution used in the invention.
FIG. 3: schematic diagram of the residual unit adopted by the invention.
FIG. 4: structure diagram of the improved spatial pooling pyramid proposed by the invention.
FIG. 5: schematic structural diagram of the position attention mechanism adopted by the invention.
FIG. 6: schematic structural diagram of the channel attention mechanism adopted by the invention.
FIG. 7: schematic diagram of the upsampling technique DUpsampling employed by the invention.
FIG. 8: example images of the experimental data used in embodiments of the invention, including the T1 and T2 period images and their change reference map.
FIG. 9: change detection results of six comparison techniques and the invention on the example images.
Detailed Description
The invention discloses a remote sensing change detection method based on an improved spatial pooling pyramid, which comprises the following steps:
step 1: preprocess the two-period remote sensing images, with X1 and X2 denoting the first- and second-period images, and generate the difference image of X1 and X2 by the difference method, denoted DI; then cascade X1, X2 and DI, denoted Xin:
Xin = X1 ⊕ X2 ⊕ DI
where ⊕ denotes the cascade operation; finally, input Xin into the residual neural network with integrated hole convolution;
step 2: use the residual neural network with integrated hole convolution to down-sample Xin and extract features, obtaining the high-level feature XH and the low-level feature XL;
step 3: for the high-level feature XH, extract features of different scales using convolution, position and channel attention mechanisms, hole convolutions with three different hole rates, and global max pooling, respectively; cascade the extracted features to form a feature pyramid, denoted FPYR;
step 4: upsample FPYR, cascade the low-level feature XL output in step 2 to the mirrored up-sampled feature layer through a short connection, and then obtain the change probability map through a SoftMax layer;
step 5: train the network parameters by back-propagation using an improved cross-entropy loss function based on the change probability map and the true change map, and perform change detection with the trained network parameters;
remote sensing change detection is carried out through the above steps.
Step 2 specifically comprises the following steps:
step 2-1: feed the network input Xin into the convolution layer Conv to preliminarily extract features and increase the number of feature channels;
step 2-2: pass the output of Conv sequentially through a batch-normalization layer, an activation layer and a global max-pooling layer so that the network has nonlinear expression capability;
step 2-3: obtain the low-level feature XL and the high-level feature XH through the residual neural network with integrated hole convolution.
In step 3, multi-scale feature extraction is performed on the high-level feature XH, specifically:
features of different scales are extracted using convolution, position and channel attention mechanisms, hole convolutions with three different hole rates, and global max pooling;
the output of the position attention mechanism is denoted PAM(XH), the output of the channel attention mechanism CAM(XH), the output of the convolution (comprising a convolution layer, a batch-normalization layer and a ReLU activation layer) Conv1(XH), the outputs of the three hole convolutions (each comprising a convolution layer, a batch-normalization layer and a ReLU activation layer) AConv1(XH), AConv2(XH) and AConv3(XH), and the output of the global max pooling Pool(XH); these outputs are cascaded to obtain the cascaded feature pyramid FPYR, namely:
FPYR = Conv1(XH) ⊕ PAM(XH) ⊕ CAM(XH) ⊕ AConv1(XH) ⊕ AConv2(XH) ⊕ AConv3(XH) ⊕ Pool(XH)
PAM(XH) is obtained as follows: the high-level feature XH obtained in step 2 is first processed by the position attention mechanism, and the number of feature channels is then adjusted with the convolution layer ConvP;
CAM(XH) is obtained as follows: the high-level feature XH obtained in step 2 is first processed by the channel attention mechanism, and the number of feature channels is then adjusted with the convolution layer ConvC.
The residual neural network with integrated hole convolution consists of several sequentially connected hole-convolution residual blocks. Denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block.
In step 5, the proposed network structure is trained using an improved cross-entropy loss function:
L = -(1/N) · Σi Σk∈{u,c} wk · yik · log(pik)
where N is the total number of pixels, u and c denote the unchanged and changed classes respectively, pik is the probability predicted by the model that pixel i belongs to class k, and yik is a Boolean variable equal to 0 or 1, computed from the class li to which pixel i belongs:
yik = 1 if li = k, otherwise yik = 0
wk is the weight of class k, k ∈ {u, c}, determined by a formula [shown only as an image in the original] involving au and ac, which denote the proportions of the unchanged class u and the changed class c respectively, and the balance coefficient β.
The invention also comprises a network structure based on the improved spatial pooling pyramid for remote sensing image processing, comprising an improved spatial pooling pyramid module, which consists of a convolution unit, a position and channel attention mechanism unit, three hole-convolution units with different hole rates, and a global max-pooling unit. The input of the improved spatial pooling pyramid module is connected to the first output of the hole residual module; the first output of the hole residual module is the high-level feature XH and the second output is the low-level feature XL. The output of the improved spatial pooling pyramid module is connected to the input of the first up-sampling module; the low-level feature XL is cascaded with the features output by the first up-sampling module to obtain a new cascaded feature map.
The hole residual module consists of several sequentially connected hole-convolution residual blocks. Denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block.
The input of the hole residual module comprises the first-period remote sensing image X1, the second-period remote sensing image X2 and their difference image DI; the three are cascaded to form the network input Xin, Xin = X1 ⊕ X2 ⊕ DI.
The network input Xin passes sequentially through the convolution layer Conv, the batch-normalization layer, the activation layer and the global max-pooling layer; the output of the global max-pooling layer is connected to the input of the hole residual module.
The size of the feature map output by the first up-sampling module is the same as that of the low-level feature XL; the new cascaded feature map is input into a second up-sampling module after passing through at least one convolution layer, and the size of the feature map output by the second up-sampling module is the same as that of the originally input remote sensing image.
To facilitate a further understanding of the present invention by those of ordinary skill in the art, the following is further illustrated:
In an embodiment, experiments are performed on the change detection dataset published in S. Ji, S. Wei, and M. Lu, "Fully Convolutional Networks for Multisource Building Extraction From an Open Aerial and Satellite Imagery Data Set," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, pp. 574-586, 2019. The dataset comprises two groups of high-resolution remote sensing images; each group contains two periods of remote sensing images and their true change map. The first group is 21243 × 15354 pixels and the second group is 11265 × 15354 pixels. To facilitate network training, the two large images are divided into groups of 256 × 256-pixel patches, and groups that are completely unchanged or completely changed are removed, leaving 1863 groups of 256 × 256 patches, of which 1250 groups form the training set and 613 groups the test set;
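The patch-extraction rule just described (cut into fixed-size patches, discard all-unchanged and all-changed groups) can be sketched as follows; function and parameter names are illustrative, not from the patent:

```python
import numpy as np

def tile_pairs(img1, img2, ref, size=256):
    """Cut co-registered two-period images and the change reference map into
    size x size patches, discarding patches whose reference is completely
    unchanged (all 0) or completely changed (all 1)."""
    H, W = ref.shape[:2]
    patches = []
    for r in range(0, H - size + 1, size):
        for c in range(0, W - size + 1, size):
            ref_p = ref[r:r + size, c:c + size]
            if ref_p.min() == ref_p.max():        # all-0 or all-1 patch: skip
                continue
            patches.append((img1[r:r + size, c:c + size],
                            img2[r:r + size, c:c + size],
                            ref_p))
    return patches
```

Dropping the degenerate patches keeps the training set focused on tiles that actually contain a change boundary.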
Fig. 1 shows the main framework of the network according to the invention. The remote sensing change detection method based on an improved spatial pooling pyramid disclosed by the invention comprises the following steps:
step 1: preprocess the two-period high-resolution remote sensing images, including registration, relative radiometric correction and the like; denote the first- and second-period images by X1 and X2 respectively, and generate the difference image of X1 and X2 by the difference method, denoted DI and computed as:
DI = |X1 - X2|
then cascade X1, X2 and DI, denoted Xin:
Xin = X1 ⊕ X2 ⊕ DI
where ⊕ denotes the cascade operation; finally, Xin is fed into the proposed network.
Step 2: use the residual neural network with integrated hole convolution, AtrousResNet50, to down-sample the network input Xin and extract features, obtaining the high-level feature XH and the low-level feature XL.
First, the network input Xin is fed into the convolution layer Conv to preliminarily extract features and increase the number of feature channels; it then passes through a batch-normalization (BN) layer, an activation layer (using the ReLU function) and a global max-pooling layer, which give the network nonlinear expression capability and help avoid the degradation caused by continuous convolution; finally, the low-level feature XL and the high-level feature XH are obtained through the residual neural network with integrated hole convolution.
The residual block structure of the residual neural network with integrated hole convolution adopted in this embodiment is shown in Fig. 2. It consists of four hole-convolution residual blocks, each composed of several residual units; specific parameters such as the number of residual units and the residual unit structure can be adjusted for specific applications. In this embodiment, except for the middle layer, each residual unit consists of ordinary convolutions with a stride of 1 and a hole rate of 0; Fig. 3 shows an example of the residual unit structure.
Let the output of the n-th residual block Block be Bn(X), then the low-level feature XLFor the output of the 1 st residual block, i.e. XL=B1(Xin) High level feature XHFor the output of the 4 th residual block, i.e. XH=B4(B3(B2(B1(Xin)))). Respectively obtaining low-level features X through a down-sampling stepLAnd high level feature XHDifferent scale information can be included. In this step, the number of residual blocks, the residual unit structure, the void ratio, and the size and number of convolution kernels can be adjusted according to specific applications. The detailed parameters of the convolutional layer and residual unit and the number of residual blocks used in this embodiment are shown in table 1:
TABLE 1 convolution layer and residual Unit detailed parameters
Figure BDA0003540982740000063
And step 3: and (3) extracting features of different scales from the high-level features by respectively using 1 × 1 convolution, a space and channel attention mechanism, cavity convolution of three different cavity rates and global maximum pooling, and cascading the features to form a feature pyramid. Fig. 4 shows a specific structure of step 3 in this embodiment.
The obtained cascade pyramid characteristics can fully consider the multi-scale characteristics and the spatial context information of the high-resolution remote sensing image. Conv of 1 x 1 convolution layer1The output of (D) is recorded as Conv1(XH) The output of the spatial attention mechanism is denoted PAM (X)H) The output of the channel attention mechanism is denoted as CAM (X)H) Three void convolutional layers AConv1,AConv2,AConv3The outputs are respectively recorded as AConv1(XH),AConv2(XH),AConv3(XH) The output of the global max pooling layer is denoted Pool (X)H) Wherein X isHIs the high-level feature obtained in step 2. Cascading the outputs to obtain a cascaded characteristic pyramid FPYR, i.e.
Figure BDA0003540982740000071
The present invention enhances the ability to characterize detailed objects and object boundaries by cascading spatial attention (PAM) and channel attention features (CAM), a positional attention mechanism as shown in fig. 5 and a channel attention mechanism as shown in fig. 6. PAM (X)H) The specific acquisition method comprises the following steps: firstly, the high-level feature X obtained in the step 2HObtained by the spatial attention mechanism process shown in fig. 5 and then adjusting the number of characteristic channels by using the convolution layer ConvP. CAM (X)H) The specific acquisition method comprises the following steps: firstly, the high-level feature X obtained in the step 2HObtained by performing the channel attention mechanism process shown in fig. 6 and then adjusting the number of characteristic channels by using the convolutional layer ConvC.
Table 2 shows the use of the convolution layer Conv in step 3 of this example1、AConv1、AConv2、AConv3ConvP and ConvC convolution kernel size, number, void rate and step size. The convolutional layers comprise a BN layer and a Relu function activation layer. It should be noted that the size, number, void rate, step size, maximum pooling mode, and number of void convolution layers used in step 3 may all be adjusted according to specific applications.
TABLE 2 Specific parameters of convolution layers Conv1, AConv1, AConv2, AConv3, ConvP and ConvC

[Table image not reproduced in this text.]
Step 4: upsample the cascaded feature pyramid of step 3 using an improved upsampling operation, cascade the low-level feature XL output in step 2 to the upsampled mirror feature layer through a short connection, and finally obtain the change probability map through a SoftMax layer.
First, the cascaded feature pyramid FPYR of step 3 is fed into the convolution layer Conv2 to reduce the number of feature channels, passed through a BN layer and an activation layer (ReLU) to increase nonlinear expression capacity, and then upsampled four times using the improved upsampling technique DUpsampling; the output feature map has the same size as the low-level feature XL, and the two are cascaded as a new feature map.
The cascaded feature map is passed through two convolution layers Conv3 and Conv4 (each including a BN layer and a ReLU activation layer) to adjust the number of channels; the number of convolution kernels may be adjusted according to the specific application. Four-times upsampling is then performed with DUpsampling so that the output feature map has the same size as the input image, and finally the change probability map is obtained from the output feature map through a SoftMax layer.
The improved upsampling technique DUpsampling is an improvement on conventional linear interpolation; its structure is shown in fig. 7. In this embodiment DUpsampling replaces linear upsampling: DUpsampling generates, through convolution, the filled-in values needed to restore the feature resolution, which helps recover the detail information of the feature map, particularly for high-resolution remote sensing images with abundant detail. The detailed parameters of the convolution layers used for channel adjustment in this step are shown in table 3:
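The channel-to-space recovery behind DUpsampling can be sketched as a learned linear projection that expands each pixel's C channels into C_out·r·r values, followed by a sub-pixel rearrangement into an r-times larger map. In the sketch below the projection weight w is random and training is omitted; it illustrates the data-dependent upsampling idea under those assumptions, not the embodiment's trained operator.

```python
import numpy as np

def dupsample(x, w, r):
    """DUpsampling sketch.
    x: feature map of shape (C, H, W)
    w: learned projection of shape (C_out * r * r, C) (random here)
    r: upsampling factor.
    Each pixel's channel vector is projected, then the r*r sub-pixel grid is
    interleaved into the spatial dimensions (a pixel-shuffle rearrangement)."""
    c, h, wd = x.shape
    c_out = w.shape[0] // (r * r)
    flat = x.reshape(c, h * wd)              # (C, H*W)
    proj = w @ flat                          # (C_out*r*r, H*W)
    proj = proj.reshape(c_out, r, r, h, wd)  # split out the sub-pixel grid
    # reorder to (C_out, H, r, W, r) and merge -> (C_out, H*r, W*r)
    out = proj.transpose(0, 3, 1, 4, 2).reshape(c_out, h * r, wd * r)
    return out

x = np.random.rand(16, 4, 4)
w = np.random.rand(8 * 2 * 2, 16)   # C_out = 8, r = 2
y = dupsample(x, w, r=2)
print(y.shape)  # (8, 8, 8)
```

Unlike bilinear interpolation, the projection w is learned jointly with the network, which is what allows the recovered details to be data-dependent.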
TABLE 3 Specific parameters of convolution layers Conv2, Conv3 and Conv4

[Table image not reproduced in this text.]
Step 5: using an improved cross entropy loss function, calculate the loss between the change probability map obtained in step 4 and the true change map, iteratively train the parameters of the network through back propagation until an iteration stop condition is met, and save the parameters at stopping for producing the change detection map.
In practical applications the changed area is often much smaller than the unchanged area, so the remote sensing change detection problem is generally one in which the proportions of the changed and unchanged classes are severely imbalanced. To address this, the invention proposes a new adaptive-weight cross entropy loss function, computes the loss of the change probability map output by the network with this function, and optimizes the network parameters through back propagation.
A weight-adaptive cross entropy loss function suitable for the class imbalance problem is designed on the basis of the cross entropy loss function; the adaptive weights are computed from the proportions of the changed and unchanged classes. The proposed weight-adaptive cross entropy function is:
L = -(1/N) Σ_{i=1}^{N} Σ_{k∈{u,c}} w_k · y_ik · log(p_ik)
where N represents the total number of pixels, u and c represent the unchanged and changed classes respectively, p_ik is the probability predicted by the model that pixel i belongs to class k, and y_ik is a Boolean variable equal to 0 or 1, computed from the class l_i to which pixel i belongs:
y_ik = 1 if l_i = k, and y_ik = 0 otherwise.
w_k is the weight of class k, k ∈ {u, c}, given by the formula
w_k = a_k' + β, where k' denotes the class other than k (so w_u = a_c + β and w_c = a_u + β)
determined accordingly. Here a_u and a_c respectively represent the proportions of the unchanged class u and the changed class c; generally a_u is far greater than 1/2 and a_c is far less than 1/2. β is a balance coefficient that ensures the two class weights are not excessively unbalanced. In this embodiment, β = 1/10.
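Since the weight formula appears in the source only as an equation image, the NumPy sketch below assumes one common reading of the description: each class is weighted by the opposite class's proportion plus the balance coefficient β, so the rare changed class receives the larger weight. It illustrates the adaptive-weight idea under that assumption, not the patent's exact formula.

```python
import numpy as np

def adaptive_weighted_ce(p_change, labels, beta=0.1):
    """Adaptive-weight cross entropy sketch for 2-class change detection.
    p_change: predicted probability of the changed class per pixel, shape (N,)
    labels:   0 = unchanged (u), 1 = changed (c)
    Assumed weight rule (for illustration): w_u = a_c + beta, w_c = a_u + beta,
    where a_u and a_c are the class proportions in the batch."""
    labels = np.asarray(labels, dtype=float)
    a_c = labels.mean()      # proportion of changed pixels
    a_u = 1.0 - a_c          # proportion of unchanged pixels
    w_u = a_c + beta         # rare class proportion boosts the common class little
    w_c = a_u + beta         # common class proportion boosts the rare class a lot
    eps = 1e-12              # avoid log(0)
    loss = -(w_c * labels * np.log(p_change + eps)
             + w_u * (1.0 - labels) * np.log(1.0 - p_change + eps))
    return loss.mean()

# 90% unchanged, 10% changed: the changed class gets the larger weight.
labels = np.array([0] * 9 + [1])
p = np.where(labels == 1, 0.8, 0.1)
print(round(adaptive_weighted_ce(p, labels), 4))  # → 0.0413
```

With β = 1/10 as in the embodiment, neither weight can collapse to zero even when one class is nearly absent from a batch.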
To verify the change detection effect of the invention, it is compared with 6 state-of-the-art deep learning change detection techniques: the fully convolutional Siamese network (FC-Siam-Conv), DeepLab v3+, the dual attention network (DANet), the improved U-shaped network (UNet++), the multi-output-fusion improved U-shaped network (UNet++_MSOF), and the difference-map-based improved U-shaped network (DifUNet++). Four widely used quantitative indicators are used to evaluate the performance of the different change detection techniques: accuracy, precision, recall and the F1 value.
FIG. 9 shows the change detection result maps of the invention and the comparison techniques: respectively FC-Siam-Conv, DeepLab v3+, DANet, UNet++, UNet++_MSOF, DifUNet++, and the invention. Table 4 gives the four quantitative indices of the different change detection results.
TABLE 4 Quantitative indices of the change detection results

[Table image not reproduced in this text.]
Comparing the change detection result maps and the quantitative statistics, the change detection effect of the invention is clearly superior to that of the other state-of-the-art deep learning change detection techniques. In this embodiment the invention achieves better results both in the completeness of the detected changes and in the refinement of their edges. As can be seen from Table 4, the change detection result of the invention surpasses the other techniques in the three indices of accuracy, recall and F1 value. For example, the F1 value of the invention is 0.9046, higher than the other methods by 0.0521, 0.0329, 0.0433, 0.0407, 0.0605 and 0.0284 respectively.
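The four indicators above are standard confusion-matrix quantities for a binary change map; a minimal NumPy sketch of their computation (with illustrative toy maps, not the experiment's data) is:

```python
import numpy as np

def change_detection_metrics(pred, truth):
    """Accuracy, precision, recall and F1 for binary change maps
    (1 = changed, 0 = unchanged), computed from the confusion matrix."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    tp = np.sum((pred == 1) & (truth == 1))   # changed, detected
    tn = np.sum((pred == 0) & (truth == 0))   # unchanged, kept
    fp = np.sum((pred == 1) & (truth == 0))   # false alarm
    fn = np.sum((pred == 0) & (truth == 1))   # missed change
    acc = (tp + tn) / pred.size
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

truth = np.array([1, 1, 0, 0, 0, 1])
pred  = np.array([1, 0, 0, 0, 1, 1])
acc, prec, rec, f1 = change_detection_metrics(pred, truth)
print(acc, prec, rec, f1)
```

F1 balances precision and recall, which is why it is the headline index for the strongly imbalanced change detection problem.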

Claims (10)

1. A remote sensing change detection method based on an improved spatial pooling pyramid is characterized by comprising the following steps:
step 1: preprocessing the two-period remote sensing images, denoting the first-period and second-period images as X1 and X2 respectively, and generating a difference image of X1 and X2 by a difference method, recorded as DI; then cascading X1, X2 and DI, denoted Xin:

Xin = Cat(X1, X2, DI)

wherein Cat(·) indicates the cascade operation; finally inputting Xin into the residual neural network integrating void convolution;
step 2: performing down-sampling and feature extraction on Xin using the residual neural network integrating void convolution to obtain the high-level feature XH and the low-level feature XL;
step 3: extracting features of different scales from the high-level feature XH using a convolution branch, position and channel attention mechanisms, three void convolutions of different void rates, and global max pooling respectively, and cascading the extracted features into a feature pyramid, recorded as FPYR;
step 4: upsampling FPYR, cascading the low-level feature XL output in step 2 to the upsampled mirror feature layer through a short connection, and then obtaining the change probability map through a SoftMax layer;
step 5: training the network parameters through back propagation using an improved cross entropy loss function based on the change probability map and the true change map, and performing change detection with the trained network parameters;
and carrying out remote sensing change detection through the steps.
2. The method according to claim 1, characterized in that step 2 specifically comprises the following steps:
step 2-1: feeding the network input Xin into the convolution layer Conv to preliminarily extract features and increase the number of feature channels;
step 2-2: sequentially passing the output of the convolution layer Conv through a batch normalization layer, an activation layer and a global max pooling layer, giving the network nonlinear expression capability;
step 2-3: obtaining the low-level feature XL and the high-level feature XH through the residual neural network integrating void convolution.
3. The method according to claim 1, characterized in that in step 3, multi-scale feature extraction is performed on the high-level feature XH, specifically: features of different scales are extracted using a convolution branch, position and channel attention mechanisms, void convolutions of three different void rates, and global max pooling;
the position attention mechanism output is denoted as PAM (X)H) The channel attention mechanism output is recorded as CAM (X)H) The output of the convolution (including the convolution layer, batch normalization layer and Relu function activation layer) is denoted as Conv1(XH) The outputs of the three hole convolutions (each convolution comprising a convolution layer, a batch normalization layer and a Relu function activation layer) are respectively denoted as AConv1(XH),AConv2(XH),AConv3(XH) The output of the global max pooling is denoted Pool (X)H) And cascading the outputs to obtain a cascaded characteristic pyramid FPYR, namely:
Figure FDA0003540982730000014
4. method according to claim 3, characterized in that the PAM (X)H) The specific acquisition method comprises the following steps: firstly, the high-level feature X obtained in step 2HProcessed by a position attention mechanism, thenAdjusting the number of the characteristic channels by using the convolutional layer ConvP;
the CAM (X)H) The specific acquisition method comprises the following steps: firstly, the high-level feature X obtained in the step 2HThe number of characteristic channels is adjusted by the channel attention mechanism processing and then by the convolutional layer ConvC.
5. The method according to claim 2, characterized in that the residual neural network integrating void convolution is composed of a plurality of sequentially connected void convolution residual blocks; denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block.
6. The method according to claim 1, characterized in that in step 5, the loss is calculated based on the variation probability map and the true variation map using an improved cross-entropy loss function, which is:
L = -(1/N) Σ_{i=1}^{N} Σ_{k∈{u,c}} w_k · y_ik · log(p_ik)
where N represents the total number of pixels, u and c represent the unchanged and changed classes respectively, p_ik is the probability predicted by the model that pixel i belongs to class k, and y_ik is a Boolean variable equal to 0 or 1, computed from the class l_i to which pixel i belongs:
y_ik = 1 if l_i = k, and y_ik = 0 otherwise.
w_k is the weight of class k, k ∈ {u, c}, given by the formula
w_k = a_k' + β, where k' denotes the class other than k (so w_u = a_c + β and w_c = a_u + β)
determined accordingly, where a_u and a_c respectively represent the proportions of the unchanged class u and the changed class c, and β is a balance coefficient.
7. A network structure for remote sensing image processing based on an improved spatial pooling pyramid, characterized in that it comprises an improved spatial pooling pyramid module composed of a convolution unit, position and channel attention mechanism units, three void convolution units with different void rates, and a global max pooling unit; the input of the improved spatial pooling pyramid module is connected to the first output of the void residual module, the first output of the void residual module being the high-level feature XH and the second output being the low-level feature XL; the output of the improved spatial pooling pyramid module is connected to the input of the first upsampling module; the low-level feature XL is cascaded with the features output by the first upsampling module to obtain a new cascaded feature map.
8. The network structure according to claim 7, characterized in that the void residual module is composed of a plurality of sequentially connected void convolution residual blocks; denoting the output of the n-th residual block as Bn(X), the low-level feature XL is the output of the 1st residual block and the high-level feature XH is the output of the last residual block;
the input of the void residual module comprises a first-period remote sensing image X1, a second-period remote sensing image X2 and their difference image DI; the three are cascaded to form the network input Xin:

Xin = Cat(X1, X2, DI).
9. The network structure according to claim 8, characterized in that the network input Xin passes sequentially through the convolution layer Conv, a batch normalization layer, an activation layer and a global max pooling layer, the output of the global max pooling layer being connected to the input of the void residual module.
10. The network structure according to claim 7, characterized in that the feature map output by the first upsampling module has the same size as the low-level feature XL; the new cascaded feature map passes through at least one convolution layer and is then input into a second upsampling module, and the feature map output by the second upsampling module has the same size as the originally input remote sensing image.
CN202210240578.9A 2022-03-10 2022-03-10 Remote sensing change detection method based on improved spatial pooling pyramid Pending CN114708496A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210240578.9A CN114708496A (en) 2022-03-10 2022-03-10 Remote sensing change detection method based on improved spatial pooling pyramid

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210240578.9A CN114708496A (en) 2022-03-10 2022-03-10 Remote sensing change detection method based on improved spatial pooling pyramid

Publications (1)

Publication Number Publication Date
CN114708496A true CN114708496A (en) 2022-07-05

Family

ID=82169726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210240578.9A Pending CN114708496A (en) 2022-03-10 2022-03-10 Remote sensing change detection method based on improved spatial pooling pyramid

Country Status (1)

Country Link
CN (1) CN114708496A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115063396A (en) * 2022-07-11 2022-09-16 浙江金汇华特种耐火材料有限公司 Preparation system and preparation method of long-life refractory brick
CN116612076A (en) * 2023-04-28 2023-08-18 成都瑞贝英特信息技术有限公司 Cabin micro scratch detection method based on combined twin neural network
CN116612076B (en) * 2023-04-28 2024-01-30 成都瑞贝英特信息技术有限公司 Cabin micro scratch detection method based on combined twin neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination