CN115018773A - SAR image change detection method based on global dynamic convolution neural network - Google Patents

SAR image change detection method based on global dynamic convolution neural network Download PDF

Info

Publication number
CN115018773A
Authority
CN
China
Prior art keywords
layer
global dynamic
global
convolution
neural network
Prior art date
Legal status
Pending
Application number
CN202210564263.XA
Other languages
Chinese (zh)
Inventor
高峰
徐云哲
董军宇
Current Assignee
Ocean University of China
Original Assignee
Ocean University of China
Priority date
Filing date
Publication date
Application filed by Ocean University of China filed Critical Ocean University of China
Priority to CN202210564263.XA priority Critical patent/CN115018773A/en
Publication of CN115018773A publication Critical patent/CN115018773A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • G06T2207/10044Radar image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an SAR image change detection method based on a global dynamic convolutional neural network, which comprises: performing difference analysis on two multi-temporal SAR images captured over the same geographic area to obtain a difference map; pre-classifying the difference map and constructing a training data set and a test data set; and training a global dynamic convolutional neural network model with the training data set and testing with the test data set, thereby obtaining the multi-temporal SAR image change detection result for the whole geographic area. In this scheme, a two-stage mixed-sample data augmentation method is adopted when constructing the training data set, increasing the number of samples while ensuring their diversity and reliability; meanwhile, global dynamic convolution improves the performance of the convolutional neural network and the robustness and generalization ability of SAR image change detection against noise. By combining digital image processing with deep learning, the method is of practical significance in fields such as national defense and military reconnaissance, natural environment monitoring, natural disaster monitoring, and urban land planning.

Description

SAR image change detection method based on global dynamic convolutional neural network
Technical Field
The invention belongs to the technical field of remote sensing image processing, and particularly relates to a synthetic aperture radar (SAR) image change detection method based on a global dynamic convolutional neural network, which detects ground-object changes in the same geographic area using SAR images.
Background
The change detection technology aims to detect changes occurring in the same geographic area at different times. Image change detection mainly detects ground-object changes through texture changes in the images; such texture changes are caused either by actual changes of the ground objects in the geographic area or by differences in external environmental conditions and imaging hardware at acquisition time, such as viewing angle, atmospheric conditions and sensor precision.
The basic premise of image change detection is therefore that texture changes caused by these external conditions and hardware differences can be distinguished from those caused by actual ground-object changes. Owing to the rapid development of remote sensing technology, ground-object change detection based on remote sensing images has become a key technology in many earth observation applications and is widely used in practical scenarios such as national defense and military reconnaissance, natural environment monitoring, natural disaster monitoring, and urban land planning. In particular, synthetic aperture radar (SAR) can acquire large-area, high-resolution remote sensing images using pulse compression and can accurately capture ground target information. More importantly, images acquired by SAR are not affected by weather conditions: the sensor can image at night and observe through cloud and smoke, making the technology all-weather. SAR images are therefore widely regarded as an ideal data source for remote sensing change detection.
Depending on whether manually annotated ground-truth labels are available as a prior, SAR image change detection methods are mainly divided into supervised and unsupervised methods. Most existing work focuses on unsupervised change detection, because ground-truth labels are difficult to obtain in many practical applications. The currently mainstream deep-learning-based unsupervised SAR image change detection methods generally comprise three steps: (1) difference map generation, (2) unsupervised pre-classification, and (3) neural network model training and classification. In the difference map generation step, ratio-based methods are widely used: a ratio operator performs difference analysis on two registered and radiometrically corrected SAR images to obtain the difference map. In the unsupervised pre-classification step, unsupervised clustering is widely used: each pixel of the difference map obtained in step (1) is assigned to the unchanged class, the changed class or the uncertain class, yielding a pseudo label for each pixel. In the training and classification step, the pixels determined in step (2) to be unchanged or changed are used as the training data set for the neural network; the trained model then re-classifies every pixel of the difference map, mainly assigning the uncertain pixels from step (2) to the unchanged or changed class, so that the classes of all pixels, and hence the change result map of the whole SAR image, are obtained.
Although most current deep-learning-based unsupervised SAR image change detection methods work well, some problems remain unsolved, summarized as follows: (1) Interaction between global context and local features. Conventional convolutional layers mainly extract local features, and context information is difficult to exploit within the neural network. Exploring the interaction between global context and local features within a change detection framework is therefore crucial. (2) Instability of data augmentation. To collect training samples of sufficiently high quality for SAR image change detection, mixed-sample data augmentation techniques are usually considered; these methods effectively enrich the representation space but also introduce instability into the training process. How to effectively eliminate the side effects of data augmentation is therefore a non-negligible task.
Disclosure of Invention
Aiming at the problems of global-context/local-feature interaction and unstable data augmentation in existing SAR image change detection methods, the invention provides an SAR image change detection method based on a global dynamic convolutional neural network, so as to improve the accuracy and performance of SAR image change detection.
The invention is realized by adopting the following scheme: a SAR image change detection method based on a global dynamic convolution neural network comprises the following steps:
step 1, carrying out difference analysis on two multi-temporal SAR images captured in the same geographic area to obtain a difference map;
step 2, pre-classifying the difference map, and constructing a training data set and a test data set;
step 3, constructing a global dynamic convolution neural network model;
step 4, training a global dynamic convolution neural network model by using a training set enhanced by a two-stage mixed sample data enhancement method;
and 5, testing the test data set by using the trained network model so as to obtain a change detection result of the whole image.
The method comprises the following specific steps:
1. performing difference analysis on two multi-temporal SAR images captured in the same geographic area to obtain a difference map:
performing difference analysis on the two multi-temporal SAR images by adopting a logarithm ratio operator to obtain the difference map of the two multi-temporal SAR images;
the calculation formula of the difference map is as follows:
I_DI = |log I_1 − log I_2|
where I_1 and I_2 denote the two multi-temporal SAR images of the same geographic area, I_DI denotes the difference map of I_1 and I_2, |·| denotes the absolute value operation, and log denotes the base-10 logarithm;
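A minimal sketch of this log-ratio difference analysis, assuming intensity images stored as NumPy arrays (the function name and the small eps constant guarding log(0) are illustrative additions, not part of the patent):

```python
import numpy as np

def log_ratio_difference(img1: np.ndarray, img2: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Compute the difference map I_DI = |log10(I1) - log10(I2)| of two
    co-registered multi-temporal SAR images of the same geographic area."""
    i1 = img1.astype(np.float64) + eps
    i2 = img2.astype(np.float64) + eps
    return np.abs(np.log10(i1) - np.log10(i2))
```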
2. Pre-classify the difference map I_DI and construct a training data set and a test data set:
(2.1) pre-classify the difference map I_DI using a hierarchical FCM (fuzzy C-means) clustering algorithm to obtain a pseudo-label matrix, with pseudo-label values of 0, 1 and 0.5; that is, pre-classification yields high-confidence samples that can be judged unchanged or changed, together with uncertain samples;
(2.2) randomly select p% of the pixels whose pseudo-label value is 0 or 1, where p is an integer not greater than 10; record their spatial positions, and at the corresponding positions of the two original multi-temporal SAR images take the r × r neighborhood pixel block around each pixel as a training sample; for edge pixels the neighborhood block is extracted by zero-padding the missing neighbors; r is an odd number not less than 3;
(2.3) extract the r × r neighborhood pixel blocks around all pixels of the two original multi-temporal SAR images as the test set; for edge pixels the neighborhood block is extracted by zero-padding the missing neighbors; r is an odd number not less than 3;
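A sketch of the patch extraction and training-set construction of steps 2.2-2.3, assuming the pseudo-label matrix from step 2.1 and stacking the two temporal patches into a 2-channel sample; the sampling ratio p = 6% and patch size r = 9 are the example values given later in the detailed description, and all names are illustrative:

```python
import numpy as np

def extract_patch(image: np.ndarray, row: int, col: int, r: int = 9) -> np.ndarray:
    """r x r neighborhood block around (row, col); border pixels are zero-padded."""
    pad = r // 2
    padded = np.pad(image, pad, mode="constant", constant_values=0)
    return padded[row:row + r, col:col + r]

def build_training_set(img1, img2, pseudo_labels, p=0.06, r=9, seed=0):
    """Randomly keep a fraction p of the confidently pre-classified pixels (pseudo-label 0 or 1)."""
    rng = np.random.default_rng(seed)
    rows, cols = np.where((pseudo_labels == 0) | (pseudo_labels == 1))
    keep = rng.random(rows.size) < p
    samples, labels = [], []
    for y, x in zip(rows[keep], cols[keep]):
        patch = np.stack([extract_patch(img1, y, x, r), extract_patch(img2, y, x, r)])
        samples.append(patch)
        labels.append(int(pseudo_labels[y, x]))
    return np.asarray(samples, dtype=np.float32), np.asarray(labels, dtype=np.int64)
```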
3. constructing a global dynamic convolutional neural network:
The core of the constructed network consists of three global dynamic convolution layers, with the following structure: input layer → low-level global dynamic convolution layer → middle-level global dynamic convolution layer → high-level global dynamic convolution layer → fully connected layer. The sample size fed to the input layer is c × r × r, where c denotes the number of input data channels;
(3.1) constructing a global dynamic convolution layer:
the structures of the three global dynamic convolution layers are consistent, and each global dynamic convolution layer is composed of a global feature coding module, a context feature projection module and a convolution kernel weight generation module. Firstly, the global feature coding module codes global information through an average pooling layer and a linear layer, and then the context feature projection module projects the coded features to an output dimension space through context feature projection. Finally, the convolution kernel weight generation module generates a convolution kernel containing global context information, and the generated new convolution kernel is used for executing conventional convolution;
(3.1.1) constructing a global feature coding module:
The global feature coding module consists of an average pooling layer, a linear layer, a normalization layer and an activation layer. The average pooling layer extracts global information from the input data and reduces its spatial size to k × k, where k is the convolution kernel size; a linear layer then projects the features of all c channels onto a vector of size m; after normalization and a ReLU activation function, the output feature of the global feature coding module is
G = σ(Norm(Linear(AvgPool_k(X)))), G ∈ R^(k×k×m);
(3.1.2) constructing a context feature projection module:
The context feature projection module consists of a linear layer, a normalization layer and an activation layer. The linear layer projects the feature G obtained in step 3.1.1 into a space of output dimension n; after normalization and a ReLU activation function, the output feature of the context feature projection module is
C = σ(Norm(Linear(G))), C ∈ R^(k×k×n);
(3.1.3) constructing a convolution kernel weight generation module:
The weight generation module first uses two linear layers to map the feature G obtained in step 3.1.1 and the feature C obtained in step 3.1.2 to the spatial size of the convolution kernel: the first linear layer converts G into M_G and the second linear layer converts C into M_C. M_G is then dimension-expanded to M′_G and M_C is dimension-expanded to M′_C so that both match the size of the convolution kernel, and the expanded features are added to generate the convolution kernel weight M:
M = δ(M′_G + M′_C)
where M has the same size as the original convolution kernel W and δ denotes the Sigmoid activation function. Finally, the generated convolution kernel weight M is multiplied element by element with the current convolution kernel weight W, combining global and local information to obtain the input-adaptive new convolution kernel weight W′:
W′ = M ⊙ W
where ⊙ denotes element-by-element multiplication;
(3.1.4) performing a conventional convolution on the current input data using the new convolution kernel generated in step 3.1.3;
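A sketch of one plausible PyTorch realization of the global dynamic convolution layer of steps 3.1.1-3.1.4. The exact intermediate tensor shapes are not recoverable from the text, so the broadcasting scheme (M_G modulating input channels, M_C modulating output channels), the LayerNorm choice and the hidden width m are assumptions; only the overall pipeline — global feature coding, context feature projection, kernel weight generation, convolution with W′ = M ⊙ W — follows the description:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalDynamicConv(nn.Module):
    """Illustrative global dynamic convolution layer (steps 3.1.1-3.1.4)."""

    def __init__(self, c_in, c_out, k=3, m=16):
        super().__init__()
        self.k, self.c_in, self.c_out = k, c_in, c_out
        self.weight = nn.Parameter(torch.randn(c_out, c_in, k, k) * 0.02)  # static kernel W
        self.bias = nn.Parameter(torch.zeros(c_out))
        # (3.1.1) global feature coding: avg-pool to k x k, linear c_in -> m, norm, ReLU
        self.encode = nn.Linear(c_in, m)
        self.encode_norm = nn.LayerNorm(m)
        # (3.1.2) context feature projection: linear m -> output dimension, norm, ReLU
        self.project = nn.Linear(m, c_out)
        self.project_norm = nn.LayerNorm(c_out)
        # (3.1.3) weight generation: map G and C to kernel modulations M_G, M_C
        self.to_mg = nn.Linear(m, c_in)
        self.to_mc = nn.Linear(c_out, c_out)

    def forward(self, x):
        b = x.size(0)
        # (3.1.1) G: global information, spatial size reduced to k x k
        g = F.adaptive_avg_pool2d(x, self.k).permute(0, 2, 3, 1)   # B x k x k x c_in
        g = F.relu(self.encode_norm(self.encode(g)))               # B x k x k x m
        # (3.1.2) C: projection of G to the output dimension
        c = F.relu(self.project_norm(self.project(g)))             # B x k x k x c_out
        # (3.1.3) dimension expansion and fusion: M = sigmoid(M'_G + M'_C)
        mg = self.to_mg(g).permute(0, 3, 1, 2).unsqueeze(1)        # B x 1 x c_in x k x k
        mc = self.to_mc(c).permute(0, 3, 1, 2).unsqueeze(2)        # B x c_out x 1 x k x k
        m = torch.sigmoid(mg + mc)                                 # B x c_out x c_in x k x k
        w = m * self.weight.unsqueeze(0)                           # W' = M ⊙ W, per sample
        # (3.1.4) conventional convolution with the sample-adaptive kernel
        x = x.reshape(1, b * self.c_in, *x.shape[-2:])
        w = w.reshape(b * self.c_out, self.c_in, self.k, self.k)
        out = F.conv2d(x, w, padding=self.k // 2, groups=b)
        out = out.reshape(b, self.c_out, *out.shape[-2:]) + self.bias.view(1, -1, 1, 1)
        return out
```

The grouped-convolution trick at the end applies a different generated kernel to each sample of the batch, which is how a sample-adaptive kernel can be evaluated with a single convolution call.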
(3.2) extracting the low-level feature F_l of the input data using the low-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the low-level global dynamic convolution layer is k_l × k_l and the number of convolution kernels is n_l, where k_l = 3 and n_l = 48. The low-level feature F_l is computed as
F_l = σ(BN(X W_l′ + b_l))
where X denotes an input sample of the input layer and is the input of the low-level global dynamic convolution layer; W_l′ denotes the sample-adaptive convolution kernel weight of the low-level global dynamic convolution layer obtained in step 3.1.3; b_l denotes the bias term of the low-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
(3.3) extracting the middle-level feature F_m of the input data using the middle-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the middle-level global dynamic convolution layer is k_m × k_m and the number of convolution kernels is n_m, where k_m = 3 and n_m = 96. The middle-level feature F_m is computed as
F_m = σ(BN(F_l W_m′ + b_m))
where F_l, the low-level feature of the input data, is the input of the middle-level global dynamic convolution layer; W_m′ denotes the low-level-feature-adaptive convolution kernel weight of the middle-level global dynamic convolution layer obtained in step 3.1.3; b_m denotes the bias term of the middle-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
(3.4) extracting the high-level feature F_h of the input data using the high-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the high-level global dynamic convolution layer is k_h × k_h and the number of convolution kernels is n_h, where k_h = 3 and n_h = 48. The high-level feature F_h is computed as
F_h = σ(BN(F_m W_h′ + b_h))
where F_m, the middle-level feature of the input data, is the input of the high-level global dynamic convolution layer; W_h′ denotes the middle-level-feature-adaptive convolution kernel weight of the high-level global dynamic convolution layer obtained in step 3.1.3; b_h denotes the bias term of the high-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
(3.5) passing the high-level feature F_h obtained in step 3.4 through the fully connected layers to obtain Y:
Y = W_fc2(W_fc1 F_h)
where W_fc1 denotes the first fully connected operation and W_fc2 denotes the second fully connected operation. The Y obtained after the two fully connected operations is a 2 × 1 vector Y = [a, b]^T, where a denotes the probability that the input sample belongs to the unchanged class and b denotes the probability that it belongs to the changed class. The prediction label ŷ of each sample is output according to this vector: when a > b, ŷ equals the class corresponding to a, i.e. the unchanged class; when a < b, ŷ equals the class corresponding to b, i.e. the changed class;
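A sketch of the overall classifier of steps 3.2-3.5, built from the GlobalDynamicConv module sketched above; the 2-channel input (the two temporal patches stacked), the hidden width of the first fully connected layer and the global average pooling before the classifier head are assumptions made to keep the example runnable:

```python
import torch
import torch.nn as nn

class GlobalDynamicCNN(nn.Module):
    """Three global dynamic convolution layers (48, 96, 48 kernels of size 3), each
    followed by batch normalization and ReLU, then two fully connected layers."""

    def __init__(self, c_in=2, r=9):
        super().__init__()
        def block(ci, co):
            return nn.Sequential(GlobalDynamicConv(ci, co, k=3), nn.BatchNorm2d(co), nn.ReLU())
        self.low = block(c_in, 48)    # F_l
        self.mid = block(48, 96)      # F_m
        self.high = block(96, 48)     # F_h
        self.fc1 = nn.Linear(48, 32)  # W_fc1 (hidden width assumed)
        self.fc2 = nn.Linear(32, 2)   # W_fc2 -> [a, b]

    def forward(self, x):
        f = self.high(self.mid(self.low(x)))   # B x 48 x r x r
        f = f.mean(dim=(2, 3))                 # global average pooling (assumed)
        return self.fc2(self.fc1(f))           # B x 2: [unchanged, changed]

# Usage: pred = model(patches).argmax(1), i.e. 0 (unchanged) when a > b, 1 (changed) when a < b.
```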
4. Use the training data set obtained in step 2 to train the global dynamic convolutional neural network constructed in step 3:
During training, a two-stage mixed-sample data augmentation scheme is adopted to increase the number of training samples, enrich sample diversity and alleviate over-fitting;
(4.1) Denote the total number of training epochs by t. From epoch 1 to epoch t/2, construct virtual samples for every pair of samples within each batch using mixed-sample data augmentation to expand the training samples:
The mixed-sample augmentation is computed as
x̃ = λ x_i + (1 − λ) x_j
ỹ = λ y_i + (1 − λ) y_j
where x_i and x_j denote two input samples, y_i and y_j their corresponding sample labels, and (x_i, y_i) and (x_j, y_j) are two samples randomly drawn from the training data set constructed in step 2.2; x̃ and ỹ denote the data and the label of the new virtual sample obtained by mixing the two samples; λ is a mixing coefficient randomly sampled from the Beta distribution, with value range [0, 1]:
λ ~ Beta(α, β)
where α and β are the parameters of the Beta distribution; in this stage α = β = 0.5;
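A sketch of the virtual-sample construction of step 4.1 (the helper name is illustrative; the two labels are kept separate here because the loss in step 4.3 weights them by λ and 1 − λ rather than mixing them explicitly):

```python
import numpy as np

def mix_samples(x_i, y_i, x_j, y_j, alpha=0.5, beta=0.5):
    """Mixed-sample augmentation: lambda ~ Beta(alpha, beta) with alpha = beta = 0.5,
    then x_tilde = lambda * x_i + (1 - lambda) * x_j."""
    lam = float(np.random.beta(alpha, beta))
    x_tilde = lam * x_i + (1.0 - lam) * x_j
    return x_tilde, y_i, y_j, lam
```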
(4.2) From epoch t/2 + 1 to epoch t, gradually transition from mixed-sample data augmentation to basic data augmentation; whether mixed-sample augmentation is used is controlled by a linearly decaying probability ε, computed as follows:
ε=(t-i)/2t
where i denotes the current epoch. A random number θ with range [0, 1] is drawn as a threshold; when θ < ε, the mixed-sample augmentation of step 4.1 is used; otherwise the samples are expanded with basic data augmentation such as cropping, flipping and rotation;
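A sketch of this two-stage schedule (function and variable names are illustrative):

```python
import random

def use_mixup(epoch: int, total_epochs: int) -> bool:
    """Stage 1 (epochs 1..t/2): always use mixed-sample augmentation.
    Stage 2 (epochs t/2+1..t): use it with linearly decaying probability
    eps = (t - i) / (2t); otherwise fall back to basic augmentation
    (cropping / flipping / rotation)."""
    if epoch <= total_epochs // 2:
        return True
    eps = (total_epochs - epoch) / (2.0 * total_epochs)
    return random.random() < eps  # theta < eps
```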
(4.3) Compute the loss function of the global dynamic convolutional neural network during training as
L = λ L_CE(ŷ, y_i) + (1 − λ) L_CE(ŷ, y_j)
where L_CE denotes the cross-entropy loss function, L_CE(ŷ, y) = −Σ y log ŷ, and log denotes the base-10 logarithm. If mixed-sample data augmentation is applied, y_i and y_j are the sample labels of the two randomly drawn input samples x_i and x_j described in step 4.1, and ŷ is the prediction label obtained after the mixed sample x̃ passes through the network constructed in step 3. If mixed-sample data augmentation is not applied, y_i = y_j = y is the true label of the current input sample, and ŷ is the prediction label obtained after the current input sample passes through the network constructed in step 3. λ denotes the mixing coefficient described in step 4.1;
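A sketch of this loss in PyTorch; note that F.cross_entropy uses the natural logarithm rather than base 10 and operates on class logits, which is an assumption of the sketch:

```python
import torch.nn.functional as F

def mixed_loss(logits, y_i, y_j, lam):
    """L = lambda * CE(y_hat, y_i) + (1 - lambda) * CE(y_hat, y_j).
    Without mixing, y_i == y_j and this reduces to the ordinary cross-entropy."""
    return lam * F.cross_entropy(logits, y_i) + (1.0 - lam) * F.cross_entropy(logits, y_j)
```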
(4.4) optimizing global dynamic convolutional neural network parameters by using a Stochastic Gradient Descent (SGD) algorithm;
5. Input the test data set of step 2.3 into the optimized global dynamic convolutional neural network and obtain its prediction labels following the procedure of steps 3.2 to 3.5; from these labels, the change result map of the geographic area in step 1 is obtained.
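A sketch of reassembling the per-pixel predictions into the final change result map, assuming the test patches of step 2.3 were extracted in row-major order over the image:

```python
import numpy as np

def labels_to_change_map(pred_labels: np.ndarray, height: int, width: int) -> np.ndarray:
    """Reshape per-pixel predictions (0 = unchanged, 1 = changed) into an H x W binary map."""
    return pred_labels.reshape(height, width).astype(np.uint8) * 255
```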
Compared with the prior art, the invention has the advantages and positive effects that:
the SAR image change detection method based on the global dynamic convolutional neural network provided by the invention processes the SAR image through the difference map generation, the unsupervised pre-classification and the neural network model training and classification, and utilizes the characteristics of high classification precision and high robustness to noise of a global dynamic convolutional neural network classifier:
1. and performing difference analysis by using a logarithm ratio operator to obtain two difference graphs of the multi-temporal SAR images. The logarithmic ratio operator can effectively inhibit speckle noise and enhance the contrast of a change class and a non-change class, so that the difference of the sample is enhanced;
2. and (4) performing presorting by using a hierarchical FCM (fuzzy C-means) clustering algorithm to obtain a pseudo label matrix. The hierarchical FCM clustering algorithm has good clustering effect and high clustering efficiency, so that the precision and the speed of pre-classification can be improved;
3. the global dynamic convolution neural network effectively combines global characteristics and local characteristics through a self-adaptive convolution kernel, so that more robust characteristic representation can be obtained, and the classification precision of the neural network classifier is improved;
4. the two-stage mixed sample data enhancement method can increase the number and diversity of training samples, prevent the over-fitting problem, and improve the generalization capability of the network, thereby effectively generating more stable classification results.
Drawings
Fig. 1 is a schematic flowchart of a SAR image change detection method according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a neural network according to an embodiment of the present invention;
FIG. 4 is a diagram of a global dynamic convolution module according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating input data according to an embodiment of the present invention;
FIG. 6 is a graph comparing the effectiveness of the method of the present invention with that of the prior art.
Detailed Description
In the following description, various aspects of the invention will be described, but it will be apparent to those skilled in the art that the invention may be practiced with only some or all of the inventive structures or processes. Specific numbers, configurations and sequences are set forth in order to provide clarity of explanation, but it will be apparent that the invention may be practiced without these specific details. In other instances, well-known features have not been set forth in detail in order not to obscure the invention.
Referring to fig. 1, the method comprises the following specific steps:
step 1: performing difference analysis on two multi-temporal SAR images captured in the same geographic area to obtain a difference map:
performing difference analysis on the two multi-temporal SAR images by adopting a logarithm ratio operator to obtain the difference map of the two multi-temporal SAR images;
the calculation formula of the difference map is as follows:
I_DI = |log I_1 − log I_2|
where I_1 and I_2 denote the two multi-temporal SAR images of the same geographic area, I_DI denotes the difference map of I_1 and I_2, |·| denotes the absolute value operation, and log denotes the base-10 logarithm;
step 2: pre-classify the difference map I_DI and construct a training data set and a test data set:
step 21: pre-classify the difference map I_DI using a hierarchical FCM (fuzzy C-means) clustering algorithm to obtain a pseudo-label matrix, with pseudo-label values of 0, 1 and 0.5; that is, pre-classification yields high-confidence samples that can be judged unchanged or changed, together with uncertain samples;
step 22: randomly select p% of the pixels whose pseudo-label value is 0 or 1, where p is an integer not greater than 10 (the optimal value on the data sets used by the method is 6); record their spatial positions, and at the corresponding positions of the two original multi-temporal SAR images take the r × r neighborhood pixel block around each pixel as a training sample; for edge pixels the neighborhood block is extracted by zero-padding the missing neighbors; r is an odd number not less than 3 (the optimal value on the data sets used by the method is 9 or 11);
step 2.3: extract the r × r neighborhood pixel blocks around all pixels of the two original multi-temporal SAR images as the test set; for edge pixels the neighborhood block is extracted by zero-padding the missing neighbors; r is an odd number not less than 3 (the optimal value on the data sets used by the method is 9 or 11);
and 3, step 3: constructing a global dynamic convolution neural network:
The core of the constructed network consists of three global dynamic convolution layers, with the following structure: input layer → low-level global dynamic convolution layer → middle-level global dynamic convolution layer → high-level global dynamic convolution layer → fully connected layer. The sample size fed to the input layer is c × r × r, where c denotes the number of input data channels;
step 3.1: constructing a global dynamic convolution layer:
the structures of the three global dynamic convolution layers are consistent, and each global dynamic convolution layer is composed of a global feature coding module, a context feature projection module and a convolution kernel weight generation module. Firstly, the global feature coding module codes global information through an average pooling layer and a linear layer, and then the context feature projection module projects the coded features to an output dimension space through context feature projection. Finally, the convolution kernel weight generation module generates a convolution kernel containing global context information, and the generated new convolution kernel is used for executing conventional convolution;
step 3.1.1: constructing a global feature coding module:
The global feature coding module consists of an average pooling layer, a linear layer, a normalization layer and an activation layer. The average pooling layer extracts global information from the input data and reduces its spatial size to k × k, where k is the convolution kernel size; a linear layer then projects the features of all c channels onto a vector of size m; after normalization and a ReLU activation function, the output feature of the global feature coding module is
G = σ(Norm(Linear(AvgPool_k(X)))), G ∈ R^(k×k×m);
step 3.1.2: constructing a context feature projection module:
The context feature projection module consists of a linear layer, a normalization layer and an activation layer. The linear layer projects the feature G obtained in step 3.1.1 into a space of output dimension n; after normalization and a ReLU activation function, the output feature of the context feature projection module is
C = σ(Norm(Linear(G))), C ∈ R^(k×k×n);
step 3.1.3: constructing a convolution kernel weight generation module:
The weight generation module first uses two linear layers to map the feature G obtained in step 3.1.1 and the feature C obtained in step 3.1.2 to the spatial size of the convolution kernel: the first linear layer converts G into M_G and the second linear layer converts C into M_C. M_G is then dimension-expanded to M′_G and M_C is dimension-expanded to M′_C so that both match the size of the convolution kernel, and the expanded features are added to generate the convolution kernel weight M:
M = δ(M′_G + M′_C)
where M has the same size as the original convolution kernel W and δ denotes the Sigmoid activation function. Finally, the generated convolution kernel weight M is multiplied element by element with the current convolution kernel weight W, combining global and local information to obtain the input-adaptive new convolution kernel weight W′:
W′ = M ⊙ W
where ⊙ denotes element-by-element multiplication;
step 3.1.4: perform a conventional convolution on the current input data using the new convolution kernel generated in step 3.1.3;
step 3.2: extract the low-level feature F_l of the input data using the low-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the low-level global dynamic convolution layer is k_l × k_l and the number of convolution kernels is n_l, where k_l = 3 and n_l = 48. The low-level feature F_l is computed as
F_l = σ(BN(X W_l′ + b_l))
where X denotes an input sample of the input layer and is the input of the low-level global dynamic convolution layer; W_l′ denotes the sample-adaptive convolution kernel weight of the low-level global dynamic convolution layer obtained in step 3.1.3; b_l denotes the bias term of the low-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
step 3.3: extract the middle-level feature F_m of the input data using the middle-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the middle-level global dynamic convolution layer is k_m × k_m and the number of convolution kernels is n_m, where k_m = 3 and n_m = 96. The middle-level feature F_m is computed as
F_m = σ(BN(F_l W_m′ + b_m))
where F_l, the low-level feature of the input data, is the input of the middle-level global dynamic convolution layer; W_m′ denotes the low-level-feature-adaptive convolution kernel weight of the middle-level global dynamic convolution layer obtained in step 3.1.3; b_m denotes the bias term of the middle-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
step 3.4: extract the high-level feature F_h of the input data using the high-level global dynamic convolution layer constructed in step 3.1:
The convolution kernel size in the high-level global dynamic convolution layer is k_h × k_h and the number of convolution kernels is n_h, where k_h = 3 and n_h = 48. The high-level feature F_h is computed as
F_h = σ(BN(F_m W_h′ + b_h))
where F_m, the middle-level feature of the input data, is the input of the high-level global dynamic convolution layer; W_h′ denotes the middle-level-feature-adaptive convolution kernel weight of the high-level global dynamic convolution layer obtained in step 3.1.3; b_h denotes the bias term of the high-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
step 3.5: pass the high-level feature F_h obtained in step 3.4 through the fully connected layers to obtain Y:
Y = W_fc2(W_fc1 F_h)
where W_fc1 denotes the first fully connected operation and W_fc2 denotes the second fully connected operation. The Y obtained after the two fully connected operations is a 2 × 1 vector Y = [a, b]^T, where a denotes the probability that the input sample belongs to the unchanged class and b denotes the probability that it belongs to the changed class. The prediction label ŷ of each sample is output according to this vector: when a > b, ŷ equals the class corresponding to a, i.e. the unchanged class; when a < b, ŷ equals the class corresponding to b, i.e. the changed class;
step 4: use the training data set obtained in step 2 to train the global dynamic convolutional neural network constructed in step 3:
During training, a two-stage mixed-sample data augmentation scheme is adopted to increase the number of training samples, enrich sample diversity and alleviate over-fitting;
step 4.1: Denote the total number of training epochs by t. From epoch 1 to epoch t/2, construct virtual samples for every pair of samples within each batch using mixed-sample data augmentation to expand the training samples:
The mixed-sample augmentation is computed as
x̃ = λ x_i + (1 − λ) x_j
ỹ = λ y_i + (1 − λ) y_j
where x_i and x_j denote two input samples, y_i and y_j their corresponding sample labels, and (x_i, y_i) and (x_j, y_j) are two samples randomly drawn from the training data set constructed in step 2.2; x̃ and ỹ denote the data and the label of the new virtual sample obtained by mixing the two samples; λ is a mixing coefficient randomly sampled from the Beta distribution, with value range [0, 1]:
λ ~ Beta(α, β)
where α and β are the parameters of the Beta distribution; in this stage α = β = 0.5;
step 4.2: From epoch t/2 + 1 to epoch t, gradually transition from mixed-sample data augmentation to basic data augmentation; whether mixed-sample augmentation is used is controlled by a linearly decaying probability ε, computed as follows:
ε = (t − i)/2t
where i denotes the current epoch. A random number θ with range [0, 1] is drawn as a threshold; when θ < ε, the mixed-sample augmentation of step 4.1 is used; otherwise the samples are expanded with basic data augmentation such as cropping, flipping and rotation;
step 4.3: Compute the loss function of the global dynamic convolutional neural network during training as
L = λ L_CE(ŷ, y_i) + (1 − λ) L_CE(ŷ, y_j)
where L_CE denotes the cross-entropy loss function, L_CE(ŷ, y) = −Σ y log ŷ, and log denotes the base-10 logarithm. If mixed-sample data augmentation is applied, y_i and y_j are the sample labels of the two randomly drawn input samples x_i and x_j described in step 4.1, and ŷ is the prediction label obtained after the mixed sample x̃ passes through the network constructed in step 3. If mixed-sample data augmentation is not applied, y_i = y_j = y is the true label of the current input sample, and ŷ is the prediction label obtained after the current input sample passes through the network constructed in step 3. λ denotes the mixing coefficient described in step 4.1;
step 4.4: optimizing global dynamic convolutional neural network parameters using a Stochastic Gradient Descent (SGD) algorithm;
and 5: inputting the test data set in the step 2.3 into the optimized global dynamic convolutional neural network, and obtaining a prediction label of the test data set according to the processes in the steps 3.2 to 3.5;
step 6: obtain the change result map of the geographic area in step 1 from the prediction labels obtained in step 5.
The effect of the invention is further illustrated below with simulation experiments:
the simulation experiment of the invention is carried out in the hardware environment of Intel Xeon E5-2609, GeForce RTX 2080 and memory 32GB and the software environment of Ubuntu 16.04.6, PyTorch and Matlab2016 a. The simulation experiment data of the present invention is shown in fig. 5, wherein fig. 5(a) is a real SAR image taken at time 1; fig. 5(b) is a real SAR image taken at time 2; fig. 5(c) is a reference diagram of simulation change detection results of real SAR images, which is manually and carefully labeled by an expert considering prior knowledge. The experimental objects of the invention are three groups of multi-temporal SAR image data sets, namely a Sulzberger data set, a Chaohu I data set and a Chaohu II data set. Sulzberger datasets were taken by ENVISAT satellites at 11 and 16 days 3/2011, with a size of 256 × 256 pixels, as in the first row of FIG. 5. The Chaohu i and Chaohu ii datasets were acquired by Sentinel-1 satellites at 5 and 7 months 2020, at 384 × 384 pixels, as shown in the second and third rows of fig. 5.
The results of comparing the method of the invention with more advanced existing change detection methods are shown in fig. 6. The PCAKM method in the comparative experiments is proposed in the article "Unsupervised change detection in satellite images using principal component analysis and k-means clustering"; the GaborPCANet method is proposed in the article "Automatic change detection in synthetic aperture radar images based on PCANet"; the NR-ELM method is proposed in the article "Change detection from synthetic aperture radar images based on neighborhood-based ratio and extreme learning machine"; the CWNN method is proposed in the article "Sea ice change detection in SAR images based on convolutional-wavelet neural networks"; the DDNet method is proposed in the article "Change detection in synthetic aperture radar images using a dual-domain network".
As shown in fig. 6, on the Sulzberger data set the method of the invention preserves the change details while suppressing speckle noise. On the Chaohu I and Chaohu II data sets the influence of speckle noise is much stronger, and the comparison methods show different sensitivities to noise in different areas, so their performance is no longer consistent with that on the Sulzberger data set; the method of the invention nevertheless still achieves good performance, indicating better robustness to noise.
The invention compares the classification accuracy (PCC) and the Kappa coefficient (KC) with the above methods, computed as follows:
PCC = (N − OE) / N
KC = (PCC − PRE) / (1 − PRE)
where
OE = FP + FN
PRE = [(TP + FP − FN) × TP + (TN + FN − FP) × TN] / (N × N)
N is the total number of pixels and OE is the total number of errors. FP is the number of false detections, i.e., pixels that are unchanged in the reference map but detected as changed in the final change map; FN is the number of missed detections, i.e., pixels that are changed in the reference map but detected as unchanged in the final change map. PRE reflects the number and proportion of false and missed detections, where TP is the number of truly changed pixels and TN is the number of truly unchanged pixels. Larger PCC and KC values indicate a more accurate change detection result and stronger noise suppression.
Tables 1, 2 and 3 show the results of the comparative experiments between the invention and the above methods. As can be seen from the tables, the PCC and KC values of the method of the invention are the highest, and good performance is obtained on all three data sets, which shows that the method of the invention detects the change information in the input images more accurately and is more robust to noise.
TABLE 1 Sulzberger data set Change detection test results
Method FP FN OE PCC(%) KC(%)
PCAKM 3308 701 4009 93.88 84.49
GaborPCANet 2485 494 2979 95.45 88.34
NR-ELM 2386 646 3032 95.37 88.07
CWNN 1598 1132 2730 95.83 88.98
DDNet 1932 494 2426 96.30 90.40
The method of the invention 1398 731 2129 96.75 91.44
TABLE 2 Change detection test results for the Chaohu I dataset
Method FP FN OE PCC(%) KC(%)
PCAKM 13199 2996 16195 89.02 43.99
GaborPCANet 16777 1500 18277 87.61 45.01
NR-ELM 2252 3552 5804 96.06 69.63
CWNN 2908 2808 5716 96.12 71.85
DDNet 4983 1217 6200 95.80 73.54
The method of the invention 3029 1034 4063 97.24 81.46
TABLE 3 Change detection test results for the Chaohu II dataset
Method FP FN OE PCC(%) KC(%)
PCAKM 8521 2248 10769 92.70 65.58
GaborPCANet 2946 1771 4717 96.80 82.66
NR-ELM 595 3836 4431 97.00 81.27
CWNN 959 2397 3356 97.72 86.63
DDNet 3107 779 3886 97.36 86.18
The method of the invention 792 1218 2010 98.64 92.24
The method based on the global dynamic convolutional neural network is mainly designed to improve multi-temporal SAR image change detection performance, but it is also suitable for analyzing images captured by common imaging devices such as digital cameras and mobile phones, with similar beneficial effects.
The above description is only a preferred embodiment of the invention and is not intended to limit the invention to this form. Any person skilled in the art may make equivalent modifications or changes based on the technical content disclosed above without departing from the technical essence of the invention; any simple modification, equivalent change or variation made to the above embodiment in accordance with the technical essence of the invention still falls within the protection scope of the invention.

Claims (6)

1. A SAR image change detection method based on a global dynamic convolution neural network is characterized by comprising the following steps:
step 1, performing difference analysis on two multi-temporal SAR images captured over the same geographic area to obtain a difference map I_DI;
step 2, pre-classifying the difference map I_DI to construct a training data set and a test data set;
step 3, constructing a global dynamic convolution neural network;
the constructed global dynamic convolutional neural network core architecture comprises three global dynamic convolution layers with identical structure, each global dynamic convolution layer consisting of a global feature coding module, a context feature projection module and a convolution kernel weight generation module; the structure of the global dynamic convolutional neural network is: input layer → low-level global dynamic convolution layer → middle-level global dynamic convolution layer → high-level global dynamic convolution layer → fully connected layer, where the sample size input by the input layer is c × r × r and c denotes the number of input data channels;
and 4, step 4: the training data set obtained in the step 2 is used for training the global dynamic convolution neural network constructed in the step 3, and during training, a two-stage mixed sample data enhancement mode is adopted for training;
and 5: and testing the test data set by using the trained network model to obtain a prediction label of the test data set, thereby obtaining a change detection result of the whole image.
2. The SAR image change detection method based on the global dynamic convolutional neural network as claimed in claim 1, characterized in that: the step 3 of constructing the global dynamic convolutional neural network is specifically realized by the following method:
step 31, constructing a global dynamic convolution layer: firstly, a global feature coding module codes global information through an average pooling layer and a linear layer, then a context feature projection module projects the coded features to an output dimension space through context feature projection, and finally a convolution kernel weight generation module generates a convolution kernel containing the global context information and uses the generated new convolution kernel to execute conventional convolution;
step 32, extracting the low-level feature F_l of the input data using the low-level global dynamic convolution layer constructed in step 31:
F_l = σ(BN(X W_l′ + b_l))
where X denotes an input sample of the input layer and is the input of the low-level global dynamic convolution layer; W_l′ denotes the sample-adaptive convolution kernel weight of the low-level global dynamic convolution layer; b_l denotes the bias term of the low-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training; BN denotes batch normalization; σ denotes the ReLU activation function;
step 33: extracting the middle-level feature F_m of the input data using the middle-level global dynamic convolution layer constructed in step 31:
F_m = σ(BN(F_l W_m′ + b_m))
where F_l, the low-level feature of the input data, is the input of the middle-level global dynamic convolution layer; W_m′ denotes the low-level-feature-adaptive convolution kernel weight of the middle-level global dynamic convolution layer; b_m denotes the bias term of the middle-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training;
step 34: extracting the high-level feature F_h of the input data using the high-level global dynamic convolution layer constructed in step 31:
F_h = σ(BN(F_m W_h′ + b_h))
where F_m, the middle-level feature of the input data, is the input of the high-level global dynamic convolution layer; W_h′ denotes the middle-level-feature-adaptive convolution kernel weight of the high-level global dynamic convolution layer; b_h denotes the bias term of the high-level global dynamic convolution layer, obtained by random initialization and optimized through iterative network training;
step 35: passing the high-level feature F_h obtained in step 34 through the fully connected layers to obtain Y:
Y = W_fc2(W_fc1 F_h)
where W_fc1 denotes the first fully connected operation and W_fc2 denotes the second fully connected operation; the Y obtained after the two fully connected operations is a 2 × 1 vector Y = [a, b]^T, where a denotes the probability that the input sample belongs to the unchanged class and b denotes the probability that it belongs to the changed class; the prediction label ŷ of each sample is output according to this vector: when a > b, ŷ equals the class corresponding to a, i.e. the unchanged class; when a < b, ŷ equals the class corresponding to b, i.e. the changed class.
3. The SAR image change detection method based on the global dynamic convolutional neural network of claim 2, characterized in that: the step 31 is specifically realized in the following manner:
step 311, constructing a global feature coding module: the global feature coding module comprises an average pooling layer, a linear layer, a normalization layer and an activation layer; the average pooling layer extracts the global information of the input data and reduces its spatial size to k × k, where k is the convolution kernel size; a linear layer then projects the features of all c channels onto a vector of size m; after normalization and a ReLU activation function, the output feature of the global feature coding module is
G = σ(Norm(Linear(AvgPool_k(X)))), G ∈ R^(k×k×m);
step 312, constructing a context feature projection module:
the context feature projection module comprises a linear layer, a normalization layer and an activation layer; the linear layer projects the feature G obtained in step 311 into a space of output dimension n; after normalization and a ReLU activation function, the output feature of the context feature projection module is
C = σ(Norm(Linear(G))), C ∈ R^(k×k×n);
step 313: a convolution kernel weight generation module is constructed:
(1) the weight generation module firstly utilizes two linear layers to respectively convert the feature G obtained in the step 311 and the feature C obtained in the step 312 into the space size of a convolution kernel, and the first linear layer converts the feature
Figure FDA00036571632400000210
Is converted into
Figure FDA00036571632400000211
Second linear layer will be characterized
Figure FDA00036571632400000212
Is converted into
Figure FDA00036571632400000213
Then M is added G Performing dimension expansion to obtain
Figure FDA00036571632400000214
Will M C Performing dimension expansion to obtain
Figure FDA00036571632400000215
(2) Then, M is added G And M C Adding to generate convolution kernel weight M, and calculating as follows:
M=δ(M′ G +M′ C )
wherein the content of the first and second substances,
Figure FDA00036571632400000216
the size of the convolution kernel is the same as that of an original convolution kernel W, and delta represents a Sigmoid activation function;
(3) finally, the generated convolution kernel weight M and the current convolution kernel weight W are multiplied element by element, and the global information and the local information are combined to obtain a new convolution kernel weight W' of the input data self-adaption, wherein the calculation formula is as follows:
W′=M⊙W
wherein, "-" indicates element-by-element multiplication;
step 314: the current input data is convolved conventionally with the new convolution kernel W′ generated in step 313.
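A minimal sketch of the global dynamic convolution of steps 311-314, assuming PyTorch. The extracted claim text does not give the exact shapes produced by the two linear layers or by the dimension expansion, so the sketch broadcasts M′_G and M′_C to the full kernel shape (c_out, c_in, k, k) as one plausible realization, and for simplicity generates a single kernel per batch rather than per sample:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalDynamicConv(nn.Module):
    """Sketch of the global dynamic convolution in steps 311-314 (see assumptions above)."""
    def __init__(self, c_in, c_out, k=3, m=16, n=16):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.randn(c_out, c_in, k, k) * 0.01)  # original kernel W
        # step 311: global feature coding (average pool to k x k, linear -> m, norm, ReLU)
        self.encode = nn.Sequential(nn.Linear(c_in * k * k, m), nn.LayerNorm(m), nn.ReLU())
        # step 312: context feature projection (linear -> n, norm, ReLU)
        self.project = nn.Sequential(nn.Linear(m, n), nn.LayerNorm(n), nn.ReLU())
        # step 313(1): two linear layers mapping G and C toward the kernel space
        self.to_kernel_g = nn.Linear(m, k * k)            # -> M_G
        self.to_kernel_c = nn.Linear(n, c_out * c_in)     # -> M_C

    def forward(self, x):                                  # x: (B, c_in, H, W)
        pooled = F.adaptive_avg_pool2d(x, self.k)          # reduce spatial size to k x k
        g = self.encode(pooled.flatten(1)).mean(0)         # G, averaged over the batch
        c = self.project(g)                                # C
        m_g = self.to_kernel_g(g).view(1, 1, self.k, self.k)               # M'_G (expanded)
        m_c = self.to_kernel_c(c).view(*self.weight.shape[:2], 1, 1)       # M'_C (expanded)
        m = torch.sigmoid(m_g + m_c)                       # M = sigmoid(M'_G + M'_C)
        w_new = m * self.weight                            # W' = M ⊙ W
        # step 314: conventional convolution with the adapted kernel
        return F.conv2d(x, w_new, padding=self.k // 2)
```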
4. The SAR image change detection method based on the global dynamic convolutional neural network as claimed in claim 1, characterized in that: the step 4 of training by adopting a two-stage mixed sample data enhancement mode comprises the following steps:
step 41: denote the total number of training rounds as t; from round 1 to round t/2, construct virtual samples for every pair of samples in each batch of the batch training using the mixed-sample data enhancement, so as to expand the training samples;
the calculation formula of the mixed sample data enhancement is as follows:
x̃ = λ·x_i + (1 − λ)·x_j
ỹ = λ·y_i + (1 − λ)·y_j

where x_i and x_j denote two input sample data and y_i and y_j their corresponding sample labels; (x_i, y_i) and (x_j, y_j) are two samples randomly drawn from the training data set constructed in step 2, so x̃ and ỹ denote the data and the label of the new virtual sample obtained by mixing the two samples; λ is a mixing coefficient randomly sampled from the beta distribution, with value range [0, 1]:
λ~Beta(α,β)
Wherein α and β are parameters in a beta distribution;
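A minimal sketch of the mixed-sample data enhancement of step 41; the concrete values of α and β are not specified in the extracted claim text, so the defaults below are placeholders:

```python
import numpy as np

def mix_samples(x_i, y_i, x_j, y_j, alpha=1.0, beta=1.0):
    """Construct a virtual sample from two samples, as in step 41.

    alpha and beta are the Beta-distribution parameters (placeholder values).
    """
    lam = np.random.beta(alpha, beta)          # lambda ~ Beta(alpha, beta), in [0, 1]
    x_mix = lam * x_i + (1.0 - lam) * x_j      # virtual sample data
    y_mix = lam * y_i + (1.0 - lam) * y_j      # virtual sample label
    return x_mix, y_mix, lam
```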
step 42: from the (t/2+1) round to the tth round, the enhancement mode of the mixed sample data is gradually transited to the enhancement mode of the basic data, whether the mixed sample data is used for enhancement is controlled by the probability epsilon of linear decline, and the calculation formula of epsilon is as follows:
ε=(t-i)/2t
wherein i represents the current round number, a random number theta is set as a threshold value, the random range of theta is [0,1], and when theta is less than epsilon, the mixed sample data in the step 41 is used for enhancement; otherwise, the samples are augmented using the base data enhancement;
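A minimal sketch of the step 42 schedule, deciding per batch whether to apply the mixed-sample enhancement; the reading of ε = (t − i)/(2t) with an integer round index i follows the claim text:

```python
import random

def use_mix(i, t):
    """Return True when the mixed-sample enhancement of step 41 should be applied in round i of t."""
    if i <= t // 2:                   # rounds 1 .. t/2: always use mixed-sample enhancement
        return True
    eps = (t - i) / (2 * t)           # linearly decaying probability
    theta = random.random()           # random threshold in [0, 1]
    return theta < eps                # mix only when theta < eps
```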
step 43: and calculating a loss function of the global dynamic convolutional neural network in the training process, wherein the calculation formula is as follows:
Figure FDA0003657163240000035
wherein the content of the first and second substances,
Figure FDA0003657163240000036
Figure FDA0003657163240000037
if the mixed sample data enhancement is carried out: y is i And y j For two random input samples x as described in step 41 i And x j The corresponding sample label is marked with a corresponding sample label,
Figure FDA0003657163240000038
is x i And x j Mixed sample
Figure FDA0003657163240000039
A prediction tag is obtained after the network constructed in the step 3 is passed;
if no enhancement of the mixed sample data is performed, y i =y j Y is the real label of the current input sample;
Figure FDA00036571632400000310
obtaining a prediction label for the current input sample after passing through the network constructed in the step 3; l is CE Represents a cross entropy loss function, and log represents a base 10 logarithmic operation;
and step 44: optimize the parameters of the global dynamic convolutional neural network using the stochastic gradient descent (SGD) algorithm.
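A minimal sketch of the step 43 loss and the step 44 SGD update, assuming PyTorch; note that F.cross_entropy uses the natural logarithm rather than the base-10 logarithm stated in the claim (they differ only by a constant factor), and the optimizer hyper-parameters are placeholders:

```python
import torch
import torch.nn.functional as F

def mixed_loss(logits, y_i, y_j, lam):
    """L = lam * L_CE(y_hat, y_i) + (1 - lam) * L_CE(y_hat, y_j).

    When no mixing was applied, y_i == y_j and this reduces to the plain cross entropy.
    """
    return lam * F.cross_entropy(logits, y_i) + (1.0 - lam) * F.cross_entropy(logits, y_j)

# step 44 (illustrative): optimize the network parameters with SGD.
# model, x_mix, y_i, y_j and lam are assumed to come from the steps above.
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# loss = mixed_loss(model(x_mix), y_i, y_j, lam)
# optimizer.zero_grad(); loss.backward(); optimizer.step()
```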
5. The SAR image change detection method based on the global dynamic convolutional neural network of claim 1, characterized in that: in the step 1, a logarithm ratio operator is used for performing difference analysis on the two multi-temporal SAR images to obtain a difference map of the two multi-temporal SAR images, and a calculation formula of the difference map is as follows:
I_DI = |log I_1 − log I_2|

where I_1 and I_2 denote the two multi-temporal SAR images of the same geographical area, I_DI denotes the difference map of I_1 and I_2, |·| denotes the absolute-value operation, and log denotes the base-10 logarithm operation.
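A minimal sketch of the log-ratio difference map of claim 5, assuming two NumPy arrays of identical shape; the small constant eps added to avoid log(0) is an assumption not present in the claim:

```python
import numpy as np

def log_ratio_difference(i1, i2, eps=1e-6):
    """Compute I_DI = |log I_1 - log I_2| with base-10 logarithms."""
    return np.abs(np.log10(i1 + eps) - np.log10(i2 + eps))
```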
6. The SAR image change detection method based on the global dynamic convolutional neural network as claimed in claim 1, characterized in that: the step 2 specifically comprises the following steps:
step 21, pre-classify the difference map I_DI using a hierarchical FCM clustering algorithm to obtain a pseudo-label matrix, where the pseudo-label values are 0, 1 and 0.5, i.e., after pre-classification, samples judged with high probability to be unchanged (0) or changed (1) and uncertain samples (0.5) are obtained;

step 22, randomly select p% of the pixels from the pixels with pseudo-label values 0 and 1, where p is an integer not greater than 10; extract the spatial positions of these pixels, and take the r×r neighborhood pixel blocks around the corresponding pixel positions in the original two multi-temporal SAR images as the training set; for edge pixels the neighborhood pixel blocks are extracted by zero-padding the neighborhood, and r is an odd number not less than 3;

and step 23, extract the r×r neighborhood pixel blocks around all pixels from the original two multi-temporal SAR images as the test set; for edge pixels the neighborhood pixel blocks are extracted by zero-padding the neighborhood, and r is an odd number not less than 3.
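A minimal sketch of the r×r neighborhood block extraction with zero padding used in steps 22 and 23, assuming a single-channel NumPy image; the hierarchical FCM pre-classification of step 21 is assumed to be provided elsewhere:

```python
import numpy as np

def extract_patch(img, row, col, r=5):
    """Extract the r x r neighborhood block centered at (row, col), zero-padding the border (r odd, r >= 3)."""
    half = r // 2
    padded = np.pad(img, half, mode='constant', constant_values=0)  # zero-fill the image border
    return padded[row: row + r, col: col + r]                        # r x r block around (row, col)
```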
CN202210564263.XA 2022-05-23 2022-05-23 SAR image change detection method based on global dynamic convolution neural network Pending CN115018773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210564263.XA CN115018773A (en) 2022-05-23 2022-05-23 SAR image change detection method based on global dynamic convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210564263.XA CN115018773A (en) 2022-05-23 2022-05-23 SAR image change detection method based on global dynamic convolution neural network

Publications (1)

Publication Number Publication Date
CN115018773A true CN115018773A (en) 2022-09-06

Family

ID=83070036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210564263.XA Pending CN115018773A (en) 2022-05-23 2022-05-23 SAR image change detection method based on global dynamic convolution neural network

Country Status (1)

Country Link
CN (1) CN115018773A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116229301A (en) * 2023-05-09 2023-06-06 南京瀚海伏羲防务科技有限公司 Lightweight unmanned aerial vehicle obstacle detection model, detection method and detection system
CN116229301B (en) * 2023-05-09 2023-10-27 南京瀚海伏羲防务科技有限公司 Lightweight unmanned aerial vehicle obstacle detection model, detection method and detection system
CN116778207A (en) * 2023-06-30 2023-09-19 哈尔滨工程大学 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain
CN116778207B (en) * 2023-06-30 2024-02-09 哈尔滨工程大学 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain
CN116894999A (en) * 2023-07-18 2023-10-17 中国石油大学(华东) Method and device for detecting oil spill polarization SAR based on condition parameterized convolution
CN116894999B (en) * 2023-07-18 2024-05-03 中国石油大学(华东) Method and device for detecting oil spill polarization SAR based on condition parameterized convolution
CN116630816A (en) * 2023-07-26 2023-08-22 南京隼眼电子科技有限公司 SAR target recognition method, device, equipment and medium based on prototype comparison learning
CN116630816B (en) * 2023-07-26 2023-10-03 南京隼眼电子科技有限公司 SAR target recognition method, device, equipment and medium based on prototype comparison learning
CN116704270A (en) * 2023-08-07 2023-09-05 合肥焕峰智能科技有限公司 Intelligent equipment positioning marking method based on image processing
CN117111013A (en) * 2023-08-22 2023-11-24 南京慧尔视智能科技有限公司 Radar target tracking track starting method, device, equipment and medium
CN117111013B (en) * 2023-08-22 2024-04-30 南京慧尔视智能科技有限公司 Radar target tracking track starting method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN115018773A (en) SAR image change detection method based on global dynamic convolution neural network
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
CN110472627B (en) End-to-end SAR image recognition method, device and storage medium
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN106909924B (en) Remote sensing image rapid retrieval method based on depth significance
CN111723732B (en) Optical remote sensing image change detection method, storage medium and computing equipment
CN110659591B (en) SAR image change detection method based on twin network
CN109902715B (en) Infrared dim target detection method based on context aggregation network
Haker et al. Knowledge-based segmentation of SAR data with learned priors
CN107808138B (en) Communication signal identification method based on FasterR-CNN
Zi et al. Thin cloud removal for multispectral remote sensing images using convolutional neural networks combined with an imaging model
CN111339827A (en) SAR image change detection method based on multi-region convolutional neural network
CN115147731A (en) SAR image target detection method based on full-space coding attention module
CN111310609B (en) Video target detection method based on time sequence information and local feature similarity
CN113052185A (en) Small sample target detection method based on fast R-CNN
CN113870157A (en) SAR image synthesis method based on cycleGAN
CN107529647B (en) Cloud picture cloud amount calculation method based on multilayer unsupervised sparse learning network
CN116071664A (en) SAR image ship detection method based on improved CenterNet network
CN114612315A (en) High-resolution image missing region reconstruction method based on multi-task learning
CN112734695A (en) SAR image change detection method based on regional enhancement convolutional neural network
CN112270285A (en) SAR image change detection method based on sparse representation and capsule network
CN110503157B (en) Image steganalysis method of multitask convolution neural network based on fine-grained image
CN111275680A (en) SAR image change detection method based on Gabor convolution network
CN111563423A (en) Unmanned aerial vehicle image target detection method and system based on depth denoising automatic encoder
CN113837243B (en) RGB-D camera dynamic visual odometer method based on edge information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination