CN116778207A - Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain - Google Patents

Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain

Info

Publication number
CN116778207A
Authority
CN
China
Prior art keywords
layer
feature
features
spatial
frequency domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310790988.5A
Other languages
Chinese (zh)
Other versions
CN116778207B (en)
Inventor
Wang Lu (王路)
Ma Lirui (马丽睿)
Zhao Tianrui (赵天睿)
E Jiahui (鄂佳慧)
Zhao Chunhui (赵春晖)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN202310790988.5A priority Critical patent/CN116778207B/en
Publication of CN116778207A publication Critical patent/CN116778207A/en
Application granted granted Critical
Publication of CN116778207B publication Critical patent/CN116778207B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides an unsupervised depth multi-scale SAR image change detection method based on the spatial frequency domain. The method extracts pseudo labels from the difference image generated from the SAR images of the detection area through hierarchical fuzzy C-means clustering, which alleviates the problem of insufficient labels and makes the SAR image change detection method applicable to more scenes. The invention exploits both the spatial information and the frequency-domain information of the input SAR images and proposes spatial multi-region, multi-scale deep feature extraction; the image features captured in this way are more beneficial to detection. The invention introduces an attention mechanism and gated linear units in the spatial domain and the frequency domain respectively, so as to improve its sensitivity to change details, reduce the influence of the inherent speckle noise of SAR images and improve detection accuracy.

Description

Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain
Technical Field
The invention belongs to the technical field of remote sensing image processing, in particular to synthetic aperture radar (SAR) image change detection, and specifically relates to an unsupervised depth multi-scale SAR image change detection method based on the spatial frequency domain.
Background
Remote sensing image change detection is the process of obtaining change information by analyzing remote sensing images of the same scene acquired at different times. Synthetic aperture radar (SAR), one of the remote sensing technologies, is not influenced by sunlight, cloud cover or weather during imaging owing to its microwave imaging principle, which gives it unique advantages for change detection. However, random backscattering within the basic resolution cells of the SAR imaging system destroys the continuity of the phase, so speckle noise is present in SAR images. This inherent speckle noise makes it difficult for detection techniques to accurately identify the changed regions.
Traditional detection techniques rely mainly on threshold segmentation and clustering to obtain change information. However, a small change in the threshold causes large errors in the result, so the detection accuracy is poor, and clustering methods are sensitive to noise, so the inherent speckle noise in SAR images reduces their detection accuracy. With the continuous development of deep learning, deep-learning-based methods such as deep belief networks and convolutional neural networks have been applied to SAR image change detection. However, deep-learning-based detection methods typically require a large number of training samples; when manual labels are lacking or labels for the detection region are difficult to obtain, such methods can hardly reach their optimal detection level. Thus, label scarcity and the interference of speckle noise are two important challenges facing SAR image change detection.
Disclosure of Invention
The invention aims to solve the problems of insufficient labels and speckle noise interference, and provides an unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain so as to improve the detection precision of SAR image change detection.
The invention is realized by the following technical scheme, and provides an unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain, which comprises the following steps:
step 1: perform a logarithmic ratio operation on the two acquired SAR images I_1 and I_2 of the same region at different times to obtain the difference image I_d;
step 2: use the pseudo-label generator to extract pseudo labels from the difference image I_d, thereby constructing the training and test data sets;
step 3: input the generated training data set into the spatial-frequency-domain unsupervised depth multi-scale network for training;
step 4: calculate the cross entropy loss and perform back propagation;
step 5: input the test data set into the trained unsupervised depth multi-scale network and perform label prediction on the pixels in the test data set;
step 6: combine the obtained predicted labels with the pseudo labels in the training data set to obtain the final change information of the input region.
Further, in step 1, the logarithmic ratio operation is calculated as follows:
I_d = |log(I_2 / I_1)| = |log I_2 - log I_1|
where log denotes the logarithm with base e and |·| denotes the absolute value operation.
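A minimal NumPy sketch of this log-ratio operator (the function name and the small constant eps, added to avoid log(0), are illustrative assumptions rather than part of the patent):

```python
import numpy as np

def log_ratio_difference(img1: np.ndarray, img2: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Difference image I_d = |log(I_2 / I_1)| = |log I_2 - log I_1|."""
    i1 = img1.astype(np.float64) + eps  # eps guards against log(0) on dark SAR pixels
    i2 = img2.astype(np.float64) + eps
    return np.abs(np.log(i2) - np.log(i1))
```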
Further, the step 2 specifically includes:
step 2.1: perform hierarchical fuzzy C-means clustering on the difference image I_d so that the pixels of I_d are divided into 3 classes, namely the changed class, the unchanged class and the uncertain class; the class information of a pixel is the pseudo label of that pixel;
step 2.2: select part of the pixels from the changed class and the unchanged class; centered on each selected pixel, extract image blocks of size r × r from the two input SAR images I_1, I_2 and the difference image I_d respectively and splice them; the spliced image blocks together with the pseudo labels of their central pixels form the training data set;
step 2.3: centered on each pixel of the uncertain class, extract image blocks of size r × r from the two input SAR images I_1, I_2 and the difference image I_d respectively and splice them to form the test data set.
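A rough sketch of the patch construction in steps 2.2 and 2.3 (function and variable names are illustrative assumptions; border handling by edge replication is likewise an assumption, since the patent does not specify it):

```python
import numpy as np

def extract_patch_stack(i1, i2, i_d, center, r=15):
    """Splice the r x r patches from I_1, I_2 and I_d centered on one pixel
    into a single 3 x r x r sample."""
    half = r // 2
    padded = [np.pad(img, half, mode="edge") for img in (i1, i2, i_d)]
    y, x = center  # pixel coordinates in the original image
    return np.stack([p[y:y + r, x:x + r] for p in padded], axis=0)

def build_dataset(i1, i2, i_d, centers, labels=None, r=15):
    """Pair each spliced block with the pseudo label of its center pixel
    (labels=None for the uncertain-class test set)."""
    samples = np.stack([extract_patch_stack(i1, i2, i_d, c, r) for c in centers])
    return samples if labels is None else (samples, np.asarray(labels))
```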
Further, the spatial-frequency-domain unsupervised depth multi-scale network extracts the spatial features and the frequency-domain features of the input data separately, splices the extracted features, judges the probability that the corresponding pixel is changed or unchanged, and obtains the pixel class according to which probability is higher.
Further, the step 3 specifically includes:
step 3.1: perform a convolutional dimension-increasing operation on the input data;
a convolution is applied to the input with a kernel of size n × 1, obtaining convolution features of size n × r × r;
step 3.2: perform multi-region selection on the features;
the obtained features are divided into three parts P_1, P_2 and P_3, each of size (n/3) × r × r;
select the horizontal region of P_1, i.e. ignore the edge regions above and below the central horizontal band of P_1 and retain a horizontal region of P_1 of size (n/3) × 3 × r;
select the vertical region of P_2, i.e. ignore the edge regions on the left and right of the central vertical band of P_2 and retain a vertical region of P_2 of size (n/3) × r × 3;
retain the complete region of P_3, obtaining a full region of size (n/3) × r × r;
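A sketch of this multi-region selection on an n × r × r feature tensor, assuming a central band width of 3 as in the worked example later in the description (tensor layout and names are illustrative):

```python
import torch

def multi_region_select(feat: torch.Tensor, band: int = 3):
    """Split an (n, r, r) feature map into thirds along the channel axis and keep
    a central horizontal band of P_1, a central vertical band of P_2, and all of P_3."""
    n, r, _ = feat.shape
    p1, p2, p3 = feat.chunk(3, dim=0)                 # each (n/3, r, r)
    mid, half = r // 2, band // 2
    horizontal = p1[:, mid - half:mid + half + 1, :]  # (n/3, band, r)
    vertical = p2[:, :, mid - half:mid + half + 1]    # (n/3, r, band)
    return horizontal, vertical, p3                   # p3 is the full region
```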
step 3.3: deep multi-scale feature extraction is carried out;
step 3.4: perform a fast Fourier transform on the generated image blocks to obtain frequency-domain features, and pass them through three gated linear units; the output of the third gated linear unit is taken as the final frequency-domain feature;
One gated linear unit is calculated as follows:
D_l = (X_l W_1 + a) ⊗ σ(X_l W_2 + b)
where D_l is the output of the l-th gated linear unit, X_l is the input of the l-th gated linear unit, W_1 and W_2 are weight matrices, a and b are biases, σ is the Sigmoid function, and ⊗ denotes element-wise multiplication;
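A minimal PyTorch sketch of one such gated linear unit in the standard GLU form above (the layer widths are illustrative assumptions):

```python
import torch
import torch.nn as nn

class GatedLinearUnit(nn.Module):
    """D_l = (X_l W_1 + a) * sigmoid(X_l W_2 + b)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.value = nn.Linear(in_features, out_features)  # X W_1 + a
        self.gate = nn.Linear(in_features, out_features)   # X W_2 + b

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.value(x) * torch.sigmoid(self.gate(x))

# Three units in series as in step 3.4; the third output is taken as the
# frequency-domain feature. The widths (e.g. 3*15*15 flattened inputs) are assumptions.
glu_stack = nn.Sequential(
    GatedLinearUnit(675, 256),
    GatedLinearUnit(256, 128),
    GatedLinearUnit(128, 64),
)
```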
step 3.5: splice the obtained final spatial features and frequency-domain features and send them into a fully connected layer to obtain the corresponding predicted label.
Further, the step 3.3 specifically includes:
step 3.3.1: convolve the horizontal region, vertical region and full region obtained in step 3.2 with convolution kernels of size k_1 to obtain the first-layer horizontal feature F_1^h, the first-layer vertical feature F_1^v and the first-layer full-region feature F_1^a respectively;
step 3.3.2: pad the ignored edge portions of the first-layer horizontal feature F_1^h and the first-layer vertical feature F_1^v with zero elements to obtain the first-layer padded horizontal feature F_1^hz and the first-layer padded vertical feature F_1^vz, of the same size as the first-layer full-region feature F_1^a; pass the first-layer full-region feature F_1^a through the channel-space attention mechanism to obtain the first-layer key feature F_1^k; add the first-layer padded horizontal feature F_1^hz, the first-layer padded vertical feature F_1^vz and the first-layer key feature F_1^k to obtain the first-layer spatial feature F_1;
The first-layer spatial feature F_1 is calculated as follows:
F_1 = F_1^hz + F_1^vz + F_1^k
step 3.3.3: perform multi-region selection again on the obtained first-layer spatial feature F_1;
step 3.3.4: convolve the horizontal region, vertical region and full region obtained in step 3.3.3 with convolution kernels of size k_2, where k_2 > k_1, to obtain the second-layer horizontal feature F_2^h, the second-layer vertical feature F_2^v and the second-layer full-region feature F_2^a respectively;
step 3.3.5: pad the ignored edge portions of F_2^h and F_2^v with zero elements to obtain the second-layer padded horizontal feature F_2^hz and the second-layer padded vertical feature F_2^vz; pass F_2^a through the channel-space attention mechanism to obtain the second-layer key feature F_2^k; add F_2^hz, F_2^vz and F_2^k to obtain the second-layer spatial feature F_2;
The second-layer spatial feature F_2 is calculated as follows:
F_2 = F_2^hz + F_2^vz + F_2^k
step 3.3.6: perform multi-region selection on the obtained second-layer spatial feature F_2;
step 3.3.7: convolve the horizontal region, vertical region and full region obtained in step 3.3.6 with convolution kernels of size k_3, where k_3 > k_2, to obtain the third-layer horizontal feature F_3^h, the third-layer vertical feature F_3^v and the third-layer full-region feature F_3^a respectively;
step 3.3.8: pad the ignored edge portions of F_3^h and F_3^v with zero elements to obtain the third-layer padded horizontal feature F_3^hz and the third-layer padded vertical feature F_3^vz; pass F_3^a through the channel-space attention mechanism to obtain the third-layer key feature F_3^k; add F_3^hz, F_3^vz and F_3^k to obtain the third-layer spatial feature F_3;
The third-layer spatial feature F_3 is calculated as follows:
F_3 = F_3^hz + F_3^vz + F_3^k
step 3.3.9: perform multi-scale fusion of the spatial features obtained in steps 3.3.2, 3.3.5 and 3.3.8 to obtain the final spatial features;
Apply transposed convolution operations to the second-layer spatial feature F_2 and the third-layer spatial feature F_3 obtained in steps 3.3.5 and 3.3.8 to enlarge them, and pad the smaller result with zero elements to size (n/3) × r × r; finally, splice the first-layer spatial feature F_1 of step 3.3.2 with the two resulting features, all of size (n/3) × r × r, and perform a convolution operation to obtain the final spatial features.
Further, the cross entropy loss function is calculated as follows:
L = -(1/T_1) Σ_{i=1}^{T_1} [l_i log p_i + (1 - l_i) log(1 - p_i)]
where l_i is the pseudo label of the i-th sample obtained from the pseudo-label generator, p_i is the predicted label of the i-th sample, and T_1 is the number of training samples.
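A one-line PyTorch sketch of this loss, assuming sigmoid change probabilities p_i and binary pseudo labels l_i:

```python
import torch.nn.functional as F

def pseudo_label_loss(pred_prob, pseudo_label):
    """Binary cross entropy between predictions p_i and pseudo labels l_i,
    averaged over the T_1 training samples."""
    return F.binary_cross_entropy(pred_prob, pseudo_label.float())
```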
Further, the channel-space attention mechanism in steps 3.3.2, 3.3.5 and 3.3.8 is implemented as follows:
Pass the obtained l-th layer full-region feature F_l^a through global average pooling and global maximum pooling to obtain the smooth feature F_l^cavg and the sharp feature F_l^cmax respectively; then send the smooth feature F_l^cavg and the sharp feature F_l^cmax into a multi-layer perceptron with one hidden layer and obtain the channel attention map M_C by element-wise summation;
The channel attention map M_C is calculated as follows:
M_C = σ(MLP(F_l^cavg) + MLP(F_l^cmax))
where σ denotes the Sigmoid function, MLP(·) denotes the multi-layer perceptron, F_l^cavg = Avg(F_l^a) and F_l^cmax = Max(F_l^a), with Avg(·) denoting the global average pooling layer and Max(·) denoting the global maximum pooling layer;
Multiply the obtained channel attention map M_C element-wise with the obtained l-th layer full-region feature F_l^a to obtain the l-th layer channel-weighted feature F_l^w;
The l-th layer channel-weighted feature F_l^w is calculated as follows:
F_l^w = M_C ⊗ F_l^a
where ⊗ denotes element-by-element multiplication;
Perform average pooling and maximum pooling on the obtained l-th layer channel-weighted feature F_l^w along the channel dimension to obtain the spatial smooth feature F_l^savg and the spatial sharp feature F_l^smax respectively; splice the two features along the channel dimension and finally generate the spatial attention map M_s through a convolution operation;
The spatial attention map M_s is calculated as follows:
M_s = σ(Conv([F_l^savg; F_l^smax]))
where [·;·] denotes channel-wise concatenation and Conv(·) denotes the convolution operation;
Multiply the obtained spatial attention map M_s element-wise with the l-th layer channel-weighted feature F_l^w to obtain the l-th layer key feature F_l^k;
The l-th layer key feature F_l^k is calculated as follows:
F_l^k = M_s ⊗ F_l^w
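A compact PyTorch sketch of this channel-space attention, read as a CBAM-style module (the hidden-layer width and the 7 × 7 spatial convolution are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """F_l^a -> M_C -> F_l^w -> M_s -> F_l^k, as in the formulas above."""
    def __init__(self, channels: int, reduction: int = 4, kernel_size: int = 7):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(  # shared one-hidden-layer perceptron
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f_a: torch.Tensor) -> torch.Tensor:  # f_a: (B, C, H, W)
        b, c, _, _ = f_a.shape
        f_cavg = f_a.mean(dim=(2, 3))                      # global average pooling
        f_cmax = f_a.amax(dim=(2, 3))                      # global maximum pooling
        m_c = torch.sigmoid(self.mlp(f_cavg) + self.mlp(f_cmax)).view(b, c, 1, 1)
        f_w = m_c * f_a                                    # channel-weighted feature F_l^w
        s_avg = f_w.mean(dim=1, keepdim=True)              # pooling along the channel dimension
        s_max = f_w.amax(dim=1, keepdim=True)
        m_s = torch.sigmoid(self.conv(torch.cat([s_avg, s_max], dim=1)))
        return m_s * f_w                                   # key feature F_l^k
```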
the invention provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain when executing the computer program.
The invention proposes a computer readable storage medium for storing computer instructions which, when executed by a processor, implement the steps of the spatial frequency domain based unsupervised depth multiscale SAR image change detection method.
Compared with the prior art, the invention has the beneficial effects that:
1. Pseudo labels are extracted from the difference image generated from the SAR images of the detection region through hierarchical fuzzy C-means clustering, which alleviates the problem of insufficient labels and makes the SAR image change detection method applicable to more scenes.
2. The invention exploits both the spatial information and the frequency-domain information of the input SAR images and proposes spatial multi-region, multi-scale deep feature extraction; the image features captured in this way are more beneficial to detection.
3. The invention introduces an attention mechanism and gated linear units in the spatial domain and the frequency domain respectively, improving sensitivity to change details, reducing the influence of the inherent speckle noise of SAR images and improving detection accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of an unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain.
Fig. 2 is a schematic diagram of the image processing process of the present invention.
Fig. 3 is a schematic diagram of a channel-space attention mechanism.
FIG. 4 is a schematic diagram of input data according to the present invention.
Fig. 5 is a graph comparing the effect of the present invention with that of the prior art method.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
the present invention will be described in detail with reference to specific examples.
The invention provides an unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain, which comprises the following steps:
step 1: perform a logarithmic ratio operation on the two acquired SAR images I_1 and I_2 of the same region at different times to obtain the difference image I_d;
The logarithmic ratio operation is calculated as follows:
I_d = |log(I_2 / I_1)| = |log I_2 - log I_1|
where log denotes the logarithm with base e and |·| denotes the absolute value operation;
step 2: use the pseudo-label generator to extract pseudo labels from the difference image I_d, thereby constructing the training and test data sets;
step 2.1: perform hierarchical fuzzy C-means clustering on the difference image I_d so that the pixels of I_d are divided into 3 classes, namely the changed class, the unchanged class and the uncertain class; the class information of a pixel is the pseudo label of that pixel;
step 2.2: select one third of the pixels from all changed pixels and one tenth of the pixels from all unchanged pixels; centered on each selected pixel, extract image blocks of size 15 × 15 from the two input SAR images I_1, I_2 and the difference image I_d and splice them into blocks of size 3 × 15 × 15; the spliced image blocks together with the pseudo labels of their central pixels form the training data set;
step 2.3: centered on each pixel of the uncertain class, extract 15 × 15 image blocks from the two input SAR images I_1, I_2 and the difference image I_d and splice them into blocks of size 3 × 15 × 15 to form the test data set;
Step 3: inputting the training data set generated in the step 2.2 into an unsupervised depth multi-scale network of a spatial frequency domain for training;
the spatial frequency domain unsupervised depth multi-scale network is used for extracting spatial features and frequency domain features of input data respectively, splicing the extracted features, judging the possibility of change or no change of the extracted features, and obtaining the corresponding pixel types according to the level of the possibility.
Step 3.1: performing convolution dimension-increasing operation on input data;
a convolution operation is performed on the input with a convolution kernel of 15 x 1. A convolution feature of size 15 x 15 is obtained.
Step 3.2: performing multi-region selection on the characteristics;
the obtained features are divided into P 1 、P 2 And P 3 Three sections, each section having a size of 5 x 15;
selecting P 1 I.e. neglecting P 1 Edge regions above and below the level, remain P 1 Obtaining a horizontal region of size 5 x 3 x 15;
selecting P 2 I.e. ignore P 2 Edge regions on the left and right sides in the vertical direction retain P 2 To obtain a vertical region of size 5 x 15 x 3;
reserved P 3 The complete area gives a complete area of size 5×15×15;
Step 3.3: deep multi-scale feature extraction is carried out;
step 3.3.1: convolve the horizontal region, vertical region and full region obtained in step 3.2 with convolution kernels of size 5 × 3 to obtain the first-layer horizontal feature F_1^h, the first-layer vertical feature F_1^v and the first-layer full-region feature F_1^a respectively;
step 3.3.2: pad the ignored edge portions of F_1^h and F_1^v with zero elements to obtain the first-layer padded horizontal feature F_1^hz and the first-layer padded vertical feature F_1^vz; pass F_1^a through the channel-space attention mechanism to obtain the first-layer key feature F_1^k; add F_1^hz, F_1^vz and F_1^k to obtain the first-layer spatial feature F_1;
The first-layer spatial feature F_1 is calculated as follows:
F_1 = F_1^hz + F_1^vz + F_1^k
step 3.3.3: perform multi-region selection again, following step 3.2, on the first-layer spatial feature F_1 obtained in step 3.3.2;
step 3.3.4: convolve the horizontal region, vertical region and full region obtained in step 3.3.3 with convolution kernels of size 5 × 5 to obtain the second-layer horizontal feature F_2^h, the second-layer vertical feature F_2^v and the second-layer full-region feature F_2^a respectively;
step 3.3.5: pad the ignored edge portions of F_2^h and F_2^v with zero elements to obtain the second-layer padded horizontal feature F_2^hz and the second-layer padded vertical feature F_2^vz; pass F_2^a through the channel-space attention mechanism to obtain the second-layer key feature F_2^k; add F_2^hz, F_2^vz and F_2^k to obtain the second-layer spatial feature F_2;
The second-layer spatial feature F_2 is calculated as follows:
F_2 = F_2^hz + F_2^vz + F_2^k
step 3.3.6: perform multi-region selection, following step 3.2, on the second-layer spatial feature obtained in step 3.3.5;
step 3.3.7: convolve the horizontal region, vertical region and full region obtained in step 3.3.6 with convolution kernels of size 5 × 7 to obtain the third-layer horizontal feature F_3^h, the third-layer vertical feature F_3^v and the third-layer full-region feature F_3^a respectively;
step 3.3.8: pad the ignored edge portions of F_3^h and F_3^v with zero elements to obtain the third-layer padded horizontal feature F_3^hz and the third-layer padded vertical feature F_3^vz; pass F_3^a through the channel-space attention mechanism to obtain the third-layer key feature F_3^k; add F_3^hz, F_3^vz and F_3^k to obtain the third-layer spatial feature F_3;
The third-layer spatial feature F_3 is calculated as follows:
F_3 = F_3^hz + F_3^vz + F_3^k
step 3.3.9: perform multi-scale fusion of the spatial features obtained in steps 3.3.2, 3.3.5 and 3.3.8 to obtain the final spatial features;
Apply transposed convolution operations to the second-layer spatial feature F_2 and the third-layer spatial feature F_3 obtained in steps 3.3.5 and 3.3.8 to obtain features of size 5 × 15 × 15 and 5 × 13 × 13 respectively. Pad the 5 × 13 × 13 feature with zero elements to size 5 × 15 × 15. Finally, splice the first-layer spatial feature F_1 of step 3.3.2 with the two features of size 5 × 15 × 15 from this step and perform a convolution operation to obtain the final spatial features;
step 3.4: perform a fast Fourier transform on the image blocks generated in step 2.2 to obtain frequency-domain features, and pass them through three gated linear units; the output of the third gated linear unit is taken as the final frequency-domain feature;
One gated linear unit is calculated as follows:
D_l = (X_l W_1 + a) ⊗ σ(X_l W_2 + b)
where D_l is the output of the l-th gated linear unit, X_l is the input of the l-th gated linear unit, W_1 and W_2 are weight matrices, a and b are biases, σ is the Sigmoid function, and ⊗ denotes element-wise multiplication;
step 3.5: splice the spatial features and frequency-domain features obtained in steps 3.3.9 and 3.4 and send them into a fully connected layer to obtain the corresponding predicted labels;
step 4: calculate the cross entropy loss and perform back propagation;
The cross entropy loss function is calculated as follows:
L = -(1/T_1) Σ_{i=1}^{T_1} [l_i log p_i + (1 - l_i) log(1 - p_i)]
where l_i is the pseudo label of the i-th sample obtained from the pseudo-label generator, p_i is the predicted label of the i-th sample, and T_1 is the number of training samples.
Step 5: inputting the test data set of the step 2.3 into the network after the step 3, and carrying out label prediction on the pixel points in the test data set according to the process of the step 3;
step 6: combining the predictive label obtained in step 5 with the pseudo label in the training dataset in step 2.2 to obtain final change information about the input region.
the invention is further illustrated by simulation experiments as follows:
the detection effect of the invention is verified by four groups of SAR images obtained by shooting 4 groups of SAR images in the same region at different times. The first line of figure 4 shows the Ottawa dataset captured by the Radarsat-1SAR sensor in wortmanning, canada, 5 and 8 months 1997, with an image size of 290 x 350. The second line of fig. 4 is the Sulzberger dataset taken by the Envisat satellite in the soviebert ice bank at 11 and 16 days of 2011, 3 months. The original picture size is 2263×264, here a pixel size of 256×256 is selected. The third and fourth rows of fig. 4 are the yellow river C and D datasets, respectively, captured by Radarsat-2 in 2008 and 2009, respectively, with original dimensions 7666 x 7692, and truncations 291 x 444 and 306 x 291, respectively, constituting the yellow river C and D datasets. The column (c) in fig. 4 represents the true variation region.
The comparative experimental results of the SAR image change detection algorithms are shown in Fig. 5. The comparison methods are as follows: the convolutional-wavelet neural network (CWNN) was proposed in the article "Sea ice change detection in SAR images based on convolutional-wavelet neural networks"; the Gabor principal component analysis network (GaborPCANet) was proposed in the article "Automatic change detection in synthetic aperture radar images based on PCANet"; the extreme learning machine (ELM) was proposed in the article "Change detection from synthetic aperture radar images based on neighborhood-based ratio and extreme learning machine"; the dual-domain network (DDNet) was proposed in the article "Change detection in synthetic aperture radar images using a dual-domain network"; and the layer attention-based noise-tolerant network (LANTNet) was proposed in the article "Synthetic aperture radar image change detection via layer attention-based noise-tolerant network". In Fig. 5, a black pixel is an actually unchanged pixel detected as unchanged, a white pixel is an actually changed pixel detected as changed, a red pixel is an actually unchanged pixel falsely detected as changed, and a green pixel is an actually changed pixel falsely detected as unchanged. As can be seen from Fig. 5, the invention produces fewer falsely detected pixels (red and green pixels) on the four datasets than the other methods and therefore extracts the change information in the images more accurately.
As can be seen from the first six columns of Fig. 5, the invention produces fewer green pixels at the image change boundaries and fewer red pixels in the unchanged regions than the other algorithms, indicating that the method better captures change details, reduces the interference of noise on detection, and improves detection accuracy.
The invention is compared with the other methods on the following objective evaluation indexes: false positives (FP), false negatives (FN), the total false detection number (OE), the classification accuracy (PCC) and the Kappa coefficient (KC). The specific meanings of these indexes are as follows:
False positives (FP): the number of pixels that are actually unchanged but detected as changed;
False negatives (FN): the number of pixels that are actually changed but detected as unchanged;
Total false detection number (OE): the total number of falsely detected pixels:
OE = FP + FN
Classification accuracy (PCC): the ratio of the total number of correctly detected pixels to the total number of pixels:
PCC = (TP + TN) / N
where TP denotes the number of pixels actually changed and detected as changed, and TN denotes the number of pixels actually unchanged and detected as unchanged;
Kappa coefficient (KC): measures the consistency between the detection result and the ground truth:
KC = (PCC - PRE) / (1 - PRE)
where
PRE = ((TP + FP)(TP + FN) + (FN + TN)(FP + TN)) / N²
and N is the total number of pixels.
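A small NumPy sketch computing these indexes from a binary detection map and the ground-truth change map (names are illustrative):

```python
import numpy as np

def change_detection_metrics(pred, truth):
    """FP, FN, OE, PCC and KC for binary change maps (1 = changed)."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    tp = int(np.sum((pred == 1) & (truth == 1)))
    tn = int(np.sum((pred == 0) & (truth == 0)))
    fp = int(np.sum((pred == 1) & (truth == 0)))
    fn = int(np.sum((pred == 0) & (truth == 1)))
    n = tp + tn + fp + fn
    pcc = (tp + tn) / n
    pre = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {"FP": fp, "FN": fn, "OE": fp + fn, "PCC": pcc, "KC": (pcc - pre) / (1 - pre)}
```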
The number of falsely detected pixels in the detection result can be judged from the magnitude of OE; a lower OE together with a higher PCC and KC indicates higher detection accuracy. Tables 1, 2, 3 and 4 present the detection results of the invention and the above comparison methods on the four data sets, respectively.
TABLE 1 Detection results of the present application and the comparison methods on the Ottawa data set
TABLE 2 Detection results of the present application and the comparison methods on the Sulzberger data set
TABLE 3 Detection results of the present application and the comparison methods on the Yellow River C data set
TABLE 4 Detection results of the present application and the comparison methods on the Yellow River D data set
In summary, the method can detect changed regions in different scenes, its detection results are superior to those of the other algorithms, and it realizes SAR image change region detection with better detection accuracy.
The application provides an electronic device, comprising a memory and a processor, wherein the memory stores a computer program and the processor, when executing the computer program, implements the steps of the unsupervised depth multi-scale SAR image change detection method based on the spatial frequency domain.
The application provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the unsupervised depth multi-scale SAR image change detection method based on the spatial frequency domain.
The memory in embodiments of the present application may be volatile or nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DRRAM). It should be noted that the memory of the methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
In the above embodiments, the implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center containing one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)).
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method. To avoid repetition, a detailed description is not provided herein.
It should be noted that the processor in the embodiments of the present application may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method embodiments may be implemented by integrated logic circuits of hardware in a processor or instructions in software form. The processor may be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The disclosed methods, steps, and logic blocks in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The spatial-frequency-domain-based unsupervised depth multi-scale SAR image change detection method provided by the invention has been described in detail above. Specific examples are used herein to illustrate the principle and implementation of the invention, and the description of the above embodiments is only intended to help understand the method and its core ideas. Meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the ideas of the invention. In summary, the contents of this description should not be construed as limiting the invention.

Claims (10)

1. An unsupervised depth multi-scale SAR image change detection method based on a spatial frequency domain, characterized by comprising the following steps:
step 1: perform a logarithmic ratio operation on the two acquired SAR images I_1 and I_2 of the same region at different times to obtain the difference image I_d;
step 2: use the pseudo-label generator to extract pseudo labels from the difference image I_d, thereby constructing the training and test data sets;
step 3: input the generated training data set into the spatial-frequency-domain unsupervised depth multi-scale network for training;
step 4: calculate the cross entropy loss and perform back propagation;
step 5: input the test data set into the trained unsupervised depth multi-scale network and perform label prediction on the pixels in the test data set;
step 6: combine the obtained predicted labels with the pseudo labels in the training data set to obtain the final change information of the input region.
2. The method according to claim 1, characterized in that: in step 1, the logarithmic ratio operation is calculated as follows:
I_d = |log(I_2 / I_1)| = |log I_2 - log I_1|
where log denotes the logarithm with base e and |·| denotes the absolute value operation.
3. The method according to claim 2, characterized in that: the step 2 specifically comprises the following steps:
step 2.1: perform hierarchical fuzzy C-means clustering on the difference image I_d so that the pixels of I_d are divided into 3 classes, namely the changed class, the unchanged class and the uncertain class; the class information of a pixel is the pseudo label of that pixel;
step 2.2: select part of the pixels from the changed class and the unchanged class; centered on each selected pixel, extract image blocks of size r × r from the two input SAR images I_1, I_2 and the difference image I_d respectively and splice them; the spliced image blocks together with the pseudo labels of their central pixels form the training data set;
step 2.3: centered on each pixel of the uncertain class, extract image blocks of size r × r from the two input SAR images I_1, I_2 and the difference image I_d respectively and splice them to form the test data set.
4. The method according to claim 3, characterized in that: the spatial-frequency-domain unsupervised depth multi-scale network extracts the spatial features and the frequency-domain features of the input data separately, splices the extracted features, judges the probability that the corresponding pixel is changed or unchanged, and obtains the pixel class according to which probability is higher.
5. The method according to claim 4, wherein: the step 3 specifically comprises the following steps:
step 3.1: perform a convolutional dimension-increasing operation on the input data;
a convolution is applied to the input with a kernel of size n × 1, obtaining convolution features of size n × r × r;
step 3.2: perform multi-region selection on the features;
the obtained features are divided into three parts P_1, P_2 and P_3, each of size (n/3) × r × r;
select the horizontal region of P_1, i.e. ignore the edge regions above and below the central horizontal band of P_1 and retain a horizontal region of P_1 of size (n/3) × 3 × r;
select the vertical region of P_2, i.e. ignore the edge regions on the left and right of the central vertical band of P_2 and retain a vertical region of P_2 of size (n/3) × r × 3;
retain the complete region of P_3, obtaining a full region of size (n/3) × r × r;
step 3.3: deep multi-scale feature extraction is carried out;
step 3.4: perform a fast Fourier transform on the generated image blocks to obtain frequency-domain features, and pass them through three gated linear units; the output of the third gated linear unit is taken as the final frequency-domain feature;
one gated linear unit is calculated as follows:
D_l = (X_l W_1 + a) ⊗ σ(X_l W_2 + b)
where D_l is the output of the l-th gated linear unit, X_l is the input of the l-th gated linear unit, W_1 and W_2 are weight matrices, a and b are biases, σ is the Sigmoid function, and ⊗ denotes element-wise multiplication;
step 3.5: splice the obtained final spatial features and frequency-domain features and send them into a fully connected layer to obtain the corresponding predicted labels.
6. The method according to claim 5, wherein: the step 3.3 specifically comprises the following steps:
step 3.3.1: the horizontal area, the vertical area and the full area obtained in the step 3.1 are firstly combined with the size ofConvolving the convolution kernels to obtain first layer horizontal features F 1 h First layer vertical feature F 1 v And first layer full area feature F 1 a
Step 3.3.2: to first layer horizontal feature F 1 h And first layer vertical feature F 1 v Filling the ignored edge portions with 0 element to obtain the final product with the same sizeFirst layer fill level feature F of (1) 1 hz And a first layer filling vertical feature F 1 vz To the first layer full area characteristic F 1 a Obtaining a first layer of key features F through a channel-space attention mechanism 1 k The method comprises the steps of carrying out a first treatment on the surface of the Filling level of first layer feature F 1 hz First layer filled vertical feature F 1 vz And first layer key feature F 1 k Adding to obtain a first layer of spatial features F 1
First layer spatial features F 1 The calculation process is as follows:
F 1 =F 1 hz +F 1 vz +F 1 k
step 3.3.3: the obtained first layer space feature F 1 Again performing multi-region selection;
step 3.3.4: the horizontal area, the vertical area and the full area obtained in the step 3.3.3 are equal toConvolving with a convolution kernel of (c), where k 2 >k 1 Respectively obtaining the horizontal characteristic of the second layer +.>Second layer vertical feature->And second layer full area feature->
Step 3.3.5: to level the second layerAnd second layer vertical feature->Filling the ignored edge portions with 0 element to obtain a size of +.>Second layer fill level feature of->And a second layer fills the vertical feature->Characterization of the second layer full area->Obtaining the second layer key feature after passage-space attention mechanism>Second layer fill level characterization- >Second layer filled vertical features->And second layer key feature->Adding to obtain a second layer of spatial features F 2
Second layer spatial features F 2 The calculation process is as follows:
step 3.3.6: the obtained second layer space feature F 2 Performing multi-region selection;
step 3.3.7: the horizontal area, the vertical area and the full area obtained in the step 3.3.6 are equal toConvolving with a convolution kernel of (c), where k 3 >k 2 Respectively obtain the third layer horizontal characteristic->Third layer vertical feature->And third layer full area feature->
Step 3.3.8: filling the neglected edge portions of the third-layer horizontal feature F_3^h and the third-layer vertical feature F_3^v with zero elements to obtain the third-layer padded horizontal feature F_3^hz and the third-layer padded vertical feature F_3^vz, both of the same size as F_3^a; passing the third-layer full-region feature F_3^a through the channel-space attention mechanism to obtain the third-layer key feature F_3^k; adding the third-layer padded horizontal feature F_3^hz, the third-layer padded vertical feature F_3^vz, and the third-layer key feature F_3^k to obtain the third-layer spatial feature F_3;

The third-layer spatial feature F_3 is calculated as follows:

F_3 = F_3^hz + F_3^vz + F_3^k
Step 3.3.9: performing multi-scale fusion on the spatial features obtained in steps 3.3.2, 3.3.5, and 3.3.8 to obtain the final spatial features;

specifically, transposed convolution operations are applied to the second-layer spatial feature F_2 obtained in step 3.3.5 and the third-layer spatial feature F_3 obtained in step 3.3.8 to enlarge each of them; the smaller enlarged feature is then filled with zero elements up to the size of the first-layer spatial feature F_1; finally, the first-layer spatial feature F_1 of step 3.3.2 and the equally sized features of step 3.3.9 are concatenated, and a convolution operation is applied to obtain the final spatial features.
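A sketch of the multi-scale fusion of step 3.3.9; the channel counts, map sizes, and transposed-convolution kernels below are assumptions chosen only to make the sizes line up, as the claim leaves them to the embodiment:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes: F1 is the largest map; F2 and F3 shrink with kernel size.
F1 = torch.randn(1, 32, 24, 24)
F2 = torch.randn(1, 32, 20, 20)
F3 = torch.randn(1, 32, 14, 14)

up2 = nn.ConvTranspose2d(32, 32, kernel_size=5)(F2)  # transposed convolution: 20 -> 24
up3 = nn.ConvTranspose2d(32, 32, kernel_size=7)(F3)  # transposed convolution: 14 -> 20
up3 = F.pad(up3, (2, 2, 2, 2))                       # zero-fill 20 -> 24, matching F1

fused = torch.cat([F1, up2, up3], dim=1)             # splice the equally sized maps
final_spatial = nn.Conv2d(96, 32, kernel_size=1)(fused)  # fuse by convolution
```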
7. The method according to claim 6, wherein: the cross entropy loss function is calculated as follows:

L = -(1/T_1) Σ_{i=1}^{T_1} [ l_i log(p_i) + (1 - l_i) log(1 - p_i) ]

where l_i is the pseudo label of the i-th sample obtained from the label generator, p_i is the predicted label of the i-th sample, and T_1 is the number of training samples.
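A sketch of this loss as binary cross-entropy over the pseudo labels; the mean reduction and the clamping safeguard are implementation assumptions beyond what the claim states:

```python
import torch

def cross_entropy_loss(p: torch.Tensor, l: torch.Tensor) -> torch.Tensor:
    """L = -(1/T_1) Σ_i [ l_i·log(p_i) + (1 - l_i)·log(1 - p_i) ]
    p: predicted probabilities in (0, 1); l: pseudo labels in {0, 1}."""
    eps = 1e-7                 # numerical safeguard, an implementation detail
    p = p.clamp(eps, 1 - eps)  # keep log() finite
    return -(l * p.log() + (1 - l) * (1 - p).log()).mean()
```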
8. The method according to claim 7, wherein: the channel-space attention mechanism in steps 3.3.2, 3.3.5, 3.3.8 is implemented as follows:
passing the obtained l-th-layer full-region feature F_l^a through global average pooling and global maximum pooling to obtain the smooth feature F_l^cavg and the sharp feature F_l^cmax, respectively; then feeding the smooth feature F_l^cavg and the sharp feature F_l^cmax separately into a multi-layer perceptron with one hidden layer, and obtaining the channel attention map M_C by element-wise summation;
The channel attention map M_C is calculated as follows:

M_C = σ(MLP(Avg(F_l^a)) + MLP(Max(F_l^a)))

where σ denotes the Sigmoid function, MLP(·) denotes the multi-layer perceptron, Avg(·) denotes the average pooling layer, and Max(·) denotes the maximum pooling layer;
multiplying the obtained channel attention map M_C element-wise with the obtained l-th-layer full-region feature F_l^a, so that the attention values act over the spatial dimension, thereby obtaining the l-th-layer channel-weighted feature F_l^w;

The l-th-layer channel-weighted feature F_l^w is calculated as follows:

F_l^w = M_C ⊗ F_l^a

where ⊗ denotes element-wise multiplication;
performing average pooling and maximum pooling operations along the channel dimension on the obtained l-th-layer channel-weighted feature F_l^w to obtain the spatially smooth feature F_l^savg and the spatially sharp feature F_l^smax, respectively; concatenating the two features along the channel dimension, and finally generating the spatial attention map M_S through a convolution operation;
The spatial attention map M_S is calculated as follows:

M_S = σ(Conv([Avg(F_l^w); Max(F_l^w)]))

where [·;·] denotes channel-wise concatenation;
multiplying the obtained spatial attention map M_S element-wise with the l-th-layer channel-weighted feature F_l^w to obtain the l-th-layer key feature F_l^k;

The l-th-layer key feature F_l^k is calculated as follows:

F_l^k = M_S ⊗ F_l^w
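The channel-space attention of claim 8 follows a channel-then-spatial pattern; the PyTorch sketch below assumes an MLP reduction ratio and a 7×7 spatial convolution kernel, neither of which is fixed by the claim:

```python
import torch
import torch.nn as nn

class ChannelSpaceAttention(nn.Module):
    """Sketch of claim 8: F_l^k = M_S ⊗ (M_C ⊗ F_l^a)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Multi-layer perceptron with one hidden layer, shared by both pooled vectors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        # Convolution turning the two channel-pooled maps into the spatial map M_S
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, f_a: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f_a.shape
        # Channel attention M_C: global average/max pooling, shared MLP, sum, Sigmoid
        m_c = torch.sigmoid(
            self.mlp(f_a.mean(dim=(2, 3))) + self.mlp(f_a.amax(dim=(2, 3)))
        ).view(b, c, 1, 1)
        f_w = f_a * m_c  # channel-weighted feature F_l^w
        # Spatial attention M_S: pool along channels, concatenate, convolve, Sigmoid
        smooth = f_w.mean(dim=1, keepdim=True)
        sharp = f_w.amax(dim=1, keepdim=True)
        m_s = torch.sigmoid(self.conv(torch.cat([smooth, sharp], dim=1)))
        return f_w * m_s  # key feature F_l^k
```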
9. an electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1-8 when the computer program is executed.
10. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-8.
CN202310790988.5A 2023-06-30 2023-06-30 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain Active CN116778207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310790988.5A CN116778207B (en) 2023-06-30 2023-06-30 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310790988.5A CN116778207B (en) 2023-06-30 2023-06-30 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain

Publications (2)

Publication Number Publication Date
CN116778207A true CN116778207A (en) 2023-09-19
CN116778207B CN116778207B (en) 2024-02-09

Family

ID=87994466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310790988.5A Active CN116778207B (en) 2023-06-30 2023-06-30 Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain

Country Status (1)

Country Link
CN (1) CN116778207B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257151A (en) * 2017-12-22 2018-07-06 西安电子科技大学 PCANet image change detection methods based on significance analysis
CN112541904A (en) * 2020-12-16 2021-03-23 西安电子科技大学 Unsupervised remote sensing image change detection method, storage medium and computing device
CN112734695A (en) * 2020-12-23 2021-04-30 中国海洋大学 SAR image change detection method based on regional enhancement convolutional neural network
CN112862748A (en) * 2020-12-25 2021-05-28 重庆大学 Multidimensional domain feature combined SAR (synthetic aperture radar) ship intelligent detection method
CN114913565A (en) * 2021-01-28 2022-08-16 腾讯科技(深圳)有限公司 Face image detection method, model training method, device and storage medium
CN113744257A (en) * 2021-09-09 2021-12-03 展讯通信(上海)有限公司 Image fusion method and device, terminal equipment and storage medium
CN115018773A (en) * 2022-05-23 2022-09-06 中国海洋大学 SAR image change detection method based on global dynamic convolution neural network
CN116071748A (en) * 2023-01-20 2023-05-05 南京信息工程大学 Unsupervised video target segmentation method based on frequency domain global filtering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO, Chunhui, et al.: "SAR Image Change Detection in Spatial-Frequency Domain Based on Attention Mechanism and Gated Linear Unit", Geoscience and Remote Sensing Letters *

Also Published As

Publication number Publication date
CN116778207B (en) 2024-02-09

Similar Documents

Publication Publication Date Title
Lu et al. Attention and feature fusion SSD for remote sensing object detection
Yin et al. Hot region selection based on selective search and modified fuzzy C-means in remote sensing images
Zheng et al. Development of a gray-level co-occurrence matrix-based texture orientation estimation method and its application in sea surface wind direction retrieval from SAR imagery
CN111080629A (en) Method for detecting image splicing tampering
Jung Detecting building changes from multitemporal aerial stereopairs
CN106557740B (en) The recognition methods of oil depot target in a kind of remote sensing images
CN111476159A (en) Method and device for training and detecting detection model based on double-angle regression
Yang et al. Evaluating SAR sea ice image segmentation using edge-preserving region-based MRFs
CN112560710B (en) Method for constructing finger vein recognition system and finger vein recognition system
Fan et al. A novel sonar target detection and classification algorithm
Debnath et al. Brain tumour segmentation using memory based learning method
Lv et al. Novel automatic approach for land cover change detection by using VHR remote sensing images
CN112734695B (en) SAR image change detection method based on regional enhancement convolutional neural network
Oga et al. River state classification combining patch-based processing and CNN
CN116778207B (en) Unsupervised depth multi-scale SAR image change detection method based on spatial frequency domain
Xie et al. A deep-learning-based fusion approach for global cyclone detection using multiple remote sensing data
Albalooshi et al. Deep belief active contours (DBAC) with its application to oil spill segmentation from remotely sensed sea surface imagery
Arivazhagan et al. Optimal Gabor sub-band-based spectral kurtosis and Teager Kaiser energy for maritime target detection in SAR images
Xu et al. Detection method of tunnel lining voids based on guided anchoring mechanism
CN109886941A (en) SAR flood remote sensing imagery change detection method based on FPGA
WO2023273337A1 (en) Representative feature-based method for detecting dense targets in remote sensing image
CN114897880A (en) Remote sensing image change detection method based on self-adaptive image regression
Sivapriya et al. ViT-DexiNet: a vision transformer-based edge detection operator for small object detection in SAR images
CN114331950A (en) SAR image ship detection method based on dense connection sparse activation network
Wang et al. Sonar objective detection based on dilated separable densely connected CNNs and quantum-behaved PSO algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant