CN113469074A - Remote sensing image change detection method and system based on twin attention fusion network - Google Patents


Info

Publication number
CN113469074A
Authority
CN
China
Prior art keywords
remote sensing
twin
network
module
sensing image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110765132.3A
Other languages
Chinese (zh)
Other versions
CN113469074B (en)
Inventor
陈璞花
孙杰
焦李成
刘芳
李玲玲
张向荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202110765132.3A priority Critical patent/CN113469074B/en
Publication of CN113469074A publication Critical patent/CN113469074A/en
Application granted granted Critical
Publication of CN113469074B publication Critical patent/CN113469074B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Abstract

The invention discloses a remote sensing image change detection method and system based on a twin attention fusion network. The method preprocesses a bi-temporal remote sensing image pair; makes the preprocessed images into a training data set and a test data set; constructs a twin attention fusion network, SE-Siam-ResNet; trains the network using the training data set; and inputs the test data set into the trained network, determining from the output whether the center pixel of each input image pair has changed, thereby completing the change detection. The method can greatly improve change detection performance on remote sensing images.

Description

Remote sensing image change detection method and system based on twin attention fusion network
Technical Field
The invention belongs to the technical field of remote sensing image change detection, and particularly relates to a remote sensing image change detection method and system based on a twin attention fusion network.
Background
Change detection determines surface and ground-object change information by analyzing images of the same geographic area captured at different times. The goal is to find the change information of interest while filtering out irrelevant changes that appear as interference. Remote sensing images contain many changes unrelated to the region of interest, with diverse causes such as the solar illumination angle, surface humidity, and the seasons in which the multi-temporal images were captured.
In recent years, deep learning has developed rapidly and plays an important role in digital image processing. Deep learning can automatically learn task-relevant features in a data-driven way; here the change detection task is converted into a pixel-by-pixel classification problem, and a change detection algorithm based on the fusion of an attention mechanism and a twin network is designed. Pixel-by-pixel classification algorithms usually concatenate the bi-temporal images along the channel dimension and cut image blocks from the concatenated image as training data for a classification network. During training, this approach cannot explicitly strengthen the relationship between the two remote sensing images, which reduces the fitting capacity of the model to some extent. The attention-based twin network fusion algorithm instead feeds the bi-temporal images into two branches with the same structure, and fuses the feature images of the two branches through an attention mechanism to select the channels beneficial to classification and adaptively recalibrate their feature responses.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the above shortcomings of the prior art, a remote sensing image change detection method and system based on a twin attention fusion network, thereby improving the prediction accuracy of remote sensing image change detection.
The invention adopts the following technical scheme:
a remote sensing image change detection method based on a twin attention fusion network comprises the following steps:
s1, preprocessing the double-time-phase remote sensing image;
s2, making the double-temporal remote sensing image preprocessed in the step S1 into a training data set and a testing data set;
s3, constructing a twin attention fusion network SE-Sim-Resnet;
s4, training the twin attention fusion network constructed in the step S3 by using the training data set manufactured in the step S2;
and S5, inputting the test data set manufactured in the step S2 into the twin attention fusion network trained in the step S4, and determining whether the central pixel of the input remote sensing image pair changes according to the output result to finish the change detection of the remote sensing image.
Specifically, in step S1, two remote sensing images of the same geographical area captured at different times are read in, and both are preprocessed using radiometric correction.
Specifically, in step S2, 400 changed pixels and 1600 unchanged pixels are sampled at random from the label image, and paired image blocks of size 10 × 10 centered on each of these 2000 pixels are cut from the corrected bi-temporal remote sensing images to serve as the training data set;
the first 5 rows of the remote sensing image are mirrored upward, the last 5 rows downward, the left 5 columns leftward, and the right 5 columns rightward in turn (mirror padding), and paired image blocks centered on every pixel of the image before padding are cut to serve as the test data set;
the labels of the image pairs in the training and test data sets take the values 0 and 1, indicating that the center pixel of the image pair is an unchanged or a changed pixel, respectively.
Specifically, in step S3, the construction of the twin attention fusion network specifically includes:
s301, building two branch networks, and respectively extracting the characteristics of the double-time-phase remote sensing image;
s302, adding an SE attention mechanism between the two branch networks built in the step S301, and constructing an SE attention fusion module;
s303, adding a classifier at the tail end of the two branch networks built in the step S301, and forming a twin attention fusion network by the two branch networks, the SE attention fusion module and the classifier.
Further, in step S301, each branch network is formed by stacking residual modules; within a branch network, residual modules that output feature images of the same size are grouped together, and the branch network used here contains 3 groups of 5 residual modules each; the two branch networks have the same structure and share the same set of network parameters.
Further, in step S302, a pair of parameter-sharing residual modules located at the same position in the two branch networks is denoted a twin residual module; the SE attention fusion module is formed by adding an SE attention mechanism between the twin residual modules.
Further, the SE attention fusion module first concatenates the residual feature images F1 and F2 along the channel dimension to form a feature image V1 of dimension H × W × 2C; the squeeze module then compresses V1 into a feature vector V2 of dimension 1 × 1 × 2C using global average pooling; the first C elements of V2 correspond to the channel information of F1 and the last C elements to the channel information of F2;
the excitation module of the SE attention fusion module establishes the relationship among the channels through two fully connected layers and reassigns weights to them; the first fully connected layer reduces the number of channels from 2C to C/4, a scaling factor of 8; the second fully connected layer outputs a feature vector V3 of dimension 1 × 1 × C;
the SE attention mechanism thus models all channels of the residual feature images F1 and F2 and learns the correlation among them; the SE attention fusion module multiplies V3 with F1 and F2 respectively, and the results F1′ and F2′ are the residual feature images fused by the attention mechanism; the module has two outputs, the skip-connected sums of I1 with F1′ and of I2 with F2′.
Specifically, in step S4, the initial learning rate is 0.001, the batch size is 128, and the network parameters are initialized with the Kaiming normal distribution; training runs for 200 epochs with a three-stage learning-rate decay, the rate being reduced to one tenth of its value every 80 epochs; cross entropy is used as the loss function and Adam as the optimizer.
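The decay schedule above can be written as a small step-decay function (an illustrative sketch; the name `learning_rate` is not from the patent):

```python
def learning_rate(epoch, base_lr=0.001, step=80, gamma=0.1):
    """Three-stage step decay: the rate drops to one tenth of its
    value every `step` epochs over the 200-epoch training run."""
    return base_lr * (gamma ** (epoch // step))
```

This yields 0.001 for epochs 0-79, 0.0001 for epochs 80-159, and 0.00001 thereafter.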
Specifically, in step S5, if the probability that the input data is a changed pixel is greater than 0.5, the input data is classified as a changed pixel, and if the probability that the input data is a changed pixel is less than 0.5, the input data is classified as an unchanged pixel.
Another technical solution of the present invention is a twin attention fusion network-based remote sensing image change detection system, including:
the processing module is used for preprocessing the double-time-phase remote sensing image;
the data module is used for making the double-time-phase remote sensing image preprocessed by the processing module into a training data set and a test data set;
the network module is used for constructing a twin attention fusion network SE-Siam-ResNet;
the training module is used for training a twin attention fusion network built by the network module by using a training data set made by the data module;
and the detection module is used for inputting the test data set manufactured by the data module into the twin attention fusion network trained by the training module, determining whether the central pixel of the input remote sensing image pair changes according to the output result, and finishing the change detection of the remote sensing image.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention discloses a twin attention fusion network-based remote sensing image change detection method, which uses a twin branch network as a basic structure to respectively extract the characteristics of remote sensing images at the same place and different time. The algorithm adds an attention mechanism in the branch networks, explicitly models the interdependence between all channels in the two branch networks, and enables the algorithm to assign a weight to the channel of each feature image according to the feedback of network loss. Through continuous iterative training, the characteristics which effectively represent changes can be assigned with larger weight in the twin network, and the characteristics which represent irrelevant changes can be assigned with lower weight at the same time. The design of this mode greatly enhances the detection accuracy of change detection.
Furthermore, because of differences in imaging conditions, the bi-temporal remote sensing images contain image changes not caused by changes in ground objects, so reducing the influence of such non-ground-object changes on the change detection algorithm is very important. Among the preprocessing steps for change detection, the most important are geometric correction and radiometric correction.
Further, a deep neural network needs a large amount of data as a training set and a test set to verify its performance; step S2 randomly selects a subset of pixels from the bi-temporal remote sensing images to produce the training data set, and uses all pixels of the bi-temporal images to produce the test set.
Further, a change detection network based on deep learning is designed through step S3.
Further, step S301 extracts the features of the two temporal images separately with a parameter-sharing twin network. If the ground-object information of the bi-temporal remote sensing images is the same, the features extracted by the two branch networks are also the same; if it differs, the extracted features differ as well. This makes it easier for the classifier of step S303 to determine whether the input image pair has changed.
Further, an SE attention mechanism is added to the twin network through step S302, and an SE attention fusion module is constructed. The SE attention mechanism enables comprehensive analysis of feature images extracted by the twin network and reassignment of weights to each channel of the feature images, wherein features representing change information are assigned a greater weight and features representing irrelevant change information are assigned a lesser weight.
Furthermore, a pair of parameter-sharing residual modules located at the same position in the two branch networks is denoted a twin residual module. The SE attention fusion module is formed by adding an SE attention mechanism between the twin residual modules.
Further, in order to fit the model to the current data set, the network is trained using the training data set generated in step S2.
Further, to verify the performance of the model, the model is tested using the test data set generated in step S2. The input of the model is a pair of double-time phase remote sensing images, the output is a two-dimensional vector, the first element represents the probability that the central pixel of the input data is an unchanged sample, the second element represents the probability that the central pixel of the input data is a changed sample, and the sum of the numerical values is 1. If the value of the first element is greater than 0.5, it indicates that the central pixel of the input data is an unchanged sample, otherwise it indicates a changed sample.
In summary, the invention designs a new change detection model based on the twin network and the attention mechanism, and the model can improve the change detection effect to a certain extent and obtain the change detection result with better visual effect and higher numerical index.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a network architecture diagram of an SE attention fusion model;
FIG. 2 is a network architecture diagram of the SE-Siam-ResNet;
FIG. 3 is a network structure diagram of each Layer;
FIG. 4 is a first set of graphs of simulation results, wherein (a) is the remote sensing image of time phase 1, (b) is the remote sensing image of time phase 2, (c) is the label, and (d) is the experimental result;
FIG. 5 is a second set of graphs of simulation results, wherein (a) is the remote sensing image of time phase 1, (b) is the remote sensing image of time phase 2, (c) is the label, and (d) is the experimental result;
fig. 6 is a third set of graphs showing the results of simulation experiments, where (a) is the remote sensing image of time phase 1, (b) is the remote sensing image of time phase 2, (c) is a label, and (d) is the results of the experiments.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
The invention provides a remote sensing image change detection method based on a twin attention fusion network: preprocess the bi-temporal remote sensing images; make a training data set and a test data set; construct the twin attention fusion network SE-Siam-ResNet; train the model using the training data set; and obtain the final result using the test data set. The method extracts the features of the two remote sensing images with two parameter-sharing branch networks and connects the two branches with an attention fusion module. The attention fusion module adaptively assigns a weight to each channel of the feature images in the branch networks, so that the features learned by the network suppress the interference of irrelevant changes in the remote sensing images and enhance the extraction of true change features, which is of great significance in fields such as land resource estimation and earthquake resistance and disaster relief.
The invention discloses a twin attention fusion network-based remote sensing image change detection method, which comprises the following steps of:
s1, preprocessing the double-time-phase remote sensing image;
reading in two remote sensing images of the same geographical area captured at different times, and processing both with a radiometric correction algorithm. The bi-temporal remote sensing images in fig. 4 and 5 are homogeneous images, and those in fig. 6 are heterogeneous images, where homogeneous means the two images were captured by the same type of sensor and heterogeneous means they were captured by different types of sensors. For heterogeneous images, step S1 does not apply radiometric correction.
S2, making a training data set and a testing data set;
when a training data set is generated, firstly, 400 changed pixels and 1600 unchanged pixels are obtained from a label image in a random sampling mode, then paired image blocks with the size of 10 x 10 are cut from a corrected double-time phase remote sensing image by taking 2000 pixels as centers, wherein the 10 x 10 image blocks represent neighborhood information of a central pixel point.
When a test data set is generated, firstly, the front 5 rows of the remote sensing image are turned upwards, the rear 5 rows of the remote sensing image are turned downwards, the left 5 columns of the remote sensing image are turned leftwards, and the right 5 columns of the remote sensing image are turned rightwards in sequence, so that each pixel of the remote sensing image before turning can be used as the center of a 10 x 10 image block. Secondly, cutting image blocks into pairs by taking each pixel of the remote sensing image before turning over as a center, wherein the sampling size of the image blocks is the same as that of the training data set.
The labels of the image pair of the training dataset and the test dataset are values 0 and 1, which respectively represent that the central pixel point of the image pair is an unchanged pixel point or a changed pixel point.
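The padding-and-cropping procedure above can be sketched with NumPy. This is an illustration, not the patent's code: `make_patch_pairs` is a hypothetical helper, and `np.pad` with `mode="symmetric"` is assumed to match the row/column mirroring described.

```python
import numpy as np

def make_patch_pairs(img1, img2, centers, patch=10, pad=5):
    """Cut paired blocks around the given center pixels.

    Each image is mirror-padded by `pad` pixels on every side
    (np.pad 'symmetric' mode reflects the edge rows/columns), so every
    pixel of the original image can serve as the center of a
    `patch` x `patch` block.
    """
    width = ((pad, pad), (pad, pad), (0, 0))
    p1 = np.pad(img1, width, mode="symmetric")
    p2 = np.pad(img2, width, mode="symmetric")
    pairs = []
    for r, c in centers:
        # center (r, c) of the original image maps to (r + pad, c + pad)
        # in the padded image, so its block spans rows [r, r + patch)
        pairs.append((p1[r:r + patch, c:c + patch],
                      p2[r:r + patch, c:c + patch]))
    return pairs
```

Building the training set then reduces to passing the 400 changed and 1600 unchanged centers sampled from the label image; the test set passes every pixel coordinate.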
S3, constructing the twin attention fusion network SE-Siam-ResNet (in the network name, SE is the attention mechanism, Siam is short for Siamese, i.e. twin, and ResNet is the base classification network used by the method);
s301, building two branch networks, and respectively extracting the characteristics of the double-time-phase remote sensing image;
each branch network is formed by stacking residual modules. In each branch network, residual modules outputting feature images with the same size are recorded as a group, and the branch network used by the algorithm comprises 3 groups, and each group comprises 5 residual modules. The network structures of the two branch networks are the same, and the two branch networks share the same group of network parameters; the network structure of a single Layer is shown in FIG. 3, and the single Layer corresponds to one Layer in FIG. 2 on SE-Sim-Resnet;
s302, adding an SE attention mechanism between the two branch networks to construct an SE attention fusion module;
A pair of parameter-sharing residual modules located at the same position in the two branch networks is denoted a twin residual module. The SE attention fusion module is formed by adding an SE attention mechanism between the twin residual modules.
In the forward pass, the output of a residual module is the sum of its input data and the residual feature image computed by its convolutional layers. The inputs of the twin residual modules are I1 ∈ R^(H×W×C′) and I2 ∈ R^(H×W×C′), and the computed residual feature images are F1 ∈ R^(H×W×C) and F2 ∈ R^(H×W×C), shown as input1, input2, Feature map1 and Feature map2 in fig. 1. If C′ equals C, the skip connection is an identity mapping; if C′ does not equal C, a 1 × 1 convolution is added to the skip connection to scale the dimension. The output feature images of the twin residual modules are O1 ∈ R^(H×W×C) and O2 ∈ R^(H×W×C), shown as output1 and output2 in fig. 1.
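The skip connection just described can be sketched numerically. Function and weight names here are illustrative, not from the patent; the 1 × 1 convolution that scales C′ to C is equivalent to a per-pixel matrix multiplication:

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_output(x, residual, w_proj=None):
    """Output of a residual module: input plus residual branch.

    When the channel counts differ (C' != C), the skip connection applies
    a 1 x 1 convolution, i.e. a per-pixel matrix multiply with w_proj of
    shape (C_in, C_out); otherwise it is the identity mapping."""
    if x.shape[-1] != residual.shape[-1]:
        x = x @ w_proj
    return x + residual

i1 = rng.standard_normal((8, 8, 4))    # input I1 with C' = 4
f1 = rng.standard_normal((8, 8, 16))   # residual feature F1 with C = 16
w = rng.standard_normal((4, 16))       # illustrative 1 x 1 conv weights
o_proj = residual_output(i1, f1, w)    # projected skip, shape (8, 8, 16)
o_id = residual_output(f1, f1)         # identity skip when C' == C
```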
Please refer to fig. 1. The SE attention fusion module first concatenates the residual feature images F1 and F2 along the channel dimension to form a feature image V1 of dimension H × W × 2C. The squeeze module then compresses V1 into a feature vector V2 of dimension 1 × 1 × 2C using global average pooling; the first C elements of V2 correspond to the channel information of F1, and the last C elements to the channel information of F2. The excitation module of the SE attention fusion module establishes the relationship among the channels through two fully connected layers and reassigns weights to them. The first fully connected layer reduces the number of channels from 2C to C/4, a scaling factor of 8. So that the feature vector output by the SE attention mechanism can be multiplied with the residual feature images, the second fully connected layer outputs a feature vector V3 of dimension 1 × 1 × C. Through these steps, the SE attention mechanism models all channels of the feature images F1 and F2, learns the correlation among them, adaptively recalibrates the channel responses according to that correlation, and assigns larger weights to channels that benefit change detection and smaller weights to channels with little influence on the result. The SE attention fusion module multiplies V3 with the residual feature images F1 and F2 respectively; the results F1′ and F2′ are the residual feature images fused by the attention mechanism. The twin residual modules in the SE attention fusion module then add their inputs I1 and I2 to F1′ and F2′ through the skip connections.
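The squeeze-excitation-recalibrate pipeline above can be sketched in NumPy. This is an illustration, not the patent's implementation: the patent does not state the activation functions, so the standard SE choice (ReLU after the first fully connected layer, sigmoid after the second) is assumed, and all weight names are placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_fuse(f1, f2, w1, b1, w2, b2):
    """SE attention fusion of two H x W x C residual feature images."""
    v1 = np.concatenate([f1, f2], axis=-1)   # splice: H x W x 2C
    v2 = v1.mean(axis=(0, 1))                # squeeze (global avg pool): 2C
    h = np.maximum(v2 @ w1 + b1, 0.0)        # FC1: 2C -> C/4 (factor 8), ReLU
    v3 = sigmoid(h @ w2 + b2)                # FC2: C/4 -> C, weights in (0, 1)
    return f1 * v3, f2 * v3                  # recalibrated F1', F2'

rng = np.random.default_rng(1)
C = 16
f1 = rng.standard_normal((8, 8, C))
f2 = rng.standard_normal((8, 8, C))
w1 = rng.standard_normal((2 * C, C // 4)) * 0.1
b1 = np.zeros(C // 4)
w2 = rng.standard_normal((C // 4, C)) * 0.1
b2 = np.zeros(C)
f1p, f2p = se_fuse(f1, f2, w1, b1, w2, b2)   # both H x W x C
```

Because every entry of V3 lies in (0, 1), the fused features are a channel-wise attenuation of the originals.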
S303, adding a classifier.
Referring to fig. 2, the classifier of SE-Siam-ResNet consists of one fully connected layer with an output dimension of 2, representing the probabilities that the center pixel of the input remote sensing image pair is an unchanged pixel or a changed pixel. SE-Siam-ResNet first applies global average pooling to the feature images output by the two branch networks, concatenates the pooled outputs along the channel dimension, and feeds the result to the classifier.
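A minimal sketch of this classifier head follows. Shapes and names are assumptions; a softmax is implied by the later statement that the two output probabilities sum to 1.

```python
import numpy as np

def classify(feat1, feat2, w, b):
    """Global-average-pool both branch outputs, concatenate along the
    channel dimension, apply one fully connected layer (w: (2C, 2)),
    and return softmax probabilities [P(unchanged), P(changed)]."""
    v = np.concatenate([feat1.mean(axis=(0, 1)), feat2.mean(axis=(0, 1))])
    logits = v @ w + b
    e = np.exp(logits - logits.max())        # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(2)
C = 16
probs = classify(rng.standard_normal((4, 4, C)),
                 rng.standard_normal((4, 4, C)),
                 rng.standard_normal((2 * C, 2)),
                 np.zeros(2))
```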
S4, training the model by using the training data set;
The network constructed in step S3 is trained with the training data set generated in step S2; the initial learning rate is set to 0.001, the batch size to 128, and the network parameters are initialized with the Kaiming normal distribution. Training runs for 200 epochs with a three-stage learning-rate decay, the rate being reduced to one tenth of its value every 80 epochs. The model uses cross entropy as the loss function and Adam as the optimization algorithm.
And S5, acquiring a final result by using the test data set.
The test data set is input into the trained model, whose output represents the probabilities that the center pixel of the input remote sensing image pair is an unchanged or a changed pixel. If the probability of being a changed pixel is greater than 0.5, the input is classified as a changed pixel; if it is less than 0.5, the input is classified as an unchanged pixel.
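The decision rule reduces to a threshold test (a sketch; the behaviour at exactly 0.5 is not specified by the text, so ties are treated as unchanged here):

```python
def label_pixel(p_changed, threshold=0.5):
    """Return 1 (changed) when the change probability exceeds the
    threshold, else 0 (unchanged)."""
    return 1 if p_changed > threshold else 0
```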
In another embodiment of the present invention, a twin attention fusion network-based remote sensing image change detection system is provided, which can be used to implement the above twin attention fusion network-based remote sensing image change detection method.
The processing module is used for preprocessing the double-time-phase remote sensing image;
the data module is used for making the double-time-phase remote sensing image preprocessed by the processing module into a training data set and a test data set;
the network module is used for constructing a twin attention fusion network SE-Siam-ResNet;
the training module is used for training a twin attention fusion network built by the network module by using a training data set made by the data module;
and the detection module is used for inputting the test data set manufactured by the data module into the twin attention fusion network trained by the training module, determining whether the central pixel of the input remote sensing image pair changes according to the output result, and finishing the change detection of the remote sensing image.
In yet another embodiment of the present invention, a terminal device is provided that includes a processor and a memory for storing a computer program comprising program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. As the computing and control core of the terminal, it is adapted to load and execute one or more instructions to implement the corresponding method flow or function; the processor provided by the embodiment of the invention can be used to run the remote sensing image change detection method based on the twin attention fusion network, comprising the following steps:
preprocessing the bi-temporal remote sensing images; making the preprocessed images into a training data set and a test data set; constructing the twin attention fusion network SE-Siam-ResNet; training the twin attention fusion network using the training data set; and inputting the test data set into the trained twin attention fusion network, determining from the output whether the center pixel of the input remote sensing image pair has changed, thereby completing the change detection of the remote sensing images.
In still another embodiment of the present invention, the present invention further provides a storage medium, specifically a computer-readable storage medium (Memory), which is a Memory device in a terminal device and is used for storing programs and data. It is understood that the computer readable storage medium herein may include a built-in storage medium in the terminal device, and may also include an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing an operating system of the terminal. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the memory space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory.
One or more instructions stored in the computer-readable storage medium can be loaded and executed by the processor to realize the corresponding steps of the method for detecting the change of the remote sensing image based on the twin attention fusion network in the embodiment; one or more instructions in the computer-readable storage medium are loaded by the processor and perform the steps of:
preprocessing the bi-temporal remote sensing images; making the preprocessed bi-temporal remote sensing images into a training data set and a test data set; constructing a twin attention fusion network SE-Siam-Resnet; training the twin attention fusion network using the training data set; and inputting the test data set into the trained twin attention fusion network, and determining from the output whether the central pixel of each input remote sensing image pair has changed, thereby completing the change detection of the remote sensing images.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The effect of the present invention is further illustrated below with simulation results.
1. Experimental platform
The simulation hardware platform of the invention is: NVIDIA RTX 2060 with 6 GB video memory.
The software platform of the invention is: Windows 10 operating system, Python 3.6, PyTorch 1.4.
2. Experimental data set
FIG. 4 is a set of remote sensing images from the SZTAKI AirChange Benchmark dataset, with dimensions 952 × 640, a spatial resolution of 1.5 meters, and labels drawn manually by an expert. FIG. 5 is a set of remote sensing images from a QuickBird dataset provided by the Guangdong Government Data Innovation Competition; the images are all of size 512 × 512. FIG. 6 shows a heterogeneous dataset in which the two temporal images were captured by two different sensors: a SAR image captured in 2008 and an RGB image captured in 2012, with image size 921 × 593.
3. Evaluation index of simulation experiment
pre = TP / (TP + FP)
rec = TP / (TP + FN)
acc = (TP + TN) / (TP + TN + FP + FN)
F1 = 2 × pre × rec / (pre + rec)
wherein pre is the precision, rec is the recall, acc is the overall prediction accuracy, F1 is the F1 score, TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
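These four indices follow directly from the confusion-matrix counts; for reference, a small helper can be sketched as follows (the function and variable names are illustrative, not part of the patent):

```python
def change_detection_metrics(tp, tn, fp, fn):
    """Compute precision, recall, overall accuracy and F1 from confusion counts."""
    pre = tp / (tp + fp)                    # fraction of predicted changes that are real
    rec = tp / (tp + fn)                    # fraction of real changes that are detected
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall prediction accuracy
    f1 = 2 * pre * rec / (pre + rec)        # harmonic mean of precision and recall
    return pre, rec, acc, f1
```

For example, with 80 true positives, 900 true negatives, 20 false positives and 20 false negatives, precision, recall and F1 are all 0.8.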
4. Results of the experiment
TABLE 1 (numerical indices for the data of FIG. 4; reproduced as an image in the original publication)
TABLE 2 (numerical indices for the data of FIG. 5; reproduced as an image in the original publication)
TABLE 3 (numerical indices for the data of FIG. 6; reproduced as an image in the original publication)
In the above tables, Resnet32 is a two-class network that splices the two remote sensing image blocks together along the channel dimension as input and outputs the change probability of the central pixel of the image block. Siam-Resnet is also a two-class network, comprising two branch networks with identical structure and shared parameters; each branch processes one of the remote sensing image blocks. Siam-Resnet differs from SE-Siam-Resnet in that no SE attention mechanism is added between the two branch networks, and differs from Resnet32 in that the feature-extraction network is split into two parameter-sharing branches.
Table 1 gives the numerical indices of the data shown in FIG. 4 under the different methods, Table 2 those of the data shown in FIG. 5, and Table 3 those of the data shown in FIG. 6. The tables show that the SE-Siam-Resnet algorithm performs better on the different types of data: its pre, rec, acc (overall accuracy) and F1 indices are the highest among the methods compared.
In conclusion, the method and the system for detecting the change of the remote sensing image based on the twin attention fusion network can greatly improve the change detection effect of the remote sensing image.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A remote sensing image change detection method based on a twin attention fusion network is characterized by comprising the following steps:
s1, preprocessing the double-time-phase remote sensing image;
s2, making the double-temporal remote sensing image preprocessed in the step S1 into a training data set and a testing data set;
s3, constructing a twin attention fusion network SE-Sim-Resnet;
s4, training the twin attention fusion network constructed in the step S3 by using the training data set manufactured in the step S2;
and S5, inputting the test data set manufactured in the step S2 into the twin attention fusion network trained in the step S4, and determining whether the central pixel of the input remote sensing image pair changes according to the output result to finish the change detection of the remote sensing image.
2. The method according to claim 1, wherein in step S1, two remote sensing images of the same geographical area acquired at different times are read in, and the two remote sensing images are preprocessed using a radiometric correction method.
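The claim does not specify which radiometric correction algorithm is used. One common relative normalization, which linearly matches the per-band mean and standard deviation of the second image to the first, can be sketched as follows (the function name and the small epsilon guard are illustrative assumptions):

```python
import numpy as np

def relative_radiometric_normalization(img_ref, img_tgt):
    """Linearly rescale each band of img_tgt so its mean and standard deviation
    match img_ref. Both images: float arrays of shape (H, W, C)."""
    out = np.empty_like(img_tgt, dtype=np.float64)
    for c in range(img_tgt.shape[2]):
        ref, tgt = img_ref[..., c], img_tgt[..., c]
        gain = ref.std() / (tgt.std() + 1e-12)        # avoid division by zero
        out[..., c] = (tgt - tgt.mean()) * gain + ref.mean()
    return out
```

This removes global illumination and sensor-gain differences between the two acquisitions before patch extraction, which is the purpose radiometric correction serves in the pipeline.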
3. The method according to claim 1, wherein in step S2, 400 changed pixels and 1600 unchanged pixels are selected from the label image by random sampling, and pairs of image blocks of size 10 × 10 are then cropped from the corrected bi-temporal remote sensing images, centered on the 2000 selected pixels, as the training data set;
the first 5 rows of each remote sensing image are mirrored upward, the last 5 rows downward, the left 5 columns leftward and the right 5 columns rightward, and paired image blocks are cropped centered on each pixel of the original (unpadded) remote sensing image, as the test data set;
the labels of the image pairs in the training and test data sets take the values 0 and 1, indicating respectively that the central pixel of the image pair is an unchanged or a changed pixel.
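The mirror-padding and cropping described in this claim can be sketched with NumPy: `np.pad(..., mode='symmetric')` reproduces the row/column flipping, so every pixel of the original image can serve as a patch center. The function name and signature are illustrative:

```python
import numpy as np

def extract_patch_pairs(img1, img2, centers, half=5):
    """Mirror-pad both images by `half` pixels on every side, then crop a
    (2*half) x (2*half) patch pair around each listed center pixel."""
    pad = ((half, half), (half, half), (0, 0))
    p1 = np.pad(img1, pad, mode='symmetric')
    p2 = np.pad(img2, pad, mode='symmetric')
    pairs = []
    for (r, c) in centers:
        # center (r, c) in the original image maps to (r+half, c+half) after
        # padding, so the 10x10 crop spans rows r..r+2*half in padded coords
        sl = (slice(r, r + 2 * half), slice(c, c + 2 * half))
        pairs.append((p1[sl], p2[sl]))
    return pairs
```

With `half=5` this yields the 10 × 10 patch pairs used by the network, including valid patches for border pixels.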
4. The method according to claim 1, wherein in step S3, constructing the twin attention fusion network is specifically:
s301, building two branch networks, and respectively extracting the characteristics of the double-time-phase remote sensing image;
s302, adding an SE attention mechanism between the two branch networks built in the step S301, and constructing an SE attention fusion module;
s303, adding a classifier at the tail end of the two branch networks built in the step S301, and forming a twin attention fusion network by the two branch networks, the SE attention fusion module and the classifier.
5. The method according to claim 4, wherein in step S301, each branch network is formed by stacking residual modules; within each branch network, residual modules that output feature images of the same size are recorded as one group; the branch network used comprises 3 groups, each containing 5 residual modules; the two branch networks have the same structure and share the same set of network parameters.
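The branch structure of this claim can be sketched in PyTorch. Only the grouping (3 groups of 5 residual modules) and the parameter sharing come from the claim; the channel widths (16/32/64), strides, and downsampling placement are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual module: two 3x3 convolutions plus an identity shortcut."""
    def __init__(self, ch_in, ch_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(ch_in, ch_out, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch_out)
        self.conv2 = nn.Conv2d(ch_out, ch_out, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch_out)
        self.shortcut = nn.Sequential()
        if stride != 1 or ch_in != ch_out:
            # projection shortcut when spatial size or channel count changes
            self.shortcut = nn.Sequential(
                nn.Conv2d(ch_in, ch_out, 1, stride, bias=False),
                nn.BatchNorm2d(ch_out))

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + self.shortcut(x))

class SharedBranch(nn.Module):
    """One branch of the twin network: 3 groups of 5 residual modules.
    Applying the same module to both inputs realizes parameter sharing."""
    def __init__(self):
        super().__init__()
        layers, ch = [], 3
        for group, width in enumerate((16, 32, 64)):   # widths are assumed
            for i in range(5):
                stride = 2 if (group > 0 and i == 0) else 1  # downsample at group starts
                layers.append(ResidualBlock(ch, width, stride))
                ch = width
        self.body = nn.Sequential(*layers)

    def forward(self, x1, x2):
        # the same weights process both temporal images
        return self.body(x1), self.body(x2)
```

A 10 × 10 input patch passes through two downsampling steps, giving 3 × 3 feature maps under these assumed strides.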
6. The method according to claim 4, wherein in step S302, a pair of parameter-sharing residual modules located at the same position in the two branch networks is taken as a twin residual module; an SE attention mechanism is added between the twin residual modules and, together with the twin residual modules, forms the SE attention fusion module.
7. The method of claim 6, wherein the SE attention fusion module first splices the residual feature image F1 and the residual feature image F2 along the channel dimension to form a feature image V1 of dimension H × W × 2C; the compression module then uses global average pooling to compress the feature image V1 into a feature vector V2 of dimension 1 × 1 × 2C; the first C elements of V2 correspond to the channel information of feature image F1, and the last C elements to that of feature image F2;
the excitation module of the SE attention fusion module establishes the relationship between the channels through two fully connected layers and re-assigns weights to the channels; the first fully connected layer of the excitation module reduces the channel number from 2C to C/4, a scaling factor of 8; the second fully connected layer of the excitation module outputs a feature vector V3 of dimension 1 × 1 × C;
the SE attention mechanism models all channels of the residual feature images F1 and F2 and learns the correlation information among the channels; the SE attention fusion module multiplies the feature vector V3 with the residual feature images F1 and F2 respectively, and the outputs F1' and F2' represent the residual feature images fused through the attention mechanism; the SE attention fusion module has two outputs, combining I1 with F1' and I2 with F2', respectively.
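The squeeze-and-excitation steps of this claim can be sketched in PyTorch. This is a minimal sketch, not the patented implementation: the activation functions (ReLU between the two fully connected layers, sigmoid on the output, as in the original SE design) and the class name are assumptions not stated in the claim.

```python
import torch
import torch.nn as nn

class SEFusion(nn.Module):
    """SE attention fusion across the two branches: squeeze the concatenated
    features into a 2C vector, excite a C-dim weight vector, and rescale
    both residual feature images with it."""
    def __init__(self, channels):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)           # V1 -> V2 (1 x 1 x 2C)
        self.excite = nn.Sequential(
            nn.Linear(2 * channels, channels // 4),      # 2C -> C/4 (scaling factor 8)
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),          # C/4 -> C (vector V3)
            nn.Sigmoid())

    def forward(self, f1, f2):
        v1 = torch.cat([f1, f2], dim=1)                  # splice along channel dim
        v2 = self.squeeze(v1).flatten(1)                 # global average pooling
        v3 = self.excite(v2).unsqueeze(-1).unsqueeze(-1) # per-channel weights
        return f1 * v3, f2 * v3                          # F1', F2'
```

Because the squeeze sees both branches' channels at once, the learned weights V3 encode cross-temporal channel correlations before each branch's features are rescaled.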
8. The method according to claim 1, wherein in step S4, the initial learning rate is 0.001, the batch size is 128, and the network parameters are initialized using the Kaiming normal distribution; model training is iterated for 200 epochs, with the learning rate decayed in three stages: every 80 epochs it is reduced to one tenth of its previous value; cross entropy is used as the loss function and Adam as the optimization method.
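The hyper-parameters of this claim map directly onto standard PyTorch components; a hedged sketch follows (the helper names are illustrative, and `StepLR` with `step_size=80, gamma=0.1` reproduces the three-stage decay over 200 epochs):

```python
import torch
import torch.nn as nn

def kaiming_init(module):
    """Initialize conv/linear weights with the Kaiming normal distribution."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight)

def make_training_setup(model):
    """Training components per claim 8: cross entropy, Adam at lr 0.001,
    and a step decay to one tenth every 80 epochs."""
    model.apply(kaiming_init)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.1)
    return criterion, optimizer, scheduler
```

Calling `scheduler.step()` once per epoch yields learning rates of 1e-3, 1e-4 and 1e-5 over epochs 0-79, 80-159 and 160-199.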
9. The method of claim 1, wherein in step S5, an input is classified as a changed pixel if its predicted probability of being a changed pixel is greater than 0.5, and as an unchanged pixel if that probability is less than 0.5.
10. A remote sensing image change detection system based on a twin attention fusion network is characterized by comprising:
the processing module is used for preprocessing the bi-temporal remote sensing images;
the data module is used for making the bi-temporal remote sensing images preprocessed by the processing module into a training data set and a test data set;
the network module is used for constructing a twin attention fusion network SE-Siam-Resnet;
the training module is used for training the twin attention fusion network built by the network module, using the training data set made by the data module;
and the detection module is used for inputting the test data set made by the data module into the twin attention fusion network trained by the training module, determining from the output whether the central pixel of the input remote sensing image pair has changed, and completing the change detection of the remote sensing images.
CN202110765132.3A 2021-07-06 2021-07-06 Remote sensing image change detection method and system based on twin attention fusion network Active CN113469074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110765132.3A CN113469074B (en) 2021-07-06 2021-07-06 Remote sensing image change detection method and system based on twin attention fusion network

Publications (2)

Publication Number Publication Date
CN113469074A true CN113469074A (en) 2021-10-01
CN113469074B CN113469074B (en) 2023-12-19

Family

ID=77878708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110765132.3A Active CN113469074B (en) 2021-07-06 2021-07-06 Remote sensing image change detection method and system based on twin attention fusion network

Country Status (1)

Country Link
CN (1) CN113469074B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114419464A (en) * 2022-03-29 2022-04-29 南湖实验室 Twin network change detection model based on deep learning
CN115205710A (en) * 2022-09-16 2022-10-18 北京理工大学 Double-time-phase remote sensing image change detection method combined with color correction
CN116363526A (en) * 2023-04-07 2023-06-30 河海大学 MROCNet model construction and multi-source remote sensing image change detection method and system
CN117173579A (en) * 2023-11-02 2023-12-05 山东科技大学 Image change detection method based on fusion of inherent features and multistage features

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110659591A (en) * 2019-09-07 2020-01-07 中国海洋大学 SAR image change detection method based on twin network
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN111161218A (en) * 2019-12-10 2020-05-15 核工业北京地质研究院 High-resolution remote sensing image change detection method based on twin convolutional neural network
CN111259853A (en) * 2020-02-04 2020-06-09 中国科学院计算技术研究所 High-resolution remote sensing image change detection method, system and device
CN111274892A (en) * 2020-01-14 2020-06-12 北京科技大学 Robust remote sensing image change detection method and system
CN111291622A (en) * 2020-01-16 2020-06-16 武汉汉达瑞科技有限公司 Method and device for detecting building change in remote sensing image
CN111354017A (en) * 2020-03-04 2020-06-30 江南大学 Target tracking method based on twin neural network and parallel attention module
US20200226472A1 (en) * 2019-01-10 2020-07-16 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for a supra-fusion graph attention model for multi-layered embeddings and deep learning applications
CN111681197A (en) * 2020-06-12 2020-09-18 陕西科技大学 Remote sensing image unsupervised change detection method based on Siamese network structure
CN111723732A (en) * 2020-06-18 2020-09-29 西安电子科技大学 Optical remote sensing image change detection method, storage medium and computing device
US20200357143A1 (en) * 2019-05-09 2020-11-12 Sri International Semantically-aware image-based visual localization
CN112232214A (en) * 2020-10-16 2021-01-15 天津大学 Real-time target detection method based on depth feature fusion and attention mechanism
CN112541904A (en) * 2020-12-16 2021-03-23 西安电子科技大学 Unsupervised remote sensing image change detection method, storage medium and computing device
CN112613352A (en) * 2020-12-04 2021-04-06 河海大学 Remote sensing image change detection method based on twin network
CN112668494A (en) * 2020-12-31 2021-04-16 西安电子科技大学 Small sample change detection method based on multi-scale feature extraction
CN112861690A (en) * 2021-02-01 2021-05-28 武汉汉达瑞科技有限公司 Multi-method fused remote sensing image change detection method and system
CN112906638A (en) * 2021-03-19 2021-06-04 中山大学 Remote sensing change detection method based on multi-level supervision and depth measurement learning
CN112949549A (en) * 2021-03-19 2021-06-11 中山大学 Super-resolution-based change detection method for multi-resolution remote sensing image

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Jie Hu et al.: "Squeeze-and-Excitation Networks", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 31 December 2018 (2018-12-31), pages 2 *
Ni Liangbo et al.: "Remote Sensing Image Change Detection Based on Siamese Residual Neural Network", Computer Engineering and Design, vol. 41, no. 12, 31 December 2020 (2020-12-31), pages 3451-3457 *


Also Published As

Publication number Publication date
CN113469074B (en) 2023-12-19

Similar Documents

Publication Publication Date Title
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN113469074A (en) Remote sensing image change detection method and system based on twin attention fusion network
CN110163213B (en) Remote sensing image segmentation method based on disparity map and multi-scale depth network model
CN111723732A (en) Optical remote sensing image change detection method, storage medium and computing device
CN110796009A (en) Method and system for detecting marine vessel based on multi-scale convolution neural network model
CN111291826B (en) Pixel-by-pixel classification method of multisource remote sensing image based on correlation fusion network
Tian et al. Multiscale building extraction with refined attention pyramid networks
CN111199214A (en) Residual error network multispectral image ground feature classification method
US20220262002A1 (en) Feedbackward decoder for parameter efficient semantic image segmentation
CN113469072B (en) Remote sensing image change detection method and system based on GSoP and twin fusion network
CN115497005A (en) YOLOV4 remote sensing target detection method integrating feature transfer and attention mechanism
CN113901900A (en) Unsupervised change detection method and system for homologous or heterologous remote sensing image
CN112734822A (en) Stereo matching algorithm based on infrared and visible light images
CN111444923A (en) Image semantic segmentation method and device under natural scene
CN108986210B (en) Method and device for reconstructing three-dimensional scene
CN113759338A (en) Target detection method and device, electronic equipment and storage medium
CN117036756A (en) Remote sensing image matching method and system based on variation automatic encoder
CN111353412A (en) End-to-end 3D-CapsNet flame detection method and device
CN116363518A (en) Camouflage target detection method based on focal plane polarization imaging
Min et al. A hybrid genetic algorithm-based edge detection method for SAR image
CN115063359A (en) Remote sensing image change detection method and system based on anti-dual-self-encoder network
Guo et al. A coarse to fine network for fast and accurate object detection in high‐resolution images
CN112749670B (en) Pixel-by-pixel classification method, medium and equipment for multi-source remote sensing image
CN112308200B (en) Searching method and device for neural network
CN117115668B (en) Crop canopy phenotype information extraction method, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant