CN113609896A - Object-level remote sensing change detection method and system based on dual-correlation attention - Google Patents
- Publication number: CN113609896A
- Application number: CN202110692812.7A
- Authority: CN (China)
- Prior art keywords: dual, attention, features, correlation, module
- Legal status: Granted (assumed; Google has not performed a legal analysis)
Classifications
- G06F18/253 — Pattern recognition; fusion techniques of extracted features
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention provides an object-level remote sensing change detection method and system based on dual-correlation attention. Data enhancement dedicated to change detection is performed to generate dual input streams; a weight-sharing backbone network is set up to receive the dual input streams and extract multi-scale features of the bi-temporal images; a dual-correlation attention-guided feature fusion neck is set up, which attends to the correlation of same-scale bi-temporal features at the spatial and channel levels to obtain refined difference features, and a refined path aggregation pyramid module is set up to fuse features across scale layers; finally, the multi-scale difference features are sent to a change detection head, which predicts the position, size and change confidence of changed ground objects in the form of bounding boxes. The data enhancement method dedicated to change detection accelerates model training and improves model performance; guided by the dual-correlation attention mechanism, the method effectively resists pseudo-change interference in image pairs and achieves high accuracy and robustness.
Description
Technical Field
The invention belongs to the field of automatic change detection of remote sensing images, and particularly relates to an object-level remote sensing change detection method and system based on dual-correlation attention.
Background
Change detection is the process of identifying differences in the state of objects or phenomena by observing them at different times. More precisely, the purpose of change detection is to find change information of a particular semantic category of interest while filtering out interference from irrelevant change information. It has long been one of the most important problems in the field of remote sensing. Change detection is now widely used in applications such as urban planning, land resource management, environmental monitoring, agricultural investigation and disaster assessment, and has great research value.
Change detection methods can be classified by the granularity of the basic unit used into pixel-level, object-level and scene-level methods. Pixel-level methods typically extract ground-object change features and classify each pixel of the input as changed or unchanged, outputting a binary mask prediction at the pixel level. Object-level change detection generally obtains objects by segmentation or detection, then extracts and compares the multi-temporal features of each object to determine whether it has changed, outputting an object-level prediction such as a binary mask of an object instance or a bounding box of the object. Scene-level methods analyze at the semantic level whether and how the category of the corresponding scene changes over time, classifying images or image slices to provide image-level label predictions. Since change detection most often requires determining changes of a particular class of region or object, pixel-level and object-level methods are the most widely used.
Among pixel-level change detection methods, representative conventional methods include Principal Component Analysis (PCA) and Change Vector Analysis (CVA). Given the poor adaptability of manually designed features to high-resolution images and the superiority of deep learning in feature extraction, more and more change detection research has introduced deep learning methods. These models have a large receptive field and perform far better than traditional methods, but their processing still remains at the pixel level. Pixel-level methods carry a hidden constraint: data from different phases must be registered to each other with high precision, i.e. the two images must be strictly aligned. However, shape changes due to different viewing angles and shadows cast by high-rise buildings tend to create false change areas. Furthermore, recent studies have shown that modern deep neural networks are not strictly invariant to translation. It is therefore difficult for pixel-level approaches, which enforce pixel-wise correspondence under "pixel label" supervision, to avoid such spurious changes.
The conventional object-based change detection method uses the "object" as the analysis unit, which reduces misjudgment of pseudo-change regions to some extent. Notably, an "object" in object-based approaches is typically a set of local pixel clusters obtained by segmenting the image using spectral texture features, geometric features (e.g., shape and area) and other information. However, due to the limitations of conventional hand-crafted feature extraction, these segmented regions usually rely on thresholds, making the "object" prone to over-segmentation and boundary fragmentation; they lack semantic integrity and cannot truly represent the actual geographic entity.
Disclosure of Invention
In order to overcome the technical problems, the invention provides an object-level remote sensing change detection scheme based on dual-correlation attention by utilizing a deep learning technology, which is used for detecting changed geographic entities (such as newly added buildings and changed artificial ground objects). Compared with a pixel-level method and a traditional object-based method, the method disclosed by the invention focuses more on the overall information and context relation of the changed geographic entities, and can effectively resist pseudo-change interference in the image.
The invention provides an object-level remote sensing change detection method based on dual-correlation attention, which comprises the following steps:
step 1, performing data enhancement for change detection and generating dual input streams, including randomly exchanging the temporal order of four image pairs and combining them into one pair of composite images, applying enhancement operations (random geometric transformation, brightness transformation, Gaussian noise addition and color transformation) to one composite image, and applying only the corresponding geometric transformation to the other;
step 2, setting a backbone network sharing weight for receiving double input streams and extracting different scale characteristics of double time phase images;
step 3, a dual-correlation attention guiding feature fusion neck is arranged on the basis of a backbone network, a dual-correlation attention module is arranged in the feature fusion neck to guide the network to focus on the correlation of the dual-time-phase features of the same scale in the space level and the channel level so as to obtain refined difference features, and a refined path aggregation pyramid module is arranged to fuse the features of different scale layers;
and 4, finally, sending the difference characteristics of different scales to a change detection head, and predicting the position, the size and the change confidence coefficient of the changed ground feature in a boundary box mode.
In the backbone network sharing the weight, the feature reuse is realized through a layered feature fusion strategy, and excessively repeated gradient information is cut off, so that the parallel features of the double-time phase image with different scales are extracted.
And a CSPDarkNet-53 network is used for constructing a twin network as a backbone network for sharing weights, five-scale features are extracted, and the features are labeled as C1-C5 layers from the top layer to the bottom layer in sequence.
Moreover, the dual-correlation attention-guided feature fusion neck fuses features from the C3-C5 layers of the twin CSPDarkNet-53 network with three dual-correlation attention modules to generate the P3-P5 layer features, and then fuses difference features between different scale layers with the refined path aggregation pyramid module.
Moreover, the dual-correlation attention module comprises a spatially correlated attention module, a channel-correlated attention module and a change difference module; the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab, and the spatially correlated attention-guided difference feature F′_P, the channel-correlated attention-guided difference feature F′_C and the feature F_ab are further fused to obtain the final change difference feature F.
Moreover, the change detection head is divided into three layers, detection head-S, detection head-M and detection head-L, corresponding respectively to the N3-N5 layer features from the refined path aggregation pyramid module; they learn in parallel the change confidence and coordinates of the changed object according to the supervision information, using prior boxes to learn bounding-box regression of the changed object; the whole network model is trained with a multi-task loss function according to the training sample labels; and the bi-temporal image pair to be detected is input into the trained network to obtain accurate change detection results.
On the other hand, the invention also provides an object-level remote sensing change detection system based on dual correlation attention, which is used for realizing the object-level remote sensing change detection method based on dual correlation attention.
And, including the following modules,
a first module, used for performing data enhancement for change detection and generating dual input streams; the implementation comprises randomly exchanging the temporal order of four image pairs and combining them into one pair of composite images, performing enhancement operations on one of the composite images, including random geometric transformation, brightness transformation, Gaussian noise addition and color transformation, and performing only the corresponding geometric transformation on the other composite image;
the second module is used for setting a backbone network sharing the weight and receiving the double input streams to extract different scale characteristics of the double time phase images;
and the third module is used for setting a dual-correlation attention-guided feature fusion neck on the basis of the backbone network, wherein the dual-correlation attention module is arranged in the feature fusion neck and used for guiding the network to concern the correlation of the double-temporal features of the same scale in the space level and the channel level so as to obtain the refined difference features, and the refined path aggregation pyramid module is arranged for fusing the features of different scale layers.
And the fourth module is used for finally sending the difference characteristics with different scales to the change detection head, and predicting the position, the size and the change confidence coefficient of the changed ground object in the form of a boundary box.
Alternatively, the system comprises a processor and a memory, wherein the memory is used for storing program instructions, and the processor is used for calling the stored instructions in the memory to execute the object-level remote sensing change detection method based on dual-correlation attention.
Alternatively, the system comprises a readable storage medium storing a computer program which, when executed, implements the object-level remote sensing change detection method based on dual-correlation attention as described above.
Compared with the prior art, the invention has the following three advantages:
1) The data enhancement method dedicated to change detection accelerates model training and improves model performance.
The data enhancement method provided by the invention helps the network pay more attention to the nature of change, rather than overfitting to a specific input pattern, by randomly exchanging the early and late temporal images online. The method randomly combines four bi-temporal image pairs into one pair of composite images, which implicitly increases the batch size, enriches the change scenes, and helps the model converge more stably and quickly. Scaling the composite image pair implicitly increases the proportion of small changed targets, guiding the network to attend to small-target change detection and effectively improving model performance.
2) Feature correlation is used to capture difference features and accelerate model convergence.
The dual-correlation attention module designed by the invention establishes correlated attention over the parallel features at the channel and spatial levels, guiding the network to further refine change-related features and suppress irrelevant features, which benefits fast convergence of the network.
3) Superiority and robustness: the method achieves good change detection results and effectively resists pseudo-change interference.
Through the network structure design and the loss function design, the constructed end-to-end change detection network focuses more on the overall characteristics and context associations of the changed object, avoiding pseudo-change interference caused by viewing-angle changes and projection differences. The whole training process requires no hand-crafted feature guidance; the network adaptively learns the required features, generalizes well, and maintains stable performance in complex change scenes.
Drawings
Fig. 1 is a flow chart of a data enhancement method dedicated to change detection according to an embodiment of the present invention.
Fig. 2 is a general framework diagram of a dual relevance attention-directed change detection network according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of the dual-correlation attention module according to an embodiment of the present invention, wherein Fig. 3(a) is the general architecture of the dual-correlation attention module, Fig. 3(b) is the specific architecture of the spatially correlated attention module, Fig. 3(c) is the specific architecture of the channel-correlated attention module, and Fig. 3(d) is the specific architecture of the change difference module.
Detailed Description
The technical solution of the present invention is specifically described below with reference to the accompanying drawings and examples.
In a data layer, the framework designs a data enhancement method special for change detection, so that the training speed of the model can be effectively accelerated, and the performance of the model is improved. In a model level, the framework constructs a dual-correlation attention-guided change detection network, and can effectively extract the overall characteristics and context association of a change object. The framework ultimately represents the detected changed geographic entities (e.g., newly added buildings, artificial structures, etc.) in the form of a bounding box.
The embodiment of the invention provides an object-level remote sensing change detection method based on dual-correlation attention, comprising a data enhancement process dedicated to change detection (Fig. 1) and a dual-correlation attention-guided change detection network (Fig. 2). The change detection network consists, in order, of a weight-sharing backbone network, a dual-correlation attention-guided feature fusion neck and a change detection head; the feature fusion neck comprises several dual-correlation attention modules, corresponding to the backbone scale layers, and a refined path aggregation pyramid module. Additionally, Fig. 3 illustrates the architectural details of the dual-correlation attention modules in the network. The method specifically comprises the following steps:
step 1: the data enhancement mode special for change detection randomly combines and transforms four pairs of images to generate a dual-input stream.
The embodiment adopts a data enhancement mode special for change detection to carry out random time sequence exchange and combination on four pairs of images, and carries out operations such as random geometric transformation, color transformation and the like to generate double input streams.
The method comprises the following substeps:
step 1.1, randomly selecting four pairs of double-time-phase images in different areas from a training set, and randomly exchanging front-time-phase images and rear-time-phase images.
And 1.2, splicing and combining the four processed pairs of images into a pair of synthetic image pairs in a random sequence.
For example, after random temporal exchange among the T0 and T1 phase images of regions 1-4, the region-1 T1 image, region-2 T0 image, region-3 T1 image and region-4 T0 image may be combined into composite image 1, while the region-1 T0 image, region-2 T1 image, region-3 T0 image and region-4 T1 image are combined into composite image 2.
Step 1.3, for one of the composite images, the embodiment performs the following enhancement functions:
(1) geometric transformation: random cropping, scaling, translation, flipping and rotation;
(2) luminance transformation;
(3) gaussian noise is added;
(4) and (5) color transformation.
To ensure the versatility and robustness of the network, the embodiment applies only the same geometric transformation to the other composite image. Through the above steps, a dual input stream is generated from the pair of composite images.
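The patent gives no code for this augmentation; the following NumPy sketch illustrates the idea under stated assumptions. The names (`augment_pairs`, `mosaic4`) are hypothetical, and a horizontal flip plus brightness jitter stand in for the full set of geometric and photometric transforms described above.

```python
import numpy as np

def mosaic4(tiles):
    """Tile four HxWx3 images into one 2Hx2Wx3 composite (2x2 mosaic)."""
    top = np.concatenate([tiles[0], tiles[1]], axis=1)
    bot = np.concatenate([tiles[2], tiles[3]], axis=1)
    return np.concatenate([top, bot], axis=0)

def augment_pairs(pairs, rng):
    """Change-detection augmentation sketch: random temporal swap of each
    bi-temporal pair, 2x2 mosaic of the four pairs into one composite pair,
    the SAME geometric transform on both streams, photometric jitter on one.
    `pairs` is a list of four (t0, t1) tuples of HxWx3 uint8 arrays."""
    swapped = [(t1, t0) if rng.random() < 0.5 else (t0, t1) for t0, t1 in pairs]
    order = rng.permutation(4)                       # random mosaic layout
    img1 = mosaic4([swapped[i][0] for i in order])
    img2 = mosaic4([swapped[i][1] for i in order])
    if rng.random() < 0.5:                           # shared geometric transform
        img1, img2 = img1[:, ::-1], img2[:, ::-1]    # horizontal flip on both
    # brightness jitter applied to only one composite image
    img1 = np.clip(img1.astype(np.float32) * rng.uniform(0.8, 1.2), 0, 255)
    return img1.astype(np.uint8), img2

rng = np.random.default_rng(0)
pairs = [(np.full((64, 64, 3), i, np.uint8), np.full((64, 64, 3), i + 10, np.uint8))
         for i in range(4)]
x1, x2 = augment_pairs(pairs, rng)   # one composite pair = one dual input sample
```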
Step 2: and setting a backbone network sharing the weight for receiving the double input streams and extracting different scale characteristics of the double time phase images.
The embodiment uses the CSPDarkNet-53 network to construct a twin (siamese) network as the weight-sharing backbone that receives the dual input streams; it realizes feature reuse through a layered feature fusion strategy and truncates excessively repeated gradient information, extracting parallel multi-scale features of the bi-temporal images (the C1-C5 layers from top to bottom in Fig. 2 correspond to the features of the five scales).
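As a shape-level illustration of weight sharing (not the actual CSPDarkNet-53), a toy twin backbone can apply the same weight list to both temporal images, halving spatial resolution per stage to mimic the C1-C5 pyramid:

```python
import numpy as np

def shared_backbone(x, weights):
    """Toy twin-backbone sketch: the SAME weight list processes either temporal
    image, halving spatial size each stage to mimic the C1-C5 pyramid."""
    feats = []
    for w in weights:                    # w: (c_in, c_out) pointwise projection
        x = np.maximum(x @ w, 0)         # 1x1 "conv" + ReLU
        x = x[::2, ::2]                  # stride-2 downsample -> next scale
        feats.append(x)
    return feats

rng = np.random.default_rng(1)
weights = [rng.standard_normal((c_in, c_out)) * 0.1
           for c_in, c_out in [(3, 8), (8, 16), (16, 32), (32, 64), (64, 128)]]
img_t0 = rng.standard_normal((64, 64, 3))
img_t1 = rng.standard_normal((64, 64, 3))
c_t0 = shared_backbone(img_t0, weights)   # five scales for the T0 image
c_t1 = shared_backbone(img_t1, weights)   # same weights -> parallel features
```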
Step 3: the dual-correlation attention-guided feature fusion neck fuses the C3-C5 layer features from the twin CSPDarkNet-53 network using three dual-correlation attention modules to generate the P3-P5 layer features, and then fuses the difference features between different scale layers using the refined path aggregation pyramid module. The step comprises the following substeps:
step 3.1, the dual correlation attention module obtains refined difference features by using feature correlation of a spatial hierarchy and a channel hierarchy for parallel features of a same scale layer, wherein the feature correlation includes a spatial correlation attention module, a channel correlation attention module and a change difference module, and specific details are shown in fig. 3, and the method includes the following sub-steps:
step 3.1.1, parallel features F from the same scale layeraAnd b are fed into a spatially dependent attention module. First calculate their spatial correlation map CPAs shown in fig. 3(b), the following is calculated:
wherein ,is through FaResult obtained by transposition and warping operations, Fb ′Is formed by FbThe result obtained by the deforming operation is that,representing a matrix multiplication. From the spatial correlation map CPTo, forFaAnd FbEnhancing the spatial local features related to the change, inhibiting the spatial local features unrelated to the change, and obtaining a spatial correlation diagram CPOf softmax and normalizing the result with the deformed FaThe enhanced feature F is obtained by matrix multiplicationPaAccording to transposed spatial correlation diagram CPOf softmax and normalizing the result with the deformed FbThe matrix multiplication may result in an enhanced feature FPb. Finally, the feature F is fusedPa and FPbTo generate a spatially dependent attention weight WPThe fusion process includes using the feature FPa and FPbConnection (symbol C in FIG. 3 (b)) and sigmoid conversion are performed to obtain a feature FPabAnd compressing the channel number from 2C to C by 1 × 1 convolution and 3 × 3 convolution to obtain W' p, and finally performing sigmoid transformation.
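A NumPy sketch of this computation, under my reading of Fig. 3(b); the 2C→C convolutions are replaced by a channel-halving sum, so this is illustrative only:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_correlation_attention(fa, fb):
    """Sketch of the spatially correlated attention weight W_P: an HWxHW
    correlation map re-weights each temporal feature, and the two enhanced
    features are fused into a (0,1)-valued attention weight."""
    h, w, c = fa.shape
    a, b = fa.reshape(-1, c), fb.reshape(-1, c)      # "reshape": HW x C
    cp = a @ b.T                                     # spatial correlation map, HW x HW
    f_pa = softmax(cp, axis=-1) @ a                  # enhanced F_Pa
    f_pb = softmax(cp.T, axis=-1) @ b                # enhanced F_Pb (transposed map)
    fused = np.concatenate([f_pa, f_pb], axis=-1)    # concat -> 2C channels
    proj = fused[:, :c] + fused[:, c:]               # stand-in for 2C->C convs
    sig = 1.0 / (1.0 + np.exp(-proj))                # sigmoid -> W_P
    return sig.reshape(h, w, c)

rng = np.random.default_rng(2)
fa = rng.standard_normal((4, 4, 8))
fb = rng.standard_normal((4, 4, 8))
w_p = spatial_correlation_attention(fa, fb)
```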
Step 3.1.2: the parallel features F_a and F_b from the same scale layer are fed into the channel-correlated attention module. Their channel correlation map C_C is first computed, as shown in Fig. 3(c):

C_C = F_a′ ⊗ (F_b′)^T

where (F_b′)^T is obtained from F_b by reshaping and transposition, F_a′ is obtained from F_a by reshaping, and ⊗ denotes matrix multiplication. According to the channel correlation map C_C, the change-related channels of F_a and F_b are enhanced and the change-irrelevant channels are suppressed, yielding the enhanced features F_Ca and F_Cb. Finally, F_Ca and F_Cb are fused to generate the channel-correlated attention weight W_C: they are concatenated (symbol C in Fig. 3(c)) to obtain the feature F_Cab, the channel number is compressed from 2C to C by a 1 × 1 convolution and a 3 × 3 convolution to obtain W′_C, and a sigmoid transformation is applied.
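The channel-level counterpart can be sketched the same way, under my reading of Fig. 3(c); here the correlation map is C × C and the 2C→C convolutions are again replaced by a simple sum for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_correlation_attention(fa, fb):
    """Sketch of the channel-correlated attention weight W_C: a CxC correlation
    map re-weights the channels of both features before fusion."""
    h, w, c = fa.shape
    a, b = fa.reshape(-1, c), fb.reshape(-1, c)     # HW x C
    cc = a.T @ b                                    # channel correlation map, C x C
    f_ca = a @ softmax(cc, axis=-1)                 # channel-enhanced F_Ca
    f_cb = b @ softmax(cc.T, axis=-1)               # channel-enhanced F_Cb
    proj = f_ca + f_cb                              # stand-in for 2C->C convs
    return (1.0 / (1.0 + np.exp(-proj))).reshape(h, w, c)   # sigmoid -> W_C

rng = np.random.default_rng(2)
fa = rng.standard_normal((4, 4, 8))
fb = rng.standard_normal((4, 4, 8))
w_c = channel_correlation_attention(fa, fb)
```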
Step 3.1.3: the parallel features F_a and F_b from the same scale layer are fed into the change difference module. The module extracts difference features through an absolute-difference branch and a concatenation branch, and fuses the two to obtain the final difference feature F_D, as shown in Fig. 3(d). The branch feature F_d1 is the absolute difference of F_a and F_b; the branch feature F_d2 is obtained by concatenating F_a and F_b and applying a 1 × 1 convolution. Finally F_d1 and F_d2 are concatenated and passed through a 1 × 1 convolution, batch normalization and a LeakyReLU transformation to obtain F_D, where
LeakyReLU(x)=max(αx,x)
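The two-branch difference computation can be sketched as follows; the convolution and batch-normalization stages are replaced by simple channel projections, so this is a shape-level illustration only:

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    """LeakyReLU(x) = max(alpha*x, x), with alpha < 1."""
    return np.maximum(alpha * x, x)

def change_difference(fa, fb):
    """Sketch of the change difference module of Fig. 3(d): an absolute-
    difference branch and a concatenation branch, fused into F_D."""
    c = fa.shape[-1]
    f_d1 = np.abs(fa - fb)                           # absolute-difference branch
    cat = np.concatenate([fa, fb], axis=-1)          # concatenation branch, 2C
    f_d2 = cat[..., :c] + cat[..., c:]               # stand-in for 1x1 conv 2C->C
    fused = np.concatenate([f_d1, f_d2], axis=-1)    # concat the two branches
    f_d = fused[..., :c] + fused[..., c:]            # stand-in for final 1x1 conv
    return leaky_relu(f_d)                           # BN omitted in this sketch

rng = np.random.default_rng(4)
fa = rng.standard_normal((4, 4, 8))
fb = rng.standard_normal((4, 4, 8))
f_d = change_difference(fa, fb)
```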
Step 3.1.4: the parallel features F_a and F_b from the same scale layer are concatenated to obtain the feature F_ab.
Step 3.1.5: F_D is weighted by W_P and W_C respectively to obtain the spatially correlated attention-guided difference feature F′_P and the channel-correlated attention-guided difference feature F′_C, as shown in Fig. 3(a):

F′_P = W_P ⊙ F_D
F′_C = W_C ⊙ F_D

where ⊙ denotes element-wise multiplication and ⊕ denotes element-wise addition.

The spatially correlated attention-guided difference feature F′_P, the channel-correlated attention-guided difference feature F′_C and the feature F_ab are further fused to obtain the final change difference feature F.
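A NumPy sketch of this guided fusion; the final fusion operator is my assumption, since the patent specifies only "further fusion" (here: concatenation followed by a stand-in channel-mixing sum in place of a 1 × 1 convolution):

```python
import numpy as np

rng = np.random.default_rng(3)
h, w, c = 4, 4, 8
f_d  = rng.standard_normal((h, w, c))      # change difference feature F_D
w_p  = rng.uniform(0, 1, (h, w, c))        # spatial attention weight W_P
w_c  = rng.uniform(0, 1, (h, w, c))        # channel attention weight W_C
f_ab = rng.standard_normal((h, w, 2 * c))  # concatenated parallel features F_ab

f_p = w_p * f_d                            # F'_P = W_P (.) F_D
f_c = w_c * f_d                            # F'_C = W_C (.) F_D
# assumed fusion: concatenate all three, then mix channels (stand-in for conv)
fused = np.concatenate([f_p, f_c, f_ab], axis=-1)          # 4C channels
f = sum(fused[..., i * c:(i + 1) * c] for i in range(4))   # final feature F
```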
And 3.2, fusing the difference characteristics among different scale layers by the refined path aggregation pyramid module.
The path aggregation pyramid network adds a bottom-up path to the conventional top-down path, which effectively strengthens the expression of low-level features during feature fusion. The refined path aggregation pyramid module proposed by the invention uses concatenation instead of element-wise addition when fusing features of different scales, so that lower-layer and higher-layer features can be fused adaptively, capturing semantic changes of ground objects at different scales more effectively.
The C3-C5 layer features undergo cross-temporal fusion through the dual-correlation attention modules and cross-scale fusion through a top-down path. Taking the generation of the P4 layer features as an example: the C5 layer features pass through a dual-correlation attention module to directly yield the P5 layer features; the P5 layer features are upsampled to the same resolution as the C4 layer features and fused with the C4 layer features (obtained through a dual-correlation attention module) to produce the P4 layer features, specifically by feature concatenation followed by 1 × 1 convolution, 3 × 3 convolution, batch normalization and LeakyReLU transformation. The P3 layer features are generated in the same way as the P4 layer.
Then the P3-P5 layer features are further fused through a bottom-up path to generate the features N3-N5. Taking the generation of the N4 layer features as an example: the N3 layer features are obtained from the P3 layer features through 1 × 1 convolution, 3 × 3 convolution, batch normalization and LeakyReLU transformation; the N3 layer features are then downsampled to the P4 layer resolution and fused with the P4 features to obtain the N4 layer features, specifically by feature concatenation followed by 1 × 1 convolution, 3 × 3 convolution, batch normalization and LeakyReLU transformation. The N5 layer features are generated in the same way as the N4 layer.
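The concatenation-based cross-scale fusion (as opposed to element-wise addition) can be sketched as follows; nearest-neighbour upsampling stands in for the network's upsampling, and the conv/BN/LeakyReLU stages are omitted:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling (stand-in for the network's upsampling)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def fuse_concat(coarse, fine):
    """Concatenation-based fusion of two scale levels, as in the refined path
    aggregation pyramid: the channel dimension grows instead of being summed."""
    return np.concatenate([upsample2x(coarse), fine], axis=-1)

p5 = np.ones((2, 2, 16))    # coarse level (e.g. P5)
c4 = np.zeros((4, 4, 16))   # finer level from the dual-correlation module
p4 = fuse_concat(p5, c4)    # 4x4 spatial size, 32 channels before compression
```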
And 4, sending the characteristics (N3-N5) of different scales from the refined path aggregation pyramid module into a change detection head to output the change object prediction in a bounding box form.
In the dual correlation attention-guided change detection network, the difference features of different scales are finally sent to the change detection head of the network, and the detected changed geographic entities (such as newly added buildings and artificial structures) are represented in bounding-box form.
In an embodiment, step 4 comprises the following sub-steps:
Step 4.1: the change detection head is divided into three layers, namely detection head-S, detection head-M and detection head-L, corresponding respectively to the N3-N5 layer features from the refined path aggregation pyramid module; according to the supervision information, the layers learn in parallel the change confidence Conf_changed and the coordinates (x, y, w, h) of each changed object. Prior boxes are used to learn the bounding-box regression of changed objects quickly and efficiently; the number of prior boxes on each of the three detection layers defaults to 3, and the widths and heights of all prior boxes are obtained by k-means clustering on the training-set data. A denotes the number of prior boxes, C the number of feature channels, W the width of the feature map, and H the height of the feature map.
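The k-means estimation of prior-box sizes from the training-set boxes can be sketched as below. The 1 − IoU distance is a commonly used choice for anchor clustering and is an assumption here, since the patent only states that k-means clustering on the training data is used; the deterministic initialization is likewise illustrative:

```python
import numpy as np

def kmeans_anchors(wh, k=3, iters=50):
    """Cluster (width, height) pairs of training-set boxes into k prior boxes.
    Distance between a box and a cluster centre is 1 - IoU of the two sizes
    (boxes compared as if sharing a corner), an assumed but common variant."""
    # deterministic spread-out initialization over the (sorted) data
    centers = wh[np.linspace(0, len(wh) - 1, k).astype(int)].astype(float)
    for _ in range(iters):
        inter = (np.minimum(wh[:, None, 0], centers[None, :, 0])
                 * np.minimum(wh[:, None, 1], centers[None, :, 1]))
        union = wh[:, 0:1] * wh[:, 1:2] + centers[:, 0] * centers[:, 1] - inter
        assign = np.argmax(inter / union, axis=1)   # nearest centre = highest IoU
        for j in range(k):
            if np.any(assign == j):                 # update centre to cluster mean
                centers[j] = wh[assign == j].mean(axis=0)
    return centers

# Two synthetic box populations: small (~10x10) and large (~100x100)
wh = np.array([[8, 9], [10, 11], [12, 10], [9, 12],
               [95, 100], [110, 90], [100, 105], [98, 99]], dtype=float)
anchors = kmeans_anchors(wh, k=2)
```

With well-separated box sizes, the two returned centres converge to the per-cluster mean widths and heights, which then serve as the prior-box dimensions on the detection layers.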
Step 4.2: the model is trained with a multi-task loss function according to the training-sample labels. The loss function Loss comprises a bounding-box regression loss L_reg and a change confidence prediction loss L_Conf:
Loss = λ_reg × L_reg + λ_Conf × L_Conf
wherein λ_reg represents the weight of the regression loss in the total loss, and λ_Conf represents the contribution of the change confidence prediction loss to the total loss. Here λ_reg = 1 and λ_Conf = 2.5 are taken.
Step 4.2 comprises the following sub-steps:
and 4.2.1, generating positive and negative samples.
The ratio of the ground truth box size to the anchor size is calculated. If the ratio is within the set threshold (1/4-4), the invention regards the anchor as a positive sample (confidence is 1); otherwise, consider a negative sample (confidence 0).
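One plausible reading of this ratio rule is sketched below; the patent does not state whether the ratio is taken per side or over areas, so the per-side variant is assumed here:

```python
def assign_anchor(gt_wh, anchor_wh, low=0.25, high=4.0):
    """Label an anchor positive (confidence 1) when both the width ratio and the
    height ratio between ground-truth box and anchor lie within [1/4, 4];
    otherwise negative (confidence 0). Per-side ratios are an assumption."""
    rw = gt_wh[0] / anchor_wh[0]   # width ratio
    rh = gt_wh[1] / anchor_wh[1]   # height ratio
    return 1 if (low <= rw <= high and low <= rh <= high) else 0
```

Anchors whose shape is grossly mismatched to the ground-truth box are thus excluded from regression and supervised only as negatives.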
Step 4.2.2: calculate the bounding-box regression loss. The CIoU loss is used as the bounding-box regression loss function.
The CIoU loss takes the form

L_CIoU = 1 − IoU + ρ²(b, b_gt)/c² + αv

wherein b represents the center point of the predicted bounding box, b_gt represents the center point of the ground-truth box, and ρ(·) is the Euclidean distance between the two center points. IoU is the intersection-over-union factor, i.e. the intersection area of the predicted box and the ground-truth box divided by their union area; c represents the diagonal length of the minimum enclosing rectangle containing the predicted box and the ground-truth box; and v measures the consistency of the aspect ratios of the predicted and ground-truth boxes:

v = (4/π²) · (arctan(w_gt/h_gt) − arctan(w/h))²

where w_gt and h_gt represent the width and height of the ground-truth box, w and h represent the width and height of the predicted box, and α = v/((1 − IoU) + v) is a weight coefficient.
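A plain-Python sketch of the CIoU regression loss for boxes given as (cx, cy, w, h), following the standard CIoU formulation consistent with the symbols defined above:

```python
import math

def ciou_loss(box, gt):
    """CIoU loss between two axis-aligned boxes given as (cx, cy, w, h)."""
    bx, by, bw, bh = box
    gx, gy, gw, gh = gt
    # corner coordinates (x1, y1, x2, y2)
    b1 = (bx - bw / 2, by - bh / 2, bx + bw / 2, by + bh / 2)
    b2 = (gx - gw / 2, gy - gh / 2, gx + gw / 2, gy + gh / 2)
    # intersection-over-union
    iw = max(0.0, min(b1[2], b2[2]) - max(b1[0], b2[0]))
    ih = max(0.0, min(b1[3], b2[3]) - max(b1[1], b2[1]))
    inter = iw * ih
    union = bw * bh + gw * gh - inter
    iou = inter / union
    # squared centre distance over squared diagonal of the minimum enclosing box
    cw = max(b1[2], b2[2]) - min(b1[0], b2[0])
    ch = max(b1[3], b2[3]) - min(b1[1], b2[1])
    rho2 = (bx - gx) ** 2 + (by - gy) ** 2
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio consistency term and its weight
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(bw / bh)) ** 2
    alpha = v / ((1 - iou) + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

The loss is zero for a perfect match and grows with centre distance, shape mismatch and shrinking overlap, which keeps the gradient informative even for non-overlapping boxes.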
and 4.2.3, calculating the change confidence prediction loss. The Focal loss function is used as the change confidence loss function.
The focal loss takes the form

L_Conf = −α · y · (1 − y′)^γ · log(y′) − (1 − α) · (1 − y) · (y′)^γ · log(1 − y′)

where y′ represents the predicted change confidence, y represents the true confidence, α is used to control the contribution of positive and negative samples to the loss, and γ is used to control the contribution of hard and easy samples to the loss. In the experiments of the present invention, α = 0.25 and γ = 1.5 are used by default.
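A sketch of the binary focal loss on the change confidence with the default α = 0.25 and γ = 1.5; the exact functional form (a standard binary focal loss) is an assumption consistent with the symbols defined above:

```python
import math

def focal_loss(y_pred, y_true, alpha=0.25, gamma=1.5, eps=1e-9):
    """Binary focal loss on the change confidence.
    y_pred in (0, 1) is the predicted confidence; y_true in {0, 1} the label.
    The (1 - p)^gamma factor down-weights easy, well-classified samples."""
    if y_true == 1:
        return -alpha * (1 - y_pred) ** gamma * math.log(y_pred + eps)
    return -(1 - alpha) * y_pred ** gamma * math.log(1 - y_pred + eps)
```

Confidently correct predictions contribute almost nothing, so training concentrates on the hard samples and on the rare positive (changed) anchors.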
Step 4.2.4: train the dual correlation attention-guided object-level change detection network with the multi-task loss until the whole network converges to its optimal precision.
Step 4.3: the bi-temporal images to be detected are input into the network trained in step 4.2 to obtain accurate change detection results. The bi-temporal image pairs to be detected are generated in a manner consistent with the processing of the training set in the preceding steps.
From the change detection predictions on part of the experimental data, it can be seen that the invention accurately and robustly detects changes in remote sensing images of different scenes. The experimental results demonstrate that the data enhancement method dedicated to change detection accelerates model training and improves model performance, and that, through end-to-end learning on the training set of change image pairs and guidance by the dual correlation attention mechanism, the dual correlation attention-based detection network effectively resists pseudo-change interference in image pairs and therefore achieves higher accuracy and robustness.
In specific implementation, a person skilled in the art can implement the above process as an automatic operation flow using computer software technology. System devices implementing the method, such as a computer-readable storage medium storing a corresponding computer program according to the technical solution of the present invention, and a computer device comprising and operating such a computer program, should also fall within the protection scope of the present invention.
In some possible embodiments, an object-level remote sensing change detection system based on dual correlation attention is provided, comprising the following modules:
a first module, used for carrying out data enhancement for change detection and generating dual input streams, the implementation comprising: performing random temporal swapping and combination on four pairs of images to obtain one synthetic image pair; applying enhancement operations, including random geometric transformation, brightness transformation, Gaussian noise addition and color transformation, to one of the synthetic images; and applying only the corresponding geometric transformation to the other synthetic image;
a second module, used for setting up a weight-sharing backbone network that receives the dual input streams and extracts multi-scale features of the bi-temporal images;
a third module, used for setting a dual correlation attention-guided feature fusion neck on the basis of the backbone network, wherein dual correlation attention modules are arranged in the feature fusion neck to guide the network to attend to the correlations of the bi-temporal features of the same scale at the spatial level and the channel level so as to obtain refined difference features, and a refined path aggregation pyramid module is arranged to fuse the features of different scale layers;
and a fourth module, used for finally sending the difference features of different scales to the change detection head and predicting the positions, sizes and change confidences of the changed ground objects in bounding-box form.
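The data enhancement in the first module (random temporal swap, combination of four image pairs into one synthetic pair, a shared geometric transform on both streams, and photometric perturbation of one stream only) can be sketched as follows; the mosaic tiling and the horizontal flip are illustrative assumptions, since the exact combination operation is not fixed here:

```python
import numpy as np

def make_synthetic_pair(pairs, rng):
    """pairs: four (img_t1, img_t2) tuples of (H, W, 3) arrays.
    Randomly swaps the temporal order of each pair, tiles the four pairs into
    one 2x2 mosaic pair, applies the same geometric transform (here a
    horizontal flip) to both streams, and adds Gaussian noise to one stream
    only, so that only geometric changes stay consistent across time."""
    a_tiles, b_tiles = [], []
    for img_a, img_b in pairs:
        if rng.random() < 0.5:                       # random temporal swap
            img_a, img_b = img_b, img_a
        a_tiles.append(img_a)
        b_tiles.append(img_b)
    tile = lambda t: np.concatenate(
        [np.concatenate(t[:2], axis=1), np.concatenate(t[2:], axis=1)], axis=0)
    a, b = tile(a_tiles), tile(b_tiles)
    if rng.random() < 0.5:                           # shared geometric transform
        a, b = a[:, ::-1].copy(), b[:, ::-1].copy()
    a = np.clip(a + rng.normal(0.0, 5.0, a.shape), 0, 255)  # noise on one stream only
    return a, b

rng = np.random.default_rng(0)
pairs = [(np.zeros((8, 8, 3)), np.ones((8, 8, 3))) for _ in range(4)]
pair_a, pair_b = make_synthetic_pair(pairs, rng)
```

Randomly swapping the temporal order teaches the network that change is symmetric in time, while restricting photometric perturbations to one stream simulates the radiometric differences between acquisition dates.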
In some possible embodiments, there is provided a dual correlation attention-based object-level remote sensing change detection system, including a processor and a memory, the memory storing program instructions, and the processor calling the stored instructions in the memory to execute a dual correlation attention-based object-level remote sensing change detection method as described above.
In some possible embodiments, a system for detecting object-level remote sensing change based on dual-correlation attention is provided, which includes a readable storage medium, on which a computer program is stored, and when the computer program is executed, the method for detecting object-level remote sensing change based on dual-correlation attention is implemented.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.
Claims (10)
1. An object-level remote sensing change detection method based on dual correlation attention, characterized by comprising the following steps:
step 1, carrying out data enhancement for change detection and generating dual input streams, the implementation comprising: performing random temporal swapping and combination on four pairs of images to obtain one synthetic image pair; applying enhancement operations, including random geometric transformation, brightness transformation, Gaussian noise addition and color transformation, to one of the synthetic images; and applying only the corresponding geometric transformation to the other synthetic image;
step 2, setting a backbone network sharing weight for receiving double input streams and extracting different scale characteristics of double time phase images;
step 3, a dual-correlation attention guiding feature fusion neck is arranged on the basis of a backbone network, a dual-correlation attention module is arranged in the feature fusion neck to guide the network to focus on the correlation of the dual-time-phase features of the same scale in the space level and the channel level so as to obtain refined difference features, and a refined path aggregation pyramid module is arranged to fuse the features of different scale layers;
and 4, finally, sending the difference characteristics of different scales to a change detection head, and predicting the position, the size and the change confidence coefficient of the changed ground feature in a boundary box mode.
2. The method for detecting the object-level remote sensing change based on the dual correlation attention according to claim 1, characterized in that: in the weight sharing backbone network, feature reuse is realized through a layered feature fusion strategy, and excessively repeated gradient information is cut off for extracting parallel features of the double-time phase image in different scales.
3. The method for detecting the object-level remote sensing change based on the dual correlation attention according to claim 2, characterized in that: a twin network is constructed by using a CSPDarkNet-53 network as a backbone network for sharing weights, five-scale features are extracted, and the features are sequentially marked as C1-C5 layers from the top layer to the bottom layer.
4. The method for detecting the object-level remote sensing change based on the dual correlation attention according to claim 3, characterized in that: the dual correlation attention-guided feature fusion neck fuses the features of the C3-C5 layers from the twin CSPDarkNet-53 network through three dual correlation attention modules to generate the P3-P5 layer features, and then fuses the difference features among different scale layers through the refined path aggregation pyramid module.
5. The method for detecting the object-level remote sensing change based on dual correlation attention according to claim 1, 2, 3 or 4, characterized in that: the dual correlation attention module comprises a spatial correlation attention module, a channel correlation attention module and a change difference module; the parallel features F_a, F_b from the same scale layer are concatenated to obtain a feature F_ab; the spatially correlated attention-guided difference feature F_P, the channel-correlated attention-guided difference feature F_C and the feature F_ab are further fused to obtain the final change difference feature F.
6. The method for detecting the object-level remote sensing change based on the dual correlation attention according to claim 4, wherein the method comprises the following steps: the change detection head is divided into three layers, namely a detection head-S, a detection head-M and a detection head-L, which respectively correspond to the N3-N5 layer characteristics from the refined path aggregation pyramid module, learn the change confidence coefficient and the coordinate of the change object in parallel according to the supervision information, and learn the regression of the boundary frame of the change object by using the prior frame; training the whole network model by using a multi-task loss function according to the training sample label; and inputting the double time phase images to be detected into the trained network relatively to obtain an accurate change detection result.
7. An object-level remote sensing change detection system based on dual correlation attention is characterized in that: the method is used for realizing the object-level remote sensing change detection method based on the dual correlation attention according to any one of claims 1 to 6.
8. The system for detecting object-level remote sensing change based on dual correlation attention of claim 7, wherein: the system comprises the following modules:
a first module, used for carrying out data enhancement for change detection and generating dual input streams, the implementation comprising: performing random temporal swapping and combination on four pairs of images to obtain one synthetic image pair; applying enhancement operations, including random geometric transformation, brightness transformation, Gaussian noise addition and color transformation, to one of the synthetic images; and applying only the corresponding geometric transformation to the other synthetic image;
a second module, used for setting up a weight-sharing backbone network that receives the dual input streams and extracts multi-scale features of the bi-temporal images;
a third module, used for setting a dual correlation attention-guided feature fusion neck on the basis of the backbone network, wherein dual correlation attention modules are arranged in the feature fusion neck to guide the network to attend to the correlations of the bi-temporal features of the same scale at the spatial level and the channel level so as to obtain refined difference features, and a refined path aggregation pyramid module is arranged to fuse the features of different scale layers;
and a fourth module, used for finally sending the difference features of different scales to the change detection head and predicting the positions, sizes and change confidences of the changed ground objects in bounding-box form.
9. The system for detecting object-level remote sensing change based on dual correlation attention of claim 7, wherein: comprising a processor and a memory, the memory being used for storing program instructions, the processor being used for calling the stored instructions in the memory to execute a method for object-level remote sensing change detection based on dual correlation attention as claimed in any one of claims 1 to 6.
10. The system for detecting object-level remote sensing change based on dual correlation attention of claim 7, wherein: comprising a readable storage medium having stored thereon a computer program which, when executed, implements a method for object-level remote sensing change detection based on dual correlated attention as claimed in any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110692812.7A CN113609896B (en) | 2021-06-22 | 2021-06-22 | Object-level remote sensing change detection method and system based on dual-related attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113609896A true CN113609896A (en) | 2021-11-05 |
CN113609896B CN113609896B (en) | 2023-09-01 |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114037912A (en) * | 2022-01-07 | 2022-02-11 | 成都国星宇航科技有限公司 | Method and device for detecting change of remote sensing image and computer readable storage medium |
CN114155200A (en) * | 2021-11-09 | 2022-03-08 | 二十一世纪空间技术应用股份有限公司 | Remote sensing image change detection method based on convolutional neural network |
CN114821354A (en) * | 2022-04-19 | 2022-07-29 | 福州大学 | Urban building change remote sensing detection method based on twin multitask network |
CN114937204A (en) * | 2022-04-29 | 2022-08-23 | 南京信息工程大学 | Lightweight multi-feature aggregated neural network remote sensing change detection method |
CN115114395A (en) * | 2022-04-15 | 2022-09-27 | 腾讯科技(深圳)有限公司 | Content retrieval and model training method and device, electronic equipment and storage medium |
CN115601318A (en) * | 2022-10-10 | 2023-01-13 | 广东昱升个人护理用品股份有限公司(Cn) | Intelligent production method and system for fast-absorption low-reverse-osmosis paper diaper |
CN116385881A (en) * | 2023-04-10 | 2023-07-04 | 北京卫星信息工程研究所 | Remote sensing image ground feature change detection method and device |
CN117574259A (en) * | 2023-10-12 | 2024-02-20 | 南京工业大学 | Attention twin intelligent migration interpretability diagnosis method suitable for high-end equipment |
CN117671437A (en) * | 2023-10-19 | 2024-03-08 | 中国矿业大学(北京) | Open stope identification and change detection method based on multitasking convolutional neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705457A (en) * | 2019-09-29 | 2020-01-17 | 核工业北京地质研究院 | Remote sensing image building change detection method |
CN111091712A (en) * | 2019-12-25 | 2020-05-01 | 浙江大学 | Traffic flow prediction method based on cyclic attention dual graph convolution network |
CN112287123A (en) * | 2020-11-19 | 2021-01-29 | 国网湖南省电力有限公司 | Entity alignment method and device based on edge type attention mechanism |
CN112949549A (en) * | 2021-03-19 | 2021-06-11 | 中山大学 | Super-resolution-based change detection method for multi-resolution remote sensing image |
Non-Patent Citations (2)
Title |
---|
YIFENG WANG et al.: "Attention Modulated Multiple Object Tracking with Motion Enhancement and Dual Correlation", MDPI |
ZHANG Han et al.: "Attention-guided 3D convolutional network for remote sensing scene change detection", Journal of Applied Sciences, vol. 39, no. 2 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||