CN115641316A - Weak supervision change detection method and device based on background mixed data expansion technology - Google Patents
Weak supervision change detection method and device based on background mixed data expansion technology
- Publication number: CN115641316A (application number CN202211332063.8A)
- Authority: CN (China)
- Prior art date: 2022-10-28
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a weakly supervised change detection method and device based on a background mixed data augmentation technology. The method comprises the following steps: constructing a weakly supervised training set, a background guide set, and an augmentation operation set; enhancing an input image pair with a mask and a background through a background-aware augmentation operation to obtain a new image pair, and obtaining the final augmented output through an augmentation strategy; continuously updating the change detection model parameters with a learning algorithm to finally obtain an optimal model; and finally testing with the trained model. The device comprises: a processor and a memory. The present invention uses a background guide set to enrich potential background variations in training examples, a background-aware augmentation operation to help the change detection model see different background changes, and a consistency loss function to enhance generalization capability.
Description
Technical Field
The invention relates to the field of data expansion, in particular to a method and a device for detecting weak supervision change based on a background mixed data expansion technology.
Background
Change detection is a technique for detecting changed areas in a pair of images taken at two different times. Early change detection techniques were algebra-based, for example image gradients and Change Vector Analysis (CVA). Because these conventional change detection models cannot handle the noise introduced by background change, more complex models have gradually been proposed; for example, models such as CNNs (convolutional neural networks) and GANs (generative adversarial networks) are increasingly applied to change detection. In recent studies, Transformers have also been used for change detection. The change detection method in document [1] combines a hierarchically structured Transformer encoder and a multilayer perceptron (MLP) decoder in a Siamese network structure, effectively rendering the multi-scale long-range details required for accurate change detection (CD). These methods have achieved good results, but since change detection datasets do not contain all environmental transformations, a data augmentation method suited to change detection is necessary.
Data Augmentation is a method that can effectively improve the generalization capability of deep models. At present, data augmentation techniques are mainly applied to classification and detection tasks. The CutOut method described in document [2] cuts a rectangular area out of an input image and fills its pixels with 0; the random erasing algorithm introduced in document [3] randomly selects a rectangular box in the input image and randomly replaces its pixel values, achieving the effect of data augmentation; the CutMix method introduced in document [4] cuts off part of an image region and fills the removed region with pixel values from other data in the training set to improve the robustness of the model; the Mixup image mixing method introduced in document [5] adds two samples in a certain proportion to obtain new sample data. By correctly adopting such data augmentation methods, the problem of insufficient data can be alleviated, and the generalization capability and robustness of the model can be rapidly improved.
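To make the contrast with the background mixing introduced later concrete, Mixup [5] can be sketched in a few lines. This is an illustrative minimal version, not the patent's implementation; in practice the mixing ratio `lam` is drawn from a Beta distribution rather than fixed.

```python
import numpy as np

def mixup(x1, y1, x2, y2, lam=0.7):
    """Mixup: blend two samples and their labels with ratio lam,
    producing a new training example (lam would normally be drawn
    from a Beta distribution)."""
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y
```

Note that Mixup blends whole images, with no notion of a change mask; the method below instead mixes only the background region, which is what makes it suitable for change detection.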
The data augmentation methods described above are only suitable for classification or detection tasks, not for change detection. It is therefore necessary to provide a data augmentation method designed for change detection.
Reference to the literature
[1] Bandara, W. G. C.; and Patel, V. M. 2022. A Transformer-Based Siamese Network for Change Detection. arXiv preprint arXiv:2201.01293.
[2] DeVries, T.; and Taylor, G. W. 2017. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552.
[3] Zhong, Z.; Zheng, L.; Kang, G.; et al. 2020. Random Erasing Data Augmentation. In AAAI, 13001-13008.
[4] Yun, S.; Han, D.; Oh, S. J.; Chun, S.; Choe, J.; and Yoo, Y. 2019. CutMix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 6023-6032.
[5] Zhang, H.; Cisse, M.; Dauphin, Y. N.; et al. 2018. mixup: Beyond Empirical Risk Minimization. In International Conference on Learning Representations.
Disclosure of Invention
The invention provides a weakly supervised change detection method and device based on a background mixed data augmentation technology (BGMix). It introduces a new consistency loss function (Augmented and Real Data Consistency Loss) and constructs a background guide set; trains a deep change detection model in a weakly supervised manner using the background mixing technique; enriches potential background changes in training examples using the background guide set; uses a background-aware augmentation operation to help the change detection model see different background changes; and enhances generalization capability through the consistency loss function, as described in detail below:
a method for detecting weakly supervised change based on background mixed data extension technology, the method comprising:
constructing a weak supervision training set, a background guide set and an augmentation data set;
enhancing an input image pair with a mask and a background through a background-aware augmentation operation to obtain a new image pair, and obtaining the final augmented output through an augmentation strategy;
continuously updating the change detection model parameters by using a learning algorithm to finally obtain an optimal model; and finally, testing by using the trained model.
Wherein enhancing the input image pair with a mask and a background through the background-aware augmentation operation to obtain a new image pair specifically comprises:
randomly sampling a background pair <B1, B2> from the background guide set, and processing the input image pair <I1, I2> with augmentation operations from the augmentation operation set.
Further, the obtaining of the expanded final output through the enhancement strategy specifically includes:
for each augmentation path, sampling three operations from the augmentation operation set and stacking them to construct a new operation; randomly sampling one operation from the newly constructed operations, the sampled operation mixing the mask C, the background B, and the image pair <I1, I2> to obtain a new image pair;
repeating the above steps to obtain K new image pairs after K rounds of enhancement, and mixing the K new image pairs with randomly sampled weights.
The step of continuously updating the change detection model parameters with a learning algorithm to finally obtain an optimal model uses, in particular, a perceptual consistency loss:
wherein [·] is a concatenation operation, ψ(·) denotes a pre-trained VGG16 network used for perceptual feature extraction, cos(·) is the cosine similarity, and <Î1, Î2> denotes the enhanced image pair;
a weakly supervised change detection apparatus based on background mixed data extension techniques, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling upon the program instructions stored in the memory to cause the apparatus to perform any of the method steps described.
The technical scheme provided by the invention has the beneficial effects that:
1. The invention provides a data augmentation method for improving the performance of change detection models, and constructs a background guide set B used to enrich the potential backgrounds of input images;
2. The method uses a background-aware augmentation operation to help the change detection model see different background changes; by constructing an augmentation operation set, the invention transforms an original image pair <I1, I2> into a new image pair <Î1, Î2>;
3. The invention uses the Augmented & Real Data Consistency Loss to evaluate the similarity between the enhanced image pairs with different backgrounds and the original image pair <I1, I2>, reducing errors caused by image background changes and enhancing the generalization capability of the change detection model.
Drawings
FIG. 1 is a flow chart of a method for detecting weakly supervised change based on background mixed data expansion technique;
FIG. 2 is a flow chart of a BGMix function proposed by the present invention;
fig. 3 is a schematic structural diagram of BGMix according to the present invention;
FIG. 4 is a diagram illustrating the generation result of the proposed method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
The embodiment of the invention provides a background mixing enhancement technique specially designed for change detection, addressing the insufficient performance of change detection on remote sensing images; it mixes the original input image pairs so that the change detection model can better detect changed areas.
A background mixing enhancement technique specifically designed for change detection, the technique comprising the steps of:
1. constructing a weak supervision training set T, a background guide set B and an augmentation operation set O
Specifically: prepare images with image-level labels to construct a weakly supervised training set T, for example from the AICD (Aerial Image Change Detection) and BCD (Building Change Detection) datasets; collect N image pairs to construct a background guide set B, where each image pair contains only background changes and no target changes; and set an augmentation operation set O containing M operations, for example posterize, rotate, equalize, and the like.
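The construction of the operation set O might be sketched as follows; the minimal numpy implementations of posterize/rotate/equalize here are illustrative assumptions, not the patent's code (a real pipeline would typically use an image library's versions of these operations).

```python
import numpy as np

def posterize(img, bits=4):
    """Reduce the number of intensity levels per channel of a uint8 image."""
    shift = 8 - bits
    return (img.astype(np.uint8) >> shift) << shift

def rotate180(img):
    """Rotate the image by 180 degrees (a minimal stand-in for 'rotate')."""
    return img[::-1, ::-1]

def equalize(img):
    """Histogram-equalize a uint8 image (global, whole-array)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255 // max(cdf.max() - cdf.min(), 1)
    return cdf[img].astype(np.uint8)

# The augmentation operation set O (M = 3 here for illustration).
O = [posterize, rotate180, equalize]
```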
2. Performing BGMix enhancement operations
The BGMix enhancement is subdivided into two operations: a background-aware augmentation operation and an augmentation strategy.
Background-aware augmentation operation: first, a background pair <B1, B2> is randomly sampled from the background guide set B. The input image pair <I1, I2> is then processed with augmentation operations from the augmentation operation set O.
The operation function is specifically as follows:
<Î1, Î2> = α_j(o_j(<I1, I2>), <B1, B2>, C),
wherein o_j is an operation for transforming the image pair <I1, I2>, α_j is an operation that changes the background of the original image pair, C is a change mask, and <Î1, Î2> is the image pair obtained after the o_j operation.
This formula can be further described as:
Î_i = C ⊙ o_j(I_i) + (1 − C) ⊙ B_i = Rep(o_j(I_i), B_i, C), i ∈ {1, 2},
wherein ⊙ multiplies the corresponding elements of two matrices, and Rep is the function that replaces the background of the original image.
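Given the description above (element-wise products of a change mask with image and background), Rep can be sketched as below. This interpretation, Rep(I, B, C) = C ⊙ I + (1 − C) ⊙ B, is an assumption consistent with equations (5) and (6) later in the text: the changed region selected by C is kept from the first argument and everything else is filled from the second.

```python
import numpy as np

def rep(img, bg, mask):
    """Rep(I, B, C): keep I where the change mask C is 1 and
    replace everything else with the background B, element-wise."""
    mask = mask.astype(img.dtype)
    return mask * img + (1 - mask) * bg
```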
The augmentation strategy operates as follows: K augmentation paths are set; for each path, three operations are first sampled from the augmentation operation set O and stacked to build a new operation. Then an operation is randomly sampled from the newly constructed operations, and the sampled operation mixes the mask C, the background B, and the image pair <I1, I2> to obtain a new, enhanced image pair. These steps are repeated to obtain K new image pairs after K rounds of enhancement, and the K new image pairs are mixed with randomly sampled weights. Finally, the mixed image pair is further mixed with the original input image to obtain the final output image.
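The K-path strategy just described might look like the following sketch. The per-path mixing, the random weights, and the final 0.5 blend ratio with the original input are assumptions for illustration; the patent does not specify these values.

```python
import random
import numpy as np

def augmentation_strategy(img_pair, bg_pair, mask, op_set, k=3, seed=0):
    """Sketch of the K-path augmentation strategy: per path, stack three
    sampled operations, apply them, mix with the background via the mask,
    then combine the K results with random weights and blend with the input."""
    rng = random.Random(seed)
    i1, i2 = img_pair
    b1, b2 = bg_pair
    outs = []
    for _ in range(k):
        # Stack three randomly sampled operations into one new operation.
        ops = [rng.choice(op_set) for _ in range(3)]
        def stacked(x, ops=ops):
            for op in ops:
                x = op(x)
            return x
        # Mix mask, background, and (transformed) image pair.
        n1 = mask * stacked(i1) + (1 - mask) * b1
        n2 = mask * stacked(i2) + (1 - mask) * b2
        outs.append((n1, n2))
    # Mix the K new pairs with randomly sampled weights.
    w = np.array([rng.random() for _ in range(k)])
    w /= w.sum()
    m1 = sum(wi * o[0] for wi, o in zip(w, outs))
    m2 = sum(wi * o[1] for wi, o in zip(w, outs))
    # Finally blend with the original input pair (ratio assumed 0.5).
    return 0.5 * m1 + 0.5 * i1, 0.5 * m2 + 0.5 * i2
```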
3. Performing a learning algorithm
An image pair is sampled from the training dataset T_train and change detection is performed by φ_θ(·). An enhanced image pair is then obtained through the BGMix function. Thereafter, the defined loss function is computed and the model parameters are updated.
The loss function comprises 5 parts. The first part takes the perceptual similarity between <Î1, Î2> and <I1, I2> as the first loss function, which can be described as:
L_per = 1 − cos(ψ([Î1, Î2]), ψ([I1, I2])),
wherein [·] is a concatenation operation, ψ(·) denotes a pre-trained VGG16 network used for perceptual feature extraction, cos(·) is the cosine similarity, and <Î1, Î2> is the enhanced image pair.
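A hypothetical numpy sketch of this perceptual consistency term, with a trivial stand-in feature extractor in place of the VGG16 network ψ(·) and an assumed 1 − cos sign convention (so that identical pairs give zero loss):

```python
import numpy as np

def features(pair):
    """Stand-in for psi([I1, I2]): concatenate the pair and flatten.
    A real implementation would pass the concatenation through VGG16."""
    return np.concatenate([p.ravel() for p in pair]).astype(np.float64)

def perceptual_consistency_loss(aug_pair, orig_pair, eps=1e-8):
    """1 - cosine similarity between perceptual features of the
    augmented and original image pairs."""
    a, b = features(aug_pair), features(orig_pair)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return 1.0 - cos
```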
The second part randomly samples a background pair <B1, B2> from the background guide set B, then uses this background to replace the background of <I1, I2> and obtain a new image pair; the specific function is:
I′1 = Rep(I1, B1, C), I′2 = Rep(I2, B2, C), (5)
wherein I′1, I′2 are the image pair obtained by replacing the background of the original image pair <I1, I2> with the Rep function.
In addition, a new background pair can be synthesized by pasting the background regions of <I1, I2> into the corresponding positions in <B1, B2>. A new background pair is then obtained; the specific function is:
B′1 = Rep(B1, I1, C), B′2 = Rep(B2, I2, C), (6)
wherein B′1, B′2 is the background pair obtained by the Rep function.
A new loss function is then defined to evaluate the similarity between <I1, I2> and <I′1, I′2>, and between <B1, B2> and <B′1, B′2>. The loss function can be described as:
the third part embeds more background context from the background pair, which will facilitate the identification of small area areas. The loss function is described as:
where SSIM (·) is a loss of structural similarity.
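As an illustration of the SSIM term, a minimal global (single-window) SSIM can be written as follows. This is a simplification: real SSIM losses are computed over local sliding windows; the constants follow the common defaults (K1 = 0.01, K2 = 0.03).

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Global structural similarity between two images, using a single
    window covering the whole image (simplified sketch)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def ssim_loss(x, y):
    """SSIM loss: 1 - SSIM, so identical images give 0."""
    return 1.0 - ssim_global(x, y)
```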
The fourth part uses two discriminators for an adversarial loss, so that the convolutional neural network φ_θ(·) produces accurate change results. The loss function is described as follows:
wherein D_1 and D_2 are two discriminators.
Finally, to control potential prediction errors caused by background variations, a loss function is set so that φ_θ(·) predicts an all-zero mask when a background pair is taken as input. The loss function is described as follows:
where 0 is an all-zero tensor. Combining all the above loss functions yields the final loss function:
wherein {λ_i} are weights balancing each term.
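The combination with balance weights {λ_i} might be sketched as below; the term names and weight values are placeholders for illustration, not values given in the patent.

```python
def total_loss(terms, lambdas):
    """Combine the five loss terms with balance weights {lambda_i},
    i.e. L = sum_i lambda_i * L_i over the terms described above."""
    return sum(lambdas[name] * value for name, value in terms.items())

# Hypothetical per-term values and weights for one training step.
terms = {"per": 0.2, "arc": 0.1, "ssim": 0.05, "adv": 0.3, "bg": 0.0}
lambdas = {"per": 1.0, "arc": 1.0, "ssim": 0.5, "adv": 0.1, "bg": 1.0}
loss = total_loss(terms, lambdas)
```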
4. Testing of networks
After the third step, the trained model is obtained; the images of the test set are then read to test the trained change detection models (FCD, AFA, and WCD).
In summary, the embodiment of the present invention uses the BGMix data augmentation method: the background guide set B and the change mask C are used in a background-aware augmentation operation to enhance the input image pair <I1, I2> and obtain a new image pair <Î1, Î2>, and the final augmented output is obtained through the augmentation strategy; the change detection model parameters are then continuously updated with a learning algorithm to finally obtain an optimal model; and finally the trained model is used for testing.
A weakly supervised change detection apparatus based on background mixed data extension technology, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling the program instructions stored in the memory to cause the apparatus to perform any of the method steps of:
constructing a weak supervision training set, a background guide set and an augmentation data set;
enhancing an input image pair with a mask and a background through a background-aware augmentation operation to obtain a new image pair, and obtaining the final augmented output through an augmentation strategy;
continuously updating the change detection model parameters by using a learning algorithm to finally obtain an optimal model; and finally, testing by using the trained model.
Wherein enhancing an input image pair with a mask and a background through a background-aware augmentation operation to obtain a new image pair specifically comprises:
randomly sampling a background pair <B1, B2> from the background guide set, and processing the input image pair <I1, I2> with augmentation operations from the augmentation operation set.
Further, obtaining the expanded final output through the enhancement strategy specifically includes:
for each augmentation path, sampling three operations from the augmentation operation set and stacking them to construct a new operation; randomly sampling one operation from the newly constructed operations, the sampled operation mixing the mask C, the background B, and the image pair <I1, I2> to obtain a new image pair;
repeating the above steps to obtain K new image pairs after K rounds of enhancement, and mixing the K new image pairs with randomly sampled weights.
The step of continuously updating the change detection model parameters with a learning algorithm to finally obtain an optimal model uses, in particular, a perceptual consistency loss:
wherein [·] is a concatenation operation, ψ(·) denotes a pre-trained VGG16 network used for perceptual feature extraction, cos(·) is the cosine similarity, and <Î1, Î2> denotes the enhanced image pair;
in the embodiment of the present invention, except for the specific description of the model of each device, the model of other devices is not limited as long as the device can perform the above functions.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above embodiment numbers of the present invention are merely for description and do not represent the superiority or inferiority of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (5)
1. A method for detecting weak supervision change based on background mixed data expansion technology is characterized by comprising the following steps:
constructing a weak supervision training set, a background guide set and an augmentation data set;
enhancing an input image pair with a mask and a background through a background-aware augmentation operation to obtain a new image pair, and obtaining the final augmented output through an augmentation strategy;
continuously updating the change detection model parameters by using a learning algorithm to finally obtain an optimal model; and finally, testing by using the trained model.
2. The method of claim 1, wherein enhancing the input image pair with the mask and the background through the background-aware augmentation operation to obtain a new image pair specifically comprises:
randomly sampling a background pair <B1, B2> from the background guide set, and processing the input image pair <I1, I2> with augmentation operations from the augmentation operation set.
3. The method for detecting weakly supervised change based on background mixed data extension technology as claimed in claim 1, wherein the obtaining of the extended final output through the enhancement strategy is specifically:
for each augmentation path, sampling three operations from the augmentation operation set and stacking them to construct a new operation; randomly sampling one operation from the newly constructed operations, the sampled operation mixing the mask C, the background B, and the image pair <I1, I2> to obtain a new image pair;
repeating the above steps to obtain K new image pairs after K rounds of enhancement, and mixing the K new image pairs with randomly sampled weights.
4. The method according to claim 1, wherein continuously updating the change detection model parameters with a learning algorithm to finally obtain the optimal model specifically uses a perceptual consistency loss:
wherein [·] is a concatenation operation, ψ(·) denotes a pre-trained VGG16 network used for perceptual feature extraction, cos(·) is the cosine similarity, and <Î1, Î2> denotes the enhanced image pair;
5. a weakly supervised change detection apparatus based on background mixed data extension technology, the apparatus comprising: a processor and a memory, the memory having stored therein program instructions, the processor calling upon the program instructions stored in the memory to cause the apparatus to perform any of the method steps described.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202211332063.8A | 2022-10-28 | 2022-10-28 | Weak supervision change detection method and device based on background mixed data expansion technology
Publications (1)

Publication Number | Publication Date
---|---
CN115641316A | 2023-01-24

Family ID: 84946890
Status: Pending (CN)
Cited By (1)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117095136A (granted as CN117095136B, 2024-03-29) | 2023-10-19 | 2023-11-21 | University of Science and Technology of China | Multi-object and multi-attribute image reconstruction and editing method based on 3D GAN
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination