CN115511781A - Building change detection method based on multi-scale twin network - Google Patents

Building change detection method based on multi-scale twin network

Info

Publication number
CN115511781A
Authority
CN
China
Prior art keywords
image
moment
scale
convolution
twin network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210889737.8A
Other languages
Chinese (zh)
Inventor
罗子娟
李清伟
李沛
丁帅
李雪松
陈杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 28 Research Institute
Original Assignee
CETC 28 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 28 Research Institute filed Critical CETC 28 Research Institute
Priority to CN202210889737.8A (Critical)
Publication of CN115511781A (Critical)
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/40Image enhancement or restoration using histogram techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/90Determination of colour characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a building change detection method based on a multi-scale twin network, comprising the following steps: performing color space conversion on two images to be detected that contain building changes; performing brightness histogram matching; performing chroma histogram matching; constructing and training a multi-scale twin network model; performing multi-scale feature extraction on the images with the trained model to obtain feature maps; and combining the feature maps and the trained model to obtain a difference map, completing the building change detection based on the multi-scale twin network. To address false detections caused by shadows of buildings and other ground-cover objects, the invention adopts a double histogram matching method to eliminate interfering differences in chroma and brightness, and introduces a multi-scale feature extraction model to reduce the scale differences between changed buildings. The method realizes a preliminary screening of reconstruction and extension sites of important targets such as buildings, and assists detailed and accurate judgment of the change area.

Description

Building change detection method based on multi-scale twin network
Technical Field
The invention relates to a building change detection method, in particular to a multi-scale twin network-based building change detection method.
Background
Existing change detection methods mainly comprise pixel-level and target-level methods. Pixel-level change detection takes the pixel as the analysis unit and adopts a strategy of direct comparison or post-classification comparison; it can be effectively applied to medium- and low-resolution remote sensing images and is simple to implement, but determining an optimal threshold is difficult and the change type cannot be identified. Target-level change detection methods can be divided into those based on joint segmentation of the image set and those based on independent segmentation of each image. These methods are likewise simple to implement, but they cannot detect ground objects whose shape has changed.
Disclosure of Invention
The invention aims to solve the technical problem of providing, in view of the defects of the prior art, a building change detection method based on a multi-scale twin network.
In order to solve the technical problem, the invention discloses a building change detection method based on a multi-scale twin network, which comprises the following steps:
step 1, performing color space conversion on two images to be detected, namely a first moment image and a second moment image, which contain building changes;
step 2, brightness histogram matching: specifying the brightness channel of the image at the second moment using the brightness channel of the image at the first moment;
step 3, chroma histogram matching: specifying the chroma channel of the image at the first moment using the chroma channel of the image at the second moment;
step 4, constructing a multi-scale twin network model;
step 5, training the multi-scale twin network model;
step 6, performing multi-scale feature extraction on the image at the first time by using the trained multi-scale twin network model to form a first feature map; performing multi-scale feature extraction on the image at the second moment to form a second feature map;
step 7, obtaining a difference map from the first feature map and the second feature map by using the trained multi-scale twin network model, and completing the building change detection based on the multi-scale twin network.
The method for color space conversion in step 1 of the invention comprises the following steps:
conversion from RGB space to YCbCr space, comprising:
Y=0.299R+0.587G+0.114B
Cb=0.564(B-Y)
Cr=0.713(R-Y)
where Y represents the luminance value of the image, Cb the blue chroma offset, Cr the red chroma offset, and R, G and B the red, green and blue channel values of the image.
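As an illustrative sketch (not part of the patent; the function name and the H x W x 3 array layout are our own assumptions), the conversion can be written in NumPy:

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Convert an H x W x 3 RGB array to YCbCr using the formulas above."""
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B   # luminance
    Cb = 0.564 * (B - Y)                    # blue chroma offset
    Cr = 0.713 * (R - Y)                    # red chroma offset
    return np.stack([Y, Cb, Cr], axis=-1)

# A neutral gray pixel should have (near-)zero chroma offsets:
gray = np.full((1, 1, 3), 128.0)
ycbcr = rgb_to_ycbcr(gray)
```

For R = G = B both chroma channels vanish, which is why luminance and chroma can then be matched independently in steps 2 and 3.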
The method for matching the brightness histogram in the step 2 of the invention comprises the following steps:
step 2-1, converting the first moment image and the second moment image into a first luminance image l1(x, y) and a second luminance image l2(x, y):
l1(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
l2(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
wherein R(x, y), B(x, y) and G(x, y) are the RGB channel values of pixel (x, y), x being the x-axis coordinate and y the y-axis coordinate of the pixel;
step 2-2, taking the brightness of the first moment image as reference, matching the brightness of the second moment image to it:
[histogram-specification formula, rendered as an image in the original]
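The matching formula itself is rendered as an image in the patent; the following NumPy sketch implements the standard histogram-specification construction it describes — map each source gray level through the source CDF and then through the inverse of the reference CDF (function names and toy data are our own):

```python
import numpy as np

def match_histogram(source, reference, levels=256):
    """Remap `source` gray levels so its histogram matches `reference`'s."""
    src_hist, _ = np.histogram(source, bins=levels, range=(0, levels))
    ref_hist, _ = np.histogram(reference, bins=levels, range=(0, levels))
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size
    # For each source level, pick the reference level whose CDF first reaches it.
    lut = np.searchsorted(ref_cdf, src_cdf).clip(0, levels - 1)
    return lut[source.astype(np.intp)]

# Match a dark image (levels 0-99) toward a brighter reference (100-199):
rng = np.random.default_rng(0)
dark = rng.integers(0, 100, size=(64, 64))
bright = rng.integers(100, 200, size=(64, 64))
matched = match_histogram(dark, bright)
```

After matching, the dark image's gray levels fall in the reference range and its mean approaches the reference mean, which is exactly the correction the patent applies to the second-moment luminance channel.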
the method for matching the chromaticity histogram in the step 3 of the invention comprises the following steps:
step 3-1, converting the first moment image and the second moment image into a first chrominance image S1(x, y) and a second chrominance image S2(x, y):
S1(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
S2(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
step 3-2, taking the chroma of the image at the second moment as reference, matching the chroma of the image at the first moment to it:
[histogram-specification formula, rendered as an image in the original]
the multi-scale twin network model in step 4 of the invention comprises a multi-scale feature extraction network and a measurement network.
The multi-scale feature extraction network comprises: four convolutional layers and one pooling layer;
the convolution kernel size of the first convolution layer is 1 x 1, and the number of convolution kernels is 24; the second convolution layer adopts dilated (hole) convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 6; the third convolution layer adopts dilated convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 12; the fourth convolution layer adopts dilated convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 18; the fifth layer is a pooling layer with a pooling window of 2 x 2 and a stride of 2 x 2;
the measurement network consists of three fully-connected layers; the first two fully-connected layers have 128 neurons each, and the activation function is the ReLU function (see R. Hahnloser, R. Sarpeshkar, M. A. Mahowald, R. J. Douglas, H. S. Seung, "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit", Nature 405, 2000); the third fully-connected layer has 2 neurons, and the activation function is the Softmax function (see "FPGA-based implementation of the Softmax layer of a convolutional neural network", Modern Computer (Professional Edition), 2017(26): 21-24).
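The role of the three dilation rates can be made concrete with the standard effective-kernel-size formula for dilated convolution (a textbook identity, not stated in the patent):

```python
def effective_kernel(k, d):
    """Effective spatial extent of a k x k convolution kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# The three dilated 3 x 3 layers of the extraction network:
sizes = {d: effective_kernel(3, d) for d in (6, 12, 18)}
```

The four convolution branches thus cover 1 x 1, 13 x 13, 25 x 25 and 37 x 37 neighborhoods at the same resolution, which is what makes the extracted features multi-scale.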
The method for training the multi-scale twin network model in the step 5 comprises the following steps:
in the training process, a cross-entropy loss function L_log(y, p) is adopted:
L_log(y, p) = -(y·log(p) + (1 - y)·log(1 - p))
wherein y is the label (1 for the changed class, 0 for the unchanged class) and p is the prediction probability;
in the training process, a gradient descent method is adopted to optimize the network parameters; the gradient-descent update is defined as:
[gradient-descent update formula, rendered as an image in the original]
where J(θ) is the given loss function, m is the number of samples input per training step, h_θ(x_i) is the network output for training sample x_i, y_i is the label value of the sample, and i is the sample index.
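As a runnable illustration of the loss and the gradient-descent update (a toy logistic model with our own dimensions, learning rate and synthetic data; the patent's actual model is the multi-scale twin CNN described above):

```python
import numpy as np

def cross_entropy(y, p):
    """L_log(y, p) = -(y log p + (1 - y) log(1 - p))."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))               # m = 200 training samples x_i
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # labels: changed = 1, unchanged = 0
theta = np.zeros(4)

for _ in range(500):                        # gradient descent on the mean loss J(theta)
    p = 1.0 / (1.0 + np.exp(-X @ theta))    # predictions h_theta(x_i)
    theta -= 0.5 * X.T @ (p - y) / len(y)   # standard cross-entropy gradient step

final_loss = cross_entropy(y, 1.0 / (1.0 + np.exp(-X @ theta))).mean()
```

The mean cross-entropy drops from log 2 at initialization toward zero as the parameters converge, mirroring how the twin network's metric head is trained.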
In step 6 of the present invention, the multi-scale feature extraction is performed on the first time image to form a first feature map, and the method includes:
performing a 1 x 1 convolution operation on the image at the first moment to form the first part of the first feature map;
applying a 3 x 3 dilated (hole) convolution with dilation rate 6 to the image at the first moment to form the second part of the first feature map;
applying a 3 x 3 dilated convolution with dilation rate 12 to the image at the first moment to form the third part of the first feature map;
applying a 3 x 3 dilated convolution with dilation rate 18 to the image at the first moment to form the fourth part of the first feature map;
pooling the image at the first moment to form the fifth part of the first feature map;
and concatenating the five parts of the first feature map and performing a 1 x 1 convolution operation to form the first feature map, namely the first feature map vector.
In step 6 of the present invention, the multi-scale feature extraction is performed on the image at the second moment to form a second feature map, and the method includes:
performing a 1 x 1 convolution operation on the image at the second moment to form the first part of the second feature map;
applying a 3 x 3 dilated (hole) convolution with dilation rate 6 to the image at the second moment to form the second part of the second feature map;
applying a 3 x 3 dilated convolution with dilation rate 12 to the image at the second moment to form the third part of the second feature map;
applying a 3 x 3 dilated convolution with dilation rate 18 to the image at the second moment to form the fourth part of the second feature map;
pooling the image at the second moment to form the fifth part of the second feature map;
and concatenating the five parts of the second feature map and performing a 1 x 1 convolution operation to form the second feature map, namely the second feature map vector.
The method for obtaining the difference map in the step 7 of the invention comprises the following steps:
measuring the distance between the first feature map and the second feature map; using the trained multi-scale twin network model and a metric-learning method, a distance measure reflecting the similarity-difference information between the images at the two moments is obtained; the changed area is separated from the unchanged area, and the distance between pixel points of the unchanged area is minimized, yielding a difference map of the building changes.
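The difference-map step can be sketched as a per-pixel distance followed by a threshold (the shapes and the fixed threshold are our own illustrative choices; the patent learns the metric with the measurement network rather than fixing a threshold):

```python
import numpy as np

def difference_map(feat1, feat2, threshold=1.0):
    """Euclidean distance between the two feature vectors at each pixel;
    pixels whose features moved farther than `threshold` are marked changed."""
    dist = np.linalg.norm(feat1 - feat2, axis=-1)   # H x W distance map
    return (dist > threshold).astype(np.uint8)      # 1 = changed, 0 = unchanged

# Two 8 x 8 x 16 feature maps that differ only in a 2 x 2 block:
f1 = np.zeros((8, 8, 16))
f2 = f1.copy()
f2[2:4, 2:4] += 0.5                                 # per-channel shift of 0.5
change = difference_map(f1, f2)
```

Only the shifted block exceeds the distance threshold, so the output is a binary change map like the one in FIG. 5.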
Beneficial effects:
the building change detection method based on the multi-scale twin network provided by the project adopts a method of combining two times of histogram matching and the multi-scale twin network, effectively solves the problems caused by large difference of chromaticity and brightness of two time phase data, shadow of the building and large size difference, and has higher detection precision compared with the existing pixel-level and target-level algorithms.
Drawings
The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of a multi-scale twin network structure according to the present invention.
FIG. 3 is an image of a certain area in November in the embodiment.
FIG. 4 is an image of the same area in April in the embodiment.
FIG. 5 shows the change detection result obtained by applying the method of the invention to FIGS. 3 and 4.
Detailed Description
The invention provides a building change detection method based on a multi-scale twin network, aimed at the false detections caused by large chroma and brightness differences between the two temporal images, large scale differences of changed buildings, building shadows, and other ground-surface coverings. In the preprocessing stage, histogram matching is used to correct chroma and brightness. The corrected remote sensing images at the two moments are then fed into the multi-scale twin network for training to obtain a difference map.
As shown in fig. 1, a method for detecting a change in a building based on a multi-scale twin network includes the following steps:
step 1: the time 1 image and the time 2 image are subjected to color space conversion from an RGB space to a YCbCr space.
Y=0.299R+0.587G+0.114B
Cb=0.564(B-Y)
Cr=0.713(R-Y)
Step 2: performing luminance histogram matching, and prescribing a luminance channel at time 2 by using a luminance channel at time 1, wherein the step 2 includes:
step 2-1: converting the time 1 image and the time 2 image into a luminance image l 1 (x, y) and l 2 (x,y)。
l1(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
l2(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
wherein R(x, y), B(x, y), G(x, y) are the RGB values of the corresponding pixel (x, y).
Step 2-2: matching the brightness at time 2 to time 1, taking the brightness at time 1 as reference:
[histogram-specification formula, rendered as an image in the original]
And 3, step 3: performing chroma histogram matching, specifying the chroma channel at time 1 by using the chroma channel at time 2. Step 3 comprises:
Step 3-1: converting the image at time 1 and the image at time 2 into chrominance images S1(x, y) and S2(x, y):
S1(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
S2(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
Step 3-2: matching the chroma at time 1 to time 2, taking the chroma at time 2 as reference:
[histogram-specification formula, rendered as an image in the original]
And 4, step 4: performing multi-scale feature extraction on the image at time 1 to form a feature map. Step 4 comprises:
Step 4-1: a 1 x 1 convolution is applied to the time 1 image to form feature map 1-1.
Step 4-2: a 3 x 3 dilated (hole) convolution with dilation rate 6 is applied to the time 1 image to form feature map 1-2.
Step 4-3: a 3 x 3 dilated convolution with dilation rate 12 is applied to the time 1 image to form feature map 1-3.
Step 4-4: a 3 x 3 dilated convolution with dilation rate 18 is applied to the time 1 image to form feature map 1-4.
Step 4-5: pooling is performed on the time 1 image to form feature map 1-5.
Step 4-6: feature maps 1-1, 1-2, 1-3, 1-4 and 1-5 are concatenated, and a 1 x 1 convolution is applied to form the feature map vector.
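Steps 4-1 to 4-6 amount to channel-wise concatenation of the five branch outputs followed by a 1 x 1 convolution, which acts as a per-pixel linear mixing of channels; a NumPy sketch with toy channel counts (the output width of 32 is our own assumption) makes the shapes concrete:

```python
import numpy as np

rng = np.random.default_rng(2)
H, W = 8, 8
# Five feature-map parts from the five branches (channel counts as in step 4 of the
# description: 24 from the 1 x 1 layer, 64 from each dilated layer, plus a pooled part):
parts = [rng.normal(size=(H, W, c)) for c in (24, 64, 64, 64, 3)]

stacked = np.concatenate(parts, axis=-1)         # H x W x 219 concatenated map
# A 1 x 1 convolution is just a matrix applied independently at every pixel:
W1x1 = rng.normal(size=(stacked.shape[-1], 32))  # 219 input channels -> 32 outputs
fused = stacked @ W1x1                           # H x W x 32 feature-map vector
```

The final matrix multiply fuses all scales into one feature vector per pixel, which is the "feature map vector" the twin branches pass to the measurement network.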
And 5: performing multi-scale feature extraction on the image at time 2 to form a feature map. Step 5 comprises:
Step 5-1: a 1 x 1 convolution is applied to the time 2 image to form feature map 2-1.
Step 5-2: a 3 x 3 dilated (hole) convolution with dilation rate 6 is applied to the time 2 image to form feature map 2-2.
Step 5-3: a 3 x 3 dilated convolution with dilation rate 12 is applied to the time 2 image to form feature map 2-3.
Step 5-4: a 3 x 3 dilated convolution with dilation rate 18 is applied to the time 2 image to form feature map 2-4.
Step 5-5: pooling is performed on the time 2 image to form feature map 2-5.
Step 5-6: feature maps 2-1, 2-2, 2-3, 2-4 and 2-5 are concatenated, and a 1 x 1 convolution is applied to form the feature map vector.
And 6: distance measurement is carried out between the feature map of the time 1 image and the feature map of the time 2 image; metric learning is used to mine the similarity rules contained in the data and to learn a distance measure that reflects the similarity-difference information among the multi-temporal images, so that the changed area is effectively separated from the unchanged area and the distance between pixels in the unchanged area is minimized as far as possible, forming a change difference map. The measurement network consists of three fully-connected layers; the number of neurons of the first two fully-connected layers is 128, and the activation function is the ReLU function; the number of neurons of the third fully-connected layer is 2, and the activation function is the Softmax function.
And 7: after the model is built, it is first trained; a cross-entropy loss function is adopted in the training process, defined as:
L_log(y, p) = -(y·log(p) + (1 - y)·log(1 - p))
where y is the label (1 for the changed class, 0 for the unchanged class) and p is the prediction probability.
In the training process, a gradient descent method is adopted to optimize the network parameters; for the given loss function J(θ), the update is:
[gradient-descent update formula, rendered as an image in the original]
where m is the number of samples input per training step and h_θ(x_i) is the network output for training sample x_i.
And 8: after the model training is finished, the difference map is obtained by forward computation with the trained model parameters.
Example 1:
the invention provides a building change detection method based on a multi-scale twin network, which takes building change detection classification as an example, firstly, a data set for change detection is constructed, for example, a semantic change detection data set of SECOND is selected, the size of each image is 512 x 512, wherein 100 pairs of images are selected by a training set, and 20 pairs of images are selected by a testing set for testing. The specific change detection steps are as follows:
step 1: each image of the training set is color space converted from RGB space to YCbCr space.
Y=0.299R+0.587G+0.114B
Cb=0.564(B-Y)
Cr=0.713(R-Y)
Step 2: one of each group of image pairs of the training set is selected as an image 1, the other one is selected as an image 2, and the brightness channel of the image 2 is specified by the brightness channel of the image 1.
l1(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
l2(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
[histogram-specification formula, rendered as an image in the original]
And step 3: the chroma channel of image 1 is specified by the chroma channel of image 2:
S1(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
S2(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
[histogram-specification formula, rendered as an image in the original]
and 4, step 4: the method comprises the steps of constructing a multi-scale feature extraction network, mainly comprising four convolution layers and a pooling layer, wherein the convolution kernel size of the first convolution layer is l x 1, the number of convolution kernels is 24, the second convolution layer adopts hole convolution, the convolution kernel size is 3 x 3, the number of convolution kernels is 64, the number of holes is 6, the third convolution layer adopts hole convolution, the convolution kernel size is 3 x 3, the number of convolution kernels is 64, the number of holes is 12, the fourth convolution layer adopts hole convolution, the convolution kernel size is 3 x 3, the number of convolution kernels is 64, the number of holes is 18, the fifth pooling layer has the pooling window size of 2 x 2, and the step length is 2 x 2.
And 5: after image 1 and image 2 of each image pair have passed through step 4, the output vectors are spliced to form a feature map.
Step 6: constructing a measurement network consisting of three fully-connected layers; the number of neurons of the first two fully-connected layers is 128, the number of neurons of the third fully-connected layer is 2, and the activation function of the third layer is the Softmax function.
And 7: after the model is built, it is trained with a cross-entropy loss function and a gradient descent method.
And 8: after the model training is finished, the difference map is obtained by forward computation with the trained model parameters.
Example 2:
As shown in fig. 3, a real image of the area to be detected in a certain region, captured from Google Maps in November 2021, is used as the first moment image in the method of the present invention;
As shown in fig. 4, an image of the same area as fig. 3, captured from Google Maps in April 2022, is used as the second moment image in the method of the present invention;
After detection by the method of the present invention, the result shown in fig. 5 is obtained, in which black indicates that no change was detected and white indicates that a change was detected; in this embodiment the changes mainly represent buildings and farmland.
In a specific implementation, the present application provides a computer storage medium and a corresponding data processing unit, where the computer storage medium is capable of storing a computer program, and the computer program, when executed by the data processing unit, may execute the inventive content of the multi-scale twin network-based building change detection method and some or all of the steps in each embodiment provided by the present invention. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like.
It is clear to those skilled in the art that the technical solutions in the embodiments of the present invention can be implemented by means of a computer program on a corresponding general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention may essentially, or in the parts contributing to the prior art, be embodied in the form of a computer program or software product. The computer program or software product may be stored in a storage medium and include instructions for enabling a device comprising a data processing unit (which may be a personal computer, a server, a single-chip microcomputer, an MCU, or a network device) to execute the methods of the embodiments, or parts thereof, of the present invention.
The present invention provides a method and concept for building change detection based on a multi-scale twin network; there are many methods and ways to implement this technical solution, and the above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make a number of improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention. All components not specified in the present embodiment can be realized by the prior art.

Claims (10)

1. A building change detection method based on a multi-scale twin network is characterized by comprising the following steps:
step 1, performing color space conversion on two images to be detected, namely a first moment image and a second moment image, which contain building changes;
step 2, brightness histogram matching: specifying the brightness channel of the image at the second moment using the brightness channel of the image at the first moment;
step 3, chroma histogram matching: specifying the chroma channel of the image at the first moment using the chroma channel of the image at the second moment;
step 4, constructing a multi-scale twin network model;
step 5, training the multi-scale twin network model;
step 6, performing multi-scale feature extraction on the image at the first time by using the trained multi-scale twin network model to form a first feature map; performing multi-scale feature extraction on the image at the second moment to form a second feature map;
step 7, obtaining a difference map from the first feature map and the second feature map by using the trained multi-scale twin network model, and completing the building change detection based on the multi-scale twin network.
2. The method for detecting building change based on multi-scale twin network as claimed in claim 1, wherein the method of color space conversion in step 1 is:
converting from the RGB space to the YCbCr space, comprising:
Y=0.299R+0.587G+0.114B
Cb=0.564(B-Y)
Cr=0.713(R-Y)
where Y represents the luminance value of the image, Cb the blue chroma offset, Cr the red chroma offset, and R, G and B the red, green and blue channel values of the image.
3. The method for detecting building change based on multi-scale twin network as claimed in claim 2, wherein the method for matching brightness histogram in step 2 comprises:
step 2-1, converting the first moment image and the second moment image into a first luminance image l1(x, y) and a second luminance image l2(x, y):
l1(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
l2(x, y) = 0.31×R(x, y) + 0.29×B(x, y) + 0.4×G(x, y)
wherein R(x, y), B(x, y) and G(x, y) are the RGB channel values of pixel (x, y), x being the x-axis coordinate and y the y-axis coordinate of the pixel;
step 2-2, taking the brightness of the first moment image as reference, matching the brightness of the second moment image to it:
[histogram-specification formula, rendered as an image in the original]
4. the method for detecting building change based on multi-scale twin network as claimed in claim 3, wherein the method for matching chromaticity histogram in step 3 comprises:
step 3-1, converting the first moment image and the second moment image into a first chrominance image S1(x, y) and a second chrominance image S2(x, y):
S1(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
S2(x, y) = 0.21×R(x, y) + 0.72×B(x, y) + 0.07×G(x, y)
step 3-2, taking the chroma of the image at the second moment as reference, matching the chroma of the image at the first moment to it:
[histogram-specification formula, rendered as an image in the original]
5. the method as claimed in claim 4, wherein the multi-scale twin network model in step 4 comprises a multi-scale feature extraction network and a metric network.
6. The method as claimed in claim 5, wherein the multi-scale feature extraction network in step 4 comprises: four convolutional layers and one pooling layer;
the convolution kernel size of the first convolution layer is 1 x 1, and the number of convolution kernels is 24; the second convolution layer adopts dilated (hole) convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 6; the third convolution layer adopts dilated convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 12; the fourth convolution layer adopts dilated convolution with a kernel size of 3 x 3, 64 kernels and a dilation rate of 18; the fifth layer is a pooling layer with a pooling window of 2 x 2 and a stride of 2 x 2;
in step 4, the metric network consists of three fully connected layers; the first two fully connected layers each have 128 neurons and use the ReLU activation function; the third fully connected layer has 2 neurons and uses the softmax activation function.
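As a concreteness check on claim 6's metric network (two 128-neuron ReLU layers followed by a 2-neuron softmax layer), the forward pass can be sketched in NumPy. The input dimension of 512 and the random placeholder weights are illustrative assumptions, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)

def metric_network(features, in_dim=512):
    """Three fully connected layers: 128-ReLU, 128-ReLU, 2-softmax.
    Weights are random placeholders; a real model would train them."""
    w1 = rng.standard_normal((in_dim, 128)) * 0.01
    w2 = rng.standard_normal((128, 128)) * 0.01
    w3 = rng.standard_normal((128, 2)) * 0.01
    h = relu(features @ w1)
    h = relu(h @ w2)
    return softmax(h @ w3)  # per input: [P(unchanged), P(changed)]

probs = metric_network(rng.standard_normal((4, 512)))
```

The 2-way softmax output matches the binary change / no-change decision the claims describe.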
7. The building change detection method based on a multi-scale twin network as claimed in claim 6, wherein the method for training the multi-scale twin network model in step 5 comprises:
in the training process, a cross-entropy loss function L_log(y, p) is adopted:
L_log(y, p) = −(y log(p) + (1 − y) log(1 − p))
where y is the label and p is the predicted probability;
in the training process, a gradient descent method is adopted to optimize network parameters, and the gradient descent is defined as follows:
θ_j := θ_j − α · (1/m) · Σ_{i=1}^{m} (h_θ(x_i) − y_i) · x_i   (reconstruction of formula image FDA0003767067400000031; α is the learning rate)
where J(θ) is the given loss function, m is the number of samples input per training step, h_θ(x_i) is the model prediction for training sample x_i, x_i is the training sample value, y_i is the label of the sample, and i is the sample index.
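The training objective of claim 7 pairs the binary cross-entropy loss with a gradient-descent parameter update. A small NumPy sketch for a logistic model; the learning rate and data shapes are illustrative assumptions:

```python
import numpy as np

def cross_entropy(y, p, eps=1e-12):
    """L_log(y, p) = -(y*log(p) + (1-y)*log(1-p)), clipped for stability."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient_step(theta, x, y, lr=0.1):
    """One batch gradient-descent update for a logistic model
    h_theta(x) = sigmoid(x @ theta), averaging the gradient over m samples."""
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(x @ theta)))  # predictions h_theta(x_i)
    grad = (x.T @ (h - y)) / m              # dJ/dtheta for cross-entropy
    return theta - lr * grad
```

With y = 1 and p = 0.5 the loss is log 2 ≈ 0.693, and repeated `gradient_step` calls drive the predictions toward the labels.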
8. The building change detection method based on a multi-scale twin network as claimed in claim 7, wherein performing multi-scale feature extraction on the image at the first moment in step 6 to form a first feature map comprises:
performing a 1 × 1 convolution on the image at the first moment to form the first part of the first feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 6 on the image at the first moment to form the second part of the first feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 12 on the image at the first moment to form the third part of the first feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 18 on the image at the first moment to form the fourth part of the first feature map;
pooling the image at the first moment to form the fifth part of the first feature map;
and concatenating the five parts of the first feature map and performing a 1 × 1 convolution to form the first feature map, i.e., the first feature-map vector.
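The point of the three dilated 3 × 3 branches above is that dilation enlarges the receptive field at no extra parameter cost: a k × k kernel with dilation rate d spans k + (k − 1)(d − 1) pixels per side (the standard atrous-convolution relation, stated here as background rather than taken from the patent). A quick check for the rates used in the claims:

```python
def effective_kernel_size(k, d):
    """Side length of the region a k x k kernel with dilation rate d spans."""
    return k + (k - 1) * (d - 1)

# The three dilated branches of the feature-extraction network:
sizes = [effective_kernel_size(3, d) for d in (6, 12, 18)]
print(sizes)  # a 3x3 kernel reaches 13x13, 25x25 and 37x37 windows
```

Concatenating branches at several rates is what lets the network capture buildings at different scales with the same 3 × 3 kernels.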
9. The building change detection method based on a multi-scale twin network as claimed in claim 8, wherein performing multi-scale feature extraction on the image at the second moment in step 6 to form a second feature map comprises:
performing a 1 × 1 convolution on the image at the second moment to form the first part of the second feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 6 on the image at the second moment to form the second part of the second feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 12 on the image at the second moment to form the third part of the second feature map;
performing a 3 × 3 dilated convolution with a dilation rate of 18 on the image at the second moment to form the fourth part of the second feature map;
pooling the image at the second moment to form the fifth part of the second feature map;
and concatenating the five parts of the second feature map and performing a 1 × 1 convolution to form the second feature map, i.e., the second feature-map vector.
10. The building change detection method based on a multi-scale twin network as claimed in claim 9, wherein the method for obtaining the difference map in step 7 comprises:
measuring the distance between the first feature map and the second feature map; using the trained multi-scale twin network model and a metric-learning method, obtaining a distance measure that reflects the similarity difference between the images at the two moments, separating the changed region from the unchanged region, and minimizing the distance between pixel points of the unchanged region, so as to obtain the difference map of building changes.
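The distance measurement of step 7 can be sketched as a per-pixel Euclidean distance between the two feature maps followed by thresholding. The fixed threshold below is an illustrative assumption; in the patent the separation of changed and unchanged regions is learned via metric learning rather than hand-set:

```python
import numpy as np

def difference_map(feat1, feat2, threshold=1.0):
    """feat1, feat2: (H, W, C) feature maps from the two twin branches.
    Returns the per-pixel distance map and a binary change mask."""
    dist = np.linalg.norm(feat1 - feat2, axis=-1)       # Euclidean distance per pixel
    change_mask = (dist > threshold).astype(np.uint8)   # 1 = changed pixel
    return dist, change_mask
```

Pixels whose features the trained network maps close together (unchanged regions) get near-zero distance, so only genuinely changed pixels survive the threshold.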
CN202210889737.8A 2022-07-27 2022-07-27 Building change detection method based on multi-scale twin network Pending CN115511781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210889737.8A CN115511781A (en) 2022-07-27 2022-07-27 Building change detection method based on multi-scale twin network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210889737.8A CN115511781A (en) 2022-07-27 2022-07-27 Building change detection method based on multi-scale twin network

Publications (1)

Publication Number Publication Date
CN115511781A true CN115511781A (en) 2022-12-23

Family

ID=84502111

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210889737.8A Pending CN115511781A (en) 2022-07-27 2022-07-27 Building change detection method based on multi-scale twin network

Country Status (1)

Country Link
CN (1) CN115511781A (en)

Similar Documents

Publication Publication Date Title
CN111815564B (en) Method and device for detecting silk ingots and silk ingot sorting system
CN111915704A (en) Apple hierarchical identification method based on deep learning
CN112132196B (en) Cigarette case defect identification method combining deep learning and image processing
JP2013101615A (en) Method and system for describing image area on the basis of color histogram
WO2014183246A1 (en) Medical image processing method and system
CN114863236A (en) Image target detection method based on double attention mechanism
CN111882555B (en) Deep learning-based netting detection method, device, equipment and storage medium
CN112668426A (en) Fire disaster image color cast quantization method based on three color modes
CN115937552A (en) Image matching method based on fusion of manual features and depth features
CN112164055A (en) Photovoltaic cell color difference classification method based on color segmentation
CN115880663A (en) Low-illumination environment traffic sign detection and identification method
CN109376719B (en) Camera light response non-uniformity fingerprint extraction and comparison method based on combined feature representation
CN113284066B (en) Automatic cloud detection method and device for remote sensing image
CN106960188B (en) Weather image classification method and device
CN114049503A (en) Saliency region detection method based on non-end-to-end deep learning network
CN112330639A (en) Significance detection method for color-thermal infrared image
CN115511781A (en) Building change detection method based on multi-scale twin network
US20230386023A1 (en) Method for detecting medical images, electronic device, and storage medium
CN110569917A (en) sleeve grouting compactness discrimination method based on deep learning image recognition
CN116189160A (en) Infrared dim target detection method based on local contrast mechanism
CN115861259A (en) Lead frame surface defect detection method and device based on template matching
CN112270220B (en) Sewing gesture recognition method based on deep learning
CN114581475A (en) Laser stripe segmentation method based on multi-scale saliency features
CN114049571A (en) Method and device for extracting water body area of hyperspectral image and electronic equipment
CN111325209A (en) License plate recognition method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination