Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an embodiment of the present invention, an embodiment of a method for detecting changes in high-resolution remote sensing images is provided. It is noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from the one shown here.
Fig. 1 is a flowchart of a method for detecting changes in high-resolution remote sensing images according to an embodiment of the invention. As shown in Fig. 1, the method includes the following steps:
step S102, acquiring neighborhood characteristics of each pixel in two time phase images;
step S104, inputting the neighborhood characteristics into a branch convolution neural network model to obtain the categories of the image, wherein the categories are a changed category and an unchanged category; the branch convolutional neural network model is obtained by carrying out unsupervised pre-training on images and comprises an input layer, a branch network, a full connection layer and a classification layer, wherein the branch network is used for taking neighborhood features as labels and carrying out convolutional dimension reduction on the neighborhood features, and the classification layer is used for classifying the images into different categories.
In the deep learning change detection stage (training the branch convolutional neural network model), the neighborhood features of each pixel extracted from the two-phase high-resolution remote sensing images are input into the branch convolutional neural network. Unsupervised pre-training is first carried out on the branch networks, which completes the initialization of the network parameters of the two branch networks. Supervised training is then performed with the neighborhood features as input, together with manually marked change labels. In the prior art, the neighborhood features of the target pixel are extracted, flattened into feature vectors, and the feature vectors of the two time phases are then spliced together. In this embodiment, by contrast, the branch network structure allows the neighborhood features of the target pixel from the two time-phase images to be used directly and simultaneously as the input of the network, so that a large number of conversion operations are avoided. This addresses the technical problem of low image change detection precision in the prior art and improves detection efficiency.
The branch convolutional neural network in the above steps enables the invention to detect changes in high-resolution remote sensing images more efficiently and accurately. Its structure also differs from the prior art: this embodiment uses a classification layer in place of the output layer used in the prior art. If the output layer and loss function of a prior-art twin (Siamese) network were used directly, the result would be a grayscale image in which the gray value of each point represents the degree of similarity between the corresponding pixels of the two time-phase images, and it would be difficult to define the gray value above which a pixel counts as changed and below which it counts as unchanged. Because this embodiment uses a classifier, the features of the corresponding pixels of the two phases are classified into two categories, changed and unchanged, and the resulting image is a clear black-and-white binary image, which has practical value.
In an alternative embodiment, inputting the neighborhood features into the branch convolution neural network model to obtain the categories of the images comprises: inputting the neighborhood characteristics into a branch network to obtain convolution characteristics; reducing the dimension of the convolution characteristic through a full connection layer to obtain a dimension reduction characteristic; and inputting the dimension reduction feature into a classification layer to obtain the category of the image.
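The final classification step can be illustrated with a minimal sketch. It assumes that the classification layer yields two raw scores per pixel pair and that index 0 stands for "unchanged" and index 1 for "changed"; the function name and this index convention are hypothetical and are not taken from the embodiment.

```python
import math


def classify_changed(logits):
    """Return (label, probabilities) for a pair of class scores.

    Assumption: index 0 means "unchanged" and index 1 means "changed".
    """
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]        # softmax probabilities
    return probs.index(max(probs)), probs    # hard label plus probabilities


label, probs = classify_changed([0.0, 2.0])
print(label, probs)
```

Because the label is the index of the larger probability, every pixel receives a hard 0 or 1, which is what makes the result a binary image rather than a similarity map that still needs a hand-picked gray threshold.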
The branch network in the above steps differs from the branch network of a twin network in the prior art: the branch network of this embodiment adopts a structure of convolution followed by deconvolution. The advantage is that the input image itself can be used as the label in the training stage. The input image is first convolved to reduce its dimension, which loses part of the information; the dimension is then raised again through deconvolution, producing an output from the information that remains. Through continued training, the output becomes increasingly similar to the input image. This training initializes the parameters of the branch network, so that the whole network can subsequently be trained more quickly with the manual labels.
In an alternative embodiment, obtaining the neighborhood feature of each pixel in the two-phase images comprises: preprocessing the image through a histogram matching algorithm to obtain a corrected image, wherein the color tone of the corrected image is consistent with that of the other image.
The histogram matching algorithm is applied as data preprocessing on the basis that the two-phase images have already been registered and have undergone absolute radiometric correction, which reduces the interference of imaging conditions with change detection between the two-phase images.
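Histogram matching can be sketched as follows for a single channel. This is an illustrative version of the standard cumulative-histogram technique, not necessarily the exact procedure of the embodiment; the function name and the assumption of integer gray levels in the range 0 to 255 are introduced here for illustration.

```python
def histogram_match(source, reference, levels=256):
    """Remap gray levels of `source` so its histogram resembles `reference`.

    Both images are 2-D lists of integers in [0, levels) (assumed format).
    """
    def cdf(image):
        hist = [0] * levels
        for row in image:
            for value in row:
                hist[value] += 1
        total = sum(hist)
        running, out = 0, []
        for count in hist:
            running += count
            out.append(running / total)      # cumulative frequency per level
        return out

    src_cdf, ref_cdf = cdf(source), cdf(reference)

    # For each source level, take the smallest reference level whose
    # cumulative frequency reaches that of the source level.
    mapping, j = [], 0
    for level in range(levels):
        while j < levels - 1 and ref_cdf[j] < src_cdf[level]:
            j += 1
        mapping.append(j)

    return [[mapping[value] for value in row] for row in source]


dark = [[0, 0], [1, 1]]
bright = [[100, 100], [200, 200]]
print(histogram_match(dark, bright))
```

For an RGB image the same function would be applied to each channel separately, with one time phase serving as the reference for the other.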
After the image is preprocessed to obtain a corrected image, feature extraction is performed. In the feature extraction stage, the neighborhood matrix of the target pixel is used as the extracted feature of that pixel. If the coordinates of the target pixel are (i, j), the neighborhood matrix consists of all pixels from (i-n, j-n) to (i+n, j+n), where n is a natural number. Experiments show that the optimal value of n is 2, that is, the 5 × 5 block of pixels centered on the target pixel is extracted. If the target pixel is located at the edge of the high-resolution remote sensing image, the positions outside the image are filled with zeros. In an alternative embodiment, feature extraction is performed on the corrected image using the 5 × 5 neighborhood matrix of the target pixel as the feature extraction standard.
In order to improve the generalization of the network model, after the branch convolutional neural network model has been trained once, the two inputs of the branch convolutional neural network (the neighborhood features of the two time-phase images) are exchanged and the model is trained again. In an optional implementation, the positions of the neighborhood features of the two time-phase images are exchanged, and the exchanged neighborhood features are input into the branch convolutional neural network to train the model once more, obtaining a new branch convolutional neural network model.
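The neighborhood extraction with zero filling at the image border can be sketched as below. The function name and the single-channel list-of-lists image format are assumptions made for illustration; a multi-channel image would repeat the same operation per channel.

```python
def extract_neighborhood(image, i, j, n=2):
    """Return the (2n+1) x (2n+1) block centred on pixel (i, j).

    `image` is a 2-D list holding one channel; positions that fall
    outside the image are filled with zeros.
    """
    height, width = len(image), len(image[0])
    block = []
    for r in range(i - n, i + n + 1):
        row = []
        for c in range(j - n, j + n + 1):
            inside = 0 <= r < height and 0 <= c < width
            row.append(image[r][c] if inside else 0)
        block.append(row)
    return block


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(extract_neighborhood(image, 0, 0, n=1))   # corner pixel, 3 x 3 block
print(len(extract_neighborhood(image, 1, 1)))   # default n=2 gives 5 rows
```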
Through the steps, the sample imbalance condition that the construction change is more than the removal change in the training sample can be counteracted.
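The exchange of the two inputs can be sketched as a simple transformation of the training samples. The tuple layout (features of phase 1, features of phase 2, label) and the function name are hypothetical; the sketch also assumes the label only says "changed" or "unchanged", so it stays the same when the phases are exchanged.

```python
def swap_phases(samples):
    """Return the training samples with the two time phases exchanged.

    Each sample is (features_t1, features_t2, label); the label
    (1 = changed, 0 = unchanged) is assumed to be direction-independent.
    """
    return [(t2, t1, label) for (t1, t2, label) in samples]


first_pass = [("bare_soil_patch", "building_patch", 1),
              ("road_patch", "road_patch", 0)]
second_pass = swap_phases(first_pass)   # used for the second round of training
print(second_pass)
```

After the swap, a sample that showed construction (empty land becoming a building) reads as removal (a building becoming empty land), which is how the second training round balances the two directions of change.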
In order to remove the edge effect (pseudo-changes caused by building shadows), in an optional implementation, a shadow index is calculated from the principal component analysis result and the brightness, saturation and hue value components of each time phase, wherein a place with a high shadow index represents a shadow area and a place with a low shadow index represents a non-shadow area; the shadow index image is segmented by the maximum inter-class variance method to obtain a shadow image, wherein the shadow image is a binary image; and the shadow image is subtracted from the image to be detected to obtain an image with shadow interference removed.
The above shadow removal process is described in an alternative embodiment as follows:
First, a single-channel principal component analysis result is extracted from each phase image by principal component analysis. Then the two-phase images are converted from the RGB color model to the HIS color model, and the shadow index is calculated,
where LPCA is the single-channel principal component analysis result, and I and S are the I channel component and the S channel component of the HIS color model. Finally, binarization threshold segmentation is performed on the shadow index by the maximum inter-class variance method, and pixels with a large shadow index are shadow pixels. The shadow pixels of the two time phases are extracted, and changes at the shadow pixels are ignored in the preliminary change detection result, which yields a result with shadow interference removed and improves detection precision.
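The thresholding and masking steps can be sketched as follows. The sketch takes the shadow index as a given input and does not reproduce its formula. It assumes the index has been quantized to integers from 0 to 255; the function names are hypothetical. The maximum inter-class variance method is commonly known as Otsu's method.

```python
def otsu_threshold(values, levels=256):
    """Return the level t that maximises the between-class variance.

    `values` is a flat list of integers in [0, levels); values greater
    than t form the upper class (here: shadow).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]                 # pixel count of the lower class
        sum0 += t * hist[t]
        w1 = total - w0               # pixel count of the upper class
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (total_sum - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def remove_shadow_changes(change_map, shadow_index):
    """Clear change pixels (set to 0) wherever the shadow index is above threshold."""
    t = otsu_threshold([s for row in shadow_index for s in row])
    return [[0 if s > t else c for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(change_map, shadow_index)]


changes = [[1, 1], [1, 0]]
shadow_index = [[10, 200], [10, 200]]   # right column is shadow
print(remove_shadow_changes(changes, shadow_index))
```

In use, the masking would be applied once with the shadow index of each time phase, so that a change is ignored if the pixel is in shadow in either image.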
In an optional embodiment, the image without the shadow interference is divided into a plurality of sub-images by a mean shift algorithm, wherein each sub-image is of the same color category; counting the ratio of the number of the changed pixels in the image without shadow interference to the total number of pixels of each color category; and when the ratio exceeds a preset threshold value, all pixels of the current category are changed pixels, and if the ratio does not exceed the preset threshold value, all pixels of the current category are unchanged pixels.
The above denoising process is described in detail in an alternative embodiment as follows:
First, a mean shift algorithm is used to segment the two time-phase images: pixels within a specified color radius and a specified spatial radius are gathered into one class, and all pixels in the class are then refilled with the mean RGB gray value of the pixels of that class, so that each closed patch of uniform color in the segmented image represents one class. After the image is segmented, the ratio of the number of changed pixels in the shadow-interference-removed result of the previous step to the total number of pixels is counted for each class. If the ratio exceeds a certain threshold, all pixels of the class are considered changed pixels; otherwise, all pixels of the class are considered unchanged pixels. The two time-phase images are each processed by this denoising enhancement algorithm, and the processing results of the two time phases are merged to obtain the final denoised and enhanced result image. For example, suppose that after image segmentation a square area of 5 × 5 pixels forms one class, so that the class contains 5 × 5 = 25 pixels, but only 13 pixels in the square are detected as changed (changed pixels are shown in white in the black-and-white binary image). If the predetermined threshold is 50%, then 13 is greater than 25 × 50% = 12.5, so all pixels in the class are considered changed pixels and are all shown in white in the binary image; in this case, an incomplete detection is completed. If the threshold is instead 60%, then 13 is less than 25 × 60% = 15, so all pixels in the class are considered unchanged pixels and are shown in black; the 13 white pixels just detected are treated as noise and set to black. The value of the threshold is adjusted according to the actual situation.
A threshold between 60% and 90% generally gives a good denoising effect. However, different images have different characteristics, so in practical application a small part of the area in which changes are to be detected can be tested first to determine the optimal threshold for that area, and the optimal threshold is then applied to the whole area.
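The per-class ratio vote, including the 13-out-of-25 example, can be sketched as below. The segmentation itself is taken as given: `segment_labels` is a hypothetical label map with one integer class per pixel, such as a mean shift segmentation would produce, and the function name is likewise an assumption.

```python
def denoise_by_segment_vote(change_map, segment_labels, threshold=0.6):
    """Relabel every segment as all-changed or all-unchanged.

    `change_map` holds 1 for changed pixels and 0 otherwise;
    `segment_labels` holds one class label per pixel. A segment becomes
    changed only if its changed-pixel ratio exceeds `threshold`.
    """
    counts = {}                                   # label -> [changed, total]
    for c_row, l_row in zip(change_map, segment_labels):
        for changed, label in zip(c_row, l_row):
            entry = counts.setdefault(label, [0, 0])
            entry[0] += changed
            entry[1] += 1

    decided = {label: 1 if changed / total > threshold else 0
               for label, (changed, total) in counts.items()}
    return [[decided[label] for label in l_row] for l_row in segment_labels]


# One 5 x 5 segment in which 13 of the 25 pixels were detected as changed.
flat = [1] * 13 + [0] * 12
change_map = [flat[k * 5:(k + 1) * 5] for k in range(5)]
labels = [[0] * 5 for _ in range(5)]

filled = denoise_by_segment_vote(change_map, labels, threshold=0.5)
cleared = denoise_by_segment_vote(change_map, labels, threshold=0.6)
print(sum(map(sum, filled)), sum(map(sum, cleared)))
```

With a 50% threshold the ratio 13/25 = 52% fills the whole segment, and with a 60% threshold the same segment is cleared, matching the two cases described above.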
The method removes the noise interference which is difficult to remove in the prior art, and the result is more accurate.
The method is a pixel-oriented change detection method. As shown in Fig. 2, data preprocessing is first carried out on the high-resolution remote sensing images; feature extraction is then carried out for each pixel; a deep learning algorithm then classifies the features of the corresponding pixels of the two time phases to complete change detection; and shadow interference is removed from the obtained result. In the denoising and enhancement stage, the image segmentation idea of object-oriented change detection methods is borrowed to obtain the boundary information of each land class, yielding the final change detection result. With this method, two or more images of the same region at different time phases can be analyzed, and the changed and unchanged parts of the images can be detected. Specifically, this embodiment applies a histogram matching algorithm on the basis that the two-phase images have been registered and have undergone absolute radiometric correction, which reduces the interference of imaging conditions with two-phase change detection. An optimal feature extraction strategy is then determined through a feature extraction experiment, and the extracted neighborhood features and the manual labels are input into the branch convolutional neural network for training. After the trained model is obtained, the two-phase remote sensing images are tested to obtain a preliminary change detection result, and the edge effect is removed by calculating the shadow index using principal component analysis and HIS color model conversion. Finally, the result is processed with the denoising enhancement algorithm provided by this embodiment to obtain the final change detection result.
The following describes the whole image change detection process in a complete embodiment:
In the data preprocessing stage: high-resolution remote sensing images are captured by a high-resolution satellite as it orbits the earth. Because the satellite takes a long time to return to the same region and the cloud cover differs each time it passes over that region, cloud-free images captured at different recent times often have to be stitched and composited into a single image, which this embodiment refers to as the high-resolution remote sensing image of one time phase. Images of different time phases often differ greatly because of differences in imaging season, imaging angle, and image registration. This is very unfavorable for pixel-oriented change detection, and in order to mitigate the effect of such differences, this embodiment preprocesses the images.
As shown in Fig. 3, the Gaofen-2 remote sensing images used in this embodiment have already undergone registration and absolute radiometric correction, so the image pixels of the same region substantially correspond to each other. However, the color temperatures of the images differ because of the different imaging seasons. This embodiment therefore applies a histogram matching algorithm so that the hue of one image is kept as consistent as possible with that of the other image to be matched, which reduces the adverse effects caused by different imaging conditions. The analysis shows that the image after histogram matching is a clear qualitative improvement over the image before correction, which indicates that this method performs well for relative radiometric correction.
In the feature extraction stage, since change detection in this embodiment is performed per pixel, it is converted into a binary classification problem on the features of the two-phase pixels, that is, the corresponding pixel features are classified into two categories, changed and unchanged. The neighborhood matrix of the target pixel is used as the feature. The neighborhood matrix feature refers to an n × n area around the target pixel. The method tests the influence of 3 × 3, 5 × 5, 7 × 7 and sparse 5 × 5 neighborhood matrices as features on the change detection result, and the analysis results are as follows. With the 3 × 3 neighborhood matrix, the reconstruction loss in the pre-training stage is smaller, because the 3 × 3 neighborhood matrix contains little information and is easier to characterize. However, the result for the 3 × 3 neighborhood matrix is also closer to the result image produced by the conventional change vector analysis method, which indicates that 3 × 3 neighborhood information cannot fully take into account the context around the pixel, that is, the 3 × 3 neighborhood matrix is still too small. In addition, the 3 × 3 result contains a small amount of noise. In summary, 3 × 3 is faster in the training and prediction phases than the best-performing 5 × 5 (because the amount of data is small), but its false detection rate is higher. The best result is obtained with the 5 × 5 neighborhood matrix: once the context is fully considered, large-area false detections are avoided, while the memory occupied during preprocessing is small and the training speed is moderate. With the 7 × 7 neighborhood matrix, the reconstruction loss in the pre-training phase is the greatest, for exactly the opposite reason to 3 × 3.
The 7 × 7 neighborhood matrix is also extremely memory-consuming during data preparation and slow to process. It raises the correct detection rate while slightly suppressing the increase in the false detection rate, with improvements particularly in water area changes and overall detail, but given the huge data volume, the cost of this improved detail is high.
In order to take neighborhood information over a wider range into account while avoiding the increased computation time and storage space caused by a larger data size, a sparse 5 × 5 neighborhood matrix is also tried in this embodiment. The sparse 5 × 5 neighborhood matrix can cover as large an area as the 7 × 7 neighborhood matrix while using only the storage space of a 5 × 5 neighborhood matrix. The sparse 5 × 5 neighborhood matrix does not differ from the 5 × 5 matrix in memory consumption or processing speed, and the experimental results show that its final effect lies between those of the 3 × 3 and 5 × 5 matrices. Taking into account the change detection effect and the hardware overhead of the whole process, the 5 × 5 neighborhood matrix is finally selected as the standard for feature extraction. The process of extracting the neighborhood matrix features is shown in Fig. 4.
In the deep learning change detection stage, the neighborhood features of each pixel extracted from the two-phase high-resolution remote sensing images are input into the branch convolutional neural network, which comprises four parts: an input layer, a branch network, a full connection layer and a classification layer. The input layer matches the dimension of the extracted neighborhood features, the classification layer uses a softmax classifier, and the specific network structure is shown in Fig. 5. The branch network is in effect an autoencoder based on a convolutional neural network. When the branch network receives an input image, it is first convolved by convolution layer I, with 64 convolution kernels and the "same" convolution mode; then, without passing through a pooling layer, it enters convolution layer II, with 32 convolution kernels and the "same" convolution mode; to prevent overfitting, a Dropout layer follows; the data is then input into deconvolution layer I, with 32 convolution kernels and the "same" convolution mode; next comes deconvolution layer II, with 64 convolution kernels and the "same" convolution mode; and finally the output layer is reached, where the dimension of the output data is consistent with that of the input data. The specific branch network structure is shown in Fig. 6.
In the training process, an input image is first input into a branch network, and unsupervised pre-training is carried out with the input image itself as the label, so that the weights of the two branch networks are initialized (this training is slow, and no classification layer is produced during it). The output of convolution layer II of the branch network is then flattened and input into the full connection layer shown in Fig. 4, and the classifier is trained with the manual labels to finally obtain a trained network model. By automatically extracting convolution features from the neighborhood matrix image blocks, the network model can classify using these more abstract convolution features. In order to improve the generalization of the network model, the neighborhood features of the two time-phase images are exchanged and the model is trained again. This counteracts the sample imbalance in which construction changes outnumber demolition changes in the training samples. The trained network model classifies each pixel as changed or unchanged, giving a preliminary change detection result.
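The layer sequence of one branch can be traced with a small shape calculation. This is only a bookkeeping sketch under stated assumptions: a 5 × 5 RGB input patch (3 channels), stride 1 throughout (so that "same" mode keeps the 5 × 5 spatial size), and concatenation of the two flattened branch outputs before the full connection layer. None of these three details is stated in the embodiment, and the layer and function names are hypothetical.

```python
def trace_branch_shapes(patch=5, in_channels=3):
    """List (layer name, output shape) for one branch.

    Assumes stride 1 with 'same' padding, so only the channel count
    changes from layer to layer.
    """
    layers = [
        ("input", in_channels),
        ("conv_1", 64),            # convolution layer I: 64 kernels
        ("conv_2", 32),            # convolution layer II: 32 kernels
        ("dropout", 32),           # Dropout leaves the shape unchanged
        ("deconv_1", 32),          # deconvolution layer I: 32 kernels
        ("deconv_2", 64),          # deconvolution layer II: 64 kernels
        ("output", in_channels),   # same dimension as the input
    ]
    return [(name, (patch, patch, channels)) for name, channels in layers]


def flattened_feature_length(patch=5, channels=32, branches=2):
    """Length of the vector passed to the full connection layer, assuming the
    conv_2 outputs of both branches are flattened and concatenated."""
    return patch * patch * channels * branches


for name, shape in trace_branch_shapes():
    print(name, shape)
print(flattened_feature_length())
```

Under these assumptions the dimension reduction in the branch happens in the channel direction (64 to 32), and the full connection layer then reduces the flattened features further before classification.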
The prediction result of the deep learning network is not the final result; it needs to be refined and enhanced algorithmically. In detecting changes in town areas, this embodiment finds that actual changes are few, but because there are many buildings and the sun casts building shadows of different lengths in each time phase, a great many shadow changes arise from these differences in length. This pseudo-change caused by shadows around buildings and the like is referred to as the "edge effect" in this embodiment.
As shown in Fig. 7, the shadow sizes of the three buildings differ between the two time-phase images, which causes the edge effect in change detection. To alleviate this edge effect, this embodiment extracts and removes shadows by using principal component analysis (PCA), converting RGB into the HIS color model, and calculating a shadow index (SI). Principal component analysis is a dimension reduction method that can reduce the RGB three-channel image of this embodiment to a single-channel grayscale image; this embodiment denotes the image after principal component analysis as LPCA. In this way, color information that is not of interest can be ignored during shadow extraction while as much shadow information as possible is retained. Like the RGB color model, the HIS color model is widely adopted. Unlike the RGB color model, which characterizes each pixel by the gray values of the red, green and blue primary color channels, the HIS model characterizes each pixel by three components: luminance (I), saturation (S) and hue (H). This embodiment obtains the luminance, saturation and hue components of each time phase by converting RGB into the HIS color model.
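The color model conversion can be sketched for a single pixel with the textbook formulas of this color model (often written HSI). The embodiment does not state which variant of the conversion it uses, so this is an illustrative assumption; the function name and the choice of RGB components scaled to the range 0 to 1 are also assumptions.

```python
import math


def rgb_to_his(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (H, I, S).

    H is returned in degrees within [0, 360]; I and S lie in [0, 1].
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0                      # pure black
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        hue = 0.0                                 # grey pixel: hue undefined, 0 by convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        theta = math.degrees(math.acos(ratio))
        hue = theta if b <= g else 360.0 - theta
    return hue, intensity, saturation


print(rgb_to_his(1.0, 0.0, 0.0))   # pure red
print(rgb_to_his(0.5, 0.5, 0.5))   # mid grey
```

Shadowed pixels tend to have low I, which is why the I and S components are useful inputs for a shadow index.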
Finally, this embodiment uses the above data to calculate the shadow index SI:
A place with a high shadow index is considered a shadow area, and a place with a low shadow index a non-shadow area. The shadow index image is segmented by the maximum inter-class variance method to obtain a binary image, which is the extracted shadow image. The extracted shadows are subtracted from the predicted change image, giving the final result with the edge effect removed.
After the shadow interference is removed, this embodiment finds that the change detection result is still filled with a great deal of noise, and that the edge information of the change detection result is not very accurate compared with the original image. A simple median filtering denoising method would lose a large amount of boundary information and smooth the edges of the denoised result. To make the final change detection result cleaner and to obtain more accurate boundary information, this embodiment provides a denoising enhancement algorithm. The denoising enhancement algorithm first segments the two-phase images using a mean shift algorithm. The mean shift algorithm groups pixels within a specified color radius and spatial radius into one class and then refills all pixels in the class with the mean RGB gray value of the pixels of that class, so that each closed patch of uniform color in the segmented image represents one class. After the image is segmented, the ratio of the number of changed pixels in the change detection prediction of the previous step to the total number of pixels is counted for each class. If the ratio exceeds a certain threshold, all pixels of the class are considered changed pixels; otherwise, all pixels of the class are considered unchanged pixels. In this way, fine noise pixels are removed and incomplete boundary information in the original prediction is filled in, which further improves the accuracy of the final result. The effect of the denoising enhancement algorithm and the final change detection result are shown in Fig. 8.
The embodiment of the invention also provides a high-resolution remote sensing image change detection system, which can realize the above functions through a classification module. It should be noted that the high-resolution remote sensing image change detection system according to the embodiment of the present invention may be used to execute the high-resolution remote sensing image change detection method according to the embodiment of the present invention, and that the method may likewise be executed by the system. Fig. 9 is a schematic diagram of a high-resolution remote sensing image change detection system according to an embodiment of the invention. As shown in Fig. 9, the high-resolution remote sensing image change detection system includes:
an obtaining module 92, configured to obtain a neighborhood characteristic of each pixel in the two time-phase images;
a classification module 94, configured to input the neighborhood features into the branch convolutional neural network model to obtain categories of the image, where the categories are a changed category and an unchanged category; the branch convolutional neural network model is obtained by carrying out unsupervised pre-training on the image through a branch convolutional neural network, the branch convolutional neural network comprises an input layer, a branch network, a full connection layer and a classification layer, the branch network is used for taking the neighborhood characteristics as labels and carrying out convolution dimensionality reduction on the neighborhood characteristics, and the classification layer is used for classifying the image into different classes.
The embodiment of the high-resolution remote sensing image change detection system corresponds to the high-resolution remote sensing image change detection method, so its beneficial effects are not repeated here.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.