CN110598748B - Heterogeneous image change detection method and device based on convolutional neural network fusion - Google Patents


Info

Publication number
CN110598748B
CN110598748B (application CN201910745676.6A; publication CN110598748A)
Authority
CN
China
Prior art keywords
image
heterogeneous
homogeneous
transformation
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910745676.6A
Other languages
Chinese (zh)
Other versions
CN110598748A (en)
Inventor
Li Gang (李刚)
Jiang Xiao (蒋骁)
Liu Yu (刘瑜)
He You (何友)
Current Assignee
Tsinghua University
Naval Aeronautical University
Original Assignee
Tsinghua University
Naval Aeronautical University
Priority date
Filing date
Publication date
Application filed by Tsinghua University and Naval Aeronautical University
Priority to CN201910745676.6A
Publication of CN110598748A
Application granted
Publication of CN110598748B
Legal status: Active

Classifications

    • G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06N3/045 — Neural network architectures; Combinations of networks
    • G06N3/08 — Neural network learning methods
    • G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T7/11 — Region-based segmentation
    • G06T7/40 — Analysis of texture
    • G06T2207/10032 — Satellite or aerial image; Remote sensing
    • G06T2207/20221 — Image fusion; Image merging


Abstract

The invention discloses a heterogeneous image change detection method and device based on convolutional neural network fusion. The method comprises the following steps: constructing a convolutional neural network structure and extracting the homogeneous transformation features of the heterogeneous images; fusing content features and style features between the heterogeneous images according to a homogeneous-transformation loss function to perform homogeneous space transformation of the heterogeneous images and obtain a homogeneous space transformation result; and performing change detection on the result of the homogeneous space transformation to obtain the final heterogeneous image change detection result. The method can accurately detect the change areas of heterogeneous images under complex terrain conditions and comprehensively and deeply extract the homogeneous transformation features between heterogeneous images.

Description

Heterogeneous image change detection method and device based on convolutional neural network fusion
Technical Field
The invention relates to the technical field of remote sensing image fusion target detection, in particular to a heterogeneous image change detection method and device based on convolutional neural network fusion.
Background
In the related art: (1) Earthquake damage assessment of buildings using VHR optical and SAR imagery. In the heterogeneous remote sensing scenario considered here, the images before and after the change are of different modalities, and no additional homogeneous remote sensing image is available to fuse for change detection. This work creatively proposed constructing a homogeneous feature space, i.e., mapping one heterogeneous image into the modality of the other before change detection. Specifically, terrain parameters are extracted from the pre-change optical image, and a pre-change SAR image is synthesized according to the acquisition parameters of the available post-change SAR image. That is, the SAR image space is taken as the homogeneous feature space, the pre-change optical image is homogeneously transformed and mapped into a SAR image, and change detection is then performed in the SAR image space. By constructing a homogeneous feature space, the method overcomes the previous inability to perform change detection between heterogeneous remote sensing images. Its disadvantages are that the extracted homogeneous-space features contain only height parameters of the scene, so accurate homogeneous image mapping is difficult under complex terrain conditions, and that the homogeneous spatial mapping requires additional acquisition parameters of the SAR image, so the method is not applicable under most experimental conditions.
(2) This article proposed a heterogeneous remote sensing image change detection method based on homogeneous space transformation of pixel features, namely Homogeneous Pixel Transformation (HPT). A mapping between pixel pairs of the heterogeneous images is established by manually selecting unchanged areas in the registered images, from which a homogeneous feature space is constructed for homogeneous space transformation. Finally, the synthesized homogeneous image is used to detect the change region. The disadvantages are that the method uses only pixel-value features and ignores features such as texture and scale, and that the homogeneous feature description does not distinguish content features from style features: all features are transformed together. In the homogeneous space transformation of heterogeneous remote sensing images, the transformed image is generally expected to keep the semantic information of the ground objects in the original image (for example, a building should remain a building after transformation, with its geometric shape and similar parameters consistent with the original), while the style of the transformed image should be as close as possible to that of the target modality. For example, when the pre-change optical image is transformed into a synthesized SAR image, the ground-object semantics of the synthesized image should retain the semantic information of the optical image, while its style should be close to that of a SAR image. If content and style features are not distinguished and everything is transformed, the abstract semantic features of the ground objects that should be preserved are inevitably destroyed, and the homogeneous space transformation fails.
Therefore, it is difficult for such methods to perform homogeneous space transformation accurately and comprehensively in a complex terrain environment.
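The pixel-level mapping idea behind HPT can be illustrated as a nearest-neighbour regression over manually selected unchanged pixel pairs. The following is a minimal sketch, not the published HPT algorithm; the function name, the scalar pixel values, and the choice of k are assumptions made for the example.

```python
import numpy as np

def hpt_map(src_pixels, tgt_pixels, query, k=3):
    """Map a pixel value from the source modality into the target
    modality by averaging the target values of its k nearest
    neighbours among known unchanged pixel pairs (HPT-style sketch)."""
    d = np.abs(src_pixels - query)     # distances in the source space
    idx = np.argsort(d)[:k]            # indices of the k nearest pairs
    return tgt_pixels[idx].mean()      # average of their target values

# Toy unchanged pairs: optical value -> SAR value (illustrative only)
opt = np.array([0.1, 0.2, 0.3, 0.8, 0.9])
sar = np.array([0.15, 0.25, 0.35, 0.7, 0.75])
print(hpt_map(opt, sar, 0.85, k=2))
```

Because only raw pixel values enter the distance computation, two different ground objects with similar intensities map to the same target value, which is exactly the weakness noted above.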
(3) A deep convolutional coupled network for change detection based on heterogeneous optical and radar images. This work first proposed constructing a symmetric convolutional coupled network (SCCN) to generate a homogeneous feature space. Convolutional layers at the two ends of the network take the heterogeneous images as input, and several coupling layers in the middle generate the homogeneous transformation features. The method can detect change regions between heterogeneous images without supervision. Its disadvantage is that the neural network has few layers, so its feature abstraction capability is limited and it is difficult to capture the homogeneous transformation features between heterogeneous images comprehensively and accurately. In addition, the homogeneous feature description does not distinguish content features from style features, which easily causes loss of semantic information inside the image.
(4) This method first detects whether buildings have changed or been damaged from the optical image, and then uses the SAR image to grade the level of change and damage within the detected change area. Its advantage is that building damage is subdivided. Its disadvantages are that only elementary features such as mean, variance, and contrast are extracted for change detection, which can hardly reflect the change relation between heterogeneous images, and that these features confuse the semantic information and the style information of the image. Performing homogeneous space transformation without mapping these features discriminatively loses the semantic information of image features and easily leads to transformation failure.
(5) This algorithm extracts geometric parameters of ground objects from the pre-earthquake optical image and simulates their appearance in the SAR image. By comparison with the real post-earthquake SAR image, the damage state of buildings is confirmed through the change of mutual information. Its disadvantages are that the geometric parameters of ground objects are difficult to estimate accurately under complex terrain conditions, and that all areas from which geometric parameters are extracted must be labeled manually, which is difficult to realize in a complex terrain environment.
(6) Earthquake damage mapping by using remotely sensed data: the Haiti case study. This work extracts parameters sensitive to ground-object change from the optical and SAR images respectively, including the Normalized Difference Index (NDI), the Kullback-Leibler divergence (KLD), and Mutual Information (MI) for the optical images, and the Intensity Correlation Difference (ICD) for the SAR images. Similar to homogeneous transformation features, these parameters are used as change information and classified with a Support Vector Machine (SVM) to extract the ground-object change area. The advantage of the method is that it considers several parameters of terrain change for detecting the change area. The disadvantage is that these parameters are sensitive to a complex terrain environment, so unchanged areas of complex terrain in the image are easily detected as false alarms, causing change detection to fail.
(7) Extracting damage caused by the 2008 Ms 8.0 Wenchuan earthquake from SAR remote sensing data. This work proposed a remote sensing image change detection algorithm combining SAR, optical imagery, and a Geographic Information System (GIS). The method extracts pixel information from the high-resolution SAR images before and after the change, and detects ground-object change by combining the information provided by the high-resolution optical image and the GIS. Its disadvantages are that pixel values alone can hardly detect ground-object change in a complex terrain environment, and that it requires a high-resolution optical image and GIS data simultaneously.
(8) A ground-object change detection algorithm based on fusing SAR image correlation with optical image spectral information. The method uses the SAR images before and after the change to extract per-pixel correlation coefficients, and fuses the spectral bands of the post-earthquake optical image to obtain the final ground-object change detection result. The point-by-point SAR correlation coefficient is sensitive to changes of the terrain environment, and easily causes false alarms or missed detections in a complex terrain environment.
(9) Image change detection algorithms: a systematic survey. This work summarizes a series of image change detection algorithms and indicates that a complex image background, such as a complex terrain environment in a remote sensing image, reduces the accuracy of conventional change detection algorithms. The reason is that most features used by conventional algorithms are sensitive to the image background: complex backgrounds and truly changed image regions are easily confused in the extracted features, causing detection errors.
In summary, the related art suffers from the limitations of heterogeneous image detection algorithms built on a homogeneous space structure in complex terrain environments, and from the difficulty of performing homogeneous space transformation sufficiently and accurately under complex terrain conditions; a solution is urgently needed.
Disclosure of Invention
The present application is based on the recognition and discovery by the inventors of the following problems:
it can be seen from the summary and comparison of the existing heterogeneous image-based change detection methods, that most of the methods extract homogeneous transformation features which are relatively simple and primitive, and the differences between the content features and the stylistic features of the images are not considered in the transformation process. Under the condition that the terrain environment in the remote sensing image is simpler, the more elementary homogeneous transformation characteristics can describe the homogeneous mapping relation between heterogeneous images. However, under complex terrain conditions, the remote sensing image will comprise a plurality of different types of ground features, even the pixels or textures of the different ground features are similar, and the relatively elementary homogeneous transformation characteristic in the current method is difficult to describe the transformation relation between the ground features of the heterogeneous image under the complex terrain. In addition, under the condition of complex terrain, the types of the ground features are very rich, the more primitive features are difficult to distinguish the content features and the style features of the homogeneous space transformation, and the content semantic information of the ground features can be seriously lost by directly carrying out the transformation. In summary, the accuracy of the homogeneous space transformation of the heterogeneous image is affected by the current method under the complex terrain condition, and finally the heterogeneous image change detection fails.
To address these problems, the invention performs homogeneous space transformation between heterogeneous images with homogeneous transformation features based on a CNN (Convolutional Neural Network). The CNN is an important deep learning method that has been widely studied in recent years and can comprehensively and deeply extract features of signals (voice, images, video, and the like). By constructing a multi-level neural network from shallow to deep, a CNN realizes sufficient and comprehensive feature extraction from concrete to abstract and from small scale to large scale. For example, the bottom layers of a CNN extract low-level features such as edges, lines, and corners, while the higher layers extract more complex and abstract features through repeated convolution, rectification (ReLU), and pooling operations. Meanwhile, since the CNN comprises many network layers whose outputs carry different meanings, the embodiment of the invention can classify these outputs and separate the content features from the style features of the image, so that the content semantic information of ground objects is not lost during the homogeneous space transformation. In summary, the output features of a CNN from low level to high level can fully and deeply describe the homogeneous transformation features between heterogeneous images, thereby realizing accurate homogeneous space transformation.
The method first constructs a multi-level convolutional neural network and extracts the homogeneous transformation features of the heterogeneous images from the outputs of its different layers, so that these features can be fully extracted even in a complex terrain environment. Considering the application background, the embodiment of the invention adopts the pre-change optical image and the post-change SAR image as the heterogeneous image pair in the experiments. Furthermore, to preserve the content semantic information of ground objects, the extracted features are divided into two types: content features and texture (style) features. The content features describe the image content information that must be maintained in the homogeneous space transformation; the texture features represent the stylistic information of the image that must be transformed. For example, to transform an optical image into the SAR space, the original content features of the optical image must be maintained while its texture features are transformed into those of a SAR image. In a convolutional neural network, the high-level outputs represent abstract features of the input and can therefore represent the content information of the image, while texture features of different scales are obtained by applying texture operators to the outputs of each level from low to high. In this way, through deep feature extraction over the multiple levels of the convolutional neural network, the homogeneous transformation features in a complex terrain environment can be fully described.
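One common way to realize the per-layer "texture operator" is the Gram matrix of the feature maps, as used in neural style transfer; the patent does not name this operator, so the sketch below is an assumption. The Gram matrix keeps channel-to-channel correlations while discarding spatial layout, which matches the content-free "style" information described above.

```python
import numpy as np

def gram_matrix(features):
    """Style/texture descriptor of a (channels, height, width) feature
    map: correlations between channels, normalized by spatial size.
    Spatial arrangement (content) is discarded in the process."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

# A random stand-in for one layer's feature maps (8 channels, 16x16)
feat = np.random.default_rng(0).normal(size=(8, 16, 16))
g = gram_matrix(feat)
print(g.shape)   # one correlation value per channel pair
```

Computing this descriptor at several layers, from low to high, yields the multi-scale texture features the text refers to.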
Second, a corresponding loss function is constructed to fuse the content features of the original image with the style features of the target image, so that the output image preserves the content features of the original image and the texture/style features of the target image as much as possible; the homogeneously transformed image can thus be synthesized more accurately. Finally, based on existing change detection methods, the final change detection result is output from the original heterogeneous image and the synthesized homogeneous transformation image. Performing homogeneous space transformation of heterogeneous images through the convolutional neural network solves the problem that homogeneous transformation features between heterogeneous images are extracted neither deeply nor sufficiently under complex terrain conditions, achieves accurate homogeneous space transformation of heterogeneous images, and finally achieves accurate detection of the change area.
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, an object of the present invention is to provide a heterogeneous image change detection method based on convolutional neural network fusion, which can accurately detect a change area of a heterogeneous image under a complex terrain condition, and further comprehensively and deeply extract homogeneous transformation features between heterogeneous images.
The invention also aims to provide a heterogeneous image change detection device based on convolutional neural network fusion.
In order to achieve the above object, an embodiment of the invention provides a heterogeneous image change detection method based on convolutional neural network fusion, which includes the following steps: constructing the convolutional neural network structure and extracting the homogeneous transformation features of the heterogeneous images; fusing content features and style features between the heterogeneous images according to the homogeneous-transformation loss function to perform homogeneous space transformation of the heterogeneous images and obtain a homogeneous space transformation result; and performing change detection on the result of the homogeneous space transformation to obtain the final heterogeneous image change detection result.
According to the heterogeneous image change detection method based on convolutional neural network fusion of the embodiment of the invention, a homogeneous feature space is constructed with a CNN, the homogeneous transformation features under a complex terrain environment are described accurately and comprehensively from the different layers of the CNN, the homogeneous transformation mapping result of the heterogeneous images is output, and change detection is finally realized by comparison between the homogeneous images. The CNN thus describes the content and texture of remote sensing images at different scales in a complex terrain environment, so that the change areas of heterogeneous images can still be detected accurately under complex terrain conditions, and the homogeneous transformation features between heterogeneous images are extracted comprehensively and deeply.
In addition, the heterogeneous image change detection method based on convolutional neural network fusion according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the extracting the convolutional neural network structure and the homogeneous transformation feature of the image includes: constructing a convolutional neural network based on VGGnet, wherein the convolutional neural network has no fully connected layer, the weight of the whole network is a fixed preset value, and the output of the network is each selected network layer; and selecting the content characteristics and style form characteristics of the corresponding network layer separation output image according to the characteristics of different layer outputs in the convolutional neural network.
Further, in one embodiment of the present invention, the loss function is:
L(I; I_ori, I_tag) = λ_c ||F_c(I) - F_c(I_ori)||² + λ_s ||F_s(I) - F_s(I_tag)||²
wherein I_ori is the original image to be transformed; I_tag is an image with the target style; ||·|| denotes the 2-norm; λ_c and λ_s are two constants to be set, representing the weights of the content features and the style features in the fusion process; F_c extracts the content semantic features of the image; and F_s extracts the style features of the image.
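Under the assumption of simple array-valued feature extractors (the real F_c and F_s would be CNN layer outputs and texture operators), the loss above can be computed as:

```python
import numpy as np

def fusion_loss(F_c, F_s, I, I_ori, I_tag, lam_c=1.0, lam_s=2.0):
    """L = lam_c * ||F_c(I) - F_c(I_ori)||^2 + lam_s * ||F_s(I) - F_s(I_tag)||^2
    F_c / F_s are stand-in feature extractors; the weights are illustrative."""
    content_term = np.sum((F_c(I) - F_c(I_ori)) ** 2)   # keep original content
    style_term = np.sum((F_s(I) - F_s(I_tag)) ** 2)     # match target style
    return lam_c * content_term + lam_s * style_term

# Identity "extractors" and tiny arrays, just to exercise the formula
ident = lambda x: x
loss = fusion_loss(ident, ident,
                   np.array([1.0, 2.0]),   # candidate output I
                   np.array([0.0, 2.0]),   # original image I_ori
                   np.array([1.0, 0.0]),   # target-style image I_tag
                   lam_c=1.0, lam_s=2.0)
print(loss)
```

In the actual method, I would be optimized (e.g. by gradient descent) to minimize this loss, driving the synthesized image toward the original's content and the target's style.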
Further, in one embodiment of the present invention, the result of the homogeneous space transformation comprises a simulated pre-change SAR image synthesized by the homogeneous space transformation and a simulated post-change optical image synthesized by the homogeneous space transformation.
Further, in an embodiment of the present invention, performing change detection on the result of the homogeneous space transformation to obtain the final heterogeneous image change detection result includes: detecting a first change region of the optical image with Gabor wavelets and a support vector machine; detecting a second change region of the SAR image with a Markov random field; segmenting the first change region and the second change region with the same superpixel segmentation strategy; and fusing the first change region and the second change region superpixel by superpixel with an evidence theory method to generate the final heterogeneous image change detection result.
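The evidence-theory fusion of the two detectors can be illustrated with Dempster's rule of combination over the frame {change, no-change}. This is a sketch under the assumption of simple per-superpixel mass values; the patent does not specify how the masses are assigned by the two detectors.

```python
def dempster_fuse(m1_change, m2_change):
    """Combine two per-superpixel 'change' masses with Dempster's rule
    over the two-element frame {change, no-change}: multiply agreeing
    masses and renormalize by (1 - conflict)."""
    m1_no, m2_no = 1.0 - m1_change, 1.0 - m2_change
    conflict = m1_change * m2_no + m1_no * m2_change   # contradictory mass
    fused_change = m1_change * m2_change / (1.0 - conflict)
    fused_no = m1_no * m2_no / (1.0 - conflict)
    return fused_change, fused_no

# Both detectors lean towards 'change' for this superpixel
c, n = dempster_fuse(0.8, 0.7)
print(round(c, 3), round(n, 3))
```

When the optical-image and SAR-image detectors agree, the fused belief in "change" exceeds either input, which is the reinforcing behaviour the fusion step relies on.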
In order to achieve the above object, another embodiment of the present invention provides a heterogeneous image change detection apparatus based on convolutional neural network fusion, including: an extraction module for constructing the convolutional neural network structure and extracting the homogeneous transformation features of the heterogeneous images; a fusion module for fusing content features and style features between the heterogeneous images according to the homogeneous-transformation loss function to perform homogeneous space transformation of the heterogeneous images and obtain a homogeneous space transformation result; and a detection module for performing change detection on the result of the homogeneous space transformation to obtain the final heterogeneous image change detection result.
According to the heterogeneous image change detection apparatus based on convolutional neural network fusion, a homogeneous feature space is constructed with a CNN, the homogeneous transformation features under a complex terrain environment are described accurately and comprehensively from the different layers of the CNN, the homogeneous transformation mapping result of the heterogeneous images is output, and change detection is finally realized by comparison between the homogeneous images. The CNN thus describes the content and texture of remote sensing images at different scales in a complex terrain environment, so that the change areas of heterogeneous images can still be detected accurately under complex terrain conditions, and the homogeneous transformation features between heterogeneous images are extracted comprehensively and deeply.
In addition, the heterogeneous image change detection device based on the convolutional neural network fusion according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the extracting module is further configured to construct a convolutional neural network based on VGGnet, where the convolutional neural network does not have a fully connected layer, a weight of the entire network is a fixed preset value, and an output of the network is each selected network layer; and selecting the content characteristics and style form characteristics of the corresponding network layer separation output image according to the characteristics of different layer outputs in the convolutional neural network.
Further, in one embodiment of the present invention, the loss function is:
L(I; I_ori, I_tag) = λ_c ||F_c(I) - F_c(I_ori)||² + λ_s ||F_s(I) - F_s(I_tag)||²
wherein I_ori is the original image to be transformed; I_tag is an image with the target style; ||·|| denotes the 2-norm; λ_c and λ_s are two constants to be set, representing the weights of the content features and the style features in the fusion process; F_c extracts the content semantic features of the image; and F_s extracts the style features of the image.
Further, in one embodiment of the present invention, the result of the homogeneous space transformation comprises a simulated pre-change SAR image synthesized by the homogeneous space transformation and a simulated post-change optical image synthesized by the homogeneous space transformation.
Further, in an embodiment of the present invention, the detection module is further configured to detect a first change region of the optical image with Gabor wavelets and a support vector machine, detect a second change region of the SAR image with a Markov random field, segment the first change region and the second change region with the same superpixel segmentation strategy, and fuse the first change region and the second change region superpixel by superpixel with an evidence theory method to generate the final heterogeneous image change detection result.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a heterogeneous image change detection method based on convolutional neural network fusion according to an embodiment of the present invention;
FIG. 2 is a detailed diagram of a convolutional neural network extracting heterogeneous image content features and stylistic features according to an embodiment of the present invention;
FIG. 3 is an overall flow diagram of a heterogeneous image undergoing a homogenous spatial transform according to an embodiment of the invention;
FIG. 4 is a flow chart of a heterogeneous image change detection method based on a homogeneous spatial transformation according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of image data used in an experiment according to an embodiment of the present invention, in which: (a) a pre-earthquake optical image; (b) a post-earthquake SAR image;
fig. 6 is a comparison of the image change detection results obtained in an experiment according to an embodiment of the present invention with the corresponding truth image, wherein: (a) change detection method SCCN [7]; (b) change detection method HPT [8]; (c) the method provided by the invention; (d) the truth image;
fig. 7 is a schematic structural diagram of a heterogeneous image change detection apparatus based on convolutional neural network fusion according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The following describes a heterogeneous image change detection method and apparatus based on convolutional neural network fusion according to an embodiment of the present invention with reference to the drawings, and first, a heterogeneous image change detection method based on convolutional neural network fusion according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 1 is a flowchart of a heterogeneous image change detection method based on convolutional neural network fusion according to an embodiment of the present invention.
As shown in fig. 1, the heterogeneous image change detection method based on convolutional neural network fusion includes the following steps:
In step S101, a convolutional neural network is constructed and the homogeneous transformation features of the heterogeneous images are extracted.
It can be understood that the key of the heterogeneous image change detection is to construct a suitable homogeneous feature space, so that the heterogeneous image can be accurately mapped in the space for comparison, and finally a change detection result is output. The detection method provided by the embodiment of the invention utilizes the convolutional neural network to construct the homogeneous feature space, wherein the depth characteristic of the convolutional neural network determines that the feature extraction of the convolutional neural network still has comprehensiveness and depth under a complex terrain environment, so that the homogeneous feature space can be accurately described.
Further, in an embodiment of the present invention, extracting the convolutional neural network structure and the homogeneous transformation features of the image includes: constructing a convolutional neural network based on VGGnet, where the convolutional neural network has no fully connected layer, the weights of the entire network are fixed preset values, and the outputs of the network are the selected network layers; and selecting the corresponding network layers to separate and output the content features and style features of the image according to the characteristics of the outputs of different layers in the convolutional neural network.
Specifically, the convolutional neural network construction and homogeneous transformation feature extraction specifically includes:
Without loss of generality, the embodiment of the invention considers two heterogeneous remote sensing images acquired before and after the change. One is the pre-change optical image, denoted I_opt^pre; the other is the post-change SAR image, denoted I_SAR^post. The two heterogeneous remote sensing images are assumed to have been registered with each other. To verify the validity of the method proposed by the embodiment of the present invention, the two heterogeneous remote sensing images contain various different ground object types, and a complex terrain environment exists.
In order to solve the problem that homogeneous transformation features are difficult to accurately extract in a complex terrain environment, the embodiment of the invention constructs a convolutional neural network with a multi-hierarchy structure and extracts the homogeneous transformation features based on the output of the network.
(1) In order to extract the homogeneous transformation characteristics of the image deeply, the embodiment of the invention constructs a convolutional neural network based on VGGnet, as shown in FIG. 2. The VGGnet has a deep network structure, comprises a plurality of convolution layers, rectification layers and pooling layers, and can effectively extract the features of images under complex terrain conditions. However, unlike VGGnet, the network constructed by the embodiments of the present invention has the following three features:
A. There is no fully connected layer. A fully connected layer outputs global information of the image but loses its local position information. In a complex terrain environment, each position of the image contains different ground objects, and this position information is what links the homogeneous transformation mapping of locally identical ground objects between the heterogeneous images. The function of the fully connected layer therefore does not meet the requirement of performing the homogeneous spatial transformation according to embodiments of the present invention.
B. The weights of the whole network are fixed, taken from a network trained as VGGnet. The network constructed by the embodiment of the invention does not perform conventional training or fine-tuning of its parameter values; instead, similar to the inference process of a network, it outputs the features extracted under complex terrain conditions using the already trained and fine-tuned weights.
C. The outputs of the network are the selected network layers. The output of the network is not the terminal output of the VGGnet feed-forward network; instead, outputs are taken at the selected layers of the network. In this way, multi-scale homogeneous transformation features, from concrete to abstract, can be extracted in a complex terrain environment.
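The three properties above can be illustrated with a minimal numpy sketch. This is not the patent's actual VGGnet; the kernels, layer sizes, and layer names below are hypothetical, and only the structure is meant to match: fixed (never trained) weights, no fully connected layer, and outputs taken at several selected layers rather than at a final head.

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution of a single-channel image x with kernel k."""
    kh, kw = k.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def maxpool2(x):
    """2x2 max pooling with stride 2 (mimics the VGG pooling layers)."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def extract_features(image, kernels):
    """Fixed-weight extractor: conv -> ReLU -> pool per stage, no FC layer.
    Returns the output of every selected layer, not a final prediction."""
    outputs = {}
    x = image
    for idx, k in enumerate(kernels):          # weights are preset, never updated here
        x = np.maximum(conv2d(x, k), 0.0)      # convolution + rectification
        outputs[f"relu{idx}"] = x              # selected layer output
        x = maxpool2(x)
        outputs[f"pool{idx}"] = x              # selected layer output at coarser scale
    return outputs

rng = np.random.default_rng(0)
img = rng.random((16, 16))
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]  # fixed hypothetical weights
feats = extract_features(img, kernels)
```

The returned dictionary plays the role of the multi-scale, multi-level outputs described in A–C: fine layers keep local position information, coarser pooled layers carry more abstract, larger-scale information.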
(2) According to the characteristics of the outputs of different layers in the constructed CNN, the corresponding network layers are selected to separate and output the content features and style features of the image, as shown in FIG. 2. Existing methods confuse content features and style features in the homogeneous spatial transformation. Using such mixed features for the homogeneous spatial transformation of heterogeneous images loses the content semantic information of the images and affects the accuracy of the transformed and synthesized images. Considering the different attributes carried by the outputs of different network layers of the CNN, the content semantic features and the style features of the output image are separated:
A. The deepest convolutional layer in the VGG network whose scale is close to that of the ground objects is used directly as the content feature output. The content semantic information of ground objects in a remote sensing image is generally among the most abstract features at the scale of the specified ground objects. These features represent very abstract information such as the type and form of the ground object. In addition, the scale at which these features are extracted must also be close to that of the ground objects. If the extracted feature scale is too large, the content semantic information of adjacent ground objects is inevitably introduced, interfering with feature extraction. Conversely, if the scale is too small, it is difficult to capture the overall characteristics of the ground object, which limits the abstraction level of the features. For convenience, the embodiment of the present invention denotes the content semantic features of an image as F_c.
B. The pooling layers of the VGG network at all scales are used, and style features are extracted based on the Gram texture operator. The style of an image is described by its texture features at various scales. The VGG network, containing multiple scales, can effectively cover the texture information of each scale of the remote sensing image. Meanwhile, texture, as gray-scale information varying periodically inside the image, is well suited to extraction by the maximum operation of the VGG pooling layers. Finally, to convert the pooled outputs at the various scales into texture feature representations, the embodiment of the present invention applies the commonly used Gram texture operator to the output of each pooling layer. In this way, the style of the image can be fully and deeply described and can be expressed separately from the content semantic features of the image. Therefore, the content semantic information of the image can be completely preserved in the homogeneous spatial transformation, finally improving the accuracy of the homogeneous spatial transformation of the heterogeneous images. Similarly, the embodiment of the invention denotes the style features of an image as F_s.
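The Gram texture operator acting on one pooling-layer output can be sketched as follows; the feature-map shape here is illustrative, not taken from the patent's network.

```python
import numpy as np

def gram_matrix(features):
    """Gram texture operator: features has shape (C, H, W); entry (i, j) is the
    inner product of channel i and channel j, normalized by H*W. It captures
    which channel activations co-occur, i.e. texture/style, discarding layout."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

# e.g. a pooling-layer output with 4 channels at one scale (hypothetical shape)
pooled = np.random.default_rng(1).random((4, 8, 8))
g = gram_matrix(pooled)
```

Applying this operator at every pooling scale and concatenating the results gives one possible concrete form of the style features F_s described above.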
Compared with the current change detection algorithm based on heterogeneous images, the convolutional neural network constructed by the embodiment of the invention has deep and comprehensive homogeneous transformation feature extraction capability. The homogeneity transformation features extracted by the current heterogeneous image change detection algorithm only comprise pixel values or low-level neural network outputs. The convolutional neural network constructed in the embodiment of the invention has a network structure with a plurality of levels, so that the method not only can cover more specific homogeneous transformation characteristics with smaller scale extracted by the current method, but also can extract more abstract characteristics with larger scale through a multi-level network structure. In addition, the embodiment of the invention separates the content semantic features and the style features in the homogeneous transformation features, ensures the integrity of the content semantic information of the image and improves the accuracy of transforming and synthesizing the image.
In summary, even in a complex terrain environment, the features extracted by the embodiment of the invention can also describe the image content and style form information from concrete to abstract of each scale of the remote sensing image, and the problem that the prior method is difficult to perform homogeneous space transformation sufficiently and accurately under the complex terrain condition is solved.
In step S102, content features and stylistic features between heterogeneous images are fused according to a homogeneous transformation feature loss function to perform homogeneous spatial transformation of the heterogeneous images, and a homogeneous spatial transformation result is obtained.
It can be understood that, based on the homogeneous feature space, the content and style features of the corresponding heterogeneous images are extracted, and a corresponding loss function is designed to fuse them, so that the heterogeneous images are accurately mapped into homogeneous images.
Specifically, the loss function construction and homogeneous transformation feature fusion specifically includes:
Based on the homogeneous transformation features extracted in step S101, a loss function is constructed to fuse the content features and the style features between the heterogeneous images and perform the homogeneous spatial transformation of the heterogeneous images. Inspired by image style transfer, the following loss function is constructed:
L(I; I_ori, I_tag) = λ_c||F_c(I) − F_c(I_ori)||² + λ_s||F_s(I) − F_s(I_tag)||²  (1)

wherein I_ori is the original image to be transformed, i.e., the image whose content is preserved while its style is transformed; I_tag is the image with the target style; ||·|| denotes the 2-norm; λ_c and λ_s are two constants to be set, representing respectively the weights of the content features and the style features in the fusion process. The result of the homogeneous spatial transformation can be expressed as:
I_gen(I_ori, I_tag) = argmin_I L(I; I_ori, I_tag)  (2)

wherein I_gen is the resulting image of the homogeneous spatial transformation to be synthesized by the embodiment of the invention. As can be seen from equation 1, the loss function L(I) contains two additive terms. The first term measures the distance between the content features of I and I_ori. As can be seen from equation 2, when generating I_gen(I_ori, I_tag) in the fusion, the embodiment of the invention expects the distance between its content features and those of I_ori to be as small as possible, i.e., their content features should be as close as possible. Therefore, in the process of the homogeneous spatial transformation, the embodiment of the invention ensures that the content semantic information of the image is not lost or damaged. The second additive term measures the distance between the style features of I and I_tag. The minimizing operation of equation 2 means that the style features of I_gen(I_ori, I_tag) should approach those of I_tag as closely as possible, i.e., the style features of the fused homogeneous spatial transformation image are as close as possible to the style and texture characteristics of the target image. Therefore, the result of the homogeneous spatial transformation of the heterogeneous image converges to the style of the target image.
By the operations of formulas 1 and 2, the fusion processing can be performed for the contents of the heterogeneous images and the features in the style of the style, respectively. In this way, even under the condition of complex terrain with various ground features, the integrity of the content information can be guaranteed not to be lost in the process of homogeneous space transformation. Furthermore, the transformed style texture may be made to approach the target.
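The minimization in equations 1 and 2 can be sketched with toy differentiable stand-ins for the features: here F_c is the identity (the pixels themselves) and F_s is a single global mean, which are hypothetical simplifications, not the CNN features of the patent. The synthesized image is then obtained by gradient descent on L, mirroring equation 2:

```python
import numpy as np

lam_c, lam_s = 1.0, 10.0   # illustrative weights, not the patent's settings

def F_c(img):              # toy content feature: the pixels themselves
    return img

def F_s(img):              # toy style feature: one global statistic
    return np.array([img.mean()])

def loss(I, I_ori, I_tag):
    """Equation 1 with the toy features above."""
    return (lam_c * np.sum((F_c(I) - F_c(I_ori)) ** 2)
            + lam_s * np.sum((F_s(I) - F_s(I_tag)) ** 2))

def grad(I, I_ori, I_tag):
    """Analytic gradient of the toy loss with respect to the image I."""
    n = I.size
    return (2 * lam_c * (I - I_ori)
            + 2 * lam_s * (I.mean() - I_tag.mean()) / n)

rng = np.random.default_rng(2)
I_ori = rng.random((8, 8))          # image whose content is to be kept
I_tag = rng.random((8, 8)) + 1.0    # image whose style (mean level) is targeted
I = I_ori.copy()                    # initialize the synthesis at I_ori
losses = [loss(I, I_ori, I_tag)]
for _ in range(200):
    I -= 0.05 * grad(I, I_ori, I_tag)   # gradient step toward argmin_I L
    losses.append(loss(I, I_ori, I_tag))
```

With the real CNN features the gradient would come from backpropagation rather than a closed form, but the structure of the optimization, descending L(I; I_ori, I_tag) over the image pixels, is the same.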
According to formula 2, the embodiment of the present invention obtains the result of the homogeneous spatial transformation of the heterogeneous images:

Î_SAR^pre = I_gen(I_opt^pre, I_SAR^post),  Î_opt^post = I_gen(I_SAR^post, I_opt^pre)

wherein Î_SAR^pre is the simulated pre-change SAR image synthesized by the homogeneous spatial transformation and, similarly, Î_opt^post is the simulated post-change optical image synthesized by the homogeneous spatial transformation. The fusion process of the homogeneous spatial transformation of the heterogeneous images is shown in fig. 3.
In step S103, change detection is performed according to the result of the homogeneous space transformation, and a final heterogeneous image change detection result is obtained.
It is understood that the embodiment of the present invention performs comparison on the basis of the synthesized homogeneous image, and outputs the result of the change detection.
Further, in an embodiment of the present invention, performing change detection according to the result of the homogeneous spatial transformation to obtain the final heterogeneous image change detection result includes: detecting a first change region between the optical images by using a Gabor wavelet and a support vector machine; detecting a second change region between the SAR images by using a Markov random field; segmenting the first change region and the second change region based on the same superpixel segmentation strategy; and fusing the first change region and the second change region superpixel by superpixel using an evidence theory method to generate the final heterogeneous image change detection result.
Specifically, the change detection based on the homogeneous feature transformation image specifically includes:
The embodiment of the invention performs change detection using the generated homogeneous spatial transformation images Î_SAR^pre and Î_opt^post in combination with the original images I_opt^pre and I_SAR^post. As mentioned in the foregoing description, Î_SAR^pre is fused from the content features of the pre-change optical image I_opt^pre and the style features of the post-change SAR image I_SAR^post. Thus, Î_SAR^pre can be regarded approximately as the pre-change SAR image. In the same way, Î_opt^post can be regarded approximately as the post-change optical image. Through the homogeneous spatial transformation, the embodiment of the invention thus has optical and SAR images both before and after the change. Therefore, conventional methods based on heterogeneous image fusion can be employed for change detection.

Here, the embodiment of the present invention performs change detection by means of a heterogeneous image fusion method based on superpixels and evidence theory. First, the change region between the optical images I_opt^pre and Î_opt^post is detected using a Gabor wavelet and a support vector machine and is denoted M_opt; the change region between the SAR images Î_SAR^pre and I_SAR^post is detected using a Markov random field (MRF) and is denoted M_SAR. Second, M_opt and M_SAR are segmented based on the same superpixel segmentation strategy. Finally, M_opt and M_SAR are fused superpixel by superpixel using an evidence theory method, generating the final change detection result M_final.
The heterogeneous image change detection method based on convolutional neural network fusion will be further described by a specific embodiment.
As shown in fig. 4, the method of the embodiment of the present invention includes the steps of:
a) On a satellite, optical and SAR sensors are used to obtain, respectively, the pre-change optical image I_opt^pre and the post-change SAR image I_SAR^post of the changed area, and the two images are registered.

b) The convolutional neural network is constructed on the basis of the VGG network; the details are shown in FIG. 2. The network is used to extract the content features F_c and style features F_s of I_opt^pre and I_SAR^post respectively.

c) A loss function is constructed as shown in equation 1. By minimizing the loss function, the corresponding content features F_c and style features F_s of I_opt^pre and I_SAR^post are fused to generate the homogeneous spatial transformation images, i.e., the simulated pre-change SAR image Î_SAR^pre and the simulated post-change optical image Î_opt^post.

d) The change region M_opt between the optical images I_opt^pre and Î_opt^post is extracted using an SVM and Gabor wavelets; the change region M_SAR between the SAR images Î_SAR^pre and I_SAR^post is extracted using an MRF.

e) M_opt and M_SAR are segmented with the same superpixel segmentation strategy.

f) M_opt and M_SAR are fused superpixel by superpixel through evidence theory, obtaining the final change detection result M_final.
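Steps e) and f) require a common superpixel partition of M_opt and M_SAR. The patent does not specify the segmentation algorithm; the sketch below substitutes a simple regular-grid partition as a stand-in superpixel strategy and shows how a per-superpixel change fraction could be aggregated from a binary map:

```python
import numpy as np

def grid_superpixels(shape, block=2):
    """Stand-in superpixel strategy: label pixels by regular grid blocks.
    A real implementation would use e.g. SLIC; the grid keeps the sketch simple."""
    h, w = shape
    rows = np.arange(h) // block
    cols = np.arange(w) // block
    n_col_blocks = (w + block - 1) // block
    return rows[:, None] * n_col_blocks + cols[None, :]

def superpixel_vote(mask, labels):
    """Fraction of changed pixels inside each superpixel of a binary change map."""
    return {int(lab): float(mask[labels == lab].mean()) for lab in np.unique(labels)}

m_opt = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)   # toy binary change map
labels = grid_superpixels(m_opt.shape, block=2)  # same labels reused for M_SAR
votes = superpixel_vote(m_opt, labels)
```

Because the same `labels` array is applied to both M_opt and M_SAR, the per-superpixel statistics of the two maps line up one-to-one, which is what the evidence-theory fusion in step f) assumes.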
Further, the embodiment of the present invention verifies the heterogeneous image change detection method provided by the present invention by using real scene data, specifically as follows:
The research scene of the experiment is a part of the coastal area of Port-au-Prince, the capital of Haiti, where buildings were severely damaged and changed by the 2010 earthquake. The pre-change optical image and the post-change SAR image are shown in fig. 5. It can be seen that the terrain environment in the figure is very complex: it comprises the two basic geographic environments of sea and land; in addition, various buildings, such as houses, roads and bridges, are densely interleaved. As discussed above, existing heterogeneous image change detection algorithms would be difficult to apply here.
Here, the method of the embodiment of the present invention is compared with two recent heterogeneous image change detection methods: the Symmetric Convolutional Coupled Network (SCCN) and the Homogeneous Pixel Transformation (HPT). In the parameter setting of the method of the embodiment of the invention, λ_c and λ_s are set to 5.0 and 1×10^5 respectively. The final detection results and the corresponding truth image are shown in fig. 6. It can be seen from the comparison of the detection results that, because the method provided by the embodiment of the invention effectively separates and extracts the content and style features of the images through a convolutional neural network based on deep learning, it still obtains the best change detection result under the complex terrain condition. In contrast, the two compared methods are affected by the complex terrain conditions, and their detection results deteriorate markedly.
In addition, as shown in Table 1, the embodiment of the present invention compares the above three methods using OA (Overall Accuracy), which measures the overall detection result. It can be seen that the detection result of the embodiment of the invention is clearly superior to those of the other two methods. Table 1 compares the experimental results in terms of overall classification accuracy.
TABLE 1

Method                                      OA (%)
SCCN                                        93.80
HPT                                         95.19
Method of the embodiment of the invention   95.91
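The OA figure in Table 1 is the fraction of pixels whose detected label matches the truth image. A minimal sketch, on toy maps rather than the experiment's data:

```python
import numpy as np

def overall_accuracy(detected, truth):
    """OA = correctly classified pixels / total pixels, for binary change maps."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    return float((detected == truth).mean())

truth = np.array([[0, 1], [1, 1]], dtype=bool)     # toy truth image
detected = np.array([[0, 1], [0, 1]], dtype=bool)  # toy detection result
oa = overall_accuracy(detected, truth)             # 3 of 4 pixels agree
```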
In summary, the method provided by the embodiment of the invention differs from current state-of-the-art change detection methods for heterogeneous remote sensing images in that it targets scenes with complex terrain environments. By constructing a deep-learning convolutional neural network to separate the content and style features of the images and fusing them with a loss function to generate the homogeneous spatial transformation result of the heterogeneous images, the method can accurately detect the change regions of heterogeneous images under complex terrain conditions, thereby breaking through the limitation that complex terrain environments impose on heterogeneous remote sensing image change detection. The effectiveness of the method is shown by the processing results on measured data of the earthquake-affected coastal scene.
According to the heterogeneous image change detection method based on convolutional neural network fusion provided by the embodiment of the invention, the CNN can describe the content and texture of remote sensing images at different scales in a complex terrain environment; that is, by comprehensively and deeply extracting the homogeneous transformation features between heterogeneous images, the change regions of the heterogeneous images can still be accurately detected under complex terrain conditions.
Next, a heterogeneous image change detection apparatus based on convolutional neural network fusion according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 7 is a schematic structural diagram of a heterogeneous image change detection apparatus based on convolutional neural network fusion according to an embodiment of the present invention.
As shown in fig. 7, the heterogeneous image change detection apparatus 10 based on convolutional neural network fusion includes: an extraction module 100, a fusion module 200 and a detection module 300.
The extraction module 100 is configured to extract a convolutional neural network structure and a homogeneous transformation feature of the heterogeneous image. The fusion module 200 is configured to fuse content features and stylistic features between heterogeneous images according to a homogeneous transformation feature loss function to perform homogeneous spatial transformation on the heterogeneous images, so as to obtain a homogeneous spatial transformation result. The detection module 300 is configured to perform change detection according to the result of the homogeneous space transformation to obtain a final heterogeneous image change detection result. The device 10 of the embodiment of the invention can still accurately detect the change area of the heterogeneous images under the complicated terrain condition, and further comprehensively and deeply extract the homogeneous transformation characteristics among the heterogeneous images.
Further, in an embodiment of the present invention, the extraction module 100 is further configured to construct a convolutional neural network based on VGGnet, where the convolutional neural network has no fully connected layer, the weights of the entire network are fixed preset values, and the outputs of the network are the selected network layers; and to select the corresponding network layers to separate and output the content features and style features of the image according to the characteristics of the outputs of different layers in the convolutional neural network.
Further, in one embodiment of the present invention, the loss function is:
L(I; I_ori, I_tag) = λ_c||F_c(I) − F_c(I_ori)||² + λ_s||F_s(I) − F_s(I_tag)||²

wherein I_ori is the original image to be transformed; I_tag is the image with the target style; ||·|| denotes the 2-norm; λ_c and λ_s are two constants to be set, representing respectively the weights of the content features and the style features in the fusion process; F_s denotes the style features of an image and F_c the content semantic features of an image.
Further, in one embodiment of the invention, the result of the homogenous spatial transformation is:
Î_SAR^pre = I_gen(I_opt^pre, I_SAR^post),  Î_opt^post = I_gen(I_SAR^post, I_opt^pre)

wherein Î_SAR^pre is the simulated pre-change SAR image synthesized by the homogeneous spatial transformation, and Î_opt^post is the simulated post-change optical image synthesized by the homogeneous spatial transformation.
Further, in an embodiment of the present invention, the detection module 300 is further configured to detect a first change region between the optical images by using a Gabor wavelet and a support vector machine, detect a second change region between the SAR images by using a Markov random field, segment the first change region and the second change region based on the same superpixel segmentation strategy, and fuse the first change region and the second change region superpixel by superpixel using an evidence theory method to generate the final heterogeneous image change detection result.
It should be noted that the foregoing explanation of the embodiment of the heterogeneous image change detection method based on convolutional neural network fusion is also applicable to the heterogeneous image change detection apparatus based on convolutional neural network fusion of this embodiment, and details are not repeated here.
According to the heterogeneous image change detection device based on convolutional neural network fusion, which is provided by the embodiment of the invention, a homogeneous feature space is constructed by using CNN, homogeneous transformation features under a complex terrain environment are accurately and comprehensively described based on different layers of CNN, homogeneous transformation mapping results of heterogeneous images are output, and finally change detection is realized through comparison among the homogeneous images, so that the content and texture of remote sensing images under the complex terrain environment in different scales can be described by applying CNN, namely, the change area of the heterogeneous images can be still accurately detected under the complex terrain condition, and the homogeneous transformation features among the heterogeneous images are comprehensively and deeply extracted.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "N" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more N executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of implementing the embodiments of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or N wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by program instructions executed by related hardware; the program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. The integrated module, if implemented as a software functional module and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and should not be construed as limiting the present invention; variations, modifications, substitutions, and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (6)

1. A heterogeneous image change detection method based on convolutional neural network fusion, characterized by comprising the following steps:
extracting homogeneous transformation features of the heterogeneous images with a convolutional neural network, wherein the extraction comprises: constructing a convolutional neural network based on VGGnet, wherein the convolutional neural network has no fully connected layer, the weights of the whole network are fixed preset values, and the outputs of the network are the selected network layers; and, according to the characteristics of the outputs of different layers in the convolutional neural network, selecting the corresponding network layers to separately output the content features and the style features of the image;
according to a homogeneous transformation feature loss function, fusing the content features and the style features between the heterogeneous images to perform a homogeneous space transformation of the heterogeneous images and obtain a homogeneous space transformation result, wherein the loss function is:
L(I; I_ori, I_tag) = λ_c ||F_c(I) - F_c(I_ori)||² + λ_s ||F_s(I) - F_s(I_tag)||²
wherein I_ori is the original image to be transformed, I_tag is the image with the target style, ||·|| denotes the 2-norm, λ_c and λ_s are two preset constants representing the weights of the content features and the style features in the fusion process, F_s denotes the style features of an image, and F_c denotes the semantic content features of an image;
and performing change detection according to the homogeneous space transformation result to obtain a final heterogeneous image change detection result.
2. The method of claim 1, wherein the homogeneous space transformation result (given in the original by formula images FDA0003192334620000011 to FDA0003192334620000013) comprises: a simulated pre-change SAR image synthesized by the homogeneous space transformation, and a simulated post-change optical image synthesized by the homogeneous space transformation.
3. The method according to claim 2, wherein performing change detection according to the homogeneous space transformation result to obtain the final heterogeneous image change detection result comprises:
detecting a first variation region of the optical image by using a Gabor wavelet and a support vector machine;
detecting a second change region of the SAR image by using a Markov random field;
segmenting the first variation region and the second variation region based on the same superpixel segmentation strategy;
and fusing the first change region and the second change region superpixel by superpixel using an evidence theory method to generate the final heterogeneous image change detection result.
4. A heterogeneous image change detection device based on convolutional neural network fusion, characterized by comprising:
an extraction module, configured to extract homogeneous transformation features of the heterogeneous images with a convolutional neural network, wherein the extraction module is further configured to construct a convolutional neural network based on VGGnet, wherein the convolutional neural network has no fully connected layer, the weights of the whole network are fixed preset values, and the outputs of the network are the selected network layers; and, according to the characteristics of the outputs of different layers in the convolutional neural network, to select the corresponding network layers to separately output the content features and the style features of the image;
a fusion module, configured to fuse the content features and the style features between the heterogeneous images according to a homogeneous transformation feature loss function, so as to perform a homogeneous space transformation of the heterogeneous images and obtain a homogeneous space transformation result, wherein the loss function is:
L(I; I_ori, I_tag) = λ_c ||F_c(I) - F_c(I_ori)||² + λ_s ||F_s(I) - F_s(I_tag)||²
wherein I_ori is the original image to be transformed, I_tag is the image with the target style, ||·|| denotes the 2-norm, λ_c and λ_s are two preset constants representing the weights of the content features and the style features in the fusion process, F_s denotes the style features of an image, and F_c denotes the semantic content features of an image;
and a detection module, configured to perform change detection according to the homogeneous space transformation result to obtain a final heterogeneous image change detection result.
5. The apparatus of claim 4, wherein the homogeneous space transformation result (given in the original by formula images FDA0003192334620000021 to FDA0003192334620000023) comprises: a simulated pre-change SAR image synthesized by the homogeneous space transformation, and a simulated post-change optical image synthesized by the homogeneous space transformation.
6. The apparatus of claim 5, wherein the detection module is further configured to detect a first change region of an optical image using a Gabor wavelet and a support vector machine, detect a second change region of an SAR image using a Markov random field, segment the first change region and the second change region based on the same superpixel segmentation strategy, and generate the final heterogeneous image change detection result by fusing the first change region and the second change region superpixel by superpixel using an evidence theory method.
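The loss function in claims 1 and 4 follows the neural style-transfer formulation: content features come from deep-layer activations of a fixed-weight VGG-style network, and style features from feature-map statistics such as Gram matrices. The following is a minimal numpy sketch of the loss computation only; `content_features` and `style_gram` are toy stand-ins for the patent's fixed VGGnet layer outputs, and the weight values are illustrative assumptions, not the patent's settings.

```python
import numpy as np

def content_features(img):
    # Stand-in for F_c: in the patent this would be deep-layer
    # activations of the fixed-weight VGG network; here, the raw pixels.
    return img.reshape(-1)

def style_gram(img):
    # Stand-in for F_s: the Gram matrix of feature maps, capturing
    # channel co-occurrence statistics ("style") independent of layout.
    f = img.reshape(img.shape[0], -1)        # (channels, H*W)
    return (f @ f.T) / f.shape[1]            # normalized Gram matrix

def homogeneous_loss(I, I_ori, I_tag, lam_c=1.0, lam_s=1e3):
    # L(I; I_ori, I_tag) = lam_c * ||F_c(I) - F_c(I_ori)||^2
    #                    + lam_s * ||F_s(I) - F_s(I_tag)||^2
    c = np.sum((content_features(I) - content_features(I_ori)) ** 2)
    s = np.sum((style_gram(I) - style_gram(I_tag)) ** 2)
    return lam_c * c + lam_s * s

rng = np.random.default_rng(0)
opt = rng.random((3, 8, 8))    # e.g. a pre-change optical image
sar = rng.random((3, 8, 8))    # e.g. a post-change SAR image

# Starting from the optical image itself, the content term vanishes and
# only the style mismatch with the SAR target contributes:
print(homogeneous_loss(opt, opt, sar) > 0.0)
```

In the full method, a synthesized image I would be optimized to minimize this loss, pulling its content toward I_ori and its style toward I_tag.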
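Claims 3 and 6 fuse the two change maps superpixel by superpixel with an evidence theory method. A common instance of evidence theory is Dempster-Shafer combination; the sketch below combines two per-superpixel mass assignments over the frame {changed, unchanged}. The mass values and the dict-based representation are illustrative assumptions, not the patent's implementation.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions over the frame {changed "c", unchanged "u"}
    plus the ignorance set "theta" = {c, u}, given as dicts summing to 1."""
    keys = ("c", "u", "theta")

    def meet(a, b):
        # Set intersection on this 3-element power-set fragment.
        if a == "theta":
            return b
        if b == "theta":
            return a
        return a if a == b else None        # None = empty set (conflict)

    combined = {k: 0.0 for k in keys}
    conflict = 0.0
    for a in keys:
        for b in keys:
            inter = meet(a, b)
            if inter is None:
                conflict += m1[a] * m2[b]
            else:
                combined[inter] += m1[a] * m2[b]
    # Dempster's rule: renormalize by the non-conflicting mass.
    scale = 1.0 - conflict
    return {k: v / scale for k, v in combined.items()}

# One superpixel: the optical detector leans "changed", the SAR detector
# agrees more weakly; combination reinforces the shared hypothesis.
m_optical = {"c": 0.7, "u": 0.2, "theta": 0.1}
m_sar     = {"c": 0.5, "u": 0.3, "theta": 0.2}
fused = dempster_combine(m_optical, m_sar)
print(fused["c"] > fused["u"])   # True: agreement strengthens "changed"
```

Applying this per superpixel and thresholding the fused "changed" mass yields a final binary change map.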
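For the optical branch, claims 3 and 6 extract change features with Gabor wavelets before classification by a support vector machine. A minimal numpy sketch of a Gabor filter bank and per-pixel orientation responses; the kernel size, wavelength, and number of orientations are illustrative assumptions.

```python
import numpy as np
from numpy.fft import fft2, ifft2

def gabor_kernel(size=9, sigma=2.0, theta=0.0, lam=4.0):
    # Real part of a Gabor filter: a Gaussian-windowed cosine wave
    # oriented at angle theta with wavelength lam.
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def gabor_features(img, n_orient=4):
    # Per-pixel magnitude responses at several orientations; in the patent
    # such features would feed the SVM classifier of the optical branch.
    feats = []
    for k in range(n_orient):
        ker = gabor_kernel(theta=k * np.pi / n_orient)
        pad = np.zeros_like(img)
        pad[:ker.shape[0], :ker.shape[1]] = ker
        resp = np.real(ifft2(fft2(img) * fft2(pad)))   # circular convolution
        feats.append(np.abs(resp))
    return np.stack(feats, axis=-1)                    # (H, W, n_orient)

img = np.zeros((32, 32))
img[:, 16:] = 1.0                # a vertical step edge
f = gabor_features(img)
print(f.shape)                   # (32, 32, 4)
```

Differencing such feature maps between the two dates gives the change evidence that the SVM then classifies into changed/unchanged regions.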
CN201910745676.6A 2019-08-13 2019-08-13 Heterogeneous image change detection method and device based on convolutional neural network fusion Active CN110598748B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910745676.6A CN110598748B (en) 2019-08-13 2019-08-13 Heterogeneous image change detection method and device based on convolutional neural network fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910745676.6A CN110598748B (en) 2019-08-13 2019-08-13 Heterogeneous image change detection method and device based on convolutional neural network fusion

Publications (2)

Publication Number Publication Date
CN110598748A CN110598748A (en) 2019-12-20
CN110598748B true CN110598748B (en) 2021-09-21

Family

ID=68854253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910745676.6A Active CN110598748B (en) 2019-08-13 2019-08-13 Heterogeneous image change detection method and device based on convolutional neural network fusion

Country Status (1)

Country Link
CN (1) CN110598748B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111126308B (en) * 2019-12-26 2021-07-06 西南交通大学 Automatic damaged building identification method combining pre-disaster remote sensing image information and post-disaster remote sensing image information
CN111339959A (en) * 2020-02-28 2020-06-26 西南交通大学 Method for extracting offshore buoyant raft culture area based on SAR and optical image fusion
CN113065585B (en) * 2021-03-23 2021-12-28 北京亮亮视野科技有限公司 Training method and device of image synthesis model and electronic equipment
CN113505306B (en) * 2021-06-21 2022-04-22 广东交通职业技术学院 Interest point recommendation method, system and medium based on heterogeneous graph neural network
CN113591933B (en) * 2021-07-07 2024-04-09 中国人民解放军海军航空大学 Remote sensing image change detection method and system based on correlation measurement
CN114863235A (en) * 2022-05-07 2022-08-05 清华大学 Fusion method of heterogeneous remote sensing images

Citations (4)

Publication number Priority date Publication date Assignee Title
CN103810699A (en) * 2013-12-24 2014-05-21 西安电子科技大学 SAR (synthetic aperture radar) image change detection method based on non-supervision depth nerve network
CN106875380A (en) * 2017-01-12 2017-06-20 西安电子科技大学 A kind of heterogeneous image change detection method based on unsupervised deep neural network
CN107392887A (en) * 2017-06-16 2017-11-24 西北工业大学 A kind of heterogeneous method for detecting change of remote sensing image based on the conversion of homogeneity pixel
CN109583569A (en) * 2018-11-30 2019-04-05 中控智慧科技股份有限公司 A kind of multi-modal Feature fusion and device based on convolutional neural networks

Non-Patent Citations (1)

Title
Jia Liu et al., "A deep convolutional coupling network for change detection based on heterogeneous optical and radar images," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 3, pp. 545-559, March 2018. *

Also Published As

Publication number Publication date
CN110598748A (en) 2019-12-20

Similar Documents

Publication Publication Date Title
CN110598748B (en) Heterogeneous image change detection method and device based on convolutional neural network fusion
CN110969088B (en) Remote sensing image change detection method based on significance detection and deep twin neural network
Zhou et al. Object-based land cover classification of shaded areas in high spatial resolution imagery of urban areas: A comparison study
Lari et al. An adaptive approach for the segmentation and extraction of planar and linear/cylindrical features from laser scanning data
CN113076802B (en) Transformer substation switch on-off state image identification method based on lack of disconnected image sample
CN104200521B (en) High Resolution SAR Images building target three-dimensional rebuilding method based on model priori
CN110309808B (en) Self-adaptive smoke root node detection method in large-scale space
CN103632167B (en) Monocular vision space recognition method under class ground gravitational field environment
CN109360190B (en) Building damage detection method and device based on image superpixel fusion
CN109829423B (en) Infrared imaging detection method for frozen lake
Urban et al. Finding a good feature detector-descriptor combination for the 2D keypoint-based registration of TLS point clouds
CN112766184B (en) Remote sensing target detection method based on multi-level feature selection convolutional neural network
CN110197185B (en) Method and system for monitoring space under bridge based on scale invariant feature transform algorithm
CN113033315A (en) Rare earth mining high-resolution image identification and positioning method
CN114494378A (en) Multi-temporal remote sensing image automatic registration method based on improved SIFT algorithm
Li et al. Integrating multiple textural features for remote sensing image change detection
CN114332644B (en) Large-view-field traffic density acquisition method based on video satellite data
CN117274627A (en) Multi-temporal snow remote sensing image matching method and system based on image conversion
Jabari et al. Stereo-based building detection in very high resolution satellite imagery using IHS color system
CN114266947A (en) Classification method and device based on fusion of laser point cloud and visible light image
CN105005962B (en) Islands and reefs Remote Sensing Image Matching method based on hierarchical screening strategy
A Farrag et al. Detecting land cover changes using VHR satellite images: A comparative study
Sofina et al. Object-based change detection using high-resolution remotely sensed data and GIS
CN106682668A (en) Power transmission line geological disaster monitoring method using unmanned aerial vehicle to mark images
CN116128919A (en) Multi-temporal image abnormal target detection method and system based on polar constraint

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant