CN112818966A - Multi-mode remote sensing image data detection method and system - Google Patents

Multi-mode remote sensing image data detection method and system

Info

Publication number
CN112818966A
CN112818966A
Authority
CN
China
Prior art keywords
network
remote sensing
image data
sensing image
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110408562.XA
Other languages
Chinese (zh)
Other versions
CN112818966B (en)
Inventor
洪勇
罗冷坤
李江
董朝阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Optics Valley Information Technology Co ltd
Original Assignee
Wuhan Optics Valley Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Optics Valley Information Technology Co ltd filed Critical Wuhan Optics Valley Information Technology Co ltd
Priority to CN202110408562.XA priority Critical patent/CN112818966B/en
Publication of CN112818966A publication Critical patent/CN112818966A/en
Application granted granted Critical
Publication of CN112818966B publication Critical patent/CN112818966B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/13Satellite images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multi-mode remote sensing image data detection method and system, an electronic device and a storage medium. In the method, multi-mode remote sensing image data of different time sequences are respectively input into a multi-modal feature mining network, the fusion vector feature of the multi-mode remote sensing image data of each time sequence is output, the fusion vector features are input into a change detection network, and the change detection network identifies whether the multi-mode remote sensing image data of the different time sequences differ. The multi-modal feature mining network constructed by the invention mines and fuses the features of the multi-mode remote sensing image data, meets the diversity, multi-time-sequence and multi-feature requirements of the training data set, and can improve the accuracy of the features mined by the network. By constructing the multi-modal feature mining network and the change detection network, abnormal states in multi-mode remote sensing images of different time sequences are detected, providing support for studying the development trend of the monitored region.

Description

Multi-mode remote sensing image data detection method and system
Technical Field
The invention relates to the field of remote sensing image processing, in particular to a multi-mode remote sensing image data detection method and system.
Background
Remote sensing change detection technology is widely applied in urban planning, rural land right confirmation, disaster early warning and other fields: monitoring the visual transformation of bare soil into buildings reflects urban development, monitoring the change of other plots into farmland reflects high-standard farmland planning, and monitoring the change of forest plots into bare soil reflects mountain landslides. With the development of high-resolution remote sensing imagery, existing optical satellite data are no longer sufficient for the timeliness required by remote sensing ground monitoring. Faced with massive, multi-modal, high-resolution remote sensing data sets, traditional image interpretation techniques perform poorly; meanwhile, 5G network transmission and the strong computing power of GPUs (Graphics Processing Units) and NPUs (Neural-network Processing Units) meet the requirements of real-time processing and real-time transmission of the data.
Single-mode remote sensing data can provide information attributes from only a single angle. Traditionally, a database is constructed by means of mathematical statistics, knowledge graphs and data mining algorithms, and a change result is obtained from vector differences; combining deep learning with multi-modal remote sensing image data, an interpretation algorithm of higher efficiency, accuracy and robustness urgently needs to be explored. Multi-modal remote sensing data fusion expands the data sources in information dimension and sample quantity, and supplements the data requirements of deep-learning change detection algorithms.
Disclosure of Invention
The present invention provides a method and system for multi-modal remote sensing image data detection that overcomes, or at least partially solves, the above-mentioned problems.
According to a first aspect of the present invention, there is provided a method for detecting multi-modal remote sensing image data, including: inputting multi-mode remote sensing image data of a first time sequence into a trained multi-modal feature mining network, and acquiring a first fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the first time sequence; inputting multi-mode remote sensing image data of a second time sequence into the trained multi-modal feature mining network, and acquiring a second fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the second time sequence, wherein the multi-mode remote sensing image data of the first time sequence and of the second time sequence cover the same region; and inputting the first fusion vector feature and the second fusion vector feature into a trained change detection network, and acquiring a difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence differ.
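As a hedged illustration of this two-step pipeline, the sketch below stands in toy functions for the two trained networks; the function names, the averaging "feature mining" and the thresholded absolute difference are illustrative assumptions for exposition, not the patent's method:

```python
# Illustrative sketch of the claimed pipeline: two time sequences of the same
# region each pass through a shared feature-mining step, and a change-detection
# step compares the two fused vector features.

def mine_fused_feature(modalities):
    """Stand-in for the multi-modal feature mining network.

    `modalities` is a list of per-modality pixel lists; the real network
    outputs a learned fused vector feature, here we just average each modality.
    """
    return [sum(m) / len(m) for m in modalities]

def change_detection(feature_t1, feature_t2, threshold=0.1):
    """Stand-in for the change detection network: 1 = difference, 0 = none."""
    diff = sum(abs(a - b) for a, b in zip(feature_t1, feature_t2))
    return 1 if diff > threshold else 0

# Two time sequences of the same region, three modalities each
# (hyperspectral, high-resolution, Modis), flattened to toy pixel lists.
t1 = [[0.2, 0.4], [0.5, 0.5], [0.9, 0.7]]
t2 = [[0.2, 0.4], [0.5, 0.5], [0.1, 0.1]]  # third modality changed

f1 = mine_fused_feature(t1)
f2 = mine_fused_feature(t2)
print(change_detection(f1, f2))  # 1: the two time sequences differ
```

The point of the sketch is only the data flow: both time sequences go through the same feature-mining step, and only the fused vector features reach the change detector.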
On the basis of the technical scheme, the invention can be improved as follows.
Optionally, the multimode remote sensing image data of the first time sequence includes first hyperspectral satellite data, first high-resolution satellite data and first multi-time sequence Modis data, and the multimode remote sensing image data of the second time sequence includes second hyperspectral satellite data, second high-resolution satellite data and second multi-time sequence Modis data.
Optionally, the multi-modal feature mining network includes three deep convolutional neural networks and a feature engineering structure. Inputting the multi-mode remote sensing image data of the first time sequence into the trained multi-modal feature mining network and acquiring the first fusion vector feature corresponding to the multi-mode remote sensing image data of the first time sequence includes: inputting the first hyperspectral satellite data, the first high-resolution satellite data and the first multi-time-sequence Modis data into the three deep convolutional neural networks respectively, and acquiring the vector feature output by each deep convolutional neural network; and inputting the three vector features into the feature engineering structure, and acquiring the first fusion vector feature output by the feature engineering structure. Inputting the multi-mode remote sensing image data of the second time sequence into the trained multi-modal feature mining network and acquiring the second fusion vector feature corresponding to the multi-mode remote sensing image data of the second time sequence likewise includes: inputting the second hyperspectral satellite data, the second high-resolution satellite data and the second multi-time-sequence Modis data into the three deep convolutional neural networks respectively, and acquiring the vector feature output by each deep convolutional neural network; and inputting the three vector features into the feature engineering structure, and acquiring the second fusion vector feature output by the feature engineering structure.
Optionally, the deep convolutional neural network includes a plurality of network layers, each network layer includes a convolutional layer and a pooling layer, and the plurality of network layers are connected in sequence; the characteristic engineering structure comprises a plurality of network layers, each network layer comprises a convolution layer and a BN layer, and the network layers are connected in sequence, wherein the number of the network layers of the deep convolution neural network is the same as that of the network layers of the characteristic engineering structure.
Optionally, the inputting of the three vector features into the feature engineering structure to obtain the first fusion vector feature output by the feature engineering structure includes: the feature engineering structure performs a fusion calculation on the three vector features through formula (1).
[Formula (1) and several of its symbols are rendered only as images in the original.]
In formula (1), the result is the final fusion vector feature; t_a, t_b and t_c respectively represent the vector features in the three dimensions; three weights respectively represent the weight in each dimension; m is the offset weight; the weighting index is n; and m_n represents an offset.
Optionally, the change detection network is a symmetrical semantic segmentation network including a convolution layer, a pooling layer, a deconvolution layer, and an anti-pooling layer, which are connected in sequence; the method further comprises the following steps: and simultaneously taking the deconvolution result of the last stage and the deconvolution result of the penultimate stage as the input of the anti-pooling layer.
Optionally, the method further includes: training a preset network model based on the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence, and calculating the loss of the preset network model according to the recognition result; and adjusting, based on the loss, the model parameters of the preset network model to obtain an optimal preset network model. The preset network model is composed of the multi-modal feature mining network and the change detection network, and the model parameters of the preset network model at least include the weight of each deep convolutional neural network in each dimension and the offset weights m and m_n.
According to a second aspect of the present invention, there is provided a multi-modal remote sensing image detection system, including: an acquisition module, configured to input the multi-mode remote sensing image data of the first time sequence into the trained multi-modal feature mining network and acquire the first fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the first time sequence, and to input the multi-mode remote sensing image data of the second time sequence into the trained multi-modal feature mining network and acquire the second fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the second time sequence, wherein the multi-mode remote sensing image data of the first time sequence and of the second time sequence cover the same region; and an identification module, configured to input the first fusion vector feature and the second fusion vector feature into the trained change detection network and acquire the difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence differ.
According to a third aspect of the present invention, there is provided an electronic device, comprising a memory and a processor, wherein the processor is configured to implement the steps of the multimodal remote sensing image detection method when executing a computer management program stored in the memory.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium on which a computer management program is stored, the program, when executed by a processor, implementing the steps of the multi-modal remote sensing image data detection method.
The invention provides a multi-mode remote sensing image data detection method and system, in which multi-mode remote sensing image data of different time sequences are respectively input into a multi-modal feature mining network, the fusion vector feature of the multi-mode remote sensing image data of each time sequence is output, the fusion vector features are input into a change detection network, and the change detection network identifies whether the multi-mode remote sensing image data of the different time sequences differ. The invention constructs a multi-modal feature mining network and a change detection network; the multi-modal feature mining network mines and fuses the features of the multi-mode remote sensing image data, meets the diversity, multi-time-sequence and multi-feature requirements of the training data set, and can improve the accuracy of the features mined by the network.
Drawings
Fig. 1 is a flowchart of a method for detecting multimodal remote sensing image data according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a multi-modal feature mining network according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a multi-modal remote sensing image data detection system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a hardware structure of a possible electronic device provided in the present invention;
fig. 5 is a schematic diagram of a hardware structure of a possible computer-readable storage medium according to the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Fig. 1 is a flowchart of a method for detecting multi-modal remote sensing image data according to an embodiment of the present invention. As shown in fig. 1, the method includes the following steps. S1: inputting the multi-mode remote sensing image data of a first time sequence into the trained multi-modal feature mining network, and acquiring the first fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the first time sequence; and inputting the multi-mode remote sensing image data of a second time sequence into the trained multi-modal feature mining network, and acquiring the second fusion vector feature, output by the multi-modal feature mining network, corresponding to the multi-mode remote sensing image data of the second time sequence, wherein the multi-mode remote sensing image data of the first time sequence and of the second time sequence cover the same region. S2: inputting the first fusion vector feature and the second fusion vector feature into the trained change detection network, and acquiring the difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence differ.
It can be understood that, aiming at the defect that the single-mode remote sensing image data can only provide information attributes from a single angle, the embodiment of the invention starts from the multi-mode remote sensing image data, extracts and fuses the characteristics of the multi-mode remote sensing image data, expands the data source from the information dimension and the sample number and supplements the data requirement of the deep learning change detection algorithm.
Multi-modal remote sensing image data fusion is mainly divided into the pixel level, the feature level and the decision level: bottom-layer features are usually mined with a deep convolutional neural network and then superposed as vectors; feature engineering is mainly used to perform dimensionality reduction, sparsification and splicing on the bottom-layer data; and decision-level fusion mainly adopts a multi-layer pyramid pooling structure to compensate for lost elements.
Specifically, multi-mode remote sensing image data of different time sequences in the same region are obtained, the multi-mode remote sensing image data of each time sequence are input into a trained multi-mode feature mining network, and fusion vector features of the remote sensing image data under multiple modes, namely the fusion vector features of the multi-mode remote sensing image data of each time sequence are obtained.
The fusion vector features corresponding to the multi-mode remote sensing image data of the different time sequences are input into the trained change detection network, and the change detection network identifies whether the remote sensing image data of the different time sequences have changed, that is, whether they differ. A difference label may be output: for example, 1 is output if the remote sensing image data of the different time sequences differ, and 0 is output if they do not.
In the embodiment of the invention, the multi-mode remote sensing image data of different time sequences are respectively input into the multi-modal feature mining network, the fusion vector feature of the multi-mode remote sensing image data of each time sequence is output, the fusion vector features are input into the change detection network, and the change detection network identifies whether the multi-mode remote sensing image data of the different time sequences differ. By constructing the multi-modal feature mining network and the change detection network, the multi-modal feature mining network mines and fuses the features of the multi-mode remote sensing image data, the diversity, multi-time-sequence and multi-feature requirements of the training data set are met, and the accuracy of the features mined by the network can be improved.
In a possible embodiment, the first time-series multimodal remote sensing image data includes first hyperspectral satellite data, first high-resolution satellite data and first multi-time-series Modis data, and the second time-series multimodal remote sensing image data includes second hyperspectral satellite data, second high-resolution satellite data and second multi-time-series Modis data.
It can be understood that, in the embodiment of the present invention, the multi-modal remote sensing image data mainly include image data in the three dimensions of time, space and spectrum: hyperspectral satellite data, high-resolution satellite data and multi-time-sequence Modis data. It should be noted that the multi-modal remote sensing image data may include image data of more dimensions and are not limited to three.
In one possible embodiment, the multi-modal feature mining network includes three deep convolutional neural networks and a feature engineering structure. Inputting the multi-mode remote sensing image data of the first time sequence into the trained multi-modal feature mining network and acquiring the first fusion vector feature corresponding to the multi-mode remote sensing image data of the first time sequence includes: inputting the first hyperspectral satellite data, the first high-resolution satellite data and the first multi-time-sequence Modis data into the three deep convolutional neural networks respectively, and acquiring the vector feature output by each deep convolutional neural network; and inputting the three vector features into the feature engineering structure, and acquiring the first fusion vector feature output by the feature engineering structure. Inputting the multi-mode remote sensing image data of the second time sequence into the trained multi-modal feature mining network and acquiring the second fusion vector feature corresponding to the multi-mode remote sensing image data of the second time sequence likewise includes: inputting the second hyperspectral satellite data, the second high-resolution satellite data and the second multi-time-sequence Modis data into the three deep convolutional neural networks respectively, and acquiring the vector feature output by each deep convolutional neural network; and inputting the three vector features into the feature engineering structure, and acquiring the second fusion vector feature output by the feature engineering structure.
It can be understood that the multimodal feature mining network comprises three deep convolutional neural networks and a feature engineering structure, wherein for data of three modalities corresponding to the multimodal remote sensing image data of the first time sequence, namely first hyperspectral satellite data, first high-resolution satellite data and first multi-time sequence Modis data, the three deep convolutional neural networks are respectively input, and each deep convolutional neural network outputs bottom features of the satellite data of the three modalities. Then, the three underlying features are input into the feature engineering structure, the feature engineering structure fuses the three underlying features, and the fused vector feature corresponding to the multimodal remote sensing image data of the first time series is called a first fused vector feature, and the fused vector feature corresponding to the remote sensing image data of the second time series is called a second fused vector feature.
It should be noted that the number of deep convolutional neural networks included in the multi-modal feature mining network is equal to the number of dimensions of the multi-modal image data, for example, if the multi-modal remote sensing image data includes image data with three dimensions, then the multi-modal feature mining network includes three deep convolutional neural networks.
In one possible embodiment, the deep convolutional neural network comprises a plurality of network layers, each network layer comprises a convolutional layer and a pooling layer, and the network layers are connected in sequence; the characteristic engineering structure comprises a plurality of network layers, each network layer comprises a convolution layer and a BN layer, and the network layers are connected in sequence, wherein the number of the network layers of the deep convolution neural network is the same as that of the network layers of the characteristic engineering structure.
Referring to fig. 2, a schematic structural diagram of the multi-modal feature mining network is shown, and as can be seen from fig. 2, the multi-modal feature mining network includes three deep convolutional neural networks and a feature engineering structure. Wherein each deep convolutional neural network comprises a plurality of network layers, and each network layer comprises a convolutional layer and a pooling layer. The feature engineering structure also includes a plurality of network layers, each network layer including a convolutional layer and a BN layer. Wherein the number of network layers of the feature engineering structure is the same as the number of network layers of the deep convolutional neural network.
Taking the multi-mode remote sensing image data of the first time sequence as an example, the first hyperspectral satellite data, the first high-resolution satellite data and the first multi-time-sequence Modis data are respectively input into the first-level network layer of the corresponding deep convolutional neural networks, and the first-level network layer of each deep convolutional neural network outputs a first intermediate result, giving three first intermediate results. The three first intermediate results are input into the first-level network layer of the feature engineering structure; by analogy, the three second intermediate results are input into the second-level network layer of the feature engineering structure, until the three N-th intermediate results are input into the last network layer of the feature engineering structure, which outputs the first fusion vector feature of the multi-modal remote sensing image data of the first time sequence, where N is the number of network layers in each deep convolutional neural network and in the feature engineering structure.
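The stage-by-stage flow described above can be sketched as follows. This is a toy version under stated assumptions: the conv+pooling and conv+BN stages are stand-in arithmetic functions, and the per-stage fusion is a plain element-wise sum rather than the patent's learned layers:

```python
def branch_stage(x, stage):
    """Placeholder for one conv+pooling stage of a deep convolutional branch."""
    return [v * 0.5 + stage for v in x]  # arbitrary toy transform

def fusion_stage(intermediates):
    """Placeholder for one conv+BN stage of the feature engineering structure:
    receives the three branch outputs of the same stage and fuses them."""
    return [sum(vals) for vals in zip(*intermediates)]

def multi_modal_feature_mining(modalities, num_stages=3):
    # One deep convolutional branch per modality; at every stage n, the three
    # intermediate results feed the n-th stage of the feature engineering
    # structure, mirroring the layer-wise wiring described above.
    fused = None
    branches = [list(m) for m in modalities]
    for n in range(num_stages):
        branches = [branch_stage(b, n) for b in branches]
        fused = fusion_stage(branches)
    return fused  # the last stage's fusion is the fusion vector feature

print(multi_modal_feature_mining([[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]))
```

Note that the number of branches equals the number of modalities, and the number of fusion stages equals the number of branch stages N, as the text requires.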
In a possible embodiment, inputting the three vector features into the feature engineering structure and obtaining the first fusion vector feature output by the feature engineering structure includes: the feature engineering structure performs a fusion calculation on the three vector features through formula (1).
[Formula (1) and several of its symbols are rendered only as images in the original.]
In formula (1), the result is the final fusion vector feature; t_a, t_b and t_c respectively represent the vector features in the three dimensions; three weights respectively represent the weight in each dimension; m is the offset weight; the weighting index is n; and m_n represents an offset.
It can be understood that, for each network layer of the feature engineering structure, three intermediate results (which may be understood as three vector features) need to be fused, and specifically, the feature engineering structure may fuse the three vector features through the above formula (1) to obtain a final fused vector feature.
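Since formula (1) itself survives only as an image, the sketch below assumes a simple weighted-sum form consistent with the listed symbols (per-dimension weights, offset weight m, weighting index n, offset m_n); the function name, the default values, and the formula's exact shape are all assumptions, and the patent's actual formula may differ:

```python
def fuse(t_a, t_b, t_c, w=(0.4, 0.3, 0.3), m=0.1, n=1, m_n=0.0):
    """Assumed weighted fusion of three per-dimension vector features.

    w   : assumed per-dimension weights (rendered as images in the patent)
    m   : offset weight, raised to the assumed weighting index n
    m_n : offset added to each fused component
    """
    wa, wb, wc = w
    return [wa * a + wb * b + wc * c + m ** n + m_n
            for a, b, c in zip(t_a, t_b, t_c)]

print(fuse([1.0, 2.0], [1.0, 0.0], [0.0, 1.0]))
```

Whatever its exact form, the operation consumes three equally-shaped intermediate results and emits one vector of the same shape, which is what each network layer of the feature engineering structure needs.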
In a possible embodiment, the change detection network is a symmetrical semantic segmentation network comprising a convolutional layer, a pooling layer, a deconvolution layer and an anti-pooling layer, which are connected in sequence; the method further comprises the following steps: and simultaneously taking the deconvolution result of the last stage and the deconvolution result of the penultimate stage as the input of the anti-pooling layer.
It can be understood that the change detection network builds a symmetrical semantic segmentation network from the convolution layer, the pooling layer, the deconvolution layer and the anti-pooling layer, and modifies the multi-layer deconvolution structure on the right side following the idea of the FCNN: a deconvolution layer is added to the original deconvolution structure, and the deconvolution results of the last stage and of the penultimate stage are together taken as the input of the next network layer, which compensates for the loss of features to a certain extent.
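A hedged sketch of this modified right-hand side: the deconvolution stages are stand-in value-repeating upsamplers, and the last and penultimate deconvolution results are combined element-wise before the anti-pooling step (the element-wise addition, the alignment by one extra upsampling, and all names are illustrative assumptions; the patent only states that both results are used as input):

```python
def deconv_stage(x):
    """Stand-in for one deconvolution stage (upsample by repeating values)."""
    out = []
    for v in x:
        out.extend([v, v])
    return out

def change_detection_decoder(encoded, num_stages=3):
    # Run the deconvolution stages, remembering each stage's result.
    results = []
    x = list(encoded)
    for _ in range(num_stages):
        x = deconv_stage(x)
        results.append(x)
    last, penultimate = results[-1], results[-2]
    # The penultimate result is shorter; upsample it once more so the two
    # can be combined element-wise before anti-pooling (assumed combination).
    aligned = deconv_stage(penultimate)
    return [a + b for a, b in zip(last, aligned)]

print(change_detection_decoder([1.0, 2.0]))
```

The design intent mirrored here is that the next layer sees information from two decoding depths at once, rather than from the last stage alone.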
The convolution operation and the deconvolution operation in the symmetrical semantic segmentation network are as follows:
x_i^n = f( Σ_{j ∈ M_i} x_j^{n-1} * k_{ji}^n + b_i^n )
wherein M_i represents the set of input feature maps, j indexes the input feature maps, x_j^{n-1} represents the j-th feature map input from layer (n-1), k_{ji}^n and b_i^n respectively represent the convolution kernel and the offset, f is the activation function, and x_i^n represents the i-th feature map in layer n. The output of the symmetrical semantic segmentation network is:
Out = In + 2m - k + 1
wherein In is the size of the input feature map, k is the size of the convolution kernel, m is the fill step size, and Out is the size of the output feature map.
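Assuming the common stride-1 size relation Out = In + 2m - k + 1 (the patent's own output formula is rendered only as an image, so this exact form is an assumption), the size bookkeeping can be checked quickly:

```python
def output_size(in_size, k, m):
    """Assumed stride-1 output size: input size, kernel size k, padding m."""
    return in_size + 2 * m - k + 1

# A 3x3 convolution with padding 1 preserves the spatial size, which is the
# usual choice in symmetric encoder-decoder segmentation networks, where the
# decoder must restore the encoder's resolution exactly.
print(output_size(256, k=3, m=1))  # 256
```

Size preservation at each convolution is what lets the last and penultimate deconvolution results be combined without cropping.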
In a possible implementation manner, the method further includes: training a preset network model based on the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence, and calculating the loss of the preset network model according to the recognition result; and adjusting the model parameters of the preset network model based on the loss to obtain an optimal preset network model. The preset network model is composed of the multi-modal feature mining network and the change detection network, and the model parameters of the preset network model at least comprise the weights w_a, w_b and w_c of each deep convolutional neural network in each dimension and the offset weights m and m_n.

It can be understood that the multi-modal feature mining network and the change detection network need to be trained before use. The multi-modal feature mining network and the change detection network can first be trained respectively; the preset network model formed by them is then trained as a whole, the loss of the recognition result is calculated on the training set, and the internal parameters of each network are optimized and adjusted according to the loss to obtain the final preset network model. The model parameters to be adjusted include, but are not limited to, the weights w_a, w_b and w_c of each deep convolutional neural network in each dimension, the offset weights m and m_n, and the model parameters of the change detection network, so as to optimize the preset network model formed by the multi-modal feature mining network and the change detection network.
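As a toy illustration of how the per-dimension weights and offsets enter the model, the sketch below implements one plausible reading of the fusion step: a weighted sum of the three feature vectors plus an offset weight m, raised to an index m_n. The function name and this exact functional form are assumptions for illustration only, since the patent renders the actual fusion formula as an image.

```python
def fuse_features(t_a, t_b, t_c, w_a, w_b, w_c, m, m_n=1.0):
    """Hypothetical multi-modal fusion: weighted sum of the three
    per-dimension feature vectors, shifted by the offset weight m and
    raised to the index m_n (m_n = 1 reduces to a plain weighted sum)."""
    return [(w_a * a + w_b * b + w_c * c + m) ** m_n
            for a, b, c in zip(t_a, t_b, t_c)]


# Three toy feature vectors standing in for the three dimension branches.
fused = fuse_features([1.0, 2.0], [0.5, 0.5], [0.0, 1.0],
                      w_a=0.5, w_b=0.3, w_c=0.2, m=0.1)
# fused ≈ [0.75, 1.45]
```

During training, w_a, w_b, w_c, m and m_n would be the trainable parameters adjusted against the loss, alongside the internal parameters of the change detection network.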
The multi-mode remote sensing image data of different time sequences in the same region are identified based on the trained preset network model, and a classification result indicating whether the multi-mode remote sensing image data of the different time sequences differ is output, so that the regions with differences can be identified from the multi-mode remote sensing image data of the different time sequences.
In order to verify the performance of the multi-mode remote sensing image data detection method provided by the invention, remote sensing image data of three modes for the same region, namely GF2 satellite data, Sentinel-2 satellite data and Modis data, were selected for experimental verification, and the experimental parameters were set as follows:
TABLE 1 Deep convolutional neural network parameters

[table reproduced as an image in the original]

TABLE 2 Change detection network parameters

[table reproduced as an image in the original]
Assuming there are (k+1) classes (including k object classes and 1 background class), let i denote the correct class and j an incorrect class, and let $p_{ij}$ represent the total number of pixel points that belong to class i but are predicted as class j. Then:

$p_{ii}$ represents TP (True Positive): the model prediction is a positive example and the true value is a positive example.

$p_{ij}$ represents FP (False Positive): the model prediction is a positive example and the true value is a negative example.

$p_{ji}$ represents FN (False Negative): the model prediction is a negative example and the true value is a positive example.

$p_{jj}$ represents TN (True Negative): the model prediction is a negative example and the true value is a negative example.

PA (Pixel Accuracy): the simplest measure in the image field; precision is measured by the proportion of correctly predicted pixels among the total pixels:

$$PA = \frac{\sum_{i=0}^{k} p_{ii}}{\sum_{i=0}^{k} \sum_{j=0}^{k} p_{ij}}$$

MPA (Mean Pixel Accuracy): an improvement on pixel accuracy; the proportion of correctly classified pixel points is calculated for each class separately, and the pixel accuracies of all classes are then averaged:

$$MPA = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij}}$$

WIoU (Weighted Intersection over Union): an improvement on MIoU; since each class appears in the image with a different frequency, this measure uses the class frequency as the weight:

$$WIoU = \frac{1}{\sum_{i=0}^{k}\sum_{j=0}^{k} p_{ij}} \sum_{i=0}^{k} \Big(\sum_{j=0}^{k} p_{ij}\Big) \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$
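Given a confusion matrix cm where cm[i][j] counts the pixels of true class i predicted as class j, the three metrics can be computed as below. This follows the standard definitions of PA, MPA and frequency-weighted IoU; since the patent shows the formulas only as images, the exact variants used in the experiments are assumed.

```python
def pixel_metrics(cm):
    """Compute PA, MPA and frequency-weighted IoU from a confusion matrix.

    cm[i][j] = number of pixels whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(n)]
    true_px = [sum(cm[i]) for i in range(n)]                       # pixels of true class i
    pred_px = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # pixels predicted as j
    pa = sum(diag) / total
    mpa = sum(d / t for d, t in zip(diag, true_px) if t) / n
    # Per-class IoU weighted by the frequency of each class in the image.
    wiou = sum((true_px[i] / total) * diag[i] / (true_px[i] + pred_px[i] - diag[i])
               for i in range(n) if true_px[i])
    return pa, mpa, wiou


# Toy 2-class (changed / unchanged) confusion matrix.
pa, mpa, wiou = pixel_metrics([[3, 1],
                               [1, 5]])
```

For change detection the matrix is typically 2x2 (changed vs. unchanged), but the same function handles the (k+1)-class case directly.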
The model evaluation using the above parameters is shown in Table 3:

TABLE 3 Comparison of the results

[table reproduced as an image in the original]
The experimental results show that, compared with the original labels, the method achieves a remarkable effect; moreover, the overall metrics in the model evaluation improve by about 5%, indicating that the method can improve the change detection effect on multi-time-sequence remote sensing data.
Fig. 3 is a system for detecting multimodal remote sensing image data according to an embodiment of the present invention, where the system includes an obtaining module 31 and an identifying module 32, where:
The obtaining module 31 is configured to input the multimodal remote sensing image data of the first time sequence into the trained multimodal feature mining network, and obtain a first fusion vector feature corresponding to the multimodal remote sensing image data of the first time sequence output by the multimodal feature mining network; and to input multi-mode remote sensing image data of a second time sequence into the trained multi-mode feature mining network, and acquire second fusion vector features corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network, wherein the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence are in the same region. The identification module 32 is configured to input the first fused vector feature and the second fused vector feature into a trained change detection network, and obtain a difference result identified by the change detection network, where the difference result represents whether the multi-modal remote sensing image data of the first time sequence and the multi-modal remote sensing image data of the second time sequence have a difference.
It can be understood that the multimodal remote sensing image data detection system provided by the embodiment of the present invention corresponds to the multimodal remote sensing image data detection method provided by each of the foregoing embodiments, and the relevant technical features of the multimodal remote sensing image data detection system may refer to the relevant technical features of the multimodal remote sensing image data detection method, and are not described herein again.
Referring to fig. 4, fig. 4 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the invention. As shown in fig. 4, an embodiment of the present invention provides an electronic device 400, which includes a memory 410, a processor 420, and a computer program 411 stored in the memory 410 and executable on the processor 420. When the processor 420 executes the computer program 411, the following steps are implemented: inputting multi-mode remote sensing image data of a first time sequence into a trained multi-mode feature mining network, and acquiring first fusion vector features corresponding to the multi-mode remote sensing image data of the first time sequence output by the multi-mode feature mining network; inputting multi-mode remote sensing image data of a second time sequence into the trained multi-mode feature mining network, and acquiring second fusion vector features corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network, wherein the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence are in the same region; and inputting the first fusion vector feature and the second fusion vector feature into a trained change detection network, and obtaining a difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence have a difference or not.
Referring to fig. 5, fig. 5 is a schematic diagram of an embodiment of a computer-readable storage medium according to the present invention. As shown in fig. 5, the present embodiment provides a computer-readable storage medium 500 having a computer program 511 stored thereon. When executed by a processor, the computer program 511 implements the following steps: inputting multi-mode remote sensing image data of a first time sequence into a trained multi-mode feature mining network, and acquiring first fusion vector features corresponding to the multi-mode remote sensing image data of the first time sequence output by the multi-mode feature mining network; inputting multi-mode remote sensing image data of a second time sequence into the trained multi-mode feature mining network, and acquiring second fusion vector features corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network, wherein the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence are in the same region; and inputting the first fusion vector feature and the second fusion vector feature into a trained change detection network, and obtaining a difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence have a difference or not.
The method and the system for detecting the multi-mode remote sensing image data have the following beneficial effects that:
(1) A multi-mode feature mining network is provided. By combining the remote sensing data features of the three dimensions of time-space-spectrum, a dedicated deep convolutional neural network is constructed for each dimension to mine its features, so that the inherent characteristics of hyperspectral satellite data, high-resolution satellite data and multi-timing-sequence Modis data are fully exploited, meeting the requirements of diversity, multiple time sequences and multiple features of the training data set.
(2) Multi-mode feature engineering is constructed. By combining features of different scales across the three dimensions of time-space-spectrum, the loss of necessary information in the feature mining process is prevented, the multi-scale structure of the deep learning data set is made up for, and detection with timeliness, high resolution and multi-scale information is realized.
(3) Compared with the traditional change detection network, the concept of the FCNN is introduced and the multilayer deconvolution structure on the right side is modified: the last-stage deconvolution result and the penultimate-stage deconvolution result are used simultaneously as the input of the upsampling layer, which compensates for the loss of features to a certain extent.
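The reuse of the last two deconvolution stages in (3) can be sketched as follows: the penultimate-stage result is upsampled to the last stage's resolution and merged element-wise before being passed on. Nearest-neighbour upsampling and addition are illustrative choices here; the patent does not specify how the two results are combined.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out


def combine_deconv_stages(penultimate, last):
    """Merge the penultimate- and last-stage deconvolution results:
    upsample the coarser map and add the two element-wise."""
    up = upsample2x(penultimate)
    return [[a + b for a, b in zip(r_up, r_last)]
            for r_up, r_last in zip(up, last)]


coarse = [[1, 2],
          [3, 4]]
fine = [[0, 0, 0, 0] for _ in range(4)]
merged = combine_deconv_stages(coarse, fine)
# merged == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Feeding both stages forward in this way is what lets the coarser deconvolution result reinforce details that would otherwise be lost in the final upsampling step.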
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A multi-mode remote sensing image data detection method is characterized by comprising the following steps:
inputting multi-mode remote sensing image data of a first time sequence into a trained multi-mode feature mining network, and acquiring first fusion vector features corresponding to the multi-mode remote sensing image data of the first time sequence output by the multi-mode feature mining network; inputting multi-mode remote sensing image data of a second time sequence into the trained multi-mode feature mining network, and acquiring second fusion vector features corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network, wherein the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence are in the same region;
inputting the first fusion vector feature and the second fusion vector feature into a trained change detection network, and obtaining a difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence have difference or not.
2. The detection method according to claim 1, wherein the first time-series multimodal remote sensing image data comprises first hyperspectral satellite data, first high-resolution satellite data and first multi-timing-sequence Modis data, and the second time-series multimodal remote sensing image data comprises second hyperspectral satellite data, second high-resolution satellite data and second multi-timing-sequence Modis data.
3. The detection method according to claim 2, wherein the multi-modal feature mining network comprises three deep convolutional neural networks and one feature engineering structure;
the method for inputting the multi-mode remote sensing image data of the first time sequence into the trained multi-mode feature mining network to obtain the first fusion vector feature corresponding to the multi-mode remote sensing image data of the first time sequence output by the multi-mode feature mining network comprises the following steps:
inputting first hyperspectral satellite data, first high-resolution satellite data and first multi-timing-sequence Modis data into three deep convolutional neural networks respectively to obtain vector features output by each deep convolutional neural network;
inputting the three vector characteristics into the characteristic engineering structure to obtain a first fusion vector characteristic output by the characteristic engineering structure;
the method for inputting the multi-mode remote sensing image data of the second time sequence into the trained multi-mode feature mining network to obtain the second fusion vector feature corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network comprises the following steps:
inputting second hyperspectral satellite data, second high-resolution satellite data and second multi-timing-sequence Modis data into the three deep convolutional neural networks respectively to obtain the vector features output by each deep convolutional neural network;
And inputting the three vector characteristics into the characteristic engineering structure, and acquiring second fusion vector characteristics output by the characteristic engineering structure.
4. The detection method according to claim 3, wherein the deep convolutional neural network comprises a plurality of network layers, each network layer comprising a convolutional layer and a pooling layer, the plurality of network layers being connected in sequence; the characteristic engineering structure comprises a plurality of network layers, each network layer comprises a convolution layer and a BN layer, and the network layers are connected in sequence, wherein the number of the network layers of the deep convolution neural network is the same as that of the network layers of the characteristic engineering structure.
5. The detection method according to claim 3 or 4, wherein the inputting three vector features into the feature engineering structure and obtaining a first fused vector feature output by the feature engineering structure comprises:

the feature engineering structure performs fusion calculation on the three vector features through the following formula:

[fusion formula reproduced as an image in the original]

wherein t_a, t_b and t_c respectively represent the vector features in the three dimensions, w_a, w_b and w_c respectively represent the weights in each dimension, m is the offset weight, m_n represents an offset serving as the weighting index, and the result is the final fused vector feature.
6. The detection method according to claim 5, wherein the change detection network is a symmetric semantic segmentation network comprising a convolutional layer, a pooling layer, an anti-convolutional layer, and an anti-pooling layer connected in sequence; the method further comprises the following steps:
and simultaneously taking the deconvolution result of the last stage and the deconvolution result of the penultimate stage as the input of the anti-pooling layer.
7. The detection method according to claim 6, further comprising:
training a preset network model based on the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence, and calculating the loss of the preset network model according to the recognition result output by the preset network model;
based on the loss, adjusting the model parameters of the preset network model to obtain an optimal preset network model;
the preset network model is composed of the multi-modal feature mining network and the change detection network, and the model parameters of the preset network model at least comprise the weights w_a, w_b and w_c of each deep convolutional neural network in each dimension and the offset weights m and m_n.
8. A multi-modal remote sensing image data detection system, comprising:
the acquisition module is used for inputting the multi-mode remote sensing image data of the first time sequence into the trained multi-mode feature mining network and acquiring first fusion vector features corresponding to the multi-mode remote sensing image data of the first time sequence output by the multi-mode feature mining network; inputting multi-mode remote sensing image data of a second time sequence into the trained multi-mode feature mining network, and acquiring second fusion vector features corresponding to the multi-mode remote sensing image data of the second time sequence output by the multi-mode feature mining network, wherein the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence are in the same region;
and the identification module is used for inputting the first fusion vector features and the second fusion vector features into a trained change detection network, and acquiring a difference result identified by the change detection network, wherein the difference result represents whether the multi-mode remote sensing image data of the first time sequence and the multi-mode remote sensing image data of the second time sequence have a difference or not.
9. An electronic device, comprising a memory and a processor, wherein the processor is configured to execute a computer management program stored in the memory to implement the steps of the method according to any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer management-like program, which when executed by a processor, performs the steps of the method of detecting multimodal remote sensing image data according to any one of claims 1 to 7.
CN202110408562.XA 2021-04-16 2021-04-16 Multi-mode remote sensing image data detection method and system Active CN112818966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110408562.XA CN112818966B (en) 2021-04-16 2021-04-16 Multi-mode remote sensing image data detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110408562.XA CN112818966B (en) 2021-04-16 2021-04-16 Multi-mode remote sensing image data detection method and system

Publications (2)

Publication Number Publication Date
CN112818966A true CN112818966A (en) 2021-05-18
CN112818966B CN112818966B (en) 2021-07-30

Family

ID=75863608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110408562.XA Active CN112818966B (en) 2021-04-16 2021-04-16 Multi-mode remote sensing image data detection method and system

Country Status (1)

Country Link
CN (1) CN112818966B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033714A (en) * 2021-05-24 2021-06-25 华中师范大学 Object-oriented automatic machine learning method and system for multi-mode multi-granularity remote sensing image
CN115205689A (en) * 2022-09-14 2022-10-18 北京数慧时空信息技术有限公司 Improved unsupervised remote sensing image anomaly detection method
CN115272848A (en) * 2022-07-18 2022-11-01 西南交通大学 Intelligent change detection method for buildings in multi-cloud and multi-fog farmland protection area
CN116385841A (en) * 2023-02-28 2023-07-04 南京航空航天大学 Multi-mode ground object target identification method based on knowledge graph
CN116664989A (en) * 2023-07-28 2023-08-29 四川发展环境科学技术研究院有限公司 Data analysis method and system based on intelligent environmental element recognition monitoring system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257123A (en) * 2018-01-11 2018-07-06 西京学院 Multiband radar image change detection method based on higher order statistical theory
CN108596108A (en) * 2018-04-26 2018-09-28 中国科学院电子学研究所 Method for detecting change of remote sensing image of taking photo by plane based on the study of triple semantic relation
CN109255317A (en) * 2018-08-31 2019-01-22 西北工业大学 A kind of Aerial Images difference detecting method based on dual network
CN110263705A (en) * 2019-06-19 2019-09-20 上海交通大学 Towards two phase of remote sensing technology field high-resolution remote sensing image change detecting method
CN111161218A (en) * 2019-12-10 2020-05-15 核工业北京地质研究院 High-resolution remote sensing image change detection method based on twin convolutional neural network
CN112488025A (en) * 2020-12-10 2021-03-12 武汉大学 Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257123A (en) * 2018-01-11 2018-07-06 西京学院 Multiband radar image change detection method based on higher order statistical theory
CN108596108A (en) * 2018-04-26 2018-09-28 中国科学院电子学研究所 Method for detecting change of remote sensing image of taking photo by plane based on the study of triple semantic relation
CN109255317A (en) * 2018-08-31 2019-01-22 西北工业大学 A kind of Aerial Images difference detecting method based on dual network
CN110263705A (en) * 2019-06-19 2019-09-20 上海交通大学 Towards two phase of remote sensing technology field high-resolution remote sensing image change detecting method
CN111161218A (en) * 2019-12-10 2020-05-15 核工业北京地质研究院 High-resolution remote sensing image change detection method based on twin convolutional neural network
CN112488025A (en) * 2020-12-10 2021-03-12 武汉大学 Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIA LIU et al.: "A Deep Convolutional Coupling Network for Change Detection Based on Heterogeneous Optical and Radar Images", 《IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113033714A (en) * 2021-05-24 2021-06-25 华中师范大学 Object-oriented automatic machine learning method and system for multi-mode multi-granularity remote sensing image
CN115272848A (en) * 2022-07-18 2022-11-01 西南交通大学 Intelligent change detection method for buildings in multi-cloud and multi-fog farmland protection area
CN115272848B (en) * 2022-07-18 2023-04-18 西南交通大学 Intelligent change detection method for buildings in multi-cloud and multi-fog farmland protection area
CN115205689A (en) * 2022-09-14 2022-10-18 北京数慧时空信息技术有限公司 Improved unsupervised remote sensing image anomaly detection method
CN115205689B (en) * 2022-09-14 2022-11-18 北京数慧时空信息技术有限公司 Improved unsupervised remote sensing image anomaly detection method
CN116385841A (en) * 2023-02-28 2023-07-04 南京航空航天大学 Multi-mode ground object target identification method based on knowledge graph
CN116385841B (en) * 2023-02-28 2023-11-21 南京航空航天大学 Multi-mode ground object target identification method based on knowledge graph
CN116664989A (en) * 2023-07-28 2023-08-29 四川发展环境科学技术研究院有限公司 Data analysis method and system based on intelligent environmental element recognition monitoring system
CN116664989B (en) * 2023-07-28 2023-09-29 四川发展环境科学技术研究院有限公司 Data analysis method and system based on intelligent environmental element recognition monitoring system

Also Published As

Publication number Publication date
CN112818966B (en) 2021-07-30

Similar Documents

Publication Publication Date Title
CN112818966B (en) Multi-mode remote sensing image data detection method and system
Makantasis et al. Deep convolutional neural networks for efficient vision based tunnel inspection
CN112766244B (en) Target object detection method and device, computer equipment and storage medium
Fan et al. Graph attention layer evolves semantic segmentation for road pothole detection: A benchmark and algorithms
CN111723732B (en) Optical remote sensing image change detection method, storage medium and computing equipment
EP3819790A2 (en) Method and apparatus for visual question answering, computer device and medium
Wu et al. Enhanced Precision in Dam Crack Width Measurement: Leveraging Advanced Lightweight Network Identification for Pixel‐Level Accuracy
CN112434721A (en) Image classification method, system, storage medium and terminal based on small sample learning
CN111507222B (en) Three-dimensional object detection frame based on multisource data knowledge migration
CN111881804B (en) Posture estimation model training method, system, medium and terminal based on joint training
US11410327B2 (en) Location determination apparatus, location determination method and computer program
CN115439694A (en) High-precision point cloud completion method and device based on deep learning
CN112446870B (en) Pipeline damage detection method, device, equipment and storage medium
CN114494260A (en) Object defect detection method and device, computer equipment and storage medium
Eftekhari et al. Building change detection using the parallel spatial-channel attention block and edge-guided deep network
Alidoost et al. Knowledge based 3D building model recognition using convolutional neural networks from LiDAR and aerial imageries
Hong et al. CrossFusion net: Deep 3D object detection based on RGB images and point clouds in autonomous driving
Li et al. Automatic road extraction from remote sensing imagery using ensemble learning and postprocessing
Zhu et al. Multi-scale cross-form pyramid network for stereo matching
CN116310850B (en) Remote sensing image target detection method based on improved RetinaNet
CN114359293A (en) Three-dimensional MRI brain tumor segmentation method based on deep learning
CN114419406A (en) Image change detection method, training method, device and computer equipment
Bayramoğlu et al. Performance analysis of rule-based classification and deep learning method for automatic road extraction
CN115018910A (en) Method and device for detecting target in point cloud data and computer readable storage medium
Liu et al. Two‐Stream Boundary‐Aware Neural Network for Concrete Crack Segmentation and Quantification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Multimodal Remote Sensing Image Data Detection Method and System

Effective date of registration: 20231022

Granted publication date: 20210730

Pledgee: China Construction Bank Corporation Wuhan Guanggu Free Trade Zone Branch

Pledgor: WUHAN OPTICS VALLEY INFORMATION TECHNOLOGY CO.,LTD.

Registration number: Y2023980061788