CN114863235A - Fusion method of heterogeneous remote sensing images - Google Patents

Fusion method of heterogeneous remote sensing images

Info

Publication number
CN114863235A
CN114863235A (application CN202210493488.0A)
Authority
CN
China
Prior art keywords
image
sample
target
radar
image sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210493488.0A
Other languages
Chinese (zh)
Inventor
李刚
张强
王学谦
王志豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202210493488.0A
Publication of CN114863235A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/803Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of input or preprocessed data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)

Abstract

The application provides a fusion method for heterogeneous remote sensing images. It relates to the technical field of image processing and aims to combine the respective advantages of optical images and radar images to generate a fused image. The method comprises the following steps: acquiring an optical image and a radar image; inputting the optical image and the radar image into a homogeneity change image generation model to obtain a fused image, wherein the fused image comprises the content features of the optical image and the stylistic form features of the radar image. The homogeneity change image generation model is a model that has learned to generate an image sample to be iterated based on an optical image sample and a radar image sample, and to iterate that image sample to generate a fused image sample. The work is funded by the National Key Research and Development Program of China, grant 2021YFA0715201.

Description

Fusion method of heterogeneous remote sensing images
Technical Field
The application relates to the technical field of image processing, in particular to a fusion method of heterogeneous remote sensing images.
Background
Remote sensing images include optical images and radar images and refer to image data about ground objects, sea surfaces and the like acquired by spaceborne or airborne imaging sensors. Optical images offer high resolution, rich color information and easy interpretation, but their imaging quality degrades under poor illumination conditions such as night or cloud cover. Radar has the advantages of all-day, all-weather operation and strong penetrability, but the quality of radar images is affected by the clutter level and their interpretability is relatively poor.
How to combine the complementary advantages of an optical image and a radar image, so as to obtain an image whose content information remains complete even under complex terrain conditions and whose transformed style and texture approach the target, is a technical problem that urgently needs to be solved.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a method for fusing heterogeneous remote sensing images, so as to overcome the above problems or at least partially solve the above problems.
The embodiment of the invention provides a fusion method of heterogeneous remote sensing images, which comprises the following steps:
acquiring an optical image and a radar image;
inputting the optical image and the radar image into a homogeneity change image generation model to obtain a fused image, wherein the fused image comprises the content features of the optical image and the stylistic form features of the radar image;
the homogeneity change image generation model is a model which learns to generate an image sample to be iterated based on an optical image sample and a radar image sample and iterates the image sample to be iterated to generate a fusion image sample.
Optionally, the homogeneity change image generation model is obtained by training an initial model, where the initial model includes an image generation module and a first feature extraction module with a fixed weight; the step of training the initial model comprises at least:
inputting the optical image samples and the radar image samples into the initial model;
extracting a first content feature of the optical image sample and a first style form feature of the radar image sample by using the first feature extraction module;
inputting the first content feature and the first style feature into the image generation module, and outputting a first intermediate image sample;
constructing a loss function according to the current input and output of the image generation module;
and updating the model parameters of the initial model based on the loss function to obtain the homogeneity change image generation model.
Optionally, the initial model further includes a second feature extraction module with random weight; the step of training the initial model further comprises:
obtaining the image sample to be iterated generated by the image generation module, wherein the image sample to be iterated is the image sample that minimizes the loss function in the case that the input of the image generation module is the first content feature and the first stylistic form feature;
extracting a second content feature of the optical image sample and a second style form feature of the radar image sample by using the second feature extraction module;
acquiring a second intermediate image sample generated by the image generation module, wherein the second intermediate image sample is the image sample that minimizes the loss function in the case that the input of the image generation module is the second content feature and the second stylistic form feature;
according to the second intermediate image sample, iterating the image sample to be iterated, including:
and under the condition that an iteration termination condition is met, taking the second intermediate image sample as the fusion image sample output by the homogeneity change image generation model.
Optionally, the step of taking the second intermediate image sample as the fused image sample output by the homogeneity-variation image generation model when an iteration termination condition is satisfied includes:
calculating an iteration convergence error between the second intermediate image sample and the image sample to be iterated;
and under the condition that the iterative convergence error is smaller than an iterative convergence error threshold value, or under the condition that the frequency of generating the second intermediate image sample reaches a preset frequency, taking the second intermediate image sample as the fusion image sample output by the homogeneity change image generation model.
Optionally, the iterating the image sample to be iterated according to the second intermediate image sample further includes:
taking the second intermediate image sample as the image sample to be iterated under the condition that the iteration termination condition is not met, and executing the following steps: and extracting a second content characteristic of the optical image sample and a second style characteristic of the radar image sample by using the second characteristic extraction module.
Optionally, in a case that the current input of the image generation module is the first content feature and the first stylistic form feature, the constructing a loss function according to the current input and output of the image generation module includes:
extracting a first content feature and a first style form feature of the first intermediate image sample by using the first feature extraction module;
constructing the loss function from a difference between a first content feature of the first intermediate image sample and a first content feature of the optical image sample, and a difference between a first stylistic form feature of the first intermediate image sample and a first stylistic form feature of the radar image sample.
Optionally, in a case that the current input of the image generation module is the second content feature and the second stylistic form feature, the constructing a loss function according to the current input and output of the image generation module includes:
extracting a second content feature and a second style feature of the second intermediate image sample by using the second feature extraction module;
constructing the loss function based on a difference between a second content feature of the second intermediate image sample and a second content feature of the optical image sample, and a difference between a second stylistic form feature of the second intermediate image sample and a second stylistic form feature of the radar image sample.
Optionally, after the obtaining of the fused image, the method further includes:
acquiring a target radar image, wherein the target radar image is an image which is from the same radar as the radar image and observes the same area as the optical image, and the area comprises a target proposed area;
inputting the fusion image and the target radar image into a classifier to obtain respective target proposed areas of the fusion image and the target radar image;
obtaining a common target proposal area of the fusion image and the target radar image according to the respective target proposal areas of the fusion image and the target radar image;
and fusing the common target proposal area and the target radar image to obtain an image with enhanced target proposal area.
Optionally, the classifier is trained by the following steps:
obtaining a plurality of labeled box samples from the radar image samples and the fused image samples, wherein the labels are used for characterizing whether the box samples are the target proposed area;
obtaining a gradient norm diagram of each of a plurality of box samples;
inputting the gradient norm diagrams of the box samples into an initial classifier to obtain a classification prediction result of whether the box samples are target proposed areas or not;
and updating the parameters of the initial classifier according to the labels and classification prediction results of the box samples to obtain the classifier.
Optionally, the fusing the common proposed target area and the target radar image to obtain an image with an enhanced proposed target area includes:
calculating a joint probability density function of background clutter in the target radar image and the fused image based on a Copula model, wherein the joint probability density function represents the common background clutter distribution of the target radar image and the fused image;
calculating a background clutter suppression matrix of the target radar image and the fusion image according to the joint probability density function;
acquiring the similarity of the target radar image and the fusion image;
and obtaining an image enhanced by the target proposal area according to the common target proposal area, the target radar image, the background clutter suppression matrix and the similarity.
The embodiment of the invention has the following advantages:
in this embodiment, the homogeneous change image generation model may directly generate a fusion image according to the optical image and the radar image, which is relatively fast. The fused image comprises the content characteristics of the optical image and the style form characteristics of the radar image, and the advantages of the optical image and the advantages of the radar image are integrated, so that the fused image can still ensure that the integrity of content information is not lost even under the condition of complex terrain, and the style texture approaches to the radar image. The homogeneity change image generation model further learns that the image sample to be iterated is iterated to generate a fusion image sample on the basis of learning to generate the image sample to be iterated based on the optical image sample and the radar image sample, so that the homogeneity of a fusion image generated by the homogeneity change image generation model in a new content characteristic and style characteristic subspace can be further improved, and the homogeneity change image generation model is more accurate.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments of the present application will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is a flow chart illustrating the steps of a method for fusing heterogeneous remote sensing images according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a first feature extraction module according to an embodiment of the present invention;
FIG. 3 is a flow diagram illustrating the loss function construction and homogeneous transformation feature fusion in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a process of image iteration in an embodiment of the invention;
FIG. 5 is a training process and a testing process for target enhancement in an embodiment of the present invention;
FIG. 6 is a flowchart of a heterogeneous remote sensing image fusion method based on homogeneous transformation and target enhancement in the embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Referring to fig. 1, a flowchart illustrating steps of a fusion method of heterogeneous remote sensing images according to an embodiment of the present invention is shown, and as shown in fig. 1, the fusion method of heterogeneous remote sensing images may specifically include the following steps:
step S11: acquiring an optical image and a radar image;
step S12: inputting the optical image and the radar image into a homomorphic image generation model to obtain a fusion image, wherein the fusion image comprises the content characteristics of the optical image and the style and form characteristics of the radar image; the homogeneity change image generation model is a model which learns to generate an image sample to be iterated based on an optical image sample and a radar image sample and iterates the image sample to be iterated to generate a fusion image sample.
The optical image is an image obtained by adopting an optical photographic system, and can be a visible light black and white full-color photo, a color infrared photo, a multiband photo, a thermal infrared photo and the like obtained by aerial remote sensing. The radar image refers to a Synthetic Aperture Radar (SAR) image generated by a SAR system, which can be installed on a flying platform such as an airplane, a satellite, a spacecraft, or the like.
The homogeneity-variation image generation model can generate a fusion image according to an input optical image and a radar image, wherein the radar image is used for providing style form characteristics, so that the radar image and the optical image are not required to observe the same area, and the radar image and the optical image are not required to be registered. The enhancement of the target proposed area is realized based on the target radar image and the fusion image as will be described later; to achieve enhancement of the proposed area of interest, the radar image from which the fused image is generated should be from the same radar as the target radar image.
Compared with methods that generate the homogeneous transformation image from a registered optical image and radar image, generating it from an unregistered optical image and radar image has wider application scenarios, higher speed and higher practicability. For example, when observing a scene for target recognition or change detection, suppose the optical image is obtained first; if a radar image registered with the optical image had to be acquired, a long wait might be needed, so the homogeneous transformation would take a long time. With the present technical scheme, the optical image and the radar image do not need to be registered: the homogeneous transformation can be completed directly based on homogeneous data of the radar sensor, providing more timely data for subsequent target enhancement so that subsequent tasks can be carried out quickly. Therefore, obtaining the fused image by homogeneous transformation of an unregistered optical image and radar image has wider application scenarios and higher speed.
The optical image may be affected by shadows, unbalanced illumination conditions and the like; to generate a better fused image, the optical image may be preprocessed and the radar image and the preprocessed optical image input into the homogeneity change image generation model. Optionally, the optical image may be preprocessed with the Sobel operator to extract an edge profile of the feature information of the optical image, 20% of the original information is added back, and Gaussian noise with a normalized mean of 0.15-0.25 and a normalized variance of 0.01 is then added. Preprocessing the optical image overcomes the influence of shadows, unbalanced illumination and the like while retaining the basic semantic information of the image.
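As a rough illustration, this preprocessing might be sketched as follows; the function name, the normalization and the exact way the edge profile, the 20% of original information and the noise are combined are assumptions, while the Sobel operator and the noise statistics come from the text above.

```python
import numpy as np
from scipy import ndimage

def preprocess_optical(img, keep_ratio=0.2, noise_mean=0.2, noise_var=0.01, seed=0):
    """Sketch of the optical-image preprocessing: Sobel edge profile,
    plus a fraction of the original image, plus Gaussian noise."""
    img = img.astype(np.float64)
    img = (img - img.min()) / (np.ptp(img) + 1e-12)   # normalize to [0, 1]
    gx = ndimage.sobel(img, axis=1)                   # horizontal Sobel response
    gy = ndimage.sobel(img, axis=0)                   # vertical Sobel response
    edges = np.hypot(gx, gy)
    edges /= edges.max() + 1e-12
    rng = np.random.default_rng(seed)
    noise = rng.normal(noise_mean, np.sqrt(noise_var), size=img.shape)
    return np.clip(edges + keep_ratio * img + noise, 0.0, 1.0)
```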
The optical image and the radar image are input into the homogeneity change image generation model, which extracts the content features of the optical image and the stylistic form features of the radar image and then performs a homogeneous space transformation on them to generate the fused image. The content features of the fused image are close to those of the optical image, and its stylistic form features are close to those of the radar image. The fused image preserves the integrity of the content semantic information of the image; even in a complex terrain environment it can describe the image content and stylistic form information of the remote sensing image, from concrete to abstract, at every scale, so the homogeneous space transformation of heterogeneous remote sensing images can be carried out fully and accurately under complex terrain conditions with various ground objects.
The homogeneity change image generation model is a model that learns, from optical image samples and radar image samples, how to generate an image sample to be iterated and how to optimize it through iteration. The training of the model is described in detail below.
By adopting the technical scheme of this embodiment of the application, the homogeneity change image generation model can directly generate the fused image from the optical image and the radar image, which is relatively fast. The fused image contains the content features of the optical image and the stylistic form features of the radar image, combining the advantages of both, so that even under complex terrain conditions the integrity of the content information is preserved while the style and texture approach those of the radar image. On top of learning to generate an image sample to be iterated based on an optical image sample and a radar image sample, the model further learns to iterate that sample into a fused image sample, which further improves the homogeneity of the generated fused image in the new content-feature and stylistic-feature subspace and makes the model more accurate.
The homogeneity variation image generation model is obtained by training an initial model, and the initial model comprises an image generation module and a first feature extraction module with fixed weight. The initial model comprises an image generation module, wherein the weight of the image generation module is to be trained, and the image generation module is used for performing homogeneous space transformation on input content characteristics and style characteristics to generate an image/image sample. Through training, the content features of the images/image samples output by the image generation module are more and more close to the content features of the input image generation module, and the stylistic form features are also more and more close to the stylistic form features of the input image generation module. It can be understood that the homogeneity variation image generation model also has an image generation module, and the weight of the image generation module in the homogeneity variation image generation model is trained.
The first feature extraction module with fixed weight can be a convolutional neural network with a multi-level structure constructed based on VGGnet (a deep convolutional neural network), and content features of the optical image and stylistic form features of the radar image can be extracted from the convolutional neural network. It is understood that the first content feature refers to a content feature extracted from the first feature extraction module with fixed weight, and the first stylistic feature refers to a stylistic feature extracted from the first feature extraction module with fixed weight.
Fig. 2 shows a schematic structural diagram of the first feature extraction module, which may include a plurality of convolutional layers, activation layers and pooling layers and can effectively extract image features under complex terrain conditions. Unlike VGGnet, however, the first feature extraction module has the following three characteristics: there is no fully connected layer, the weights of the whole network are fixed, and the output of the network is taken at selected network layers.
A fully connected layer outputs global information of the image but loses local position information. Under complex terrain conditions with various ground objects, each position of the image contains different ground objects, so the position information is what links the homogeneous space transformation mapping of locally identical ground objects between heterogeneous images. Because the function of a fully connected layer does not meet the requirements of the homogeneous space transformation, the first feature extraction module does not have a fully connected layer.
The weights of the first feature extraction module are based on the trained VGGnet. The network of the first feature extraction module does not undergo traditional training or fine-tuning of its parameter values; rather, like an ordinary forward pass, it outputs features extracted under complex terrain conditions using the already trained and fine-tuned network.
The output of the first feature extraction module is not the output of the terminal of the VGGnet feed-forward network but is taken at selected layers of the convolutional neural network. In this way, multi-scale, concrete-to-abstract homogeneous space transformation features of the remote sensing image can be extracted under complex terrain conditions with various ground objects.
As shown in fig. 2, according to the output characteristics of different layers of the constructed convolutional neural network, the content characteristics and the style and form characteristics of the corresponding network layer separation output image can be selected.
Alternatively, the deeper or deepest convolution layer close to the ground feature scale in the VGGnet network can be directly output as the content feature, because the content semantic information of the ground feature in the remote sensing image is generally one of the most abstract features under the scale of the specified ground feature. For convenience, the content of the image is characterized as Fc.
Alternatively, pooling layers at various scales of VGGnet may be utilized and style-style features extracted based on the gram texture operator. The stylistic form of the image is described by the textural features of the image at various scales. The VGGnet network containing a plurality of scales can effectively cover texture information of each scale of the radar image. Meanwhile, the texture features are suitable for extraction by mean operation of the VGGnet pooling layer. And finally, converting the pooled output of each scale into stylistic form feature representation by utilizing a gram texture operator. For convenience, the stylistic features of an image are denoted as Fs.
Based on the content characteristics and the style characteristics extracted by the first characteristic extraction module, the homogeneous space transformation characteristics which are more concrete and have smaller scale can be covered, and more abstract and larger scale characteristics can be extracted through a multi-level network structure. In addition, the content characteristics and the style characteristics in the homogeneous space transformation characteristics are separated, the integrity of the content semantic information of the image is ensured, and the accuracy of transforming and synthesizing the image is improved. In conclusion, even in a complex terrain environment, the extracted content features and the extracted stylistic form features can describe concrete to abstract image content and stylistic form information of each scale of the remote sensing image, and heterogeneous remote sensing image homogeneous space transformation can be sufficiently and accurately performed under the complex terrain conditions of various ground features.
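A minimal PyTorch sketch of such a fixed-weight extractor is given below, assuming a torchvision VGG-19 backbone; the specific layer indices used for the content feature Fc and the Gram-based stylistic features Fs are illustrative choices, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

class FixedFeatureExtractor(nn.Module):
    """Sketch of the first (fixed-weight) feature extraction module: VGG-19 conv
    stack, no fully connected layers, frozen weights, outputs at selected layers."""
    def __init__(self, content_layer=29, style_layers=(4, 9, 18, 27, 36)):
        super().__init__()
        self.features = vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in self.features.parameters():
            p.requires_grad_(False)                 # fixed weights
        self.content_layer = content_layer          # a deep conv/ReLU layer -> Fc
        self.style_layers = set(style_layers)       # pooling layers -> Fs (via Gram)

    @staticmethod
    def gram(feat):
        b, c, h, w = feat.shape
        f = feat.view(b, c, h * w)
        return f @ f.transpose(1, 2) / (c * h * w)  # gram texture operator

    def forward(self, x):
        content, styles = None, []
        for i, layer in enumerate(self.features):
            x = layer(x)
            if i == self.content_layer:
                content = x
            if i in self.style_layers:
                styles.append(self.gram(x))
        return content, styles
```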
On the basis of the technical scheme, in order to train the initial model, an optical image sample and a radar image sample are firstly acquired. The optical image sample and the radar image sample do not need to observe the same region and do not need to be registered.
The optical image sample and the radar image sample are input into an initial model, and a first feature extraction module in the initial model can extract a first content feature of the optical image sample and a first stylistic form feature of the radar image sample. The image generation module can fuse the content characteristics and the style characteristics among the heterogeneous images to perform homogeneous space transformation on the heterogeneous remote sensing images. Thus, the first content feature and the first stylistic aspect feature are input to an image generation module, which may generate a first intermediate image sample.
From the current input and output of the image generation module, a loss function can be constructed. Based on the loss function, model parameters of the initial model are updated, and a homogeneity change image generation model can be obtained.
In the case where the current input to the image generation module is the first content feature and the first stylistic feature, constructing the loss function may include: extracting a first content characteristic and a first style form characteristic of the first intermediate image sample by using a first characteristic extraction module; a loss function is constructed from the difference between the first content features of the first intermediate image sample and the first content features of the optical image sample, and the difference between the first stylistic form features of the first intermediate image sample and the first stylistic form features of the radar image sample. The loss function constructed based on the first intermediate image sample can be represented by the following formula:
L(x̂) = λc·‖Fc(x̂) − Fc(x_O)‖² + λs·‖Fs(x̂) − Fs(x_S)‖²
where L(x̂) denotes the loss function constructed based on the first intermediate image sample; x̂ is the first intermediate image sample; x_O is the optical image sample (or the preprocessed optical image sample); x_S is the radar image sample; Fc(·) denotes the first content feature; Fs(·) denotes the first stylistic form feature; and ‖·‖ denotes the 2-norm. λc and λs are preset constants, generally set in the interval 0.01-0.05, which respectively represent the weights of the content features and the stylistic form features in the fusion process. The result of the homogeneous space transformation can be expressed as:
x̂* = argmin_{x̂} L(x̂)
the loss function includes two additive terms, the first additive term being measured by
Figure BDA00036327010300001015
And
Figure BDA00036327010300001016
the distance between content features in between is desirably as small as possible to allow for
Figure BDA00036327010300001017
And
Figure BDA00036327010300001018
as close as possible to the content features in between; the second additive term measures
Figure BDA00036327010300001019
And
Figure BDA00036327010300001020
distance between stylistic features of (a); it is desirable that the distance be as small as possible so that
Figure BDA00036327010300001021
And with
Figure BDA00036327010300001022
The stylistic features in between are as close as possible. Thus, in the process of homogeneous space transformationIn the method, the semantic information of the content of the image is not lost or damaged, and the style form of the first intermediate image sample is ensured to be consistent with the style form of the radar image sample.
By constructing the loss function and minimizing the result of the homogeneous spatial transformation, fusion processing can be performed on the content features and the morphological features of the heterogeneous images respectively. Therefore, even under the complex terrain condition with various terrain features, the integrity of the content information can be still ensured not to be lost in the process of the homogeneous space transformation, and the transformed style texture can approach to the target.
In actual implementation, initialization is required
Figure BDA00036327010300001023
To meet the computational requirements in the loss function. The general initialization image can adopt an input image or a fusion result of fixed network weight, and carries out homogeneous feature transformation fusion to output a first intermediate image sample update
Figure BDA00036327010300001024
FIG. 3 shows a flow diagram of the loss function construction and notification transformation feature fusion.
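Continuing the extractor sketch above, the loss might be computed roughly as follows; λc and λs are the weights from the reconstructed formula, and their default values here are only placeholders within the stated 0.01-0.05 range.

```python
import torch

def homogeneous_loss(extractor, x_hat, x_opt, x_sar, lam_c=0.03, lam_s=0.03):
    """Content distance to the optical sample plus stylistic (Gram) distance
    to the radar sample, as in the loss formula above."""
    c_hat, s_hat = extractor(x_hat)
    c_opt, _ = extractor(x_opt)
    _, s_sar = extractor(x_sar)
    content_term = torch.sum((c_hat - c_opt) ** 2)
    style_term = sum(torch.sum((a - b) ** 2) for a, b in zip(s_hat, s_sar))
    return lam_c * content_term + lam_s * style_term
```

In the patent's training procedure the weights of the image generation module are updated so that its output minimizes this loss; directly optimizing the pixels of x̂ with a gradient-based optimizer would be a simpler stand-in, used here only for illustration.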
It can be understood that the model parameters of the initial model are updated based on the loss function constructed from the first intermediate image sample, and essentially the weights of the image generation module are updated.
On the basis of the above technical scheme, the initial model further includes a second feature extraction module with random weights. The second feature extraction module is similar in structure to the first feature extraction module, comprising a plurality of convolutional layers, activation layers and pooling layers, and can effectively extract image features under complex terrain conditions. The second feature extraction module has the following three characteristics: there is no fully connected layer, the weights of the entire network are randomized, and the output of the network is taken at selected network layers. Weight randomization means that the second feature extraction module first obtains trained weights in the traditional way of training or fine-tuning the network parameters, and then adds Gaussian white noise to the trained weights, so that the network outputs features extracted under complex terrain conditions with various ground objects.
On the basis of training the initial model with the loss function constructed from the first intermediate image sample, training can be continued. The first content feature and the first stylistic form feature are input into the image generation module, and the image sample to be iterated generated by the image generation module is acquired; at this point the image generation module has already been trained with the loss function constructed from the first intermediate image sample. The image sample to be iterated is the image sample that minimizes the loss function when the input of the image generation module is the first content feature of the optical image sample and the first stylistic form feature of the radar image sample.
Extracting a second content characteristic of the optical image sample and a second style characteristic of the radar image sample by using a second characteristic extraction module; the second content feature is a content feature extracted from a second feature extraction module with random weight, and the second stylistic form feature is a stylistic form feature extracted from the second feature extraction module with random weight.
The second content feature and the second stylistic form feature are then input into the image generation module, and the second intermediate image sample generated by the image generation module is acquired. In the case where the current input of the image generation module is the second content feature and the second stylistic form feature, a loss function can be constructed from the current input and output of the image generation module.
Extracting a second content characteristic and a second style characteristic of the second intermediate image sample by using a second characteristic extraction module; and constructing a loss function according to the difference between the second content characteristic of the second intermediate image sample and the second content characteristic of the optical image sample and the difference between the second stylistic form characteristic of the second intermediate image sample and the second stylistic form characteristic of the radar image sample, and updating the weight of the image generation module based on the loss function.
It is understood that in the process of updating the weight values of the image generation module based on the loss function, a plurality of second intermediate image samples are generated according to the second content features of the optical image samples and the second stylistic form features of the radar image samples. Inputting the second content characteristics of the optical image sample and the second style form characteristics of the radar image sample into the updated image generation module, and outputting a second intermediate image sample as follows: in case the input to the image generation module is a second content characteristic of the optical image sample and a second stylistic characteristic of the radar image sample, the image sample with the smallest loss function is used.
And iterating the image samples to be iterated according to the second intermediate image sample which minimizes the loss function. Specifically, in the case where the iteration end condition is satisfied, the second intermediate image sample that minimizes the loss function is taken as a fused image sample that is the output of the homogeneity-change image generation model.
The iteration termination condition may be that the iteration convergence error between the second intermediate image sample that minimizes the loss function and the image sample to be iterated is smaller than the iteration convergence error threshold, or that the number of iterations reaches a preset number. The iteration convergence error can be calculated as
e_K = ‖x̂_{K+1} − x̂_K‖
where K denotes the number of iterations, initially K = 0; x̂_{K+1} denotes the second intermediate image sample obtained in the (K+1)-th iteration and x̂_K the one obtained in the K-th iteration. When K = 0, the sample obtained is the image sample to be iterated, generated from the first content features of the optical image sample and the first stylistic form features of the radar image sample. When K = 1, it is the second intermediate image sample that minimizes the loss function, generated from the second content features of the optical image sample and the second stylistic form features of the radar image sample extracted for the first time by the second feature extraction module. When K = 2, it is the second intermediate image sample that minimizes the loss function, generated from the second content features and second stylistic form features extracted for the second time by the second feature extraction module.
Calculating the iteration convergence error between the first such second intermediate image sample and the image sample to be iterated can be regarded as one iteration; within one iteration, an intermediate image sample that minimizes the loss function is generated once, or the content features and stylistic form features are extracted once by the corresponding feature extraction module.
In the case that the iteration termination condition is not satisfied, the second intermediate image sample that minimizes the loss function may be used as the image sample to be iterated, then the second feature extraction module is used to extract the second content features of the optical image sample and the second stylistic form features of the radar image sample again, and a second intermediate image sample that minimizes a new loss function is generated based on the second content features of the optical image sample and the second stylistic form features of the radar image sample that are extracted again. The new loss function is constructed based on the second content features and the second stylistic form features of the newly generated second intermediate image sample, and the second content features of the re-extracted optical image sample and the second stylistic form features of the radar image sample.
Fig. 4 shows a schematic diagram of the image iteration process. The iterative process may include the following steps:
Step S21: specify the maximum number of iterations K and the iteration convergence error threshold ε;
Step S22: obtain the image sample to be iterated, use it to initialize x̂_0, and set k = 0, where k is the current iteration index;
Step S23: take x̂_k as input, obtain the respective content features Fc of the optical image sample and stylistic form features Fs of the radar image sample, generate a second intermediate image sample, i.e. obtain a new input image x̂_{k+1}, and set k = k + 1;
Step S24: if k > K, or the iteration convergence error between x̂_k and x̂_{k−1} is smaller than ε, stop the iteration; the fused image sample finally output by the model is x̂_k, the image synthesized by the homogeneous space transformation at iteration termination. Otherwise, return to step S23.
The maximum number of iterations K may be set to 100 and the iteration convergence error threshold ε to 0.01.
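A compact sketch of steps S21-S24 might look as follows; gen, extract_fixed and extract_random stand for the image generation module and the two feature extraction modules, and their exact interfaces are assumptions.

```python
import torch

def iterate_fusion(gen, extract_fixed, extract_random, x_opt, x_sar,
                   max_iter=100, eps=0.01):
    """Sketch of the iterative refinement: generate the image sample to be
    iterated from fixed-weight features, then repeatedly regenerate it from
    random-weight features until the convergence error falls below eps."""
    c, _ = extract_fixed(x_opt)          # first content feature of the optical sample
    _, s = extract_fixed(x_sar)          # first stylistic form feature of the radar sample
    x_prev = gen(c, s)                   # image sample to be iterated (k = 0)
    x_new = x_prev
    for _ in range(max_iter):
        c, _ = extract_random(x_opt)     # second content feature (random weights)
        _, s = extract_random(x_sar)     # second stylistic form feature (random weights)
        x_new = gen(c, s)                # second intermediate image sample
        if torch.norm(x_new - x_prev) < eps:   # iteration convergence error
            break
        x_prev = x_new
    return x_new                         # fused image sample
```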
In the foregoing, how the initial model generates an image sample to be iterated based on an optical image sample and a radar image sample, and iterates the image sample to be iterated to obtain a fused image sample is introduced; it can be understood that the method for generating the image to be iterated based on the optical image and the radar image by the homogeneity variation image generation model and the method for obtaining the fusion image by iterating the image to be iterated are similar, but a loss function does not need to be constructed, and the image generation module can directly output the fusion image.
After the heterogeneous remote sensing image is transformed into a fused image in a homogeneous mode, a final fused image can be obtained by adopting fusion means such as principal component analysis, additive fusion, multiplicative fusion and the like.
Because remote sensing image fusion is generally followed by tasks such as target detection, and methods based on homogeneous transformation combined with existing fusion means cannot effectively enhance the target proposed area in the fused image or suppress clutter, subsequent target detection performance is easily degraded. Therefore, on the basis of the above technical scheme, target enhancement can be performed on the target radar image according to the fused image. Target enhancement of the target radar image can significantly improve the target-to-clutter ratio in the fused image and improve subsequent target detection and recognition performance. Fig. 5 illustrates the training process and the testing process of target enhancement, which will be described in detail later.
Optionally, enhancement of the common target region in the remote sensing images may be implemented based on the target proposal, and suppression of the common clutter region in the remote sensing images may be implemented based on the Copula model.
A target radar image is acquired; the target radar image is an image from the same radar as the radar image that observes the same area as the optical image, and the observed area contains a target proposed area. The target radar image and the optical image must be registered and consistent in size; otherwise, registration is needed and the lower-resolution image is interpolated so that the two are consistent in size.
Inputting the fusion image and the target radar image into a classifier to obtain respective target proposed areas of the fusion image and the target radar image; obtaining a common target proposal area of the fusion image and the target radar image according to the respective target proposal areas of the fusion image and the target radar image; and fusing the common target proposal area and the target radar image to obtain an image with enhanced target proposal area.
The target radar image and the optical image observe the same area and are registered, and the fused image retains the content features of the optical image, so the target radar image and the fused image have a correspondence. Each contains information such as the structure, number and formation of the targets, so the two have an intrinsic correlation, i.e. joint sparsity of the data.
The respective target proposed areas of the fusion image and the target radar image are identified based on the classifier, the target proposed areas can be targets to be detected, such as ships on the sea surface, houses at a certain position and the like, pixel points outside the target proposed areas can be considered as clutter, and the clutter includes the sea surface, clouds and the like in the case that the targets are ships. The classifier is a model that learns how to identify the proposed region of interest based on the fused image samples and the radar image samples, where the radar image samples and the fused image samples used to train the classifier are also registered.
Optionally, inputting the fused image and the target radar image into the classifier to obtain their respective target proposed areas includes: adjusting the scales of the fused image and the target radar image to obtain the fused image and the target radar image at multiple scales; inputting the multi-scale fused images and multi-scale target radar images into the classifier to obtain the target proposed areas contained in the fused image at each scale and in the target radar image at each scale; combining the target proposed areas contained in the fused images at the multiple scales to obtain the target proposed area of the fused image; and combining the target proposed areas contained in the target radar images at the multiple scales to obtain the target proposed area of the target radar image.
The fusion image and the target radar image can be respectively adjusted to be the fusion image and the target radar image under a plurality of scales. And identifying target proposed areas contained in the fusion image and the target radar image respectively at each scale by using the classifier and a sliding window (the size is the same as that of a down-sampling sample box adopted when the initial classifier is trained). And the classifier gradually moves according to the sliding window to completely recognize the whole image. The classifier judges whether the content included in the sliding window contains the target or not, and all sliding window areas judged to contain the target in the moving process of the classifier are connected to obtain a target proposed area of one image.
The fused image and the target radar image at each scale are then scaled back to the original scale to obtain the target proposed area of each image: if a pixel belongs to the target proposed area at any scale, the corresponding pixel also belongs to the target proposed area in the fused image at the original scale, and likewise for the target radar image at the original scale.
According to the respective target proposed areas of the fused image and the target radar image, a common target proposed area of the fused image and the target radar image can be obtained. Alternatively, the proposed target area of the fused image and the proposed target area of the target radar image may be represented by a matrix respectively, and the hadamard product of the two matrices may be taken, so as to obtain a common proposed target area of the fused image and the target radar image.
Because the respective target proposed areas of the fused image and the target radar image incorporate the proposed areas found at every scale, the false alarm rate is effectively reduced from a multi-scale perspective.
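The per-scale masks can be merged back to the original scale and combined across the two images roughly as sketched below; the zoom-based resizing and the helper names are assumptions, while the any-scale rule and the Hadamard product come from the text.

```python
import numpy as np
from scipy.ndimage import zoom

def merge_scales(per_scale_masks, out_shape):
    """A pixel belongs to the target proposed area at the original scale
    if it belongs to it at any scale."""
    merged = np.zeros(out_shape, dtype=bool)
    for m in per_scale_masks:
        factors = (out_shape[0] / m.shape[0], out_shape[1] / m.shape[1])
        merged |= zoom(m.astype(np.float64), factors, order=0) > 0.5
    return merged

def common_proposal(mask_fused, mask_sar):
    """Common target proposed area O: Hadamard product of the two binary masks."""
    return (mask_fused & mask_sar).astype(np.float64)
```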
Optionally, the classifier is trained by the following steps: obtaining a plurality of box samples with labels from the radar image samples and the fusion image samples, wherein the labels represent whether the box samples are the target proposed area; obtaining a gradient norm diagram of each of a plurality of box samples; inputting the gradient norm diagrams of the box samples into an initial classifier to obtain a classification prediction result of whether the box samples are target proposed areas or not; and updating the parameters of the initial classifier according to the labels and classification prediction results of the box samples to obtain the classifier.
A plurality of box samples may be obtained from the radar image samples and the fused image samples using a downsampled sample box, and a label may be assigned to each box sample. Optionally, the downsampled sample box size may be 8. For each box sample, the image gradient norm map G(x, y) is then calculated from the horizontal gradient g_x(x, y) and the vertical gradient g_y(x, y) at each coordinate (x, y), x, y = 1, 2, …. In this way, unimportant detail information is filtered out, the salient edges of the image are enhanced, and good robustness to targets of different scales is obtained.
The classifier can be a support vector machine (SVM). The gradient norm map of each box sample is input into the initial classifier to obtain a binary prediction of whether the box sample is a target proposed area or clutter. The parameters of the initial classifier are updated with the label of each box sample as the constraint, yielding the trained classifier.
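A sketch of the classifier training is given below, assuming a scikit-learn SVM on flattened gradient-norm maps; the exact form of the gradient norm and the SVM kernel are not specified in the recovered text, so both are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def gradient_norm(patch):
    """Gradient-norm map G(x, y) of a box sample, here the 2-norm of the
    horizontal and vertical gradients (assumed form of the norm)."""
    gy, gx = np.gradient(patch.astype(np.float64))
    return np.hypot(gx, gy)

def train_proposal_classifier(box_samples, labels):
    """box_samples: list of 8x8 patches; labels: 1 for target proposed area, 0 for clutter."""
    feats = np.stack([gradient_norm(b).ravel() for b in box_samples])
    clf = SVC(kernel="rbf")
    clf.fit(feats, labels)
    return clf
```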
Optionally, fusing the common target proposed area and the target radar image to obtain the image with the enhanced target proposed area may include: calculating a joint probability density function of the background clutter in the target radar image and the fused image based on the Copula model, wherein the joint probability density function represents the common background clutter distribution of the target radar image and the fused image; calculating a background clutter suppression matrix of the target radar image and the fused image according to the joint probability density function; acquiring the similarity between the target radar image and the fused image; and obtaining the image with the enhanced target proposed area according to the common target proposed area, the target radar image, the background clutter suppression matrix and the similarity.
A copula is a concept from probability theory that decomposes the joint distribution of multiple random variables into the marginal distribution of each variable and a function describing the correlation structure between the variables.
Background clutter in the image is useless with respect to the proposed region of interest and interferes with the identification of the proposed region of interest, so it is desirable to suppress the background clutter. The joint probability density function represents the common background clutter distribution condition in the target radar image and the fusion image, so that the common suppression coefficient distribution of the background clutter of the target radar image and the fusion image, namely a background clutter suppression matrix, can be obtained according to the joint probability density function. When the value of the joint probability density function of a certain region is larger, the common suppression coefficient of two pictures in the corresponding region is larger.
The degree of similarity between the target radar image and the fused image is then calculated. Both images relate to the same target, and the more similar two co-located parts of the images are, the more likely it is that they contain the same content; for the image to be obtained, such parts are worth fusing, since fusing them makes the fused image more informative. Therefore, the higher the computed similarity of a part, the larger the fusion weight assigned to that part of the fused image. A similarity function between each part of the target radar image and the fused image is calculated and then converted into a fusion weight function.
Alternatively, the similarity κ may be calculated pixel-wise over the two images: with (m, n) denoting the pixel at each position, I_SAR the target radar image, Î the fused image, and M and N the common sizes of the two images, κ aggregates the differences between I_SAR(m, n) and Î(m, n) over all positions.
The fusion weight function α can be calculated by the following formula:

α = -log κ

The higher the similarity between the target radar image and the fused image, the smaller κ is and the larger α is.
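The exact similarity formula appears only as an equation image in the filing; the sketch below therefore assumes κ is a normalized mean absolute difference, which matches the stated behaviour (κ is small when the two images agree, so α = -log κ is large).

```python
import numpy as np

def fusion_weight(i_sar: np.ndarray, i_fused: np.ndarray, eps: float = 1e-12) -> float:
    """Assumed similarity kappa (smaller means more similar) and weight alpha = -log(kappa)."""
    a = i_sar.astype(np.float64)
    b = i_fused.astype(np.float64)
    # Normalize both images to [0, 1] so that kappa stays in (0, 1].
    a = (a - a.min()) / (np.ptp(a) + eps)
    b = (b - b.min()) / (np.ptp(b) + eps)
    kappa = np.abs(a - b).mean() + eps   # mean absolute difference over all (m, n)
    return float(-np.log(kappa))
```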
The common target proposal area is corrected according to the background clutter suppression matrix to obtain a corrected proposal area, and the corrected proposal area and the target radar image are combined by weighted summation using the fusion weight function, yielding the image with an enhanced target proposal area:

I_F = I_SAR + α(O⊙P)

where I_F is the image with the enhanced target proposal area, O denotes the common target proposal area, P denotes the background clutter suppression matrix, ⊙ denotes the element-wise product, and α is the fusion weight.
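A minimal sketch of this enhancement step; it assumes I_SAR, O, and P are arrays of the same shape and that α has been computed as above.

```python
import numpy as np

def enhance_proposal_area(i_sar: np.ndarray, o_mask: np.ndarray,
                          p_suppress: np.ndarray, alpha: float) -> np.ndarray:
    """I_F = I_SAR + alpha * (O ⊙ P): element-wise product of the common
    target proposal area O and the clutter-suppression matrix P, scaled by alpha."""
    assert i_sar.shape == o_mask.shape == p_suppress.shape
    return i_sar + alpha * (o_mask * p_suppress)
```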
Optionally, the model parameters of the copula model include the copula parameter and the copula function form, and may be obtained through the following steps: acquiring the background clutter pixel points of the radar image sample and of the fused image sample; forming clutter pixel pairs from the background clutter pixel points at the same positions in the radar image sample and the fused image sample; and substituting the clutter pixel pairs into the likelihood-based estimation below to obtain the model parameters of the copula model.
The model parameters of the copula model can be obtained by solving the following maximum-likelihood problem over the clutter pixel pairs:

(ĉ, θ̂_c) = argmax_{c∈C, θ_c∈Θ_c} L(c, θ_c)

where (z_S^i, z_A^i) denotes the i-th clutter pixel pair; c denotes a copula density function and θ_c its parameter; f_S denotes the probability density function of the background clutter in the radar image and f_A the probability density function of the background clutter in the fused image; ĉ and θ̂_c denote the corresponding estimates; Θ_c denotes the admissible range of θ_c; C denotes the copula density dictionary; H_0 denotes the hypothesis that the current pixel pair is clutter; and L denotes the log-likelihood function:

L(c, θ_c) = Σ_i [ log c(F_S(z_S^i), F_A(z_A^i); θ_c) + log f_S(z_S^i) + log f_A(z_A^i) ]

where F_S and F_A respectively denote the probability distribution functions of the background clutter in the radar image and the fused image. Optionally, the copula density dictionary is constructed from the four copula functions in Table 1 (a copula density is the cross partial derivative of the corresponding copula function).

[Table 1: four candidate copula functions]

In this way, the copula parameter and copula function form (ĉ, θ̂_c) of the copula model are obtained.
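The four candidate copulas are given only as a table image, so the sketch below fits a single Clayton copula parameter by maximizing the copula part of the log-likelihood over the clutter pixel pairs, with rank-based empirical CDFs standing in for F_S and F_A; a dictionary search over several copula families would wrap this in an outer loop and keep the best-scoring fit. All function names here are assumptions, not the patent's implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density c(u, v; theta), theta > 0."""
    t = u ** (-theta) + v ** (-theta) - 1.0
    return (np.log1p(theta) - (theta + 1.0) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(t))

def empirical_cdf(samples):
    """Rank-based estimate of the marginal CDF, kept strictly inside (0, 1)."""
    ranks = np.argsort(np.argsort(samples))
    return (ranks + 1.0) / (len(samples) + 1.0)

def fit_clayton(z_sar, z_fused):
    """Fit theta by maximizing the copula log-likelihood over clutter pixel pairs.

    The marginal terms log f_S + log f_A do not depend on theta when the
    marginals are estimated separately, so they are omitted from the objective.
    """
    u, v = empirical_cdf(z_sar), empirical_cdf(z_fused)
    neg_ll = lambda theta: -np.sum(clayton_log_density(u, v, theta))
    res = minimize_scalar(neg_ll, bounds=(1e-3, 20.0), method="bounded")
    return res.x
```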
Optionally, the calculating a joint probability density function of the background clutter in the target radar image and the fused image based on the copula model includes: generating a plurality of pixel pairs from each pixel point of the target radar image and the pixel point at the corresponding position in the fused image; and substituting the plurality of pixel pairs into the copula model to obtain the joint probability density function of the background clutter in the target radar image and the fused image.
The joint probability density function of the background clutter in the target radar image and the fused image can be calculated by the following formula:

f(z_S, z_A | H_0) = ĉ(F_S(z_S), F_A(z_A); θ̂_c) · f̂_S(z_S) · f̂_A(z_A)

where (z_S, z_A) denotes a pair of pixels at the same position in the target radar image and the fused image, and F_S and F_A are obtained by integrating the estimated marginal densities f̂_S and f̂_A. This joint probability density function characterizes the clutter distribution shared by the target radar image and the fused image.
Optionally, the calculating a background clutter suppression matrix of the target radar image and the fused image according to the joint probability density function includes: generating a plurality of pixel pairs from each pixel point of the target radar image and the pixel point at the corresponding position in the fused image; and obtaining the background clutter suppression matrix of the target radar image and the fused image from the plurality of pixel pairs and the joint probability density function.
The background clutter suppression matrix P of the target radar image and the fused image may be calculated element by element from the joint probability density function evaluated at the pixel pair formed by position (m, n) of the target radar image and position (m, n) of the fused image, where m = 1, 2, …, M and n = 1, 2, …, N, yielding the M×N matrix P.

[equation images: per-element expression for P and an alternative form]
FIG. 6 is a flow chart of a heterogeneous remote sensing image fusion method based on homogeneous transformation and target enhancement.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, electronic devices and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The fusion method for heterogeneous remote sensing images provided by the present application has been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the embodiments is only intended to help understand the method and its core idea. Meanwhile, those skilled in the art may, in accordance with the idea of the present application, make changes to the specific implementations and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A fusion method of heterogeneous remote sensing images is characterized by comprising the following steps:
acquiring an optical image and a radar image;
inputting the optical image and the radar image into a homogeneous transformation image generation model to obtain a fusion image, wherein the fusion image comprises the content features of the optical image and the style features of the radar image;
wherein the homogeneous transformation image generation model is a model that has learned to generate an image sample to be iterated based on an optical image sample and a radar image sample, and to iterate the image sample to be iterated to generate a fusion image sample.
2. The method according to claim 1, wherein the homogeneous transformation image generation model is obtained by training an initial model, and the initial model comprises an image generation module and a first feature extraction module with fixed weights; the step of training the initial model comprises at least:
inputting the optical image samples and the radar image samples into the initial model;
extracting a first content feature of the optical image sample and a first style feature of the radar image sample by using the first feature extraction module;
inputting the first content feature and the first style feature into the image generation module, and outputting a first intermediate image sample;
constructing a loss function according to the current input and output of the image generation module;
and updating the model parameters of the initial model based on the loss function to obtain the homogeneity change image generation model.
3. The method of claim 2, wherein the initial model further comprises a second feature extraction module with random weights; the step of training the initial model further comprises:
obtaining the image sample to be iterated generated by the image generation module, wherein the image sample to be iterated is the output of the image generation module that minimizes the loss function when the input of the image generation module is the first content feature and the first style feature;
extracting a second content feature of the optical image sample and a second style feature of the radar image sample by using the second feature extraction module;
acquiring a second intermediate image sample generated by the image generation module, wherein the second intermediate image sample is the output of the image generation module that minimizes the loss function when the input of the image generation module is the second content feature and the second style feature;
iterating the image sample to be iterated according to the second intermediate image sample, which comprises:
taking, when an iteration termination condition is satisfied, the second intermediate image sample as the fusion image sample output by the homogeneous transformation image generation model.
4. The method according to claim 3, wherein the taking the second intermediate image sample as the fusion image sample output by the homogeneous transformation image generation model when the iteration termination condition is satisfied comprises:
calculating an iteration convergence error between the second intermediate image sample and the image sample to be iterated;
and taking the second intermediate image sample as the fusion image sample output by the homogeneous transformation image generation model when the iteration convergence error is smaller than an iteration convergence error threshold, or when the number of times the second intermediate image sample has been generated reaches a preset number.
5. The method of claim 3, wherein the iterating the image sample to be iterated according to the second intermediate image sample further comprises:
when the iteration termination condition is not satisfied, taking the second intermediate image sample as the image sample to be iterated, and returning to the step of extracting a second content feature of the optical image sample and a second style feature of the radar image sample by using the second feature extraction module.
6. The method according to any one of claims 2-5, wherein, in the case that the current input of the image generation module is the first content feature and the first style feature, the constructing a loss function according to the current input and output of the image generation module comprises:
extracting a first content feature and a first style feature of the first intermediate image sample by using the first feature extraction module;
and constructing the loss function according to a difference between the first content feature of the first intermediate image sample and the first content feature of the optical image sample, and a difference between the first style feature of the first intermediate image sample and the first style feature of the radar image sample.
7. The method according to any one of claims 2-5, wherein, in the case that the current input of the image generation module is the second content feature and the second style feature, the constructing a loss function according to the current input and output of the image generation module comprises:
extracting a second content feature and a second style feature of the second intermediate image sample by using the second feature extraction module;
and constructing the loss function according to a difference between the second content feature of the second intermediate image sample and the second content feature of the optical image sample, and a difference between the second style feature of the second intermediate image sample and the second style feature of the radar image sample.
8. The method of claim 1, wherein after said obtaining a fused image, the method further comprises:
acquiring a target radar image, wherein the target radar image is an image which is from the same radar as the radar image and observes the same area as the optical image, and the area comprises a target proposal area;
inputting the fusion image and the target radar image into a classifier to obtain respective target proposal areas of the fusion image and the target radar image;
obtaining a common target proposal area of the fusion image and the target radar image according to the respective target proposal areas of the fusion image and the target radar image;
and fusing the common target proposal area and the target radar image to obtain an image with enhanced target proposal area.
9. The method of claim 8, wherein the classifier is trained by:
obtaining a plurality of labeled box samples from the radar image samples and the fused image samples, wherein the labels characterize whether the box samples are the target proposal area;
obtaining a gradient norm diagram of each of the plurality of box samples;
inputting the gradient norm diagrams of the box samples into an initial classifier to obtain a classification prediction result of whether the box samples are target proposal areas;
and updating the parameters of the initial classifier according to the labels and classification prediction results of the box samples to obtain the classifier.
10. The method of claim 8, wherein the fusing the common target proposal area and the target radar image to obtain an image with an enhanced target proposal area comprises:
calculating a joint probability density function of background clutter in the target radar image and the fusion image based on a copula model, wherein the joint probability density function characterizes the background clutter distribution shared by the target radar image and the fusion image;
calculating a background clutter suppression matrix of the target radar image and the fusion image according to the joint probability density function;
acquiring the similarity of the target radar image and the fusion image;
and obtaining an image enhanced by the target proposal area according to the common target proposal area, the target radar image, the background clutter suppression matrix and the similarity.
CN202210493488.0A 2022-05-07 2022-05-07 Fusion method of heterogeneous remote sensing images Pending CN114863235A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210493488.0A CN114863235A (en) 2022-05-07 2022-05-07 Fusion method of heterogeneous remote sensing images

Publications (1)

Publication Number Publication Date
CN114863235A true CN114863235A (en) 2022-08-05

Family

ID=82634712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210493488.0A Pending CN114863235A (en) 2022-05-07 2022-05-07 Fusion method of heterogeneous remote sensing images

Country Status (1)

Country Link
CN (1) CN114863235A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295714A (en) * 2016-08-22 2017-01-04 中国科学院电子学研究所 A kind of multi-source Remote-sensing Image Fusion based on degree of depth study
CN110598748A (en) * 2019-08-13 2019-12-20 清华大学 Heterogeneous image change detection method and device based on convolutional neural network fusion
CN112883908A (en) * 2021-03-16 2021-06-01 南京航空航天大学 Space-frequency characteristic consistency-based SAR image-to-optical image mapping method
CN113901900A (en) * 2021-09-29 2022-01-07 西安电子科技大学 Unsupervised change detection method and system for homologous or heterologous remote sensing image

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIAO JIANG等: "Change Detection in Heterogeneous Optical and SAR Remote Sensing Images via Deep Homogeneous Feature Fusion", 《 IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING》, vol. 13, 6 April 2020 (2020-04-06), pages 1551, XP011785829, DOI: 10.1109/JSTARS.2020.2983993 *
XUEQIAN WANG: "Proposal-Copula-Based Fusion of Spaceborne and Airborne SAR Images for Ship Target Detection", 《INFORMATION FUSION》, no. 77, 8 January 2021 (2021-01-08), pages 247 - 260 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116935083A (en) * 2023-09-12 2023-10-24 深圳须弥云图空间科技有限公司 Image clustering method and device
CN116935083B (en) * 2023-09-12 2023-12-12 深圳须弥云图空间科技有限公司 Image clustering method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination