CN112419155B - Super-resolution reconstruction method for fully-polarized synthetic aperture radar image - Google Patents
- Publication number: CN112419155B (application CN202011348480.2A)
- Authority: CN (China)
- Prior art keywords: resolution, training, image, aperture radar, attention
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T3/4053—Super resolution, i.e. output image resolution higher than sensor resolution
- G06T5/50—Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
- G06T2207/10032—Satellite or aerial image; Remote sensing
- G06T2207/10044—Radar image
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention provides a super-resolution reconstruction method for fully-polarized synthetic aperture radar images based on a multi-scale attention mechanism. The method constructs a residual convolutional neural network embedding feature-layer up-sampling, spatial attention, channel attention, multi-scale attention and adaptive loss-function modules, and combines supervised and zero-shot training mechanisms to obtain a network trained to convergence. Super-resolution reconstruction is then performed on the low-resolution synthetic aperture radar image to be processed through the trained network to obtain a high-resolution fully-polarized synthetic aperture radar image, effectively reconstructing the spatial information of the image while maintaining its polarization information.
Description
Technical Field
The invention relates to the field of remote sensing image processing and computer vision, in particular to a method for reconstructing a super-resolution image of a fully-polarized synthetic aperture radar based on a multi-scale attention mechanism.
Background
Fully-polarized synthetic aperture radar can obtain diverse scattering information of ground objects in a single scene through different polarization modes, and therefore plays an important role in applications such as ship identification, post-earthquake assessment and land-use classification. However, owing to limiting factors such as the signal bandwidth and antenna size of the synthetic aperture radar system, the spatial resolution of the image is inevitably reduced while multi-polarization information is acquired. Reconstructing the spatial resolution through super-resolution techniques is therefore an important way to improve the spatial information of fully-polarized synthetic aperture radar.
Existing super-resolution methods for fully-polarized synthetic aperture radar images fall mainly into three categories. The first is frequency-domain methods based on the shift property of the Fourier transform; these can handle a linear degradation model but struggle with the complex degradation model of fully-polarized synthetic aperture radar. The second is spatial-domain methods based on image priors; these use only the prior information of the image itself and do not effectively exploit external information. The third is methods based on deep learning; these depend on an external database and can better fit a complex degradation process, but the network structures used so far are relatively simple and make comparatively little use of the internal features of the image. It is therefore necessary to develop a technique that can fit the complex degradation process of fully-polarized synthetic aperture radar and efficiently reconstruct its spatial resolution while maintaining the polarization information.
Disclosure of Invention
In order to solve the technical problems of the existing super-resolution reconstruction algorithm, the invention provides a super-resolution reconstruction method of a fully-polarized synthetic aperture radar image based on a multi-scale attention mechanism, so as to obtain the fully-polarized synthetic aperture radar image with high spatial resolution.
The technical scheme provided by the invention is as follows: a super-resolution reconstruction method for a fully-polarized synthetic aperture radar image based on a multi-scale attention mechanism comprises the following steps:
step 1, establishing an observation model of a full-polarization synthetic aperture radar image, wherein the model provides a degradation relation between a high-spatial-resolution full-polarization synthetic aperture radar image and a low-spatial-resolution full-polarization synthetic aperture radar image;
step 2, image preprocessing, namely constructing a training data set by utilizing the preprocessed images;
step 3, constructing a super-resolution reconstruction network of the fully-polarized synthetic aperture radar image based on a multi-scale attention mechanism;
the super-resolution reconstruction network takes a low-resolution synthetic aperture radar image as input, performs dimensionality increasing operation through convolution, and obtains a high-resolution feature layer through up-sampling of the feature layer; then, inputting the obtained high-resolution feature layer into a multi-scale attention module, extracting features of multiple scales, and performing cascade operation on the extracted feature layer; each scale attention module comprises three embedded modules, namely a space attention module, a channel attention module and an attention fusion module; finally, performing dimensionality reduction operation on the feature layer obtained by the multi-scale attention module through convolution to obtain a high-resolution synthetic aperture radar image;
step 4, training the super-resolution reconstruction network constructed in the step 3 to be convergent by using the training data set constructed in the step 2;
and 5, performing super-resolution reconstruction on the low-resolution synthetic aperture radar image by using the converged super-resolution reconstruction network trained in the step 4 to obtain a high-resolution fully-polarized synthetic aperture radar image.
Further, the observation model in step 1 is constructed as follows,
representing the high-resolution fully-polarized synthetic aperture radar image as x, and representing the degraded low-resolution fully-polarized synthetic aperture radar image as y, then the observation model of the fully-polarized synthetic aperture radar image is represented as:
y = f_ds(x)    (1)

where f_ds(·) represents a down-sampling function.
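As a concrete illustration, the degradation of equation (1) can be sketched with a simple average-pooling operator. The choice of pooling, the 2x scale factor and the four polarization channels (HH, HV, VH, VV) are assumptions for illustration, since the patent leaves f_ds(·) generic:

```python
import numpy as np

def f_ds(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Hypothetical down-sampling operator f_ds of equation (1):
    average-pool each polarization channel by `scale`."""
    c, h, w = x.shape
    x = x[:, : h - h % scale, : w - w % scale]   # crop to a multiple of scale
    return x.reshape(c, h // scale, scale, w // scale, scale).mean(axis=(2, 4))

# A 4-channel (HH, HV, VH, VV) high-resolution image degrades to half size.
x_hr = np.random.rand(4, 8, 8)
y_lr = f_ds(x_hr)   # shape (4, 4, 4)
```

Any other down-sampling operator, such as low-pass filtering followed by decimation, would fit the same observation model.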
Further, step 2 comprises the following two parts;
step 2.1, preprocessing the image, including radiation correction, terrain correction and multi-view processing, to obtain a corrected image; and (4) according to the observation model established in the step (1), performing down-sampling processing on the high-resolution image to obtain a low-resolution image. Obtaining a high-resolution and low-resolution fully-polarized synthetic aperture radar image pair containing the same ground object through cutting, and constructing a data set;
and 2.2, enhancing the data set constructed in step 2.1 by rotation through 90, 180 and 270 degrees and by flipping, to obtain the training data set.
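The eight-fold enhancement of step 2.2 (three rotations plus a flipped copy of each orientation) can be sketched as follows; the channel-first array layout and the patch sizes are illustrative assumptions:

```python
import numpy as np

def augment(pair):
    """Return eight augmented copies of an (HR, LR) image pair:
    rotations by 0/90/180/270 degrees, plus a horizontal flip of each."""
    hr, lr = pair
    out = []
    for k in range(4):                                    # k quarter-turns
        hr_k = np.rot90(hr, k, axes=(1, 2))
        lr_k = np.rot90(lr, k, axes=(1, 2))
        out.append((hr_k, lr_k))
        out.append((hr_k[:, :, ::-1], lr_k[:, :, ::-1]))  # flipped copy
    return out

pairs = augment((np.random.rand(4, 8, 8), np.random.rand(4, 4, 4)))
```

Both images of a pair are transformed identically so that the degradation relation of equation (1) is preserved.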
Further, the super-resolution reconstruction network in the step 3 specifically includes;
step 3.1, constructing a feature layer upsampling module, wherein the module is used for performing upsampling operation on a low-resolution feature layer result obtained by convolution from the aspect of the feature layer to obtain a high-resolution feature layer, and the module is defined as:
F_hr = f_us(F_lr)    (2)

where F_hr represents the high-resolution feature layer, F_lr represents the low-resolution feature layer, and f_us(·) represents an up-sampling function;
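One plausible realization of the feature-layer up-sampling f_us(·) in equation (2) is a pixel-shuffle rearrangement; this particular operator is an assumption, as the patent does not fix the up-sampler:

```python
import numpy as np

def f_us(F_lr: np.ndarray, r: int = 2) -> np.ndarray:
    """Pixel-shuffle sketch of f_us in equation (2):
    rearrange (C*r*r, H, W) into (C, r*H, r*W)."""
    c2, h, w = F_lr.shape
    c = c2 // (r * r)
    F = F_lr.reshape(c, r, r, h, w)
    F = F.transpose(0, 3, 1, 4, 2)   # -> (C, H, r, W, r)
    return F.reshape(c, h * r, w * r)

F_hr = f_us(np.random.rand(64 * 4, 16, 16))   # -> (64, 32, 32)
```

The rearrangement moves channel information into space, so the up-sampling happens at the feature-layer level rather than by interpolating the input image.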
step 3.2: constructing a spatial attention module, wherein the spatial attention module is used for weighting the spatial weight of the characteristic layer of the fully-polarized synthetic aperture radar and enhancing the spatial resolution of the image of the fully-polarized synthetic aperture radar, and the spatial attention module is defined as:
F'_spa = M_spa ⊗ F_spa    (3)

where F'_spa ∈ R^(C×H×W) represents the feature layer weighted by the spatial attention module; C, H and W represent the number of channels, the height and the width of the feature layer respectively; F_spa ∈ R^(C×H×W) represents the spatial-attention input feature layer; M_spa ∈ R^(1×H×W) represents the spatial attention weight map; and ⊗ represents element-wise multiplication. The spatial attention weight map is calculated as:

M_spa = σ(W_2 * δ(W_1 * F_spa + b_1) + b_2)    (4)

where F_spa represents the input feature layer, σ(·) and δ(·) represent the Sigmoid and ReLU activation functions respectively, * represents the convolution operation, and W_1, W_2 and b_1, b_2 respectively represent the two weight terms and two bias terms of the spatial attention module;
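A minimal numerical sketch of the spatial attention weighting of step 3.2, using 1x1 convolutions so that each convolution reduces to a weighted sum over channels; the random stand-in weights and the kernel size are assumptions, since the patent does not specify them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(F_spa: np.ndarray) -> np.ndarray:
    """Conv -> ReLU -> Conv -> Sigmoid produces a (1, H, W) weight map,
    which gates every channel at each spatial position."""
    C, H, W = F_spa.shape
    rng = np.random.default_rng(0)
    W1, b1 = rng.standard_normal((C, C)) * 0.1, np.zeros((C, 1, 1))
    W2, b2 = rng.standard_normal((1, C)) * 0.1, np.zeros((1, 1, 1))
    hidden = np.maximum(0.0, np.tensordot(W1, F_spa, axes=1) + b1)  # ReLU branch
    M_spa = sigmoid(np.tensordot(W2, hidden, axes=1) + b2)          # (1, H, W) map
    return M_spa * F_spa                                            # element-wise gate

F_w = spatial_attention(np.random.rand(8, 16, 16))
```

The weight map is shared across channels, so spatial structure is emphasized without mixing polarization channels.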
step 3.3, constructing a channel attention module, wherein the module is used for weighting the polarization channel weight of the full-polarization synthetic aperture radar feature layer and maintaining the polarization information of each channel, and the module is defined as:
F'_cha = M_cha ⊗ F_cha    (5)

where F'_cha represents the feature layer weighted by the channel attention module, F_cha represents the channel-attention input feature layer, M_cha ∈ R^(C×1×1) represents the channel attention weight map, and ⊗ represents element-wise multiplication; the channel attention weight map is calculated as:

M_cha = σ(W_2 δ(W_1 P_avg(F_cha) + b_1) + b_2)    (6)

where F_cha represents the input feature layer, P_avg(·) represents an average pooling operation, and W_1, W_2 and b_1, b_2 respectively represent the two weight terms and two bias terms of the channel attention module;
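The channel attention of step 3.3 can be sketched analogously: global average pooling over each polarization channel, a two-layer bottleneck, and a Sigmoid gate per channel. The bottleneck reduction ratio and the random stand-in weights are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F_cha: np.ndarray, reduction: int = 2) -> np.ndarray:
    """Weight each polarization channel by a gate computed from its
    global average (hypothetical weights; reduction ratio assumed)."""
    C = F_cha.shape[0]
    rng = np.random.default_rng(1)
    W1 = rng.standard_normal((C // reduction, C)) * 0.1
    W2 = rng.standard_normal((C, C // reduction)) * 0.1
    p = F_cha.mean(axis=(1, 2))                      # P_avg: one value per channel
    M_cha = sigmoid(W2 @ np.maximum(0.0, W1 @ p))    # per-channel gate in (0, 1)
    return M_cha[:, None, None] * F_cha              # broadcast over H and W

F_c = channel_attention(np.random.rand(4, 16, 16))
```

Because each channel is rescaled as a whole, the relative structure within a polarization channel is preserved while its overall contribution is re-weighted.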
step 3.4, constructing an attention fusion module for fusing the spatial-attention-weighted result of step 3.2 and the channel-attention-weighted result of step 3.3, the module being defined as:

F_fus = W_fus * Concat(F'_spa, F'_cha) + b_fus    (7)

where F_fus is the fused result, Concat(·) represents the feature-layer concatenation operation, and W_fus and b_fus respectively represent the weight term and bias term of the attention fusion module;
step 3.5, constructing a multi-scale attention module, wherein the module comprises three scales, namely an original scale, an original scale downscaling scale and an original scale upscaling scale; the original scale is used for extracting the features of the target with the conventional size, the original scale downscaling is used for extracting the features of the small target, and the original scale upscaling is used for extracting the features of the large target; the module embeds the attention mechanism described in steps 3.2-3.4, which is defined as:
F_ms = W_ms * Concat(F_s0, F_s1, F_s2) + b_ms    (8)

where F_ms represents the feature layer output by the multi-scale attention module, and W_ms and b_ms respectively represent the weight term and bias term of the multi-scale attention module;
the original scale feature layer calculation method comprises the following steps:
F_s0 = F_fus(F)    (9)

where F_s0 represents the original-scale feature layer, F_fus(·) represents the embedded attention fusion module, and F represents the input feature layer; the downscaled feature layer is calculated as:

F_s1 = f_ds(F_fus(f_us(F)))    (10)

where F_s1 represents the downscaled feature layer, f_ds(·) represents the down-sampling function and f_us(·) represents the up-sampling function; the upscaled feature layer is calculated as:

F_s2 = f_us(F_fus(f_ds(F)))    (11)

where F_s2 represents the upscaled feature layer;
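The three branches of equations (9)-(11) can be sketched end to end. The nearest-neighbour up-sampler, the 2x average-pool down-sampler and the identity stand-in for the embedded attention module are all simplifying assumptions, and the closing convolution of equation (8) is omitted:

```python
import numpy as np

def f_us(F):   # nearest-neighbour 2x upsampling (stand-in for equation (2))
    return F.repeat(2, axis=1).repeat(2, axis=2)

def f_ds(F):   # 2x average-pool downsampling (stand-in for equation (1))
    c, h, w = F.shape
    return F.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def attention(F):   # identity placeholder for the embedded fusion module F_fus(.)
    return F

def multi_scale(F):
    """Three branches with a shared embedded attention block,
    concatenated along the channel axis."""
    F_s0 = attention(F)               # original scale: regular-size targets
    F_s1 = f_ds(attention(f_us(F)))   # finer branch: small targets
    F_s2 = f_us(attention(f_ds(F)))   # coarser branch: large targets
    return np.concatenate([F_s0, F_s1, F_s2], axis=0)

F_ms = multi_scale(np.random.rand(16, 8, 8))   # -> (48, 8, 8)
```

Because every branch returns to the original spatial size, the three outputs can be concatenated directly before the closing convolution.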
step 3.6, constructing an adaptive loss function, wherein the adaptive loss function consists of two parts, including an L1 loss function for avoiding overfitting of network parameters caused by abnormal values and an L2 loss function for maintaining numerical relationships, and the adaptive loss function is defined as the following form:
L_total(Θ) = α L1(Θ) + β L2(Θ)    (12)

where L_total(Θ) denotes the adaptive loss function, L1(Θ) represents the L1 loss function, L2(Θ) represents the L2 loss function, Θ is the set of neural network parameters, and α and β are regularization parameters for adjusting the weights of the L1 and L2 loss functions; specifically, the L1 loss function is defined as:

L1(Θ) = (1/N) Σ_{i=1}^{N} ||ρ_i − ξ(y_i)||_1    (13)

where N represents the number of training image pairs, x_i and y_i respectively represent the high-resolution and low-resolution fully-polarized synthetic aperture radar images of the i-th pair of training images, ρ_i represents the residual between the feature-layer up-sampling result and the high-resolution synthetic aperture radar image, and ξ(·) represents the output of the super-resolution reconstruction network;
the L2 loss function is defined as:
the calculation method of the self-adaptive regularization parameter comprises the following steps:
further, the training mechanism adopted in step 4 includes: a supervised training mechanism and a zero throw training mechanism; training the super-resolution reconstruction network by using a supervision type training mechanism to obtain a converged pre-training network, and then further training the pre-training network by using a zero throw type training mechanism to obtain a converged reconstruction network;
step 4.1, the supervised training mechanism, which performs supervised training using an external database; with the training data set constructed in step 2, the super-resolution reconstruction network is trained in a supervised manner on the paired low-/high-resolution synthetic aperture radar images of the external data set to obtain a converged pre-trained network, and through this mechanism the external information of the external data set is fully utilized;
4.2, the zero-shot training mechanism, which trains using the internal information of the data, i.e. the pre-trained network is trained with the low-resolution synthetic aperture radar image itself; specifically, this mechanism degrades the low-resolution image to be processed using formula (1) to obtain a down-sampled result; the down-sampled result is used as the target image of the pre-trained network and the low-resolution image to be processed as its reference image, establishing a pairing relation between the low-resolution image and its down-sampled copy for self-supervised training of the pre-trained network; through this mechanism, the internal information of the image to be processed is fully utilized.
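The pairing step of the zero-shot mechanism can be sketched as follows. In the spirit of zero-shot super-resolution, the low-resolution image and its own down-sampled copy form the training pair, so no external data is needed (the 2x factor and average-pool degradation are assumptions):

```python
import numpy as np

def f_ds(x: np.ndarray, s: int = 2) -> np.ndarray:
    """Stand-in for the degradation of formula (1): s-fold average pooling."""
    c, h, w = x.shape
    return x.reshape(c, h // s, s, w // s, s).mean(axis=(2, 4))

def zero_shot_pair(y_lr: np.ndarray):
    """Build the self-supervised pair of step 4.2 from the low-resolution
    image alone: its down-sampled copy on one side and the image itself
    on the other, for fine-tuning the pre-trained network."""
    return f_ds(y_lr), y_lr

down, ref = zero_shot_pair(np.random.rand(4, 16, 16))
```

Fine-tuning on this pair adapts the pre-trained network to the internal statistics of the specific scene being reconstructed.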
The invention has the advantages that:
(1) Super-resolution reconstruction is performed on the image to be processed directly through an end-to-end residual convolutional neural network to obtain a high-resolution fully-polarized synthetic aperture radar image.
(2) By embedding spatial attention and channel attention in the multi-scale module and designing an adaptive L1/L2 loss function, a network capable of effectively extracting the spatial information and polarization information of multi-scale targets is constructed.
(3) Through two training mechanisms, external database information and image internal information can be effectively utilized.
Drawings
Fig. 1 is a flowchart of a super-resolution reconstruction method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a network framework according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is explained in detail in the following by combining the attached drawings and the embodiment.
Step 1: establishing an observation model of a full-polarization synthetic aperture radar image, wherein the embodiment represents a high-resolution full-polarization synthetic aperture radar image as x, and represents a degraded low-resolution full-polarization synthetic aperture radar image as y, and then the observation model of the full-polarization synthetic aperture radar image can be represented as follows:
y = f_ds(x)    (1)

where f_ds(·) represents a down-sampling function.
Step 2: image preprocessing.
Step 2.1: the image is preprocessed by radiometric correction, terrain correction and multi-look processing to obtain a corrected image. According to the observation model established in step 1, the high-resolution image is down-sampled to obtain a low-resolution image. High-/low-resolution fully-polarized synthetic aperture radar image pairs containing the same ground objects are then obtained by cropping, with a 20% overlap between image pairs.
Step 2.2: the data set constructed in step 2.1 is enhanced; rotation by 90, 180 and 270 degrees and flipping effectively expand the data set.
Step 3: constructing the fully-polarized synthetic aperture radar image super-resolution reconstruction network based on the multi-scale attention mechanism; the network is an end-to-end residual convolutional neural network. As shown in fig. 2, the network takes a low-resolution synthetic aperture radar image as input, performs a dimension-raising operation through a 1×1 convolution, and up-samples the feature layer to obtain a high-resolution feature layer. The obtained high-resolution feature layer is then input into the multi-scale attention module, features at three scales are extracted, and the extracted feature layers are concatenated. Each scale of the attention module embeds three sub-modules: a spatial attention module, a channel attention module and an attention fusion module. Finally, a dimension-reducing operation is applied to the feature layer produced by the multi-scale attention module through a 1×1 convolution to obtain the high-resolution synthetic aperture radar image. Six sub-modules are embedded in the residual convolutional neural network: the feature-layer up-sampling module, the spatial attention module, the channel attention module, the attention fusion module, the multi-scale attention module and the adaptive loss function.
Step 3.1: and constructing a feature layer up-sampling module. The module is used for performing an upsampling operation on a low-resolution feature layer result obtained by convolution from a feature layer level to obtain a feature layer with a high resolution, and the module can be defined as follows:
F_hr = f_us(F_lr)    (2)

where F_hr represents the high-resolution feature layer, F_lr represents the low-resolution feature layer, and f_us(·) represents an up-sampling function.
Step 3.2: a spatial attention module is constructed. The module is used for weighting the spatial weight of the full-polarization synthetic aperture radar characteristic layer and enhancing the spatial resolution of the full-polarization synthetic aperture radar image. This module may be defined as:
F'_spa = M_spa ⊗ F_spa    (3)

where F'_spa ∈ R^(C×H×W) represents the feature layer weighted by the spatial attention module; C, H and W represent the number of channels, the height and the width of the feature layer respectively; F_spa ∈ R^(C×H×W) represents the spatial-attention input feature layer; M_spa ∈ R^(1×H×W) represents the spatial attention weight map; and ⊗ represents element-wise multiplication. The spatial attention weight map is calculated as follows:

M_spa = σ(W_2 * δ(W_1 * F_spa + b_1) + b_2)    (4)

where F_spa represents the input feature layer, σ(·) and δ(·) represent the Sigmoid and ReLU activation functions respectively, * represents the convolution operation, and W_1, W_2 and b_1, b_2 respectively represent the two weight terms and two bias terms of the spatial attention module.
Step 3.3: a channel attention module is constructed. The module is used for weighting the polarization channel weight of the full-polarization synthetic aperture radar characteristic layer and keeping the polarization information of each channel. The module may be defined as:
F'_cha = M_cha ⊗ F_cha    (5)

where F'_cha represents the feature layer weighted by the channel attention module, F_cha represents the channel-attention input feature layer, M_cha ∈ R^(C×1×1) represents the channel attention weight map, and ⊗ represents element-wise multiplication. The channel attention weight map is calculated as follows:

M_cha = σ(W_2 δ(W_1 P_avg(F_cha) + b_1) + b_2)    (6)

where F_cha represents the input feature layer, P_avg(·) represents an average pooling operation, and W_1, W_2 and b_1, b_2 respectively represent the two weight terms and two bias terms of the channel attention module.
step 3.4: and an attention fusion module. The module is used for carrying out information fusion on the space attention weight weighting result and the channel attention weight weighting result obtained in the step 3.2 and the step 3.3. The module may be defined as:
F_fus = W_fus * Concat(F'_spa, F'_cha) + b_fus    (7)

where F_fus is the fused result, Concat(·) represents the feature-layer concatenation operation, and W_fus and b_fus respectively represent the weight term and bias term of the attention fusion module.
Step 3.5: a multi-scale attention module is constructed. The module comprises three scales which are an original scale, an original scale downscaling scale and an original scale upscaling scale respectively. The original scale is used for extracting the features of the target with the conventional size, the original scale down scale is used for extracting the features of the small target, and the original scale up scale is used for extracting the features of the large target. The module embeds the attention mechanism described in steps 3.2-3.4, which is defined as:
F_ms = W_ms * Concat(F_s0, F_s1, F_s2) + b_ms    (8)

where F_ms represents the feature layer output by the multi-scale attention module, and W_ms and b_ms respectively represent the weight term and bias term of the multi-scale attention module.
The original scale feature layer calculation method comprises the following steps:
F_s0 = F_fus(F)    (9)

where F_s0 represents the original-scale feature layer, F_fus(·) represents the embedded attention fusion module, and F represents the input feature layer.

The downscaled feature layer is calculated as follows:

F_s1 = f_ds(F_fus(f_us(F)))    (10)

where F_s1 represents the downscaled feature layer, f_ds(·) represents the down-sampling function and f_us(·) represents the up-sampling function.

The upscaled feature layer is calculated as follows:

F_s2 = f_us(F_fus(f_ds(F)))    (11)

where F_s2 represents the upscaled feature layer.
Step 3.6: an adaptive loss function is constructed. The adaptive loss function consists of two parts, including an L1 loss function for avoiding overfitting of network parameters due to outliers, and an L2 loss function for maintaining numerical relationships, which can be defined as the following form:
L_total(Θ) = α L1(Θ) + β L2(Θ)    (12)

where L_total(Θ) denotes the adaptive loss function, L1(Θ) represents the L1 loss function, L2(Θ) represents the L2 loss function, Θ is the set of neural network parameters, and α and β are regularization parameters used to adjust the weights of the L1 and L2 loss functions. Specifically, the L1 loss function may be defined as:

L1(Θ) = (1/N) Σ_{i=1}^{N} ||ρ_i − ξ(y_i)||_1    (13)

where N represents the number of training image pairs, x_i and y_i respectively represent the high-resolution and low-resolution fully-polarized synthetic aperture radar images of the i-th pair of training images, ρ_i represents the residual between the feature-layer up-sampling result and the high-resolution synthetic aperture radar image, and ξ(·) represents the output of the super-resolution reconstruction network.

The L2 loss function may be defined as:

L2(Θ) = (1/N) Σ_{i=1}^{N} ||ρ_i − ξ(y_i)||_2^2    (14)

The adaptive regularization parameters α and β are then calculated as follows:
and 4, step 4: and (3) training the super-resolution reconstruction network constructed in the step (3) by using the training data set constructed in the step (2). The training mechanism adopted by the invention comprises: a supervised training mechanism and a zero throw training mechanism. And training the super-resolution reconstruction network by using a supervision type training mechanism to obtain a convergent pre-training network. And then, further training the pre-training network by using a zero-throw type training mechanism to obtain a converged super-resolution reconstruction network.
Step 4.1: the supervised training mechanism. This mechanism performs supervised training using an external database: with the training data set constructed in step 2, the super-resolution reconstruction network is trained in a supervised manner on the paired low-/high-resolution synthetic aperture radar images of the external data set to obtain a converged pre-trained network. By this mechanism, the external information of the external data set is fully utilized.
Step 4.2: the zero-shot training mechanism. This mechanism trains with the internal information of the data: the low-resolution synthetic aperture radar image itself is used to train the pre-trained network. Specifically, the low-resolution image to be processed is degraded using formula (1) to obtain a down-sampled result. The down-sampled result is used as the target image of the pre-trained network and the low-resolution image to be processed as its reference image, establishing a pairing relation between the low-resolution image and its down-sampled copy for self-supervised training of the pre-trained network. By this mechanism, the internal information of the image to be processed is fully utilized.
Step 5: super-resolution reconstruction of the image to be processed. Using the converged super-resolution reconstruction network trained in step 4, super-resolution reconstruction is performed on the low-resolution synthetic aperture radar image to be processed to obtain a high-resolution fully-polarized synthetic aperture radar image.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (7)
1. A super-resolution reconstruction method for a fully-polarized synthetic aperture radar image is characterized by comprising the following steps:
step 1, establishing an observation model of a full-polarization synthetic aperture radar image, wherein the model provides a degradation relation between a high-spatial-resolution full-polarization synthetic aperture radar image and a low-spatial-resolution full-polarization synthetic aperture radar image;
step 2, image preprocessing, namely constructing a training data set by utilizing the preprocessed images;
step 3, constructing a super-resolution reconstruction network of the fully-polarized synthetic aperture radar image based on a multi-scale attention mechanism;
the super-resolution reconstruction network takes a low-resolution synthetic aperture radar image as input, performs dimensionality increasing operation through convolution, and obtains a high-resolution feature layer through up-sampling of the feature layer; then, inputting the obtained high-resolution feature layer into a multi-scale attention module, extracting features of multiple scales, and performing cascade operation on the extracted feature layer; each scale attention module comprises three embedded modules, namely a space attention module, a channel attention module and an attention fusion module; finally, performing dimensionality reduction operation on the feature layer obtained by the multi-scale attention module through convolution to obtain a high-resolution synthetic aperture radar image;
step 4, training the super-resolution reconstruction network constructed in the step 3 to be convergent by using the training data set constructed in the step 2;
and 5, performing super-resolution reconstruction on the low-resolution synthetic aperture radar image by using the converged super-resolution reconstruction network trained in the step 4 to obtain a high-resolution fully-polarized synthetic aperture radar image.
2. The method of claim 1, wherein: the observation model in step 1 is constructed as follows,
representing the high-resolution fully-polarized synthetic aperture radar image as x, and representing the degraded low-resolution fully-polarized synthetic aperture radar image as y, then the observation model of the fully-polarized synthetic aperture radar image is represented as:
y = f_ds(x)    (1)

where f_ds(·) represents a down-sampling function.
3. The method of claim 1, wherein: step 2 comprises the following two parts;
step 2.1, preprocessing the image, including radiation correction, terrain correction and multi-view processing, to obtain a corrected image; according to the observation model established in the step 1, performing down-sampling processing on the high-resolution image to obtain a low-resolution image, obtaining a high-resolution and low-resolution fully-polarized synthetic aperture radar image pair containing the same ground object by cutting, and constructing a data set;
and 2.2, performing data enhancement on the data set constructed in the step 2.1 to obtain a training data set.
4. The method of claim 1, wherein: the super-resolution reconstruction network in the step 3 specifically comprises the following steps;
step 3.1, constructing a feature layer upsampling module, wherein the module is used for performing upsampling operation on a low-resolution feature layer result obtained by convolution from the aspect of the feature layer to obtain a high-resolution feature layer, and the module is defined as:
F_hr = f_us(F_lr)    (2)

where F_hr represents the high-resolution feature layer, F_lr represents the low-resolution feature layer, and f_us(·) represents an up-sampling function;
step 3.2: constructing a spatial attention module, which weights the fully-polarized synthetic aperture radar feature layer in the spatial domain to enhance the spatial resolution of the fully-polarized synthetic aperture radar image; the module is defined as:
F̂_spa = M_spa ⊗ F_spa  (3)
where F̂_spa ∈ ℝ^(C×H×W) denotes the feature layer weighted by the spatial attention module, C, H and W denote the number of channels, height and width of the feature layer respectively, F_spa denotes the spatial-attention input feature layer, M_spa denotes the spatial attention weight map, and ⊗ denotes element-wise multiplication; the spatial attention weight map is computed as:
M_spa = σ(W_spa2 ∗ δ(W_spa1 ∗ F_spa + b_spa1) + b_spa2)  (4)
where σ(·) and δ(·) denote the Sigmoid and ReLU activation functions respectively, ∗ denotes the convolution operation, and W_spa1, W_spa2 and b_spa1, b_spa2 denote the two weight terms and two bias terms of the spatial attention module;
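A minimal numpy sketch of the step-3.2 weighting (Sigmoid over a ReLU-gated two-layer map, then element-wise multiplication). The claim does not state kernel sizes, so 1×1 convolutions, i.e. per-pixel matrix products, stand in for the two weight terms; the hidden width of 8 is likewise an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(F, W1, b1, W2, b2):
    """Spatial attention of step 3.2: a two-layer map produces a
    one-channel weight map M in (0, 1), broadcast over all channels."""
    H1 = np.maximum(np.einsum('oc,chw->ohw', W1, F)
                    + b1[:, None, None], 0.0)            # ReLU, delta(.)
    M = sigmoid(np.einsum('oc,chw->ohw', W2, H1)
                + b2[:, None, None])                     # sigma(.), (1, H, W)
    return M * F                                         # element-wise weighting

rng = np.random.default_rng(0)
F = np.ones((4, 8, 8))                                   # 4 polarization channels
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((1, 8)), np.zeros(1)
out = spatial_attention(F, W1, b1, W2, b2)
```

Because the Sigmoid maps into (0, 1), the module can only attenuate features, re-distributing emphasis across spatial positions rather than amplifying them.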
step 3.3, constructing a channel attention module, which weights the polarization channels of the fully-polarized synthetic aperture radar feature layer to preserve the polarization information of each channel; the module is defined as:
F̂_cha = M_cha ⊗ F_cha  (5)
where F̂_cha denotes the feature layer weighted by the channel attention module, F_cha denotes the channel-attention input feature layer, M_cha denotes the channel attention weight map, and ⊗ denotes element-wise multiplication; the channel attention weight map is computed as:
M_cha = σ(W_cha2 δ(W_cha1 P_avg(F_cha) + b_cha1) + b_cha2)  (6)
where P_avg(·) denotes an average-pooling operation, and W_cha1, W_cha2 and b_cha1, b_cha2 denote the two weight terms and two bias terms of the channel attention module;
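The channel attention of step 3.3 can be sketched in the squeeze-and-excitation style: global average pooling P_avg squeezes each channel to a scalar, and a two-layer map produces one weight per polarization channel. The bottleneck width of 2 is an assumption for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(F, W1, b1, W2, b2):
    """Channel attention of step 3.3: per-channel weights in (0, 1)
    derived from the globally pooled channel statistics."""
    z = F.mean(axis=(1, 2))                  # P_avg(.): squeeze to (C,)
    h = np.maximum(W1 @ z + b1, 0.0)         # ReLU, delta(.)
    M = sigmoid(W2 @ h + b2)                 # sigma(.): one weight per channel
    return M[:, None, None] * F              # element-wise weighting

rng = np.random.default_rng(1)
F = np.ones((4, 8, 8))                       # 4 polarization channels
W1, b1 = rng.standard_normal((2, 4)), np.zeros(2)   # assumed bottleneck of 2
W2, b2 = rng.standard_normal((4, 2)), np.zeros(4)
out = channel_attention(F, W1, b1, W2, b2)
```

Weighting whole channels rather than pixels is what lets the module rebalance the HH/HV/VH/VV responses without disturbing their spatial structure.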
step 3.4, constructing an attention fusion module, which fuses the spatial-attention-weighted result of step 3.2 with the channel-attention-weighted result of step 3.3; the module is defined as:
F_fus = W_fus ∗ Concat(F̂_spa, F̂_cha) + b_fus  (7)
where F_fus denotes the fused result, Concat(·) denotes the feature-layer concatenation operation, and W_fus and b_fus denote the weight term and bias term of the attention fusion module;
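The fusion of step 3.4 concatenates the two weighted feature layers along the channel axis and projects them back; a 1×1 convolution is assumed for the weight term W_fus, since the claim fixes no kernel size:

```python
import numpy as np

def attention_fusion(F_spa_hat, F_cha_hat, W_fus, b_fus):
    """Attention fusion of step 3.4: Concat(.) along channels, then a
    1x1-convolution projection (per-pixel matrix product)."""
    cat = np.concatenate([F_spa_hat, F_cha_hat], axis=0)    # (2C, H, W)
    return np.einsum('oc,chw->ohw', W_fus, cat) + b_fus[:, None, None]

rng = np.random.default_rng(2)
A = rng.random((4, 8, 8))                    # spatial-attention result
B = rng.random((4, 8, 8))                    # channel-attention result
W_fus, b_fus = rng.standard_normal((4, 8)), np.zeros(4)
F_fus = attention_fusion(A, B, W_fus, b_fus)
print(F_fus.shape)  # → (4, 8, 8)
```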
step 3.5, constructing a multi-scale attention module comprising three scales, namely the original scale, an original-scale down-scaling and an original-scale up-scaling; the original scale extracts features of conventional-size targets, the original-scale down-scaling extracts features of small targets, and the original-scale up-scaling extracts features of large targets; the module embeds the attention mechanism described in steps 3.2-3.4 and is defined as:
F_ms = W_ms ∗ Concat(F_s0, F_s1, F_s2) + b_ms  (8)
where F_ms denotes the output feature layer of the multi-scale attention module, F_s0, F_s1 and F_s2 denote the three scale branches defined below, and W_ms and b_ms denote the weight term and bias term of the multi-scale attention module;
the original-scale feature layer is computed as:
F_s0 = F_fus(F)  (9)
where F_s0 denotes the original-scale feature layer, F_fus(·) denotes the embedded attention fusion module, and F denotes the input feature layer; the original-scale down-scaling feature layer is computed as:
F_s1 = f_ds(F_fus(f_us(F)))  (10)
where F_s1 denotes the original-scale down-scaling feature layer, f_ds(·) denotes the down-sampling function and f_us(·) denotes the up-sampling function; the original-scale up-scaling feature layer is computed as:
F_s2 = f_us(F_fus(f_ds(F)))  (11)
where F_s2 denotes the original-scale up-scaling feature layer;
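The three branches of formulas (9)-(11) can be sketched with stand-in operators: block averaging for f_ds, nearest-neighbour repetition for f_us, and the identity in place of the learned attention-fusion module F_fus. Plain channel concatenation replaces the learned W_ms/b_ms projection — all of these substitutions are assumptions for illustration:

```python
import numpy as np

def f_ds(F, s=2):
    """Stand-in down-sampling: block averaging (an assumption)."""
    c, h, w = F.shape
    return F.reshape(c, h // s, s, w // s, s).mean(axis=(2, 4))

def f_us(F, s=2):
    """Stand-in up-sampling: nearest-neighbour repetition (an assumption)."""
    return F.repeat(s, axis=1).repeat(s, axis=2)

def F_fus(F):
    """Placeholder for the embedded attention fusion module of step 3.4."""
    return F

def multi_scale(F):
    F_s0 = F_fus(F)                  # formula (9): original scale
    F_s1 = f_ds(F_fus(f_us(F)))      # formula (10): down-scaling branch
    F_s2 = f_us(F_fus(f_ds(F)))      # formula (11): up-scaling branch
    return np.concatenate([F_s0, F_s1, F_s2], axis=0)

F_ms = multi_scale(np.ones((4, 8, 8)))
print(F_ms.shape)  # → (12, 8, 8)
```

Because each branch resamples back to the input grid, all three outputs share one spatial size and can be concatenated channel-wise.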
step 3.6, constructing an adaptive loss function consisting of two parts: an L1 loss function, which avoids over-fitting of the network parameters to outliers, and an L2 loss function, which preserves numerical relationships; the adaptive loss function is defined as:
L_total(Θ) = αL1(Θ) + βL2(Θ)  (12)
where L_total(Θ) denotes the adaptive loss function, L1(Θ) denotes the L1 loss function, L2(Θ) denotes the L2 loss function, Θ denotes the neural network parameters, and α and β are regularization parameters that adjust the weights of the L1 and L2 loss functions; specifically, the L1 loss function is defined as:
L1(Θ) = (1/N) Σ_{i=1}^{N} ‖ρ_i − ξ(y_i)‖₁  (13)
where N denotes the number of training image pairs, x_i and y_i denote the high-resolution and low-resolution fully-polarized synthetic aperture radar images of the i-th training pair respectively, ρ_i denotes the residual between the feature-layer up-sampling result and the high-resolution synthetic aperture radar image, and ξ(·) denotes the output of the super-resolution reconstruction network;
the L2 loss function is defined as:
L2(Θ) = (1/N) Σ_{i=1}^{N} ‖ρ_i − ξ(y_i)‖₂²  (14)
the adaptive regularization parameters α and β are computed as follows:
5. The method of claim 2, wherein: the training mechanism adopted in step 4 comprises a supervised training mechanism and a zero-shot training mechanism; the super-resolution reconstruction network is first trained with the supervised training mechanism to obtain a converged pre-training network, and the pre-training network is then further trained with the zero-shot training mechanism to obtain a converged reconstruction network;
step 4.1, the supervised training mechanism, i.e. supervised training using an external database: the super-resolution reconstruction network is supervised-trained on the paired low-/high-resolution synthetic aperture radar images of the training data set constructed in step 2, yielding a converged pre-training network; through this mechanism, the external information of the external data set is fully utilized;
step 4.2, the zero-shot training mechanism, i.e. training using the internal information of the data: the pre-training network is trained with the low-resolution synthetic aperture radar image alone; specifically, this mechanism degrades the low-resolution image to be processed with formula (1) to obtain a down-sampling result, takes the down-sampling result as the target image of the pre-training network and the low-resolution image to be processed as the reference image, thereby establishing a pairing relation between the down-sampling result and the low-resolution image, and performs self-supervised training of the pre-training network; through this mechanism, the internal information of the image to be processed is fully utilized.
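The zero-shot pairing of step 4.2 can be demonstrated end to end on a toy scale: the low-resolution image y is degraded once more with formula (1), and the resulting internal pair trains the model with no external data. A one-parameter gain model stands in for the pre-training network, and block averaging / nearest-neighbour resampling stand in for f_ds / f_us (all assumptions):

```python
import numpy as np

def f_ds(img, s=2):
    """Degradation of formula (1): block averaging (an assumed f_ds)."""
    c, h, w = img.shape
    return img.reshape(c, h // s, s, w // s, s).mean(axis=(2, 4))

def f_us(img, s=2):
    """Nearest-neighbour up-sampling back to the original grid."""
    return img.repeat(s, axis=1).repeat(s, axis=2)

rng = np.random.default_rng(0)
y = rng.random((4, 8, 8))     # the low-resolution image to be processed

# Internal pairing: the down-sampled result of y is paired with y itself,
# so the fine-tuning step needs no external database.
z = f_ds(y)
u = f_us(z)

w = 0.0                       # toy one-parameter "network": x_hat = w * u
for _ in range(200):          # self-supervised gradient descent on the pair
    grad = 2.0 * np.mean((w * u - y) * u)
    w -= 0.5 * grad

print(round(w, 2))  # → 1.0
```

The least-squares optimum of this toy problem is exactly w = 1, so convergence of the loop is a quick sanity check that the internal pair carries a consistent supervisory signal.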
6. The method of claim 2, wherein: the overlap area between image pairs in step 2.1 is 20%.
7. The method of claim 2, wherein: the data enhancement in step 2.2 includes rotation by 90 °, rotation by 180 °, rotation by 270 °, and flipping.
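One common reading of the enhancement in claim 7 — the identity plus rotations by 90°, 180° and 270°, each with a flipped copy — yields an eight-fold augmentation, sketched here for a channels-first image:

```python
import numpy as np

def augment(img):
    """Eight-fold data enhancement (assumed reading of claim 7):
    rotations by 0/90/180/270 degrees plus a horizontal flip of each."""
    rots = [np.rot90(img, k, axes=(1, 2)) for k in range(4)]
    return rots + [r[:, :, ::-1] for r in rots]

imgs = augment(np.random.rand(4, 8, 8))
print(len(imgs))  # → 8
```

The same transform must be applied to both members of each low-/high-resolution pair so the pairing relation of step 2.1 is preserved.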
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011348480.2A CN112419155B (en) | 2020-11-26 | 2020-11-26 | Super-resolution reconstruction method for fully-polarized synthetic aperture radar image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112419155A CN112419155A (en) | 2021-02-26 |
CN112419155B true CN112419155B (en) | 2022-04-15 |
Family
ID=74842530
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112801928B (en) * | 2021-03-16 | 2022-11-29 | 昆明理工大学 | Attention mechanism-based millimeter wave radar and visual sensor fusion method |
CN113052848B (en) * | 2021-04-15 | 2023-02-17 | 山东大学 | Chicken image segmentation method and system based on multi-scale attention network |
CN113658047A (en) * | 2021-08-18 | 2021-11-16 | 北京石油化工学院 | Crystal image super-resolution reconstruction method |
CN113793267B (en) * | 2021-09-18 | 2023-08-25 | 中国石油大学(华东) | Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism |
CN114254715B (en) * | 2022-03-02 | 2022-06-03 | 自然资源部第一海洋研究所 | Super-resolution method, system and application of GF-1WFV satellite image |
CN114972041B (en) * | 2022-07-28 | 2022-10-21 | 中国人民解放军国防科技大学 | Polarization radar image super-resolution reconstruction method and device based on residual error network |
CN116128727B (en) * | 2023-02-02 | 2023-06-20 | 中国人民解放军国防科技大学 | Super-resolution method, system, equipment and medium for polarized radar image |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109584161A (en) * | 2018-11-29 | 2019-04-05 | 四川大学 | The Remote sensed image super-resolution reconstruction method of convolutional neural networks based on channel attention |
CN109903228A (en) * | 2019-02-28 | 2019-06-18 | 合肥工业大学 | A kind of image super-resolution rebuilding method based on convolutional neural networks |
CN110992270A (en) * | 2019-12-19 | 2020-04-10 | 西南石油大学 | Multi-scale residual attention network image super-resolution reconstruction method based on attention |
AU2020100200A4 (en) * | 2020-02-08 | 2020-06-11 | Huang, Shuying DR | Content-guide Residual Network for Image Super-Resolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||