CN113538234A

CN113538234A - Remote sensing image super-resolution reconstruction method based on lightweight generation model

Info

Publication number: CN113538234A
Application number: CN202110730777.3A
Authority: CN
Inventors: 聂婕; 袁丽媛; 魏志强; 郑程予; 耿浩冉; 韦志国
Original assignee: Ocean University of China
Current assignee: Ocean University of China
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2021-10-22

Abstract

The invention belongs to the technical field of image generation, and discloses a remote sensing image super-resolution reconstruction method based on a lightweight generation model, which is used for constructing a generation countermeasure super-resolution network total model based on a remote sensing image, wherein the generation countermeasure super-resolution network total model comprises a generation network and a judgment network, and a super-resolution image which is more similar to a real image is generated through continuous training countermeasures of the generation network and the judgment network; the generation network integrates a significance mechanism of attention, weight distribution is carried out on different channel characteristics again, so that the model can learn the characteristic distribution of the remote sensing image in a self-adaptive mode, two network model parameters are continuously optimized in an iterative mode through a loss function and an optimizer, dynamic balance between the two network model parameters is expected to be achieved, the probability distribution of the generated image is approximately the same as the data distribution of a real image, and the problem of unsuitability of image super-resolution reconstruction is solved.

Description

Remote sensing image super-resolution reconstruction method based on lightweight generation model

Technical Field

The invention belongs to the technical field of image generation, relates to a remote sensing image generation method, and particularly relates to a remote sensing image super-resolution reconstruction method based on a lightweight generation model.

Background

Because remote sensing images are mostly long-distance imaging, one pixel can often cover the earth surface range of several square meters or even thousands of square meters, so the obtained images have the characteristics of complex ground object types, fuzzy target elements and the like, great difficulty is caused to later-stage ground object extraction, interpretation and the like, and the images have certain limitation in related field research and application. Therefore, generating a remote sensing image close to the real image is the key to overcome this limitation.

The existing method for realizing image super-resolution reconstruction is to use depth residual convolution to generate a countermeasure network, the generation network mostly uses a structure formed by connecting a plurality of residual modules in series, and the generated image and a natural real image are used as the input of a discriminator. Through continuous training and confrontation of the generation network and the discrimination network, the image generated by the generation network is more and more approximate to a real image. In addition, at present, convolutional neural network hyper-resolution reconstruction methods based on different structures also exist, for example, a method based on a residual structure reconstructs high-frequency detail information of an image by learning a residual between a high-resolution image and a stretched and amplified low-resolution image; for example, the method based on the dense structure enables the image characteristics to be fully transmitted by splicing the characteristics of each layer in the front as the input of a subsequent network, thereby realizing efficient network training; for example, a method based on a recursive structure learns each level of residual error between a low-resolution image and a high-resolution image, then adds the residual error with the original low-resolution image to obtain a corresponding reconstruction result of each level, and performs weighting calculation on the result to obtain a final super-resolution image.

At present, the super-resolution reconstruction method for generating the countermeasure network based on the depth residual convolution is mostly developed on the basis of common images, and some challenges exist in the super-resolution research of remote sensing images. Particularly, the remote sensing image has rich earth surface information, obvious ground object characteristics and obvious target object textures, and the remote sensing image is mostly formed in a long distance, so that the problems of low image resolution, large image deformation and the like easily exist because the remote sensing image is large in spatial scale. This makes it difficult to apply the super-resolution method of the common image directly to the remote sensing image. Therefore, for remote sensing images, a new network model is required to be proposed to improve the resolution of the images.

Compared with the common image, the remote sensing image has the differences of image dimension and spatial resolution, and has the characteristics of rich texture characteristics, strong autocorrelation and the like, so that the difficulty in reconstructing the high-definition remote sensing image is higher. Meanwhile, from the current research situation, the higher the image precision is, the more the number of network layers is, and the harder the training is. Therefore, on the basis of ensuring the image accuracy, thinking about how to solve the problems of large network parameter quantity, difficult training and the like also becomes a challenge.

In contrast, the method is designed in aspects of establishing a network model, adjusting related learning strategies, reasonably planning a data set and the like so as to accurately extract the characteristics of the remote sensing image, achieve the aims of recovering the edge texture of the image and improving the perception quality, improve the network structure and reduce the number of parameters under the condition of ensuring that the precision is not changed, and realize light weight design.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a remote sensing image super-resolution reconstruction method based on a lightweight generation model, and designs a generation countermeasure super-resolution network total model based on a remote sensing image, which consists of a generation network and a judgment network, and generates a super-resolution image which is more similar to a real image through continuous training countermeasures of the generation network and the judgment network. The generation network fuses a significance mechanism of attention, is named as an attention generation network AGN, redistributes weight distribution on different channel characteristics, fully considers interdependency among different channels, adopts methods with different importance degrees and different treatments, enables a model to be capable of learning the characteristic distribution of a remote sensing image in a self-adaptive mode, and continuously iteratively optimizes two network model parameters through a loss function and an optimizer so as to achieve dynamic balance between the two network model parameters, and enables the probability distribution of the generated image to be approximately the same as the data distribution of a real image so as to relieve the ill-posed problem of image super-resolution reconstruction. In addition, the respective advantages of the depth separable convolution and the residual error network are fully utilized, image feature learning is realized, network redundancy is reduced, lightweight network design is realized, and the problems that most of super-resolution reconstruction methods based on depth learning are large in network layer number, large in parameter quantity and the like are solved.

In order to solve the technical problems, the invention adopts the technical scheme that:

the remote sensing image super-resolution reconstruction method based on the lightweight generative model comprises the following steps:

step one, building a lightweight generation network structure named attention generation network AGN, wherein the generation network structure integrates a channel attention mechanism significance model, and simultaneously, replacing a common convolution with a deep separable convolution;

and secondly, constructing a generation countermeasure super-resolution network total model based on the remote sensing image, wherein the model comprises a generation network and a discrimination network, and generating a super-resolution image which is more similar to the real image through continuous training countermeasures of the generation network and the discrimination network.

Further, the attention generation network AGN comprises a shallow feature extraction layer f₁Channel attention layer f₂Middle layer characteristic extraction layer f₃Upper sampling layer f₄；

Attention generating network AGN denoted f_AGNInputting a low resolution image into the network, where the low resolution image is marked as LR and the output super resolution image is marked as SR, we can obtain:

SR＝f_AGN(LR)

through the whole process of attention generation network AGN, it can be expressed as:

SR＝f₄(f₃(f₂(f₁(LR))+f₁(LR))+f₁(LR))

further, the channel attention layer f₂Composed of a CAdepth Block composed of a channel attention base Block CA _ Block and two layers of depth separable convolution nodesA Depth _ Block component; the channel attention basic Block CA _ Block consists of an average pooling layer, a convolution layer and a Sigmoid function; and the input features are scaled through a channel attention basic Block CA _ Block to obtain output, and the input and the output are multiplied to obtain the final output of the channel attention basic Block CA _ Block.

Furthermore, the average pooling layer of the channel attention basic Block CA _ Block obtains statistics represented by spatial features of each channel, the first convolutional layer and the second convolutional layer use convolution kernels with the same size, but the number of output channels of the first convolutional layer is reduced by four times, the output channels of the second convolutional layer are expanded to the original number of channels, then the reconstructed channel representation is input into a Sigmoid function, weight distribution is performed on features of different channels, dependency relationships of different channels are established, and finally multiplication with the input channel structure is performed to obtain output.

Further, the channel attention layer f₂The adopted depth separable convolution structure is used for realizing image feature extraction through two convolution processes of channel by channel and point by point, firstly, the spatial dimension of an input feature map is fused by using channel by channel convolution, each convolution operation is processed with a corresponding single input channel, and then the obtained feature map channel information is fused by using point by point convolution.

Furthermore, during upsampling, the remote sensing image features are mapped into a high-resolution space from a low-resolution space by utilizing sub-pixel convolution operation to obtain a square characteristic diagram of an upsampling factor, pixels are extracted from the same position in each characteristic diagram to carry out pixel recombination operation, the recombined pixels are sub-pixels on a new characteristic diagram, and finally an amplified image of the upsampling factor multiple is obtained.

Further, the generation network and the discrimination network for generating the total antagonistic super-resolution network model optimize the network through a super-resolution reconstruction algorithm, wherein the super-resolution reconstruction algorithm comprises the following specific steps:

step 1: input training sample pair P ═ X, Y }: the high-resolution remote sensing image sample in the training set is X₁,X₂,…X_n]After the cutting is carried out at random,obtaining a high resolution image sample X ═ X₁,x₂,…x_n]After down-sampling operation, obtaining corresponding low-resolution image sample Y ═ Y₁,y₂,…y_n]；

Step 2: converting the low resolution image sample Y to [ Y ═ Y₁,y₂,…y_n]Generating a network according to batch input, and obtaining a super-resolution image S ═ S corresponding to the network₁,s₂,…s_n]；

Step 3: inputting the super-resolution image sample S obtained in Step2 and the cut high-resolution image sample X into a discrimination network; then the discrimination network gives out its discrimination result, i.e. the probability that the input image is a natural image, which is respectively marked as p_s、p_x(ii) a Then calculating to obtain the discrimination loss d_lossAnd the result is transmitted reversely, and the judgment network parameter is updated;

step 4: generating network gradient zero setting, calculating and generating network loss g_lossAnd reversely transmitting the result, and updating and generating the network parameter;

then, inputting the low-resolution image sample Y into a generation network, generating a corresponding super-resolution image S by the generation network, and calculating the generation network loss g after updating the parameters_lossAnd recalculating the discrimination loss d after updating the parameters_loss。

Step 5: returning to the

steps

3 and 4, the parameters of different batches of data are updated continuously until the round of training is finished.

Further, at Step3, the loss d is discriminated_lossThe calculation formula is as follows:

d_loss＝1-p_x+p_s

at Step4, it generates a network loss g_lossThe calculation formula is as follows:

g_loss＝MSE_loss+VGG_loss+Adversarial_loss

therein, MSE_lossFor computing high-definition images

And generating an image

The inter-pixel difference relationship of (a); CGG_lossFor computationally generating images

And high definition image I^HREuclidean distances between the feature representations; adversal_lossGenerating images for discriminant model determination

Probability of a high-definition image in an original data set; the above three network losses are calculated as follows:

wherein the content of the first and second substances,

representation Generation of convolutional neural network, θ_GRepresenting network weight and bias terms;

refers to a feature graph, W, obtained through a VGG-19 network_i,jAnd H_i,jDimensions representing a VGG-19 network feature graph;

characterised by a probability value, i.e. discrimination model determining the image to be generated

Probability of a high-definition image in the original data set.

Compared with the prior art, the invention has the advantages that:

(1) the invention provides a novel generation confrontation super-resolution network total model based on remote sensing images, an attention generation network AGN integrates a significance mechanism of channel attention, different channel characteristics are subjected to weight distribution again, the interdependency among different channels is fully considered, and different treatment methods with different importance degrees are adopted, so that the model can learn the characteristic distribution of the remote sensing images in a targeted manner.

Through a loss function and two network model parameters of the optimizer for continuously iterating and optimizing the generation network and the judgment network, the probability distribution of the generated image is approximately the same as the data distribution of the real image, and the problem of unsuitability of super-resolution reconstruction of the image is solved.

(2) The invention realizes image feature learning, reduces network redundancy and realizes lightweight network design by utilizing respective advantages of the depth separable convolution and the residual error network. The method comprises the steps of replacing the common convolution of the generated network with the deep separable convolution to reduce the parameter quantity of the model, improve the performance of the network model and help to achieve the aim of lightweight design of the network. And then, the input features and the output features of the channel attention are added by using a residual jump connection function, and the feature graph of the shallow network is transmitted to a later network layer, so that the connection of features among network layers is enhanced, and the network is helped to recover more image detail information.

Experimental results show that the model reconstruction method is good in model reconstruction effect and excellent in network training, information can be fully mined in remote sensing image data, high-frequency information of the image can be recovered, and the image edge contour and texture expression capacity can be improved. And under the condition of the same image reconstruction effect, the quantity of the used parameters and the calculated amount are less, and the method can play an important practical significance in the aspect of remote sensing application.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

FIG. 1 is a network architecture diagram of the overall model of the network of the present invention;

FIG. 2 is a network architecture diagram of a generation network of the present invention;

FIG. 3 is a flow chart of a channel attention method of the present invention;

fig. 4 is a flowchart of an upsampling method of the present invention.

Detailed Description

The invention is further described with reference to the following figures and specific embodiments.

The invention researches a remote sensing image super-resolution reconstruction method for generating a countermeasure network based on an attention mechanism. Firstly, the characteristics of the remote sensing image are analyzed, and effective characteristic information of the remote sensing image is mined. Thirdly, a lightweight network design method is designed, network redundancy is reduced, and network performance is improved. In contrast, the embodiment provides a remote sensing image super-resolution reconstruction method based on a lightweight generation model, which includes the following steps:

step one, building a lightweight generation network structure named as an attention generation network AGN, wherein the generation network structure integrates a channel attention mechanism significance model, fully excavates detail characteristics such as edge textures of remote sensing images, recovers image high-frequency information, achieves a good reconstruction effect, and meanwhile replaces common convolution with deep separable convolution, so that parameter quantity of the model is effectively reduced, and the goal of network lightweight design is facilitated;

and step two, constructing a remote sensing image-based generation countermeasure super-resolution network total model, wherein the model comprises a generation network (attention generation network AGN in step one) and a discrimination network, and generating a super-resolution image which is more similar to a real image through continuous training countermeasures of the generation network and the discrimination network.

Each part is described below:

network structure of one, network general model

The network structure is as shown in fig. 1, the input image is a training data set of remote sensing images, the image size is 256 × 256, the size of the low resolution image obtained after four times down-sampling operation is 64 × 64, and then the image is input into the generation network to obtain a reconstructed super-resolution image, at this time, the image size is 256 × 256. And inputting the reconstructed image and the corresponding real image into a discrimination network, so that the discrimination network discriminates the generated image and the real image, and optimizing the network by a network generation and discrimination network game method.

1. Generating networks

The generation network is an end-to-end simple network structure, and the invention adopts a post-up sampling method, and focuses on the generation of the network AGN, which comprises a shallow feature extraction layer f₁Channel attention layer f₂Middle layer characteristic extraction layer f₃Upper sampling layer f₄The network architecture is shown in fig. 2:

the present embodiment represents the attention generating network AGN as f_AGNInputting a low resolution image into the network, where the low resolution image is marked as LR and the output super resolution image is marked as SR, we can obtain:

SR＝f_AGN(LR)

in this embodiment, the shallow feature extraction layer is composed of a 9 × 9 convolution kernel, the channel attention layer is composed of a cadempt Block, and the channel attention layer is composed of a channel attention base Block CA _ Block and a Depth _ Block composed of two layers of Depth separable convolution structures; the middle layer feature extraction layer is composed of a convolution kernel of 3 multiplied by 3; the up-sampling layer is composed of two sub-pixel convolution up-sampling blocks; and finally, obtaining a complete network architecture through a layer of convolution kernel of 9 multiplied by 9.

SR＝f₄(f₃(f₂(f₁(LR))+f₁(LR))+f₁(LR))

channel attention method: it should be noted that the channel attention basic Block CA _ Block described in this embodiment is composed of an average pooling layer, a convolutional layer, and a Sigmoid function. The statistics of the spatial feature representation of each channel is obtained through the average pooling layer, the convolution kernel with the size of 1 multiplied by 1 is used in the first convolution layer, the number of output channels is reduced by four times, the dimensionality is reduced, and the complexity of the model can be limited. The second layer of convolution also uses convolution kernels with the same size, output channels are expanded to the original number of the channels, then reconstructed channel representations are input into a Sigmoid function, weight distribution is carried out on different channel characteristics, the dependency relationship of different channels is established, the utilization degree of image characteristics is improved, the generalization capability of a model is enhanced, and finally multiplication with an input channel structure is carried out to obtain output. In a word, the input features are scaled by the channel attention basic Block CA _ Block to obtain an output, and then the input and the output are multiplied to obtain the final output of the channel attention basic Block CA _ Block. The specific process is shown in fig. 3.

Depth separable convolution: channel attention layer f of the present embodiment₂The adopted depth separable convolution structure is mainly used for realizing image feature extraction through two convolution processes of channel by channel and point by point. The idea is as follows: the method comprises the steps of firstly fusing the space dimensions (namely height and width) of an input feature map by using channel-by-channel convolution, processing each convolution operation with a corresponding single input channel, and then fusing the obtained feature map channel information by using point-by-point convolution.

Using the depth separable convolution process can be expressed as follows: after passing through the CA _ Block module, the size of the obtained input feature map is 64 × 64 × 64 (length × width × number of channels). The first step is to perform a channel-by-channel convolution operation, where 3 × 3 × 64 single convolution kernel in the conventional convolution is changed to use 64 convolution kernels of 3 × 3 × 1 size, so that each convolution kernel and each channel of the corresponding input feature map are respectively subjected to convolution operation to obtain 64 output features of 64 × 64 × 1 size, and the 64 feature maps are stacked together to obtain a 64 × 64 × 64 output feature map. And secondly, performing point-by-point convolution operation, performing convolution operation on the output feature map obtained in the first step by using a convolution kernel of 1 × 1 × 64 to obtain a feature map of 64 × 64 × 1, wherein the number of the convolution kernels of 1 × 1 is 64, and the output feature map of 64 × 64 × 64 size can be obtained through a depth separable convolution process.

The up-sampling method comprises the following steps: during upsampling, the remote sensing image features are mapped into a high-resolution space from a low-resolution space by utilizing sub-pixel convolution operation to obtain a square characteristic diagram of an upsampling factor, pixels are extracted from the same position in each characteristic diagram to carry out pixel recombination operation, the recombined pixels are sub-pixels on a new characteristic diagram, and finally an amplified image of the upsampling factor multiple is obtained. Specifically, after a convolution operation is performed on an input image of H × W size, n is obtained²The process of (n is an up-sampling factor) a H × W feature map, and then performing pixel reconstruction on the feature map to obtain an nH × nW enlarged image is shown in fig. 4.

Second, super-resolution reconstruction algorithm

The generation network and the discrimination network for generating the total model of the confrontation super-resolution network optimize the network through a super-resolution reconstruction algorithm, and the sample of the high-resolution remote sensing image in the training set is assumed to be [ X ]₁,X₂,…X_n]Obtaining high-resolution image sample X ═ X after random cutting₁,x₂,…x_n]After down-sampling operation of a certain scale factor, obtaining a corresponding low-resolution image sample Y ═ Y₁,y₂,…y_n]. Through the above operations, the training sample pair (i.e., the high-resolution image pair and the low-resolution image pair) is P ═ X, Y.

The super-resolution reconstruction algorithm comprises the following specific steps:

step 1: input training sample pair P ═ X, Y }:

step 2: converting the low resolution image sample Y to [ Y ═ Y₁,y₂,…y_n]Generating a network G according to batch input, and obtaining a super-resolution image S ═ S corresponding to the network G₁,s₂,…s_n]；

Step 3: inputting the super-resolution image sample S obtained in Step2 and the cut high-resolution image sample X into a discrimination network D; then the discrimination network D gives the discrimination result, i.e. the probability that the input image is a natural image, which is respectively marked as p_s、p_x(ii) a Then calculateTo obtain the discrimination loss d_lossAnd the result is propagated reversely, and the judgment network parameters are updated. The discrimination loss d_lossThe calculation formula is as follows:

d_loss＝1-P_x+P_s

step 4: generating network G gradient zero setting, calculating generated network loss G_lossAnd the result is propagated reversely, and the network parameters are updated and generated. It generates a network loss g_lossThe calculation formula is as follows:

g_loss＝MSE_loss+VGG_loss+Adversarial_loss

therein, MSE_lossIs a loss calculation method commonly adopted in the field of image processing and is used for calculating high-definition images

And generating an image

The inter-pixel difference relationship of (a); VGG_lossFor computationally generating images

Probability of a high-definition image in the original data set. The above three network losses are calculated as follows:

wherein the content of the first and second substances,

Probability of a high-definition image in the original data set.

Then, the low-resolution image sample Y is input into a generation network G, the generation network G generates a corresponding super-resolution image S, and the generation network loss G after the update parameters are calculated_lossAnd recalculating the discrimination loss d after updating the parameters_loss。

Step 5: returning to the

steps

The technical effects of the invention are verified:

selecting a remote sensing image Data Set with sufficient samples, selecting 600 remote sensing images of airlane in a NWPU-RESISC45 Data Set for training, selecting 200 remote sensing images in RSOD-Dataset for verification, and performing a series of test experiments on public Data sets such as RSOD-Dataset, NWPU-RESISC45, NWPU-VHR, UC Merceded Land-Use Data Set and the like. And finally, evaluating and comparing the experimental results by utilizing two evaluation indexes, namely PSNR and SSIM, developing a comparison experiment and an ablation experiment of the same reference method, verifying and optimizing.

Experimental results show that the overall effect of the hyper-resolution image of the AGN network provided by the invention is higher than that of the original SRGAN network, the PSNR mean value is about 0.2dB higher than that of the SRGAN, and for a single test image, the PSNR value is about 1dB higher than that of the SRGAN, and SSIM is about 0.5. Compared with the SRGAN network model, the parameter number of the AGN network model is greatly reduced by 49%, the floating point calculation amount is reduced by 26%, and the occupied memory space is reduced by 20%. Therefore, the AGN network model is lighter in weight and higher in model efficiency.

In conclusion, the invention provides a novel generation confrontation super-resolution network model AGN based on the remote sensing image, the network integrates the significance mechanism of attention, the weight distribution is carried out on the characteristics of different channels again, the interdependency among different channels is fully considered, and the method of different treatment with different importance degrees is adopted, so that the model can pertinently learn the characteristic distribution of the remote sensing image, two network model parameters are continuously optimized in an iterative mode through a loss function and an optimizer, and the high-frequency details of the remote sensing image are recovered. Experiments show that the hyper-resolution reconstruction model provided by the method is good in reconstruction effect and excellent in network training, information can be fully mined in remote sensing image data, high-frequency information of the image can be recovered, and the image edge contour and texture expression capacity can be improved.

The invention realizes image feature learning and reduces network redundancy by utilizing respective advantages of the depth separable convolution and the residual error network. The method comprises the steps of replacing the common convolution of the generated network with the deep separable convolution to reduce the parameter quantity of the model, improve the performance of the network model and help to achieve the aim of lightweight design of the network. And then, the input features and the output features of the channel attention are added by using a residual jump connection function, and the feature graph of the shallow network is transmitted to a later network layer, so that the connection of features among network layers is enhanced, and the network is helped to recover more image detail information. Through the adjustment of the loss function, the model is continuously tested and optimized, and the network structure is adjusted to reduce the parameter number under the condition of ensuring that the precision is not changed, so that the lightweight design is realized. The experimental result shows that under the condition of the same image reconstruction effect, the quantity of the used parameters and the calculated quantity are less, and the method can play an important practical significance in the aspect of remote sensing application.

It is understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art should understand that they can make various changes, modifications, additions and substitutions within the spirit and scope of the present invention.

Claims

1. The remote sensing image super-resolution reconstruction method based on the lightweight generative model is characterized by comprising the following steps of:

and secondly, constructing a generation countermeasure super-resolution network total model based on the remote sensing image, wherein the model comprises the generation network and the judgment network in the step one, and generating a super-resolution image which is more similar to the real image through continuous training countermeasures of the generation network and the judgment network.

2. The remote sensing image super-resolution reconstruction method based on lightweight generative model of claim 1, wherein the attention generation network AGN comprises a shallow feature extraction layer f₁Channel attention layer f₂Middle layer characteristic extraction layer f₃Upper sampling layer f₄；

SR＝f_AGN(LR)

SR＝f₄(f₃(f₂(f₁(LR))+f₁(LR))+f₁(LR))。

3. according to the claimsSolving 2 the remote sensing image super-resolution reconstruction method based on the lightweight generative model is characterized in that the channel attention layer f₂The device comprises a CAdepth Block, a channel attention base Block CA _ Block and a Depth _ Block consisting of two layers of Depth separable convolution structures; the channel attention basic Block CA _ Block consists of an average pooling layer, a convolution layer and a Sigmoid function; and the input features are scaled through a channel attention basic Block CA _ Block to obtain output, and the input and the output are multiplied to obtain the final output of the channel attention basic Block CA _ Block.

4. The remote sensing image super-resolution reconstruction method based on the lightweight generative model as claimed in claim 3, wherein the average pooling layer of the channel attention basic Block CA _ Block obtains statistics of spatial feature representation of each channel, the first convolution layer and the second convolution layer use convolution kernels with the same size, but the number of output channels of the first convolution layer is reduced by four times, the output channels of the second convolution layer are expanded to the original number of channels, then the reconstructed channel representation is input into a Sigmoid function, weight distribution is carried out on different channel features, dependency relationships of different channels are established, and finally multiplication with the input channel structure is carried out to obtain output.

5. The remote sensing image super-resolution reconstruction method based on lightweight generative model as claimed in claim 3, wherein the channel attention layer f₂The adopted depth separable convolution structure is used for realizing image feature extraction through two convolution processes of channel by channel and point by point, firstly, the spatial dimension of an input feature map is fused by using channel by channel convolution, each convolution operation is processed with a corresponding single input channel, and then the obtained feature map channel information is fused by using point by point convolution.

6. The remote sensing image super-resolution reconstruction method based on the lightweight generative model as claimed in claim 5, wherein during the up-sampling, the remote sensing image features are mapped from a low resolution space to a high resolution space by using sub-pixel convolution operation to obtain a square feature map of an up-sampling factor, then pixels are extracted from the same position in each feature map to carry out pixel recombination operation, the recombined pixels are sub-pixels on a new feature map, and finally an amplified image with the multiple of the up-sampling factor is obtained.

7. The remote sensing image super-resolution reconstruction method based on the lightweight generation model according to any one of claims 1 to 6, wherein the generation network and the discrimination network for generating the total model of the anti-super-resolution network optimize the network through a super-resolution reconstruction algorithm, and the super-resolution reconstruction algorithm comprises the following specific steps:

step 1: input training sample pair P ═ X, Y }: the high-resolution remote sensing image sample in the training set is X₁，X₂，…X_n]Obtaining high-resolution image sample X ═ X after random cutting₁，x₂，…x_n]After down-sampling operation, obtaining corresponding low-resolution image sample Y ═ Y₁，y₂，…y_n]；

Step 2: converting the low resolution image sample Y to [ Y ═ Y₁，y₂，…y_n]Generating a network according to batch input, and obtaining a super-resolution image S ═ S corresponding to the network₁，s₂，…s_n]；

then, inputting the low-resolution image sample Y into a generation network, generating a corresponding super-resolution image S by the generation network, and calculating and updatingParameterized generated network loss g_lossAnd recalculating the discrimination loss d after updating the parameters_loss。

Step 5: returning to the steps 3 and 4, the parameters of different batches of data are updated continuously until the round of training is finished.

8. The remote sensing image super-resolution reconstruction method based on lightweight generative model of claim 7, wherein the discrimination loss d is determined in Step3_lossThe calculation formula is as follows:

d_loss＝1-p_x+p_s

g_loss＝MSE_loss+VGG_loss+Adversarial_loss

therein, MSE_lossFor computing high-definition images

And generating an image

wherein the content of the first and second substances,

refers to a feature graph, W, obtained through a VGG-19 network_i，jAnd H_i，jDimensions representing a VGG-19 network feature graph;

Probability of a high-definition image in the original data set.