CN116228608B - Processing network for defogging remote sensing image and defogging method for remote sensing image - Google Patents

Processing network for defogging remote sensing image and defogging method for remote sensing image Download PDF

Info

Publication number
CN116228608B
CN116228608B CN202310517870.5A
Authority
CN
China
Prior art keywords
self
attention module
layer
attention
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310517870.5A
Other languages
Chinese (zh)
Other versions
CN116228608A (en)
Inventor
李冠群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Genyu Muxing Beijing Space Technology Co ltd
Original Assignee
Genyu Muxing Beijing Space Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Genyu Muxing Beijing Space Technology Co ltd filed Critical Genyu Muxing Beijing Space Technology Co ltd
Priority to CN202310517870.5A priority Critical patent/CN116228608B/en
Publication of CN116228608A publication Critical patent/CN116228608A/en
Application granted granted Critical
Publication of CN116228608B publication Critical patent/CN116228608B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G06T5/73
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention relates to the technical field of image processing, and provides a processing network for defogging a remote sensing image and a defogging method for the remote sensing image. The processing network comprises: an image input end for acquiring an original foggy remote sensing image to be processed; and N stepped self-attention module layers. Each stepped self-attention module layer comprises one or more stepped self-attention modules, and the number of stepped self-attention modules contained in a layer decreases as the layer number increases. The inputs and outputs of the stepped self-attention modules in each layer are connected end to end in sequence, and the stepped self-attention modules in adjacent layers are connected through up-sampling or down-sampling. The method and the device can better capture the global context information of the remote sensing image, thereby removing atmospheric influences such as haze and improving the visibility of the remote sensing image.

Description

Processing network for defogging remote sensing image and defogging method for remote sensing image
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a processing network for defogging a remote sensing image and a defogging method for the remote sensing image.
Background
With the development of science and technology, remote sensing images are widely applied to various scenes, such as environmental monitoring, disaster management and city planning. However, the quality of these remote sensing images is often affected by factors such as atmospheric haze, resulting in reduced visibility of objects in the images. In the prior art, an image is usually defogged by adopting a defogging method, and defogging is a technology for restoring the visibility of an object in a blurred image.
Traditional defogging methods are based on hand-crafted features and heuristics; they have limited capability of accurately recovering the visibility of objects in a blurred image, although such methods are still in use at present and have obtained favorable results. However, existing image defogging methods are usually oriented to natural images. Compared with natural images, remote sensing images contain more complex target scenes and semantic features, so existing image defogging methods can hardly achieve a satisfactory effect on remote sensing image defogging.
Disclosure of Invention
In order to solve at least one technical problem in the background art, the invention provides a processing network for defogging a remote sensing image and a defogging method for the remote sensing image, which are used for capturing global context information of the remote sensing image better aiming at the remote sensing image, so as to remove the influence of atmospheric factors such as haze and the like and improve the visibility of the remote sensing image.
According to a first aspect of the present invention there is provided a processing network for defogging a remote sensing image, the processing network comprising:
the image input end is used for acquiring an original remote sensing image with fog to be processed;
n ladder-type self-attention module layers, wherein the value range of N is 2-10;
each step-type self-attention module layer comprises more than one step-type self-attention module, and the number of the step-type self-attention modules contained in the step-type self-attention module layers is reduced along with the increase of the layer number;
the input and output of more than one stepped self-attention module contained in each stepped self-attention module layer are connected end to end in sequence and are used for extracting the atmospheric characteristics of the remote sensing image; the step-type self-attention modules included in the adjacent step-type self-attention module layers are connected through up-sampling or down-sampling and are used for extracting global context information of the remote sensing image;
the first step type self-attention module of the first step type self-attention module layer is connected with the image input end and is used for extracting the characteristics of the remote sensing image with fog; the last stepwise self-attention module of the first stepwise self-attention module layer is used for outputting defogging images.
Further, the first stepped self-attention module of each stepped self-attention module layer is configured to be connected with the second stepped self-attention module of the stepped self-attention module layer of the upper layer; the last stepped self-attention module of each stepped self-attention module layer is configured to be connected with the penultimate stepped self-attention module of the stepped self-attention module layer of the upper layer.
Further, the number of the stepped self-attention modules in each stepped self-attention module layer is even.
Further, when N is 3, the number of stepped self-attention modules in the first stepped self-attention module layer is six; the number of stepped self-attention modules in the second stepped self-attention module layer is four, the second stepped self-attention module layer being adjacent to the first stepped self-attention module layer; and the number of stepped self-attention modules in the third stepped self-attention module layer is two, the third stepped self-attention module layer being adjacent to the second stepped self-attention module layer.
Further, the second stepped self-attention module and the third stepped self-attention module in the first stepped self-attention module layer are respectively connected with the first stepped self-attention module and the second stepped self-attention module in the second stepped self-attention module layer through downsampling; the second step-type self-attention module in the second step-type self-attention module layer is connected with the first step-type self-attention module in the third step-type self-attention module layer through downsampling;
the second step-type self-attention module in the third step-type self-attention module layer is connected with the third step-type self-attention module in the second step-type self-attention module layer through upsampling; and the third and fourth stepped self-attention modules in the second stepped self-attention module layer are respectively connected with the fourth stepped self-attention module and the fifth stepped self-attention module in the first stepped self-attention module layer through upsampling.
According to a second aspect of the present invention, the present invention further provides a remote sensing image defogging method using the above processing network, including:
acquiring a remote sensing image with fog;
inputting the remote sensing image with fog into the processing network, and extracting the characteristics of the remote sensing image with fog by utilizing N ladder-type self-attention module layers;
and outputting defogging remote sensing images.
Further, the feature extraction of the fogged remote sensing image by using the N step-shaped self-attention module layers includes:
extracting features of the foggy remote sensing image by utilizing the signal transmission of one or more stepped self-attention modules in each stepped self-attention module layer; and
and performing feature extraction on the remote sensing image with fog by utilizing the superposition operation among the stepped self-attention modules in the adjacent stepped self-attention module layers.
Further, the feature extraction of the remote sensing image with fog is performed by using the superposition operation between the stepped self-attention modules in the adjacent stepped self-attention module layers, including:
and superposing the output of the previous stepped self-attention module in the layer where the current stepped self-attention module is located with the output transferred from the adjacent stepped self-attention module layer, and inputting the superposed result into the current stepped self-attention module for feature extraction.
Further, the processing method of the step-type self-attention module comprises the following steps:
performing layer normalization operation and convolution calculation on the features to be input to obtain preprocessing features;
calibrating the preprocessing characteristics and outputting first-layer step characteristics;
multiplying the first layer of step features by the first layer of step features, acting on a Sigmoid activation function, and outputting a second layer of step features;
multiplying the second layer of step features with the first layer of step features and acting on a convolution function to output a third layer of step features;
and adding the to-be-input feature with the third-layer step feature, and outputting a target feature serving as the to-be-input feature of the next step-type self-attention module.
Further, when N is 3, the first stepped self-attention module layer includes six stepped self-attention modules, which are sequentially named A1-A6; the second stepped self-attention module layer includes four stepped self-attention modules, which are sequentially named B1-B4; and the third stepped self-attention module layer includes two stepped self-attention modules, which are sequentially named C1 and C2. The feature extraction performed on the foggy remote sensing image by using the N stepped self-attention module layers then includes:
the module A1 performs feature extraction on the remote sensing image with fog, outputs of the modules A2 and A3 are respectively input into the modules B1 and B2 through downsampling, the output of the module B2 is input into the module C1 through downsampling, and the input of the module B2 comprises the outputs of the module A3 and the module B1; the outputs of the modules B3 and B4 are respectively input to the modules A4 and A5 through upsampling, wherein the input of the module A4 comprises the outputs of the module A3 and the module B3, the input of the module A5 comprises the outputs of the module A4 and the module B4, and the input of the module B3 comprises the output of the module C2 and the output of the module B2; the module A6 performs feature extraction on the output of the module A5 and outputs defogging remote sensing images.
By the technical scheme of the invention, the following technical effects can be obtained:
according to the processing network for defogging the remote sensing image and the remote sensing image defogging method based on the processing network, the global context information in the remote sensing image is gradually captured through designing the plurality of layers of the step-type self-attention modules, so that the processing network is more stable to various environmental conditions, the influence of atmospheric factors such as haze can be removed, and the visibility of the remote sensing image is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a processing network architecture for defogging a remote sensing image according to the present invention;
FIG. 2 is a flow chart of a remote sensing image defogging method based on a processing network in the invention;
FIG. 3 is a flowchart of a method for extracting features of the fogged remote sensing image using the N ladder-type self-attention module layers in FIG. 2;
FIG. 4 is a flow chart of a processing method of the ladder type self-attention module in the present invention;
FIG. 5 is a flowchart of a method for computing a self-attention module according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the terms "first," "second," "third," and "fourth," etc. in the description and claims of the present application are used for distinguishing between different objects and not for describing a particular sequential order. The terms "comprising" and "having" and any variations thereof, in embodiments of the present application, are intended to cover non-exclusive inclusions.
Fig. 1 is a schematic diagram of a processing network structure for defogging a remote sensing image according to an embodiment of the present invention, and as shown in fig. 1, the present invention provides a processing network for defogging a remote sensing image, the processing network comprising:
the image input end is used for acquiring an original remote sensing image with fog to be processed;
n ladder-type self-attention module layers, wherein the value range of N is 2-10;
each step-type self-attention module layer comprises more than one step-type self-attention module, and the number of the step-type self-attention modules contained in the step-type self-attention module layers is reduced along with the increase of the layer number;
the input and output of more than one stepped self-attention module contained in each stepped self-attention module layer are connected end to end in sequence and are used for extracting the atmospheric characteristics of the remote sensing image; the step-type self-attention modules included in the adjacent step-type self-attention module layers are connected through up-sampling or down-sampling and are used for extracting global context information of the remote sensing image;
the first step type self-attention module of the first step type self-attention module layer is connected with the image input end and is used for extracting the characteristics of the remote sensing image with fog; the last stepwise self-attention module of the first stepwise self-attention module layer is used for outputting defogging images.
The processing network processes the remote sensing image multiple times by utilizing the connections among a plurality of stepped self-attention modules (SSAM). Each stepped self-attention module comprises an input port and an output port, processes its input features in the same way, and refines the input features through a three-layer stepped calculation. In this embodiment, the processing network has a symmetrical structure.
The number of stepped self-attention module layers is not limited; optionally, N is 2 to 10, which avoids excessive calculation and facilitates the defogging computation. In this embodiment, the number of stepped self-attention module layers is 3.
The number of stepped self-attention modules within different layers of stepped self-attention modules is different and gradually decreases as the number of layers increases. Optionally, the number of the stepped self-attention modules in each stepped self-attention module layer is even, so that the down-sampling and up-sampling processes can be symmetrical, namely, the data information is symmetrical, and the calculation is convenient.
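The symmetry of the down-sampling and up-sampling paths described above can be sketched numerically. The following is a minimal illustration under assumed conditions (a 2x sampling factor and nearest-neighbour up-sampling; the patent does not fix either choice): a feature map that is down-sampled and then up-sampled returns to its original spatial size.

```python
import numpy as np

def downsample(x):
    # Assumed 2x spatial down-sampling by strided slicing over (C, H, W).
    return x[:, ::2, ::2]

def upsample(x):
    # Assumed 2x nearest-neighbour up-sampling, restoring the spatial size.
    return x.repeat(2, axis=1).repeat(2, axis=2)

feat = np.random.rand(8, 64, 64)   # hypothetical (channels, H, W) feature map
down = downsample(feat)            # spatial size halved
up = upsample(down)                # spatial size restored
print(down.shape, up.shape)
```

With an even module count per layer, each down-sampling step on the way in can be paired with an up-sampling step on the way out, keeping the feature sizes symmetrical.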
In the same layer of the stepped self-attention module layer, the output port of the stepped self-attention module is connected with the input port of the next adjacent stepped self-attention module, i.e. the input and the output are connected end to end in sequence. In this embodiment, when N is a number of 3, the number of stepped self-attention modules in the first layer of stepped self-attention module layer is six, the number of stepped self-attention modules in the second layer of stepped self-attention module layer is four, and the second layer of stepped self-attention module layer is adjacent to the first layer of stepped self-attention module layer; the number of the stepped self-attention modules in the third stepped self-attention module layer is two, and the third stepped self-attention module layer is adjacent to the second stepped self-attention module layer.
As another alternative embodiment, when N is 4, the numbers of stepped self-attention modules included in the first to fourth stepped self-attention module layers may be eight, six, four and two in sequence.
Adjacent stepped self-attention module layers are connected through up-sampling or down-sampling: an up-sampling connection enlarges the image, and a down-sampling connection reduces the image. Optionally, the first stepped self-attention module of each stepped self-attention module layer is connected with the second stepped self-attention module of the upper stepped self-attention module layer using a first connection mode; the last stepped self-attention module of each stepped self-attention module layer is connected with the penultimate stepped self-attention module of the upper stepped self-attention module layer using a second connection mode, the second connection mode being different from the first connection mode.
In this embodiment, the second stepped self-attention module and the third stepped self-attention module in the first stepped self-attention module layer are connected to the first stepped self-attention module and the second stepped self-attention module in the second stepped self-attention module layer through downsampling, respectively; the second step-type self-attention module in the second step-type self-attention module layer is connected with the first step-type self-attention module in the third step-type self-attention module layer through downsampling; the second step-type self-attention module in the third step-type self-attention module layer is connected with the third step-type self-attention module in the second step-type self-attention module layer through upsampling; and the third and fourth stepped self-attention modules in the second stepped self-attention module layer are respectively connected with the fourth stepped self-attention module and the fifth stepped self-attention module in the first stepped self-attention module layer through upsampling. Wherein, the first step type self-attention module in the different step type self-attention module layers is positioned at the same side, and each module layer is orderly arranged from the first step type self-attention module.
It should be noted that, when up-sampling or down-sampling is performed between different stepped self-attention module layers, multiple features between different layers may be superposed; specifically, a concat[·] function may be used to perform the superposition operation.
In this embodiment, the input of the second stepwise self-attention module in the second stepwise self-attention module layer includes a superposition of the output of the first stepwise self-attention module in the same layer and the output of the third stepwise self-attention module in the first stepwise self-attention module layer; the input of a third stepwise self-attention module in the second stepwise self-attention module layer comprises a superposition of the output of the second stepwise self-attention module in the same layer and the output of the second stepwise self-attention module in the third stepwise self-attention module layer; the input of the fourth stepwise self-attention module in the first stepwise self-attention module layer comprises a superposition of the output of the third stepwise self-attention module in the same layer and the output of the third stepwise self-attention module in the second stepwise self-attention module layer; the input of the fifth stepwise self-attention module in the first stepwise self-attention module layer comprises a superposition of the output of the fourth stepwise self-attention module in the same layer and the output of the fourth stepwise self-attention module in the second stepwise self-attention module layer.
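The superposition described above can be sketched as follows; the channel axis, tensor shapes, and variable names are illustrative assumptions, with concat[·] realized as a channel-wise concatenation.

```python
import numpy as np

# Hypothetical outputs: the previous module in the same layer and the
# (up- or down-sampled) module output from the adjacent layer.
same_layer_out = np.random.rand(16, 32, 32)    # (channels, H, W)
cross_layer_out = np.random.rand(16, 32, 32)

# concat[.]-style superposition along the channel dimension (axis 0 here);
# the result is fed into the current stepped self-attention module.
stacked = np.concatenate([same_layer_out, cross_layer_out], axis=0)
print(stacked.shape)
```

In practice a convolution inside the receiving module would reduce the doubled channel count back to the working width.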
It can be understood that after receiving the remote sensing image with fog, the processing network provided by the invention can extract the transverse characteristics of the remote sensing image in the same step type self-attention module layer, and perform longitudinal processing on the remote sensing image between adjacent step type self-attention module layers, and the two are combined to capture the global context information of the remote sensing image and remove the influence of atmospheric factors such as haze and the like.
According to another embodiment of the present invention, fig. 2 is a flowchart of a remote sensing image defogging method based on the above processing network in the present invention, and as shown in fig. 2, the present invention provides a remote sensing image defogging method, which can be applied to any electronic device of a terminal, such as a computer, a tablet computer, a mobile phone, etc. The defogging method for the remote sensing image comprises the following steps:
s200, acquiring a remote sensing image with fog;
s400, inputting the remote sensing image with fog into the processing network, and extracting the characteristics of the remote sensing image with fog by utilizing N ladder-type self-attention module layers;
s600, outputting defogging remote sensing images.
Optionally, before defogging the remote sensing image by using the processing network, the method further includes:
S100, training and testing the processing network on remote sensing images. Specifically, the step S100 includes: training the processing network by using paired foggy remote sensing images and defogged remote sensing label images until the overall loss of the processing network is basically stable, ending the training, and obtaining the trained processing network; the overall loss comprises an L1 distance loss and a structural similarity loss between the defogged remote sensing image and the defogged remote sensing label image.
The overall loss L may be formally expressed as:
L = || FSSSRN( I_haze ) − I_label ||_1 + L_SSIM( FSSSRN( I_haze ), I_label )
where I_haze and I_label respectively represent a paired foggy remote sensing image and defogged remote sensing label image, FSSSRN(·) represents the Full-Scale Stepped Self-Attention Reconstruction Network (FSSSRN), i.e. the processing network described above, ||·||_1 represents the L1 distance (L1 Norm) calculation between the defogged remote sensing image and the defogged remote sensing label image, and L_SSIM(·,·) represents the structural similarity (Structural Similarity Index, SSIM) loss calculation.
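A minimal numerical sketch of the overall loss described here, assuming an unweighted sum of the L1 term and a (1 − SSIM) structural term, and a simplified single-window SSIM — the names and weighting are illustrative assumptions, not the patent's exact formulation:

```python
import numpy as np

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified global SSIM computed over one window; practical
    # implementations use local (e.g. Gaussian) windows.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def overall_loss(dehazed, label):
    # L1 distance plus structural-similarity loss (1 - SSIM).
    l1 = np.abs(dehazed - label).mean()
    return l1 + (1.0 - ssim(dehazed, label))

img = np.random.rand(3, 64, 64)
loss = overall_loss(img, img)      # identical images -> loss near zero
print(loss)
```

Since SSIM is bounded above by 1, both terms are non-negative, so the loss vanishes only when the defogged output matches the label.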
In step S200, optionally, the remote sensing image may be acquired by wirelessly connecting the terminal with the shooting device, so as to realize automatic image transmission, or it may be obtained by manual uploading. The remote sensing image can be captured by equipment such as satellites and unmanned aerial vehicles, and can be an aerial image or a satellite image.
In step S400, the processing network is a trained network structure. As shown in fig. 3, the feature extraction of the fogged remote sensing image by using N stepped self-attention module layers includes:
s410, carrying out feature extraction on the remote sensing image with fog by utilizing signal transmission of more than one stepped self-attention module in each stepped self-attention module layer; the method comprises the steps of,
s420, performing feature extraction on the remote sensing image with fog by utilizing the superposition operation among the stepped self-attention modules in the adjacent stepped self-attention module layers.
It can be understood that the multiple ladder-type self-attention modules in the same layer sequentially extract features of the remote sensing image, and the extracted features comprise atmospheric features such as fog, haze, thin cloud and the like. Meanwhile, features among different layers are overlapped, and features are continuously extracted, so that global context information of a remote sensing image is better captured, and the remote sensing image is more stable to various environmental conditions.
Further, the step S420 includes: superposing the output of the previous stepped self-attention module in the layer where the current stepped self-attention module is located with the output transferred from the adjacent stepped self-attention module layer, and inputting the superposed result into the current stepped self-attention module for feature extraction.
In this embodiment, when N is 3, the first stepped self-attention module layer includes six stepped self-attention modules, which are sequentially named A1-A6; the second stepped self-attention module layer includes four stepped self-attention modules, which are sequentially named B1-B4; and the third stepped self-attention module layer includes two stepped self-attention modules, which are sequentially named C1 and C2. The step S400 includes:
the module A1 performs feature extraction on the remote sensing image with fog, outputs of the modules A2 and A3 are respectively input into the modules B1 and B2 through downsampling, the output of the module B2 is input into the module C1 through downsampling, and the input of the module B2 comprises the outputs of the module A3 and the module B1; the outputs of the modules B3 and B4 are respectively input to the modules A4 and A5 through upsampling, wherein the input of the module A4 comprises the outputs of the module A3 and the module B3, the input of the module A5 comprises the outputs of the module A4 and the module B4, and the input of the module B3 comprises the output of the module C2 and the output of the module B2; the module A6 performs feature extraction on the output of the module A5 and outputs defogging remote sensing images.
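The data flow among modules A1-A6, B1-B4, C1 and C2 can be sketched as follows. Each module is a shape-preserving placeholder, the 2x sampling factor is an assumption, and `fuse` stands in for the concat[·] superposition followed by a channel-restoring projection (here simply the mean of the two maps) — a structural sketch, not the patented implementation:

```python
import numpy as np

def ssam(x):
    # Placeholder for a stepped self-attention module: shape-preserving.
    return x

def down(x):   # assumed 2x down-sampling
    return x[:, ::2, ::2]

def up(x):     # assumed 2x up-sampling
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(a, b):
    # concat[.] along channels, then a stand-in projection back to the
    # original channel count (here: the element-wise mean of the two maps).
    return np.concatenate([a, b], axis=0).reshape(2, *a.shape).mean(axis=0)

x = np.random.rand(8, 64, 64)         # hazy input features
a1 = ssam(x); a2 = ssam(a1); a3 = ssam(a2)
b1 = ssam(down(a2))                   # A2 -> B1
b2 = ssam(fuse(down(a3), b1))         # A3 + B1 -> B2
c1 = ssam(down(b2))                   # B2 -> C1
c2 = ssam(c1)
b3 = ssam(fuse(up(c2), b2))           # C2 + B2 -> B3
b4 = ssam(b3)
a4 = ssam(fuse(up(b3), a3))           # B3 + A3 -> A4
a5 = ssam(fuse(up(b4), a4))           # B4 + A4 -> A5
a6 = ssam(a5)                         # defogged output
print(a6.shape)
```

The shapes confirm the symmetrical encoder-decoder layout: the B layer works at half resolution, the C layer at quarter resolution, and the final output A6 matches the input size.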
In the processing of the processing network, after receiving its input features, each stepped self-attention module performs a three-layer stepped calculation so as to accurately refine the input features. Specifically, as shown in fig. 4, the processing method of each stepped self-attention module is as follows:
s500, performing layer normalization operation and convolution calculation on the features to be input to obtain preprocessing features;
In this step, the feature to be input comprises the output feature of the previous stepped self-attention module. The preprocessing feature may be expressed as follows:
F_pre = Conv_{1×1}( LayerNorm( F_in ) )
where LayerNorm(·) represents the layer normalization (Layer Normalization, LayerNorm) operation, F_in represents the feature to be input, F_pre is the preprocessing feature obtained after preprocessing, and Conv_{1×1}(·) represents a convolution (Conv) with a kernel size of 1×1.
S510, calibrating the preprocessing characteristics and outputting first-layer step characteristics;
In this step, the preprocessing feature may be calibrated using a self-attention module. The first-layer step feature F_1 may be expressed as follows:

F_1 = SA(F_p)

wherein SA(·) denotes the self-attention module.
Specifically, as shown in fig. 5, the method for calculating the self-attention module includes:
S511, extracting an attention weight of the preprocessing feature, where the attention weight is expressed as follows:

W = Sigmoid(Conv1x1(Concat(MaxPool(F_p), AvgPool(F_p))))

wherein MaxPool(·) and AvgPool(·) denote the maximum pooling operation and the average pooling operation respectively, Concat(·) denotes the superposition of the channel layers between features, Conv1x1(·) denotes a convolution with a kernel size of 1×1, Sigmoid(·) denotes the Sigmoid activation function, F_p denotes the preprocessing feature, and W denotes the extracted attention weight;
S512, calibrating the preprocessing feature by using the attention weight to obtain the first-layer step feature, whose expression is as follows:

F_1 = TConv(Conv3x3(W ⊙ F_p))

wherein ⊙ denotes the corresponding element multiplication operation between features, Conv3x3(·) denotes a convolution with a kernel size of 3×3, TConv(·) denotes a transposed convolution, and F_1 denotes the calibrated output feature, i.e. the first-layer step feature.
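A minimal NumPy sketch of S511 and the element-wise part of S512, under the same hypothetical shapes and weights (the 3×3 convolution and transposed convolution that complete the calibration are omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_weight(f_p, w1x1):
    # Channel-wise max and average pooling each yield a (1, H, W) map;
    # they are stacked on the channel axis and mixed by a 1x1 convolution.
    mp = f_p.max(axis=0, keepdims=True)
    ap = f_p.mean(axis=0, keepdims=True)
    cat = np.concatenate([mp, ap], axis=0)       # (2, H, W)
    mixed = np.einsum('oc,chw->ohw', w1x1, cat)  # (1, H, W)
    return sigmoid(mixed)

rng = np.random.default_rng(1)
f_p = rng.standard_normal((8, 16, 16))
w1x1 = rng.standard_normal((1, 2))               # hypothetical 1x1 kernel
w_att = attention_weight(f_p, w1x1)
f_1 = w_att * f_p   # element-wise calibration; Conv3x3 and TConv omitted
print(w_att.shape, f_1.shape)  # (1, 16, 16) (8, 16, 16)
```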
S520, multiplying the first-layer step feature with the output feature of the self-attention module, applying a Sigmoid activation function, and outputting a second-layer step feature;
In this step, the second-layer step feature F_2 may be expressed as follows:

F_2 = Sigmoid(F_1 ⊙ SA(F_p))
S530, multiplying the second-layer step feature with the first-layer step feature, applying a convolution, and outputting a third-layer step feature;
In this step, the third-layer step feature F_3 may be expressed as follows:

F_3 = Conv(F_2 ⊙ F_1)

wherein Conv(·) denotes the convolution operation.
S540, adding the feature to be input and the third-layer step feature, and outputting a target feature serving as the feature to be input of the next stepped self-attention module:

F_out = F_in ⊕ F_3

wherein ⊕ denotes the corresponding element addition operation between features, and F_out is the output feature of the stepped self-attention module.
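Putting steps S500-S540 together, one stepped self-attention module can be sketched as follows; the `preprocess`, `sa`, and `conv` callables are placeholders for the learned sub-steps, and the `f_2` line follows one literal reading of S520 in which the self-attention output is multiplied with the first-layer step feature:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stepped_module(f_in, preprocess, sa, conv):
    # One stepped self-attention module (steps S500-S540), as a sketch.
    f_p = preprocess(f_in)        # S500: Conv1x1(LayerNorm(.)) stand-in
    f_1 = sa(f_p)                 # S510: first-layer step feature
    f_2 = sigmoid(f_1 * sa(f_p))  # S520: second-layer step feature
    f_3 = conv(f_2 * f_1)         # S530: third-layer step feature
    return f_in + f_3             # S540: residual target feature

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 16, 16))
identity = lambda t: t            # placeholder sub-steps, hypothetical
out = stepped_module(x, identity, identity, identity)
print(out.shape)  # (8, 16, 16)
```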
In step S600, the defogging remote sensing image is an image that does not include atmospheric factors such as haze, and the visibility of the defogging remote sensing image is significantly improved compared with a remote sensing image with haze.
The defogging method applied by the processing network in practice is described in detail below with reference to fig. 1. Specifically, the obtained fogged remote sensing image is recorded as I. First, I is input into the first stepped self-attention module A1 of the first layer of the processing network, formally represented as:

F_A1 = SSA(I)

wherein SSA(·) denotes a stepped self-attention module and F_A1 is the output feature of the first stepped self-attention module of the first layer;
further, it willInput to the first layer second stepwise self-attention module A2 is expressed as:
wherein, the liquid crystal display device comprises a liquid crystal display device,is an output characteristic of the second stepwise self-attention module of the first layer.
Further, F_A2 is input into the first stepped self-attention module B1 of the second layer through a downsampling operation, formally represented as:

F_B1 = SSA(Down(F_A2))

wherein Down(·) denotes the downsampling operation and F_B1 denotes the output feature of the first stepped self-attention module B1 of the second layer.
Further, F_A2 is input into the third stepped self-attention module A3 of the first layer, formally represented as:

F_A3 = SSA(F_A2)

wherein F_A3 is the output feature of the third stepped self-attention module of the first layer.
Further, F_A3 is passed through the downsampling operation, stacked with F_B1 along the channel dimension, and then input into the second stepped self-attention module B2 of the second layer, formally represented as:

F_B2 = SSA(Concat(Down(F_A3), F_B1))

wherein F_B2 denotes the output feature of the second stepped self-attention module of the second layer, and Concat(·) denotes the stacking operation of the channel layers between multiple features.
Further, F_B2 is input into the first stepped self-attention module C1 of the third layer through a downsampling operation, formally represented as:

F_C1 = SSA(Down(F_B2))

wherein F_C1 denotes the output feature of the first stepped self-attention module of the third layer.
Further, F_C1 is input to the second stepped self-attention module C2 of the third layer, and the output is passed through an upsampling operation, expressed as:

F_C2 = Up(SSA(F_C1))

wherein Up(·) denotes the upsampling operation and F_C2 denotes the output feature of the second stepped self-attention module of the third layer after the upsampling operation.
Further, F_C2 and F_B2 are stacked along the channel dimension and then input to the third stepped self-attention module B3 of the second layer, which may be expressed as:

F_B3 = SSA(Concat(F_C2, F_B2))

wherein F_B3 denotes the output feature of the third stepped self-attention module of the second layer.
Further, F_B3 is passed through an upsampling operation, stacked with F_A3 along the channel dimension, and then input to the fourth stepped self-attention module A4 of the first layer, formally represented as:

F_A4 = SSA(Concat(Up(F_B3), F_A3))

wherein F_A4 denotes the output feature of the fourth stepped self-attention module of the first layer.
Further, F_B3 is input to the fourth stepped self-attention module B4 of the second layer, denoted:

F_B4 = SSA(F_B3)

wherein F_B4 denotes the output feature of the fourth stepped self-attention module of the second layer.
Further, F_B4 is passed through an upsampling operation, stacked with F_A4 along the channel dimension, and then input to the fifth stepped self-attention module A5 of the first layer, formally represented as:

F_A5 = SSA(Concat(Up(F_B4), F_A4))

wherein F_A5 denotes the output feature of the fifth stepped self-attention module of the first layer.
Further, F_A5 is input to the sixth stepped self-attention module A6 of the first layer, represented as:

J = SSA(F_A5)

wherein J denotes the output feature of the sixth stepped self-attention module A6 of the first layer, i.e. the final output of the processing network: the defogged remote sensing image.
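The forward pass described above can be sketched end to end. Here `ssa` is only a stand-in for a stepped self-attention module (a random 1×1 projection back to a fixed channel width rather than the real S500-S540 computation), and the channel width, image size, and pooling choices are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
C = 8  # hypothetical channel width

def ssa(x):
    # Stand-in for a stepped self-attention module: a random 1x1
    # projection back to C channels.
    w = rng.standard_normal((C, x.shape[0])) * 0.1
    return np.einsum('oc,chw->ohw', w, x)

def down(x):
    # 2x average-pool downsampling.
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def up(x):
    # 2x nearest-neighbour upsampling.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def cat(*xs):
    # Stacking of the channel layers between features.
    return np.concatenate(xs, axis=0)

I = rng.standard_normal((C, 32, 32))  # fogged input image
fA1 = ssa(I)
fA2 = ssa(fA1)
fB1 = ssa(down(fA2))
fA3 = ssa(fA2)
fB2 = ssa(cat(down(fA3), fB1))
fC1 = ssa(down(fB2))
fC2 = up(ssa(fC1))
fB3 = ssa(cat(fC2, fB2))
fA4 = ssa(cat(up(fB3), fA3))
fB4 = ssa(fB3)
fA5 = ssa(cat(up(fB4), fA4))
J = ssa(fA5)                          # defogged output
print(J.shape)  # (8, 32, 32)
```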
In other alternative embodiments, to further improve the visibility of the defogged remote sensing image, its clarity and sharpness may be increased further, for example by enhancing image detail through a photographing device with a higher pixel count.
The beneficial effects of the invention are as follows: in the processing network for defogging a remote sensing image and the remote sensing image defogging method based on it, multiple layers of stepped self-attention modules are designed to gradually capture the global context information in the remote sensing image, so that the processing network is more robust to various environmental conditions, can remove the influence of atmospheric factors such as haze, and improves the visibility of the remote sensing image.
Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and device described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative.
It will be appreciated by persons skilled in the art that the scope of the invention is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example (but not limited to) solutions in which the above features are replaced with technical features having similar functions disclosed in the present invention.
It should be understood that, the sequence numbers of the steps in the summary and the embodiments of the present invention do not necessarily mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not be construed as limiting the implementation process of the embodiments of the present invention.

Claims (9)

1. A processing network for defogging a remote sensing image, comprising:
the image input end is used for acquiring an original remote sensing image with fog to be processed;
n ladder-type self-attention module layers, wherein the value range of N is 2-10;
each step-type self-attention module layer comprises more than one step-type self-attention module, and the number of the step-type self-attention modules contained in the step-type self-attention module layers is reduced along with the increase of the layer number;
the input and output of more than one stepped self-attention module contained in each stepped self-attention module layer are connected end to end in sequence and are used for extracting the atmospheric characteristics of the remote sensing image; the step-type self-attention modules included in the adjacent step-type self-attention module layers are connected through up-sampling and down-sampling and are used for extracting global context information of the remote sensing image;
the first step type self-attention module of the first step type self-attention module layer is connected with the image input end and is used for extracting the characteristics of the remote sensing image with fog; the last step type self-attention module of the first step type self-attention module layer is used for outputting defogging images;
the first step type self-attention module of each step type self-attention module layer is used for being connected with the second step type self-attention module in the step type self-attention module layer of the upper layer;
the last stepwise self-attention module of each stepwise self-attention module layer is used for connecting with the penultimate stepwise self-attention module in the stepwise self-attention module layer of the upper layer.
2. The processing network for defogging a remote sensing image according to claim 1, wherein the number of the stepwise self-attention modules in each stepwise self-attention module layer is an even number.
3. The processing network for defogging a remote sensing image according to claim 2, wherein when said N is a number of 3, the number of stepwise self-attention modules in the first stepwise self-attention module layer is six; the number of the stepped self-attention modules in the second stepped self-attention module layer is four, and the second stepped self-attention module layer is adjacent to the first stepped self-attention module layer; the number of the stepped self-attention modules in the third stepped self-attention module layer is two, and the third stepped self-attention module layer is adjacent to the second stepped self-attention module layer.
4. A processing network for defogging a remote sensing image according to claim 3, wherein a second stepwise self-attention module and a third stepwise self-attention module within said first stepwise self-attention module layer are connected to the first stepwise self-attention module and the second stepwise self-attention module within said second stepwise self-attention module layer by downsampling, respectively; the second step-type self-attention module in the second step-type self-attention module layer is connected with the first step-type self-attention module in the third step-type self-attention module layer through downsampling;
the second step-type self-attention module in the third step-type self-attention module layer is connected with the third step-type self-attention module in the second step-type self-attention module layer through upsampling; and the third and fourth stepped self-attention modules in the second stepped self-attention module layer are respectively connected with the fourth stepped self-attention module and the fifth stepped self-attention module in the first stepped self-attention module layer through upsampling.
5. A remote sensing image defogging method using the processing network of any of claims 1 to 4, wherein the remote sensing image defogging method comprises:
acquiring a remote sensing image with fog;
inputting the remote sensing image with fog into the processing network, and extracting the characteristics of the remote sensing image with fog by utilizing N ladder-type self-attention module layers;
and outputting defogging remote sensing images.
6. The defogging method for a remote sensing image according to claim 5, wherein the feature extraction of the fogged remote sensing image by using N stepwise self-attention module layers comprises:
extracting features of the remote sensing image with fog by utilizing signal transmission of more than one stepped self-attention module in each stepped self-attention module layer; the method comprises the steps of,
and performing feature extraction on the remote sensing image with fog by utilizing the superposition operation among the stepped self-attention modules in the adjacent stepped self-attention module layers.
7. The defogging method for remote sensing images according to claim 6, wherein the feature extraction of the fogged remote sensing image by using a superposition operation between the stepwise self-attention modules in the adjacent stepwise self-attention module layers comprises:
and superposing the output of the previous step type self-attention module in the layer where the current step type self-attention module is positioned with the output of the layer of the previous step type self-attention module, and inputting the superposed output of the previous step type self-attention module into the current step type self-attention module to extract the characteristics.
8. The remote sensing image defogging method according to claim 7, wherein said processing method of the stepwise self-attention module comprises:
performing layer normalization operation and convolution calculation on the features to be input to obtain preprocessing features;
calibrating the preprocessing characteristics and outputting first-layer step characteristics;
multiplying the first layer of step features by the output feature of the self-attention module, applying a Sigmoid activation function, and outputting a second layer of step features;
multiplying the second layer of step features with the first layer of step features and acting on a convolution function to output a third layer of step features;
and adding the to-be-input feature with the third-layer step feature, and outputting a target feature serving as the to-be-input feature of the next step-type self-attention module.
9. The defogging method of claim 8, wherein when N is a number of 3, a first layer of stepwise self-attention module layers includes six stepwise self-attention modules, sequentially designated as A1-A6, a second layer of stepwise self-attention module layers includes four stepwise self-attention modules, sequentially designated as B1-B4, a third layer of stepwise self-attention module layers includes two stepwise self-attention modules, sequentially designated as C1, C2, and the feature extraction of the fogged remote sensing image using N stepwise self-attention module layers includes:
the module A1 performs feature extraction on the remote sensing image with fog, outputs of the modules A2 and A3 are respectively input into the modules B1 and B2 through downsampling, the output of the module B2 is input into the module C1 through downsampling, and the input of the module B2 comprises the outputs of the module A3 and the module B1; the outputs of the modules B3 and B4 are respectively input to the modules A4 and A5 through upsampling, wherein the input of the module A4 comprises the outputs of the module A3 and the module B3, the input of the module A5 comprises the outputs of the module A4 and the module B4, and the input of the module B3 comprises the output of the module C2 and the output of the module B2; the module A6 performs feature extraction on the output of the module A5 and outputs defogging remote sensing images.
CN202310517870.5A 2023-05-10 2023-05-10 Processing network for defogging remote sensing image and defogging method for remote sensing image Active CN116228608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310517870.5A CN116228608B (en) 2023-05-10 2023-05-10 Processing network for defogging remote sensing image and defogging method for remote sensing image


Publications (2)

Publication Number Publication Date
CN116228608A CN116228608A (en) 2023-06-06
CN116228608B true CN116228608B (en) 2023-08-01

Family

ID=86569998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310517870.5A Active CN116228608B (en) 2023-05-10 2023-05-10 Processing network for defogging remote sensing image and defogging method for remote sensing image

Country Status (1)

Country Link
CN (1) CN116228608B (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215223B (en) * 2020-10-16 2024-03-19 清华大学 Multidirectional scene character recognition method and system based on multi-element attention mechanism
CN112990219B (en) * 2021-03-25 2023-08-08 北京百度网讯科技有限公司 Method and device for image semantic segmentation
CN113627163A (en) * 2021-06-29 2021-11-09 华为技术有限公司 Attention model, feature extraction method and related device
CN113936022A (en) * 2021-10-18 2022-01-14 辽宁工程技术大学 Image defogging method based on multi-modal characteristics and polarization attention
CN115482491B (en) * 2022-09-23 2023-05-23 湖南大学 Bridge defect identification method and system based on transformer
CN115937048A (en) * 2023-01-17 2023-04-07 东南大学 Illumination controllable defogging method based on non-supervision layer embedding and vision conversion model



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant