CN110782398B - Image processing method, generative adversarial network system and electronic device - Google Patents

Info

Publication number: CN110782398B
Application number: CN201811529115.4A
Authority: CN (China)
Prior art keywords: feature, processing, feature extraction, features, extraction layer
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN110782398A
Inventors: 张毅伟, 赵元, 沈海峰
Assignee: Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd; priority to CN201811529115.4A; published as CN110782398A, granted as CN110782398B

Classifications

    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06T5/73

Abstract

The invention relates to an image processing method, a generative adversarial network, an electronic device, and a storage medium. The method comprises the following steps: performing first processing on an image to be processed through an initial feature extraction layer to obtain initial features; performing second processing, by each down-sampling feature extraction layer, on the input features of that layer to obtain shallow features, depth features and down-sampling features; performing third processing on the down-sampling features output by the Mth down-sampling feature extraction layer through the global feature extraction layer to obtain global features; performing fourth processing, by each up-sampling feature extraction layer, on the input features of that layer to obtain up-sampling features; performing fifth processing on the up-sampling features output by the Mth up-sampling feature extraction layer through a residual feature extraction layer to obtain residual features; and adding the residual features to the image to be processed to obtain a target image. The method not only reduces the processing complexity of image enhancement but also improves the restoration of fine texture.

Description

Image processing method, generative adversarial network system and electronic device
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an image processing method, a generative adversarial network system and an electronic device.
Background
Image processing includes image denoising, deblurring, defogging, raindrop removal, low-illumination enhancement, boundary enhancement, and the like. Image processing has long been a difficult topic in both research and application: because real scenes are complex and varied, image degradation is frequently encountered in practice, and the narrow specialization of individual image processing techniques limits their use in engineering. In most cases, traditional practice fuses the outputs of several single-problem pipelines, and on its face the rationality of this approach seems beyond question. However, analysis shows that the single image processing problems are not independent of one another. For example, when noise and blur are present simultaneously, denoising first causes partial loss of image texture information and thus harms the subsequent deblurring, while deblurring first not only yields inaccurate restoration but also amplifies the noise. Likewise, low-illumination enhancement must not merely adjust the distribution of luminance information; research and application of the technique must ensure both enhancement of luminance information and strong suppression of noise information.
Traditional image processing techniques model the degradation of each single problem and then restore the image by combining a statistical model with image prior information. For example, image denoising adopts a Bayesian model together with a noise hypothesis; image deblurring is modeled with a maximum a posteriori model and heavy-tailed distributions; image defogging adjusts color information via an inverse dark channel; and low-illumination enhancement mostly combines an inverse dark channel with denoising. These single enhancement problems are well studied, but in a real scene image degradation is compound, and processing the image multiple times with single enhancement methods cannot achieve a good enhancement effect.
Disclosure of Invention
In view of the above, embodiments of the present invention provide an image processing method, a generative adversarial network system and an electronic device to effectively solve the above problems.
The embodiment of the invention is realized by the following steps:
in a first aspect, an embodiment of the present invention provides an image processing method applied to a generative adversarial network (GAN). The GAN includes a generator network, and the generator network includes an initial feature extraction layer, M down-sampling feature extraction layers connected in sequence, a global feature extraction layer, M up-sampling feature extraction layers connected in sequence, and a residual feature extraction layer, where M is an integer greater than or equal to 1. The method comprises the following steps:
performing first processing on an image to be processed through the initial feature extraction layer to obtain initial features;
performing, by each down-sampling feature extraction layer, second processing on the input features of that layer to obtain shallow features, depth features and down-sampling features, wherein the input feature of the first down-sampling feature extraction layer, connected with the initial feature extraction layer, is the initial feature, and the input feature of each of the second to Mth down-sampling feature extraction layers is the down-sampling feature output by the preceding down-sampling feature extraction layer;
performing third processing on the down-sampling features output by the Mth down-sampling feature extraction layer through the global feature extraction layer to obtain global features;
performing, by each up-sampling feature extraction layer, fourth processing on the input features of that layer to obtain up-sampling features, wherein the input features of the first up-sampling feature extraction layer, connected with the global feature extraction layer, comprise a first direct-connection feature and a first cross-connection feature, the first direct-connection feature being a first series feature of the down-sampling feature and the depth feature output by the Mth down-sampling feature extraction layer and the global feature, and the first cross-connection feature being the shallow feature output by the Mth down-sampling feature extraction layer and the depth feature output by the (M-1)th down-sampling feature extraction layer; the input features of the jth up-sampling feature extraction layer comprise a jth direct-connection feature and a jth cross-connection feature, the jth direct-connection feature being the up-sampling feature output by the preceding up-sampling feature extraction layer, and the jth cross-connection feature being the shallow feature output by the (M-j+1)th down-sampling feature extraction layer and the depth feature output by the (M-j)th down-sampling feature extraction layer; the input features of the Mth up-sampling feature extraction layer comprise an Mth direct-connection feature and an Mth cross-connection feature, the Mth direct-connection feature being the up-sampling feature output by the (M-1)th up-sampling feature extraction layer, the Mth cross-connection feature being the shallow feature output by the first down-sampling feature extraction layer, and j running from 2 to M-1;

performing fifth processing on the up-sampling features output by the Mth up-sampling feature extraction layer through the residual feature extraction layer to obtain residual features; and

adding the residual features to the image to be processed to obtain the target image.
In the embodiment of the application, the initial feature extraction layer of the generator network first extracts initial features; the M down-sampling feature extraction layers then perform down-sampling (encoding) on their input features to obtain shallow features, depth features and down-sampling features; the M up-sampling feature extraction layers perform up-sampling (decoding) on their input features to obtain up-sampling features; the residual feature extraction layer extracts residual features from the up-sampling features output by the last up-sampling feature extraction layer; and finally the residual features are added to the image to be processed to obtain the desired restored image, namely the target image. Deeper features are extracted by the down-sampling processing, and repeating the down-sampling yields still deeper features; the image is then progressively restored by the corresponding up-sampling steps, and at every up-sampling step the corresponding shallow and depth features obtained during down-sampling are inserted, so the finally restored image is better enhanced. Performing image enhancement with these deeper features improves the generalization ability of the model and, compared with processing the image multiple times in single enhancement modes, reduces the computational complexity of image enhancement. In addition, when down-sampling ends, the global feature extraction layer extracts global features from the smallest-scale down-sampling features output by the last down-sampling feature extraction layer, which strengthens the generalization ability and robustness of the generator network and improves the restoration of fine textures.
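To make this data flow concrete, the following is a minimal sketch in PyTorch (an assumption; the patent names no framework). The objects init_layer, down_layers, global_layer, up_layers and res_layer are hypothetical stand-ins for the layers described above, and channel and spatial compatibility of the concatenations is assumed:

```python
import torch

def generator_forward(x, init_layer, down_layers, global_layer, up_layers, res_layer):
    # x: image to be processed; M = len(down_layers) = len(up_layers).
    M = len(down_layers)
    feat = init_layer(x)                        # first processing -> initial features
    shallow, depth, down = [], [], []
    inp = feat
    for layer in down_layers:                   # second processing, M times
        s, d, ds = layer(inp)
        shallow.append(s); depth.append(d); down.append(ds)
        inp = ds
    g = global_layer(down[-1])                  # third processing -> global features

    up = None
    for j in range(1, M + 1):                   # fourth processing, j = 1..M
        if j == 1:                              # first series feature as direct connection
            direct = torch.cat([down[-1], depth[-1], g], dim=1)
        else:                                   # direct connection: previous up feature
            direct = up
        if j < M:                               # cross connection: shallow of layer M-j+1
            cross = torch.cat([shallow[M - j], depth[M - j - 1]], dim=1)
        else:                                   # Mth layer: shallow of first down layer only
            cross = shallow[0]
        up = up_layers[j - 1](direct, cross)

    residual = res_layer(up)                    # fifth processing -> residual features
    return x + residual                         # target image
```

For M = 1 the first and the Mth up-sampling layer coincide, matching the special case discussed in the detailed description below.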
With reference to a possible implementation manner of the embodiment of the first aspect, the performing, by each down-sampling feature extraction layer, second processing on the input features of that layer to obtain shallow features, depth features and down-sampling features includes: sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the input features of the layer to obtain a first basic feature; sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain the shallow feature; sequentially performing residual processing on the first basic feature three times to obtain a second basic feature; sequentially performing third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain a conversion feature, performing pooling processing on the second basic feature, and connecting the pooled feature and the conversion feature in series to obtain the down-sampling feature; and sequentially performing fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain the depth feature.
With reference to one possible implementation manner of the embodiment of the first aspect, the sequentially performing residual processing on the first basic feature three times to obtain a second basic feature includes: sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature; connecting the third basic feature in series with the first basic feature to obtain a second series feature; sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the second series feature to obtain a fourth basic feature; connecting the second series feature in series with the fourth basic feature to obtain a third series feature; sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the third series feature to obtain a fifth basic feature; and connecting the fifth basic feature in series with the third series feature to obtain the second basic feature.
With reference to a possible implementation manner of the embodiment of the first aspect, the performing, by each up-sampling feature extraction layer, fourth processing on the input features of that layer to obtain an up-sampling feature includes: sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the direct-connection feature input to the layer to obtain a sixth basic feature; connecting the sixth basic feature in series with the cross-connection feature input to the layer to obtain a fourth series feature; sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series feature to obtain a seventh basic feature; and sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the up-sampling feature.
With reference to one possible implementation manner of the embodiment of the first aspect, the deconvolution processing includes: performing deconvolution through a deconvolution layer with a third size and a second step size.
With reference to one possible implementation manner of the embodiment of the first aspect, the first convolution processing includes: performing convolution through a convolution layer with a first size and a first step size.
With reference to one possible implementation manner of the embodiment of the first aspect, the second convolution processing includes: performing convolution through a convolution layer with a second size and the first step size.
With reference to a possible implementation manner of the embodiment of the first aspect, the third convolution processing includes: performing convolution through a convolution layer with the third size and the second step size.
With reference to a possible implementation manner of the embodiment of the first aspect, the fourth convolution processing includes: performing convolution through a convolution layer with the first size and the second step size.
With reference to a possible implementation manner of the embodiment of the first aspect, the performing, by the initial feature extraction layer, first processing on the image to be processed includes: sequentially performing fifth convolution processing, normalization processing and first nonlinear conversion processing on the image to be processed through the initial feature extraction layer.
With reference to a possible implementation manner of the embodiment of the first aspect, the performing, by the residual feature extraction layer, fifth processing on the up-sampling features output by the Mth up-sampling feature extraction layer includes: sequentially performing fifth convolution processing and second nonlinear conversion processing on the up-sampling features output by the Mth up-sampling feature extraction layer through the residual feature extraction layer.
With reference to a possible implementation manner of the embodiment of the first aspect, the fifth convolution processing includes: performing convolution through a convolution layer with a fourth size and the first step size.
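Collected in one place, the concrete kernel sizes and strides exemplified in the detailed description below map onto the abstract sizes as follows; the claims leave these values open, so the sketch merely records the examples given later:

```python
# Concrete values exemplified in the detailed description; the claims keep them abstract.
FIRST_SIZE, SECOND_SIZE, THIRD_SIZE, FOURTH_SIZE = 3, 1, 5, 7  # kernel sizes (NxN)
FIRST_STEP, SECOND_STEP = 1, 2                                  # strides

CONV_SPECS = {
    "first convolution":  (FIRST_SIZE,  FIRST_STEP),   # 3x3, stride 1
    "second convolution": (SECOND_SIZE, FIRST_STEP),   # 1x1, stride 1
    "third convolution":  (THIRD_SIZE,  SECOND_STEP),  # 5x5, stride 2
    "fourth convolution": (FIRST_SIZE,  SECOND_STEP),  # 3x3, stride 2
    "fifth convolution":  (FOURTH_SIZE, FIRST_STEP),   # 7x7, stride 1
    "deconvolution":      (THIRD_SIZE,  SECOND_STEP),  # 5x5, stride 2
}
```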
With reference to a possible implementation manner of the embodiment of the first aspect, the performing, by the global feature extraction layer, third processing on the downsampled feature output by the mth downsampled feature extraction layer to obtain a global feature includes: performing full convolution on the downsampled features output by the Mth downsampled feature extraction layer through the global feature extraction layer to obtain full convolution features; and sequentially carrying out deconvolution processing, normalization processing and first nonlinear conversion processing on the full convolution features to obtain the global features.
With reference to a possible implementation manner of the embodiment of the first aspect, the GAN further includes an adversarial network connected in series after the generator network, and the GAN is trained as follows: during training, the target image output by the generator network and the reference image are respectively input into the adversarial network; the adversarial network and the generator network are trained by a single alternating iterative optimization method until the iteration ends, the adversarial network being optimized first and the generator network afterwards. The loss function used in training is

L = L_{adv} + \lambda L_{cont},

where L is the loss function, L_{adv} is the adversarial loss function, L_{cont} is the conditional loss function, and \lambda is the weight of the conditional loss function, taking a value in [0, 1]. The adversarial loss function is

L_{adv} = \mathbb{E}_{\tilde{x} \sim p_g}[D(\tilde{x})] - \mathbb{E}_{x \sim p_r}[D(x)] + \alpha \, \mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\big[ (\| \nabla_{\hat{x}} D(\hat{x}) \|_2 - 1)^2 \big],

where x \sim p_r is the statistical distribution of the reference image, \tilde{x} \sim p_g is the statistical distribution of the target image, \hat{x} \sim p_{\hat{x}} is the mixed distribution of the reference image and the target image, D is the adversarial network, x, \tilde{x} and \hat{x} are respectively the reference image, the target image and a weighted sum of the reference image and the target image, \mathbb{E}(\cdot) is the expectation, and \alpha is the parameter of the regular term, taking a value in [0, 1]. In the embodiment of the application, this loss function is used to optimize the generative adversarial network so as to improve the restoration capability and robustness of the generator network.
With reference to one possible implementation manner of the embodiment of the first aspect, the optimization formula of the adversarial network is:

D^* = \arg\min_D L_{adv} = \arg\min_D \; \mathbb{E}_{\tilde{x} \sim p_g}[D(\tilde{x})] - \mathbb{E}_{x \sim p_r}[D(x)] + \alpha \, \mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\big[ (\| \nabla_{\hat{x}} D(\hat{x}) \|_2 - 1)^2 \big].

With reference to a possible implementation manner of the embodiment of the first aspect, the optimization formula of the generator network is:

G^* = \arg\min_G \; -\mathbb{E}_{\tilde{x} \sim p_g}[D(\tilde{x})] + \lambda L_{cont},
where the conditional loss function is

L_{cont} = \frac{1}{c \, w \, h} \, \| G(y) - x \|_b,

in which c, w and h are respectively the number of channels, the width and the height of the target image at the pixel level, G is the generator network, y is the image to be processed, and b takes the value 1 or 2. In the embodiment of the application, the conditional loss function constrains the distribution of the target image to be close to the distribution of the reference image, and constraining the model at the pixel level suppresses noise disturbance.
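A minimal PyTorch sketch of one evaluation of the losses above; PyTorch itself, the helper name gan_losses, the sample values of lam, alpha and b, and the channel layout are assumptions, since the patent fixes only the form of the losses:

```python
import torch

def gan_losses(D, G, y, x, lam=0.5, alpha=0.5, b=1):
    # y: image to be processed, x: reference image; lam, alpha in [0, 1], b in {1, 2}.
    x_tilde = G(y)                                    # target image, drawn from p_g

    # Mixed sample x_hat: a random weighted sum of reference and target image.
    eps = torch.rand(x.size(0), 1, 1, 1, device=x.device)
    x_hat = (eps * x + (1 - eps) * x_tilde.detach()).requires_grad_(True)
    grad = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)[0]
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()  # regular term

    # L_adv, minimized when optimizing the adversarial network D.
    l_adv = D(x_tilde.detach()).mean() - D(x).mean() + alpha * penalty

    # Conditional loss: pixel-level b-norm averaged over c * w * h.
    c, h, w = x.shape[1:]
    l_cont = (x_tilde - x).abs().pow(b).sum(dim=(1, 2, 3)).mean() / (c * w * h)

    # Generator objective, minimized when optimizing G (D is optimized first).
    l_gen = -D(x_tilde).mean() + lam * l_cont
    return l_adv, l_gen
```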
In a second aspect, an embodiment of the present invention further provides a generative adversarial network system, including a generator network, where the generator network includes: an initial feature extraction layer, M downsampling feature extraction layers connected in sequence, a global feature extraction layer, M upsampling feature extraction layers connected in sequence, and a residual feature extraction layer, where M is an integer greater than or equal to 1;
the initial feature extraction layer is used for performing first processing on an image to be processed to obtain initial features;
each downsampling feature extraction layer is used for carrying out second processing on input features input to the downsampling feature extraction layer to obtain shallow features, depth features and downsampling features, wherein the input feature of the first downsampling feature extraction layer connected with the initial feature extraction layer is the initial feature, and the input features of the second to Mth downsampling feature extraction layers are the downsampling features output by the previous downsampling feature extraction layer;
the global feature extraction layer is used for performing third processing on the downsampling features output by the Mth downsampling feature extraction layer to obtain global features; each upsampling feature extraction layer is configured to perform fourth processing on the input features of that layer to obtain upsampling features, where the input features of the first upsampling feature extraction layer, connected to the global feature extraction layer, include a direct-connection feature and a cross-connection feature: the direct-connection feature is the first series feature of the downsampling feature and the depth feature output by the Mth downsampling feature extraction layer and the global feature, and the cross-connection feature is the shallow feature output by the Mth downsampling feature extraction layer and the depth feature output by the (M-1)th downsampling feature extraction layer; the input features of the jth upsampling feature extraction layer include a direct-connection feature and a cross-connection feature: the direct-connection feature is the upsampling feature output by the preceding upsampling feature extraction layer, and the cross-connection feature is the shallow feature output by the (M-j+1)th downsampling feature extraction layer and the depth feature output by the (M-j)th downsampling feature extraction layer; the input features of the Mth upsampling feature extraction layer include a direct-connection feature and a cross-connection feature: the direct-connection feature is the upsampling feature output by the (M-1)th upsampling feature extraction layer, the cross-connection feature is the shallow feature output by the first downsampling feature extraction layer, and j runs from 2 to M-1;
the residual feature extraction layer is configured to perform fifth processing on the upsampling features output by the Mth upsampling feature extraction layer to obtain residual features, so that the residual features are added to the image to be processed to obtain the target image.
With reference to one possible implementation manner of the embodiment of the second aspect, each downsampling feature extraction layer includes: a first feature extraction unit, a second feature extraction unit, a third feature extraction unit, a fourth feature extraction unit and a fifth feature extraction unit. The first feature extraction unit is used for sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the input features of the layer to obtain a first basic feature; the second feature extraction unit is configured to sequentially perform second convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain the shallow feature; the fifth feature extraction unit is configured to sequentially perform residual processing on the first basic feature three times to obtain a second basic feature; the third feature extraction unit is configured to sequentially perform third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain a conversion feature, perform pooling processing on the second basic feature, and connect the pooled feature and the conversion feature in series to obtain the downsampling feature; and the fourth feature extraction unit is configured to sequentially perform fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain the depth feature.
With reference to a possible implementation manner of the embodiment of the second aspect, the fifth feature extraction unit is specifically configured to: sequentially perform first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature; connect the third basic feature in series with the first basic feature to obtain a second series feature; sequentially perform first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the second series feature to obtain a fourth basic feature; connect the second series feature in series with the fourth basic feature to obtain a third series feature; sequentially perform first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the third series feature to obtain a fifth basic feature; and connect the fifth basic feature in series with the third series feature to obtain the second basic feature.
With reference to one possible implementation manner of the embodiment of the second aspect, each upsampling feature extraction layer includes: a sixth feature extraction unit, a first feature extraction unit and a second feature extraction unit. The sixth feature extraction unit is configured to sequentially perform deconvolution processing, normalization processing and first nonlinear conversion processing on the direct-connection feature input to the layer to obtain a sixth basic feature, and to connect the sixth basic feature in series with the cross-connection feature input to the layer to obtain a fourth series feature; the first feature extraction unit is configured to sequentially perform first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series feature to obtain a seventh basic feature; and the second feature extraction unit is configured to sequentially perform second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the upsampling feature.
With reference to one possible implementation manner of the embodiment of the second aspect, the deconvolution processing includes: performing deconvolution through a deconvolution layer with a third size and a second step size.
With reference to one possible implementation manner of the embodiment of the second aspect, the first convolution processing includes: performing convolution through a convolution layer with a first size and a first step size.
With reference to one possible implementation manner of the embodiment of the second aspect, the second convolution processing includes: performing convolution through a convolution layer with a second size and the first step size.
With reference to one possible implementation manner of the embodiment of the second aspect, the third convolution processing includes: performing convolution through a convolution layer with the third size and the second step size.
With reference to a possible implementation manner of the embodiment of the second aspect, the fourth convolution processing includes: performing convolution through a convolution layer with the first size and the second step size.
With reference to a possible implementation manner of the embodiment of the second aspect, the initial feature extraction layer is specifically configured to sequentially perform fifth convolution processing, normalization processing and first nonlinear conversion processing on the image to be processed.
With reference to a possible implementation manner of the second aspect, the residual feature extraction layer is specifically configured to sequentially perform fifth convolution processing and second nonlinear conversion processing on the upsampling features output by the Mth upsampling feature extraction layer.
With reference to a possible implementation manner of the embodiment of the second aspect, the fifth convolution processing includes: performing convolution through a convolution layer with a fourth size and the first step size.
With reference to a possible implementation manner of the embodiment of the second aspect, the global feature extraction layer is specifically configured to perform full convolution on the downsampling features output by the Mth downsampling feature extraction layer to obtain full convolution features, and to sequentially perform deconvolution processing, normalization processing and first nonlinear conversion processing on the full convolution features to obtain the global features.
In a third aspect, an embodiment of the present invention further provides an electronic device, including: a memory and a processor, the memory and the processor connected; the memory is used for storing programs; the processor is configured to invoke a program stored in the memory to perform the method provided in the foregoing first aspect embodiment and/or one possible implementation manner provided in connection with the first aspect embodiment.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, where the storage medium includes a computer program, and the computer program is executed by a computer to perform the method provided in the foregoing first aspect and/or in connection with one possible implementation manner provided by the first aspect.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention or in the prior art, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort. The above and other objects, features and advantages of the present invention will become more apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale, emphasis instead being placed on illustrating the principles of the invention.
Fig. 1 is a schematic structural diagram of a generation network in a generation countermeasure network according to an embodiment of the present invention.
Fig. 2 illustrates a schematic structural diagram of the downsampling feature extraction layer in fig. 1 according to an embodiment of the present invention.
Fig. 3 illustrates a schematic structural diagram of the upsampling feature extraction layer in fig. 1 according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a generation network according to an embodiment of the present invention.
Fig. 5 shows a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Fig. 6 is a flowchart illustrating an image processing method according to an embodiment of the present invention.
Fig. 7 shows a flowchart of step S104 in Fig. 6 according to an embodiment of the present invention.
Fig. 8 is a schematic layer structure diagram of a generative countermeasure network provided in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it should be noted that the terms "first", "second", "third", and the like are used only for distinguishing the description, and are not intended to indicate or imply relative importance. Further, the term "and/or" in the present application is only one kind of association relationship describing the associated object, and means that three kinds of relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone.
First embodiment
The embodiment of the application provides a Generative Adversarial Network (GAN), which includes a generator network. Referring to fig. 1, the generator network includes: an initial feature extraction layer, M downsampling feature extraction layers connected in sequence, a global feature extraction layer, M upsampling feature extraction layers connected in sequence, and a residual feature extraction layer, where M is an integer greater than or equal to 1. Processing the image to be processed with a generator network of this layer structure, which has strong robustness, yields a restored image with a good enhancement effect while reducing the computational complexity of image processing; meanwhile, the global feature extraction layer extracts global features from the smallest-scale downsampling features output by the last downsampling feature extraction layer, which strengthens the generalization ability and robustness of the generator network and improves the restoration of fine textures.
The initial feature extraction layer is used for performing first processing on the image to be processed to obtain initial features. For ease of understanding, this embodiment takes the processing of a blurred image as an example, but the network is not limited to processing blurred images. That is, the blurred image to be processed is input into the initial feature extraction layer, and the initial features of the blurred image are extracted under the action of the initial feature extraction layer. Optionally, the initial feature extraction layer may sequentially perform fifth convolution processing, normalization processing and first nonlinear conversion processing on the blurred image to extract the initial features of the image to be processed.
Wherein the fifth convolution processing includes: performing convolution through a convolution layer with a fourth size and a first step size. For example, the image to be processed is convolved by a convolution layer of size 7x7 with a step size of 1. The first nonlinear conversion processing in this application may be implemented with the Leaky ReLU activation function or a variant thereof (e.g., PReLU).
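As a sketch, the initial feature extraction layer might look as follows in PyTorch; InstanceNorm2d and the channel counts are assumptions, since the text specifies only the 7x7/stride-1 convolution, a normalization, and Leaky ReLU or a variant:

```python
import torch.nn as nn

class InitialFeatureExtraction(nn.Module):
    # Fifth convolution (7x7, stride 1) -> normalization -> first nonlinear conversion.
    def __init__(self, in_ch=3, out_ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=7, stride=1, padding=3),
            nn.InstanceNorm2d(out_ch),      # normalization type assumed
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        return self.body(x)
```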
Each downsampling feature extraction layer is used for performing second processing on the input features of that layer to obtain shallow features, depth features and downsampling features, where the input feature of the first downsampling feature extraction layer, connected with the initial feature extraction layer, is the initial feature, and the input feature of each of the second to Mth downsampling feature extraction layers is the downsampling feature output by the preceding downsampling feature extraction layer. In this embodiment, M is an integer greater than or equal to 1, and M = 5 is taken as an example, which should not be construed as limiting the application. In this case, the input feature of the first downsampling feature extraction layer connected with the initial feature extraction layer is the initial feature, and the input feature of each of the second to fifth downsampling feature extraction layers is the downsampling feature output by the preceding layer; that is, the input feature of the second downsampling feature extraction layer is the downsampling feature output by the first downsampling feature extraction layer, the input feature of the third is the downsampling feature output by the second, and so on. Thus, when M is 5, 5 shallow features, 5 depth features and 5 downsampling features are obtained. Because the downsampling feature extraction layers are connected in series, different input features produce different output features: the shallow feature output by the first downsampling feature extraction layer differs from that output by the second, and the same holds for the depth features and the downsampling features. The situation is similar for the remaining downsampling feature extraction layers.
Wherein each of the downsampling feature extraction layers includes: a first feature extraction unit, a second feature extraction unit, a third feature extraction unit, a fourth feature extraction unit, and a fifth feature extraction unit, as indicated by the dashed line boxes in fig. 2. The output of the first feature extraction unit is connected with the input of the second feature extraction unit and the input of the fifth feature extraction unit respectively, and the output of the fifth feature extraction unit is connected with the input of the third feature extraction unit and the input of the fourth feature extraction unit respectively.
The first feature extraction unit is configured to perform first convolution processing, normalization processing, and first nonlinear conversion processing on input features input to the first feature extraction unit in sequence to obtain first basic features. For example, for the first down-sampling feature extraction layer, the initial features are input into the first feature extraction unit, and under the action of the first feature extraction unit, the first basic features can be obtained. The rest is similar.
Wherein the first convolution processing includes: performing convolution through a convolution layer with a first size and a first step size. For example, a convolution layer of size 3x3 with a step size of 1 convolves the input features of the unit. The first nonlinear conversion processing may be implemented with Leaky ReLU or a variant thereof.
And the second feature extraction unit is used for sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a shallow feature, namely, inputting the first basic feature into the second feature extraction unit, and under the action of the second feature extraction unit, obtaining the shallow feature (namely, the output feature of the second feature extraction unit).
Wherein the second convolution processing includes: performing convolution through a convolution layer with a second size and the first step size. For example, a convolution layer of size 1x1 with a step size of 1 convolves the input features of the unit. The first nonlinear conversion processing is implemented as described above.
And the third feature extraction unit is used for sequentially performing third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature output by the fifth feature extraction unit to obtain a conversion feature, performing pooling processing on the second basic feature, and connecting the pooled feature and the conversion feature in series to obtain the downsampling feature. That is, the third feature extraction unit sequentially performs the third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature, also performs pooling processing on the second basic feature, and finally outputs the series feature of the pooled feature and the conversion feature, namely the downsampling feature. In other words, the second basic feature output by the fifth feature extraction unit is input into the third feature extraction unit, and under the action of the third feature extraction unit, the downsampling feature (i.e., the output feature of the third feature extraction unit) is obtained.
Wherein the third convolution processing includes: performing convolution through a convolution layer with a third size and a second step size. For example, the second basic feature is convolved by a convolution layer of size 5x5 with a step size of 2. The first nonlinear conversion processing is implemented as described above.
And the fourth feature extraction unit is used for sequentially performing fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain the depth feature. Namely, the second basic feature is input into the fourth feature extraction unit, and under the action of the fourth feature extraction unit, the depth feature (i.e., the output feature of the fourth feature extraction unit) can be obtained.
Wherein the fourth convolution processing includes: performing convolution through a convolution layer with the first size and the second step size. For example, the second basic feature is convolved by a convolution layer of size 3x3 with a step size of 2. The first nonlinear conversion processing is implemented as described above.
Since deep image features correspond to large local regions (receptive fields) and shallow features to small ones, a convolution layer with a step size of 2 is used to extract the deep features, and a convolution layer with a step size of 1 is used to extract the shallow features.
And the fifth feature extraction unit is used for sequentially performing residual processing on the first basic feature three times to obtain the second basic feature. That is, the first basic feature is input into the fifth feature extraction unit, and under the action of the fifth feature extraction unit, the second basic feature (i.e., the output feature of the fifth feature extraction unit) is obtained. Further, the fifth feature extraction unit is specifically configured to: sequentially perform first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature, and connect the third basic feature in series with the first basic feature to obtain a second series feature (the first residual processing); sequentially perform the same convolution, normalization and nonlinear conversion sequence on the second series feature to obtain a fourth basic feature, and connect the second series feature in series with the fourth basic feature to obtain a third series feature (the second residual processing); and sequentially perform the same sequence on the third series feature to obtain a fifth basic feature, and connect the fifth basic feature in series with the third series feature to obtain the second basic feature (the third residual processing).
The second series feature is the feature obtained by connecting the third basic feature and the first basic feature in series; similarly, the third series feature is the feature obtained by connecting the second series feature and the fourth basic feature in series.
As can be seen from the above, each residual processing follows the same principle; since the input features differ each time, the output features also differ. Only the first residual processing is described below; the second and third residual processings follow the same principle. The first residual processing includes: sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain the third basic feature; and connecting the third basic feature in series with the first basic feature to obtain the second series feature. That is, the first basic feature is convolved, normalized and nonlinearly converted, then convolved, normalized and nonlinearly converted again to obtain the third basic feature, and finally the third basic feature and the first basic feature are output in series, ending the first residual processing.
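Putting the five units together, one downsampling feature extraction layer could be sketched as below; the channel widths, InstanceNorm2d, average pooling and even input sizes are assumptions, while the kernel sizes, strides, series connections and the three residual processings follow the text:

```python
import torch
import torch.nn as nn

def cbl(in_ch, out_ch, k, s):
    # convolution -> normalization -> first nonlinear conversion
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, s, padding=k // 2),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )

class DownsamplingFeatureExtraction(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.unit1 = cbl(ch, ch, 3, 1)                      # first conv -> first basic feature
        self.unit2 = cbl(ch, ch, 1, 1)                      # second conv -> shallow feature
        # Fifth unit: three residual processings, each a double 3x3/1 conv whose
        # output is connected in series (concatenated) with its input.
        self.res1 = nn.Sequential(cbl(ch, ch, 3, 1), cbl(ch, ch, 3, 1))
        self.res2 = nn.Sequential(cbl(2 * ch, ch, 3, 1), cbl(ch, ch, 3, 1))
        self.res3 = nn.Sequential(cbl(3 * ch, ch, 3, 1), cbl(ch, ch, 3, 1))
        self.unit3 = cbl(4 * ch, ch, 5, 2)                  # third conv -> conversion feature
        self.pool = nn.AvgPool2d(2)                         # pooling type assumed
        self.unit4 = cbl(4 * ch, ch, 3, 2)                  # fourth conv -> depth feature

    def forward(self, x):
        f1 = self.unit1(x)                                  # first basic feature
        shallow = self.unit2(f1)
        s2 = torch.cat([self.res1(f1), f1], dim=1)          # first residual processing
        s3 = torch.cat([s2, self.res2(s2)], dim=1)          # second residual processing
        f2 = torch.cat([self.res3(s3), s3], dim=1)          # third -> second basic feature
        down = torch.cat([self.pool(f2), self.unit3(f2)], dim=1)  # downsampling feature
        depth = self.unit4(f2)
        return shallow, depth, down
```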
And the global feature extraction layer is used for carrying out third processing on the down-sampling features output by the Mth down-sampling feature extraction layer to obtain global features. That is, the downsampled features output by the last downsampled feature extraction layer are input into the global feature extraction layer, and under the action of the global feature extraction layer, the global features can be obtained. Optionally, performing full convolution on the down-sampling features output by the Mth down-sampling feature extraction layer through the global feature extraction layer to obtain full convolution features; then, the full convolution feature is sequentially subjected to deconvolution processing, normalization processing and first nonlinear conversion processing, and the global feature can be obtained.
Wherein the deconvolution processing includes: performing deconvolution through a deconvolution layer with a third size and a second step size. For example, the full convolution feature is deconvolved by a deconvolution layer of size 5x5 with a step size of 2. The first nonlinear conversion processing is implemented as described above.
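One possible reading of the global feature extraction layer is sketched below. Interpreting "full convolution" as a convolution whose kernel spans the entire smallest-scale feature map is an assumption, as are feat_size, the channel widths, and the deconvolution that expands the 1x1 descriptor back to that scale:

```python
import torch.nn as nn

class GlobalFeatureExtraction(nn.Module):
    # Full convolution over the smallest-scale downsampling features, then
    # deconvolution -> normalization -> first nonlinear conversion.
    def __init__(self, ch, feat_size=8):
        super().__init__()
        self.full_conv = nn.Conv2d(ch, ch, kernel_size=feat_size)   # HxW -> 1x1
        self.up = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, kernel_size=feat_size),      # 1x1 -> HxW
            nn.InstanceNorm2d(ch),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        # x must be feat_size x feat_size spatially for the 1x1 collapse to hold.
        return self.up(self.full_conv(x))
```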
Each upsampling feature extraction layer is used for performing fourth processing on the input features of that layer to obtain upsampling features. The input features of the first upsampling feature extraction layer, connected with the global feature extraction layer, include a first direct-connection feature and a first cross-connection feature, where the first direct-connection feature is the first series feature of the downsampling feature and depth feature output by the Mth downsampling feature extraction layer and the global feature, and the first cross-connection feature is the shallow feature output by the Mth downsampling feature extraction layer and the depth feature output by the (M-1)th downsampling feature extraction layer. The input features of the jth upsampling feature extraction layer include a jth direct-connection feature and a jth cross-connection feature, where the jth direct-connection feature is the upsampling feature output by the preceding upsampling feature extraction layer, and the jth cross-connection feature is the shallow feature output by the (M-j+1)th downsampling feature extraction layer and the depth feature output by the (M-j)th downsampling feature extraction layer. The input features of the Mth upsampling feature extraction layer include an Mth direct-connection feature and an Mth cross-connection feature, where the Mth direct-connection feature is the upsampling feature output by the (M-1)th upsampling feature extraction layer, the Mth cross-connection feature is the shallow feature output by the first downsampling feature extraction layer, and j runs from 2 to M-1.
The input features of each upsampling feature extraction layer include a direct-connection feature and a cross-connection feature, and different upsampling feature extraction layers have different direct-connection and cross-connection features. For ease of understanding, take M = 5 as an example. The direct-connection feature in the input features of the first upsampling feature extraction layer, connected to the global feature extraction layer, is the first series feature of the downsampling feature output by the 5th (last) downsampling feature extraction layer, the depth feature output by the 5th downsampling feature extraction layer, and the global feature; that is, the first series feature is obtained by connecting in series the downsampling feature output by the 5th downsampling feature extraction layer, the depth feature output by the 5th downsampling feature extraction layer, and the global feature output by the global feature extraction layer. The cross-connection feature in the input features of the first upsampling feature extraction layer is the shallow feature output by the 5th downsampling feature extraction layer and the depth feature output by the 4th downsampling feature extraction layer.
The direct-connection feature in the input features of the jth (i.e., 2nd to 4th) upsampling feature extraction layer is the upsampling feature output by the preceding upsampling feature extraction layer. That is, the direct-connection feature of the second upsampling feature extraction layer is the upsampling feature output by the first; that of the third is the upsampling feature output by the second; and that of the fourth is the upsampling feature output by the third. The cross-connection feature in the input features of the jth (i.e., 2nd to 4th) upsampling feature extraction layer is the shallow feature output by the (M-j+1)th downsampling feature extraction layer and the depth feature output by the (M-j)th downsampling feature extraction layer. That is, the cross-connection feature of the 2nd upsampling feature extraction layer is the shallow feature output by the 4th downsampling feature extraction layer and the depth feature output by the 3rd; that of the 3rd is the shallow feature output by the 3rd downsampling feature extraction layer and the depth feature output by the 2nd; and that of the 4th is the shallow feature output by the 2nd downsampling feature extraction layer and the depth feature output by the 1st.
The direct-connection feature in the input features of the 5th (i.e., last) upsampling feature extraction layer is the upsampling feature output by the 4th upsampling feature extraction layer, and its cross-connection feature is the shallow feature output by the first downsampling feature extraction layer.
It should be noted that the number of downsampling feature extraction layers equals the number of upsampling feature extraction layers; that is, when there is one downsampling feature extraction layer, there is also one upsampling feature extraction layer. When M is 1, i.e., there is only one downsampling feature extraction layer and one upsampling feature extraction layer, the direct-connection feature of the upsampling feature extraction layer is the first series feature of the downsampling feature and depth feature output by the downsampling feature extraction layer and the global feature, and its cross-connection feature is the shallow feature output by the downsampling feature extraction layer. When M is 2, i.e., there are 2 downsampling feature extraction layers and 2 upsampling feature extraction layers, the direct-connection feature of the first upsampling feature extraction layer is the first series feature of the downsampling feature and depth feature output by the second downsampling feature extraction layer and the global feature, and its cross-connection feature is the shallow feature output by the second downsampling feature extraction layer and the depth feature output by the first downsampling feature extraction layer; the direct-connection feature of the second upsampling feature extraction layer is the upsampling feature output by the first upsampling feature extraction layer, and its cross-connection feature is the shallow feature output by the first downsampling feature extraction layer. The cases M = 3, M = 4 and so on follow the same principle and are not repeated here.
Optionally, as shown in fig. 3, each upsampling feature extraction layer includes: a sixth feature extraction unit, a first feature extraction unit and a second feature extraction unit. The output of the sixth feature extraction unit is connected with the input of the first feature extraction unit, and the output of the first feature extraction unit is connected with the input of the second feature extraction unit. Note that, since the input features of the respective upsampling feature extraction layers are different, the output upsampling features are also different.
The sixth feature extraction unit is used for sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected feature input to it to obtain a sixth basic feature, and for connecting the sixth basic feature in series with the cross-connection features input to it to obtain a fourth series feature. For ease of understanding, take M as 5 and consider the first upsampling feature extraction layer: the directly connected feature input to the sixth feature extraction unit is the first series feature of the downsampling feature and the depth feature output by the 5th downsampling feature extraction layer and the global feature, and the cross-connection features are the shallow feature output by the 5th downsampling feature extraction layer and the depth feature output by the 4th downsampling feature extraction layer. The above process is therefore: the first series feature is sequentially subjected to deconvolution processing, normalization processing and first nonlinear conversion processing to obtain the sixth basic feature, and the sixth basic feature is then connected in series with the cross-connection features (i.e., the sixth basic feature, the shallow feature output by the 5th downsampling feature extraction layer and the depth feature output by the 4th downsampling feature extraction layer are connected in series) to obtain the fourth series feature.
The deconvolution processing includes: performing deconvolution through a deconvolution layer with a kernel of the third size and a step size of the second step size. For example, the directly connected feature is deconvolved by a deconvolution layer of size 5x5 with a step size of 2. For the implementation of the first nonlinear conversion processing, refer to the foregoing description.
The first feature extraction unit in the upsampling feature extraction layer is used for sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series feature to obtain a seventh basic feature. That is, the fourth series feature is input into the first feature extraction unit, and under the action of the first feature extraction unit the seventh basic feature (i.e., the output feature of the first feature extraction unit) is obtained. For the implementations of the first convolution processing and the first nonlinear conversion processing, refer to the foregoing description.
The second feature extraction unit in the upsampling feature extraction layer is used for sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the upsampling feature. That is, the seventh basic feature is input into the second feature extraction unit, and under the action of the second feature extraction unit the upsampling feature (i.e., the output feature of the second feature extraction unit) is obtained. For the implementations of the second convolution processing and the first nonlinear conversion processing, refer to the foregoing description.
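For illustration, the following is a minimal PyTorch-style sketch of one upsampling feature extraction layer with all three units; the framework, module names, channel counts and the use of instance normalization are assumptions, since the patent fixes only the kernel sizes, step sizes and the LeakyReLU-family activation:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, k, s):
    # convolution -> normalization -> first nonlinear conversion (LeakyReLU)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride=s, padding=k // 2),
        nn.InstanceNorm2d(out_ch),  # normalization type is an assumption
        nn.LeakyReLU(0.2, inplace=True),
    )

class UpsampleLayer(nn.Module):
    """One upsampling feature extraction layer: sixth, first and second units."""
    def __init__(self, direct_ch, cross_ch, out_ch):
        super().__init__()
        # sixth unit: deconvolution (third size 5x5, second step size 2) -> norm -> activation
        self.unit6 = nn.Sequential(
            nn.ConvTranspose2d(direct_ch, out_ch, 5, stride=2,
                               padding=2, output_padding=1),
            nn.InstanceNorm2d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.unit1 = conv_block(out_ch + cross_ch, out_ch, 3, 1)  # first unit: 3x3, stride 1
        self.unit2 = conv_block(out_ch, out_ch, 1, 1)             # second unit: 1x1, stride 1

    def forward(self, direct, cross):
        base6 = self.unit6(direct)               # sixth basic feature
        series4 = torch.cat([base6, cross], 1)   # fourth series feature
        base7 = self.unit1(series4)              # seventh basic feature
        return self.unit2(base7)                 # upsampling feature
```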
It should be noted that each upsampling feature extraction layer may include only: a sixth feature extraction unit and a first feature extraction unit. That is, in this embodiment, the up-sampling feature extraction layer does not include the second feature extraction unit. Alternatively, each upsampling feature extraction layer may include only the sixth feature extraction unit and the second feature extraction unit. That is, in this embodiment, the upsampling feature extraction layer does not include the first feature extraction unit. The structure shown in fig. 3 should therefore not be construed as limiting the present application.
And the residual error feature extraction layer is used for performing fifth processing on the up-sampling feature output by the Mth (namely the last) up-sampling feature extraction layer to obtain a residual error feature so as to add the residual error feature and the image to be processed to obtain a target image. Namely, the upsampling features output by the last upsampling feature extraction layer are input into the residual error feature extraction layer, and under the action of the residual error feature extraction layer, the residual error features can be obtained. Optionally, the residual feature extraction layer sequentially performs fifth convolution processing and second nonlinear conversion processing on the upsampled features output by the mth upsampled feature extraction layer.
The fifth convolution processing includes: performing convolution through a convolution layer with a kernel of the fourth size and a step size of the first step size. For example, the input feature is convolved by a convolution layer of size 7x7 with a step size of 1. The second nonlinear conversion processing may be implemented by using the activation function Tanh or a variant thereof, such as sigmoid.
After the residual features are obtained, the residual features and the image to be processed are added element-wise to obtain the restored image, namely the target image.
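A minimal sketch of the residual feature extraction layer together with this final element-wise addition, under the same assumptions as the sketch above:

```python
import torch.nn as nn

class ResidualFeatureLayer(nn.Module):
    # fifth processing: fifth convolution (7x7, stride 1) followed by Tanh
    def __init__(self, in_ch, img_ch=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, img_ch, 7, stride=1, padding=3)
        self.act = nn.Tanh()

    def forward(self, up_feature, image):
        residual = self.act(self.conv(up_feature))  # residual feature
        return image + residual                     # element-wise addition -> target image
```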
For ease of understanding of the above process, refer to fig. 4. Fig. 4 only shows the case where M is 3; the other values can be handled correspondingly according to the principles described above. On the left side of fig. 4, box 2 (corresponding to the first feature extraction unit above), box 3 (the third feature extraction unit), box 4 (the fourth feature extraction unit), box 5 (the fifth feature extraction unit) and box 6 (the second feature extraction unit) together form one downsampling feature extraction layer. On the right side of fig. 4, box 7 (corresponding to the sixth feature extraction unit above), box 2 (the first feature extraction unit) and box 6 (the second feature extraction unit) together form one upsampling feature extraction layer. In fig. 4, 1 is the initial feature extraction layer and 8 is the residual feature extraction layer. Box 9 and box 7, located between the last downsampling feature extraction layer and the first upsampling feature extraction layer, form the global feature extraction layer. Since the network uses bi-directional skip connections, a cross-layer connection operation, it is called Bi-Skip; and since a global feature (Global) is also taken into account, the generation network shown in fig. 4 can be called a Global-Bi-Skip network.
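Putting the pieces together, the Global-Bi-Skip forward pass for M = 3 could be wired as follows; the layer objects are assumed to follow the interfaces sketched in this section (each downsampling layer returns its shallow, depth and downsampling features, each upsampling layer takes a direct and a cross feature, and the residual layer adds its output to the input image):

```python
import torch

def global_bi_skip_forward(x, init, downs, glob, ups, res):
    # x: image to be processed; downs/ups: lists of 3 layers each (M = 3)
    feat = init(x)                                   # initial feature
    shallows, depths = [], []
    for down in downs:                               # encoding / downsampling
        shallow, depth, feat = down(feat)
        shallows.append(shallow)
        depths.append(depth)
    g = glob(feat)                                   # global feature
    direct = torch.cat([feat, depths[2], g], 1)      # first series feature
    # bi-skip: each upsampling layer also receives a shallow/depth pair
    u = ups[0](direct, torch.cat([shallows[2], depths[1]], 1))
    u = ups[1](u, torch.cat([shallows[1], depths[0]], 1))
    u = ups[2](u, shallows[0])                       # last layer: shallow feature only
    return res(u, x)                                 # residual feature + image -> target image
```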
The different numbers in the rectangular boxes in fig. 4 represent different layer structures; boxes with the same number have the same layer structure. The details of each layer structure are shown in table 1 below.
TABLE 1
(Table 1 is reproduced as an image in the original publication; it details the layer structure corresponding to each numbered box in fig. 4, including the activation functions referenced below.)
Wherein, the activation function 1 in table 1 is LeakyReLU or a variant of LeakyReLU such as PReLU; the activation function 2 is Tanh or a variant of Tanh such as sigmoid.
The first size, the second size, the third size and the fourth size in this application represent the size of a convolution kernel or a deconvolution kernel. The embodiments of the present application only show the case where the first size is 3x3, the second size is 1x1, the third size is 5x5 and the fourth size is 7x7. The values of the first, second, third and fourth sizes may be interchanged, such as the case where the first size is 5x5, the second size is 1x1, the third size is 3x3 and the fourth size is 7x7. The above examples are therefore not to be construed as limiting the present application.
Second embodiment
Fig. 5 is a block diagram illustrating the structure of an electronic device 100 according to an embodiment of the present invention. The electronic device 100 includes: a memory 120, a memory controller 130 and a processor 140. The components and configuration of the electronic device 100 shown in fig. 5 are exemplary only, not limiting, and the electronic device 100 may have other components and configurations as desired.
The memory 120, the memory controller 130, and the processor 140 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines.
The memory 120 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 120 is used for storing a program, namely the program required for executing the image processing method of this embodiment. After receiving an execution instruction, the processor 140 calls the program stored in the memory 120 through the bus and executes it, thereby carrying out the flow of the image processing method. The method executed by the electronic device 100, as defined by the flows disclosed in the embodiments of the present invention described later, may be applied to, or implemented by, the processor 140.
The processor 140 may be an integrated circuit chip having signal processing capabilities. The processor 140 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a RAM, flash memory, ROM, PROM, EPROM, or a register.
In the embodiment of the present invention, the electronic device 100 may be, but is not limited to, a web server, a database server, a cloud server, and the like.
Referring to fig. 6, the steps of an image processing method applied to the electronic device 100 according to an embodiment of the present invention are described below. The image processing method is applied to a generative countermeasure network. The generative countermeasure network comprises a generation network, and the generation network comprises an initial feature extraction layer, M downsampling feature extraction layers, a global feature extraction layer, M upsampling feature extraction layers and a residual feature extraction layer, which are connected in sequence, where M is an integer greater than or equal to 1.
Step S101: and performing first processing on the image to be processed through the initial feature extraction layer to obtain initial features.
When image enhancement processing needs to be carried out on a to-be-processed image, such as a degraded or blurred image, the image to be processed is input into the initial feature extraction layer, and under the action of the initial feature extraction layer the initial features of the image to be processed can be extracted. For example, the initial feature extraction layer sequentially performs fifth convolution processing, normalization processing and first nonlinear conversion processing on the blurred image, thereby extracting the initial features of the image to be processed.
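The following is a minimal sketch of such an initial feature extraction layer; the framework, channel counts and the use of instance normalization are assumptions, since the patent fixes only the kernel size (7x7), the stride (1) and the LeakyReLU-family activation:

```python
import torch.nn as nn

# initial feature extraction layer (first processing):
# fifth convolution (7x7, stride 1) -> normalization -> first nonlinear conversion
initial_layer = nn.Sequential(
    nn.Conv2d(3, 32, 7, stride=1, padding=3),  # channel counts are assumptions
    nn.InstanceNorm2d(32),
    nn.LeakyReLU(0.2, inplace=True),
)
```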
Step S102: and performing second processing on the input features input into the down-sampling feature extraction layers to obtain shallow features, depth features and down-sampling features.
Each downsampling feature extraction layer performs the second processing on the input features input to it to obtain shallow features, depth features and downsampling features. The input feature of the first downsampling feature extraction layer, which is connected with the initial feature extraction layer, is the initial feature, and the input features of the second to Mth downsampling feature extraction layers are the downsampling features output by the respective previous downsampling feature extraction layer. For the first downsampling feature extraction layer, the input feature is thus the initial feature output by the initial feature extraction layer, and under the action of the first downsampling feature extraction layer the shallow feature, the depth feature and the downsampling feature can be obtained.
Each downsampling feature extraction layer performs the second processing on its input features according to the same principle; only the output features differ because the input features differ. For example, when M is 5, the input feature of the first downsampling feature extraction layer connected with the initial feature extraction layer is the initial feature, and the input features of the second to fifth downsampling feature extraction layers are the downsampling features output by the respective previous downsampling feature extraction layer; that is, the input feature of the second downsampling feature extraction layer is the downsampling feature output by the first downsampling feature extraction layer, the input feature of the third downsampling feature extraction layer is the downsampling feature output by the second downsampling feature extraction layer, and so on.
Performing the second processing on the input features through each downsampling feature extraction layer to obtain the shallow features, depth features and downsampling features includes: sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the input features input to the downsampling feature extraction layer to obtain first basic features; sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the first basic features to obtain the shallow features; sequentially performing residual processing three times on the first basic features to obtain second basic features; sequentially performing third convolution processing, normalization processing and first nonlinear conversion processing on the second basic features to obtain conversion features, performing pooling processing on the second basic features, and connecting the pooled features and the conversion features in series to obtain the downsampling features; and sequentially performing fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic features to obtain the depth features. After multiple rounds of downsampling, deeper features, i.e., more image detail information, can be extracted, so that the image texture in the final output is restored more accurately.
The above-described procedure (step S102) will be described with reference to the layer structure of the downsampled feature extraction layer shown in fig. 2, that is, the first feature extraction unit sequentially performs the first convolution processing, normalization processing, and first nonlinear conversion processing on the input features input to the first feature extraction unit, thereby obtaining the first basic features. For example, for the first down-sampling feature extraction layer, the initial features are input into the first feature extraction unit, and under the action of the first feature extraction unit, the first basic features can be obtained. The rest is similar.
And sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the first basic features by using a second feature extraction unit to obtain shallow features, namely inputting the first basic features into the second feature extraction unit, and obtaining the shallow features (namely the output features of the second feature extraction unit) under the action of the second feature extraction unit.
The third feature extraction unit sequentially performs third convolution processing, normalization processing and first nonlinear conversion processing on the second basic features output by the fifth feature extraction unit to obtain the conversion features, also performs pooling processing on the second basic features, and connects the pooled features and the conversion features in series to obtain the downsampling features. That is, the third feature extraction unit finally outputs the series feature of the pooled feature and the conversion feature, namely the downsampling feature. In other words, the second basic feature is input into the third feature extraction unit, and under the action of the third feature extraction unit the downsampling feature (i.e., the output feature of the third feature extraction unit) can be obtained.
And sequentially performing fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic features by using a fourth feature extraction unit to obtain depth features. Namely, the second basic feature is input into the fourth feature extraction unit, and under the action of the fourth feature extraction unit, the depth feature (i.e., the output feature of the fourth feature extraction unit) can be obtained.
And sequentially carrying out three times of residual error processing on the first basic features by using a fifth feature extraction unit to obtain the second basic features. That is, the first basic feature is input into the fifth feature extraction unit, and under the action of the fifth feature extraction unit, the second basic feature (that is, the output feature of the fifth feature extraction unit) can be obtained. Further, the fifth feature extraction unit is specifically configured to: sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature; connecting the third basic feature with the first basic feature in series to obtain a second serial feature (namely, carrying out first residual processing); sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the second series connection characteristic to obtain a fourth basic characteristic; connecting the second series characteristic with the fourth basic characteristic in series to obtain a third series characteristic (namely, second residual processing); sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the third series connection characteristic to obtain a fifth basic characteristic; and connecting the fifth basic feature with the third serial feature in series to obtain the second basic feature (namely, third residual processing).
As can be seen from the above, each residual processing follows the same principle; the input features differ each time, so the output features differ too. Only the first residual processing is described below; the second and third residual processing follow the same principle as the first. The first residual processing includes the following steps: sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature; and connecting the third basic feature with the first basic feature in series to obtain a second series feature. That is, the first basic feature is convolved, normalized and nonlinearly converted, then convolved, normalized and nonlinearly converted once more to obtain the third basic feature, and finally the series connection of the third basic feature and the first basic feature is output, which ends the first residual processing.
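As an illustration, here is a PyTorch-style sketch of one complete downsampling feature extraction layer, combining the five units and the three concatenating residual steps described above; the framework, the channel growth under concatenation, the pooling type and the normalization are all assumptions:

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, k, s):
    # convolution -> normalization -> first nonlinear conversion (LeakyReLU)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, stride=s, padding=k // 2),
        nn.InstanceNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )

class DownsampleLayer(nn.Module):
    """One downsampling feature extraction layer (second processing)."""
    def __init__(self, ch):
        super().__init__()
        self.unit1 = conv_block(ch, ch, 3, 1)   # first unit: first convolution 3x3, stride 1
        self.unit2 = conv_block(ch, ch, 1, 1)   # second unit: second convolution 1x1 -> shallow
        # fifth unit: three residual steps, each conv-norm-act twice, then concatenation
        self.res1 = nn.Sequential(conv_block(ch, ch, 3, 1), conv_block(ch, ch, 3, 1))
        self.res2 = nn.Sequential(conv_block(2 * ch, ch, 3, 1), conv_block(ch, ch, 3, 1))
        self.res3 = nn.Sequential(conv_block(3 * ch, ch, 3, 1), conv_block(ch, ch, 3, 1))
        self.unit3 = conv_block(4 * ch, ch, 5, 2)  # third unit: third convolution 5x5, stride 2
        self.pool = nn.AvgPool2d(2)                # pooling type is an assumption
        self.unit4 = conv_block(4 * ch, ch, 3, 2)  # fourth unit: fourth convolution 3x3, stride 2

    def forward(self, x):
        base1 = self.unit1(x)                             # first basic feature
        shallow = self.unit2(base1)                       # shallow feature
        s2 = torch.cat([self.res1(base1), base1], 1)      # second series feature
        s3 = torch.cat([s2, self.res2(s2)], 1)            # third series feature
        base2 = torch.cat([self.res3(s3), s3], 1)         # second basic feature
        conversion = self.unit3(base2)                    # conversion feature
        down = torch.cat([self.pool(base2), conversion], 1)  # downsampling feature
        depth = self.unit4(base2)                         # depth feature
        return shallow, depth, down
```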
The feature extraction performed by the downsampling feature extraction layers can be regarded as feature encoding; when the downsampling processing ends, the encoding ends.
Step S103: and performing third processing on the down-sampling features output by the Mth down-sampling feature extraction layer through the global feature extraction layer to obtain global features.
After the down-sampling processing is finished, namely after the coding is finished, the global feature extraction layer is used for carrying out third processing on the down-sampling features output by the Mth down-sampling feature extraction layer to obtain global features, so that the generalization capability and robustness of a generated network are enhanced, and the restoration capability of the micro texture is improved. That is, the downsampled features output by the last downsampled feature extraction layer are input into the global feature extraction layer, and under the action of the global feature extraction layer, the global features can be obtained. Optionally, performing full convolution on the down-sampling features output by the Mth down-sampling feature extraction layer through the global feature extraction layer to obtain full convolution features; then, the full convolution feature is sequentially subjected to deconvolution processing, normalization processing and first nonlinear conversion processing, and the global feature can be obtained.
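A sketch of the global feature extraction layer under one possible reading of "full convolution", namely a convolution whose kernel spans the entire feature map (a fully connected layer in convolutional form); this interpretation, the framework and the normalization are assumptions:

```python
import torch.nn as nn

class GlobalFeatureLayer(nn.Module):
    """Third processing sketch: full convolution, then deconvolution -> norm -> activation."""
    def __init__(self, ch, feat_size):
        super().__init__()
        self.full_conv = nn.Conv2d(ch, ch, feat_size)   # -> 1x1 full-convolution feature
        self.restore = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, feat_size),      # deconvolution back to feat_size
            nn.InstanceNorm2d(ch),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, down_feature):
        return self.restore(self.full_conv(down_feature))  # global feature
```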
Step S104: and performing fourth processing on the input features input into the up-sampling feature extraction layers to obtain the up-sampling features.
After the downsampling process is finished, that is, after the encoding is finished, feature decoding needs to be performed, that is, the feature upsampling process is carried out. At this time, each upsampling feature extraction layer performs the fourth processing on the input features input to it to obtain the upsampling features.
Wherein the input features of the first upsampled feature extraction layer connected to the global feature extraction layer comprise: the device comprises a first direct connection feature and a first cross connection feature, wherein the first direct connection feature is a first series feature of a down-sampling feature, a depth feature and a global feature output by the Mth down-sampling feature extraction layer, and the first cross connection feature is a shallow feature output by the Mth down-sampling feature extraction layer and a depth feature output by the M-1 th down-sampling feature extraction layer. The input features of the jth upsampling feature extraction layer include: the system comprises a jth direct connection feature and a jth cross-connection feature, wherein the jth direct connection feature is an up-sampling feature output by a previous up-sampling feature extraction layer of the jth direct connection feature, and the jth cross-connection feature is a shallow feature output by an M-j +1 th down-sampling feature extraction layer and a depth feature output by an M-j th down-sampling feature extraction layer. The input features of the mth upsampling feature extraction layer include: the M-th direct connection feature is an up-sampling feature output by an M-1-th up-sampling feature extraction layer, the M-th cross-connection feature is a shallow feature output by a first down-sampling feature extraction layer, and j is sequentially from 2 to M-1.
Each upsampling feature extraction layer performs the fourth processing on the input features input to it according to the same principle; only the output features differ because the input features differ. The input features of each upsampling feature extraction layer include a direct connection feature and cross-connection features. The prefixes "first", "jth", "Mth", etc. attached to the direct connection and cross-connection features serve merely to distinguish them: different upsampling feature extraction layers have different direct connection features and cross-connection features.
Alternatively, a process of obtaining the upsampled features by performing the third processing on the input features input to the upsampled feature extraction layer by each of the upsampled feature extraction layers may be described with reference to the flowchart shown in fig. 7.
Step S201: and sequentially carrying out deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected features input to the up-sampling feature extraction layers to obtain sixth basic features.
Step S202: and connecting the sixth basic feature with the cross-connection feature input to the sixth basic feature in series to obtain a fourth series feature.
The above procedure (step S201 and step S202) is described with reference to the layer structure of the upsampling feature extraction layer shown in fig. 3. Namely, the sixth feature extraction unit is used for sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected features input to it to obtain the sixth basic feature, and for connecting the sixth basic feature in series with the cross-connection features input to it to obtain the fourth series feature.
Step S203: and sequentially carrying out first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series characteristic to obtain a seventh basic characteristic.
After the fourth series feature is obtained, the first feature extraction unit in the upsampling feature extraction layer sequentially performs first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series feature to obtain the seventh basic feature. That is, the fourth series feature is input into the first feature extraction unit, and under the action of the first feature extraction unit the seventh basic feature (i.e., the output feature of the first feature extraction unit) can be obtained. For the implementations of the first convolution processing and the first nonlinear conversion processing, refer to the foregoing description.
Step S204: and sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the up-sampling feature.
After the seventh basic feature is obtained, the second feature extraction unit sequentially performs second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the upsampling feature. That is, the seventh basic feature is input into the second feature extraction unit, and under the action of the second feature extraction unit the upsampling feature (i.e., the output feature of the second feature extraction unit) can be obtained. For the implementations of the second convolution processing and the first nonlinear conversion processing, refer to the foregoing description.
It should be noted that each upsampling feature extraction layer may include only: a sixth feature extraction unit and a first feature extraction unit. At this time, the flow shown in fig. 7 is changed to: sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected features input to the up-sampling feature extraction layers to obtain sixth basic features; connecting the sixth basic feature with the cross-connection feature input to the sixth basic feature in series to obtain a fourth series feature; and sequentially carrying out first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series connection characteristic to obtain the up-sampling characteristic.
Alternatively, each upsampling feature extraction layer may include only the sixth feature extraction unit and the second feature extraction unit. At this time, the flow shown in fig. 7 is changed to: sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected features input to the up-sampling feature extraction layers to obtain sixth basic features; connecting the sixth basic feature with the cross-connection feature input to the sixth basic feature in series to obtain a fourth series feature; and sequentially carrying out second convolution processing, normalization processing and first nonlinear conversion processing on the fourth series connection characteristic to obtain the up-sampling characteristic. The flow shown in fig. 7 should therefore not be construed as limiting the application.
Here, the end of upsampling also means the end of decoding, where downsampling corresponds to upsampling, and therefore the number of downsampled feature extraction layers is the same as the number of upsampled feature extraction layers.
Step S105: and performing fifth processing on the upsampling features output by the Mth upsampling feature extraction layer through the residual feature extraction layer to obtain residual features.
After the decoding is finished, that is, after the upsampling is finished, the fifth processing is performed on the upsampling features output by the Mth upsampling feature extraction layer through the residual feature extraction layer to obtain the residual features. That is, the upsampling features output by the last upsampling feature extraction layer are input into the residual feature extraction layer, and under the action of the residual feature extraction layer the residual features can be obtained. Optionally, the residual feature extraction layer sequentially performs fifth convolution processing and second nonlinear conversion processing on the upsampling features output by the Mth upsampling feature extraction layer, so as to obtain the residual features.
Step S106: and adding the residual error characteristics and the image to be processed to obtain a target image.
After the residual features are obtained, the residual features and the image to be processed are added element-wise to obtain the restored image, i.e., the target image.
In order to improve the processing capability of the generation network for the blurred image, that is, to output a restored image with high definition, that is, a target image, the generation network needs to be trained and optimized. When training the optimized generation network, a countermeasure network, that is, a discriminant network, needs to be connected in series behind the generation network, as shown in fig. 8. In the embodiment of the present application, parameters of the countermeasure network used are shown in table 2.
Table 2 (setting parameters of countermeasure network)
(Table 2 is reproduced as an image in the original publication; it lists the layer-by-layer setting parameters of the countermeasure network, whose dimension notation is explained below.)
The parameter dimensions in table 2, such as 32x3x5x5 and 128x128x5x5, are read as follows: the first number (e.g., 32, 64, 128) is the number of channels of the current layer's features, the second number (e.g., 3, 32, 64) is the number of channels of the previous layer's features, and the last two numbers (e.g., 5x5, 4x4, 1x1) give the size of the convolution kernel.
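Read this way, a parameter dimension such as 32x3x5x5 corresponds, for example, to the following convolution layer (illustrative only):

```python
import torch.nn as nn

# 32x3x5x5: 32 output channels (current layer), 3 input channels
# (previous layer), and a 5x5 convolution kernel.
layer = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=5)
```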
Fig. 8 shows a case where the generative countermeasure network includes a generative network and a countermeasure network. The generation network and the confrontation network are just like two players, the purpose of the generation network is to generate vivid images as much as possible, so that the confrontation network cannot identify true and false, and the purpose of the confrontation network is to distinguish the input images from a true sample set or a false sample set as much as possible, wherein the closer the output value is to 1, the higher the possibility that the input images are from the true sample set is, the closer the output value is to 0, the higher the possibility that the input images are from the false sample set is, and of course, the two can be reversed.
The generative confrontation network, i.e., the generation network together with the confrontation network connected in series behind it, hereinafter referred to as the GAN, can be trained by the following method: during training, the target image output by the generation network (the false sample set) and the reference image (the true sample set) are respectively input into the confrontation network; the confrontation network and the generation network are then trained with a single alternating iteration optimization method until the iterations are finished. During optimization, the confrontation network is optimized first, and then the generation network. Single alternating iteration means that the generation network is temporarily left out while the confrontation network is optimized, and the confrontation network is temporarily left out while the generation network is optimized. The number of iterations can be set as required, for example to 300.
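A sketch of this single alternating iteration in PyTorch, with the two loss functions passed in as callables (they are sketched after the formulas below); the optimizer choice and the learning rate are assumptions:

```python
import torch

def train_alternating(G, D, loader, d_loss_fn, g_loss_fn, iters=300, lr=1e-4):
    opt_d = torch.optim.Adam(D.parameters(), lr=lr)
    opt_g = torch.optim.Adam(G.parameters(), lr=lr)
    for _, (blurred, reference) in zip(range(iters), loader):
        # 1) optimize the countermeasure network first; detach() leaves G out
        fake = G(blurred).detach()
        loss_d = d_loss_fn(D, reference, fake)
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
        # 2) then optimize the generation network; D's parameters get no step
        target = G(blurred)
        loss_g = g_loss_fn(D, target, reference)
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```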
In training the GAN, the following loss function may be used to train the generative confrontation network: L = L_adv + λ·L_cont, where L is the loss function, L_adv is the countermeasure loss function, L_cont is the conditional loss function, and λ is the weight of the conditional loss function with a value of 0-1. The countermeasure loss function is

L_adv = E_{x̃~p_g}[D(x̃)] − E_{x~p_r}[D(x)] + α·E_{x̂~p_x̂}[(‖∇_{x̂} D(x̂)‖₂ − 1)²]

where x~p_r is the statistical distribution of the reference image, x̃~p_g is the statistical distribution of the target image, x̂~p_x̂ is the mixed distribution of the reference image and the target image, D is the countermeasure network, x, x̃ and x̂ are respectively the reference image, the target image and a weighted sum of the reference image and the target image, E(·) denotes expectation, and α is the parameter of the regular term with a value of 0-1.
Based on the loss function, the optimization formula of the countermeasure network is:

min_D E_{x̃~p_g}[D(x̃)] − E_{x~p_r}[D(x)] + α·E_{x̂~p_x̂}[(‖∇_{x̂} D(x̂)‖₂ − 1)²]

That is, the minimum of the sum of three terms is solved. The first term is E_{x̃~p_g}[D(x̃)]; the second term is −E_{x~p_r}[D(x)]; the third term is α·E_{x̂~p_x̂}[(‖∇_{x̂} D(x̂)‖₂ − 1)²], which is a gradient regularization term.
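A sketch of this countermeasure-network objective, including the gradient regularization term; the text gives α a value in 0-1, and the default used here is an assumption within that range:

```python
import torch

def d_loss_fn(D, real, fake, alpha=1.0):
    # E[D(fake)] - E[D(real)] + alpha * E[(||grad_xhat D(xhat)||_2 - 1)^2]
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)  # weighted sum
    grad = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)[0]
    penalty = ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return D(fake).mean() - D(real).mean() + alpha * penalty
```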
Based on the loss function, the optimization formula of the generation network is:

min_G −E_{x̃~p_g}[D(x̃)] + λ·L_cont

That is, the minimum of the sum of two terms is solved. The first term is −E_{x̃~p_g}[D(x̃)]; the second term is λ·L_cont. Here G is the generation network.
As an alternative embodiment, the conditional loss function is

L_cont = (1 / (c·w·h))·‖x − x̃‖_b^b

where c, w and h are respectively the number of channels, the width and the height of the target image, G is the generation network, and b takes a value of 1 or 2. In this embodiment, the conditional loss function constrains the distribution of the target image to be close to the distribution of the reference image, and the model is constrained at the pixel level, so that noise disturbance can be suppressed; ‖x − x̃‖_b is a pixel-level constraint.
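A matching sketch of the generation-network objective with this pixel-level conditional term; the text gives λ a value in 0-1, and the default used here is an assumption within that range:

```python
def cont_loss(target, reference, b=1):
    # L_cont: mean per-pixel L_b distance, i.e. ||x - x_tilde||_b^b / (c*w*h),
    # averaged over the batch as well
    return (target - reference).abs().pow(b).mean()

def g_loss_fn(D, target, reference, lam=0.5):
    # -E[D(G(y))] + lambda * L_cont
    return -D(target).mean() + lam * cont_loss(target, reference)
```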
After the generation network and the discrimination network are trained by the above method, the image finally output by the generation network can no longer be reliably distinguished by the countermeasure network, i.e., the discrimination network, and the final output value is about 0.5. That is, when the output value of the countermeasure network is about 0.5, the generation network is optimal, i.e., fully trained, and the trained generation network can then be used to restore motion-blurred images.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
Third embodiment
The present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a computer, the computer program performs the steps of the method described in the second embodiment. For specific implementation, reference may be made to the method embodiment, which is not described herein again.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk, a hard disk, or the like, and when the program code on the storage medium is executed, the image processing method shown in the above-described embodiment can be executed.
In the embodiments provided in the present application, it should be understood that the disclosed method can be implemented in other ways. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a notebook computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (31)

1. An image processing method is characterized by being applied to a generation type countermeasure network GAN, wherein the GAN comprises a generation network, and the generation network comprises an initial feature extraction layer, M downsampling feature extraction layers, a global feature extraction layer, M upsampling feature extraction layers and a residual feature extraction layer, wherein the M downsampling feature extraction layers, the global feature extraction layer, the M upsampling feature extraction layers and the residual feature extraction layer are sequentially connected, and M is an integer greater than or equal to 1; the method comprises the following steps:
performing first processing on an image to be processed through the initial feature extraction layer to obtain initial features, wherein the first processing comprises initial feature extraction processing;
performing second processing on input features input to the down-sampling feature extraction layers by each down-sampling feature extraction layer to obtain shallow features, depth features and down-sampling features, wherein the second processing comprises shallow feature extraction processing, depth feature extraction processing and down-sampling feature extraction processing, the input feature of the first down-sampling feature extraction layer connected with the initial feature extraction layer is the initial feature, and the input features of the second to Mth down-sampling feature extraction layers are the down-sampling features output by the previous down-sampling feature extraction layer;
performing third processing on the downsampled features output by the Mth downsampled feature extraction layer through the global feature extraction layer to obtain global features, wherein the third processing comprises global feature extraction processing;
performing fourth processing on the input features input to the upsampling feature extraction layer through each upsampling feature extraction layer to obtain upsampling features, wherein the fourth processing comprises upsampling feature extraction processing, and the input features of the first upsampling feature extraction layer connected with the global feature extraction layer comprise: a first direct connection feature and a first cross connection feature, wherein the first direct connection feature is a downsampling feature and a depth feature output by the mth downsampling feature extraction layer and a first series feature of the global feature, and the first cross connection feature is a shallow feature output by the mth downsampling feature extraction layer and a depth feature output by the M-1 downsampling feature extraction layer; the input features of the jth upsampling feature extraction layer include: a jth direct connection feature and a jth cross connection feature, wherein the jth direct connection feature is an up-sampling feature output by a previous up-sampling feature extraction layer, and the jth cross connection feature is a shallow feature output by an M-j +1 th down-sampling feature extraction layer and a depth feature output by an M-j th down-sampling feature extraction layer; the input features of the mth up-sampling feature extraction layer include: the M-th direct connection feature is an up-sampling feature output by the M-1 th up-sampling feature extraction layer, the M-th cross-connection feature is a shallow feature output by the first down-sampling feature extraction layer, and j is sequentially from 2 to M-1;
performing fifth processing on the upsampled features output by the Mth upsampled feature extraction layer through the residual feature extraction layer to obtain residual features, wherein the fifth processing comprises residual feature extraction processing;
and adding the residual error characteristics and the image to be processed to obtain a target image.
2. The method according to claim 1, wherein performing, by each downsampling feature extraction layer, the second processing on the input features input thereto to obtain the shallow features, the depth features and the downsampling features comprises:
sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on input features input to the down-sampling feature extraction layers to obtain first basic features;
sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain the shallow feature;
sequentially carrying out three times of residual error processing on the first basic characteristic to obtain a second basic characteristic;
sequentially carrying out third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain a conversion feature, carrying out pooling processing on the second basic feature, and connecting the pooled feature and the conversion feature in series to obtain the downsampling feature;
and sequentially performing fourth convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature to obtain the depth feature.
3. The method according to claim 2, wherein the performing residual processing three times on the first basic feature sequentially to obtain a second basic feature comprises:
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature;
connecting the third basic feature with the first basic feature in series to obtain a second serial feature;
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the second series connection characteristic to obtain a fourth basic characteristic;
connecting the second series characteristic with the fourth basic characteristic in series to obtain a third series characteristic;
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the third series connection characteristic to obtain a fifth basic characteristic;
and connecting the fifth basic feature and the third series feature in series to obtain the second basic feature.
4. The method according to claim 1, wherein the fourth processing is performed on the input features input to the upsampling feature extraction layer through each upsampling feature extraction layer to obtain the upsampling features, and the method comprises the following steps:
sequentially performing deconvolution processing, normalization processing and first nonlinear conversion processing on the directly connected features input to the up-sampling feature extraction layers to obtain sixth basic features;
connecting the sixth basic feature with the cross-connection feature input to the sixth basic feature in series to obtain a fourth series feature;
sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on the fourth serial connection characteristic to obtain a seventh basic characteristic;
and sequentially performing second convolution processing, normalization processing and first nonlinear conversion processing on the seventh basic feature to obtain the up-sampling feature.
5. The method of claim 4, wherein the deconvolution process comprises:
and performing deconvolution processing through a deconvolution layer with the size of a third size and the step size of a second step size.
6. The method of claim 2, 3 or 4, wherein the first convolution process comprises:
convolution processing is performed by convolutional layers having a first size and a first step size.
7. The method of claim 2 or 4, wherein the second convolution process comprises:
the convolution process is performed by the convolution layer having the second size and the first step size.
8. The method of claim 2, wherein the third convolution process comprises:
convolution processing is performed by convolutional layers with the size of the third size and the step size of the second step size.
9. The method of claim 2, wherein the fourth convolution process comprises:
convolution processing is performed by convolutional layers having a first size and a second step size.
10. The method according to claim 1, wherein the first processing of the image to be processed by the initial feature extraction layer comprises:
and sequentially performing fifth convolution processing, normalization processing and first nonlinear conversion processing on the image to be processed through the initial feature extraction layer.
11. The method according to claim 1, wherein the fifth processing, by the residual feature extraction layer, the upsampled features output by the mth upsampled feature extraction layer, includes:
and sequentially performing fifth convolution processing and second nonlinear conversion processing on the up-sampling features output by the Mth up-sampling feature extraction layer through the residual error feature extraction layer.
12. The method of claim 10 or 11, wherein the fifth convolution process comprises:
convolution processing is performed by convolutional layers of a fourth size and a first step size.
13. The method according to claim 1, wherein the third processing, by the global feature extraction layer, the down-sampled feature output by the M-th down-sampled feature extraction layer to obtain a global feature comprises:
performing full convolution on the downsampled features output by the Mth downsampled feature extraction layer through the global feature extraction layer to obtain full convolution features;
and sequentially carrying out deconvolution processing, normalization processing and first nonlinear conversion processing on the full convolution features to obtain the global features.
14. The method of claim 1, wherein the GAN further comprises: a countermeasure network serially connected behind the generator network, the GAN being trained by:
during training, respectively inputting the target image and the reference image output by the generation network into the countermeasure network;
training the confrontation network and the generation network by using a single alternating iteration optimization method until the iteration is finished, wherein the confrontation network is optimized before the generation network, and the loss function used in the training process is: L = L_adv + λ·L_cont, wherein L is the loss function, L_adv is the countermeasure loss function, L_cont is the conditional loss function, λ is the weight of the conditional loss function with a value of 0-1, and

L_adv = E_{x̃~p_g}[D(x̃)] − E_{x~p_r}[D(x)] + α·E_{x̂~p_x̂}[(‖∇_{x̂} D(x̂)‖₂ − 1)²]

wherein x~p_r is the statistical distribution of the reference image, x̃~p_g is the statistical distribution of the target image, x̂~p_x̂ is the mixed distribution of the reference image and the target image, D is the countermeasure network, x, x̃ and x̂ are respectively the reference image, the target image and a weighted sum of the reference image and the target image, E(·) is an expectation, and α is a parameter of the regular term with a value of 0-1.
15. The method of claim 14, wherein the optimization formula for the countermeasure network is as follows:

min_D E_{x̃~p_g}[D(x̃)] − E_{x~p_r}[D(x)] + α·E_{x̂~p_x̂}[(‖∇_{x̂} D(x̂)‖₂ − 1)²]

16. The method of claim 15, wherein the optimization formula for the generated network is as follows:

min_G −E_{x̃~p_g}[D(x̃)] + λ·L_cont

wherein the conditional loss function is

L_cont = (1 / (c·w·h))·‖x − x̃‖_b^b

wherein c, w and h are respectively the number of channels, the width and the height of the target image, G is the generation network, and b takes a value of 1 or 2.
17. A generative confrontation network system comprising a generative network, the generative network comprising: the device comprises an initial feature extraction layer, M downsampling feature extraction layers, a global feature extraction layer, M upsampling feature extraction layers and a residual feature extraction layer, wherein the M downsampling feature extraction layers, the global feature extraction layer, the M upsampling feature extraction layers and the residual feature extraction layer are sequentially connected, and M is an integer greater than or equal to 1;
the initial feature extraction layer is used for performing first processing on an image to be processed to obtain initial features, wherein the first processing comprises initial feature extraction processing;
each downsampling feature extraction layer is used for carrying out second processing on input features input to the downsampling feature extraction layer to obtain shallow features, depth features and downsampling features, wherein the second processing comprises shallow feature extraction processing, depth feature extraction processing and downsampling feature extraction processing, the input feature of the first downsampling feature extraction layer connected with the initial feature extraction layer is the initial feature, and the input features of the second to Mth downsampling feature extraction layers are the downsampling features output by the previous downsampling feature extraction layer;
the global feature extraction layer is configured to perform third processing on the downsampling features output by the Mth downsampling feature extraction layer to obtain global features, wherein the third processing includes global feature extraction processing;
each upsampling feature extraction layer is configured to perform fourth processing on the input features input to the upsampling feature extraction layer to obtain an upsampling feature, wherein the fourth processing includes upsampling feature extraction processing, and the input features of the first upsampling feature extraction layer connected to the global feature extraction layer include a first direct connection feature and a first cross connection feature, wherein the first direct connection feature is a first series feature of the downsampling feature and the depth feature output by the Mth downsampling feature extraction layer together with the global feature, and the first cross connection feature is the shallow feature output by the Mth downsampling feature extraction layer and the depth feature output by the (M-1)th downsampling feature extraction layer; the input features of the jth upsampling feature extraction layer include a jth direct connection feature and a jth cross connection feature, wherein the jth direct connection feature is the upsampling feature output by the previous upsampling feature extraction layer, and the jth cross connection feature is the shallow feature output by the (M-j+1)th downsampling feature extraction layer and the depth feature output by the (M-j)th downsampling feature extraction layer; the input features of the Mth upsampling feature extraction layer include an Mth direct connection feature and an Mth cross connection feature, wherein the Mth direct connection feature is the upsampling feature output by the (M-1)th upsampling feature extraction layer, the Mth cross connection feature is the shallow feature output by the first downsampling feature extraction layer, and j runs from 2 to M-1;
the residual feature extraction layer is configured to perform fifth processing on the upsampling feature output by the Mth upsampling feature extraction layer to obtain a residual feature, so as to add the residual feature and the image to be processed to obtain a target image, wherein the fifth processing includes residual feature extraction processing.
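To make the wiring of claim 17 concrete, the following is a minimal PyTorch skeleton with M = 2, using simplified convolution blocks in place of the detailed units of claims 18 to 20; the channel counts, kernel sizes, InstanceNorm/LeakyReLU choices and the stand-in global layer are assumptions, not the patented design. The depth and downsampling features leave each downsampling layer at half the input resolution, which is what makes the direct and cross connection features concatenable.

    import torch
    import torch.nn as nn

    def conv_block(c_in, c_out, stride=1):
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride, 1),
            nn.InstanceNorm2d(c_out),
            nn.LeakyReLU(0.2),
        )

    class DownLayer(nn.Module):
        """Second processing: shallow features at the input resolution,
        depth and downsampling features at half resolution (stride 2)."""
        def __init__(self, c_in, c):
            super().__init__()
            self.base = conv_block(c_in, c)
            self.shallow = conv_block(c, c)
            self.depth = conv_block(c, c, stride=2)
            self.down = conv_block(c, c, stride=2)
        def forward(self, x):
            b = self.base(x)
            return self.shallow(b), self.depth(b), self.down(b)

    class UpLayer(nn.Module):
        """Fourth processing: deconvolve the direct connection features,
        then fuse the cross connection features (a U-Net-style skip)."""
        def __init__(self, c_direct, c_cross, c_out):
            super().__init__()
            self.up = nn.Sequential(
                nn.ConvTranspose2d(c_direct, c_out, 4, 2, 1),
                nn.InstanceNorm2d(c_out),
                nn.LeakyReLU(0.2),
            )
            self.fuse = conv_block(c_out + c_cross, c_out)
        def forward(self, direct, cross):
            return self.fuse(torch.cat([self.up(direct), cross], 1))

    class Generator(nn.Module):
        def __init__(self, c=32):
            super().__init__()
            self.init_layer = conv_block(3, c)     # initial feature extraction
            self.down1 = DownLayer(c, c)
            self.down2 = DownLayer(c, c)
            self.global_layer = conv_block(c, c)   # stand-in for global features
            self.up1 = UpLayer(3 * c, 2 * c, c)    # direct: [y2, d2, g]; cross: [s2, d1]
            self.up2 = UpLayer(c, c, c)            # direct: u1; cross: s1
            self.res = nn.Conv2d(c, 3, 3, 1, 1)    # residual feature extraction
        def forward(self, x):                      # H and W divisible by 4
            f0 = self.init_layer(x)
            s1, d1, y1 = self.down1(f0)            # s1 full res; d1, y1 half res
            s2, d2, y2 = self.down2(y1)            # s2 half res; d2, y2 quarter res
            g = self.global_layer(y2)
            u1 = self.up1(torch.cat([y2, d2, g], 1),  # first direct connection
                          torch.cat([s2, d1], 1))     # first cross connection
            u2 = self.up2(u1, s1)                     # Mth cross connection
            return x + self.res(u2)                   # residual + image to be processed

For an input x of shape (1, 3, 64, 64), Generator()(x) returns a tensor of the same shape, since the residual features are added back to the image to be processed.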
18. The generative countermeasure network system of claim 17, wherein each of the downsampling feature extraction layers comprises: a first feature extraction unit, a second feature extraction unit, a third feature extraction unit, a fourth feature extraction unit and a fifth feature extraction unit;
the first feature extraction unit is used for sequentially performing first convolution processing, normalization processing and first nonlinear conversion processing on input features input to the first feature extraction unit to obtain first basic features;
the second feature extraction unit is configured to perform second convolution processing, normalization processing, and first nonlinear conversion processing on the first basic feature in sequence to obtain the shallow feature;
the fifth feature extraction unit is configured to sequentially perform residual processing on the first basic feature three times to obtain a second basic feature;
the third feature extraction unit is configured to perform third convolution processing, normalization processing and first nonlinear conversion processing on the second basic feature in sequence to obtain a conversion feature, perform pooling processing on the second basic feature, and connect the pooled feature and the conversion feature in series to obtain the downsampling feature;
and the fourth feature extraction unit is configured to perform fourth convolution processing, normalization processing, and first nonlinear conversion processing on the second basic feature in sequence to obtain the depth feature.
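The five units of claim 18 can be sketched more closely as below; the claims name the kernel and step sizes only abstractly (first/second/third sizes), so the 3x3 and 1x1 kernels, stride 2 and average pooling are assumptions, and the fifth unit is reduced to a stand-in whose dense residual form is sketched after claim 19.

    import torch
    import torch.nn as nn

    def cnl(c_in, c_out, k, s):
        """Convolution -> normalization -> first nonlinear conversion."""
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, k, s, k // 2),
            nn.InstanceNorm2d(c_out),
            nn.LeakyReLU(0.2),
        )

    def residual_unit(c):
        """Stand-in for the fifth feature extraction unit (see claim 19)."""
        return cnl(c, c, 3, 1)

    class DownFeatureLayer(nn.Module):
        def __init__(self, c_in, c):
            super().__init__()
            self.unit1 = cnl(c_in, c, 3, 1)  # first convolution -> first basic features
            self.unit2 = cnl(c, c, 1, 1)     # second convolution -> shallow features
            self.unit5 = residual_unit(c)    # -> second basic features
            self.unit3 = cnl(c, c, 3, 2)     # third convolution -> conversion features
            self.pool = nn.AvgPool2d(2)      # pooling branch of the third unit
            self.unit4 = cnl(c, c, 3, 2)     # fourth convolution -> depth features
        def forward(self, x):
            b1 = self.unit1(x)
            shallow = self.unit2(b1)
            b2 = self.unit5(b1)
            # downsampling features: pooled features connected in series
            # with the conversion features
            down = torch.cat([self.pool(b2), self.unit3(b2)], 1)
            depth = self.unit4(b2)
            return shallow, depth, down

Note that the series connection doubles the channel count of the downsampling features, so any following layer has to be declared with the wider input.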
19. The generative countermeasure network system according to claim 18, wherein the fifth feature extraction unit is specifically configured to:
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the first basic feature to obtain a third basic feature;
connecting the third basic feature with the first basic feature in series to obtain a second series feature;
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the second series feature to obtain a fourth basic feature;
connecting the second series feature with the fourth basic feature in series to obtain a third series feature;
sequentially performing first convolution processing, normalization processing, first nonlinear conversion processing, first convolution processing, normalization processing and first nonlinear conversion processing on the third series feature to obtain a fifth basic feature;
and connecting the fifth basic feature and the third series feature in series to obtain the second basic feature.
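The fifth feature extraction unit of claim 19 reads as a small dense block: each double convolution-normalization-nonlinearity step is followed by a series (channel) concatenation with its own input. A sketch, with channel widths chosen so the concatenations line up (an assumption):

    import torch
    import torch.nn as nn

    def double_cnl(c_in, c_out):
        """Two rounds of first convolution -> normalization -> first nonlinear conversion."""
        return nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, 1, 1), nn.InstanceNorm2d(c_out), nn.LeakyReLU(0.2),
            nn.Conv2d(c_out, c_out, 3, 1, 1), nn.InstanceNorm2d(c_out), nn.LeakyReLU(0.2),
        )

    class FifthFeatureUnit(nn.Module):
        def __init__(self, c):
            super().__init__()
            self.step1 = double_cnl(c, c)        # -> third basic features
            self.step2 = double_cnl(2 * c, c)    # -> fourth basic features
            self.step3 = double_cnl(3 * c, c)    # -> fifth basic features
        def forward(self, b1):
            b3 = self.step1(b1)
            s2 = torch.cat([b3, b1], 1)          # second series features (2c channels)
            b4 = self.step2(s2)
            s3 = torch.cat([s2, b4], 1)          # third series features (3c channels)
            b5 = self.step3(s3)
            return torch.cat([b5, s3], 1)        # second basic features (4c channels)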
20. The generative countermeasure network system of claim 17, wherein each of the upsampling feature extraction layers comprises: a sixth feature extraction unit, a first feature extraction unit and a second feature extraction unit;
the sixth feature extraction unit is configured to sequentially perform deconvolution processing, normalization processing and first nonlinear conversion processing on the direct connection feature input to the sixth feature extraction unit to obtain a sixth basic feature, and to connect the sixth basic feature in series with the cross connection feature input to the sixth feature extraction unit to obtain a fourth series feature;
the first feature extraction unit is configured to sequentially perform first convolution processing, normalization processing and first nonlinear conversion processing on the fourth series feature to obtain a seventh basic feature;
and the second feature extraction unit is configured to perform second convolution processing, normalization processing, and first nonlinear conversion processing on the seventh basic feature in sequence to obtain the upsampling feature.
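One claim-20 upsampling feature extraction layer can be sketched as below; taking claim 21's third size and second step size as kernel 4 and stride 2 for the deconvolution is an assumption, as are the other kernel sizes.

    import torch
    import torch.nn as nn

    class UpFeatureLayer(nn.Module):
        def __init__(self, c_direct, c_cross, c_out):
            super().__init__()
            # sixth unit: deconvolution -> normalization -> first nonlinear conversion
            self.unit6 = nn.Sequential(
                nn.ConvTranspose2d(c_direct, c_out, 4, 2, 1),
                nn.InstanceNorm2d(c_out),
                nn.LeakyReLU(0.2),
            )
            # first unit: first convolution processing on the series features
            self.unit1 = nn.Sequential(
                nn.Conv2d(c_out + c_cross, c_out, 3, 1, 1),
                nn.InstanceNorm2d(c_out),
                nn.LeakyReLU(0.2),
            )
            # second unit: second convolution processing -> upsampling features
            self.unit2 = nn.Sequential(
                nn.Conv2d(c_out, c_out, 1, 1, 0),
                nn.InstanceNorm2d(c_out),
                nn.LeakyReLU(0.2),
            )
        def forward(self, direct, cross):
            b6 = self.unit6(direct)             # sixth basic features
            s4 = torch.cat([b6, cross], 1)      # fourth series features
            return self.unit2(self.unit1(s4))   # upsampling features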
21. The generative countermeasure network system of claim 20, wherein the deconvolution processing comprises: performing deconvolution processing through a deconvolution layer with a third size and a second step size.
22. A generative countermeasure network system according to claim 18, 19 or 20, wherein the first convolution processing comprises:
convolution processing is performed by a convolutional layer having a first size and a first step size.
23. The generative countermeasure network system according to claim 18 or 20, wherein the second convolution processing comprises:
convolution processing is performed by a convolutional layer having the second size and the first step size.
24. The generative countermeasure network system of claim 18, wherein the third convolution processing comprises:
convolution processing is performed by a convolutional layer having the third size and the second step size.
25. The generative countermeasure network system of claim 18, wherein the fourth convolution processing comprises:
convolution processing is performed by a convolutional layer having a first size and a second step size.
26. The generative countermeasure network system according to claim 17, wherein the initial feature extraction layer is specifically configured to sequentially perform fifth convolution processing, normalization processing and first nonlinear conversion processing on the image to be processed.
27. The generative countermeasure network system according to claim 17, wherein the residual feature extraction layer is specifically configured to sequentially perform fifth convolution processing and second nonlinear conversion processing on the upsampling feature output by the Mth upsampling feature extraction layer.
28. The generative countermeasure network system according to claim 26 or 27, wherein the fifth convolution processing comprises:
convolution processing is performed by a convolutional layer having a fourth size and a first step size.
29. The generative countermeasure network system according to claim 17, wherein the global feature extraction layer is specifically configured to perform full convolution processing on the downsampling features output by the Mth downsampling feature extraction layer to obtain full convolution features;
and sequentially carrying out deconvolution processing, normalization processing and first nonlinear conversion processing on the full convolution features to obtain the global features.
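Claim 29 describes the global feature extraction layer as a full convolution that collapses the spatial map into a global descriptor, followed by a deconvolution, normalization and first nonlinear conversion that re-expand it. A sketch, where the 8x8 bottleneck resolution is an assumed value:

    import torch
    import torch.nn as nn

    class GlobalFeatureLayer(nn.Module):
        def __init__(self, c, spatial=8):
            super().__init__()
            # full convolution: the kernel spans the whole feature map,
            # yielding a 1x1 global descriptor per channel
            self.full_conv = nn.Conv2d(c, c, kernel_size=spatial)
            # deconvolution -> normalization -> nonlinearity re-expands it
            self.expand = nn.Sequential(
                nn.ConvTranspose2d(c, c, kernel_size=spatial),
                nn.InstanceNorm2d(c),
                nn.LeakyReLU(0.2),
            )
        def forward(self, x):  # x: (B, c, spatial, spatial)
            return self.expand(self.full_conv(x))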
30. An electronic device, comprising: a memory and a processor connected to each other;
the memory is configured to store a program; the processor is configured to invoke the program stored in the memory to perform the method of any one of claims 1-16.
31. A storage medium, characterized in that the storage medium comprises a computer program which, when executed by a computer, performs the method according to any one of claims 1-16.
CN201811529115.4A 2018-12-13 2018-12-13 Image processing method, generative countermeasure network system and electronic device Active CN110782398B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811529115.4A CN110782398B (en) 2018-12-13 2018-12-13 Image processing method, generative countermeasure network system and electronic device

Publications (2)

Publication Number Publication Date
CN110782398A CN110782398A (en) 2020-02-11
CN110782398B (en) 2020-12-18

Family

ID=69382904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811529115.4A Active CN110782398B (en) 2018-12-13 2018-12-13 Image processing method, generative countermeasure network system and electronic device

Country Status (1)

Country Link
CN (1) CN110782398B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340138B (en) * 2020-03-27 2023-12-29 北京邮电大学 Image classification method, device, electronic equipment and storage medium
CN112163449B (en) * 2020-08-21 2022-12-16 同济大学 Lightweight multi-branch feature cross-layer fusion image semantic segmentation method
CN112258487A (en) * 2020-10-29 2021-01-22 德鲁动力科技(海南)有限公司 Image detection system and method

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10586310B2 (en) * 2017-04-06 2020-03-10 Pixar Denoising Monte Carlo renderings using generative adversarial neural networks
CN107133601B * 2017-05-13 2021-03-23 五邑大学 Pedestrian re-identification method based on generative adversarial network image super-resolution technology
CN108010031B (en) * 2017-12-15 2020-12-04 厦门美图之家科技有限公司 Portrait segmentation method and mobile terminal
CN108492258B (en) * 2018-01-17 2021-12-07 天津大学 Radar image denoising method based on generation countermeasure network
CN107958246A * 2018-01-17 2018-04-24 深圳市唯特视科技有限公司 Image alignment method based on a new end-to-end face super-resolution network
CN108460408B (en) * 2018-02-05 2020-04-07 西安电子科技大学 Polarization SAR image classification method based on residual learning and conditional GAN
CN108416752B (en) * 2018-03-12 2021-09-07 中山大学 Method for removing motion blur of image based on generation type countermeasure network
CN108492265A * 2018-03-16 2018-09-04 西安电子科技大学 Joint demosaicing and denoising method for CFA images based on GAN
CN108960140B (en) * 2018-07-04 2021-04-27 国家新闻出版广电总局广播科学研究院 Pedestrian re-identification method based on multi-region feature extraction and fusion

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934765A * 2017-03-14 2017-07-07 长沙全度影像科技有限公司 Panoramic image fusion method based on deep convolutional neural networks and depth information
CN107247949A (en) * 2017-08-02 2017-10-13 北京智慧眼科技股份有限公司 Face identification method, device and electronic equipment based on deep learning
CN108564109A * 2018-03-21 2018-09-21 天津大学 Remote sensing image target detection method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Low-rank image generation method based on generative adversarial networks; Zhao Shuyang et al.; Acta Automatica Sinica; 2018-05-31; pp. 829-839 *

Also Published As

Publication number Publication date
CN110782398A (en) 2020-02-11

Similar Documents

Publication Publication Date Title
CN110782397B (en) Image processing method, generation type countermeasure network, electronic equipment and storage medium
Chu et al. Improving image restoration by revisiting global information aggregation
CN109064428B (en) Image denoising processing method, terminal device and computer readable storage medium
CN110782398B (en) Image processing method, generative countermeasure network system and electronic device
CN110766632A (en) Image denoising method based on channel attention mechanism and characteristic pyramid
Zuo et al. Convolutional neural networks for image denoising and restoration
CN112997479B (en) Method, system and computer readable medium for processing images across a phase jump connection
KR20200132682A (en) Image optimization method, apparatus, device and storage medium
CN111932480A (en) Deblurred video recovery method and device, terminal equipment and storage medium
CN113939845A (en) Method, system and computer readable medium for improving image color quality
CN109993701B (en) Depth map super-resolution reconstruction method based on pyramid structure
CN116681584A (en) Multistage diffusion image super-resolution algorithm
Huang et al. Learning deformable and attentive network for image restoration
CN111325671B (en) Network training method and device, image processing method and electronic equipment
Zheng et al. Unfolded deep kernel estimation for blind image super-resolution
Xu et al. A slimmer and deeper approach to deep network structures for low‐level vision tasks
CN113096032A (en) Non-uniform blur removing method based on image area division
CN116757962A (en) Image denoising method and device
CN108898557B (en) Image restoration method and apparatus, electronic device, computer program, and storage medium
CN112801883A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN116071279A (en) Image processing method, device, computer equipment and storage medium
Khan et al. Multi‐scale GAN with residual image learning for removing heterogeneous blur
Yu et al. Real‐time recovery and recognition of motion blurry QR code image based on fractional order deblurring method
CN109741264B (en) Image over-representation method and device, electronic equipment and readable storage medium
CN114511702A (en) Remote sensing image segmentation method and system based on multi-scale weighted attention

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant