CN112488923A - Image super-resolution reconstruction method and device, storage medium and electronic equipment

Image super-resolution reconstruction method and device, storage medium and electronic equipment

Info

Publication number
CN112488923A
Authority
CN
China
Prior art keywords
image
result
residual
upsampling
module
Prior art date
Legal status
Pending
Application number
CN202011455129.3A
Other languages
Chinese (zh)
Inventor
朱圣晨
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202011455129.3A
Publication of CN112488923A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Abstract

The embodiment of the application provides an image super-resolution reconstruction method and device, a storage medium and electronic equipment. In the method, a first image that needs super-resolution reconstruction is acquired, a pre-trained image super-resolution reconstruction network is acquired, and the first image is processed by a feature extraction module, a residual error module, an up-sampling module and a reconstruction module in the image super-resolution reconstruction network to obtain a second image whose resolution is greater than that of the first image. The image super-resolution reconstruction network provided by the embodiment of the application uses the up-sampling module to sample the image data at multiple scales, which effectively improves the prediction precision and thereby the accuracy of image super-resolution reconstruction.

Description

Image super-resolution reconstruction method and device, storage medium and electronic equipment
Technical Field
The application relates to the technical field of computers, in particular to an image super-resolution reconstruction method, an image super-resolution reconstruction device, a storage medium and electronic equipment.
Background
Super-resolution image reconstruction refers to reconstructing a high-resolution image from a low-resolution image. Common image super-resolution reconstruction methods include statistical-feature-based and deep-learning-based methods. However, these common methods easily lose details during reconstruction, the resulting super-resolution reconstructed image is of poor quality, and the accuracy of image super-resolution reconstruction is affected.
Disclosure of Invention
The embodiment of the application provides a model training method, an image reconstruction method, a related device, a storage medium and electronic equipment, which can improve the accuracy of image super-resolution reconstruction.
In a first aspect, an embodiment of the present application provides an image super-resolution reconstruction method, including:
acquiring a first image needing super-resolution reconstruction;
acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module;
performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result;
performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result;
performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result;
fusing the first up-sampling result and the second up-sampling result to obtain a fused result;
and performing super-resolution reconstruction on the fusion result through a reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
In a second aspect, an embodiment of the present application provides an image super-resolution reconstruction apparatus, including:
the acquisition module is used for acquiring a first image needing super-resolution reconstruction;
the network acquisition module is used for acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module;
the characteristic extraction module is used for extracting the characteristics of the first image through the characteristic extraction module to obtain a characteristic extraction result;
the residual error processing module is used for carrying out residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result;
the sampling processing module is used for performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result;
the fusion module is used for fusing the first up-sampling result and the second up-sampling result to obtain a fusion result;
and the reconstruction module is used for performing super-resolution reconstruction on the fusion result through the reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
In a third aspect, an embodiment of the present application provides a storage medium having a computer program stored thereon, where the computer program, when run on a computer, causes the computer to execute the image super-resolution reconstruction method provided in any embodiment of the present application.
In a fourth aspect, an electronic device provided in an embodiment of the present application includes a processor and a memory, where the memory has a computer program, and the processor is configured to execute the image super-resolution reconstruction method provided in any embodiment of the present application through the computer program.
The method includes the steps that a first image needing super-resolution reconstruction is obtained; acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module; performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result; performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result; performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result; fusing the first up-sampling result and the second up-sampling result to obtain a fused result; and performing super-resolution reconstruction on the fusion result through a reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image. The image super-resolution reconstruction network provided by the embodiment of the application adopts the up-sampling module to sample the image data from multiple scales, so that the prediction precision is effectively improved, and the accuracy of the image super-resolution reconstruction is further effectively improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a schematic structural diagram of an image super-resolution reconstruction network provided in an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a feature extraction module provided in an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a reconstruction module according to an embodiment of the present application.
Fig. 4 is a detailed structural schematic diagram of an image super-resolution reconstruction network provided in an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a residual unit provided in an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a residual weight unit provided in an embodiment of the present application.
Fig. 7 is a schematic structural diagram of a residual convolution unit according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of an upsampling unit provided in an embodiment of the present application.
Fig. 9 is a schematic structural diagram of an upsampling subunit provided in the embodiment of the present application.
Fig. 10 is a schematic structural diagram of a splicing subunit provided in an embodiment of the present application.
Fig. 11 is a flowchart illustrating an image super-resolution reconstruction method according to an embodiment of the present application.
Fig. 12 is a flowchart illustrating a training method of an image super-resolution reconstruction network according to an embodiment of the present application.
Fig. 13 is a block schematic diagram of an image super-resolution reconstruction apparatus provided in an embodiment of the present application.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Fig. 15 is another schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
The embodiment of the application provides an image super-resolution reconstruction method, and an execution subject of the image super-resolution reconstruction method can be the image super-resolution reconstruction device provided by the embodiment of the application or an electronic device integrated with the image super-resolution reconstruction device, wherein the image super-resolution reconstruction device can be realized in a hardware or software manner. The electronic device may be a computer device, which may be a terminal device such as a smart phone, a tablet computer, a personal computer, or a server. When the computer device is used for executing the image processing method, the computer device comprises a pre-trained image super-resolution reconstruction network. The following is a detailed description of the analysis.
Referring to the drawings, fig. 1 is a schematic structural diagram of an image super-resolution reconstruction network provided in an embodiment of the present application, where the image super-resolution reconstruction network includes a feature extraction module, a residual error module, an upsampling module, and a reconstruction module.
The feature extraction module is used for inputting images and extracting features of the input images. As shown in fig. 2, the feature extraction module may include an input unit and a convolution unit. The input unit is configured to input a first image, where the first image may be an image with a resolution lower than a preset threshold, and for example, the first image may be a low-resolution image. The convolution unit is used for performing convolution processing on the input image so as to extract characteristic information in the image.
The reconstruction module is used for performing super-resolution reconstruction on the fusion result to obtain a high-resolution reconstructed image; specifically, it performs convolution processing and outputs an image. As shown in fig. 3, the reconstruction module may include two convolution units and one output unit. When the fusion result is input, it is convolved by the two convolution units to obtain a reconstructed image of the input image. The reconstructed image is then output by the output unit.
The residual module includes a plurality of residual units (e.g., ARRU, a weight-adaptive residual-in-residual unit). Referring to fig. 4, fig. 4 is a detailed structure diagram of the image super-resolution reconstruction network according to the embodiment of the present application, and a schematic structural diagram of the residual module is shown in its dashed frame. The residual units are connected in series, and the output result of each residual unit is used as the input data of the next adjacent residual unit. As shown in fig. 4, the input data of residual unit 1 is the output result of the feature extraction module, and the output data of residual unit 1 is used as the input data of residual unit 2, and so on until residual unit n outputs the residual processing result.
As shown in fig. 5, fig. 5 is a schematic diagram of the composition structure of the residual unit. Each residual unit includes a plurality of residual weight units (e.g., AWRB, Adaptive Weight Residual Block).
In some embodiments, the residual unit may include 4 residual weight units. As shown in fig. 5, the input data and the output data of residual weight unit 1 serve as the input data of residual weight unit 2, and so on. When the output data of residual weight unit 4 is obtained, it is multiplied by a preset coefficient (in_scale) to apply a weight; similarly, the input data of residual weight unit 1 is multiplied by another preset coefficient (res_scale). The two preset coefficients are updated as training progresses, which effectively improves the accuracy of super-resolution reconstruction. The two products are then spliced to obtain the output data of the residual unit.
As shown in fig. 6, fig. 6 is a schematic diagram of a composition structure of the residual weight unit. The residual weight unit includes a plurality of residual convolution units (e.g., AWCB, adaptive weight convolution block) and a convolution unit. And the residual convolution units are connected in a dense connection mode.
In some embodiments, the adaptive weight residual module may include 4 residual convolution units and 1 convolution unit.
As shown in fig. 6, the input data and the output data of the residual convolution unit 1 are input data of the residual convolution unit 2. The input data, the output data of the residual convolution unit 2, and the input data of the residual convolution unit 1 serve as input data of the residual convolution unit 3. The input data and the output data of the residual convolution unit 3, the input data of the residual convolution unit 1, and the input data of the residual convolution unit 2 are used as the input data of the residual convolution unit 4. The output data of the residual convolution unit 4, the input data of the residual convolution unit 1, the input data of the residual convolution unit 2, and the input data of the residual convolution unit 3 serve as input data of the convolution unit. The convolution unit performs convolution processing on input data. And then, multiplying the data after convolution processing by a preset coefficient, multiplying the residual convolution unit 1 by another preset coefficient, and splicing the multiplication results of the two coefficients to serve as output data, similar to the ARRU internal processing mode.
As shown in fig. 7, fig. 7 is a schematic diagram of the composition structure of the residual convolution unit. The residual convolution unit includes a first convolution layer, an activation layer and a second convolution layer. In some embodiments, as shown in fig. 7, the first convolution layer, the activation layer and the second convolution layer are connected in series. The activation function may be a non-linear activation function, for example, a Leaky ReLU (Leaky Rectified Linear Unit) function. For example, when the feature extraction result is input to the first convolution layer, the residual extraction result is obtained after the first convolution layer, the activation layer and the second convolution layer process it in sequence.
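For illustration, the following is a minimal PyTorch sketch of how the residual convolution unit (AWCB) and the residual weight unit (AWRB) described above could be composed. The class names, channel counts and the element-wise addition used to combine the two scaled branches are assumptions made for the sketch; the patent only describes the connections at block level and says the scaled results are spliced.

```python
import torch
import torch.nn as nn

class AWCB(nn.Module):
    """Residual convolution unit: first conv layer -> LeakyReLU -> second conv layer (fig. 7)."""
    def __init__(self, in_channels, out_channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 3, padding=1),
            nn.LeakyReLU(negative_slope=0.2, inplace=True),
            nn.Conv2d(out_channels, out_channels, 3, padding=1),
        )

    def forward(self, x):
        return self.body(x)


class AWRB(nn.Module):
    """Residual weight unit: four densely connected AWCBs, a fusion conv,
    and two scale coefficients (in_scale / res_scale), as in figs. 5-6."""
    def __init__(self, channels=64, num_blocks=4):
        super().__init__()
        # Each AWCB sees the unit input concatenated with every earlier AWCB output.
        self.blocks = nn.ModuleList(
            AWCB(channels * (i + 1), channels) for i in range(num_blocks)
        )
        self.fuse = nn.Conv2d(channels * (num_blocks + 1), channels, kernel_size=1)
        # Learnable here; the patent says the coefficients start at 1 and are
        # updated as training progresses, so they could also be scheduled externally.
        self.in_scale = nn.Parameter(torch.ones(1))
        self.res_scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        feats = [x]
        for block in self.blocks:
            feats.append(block(torch.cat(feats, dim=1)))
        fused = self.fuse(torch.cat(feats, dim=1))
        # Combine the two scaled branches by addition (the patent says "spliced";
        # addition keeps the channel count fixed and is assumed here).
        return fused * self.in_scale + x * self.res_scale
```

Passing a (N, 64, H, W) tensor through AWRB(64) reproduces the dense-connection pattern of fig. 6 inside one residual weight unit.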
As shown in fig. 4, after the residual module performs residual processing on the input data and the feature extraction module performs feature extraction on the input data, the up-sampling module performs multi-scale sampling and fusion processing on the obtained residual processing result and feature extraction result. As shown in the dashed frame of the upsampling module in fig. 4, the upsampling module is used for sampling and fusing input data and may include a plurality of upsampling units (e.g., MSFU).
As shown in fig. 8, fig. 8 is a schematic diagram of the composition structure of an up-sampling unit. Each up-sampling unit comprises a splicing subunit and a plurality of up-sampling subunits, wherein the output data of the plurality of up-sampling subunits are used as the input data of the splicing subunit.
The up-sampling subunit is used for amplifying the input data. In some embodiments, as shown in fig. 9, fig. 9 is a schematic structural diagram of a composition of an upsampling subunit provided in an embodiment of the present application. The upsampling sub-unit may include 1 upsampling function and 1 convolutional layer. And performing up-sampling processing on the input data through an up-sampling function, and performing convolution processing on the result after the up-sampling processing through a convolution layer so as to realize amplification processing on the input data.
And the splicing subunit is used for splicing the output data of the plurality of up-sampling subunits. In some embodiments, as shown in fig. 10, fig. 10 is a schematic structural diagram of a splicing subunit provided in an embodiment of the present application. The splice subunit can include a splice function and a convolutional layer. The splicing function may be various, for example, may be a concat function, and may be specifically set according to actual requirements.
It can be understood that, in order to implement the multi-scale sampling and fusion process, the convolution kernels in each upsampling unit (i.e., the convolution kernels of the convolution layers in the upsampling subunit and the splicing subunit) differ in size. In some embodiments, as shown in the figure, when 4 upsampling units are included in the image super-resolution reconstruction network, the convolution kernel sizes of the convolution layers included in the 4 upsampling units are different, and may be 1 × 1, 3 × 3, 5 × 5, and 7 × 7 convolution kernels from top to bottom, respectively. By fusing features with different receptive fields, the up-sampling module helps recover rich detail information.
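A minimal sketch of one multi-scale upsampling unit is given below. The patent text is ambiguous about whether the 1 × 1, 3 × 3, 5 × 5 and 7 × 7 kernels belong to the four upsampling units or to the four upsampling subunits inside a unit; the sketch assumes the latter, so that a single unit fuses features with several receptive fields. The upsampling mode, scale factor and channel counts are likewise assumptions.

```python
import torch
import torch.nn as nn

class UpsampleSubunit(nn.Module):
    """Upsampling subunit: an upsampling function followed by one conv layer (fig. 9)."""
    def __init__(self, channels, kernel_size, scale=2):
        super().__init__()
        self.up = nn.Upsample(scale_factor=scale, mode="nearest")
        self.conv = nn.Conv2d(channels, channels, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        return self.conv(self.up(x))


class MSFU(nn.Module):
    """Multi-scale feature upsampling unit: subunits with different receptive
    fields whose outputs are spliced by a concat + conv subunit (figs. 8-10)."""
    def __init__(self, channels=64, kernel_sizes=(1, 3, 5, 7), scale=2):
        super().__init__()
        self.subunits = nn.ModuleList(
            UpsampleSubunit(channels, k, scale) for k in kernel_sizes
        )
        # Splicing subunit: concat function followed by a conv layer.
        self.splice = nn.Conv2d(channels * len(kernel_sizes), channels, 3, padding=1)

    def forward(self, x):
        outs = [sub(x) for sub in self.subunits]
        return self.splice(torch.cat(outs, dim=1))


# Usage: upsample a 64-channel feature map by a factor of 2.
feat = torch.randn(1, 64, 32, 32)
print(MSFU()(feat).shape)  # torch.Size([1, 64, 64, 64])
```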
It can be understood that, in order to improve the convergence speed of the image super-resolution reconstruction network and reduce memory occupation, the convolution units and the convolution layers (including the convolution unit in the feature extraction module, the convolution unit in the reconstruction module, and the convolution unit, the convolution layer, etc. in each unit) in the embodiment of the present application are all weight normalization convolution modules, that is, weights of convolution are normalized, where a formula of weight normalization is as follows:
w = g · v / ‖v‖
where v is the original convolution kernel of the convolution, g is a trainable parameter with the same dimension as v, and w is the normalized convolution kernel.
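The formula above is the standard weight-normalization reparameterisation. The sketch below shows it applied to a convolution layer with PyTorch's torch.nn.utils.weight_norm helper and verifies the formula manually; taking the norm per output channel is the helper's default behaviour, while the patent states only the general formula.

```python
import torch
import torch.nn as nn

# Reparameterise a conv layer so that its effective kernel is w = g * v / ||v||,
# with g learned and v the underlying (unnormalised) kernel.
wn_conv = nn.utils.weight_norm(nn.Conv2d(64, 64, kernel_size=3, padding=1))

x = torch.randn(1, 64, 16, 16)
y = wn_conv(x)

# Equivalent manual computation (norm taken per output channel, the helper's default).
v = wn_conv.weight_v
g = wn_conv.weight_g
w = g * v / v.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
assert torch.allclose(w, wn_conv.weight, atol=1e-6)
```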
As can be seen from fig. 4, the input data of the residual module is used as the input data of upsampling unit 1, the output data of the residual module is used as the input data of upsampling unit 2, the output data of upsampling unit 1 and upsampling unit 2 are used as the input data of upsampling unit 4, the output data of upsampling unit 1 is used as the input data of upsampling unit 3, the output data of upsampling unit 3 and upsampling unit 4 are used as the input data of the reconstruction module, and the reconstructed image is obtained after convolution processing.
From the overall structure, the image super-resolution reconstruction network in the embodiment of the present application adopts a macro structure of dense residual connections. Specifically, for each internal module, the connection between the residual module and the plurality of upsampling units adopts the connection mode of a Residual Dense Network, and part of the connections inside the residual module are skip connections. On the one hand, the network converges faster; on the other hand, the generation of artifacts is reduced and the accuracy of image reconstruction is improved. In addition, convolution kernels of different sizes are used for multi-scale prediction, which further improves the accuracy of image reconstruction.
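To make the dataflow of fig. 4 concrete, the following structural sketch wires simplified stand-in modules in the order described above: feature extraction, a chain of residual units, four upsampling units with dense connections, fusion and reconstruction. The stand-in modules, channel counts, scale factor and the element-wise additions at the junctions are assumptions; the real network would use the ARRU and MSFU blocks sketched earlier.

```python
import torch
import torch.nn as nn

class SRNet(nn.Module):
    """Structural sketch of fig. 4: feature extraction -> residual units ->
    four upsampling units with dense connections -> fusion -> reconstruction.
    Simplified stand-ins replace the ARRU / MSFU blocks described above."""
    def __init__(self, channels=64, num_residual_units=8, scale=2):
        super().__init__()
        self.feature = nn.Conv2d(3, channels, 3, padding=1)  # feature extraction module
        self.residual = nn.Sequential(*[                     # residual module (stand-in units)
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            for _ in range(num_residual_units)
        ])

        def up():                                            # upsampling unit stand-in
            return nn.Sequential(nn.Upsample(scale_factor=scale),
                                 nn.Conv2d(channels, channels, 3, padding=1))

        self.up1, self.up2, self.up3, self.up4 = up(), up(), up(), up()
        self.reconstruct = nn.Sequential(                    # reconstruction module
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        feat = self.feature(x)     # feature extraction result
        res = self.residual(feat)  # residual processing result
        u1 = self.up1(feat)        # unit 1: upsamples the feature extraction result
        u2 = self.up2(res)         # unit 2: upsamples the residual processing result
        u3 = self.up3(u1)          # unit 3: first upsampling result
        u4 = self.up4(u1 + u2)     # unit 4: second upsampling result (sum assumed)
        return self.reconstruct(u3 + u4)  # fusion (sum assumed) and reconstruction


lr_image = torch.randn(1, 3, 32, 32)
print(SRNet()(lr_image).shape)  # torch.Size([1, 3, 128, 128]) after two 2x stages
```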
An embodiment of the present application provides an image super-resolution reconstruction method, as shown in fig. 11, fig. 11 is a schematic flowchart of the image super-resolution reconstruction method provided in the embodiment of the present application, and the image super-resolution reconstruction method may include:
101, acquiring a first image needing super-resolution reconstruction.
The first image can be acquired in a plurality of ways. For example, the images may be acquired from a local memory, or may be acquired from an external monitoring device, a medical instrument, or the like.
It will be appreciated that the first image may be a lower resolution image. For example, if the monitored picture has low sharpness, super-resolution reconstruction is required.
In some embodiments, after the first image is acquired, the resolution of the first image may be acquired, and when it is determined that the resolution of the first image is greater than the preset resolution, that is, the first image is an image with a higher resolution, it may be determined that the first image meets the picture resolution requirement of the user, and then reconstruction is not performed through a pre-trained image super-resolution reconstruction network. The preset resolution can be specifically set according to actual requirements.
102, acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module.
The image super-resolution reconstruction network can be obtained through pre-training.
In some embodiments, the image super-resolution reconstruction network may be the generation network in a Generative Adversarial Network (GAN).
For example, in the pre-training process, a high resolution image sample may be obtained from the image dataset, and a corresponding low resolution image may be obtained by processing the high resolution image sample. And inputting the low-resolution image into an untrained generation network for reconstruction to obtain a reconstructed image. And calculating the loss between the high-resolution image sample and the reconstructed image, then firstly carrying out pre-training on an untrained generation network, and then carrying out iterative training on the pre-trained generation network and an untrained discrimination network. Specifically, during training, the generation network is trained separately, and then the discrimination network and the generation network are trained together, i.e., fine-tune (fine-tune) is performed on the basis of the generation network trained separately. And when the neural network parameters are converged, finishing training to obtain a trained generation network, and taking the trained generation network as an image super-resolution reconstruction network.
The image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module.
103, performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result.
The feature extraction module may include an input unit and a convolution unit.
For example, after a first image is input to the convolution unit through the input unit, the convolution unit performs convolution processing on the first image to extract the features of the first image, so as to obtain a feature extraction result.
And 104, performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result.
In some embodiments, the residual module may include a plurality of residual units, and the residual processing result may be obtained by performing residual processing on the feature extraction result through the plurality of residual units. The method for performing residual processing on the feature extraction result through the residual module may include:
inputting the feature extraction result into a first residual error unit for self-adaptive residual error processing, and outputting a self-adaptive residual error result;
and inputting the self-adaptive residual error result into a second residual error unit for self-adaptive residual error processing, and repeating the steps until the output result of the last residual error unit is input into the last residual error unit for self-adaptive residual error processing, and outputting a residual error processing result.
In some embodiments, the residual unit may be a weight-adaptive residual-in-residual unit (ARRU). The residual unit can be specifically set according to actual requirements.
For example, as shown in fig. 4, a plurality of residual units in fig. 4 are connected in series. And the residual error unit 1 performs self-adaptive residual error processing on the feature extraction result to obtain a self-adaptive residual error result. Inputting the self-adaptive residual error result into the residual error unit 2 to continue the self-adaptive residual error processing, and so on until the last residual error unit outputs the residual error processing result.
In some embodiments, the residual unit may include a plurality of residual weight units, and the adaptive residual result may be obtained by processing the plurality of residual weight units, where each residual weight unit is processed in the same manner. Taking the processing of the first residual weight unit as an example, the method for inputting the feature extraction result into the first residual unit for adaptive residual processing and outputting the adaptive residual result may include:
inputting the feature extraction result into a first residual error weight unit for residual error fusion processing, and outputting a residual error fusion result;
splicing the residual fusion result and the feature extraction result, inputting the spliced result into a second residual weight unit for residual fusion processing, and so on until the output result of the penultimate residual weight unit is spliced with the input data and input into the last residual weight unit for residual fusion processing;
splicing the output result and the input data of the last residual weight unit to obtain a splicing result;
multiplying the splicing result by a first preset coefficient to obtain a first product result, and multiplying the feature extraction result by a second preset coefficient to obtain a second product result, wherein the first preset coefficient and the second preset coefficient are updated along with a pre-training process;
and splicing the first product result and the second product result to obtain the self-adaptive residual error result.
For other residual error units, the input data of the first residual error weight unit in the residual error unit is multiplied by the second preset coefficient to obtain the second product result.
The residual weight unit may be an Adaptive Weight Residual Block (AWRB). The residual weight unit can be set according to actual requirements.
The first preset coefficient and the second preset coefficient can be set according to actual requirements. For example, the first preset coefficient and the second preset coefficient may be initialized to 1 during the training process and linearly increase as the training process progresses.
For example, as shown in fig. 5, the 4 residual weight units in fig. 5 are mainly connected by skip connections. Residual weight unit 1 performs residual fusion processing on the input feature extraction result to obtain a residual fusion result. The residual fusion result and the feature extraction result are spliced and input into residual weight unit 2 to continue residual fusion. By analogy, the output result of residual weight unit 3 is spliced with the input data and then input into residual weight unit 4 for residual fusion processing.
And then, splicing the output result of the residual error weight unit 4 and the input data to obtain a splicing result. And multiplying the splicing result by a first preset coefficient, and multiplying the feature extraction result (namely the input data of the residual weight unit 1) by a second preset coefficient. And splicing the multiplication results of the two to obtain a self-adaptive residual error result.
When super-resolution reconstruction is performed using a generative adversarial network, various artifacts (image structures that do not exist in the original object but appear in the image) easily appear in the generated picture. The embodiment of the application innovates the structure of the generation network by adopting a dense residual structure and skip connections in the generation network; this structure reduces the frequency of artifacts, so that the reconstructed image is more accurate.
In some embodiments, the residual weighting unit may include m residual convolution units, and the residual fusion of the feature extraction result by the first residual weighting unit may include:
inputting the feature extraction result into a first residual convolution unit for residual extraction processing, and outputting a residual extraction result;
inputting the residual extraction result and the feature extraction result into a second residual convolution unit for residual extraction;
by analogy, inputting the input data of the first m-1 residual convolution units and the processing result of the (m-1) th residual convolution unit into the mth residual convolution unit for residual extraction processing;
performing convolution processing on the output result of the mth residual convolution unit and the input data of the first m-1 residual convolution units to obtain a convolution processing result;
multiplying the convolution processing result by a first preset coefficient to obtain a third product result, and multiplying the feature extraction result by a second preset coefficient to obtain a fourth product result, wherein the first preset coefficient and the second preset coefficient are updated along with a preset training process;
and splicing the third product result and the fourth product result to obtain a residual error fusion result.
In some embodiments, the residual convolution unit may be an Adaptive Weight Convolution Block (AWCB), which can be set according to actual requirements. The number m of residual convolution units may be a positive integer, for example, m = 1, 2, 3, ….
For example, as shown in fig. 6, fig. 6 is a schematic diagram of a connection structure of 4 residual convolution units. The residual convolution units are mainly connected in a similar dense residual connection mode.
The feature extraction result is input into residual convolution unit 1 for residual extraction processing to obtain a residual extraction result. Then, the output result (i.e., the residual extraction result) of residual convolution unit 1 and its input data (i.e., the feature extraction result) are input into residual convolution unit 2 for processing. As can be seen from the figure, for each residual convolution unit, the input data is the output result of the preceding residual convolution unit together with the input data of the earlier residual convolution units. By analogy, the input data of the first 3 residual convolution units and the processing result of residual convolution unit 3 are input into residual convolution unit 4 for residual extraction processing.
And then, inputting the output result of the residual convolution unit 4 and the input data of the first 3 residual convolution units into a convolution unit for convolution processing to obtain a convolution processing result. And multiplying the convolution processing result by a first preset coefficient to obtain a third product result, multiplying the feature extraction result by a second preset coefficient to obtain a fourth product result, and splicing the third product result and the fourth product result to obtain a residual error fusion result.
The first preset coefficient and the second preset coefficient can be set according to actual requirements. For example, the first preset coefficient and the second preset coefficient may be initialized to 1 during the training process and linearly increase as the training process progresses.
In some embodiments, the residual convolution unit includes a first convolution layer, an activation layer and a second convolution layer, and the method of outputting the residual extraction result may include:
and processing the feature extraction result sequentially through the first convolution layer, the activation layer and the second convolution layer to obtain a residual extraction result.
For other residual convolution units, the feature extraction result is input to the first residual convolution unit for residual extraction processing, and the manner of outputting the residual extraction result can also have other expression forms. For example, the residual extraction process may be performed on input data of other residual convolution units.
The activation layer may include an activation function. The activation function may be a Leaky ReLU function; for example, as shown in fig. 7, the first convolution layer, the activation layer and the second convolution layer are connected in series. The feature extraction result is input to the first convolution layer and processed in sequence by the first convolution layer, the activation layer and the second convolution layer to obtain the residual extraction result.
As can be seen from the above, the residual module provided in the embodiment of the present application adopts a dense residual structure similar to that in the dense residual network. Meanwhile, jump connection is added in the residual error module. The reconstruction accuracy is higher, the network convergence is faster, and the image reconstruction efficiency is improved.
And 105, performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result.
The upsampling module may include n upsampling units. The value of n can vary and may be determined according to the computing power available for the image super-resolution reconstruction network; it can be understood that the larger the value of n, the larger the amount of computation and the higher the accuracy of the corresponding reconstructed image. For example, n may be a positive even number (n = 2, 4, 6, …), i.e., the upsampling units appear in pairs.
In some embodiments, the upsampling unit may be a multi-scale feature upsampling unit (MSFU), which can be set according to actual requirements.
In some embodiments, when the upsampling module may include n upsampling units, the upsampling module performs upsampling on the feature extraction result to obtain a first upsampling result, which may include:
performing upsampling processing on the feature extraction result through a first upsampling unit;
performing upsampling processing on the output result of the first upsampling unit through the third upsampling unit, and so on until the output result of the (n-3) th upsampling unit is subjected to upsampling processing by the (n-1) th upsampling unit to obtain a first upsampling result;
the method for performing upsampling processing on the residual error processing result by the upsampling module to obtain a second upsampled result may include:
performing upsampling processing on the residual error processing result through a second upsampling unit;
splicing the output result of the first up-sampling unit and the output result of the second up-sampling unit to obtain a sampling splicing result, and performing up-sampling processing on the sampling splicing result through a fourth up-sampling unit;
and repeating the steps until the output result of the (n-3) th upsampling unit and the output result of the (n-2) th upsampling unit are spliced to obtain a sampling fusion result, and performing upsampling processing on the sampling fusion result through the nth upsampling unit to obtain a second upsampling processing result.
Taking the input data of the first upsampling unit as the feature extraction result as an example, as shown in fig. 4, a schematic structural diagram of the upsampling module when the upsampling module has four upsampling units is shown in a dashed line frame of the upsampling module in fig. 4. Inputting the feature extraction result to a first up-sampling unit for up-sampling processing, and then inputting the output result to a third up-sampling unit and a fourth up-sampling unit; meanwhile, the second up-sampling unit performs up-sampling processing on the residual processing result output by the residual module, and then inputs the output result to the fourth up-sampling unit for processing.
And the third up-sampling unit performs up-sampling processing on the output result of the first up-sampling unit to obtain a first up-sampling result. And the fourth up-sampling unit performs up-sampling processing on the sum of the output results of the first up-sampling unit and the second up-sampling unit to obtain a second up-sampling result.
In some embodiments, the upsampling unit may include a plurality of upsampling sub-units and a stitching sub-unit, and the method for upsampling the feature extraction result by the first upsampling unit may include:
performing multi-scale sampling processing on the feature extraction result through the plurality of up-sampling subunits to obtain a plurality of scale sampling results;
and splicing the multiple scale sampling results through the splicing subunit.
Wherein each upsampling subunit may be composed of an upsampling function and a convolutional layer in sequence. When data is input into the up-sampling subunit, the data is subjected to up-sampling processing and then convolution processing. The number of the upsampling subunits can be set according to actual requirements, and in some embodiments, the upsampling unit may include four upsampling subunits.
In order to realize the multi-scale sampling processing, the convolution kernels of the convolution layers contained in different up-sampling units have different sizes. For example, when the upsampling module includes four upsampling units, the convolution layers in the different upsampling units may use 1 × 1, 3 × 3, 5 × 5 and 7 × 7 convolution kernels, respectively.
Each splicing subunit may be composed of a splicing function followed by a convolution layer. The data input into the splicing subunit is processed by the splicing function and the convolution layer in sequence, and a splicing result is then output. The splicing function may be set according to actual requirements; for example, it may be a concat function.
Taking the input data of the first upsampling unit as the feature extraction result as an example, as shown in the figure, the first upsampling unit includes four upsampling sub-units and one splicing sub-unit. And inputting the feature extraction result into the four up-sampling subunits, and sequentially performing up-sampling function and convolutional layer processing to obtain four output results. And then inputting the four output results into a splicing subunit, and obtaining the output result of the first up-sampling unit through the processing of a splicing function and a convolution layer.
And 106, fusing the first up-sampling result and the second up-sampling result to obtain a fused result.
There may be multiple methods for obtaining the fusion result by fusing the first upsampling result and the second upsampling result. For example, the first upsampled result and the second upsampled result may be summed to obtain a fused result. For another example, the first upsampling result and the second upsampling result may be spliced to obtain a fusion result.
And 107, performing super-resolution reconstruction on the fusion result through a reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
The reconstruction module may include a plurality of convolution units and an output unit.
In some embodiments, two convolution units and one output unit may be included in the reconstruction module. And the fusion result is subjected to reconstruction processing through two convolution units in sequence, and then a second image is output through an output unit. When the first image is a low resolution image, the second image may be a high resolution reconstructed image, and thus the resolution of the second image is greater than the resolution of the first image.
It can be understood that the convolution units and the convolution layers (including the convolution unit in the feature extraction module, the convolution unit in the reconstruction module, and the convolution unit and the convolution layer in each unit) in the embodiment of the present application are all weight normalization convolution modules, that is, the weights of the convolution are normalized, so that the convergence speed of the image super-resolution reconstruction network can be increased, and the memory usage can be reduced. The weight normalization formula is as follows:
w = g · v / ‖v‖
where v is the original convolution kernel of the convolution, g is a trainable parameter with the same dimension as v, and w is the normalized convolution kernel.
Therefore, the first image needing to be subjected to super-resolution reconstruction is obtained, the pre-trained image super-resolution reconstruction network is obtained, and the first image is processed through the feature extraction module, the residual error module, the upsampling module and the reconstruction module in the image super-resolution reconstruction network to obtain the second image with the resolution ratio larger than that of the first image.
According to the embodiment of the application, the structure of the network is innovated, which improves the accuracy of image super-resolution reconstruction. The macro network structure formed by the modules adopts a dense residual connection mode similar to that of a Residual Dense Network. The structure of each module also adopts a dense residual structure similar to that of a residual dense network, and skip connections are added in the residual module.
On one hand, the dense residual error structure and the jump connection reduce the probability of the generated image with artifacts, and improve the image accuracy of the reconstructed image; on the other hand, the added jump connection enables the image super-resolution reconstruction network to be converged more quickly, and the efficiency of reconstructing the image is improved. In addition, convolution kernels with different sizes are used for multi-scale sampling and prediction during sampling, and the accuracy of reconstructed images is improved.
Before the image super-resolution reconstruction is carried out, the image super-resolution reconstruction network can be trained in advance. An embodiment of the present application further provides a training method for an image super-resolution reconstruction network, as shown in fig. 12, fig. 12 is a schematic flow chart of the training method provided in the embodiment of the present application, and the training method may include:
201. a sample set of images is acquired.
Wherein each image sample group comprises a first image sample and a second image sample. The first image sample corresponds to the second image sample, the first image sample having a higher resolution than the second image sample.
The first image sample may be a high-resolution image and may be acquired from a high-resolution image dataset. For example, a blend of the DIV2K and Flickr2K datasets may be used as the high-resolution image dataset, with the image resolution in the dataset being 2K, and the first image sample is taken from the blended dataset. Because the dataset contains rich image scenes and diverse content, it can meet the requirements of model training.
Wherein the second image sample can be obtained by performing resolution reduction processing on the first image sample. The manner of performing resolution reduction on the first image sample may include a variety of ways, and in some embodiments, the second image sample may be obtained by performing a reduction process on the size of the first image sample. For example, the second image sample may be acquired by reducing the size of the first image sample to 1/16. Specifically, the second image sample may be obtained by down-sampling the first image sample by a factor of 16 using bilinear interpolation in MATLAB.
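A minimal sketch of this resolution-reduction step is shown below, using bilinear interpolation in PyTorch as a stand-in for the MATLAB bilinear downsampling mentioned above; the tensor layout and the interpretation of the factor are assumptions.

```python
import torch
import torch.nn.functional as F

def make_lr_sample(hr: torch.Tensor, factor: int = 16) -> torch.Tensor:
    """Create the low-resolution second image sample from a high-resolution first
    image sample by bilinear downsampling (a stand-in for the MATLAB bilinear
    interpolation mentioned above). hr: (N, C, H, W), values in [0, 1]."""
    return F.interpolate(hr, scale_factor=1.0 / factor, mode="bilinear",
                         align_corners=False)

hr = torch.rand(1, 3, 2048, 1024)  # e.g. a 2K image from the mixed dataset
lr = make_lr_sample(hr)
print(lr.shape)                    # torch.Size([1, 3, 128, 64])
```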
In some embodiments, the manner in which the image sample set is acquired may be varied. For example, after acquiring a plurality of first image samples, performing downsampling on the first image samples to obtain a plurality of second image samples, and forming an image sample group by the first image samples and the corresponding second image samples, the method for acquiring an image sample group may specifically include:
acquiring a plurality of first image samples; performing downsampling processing on each first image sample to obtain a second image sample corresponding to each first image sample; and setting each first image sample and its corresponding second image sample as an image sample group.
For example, a plurality of first image samples may be obtained from a high-resolution image dataset obtained by mixing the DIV2K and Flickr2K datasets, each first image sample is downsampled by a factor of 16 using bilinear interpolation in MATLAB to obtain the second image sample corresponding to each first image sample, and each first image sample and its corresponding second image sample are set as an image sample group.
It can be understood that, in the training process, the sizes of the image samples, that is, of each first image sample and each second image sample, need to be fixed in order to improve training efficiency and accuracy. However, since the image sizes in the training dataset are not consistent, in some embodiments the sizes in the image sample group may be fixed by randomly cropping a patch of fixed size from each first image sample and cropping the corresponding location from each second image sample.
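A sketch of such a paired random crop is given below; the patch size, the scale factor between the two samples and the tensor layout are illustrative assumptions.

```python
import random
import torch

def paired_random_crop(hr: torch.Tensor, lr: torch.Tensor,
                       lr_patch: int = 48, factor: int = 16):
    """Randomly crop a fixed-size patch from the low-resolution sample and the
    corresponding (factor-times larger) patch from the same location of the
    high-resolution sample, so every training pair has a fixed, aligned size."""
    _, _, h, w = lr.shape
    top = random.randint(0, h - lr_patch)
    left = random.randint(0, w - lr_patch)
    lr_crop = lr[:, :, top:top + lr_patch, left:left + lr_patch]
    hr_crop = hr[:, :, top * factor:(top + lr_patch) * factor,
                 left * factor:(left + lr_patch) * factor]
    return hr_crop, lr_crop
```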
202. And constructing a discrimination network corresponding to the image super-resolution reconstruction network.
In some embodiments, the generative model in a Generative Adversarial Network (GAN) may be used as the image super-resolution reconstruction network, and the discriminative model therein may be used as the discrimination network. The network is constructed on the basis of an untrained Generative Adversarial Network (GAN).
The generative adversarial network is a deep learning model that comprises at least two modules: a generative model and a discriminative model. In the training process, the generative model generates similar data by learning real data; for example, after learning some real animal pictures, the generative model can generate animal pictures by itself. The discriminative model classifies the animal pictures generated by the generative model and the real animal pictures, feeds the classification result back to the generative model, and the generative model is trained accordingly. For example, a generated picture yields an output of 0 and a real picture yields an output of 1, and the discrimination result is fed back to the generative model for training. In this way, the animal pictures generated by the generative model approximate the real pictures more and more closely. Through repeated training, when the discriminative model can no longer distinguish the generated pictures from the real pictures, the generative adversarial network has converged, i.e., training is completed.
203. And performing pre-training on the image super-resolution reconstruction network according to the image sample group, and performing combined training on the image super-resolution reconstruction network and the discrimination network according to the image sample group.
The embodiment of the present application takes the case where the image super-resolution reconstruction network is the generation network as an example. During pre-training, the generation network can be trained individually according to the image sample group, and the discrimination network is fine-tuned (fine-tune, a neural network training method in which part of the parameters of a model are adjusted on the basis of an already trained neural network model so as to reduce the number of training iterations) according to the generation network obtained by individual training; then the discrimination network and the generation network are trained together.
The image super-resolution reconstruction network and the discrimination network can be jointly trained according to the image sample group in various ways. Similar to the training of a generative adversarial network, the image super-resolution reconstruction network and the discrimination network can be jointly trained with the image sample group comprising the first image sample and the corresponding second image sample.
For example, when the image super-resolution reconstruction network is a generation network, the generation network may perform image super-resolution reconstruction on the second image sample to obtain a reconstructed image, and the generation network and the discrimination network are alternately trained by comparing the reconstructed image with the first image sample. And finishing training until the network parameters are converged.
In some embodiments, the method for jointly training the image super-resolution reconstruction network and the discriminant network according to the image sample set may include:
performing image super-resolution reconstruction on the second image sample through the image super-resolution reconstruction network to obtain a reconstructed image corresponding to the second image sample;
obtaining the generation loss of the image super-resolution reconstruction network according to the first image sample, the reconstructed image and the discrimination network;
obtaining the discrimination loss of the discrimination network according to the first image sample and the reconstructed image;
and performing combined training on the image super-resolution reconstruction network and the discrimination network according to the generation loss and the discrimination loss by taking the network parameter convergence as constraint.
For example, the image super-resolution reconstruction network is the generation network. When the first image sample is a high-resolution image, the second image sample is a low-resolution image obtained by processing the high-resolution image. The super-resolution reconstructed image can be generated by inputting the low-resolution image data into the generation network. For the generation network, the generation loss includes a relativistic adversarial loss obtained by the discrimination network discriminating the high-resolution picture and the reconstructed image, and a loss obtained from the difference between the high-resolution picture and the reconstructed image. For the discrimination network, the discrimination loss includes an adversarial loss calculated from the high-resolution image and the reconstructed image. The training processes of the generation network and the discrimination network are executed alternately according to the different losses, and training is completed when the neural network parameters of both networks converge.
In some embodiments, there may be multiple methods of alternately training the generation network and the discrimination network. In some embodiments, an Adam optimizer may be used to iteratively train the generation network and the discrimination network; when the error between the reconstructed image and the high-resolution image is smaller than a preset error value, or the weight change between two iterations is smaller than a preset threshold, the neural network parameters are considered to have converged and the training is completed. For another example, the number of iterations may be preset, and when the number of iterations exceeds the preset number, the neural network parameters are considered to have converged and the training is completed.
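As an illustration of such an alternating schedule, the following is a minimal PyTorch sketch. The tiny stand-in networks, the plain (non-relativistic) adversarial term, and the learning rates and loss weights are illustrative assumptions and do not come from the patent.

```python
import torch
import torch.nn as nn

# Tiny stand-in networks so the loop runs end to end; in practice these would be
# the image super-resolution reconstruction network and the discrimination network.
generator = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Upsample(scale_factor=4))
discriminator = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                              nn.Flatten(), nn.Linear(8, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(2):                        # a couple of dummy iterations
    lr_img = torch.rand(1, 3, 32, 32)        # second image sample (low resolution)
    hr_img = torch.rand(1, 3, 128, 128)      # first image sample (high resolution)

    # Generation network step: reconstruction (L1) term plus a small adversarial term.
    sr_img = generator(lr_img)
    g_adv = bce(discriminator(sr_img), torch.ones(1, 1))
    g_loss = nn.functional.l1_loss(sr_img, hr_img) + 1e-3 * g_adv
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Discrimination network step on real vs. reconstructed images.
    d_real = bce(discriminator(hr_img), torch.ones(1, 1))
    d_fake = bce(discriminator(sr_img.detach()), torch.zeros(1, 1))
    d_loss = d_real + d_fake
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```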
In some embodiments, the generation loss of the image super-resolution reconstruction network includes a reconstruction loss and a countervailing loss, and the method of obtaining the generation loss of the image super-resolution reconstruction network according to the first image sample, the reconstructed image and the discrimination network may include:
obtaining the reconstruction loss of the image super-resolution reconstruction network according to the first image sample and the reconstruction image;
calculating the countermeasure loss of the image super-resolution reconstruction network according to the similarity between the first image sample and the reconstructed image through the discrimination network;
and obtaining the generation loss of the image super-resolution reconstruction network according to the reconstruction loss and the confrontation loss.
In some embodiments, the reconstruction loss may include a high frequency loss, a low frequency loss, and a full map loss, and the method of obtaining the reconstruction loss of the image super-resolution reconstruction network from the first image sample and the reconstructed image may include:
extracting the characteristics of the first image sample and the reconstructed image according to the frequency to obtain characteristic information;
acquiring high-frequency loss and low-frequency loss between the first image sample and the reconstructed image according to the characteristic information;
calculating the total image loss of the first image sample and the reconstructed image according to the correlation between the first image sample and the reconstructed image;
and obtaining the reconstruction loss of the image super-resolution reconstruction network according to the high-frequency loss, the low-frequency loss and the full-image loss.
There are various methods for extracting the features of the first image sample and the reconstructed image according to the frequency. For example, the first image sample and the reconstructed image may be decomposed to obtain a plurality of sub-images, and feature information of the image may be obtained by sub-image extraction.
For example, the first image sample and the reconstructed image may each be divided into four subgraphs by a one-level wavelet decomposition: subgraph A is the low-frequency subgraph, subgraph H is the horizontal high-frequency subgraph, subgraph V is the vertical high-frequency subgraph, and subgraph D is the diagonal high-frequency subgraph. Each subgraph is half the size of the original picture in each dimension. Feature extraction is then performed on the subgraphs to obtain the corresponding high-frequency loss and low-frequency loss.
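As a concrete illustration, a one-level 2-D wavelet decomposition of a single-channel image can be obtained with the PyWavelets library; the 'haar' wavelet is an illustrative choice, since the patent does not name a specific wavelet.

```python
import numpy as np
import pywt

img = np.random.rand(128, 128)            # stand-in single-channel image
A, (H, V, D) = pywt.dwt2(img, 'haar')     # A: low-frequency; H/V/D: horizontal/vertical/diagonal high-frequency
print(A.shape)                            # (64, 64): each subgraph is half the original size per dimension
```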
There are various ways to obtain the high frequency loss and the low frequency loss between the first image sample and the reconstructed image according to the feature information.
For example, for the high-frequency loss, features of a high-frequency subgraph can be extracted through a neural network, and the Euclidean distance between the features can be used as the high-frequency loss. For instance, the features of the high-frequency subgraphs corresponding to the reconstructed image and the first image sample can be extracted through the conv5_4 layer of the VGG19 network, and the Euclidean distance between them taken as the high-frequency perceptual loss, as shown in the following formula:
L_H = ||VGG(H_gen) - VGG(H_hr)||_2
where H_gen refers to the horizontal high-frequency subgraph of the reconstructed image, H_hr refers to the horizontal high-frequency subgraph of the first image sample, VGG() refers to the feature extraction function of the VGG network, and L_H refers to the loss between the horizontal high-frequency subgraphs of the reconstructed image and the first image sample. Similarly, the loss between the vertical high-frequency subgraphs and the loss between the diagonal high-frequency subgraphs of the reconstructed image and the first image sample can be obtained. The sum of these three losses is taken as the high-frequency loss between the reconstructed image and the first image sample.
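A hedged sketch of this high-frequency perceptual term in PyTorch is shown below. The use of torchvision's pretrained VGG19 truncated at feature index 34 (conv5_4 in torchvision's numbering), the torchvision >= 0.13 weights API, and the assumption that each subgraph is fed as a 3-channel tensor are all interpretations of the text, not details taken from the patent.

```python
import torch
from torchvision.models import vgg19

# Truncate VGG19 after conv5_4 (index 34 in torchvision's feature list) and freeze it.
vgg_features = vgg19(weights="IMAGENET1K_V1").features[:35].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def high_freq_subgraph_loss(h_gen: torch.Tensor, h_hr: torch.Tensor) -> torch.Tensor:
    # h_gen, h_hr: high-frequency subgraphs of the reconstructed image and the
    # first image sample, shape (N, 3, H, W). Euclidean distance of the features.
    return torch.norm(vgg_features(h_gen) - vgg_features(h_hr), p=2)
```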
For another example, for the low-frequency loss, the L1 loss between the low-frequency subgraphs of the reconstructed image and the first image sample can be taken as the low-frequency loss. The L1 loss is also known as least absolute deviations (LAD) or least absolute errors (LAE). The formula is shown below:
L_{1,A} = L_1(A_gen, A_hr)
where A_gen is the low-frequency subgraph of the reconstructed image, A_hr is the low-frequency subgraph of the first image sample, and L_{1,A} refers to the L1 loss between the low-frequency subgraphs of the reconstructed image and the first image sample.
There are various ways to calculate the full-image loss of the first image sample and the reconstructed image according to the correlation between the first image sample and the reconstructed image. For example, the L1 loss between the first image sample and the reconstructed image can be taken as the full-image loss, as shown in the following formula:
L_{1,img} = L_1(gen, hr)
the reconstruction loss of the image super-resolution reconstruction network can be obtained according to the high-frequency loss, the low-frequency loss and the full-image loss in various ways. For example, according to actual requirements, the high-frequency loss and the low-frequency loss are multiplied by a preset coefficient, the whole-image loss is multiplied by another preset coefficient, and the multiplication results are added to obtain the reconstruction loss of the image super-resolution reconstruction network.
In some embodiments, the way of calculating the countermeasure loss of the image super-resolution reconstruction network according to the similarity between the first image sample and the reconstructed image through the discriminant network may be various, for example, the whole image data of the first image sample and the reconstructed image may be input into the discriminant network, and the equation is as follows:
L_adv = L_D(hr, gen) + L_D(gen, hr)
where hr refers to the first image sample, gen refers to the reconstructed image, L_D(hr, gen) refers to the loss between the first image sample and the reconstructed image, and L_D(gen, hr) refers to the loss between the reconstructed image and the first image sample.
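The formula image in the original is not fully recoverable, so the sketch below shows one common relativistic-average formulation of such a countermeasure loss for the generation network, consistent with the "relative countermeasure loss" mentioned earlier; it should be read as an assumption, not as the patent's exact formula.

```python
import torch
import torch.nn.functional as F

def generator_adversarial_loss(disc, hr, gen):
    # disc returns raw logits for a batch of images.
    d_hr, d_gen = disc(hr), disc(gen)
    # L_D(hr, gen): real images judged relative to the average score of reconstructed images.
    l_hr_gen = F.binary_cross_entropy_with_logits(d_hr - d_gen.mean(), torch.zeros_like(d_hr))
    # L_D(gen, hr): reconstructed images judged relative to the average score of real images.
    l_gen_hr = F.binary_cross_entropy_with_logits(d_gen - d_hr.mean(), torch.ones_like(d_gen))
    return l_hr_gen + l_gen_hr
```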
In some embodiments, the way of deriving the generation loss of the image super-resolution reconstruction network from the reconstruction loss and the countermeasure loss may be various. For example, the reconstruction loss and the countermeasure loss may be given different weights and then summed to obtain the generation loss. For instance, the high-frequency loss is multiplied by a first coefficient, the sum of the low-frequency loss and the full-image loss is multiplied by a second coefficient, the countermeasure loss is multiplied by a third coefficient, and the three products are summed to obtain the generation loss, as shown in the following formula:
L_gen = α · L_high + γ · (L_{1,A} + L_{1,img}) + λ · L_adv
where α, γ and λ are coefficients set according to actual requirements, L_high is the high-frequency loss, L_{1,A} is the low-frequency loss, L_{1,img} is the full-image loss, and L_adv is the countermeasure loss.
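In code this weighted combination is a one-liner; the default coefficient values below are placeholders, not values from the patent.

```python
def generation_loss(l_high, l_low, l_img, l_adv, alpha=1.0, gamma=1.0, lam=5e-3):
    # L_gen = alpha * L_high + gamma * (L_{1,A} + L_{1,img}) + lambda * L_adv
    return alpha * l_high + gamma * (l_low + l_img) + lam * l_adv
```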
In some embodiments, there may be multiple ways to derive the discrimination loss of the discrimination network from the first image sample and the reconstructed image. In some embodiments, the discrimination loss may be derived from a discrimination network loss function computed over the first image sample and the reconstructed image.
for example, after the generation loss and the discrimination loss are obtained, an Adam optimizer can be used to perform iterative training on the two networks alternately, and when an error between a reconstructed image and a high-resolution image is smaller than a preset error value, parameters of neural networks of the two networks are converged, and the training is completed.
Common image reconstruction methods usually train the image super-resolution reconstruction network with an L1 or L2 loss alone, so the generated pictures are often over-smoothed, details are lost, and the visual quality is poor. Meanwhile, in super-resolution reconstruction with a large magnification factor, high-frequency details are difficult to recover. In the embodiment of the application, the loss function used to train the image super-resolution reconstruction network is improved: the high-frequency and low-frequency components of the image are extracted, the high-frequency loss and the low-frequency loss are calculated separately, the recovery of high-frequency details and low-frequency contours is enhanced, and the image reconstruction effect and accuracy are improved.
As can be seen from the above, the image super-resolution reconstruction network training method provided in the embodiment of the present application can acquire an image sample set, where the image sample set includes a first image sample and a second image sample, and a resolution of the first image sample is greater than a resolution of the second image sample; constructing a discrimination network corresponding to the image super-resolution reconstruction network; and performing pre-training on the image super-resolution reconstruction network according to the image sample group, and performing combined training on the image super-resolution reconstruction network and the discrimination network according to the image sample group.
In order to better implement the image super-resolution reconstruction method provided by the embodiment of the application, the embodiment of the application also provides a device based on the image super-resolution reconstruction method. The terms are the same as those in the image super-resolution reconstruction method, and specific implementation details can refer to the description in the method embodiment.
Referring to fig. 13, fig. 13 is a block diagram of an image super-resolution reconstruction apparatus 300 according to an embodiment of the present application. Specifically, the image super-resolution reconstruction apparatus 300 includes:
an image super-resolution reconstruction apparatus comprising: an obtaining module 301, an obtaining network module 302, a feature extraction module 303, a residual error processing module 304, a sampling processing module 305, a fusion module 306, and a reconstruction module 307.
An obtaining module 301, configured to obtain a first image that needs super-resolution reconstruction;
an obtaining network module 302, configured to obtain a pre-trained image super-resolution reconstruction network, where the image super-resolution reconstruction network includes a feature extraction module 303, a residual error module, an upsampling module, and a reconstruction module 307;
a feature extraction module 303, configured to perform feature extraction on the first image through the feature extraction module 303 to obtain a feature extraction result;
a residual error processing module 304, configured to perform residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result;
a sampling processing module 305, configured to perform upsampling on the feature extraction result by the upsampling module to obtain a first upsampling result, and perform upsampling on the residual error processing result by the upsampling module to obtain a second upsampling result;
a fusion module 306, configured to fuse the first upsampling result and the second upsampling result to obtain a fusion result;
a reconstruction module 307, configured to perform super-resolution reconstruction on the fusion result through the reconstruction module 307 to obtain a second image, where a resolution of the second image is greater than a resolution of the first image.
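To make the data flow between these modules concrete, the following is a minimal sketch assuming placeholder module internals, nearest-neighbour upsampling and channel concatenation as the fusion operation; the patent does not fix these details, so they are illustrative choices.

```python
import torch
import torch.nn as nn

class SRNet(nn.Module):
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.feature_extraction = nn.Conv2d(3, channels, 3, padding=1)
        self.residual = nn.Sequential(*[nn.Conv2d(channels, channels, 3, padding=1)
                                        for _ in range(4)])            # residual module placeholder
        self.up_feat = nn.Upsample(scale_factor=scale, mode='nearest')  # upsampling branch 1
        self.up_res = nn.Upsample(scale_factor=scale, mode='nearest')   # upsampling branch 2
        self.reconstruction = nn.Conv2d(2 * channels, 3, 3, padding=1)

    def forward(self, x):
        feat = self.feature_extraction(x)        # feature extraction result
        res = self.residual(feat)                # residual processing result
        up1 = self.up_feat(feat)                 # first upsampling result
        up2 = self.up_res(res)                   # second upsampling result
        fused = torch.cat([up1, up2], dim=1)     # fusion result (channel concat assumed)
        return self.reconstruction(fused)        # second image (higher resolution)

sr = SRNet()
out = sr(torch.rand(1, 3, 32, 32))
print(out.shape)   # torch.Size([1, 3, 128, 128])
```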
In some embodiments, the residual module includes a plurality of residual units, and the residual error processing module 304 is specifically configured to:
inputting the feature extraction result into a first residual error unit for self-adaptive residual error processing, and outputting a self-adaptive residual error result;
and inputting the self-adaptive residual error result into a second residual error unit for self-adaptive residual error processing, and so on, until the output result of the second-to-last residual error unit is input into the last residual error unit for self-adaptive residual error processing, and outputting the residual error processing result.
In some embodiments, the residual error unit includes a plurality of residual error weight units, and the residual error processing module 304 is specifically configured to:
inputting the feature extraction result into a first residual error weight unit for residual error fusion processing, and outputting a residual error fusion result;
splicing the residual fusion result and the feature extraction result and inputting the spliced result into a second residual weight unit for residual fusion processing, and so on, until the output result of the second-to-last residual weight unit is spliced with its input data and input into the last residual weight unit for residual fusion processing;
splicing the output result and the input data of the last residual weight unit to obtain a splicing result;
multiplying the splicing result by a first preset coefficient to obtain a first product result, and multiplying the feature extraction result by a second preset coefficient to obtain a second product result, wherein the first preset coefficient and the second preset coefficient are updated along with a pre-training process;
and splicing the first product result and the second product result to obtain the self-adaptive residual error result.
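A hedged sketch of this adaptive combination is shown below. It assumes the two preset coefficients are learnable scalar parameters and interprets the final "splicing" of the two scaled results as an element-wise addition; channel concatenation would be an alternative reading of the text.

```python
import torch
import torch.nn as nn

class AdaptiveResidualUnit(nn.Module):
    def __init__(self, body: nn.Module):
        super().__init__()
        self.body = body                                   # chain of residual weight units
        self.coef_body = nn.Parameter(torch.tensor(1.0))   # first preset coefficient (learned)
        self.coef_skip = nn.Parameter(torch.tensor(1.0))   # second preset coefficient (learned)

    def forward(self, feat):
        # Scale the processed branch and the input branch, then combine them.
        return self.coef_body * self.body(feat) + self.coef_skip * feat
```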
In some embodiments, the residual weight unit includes m residual convolution units, where m is a positive integer, and the residual error processing module 304 is specifically configured to:
inputting the feature extraction result into a first residual convolution unit for residual extraction processing, and outputting a residual extraction result;
inputting the residual error extraction result and the feature extraction result into a second residual error convolution unit for residual error extraction processing;
by analogy, inputting the input data of the first m-1 residual convolution units and the processing result of the (m-1) th residual convolution unit into the mth residual convolution unit for residual extraction processing;
performing convolution processing on the output result of the mth residual convolution unit and the input data of the first m-1 residual convolution units to obtain a convolution processing result;
multiplying the convolution processing result by a first preset coefficient to obtain a third product result, and multiplying the feature extraction result by a second preset coefficient to obtain a fourth product result, wherein the first preset coefficient and the second preset coefficient are updated along with a preset training process;
and splicing the third product result and the fourth product result to obtain a residual error fusion result.
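The sketch below illustrates one way such a residual weight unit could be assembled from m residual convolution units with dense connections, a fusing convolution and learnable scaling. Kernel sizes, channel counts, the LeakyReLU activation and the interpretation of "splicing" as channel concatenation are assumptions.

```python
import torch
import torch.nn as nn

class ResidualConvUnit(nn.Module):
    # first convolution layer -> activation layer -> second convolution layer
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
        )

    def forward(self, x):
        return self.block(x)

class ResidualWeightUnit(nn.Module):
    def __init__(self, channels=64, m=3):
        super().__init__()
        # Each unit receives the concatenation of the block input and all previous outputs.
        self.units = nn.ModuleList(
            ResidualConvUnit(channels * (i + 1), channels) for i in range(m)
        )
        self.fuse = nn.Conv2d(channels * (m + 1), channels, 1)   # 1x1 fusion convolution
        self.coef_body = nn.Parameter(torch.tensor(0.2))          # first preset coefficient (learned)
        self.coef_skip = nn.Parameter(torch.tensor(1.0))          # second preset coefficient (learned)

    def forward(self, x):
        feats = [x]
        for unit in self.units:
            feats.append(unit(torch.cat(feats, dim=1)))
        fused = self.fuse(torch.cat(feats, dim=1))
        return self.coef_body * fused + self.coef_skip * x
```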
In some embodiments, the residual convolution unit includes an activation layer, a first convolution layer and a second convolution layer, and the residual error processing module 304 is specifically configured to:
and processing the feature extraction result sequentially through the first convolution layer, the activation layer and the second convolution layer to obtain a residual extraction result.
In some embodiments, the upsampling module includes n upsampling units, where n is a positive even number greater than or equal to 4, and the sampling processing module 305 is specifically configured to:
performing upsampling processing on the feature extraction result through a first upsampling unit;
performing upsampling processing on the output result of the first upsampling unit through the third upsampling unit, and so on until the output result of the (n-3) th upsampling unit is subjected to upsampling processing by the (n-1) th upsampling unit to obtain a first upsampling result;
performing upsampling processing on the residual error processing result through a second upsampling unit;
splicing the output result of the first up-sampling unit and the output result of the second up-sampling unit to obtain a sampling splicing result, and performing up-sampling processing on the sampling splicing result through a fourth up-sampling unit;
and repeating the steps until the output result of the (n-3) th upsampling unit and the output result of the (n-2) th upsampling unit are spliced to obtain a sampling fusion result, and performing upsampling processing on the sampling fusion result through the nth upsampling unit to obtain a second upsampling processing result.
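For n = 4, the two interleaved branches could be wired as in the sketch below; the doubling factor per unit, the convolution sizes and the use of channel concatenation for the splicing step are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def up_unit(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.Upsample(scale_factor=2, mode='nearest'))

class UpsamplingModule(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.unit1 = up_unit(channels, channels)       # feature branch, stage 1
        self.unit2 = up_unit(channels, channels)       # residual branch, stage 1
        self.unit3 = up_unit(channels, channels)       # feature branch, stage 2
        self.unit4 = up_unit(2 * channels, channels)   # residual branch, stage 2 (takes spliced input)

    def forward(self, feat, res):
        u1 = self.unit1(feat)                           # upsample the feature extraction result
        u2 = self.unit2(res)                            # upsample the residual processing result
        first_up = self.unit3(u1)                       # first upsampling result
        second_up = self.unit4(torch.cat([u1, u2], 1))  # second upsampling result
        return first_up, second_up

mod = UpsamplingModule()
a, b = mod(torch.rand(1, 64, 32, 32), torch.rand(1, 64, 32, 32))
print(a.shape, b.shape)   # both torch.Size([1, 64, 128, 128])
```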
In some embodiments, the upsampling unit includes a plurality of upsampling subunits and a splicing subunit, and the sample processing module 305 is specifically configured to:
performing multi-scale sampling processing on the feature extraction result through the plurality of up-sampling subunits to obtain a plurality of scale sampling results;
and splicing the multiple scale sampling results through the splicing subunit.
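One possible reading of the multi-scale sampling subunits is sketched below: parallel upsampling paths with different interpolation behaviour whose outputs are spliced (concatenated) along the channel dimension. The added 1x1 convolution that restores the channel count is an extra assumption not stated in the patent.

```python
import torch
import torch.nn as nn

class MultiScaleUpsampleUnit(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        # Two upsampling subunits with different interpolation modes (illustrative).
        self.sub_nearest = nn.Upsample(scale_factor=2, mode='nearest')
        self.sub_bilinear = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.splice = nn.Conv2d(2 * channels, channels, kernel_size=1)  # fuse the spliced result

    def forward(self, x):
        spliced = torch.cat([self.sub_nearest(x), self.sub_bilinear(x)], dim=1)
        return self.splice(spliced)
```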
In some embodiments, the image super-resolution reconstruction apparatus further includes a pre-training unit 308, where the pre-training unit 308 is configured to:
acquiring an image sample group, wherein the image sample group comprises a first image sample and a second image sample, and the resolution of the first image sample is greater than that of the second image sample;
constructing a discrimination network corresponding to the image super-resolution reconstruction network;
and performing pre-training on the image super-resolution reconstruction network according to the image sample group, and performing combined training on the image super-resolution reconstruction network and the discrimination network according to the image sample group.
In some embodiments, the pre-training unit 308 is specifically configured to:
performing image super-resolution reconstruction on the second image sample through the image super-resolution reconstruction network to obtain a reconstructed image corresponding to the second image sample;
obtaining the generation loss of the image super-resolution reconstruction network according to the first image sample, the reconstructed image and the discrimination network;
obtaining the discrimination loss of the discrimination network according to the first image sample and the reconstructed image;
and performing joint training on the image super-resolution reconstruction network and the discrimination network according to the generation loss and the discrimination loss by taking network parameter convergence as constraint.
In some embodiments, the pre-training unit 308 is specifically configured to:
obtaining the reconstruction loss of the image super-resolution reconstruction network according to the first image sample and the reconstructed image;
calculating the countermeasure loss of the image super-resolution reconstruction network according to the similarity between the first image sample and the reconstructed image through the discrimination network;
and obtaining the generation loss of the image super-resolution reconstruction network according to the reconstruction loss and the confrontation loss.
In some embodiments, the pre-training unit 308 is specifically configured to:
extracting the characteristics of the first image sample and the reconstructed image according to the frequency to obtain characteristic information;
acquiring high-frequency loss and low-frequency loss between the first image sample and the reconstructed image according to the characteristic information;
calculating the total image loss of the first image sample and the reconstructed image according to the correlation between the first image sample and the reconstructed image;
and obtaining the reconstruction loss of the image super-resolution reconstruction network according to the high-frequency loss, the low-frequency loss and the full-image loss.
In some embodiments, the pre-training unit 308 is specifically configured to:
respectively giving corresponding weights to the high-frequency loss, the low-frequency loss, the full-image loss and the countermeasure loss;
and adding the weighted high-frequency loss, low-frequency loss, full-image loss and countervailing loss to obtain the generation loss of the generation network.
In some embodiments, the pre-training unit 308 is specifically configured to:
acquiring a plurality of first image samples;
performing downsampling processing on each first image sample to obtain a second image sample corresponding to each first image sample;
each of the first image samples and its corresponding second image sample are set as the set of image samples.
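A minimal sketch of building such a pair is shown below, assuming bicubic downsampling by a factor of 4; the factor and interpolation mode are illustrative choices rather than values from the patent.

```python
import torch
import torch.nn.functional as F

def make_sample_pair(hr_img: torch.Tensor, scale: int = 4):
    # hr_img: first image sample, shape (N, 3, H, W) with H and W divisible by scale.
    lr_img = F.interpolate(hr_img, scale_factor=1.0 / scale,
                           mode='bicubic', align_corners=False)   # second image sample
    return hr_img, lr_img

hr, lr = make_sample_pair(torch.rand(1, 3, 128, 128))
print(lr.shape)   # torch.Size([1, 3, 32, 32])
```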
As can be seen from the above description, the embodiment of the present application provides an image super-resolution reconstruction apparatus, where an obtaining module 301 is configured to obtain a first image that needs to be subjected to super-resolution reconstruction; an obtaining network module 302, configured to obtain a pre-trained image super-resolution reconstruction network, where the image super-resolution reconstruction network includes a feature extraction module 303, a residual error module, an upsampling module, and a reconstruction module 307; a feature extraction module 303, configured to perform feature extraction on the first image through the feature extraction module 303 to obtain a feature extraction result; a residual error processing module 304, configured to perform residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result; a sampling processing module 305, configured to perform upsampling on the feature extraction result by the upsampling module to obtain a first upsampling result, and perform upsampling on the residual error processing result by the upsampling module to obtain a second upsampling result; a fusion module 306, configured to fuse the first upsampling result and the second upsampling result to obtain a fusion result; a reconstruction module 307, configured to perform super-resolution reconstruction on the fusion result through the reconstruction module 307 to obtain a second image, where a resolution of the second image is greater than a resolution of the first image. The image super-resolution reconstruction device provided by the embodiment of the application consists of a plurality of modules, so that the super-resolution reconstruction can be performed on the image from multiple scales in the reconstruction process, and the accuracy of the image super-resolution reconstruction is improved.
The embodiment of the application also provides an electronic device 400. Referring to fig. 14, the electronic device 400 includes a processor 401 and a memory. The processor 401 is electrically connected to the memory.
The processor 401 is the control center of the electronic device 400. It connects the various parts of the entire electronic device using various interfaces and lines, performs the various functions of the electronic device 400 by running or loading a computer program stored in the memory 402, and processes the data stored in the memory 402, thereby monitoring the electronic device 400 as a whole.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and performs data processing by running the computer programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the electronic device, and the like. Further, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 with access to the memory 402.
In this embodiment, the processor 401 in the electronic device 400 loads instructions corresponding to one or more processes of the computer program into the memory 402 according to the following steps, and the processor 401 runs the computer program stored in the memory 402, so as to implement various functions, as follows:
acquiring a first image needing super-resolution reconstruction; acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module; performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result; performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result; performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result; fusing the first up-sampling result and the second up-sampling result to obtain a fused result; and performing super-resolution reconstruction on the fusion result through a reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
As can be seen from the above, the electronic device in the embodiment of the application acquires a first image to be subjected to super-resolution reconstruction and a pre-trained image super-resolution reconstruction network, and processes the first image through the feature extraction module, the residual error module, the upsampling module and the reconstruction module in the image super-resolution reconstruction network to obtain a second image whose resolution is greater than that of the first image. The image super-resolution reconstruction network provided by the embodiment of the application uses the upsampling module to sample the image data at multiple scales, which effectively improves the prediction precision and thus the accuracy of the image super-resolution reconstruction.
Referring also to fig. 15, in some embodiments, the electronic device 400 may further include: a display 403, radio frequency circuitry 404, audio circuitry 405, and a power supply 406. The display 403, the rf circuit 404, the audio circuit 405, and the power source 406 are electrically connected to the processor 401.
The display 403 may be used to display information entered by or provided to the user as well as various graphical user interfaces, which may be made up of graphics, text, icons, video, and any combination thereof. The display 403 may include a display panel, and in some embodiments, the display panel may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The radio frequency circuit 404 may be used for transceiving radio frequency signals to establish wireless communication with a network device or other electronic devices, and for transceiving signals with the network device or the other electronic devices.
The audio circuit 405 may be used to provide an audio interface between a user and an electronic device through a speaker, microphone.
The power source 406 may be used to power various components of the electronic device 400. In some embodiments, power supply 406 may be logically coupled to processor 401 via a power management system, such that functions to manage charging, discharging, and power consumption management are performed via the power management system.
Although not shown, the electronic device 400 may further include a camera, a bluetooth module, and the like, which are not described in detail herein.
Embodiments of the present application further provide a storage medium storing a computer program which, when run on a computer, causes the computer to execute the image super-resolution reconstruction method in any one of the above embodiments, for example: acquiring a first image needing super-resolution reconstruction; acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module; performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result; performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result; performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result; fusing the first upsampling result and the second upsampling result to obtain a fusion result; and performing super-resolution reconstruction on the fusion result through the reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
In the embodiment of the present application, the storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, for the image super-resolution reconstruction method of the embodiment of the present application, it can be understood by those skilled in the art that all or part of the process for implementing the image super-resolution reconstruction method of the embodiment of the present application can be completed by controlling the relevant hardware through a computer program, where the computer program can be stored in a computer readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and during the execution, the process may include the process of the embodiment of the image super-resolution reconstruction method. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
In the image super-resolution reconstruction apparatus according to the embodiment of the present application, each functional module may be integrated into one processing chip, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented as a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium such as a read-only memory, a magnetic or optical disk, or the like.
The term "module" as used herein may be considered a software object executing on the computing system. The various components, modules, engines, and services described herein may be viewed as objects implemented on the computing system. The apparatus and method described herein are preferably implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The above detailed description is provided for the image super-resolution reconstruction method, apparatus, storage medium and electronic device provided in the embodiments of the present application, and a specific example is applied in the present application to explain the principle and the implementation of the present application, and the description of the above embodiments is only used to help understanding the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (16)

1. An image super-resolution reconstruction method is characterized by comprising the following steps:
acquiring a first image needing super-resolution reconstruction;
acquiring a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module;
performing feature extraction on the first image through the feature extraction module to obtain a feature extraction result;
performing residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result;
performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result;
fusing the first up-sampling result and the second up-sampling result to obtain a fused result;
and performing super-resolution reconstruction on the fusion result through a reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
2. The image super-resolution reconstruction method of claim 1, wherein the residual module comprises a plurality of residual units, and the residual processing of the feature extraction result by the residual module to obtain a residual processing result comprises:
inputting the feature extraction result into a first residual error unit for self-adaptive residual error processing, and outputting a self-adaptive residual error result;
and inputting the self-adaptive residual error result into a second residual error unit for self-adaptive residual error processing, and so on, until the output result of the second-to-last residual error unit is input into the last residual error unit for self-adaptive residual error processing, and outputting a residual error processing result.
3. The method of claim 2, wherein the residual error unit comprises a plurality of residual error weight units, and the inputting of the feature extraction result into the first residual error unit for self-adaptive residual error processing and the outputting of the self-adaptive residual error result comprises:
inputting the feature extraction result into a first residual error weight unit for residual error fusion processing, and outputting a residual error fusion result;
splicing the residual fusion result and the feature extraction result and inputting the spliced result into a second residual weight unit for residual fusion processing, and so on, until the output result of the second-to-last residual weight unit is spliced with its input data and input into the last residual weight unit for residual fusion processing;
splicing the output result and the input data of the last residual weight unit to obtain a splicing result;
multiplying the splicing result by a first preset coefficient to obtain a first product result, and multiplying the feature extraction result by a second preset coefficient to obtain a second product result, wherein the first preset coefficient and the second preset coefficient are updated along with a pre-training process;
and splicing the first product result and the second product result to obtain the self-adaptive residual error result.
4. The image super-resolution reconstruction method of claim 3, wherein the residual weighting unit includes m residual convolution units, where m is a positive integer, and the residual fusion of the feature extraction results by the first residual weighting unit to obtain a residual fusion result includes:
inputting the feature extraction result into a first residual convolution unit for residual extraction processing, and outputting a residual extraction result;
inputting the residual error extraction result and the feature extraction result into a second residual error convolution unit for residual error extraction processing;
by analogy, inputting the input data of the first m-1 residual convolution units and the processing result of the (m-1) th residual convolution unit into the mth residual convolution unit for residual extraction processing;
performing convolution processing on the output result of the mth residual convolution unit and the input data of the first m-1 residual convolution units to obtain a convolution processing result;
multiplying the convolution processing result by a first preset coefficient to obtain a third product result, and multiplying the feature extraction result by a second preset coefficient to obtain a fourth product result, wherein the first preset coefficient and the second preset coefficient are updated along with a preset training process;
and splicing the third product result and the fourth product result to obtain a residual error fusion result.
5. The image super-resolution reconstruction method according to claim 1, wherein the residual convolution unit includes an activation layer, a first convolution layer and a second convolution layer, and the inputting of the feature extraction result into the first residual convolution unit for residual extraction processing and the outputting of the residual extraction result comprises:
and processing the feature extraction result sequentially through the first convolution layer, the activation layer and the second convolution layer to obtain a residual extraction result.
6. The image super-resolution reconstruction method according to claim 1, wherein the upsampling module comprises n upsampling units, where n is a positive even number greater than or equal to 4, and the upsampling module performs upsampling on the feature extraction result to obtain a first upsampling result, and the method comprises:
performing upsampling processing on the feature extraction result through a first upsampling unit;
performing upsampling processing on the output result of the first upsampling unit through the third upsampling unit, and so on until the output result of the (n-3) th upsampling unit is subjected to upsampling processing by the (n-1) th upsampling unit to obtain a first upsampling result;
the upsampling module upsamples the residual error processing result to obtain a second upsampling result, including:
performing upsampling processing on the residual error processing result through a second upsampling unit;
splicing the output result of the first up-sampling unit and the output result of the second up-sampling unit to obtain a sampling splicing result, and performing up-sampling processing on the sampling splicing result through a fourth up-sampling unit;
and repeating the steps until the output result of the (n-3) th upsampling unit and the output result of the (n-2) th upsampling unit are spliced to obtain a sampling fusion result, and performing upsampling processing on the sampling fusion result through the nth upsampling unit to obtain a second upsampling processing result.
7. The image super-resolution reconstruction method according to claim 6, wherein the upsampling unit comprises a plurality of upsampling sub-units and a stitching sub-unit, and the upsampling processing on the feature extraction result by the first upsampling unit comprises:
performing multi-scale sampling processing on the feature extraction result through the plurality of up-sampling subunits to obtain a plurality of scale sampling results;
and splicing the multiple scale sampling results through the splicing subunit.
8. The image super-resolution reconstruction method according to claim 1, wherein before the acquiring the first image to be super-resolution reconstructed, the method further comprises:
acquiring an image sample group, wherein the image sample group comprises a first image sample and a second image sample, and the resolution of the first image sample is greater than that of the second image sample;
constructing a discrimination network corresponding to the image super-resolution reconstruction network;
and performing pre-training on the image super-resolution reconstruction network according to the image sample group, and performing combined training on the image super-resolution reconstruction network and the discrimination network according to the image sample group.
9. The image super-resolution reconstruction method of claim 8, wherein the jointly training the image super-resolution reconstruction network and the discrimination network according to the image sample group comprises:
performing image super-resolution reconstruction on the second image sample through the image super-resolution reconstruction network to obtain a reconstructed image corresponding to the second image sample;
obtaining the generation loss of the image super-resolution reconstruction network according to the first image sample, the reconstructed image and the discrimination network;
obtaining the discrimination loss of the discrimination network according to the first image sample and the reconstructed image;
and performing joint training on the image super-resolution reconstruction network and the discrimination network according to the generation loss and the discrimination loss by taking network parameter convergence as constraint.
10. The method of claim 9, wherein obtaining the generation loss of the image super-resolution reconstruction network according to the first image sample, the reconstructed image and the discrimination network comprises:
obtaining the reconstruction loss of the image super-resolution reconstruction network according to the first image sample and the reconstructed image;
calculating the countermeasure loss of the image super-resolution reconstruction network according to the similarity between the first image sample and the reconstructed image through the discrimination network;
and obtaining the generation loss of the image super-resolution reconstruction network according to the reconstruction loss and the confrontation loss.
11. The image super-resolution reconstruction method according to claim 10, wherein obtaining the reconstruction loss of the image super-resolution reconstruction network from the first image sample and the reconstructed image comprises:
extracting the characteristics of the first image sample and the reconstructed image according to the frequency to obtain characteristic information;
acquiring high-frequency loss and low-frequency loss between the first image sample and the reconstructed image according to the characteristic information;
calculating the total image loss of the first image sample and the reconstructed image according to the correlation between the first image sample and the reconstructed image;
and obtaining the reconstruction loss of the image super-resolution reconstruction network according to the high-frequency loss, the low-frequency loss and the full-image loss.
12. The image super-resolution reconstruction method according to claim 11, wherein obtaining the generation loss of the generation network from the image super-resolution reconstruction loss and the countermeasure loss comprises:
respectively giving corresponding weights to the high-frequency loss, the low-frequency loss, the full-image loss and the countermeasure loss;
and adding the weighted high-frequency loss, low-frequency loss, full-image loss and countervailing loss to obtain the generation loss of the generation network.
13. The image super-resolution reconstruction method according to claim 8, wherein the acquiring the image sample group comprises:
acquiring a plurality of first image samples;
performing downsampling processing on each first image sample to obtain a second image sample corresponding to each first image sample;
each of the first image samples and its corresponding second image sample are set as the set of image samples.
14. An image super-resolution reconstruction apparatus, comprising:
the acquisition module is used for acquiring a first image needing super-resolution reconstruction;
the obtaining network module is configured to obtain a pre-trained image super-resolution reconstruction network, wherein the image super-resolution reconstruction network comprises a feature extraction module, a residual error module, an up-sampling module and a reconstruction module;
the characteristic extraction module is used for extracting the characteristics of the first image through the characteristic extraction module to obtain a characteristic extraction result;
the residual error processing module is used for carrying out residual error processing on the feature extraction result through the residual error module to obtain a residual error processing result;
the sampling processing module is used for performing upsampling processing on the feature extraction result through the upsampling module to obtain a first upsampling result, and performing upsampling processing on the residual error processing result through the upsampling module to obtain a second upsampling result;
the fusion module is used for fusing the first up-sampling result and the second up-sampling result to obtain a fusion result;
and the reconstruction module is used for performing super-resolution reconstruction on the fusion result through the reconstruction module to obtain a second image, wherein the resolution of the second image is greater than that of the first image.
15. A storage medium having stored thereon a computer program, characterized in that it executes the image super-resolution reconstruction method according to any of claims 1 to 13 when being loaded by a processor.
16. An electronic device comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to execute the image super-resolution reconstruction method according to any one of claims 1 to 13 by loading the computer program.
CN202011455129.3A 2020-12-10 2020-12-10 Image super-resolution reconstruction method and device, storage medium and electronic equipment Pending CN112488923A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011455129.3A CN112488923A (en) 2020-12-10 2020-12-10 Image super-resolution reconstruction method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011455129.3A CN112488923A (en) 2020-12-10 2020-12-10 Image super-resolution reconstruction method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN112488923A true CN112488923A (en) 2021-03-12

Family

ID=74916706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011455129.3A Pending CN112488923A (en) 2020-12-10 2020-12-10 Image super-resolution reconstruction method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112488923A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113160055A (en) * 2021-04-07 2021-07-23 哈尔滨理工大学 Image super-resolution reconstruction method based on deep learning
CN113298900A (en) * 2021-04-30 2021-08-24 北京航空航天大学 Processing method based on low signal-to-noise ratio PET image
CN113298900B (en) * 2021-04-30 2022-10-25 北京航空航天大学 Processing method based on low signal-to-noise ratio PET image
CN113379606A (en) * 2021-08-16 2021-09-10 之江实验室 Face super-resolution method based on pre-training generation model
WO2023045297A1 (en) * 2021-09-22 2023-03-30 深圳市中兴微电子技术有限公司 Image super-resolution method and apparatus, and computer device and readable medium
CN114419339A (en) * 2022-03-30 2022-04-29 南方电网数字电网研究院有限公司 Method and device for training data reconstruction model based on electric power portrait
CN117057606A (en) * 2023-08-15 2023-11-14 广州地铁设计研究院股份有限公司 Risk prediction model training method, risk prediction method and related equipment

Similar Documents

Publication Publication Date Title
Lan et al. MADNet: a fast and lightweight network for single-image super resolution
CN113240580B (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
US11636283B2 (en) Committed information rate variational autoencoders
CN108876792B (en) Semantic segmentation method, device and system and storage medium
Chen et al. Learning spatial attention for face super-resolution
CN112488923A (en) Image super-resolution reconstruction method and device, storage medium and electronic equipment
US11403838B2 (en) Image processing method, apparatus, equipment, and storage medium to obtain target image features
CN111369440B (en) Model training and image super-resolution processing method, device, terminal and storage medium
CN111667399B (en) Training method of style migration model, video style migration method and device
US11514694B2 (en) Teaching GAN (generative adversarial networks) to generate per-pixel annotation
CN111104962A (en) Semantic segmentation method and device for image, electronic equipment and readable storage medium
CN111192292A (en) Target tracking method based on attention mechanism and twin network and related equipment
AU2021354030B2 (en) Processing images using self-attention based neural networks
CN112541864A (en) Image restoration method based on multi-scale generation type confrontation network model
CN111626932A (en) Super-resolution reconstruction method and device for image
CN112132741A (en) Conversion method and system of face photo image and sketch image
CN116071300A (en) Cell nucleus segmentation method based on context feature fusion and related equipment
US20220301106A1 (en) Training method and apparatus for image processing model, and image processing method and apparatus
CN110489584B (en) Image classification method and system based on dense connection MobileNet model
CN115496651A (en) Feature processing method and device, computer-readable storage medium and electronic equipment
Zhang et al. Lightweight Portrait Segmentation Via Edge-Optimized Attention
CN114049634B (en) Image recognition method and device, computer equipment and storage medium
CN113792862B (en) Design method for generating countermeasure network based on correction chart of cascade attention mechanism
Jia et al. Learning Rich Information for Quad Bayer Remosaicing and Denoising
US20230298326A1 (en) Image augmentation method, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination