CN116664409B - Image super-resolution reconstruction method, device, computer equipment and storage medium - Google Patents


Info

Publication number
CN116664409B
CN116664409B (application CN202310959247.5A)
Authority
CN
China
Prior art keywords
image
resolution
feature
sparse
deep
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310959247.5A
Other languages
Chinese (zh)
Other versions
CN116664409A (en)
Inventor
赵旭
汪婧
李德建
刘晗
李正浩
崔丙锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Smartchip Microelectronics Technology Co Ltd
Original Assignee
Beijing Smartchip Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Smartchip Microelectronics Technology Co Ltd filed Critical Beijing Smartchip Microelectronics Technology Co Ltd
Priority to CN202310959247.5A priority Critical patent/CN116664409B/en
Publication of CN116664409A publication Critical patent/CN116664409A/en
Application granted granted Critical
Publication of CN116664409B publication Critical patent/CN116664409B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4046Scaling the whole image or part thereof using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/54Extraction of image or video features relating to texture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses an image super-resolution reconstruction method, an image super-resolution reconstruction device, computer equipment and a storage medium. Firstly, an initial resolution image is input into a linear activation module of a super-resolution reconstruction model to obtain shallow image features of the initial resolution image; secondly, feature extraction is performed on the shallow image features through a first lightweight feature extraction module to obtain first deep image features; then, mask setting is performed on each element in the first deep image feature through the fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask; next, ghost feature extraction is performed through a ghost feature extraction branch, based on the fine-granularity sparse mask and the first deep image feature, to obtain a corresponding second deep image feature; and finally, super-resolution reconstruction is performed according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image. This improves on the related-art practice of using a unified mask across all channels.

Description

Image super-resolution reconstruction method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and apparatus for reconstructing super resolution of an image, a computer device, and a storage medium.
Background
With the development of image processing technology, image super-resolution techniques have emerged that reconstruct an observed low-resolution image into a corresponding high-resolution image, improving the resolution of the original image. Because this technology meets the need for high-definition display of images, it has important application value in fields such as monitoring equipment, satellite imagery, and medical imaging.
In the related art, unnecessary computation may be skipped by using a mask. However, the manner in which a unified mask is used across all channels remains to be improved.
Disclosure of Invention
The embodiments of the present specification aim to solve at least one of the technical problems in the related art to some extent. For this reason, the embodiments of the present specification provide an image super-resolution reconstruction method, apparatus, computer device, and storage medium.
The embodiment of the specification provides an image super-resolution reconstruction method, which comprises the following steps:
inputting the initial resolution image into a linear activation module of the super resolution reconstruction model to obtain shallow image features of the initial resolution image; the super-resolution reconstruction model comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence; the second lightweight feature extraction module comprises a fine-granularity sparse mask branch and a ghost feature extraction branch which are parallel;
Performing feature extraction on the shallow image features through the first lightweight feature extraction module to obtain first deep image features;
mask setting is carried out on each element in the first deep image feature output by the first lightweight feature extraction module through the fine-granularity sparse mask branch, so that a corresponding fine-granularity sparse mask is obtained;
performing ghost feature extraction on the basis of the fine-granularity sparse mask and the first deep image feature through the ghost feature extraction branch to obtain a corresponding second deep image feature;
performing super-resolution reconstruction according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image; wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
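As a rough illustration, the five steps above can be strung together as follows. Every module body here is a caller-supplied toy stand-in (the patent does not specify layer internals), and nearest-neighbour upsampling via `np.kron` replaces whatever upsampler the model actually uses — this is a data-flow sketch, not the patented implementation.

```python
import numpy as np

def super_resolve(lr_image, modules, scale=2):
    shallow = modules["linear_activation"](lr_image)  # S110: shallow features (C, H, W)
    deep1 = modules["light_extract"](shallow)         # S120: first deep image features
    mask = modules["fgsm"](deep1)                     # S130: fine-granularity sparse mask
    deep2 = modules["gfem"](deep1, mask)              # ghost feature extraction
    detail = (deep2 + shallow).mean(axis=0)           # residual connection (channels collapsed)
    # global residual with the input image, then toy nearest-neighbour upsampling
    return np.kron(lr_image + detail, np.ones((scale, scale)))

# Toy stand-ins for each module, just to exercise the data flow.
mods = {
    "linear_activation": lambda x: np.stack([x] * 4),               # expand to 4 channels
    "light_extract":     lambda f: np.maximum(f, 0.0),              # ReLU-ish extraction
    "fgsm":              lambda f: (f > f.mean()).astype(f.dtype),  # per-element mask
    "gfem":              lambda f, m: f * m,                        # masked (sparse) features
}
lr = np.random.default_rng(0).random((8, 8))
hr = super_resolve(lr, mods)   # (16, 16): twice the input resolution
```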
In one embodiment, the performing super-resolution reconstruction according to the initial resolution image, the shallow layer image feature and the second deep layer image feature to obtain a target resolution image includes:
residual connection is carried out on the second deep image characteristic and the shallow image characteristic to obtain a detail image characteristic;
Performing dimension reduction processing on the detail image features through a first simple linear block to obtain dimension reduction image features;
and carrying out residual connection and up-sampling processing on the basis of the initial resolution image and the dimension-reduced image characteristics to obtain the target resolution image.
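The text does not pin the up-sampling processing to a particular operator; one common choice in lightweight super-resolution, assumed here purely for illustration, is sub-pixel (pixel-shuffle) upsampling:

```python
import numpy as np

def pixel_shuffle(x, r):
    # Depth-to-space: (C*r*r, H, W) -> (C, H*r, W*r). Each group of r*r input
    # channels is rearranged into an r x r spatial block of one output channel.
    crr, h, w = x.shape
    c = crr // (r * r)
    x = x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2)  # (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

x = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)  # 4 = 1 * 2 * 2 channels
y = pixel_shuffle(x, 2)                                  # -> (1, 6, 6)
```

With an operator like this, the target resolution image would come from applying it to the dimension-reduced features and adding an interpolated copy of the initial resolution image.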
In one embodiment, the linear activation module includes a second simple linear block and an activation module; inputting the initial resolution image into a linear activation module of a super resolution reconstruction model to obtain shallow image features of the initial resolution image, wherein the method comprises the following steps:
performing expansion input channel number processing on the initial resolution image through the second simple linear block to obtain high-dimensional image characteristics;
and operating the high-dimensional image features through the activation module to obtain the shallow image features.
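A minimal sketch of the linear activation module, assuming the "simple linear block" is a 1x1 convolution (a per-pixel linear map that expands the input channel number) and the activation is a ReLU; both operator choices are assumptions, since the patent does not name them:

```python
import numpy as np

def linear_activation(img, weight, bias):
    # img: (C_in, H, W); weight: (C_out, C_in) with C_out > C_in expands the
    # input channel number; bias: (C_out,). A 1x1 conv is exactly this map.
    feat = np.einsum("oi,ihw->ohw", weight, img) + bias[:, None, None]
    return np.maximum(feat, 0.0)  # ReLU stands in for the activation module

img = np.ones((3, 4, 4))                                        # 3-channel input
shallow = linear_activation(img, np.ones((8, 3)), np.zeros(8))  # (8, 4, 4)
```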
In one embodiment, the fine granularity sparse mask is obtained using the following formula: if the soft output F for the current element is larger than a preset value, a mask value of 1 is generated; if F is not larger than the preset value, a mask value of 0 is generated;
wherein the soft output of the fine-granularity sparse mask branch is

F = σ((x + g) / τ),

x denotes the current element, g denotes noise sampled from the Gumbel(0, 1) distribution, τ denotes the temperature parameter, and σ denotes the Sigmoid activation function.
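The rule above is the standard Gumbel-sigmoid relaxation. A sketch follows, where `thresh` plays the role of the preset value and the temperature `tau` is an assumed hyperparameter:

```python
import numpy as np

def fine_granularity_mask(x, tau=1.0, thresh=0.5, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.uniform(1e-9, 1.0 - 1e-9, size=np.shape(x))
    g = -np.log(-np.log(u))                                  # noise g ~ Gumbel(0, 1)
    soft = 1.0 / (1.0 + np.exp(-(np.asarray(x) + g) / tau))  # F = sigmoid((x+g)/tau)
    return (soft > thresh).astype(float)                     # 1 if F > preset value
```

At training time the soft output would typically be kept for gradient flow (a straight-through estimator); only the hardened mask is shown here.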
In one embodiment, the extracting the ghost feature by the branch of extracting the ghost feature, extracting the ghost feature based on the fine granularity sparse mask and the first deep image feature, to obtain a corresponding second deep image feature, includes:
utilizing the first deep image feature and the fine granularity sparse mask to multiply point by point to obtain sparse image features;
performing sparse convolution processing on the sparse image features to obtain convolved sparse image features;
adding the convolved sparse image features and the first deep image features to obtain first intermediate image features;
and carrying out ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature.
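A toy single-channel version of the four steps above. The "sparse convolution" is emulated by a dense convolution over the zeroed-out feature, which is numerically equivalent; a real implementation would skip the zeroed positions for speed:

```python
import numpy as np

def masked_sparse_conv(feat, mask, kernel):
    sparse = feat * mask                      # point-by-point multiplication
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(sparse, pad)              # zero padding keeps the size
    out = np.empty_like(feat)
    for i in range(feat.shape[0]):            # plain dense 2D convolution
        for j in range(feat.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out + feat                         # residual add of the input feature
```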
In one embodiment, the extracting the ghost feature based on the first intermediate image feature to obtain the second deep image feature includes:
performing depth separable convolution on the first intermediate image feature to obtain features whose channel number is adjusted to γ·C by the ghost rate γ, and performing residual connection along the channel dimension to obtain a second intermediate image feature with the channel dimension of C; wherein the value of γ is not more than 1;
and carrying out channel attention distribution processing and activating processing based on the second intermediate image feature to obtain the second deep image feature.
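A sketch of the ghost step and the subsequent channel attention. The cheap operation (a one-pixel shift) stands in for the depth-separable convolution, and the gate is a squeeze-and-excitation-style sigmoid; both are simplifications of whatever the patented model actually uses:

```python
import numpy as np

def ghost_step(x, gamma=0.5):
    # Regenerate gamma*C channels cheaply, keep the remaining channels as-is,
    # and concatenate back to channel dimension C (gamma <= 1, the ghost rate).
    g = int(gamma * x.shape[0])
    cheap = np.roll(x[:g], shift=1, axis=-1)  # stand-in for the cheap conv
    return np.concatenate([cheap, x[g:]], axis=0)

def channel_attention(x):
    s = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))  # per-channel sigmoid gate
    return x * s[:, None, None]

feat = np.ones((4, 3, 3))
out = channel_attention(ghost_step(feat))   # channel dimension stays 4
```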
In one embodiment, the generating process of the super-resolution reconstruction model includes:
performing bicubic interpolation on the first resolution image to obtain a second resolution image; wherein the resolution of the second resolution image is lower than the resolution of the first resolution image;
performing image enhancement processing on the second resolution image to obtain a training image sample set;
constructing a lightweight feature extraction module based on edge region features, texture region features and flat region features of the second resolution image; the lightweight characteristic extraction module comprises a fine-granularity sparse mask branch and a ghost characteristic extraction branch which are parallel; the fine-granularity sparse mask branch is used for generating a fine-granularity sparse mask of the current lightweight characteristic extraction module, and the ghost characteristic extraction branch is used for extracting compact ghost characteristics in the second resolution image;
sequentially connecting a plurality of lightweight characteristic extraction modules to construct an initial model; wherein each channel of the input of the lightweight feature extraction module is assigned a separate mask;
And training the initial model by using the training image sample set to obtain the super-resolution reconstruction model.
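The first two training-data steps can be sketched as follows; a 2x2 block mean stands in for bicubic interpolation, and flips plus a rotation illustrate the image enhancement (both are simplifications of the described processing):

```python
import numpy as np

def make_training_pairs(hr, scale=2):
    h, w = hr.shape
    # Downsample: block mean as a stand-in for bicubic interpolation.
    lr = hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    # Enhancement: flips and a rotation of the low-resolution image.
    return [(a, hr) for a in (lr, np.fliplr(lr), np.flipud(lr), np.rot90(lr))]

hr = np.arange(16, dtype=float).reshape(4, 4)
pairs = make_training_pairs(hr)             # 4 augmented (LR, HR) samples
```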
In one embodiment, the ghost feature extraction branch comprises a ghost linear block, and the fine-granularity sparse mask branch comprises a first simple linear block; wherein the ghost linear block comprises a first convolution layer and a second convolution layer which are adjacent to each other; the first simple linear block comprises a third convolution layer and a fourth convolution layer which are adjacent to each other; the training the initial model by using the training image sample set to obtain the super-resolution reconstruction model comprises the following steps:
training the initial model by using the training image sample set until a model training stopping condition is met;
after the initial model is trained, combining the first convolution layer and the second convolution layer into one convolution layer, and combining the third convolution layer and the fourth convolution layer into one convolution layer to obtain the super-resolution reconstruction model.
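The merge described above is standard structural re-parameterization: two adjacent convolutions with no nonlinearity between them compose into a single convolution. For 1x1 kernels (an assumed size; the patent does not state the kernel shapes) the merged weight is simply a matrix product:

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W); w: (C_out, C_in) -- a 1x1 convolution per pixel.
    return np.einsum("oi,ihw->ohw", w, x)

def merge_conv1x1(w1, w2):
    # Applying w1 then w2 equals applying the single merged kernel w2 @ w1.
    return w2 @ w1

rng = np.random.default_rng(1)
w1, w2 = rng.standard_normal((5, 3)), rng.standard_normal((4, 5))
x = rng.standard_normal((3, 2, 2))
merged_out = conv1x1(x, merge_conv1x1(w1, w2))
```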
The embodiment of the present specification provides an image super-resolution reconstruction apparatus, which includes:
the shallow image feature acquisition module is used for inputting the initial resolution image into the linear activation module of the super resolution reconstruction model to obtain shallow image features of the initial resolution image; the super-resolution reconstruction model comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence; the second lightweight feature extraction module comprises a fine-granularity sparse mask branch and a ghost feature extraction branch which are parallel;
The first deep layer extraction module is used for extracting the characteristics of the shallow image characteristics through the first lightweight characteristic extraction module to obtain first deep layer image characteristics;
the element mask setting module is used for carrying out mask setting on each element in the first deep image feature output by the first lightweight feature extraction module through the fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask;
the second deep layer extraction module is used for extracting the ghost features based on the fine granularity sparse mask and the first deep layer image features through the ghost feature extraction branches to obtain corresponding second deep layer image features;
the super-resolution reconstruction module is used for performing super-resolution reconstruction according to the initial resolution image, the shallow layer image features and the second deep layer image features to obtain a target resolution image; wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
In one embodiment, the fine granularity sparse mask is obtained using the following formula: if the soft output F for the current element is larger than a preset value, a mask value of 1 is generated; if F is not larger than the preset value, a mask value of 0 is generated;
wherein the soft output of the fine-granularity sparse mask branch is

F = σ((x + g) / τ),

x denotes the current element, g denotes noise sampled from the Gumbel(0, 1) distribution, τ denotes the temperature parameter, and σ denotes the Sigmoid activation function.
In one embodiment, the second deep layer extraction module is further configured to perform point-by-point multiplication with the fine-granularity sparse mask by using the first deep layer image feature to obtain a sparse image feature; performing sparse convolution processing on the sparse image features to obtain convolved sparse image features; adding the convolved sparse image features and the first deep image features to obtain first intermediate image features; and carrying out ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature.
In one embodiment, the second deep extraction module is further configured to perform depth separable convolution on the first intermediate image feature to obtain features whose channel number is adjusted to γ·C by the ghost rate γ, and perform residual connection along the channel dimension to obtain a second intermediate image feature with the channel dimension of C, wherein the value of γ is not more than 1; and carry out channel attention distribution processing and activation processing based on the second intermediate image feature to obtain the second deep image feature.

The present specification embodiment provides a computer apparatus, including: a memory, and one or more processors communicatively coupled to the memory; the memory stores instructions executable by the one or more processors to cause the one or more processors to implement the steps of the method of any of the embodiments described above.
The present description provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method according to any of the above embodiments.
The present description provides a computer program product comprising instructions which, when executed by a processor of a computer device, enable the computer device to perform the steps of the method of any one of the embodiments described above.
In the above embodiment, firstly, an initial resolution image is input into a linear activation module of a super-resolution reconstruction model to obtain shallow image features of the initial resolution image; secondly, feature extraction is performed on the shallow image features through a first lightweight feature extraction module to obtain first deep image features; then, mask setting is performed on each element in the first deep image feature through a fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask; next, ghost feature extraction is performed through a ghost feature extraction branch, based on the fine-granularity sparse mask and the first deep image feature, to obtain a corresponding second deep image feature; and finally, super-resolution reconstruction is performed according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image whose resolution is higher than that of the initial resolution image. Setting a mask for each element of the first deep image feature yields a fine-granularity sparse mask, and performing ghost feature extraction based on this mask improves on the related-art practice of using a unified mask across all channels.
Drawings
Fig. 1a is a schematic view of a scene of an image super-resolution reconstruction method according to an embodiment of the present disclosure;
fig. 1b is a schematic flow chart of an image super-resolution reconstruction method according to an embodiment of the present disclosure;
FIG. 1c is a schematic diagram of a lightweight feature extraction module provided by an embodiment of the present disclosure;
FIG. 2a is a schematic flow chart of determining a target resolution image according to an embodiment of the present disclosure;
fig. 2b is a schematic diagram of a method for implementing super-resolution reconstruction of an image according to an embodiment of the present disclosure;
FIG. 3a is a schematic flow chart of determining shallow image features according to an embodiment of the present disclosure;
fig. 3b is a schematic diagram of a method for implementing super-resolution reconstruction of an image according to another embodiment of the present disclosure;
FIG. 4a is a schematic flow chart of determining a second deep image feature according to an embodiment of the present disclosure;
FIG. 4b is a schematic diagram of a merged convolutional layer comprised by a ghost linear block provided in an embodiment of the present disclosure;
FIG. 5a is a schematic flow chart of determining a second deep image feature according to another embodiment of the present disclosure;
FIG. 5b is a schematic diagram of a ghost feature extraction branch provided in an embodiment of the present disclosure;
Fig. 6a is a schematic flow chart of a super-resolution reconstruction model generating process according to an embodiment of the present disclosure;
FIG. 6b is a schematic diagram of a lightweight feature extraction module provided by an embodiment of the present disclosure;
FIG. 7a is a schematic flow chart of determining a super-resolution reconstruction model according to an embodiment of the present disclosure;
FIG. 7b is a schematic diagram of adjacent convolution layers included in a first simple linear block provided by an embodiment of the present disclosure;
FIG. 7c is a schematic diagram of a combined convolution layer included in a first simple linear block provided by an embodiment of the present disclosure;
FIG. 7d is a schematic illustration of adjacent convolution layers included in a ghost-linear block provided in an embodiment of the present disclosure;
fig. 8 is a flowchart of an image super-resolution reconstruction method according to another embodiment of the present disclosure;
fig. 9 is a schematic diagram of an image super-resolution reconstruction device according to an embodiment of the present disclosure;
fig. 10 is an internal configuration diagram of a computer device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
With the development of image processing technology, image super-resolution techniques have emerged that reconstruct an observed low-resolution image into a corresponding high-resolution image, improving the resolution of the original image. Because this technology meets the need for high-definition display of images, it has important application value in fields such as monitoring equipment, satellite imagery, and medical imaging.
In the related art, image blocks can be input into neural network models of different complexity for image processing. Unnecessary computation may also be skipped by using a mask. However, feature redundancy varies across channels, so a uniform mask shared by all channels cannot capture the variability of the features on different channels. For example, SMSR (Sparse Mask Super-Resolution) proposes adding learnable masks in the spatial and channel dimensions, respectively, to skip positions that do not need to be calculated. SMSR employs a two-stage approach: it first determines important channels, then identifies important spatial regions through a mask shared across all channels.
Illustratively, since the work on SRCNN (the image super-resolution model based on a convolutional neural network), research on neural network algorithms for image super-resolution reconstruction has gradually deepened: SRResNet (Super-Resolution Residual Network) has 16 residual units, and the RCAN (Residual Channel Attention Network) framework reaches a neural network depth of 400 layers. Although improving the resolution of the original image through ever more complex networks has brought significant performance gains, deployment on resource-constrained devices (e.g., embedded end-side devices) may be limited by the large computational load and complexity. Moreover, SRCNN relies on the context of image regions, its training converges slowly, and the network applies only to a single scale. Therefore, how to exploit the redundancy and sparsity of neural network feature channels more effectively is the key to achieving a more efficient lightweight image super-resolution network.
Based on this, the present embodiment provides an image super-resolution reconstruction method. Firstly, inputting an initial resolution image into a linear activation module of a super resolution reconstruction model to obtain shallow image features of the initial resolution image; secondly, carrying out feature extraction on shallow image features through a first lightweight feature extraction module to obtain first deep image features; then, carrying out mask setting on each element in the first deep image feature through a fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask; then, through a ghost feature extraction branch, carrying out ghost feature extraction based on a fine-granularity sparse mask and the first deep image feature to obtain a corresponding second deep image feature; and finally, carrying out super-resolution reconstruction according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image with resolution higher than that of the initial resolution image. And carrying out mask setting on each element in the first deep image feature to obtain a fine-granularity sparse mask, and carrying out ghost feature extraction based on the fine-granularity sparse mask to realize improvement on a mode of using unified masks on all channels in the related technology.
The method provided in the embodiment of the present disclosure may be applied to the application scenario of fig. 1a, and the image super-resolution reconstruction method is applied to the terminal 110 and the server 120. The terminal 110 and the server 120 are connected through a network. The server 120 may be configured to construct a super-resolution reconstruction model, construct a training sample for the super-resolution reconstruction model, and train the initial super-resolution reconstruction model by using the training sample of the super-resolution reconstruction model until a model training stop condition is satisfied, thereby obtaining the super-resolution reconstruction model. The super-resolution reconstruction model comprises a plurality of first lightweight characteristic extraction modules and second lightweight characteristic extraction modules which are connected in sequence. The second lightweight class feature extraction module includes fine-grained sparse mask branches and ghost feature extraction branches in parallel. The fine-granularity sparse mask branch is used for generating fine-granularity sparse masks of the current lightweight feature extraction module, and the ghost feature extraction branch is used for extracting compact ghost features in the second resolution image. The trained super-resolution reconstruction model is deployed on the terminal 110. 
The terminal 110 inputs the initial resolution image into the linear activation module of the super-resolution reconstruction model to perform image super-resolution reconstruction. The terminal 110 can acquire shallow image features of the initial resolution image; perform feature extraction on the shallow image features through the first lightweight feature extraction module to obtain first deep image features; mask each element in the first deep image features through the fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask; perform ghost feature extraction based on the fine-granularity sparse mask and the first deep image features through the ghost feature extraction branch to obtain corresponding second deep image features; and perform super-resolution reconstruction according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image.
The terminal 110 may be an electronic device having network access capability, for example a desktop computer, tablet computer, notebook computer, smart phone, digital assistant, smart wearable device, shopping guide terminal, television, smart speaker, or microphone. Smart wearable devices include, but are not limited to, smart bracelets, smart watches, smart glasses, smart helmets, smart necklaces, and the like. The server 120 may be an electronic device with some arithmetic processing capability, having a network communication module, a processor, memory, and so on. The server may be a distributed server, a system of multiple processors, memories and network communication modules operating in concert, or a server cluster formed by several servers. Alternatively, with the development of science and technology, the server may also be a new technical means capable of realizing the corresponding functions of the embodiments of this specification, for example a new form of "server" based on quantum computing.
The embodiment of the present disclosure provides a method for reconstructing super-resolution image, referring to fig. 1b, the method may include the following steps:
S110, inputting the initial resolution image into a linear activation module of the super resolution reconstruction model to obtain shallow image features of the initial resolution image.
And S120, carrying out feature extraction on the shallow image features through a first lightweight feature extraction module to obtain first deep image features.
The super-resolution reconstruction model comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence. The linear activation module includes a second simple linear block and an activation module. Referring to fig. 1c, the second lightweight feature extraction module includes a fine granularity sparse mask branch FGSM and a ghost feature extraction branch GFEM in parallel. The higher the resolution of a picture, the more pixels it contains per unit area and in total, and the clearer it is. The super-resolution reconstruction model is constructed around the sparsity and redundancy of features in the super-resolution network. When a picture or video is captured, if a local area of it needs to be enlarged, the enlargement can be realized through a super-resolution algorithm.
Specifically, an initial image is acquired for input into the super-resolution reconstruction model and preprocessed to obtain an initial resolution image. The initial resolution image serves as the input of the super-resolution reconstruction model, and its shallow image features are preliminarily extracted through the linear activation module of the model. The shallow image features are then input to the first lightweight feature extraction module, whose feature extraction yields the first deep image features.
In some embodiments, the sizes of the initial resolution images may vary; initial resolution images larger than 64×64 may be randomly cropped so that the size of each initial resolution image is fixed to 64×64.
S130, carrying out mask setting on each element in the first deep image feature output by the first lightweight feature extraction module through a fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask.
The fine-grained sparse mask can locate and skip redundant computation, enabling efficient inference. In particular, the spatial mask dynamically identifies important regions, and the channel mask marks redundant channels in non-important regions. The goal of the spatial mask is to identify important areas in the first deep image features (0 denotes a non-important area, 1 an important area), while the channel mask marks redundant channels in "non-important" areas (0 denotes a redundant channel, 1 a retained channel).
In some cases, redundant computation of the super-resolution reconstruction model can be cut down through fine-granularity sparse mask branches, so that an efficient super-resolution reconstruction model is realized.
Specifically, the second lightweight feature extraction module includes a fine-grained sparse mask branch. The first deep image feature output by the first lightweight feature extraction module serves as the input of the fine-grained sparse mask branch, which sets a mask for each element in the first deep image feature, yielding the fine-grained sparse mask corresponding to the first deep image feature.
Illustratively, the fine-grained sparse mask branch may set a different mask for each element of the input feature via a Gumbel-softmax strategy (Gumbel-softmax relaxation); the dimensions of the fine-grained sparse mask are consistent with the dimensions of the first deep image feature. For example, if the first deep image feature has dimensions H×W×C, the fine-grained sparse mask also has dimensions H×W×C. At inference time, the output of the Gumbel-softmax strategy is binarized by taking the maximum; but because a hard maximum would truncate the gradients needed for back-propagation learning, a continuously differentiable function is employed during training to approximate the maximization. The Gumbel-softmax strategy can fit different feature distributions more accurately, achieving better performance.
And S140, performing ghost feature extraction based on the fine-granularity sparse mask and the first deep image features through a ghost feature extraction branch to obtain corresponding second deep image features.
And S150, performing super-resolution reconstruction according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image.
The lightweight features output by the GhostNet network structure (an edge-side neural network) are called ghost features. Like GhostNet, the ghost feature extraction branch aims to extract deep features with a lightweight module, so the deep features it outputs are also called ghost features. The resolution of the initial resolution image is lower than the resolution of the target resolution image.
In some cases, the super-Resolution reconstruction model may recover a corresponding High-Resolution (HR) target Resolution image from a given Low-Resolution (LR) initial Resolution image.
Specifically, with continued reference to fig. 1c, the second lightweight feature extraction module includes a ghost feature extraction branch GFEM. The Fine-grained sparse Masks and the first deep image features serve as inputs of the ghost feature extraction branch GFEM, which performs ghost feature extraction on the first deep image features based on the fine-grained sparse masks to obtain the corresponding second deep image features. Super-resolution reconstruction is then performed on the second deep image features, the shallow image features, and the initial resolution image to obtain a target resolution image with higher resolution than the initial resolution image.
In the above image super-resolution reconstruction method, the initial resolution image is first input into the linear activation module of the super-resolution reconstruction model to obtain the shallow image features of the initial resolution image; the shallow image features are then processed by the first lightweight feature extraction module to obtain the first deep image features; next, each element in the first deep image features is masked through the fine-grained sparse mask branch to obtain the corresponding fine-grained sparse mask; then, the ghost feature extraction branch performs ghost feature extraction based on the fine-grained sparse mask and the first deep image features to obtain the corresponding second deep image features; finally, super-resolution reconstruction is performed according to the initial resolution image, the shallow image features, and the second deep image features to obtain a target resolution image with resolution higher than that of the initial resolution image. Masking each element of the first deep image features to obtain a fine-grained sparse mask, and extracting ghost features based on that mask, improves on the related-art practice of applying a unified mask across all channels.
In some embodiments, referring to fig. 2a, performing super-resolution reconstruction according to the initial resolution image, the shallow image feature, and the second deep image feature to obtain the target resolution image may include the following steps:
S210, performing residual connection on the second deep image feature and the shallow image feature to obtain a detail image feature.
S220, performing dimension reduction processing on the detail image features through the first simple linear block to obtain dimension reduction image features.
S230, residual connection and up-sampling processing are carried out based on the initial resolution image and the dimension-reduced image characteristics, and a target resolution image is obtained.
A residual connection adds the input to the output of the convolution via a skip (shortcut) connection. The detail image features may represent high-frequency details present in the image. The upsampling process restores the resolution of the feature map to the resolution of the original picture.
Specifically, referring to fig. 2b, a residual connection is established between the second deep image feature and the shallow image feature to obtain the detail image feature. The detail image feature is subjected to dimension reduction by a first Simple Linear Block (SLB) 210, yielding the dimension-reduced image feature. A residual connection is then established between the initial resolution image and the dimension-reduced image feature, recovering the image details between them. Finally, the initial resolution image 230 and the dimension-reduced image feature are upsampled by the Shuffle layer 220, yielding a target resolution image 240 with resolution higher than that of the initial resolution image. The Shuffle layer refers to a PixelShuffle layer, a convolutional layer with an upsampling function widely used in the super-resolution field.
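The PixelShuffle upsampling mentioned above can be illustrated with a minimal NumPy sketch (the channel-to-space rearrangement only, without the preceding convolution; shapes are illustrative):

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r^2, H, W) feature map into (C, H*r, W*r).

    Mirrors the PixelShuffle upsampling used in super-resolution:
    groups of r^2 channels are interleaved into an r-times larger grid.
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0
    c = c_r2 // (r * r)
    # (C, r, r, H, W) -> (C, H, r, W, r) -> (C, H*r, W*r)
    out = x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2)
    return out.reshape(c, h * r, w * r)

feat = np.arange(16, dtype=np.float32).reshape(4, 2, 2)  # 4 channels, 2x2
up = pixel_shuffle(feat, 2)                              # 1 channel, 4x4
print(up.shape)  # (1, 4, 4)
```

Each group of r² channels is folded into an r-times larger spatial grid, which is why the layer before the Shuffle layer must output C·r² channels.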
In the above image super-resolution reconstruction method, residual connection is performed according to the second deep image feature and the shallow image feature to obtain the detail image feature, the detail image feature is subjected to dimension reduction processing through the first simple linear block to obtain the dimension reduction image feature, and residual connection and up-sampling processing are performed based on the initial resolution image and the dimension reduction image feature to obtain the target resolution image. The computational complexity of the neural network model can be reduced.
In some embodiments, referring to fig. 3a, the linear activation module includes a second simple linear block and an activation module. Referring to fig. 3b, inputting the initial resolution image 230 into the linear activation module of the super resolution reconstruction model to obtain the shallow image features of the initial resolution image may include the following steps:
S310, expanding the number of input channels of the initial resolution image through a second simple linear block to obtain high-dimensional image features.
S320, operating the high-dimensional image features through the activation module to obtain shallow image features.
Specifically, referring to fig. 3b, the initial resolution image 230 is input to the linear activation module of the super-resolution reconstruction model; the number of channels of the initial resolution image is expanded by the second simple linear block 250 included in the linear activation module, yielding the high-dimensional image features. The shallow image features of the high-dimensional image features are then preliminarily extracted through the activation module included in the linear activation module, giving the shallow image features of the initial resolution image. The activation function may be implemented by the Prelu activation layer 260.
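As a rough illustration of the channel-expansion step: a 1×1 convolution is simply a per-pixel linear map over channels. The sketch below uses made-up sizes, and a ReLU stands in for the model's Prelu activation, so it only demonstrates the shape change:

```python
import numpy as np

def expand_channels(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """1x1-convolution-style channel expansion.

    x: (Ci, H, W) image/feature; weight: (Co, Ci) with Co > Ci.
    Each output pixel is a linear combination of the input channels
    at the same spatial location.
    """
    return np.einsum('oc,chw->ohw', weight, x)

rng = np.random.default_rng(7)
img = rng.normal(size=(3, 16, 16))   # e.g. an RGB initial-resolution image
w = rng.normal(size=(32, 3))         # expand 3 -> 32 channels (toy weights)
high_dim = expand_channels(img, w)
shallow = np.maximum(high_dim, 0.0)  # toy activation (ReLU stand-in for Prelu)
print(shallow.shape)  # (32, 16, 16)
```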
In the image super-resolution reconstruction method, the second simple linear block is used for carrying out expansion input channel number processing on the initial resolution image to obtain high-dimensional image features, and the activation module is used for operating the high-dimensional image features to obtain shallow image features. Input data may be provided for subsequent fine-grained sparse masking and extraction of ghost features.
In some embodiments, the fine-grained sparse mask may be derived using the following formula:

F = σ(w + g)

where F denotes the soft output of the fine-grained sparse mask branch, g denotes noise sampled from the Gumbel(0, 1) distribution, w is a weight parameter, and σ denotes the Sigmoid activation function.
Specifically, when obtaining the fine-grained sparse mask, if the current element F in the formula is greater than a preset value, a mask value of 1 is generated; if F is not greater than the preset value, a mask value of 0 is generated.

In the above image super-resolution reconstruction method, a mask value of 1 is generated if the current element F is greater than a preset value, and a mask value of 0 otherwise. The variability of different channel features can thus be extracted by the fine-grained sparse mask.
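A toy NumPy sketch of this mask generation, assuming the soft output takes the form sigmoid(w + g) with the Gumbel(0, 1) noise g and weight parameter w described above, and a 0.5 threshold (the exact parametrization and threshold in the model may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_sigmoid_mask(w: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Per-element binary mask from a Gumbel-sigmoid soft output.

    Elements whose soft output exceeds the threshold are kept (mask 1),
    the rest are skipped (mask 0).
    """
    # Sample Gumbel(0, 1) noise via inverse transform of uniform samples.
    u = rng.uniform(1e-9, 1.0, size=w.shape)
    g = -np.log(-np.log(u))
    soft = 1.0 / (1.0 + np.exp(-(w + g)))   # soft output F = sigmoid(w + g)
    return (soft > threshold).astype(np.float32)

w = np.array([[5.0, -5.0], [4.0, -6.0]])    # toy learned logits
mask = gumbel_sigmoid_mask(w)
print(mask.shape)  # (2, 2)
```

During training the soft output itself would be used so gradients can flow; the hard threshold shown here corresponds to inference.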
In some embodiments, referring to fig. 4a, by the ghost feature extraction branch, ghost feature extraction is performed based on the fine-granularity sparse mask and the first deep image feature, so as to obtain a corresponding second deep image feature, which may include the following steps:
S410, carrying out point-by-point multiplication by utilizing the first deep image feature and the fine-granularity sparse mask to obtain sparse image features.
S420, performing sparse convolution processing on the sparse image features to obtain convolved sparse image features.
And S430, adding the convolved sparse image features and the first deep image features to obtain first intermediate image features.
S440, extracting ghost features based on the first intermediate image features to obtain second deep image features.
With continued reference to fig. 1c, the ghost feature extraction branch GFEM included in the lightweight feature extraction module may perform ghost feature extraction based on the Fine-grained sparse Masks and the first deep image feature to obtain the second deep image feature. The ghost feature extraction branch includes a Ghost Linear Block (GLB); referring to fig. 4b, the GLB includes a 1×1 convolutional layer and a 3×3 convolutional layer, through which sparse convolution processing can be performed on the sparse image features.
Specifically, the first deep image features and the fine-grained sparse mask are input into the Ghost Linear Block (GLB). In the GLB, the first deep image feature is first multiplied point-wise with the fine-grained sparse mask to obtain a sparse image feature. The sparse image feature is then passed through the 1×1 convolutional layer and the 3×3 convolutional layer in sequence for sparse convolution, yielding the convolved sparse image feature. The convolved sparse image feature is added to the first deep image feature to obtain the first intermediate image feature. Finally, ghost feature extraction on the first intermediate image feature yields the second deep image feature.
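Steps S410 to S430 can be sketched for a single channel as follows. The 3×3 kernel and feature values are made up, and a real implementation would use a sparse convolution that skips the zeroed positions; this dense equivalent only illustrates the arithmetic (mask, convolve, residual add):

```python
import numpy as np

def masked_conv3x3(feat: np.ndarray, mask: np.ndarray,
                   kernel: np.ndarray) -> np.ndarray:
    """Dense equivalent of the GLB computation for one channel.

    feat: (H, W) feature; mask: (H, W) binary; kernel: (3, 3).
    """
    sparse = feat * mask                  # S410: point-wise multiply
    h, w = feat.shape
    padded = np.pad(sparse, 1)
    conv = np.zeros_like(feat)
    for i in range(h):                    # S420: naive 3x3 convolution
        for j in range(w):
            conv[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return conv + feat                    # S430: residual addition

feat = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                      # only the centre is "important"
kernel = np.full((3, 3), 1.0 / 9.0)       # toy averaging kernel
out = masked_conv3x3(feat, mask, kernel)
print(out.shape)  # (4, 4)
```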
In the above image super-resolution reconstruction method, the first deep image feature is multiplied point-wise with the fine-grained sparse mask to obtain a sparse image feature; sparse convolution of the sparse image feature yields the convolved sparse image feature; adding the convolved sparse image feature to the first deep image feature yields the first intermediate image feature; and ghost feature extraction based on the first intermediate image feature yields the second deep image feature.
In some embodiments, referring to fig. 5a, the extracting of the ghost feature based on the first intermediate image feature to obtain the second deep image feature may include the following steps:
S510, performing a depthwise separable convolution on the first intermediate image feature to adjust the number of channels by the ghost rate s, and performing a residual connection in the channel dimension to obtain a second intermediate image feature with channel dimension C.

S520, performing channel attention distribution processing and activation processing based on the second intermediate image feature to obtain the second deep image feature.

The ghost rate s takes values no greater than 1.

Specifically, referring to fig. 5b, the first intermediate image feature is obtained through the Ghost Linear Block (GLB). A 3×3 depthwise convolutional layer (3x3 depthwise conv) adjusts the number of channels according to the ghost rate s, after which a residual connection in the channel dimension of the first intermediate image feature yields a second intermediate image feature with channel dimension C. Finally, attention distribution processing is performed on the second intermediate image feature of channel dimension C through a Channel Attention module, and the second deep image feature is obtained after passing through the Prelu activation layer.

In the above image super-resolution reconstruction method, a depthwise separable convolution adjusts the channel number of the first intermediate image feature by the ghost rate s, a residual connection in the channel dimension yields a second intermediate image feature with channel dimension C, and channel attention distribution processing and activation processing based on the second intermediate image feature yield the second deep image feature. The high-frequency details of the first intermediate image can be learned by means of the residual connection.
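One possible reading of the ghost-rate mechanism, sketched below under the assumption that s·C "ghost" channels are produced from existing channels by a cheap depthwise 3×3 operation and concatenated with identity channels to restore channel dimension C (the kernel, sizes, and exact channel split are illustrative, not the patented layout):

```python
import numpy as np

def ghost_expand(feat: np.ndarray, s: float = 0.5) -> np.ndarray:
    """Ghost-style cheap channel generation (assumed interpretation).

    feat: (C, H, W). Keep the first (1 - s)*C channels unchanged and
    generate the remaining s*C "ghost" channels with one depthwise 3x3
    kernel each, concatenating along the channel dimension.
    """
    c, h, w = feat.shape
    n_ghost = int(c * s)
    base = feat[: c - n_ghost]                # identity branch
    kernel = np.full((3, 3), 1.0 / 9.0)       # toy depthwise kernel
    ghosts = []
    for ch in feat[:n_ghost]:                 # depthwise: per-channel conv
        padded = np.pad(ch, 1)
        out_ch = np.zeros((h, w))
        for i in range(h):
            for j in range(w):
                out_ch[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
        ghosts.append(out_ch)
    return np.concatenate([base, np.stack(ghosts)], axis=0)

feat = np.random.default_rng(1).normal(size=(4, 8, 8))
out = ghost_expand(feat, s=0.5)
print(out.shape)  # (4, 8, 8)
```

The point of the ghost rate s ≤ 1 is that only a fraction of the output channels requires fresh computation; the rest come nearly for free.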
In some embodiments, referring to fig. 6a, the generation process of the super-resolution reconstruction model may include the following steps:
and S610, performing bicubic interpolation on the first resolution image to obtain a second resolution image.
The resolution of the second resolution image is lower than that of the first resolution image. Bicubic interpolation is commonly used to sharpen and magnify digital images. Interpolation essentially computes the value of a point from the values of several nearby points; bicubic interpolation is in effect a smoothing of the pixels, the different pixel values being sampling points of an underlying function.
Specifically, a plurality of high resolution images (HR) may be acquired as the first resolution image, and bicubic interpolation (bicubic downscaling) may be performed on the first resolution image, and a second resolution image may be obtained. Wherein, a part of the acquired high-resolution images can be used as a training sample set, and a part of the acquired high-resolution images can be used as a verification sample set.
Illustratively, the high-resolution dataset DIV2K contains 800 training pictures and 100 validation pictures. Applying bicubic interpolation (bicubic downscaling) to a High Resolution (HR) image, i.e., the first resolution image, yields a Low Resolution (LR) image, i.e., the second resolution image.
S620, performing image enhancement processing on the second resolution image to obtain a training image sample set.
The image enhancement processing includes, but is not limited to, random cropping operation, rotation operation, and translation operation.
In some cases, the image is subjected to image enhancement processing, so that the number of training sample sets can be increased, and generalization of the model can be increased.
Specifically, each low-resolution image is randomly cropped into 64×64 image blocks, giving a total of 64×800 = 51200 training samples. Image enhancement may also be performed by random rotation and horizontal flipping. After the first resolution images are downsampled by bicubic interpolation to obtain the second resolution images, their sizes may vary; second resolution images larger than 64×64 are randomly cropped so that the size of each image is fixed to 64×64, yielding the training image sample set.
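The cropping and augmentation pipeline described above can be sketched as follows (crop size and image dimensions are from the text; the probabilities are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_patch(img: np.ndarray, size: int = 64) -> np.ndarray:
    """Randomly crop a size x size patch, then flip/rotate for augmentation."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    patch = img[top:top + size, left:left + size]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]                     # horizontal flip
    patch = np.rot90(patch, k=rng.integers(0, 4))  # random 90-degree rotation
    return patch

lr = rng.normal(size=(120, 150))  # toy low-resolution image
patch = random_patch(lr)
print(patch.shape)  # (64, 64)
```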
S630, constructing a lightweight feature extraction module based on the edge region features, the texture region features and the flat region features of the second resolution image.
The lightweight feature extraction module comprises a fine-grained sparse mask branch and a ghost feature extraction branch in parallel; the fine-grained sparse mask branch generates the fine-grained sparse mask of the current lightweight feature extraction module, and the ghost feature extraction branch extracts compact ghost features from the second resolution image. The re-parameterization method can reduce inference complexity while obtaining clearer texture region features.
In some cases, each channel of the input in the lightweight feature extraction module is assigned a separate mask to further reduce the computational complexity of the super-resolution reconstruction model.
Specifically, the detail loss of a low-resolution image is mainly located in the edge and texture areas, while the loss in flat areas is small, so the same effect can be obtained by devoting less computation to the flat areas. The edge region features, texture region features, and flat region features of the second resolution image may be used to construct the lightweight feature extraction module.
S640, sequentially connecting the lightweight characteristic extraction modules to construct an initial model.
S650, training the initial model by using the training image sample set to obtain a super-resolution reconstruction model.
Wherein each channel of the input of the lightweight feature extraction module is assigned a separate mask.
Specifically, the plurality of lightweight feature extraction modules may be sequentially connected according to their respective functions to construct an initial model. The obtained training image sample set is input into the initial model for image super-resolution reconstruction, yielding a target resolution image with resolution higher than that of the initial resolution image. The label image is the original high-resolution image of the training image sample set. A loss value of the initial model is then determined based on the target resolution image and the label image corresponding to the training image sample, and the parameters of the initial model are updated based on this model loss value. The updated initial model is trained iteratively in the same way, and the super-resolution reconstruction model is obtained when the model training stopping condition is reached. The stopping condition may be that the model loss value converges, or that the number of training rounds reaches a preset number.
The model loss value is formulated as follows:

L = L1(I_SR, I_HR) + λ · L_sparse

where L1(I_SR, I_HR) represents the loss between the target resolution image I_SR and the label image I_HR, and the hyperparameter λ balances the sparse mask loss against the loss between the target resolution image and the label image. The sparse mask loss may be expressed as:

L_sparse = Σ_i (F_i^s / F_i)

where F_i^s represents the sum of the number of FLOPs (floating-point operations) at sparse (retained) positions in layer i, and F_i represents the sum of the number of FLOPs at all positions of layer i. By adjusting the hyperparameter λ, the sparsity ratio, and hence the complexity of the model, can be controlled through the sparsity loss.
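A toy numeric sketch of this objective, assuming an L1 reconstruction term (the exact reconstruction loss is not specified here) and treating the per-layer FLOP counts as given constants:

```python
import numpy as np

def total_loss(sr: np.ndarray, hr: np.ndarray,
               flops_sparse, flops_total, lam: float = 0.1) -> float:
    """Reconstruction loss plus sparsity regulariser.

    The regulariser penalises the fraction of FLOPs spent at retained
    (non-masked) positions, summed over layers i; lam trades it off
    against reconstruction quality.
    """
    recon = float(np.mean(np.abs(sr - hr)))                 # assumed L1 term
    sparse = sum(fs / ft for fs, ft in zip(flops_sparse, flops_total))
    return recon + lam * sparse

sr = np.full((8, 8), 0.5)   # toy reconstructed image
hr = np.full((8, 8), 0.6)   # toy label image
loss = total_loss(sr, hr, flops_sparse=[30.0, 10.0], flops_total=[100.0, 100.0])
print(round(loss, 4))  # 0.14
```

Raising lam pushes the masks toward sparser (cheaper) configurations at some cost in reconstruction quality, matching the role of λ above.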
In this embodiment, referring to fig. 6b, GFFS is a lightweight feature extraction module comprising a fine-grained sparse mask branch and a ghost feature extraction branch in parallel: FGSM is the fine-grained sparse mask branch and GFEM is the ghost feature extraction branch. (1) denotes the input coming from the output of the previous layer's GFFS; (2) denotes the input coming from the output of the fine-grained sparse mask branch in the current lightweight feature extraction module. The output of the previous layer's GFFS serves as input to the 1×1 convolutional layer included in the FGSM; its output is fed to the Relu layer, whose output is fed to the Simple Linear Block (SLB), whose output is fed to the Prelu layer, which produces the Fine-grained Masks. The output of the previous GFFS and the fine-grained sparse mask serve as inputs to the Ghost Linear Block (GLB) included in the GFEM; the GLB output is fed to the 3×3 depthwise convolutional layer (3x3 depthwise conv), and the outputs of the depthwise convolutional layer and of the GLB together serve as inputs to the Channel Attention module, whose output is fed to the Prelu layer, yielding the second deep image feature.
In this embodiment, with continued reference to fig. 3b, GFFS is a lightweight feature extraction module. The initial resolution image 230 serves as input to a Simple Linear Block (SLB); its output is fed to the Prelu layer, whose output is fed to the GFFS modules, the number of which can be set according to the specific situation. The output of the last GFFS module and the output of the Prelu layer are input to a Simple Linear Block (SLB); the output of this SLB and the initial resolution image 230 are input to the Shuffle layer, whose output is the target resolution image 240.
In the above image super-resolution reconstruction method, image enhancement processing is performed on the second resolution image to obtain a training image sample set, a lightweight feature extraction module is built based on edge region features, texture region features and flat region features of the second resolution image, a plurality of lightweight feature extraction modules are sequentially connected to build an initial model, and the training image sample set is used for training the initial model to obtain the super-resolution reconstruction model. By constructing the super-resolution reconstruction model, the computational complexity of the neural network model can be reduced.
In some implementations, referring to fig. 7a, the ghost feature extraction branch includes a ghost linear block, and the fine-granularity sparse mask branch includes a first simple linear block; wherein the ghost linear block comprises a first convolution layer and a second convolution layer which are adjacent; the first simple linear block comprises a third convolution layer and a fourth convolution layer which are adjacent to each other; training the initial model by using a training image sample set to obtain a super-resolution reconstruction model, wherein the method comprises the following steps of:
s710, training the initial model by using the training image sample set until the model training stopping condition is met.
S720, after training of the initial model is completed, combining the first convolution layer and the second convolution layer into one convolution layer, and combining the third convolution layer and the fourth convolution layer into one convolution layer to obtain the super-resolution reconstruction model.
The model training stopping condition may be that the model loss value tends to converge, or that the training turns reach a preset number of turns.
Specifically, during the model training phase, referring to fig. 7b, the first Simple Linear Block (SLB) includes adjacent third and fourth convolutional layers; the third may be a 3×3 convolutional layer and the fourth a 1×1 convolutional layer. After the initial model has been trained, referring to fig. 7c, the third and fourth convolutional layers may be merged into one 3×3 convolutional layer. Likewise, in the model training phase, referring to fig. 7d, the Ghost Linear Block (GLB) includes adjacent first and second convolutional layers, the first being a 3×3 convolutional layer and the second a 1×1 convolutional layer; after the initial model has been trained, referring to fig. 4b, they may be merged into one 3×3 convolutional layer. The ghost linear block and the first simple linear block thus use different structures during training and after training: the 1×1 convolutional layer and the shortcut operation are eliminated, yet the output remains the same. After training, the convolutional layer merged from the first and second convolutional layers and the one merged from the third and fourth convolutional layers form part of the super-resolution reconstruction model.
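That a 3×3 convolution followed by a 1×1 convolution (with no nonlinearity in between) collapses into a single 3×3 convolution can be verified numerically: a 1×1 convolution is a pointwise linear map over channels, so it folds directly into the 3×3 weights. A NumPy sketch with a naive dense convolution and illustrative sizes:

```python
import numpy as np

def conv2d(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Multi-channel 'same'-padded conv: x (Ci, H, W), w (Co, Ci, k, k)."""
    co, ci, k, _ = w.shape
    pad = k // 2
    h, wd = x.shape[1:]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((co, h, wd))
    for o in range(co):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

rng = np.random.default_rng(3)
w3 = rng.normal(size=(6, 4, 3, 3))  # 3x3 layer: 4 -> 6 channels
w1 = rng.normal(size=(4, 6, 1, 1))  # 1x1 layer: 6 -> 4 channels
x = rng.normal(size=(4, 5, 5))

two_step = conv2d(conv2d(x, w3), w1)
# Fold the 1x1 into the 3x3: W[o, i] = sum_m w1[o, m] * w3[m, i]
merged = np.einsum('omab,mikl->oikl', w1, w3)
one_step = conv2d(x, merged)
print(np.allclose(two_step, one_step))  # True
```

This is the essence of the re-parameterization: train the expressive two-layer form, then deploy the cheaper single merged layer with identical outputs.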
In the image super-resolution reconstruction method, training is carried out on the initial model by utilizing the training image sample set until the model training stopping condition is met, after the initial model is trained, the first convolution layer and the second convolution layer are combined into one convolution layer, and the third convolution layer and the fourth convolution layer are combined into one convolution layer, so that the super-resolution reconstruction model is obtained. The complexity of the network can be reduced by incorporating convolutional layers.
The present disclosure further provides an image super-resolution reconstruction method, referring to fig. 8, which may include the following steps:
S802, inputting the initial resolution image into the linear activation module of the super-resolution reconstruction model, and expanding the number of input channels of the initial resolution image through the second simple linear block to obtain high-dimensional image features.
The linear activation module comprises a second simple linear block and an activation module. The super-resolution reconstruction model comprises a plurality of second simple linear blocks, an activation module, a first lightweight characteristic extraction module, a second lightweight characteristic extraction module and a first simple linear block which are connected in sequence; the second lightweight class feature extraction module includes fine-grained sparse mask branches and ghost feature extraction branches in parallel.
S804, operating the high-dimensional image features through an activation module to obtain shallow image features.
S806, feature extraction is carried out on the shallow image features through a first lightweight feature extraction module, and first deep image features are obtained.
S808, carrying out mask setting on each element in the first deep image feature output by the first lightweight feature extraction module through a fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask.
The fine-grained sparse mask may be obtained by the following formula:

F = σ(w + g)

If the current element F is greater than a preset value, a mask value of 1 is generated; if F is not greater than the preset value, a mask value of 0 is generated. Here F denotes the soft output of the fine-grained sparse mask branch, g denotes noise sampled from the Gumbel(0, 1) distribution, w is a weight parameter, and σ denotes the Sigmoid activation function.
And S810, carrying out point-by-point multiplication by utilizing the first deep image feature and the fine-granularity sparse mask to obtain sparse image features.
And S812, performing sparse convolution processing on the sparse image features to obtain convolved sparse image features.
And S814, adding the convolved sparse image features and the first deep image features to obtain first intermediate image features.
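Steps S810 to S814 can be sketched as follows. For brevity a dense 1x1 convolution stands in for the sparse convolution; a real implementation would compute only at the positions the mask keeps, which is where the savings come from. Shapes and names are illustrative:

```python
import numpy as np

def sparse_branch_step(feat, mask, w):
    """S810-S814: mask the feature point-by-point, convolve the sparse
    result, and add it back to the input feature (residual)."""
    sparse = feat * mask                         # S810: point-by-point multiply
    conv = np.einsum('oc,chw->ohw', w, sparse)   # S812: stand-in 1x1 convolution
    return conv + feat                           # S814: first intermediate feature

C, H, W = 4, 6, 6
rng = np.random.default_rng(0)
feat = rng.normal(size=(C, H, W))                              # first deep image feature
mask = (rng.uniform(size=(C, H, W)) > 0.5).astype(feat.dtype)  # fine-granularity mask
w = rng.normal(size=(C, C)) * 0.1                              # conv weights
out = sparse_branch_step(feat, mask, w)
print(out.shape)  # (4, 6, 6)
```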
S816, performing depth separable convolution on the first intermediate image feature to adjust the channel number to rC according to the ghost ratio r, and carrying out residual connection in the channel dimension to obtain a second intermediate image feature with channel dimension C.
Wherein the value of r is not greater than 1.
S818, performing channel attention distribution processing and activation processing based on the second intermediate image feature to obtain a second deep image feature.
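One common reading of S816 and S818, in the spirit of ghost modules, is that a cheap operation produces the rC "intrinsic" channels while the remaining channels are reused from the input, concatenation in the channel dimension restores dimension C, and a squeeze-style channel attention then gates the result. The sketch below assumes that reading; the arithmetic placeholder stands in for the depth separable convolution:

```python
import numpy as np

def ghost_channel_step(feat, r=0.5):
    """S816 sketch: a cheap op yields r*C 'intrinsic' channels, the other
    (1-r)*C channels are reused from the input, and concatenation in the
    channel dimension restores C channels."""
    C = feat.shape[0]
    k = int(r * C)
    intrinsic = feat[:k] * 2.0 + 1.0   # placeholder for the depth separable conv
    ghost = feat[k:]                   # channels carried over from the input
    return np.concatenate([intrinsic, ghost], axis=0)

def channel_attention(feat):
    """S818 sketch: global average pool per channel, Sigmoid gate, rescale."""
    s = feat.mean(axis=(1, 2))          # squeeze, shape (C,)
    gate = 1.0 / (1.0 + np.exp(-s))     # Sigmoid activation
    return feat * gate[:, None, None]   # excite

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))      # first intermediate image feature
mid = ghost_channel_step(feat, r=0.5)  # second intermediate feature, C = 8
out = channel_attention(mid)           # second deep image feature
print(mid.shape, out.shape)  # (8, 4, 4) (8, 4, 4)
```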
S820, residual connection is carried out on the second deep image feature and the shallow image feature, and the detail image feature is obtained.
S822, performing dimension reduction processing on the detail image features through the first simple linear block to obtain dimension reduction image features.
S824, residual connection and up-sampling processing are carried out based on the initial resolution image and the dimension-reduced image characteristics, and a target resolution image is obtained.
Wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
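The up-sampling in S824 is commonly realized with a sub-pixel (pixel shuffle) layer. The disclosure does not name its up-sampler, so the sketch below assumes that choice and feeds it a random placeholder feature map:

```python
import numpy as np

def pixel_shuffle(feat, scale):
    """Rearrange (scale**2 * C, H, W) -> (C, scale*H, scale*W), the
    standard sub-pixel upsampling rearrangement."""
    c2, H, W = feat.shape
    s = scale
    C = c2 // (s * s)
    x = feat.reshape(C, s, s, H, W)
    x = x.transpose(0, 3, 1, 4, 2)     # (C, H, s, W, s)
    return x.reshape(C, H * s, W * s)

rng = np.random.default_rng(0)
scale = 4                              # e.g. 180p -> 720p is a 4x factor
lowres = rng.uniform(size=(3 * scale**2, 8, 8))  # placeholder feature map
hires = pixel_shuffle(lowres, scale)
print(hires.shape)  # (3, 32, 32)
```

In the full pipeline, the dimension-reduced features would first be residually combined with (an expansion of) the initial resolution image before this rearrangement.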
The method (SRGFS) proposed by the embodiments of the present disclosure achieves better results than other lightweight image super-resolution models (SRCNN, FSRCNN, MOREMNAS-C, SESR, etc.) in terms of image quality (measured by PSNR), computational complexity (measured in multiply-accumulate operations, MACs) and model parameters when performing 180p-to-720p magnification on the Urban100 dataset. SRGFS-T and SRGFS-S denote respectively the micro and small model versions of the method proposed by the embodiments of the present specification, that is, models with different numbers of GFFS modules; SRGFS-Unify denotes the model of the proposed method after the fine-grained mask design is removed.
Referring to fig. 9, an image super-resolution reconstruction apparatus 900 according to an embodiment of the present disclosure includes: the device comprises a shallow feature acquisition module 910, a first deep layer extraction module 920, an element mask setting module 930, a second deep layer extraction module 940 and a super resolution reconstruction module 950.
The shallow image feature acquisition module 910 is configured to input an initial resolution image into a linear activation module of the super resolution reconstruction model, so as to obtain shallow image features of the initial resolution image; the super-resolution reconstruction model comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence; the second lightweight feature extraction module comprises a fine granularity sparse mask branch and a ghost feature extraction branch which are parallel;
the first deep layer extraction module 920 is configured to perform feature extraction on the shallow layer image feature through the first lightweight feature extraction module to obtain a first deep layer image feature;
an element mask setting module 930, configured to set a mask for each element in the first deep image feature output by the first lightweight feature extraction module through the fine-granularity sparse mask branch, to obtain a corresponding fine-granularity sparse mask;
A second deep layer extraction module 940, configured to extract, through the ghost feature extraction branch, a ghost feature based on the fine-granularity sparse mask and the first deep layer image feature, to obtain a corresponding second deep layer image feature;
the super-resolution reconstruction module 950 is configured to perform super-resolution reconstruction according to the initial resolution image, the shallow layer image feature, and the second deep layer image feature, so as to obtain a target resolution image; wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
In some embodiments, the super-resolution reconstruction module 950 is further configured to perform residual connection with the shallow image feature according to the second deep image feature to obtain a detail image feature; performing dimension reduction processing on the detail image features through a first simple linear block to obtain dimension reduction image features; and carrying out residual connection and up-sampling processing on the basis of the initial resolution image and the dimension-reduced image characteristics to obtain the target resolution image.
In some embodiments, the fine-grained sparse mask is obtained using the following formula:
if, for the current element, σ((s + g1 − g2)/τ) is greater than a preset value, a mask of 1 is generated; when it is not greater than the preset value, a mask of 0 is generated;
wherein s represents the soft output of the fine-grained sparse mask branch, g1 and g2 represent noise sampled from the Gumbel(0, 1) distribution, τ is a trade-off parameter, and σ represents the Sigmoid activation function.
In some embodiments, the second deep extraction module 940 is further configured to perform point-wise multiplication with the fine-granularity sparse mask by using the first deep image feature to obtain a sparse image feature; performing sparse convolution processing on the sparse image features to obtain convolved sparse image features; adding the convolved sparse image features and the first deep image features to obtain first intermediate image features; and carrying out ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature.
In some embodiments, the second depth extraction module 940 is further configured to perform depth separable convolution on the first intermediate image feature to adjust the channel number to rC according to the ghost ratio r, and to perform residual connection in the channel dimension to obtain a second intermediate image feature with channel dimension C, wherein the value of r is not greater than 1; and to perform channel attention distribution processing and activation processing based on the second intermediate image feature to obtain the second deep image feature.
For a specific description of the image super-resolution reconstruction apparatus 900, reference may be made to the description of the image super-resolution reconstruction method hereinabove, which is not repeated here.
In some embodiments, a computer device is provided, which may be a terminal, and an internal structure diagram thereof may be as shown in fig. 10. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a method of image super-resolution reconstruction. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 10 is merely a block diagram of a portion of the structure associated with the aspects disclosed herein and is not limiting of the computer device to which the aspects disclosed herein apply; in particular, the computer device may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In some embodiments, a computer device is provided, comprising a memory in which a computer program is stored, and a processor which, when executing the computer program, carries out the method steps of the above embodiments.
The present description embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method of any of the above embodiments.
An embodiment of the present specification provides a computer program product comprising instructions which, when executed by a processor of a computer device, enable the computer device to perform the steps of the method of any one of the embodiments described above.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, for example, may be considered as an ordered listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium may even be paper or another suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

Claims (13)

1. An image super-resolution reconstruction method, characterized in that the method comprises:
inputting the initial resolution image into a linear activation module of the super resolution reconstruction model to obtain shallow image features of the initial resolution image; the super-resolution reconstruction model further comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence; the second lightweight feature extraction module comprises a fine granularity sparse mask branch and a ghost feature extraction branch which are parallel;
performing feature extraction on the shallow image features through the first lightweight feature extraction module to obtain first deep image features;
mask setting is carried out on each element in the first deep image feature output by the first lightweight feature extraction module through the fine-granularity sparse mask branch, so that a corresponding fine-granularity sparse mask is obtained; the fine granularity sparse mask is obtained by adopting the following formula, and the method comprises the following steps:
if, for the current element, σ((s + g1 − g2)/τ) is greater than a preset value, a mask of 1 is generated; when it is not greater than the preset value, a mask of 0 is generated; wherein s represents the soft output of the fine-grained sparse mask branch, g1 and g2 represent noise sampled from the Gumbel(0, 1) distribution, τ is a trade-off parameter, and σ represents a Sigmoid activation function;
performing ghost feature extraction on the basis of the fine-granularity sparse mask and the first deep image feature through the ghost feature extraction branch to obtain a corresponding second deep image feature;
performing super-resolution reconstruction according to the initial resolution image, the shallow image features and the second deep image features to obtain a target resolution image; wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
2. The method of claim 1, wherein performing super-resolution reconstruction from the initial resolution image, the shallow image features, and the second deep image features to obtain a target resolution image comprises:
residual connection is carried out on the second deep image characteristic and the shallow image characteristic to obtain a detail image characteristic;
performing dimension reduction processing on the detail image features through a first simple linear block to obtain dimension reduction image features;
and carrying out residual connection and up-sampling processing on the basis of the initial resolution image and the dimension-reduced image characteristics to obtain the target resolution image.
3. The method of claim 1, wherein the linear activation module comprises a second simple linear block and an activation module; inputting the initial resolution image into a linear activation module of a super resolution reconstruction model to obtain shallow image features of the initial resolution image, wherein the method comprises the following steps:
performing channel-number expansion processing on the initial resolution image through the second simple linear block to obtain high-dimensional image features;
and operating the high-dimensional image features through the activation module to obtain the shallow image features.
4. The method according to claim 1, wherein the extracting the ghost feature by the ghost feature extracting branch based on the fine-granularity sparse mask and the first deep image feature to obtain a corresponding second deep image feature includes:
utilizing the first deep image feature and the fine granularity sparse mask to multiply point by point to obtain sparse image features;
performing sparse convolution processing on the sparse image features to obtain convolved sparse image features;
adding the convolved sparse image features and the first deep image features to obtain first intermediate image features;
And carrying out ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature.
5. The method of claim 4, wherein the performing ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature comprises:
performing depth separable convolution on the first intermediate image feature to adjust the channel number to rC according to the ghost ratio r, and carrying out residual connection in the channel dimension to obtain a second intermediate image feature with channel dimension C; wherein the value of r is not greater than 1;
and carrying out channel attention distribution processing and activating processing based on the second intermediate image feature to obtain the second deep image feature.
6. The method according to any one of claims 1 to 5, wherein the generating of the super-resolution reconstruction model comprises:
performing bicubic interpolation on the first resolution image to obtain a second resolution image; wherein the resolution of the second resolution image is lower than the resolution of the first resolution image;
performing image enhancement processing on the second resolution image to obtain a training image sample set;
Constructing a lightweight feature extraction module based on edge region features, texture region features and flat region features of the second resolution image; the lightweight feature extraction module comprises a fine-granularity sparse mask branch and a ghost feature extraction branch which are parallel; the fine-granularity sparse mask branch is used for generating a fine-granularity sparse mask of the current lightweight feature extraction module, and the ghost feature extraction branch is used for extracting compact ghost features in the second resolution image;
sequentially connecting a plurality of lightweight feature extraction modules to construct an initial model; wherein each channel of the input of the lightweight feature extraction module is assigned a separate mask;
and training the initial model by using the training image sample set to obtain the super-resolution reconstruction model.
7. The method of claim 6, wherein the ghost feature extraction branch comprises a ghost linear block and the fine-grained sparse mask branch comprises a first simple linear block; wherein the ghost linear block comprises a first convolution layer and a second convolution layer which are adjacent to each other; the first simple linear block comprises a third convolution layer and a fourth convolution layer which are adjacent to each other; the training the initial model by using the training image sample set to obtain the super-resolution reconstruction model comprises the following steps:
Training the initial model by using the training image sample set until a model training stopping condition is met;
after the initial model is trained, combining the first convolution layer and the second convolution layer into one convolution layer, and combining the third convolution layer and the fourth convolution layer into one convolution layer to obtain the super-resolution reconstruction model.
8. An image super-resolution reconstruction apparatus, the apparatus comprising:
the shallow image feature acquisition module is used for inputting the initial resolution image into the linear activation module of the super resolution reconstruction model to obtain shallow image features of the initial resolution image; the super-resolution reconstruction model comprises a plurality of first lightweight feature extraction modules and second lightweight feature extraction modules which are connected in sequence; the second lightweight feature extraction module comprises a fine granularity sparse mask branch and a ghost feature extraction branch which are parallel;
the first deep layer extraction module is used for extracting the features of the shallow image features through the first lightweight feature extraction module to obtain first deep layer image features;
the element mask setting module is used for carrying out mask setting on each element in the first deep image feature output by the first lightweight feature extraction module through the fine-granularity sparse mask branch to obtain a corresponding fine-granularity sparse mask; the fine granularity sparse mask is obtained by adopting the following formula, and the method comprises the following steps:
if, for the current element, σ((s + g1 − g2)/τ) is greater than a preset value, a mask of 1 is generated; when it is not greater than the preset value, a mask of 0 is generated; wherein s represents the soft output of the fine-grained sparse mask branch, g1 and g2 represent noise sampled from the Gumbel(0, 1) distribution, τ is a trade-off parameter, and σ represents a Sigmoid activation function;
the second deep layer extraction module is used for extracting the ghost features based on the fine granularity sparse mask and the first deep layer image features through the ghost feature extraction branches to obtain corresponding second deep layer image features;
the super-resolution reconstruction module is used for performing super-resolution reconstruction according to the initial resolution image, the shallow layer image features and the second deep layer image features to obtain a target resolution image; wherein the resolution of the initial resolution image is lower than the resolution of the target resolution image.
9. The apparatus of claim 8, wherein the super-resolution reconstruction module is further configured to perform residual connection with the shallow image feature according to the second deep image feature to obtain a detail image feature; performing dimension reduction processing on the detail image features through a first simple linear block to obtain dimension reduction image features; and carrying out residual connection and up-sampling processing on the basis of the initial resolution image and the dimension-reduced image characteristics to obtain the target resolution image.
10. The apparatus of claim 8, wherein the second deep extraction module is further configured to perform a point-wise multiplication with the fine-granularity sparse mask using the first deep image feature to obtain a sparse image feature; performing sparse convolution processing on the sparse image features to obtain convolved sparse image features; adding the convolved sparse image features and the first deep image features to obtain first intermediate image features; and carrying out ghost feature extraction based on the first intermediate image feature to obtain the second deep image feature.
11. The apparatus of claim 10, wherein the second depth extraction module is further configured to perform a depth separable convolution on the first intermediate image feature to adjust the channel number to rC according to the ghost ratio r, and to perform residual connection in the channel dimension to obtain a second intermediate image feature with channel dimension C, wherein the value of r is not greater than 1; and to perform channel attention distribution processing and activation processing based on the second intermediate image feature to obtain the second deep image feature.
12. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
13. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202310959247.5A 2023-08-01 2023-08-01 Image super-resolution reconstruction method, device, computer equipment and storage medium Active CN116664409B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310959247.5A CN116664409B (en) 2023-08-01 2023-08-01 Image super-resolution reconstruction method, device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN116664409A CN116664409A (en) 2023-08-29
CN116664409B true CN116664409B (en) 2023-10-31

Family

ID=87715727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310959247.5A Active CN116664409B (en) 2023-08-01 2023-08-01 Image super-resolution reconstruction method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116664409B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495681A (en) * 2024-01-03 2024-02-02 国网山东省电力公司济南供电公司 Infrared image super-resolution reconstruction system and method

Citations (3)

Publication number Priority date Publication date Assignee Title
CN113298716A (en) * 2021-05-31 2021-08-24 重庆师范大学 Image super-resolution reconstruction method based on convolutional neural network
CN115063297A (en) * 2022-06-30 2022-09-16 中国人民解放军战略支援部队信息工程大学 Image super-resolution reconstruction method and system based on parameter reconstruction
CN115239553A (en) * 2022-05-31 2022-10-25 武汉工程大学 Satellite image super-resolution reconstruction method, system, equipment and readable storage medium

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN105931179B (en) * 2016-04-08 2018-10-26 武汉大学 A kind of image super-resolution method and system of joint sparse expression and deep learning


Non-Patent Citations (1)

Title
Xu Lei; Song Huihui; Liu Qingshan. "Binocular Image Super-Resolution Reconstruction with a Multi-level Fusion Attention Network". Journal of Image and Graphics. 2023, pp. 1-12. *

Also Published As

Publication number Publication date
CN116664409A (en) 2023-08-29

Similar Documents

Publication Publication Date Title
Chen et al. Image super-resolution reconstruction based on feature map attention mechanism
Liu et al. Video super-resolution based on deep learning: a comprehensive survey
Zeng et al. Coupled deep autoencoder for single image super-resolution
Yang et al. Image super-resolution via sparse representation
CN110782395B (en) Image processing method and device, electronic equipment and computer readable storage medium
CN111105352A (en) Super-resolution image reconstruction method, system, computer device and storage medium
CN116664409B (en) Image super-resolution reconstruction method, device, computer equipment and storage medium
Zhang et al. NTIRE 2023 challenge on image super-resolution (x4): Methods and results
CN111932480A (en) Deblurred video recovery method and device, terminal equipment and storage medium
CN114519667A (en) Image super-resolution reconstruction method and system
CN114494022B (en) Model training method, super-resolution reconstruction method, device, equipment and medium
Muhammad et al. Multi-scale Xception based depthwise separable convolution for single image super-resolution
CN112991171A (en) Image processing method, image processing device, electronic equipment and storage medium
Zhao et al. Comprehensive and delicate: An efficient transformer for image restoration
Jung et al. A fast deconvolution-based approach for single-image super-resolution with GPU acceleration
CN116547694A (en) Method and system for deblurring blurred images
Esmaeilzehi et al. SRNHARB: A deep light-weight image super resolution network using hybrid activation residual blocks
Peng Super-resolution reconstruction using multiconnection deep residual network combined an improved loss function for single-frame image
Du Single image super-resolution using global enhanced upscale network
CN115147284A (en) Video processing method, video processing device, computer equipment and storage medium
CN114066722B (en) Method and device for acquiring image and electronic equipment
Xie et al. Bidirectionally aligned sparse representation for single image super-resolution
CN115272082A (en) Model training method, video quality improving method, device and computer equipment
Liu et al. Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model
CN115115972A (en) Video processing method, video processing apparatus, computer device, medium, and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant