CN110246084B - Super-resolution image reconstruction method, system and device thereof, and storage medium - Google Patents
- Publication number: CN110246084B (application CN201910409645.3A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06F18/214 — Pattern recognition; design or setup of recognition systems; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/23 — Pattern recognition; clustering techniques
- G06N3/045 — Neural networks; architectures combining multiple networks
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
Abstract
The invention discloses a super-resolution image reconstruction method and a corresponding system, device and storage medium. First, an original low-resolution image is input and preprocessed by bicubic interpolation. The preprocessed image is then passed through a sparse deep convolutional neural network with a pooling manifold constraint, in which feature maps are connected and rearranged so that the network learns the high-frequency detail components missing from the image, reconstructing a sharp high-resolution image. The invention effectively reduces the computational cost of the network parameters while preserving important image detail: the original low-resolution image is fed into a sparse deep convolutional neural network model trained with the prescribed algorithm and reconstructed into a high-resolution image.
Description
Technical Field
The invention relates to the technical field of image reconstruction, and in particular to a face super-resolution image reconstruction method based on a deep sparse convolutional neural network with a pooling manifold constraint, together with a corresponding system, device and storage medium.
Background
In recent years society has developed rapidly, and public attention to the protection of personal privacy, property and safety has grown. To maintain public order and citizen safety, video surveillance systems are widely deployed in traffic and security applications. During investigation and tracking, police need clear surveillance imagery to improve case-resolution rates. In actual monitoring, however, the limited resolution of the surveillance system's imaging device and interference factors such as the distance between the target face and the camera, blur, noise and weather mean that only low-resolution, poor-quality video can be obtained, making identification of the target face difficult. Recovering a high-resolution image from a low-resolution one by super-resolution reconstruction is therefore critical.
Among current mainstream super-resolution reconstruction algorithms, deep convolutional neural networks show great advantages. The computational cost of a deep convolutional network is dominated by the number of convolution multiplications, and this huge computational load limits practical application. At the same time, because a pooling layer (i.e. a sampling operation on the image feature data) inevitably discards image detail during forward propagation, most designs choose a fully convolutional architecture without pooling, which preserves image detail effectively and brings the super-resolution result closer to the real image. However, once the network has no pooling layers and the image data are trained and tested through full convolutional layers, the feature-map size between layers stays unchanged, so the computation and runtime of the network grow, with negative consequences. Existing work reduces the number of multiplications by applying linear transformations to the neurons and convolution kernels of the convolution operation, or compresses the network by modifying its weights; but once neurons and kernels are linearly transformed, the network's sparsity disappears and can no longer be exploited for acceleration. Improved activation functions in fully convolutional networks reduce the computational complexity between layers, but do not reduce the memory footprint or the number of network parameters.
Disclosure of Invention
To solve the above problems, the present invention provides a super-resolution image reconstruction method, system, apparatus and storage medium that effectively reduce the computational cost of the network parameters while retaining important image detail: an original low-resolution image is input into a sparse deep convolutional neural network model trained with the prescribed algorithm and reconstructed into a high-resolution image.
The technical solution adopted by the invention to solve these problems is as follows:
in a first aspect, an embodiment of the present invention provides a super-resolution image reconstruction method, including:
acquiring an original image;
carrying out bicubic interpolation on the original image to increase the number of image pixels, obtaining a preprocessed image;
carrying out image data analysis clustering and reconstruction on the preprocessed image through a sparse depth convolution neural network containing pooling manifold constraint;
and outputting the reconstructed image.
Further, the acquiring the original image includes:
and for the monitoring video, performing screenshot operation on the video by frames by using a Matlab writing program to obtain an original image data set.
Further, the bicubic interpolation involves the 16 pixels nearest to each sampling location:

F(i', j') = Σ_{m=0}^{3} Σ_{n=0}^{3} f(i − 1 + m, j − 1 + n) · R(m − 1 − dx) · R(n − 1 − dy)

where (i', j') = (i + dx, j + dy) are the fractional-part pixel coordinates in the original image, dx and dy are the fractional offsets along the x and y axes respectively, and F(i', j') is the new pixel value computed as the weighted sum over the 16 pixels of the original image nearest to the coordinate. R(x) represents the interpolation (cubic convolution) kernel, computed as (with a = −0.5):

R(x) = (a + 2)|x|³ − (a + 3)|x|² + 1,   for |x| ≤ 1
R(x) = a|x|³ − 5a|x|² + 8a|x| − 4a,    for 1 < |x| < 2
R(x) = 0,                               otherwise

Bicubic interpolation of the original image increases the number of image pixels.
Further, performing image data analysis, clustering and reconstruction on the preprocessed image through the sparse deep convolutional neural network with pooling manifold constraint comprises the following steps:
The preprocessed image is processed along two branches, mode one and mode two, each combining manifold-function-constrained pooling sampling, feature-map extraction in the convolutional layers, skip connections between layers and sparsification of the convolutional network, yielding an image A and an image B respectively; a joint convolution of image A and image B then produces the reconstructed image.
Further, the mode one includes:
The preprocessed image passes through a pooling layer with a manifold constraint, and the downsampled, copied feature-map data are connected and rearranged into a new feature map; the interpolated feature map produced by bicubic interpolation is discarded. The new feature map is input into a 10-layer architecture of convolutions and skip connections, and the generated feature map is passed to the next convolutional layer to obtain a tensor of size (H/2, W/2, 4). The convolutional network is then sparsified, and finally a sub-pixel upscaling operation upsamples the tensor to the target resolution, producing image A of size (H, W).
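A minimal numpy sketch of the two rearrangements mode one relies on (function names are ours, not the patent's): the pooling layer's downsample-and-rearrange is a space-to-depth operation that halves H and W without discarding any pixel values, and the sub-pixel upscaling is its inverse, depth-to-space:

```python
import numpy as np

def space_to_depth(x, r=2):
    """(H, W, C) -> (H/r, W/r, C*r*r): rearrange pixels into channels; no detail lost."""
    H, W, C = x.shape
    x = x.reshape(H // r, r, W // r, r, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H // r, W // r, C * r * r)

def depth_to_space(x, r=2):
    """Sub-pixel rearrangement: (H/r, W/r, C*r*r) -> (H, W, C)."""
    h, w, crr = x.shape
    C = crr // (r * r)
    x = x.reshape(h, w, r, r, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(h * r, w * r, C)

img = np.arange(4 * 4 * 1, dtype=float).reshape(4, 4, 1)
packed = space_to_depth(img)                         # (2, 2, 4), like the (H/2, W/2, 4) tensor
assert packed.shape == (2, 2, 4)
assert np.array_equal(depth_to_space(packed), img)   # exact inverse: no information lost
```

Because the round trip is exactly invertible, pooling by rearrangement avoids the detail loss of ordinary downsampling, which is the motivation the text gives for this layer.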
Further, the second mode includes:
The preprocessed image is first input into a convolutional layer with 64 filters to generate a feature map of size (H, W, 64), and then into the 10-layer convolution and skip-connection architecture; the resulting (H, W, 64) feature map is sparsified by the network, and a convolutional layer with a single filter then generates image B.
In a second aspect, an embodiment of the present invention further provides a super-resolution image reconstruction system, including:
an image acquisition unit for acquiring an original image;
the image data enhancement unit is used for carrying out bicubic interpolation on the original image to increase the number of image pixels, obtaining a preprocessed image;
the image analysis clustering and reconstruction unit is used for carrying out image data analysis clustering and reconstruction on the preprocessed images through a sparse depth convolution neural network containing pooling manifold constraints;
an image output unit for outputting the reconstructed image.
In a third aspect, an embodiment of the present invention further provides a super-resolution image reconstruction apparatus, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, where computer-executable instructions are stored, and the computer-executable instructions are configured to enable a computer to execute the method described in the first aspect of the present invention.
One or more technical solutions provided in the embodiments of the invention have at least the following beneficial effects. The invention provides a super-resolution image reconstruction method, system, device and storage medium: an original low-resolution image is input and preprocessed by bicubic interpolation; the preprocessed image is then passed through a sparse deep convolutional neural network with a pooling manifold constraint, in which feature maps are connected and rearranged so that the network learns the high-frequency detail components missing from the image, reconstructing a sharp high-resolution image. The invention effectively reduces the computational cost of the network parameters while preserving important image detail: the original low-resolution image is fed into a sparse deep convolutional neural network model trained with the prescribed algorithm and reconstructed into a high-resolution image. The method preserves network sparsity, mitigates the loss of important detail caused by pooling-layer sampling of the data, and alleviates the memory footprint and growth in parameter computation caused by a fully convolutional deep network.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIGS. 1a and 1b are flow charts of a super-resolution image reconstruction method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of super resolution reconstruction of an image based on a deep convolutional neural network according to a first embodiment of the present invention;
FIG. 3 is a diagram illustrating a convolution and jump-join process performed on image data according to a first embodiment of the present invention;
FIG. 4 is a flowchart of the network sparsification operation of the convolution and jump connection architecture for image data according to the first embodiment of the present invention;
FIG. 5 is a schematic diagram of a super-resolution image reconstruction system according to a second embodiment of the present invention;
fig. 6 is a schematic configuration diagram of a super-resolution image reconstruction apparatus according to a third embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. Where "first" and "second" appear, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated features, or their order.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
The embodiments of the present invention will be further explained with reference to the drawings.
Referring to fig. 1a and 1b, a first embodiment of the present invention provides a super-resolution image reconstruction method, including but not limited to the following steps:
s100, acquiring an original image;
s200, carrying out bicubic interpolation on the original image to enhance the number of image pixels and obtain a preprocessed image;
s300, carrying out image data analysis clustering and reconstruction on the preprocessed image through a sparse depth convolution neural network containing pooling manifold constraint;
and S400, outputting the reconstructed image.
Raw data acquisition is realized through S100, which specifically comprises: for a surveillance video, a Matlab script captures the video frame by frame to obtain a low-resolution image data set.
Image data enhancement and preprocessing are realized through S200. The purpose is to emphasize interesting features of the original image, suppress uninteresting ones, enrich the image information and improve the recognition effect. The specific method is as follows: the bicubic interpolation involves 16 pixels, where (i', j') denotes the fractional-part pixel coordinates in the original image, dx and dy are the offsets along the x and y axes, and F(i', j') is the new pixel value computed as the weighted sum over the 16 pixels of the original input image nearest to the coordinate. R(x) denotes the interpolation kernel; because the invention studies super-resolution reconstruction of face images, whose processing is comparatively complex, a polynomial-based convolution sampling interpolation is adopted for R(x), as given above. Bicubic interpolation of the original image increases the number of image pixels.
Image data analysis, clustering and reconstruction are realized through S300: the enhanced image is input for super-resolution reconstruction based on a deep convolutional neural network. This mainly comprises a manifold-function-constrained pooling sampling process, extraction of feature maps in the convolutional layers, data skip connections, and sparsification of the convolutional network used to reconstruct the high-resolution image. The network structure is sketched in Fig. 2.
The main flow of the super-resolution reconstruction designed by the invention is as follows: after the originally input low-resolution image is enhanced by bicubic interpolation, the image data are convolved and connected along two different modes, as shown in Fig. 2; the feature-map results, image A and image B, produced by the two modes undergo a joint convolution operation, and the final super-resolution reconstructed image is output. Fig. 3 illustrates the connection of convolutions and skip connections.
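The convolution-plus-skip-connection unit of Fig. 3 can be sketched in its minimal hypothetical form (not the patent's exact layer) as y = x + conv(x): the identity path carries the low-frequency content forward unchanged while the convolution branch learns only the residual detail:

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same' 2-D convolution (cross-correlation, as in deep-learning conv layers)
    of a single-channel image with a 3x3 kernel."""
    H, W = x.shape
    pad = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(pad[i:i+3, j:j+3] * k)
    return out

def skip_block(x, k):
    """One convolution with a skip (jump) connection: output = input + conv(input)."""
    return x + conv2d_same(x, k)

x = np.random.default_rng(0).random((8, 8))
k0 = np.zeros((3, 3))                      # zero kernel: residual branch contributes nothing
assert np.allclose(skip_block(x, k0), x)   # the skip path alone reproduces the input
```

Because the input reaches the output unaltered even when the convolution contributes nothing, such blocks ease gradient flow in the 10-layer architecture the text describes.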
In the first mode, the image enhanced by bicubic interpolation passes through a pooling layer with a manifold constraint; the downsampled, copied feature-map data are connected and rearranged into a new feature map, while the interpolated feature map produced by bicubic interpolation is discarded. The new feature map is input into a 10-layer architecture of convolutions (including transposed convolutions) and skip connections, and the generated feature map is passed to the next convolutional layer to obtain a tensor of size (H/2, W/2, 4). The convolutional network is then sparsified, and finally a sub-pixel upscaling operation upsamples the tensor to the target resolution, producing a high-quality super-resolution image A of size (H, W).
In the second mode, the original low-resolution image after bicubic interpolation is routed directly into the 10-layer convolution and skip-connection architecture, providing additional image detail for reconstructing a high-quality super-resolution image. As shown by the dotted portion of Fig. 2, after bicubic interpolation the low-resolution image is first input into a convolutional layer with 64 filters to generate a feature map of size (H, W, 64), and then into the convolution and skip-connection structure described above (sharing weights with mode one); the resulting (H, W, 64) feature map is sparsified by the network, and a convolutional layer with a single filter then generates the high-quality super-resolution image B.
The high-quality super-resolution images A and B generated by the two modes are combined, and the final super-resolution reconstructed image is obtained through a convolutional layer with a single filter.
Finally, outputting the high resolution image is achieved through S400.
The following is a detailed description of the main process:
Because the originally input low-resolution image contains much of the image detail, and to avoid the memory footprint and growth in parameter computation caused by a fully convolutional architecture, the invention proposes a pooling process with a manifold constraint: by rearranging pixel positions, the input feature map is sampled into several reduced-resolution versions without loss of image detail.
The manifold-learning idea of Locally Linear Embedding (LLE) is adopted between the pooling layers, so that the detail of the data is not lost during pooling sampling while the network computation is reduced: after the samples are mapped from the high-dimensional space to the low-dimensional space, the linear relationships within each neighborhood are unchanged.
For a sample point x_i, its neighborhood in the original high-dimensional space is found using the K-nearest-neighbor idea. Let x_j, x_k, x_l be the three samples nearest to x_i, and assume x_i can be linearly represented by them:

x_i = w_ij x_j + w_ik x_k + w_il x_l

where w_ij, w_ik, w_il are weight coefficients. After dimensionality reduction by LLE, the projections x'_i, x'_j, x'_k, x'_l of x_i, x_j, x_k, x_l in the low-dimensional space preserve the same linear relation as far as possible, i.e.:

x'_i ≈ w_ij x'_j + w_ik x'_k + w_il x'_l

The weight coefficients are kept as consistent as possible between the low-dimensional and high-dimensional spaces.
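Solving the neighborhood weights for one sample can be sketched as follows (standard LLE with the three neighbors of the text; the small regularizer added to the local Gram matrix is a numerical-stability detail of ours, not specified by the patent):

```python
import numpy as np

def lle_weights(xi, neighbors, reg=1e-9):
    """Solve w minimizing ||xi - sum_j w_j x_j||^2 subject to sum_j w_j = 1."""
    D = xi - neighbors                 # (K, d): rows are x_i - x_j
    C = D @ D.T                        # local Gram matrix C_jk = (x_i-x_j)^T (x_i-x_k)
    C = C + reg * np.trace(C) * np.eye(len(neighbors))  # regularize: C may be singular
    w = np.linalg.solve(C, np.ones(len(neighbors)))
    return w / w.sum()                 # enforce the sum-to-one constraint

rng = np.random.default_rng(1)
nb = rng.random((3, 5))                # three high-dimensional neighbors x_j, x_k, x_l
xi = np.array([0.5, 0.3, 0.2]) @ nb    # x_i lies in their affine span
w = lle_weights(xi, nb)
assert abs(w.sum() - 1.0) < 1e-8
assert np.allclose(w @ nb, xi, atol=1e-5)  # the weights reconstruct x_i
```

When x_i truly lies in the affine span of its neighbors, as constructed here, the recovered weights reconstruct it exactly, which is the invariance the embedding step then preserves.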
The algorithm can be divided into two steps. The first step computes the neighborhood reconstruction coefficients w of all samples from the neighborhood relations, i.e. finds the linear relation between each sample and the samples in its neighborhood:

w_ij = ( Σ_k C_jk⁻¹ ) / ( Σ_{l,s} C_ls⁻¹ ),   where C_jk = (x_i − x_j)ᵀ(x_i − x_k)

The second step solves the coordinates of each sample in the low-dimensional space under the invariance of the neighborhood reconstruction coefficients:

min_Z tr(Z M Zᵀ),   subject to Z Zᵀ = I

where Z = (z_1, z_2, …, z_m) ∈ R^{d'×m}, (W)_ij = w_ij and M = (I − W)ᵀ(I − W).
In the convolutional layers, an adaptive sample selection method is added to realize manifold analysis and clustering of the local facial structure; it is used to recover the high-frequency information missing from low-frequency image blocks.
Feature image blocks p_k (k = 1, …, n) are extracted at the positions corresponding to the d-th convolutional layer and assembled as column vectors into a set P_i = [p_1, p_2, …, p_n]; the LPP transformation matrix A_i of P_i and the mapped data matrix Y_i = A_iᵀ P_i are computed. For each image block of the input low-resolution (LR) image, its low-dimensional feature set is computed, and then the image-block pairs whose entries in the mapped data matrix Y_i are closest to that low-dimensional feature, by Euclidean distance, are selected as the training set for recovering the high-frequency information missing from the low-frequency block.
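Selecting the patch whose mapped feature is nearest, in Euclidean distance, to a query feature can be sketched as below; the column layout of Y_i (one mapped patch per column) follows the description above, while the function name is ours:

```python
import numpy as np

def nearest_patch_index(y_query, Y):
    """Y: (d', n) mapped data matrix, one column per patch.
    Return the index of the column nearest to y_query in Euclidean distance."""
    dists = np.linalg.norm(Y - y_query[:, None], axis=0)  # distance to each column
    return int(np.argmin(dists))

Y = np.array([[0.0, 1.0, 4.0],
              [0.0, 1.0, 4.0]])        # three mapped patches as columns
q = np.array([0.9, 1.1])
assert nearest_patch_index(q, Y) == 1  # column 1 is closest to the query
```

The selected index identifies the stored LR/HR patch pair whose high-frequency content is transferred to the low-frequency block.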
The network sparsification operation that follows the convolution and skip-connection architecture preserves the sparsity of the weights and activation functions and significantly reduces the computation performed before the multiplication, as shown in Fig. 4. It proceeds in three stages: dense training, pruning and retraining. Dense training: a dense p × p kernel is trained directly in the transform domain; the transformed kernels are initialized and trained directly by back-propagation, with no need to store the kernels in the spatial domain or to transform spatial kernels. Pruning: the transformed kernels are pruned by computing the threshold t required to reach the desired pruning rate r and resetting every weight whose absolute value is less than t to zero. Retraining: the pruned weights are kept at zero; the sparsity mask computed during the pruning step is held constant throughout retraining. This is formulated as:
S = Aᵀ [ Prune(G g Gᵀ) ⊙ ReLU(Bᵀ d B) ] A

where G g Gᵀ is the transform-domain weight kernel, Bᵀ d B is the transform-domain input activation, and ⊙ denotes element-wise multiplication; during back-propagation the gradients of the weights and of the input activations are computed in the same domain.
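The prune-and-retrain stages can be sketched as follows (a sketch of the scheme just described, not the patent's exact procedure; names are ours): the threshold t is derived from the desired pruning rate r, weights with |w| < t are zeroed, and the resulting mask is reapplied after each update so pruned weights stay zero:

```python
import numpy as np

def prune(weights, rate):
    """Zero the fraction `rate` of weights with smallest |w|; return pruned weights and mask."""
    t = np.quantile(np.abs(weights).ravel(), rate)  # threshold achieving the pruning rate
    mask = np.abs(weights) >= t
    return weights * mask, mask

w = np.array([[0.05, -0.8], [0.3, -0.02]])
pruned, mask = prune(w, 0.5)                 # prune the 50% smallest-magnitude weights
assert np.count_nonzero(pruned) == 2

grad = np.ones_like(w)
w_retrained = (pruned - 0.1 * grad) * mask   # retraining step: masked update keeps zeros zero
assert np.count_nonzero(w_retrained * ~mask) == 0
```

Holding the mask constant during retraining is what lets the remaining weights recover accuracy while the multiplication count stays reduced.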
Finally, so that the reconstructed high-resolution image approximates the real image Y of the original face database as closely as possible, the residual image is defined as r = Y − x, where x denotes a low-resolution image and Y the corresponding high-resolution image in the face database. The residual estimate and the low-resolution image are input at the loss layer and compared against the face-database image, with the mean-squared-error function as the loss:

L(θ) = (1/n) Σ_{i=1}^{n} ‖ Y_i − F(x_i; θ) ‖²

where n is the number of samples and θ = {W_1, W_2, …, W_d, B_1, B_2, …, B_d} are the network parameters.
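With the residual defined as r = Y − x, the mean-squared-error loss above reduces to a few lines; here the network output is stood in for by `x + r_est` with a hypothetical residual estimate `r_est` (all names ours):

```python
import numpy as np

def mse_loss(Y, y_pred):
    """Mean squared error over n samples: (1/n) * sum_i ||Y_i - y_pred_i||^2."""
    return float(np.mean(np.sum((Y - y_pred) ** 2, axis=1)))

x = np.array([[1.0, 2.0], [3.0, 4.0]])   # low-resolution inputs (flattened)
Y = np.array([[1.5, 2.5], [3.0, 5.0]])   # corresponding high-resolution targets

r_est = Y - x                            # a perfect residual estimate
assert mse_loss(Y, x + r_est) == 0.0     # zero loss when the residual is exact
assert mse_loss(Y, x) == 0.75            # otherwise the loss is the mean residual energy
```

Learning the residual rather than Y directly means the network only has to model the missing high-frequency detail, consistent with the design described above.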
Throughout the network architecture, the ReLU activation function is used to give the function, and the network as a whole, its nonlinear character, allowing the neural network to approximate arbitrary nonlinear functions; during network sparsification all negative activations are set to zero, reducing the number of multiplications in the transform domain. The formula is:

F(x) = max(0, x)
and reconstructing the low-resolution image through a depth convolution neural network to obtain a corresponding high-resolution image output.
In addition, referring to fig. 5, a second embodiment of the present invention provides a super-resolution image reconstruction system including:
an image acquisition unit 110 for acquiring an original image;
the image data enhancement unit 120 is configured to perform bicubic interpolation on the original image, increasing the number of image pixels to obtain a preprocessed image;
the image analysis clustering and reconstruction unit 130 is used for performing image data analysis clustering and reconstruction on the preprocessed image through a sparse depth convolution neural network containing pooling manifold constraint;
an image output unit 140 for outputting the reconstructed image.
The super-resolution image reconstruction system in this embodiment is based on the same inventive concept as the method of the first embodiment and therefore shares its beneficial effects: the image acquisition unit 110 acquires an original image; the image data enhancement unit 120 performs bicubic interpolation on the original image to increase the number of image pixels, obtaining a preprocessed image; the image analysis clustering and reconstruction unit 130 performs image data analysis, clustering and reconstruction on the preprocessed image through the sparse deep convolutional neural network with pooling manifold constraint; and the image output unit 140 outputs the reconstructed image. The system effectively reduces the computational cost of the network parameters while preserving important image detail: the original low-resolution image is fed into a sparse deep convolutional neural network model trained with the prescribed algorithm and reconstructed into a high-resolution image.
Referring to fig. 6, the third embodiment of the present invention also provides a super-resolution image reconstruction apparatus including:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a super-resolution image reconstruction method as in any one of the first embodiments above.
The apparatus 200 may be any type of smart terminal, such as a cell phone, a tablet, a personal computer, etc.
The processor and memory may be connected by a bus or by other means; FIG. 6 takes a bus connection as an example.
The memory, as a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the super-resolution image reconstruction method in the embodiments of the present invention. By running the non-transitory software programs, instructions and modules stored in the memory, the processor executes the various functional applications and data processing of the apparatus 200, that is, implements the super-resolution image reconstruction method of any of the above-described method embodiments.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the device 200, and the like. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and such remote memory may be coupled to the device 200 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and when executed by the one or more processors perform the super-resolution image reconstruction method of any of the method embodiments described above, e.g., perform method steps S100 to S400 of the first embodiment described above.
The fourth embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions which, when executed by one or more control processors (for example, one of the processors in fig. 6), cause the one or more processors to execute the super-resolution image reconstruction method of the above-described method embodiments, for example, method steps S100 to S400 of the first embodiment.
The above-described embodiments of the apparatus are merely illustrative; the units described as separate parts may or may not be physically separate, and may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a general-purpose hardware platform, or entirely in hardware. Those skilled in the art will also understand that all or part of the processes of the above method embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
While the preferred embodiments of the present invention have been described, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and such equivalent modifications or substitutions are to be included within the scope of the present invention defined by the claims.
Claims (6)
1. A super-resolution image reconstruction method is characterized by comprising the following steps:
acquiring an original image;
carrying out bicubic interpolation on the original image to increase the number of image pixels and obtain a preprocessed image;
carrying out image data analysis, clustering and reconstruction on the preprocessed image through a sparse deep convolutional neural network with a pooling manifold constraint;
outputting a reconstructed image;
the image data analysis, clustering and reconstruction of the preprocessed image through the sparse deep convolutional neural network with the pooling manifold constraint comprises the following steps: performing manifold-function-constrained pooling sampling on the preprocessed image data through a mode I and a mode II, extracting feature maps in the convolutional layers, performing data skip connection and convolutional neural network sparsification to obtain an image A and an image B respectively, and performing a combined convolution operation on the image A and the image B to obtain the reconstructed image;
wherein the mode I comprises:
the preprocessed image passes through a pooling layer with a manifold constraint, in which down-sampled copies of the feature map data are connected and rearranged to obtain a new feature map while the interpolated feature map generated by the bicubic interpolation is discarded; the new feature map is input into a 10-layer convolution and skip-connection architecture, and the generated feature map is then input into a next convolution layer to obtain a vector of size (H/2, W/2, 4); the convolutional neural network is then sparsified, and a sub-pixel amplification operation is finally performed on the obtained vector to up-sample to the target resolution and generate an image A of size (H, W), wherein H is the height and W is the width;
the mode II comprises: the preprocessed image is first input into a convolution layer with 64 filters to generate a feature map of output size (H, W, 64) and then into a 10-layer convolution and skip-connection architecture; the feature map of size (H, W, 64) is sparsified through the network, and an image B is then generated through a convolution layer with a single filter.
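The two branches of claim 1 can be sketched with numpy. Only the sub-pixel rearrangement is implemented faithfully; the pooling, the 10-layer convolution/skip architectures, and the sparsification are replaced by hypothetical stubs, so the branch functions are illustrative rather than the patented network.

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int = 2) -> np.ndarray:
    # Sub-pixel amplification as in the final step of mode I:
    # rearrange an (H/r, W/r, r*r) tensor into an (H, W) image,
    # channel c = di*r + dj landing at pixel (i*r + di, j*r + dj).
    h, w, c = x.shape
    assert c == r * r
    return x.reshape(h, w, r, r).transpose(0, 2, 1, 3).reshape(h * r, w * r)

def mode_one(pre: np.ndarray) -> np.ndarray:
    # Hypothetical mode-I branch: 2x2 average pooling stands in for the
    # manifold-constrained pooling layer; channel stacking stands in for
    # the 10-layer convolution/skip architecture that would produce the
    # (H/2, W/2, 4) vector; pixel_shuffle restores size (H, W).
    h, w = pre.shape
    pooled = pre.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    feat = np.stack([pooled] * 4, axis=-1)  # stub (H/2, W/2, 4) features
    return pixel_shuffle(feat, r=2)

def mode_two(pre: np.ndarray) -> np.ndarray:
    # Hypothetical mode-II branch: identity stands in for the 64-filter
    # convolution, the skip architecture, and the single-filter output.
    return pre

pre = np.arange(16.0).reshape(4, 4)
out = 0.5 * (mode_one(pre) + mode_two(pre))  # stub for the combined convolution of A and B
```

The design point worth noting is that mode I works at half resolution and recovers size only at the end via sub-pixel rearrangement, while mode II stays at full resolution throughout, which is what lets the combination keep detail while cutting computation.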
2. The super-resolution image reconstruction method according to claim 1, wherein acquiring the original image comprises:
for a monitoring video, performing a frame-by-frame screenshot operation on the video with a program written in Matlab to obtain an original image data set.
3. The super-resolution image reconstruction method according to claim 1, wherein the bicubic interpolation calculation involves 16 pixel points, and the calculation formula is as follows:
wherein, (i ', j') represents the fractional part pixel coordinates in the original image, dx and dy are the coordinates in the directions of the x axis and the y axis respectively, m and n are preset constants, F (i ', j') represents a new pixel value calculated by the sum of weight convolutions between 16 pixel points closest to the coordinates of each pixel point in the original image, R (x) represents an interpolation expression, and the calculation formula is as follows:
4. a super-resolution image reconstruction system applied to the super-resolution image reconstruction method according to any one of claims 1 to 3, comprising:
an image acquisition unit for acquiring an original image;
the image data enhancement unit is used for carrying out bicubic interpolation on the original image, enhancing the number of image pixels and obtaining a preprocessed image;
the image analysis clustering and reconstruction unit is used for carrying out image data analysis clustering and reconstruction on the preprocessed image through a sparse depth convolution neural network containing pooling manifold constraint;
an image output unit for outputting the reconstructed image.
5. A super-resolution image reconstruction apparatus, comprising:
at least one processor; and the number of the first and second groups,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-3.
6. A computer-readable storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910409645.3A CN110246084B (en) | 2019-05-16 | 2019-05-16 | Super-resolution image reconstruction method, system and device thereof, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910409645.3A CN110246084B (en) | 2019-05-16 | 2019-05-16 | Super-resolution image reconstruction method, system and device thereof, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110246084A CN110246084A (en) | 2019-09-17 |
CN110246084B true CN110246084B (en) | 2023-03-31 |
Family
ID=67884554
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910409645.3A Active CN110246084B (en) | 2019-05-16 | 2019-05-16 | Super-resolution image reconstruction method, system and device thereof, and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110246084B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112581363A (en) * | 2019-09-29 | 2021-03-30 | 北京金山云网络技术有限公司 | Image super-resolution reconstruction method and device, electronic equipment and storage medium |
CN110942425A (en) * | 2019-11-26 | 2020-03-31 | 贵州师范学院 | Reconstruction method and reconstruction system of super-resolution image and electronic equipment |
CN111222446B (en) * | 2019-12-31 | 2023-05-16 | Oppo广东移动通信有限公司 | Face recognition method, face recognition device and mobile terminal |
CN113409192A (en) * | 2021-06-17 | 2021-09-17 | Oppo广东移动通信有限公司 | Super-resolution chip, super-resolution algorithm updating method and electronic equipment |
CN117643059A (en) * | 2021-07-01 | 2024-03-01 | 抖音视界有限公司 | Super resolution downsampling |
CN114170619B (en) * | 2021-10-18 | 2022-08-19 | 中标慧安信息技术股份有限公司 | Data checking method and system based on edge calculation |
CN113675850B (en) * | 2021-10-25 | 2022-02-08 | 山东大学 | Power grid information rapid and accurate sensing method based on nonlinear robust estimation |
CN115546858B (en) * | 2022-08-15 | 2023-08-25 | 荣耀终端有限公司 | Face image processing method and electronic equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106709875A (en) * | 2016-12-30 | 2017-05-24 | 北京工业大学 | Compressed low-resolution image restoration method based on combined deep network |
CN107480726A (en) * | 2017-08-25 | 2017-12-15 | 电子科技大学 | A kind of Scene Semantics dividing method based on full convolution and shot and long term mnemon |
CN107507134A (en) * | 2017-09-21 | 2017-12-22 | 大连理工大学 | Super-resolution method based on convolutional neural networks |
CN109461157A (en) * | 2018-10-19 | 2019-03-12 | 苏州大学 | Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI624804B (en) * | 2016-11-07 | 2018-05-21 | 盾心科技股份有限公司 | A method and system for providing high resolution image through super-resolution reconstrucion |
FR3059804B1 (en) * | 2016-12-07 | 2019-08-02 | Idemia Identity And Security | IMAGE PROCESSING SYSTEM |
US10616482B2 (en) * | 2017-03-10 | 2020-04-07 | Gopro, Inc. | Image quality assessment |
US10979718B2 (en) * | 2017-09-01 | 2021-04-13 | Apple Inc. | Machine learning video processing systems and methods |
- 2019-05-16: Application CN201910409645.3A filed in China; patent CN110246084B, status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106709875A (en) * | 2016-12-30 | 2017-05-24 | 北京工业大学 | Compressed low-resolution image restoration method based on combined deep network |
CN107480726A (en) * | 2017-08-25 | 2017-12-15 | 电子科技大学 | A kind of Scene Semantics dividing method based on full convolution and shot and long term mnemon |
CN107507134A (en) * | 2017-09-21 | 2017-12-22 | 大连理工大学 | Super-resolution method based on convolutional neural networks |
CN109461157A (en) * | 2018-10-19 | 2019-03-12 | 苏州大学 | Image, semantic dividing method based on multi-stage characteristics fusion and Gauss conditions random field |
Non-Patent Citations (1)
Title |
---|
Image super-resolution reconstruction based on a convolutional neural network with intermediate-layer supervision; Li Xianguo et al.; Journal of Image and Graphics (中国图象图形学报); 2018-07-16 (No. 07); full text * |
Also Published As
Publication number | Publication date |
---|---|
CN110246084A (en) | 2019-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110246084B (en) | Super-resolution image reconstruction method, system and device thereof, and storage medium | |
US12008797B2 (en) | Image segmentation method and image processing apparatus | |
CN112446270B (en) | Training method of pedestrian re-recognition network, pedestrian re-recognition method and device | |
US12062158B2 (en) | Image denoising method and apparatus | |
CN111683269B (en) | Video processing method, video processing device, computer equipment and storage medium | |
CN112308200B (en) | Searching method and device for neural network | |
CN111914997B (en) | Method for training neural network, image processing method and device | |
CN110599401A (en) | Remote sensing image super-resolution reconstruction method, processing device and readable storage medium | |
CN113658040B (en) | Human face super-resolution method based on priori information and attention fusion mechanism | |
CN111476719A (en) | Image processing method, image processing device, computer equipment and storage medium | |
CN112446835B (en) | Image restoration method, image restoration network training method, device and storage medium | |
JP7357176B1 (en) | Night object detection, training method and device based on self-attention mechanism in frequency domain | |
CN111951195A (en) | Image enhancement method and device | |
CN109064402B (en) | Single image super-resolution reconstruction method based on enhanced non-local total variation model prior | |
CN111951165A (en) | Image processing method, image processing device, computer equipment and computer readable storage medium | |
WO2021057091A1 (en) | Viewpoint image processing method and related device | |
CN116452930A (en) | Multispectral image fusion method and multispectral image fusion system based on frequency domain enhancement in degradation environment | |
CN115409697A (en) | Image processing method and related device | |
CN117593187A (en) | Remote sensing image super-resolution reconstruction method based on meta-learning and transducer | |
Liu et al. | CNN-Enhanced graph attention network for hyperspectral image super-resolution using non-local self-similarity | |
CN111861877A (en) | Method and apparatus for video hyper-resolution | |
CN113191947B (en) | Image super-resolution method and system | |
Mun et al. | Universal super-resolution for face and non-face regions via a facial feature network | |
CN113256556A (en) | Image selection method and device | |
CN117314756B (en) | Verification and protection method and device based on remote sensing image, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||