CN113793265A

CN113793265A - Image super-resolution method and system based on depth feature relevance

Info

Publication number: CN113793265A
Application number: CN202111074208.4A
Authority: CN
Inventors: 潘金山; 臧庆; 唐金辉
Original assignee: Nanjing University of Science and Technology
Current assignee: Nanjing University of Science and Technology
Priority date: 2021-09-14
Filing date: 2021-09-14
Publication date: 2021-12-14

Abstract

The invention relates to an image super-resolution method and system based on depth feature correlation. The method comprises: acquiring a low-resolution image to be reconstructed; and inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image. The training process of the student model comprises: taking a low-resolution training image as the input of the student model and a high-resolution training image as its target output, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, training the student model under the supervision of both the high-resolution image output by a trained teacher model and the real image, to obtain the trained student model. The high-resolution image is thus estimated by a high-performance image super-resolution model with a small number of parameters.

Description

Image super-resolution method and system based on depth feature relevance

Technical Field

The invention relates to the field of image super-resolution, in particular to an image super-resolution method and system based on depth feature relevance.

Background

In daily life, more and more low-power consumption devices such as mobile phones and embedded terminals are widely used, and people need to perform related processing on low-resolution images so as to obtain high-resolution images with good visual effects on mobile devices. Therefore, the application of the image super-resolution algorithm in low power consumption devices has received a great deal of attention.

The goal of image super-resolution techniques is to estimate a high-resolution image from a low-resolution image. The degradation process of the image super-resolution problem is generally defined as:

L = SKI + n, (1)

where L, I and n represent the low-resolution image, the high-resolution image and noise, respectively, and S and K represent the matrix forms of the down-sampling operator with a given scale factor and of the blur kernel, respectively. Image super-resolution is an ill-posed problem, because infinitely many pairs of blur kernel K and high-resolution image I can generate the same low-resolution image L. Traditional interpolation-based methods are simple and fast, but the recovered high-resolution images are of poor quality. In recent years, with the rapid development of deep learning, methods based on deep convolutional neural networks have far surpassed traditional interpolation-based methods in reconstructing high-resolution images. Related experiments show that deep networks with more parameters improve the performance of image super-resolution algorithms more markedly, but they also bring a substantial increase in computation time and memory consumption. In real scenes, such large models, with their heavy computation and parameter counts, cannot be deployed on low-power devices such as mobile phones. To solve this problem, model compression methods are needed to obtain image super-resolution models with fewer parameters and good performance.
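The degradation model of formula (1) can be illustrated with a minimal sketch. It assumes a zero-padded blur for K, plain pixel skipping for S and additive Gaussian noise for n; the function name `degrade` and its parameters are hypothetical and are not taken from the patent.

```python
import random

def degrade(image, kernel, scale, noise_std=0.0, seed=0):
    """Return a low-resolution image L = S K I + n for a 2-D list `image`."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # K: blur the image with the kernel, treating pixels outside the image as 0.
    blurred = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    yy, xx = y + dy - kh // 2, x + dx - kw // 2
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy][dx] * image[yy][xx]
            blurred[y][x] = acc
    # S: keep every `scale`-th pixel; n: add Gaussian noise to each kept pixel.
    return [[blurred[y][x] + rng.gauss(0.0, noise_std)
             for x in range(0, w, scale)]
            for y in range(0, h, scale)]

# A 4x4 "high-resolution" image, an identity blur kernel and scale factor 2.
high_res = [[float(4 * r + c) for c in range(4)] for r in range(4)]
identity_kernel = [[1.0]]
low_res = degrade(high_res, identity_kernel, scale=2)
```

Because many different kernels and high-resolution images map to the same `low_res`, inverting this process requires additional prior knowledge, which is what the learned models supply.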

Disclosure of Invention

The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a student super-resolution model with small parameter quantity and high performance.

In order to achieve the purpose, the invention provides the following scheme:

an image super-resolution method based on depth feature correlation comprises the following steps:

acquiring a low-resolution image to be reconstructed;

inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;

the training process of the student model comprises the following steps:

taking a low-resolution training image as the input of the student model and a high-resolution training image as its target output, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, training the student model under the supervision of the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function.

Optionally, the low-resolution training image is an input of the student model; taking a high-resolution training image as the output of the student model, taking an integral loss function as a loss function, and performing supervision training on the student model through a high-resolution image and a real image output by a trained teacher model by using a loss function attenuation mechanism to obtain a trained student model, specifically comprising the following steps:

performing depth feature relevance calculation on the outputs of the residual group modules at different network depths of the student model and the trained teacher model to obtain a depth feature relevance matrix;

determining a distillation loss function according to the depth characteristic correlation matrix;

determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image;

taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.

Optionally, before performing depth feature correlation calculation on the output of the residual group modules of different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:

training the teacher model as a super-resolution model on paired datasets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as its target output, to obtain the trained teacher model.

Optionally, the depth feature relevance calculation is performed on the output of the residual group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, which specifically includes:

performing feature mapping on the outputs of the residual group modules at different network depths of the trained teacher model to obtain the dimension-reduced depth features of the teacher model;

normalizing and averaging the dimension-reduced depth features of the teacher model to obtain a teacher model depth feature correlation matrix;

normalizing and averaging the outputs of the residual group modules at different network depths of the student model to obtain a student model depth feature correlation matrix; the residual group modules of the student model comprise inverted residual modules; the depth feature correlation matrix comprises the teacher model depth feature correlation matrix and the student model depth feature correlation matrix.

Optionally, the determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image specifically includes:

the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;

determining a first supervision loss function according to the high-resolution image output by the trained teacher model;

a second supervised loss function is determined from the real image.

An image super-resolution system based on depth feature correlation comprises:

the acquisition module is used for acquiring a low-resolution image to be reconstructed;

the reconstruction module is used for inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;

the reconstruction module comprises a student model training submodule, which is used for taking a low-resolution training image as the input of the student model and a high-resolution training image as its target output, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, training the student model under the supervision of the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function.

Optionally, the student model training submodule specifically includes:

the depth feature relevance matrix determining unit is used for performing depth feature relevance calculation on the outputs of the residual group modules at different network depths of the student model and the trained teacher model to obtain a depth feature relevance matrix;

a distillation loss function determination unit for determining a distillation loss function from the depth characteristic correlation matrix;

the supervision loss function determining unit is used for determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model;

the student model training unit is used for taking a low-resolution training image as the input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.

Optionally, the student model training sub-module further includes:

and the teacher model training unit is used for training the teacher model as a super-resolution model on paired datasets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as its target output, to obtain the trained teacher model.

Optionally, the depth feature correlation matrix determining unit specifically includes:

the feature mapping subunit is used for performing feature mapping on the outputs of the residual group modules at different network depths of the trained teacher model to obtain the dimension-reduced depth features of the teacher model;

the teacher model depth feature correlation matrix determining subunit is used for normalizing and averaging the dimension-reduced depth features of the teacher model to obtain a teacher model depth feature correlation matrix;

the student model depth feature correlation matrix determining subunit is used for normalizing and averaging the outputs of the residual group modules at different network depths of the student model to obtain a student model depth feature correlation matrix; the residual group modules of the student model comprise inverted residual modules; the depth feature correlation matrix comprises the teacher model depth feature correlation matrix and the student model depth feature correlation matrix.

Optionally, the supervision loss function determining unit specifically includes: the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;

the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model;

a second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.

According to the specific embodiment provided by the invention, the invention discloses the following technical effects:

according to the image super-resolution method and system based on depth feature relevance, when the student model is trained, a low-resolution training image is used as the input of the student model and a high-resolution training image as its target output, an overall loss function is used as the loss function, and, using a loss function attenuation mechanism, the student model is trained under the supervision of the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function. The distillation loss function transfers relevance information from the teacher model to the student model by knowledge distillation, so that the student model still performs well with a small number of parameters and a low computational cost, and reconstructs a high-quality high-resolution image.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic flow chart of an image super-resolution method based on depth feature correlation according to the present invention;

FIG. 2 is a schematic diagram of the overall structure of a super-resolution network using knowledge distillation provided by the present invention;

FIG. 3 is a schematic diagram of a teacher model structure provided by the present invention;

FIG. 4 is a schematic structural diagram of a reverse residual error module in the student model according to the present invention;

FIG. 5 is a schematic structural diagram of a feature mapping module provided in the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a high-performance image super-resolution model with small parameter quantity.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

As shown in FIG. 1 and FIG. 2, given a low-resolution image (LR), the invention transfers depth feature relevance knowledge from a heavyweight teacher model to a lightweight student model within a knowledge distillation framework, so that the lightweight student model improves its performance under the supervision of the heavyweight teacher model and reconstructs a high-resolution image (SR) of good quality. In the flow chart, solid lines represent the training process and dashed lines represent the testing process. During training (solid lines), paired datasets obtained by two-fold and three-fold downsampling are used: the teacher model is trained first and its parameters are then fixed; the student model is trained from scratch, and depth feature relevance knowledge extracted from the trained teacher model is passed to the student model for learning, so that the student model finally achieves better performance. Once the student model has converged under the supervision of the teacher model, the trained student model can reconstruct a high-resolution image from a given low-resolution test image (dashed lines). The invention provides an image super-resolution method based on depth feature relevance, which comprises the following steps:

and acquiring a low-resolution image to be reconstructed.

And inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image.

The training process of the student model comprises the following steps:

taking a low-resolution training image as the input of the student model and a high-resolution training image as its target output, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, training the student model under the supervision of the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model. The overall loss function comprises a distillation loss function and a supervision loss function: the distillation loss function is determined from the student model and the trained teacher model, and the supervision loss function is determined from the high-resolution image output by the trained teacher model and the real image. The trained teacher model comprises ten residual group modules, and the student model is obtained from the teacher model by compression and module replacement.

In practical application, the low-resolution training image is used as an input of the student model; taking a high-resolution training image as the output of the student model, taking an integral loss function as a loss function, and performing supervision training on the student model through a high-resolution image and a real image output by a trained teacher model by using a loss function attenuation mechanism to obtain a trained student model, specifically comprising the following steps:

Performing depth feature relevance calculation on the outputs of the residual group modules at different network depths of the student model and the trained teacher model to obtain a depth feature relevance matrix.

Determining a distillation loss function from the depth signature correlation matrix.

And determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image.

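In practical application, before performing depth feature correlation calculation on the outputs of the residual group modules at different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:

training the teacher model as a super-resolution model on paired datasets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as its target output, to obtain the trained teacher model.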

In practical application, the depth feature relevance calculation is performed on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, and the depth feature relevance matrix specifically includes:

performing feature mapping on the outputs of the residual group modules at different network depths of the trained teacher model to obtain the dimension-reduced depth features of the teacher model.

Normalizing and averaging the dimension-reduced depth features of the teacher model to obtain the teacher model depth feature correlation matrix.

In practical application, the determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model specifically includes:

the supervisory loss function includes a first supervisory loss function and a second supervisory loss function.

And determining a first supervision loss function according to the high-resolution image output by the trained teacher model.

A second supervised loss function is determined from the real image.

The invention also provides a more specific implementation mode of the image super-resolution method based on the depth feature correlation, which comprises the following steps:

Selecting and training a teacher model

To construct the knowledge distillation framework, the heavyweight super-resolution model RCAN (Residual Channel Attention Network) shown in FIG. 3 is selected as the teacher model. The teacher model is built as nested residuals: it consists of ten residual group modules, each residual group module consists of twenty residual modules, and each residual module consists of four standard convolution layers, with each convolution layer followed by an activation layer. This model achieves high performance on the image super-resolution task, and the network is expressed by formula (2).

I_SR = T(I_LR) (2)

where T is the teacher model, and I_LR and I_SR are respectively the low-resolution input image and the high-resolution image output by the teacher model.

After the teacher model is selected, it is trained. First, a low-resolution image (LR) is input into the teacher model, and one convolution layer performs feature extraction to obtain the shallow feature F₀. A high-level feature F_DF is then obtained through the nested residual modules, and at the tail of the nested residual structure the shallow feature F₀ and the high-level feature F_DF are superimposed. The result is upsampled by an upsampling module to obtain the feature F_up, and finally a high-resolution image (SR) is obtained through the reconstruction module. The high-resolution image generated by the teacher model is constrained by the loss function of formula (3). The parameters of the trained teacher model are then fixed: during the subsequent training of the student model the teacher model is not updated by back-propagation, and only provides depth features from different network depths, from which relevance knowledge is computed and transferred to the student model.

In formula (3), T refers to the teacher model, the two image terms are respectively a low-resolution image and a true high-resolution image, and L_{cons_T} is the loss function of the teacher model training process.

Construction of student models

The student model is obtained by compressing the number of residual modules and the number of channels in the teacher model: the number of residual modules in each residual group is reduced from 20 to 2, and the number of feature channels is reduced from 64 to 16. In addition, the standard convolutions in the residual modules of the teacher model are replaced with the optimized lightweight inverted residual module shown in FIG. 4, which yields the lightweight student model.

The optimized lightweight inverted residual module is composed of four layers. The first layer, called the expansion layer, uses a 1x1 convolution to map features from a low-dimensional space to a high-dimensional space. The second layer, called the depthwise convolution layer, performs lightweight filtering on each input channel through group convolution. The third layer is a ReLU activation layer, which applies a nonlinear mapping to the features. The fourth layer, called the projection layer, also uses a 1x1 convolution to map the feature maps from the high-dimensional space back to the low-dimensional space, constructing new features as linear combinations of the input features. The inverted residual module is adopted to reduce the computation of the student model and to speed up its inference.

Because group convolution and 1x1 convolution are introduced into the inverted residual module, the computation of the student model can be reduced without losing much performance: compared with standard convolution, the inverted residual module reduces the computation by a factor of 8 to 9. The inverted residual module is expressed by formula (4), in which F_input refers to the input high-dimensional feature of the module and F_output refers to its output high-dimensional feature.
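The stated saving of 8 to 9 times can be checked with a rough count of multiply-accumulate operations. The sketch below compares a standard k x k convolution with a depthwise k x k convolution followed by a 1x1 convolution. This is a simplification and an assumption: the module described above also contains an expansion layer, and the function names are hypothetical.

```python
def standard_conv_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates of a standard k x k convolution on an h x w map."""
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(c_in, c_out, k, h, w):
    """Multiply-accumulates of a depthwise k x k plus a pointwise 1x1 convolution."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # the 1x1 convolution mixes the channels
    return depthwise + pointwise

def reduction_factor(c_in, c_out, k=3, h=32, w=32):
    """How many times cheaper the separable form is than the standard one."""
    return (standard_conv_macs(c_in, c_out, k, h, w)
            / depthwise_separable_macs(c_in, c_out, k, h, w))

for channels in (16, 64, 256):
    print(channels, round(reduction_factor(channels, channels), 2))
```

Under these assumptions the factor equals k²·C_out / (k² + C_out), so for 3x3 kernels it approaches 9 as the channel count grows and is smaller for narrow layers.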

Training student model

Depth feature mapping in teacher models

In the framework of knowledge distillation, student models were trained from scratch.

First, a low-resolution image (LR) is input simultaneously into the trained teacher model and the student model, and the outputs of residual groups at different network depths, such as the first, the fifth and the tenth, are taken from both models as depth features. Because the number of channels was compressed when the student model was constructed, the depth features extracted from the teacher model and the student model have inconsistent dimensions. A feature mapping module, shown in FIG. 5, is therefore adopted, and the three depth features of the teacher model are processed by it. Drawing on the minimum-error formulation of PCA, the feature mapping module consists of two 1x1 convolution layers: the first convolution layer maps the teacher model depth feature from a high dimension to a low dimension, and the second convolution layer maps the low-dimensional depth feature back to an output high-dimensional feature. In this way the low-dimensional depth feature F_reduct output at the middle of the feature mapping module retains the main feature information, so that the output high-dimensional feature can be reconstructed as close as possible to the input high-dimensional feature. The feature mapping module is expressed by formula (5), in which F_input refers to the input high-dimensional feature and F_output refers to the output high-dimensional feature of the module.

In the distillation training process, because the dimension of the depth features in the teacher model is inconsistent with that of the depth features in the student model, the feature mapping module is used to reduce the dimension of the teacher model depth features. When the feature mapping module is trained, its output high-dimensional feature is constrained by the mean square error cost function of formula (6), in which F_input refers to the input high-dimensional feature, F_output refers to the output high-dimensional feature of the module, N is the number of samples in a batch, and i indexes the i-th sample in the batch.
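The feature mapping module and its reconstruction constraint can be sketched as follows. A 1x1 convolution is written here as a per-position linear map over channels, with features stored as `[channels][positions]` lists. The helper names, the absence of bias terms and the exact averaging in `mse_loss` (sum over elements, mean over the N samples) are assumptions for illustration only.

```python
def conv1x1(feature, weight):
    """Apply a 1x1 convolution: feature is [C_in][P], weight is [C_out][C_in]."""
    positions = len(feature[0])
    return [[sum(w_row[c] * feature[c][p] for c in range(len(feature)))
             for p in range(positions)]
            for w_row in weight]

def feature_mapping(f_input, w_reduce, w_restore):
    """Return the low-dimensional feature and the reconstructed feature."""
    f_reduct = conv1x1(f_input, w_reduce)     # high dimension -> low dimension
    f_output = conv1x1(f_reduct, w_restore)   # low dimension -> high dimension
    return f_reduct, f_output

def mse_loss(batch_input, batch_output):
    """Squared error between output and input, averaged over the N samples."""
    total = 0.0
    for f_in, f_out in zip(batch_input, batch_output):
        total += sum((o - i) ** 2
                     for row_i, row_o in zip(f_in, f_out)
                     for i, o in zip(row_i, row_o))
    return total / len(batch_input)

f_input = [[1.0, 2.0], [0.0, 0.0]]   # 2 channels, 2 spatial positions
w_reduce = [[1.0, 0.0]]              # maps 2 channels to 1
w_restore = [[1.0], [0.0]]           # maps 1 channel back to 2
f_reduct, f_output = feature_mapping(f_input, w_reduce, w_restore)
loss = mse_loss([f_input], [f_output])
```

In this toy case all information sits in the first channel, so the single retained channel reconstructs the input exactly and the loss is zero; in training, the two weight matrices would be learned to minimize this loss.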

After the low-dimensional depth feature F_reduct of the feature mapping module is obtained, a depth feature relevance calculation module is used to mine the relevance knowledge of the depth features; this module extracts a feature relevance matrix that represents the inherent relevance among different features. The depth feature correlation matrix is calculated from the depth features as follows:

Assume that the low-dimensional depth feature obtained after feature mapping in the teacher model is F ∈ R^{C×W×H}. The last two dimensions of the depth feature F are first flattened into one dimension, giving F ∈ R^{C×WH}, where C is the number of channels, WH is the product of the width and the height, and R denotes the real numbers.

To effectively transfer the rich knowledge of the depth features in the teacher model during distillation, the depth features generated by the teacher model and the student model are restricted to a bounded solution space before calculation: the size of the solution space is limited by normalizing each column of the flattened depth feature. Here normalization means subtracting the minimum value of the current dimension from the data and dividing the result by the difference between the maximum value and the minimum value of the current dimension, which confines the data to the range 0 to 1. After normalization, to reduce the computation of the depth feature correlation calculation, the normalized depth feature is first averaged over the channel dimension, as shown in formula (7), which collapses the channel dimension.

The depth feature correlation matrix D is then calculated by formula (8) from the channel-averaged depth feature and its transpose. In formulas (7) and (8), the depth feature before averaging is the feature to be compressed, the depth feature after averaging is the result of this compression, WH is the product of the last two dimensions of the depth feature, and [i, j] denotes the feature point with coordinates (i, j) in the depth feature.
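The normalization, channel averaging and correlation steps can be sketched in a few lines. The sketch takes an already flattened feature stored as `[C][WH]`, and it assumes that the correlation matrix is the outer product of the channel-averaged feature with itself; that form, the handling of constant columns and all function names are assumptions, since the formulas themselves are not legible in this text.

```python
def min_max_normalize_columns(feature):
    """feature is [C][WH]; rescale every column to the range 0..1."""
    channels, positions = len(feature), len(feature[0])
    out = [[0.0] * positions for _ in range(channels)]
    for p in range(positions):
        column = [feature[c][p] for c in range(channels)]
        lo, hi = min(column), max(column)
        span = hi - lo
        for c in range(channels):
            # A constant column has no spread; it is mapped to 0 (assumption).
            out[c][p] = (feature[c][p] - lo) / span if span else 0.0
    return out

def channel_average(feature):
    """Average a [C][WH] feature over its channels, giving WH values."""
    channels = len(feature)
    return [sum(feature[c][p] for c in range(channels)) / channels
            for p in range(len(feature[0]))]

def correlation_matrix(feature):
    """Return a WH x WH matrix of products between averaged positions."""
    mean = channel_average(min_max_normalize_columns(feature))
    return [[a * b for b in mean] for a in mean]

feature = [[0.0, 0.0], [4.0, 4.0], [2.0, 4.0]]   # C = 3 channels, WH = 2
d_matrix = correlation_matrix(feature)
```

Because the matrix has size WH x WH, its shape depends only on the spatial size of the feature and not on the number of channels.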

During student model training, both the teacher model and the student model compute depth feature correlation matrices, and the distillation loss function of formula (9) is introduced to supervise the student model by maximizing the consistency between the student model depth feature correlation matrix D_s generated by the student model and the teacher model depth feature correlation matrix D_T generated by the teacher model, so that the student model learns the relevance of the depth features in the teacher model.

In formula (9), L_fd is the distillation loss function, D_T is the teacher model correlation matrix, and D_s is the student model correlation matrix.
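As a rough illustration of a distillation loss of this kind, the sketch below measures the mean absolute difference between the teacher and student correlation matrices, so that a smaller value means higher consistency. The choice of an L1-style distance and the name `distillation_loss` are assumptions and are not taken from the patent text.

```python
def distillation_loss(d_teacher, d_student):
    """Mean absolute difference between two equally sized correlation matrices."""
    rows, cols = len(d_teacher), len(d_teacher[0])
    total = sum(abs(t - s)
                for row_t, row_s in zip(d_teacher, d_student)
                for t, s in zip(row_t, row_s))
    return total / (rows * cols)

d_t = [[1.0, 0.0], [0.0, 1.0]]
d_s = [[0.0, 0.0], [0.0, 0.0]]
l_fd = distillation_loss(d_t, d_s)
```

When the student reproduces the teacher's correlation matrix exactly, this value is zero.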

In addition to the distillation loss, both the high-resolution image (SR) output by the teacher model and the true high-resolution image (HR) are used to supervise the learning of the student model: the high-resolution image (SR) output by the student model is further constrained by formula (10), the first supervision loss function, and formula (11), the second supervision loss function.

In formula (10), T refers to the teacher model, S refers to the student model, N is the number of images in a batch, L_{sr_cons} is the first supervision loss function, and the teacher model output and the student model output are both computed from the i-th low-resolution image.

In formula (11), the target is the i-th real high-resolution image.

Finally, the overall loss function of the student model is shown in equation (12).

L(θ) = α L_{cons_s} + β L_{sr_cons} + γ L_{fd}    (12)

Where α, β, γ are weighting parameters that balance the effect of the different loss functions.

Regarding the selection of the weight parameters: in the early stage of student model training, the distillation loss function helps the student model quickly learn the effective knowledge of the teacher model and thereby accelerates the convergence of the student model, but in the later stage of training it inhibits further learning by the student model. A loss function weight attenuation mechanism therefore gradually reduces the effect of the distillation loss during training, which further improves the knowledge distillation effect:

where σ is a constant attenuation coefficient, n_e is the current epoch number within the whole training process, and n is an adjustable parameter that determines how many epochs elapse between successive attenuations.
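The attenuation mechanism and the weighted sum of equation (12) can be illustrated with the sketch below. The attenuation formula itself is not reproduced in this text, so the step form used here (the initial weight multiplied by σ once every n epochs) is an assumption, as are the initial weight `gamma0`, the function names, and the sample values.

```python
def distillation_weight(gamma0, sigma, n_e, n):
    """ASSUMPTION: step decay, multiplying the initial weight by sigma once every n epochs."""
    return gamma0 * sigma ** (n_e // n)


def overall_loss(l_cons_s, l_sr_cons, l_fd, alpha, beta, gamma):
    """Weighted sum of equation (12)."""
    return alpha * l_cons_s + beta * l_sr_cons + gamma * l_fd


# Hypothetical settings: gamma0 = 1.0, sigma = 0.5, attenuate every n = 10 epochs.
weights = [distillation_weight(1.0, 0.5, epoch, 10) for epoch in (0, 9, 10, 25)]

# Overall loss at epoch 25 for made-up loss values.
total = overall_loss(1.0, 2.0, 4.0, alpha=1.0, beta=0.5, gamma=weights[-1])
```

With these settings the distillation weight stays at its initial value for the first ten epochs and halves at each later multiple of ten, so the distillation term dominates early and fades later, as the description intends.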

The invention also provides an image super-resolution system based on depth feature correlation, which comprises the following components:

An acquisition module, configured to acquire the low-resolution image to be reconstructed.

A reconstruction module, configured to input the low-resolution image to be reconstructed into the trained student model for reconstruction to obtain a high-resolution image.

In practical application, the student model training submodule specifically includes:

A depth feature correlation matrix determining unit, configured to perform depth feature correlation calculation on the outputs of the residual group modules at different network depths of the student model and of the trained teacher model to obtain depth feature correlation matrices.

A distillation loss function determining unit, configured to determine a distillation loss function from the depth feature correlation matrices.

A supervised loss function determining unit, configured to determine a supervised loss function from the high-resolution image output by the trained teacher model and the real image.

In practical application, the student model training submodule further includes:

In practical application, the depth feature correlation matrix determining unit specifically includes:

A feature mapping subunit, configured to perform feature mapping on the outputs of the residual group modules at different network depths of the trained teacher model to obtain dimension-reduced teacher model depth features.

A teacher model depth feature correlation matrix determining subunit, configured to normalize and average the dimension-reduced teacher model depth features to obtain a teacher model depth feature correlation matrix.

In practical application, the supervised loss function comprises a first supervised loss function and a second supervised loss function, and the supervised loss function determining unit specifically includes:

A first supervised loss function determining subunit, configured to determine the first supervised loss function from the high-resolution image output by the trained teacher model.

To address the excessive parameter count of the model, the invention introduces an inverted residual module to replace the standard convolutional layer. It mines the depth feature correlation knowledge in the super-resolution model and, within a knowledge distillation framework, introduces a distillation loss to transfer this knowledge from the heavyweight super-resolution model to the lightweight super-resolution model. The lightweight super-resolution model thereby retains good performance at a lower parameter count and computational cost, which solves the problem that the heavyweight super-resolution model cannot be deployed on low-power embedded devices.
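The parameter saving from replacing a standard convolution can be illustrated with a rough count. This sketch assumes a MobileNetV2-style inverted residual layout (1x1 expansion, 3x3 depthwise convolution, 1x1 projection); the source does not specify the internal structure, and the channel width of 64 and the expansion factor of 2 are hypothetical values chosen for illustration.

```python
def standard_conv_params(channels, kernel=3):
    """Weights of one kernel x kernel convolution with equal input and output channels (bias ignored)."""
    return kernel * kernel * channels * channels


def inverted_residual_params(channels, expansion, kernel=3):
    """Weights of a 1x1 expansion, a depthwise kernel x kernel convolution, and a 1x1 projection."""
    hidden = channels * expansion
    expand = channels * hidden          # 1x1 pointwise expansion
    depthwise = kernel * kernel * hidden  # one kernel per hidden channel
    project = hidden * channels         # 1x1 pointwise projection
    return expand + depthwise + project


# Hypothetical configuration: 64 channels, expansion factor 2.
std = standard_conv_params(64)
inv = inverted_residual_params(64, 2)
```

Under these assumptions the inverted residual module uses 17,536 weights against 36,864 for the standard 3x3 convolution. The saving depends on the expansion factor: with a factor of 6 the same count would exceed that of the standard layer.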

The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims

1. An image super-resolution method based on depth feature relevance is characterized by comprising the following steps:

acquiring a low-resolution image to be reconstructed;

the training process of the student model comprises the following steps:

2. The image super-resolution method based on depth feature correlation according to claim 1, wherein taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model with the high-resolution image output by the trained teacher model and the real image by using a loss function attenuation mechanism to obtain the trained student model specifically comprises the following steps:

3. The image super-resolution method based on depth feature correlation according to claim 2, wherein before performing depth feature correlation calculation on the output of the residual group module with different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further comprises:

4. The image super-resolution method based on depth feature relevance of claim 2, wherein the depth feature relevance calculation is performed on the output of the residual group module of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, and specifically comprises:

5. The image super-resolution method based on depth feature correlation according to claim 2, wherein the determining a supervised loss function according to the high-resolution image output by the trained teacher model and the real image specifically comprises:

a second supervised loss function is determined from the real image.

6. An image super-resolution system based on depth feature correlation is characterized by comprising:

7. The image super-resolution system based on depth feature correlation according to claim 6, wherein the student model training sub-module specifically comprises:

8. The depth feature correlation-based image super-resolution system of claim 7, wherein the student model training sub-module further comprises:

9. The image super-resolution system based on depth feature correlation according to claim 7, wherein the depth feature correlation matrix determination unit specifically includes:

10. The image super-resolution system based on depth feature correlation according to claim 7, wherein the supervised loss function comprises a first supervised loss function and a second supervised loss function, and the supervised loss function determination unit specifically includes: