CN113793265A - Image super-resolution method and system based on depth feature relevance - Google Patents

Image super-resolution method and system based on depth feature relevance

Info

Publication number
CN113793265A
CN113793265A CN202111074208.4A
Authority
CN
China
Prior art keywords
model
loss function
resolution
image
student model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111074208.4A
Other languages
Chinese (zh)
Inventor
潘金山 (Jinshan Pan)
臧庆 (Qing Zang)
唐金辉 (Jinhui Tang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202111074208.4A priority Critical patent/CN113793265A/en
Publication of CN113793265A publication Critical patent/CN113793265A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an image super-resolution method and system based on depth feature correlation. The method comprises: acquiring a low-resolution image to be reconstructed; and inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image. The training process of the student model comprises: taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model, with a loss function attenuation mechanism, using the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model. The high-resolution image is thus estimated by a high-performance image super-resolution model with a small number of parameters.

Description

Image super-resolution method and system based on depth feature relevance
Technical Field
The invention relates to the field of image super-resolution, in particular to an image super-resolution method and system based on depth feature relevance.
Background
In daily life, more and more low-power consumption devices such as mobile phones and embedded terminals are widely used, and people need to perform related processing on low-resolution images so as to obtain high-resolution images with good visual effects on mobile devices. Therefore, the application of the image super-resolution algorithm in low power consumption devices has received a great deal of attention.
The goal of image super-resolution techniques is to estimate a high-resolution image from a low-resolution image. The degradation process of the image super-resolution problem is generally defined as:
L = SKI + n, (1)
where L, I and n represent the low-resolution image, the high-resolution image and noise, respectively, and S and K represent the down-sampling matrix with a scale factor and the matrix form of the blur kernel, respectively. Image super-resolution is an ill-posed problem, because an infinite number of pairs of blur kernel K and high-resolution image I can generate the same low-resolution image L. Traditional interpolation-based methods are simple and fast, but the quality of the recovered high-resolution image is poor. In recent years, with the rapid development of deep learning, methods based on deep convolutional neural networks have greatly surpassed traditional interpolation-based methods in reconstructing high-resolution images. Related experiments show that deeper networks with larger numbers of parameters improve the performance of image super-resolution algorithms more noticeably, but they also bring a substantial increase in computation time and memory consumption. In a real scene, such large models, with huge computation and parameter counts, cannot be deployed on low-power devices such as mobile phones. To solve this problem, model compression methods are needed to obtain image super-resolution models with fewer parameters and better performance.
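For intuition, the following minimal Python sketch simulates the degradation model of formula (1); the 9x9 box blur kernel, the scale factor of 2 and the noise level are illustrative assumptions rather than values fixed by the method.

import torch
import torch.nn.functional as F

def degrade(hr, kernel, scale=2, noise_sigma=0.01):
    """Simulate formula (1), L = S K I + n: blur the high-resolution image I with the
    kernel K, apply the down-sampling matrix S (strided decimation), and add noise n."""
    c = hr.shape[1]
    weight = kernel.repeat(c, 1, 1, 1)                       # one copy of the kernel per channel
    pad = kernel.shape[-1] // 2
    blurred = F.conv2d(F.pad(hr, (pad,) * 4, mode='replicate'), weight, groups=c)
    lr = blurred[:, :, ::scale, ::scale]                     # down-sampling with scale factor
    return lr + noise_sigma * torch.randn_like(lr)           # additive noise

# Illustrative use with an assumed 9x9 box blur kernel and scale factor 2
kernel = torch.full((1, 1, 9, 9), 1.0 / 81)
hr = torch.rand(1, 3, 96, 96)
lr = degrade(hr, kernel)                                     # a 1 x 3 x 48 x 48 observation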
Disclosure of Invention
The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a student super-resolution model with small parameter quantity and high performance.
In order to achieve the purpose, the invention provides the following scheme:
an image super-resolution method based on depth feature correlation comprises the following steps:
acquiring a low-resolution image to be reconstructed;
inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the training process of the student model comprises the following steps:
taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model, with a loss function attenuation mechanism, using the high-resolution image output by the trained teacher model and the real image, to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing modules of the teacher model; the overall loss function comprises a distillation loss function and a supervision loss function.
Optionally, the low-resolution training image is an input of the student model; taking a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model through the high-resolution image output by a trained teacher model and the real image by using a loss function attenuation mechanism to obtain a trained student model specifically comprises the following steps:
performing depth characteristic relevance calculation on the output of residual error group modules of different network depths of the student model and the trained teacher model to obtain a depth characteristic relevance matrix;
determining a distillation loss function according to the depth characteristic correlation matrix;
determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image;
taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
Optionally, before performing depth feature correlation calculation on the output of the residual group modules of different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:
and training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled for two times and three times to obtain the trained teacher model.
Optionally, the depth feature relevance calculation is performed on the output of the residual group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, which specifically includes:
performing feature mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth features of the teacher model after dimension reduction;
normalizing and averaging the depth features of the teacher model subjected to dimensionality reduction to obtain a depth feature correlation matrix of the teacher model;
normalizing and averaging the output of the residual group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
Optionally, the determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image specifically includes:
the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function is determined from the real image.
An image super-resolution system based on depth feature correlation comprises:
the acquisition module is used for acquiring a low-resolution image to be reconstructed;
the reconstruction module is used for inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the reconstruction module comprises a student model training submodule, and the student model training submodule is used for taking a low-resolution training image as the input of the student model; taking a high-resolution training image as the output of the student model, taking an integral loss function as a loss function, and performing supervision training on the student model through the high-resolution image and the real image output by the trained teacher model by using a loss function attenuation mechanism to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing the teacher model; the bulk loss function includes a distillation loss function and a supervisory loss function.
Optionally, the student model training submodule specifically includes:
the depth characteristic relevance matrix determining unit is used for carrying out depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix;
a distillation loss function determination unit for determining a distillation loss function from the depth characteristic correlation matrix;
the supervision loss function determining unit is used for determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model;
the student model training unit is used for taking a low-resolution training image as the input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
Optionally, the student model training sub-module further includes:
and the teacher model training unit is used for training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled twice and three times to obtain the trained teacher model.
Optionally, the depth feature correlation matrix determining unit specifically includes:
the characteristic mapping subunit is used for performing characteristic mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth characteristics of the teacher model after dimension reduction;
the teacher model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the dimensionality-reduced teacher model depth characteristics to obtain a teacher model depth characteristic correlation matrix;
the student model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the output of the residual error group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
Optionally, the supervision loss function determining unit specifically includes: the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the image super-resolution method and system based on the depth feature relevance, when a student model is trained, a low-resolution training image is used as input of the student model; taking a high-resolution training image as the output of the student model, taking an integral loss function as a loss function, and performing supervision training on the student model through the high-resolution image and the real image output by the trained teacher model by using a loss function attenuation mechanism to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing the teacher model; the bulk loss function includes a distillation loss function and a supervisory loss function. And a distillation loss function is introduced to transfer the relevance information from the teacher model to the student model in a knowledge distillation mode, so that the student model still has good performance when the parameter and the calculated amount are low, and a high-quality high-resolution image is reconstructed.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a schematic flow chart of an image super-resolution method based on depth feature correlation according to the present invention;
FIG. 2 is a schematic diagram of the overall structure of a super-resolution network using knowledge distillation provided by the present invention;
FIG. 3 is a schematic diagram of a teacher model structure provided by the present invention;
FIG. 4 is a schematic structural diagram of a reverse residual error module in the student model according to the present invention;
fig. 5 is a schematic structural diagram of a feature mapping module provided in the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a high-performance image super-resolution model with small parameter quantity.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1 and 2, given a low-resolution image (LR), the present invention transfers depth feature relevance knowledge from a heavyweight teacher model to a lightweight student model under a knowledge distillation framework, so that the lightweight student model can improve its performance under the supervision of the heavyweight teacher model and reconstruct a high-resolution image (SR) with good quality. The solid lines in the flow chart represent the training process and the dashed lines represent the testing process. In the training process, paired data sets down-sampled by factors of two and three are used for training: a teacher model is first trained and its parameters are fixed, a student model is trained from scratch, and depth feature relevance knowledge is extracted from the trained teacher model and transferred to the student model for learning, so that the student model finally obtains better performance (solid lines). When the student model has gradually converged under the supervision of the teacher model, given a low-resolution test image, the trained student model can be used to reconstruct the high-resolution image (dashed lines). The invention provides an image super-resolution method based on depth feature relevance, which comprises the following steps:
and acquiring a low-resolution image to be reconstructed.
And inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image.
The training process of the student model comprises the following steps:
taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model, with a loss function attenuation mechanism, using the high-resolution image output by the trained teacher model and the real image, to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing modules of the teacher model; the overall loss function comprises a distillation loss function and a supervision loss function. That is, a low-resolution training image is used as the input of the student model and a high-resolution training image as the output of the student model, and the student model is trained with the overall loss function as the loss function and a loss function attenuation mechanism to obtain the trained student model; the overall loss function comprises a distillation loss function and a supervision loss function; the distillation loss function is determined based on the student model and the trained teacher model; the supervision loss function is determined according to the high-resolution image output by the trained teacher model and the real image; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing modules of the teacher model.
In practical application, the low-resolution training image is used as an input of the student model; taking a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model through the high-resolution image output by a trained teacher model and the real image by using a loss function attenuation mechanism to obtain a trained student model specifically comprises the following steps:
and performing depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix.
Determining a distillation loss function from the depth signature correlation matrix.
And determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image.
Taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
In practical application, before performing depth feature correlation calculation on the output of the residual group modules with different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:
and training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled for two times and three times to obtain the trained teacher model.
In practical application, the depth feature relevance calculation is performed on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, and the depth feature relevance matrix specifically includes:
and performing feature mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth features of the teacher model after dimension reduction.
And carrying out normalization and equalization processing on the depth features of the teacher model after dimension reduction to obtain a correlation matrix of the depth features of the teacher model.
Normalizing and averaging the output of the residual group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
In practical application, the determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model specifically includes:
the supervisory loss function includes a first supervisory loss function and a second supervisory loss function.
And determining a first supervision loss function according to the high-resolution image output by the trained teacher model.
A second supervised loss function is determined from the real image.
The invention also provides a more specific implementation mode of the image super-resolution method based on the depth feature correlation, which comprises the following steps:
selecting a teacher model and training
In order to construct the knowledge distillation framework, the heavyweight super-resolution model RCAN (residual channel attention network) shown in FIG. 3 is selected as the teacher model. The teacher model is built in a nested-residual form: the model consists of ten residual group modules, each residual group module consists of twenty residual modules, and each residual module consists of four layers of standard convolution, with each convolutional layer followed by an activation layer. The model has high performance on the image super-resolution task, and the network takes the form of formula (2).
I_SR = T(I_LR)   (2)
where T is the teacher model, and I_LR and I_SR are respectively the low-resolution input image and the high-resolution image output by the teacher model.
After the teacher model is selected, the teacher model is trained. Firstly, a low-resolution image (LR) is input into the teacher model and feature extraction is performed through a layer of convolution to obtain the shallow feature F_0; then the high-level feature F_DF is obtained through the nested residual modules; at the tail of the nested residual structure the shallow feature F_0 and the high-level feature F_DF are superimposed, and upsampling is then performed by an upsampling module to obtain the feature F_up; finally, a high-resolution image (SR) is obtained by the reconstruction module. The high-resolution image generated by the teacher model is constrained by the loss function formula (3). The parameters of the trained teacher model are then fixed: in the subsequent training process of the student model they are not updated by back propagation, and the teacher model only provides depth features at different network depths for calculating the relevance knowledge that is transferred to the student model for learning.
L_cons_T = (1/N) Σ_{i=1}^{N} ||T(I_LR^i) − I_HR^i||_1   (3)
where T refers to the teacher model, I_LR^i and I_HR^i are respectively the i-th low-resolution image and the corresponding true high-resolution image, N is the number of images in a batch, and L_cons_T is the loss function of the teacher model training process.
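The following simplified PyTorch sketch illustrates the teacher forward flow described above (shallow feature F_0, nested residual groups giving F_DF, global skip connection, upsampling to F_up, reconstruction); the channel width of 64, the simplified two-convolution residual modules (the channel attention of RCAN is omitted) and the PixelShuffle upsampler are assumptions for illustration only.

import torch
import torch.nn as nn

class ResidualGroup(nn.Module):
    """A residual group: a stack of simplified residual modules with a group-level skip."""
    def __init__(self, channels=64, n_blocks=20):
        super().__init__()
        layers = []
        for _ in range(n_blocks):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                       nn.Conv2d(channels, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)

class Teacher(nn.Module):
    """Shallow feature F_0 -> ten residual groups (F_DF) -> skip -> upsample (F_up) -> reconstruct."""
    def __init__(self, channels=64, n_groups=10, scale=2):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.groups = nn.ModuleList(ResidualGroup(channels) for _ in range(n_groups))
        self.up = nn.Sequential(nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
                                nn.PixelShuffle(scale))
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, lr):
        f0 = self.head(lr)                 # shallow feature F_0
        f = f0
        for group in self.groups:
            f = group(f)                   # deep feature F_DF after the last group
        f_up = self.up(f + f0)             # global skip connection, then upsampling
        return self.tail(f_up)             # high-resolution output I_SR = T(I_LR)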
Construction of student models
The student model is obtained by compressing the number of residual modules and the number of channels in the teacher model: specifically, the number of residual modules in the teacher model is reduced from 20 to 2, and the number of feature channels is reduced from 64 to 16. In addition, the standard convolutions in the teacher model residual modules are replaced with the optimized lightweight inverted residual module shown in fig. 4, and the lightweight student model is finally constructed.
The optimized lightweight inverted residual module is composed of four separate layers. The first layer, called the expansion layer, uses a 1x1 convolution to map features from a low-dimensional space to a high-dimensional space. The second layer, a depthwise convolution layer, performs lightweight filtering on each input channel through group convolution. The third layer is a ReLU activation layer that applies a nonlinear mapping to the features. The fourth layer, called the projection layer, again uses a 1x1 convolution to map the feature maps from the high-dimensional space back to the low-dimensional space, constructing new features by computing linear combinations of the input features. The purpose of adopting the inverted residual module is to reduce the computation of the student model and accelerate its inference speed.
Because the group convolution operation and the 1x1 convolution are introduced into the inverted residual module, the computation of the student model can be reduced without losing too much performance; adopting the inverted residual module can reduce the computation of the standard convolution by 8 to 9 times. The inverted residual module F_IR is represented by the following formula:
F_output = F_IR(F_input)   (4)
where F_IR is the inverted residual module, F_input refers to the input high-dimensional features, and F_output refers to the output high-dimensional features of the module.
Training student model
Depth feature mapping in teacher models
In the framework of knowledge distillation, student models were trained from scratch.
Firstly, a low-resolution image (LR) is simultaneously input into the trained teacher model and the student model, and the outputs of the residual groups at different network depths (e.g., the first, the fifth and the tenth) in the teacher model and the student model are taken as depth features. Because the number of channels is compressed when constructing the student model, the dimensions of the depth features extracted from the teacher model and the student model are inconsistent, so a feature mapping module, namely the feature mapping unit shown in FIG. 5, is adopted. Taking the minimum-error idea of PCA as a reference, the three depth features of the teacher model are processed by the feature mapping module, which is composed of two 1x1 convolution layers: the first convolution layer maps the depth features of the teacher model from a high dimension to a low dimension, and the second convolution layer maps the low-dimensional depth features to output high-dimensional features, so that the low-dimensional depth feature F_reduct output in the middle of the feature mapping module retains the main feature information and can reconstruct output high-dimensional features that are as similar as possible to the input high-dimensional features. The feature mapping module is represented by the following formula:
F_output = F_FM(F_input)   (5)
where F_FM is the feature mapping module, F_input refers to the input high-dimensional features, and F_output refers to the output high-dimensional features of the module.
In the distillation training process, because the depth feature in the teacher model is inconsistent with the depth feature dimension in the student model, the feature mapping module is needed to be adopted to reduce the dimension of the depth feature in the teacher model, and in the training process of the feature mapping module, the output high-dimensional feature of the feature mapping module is constrained by adopting a mean square error cost function formula (6):
(1/N) Σ_{i=1}^{N} ||F_output^i − F_input^i||_2^2   (6)
where F_input refers to the input high-dimensional features, F_output refers to the output high-dimensional features of the module, N is the number of samples in a batch, and i indexes the i-th sample in the batch.
After the low-dimensional depth feature F_reduct is obtained from the feature mapping module, the relevance knowledge of the depth features is mined by a depth feature relevance calculation module, which extracts a feature correlation matrix to represent the inherent relevance among different features. The specific procedure for calculating the depth feature correlation matrix from the depth features is as follows:
Assume that the low-dimensional depth feature of the teacher model after feature mapping is F ∈ R^(C×W×H). The last two dimensions of the depth feature F are first compressed into one dimension, giving a compressed feature of dimension R^(C×WH), where C is the number of channels, WH is the product of the width and the height, and R denotes the real numbers.
In order to effectively transfer the rich knowledge of the depth features in the teacher model during the distillation process, the depth features generated by the teacher model and the student model are restricted to a solution space for the calculation. The size of the solution space is limited by normalizing each column of the C×WH feature, where normalization refers to subtracting the minimum value of the current dimension from the data and dividing by the maximum value of the current dimension minus the minimum value of the current dimension, so that the data are limited to the range 0-1. After normalization, in order to reduce the computation of the depth feature correlation calculation, the normalized depth feature is first averaged over the channel dimension, as shown in formula (7), and the processed depth feature has dimension R^(1×WH); then the depth feature correlation matrix D is calculated by formula (8):
F̄ = (1/C) Σ_{c=1}^{C} F[c, :]   (7)
D = F̄^T F̄, i.e., D[i, j] = F̄[i]·F̄[j]   (8)
where F̄^T refers to the transpose of F̄, F is the compressed depth feature of dimension C×WH, F̄ is the depth feature of dimension 1×WH obtained after averaging over the channel dimension, WH is the product of the width and the height of the depth feature, and [i, j] indexes the feature points with coordinates i and j in the depth feature.
In the course of student model training, the teacher model and the student model both calculate depth feature correlation matrices, and the distillation loss function formula (9) is introduced to supervise the consistency between the student model depth feature correlation matrix D_S generated by the student model and the teacher model depth feature correlation matrix D_T generated by the teacher model, so that the student model learns the relevance of the depth features in the teacher model.
L_fd = (1/N) Σ_{i=1}^{N} ||D_S^i − D_T^i||_2^2   (9)
where L_fd is the distillation loss function, D_T is the teacher model correlation matrix, and D_S is the student model correlation matrix.
In addition to introducing the distillation loss for training, the high-resolution image (SR) output by the teacher model and the true high-resolution image (HR) are both used to supervise the learning of the student model, and the high-resolution image (SR) output by the student model is further constrained using formula (10), the first supervised loss function, and formula (11), the second supervised loss function.
L_sr_cons = (1/N) Σ_{i=1}^{N} ||S(I_LR^i) − T(I_LR^i)||_1   (10)
where T refers to the teacher model, S refers to the student model, I_LR^i refers to the i-th low-resolution image, N refers to the number of images in a batch, L_sr_cons is the first supervised loss function, T(I_LR^i) is the output of the teacher model, and S(I_LR^i) is the output of the student model.
L_cons_S = (1/N) Σ_{i=1}^{N} ||S(I_LR^i) − I_HR^i||_1   (11)
where I_HR^i refers to the i-th real high-resolution image, and L_cons_S is the second supervised loss function.
Finally, the overall loss function of the student model is shown in equation (12).
L(θ) = αL_cons_S + βL_sr_cons + γL_fd   (12)
Where α, β, γ are weighting parameters that balance the effect of the different loss functions.
In the aspect of weight parameter selection, in the early stage of student network training, the distillation loss function can help a student model to rapidly learn the effective knowledge of a teacher model, so that the network convergence of the student model is accelerated, but the distillation loss function can inhibit the further learning of the student model in the later stage of student model training. The effect of distillation loss is gradually reduced in the training process through a loss function weight attenuation mechanism, and the knowledge distillation effect is further improved:
γ = γ_0 · σ^⌊n_e / n⌋   (13)
where σ is the attenuation constant coefficient, n_e refers to the current epoch number in the whole training process, n is a variable parameter used to determine after how many epochs the weight is attenuated once, and γ_0 is the initial weight of the distillation loss.
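Putting the pieces together, the following sketch shows one student training step combining the three losses of formula (12) with the epoch-wise attenuation of the distillation weight of formula (13); it reuses correlation_matrix and FeatureMapping from the sketches above, assumes the teacher and student return their intermediate depth features alongside the reconstructed image, and the L1 form of the supervised losses and the decay schedule are illustrative assumptions.

import torch
import torch.nn.functional as F

def distillation_weight(gamma0, sigma, epoch, n):
    # Formula (13): attenuate the distillation weight by the constant sigma every n epochs
    return gamma0 * sigma ** (epoch // n)

def student_step(student, teacher, mappers, lr_img, hr_img, alpha, beta, gamma):
    # teacher/student are assumed to return (sr_image, [depth features at the chosen residual groups])
    with torch.no_grad():                         # the teacher parameters stay fixed
        sr_t, feats_t = teacher(lr_img)
    sr_s, feats_s = student(lr_img)

    l_cons_s = F.l1_loss(sr_s, hr_img)            # formula (11): against the real image
    l_sr_cons = F.l1_loss(sr_s, sr_t)             # formula (10): against the teacher output
    l_fd = 0.0                                    # formula (9): correlation-matrix consistency
    for mapper, f_t, f_s in zip(mappers, feats_t, feats_s):
        f_reduct, _ = mapper(f_t)                 # reduce the teacher feature dimension
        d_t = correlation_matrix(f_reduct[0])     # first sample of the batch, for brevity
        d_s = correlation_matrix(f_s[0])
        l_fd = l_fd + F.mse_loss(d_s, d_t)

    return alpha * l_cons_s + beta * l_sr_cons + gamma * l_fd   # overall loss, formula (12)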
The invention also provides an image super-resolution system based on depth feature correlation, which comprises the following components:
and the acquisition module is used for acquiring the low-resolution image to be reconstructed.
And the reconstruction module is used for inputting the low-resolution image to be reconstructed into the trained student model for reconstruction to obtain a high-resolution image.
The reconstruction module comprises a student model training submodule, and the student model training submodule is used for taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training on the student model, with a loss function attenuation mechanism, using the high-resolution image output by the trained teacher model and the real image, to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing modules of the teacher model; the overall loss function comprises a distillation loss function and a supervision loss function.
In practical application, the student model training submodule specifically includes:
and the depth characteristic relevance matrix determining unit is used for performing depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix.
A distillation loss function determination unit for determining a distillation loss function from the depth characteristic correlation matrix.
And the supervision loss function determining unit is used for determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image.
The student model training unit is used for taking a low-resolution training image as the input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
In practical application, the student model training submodule further includes:
and the teacher model training unit is used for training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled twice and three times to obtain the trained teacher model.
In practical application, the depth feature correlation matrix determining unit specifically includes:
and the characteristic mapping subunit is used for performing characteristic mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth characteristics of the teacher model after dimension reduction.
And the teacher model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the dimensionality-reduced teacher model depth characteristics to obtain a teacher model depth characteristic correlation matrix.
The student model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the output of the residual error group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
In practical application, the unit for determining a supervisory loss function specifically includes: the supervisory loss function includes a first supervisory loss function and a second supervisory loss function.
And the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model.
A second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.
In order to solve the problem of an overly large model parameter count, the invention introduces the inverted residual module to replace the standard convolutional layers, mines the depth feature relevance knowledge in the super-resolution model, and, under a knowledge distillation framework, transfers the relevance knowledge from the heavyweight super-resolution model to the lightweight super-resolution model by introducing a distillation loss, so that the lightweight super-resolution model still has good performance with a low parameter count and computation, and the performance of the lightweight super-resolution model is improved. The problem that a heavyweight super-resolution model cannot be deployed on low-power embedded devices is thereby solved.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. An image super-resolution method based on depth feature relevance is characterized by comprising the following steps:
acquiring a low-resolution image to be reconstructed;
inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the training process of the student model comprises the following steps:
taking a low-resolution training image as an input of the student model; taking a high-resolution training image as the output of the student model, taking an overall loss function as a loss function, and performing supervision training on the student model through the high-resolution image and the real image output by the trained teacher model by using a loss function attenuation mechanism to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing the teacher model; the overall loss function includes a distillation loss function and a supervisory loss function.
2. The method for super-resolution of images based on depth feature correlation according to claim 1, wherein the training image with low resolution is an input of the student model; taking a high-resolution training image as the output of the student model, taking an overall loss function as a loss function, and performing supervision training on the student model through a high-resolution image and a real image output by a trained teacher model by using a loss function attenuation mechanism to obtain a trained student model, specifically comprising the following steps:
performing depth characteristic relevance calculation on the output of residual error group modules of different network depths of the student model and the trained teacher model to obtain a depth characteristic relevance matrix;
determining a distillation loss function according to the depth characteristic correlation matrix;
determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image;
taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
3. The image super-resolution method based on depth feature correlation according to claim 2, wherein before performing depth feature correlation calculation on the output of the residual group module with different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further comprises:
and training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled for two times and three times to obtain the trained teacher model.
4. The image super-resolution method based on depth feature relevance of claim 2, wherein the depth feature relevance calculation is performed on the output of the residual group module of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, and specifically comprises:
performing feature mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth features of the teacher model after dimension reduction;
normalizing and averaging the depth features of the teacher model subjected to dimensionality reduction to obtain a depth feature correlation matrix of the teacher model;
normalizing and averaging the output of the residual group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
5. The image super-resolution method based on depth feature correlation according to claim 2, wherein the determining a supervised loss function according to the high-resolution image output by the trained teacher model and the real image specifically comprises:
the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function is determined from the real image.
6. An image super-resolution system based on depth feature correlation is characterized by comprising:
the acquisition module is used for acquiring a low-resolution image to be reconstructed;
the reconstruction module is used for inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the reconstruction module comprises a student model training submodule, and the student model training submodule is used for taking a low-resolution training image as the input of the student model; taking a high-resolution training image as the output of the student model, taking an integral loss function as a loss function, and performing supervision training on the student model through the high-resolution image and the real image output by the trained teacher model by using a loss function attenuation mechanism to obtain a trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained by compressing and replacing the teacher model; the bulk loss function includes a distillation loss function and a supervisory loss function.
7. The image super-resolution system based on depth feature correlation according to claim 6, wherein the student model training sub-module specifically comprises:
the depth characteristic relevance matrix determining unit is used for carrying out depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix;
a distillation loss function determination unit for determining a distillation loss function from the depth characteristic correlation matrix;
the supervision loss function determining unit is used for determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model;
the student model training unit is used for taking a low-resolution training image as the input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
8. The depth feature correlation-based image super-resolution system of claim 7, wherein the student model training sub-module further comprises:
and the teacher model training unit is used for training the super-resolution model by using the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model and adopting the paired data sets sampled twice and three times to obtain the trained teacher model.
9. The image super-resolution system based on depth feature correlation according to claim 7, wherein the depth feature correlation matrix determination unit specifically includes:
the characteristic mapping subunit is used for performing characteristic mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth characteristics of the teacher model after dimension reduction;
the teacher model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the dimensionality-reduced teacher model depth characteristics to obtain a teacher model depth characteristic correlation matrix;
the student model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the output of the residual error group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
10. The image super-resolution system based on depth feature correlation according to claim 7, wherein the supervised loss function determination unit specifically includes: the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.
CN202111074208.4A 2021-09-14 2021-09-14 Image super-resolution method and system based on depth feature relevance Pending CN113793265A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111074208.4A CN113793265A (en) 2021-09-14 2021-09-14 Image super-resolution method and system based on depth feature relevance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111074208.4A CN113793265A (en) 2021-09-14 2021-09-14 Image super-resolution method and system based on depth feature relevance

Publications (1)

Publication Number Publication Date
CN113793265A true CN113793265A (en) 2021-12-14

Family

ID=78880175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111074208.4A Pending CN113793265A (en) 2021-09-14 2021-09-14 Image super-resolution method and system based on depth feature relevance

Country Status (1)

Country Link
CN (1) CN113793265A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019183A (en) * 2022-07-28 2022-09-06 北京卫星信息工程研究所 Remote sensing image model migration method based on knowledge distillation and image reconstruction
CN115222600A (en) * 2022-07-29 2022-10-21 大连理工大学 Multispectral remote sensing image super-resolution reconstruction method for contrast learning
CN116070697A (en) * 2023-01-17 2023-05-05 北京理工大学 Replaceable convenient knowledge distillation method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830813A (en) * 2018-06-12 2018-11-16 福建帝视信息科技有限公司 A kind of image super-resolution Enhancement Method of knowledge based distillation
CN110458765A (en) * 2019-01-25 2019-11-15 西安电子科技大学 The method for enhancing image quality of convolutional network is kept based on perception
CN112200722A (en) * 2020-10-16 2021-01-08 鹏城实验室 Generation method and reconstruction method of image super-resolution reconstruction model and electronic equipment
WO2021042828A1 (en) * 2019-09-04 2021-03-11 华为技术有限公司 Neural network model compression method and apparatus, and storage medium and chip
CN112734646A (en) * 2021-01-19 2021-04-30 青岛大学 Image super-resolution reconstruction method based on characteristic channel division
CN112884643A (en) * 2019-11-29 2021-06-01 国网江苏省电力有限公司盐城供电分公司 Infrared image super-resolution reconstruction method based on EDSR network
CN113065635A (en) * 2021-02-27 2021-07-02 华为技术有限公司 Model training method, image enhancement method and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830813A (en) * 2018-06-12 2018-11-16 福建帝视信息科技有限公司 A kind of image super-resolution Enhancement Method of knowledge based distillation
CN110458765A (en) * 2019-01-25 2019-11-15 西安电子科技大学 The method for enhancing image quality of convolutional network is kept based on perception
WO2021042828A1 (en) * 2019-09-04 2021-03-11 华为技术有限公司 Neural network model compression method and apparatus, and storage medium and chip
CN112884643A (en) * 2019-11-29 2021-06-01 国网江苏省电力有限公司盐城供电分公司 Infrared image super-resolution reconstruction method based on EDSR network
CN112200722A (en) * 2020-10-16 2021-01-08 鹏城实验室 Generation method and reconstruction method of image super-resolution reconstruction model and electronic equipment
CN112734646A (en) * 2021-01-19 2021-04-30 青岛大学 Image super-resolution reconstruction method based on characteristic channel division
CN113065635A (en) * 2021-02-27 2021-07-02 华为技术有限公司 Model training method, image enhancement method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HU Chao: "Research on Image Super-Resolution Based on Pyramid GAN and Lightweight-Network Knowledge Distillation" (基于金字塔式GAN与轻量化网络知识蒸馏图像超分辨研究), China Master's Theses Full-text Database, Information Science and Technology, 15 July 2020 (2020-07-15), pages 43 - 54 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115019183A (en) * 2022-07-28 2022-09-06 北京卫星信息工程研究所 Remote sensing image model migration method based on knowledge distillation and image reconstruction
CN115019183B (en) * 2022-07-28 2023-01-20 北京卫星信息工程研究所 Remote sensing image model migration method based on knowledge distillation and image reconstruction
CN115222600A (en) * 2022-07-29 2022-10-21 大连理工大学 Multispectral remote sensing image super-resolution reconstruction method for contrast learning
CN116070697A (en) * 2023-01-17 2023-05-05 北京理工大学 Replaceable convenient knowledge distillation method and system

Similar Documents

Publication Publication Date Title
CN113793265A (en) Image super-resolution method and system based on depth feature relevance
CN108830813B (en) Knowledge distillation-based image super-resolution enhancement method
EP4138391A1 (en) Mimic compression method and apparatus for video image, and storage medium and terminal
Luo et al. Lattice network for lightweight image restoration
CN110533591B (en) Super-resolution image reconstruction method based on codec structure
CN114418850A (en) Super-resolution reconstruction method with reference image and fusion image convolution
CN114912486A (en) Modulation mode intelligent identification method based on lightweight network
CN115829834A (en) Image super-resolution reconstruction method based on half-coupling depth convolution dictionary learning
CN109672885B (en) Video image coding and decoding method for intelligent monitoring of mine
CN116188274A (en) Image super-resolution reconstruction method
CN105184742A (en) Image denoising method of sparse coding based on Laplace graph characteristic vector
CN113658122A (en) Image quality evaluation method, device, storage medium and electronic equipment
CN116385265B (en) Training method and device for image super-resolution network
CN113096015A (en) Image super-resolution reconstruction method based on progressive sensing and ultra-lightweight network
CN117522694A (en) Diffusion model-based image super-resolution reconstruction method and system
CN111738957A (en) Intelligent beautifying method and system for image, electronic equipment and storage medium
CN116485654A (en) Lightweight single-image super-resolution reconstruction method combining convolutional neural network and transducer
CN113128586B (en) Spatial-temporal fusion method based on multi-scale mechanism and series expansion convolution remote sensing image
CN115880158A (en) Blind image super-resolution reconstruction method and system based on variational self-coding
CN115375540A (en) Terahertz image super-resolution method based on deep learning algorithm
CN114022356A (en) River course flow water level remote sensing image super-resolution method and system based on wavelet domain
CN113744152A (en) Tide water image denoising processing method, terminal and computer readable storage medium
CN112261415B (en) Image compression coding method based on overfitting convolution self-coding network
CN117853730A (en) U-shaped full convolution medical image segmentation network based on convolution kernel attention mechanism
CN113763241B (en) Depth self-learning image super-resolution method based on similar image guidance

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination