CN113793265A - Image super-resolution method and system based on depth feature relevance - Google Patents
- Publication number
- CN113793265A (CN202111074208.4A)
- Authority
- CN
- China
- Prior art keywords
- model
- loss function
- resolution
- image
- student model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention relates to an image super-resolution method and system based on depth feature relevance. The method comprises the following steps: acquiring a low-resolution image to be reconstructed; and inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image. The training process of the student model comprises: taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with both the high-resolution image output by a trained teacher model and the real image, to obtain the trained student model. The high-resolution image is thus estimated by a high-performance image super-resolution model with a small number of parameters.
Description
Technical Field
The invention relates to the field of image super-resolution, in particular to an image super-resolution method and system based on depth feature relevance.
Background
In daily life, more and more low-power consumption devices such as mobile phones and embedded terminals are widely used, and people need to perform related processing on low-resolution images so as to obtain high-resolution images with good visual effects on mobile devices. Therefore, the application of the image super-resolution algorithm in low power consumption devices has received a great deal of attention.
The goal of image super-resolution techniques is to estimate a high-resolution image from a low-resolution image. The degradation process of the image super-resolution problem is generally defined as:
L = SKI + n, (1)
where L, I and n represent the low-resolution image, the high-resolution image and noise, respectively, and S and K represent the matrix forms of a down-sampling operator with a scale factor and a blur kernel, respectively. Image super-resolution is an ill-posed problem because an infinite number of pairs of blur kernel K and high-resolution image I can generate the same low-resolution image L. Traditional interpolation-based methods are simple and fast, but the quality of the recovered high-resolution image is poor. In recent years, with the rapid development of deep learning, methods based on deep convolutional neural networks have come to outperform traditional interpolation-based methods by a wide margin in reconstructing high-resolution images. Related experiments show that deep networks with more parameters improve the performance of image super-resolution algorithms more markedly, but they also substantially increase computation time and memory consumption. In real scenarios, such large models, with their huge computation and parameter counts, cannot be deployed on low-power devices such as mobile phones. To solve this problem, model compression methods are needed to obtain image super-resolution models with fewer parameters and good performance.
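The ill-posedness described above can be seen on a toy one-dimensional signal. The sketch below is an illustration only: the function name `degrade`, the noise-free setting, and the sample signals and kernels are assumptions, not part of the patent. It blurs a signal with a kernel K, keeps every second sample (the down-sampling S), and shows that different pairs of kernel and high-resolution signal yield the same low-resolution result.

```python
def degrade(high_res, kernel, scale):
    """Blur a 1-D signal with `kernel`, then keep every `scale`-th sample (no noise)."""
    k = len(kernel)
    blurred = [
        sum(kernel[j] * high_res[i + j] for j in range(k))
        for i in range(len(high_res) - k + 1)
    ]
    return blurred[::scale]


# Two different high-resolution signals, same averaging kernel.
box_kernel = [0.5, 0.5]
image_a = [1.0, 3.0, 5.0, 7.0]
image_b = [2.0, 2.0, 6.0, 6.0]

# A third signal paired with a different kernel (pure sampling, no blur).
identity_kernel = [1.0, 0.0]
image_c = [2.0, 9.0, 6.0, 9.0]

low_a = degrade(image_a, box_kernel, 2)
low_b = degrade(image_b, box_kernel, 2)
low_c = degrade(image_c, identity_kernel, 2)
print(low_a, low_b, low_c)
```

All three calls produce the same two-sample low-resolution signal, so the low-resolution observation alone cannot identify which kernel and high-resolution signal produced it.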
Disclosure of Invention
The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a student super-resolution model with small parameter quantity and high performance.
In order to achieve the purpose, the invention provides the following scheme:
an image super-resolution method based on depth feature correlation comprises the following steps:
acquiring a low-resolution image to be reconstructed;
inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the training process of the student model comprises the following steps:
taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function.
Optionally, taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model, specifically comprises the following steps:
performing depth characteristic relevance calculation on the output of residual error group modules of different network depths of the student model and the trained teacher model to obtain a depth characteristic relevance matrix;
determining a distillation loss function according to the depth characteristic correlation matrix;
determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image;
taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
Optionally, before performing depth feature correlation calculation on the output of the residual group modules of different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:
and training the super-resolution model, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the target output of the teacher model, on paired data sets down-sampled by factors of two and three, to obtain the trained teacher model.
Optionally, the depth feature relevance calculation is performed on the output of the residual group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, which specifically includes:
performing feature mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth features of the teacher model after dimension reduction;
normalizing and averaging the depth features of the teacher model subjected to dimensionality reduction to obtain a depth feature correlation matrix of the teacher model;
normalizing and averaging the output of the residual group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
Optionally, the determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image specifically includes:
the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function is determined from the real image.
An image super-resolution system based on depth feature correlation comprises:
the acquisition module is used for acquiring a low-resolution image to be reconstructed;
the reconstruction module is used for inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the reconstruction module comprises a student model training submodule, which is used for taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function.
Optionally, the student model training submodule specifically includes:
the depth characteristic relevance matrix determining unit is used for carrying out depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix;
a distillation loss function determination unit for determining a distillation loss function from the depth characteristic correlation matrix;
the supervision loss function determining unit is used for determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model;
the student model training unit is used for taking a low-resolution training image as the input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
Optionally, the student model training sub-module further includes:
and the teacher model training unit is used for training the super-resolution model, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the target output of the teacher model, on paired data sets down-sampled by factors of two and three, to obtain the trained teacher model.
Optionally, the depth feature correlation matrix determining unit specifically includes:
the characteristic mapping subunit is used for performing characteristic mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth characteristics of the teacher model after dimension reduction;
the teacher model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the dimensionality-reduced teacher model depth characteristics to obtain a teacher model depth characteristic correlation matrix;
the student model depth characteristic correlation matrix determining subunit is used for carrying out normalization and averaging processing on the output of the residual error group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
Optionally, the supervision loss function determining unit specifically includes: the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the image super-resolution method and system based on depth feature relevance, when the student model is trained, a low-resolution training image is taken as the input of the student model, a high-resolution training image is taken as the target output of the student model, an overall loss function is taken as the loss function, and, using a loss function attenuation mechanism, the training of the student model is supervised with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and module replacement; the overall loss function includes a distillation loss function and a supervision loss function. The distillation loss function is introduced to transfer relevance information from the teacher model to the student model by knowledge distillation, so that the student model still performs well with a low parameter count and low computation, and reconstructs a high-quality high-resolution image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a schematic flow chart of an image super-resolution method based on depth feature correlation according to the present invention;
FIG. 2 is a schematic diagram of the overall structure of a super-resolution network using knowledge distillation provided by the present invention;
FIG. 3 is a schematic diagram of a teacher model structure provided by the present invention;
FIG. 4 is a schematic structural diagram of the inverted residual module in the student model provided by the present invention;
FIG. 5 is a schematic structural diagram of the feature mapping module provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide an image super-resolution method and system based on depth feature correlation, which are used for estimating a high-resolution image through a high-performance image super-resolution model with small parameter quantity.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1 and 2, given a low-resolution image (LR), the present invention transfers depth feature relevance knowledge from a heavyweight teacher model to a lightweight student model within a knowledge distillation framework, so that the lightweight student model improves its performance under the supervision of the heavyweight teacher model and reconstructs a high-resolution image (SR) of good quality. The solid lines in the flow chart represent the training process and the dashed lines represent the testing process. In the training process, paired data sets down-sampled by factors of two and three are used. The teacher model is trained first and its parameters are then fixed. The student model is trained from scratch: depth feature relevance knowledge is extracted from the trained teacher model and transferred to the student model for learning, so that the student model finally obtains better performance (solid line). Once the student model has converged under the supervision of the teacher model, the trained student model can be used to reconstruct a high-resolution image from a given low-resolution test image (dashed line). The invention provides an image super-resolution method based on depth feature relevance, which comprises the following steps:
and acquiring a low-resolution image to be reconstructed.
And inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image.
The training process of the student model comprises the following steps:
taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model. The overall loss function comprises a distillation loss function and a supervision loss function: the distillation loss function is determined from the student model and the trained teacher model, and the supervision loss function is determined from the high-resolution image output by the trained teacher model and from the real image. The trained teacher model comprises ten residual group modules, and the student model is obtained from the teacher model by compression and module replacement.
In practical application, taking a low-resolution training image as the input of the student model, taking a high-resolution training image as the target output of the student model, taking an overall loss function as the loss function, and, using a loss function attenuation mechanism, supervising the training of the student model with the high-resolution image output by the trained teacher model and with the real image, to obtain the trained student model, specifically comprises the following steps:
and performing depth characteristic relevance calculation on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth characteristic relevance matrix.
Determining a distillation loss function from the depth signature correlation matrix.
And determining a supervision loss function according to the high-resolution image output by the trained teacher model and the real image.
Taking a low-resolution training image as an input of the student model; and taking the high-resolution training image as the output of the student model, and training the student model by using a loss function attenuation mechanism according to the distillation loss function and the supervision loss function to obtain the trained student model.
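One way to picture how the distillation loss, the two supervision losses and an attenuation mechanism could be combined is sketched below. This passage does not state which terms are attenuated or by what schedule, so everything specific here is an assumption: the exponential decay, the choice to attenuate the teacher-derived terms, and the names `decayed_weight` and `overall_loss` are hypothetical.

```python
def decayed_weight(initial_weight, decay_rate, epoch):
    """Hypothetical exponential schedule: the weight shrinks as training progresses."""
    return initial_weight * (decay_rate ** epoch)


def overall_loss(distill_loss, teacher_sup_loss, real_sup_loss, epoch,
                 initial_weight=1.0, decay_rate=0.9):
    """Combine the loss terms into one scalar.

    Assumption: the terms derived from the teacher (distillation loss and
    supervision by the teacher output) are attenuated over time, while the
    supervision by the real image keeps a constant weight.
    """
    weight = decayed_weight(initial_weight, decay_rate, epoch)
    return real_sup_loss + weight * (distill_loss + teacher_sup_loss)


# Toy loss values; in real training these would be computed per batch.
loss_epoch_0 = overall_loss(2.0, 1.0, 0.5, epoch=0, decay_rate=0.5)
loss_epoch_1 = overall_loss(2.0, 1.0, 0.5, epoch=1, decay_rate=0.5)
print(loss_epoch_0, loss_epoch_1)
```

With such a schedule the teacher's influence is strongest early in training and fades later; other schedules or weightings would fit the same description.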
In practical application, before performing depth feature correlation calculation on the output of the residual group modules with different network depths of the student model and the trained teacher model to obtain a depth feature correlation matrix, the method further includes:
and training the super-resolution model, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the target output of the teacher model, on paired data sets down-sampled by factors of two and three, to obtain the trained teacher model.
In practical application, the depth feature relevance calculation is performed on the output of the residual error group modules of the student model and the trained teacher model with different network depths to obtain a depth feature relevance matrix, and the depth feature relevance matrix specifically includes:
and performing feature mapping on the output of the residual error group modules with different network depths of the trained teacher model to obtain the depth features of the teacher model after dimension reduction.
Normalizing and averaging the depth features of the teacher model after dimension reduction to obtain the teacher model depth feature correlation matrix.
Normalizing and averaging the output of the residual group modules with different network depths of the student model to obtain a student model depth characteristic correlation matrix; the residual error group module of the student model comprises a reverse residual error module; the depth feature correlation matrix comprises a teacher model depth feature correlation matrix and a student model depth feature correlation matrix.
In practical application, the determining a supervision loss function according to the high-resolution image and the real image output by the trained teacher model specifically includes:
the supervisory loss function includes a first supervisory loss function and a second supervisory loss function.
And determining a first supervision loss function according to the high-resolution image output by the trained teacher model.
A second supervised loss function is determined from the real image.
The invention also provides a more specific implementation mode of the image super-resolution method based on the depth feature correlation, which comprises the following steps:
selecting a teacher model and training
In order to construct the knowledge distillation framework, the heavyweight super-resolution model RCAN (residual channel attention network) shown in FIG. 3 is selected as the teacher model. The teacher model is built from nested residuals: it consists of ten residual group modules, each residual group module consists of twenty residual modules, and each residual module consists of four layers of standard convolution. Each convolutional layer is followed by an activation layer. The model performs well on the image super-resolution task, and the form of the network is shown in formula (2).
I_SR = T(I_LR) (2)
where T is the teacher model, and I_LR and I_SR are the low-resolution input image and the high-resolution image output by the teacher model, respectively.
After the teacher model is selected, it is trained. First, a low-resolution image (LR) is input into the teacher model, and one convolution layer extracts the shallow features F_0. The nested residual modules then produce the high-level features F_DF, and at the tail of the nested residual structure the shallow features F_0 and the high-level features F_DF are added together. An up-sampling module then up-samples the result to obtain the features F_up, and finally the reconstruction module produces the high-resolution image (SR). The loss function of formula (3) constrains the high-resolution image generated by the teacher model. The parameters of the trained teacher model are then fixed and are not updated by back propagation during the subsequent training of the student model; the teacher model only provides depth features at different network depths, from which relevance knowledge is computed and transferred to the student model for learning.
In formula (3), T denotes the teacher model, its two image terms are the low-resolution image and the real high-resolution image, respectively, and L_cons_T is the loss function of the teacher model training process.
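The data flow through the teacher model can be sketched with placeholder stages. This is a toy on one-dimensional lists, not the RCAN implementation: the function names and the stand-in stages are hypothetical, and combining F_0 with F_DF by element-wise addition is an assumed reading of the text.

```python
def teacher_forward(lr, shallow_conv, residual_body, upsample, reconstruct):
    """Run the four stages in order and return the super-resolved output."""
    f0 = shallow_conv(lr)                       # shallow features F_0
    f_df = residual_body(f0)                    # high-level features F_DF
    fused = [a + b for a, b in zip(f0, f_df)]   # combine F_0 and F_DF (assumed addition)
    f_up = upsample(fused)                      # up-sampled features F_up
    return reconstruct(f_up)                    # high-resolution output


def nearest_upsample(signal, scale=2):
    """Repeat every sample `scale` times (toy stand-in for the up-sampling module)."""
    return [value for value in signal for _ in range(scale)]


# Toy stand-ins for the learned stages.
sr = teacher_forward(
    [1.0, 2.0],
    shallow_conv=lambda x: [2.0 * v for v in x],
    residual_body=lambda f: [0.5 * v for v in f],
    upsample=nearest_upsample,
    reconstruct=lambda f: list(f),
)
print(sr)
```

In the real model each stage is a learned convolutional module; only the order of the stages and the long skip connection are taken from the description.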
Construction of student models
The student model is obtained by compressing the number of residual modules and the number of channels in the teacher model: specifically, the number of residual modules in each residual group module is reduced from 20 to 2, and the number of feature channels is reduced from 64 to 16. The standard convolutions in the residual modules of the teacher model are replaced with the optimized lightweight inverted residual module shown in FIG. 4, which finally yields a lightweight student model.
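As a rough illustration of how much this compression shrinks the residual body, the sketch below counts convolution weights under simplifying assumptions that are not in the source: every convolution is 3x3 with equal input and output channels, biases are ignored, the student keeps all ten residual groups, and the replacement by inverted residual modules is not modelled. The function names are hypothetical.

```python
def conv_weights(in_channels, out_channels, kernel=3):
    """Weight count of one standard convolution (bias ignored)."""
    return in_channels * out_channels * kernel * kernel


def body_weights(groups, modules_per_group, convs_per_module, channels, kernel=3):
    """Weight count of a nested residual body built from identical convolutions."""
    per_conv = conv_weights(channels, channels, kernel)
    return groups * modules_per_group * convs_per_module * per_conv


# Teacher: 10 groups x 20 modules x 4 convolutions, 64 channels.
teacher = body_weights(10, 20, 4, 64)
# Student (before module replacement): 10 groups x 2 modules x 4 convolutions, 16 channels.
student = body_weights(10, 2, 4, 16)
ratio = teacher / student
print(teacher, student, ratio)
```

Under these assumptions the ratio is (20/2) x (64/16)^2 = 160 and does not depend on the kernel size; the absolute counts are not the real model sizes.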
The optimized lightweight inverted residual module is composed of four separate layers. The first layer, called the expansion layer, uses a 1x1 convolution to map features from a low-dimensional space to a high-dimensional space. The second layer, called the depthwise convolution layer, performs lightweight filtering on each input channel through group convolution. The third layer is a ReLU activation layer, which applies a nonlinear mapping to the features. The fourth layer, called the projection layer, also uses a 1x1 convolution, mapping the feature maps from the high-dimensional space back to the low-dimensional space and constructing new features as linear combinations of the input features. The inverted residual module is adopted to reduce the computation of the student model and speed up its inference.
Because group convolution and 1x1 convolution are introduced into the inverted residual module, the computation of the student model can be reduced without sacrificing too much performance: compared with standard convolution, the inverted residual module reduces the computation by a factor of 8 to 9. The inverted residual module is represented by the following formula:
where the operator in the formula denotes the inverted residual module, F_input denotes the input high-dimensional features, and F_output denotes the high-dimensional features output by the module.
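The stated 8 to 9 times reduction can be checked with a small cost model. The sketch below compares multiply-accumulate counts of a standard k x k convolution against a depthwise convolution followed by a 1x1 pointwise convolution. It leaves out the expansion layer, and the function names, the 3x3 kernel and the 48x48 feature-map size are assumptions for illustration only.

```python
def standard_conv_macs(channels_in, channels_out, kernel, height, width):
    """Multiply-accumulates of a standard convolution over the whole feature map."""
    return kernel * kernel * channels_in * channels_out * height * width


def depthwise_separable_macs(channels_in, channels_out, kernel, height, width):
    """Depthwise k x k filtering per channel plus a 1x1 pointwise combination."""
    depthwise = kernel * kernel * channels_in * height * width
    pointwise = channels_in * channels_out * height * width
    return depthwise + pointwise


def reduction_factor(channels, kernel=3, height=48, width=48):
    """How many times cheaper the separable form is for a square layer."""
    standard = standard_conv_macs(channels, channels, kernel, height, width)
    separable = depthwise_separable_macs(channels, channels, kernel, height, width)
    return standard / separable


for channels in (16, 64, 256):
    print(channels, round(reduction_factor(channels), 2))
```

The ratio simplifies to 1 / (1/C_out + 1/k^2), so for a 3x3 kernel it approaches 9 as the channel count grows and is smaller for narrow layers.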
Training student model
Depth feature mapping in teacher models
In the framework of knowledge distillation, student models were trained from scratch.
First, a low-resolution image (LR) is input simultaneously into the trained teacher model and the student model, and the outputs of residual groups at different network depths in both models, for example the first, fifth and tenth, are taken as depth features. Because the number of channels was compressed when the student model was constructed, the depth features extracted from the teacher model and the student model have inconsistent dimensions. A feature mapping module, shown in FIG. 5, is therefore adopted. Following the minimum-error formulation of PCA, the three depth features of the teacher model are processed by the feature mapping module, which consists of two 1x1 convolution layers: the first convolution maps the teacher depth features from a high dimension to a low dimension, and the second convolution maps the low-dimensional depth features back to output high-dimensional features. In this way the low-dimensional depth feature F_reduct output at the middle of the feature mapping module retains the main feature information, so that the reconstructed output high-dimensional features are as similar as possible to the input high-dimensional features. The feature mapping module is shown as follows:
wherein the content of the first and second substances,for the feature mapping module, FinputFinger input of high-dimensional features, FoutputRefers to the output high-dimensional characteristics of the module.
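The two-layer structure can be illustrated with a small dependency-free sketch. It assumes that each 1x1 convolution is a plain per-pixel linear map over channels, with no bias or activation (the text does not say whether either is used); the names `conv1x1` and `feature_mapping` and the toy weights are hypothetical.

```python
def conv1x1(feature, weight):
    """Apply a 1x1 convolution: a per-pixel linear map over channels.

    feature: list of C_in channels, each a flat list of P pixel values.
    weight:  list of C_out rows, each a list of C_in coefficients.
    """
    n_pixels = len(feature[0])
    return [
        [sum(w * ch[p] for w, ch in zip(row, feature)) for p in range(n_pixels)]
        for row in weight
    ]


def feature_mapping(f_input, w_reduce, w_expand):
    """Two stacked 1x1 convolutions: high -> low -> high dimension."""
    f_reduct = conv1x1(f_input, w_reduce)   # low-dimensional feature used for distillation
    f_output = conv1x1(f_reduct, w_expand)  # attempted reconstruction of the input
    return f_reduct, f_output


# Toy example: 4 teacher channels with 2 pixels each, reduced to 2 channels.
f_in = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
w_reduce = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # keeps channels 0 and 1
w_expand = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
f_reduct, f_out = feature_mapping(f_in, w_reduce, w_expand)
print(len(f_reduct), len(f_out))
```

With these hand-picked weights the reconstruction recovers only the first two channels; in training, the weights would be learned so that the reconstruction error over all channels is as small as possible.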
During distillation training, the depth features of the teacher model and the student model differ in dimension, so the feature mapping module is needed to reduce the dimension of the teacher model's depth features. While the feature mapping module is being trained, its output high-dimensional feature is constrained by the mean square error cost function of formula (6):
$L_{mse} = \frac{1}{N}\sum_{i=1}^{N}\left\| F_{output}^{i} - F_{input}^{i} \right\|_2^2$ (6)
where $F_{input}$ denotes the input high-dimensional feature, $F_{output}$ denotes the output high-dimensional feature of the module, N is the number of samples in a batch, and i indexes the i-th sample in the batch.
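A minimal sketch of this reconstruction constraint follows. It assumes the per-sample error is the sum of squared element differences, averaged over the N samples of a batch; whether the original formula also averages over feature elements is not recoverable from the text. The name `reconstruction_mse` is hypothetical.

```python
def reconstruction_mse(inputs, outputs):
    """Squared reconstruction error averaged over a batch.

    inputs, outputs: lists of N samples, each a flat list of feature values.
    """
    if len(inputs) != len(outputs) or not inputs:
        raise ValueError("batches must be non-empty and of equal size")
    total = 0.0
    for f_in, f_out in zip(inputs, outputs):
        # Squared L2 distance between one output feature and its input feature.
        total += sum((o - i) ** 2 for i, o in zip(f_in, f_out))
    return total / len(inputs)


batch_in = [[1.0, 2.0], [3.0, 4.0]]
batch_out = [[1.0, 2.0], [3.0, 6.0]]
print(reconstruction_mse(batch_in, batch_out))
```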
After the low-dimensional depth feature $F_{reduct}$ is obtained from the feature mapping module, a depth feature correlation calculation module is used to mine the correlation knowledge of the depth features; this module extracts a feature correlation matrix that represents the inherent correlation among different features. The depth feature correlation matrix is calculated from the depth features as follows:
Assume that the low-dimensional depth feature obtained after feature mapping in the teacher model is $F \in \mathbb{R}^{C \times W \times H}$. The last two dimensions of the depth feature F are first flattened into one dimension, giving $F \in \mathbb{R}^{C \times WH}$, where C is the number of channels, WH is the product of the width and the height, and $\mathbb{R}$ denotes the real numbers.
To transfer the rich knowledge of the teacher model's depth features effectively during distillation, the depth features generated by the teacher model and the student model are restricted to a common solution space before the calculation. The size of the solution space is limited by normalizing each column of the depth feature F. Normalization here means subtracting the minimum value of the current dimension from the data and dividing the result by the maximum value of the current dimension minus its minimum value, so that the data are limited to the range 0 to 1. After normalization, to reduce the computation required by the depth feature correlation calculation, the normalized depth feature is first averaged over the channel dimension, as shown in formula (7), which removes the channel dimension. Then the depth feature correlation matrix D is calculated by formula (8).
In formulas (7) and (8), the superscript T denotes the transpose, F is the depth feature to be compressed, the compressed feature is the depth feature obtained after the dimensions are merged, WH is the product of the width and the height of the depth feature, and [i, j] denotes the feature point with coordinates (i, j) in the depth feature.
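The steps described above (flatten, normalize, average, correlate) can be sketched as follows. Formulas (7) and (8) are not legible in the text, so the last step is an assumption: the correlation matrix is taken to be the outer product of the channel-averaged vector with itself, which yields a WH x WH matrix. Mapping a constant column to zero is also a choice made here to avoid dividing by zero. The name `correlation_matrix` is hypothetical.

```python
def correlation_matrix(feature):
    """feature: C channels, each a W x H grid (list of rows). Returns a WH x WH matrix."""
    # 1) Flatten the two spatial dimensions: C x W x H -> C x WH.
    flat = [[v for row in channel for v in row] for channel in feature]
    n_channels, n_pixels = len(flat), len(flat[0])

    # 2) Min-max normalize each column (one spatial position across channels) to [0, 1].
    normalized = [[0.0] * n_pixels for _ in range(n_channels)]
    for p in range(n_pixels):
        column = [flat[c][p] for c in range(n_channels)]
        lo, hi = min(column), max(column)
        span = hi - lo
        for c in range(n_channels):
            # A constant column has zero span; map it to 0 instead of dividing by zero.
            normalized[c][p] = (flat[c][p] - lo) / span if span else 0.0

    # 3) Average over the channel dimension: C x WH -> vector of length WH.
    mean = [sum(normalized[c][p] for c in range(n_channels)) / n_channels
            for p in range(n_pixels)]

    # 4) Assumed correlation: outer product of the averaged vector with itself.
    return [[mean[i] * mean[j] for j in range(n_pixels)] for i in range(n_pixels)]


toy = [[[0.0, 2.0]], [[4.0, 2.0]]]  # 2 channels, each a 1 x 2 grid
print(correlation_matrix(toy))
```

Because the matrix depends only on the spatial size WH and not on the channel count, the teacher and student matrices have the same shape and can be compared directly once the spatial sizes agree.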
During student model training, both the teacher model and the student model compute depth feature correlation matrices, and the distillation loss function of formula (9) is introduced to keep the student model depth feature correlation matrix $D_S$ generated by the student model as consistent as possible with the teacher model depth feature correlation matrix $D_T$ generated by the teacher model, so that the student model learns the correlation of the depth features in the teacher model.
In formula (9), $L_{fd}$ is the distillation loss function, $D_T$ is the teacher model correlation matrix, and $D_S$ is the student model correlation matrix.
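As a rough illustration of a loss that enforces this consistency, the sketch below compares the two matrices element by element. The distance used in formula (9) is not legible in the text, so the mean absolute difference here is an assumption, and the name `distillation_loss` is hypothetical.

```python
def distillation_loss(d_student, d_teacher):
    """Mean absolute difference between two equally sized correlation matrices."""
    if len(d_student) != len(d_teacher):
        raise ValueError("matrices must have the same shape")
    diffs = [abs(s - t)
             for row_s, row_t in zip(d_student, d_teacher)
             for s, t in zip(row_s, row_t)]
    return sum(diffs) / len(diffs)


d_s = [[1.0, 2.0], [3.0, 4.0]]
d_t = [[1.0, 0.0], [3.0, 8.0]]
print(distillation_loss(d_s, d_t))
```

The loss is zero only when the student matrix matches the teacher matrix exactly, which is the consistency the text asks for.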
In addition to the distillation loss, the high-resolution image (SR) output by the teacher model and the real high-resolution image (HR) are both used to supervise the learning of the student model. The high-resolution image (SR) output by the student model is further constrained by formula (10), the first supervised loss function, and formula (11), the second supervised loss function.
In formula (10), T denotes the teacher model, S denotes the student model, the input is the i-th low-resolution image, N is the number of images in a batch, $L_{sr\_cons}$ is the first supervised loss function, and the two terms compared are the output of the teacher model and the output of the student model.
In formula (11), the reference term is the i-th real high-resolution image.
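The two supervised terms can be sketched as below. Formulas (10) and (11) are not legible in the text, so the mean absolute pixel difference used here is an assumption (it is a common choice in super-resolution training), and the names `mean_abs_error` and `supervised_losses` are hypothetical. Images are represented as flat lists of pixel values.

```python
def mean_abs_error(pred, target):
    """Mean absolute pixel difference between two equally sized flat images."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def supervised_losses(student_sr, teacher_sr, real_hr):
    """Return (first, second) supervised losses averaged over a batch of N images.

    first:  student output vs. teacher output (teacher supervision).
    second: student output vs. real high-resolution image (ground truth).
    """
    n = len(student_sr)
    first = sum(mean_abs_error(s, t) for s, t in zip(student_sr, teacher_sr)) / n
    second = sum(mean_abs_error(s, h) for s, h in zip(student_sr, real_hr)) / n
    return first, second


student = [[0.0, 0.0], [1.0, 1.0]]
teacher = [[1.0, 1.0], [1.0, 1.0]]
real = [[0.0, 2.0], [1.0, 3.0]]
print(supervised_losses(student, teacher, real))
```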
Finally, the overall loss function of the student model is shown in equation (12).
$L(\theta) = \alpha L_{cons\_s} + \beta L_{sr\_cons} + \gamma L_{fd}$ (12)
Where α, β, γ are weighting parameters that balance the effect of the different loss functions.
Regarding the choice of weight parameters: in the early stage of student model training, the distillation loss function helps the student model quickly learn the effective knowledge of the teacher model and thus accelerates convergence, but in the later stage it can inhibit further learning by the student model. A loss function weight attenuation mechanism therefore gradually reduces the influence of the distillation loss during training, which further improves the knowledge distillation effect:
In the attenuation formula, σ is the attenuation constant coefficient, $n_e$ is the current epoch number within the whole training process, and n is an adjustable parameter that determines how many epochs elapse between successive attenuations.
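One way such a schedule could combine with the overall loss of formula (12) is sketched below. The attenuation formula itself is not legible in the text, so the step decay used here (the distillation weight is multiplied by σ once every n epochs) is an assumption, as are the example values of the initial weight, σ and n. The names `distillation_weight` and `total_loss` are hypothetical.

```python
def distillation_weight(gamma0, sigma, epoch, step):
    """Assumed step decay: multiply gamma0 by sigma once every `step` epochs."""
    if step <= 0:
        raise ValueError("step must be a positive number of epochs")
    return gamma0 * sigma ** (epoch // step)


def total_loss(l_cons_s, l_sr_cons, l_fd, alpha, beta, gamma):
    """Weighted sum of the three loss terms, following formula (12)."""
    return alpha * l_cons_s + beta * l_sr_cons + gamma * l_fd


for epoch in (0, 50, 100, 150):
    gamma = distillation_weight(gamma0=1.0, sigma=0.5, epoch=epoch, step=50)
    print(epoch, gamma, total_loss(1.0, 1.0, 1.0, alpha=1.0, beta=1.0, gamma=gamma))
```

With these illustrative values the distillation term contributes fully at the start of training and is halved every 50 epochs, while the two supervised terms keep constant weights.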
The invention also provides an image super-resolution system based on depth feature correlation, which comprises the following components:
An acquisition module, configured to acquire the low-resolution image to be reconstructed.
A reconstruction module, configured to input the low-resolution image to be reconstructed into the trained student model for reconstruction to obtain a high-resolution image.
The reconstruction module comprises a student model training submodule. The student model training submodule is configured to take a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, to take the overall loss function as the loss function, and to perform supervised training of the student model, using a loss function attenuation mechanism, with the high-resolution image output by the trained teacher model and the real image, thereby obtaining the trained student model. The trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and replacement; the overall loss function includes a distillation loss function and a supervised loss function.
In practical application, the student model training submodule specifically includes:
A depth feature correlation matrix determining unit, configured to perform depth feature correlation calculation on the outputs of residual group modules at different network depths in the student model and the trained teacher model, to obtain depth feature correlation matrices.
A distillation loss function determining unit, configured to determine the distillation loss function from the depth feature correlation matrices.
A supervised loss function determining unit, configured to determine the supervised loss function according to the high-resolution image output by the trained teacher model and the real image.
A student model training unit, configured to take a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, and to train the student model with a loss function attenuation mechanism according to the distillation loss function and the supervised loss function, to obtain the trained student model.
In practical application, the student model training submodule further includes:
A teacher model training unit, configured to train the super-resolution model on paired data sets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model, to obtain the trained teacher model.
In practical application, the depth feature correlation matrix determining unit specifically includes:
A feature mapping subunit, configured to perform feature mapping on the outputs of the residual group modules at different network depths in the trained teacher model, to obtain dimension-reduced teacher model depth features.
A teacher model depth feature correlation matrix determining subunit, configured to normalize and average the dimension-reduced teacher model depth features to obtain a teacher model depth feature correlation matrix.
A student model depth feature correlation matrix determining subunit, configured to normalize and average the outputs of the residual group modules at different network depths in the student model to obtain a student model depth feature correlation matrix. The residual group modules of the student model comprise inverse residual modules; the depth feature correlation matrices comprise the teacher model depth feature correlation matrix and the student model depth feature correlation matrix.
In practical application, the supervised loss function includes a first supervised loss function and a second supervised loss function, and the supervised loss function determining unit specifically includes:
A first supervised loss function determining subunit, configured to determine the first supervised loss function according to the high-resolution image output by the trained teacher model.
A second supervised loss function determining subunit, configured to determine the second supervised loss function according to the real image.
To address the excessive parameter count of the model, the invention introduces the inverse residual module to replace the standard convolutional layer. It mines the depth feature correlation knowledge in the super-resolution model and, within a knowledge distillation framework with an added distillation loss, transfers that knowledge from the heavyweight super-resolution model to the lightweight super-resolution model. The lightweight model therefore retains good performance despite its lower parameter count and computation, which solves the problem that a heavyweight super-resolution model cannot be deployed on low-power embedded devices.
The embodiments in this description are presented in a progressive manner: each embodiment focuses on its differences from the others, and identical or similar parts of the embodiments may be cross-referenced. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.
Claims (10)
1. An image super-resolution method based on depth feature relevance is characterized by comprising the following steps:
acquiring a low-resolution image to be reconstructed;
inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the training process of the student model comprises the following steps:
taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training of the student model, using a loss function attenuation mechanism, with the high-resolution image output by a trained teacher model and the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and replacement; the overall loss function includes a distillation loss function and a supervised loss function.
2. The image super-resolution method based on depth feature correlation according to claim 1, wherein taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, taking an overall loss function as the loss function, and performing supervised training of the student model, using a loss function attenuation mechanism, with the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model, specifically comprises:
performing depth feature correlation calculation on the outputs of residual group modules at different network depths in the student model and the trained teacher model to obtain depth feature correlation matrices;
determining a distillation loss function according to the depth feature correlation matrices;
determining a supervised loss function according to the high-resolution image output by the trained teacher model and the real image;
taking a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, and training the student model with a loss function attenuation mechanism according to the distillation loss function and the supervised loss function, to obtain the trained student model.
3. The image super-resolution method based on depth feature correlation according to claim 2, wherein before performing depth feature correlation calculation on the outputs of residual group modules at different network depths in the student model and the trained teacher model to obtain depth feature correlation matrices, the method further comprises:
training the super-resolution model on paired data sets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model, to obtain the trained teacher model.
4. The image super-resolution method based on depth feature correlation according to claim 2, wherein performing depth feature correlation calculation on the outputs of residual group modules at different network depths in the student model and the trained teacher model to obtain depth feature correlation matrices specifically comprises:
performing feature mapping on the outputs of the residual group modules at different network depths in the trained teacher model to obtain dimension-reduced teacher model depth features;
normalizing and averaging the dimension-reduced teacher model depth features to obtain a teacher model depth feature correlation matrix;
normalizing and averaging the outputs of the residual group modules at different network depths in the student model to obtain a student model depth feature correlation matrix; the residual group modules of the student model comprise inverse residual modules; the depth feature correlation matrices comprise the teacher model depth feature correlation matrix and the student model depth feature correlation matrix.
5. The image super-resolution method based on depth feature correlation according to claim 2, wherein the determining a supervised loss function according to the high-resolution image output by the trained teacher model and the real image specifically comprises:
the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function is determined from the real image.
6. An image super-resolution system based on depth feature correlation is characterized by comprising:
the acquisition module is used for acquiring a low-resolution image to be reconstructed;
the reconstruction module is used for inputting the low-resolution image to be reconstructed into a trained student model for reconstruction to obtain a high-resolution image;
the reconstruction module comprises a student model training submodule, and the student model training submodule is configured to take a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, to take an overall loss function as the loss function, and to perform supervised training of the student model, using a loss function attenuation mechanism, with the high-resolution image output by the trained teacher model and the real image, to obtain the trained student model; the trained teacher model comprises ten residual group modules; the student model is obtained from the teacher model by compression and replacement; the overall loss function includes a distillation loss function and a supervised loss function.
7. The image super-resolution system based on depth feature correlation according to claim 6, wherein the student model training sub-module specifically comprises:
a depth feature correlation matrix determining unit, configured to perform depth feature correlation calculation on the outputs of residual group modules at different network depths in the student model and the trained teacher model to obtain depth feature correlation matrices;
a distillation loss function determining unit, configured to determine a distillation loss function from the depth feature correlation matrices;
a supervised loss function determining unit, configured to determine a supervised loss function according to the high-resolution image output by the trained teacher model and the real image;
a student model training unit, configured to take a low-resolution training image as the input of the student model and a high-resolution training image as the output of the student model, and to train the student model with a loss function attenuation mechanism according to the distillation loss function and the supervised loss function, to obtain the trained student model.
8. The depth feature correlation-based image super-resolution system of claim 7, wherein the student model training sub-module further comprises:
a teacher model training unit, configured to train the super-resolution model on paired data sets downsampled by factors of two and three, with the low-resolution training image as the input of the teacher model and the high-resolution training image as the output of the teacher model, to obtain the trained teacher model.
9. The image super-resolution system based on depth feature correlation according to claim 7, wherein the depth feature correlation matrix determination unit specifically includes:
a feature mapping subunit, configured to perform feature mapping on the outputs of the residual group modules at different network depths in the trained teacher model to obtain dimension-reduced teacher model depth features;
a teacher model depth feature correlation matrix determining subunit, configured to normalize and average the dimension-reduced teacher model depth features to obtain a teacher model depth feature correlation matrix;
a student model depth feature correlation matrix determining subunit, configured to normalize and average the outputs of the residual group modules at different network depths in the student model to obtain a student model depth feature correlation matrix; the residual group modules of the student model comprise inverse residual modules; the depth feature correlation matrices comprise the teacher model depth feature correlation matrix and the student model depth feature correlation matrix.
10. The image super-resolution system based on depth feature correlation according to claim 7, wherein the supervised loss function determination unit specifically includes: the supervisory loss function comprises a first supervisory loss function and a second supervisory loss function;
the first supervision loss function determining subunit is used for determining a first supervision loss function according to the high-resolution image output by the trained teacher model;
a second supervised loss function determining subunit, configured to determine a second supervised loss function from the real image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111074208.4A CN113793265A (en) | 2021-09-14 | 2021-09-14 | Image super-resolution method and system based on depth feature relevance |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113793265A true CN113793265A (en) | 2021-12-14 |
Family
ID=78880175
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113793265A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||