CN108830813B - Knowledge distillation-based image super-resolution enhancement method
- Publication number: CN108830813B (application CN201810603516.3A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T5/00 — Image enhancement or restoration
- G06N3/045 — Neural network architectures; combinations of networks
- G06T3/4053 — Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4076 — Super-resolution scaling using the original low-resolution images to iteratively correct the high-resolution images
- G06T7/90 — Image analysis; determination of colour characteristics
- G06T2207/20081 — Training; Learning
- G06T2207/20084 — Artificial neural networks [ANN]
Abstract
The invention discloses a knowledge distillation-based image super-resolution enhancement method, which comprises the following steps: 1) acquiring training data and test data; 2) training a teacher network, the teacher network being a neural network model with more convolutional layers; 3) training a student network; 4) the teacher network guiding the student network to learn, the feature maps of the teacher network being absorbed by the student network through three groups of guiding experiments; 5) testing and evaluating the image reconstruction effect; 6) further guiding the student network according to different matrix relations among the output feature maps. The invention uses the idea of knowledge distillation to transfer the performance of the teacher network to the student network. The student network model can run efficiently on mobile and embedded devices with low-power-consumption constraints, and, without any change to the student network structure, the PSNR of the student network guided by the teacher network is significantly improved, thereby obtaining a better reconstruction effect.
Description
Technical Field
The invention relates to the field of computer vision and deep learning, in particular to an image super-resolution enhancement method based on knowledge distillation.
Background
Super-Resolution (SR) is a classic problem in computer vision. Single Image Super-Resolution (SISR) aims to recover the High-Resolution (HR) image corresponding to a single Low-Resolution (LR) image from that LR image by means of digital image processing and related methods. In the super-resolution problem, assuming the low-resolution image is X, the goal is to recover a super-resolution image Y' that is as similar as possible to the real ground-truth (GT) image Y.
Conventional interpolation-based upscaling methods include Bilinear Interpolation and Bicubic Interpolation. They compute the missing intermediate pixels of the enlarged high-resolution image with a fixed formula that takes a weighted average of neighboring pixels in the low-resolution image, but such simple interpolation algorithms cannot generate additional image details carrying high-frequency information.
The Super-Resolution Convolutional Neural Network (SRCNN) proposed by Dong et al. [1] first applied a convolutional neural network to image super-resolution; it directly learns the end-to-end mapping between an input low-resolution image and the corresponding high-resolution image. SRCNN demonstrated that deep learning is effective for the super-resolution problem and can reconstruct much of the high-frequency image detail. Inspired by VGG-net [3], Kim et al. [2] applied a Very Deep convolutional network for Super-Resolution (VDSR); the VDSR structure consists of 20 convolutional layers, and more convolutional layers provide larger receptive fields, so more image neighborhood information can be used to predict high-frequency details, yielding a better super-resolution reconstruction effect. Inspired by SRResNet [5], Lim et al. [4] proposed the deeper Enhanced Deep Super-Resolution network (EDSR), which optimizes the SRResNet structure and obtains a still better reconstruction effect.
As can be seen from prior academic work, image reconstruction quality improves as network depth increases. However, increasing the depth of the network also increases computation and memory consumption, and a deep convolutional neural network model cannot run in real time in many practical application scenarios (for example, under low-power-consumption constraints such as mobile terminals and embedded devices).
Disclosure of Invention
The invention aims to provide an image super-resolution enhancement method based on knowledge distillation which improves the image super-resolution reconstruction effect of a network model without changing the structure of a small convolutional neural network model, so that a super-resolution model based on a convolutional neural network can run efficiently on mobile and embedded terminals.
The technical scheme adopted by the invention is as follows:
a knowledge distillation-based image super-resolution enhancement method comprises the following steps:
1) acquiring training data and testing data;
1-1) selecting DIV2K and Flickr2K as the training sets, which together contain 3450 real images; the test sets are the international public data sets Set5, Set14, BSDS100 and Urban100;
1-2) performing 3× down-sampling on the real images of the training set with the Bicubic down-sampling method to obtain a group of corresponding low-resolution images;
1-3) reading the real images and the low-resolution images with the imread() function of the opencv library, the images being in BGR format, where B, G and R represent the blue, green and red portions of the color space respectively;
1-4) converting the images from BGR space to YCrCb space, where Y represents brightness (the gray-level value), Cr the difference between the red portion and the luminance of the signal, and Cb the difference between the blue portion and the luminance of the signal;
1-5) performing channel separation on the YCrCb images, selecting only the Y-channel data for training, and normalizing the Y-channel data;
1-6) cropping the Y-channel images, taking the cropped real image blocks as the training targets and the cropped low-resolution image blocks as the network input during training; each iteration requires 32 pairs of training data.
2) Training a teacher network; the teacher network is a neural network model with more convolutional layers.
2-1) The first layer of the teacher network is a feature extraction and representation layer composed of a convolutional layer and a nonlinear activation layer; the nonlinear activation layer uses ReLU as activation function, and the operation of the first layer is expressed by the formula
$F_1(X) = \max(0, W_1 * X + b_1)$
where $W_1$ and $b_1$ are the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x);
2-2) the middle of the teacher network consists of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation; a skip connection adds the input of the first convolutional layer to the output of the second convolutional layer, so residual learning is performed only on the input of the first convolutional layer; each residual block is expressed by the formula
$F_{2n+1}(X) = \max(0, W_{2n+1} * F_n(X) + b_{2n+1}) + F_{2n-1}(X) \quad (1 \le n \le 10)$
where n is the residual block index, $F_n(X)$ is the output of the first convolutional layer and its nonlinear activation layer in the residual block, $W_{2n+1}$ and $b_{2n+1}$ are the weights and biases of the second convolutional layer in the residual block, and $F_{2n-1}(X)$ is the input of the residual block;
2-3) the reconstruction layer of the teacher network is a deconvolution layer, which up-samples the output of the preceding network layers so that the output super-resolution image is equal in size to the training target;
2-4) for training the teacher network, the learning rate is set to 0.0001 and the MSE function is used as the loss function between the training target and the network output:
$L = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$
where n is the number of training samples, $Y_i$ is an input image and $Y'_i$ is a predicted image;
2-5) minimizing the loss function with the Adam optimization method.
3) Training a student network;
to achieve a better reconstruction effect, the invention removes the batch normalization (BN) layers from the student network structure.
3-1) The first layer of the student network is a feature extraction and representation layer whose parameter settings are the same as those of the first layer of the teacher network;
3-2) the middle of the student network consists of 3 depthwise separable convolution modules, each composed of a 3 × 3 depthwise convolution layer and a 1 × 1 convolutional layer; the depthwise convolution layer and the convolutional layer are each followed by a nonlinear activation layer with ReLU activation, and the depthwise convolution is expressed by the formula
$G_{k,l,m} = \sum_{i,j} K_{i,j,m} \cdot F_{k+i-1,\,l+j-1,\,m}$
where K is a $D_k \times D_k \times M$ depthwise convolution kernel; the m-th filter in K is applied to the m-th feature map of F to produce the m-th feature map of the filtered output feature map G;
3-3) the parameter settings of the student network reconstruction layer are the same as those of the teacher network reconstruction layer;
3-4) the learning rate, loss function and optimization method of the student network are the same as those of the teacher network;
4) The teacher network guides the student network to learn; the feature maps of the teacher network are absorbed by the student network through three groups of guiding experiments.
In the guiding experiments of step 4), the MSE function is used as the loss function between the training target and the network output, denoted loss0:
$\text{loss0} = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$
where n is the number of experimental samples, $Y_i$ is an input image and $Y'_i$ is a predicted image.
Guiding experiment one: the output feature maps of the 1st depthwise separable convolution module of the student network are extracted and averaged, denoted $S_1$:
$S_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} s_i$
where $n_1$ is the number of feature maps and $s_i$ is the i-th feature map output by the 1st depthwise separable convolution module of the student network;
the output feature maps of the 4th residual block of the teacher network are extracted and averaged, denoted $T_1$:
$T_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} t_i$
where $n_1$ is the number of feature maps and $t_i$ is the i-th feature map output by the 4th residual block of the teacher network;
the MSE function is used as the loss function between $T_1$ and $S_1$ so that the student network learns the content of the teacher network's feature maps, denoted loss1:
$\text{loss1} = \| T_1 - S_1 \|^2$
where $T_1$ is the mean of the output feature maps of the teacher network's 4th residual block and $S_1$ is the mean of the output feature maps of the student network's 1st depthwise separable convolution module.
The total loss function of guiding experiment one is loss0 + loss1, minimized with the Adam optimization method.
Guiding experiment two: the output feature maps of the 2nd depthwise separable convolution module of the student network are extracted and averaged, denoted $S_2$:
$S_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} s_{2i}$
where $n_2$ is the number of feature maps and $s_{2i}$ is the i-th feature map output by the 2nd depthwise separable convolution module of the student network;
the output feature maps of the 7th residual block of the teacher network are extracted and averaged, denoted $T_2$:
$T_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} t_{2i}$
where $n_2$ is the number of feature maps and $t_{2i}$ is the i-th feature map output by the 7th residual block of the teacher network;
the MSE function is used as the loss function between $T_2$ and $S_2$, denoted loss2:
$\text{loss2} = \| T_2 - S_2 \|^2$
The total loss function of guiding experiment two is loss0 + loss2, minimized with the Adam optimization method.
Guiding experiment three: the output feature maps of the 3rd depthwise separable convolution module of the student network are extracted and averaged, denoted $S_3$:
$S_3 = \frac{1}{n_3}\sum_{i=1}^{n_3} s_{3i}$
where $n_3$ is the number of feature maps and $s_{3i}$ is the i-th feature map output by the 3rd depthwise separable convolution module of the student network;
the output feature maps of the 10th residual block of the teacher network are extracted and averaged, denoted $T_3$:
$T_3 = \frac{1}{n_3}\sum_{i=1}^{n_3} t_{3i}$
where $n_3$ is the number of feature maps and $t_{3i}$ is the i-th feature map output by the 10th residual block of the teacher network;
the MSE function is used as the loss function between $T_3$ and $S_3$, denoted loss3:
$\text{loss3} = \| T_3 - S_3 \|^2$
The total loss function of guiding experiment three is loss0 + loss3, minimized with the Adam optimization method.
5) Testing and evaluating the image reconstruction effect.
The real images of the test set are read with the imread() function of the opencv library; the image format is BGR data. The BGR images are converted to YCrCb space and channel-separated, and only the Y-channel data, whose gray values lie in [0, 255], are used for testing. The test-set gray images are 3× down-sampled with the Bicubic down-sampling method to obtain the corresponding low-resolution images, whose Y-channel data are normalized so that the gray values lie in [0, 1] and used as network input. Finally, the PSNR between the network output and the gray image of the real image is computed to measure the super-resolution reconstruction effect.
Generally, the Peak Signal-to-Noise Ratio (PSNR) is used to evaluate the quality of the image reconstruction; the higher the PSNR value, the better the reconstruction effect.
6) Further guiding the student network according to different matrix relations among the output feature maps.
Let the output tensor of an activation layer of the convolutional neural network be $A \in \mathbb{R}^{C \times H \times W}$, where C is the number of feature maps and H and W are their height and width.
The function M takes the tensor A as input and outputs a two-dimensional matrix, i.e. $M: \mathbb{R}^{C \times H \times W} \to \mathbb{R}^{H \times W}$, and the output feature maps satisfy the following relations:
maximum of the feature maps: $M_{\max}(A) = \max_{i=1,\dots,C} A_i$;
minimum of the feature maps: $M_{\min}(A) = \min_{i=1,\dots,C} A_i$;
power mean of the feature maps: $M_{mean}^{p}(A) = \frac{1}{C}\sum_{i=1}^{C} A_i^{p}$.
In these formulas, M is the mapping function, p the power coefficient, A the output tensor of an activation layer of the convolutional neural network, C the number of feature maps, and i the feature map index.
By adopting the above technical scheme, the invention uses knowledge distillation [6] to make a smaller neural network model learn the characteristics of a deeper network model, improving the image super-resolution enhancement effect of the small network model without changing its structure or increasing its computation, so that a super-resolution model with a better effect can run efficiently on mobile or embedded terminals with low-power-consumption constraints.
Drawings
The invention is described in further detail below with reference to the accompanying drawings and the detailed description;
FIG. 1 is a schematic diagram of a teacher network structure of a knowledge distillation-based image super-resolution enhancement method of the present invention;
FIG. 2 is a schematic diagram of a student network structure of an image super-resolution enhancement method based on knowledge distillation according to the present invention;
FIG. 3 is a schematic diagram of a teaching process of a teacher network to a student network of the knowledge distillation-based image super-resolution enhancement method of the present invention;
FIG. 4 shows comparison results for part of the experiments of the knowledge distillation-based image super-resolution enhancement method.
Detailed Description
As shown in figs. 1 to 4, an object of the present invention is to provide a super-resolution reconstruction method based on knowledge distillation which improves the image super-resolution reconstruction effect of a network model without changing the structure of a small convolutional neural network model, so that a super-resolution model based on a convolutional neural network can run efficiently on mobile and embedded terminals.
The invention discloses a super-resolution reconstruction method based on knowledge distillation, the specific embodiments of which are as follows:
(1) Acquiring a training set and a test set.
The training sets are DIV2K and Flickr2K: DIV2K has 800 real images and Flickr2K has 2650, for a total of 3450 images.
The test sets are the international public data sets Set5, Set14, BSDS100 and Urban100: Set5 has 5 test images, Set14 has 14, and BSDS100 and Urban100 each have 100.
The real images of the training set are 3× down-sampled with the Bicubic down-sampling method to obtain a group of corresponding low-resolution images.
The real images and the low-resolution images are read separately with the imread() function of the opencv library. The images are formatted as BGR data, where B, G and R represent the blue, green and red portions of the color space respectively. The BGR images are then converted to YCrCb space, where Y represents brightness (the gray-level value), Cr the difference between the red portion and the luminance of the signal, and Cb the difference between the blue portion and the luminance of the signal.
The conversion formula from the BGR space to the YCrCb space is as follows:
Y=0.097906×B+0.504129×G+0.256789×R+16.0
Cr=-0.071246×B-0.367789×G+0.439215×R+128.0
Cb=0.439215×B-0.290992×G-0.148223×R+128.0
The YCrCb images are channel-separated and only the Y-channel data, whose gray values lie in [0, 255], are selected for training; the Y-channel data are normalized so that the gray values lie in [0, 1].
The Y-channel images are then cropped: with a down-sampling factor of 3, the Y-channel image corresponding to the real image is cut into 120 × 120 blocks used as training targets, and the Y-channel image corresponding to the low-resolution image is cut into 40 × 40 blocks used as input during network training. Each iteration requires 32 pairs of training data.
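For illustration, the data preparation described above can be sketched in Python with OpenCV and NumPy. The patent names only the opencv imread() function; the helper below and its names are assumptions, not the patent's code:

```python
import cv2
import numpy as np

def prepare_pair(hr_path, scale=3, hr_crop=120):
    # Read the real (high-resolution) image; opencv returns BGR data
    hr_bgr = cv2.imread(hr_path)
    # Convert BGR -> YCrCb and keep only the Y (luminance) channel
    hr_y = cv2.cvtColor(hr_bgr, cv2.COLOR_BGR2YCrCb)[:, :, 0]
    # Bicubic 3x down-sampling to synthesize the low-resolution image
    h, w = hr_y.shape
    lr_y = cv2.resize(hr_y, (w // scale, h // scale),
                      interpolation=cv2.INTER_CUBIC)
    # Normalize gray values from [0, 255] to [0, 1]
    hr_y = hr_y.astype(np.float32) / 255.0
    lr_y = lr_y.astype(np.float32) / 255.0
    # Crop an aligned pair: 40x40 input block, 120x120 target block (scale 3);
    # assumes the image is at least hr_crop pixels in each dimension
    lr_crop = hr_crop // scale
    top = np.random.randint(0, lr_y.shape[0] - lr_crop + 1)
    left = np.random.randint(0, lr_y.shape[1] - lr_crop + 1)
    lr_block = lr_y[top:top + lr_crop, left:left + lr_crop]
    hr_block = hr_y[top * scale:top * scale + hr_crop,
                    left * scale:left * scale + hr_crop]
    return lr_block, hr_block
```

Each training iteration would draw 32 such pairs to form a batch.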
(2) Training the teacher network.
The teacher network is a neural network model with more convolutional layers, as shown in fig. 1. The first layer of the teacher network is a feature extraction and representation layer consisting of a convolutional layer of 64 filters of size 3 × 3 and a nonlinear activation layer. The padding mode of the convolutional layer is set to 'SAME' and the sliding step (stride) of the convolution kernel to 1, so the image sizes before and after the convolution operation are equal; the weight initialization method is the Xavier method, the bias terms are initialized to 0, and the nonlinear activation layer uses ReLU as activation function. The operation of the first layer is expressed by the formula
$F_1(X) = \max(0, W_1 * X + b_1)$
where $W_1$ and $b_1$ are the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x).
The middle of the teacher network consists of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation. A skip connection adds the input of the first convolutional layer to the output of the second convolutional layer, so residual learning is performed only on the input of the first convolutional layer. Each convolutional layer consists of 64 filters of size 3 × 3, padding is set to 'SAME', stride to 1, the weight initialization method is the Xavier method, and the biases are initialized to 0. Each residual block is expressed by the formula
$F_{2n+1}(X) = \max(0, W_{2n+1} * F_n(X) + b_{2n+1}) + F_{2n-1}(X) \quad (1 \le n \le 10)$
where n is the residual block index, $F_n(X)$ is the output of the first convolutional layer and its nonlinear activation layer in the residual block, $W_{2n+1}$ and $b_{2n+1}$ are the weights and biases of the second convolutional layer in the residual block, and $F_{2n-1}(X)$ is the input of the residual block.
When the up-sampling factor is 3, the reconstruction layer of the teacher network is one deconvolution layer with a filter of size 3 × 3 and stride set to 3. The purpose of the deconvolution layer is to up-sample the output of the preceding network layers so that the output super-resolution image is equal in size to the training target.
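The patent does not name an implementation framework. As a rough sketch under that caveat, the teacher structure could look like this in PyTorch (class names are illustrative):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 3x3 convs (64 filters, 'SAME' padding, stride 1), ReLU after each;
    # a skip connection adds the block input to the second conv's output
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, stride=1, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.conv1(x))          # F_n(X)
        return self.relu(self.conv2(out)) + x   # F_{2n+1}(X)

class TeacherNet(nn.Module):
    def __init__(self, num_blocks=10, channels=64, scale=3):
        super().__init__()
        # Feature extraction and representation layer
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[ResidualBlock(channels)
                                    for _ in range(num_blocks)])
        # Reconstruction layer: a 3x3 deconvolution with stride 3 maps an
        # HxW input to 3Hx3W, matching the training target size
        self.tail = nn.ConvTranspose2d(channels, 1, 3, stride=scale)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))
```

The Xavier weight initialization and zero biases specified above could be applied to each Conv2d weight with nn.init.xavier_uniform_.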
For training the teacher network, the learning rate is set to 0.0001 and the MSE function is used as the loss function between the training target and the network output:
$L = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$
where n is the number of experimental samples, $Y_i$ is an input image and $Y'_i$ is a predicted image.
The loss function is minimized with the Adam optimization method.
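One training iteration under these settings might be sketched as follows, continuing the illustrative PyTorch code above:

```python
teacher = TeacherNet()
optimizer = torch.optim.Adam(teacher.parameters(), lr=1e-4)  # learning rate 0.0001
criterion = nn.MSELoss()  # MSE between network output and training target

def train_step(lr_batch, hr_batch):
    # lr_batch: (32, 1, 40, 40) low-resolution Y blocks,
    # hr_batch: (32, 1, 120, 120) real-image Y blocks
    optimizer.zero_grad()
    loss = criterion(teacher(lr_batch), hr_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```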
(3) Training the student network
The structure of the student network is shown in fig. 2; to achieve a better reconstruction effect, the invention removes the batch normalization (BN) layers from the network structure. The first layer of the student network is a feature extraction and representation layer whose parameter settings are the same as those of the first layer of the teacher network.
The middle of the student network consists of 3 depthwise separable convolution modules; each module consists of a 3 × 3 depthwise convolution layer with 64 filters and a 1 × 1 convolutional layer with 64 filters, and the depthwise convolution layer and the convolutional layer are each followed by a nonlinear activation layer with ReLU activation. The padding mode of the depthwise convolution is set to 'SAME' and its stride to 1; the stride of the convolutional layer is set to 1, the weight initialization method is the Xavier method, and the biases are initialized to 0.
The operation of the depthwise convolution is expressed by the formula
$G_{k,l,m} = \sum_{i,j} K_{i,j,m} \cdot F_{k+i-1,\,l+j-1,\,m}$
where K is a $D_k \times D_k \times M$ depthwise convolution kernel; the m-th filter in K is applied to the m-th feature map of F to produce the m-th feature map of the filtered output feature map G.
The parameter setting of the student network reconstruction layer is the same as that of the teacher network reconstruction layer.
The learning rate, loss function and optimization method of the student network are the same as those of the teacher network.
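Under the same caveat as before, a depthwise separable module and the student network could be sketched as:

```python
class DepthwiseSeparableBlock(nn.Module):
    # 3x3 depthwise conv (one filter per channel, groups=channels) followed
    # by a 1x1 pointwise conv; each is followed by a ReLU activation
    def __init__(self, channels=64):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                   groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.pointwise(self.relu(self.depthwise(x))))

class StudentNet(nn.Module):
    # Same head and reconstruction layer as the teacher, but only 3 cheap
    # depthwise separable modules in the middle and no BN layers
    def __init__(self, num_blocks=3, channels=64, scale=3):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[DepthwiseSeparableBlock(channels)
                                    for _ in range(num_blocks)])
        self.tail = nn.ConvTranspose2d(channels, 1, 3, stride=scale)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))
```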
(4) The teacher network guides the student network to learn.
The process by which the teacher network guides the student network is shown in fig. 3.
Experiment one:
the output feature maps of the 1st module of the student network are extracted and averaged, denoted $S_1$:
$S_1 = \frac{1}{n}\sum_{i=1}^{n} s_i$
where n is the number of feature maps and $s_i$ is the i-th feature map output by the 1st module of the student network.
The output feature maps of the 4th module of the teacher network are extracted and averaged, denoted $T_1$:
$T_1 = \frac{1}{n}\sum_{i=1}^{n} t_i$
where n is the number of feature maps and $t_i$ is the i-th feature map output by the 4th module of the teacher network.
The MSE function is used as the loss function between $T_1$ and $S_1$ so that the student network learns the content of the teacher network's feature maps, denoted loss1:
$\text{loss1} = \| T_1 - S_1 \|^2$
The MSE function is also used as the loss function between the training target and the network output, denoted loss0:
$\text{loss0} = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$
where n is the number of experimental samples, $Y_i$ is an input image and $Y'_i$ is a predicted image.
The total loss function loss = loss0 + loss1 is minimized with the Adam optimization method.
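A sketch of this combined objective, assuming the two feature maps have already been captured (for example with forward hooks); the helper below is illustrative, not the patent's code:

```python
mse = nn.MSELoss()

def guided_loss(student_out, hr_target, student_feat, teacher_feat):
    # student_feat: output of the student's 1st module, shape (B, n, H, W)
    # teacher_feat: output of the teacher's 4th module, shape (B, n, H, W)
    loss0 = mse(student_out, hr_target)   # reconstruction loss vs. target
    s1 = student_feat.mean(dim=1)         # average over the n feature maps -> S1
    t1 = teacher_feat.mean(dim=1)         # average over the n feature maps -> T1
    loss1 = mse(s1, t1)                   # feature-imitation loss
    return loss0 + loss1                  # total loss for experiment one
```

Experiments two and three below differ only in which modules the feature maps are taken from.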
Experiment two:
the output feature maps of the 2nd module of the student network are extracted and averaged, denoted $S_2$.
The output feature maps of the 7th module of the teacher network are extracted and averaged, denoted $T_2$.
Experiment three:
the output feature maps of the 3rd module of the student network are extracted and averaged, denoted $S_3$.
The output feature maps of the 10th module of the teacher network are extracted and averaged, denoted $T_3$.
The MSE function is used as the loss function between $T_2$ and $S_2$ and between $T_3$ and $S_3$, denoted loss2 and loss3 respectively.
The total loss functions of experiments two and three are loss0 + loss2 and loss0 + loss3 respectively, each minimized with the Adam optimization method.
(5) Testing
The real images of the test set are read with the imread() function of the opencv library; the image format is BGR data. The BGR images are converted to YCrCb space and channel-separated, and only the Y-channel data, whose gray values lie in [0, 255], are used for testing. The test-set gray images are 3× down-sampled with the Bicubic down-sampling method to obtain the corresponding low-resolution images, whose Y-channel data are normalized so that the gray values lie in [0, 1] and used as network input. Finally, the PSNR between the network output and the gray image of the real image is computed to measure the super-resolution reconstruction effect. Generally, the Peak Signal-to-Noise Ratio (PSNR) is used to evaluate the quality of the image reconstruction; the higher the PSNR value, the better the reconstruction effect.
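PSNR on [0, 1]-normalized Y-channel data can be computed with the standard formula, using a peak value of 1.0; this short helper is an illustration, not the patent's code:

```python
import numpy as np

def psnr(pred_y, gt_y, peak=1.0):
    # Mean squared error between reconstructed and real Y-channel images
    mse_val = np.mean((pred_y.astype(np.float64) - gt_y.astype(np.float64)) ** 2)
    if mse_val == 0:
        return float('inf')  # identical images
    # Higher PSNR means a better reconstruction
    return 10.0 * np.log10(peak ** 2 / mse_val)
```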
The results of steps (2), (3) and (4) are shown in Table 1.
TABLE 1 Teacher network to student network guidance effect
As can be seen from table 1, the PSNR of experiments one and two is slightly improved relative to the student network.
(6) Further guidance: different matrix relations among the output feature maps are considered to further guide the student network. Let the output tensor of an activation layer of the convolutional neural network be $A \in \mathbb{R}^{C \times H \times W}$, where C is the number of feature maps and H and W are their height and width. The function M takes the tensor A as input and outputs a two-dimensional matrix, namely:
$M: \mathbb{R}^{C \times H \times W} \to \mathbb{R}^{H \times W}$
The invention considers the following relations among the feature maps:
maximum of the feature maps: $M_{\max}(A) = \max_{i=1,\dots,C} A_i$;
minimum of the feature maps: $M_{\min}(A) = \min_{i=1,\dots,C} A_i$;
power mean of the feature maps: $M_{mean}^{p}(A) = \frac{1}{C}\sum_{i=1}^{C} A_i^{p}$.
In these formulas, M is the mapping function, p the power coefficient, A the output tensor of an activation layer of the convolutional neural network, C the number of feature maps, and i the feature map index.
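These mapping functions are easy to state in code; the sketch below assumes an activation tensor of shape (C, H, W) and includes the power mean referenced as $M_{mean}^{2}(A)$ later in the text:

```python
import torch

def m_max(A):        # M_max(A): channel-wise maximum, shape (H, W)
    return A.max(dim=0).values

def m_min(A):        # M_min(A): channel-wise minimum, shape (H, W)
    return A.min(dim=0).values

def m_mean(A, p=2):  # M_mean^p(A): mean of p-th powers over the C maps
    return (A ** p).mean(dim=0)
```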
On the basis of steps (4) and (5), the total loss function loss = 2 × loss0 + loss1 + loss2 is minimized with the Adam optimization method. The results of the teacher network further guiding the student network are shown in Table 2.
TABLE 2 further guidance of teacher network to student network
The effect of the $M_{mean}^{2}(A)$ method compared with bicubic interpolation and the plain student network is shown in fig. 4.
With the super-resolution method of the invention, the PSNR of the student network guided by the teacher network is significantly improved without changing the student network structure, obtaining a better reconstruction effect. The innovations of the knowledge distillation-based super-resolution image enhancement method mainly comprise the following three aspects:
First, the invention uses the idea of knowledge distillation to transfer the performance of the teacher network to the student network, greatly improving the image super-resolution reconstruction effect of the student network without changing the student network model structure.
Secondly, to determine an effective way of transferring information from the teacher network to the student network model, the invention compares 7 different feature extraction and transfer methods and finally determines the optimal feature extraction mode.
Third, the teacher network consumes significant computing resources, while the student network model requires only a small amount of computation. The student network model provided by the invention can run efficiently on mobile and embedded devices with low-power-consumption constraints.
The technical solutions of the present invention have been described in detail, but the embodiments of the present invention are not limited to this description. It will be apparent to those skilled in the art that various changes may be made without departing from the spirit of the invention, and all changes that are equivalent or similar to the invention are intended to fall within its scope.
References
[1] Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang. Image Super-Resolution Using Deep Convolutional Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2016, 38(2): 295-307.
[2] Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee. Accurate Image Super-Resolution Using Very Deep Convolutional Networks [C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 1646-1654.
[3] Karen Simonyan, Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition [J]. Computer Science, 2014.
[4] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, Kyoung Mu Lee. Enhanced Deep Residual Networks for Single Image Super-Resolution [C]. Computer Vision and Pattern Recognition Workshops. IEEE, 2017: 1132-1140.
[5] Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network [C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 105-114.
[6] Geoffrey Hinton, Oriol Vinyals, Jeff Dean. Distilling the Knowledge in a Neural Network [J]. Computer Science, 2015, 14(7): 38-39.
Claims (4)
1. A knowledge distillation-based image super-resolution enhancement method, characterized by comprising the following steps:
1) acquiring training data and testing data;
1-1) selecting DIV2K and Flickr2K as the training sets, which together contain 3450 real images; the test sets are the international public data sets Set5, Set14, BSDS100 and Urban100;
1-2) performing 3× down-sampling on the real images of the training set with the Bicubic down-sampling method to obtain a group of corresponding low-resolution images;
1-3) reading the real images and the low-resolution images with the imread() function of the opencv library, the images being in BGR format, where B, G and R represent the blue, green and red portions of the color space respectively;
1-4) converting the images from BGR space to YCrCb space, where Y represents brightness (the gray-level value), Cr the difference between the red portion and the luminance of the signal, and Cb the difference between the blue portion and the luminance of the signal;
1-5) performing channel separation on the YCrCb images, selecting only the Y-channel data for training, and normalizing the Y-channel data;
1-6) cropping the Y-channel images, taking the cropped real image blocks as the training targets and the cropped low-resolution image blocks as the network input during training, each iteration requiring 32 pairs of training data;
2) training a teacher network, the teacher network being a neural network model with more convolutional layers;
2-1) the first layer of the teacher network is a feature extraction and representation layer composed of a convolutional layer and a nonlinear activation layer; the nonlinear activation layer uses ReLU as activation function, and the operation of the first layer is expressed by the formula $F_1(X) = \max(0, W_1 * X + b_1)$, where $W_1$ and $b_1$ are the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x);
2-2) the middle of the teacher network consists of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation; a skip connection adds the input of the first convolutional layer to the output of the second convolutional layer, so residual learning is performed only on the input of the first convolutional layer; each residual block is expressed by the formula
$F_{2o+1}(X) = \max(0, W_{2o+1} * F_o(X) + b_{2o+1}) + F_{2o-1}(X) \quad (1 \le o \le 10)$
where o is the residual block index, $F_o(X)$ is the output of the first convolutional layer and its nonlinear activation layer in the residual block, $W_{2o+1}$ and $b_{2o+1}$ are the weights and biases of the second convolutional layer in the residual block, and $F_{2o-1}(X)$ is the input of the residual block;
2-3) the reconstruction layer of the teacher network is a deconvolution layer, which up-samples the output of the preceding network layers so that the output super-resolution image is equal in size to the training target;
2-4) for training the teacher network, the learning rate is set to 0.0001 and the MSE function is used as the loss function between the training target and the network output: $L = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$, where n is the number of training samples, $Y_i$ is an input image and $Y'_i$ is a predicted image;
2-5) minimizing the loss function with the Adam optimization method;
3) training a student network;
3-1) the first layer of the student network is a feature extraction and representation layer whose parameter settings are the same as those of the first layer of the teacher network;
3-2) the middle of the student network consists of 3 depthwise separable convolution modules, each composed of a 3 × 3 depthwise convolution layer and a 1 × 1 convolutional layer; the depthwise convolution layer and the convolutional layer are each followed by a nonlinear activation layer with ReLU activation, and the depthwise convolution is expressed by the formula
$G_{k,l,m} = \sum_{i,j} K_{i,j,m} \cdot F_{k+i-1,\,l+j-1,\,m}$
where K is a $D_k \times D_k \times M$ depthwise convolution kernel; the m-th filter in K is applied to the m-th feature map of F to produce the m-th feature map of the filtered output feature map G;
3-3) the parameter settings of the student network reconstruction layer are the same as those of the teacher network reconstruction layer;
3-4) the learning rate, loss function and optimization method of the student network are the same as those of the teacher network, i.e. the learning rate is set to 0.0001 and the MSE function $L = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$ is used as the loss function between the training target and the network output, where n is the number of training samples;
4) the teacher network guiding the student network to learn, the feature maps of the teacher network being absorbed by the student network through three groups of guiding experiments;
5) testing and evaluating the image reconstruction effect;
6) further guiding the student network according to different matrix relations among the output feature maps;
letting the output tensor of an activation layer of the convolutional neural network be $A \in \mathbb{R}^{C \times H \times W}$, where C is the number of feature maps and H and W are their height and width,
the function M taking the tensor A as input and outputting a two-dimensional matrix, i.e. $M: \mathbb{R}^{C \times H \times W} \to \mathbb{R}^{H \times W}$, the output feature maps satisfy the following relations:
maximum of the feature maps: $M_{\max}(A) = \max_{i=1,\dots,C} A_i$;
minimum of the feature maps: $M_{\min}(A) = \min_{i=1,\dots,C} A_i$;
power mean of the feature maps: $M_{mean}^{p}(A) = \frac{1}{C}\sum_{i=1}^{C} A_i^{p}$;
where M is the mapping function, p the power coefficient, A the output tensor of an activation layer of the convolutional neural network, C the number of feature maps, and i the feature map index.
2. The knowledge distillation-based image super-resolution enhancement method according to claim 1, characterized in that the conversion formula from BGR space to YCrCb space in step 1-4) is as follows:
Y=0.097906×B+0.504129×G+0.256789×R+16.0
Cr=-0.071246×B-0.367789×G+0.439215×R+128.0
Cb=0.439215×B-0.290992×G-0.148223×R+128.0。
3. The knowledge distillation-based image super-resolution enhancement method according to claim 1, characterized in that in the guiding experiments of step 4) the MSE function is used as the loss function between the training target and the network output, denoted loss0:
$\text{loss0} = \frac{1}{n}\sum_{i=1}^{n} \| Y'_i - Y_i \|^2$
where n is the number of experimental samples, $Y_i$ is an input image and $Y'_i$ is a predicted image;
guiding experiment one: the output feature maps of the 1st depthwise separable convolution module of the student network are extracted and averaged, denoted $S_1$: $S_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} s_i$, where $n_1$ is the number of feature maps and $s_i$ is the i-th feature map output by the 1st depthwise separable convolution module of the student network;
the output feature maps of the 4th residual block of the teacher network are extracted and averaged, denoted $T_1$: $T_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} t_i$, where $n_1$ is the number of feature maps and $t_i$ is the i-th feature map output by the 4th residual block of the teacher network;
the MSE function is used as the loss function between $T_1$ and $S_1$ so that the student network learns the content of the teacher network's feature maps, denoted loss1: $\text{loss1} = \| T_1 - S_1 \|^2$, where $T_1$ is the mean of the output feature maps of the teacher network's 4th residual block and $S_1$ is the mean of the output feature maps of the student network's 1st depthwise separable convolution module;
the total loss function of guiding experiment one is loss0 + loss1, minimized with the Adam optimization method;
guiding experiment two: the output feature maps of the 2nd depthwise separable convolution module of the student network are extracted and averaged, denoted $S_2$: $S_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} s_{2i}$, where $n_2$ is the number of feature maps and $s_{2i}$ is the i-th feature map output by the 2nd depthwise separable convolution module of the student network;
the output feature maps of the 7th residual block of the teacher network are extracted and averaged, denoted $T_2$: $T_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} t_{2i}$, where $n_2$ is the number of feature maps and $t_{2i}$ is the i-th feature map output by the 7th residual block of the teacher network;
the MSE function is used as the loss function between $T_2$ and $S_2$, denoted loss2: $\text{loss2} = \| T_2 - S_2 \|^2$;
the total loss function of guiding experiment two is loss0 + loss2, minimized with the Adam optimization method;
guiding experiment three: the output feature maps of the 3rd depthwise separable convolution module of the student network are extracted and averaged, denoted $S_3$: $S_3 = \frac{1}{n_3}\sum_{i=1}^{n_3} s_{3i}$, where $n_3$ is the number of feature maps and $s_{3i}$ is the i-th feature map output by the 3rd depthwise separable convolution module of the student network;
the output feature maps of the 10th residual block of the teacher network are extracted and averaged, denoted $T_3$: $T_3 = \frac{1}{n_3}\sum_{i=1}^{n_3} t_{3i}$, where $n_3$ is the number of feature maps and $t_{3i}$ is the i-th feature map output by the 10th residual block of the teacher network;
the MSE function is used as the loss function between $T_3$ and $S_3$, denoted loss3: $\text{loss3} = \| T_3 - S_3 \|^2$;
the total loss function of guiding experiment three is loss0 + loss3, minimized with the Adam optimization method.
4. The knowledge distillation-based image super-resolution enhancement method according to claim 1, characterized in that step 5) comprises the following specific steps:
5-1) reading the real images of the test set with the imread() function of the opencv library, the images being in BGR data format;
5-2) converting the BGR images to YCrCb space and performing channel separation on them, selecting only the Y-channel data, whose gray values lie in [0, 255], for testing;
5-3) performing 3× down-sampling on the test-set gray images with the Bicubic down-sampling method to obtain the corresponding low-resolution images, and normalizing their Y-channel data so that the gray values lie in [0, 1], to be used as network input;
5-4) computing the PSNR between the network output and the gray image of the real image to measure the super-resolution reconstruction effect.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201810603516.3A (CN108830813B) | 2018-06-12 | 2018-06-12 | Knowledge distillation-based image super-resolution enhancement method |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN201810603516.3A (CN108830813B) | 2018-06-12 | 2018-06-12 | Knowledge distillation-based image super-resolution enhancement method |
Publications (2)

| Publication Number | Publication Date |
| --- | --- |
| CN108830813A | 2018-11-16 |
| CN108830813B | 2021-11-09 |
Family

ID=64143896

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN201810603516.3A (CN108830813B, Active) | Knowledge distillation-based image super-resolution enhancement method | 2018-06-12 | 2018-06-12 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN108830813B (en) |
Families Citing this family (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3785222B1 (en) | 2018-05-30 | 2024-04-17 | Shanghai United Imaging Healthcare Co., Ltd. | Systems and methods for image processing |
CN109658354B (en) * | 2018-12-20 | 2022-02-08 | 上海联影医疗科技股份有限公司 | Image enhancement method and system |
CN109816636B (en) * | 2018-12-28 | 2020-11-27 | 汕头大学 | Crack detection method based on intelligent terminal |
CN110309842B (en) * | 2018-12-28 | 2023-01-06 | 中国科学院微电子研究所 | Object detection method and device based on convolutional neural network |
CN109637546B (en) * | 2018-12-29 | 2021-02-12 | 苏州思必驰信息科技有限公司 | Knowledge distillation method and apparatus |
CN111414987B (en) * | 2019-01-08 | 2023-08-29 | 南京人工智能高等研究院有限公司 | Training method and training device of neural network and electronic equipment |
CN110458765B (en) * | 2019-01-25 | 2022-12-02 | 西安电子科技大学 | Image quality enhancement method based on perception preserving convolution network |
CN109978763A (en) * | 2019-03-01 | 2019-07-05 | 昆明理工大学 | A kind of image super-resolution rebuilding algorithm based on jump connection residual error network |
CN111814816A (en) * | 2019-04-12 | 2020-10-23 | 北京京东尚科信息技术有限公司 | Target detection method, device and storage medium thereof |
CN110111256B (en) * | 2019-04-28 | 2023-03-31 | 西安电子科技大学 | Image super-resolution reconstruction method based on residual distillation network |
CN110110634B (en) * | 2019-04-28 | 2023-04-07 | 南通大学 | Pathological image multi-staining separation method based on deep learning |
CN110111257B (en) * | 2019-05-08 | 2023-01-03 | 哈尔滨工程大学 | Super-resolution image reconstruction method based on characteristic channel adaptive weighting |
CN110245754B (en) * | 2019-06-14 | 2021-04-06 | 西安邮电大学 | Knowledge distillation guiding method based on position sensitive graph |
CN112116526B (en) * | 2019-06-19 | 2024-06-11 | 中国石油化工股份有限公司 | Super-resolution method of torch smoke image based on depth convolution neural network |
CN110598727B (en) * | 2019-07-19 | 2023-07-28 | 深圳力维智联技术有限公司 | Model construction method based on transfer learning, image recognition method and device thereof |
CN110796619B (en) * | 2019-10-28 | 2022-08-30 | 腾讯科技(深圳)有限公司 | Image processing model training method and device, electronic equipment and storage medium |
CN111209832B (en) * | 2019-12-31 | 2023-07-25 | 华瑞新智科技(北京)有限公司 | Auxiliary obstacle avoidance training method, equipment and medium for substation inspection robot |
CN111160533B (en) * | 2019-12-31 | 2023-04-18 | 中山大学 | Neural network acceleration method based on cross-resolution knowledge distillation |
CN111275646B (en) * | 2020-01-20 | 2022-04-26 | 南开大学 | Edge-preserving image smoothing method based on deep learning knowledge distillation technology |
US11900260B2 (en) | 2020-03-05 | 2024-02-13 | Huawei Technologies Co., Ltd. | Methods, devices and media providing an integrated teacher-student system |
CN113365107B (en) * | 2020-03-05 | 2024-05-10 | 阿里巴巴集团控股有限公司 | Video processing method, film and television video processing method and device |
CN111402311B (en) * | 2020-03-09 | 2023-04-14 | 福建帝视信息科技有限公司 | Knowledge distillation-based lightweight stereo parallax estimation method |
CN111428191B (en) * | 2020-03-12 | 2023-06-16 | 五邑大学 | Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium |
CN111639744B (en) * | 2020-04-15 | 2023-09-22 | 北京迈格威科技有限公司 | Training method and device for student model and electronic equipment |
CN111626330B (en) * | 2020-04-23 | 2022-07-26 | 南京邮电大学 | Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation |
CN111598793A (en) * | 2020-04-24 | 2020-08-28 | 云南电网有限责任公司电力科学研究院 | Method and system for defogging image of power transmission line and storage medium |
CN111582101B (en) * | 2020-04-28 | 2021-10-01 | 中国科学院空天信息创新研究院 | Remote sensing image target detection method and system based on lightweight distillation network |
CN111681178B (en) * | 2020-05-22 | 2022-04-26 | 厦门大学 | Knowledge distillation-based image defogging method |
CN111681298A (en) * | 2020-06-08 | 2020-09-18 | 南开大学 | Compressed sensing image reconstruction method based on multi-feature residual error network |
CN111724306B (en) * | 2020-06-19 | 2022-07-08 | 福州大学 | Image reduction method and system based on convolutional neural network |
CN111881920B (en) * | 2020-07-16 | 2024-04-09 | 深圳力维智联技术有限公司 | Network adaptation method of large-resolution image and neural network training device |
CN112037139B (en) * | 2020-08-03 | 2022-05-03 | 哈尔滨工业大学(威海) | Image defogging method based on RBW-cycleGAN network |
CN111967597A (en) * | 2020-08-18 | 2020-11-20 | 上海商汤临港智能科技有限公司 | Neural network training and image classification method, device, storage medium and equipment |
CN112200062B (en) * | 2020-09-30 | 2021-09-28 | 广州云从人工智能技术有限公司 | Target detection method and device based on neural network, machine readable medium and equipment |
CN112200722A (en) * | 2020-10-16 | 2021-01-08 | 鹏城实验室 | Generation method and reconstruction method of image super-resolution reconstruction model and electronic equipment |
CN112348167B (en) * | 2020-10-20 | 2022-10-11 | 华东交通大学 | Knowledge distillation-based ore sorting method and computer-readable storage medium |
CN112734645B (en) * | 2021-01-19 | 2023-11-03 | 青岛大学 | Lightweight image super-resolution reconstruction method based on feature distillation multiplexing |
CN112884650B (en) * | 2021-02-08 | 2022-07-19 | 武汉大学 | Image mixing super-resolution method based on self-adaptive texture distillation |
CN113065635A (en) * | 2021-02-27 | 2021-07-02 | 华为技术有限公司 | Model training method, image enhancement method and device |
CN113096013B (en) * | 2021-03-31 | 2021-11-26 | 南京理工大学 | Blind image super-resolution reconstruction method and system based on imaging modeling and knowledge distillation |
CN113240580B (en) * | 2021-04-09 | 2022-12-27 | 暨南大学 | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation |
CN113177888A (en) * | 2021-04-27 | 2021-07-27 | 北京有竹居网络技术有限公司 | Super-resolution restoration network model generation method, image super-resolution restoration method and device |
CN113411425B (en) * | 2021-06-21 | 2023-11-07 | 深圳思谋信息科技有限公司 | Video super-resolution model construction and processing method, device, computer equipment and medium |
CN113469977B (en) * | 2021-07-06 | 2024-01-12 | 浙江霖研精密科技有限公司 | Flaw detection device, method and storage medium based on distillation learning mechanism |
CN113592742A (en) * | 2021-08-09 | 2021-11-02 | 天津大学 | Method for removing image moire |
CN113724261A (en) * | 2021-08-11 | 2021-11-30 | 电子科技大学 | Fast image composition method based on convolutional neural network |
CN113807214B (en) * | 2021-08-31 | 2024-01-05 | 中国科学院上海微系统与信息技术研究所 | Small target face recognition method based on deit affiliated network knowledge distillation |
CN113793265A (en) * | 2021-09-14 | 2021-12-14 | 南京理工大学 | Image super-resolution method and system based on depth feature relevance |
CN113837308B (en) * | 2021-09-29 | 2022-08-05 | 北京百度网讯科技有限公司 | Knowledge distillation-based model training method and device and electronic equipment |
CN114359053B (en) * | 2022-01-07 | 2023-06-20 | 中国电信股份有限公司 | Image processing method, device, equipment and storage medium |
CN114898165B (en) * | 2022-06-20 | 2024-08-02 | 哈尔滨工业大学 | Deep learning knowledge distillation method based on model channel cutting |
CN117237190B (en) * | 2023-09-15 | 2024-03-15 | 中国矿业大学 | Lightweight image super-resolution reconstruction system and method for edge mobile equipment |
CN117952830B (en) * | 2024-01-24 | 2024-07-26 | 天津大学 | Three-dimensional image super-resolution reconstruction method based on iterative interaction guidance |
Application Events

- 2018-06-12: Application CN201810603516.3A filed in China; granted as patent CN108830813B (legal status: Active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3319039A1 (en) * | 2016-11-07 | 2018-05-09 | UMBO CV Inc. | A method and system for providing high resolution image through super-resolution reconstruction |
CN107247989A (en) * | 2017-06-15 | 2017-10-13 | 北京图森未来科技有限公司 | Neural network training method and device |
CN107358293A (en) * | 2017-06-15 | 2017-11-17 | 北京图森未来科技有限公司 | Neural network training method and device |
CN107784628A (en) * | 2017-10-18 | 2018-03-09 | 南京大学 | Super-resolution implementation method based on reconstruction optimization and deep neural networks |
CN107945146A (en) * | 2017-11-23 | 2018-04-20 | 南京信息工程大学 | Spatio-temporal satellite image fusion method based on deep convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
"Convolutional neural network-based transfer learning and knowledge distillation using multi-subject data in motor imagery BCI";Siavash Sakhavi 等;《2017 8th International IEEE/EMBS Conference on Neural Engineering (NER)》;20170815;588-591 * |
"基于增强监督知识蒸馏的交通标识分类";赵胜伟 等;《中国科技论文》;20171031;第12卷(第20期);2355-2360 * |
Also Published As
Publication number | Publication date |
---|---|
CN108830813A (en) | 2018-11-16 |
Similar Documents
Publication | Title |
---|---|
CN108830813B (en) | Knowledge distillation-based image super-resolution enhancement method | |
CN112927202B (en) | Method and system for detecting Deepfake video with combination of multiple time domains and multiple characteristics | |
CN109509152B (en) | Image super-resolution reconstruction method for generating countermeasure network based on feature fusion | |
CN113240580A (en) | Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation | |
CN109685716B (en) | Image super-resolution reconstruction method for generating countermeasure network based on Gaussian coding feedback | |
CN111898701A (en) | Model training, frame image generation, frame interpolation method, device, equipment and medium | |
CN112734646A (en) | Image super-resolution reconstruction method based on characteristic channel division | |
CN111951164B (en) | Image super-resolution reconstruction network structure and image reconstruction effect analysis method | |
CN114898284B (en) | Crowd counting method based on feature pyramid local difference attention mechanism | |
CN114612714B (en) | Curriculum learning-based reference-free image quality evaluation method | |
CN111833261A (en) | Image super-resolution restoration method for generating countermeasure network based on attention | |
CN109872305A (en) | No-reference stereo image quality evaluation method based on a quality-map generation network | |
CN113379606B (en) | Face super-resolution method based on a pre-trained generative model | |
CN113449691A (en) | Human shape recognition system and method based on non-local attention mechanism | |
CN113628152A (en) | Dim light image enhancement method based on multi-scale feature selective fusion | |
CN117635428A (en) | Super-resolution reconstruction method for lung CT image | |
CN114596233A (en) | Attention-guiding and multi-scale feature fusion-based low-illumination image enhancement method | |
CN112017116A (en) | Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof | |
CN115713462A (en) | Super-resolution model training method, image recognition method, device and equipment | |
CN115115514A (en) | Image super-resolution reconstruction method based on high-frequency information feature fusion | |
CN116403063A (en) | No-reference screen content image quality assessment method based on multi-region feature fusion | |
CN117351542A (en) | Facial expression recognition method and system | |
CN113850721A (en) | Single image super-resolution reconstruction method, device and equipment and readable storage medium | |
CN108596831B (en) | Super-resolution reconstruction method based on AdaBoost example regression | |
CN111401453A (en) | Mosaic image classification and identification method and system |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |