CN108830813B - Knowledge distillation-based image super-resolution enhancement method - Google Patents

Knowledge distillation-based image super-resolution enhancement method

Info

Publication number
CN108830813B
Authority
CN
China
Prior art keywords
network
layer
output
image
teacher
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810603516.3A
Other languages
Chinese (zh)
Other versions
CN108830813A (en)
Inventor
高钦泉
赵岩
童同
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Imperial Vision Information Technology Co ltd
Original Assignee
Fujian Imperial Vision Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Imperial Vision Information Technology Co ltd filed Critical Fujian Imperial Vision Information Technology Co ltd
Priority to CN201810603516.3A priority Critical patent/CN108830813B/en
Publication of CN108830813A publication Critical patent/CN108830813A/en
Application granted granted Critical
Publication of CN108830813B publication Critical patent/CN108830813B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T3/4076 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution using the original low-resolution images to iteratively correct the high-resolution images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a knowledge distillation-based image super-resolution enhancement method, which comprises the following steps: 1) acquiring training data and testing data; 2) training a teacher network, the teacher network being a neural network model with deeper convolutional layers; 3) training a student network; 4) the teacher network guiding the student network to learn, the student network learning to absorb the feature maps of the teacher network through three groups of guiding experiments; 5) testing and evaluating the image reconstruction effect; 6) further guiding the student network according to different matrix relations among the output feature maps. The invention uses the ideas of knowledge distillation to transfer the performance of the teacher network to the student network. The student network model can run efficiently on mobile and embedded devices with low-power-consumption limits, and the PSNR of the student network guided by the teacher network is significantly improved without changing the student network structure, thereby obtaining a better reconstruction effect.

Description

Knowledge distillation-based image super-resolution enhancement method
Technical Field
The invention relates to the field of computer vision and deep learning, in particular to an image super-resolution enhancement method based on knowledge distillation.
Background
Super-Resolution (SR) is a classic problem in computer vision. Single Image Super-Resolution (SISR) aims to recover the high-resolution (HR) image corresponding to a single low-resolution (LR) image using digital image processing and related methods. In the super-resolution problem, assuming the low-resolution image is X, the goal is to recover a super-resolution image Y' that is as similar as possible to the real ground-truth (GT) image Y.
Conventional interpolation-based magnification methods include bilinear interpolation and bicubic interpolation. They compute the missing intermediate pixels of the magnified high-resolution image with a fixed formula that takes a weighted average of neighboring pixels in the low-resolution image, but such simple interpolation algorithms cannot generate additional image details carrying high-frequency information.
The Super-Resolution Convolutional Neural Network (SRCNN) proposed by Dong et al. [1] first applied a convolutional neural network to image super-resolution; it directly learns an end-to-end mapping between an input low-resolution image and the corresponding high-resolution image. SRCNN demonstrated that deep learning is effective for the super-resolution problem and can reconstruct much of the high-frequency image detail. Kim et al. [2], inspired by VGG-net [3], applied a Very Deep convolutional network for Super-Resolution (VDSR); its structure consists of 20 convolutional layers, and more convolutional layers provide larger receptive fields, so more image neighborhood information can be used to predict high-frequency details and achieve a better super-resolution reconstruction. Lim et al. [4], inspired by SRResNet [5], proposed the deeper Enhanced Deep Super-Resolution network (EDSR), which optimizes the SRResNet structure and obtains an even better super-resolution reconstruction effect.
As can be seen from prior academic work, the image reconstruction effect improves as network depth increases. However, increasing the depth also increases the computation and memory consumption, so a deep convolutional neural network model cannot run in real time in many practical application scenarios (for example, under low-power constraints such as mobile terminals and embedded devices).
Disclosure of Invention
The invention aims to provide a knowledge distillation-based image super-resolution enhancement method that improves the image super-resolution reconstruction effect of a network model without changing the structure of the small convolutional neural network model, so that a super-resolution model based on a convolutional neural network can run efficiently on mobile and embedded terminals.
The technical scheme adopted by the invention is as follows:
a knowledge distillation-based image super-resolution enhancement method comprises the following steps:
1) acquiring training data and testing data;
1-1) selecting DIV2K and Flickr2K as training sets, which together contain 3450 real images; the test sets are the international public data sets Set5, Set14, BSDS100 and Urban100;
1-2) down-sampling the real images of the training set by a factor of 3 using the Bicubic down-sampling method to obtain a group of low-resolution images corresponding to the real images;
1-3) reading the real image and the low-resolution image respectively using the imread() function of the opencv library; the images are in BGR format, whose channels represent the blue, green, and red portions of the color space;
1-4) converting the image from BGR space to YCrCb space, where Y represents the brightness, i.e. the gray-level value, Cr represents the difference between the red portion of the BGR signal and its brightness value, and Cb represents the difference between the blue portion of the BGR signal and its brightness value;
1-5) carrying out channel separation on an image in a YCrCb space, only selecting Y-channel data for training, and carrying out normalization processing on the Y-channel data;
1-6) cropping the Y-channel images, taking the cropped real image blocks as training targets and the cropped low-resolution image blocks as the input during network training; each iteration requires 32 pairs of training data.
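For illustration only (and not as a limitation of the invention), steps 1-2) to 1-6) can be sketched in Python with the opencv library as follows. The file path, the top-left crop position, and the helper name prepare_pair are assumptions; the 120 × 120 and 40 × 40 crop sizes are taken from the detailed description below. Note that OpenCV's built-in YCrCb conversion uses full-range coefficients that differ slightly from the conversion formulas given later in the detailed description.

```python
import cv2
import numpy as np

def prepare_pair(hr_path, scale=3, hr_crop=120):
    """Sketch of steps 1-2) to 1-6): read a real image, synthesize the
    low-resolution image by Bicubic down-sampling, keep the normalized
    Y channel, and cut aligned HR/LR training patches."""
    hr_bgr = cv2.imread(hr_path)                        # step 1-3): BGR image
    h, w = hr_bgr.shape[:2]
    lr_bgr = cv2.resize(hr_bgr, (w // scale, h // scale),
                        interpolation=cv2.INTER_CUBIC)  # step 1-2): 3x down-sampling
    # steps 1-4)/1-5): BGR -> YCrCb, keep only the Y channel, normalize to [0, 1]
    hr_y = cv2.cvtColor(hr_bgr, cv2.COLOR_BGR2YCrCb)[:, :, 0] / 255.0
    lr_y = cv2.cvtColor(lr_bgr, cv2.COLOR_BGR2YCrCb)[:, :, 0] / 255.0
    # step 1-6): 120 x 120 target block and the aligned 40 x 40 input block
    # (the top-left corner is chosen here only as an example)
    hr_patch = hr_y[:hr_crop, :hr_crop].astype(np.float32)
    lr_patch = lr_y[:hr_crop // scale, :hr_crop // scale].astype(np.float32)
    return lr_patch, hr_patch
```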
2) Training a teacher network; the teacher network is a neural network model with deeper convolutional layers,
2-1) the first layer of the teacher network is a feature extraction and representation layer, which is composed of a convolutional layer and a nonlinear activation layer, the nonlinear activation layer selects ReLU as an activation function, and the operation of the first layer can be expressed by the following formula:
F_1(X) = max(0, W_1 * X + b_1)
where W_1 and b_1 are respectively the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x);
2-2) the middle layers of the teacher network consist of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation; the input of the first convolutional layer is added to the output of the second convolutional layer through a skip connection, so residual learning is performed only with respect to the input of the first convolutional layer; each residual block can be expressed by the following formula:
F_{2n+1}(X) = max(0, W_{2n+1} * F_{2n}(X) + b_{2n+1}) + F_{2n-1}(X), 1 ≤ n ≤ 10
where n is the residual block index, F_{2n}(X) is the output of the first convolutional layer and its nonlinear activation layer in the block, W_{2n+1} and b_{2n+1} are respectively the weight and bias of the second convolutional layer in the block, and F_{2n-1}(X) is the input of the residual block.
2-3) the reconstruction layer of the teacher network is a deconvolution layer, which up-samples the output of the preceding network layer so that the output super-resolution image is equal in size to the training target;
2-4) for the training of the teacher network, the learning rate is set to 0.0001, and the MSE function is used as the loss function between the training target and the network output, whose expression is as follows:
loss = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of training samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image.
2-5) minimizing the loss function using Adam optimization method.
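As an illustrative, non-limiting sketch, the teacher network of steps 2-1) to 2-5) can be written in PyTorch as follows. It follows the layer sizes stated in the detailed description (64 filters of size 3 × 3, 10 residual blocks, a stride-3 deconvolution reconstruction layer, Xavier weights and zero biases); the class names and the single-channel Y input are assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Step 2-2): two 3x3 convolutions, each followed by ReLU, with a
    skip connection adding the block input to the second activation."""
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.conv1(x))          # F_{2n}
        return self.relu(self.conv2(out)) + x   # F_{2n+1} = ReLU(conv2(...)) + F_{2n-1}

class TeacherNet(nn.Module):
    """Feature-extraction layer, 10 residual blocks, and a deconvolution
    reconstruction layer that up-samples by the factor `scale`."""
    def __init__(self, channels=64, scale=3):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[ResidualBlock(channels) for _ in range(10)])
        # kernel 3 with stride 3: the output is exactly 3x the input size (step 2-3)
        self.tail = nn.ConvTranspose2d(channels, 1, 3, stride=scale)
        for m in self.modules():                # Xavier weights, zero biases
            if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
                nn.init.xavier_uniform_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))
```

Per steps 2-4) and 2-5), training would then minimize nn.MSELoss() between the network output and the training target using torch.optim.Adam(teacher.parameters(), lr=0.0001).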
3) Training a student network;
In order to achieve a better reconstruction effect, the invention removes the batch-normalization (BN) layers from the student network structure.
3-1) the first layer of the student network is a characteristic extraction and representation layer, and the parameter setting of the first layer of the student network is the same as that of the first layer of the teacher network;
3-2) the middle layers of the student network consist of 3 depthwise separable convolution modules; each module consists of a 3 × 3 depthwise convolution layer and a 1 × 1 convolution layer, each followed by a nonlinear activation layer with ReLU activation. The depthwise convolution can be expressed by the following formula:
G_{k,l,m} = Σ_{i,j} K_{i,j,m} · F_{k+i−1, l+j−1, m}
where K is a D_k × D_k × M depthwise convolution kernel, and the m-th filter of K is applied to the m-th feature map of the input F to produce the m-th feature map of the filtered output feature map G;
3-3) the parameter setting of the student network reconstruction layer is the same as that of the teacher network reconstruction layer;
3-4) the learning rate, the loss function and the optimization method of the student network are the same as those of the teacher network;
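Continuing the same illustrative PyTorch sketch, the student network of steps 3-1) to 3-4) differs from the teacher only in its middle layers. Setting groups equal to the channel count turns the first convolution into the depthwise convolution of the formula above; the 64-channel width comes from the detailed description, and the class names are assumptions.

```python
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """Step 3-2): a 3x3 depthwise convolution (one filter per input
    feature map) followed by a 1x1 pointwise convolution, each followed
    by a ReLU activation."""
    def __init__(self, channels=64):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class StudentNet(nn.Module):
    """Same head and reconstruction layer as the teacher sketch, with
    only 3 depthwise separable modules in between (steps 3-1/3-3)."""
    def __init__(self, channels=64, scale=3):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1),
                                  nn.ReLU(inplace=True))
        self.body = nn.Sequential(*[DepthwiseSeparableBlock(channels)
                                    for _ in range(3)])
        self.tail = nn.ConvTranspose2d(channels, 1, 3, stride=scale)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))
```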
4) The teacher network guides the student network to learn; through three groups of guiding experiments, the student network learns to absorb the feature maps of the teacher network.
In the guiding experiments of step 4), the MSE function between the training target and the network output is used as a loss function, denoted loss0, namely:
loss0 = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image.
Guiding experiment one: extract the output feature maps of the 1st depthwise separable convolution module of the student network and average them; the mean is denoted S_1, namely:
S_1 = (1/n_1) Σ_{i=1}^{n_1} s_i
where n_1 is the number of feature maps and s_i is the i-th feature map output by the 1st depthwise separable convolution module of the student network;
extract the output feature maps of the 4th residual block module of the teacher network and average them; the mean is denoted T_1, namely:
T_1 = (1/n_1) Σ_{i=1}^{n_1} t_i
where t_i is the i-th feature map output by the 4th residual block module of the teacher network;
the MSE between T_1 and S_1 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss1, namely:
loss1 = ‖T_1 − S_1‖²
where T_1 is the mean of the output feature maps of the 4th residual block module of the teacher network and S_1 is the mean of the output feature maps of the 1st depthwise separable convolution module of the student network.
The total loss function of guiding experiment one is loss0 + loss1, and the Adam optimization method is used to minimize it.
Guiding experiment two: extract the output feature maps of the 2nd depthwise separable convolution module of the student network and average them; the mean is denoted S_2, namely:
S_2 = (1/n_2) Σ_{i=1}^{n_2} s_{2i}
where n_2 is the number of feature maps and s_{2i} is the i-th feature map output by the 2nd depthwise separable convolution module of the student network;
extract the output feature maps of the 7th residual block module of the teacher network and average them; the mean is denoted T_2, namely:
T_2 = (1/n_2) Σ_{i=1}^{n_2} t_{2i}
where t_{2i} is the i-th feature map output by the 7th residual block module of the teacher network;
the MSE between T_2 and S_2 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss2, namely:
loss2 = ‖T_2 − S_2‖²
where T_2 is the mean of the output feature maps of the 7th residual block module of the teacher network and S_2 is the mean of the output feature maps of the 2nd depthwise separable convolution module of the student network.
The total loss function of guiding experiment two is loss0 + loss2, and the Adam optimization method is used to minimize it.
Guiding experiment three: extract the output feature maps of the 3rd depthwise separable convolution module of the student network and average them; the mean is denoted S_3, namely:
S_3 = (1/n_3) Σ_{i=1}^{n_3} s_{3i}
where n_3 is the number of feature maps and s_{3i} is the i-th feature map output by the 3rd depthwise separable convolution module of the student network;
extract the output feature maps of the 10th residual block module of the teacher network and average them; the mean is denoted T_3, namely:
T_3 = (1/n_3) Σ_{i=1}^{n_3} t_{3i}
where t_{3i} is the i-th feature map output by the 10th residual block module of the teacher network;
the MSE between T_3 and S_3 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss3, namely:
loss3 = ‖T_3 − S_3‖²
where T_3 is the mean of the output feature maps of the 10th residual block module of the teacher network and S_3 is the mean of the output feature maps of the 3rd depthwise separable convolution module of the student network.
The total loss function of guiding experiment three is loss0 + loss3, and the Adam optimization method is used to minimize the total loss function.
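For illustration, one guiding experiment can be sketched as follows. The sketch assumes the paired intermediate feature tensors (for example, the student's 1st module and the teacher's 4th residual block) have already been captured from the two networks, e.g. with forward hooks, and that both are shaped (batch, channels, height, width); the function name guided_loss is an assumption.

```python
import torch.nn.functional as F

def guided_loss(student_sr, target, student_feat, teacher_feat):
    """loss0 is the MSE between the student output and the training
    target; loss_k is the MSE between the channel-wise means S_k and
    T_k of the paired student/teacher feature maps."""
    loss0 = F.mse_loss(student_sr, target)
    s_mean = student_feat.mean(dim=1)             # S_k: mean over feature maps
    t_mean = teacher_feat.mean(dim=1).detach()    # T_k: no gradient to the teacher
    loss_k = F.mse_loss(s_mean, t_mean)
    return loss0 + loss_k                         # total loss of the experiment
```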
5) Testing and evaluating the image reconstruction effect;
reading a real image of a test set by using an imread () function in an opencv library, wherein the image format is BGR data, converting the image of the BGR space into a YCrCB space, carrying out channel separation on the image of the YCrCb space, only selecting Y-channel data for testing, wherein the gray-scale value range of the Y-channel data is between [0 and 255], carrying out 3-time down-sampling on the gray-scale image of the test set by a Bicubic down-sampling method to obtain a corresponding low-resolution image, carrying out normalization processing on the Y-channel data to change the gray-scale value range of the Y-channel data to be between [0 and 1], and taking the gray-scale value range as the input of a network. And finally, calculating the PSNR of the output of the network and the gray level image of the real image to measure the super-resolution reconstruction effect.
Generally, the Peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the image reconstruction effect, and the higher the PSNR value is, the better the image reconstruction effect is.
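PSNR itself can be computed directly from the mean squared error of the two gray-level images, as in this short sketch (255 is the peak gray-level value of the Y channel; the function name is illustrative):

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between the ground-truth and
    reconstructed Y-channel images; higher means a better reconstruction."""
    mse = np.mean((reference.astype(np.float64)
                   - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')                  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```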
6) Further guiding the student network according to different matrix relations among the output feature maps;
Let the output tensor of an activation layer of the convolutional neural network be A ∈ R^{C×H×W}, where C is the number of feature maps and H and W are respectively their height and width.
The function M takes the tensor A as input and outputs a two-dimensional matrix, namely M: R^{C×H×W} → R^{H×W}, and the output feature maps satisfy the following relations:
p-th power of the feature-map mean: M_{mean}^p(A) = ((1/C) Σ_{i=1}^{C} A_i)^p
mean of the p-th power of the feature maps: M_{mean^p}(A) = (1/C) Σ_{i=1}^{C} A_i^p
maximum of the feature maps: M_{max}(A) = max_{i=1,…,C} A_i
minimum of the feature maps: M_{min}(A) = min_{i=1,…,C} A_i
where M is the mapping function, p is the power exponent, A is the output tensor of the activation layer of the convolutional neural network, C is the number of feature maps, and i is the feature-map index.
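These four mappings can be written as one-line functions over a tensor A of shape (C, H, W); the sketch below uses PyTorch for consistency with the earlier sketches, and the function names are illustrative.

```python
import torch

def m_mean_pow_p(a, p=2):
    """p-th power of the feature-map mean: ((1/C) * sum_i A_i) ** p."""
    return a.mean(dim=0) ** p

def m_mean_of_pow_p(a, p=2):
    """Mean of the p-th power of the feature maps: (1/C) * sum_i A_i ** p."""
    return (a ** p).mean(dim=0)

def m_max(a):
    """Element-wise maximum over the C feature maps."""
    return a.max(dim=0).values

def m_min(a):
    """Element-wise minimum over the C feature maps."""
    return a.min(dim=0).values
```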
By adopting the above technical scheme, the invention applies knowledge distillation [6] to make a smaller neural network model learn the features of a deeper network model, improving the image super-resolution enhancement effect of the small network model without changing its model structure or increasing its computation, so that a super-resolution model with a better effect can run efficiently on mobile or embedded terminals with low-power-consumption limits.
Drawings
The invention is described in further detail below with reference to the accompanying drawings and the detailed description;
FIG. 1 is a schematic diagram of a teacher network structure of a knowledge distillation-based image super-resolution enhancement method of the present invention;
FIG. 2 is a schematic diagram of a student network structure of an image super-resolution enhancement method based on knowledge distillation according to the present invention;
FIG. 3 is a schematic diagram of a teaching process of a teacher network to a student network of the knowledge distillation-based image super-resolution enhancement method of the present invention;
FIG. 4 shows the comparison effect of part of experiments of the knowledge distillation-based image super-resolution enhancement method.
Detailed Description
As shown in FIGS. 1 to 4, an object of the present invention is to provide a knowledge distillation-based super-resolution reconstruction method that improves the image super-resolution reconstruction effect of a network model without changing the structure of the small convolutional neural network model, so that a super-resolution model based on a convolutional neural network can run efficiently on mobile and embedded terminals.
The invention discloses a super-resolution reconstruction method based on knowledge distillation, which comprises the following specific implementation modes:
(1) Acquiring a training set and a test set.
The training set consists of DIV2K and Flickr2K: DIV2K has 800 real images and Flickr2K has 2650 real images, for a total of 3450 images.
The test sets are the international public data sets Set5, Set14, BSDS100 and Urban100: Set5 has 5 test images, Set14 has 14, and BSDS100 and Urban100 each have 100.
The real images of the training set are down-sampled by a factor of 3 using the Bicubic down-sampling method to obtain a group of corresponding low-resolution images.
The real image and the low-resolution image are read separately using the imread() function of the opencv library; the images are in BGR format, whose channels represent the blue, green, and red portions of the color space. The BGR-space image is then converted to YCrCb space, where Y represents the brightness, i.e. the gray-level value, Cr represents the difference between the red portion of the BGR signal and its brightness value, and Cb represents the difference between the blue portion of the BGR signal and its brightness value.
The conversion formulas from BGR space to YCrCb space are as follows:
Y = 0.097906 × B + 0.504129 × G + 0.256789 × R + 16.0
Cr = -0.071246 × B - 0.367789 × G + 0.439215 × R + 128.0
Cb = 0.439215 × B - 0.290992 × G - 0.148223 × R + 128.0
Channel separation is performed on the YCrCb image and only the Y-channel data is selected for training; the gray-level values range over [0, 255], and the Y-channel data is normalized so that the range becomes [0, 1].
The Y-channel images are then cropped: when the down-sampling factor is 3, the Y-channel image of the real image is cropped into 120 × 120 blocks as training targets, and the Y-channel image of the corresponding low-resolution image is cropped into 40 × 40 blocks as the input for network training. Each iteration requires 32 pairs of training data.
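As a sketch, the conversion formulas above can be applied directly to a float BGR image with values in [0, 255]. Note that they are the limited-range BT.601 variant, so the result differs slightly from OpenCV's full-range cv2.COLOR_BGR2YCrCb conversion; the function name is illustrative.

```python
import numpy as np

def bgr_to_ycrcb(bgr):
    """Apply the conversion formulas above channel by channel."""
    b, g, r = bgr[..., 0], bgr[..., 1], bgr[..., 2]
    y  =  0.097906 * b + 0.504129 * g + 0.256789 * r + 16.0
    cr = -0.071246 * b - 0.367789 * g + 0.439215 * r + 128.0
    cb =  0.439215 * b - 0.290992 * g - 0.148223 * r + 128.0
    return np.stack([y, cr, cb], axis=-1)
```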
(2) Training the teacher network.
The teacher network is a neural network model with deeper convolutional layers, as shown in FIG. 1. The first layer of the teacher network is a feature extraction and representation layer, consisting of a convolutional layer of 64 filters of size 3 × 3 and a nonlinear activation layer. The padding mode of the convolutional layer is set to 'SAME' and the sliding stride of the convolution kernel is set to 1, so the images before and after the convolution operation are equal in size; the weights are initialized with the Xavier method, the bias terms are initialized to 0, and the nonlinear activation layer uses ReLU as the activation function. The operation of the first layer can be expressed by the following formula:
F_1(X) = max(0, W_1 * X + b_1)
where W_1 and b_1 are respectively the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x).
The middle layers of the teacher network consist of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation. The input of the first convolutional layer is added to the output of the second convolutional layer through a skip connection, so residual learning is performed only with respect to the input of the first convolutional layer. Each convolutional layer consists of 64 filters of size 3 × 3, padding is set to 'SAME', stride is set to 1, the weights are initialized with the Xavier method, and the biases are initialized to 0. Each residual block can be expressed by the following formula:
F_{2n+1}(X) = max(0, W_{2n+1} * F_{2n}(X) + b_{2n+1}) + F_{2n-1}(X), 1 ≤ n ≤ 10
where n is the residual block index, F_{2n}(X) is the output of the first convolutional layer and its nonlinear activation layer in the block, W_{2n+1} and b_{2n+1} are respectively the weight and bias of the second convolutional layer in the block, and F_{2n-1}(X) is the input of the residual block.
When the up-sampling factor is 3, the reconstruction layer of the teacher network is a single deconvolution layer with a 3 × 3 filter and stride set to 3. The deconvolution layer up-samples the output of the preceding network layers so that the output super-resolution image is equal in size to the training target.
For the training of the teacher network, the learning rate is set to 0.0001, and the MSE function is used as the loss function between the training target and the network output, whose expression is as follows:
loss = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of training samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image.
The loss function is minimized using Adam optimization method.
(3) Training of the student network
The structure of the student network is shown in FIG. 2. To achieve a better reconstruction effect, the invention removes the batch-normalization (BN) layers from the network structure. The first layer of the student network is a feature extraction and representation layer whose parameter settings are the same as those of the first layer of the teacher network.
The middle layers of the student network consist of 3 depthwise separable convolution modules. Each module consists of a depthwise convolution layer with 64 filters of size 3 × 3 and a pointwise convolution layer with 64 filters of size 1 × 1, each followed by a nonlinear activation layer with ReLU activation. The padding mode of the depthwise convolution is set to 'SAME' and its stride to 1; the stride of the 1 × 1 convolution layer is set to 1, the weights are initialized with the Xavier method, and the biases are initialized to 0. The depthwise convolution can be expressed by the following formula:
G_{k,l,m} = Σ_{i,j} K_{i,j,m} · F_{k+i−1, l+j−1, m}
where K is a D_k × D_k × M depthwise convolution kernel, and the m-th filter of K is applied to the m-th feature map of the input F to produce the m-th feature map of the filtered output feature map G.
The parameter setting of the student network reconstruction layer is the same as that of the teacher network reconstruction layer.
The learning rate, loss function and optimization method of the student network are the same as those of the teacher network.
(4) The teacher network guides the student network to learn.
The guidance process from the teacher network to the student network is shown in FIG. 3.
Experiment one:
Extract the output feature maps of the 1st module of the student network and average them; the mean is denoted S_1, namely:
S_1 = (1/n) Σ_{i=1}^{n} s_i
where n is the number of feature maps and s_i is the i-th feature map output by the 1st module of the student network.
Extract the output feature maps of the 4th module of the teacher network and average them; the mean is denoted T_1, namely:
T_1 = (1/n) Σ_{i=1}^{n} t_i
where t_i is the i-th feature map output by the 4th module of the teacher network.
The MSE between T_1 and S_1 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss1, namely:
loss1 = ‖T_1 − S_1‖²
The MSE function between the training target and the network output is denoted loss0, i.e.:
loss0 = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image.
The total loss function loss = loss0 + loss1 is minimized using the Adam optimization method.
Experiment two:
Extract the output feature maps of the 2nd module of the student network and average them; the mean is denoted S_2.
Extract the output feature maps of the 7th module of the teacher network and average them; the mean is denoted T_2.
Experiment three:
Extract the output feature maps of the 3rd module of the student network and average them; the mean is denoted S_3.
Extract the output feature maps of the 10th module of the teacher network and average them; the mean is denoted T_3.
The MSE functions between T_2 and S_2 and between T_3 and S_3 are used as loss functions, denoted loss2 and loss3 respectively.
The total loss functions of experiments two and three are loss0 + loss2 and loss0 + loss3 respectively, and each is minimized using the Adam optimization method.
(5) Testing
Read the real images of the test set with the imread() function of the opencv library; the images are BGR data. Convert the images from BGR space to YCrCb space and perform channel separation, selecting only the Y-channel data for testing; the gray-level values of the Y channel range over [0, 255]. Down-sample the gray-level images of the test set by a factor of 3 with the Bicubic method to obtain the corresponding low-resolution images, normalize their Y-channel data so that the gray-level values lie in [0, 1], and use them as the input of the network. Finally, compute the PSNR between the network output and the gray-level image of the real image to measure the super-resolution reconstruction effect. Generally, the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the image reconstruction; the higher the PSNR value, the better the reconstruction effect.
The results of steps (2), (3) and (4) are shown in Table 1.
TABLE 1 Guidance effect of the teacher network on the student network
[Table 1, shown as an image in the original, lists the PSNR of the teacher network, the student network, and guiding experiments one to three.]
As can be seen from Table 1, the PSNR of experiments one and two is slightly improved relative to the unguided student network.
(6) Further guidance: different matrix relations among the output feature maps are considered to further guide the student network. Let the output tensor of an activation layer of the convolutional neural network be A ∈ R^{C×H×W}, where C is the number of feature maps and H and W are respectively their height and width; the function M takes the tensor A as input and outputs a two-dimensional matrix, namely:
M: R^{C×H×W} → R^{H×W}
The invention considers the following relations among the feature maps:
p-th power of the feature-map mean: M_{mean}^p(A) = ((1/C) Σ_{i=1}^{C} A_i)^p
mean of the p-th power of the feature maps: M_{mean^p}(A) = (1/C) Σ_{i=1}^{C} A_i^p
maximum of the feature maps: M_{max}(A) = max_{i=1,…,C} A_i
minimum of the feature maps: M_{min}(A) = min_{i=1,…,C} A_i
where M is the mapping function, p is the power exponent, A is the output tensor of the activation layer of the convolutional neural network, C is the number of feature maps, and i is the feature-map index.
On the basis of steps (4) and (5), the total loss function is loss = 2 × loss0 + loss1 + loss2, minimized using the Adam optimization method. The results of the teacher network further guiding the student network are shown in Table 2.
TABLE 2 Further guidance effect of the teacher network on the student network
[Table 2, shown as an image in the original, lists the PSNR obtained with the different mapping functions M.]
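For illustration, the further-guidance objective can be sketched as below; the names loss0, loss1, loss2 reuse the earlier guided_loss sketch (the output loss and the mean-feature-map losses at the two guided layer pairs), and opt is assumed to be the Adam optimizer over the student parameters.

```python
# Step (6) objective: weight the output loss twice and add both
# intermediate guidance losses, then take one Adam step (sketch).
total = 2.0 * loss0 + loss1 + loss2
opt.zero_grad()
total.backward()
opt.step()
```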
The effect of M_{mean}^2(A) compared with the bicubic interpolation method and the unguided student network is shown in FIG. 4.
With the super-resolution method of the invention, the PSNR of the student network guided by the teacher network is significantly improved without changing the student network structure, obtaining a better reconstruction effect. The innovations of the knowledge distillation-based image super-resolution enhancement method mainly lie in the following three aspects:
first, the invention utilizes the related thought of knowledge distillation to transfer the performance of the teacher network to the student network, thereby greatly improving the image super-resolution reconstruction effect of the student network under the condition of not changing the structure of the student network model.
Secondly, in order to determine the effective information transfer mode of the teacher network and the student network model, the invention compares 7 different feature extraction and transfer methods, and finally determines the optimal feature extraction mode.
Third, the teacher network consumes significant computing resources while the student network model requires only a small amount of computation. The student network model provided by the invention can be efficiently operated on mobile equipment and embedded equipment with low power consumption limitation.
The technical solutions of the present invention have been described in detail above, but the embodiments of the present invention are not limited to this description. It will be apparent to those skilled in the art that various changes can be made without departing from the spirit of the invention, and all equivalent or similar variations shall fall within the scope of the invention.
Reference to the literature
[1] Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang. Image Super-Resolution Using Deep Convolutional Networks [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2016, 38(2): 295-307.
[2] Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee. Accurate Image Super-Resolution Using Very Deep Convolutional Networks [C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 1646-1654.
[3] Karen Simonyan, Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition [J]. Computer Science, 2014.
[4] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, Kyoung Mu Lee. Enhanced Deep Residual Networks for Single Image Super-Resolution [C]. Computer Vision and Pattern Recognition Workshops. IEEE, 2017: 1132-1140.
[5] Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network [C]. IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 105-114.
[6] Geoffrey Hinton, Oriol Vinyals, Jeff Dean. Distilling the Knowledge in a Neural Network [J]. Computer Science, 2015, 14(7): 38-39.

Claims (4)

1. A knowledge distillation-based image super-resolution enhancement method, characterized by comprising the following steps:
1) acquiring training data and testing data;
1-1) selecting DIV2K and Flickr2K as training sets, which together contain 3450 real images; the test sets are the international public data sets Set5, Set14, BSDS100 and Urban100;
1-2) down-sampling the real images of the training set by a factor of 3 using the Bicubic down-sampling method to obtain a group of low-resolution images corresponding to the real images;
1-3) reading the real image and the low-resolution image respectively using the imread() function of the opencv library; the images are in BGR format, whose channels represent the blue, green, and red portions of the color space;
1-4) converting the image from BGR space to YCrCb space, where Y represents the brightness, i.e. the gray-level value, Cr represents the difference between the red portion of the BGR signal and its brightness value, and Cb represents the difference between the blue portion of the BGR signal and its brightness value;
1-5) carrying out channel separation on an image in a YCrCb space, only selecting Y-channel data for training, and carrying out normalization processing on the Y-channel data;
1-6) cutting a Y-channel image, taking a cut real image block as a training target, taking a cut low-resolution image block as an input during network training, wherein training data required by each iteration is 32 pairs;
2) training a teacher network; the teacher network is a neural network model with deeper convolutional layers;
2-1) the first layer of the teacher network is a feature extraction and representation layer, which consists of a convolutional layer and a nonlinear activation layer; the nonlinear activation layer uses ReLU as the activation function, and the operation of the first layer is expressed by the following formula: F_1(X) = max(0, W_1 * X + b_1), where W_1 and b_1 are respectively the weights and biases of the first convolutional layer, "*" denotes the convolution operation, and the ReLU function is defined as max(0, x);
2-2) the middle layers of the teacher network consist of 10 residual blocks; each residual block has two convolutional layers, and each convolutional layer is followed by a nonlinear activation layer with ReLU activation; the input of the first convolutional layer is added to the output of the second convolutional layer through a skip connection, so residual learning is performed only with respect to the input of the first convolutional layer; each residual block is represented by the following formula:
F_{2o+1}(X) = max(0, W_{2o+1} * F_{2o}(X) + b_{2o+1}) + F_{2o-1}(X), 1 ≤ o ≤ 10
where o is the residual block index, F_{2o}(X) is the output of the first convolutional layer and its nonlinear activation layer in the block, W_{2o+1} and b_{2o+1} are respectively the weight and bias of the second convolutional layer in the block, and F_{2o-1}(X) is the input of the residual block;
2-3) the reconstruction layer of the teacher network is a deconvolution layer, which up-samples the output of the preceding network layer so that the output super-resolution image is equal in size to the training target;
2-4) for the training of the teacher network, the learning rate is set to 0.0001, and the MSE function is used as the loss function between the training target and the network output, whose expression is as follows:
loss = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of training samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image;
2-5) minimizing the loss function using Adam optimization method;
3) training a student network;
3-1) the first layer of the student network is a characteristic extraction and representation layer, and the parameter setting of the first layer of the student network is the same as that of the first layer of the teacher network;
3-2) the middle layers of the student network consist of 3 depthwise separable convolution modules; each module consists of a 3 × 3 depthwise convolution layer and a 1 × 1 convolution layer, each followed by a nonlinear activation layer with ReLU activation; the depthwise convolution is represented by the following formula:
G_{k,l,m} = Σ_{i,j} K_{i,j,m} · F_{k+i−1, l+j−1, m}
where K is a D_k × D_k × M depthwise convolution kernel, and the m-th filter of K is applied to the m-th feature map of the input F to produce the m-th feature map of the filtered output feature map G;
3-3) the parameter settings of the student network reconstruction layer are the same as those of the teacher network reconstruction layer;
3-4) the learning rate, the loss function and the optimization method of the student network are the same as those of the teacher network; that is, the learning rate is set to 0.0001 and the MSE function is used as the loss function between the training target and the network output, as follows:
loss = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of training samples;
4) the teacher network guides the student network to learn; the characteristic diagram of the teacher network is absorbed by the student network learning through three groups of guiding experiments;
5) testing and evaluating the image reconstruction effect;
6) further guiding the student network according to different matrix relations among the output feature maps;
let the output tensor of an activation layer of the convolutional neural network be A ∈ R^{C×H×W}, where C is the number of feature maps and H and W are respectively their height and width;
the function M takes the tensor A as input and outputs a two-dimensional matrix, namely M: R^{C×H×W} → R^{H×W}, and the output feature maps satisfy the following relations:
p-th power of the feature-map mean: M_{mean}^p(A) = ((1/C) Σ_{i=1}^{C} A_i)^p
mean of the p-th power of the feature maps: M_{mean^p}(A) = (1/C) Σ_{i=1}^{C} A_i^p
maximum of the feature maps: M_{max}(A) = max_{i=1,…,C} A_i
minimum of the feature maps: M_{min}(A) = min_{i=1,…,C} A_i
where M is the mapping function, p is the power exponent, A is the output tensor of the activation layer of the convolutional neural network, C is the number of feature maps, and i is the feature-map index.
2. The method for enhancing the super-resolution of the image based on the knowledge distillation as claimed in claim 1, wherein: the conversion formula from the BGR space to the YCrCb space in step 1-4) is as follows:
Y=0.097906×B+0.504129×G+0.256789×R+16.0
Cr=-0.071246×B-0.367789×G+0.439215×R+128.0
Cb=0.439215×B-0.290992×G-0.148223×R+128.0.
3. The method for enhancing the super-resolution of the image based on the knowledge distillation as claimed in claim 1, wherein in the guiding experiments of step 4), the MSE function between the training target and the network output is used as a loss function, denoted loss0, namely:
loss0 = (1/n) Σ_{i=1}^{n} ‖Y_i − Y'_i‖²
where n is the number of experimental samples, Y_i is the i-th real (target) image, and Y'_i is the corresponding predicted image;
guiding experiment one: extract the output feature maps of the 1st depthwise separable convolution module of the student network and average them; the mean is denoted S_1, namely:
S_1 = (1/n_1) Σ_{i=1}^{n_1} s_i
where n_1 is the number of feature maps and s_i is the i-th feature map output by the 1st depthwise separable convolution module of the student network;
extract the output feature maps of the 4th residual block module of the teacher network and average them; the mean is denoted T_1, namely:
T_1 = (1/n_1) Σ_{i=1}^{n_1} t_i
where t_i is the i-th feature map output by the 4th residual block module of the teacher network;
the MSE between T_1 and S_1 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss1, namely:
loss1 = ‖T_1 − S_1‖²
where T_1 is the mean of the output feature maps of the 4th residual block module of the teacher network and S_1 is the mean of the output feature maps of the 1st depthwise separable convolution module of the student network;
the total loss function of guiding experiment one is loss0 + loss1, and the Adam optimization method is used to minimize it;
guiding experiment two: extract the output feature maps of the 2nd depthwise separable convolution module of the student network and average them; the mean is denoted S_2, namely:
S_2 = (1/n_2) Σ_{i=1}^{n_2} s_{2i}
where n_2 is the number of feature maps and s_{2i} is the i-th feature map output by the 2nd depthwise separable convolution module of the student network;
extract the output feature maps of the 7th residual block module of the teacher network and average them; the mean is denoted T_2, namely:
T_2 = (1/n_2) Σ_{i=1}^{n_2} t_{2i}
where t_{2i} is the i-th feature map output by the 7th residual block module of the teacher network;
the MSE between T_2 and S_2 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss2, namely:
loss2 = ‖T_2 − S_2‖²
where T_2 is the mean of the output feature maps of the 7th residual block module of the teacher network and S_2 is the mean of the output feature maps of the 2nd depthwise separable convolution module of the student network;
the total loss function of guiding experiment two is loss0 + loss2, and the Adam optimization method is used to minimize it;
guiding experiment three: extract the output feature maps of the 3rd depthwise separable convolution module of the student network and average them; the mean is denoted S_3, namely:
S_3 = (1/n_3) Σ_{i=1}^{n_3} s_{3i}
where n_3 is the number of feature maps and s_{3i} is the i-th feature map output by the 3rd depthwise separable convolution module of the student network;
extract the output feature maps of the 10th residual block module of the teacher network and average them; the mean is denoted T_3, namely:
T_3 = (1/n_3) Σ_{i=1}^{n_3} t_{3i}
where t_{3i} is the i-th feature map output by the 10th residual block module of the teacher network;
the MSE between T_3 and S_3 is used as a loss function so that the student network learns the content of the teacher network's feature maps; it is denoted loss3, namely:
loss3 = ‖T_3 − S_3‖²
where T_3 is the mean of the output feature maps of the 10th residual block module of the teacher network and S_3 is the mean of the output feature maps of the 3rd depthwise separable convolution module of the student network;
the total loss function of guiding experiment three is loss0 + loss3, and the Adam optimization method is used to minimize the total loss function.
4. The method for enhancing the super-resolution of the image based on the knowledge distillation as claimed in claim 1, wherein the specific steps of step 5) are as follows:
5-1) reading real images of the test set by using an imread () function in an opencv library, wherein the images are in a BGR data format,
5-2) converting the image of the BGR space to the YCrCb space and performing channel separation on the image of the YCrCb space, at which time only Y-channel data is selected for testing, the gray scale value of which ranges from 0 to 255,
5-3) down-sampling the gray-level image of the test set by a factor of 3 using the Bicubic down-sampling method to obtain the corresponding low-resolution image, normalizing its Y-channel data so that the gray-level value range becomes [0, 1], and using it as the input of the network;
5-4) calculating the PSNR of the output of the network and the gray level image of the real image to measure the super-resolution reconstruction effect.
CN201810603516.3A 2018-06-12 2018-06-12 Knowledge distillation-based image super-resolution enhancement method Active CN108830813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810603516.3A CN108830813B (en) 2018-06-12 2018-06-12 Knowledge distillation-based image super-resolution enhancement method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810603516.3A CN108830813B (en) 2018-06-12 2018-06-12 Knowledge distillation-based image super-resolution enhancement method

Publications (2)

Publication Number Publication Date
CN108830813A CN108830813A (en) 2018-11-16
CN108830813B true CN108830813B (en) 2021-11-09

Family

ID=64143896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810603516.3A Active CN108830813B (en) 2018-06-12 2018-06-12 Knowledge distillation-based image super-resolution enhancement method

Country Status (1)

Country Link
CN (1) CN108830813B (en)

Families Citing this family (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3785222B1 (en) 2018-05-30 2024-04-17 Shanghai United Imaging Healthcare Co., Ltd. Systems and methods for image processing
CN109658354B (en) * 2018-12-20 2022-02-08 上海联影医疗科技股份有限公司 Image enhancement method and system
CN109816636B (en) * 2018-12-28 2020-11-27 汕头大学 Crack detection method based on intelligent terminal
CN110309842B (en) * 2018-12-28 2023-01-06 中国科学院微电子研究所 Object detection method and device based on convolutional neural network
CN109637546B (en) * 2018-12-29 2021-02-12 苏州思必驰信息科技有限公司 Knowledge distillation method and apparatus
CN111414987B (en) * 2019-01-08 2023-08-29 南京人工智能高等研究院有限公司 Training method and training device of neural network and electronic equipment
CN110458765B (en) * 2019-01-25 2022-12-02 西安电子科技大学 Image quality enhancement method based on perception preserving convolution network
CN109978763A (en) * 2019-03-01 2019-07-05 昆明理工大学 A kind of image super-resolution rebuilding algorithm based on jump connection residual error network
CN111814816A (en) * 2019-04-12 2020-10-23 北京京东尚科信息技术有限公司 Target detection method, device and storage medium thereof
CN110111256B (en) * 2019-04-28 2023-03-31 西安电子科技大学 Image super-resolution reconstruction method based on residual distillation network
CN110110634B (en) * 2019-04-28 2023-04-07 南通大学 Pathological image multi-staining separation method based on deep learning
CN110111257B (en) * 2019-05-08 2023-01-03 哈尔滨工程大学 Super-resolution image reconstruction method based on characteristic channel adaptive weighting
CN110245754B (en) * 2019-06-14 2021-04-06 西安邮电大学 Knowledge distillation guiding method based on position sensitive graph
CN112116526B (en) * 2019-06-19 2024-06-11 中国石油化工股份有限公司 Super-resolution method of torch smoke image based on depth convolution neural network
CN110598727B (en) * 2019-07-19 2023-07-28 深圳力维智联技术有限公司 Model construction method based on transfer learning, image recognition method and device thereof
CN110796619B (en) * 2019-10-28 2022-08-30 腾讯科技(深圳)有限公司 Image processing model training method and device, electronic equipment and storage medium
CN111209832B (en) * 2019-12-31 2023-07-25 华瑞新智科技(北京)有限公司 Auxiliary obstacle avoidance training method, equipment and medium for substation inspection robot
CN111160533B (en) * 2019-12-31 2023-04-18 中山大学 Neural network acceleration method based on cross-resolution knowledge distillation
CN111275646B (en) * 2020-01-20 2022-04-26 南开大学 Edge-preserving image smoothing method based on deep learning knowledge distillation technology
US11900260B2 (en) 2020-03-05 2024-02-13 Huawei Technologies Co., Ltd. Methods, devices and media providing an integrated teacher-student system
CN113365107B (en) * 2020-03-05 2024-05-10 阿里巴巴集团控股有限公司 Video processing method, film and television video processing method and device
CN111402311B (en) * 2020-03-09 2023-04-14 福建帝视信息科技有限公司 Knowledge distillation-based lightweight stereo parallax estimation method
CN111428191B (en) * 2020-03-12 2023-06-16 五邑大学 Antenna downtilt angle calculation method and device based on knowledge distillation and storage medium
CN111639744B (en) * 2020-04-15 2023-09-22 北京迈格威科技有限公司 Training method and device for student model and electronic equipment
CN111626330B (en) * 2020-04-23 2022-07-26 南京邮电大学 Target detection method and system based on multi-scale characteristic diagram reconstruction and knowledge distillation
CN111598793A (en) * 2020-04-24 2020-08-28 云南电网有限责任公司电力科学研究院 Method and system for defogging image of power transmission line and storage medium
CN111582101B (en) * 2020-04-28 2021-10-01 中国科学院空天信息创新研究院 Remote sensing image target detection method and system based on lightweight distillation network
CN111681178B (en) * 2020-05-22 2022-04-26 厦门大学 Knowledge distillation-based image defogging method
CN111681298A (en) * 2020-06-08 2020-09-18 南开大学 Compressed sensing image reconstruction method based on multi-feature residual error network
CN111724306B (en) * 2020-06-19 2022-07-08 福州大学 Image reduction method and system based on convolutional neural network
CN111881920B (en) * 2020-07-16 2024-04-09 深圳力维智联技术有限公司 Network adaptation method of large-resolution image and neural network training device
CN112037139B (en) * 2020-08-03 2022-05-03 哈尔滨工业大学(威海) Image defogging method based on RBW-cycleGAN network
CN111967597A (en) * 2020-08-18 2020-11-20 上海商汤临港智能科技有限公司 Neural network training and image classification method, device, storage medium and equipment
CN112200062B (en) * 2020-09-30 2021-09-28 广州云从人工智能技术有限公司 Target detection method and device based on neural network, machine readable medium and equipment
CN112200722A (en) * 2020-10-16 2021-01-08 鹏城实验室 Generation method and reconstruction method of image super-resolution reconstruction model and electronic equipment
CN112348167B (en) * 2020-10-20 2022-10-11 华东交通大学 Knowledge distillation-based ore sorting method and computer-readable storage medium
CN112734645B (en) * 2021-01-19 2023-11-03 青岛大学 Lightweight image super-resolution reconstruction method based on feature distillation multiplexing
CN112884650B (en) * 2021-02-08 2022-07-19 武汉大学 Image mixing super-resolution method based on self-adaptive texture distillation
CN113065635A (en) * 2021-02-27 2021-07-02 华为技术有限公司 Model training method, image enhancement method and device
CN113096013B (en) * 2021-03-31 2021-11-26 南京理工大学 Blind image super-resolution reconstruction method and system based on imaging modeling and knowledge distillation
CN113240580B (en) * 2021-04-09 2022-12-27 暨南大学 Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN113177888A (en) * 2021-04-27 2021-07-27 北京有竹居网络技术有限公司 Hyper-resolution restoration network model generation method, image hyper-resolution restoration method and device
CN113411425B (en) * 2021-06-21 2023-11-07 深圳思谋信息科技有限公司 Video super-division model construction processing method, device, computer equipment and medium
CN113469977B (en) * 2021-07-06 2024-01-12 浙江霖研精密科技有限公司 Flaw detection device, method and storage medium based on distillation learning mechanism
CN113592742A (en) * 2021-08-09 2021-11-02 天津大学 Method for removing image moire
CN113724261A (en) * 2021-08-11 2021-11-30 电子科技大学 Fast image composition method based on convolutional neural network
CN113807214B (en) * 2021-08-31 2024-01-05 中国科学院上海微系统与信息技术研究所 Small target face recognition method based on deit affiliated network knowledge distillation
CN113793265A (en) * 2021-09-14 2021-12-14 南京理工大学 Image super-resolution method and system based on depth feature relevance
CN113837308B (en) * 2021-09-29 2022-08-05 北京百度网讯科技有限公司 Knowledge distillation-based model training method and device and electronic equipment
CN114359053B (en) * 2022-01-07 2023-06-20 中国电信股份有限公司 Image processing method, device, equipment and storage medium
CN114898165B (en) * 2022-06-20 2024-08-02 哈尔滨工业大学 Deep learning knowledge distillation method based on model channel cutting
CN117237190B (en) * 2023-09-15 2024-03-15 中国矿业大学 Lightweight image super-resolution reconstruction system and method for edge mobile equipment
CN117952830B (en) * 2024-01-24 2024-07-26 天津大学 Three-dimensional image super-resolution reconstruction method based on iterative interaction guidance

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247989A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of neural network training method and device
CN107358293A (en) * 2017-06-15 2017-11-17 北京图森未来科技有限公司 A kind of neural network training method and device
CN107784628A (en) * 2017-10-18 2018-03-09 南京大学 A kind of super-resolution implementation method based on reconstruction optimization and deep neural network
CN107945146A (en) * 2017-11-23 2018-04-20 南京信息工程大学 A kind of space-time Satellite Images Fusion method based on depth convolutional neural networks
EP3319039A1 (en) * 2016-11-07 2018-05-09 UMBO CV Inc. A method and system for providing high resolution image through super-resolution reconstruction

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3319039A1 (en) * 2016-11-07 2018-05-09 UMBO CV Inc. A method and system for providing high resolution image through super-resolution reconstruction
CN107247989A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of neural network training method and device
CN107358293A (en) * 2017-06-15 2017-11-17 北京图森未来科技有限公司 A kind of neural network training method and device
CN107784628A (en) * 2017-10-18 2018-03-09 南京大学 A kind of super-resolution implementation method based on reconstruction optimization and deep neural network
CN107945146A (en) * 2017-11-23 2018-04-20 南京信息工程大学 A kind of space-time Satellite Images Fusion method based on depth convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Convolutional neural network-based transfer learning and knowledge distillation using multi-subject data in motor imagery BCI";Siavash Sakhavi 等;《2017 8th International IEEE/EMBS Conference on Neural Engineering (NER)》;20170815;588-591 *
"基于增强监督知识蒸馏的交通标识分类";赵胜伟 等;《中国科技论文》;20171031;第12卷(第20期);2355-2360 *

Also Published As

Publication number Publication date
CN108830813A (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN108830813B (en) Knowledge distillation-based image super-resolution enhancement method
CN112927202B (en) Method and system for detecting Deepfake video with combination of multiple time domains and multiple characteristics
CN109509152B (en) Image super-resolution reconstruction method for generating countermeasure network based on feature fusion
CN113240580A (en) Lightweight image super-resolution reconstruction method based on multi-dimensional knowledge distillation
CN109685716B (en) Image super-resolution reconstruction method for generating countermeasure network based on Gaussian coding feedback
CN111898701A (en) Model training, frame image generation, frame interpolation method, device, equipment and medium
CN112734646A (en) Image super-resolution reconstruction method based on characteristic channel division
CN111951164B (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method
CN114898284B (en) Crowd counting method based on feature pyramid local difference attention mechanism
CN114612714B (en) Curriculum learning-based reference-free image quality evaluation method
CN111833261A (en) Image super-resolution restoration method for generating countermeasure network based on attention
CN109872305A (en) It is a kind of based on Quality Map generate network without reference stereo image quality evaluation method
CN113379606B (en) Face super-resolution method based on pre-training generation model
CN113449691A (en) Human shape recognition system and method based on non-local attention mechanism
CN113628152A (en) Dim light image enhancement method based on multi-scale feature selective fusion
CN117635428A (en) Super-resolution reconstruction method for lung CT image
CN114596233A (en) Attention-guiding and multi-scale feature fusion-based low-illumination image enhancement method
CN112017116A (en) Image super-resolution reconstruction network based on asymmetric convolution and construction method thereof
CN115713462A (en) Super-resolution model training method, image recognition method, device and equipment
CN115115514A (en) Image super-resolution reconstruction method based on high-frequency information feature fusion
CN116403063A (en) No-reference screen content image quality assessment method based on multi-region feature fusion
CN117351542A (en) Facial expression recognition method and system
CN113850721A (en) Single image super-resolution reconstruction method, device and equipment and readable storage medium
CN108596831B (en) Super-resolution reconstruction method based on AdaBoost example regression
CN111401453A (en) Mosaic image classification and identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant