WO2022160980A1 - Super-resolution method and apparatus, terminal device, and storage medium - Google Patents

Super-resolution method and apparatus, terminal device, and storage medium

Info

Publication number
WO2022160980A1
Authority
WO
WIPO (PCT)
Prior art keywords: resolution, image, super, sub, low
Application number
PCT/CN2021/137582
Other languages
English (en)
French (fr)
Inventor
孔祥涛
赵恒远
董超
乔宇
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2022160980A1

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N 3/04: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
    • G06N 3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
    • G06T 3/4038: Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 2207/20081: Indexing scheme for image analysis or image enhancement; training; learning

Definitions

  • the present application relates to the technical field of deep learning, and in particular, to a super-resolution method, apparatus, terminal device and storage medium.
  • Super-resolution technology refers to the technology of reconstructing low-resolution images into high-resolution images.
  • the super-resolution algorithm based on deep learning is the most commonly used super-resolution method at present.
  • a super-resolution algorithm based on deep learning cuts the low-resolution image into sub-images, inputs each sub-image into a super-resolution network model for processing to obtain a reconstructed image, and then splices the reconstructed images of the sub-images to obtain the high-resolution image.
  • the more commonly used super-resolution network models include FSRCNN (Accelerating the Super-Resolution Convolutional Neural Network), CARN (Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network), SRResNet (from Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network), RCAN (Image Super-Resolution Using Very Deep Residual Channel Attention Networks), and so on. These models involve a large amount of computation when performing super-resolution processing on low-resolution images, which makes processing slow.
  • the present application provides a super-resolution method, apparatus, terminal device, and storage medium, which can reduce the amount of computation for super-resolution processing.
  • the present application provides a super-resolution method, comprising: inputting a low-resolution image to be processed into a trained classification super-resolution network model for processing, and outputting a high-resolution image corresponding to the low-resolution image;
  • the classification super-resolution network model includes a classification model and multiple super-resolution network models with different complexities.
  • the processing of the low-resolution image by the classification super-resolution network model includes: cutting the low-resolution image into multiple sub-images; for each sub-image, determining the complexity category of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model corresponding to the complexity category for processing, and outputting a reconstructed image of the sub-image; and splicing the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
  • the method further includes: using a preset first loss function, a second loss function, a third loss function and a training set to train a preset initial network model to obtain the classification super-resolution network model.
  • the initial network model includes an initial classification model and multiple initial super-resolution network models with different complexities
  • the training set includes multiple low-resolution image samples and high-resolution image samples corresponding to each low-resolution image sample
  • the first loss function is used to reduce the error between the high-resolution image corresponding to the low-resolution image sample output by the initial network model and the high-resolution image sample corresponding to the low-resolution image sample in the training set
  • the second loss function is used to increase the difference between the maximum probability value and the other probability values among the multiple probability values output by the initial classification model
  • the third loss function is used to reduce the gap in the number of sub-image samples that the initial classification model assigns to the respective complexity categories.
  • during training, the processing of the low-resolution image samples in the training set by the initial network model includes:
  • cutting the low-resolution image sample into multiple sub-image samples; for each sub-image sample, inputting it into the initial classification model to obtain a classification result; inputting it into the multiple initial super-resolution network models for processing to obtain the first reconstructed image samples they respectively output; using the classification result to perform a weighted summation of the multiple first reconstructed image samples to obtain a second reconstructed image sample; and splicing the second reconstructed image samples of the multiple sub-image samples to obtain the high-resolution image corresponding to the low-resolution image sample.
  • the second loss function is:

    $L_c = -\sum_{i=1}^{M-1}\sum_{j=i+1}^{M}\left|P_i(x) - P_j(x)\right|$

  • where $L_c$ is the negative of the sum of the distances between the probability values, output by the initial classification model for a sub-image sample $x$, of belonging to each complexity category; $M$ is the number of complexity categories; and $P_i(x)$ is the probability value of the sub-image sample $x$ being assigned to the $i$-th complexity category.
  • the third loss function is:

    $L_a = \sum_{i=1}^{M}\left|\sum_{j=1}^{B} P_i(x_j) - \frac{B}{M}\right|$

  • where $L_a$ is the sum of the distances between the number of sub-image samples assigned by the initial classification model to each complexity category within a batch and the average number $B/M$; $B$ is the batch size; and $P_i(x_j)$ is the probability value of the $j$-th sub-image sample in a batch being assigned to the $i$-th complexity category.
  • the multiple super-resolution network models include a preset first super-resolution network model and at least one first super-resolution network model that has undergone network parameter reduction processing.
  • the present application provides a super-resolution device, comprising:
  • an acquisition unit for acquiring the low-resolution image to be processed.
  • the processing unit is used for inputting the low-resolution image into the trained classification super-resolution network model for processing, and outputting the high-resolution image corresponding to the low-resolution image.
  • the classification super-resolution network model includes a classification model and multiple super-resolution network models with different complexities.
  • the processing of the low-resolution image by the classification super-resolution network model includes: cutting the low-resolution image into multiple sub-images; for each sub-image, determining the complexity category of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model corresponding to the complexity category for processing, and outputting a reconstructed image of the sub-image; and splicing the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
  • the super-resolution device further includes a training unit:
  • the training unit is used for training the preset initial network model by using the preset first loss function, the second loss function, the third loss function and the training set to obtain the classification super-resolution network model.
  • the initial network model includes an initial classification model and multiple initial super-resolution network models with different complexities
  • the training set includes multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample
  • the first loss function is used to reduce the error between the high-resolution image corresponding to the low-resolution image sample output by the initial network model and the high-resolution image sample corresponding to the low-resolution image sample in the training set
  • the second loss function is used to increase the difference between the maximum probability value and the other probability values among the multiple probability values output by the initial classification model
  • the third loss function is used to reduce the gap in the number of sub-image samples that the initial classification model assigns to the respective complexity categories.
  • the present application provides a terminal device, including: a memory and a processor, where the memory is used for storing a computer program; the processor is used for executing the method described in any one of the above-mentioned first aspect when the computer program is invoked.
  • the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the method described in any one of the foregoing first aspects.
  • an embodiment of the present application provides a computer program product that, when the computer program product runs on a processor, causes the processor to execute the method described in any one of the foregoing first aspects.
  • with the super-resolution method provided by the present application, the complexity of each sub-image of a low-resolution image is identified by a classification model, and super-resolution network models of different complexities are then used to process sub-images of different complexities.
  • on the one hand, a sub-image with relatively low complexity is processed by a super-resolution network model with relatively low complexity, which reduces the amount of computation for that sub-image and speeds up processing while the restoration effect is guaranteed.
  • on the other hand, a sub-image with relatively high complexity is processed by a super-resolution network model with relatively high complexity, which guarantees its restoration effect. Therefore, for a complete low-resolution image, the super-resolution method provided in this application can reduce the amount of computation of super-resolution processing and speed up processing while guaranteeing the restoration effect of the high-resolution image.
  • FIG. 1 is a schematic flowchart of an embodiment of a super-resolution method provided by an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of the processing of a low-resolution image by a classification super-resolution network model provided by an embodiment of the present application;
  • FIG. 3 is a schematic diagram of the network structure of a classification model provided by an embodiment of the present application;
  • FIG. 4 is a schematic diagram of the network structures of multiple FSRCNNs with different complexities provided by an embodiment of the present application;
  • FIG. 5 is a schematic diagram of the network structures of multiple SRResNets with different complexities provided by an embodiment of the present application;
  • FIG. 6 is a first schematic diagram of experimental data comparison provided by an embodiment of the present application;
  • FIG. 7 is a second schematic diagram of experimental data comparison provided by an embodiment of the present application;
  • FIG. 8 is a schematic flowchart of the training of an initial network model provided by an embodiment of the present application;
  • FIG. 9 is a third schematic diagram of experimental data comparison provided by an embodiment of the present application;
  • FIG. 10 is a fourth schematic diagram of experimental data comparison provided by an embodiment of the present application;
  • FIG. 11 is a fifth schematic diagram of experimental data comparison provided by an embodiment of the present application;
  • FIG. 12 is a schematic structural diagram of a super-resolution apparatus provided by an embodiment of the present application;
  • FIG. 13 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • at present, to speed up processing, lightweight network models are usually designed, or efficient plug-in modules are added, to reduce the amount of computation.
  • however, when the amount of computation of the entire network model is reduced, a sub-image with greater complexity will inevitably be restored poorly.
  • in view of this problem, the present application provides a super-resolution method that performs super-resolution processing on low-resolution images by designing a classification super-resolution (Class Super-Resolution, ClassSR) network model comprising a classification model and a plurality of super-resolution network models with different complexities.
  • the processing principle is to identify the complexity of each sub-image of a low-resolution image through a classification model, and then use a super-resolution network model of different complexity to process the sub-images of different complexity.
  • the sub-image with relatively small complexity is processed by the super-resolution network model with relatively small complexity, which reduces the amount of computation for that sub-image and speeds up processing while the restoration effect is guaranteed.
  • the sub-image with relatively large complexity is processed by the super-resolution network model with relatively large complexity, so as to ensure the restoration effect of the sub-image with relatively large complexity.
  • in this way, accelerated super-resolution processing of low-resolution images is achieved.
  • the execution subject of the method may be an image processing device, for example a mobile terminal such as a smartphone, tablet computer, or camera, or a terminal device such as a desktop computer, robot, or server.
  • the trained classification super-resolution network model provided by this application is deployed in the image processing device.
  • the low-resolution image can be input into the classification super-resolution network model for processing, and a high-resolution image corresponding to the low-resolution image can be output.
  • the classification super-resolution network model provided in this application includes a classification model and a plurality of super-resolution network models with different complexities (FIG. 1 takes three different complexities, small, medium and large, as an example).
  • the processing of a low-resolution image by the classification super-resolution network model includes the following steps.
  • the image processing device may cut the low-resolution image according to the preset size of the sub-image.
  • the size of the sub-image can be set based on the input requirements of the classification model and the super-resolution network model used in the classification super-resolution network model.
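As a minimal illustration of this cutting step, the sketch below tiles an image into fixed-size, non-overlapping sub-images and records their positions for later splicing. The function name, the NumPy representation, and the non-overlapping tiling are assumptions made for illustration; the patent does not fix a concrete implementation.

```python
import numpy as np

def cut_into_subimages(image: np.ndarray, tile: int):
    """Cut an H x W x C image into non-overlapping tile x tile sub-images.

    Also returns each sub-image's top-left coordinate, so the reconstructed
    tiles can later be spliced back together in the same order.
    """
    h, w = image.shape[:2]
    subimages, coords = [], []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            subimages.append(image[y:y + tile, x:x + tile])
            coords.append((y, x))
    return subimages, coords
```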
  • the classification model may be any neural network model with classification function.
  • the classification model can be a convolutional neural network composed of several convolutional layers, pooling layers, and fully connected layers.
  • the classification model is used to identify the complexity of the sub-image, which can classify the input sub-image and output the probability value of the sub-image being classified into each complexity category.
  • the complexity class with the largest probability value is the complexity class of the sub-image.
  • the recognition difficulty of different sub-images is different, and thus the difficulty of restoring to a high-resolution image is also different. Therefore, in this application, the so-called complexity of an image refers to the difficulty of reconstruction to high resolution.
  • the output of the classification model is a vector of length M (M ⁇ 2, M is an integer), where M also represents the number of super-resolution network models in the classification super-resolution network model.
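A minimal sketch of such a classifier, assuming PyTorch, is shown below. The layer counts, channel widths, and pooling choices are illustrative assumptions; the document only specifies a convolutional network of convolutional, pooling, and fully connected layers whose output is a probability value per complexity category.

```python
import torch
import torch.nn as nn

class ComplexityClassifier(nn.Module):
    """Convolutional classifier that outputs, for one input sub-image,
    M probability values (one per complexity category)."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AvgPool2d(2),                      # pooling layer
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # global pooling
        )
        self.fc = nn.Linear(64, num_classes)      # fully connected layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.fc(self.features(x).flatten(1))
        return torch.softmax(logits, dim=1)       # one probability per category
```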
  • after the complexity category of the sub-image is determined according to the classification model, the sub-image can be input into the super-resolution network model corresponding to that complexity category for processing, and the reconstructed image of the sub-image (that is, the high-resolution image of the sub-image) is output.
  • for example, if the complexity category of the sub-image is determined to be "small", the sub-image is input into the "small"-complexity super-resolution network model for high-resolution restoration processing.
  • the multiple super-resolution network models of different complexity may include different network models. For example, assuming that three super-resolution network models with different complexities need to be set in the classification super-resolution network model, three super-resolution network models can be selected from the existing and/or reconstructed super-resolution network models to build the classification super-resolution network model.
  • illustratively, in order of network-model complexity from small to large, currently existing super-resolution network models include FSRCNN, CARN, SRResNet, RCAN and so on. If FSRCNN, CARN, and SRResNet are selected to build the classification super-resolution network model, FSRCNN serves as the "small"-complexity super-resolution network model, corresponding to the "small" complexity category; CARN serves as the "medium"-complexity model, corresponding to the "medium" complexity category; and SRResNet serves as the "large"-complexity model, corresponding to the "large" complexity category.
  • the multiple super-resolution network models with different complexities may also include a preset first super-resolution network model and at least one copy of the first super-resolution network model that has undergone network-parameter pruning.
  • the first super-resolution network model may be any existing super-resolution network model or a reconstructed super-resolution network model. That is, in this embodiment of the present application, the original version and at least one simplified version of any super-resolution network model can be used to build a classification super-resolution network model.
  • the first super-resolution network model is FSRCNN.
  • the original version of the FSRCNN used is shown in (a) of FIG. 4; it includes convolutional layer a1, convolutional layer a2, 4 convolutional layers a3, convolutional layer a4 and the deconvolution layers.
  • the convolutional layer a1 is used to extract the features of sub-images.
  • the input channel (input channel) of the convolutional layer a1 is 3, the output channel (output channel) is 56, and the convolution kernel size (kernelsize) is 5.
  • the convolutional layer a2 is used to perform dimension reduction processing on the feature map output by the convolutional layer a1, so as to reduce the calculation amount of the subsequent feature mapping process.
  • the 4-layer continuous convolutional layer a3 is used for feature mapping, which maps low-resolution features to high-resolution features.
  • the convolutional layer a4 is used to increase the dimension of the feature map output by the convolutional layer a3 to restore the dimension of the feature map.
  • the 4-layer continuous deconvolution layer is used to perform an upsampling operation to obtain the reconstructed image of the sub-image.
  • the original version can be simplified to different degrees according to the number of simplified versions required, that is, the network parameters of the FSRCNN can be pruned to different degrees to obtain the required simplified versions.
  • the complexity of the original version of FSRCNN is "large” by default, and two versions need to be simplified to obtain FSRCNN with complexity "small” and “medium”.
  • after network-parameter pruning, the network structure of the "medium"-complexity FSRCNN can be as shown in (b) of FIG. 4.
  • compared with the original version, the output channel of convolutional layer a1, the input channel of convolutional layer a2, the output channel of convolutional layer a4, and the input channel of the deconvolution layer are all reduced to 36.
  • the network structure of the "small"-complexity FSRCNN can be as shown in (c) of FIG. 4.
  • compared with the original version, the output channel of convolutional layer a1, the input channel of convolutional layer a2, the output channel of convolutional layer a4, and the input channel of the deconvolution layer are all reduced to 16.
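Because the three versions differ only in the channel width shared by layers a1, a2 (input), a4 (output), and the deconvolution stage (56, 36, or 16), they can be produced by a single constructor. The sketch below assumes PyTorch, PReLU activations, x4 upscaling, and a single transposed convolution for the upsampling stage, matching the stated 56-to-3, kernel-9 channel specification; it is an illustration, not the patent's exact network.

```python
import torch.nn as nn

def make_fsrcnn(width: int = 56, scale: int = 4) -> nn.Sequential:
    """FSRCNN-style model; width = 56 / 36 / 16 gives the 'large' /
    'medium' / 'small' complexity versions described above."""
    layers = [
        nn.Conv2d(3, width, 5, padding=2), nn.PReLU(),   # a1: feature extraction
        nn.Conv2d(width, 12, 1), nn.PReLU(),             # a2: shrink dimensions
    ]
    for _ in range(4):                                   # a3: 4 mapping layers
        layers += [nn.Conv2d(12, 12, 3, padding=1), nn.PReLU()]
    layers += [nn.Conv2d(12, width, 1), nn.PReLU()]      # a4: restore dimensions
    layers += [nn.ConvTranspose2d(width, 3, 9, stride=scale, padding=4,
                                  output_padding=scale - 1)]  # upsampling
    return nn.Sequential(*layers)

# large / medium / small versions:
# make_fsrcnn(56), make_fsrcnn(36), make_fsrcnn(16)
```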
  • the first super-resolution network model is SRResNet.
  • the original version of the obtained SRResNet is shown in (a) in Figure 5.
  • the original version includes convolutional layer b1, 16 residual layers, 2 convolutional layers b2, 2 pixel-shuffle layers (pixel_shuffle), convolutional layer b3 and convolutional layer b4.
  • the convolutional layer b1 and the residual layer are used to extract the features of the sub-image.
  • the 2-layer convolutional layer b2 and the 2-layer pixel_shuffle are alternately arranged to map low-resolution features to high-resolution features.
  • the convolutional layer b3 and the convolutional layer b4 are used to perform an up-sampling operation to obtain a reconstructed image of the sub-image.
  • the complexity of the original version of SRResNet is "large” by default, and two versions need to be simplified to obtain SRResNet with "small” and “medium” complexity.
  • the network structure of the "medium” complexity SRResNet can be shown in (b) of FIG. 5 .
  • compared with the original version, the output channel of convolutional layer b1, the input and output channels of the residual layers, the input channel of convolutional layer b2, the input and output channels of convolutional layer b3, and the input channel of convolutional layer b4 are all reduced to 48, and the output channel of convolutional layer b2 is reduced to 48*4.
  • the network structure of the "small"-complexity SRResNet can be as shown in (c) of FIG. 5.
  • compared with the original version, the output channel of convolutional layer b1, the input and output channels of the residual layers, the input channel of convolutional layer b2, the input and output channels of convolutional layer b3, and the input channel of convolutional layer b4 are all reduced to 32, and the output channel of convolutional layer b2 is reduced to 32*4.
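Analogously, the three SRResNet versions can be generated from one width parameter (64, 48, or 32). The sketch below assumes PyTorch, ReLU activations inside the BN-free residual blocks, and x4 upscaling through two pixel-shuffle stages; it is illustrative rather than the patent's exact network.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block without batch normalization layers."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

def make_srresnet(width: int = 64) -> nn.Sequential:
    """SRResNet-style model; width = 64 / 48 / 32 gives the 'large' /
    'medium' / 'small' complexity versions described above."""
    layers = [nn.Conv2d(3, width, 5, padding=2)]          # b1: feature extraction
    layers += [ResBlock(width) for _ in range(16)]        # 16 residual layers
    for _ in range(2):                                    # b2 + pixel_shuffle, twice
        layers += [nn.Conv2d(width, width * 4, 3, padding=1),
                   nn.PixelShuffle(2)]                    # x2 each time -> x4 total
    layers += [nn.Conv2d(width, width, 3, padding=1),     # b3
               nn.Conv2d(width, 3, 3, padding=1)]         # b4
    return nn.Sequential(*layers)
```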
  • after simplification, the reduced channel counts of the feature maps in the network layers mean that fewer network parameters need to be computed; the amount of computation in processing the feature maps is therefore reduced and processing is accelerated, while the restoration effect for sub-images of the corresponding complexity can still be guaranteed.
  • that is to say, compared with using only the original version of a single first super-resolution network model, building the classification super-resolution network model from the original version of the first super-resolution network model and simplified versions of it can reduce the amount of computation and speed up processing to a certain extent. The classification super-resolution network model provided in this application can thus be regarded as an accelerated version of the first super-resolution network model.
  • after the reconstructed image of each sub-image is obtained, step S203 can be executed: the reconstructed images of the multiple sub-images are spliced to obtain the high-resolution image.
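An end-to-end inference pass can then be sketched as follows. It reuses the hypothetical components above, routes each sub-image according to the argmax of the classifier's probability vector as described, and splices the reconstructed tiles into the high-resolution output; the tile size and the absence of border handling are illustrative simplifications.

```python
import torch

@torch.no_grad()
def classsr_infer(image, classifier, sr_models, tile=32, scale=4):
    """Cut -> classify -> route each sub-image to one SR model -> splice.

    `image` is a C x H x W tensor; `sr_models` is a list ordered by
    complexity (small, medium, large), matching the classifier's categories.
    """
    _, h, w = image.shape
    out = torch.zeros(3, h * scale, w * scale)
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            sub = image[:, y:y + tile, x:x + tile].unsqueeze(0)
            probs = classifier(sub)                 # 1 x M probability values
            k = int(probs.argmax(dim=1))            # complexity category
            rec = sr_models[k](sub).squeeze(0)      # reconstructed sub-image
            out[:, y * scale:(y + tile) * scale,
                x * scale:(x + tile) * scale] = rec
    return out
```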
  • a classification model is used to identify the complexity of each sub-image of a low-resolution image, and then a super-resolution network model of different complexity is used to process the sub-images of different complexity.
  • the sub-image with relatively small complexity is processed by the super-resolution network model with relatively small complexity, which reduces the amount of computation for that sub-image and speeds up processing while the restoration effect is guaranteed.
  • the sub-image with relatively large complexity is processed by the super-resolution network model with relatively large complexity, so as to ensure the restoration effect of the sub-image with relatively large complexity. Therefore, for a complete low-resolution image, using the classification super-resolution network model provided by the present application to perform super-resolution processing can ensure the restoration effect of the high-resolution image while reducing the amount of computation.
  • the selected comparison groups include the original version FSRCNN-O and the accelerated version ClassSR-FSRCNN built with the network framework provided by this application, the original version CARN-O and the accelerated version ClassSR-CARN, the original version SRResNet-O and the accelerated version ClassSR-SRResNet, and the original version RCAN-O and the accelerated version ClassSR-RCAN.
  • FIG. 6 is a statistical plot of the experimental data obtained after testing the original version of each super-resolution network model, and the accelerated version built with the network framework provided by this application, on the 8K image test set. The ordinate is the peak signal-to-noise ratio (Peak Signal to Noise Ratio, PSNR) of the high-resolution image in dB, and the abscissa is the amount of computation (FLOPs) in M.
  • it can be seen from FIG. 6 that the peak signal-to-noise ratio of the high-resolution images obtained by performing super-resolution processing with the accelerated versions is guaranteed.
  • even for the lightweight super-resolution network models (e.g., FSRCNN-O and CARN-O), the PSNR of the high-resolution images obtained by super-resolution processing with the accelerated version is improved compared with the original version.
  • in general, the higher the PSNR, the better the restoration effect of the network model on low-resolution images.
  • the computational load of the accelerated versions of each super-resolution network model is reduced by nearly 50% (-50%, -47%, -48%, -50%, respectively). That is to say, the processing speed of the accelerated version is nearly doubled compared to the original version.
  • the original and accelerated versions of each super-resolution network model were also tested on 2K, 4K and 8K image test sets, each containing 100 low-resolution image samples (Table 1, where Parameters denotes the number of network parameters of each model). Test/FLOPs represents the average PSNR (in dB) of the reconstructed high-resolution images, and the average amount of computation (in M or G), after the corresponding network model performs super-resolution processing on the 100 low-resolution images of the test set. It can be seen that, after testing the original and accelerated versions on the same test sets under different test conditions, the average PSNR of the high-resolution images output by the two versions is essentially equal. That is to say, although in the accelerated version some sub-images are processed by simplified super-resolution network models, the restoration effect of the finally restored high-resolution image is not noticeably reduced.
  • meanwhile, with the restoration effect of the high-resolution images guaranteed, the amount of computation of the accelerated versions for processing low-resolution images is significantly reduced, in every case from 100% to between 50% and 71%. It can be seen that, with the restoration effect guaranteed, the processing speed of the accelerated versions is greatly improved over the original versions.
  • FIG. 7 is a schematic diagram of an experimental data comparison for two low-resolution image samples taken from the 2K, 4K and 8K image test sets. It shows the reconstructed image samples obtained after the original and accelerated versions of each super-resolution network perform super-resolution processing on a sub-image sample, together with the ground-truth image sample (GT) corresponding to that sub-image sample in the test set and the high-resolution reconstructed image sample recovered using traditional bicubic interpolation.
  • the classification super-resolution network model provided by the present application can speed up the processing speed while ensuring the restoration effect of the high-resolution image.
  • the preset initial network model can be trained by using the preset first loss function, the second loss function, the third loss function and the training set to obtain the classification super-resolution network model.
  • the initial network model refers to the classification super-resolution network model whose network parameters have not yet been optimized. It can be understood that the initial network model includes an initial classification model and a plurality of initial super-resolution network models with different complexities.
  • the training set includes multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample.
  • the training set may include a 2K image training set, a 4K image training set, and/or an 8K image training set.
  • since the complexity of the individual sub-image samples of the low-resolution image samples in the training set is difficult to quantify, it cannot be labeled. The present application therefore provides a training method in which, during training, the network parameters of the initial classification model are optimized according to the restoration effect of the initial super-resolution network models on the sub-image samples, so that the trained classification model can accurately assign input sub-images to the appropriate super-resolution network model.
  • the initial network model's processing of the low-resolution image samples in the training set includes:
  • cutting the low-resolution image sample into multiple sub-image samples; for each sub-image sample, inputting the sub-image sample into the initial classification model for processing to obtain a classification result, where the classification result includes the probability value of the sub-image sample being classified into each complexity category; inputting the sub-image sample into the multiple initial super-resolution network models for processing to obtain the first reconstructed image samples they respectively output; performing a weighted summation of the multiple first reconstructed image samples using the classification result to obtain a second reconstructed image sample; and splicing the second reconstructed image samples of the multiple sub-image samples to obtain the high-resolution image corresponding to the low-resolution image sample.
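A sketch of this training-time forward pass, assuming PyTorch, is given below. Weighting the M reconstructions by the classifier's probabilities keeps the routing differentiable, so the restoration error (the first loss function, here a plain L1 term) can back-propagate into the classification model; names and shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def training_forward(sub, hr_sub, classifier, sr_models):
    """Forward pass for a batch of sub-image samples during training.

    sub:    B x 3 x h x w low-resolution sub-image samples
    hr_sub: B x 3 x H x W corresponding high-resolution samples
    """
    probs = classifier(sub)                                   # B x M
    recons = torch.stack([m(sub) for m in sr_models], dim=1)  # B x M x 3 x H x W
    weights = probs.view(*probs.shape, 1, 1, 1)
    second_recon = (weights * recons).sum(dim=1)              # weighted summation
    image_loss = F.l1_loss(second_recon, hr_sub)              # first loss (L1)
    return second_recon, image_loss, probs
```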
  • after the initial network model outputs the high-resolution image corresponding to a low-resolution image sample, the first loss function is used to calculate the error between the high-resolution image output by the initial network model for the low-resolution image sample and the high-resolution image sample corresponding to the low-resolution image sample in the training set, and the network parameters of the multiple initial super-resolution network models and of the initial classification model are then adjusted according to the error value. Understandably, the smaller the error, the better the restoration effect. In this way, the restoration effect can be back-propagated to the initial classification model to adjust its network parameters.
  • the first loss function is used to reduce the error between the high-resolution image corresponding to the low-resolution image sample output by the initial neural network and the high-resolution image sample corresponding to the low-resolution image sample in the training set.
  • the first loss function may be a conventional L1 loss function.
  • in order to ensure that the trained classification model can classify effectively, and to avoid the probability values in the classification result output by the classification model being so close in size that classification approaches random classification, the present application also provides a second loss function for increasing, during the training process, the difference between the largest probability value and the other probability values among the plurality of probability values output by the initial classification model. That is to say, when classifying a sub-image sample, the initial classification model is constrained by the second loss function to ensure that the probability of the sub-image sample being classified into the corresponding complexity category is as large as possible, tending toward 1.
  • the second loss function may also be referred to as a classification-loss.
  • the second loss function can be expressed by the following formula:

    $L_c = -\sum_{i=1}^{M-1}\sum_{j=i+1}^{M}\left|P_i(x) - P_j(x)\right|$

  • where $L_c$ is the negative of the sum of the distances between the probability values, output by the initial classification model for the same sub-image sample $x$, of belonging to each complexity category; $M$ is the number of complexity categories; and $P_i(x)$ is the probability value of the sub-image sample $x$ being assigned to the $i$-th complexity category. This loss widens the probability gap between different classification results, making the maximum probability value approach 1.
  • the present application in order to ensure that each initial super-resolution network model can be fully trained, so as to ensure the training effect of each initial super-resolution network model, the present application also provides a third loss function, the third The loss function is used to reduce the gap in the number of sub-image samples belonging to multiple complexity classes determined by the initial classification model. That is, the initial classification model is constrained by the third loss function to assign roughly the same number of sub-image samples to each complexity class during training. This ensures that each initial super-resolution network model can be fully trained.
  • the third loss function can be expressed by the following formula:

    $L_a = \sum_{i=1}^{M}\left|\sum_{j=1}^{B} P_i(x_j) - \frac{B}{M}\right|$

  • where $L_a$ is the sum of the distances between the number of sub-image samples assigned by the initial classification model to each complexity category within a batch and the average number $B/M$; $B$ is the batch size, i.e. the number of sub-image samples processed in one batch; and $P_i(x_j)$ is the probability value of the $j$-th sub-image sample in a batch being assigned to the $i$-th complexity category. Since the second loss function ensures that these probability values are close to 1, $\sum_{j=1}^{B} P_i(x_j)$ is close to the number of sub-image samples assigned to the $i$-th complexity category in the batch.
  • the third loss function may also be referred to as the average loss (Average-loss).
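Both constraints can be computed directly from a batch of classifier outputs. The sketch below, assuming PyTorch and a B x M matrix of probability values, is one way to realize the two formulas; averaging the classification loss over the batch is an implementation choice rather than something the document specifies.

```python
import torch

def class_loss(probs: torch.Tensor) -> torch.Tensor:
    """L_c: negative sum of pairwise distances between the M probability
    values of each sample (probs: B x M), averaged over the batch."""
    diff = (probs.unsqueeze(2) - probs.unsqueeze(1)).abs()  # B x M x M
    return -diff.sum(dim=(1, 2)).mean() / 2                 # each pair once

def average_loss(probs: torch.Tensor) -> torch.Tensor:
    """L_a: distance between the (soft) number of sub-image samples assigned
    to each complexity category in the batch and the uniform target B / M."""
    b, m = probs.shape
    per_class = probs.sum(dim=0)   # ~ number of samples per category
    return (per_class - b / m).abs().sum()
```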
  • suppose the network parameters of the multiple super-resolution network modules are fixed and only the classification model is trained. FIG. 9 is a schematic diagram of the training curves obtained when the classification model is trained using the first loss function, the second loss function and the third loss function at the same time.
  • (a) in FIG. 9 shows the PSNR of the high-resolution image samples output by the initial classification super-resolution network model as a function of training time.
  • (b) in FIG. 9 shows the amount of computation of the initial classification super-resolution network model as a function of training time.
  • based on FIG. 9, as training proceeds, the PSNR of the initial classification super-resolution network model rises while its amount of computation falls, indicating that the sub-image samples of each low-resolution image sample are gradually being assigned to suitable super-resolution network models.
  • FIG. 10 is a schematic comparison between the training curves (the first PSNR curve and the first FLOPs curve) obtained when the classification model is trained using the first loss function and the second loss function but not the third loss function, and the training curves (the second PSNR curve and the second FLOPs curve) obtained when the classification model is trained using all three loss functions at the same time.
  • (a) in FIG. 10 shows the PSNR of the high-resolution image samples output by the initial classification super-resolution network model as a function of training time.
  • (b) in FIG. 10 shows the amount of computation of the initial classification super-resolution network model as a function of training time.
  • FIG. 11 is a schematic comparison between the training curves (the third PSNR curve and the third FLOPs curve) obtained when the classification model is trained using the first loss function and the third loss function but not the second loss function, and the training curves obtained when the classification model is trained using all three loss functions at the same time.
  • (a) in FIG. 11 shows the PSNR of the high-resolution image samples output by the initial classification super-resolution network model as a function of training time.
  • (b) in FIG. 11 shows the amount of computation of the initial classification super-resolution network model as a function of training time.
  • it can be seen that the joint training method provided by this application, combining the first loss function, the second loss function and the third loss function, can ensure that each super-resolution network model is fully trained, enables the classification model to be effectively optimized based on the restoration effect and to output effective classification results, and ensures that the trained classification super-resolution network model can greatly improve processing speed while guaranteeing the restoration effect.
  • the network framework and training method provided in this application are universal: they can be applied to any image restoration task, or to any task that uses the image restoration effect as its evaluation metric. For example, in addition to super-resolution tasks, they can be applied to image denoising tasks, again greatly reducing the amount of computation while maintaining PSNR.
  • the embodiment of the present application provides a super-resolution apparatus, and the apparatus embodiment corresponds to the foregoing method embodiment.
  • for ease of reading, the details of the foregoing method embodiments are not repeated one by one in this apparatus embodiment, but it should be clear that the apparatus in this embodiment can correspondingly implement all the content of the foregoing method embodiments.
  • FIG. 12 is a schematic structural diagram of a super-resolution apparatus provided by an embodiment of the present application. As shown in FIG. 12 , the super-resolution apparatus provided by this embodiment includes an acquisition unit 1201 and a processing unit 1202 .
  • the acquiring unit 1201 is used for acquiring the low-resolution image to be processed.
  • the processing unit 1202 is configured to input the low-resolution image into the trained classification super-resolution network model for processing, and output the high-resolution image corresponding to the low-resolution image.
  • optionally, the super-resolution apparatus further includes a training unit 1203 for training the preset initial network model by using the preset first loss function, second loss function, third loss function and training set to obtain the classification super-resolution network model.
  • the super-resolution apparatus provided in this embodiment can execute the above-mentioned method embodiments, and the implementation principle and technical effect thereof are similar, and details are not described herein again.
  • FIG. 13 is a schematic structural diagram of a terminal device provided by an embodiment of the application.
  • the terminal device provided by this embodiment includes a memory 1301 and a processor 1302, where the memory 1301 is used for storing a computer program and the processor 1302 is used for executing the methods described in the foregoing method embodiments when the computer program is invoked.
  • the terminal device provided in this embodiment may execute the foregoing method embodiments, and the implementation principle and technical effect thereof are similar, and details are not described herein again.
  • Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method described in the foregoing method embodiment is implemented.
  • the embodiments of the present application further provide a computer program product, when the computer program product runs on a terminal device, the terminal device executes the method described in the above method embodiments.
  • if the above-mentioned integrated units are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • based on this understanding, all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium, and when it is executed by a processor, the steps of each of the above method embodiments can be implemented.
  • the computer program includes computer program code
  • the computer program code may be in the form of source code, object code, executable file or some intermediate form, and the like.
  • the computer-readable storage medium may include at least: any entity or device capable of carrying the computer program code to the photographing device/terminal device, recording medium, computer memory, read-only memory (Read-Only Memory, ROM), random access Memory (Random Access Memory, RAM), electrical carrier signal, telecommunication signal and software distribution medium.
  • in some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
  • the disclosed apparatus/device and method may be implemented in other manners.
  • the apparatus/equipment embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the term "if" may be contextually interpreted as "when", "once", "in response to determining" or "in response to detecting".
  • similarly, the phrases "if it is determined" or "if the [described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined", "in response to the determination", "once the [described condition or event] is detected" or "in response to detection of the [described condition or event]".
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment", "in some embodiments", "in other embodiments", etc. in various places in this specification do not necessarily all refer to the same embodiment; rather, they mean "one or more but not all embodiments" unless specifically emphasized otherwise.
  • the terms "comprising", "including", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

A super-resolution method and apparatus, a terminal device, and a storage medium, relating to the technical field of deep learning and capable of reducing the amount of computation. The super-resolution method comprises: inputting a low-resolution image to be processed into a trained classification super-resolution network model for processing, and outputting a high-resolution image corresponding to the low-resolution image. The classification super-resolution network model comprises a classification model and multiple super-resolution network models with different complexities, and its processing of the low-resolution image comprises: cutting the low-resolution image into multiple sub-images; for each sub-image, determining the complexity of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model corresponding to the complexity of the sub-image among the multiple super-resolution network models for processing, and outputting a reconstructed image of the sub-image; and splicing the reconstructed images of the sub-images to obtain the high-resolution image.

Description

Super-resolution method and apparatus, terminal device, and storage medium
Technical Field
The present application relates to the technical field of deep learning, and in particular to a super-resolution method and apparatus, a terminal device, and a storage medium.
Background
Super-resolution technology refers to technology that reconstructs low-resolution images into high-resolution images. Deep-learning-based super-resolution algorithms are currently the most commonly used super-resolution methods. A deep-learning-based super-resolution algorithm cuts the low-resolution image into sub-images, inputs each sub-image separately into a super-resolution network model for processing to obtain a reconstructed image, and then splices the reconstructed images of the sub-images to obtain the high-resolution image.
At present, commonly used super-resolution network models include FSRCNN (Accelerating the Super-Resolution Convolutional Neural Network), CARN (Fast, Accurate, and Lightweight Super-Resolution with Cascading Residual Network), SRResNet (from Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network), RCAN (Image Super-Resolution Using Very Deep Residual Channel Attention Networks), and so on. These super-resolution network models involve a large amount of computation when performing super-resolution processing on low-resolution images, which makes processing slow.
Summary
In view of this, the present application provides a super-resolution method and apparatus, a terminal device, and a storage medium, which can reduce the amount of computation of super-resolution processing.
In a first aspect, the present application provides a super-resolution method, comprising: inputting a low-resolution image to be processed into a trained classification super-resolution network model for processing, and outputting a high-resolution image corresponding to the low-resolution image; wherein the classification super-resolution network model comprises a classification model and multiple super-resolution network models with different complexities, and the processing of the low-resolution image by the classification super-resolution network model comprises:
cutting the low-resolution image into multiple sub-images; for each sub-image, determining the complexity category of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model, among the multiple super-resolution network models, corresponding to the complexity category for processing, and outputting a reconstructed image of the sub-image; and splicing the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
Optionally, the method further comprises: training a preset initial network model by using a preset first loss function, second loss function, third loss function, and training set to obtain the classification super-resolution network model.
The initial network model comprises an initial classification model and multiple initial super-resolution network models with different complexities, and the training set comprises multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample. The first loss function is used to reduce the error between the high-resolution image output by the initial network model for a low-resolution image sample and the high-resolution image sample corresponding to that low-resolution image sample in the training set; the second loss function is used to increase the difference between the maximum probability value and the other probability values among the multiple probability values output by the initial classification model; the third loss function is used to reduce the gap in the number of sub-image samples that the initial classification model assigns to the respective complexity categories.
Optionally, during training, the processing of a low-resolution image sample in the training set by the initial network model comprises:
cutting the low-resolution image sample into multiple sub-image samples; for each sub-image sample, inputting the sub-image sample into the initial classification model for processing to obtain a classification result, the classification result comprising the probability value of the sub-image sample being assigned to each complexity category; inputting the sub-image sample into the multiple initial super-resolution network models for processing to obtain first reconstructed image samples respectively output by the multiple initial super-resolution network models; performing a weighted summation of the multiple first reconstructed image samples using the classification result to obtain a second reconstructed image sample; and splicing the second reconstructed image samples of the multiple sub-image samples to obtain the high-resolution image corresponding to the low-resolution image sample.
Optionally, the second loss function is:

$L_c = -\sum_{i=1}^{M-1}\sum_{j=i+1}^{M}\left|P_i(x) - P_j(x)\right|$

where $L_c$ is the negative of the sum of the distances between the probability values, output by the initial classification model for a sub-image sample $x$, of belonging to each complexity category; $M$ is the number of complexity categories; and $P_i(x)$ is the probability value of the sub-image sample $x$ being assigned to the $i$-th complexity category.
Optionally, the third loss function is:

$L_a = \sum_{i=1}^{M}\left|\sum_{j=1}^{B} P_i(x_j) - \frac{B}{M}\right|$

where $L_a$ is the sum of the distances between the number of sub-image samples assigned by the initial classification model to each complexity category within a batch and the average number $B/M$; $B$ is the batch size; $P_i(x_j)$ is the probability value of the $j$-th sub-image sample in a batch being assigned to the $i$-th complexity category; and $\sum_{j=1}^{B} P_i(x_j)$ is the sum of the probability values of all sub-image samples in a batch assigned to the $i$-th complexity category.
Optionally, the multiple super-resolution network models comprise a preset first super-resolution network model and at least one copy of the first super-resolution network model that has undergone network-parameter pruning.
In a second aspect, the present application provides a super-resolution apparatus, comprising:
an acquisition unit, configured to acquire the low-resolution image to be processed; and
a processing unit, configured to input the low-resolution image into the trained classification super-resolution network model for processing and output the high-resolution image corresponding to the low-resolution image;
wherein the classification super-resolution network model comprises a classification model and multiple super-resolution network models with different complexities, and the processing of the low-resolution image by the classification super-resolution network model comprises:
cutting the low-resolution image into multiple sub-images; for each sub-image, determining the complexity category of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model corresponding to the complexity category for processing, and outputting a reconstructed image of the sub-image; and splicing the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
Optionally, the super-resolution apparatus further comprises a training unit:
the training unit is configured to train a preset initial network model by using the preset first loss function, second loss function, third loss function, and training set to obtain the classification super-resolution network model.
The initial network model comprises an initial classification model and multiple initial super-resolution network models with different complexities, and the training set comprises multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample. The first loss function is used to reduce the error between the high-resolution image output by the initial network model for a low-resolution image sample and the corresponding high-resolution image sample in the training set; the second loss function is used to increase the difference between the maximum probability value and the other probability values among the multiple probability values output by the initial classification model; the third loss function is used to reduce the gap in the number of sub-image samples that the initial classification model assigns to the respective complexity categories.
In a third aspect, the present application provides a terminal device, comprising a memory and a processor, the memory being configured to store a computer program and the processor being configured to execute, when invoking the computer program, the method described in any one of the implementations of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method described in any one of the implementations of the first aspect is implemented.
In a fifth aspect, an embodiment of the present application provides a computer program product which, when run on a processor, causes the processor to execute the method described in any one of the implementations of the first aspect.
With the super-resolution method and apparatus, terminal device, and storage medium provided by the present application, a classification model is used to identify the complexity of each sub-image of a low-resolution image, and super-resolution network models of different complexities are then used to process sub-images of different complexities. On the one hand, sub-images with relatively low complexity are processed by a super-resolution network model with relatively low complexity, which reduces the amount of computation for those sub-images and speeds up processing while the restoration effect is guaranteed. On the other hand, sub-images with relatively high complexity are processed by a super-resolution network model with relatively high complexity, which guarantees their restoration effect. Therefore, for a complete low-resolution image, the super-resolution method provided by the present application can reduce the amount of computation in super-resolution processing and speed up processing while guaranteeing the restoration effect of the high-resolution image.
Description of Drawings
FIG. 1 is a schematic flowchart of an embodiment of a super-resolution method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of the processing of a low-resolution image by a classification super-resolution network model provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the network structure of a classification model provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the network structures of multiple FSRCNNs with different complexities provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the network structures of multiple SRResNets with different complexities provided by an embodiment of the present application;
FIG. 6 is a first schematic diagram of experimental data comparison provided by an embodiment of the present application;
FIG. 7 is a second schematic diagram of experimental data comparison provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of the training of an initial network model provided by an embodiment of the present application;
FIG. 9 is a third schematic diagram of experimental data comparison provided by an embodiment of the present application;
FIG. 10 is a fourth schematic diagram of experimental data comparison provided by an embodiment of the present application;
FIG. 11 is a fifth schematic diagram of experimental data comparison provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a super-resolution apparatus provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
Detailed Description
At present, deep-learning-based super-resolution algorithms often use a single super-resolution network model to perform super-resolution processing on every sub-image of a low-resolution image in order to obtain the high-resolution image. However, it has been verified that the complexities (which may also be called restoration difficulties) of the sub-images within the same low-resolution image may differ. For a sub-image of low complexity, still processing it with a complex super-resolution network model inevitably causes redundant computation, and the larger the amount of computation, the slower the processing.
At present, to speed up processing, lightweight network models are usually designed, or efficient plug-in modules are added, to reduce the amount of computation. But when the computation of the entire network model is reduced, a sub-image of high complexity will inevitably be restored poorly.
To address this problem, the present application provides a super-resolution method that performs super-resolution processing on low-resolution images by designing a classification super-resolution (Class Super-Resolution, ClassSR) network model comprising a classification model and multiple super-resolution network models with different complexities. The processing principle is to identify the complexity of each sub-image of the low-resolution image with the classification model, and then to process sub-images of different complexities with super-resolution network models of different complexities. On the one hand, sub-images with relatively low complexity are processed by a super-resolution network model with relatively low complexity, which reduces the amount of computation for those sub-images and speeds up processing while the restoration effect is guaranteed. On the other hand, sub-images with relatively high complexity are processed by a super-resolution network model with relatively high complexity, which guarantees their restoration effect. Accelerated super-resolution processing of low-resolution images is thereby achieved.
The technical solution of the present application is described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.
Referring to FIG. 1, which is a flowchart of an embodiment of a super-resolution method provided by the present application, the execution subject of the method may be an image processing device, for example a mobile terminal such as a smartphone, tablet computer, or camera, or a terminal device such as a desktop computer, robot, or server. The trained classification super-resolution network model provided by the present application is deployed in the image processing device.
As shown in FIG. 1, after acquiring the low-resolution image to be processed, the image processing device can input the low-resolution image into the classification super-resolution network model for processing and output the high-resolution image corresponding to the low-resolution image.
The classification super-resolution network model provided by the present application comprises a classification model and multiple super-resolution network models with different complexities (FIG. 1 takes three different complexities, small, medium, and large, as an example). Referring to FIG. 2, the processing of a low-resolution image by the classification super-resolution network model comprises:
S201: cutting the low-resolution image into multiple sub-images.
The image processing device may cut the low-resolution image according to a preset sub-image size. The sub-image size may be set based on the input requirements of the classification model and super-resolution network models used in the classification super-resolution network model.
S202: for each sub-image, determining the complexity category of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model, among the multiple super-resolution network models, corresponding to the complexity category of the sub-image for processing, and outputting a reconstructed image of the sub-image.
The classification model may be any neural network model with a classification function. For example, as shown in FIG. 3, the classification model may be a convolutional neural network composed of several convolutional layers, pooling layers, and fully connected layers. The classification model is used to identify the complexity of a sub-image: it classifies the input sub-image and outputs the probability value of the sub-image being assigned to each complexity category. The complexity category with the largest probability value is the complexity category of the sub-image.
It should be noted that, because sub-images differ in the amount of effective information they contain, in how blurred the objects to be recognized are, and so on, different sub-images differ in recognition difficulty and hence in how difficult they are to restore to a high-resolution image. Therefore, in the present application, the so-called complexity of an image refers to the difficulty of reconstructing it to high resolution.
It can be understood that the output of the classification model is a vector of length M (M >= 2, M an integer), where M also represents the number of super-resolution network models in the classification super-resolution network model. For example, if the classification model outputs (0.9, 0.01, 0.09) for an input sub-image, the probability of the sub-image being assigned to the "small" complexity category is 0.9, to the "medium" complexity category 0.01, and to the "large" complexity category 0.09. Since the probability value 0.9 of being assigned to the "small" complexity category is the maximum probability value, the complexity category of this sub-image is "small".
After the complexity category of the sub-image is determined according to the classification model, the sub-image can be input into the super-resolution network model corresponding to that complexity category for processing, and the reconstructed image of the sub-image (that is, the high-resolution image of the sub-image) is output.
For example, if the complexity category of a sub-image is determined to be "small", the sub-image is input into the "small"-complexity super-resolution network model for high-resolution restoration processing.
In one embodiment, the multiple super-resolution network models with different complexities may comprise different network models. For example, if three super-resolution network models with different complexities need to be set in the classification super-resolution network model, three models may be selected from existing and/or newly built super-resolution network models to build the classification super-resolution network model.
Illustratively, in order of network-model complexity from small to large, currently existing super-resolution network models include FSRCNN, CARN, SRResNet, RCAN, and so on. If FSRCNN, CARN, and SRResNet are selected to build the classification super-resolution network model, FSRCNN serves as the "small"-complexity super-resolution network model, corresponding to the "small" complexity category; CARN serves as the "medium"-complexity model, corresponding to the "medium" complexity category; and SRResNet serves as the "large"-complexity model, corresponding to the "large" complexity category.
Optionally, in another embodiment, the multiple super-resolution network models with different complexities may also comprise a preset first super-resolution network model and at least one copy of the first super-resolution network model that has undergone network-parameter pruning.
The first super-resolution network model may be any existing super-resolution network model or a newly built super-resolution network model. That is, in this embodiment of the present application, the original version and at least one simplified version of any super-resolution network model may be used to build the classification super-resolution network model.
Illustratively, take SRResNet and FSRCNN as examples. Suppose the first super-resolution network model is FSRCNN. Referring to FIG. 4, suppose the original version of the FSRCNN used is as shown in (a) of FIG. 4; the original version includes convolutional layer a1, convolutional layer a2, 4 convolutional layers a3, convolutional layer a4 and the deconvolution layers. Convolutional layer a1 is used to extract the features of the sub-image; its input channel is 3, its output channel is 56, and its kernel size is 5. Convolutional layer a2 performs dimension reduction on the feature map output by a1 to reduce the amount of computation of the subsequent feature mapping; a2 has input channel = 56, output channel = 12, kernel size = 1. The 4 consecutive convolutional layers a3 perform feature mapping, mapping low-resolution features to high-resolution features; a3 has input channel = 12, output channel = 12, kernel size = 3. Convolutional layer a4 raises the dimension of the feature map output by a3 to restore the dimension of the feature map; a4 has input channel = 12, output channel = 56, kernel size = 1. The deconvolution layers perform the upsampling operation to obtain the reconstructed image of the sub-image; the deconvolution layer has input channel = 56, output channel = 3, kernel size = 9.
After the FSRCNN shown in (a) of FIG. 4 is obtained, the original version can be simplified to different degrees according to the number of simplified versions required, that is, the network parameters of the FSRCNN are pruned to different degrees to obtain the required simplified versions.
For example, the complexity of the original version of FSRCNN is "large" by default, and two simplified versions are needed to obtain FSRCNNs of "small" and "medium" complexity. Illustratively, after network-parameter pruning, the network structure of the "medium"-complexity FSRCNN can be as shown in (b) of FIG. 4: compared with the original version, the output channel of convolutional layer a1, the input channel of convolutional layer a2, the output channel of convolutional layer a4 and the input channel of the deconvolution layer are all reduced to 36.
The network structure of the "small"-complexity FSRCNN can be as shown in (c) of FIG. 4: compared with the original version, the output channel of convolutional layer a1, the input channel of convolutional layer a2, the output channel of convolutional layer a4 and the input channel of the deconvolution layer are all reduced to 16.
Suppose the first super-resolution network model is SRResNet. Referring to FIG. 5, the original version of the SRResNet obtained is as shown in (a) of FIG. 5; the original version includes convolutional layer b1, 16 residual layers, 2 convolutional layers b2, 2 pixel-shuffle layers (pixel_shuffle), convolutional layer b3 and convolutional layer b4. Convolutional layer b1 and the residual layers are used to extract the features of the sub-image; b1 has input channel = 3, output channel = 64, kernel size = 5. The 16 consecutive residual layers are all residual blocks without batch normalization (BN) layers, with input channel = 64, output channel = 64, kernel size = 3. The 2 convolutional layers b2 and the 2 pixel_shuffle layers are arranged alternately and are used to map low-resolution features to high-resolution features; b2 has input channel = 64, output channel = 64*4, kernel size = 3, and pixel_shuffle doubles the height and width of the feature map output by b2 while reducing the channel count to 64. Convolutional layers b3 and b4 are used to perform the upsampling operation to obtain the reconstructed image of the sub-image; b3 has input channel = 64, output channel = 64, kernel size = 3, and b4 has input channel = 64, output channel = 3, kernel size = 3.
For example, the complexity of the original version of SRResNet is "large" by default, and two simplified versions are needed to obtain SRResNets of "small" and "medium" complexity. Illustratively, after network-parameter pruning, the network structure of the "medium"-complexity SRResNet can be as shown in (b) of FIG. 5: compared with the original version, the output channel of convolutional layer b1, the input and output channels of the residual layers, the input channel of convolutional layer b2, the input and output channels of convolutional layer b3, and the input channel of convolutional layer b4 are all reduced to 48, and the output channel of convolutional layer b2 is reduced to 48*4.
The network structure of the "small"-complexity SRResNet can be as shown in (c) of FIG. 5: compared with the original version, the output channel of convolutional layer b1, the input and output channels of the residual layers, the input channel of convolutional layer b2, the input and output channels of convolutional layer b3, and the input channel of convolutional layer b4 are all reduced to 32, and the output channel of convolutional layer b2 is reduced to 32*4.
It can be understood that, after simplification, the reduced channel counts of the feature maps in the network layers mean that fewer network parameters need to be computed; the amount of computation in processing the feature maps is therefore reduced and processing is accelerated, while the restoration effect for sub-images of the corresponding complexity can still be guaranteed. That is to say, compared with using only the original version of a single first super-resolution network model, building the classification super-resolution network model from the original version of the first super-resolution network model and simplified versions of it can reduce the amount of computation and speed up processing to a certain extent. In other words, the classification super-resolution network model provided by the present application can be regarded as an accelerated version of the first super-resolution network model.
After the reconstructed image of each sub-image is obtained, step S203 can be executed.
S203: splicing the reconstructed images of the multiple sub-images to obtain the high-resolution image.
In this embodiment of the present application, a classification model is used to identify the complexity of each sub-image of the low-resolution image, and super-resolution network models of different complexities are then used to process sub-images of different complexities. On the one hand, a sub-image with relatively low complexity is processed by a super-resolution network model with relatively low complexity, which reduces the amount of computation for that sub-image and speeds up processing while the restoration effect is guaranteed. On the other hand, a sub-image with relatively high complexity is processed by a super-resolution network model with relatively high complexity, which guarantees its restoration effect. Therefore, for a complete low-resolution image, performing super-resolution processing with the classification super-resolution network model provided by the present application can guarantee the restoration effect of the high-resolution image while reducing the amount of computation.
为了充分说明本申请提供的分类超分网络模型的效果,下面结合图6-7以及表1所示的实验数据对比进行示例性的说明。所选用的对比组包括以 原始版本的FSRCNN-O和采用本申请提供的网络框架搭建的加速版本的ClassSR-FSRCNN、原始版本的CARN-O和加速版本的ClassSR-CARN、原始版本的SRResNet-O和加速版本的ClassSR-SRResNet、原始版本的RCAN-O和加速版本的ClassSR-RCAN。
图6为将各个超分辨率网络模型的原始版本以及采用本申请提供的网络框架搭建的加速版本,在8K图像测试集上进行测试后,根据获得的实验数据的统计图。其中,纵坐标为高分辨率图像的峰值信噪比(Peak Signal to Noise Ratio,PSNR),单位为dB,横坐标为计算量(FLOPs),单位为M。
As can be seen from Fig. 6, the PSNR of the high-resolution images obtained with the accelerated versions is preserved. On the lightweight super-resolution network models (for example, FSRCNN-O and CARN-O), the PSNR obtained with the accelerated version is even higher than that of the original version. In general, a higher PSNR indicates that the network model restores the low-resolution image better.
In terms of computation, the accelerated version of each super-resolution network model reduces the amount of computation by close to 50% (-50%, -47%, -48%, and -50%, respectively). In other words, the processing speed of the accelerated versions is nearly double that of the original versions.
The original and accelerated versions of each super-resolution network model were tested on a 2K image test set, a 4K image test set, and an 8K image test set, each containing 100 low-resolution image samples. The resulting experimental data can be as shown in Table 1 below:
Table 1
[Table 1 appears in the original publication only as images (PCTCN2021137582-appb-000005 and PCTCN2021137582-appb-000006); its numeric contents cannot be recovered from the text.]
In Table 1, Parameters denotes the number of network parameters of each model. Test/FLOPs denotes, for the corresponding model after super-resolving the 100 low-resolution images in the test set, the average PSNR of the reconstructed high-resolution images (in dB) and the average amount of computation (in M or G). It can be seen that, after testing the original and accelerated versions on the same test sets under the different test conditions, the average PSNR of the high-resolution images output by the two versions is essentially equal. That is, although some sub-images in the accelerated version are processed by the simplified super-resolution network models, the restoration quality of the final high-resolution image is not noticeably degraded. Meanwhile, with the restoration quality guaranteed, the amount of computation of the accelerated version for processing low-resolution images drops markedly compared with the original version, in every case from 100% down to between 50% and 71%. Hence, with the restoration quality of the high-resolution image guaranteed, the processing speed of the accelerated version is greatly improved over the original version.
Fig. 7 is a schematic comparison of the experimental data for two low-resolution image samples taken from the 2K, 4K, and 8K image test sets. It includes the reconstructed image samples obtained after the original and accelerated versions of each super-resolution network perform super-resolution on a sub-image sample, as well as the ground-truth image sample (GT) corresponding to that sub-image sample in the test set and the high-resolution reconstruction restored by conventional bicubic interpolation.
Based on Fig. 7, from the perspective of super-resolving a single image, the classification super-resolution network model provided in this application can speed up processing while guaranteeing the restoration quality of the high-resolution image.
The training process of the classification super-resolution network model provided in this application is described below by way of example with reference to Fig. 8.
As shown in Fig. 8, in the embodiments of this application, a preset initial network model can be trained with a preset first loss function, second loss function, third loss function, and a training set to obtain the classification super-resolution network model.
The initial network model refers to the classification super-resolution network model whose network parameters have not yet been optimized. It can be understood that the initial network model includes an initial classification model and multiple initial super-resolution network models of different complexities.
The training set includes multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample. In the embodiments of this application, the training set may include a 2K image training set, a 4K image training set, and/or an 8K image training set.
Since the complexity of the individual sub-image samples of the low-resolution image samples in the training set is hard to quantify, it cannot be annotated. This application therefore provides a training method in which, during training, the network parameters of the initial classification model are optimized according to how well the initial super-resolution network models restore the sub-image samples, so that the trained classification model can accurately assign an input sub-image to the appropriate super-resolution network model.
Specifically, during training, the processing of a low-resolution image sample in the training set by the initial network model includes the following steps (a code sketch follows step S303):
S301: cut the low-resolution image sample into multiple sub-image samples.
S302: for each sub-image sample, input the sub-image sample into the initial classification model to obtain a classification result, the classification result including the probability values of the sub-image sample being assigned to each complexity class; input the sub-image sample into each of the multiple initial super-resolution network models for processing, to obtain the first reconstructed image samples respectively output by the multiple initial super-resolution network models; and compute a weighted sum of the multiple first reconstructed image samples using the classification result, to obtain a second reconstructed image sample.
S303: stitch the second reconstructed image samples of the multiple sub-image samples to obtain the high-resolution image corresponding to the low-resolution image sample.
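A minimal sketch of this training-time forward pass for one batch of sub-image samples might look as follows; the tensor shapes and names are assumptions. The key point is that every sub-image passes through all initial SR branches and the classifier's probabilities weight the branch outputs, so the reconstruction quality can back-propagate into the classification model.

import torch

def training_forward(sub_imgs, classifier, sr_models):
    # sub_imgs: (B, C, h, w); classifier returns (B, M) probabilities
    probs = classifier(sub_imgs)
    # run every initial SR branch on every sub-image: (B, M, C, H, W)
    recons = torch.stack([m(sub_imgs) for m in sr_models], dim=1)
    w = probs.view(*probs.shape, 1, 1, 1)   # broadcast class weights over pixels
    return (w * recons).sum(dim=1)          # second reconstructed image samples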
After the initial network model outputs the high-resolution image corresponding to the low-resolution image sample, the first loss function is used to compute the error between this output and the high-resolution image sample corresponding to the low-resolution image sample in the training set, and the network parameters of the multiple initial super-resolution network models and the initial classification model are then adjusted according to the error value. It can be understood that the smaller the error, the better the restoration. In this way, the restoration quality can be back-propagated to the initial classification model to adjust its network parameters.
The first loss function is used to reduce the error between the high-resolution image output by the initial network model for a low-resolution image sample and the corresponding high-resolution image sample in the training set. The first loss function may be the conventional L1 loss.
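For reference, the conventional L1 loss can be written as follows; this is the standard formulation, supplied here for clarity rather than reproduced from the filing:

L_1 = \frac{1}{N} \sum_{p=1}^{N} \bigl| I^{SR}(p) - I^{HR}(p) \bigr|

where I^{SR} is the high-resolution image output by the initial network model, I^{HR} is the corresponding high-resolution image sample in the training set, and N is the number of pixels.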
In one embodiment, to ensure that the trained classification model can classify effectively and to avoid the probability values in its classification result being close to one another, which would make the classification nearly random, this application further provides a second loss function, used to increase, during training, the difference between the largest of the multiple probability values output by the initial classification model and the other probability values. In other words, the second loss function constrains the initial classification model so that, when classifying a given sub-image sample, the probability of that sample being assigned to its corresponding complexity class is as large as possible, tending toward 1. In the embodiments of this application, this second loss function may also be called the classification loss (Classification-Loss).
Illustratively, the second loss function can be expressed by the following formula:

L_c = -\sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \bigl| P_i(x) - P_j(x) \bigr|

where L_c is the negative of the sum of the distances between the probability values, output by the initial classification model for the same sub-image sample x, of belonging to each complexity class; M is the number of complexity classes; and P_i(x) is the probability value of the sub-image sample x being assigned to the i-th complexity class. This loss widens the probability gap between different classification outcomes, pushing the largest probability value toward 1.
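A direct sketch of this classification loss in code, assuming probs holds the (B, M) class probabilities for a batch of sub-image samples:

def class_loss(probs):
    # negative sum of pairwise distances between the M class probabilities
    # of each sub-image sample; minimizing it widens the probability gaps
    m = probs.shape[1]
    loss = 0.0
    for i in range(m - 1):
        for j in range(i + 1, m):
            loss = loss - (probs[:, i] - probs[:, j]).abs()
    return loss.mean()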
In one embodiment, to ensure that every initial super-resolution network model is sufficiently trained, and thus that each is trained effectively, this application further provides a third loss function, used to reduce the gap between the numbers of sub-image samples that the initial classification model assigns to the respective complexity classes. In other words, the third loss function constrains the initial classification model so that, during training, roughly the same number of sub-image samples is allocated to each complexity class, thereby ensuring that every initial super-resolution network model is sufficiently trained.
Illustratively, the third loss function can be expressed by the following formula:

L_a = \sum_{i=1}^{M} \Bigl| \sum_{j=1}^{B} P_i(x_j) - \frac{B}{M} \Bigr|

where L_a is the sum, over complexity classes, of the distances between the number of sub-image samples the initial classification model assigns to each complexity class within a batch and the average B/M; B is the batch size (batchsize), i.e. the number of sub-image samples processed in one batch; and P_i(x_j) is the probability value of the j-th sub-image sample in a batch being assigned to the i-th complexity class. The term \sum_{j=1}^{B} P_i(x_j) is the sum of the probability values of all sub-image samples in a batch assigned to the i-th complexity class; since the second loss function pushes the probability values of samples assigned to the i-th complexity class close to 1, this sum is also close to the number of sub-image samples assigned to the i-th complexity class in the batch.
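The corresponding sketch of the average loss, under the same (B, M) probability convention:

def average_loss(probs):
    # distance between the soft count of sub-images assigned to each class
    # in a batch and the uniform share B/M
    b, m = probs.shape
    per_class = probs.sum(dim=0)            # soft counts, one per complexity class
    return (per_class - b / m).abs().sum()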
It can be understood that constraining the range of L_a constrains the initial classification model so that, during training, roughly the same number of sub-image samples is allocated to each complexity class, so that the initial super-resolution network model corresponding to each complexity class gets trained. In this application, the third loss function may also be called the average loss (Average-Loss).
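The filing does not state how the three losses are weighted when combined; a plausible joint objective, reusing the two helpers sketched above with illustrative weights w1 and w2, could be:

def total_loss(sr_out, hr_target, probs, w1=0.1, w2=0.01):
    # L1 reconstruction term plus the classification and average terms;
    # the weights are assumptions, not values from the filing
    l1 = (sr_out - hr_target).abs().mean()
    return l1 + w1 * class_loss(probs) + w2 * average_loss(probs)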
The training effect of the training method provided in this application is described below by way of example with reference to the experimental data of Figs. 9-11.
Suppose the network parameters of the multiple super-resolution network models are fixed and only the classification model is trained.
Fig. 9 is a schematic diagram of the training curves when the classification model is trained with the first, second, and third loss functions used together. In Fig. 9, (a) shows the curve of the PSNR of the high-resolution image samples output by the initial classification super-resolution network model over training time, and (b) shows the curve of its amount of computation over training time. As can be seen from Fig. 9, as training proceeds, the PSNR of the initial classification super-resolution network model rises while its computation falls, indicating that the sub-image samples of the low-resolution image samples are gradually being assigned to the appropriate super-resolution network models.
Fig. 10 is a schematic comparison between the training curves obtained when the classification model is trained with the first and second loss functions but without the third loss function (the first PSNR curve and the first FLOPs curve), and the training curves obtained with all three loss functions (the second PSNR curve and the second FLOPs curve). In Fig. 10, (a) shows the PSNR of the high-resolution image samples output by the initial classification super-resolution network model over training time, and (b) shows its amount of computation over training time.
As can be seen from Fig. 10, as training proceeds, the PSNR and the computation of the initial classification super-resolution network model remain essentially unchanged, with both staying high. This shows that the initial classification model assigns all sub-image samples to the super-resolution network model of the highest complexity. In other words, if the multiple super-resolution networks are trained from the start without the third loss function, all super-resolution network models other than the most complex one will fail to be sufficiently trained.
Fig. 11 is a schematic comparison between the training curves obtained when the classification model is trained with the first and third loss functions but without the second loss function (the third PSNR curve and the third FLOPs curve), and the training curves obtained with all three loss functions (the fourth PSNR curve and the fourth FLOPs curve). In Fig. 11, (a) shows the PSNR of the high-resolution image samples output by the initial classification super-resolution network model over training time, and (b) shows its amount of computation over training time.
As can be seen from Fig. 11, as training proceeds, both the PSNR curve and the computation curve of the initial classification super-resolution network model fluctuate widely. This shows that the initial classification model classifies the input sub-image samples essentially at random, so that training cannot be completed.
In summary, the joint training scheme provided in this application, combining the first, second, and third loss functions, both ensures that every super-resolution network model is sufficiently trained and lets the classification model be effectively optimized on the basis of restoration quality and output valid classification results. This guarantees that the trained classification super-resolution network model greatly improves processing speed while preserving restoration quality.
It is worth noting that the network framework and training method provided in this application are general. They can be applied to any image restoration task, or to any task evaluated by image restoration quality. For example, besides super-resolution, they can be applied to image denoising, where the amount of computation can likewise be greatly reduced while the PSNR is preserved.
Based on the same inventive concept, as an implementation of the above method, an embodiment of this application provides a super-resolution apparatus. This apparatus embodiment corresponds to the foregoing method embodiment; for ease of reading, the details of the foregoing method embodiment are not repeated one by one here, but it should be clear that the apparatus in this embodiment can correspondingly implement all the content of the foregoing method embodiment.
Fig. 12 is a schematic structural diagram of the super-resolution apparatus provided by an embodiment of this application. As shown in Fig. 12, the super-resolution apparatus provided by this embodiment includes an acquisition unit 1201 and a processing unit 1202.
The acquisition unit 1201 is configured to acquire a low-resolution image to be processed.
The processing unit 1202 is configured to input the low-resolution image into a trained classification super-resolution network model for processing and to output the high-resolution image corresponding to the low-resolution image.
Optionally, the super-resolution apparatus further includes a training unit 1203, configured to train a preset initial network model with a preset first loss function, second loss function, third loss function, and a training set to obtain the classification super-resolution network model.
The super-resolution apparatus provided by this embodiment can execute the above method embodiments; its implementation principle and technical effect are similar and are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is only used as an example. In practical applications, the above functions can be allocated to different functional units and modules as required; that is, the internal structure of the apparatus can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from one another and are not used to limit the protection scope of this application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
Based on the same inventive concept, an embodiment of this application further provides a terminal device. Fig. 13 is a schematic structural diagram of the terminal device provided by an embodiment of this application. As shown in Fig. 13, the terminal device provided by this embodiment includes a memory 1301 and a processor 1302. The memory 1301 is configured to store a computer program; the processor 1302 is configured to execute the method described in the above method embodiments when invoking the computer program.
The terminal device provided by this embodiment can execute the above method embodiments; its implementation principle and technical effect are similar and are not repeated here.
An embodiment of this application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the method described in the above method embodiments is implemented.
An embodiment of this application further provides a computer program product. When the computer program product runs on a terminal device, the terminal device, upon executing it, implements the method described in the above method embodiments.
If the above integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the procedures in the methods of the above embodiments of this application may be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the steps of each of the above method embodiments may be implemented. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, or some intermediate forms. The computer-readable storage medium may include at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, according to legislation and patent practice, computer-readable media may not be electric carrier signals and telecommunication signals.
In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/device and method may be implemented in other ways. For example, the apparatus/device embodiments described above are merely illustrative; for instance, the division of the modules or units is only a logical functional division, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
It should be understood that, when used in the specification and the appended claims of this application, the term "comprise" indicates the presence of the described features, wholes, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the term "and/or" used in the specification and the appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in the specification and the appended claims of this application, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrases "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, to mean "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in the description of the specification and the appended claims of this application, the terms "first", "second", "third", etc. are only used to distinguish the descriptions and cannot be understood as indicating or implying relative importance.
Reference to "one embodiment" or "some embodiments" and the like described in the specification of this application means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of this application. Thus, the statements "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "comprising", "including", "having", and their variations all mean "including but not limited to", unless otherwise specifically emphasized.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements for some or all of the technical features therein; and these modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of this application.

Claims (10)

  1. A super-resolution method, characterized in that the method comprises:
    inputting a low-resolution image to be processed into a trained classification super-resolution network model for processing, and outputting a high-resolution image corresponding to the low-resolution image;
    wherein the classification super-resolution network model comprises a classification model and multiple super-resolution network models of different complexities, and the processing of the low-resolution image by the classification super-resolution network model comprises:
    cutting the low-resolution image into multiple sub-images;
    for each sub-image, determining the complexity class of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model, among the multiple super-resolution network models, corresponding to the complexity class for processing, and outputting a reconstructed image of the sub-image;
    stitching the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
  2. The method according to claim 1, characterized in that the method further comprises:
    training a preset initial network model with a preset first loss function, second loss function, third loss function, and a training set to obtain the classification super-resolution network model;
    wherein the initial network model comprises an initial classification model and multiple initial super-resolution network models of different complexities, and the training set comprises multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample;
    the first loss function is used to reduce the error between the high-resolution image output by the initial network model for a low-resolution image sample and the high-resolution image sample corresponding to that low-resolution image sample in the training set;
    the second loss function is used to increase the difference between the largest probability value among the multiple probability values output by the initial classification model and the other probability values;
    the third loss function is used to reduce the gap between the numbers of sub-image samples determined by the initial classification model as belonging to the respective complexity classes.
  3. The method according to claim 2, characterized in that, during training, the processing of a low-resolution image sample in the training set by the initial network model comprises:
    cutting the low-resolution image sample into multiple sub-image samples;
    for each sub-image sample, inputting the sub-image sample into the initial classification model for processing to obtain a classification result, the classification result comprising the probability values of the sub-image sample being assigned to each complexity class; inputting the sub-image sample into each of the multiple initial super-resolution network models for processing, to obtain first reconstructed image samples respectively output by the multiple initial super-resolution network models; and performing a weighted summation of the multiple first reconstructed image samples using the classification result, to obtain a second reconstructed image sample;
    stitching the second reconstructed image samples of the multiple sub-image samples to obtain a high-resolution image corresponding to the low-resolution image sample.
  4. The method according to claim 2, characterized in that the second loss function is:
    L_c = -\sum_{i=1}^{M-1} \sum_{j=i+1}^{M} \bigl| P_i(x) - P_j(x) \bigr|
    where L_c is the negative of the sum of the distances between the probability values, output by the initial classification model after processing a sub-image sample x, of belonging to each complexity class; M is the number of complexity classes; and P_i(x) is the probability value of the sub-image sample x being assigned to the i-th complexity class.
  5. The method according to claim 2, characterized in that the third loss function is:
    L_a = \sum_{i=1}^{M} \Bigl| \sum_{j=1}^{B} P_i(x_j) - \frac{B}{M} \Bigr|
    where L_a is the sum of the distances between the number of sub-image samples assigned by the initial classification model to each complexity class in a batch and B/M; B is the batch size; P_i(x_j) denotes the probability value of the j-th sub-image sample in a batch being assigned to the i-th complexity class; and \sum_{j=1}^{B} P_i(x_j) denotes the sum of the probability values of all sub-image samples in a batch that are assigned to the i-th complexity class.
  6. The method according to any one of claims 1-4, characterized in that the multiple super-resolution network models comprise a preset first super-resolution network model and at least one first super-resolution network model that has undergone network-parameter pruning.
  7. A super-resolution apparatus, characterized by comprising:
    an acquisition unit, configured to acquire a low-resolution image to be processed;
    a processing unit, configured to input the low-resolution image into a trained classification super-resolution network model for processing and to output a high-resolution image corresponding to the low-resolution image;
    wherein the classification super-resolution network model comprises a classification model and multiple super-resolution network models of different complexities, and the processing of the low-resolution image by the classification super-resolution network model comprises:
    cutting the low-resolution image into multiple sub-images;
    for each sub-image, determining the complexity class of the sub-image according to the classification model, inputting the sub-image into the super-resolution network model, among the multiple super-resolution network models, corresponding to the complexity class for processing, and outputting a reconstructed image of the sub-image;
    stitching the reconstructed images of the multiple sub-images to obtain the high-resolution image corresponding to the low-resolution image.
  8. The apparatus according to claim 7, characterized in that the apparatus further comprises a training unit:
    the training unit is configured to train a preset initial network model with a preset first loss function, second loss function, third loss function, and a training set to obtain the classification super-resolution network model;
    wherein the initial network model comprises an initial classification model and multiple initial super-resolution network models of different complexities, and the training set comprises multiple low-resolution image samples and a high-resolution image sample corresponding to each low-resolution image sample;
    the first loss function is used to reduce the error between the high-resolution image output by the initial network model for a low-resolution image sample and the high-resolution image sample corresponding to that low-resolution image sample in the training set;
    the second loss function is used to increase the difference between the largest probability value among the multiple probability values output by the initial classification model and the other probability values;
    the third loss function is used to reduce the gap between the numbers of sub-image samples determined by the initial classification model as belonging to the respective complexity classes.
  9. A terminal device, characterized by comprising: a memory and a processor, the memory being configured to store a computer program, and the processor being configured to execute the method according to any one of claims 1-6 when invoking the computer program.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-6 is implemented.
PCT/CN2021/137582 2021-01-29 2021-12-13 Super-resolution method and apparatus, terminal device, and storage medium WO2022160980A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110130561.3 2021-01-29
CN202110130561.3A CN112862681B (zh) 2021-01-29 2021-01-29 Super-resolution method and apparatus, terminal device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022160980A1 true WO2022160980A1 (zh) 2022-08-04

Family

ID=75987330

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137582 WO2022160980A1 (zh) 2021-01-29 2021-12-13 Super-resolution method and apparatus, terminal device, and storage medium

Country Status (2)

Country Link
CN (1) CN112862681B (zh)
WO (1) WO2022160980A1 (zh)


Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862681B (zh) * 2021-01-29 2023-04-14 中国科学院深圳先进技术研究院 一种超分辨率方法、装置、终端设备及存储介质
CN113421189A (zh) * 2021-06-21 2021-09-21 Oppo广东移动通信有限公司 图像超分辨率处理方法和装置、电子设备
CN113411521B (zh) * 2021-06-23 2022-09-09 北京达佳互联信息技术有限公司 视频处理方法、装置、电子设备及存储介质
CN113313633A (zh) * 2021-06-25 2021-08-27 西安紫光展锐科技有限公司 超分网络模型的训练方法、装置和电子设备
CN113596576A (zh) * 2021-07-21 2021-11-02 杭州网易智企科技有限公司 一种视频超分辨率的方法及装置
CN113674152A (zh) * 2021-08-03 2021-11-19 Oppo广东移动通信有限公司 图像处理方法、装置、电子设备和计算机可读存储介质
CN113888411A (zh) * 2021-09-29 2022-01-04 豪威科技(武汉)有限公司 分辨率提升方法及可读存储介质
CN113706390A (zh) * 2021-10-29 2021-11-26 苏州浪潮智能科技有限公司 图像转换模型训练方法和图像转换方法、设备及介质


Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130177242A1 (en) * 2012-01-10 2013-07-11 James E. Adams, Jr. Super-resolution image using selected edge pixels
US10009587B1 (en) * 2017-08-14 2018-06-26 Christie Digital Systems Usa, Inc. Real-time spatial-based resolution enhancement using shifted superposition
CN109886891B (zh) * 2019-02-15 2022-01-11 Beijing SenseTime Technology Development Co., Ltd. Image restoration method and apparatus, electronic device, and storage medium
CN111062872B (zh) * 2019-12-17 2021-02-05 Jinan University Edge-detection-based image super-resolution reconstruction method and system
CN111080527B (zh) * 2019-12-20 2023-12-05 Beijing Kingsoft Cloud Network Technology Co., Ltd. Image super-resolution method and apparatus, electronic device, and storage medium
CN111369440B (zh) * 2020-03-03 2024-01-30 NetEase (Hangzhou) Network Co., Ltd. Model training and image super-resolution processing method and apparatus, terminal, and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136056A (zh) * 2018-02-08 2019-08-16 Huawei Technologies Co., Ltd. Method and apparatus for image super-resolution reconstruction
US20210004935A1 (en) * 2018-04-04 2021-01-07 Huawei Technologies Co., Ltd. Image Super-Resolution Method and Apparatus
CN111598796A (zh) * 2020-04-27 2020-08-28 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method and apparatus, electronic device, and storage medium
CN111598779A (zh) * 2020-05-14 2020-08-28 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image super-resolution processing method and apparatus, electronic device, and storage medium
CN112862681A (zh) * 2021-01-29 2021-05-28 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Super-resolution method and apparatus, terminal device, and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KONG XIANGTAO; ZHAO HENGYUAN; QIAO YU; DONG CHAO: "ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic", 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 20 June 2021 (2021-06-20), pages 12011 - 12020, XP034008596, DOI: 10.1109/CVPR46437.2021.01184 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115392216A (zh) * 2022-10-27 2022-11-25 iFLYTEK Co., Ltd. Virtual avatar generation method and apparatus, electronic device, and storage medium
CN115392216B (zh) * 2022-10-27 2023-03-14 iFLYTEK Co., Ltd. Virtual avatar generation method and apparatus, electronic device, and storage medium
CN116071238A (zh) * 2023-03-06 2023-05-05 Wuhan Artificial Intelligence Research Institute Image super-resolution processing method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN112862681B (zh) 2023-04-14
CN112862681A (zh) 2021-05-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21922570

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21922570

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16/01/2024)