WO2023045297A1 - Image super-resolution method, apparatus, computer device and readable medium - Google Patents

Image super-resolution method, apparatus, computer device and readable medium

Info

Publication number
WO2023045297A1
WO2023045297A1 PCT/CN2022/085007 CN2022085007W WO2023045297A1 WO 2023045297 A1 WO2023045297 A1 WO 2023045297A1 CN 2022085007 W CN2022085007 W CN 2022085007W WO 2023045297 A1 WO2023045297 A1 WO 2023045297A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
dynamic
resolution
super
image
Prior art date
Application number
PCT/CN2022/085007
Inventor
易自尧
徐科
孔德辉
杨维
宋剑军
Original Assignee
深圳市中兴微电子技术有限公司
Application filed by 深圳市中兴微电子技术有限公司
Publication of WO2023045297A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • the present disclosure relates to, but is not limited to, the field of computer vision technology.
  • Existing neural-network-based image super-resolution algorithms have achieved good results and are widely applied to many types of images, such as natural images and medical images.
  • The usual processing or optimization approaches are: 1) designing the computing unit, such as a residual block, depth-wise convolution, or deformable convolution, to improve performance or speed up inference; 2) increasing the width (number of channels) and depth (number of network layers) to improve network performance; 3) improving performance by fusing the information of each layer, for example with an attention mechanism or dense connections.
  • the present disclosure provides an image super-resolution method, device, computer equipment and readable medium.
  • The present disclosure provides an image super-resolution method, the method comprising: acquiring processing parameters and a first image to be processed; acquiring a dynamic super-resolution model, the dynamic super-resolution model including a dynamic processing model and a control model, the control model being configured to control the execution of the dynamic processing model or adjust the structure of the dynamic processing model, and the dynamic super-resolution model being obtained after overall training of an initial dynamic processing model and an initial control model; and adjusting or controlling the dynamic processing model by using the control model according to the processing parameters, and processing the first image according to the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, the second image having a higher resolution than the first image.
  • The present disclosure also provides an image super-resolution device, including an acquisition module, a control adjustment module, and an image processing module. The acquisition module is configured to acquire processing parameters and the first image to be processed, and to acquire a dynamic super-resolution model; the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control the execution of the dynamic processing model or adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained after overall training of the initial dynamic processing model and the initial control model. The control adjustment module is configured to use the control model to adjust or control the dynamic processing model according to the processing parameters. The image processing module is configured to process the first image according to the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, the resolution of the second image being higher than the resolution of the first image.
  • The present disclosure also provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement any image super-resolution method described herein.
  • the present disclosure also provides a computer-readable medium on which a computer program is stored, wherein when the program is executed by a processor, any image super-resolution method described herein is implemented.
  • FIG. 1 is a schematic flow chart of an image super-resolution method provided by the present disclosure
  • FIG. 2 is a schematic flow diagram of training a dynamic super-resolution model provided by the present disclosure
  • FIG. 3 is a schematic flow diagram of training a dynamic super-resolution model provided by the present disclosure
  • FIG. 4 is a schematic structural diagram of a dynamic loop super-resolution neural network model provided by the present disclosure
  • FIG. 5 is a schematic structural diagram of a dynamic layer-skipping super-resolution neural network model provided by the present disclosure
  • FIG. 6 is a schematic structural diagram of a dynamic pruning super-resolution neural network model provided by the present disclosure
  • FIG. 7 is a schematic diagram of an image super-resolution device provided by the present disclosure.
  • FIG. 8 is a schematic diagram of an image super-resolution device provided by the present disclosure.
  • Embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic representations of the disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances. Therefore, the embodiments are not limited to those shown in the drawings but include modifications of configurations formed based on manufacturing processes. Accordingly, the regions illustrated in the figures have schematic properties, and the shapes of the regions shown in the figures illustrate the specific shapes of the regions of the elements, but are not intended to be limiting.
  • Super-resolution (SR) reconstruction technology uses the information of one or more low-resolution (LR) images to reconstruct a high-resolution (HR) image, and can eliminate blur and noise introduced by imaging devices.
  • This technology has a wide range of applications and has become one of the research hotspots in the field of image processing.
  • the embodiment of the present disclosure provides an image super-resolution method.
  • The image super-resolution method of the embodiment of the present disclosure is used for inference. In addition to being implemented on a PC (personal computer), it can also be implemented on device-side AI (artificial intelligence) chips.
  • The embodiments of the present disclosure relate to image processing technology, the field of artificial intelligence, and the field of computer vision.
  • Image super-resolution is realized based on a neural network: a super-resolution neural network model is trained through deep learning, and a low-resolution image is input into the trained model to obtain a high-resolution image.
  • the image super-resolution method of the present disclosure may include the following steps S11 to S13.
  • In step S11, the processing parameters and the first image to be processed are acquired.
  • The first image to be processed is a low-definition (LR) image.
  • The processing parameter may be the image magnification.
  • In step S12, a dynamic super-resolution model is acquired. The dynamic super-resolution model includes a dynamic processing model and a control model; the control model is configured to control the execution of the dynamic processing model or adjust its structure; and the dynamic super-resolution model is obtained after overall training of the initial dynamic processing model and the initial control model.
  • the dynamic processing model is configured to generate a high-resolution image from a low-resolution image.
  • the dynamic processing model includes multiple processing modules and processing layers, such as convolutional layers, ReLU (activation function) layers, pooling layers, residual blocks, etc.
  • The control model is a gate function (gating function): a simple neural network classifier configured to determine whether a processing module/processing layer in the dynamic processing model is executed, or only partly executed. In other words, it determines whether the data flow passes through the processing module/processing layer it controls.
  • Gate functions can be designed as two types: feed-forward network gate functions and recurrent network gate functions. A feed-forward network gate function needs to be designed according to the size and depth of each processing module/processing layer.
  • A recurrent network gate function can be shared by all processing modules/processing layers; its advantage is that it better retains the information left by the preceding processing module.
  • The type of dynamic super-resolution model differs according to what the gate function controls.
  • When the gate function controls an entire module of the super-resolution neural network (that is, an entire processing module of the dynamic processing model), such as the RNN module, the dynamic super-resolution model is a dynamic loop super-resolution model.
  • When the gate function controls the channels of the convolutional layers, the dynamic super-resolution model is a dynamic width super-resolution model.
  • The control model can control or adjust the dynamic processing model more accurately, so the super-resolution reconstruction effect is better and the image quality is higher.
  • In step S13, the dynamic processing model is adjusted or controlled by the control model according to the processing parameters, and the first image is processed according to the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image.
  • The control model controls or adjusts the dynamic processing model according to the processing parameters.
  • The controlled or adjusted dynamic processing model then processes the first image to obtain the second image.
  • The second image is a super-resolution (SR) image, that is, the resolution of the second image is higher than that of the first image.
  • The image super-resolution method acquires processing parameters and the first image to be processed, and acquires a dynamic super-resolution model including a dynamic processing model and a control model.
  • The dynamic super-resolution model is obtained after overall training of the initial dynamic processing model and the initial control model. The control model is used to adjust or control the dynamic processing model according to the processing parameters, and the first image is processed according to the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, the resolution of the second image being higher than that of the first image.
  • Because the initial dynamic processing model and the initial control model are trained as a whole, processing the first image with the adjusted or controlled dynamic processing model can simplify the structure of the dynamic processing model, or reduce the number of times it is executed, as far as possible while still meeting the image processing requirements. This balances image quality, high system running speed, and low computing power consumption.
  • the dynamic super-resolution model can be automatically adjusted according to the processing
  • A traditional static neural network model (such as the well-known ResNet and DenseNet) uses the same network architecture and parameters for all input samples in the test phase.
  • A dynamic neural network model can adjust its own structure/parameters according to different samples, and therefore shows clear advantages in computing efficiency and expressive ability.
  • The embodiments of the present disclosure use a dynamic neural network model, that is, one whose width or depth can be adjusted according to different samples.
  • The decision-making process of the control model (i.e., the gate function) is inherently discrete and therefore non-differentiable. In the related art, a differentiable softmax decision is used during model training and is reverted to a hard decision during inference. Although this training method supports gradient-based training, the network parameters are not optimized for the hard gating used during inference, so the prediction accuracy is poor.
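The mismatch between soft training-time gating and hard inference-time gating can be illustrated with a minimal sketch. It assumes a single gate in front of one processing block; `block`, `soft_gated`, and `hard_gated` are hypothetical names that do not appear in the disclosure, and the linear `block` only stands in for a real processing module.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def block(x: float) -> float:
    """Hypothetical processing block F(x); stands in for a real module."""
    return 2.0 * x + 1.0


def soft_gated(x: float, gate_logit: float) -> float:
    """Soft decision (differentiable): blend the executed and skipped paths."""
    p = sigmoid(gate_logit)
    return p * block(x) + (1.0 - p) * x


def hard_gated(x: float, gate_logit: float) -> float:
    """Hard decision (discrete): either execute the block or skip it."""
    return block(x) if sigmoid(gate_logit) >= 0.5 else x


if __name__ == "__main__":
    x, logit = 1.0, 0.2
    # The two outputs differ, so weights tuned for the soft blend are not
    # tuned for the output the hard gate actually produces at inference.
    print(soft_gated(x, logit), hard_gated(x, logit))
```

For a gate logit close to zero the soft output sits roughly halfway between the two paths, while the hard output jumps to one of them, which is the gap the paragraph above describes.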
  • the dynamic super-resolution model is trained using a reinforcement learning algorithm, and the initial dynamic processing model and the initial control model are trained as a whole.
  • A reinforcement learning algorithm is decision-based: it learns a mapping from environment states to actions so that the actions selected by the agent obtain the maximum reward from the environment, that is, so that the environment's evaluation of the learning system (or of the running performance of the whole system) is, in some sense, optimal. Therefore, the initial dynamic processing model and the initial control model are trained as a whole based on the reinforcement learning algorithm, and the control model obtains rewards for controlling and adjusting the dynamic processing model during training, which can improve the accuracy of the dynamic super-resolution model.
  • The training set can be the DIV2K data set and the Urban100 data set; 1000 2K images of different scenes in these data sets, such as people, animals, plants, buildings, and natural scenery, can be used as the training data in the training set.
  • The test set can use 100 images from the BSD100 data set together with 70 self-captured images.
  • The images in the training set can be preprocessed.
  • For example, the images in the training set (i.e., the DIV2K and Urban100 data sets) are used as high-definition (HR) images and compressed to obtain low-definition (LR) images.
  • The images in the training set can also be rotated counterclockwise, flipped, and so on, and the rotated and flipped images can be used as training data in the training set.
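The preprocessing and augmentation described above might be sketched as follows. This is an illustration only: the disclosure does not specify the compression method, so average pooling is assumed here, and all function names are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def downscale(hr: Image, factor: int) -> Image:
    """Average-pool non-overlapping factor x factor tiles (assumed method)."""
    rows, cols = len(hr) // factor, len(hr[0]) // factor
    return [
        [
            sum(
                hr[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def rotate_ccw(img: Image) -> Image:
    """Rotate the image 90 degrees counterclockwise."""
    return [list(row) for row in zip(*img)][::-1]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def make_training_pairs(hr: Image, factor: int):
    """Return (LR, HR) pairs for the original, rotated, and flipped image."""
    variants = [hr, rotate_ccw(hr), flip_horizontal(hr)]
    return [(downscale(v, factor), v) for v in variants]


if __name__ == "__main__":
    hr_image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for lr, hr in make_training_pairs(hr_image, 2):
        print(lr)
```

Each augmented HR image is downscaled separately so that every LR/HR pair stays spatially aligned.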
  • Dynamic super-resolution model training needs to be carried out on a GPU (graphics processing unit) server or workstation, and multiple graphics cards can be used in parallel during training.
  • The system environment of the server can be Ubuntu or Windows, and the deep learning framework can be PyTorch.
  • Performing overall training on the initial dynamic processing model and the initial control model by using a reinforcement learning algorithm to obtain the dynamic super-resolution model may include the following steps S21 and S22.
  • In step S21, training images are used to train the initial dynamic processing model and the initial control model as a whole, in an iterative manner, by using a reinforcement learning algorithm. A training image is an image obtained by compressing an original image.
  • The training images are the images in the training set, which are preprocessed images.
  • Preprocessing refers to compressing the high-definition image by a preset multiple to obtain a low-definition image.
  • The overall training is carried out iteratively, and the data used for training (i.e., the training images) is the preprocessed data.
  • In step S22, in response to a preset convergence condition being satisfied, the training is ended to obtain the dynamic super-resolution model.
  • the preset convergence conditions include at least one of the following:
  • The weighted sum of the loss function and the obtained reward is minimized, where the loss function is the loss between the image output by the dynamic processing model and the original image corresponding to the training image, and the obtained reward is the reward the control model obtains by controlling or adjusting the dynamic processing model.
  • the loss function reflects the super-resolution image reconstruction effect (ie, the quality of the reconstructed image) of the dynamic super-resolution model on the image input to the dynamic super-resolution model (ie, training data).
  • the rewards obtained can include positive rewards (the rewards are positive at this time), and can also be punishments (the rewards are negative at this time).
  • The test set can be used to evaluate the trained dynamic super-resolution model, for example by PSNR (peak signal-to-noise ratio).
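PSNR has a standard definition, 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and reconstructed images. A minimal sketch for images stored as nested lists follows; the 8-bit peak value of 255 and the function name `psnr` are assumptions, not details given in the disclosure.

```python
import math
from typing import Sequence


def psnr(
    reference: Sequence[Sequence[float]],
    reconstructed: Sequence[Sequence[float]],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    squared_errors = [
        (a - b) ** 2
        for row_ref, row_rec in zip(reference, reconstructed)
        for a, b in zip(row_ref, row_rec)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    restored = [[11, 21], [31, 41]]
    print(psnr(original, restored))
```

A higher PSNR means the reconstructed SR image is closer to the original HR image.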
  • Before the training images are used to iteratively train the initial dynamic processing model and the initial control model as a whole with the reinforcement learning algorithm (i.e., before step S21), the method may also include step S21': isolating the initial control model and training the initial dynamic processing model to obtain a trained dynamic processing model.
  • That is, the initial control model is isolated and the initial dynamic processing model is trained separately, using a preset optimizer, learning rate, and number of training epochs, so as to realize supervised pre-training of the super-resolution neural network.
  • In that case, step S21 includes: using the training images to iteratively train the trained dynamic processing model and the initial control model as a whole with the reinforcement learning algorithm. That is to say, the dynamic processing model is first trained separately to ensure its accuracy, and then the initial control model and the trained dynamic processing model are trained as a whole to ensure the accuracy of the control model.
  • The dynamic super-resolution model includes a dynamic depth super-resolution model.
  • The dynamic depth super-resolution model includes a first dynamic processing model and a first control model.
  • The first control model is configured to, after the first dynamic processing model has been executed at least once, control the execution of the first dynamic processing model, or control the execution of the processing modules or processing layers in the first dynamic processing model.
  • the dynamic depth super-resolution model includes a dynamic loop super-resolution neural network model or a dynamic layer-skipping super-resolution neural network model.
  • FIG. 4 is a schematic structural diagram of the dynamic loop super-resolution neural network model.
  • The first dynamic processing model of the dynamic loop super-resolution neural network model includes an RNN (recurrent neural network) module and an upsampling (Upsample) module. The low-resolution (LR) image passes through structures such as feature extraction and encoding and then reaches the RNN module.
  • After each pass through the RNN module, the gate function (i.e., the first control model) judges whether another loop iteration is needed. If so, the data passes through the RNN module again; if not, the data leaves the RNN module for the subsequent steps, and a super-resolution (SR) image is finally obtained through the upsampling module.
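The control flow of the dynamic loop model can be sketched as below. This is a toy illustration, not the disclosed network: `refine` and `keep_going` are hypothetical stand-ins for the RNN module and the learned gate function, and the `max_steps` cap is an added safety assumption.

```python
from typing import Callable, List, Tuple

Features = List[float]


def dynamic_loop(
    features: Features,
    recurrent_step: Callable[[Features], Features],
    gate: Callable[[Features, Features], bool],
    max_steps: int = 8,
) -> Tuple[Features, int]:
    """Run the recurrent step at least once; the gate decides on each further pass."""
    steps = 0
    while True:
        previous = features
        features = recurrent_step(features)
        steps += 1
        # Leave the loop when the gate says no more passes are needed,
        # or when the assumed iteration cap is reached.
        if steps >= max_steps or not gate(previous, features):
            return features, steps


def refine(features: Features) -> Features:
    """Hypothetical recurrent module: each pass moves values toward 2.0."""
    return [0.5 * v + 1.0 for v in features]


def keep_going(previous: Features, current: Features) -> bool:
    """Hypothetical gate: continue while the last pass still changed the features."""
    return max(abs(a - b) for a, b in zip(previous, current)) > 0.1


if __name__ == "__main__":
    print(dynamic_loop([0.0], refine, keep_going))  # ([1.9375], 5)
```

Inputs that need little refinement leave the loop early, which is where the saving in computation comes from.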
  • FIG. 5 is a schematic structural diagram of the dynamic layer-skipping super-resolution neural network model.
  • The first dynamic processing model of the dynamic layer-skipping super-resolution neural network model includes multiple processing modules (such as residual blocks, RB), multiple processing layers (such as convolutional layers, Conv), and an upsampling (Upsample) module. The gate function (i.e., the first control model) acts on a processing module or processing layer and judges its importance.
  • If the processing module or processing layer is important, the gate function outputs 1 so that the data flow passes through it; otherwise, the gate function outputs 0 so that the data skips it directly. The data processed by the executed processing modules or processing layers then passes through the upsampling module to obtain a super-resolution (SR) image.
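A minimal sketch of layer skipping follows, assuming the gate is queried once before every block and returns 1 (execute) or 0 (skip). The threshold rule in `magnitude_gate` and the toy residual blocks are hypothetical; in the disclosure the gate is a trained classifier.

```python
from typing import Callable, List, Sequence, Tuple

Features = List[float]
Block = Callable[[Features], Features]


def make_residual_block(scale: float) -> Block:
    """Hypothetical residual block: output = input + scale * input."""
    return lambda x: [v + scale * v for v in x]


def magnitude_gate(x: Features, index: int) -> int:
    """Hypothetical gate rule: output 1 while the mean magnitude is below 2.0.

    The block index is accepted so that a per-block gate could be swapped in.
    """
    return 1 if sum(abs(v) for v in x) / len(x) < 2.0 else 0


def run_with_skipping(
    x: Features,
    blocks: Sequence[Block],
    gate: Callable[[Features, int], int],
) -> Tuple[Features, List[int]]:
    """Apply each block only when the gate outputs 1; otherwise pass data through."""
    executed = []
    for index, block in enumerate(blocks):
        if gate(x, index) == 1:
            x = block(x)
            executed.append(index)
    return x, executed


if __name__ == "__main__":
    blocks = [make_residual_block(0.5) for _ in range(3)]
    print(run_with_skipping([1.0], blocks, magnitude_gate))  # ([2.25], [0, 1])
```

Using one gate callable for every block mirrors the shared gate design; a dedicated gate per block would only change how `gate` uses `index`.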
  • The dynamic super-resolution model includes a dynamic width super-resolution model.
  • The dynamic width super-resolution model includes a second dynamic processing model and a second control model.
  • The second control model is configured to adjust the channels of each convolutional layer in the second dynamic processing model.
  • the dynamic width super-resolution model includes a dynamically pruned super-resolution neural network model.
  • Fig. 6 is a schematic diagram of the dynamic pruning super-resolution neural network model structure.
  • The second dynamic processing model of the dynamic pruning super-resolution neural network model includes multiple processing layers (such as convolutional layers, Conv) and an upsampling (Upsample) module. The gate function (i.e., the second control model) acts on all convolutional layers and performs dynamic convolution on them.
  • The gate function judges the importance of each channel in a convolutional layer. If a channel is important, the gate function outputs 1 to open the channel; if the channel is not important, it outputs 0 to close the channel. The data processed by the open channels of each convolutional layer finally passes through the upsampling module to obtain a super-resolution (SR) image.
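Channel gating can be illustrated with a small sketch, assuming a 1x1 (pointwise) convolution over a feature map with flattened spatial dimensions. The saliency rule in `channel_gate` is a hypothetical stand-in for the trained gate function, and all names are illustrative.

```python
from typing import List

FeatureMap = List[List[float]]  # indexed as [channel][pixel]


def channel_gate(features: FeatureMap, threshold: float) -> List[int]:
    """Output 1 (open) or 0 (closed) per channel from its mean absolute activation."""
    return [
        1 if sum(abs(v) for v in channel) / len(channel) > threshold else 0
        for channel in features
    ]


def selected_indices(mask: List[int]) -> List[int]:
    """Index list of the open channels."""
    return [i for i, g in enumerate(mask) if g == 1]


def pruned_pointwise_conv(
    features: FeatureMap, kernel: List[List[float]], keep: List[int]
) -> FeatureMap:
    """1x1 convolution that reads only the kept input channels.

    kernel[o][i] is the weight from input channel i to output channel o.
    """
    n_pixels = len(features[0])
    return [
        [sum(row[i] * features[i][p] for i in keep) for p in range(n_pixels)]
        for row in kernel
    ]


if __name__ == "__main__":
    features = [[1.0, 2.0], [0.01, -0.02], [3.0, 1.0]]
    kernel = [[1.0, 5.0, 1.0], [0.5, 5.0, -1.0]]
    mask = channel_gate(features, threshold=0.1)
    keep = selected_indices(mask)
    print(mask, pruned_pointwise_conv(features, kernel, keep))
```

Closed channels are never read, so their multiply-accumulate operations are skipped rather than computed and discarded.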
  • Both the dynamic layer-skipping super-resolution neural network model shown in FIG. 5 and the dynamic pruning super-resolution neural network model shown in FIG. 6 use a recurrent gate function, that is, all processing modules or processing layers are controlled by the same gate function. It is also possible to design a dedicated gate function for each processing module or processing layer.
  • For each gating layer (i.e., the first control model, or gate function), an estimated gating function is constructed in the context of policy optimization through the reinforcement learning algorithm. The estimated gating function is used to determine whether a processing module or processing layer in the first dynamic processing model is executed or skipped.
  • the estimated gating function is as formula (1):
  • $x_i$ is the input
  • represents the probability distribution of the decision result of the gate function.
  • $P(\cdot)$ represents a probability function.
  • $G_i(\cdot)$ represents a dynamic processing model, which processes the input $x_i$.
  • the overall objective function is set to:
  • $R_i = (1 - g_i) C_i$
  • The constant $C_i$ represents the cost of executing $F_i$.
  • $R_i$ represents the reward for the gate function skipping $F_i$; the term preceding it in the formula is the loss function used during training, which includes but is not limited to L1, MSE, GAN loss, and sums of one or more of these.
  • $J(\theta)$ represents the overall objective function.
  • $E_x$ represents the expectation over $x$.
  • $x$ represents the input.
  • $E_g$ represents the expectation over $g$.
  • $N$ represents the dynamic range.
  • the gradient calculation formula (3) of the overall objective function is as follows:
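The formulas themselves are not reproduced in this text, so the following is only a sketch under stated assumptions, not the disclosure's formulas (2) and (3). It assumes the objective is the training loss minus a weighted mean of the skip rewards $R_i = (1 - g_i) C_i$, with a hypothetical weight `alpha`, and it uses the generic score-function (REINFORCE-style) gradient for a single Bernoulli gate whose execute probability is `sigmoid(theta)`.

```python
import math
from typing import List, Sequence


def skip_rewards(gates: Sequence[int], costs: Sequence[float]) -> List[float]:
    """R_i = (1 - g_i) * C_i: the cost saved when block i is skipped (g_i = 0)."""
    return [(1 - g) * c for g, c in zip(gates, costs)]


def objective(loss: float, gates: Sequence[int], costs: Sequence[float],
              alpha: float) -> float:
    """Assumed form: loss minus alpha times the mean skip reward (lower is better)."""
    rewards = skip_rewards(gates, costs)
    return loss - alpha * sum(rewards) / len(rewards)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def expected_policy_gradient(theta: float, cost_execute: float,
                             cost_skip: float) -> float:
    """Exact score-function gradient of E_g[cost(g)] for one Bernoulli gate.

    With p = sigmoid(theta) = P(g = 1):
    sum_g P(g) * cost(g) * d log P(g) / d theta = p * (1 - p) * (cost_execute - cost_skip)
    """
    p = sigmoid(theta)
    return p * (1.0 - p) * (cost_execute - cost_skip)


def train_gate(cost_execute: float, cost_skip: float,
               lr: float = 1.0, steps: int = 200) -> float:
    """Gradient descent on theta; returns the final probability of executing."""
    theta = 0.0
    for _ in range(steps):
        theta -= lr * expected_policy_gradient(theta, cost_execute, cost_skip)
    return sigmoid(theta)


if __name__ == "__main__":
    print(objective(0.3, gates=[1, 0, 0], costs=[1.0, 2.0, 3.0], alpha=0.1))
    print(train_gate(cost_execute=0.5, cost_skip=0.2))
```

In this toy setting the gate learns to skip the block when skipping gives the lower overall objective and to execute it otherwise, without requiring the hard decision itself to be differentiable.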
  • The gate function (i.e., the second control model) in the dynamic width super-resolution model decides whether to skip individual channels in a convolutional layer, rather than whether to skip an entire processing module or convolutional layer. This is similar to model pruning.
  • Dynamic width super-resolution model training mainly acts on the convolutional layers.
  • The convolutional layers are denoted $C_1, C_2, \ldots, C_m$,
  • and their channels are denoted $K_1, K_2, \ldots, K_m$, respectively.
  • the objective function is expressed as the following formula (4):
  • $L$ represents the loss function.
  • $L_{pnt}$ represents the penalty term for the trade-off between speed and accuracy.
  • $h(F_i)$ is the index list of the selected channels, generated according to the input feature map.
  • $K[\cdot]$ is the index operation that prunes the channels.
  • conv denotes the convolution operation.
  • $E_{F_i}$ represents the expectation over the input features.
  • The initial model is randomly initialized, so its decisions are made randomly.
  • The super-resolution neural network serves as the environment, and the gate function is trained with the rewards obtained by closing different convolutional layer channels.
  • When formula (4) converges as a whole, the reinforcement learning is complete and the training ends.
  • The second control model can also be fixed and the second dynamic processing model fine-tuned according to the strategy of the second control model, so that the second dynamic processing model can specialize in specific tasks.
  • The image super-resolution method provided by the embodiments of the present disclosure is implemented based on a dynamic neural network.
  • According to the input processing parameters (such as the image magnification), and on the premise of meeting the image quality requirements, as many processing modules or processing layers of the dynamic neural network as possible are skipped, so as to increase the running speed and reduce the computing power required while maintaining high expressiveness.
  • The embodiment of the present disclosure combines a dynamic-structure dynamic neural network with a super-resolution neural network. Unlike a traditional neural network, in which all input images are processed with the same model structure, the embodiment can modify its own network structure for different inputs, which improves the running speed, saves computing resources, and improves the user experience while ensuring high image restoration quality.
  • the embodiments of the present disclosure can be applied to mobile phone APPs, built-in image processing modules in cameras, built-in copy processing modules in medical imaging equipment, and high-definition televisions to realize the restoration function of old photos and old movies. After adding the image alignment module, it can also realize the conversion of black and white images into color images.
  • the embodiment of the present disclosure also provides an image super-resolution device.
  • The acquisition module 101 is configured to acquire processing parameters and the first image to be processed, and to acquire a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control the execution of the dynamic processing model or adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained after overall training of the initial dynamic processing model and the initial control model.
  • the control adjustment module 102 is configured to use the control model to adjust or control the dynamic processing model according to the processing parameters.
  • the image processing module 103 is configured to process the first image according to the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, the resolution of the second image is higher than that of the first image The resolution of the image.
  • the dynamic super-resolution model is obtained after overall training of the initial dynamic processing model and the initial control model using a reinforcement learning algorithm.
  • The image super-resolution device further includes a model training module 104. The model training module 104 is configured to use training images to train the initial dynamic processing model and the initial control model as a whole, in an iterative manner, by using a reinforcement learning algorithm, where a training image is an image obtained by compressing an original image; and, in response to a preset convergence condition being satisfied, to end the training and obtain the dynamic super-resolution model.
  • the preset convergence condition includes at least one of the following:
  • The weighted sum of the loss function and the obtained reward is minimized, where the loss function is the loss between the image output by the dynamic processing model and the original image corresponding to the training image, and the obtained reward is the reward the control model obtains by controlling or adjusting the dynamic processing model.
  • The training module 104 is further configured to isolate the initial control model and train the initial dynamic processing model to obtain the trained dynamic processing model.
  • The training module 104 is configured to use training images to train the trained dynamic processing model and the initial control model as a whole, in an iterative manner, by using a reinforcement learning algorithm.
  • The dynamic super-resolution model includes a dynamic depth super-resolution model.
  • The dynamic depth super-resolution model includes a first dynamic processing model and a first control model.
  • The first control model is configured to, after the first dynamic processing model has been executed at least once, control the execution of the first dynamic processing model, or control the execution of the processing modules or processing layers in the first dynamic processing model.
  • the dynamic depth super-resolution model includes a dynamic loop super-resolution neural network model or a dynamic layer-skipping super-resolution neural network model.
  • the dynamic super-resolution model includes a dynamic width super-resolution model
  • the dynamic width super-resolution model includes a second dynamic processing model and a second control model
  • the second control model is configured to adjust the channels of each convolutional layer in the second dynamic processing model.
  • the dynamic width super-resolution model includes a dynamic pruning super-resolution neural network model.
  • the present disclosure also provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the image super-resolution method as described above.
  • the present disclosure also provides a computer-readable medium on which a computer program is stored, wherein when the program is executed by a processor, the aforementioned image super-resolution method is realized.
  • the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components.
  • Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit.
  • Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and that can be accessed by a computer.
  • communication media typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

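The present disclosure provides an image super-resolution method. The method includes: acquiring a processing parameter and a first image to be processed, and acquiring a dynamic super-resolution model that includes a dynamic processing model and a control model, where the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model; and adjusting or controlling the dynamic processing model with the control model according to the processing parameter, and processing the first image with the adjusted or controlled dynamic processing model to obtain a second image whose resolution is higher than that of the first image. The present disclosure also provides an image super-resolution device, a computer device, and a computer-readable medium.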

Description

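Image Super-Resolution Method and Device, Computer Device, and Readable Medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202111106460.9, filed with the China Patent Office on September 22, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to, but is not limited to, the field of computer vision technology.
Background
Existing neural-network-based image super-resolution algorithms have achieved good results and are widely used on many kinds of images, such as natural images and medical images. They are typically designed or optimized in the following ways: 1) designing computing units, such as residual blocks, depth-wise convolution, and deformable convolution, to improve performance or speed up inference; 2) increasing width (the number of channels) and depth (the number of network layers) to improve network performance; 3) fusing information across layers to improve performance, for example by using attention mechanisms or dense connections.
Summary
The present disclosure provides an image super-resolution method, an image super-resolution device, a computer device, and a readable medium.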
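In a first aspect, the present disclosure provides an image super-resolution method, the method including: acquiring a processing parameter and a first image to be processed; acquiring a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model; and, according to the processing parameter, adjusting or controlling the dynamic processing model with the control model, and processing the first image with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, where the resolution of the second image is higher than that of the first image.
In another aspect, the present disclosure further provides an image super-resolution device including an acquisition module, a control and adjustment module, and an image processing module. The acquisition module is configured to acquire a processing parameter and a first image to be processed, and to acquire a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model. The control and adjustment module is configured to adjust or control the dynamic processing model with the control model according to the processing parameter. The image processing module is configured to process the first image with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, where the resolution of the second image is higher than that of the first image.
In another aspect, the present disclosure further provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement any of the image super-resolution methods described herein.
In another aspect, the present disclosure further provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements any of the image super-resolution methods described herein.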
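Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the image super-resolution method provided by the present disclosure;
FIG. 2 is a schematic flowchart of training the dynamic super-resolution model provided by the present disclosure;
FIG. 3 is a schematic flowchart of training the dynamic super-resolution model provided by the present disclosure;
FIG. 4 is a schematic structural diagram of the dynamic loop super-resolution neural network model provided by the present disclosure;
FIG. 5 is a schematic structural diagram of the dynamic layer-skipping super-resolution neural network model provided by the present disclosure;
FIG. 6 is a schematic structural diagram of the dynamic pruning super-resolution neural network model provided by the present disclosure;
FIG. 7 is a schematic diagram of the image super-resolution device provided by the present disclosure;
FIG. 8 is a schematic diagram of the image super-resolution device provided by the present disclosure.
Detailed Description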
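Example embodiments are described more fully below with reference to the accompanying drawings, but they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the disclosure to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for describing particular embodiments only and is not intended to limit the disclosure. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the disclosure. Accordingly, the example illustrations may be modified depending on manufacturing techniques and/or tolerances. The embodiments are therefore not limited to those shown in the drawings but include modifications of configurations formed on the basis of manufacturing processes. The regions illustrated in the drawings are thus schematic in nature, and their shapes illustrate specific shapes of regions of elements but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.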
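Existing neural-network-based image super-resolution algorithms have achieved good results and are widely used on many kinds of images, such as natural images and medical images. They are typically designed or optimized in the following ways: 1) designing computing units, such as residual blocks, depth-wise convolution, and deformable convolution, to improve performance or speed up inference; 2) increasing width (the number of channels) and depth (the number of network layers) to improve network performance; 3) fusing information across layers to improve performance, for example by using attention mechanisms or dense connections.
However, for a trained super-resolution neural network model, every low-resolution image fed into the model currently goes through the same network processing flow to produce a high-resolution image. When such a model is designed, it usually has many layers and a complex structure so that it can handle complex images. In practice, however, some input low-resolution images are relatively simple and do not need a complex network structure, so time and computing power are wasted.
Super-resolution (SR) reconstruction uses the information in one or more low-resolution (LR) images to reconstruct a high-resolution (HR) image, and can also remove blur and noise introduced by the imaging device. The technique has a wide range of applications and has become one of the research hotspots in image processing. The embodiments of the present disclosure provide an image super-resolution method. Inference with this method can be implemented not only on a PC (personal computer) but also on an AI (artificial intelligence) chip. The embodiments relate to image processing, artificial intelligence, and computer vision, and implement image super-resolution with a neural network: a model is trained by deep learning, and a low-resolution image is fed into the trained super-resolution neural network model to obtain a high-resolution image.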
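As shown in FIG. 1, the image super-resolution method of the present disclosure may include the following steps S11 to S13.
In step S11, a processing parameter and a first image to be processed are acquired.
The first image to be processed is a low-resolution (LR) image. In some embodiments, the processing parameter may be an image magnification factor.
In step S12, a dynamic super-resolution model is acquired. The dynamic super-resolution model includes a dynamic processing model and a control model; the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model; and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model.
The dynamic processing model is configured to generate a high-resolution image from a low-resolution image. It includes multiple processing modules and processing layers, such as convolutional layers, ReLU (activation function) layers, pooling layers, and residual blocks. The control model is a gate function (gating function), a simple neural network classifier configured to decide whether a processing module/processing layer of the dynamic processing model is executed, or only partly executed, that is, whether the data flow passes through the processing module/processing layer it controls. Based on the neural network structure, gate functions can be designed as two types: feed-forward gate functions and recurrent gate functions. A feed-forward gate function has to be configured for each processing module/processing layer according to parameters such as its size, depth, and position in the network. A recurrent gate function can be shared by all processing modules/processing layers, which has the benefit of better retaining the information left by the preceding processing module. The type of dynamic super-resolution model depends on the type of gate function. When the gate function controls an entire module of the super-resolution neural network (that is, an entire processing module of the dynamic processing model), such as an RNN module, the dynamic super-resolution model is a dynamic loop super-resolution model; when the gate function controls the channels of convolutional layers, the dynamic super-resolution model is a dynamic width super-resolution model.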
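With a dynamic super-resolution model obtained by overall training of the initial dynamic processing model and the initial control model, the control model controls or adjusts the dynamic processing model more precisely, which gives better super-resolution reconstruction and better image quality.
In step S13, according to the processing parameter, the dynamic processing model is adjusted or controlled with the control model, and the first image is processed with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image.
In this step, the control model controls or adjusts the dynamic processing model according to the processing parameter, and the controlled or adjusted dynamic processing model is used during super-resolution reconstruction of the first image to obtain the second image. The second image is a super-resolution (SR) image, that is, its resolution is higher than that of the first image.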
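Steps S11 to S13 can be pictured with a minimal, illustrative sketch. Everything named here is an assumption made for the example: `control_model` uses a hypothetical rule (run more blocks for larger magnification factors) in place of the trained gate function, `blocks` stands in for the processing modules, and nearest-neighbour upsampling stands in for the real upsampling module.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def upsample_nearest(image: Image, scale: int) -> Image:
    """Nearest-neighbour upsampling; a stand-in for the real upsampling module."""
    out: Image = []
    for row in image:
        wide = [px for px in row for _ in range(scale)]  # repeat each pixel horizontally
        out.extend([list(wide) for _ in range(scale)])   # repeat each row vertically
    return out


def control_model(scale: int, max_blocks: int) -> int:
    """Hypothetical control rule: choose how many blocks to execute from the scale."""
    return min(max_blocks, scale)


def super_resolve(first_image: Image, scale: int,
                  blocks: List[Callable[[Image], Image]]) -> Image:
    # S11: the processing parameter (scale) and the first image arrive as arguments.
    # S12: `blocks` plus `control_model` play the role of the dynamic SR model.
    active = control_model(scale, len(blocks))  # S13: control the processing model
    features = first_image
    for block in blocks[:active]:               # only the selected blocks are executed
        features = block(features)
    return upsample_nearest(features, scale)    # second image, higher resolution
```

With identity blocks, `super_resolve([[1.0, 2.0], [3.0, 4.0]], 2, blocks)` returns a 4x4 image, which makes the data flow easy to check before real network layers are substituted.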
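In the image super-resolution method provided by the embodiments of the present disclosure, a processing parameter and a first image to be processed are acquired, and a dynamic super-resolution model including a dynamic processing model and a control model is acquired, the dynamic super-resolution model being obtained by overall training of an initial dynamic processing model and an initial control model; the dynamic processing model is adjusted or controlled with the control model according to the processing parameter, and the first image is processed with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, whose resolution is higher than that of the first image. Because the initial dynamic processing model and the initial control model have already been trained as a whole, processing the first image with the dynamic processing model after the control model has adjusted or controlled it makes it possible to simplify the structure of the dynamic processing model, or reduce the number of times it is executed, as far as possible while still meeting the image processing requirements, thereby balancing image quality, high running speed, and low computing cost. In addition, the dynamic super-resolution model adjusts automatically according to the processing parameter, which makes image super-resolution processing more flexible and more widely applicable.
Traditional static neural network models (such as the well-known ResNet and DenseNet) use the same network architecture and parameters for all input samples at test time. In contrast, a dynamic neural network model can adapt its structure/parameters to each sample, which gives it clear advantages in computational efficiency and representational power. The embodiments of the present disclosure use a dynamic neural network model, which can adjust the width or depth of its own network for different samples.
Because the decision process controlled by the control model (the gate function) is inherently discrete, it is non-differentiable. In the related art, a differentiable soft-max decision is used during model training and is reverted to a hard decision during inference. Although that training method supports gradient-based training, the network parameters are not optimized for the hard gating that is subsequently used at inference, so prediction accuracy is poor.
To improve the accuracy of the dynamic super-resolution model, in some embodiments the model is trained with a reinforcement learning algorithm, and the initial dynamic processing model and the initial control model are trained as a whole. A reinforcement learning algorithm is decision-based: it learns a mapping from environment states to actions such that the actions chosen by the agent obtain the maximum reward from the environment, so that the external environment's evaluation of the learning system (or the running performance of the whole system) is in some sense optimal. Therefore, training the initial dynamic processing model and the initial control model as a whole with a reinforcement learning algorithm, using the rewards obtained from the control model's control and adjustment actions on the dynamic processing model, can improve the accuracy of the dynamic super-resolution model.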
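Before model training, a training set and a test set are built. For example, the training set may be the DIV2K dataset and the Urban100 dataset, and 1000 2K images from these datasets covering different scenes such as people, animals, plants, buildings, and natural scenery may be used as training data. For example, the test set may use the 100 images in the BSD100 dataset plus 70 self-captured images.
After the training set and the test set have been built, the images in the training set may be preprocessed. The images in the training set (the DIV2K and Urban100 datasets) are downscaled by different factors. Note that the training-set images are high-resolution (HR) images; in preprocessing, an HR image is compressed by a preset factor to obtain a low-resolution (LR) image. To further enlarge the training set, the images may also be rotated counterclockwise, flipped, and so on, and the rotated and flipped images used as training data.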
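The preprocessing and augmentation described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: block averaging is used as the compression step (the text does not say which downscaling method is used), images are nested lists of grayscale values, and the function names are hypothetical.

```python
def downscale(image, factor):
    """Average each factor x factor block (assumes both dimensions divide by factor)."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(image[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / factor ** 2
            for x in range(0, width, factor)
        ]
        for y in range(0, height, factor)
    ]


def rotate_ccw(image):
    """Rotate the image 90 degrees counterclockwise (transpose, then reverse rows)."""
    return [list(row) for row in zip(*image)][::-1]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def make_training_pairs(hr_image, factor):
    """Return (LR, HR) pairs for the original image and its augmented variants."""
    variants = [hr_image, rotate_ccw(hr_image), flip_horizontal(hr_image)]
    return [(downscale(variant, factor), variant) for variant in variants]
```

Each HR image here yields three (LR, HR) training pairs; more rotation angles or flip directions can be added to `variants` in the same way.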
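Training the dynamic super-resolution model requires a GPU (graphics processing unit) server or workstation, and multiple graphics cards may be used in parallel during training. The server's operating system may be Ubuntu or Windows, and the deep learning framework may be PyTorch.
The training process of the dynamic super-resolution model is described in detail below with reference to FIG. 2. As shown in FIG. 2, using a reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole to obtain the dynamic super-resolution model may include the following steps S21 and S22.
In step S21, a reinforcement learning algorithm is used to train the initial dynamic processing model and the initial control model as a whole, iteratively, with training images; a training image is an image obtained by compressing an original image.
The training images are the images in the training set and have been preprocessed; preprocessing means compressing a high-resolution image by a preset factor to obtain a low-resolution image. During training with the reinforcement learning algorithm, the preset initial dynamic processing model and the preset initial control model are trained as a whole iteratively, and the data used for training (the training images) are the preprocessed data.
In step S22, in response to a preset convergence condition being satisfied, training ends and the dynamic super-resolution model is obtained.
In some embodiments, the preset convergence condition includes at least one of the following:
(1) a preset number of iterations has been trained;
(2) the weighted sum of the loss function and the obtained reward is minimal, where the loss function is the loss between the image produced by the dynamic processing model and the original image corresponding to the respective training image, and the obtained reward is the reward obtained when the control model controls or adjusts the dynamic processing model.
That is, when one or more of the above conditions are satisfied, the dynamic super-resolution model is considered to have converged and training ends. The loss function reflects how well the dynamic super-resolution model reconstructs a super-resolution image from the image fed into it (the training data), that is, the quality of the reconstructed image. The obtained reward may be a positive reward (a positive value) or a penalty (a negative value).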
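The two convergence conditions can be checked with a small helper. The sketch below makes several assumptions that are not in the source: the weights in `weighted_objective` (including the negative sign on the reward, so that larger rewards lower the objective), and the use of a patience window to approximate "the weighted sum is minimal", since a true minimum cannot be confirmed during training.

```python
def weighted_objective(loss, reward, loss_weight=1.0, reward_weight=-0.1):
    """Weighted sum of loss and reward; both weights are illustrative assumptions."""
    return loss_weight * loss + reward_weight * reward


def should_stop(history, max_iters, patience=3, tol=1e-4):
    """Decide whether to end training.

    history: objective values, one per completed iteration.
    Condition (1): the preset number of iterations has been reached.
    Condition (2), approximated: the objective has not improved by more than
    `tol` during the last `patience` iterations.
    """
    if len(history) >= max_iters:
        return True
    if len(history) > patience:
        best_before = min(history[:-patience])
        recent_best = min(history[-patience:])
        return recent_best > best_before - tol
    return False
```

In a training loop, `history.append(weighted_objective(loss, reward))` would be called once per iteration, followed by `if should_stop(history, max_iters): break`.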
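After the dynamic super-resolution model has been trained, the test set can be used to evaluate it; for example, the PSNR (peak signal-to-noise ratio) of the images produced by super-resolution reconstruction can be evaluated.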
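For reference, PSNR between a reconstructed image and its ground-truth image is 10·log10(MAX² / MSE). The sketch below is a minimal standard-library version; the nested-list image format and the 8-bit default `max_value=255.0` are assumptions made for the example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for ref_px, res_px in zip(ref_row, res_row):
            squared_error += (ref_px - res_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the original; averaging `psnr` over the test set gives a single score for a trained model.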
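In some embodiments, as shown in FIG. 3, before the initial dynamic processing model and the initial control model are trained as a whole iteratively with training images using a reinforcement learning algorithm (step S21), the method may further include step S21': isolating the initial control model and training the initial dynamic processing model to obtain a trained dynamic processing model.
In this step, the initial control model is isolated and the initial dynamic processing model is trained on its own, using a preset optimizer, learning rate, and number of training epochs, which amounts to supervised pre-training of the super-resolution neural network.
Accordingly, using a reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole iteratively with training images (step S21) includes: using a reinforcement learning algorithm to train the trained dynamic processing model and the initial control model as a whole, iteratively, with training images. That is, the dynamic processing model is first trained on its own to ensure its accuracy, and then the initial control model and the trained dynamic processing model are trained as a whole to ensure the accuracy of the control model.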
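In some embodiments, the dynamic super-resolution model includes a dynamic depth super-resolution model, which includes a first dynamic processing model and a first control model. The first control model is configured to, after the first dynamic processing model has been executed at least once, control execution of the first dynamic processing model, or control execution of the processing modules or processing layers in the first dynamic processing model.
In some embodiments, the dynamic depth super-resolution model includes a dynamic loop super-resolution neural network model or a dynamic layer-skipping super-resolution neural network model.
FIG. 4 is a schematic structural diagram of the dynamic loop super-resolution neural network model. As shown in FIG. 4, the first dynamic processing model of this network includes an RNN (recurrent neural network) module and an upsampling (Upsample) module. A low-resolution (LR) image passes through structures such as feature extraction and encoding and reaches the RNN module. After one pass through the RNN module, the gate function (the first control model) decides whether another loop is needed: if so, the data passes through the RNN module again; if not, it leaves the RNN module for the remaining steps, and the upsampling module finally produces the super-resolution (SR) image.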
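The loop in FIG. 4 reduces to "run the recurrent module once, then ask the gate whether to run it again". The sketch below shows only that control flow; `recurrent_step` and `gate` are hypothetical callables standing in for the RNN module and the trained gate function, and `max_loops` is an added safety cap that the source does not mention.

```python
def run_dynamic_loop(features, recurrent_step, gate, max_loops=8):
    """Apply `recurrent_step` repeatedly until the gate says stop.

    The step always runs at least once; after each pass the gate decides
    whether another pass is needed. Returns the final features and the
    number of passes executed.
    """
    loops = 0
    while True:
        features = recurrent_step(features)
        loops += 1
        if loops >= max_loops or not gate(features, loops):
            break
    return features, loops
```

For a toy check, a step that halves a number and a gate that continues while the value exceeds 1 will run three passes on the input 8.0 and return 1.0.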
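FIG. 5 is a schematic structural diagram of the dynamic layer-skipping super-resolution neural network model. As shown in FIG. 5, the first dynamic processing model of this network includes multiple processing modules (such as residual blocks, RB), multiple processing layers (such as convolutional layers, Conv), and an upsampling (Upsample) module. The gate function (the first control model) acts on a processing module or processing layer and judges its importance: if the processing module or processing layer is important, the gate function outputs 1 and the data flow passes through it; otherwise the gate function outputs 0 and the data skips it directly. The data processed by the executed processing modules or processing layers then passes through the upsampling module to produce the super-resolution (SR) image.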
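The layer-skipping behaviour of FIG. 5 can be illustrated with a short sketch in which the gate returns 1 (execute) or 0 (skip) for each block. The `gate` callable and its `(x, index)` signature are assumptions for illustration; in the disclosure the gate is a trained classifier rather than a fixed rule.

```python
def run_with_skipping(x, blocks, gate):
    """Run `blocks` in order, bypassing every block for which the gate outputs 0.

    Returns the final value and the indices of the blocks that were executed.
    """
    executed = []
    for index, block in enumerate(blocks):
        if gate(x, index) == 1:
            x = block(x)          # gate outputs 1: the data flows through the block
            executed.append(index)
        # gate outputs 0: the data bypasses the block unchanged
    return x, executed
```

The list of executed indices makes it easy to measure how much computation a given input actually used.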
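In some embodiments, the dynamic super-resolution model includes a dynamic width super-resolution model, which includes a second dynamic processing model and a second control model; the second control model is configured to adjust the channels of each convolutional layer in the second dynamic processing model.
In some embodiments, the dynamic width super-resolution model includes a dynamic pruning super-resolution neural network model.
FIG. 6 is a schematic structural diagram of the dynamic pruning super-resolution neural network model. As shown in FIG. 6, the second dynamic processing model of this network includes multiple processing layers (such as convolutional layers, Conv) and an upsampling (Upsample) module. The gate function (the second control model) acts on all convolutional layers and applies dynamic convolution to all of them. When a low-resolution (LR) image has passed through a series of modules and reaches a convolutional layer, the gate function judges the importance of each channel in that layer: if a channel is important, the gate function outputs 1 and the channel is opened; if it is not, the gate function outputs 0 and the channel is closed. The data processed by the open channels of each convolutional layer is finally processed by the upsampling module to obtain the super-resolution (SR) image.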
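Channel gating can be shown on a single pixel of a 1x1 (pointwise) convolution, where each output channel is a dot product between its kernel and the input channels. The sketch is illustrative only: `threshold_gate` is a hypothetical hand-written scoring rule (kernel L1 norm times mean input magnitude) standing in for the trained gate function, and closed channels are simply emitted as 0.0.

```python
def threshold_gate(inputs, weights, threshold):
    """Hypothetical gate: open a channel when a cheap importance score is high.

    Score = (sum of absolute kernel weights) * (mean absolute input value).
    Returns a 0/1 mask with one entry per output channel.
    """
    mean_magnitude = sum(abs(v) for v in inputs) / len(inputs)
    return [
        1 if sum(abs(w) for w in kernel) * mean_magnitude > threshold else 0
        for kernel in weights
    ]


def gated_pointwise_conv(inputs, weights, mask):
    """Compute only the open output channels of a 1x1 convolution at one pixel."""
    outputs = []
    for kernel, is_open in zip(weights, mask):
        if is_open:
            outputs.append(sum(w * v for w, v in zip(kernel, inputs)))
        else:
            outputs.append(0.0)  # closed channel: its multiply-accumulates are skipped
    return outputs
```

The saving comes from the skipped dot products; in a real network the same mask would be applied across all spatial positions of the layer.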
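Both the dynamic layer-skipping super-resolution neural network model shown in FIG. 5 and the dynamic pruning super-resolution neural network model shown in FIG. 6 use a recurrent gate function, that is, all processing modules or processing layers are controlled by the same gate function; alternatively, a dedicated gate function may be designed for each processing module or processing layer.
To explain the solutions of the embodiments of the present disclosure clearly, the training processes of the dynamic depth super-resolution model and the dynamic width super-resolution model are described in detail below with illustrative examples.
1. Training of the dynamic depth super-resolution model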
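In the dynamic depth super-resolution model, each gating layer (that is, the first control model, the gate function) makes a series of discrete decisions. An estimated gating function is therefore constructed in the context of policy optimization using a reinforcement learning algorithm; it decides whether a processing module or processing layer in the first dynamic processing model is skipped. The estimated gating function is given by formula (1):
$\pi(x_i) = P(G_i(x_i) = g_i)$         (1)
where $x_i$ is the input; $g = [g_1, \ldots, g_N] \sim \pi_{F_\theta}$ denotes the gate functions controlling each processing module or processing layer;
[Figure PCTCN2022085007-appb-000001: formula image not available in this text]
is the network layers, including the gate functions, parameterized by $\theta$ and $g$; $\pi$ denotes the probability distribution of the gate decisions; $P(\cdot)$ denotes the probability function; and $G_i(\cdot)$ denotes the dynamic processing model, which processes the input $x_i$.
The overall objective function is set as:
[Figure PCTCN2022085007-appb-000002: formula image not available in this text]
where $R_i = (1 - g_i) C_i$, the constant $C_i$ represents the cost of executing $F_i$, and $R_i$ is the reward for the gate function skipping $F_i$. The first term of the formula,
[Figure PCTCN2022085007-appb-000003: formula image not available in this text]
is the training loss function, which includes but is not limited to L1, MSE, and GAN loss, and sums of one or more of them. $J(\theta)$ denotes the overall objective function, $E_x$ the expectation over $x$ (the input), $E_g$ the expectation over $g$, $N$ the dynamic range, and
[Figure PCTCN2022085007-appb-000004: formula image not available in this text]
a constant parameter that is set as required.
The gradient of the overall objective function is computed by formula (3):
[Figure PCTCN2022085007-appb-000005: formula image not available in this text]
where
[Figure PCTCN2022085007-appb-000006: formula image not available in this text]
denotes the supervised loss for learning classification accuracy, and
[Figure PCTCN2022085007-appb-000007: formula image not available in this text]
is the skip learning policy, finally learned with the reinforcement learning algorithm, that reflects the computation saved.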
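The skip reward $R_i = (1 - g_i) C_i$ stated above is simple to compute, and a gate trained by reinforcement learning is commonly updated with a score-function (REINFORCE) step. The sketch below shows both under explicit assumptions: the gate is modelled as a single Bernoulli decision with $P(g = 1) = \mathrm{sigmoid}(\text{logit})$, and the update is a generic policy-gradient step, not a reconstruction of the disclosure's formula (3), whose image is not available in this text.

```python
import math


def skip_rewards(gates, costs):
    """R_i = (1 - g_i) * C_i: a skipped block (g_i = 0) earns its execution cost."""
    return [(1 - g) * c for g, c in zip(gates, costs)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_update(logit, gate, ret, learning_rate):
    """One REINFORCE step for a Bernoulli gate with P(g = 1) = sigmoid(logit).

    For this parameterization, d/dlogit log P(g) = g - p, so the logit moves
    in the direction that makes the sampled decision more likely when the
    return `ret` is positive, and less likely when it is negative.
    """
    p = sigmoid(logit)
    return logit + learning_rate * ret * (gate - p)
```

For example, a gate that sampled "skip" (`gate=0`) and received a positive return has its logit lowered, so skipping becomes more likely the next time a similar input arrives.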
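2. Training of the dynamic width super-resolution model
Unlike the dynamic depth super-resolution model, the gate function (the second control model) in the dynamic width super-resolution model decides whether to skip channels within a convolutional layer, rather than whether to skip an entire processing module or convolutional layer; this approach is similar to model pruning. Training of the dynamic width super-resolution model mainly applies to the convolutional layers.
The convolutional layers are denoted $C_1, C_2, \ldots, C_m$, with channels $K_1, K_2, \ldots, K_m$ and channel counts $n_i$ ($i = 1, 2, \ldots, m$). The convolutional layers produce feature maps $F_1, F_2, \ldots, F_m$. Given a feature map $F_i$ ($i = 1, 2, \ldots, m$), the goal is to find and prune the redundant convolution channels in $K_{i+1}$, reducing computation while achieving the best possible performance.
Taking the $i$-th layer as an example, the objective function is expressed as formula (4):
[Figure PCTCN2022085007-appb-000008: formula image not available in this text]
where $L$ denotes the loss function; $L_{pnt}$ denotes the penalty term that trades off speed against accuracy; $h(F_i)$ generates the index list of selected channels from the input feature map; $K[\cdot]$ is the indexing operation on channels (the pruning unit); conv denotes the convolution operation; and $E_{F_i}$ denotes the expectation over the input features.
During model training, the initial model is randomly initialized, so its decisions are made at random. The super-resolution neural network is treated as the environment, and the gate function is trained with the rewards obtained by closing different convolutional-layer channels. When formula (4) converges as a whole, reinforcement learning is complete and training ends. Note that after overall training of the second control model and the second dynamic processing model, the second control model may also be fixed and the second dynamic processing model fine-tuned under the second control model's policy, so that the second dynamic processing model specializes in a particular task.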
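The image super-resolution method provided by the embodiments of the present disclosure is based on a dynamic neural network. According to the input processing parameter (such as the image magnification factor), it skips as many processing modules or processing layers of the dynamic neural network as possible while still meeting the image quality requirements, thereby increasing running speed and reducing computing cost while maintaining high representational power.
The embodiments of the present disclosure combine a dynamic-structure dynamic neural network with a super-resolution neural network. Unlike traditional neural networks, in which all input images are processed by the same model structure, the embodiments can modify their own network structure for different inputs, which improves running speed, saves computing resources, and improves user experience while maintaining high image restoration quality.
The embodiments of the present disclosure can be applied in mobile phone apps, image processing modules built into cameras, processing modules built into medical imaging equipment, and high-definition televisions, to restore old photos and old films. With an image alignment module added, black-and-white images can also be converted to color images.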
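Based on the same technical concept, the embodiments of the present disclosure further provide an image super-resolution device. As shown in FIG. 7, the image super-resolution device includes an acquisition module 101, a control and adjustment module 102, and an image processing module 103. The acquisition module 101 is configured to acquire a processing parameter and a first image to be processed, and to acquire a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model.
The control and adjustment module 102 is configured to adjust or control the dynamic processing model with the control model according to the processing parameter.
The image processing module 103 is configured to process the first image with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, where the resolution of the second image is higher than that of the first image.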
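In some embodiments, the dynamic super-resolution model is obtained by overall training of the initial dynamic processing model and the initial control model using a reinforcement learning algorithm. As shown in FIG. 8, the image super-resolution device further includes a model training module 104, and the model training module 104 is configured to: use a reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole, iteratively, with training images, where a training image is an image obtained by compressing an original image; and, in response to a preset convergence condition being satisfied, end the training and obtain the dynamic super-resolution model;
where the preset convergence condition includes at least one of the following:
a preset number of iterations has been trained;
the weighted sum of the loss function and the obtained reward is minimal, where the loss function is the loss between the image produced by the dynamic processing model and the original image corresponding to the respective training image, and the obtained reward is the reward obtained when the control model controls or adjusts the dynamic processing model.
In some embodiments, the model training module 104 is further configured to, before using a reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole iteratively with training images, isolate the initial control model and train the initial dynamic processing model to obtain a trained dynamic processing model.
The model training module 104 is configured to use a reinforcement learning algorithm to train the trained dynamic processing model and the initial control model as a whole, iteratively, with training images.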
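In some embodiments, the dynamic super-resolution model includes a dynamic depth super-resolution model, the dynamic depth super-resolution model includes a first dynamic processing model and a first control model, and the first control model is configured to, after the first dynamic processing model has been executed at least once, control execution of the first dynamic processing model, or control execution of the processing modules or processing layers in the first dynamic processing model.
In some embodiments, the dynamic depth super-resolution model includes a dynamic loop super-resolution neural network model or a dynamic layer-skipping super-resolution neural network model.
In some embodiments, the dynamic super-resolution model includes a dynamic width super-resolution model, the dynamic width super-resolution model includes a second dynamic processing model and a second control model, and the second control model is configured to adjust the channels of each convolutional layer in the second dynamic processing model.
In some embodiments, the dynamic width super-resolution model includes a dynamic pruning super-resolution neural network model.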
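The present disclosure further provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the image super-resolution method described above.
The present disclosure further provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the image super-resolution method described above.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the devices, may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present invention as set forth in the appended claims.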

Claims (10)

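  1. An image super-resolution method, including:
    acquiring a processing parameter and a first image to be processed;
    acquiring a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model;
    according to the processing parameter, adjusting or controlling the dynamic processing model with the control model, and processing the first image with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, where the resolution of the second image is higher than that of the first image.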
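  2. The method according to claim 1, where the dynamic super-resolution model is obtained by overall training of the initial dynamic processing model and the initial control model using a reinforcement learning algorithm, and using the reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole to obtain the dynamic super-resolution model includes:
    using the reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole, iteratively, with training images, where a training image is an image obtained by compressing an original image;
    in response to a preset convergence condition being satisfied, ending the training to obtain the dynamic super-resolution model;
    where the preset convergence condition includes at least one of the following:
    a preset number of iterations has been trained;
    the weighted sum of the loss function and the obtained reward is minimal, where the loss function is the loss between the image produced by the dynamic processing model and the original image corresponding to the respective training image, and the obtained reward is the reward obtained when the control model controls or adjusts the dynamic processing model.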
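  3. The method according to claim 2, where, before using the reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole iteratively with training images, the method further includes:
    isolating the initial control model and training the initial dynamic processing model to obtain a trained dynamic processing model;
    and using the reinforcement learning algorithm to train the initial dynamic processing model and the initial control model as a whole iteratively with training images includes:
    using the reinforcement learning algorithm to train the trained dynamic processing model and the initial control model as a whole, iteratively, with training images.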
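  4. The method according to any one of claims 1 to 3, where the dynamic super-resolution model includes a dynamic depth super-resolution model, the dynamic depth super-resolution model includes a first dynamic processing model and a first control model, and the first control model is configured to, after the first dynamic processing model has been executed at least once, control execution of the first dynamic processing model, or control execution of the processing modules or processing layers in the first dynamic processing model.
  5. The method according to claim 4, where the dynamic depth super-resolution model includes a dynamic loop super-resolution neural network model or a dynamic layer-skipping super-resolution neural network model.
  6. The method according to any one of claims 1 to 3, where the dynamic super-resolution model includes a dynamic width super-resolution model, the dynamic width super-resolution model includes a second dynamic processing model and a second control model, and the second control model is configured to adjust the channels of each convolutional layer in the second dynamic processing model.
  7. The method according to claim 6, where the dynamic width super-resolution model includes a dynamic pruning super-resolution neural network model.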
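  8. An image super-resolution device, including an acquisition module, a control and adjustment module, and an image processing module, where the acquisition module is configured to acquire a processing parameter and a first image to be processed, and to acquire a dynamic super-resolution model, where the dynamic super-resolution model includes a dynamic processing model and a control model, the control model is configured to control execution of the dynamic processing model or to adjust the structure of the dynamic processing model, and the dynamic super-resolution model is obtained by overall training of an initial dynamic processing model and an initial control model;
    the control and adjustment module is configured to adjust or control the dynamic processing model with the control model according to the processing parameter;
    the image processing module is configured to process the first image with the adjusted or controlled dynamic processing model to obtain a second image corresponding to the first image, where the resolution of the second image is higher than that of the first image.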
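  9. A computer device, including:
    one or more processors;
    a storage device on which one or more programs are stored;
    where, when the one or more programs are executed by the one or more processors, the one or more processors implement the image super-resolution method according to any one of claims 1 to 7.
  10. A computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the image super-resolution method according to any one of claims 1 to 7.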
PCT/CN2022/085007 2021-09-22 2022-04-02 Image super-resolution method and device, computer device, and readable medium WO2023045297A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111106460.9A CN115861045A (zh) 2021-09-22 2021-09-22 Image super-resolution method and device, computer device, and readable medium
CN202111106460.9 2021-09-22

Publications (1)

Publication Number Publication Date
WO2023045297A1 true WO2023045297A1 (zh) 2023-03-30

Family

ID=85652137

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085007 WO2023045297A1 (zh) 2021-09-22 2022-04-02 Image super-resolution method and device, computer device, and readable medium

Country Status (2)

Country Link
CN (1) CN115861045A (zh)
WO (1) WO2023045297A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116957917A (zh) * 2023-06-19 2023-10-27 广州极点三维信息科技有限公司 一种基于近端策略优化的图像美化方法及装置

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016110312A (ja) * 2014-12-04 2016-06-20 株式会社東芝 画像処理方法、画像処理装置及びプログラム
US20170024855A1 (en) * 2015-07-26 2017-01-26 Macau University Of Science And Technology Single Image Super-Resolution Method Using Transform-Invariant Directional Total Variation with S1/2+L1/2-norm
US20200043135A1 (en) * 2018-08-06 2020-02-06 Apple Inc. Blended neural network for super-resolution image processing
CN111192200A (zh) * 2020-01-02 2020-05-22 南京邮电大学 基于融合注意力机制残差网络的图像超分辨率重建方法
CN111640061A (zh) * 2020-05-12 2020-09-08 哈尔滨工业大学 一种自适应图像超分辨率系统
CN112488923A (zh) * 2020-12-10 2021-03-12 Oppo广东移动通信有限公司 图像超分辨率重建方法、装置、存储介质及电子设备
CN112508780A (zh) * 2019-09-16 2021-03-16 中移(苏州)软件技术有限公司 一种图像处理模型的训练方法、装置及存储介质
CN112991173A (zh) * 2021-03-12 2021-06-18 西安电子科技大学 基于双通道特征迁移网络的单帧图像超分辨率重建方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116957917A (zh) * 2023-06-19 2023-10-27 广州极点三维信息科技有限公司 一种基于近端策略优化的图像美化方法及装置
CN116957917B (zh) * 2023-06-19 2024-03-15 广州极点三维信息科技有限公司 一种基于近端策略优化的图像美化方法及装置

Also Published As

Publication number Publication date
CN115861045A (zh) 2023-03-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22871361

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE