WO2018086354A1 - Image upscaling system, training method thereof, and image upscaling method - Google Patents

Image upscaling system, training method thereof, and image upscaling method

Info

Publication number
WO2018086354A1
WO2018086354A1 PCT/CN2017/089742 CN2017089742W WO2018086354A1 WO 2018086354 A1 WO2018086354 A1 WO 2018086354A1 CN 2017089742 W CN2017089742 W CN 2017089742W WO 2018086354 A1 WO2018086354 A1 WO 2018086354A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
neural network
convolutional neural
upscaling
upscaling system
Prior art date
Application number
PCT/CN2017/089742
Other languages
English (en)
French (fr)
Inventor
那彦波
张丽杰
李晓宇
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司
Priority to US15/741,781 (US10311547B2)
Publication of WO2018086354A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4007 Interpolation-based scaling, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution

Definitions

  • The present invention relates to image processing and display technology, and more particularly to an image upscaling system, a training method thereof, a display device, and an image upscaling method.
  • Image upscaling refers to increasing the resolution of an original image by means of image processing.
  • An image upscaling method may be based on interpolation, such as bicubic interpolation, or on learning, such as constructing a neural-network-based machine learning model for image upscaling.
  • Convolutional neural networks have been widely used in image processing for image recognition, image classification, and image upscaling.
  • A convolutional neural network is a common deep learning architecture that typically includes convolutional layers and pooling layers.
  • A convolutional layer is mainly used to extract features of the input data, and a pooling layer can use average pooling or max pooling to reduce the dimension of the features.
  • A typical cost function for parameter optimization uses the mean square error or a similar average error, which tends to make high-resolution images reconstructed from low-resolution images look unrealistic.
  • Embodiments of the present invention provide an image upscaling system, a training method of the image upscaling system, a display device including the image upscaling system, and a method of upscaling an image using the image upscaling system.
  • An image upscaling system includes at least two convolutional neural network modules and at least one recombiner, wherein the convolutional neural network modules and the recombiners are alternately connected to one another.
  • The first convolutional neural network module of the at least two convolutional neural network modules is configured to receive an input image and a supplemental image having the same resolution as the input image, to generate a first number of feature images based on the input image and that supplemental image, and to output them to the next recombiner connected to it.
  • Each other convolutional neural network module of the at least two convolutional neural network modules is configured to receive an output image from the previous recombiner and a supplemental image having the same resolution as the received output image, to generate a second number of feature images based on the output image and that supplemental image, and to output them to the next recombiner connected to it or as the output of the image upscaling system.
  • The recombiner is configured to combine every n*n feature images among the received feature images into one feature image, and to output the resulting third number of feature images to the next convolutional neural network module connected to it or as the output of the image upscaling system.
  • n denotes the upscaling ratio of the recombiner and is an integer greater than 1.
  • The number of feature images received by the recombiner is a multiple of n*n.
  • The supplemental image is an image with a fixed distribution and white noise.
  • The recombiners have the same upscaling ratio.
  • The upscaling ratio of the recombiner is a multiple of 2.
  • The recombiner is an adaptive interpolation filter.
  • A display device includes the image upscaling system described above.
  • A method for training the image upscaling system described above is provided.
  • A first training set is constructed that includes an original image and at least one downscaled image of the original image, wherein the resolution of the downscaled image is lower than that of the original image.
  • A second training set is constructed that includes the original image, a magnification factor, and a first degraded image of the original image based on the magnification factor, wherein the resolution of the first degraded image is the same as that of the original image.
  • Using the second training set, a convolutional neural network system is trained with the original image and the first degraded image as inputs and the magnification factor as the output.
  • The parameters of the image upscaling system are obtained using the trained convolutional neural network system and the first training set. Then, based on the image upscaling system with the obtained parameters, a new training set is constructed that includes the original image, a magnification factor, and a second degraded image of the original image based on the magnification factor, wherein the resolution of the second degraded image is the same as that of the original image.
  • Using the newly constructed training set, the convolutional neural network system is trained again with the original image and the second degraded image as inputs and the magnification factor as the output.
  • The parameters of the image upscaling system are then obtained again using the trained convolutional neural network system and the first training set.
  • The construction of the new training set, the training of the convolutional neural network system, and the acquisition of the parameters of the image upscaling system described above are performed repeatedly.
  • The downscaled image can be obtained by downsampling the original image.
  • The first degraded image may be obtained by downsampling the original image using a magnification factor and then upsampling the downsampled image using the same magnification factor.
  • The downsampling uses bicubic downsampling and the upsampling uses bicubic upsampling.
  • The convolutional neural network system is trained using stochastic gradient descent so that the parameters of the convolutional neural network satisfy
  • $\theta_{opt} = \arg_{\theta} \min_{X} \left( f - D_{\theta}\left(X, \mathrm{Down}_f(\mathrm{Up}_f(X))\right) \right)$
  • where $\theta_{opt}$ denotes the parameters of the convolutional neural network,
  • $f$ denotes the magnification factor, and
  • $D_{\theta}(X, \mathrm{Down}_f(\mathrm{Up}_f(X)))$ denotes the magnification factor estimated by the convolutional neural network based on the original image $X$ and the first or second degraded image $\mathrm{Down}_f(\mathrm{Up}_f(X))$.
  • The parameters of the image upscaling system are obtained using stochastic gradient descent so that they satisfy the formula published as Figure PCTCN2017089742-appb-000001, in which
  • $\alpha_{opt}$ denotes the parameters of the image upscaling system,
  • $D_{\theta}(X, HR_k)$ denotes the magnification factor estimated by the convolutional neural network based on the original image $HR_k$ and the image $X$ obtained by the image upscaling system, and
  • "||·||" denotes the norm operation.
  • The second degraded image may be obtained by downsampling the original image using a magnification factor and then upscaling the downsampled image by the same magnification factor with the trained image upscaling system.
  • The value of the magnification factor differs between training sets.
  • In the first training set, the original image may be divided into a plurality of image blocks having a first size.
  • In the second training set and the new training set, the original image may be divided into a plurality of image blocks having a second size.
  • In the upscaling method, a convolutional neural network module generates a first number of feature images based on the received input image and a supplemental image having the same resolution as the input image, and outputs them to a recombiner.
  • The recombiner combines every n*n feature images among the received feature images into one feature image, and outputs the combined feature images to the next convolutional neural network module.
  • The next convolutional neural network module generates and outputs a second number of feature images based on the feature images output by the recombiner and a supplemental image having the same resolution as the received feature images.
  • n denotes the upscaling ratio of the recombiner and is an integer greater than 1.
  • The number of feature images received by the recombiner is a multiple of n*n.
  • The image upscaling system according to an embodiment of the present invention can obtain a realistic high-resolution image by adding detail that a low-resolution image lacks.
  • The image upscaling system according to an embodiment of the present invention can achieve different upscaling ratios, thereby obtaining output images with different resolutions.
  • Compared with conventional training methods that use a cost function based on the mean square error or the like, the training method of the image upscaling system according to an embodiment of the present invention can optimize the parameters of the image upscaling system so that the random input of the image upscaling system helps produce realistic results.
  • FIG. 1 is a schematic structural diagram of an image upscaling system according to an embodiment of the present invention.
  • FIGS. 2a to 2c are schematic diagrams showing specific examples of an image upscaling system provided by an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an example of a convolutional neural network module in the image upscaling system shown in FIG. 1;
  • FIG. 4 is a schematic view for explaining an up-conversion process of a recombiner in the image upscaling system shown in FIG. 1;
  • FIG. 5 is a schematic flowchart of a method for training an image upscaling system as shown in FIG. 1 according to an embodiment of the present invention
  • FIG. 6 is a schematic structural diagram of a convolutional neural network system for training
  • FIG. 7 is a schematic flow diagram of a method of upconverting an image using the image upscaling system of FIG. 1 in accordance with an embodiment of the present invention.
  • FIG. 1 shows a block diagram of an image upscaling system 100 in accordance with an embodiment of the present invention.
  • The convolutional neural network modules CN1, CN2, ... CNN and the recombiners M1, M2, ... MM are alternately connected to one another. Therefore, a recombiner is connected between every two adjacent convolutional neural network modules. Further, when there are several recombiners, a convolutional neural network module is connected between every two adjacent recombiners.
  • The convolutional neural network module CN1 (corresponding to the "first convolutional neural network module") can receive the input image 1x and the supplemental image z1.
  • The supplemental image z1 has the same resolution as the input image and can be used to reconstruct new features that are missing in the low-resolution image.
  • The convolutional neural network module CN1 generates a first number of feature images based on the received input image 1x and the supplemental image z1, and outputs them to the next recombiner M1 connected to it.
  • The other convolutional neural network modules CN2, ... CNN may receive the output images of the previous recombiners M1, M2, ... MM connected to them and the respective supplemental images z2, ... zN.
  • The supplemental images z2, z3, ... zN each have the same resolution as the output image of the corresponding recombiner.
  • Each of these convolutional neural network modules generates a second number of feature images based on the received output image and supplemental image, and outputs them to the next recombiner connected to it or as the output of the image upscaling system 100.
  • The recombiners M1, M2, ... MM can receive the plurality of feature images output by the previous convolutional neural network modules CN1, CN2, ... CN(N-1) and combine every n*n of the received feature images into one feature image, thereby obtaining a third number of feature images whose resolution is magnified n times.
  • The recombiners M1, M2, ... MM output the combined third number of feature images to the next convolutional neural network modules CN2, CN3, ... CNN connected to them, or as the output of the image upscaling system 100.
  • n denotes the upscaling ratio of the recombiner and is an integer greater than 1, and the number of feature images received by the recombiner is a multiple of n*n.
  • The input to the image upscaling system 100 shown in FIG. 1 includes an input image 1x and N supplemental images z1, z2, ... zN; the output may be the feature images from a recombiner or from a convolutional neural network module.
  • Each convolutional neural network module can generate feature images based on the received input image, or the output image of a recombiner, and the corresponding supplemental image. Since the supplemental image can be used to reconstruct features missing in the low-resolution image, the generated feature images contain more detail than the original image, and this detail can appear in the upscaled image.
  • The feature images are upscaled by the recombiners; that is, each recombiner with an upscaling ratio of n magnifies the image resolution n times. Therefore, the image upscaling system 100 can obtain images with different resolutions.
  • The supplemental image is a feature input to each convolutional neural network structure, and may be an image having a fixed distribution and white noise.
  • The fixed distribution may be, for example, a uniform distribution or a Gaussian distribution.
  • The supplemental image may be an image related to, for example, texture.
  • The supplemental image may be an image related to, for example, an object.
  • Multiple recombiners may have the same upscaling ratio. If the image upscaling system includes k recombiners, each with an upscaling ratio of n, the system can increase the image resolution by a factor of n^k (for example, 4x with two 2x recombiners). Further, the upscaling ratio of a recombiner may be a multiple of 2.
  • The image upscaling system shown in FIG. 2a may include two convolutional neural network modules CN1, CN2 and a recombiner M1.
  • The recombiner M1 is connected between the convolutional neural network modules CN1 and CN2, and its upscaling ratio is 2x. Therefore, the output of the image upscaling system is an image with 2x the resolution.
  • The image output by the convolutional neural network module CN2 has a higher image quality than the image output by the recombiner M1 because a supplemental image has been added.
  • The image upscaling system shown in FIG. 2b comprises three convolutional neural network modules CN1, CN2, CN3 and two recombiners M1 and M2, each with an upscaling ratio of 2x. Therefore, the image upscaling system can output images with 2x and 4x resolution.
  • The image upscaling system shown in FIG. 2c comprises four convolutional neural network modules CN1, CN2, CN3, CN4 and three recombiners M1, M2 and M3, each with an upscaling ratio of 2x. Therefore, the image upscaling system can output images with 2x, 4x, and 8x resolution.
  • FIG. 3 shows a schematic structural diagram of an example of a convolutional neural network module CN in the image upscaling system 100 shown in FIG. 1.
  • The convolutional neural network module CN is a convolutional neural network structure that takes images as inputs and outputs. It may include multiple convolutional layers, and each convolutional layer may include multiple filters.
  • The exemplary convolutional neural network structure shown in FIG. 3 includes two convolutional layers.
  • The input of the convolutional neural network structure is four images; three feature images are generated after passing through the filters of the first convolutional layer, and two feature images are then generated after passing through the filters of the second convolutional layer and are output.
  • A filter may have, for example, a 3×3 or 5×5 kernel, with weights $w_{ij}^{k}$, where
  • k denotes the index of the convolutional layer,
  • i denotes the index of the input image, and
  • j denotes the index of the output image.
  • The bias is an increment added to the convolution output.
  • The parameters of the convolutional neural network structure are obtained by training the structure on a set of sample input and output images. The training of the convolutional neural network structure is described in detail later.
  • A recombiner M with an upscaling ratio of n can combine n*n feature images into one feature image so that the resolution of the image is magnified n times. Therefore, the recombiner M essentially corresponds to an adaptive interpolation filter.
  • FIG. 4 is a schematic diagram explaining the upscaling process of the recombiner M in the image upscaling system 100 shown in FIG. 1, in which the upscaling ratio of the recombiner M is 2 and the recombiner is denoted "2x" in the figure.
  • The recombiner M divides the input feature images into groups of four, such as the feature image 4n, the feature image 4n+1, the feature image 4n+2, and the feature image 4n+3.
  • Each group of four feature images is then combined.
  • The pixel values at the same position in the four feature images are arranged in a matrix to generate one feature image with 4 times as many pixels.
  • The pixel information in the feature images is not modified (neither added nor lost) during image upscaling.
  • the image upscaling system may be implemented using hardware, software, or a combination of hardware and software.
  • An embodiment of the present invention provides a new training method in which a new system (hereinafter referred to as the "authentication system") is trained to serve as the objective function of the image upscaling system.
  • The authentication system takes two images with the same resolution as input: one is the original high-quality image, and the other is a degraded image of the original high-quality image, obtained by first downsampling the high-quality image using a magnification factor and then upsampling the downsampled image back to the original resolution.
  • The output of the authentication system is a prediction of the magnification factor.
  • The authentication system can be implemented using a convolutional neural network system.
  • The authentication system and the image upscaling system can be trained alternately.
  • First, the authentication system learns from a standard upscaler (e.g., a bicubic upscaler).
  • The image upscaling system then minimizes the magnification factor estimated by the authentication system.
  • Next, the authentication system learns from the newly improved image upscaling system.
  • The image upscaling system then again minimizes the magnification factor estimated by the newly improved authentication system.
  • The training method of an embodiment of the present invention thus lets the authentication system and the image upscaling system act as "adversarial" networks, each improving on the basis of the other's better results.
  • The training method of an embodiment of the present invention uses the magnification factor predicted by the authentication system as the cost function for optimizing the parameters of the image upscaling system, which allows the input supplemental images to help produce a more realistic result than existing training methods.
  • The magnification factor estimated by the authentication system also reflects the performance of the image upscaling system.
  • FIG. 5 shows a schematic flow chart of a method for training an image upscaling system as shown in FIG. 1 in accordance with an embodiment of the present invention.
  • the parameters of the authentication system and the parameters of the image upscaling system are obtained by alternately performing optimization on the authentication system and the image upscaling system.
  • the authentication system can employ a convolutional neural network system.
  • The first training set may include the original image HRN(k) and at least one downscaled image HR0(k), HR1(k), ..., HRN-1(k) of the original image HRN(k).
  • A downscaled image is an image with a lower resolution than the original image. For example, if the original image has a resolution of 8x, the resolutions of the downscaled images may be 4x, 2x, and 1x.
  • The downscaled image can be obtained by applying standard downsampling to the original image, for example bicubic downsampling.
  • There may be one or more original images, i.e. k is a positive integer. Further, the original image may be divided into a plurality of image blocks having a first size.
  • The second training set may include an original image HRN(k), a magnification factor fk, and a first degraded image Y(k) of the original image HRN(k) based on the magnification factor fk.
  • The first degraded image has the same resolution as the original image.
  • The first degraded image can be obtained by first downsampling the original image using a magnification factor and then upsampling the downsampled image using the same magnification factor.
  • Downsampling and upsampling can use standard algorithms; for example, downsampling can use bicubic downsampling and upsampling can use bicubic upsampling.
  • The magnification factor can be a floating-point number and can be randomly generated.
  • The original image may be divided into a plurality of image blocks having a second size.
  • The second size is different from the first size.
  • The convolutional neural network system has two inputs: an original image X and a degraded image Y of the original image X.
  • The output of the convolutional neural network system is a prediction of the magnification factor f, expressed as $D_{\theta}(X, Y)$, where $\theta$ represents all parameters of the convolutional neural network system, including the parameters of the convolutional neural network modules CNk and of the fully connected network FCN.
  • Stochastic gradient descent can be employed to train the convolutional neural network system.
  • First, the parameters of the convolutional neural network system are initialized.
  • The magnification factor is used as the output of the convolutional neural network system, and the parameters of the convolutional neural network system are adjusted so that they satisfy the following formula:
  • $\theta_{opt} = \arg_{\theta} \min_{X} \left( f - D_{\theta}\left(X, \mathrm{Down}_f(\mathrm{Up}_f(X))\right) \right) \quad (1)$
  • Equation (1) indicates that the parameters of the convolutional neural network system are those that minimize the difference between the true magnification factor and the estimated magnification factor.
  • After the parameters of the convolutional neural network system are obtained in step S530, the parameters of the image upscaling system are obtained in step S540 using the trained convolutional neural network system and the first training set B constructed in step S510.
  • Stochastic gradient descent may be employed to obtain the parameters of the image upscaling system.
  • The upscaled image is obtained by the image upscaling system.
  • The obtained upscaled image and the original image are taken as inputs, and the corresponding magnification factor is estimated. The parameters of the image upscaling system are adjusted so that they satisfy the following formula:
  • Equation (2) indicates that the parameters of the image upscaling system are those for which the output of the image upscaling system, given its input, yields the minimum output of the convolutional neural network system.
  • A new training set A1 = {HRN(k), fk', Y'(k)} is constructed.
  • The new training set may include an original image HRN(k), a magnification factor fk', and a second degraded image Y'(k) of the original image HRN(k) based on the magnification factor fk'.
  • The second degraded image also has the same resolution as the original image.
  • The second degraded image can be obtained by first downsampling the original image using a magnification factor, and then upsampling the downsampled image by the same magnification factor using the trained image upscaling system and the standard upsampling method used in step S520.
  • downsampling can use bicubic downsampling
  • upsampling can use bicubic upsampling.
  • the magnification factor can be a floating point number and can be randomly generated.
  • In step S560, the convolutional neural network system is trained using the new training set A1 created in step S550, with the original image and the second degraded image as inputs and the magnification factor as the output.
  • The training method in this step is the same as the training method in step S530.
  • Through step S560, the parameters of the convolutional neural network can be obtained again.
  • In step S570, the convolutional neural network trained in step S560 and the first training set B are used to obtain the parameters of the image upscaling system again.
  • The training method in this step is the same as the training method in step S540.
  • The predetermined condition may be a predetermined number of iterations or a condition that the parameters of the image upscaling system need to satisfy. If it is not satisfied, steps S550 to S570 above are executed again. If it is satisfied, the training ends.
  • FIG. 7 illustrates a method of up-converting an image using an image upscaling system of an embodiment of the present invention.
  • The convolutional neural network module generates a first number of feature images based on the received input image and a supplemental image having the same resolution as the input image, and outputs them to the recombiner.
  • The recombiner combines every n*n feature images among the received feature images into one feature image, and outputs the combined feature images to the next convolutional neural network module.
  • In step S730, the next convolutional neural network module generates a second number of feature images based on the feature images output by the recombiner and a supplemental image having the same resolution as the received feature images.
  • output images with different resolutions can be obtained.
  • Embodiments of the present invention also provide a display device including an image upscaling system in accordance with an embodiment of the present invention.
  • the display device can be, for example, a display, a mobile phone, a laptop computer, a tablet computer, a television, a digital photo frame, a wearable device, a navigation device, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)

Abstract

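An image upscaling system, a training method thereof, and an image upscaling method. The image upscaling system may include at least two convolutional neural network modules and at least one recombiner. The convolutional neural network modules and the recombiners are alternately connected to one another. The first convolutional neural network module may receive an input image and a corresponding supplemental image, generate a first number of feature images, and output them to the next recombiner connected to it. Each other convolutional neural network module may receive the output image from the previous recombiner and a corresponding supplemental image, generate a second number of feature images, and output them to the next recombiner connected to it or as the output of the image upscaling system. A recombiner may combine every n*n feature images among the received feature images into one feature image, and output the resulting third number of feature images to the next convolutional neural network module connected to it or as the output of the image upscaling system.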

Description

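Image upscaling system, training method thereof, and image upscaling method
Cross-reference
This application claims priority to Chinese Patent Application No. 201610984239.6, filed on November 9, 2016 and entitled "Image upscaling system, training method thereof, and image upscaling method", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to image processing and display technology, and more particularly to an image upscaling system, a training method thereof, a display device, and an image upscaling method.
Background
In general, image upscaling refers to increasing the resolution of an original image by means of image processing. At present, an image upscaling method may be based on interpolation, such as bicubic interpolation, or on learning, such as constructing a neural-network-based machine learning model for image upscaling.
Convolutional neural networks are now widely used in image processing for image recognition, image classification, image upscaling, and the like. A convolutional neural network is a common deep learning architecture that typically includes convolutional layers and pooling layers. A convolutional layer is mainly used to extract features of the input data, while a pooling layer can use average pooling or max pooling to reduce the dimension of the features.
Because a low-resolution image has lost high-frequency information relative to a high-resolution image, that information must be supplied when a low-resolution image is upscaled into a high-resolution image. However, existing image upscaling techniques cannot reconstruct this high-frequency information. A typical cost function for parameter optimization uses the mean square error or a similar average error, which tends to make high-resolution images reconstructed from low-resolution images look unrealistic.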
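Summary
Embodiments of the present invention provide an image upscaling system, a training method of the image upscaling system, a display device including the image upscaling system, and a method of upscaling an image using the image upscaling system.
According to a first aspect of the present invention, an image upscaling system is provided. The image upscaling system includes at least two convolutional neural network modules and at least one recombiner, wherein the convolutional neural network modules and the recombiners are alternately connected to one another. The first convolutional neural network module of the at least two convolutional neural network modules is configured to receive an input image and a supplemental image having the same resolution as the input image, to generate a first number of feature images based on the input image and that supplemental image, and to output them to the next recombiner connected to it. Each other convolutional neural network module of the at least two convolutional neural network modules is configured to receive an output image from the previous recombiner and a supplemental image having the same resolution as the received output image, to generate a second number of feature images based on the output image and that supplemental image, and to output them to the next recombiner connected to it or as the output of the image upscaling system. The recombiner is configured to combine every n*n feature images among the received feature images into one feature image, and to output the resulting third number of feature images to the next convolutional neural network module connected to it or as the output of the image upscaling system. In this image upscaling system, n denotes the upscaling ratio of the recombiner and is an integer greater than 1, and the number of feature images received by the recombiner is a multiple of n*n.
In an embodiment of the present invention, the supplemental image is an image with a fixed distribution and white noise.
In an embodiment of the present invention, the recombiners have the same upscaling ratio.
In an embodiment of the present invention, the upscaling ratio of the recombiner is a multiple of 2.
In an embodiment of the present invention, the recombiner is an adaptive interpolation filter.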
根据本发明的第二个方面,提供了一种显示装置,包括上述的图像升频系统。
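According to a third aspect of the present invention, a method for training the image upscaling system described above is provided. In this method, a first training set is constructed, which includes an original image and at least one downscaled image of the original image, wherein the resolution of the downscaled image is lower than that of the original image. Next, a second training set is constructed, which includes the original image, a scaling factor, and a first degraded image of the original image based on the scaling factor, wherein the resolution of the first degraded image is the same as that of the original image. Then, using the second training set, a convolutional neural network system is trained with the original image and the first degraded image as inputs and the scaling factor as output. The parameters of the image upscaling system are obtained using the trained convolutional neural network system and the first training set. Then, based on the image upscaling system having the obtained parameters, a new training set is constructed, which includes the original image, a scaling factor, and a second degraded image of the original image based on the scaling factor, wherein the resolution of the second degraded image is the same as that of the original image. Using the newly constructed training set, the convolutional neural network system is trained again with the original image and the second degraded image as inputs and the scaling factor as output. Then, the parameters of the image upscaling system are obtained again using the trained convolutional neural network system and the first training set. The construction of the new training set, the training of the convolutional neural network system, and the obtaining of the parameters of the image upscaling system are performed repeatedly.
In an embodiment of the present invention, it is further checked whether the parameters of the image upscaling system satisfy a predetermined condition. In response to the parameters satisfying the predetermined condition, the training of the image upscaling system is stopped; in response to the parameters not satisfying the predetermined condition, the training of the image upscaling system continues.
In an embodiment of the present invention, the downscaled image may be obtained by downsampling the original image.
In an embodiment of the present invention, the first degraded image may be obtained by downsampling the original image using the scaling factor and then upsampling the downsampled image using the same scaling factor.
In an embodiment of the present invention, the downsampling uses bicubic downsampling and the upsampling uses bicubic upsampling.
In an embodiment of the present invention, the convolutional neural network system is trained by stochastic gradient descent so that the parameters of the convolutional neural network satisfy
$$\theta_{opt} = \arg_{\theta} \min_{X} \left( f - D_{\theta}\left(X, \mathrm{Down}_f(\mathrm{UP}_f(X))\right) \right)$$
where $\theta_{opt}$ denotes the parameters of the convolutional neural network, $f$ denotes the scaling factor, and $D_{\theta}(X, \mathrm{Down}_f(\mathrm{UP}_f(X)))$ denotes the scaling factor estimated by the convolutional neural network based on the original image $X$ and the first degraded image or the second degraded image $\mathrm{Down}_f(\mathrm{UP}_f(X))$.
In an embodiment of the present invention, the parameters of the image upscaling system are obtained by stochastic gradient descent, wherein the parameters of the image upscaling system satisfy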
Figure PCTCN2017089742-appb-000001
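where $\alpha_{opt}$ denotes the parameters of the image upscaling system, $D_{\theta}(X, HR_k)$ denotes the scaling factor estimated by the convolutional neural network based on the original image $HR_k$ and the image $X$ obtained by the image upscaling system, and "||·||" denotes the norm operation.
In an embodiment of the present invention, the second degraded image may be obtained by downsampling the original image using the scaling factor and then upscaling the downsampled image by the scaling factor using the trained image upscaling system.
In an embodiment of the present invention, the value of the scaling factor differs between different training sets.
In an embodiment of the present invention, in the first training set the original image may be divided into a plurality of image blocks having a first size. In the second training set and the new training set, the original image may be divided into a plurality of image blocks having a second size.
According to a fourth aspect of the present invention, a method for upscaling an image using the image upscaling system described above is provided. In this method, a convolutional neural network module generates a first number of feature images based on a received input image and a supplemental image having the same resolution as the input image, and outputs them to a synthesizer. The synthesizer combines every n*n feature images among the received feature images into one feature image and outputs the resulting feature images to the next convolutional neural network module. The next convolutional neural network module generates and outputs a second number of feature images based on the feature images output by the synthesizer and a supplemental image having the same resolution as the received feature images. In this method, n denotes the upscaling ratio of the synthesizer and is an integer greater than 1, and the number of feature images received by the synthesizer is a multiple of n*n.
By adding the detail information that a low-resolution image lacks, the image upscaling system according to embodiments of the present invention can obtain realistic high-resolution images. In addition, the image upscaling system according to embodiments of the present invention can implement different upscaling ratios and thereby obtain output images with different resolutions. Compared with conventional training methods that use cost functions based on the mean squared error or the like, the training method of the embodiments of the present invention can optimize the parameters of the image upscaling system so that the random inputs of the image upscaling system help produce realistic results.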
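Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings of the embodiments are briefly introduced below. It should be understood that the drawings described below relate only to some embodiments of the present invention and do not limit the present invention, in which:
Fig. 1 is a schematic structural diagram of an image upscaling system according to an embodiment of the present invention;
Figs. 2a to 2c are schematic diagrams of specific examples of the image upscaling system provided by embodiments of the present invention;
Fig. 3 is a schematic structural diagram of an example of a convolutional neural network module in the image upscaling system shown in Fig. 1;
Fig. 4 is a schematic diagram illustrating the upscaling processing of the synthesizer in the image upscaling system shown in Fig. 1;
Fig. 5 is a schematic flowchart of a method for training the image upscaling system shown in Fig. 1 according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the convolutional neural network system used for training;
Fig. 7 is a schematic flowchart of a method for upscaling an image using the image upscaling system shown in Fig. 1 according to an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments without creative effort also fall within the scope of protection of the present invention.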
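Fig. 1 shows a schematic structural diagram of an image upscaling system 100 according to an embodiment of the present invention. As shown in Fig. 1, the image upscaling system 100 may include a plurality of convolutional neural network modules CN1, CN2, … CNN (collectively denoted CN) and at least one synthesizer M1, M2, … MM (collectively denoted M), where M = N-1. The convolutional neural network modules CN1, CN2, … CNN and the synthesizers M1, M2, … MM are connected alternately with each other. Thus, one synthesizer is connected between every two adjacent convolutional neural network modules. Further, when there are multiple synthesizers, one convolutional neural network module is connected between every two adjacent synthesizers.
The convolutional neural network module CN1 (corresponding to the "first convolutional neural network module") may receive an input image 1x and a supplemental image z1. The supplemental image z1 has the same resolution as the input image and can be used to reconstruct new features that are missing from the low-resolution image. Based on the received input image 1x and supplemental image z1, the convolutional neural network module CN1 generates a first number of feature images and outputs them to the next synthesizer M1 connected to it.
The other convolutional neural network modules CN2, … CNN may receive the output images from the preceding synthesizers M1, M2, … MM connected to them and their respective supplemental images z2, … zN. Each of the supplemental images z2, z3, … zN has the same resolution as the output image of the corresponding synthesizer. Based on the received output image and supplemental image, each convolutional neural network module generates a second number of feature images and outputs them either to the next synthesizer connected to it or as the output of the image upscaling system 100.
The synthesizers M1, M2, … MM may receive the plurality of feature images output by the preceding convolutional neural network modules CN1, CN2, … CNN-1 connected to them, and combine every n*n feature images among the received feature images into one feature image, thereby obtaining a third number of feature images whose resolution is enlarged n times. The synthesizers M1, M2, … MM output the resulting third number of feature images either to the next convolutional neural network modules CN2, CN3, … CNN connected to them or as the output of the image upscaling system 100. In this embodiment, n denotes the upscaling ratio of the synthesizer and is an integer greater than 1, and the number of feature images received by the synthesizer is a multiple of n*n.
Thus, the input of the image upscaling system 100 shown in Fig. 1 includes the input image 1x and N supplemental images z1, z2, … zN, and the output may be the feature images output by a synthesizer or the feature images output by a convolutional neural network module other than CN1. In the image upscaling system 100 of this embodiment, each convolutional neural network module can generate feature images from the received input image, or the output image of a synthesizer, together with the corresponding supplemental image. Because the supplemental image can be used to reconstruct features missing from the low-resolution image, the generated feature images contain more detail than the original image, and this detail can then be reproduced in the upscaled image. The feature images are upscaled by the synthesizers; that is, each pass through a synthesizer with upscaling ratio n enlarges the image resolution n times. The image upscaling system 100 can therefore obtain images with different resolutions.
In this embodiment, the supplemental image serves as a feature input to each convolutional neural network structure and may be an image having a fixed distribution and white noise. The fixed distribution may be, for example, a uniform distribution or a Gaussian distribution. In addition, for a low image magnification, for example 2x, the supplemental image may be an image related to, for example, texture. For a high image magnification, for example 16x, the supplemental image may be an image related to, for example, objects.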
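As a minimal sketch of how a supplemental image of white noise with a fixed distribution could be produced, the following Python uses only the standard library. The function name `make_supplemental_image`, the unit-variance Gaussian, and the uniform range [-1, 1] are assumptions made for illustration; the patent does not specify the distribution parameters.

```python
import random


def make_supplemental_image(height, width, distribution="gaussian", seed=None):
    """Return a height x width white-noise image as a list of pixel rows."""
    rng = random.Random(seed)  # a private generator keeps the example reproducible
    if distribution == "gaussian":
        draw = lambda: rng.gauss(0.0, 1.0)      # zero mean, unit variance (assumed)
    elif distribution == "uniform":
        draw = lambda: rng.uniform(-1.0, 1.0)   # symmetric range (assumed)
    else:
        raise ValueError("unsupported distribution: " + distribution)
    # Every pixel is drawn independently, which is what makes the noise white.
    return [[draw() for _ in range(width)] for _ in range(height)]


# z1 must match the resolution of the input image, here 4 rows by 6 columns.
z1 = make_supplemental_image(4, 6, distribution="uniform", seed=0)
```

The only constraint taken from the text is that the supplemental image has the same resolution as the image it accompanies, so a fresh image of the enlarged size would be generated for each later module.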
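In embodiments of the present invention, the multiple synthesizers may have the same upscaling ratio. If the image upscaling system includes k synthesizers, each with upscaling ratio n, the image upscaling system can increase the resolution of the image by a factor of n^k (for example, three 2x synthesizers give 8x). Further, the upscaling ratio of the synthesizer may be a multiple of 2.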
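Figs. 2a to 2c show specific examples of the image upscaling system provided by embodiments of the present invention. As shown in Fig. 2a, the image upscaling system of this example may include two convolutional neural network modules CN1 and CN2 and one synthesizer M1. The synthesizer M1 is connected between the convolutional neural network modules CN1 and CN2 and has an upscaling ratio of 2x. The output of the image upscaling system is therefore an image whose resolution is increased by 2x. Compared with the image output by the synthesizer M1, the image output by the convolutional neural network module CN2 has higher image quality owing to the addition of the supplemental image z2. The image upscaling system shown in Fig. 2b includes three convolutional neural network modules CN1, CN2, CN3 and two synthesizers M1 and M2 with an upscaling ratio of 2x. The image upscaling system can therefore output images whose resolution is increased by 2x and by 4x. The image upscaling system shown in Fig. 2c includes four convolutional neural network modules CN1, CN2, CN3, CN4 and three synthesizers M1, M2, and M3 with an upscaling ratio of 2x. The image upscaling system can therefore output images whose resolution is increased by 2x, 4x, and 8x.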
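The magnifications listed for Figs. 2a to 2c follow from chaining synthesizers that share one upscaling ratio: after the k-th synthesizer the cumulative magnification is n to the power k. A small sketch of that arithmetic, where the function name `available_output_scales` is a hypothetical helper and not part of the patent:

```python
def available_output_scales(num_synthesizers, n=2):
    """List the cumulative magnification after each synthesizer in the chain."""
    if n <= 1:
        raise ValueError("the upscaling ratio n must be an integer greater than 1")
    return [n ** k for k in range(1, num_synthesizers + 1)]


for figure, count in (("Fig. 2a", 1), ("Fig. 2b", 2), ("Fig. 2c", 3)):
    print(figure, available_output_scales(count))
# Fig. 2a [2]
# Fig. 2b [2, 4]
# Fig. 2c [2, 4, 8]
```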
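Fig. 3 shows a schematic structural diagram of an example of the convolutional neural network module CN in the image upscaling system 100 shown in Fig. 1. In embodiments of the present invention, the convolutional neural network module CN is a convolutional neural network structure that takes images as input and output; it may include multiple convolutional layers, and each convolutional layer may include multiple filters. The exemplary convolutional neural network structure shown in Fig. 3 includes two convolutional layers. The input of this structure is four images; passing them through the filters of the first convolutional layer generates three feature images, and passing those through the filters of the second convolutional layer generates two feature images, which are output. In this convolutional neural network structure, the filters may be, for example, filters with 3×3 or 5×5 kernels, and have weights
Figure PCTCN2017089742-appb-000002
where k denotes the index of the convolutional layer, i denotes the index of the input image, and j denotes the index of the output image. The bias
Figure PCTCN2017089742-appb-000003
is an increment added to the convolution output. Typically, the parameters of the convolutional neural network structure are obtained by training the structure with a set of sample input and output images. The training of the convolutional neural network structure is described in detail later.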
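The four-input, three-feature, two-output structure of Fig. 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: zero padding at the borders, cross-correlation without kernel flipping (the usual CNN convention), no activation function, and identity kernels with zero biases as placeholder parameters are all choices made here; `conv2d_same`, `conv_layer`, and `placeholder_params` are hypothetical names.

```python
def conv2d_same(image, kernel):
    """Filter one image with one kernel, zero-padding so the size is unchanged."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    yy, xx = y + dy - kh // 2, x + dx - kw // 2
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as zero
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def conv_layer(inputs, weights, biases):
    """weights[i][j] is the kernel from input image i to output image j."""
    h, w = len(inputs[0]), len(inputs[0][0])
    outputs = []
    for j in range(len(biases)):
        acc = [[biases[j]] * w for _ in range(h)]  # start from the bias increment
        for i, image in enumerate(inputs):
            filtered = conv2d_same(image, weights[i][j])
            for y in range(h):
                for x in range(w):
                    acc[y][x] += filtered[y][x]
        outputs.append(acc)
    return outputs


IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]


def placeholder_params(n_in, n_out):
    """Identity 3x3 kernels and zero biases; real values would come from training."""
    return [[IDENTITY for _ in range(n_out)] for _ in range(n_in)], [0.0] * n_out


# Four constant 3x3 input images with pixel values 0, 1, 2 and 3.
inputs = [[[float(i)] * 3 for _ in range(3)] for i in range(4)]
w1, b1 = placeholder_params(4, 3)     # first layer: 4 inputs -> 3 feature images
w2, b2 = placeholder_params(3, 2)     # second layer: 3 inputs -> 2 feature images
hidden = conv_layer(inputs, w1, b1)
outputs = conv_layer(hidden, w2, b2)
```

With these placeholder parameters every hidden feature image is the pixel-wise sum of the four inputs, which makes the data flow easy to check by hand.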
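Although a convolutional neural network structure with two convolutional layers is described here only as an example, those skilled in the art will appreciate that a convolutional neural network structure with more convolutional layers may also be used.
In embodiments of the present invention, a synthesizer M with upscaling ratio n can combine n*n feature images into one feature image so that the resolution of the image is enlarged n times. The synthesizer M is therefore essentially equivalent to an adaptive interpolation filter. Fig. 4 is a schematic diagram illustrating the upscaling processing of the synthesizer M in the image upscaling system 100 shown in Fig. 1; in that figure the upscaling ratio of the synthesizer M is 2, and it is denoted as synthesizer 2x. As shown in Fig. 4, the synthesizer M combines the input feature images in groups of four, for example the four feature images 4n, 4n+1, 4n+2, and 4n+3. Specifically, the pixel values located at the same position in the four feature images are arranged in a matrix, producing one feature image with four times as many pixels. In this way, the pixel information in the feature images is not modified (neither added nor lost) during image upscaling.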
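The rearrangement performed by the synthesizer can be sketched as below. The row-major placement of the n*n source images inside each n×n output block is an assumption made for illustration (the text only says that co-located pixels are arranged in a matrix), and `synthesize` is a hypothetical name.

```python
def synthesize(feature_images, n):
    """Merge every n*n feature images into one image that is n times larger."""
    if len(feature_images) % (n * n) != 0:
        raise ValueError("the number of feature images must be a multiple of n*n")
    h, w = len(feature_images[0]), len(feature_images[0][0])
    merged = []
    for start in range(0, len(feature_images), n * n):
        group = feature_images[start:start + n * n]
        out = [[0] * (w * n) for _ in range(h * n)]
        for index, image in enumerate(group):
            dy, dx = divmod(index, n)  # position of this image inside each n x n block
            for y in range(h):
                for x in range(w):
                    out[y * n + dy][x * n + dx] = image[y][x]  # copy, never alter
        merged.append(out)
    return merged


# Four 1x2 feature images become one 2x4 feature image.
a, b, c, d = [[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]
print(synthesize([a, b, c, d], 2))
# [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

Every input pixel appears exactly once in the output, which matches the statement that no pixel information is added or lost.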
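The image upscaling system according to embodiments of the present invention may be implemented in hardware, software, or a combination of hardware and software.
When the image upscaling system is running, the parameters of the convolutional neural network modules it contains are fixed; therefore, before it is run, the image upscaling system needs to be trained to determine those parameters. Because the image upscaling system of the embodiments of the present invention also takes supplemental images as input, and existing training methods would eliminate all the supplemental images and thereby prevent them from affecting the output, embodiments of the present invention provide a new training method. In this method, a new system (hereinafter the "discrimination system") is trained to serve as the objective function of the image upscaling system. The discrimination system takes two images of the same resolution as input: one is an original high-quality image, and the other is a degraded version of that image, obtained by first downsampling the original high-quality image by a scaling factor and then upsampling the downsampled image back to the original resolution. The output of the discrimination system is a prediction of the scaling factor. The discrimination system may be implemented using a convolutional neural network system.
During training, the discrimination system and the image upscaling system may be trained alternately. First, the discrimination system learns from a standard upscaler (for example, a bicubic upscaler). Then, the image upscaling system tries to minimize the scaling factor estimated by the discrimination system. Next, the discrimination system learns from the newly improved image upscaling system. Then, the image upscaling system again tries to minimize the scaling factor estimated by the newly improved discrimination system. The training method of the embodiments of the present invention makes the discrimination system and the image upscaling system act as "adversarial" networks, each improving on the basis of the other's better results.
The training method of the embodiments of the present invention uses the scaling factor predicted by the discrimination system as the cost function for optimizing the parameters of the image upscaling system; compared with existing training methods, this allows the input supplemental images to help produce more realistic results. In addition, the scaling factor estimated by the discrimination system can also comprehensively indicate the performance of the image upscaling system.
The method for training the image upscaling system according to embodiments of the present invention is described in detail below with reference to the drawings. Fig. 5 shows a schematic flowchart of a method for training the image upscaling system shown in Fig. 1 according to an embodiment of the present invention. In embodiments of the present invention, the parameters of the discrimination system and the parameters of the image upscaling system are obtained by optimizing the two systems alternately. The discrimination system may be a convolutional neural network system.
As shown in Fig. 5, in step S510 a first training set $B = \{HR_0(k), HR_1(k), \ldots, HR_{N-1}(k), HR_N(k)\}$ is constructed, where k = 1, 2, …. The first training set may include an original image $HR_N(k)$ and at least one downscaled image $HR_0(k), HR_1(k), \ldots, HR_{N-1}(k)$ of that original image. In embodiments of the present invention, a downscaled image is an image whose resolution is lower than that of the original image. For example, if the original image has resolution 8x, the downscaled images may have resolutions 4x, 2x, and 1x. The downscaled images can be obtained by applying standard downsampling, for example bicubic downsampling, to the original image.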
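A minimal sketch of step S510 for one original image is shown below. To stay within the Python standard library it replaces bicubic downsampling with 2×2 block averaging, which is an assumption made only to keep the example self-contained; `downsample_by_2` and `build_first_training_entry` are hypothetical names.

```python
def downsample_by_2(image):
    """Halve the resolution with 2x2 block averages (stand-in for bicubic)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1]
              + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def build_first_training_entry(original, levels):
    """Return [HR_0, ..., HR_N] for one original image, lowest resolution first."""
    pyramid = [original]
    for _ in range(levels):
        pyramid.append(downsample_by_2(pyramid[-1]))
    return pyramid[::-1]


# An 8x8 original (the "8x" image) yields 1x, 2x, 4x and 8x versions.
original = [[float(x + 8 * y) for x in range(8)] for y in range(8)]
entry = build_first_training_entry(original, levels=3)
print([len(image) for image in entry])
# [1, 2, 4, 8]
```

In a real pipeline the averaging step would be swapped for a bicubic resampler from an image library, and the same construction would be repeated for every original image k.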
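In embodiments of the present invention, there may be one or more original images, that is, k is a positive integer. Further, the original image may be divided into a plurality of image blocks having a first size.
Next, in step S520, a second training set $A_0 = \{HR_N(k), f_k, Y(k)\}$ is constructed. The second training set may include the original image $HR_N(k)$, a scaling factor $f_k$, and a first degraded image $Y(k)$ of the original image $HR_N(k)$ based on the scaling factor $f_k$. In embodiments of the present invention, the first degraded image has the same resolution as the original image. The first degraded image may be obtained by first downsampling the original image using the scaling factor and then upsampling the downsampled image using the same scaling factor. The downsampling and upsampling may use standard algorithms; for example, the downsampling may use bicubic downsampling and the upsampling may use bicubic upsampling. In this step, the scaling factor may be a floating-point number and may be randomly generated.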
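Step S520 can be sketched as follows. Two simplifications are assumed here and are not taken from the patent: the scaling factor is restricted to a small set of integers (the text allows random floating-point factors), and block averaging with pixel repetition stands in for bicubic resampling. The names `downsample`, `upsample`, and `build_second_training_set` are hypothetical.

```python
import random


def downsample(image, f):
    """Shrink by integer factor f using f x f block averages (bicubic stand-in)."""
    h, w = len(image) // f, len(image[0]) // f
    return [[sum(image[y * f + dy][x * f + dx]
                 for dy in range(f) for dx in range(f)) / float(f * f)
             for x in range(w)] for y in range(h)]


def upsample(image, f):
    """Enlarge by integer factor f using pixel repetition (bicubic stand-in)."""
    return [[image[y // f][x // f] for x in range(len(image[0]) * f)]
            for y in range(len(image) * f)]


def build_second_training_set(originals, factors=(2, 4), seed=0):
    """Return (original, scaling factor, degraded image) triples."""
    rng = random.Random(seed)
    samples = []
    for original in originals:
        f = rng.choice(factors)  # randomly generated factor, as in step S520
        degraded = upsample(downsample(original, f), f)  # same size as the original
        samples.append((original, f, degraded))
    return samples


original = [[float(x + 8 * y) for x in range(8)] for y in range(8)]
training_set = build_second_training_set([original])
```

Each triple pairs an original with a degraded copy of identical resolution, and the factor used to degrade it becomes the label the discrimination system learns to predict.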
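In the second training set, the original image may be divided into a plurality of image blocks having a second size. The second size is different from the first size.
Then, in step S530, the convolutional neural network system is trained using the constructed second training set $A_0$, with the original image and the first degraded image as inputs and the scaling factor as output. Fig. 6 shows a schematic structure of such a convolutional neural network system. As shown in Fig. 6, the convolutional neural network system may include a plurality of convolutional neural network modules CNk (three are shown in the figure by way of example, k = 1, 2, 3), a standard max pooling layer P, and a fully connected network FCN. The convolutional neural network system has two inputs: an original image X and a degraded image Y of the original image X. The degraded image Y may be obtained by first downsampling the original image by the scaling factor and then upsampling the downsampled image by the same scaling factor. That is, $Y = \mathrm{Down}_f(\mathrm{UP}_f(X))$. The output of the convolutional neural network system is a prediction of the scaling factor f, denoted $D_{\theta}(X, Y)$, where θ denotes all the parameters of the convolutional neural network system, including the parameters of the convolutional neural network modules CNk and the parameters of the fully connected network FCN.
In embodiments of the present invention, stochastic gradient descent may be used to train the convolutional neural network system. First, the parameters of the convolutional neural network system are initialized. Then, using the original images and first degraded images in the second training set $A_0$ as the inputs of the convolutional neural network system and the scaling factors as its outputs, the parameters of the convolutional neural network system are adjusted so that they satisfy the following formula
$$\theta_{opt} = \arg_{\theta} \min_{X} \left( f - D_{\theta}\left(X, \mathrm{Down}_f(\mathrm{UP}_f(X))\right) \right) \qquad (1)$$
where $\theta_{opt}$ denotes the parameters of the convolutional neural network system, $f$ denotes the scaling factor, and $D_{\theta}(X, \mathrm{Down}_f(\mathrm{UP}_f(X)))$ denotes the scaling factor estimated by the convolutional neural network system based on the original image $X$ and the first degraded image $\mathrm{Down}_f(\mathrm{UP}_f(X))$. Formula (1) states that the parameters of the convolutional neural network system are those that minimize the difference between the true scaling factor and the estimated scaling factor.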
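To make the idea behind formula (1) concrete, the toy example below fits a two-parameter stand-in for the discrimination system by stochastic gradient descent. Everything specific here is an assumption for illustration: the squared-error loss, the hand-made feature (mean absolute difference between X and Y), the linear model, and the synthetic samples. The real system described in the text is a convolutional network with pooling and a fully connected stage.

```python
def mean_abs_difference(x, y):
    """Hand-made feature: average absolute pixel difference between X and Y."""
    total = sum(abs(a - b) for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y))
    return total / float(len(x) * len(x[0]))


def predict(theta, x, y):
    """Toy discriminator D_theta(X, Y) with two scalar parameters."""
    return theta[0] * mean_abs_difference(x, y) + theta[1]


def train_discriminator(samples, learning_rate=0.05, epochs=2000):
    """Stochastic gradient descent on the squared error (f - D_theta(X, Y))**2."""
    theta = [0.0, 0.0]  # parameter initialisation
    for _ in range(epochs):
        for x, y, f in samples:  # one sample per update
            error = predict(theta, x, y) - f
            feature = mean_abs_difference(x, y)
            theta[0] -= learning_rate * 2.0 * error * feature
            theta[1] -= learning_rate * 2.0 * error
    return theta


# Synthetic (X, Y, f) triples: the feature is 0.5, 1.0 or 1.5 and f = 2 * feature + 1.
samples = [([[0.0, 0.0]], [[d, d]], 2.0 * d + 1.0) for d in (0.5, 1.0, 1.5)]
theta = train_discriminator(samples)
```

After training, `predict(theta, x, y)` reproduces the factor attached to each synthetic sample closely, which is the behaviour formula (1) asks of the trained parameters.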
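After the parameters of the convolutional neural network system have been obtained in step S530, in step S540 the parameters of the image upscaling system are obtained using the trained convolutional neural network system and the first training set B constructed in step S510. In embodiments of the present invention, stochastic gradient descent may be used to obtain the parameters of the image upscaling system. First, the parameters of the image upscaling system are initialized. Then, upscaled images are obtained by passing the downscaled images in the first training set B through the image upscaling system. Next, the trained convolutional neural network system takes each obtained upscaled image and the corresponding original image as inputs and estimates the corresponding scaling factor. The parameters of the image upscaling system are adjusted so that they satisfy the following formula: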
Figure PCTCN2017089742-appb-000004
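where $\alpha_{opt}$ denotes the parameters of the image upscaling system, $D_{\theta}(X, HR_k)$ denotes the scaling factor estimated by the trained convolutional neural network system based on the original image $HR_k$ and the upscaled image $X$ obtained by the image upscaling system, and "||·||" denotes the norm operation. Formula (2) states that the parameters of the image upscaling system are those for which the output of the image upscaling system, relative to its input, causes the output of the convolutional neural network system to take its minimum value.
Then, in step S550, based on the image upscaling system having the parameters obtained in step S540, a new training set $A_1 = \{HR_N(k), f_k', Y'(k)\}$ is constructed. The new training set may include the original image $HR_N(k)$, a scaling factor $f_k'$, and a second degraded image $Y'(k)$ of the original image $HR_N(k)$ based on the scaling factor $f_k'$. In embodiments of the present invention, the second degraded image also has the same resolution as the original image. The second degraded image may be obtained by first downsampling the original image using the scaling factor and then upsampling the downsampled image by the same scaling factor using the trained image upscaling system together with the standard upsampling method used in step S520. For example, the downsampling may use bicubic downsampling and the upsampling may use bicubic upsampling. In this step, the scaling factor may be a floating-point number and may be randomly generated.
In step S560, the convolutional neural network system is trained using the new training set $A_1$ created in step S550, with the original images and second degraded images in it as inputs and the scaling factors as outputs. The training method in this step is the same as that in step S530. Through step S560, the parameters of the convolutional neural network are obtained again. Then, in step S570, the parameters of the image upscaling system are obtained again using the convolutional neural network trained in step S560 and the first training set B. The training method in this step is the same as that in step S540.
Then, it is checked whether a predetermined condition is satisfied. The predetermined condition may be a predetermined number of iterations, or a condition that the parameters of the image upscaling system must satisfy. If the condition is not satisfied, steps S550 to S570 above are repeated. If it is satisfied, the training ends.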
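The control flow of steps S520 to S570 can be summarized in a short skeleton. The three callables and the use of a fixed round count as the predetermined condition are assumptions made for illustration; the stubs below only record the order of calls and do not train anything. The first training set from step S510 is assumed to be captured inside the `train_upscaler` callable.

```python
def train_alternating(build_degraded_set, train_discriminator, train_upscaler,
                      standard_upscaler, max_rounds):
    """Alternate between the discrimination system and the image upscaling system."""
    # S520: the first degraded images come from a standard upscaler (e.g. bicubic).
    discriminator = train_discriminator(build_degraded_set(standard_upscaler))  # S530
    upscaler = train_upscaler(discriminator)                                    # S540
    rounds = 0
    while rounds < max_rounds:  # predetermined condition: a fixed number of rounds
        new_set = build_degraded_set(upscaler)         # S550
        discriminator = train_discriminator(new_set)   # S560
        upscaler = train_upscaler(discriminator)       # S570
        rounds += 1
    return discriminator, upscaler


# Stub callables that log what happens instead of doing real training.
calls = []


def build_degraded_set(upscaler):
    calls.append(("build_set", upscaler))
    return "set from " + upscaler


def train_discriminator(training_set):
    calls.append(("train_discriminator", training_set))
    return "discriminator #%d" % sum(1 for name, _ in calls if name == "train_discriminator")


def train_upscaler(discriminator):
    calls.append(("train_upscaler", discriminator))
    return "upscaler tuned against " + discriminator


final_discriminator, final_upscaler = train_alternating(
    build_degraded_set, train_discriminator, train_upscaler, "bicubic", max_rounds=2)
```

Replacing the stubs with real dataset construction and gradient-descent routines would leave this outer loop unchanged.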
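Under the same inventive concept, Fig. 7 shows a method for upscaling an image using the image upscaling system of the embodiments of the present invention. As shown in Fig. 7, in step S710 a convolutional neural network module generates a first number of feature images based on a received input image and a supplemental image having the same resolution as the input image, and outputs them to a synthesizer. Next, in step S720, the synthesizer combines every n*n feature images among the received feature images into one feature image and outputs the resulting feature images to the next convolutional neural network module. Then, in step S730, the next convolutional neural network module generates and outputs a second number of feature images based on the feature images output by the synthesizer and a supplemental image having the same resolution as those received feature images. Depending on the number of synthesizers included in the image upscaling system, output images with different resolutions can be obtained.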
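The bookkeeping implied by steps S710 to S730 (how many feature images exist at each step and at what resolution) can be traced with a few lines of Python. The concrete numbers in the example, a 270×480 input, eight feature images from the first module and one from the second, are illustrative assumptions, and `trace_upscaling` is a hypothetical helper.

```python
def trace_upscaling(height, width, n, first_number, second_number):
    """Return (step, feature image count, height, width) for steps S710 to S730."""
    if first_number % (n * n) != 0:
        raise ValueError("the synthesizer needs a multiple of n*n feature images")
    # S710: the first module sees the input plus a same-size supplemental image.
    steps = [("S710", first_number, height, width)]
    # S720: every n*n feature images merge into one image that is n times larger.
    steps.append(("S720", first_number // (n * n), height * n, width * n))
    # S730: the next module adds a supplemental image at the enlarged resolution.
    steps.append(("S730", second_number, height * n, width * n))
    return steps


for step in trace_upscaling(270, 480, n=2, first_number=8, second_number=1):
    print(step)
# ('S710', 8, 270, 480)
# ('S720', 2, 540, 960)
# ('S730', 1, 540, 960)
```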
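Embodiments of the present invention further provide a display device that includes the image upscaling system according to embodiments of the present invention. The display device may be, for example, a monitor, a mobile phone, a laptop computer, a tablet computer, a television, a digital photo frame, a wearable device, a navigation device, or the like.
Several embodiments of the present invention have been described in detail above, but it is evident that those skilled in the art can make various modifications and variations to the embodiments of the present invention without departing from the spirit and scope of the present invention. The scope of protection of the present invention is defined by the appended claims.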

Claims (17)

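  1. An image upscaling system, comprising: at least two convolutional neural network modules; and at least one synthesizer;
    wherein the convolutional neural network modules and the synthesizer are connected alternately with each other, the synthesizer being connected between two adjacent convolutional neural network modules;
    a first convolutional neural network module of the at least two convolutional neural network modules is configured to receive an input image and a supplemental image having the same resolution as the input image, to generate a first number of feature images based on the input image and the supplemental image having the same resolution as the input image, and to output them to the next synthesizer connected thereto;
    each other convolutional neural network module of the at least two convolutional neural network modules is configured to receive an output image from the preceding synthesizer and a supplemental image having the same resolution as the received output image, to generate a second number of feature images based on the output image and the supplemental image having the same resolution as the output image, and to output them to the next synthesizer connected thereto or as the output of the image upscaling system;
    the synthesizer is configured to combine every n*n feature images among the received feature images into one feature image, and to output the resulting third number of feature images to the next convolutional neural network module connected thereto or as the output of the image upscaling system;
    wherein n denotes the upscaling ratio of the synthesizer and is an integer greater than 1, and the number of feature images received by the synthesizer is a multiple of n*n.
  2. The image upscaling system according to claim 1, wherein the supplemental image is an image having a fixed distribution and white noise.
  3. The image upscaling system according to claim 1 or 2, wherein the synthesizers have the same upscaling ratio.
  4. The image upscaling system according to any one of claims 1 to 3, wherein the upscaling ratio of the synthesizer is a multiple of 2.
  5. The image upscaling system according to any one of claims 1 to 4, wherein the synthesizer is an adaptive interpolation filter.
  6. A display device, comprising the image upscaling system according to any one of claims 1 to 5.
  7. A method for training the image upscaling system according to any one of claims 1 to 5, comprising:
    constructing a first training set that includes an original image and at least one downscaled image of the original image, wherein the resolution of the downscaled image is lower than the resolution of the original image;
    constructing a second training set that includes the original image, a scaling factor, and a first degraded image of the original image based on the scaling factor, the resolution of the first degraded image being the same as the resolution of the original image;
    training a convolutional neural network system using the second training set, with the original image and the first degraded image as inputs and the scaling factor as output;
    obtaining parameters of the image upscaling system using the trained convolutional neural network system and the first training set;
    constructing, based on the image upscaling system having the obtained parameters, a new training set that includes the original image, the scaling factor, and a second degraded image of the original image based on the scaling factor, wherein the resolution of the second degraded image is the same as the resolution of the original image;
    training the convolutional neural network system using the new training set, with the original image and the second degraded image as inputs and the scaling factor as output;
    obtaining the parameters of the image upscaling system again using the trained convolutional neural network system and the first training set; and
    repeating the construction of the new training set, the training of the convolutional neural network system, and the obtaining of the parameters of the image upscaling system.
  8. The method according to claim 7, further comprising:
    checking whether the parameters of the image upscaling system satisfy a predetermined condition;
    in response to the parameters of the image upscaling system satisfying the predetermined condition, stopping the training of the image upscaling system; and
    in response to the parameters of the image upscaling system not satisfying the predetermined condition, continuing the training of the image upscaling system.
  9. The method according to claim 7, wherein the downscaled image is obtained by downsampling the original image.
  10. The method according to claim 7, wherein the first degraded image is obtained by:
    downsampling the original image using the scaling factor; and
    upsampling the downsampled image using the scaling factor.
  11. The method according to claim 10, wherein the downsampling uses bicubic downsampling and the upsampling uses bicubic upsampling.
  12. The method according to claim 7, wherein the convolutional neural network system is trained by stochastic gradient descent so that the parameters of the convolutional neural network system satisfy
    $$\theta_{opt} = \arg_{\theta} \min_{X} \left( f - D_{\theta}\left(X, \mathrm{Down}_f(\mathrm{UP}_f(X))\right) \right)$$
    where $\theta_{opt}$ denotes the parameters of the convolutional neural network system, $f$ denotes the scaling factor, and $D_{\theta}(X, \mathrm{Down}_f(\mathrm{UP}_f(X)))$ denotes the scaling factor estimated by the convolutional neural network system based on the original image $X$ and the first degraded image or the second degraded image $\mathrm{Down}_f(\mathrm{UP}_f(X))$.
  13. The method according to claim 7, wherein the parameters of the image upscaling system are obtained by stochastic gradient descent so that the parameters of the image upscaling system satisfy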
    Figure PCTCN2017089742-appb-100001
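    where $\alpha_{opt}$ denotes the parameters of the image upscaling system, $D_{\theta}(X, HR_k)$ denotes the scaling factor estimated by the convolutional neural network system based on the original image $HR_k$ and the upscaled image $X$ obtained by the image upscaling system, and "||·||" denotes the norm operation.
  14. The method according to claim 7, wherein the second degraded image is obtained by:
    downsampling the original image using the scaling factor; and
    upscaling the downsampled image by the scaling factor using the trained image upscaling system.
  15. The method according to claim 7, wherein the value of the scaling factor differs between different training sets.
  16. The method according to claim 7, wherein, in the first training set, the original image is divided into a plurality of image blocks having a first size;
    in the second training set and the new training set, the original image is divided into a plurality of image blocks having a second size.
  17. A method for upscaling an image using the image upscaling system according to any one of claims 1 to 5, comprising:
    generating, by a convolutional neural network module, a first number of feature images based on a received input image and a supplemental image having the same resolution as the input image, and outputting them to a synthesizer;
    combining, by the synthesizer, every n*n feature images among the received feature images into one feature image, and outputting the resulting feature images to the next convolutional neural network module;
    generating and outputting, by the next convolutional neural network module, a second number of feature images based on the feature images output by the synthesizer and a supplemental image having the same resolution as the received feature images;
    wherein n denotes the upscaling ratio of the synthesizer and is an integer greater than 1, and the number of feature images received by the synthesizer is a multiple of n*n.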
PCT/CN2017/089742 2016-11-09 2017-06-23 Image upscaling system, training method thereof, and image upscaling method WO2018086354A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/741,781 US10311547B2 (en) 2016-11-09 2017-06-23 Image upscaling system, training method thereof, and image upscaling method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610984239.6 2016-11-09
CN201610984239.6A CN108074215B (zh) 2016-11-09 2016-11-09 Image upscaling system, training method thereof, and image upscaling method

Publications (1)

Publication Number Publication Date
WO2018086354A1 true WO2018086354A1 (zh) 2018-05-17

Family

ID=62110148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/089742 WO2018086354A1 (zh) 2016-11-09 2017-06-23 Image upscaling system, training method thereof, and image upscaling method

Country Status (3)

Country Link
US (1) US10311547B2 (zh)
CN (1) CN108074215B (zh)
WO (1) WO2018086354A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
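CN109191376A (zh) * 2018-07-18 2019-01-11 电子科技大学 High-resolution terahertz image reconstruction method based on an improved SRCNN model
WO2020063648A1 (zh) * 2018-09-30 2020-04-02 京东方科技集团股份有限公司 Generative adversarial network training method, image processing method, device, and storage medium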

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
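WO2018033137A1 (zh) * 2016-08-19 2018-02-22 北京市商汤科技开发有限公司 Method, apparatus, and electronic device for displaying a business object in a video image
CN107124609A (zh) * 2017-04-27 2017-09-01 京东方科技集团股份有限公司 Video image processing system, processing method thereof, and display device
CN107122826B (zh) 2017-05-08 2019-04-23 京东方科技集团股份有限公司 Processing method and system for a convolutional neural network, and storage medium
KR102567675B1 (ko) * 2017-08-24 2023-08-16 가부시키가이샤 한도오따이 에네루기 켄큐쇼 Image processing method
CN109754357B (zh) * 2018-01-26 2021-09-21 京东方科技集团股份有限公司 Image processing method, processing apparatus, and processing device
CN111767979B (zh) * 2019-04-02 2024-04-23 京东方科技集团股份有限公司 Neural network training method, image processing method, and image processing apparatus
KR20200142883A (ko) * 2019-06-13 2020-12-23 엘지이노텍 주식회사 Camera device and image generation method of camera device
CN110288607A (zh) * 2019-07-02 2019-09-27 数坤(北京)网络科技有限公司 Optimization method and system for a segmentation network, and computer-readable storage medium
KR102624027B1 (ko) * 2019-10-17 2024-01-11 삼성전자주식회사 Image processing apparatus and method
CN115668273A (zh) 2020-09-15 2023-01-31 三星电子株式会社 Electronic device, control method thereof, and electronic system
KR20220036061A (ko) * 2020-09-15 2022-03-22 삼성전자주식회사 Electronic device, control method thereof, and electronic system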

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140093185A1 (en) * 2012-09-28 2014-04-03 Luhong Liang Apparatus, system, and method for multi-patch based super-resolution from an image
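CN104346629A (zh) * 2014-10-24 2015-02-11 华为技术有限公司 Model parameter training method, apparatus, and system
CN105120130A (zh) * 2015-09-17 2015-12-02 京东方科技集团股份有限公司 Image upscaling system, training method thereof, and image upscaling method
CN204948182U (zh) * 2015-09-17 2016-01-06 京东方科技集团股份有限公司 Image upscaling system and display device
CN105611303A (zh) * 2016-03-07 2016-05-25 京东方科技集团股份有限公司 Image compression system, decompression system, training method and apparatus, and display device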

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015192316A1 (en) * 2014-06-17 2015-12-23 Beijing Kuangshi Technology Co., Ltd. Face hallucination using convolutional neural networks
WO2017175282A1 (ja) * 2016-04-04 2017-10-12 オリンパス株式会社 Learning method, image recognition device, and program
CN105976318A (zh) * 2016-04-28 2016-09-28 北京工业大学 Image super-resolution reconstruction method
CN106067161A (zh) * 2016-05-24 2016-11-02 深圳市未来媒体技术研究院 Method for super-resolving an image
US10255522B2 (en) * 2016-06-17 2019-04-09 Facebook, Inc. Generating object proposals using deep-learning models
US10510146B2 (en) * 2016-10-06 2019-12-17 Qualcomm Incorporated Neural network for image processing
US20180129900A1 (en) * 2016-11-04 2018-05-10 Siemens Healthcare Gmbh Anonymous and Secure Classification Using a Deep Learning Network

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
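CN109191376A (zh) * 2018-07-18 2019-01-11 电子科技大学 High-resolution terahertz image reconstruction method based on an improved SRCNN model
CN109191376B (zh) * 2018-07-18 2022-11-25 电子科技大学 High-resolution terahertz image reconstruction method based on an improved SRCNN model
WO2020063648A1 (zh) * 2018-09-30 2020-04-02 京东方科技集团股份有限公司 Generative adversarial network training method, image processing method, device, and storage medium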
US11449751B2 (en) 2018-09-30 2022-09-20 Boe Technology Group Co., Ltd. Training method for generative adversarial network, image processing method, device and storage medium

Also Published As

Publication number Publication date
CN108074215A (zh) 2018-05-25
CN108074215B (zh) 2020-04-14
US20190005619A1 (en) 2019-01-03
US10311547B2 (en) 2019-06-04

Similar Documents

Publication Publication Date Title
WO2018086354A1 (zh) Image upscaling system, training method thereof, and image upscaling method
US10970600B2 (en) Method and apparatus for training neural network model used for image processing, and storage medium
US10019642B1 (en) Image upsampling system, training method thereof and image upsampling method
US8275218B2 (en) Forward and backward image resizing method
JP6253331B2 (ja) Image processing apparatus and image processing method
CN107169927B (zh) Image processing system, method, and display device
US11900567B2 (en) Image processing method and apparatus, computer device, and storage medium
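CN113837946B (zh) Lightweight image super-resolution reconstruction method based on a progressive distillation network
CN204948182U (zh) Image upscaling system and display device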
EP4207051A1 (en) Image super-resolution method and electronic device
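KR102493492B1 (ko) Method and apparatus for fast adaptation through meta-learning of a super-resolution model
CN107220934B (zh) Image reconstruction method and apparatus
CN109102463B (zh) Super-resolution image reconstruction method and apparatus
JP5289540B2 (ja) Image processing apparatus and image processing method
CN102842111B (zh) Compensation method and apparatus for an enlarged image
JP2012164147A (ja) Image reduction device, image enlargement device, and programs therefor
CN115375539A (zh) Image resolution enhancement and multi-frame image super-resolution system and method
JP2012151751A (ja) Image reduction device, image enlargement device, and programs therefor
JP5181345B2 (ja) Image processing apparatus and image processing method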
Zhou et al. Enhancing Real-Time Super Resolution with Partial Convolution and Efficient Variance Attention
JP6452793B2 (ja) Image processing apparatus and image processing method
JP6902425B2 (ja) Color information enlarger and color information estimator, and programs therefor
US20230360173A1 (en) Content-aware bifurcated upscaling
WO2016035568A1 (ja) Signal processing device and signal processing method, solid-state imaging element, imaging device, electronic apparatus, and program
US20150310595A1 (en) Local contrast enhancement method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17869498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17869498

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.08.2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17869498

Country of ref document: EP

Kind code of ref document: A1