WO2019120110A1 - Image reconstruction method and device - Google Patents
- Publication number
- WO2019120110A1 (PCT/CN2018/120447)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- resolution
- super
- model
- sub
- Prior art date
Classifications
- G: PHYSICS
- G06: COMPUTING; CALCULATING OR COUNTING
- G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00: Geometric image transformations in the plane of the image
- G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4046: Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
- G06T5/00: Image enhancement or restoration
- G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T5/73: Deblurring; Sharpening
- G06T7/00: Image analysis
- G06T7/10: Segmentation; Edge detection
- G06T2207/00: Indexing scheme for image analysis or image enhancement
- G06T2207/20: Special algorithmic details
- G06T2207/20081: Training; Learning
- G06T2207/20084: Artificial neural networks [ANN]
Definitions
- The embodiments of the present invention relate to the field of communications technologies, and in particular, to an image reconstruction method and device.
- Image super-resolution reconstruction refers to the technique of reconstructing low-resolution images into high-resolution images by using image processing methods. It can effectively improve the sharpness of images and is of great significance for fields such as video surveillance, camera photography, high-definition television, and medical imaging.
- Within image super-resolution reconstruction, face image super-resolution is widely used; face super-resolution reconstruction is also called face hallucination.
- methods for face super-resolution reconstruction include signal reconstruction based methods and machine learning based methods.
- the signal reconstruction method is mainly realized by signal reconstruction theory in the field of signal processing, such as Fourier transform, polynomial interpolation and the like.
- Signal reconstruction based methods are usually simple to implement, but the reconstructed image loses much of its detail, its edges are blurred, and jagged artifacts are obvious.
- A machine learning based method takes a low-resolution image as input and reconstructs it through a super-resolution model to obtain a reconstructed image by maximum a posteriori probability estimation.
- The super-resolution model used by a machine learning based method is obtained by training an initial super-resolution model.
- The super-resolution model training process adjusts the parameters in the super-resolution model based on the pixel mean square error between the high-resolution image and the image reconstructed from the low-resolution image.
- When only the pixel mean square error is used, the generated image is overly smooth and its high-frequency information is seriously lost.
- the embodiment of the present application discloses an image reconstruction method and device, which can improve image reconstruction quality.
- An embodiment of the present application provides an image reconstruction method, including: inputting a first image into a newly created super-resolution model to obtain a reconstructed second image, where the resolution of the second image is higher than that of the first image; the newly created super-resolution model is obtained by training an initial super-resolution model using an error loss; the error loss includes a pixel mean square error and an image feature mean square error; and the image feature includes at least one of a texture feature, a shape feature, a spatial relationship feature, and an image high-level semantic feature.
- The error loss includes the pixel mean square error, and the error loss also includes the image feature mean square error.
- The error loss used to train the initial super-resolution model therefore carries more comprehensive error information, so the newly created super-resolution model used to reconstruct the image is more accurate, which reduces the loss of high-frequency information in the reconstructed image and improves its quality.
- The error loss is an error loss between a third image and a fourth image, the third image being reconstructed by inputting a fifth image into the initial super-resolution model;
- the fourth image is a high-resolution image;
- the fifth image is a low-resolution image obtained by blurring the fourth image;
- the initial super-resolution model is used to reconstruct an image input to the initial super-resolution model so as to increase its resolution.
- There are M third images, M fourth images, M fifth images, and M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined according to the M third images and the M fourth images. Any one of the M error losses is the error loss between the i-th third image in the M third images and the j-th fourth image in the M fourth images, where the image obtained by inputting the fifth image produced by blurring the j-th fourth image into the initial super-resolution model is the i-th third image. M is a positive integer greater than 1, and i and j are both positive integers less than or equal to M.
- The initial super-resolution model is adjusted using multiple error losses obtained from multiple sets of training samples. This provides more sample information for adjusting the initial super-resolution model, so the newly created super-resolution model obtained by the adjustment is more accurate.
- If there are multiple training sample pairs and the initial super-resolution model is adjusted each time a single error loss is obtained, the number of adjustments becomes excessive, which wastes processing resources and storage resources.
- Adjusting the initial super-resolution model using the multiple error losses obtained from multiple training samples reduces the number of times the parameters in the super-resolution model are adjusted, thereby saving processing resources and storage resources.
- The newly created super-resolution model is obtained by adjusting parameters in the initial super-resolution model according to the M error losses; or
- the initial super-resolution model is a first super-resolution model,
- the second super-resolution model is obtained by adjusting parameters in the first super-resolution model according to the first error loss,
- the r-th error loss is used to adjust the parameters in the r-th super-resolution model to obtain the (r+1)-th super-resolution model, and the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
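- The two adjustment strategies above can be contrasted with a minimal sketch. Everything in it is an illustrative assumption rather than the method of the application: the "model" is a single scalar `gain` applied to every pixel, the error loss is only the pixel mean square error, and the names `loss_and_grad`, `adjust_once_with_all_losses`, and `adjust_sequentially` are hypothetical. Averaging the gradients is one possible way to adjust "according to the M error losses".

```python
from typing import List, Tuple

# (fifth image, fourth image) as flat pixel lists; a toy stand-in for real images
Sample = Tuple[List[float], List[float]]


def loss_and_grad(gain: float, sample: Sample) -> Tuple[float, float]:
    """Pixel MSE of the toy model `gain * pixel` and its derivative w.r.t. gain."""
    low, high = sample
    n = len(low)
    loss = sum((gain * x - y) ** 2 for x, y in zip(low, high)) / n
    grad = sum(2.0 * (gain * x - y) * x for x, y in zip(low, high)) / n
    return loss, grad


def adjust_once_with_all_losses(gain: float, samples: List[Sample], lr: float = 0.1) -> float:
    """Option 1 (assumed form): a single adjustment driven by all M error losses."""
    grads = [loss_and_grad(gain, s)[1] for s in samples]
    return gain - lr * sum(grads) / len(grads)


def adjust_sequentially(gain: float, samples: List[Sample], lr: float = 0.1) -> float:
    """Option 2: the r-th loss adjusts the r-th model to give the (r+1)-th model."""
    for sample in samples:
        _, grad = loss_and_grad(gain, sample)
        gain = gain - lr * grad
    return gain


# Three toy training pairs whose ideal gain is 2.0
samples = [([1.0, 2.0], [2.0, 4.0]), ([0.5, 1.5], [1.0, 3.0]), ([2.0, 1.0], [4.0, 2.0])]
batch_gain = adjust_once_with_all_losses(1.0, samples)
sequential_gain = adjust_sequentially(1.0, samples)
print(batch_gain, sequential_gain)
```

- In this toy setting the first option performs one parameter update for M samples while the second performs M updates, which is the trade-off in the number of adjustments described above.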
- The initial super-resolution model includes n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each super-resolution sub-model is used to reconstruct the image information input to it so as to improve resolution; the image information includes pixel value information and image feature information. The input of the first super-resolution sub-model is the first image; the output of the first super-resolution sub-model is the input of the second super-resolution sub-model; the output of the (t-1)-th super-resolution sub-model is the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model is the input of the (t+1)-th super-resolution sub-model, where t is a positive integer satisfying $2 \le t \le n-1$. The output of the t-th super-resolution sub-model also serves as an input to an output synthesis module, and the output of the output synthesis module is an input of the n-th super-resolution sub-model.
- the low resolution image is reconstructed by cascading multiple super resolution sub-models.
- the reconstructed image obtained by the reconstruction has a higher pixel value, which can improve the image quality of the reconstructed image.
- The reconstructed images output by the first n-1 super-resolution sub-models are used as the image information for the last super-resolution sub-model. This input contains more image information and reduces the loss of image information, which improves the accuracy of the newly created super-resolution model and thus the image reconstruction quality.
- The reconstructed image information output by the output synthesis module is computed from the outputs of the super-resolution sub-models and their weights $w_k$, where k is a positive integer satisfying $1 \le k \le n-1$ and $w_k$ is the weight of the k-th super-resolution sub-model.
- $w_k$ is a parameter in the initial super-resolution model.
- The weight $w_k$ in the initial super-resolution model may be optimized based on the error loss.
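- The cascade with an output synthesis module can be sketched as follows. This is a minimal illustration under stated assumptions: the synthesis rule is taken to be a weighted sum of the first n-1 outputs (the exact formula is not recoverable from the text here), every sub-model output is assumed to have the same size, and `run_cascade`, `brighten`, and `double` are hypothetical stand-ins for trained sub-models.

```python
from typing import Callable, List, Sequence

Image = List[float]  # a flat list of pixel values stands in for image information


def run_cascade(first_image: Image,
                sub_models: Sequence[Callable[[Image], Image]],
                weights: Sequence[float]) -> Image:
    """Run n cascaded sub-models with a weighted output synthesis step."""
    n = len(sub_models)
    assert n >= 2 and len(weights) == n - 1
    outputs = []
    current = first_image
    for sub_model in sub_models[:-1]:  # sub-models 1 .. n-1 connected in series
        current = sub_model(current)
        outputs.append(current)
    # Assumed synthesis rule: weighted sum of the first n-1 outputs, pixel by pixel.
    synthesized = [sum(w * out[p] for w, out in zip(weights, outputs))
                   for p in range(len(first_image))]
    return sub_models[-1](synthesized)  # the n-th sub-model produces the result


def brighten(img: Image) -> Image:
    """Toy sub-model: add 1 to every pixel."""
    return [p + 1.0 for p in img]


def double(img: Image) -> Image:
    """Toy sub-model: multiply every pixel by 2."""
    return [2.0 * p for p in img]


result = run_cascade([1.0, 2.0], [brighten, double, brighten], weights=[0.5, 0.5])
print(result)
```

- Because the weights are ordinary parameters of the model in this sketch, they could be updated by the same error loss that updates the sub-models.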
- The super-resolution sub-model is a three-layer fully convolutional deep neural network.
- In this three-layer fully convolutional deep neural network, the first and second convolutional layers are used to extract image information from the low-resolution image, that is, to obtain information that can be used for super-resolution reconstruction.
- The third convolutional layer reconstructs the high-resolution image by using the image information extracted and transformed by the first two layers.
- The two additional convolutional layers in the three-layer fully convolutional deep neural network help extract more accurate image information.
- The super-resolution sub-models, each composed of a three-layer fully convolutional deep neural network, need to be connected in series to form the super-resolution model.
- By using the above three-layer fully convolutional deep neural network, the super-resolution sub-model can extract more accurate image information with fewer computing resources. More accurate image information facilitates the reconstruction of higher-quality images while saving computing resources.
- The error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
- The added regularization term L3 is used to reduce over-fitting and improve the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.
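- A minimal sketch of the weighted error loss follows. It assumes that images and feature vectors are flat lists of numbers and that the regularization term L3 is the sum of squared sub-model weights; the text does not state the form of L3, and the names `mean_square_error` and `error_loss` are hypothetical.

```python
from typing import Sequence


def mean_square_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of squared element-wise differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def error_loss(rec_pixels: Sequence[float], ref_pixels: Sequence[float],
               rec_features: Sequence[float], ref_features: Sequence[float],
               sub_model_weights: Sequence[float],
               lambdas: Sequence[float] = (1.0, 1.0, 0.01)) -> float:
    """L = lambda1 * L1 + lambda2 * L2 + lambda3 * L3."""
    lam1, lam2, lam3 = lambdas
    l1 = mean_square_error(rec_pixels, ref_pixels)      # pixel mean square error
    l2 = mean_square_error(rec_features, ref_features)  # image feature mean square error
    l3 = sum(w * w for w in sub_model_weights)          # assumed form of the regularizer
    return lam1 * l1 + lam2 * l2 + lam3 * l3


loss = error_loss(rec_pixels=[1.0, 2.0], ref_pixels=[1.0, 4.0],
                  rec_features=[0.5], ref_features=[1.5],
                  sub_model_weights=[0.5, 0.5],
                  lambdas=(1.0, 0.5, 0.1))
print(loss)  # L1 = 2.0, L2 = 1.0, L3 = 0.5
```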
- Each convolutional layer in the three-layer fully convolutional deep neural network includes at least one convolution kernel, and the weight matrix W of the convolution kernel is a parameter in the initial super-resolution model.
- An embodiment of the present application provides an image reconstruction apparatus, including a processor and a memory, where the memory is used to store program instructions, and the processor is configured to invoke the program instructions to perform the following operation: inputting a first image into a newly created super-resolution model to obtain a reconstructed second image having a higher resolution than the first image; the newly created super-resolution model is obtained by training an initial super-resolution model using an error loss;
- the error loss includes a pixel mean square error and an image feature mean square error;
- the image feature includes at least one of a texture feature, a shape feature, a spatial relationship feature, and an image high-level semantic feature.
- The error loss includes the pixel mean square error, and the error loss also includes the image feature mean square error.
- The error loss used to train the initial super-resolution model therefore carries more comprehensive error information, which can reduce the loss of high-frequency information in the reconstructed image and improve its reconstruction quality.
- The error loss is an error loss between a third image and a fourth image, the third image being reconstructed by inputting a fifth image into the initial super-resolution model;
- the fourth image is a high-resolution image;
- the fifth image is a low-resolution image obtained by blurring the fourth image;
- the initial super-resolution model is used to reconstruct an image input to the initial super-resolution model so as to increase its resolution.
- There are M third images, M fourth images, M fifth images, and M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined according to the M third images and the M fourth images. Any one of the M error losses is the error loss between the i-th third image in the M third images and the j-th fourth image in the M fourth images, where the image obtained by inputting the fifth image produced by blurring the j-th fourth image into the initial super-resolution model is the i-th third image. M is a positive integer greater than 1, and i and j are both positive integers less than or equal to M.
- The initial super-resolution model is adjusted using multiple error losses obtained from multiple sets of training samples. This provides more sample information for adjusting the initial super-resolution model, so the newly created super-resolution model obtained by the adjustment is more accurate.
- The newly created super-resolution model is obtained by adjusting parameters in the initial super-resolution model according to the M error losses; or
- the initial super-resolution model is a first super-resolution model,
- the second super-resolution model is obtained by adjusting parameters in the first super-resolution model according to the first error loss,
- the r-th error loss is used to adjust the parameters in the r-th super-resolution model to obtain the (r+1)-th super-resolution model, and the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
- The initial super-resolution model includes n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each super-resolution sub-model is used to reconstruct the image information input to it so as to improve resolution; the image information includes pixel value information and image feature information. The input of the first super-resolution sub-model is the first image; the output of the first super-resolution sub-model is the input of the second super-resolution sub-model; the output of the (t-1)-th super-resolution sub-model is the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model is the input of the (t+1)-th super-resolution sub-model, where t is a positive integer satisfying $2 \le t \le n-1$. The output of the t-th super-resolution sub-model also serves as an input to an output synthesis module, and the output of the output synthesis module is an input of the n-th super-resolution sub-model.
- the low resolution image is reconstructed by cascading multiple super resolution sub-models.
- the reconstructed image obtained by the reconstruction has a higher pixel value, which can improve the image quality of the reconstructed image.
- The reconstructed images output by the first n-1 super-resolution sub-models are used as the image information for the last super-resolution sub-model. This input contains more image information and reduces the loss of image information, which improves the accuracy of the newly created super-resolution model and thus the image reconstruction quality.
- The reconstructed image information output by the output synthesis module is computed from the outputs of the super-resolution sub-models and their weights $w_k$, where k is a positive integer satisfying $1 \le k \le n-1$ and $w_k$ is the weight of the k-th super-resolution sub-model.
- $w_k$ is a parameter in the initial super-resolution model.
- The weight $w_k$ in the initial super-resolution model may be optimized based on the error loss.
- The super-resolution sub-model is a three-layer fully convolutional deep neural network.
- In this three-layer fully convolutional deep neural network, the first and second layers are used to extract image information from the low-resolution image, that is, to obtain information that can be used for super-resolution reconstruction.
- The third layer reconstructs the high-resolution image by using the image information extracted and transformed by the first two layers.
- The two additional convolutional layers in the three-layer fully convolutional deep neural network help extract more accurate image information.
- The super-resolution sub-models, each composed of a three-layer fully convolutional deep neural network, need to be connected in series to form the super-resolution model.
- By using the above three-layer fully convolutional deep neural network, the super-resolution sub-model can extract more accurate image information with fewer computing resources. More accurate image information facilitates the reconstruction of higher-quality images while saving computing resources.
- The error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
- The added regularization term L3 is used to reduce over-fitting and improve the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.
- Each convolutional layer in the three-layer fully convolutional deep neural network includes at least one convolution kernel, and the weight matrix W of the convolution kernel is a parameter in the initial super-resolution model.
- an embodiment of the present application provides an image reconstruction device, where the device includes a module or unit for performing the image reconstruction method provided by the first aspect or any possible implementation of the first aspect.
- An embodiment of the present invention provides a chip system, where the chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected by lines, and the memory stores program instructions.
- An embodiment of the present invention provides a computer-readable storage medium that stores program instructions; when the program instructions are executed by a processor, the method described in the first aspect or any possible implementation of the first aspect is implemented.
- an embodiment of the present invention provides a computer program product, when the computer program product is run by a processor, implementing the method described in the first aspect or any possible implementation manner of the first aspect.
- the error loss includes the pixel mean square error, and the error loss also includes the image feature mean square error.
- the error loss used to train the initial super-resolution model contains more comprehensive information on error loss, which can reduce the loss of high-frequency information of the reconstructed image and improve the reconstruction quality of the reconstructed image.
- n is a positive integer greater than 1
- the low-resolution image is reconstructed by cascading multiple super-resolution sub-models.
- the reconstructed image obtained by the reconstruction has a higher pixel value, which can improve the image quality of the reconstructed image.
- The reconstructed images output by the first n-1 super-resolution sub-models are used as the image information for the last super-resolution sub-model. This input contains more image information and reduces the loss of image information, which improves the accuracy of the newly created super-resolution model and thus the image reconstruction quality.
- FIG. 1 is a schematic flowchart of an image reconstruction method according to an embodiment of the present application.
- FIG. 2 is a schematic block diagram of a method for establishing an image reconstruction model according to an embodiment of the present application.
- FIG. 3 is a schematic structural diagram of a super-resolution model provided by an embodiment of the present application.
- FIG. 4 is a schematic structural diagram of a super-resolution model provided by an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of an image reconstruction device according to an embodiment of the present disclosure.
- FIG. 6 is a schematic structural diagram of another image reconstruction device according to an embodiment of the present application.
- Super resolution refers to a technique of reconstructing a low resolution (LR) image into a high resolution (HR) image by a computer using an image processing method.
- High-resolution images mean that the image has a high pixel density and can provide more image detail, which often plays a key role in the application.
- Image super-resolution techniques can be divided into two categories: reconstruction-based image super-resolution methods and learning-based image super-resolution methods.
- A reconstruction-based image super-resolution method can statistically estimate the high-resolution image with the maximum posterior probability by using a frequency domain algorithm or a spatial domain algorithm.
- the learning-based image super-resolution method can include two phases, a training phase and a test phase.
- In the training phase, the initial super-resolution model and the training set are first established.
- The training set may include a plurality of low-resolution images and a high-resolution image corresponding to each low-resolution image. The correspondence between high-resolution and low-resolution images is learned from the low-resolution images in the training set and their corresponding high-resolution images, and the values of the parameters in the initial super-resolution model are corrected accordingly, so that the error between the high-resolution image and the reconstructed image converges. This finally determines the newly created super-resolution model obtained after training.
- In the test phase, super-resolution reconstruction of an image can be guided by the newly created super-resolution model.
- A low-resolution image and its corresponding high-resolution image may be acquired as follows: the high-resolution image is processed by a blur function to obtain the corresponding low-resolution image.
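- Building one training pair from a high-resolution image can be sketched as below. The text does not specify the blur function, so this sketch assumes a simple block average that blurs and downsamples in one step; `blur_and_downsample` is a hypothetical helper, and images are 2-D lists of grayscale values.

```python
from typing import List

Gray = List[List[float]]  # 2-D grayscale image as nested lists


def blur_and_downsample(high_res: Gray, factor: int = 2) -> Gray:
    """Average each factor x factor block (an assumed, very simple blur function)."""
    rows, cols = len(high_res), len(high_res[0])
    low_res = []
    for r in range(0, rows - rows % factor, factor):
        row = []
        for c in range(0, cols - cols % factor, factor):
            block = [high_res[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        low_res.append(row)
    return low_res


high = [[0, 2, 4, 6],
        [2, 4, 6, 8],
        [1, 1, 3, 3],
        [1, 1, 3, 3]]
low = blur_and_downsample(high)
training_pair = (low, high)  # (low-resolution input, high-resolution target)
print(low)
```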
- the initial super-resolution model may be an experimentally determined model and may be non-linear.
- the super resolution model can be a convolutional neural network.
- A neural network may be composed of neural units. A neural unit may refer to an arithmetic unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the arithmetic unit may be: $f\left(\sum_{s} W_s x_s + b\right)$
- where $W_s$ is the weight of $x_s$,
- b is the offset (bias) of the neural unit, and
- f is the activation function of the neural unit, which introduces nonlinear characteristics into the neural network and converts the input signal of the neural unit into an output signal.
- the output signal of the activation function can be used as an input to the next layer of convolution.
- the activation function can be a sigmoid function.
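- The computation of a single neural unit with a sigmoid activation can be sketched in a few lines. The function names `sigmoid` and `neural_unit` are illustrative and not taken from the application.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Sigmoid activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Output f(sum_s W_s * x_s + b) of one neural unit."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# The weighted sum is 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, so the output is 0.5
unit_output = neural_unit(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(unit_output)
```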
- a neural network is a network formed by joining together a plurality of the above-described single neural units, that is, the output of one neural unit may be an input of another neural unit.
- The input of each neural unit can be connected to a local receptive field of the previous layer to extract features of that local receptive field; the local receptive field can be an area composed of several neural units.
- A convolutional neural network (CNN) is a deep neural network with a convolutional structure.
- A convolutional neural network contains a feature extractor consisting of convolutional layers and sub-sampling layers.
- The feature extractor can be thought of as a filter, and the convolution process can be thought of as convolving a trainable filter with an input image or a convolutional feature map.
- a convolutional layer is a layer of neurons in a convolutional neural network that convolves an input signal. In the convolutional layer of a convolutional neural network, a neuron can be connected only to a portion of the adjacent layer neurons.
- a convolutional layer usually contains several feature planes, each of which can be composed of a number of rectangularly arranged neural units.
- the neural units of the same feature plane share weights, and the weights shared here are convolution kernels. Sharing weight can be understood as the way in which image information is extracted regardless of location. The underlying principle is that the statistics of a certain part of the image are the same as the other parts. That means that the image information learned in one part can also be used in another part. So for all positions on the image, we can use the same learned image information. In the same convolutional layer, multiple convolution kernels can be used to extract different image information. Generally, the larger the number of convolution kernels, the richer the image information reflected by the convolution operation.
- The convolution kernel can be initialized in the form of a matrix of random size, and it can obtain reasonable weights through learning during the training of the convolutional neural network.
- The direct benefit of weight sharing is that it reduces the connections between the layers of the convolutional neural network while also reducing the risk of overfitting.
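- Weight sharing can be illustrated with a minimal sketch of a single-channel convolution in which one small kernel is reused at every image position. As in most neural network libraries, the operation below is technically a cross-correlation without padding; `convolve2d` is a hypothetical helper, not an API of any library.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide one shared kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_rows = len(image) - kh + 1
    out_cols = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # The same kernel weights are applied at every position (r, c).
            row.append(sum(kernel[i][j] * image[r + i][c + j]
                           for i in range(kh) for j in range(kw)))
        feature_map.append(row)
    return feature_map


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = convolve2d(image, kernel)
print(feature_map)
```

- The kernel here has four weights no matter how large the image is, which is the reduction in connections that weight sharing provides.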
- A convolutional neural network can use the error back propagation (BP) algorithm to correct the parameters of the initial super-resolution model during training, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller.
- Specifically, the input signal is propagated forward until the output produces an error loss, and the parameters in the initial super-resolution model are updated by propagating the error loss information backward, so that the error loss converges.
- The back propagation algorithm is a backward pass driven by the error loss, aiming to obtain the parameters of the optimal super-resolution model, such as the weight matrix.
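- The forward pass and backward update can be sketched for the smallest possible case, a single sigmoid unit trained with a squared error. This is an illustrative assumption, not the training procedure of the application; `train_step`, the learning rate, and the toy data are all hypothetical.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_step(weights: List[float], bias: float, inputs: List[float],
               target: float, lr: float = 0.5) -> Tuple[List[float], float, float]:
    """One forward pass followed by one back-propagation update."""
    # Forward: propagate the input to the output and measure the error loss.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    out = sigmoid(z)
    loss = (out - target) ** 2
    # Backward: chain rule, dL/dz = 2 * (out - target) * out * (1 - out).
    dz = 2.0 * (out - target) * out * (1.0 - out)
    new_weights = [w - lr * dz * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * dz
    return new_weights, new_bias, loss


weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(200):
    weights, bias, loss = train_step(weights, bias, inputs=[1.0, 0.5], target=0.9)
    losses.append(loss)
print(losses[0], losses[-1])
```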
- the pixel value information of the image and the information of the image feature may be collectively referred to as image information.
- The pixel value can be a red, green, and blue (RGB) color value, represented as a long integer encoding the color.
- For example, the pixel value is 65536*Red+256*Green+Blue, where Blue represents the blue component, Green represents the green component, and Red represents the red component.
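- The packing formula above and its inverse can be sketched as follows, assuming each component is an integer in the range 0 to 255 (a range the text does not state explicitly). `pack_rgb` and `unpack_rgb` are illustrative names.

```python
from typing import Tuple


def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pixel value = 65536 * Red + 256 * Green + Blue."""
    return 65536 * red + 256 * green + blue


def unpack_rgb(value: int) -> Tuple[int, int, int]:
    """Recover (Red, Green, Blue), assuming each component is in 0..255."""
    return value // 65536, (value // 256) % 256, value % 256


packed = pack_rgb(255, 128, 0)
components = unpack_rgb(packed)
print(packed, components)
```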
- the pixel value can be a grayscale value.
- Image features refer to texture features, shape features, and spatial relationship features of images and high-level semantic features of images. The details are as follows:
- the texture feature of an image is a global feature of the image that describes the surface properties of the scene corresponding to the image or image region.
- The texture feature of an image is not based on the characteristics of a single pixel; it is computed statistically over regions containing multiple pixels.
- The texture features of an image are therefore relatively resistant to noise.
- Nevertheless, the texture features computed for an image may deviate considerably.
- The texture features of an image can be described in the following ways:
- a. Statistical methods, such as extracting texture features from the autocorrelation function of the image (or the energy spectrum function of the image), that is, extracting characteristic parameters such as the coarseness and directionality of the texture by calculating the energy spectrum function of the image.
- b. The geometric method, a texture feature analysis method based on the theory of texture primitives (basic texture elements). In this method, a complex texture can be composed of a number of simple texture primitives repeatedly arranged in a regular pattern.
- c. The model method, which is based on a structural model of the image and uses the parameters of the model as texture features.
- the shape feature of an image can be represented in two ways: as a contour feature or as a region feature.
- the contour feature of an image refers to the outer boundary of the object, and the region feature of an image refers to the entire shape region occupied by the object.
- the shape feature of the image can be described in the following ways: a. The boundary feature method, which obtains the shape parameters of the image by describing the boundary features.
- b. The Fourier shape descriptor method, which uses the Fourier transform of the object boundary as the shape description and uses the closure and periodicity of the region boundary to derive the curvature function, the centroid distance, and the complex coordinate function from the boundary points as shape features.
- c. The geometric parameter method, in which the expression and matching of shapes adopt a region feature description, for example using shape parameters such as moments, area, and perimeter to describe the shape feature of the image.
- the spatial relationship feature of an image refers to the mutual spatial position or relative direction relationships between multiple regions segmented from the image. These relationships can be divided into connection relationships, adjacency relationships, overlapping relationships, inclusion relationships, containment relationships, and so on.
- the spatial position of an image can be divided into two categories: relative spatial position and absolute spatial position.
- the relative spatial position emphasizes the relative arrangement of the targets, such as above/below relationships.
- the absolute spatial position emphasizes the distance and orientation between the targets.
- the use of spatial relationship features of images can enhance the ability to distinguish between image content, but spatial relationship features are often sensitive to rotation, inversion, scale changes, etc. of images or targets.
- Image high-level semantic features are higher-level cognitive features relative to image texture features, shape features, and spatial relationship features, and are used to describe human understanding of images.
- Image high-level semantic features take the image as the object and determine what targets appear at what positions in the image, what relationships exist between the targets, what scene the image depicts, and how the scene can be applied. Extracting image high-level semantic features is the process of converting an input image into an intuitively readable, text-like language representation. Obtaining high-level semantic features of images requires establishing a correspondence between images and semantic text.
- the high-level semantic features of the image can be divided into object semantic features, spatial relationship semantic features, scene semantic features, behavioral semantic features and emotional semantic features.
- the object semantic feature may be a feature for determining a person, an animal, an object, and the like.
- the spatial relationship semantic feature may be, for example, a semantic feature for determining "people in front of the house" or "ball on the grass".
- the scene semantic feature may be, for example, a semantic feature for determining "the sea" or "field".
- the behavioral semantic feature may be, for example, a semantic feature for determining "performance dance" or "sports competition".
- the emotional semantic feature may be, for example, a semantic feature for determining "pleasant images" or "exciting images".
- Object semantic features and spatial relationship semantic features require some logical reasoning and identification of the categories of objects in the image. Scene semantic features, behavioral semantic features, and emotional semantic features involve abstract properties of images and require high-level reasoning about the meaning of the image features.
- methods for extracting high-level semantic features of images may include: a method based on processing scope, a method based on machine learning, a method based on human-computer interaction, and a method based on external information sources.
- the method based on processing scope can be performed on the premise of image segmentation and object recognition.
- object templates, scene classifiers, and the like are used to mine semantics by identifying objects and the topological relationships between objects, and to generate the corresponding scene semantic information.
- the machine-learning-based method learns the low-level features of the image and explores the relationship between the low-level features and the image semantics, so as to establish a mapping between the low-level features and the high-level semantic features of the image.
- the machine learning-based approach consists of two key steps: one is the extraction of low-level features, such as textures, shapes, and so on.
- the second is the use of mapping algorithms.
- the method based on human-computer interaction generally uses low-level features, while users add high-level knowledge.
- the extraction methods mainly include image preprocessing and feedback learning.
- the image preprocessing method may be manual annotation of images in the image library, or some automatic or semi-automatic image semantic annotation methods.
- Feedback learning adds manual intervention in the process of extracting image semantics, extracts the semantic features of images through repeated interactions between users and systems, and establishes and corrects high-level semantic concepts associated with image content.
- the feature loss may be an image feature mean square error
- the image feature may include at least one of a texture feature, a shape feature, a spatial relationship feature, and an image high-level semantic feature.
- the pixel mean square error and the image feature mean square error are described below.
- the reconstructed image of the input low-resolution image can be obtained, and the pixel mean square error between the reconstructed image and the high-resolution image corresponding to the input low-resolution image can be calculated, that is, the pixel mean square loss:
- $$L_1 = \frac{1}{F \times H} \sum_{x=1}^{F} \sum_{y=1}^{H} \left( I_{1,x,y} - I_{2,x,y} \right)^2$$
- $L_1$ is the pixel mean square loss
- F and H are the width and the height of the image in pixels, respectively
- $I_{1,x,y}$ is the pixel value at position (x, y) of the high-resolution image corresponding to the input low-resolution image, and $I_{2,x,y}$ is the pixel value at position (x, y) of the reconstructed image of the low-resolution image.
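- The pixel mean square loss can be computed as the average squared difference over all F x H pixel positions. The sketch below assumes grayscale images stored as nested lists of equal size; `pixel_mse` is a hypothetical helper name, not part of the embodiment.

```python
def pixel_mse(high_res, reconstructed):
    """Mean squared pixel difference between two equally sized 2-D images."""
    height = len(high_res)
    width = len(high_res[0])
    if len(reconstructed) != height or any(len(row) != width for row in reconstructed):
        raise ValueError("images must have the same size")
    total = 0.0
    for row_hr, row_rec in zip(high_res, reconstructed):
        for p1, p2 in zip(row_hr, row_rec):
            total += (p1 - p2) ** 2  # squared difference at one (x, y) position
    return total / (width * height)


high_res = [[10, 20], [30, 40]]
reconstructed = [[12, 20], [30, 36]]
l1 = pixel_mse(high_res, reconstructed)
```

- For a color image the same calculation could be applied per channel and averaged; that extension is an assumption, since the text defines the loss on a single pixel value per position.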
- the image feature may be a feature extracted by the image feature extraction device from the reconstructed image and the high resolution image, and the image feature may be an N-dimensional vector.
- the feature loss can be the mean square error between the image features of the reconstructed image and those of the high-resolution image, i.e.:
- $$L_2 = \frac{1}{N} \sum_{i=1}^{N} \left( \phi_{1,i} - \phi_{2,i} \right)^2$$
- $L_2$ is the feature loss
- $\phi_{1,i}$ is the image feature value of the i-th dimension of the high-resolution image corresponding to the low-resolution image
- $\phi_{2,i}$ is the image feature value of the i-th dimension of the reconstructed image of the low-resolution image
- i is a positive integer satisfying $1 \le i \le N$.
- $f(x_i)$ is the value of the established model at $x_i$
- $y_i$ is the sampled value
- $W = (W_0, W_1, \ldots, W_N)$.
- the regularization terms are intended to reduce the risk of overfitting, where W can be a weight matrix and $\lambda$ is the weight of the regularization term.
- the prior-art super-resolution model training process considers only the pixel mean square error: the pixel mean square error between the high-resolution image corresponding to a low-resolution image in the training set and the reconstructed image obtained from that low-resolution image is used to train the initial super-resolution model until the pixel mean square error converges, yielding a newly created super-resolution model. That is to say, the newly created super-resolution model accounts only for the error loss caused by the pixel mean square error, and reconstructed images produced by a super-resolution model trained in this way lose high-frequency information severely, which reduces the reconstruction quality of the reconstructed image.
- the embodiment of the present application provides an image reconstruction method.
- the error loss includes the pixel mean square error, and the error loss also includes the image feature mean square error.
- the error loss used to train the initial super-resolution model contains more comprehensive information on error loss, which can reduce the loss of high-frequency information of the reconstructed image and improve the reconstruction quality of the reconstructed image.
- the inventive principle involved in the present application may include: in the super-resolution model training phase, reconstructing a low-resolution image through an initial super-resolution model to obtain a reconstructed image, that is, a super-resolution image, and determining the error loss between the super-resolution image and the high-resolution image corresponding to the low-resolution image, the error loss including the pixel mean square error and the image feature mean square error.
- the newly created super-resolution model is determined based on the error loss and the initial super-resolution model.
- FIG. 1 is a schematic flowchart of an image reconstruction method according to an embodiment of the present application. As shown in FIG. 1, the image reconstruction method includes, but is not limited to, the following steps S101-S104.
- S101: the image reconstruction device inputs the fifth image to the initial super-resolution model to obtain the reconstructed third image.
- S102: the image reconstruction device determines an error loss between the third image and the fourth image, and the error loss includes a pixel mean square error and an image feature mean square error.
- S103: the image reconstruction device establishes a newly created super-resolution model according to the initial super-resolution model and the error loss.
- S104: the image reconstruction device inputs the first image into the newly created super-resolution model to obtain the reconstructed second image.
- the image reconstruction model, that is, the super-resolution model
- the process of establishing a super-resolution model can be divided into a training phase and a testing phase.
- the training phase is the process of training the initial super-resolution model with low-resolution images and their corresponding high-resolution images so that the error loss converges.
- the test phase inputs a low-resolution test image into the newly created super-resolution model to obtain a reconstructed image, thereby testing how well the newly created super-resolution model reconstructs images to improve image resolution.
- the test phase can also be seen as the process of reconstructing an image using the newly created super-resolution model.
- steps S101-S103 can be regarded as the flow of the super-resolution model training phase.
- the newly created super-resolution model is the super-resolution model obtained by adjusting the parameters in the initial super-resolution model according to the error loss.
- the newly created super-resolution model can be used directly to reconstruct an image to increase the resolution of the image.
- Step S104 can be regarded as a process of reconstructing an image using a newly created super-resolution model to improve the resolution of the image, that is, the test phase of the newly created super-resolution model.
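- The order of steps S101-S104 can be sketched under strong simplifying assumptions: images are 1-D lists, the "super-resolution model" is a hypothetical one-parameter model (`gain`), `features` is a hypothetical neighbour-difference feature extractor, and the parameter is adjusted by numerical gradient descent rather than back propagation. None of these choices come from the embodiment; they only illustrate how the training and test phases fit together.

```python
def reconstruct(low_res, gain, scale=2):
    """Hypothetical model: nearest-neighbour upsampling multiplied by `gain`."""
    return [gain * p for p in low_res for _ in range(scale)]


def features(image):
    """Hypothetical image feature: differences between neighbouring pixels."""
    return [b - a for a, b in zip(image, image[1:])]


def mse(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def error_loss(third, fourth):
    """Pixel mean square error plus image feature mean square error."""
    return mse(third, fourth) + mse(features(third), features(fourth))


def train(fifth, fourth, gain=0.5, learning_rate=0.01, steps=100, eps=1e-4):
    for _ in range(steps):
        # S101 + S102: reconstruct the fifth image into a third image and
        # measure the error loss (at gain +/- eps to estimate the gradient).
        loss_plus = error_loss(reconstruct(fifth, gain + eps), fourth)
        loss_minus = error_loss(reconstruct(fifth, gain - eps), fourth)
        # S103: adjust the model parameter according to the error loss.
        gain -= learning_rate * (loss_plus - loss_minus) / (2 * eps)
    return gain


fourth = [2.0, 2.0, 4.0, 4.0, 6.0, 6.0]  # high-resolution training image
fifth = [2.0, 4.0, 6.0]                  # assumed blurred version of `fourth`
new_gain = train(fifth, fourth)          # parameter of the newly created model
second = reconstruct([1.0, 3.0], new_gain)  # S104: reconstruct a first image
```

- With this toy data the optimal `gain` is 1, so the trained model reproduces the high-resolution training image from its blurred version.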
- FIG. 2 is a schematic block diagram of a method for establishing an image reconstruction model according to an embodiment of the present application.
- the low-resolution image 101 is input to the initial super-resolution model 102, the low-resolution image 101 is reconstructed by the initial super-resolution model 102 to obtain a reconstructed image 103, and the pixel mean square error 105 between the reconstructed image 103 and the high-resolution image 104 and the image feature mean square error 106 between the reconstructed image 103 and the high-resolution image 104 are calculated.
- the image features may be the image features of the reconstructed image and the image features of the high-resolution image, extracted by the image feature extraction device from the reconstructed image 103 and the high-resolution image 104, respectively.
- the error loss 107 is determined from the pixel mean square error 105 and the image feature mean square error 106, and the super-resolution model is updated based on the error loss 107 and the initial super-resolution model 102.
- the low-resolution image is a lower-resolution image obtained by blurring the high-resolution image.
- the first image is a low resolution image
- the second image is a reconstructed image reconstructed by the newly created super-resolution model.
- the fifth image is a low resolution image
- the third image is a reconstructed image reconstructed by an initial super resolution model
- the fourth image is a high resolution image corresponding to the fifth image.
- the fifth image may be a low resolution image obtained by blurring the fourth image.
- the fourth image and the fifth image constitute a high and low resolution image training set.
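- Building such a training set can be sketched as follows. The text does not specify the blurring process, so this sketch assumes block averaging over t x t pixel blocks as one possible blur; `downsample` and the sample data are hypothetical.

```python
def downsample(image, t):
    """Assumed blur: replace each t x t block of a 2-D image by its average."""
    height, width = len(image), len(image[0])
    if height % t or width % t:
        raise ValueError("image size must be divisible by t")
    low = []
    for top in range(0, height, t):
        row = []
        for left in range(0, width, t):
            block = [image[y][x]
                     for y in range(top, top + t)
                     for x in range(left, left + t)]
            row.append(sum(block) / len(block))
        low.append(row)
    return low


fourth_images = [
    [[0, 2, 4, 6], [2, 4, 6, 8], [10, 10, 20, 20], [10, 10, 20, 20]],
]
# Each training pair is (fifth image, fourth image) = (low-res, high-res).
training_set = [(downsample(hr, 2), hr) for hr in fourth_images]
```

- Other blurs, such as Gaussian filtering followed by decimation, would fit the same pairing scheme.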
- step S101 may be to input M fifth images into the initial super-resolution model to obtain M reconstructed third images.
- Step S102 may be to determine M error losses according to the M third images and the M fourth images, wherein any one of the M error losses is the error loss between the i-th third image among the M third images and the j-th fourth image among the M fourth images, and the image obtained by inputting into the initial super-resolution model the fifth image obtained by blurring the j-th fourth image is the i-th third image
- M is an integer greater than 1, and i and j are both positive integers less than or equal to M.
- the newly created super-resolution model may be obtained by adjusting parameters in the initial super-resolution model based on the M error losses.
- the initial super-resolution model is a first super-resolution model; the parameters in the first super-resolution model are adjusted according to the first of the M error losses to obtain a second super-resolution model; the parameters in the r-th super-resolution model are adjusted according to the r-th error loss to obtain the (r+1)-th super-resolution model; the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss; wherein r is a positive integer greater than or equal to 1 and less than or equal to M.
- there can be multiple training sample pairs, and M error losses can be calculated from M pairs of training samples.
- in one approach, the parameters of the initial super-resolution model are adjusted using the above M error losses together to obtain the newly created super-resolution model.
- in another approach, each time an error loss is obtained, the parameters of the super-resolution model are adjusted using that one error loss, that is, the parameters of the super-resolution model are adjusted M times to obtain the newly created super-resolution model.
- When M is a positive integer greater than 2, adjusting the parameters of the initial super-resolution model using multiple error losses obtained from multiple training sample pairs provides more sample information for adjusting the parameters in the initial super-resolution model, and the newly created super-resolution model obtained is more accurate.
- if there are multiple training sample pairs and the initial super-resolution model is adjusted each time an error loss is obtained, the number of adjustments becomes large, wasting processing resources and storage resources.
- adjusting the parameters of the initial super-resolution model once using the multiple error losses obtained from multiple training samples reduces the number of parameter adjustments in the super-resolution model, thereby saving processing resources and storage resources.
- the method of training the initial super-resolution model according to multiple error losses to obtain a newly created super-resolution model is not limited to the above two; other methods can also be used. For example, if the super-resolution model is trained N times, the number of error losses used in at least one training is one or more. N may be a positive integer greater than 1 and less than M.
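- The difference between adjusting the parameters once per error loss and adjusting them once with all M error losses can be sketched as follows. The sketch assumes a hypothetical one-parameter model (`gain`), scalar stand-ins for image pairs, and averaging of gradients as the way of combining M error losses; the text does not specify how the losses are combined.

```python
def loss_gradient(gain, low, high):
    """Gradient of the squared error (gain * low - high) ** 2 w.r.t. gain."""
    return 2.0 * (gain * low - high) * low


def adjust_per_loss(pairs, gain, lr):
    """Adjust the parameter after every error loss: M adjustments for M pairs."""
    adjustments = 0
    for low, high in pairs:
        gain -= lr * loss_gradient(gain, low, high)
        adjustments += 1
    return gain, adjustments


def adjust_once(pairs, gain, lr):
    """Combine the M error losses (assumed: mean gradient) into one adjustment."""
    mean_grad = sum(loss_gradient(gain, low, high) for low, high in pairs) / len(pairs)
    return gain - lr * mean_grad, 1


# Scalar stand-ins for (fifth image, fourth image) training pairs.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
gain_a, count_a = adjust_per_loss(pairs, gain=0.0, lr=0.05)
gain_b, count_b = adjust_once(pairs, gain=0.0, lr=0.05)
```

- The intermediate option mentioned in the text, N trainings with one or more error losses each, corresponds to splitting `pairs` into N groups and calling `adjust_once` on each group.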
- the super-resolution model can include multiple super-resolution sub-models.
- FIG. 3 is a schematic structural diagram of a super-resolution model provided by an embodiment of the present application.
- the super-resolution model 102 may include n super-resolution sub-models 1021, n being a positive integer greater than or equal to 2; each super-resolution sub-model 1021 is used to reconstruct the image information input to that super-resolution sub-model 1021.
- the image information includes pixel value information and image feature information;
- the input of the first super-resolution sub-model 1 is the first image, that is, the low-resolution image 101, and the output of the first super-resolution sub-model 1 is used as the input of the second super-resolution sub-model 2;
- the output of the (t-1)-th super-resolution sub-model t-1 is used as the input of the t-th super-resolution sub-model t, and the output of the t-th super-resolution sub-model t is used as the input of the (t+1)-th super-resolution sub-model t+1;
- the output of the output synthesis module 1022 is the input of the n-th super-resolution sub-model n, and the output of the n-th super-resolution sub-model n is the second image, i.e., the reconstructed image 103;
- the output synthesis module 1022 is used to determine the image information input to the n-th super-resolution sub-model according to the reconstructed images output by the first n-1 super-resolution sub-models 1021 and their respective weights.
- the super-resolution model 102 described above may be included in the image reconstruction device such that the image reconstruction device performs the image reconstruction method described in FIG.
- the low resolution image is reconstructed by cascading multiple super resolution sub-models.
- the reconstructed image obtained by the reconstruction has a higher pixel value, which can improve the image quality of the reconstructed image.
- the reconstructed images output by the first n-1 super-resolution sub-models are used as the image information input to the last super-resolution sub-model; this input contains more image information and reduces the loss of image information, which can improve the accuracy of the newly created super-resolution model and thus the image reconstruction quality.
- the choice of the number of super-resolution sub-models needs to consider two factors: the reconstructed image quality requirement and computing resources. The higher the reconstructed image quality requirement, the more super-resolution sub-models are needed; however, more super-resolution sub-models lead to more system computation and more computing resources consumed. Therefore, the choice of the number of super-resolution sub-models requires a compromise between the reconstructed image quality requirement and computing resources.
- for example, the image information that can be reconstructed by the first super-resolution sub-model is about 90% of the image information of the high-resolution image in the training set, and this 90% of the image information is the first image information
- the image information that can be reconstructed by the second super-resolution sub-model is about 90% of the image information other than the first image information, and so on. That is to say, each super-resolution sub-model, in the process of reconstructing an image, loses 10% of the remaining unreconstructed image information.
- for a sub-model other than the first, the remaining unreconstructed image information can be understood as the image information lost by the previous super-resolution sub-model.
- for the first super-resolution sub-model, the remaining unreconstructed image information can be understood as all the image information of the high-resolution image.
- suppose the reconstructed image quality requirement is that the reconstructed image reach 99% of the image information of the high-resolution image, and the computing resources can process up to five super-resolution sub-models in real time. If the number of super-resolution sub-models is N and N is a positive integer greater than or equal to 1, then N satisfies:
- $$1 - (1 - 0.9)^N \ge 0.99, \quad N \le 5$$
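- The trade-off in this example can be checked numerically. The sketch below assumes, as in the example, that every sub-model recovers 90% of the information still missing, so that N sub-models recover a fraction 1 - 0.1^N; the function names and the small floating-point tolerance are assumptions made for illustration.

```python
def reconstructed_fraction(n, per_model=0.9):
    """Fraction of image information recovered after n cascaded sub-models."""
    remaining = 1.0
    for _ in range(n):
        remaining *= 1.0 - per_model  # each sub-model loses 10% of what is left
    return 1.0 - remaining


def feasible_counts(quality=0.99, max_models=5):
    """Sub-model counts meeting the quality target within the resource limit."""
    tolerance = 1e-12  # guards against floating-point rounding at the boundary
    return [n for n in range(1, max_models + 1)
            if reconstructed_fraction(n) >= quality - tolerance]


counts = feasible_counts()
```

- Under these assumptions any N from 2 to 5 satisfies both conditions, and N = 2 uses the least computation.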
- the super-resolution model may or may not include the output synthesis module 1022.
- the two super-resolution sub-models can be connected in series: the input of the first super-resolution sub-model is a low-resolution image, the reconstructed image output by the first super-resolution sub-model is input to the second super-resolution sub-model, and the output of the second super-resolution sub-model is the reconstructed image of the super-resolution model.
- the reconstructed image information output by the output synthesis module 1022 is:
- $$\sum_{k=1}^{n-1} w_k O_k$$
- k is a positive integer satisfying $1 \le k \le n-1$; $w_k$ is the weight of the k-th super-resolution sub-model; $O_k$ is the image information of the image reconstructed by the k-th super-resolution sub-model.
- the weight w k of the k-th super-resolution sub-model may be determined based on experimental results or experience.
- the weight w k of the k-th super-resolution sub-model may also be a parameter in the super-resolution model, that is, during training of the initial super-resolution model, the weight w k in the initial super-resolution model may be optimized according to the error loss.
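- The cascade with an output synthesis step can be sketched as follows. The sketch assumes that image information is a flat list of values, that every sub-model preserves its size so the weighted sum is well defined, and that the sub-models are arbitrary callables; `run_cascade` and the three toy sub-models are hypothetical stand-ins for trained networks.

```python
def run_cascade(sub_models, weights, low_res):
    """Run n cascaded sub-models with a weighted synthesis before the last one.

    sub_models: list of n callables mapping image information to image information.
    weights: the n-1 weights w_k of the first n-1 sub-models.
    """
    if len(sub_models) < 2 or len(weights) != len(sub_models) - 1:
        raise ValueError("need n >= 2 sub-models and n-1 weights")
    outputs = []
    current = low_res
    for model in sub_models[:-1]:
        current = model(current)  # output of sub-model k feeds sub-model k+1
        outputs.append(current)   # keep O_k for the synthesis step
    # Output synthesis: element-wise weighted sum of O_1 .. O_{n-1}.
    synthesized = [sum(w * o[i] for w, o in zip(weights, outputs))
                   for i in range(len(outputs[0]))]
    return sub_models[-1](synthesized)


def add_one(image):
    return [p + 1.0 for p in image]


def double(image):
    return [2.0 * p for p in image]


def identity(image):
    return list(image)


result = run_cascade([add_one, double, identity], weights=[0.5, 0.5], low_res=[1.0, 3.0])
```

- Here the weights are fixed by hand, which corresponds to choosing them from experience; treating them as trainable parameters would mean updating `weights` from the error loss alongside the sub-model parameters.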
- the super-resolution sub-model 1021 may be a three-layer fully convolutional deep neural network.
- the first convolutional layer can be an input layer for extracting image information region by region.
- the input layer can contain multiple convolution kernels for extracting different image information.
- the second convolutional layer may be a transform layer for nonlinearly transforming the extracted image information; the extracted image information X may be transformed into $f(W \cdot X + b)$.
- the image information X may be a multi-dimensional vector, namely the multi-dimensional image information extracted by the plurality of convolution kernels in the first convolutional layer; W is the weight vector of the convolution kernel, b is the bias vector of the convolution kernel, and f can be an activation function.
- the third convolutional layer may be an output layer for reconstructing the image information output by the second convolutional layer.
- the reconstruction can also be performed by convolving the image information with multiple convolution kernels.
- the output of the output layer can be a 3-channel (color) image or a single-channel (grayscale) image.
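- The three-layer structure can be sketched in plain Python as follows. The sketch assumes zero-padded convolutions that keep the image size, ReLU as the activation function f, and hand-picked weights; `conv2d`, `sub_model`, and the parameter names are hypothetical, and a practical implementation would use a deep learning framework instead of nested loops.

```python
def conv2d(channels, kernels, biases):
    """Zero-padded 2-D convolution (cross-correlation) that keeps the size.

    channels: list of C feature maps; kernels: list of K kernels, each a list
    of C square weight grids; biases: list of K floats. Returns K feature maps.
    """
    height, width = len(channels[0]), len(channels[0][0])
    outputs = []
    for kernel, bias in zip(kernels, biases):
        size = len(kernel[0])
        pad = size // 2
        out = [[bias] * width for _ in range(height)]
        for c, grid in enumerate(kernel):
            for y in range(height):
                for x in range(width):
                    for dy in range(size):
                        for dx in range(size):
                            yy, xx = y + dy - pad, x + dx - pad
                            if 0 <= yy < height and 0 <= xx < width:
                                out[y][x] += grid[dy][dx] * channels[c][yy][xx]
        outputs.append(out)
    return outputs


def relu(maps):
    """Assumed activation function f."""
    return [[[max(0.0, v) for v in row] for row in m] for m in maps]


def sub_model(image, params):
    x = conv2d([image], params["k1"], params["b1"])   # layer 1: extract X
    x = relu(conv2d(x, params["k2"], params["b2"]))   # layer 2: f(W.X + b)
    return conv2d(x, params["k3"], params["b3"])      # layer 3: reconstruct


center = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
neg_center = [[0.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
params = {
    "k1": [[center], [neg_center]], "b1": [0.0, 0.0],            # two 3x3 kernels
    "k2": [[[[1.0]], [[0.0]]], [[[0.0]], [[1.0]]]], "b2": [0.0, 0.0],  # 1x1 kernels
    "k3": [[center, neg_center]], "b3": [0.0],                   # one output channel
}
image = [[1.0, -2.0], [0.5, 3.0]]
output = sub_model(image, params)
```

- The hand-picked weights make the network reproduce its input, which is convenient for checking the data flow; trained weights would instead be learned from the error loss.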
- when the super-resolution model is composed of a plurality of super-resolution sub-models connected in series and each adopts the above three-layer fully convolutional deep neural network, the first convolutional layer and the second convolutional layer are used to extract image information from the low-resolution image, i.e., to obtain information that can be used for super-resolution reconstruction.
- the third convolutional layer reconstructs the high-resolution image using the image information extracted and transformed by the first two layers. Compared with image information extraction using only one convolutional layer, the two convolutional layers of the three-layer fully convolutional deep neural network can extract more accurate image information.
- the super-resolution sub-models composed of three-layer fully convolutional deep neural networks need to be connected in series to form a super-resolution model.
- Connecting multiple super-resolution sub-models in series requires more computing resources, whereas fewer convolutional layers mean a lower amount of computation, so the number of convolutional layers in the super-resolution sub-model needs to balance computing resources against accuracy.
- with the above three-layer fully convolutional deep neural network, the super-resolution sub-model can extract more accurate image information using fewer computing resources. More accurate image information facilitates the reconstruction of higher-quality reconstructed images and saves computing resources.
- the weight vector W of the convolution kernel may be a parameter in the super-resolution model, that is, during training of the initial super-resolution model, the weight vector W of the convolution kernel may also be optimized according to the error loss.
- the error loss is:
- $$L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$$
- $L_1$ is the pixel mean square error
- $\lambda_1$ is the weight of the pixel mean square error
- $L_2$ is the image feature mean square error.
- $\lambda_2$ is the weight of the image feature mean square error
- $L_3$ is the regularization term of $w_k$
- $\lambda_3$ is the weight of the regularization term.
- w is the weight matrix of the super-resolution sub-model.
- $\lambda_1$, $\lambda_2$ and $\lambda_3$ may be determined experimentally or empirically.
- the added regularization term $L_3$ is used to reduce overfitting and improve the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.
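- The weighted combination of the pixel mean square error, the image feature mean square error, and the regularization term can be sketched as follows. The form of the regularization term is not given in the text, so the sketch assumes a sum of squared weights; the function names, the default values of `lam1`, `lam2`, `lam3`, and the sample data are all hypothetical.

```python
def mse(u, v):
    """Mean squared error between two equally long value sequences."""
    if len(u) != len(v):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def error_loss(pixels_hr, pixels_rec, feats_hr, feats_rec, weights,
               lam1=1.0, lam2=1.0, lam3=0.01):
    """Weighted sum of pixel MSE, feature MSE, and a regularization term."""
    l1 = mse(pixels_hr, pixels_rec)    # pixel mean square error
    l2 = mse(feats_hr, feats_rec)      # image feature mean square error
    l3 = sum(w * w for w in weights)   # assumed regularizer: sum of squared w_k
    return lam1 * l1 + lam2 * l2 + lam3 * l3


loss = error_loss(
    pixels_hr=[1.0, 2.0], pixels_rec=[1.0, 4.0],
    feats_hr=[0.5], feats_rec=[1.5],
    weights=[0.5, 0.5],
    lam1=1.0, lam2=0.5, lam3=2.0,
)
```

- Raising `lam2` relative to `lam1` puts more emphasis on matching image features, which is the mechanism the text relies on to preserve high-frequency information.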
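- Step 1: The obtained high-resolution face images are $\{Y_m \mid 1 \le m \le M\} \in R^{a \times b}$
- $R^{a \times b}$ indicates the size of the image
- the resolution of the high-resolution face image is $a \times b$.
- the low-resolution face images are obtained by applying D to the high-resolution face images, where D can be a downsampling function, i.e., a blurring function.
- the resolution of the low-resolution face image is $(a/t) \times (b/t)$
- t is a positive integer
- the high-resolution face images and the corresponding low-resolution face images constitute a training set.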
- the super-resolution model training is performed according to the training set.
- the specific process is as follows:
- the super-resolution model can contain n super-resolution sub-models.
- FIG. 4 is a schematic structural diagram of a super-resolution model provided by an embodiment of the present application.
- the k-th super-resolution sub-model may be a three-layer fully convolutional deep neural network.
- the face image output by the (k-1)-th super-resolution sub-model, or the low-resolution face image, is used as the input of the first convolutional layer in the k-th super-resolution sub-model; the output of the first convolutional layer of the k-th super-resolution sub-model is the face image information $X_k$ obtained by convolution with each of s convolution kernels, and the face image information $X_k$ is used as the input of the second convolutional layer of the k-th super-resolution sub-model.
- the output of the second convolutional layer in the k-th super-resolution sub-model is $f(W \cdot X + b)$; the output of the third convolutional layer in the k-th super-resolution sub-model is a reconstructed face image 105 obtained by convolving $f(W \cdot X + b)$ with f convolution kernels.
- the size of each convolution kernel in the first convolutional layer may be different from the size of the convolution kernels in the third convolutional layer, and the convolution kernel size of the second convolutional layer may be 1.
- the number of convolution kernels in the first convolutional layer, the number of convolution kernels in the second convolutional layer, and the number of convolution kernels in the third convolutional layer may or may not be equal.
- k is a positive integer satisfying $1 \le k \le n-1$.
- the n-th super-resolution sub-model can also be a three-layer fully convolutional deep neural network.
- the face image information input to the first convolutional layer of the n-th super-resolution sub-model can be $\sum_{k=1}^{n-1} w_k O_k$, where $O_k$ is the face image information of the face image reconstructed by the k-th super-resolution sub-model.
- the second convolutional layer of the nth super-resolution sub-model is similar to the second convolutional layer of the k-th super-resolution sub-model described above, and will not be described again.
- the reconstructed face image 105 of the third convolutional layer output of the nth super-resolution sub-model is the reconstructed image 103 in the super-resolution model described in FIG.
- the number of convolution kernels of the third convolutional layer can be equal to the number of channels of the input low-resolution face image, and these kernels are used to reconstruct the low-resolution face image to obtain a reconstructed face image. For example, if the number of channels of the low-resolution face image is 3, that is, R, G, and B each occupy one channel, then the third convolutional layer of the last super-resolution sub-model, i.e., the n-th super-resolution sub-model, has 3 convolution kernels, which are used to reconstruct the low-resolution face image composed of red, green, and blue to obtain the reconstructed face image, and the reconstructed face image is also composed of the three colors red, green, and blue.
- if the number of channels of the low-resolution face image is 1, that is, the low-resolution face image is a grayscale image, then the third convolutional layer of the last super-resolution sub-model, i.e., the n-th super-resolution sub-model, has 1 convolution kernel, which is used to reconstruct the grayscale low-resolution face image to obtain a reconstructed face image, and the reconstructed face image is also a grayscale image.
- the parameters in the newly created super-resolution model may also be determined according to the error loss and at least one of the following: the first image, the second image, and the third image. That is to say, the device for establishing the image reconstruction model can determine the parameters in the newly created super-resolution model based on the error loss and the images related to the error loss, without adjusting on the basis of the parameter values in the initial super-resolution model.
- the above image reconstruction method can be used in an image recognition system, such as a face recognition system.
- the above image reconstruction method can be used in an image enhancement system.
- FIG. 5 is a schematic structural diagram of an image reconstruction apparatus according to an embodiment of the present application.
- the device may include a processing unit 501 and a receiving unit 502, where:
- the processing unit 501 inputs the first image into the newly created super-resolution model to obtain the reconstructed second image, and the resolution of the second image is higher than the first image;
- the new super-resolution model is obtained by training the initial super-resolution model with error loss;
- the error loss includes pixel mean square error and image feature mean square error;
- image features include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features;
- the receiving unit 502 is configured to receive the first image that is input into the newly created super-resolution model.
- the error loss is the error loss between the third image and the fourth image, and the third image is obtained by inputting the fifth image into the initial super-resolution model for reconstruction;
- the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image; the initial super-resolution model is used to reconstruct an image input to it so as to increase its resolution.
- there are M third images, M fourth images, and M fifth images
- the number of error losses is M
- the M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction; the M error losses are determined according to the M third images and the M fourth images;
- any one of the M error losses is the error loss between the i-th third image of the M third images and the j-th fourth image of the M fourth images, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image
- M is a positive integer greater than 1
- i and j are both positive integers less than or equal to M.
- the newly created super-resolution model is obtained by adjusting parameters in the initial super-resolution model according to M error losses;
- the initial super-resolution model is the first super-resolution model.
- the second super-resolution model is obtained by adjusting the parameters in the first super-resolution model according to the first of the M error losses, and the (r+1)-th super-resolution model is obtained by adjusting the parameters in the r-th super-resolution model according to the r-th error loss.
- the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
- the initial super-resolution model includes n super-resolution sub-models, where n is a positive integer greater than or equal to 2; each super-resolution sub-model is used to reconstruct the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information;
- the input of the first super-resolution sub-model is the first image
- the output of the first super-resolution sub-model is used as the input of the second super-resolution sub-model, the output of the (t-1)-th super-resolution sub-model is used as the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model is used as the input of the (t+1)-th super-resolution sub-model;
- t is a positive integer satisfying 2 ≤ t ≤ n-1; the output of the t-th super-resolution sub-model is also used as an input of the output synthesis module, the output of the output synthesis module is used as the input of the n-th super-resolution sub-model, and the output of the n-th super-resolution sub-model is the second image
- the output synthesis module is configured to determine the input of the n-th super-resolution sub-model based on the reconstructed image information output by the first n-1 super-resolution sub-models and their respective weights
- the reconstructed image information output by the output synthesis module is $\sum_{k=1}^{n-1} w_k O_k$, where $O_k$ is the reconstructed image information output by the k-th super-resolution sub-model
- k is a positive integer satisfying 1 ≤ k ≤ n-1;
- $w_k$ is the weight of the k-th super-resolution sub-model.
- $w_k$ is a parameter in the super-resolution model.
- the super-resolution sub-model is a three-layer full convolution depth neural network.
- the error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
- each module may also correspond to the corresponding description of the method embodiment shown in FIG. 1 , and details are not described herein again.
- the image reconstruction device described above may be an image recognition device such as a face recognition device.
- the above image reconstruction device may also be an image enhancement device or the like.
- FIG. 6 is a schematic structural diagram of another image reconstruction apparatus according to an embodiment of the present application.
- the device includes a processor 601, a memory 602, and a communication interface 603.
- the processor 601, the memory 602, and the communication interface 603 are connected to one another via a bus 604, where:
- the memory 602 includes, but is not limited to, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a compact disc read-only memory (CD-ROM), and is used for storing related instructions and data. Specifically, the memory 602 can be used to store a super-resolution model.
- the communication interface 603 can be used to communicate with other devices. For example, it can be used to receive a training set that includes a fifth image and a fourth image, where the fourth image is a high-resolution image and the fifth image is a low-resolution image obtained by blurring the fourth image.
- the communication interface 603 can also be used to receive low resolution images that need to be reconstructed, such as receiving a first image.
- the processor 601 may be one or more central processing units (CPUs). In the case where the processor 601 is a CPU, the CPU may be a single core CPU or a multi-core CPU.
- the processor 601 in the image reconstruction device is configured to read the program code stored in the memory 602 and perform the following operations:
- the new super-resolution model is obtained by training the initial super-resolution model with error loss;
- the error loss includes pixel mean square error and image feature mean square error;
- image features include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features.
- the error loss is the error loss between the third image and the fourth image
- the third image is obtained by inputting the fifth image into the initial super-resolution model for reconstruction
- the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image; the initial super-resolution model is used to reconstruct an image input to it so as to increase its resolution.
- there are M third images, M fourth images, and M fifth images
- the number of error losses is M
- the M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction; the M error losses are determined according to the M third images and the M fourth images;
- any one of the M error losses is the error loss between the i-th third image of the M third images and the j-th fourth image of the M fourth images, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image
- M is a positive integer greater than 1
- i and j are both positive integers less than or equal to M.
- the newly created super-resolution model is obtained by adjusting parameters in the initial super-resolution model according to M error losses;
- the initial super-resolution model is the first super-resolution model.
- the second super-resolution model is obtained by adjusting the parameters in the first super-resolution model according to the first of the M error losses, and the (r+1)-th super-resolution model is obtained by adjusting the parameters in the r-th super-resolution model according to the r-th error loss.
- the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
- the initial super-resolution model includes n super-resolution sub-models, where n is a positive integer greater than or equal to 2; each super-resolution sub-model is used to reconstruct the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information;
- the input of the first super-resolution sub-model is the first image
- the output of the first super-resolution sub-model is used as the input of the second super-resolution sub-model, the output of the (t-1)-th super-resolution sub-model is used as the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model is used as the input of the (t+1)-th super-resolution sub-model;
- t is a positive integer satisfying 2 ≤ t ≤ n-1; the output of the t-th super-resolution sub-model is also used as an input of the output synthesis module, the output of the output synthesis module is used as the input of the n-th super-resolution sub-model, and the output of the n-th super-resolution sub-model is the second image
- the output synthesis module is configured to determine the input of the n-th super-resolution sub-model based on the reconstructed image information output by the first n-1 super-resolution sub-models and their respective weights
- the reconstructed image information output by the output synthesis module is $\sum_{k=1}^{n-1} w_k O_k$, where $O_k$ is the reconstructed image information output by the k-th super-resolution sub-model
- k is a positive integer satisfying 1 ≤ k ≤ n-1;
- $w_k$ is the weight of the k-th super-resolution sub-model.
- $w_k$ is a parameter in the super-resolution model.
- the super-resolution sub-model is a three-layer full convolution depth neural network.
- the error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
- the image reconstruction device described above may be an image recognition device such as a face recognition device.
- the above image reconstruction device may also be an image enhancement device or the like.
- Embodiments of the present invention also provide a chip system including at least one processor, a memory, and an interface circuit; the memory, the transceiver, and the at least one processor are interconnected by lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the method flow shown in FIG. 1 is implemented.
- the embodiment of the invention further provides a computer readable storage medium, wherein the computer readable storage medium stores instructions, and when it runs on the processor, the method flow shown in FIG. 1 is implemented.
- the embodiment of the invention further provides a computer program product, wherein the method flow shown in FIG. 1 is implemented when the computer program product runs on a processor.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions can be stored in or transmitted by a computer readable storage medium.
- the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave).
- the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
- the usable medium may be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium (eg, a Solid State Disk (SSD)) or the like.
- the program can be stored in a computer readable storage medium; when the program is executed, it may include the flows of the method embodiments described above.
- the foregoing storage medium includes various media that can store program codes, such as a ROM or a random access memory RAM, a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Analysis (AREA)
Abstract
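An image reconstruction method and device. The method includes: inputting a first image into a newly created super-resolution model to obtain a reconstructed second image, where the resolution of the second image is higher than that of the first image; the newly created super-resolution model is obtained by training an initial super-resolution model with an error loss; the error loss includes a pixel mean square error and an image feature mean square error; and the image features include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features. The method can improve the quality of reconstructed images.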
Description
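Embodiments of the present invention relate to the field of communications technologies, and in particular to an image reconstruction method and device.
Image super-resolution reconstruction refers to techniques that use image processing methods to reconstruct a low-resolution image into a high-resolution image. It can effectively improve image sharpness and is important in fields such as video surveillance, camera photography, high-definition television, and medical imaging. Within image super-resolution reconstruction, face image super-resolution, also called face hallucination, is widely used.
Current face super-resolution methods include methods based on signal reconstruction and methods based on machine learning. Methods based on signal reconstruction rely mainly on signal reconstruction theory from the signal processing field, for example the Fourier transform and polynomial interpolation. They are usually simple to implement, but the reconstructed images lose a great deal of detail, with blurred edges and obvious jagged artifacts.
Methods based on machine learning take a low-resolution image as input and reconstruct it with a super-resolution model to obtain a reconstructed image that is a maximum a posteriori estimate. The super-resolution model used is obtained by training an initial super-resolution model. Training adjusts the parameters in the super-resolution model according to the pixel mean square error between the image reconstructed from the low-resolution image and the high-resolution image. However, a super-resolution model trained only on the pixel mean square error produces noticeably over-smoothed images with severe loss of high-frequency information.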
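Summary of the Invention
Embodiments of this application disclose an image reconstruction method and device that can improve image reconstruction quality.
In a first aspect, an embodiment of this application provides an image reconstruction method, including: inputting a first image into a newly created super-resolution model to obtain a reconstructed second image, where the resolution of the second image is higher than that of the first image; the newly created super-resolution model is obtained by training an initial super-resolution model with an error loss; the error loss includes a pixel mean square error and an image feature mean square error; and the image features include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features. In the training stage, the error loss therefore contains both the pixel mean square error and the image feature mean square error. Because the error loss used to train the initial super-resolution model carries more comprehensive error information, the newly created super-resolution model used for reconstruction is more accurate, which reduces the loss of high-frequency information in the reconstructed image and improves its reconstruction quality.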
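In one embodiment, the error loss is the error loss between a third image and a fourth image. The third image is obtained by inputting a fifth image into the initial super-resolution model for reconstruction; the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image. The initial super-resolution model is used to reconstruct an image input to it so as to increase its resolution.
In one embodiment, there are M third images, M fourth images, and M fifth images, and there are M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined from the M third images and the M fourth images. Any one of the M error losses is the error loss between the i-th third image and the j-th fourth image, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image. M is a positive integer greater than 1, and i and j are both positive integers less than or equal to M. When M is a positive integer greater than 2, the initial super-resolution model is adjusted using multiple error losses obtained from multiple groups of training samples; this provides more sample information for the adjustment, so the resulting newly created super-resolution model is more accurate. In addition, if there are multiple training sample pairs and the initial super-resolution model is adjusted every time a single error loss is obtained, the number of adjustments becomes excessive and wastes processing and storage resources. Adjusting the initial super-resolution model with multiple error losses obtained from multiple groups of training samples reduces the number of parameter adjustments and thus saves processing and storage resources.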
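In one embodiment, the newly created super-resolution model is obtained by adjusting the parameters in the initial super-resolution model according to the M error losses; or,
the initial super-resolution model is the 1st super-resolution model; the 2nd super-resolution model is obtained by adjusting the parameters in the 1st super-resolution model according to the 1st of the M error losses, the (r+1)-th super-resolution model is obtained by adjusting the parameters in the r-th super-resolution model according to the r-th error loss, and the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.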
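In one embodiment, the initial super-resolution model contains n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each sub-model reconstructs the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information. Among the n sub-models, the input of the first sub-model is the first image, the output of the first sub-model is the input of the second sub-model, the output of the (t-1)-th sub-model is the input of the t-th sub-model, and the output of the t-th sub-model is the input of the (t+1)-th sub-model, where t is a positive integer satisfying 2 ≤ t ≤ n-1. The output of the t-th sub-model is also an input of the output synthesis module, the output of the output synthesis module is the input of the n-th sub-model, and the output of the n-th sub-model is the second image. The output synthesis module determines the input of the n-th sub-model from the reconstructed image information output by the first n-1 sub-models and their respective weights. When n is a positive integer greater than 1, the low-resolution image is reconstructed by cascading multiple sub-models. The resulting reconstructed image has higher pixel values, which improves its image quality. In addition, the reconstructed images output by the first n-1 sub-models all serve as image information input to the last sub-model; this input contains more image information and reduces information loss, which improves the accuracy of the newly created super-resolution model and the quality of image reconstruction.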
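In one embodiment, $w_k$ is a parameter in the initial super-resolution model. During training of the initial super-resolution model, the weights $w_k$ in the initial super-resolution model may be optimized according to the error loss.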
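In one embodiment, the super-resolution sub-model is a three-layer fully convolutional deep neural network. In this network, the first and second convolutional layers extract image information from the low-resolution image, that is, they obtain the information usable for super-resolution reconstruction. The third convolutional layer uses the image information extracted and transformed by the first two layers to reconstruct a high-resolution image. Compared with extracting image information using only one convolutional layer, the two additional convolutional layers help extract more accurate image information. In addition, sub-models built from this network must be connected in series to form the super-resolution model. Connecting several sub-models in series requires more computing resources, while fewer convolutional layers mean less computation, so the number of convolutional layers in a sub-model is a trade-off between computing resources and accuracy. With the three-layer fully convolutional deep neural network, a sub-model can extract more accurate image information using relatively few computing resources. More accurate image information helps reconstruct higher-quality images while saving computing resources.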
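In one embodiment, the error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term. The added regularization term L3 reduces overfitting and improves the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.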
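In one embodiment, each convolutional layer in the three-layer fully convolutional deep neural network contains at least one convolution kernel, and the weight matrix W of the convolution kernel is a parameter in the initial super-resolution model.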
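In a second aspect, an embodiment of this application provides an image reconstruction device including a processor and a memory, where the memory is used to store program instructions and the processor is used to invoke the program instructions to perform the following operations: inputting a first image into a newly created super-resolution model to obtain a reconstructed second image, where the resolution of the second image is higher than that of the first image; the newly created super-resolution model is obtained by training an initial super-resolution model with an error loss; the error loss includes a pixel mean square error and an image feature mean square error; and the image features include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features. In the training stage, the error loss therefore contains both the pixel mean square error and the image feature mean square error. Because the error loss used to train the initial super-resolution model carries more comprehensive error information, the loss of high-frequency information in the reconstructed image is reduced and its reconstruction quality is improved.
In one embodiment, the error loss is the error loss between a third image and a fourth image. The third image is obtained by inputting a fifth image into the initial super-resolution model for reconstruction; the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image. The initial super-resolution model is used to reconstruct an image input to it so as to increase its resolution.
In one embodiment, there are M third images, M fourth images, and M fifth images, and there are M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined from the M third images and the M fourth images. Any one of the M error losses is the error loss between the i-th third image and the j-th fourth image, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image. M is a positive integer greater than 1, and i and j are both positive integers less than or equal to M. When M is a positive integer greater than 2, the initial super-resolution model is adjusted using multiple error losses obtained from multiple groups of training samples; this provides more sample information for the adjustment, so the resulting newly created super-resolution model is more accurate.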
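In one embodiment, the newly created super-resolution model is obtained by adjusting the parameters in the initial super-resolution model according to the M error losses; or,
the initial super-resolution model is the 1st super-resolution model; the 2nd super-resolution model is obtained by adjusting the parameters in the 1st super-resolution model according to the 1st of the M error losses, the (r+1)-th super-resolution model is obtained by adjusting the parameters in the r-th super-resolution model according to the r-th error loss, and the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.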
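In one embodiment, the initial super-resolution model contains n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each sub-model reconstructs the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information. Among the n sub-models, the input of the first sub-model is the first image, the output of the first sub-model is the input of the second sub-model, the output of the (t-1)-th sub-model is the input of the t-th sub-model, and the output of the t-th sub-model is the input of the (t+1)-th sub-model, where t is a positive integer satisfying 2 ≤ t ≤ n-1. The output of the t-th sub-model is also an input of the output synthesis module, the output of the output synthesis module is the input of the n-th sub-model, and the output of the n-th sub-model is the second image. The output synthesis module determines the input of the n-th sub-model from the reconstructed image information output by the first n-1 sub-models and their respective weights. When n is a positive integer greater than 1, the low-resolution image is reconstructed by cascading multiple sub-models. The resulting reconstructed image has higher pixel values, which improves its image quality. In addition, the reconstructed images output by the first n-1 sub-models all serve as image information input to the last sub-model; this input contains more image information and reduces information loss, which improves the accuracy of the newly created super-resolution model and the quality of image reconstruction.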
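In one embodiment, $w_k$ is a parameter in the initial super-resolution model. During training of the initial super-resolution model, the weights $w_k$ in the initial super-resolution model may be optimized according to the error loss.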
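In one embodiment, the super-resolution sub-model is a three-layer fully convolutional deep neural network. In this network, the first and second layers extract image information from the low-resolution image, that is, they obtain the information usable for super-resolution reconstruction. The third layer uses the image information extracted and transformed by the first two layers to reconstruct a high-resolution image. Compared with extracting image information using only one convolutional layer, the two additional convolutional layers help extract more accurate image information. In addition, sub-models built from this network must be connected in series to form the super-resolution model. Connecting several sub-models in series requires more computing resources, while fewer convolutional layers mean less computation, so the number of convolutional layers in a sub-model is a trade-off between computing resources and accuracy. With the three-layer fully convolutional deep neural network, a sub-model can extract more accurate image information using relatively few computing resources. More accurate image information helps reconstruct higher-quality images while saving computing resources.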
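In one embodiment, the error loss is $L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3$, where L1 is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term. The added regularization term L3 reduces overfitting and improves the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.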
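In one embodiment, each convolutional layer in the three-layer fully convolutional deep neural network contains at least one convolution kernel, and the weight matrix W of the convolution kernel is a parameter in the initial super-resolution model.
In a third aspect, an embodiment of this application provides an image reconstruction device that includes modules or units for performing the image reconstruction method provided by the first aspect or any possible implementation of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a chip system including at least one processor, a memory, and an interface circuit. The memory, the interface circuit, and the at least one processor are interconnected by lines, and program instructions are stored in the at least one memory; when the program instructions are executed by the processor, the method described in the first aspect or any possible implementation of the first aspect is implemented.
In a fifth aspect, an embodiment of the present invention provides a computer readable storage medium storing program instructions; when the program instructions are run by a processor, the method described in the first aspect or any possible implementation of the first aspect is implemented.
In a sixth aspect, an embodiment of the present invention provides a computer program product; when the computer program product runs on a processor, the method described in the first aspect or any possible implementation of the first aspect is implemented.
In the training stage of the super-resolution model, the error loss contains both the pixel mean square error and the image feature mean square error. Because the error loss used to train the initial super-resolution model carries more comprehensive error information, the loss of high-frequency information in the reconstructed image is reduced and its reconstruction quality is improved. With n super-resolution sub-models, where n is a positive integer greater than 1, the low-resolution image is reconstructed by cascading multiple sub-models. The resulting reconstructed image has higher pixel values, which improves its image quality. In addition, the reconstructed images output by the first n-1 sub-models all serve as image information input to the last sub-model; this input contains more image information and reduces information loss, which improves the accuracy of the newly created super-resolution model and the quality of image reconstruction.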
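The drawings used in the embodiments of this application are described below.
FIG. 1 is a schematic flowchart of an image reconstruction method according to an embodiment of this application;
FIG. 2 is a schematic block diagram of a method for establishing an image reconstruction model according to an embodiment of this application;
FIG. 3 is a schematic structural diagram of a super-resolution model according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of a super-resolution model according to an embodiment of this application;
FIG. 5 is a schematic structural diagram of an image reconstruction device according to an embodiment of this application;
FIG. 6 is a schematic structural diagram of another image reconstruction device according to an embodiment of this application.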
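First, to make the embodiments of this application easier to understand, some concepts and terms involved are explained.
(1) Super-resolution
Super-resolution (SR) refers to techniques that use image processing methods on a computer to reconstruct a low-resolution (LR) image into a high-resolution (HR) image. A high-resolution image has a high pixel density and can provide more image detail, and such detail often plays a key role in applications.
Image super-resolution techniques fall into two categories: reconstruction-based methods and learning-based methods. Reconstruction-based methods can statistically estimate the maximum a posteriori high-resolution image using frequency-domain or spatial-domain algorithms. Learning-based methods can include two stages, a training stage and a testing stage.
In the training stage, an initial super-resolution model and a training set are first established. The training set can include multiple low-resolution images and the high-resolution image corresponding to each of them. From the low-resolution images and their corresponding high-resolution images, the correspondence between high-resolution and low-resolution images is learned, and the values of the parameters in the initial super-resolution model are corrected so that the error between the high-resolution image and the reconstructed image converges, which finally determines the newly created super-resolution model obtained after training. In the testing stage, the newly created super-resolution model can guide the super-resolution reconstruction of images.
One way to obtain a low-resolution image and its corresponding high-resolution image is to process the high-resolution image with a blur function to obtain the corresponding low-resolution image.
The initial super-resolution model can be a model determined by experiment and can be nonlinear. The super-resolution model can be a convolutional neural network.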
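(2) Convolutional neural network
A neural network can be composed of neural units. A neural unit can be an operation unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the operation unit can be:
$$f\left(\sum_{s=1}^{n} W_s x_s + b\right)$$
where s = 1, 2, ..., n, n is a natural number greater than 1, $W_s$ is the weight of $x_s$, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces nonlinearity into the neural network and converts the input signal of the neural unit into an output signal. The output signal of the activation function can serve as the input of the next convolutional layer. The activation function can be a sigmoid function. A neural network is a network formed by connecting many such single neural units, that is, the output of one neural unit can be the input of another. The input of each neural unit can be connected to a local receptive field of the previous layer to extract the features of that local receptive field, which can be a region composed of several neural units.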
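The single neural unit described above can be sketched in a few lines. This is only an illustration, assuming the unit computes the weighted sum of its inputs plus the bias and passes it through a sigmoid activation; the function names `sigmoid` and `neural_unit` are hypothetical and do not come from the source.

```python
import math


def sigmoid(z):
    """Sigmoid activation, one possible choice for f."""
    return 1.0 / (1.0 + math.exp(-z))


def neural_unit(xs, weights, bias, activation=sigmoid):
    """Return f(sum_s W_s * x_s + b) for a single neural unit."""
    if len(xs) != len(weights):
        raise ValueError("xs and weights must have the same length")
    # Weighted sum of the inputs plus the bias term.
    z = sum(w * x for w, x in zip(weights, xs)) + bias
    return activation(z)


if __name__ == "__main__":
    # Two inputs whose weighted sum cancels out, so the sigmoid sees 0.
    print(neural_unit([1.0, 2.0], [0.5, -0.25], 0.0))
```

Chaining such units, so that the output of one becomes an input of another, gives the layered network the text describes.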
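A convolutional neural network (CNN) is a deep neural network with a convolutional structure. It contains a feature extractor composed of convolutional layers and sub-sampling layers. The feature extractor can be regarded as a filter, and the convolution process can be regarded as convolving a trainable filter with an input image or a convolutional feature map. A convolutional layer is a layer of neurons in the CNN that performs convolution on the input signal. In a convolutional layer, a neuron may be connected to only some of the neurons in the adjacent layer. A convolutional layer usually contains several feature maps, and each feature map can consist of neural units arranged in a rectangle. Neural units in the same feature map share weights, and the shared weights are the convolution kernel. Weight sharing can be understood as meaning that the way image information is extracted is independent of position. The underlying principle is that the statistics of one part of an image are the same as those of the other parts, so image information learned in one part can also be used in another, and the same learned image information can be used at every position in the image. Within one convolutional layer, multiple convolution kernels can be used to extract different image information; in general, the more convolution kernels there are, the richer the image information reflected by the convolution operation.
A convolution kernel can be initialized as a matrix of random size, and during training of the CNN it can learn reasonable weights. A direct benefit of weight sharing is that it reduces the connections between the layers of the CNN while also lowering the risk of overfitting.
(3) Back propagation algorithm
A convolutional neural network can use the error back propagation (BP) algorithm during training to correct the values of the parameters in the initial super-resolution model, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, passing the input signal forward to the output produces an error loss, and the parameters in the initial super-resolution model are updated by propagating the error loss information backward, so that the error loss converges. The back propagation algorithm is a backward pass driven by the error loss, and it aims to obtain the optimal parameters of the super-resolution model, for example the weight matrix.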
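(4) Pixel values and image features
The pixel value information and the image feature information of an image can be collectively called image information.
A pixel value can be a red-green-blue (RGB) color value, and it can be a long integer that represents a color. For example, a pixel value is 65536*Red+256*Green+Blue, where Blue is the blue component, Green is the green component, and Red is the red component. For each color component, a smaller value means lower brightness and a larger value means higher brightness. For a grayscale image, the pixel value can be a gray value.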
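The packed pixel value 65536*Red+256*Green+Blue can be illustrated with a small round-trip example. The sketch assumes each component is an 8-bit value in the range 0 to 255, which the multipliers 256 and 65536 suggest but the text does not state; `pack_rgb` and `unpack_rgb` are hypothetical helper names.

```python
def pack_rgb(red, green, blue):
    """Pack three color components into one integer pixel value."""
    for component in (red, green, blue):
        # Assumption: 8-bit components, so each must fit in 0..255.
        if not 0 <= component <= 255:
            raise ValueError("each component must be in 0..255")
    return 65536 * red + 256 * green + blue


def unpack_rgb(value):
    """Recover (red, green, blue) from a packed pixel value."""
    red = (value >> 16) & 0xFF   # same as value // 65536 for 24-bit values
    green = (value >> 8) & 0xFF
    blue = value & 0xFF
    return red, green, blue


if __name__ == "__main__":
    packed = pack_rgb(12, 34, 56)
    print(packed, unpack_rgb(packed))
```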
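Image features refer to the texture features, shape features, spatial relationship features, and high-level semantic features of an image. They are described as follows:
The texture feature of an image is a global feature that describes the surface properties of the scene corresponding to the image or image region. It is not based on individual pixels; it is computed statistically over a region made up of multiple pixels. As a statistical feature, the texture feature is fairly robust to noise. However, when the resolution of the image changes, the texture feature may deviate considerably. Texture features can be described by the following methods: a. Statistical methods, for example extracting texture features from the autocorrelation function of the image (that is, its energy spectrum function), and computing feature parameters such as the coarseness and directionality of the texture from the energy spectrum function. b. Geometric methods, which analyze texture features on the basis of the theory of texture primitives (basic texture elements). In this approach, a complex texture feature can be formed by several simple texture primitives repeated in a regular arrangement. c. Model-based methods, which are based on a structural model of the image and use the parameters of the model to represent texture features.
The shape features of an image can be represented in two ways: contour features and region features. The contour feature refers to the outline of the outer boundary of an object, and the region feature refers to the entire shape region occupied by the object. Shape features can be described by the following methods: a. The boundary feature method, which obtains the shape parameters of the image by describing boundary features. b. The Fourier shape descriptor method, which uses the Fourier transform of the object boundary as the shape description and exploits the closedness and periodicity of the region boundary to derive, from the boundary points, shape representations based on the curvature function, the centroid distance, and the complex coordinate function. c. The geometric parameter method, in which shape representation and matching use region feature descriptions, for example shape parameters such as moments, area, and perimeter.
The spatial relationship features of an image refer to the mutual spatial positions or relative directional relationships among multiple regions segmented from the image. These relationships can be divided into connection relationships, adjacency relationships, overlap relationships, superposition relationships, inclusion relationships, and containment relationships. Spatial positions in an image can usually be divided into two categories: relative spatial positions and absolute spatial positions. Relative spatial positions emphasize the relative arrangement of targets, such as above, below, left, and right. Absolute spatial positions emphasize the distance and orientation between targets. Using spatial relationship features strengthens the ability to describe and distinguish image content, but these features are often sensitive to rotation, flipping, and scale changes of the image or target.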
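Compared with texture, shape, and spatial relationship features, high-level image semantic features are higher-level cognitive features that describe how humans understand an image. Taking the image as the object, they determine which targets are at which positions, the relationships among targets and scenes, what scene the image shows, and how the scene is applied. Extracting high-level image semantic features is the process of converting an input image into a text-like language expression that can be understood intuitively. Obtaining these features requires establishing a correspondence between images and semantic text.
According to the level of abstraction at which the semantic elements in an image are combined, high-level image semantic features can be divided into object semantic features, spatial relationship semantic features, scene semantic features, behavioral semantic features, and emotional semantic features. Object semantic features can be features used to identify people, animals, physical objects, and so on. Spatial relationship semantic features can be, for example, features used to determine "a person in front of a house" or "a ball on the grass". Scene semantic features can be, for example, features used to determine "the sea" or "an open field". Behavioral semantic features can be, for example, features used to determine "a dance performance" or "a sports competition". Emotional semantic features can be, for example, features used to determine "a pleasing image" or "an exciting image". Object semantic features and spatial relationship semantic features require some logical reasoning and recognition of the categories of targets in the image. Scene, behavioral, and emotional semantic features involve abstract attributes of the image and require high-level reasoning about the meaning of its features.
It can be understood that the above examples of high-level image semantic features are only used to explain the high-level image semantic features in the embodiments of this application and should not be construed as limiting.
Depending on the source of the high-level image semantic features, the extraction methods can include methods based on processing range, methods based on machine learning, methods based on human-computer interaction, and methods based on external information sources. Methods based on processing range can be carried out on the premise of image segmentation and object recognition; they use object templates, scene classifiers, and the like to mine semantics by recognizing objects and the topological relationships between them, and generate the corresponding scene semantic information. Methods based on machine learning learn from the low-level features of the image and mine the association between low-level features and image semantics, thereby establishing a mapping from low-level features to high-level image semantic features. They mainly involve two key steps: first, extraction of low-level features such as texture and shape; second, application of a mapping algorithm. In methods based on human-computer interaction, the system generally uses low-level features while the user contributes high-level knowledge; the extraction mainly involves image preprocessing and feedback learning. Image preprocessing can be manual annotation of the images in an image library, or automatic or semi-automatic image semantic annotation. Feedback learning adds human intervention to the process of extracting image semantics: the semantic features of the image are extracted through repeated interaction between the user and the system, and the high-level semantic concepts associated with the image content are established and corrected.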
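(5) Error loss
The error loss produced by passing the input signal forward to the output can include a pixel mean square error and a feature loss. The feature loss can be an image feature mean square error, and the image features can include at least one of texture features, shape features, spatial relationship features, and high-level image semantic features. The pixel mean square error and the image feature mean square error are described below.
The initial model yields a reconstructed image of the input low-resolution image. The pixel mean square error between this reconstructed image and the high-resolution image corresponding to the input low-resolution image can be computed, which is the pixel mean square error loss:
$$L1 = \frac{1}{F \cdot H} \sum_{x=1}^{F} \sum_{y=1}^{H} \left(I_{1,x,y} - I_{2,x,y}\right)^2 \qquad (1\text{-}2)$$
where L1 is the pixel mean square error loss, F and H are the width and height of the image in pixels, $I_{1,x,y}$ is the pixel value at position (x, y) of the high-resolution image corresponding to the input low-resolution image, and $I_{2,x,y}$ is the pixel value at position (x, y) of the reconstructed image of the low-resolution image.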
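The pixel mean square error L1 can be computed directly from the two images. The sketch below assumes single-channel images stored as nested lists of equal size and assumes the squared differences are averaged over all F×H pixels; `pixel_mse` is a hypothetical name chosen for illustration.

```python
def pixel_mse(high_res, reconstructed):
    """Mean of squared pixel differences between two same-sized images.

    Both images are lists of rows; each row is a list of pixel values.
    """
    height = len(high_res)
    width = len(high_res[0])
    if len(reconstructed) != height or any(
        len(row) != width for row in list(high_res) + list(reconstructed)
    ):
        raise ValueError("images must have identical dimensions")
    total = 0.0
    for row_hr, row_sr in zip(high_res, reconstructed):
        for i1, i2 in zip(row_hr, row_sr):
            total += (i1 - i2) ** 2
    # Assumption: normalize by the number of pixels (width * height).
    return total / (width * height)


if __name__ == "__main__":
    hr = [[1, 2], [3, 4]]
    sr = [[1, 2], [3, 6]]
    print(pixel_mse(hr, sr))
```

A loss of zero means the reconstruction matches the high-resolution image pixel for pixel; as the background section notes, minimizing this term alone tends to give overly smooth results.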
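The image features can be features extracted by an image feature extraction apparatus from the reconstructed image and from the high-resolution image, and can be an N-dimensional vector Θ. The feature loss can be the mean square error between the features of the reconstructed image and those of the high-resolution image, that is:
$$L2 = \frac{1}{N} \sum_{i=1}^{N} \left(\Theta_{1,i} - \Theta_{2,i}\right)^2 \qquad (1\text{-}3)$$
where L2 is the feature loss, $\Theta_{1,i}$ is the i-th dimension of the image feature of the high-resolution image corresponding to the low-resolution image, $\Theta_{2,i}$ is the i-th dimension of the image feature of the reconstructed image of the low-resolution image, and i is a positive integer satisfying 1 ≤ i ≤ N.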
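(6) Regularization
When building an initial model, there is a tendency to fit complex data with a complex model, but a complex model carries a risk of overfitting. Regularization is a common technique in mathematical optimization that controls the magnitude of the parameters being optimized and helps avoid overfitting. Overfitting means that the model has a very small error on the training set but a large error on the test set, that is, it generalizes poorly. It is generally caused by noise in the data or by fitting the data with an overly complex model.
In formula (1-1), our goal is to optimize the least square error, which is:
$$\min_{W} \sum_{i} \left(f(x_i) - y_i\right)^2 + \lambda W^T W$$
or
$$\min_{W} \sum_{i} \left(f(x_i) - y_i\right)^2 + \lambda \sum_{j} |W_j|$$
where $f(x_i)$ is the value of the established model at $x_i$, the established initial model can be $f(x_i) = W_0 x_0 + W_1 x_1 + \dots + W_N x_N$, $y_i$ is the sampled value, and $W = (W_0, W_1, \dots, W_N)$. $W^T W$ and $\sum_j |W_j|$ are the regularization terms, whose purpose is to reduce the risk of overfitting; W can be a weight matrix, and λ is the weight of the regularization term.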
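For image reconstruction, the training process of prior-art super-resolution models considers only the pixel mean square error. The initial super-resolution model is trained with an error loss formed from the pixel mean square error between the super-resolution image reconstructed from a low-resolution image in the training set and the high-resolution image corresponding to that low-resolution image, until the pixel mean square error converges and the newly created super-resolution model is obtained. In other words, the newly created super-resolution model accounts only for the error loss caused by the pixel mean square error, and images reconstructed by a model trained this way lose a great deal of high-frequency information, which lowers reconstruction quality.
To improve reconstruction quality, an embodiment of this application provides an image reconstruction method in which, in the training stage of the super-resolution model, the error loss contains both the pixel mean square error and the image feature mean square error. Because the error loss used to train the initial super-resolution model carries more comprehensive error information, the loss of high-frequency information in the reconstructed image is reduced and its reconstruction quality is improved.
The inventive principle of this application can include the following: in the training stage of the super-resolution model, a low-resolution image is reconstructed by the initial super-resolution model to obtain a reconstructed image, that is, a super-resolution image; the error loss between the super-resolution image and the high-resolution image corresponding to the low-resolution image is determined, and this error loss contains the pixel mean square error and the image feature mean square error. The newly created super-resolution model is then determined from this error loss and the initial super-resolution model. Adjusting the parameters in the initial super-resolution model with a more comprehensive error loss improves the accuracy with which the newly created super-resolution model reconstructs images, and thus improves reconstruction quality.
Based on the main inventive principle above, the image reconstruction method provided by this application is described below.
Refer to FIG. 1 and FIG. 2 together. FIG. 1 is a schematic flowchart of an image reconstruction method according to an embodiment of this application. As shown in FIG. 1, the image reconstruction method includes, but is not limited to, the following steps S101 to S104.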
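S101. The image reconstruction device inputs a fifth image into the initial super-resolution model to obtain a reconstructed third image.
S102. The image reconstruction device determines the error loss between the third image and a fourth image, where the error loss includes a pixel mean square error and an image feature mean square error.
S103. The image reconstruction device establishes the newly created super-resolution model according to the initial super-resolution model and the error loss.
S104. The image reconstruction device inputs a first image into the newly created super-resolution model to obtain a reconstructed second image.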
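In the embodiments of this application, the image reconstruction model is the super-resolution model, which reconstructs an image input to it so as to increase its resolution. Establishing the super-resolution model can be divided into a training stage and a testing stage. The training stage trains the initial super-resolution model with low-resolution images and their corresponding high-resolution images until the error loss converges. The testing stage inputs a low-resolution test image into the newly created super-resolution model to obtain a reconstructed image, which makes it possible to test how well the newly created super-resolution model increases image resolution. The testing stage can also be regarded as the process of reconstructing images with the newly created super-resolution model. Steps S101 to S103 can be regarded as the flow of the training stage. The newly created super-resolution model is the error-converged super-resolution model obtained by training the initial super-resolution model with the error loss so as to adjust its parameters, and it can be used directly to reconstruct images and increase their resolution. Step S104 can be regarded as using the newly created super-resolution model to reconstruct an image and increase its resolution, that is, the flow of the testing stage of the newly created super-resolution model.
Specifically, FIG. 2 is a schematic block diagram of a method for establishing an image reconstruction model according to an embodiment of this application. As shown in FIG. 2, a low-resolution image 101 is input into the initial super-resolution model 102, which reconstructs it to obtain a reconstructed image 103. The pixel mean square error 105 between the reconstructed image 103 and the high-resolution image 104 is computed, as is the image feature mean square error 106 between the reconstructed image 103 and the high-resolution image 104. The image features can be those of the reconstructed image and of the high-resolution image, extracted by an image feature extraction apparatus from the reconstructed image 103 and the high-resolution image 104 respectively. The error loss 107 is determined from the pixel mean square error 105 and the image feature mean square error 106, and the super-resolution model is updated according to the error loss 107 and the initial super-resolution model 102. The low-resolution image is a lower-resolution image obtained by blurring the high-resolution image.
In the testing stage, the first image is a low-resolution image, and the second image is the reconstructed image produced by the newly created super-resolution model. In the training stage, the fifth image is a low-resolution image, the third image is the reconstructed image produced by the initial super-resolution model, and the fourth image is the high-resolution image corresponding to the fifth image. The fifth image can be a low-resolution image obtained by blurring the fourth image. The fourth and fifth images form the training set of high-resolution and low-resolution images.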
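In one embodiment, step S101 can be inputting M fifth images into the initial super-resolution model to obtain M reconstructed third images. Step S102 can be determining M error losses from the M third images and M fourth images, where any one of the M error losses is the error loss between the i-th third image and the j-th fourth image, the i-th third image being the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image. M is an integer greater than 1, and i and j are both positive integers less than or equal to M.
In one embodiment, in one case, the newly created super-resolution model can be obtained by adjusting the parameters in the initial super-resolution model according to the M error losses.
In another case, the initial super-resolution model is the 1st super-resolution model; the 2nd super-resolution model is obtained by adjusting the parameters in the 1st super-resolution model according to the 1st of the M error losses, the (r+1)-th super-resolution model is obtained by adjusting the parameters in the r-th super-resolution model according to the r-th error loss, and the newly created super-resolution model is obtained by adjusting the parameters in the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
In other words, there can be multiple training sample pairs, and M error losses can be computed from M pairs of training samples. In the first case, the parameters of the initial super-resolution model are adjusted once using the M error losses to obtain the newly created super-resolution model. In the second case, each time one of the M error losses is obtained, the parameters of the super-resolution model are adjusted using that error loss, so the parameters are adjusted M times to obtain the newly created super-resolution model. When M is a positive integer greater than 2, adjusting the parameters of the initial super-resolution model once with multiple error losses obtained from multiple groups of training samples provides more sample information for the adjustment, so the resulting newly created super-resolution model is more accurate. In addition, if there are multiple training sample pairs and the initial super-resolution model is adjusted every time a single error loss is obtained, the number of adjustments becomes excessive and wastes processing and storage resources. Adjusting the parameters once with multiple error losses reduces the number of parameter adjustments and thus saves processing and storage resources.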
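The difference between the two cases, one adjustment from M error losses versus M successive adjustments, can be shown on a toy model. The sketch below is purely illustrative: it assumes a model with a single scalar parameter `theta` that "reconstructs" a value as `theta * x`, a squared-error loss, and a plain gradient-descent step. The update rule, the learning rate, and all function names are assumptions made for this example, not details given in the source.

```python
def grad(theta, x, y):
    """Gradient of the squared error (theta * x - y) ** 2 w.r.t. theta."""
    return 2.0 * (theta * x - y) * x


def adjust_once(theta, pairs, learning_rate):
    """Case 1: one parameter adjustment using all M error losses."""
    mean_grad = sum(grad(theta, x, y) for x, y in pairs) / len(pairs)
    return theta - learning_rate * mean_grad


def adjust_sequentially(theta, pairs, learning_rate):
    """Case 2: M adjustments, one per error loss, each on the latest model."""
    for x, y in pairs:
        theta = theta - learning_rate * grad(theta, x, y)
    return theta


if __name__ == "__main__":
    # (low-resolution value, high-resolution value) training pairs.
    pairs = [(1.0, 2.0), (2.0, 4.0)]
    print(adjust_once(0.0, pairs, 0.1))
    print(adjust_sequentially(0.0, pairs, 0.1))
```

The first function touches the parameters once per batch of M samples, which is the saving in adjustment count that the text attributes to the first case.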
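It should be noted that the ways of training the initial super-resolution model with multiple error losses to obtain the newly created super-resolution model are not limited to the two above; other methods can also be used. For example, the super-resolution model can be trained N times, with at least one of those training passes using more than one error loss, where N can be a positive integer greater than 1 and less than M. The methods in the two cases above are only used to explain the embodiments of this application and should not be construed as limiting.
In one embodiment, the super-resolution model can contain multiple super-resolution sub-models. Refer to FIG. 3, which is a schematic structural diagram of a super-resolution model according to an embodiment of this application. As shown in FIG. 3, the super-resolution model 102 can contain n super-resolution sub-models 1021, where n is a positive integer greater than or equal to 2. A super-resolution sub-model 1021 reconstructs the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information.
Among the n sub-models, the input of the first sub-model 1 is the first image, that is, the low-resolution image 101; the output of the first sub-model 1 is the input of the second sub-model 2; the output of the (t-1)-th sub-model t-1 is the input of the t-th sub-model t; and the output of the t-th sub-model t is the input of the (t+1)-th sub-model t+1, where t is a positive integer satisfying 2 ≤ t ≤ n-1. The output of the t-th sub-model t is also an input of the output synthesis module 1022, the output of the output synthesis module 1022 is the input of the n-th sub-model n, and the output of the n-th sub-model n is the second image, that is, the reconstructed image 103. The output synthesis module 1022 determines the image information input to the n-th sub-model n from the reconstructed images output by the first n-1 sub-models 1021 and their respective weights.
The super-resolution model 102 described above can be included in the image reconstruction device so that the device performs the image reconstruction method described in FIG. 1.
When n is a positive integer greater than 1, the low-resolution image is reconstructed by cascading multiple sub-models. The resulting reconstructed image has higher pixel values, which improves its image quality. In addition, the reconstructed images output by the first n-1 sub-models all serve as image information input to the last sub-model; this input contains more image information and reduces information loss, which improves the accuracy of the newly created super-resolution model and the quality of image reconstruction.
When a super-resolution model is actually designed, the number of sub-models must take into account both the required reconstruction quality and the available computing resources. A higher quality requirement calls for more sub-models, but more sub-models mean more computation and more computing resources consumed. The number of sub-models is therefore a trade-off between the required reconstruction quality and computing resources.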
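For example, as shown in FIG. 3, suppose the first sub-model can reconstruct about 90% of the image information of the high-resolution image in the training set (call this 90% the first image information), the second sub-model can reconstruct about 90% of the image information other than the first image information, and so on. In other words, each sub-model loses 10% of the image information that remained unreconstructed before it. The remaining unreconstructed image information can be understood as the image information lost by the previous sub-model; for the first sub-model, it is the entire image information of the high-resolution image. Suppose the quality requirement is that the reconstructed image reach 99% of the image information of the high-resolution image, and that the computing resources can process at most 5 sub-models in real time. Let the number of sub-models be N, a positive integer greater than or equal to 1. Then N satisfies:
$$1 - (1 - 0.9)^{N} \ge 0.99, \qquad N \le 5 \qquad (1\text{-}6)$$
Solving (1-6) gives N ≥ 2 and N ≤ 5, and the smallest such value is N = 2. The number of sub-models can therefore be set to 2.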
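The sub-model count in this example can be checked numerically. The sketch assumes, as the example does, that every sub-model recovers the same fraction of whatever information is still missing; exact fractions are used to avoid floating-point rounding at the 99% boundary, and `min_submodels` is a hypothetical helper name.

```python
from fractions import Fraction


def min_submodels(per_model_fraction, target_fraction, max_models):
    """Smallest number of cascaded sub-models that meets the target.

    Each sub-model is assumed to recover `per_model_fraction` of the
    information still missing. Returns None if the target cannot be met
    with at most `max_models` sub-models.
    """
    remaining = Fraction(1)  # fraction of image information not yet recovered
    for count in range(1, max_models + 1):
        remaining *= 1 - per_model_fraction
        if 1 - remaining >= target_fraction:
            return count
    return None


if __name__ == "__main__":
    # 90% per sub-model, 99% required, at most 5 sub-models in real time.
    print(min_submodels(Fraction(9, 10), Fraction(99, 100), 5))
```

With these numbers, one sub-model recovers 90% and two recover 99%, so two sub-models already meet the requirement.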
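It can be understood that if the number of sub-models is 1 or 2, the super-resolution model may or may not contain the output synthesis module 1022. When there are 2 sub-models, they can be connected in series: the input of the first sub-model is the low-resolution image, the reconstructed image output by the first sub-model is the input of the second sub-model, and the output of the second sub-model is the reconstructed image of the super-resolution model.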
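As shown in FIG. 3, the reconstructed image information output by the output synthesis module 1022 is:
$$\sum_{k=1}^{n-1} w_k O_k$$
where $O_k$ is the reconstructed image information output by the k-th sub-model, k is a positive integer satisfying 1 ≤ k ≤ n-1, and $w_k$ is the weight of the k-th sub-model.
In the super-resolution model, the weight $w_k$ of the k-th sub-model can be determined from experimental results or experience. The weight $w_k$ can also be a parameter in the super-resolution model; that is, during training of the initial super-resolution model, the weights $w_k$ in the initial super-resolution model can also be optimized according to the error loss.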
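The role of the output synthesis module can be sketched as a weighted combination of the outputs of the first n-1 sub-models. The text says only that the module uses those outputs and their respective weights; treating the combination as a per-pixel weighted sum is an assumption of this sketch, and `synthesize` is a hypothetical name. Images are single-channel nested lists of equal size.

```python
def synthesize(outputs, weights):
    """Combine sub-model outputs O_k with weights w_k into one image.

    `outputs` is a list of images (each a list of rows of pixel values);
    `weights` holds one weight per image.
    """
    if len(outputs) != len(weights):
        raise ValueError("need exactly one weight per sub-model output")
    height = len(outputs[0])
    width = len(outputs[0][0])
    combined = [[0.0] * width for _ in range(height)]
    for image, weight in zip(outputs, weights):
        for y in range(height):
            for x in range(width):
                # Assumption: weighted sum, accumulated pixel by pixel.
                combined[y][x] += weight * image[y][x]
    return combined


if __name__ == "__main__":
    o1 = [[1.0, 2.0]]
    o2 = [[3.0, 4.0]]
    print(synthesize([o1, o2], [0.5, 0.5]))
```

The combined image is then what the n-th sub-model receives as input; if the weights are trainable, they would be updated together with the other model parameters.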
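In one embodiment, the super-resolution sub-model 1021 can be a three-layer fully convolutional deep neural network. In this network, the first convolutional layer can be an input layer that extracts image information region by region. The input layer can contain multiple convolution kernels for extracting different image information. The second convolutional layer can be a transformation layer that applies a nonlinear transformation to the extracted image information: if the extracted image information is X, the nonlinear transformation can be f(W·X+b). The image information X can be a multi-dimensional vector, namely the multi-dimensional image information extracted by the multiple convolution kernels in the first convolutional layer; W is the weight vector of the convolution kernel, b is the bias vector of the convolution kernel, and f can be an activation function. The third convolutional layer can be an output layer that reconstructs from the image information output by the second convolutional layer. The reconstruction can also be performed by convolving multiple convolution kernels with the image. The output of the output layer can be a 3-channel (color) image or a single-channel (grayscale) image.
If the super-resolution model is formed by connecting multiple sub-models in series and each uses the three-layer fully convolutional deep neural network above, the first and second convolutional layers extract image information from the low-resolution image, that is, they obtain the information usable for super-resolution reconstruction. The third convolutional layer uses the image information extracted and transformed by the first two layers to reconstruct a high-resolution image. Compared with extracting image information using only one convolutional layer, the two additional convolutional layers can extract more accurate image information. In addition, sub-models built from this network must be connected in series to form the super-resolution model. Connecting several sub-models in series requires more computing resources, while fewer convolutional layers mean less computation, so the number of convolutional layers in a sub-model is a trade-off between computing resources and accuracy. With the three-layer fully convolutional deep neural network, a sub-model can extract more accurate image information using relatively few computing resources. More accurate image information helps reconstruct higher-quality images while saving computing resources.
In one embodiment, the weight vector W of the convolution kernel can be a parameter in the super-resolution model; that is, during training of the initial super-resolution model, the weight vector W of the convolution kernel can also be optimized according to the error loss.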
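In one embodiment, the error loss is:
$$L = \lambda_1 L1 + \lambda_2 L2 + \lambda_3 L3 \qquad (1\text{-}8)$$
where L1 is the pixel mean square error (see formula (1-2)), $\lambda_1$ is the weight of the pixel mean square error, L2 is the image feature mean square error (see formula (1-3)), $\lambda_2$ is the weight of the image feature mean square error, L3 is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
$$w = (w_1, w_2, w_3, \dots, w_{N-1}) \qquad (1\text{-}9)$$
where w is the weight matrix of the super-resolution sub-models.
$$L3 = w^T w, \quad \text{or} \quad L3 = \sum |w_i| \qquad (1\text{-}10)$$
The values of $\lambda_1$, $\lambda_2$, and $\lambda_3$ can be determined by experiment or experience.
The added regularization term L3 reduces overfitting and improves the accuracy of the newly created super-resolution model, thereby improving the quality of the reconstructed image.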
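The combined error loss can be assembled from its three parts as follows. This sketch assumes the image features have already been extracted into plain lists of numbers (the feature extractor itself is outside its scope) and that the feature mean square error averages over the feature dimensions; the λ values used in the demo are arbitrary placeholders, since the text leaves them to experiment, and all function names are hypothetical.

```python
def feature_mse(features_hr, features_sr):
    """L2: mean squared difference between two feature vectors."""
    if len(features_hr) != len(features_sr):
        raise ValueError("feature vectors must have the same length")
    squared = [(a - b) ** 2 for a, b in zip(features_hr, features_sr)]
    return sum(squared) / len(squared)


def l2_regularizer(sub_model_weights):
    """L3 = w^T w for the sub-model weights w_k."""
    return sum(w * w for w in sub_model_weights)


def l1_regularizer(sub_model_weights):
    """Alternative L3 = sum of |w_k|."""
    return sum(abs(w) for w in sub_model_weights)


def total_loss(pixel_loss, feature_loss, reg_loss, lambdas):
    """L = lambda1 * L1 + lambda2 * L2 + lambda3 * L3."""
    lam1, lam2, lam3 = lambdas
    return lam1 * pixel_loss + lam2 * feature_loss + lam3 * reg_loss


if __name__ == "__main__":
    l1 = 1.0                                      # pixel mean square error
    l2 = feature_mse([1.0, 2.0], [1.0, 4.0])      # feature mean square error
    l3 = l2_regularizer([0.5, 0.5])               # regularization term
    print(total_loss(l1, l2, l3, (1.0, 0.5, 2.0)))  # placeholder lambdas
```

Raising the second weight makes the training signal care more about feature agreement than raw pixel agreement, which is the lever the method relies on to retain high-frequency detail.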
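As an example, in a face image reconstruction scenario, preparation before training is needed first, that is, obtaining the training set. The procedure is as follows:
Step 1: The acquired high-resolution face images are $\{Y_m \mid 1 \le m \le M\} \in R^{a \times b}$,
where M is the number of training samples and $R^{a \times b}$ denotes the image size; the resolution of the high-resolution face images is a×b.
Step 2: Low-resolution face images are obtained with a downsampling function: $T_m = D(Y_m)$, $\{T_m \mid 1 \le m \le M\} \in R^{(a/t) \times (b/t)}$,
where D can be a downsampling function, that is, a blur function. The resolution of the low-resolution face images is (a/t)×(b/t), and t is a positive integer. $\{T_m \mid 1 \le m \le M\} \in R^{(a/t) \times (b/t)}$ and $\{Y_m \mid 1 \le m \le M\} \in R^{a \times b}$ together form the training set.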
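The two preparation steps can be sketched for grayscale images. The text does not specify the downsampling function D, so this example assumes simple averaging over non-overlapping t×t blocks as a stand-in; `downsample` and `build_training_set` are hypothetical names, and images are nested lists whose dimensions are multiples of t.

```python
def downsample(image, t):
    """Stand-in for D: average each t x t block into one pixel."""
    a, b = len(image), len(image[0])
    if a % t != 0 or b % t != 0:
        raise ValueError("image dimensions must be multiples of t")
    low_res = []
    for top in range(0, a, t):
        row = []
        for left in range(0, b, t):
            block = [
                image[top + dy][left + dx]
                for dy in range(t)
                for dx in range(t)
            ]
            row.append(sum(block) / len(block))
        low_res.append(row)
    return low_res


def build_training_set(high_res_images, t):
    """Pair every T_m = D(Y_m) with its source Y_m."""
    return [(downsample(y, t), y) for y in high_res_images]


if __name__ == "__main__":
    y1 = [[1, 3], [5, 7]]   # a 2 x 2 high-resolution image
    print(build_training_set([y1], 2))
```

Each pair supplies one low-resolution input and the high-resolution target against which the reconstruction is scored.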
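Next, the super-resolution model is trained on the training set. The procedure is as follows:
As shown in FIG. 3, the super-resolution model can contain n super-resolution sub-models. Refer to FIG. 4, which is a schematic structural diagram of a super-resolution model according to an embodiment of this application. As shown in FIG. 4, the k-th sub-model can be a three-layer fully convolutional deep neural network. The input of the k-th sub-model is a face image 104, which can be the face image output by the (k-1)-th sub-model; if k = 1, the face image 104 can be the low-resolution face image 101 input into the super-resolution model 102 in FIG. 3. That is, the face image output by the (k-1)-th sub-model, or the low-resolution face image, is the input of the first convolutional layer of the k-th sub-model. The output of the first convolutional layer of the k-th sub-model is the face image information $X_k$ obtained by convolving with each of s convolution kernels, and $X_k$ is the input of the second convolutional layer of the k-th sub-model, where $X_k = (X1, X2, \dots, Xs)$. The output of the second convolutional layer of the k-th sub-model is f(W·X+b), and the output of the third convolutional layer of the k-th sub-model is the reconstructed face image 105 obtained by convolving f(W·X+b) with m convolution kernels. It should be noted that the size of each convolution kernel in the first convolutional layer can differ from the size of the convolution kernels in the third convolutional layer, and the convolution kernel size of the second convolutional layer can be 1. The numbers of convolution kernels in the first, second, and third convolutional layers may or may not be equal. Here k is a positive integer satisfying 1 ≤ k ≤ n-1.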
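The n-th sub-model can also be a three-layer fully convolutional deep neural network. The face image information input to the first convolutional layer of the n-th sub-model can be:
$$\sum_{k=1}^{n-1} w_k O_k$$
where $O_k$ is the face image information of the face image reconstructed by the k-th sub-model. The second convolutional layer of the n-th sub-model is similar to that of the k-th sub-model described above and is not described again. The reconstructed face image 105 output by the third convolutional layer of the n-th sub-model is the reconstructed image 103 in the super-resolution model described in FIG. 3.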
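In the n-th sub-model, the number of convolution kernels in the third convolutional layer can be equal to the number of channels of the input low-resolution face image, and this layer is used to reconstruct the low-resolution face image to obtain a reconstructed face image. For example, if the low-resolution face image has 3 channels, that is, R, G, and B each occupy one channel, then the third convolutional layer of the last sub-model (the n-th sub-model) has 3 convolution kernels; it reconstructs the low-resolution face image composed of red, green, and blue, and the reconstructed face image is likewise composed of red, green, and blue. As another example, if the low-resolution face image has 1 channel, that is, it is a grayscale image, then the third convolutional layer of the last sub-model has 1 convolution kernel; it reconstructs the grayscale low-resolution face image, and the reconstructed face image is also a grayscale image.
In one embodiment, after the error loss is obtained, the parameters in the newly created super-resolution model can also be determined according to the error loss and at least one of the following: the first image, the second image, and the third image. That is, the device that establishes the image reconstruction model can determine the parameters in the newly created super-resolution model from the error loss and the images related to that error loss, without adjusting parameter values on the basis of the parameters in the initial super-resolution model.
It should be noted that the image reconstruction method above can be used in an image recognition system, for example a face recognition system. It can also be used in an image enhancement system.
The methods of the embodiments of the present invention are described in detail above; the apparatuses of the embodiments of the present invention are provided below.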
请参阅图5,图5是本申请实施例提供的一种图像重建设备的结构示意图。如图5所示,该设备可以包含处理单元501和接收单元502,其中:
处理单元501,将第一图像输入新建的超分辨率模型以得到重建的第二图像,第二图像的分辨率高于第一图像;
新建的超分辨率模型是使用误差损失对初始的超分辨率模型进行训练得到的;误差损失包括像素均方差和图像特征均方差;图像特征包含纹理特征、形状特征、空间关系特征和图像高层语义特征中的至少一项;
接收单元502,用于接收输入新建的超分辨率模型中的第一图像。作为一种可能的实施方式,误差损失是第三图像和第四图像之间的误差损失,第三图像是将第五图像输入初始的超分辨率模型进行重建得到的;第四图像为高分辨率图像,第五图像为第四图像通过模糊化处理得到的低分辨率图像;初始的超分辨率模型用于重建输入初始的超分辨率模型的图像以提高分辨率。
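In a possible implementation, there are M third images, M fourth images, and M fifth images, and there are M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined from the M third images and the M fourth images.
Any one of the M error losses is the error loss between the i-th third image and the j-th fourth image, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image. M is a positive integer greater than 1, and i and j are positive integers less than or equal to M.
In a possible implementation, the newly built super-resolution model is obtained by adjusting the parameters of the initial super-resolution model according to the M error losses; or,
the initial super-resolution model is the 1st super-resolution model; among the M error losses, the parameters of the 1st super-resolution model are adjusted according to the 1st error loss to obtain the 2nd super-resolution model, the parameters of the r-th super-resolution model are adjusted according to the r-th error loss to obtain the (r+1)-th super-resolution model, and the newly built super-resolution model is obtained by adjusting the parameters of the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.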
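The two alternatives (one adjustment from all M error losses, or M successive adjustments) can be contrasted in a small sketch. The update rule itself is not given in the text; plain gradient descent with a fixed learning rate, and averaging of the M gradients in the first alternative, are assumptions made for illustration. The names `update_batch`, `update_sequential`, and `grad` are hypothetical.

```python
import numpy as np


def update_batch(theta, grad, m, lr):
    """Adjust the initial parameters once, using all M error losses.

    grad(theta, r) returns the gradient of the r-th error loss at theta.
    The M gradients are averaged (an assumption) before one step is taken.
    """
    g = sum(grad(theta, r) for r in range(m)) / m
    return theta - lr * g


def update_sequential(theta, grad, m, lr):
    """Adjust the r-th model with the r-th error loss to get the (r+1)-th.

    After M steps the result plays the role of the newly built model.
    """
    for r in range(m):
        theta = theta - lr * grad(theta, r)
    return theta
```

In the sequential variant each loss is evaluated on the parameters produced by the previous step, so the order of the training samples can affect the result, whereas the single batch adjustment is order-independent.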
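In a possible implementation, the initial super-resolution model contains n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each super-resolution sub-model is used to reconstruct the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information.
Among the n super-resolution sub-models, the input of the first sub-model is the first image, the output of the first sub-model serves as the input of the second sub-model, the output of the (t-1)-th sub-model serves as the input of the t-th sub-model, and the output of the t-th sub-model serves as the input of the (t+1)-th sub-model, where t is a positive integer satisfying 2≤t≤n-1. The output of the t-th sub-model also serves as an input of the output synthesis module, the output of the output synthesis module serves as the input of the n-th sub-model, and the output of the n-th sub-model is the second image. The output synthesis module determines the input of the n-th sub-model from the reconstructed image information output by the first n-1 sub-models and their respective weights. In other words, the foregoing initial super-resolution model is contained in the processing unit 501.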
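The data flow through the n sub-models and the output synthesis module can be sketched as follows. Sub-models are represented as plain callables, and the synthesis step is assumed to be a weighted sum, which the text does not state explicitly; `run_super_resolution` is a hypothetical name.

```python
def run_super_resolution(first_image, sub_models, weights):
    """Chain the first n-1 sub-models, combine their outputs, run the n-th.

    sub_models: list of n callables, each mapping an image to an image.
    weights: list of n-1 scalars, one per output of the first n-1 sub-models.
    """
    n = len(sub_models)
    if n < 2:
        raise ValueError("at least two sub-models are required")
    if len(weights) != n - 1:
        raise ValueError("need one weight per sub-model among the first n-1")

    outputs = []
    x = first_image
    for model in sub_models[:-1]:
        x = model(x)          # output of one sub-model feeds the next
        outputs.append(x)     # and is also passed to the synthesis module

    # Output synthesis module (assumed weighted sum of the n-1 outputs).
    combined = sum(w * o for w, o in zip(weights, outputs))
    return sub_models[-1](combined)   # the n-th sub-model yields the second image
```

Any callable that maps an image to an image of the same shape can stand in for a sub-model here, which makes the wiring easy to check before real networks are plugged in.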
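In a possible implementation, $w_k$ is a parameter of the super-resolution model.
In a possible implementation, each super-resolution sub-model is a three-layer fully convolutional deep neural network.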
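In a possible implementation, the error loss is $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, where $L_1$ is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, $L_2$ is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, $L_3$ is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.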
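The combined error loss can be computed along the following lines. Two choices here are assumptions not stated in the text: the image features are approximated by finite-difference gradients (a crude stand-in for texture or shape features), and the regularization term is taken to be the squared L2 norm of the weights w_k. The names `gradient_features` and `error_loss` are hypothetical.

```python
import numpy as np


def gradient_features(img):
    """Hypothetical feature extractor: vertical and horizontal differences."""
    gy, gx = np.gradient(np.asarray(img, dtype=float), axis=(0, 1))
    return np.stack([gy, gx])


def error_loss(reconstructed, target, w, lambdas=(1.0, 1.0, 1e-3),
               features=gradient_features):
    """L = lambda_1 * L1 + lambda_2 * L2 + lambda_3 * L3."""
    reconstructed = np.asarray(reconstructed, dtype=float)
    target = np.asarray(target, dtype=float)
    # L1: pixel mean square error between reconstruction and ground truth.
    l1 = np.mean((reconstructed - target) ** 2)
    # L2: mean square error between the features of the two images.
    l2 = np.mean((features(reconstructed) - features(target)) ** 2)
    # L3: regularization term on the weights w_k (assumed squared L2 norm).
    l3 = np.sum(np.asarray(w, dtype=float) ** 2)
    lam1, lam2, lam3 = lambdas
    return lam1 * l1 + lam2 * l2 + lam3 * l3
```

Passing a different `features` callable, for instance activations from a pretrained network, changes which image characteristics the second term penalizes without touching the rest of the loss.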
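Note that for the implementation of each module, reference may also be made to the corresponding description of the method embodiment shown in FIG. 1; details are not repeated here.
The foregoing image reconstruction device may be an image recognition device, for example a face recognition device. It may also be an image enhancement device or the like.
Refer to FIG. 6, which is a schematic structural diagram of another image reconstruction device according to an embodiment of this application. As shown in FIG. 6, the device includes a processor 601, a memory 602, and a communication interface 603, which are connected to one another through a bus 604, where:
The memory 602 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM). The memory 602 is used to store related instructions and data; specifically, it may be used to store the super-resolution model.
The communication interface 603 may be used to communicate with other devices. For example, it may receive a training set that contains fifth images and fourth images, where a fourth image is a high-resolution image and a fifth image is a low-resolution image obtained by blurring a fourth image. The communication interface 603 may also receive a low-resolution image that needs to be reconstructed, for example the first image.
The processor 601 may be one or more central processing units (CPUs). When the processor 601 is a single CPU, that CPU may be a single-core CPU or a multi-core CPU.
The processor 601 in the image reconstruction device is configured to read the program code stored in the memory 602 and perform the following operations: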
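Input a first image into a newly built super-resolution model to obtain a reconstructed second image, where the resolution of the second image is higher than that of the first image.
The newly built super-resolution model is obtained by training an initial super-resolution model using an error loss; the error loss includes a pixel mean square error and an image feature mean square error; the image features include at least one of texture features, shape features, spatial relationship features, and high-level semantic features of the image.
In a possible implementation, the error loss is the error loss between a third image and a fourth image, where the third image is obtained by inputting a fifth image into the initial super-resolution model for reconstruction; the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image; the initial super-resolution model is used to reconstruct an image input to it so as to increase the resolution.
In a possible implementation, there are M third images, M fourth images, and M fifth images, and there are M error losses. The M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction, and the M error losses are determined from the M third images and the M fourth images.
Any one of the M error losses is the error loss between the i-th third image and the j-th fourth image, where the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image. M is a positive integer greater than 1, and i and j are positive integers less than or equal to M.
In a possible implementation, the newly built super-resolution model is obtained by adjusting the parameters of the initial super-resolution model according to the M error losses; or,
the initial super-resolution model is the 1st super-resolution model; among the M error losses, the parameters of the 1st super-resolution model are adjusted according to the 1st error loss to obtain the 2nd super-resolution model, the parameters of the r-th super-resolution model are adjusted according to the r-th error loss to obtain the (r+1)-th super-resolution model, and the newly built super-resolution model is obtained by adjusting the parameters of the M-th super-resolution model using the M-th error loss, where r is a positive integer greater than or equal to 1 and less than or equal to M.
In a possible implementation, the initial super-resolution model contains n super-resolution sub-models, where n is a positive integer greater than or equal to 2. Each super-resolution sub-model is used to reconstruct the image information input to it so as to increase the resolution; the image information includes pixel value information and image feature information.
Among the n super-resolution sub-models, the input of the first sub-model is the first image, the output of the first sub-model serves as the input of the second sub-model, the output of the (t-1)-th sub-model serves as the input of the t-th sub-model, and the output of the t-th sub-model serves as the input of the (t+1)-th sub-model, where t is a positive integer satisfying 2≤t≤n-1. The output of the t-th sub-model also serves as an input of the output synthesis module, the output of the output synthesis module serves as the input of the n-th sub-model, and the output of the n-th sub-model is the second image. The output synthesis module determines the input of the n-th sub-model from the reconstructed image information output by the first n-1 sub-models and their respective weights.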
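In a possible implementation, $w_k$ is a parameter of the super-resolution model.
In a possible implementation, each super-resolution sub-model is a three-layer fully convolutional deep neural network.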
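In a possible implementation, the error loss is $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, where $L_1$ is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, $L_2$ is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, $L_3$ is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.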
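Note that for the implementation of each of the foregoing operations, reference may also be made to the corresponding description of the method embodiment shown in FIG. 1; details are not repeated here. The foregoing image reconstruction device may be an image recognition device, for example a face recognition device. It may also be an image enhancement device or the like.
An embodiment of the present invention further provides a chip system. The chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected through lines, and the memory stores instructions. When the instructions are executed by the processor, the method procedure shown in FIG. 1 is implemented.
An embodiment of the present invention further provides a computer-readable storage medium that stores instructions. When the instructions are run on a processor, the method procedure shown in FIG. 1 is implemented.
An embodiment of the present invention further provides a computer program product. When the computer program product is run on a processor, the method procedure shown in FIG. 1 is implemented.
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through the computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will understand that all or part of the procedures of the foregoing method embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a random access memory (RAM), a magnetic disk, or an optical disc.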
Claims (19)
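- An image reconstruction method, comprising: inputting a first image into a newly built super-resolution model to obtain a reconstructed second image, wherein the resolution of the second image is higher than that of the first image; the newly built super-resolution model is obtained by training an initial super-resolution model using an error loss; the error loss comprises a pixel mean square error and an image feature mean square error; and the image features comprise at least one of texture features, shape features, spatial relationship features, and high-level semantic features of the image.
- The method according to claim 1, wherein the error loss is the error loss between a third image and a fourth image; the third image is obtained by inputting a fifth image into the initial super-resolution model for reconstruction; the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image; and the initial super-resolution model is used to reconstruct an image input to the initial super-resolution model so as to increase the resolution.
- The method according to claim 2, wherein there are M third images, M fourth images, and M fifth images, and there are M error losses; the M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction; the M error losses are determined from the M third images and the M fourth images; any one of the M error losses is the error loss between the i-th third image of the M third images and the j-th fourth image of the M fourth images, and the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image; M is a positive integer greater than 1, and i and j are positive integers less than or equal to M.
- The method according to claim 3, wherein the newly built super-resolution model is obtained by adjusting the parameters of the initial super-resolution model according to the M error losses; or, the initial super-resolution model is the 1st super-resolution model, and among the M error losses, the parameters of the 1st super-resolution model are adjusted according to the 1st error loss to obtain the 2nd super-resolution model, the parameters of the r-th super-resolution model are adjusted according to the r-th error loss to obtain the (r+1)-th super-resolution model, and the newly built super-resolution model is obtained by adjusting the parameters of the M-th super-resolution model using the M-th error loss; wherein r is a positive integer greater than or equal to 1 and less than or equal to M.
- The method according to any one of claims 1 to 4, wherein the initial super-resolution model contains n super-resolution sub-models, and n is a positive integer greater than or equal to 2; each super-resolution sub-model is used to reconstruct the image information input to the super-resolution sub-model so as to increase the resolution; the image information comprises pixel value information and image feature information; among the n super-resolution sub-models, the input of the first super-resolution sub-model is the first image, the output of the first super-resolution sub-model serves as the input of the second super-resolution sub-model, the output of the (t-1)-th super-resolution sub-model serves as the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model serves as the input of the (t+1)-th super-resolution sub-model; t is a positive integer satisfying 2≤t≤n-1; the output of the t-th super-resolution sub-model also serves as an input of an output synthesis module, the output of the output synthesis module serves as the input of the n-th super-resolution sub-model, and the output of the n-th super-resolution sub-model is the second image; and the output synthesis module is used to determine the input of the n-th super-resolution sub-model from the reconstructed image information output by the first n-1 super-resolution sub-models and their respective weights.
- The method according to claim 6, wherein $w_k$ is a parameter of the initial super-resolution model.
- The method according to any one of claims 5 to 7, wherein each super-resolution sub-model is a three-layer fully convolutional deep neural network.
- The method according to any one of claims 6 to 8, wherein the error loss is $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, where $L_1$ is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, $L_2$ is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, $L_3$ is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.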
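- An image reconstruction device, comprising a processor and a memory, wherein the memory is used to store program instructions, and the processor is used to invoke the program instructions to perform the following operations: inputting a first image into a newly built super-resolution model to obtain a reconstructed second image, wherein the resolution of the second image is higher than that of the first image; the newly built super-resolution model is obtained by training an initial super-resolution model using an error loss; the error loss comprises a pixel mean square error and an image feature mean square error; and the image features comprise at least one of texture features, shape features, spatial relationship features, and high-level semantic features of the image.
- The device according to claim 10, wherein the error loss is the error loss between a third image and a fourth image; the third image is obtained by inputting a fifth image into the initial super-resolution model for reconstruction; the fourth image is a high-resolution image, and the fifth image is a low-resolution image obtained by blurring the fourth image; and the initial super-resolution model is used to reconstruct an image input to the initial super-resolution model so as to increase the resolution.
- The device according to claim 11, wherein there are M third images, M fourth images, and M fifth images, and there are M error losses; the M third images are obtained by inputting the M fifth images into the initial super-resolution model for reconstruction; the M error losses are determined from the M third images and the M fourth images; any one of the M error losses is the error loss between the i-th third image of the M third images and the j-th fourth image of the M fourth images, and the i-th third image is the image obtained by inputting, into the initial super-resolution model, the fifth image obtained by blurring the j-th fourth image; M is a positive integer greater than 1, and i and j are positive integers less than or equal to M.
- The device according to claim 12, wherein the newly built super-resolution model is obtained by adjusting the parameters of the initial super-resolution model according to the M error losses; or, the initial super-resolution model is the 1st super-resolution model, and among the M error losses, the parameters of the 1st super-resolution model are adjusted according to the 1st error loss to obtain the 2nd super-resolution model, the parameters of the r-th super-resolution model are adjusted according to the r-th error loss to obtain the (r+1)-th super-resolution model, and the newly built super-resolution model is obtained by adjusting the parameters of the M-th super-resolution model using the M-th error loss; wherein r is a positive integer greater than or equal to 1 and less than or equal to M.
- The device according to any one of claims 10 to 13, wherein the initial super-resolution model contains n super-resolution sub-models, and n is a positive integer greater than or equal to 2; each super-resolution sub-model is used to reconstruct the image information input to the super-resolution sub-model so as to increase the resolution; the image information comprises pixel value information and image feature information; among the n super-resolution sub-models, the input of the first super-resolution sub-model is the first image, the output of the first super-resolution sub-model serves as the input of the second super-resolution sub-model, the output of the (t-1)-th super-resolution sub-model serves as the input of the t-th super-resolution sub-model, and the output of the t-th super-resolution sub-model serves as the input of the (t+1)-th super-resolution sub-model; t is a positive integer satisfying 2≤t≤n-1; the output of the t-th super-resolution sub-model also serves as an input of an output synthesis module, the output of the output synthesis module serves as the input of the n-th super-resolution sub-model, and the output of the n-th super-resolution sub-model is the second image; and the output synthesis module is used to determine the input of the n-th super-resolution sub-model from the reconstructed image information output by the first n-1 super-resolution sub-models and their respective weights.
- The device according to claim 15, wherein $w_k$ is a parameter of the initial super-resolution model.
- The device according to any one of claims 14 to 16, wherein each super-resolution sub-model is a three-layer fully convolutional deep neural network.
- The device according to any one of claims 15 to 17, wherein the error loss is $L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3$, where $L_1$ is the pixel mean square error, $\lambda_1$ is the weight of the pixel mean square error, $L_2$ is the image feature mean square error, $\lambda_2$ is the weight of the image feature mean square error, $L_3$ is the regularization term of $w_k$, and $\lambda_3$ is the weight of the regularization term.
- A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are run by a processor, the method according to any one of claims 1 to 9 is implemented.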
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18892072.2A EP3716198A4 (en) | 2017-12-20 | 2018-12-12 | IMAGE RECONSTRUCTION PROCESS AND DEVICE |
US16/903,667 US11551333B2 (en) | 2017-12-20 | 2020-06-17 | Image reconstruction method and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711387428.6A CN109949255B (zh) | 2017-12-20 | 2017-12-20 | 图像重建方法及设备 |
CN201711387428.6 | 2017-12-20 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/903,667 Continuation US11551333B2 (en) | 2017-12-20 | 2020-06-17 | Image reconstruction method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019120110A1 (zh) | 2019-06-27 |
Family
ID=66994374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/120447 WO2019120110A1 (zh) | 2017-12-20 | 2018-12-12 | 图像重建方法及设备 |
Country Status (4)
Country | Link |
---|---|
US (1) | US11551333B2 (zh) |
EP (1) | EP3716198A4 (zh) |
CN (1) | CN109949255B (zh) |
WO (1) | WO2019120110A1 (zh) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111524068A (zh) * | 2020-04-14 | 2020-08-11 | 长安大学 | 一种基于深度学习的变长输入超分辨率视频重建方法 |
CN111612698A (zh) * | 2020-05-29 | 2020-09-01 | 西安培华学院 | 一种图像超分辨率修复重建方法 |
CN112070676A (zh) * | 2020-09-10 | 2020-12-11 | 东北大学秦皇岛分校 | 一种双通道多感知卷积神经网络的图片超分辨率重建方法 |
CN112446824A (zh) * | 2019-08-28 | 2021-03-05 | 新华三技术有限公司 | 一种图像重建方法及装置 |
CN112819695A (zh) * | 2021-01-26 | 2021-05-18 | 北京小米移动软件有限公司 | 图像超分辨率重建方法、装置、电子设备及介质 |
CN113229767A (zh) * | 2021-04-12 | 2021-08-10 | 佛山市顺德区美的洗涤电器制造有限公司 | 用于处理图像的方法、处理器、控制装置及家用电器 |
CN113723317A (zh) * | 2021-09-01 | 2021-11-30 | 京东科技控股股份有限公司 | 3d人脸的重建方法、装置、电子设备和存储介质 |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11309081B2 (en) * | 2010-10-13 | 2022-04-19 | Gholam A. Peyman | Telemedicine system with dynamic imaging |
US11816795B2 (en) * | 2018-12-20 | 2023-11-14 | Sony Group Corporation | Photo-video based spatial-temporal volumetric capture system for dynamic 4D human face and body digitization |
US11720621B2 (en) * | 2019-03-18 | 2023-08-08 | Apple Inc. | Systems and methods for naming objects based on object content |
US11989772B2 (en) * | 2019-07-05 | 2024-05-21 | Infilect Technologies Private Limited | System and method for extracting product information from low resolution photos for updating product master |
CN110415172B (zh) * | 2019-07-10 | 2023-03-17 | 武汉大学苏州研究院 | 一种面向混合分辨率码流中人脸区域的超分辨率重建方法 |
CN110533594B (zh) * | 2019-08-30 | 2023-04-07 | Oppo广东移动通信有限公司 | 模型训练方法、图像重建方法、存储介质及相关设备 |
CN110647936B (zh) * | 2019-09-20 | 2023-07-04 | 北京百度网讯科技有限公司 | 视频超分辨率重建模型的训练方法、装置和电子设备 |
US11769180B2 (en) * | 2019-10-15 | 2023-09-26 | Orchard Technologies, Inc. | Machine learning systems and methods for determining home value |
CN110910436B (zh) * | 2019-10-30 | 2022-10-28 | 深圳供电局有限公司 | 基于图像信息增强技术的测距方法、装置、设备和介质 |
CN110930309B (zh) * | 2019-11-20 | 2023-04-18 | 武汉工程大学 | 基于多视图纹理学习的人脸超分辨率方法及装置 |
CN111127317B (zh) * | 2019-12-02 | 2023-07-25 | 深圳供电局有限公司 | 图像超分辨率重建方法、装置、存储介质和计算机设备 |
CN111325692B (zh) * | 2020-02-21 | 2023-06-13 | 厦门美图之家科技有限公司 | 画质增强方法、装置、电子设备和可读存储介质 |
US11257276B2 (en) * | 2020-03-05 | 2022-02-22 | Disney Enterprises, Inc. | Appearance synthesis of digital faces |
EP4128135A4 (en) | 2020-04-01 | 2023-06-07 | BOE Technology Group Co., Ltd. | COMPUTER IMPLEMENTED PROCESS, APPARATUS AND COMPUTER PROGRAM PRODUCT |
CN111553840B (zh) * | 2020-04-10 | 2023-06-27 | 北京百度网讯科技有限公司 | 图像超分辨的模型训练和处理方法、装置、设备和介质 |
CN111738924A (zh) * | 2020-06-22 | 2020-10-02 | 北京字节跳动网络技术有限公司 | 图像处理方法及装置 |
FR3112009B1 (fr) * | 2020-06-30 | 2022-10-28 | St Microelectronics Grenoble 2 | Procédé de conversion d’une image numérique |
WO2022011571A1 (zh) * | 2020-07-14 | 2022-01-20 | Oppo广东移动通信有限公司 | 视频处理方法、装置、设备、解码器、系统及存储介质 |
CN112102411B (zh) * | 2020-11-02 | 2021-02-12 | 中国人民解放军国防科技大学 | 一种基于语义误差图像的视觉定位方法及装置 |
CN112734645B (zh) * | 2021-01-19 | 2023-11-03 | 青岛大学 | 一种基于特征蒸馏复用的轻量化图像超分辨率重建方法 |
CN112801878A (zh) * | 2021-02-08 | 2021-05-14 | 广东三维家信息科技有限公司 | 渲染图像超分辨率纹理增强方法、装置、设备及存储介质 |
CN112950470B (zh) * | 2021-02-26 | 2022-07-15 | 南开大学 | 基于时域特征融合的视频超分辨率重建方法及系统 |
CN112991212A (zh) * | 2021-03-16 | 2021-06-18 | Oppo广东移动通信有限公司 | 图像处理方法、装置、电子设备及存储介质 |
CN113205005B (zh) * | 2021-04-12 | 2022-07-19 | 武汉大学 | 一种面向低光照低分辨率的人脸图像幻构方法 |
CN113096019B (zh) * | 2021-04-28 | 2023-04-18 | 中国第一汽车股份有限公司 | 图像重建方法、装置、图像处理设备及存储介质 |
US12081880B2 (en) * | 2021-05-11 | 2024-09-03 | Samsung Electronics Co., Ltd. | Image super-resolution with reference images from one or more cameras |
CN113222178B (zh) * | 2021-05-31 | 2024-02-09 | Oppo广东移动通信有限公司 | 模型训练方法、用户界面的生成方法、装置及存储介质 |
CN113269858B (zh) * | 2021-07-19 | 2021-11-30 | 腾讯科技(深圳)有限公司 | 虚拟场景渲染方法、装置、计算机设备和存储介质 |
CN113591798B (zh) * | 2021-08-23 | 2023-11-03 | 京东科技控股股份有限公司 | 文档文字的重建方法及装置、电子设备、计算机存储介质 |
CN113837941B (zh) * | 2021-09-24 | 2023-09-01 | 北京奇艺世纪科技有限公司 | 图像超分模型的训练方法、装置及计算机可读存储介质 |
CN114723611B (zh) * | 2022-06-10 | 2022-09-30 | 季华实验室 | 图像重建模型训练方法、重建方法、装置、设备及介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106204447A (zh) * | 2016-06-30 | 2016-12-07 | 北京大学 | 基于总变差分和卷积神经网络的超分辨率重建方法 |
CN106600538A (zh) * | 2016-12-15 | 2017-04-26 | 武汉工程大学 | 一种基于区域深度卷积神经网络的人脸超分辨率算法 |
CN106683067A (zh) * | 2017-01-20 | 2017-05-17 | 福建帝视信息科技有限公司 | 一种基于残差子图像的深度学习超分辨率重建方法 |
CN107464217A (zh) * | 2017-08-16 | 2017-12-12 | 清华-伯克利深圳学院筹备办公室 | 一种图像处理方法及装置 |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6961139B2 (ja) * | 2015-07-24 | 2021-11-05 | エーテーハー チューリッヒ | 知覚的な縮小方法を用いて画像を縮小するための画像処理システム |
CN106096547B (zh) | 2016-06-11 | 2019-02-19 | 北京工业大学 | 一种面向识别的低分辨率人脸图像特征超分辨率重建方法 |
CN106991646B (zh) * | 2017-03-28 | 2020-05-26 | 福建帝视信息科技有限公司 | 一种基于密集连接网络的图像超分辨率方法 |
CN107330381A (zh) * | 2017-06-15 | 2017-11-07 | 浙江捷尚视觉科技股份有限公司 | 一种人脸识别方法 |
CN107392316B (zh) * | 2017-06-30 | 2021-05-18 | 北京奇虎科技有限公司 | 网络训练方法、装置、计算设备及计算机存储介质 |
CN107369189A (zh) * | 2017-07-21 | 2017-11-21 | 成都信息工程大学 | 基于特征损失的医学图像超分辨率重建方法 |
CN107480772B (zh) * | 2017-08-08 | 2020-08-11 | 浙江大学 | 一种基于深度学习的车牌超分辨率处理方法及系统 |
CN107451619A (zh) * | 2017-08-11 | 2017-12-08 | 深圳市唯特视科技有限公司 | 一种基于感知生成对抗网络的小目标检测方法 |
2017
- 2017-12-20 CN CN201711387428.6A patent/CN109949255B/zh active Active
2018
- 2018-12-12 EP EP18892072.2A patent/EP3716198A4/en active Pending
- 2018-12-12 WO PCT/CN2018/120447 patent/WO2019120110A1/zh unknown
2020
- 2020-06-17 US US16/903,667 patent/US11551333B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3716198A4 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112446824B (zh) * | 2019-08-28 | 2024-02-27 | 新华三技术有限公司 | 一种图像重建方法及装置 |
CN112446824A (zh) * | 2019-08-28 | 2021-03-05 | 新华三技术有限公司 | 一种图像重建方法及装置 |
CN111524068B (zh) * | 2020-04-14 | 2023-06-02 | 长安大学 | 一种基于深度学习的变长输入超分辨率视频重建方法 |
CN111524068A (zh) * | 2020-04-14 | 2020-08-11 | 长安大学 | 一种基于深度学习的变长输入超分辨率视频重建方法 |
CN111612698A (zh) * | 2020-05-29 | 2020-09-01 | 西安培华学院 | 一种图像超分辨率修复重建方法 |
CN112070676B (zh) * | 2020-09-10 | 2023-10-27 | 东北大学秦皇岛分校 | 一种双通道多感知卷积神经网络的图片超分辨率重建方法 |
CN112070676A (zh) * | 2020-09-10 | 2020-12-11 | 东北大学秦皇岛分校 | 一种双通道多感知卷积神经网络的图片超分辨率重建方法 |
CN112819695A (zh) * | 2021-01-26 | 2021-05-18 | 北京小米移动软件有限公司 | 图像超分辨率重建方法、装置、电子设备及介质 |
CN112819695B (zh) * | 2021-01-26 | 2024-04-26 | 北京小米移动软件有限公司 | 图像超分辨率重建方法、装置、电子设备及介质 |
CN113229767B (zh) * | 2021-04-12 | 2022-08-19 | 佛山市顺德区美的洗涤电器制造有限公司 | 用于处理图像的方法、处理器、控制装置及家用电器 |
CN113229767A (zh) * | 2021-04-12 | 2021-08-10 | 佛山市顺德区美的洗涤电器制造有限公司 | 用于处理图像的方法、处理器、控制装置及家用电器 |
CN113723317A (zh) * | 2021-09-01 | 2021-11-30 | 京东科技控股股份有限公司 | 3d人脸的重建方法、装置、电子设备和存储介质 |
CN113723317B (zh) * | 2021-09-01 | 2024-04-09 | 京东科技控股股份有限公司 | 3d人脸的重建方法、装置、电子设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
EP3716198A1 (en) | 2020-09-30 |
CN109949255A (zh) | 2019-06-28 |
US20200311871A1 (en) | 2020-10-01 |
EP3716198A4 (en) | 2021-03-10 |
US11551333B2 (en) | 2023-01-10 |
CN109949255B (zh) | 2023-07-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019120110A1 (zh) | 图像重建方法及设备 | |
US11501415B2 (en) | Method and system for high-resolution image inpainting | |
WO2020199931A1 (zh) | 人脸关键点检测方法及装置、存储介质和电子设备 | |
TWI749356B (zh) | 一種圖像風格轉換方法及設備、儲存介質 | |
CN110335290B (zh) | 基于注意力机制的孪生候选区域生成网络目标跟踪方法 | |
WO2020216227A9 (zh) | 图像分类方法、数据处理方法和装置 | |
CN113705769B (zh) | 一种神经网络训练方法以及装置 | |
US11954822B2 (en) | Image processing method and device, training method of neural network, image processing method based on combined neural network model, constructing method of combined neural network model, neural network processor, and storage medium | |
CN112288011B (zh) | 一种基于自注意力深度神经网络的图像匹配方法 | |
WO2022116856A1 (zh) | 一种模型结构、模型训练方法、图像增强方法及设备 | |
CN112651438A (zh) | 多类别图像的分类方法、装置、终端设备和存储介质 | |
CN106228512A (zh) | 基于学习率自适应的卷积神经网络图像超分辨率重建方法 | |
CN110363068B (zh) | 一种基于多尺度循环生成式对抗网络的高分辨行人图像生成方法 | |
CN110909796A (zh) | 一种图像分类方法及相关装置 | |
US20220215617A1 (en) | Viewpoint image processing method and related device | |
CN115565043A (zh) | 结合多表征特征以及目标预测法进行目标检测的方法 | |
WO2023036157A1 (en) | Self-supervised spatiotemporal representation learning by exploring video continuity | |
CN114830168A (zh) | 图像重建方法、电子设备和计算机可读存储介质 | |
Uddin et al. | A perceptually inspired new blind image denoising method using $ L_ {1} $ and perceptual loss | |
CN116091823A (zh) | 一种基于快速分组残差模块的单特征无锚框目标检测方法 | |
CN116246110A (zh) | 基于改进胶囊网络的图像分类方法 | |
WO2024099026A1 (zh) | 图像处理方法、装置、设备、存储介质及程序产品 | |
Li et al. | Zero-referenced low-light image enhancement with adaptive filter network | |
WO2020187029A1 (zh) | 图像处理方法及装置、神经网络的训练方法、存储介质 | |
CN117593187A (zh) | 基于元学习和Transformer的遥感图像任意尺度超分辨率重建方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18892072; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2018892072; Country of ref document: EP; Effective date: 20200624 |