WO2021115242A1 - A super-resolution image processing method and related apparatus

A super-resolution image processing method and related apparatus

Info

Publication number: WO2021115242A1
Application number: PCT/CN2020/134444
Authority: WO - WIPO (PCT)
Prior art keywords: image, resolution, low, super, image block
Other languages: English (en), French (fr)
Inventors: 林焕, 陈濛, 周琛晖
Original Assignee: 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2021115242A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformation in the plane of the image
    • G06T3/40 - Scaling the whole image or part thereof
    • G06T3/4053 - Super resolution, i.e. output image resolution higher than sensor resolution

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a super-resolution image processing method and related devices.
  • Super-resolution technology refers to the use of one or more low-resolution (LR) images to obtain a clear high-resolution (HR) image, and the above processing is referred to as super-resolution processing for short.
  • video files are usually super-resolved on a server (cloud), and the processing results are delivered to a terminal device (mobile terminal).
  • the image displayed by the terminal device is therefore an image that has already undergone super-resolution processing on the server.
  • a large amount of data interaction is required between the server and the terminal device.
  • the terminal device needs to send a video file to the server, and the server performs super-resolution processing and then sends the result back to the terminal device.
  • the above process occupies a large amount of network bandwidth. Therefore, a solution for performing super-resolution processing on the terminal device has now been proposed.
  • the embodiments of the present application provide a super-resolution image processing method and related devices, which can reduce the computational complexity of the equipment running the super-resolution image processing method and improve the image clarity after super-resolution processing.
  • an embodiment of the present application proposes a super-resolution image processing method, which may include: generating a rich-detail image block and a low-detail image block from a low-resolution image, wherein the size of the rich-detail image block is smaller than that of the low-resolution image.
  • the source of the low-resolution image can be a media file, and specifically can be any encoded frame file in the video file, such as a key frame or other frames (P-frame or B-frame, etc.).
  • the amount of image feature information included in the rich-detail image block is greater than the amount of image feature information included in the less-detailed image block.
  • specifically, the amount of color information of the image included in the rich-detail image block is greater than that included in the low-detail image block.
  • for example, the rich-detail image block includes the three colors red (R), green (G) and blue (B).
  • the channel corresponding to red is represented as (255, 0, 0), the channel corresponding to green is represented as (0, 255, 0), and the channel corresponding to blue is represented as (0, 0, 255).
  • the amount of image feature information (color information of the image) included in the rich-detail image block is greater than the amount of image feature information (color information of the image) included in the low-detail image block; similar image blocks are determined according to the rich-detail image block, where a similar image block is one whose image features have a high degree of similarity to the rich-detail image block.
  • the similarity between the image feature information included in the similar image block and the image feature information included in the detailed image block is greater than the first threshold.
  • in the process of generating a super-resolution image from a low-resolution image, rich-detail image blocks and low-detail image blocks are first generated based on the low-resolution image, splitting the low-resolution image into smaller sizes.
  • the rich-detail image block is used as the reference image of the low-resolution image, which reduces the amount of calculation of the device running the super-resolution image processing method.
  • similar image blocks are determined through the rich-detail image blocks, so that when the device performs super-resolution processing on the low-resolution image through the first super-resolution network model, the similar image blocks can be introduced into the super-resolution processing.
  • because the degree of similarity is high (that is, greater than the first threshold), the similar image block can be considered to include more image feature information; using the similar image block as a reference image of the low-resolution image can effectively improve the definition of the super-resolution image.
  • generating the rich-detail image block according to the low-resolution image may include: generating an image block set according to the low-resolution image, where the image block set includes at least one low-resolution image block. Specifically, after acquiring the low-resolution image, the device first divides the low-resolution image into smaller low-resolution image blocks, and these low-resolution image blocks form the image block set; for example, the low-resolution image is divided into low-resolution image blocks with a height of 32 pixels and a width of 32 pixels. The specific size of the low-resolution image block is determined by actual requirements (that is, the requirements of the subsequent neural network model).
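For illustration only, a minimal Python sketch of this splitting step (the 32-pixel block size follows the text; dropping the remainder at the right and bottom edges is an assumption, since the application does not specify edge handling):

```python
import numpy as np

def split_into_blocks(image: np.ndarray, block: int = 32) -> list:
    """Split an H x W (x C) low-resolution image into block x block tiles.
    Edge remainders are dropped (an assumption, not specified by the text)."""
    h, w = image.shape[:2]
    blocks = []
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            blocks.append(image[y:y + block, x:x + block])
    return blocks
```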
  • the low-resolution image block is processed through the first network model to determine the image block with rich details and the image block with insufficient details.
  • the first network model may be a classification network (regression network), which specifically consists of several convolutional layers and at least one softmax layer.
  • the image blocks (low-resolution image blocks) in the image block set are processed through the first network model to determine the image blocks with rich details and the image blocks with low details.
  • splitting the low-resolution image reduces the amount of calculation required to determine the rich-detail image blocks.
  • the first network model may be obtained locally through machine learning training, or it may be obtained by training on a remote device, such as a cloud server, and then sent to the local.
  • generating a rich-detail image block and a low-detail image block based on a low-resolution image may include: performing convolution processing on the low-resolution image blocks through the first network model to generate a first convolution data set. Because the first network model needs to classify the low-resolution image blocks, the image feature data set that the first network model generates by convolving the low-resolution image blocks is only the output of a preliminary convolution step; the first network model must perform further convolution processing on that image feature data set, and the result of this further convolution is used as the source data for the subsequent classification processing.
  • the result of the further convolution processing is called the first convolution data set; after the first convolution data set is generated, it is classified by the first network model to determine the rich-detail image blocks and the low-detail image blocks.
  • the first convolution data set is input to the softmax layer for binary classification to determine which image blocks in the image block set are rich-detail image blocks and which are low-detail image blocks.
  • the specific classification criterion may be: the first convolution data set includes feature maps of multiple image blocks, and the feature maps are used to indicate image feature information of the image blocks.
  • when the feature map of a certain image block shows that the image block has no outline (for example, when it is a blue-sky background), the softmax layer outputs "0" for that image block, indicating that it is a low-detail image block.
  • in this way, image feature information can be extracted for different types of image blocks, and each image block can be classified based on its image feature information (feature map) to determine which image blocks are rich-detail image blocks and which are low-detail image blocks, which improves the flexibility of the implementation of this solution.
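As a sketch of what such a first network model could look like (the layer count and channel widths are illustrative assumptions, not the application's actual architecture), a few convolutional layers followed by a softmax layer that labels each 32x32 block as rich-detail or low-detail:

```python
import torch
import torch.nn as nn

class DetailClassifier(nn.Module):
    """Hypothetical two-class network: conv layers extract feature maps,
    a softmax layer outputs P(low-detail), P(rich-detail) per block."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, 2)

    def forward(self, x):                        # x: (B, 3, 32, 32)
        f = self.features(x).flatten(1)          # (B, 32)
        return torch.softmax(self.classifier(f), dim=1)  # (B, 2)
```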
  • performing convolution processing on the low-resolution image blocks to generate the first convolution data set may include: after the low-resolution image is segmented to generate the image block set (the image block set includes the low-resolution image blocks), performing convolution processing on the low-resolution image blocks in the image block set through the first network model.
  • the first network model outputs the convolutional layer processing result.
  • the result is called the image feature data set corresponding to the low-resolution image block.
  • the image feature data set corresponding to the low-resolution image block includes the features of the low-resolution image block.
  • the first network model includes multiple convolutional layers (more than 2 layers), and the convolutional layers can extract the feature map of an image, for example, the edge information of the image, the contour information of the image, the brightness information of the image, and/or the color information of the image; convolution processing is then performed on the image feature data set corresponding to the low-resolution image blocks through the first network model to generate the first convolution data set.
  • when the first network model performs convolution processing on the low-resolution image blocks, since the convolutional layers of the first network model can extract the feature map of an image, the model can also output the feature maps of the low-resolution image blocks for subsequent super-resolution processing to use, improving the clarity of the super-resolution image.
  • determining the similar image block according to the rich-detail image block may include: according to the rich-detail image block, determining, within the image feature data set corresponding to the low-resolution image blocks, the image feature data set corresponding to the rich-detail image block. After determining which of the low-resolution image blocks are rich-detail image blocks, the terminal device determines, according to these rich-detail image blocks, which feature maps in the image feature data set output by the first network model correspond to the rich-detail image blocks.
  • these feature maps are collectively referred to as the image feature data set corresponding to the rich-detail image blocks; this image feature data set is binarized, and the similarity of any two image blocks among the rich-detail image blocks is obtained by calculation.
  • after determining the image feature data set corresponding to the rich-detail image blocks (that is, the feature maps of the rich-detail image blocks), in order to facilitate subsequent calculation of the similarity of any two rich-detail image blocks, the feature maps in that image feature data set must first be binarized.
  • Binarization processing specifically refers to the process of setting the feature value of each pixel on the image to 0 or 1, that is, the process of presenting the entire image with a clear black and white effect.
  • for the binarization processing, tools such as "OpenCV" or "matlab" can usually be used.
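A minimal example of this binarization step with OpenCV (the file name is hypothetical, and the use of Otsu's method to pick the threshold is an assumption; the application only says the pixel values are set to 0 or 1):

```python
import cv2

# Load a single-channel feature map and binarize it to 0/1 values.
feature_map = cv2.imread("feature_map.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(feature_map, 0, 1,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
```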
  • after the binarized data of the feature maps in the image feature data set corresponding to the rich-detail image blocks is obtained, the similarity of any two rich-detail image blocks is obtained by calculation.
  • when the similarity of any two image blocks is greater than the first threshold, the similar image block is determined according to the similarity.
  • the similarity of any two image blocks satisfies: F = (1/N²) · Σ_{i=1}^{N} Σ_{j=1}^{N} XNOR(P(i,j), Q(i,j))
  • the F is the similarity;
  • the N is the image size of the rich-detail image block;
  • the P(i,j) and Q(i,j) are the binarized feature maps of the two rich-detail image blocks, respectively;
  • the i is the abscissa value of a feature map pixel of the image block;
  • the j is the ordinate value of a feature map pixel of the image block.
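A Python sketch of this calculation; the XNOR (agreement-counting) form follows the exclusive-or matching algorithm mentioned later in the text, and the normalization by N² is an assumption:

```python
import numpy as np

def block_similarity(p: np.ndarray, q: np.ndarray) -> float:
    """F = (1/N^2) * sum over (i, j) of XNOR(P(i,j), Q(i,j)):
    the fraction of positions where the two N x N binarized maps agree."""
    n = p.shape[0]
    agree = np.logical_not(np.logical_xor(p.astype(bool), q.astype(bool)))
    return float(agree.sum()) / (n * n)
```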
  • optionally, determining the similar image block according to the rich-detail image block may include: when the low-resolution image is a frame (the first image frame) in a video, determining the position of the low-resolution image in the video, that is, the position information of the first image frame in the video.
  • in this scenario the low-resolution images come from a video file; first, the low-resolution image corresponding to the rich-detail image block is determined.
  • then the position of the low-resolution image in the video is determined, that is, the position information in the video file of the first image frame corresponding to the low-resolution image. For example, it is determined that the position information in the video file of the first image frame corresponding to a rich-detail image block is the 10th frame; due to the coherence of a video file, the similarity of adjacent frames is high, so after the position of the low-resolution image corresponding to a rich-detail image block in the video file is determined, a second image frame is determined according to that position, where the second image frame is an adjacent frame of the first image frame.
  • in one optional implementation manner, any image block is selected from the image obtained by decoding the second image frame, any rich-detail image block is selected, and the above two image blocks are used in the similarity calculation to determine which image blocks in the image corresponding to that frame are similar image blocks.
  • in another optional implementation manner, the image block in the second image frame corresponding to the rich-detail image block is determined to be the similar image block; that is, the image block at the same coordinate position of the decoded image is taken as the similar image block.
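The second implementation (taking the co-located block of the adjacent frame) could look like the following sketch, where the function name, the choice of the following frame as the adjacent frame, and the block size are all illustrative assumptions:

```python
import cv2

def colocated_block(video_path: str, frame_idx: int, x: int, y: int, block: int = 32):
    """Return the block at the same (x, y) position in the frame adjacent
    to frame_idx (here: the following frame) of the video."""
    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx + 1)  # adjacent (second) frame
    ok, frame = cap.read()
    cap.release()
    return frame[y:y + block, x:x + block] if ok else None
```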
  • performing super-resolution processing on the similar image block and the low-resolution image to generate the super-resolution image may include: performing image exchange processing on the rich-detail image block and the similar image block to generate a first exchange image.
  • the generated first exchange image has the characteristics of both the rich-detail image block and the similar image block.
  • one or more of the following methods can be used to perform the image exchange processing: "concat mode", "concat+add mode" or "image swap".
  • according to the similar image blocks, the feature maps of the similar image blocks are determined, and image exchange processing is performed on the feature maps of the similar image blocks and the feature maps of the low-resolution image blocks to generate a second exchange image.
  • the generated second exchange image has both the feature maps of the similar image blocks and the feature maps of the low-resolution image blocks; super-resolution processing is then performed on the first exchange image, the second exchange image and the low-resolution image to generate the first image, with the similar image blocks and the feature maps of the similar image blocks serving as reference images for the super-resolution processing. If no secondary super-resolution processing is performed, the generated first image is the super-resolution image. Since the first exchange image and the second exchange image carry rich image feature information, this processing method can further improve the clarity of the super-resolution image.
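Since the application names the exchange modes without defining them, the following sketch shows one plausible reading of "concat" and "concat+add" on feature tensors; the exact operations are assumptions, not the application's definitions:

```python
import torch

def exchange(lr_feat: torch.Tensor, ref_feat: torch.Tensor,
             mode: str = "concat") -> torch.Tensor:
    """Combine low-resolution features with similar-block (reference) features.
    'concat' stacks along channels; 'concat+add' is read here as concatenating
    the element-wise sum as a second branch (an interpretation)."""
    if mode == "concat":
        return torch.cat([lr_feat, ref_feat], dim=1)          # (B, 2C, H, W)
    if mode == "concat+add":
        return torch.cat([lr_feat, lr_feat + ref_feat], dim=1)
    raise ValueError(f"unknown mode: {mode}")
```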
  • generating the super-resolution image based on the first image may include: when performing secondary super-resolution processing, obtaining a high-definition image, where the resolution of the high-definition image is greater than the resolution of the low-resolution image. For example, the resolution of the low-resolution image is 256*128, and the resolution of the high-definition image is 1280*960.
  • the high-definition image comes from a preset high-definition gallery.
  • optionally, the high-definition image in the preset high-definition gallery is determined through the first network model, and the similarity between the high-definition image and the first image is greater than the first threshold;
  • alternatively, the high-definition image comes from a remote device, for example, a cloud computing device (cloud computing device system).
  • the cloud computing device uses the third super-resolution network model deployed in the cloud computing device system to generate the high-definition image; in addition, in order to further improve the clarity of the super-resolution image, the low-detail image blocks obtained in the foregoing steps are subjected to magnification processing to generate a magnified image.
  • the magnification processing includes: bicubic interpolation or linear interpolation.
  • the magnified image is used as a reference image.
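A one-line example of this magnification with OpenCV (the 4x factor and the input file name are illustrative assumptions):

```python
import cv2

block = cv2.imread("low_detail_block.png")   # hypothetical low-detail block
h, w = block.shape[:2]
# Bicubic interpolation; cv2.INTER_LINEAR would be the linear alternative.
magnified = cv2.resize(block, (w * 4, h * 4), interpolation=cv2.INTER_CUBIC)
```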
  • finally, super-resolution processing is performed on the high-definition image, the enlarged image, and the first image through the second super-resolution network model to generate the super-resolution image.
  • that is, the second super-resolution network model performs super-resolution processing on the image blocks in the high-definition image that are similar to the first image, the image blocks in the enlarged image that are similar to the first image, and the first image, to generate the super-resolution image.
  • similar image blocks are found through the exclusive-or matching algorithm, and similar image blocks can complement each other's information, thereby improving the clarity of the super-resolution image.
  • the sources of similar image blocks are diverse, which further enhances the clarity of super-resolution images.
  • the method may further include: the super-resolution image processing device acquires various images, and these images form an image set.
  • the image collection may be gathered as follows: pictures with different texture characteristics, such as animals, sky, human faces or buildings, are collected from the Internet and from published data sets, and different types of pictures are mixed in equal proportions to obtain a training set.
  • sources include data sets such as "DIV2K" or "Timofte 91 images", or images obtained through search engines; a low-pass filter is used to filter the image collection, that is, image files in the collection that have little detail and are smoother are deleted, generating the first sub-training set. The amount of image feature information included in the images of the first sub-training set is greater than the amount included in the low-detail image blocks. Data augmentation processing is then performed on the first sub-training set to generate the second sub-training set.
  • the data augmentation processing includes: image inversion, image rotation, image reduction and image stretching.
  • data augmentation processing may also include: cropping, translation, affine transformation, perspective transformation, Gaussian noise, uneven lighting, motion blur, random color filling, etc. The first training set is then generated according to the first sub-training set and the second sub-training set, and the first training set is used to train the first network model. By generating the first training set and training the first network model with it, the accuracy of the first network model can be effectively improved.
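A minimal sketch of the four named augmentations (the angle and scale factors are illustrative choices, not values from the application):

```python
import cv2

def augment(image):
    """Apply image inversion, rotation, reduction and stretching."""
    h, w = image.shape[:2]
    return [
        cv2.flip(image, 1),                           # inversion (horizontal flip)
        cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE),   # rotation
        cv2.resize(image, (w // 2, h // 2)),          # reduction
        cv2.resize(image, (int(w * 1.5), h)),         # stretching
    ]
```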
  • an embodiment of the present application proposes a super-resolution image processing device, which can be deployed in a variety of devices such as cloud computing equipment, edge computing equipment systems, or terminal equipment.
  • the super-resolution image processing device includes a generating module and a determining module:
  • the generation module is used to generate a rich-detail image block and a low-detail image block according to a low-resolution image, wherein the sizes of the rich-detail image block and the low-detail image block are smaller than the low-resolution image, and the amount of image feature information included in the rich-detail image block is greater than the amount of image feature information included in the low-detail image block;
  • a determining module configured to determine a similar image block according to the detailed image block, wherein the similarity between the image feature information included in the similar image block and the image feature information included in the detailed image block is greater than a first threshold;
  • the generating module is also used to perform super-resolution processing on the similar image block and the low-resolution image to generate a super-resolution image, wherein the similar image block is used as a reference image of the low-resolution image.
  • the generating module is specifically configured to generate a set of image blocks according to the low-resolution image
  • the determining module is specifically configured to perform binary classification processing on the first convolution data set to determine the rich-detail image blocks and the low-detail image blocks.
  • the generating module is specifically configured to determine the image feature data set corresponding to the detailed rich image block according to the detailed rich image block;
  • the generating module is specifically configured to perform binarization processing on the determined image feature data set to obtain the similarity of any two image blocks in the detailed image block;
  • the determining module is specifically configured to determine the similar image block when the similarity between any two image blocks is greater than the first threshold.
  • optionally, the similarity of any two image blocks satisfies: F = (1/N²) · Σ_{i=1}^{N} Σ_{j=1}^{N} XNOR(P(i,j), Q(i,j))
  • the F is the similarity;
  • the N is the image size of the rich-detail image block;
  • the P(i,j) and Q(i,j) are the binarized feature maps of the two rich-detail image blocks, respectively;
  • the i is the abscissa value of a feature map pixel of the image block;
  • the j is the ordinate value of a feature map pixel of the image block.
  • the determining module is specifically configured to determine the position of the low-resolution image in the video when the low-resolution image is a frame in the video;
  • the determining module is specifically configured to determine a second image frame according to the position, where the second image frame is an adjacent frame of the low-resolution image in the video;
  • the determining module is specifically configured to determine that the image block corresponding to the detailed image block in the second image frame is the similar image block.
  • the generating module is specifically configured to perform image exchange processing on the detailed image block and the similar image block to generate a first exchange image
  • the generating module is specifically configured to determine the feature map of the similar image block according to the similar image block, and perform image exchange processing on the feature map of the similar image block and the feature map of the low-resolution image block to generate a second exchange An image, wherein the feature map is used to indicate image feature information of the image block;
  • the generating module is specifically configured to perform super-resolution processing on the first exchange image, the second exchange image, and the low-resolution image to generate the first image;
  • the generating module is specifically configured to generate the super-resolution image according to the first image.
  • the super-resolution image processing apparatus further includes an acquisition module
  • the acquisition module is used to acquire a high-definition image, the resolution of the high-definition image is greater than the resolution of the low-resolution image;
  • the generating module is specifically configured to perform magnification processing on the image block with insufficient details to generate a magnified image, wherein the magnification processing includes bicubic interpolation processing;
  • the generating module is specifically configured to perform super-resolution processing on the high-definition image, the enlarged image, and the first image to generate the super-resolution image, where the high-definition image and the enlarged image are used as reference images of the first image .
  • the high-definition image comes from a remote device, and the high-definition image is an image generated by the remote device by performing super-resolution processing on the low-resolution image.
  • the high-definition image comes from a preset high-definition image library, and the preset high-definition image library includes at least one high-definition image.
  • the determining module is further configured to determine the high-definition image in the preset high-definition gallery according to the first image, where the similarity between the high-definition image and the first image is greater than the first threshold.
  • the acquisition module is also used to acquire an image collection
  • the generating module is further configured to use a low-pass filter to perform filtering processing on the image set to generate a first sub-training set.
  • the amount of image feature information included in the images in the first sub-training set is greater than the amount included in the low-detail image blocks;
  • the generating module is also used to perform data augmentation processing on the first sub-training set to generate a second sub-training set.
  • the data augmentation processing includes image inversion, image rotation, image reduction and image stretching;
  • the generating module is further configured to generate a first training set according to the first sub-training set and the second sub-training set, where the first training set is used to train the first network model, and the first network model is used to generate the rich-detail image blocks and the low-detail image blocks.
  • the image feature information includes edge information of the image, contour information of the image, brightness information of the image, and/or color information of the image.
  • the embodiments of the present application provide a super-resolution image processing device.
  • the super-resolution image processing device includes at least one processor and a memory.
  • the memory stores computer instructions that can run on the processor.
  • the processor executes the method described in the foregoing first aspect or any one of the possible implementation manners of the first aspect.
  • the embodiments of the present application provide a terminal device.
  • the terminal device includes at least one processor, a memory, a communication port, a display, and a computer executable instruction stored in the memory and running on the processor.
  • the processor executes the method described in the foregoing first aspect or any one of the possible implementation manners of the first aspect.
  • the embodiments of the present application provide a computer-readable storage medium storing one or more computer-executable instructions.
  • when the computer-executable instructions run on a processor, the processor executes the method described in the foregoing first aspect or any one of the possible implementation manners of the first aspect.
  • embodiments of the present application provide a computer program product (or computer program) that stores one or more computer-executable instructions.
  • when the computer program product runs on a processor, the processor executes the method described in the foregoing first aspect or any one of the possible implementation manners of the first aspect.
  • the present application provides a chip system including a processor for supporting terminal devices to implement the functions involved in the above aspects.
  • the chip system further includes a memory for storing necessary program instructions and data for the terminal device.
  • the chip system can be composed of chips, or include chips and other discrete devices.
  • the technical effects brought by the second to seventh aspects or any one of the possible implementation manners may refer to the technical effects brought about by the first aspect or the different possible implementation manners of the first aspect, and details are not described herein again.
  • the embodiments of the present application provide a super-resolution image processing method and related devices.
  • the device running the super-resolution image processing method generates super-resolution images based on low-resolution images.
  • in this process, the low-resolution image is split into smaller-size rich-detail image blocks, and the rich-detail image block is used as a reference image of the low-resolution image, thereby reducing the amount of calculation of the device running the super-resolution image processing method.
  • similar image blocks are determined through the rich-detail image blocks, so that when the device performs super-resolution processing on low-resolution images through the first super-resolution network model, the similar image blocks can be introduced into the super-resolution processing. Since the rich-detail image block includes more image feature information than the low-detail image block, and the image feature information included in the similar image block is similar to that of the rich-detail image block (similarity greater than the first threshold), the similar image block can be considered to include more image feature information, which can effectively improve the clarity of the super-resolution image.
  • FIG. 1a is a schematic diagram of an application scenario proposed by an embodiment of this application.
  • FIG. 1b is a schematic diagram of a system architecture provided by an embodiment of this application.
  • FIG. 1c is a schematic diagram of a system architecture provided by an embodiment of the application.
  • FIG. 2 is a schematic diagram of a system architecture 200 provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of a structure of a convolutional neural network provided by an embodiment of the application.
  • Fig. 4a is a schematic diagram of an embodiment of a super-resolution image processing method in an embodiment of the application
  • FIG. 4b is a schematic flowchart of a super-resolution image processing method provided by an embodiment of the application.
  • FIG. 4c is a schematic flowchart of a super-resolution image processing method provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a process for determining similar image blocks according to an embodiment of the application.
  • FIG. 6 is a schematic diagram of a process for determining similar image blocks according to an embodiment of the application.
  • FIG. 7 is a schematic flowchart of a super-resolution processing proposed by an embodiment of this application.
  • FIG. 8a is a schematic flowchart of a super-resolution processing according to an embodiment of the application.
  • Figure 8b is a schematic diagram of a simulation experiment in an embodiment of the application.
  • Figure 8c is a schematic diagram of the calculation result of an interpolation algorithm;
  • Figure 8d is a schematic diagram of the calculation result of the super-resolution image processing method proposed in an embodiment of the application;
  • Figure 8e is a schematic diagram of a simulation experiment in an embodiment of the application.
  • FIG. 9 is a schematic diagram of a process for generating a training set in an embodiment of the application.
  • FIG. 10 is a schematic diagram of an embodiment of a super-resolution image processing apparatus 1000 in an embodiment of the application.
  • FIG. 11 is a schematic structural diagram of a computing device provided by an embodiment of this application.
  • FIG. 12 is a schematic diagram of a structure of a chip provided by an embodiment of the application.
  • the embodiments of the present application provide a super-resolution image processing method and related devices.
  • the device running the super-resolution image processing method generates super-resolution images based on low-resolution images.
  • in this process, the low-resolution image is divided into smaller-sized rich-detail image blocks, and the rich-detail image block is used as a reference image of the low-resolution image, so as to reduce the amount of calculation of the device.
  • similar image blocks are determined through the detailed image blocks, so that when the device performs super-resolution processing on low-resolution images, the similar image blocks can be introduced to perform super-resolution processing together to improve image clarity after super-resolution processing.
  • the super-resolution image processing method proposed in this application can be deployed on different devices, for example: (1) deployed on the mobile terminal (terminal device); (2) deployed on the cloud (server, cloud computing device, or cloud computing device system); (3) partly deployed on the mobile terminal (terminal device) and partly deployed on the cloud (server, cloud computing device, or cloud computing device system), with the mobile terminal used in conjunction with the cloud.
  • FIG. 1a is a schematic diagram of an application scenario proposed by an embodiment of this application.
  • the media file can be a video file, such as an audio video interleaved (AVI) video file; it can also be a picture file, such as a joint photographic experts group (JPEG) picture file, which is not limited here.
  • the terminal device plays local media files, and the super-resolution image processing method proposed in this application is deployed in the terminal device:
  • the terminal device plays local media files, such as the "album” application program plays local media files.
  • the terminal device obtains the media file from the local storage. After obtaining the media file, the terminal device performs subsequent super-resolution image processing on the media file.
  • the terminal device plays cloud media files, and the super-resolution image processing method proposed in this application is deployed in the terminal device:
  • the terminal device plays cloud media files, for example, the "Youku” application plays cloud media files.
  • the terminal device obtains the media file from the server that provides the cloud media file playback service, and performs subsequent super-resolution image processing on the media file.
  • the terminal device plays local media files, and the super-resolution image processing method proposed in this application is deployed in the server:
  • when the terminal device plays a local media file, the terminal device obtains the media file from the local storage and sends the media file to the server on which the super-resolution image processing method is deployed, and the server performs super-resolution image processing on the media file.
  • the server sends the super-resolution image processing result to the terminal device, and the terminal device plays the processed local media file based on the processing result.
  • the terminal device plays cloud media files, and the super-resolution image processing method proposed in this application is deployed in the server:
  • when the terminal device plays a cloud media file, the terminal device (or the server that provides the cloud media file playback service) notifies the server on which the super-resolution image processing method is deployed, and that server obtains the media file through the address of the cloud media file. After performing super-resolution image processing on the media file, it sends the processing result to the terminal device (or the server that provides the cloud media file playback service).
  • when the processing result is sent to the terminal device, the terminal device plays the processed cloud media file based on the processing result.
  • when the processing result is sent to the server that provides the cloud media file playback service, that server forwards the processing result to the terminal device, and the terminal device plays the processed cloud media file.
  • the terminal device plays local media files.
  • the super-resolution image processing method proposed in this application is partially deployed on the terminal device and partially deployed on the server.
  • when the terminal device plays a local media file, the terminal device obtains the media file from the local storage; after the terminal device obtains the media file, the terminal device and the server cooperate to perform subsequent super-resolution image processing on the media file.
  • the terminal device plays cloud media files.
  • the super-resolution image processing method proposed in this application is partially deployed on the terminal device and partially deployed on the server.
  • the terminal device plays the cloud media file.
  • the terminal device obtains the media file from the server that provides the cloud media file playback service. After the terminal device obtains the media file, the terminal device and the server cooperate to perform subsequent super-resolution image processing on the media file.
  • in step S2, the terminal device and/or server on which the super-resolution image processing method is deployed obtains the media file, and then processes the media file.
  • the processing methods differ for different media files, as described below:
  • when the media file is a video file, the terminal device and/or server on which the super-resolution image processing method is deployed extracts an image frame file from the video file; the image frame file may be any one of the encoded frame files in the video file, for example, a key frame (I frame), or another frame such as a P frame or a B frame. According to the image frame file, a low-resolution image corresponding to the image frame file is obtained.
  • the terminal device and/or server deployed with the super-resolution image processing method obtains the low-resolution image corresponding to the picture file according to the picture file.
  • in step S3, the terminal device and/or server on which the super-resolution image processing method is deployed performs super-resolution image processing on the low-resolution image and outputs the super-resolution image.
  • the specific processing flow will be described in detail in the subsequent embodiments.
  • the super-resolution image processing method proposed in the embodiment of the present application can be applied to a variety of application environments, and can provide super-resolution image processing services in a variety of application environments. It has the characteristics of wide application range and high practicability.
  • the super-resolution image processing method provided in the embodiments of the present application may be executed by a super-resolution image processing apparatus.
  • the embodiment of the present application does not limit the location where the super-resolution image processing device is deployed.
  • Fig. 1b is a schematic diagram of a system architecture provided by an embodiment of the application.
  • the super-resolution image processing apparatus may run on a cloud computing device system (including at least one cloud computing device, such as a server, etc. ), it can also run on an edge computing device system (including at least one edge computing device, such as a server, a desktop computer, etc.), or on various terminal devices, such as a mobile phone, a notebook computer, a personal desktop computer, etc.
  • each part of the device can run in three environments of cloud computing equipment system, edge computing equipment system or terminal equipment, or can run in any two of these three environments.
  • the cloud computing equipment system, the edge computing equipment system and the terminal equipment are connected by a communication path, and can communicate and transmit data with each other.
  • the training method of the classification model provided in the embodiment of the present application is executed by the combined parts of the super-resolution image processing device running in three environments (or any two of the three environments).
  • FIG. 2 is a schematic diagram of a system architecture 200 according to an embodiment of the present application. Each part of the super-resolution image processing apparatus is deployed on different devices in the system architecture 200, so that the devices in the system architecture 200 work together to realize the functions of the super-resolution image processing apparatus.
  • the system architecture 200 includes a server 220, a database 230, a first communication device 240, a data storage system 250, and a second communication device 260.
  • the database 230, the server 220, and the data storage system 250 belong to cloud computing devices.
  • the first communication device 240 and the second communication device 260 are terminal devices.
  • the first communication device 240 is used to obtain a low-resolution image and send it to the server 220, and the server 220 uses the third super-resolution network model deployed in the server 220 to generate a high-definition image according to the low-resolution image.
  • in order to save network bandwidth resources and computing resources, the server 220 can generate a high-definition image from a low-resolution image received from the first communication device 240 once every interval T, where T is a positive integer; alternatively, among the low-resolution images (as a collection) from the first communication device 240, one low-resolution image can be selected every Y images, where Y is a positive integer, to generate a high-definition image, which is not limited here.
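The every-Y-images variant amounts to simple subsampling of the uploaded stream; a trivial sketch (Y = 30 is an illustrative value):

```python
def select_for_cloud_sr(low_res_images: list, Y: int = 30) -> list:
    """Pick one low-resolution image out of every Y for cloud-side
    super-resolution, to save bandwidth and computing resources."""
    return low_res_images[::Y]
```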
  • the database 230 stores a first training set (the first training set includes a first sub-training set and a second sub-training set), and the first training set is used for the server 220 to perform iterative training on the first network model.
  • the server 220 may periodically deliver the trained first network model to the first communication device 240, so that the first communication device 240 updates its local first network model.
  • the first training set may be uploaded to the server 220 by the user through the first communication device 240, or may be obtained by the server 220 from a data set such as a search engine or “DIV2K” through a data collection device.
  • the server 220 generates a high-definition image based on the low-resolution image uploaded by the first communication device 240, and then sends the high-definition image to the first communication device 240.
  • the first communication device 240 uses the locally deployed first super-resolution network model to perform super-resolution processing on the low-resolution image to generate the first image.
  • the first communication device 240 uses the locally deployed second super-resolution network model to perform super-resolution processing on the first image and the high-definition image to generate a super-resolution image.
  • the server 220 may also train one or more super-resolution network models among the first super-resolution network model, the second super-resolution network model, and the third super-resolution network model.
  • the server 220 may send the trained first super-resolution network model and second super-resolution network model to the first communication device 240, so that the first communication device 240 updates its local first super-resolution network model and second super-resolution network model.
  • before the server 220 sends the aforementioned super-resolution network models to the first communication device 240, the server 220 can also use the "HiAI Convert" or "ShaderNN Converter" software to process the models, so that the first communication device 240 can successfully run them.
  • it should be noted that the first super-resolution network model and the second super-resolution network model can be two components of the same super-resolution network model, or can be different super-resolution network models, which is not limited here.
  • the server 220 can use the trained third super-resolution network model to update the local third super-resolution network model of the server 220.
  • the model parameters of the third super-resolution network model are greater in number than those of the first super-resolution network model (and the second super-resolution network model); that is, the third super-resolution network model is larger than the first super-resolution network model (and the second super-resolution network model).
  • the high-definition image may also come from a preset high-definition image library, and the preset high-definition image library is stored in the data storage system 250.
  • the preset high-definition gallery can also be stored in the first communication device 240.
  • the preset high-definition gallery may be acquired by the server 220 from a data collection such as a search engine or "DIV2K" through a data acquisition device, or may be acquired by the first communication device 240, which is not limited here.
  • the first network model, the first super-resolution network model, and the second super-resolution network model that have been trained by the server 220 are sent to the second communication device 260.
  • the second communication device 260 runs the above-mentioned model, so that the second communication device 260 serves as a part of the super-resolution image processing apparatus and executes the super-resolution image processing method proposed in this application.
  • the first communication device 240 and the second communication device 260 include, but are not limited to, personal computers, computer workstations, smart phones, tablets, smart cameras, smart cars or other types of cellular phones, media consumption devices, wearable devices, set-top boxes, Game consoles, etc.
  • Both the first communication device 240 and the server 220 and the second communication device 260 and the server 220 may be connected via a wireless network.
  • the above-mentioned wireless network uses standard communication technologies and/or protocols.
  • the wireless network is usually the Internet, but it can also be any network, including but not limited to a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile network, a private network, or any combination of virtual private networks.
  • customized or dedicated data communication technology can also be used to replace or supplement the above-mentioned data communication technology.
  • although only one server 220, one first communication device 240, and one second communication device 260 are shown in FIG. 2, it should be understood that the example in FIG. 2 is only used to understand this solution.
  • the specific numbers of servers 220, first communication devices 240, and second communication devices 260 should be flexibly determined according to the actual situation.
  • the first super-resolution network model, the second super-resolution network model, the third super-resolution network model, and the first network model are all neural network models for processing image data.
  • neural network models commonly used to process image data are the convolutional neural network (CNN) and other neural networks based on convolutional neural networks, such as the recurrent neural network (RNN), the Super-Resolution Convolutional Neural Network (SRCNN), the Deeply-Recursive Convolutional Network (DRCN), or the Efficient Sub-Pixel Convolutional Neural Network (ESPCN).
  • FIG. 3 is a schematic diagram of a structure of a convolutional neural network provided by an embodiment of the application.
  • a convolutional neural network is a deep neural network with a convolutional structure and is a type of deep learning architecture; a deep learning architecture refers to the use of machine learning algorithms to perform multiple levels of learning at different levels of abstraction. As a deep learning architecture, the CNN is a feed-forward artificial neural network.
  • the convolutional neural network 100 may include an input layer 110, a convolutional layer/pooling layer 120, where the pooling layer is optional, and a neural network layer 130.
  • the convolutional layer/pooling layer 120 may include layers 121-126 as shown in the example.
  • in one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer.
  • in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer.
  • that is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolutional layer 121 can include many convolution operators.
  • the convolution operator is also called a kernel. Its role in image processing is equivalent to a filter that extracts specific information from the input image matrix.
  • the convolution operator can be a weight matrix, and this weight matrix is usually predefined. In the process of performing a convolution on the image, the weight matrix is usually processed one pixel after another (or two pixels after two pixels, etc., depending on the value of the stride) along the horizontal direction of the input image, so as to complete the work of extracting specific features from the image.
  • the size of the weight matrix should be related to the size of the image.
  • the depth dimension of the weight matrix and the depth dimension of the input image are the same.
  • the weight matrix extends across the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolution output with a single depth dimension, but in most cases a single weight matrix is not used; instead, multiple weight matrices with the same dimensions are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolutional image.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract edge information of the image, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to eliminate unwanted noise in the image.
  • weight matrices are not exhaustively listed here.
  • the dimensions of the multiple weight matrices are the same, so the feature maps extracted by these weight matrices also have the same dimensions; the multiple extracted feature maps of the same dimensions are then combined to form the output of the convolution operation.
  • weight values in these weight matrices need to be obtained through a lot of training in practical applications, and each weight matrix formed by the weight values obtained through training can extract information from the input image, thereby helping the convolutional neural network 100 to make correct predictions.
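The stacking of multiple same-dimension weight matrices into the depth of the output can be seen directly in any framework; a small PyTorch example (the sizes are illustrative):

```python
import torch
import torch.nn as nn

# Eight 3x3 weight matrices applied to an RGB input; each kernel spans the
# full input depth (3 channels), and the eight single-depth outputs are
# stacked to form the depth dimension of the convolved feature map.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 32, 32)
print(conv(x).shape)   # torch.Size([1, 8, 32, 32])
```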
  • the initial convolutional layer (such as 121) often extracts more general features, which can also be called low-level features; as the depth of the convolutional neural network 100 increases, the features extracted by the later convolutional layers (for example, 126) become more and more complex, such as features with high-level semantics, and features with higher semantics are more applicable to the problem to be solved.
  • it can be one convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • in image processing, the sole purpose of the pooling layer is to reduce the spatial size of the image.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image with a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of the maximum pooling.
  • the operators in the pooling layer should also be related to the image size.
  • the size of the image output after processing by the pooling layer can be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
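Both pooling operators are standard; a short example showing the size reduction (the 2x2 window is an illustrative choice):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)
avg = nn.AvgPool2d(2)(x)     # each output pixel = mean of a 2x2 input region
mx = nn.MaxPool2d(2)(x)      # each output pixel = max of a 2x2 input region
print(avg.shape, mx.shape)   # torch.Size([1, 8, 16, 16]) twice
```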
  • after processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is still not able to output the required output information, because, as mentioned above, the convolutional layer/pooling layer 120 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other related information), the convolutional neural network 100 needs to use the neural network layer 130 to generate the output of one or a group of required classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in FIG. 3) and an output layer 140; the parameters contained in the multiple hidden layers may be obtained by pre-training based on the relevant training data of a specific task type. For example, the task type can include image processing and skill selection after image processing.
  • the image processing part can include image recognition, image classification, image super-resolution processing, etc., and after the image is processed, skills can be selected according to the acquired image information; as an example, when applied to super-resolution image processing in this application, the neural network is specifically expressed as a convolutional neural network and the task is to perform super-resolution processing on the image:
  • first, the convolutional neural network needs to recognize the low-resolution image and obtain various image feature information in the image, such as contour information, image brightness information, and image texture information, which can then be used to determine similar image blocks that are similar to the low-resolution image.
  • the convolutional neural network then combines the similar image blocks to perform super-resolution processing on the low-resolution image to generate a super-resolution image; optionally, to further improve the clarity of the super-resolution image, the convolutional neural network introduces a high-definition image into the super-resolution processing.
  • After the multiple hidden layers in the neural network layer 130, the final layer of the entire convolutional neural network 100 is the output layer 140. The output layer 140 has a loss function similar to the classification cross-entropy, which is specifically used to calculate the prediction error.
  • It should be noted that the convolutional neural network 100 shown in FIG. 3 is only used as an example of a convolutional neural network; in practical applications, the convolutional neural network may also exist in the form of other network models, which are not introduced one by one here.
  • FIG. 4a is a schematic diagram of an embodiment of a super-resolution image processing method in an embodiment of the present application.
  • An embodiment of the super-resolution image processing method in the embodiment of the present application includes:
  • When the terminal device plays a media file (as described in the foregoing embodiment of FIG. 1a), the terminal device can obtain a low-resolution image corresponding to the media file.
  • When the media file is a video file, the acquired low-resolution image is any image frame in the video file, such as a key frame (I frame) or another frame, such as a P frame or a B frame.
  • When the media file is an image file, the obtained low-resolution image is the image file itself.
  • After acquiring a low-resolution image, the terminal device first divides the low-resolution image into low-resolution image blocks of smaller size, and these low-resolution image blocks form an image block set. Specifically, the low-resolution image is divided into low-resolution image blocks with a height of 32 pixels and a width of 32 pixels.
  • The specific size of the low-resolution image block is determined by actual requirements (that is, the requirements of the subsequent neural network model, such as the first network model), and is not limited here.
  • The terminal device divides the low-resolution image to generate the low-resolution image blocks.
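  • As an illustration only, the following Python sketch divides a grayscale image into 32×32 low-resolution image blocks; the helper name and the edge-padding choice for non-divisible image sizes are assumptions:

```python
import numpy as np

def split_into_blocks(image, block=32):
    """Split a grayscale image into block x block low-resolution image blocks."""
    h, w = image.shape
    # Pad the borders so both sides become multiples of `block`.
    padded = np.pad(image, ((0, (-h) % block), (0, (-w) % block)), mode="edge")
    return [padded[i:i + block, j:j + block]
            for i in range(0, padded.shape[0], block)
            for j in range(0, padded.shape[1], block)]

frame = np.random.rand(270, 480)            # a low-resolution frame
image_block_set = split_into_blocks(frame)  # the image block set
print(len(image_block_set), image_block_set[0].shape)  # 135 blocks of (32, 32)
```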
  • The terminal device performs convolution processing on the image block set corresponding to the low-resolution image through the first network model, and the first network model outputs the convolutional layer processing result, which is called the image feature data set corresponding to the low-resolution image blocks; the image feature data set includes the feature maps of the image blocks.
  • Optionally, the first network model includes multiple convolutional layers (more than two), and the convolutional layers can extract feature maps of the image, for example, edge information, contour information, brightness information, and/or color information of the image.
  • In an optional implementation, the convolution processing results output by the first two convolutional layers of the first network model are used as the feature maps of the low-resolution image blocks, also called the low-resolution image block feature maps.
  • The collection of these low-resolution image block feature maps is called the image feature data set corresponding to the low-resolution image blocks.
  • Since the first network model needs to classify the low-resolution image blocks, the image feature data set generated by performing convolution processing on the low-resolution image blocks is only the output of preliminary convolution processing; further convolution processing needs to be performed on this image feature data set through the first network model.
  • The convolution processing result output by this further convolution processing is used as the source data for the subsequent classification processing; this convolution processing result is called the first convolution data set.
  • Binary classification processing (softmax) is then performed on the first convolution data set through the first network model to determine the detail-rich image blocks and the less detailed image blocks.
  • The first network model may be a classification network (regression network), specifically composed of several convolutional layers and at least one softmax layer.
  • The first convolution data set is input to the softmax layer for binary classification to determine which image blocks in the image block set are detail-rich image blocks and which are less detailed image blocks.
  • The specific classification criterion can be: the first convolution data set includes the feature maps of multiple image blocks; when the feature map of a certain image block shows that the image block has no contour (for example, when it is a blue-sky background), the softmax layer outputs "0" for that image block to indicate that it is a less detailed image block.
  • The detail-rich image block has a wealth of image feature information; the image feature information includes, but is not limited to, contour information of the image, brightness information of the image, texture information of the image, and so on.
  • Steps 401-405 describe how to determine the image blocks with rich details and the image blocks with low details in the low-resolution image. In an optional implementation, the above process can be described by the following formula:
  • $\text{CLASSIFY} = \text{Softmax}(\text{Conv}(\text{resize}(\text{Crop}(\text{Input}_{(H,W,1)}))))$
  • where Input represents the input low-resolution image; H represents the height of the low-resolution image; W represents the width of the low-resolution image; 1 indicates that the input low-resolution image is a single-channel image (that is, a grayscale image); Crop means that the input low-resolution image is divided into low-resolution image blocks; resize means that the low-resolution image blocks are uniformly scaled to image blocks of a fixed size; Conv represents the convolution processing result of the convolution processing performed on the scaled low-resolution image blocks; Softmax represents the result of the binary classification processing performed on the convolution processing result (the first convolution data set); and CLASSIFY indicates whether a low-resolution image block is an image block with rich details or an image block with insufficient details.
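  • The following PyTorch sketch shows what such a classification network could look like; the `BlockClassifier` name, the layer sizes, and the exposure of the first two convolution outputs as reusable feature maps are illustrative assumptions, not the patented design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlockClassifier(nn.Module):
    """Several convolutional layers followed by a softmax binary classifier."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)   # early layers yield feature
        self.conv2 = nn.Conv2d(16, 16, 3, padding=1)  # maps (edges, contours, ...)
        self.conv3 = nn.Conv2d(16, 32, 3, stride=2, padding=1)
        self.fc = nn.Linear(32 * 16 * 16, 2)          # two classes

    def forward(self, x):                  # x: (batch, 1, 32, 32) grayscale blocks
        f1 = F.relu(self.conv1(x))
        f2 = F.relu(self.conv2(f1))        # f1, f2: reusable low-level feature maps
        y = F.relu(self.conv3(f2))
        y = self.fc(y.flatten(1))
        return F.softmax(y, dim=1), (f1, f2)

blocks = torch.rand(9, 1, 32, 32)          # Crop + resize already applied
probs, feature_maps = BlockClassifier()(blocks)
is_detail_rich = probs[:, 1] > 0.5         # "1" = detail-rich, "0" = not
```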
  • FIG. 4b is a schematic flowchart of a super-resolution image processing method provided by an embodiment of the application.
  • After obtaining the low-resolution image, the terminal device generates the low-resolution image blocks according to the low-resolution image.
  • the terminal device determines similar image blocks based on image blocks with rich details.
  • FIG. 5 is a schematic diagram of a process for determining similar image blocks according to an embodiment of the application.
  • In step D1, after the terminal device determines which of the low-resolution image blocks are detail-rich image blocks, the terminal device determines, according to these detail-rich image blocks, which feature maps in the image feature data set output by the first network model correspond to the detail-rich image blocks.
  • For example, the image feature data set corresponding to the low-resolution image blocks includes A, B, C, D, E, and F, which are the feature maps of six low-resolution image blocks, and A and B are detail-rich image blocks.
  • The terminal device finds the feature maps corresponding to the A and B image blocks in the image feature data set corresponding to the low-resolution image blocks, and these determined feature maps are collectively referred to as the image feature data set corresponding to the detail-rich image blocks. That is, the image feature data set corresponding to the detail-rich image blocks includes the feature map of image block A and the feature map of image block B.
  • After the terminal device determines the image feature data set corresponding to the detail-rich image blocks (that is, the feature maps of the detail-rich image blocks), in order to facilitate the subsequent calculation of the similarity of any two of the detail-rich image blocks, the feature maps in this image feature data set first need to be binarized. Binarization specifically refers to setting the feature value of each pixel of the image to 0 or 1, that is, presenting the entire image with a clear black-and-white effect. "OpenCV" or "matlab" can usually be used for binarization.
  • In an optional implementation, the similarity of any two image blocks satisfies:
  • $F = \frac{1}{N \times N} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( P(i,j) \oplus Q(i,j) \right)$
  • where F is the similarity; N is the image size of the detail-rich image block; P(i,j) and Q(i,j) are the binarized feature maps of the image blocks corresponding to any two detail-rich image blocks; i is the abscissa value of a feature map pixel of the image block; and j is the ordinate value of a feature map pixel of the image block.
  • The terminal device arbitrarily selects two feature maps in the image feature data set corresponding to the detail-rich image blocks and calculates the similarity of the two feature maps; the two selected feature maps are the first feature map P and the second feature map Q.
  • P(i,j) represents the binarized data of the first feature map at the coordinates (i,j) after the first feature map has been binarized, and Q(i,j) represents the binarized data of the second feature map at the coordinates (i,j) after the second feature map has been binarized.
  • The binarized data with the same coordinates on the two feature maps are selected for XOR calculation; the XOR calculation results of all coordinates on the feature map are then summed up and divided by the total number of pixel coordinates on one feature map ("N*N"). The final calculation result is the similarity of the image blocks corresponding to the two feature maps.
  • Through the above method, the similarity of any two image blocks in the image feature data set corresponding to the detail-rich image blocks can be calculated.
  • When the similarity of two image blocks is greater than the first threshold, the two image blocks can be determined to be similar image blocks.
  • The first threshold is determined according to actual needs and is not limited here. In an optional solution, the first threshold is 0.7.
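  • A minimal Python sketch of the binarization and XOR matching just described; the mean-based binarization threshold is an assumption (the text only requires each feature value to become 0 or 1):

```python
import numpy as np

def binarize(feature_map):
    """Set every feature value to 0 or 1 (clear black-and-white effect)."""
    return (feature_map > feature_map.mean()).astype(np.uint8)

def similarity(P, Q):
    """Sum of per-pixel XOR results divided by N*N, as in the formula above."""
    N = P.shape[0]
    return np.bitwise_xor(P, Q).sum() / (N * N)

P = binarize(np.random.rand(32, 32))   # first feature map, binarized
Q = binarize(np.random.rand(32, 32))   # second feature map, binarized
F = similarity(P, Q)                   # compared against the first threshold (0.7)
print(F)
```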
  • FIG. 4c is a schematic flowchart of a super-resolution image processing method provided by an embodiment of the application.
  • After obtaining the low-resolution image, the terminal device generates the low-resolution image blocks according to the low-resolution image.
  • The first network model is used to process the low-resolution image blocks, and in the course of the convolution processing the model outputs the image feature data set corresponding to the detail-rich image blocks.
  • The feature maps in the image feature data set corresponding to the detail-rich image blocks are binarized, and the similarity is calculated to determine the similar image blocks.
  • FIG. 6 is a schematic diagram of a process for determining similar image blocks according to an embodiment of the application.
  • In this embodiment, the low-resolution image acquired by the terminal device comes from a video file.
  • The terminal device determines the position of the low-resolution image in the video, that is, the position information of the first image frame in the video. For example, for the low-resolution image corresponding to a certain detail-rich image block, the position of the low-resolution image (the first image frame) in the video file is the 10th frame.
  • In step F2, because a video file is coherent, the image similarity of adjacent frames is relatively high. Therefore, after the position of the low-resolution image corresponding to a certain detail-rich image block in the video file is determined, that is, after the position of the first image frame is determined, image frames within a certain range before and after the first image frame are searched, and one or more of them are determined as the second image frame. In an optional implementation manner, the second image frame is a key frame containing complete image information.
  • In step F3, after determining the second image frame, the terminal device decodes the second image frame to obtain the corresponding image file.
  • In one optional implementation, the terminal device divides the image file to generate multiple image blocks; any image block is selected from the image obtained by decoding the second image frame, any detail-rich image block is selected, and the two image blocks are used for calculation to determine which image block in the image file corresponding to the second image frame is a similar image block. The specific calculation method is similar to the process corresponding to FIG. 5 and will not be repeated here.
  • In another optional implementation, the image block at the position in the second image frame corresponding to the detail-rich image block is the similar image block; that is, according to the coordinates of the detail-rich image block in the low-resolution image, the image block at the same coordinate position of the image file is determined as the similar image block.
  • For example, the low-resolution image is divided into 3*3, a total of 9 image blocks, and the detail-rich image block is the image block in the first row and first column of the low-resolution image.
  • The terminal device divides the image file obtained by decoding the second image frame into 3*3 (the size of the low-resolution image is the same as the size of the image file), and the terminal device determines that the image block in the first row and first column of the image file is the similar image block.
  • Through the above methods, similar image blocks can be determined in a variety of ways, which improves the implementation flexibility of this solution.
  • Moreover, with less computation, similar image blocks can be determined, which reduces the power consumption of terminal devices that deploy the super-resolution image processing method proposed in this application.
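  • The position-based variant can be sketched in a few lines, assuming the decoded second image frame is available as a 2-D array and blocks are indexed by row and column:

```python
def positional_similar_block(decoded_frame, row, col, block=32):
    """Take the image block at the same coordinates in the adjacent frame
    as the similar image block (second implementation of step F3)."""
    return decoded_frame[row * block:(row + 1) * block,
                         col * block:(col + 1) * block]

# e.g. for a detail-rich block in the first row and first column:
# similar_block = positional_similar_block(second_frame, 0, 0)
```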
  • the terminal device performs super-resolution processing on similar image blocks and low-resolution images to generate super-resolution images.
  • the terminal device can generate different super-resolution images through multiple solutions.
  • The first processing method: no secondary super-resolution processing is performed; the terminal device only uses the first super-resolution network model to perform super-resolution processing.
  • FIG. 7 is a schematic flowchart of a super-resolution processing according to an embodiment of the application.
  • G1. Perform image exchange processing on the detail-rich image block and the similar image block to generate the first exchange image.
  • In step G1, the terminal device performs image exchange processing on the detail-rich image block and the similar image block to generate a first exchange image. The generated first exchange image has the characteristics of both the detail-rich image block and the similar image block.
  • Specifically, the image exchange processing can be performed in one or more of the following ways: "concat mode", "concat+add mode", or "image swap".
  • G2. According to the similar image block, determine the feature map of the similar image block in the image feature data set corresponding to the detail-rich image blocks, and perform image exchange processing on the feature map of the similar image block and the feature map of the low-resolution image block to generate the second exchange image.
  • In step G2, the terminal device determines the feature map of the similar image block in the image feature data set corresponding to the detail-rich image blocks according to the similar image block, and performs image exchange processing on the feature map of the similar image block and the feature map of the low-resolution image block to generate a second exchange image. The generated second exchange image has the characteristics of both the feature map of the similar image block and the feature map of the low-resolution image block.
  • Specifically, the image exchange processing can be performed in one or more of the following ways: "concat mode", "concat+add mode", or "feature map swap".
  • G3. Perform super-resolution processing on the first exchange image, the second exchange image, and the low-resolution image to generate the first image.
  • In step G3, the terminal device performs super-resolution processing on the first exchange image, the second exchange image, and the low-resolution image to generate the first image; the similar image block and the feature map of the similar image block are used as reference images for the super-resolution processing.
  • In the corresponding expression, Input(H,W,1) means inputting 4 similar "RGB" or "YUV" images, where the number of channels of each image is 1, and HR(H,W) means the super-resolution image.
  • G4. Generate a super-resolution image according to the first image.
  • In step G4, if the terminal device does not perform secondary super-resolution processing, the first image generated in step G3 is the super-resolution image; that is, the first image output by the first super-resolution network model is the super-resolution image.
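  • A minimal PyTorch sketch of the first processing method under stated assumptions: the `FirstSRNet` name, the layer sizes, the ×2 PixelShuffle upsampling, and the use of "concat mode" for the exchange inputs are all illustrative choices, not the patented architecture:

```python
import torch
import torch.nn as nn

class FirstSRNet(nn.Module):
    """The exchange images act as reference channels next to the low-res input."""

    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),   # 3 stacked inputs
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                      # rearrange to HR size
        )

    def forward(self, low_res, exchange1, exchange2):
        x = torch.cat([low_res, exchange1, exchange2], dim=1)  # "concat mode"
        return self.body(x)

lr = torch.rand(1, 1, 32, 32)
first_image = FirstSRNet()(lr, torch.rand_like(lr), torch.rand_like(lr))
print(first_image.shape)   # (1, 1, 64, 64): the first image
```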
  • The second processing method: perform secondary super-resolution processing. If the terminal device performs secondary super-resolution processing, please refer to FIG. 8a for the details of step G4.
  • FIG. 8a is a schematic flowchart of a super-resolution processing according to an embodiment of the application.
  • In step H1, the terminal device obtains a high-definition image, and the resolution of the high-definition image is greater than the resolution of the low-resolution image.
  • There are two possible sources of the high-definition image, which are described separately below.
  • (1) The high-definition image comes from a preset high-definition gallery. A large number of high-resolution images are stored in the preset high-definition gallery, so that terminal devices can use these images to improve the accuracy of super-resolution processing and improve the clarity of super-resolution images.
  • (2) The high-definition image comes from a cloud computing device, and the cloud computing device (the cloud computing device system) uses the third super-resolution network model deployed in the cloud computing device system to generate the high-definition image based on the low-resolution image sent by the terminal device.
  • Specifically, after the terminal device acquires a low-resolution image (step 401), the terminal device sends the low-resolution image to the cloud computing device system where the super-resolution image processing device is deployed.
  • The cloud computing device system executes the subsequent steps 402-406 and generates a high-definition image based on the low-resolution image (using the third super-resolution network model); the cloud computing device system then sends the generated high-definition image to the terminal device.
  • The terminal device uses the high-definition image together with the image generated by its own super-resolution image processing to perform further super-resolution processing and finally generate a super-resolution image; that is, the high-definition image serves as a reference image for the super-resolution processing to improve the clarity of the super-resolution image.
  • the third super-resolution network model has the characteristics of occupying large computing resources and good super-resolution processing effects (compared with the first super-resolution network model and the second super-resolution network model deployed on terminal devices).
  • After the terminal device obtains a high-definition image, it uses the high-definition image as a reference image. The terminal device needs to determine the similar image blocks of the high-definition image and the low-resolution image; the specific method for determining similar image blocks is similar to the content described in the foregoing embodiments and will not be repeated here.
  • In step H2, the terminal device may use the less detailed image blocks obtained in the foregoing steps and perform enlargement processing on these blocks to obtain an enlarged image.
  • Specifically, the enlargement processing includes bicubic interpolation (Bicubic) or "linearf".
  • After the terminal device obtains the enlarged image, it uses the enlarged image as a reference image. The terminal device needs to determine the similar image blocks of the enlarged image and the low-resolution image; the specific method for determining similar image blocks is similar to the content described in the foregoing embodiments and will not be repeated here.
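  • The bicubic option can be illustrated with OpenCV (the `enlarge` helper is hypothetical):

```python
import cv2

def enlarge(block, scale=2):
    """Bicubic enlargement of a less detailed image block."""
    h, w = block.shape[:2]
    return cv2.resize(block, (w * scale, h * scale),
                      interpolation=cv2.INTER_CUBIC)
```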
  • In step H3, the terminal device performs super-resolution processing on the high-definition image, the enlarged image, and the first image to generate a super-resolution image.
  • Specifically, the terminal device uses the second super-resolution network model to perform super-resolution processing on the image blocks in the high-definition image that are similar to the first image, the image blocks in the enlarged image that are similar to the first image, and the first image, to generate the super-resolution image. The specific process of generating the super-resolution image is similar to the content shown in the foregoing steps G1-G3 and will not be repeated here.
  • In the corresponding expression, HR_MOBILE(H,W) represents the images obtained by the terminal device (including the enlarged image and the first image), HR_CLOUD(H,W) represents the high-definition image generated by the cloud computing device system, and HR(H,W) represents the super-resolution image.
  • In the embodiments of this application, the super-resolution image processing device deployed in the terminal device uses the first network model to recognize the acquired low-resolution image and determines the image blocks with rich details and the image blocks with low details.
  • For the image blocks with rich details, the super-resolution network model is used to perform super-resolution processing; for the image blocks that are not rich in details, after enlargement processing, super-resolution processing is performed together with the detail-rich image blocks.
  • The XOR matching algorithm is further used to determine similar image blocks; on the premise of improving the matching accuracy of similar image blocks, the amount of calculation is reduced and the clarity of super-resolution images is improved.
  • Fig. 8b is a schematic diagram of a simulation experiment in an embodiment of the application.
  • Peak signal-to-noise ratio (PSNR) is the most common and widely used objective evaluation index for images; it is based on the error between corresponding pixels, that is, on error-sensitive image quality evaluation. In FIG. 8b, "Ours (plus similar blocks)" denotes the super-resolution image processing method proposed in this application. It can be seen that even when the number of parameters is greatly reduced (reducing the amount of calculation), a high PSNR is still maintained.
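  • For reference, PSNR can be computed as follows (the standard definition for 8-bit images, not anything specific to this patent):

```python
import numpy as np

def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio based on the error between corresponding pixels."""
    mse = np.mean((reference.astype(np.float64)
                   - processed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```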
  • FIG. 8c is a schematic diagram of the calculation result of the interpolation algorithm
  • FIG. 8d is a schematic diagram of the calculation result of the super-resolution image processing method proposed in an embodiment of the application.
  • FIG. 8c and FIG. 8d show the different calculation results generated when the interpolation algorithm (Bicubic) and the super-resolution image processing method proposed in the embodiments of the present application process the same low-resolution image.
  • FIG. 8e is a schematic diagram of a simulation experiment in an embodiment of the application.
  • FIG. 8e shows the amount of calculation saved by the super-resolution image processing method proposed in the embodiment of the present application in different scenarios compared with the interpolation algorithm (Bicubic). In different scenarios, the amount of calculation can be reduced by 20%-60%. It should be noted that this is only a possible simulation experiment result. Depending on the actual hardware, there may also be other simulation experiment results, which are not limited here.
  • the super-resolution image processing device may also generate a first training set to train the first network model.
  • FIG. 9 is a schematic diagram of a process of generating a training set in an embodiment of this application.
  • In step 901, the super-resolution image processing device acquires various images, and these images form an image set.
  • The image set is obtained by collecting pictures with different texture characteristics, such as animals, sky, human faces, or buildings, from the Internet and published data sets, and mixing the different types of pictures in equal proportions to obtain a training set.
  • Sources include data sets such as "DIV2K" or "Timofte 91images", images obtained through search engines, and so on; the image set may also include the image feature data sets corresponding to the detail-rich image blocks generated by the super-resolution image processing device in the course of processing images.
  • In step 902, a low-pass filter is used to perform filtering processing on the image set; that is, smoother image files with less detail in the image set are deleted to generate a first sub-training set.
  • The images in the first sub-training set carry a rich-detail label, and the rich-detail label is used to identify that the image has rich image feature information; the label can be marked manually.
  • The amount of image feature information included in the images in the first sub-training set is greater than the amount of image feature information included in a less detailed image block.
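  • One possible realization of the filtering step is sketched below: blur the image with a low-pass (Gaussian) filter and delete images whose high-frequency residual is weak. Treating residual strength as the "smoothness" criterion and the numeric threshold are assumptions:

```python
import cv2
import numpy as np

def is_detail_rich(image, energy_thresh=25.0):
    """Keep an image only if it carries enough high-frequency detail."""
    low_pass = cv2.GaussianBlur(image, (5, 5), 0)
    residual = image.astype(np.float64) - low_pass.astype(np.float64)
    return residual.std() > energy_thresh

# first_sub_training_set = [img for img in image_set if is_detail_rich(img)]
```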
  • In step 903, data augmentation processing is performed on the first sub-training set to generate a second sub-training set.
  • The data augmentation processing includes image inversion, image rotation, image reduction, and image stretching; it also includes cropping, translation, affine transformation, perspective transformation, Gaussian noise, uneven light, motion blur, and random color filling. One or more of these data augmentation operations are selected to process the image files in the first sub-training set; multiple data augmentation operations can be performed on the same image file, or the same data augmentation operation can be performed on multiple image files, which is not limited here.
  • In step 904, the super-resolution image processing device generates a first training set according to the first sub-training set and the second sub-training set.
  • the accuracy of the first network model can be effectively improved.
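  • The following sketch applies a few of the listed augmentation operations with OpenCV; the particular selection and parameters are illustrative assumptions:

```python
import cv2
import numpy as np

def augment(image):
    """Return several augmented variants of one image file."""
    h, w = image.shape[:2]
    return [
        cv2.flip(image, 1),                          # image inversion (flip)
        cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE),  # image rotation
        cv2.resize(image, (w // 2, h // 2)),         # image reduction
        cv2.resize(image, (w * 2, h)),               # image stretching
    ]

second_sub_training_set = []
for img in [np.random.randint(0, 256, (64, 64), np.uint8)]:  # first sub-training set
    second_sub_training_set.extend(augment(img))
```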
  • To implement the above functions, the above-mentioned super-resolution image processing apparatus includes hardware structures and/or software modules corresponding to each function.
  • The present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer-software-driven hardware depends on the specific application and the design constraints of the technical solution. Skilled professionals may use different methods for each specific application to implement the described functions, but such implementations should not be considered beyond the scope of this application.
  • The embodiments of the present application may divide the super-resolution image processing apparatus into functional modules according to the above method examples; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • The above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative and is only a logical function division; there may be other division methods in actual implementation.
  • As shown in FIG. 10, the super-resolution image processing device 1000 includes a generating module 1001, a determining module 1002, and an acquiring module 1003:
  • the generating module 1001 is configured to generate detail-rich image blocks and less detailed image blocks according to a low-resolution image, where the sizes of the detail-rich image blocks and the less detailed image blocks are smaller than the low-resolution image, and the amount of image feature information included in a detail-rich image block is greater than the amount of image feature information included in a less detailed image block;
  • the determining module 1002 is configured to determine a similar image block according to the detail-rich image block generated by the generating module 1001, where the similarity between the image feature information included in the similar image block and the image feature information included in the detail-rich image block is greater than a first threshold;
  • the generating module 1001 is further configured to perform super-resolution processing on the similar image block determined by the determining module 1002 and the low-resolution image to generate a super-resolution image, where the similar image block is used as a reference image for the low-resolution image.
  • the generating module 1001 is specifically configured to generate the image block set according to the low-resolution image, and the image block set includes at least one low-resolution image block;
  • the determining module 1002 is specifically configured to process the low-resolution image block generated by the generating module 1001, and determine the image block with rich details and the image block with low details;
  • the determining module 1002 is specifically configured to perform binary classification processing on the first convolution data set to determine the detail-rich image blocks and the less detailed image blocks.
  • the generating module 1001 is specifically configured to determine the image feature data set corresponding to the detail-rich image block according to the detail-rich image block;
  • the generating module 1001 is specifically configured to perform binarization processing on the determined image feature data set to obtain the similarity of any two image blocks in the detailed image block;
  • the determining module 1002 is specifically configured to determine the similar image block when the similarity between any two image blocks is greater than the first threshold.
  • In an optional implementation, the similarity of any two image blocks satisfies:
  • $F = \frac{1}{N \times N} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( P(i,j) \oplus Q(i,j) \right)$
  • where F is the similarity, N is the image size of the detail-rich image block, P(i,j) and Q(i,j) are respectively the feature maps of the image blocks corresponding to any two detail-rich image blocks, i is the abscissa value of a feature map pixel of the image block, and j is the ordinate value of a feature map pixel of the image block.
  • the determining module 1002 is specifically configured to determine the position of the low-resolution image in the video when the low-resolution image is a frame in the video;
  • the determining module 1002 is specifically configured to determine a second image frame according to the position, where the second image frame is an adjacent frame of the low-resolution image in the video;
  • the determining module 1002 is specifically configured to determine the image block corresponding to the detail-rich image block in the second image frame as the similar image block.
  • the generating module 1001 is specifically configured to perform image exchange processing on the detailed image block generated by the generating module 1001 and the similar image block determined by the determining module 1002 to generate a first exchange image;
  • the generating module 1001 is specifically configured to determine the feature map of the similar image block according to the similar image block determined by the determining module 1002, and perform image exchange processing on the feature map of the similar image block and the feature map of the low-resolution image block , Generate a second exchange image, where the feature map is used to indicate the image feature information of the image block;
  • the generating module 1001 is specifically configured to perform super-resolution processing on the first exchanged image, the second exchanged image, and the low-resolution image through the first super-resolution network model to generate a first image;
  • the generating module 1001 is specifically configured to generate the super-resolution image according to the first image generated by the generating module 1001.
  • the super-resolution image processing apparatus 1000 further includes an acquisition module 1003;
  • the acquiring module 1003 is configured to acquire a high-definition image, the resolution of the high-definition image is greater than the resolution of the low-resolution image;
  • the generating module 1001 is specifically configured to perform enlargement processing on the less detailed image blocks generated by the generating module 1001 to generate an enlarged image, where the enlargement processing includes bicubic interpolation processing;
  • the generating module 1001 is specifically configured to perform super-resolution processing on the high-definition image obtained by the obtaining module 1003, the enlarged image, and the first image to generate the super-resolution image, where the high-definition image and the enlarged image are used as reference images of the first image.
  • the high-definition image comes from a remote device, and the high-definition image is an image generated by the remote device by performing super-resolution processing on the low-resolution image.
  • the high-definition image comes from a preset high-definition gallery.
  • the determining module 1002 is further configured to determine the high-definition image in the preset high-definition gallery according to the first image generated by the generating module 1001, where the similarity between the high-definition image and the first image is greater than the first threshold.
  • the obtaining module 1003 is also used to obtain an image set;
  • the generating module 1001 is further configured to use a low-pass filter to perform filtering processing on the image set acquired by the acquiring module 1003 to generate a first sub-training set;
  • the generating module 1001 is also used to perform data augmentation processing on the first sub-training set generated by the generating module 1001 to generate a second sub-training set, where the data augmentation processing includes image inversion, image rotation, image reduction, and image stretching;
  • the generating module 1001 is further configured to generate a first training set according to the first sub-training set and the second sub-training set, where the first training set is used to train the first network model, and the first network model is used to generate detail-rich image blocks and less detailed image blocks.
  • The obtaining module 1003 can perform step 401 in the embodiment shown in FIG. 4a; the generating module 1001 can perform step 402, step 404, and step 407 in the embodiment shown in FIG. 4a; the determining module 1002 can perform steps 405-406 in the embodiment shown in FIG. 4a.
  • In the embodiments of this application, the super-resolution image processing apparatus 1000 uses the first network model to recognize the acquired low-resolution image and determines the image blocks with rich details and the image blocks with low details.
  • For the image blocks with rich details, the super-resolution network model is used for super-resolution processing; for the image blocks that are not rich in details, after being enlarged, they are processed together with the detail-rich image blocks for super-resolution processing.
  • The XOR matching algorithm is further used to determine similar image blocks; on the premise of improving the matching accuracy of similar image blocks, the amount of calculation is reduced and the clarity of super-resolution images is improved.
  • In addition, a variety of different reference images are used for super-resolution processing to effectively improve the clarity of super-resolution images.
  • FIG. 11 is a schematic structural diagram of a computing device provided in an embodiment of the present application.
  • The super-resolution image processing device 1000 described in the embodiment corresponding to FIG. 10 may be deployed on the computing device 1100 to implement the functions of the super-resolution image processing device in the embodiment corresponding to FIG. 10.
  • The computing device 1100 may be a computing device in a cloud computing device system, a terminal device, or an edge computing device system.
  • The computing device 1100 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPU) 1122 (for example, one or more processors), a memory 1132, and one or more storage media 1130 (for example, one or more mass storage devices) for storing the application program 1142 or the data 1144.
  • the memory 1132 and the storage medium 1130 may be short-term storage or persistent storage.
  • the program stored in the storage medium 1130 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the computing device.
  • the central processing unit 1122 may be configured to communicate with the storage medium 1130, and execute a series of instruction operations in the storage medium 1130 on the computing device 1100.
  • the computing device 1100 may also include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input and output interfaces 1158, and/or one or more operating systems 1141, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the central processing unit 1122 is configured to execute the aforementioned super-resolution image processing method.
  • the processor in the embodiment of the present application may be an integrated circuit chip with signal processing capability.
  • the steps of the foregoing method embodiments may be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • The above-mentioned processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), and electrically available Erase programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
  • By way of example but not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct rambus random access memory (Direct Rambus RAM, DR RAM).
  • The embodiments of the present application also provide a computer program product, which, when running on a computer, causes the computer to execute the steps performed by the super-resolution image processing apparatus in the methods described in the foregoing embodiments.
  • The embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium stores a program for performing super-resolution image processing; when the program runs on a computer, the computer is caused to execute the steps performed by the super-resolution image processing device in the methods described in the foregoing embodiments.
  • An embodiment of the present application further provides a chip, which includes a processing unit and a communication unit.
  • the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip in the execution device executes the method for constructing the training set described in the foregoing embodiment.
  • The storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip in the super-resolution image processing device, such as a read-only memory (read-only memory, ROM) or another type of static storage device that can store static information and instructions, or a random access memory (random access memory, RAM), etc.
  • Figure 12 is a schematic diagram of a structure of a chip provided by an embodiment of this application.
  • The chip can be expressed as a neural network processing unit (NPU) 1200, which is mounted as a coprocessor to the host CPU (Host CPU), and the Host CPU assigns tasks to it.
  • the core part of the NPU is the arithmetic circuit 1203.
  • the arithmetic circuit 1203 is controlled by the controller 1204 to extract matrix data from the memory and perform multiplication operations.
  • the arithmetic circuit 1203 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 1203 is a two-dimensional systolic array. The arithmetic circuit 1203 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 1203 is a general-purpose matrix processor.
  • For example, suppose there are an input matrix A and a weight matrix B. The arithmetic circuit fetches the data corresponding to matrix B from the weight memory 1202 and caches it on each PE in the arithmetic circuit.
  • The arithmetic circuit takes the data of matrix A from the input memory 1201 and performs a matrix operation with matrix B, and the partial or final result of the obtained matrix is stored in an accumulator 1208.
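  • In software terms, the accumulate-as-you-go flow can be emulated as follows (a plain Python illustration, not the NPU implementation):

```python
import numpy as np

def matmul_with_accumulator(A, B):
    """Partial products of A and B are summed into an accumulator."""
    accumulator = np.zeros((A.shape[0], B.shape[1]))
    for k in range(A.shape[1]):                      # one wave of partial results
        accumulator += np.outer(A[:, k], B[k, :])
    return accumulator

A, B = np.random.rand(4, 3), np.random.rand(3, 5)
assert np.allclose(matmul_with_accumulator(A, B), A @ B)
```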
  • the unified memory 1206 is used to store input data and output data.
  • The weight data is transferred directly to the weight memory 1202 through the direct memory access controller (Direct Memory Access Controller, DMAC) 1205.
  • the input data is also transferred to the unified memory 1206 through the DMAC.
  • the BIU is the Bus Interface Unit, that is, the bus interface unit 1210, which is used for the interaction of the AXI bus with the DMAC and the instruction fetch buffer (IFB) 1209.
  • the bus interface unit 1210 (Bus Interface Unit, BIU for short) is used for the instruction fetch memory 1209 to obtain instructions from the external memory, and is also used for the storage unit access controller 1205 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 1206 or to transfer the weight data to the weight memory 1202 or to transfer the input data to the input memory 1201.
  • the vector calculation unit 1207 includes multiple arithmetic processing units, and if necessary, further processes the output of the arithmetic circuit, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison and so on. It is mainly used in the calculation of non-convolutional/fully connected layer networks in neural networks, such as Batch Normalization, pixel-level summation, and upsampling of feature planes.
  • the vector calculation unit 1207 can store the processed output vector to the unified memory 1206.
  • In some implementations, the vector calculation unit 1207 may apply a linear function and/or a non-linear function to the output of the arithmetic circuit 1203, for example, performing linear interpolation on the feature planes extracted by the convolutional layers, or applying such a function to a vector of accumulated values to generate activation values.
  • the vector calculation unit 1207 generates normalized values, pixel-level summed values, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 1203, for example for use in a subsequent layer in a neural network.
  • the instruction fetch buffer 1209 connected to the controller 1204 is used to store instructions used by the controller 1204;
  • the unified memory 1206, the input memory 1201, the weight memory 1202, and the fetch memory 1209 are all On-Chip memories.
  • the external memory is private to the NPU hardware architecture.
  • Here, the operations of each layer in each super-resolution network model shown in FIG. 4a to FIG. 8a may be performed by the arithmetic circuit 1203 or the vector calculation unit 1207.
  • The processor mentioned in any of the foregoing may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling the execution of the program of the method of the first aspect.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physically separate.
  • the physical unit can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the connection relationship between the modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • From the description of the above implementations, it can be clearly understood that this application can be implemented by means of software plus necessary general-purpose hardware; of course, it can also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and so on.
  • Generally, all functions completed by computer programs can be easily implemented with corresponding hardware, and the specific hardware structures used to achieve the same function can also be diverse, such as analog circuits, digital circuits, or dedicated circuits.
  • However, for this application, a software program implementation is a better implementation in more cases.
  • Based on this understanding, the technical solution of this application, in essence or in the part contributing to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a mobile hard disk, a ROM, a RAM, a magnetic disk, or an optical disc, and includes several instructions to make a computer device execute the methods described in the embodiments of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • For example, the computer instructions may be transmitted from one website, computer, super-resolution image processing device, computing device, or data center to another website, computer, super-resolution image processing device, computing device, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a training device or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
  • "One embodiment" or "an embodiment" mentioned throughout the specification means that a specific feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, the appearances of "in one embodiment" or "in an embodiment" in various places throughout the specification do not necessarily refer to the same embodiment. In addition, these specific features, structures, or characteristics can be combined in one or more embodiments in any suitable manner. It should be understood that in the various embodiments of the present application, the size of the sequence numbers of the above-mentioned processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • The terms "system" and "network" in this document are often used interchangeably.
  • the term “and/or” in this article is only an association relationship describing the associated objects, which means that there can be three relationships, for example, A and/or B, which can mean: A alone exists, A and B exist at the same time, exist alone B these three situations.
  • the character "/" in this text generally indicates that the associated objects before and after are in an "or” relationship.
  • In addition, "B corresponding to A" means that B is associated with A, and B can be determined according to A; however, determining B based on A does not mean that B is determined based on A alone, and B can also be determined based on A and/or other information.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative, for example, the division of units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components can be combined or integrated. To another system, or some features can be ignored, or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , Including several instructions to make a computer device (which can be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods in the various embodiments of the present application.


Abstract

A super-resolution image processing method and apparatus. In the process of generating a super-resolution image from a low-resolution image, a device running the super-resolution image processing method first generates detail-rich image blocks from the low-resolution image by splitting the low-resolution image into detail-rich image blocks of smaller size. Then, similar image blocks are determined through the detail-rich image blocks, so that when the device performs super-resolution processing on the low-resolution image, the similar image blocks can be introduced into the super-resolution processing. Because the detail-rich image blocks are small in size, the computational load of the device can be reduced. The similar image blocks serve as reference images for the low-resolution image, improving the clarity of the image after super-resolution processing.

Description

A super-resolution image processing method and related apparatus
This application claims priority to the Chinese patent application No. 201911252760.0, filed with the Chinese Patent Office on December 9, 2019 and entitled "Super-resolution image processing method and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of artificial intelligence, and in particular, to a super-resolution image processing method and a related apparatus.
Background
When a terminal device plays a video, limited by the video file, the network bandwidth, and various other factors, the image finally displayed by the terminal device is often of poor quality. To solve this problem, super-resolution (SR) technology has emerged. Super-resolution technology refers to using one or more low-resolution (LR) images to obtain one clear high-resolution (HR) image; this processing is referred to as super-resolution processing for short.

In the prior art, a video file is usually super-resolved on a server (cloud), and the processing result is delivered to a terminal device (mobile terminal). The image displayed by the terminal device is the image after super-resolution processing by the server. In this solution, a large amount of data interaction is required between the server and the terminal device; for example, the terminal device needs to send the video file to the server, and the server performs super-resolution processing and then sends the result back to the terminal device. This process occupies a large amount of network bandwidth. Therefore, solutions for performing super-resolution processing on the terminal device have been proposed.

However, current solutions for super-resolution processing on terminal devices occupy a large amount of computing resources in order to guarantee the clarity of the processed image. Therefore, a super-resolution image processing method that occupies fewer computing resources while guaranteeing image clarity after super-resolution processing is needed.
Summary
Embodiments of this application provide a super-resolution image processing method and a related apparatus that reduce the computational load of the device running the super-resolution image processing method while improving image clarity after super-resolution processing.
In a first aspect, an embodiment of this application provides a super-resolution image processing method, which may include: generating detail-rich image blocks and less detailed image blocks from a low-resolution image, where the size of a detail-rich image block is smaller than the low-resolution image; the source of the low-resolution image may be a media file, specifically any encoded frame file in a video file, such as a key frame or another frame (a P frame or B frame, etc.). The amount of image feature information included in the detail-rich image block is greater than the amount of image feature information included in the less detailed image block; for example, the color information of the image included in the detail-rich image block is greater than the color information of the image included in the less detailed image block. A detail-rich image block may include three colors, red (R), green (G), and blue (B), where the channel corresponding to red is expressed as (225, 0, 0), the channel corresponding to green as (0, 225, 0), and the channel corresponding to blue as (0, 0, 225). Illustratively, a less detailed image block may present one color, (R, G, B) = (225, 0, 0); clearly, the amount of image feature information (color information) included in the detail-rich image block is greater than that included in the less detailed image block. A similar image block is then determined from the detail-rich image block, where the similar image block may be an image whose image features are highly similar to those of the detail-rich image block; the similarity between the image feature information included in the similar image block and that included in the detail-rich image block is greater than a first threshold. For example, taking the color information of the image as the image feature information, if the detail-rich image block includes (R, G, B) = (225, 0, 0), (0, 225, 0), (0, 0, 225), and the color information of some image block is (R, G, B) = (223, 0, 0), (0, 223, 0), (0, 0, 223), and the computed similarity between the color information of that image block and the color information included in the detail-rich image block is greater than the first threshold, that image block is determined to be a similar image block. Super-resolution processing is performed on the similar image block and the low-resolution image to generate a super-resolution image, where the similar image block serves as a reference image for the low-resolution image. In an optional implementation, the super-resolution processing on the similar image block and the low-resolution image is performed through a first super-resolution network model to generate the super-resolution image.
In the embodiments of this application, in the process of generating a super-resolution image from a low-resolution image, detail-rich image blocks and less detailed image blocks are first generated from the low-resolution image by splitting the low-resolution image into detail-rich image blocks of smaller size; because a detail-rich image block serves as a reference image for the low-resolution image, the computational load of the device running the super-resolution image processing method is reduced. Then, similar image blocks are determined through the detail-rich image blocks, so that when the device performs super-resolution processing on the low-resolution image through the first super-resolution network model, the similar image blocks can be introduced into the processing. Because a detail-rich image block includes more image feature information, and the similarity between the image feature information of the similar image block and that of the detail-rich image block is high (greater than the first threshold), the similar image block can be considered to include rich image feature information; using it as a reference image for the low-resolution image can effectively improve the clarity of the super-resolution image.
With reference to the first aspect, in some implementations, generating the detail-rich image blocks from the low-resolution image may include: generating an image block set from the low-resolution image, the image block set including at least one low-resolution image block. Specifically, after the low-resolution image is acquired, it is first divided into low-resolution image blocks of smaller size, and these blocks form the image block set; for example, the low-resolution image is divided into low-resolution image blocks 32 pixels high and 32 pixels wide, where the specific size is determined by actual requirements (that is, the requirements of the subsequent neural network model) and is not limited here. The low-resolution image blocks are processed through a first network model to determine the detail-rich image blocks and the less detailed image blocks; the first network model may be a classification network (regression network), specifically composed of several convolutional layers and at least one softmax layer. First, the low-resolution image is split into low-resolution image blocks of smaller size, which form the image block set; then the image blocks in the set are processed through the first network model to determine the detail-rich and less detailed image blocks. Dividing the low-resolution image in this way reduces the computational load of determining the detail-rich image blocks. The first network model may be obtained locally through machine-learning training, or trained on a remote device, such as a cloud server, and then delivered to the local device.
结合第一方面,在一些实现方式中,根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,可以包括:通过该第一网络模型对该低分辨率图像块进行卷积处理,生成第一卷积数据集。由于第一网络模型需要将低分辨率图像块进行分类处理,因此,采用(通过)第一网络模型对低分辨率图像块进行卷积处理,生成的低分辨率图像块对应的图像特征数据集,仅仅是初步卷积处理的输出结果。需要通过第一网络模型对该低分辨率图像块对应的图像特征数据集进行进一步的卷积处理。该卷积处理所输出的卷积处理结果,作为后续分类处理的源数据。该卷积处理结果称为第一卷积数据集;生成第一卷积数据集后,通过该第一网络模型对该第一卷积数据集进行二分类处理,确定该细节丰富图像块和该细节不丰富图像块。该第一卷积数据集输入至softmax层进行二分类处理,以确定图像块集合中哪些图像块是细节丰富图像块,哪些图像块是细节不丰富图像块。具体分类的标准可以是:第一卷积数据集中包括多个图像块的特征图,特征图用于指示图像块的图像特征信息。当某个图像块的特征图显示该图像块无轮廓(例如为蓝天背景时),softmax层对应该图像块输出“0”,以表示该图像块为细节不丰富图像块。通过上述方式,提供了确定细节丰富图像块和细节不丰富图像块的方法,能够针对不同类型的图像块提取图像特征信息,并基于各个图像块的图像特征信息(特征图)对各个图像块进行分类,以确定哪些图像块是细节丰富图像块,哪些图像块是细节不丰富图像块,提高了本方案的实现灵活性。
结合第一方面,在一些实现方式中,对该低分辨率图像块进行卷积处理,生成该第一卷积数据集,可以包括:将低分辨率图像分割,生成图像块集合(该图像块集合中包括低分辨率图像块)后,通过第一网络模型对该图像块集合中的低分辨率图像块进行卷积处理。由该第一网络模型输出卷积层处理结果,该结果称为低分辨率图像块对应的图像特征数据集,该低分辨率图像块对应的图像特征数据集包括该低分辨率图像块的特征图。可选的,该第一网络模型包括多层卷积层(大于2层),该卷积层可提取图像的特征图,例如,提取图像的边缘信息、图像的轮廓信息、图像的亮度信息和/或图像的颜色信息等等;通过该第一网络模型对该低分辨率图像块对应的图像特征数据集进行卷积处理,生成该第一卷积数据集。通过上述方式,第一网络模型对低分辨率图像块进行卷积处理的过程中,由于第一网络模型的卷积层可提取图像的特征图,因此还可以输出低分辨率图像块的特征图,以便后续超分辨率图像处理使用,提升了超分辨率图像的清晰度。
结合第一方面,在一些实现方式中,根据该细节丰富图像块确定该相似图像块,可以包括:根据该细节丰富图像块,在该低分辨率图像块对应的图像特征数据集中确定细节丰富图像块对应的图像特征数据集。确定了低分辨率图像块中哪一些图像块是细节丰富图像块后,终端设备根据这些细节丰富图像块,在第一网络模型输出的低分辨率图像块对应的图像特征数据集中,确定哪些是细节丰富图像块所对应的特征图。将这些对应于细节丰富图像块的特征图,统称为细节丰富图像块对应的图像特征数据集;对该细节丰富图像块对应的图像特征数据集进行二值化处理,计算得到该细节丰富图像块中任意两个图像块的相似度。在确定了细节丰富图像块对应的图像特征数据集(即细节丰富图像块的特征图)后,为了便于后续计算,即计算细节丰富图像块中任意两个图像块的相似度,首先需要对细节丰富图像块对应的图像特征数据集中的特征图,进行二值化处理。二值化处理具体指的是图像上每个像素点的特征值设置为0或1,也就是将整个图像呈现出明显的黑白效果的过程。通常可使用“OpenCV”或“matlab”进行二值化处理,当得到细节丰富图像块对应的图像特征数据集中特征图的二值化数据后,使用异或匹配算法计算得到细节丰富图像块中任意两个图像块的相似度;最后,当该任意两个图像块的相似度大于该第一阈值时,根据该相似度确定该相似图像块。通过对特征图的二值化处理,并使用异或匹配算法计算相似度,在占用较少计算资源的情况下,可以得到图像块较为准确的相似度,提高了相似图像块匹配精度。
结合第一方面,在一些实现方式中,任意两个图像块的相似度满足:
F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
其中,该F为该相似度,该N为该细节丰富图像块的图像尺寸,该P(i,j)和Q(i,j)为该任意两个细节丰富图像块分别对应的该图像块的该特征图,该i为该图像块的该特征图像素的横坐标值,该j为该图像块的该特征图像素的纵坐标值。通过上述方式,提供了确定该相似图像块的具体实现方法,提高了本方案的实现灵活性。
结合第一方面,在一些实现方式中,根据该细节丰富图像块确定该相似图像块,可以包括:当该低分辨率图像为视频中的一帧(第一图像帧)时,确定该低分辨率图像在该视频中的位置,也就是第一图像帧在该视频中的位置信息。例如,获取的低分辨率图像来自视频文件。首先,确定该细节丰富图像块对应的低分辨率图像。其次,确定该低分辨率图像在视频中的位置,具体如下:由于视频文件中某一图像帧(如第一图像帧)解码后得到该低分辨率图像,因此根据该低分辨率图像,确定该低分辨率图像对应的第一图像帧在该视频文件中的位置信息。例如,确定某一细节丰富图像块对应的低分辨率图像,在视频文件中第一图像帧位置信息为第10帧;由于视频文件具有连贯性,相邻帧的画面相似程度较高,因此,当确定了某一细节丰富图像块所对应的低分辨率图像在视频文件中的位置后,寻找该第一图像帧位置前后一定范围内的图像帧,并确定其中的一个或多个图像帧为第二图像帧。在一种可选的实现方式中,在第二图像帧进行解码获取的图像中选取任意一个图像块,选取任意一个细节丰富图像块,使用上述两种图像块进行计算,以确定在第二图像帧对应的图像文件中哪一图像块为相似图像块;在另一种可选的实现方式中,确定该第二图像帧中与该细节丰富图像块对应位置的图像块为该相似图像块。具体的,根据细节丰富图像块在低分辨率图像中的坐标,在该图像文件的相同坐标位置,取该图像块作为相似图像块。通过上述方法,可从同一视频文件的相邻帧中,获取相似图像块,可在保证超分辨率图像的清晰度前提下,减少对运算资源的占用率。
结合第一方面,在一些实现方式中,对该相似图像块和该低分辨率图像进行超分辨率处理,生成该超分辨率图像,可以包括:对细节丰富图像块和相似图像块进行图像交换处理,生成第一交换图像。生成的第一交换图像,兼具细节丰富图像块和相似图像块的特征。具体的,可以通过下列方式中的一种或多种,进行图像交换处理:“concat方式”、“concat+add方式”或“image swap”;根据相似图像块,在细节丰富图像块对应的图像特征数据集中确定相似图像块的特征图,并对相似图像块的特征图和低分辨率图像块的特征图进行图像交换处理,生成第二交换图像。生成的第二交换图像,兼具相似图像块的特征图和低分辨率图像块的特征图的特征;对第一交换图像、第二交换图像和低分辨率图像进行超分辨率处理,生成第一图像。相似图像块和相似图像块的特征图作为超分辨率处理的参考图;若不进行二次超分辨率处理,则生成的第一图像就是超分辨率图像。由于第一交换图像和第二交换图像具有丰富的图像特征信息,因此通过上述处理方式,可进一步提升超分辨率图像的清晰度。
结合第一方面,在一些实现方式中,该根据该第一图像生成该超分辨率图像,可以包括:若进行二次超分辨率处理,获取高清图像,该高清图像的分辨率大于该低分辨率图像的分辨率。例如,低分辨率图像的分辨率为256*128,则高清图像的分辨率为1280*960,该高清图像存在两种可能的来源:(一)、该高清图像来自预置高清图库。根据该第一图像,通过该第一网络模型确定该预置高清图库中的该高清图像,该高清图像与该第一图像相似度大于第一阈值;(二)、该高清图像来自远端设备,例如云计算设备(云计算设备系统),在一种可选的实现方式中,该云计算设备使用部署于云计算设备系统的第三超分辨率网络模型,基于发送的低分辨率图像,生成该高清图像;为了进一步提升超分辨率图像的清晰度,使用前述步骤得到的细节不丰富图像块。对这些细节不丰富图像块进行放大处理,得到放大图像。具体的,放大处理包括:双三次插值处理(Bicubic)或“linearf”,当终端设备获取放大图像后,使用该放大图像作为参考图。确定该放大图像与低分辨率图像的相似图像块后,在一种可选的实现方式中,通过第二超分辨率网络模型对高清图像、放大图像和第一图像进行超分辨率处理,生成超分辨率图像。具体的,通过第二超分辨率网络模型对高清图像中与第一图像相似的图像块、放大图像中与第一图像相似的图像块,和第一图像进行超分辨率处理,生成超分辨率图像。通过异或匹配算法找到相似图像块,相似图像块可以实现信息互补,从而达到超分辨率图像清晰度的提升。另外相似图像块的来源多种多样,进一步提升超分辨率图像的清晰度。
结合第一方面,在一些实现方式中,该根据该低分辨率图像生成该细节丰富图像块之前,还可以包括:超分辨率图像处理装置获取各种图像,由这些图像构成图像集合。该图像集合中包括:从互联网和已公开的数据集中搜集具有不同纹理特征的图片,如动物、天空、人脸或建筑物等,并将不同类别的图片等比例混合得到训练集。来源有“DIV2K”或“Timofte 91 images”等数据集,或搜索引擎获得的图像等等;使用低通滤波器对图像集合进行滤波处理,即删除图像集合中细节较少,较为平滑的图像文件,生成第一子训练集,该第一子训练集中图像包括的图像特征信息的数量大于细节不丰富图像块包括的图像特征信息的数量;对第一子训练集进行数据增广处理,生成第二子训练集。具体的,数据增广处理包括:图像颠倒,图像旋转,图像缩小和图像拉伸。数据增广处理还包括:裁剪、平移、仿射、透视、高斯噪声、不均匀光、动态模糊和随机颜色填充等;根据该第一子训练集和该第二子训练集生成第一训练集,该第一训练集用于训练该第一网络模型。通过生成第一训练集,对第一网络模型进行训练,可有效提升第一网络模型的精度。
第二方面,本申请实施例提出了一种超分辨率图像处理装置,该超分辨率图像处理装置可部署于云计算设备、边缘计算设备系统或终端设备等多种设备中。该超分辨率图像处理装置包括生成模块和确定模块:
生成模块,用于根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,其中,该细节丰富图像块和该细节不丰富图像块的尺寸小于该低分辨率图像,该细节丰富图像块包括的图像特征信息的数量大于该细节不丰富图像块包括的图像特征信息的数量;
确定模块,用于根据该细节丰富图像块确定相似图像块,其中,该相似图像块包括的图像特征信息与该细节丰富图像块包括的图像特征信息的相似度大于第一阈值;
该生成模块,还用于对该相似图像块和该低分辨率图像进行超分辨率处理,生成超分 辨率图像,其中,该相似图像块作为该低分辨率图像的参考图。
结合第二方面,在一些实现方式中,该生成模块,具体用于根据该低分辨率图像生成图像块集合;
该确定模块,具体用于对该图像块集合中的图像块进行卷积处理,生成第一卷积数据集;
该确定模块,具体用于对该第一卷积数据集进行二分类处理,确定该细节丰富图像块和该细节不丰富图像块。
结合第二方面,在一些实现方式中,该生成模块,具体用于根据该细节丰富图像块,确定该细节丰富图像块对应的图像特征数据集;
该生成模块,具体用于对确定的该图像特征数据集进行二值化处理,得到该细节丰富图像块中任意两个图像块的相似度;
该确定模块,具体用于当该任意两个图像块的相似度大于该第一阈值时,确定该相似图像块。
结合第二方面,在一些实现方式中,任意两个图像块的相似度满足:
F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
其中,该F为该相似度,该N为该细节丰富图像块的图像尺寸,该P(i,j)和Q(i,j)为该任意两个细节丰富图像块分别对应的该图像块的该特征图,该i为该图像块的该特征图像素的横坐标值,该j为该图像块的该特征图像素的纵坐标值。
结合第二方面,在一些实现方式中,该确定模块,具体用于当该低分辨率图像为视频中的一帧时,确定该低分辨率图像在该视频中的位置;
该确定模块,具体用于根据该位置,确定第二图像帧,其中,该第二图像帧为该低分辨率图像在该视频中的相邻帧;
该确定模块,具体用于确定该第二图像帧中与该细节丰富图像块对应位置的图像块为该相似图像块。
结合第二方面,在一些实现方式中,该生成模块,具体用于对该细节丰富图像块和该相似图像块进行图像交换处理,生成第一交换图像;
该生成模块,具体用于根据该相似图像块,确定该相似图像块的特征图,并对该相似图像块的特征图和该低分辨率图像块的特征图进行图像交换处理,生成第二交换图像,其中所述特征图用于指示图像块的图像特征信息;
该生成模块,具体用于对该第一交换图像、该第二交换图像和该低分辨率图像进行超分辨率处理,生成第一图像;
该生成模块,具体用于根据该第一图像生成该超分辨率图像。
结合第二方面,在一些实现方式中,该超分辨率图像处理装置还包括获取模块;
该获取模块,用于获取高清图像,该高清图像的分辨率大于该低分辨率图像的分辨率;
该生成模块,具体用于对该细节不丰富图像块进行放大处理以生成放大图像,其中该放大处理包括双三次插值处理;
该生成模块,具体用于对该高清图像、该放大图像和该第一图像进行超分辨率处理,生成该超分辨率图像,其中,该高清图像和该放大图像作为该第一图像的参考图。
结合第二方面,在一些实现方式中,该高清图像来自远端设备,该高清图像为该远端设备通过对该低分辨率图像进行超分辨率处理生成的图像。
结合第二方面,在一些实现方式中,该高清图像来自预置高清图库,该预置高清图库中包括至少一张该高清图像。
结合第二方面,在一些实现方式中,该确定模块,还用于根据该第一图像,确定该预置高清图库中的该高清图像,该高清图像与该第一图像相似度大于该第一阈值。
结合第二方面,在一些实现方式中,该获取模块,还用于获取图像集合;
该生成模块,还用于使用低通滤波器对该图像集合进行滤波处理,生成第一子训练集,该第一子训练集中图像包括的图像特征信息的数量大于该细节不丰富图像块包括的图像特征信息的数量;
该生成模块,还用于对该第一子训练集进行数据增广处理,生成第二子训练集,该数据增广处理包括图像颠倒,图像旋转,图像缩小和图像拉伸;
该生成模块,还用于根据该第一子训练集和该第二子训练集生成第一训练集,该第一训练集用于训练该第一网络模型,第一网络模型用于生成细节丰富图像块和细节不丰富图像块。
结合第二方面,在一些实现方式中,该图像特征信息包括图像的边缘信息、图像的轮廓信息、图像的亮度信息和/或图像的颜色信息。
第三方面,本申请实施例提供了一种超分辨率图像处理装置,该超分辨率图像处理装置包括至少一个处理器和存储器,该存储器中存储有可在处理器上运行的计算机指令,当该计算机指令被该处理器执行时,该处理器执行如上述第一方面或第一方面任意一种可能的实现方式所述的方法。
第四方面,本申请实施例提供了一种终端设备,该终端设备包括至少一个处理器、存储器、通信端口、显示器以及存储在存储器中并可在处理器上运行的计算机执行指令,当该计算机执行指令被该处理器执行时,该处理器执行如上述第一方面或第一方面任意一种可能的实现方式所述的方法。
第五方面,本申请实施例提供了一种存储一个或多个计算机执行指令的计算机可读存储介质,当该计算机执行指令被处理器执行时,该处理器执行如上述第一方面或第一方面任意一种可能的实现方式所述的方法。
第六方面,本申请实施例提供一种存储一个或多个计算机执行指令的计算机程序产品(或称计算机程序),当该计算机执行指令被该处理器执行时,该处理器执行上述第一方面或第一方面任意一种可能实现方式的方法。
第七方面,本申请提供了一种芯片系统,该芯片系统包括处理器,用于支持终端设备实现上述方面中所涉及的功能。在一种可能的设计中,该芯片系统还包括存储器,该存储器,用于保存终端设备必要的程序指令和数据。该芯片系统,可以由芯片构成,也可以包括芯片和其他分立器件。
其中,第二至第七方面或者其中任一种可能实现方式所带来的技术效果可参见第一方面或第一方面不同可能实现方式所带来的技术效果,此处不再赘述。
从以上技术方案可以看出,本申请实施例具有以下优点:
本申请实施例提供了一种超分辨率图像处理方法和相关装置,运行该超分辨率图像处理方法的设备,根据低分辨率图像生成超分辨率图像的过程中,首先根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,通过将低分辨率图像拆分为尺寸较小的细节丰富图像块,而该细节丰富图像块作为低分辨率图像的参考图,因此减轻运行该超分辨率图像处理方法的设备运算量。其次,通过该细节丰富图像块确定相似图像块,使得设备通过第一超分辨率网络模型对低分辨率图像进行超分辨率处理时,可引入该相似图像块一同进行超分辨率处理。由于该细节丰富图像块包括的图像特征信息较多(大于细节不丰富图像块),相似图像块包括的图像特征信息与该细节丰富图像块包括的图像特征信息相似度较高(大于第一阈值),因此,可以认为该相似图像块包括的图像特征信息较多,可有效提升超分辨率图像的清晰度。
附图说明
图1a为本申请实施例提出的一种应用场景示意图;
图1b为本申请实施例提供的一种系统架构示意图;
图1c为本申请实施例提供的一种系统架构示意图;
图2为本申请实施例提供的一种系统架构200的示意图;
图3为本申请实施例提供的卷积神经网络的一种结构示意图;
图4a为本申请实施例中超分辨率图像处理方法的一个实施例示意图;
图4b为本申请实施例提供的超分辨率图像处理方法的一个流程示意图;
图4c为本申请实施例提供的超分辨率图像处理方法的一个流程示意图;
图5为本申请实施例提出的一种确定相似图像块的流程示意图;
图6为本申请实施例提出的一种确定相似图像块的流程示意图;
图7为本申请实施例提出的一种超分辨率处理的流程示意图;
图8a为本申请实施例提出的一种超分辨率处理的流程示意图;
图8b为本申请实施例中一种仿真实验示意图;
图8c为插值算法的计算结果示意图;
图8d为本申请实施例提出的超分辨率图像处理方法的计算结果示意图;
图8e为本申请实施例中一种仿真实验示意图;
图9为本申请实施例中一种生成训练集的流程示意图;
图10为本申请实施例中超分辨率图像处理装置1000的一种实施例示意图;
图11为本申请实施例提供的计算设备一种结构示意图;
图12为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
本申请实施例提供了一种超分辨率图像处理方法和相关装置,运行该超分辨率图像处理方法的设备,根据低分辨率图像生成超分辨率图像的过程中,首先根据低分辨率图像生成细节丰富图像块,通过将低分辨率图像拆分为尺寸较小的细节丰富图像块,细节丰富图像块作为低分辨率图像的参考图,以减轻设备的运算量。其次,通过该细节丰富图像块确定相似图像块,使得设备对低分辨率图像进行超分辨率处理时,可引入该相似图像块一同进行超分辨率处理,以提升超分处理后图像清晰度。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
本申请提出的超分辨率图像处理方法可以部署于不同设备上,例如:(1)、部署于移动端(终端设备)中;(2)、部署于云端(服务器,云计算设备或称为云计算设备系统);(3)、部分部署于移动端(终端设备),部分部署于云端(服务器,云计算设备或称为云计算设备系统),移动端与云端配合使用。为了便于理解,请参阅图1a,图1a为本申请实施例提出的一种应用场景示意图。
S1、获取媒体文件。
步骤S1中、该媒体文件可以是视频文件,如音频视频交错格式(audio video interleaved,AVI)视频文件;也可以是图片文件,如,联合图像专家组格式(joint photographic experts group,JPEG)图片文件,此处不作限定。
S1存在多种情况,下面分别进行说明:
一、终端设备播放本地媒体文件,本申请提出的超分辨率图像处理方法部署于终端设备中:
终端设备播放本地媒体文件,如“相册”应用程序播放本地媒体文件。终端设备从本地的存储器中获取该媒体文件。终端设备获取该媒体文件后,对该媒体文件进行后续超分辨率图像处理。
二、终端设备播放云端媒体文件,本申请提出的超分辨率图像处理方法部署于终端设备中:
终端设备播放云端媒体文件,例如,“优酷”应用程序播放云端媒体文件。终端设备从提供该云端媒体文件播放服务的服务器中获取该媒体文件,并对该媒体文件进行后续超分辨率图像处理。
三、终端设备播放本地媒体文件,本申请提出的超分辨率图像处理方法部署于服务器中:
终端设备播放本地媒体文件时,终端设备从本地的存储器中获取该媒体文件,并将该媒体文件发送至部署有超分辨率图像处理方法的服务器中,由该服务器对该媒体文件进行超分辨率图像处理。该服务器将超分辨率图像处理结果发送给终端设备,终端设备基于该处理结果,播放经过处理后的本地媒体文件。
四、终端设备播放云端媒体文件,本申请提出的超分辨率图像处理方法部署于服务器中:
终端设备播放云端媒体文件时,终端设备(或提供该云端媒体文件播放服务的服务器)通知部署有超分辨率图像处理方法的服务器,通过该云端媒体文件的地址获取该媒体文件。服务器对该媒体文件进行超分辨率图像处理后,将处理结果发送给终端设备(或提供该云端媒体文件播放服务的服务器)。当该处理结果为发送给终端设备时:终端设备基于该处理结果,播放经过处理后的云端媒体文件。当该处理结果为发送给提供该云端媒体文件播放服务的服务器时:该提供该云端媒体文件播放服务的服务器将该处理结果转发给终端设备,终端设备播放经过处理后的云端媒体文件。
五、终端设备播放本地媒体文件,本申请提出的超分辨率图像处理方法部分部署于终端设备,部分部署于服务器。
终端设备播放本地媒体文件时,终端设备从本地的存储器中获取该媒体文件。终端设备获取该媒体文件后,由终端设备和服务器协同对该媒体文件进行后续超分辨率图像处理。
六、终端设备播放云端媒体文件,本申请提出的超分辨率图像处理方法部分部署于终端设备,部分部署于服务器。
终端设备播放云端媒体文件。终端设备从提供该云端媒体文件播放服务的服务器中获取该媒体文件。终端设备获取该媒体文件后,由终端设备和服务器协同对该媒体文件进行后续超分辨率图像处理。
S2、获取低分辨率图像。
步骤S2中,部署有该超分辨率图像处理方法的终端设备和/或服务器,获取媒体文件后,对该媒体文件进行处理。对于不同的媒体文件,存在不同的处理方式,下面分别进行说明:
当媒体文件为视频文件时,部署有该超分辨率图像处理方法的终端设备和/或服务器,根据该视频文件,提取该视频文件中的图像帧文件,该图像帧文件可以是该视频文件中任一编码帧文件,例如是关键帧(I帧(I frame)),或是其它帧,如P帧或B帧等等。根据该图像帧文件,获取该图像帧文件对应的低分辨率图像。
当媒体文件为图片文件时,部署有该超分辨率图像处理方法的终端设备和/或服务器,根据该图片文件,获取该图片文件对应的低分辨率图像。
S3、对低分辨率图像进行超分辨率处理。
步骤S3中、部署有该超分辨率图像处理方法的终端设备和/或服务器,对低分辨率图像进行超分辨率图像处理,并输出超分辨率图像。具体处理流程,在后续实施例中详细描述。
本实施例中,本申请实施例提出的超分辨率图像处理方法可以应用于多种应用环境中,并可以提供多种应用环境中的超分辨率图像处理服务,具有适用范围广、实用性高等特点。
本申请实施例提供的超分辨率图像处理方法可以由超分辨率图像处理装置执行。如前述实施例,本申请实施例中并不限定超分辨率图像处理装置所部署的位置。示例性的,如图1b所示,图1b为本申请实施例提供的一种系统架构示意图,超分辨率图像处理装置可以运行在云计算设备系统(包括至少一个云计算设备,例如:服务器等),也可以运行在边缘计算设备系统(包括至少一个边缘计算设备,例如:服务器、台式电脑等),也可以运行在各种终端设备上,例如:手机、笔记本电脑、个人台式电脑等。
超分辨率图像处理装置中的各个组成部分还可以分别部署在不同的系统或服务器中。示例性的,如图1c所示,装置的各部分可以分别运行在云计算设备系统、边缘计算设备系统或终端设备这三个环境中,也可以运行在这三个环境中的任意两个中。云计算设备系统、边缘计算设备系统和终端设备之间由通信通路连接,可以互相进行通信和数据传输。本申请实施例提供的超分辨率图像处理方法由运行在三个环境(或三个环境中的任意两个)中的超分辨率图像处理装置的各组成部分配合执行。
下面以超分辨率图像处理装置一部分部署于终端设备,另一部分部署于云计算设备系统中为例进行说明。请参见图2,图2为本申请实施例提供的一种系统架构200的示意图,超分辨率图像处理装置中的各部分部署于该系统架构200上的不同设备上,以使得该系统架构200中的设备协同工作一起实现超分辨率图像处理装置的功能。如图2所示,该系统架构200包括服务器220、数据库230、第一通信设备240、数据存储系统250和第二通信设备260,其中,数据库230、服务器220以及数据存储系统250属于云计算设备系统,第一通信设备240和第二通信设备260属于终端设备。
示例性地,第一通信设备240用于获取低分辨率图像,并将低分辨率图像发送给服务器220,服务器220根据该低分辨率图像,通过部署于服务器220中第三超分辨率网络模型,生成高清图像。
可选的,部署于服务器220中的第三超分辨率网络模型,为了节省网络带宽资源和运算资源,可以每间隔T时间,T为正整数,根据来自第一通信设备240的低分辨率图像生成高清图像;也可以,在来自第一通信设备240的低分辨率图像(集合)中,每间隔Y张图像,Y为正整数,选取一张低分辨率图像生成高清图像,此处不做限制。
数据库230中存储有第一训练集(该第一训练集包括第一子训练集和第二子训练集),该第一训练集用于供服务器220对第一网络模型进行迭代训练。服务器220可以每经过一段时间,将训练后的第一网络模型下发至第一通信设备240中,以使得第一通信设备240更新本地的第一网络模型。该第一训练集可以为用户通过第一通信设备240上传至服务器220中的,也可以为服务器220通过数据采集设备从搜索引擎或“DIV2K”等数据集中获取的。
本申请实施例中,服务器220根据第一通信设备240上传的低分辨率图像生成高清图像后,将该高清图像发送给第一通信设备240。第一通信设备240使用部署于本地的第一超分辨率网络模型对低分辨率图像进行超分辨率处理,生成第一图像。第一通信设备240使用部署于本地的第二超分辨率网络模型对第一图像和高清图像进行超分辨率处理,生成 超分辨率图像。
可选的,服务器220也可以对第一超分辨率网络模型、第二超分辨率网络模型和第三超分辨率网络模型中的一种或多种超分辨率网络模型进行训练。服务器220可将已训练完成的第一超分辨率网络模型和第二超分辨率网络模型发送至第一通信设备240,使得第一通信设备240更新本地的第一超分辨率网络模型和第二超分辨率网络模型。在服务器220向第一通信设备240发送上述超分辨率网络模型之前,服务器220还可以使用“HiAI Convert”或“ShaderNN Converter”软件对上述模型进行处理,使得第一通信设备240可成功运行上述超分辨率网络模型。需要说明的是,该第一超分辨率网络模型和第二超分辨率网络模型可以是同一个超分辨率网络模型的两个组成部分,也可以是不同超分辨率网络模型,此处不作限定。服务器220可以使用已训练完成的第三超分辨率网络模型更新服务器220本地的第三超分辨率网络模型,由于服务器220的运算资源通常高于第一通信设备240,因此,该第三超分辨率网络模型的模型参数量大于第一超分辨率网络模型(和第二超分辨率网络模型),即第三超分辨率网络模型大于第一超分辨率网络模型(和第二超分辨率网络模型)。
可选的,该高清图像还可以来自预置高清图库,该预置高清图库存储于数据存储系统250中。该预置高清图库还可以存储于第一通信设备240中。该预置高清图库可以由服务器220通过数据采集设备从搜索引擎或“DIV2K”等数据集中获取的,也可以由第一通信设备240采集获取,此处不作限定。
可选的,服务器220将已训练完成的第一网络模型、第一超分辨率网络模型和第二超分辨率网络模型发送至第二通信设备260。由第二通信设备260运行上述模型,使得该第二通信设备260作为超分辨率图像处理装置的一部分,并执行本申请提出的超分辨率图像处理方法。
其中,第一通信设备240和第二通信设备260包括但不限于个人计算机、计算机工作站、智能手机、平板电脑、智能摄像头、智能汽车或其他类型蜂窝电话、媒体消费设备、可穿戴设备、机顶盒、游戏机等。
第一通信设备240与服务器220以及第二通信设备260与服务器220之间均可以通过无线网络连接。其中,上述的无线网络使用标准通信技术和/或协议。无线网络通常为因特网,但也可以是任何网络,包括但不限于局域网(local area network,LAN)、城域网(metropolitan area network,MAN)、广域网(wide area network,WAN)、移动网络、专用网络或者虚拟专用网络的任何组合。在另一些实施例中,还可以使用定制或专用数据通信技术取代或者补充上述数据通信技术。
虽然图2中仅示出了一个服务器220、一个第一通信设备240和一个第二通信设备260,但应当理解,图2中的示例仅用于理解本方案,具体服务器220、第一通信设备240和第二通信设备260的数量均应当结合实际情况灵活确定。
本申请实施例提出的:第一超分辨率网络模型、第二超分辨率网络模型、第三超分辨率网络模型和第一网络模型,均是对图像数据进行处理的神经网络模型。目前,常用对图像数据进行处理的神经网络模型,为卷积神经网络(convolutional neural network,CNN)以及基于卷积神经网络的其它神经网络(如:循环神经网络(recurrent neural network,RNN)、超分辨率卷积神经网络(Super Resolution convolutional neural network,SRCNN)、深度循环网络(deeply-recursive convolutional network,DRCN)或亚像素卷积神经网络(efficient sub-pixel convolutional neural network,ESPCN)等等)。为了便于理解本申请,下面将以卷积神经网络为例对本申请提出的超分辨率图像处理方法进行介绍。
请参阅图3,图3为本申请实施例提供的卷积神经网络的一种结构示意图,卷积神经网络(CNN)是一种带有卷积结构的深度神经网络,是一种深度学习(deep learning)架构,深度学习架构是指通过机器学习的算法,在不同的抽象层级上进行多个层次的学习。卷积神经网络作为一种深度学习架构,是一种前馈(feed-forward)人工神经网络。如图3所示,卷积神经网络100可以包括输入层110,卷积层/池化层120,其中池化层为可选的,以及神经网络层130。
如图3所示卷积层/池化层120可以包括如示例121-126层,在一种实现中,121层为卷积层,122层为池化层,123层为卷积层,124层为池化层,125为卷积层,126为池化层;在另一种实现方式中,121、122为卷积层,123为池化层,124、125为卷积层,126为池化层。即卷积层的输出可以作为随后的池化层的输入,也可以作为另一个卷积层的输入以继续进行卷积操作。
以卷积层121为例,卷积层121可以包括很多个卷积算子,卷积算子也称为核,其在图像处理中的作用相当于一个从输入图像矩阵中提取特定信息的过滤器,卷积算子本质上可以是一个权重矩阵,这个权重矩阵通常被预先定义,在对图像进行卷积操作的过程中,权重矩阵通常在输入图像上沿着水平方向一个像素接着一个像素(或两个像素接着两个像素等等,像素的个数取决于步长stride的取值)的进行处理,从而完成从图像中提取特定特征的工作。该权重矩阵的大小应该与图像的大小相关,需要注意的是,权重矩阵的纵深维度(depth dimension)和输入图像的纵深维度是相同的,在进行卷积运算的过程中,权重矩阵会延伸到输入图像的整个深度。因此,和一个单一的权重矩阵进行卷积会产生一个单一纵深维度的卷积化输出,但是大多数情况下不使用单一权重矩阵,而是应用维度相同的多个权重矩阵。每个权重矩阵的输出被堆叠起来形成卷积图像的纵深维度。不同的权重矩阵可以用来提取图像中不同的特征,例如一个权重矩阵用来提取图像边缘信息,另一个权重矩阵用来提取图像的特定颜色,又一个权重矩阵用来对图像中不需要的噪点进行模糊化等等,此处不对所有权重矩阵进行穷举,该多个权重矩阵维度相同,经过该多个维度相同的权重矩阵提取后的特征图维度也相同,再将提取到的多个维度相同的特征图合并形成卷积运算的输出。
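下面用一个简短的PyTorch示例直观说明“多个维度相同的权重矩阵的输出堆叠形成卷积输出的纵深维度”这一点(通道数、卷积核尺寸与输入尺寸均为本示例的假设,仅作示意):

```python
import torch
import torch.nn as nn

# 64个3×3的权重矩阵(卷积核)对单通道输入做卷积,
# 每个权重矩阵输出一张特征图,沿通道维堆叠为64维的卷积输出
conv = nn.Conv2d(in_channels=1, out_channels=64, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 1, 32, 32)   # 一个32×32的灰度图像块
y = conv(x)
print(y.shape)                  # torch.Size([1, 64, 32, 32])
```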
这些权重矩阵中的权重值在实际应用中需要经过大量的训练得到,通过训练得到的权重值形成的各个权重矩阵可以从输入图像中提取信息,从而帮助卷积神经网络100进行正确的预测。
当卷积神经网络100有多个卷积层的时候,初始的卷积层(例如121)往往提取较多的一般特征,该一般特征也可以称之为低级别的特征;随着卷积神经网络100深度的加深,越往后的卷积层(例如126)提取到的特征越来越复杂,比如高级别的语义之类的特征,语义越高的特征越适用于待解决的问题。
由于常常需要减少训练参数的数量,因此卷积层之后常常需要周期性的引入池化层,即如图3中120所示例的121-126各层,可以是一层卷积层后面跟一层池化层,也可以是多层卷积层后面接一层或多层池化层。在图像处理过程中,池化层的唯一目的就是减少图像的空间大小。池化层可以包括平均池化算子和/或最大池化算子,以用于对输入图像进行采样得到较小尺寸的图像。平均池化算子可以在特定范围内对图像中的像素值进行计算产生平均值。最大池化算子可以在特定范围内取该范围内值最大的像素作为最大池化的结果。另外,就像卷积层中用权重矩阵的大小应该与图像大小相关一样,池化层中的运算符也应该与图像的大小相关。通过池化层处理后输出的图像尺寸可以小于输入池化层的图像的尺寸,池化层输出的图像中每个像素点表示输入池化层的图像的对应子区域的平均值或最大值。
在经过卷积层/池化层120的处理后,卷积神经网络100还不足以输出所需要的输出信息。因为如前所述,卷积层/池化层120只会提取特征,并减少输入图像带来的参数。然而为了生成最终的输出信息(所需要的类信息或别的相关信息),卷积神经网络100需要利用神经网络层130来生成一个或者一组所需要的类的数量的输出。因此,在神经网络层130中可以包括多层隐含层(如图3所示的131、132至13n)以及输出层140,该多层隐含层中所包含的参数可以根据具体的任务类型的相关训练数据进行预先训练得到,例如该任务类型可以包括图像处理以及图像处理之后的技能选择,其中图像处理部分可以包括图像识别,图像分类,图像超分辨率处理等等,在对图像进行处理之后,可以根据获取到的图像信息进行技能选择;作为示例,例如在本申请应用于超分辨率图像处理、神经网络具体表现为卷积神经网络且任务为对图像进行超分辨率处理:将低分辨率图像输入到神经网络的卷积神经网络中,则卷积神经网络需要对低分辨率图像进行识别,进而获得图像中各种图像特征信息,例如:轮廓信息、图像的亮度信息、图像的纹理信息等信息,进而确定低分辨率图像相似的相似图像块。进而,卷积神经网络结合相似图像块对该低分辨率图像进行超分辨率处理,生成超分辨率图像;可选地,为了进一步提升超分辨率图像的清晰度,卷积神经网络对高清图像进行识别,以确定与低分辨率图像相似的高清图像,进而使用该高清图像、该相似图像块和低分辨率图像进行超分辨率处理,生成超分辨率图像,该高清图像和该相似图像块作为该低分辨率图像的参考图,等等。
在神经网络层130中的多层隐含层之后,也就是整个卷积神经网络100的最后层为输出层140,该输出层140具有类似分类交叉熵的损失函数,具体用于计算预测误差,一旦整个卷积神经网络100的前向传播(如图3由110至140的传播为前向传播)完成,反向传播(如图3由140至110的传播为反向传播)就会开始更新前面提到的各层的权重值以及偏差,以减少卷积神经网络100的损失及卷积神经网络100通过输出层输出的结果和理想结果之间的误差。
需要说明的是,如图3所示的卷积神经网络100仅作为一种卷积神经网络的示例,在具体的应用中,卷积神经网络还可以以其他网络模型的形式存在,此处不再对其他类型的神经网络进行一一介绍。
结合上述描述,下面开始对本申请实施例提供的超分辨率图像处理方法的具体实现流程进行描述。以超分辨率图像处理装置一部分部署于终端设备,另一部分部署于云计算设备系统中为例进行说明。请参阅图4a,图4a为本申请实施例中超分辨率图像处理方法的一个实施例示意图。本申请实施例中超分辨率图像处理方法的一个实施例,包括:
401、获取低分辨率图像。
本实施例中,终端设备播放媒体文件时(如前述图1a实施例的描述),终端设备可获取该媒体文件对应的低分辨率图像。具体的,当该媒体文件为视频文件时,获取的低分辨率图像为该视频文件中任一图像帧,例如是关键帧(I帧(I frame)),或是其它帧,如P帧或B帧等等。当该媒体文件为图像文件时,获取的低分辨率图像即该图像文件。
402、根据低分辨率图像生成图像块集合。
本实施例中,终端设备获取了低分辨率图像后,首先将该低分辨率图像分割成尺寸较小的低分辨率图像块,这些低分辨率图像块组成图像块集合。具体的,将低分辨率图像分割为高32像素,宽32像素的低分辨率图像块,该低分辨率图像块的具体尺寸由实际需求决定(即后续神经网络模型的需求,如第一网络模型),此处不作限定。
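下面给出上述分割步骤的一个示意性Python草图(非本申请的限定性实现;函数名与“边缘填充补齐不足尺寸”的策略均为本示例的假设,专利文本仅限定按固定尺寸分割):

```python
import numpy as np

def crop_into_blocks(image: np.ndarray, block: int = 32):
    """将低分辨率图像分割为 block x block 的低分辨率图像块集合。
    尺寸不能整除时在右侧/下侧做边缘填充(本示例的假设)。"""
    h, w = image.shape[:2]
    pad_h = (block - h % block) % block
    pad_w = (block - w % block) % block
    pad_width = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="edge")
    blocks = []
    for y in range(0, padded.shape[0], block):
        for x in range(0, padded.shape[1], block):
            blocks.append(padded[y:y + block, x:x + block])
    return blocks  # 图像块集合
```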
403、通过第一网络模型对低分辨率图像对应的图像块集合进行卷积处理,生成低分辨率图像块对应的图像特征数据集。
本实施例中,终端设备将低分辨率图像分割,生成低分辨率图像块后。终端设备通过第一网络模型对低分辨率图像对应的图像块集合进行卷积处理,由该第一网络模型输出卷积层处理结果,该结果称为低分辨率图像块对应的图像特征数据集,该图像特征数据集中包括图像块的特征图。
可选的,该第一网络模型包括多层卷积层(大于2层),该卷积层可提取图像的特征图,例如,提取图像的边缘信息、图像的轮廓信息、图像的亮度信息和/或图像的颜色信息等等。我们选取该第一网络模型的前两层卷积层输出的卷积处理结果,作为低分辨率图像块的特征图,也称为低分辨率图像块特征图。这些低分辨率图像块特征图的集合,称为低分辨率图像块对应的图像特征数据集。
404、通过第一网络模型对低分辨率图像块对应的图像特征数据集进行卷积处理,生成第一卷积数据集。
本实施例中,由于第一网络模型需要将低分辨率图像块进行分类处理,因此,第一网络模型对低分辨率图像块进行卷积处理,生成的低分辨率图像块对应的图像特征数据集,仅仅是初步卷积处理的输出结果。需要通过第一网络模型对该低分辨率图像块对应的图像特征数据集进行进一步的卷积处理。该卷积处理所输出的卷积处理结果,作为后续分类处理的源数据。该卷积处理结果称为第一卷积数据集。
405、通过第一网络模型对第一卷积数据集进行二分类处理,确定细节丰富图像块和细节不丰富图像块。
本实施例中,终端设备生成第一卷积数据集后,通过第一网络模型对第一卷积数据集进行二分类处理(softmax),确定细节丰富图像块和细节不丰富图像块。该第一网络模型,可以是一种分类网络(回归网络),具体由若干个卷积层和至少一个softmax层组成。当低分辨率图像块输入进该第一网络模型进行处理,在最后一个卷积层输出的结果称为第一卷积数据集。然后,该第一卷积数据集输入至softmax层进行二分类处理,以确定图像块集合中哪些图像块是细节丰富图像块,哪些图像块是细节不丰富图像块。具体分类的标准可以是:第一卷积数据集中包括多个图像块的特征图,当某个图像块的特征图显示该图像块无轮廓(例如为蓝天背景时),softmax层对应该图像块输出“0”,以表示该图像块为细节不丰富图像块。细节丰富图像块具有丰富的图像特征信息,图像特征信息包括但不限于:图像的轮廓信息、图像的亮度信息、图像的纹理信息等。
步骤401-405,所描述的是如何在低分辨率图像中,确定细节丰富图像块和细节不丰富图像块。上述过程,可使用如下公式描述:
CLASSIFY=Softmax(Conv(resize(Crop(Input(H,W,1)))));
其中,Input表示输入的低分辨率图像,(H,W,1)中的“H”表示低分辨率图像的高度,“W”表示低分辨率图像的宽度,“1”表示输入的该低分辨率图像为单通道图像(即灰度图),“Crop”表示将输入的低分辨率图像分割为低分辨率图像块,“resize”表示将该低分辨率图像块统一缩放至固定尺寸的图像块,“Conv”表示经过缩放处理后的低分辨率图像块进行卷积处理的卷积处理结果,“Softmax”表示对卷积处理结果(第一卷积数据集)进行二分类处理的结果,“CLASSIFY”表示该低分辨率图像块是细节丰富图像块或是细节不丰富图像块,例如,“CLASSIFY=0”,表示该低分辨率图像块为细节不丰富图像块。
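为便于理解上述分类流程,下面给出第一网络模型的一个最小PyTorch示意:若干卷积层加softmax二分类。其中卷积层数、通道数等均为本示例的假设,专利文本仅限定“由若干个卷积层和至少一个softmax层组成”:

```python
import torch
import torch.nn as nn

class PatchClassifier(nn.Module):
    """第一网络模型的一个极简示意:卷积提取特征,softmax输出二分类结果。"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),   # 最后一层卷积输出即“第一卷积数据集”
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, 2)

    def forward(self, x):                 # x: (B, 1, 32, 32) 的灰度图像块
        feat = self.features(x).flatten(1)
        # 输出每个图像块属于“细节丰富/细节不丰富”两类的概率
        return torch.softmax(self.classifier(feat), dim=1)
```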
为了便于理解上述步骤(401-405),请参阅图4b,图4b为本申请实施例提供的超分辨率图像处理方法的一个流程示意图。终端设备获取低分辨率图像后,根据该低分辨率图像生成低分辨率图像块。使用第一网络模型对低分辨率图像块进行处理,最终确定哪一些低分辨率图像块是细节丰富图像块(如包括建筑物或人脸轮廓的图像块),哪一些低分辨率图像块是细节不丰富图像块(如包括天空背景的图像块)。
406、根据细节丰富图像块确定相似图像块。
本实施例中,终端设备根据细节丰富图像块确定相似图像块。对于如何确定相似图像块,存在多种不同方案,下面结合附图进行说明。一、在该低分辨率图像内确定相似图像块。二、在该低分辨率图像以外的图像确定相似图像块。
一、在该低分辨率图像内确定相似图像块。请参阅图5,图5为本申请实施例提出的一种确定相似图像块的流程示意图。
D1、根据细节丰富图像块,在低分辨率图像块对应的图像特征数据集中确定细节丰富图像块对应的图像特征数据集。
步骤D1中,终端设备确定了低分辨率图像块中哪一些图像块是细节丰富图像块后,终端设备根据这些细节丰富图像块,在第一网络模型输出的低分辨率图像块对应的图像特征数据集中,确定哪些是细节丰富图像块所对应的特征图。将这些对应于细节丰富图像块的特征图,统称为细节丰富图像块对应的图像特征数据集。
例如:低分辨率图像块对应的图像特征数据集中包括A、B、C、D、E和F,这6个低分辨率图像块的特征图。当第一网络模型确定A、B、C、D、E和F这6个低分辨率图像块中,A和B为细节丰富图像块。则在步骤D1中,终端设备在低分辨率图像块对应的图像特征数据集中找到A和B图像块所对应的特征图,将这些确定的特征图统称为细节丰富图像块对应的图像特征数据集。即,细节丰富图像块对应的图像特征数据集中包括A图像块的特征图,和B图像块的特征图。
D2、对细节丰富图像块对应的图像特征数据集进行二值化处理,计算得到细节丰富图像块中任意两个图像块的相似度,根据该相似度确定相似图像块。
本实施例中,终端设备在确定了细节丰富图像块对应的图像特征数据集(即细节丰富图像块的特征图)后,为了便于后续计算,即计算细节丰富图像块中任意两个图像块的相似度。首先需要对细节丰富图像块对应的图像特征数据集中的特征图,进行二值化处理。二值化处理具体指的是图像上每个像素点的特征值设置为0或1,也就是将整个图像呈现出明显的黑白效果的过程。通常可使用“OpenCV”或“matlab”进行二值化处理。
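以OpenCV为例,下面是特征图二值化的一个示意性草图(采用大津法自动选取阈值,该阈值策略是本示例的假设,专利文本未限定具体阈值选取方式):

```python
import cv2
import numpy as np

def binarize_feature_map(feature_map: np.ndarray) -> np.ndarray:
    """将特征图上每个像素点的特征值设置为0或1。"""
    # 先归一化到[0,255]并转为8位灰度,便于使用OpenCV阈值化
    fm = cv2.normalize(feature_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # 大津法自动选阈值,输出取值为{0,1}的二值化数据
    _, binary = cv2.threshold(fm, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary
```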
其次,当得到细节丰富图像块对应的图像特征数据集中特征图的二值化数据后,计算得到细节丰富图像块中任意两个图像块的相似度。本实施例中,为了减轻对运算资源的占用率,采用异或匹配算法计算相似度。具体计算相似度的公式如下:
F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
其中,F为相似度,N为细节丰富图像块的图像尺寸,P(i,j)和Q(i,j)为任意两个细节丰富图像块分别对应的图像块的特征图,i为图像块的特征图像素的横坐标值,j为图像块的特征图像素的纵坐标值。
下面举例进行说明:终端设备任意选取细节丰富图像块对应的图像特征数据集中两张特征图,计算这两张特征图的相似度,该两张相似图为第一特征图P和第二特征图Q。“P(i,j)”表示第一特征图经过二值化处理后,第一特征图在图中坐标(i,j)的二值化数据,例如:“P(1,1)=1”,表示第一特征图在(1,1)坐标上的二值化数据为1。同理“Q(i,j)”表示第二特征图经过二值化处理后,第二特征图在图中坐标(i,j)的二值化数据。由于第一特征图与第二特征图的图像尺寸一致,因此,选取各个特征图上相同坐标的二值化数据进行异或计算。然后将特征图上,所有坐标的异或计算的结果求和后,除以一张特征图上像素坐标的总数(“N*N”)。最后的计算结果,即这两张特征图所对应的图像块的相似度。
通过上述计算方法,可计算得到细节丰富图像块对应的图像特征数据集中任意两个图像块的相似度,当相似度大于第一阈值时,可确定这两个图像块为相似图像块。第一阈值根据实际需要确定,此处不作限定,在一种可选方案里,第一阈值为0.7。
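按上述公式与示例,异或匹配的一个示意性Python实现如下。其中“取1减平均异或值作为相似度”是本示例为与“相似度大于第一阈值则判定为相似”的判定方向保持一致而做的假设:

```python
import numpy as np

def xor_similarity(p: np.ndarray, q: np.ndarray) -> float:
    """P、Q为同尺寸(N×N)的二值化特征图,逐坐标做异或后求平均。"""
    assert p.shape == q.shape
    mismatch = np.logical_xor(p.astype(bool), q.astype(bool)).mean()
    return 1.0 - float(mismatch)   # 完全相同的两块相似度为1(本示例的假设)

# 用法:相似度大于第一阈值(文中可选方案为0.7)即判定为相似图像块
# is_similar = xor_similarity(P_bin, Q_bin) > 0.7
```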
为了便于理解上述步骤(D1-D2),请参阅图4c,图4c为本申请实施例提供的超分辨率图像处理方法的一个流程示意图。终端设备获取低分辨率图像后,根据该低分辨率图像生成低分辨率图像块。使用第一网络模型对低分辨率图像块进行处理,在卷积处理的过程中输出结果,得到细节丰富图像块对应的图像特征数据集。然后对细节丰富图像块对应的图像特征数据集中的特征图,进行二值化处理。最后通过计算相似度,以确定相似图像块。
本实施例中,通过上述方式,提供了如何确定相似图像块的具体实施方式,提高了本方案的实现灵活性。通过对特征图的二值化处理,并使用异或匹配算法计算相似度,在占用较少计算资源的情况下,可以得到图像块较为准确的相似度。
二、在该低分辨率图像以外的图像确定相似图像块。
需要说明的是,在该低分辨率图像以外的图像确定相似图像块,存在以下两种情况(1)、该低分辨率图像以外的图像,与该低分辨率图像同属于同一视频文件。(2)、该低分辨率图像以外的图像,与该低分辨率图像不属于同一视频文件。对于(2)这种情况,具体确定相似图像块的方法,与图5对应的实施例类似,此处不再赘述。
对于(1)这种情况,在本实施例中进行说明。请参阅图6,图6为本申请实施例提出的一种确定相似图像块的流程示意图。
F1、确定低分辨率图像在视频中的位置。
步骤F1中,终端设备获取的低分辨率图像来自视频文件。首先,当该低分辨率图像为视频中的一帧(第一图像帧)时,终端设备确定该低分辨率图像在该视频中的位置,也就是第一图像帧在该视频中的位置信息。例如,确定某一细节丰富图像块对应的低分辨率图像,该低分辨率图像(第一图像帧)在视频文件中位置为第10帧。
F2、根据位置,确定第二图像帧。
步骤F2中,由于视频文件具有连贯性,相邻帧的画面相似程度较高,因此,当确定了某一细节丰富图像块所对应的低分辨率图像在视频文件中的位置后(即确定了第一图像帧的位置后),寻找该第一图像帧前后一定范围内的图像帧,并确定其中的一个或多个图像帧为第二图像帧。在一种可选的实现方式中,该第二图像帧为包含完整图像信息的关键帧。
F3、根据第二图像帧,确定相似图像块。
步骤F3中,终端设备确定第二图像帧后,将该第二图像帧进行解码,获取对应的图像文件。终端设备对该图像文件进行分割,生成多个图像块。
示例性地,在第二图像帧进行解码获取的图像中选取任意一个图像块,选取任意一个细节丰富图像块,使用上述两种图像块进行计算,以确定在第二图像帧对应的图像文件中哪一图像块为相似图像块。具体计算方法类似图5对应的流程,此处不再赘述。
示例性地,确定第二图像帧中与细节丰富图像块对应位置的图像块为相似图像块。根据细节丰富图像块在低分辨率图像中的坐标,在该图像文件的相同坐标位置,确定该图像块作为相似图像块。例如:将低分辨率图像分割为3*3,一共9个图像块,该细节丰富图像块为该低分辨率图像中第一行第一列的图像块。则终端设备将第二图像帧解码得到的图像文件同样分割为3*3(低分辨率图像与该图像文件的尺寸大小一致),终端设备确定该图像文件第一行第一列的图像块为相似图像块。
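下面用一个简短的Python草图示意“取相邻帧中对应网格位置的图像块作为相似图像块”的做法(函数名与按固定网格定位的方式为本示例的假设,并假设两帧图像尺寸一致):

```python
def colocated_block(decoded_frame, row: int, col: int, block: int = 32):
    """在第二图像帧解码得到的图像中,取与细节丰富图像块同一网格坐标
    (第row行、第col列)的图像块作为相似图像块。"""
    y, x = row * block, col * block
    return decoded_frame[y:y + block, x:x + block]
```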
本实施例中,通过上述方式,提供了如何确定相似图像块的具体实施方式,可以通过多种方式确定相似图像块,提升了本方案的实现灵活性。在占用较低计算资源的前提下,可确定相似图像块,降低了部署本申请提出的超分辨率图像处理方法的终端设备的功耗。
407、对相似图像块和低分辨率图像进行超分辨率处理,生成超分辨率图像。
本实施例中,终端设备对相似图像块和低分辨率图像进行超分辨率处理,生成超分辨率图像。以终端设备通过第一超分辨率网络模型进行超分辨率图像处理为例,终端设备可以通过多种方案生成不同的超分辨率图像。下面分别进行说明:
第一种处理方式:不进行二次超分辨率处理。
在这种情况下,终端设备仅使用第一超分辨率网络模型进行超分辨率处理。具体的,请参阅图7,图7为本申请实施例提出的一种超分辨率处理的流程示意图。
G1、对细节丰富图像块和相似图像块进行图像交换处理,生成第一交换图像。
步骤G1中,终端设备对细节丰富图像块和相似图像块进行图像交换处理,生成第一交换图像。生成的第一交换图像,兼具细节丰富图像块和相似图像块的特征。具体的,可以通过下列方式中的一种或多种,进行图像交换处理:“concat方式”、“concat+add方式”或“image swap”。
G2、根据相似图像块,在细节丰富图像块对应的图像特征数据集中确定相似图像块的特征图,并对相似图像块的特征图和低分辨率图像块的特征图进行图像交换处理,生成第二交换图像。
步骤G2中,终端设备根据相似图像块,在细节丰富图像块对应的图像特征数据集中确定相似图像块的特征图,并对相似图像块的特征图和低分辨率图像块的特征图进行图像交换处理,生成第二交换图像。生成的第二交换图像,兼具相似图像块的特征图和低分辨率图像块的特征图的特征。具体的,可以通过下列方式中的一种或多种,进行图像交换处理:“concat方式”、“concat+add方式”或“feature map Swap”。
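以“concat方式”为例,下面给出图像交换处理的一个示意性草图(基于PyTorch;“concat+add方式”的具体组合方式专利文本未唯一限定,此处的写法是本示例的假设):

```python
import torch

def concat_exchange(block_a: torch.Tensor, block_b: torch.Tensor) -> torch.Tensor:
    """“concat方式”:沿通道维拼接两个图像块(或其特征图),
    使输出同时携带二者的信息。输入形状均为 (B, C, H, W)。"""
    return torch.cat([block_a, block_b], dim=1)          # (B, 2C, H, W)

def concat_add_exchange(block_a: torch.Tensor, block_b: torch.Tensor) -> torch.Tensor:
    """“concat+add方式”的一种可能组合(本示例的假设):
    先逐元素相加,再与原始输入一并拼接。"""
    return torch.cat([block_a + block_b, block_a, block_b], dim=1)
```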
G3、对第一交换图像、第二交换图像和低分辨率图像进行超分辨率处理,生成第一图像。
步骤G3中,终端设备对第一交换图像、第二交换图像和低分辨率图像进行超分辨率处理,生成第一图像。相似图像块和相似图像块的特征图作为超分辨率处理的参考图。
对于上述超分辨率处理过程,可使用下列公式描述:
HR(H,W)=Conv(Input(4,H,W,1));
其中,“Input(4,H,W,1)”表示输入4张相似的“RGB”或者“YUV”图像,每张图像的通道数为1,“HR(H,W)”表示超分辨率图像。
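结合上述公式,下面给出第一超分辨率网络模型的一个极简示意,仅用于说明“4张相似图像按通道堆叠输入、卷积后上采样输出HR(H,W)”的思路;网络深度、通道数与放大倍数(此处为2)均为本示例的假设:

```python
import torch
import torch.nn as nn

class TinySR(nn.Module):
    """一个示意性的超分辨率网络:输入 (B, 4, H, W),
    即低分辨率图像块与3张参考图按通道堆叠;输出 (B, 1, 2H, 2W)。"""
    def __init__(self, scale: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(4, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),   # 亚像素上采样,放大scale倍
        )

    def forward(self, x):
        return self.body(x)
```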
G4、根据第一图像生成超分辨率图像。
步骤G4中,若终端设备不进行二次超分辨率处理,则步骤G3生成的第一图像就是超分辨率图像。第一超分辨率网络模型输出的第一图像为超分辨率图像。
第二种处理方式:进行二次超分辨率处理。若终端设备进行二次超分辨率处理,则步骤G4内容具体请参见图8a。图8a为本申请实施例提出的一种超分辨率处理的流程示意图。
H1、获取高清图像。
步骤H1中,终端设备获取高清图像,高清图像的分辨率大于低分辨率图像的分辨率。该高清图像存在两种可能的来源。(一)、该高清图像来自预置高清图库。(二)、该高清图像来自云计算设备系统,该云计算设备系统使用部署于云计算设备系统的第三超分辨率网络模型,基于终端设备发送的低分辨率图像,生成该高清图像。下面分别进行说明。
(一)、该高清图像来自预置高清图库。该预置高清图库中存储有大量分辨率较高的图像,以便终端设备使用这些图像提升超分辨率处理的精度,提升超分辨率图像的清晰度。
(二)、该高清图像来自云计算设备,该云计算设备(该云计算设备系统)使用部署于云计算设备系统的第三超分辨率网络模型,基于终端设备发送的低分辨率图像,生成该高清图像。在这种情况下,终端设备获取低分辨率图像(步骤401)后,除了终端设备自身执行后续步骤,终端设备将该低分辨率图像发送至部署有超分辨率图像处理装置的云计算设备系统中,由云计算设备系统执行后续步骤402-406,云计算设备系统基于该低分辨率图像生成高清图像(使用第三超分辨率网络模型)。云计算设备系统将生成的高清图像发送给该终端设备,终端设备使用该高清图像和自身超分图像处理生成的图像,再进行超分辨率处理,最后生成超分辨率图像,该高清图像作为超分辨率处理的参考图,以提升超分辨率图像的清晰度。该第三超分辨率网络模型具有占用计算资源大,超分辨率处理效果好等特点(与部署于终端设备的第一超分辨率网络模型和第二超分辨率网络模型对比)。
当终端设备获取高清图像后,使用该高清图像作为参考图。终端设备需要确定该高清图像与低分辨率图像的相似图像块。具体确定相似图像块的方法,类似前述实施例描述内容,此处不再赘述。
H2、通过对细节不丰富图像块进行放大处理,生成放大图像。
步骤H2中,为了进一步提升超分辨率图像的清晰度,终端设备可使用前述步骤得到的细节不丰富图像块。对这些细节不丰富图像块进行放大处理,生成放大图像。具体的,放大处理包括:双三次插值处理(Bicubic)或“linearf”。
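以OpenCV的双三次插值为例,放大处理的一个示意性草图如下(放大倍数为本示例的假设):

```python
import cv2

def upscale_bicubic(block, scale: int = 2):
    """对细节不丰富图像块做双三次插值(Bicubic)放大,得到放大图像。"""
    h, w = block.shape[:2]
    # cv2.resize 的目标尺寸按 (宽, 高) 给出
    return cv2.resize(block, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)
```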
当终端设备获取放大图像后,使用该放大图像作为参考图。终端设备需要确定该放大图像与低分辨率图像的相似图像块。具体确定相似图像块的方法,类似前述实施例描述内容,此处不再赘述。
H3、通过第二超分辨率网络模型对高清图像、放大图像和第一图像进行超分辨率处理,生成超分辨率图像。
步骤H3中,终端设备对高清图像、放大图像和第一图像进行超分辨率处理,生成超分辨率图像。具体的,终端设备通过第二超分辨率网络模型对高清图像中与第一图像相似的图像块、放大图像中与第一图像相似的图像块,和第一图像进行超分辨率处理,生成超分辨率图像。其中,生成该超分辨率图像的具体过程,类似前述步骤G1-G3所示内容,此处不再赘述。
对于上述超分辨率处理过程,可使用下列公式描述:
HR(H,W)=Conv(HR_MOBILE(H,W),HR_CLOUD(H,W))
其中,“HR_MOBILE(H,W)”表示终端设备得到的图像(包括放大图像和第一图像),“HR_CLOUD(H,W)”表示云计算设备系统生成的高清图像,“HR(H,W)”表示超分辨率图像。
本申请实施例中,部署于终端设备的超分辨率图像处理装置,使用第一网络模型对获取的低分辨率图像进行识别,并确定细节丰富图像块和细节不丰富图像块。对于细节丰富图像块,使用超分辨率网络模型进行超分辨率处理;对于细节不丰富图像块,经过放大处理后,再与细节丰富图像块一起进行超分辨率处理,从而有效降低计算量,降低终端设备的能耗。对于细节丰富图像块,进一步使用异或匹配算法确定相似图像块,在提高相似图像块匹配精度的前提下,降低了计算量,并提升了超分辨率图像的清晰度。最后,使用多种不同的参考图进行超分辨率处理,有效提升超分辨率图像的清晰度。具体请参见图8b,图8b为本申请实施例中一种仿真实验示意图,峰值信噪比(peak signal to noise ratio,PSNR)是现今最普遍、使用最为广泛的一种图像客观评价指标。“Ours(加相似块)”为本申请提出的超分辨率图像处理方法,可以看出在参数量大幅缩小(减少计算量)的情况下,依然保持了较高的PSNR。而PSNR是基于对应像素点间的误差,即基于误差敏感的图像质量评价。该指标并未考虑到人眼的视觉特性(人眼对空间频率较低的对比差异敏感度较高,对亮度对比差异的敏感度比色度高,对一个区域的感知结果会受到其周围邻近区域的影响等),因而经常出现评价结果与人的主观感觉不一致的情况。因此,请参阅图8c与图8d,图8c为插值算法的计算结果示意图,图8d为本申请实施例提出的超分辨率图像处理方法的计算结果示意图。图8c和图8d所展示的是,插值算法(Bicubic)和本申请实施例提出的超分辨率图像处理方法,对同一张低分辨率图像进行处理,所生成的不同计算结果。可直观看出,本申请实施例提出的超分辨率图像处理方法所生成的图像,纹理细节更加清晰,而且没有引入负向效果。请参阅图8e,图8e为本申请实施例中一种仿真实验示意图。图8e展示的是相较于插值算法(Bicubic),本申请实施例提出的超分辨率图像处理方法在不同场景下所节省的计算量。在不同场景下,可以减少20%-60%的计算量。需要说明的是,这仅是一种可能的仿真实验结果,根据实际硬件的不同,还可以存在其它的仿真实验结果,此处不作限定。
基于前述实施例,对于第一网络模型,超分辨率图像处理装置还可以生成第一训练集以训练该第一网络模型。具体的,请参阅图9,图9为本申请实施例中一种生成训练集的流程示意图。
901、获取图像集合。
步骤901中,超分辨率图像处理装置获取各种图像,由这些图像构成图像集合。该图像集合中包括:从互联网和已公开的数据集中搜集具有不同纹理特征的图片,如动物、天空、人脸或建筑物等,并将不同类别的图片等比例混合得到训练集。来源有“DIV2K”或“Timofte 91 images”等数据集,或搜索引擎获得的图像等等。还可以包括超分辨率图像处理装置在处理图像的过程中,生成的细节丰富图像块对应的图像特征数据集。
902、使用低通滤波器对图像集合进行滤波处理,生成第一子训练集。
步骤902中,使用低通滤波器对图像集合进行滤波处理,即删除图像集合中细节较少,较为平滑的图像文件,生成第一子训练集。在一种可选的实现方式中,第一子训练集中的图像携带细节丰富标签(label),该细节丰富标签用于标识该图像具有丰富的图像特征信息。该标签可以是人工标记的。第一子训练集中图像包括的图像特征信息的数量大于细节不丰富图像块包括的图像特征信息的数量。
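下面给出“用低通滤波删除平滑图像”的一个示意性Python草图:以高斯模糊作为低通滤波器,以原图与低通结果之差(高频残差)的均值作为细节度量;滤波器类型与阈值均为本示例的假设:

```python
import cv2
import numpy as np

def filter_detailed_images(images, thresh: float = 10.0):
    """高频残差能量低于阈值的平滑图像被删除,其余进入第一子训练集。"""
    kept = []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img
        lowpass = cv2.GaussianBlur(gray, (9, 9), 0)   # 低通滤波
        high_freq = cv2.absdiff(gray, lowpass)        # 高频残差
        if high_freq.mean() > thresh:                 # 细节足够丰富才保留
            kept.append(img)
    return kept
```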
903、对第一子训练集进行数据增广处理,生成第二子训练集。
步骤903中,对第一子训练集进行数据增广处理,生成第二子训练集。具体的,数据增广处理包括:图像颠倒,图像旋转,图像缩小和图像拉伸。数据增广处理还包括:裁剪、平移、仿射、透视、高斯噪声、不均匀光、动态模糊和随机颜色填充等。选择数据增广处理中的一种或多种对第一子训练集中的图像文件进行处理。既可以是对同一图像文件进行多种数据增广处理,也可以是对多个图像文件进行同一种数据增广处理,此处不作限定。
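以OpenCV为例,下面示意对第一子训练集图像做上述四种数据增广处理的一种写法(旋转方向与缩放比例为本示例的假设):

```python
import cv2
import numpy as np

def augment(image: np.ndarray):
    """对单张训练图像做数据增广:颠倒、旋转、缩小与拉伸。"""
    flipped = cv2.flip(image, 0)                            # 图像颠倒(上下翻转)
    rotated = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)    # 图像旋转
    h, w = image.shape[:2]
    shrunk = cv2.resize(image, (w // 2, h // 2))            # 图像缩小
    stretched = cv2.resize(image, (int(w * 1.5), h))        # 图像拉伸
    return [flipped, rotated, shrunk, stretched]
```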
904、根据第一子训练集和第二子训练集生成第一训练集。
步骤904中,超分辨率图像处理装置根据第一子训练集和第二子训练集生成第一训练集。
本申请实施例中,通过生成第一训练集,对第一网络模型进行训练,可有效提升第一网络模型的精度。
上述主要以方法的角度对本申请实施例提供的方案进行了介绍。可以理解的是,上述超分辨率图像处理装置为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的模块及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对超分辨率图像处理装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
下面对本申请中的超分辨率图像处理装置1000进行详细描述,请参阅图10,图10为本申请实施例中超分辨率图像处理装置1000的一种实施例示意图。
该超分辨率图像处理装置1000包括生成模块1001、确定模块1002和获取模块1003:
生成模块1001,用于根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,其中,该细节丰富图像块和该细节不丰富图像块的尺寸小于该低分辨率图像,该细节丰富图像块包括的图像特征信息的数量大于该细节不丰富图像块包括的图像特征信息的数量;
确定模块1002,用于根据生成模块1001生成的该细节丰富图像块确定相似图像块,其中,该相似图像块包括的图像特征信息与该细节丰富图像块包括的图像特征信息的相似度大于第一阈值;
该生成模块1001,还用于对确定模块1002所确定的该相似图像块和该低分辨率图像进行超分辨率处理,生成超分辨率图像,其中,该相似图像块作为该低分辨率图像的参考图。
在本申请的一些实施例中,该生成模块1001,具体用于根据该低分辨率图像生成该图像块集合,该图像块集合中包括至少一个低分辨率图像块;
该确定模块1002,具体用于对生成模块1001生成的该低分辨率图像块进行处理,确定该细节丰富图像块和细节不丰富图像块;
确定模块1002,具体用于对第一卷积数据集进行二分类处理,确定细节丰富图像块和细节不丰富图像块。
在本申请的一些实施例中,该生成模块1001,具体用于根据该细节丰富图像块,确定该细节丰富图像块对应的图像特征数据集;
该生成模块1001,具体用于对确定的该图像特征数据集进行二值化处理,得到该细节丰富图像块中任意两个图像块的相似度;
该确定模块1002,具体用于当该任意两个图像块的相似度大于该第一阈值时,确定该相似图像块。
在本申请的一些实施例中,任意两个图像块的相似度满足:
F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
其中,该F为该相似度,该N为该细节丰富图像块的图像尺寸,该P(i,j)和Q(i,j)为该任意两个细节丰富图像块分别对应的该图像块的该特征图,该i为该图像块的该特征图像素的横坐标值,该j为该图像块的该特征图像素的纵坐标值。
在本申请的一些实施例中,该确定模块1002,具体用于当该低分辨率图像为视频中的一帧时,确定该低分辨率图像在该视频中的位置;
该确定模块1002,具体用于根据该位置,确定第二图像帧,其中,该第二图像帧为该低分辨率图像在该视频中的相邻帧;
该确定模块1002,具体用于确定该第二图像帧中与该细节丰富图像块对应位置的图像块为该相似图像块。
在本申请的一些实施例中,该生成模块1001,具体用于对生成模块1001生成的该细节丰富图像块和确定模块1002确定的该相似图像块进行图像交换处理,生成第一交换图像;
该生成模块1001,具体用于根据确定模块1002确定的该相似图像块,确定相似图像块的特征图,并对该相似图像块的特征图和该低分辨率图像块的特征图进行图像交换处理,生成第二交换图像,其中特征图用于指示图像块的图像特征信息;
该生成模块1001,具体用于通过该第一超分辨率网络模型对该第一交换图像、该第二交换图像和该低分辨率图像进行超分辨率处理,生成第一图像;
该生成模块1001,具体用于根据生成模块1001生成的该第一图像生成该超分辨率图像。
在本申请的一些实施例中,该超分辨率图像处理装置1000还包括获取模块1003;
该获取模块1003,用于获取高清图像,该高清图像的分辨率大于该低分辨率图像的分辨率;
该生成模块1001,具体用于对生成模块1001生成的该细节不丰富图像块进行放大处理以生成放大图像,其中该放大处理包括双三次插值处理;
该生成模块1001,具体用于通过对获取模块1003获取的该高清图像、该放大图像和该第一图像进行超分辨率处理,生成该超分辨率图像,其中,该高清图像和该放大图像作为该第一图像的参考图。
在本申请的一些实施例中,该高清图像来自远端设备,该高清图像为该远端设备通过 对该低分辨率图像进行超分辨率处理生成的图像。
在本申请的一些实施例中,该高清图像来自预置高清图库。
在本申请的一些实施例中,该确定模块1002,还用于根据生成模块1001生成的该第一图像确定该预置高清图库中的该高清图像,该高清图像与该第一图像相似度大于第一阈值。
在本申请的一些实施例中,
该获取模块1003,还用于获取图像集合;
该生成模块1001,还用于使用低通滤波器对获取模块1003获取的该图像集合进行滤波处理,生成第一子训练集;
该生成模块1001,还用于对生成模块1001生成的该第一子训练集进行数据增广处理,生成第二子训练集,该数据增广处理包括图像颠倒,图像旋转,图像缩小和图像拉伸;
该生成模块1001,还用于根据该第一子训练集和该第二子训练集生成第一训练集,该第一训练集用于训练该第一网络模型,第一网络模型用于生成细节丰富图像块和细节不丰富图像块。
在本申请的一些实施例中,获取模块1003可以执行如图4a所示的实施例中步骤401;生成模块1001可以执行如图4a所示的实施例中步骤402以及步骤404;生成模块1001还可以执行如图4a所示的实施例中步骤407;确定模块1002可以执行如图4a所示的实施例中步骤405-406。
通过前述实施例的举例说明可知,本申请实施例中,超分辨率图像处理装置1000使用第一网络模型对获取的低分辨率图像进行识别,并确定细节丰富图像块和细节不丰富图像块。对于细节丰富图像块,使用超分辨率网络模型进行超分辨率处理;对于细节不丰富图像块,经过放大处理后,再与细节丰富图像块一起进行超分辨率处理。从而有效降低计算量。降低部署该超分辨率图像处理装置的计算设备的能耗。对于细节丰富图像块,进一步使用异或匹配算法确定相似图像块,在提高相似图像块匹配精度的前提下,降低了计算量,并提升了超分辨率图像的清晰度。最后,使用多种不同的参考图进行超分辨率处理,有效提升超分辨率图像的清晰度。
本申请实施例还提供了一种计算设备,请参阅图11,图11是本申请实施例提供的计算设备一种结构示意图,计算设备1100上可以部署有图10对应实施例中所描述的超分辨率图像处理装置1000,用于实现图10对应实施例中超分辨率图像处理装置的功能,具体的,计算设备1100可以是云计算设备系统、终端设备或边缘计算设备系统中的一个计算设备。需要说明的是,超分辨率图像处理装置1000可以部署在计算设备1100上以实现前述超分辨率图像处理装置实现的功能。计算设备1100可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)1122(例如,一个或一个以上处理器)和存储器1132,一个或一个以上存储应用程序1142或数据1144的存储介质1130(例如一个或一个以上海量存储设备)。其中,存储器1132和存储介质1130可以是短暂存储或持久存储。存储在存储介质1130的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对计算设备中的一系列指令操作。更进一步地,中央处理器1122 可以设置为与存储介质1130通信,在计算设备1100上执行存储介质1130中的一系列指令操作。
计算设备1100还可以包括一个或一个以上电源1126,一个或一个以上有线或无线网络接口1150,一个或一个以上输入输出接口1158,和/或,一个或一个以上操作系统1141,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
本申请实施例中,中央处理器1122,用于执行前述描述的超分辨率图像处理方法。
需要说明的是,中央处理器1122执行上述各个步骤的具体方式,与本申请中前述各个方法实施例基于同一构思,其带来的技术效果与本申请中前述各个方法实施例相同,具体内容可参见本申请前述所示的方法实施例中的叙述,此处不再赘述。
应注意,本申请实施例中的处理器可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本申请实施例中还提供一种计算机程序产品,当其在计算机上运行时,使得计算机执行如前述实施例描述的方法中超分辨率图像处理装置所执行的步骤。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有用于进行超分辨率图像处理的程序,当其在计算机上运行时,使得计算机执行如前述实施例描述的方法中超分辨率图像处理装置所执行的步骤。
本申请实施例还提供一种芯片,芯片包括:处理单元和通信单元,该处理单元例如可以是处理器,该通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使执行设备内的芯片执行上述实施例描述的构建训练集的方法。可选地,该存储单元为该芯片内的存储单元,如寄存器、缓存等,该存储单元还可以是该超分辨率图像处理装置内的位于该芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图12,图12为本申请实施例提供的芯片的一种结构示意图,该芯片可以表现为神经网络处理器NPU 1200,NPU 1200作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路1203,通过控制器1204控制运算电路1203提取存储器中的矩阵数据并进行乘法运算。
在一些实现中,运算电路1203内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路1203是二维脉动阵列。运算电路1203还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路1203是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器1202中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器1201中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)1208中。
统一存储器1206用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(Direct Memory Access Controller,DMAC)1205被搬运到权重存储器1202中。输入数据也通过DMAC被搬运到统一存储器1206中。
BIU即Bus Interface Unit,总线接口单元1210,用于AXI总线与DMAC和取指存储器(Instruction Fetch Buffer,IFB)1209的交互。
总线接口单元1210(Bus Interface Unit,简称BIU),用于取指存储器1209从外部存储器获取指令,还用于存储单元访问控制器1205从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器1206,或将权重数据搬运到权重存储器1202中,或将输入数据搬运到输入存储器1201中。
向量计算单元1207包括多个运算处理单元,在需要的情况下,对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元1207能将经处理的输出的向量存储到统一存储器1206。例如,向量计算单元1207可以将线性函数和/或非线性函数应用到运算电路1203的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元1207生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路1203的激活输入,例如用于在神经网络中的后续层中的使用。
控制器1204连接的取指存储器(instruction fetch buffer)1209,用于存储控制器1204使用的指令;
统一存储器1206,输入存储器1201,权重存储器1202以及取指存储器1209均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,图4a至图8a所示的各个超分辨率网络模型中各层的运算可以由运算电路1203或向量计算单元1207执行。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述第一方面方法的程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、超分辨率图像处理装置、计算设备或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、超分辨率图像处理装置、计算设备或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的训练设备、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(Solid State Disk,SSD))等。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
另外,本文中术语“系统”和“网络”在本文中常被可互换使用。本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
应理解,在本申请实施例中,“与A相应的B”表示B与A相关联,根据A可以确定B。但还应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例方法的全部或部分步骤。
总之,以上所述仅为本申请技术方案的较佳实施例而已,并非用于限定本申请的保护范围。凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (27)

  1. 一种超分辨率图像处理方法,其特征在于,包括:
    根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,其中,所述细节丰富图像块和所述细节不丰富图像块的尺寸小于所述低分辨率图像,所述细节丰富图像块包括的图像特征信息的数量大于所述细节不丰富图像块包括的图像特征信息的数量;
    根据所述细节丰富图像块确定相似图像块,其中,所述相似图像块包括的图像特征信息与所述细节丰富图像块包括的图像特征信息的相似度大于第一阈值;
    对所述相似图像块和所述低分辨率图像进行超分辨率处理,生成超分辨率图像,其中,所述相似图像块作为所述低分辨率图像的参考图。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述低分辨率图像生成所述细节丰富图像块和所述细节不丰富图像块,包括:
    根据所述低分辨率图像生成图像块集合;
    对所述图像块集合中的图像块进行卷积处理,生成第一卷积数据集;
    对所述第一卷积数据集进行二分类处理,确定所述细节丰富图像块和所述细节不丰富图像块。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述细节丰富图像块确定所述相似图像块,包括:
    根据所述细节丰富图像块,确定所述细节丰富图像块对应的图像特征数据集;
    对确定的所述图像特征数据集进行二值化处理,得到所述细节丰富图像块中任意两个图像块的相似度;
    当所述任意两个图像块的相似度大于所述第一阈值时,确定所述相似图像块。
  4. 根据权利要求3所述的方法,其特征在于,所述任意两个图像块的相似度满足:
    F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
    其中,所述F为所述相似度,所述N为所述细节丰富图像块的图像尺寸,所述P(i,j)和Q(i,j)为所述任意两个细节丰富图像块分别对应的所述图像块的所述特征图,所述i为所述图像块的所述特征图像素的横坐标值,所述j为所述图像块的所述特征图像素的纵坐标值。
  5. 根据权利要求1所述的方法,其特征在于,当所述低分辨率图像为视频中的一帧时,所述根据所述细节丰富图像块确定所述相似图像块,包括:
    确定所述低分辨率图像在所述视频中的位置;
    根据所述位置,确定第二图像帧,其中,所述第二图像帧为所述低分辨率图像在所述视频中的相邻帧;
    确定所述第二图像帧中与所述细节丰富图像块对应位置的图像块为所述相似图像块。
  6. 根据权利要求3-5中任一项所述的方法,其特征在于,所述对所述相似图像块和所述低分辨率图像进行超分辨率处理,生成所述超分辨率图像,包括:
    对所述细节丰富图像块和所述相似图像块进行图像交换处理,生成第一交换图像;
    根据所述相似图像块,确定所述相似图像块的特征图,并对所述相似图像块的特征图和所述低分辨率图像块的特征图进行图像交换处理,生成第二交换图像,其中所述特征图用于指示图像块的图像特征信息;
    对所述第一交换图像、所述第二交换图像和所述低分辨率图像进行超分辨率处理,生成第一图像;
    根据所述第一图像生成所述超分辨率图像。
  7. 根据权利要求6所述的方法,其特征在于,所述根据所述第一图像生成所述超分辨率图像,包括:
    获取高清图像,所述高清图像的分辨率大于所述低分辨率图像的分辨率;
    对所述细节不丰富图像块进行放大处理以生成放大图像,其中所述放大处理包括双三次插值处理;
    对所述高清图像、所述放大图像和所述第一图像进行超分辨率处理,生成所述超分辨率图像,其中,所述高清图像和所述放大图像作为所述第一图像的参考图。
  8. 根据权利要求7所述的方法,其特征在于,所述高清图像来自远端设备,所述高清图像为所述远端设备通过对所述低分辨率图像进行超分辨率处理生成的图像。
  9. 根据权利要求7所述的方法,其特征在于,所述高清图像来自预置高清图库,所述预置高清图库中包括至少一张所述高清图像。
  10. 根据权利要求9所述的方法,其特征在于,所述获取所述高清图像之前,所述方法还包括:
    根据所述第一图像确定所述预置高清图库中的所述高清图像,所述高清图像与所述第一图像相似度大于所述第一阈值。
  11. 根据权利要求1所述的方法,其特征在于,所述根据低分辨率图像生成细节丰富图像块和细节不丰富图像块之前,所述方法还包括:
    获取图像集合;
    使用低通滤波器对所述图像集合进行滤波处理,生成第一子训练集,所述第一子训练集中图像包括的图像特征信息的数量大于所述细节不丰富图像块包括的图像特征信息的数量;
    对所述第一子训练集进行数据增广处理,生成第二子训练集,所述数据增广处理包括图像颠倒,图像旋转,图像缩小和图像拉伸;
    根据所述第一子训练集和所述第二子训练集生成第一训练集,所述第一训练集用于训练第一网络模型,所述第一网络模型用于生成所述细节丰富图像块和所述细节不丰富图像块。
  12. 根据权利要求1-11中任一项所述的方法,其特征在于,所述图像特征信息包括图像的边缘信息、图像的轮廓信息、图像的亮度信息和/或图像的颜色信息。
  13. 一种超分辨率图像处理装置,其特征在于,包括:
    生成模块,用于根据低分辨率图像生成细节丰富图像块和细节不丰富图像块,其中,所述细节丰富图像块和所述细节不丰富图像块的尺寸小于所述低分辨率图像,所述细节丰富图像块包括的图像特征信息的数量大于所述细节不丰富图像块包括的图像特征信息的数量;
    确定模块,用于根据所述细节丰富图像块确定相似图像块,其中,所述相似图像块包括的图像特征信息与所述细节丰富图像块包括的图像特征信息的相似度大于第一阈值;
    所述生成模块,还用于对所述相似图像块和所述低分辨率图像进行超分辨率处理,生成超分辨率图像,其中,所述相似图像块作为所述低分辨率图像的参考图。
  14. 根据权利要求13所述的装置,其特征在于,
    所述生成模块,具体用于根据所述低分辨率图像生成图像块集合;
    所述确定模块,具体用于对所述图像块集合中的图像块进行卷积处理,生成第一卷积数据集;
    所述确定模块,具体用于对所述第一卷积数据集进行二分类处理,确定所述细节丰富图像块和所述细节不丰富图像块。
  15. 根据权利要求14所述的装置,其特征在于,
    所述生成模块,具体用于根据所述细节丰富图像块,确定所述细节丰富图像块对应的图像特征数据集;
    所述生成模块,具体用于对确定的所述图像特征数据集进行二值化处理,得到所述细节丰富图像块中任意两个图像块的相似度;
    所述确定模块,具体用于当所述任意两个图像块的相似度大于所述第一阈值时,确定所述相似图像块。
  16. 根据权利要求15所述的装置,其特征在于,所述任意两个图像块的相似度满足:
    F = (1/(N×N)) × Σ_{i=1..N} Σ_{j=1..N} (P(i,j) ⊕ Q(i,j)) (式中“⊕”表示按位异或)
    其中,所述F为所述相似度,所述N为所述细节丰富图像块的图像尺寸,所述P(i,j)和Q(i,j)为所述任意两个细节丰富图像块分别对应的所述图像块的所述特征图,所述i为所述图像块的所述特征图像素的横坐标值,所述j为所述图像块的所述特征图像素的纵坐标值。
  17. 根据权利要求13所述的装置,其特征在于,
    所述确定模块,具体用于当所述低分辨率图像为视频中的一帧时,确定所述低分辨率图像在所述视频中的位置;
    所述确定模块,具体用于根据所述位置,确定第二图像帧,其中,所述第二图像帧为所述低分辨率图像在所述视频中的相邻帧;
    所述确定模块,具体用于确定所述第二图像帧中与所述细节丰富图像块对应位置的图像块为所述相似图像块。
  18. 根据权利要求15-17中任一项所述的装置,其特征在于,
    所述生成模块,具体用于对所述细节丰富图像块和所述相似图像块进行图像交换处理,生成第一交换图像;
    所述生成模块,具体用于根据所述相似图像块,确定所述相似图像块的特征图,并对所述相似图像块的特征图和所述低分辨率图像块的特征图进行图像交换处理,生成第二交换图像,其中所述特征图用于指示图像块的图像特征信息;
    所述生成模块,具体用于对所述第一交换图像、所述第二交换图像和所述低分辨率图像进行超分辨率处理,生成第一图像;
    所述生成模块,具体用于根据所述第一图像生成所述超分辨率图像。
  19. 根据权利要求18所述的装置,其特征在于,所述超分辨率图像处理装置还包括获取模块;
    所述获取模块,用于获取高清图像,所述高清图像的分辨率大于所述低分辨率图像的分辨率;
    所述生成模块,具体用于对所述细节不丰富图像块进行放大处理以生成放大图像,其中所述放大处理包括双三次插值处理;
    所述生成模块,具体用于对所述高清图像、所述放大图像和所述第一图像进行超分辨率处理,生成所述超分辨率图像,其中,所述高清图像和所述放大图像作为所述第一图像的参考图。
  20. 根据权利要求19所述的装置,其特征在于,所述高清图像来自远端设备,所述高清图像为所述远端设备通过对所述低分辨率图像进行超分辨率处理生成的图像。
  21. 根据权利要求19所述的装置,其特征在于,所述高清图像来自预置高清图库,所述预置高清图库中包括至少一张所述高清图像。
  22. 根据权利要求21所述的装置,其特征在于,
    所述确定模块,还用于根据所述第一图像确定所述预置高清图库中的所述高清图像,所述高清图像与所述第一图像相似度大于所述第一阈值。
  23. 根据权利要求13所述的装置,其特征在于,
    所述获取模块,还用于获取图像集合;
    所述生成模块,还用于使用低通滤波器对所述图像集合进行滤波处理,生成第一子训练集,所述第一子训练集中图像包括的图像特征信息的数量大于所述细节不丰富图像块包括的图像特征信息的数量;
    所述生成模块,还用于对所述第一子训练集进行数据增广处理,生成第二子训练集,所述数据增广处理包括图像颠倒,图像旋转,图像缩小和图像拉伸;
    所述生成模块,还用于根据所述第一子训练集和所述第二子训练集生成第一训练集,所述第一训练集用于训练第一网络模型,所述第一网络模型用于生成所述细节丰富图像块和所述细节不丰富图像块。
  24. 根据权利要求13-23中任一项所述的装置,其特征在于,所述图像特征信息包括图像的边缘信息、图像的轮廓信息、图像的亮度信息和/或图像的颜色信息。
  25. 一种计算设备,其特征在于,包括存储器和处理器,
    所述存储器,用于存储计算机指令;
    所述处理器执行所述存储器存储的计算机指令,以执行上述权利要求1至12中任一项所述的方法。
  26. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机指令,所述计算机指令被计算设备执行时,所述计算设备执行上述权利要求1至12中任一项所述的方法。
  27. 一种包含指令的计算机程序产品,其特征在于,当所述指令在计算机或处理器上运行时,使得所述计算机或所述处理器执行权利要求1至12中任一项所述的方法。
PCT/CN2020/134444 2019-12-09 2020-12-08 一种超分辨率图像处理方法以及相关装置 WO2021115242A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911252760.0 2019-12-09
CN201911252760.0A CN113034358A (zh) 2019-12-09 2019-12-09 一种超分辨率图像处理方法以及相关装置

Publications (1)

Publication Number Publication Date
WO2021115242A1 true WO2021115242A1 (zh) 2021-06-17

Family

ID=76329529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/134444 WO2021115242A1 (zh) 2019-12-09 2020-12-08 一种超分辨率图像处理方法以及相关装置

Country Status (2)

Country Link
CN (1) CN113034358A (zh)
WO (1) WO2021115242A1 (zh)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117501300A (zh) * 2021-06-28 2024-02-02 华为技术有限公司 图像处理方法和图像处理装置
CN116453028B (zh) * 2023-06-13 2024-04-26 荣耀终端有限公司 视频处理方法、存储介质及电子设备


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010108205A (ja) * 2008-10-30 2010-05-13 Hitachi Ltd 超解像画像作成方法
CN103632359A (zh) * 2013-12-13 2014-03-12 清华大学深圳研究生院 一种视频超分辨率处理方法
CN103985085A (zh) * 2014-05-26 2014-08-13 三星电子(中国)研发中心 图像超分辨率放大的方法和装置
CN109741258A (zh) * 2018-12-25 2019-05-10 广西大学 基于重建的图像超分辨率方法

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344791A (zh) * 2021-07-05 2021-09-03 中山大学 基于空洞卷积和特征融合的双目超分辨率图像检测方法、系统及介质
CN113344791B (zh) * 2021-07-05 2022-06-10 中山大学 基于空洞卷积和特征融合的双目超分辨率图像检测方法、系统及介质
CN113628116A (zh) * 2021-10-12 2021-11-09 腾讯科技(深圳)有限公司 图像处理网络的训练方法、装置、计算机设备和存储介质
CN114862796A (zh) * 2022-05-07 2022-08-05 北京卓翼智能科技有限公司 一种用于风机叶片损伤检测的无人机
WO2024088130A1 (zh) * 2022-10-25 2024-05-02 华为技术有限公司 显示方法和电子设备

Also Published As

Publication number Publication date
CN113034358A (zh) 2021-06-25

Similar Documents

Publication Publication Date Title
WO2021115242A1 (zh) 一种超分辨率图像处理方法以及相关装置
US11373275B2 (en) Method for generating high-resolution picture, computer device, and storage medium
Li et al. Low-light image and video enhancement using deep learning: A survey
JP6636154B2 (ja) 顔画像処理方法および装置、ならびに記憶媒体
CN107967669B (zh) 图片处理的方法、装置、计算机设备及存储介质
WO2018205676A1 (zh) 用于卷积神经网络的处理方法和系统、和存储介质
WO2022199583A1 (zh) 图像处理方法、装置、计算机设备和存储介质
WO2020125631A1 (zh) 视频压缩方法、装置和计算机可读存储介质
WO2022089657A1 (zh) 拼接图像的色差消除方法、装置、设备和可读存储介质
US20220335583A1 (en) Image processing method, apparatus, and system
WO2018082185A1 (zh) 图像处理方法和装置
WO2021027193A1 (zh) 人脸聚类方法、装置、设备和存储介质
WO2021052028A1 (zh) 图像颜色迁移方法、装置、计算机设备和存储介质
CN111583161A (zh) 模糊图像的增强方法、计算机设备和存储介质
CN111681177B (zh) 视频处理方法及装置、计算机可读存储介质、电子设备
CN111476710B (zh) 基于移动平台的视频换脸方法及系统
CN109389569B (zh) 基于改进DehazeNet的监控视频实时去雾方法
KR102263017B1 (ko) 3d cnn을 이용한 고속 영상 인식 방법 및 장치
WO2023000895A1 (zh) 图像风格转换方法、装置、电子设备和存储介质
WO2020151148A1 (zh) 基于神经网络的黑白照片色彩恢复方法、装置及存储介质
WO2022135574A1 (zh) 肤色检测方法、装置、移动终端和存储介质
WO2019090580A1 (en) System and method for image dynamic range adjusting
CN111383232A (zh) 抠图方法、装置、终端设备及计算机可读存储介质
Huang et al. Hybrid image enhancement with progressive laplacian enhancing unit
WO2022194079A1 (zh) 天空区域分割方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20900330

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20900330

Country of ref document: EP

Kind code of ref document: A1