WO2020062191A1 - Image processing method, apparatus and device - Google Patents

Image processing method, apparatus and device

Info

Publication number
WO2020062191A1
WO2020062191A1 (PCT/CN2018/108891; CN2018108891W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame
network model
resolution
sample
Prior art date
Application number
PCT/CN2018/108891
Other languages
French (fr)
Chinese (zh)
Inventor
谭文伟
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2018/108891 (WO2020062191A1)
Priority to CN201880093293.9A (CN112088393B)
Publication of WO2020062191A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • The present invention relates to image processing technology, and in particular, to an image processing method, apparatus, and device.
  • With the development of multimedia technology, users have increasingly high requirements for multimedia information; for example, high-resolution multimedia information (picture information or video information) has become the mainstream type of multimedia file.
  • When terminals need to exchange high-resolution multimedia information, they require high-speed broadband to transmit it, which greatly increases the cost of the exchange for both terminals. Therefore, users usually convert high-resolution multimedia information into low-resolution multimedia information and send the low-resolution version to the other terminal, reducing the interaction cost.
  • After receiving the low-resolution multimedia information, the receiving terminal needs to restore it to high-resolution multimedia information in order to obtain more detailed information. In practice, it has been found that the quality of the restored high-resolution multimedia information is poor.
  • The invention provides an image processing method, apparatus, and device that improve the accuracy of converting a low-resolution image into a high-resolution image, thereby improving the quality of the high-resolution image.
  • In a first aspect, an embodiment of the present invention provides an image processing method.
  • The method includes: acquiring a target image requiring super-resolution processing; and inputting the target image into a super-resolution network model for processing to obtain a high-resolution image, where the network parameters of the super-resolution network model are obtained by adjustment according to multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature map is obtained through semantic recognition by an image semantic network model.
  • Because the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the super-resolution network model is a semantically enhanced network model: it can convert a low-resolution image into a semantically enhanced high-resolution image that provides more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
  • Optionally, an error of the super-resolution network model is determined according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image; when the error is greater than a preset error value, the network parameters of the super-resolution network model are adjusted.
  • In the embodiment of the present invention, in order to improve the accuracy with which the super-resolution network processes images, the network parameters of the super-resolution network model may be adjusted according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image.
  • Optionally, a high-resolution sub-image and a low-resolution sub-image corresponding to each frame of the multiple frames of sample images are obtained; each frame of target sub-image is input into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images; each frame of low-resolution sub-image is input into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image; the high-resolution sub-image of each frame of sample image is superimposed with the semantic feature image of the corresponding sample image to obtain a superimposed image; the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image is determined; and the sum of the degrees of difference is calculated and used as the error of the super-resolution network model.
  • In this technical solution, the superimposed image of the high-resolution sub-image of the sample image and the semantic feature image of the sample image is used as a reference image, and the low-resolution sub-image of the sample image is used as a training sample; the error of the super-resolution network model is calculated from the reference image and the training sample image, so as to obtain a super-resolution network model with a lower error.
  • In this technical solution, the image processing apparatus may set a weight for the image output by the image semantic network model, so as to obtain super-resolution network models with different performance and meet users' different image requirements: the larger the weight value, the more information the semantic feature image provides in the superimposed image and the higher the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the semantic feature image; conversely, the smaller the weight value, the less information the semantic feature image provides in the superimposed image and the lower the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the target sub-image.
  • Optionally, the image semantic network model includes a multi-layer neural network. The target sub-image is input into the image semantic network model, and the multi-layer neural network included in the image semantic network model performs semantic recognition and outputs multiple frames of candidate feature images, each layer of the neural network outputting one frame of candidate feature image; grayscale processing is performed on each frame of candidate feature image to obtain a grayscale image; a parameter value of each frame of grayscale image is determined, and the grayscale image with the largest parameter value is used as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
  • In this technical solution, a candidate image with higher definition and/or a larger amount of information is selected from the multiple frames of candidate feature images as the semantic feature image, which improves the quality of the semantic feature image and, further, the performance of the super-resolution network model in producing high-resolution images.
  • In this technical solution, in order to improve the efficiency and accuracy of image processing, a super-resolution network model that matches the type of the target image can be selected to process the target image.
  • an embodiment of the present invention provides an image processing apparatus having a function of realizing the behavior in the implementation manner of the first aspect.
  • This function can be realized by hardware, and can also be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions, and the modules may be software and / or hardware.
  • For the implementation of the image processing apparatus, reference may be made to the method implementations of the first aspect; repeated details are not described again.
  • an embodiment of the present invention provides an electronic device.
  • The electronic device includes: a memory configured to store one or more programs; and a processor configured to call the programs stored in the memory to implement the solution in the method design of the first aspect described above; for the implementation and beneficial effects, reference may be made to the implementation and beneficial effects of the method of the first aspect, and repeated details are not described again.
  • FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of an image processing method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of another super-resolution network model and an image semantic network model according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • The image processing apparatus in the embodiments of the present invention may be provided in any electronic device and is used to perform high-resolution conversion operations on pictures.
  • the electronic device includes, but is not limited to, smart mobile devices (such as mobile phones, PDAs, media players, etc.), wearable devices, headsets, personal computers, server computers, handheld or laptop devices, and so on.
  • FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention. The method may be executed by an image processing apparatus. The specific explanation of the image processing apparatus is as described above. As shown in FIG. 1, the image processing method may include the following steps.
  • the image processing apparatus may obtain a target image requiring super-resolution processing from a local database, or download a target image requiring super-resolution processing from a network.
  • the target image refers to an image with a resolution lower than a preset resolution value, and the target image may refer to a captured image or any frame image in a captured video.
  • The network parameters of the super-resolution network model are adjusted according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by the image semantic network model.
  • The image processing apparatus can input the target image into the super-resolution network model for processing to obtain a high-resolution image, which improves the quality of the high-resolution image.
  • the high-resolution image may refer to an image with a resolution greater than a preset resolution value.
  • the high-resolution image may provide users with more detailed feature information and edge structure information.
  • The super-resolution network model and the image semantic network model can each be constructed from a convolutional neural network.
  • In a convolutional neural network, there are usually multiple convolutional layers, and each convolutional layer includes multiple convolution kernels. A convolution kernel is three-dimensional and contains data in the three dimensions C, H, and W, which represent the depth, height, and width of the data, respectively.
  • A convolution kernel is essentially a combination of a series of weights. By adjusting the weights of the convolution kernels in the super-resolution network model, the image conversion error of the super-resolution network model can be reduced; by adjusting the weights of the convolution kernels in the image semantic network model, the semantic recognition error of the image semantic network model can be reduced.
  • The network parameters of the super-resolution network model refer to the weights of the convolution kernels in the super-resolution network model.
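  • For instance, in a PyTorch sketch (shown only as an illustration; the channel counts below are arbitrary), the weight tensor of one convolutional layer exposes exactly these dimensions:

```python
import torch.nn as nn

# 64 convolution kernels, each of size C x H x W = 3 x 3 x 3.
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)
print(conv.weight.shape)  # torch.Size([64, 3, 3, 3])
# "Adjusting the network parameters" amounts to adjusting these kernel weights during training.
```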
  • In one embodiment, in order to improve the efficiency of obtaining a high-resolution image, the image processing apparatus may preprocess the target image and input the preprocessed target image into the super-resolution network model for processing to obtain a high-resolution image.
  • For example, the preprocessing includes cropping the target image to extract the region of the target image that the user is interested in, such as the region where a face is located; or the preprocessing includes scaling the target image to a size suitable for processing by the super-resolution network model.
  • In one embodiment, the image processing apparatus may obtain the type of the target image, determine a super-resolution network model that matches the type of the target image, and input the target image into the matching super-resolution network model for processing to obtain a high-resolution image.
  • In order to improve the efficiency and accuracy of obtaining a high-resolution image, the image processing apparatus can obtain the type of the target image. Classified by the content included in the target image, the type of the target image includes a person image type, a scene image type, or an animal image type; classified by the state of the target image, the type of the target image includes a static image type or a dynamic image type.
  • According to the relationship between image types and super-resolution network models, a super-resolution network model matching the type of the target image is determined, and the target image is input into the matching super-resolution network model for processing to obtain a high-resolution image.
  • For example, if the target image is of the person image type, a super-resolution network model matching the person image type is obtained, and the target image is input into the matched super-resolution network model for processing to obtain a high-resolution image.
  • The network parameters of the matched super-resolution network model are adjusted using multiple frames of person sample images and the semantic feature image corresponding to each frame of person sample image.
  • In one embodiment, the image processing apparatus may train different types of super-resolution network models with different types of sample images and the semantic feature images corresponding to those sample images; for example, multiple frames of sample images containing animals and the semantic feature image corresponding to each frame are used to train a super-resolution network model suitable for processing images that contain animals.
  • Since the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
  • FIG. 2 is a schematic flowchart of an image processing method according to an embodiment of the present invention.
  • the method may be executed by an image processing apparatus.
  • the specific explanation of the image processing apparatus is as described above.
  • The difference between this embodiment of the present invention and the embodiment described in FIG. 1 is that this embodiment calculates the error of the super-resolution network model from the multiple frames of sample images and the semantic feature image of each frame of sample image and, when the error is greater than a preset error value, adjusts the network parameters of the super-resolution network model until the error is less than or equal to the preset error value.
  • An embodiment of the present invention is shown in FIG. 2.
  • the image processing method may include the following steps.
  • The image processing apparatus may determine the error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image.
  • In one embodiment, step S201 includes steps S11 to S15.
  • Each frame of target sub-image is input into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images.
  • Each frame of low-resolution sub-image is input into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image.
  • In steps S11 to S15, the image processing apparatus may perform sampling processing on each frame of the multiple frames of sample images to obtain the low-resolution sub-image corresponding to each frame of sample image, and perform enhancement processing on each frame of sample image to obtain the high-resolution sub-image corresponding to each frame of sample image.
  • Each frame of low-resolution sub-image is input into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image, and each frame of target sub-image is input into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image; the semantic feature image includes detailed feature information and edge structure information of the sample image.
  • Further, the high-resolution sub-image of each frame of sample image is superimposed with the semantic feature image of the corresponding sample image to obtain a superimposed image; the superimposed image is a semantically enhanced high-resolution image.
  • The superimposed image is compared with the high-resolution feature image of the corresponding frame of sample image to determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image.
  • The greater the degree of difference, the smaller the similarity between the high-resolution feature image obtained by the super-resolution network model and the superimposed image (that is, the semantically enhanced high-resolution image), and hence the poorer the quality of the high-resolution feature image obtained by the super-resolution network model; conversely, the smaller the degree of difference, the greater the similarity between the high-resolution feature image obtained by the super-resolution network model and the superimposed image, and hence the better the quality of the high-resolution feature image obtained by the super-resolution network model.
  • Therefore, the sum of the degrees of difference can be calculated and used as the error of the super-resolution network model.
  • The error of the super-resolution network model refers to the error with which the super-resolution network model converts an image into a high-resolution image: the larger the error, the poorer the quality of the high-resolution image produced by the super-resolution network model; conversely, the smaller the error, the better the quality of the high-resolution image produced by the super-resolution network model.
  • In one embodiment, each convolutional layer includes N k×k convolution kernels, where N can be any integer in [20, 100] and k can be 3 or 5.
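  • As an illustration only, the following PyTorch sketch stacks convolutional layers with N k×k kernels in the ranges just described; the layer count, the three-channel input, and the pixel-shuffle upscaling step are assumptions made for the sketch and are not taken from the patent:

```python
import torch
import torch.nn as nn

class SimpleSRNet(nn.Module):
    """Minimal super-resolution backbone sketch: convolutional layers with N k x k kernels."""

    def __init__(self, n_kernels=64, kernel_size=3, scale=2):
        super().__init__()
        pad = kernel_size // 2
        self.body = nn.Sequential(
            nn.Conv2d(3, n_kernels, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_kernels, n_kernels, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
            # Upscaling by pixel shuffle; the patent does not specify the upsampling step.
            nn.Conv2d(n_kernels, 3 * scale * scale, kernel_size, padding=pad),
            nn.PixelShuffle(scale),
        )

    def forward(self, x):
        return self.body(x)

# Example: a 64-kernel, 3x3 network that doubles the resolution of a 48x48 input.
model = SimpleSRNet(n_kernels=64, kernel_size=3, scale=2)
high_res = model(torch.randn(1, 3, 48, 48))  # shape: (1, 3, 96, 96)
```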
  • The image processing apparatus can obtain the high-resolution sub-image and the low-resolution sub-image of each frame of sample image in the N frames of sample images, and input the low-resolution sub-image of each frame of sample image into the super-resolution network model for processing to obtain the high-resolution feature image corresponding to that frame of sample image; the feature information of each frame of high-resolution feature image is extracted and identified as f_W(x_j), where x_j represents the j-th frame of sample image.
  • The target sub-image is input into the image semantic network model and the S operation is performed to obtain a semantic feature image; the semantic feature image is superimposed with the high-resolution sub-image to obtain a superimposed image; the feature information of the superimposed image is extracted and identified as f_s(y_j) + z_j, where y_j is the target sub-image corresponding to the j-th frame of sample image, f_s(y_j) represents the feature information of the semantic feature image of that target sub-image, and z_j is the feature information of the high-resolution sub-image of the j-th frame of sample image.
  • The feature information of each frame of high-resolution feature image is compared with the feature information of the corresponding superimposed image to determine the degree of difference between the high-resolution feature image of the sample image and the superimposed image of the corresponding sample image.
  • The sum of the degrees of difference is calculated and used as the error of the super-resolution network model, which is written in terms of the network parameters W.
  • The error of the super-resolution network model can be expressed by equation (1), where MSE(f_W(x_j), f_s(y_j) + z_j) represents the degree of difference between the feature information of the high-resolution feature image of the j-th frame of sample image and the feature information of the superimposed image of the j-th frame of sample image.
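  • The formula for equation (1) is not reproduced above; based on the surrounding definitions (N frames of sample images, network output f_W(x_j), and superimposed-image feature information f_s(y_j) + z_j), a plausible reconstruction, writing the error as E(W), is:

```latex
% Plausible reconstruction of equation (1); the notation E(W) is an assumption.
E(W) = \sum_{j=1}^{N} \mathrm{MSE}\left( f_W(x_j),\ f_s(y_j) + z_j \right)
```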
  • In one embodiment, the image processing apparatus may set a weight for the images output by the image semantic network model, process the semantic feature image of each frame of sample image according to the weight to obtain a processed semantic feature image, and superimpose the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain a superimposed image.
  • The image processing apparatus can set this weight according to the scene or according to the needs of the user.
  • The larger the weight value, the more information the semantic feature image provides in the superimposed image and the higher the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the semantic feature image; conversely, the smaller the weight value, the less information the semantic feature image provides in the superimposed image and the lower the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the target sub-image.
  • The semantic feature image corresponding to each frame of sample image is processed according to the set weight to obtain the processed semantic feature image, and the high-resolution sub-image of each frame of sample image is superimposed with the processed semantic feature image of the corresponding sample image to obtain a superimposed image; the feature information of the superimposed image is extracted and can be identified as the weight multiplied by f_s(y_j), plus z_j, where the weighted f_s(y_j) is the feature information of the processed semantic feature image. Accordingly, the error of the super-resolution network model can be expressed by equation (2).
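  • Equation (2) likewise is not reproduced above; it should differ from equation (1) only in the weighted semantic term. Writing the weight as λ (an assumed symbol), a plausible reconstruction is:

```latex
% Plausible reconstruction of equation (2); \lambda stands for the weight set for the
% image output by the image semantic network model.
E(W) = \sum_{j=1}^{N} \mathrm{MSE}\left( f_W(x_j),\ \lambda f_s(y_j) + z_j \right)
```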
  • In one embodiment, step S12 includes: inputting the target sub-image into the image semantic network model, and performing semantic recognition through the multi-layer neural network included in the image semantic network model to output multiple frames of candidate feature images, each layer of the neural network outputting one frame of candidate feature image; performing grayscale processing on each frame of candidate feature image to obtain a grayscale image; and determining a parameter value of each frame of grayscale image and using the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
  • That is, the image processing apparatus may input the target sub-image into the image semantic network model, and the multi-layer neural network included in the image semantic network model performs semantic recognition and outputs multiple frames of candidate feature images.
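  • A minimal sketch of how the grayscale candidate maps described above could be scored is given below; using gradient energy as the sharpness measure and histogram entropy as the information measure is an assumption, since the patent does not fix a particular definition of either quantity:

```python
import numpy as np

def sharpness(gray):
    """Mean squared gradient magnitude as a simple sharpness proxy."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return float(np.mean(gx ** 2 + gy ** 2))

def information(gray, bins=256):
    """Shannon entropy of the grayscale histogram as an information proxy."""
    hist, _ = np.histogram(gray, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def pick_semantic_feature(candidates, alpha=0.5):
    """Return the candidate grayscale map with the largest combined parameter value."""
    scores = [alpha * sharpness(g) + (1.0 - alpha) * information(g) for g in candidates]
    return candidates[int(np.argmax(scores))]

# Example: choose among three candidate maps output by different network layers.
candidates = [np.random.rand(64, 64) for _ in range(3)]
best = pick_semantic_feature(candidates)
```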
  • S202: Determine whether the error is less than or equal to a preset error value.
  • The image processing apparatus can determine whether the error is less than or equal to the preset error value. When the error is less than or equal to the preset error value, it indicates that the super-resolution network model can output a high-quality high-resolution image, and step S204 can be performed; conversely, when the error is greater than the preset error value, it indicates that the super-resolution network model cannot yet output a high-quality high-resolution image, and step S205 can be performed.
  • When the error is greater than the preset error value, the network parameters of the super-resolution network model are adjusted and S201 is repeated until the error of the super-resolution network model is less than or equal to the preset error value, so that the super-resolution network model can output a high-quality high-resolution image.
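  • The adjust-and-repeat procedure of steps S201 to S205 could look roughly like the following PyTorch sketch. The optimiser, the learning rate, pixel-space MSE in place of the feature-information comparison, and plain addition as the way to superimpose the semantic feature image onto the high-resolution sub-image are all assumptions made for illustration:

```python
import torch
import torch.nn.functional as F

def train_until_threshold(sr_model, semantic_model, pairs, preset_error, lr=1e-4, max_rounds=1000):
    """pairs: list of (low_res, high_res) sub-image tensors, one pair per sample frame."""
    optimizer = torch.optim.Adam(sr_model.parameters(), lr=lr)
    for _ in range(max_rounds):
        optimizer.zero_grad()
        error = torch.zeros(())
        for low_res, high_res in pairs:
            sr_out = sr_model(low_res)               # high-resolution feature image (S13)
            with torch.no_grad():
                semantic = semantic_model(high_res)  # semantic feature image (S12)
            reference = high_res + semantic          # superimposed image (S14)
            error = error + F.mse_loss(sr_out, reference)  # per-frame degree of difference (S15)
        if error.item() <= preset_error:             # S202/S204: the model is good enough
            break
        error.backward()                             # S205: adjust the network parameters
        optimizer.step()
    return sr_model
```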
  • When the error of the super-resolution network model is less than or equal to the preset error value, it indicates that the super-resolution network model can output a high-quality high-resolution image; the image processing apparatus can then obtain a target image that needs super-resolution processing and input the target image into the super-resolution network model for processing to obtain a high-resolution image.
  • The target image is input into the super-resolution network model for processing to obtain a high-resolution image, so that more detailed feature information and higher-definition edge feature information can be obtained from the high-resolution image.
  • Since the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
  • FIG. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
  • the image processing apparatus described in this embodiment includes:
  • the obtaining module 401 is configured to obtain a target image that needs to be subjected to super-resolution processing.
  • A processing module 402 is configured to input the target image into the super-resolution network model for processing to obtain a high-resolution image.
  • The network parameters of the super-resolution network model are adjusted according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by the image semantic network model.
  • A determining module 403 is configured to determine the error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image.
  • An adjustment module 404 is configured to adjust the network parameters of the super-resolution network model when the error is greater than the preset error value.
  • The determining module 403 is specifically configured to: obtain the high-resolution sub-image and the low-resolution sub-image corresponding to each frame of the multiple frames of sample images; input each frame of target sub-image into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any one of the multiple frames of sample images; input each frame of low-resolution sub-image into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image; superimpose the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image; determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; and calculate the sum of the degrees of difference and use the sum of the degrees of difference as the error.
  • a setting module 405 is configured to set a weight for an image output by the image semantic network model.
  • the processing module 402 is further configured to process the semantic feature image of the sample image of each frame according to the weights to obtain a processed semantic feature map.
  • the determining module 403 is specifically configured to superimpose the high-resolution sub-image of the sample image of each frame and the processed semantic feature image of the corresponding sample image to obtain a superimposed image.
  • The determining module 403 is specifically configured to: input the target sub-image into the image semantic network model, and perform semantic recognition through the multi-layer neural network included in the image semantic network model to output multiple frames of candidate feature images, each layer of the neural network outputting one frame of candidate feature image; perform grayscale processing on each frame of candidate feature image to obtain a grayscale image; and determine a parameter value of each frame of grayscale image and use the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
  • The obtaining module 401 is further configured to obtain the type of the target image and determine a super-resolution network model that matches the type of the target image.
  • The processing module 402 is configured to input the target image into the super-resolution network model matching the type of the target image for processing to obtain a high-resolution image.
  • Since the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the electronic device includes a processor 501, a memory 502, a communication interface 503, and a power source 504.
  • the processor 501, the memory 502, the communication interface 503, and the power source 504 are connected to each other through a bus.
  • the processor 501 may be one or more CPUs. In the case where the processor 501 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.
  • In an embodiment, the processor 501 may include a modem configured to perform modulation or demodulation processing on the signal received by the transceiver 805.
  • The memory 502 includes, but is not limited to, RAM, ROM, EPROM, and CD-ROM.
  • the memory 502 is used to store instructions, an operating system, various applications, and data.
  • the communication interface 503 is connected to a forwarding plane device or other control plane devices.
  • the communication interface 503 includes multiple interfaces, which are respectively connected to multiple terminals or connected to a forwarding plane device.
  • the communication interface 503 may be a wired interface, a wireless interface, or a combination thereof.
  • the wired interface may be, for example, an Ethernet interface.
  • the Ethernet interface can be an optical interface, an electrical interface, or a combination thereof.
  • The wireless interface may be, for example, a wireless local area network (WLAN) interface, a cellular network interface, or a combination thereof.
  • the power supply 504 is configured to supply power to a control plane device.
  • the memory 502 is also used to store program instructions.
  • the processor 501 may call the program instructions stored in the memory 502 to implement the image processing method as shown in the foregoing embodiments of the present application.
  • The principle by which the device provided in this embodiment of the present invention solves problems is similar to that of the method embodiments of the present invention; therefore, for the implementation and beneficial effects of the device, reference may be made to the implementation and beneficial effects of the foregoing method, and details are not described again.
  • the present invention also provides a computer-readable storage medium on which a computer program is stored.
  • An embodiment of the present invention also provides a computer program product.
  • The computer program product includes a non-volatile computer-readable storage medium storing a computer program.
  • When the computer program is executed, the steps of the image processing method in the embodiments corresponding to FIG. 1 and FIG. 2 described above are performed; for the implementation and beneficial effects of the computer program product in solving problems, reference may be made to the implementation and beneficial effects of the image processing methods in FIG. 1 and FIG. 2 described above, and details are not described again.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an image processing method, apparatus and device. The method comprises: acquiring a target image that requires super-resolution processing; and inputting the target image into a super-resolution network model for processing so as to obtain a high-resolution image, wherein network parameters of the super-resolution network model are obtained by carrying out adjustment according to multiple frames of sample image and a semantic feature map corresponding to each frame of sample image, and the semantic feature map is obtained by means of an image semantic network model carrying out semantic recognition, thereby improving the quality of the obtained high-resolution image.

Description

Image processing method, apparatus and device
Technical Field
The present invention relates to image processing technology, and in particular, to an image processing method, apparatus, and device.
Background
With the development of multimedia technology, users have increasingly high requirements for multimedia information; for example, high-resolution multimedia information (picture information or video information) has become the mainstream type of multimedia file.
When terminals need to exchange high-resolution multimedia information, they require high-speed broadband to transmit it, which greatly increases the cost of the exchange for both terminals. Therefore, users usually convert high-resolution multimedia information into low-resolution multimedia information and send the low-resolution version to the other terminal, reducing the interaction cost. After receiving the low-resolution multimedia information, the receiving terminal needs to restore it to high-resolution multimedia information in order to obtain more detailed information. In practice, it has been found that the quality of the restored high-resolution multimedia information is poor.
Summary of the Invention
The invention provides an image processing method, apparatus, and device that improve the accuracy of converting a low-resolution image into a high-resolution image, thereby improving the quality of the high-resolution image.
In a first aspect, an embodiment of the present invention provides an image processing method. The method includes: acquiring a target image requiring super-resolution processing; and inputting the target image into a super-resolution network model for processing to obtain a high-resolution image, where the network parameters of the super-resolution network model are obtained by adjustment according to multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature map is obtained through semantic recognition by an image semantic network model.
In this technical solution, because the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the super-resolution network model is a semantically enhanced network model: it can convert a low-resolution image into a semantically enhanced high-resolution image that provides more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
Optionally, an error of the super-resolution network model is determined according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image; when the error is greater than a preset error value, the network parameters of the super-resolution network model are adjusted.
In the embodiment of the present invention, in order to improve the accuracy with which the super-resolution network processes images, the network parameters of the super-resolution network model may be adjusted according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image.
Optionally, a high-resolution sub-image and a low-resolution sub-image corresponding to each frame of the multiple frames of sample images are obtained; each frame of target sub-image is input into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images; each frame of low-resolution sub-image is input into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image; the high-resolution sub-image of each frame of sample image is superimposed with the semantic feature image of the corresponding sample image to obtain a superimposed image; the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image is determined; and the sum of the degrees of difference is calculated and used as the error of the super-resolution network model.
In this technical solution, the superimposed image of the high-resolution sub-image of the sample image and the semantic feature image of the sample image is used as a reference image, and the low-resolution sub-image of the sample image is used as a training sample; the error of the super-resolution network model is calculated from the reference image and the training sample image, so as to obtain a super-resolution network model with a lower error.
Optionally, a weight is set for the image output by the image semantic network model; the semantic feature image of each frame of sample image is processed according to the weight to obtain a processed semantic feature image; and the high-resolution sub-image of each frame of sample image is superimposed with the processed semantic feature image of the corresponding sample image to obtain a superimposed image.
In this technical solution, the image processing apparatus may set a weight for the image output by the image semantic network model, so as to obtain super-resolution network models with different performance and meet users' different image requirements: the larger the weight value, the more information the semantic feature image provides in the superimposed image and the higher the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the semantic feature image; conversely, the smaller the weight value, the less information the semantic feature image provides in the superimposed image and the lower the definition of the superimposed image, so that the high-resolution image output by the super-resolution network model is closer to the target sub-image.
Optionally, the image semantic network model includes a multi-layer neural network. The target sub-image is input into the image semantic network model, and the multi-layer neural network included in the image semantic network model performs semantic recognition and outputs multiple frames of candidate feature images, each layer of the neural network outputting one frame of candidate feature image; grayscale processing is performed on each frame of candidate feature image to obtain a grayscale image; a parameter value of each frame of grayscale image is determined, and the grayscale image with the largest parameter value is used as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
In this technical solution, a candidate image with higher definition and/or a larger amount of information is selected from the multiple frames of candidate feature images as the semantic feature image, which improves the quality of the semantic feature image and, further, the performance of the super-resolution network model in producing high-resolution images.
Optionally, the type of the target image is obtained; a super-resolution network model matching the type of the target image is determined; and the target image is input into the super-resolution network model matching the type of the target image for processing to obtain a high-resolution image.
In this technical solution, in order to improve the efficiency and accuracy of image processing, a super-resolution network model that matches the type of the target image can be selected to process the target image.
In a second aspect, an embodiment of the present invention provides an image processing apparatus having the function of implementing the behavior in the implementations of the first aspect. The function can be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function, and the modules may be software and/or hardware. Based on the same inventive concept, since the problem-solving principle and beneficial effects of the image processing apparatus can be found in the method implementations of the first aspect and the beneficial effects they bring, the implementation of the image processing apparatus can refer to the method implementations of the first aspect, and repeated details are not described again.
In a third aspect, an embodiment of the present invention provides an electronic device. The electronic device includes: a memory configured to store one or more programs; and a processor configured to call the programs stored in the memory to implement the solution in the method design of the first aspect described above. For the implementation and beneficial effects of the device in solving problems, reference may be made to the implementation and beneficial effects of the method of the first aspect, and repeated details are not described again.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following describes the accompanying drawings used in the embodiments of the present invention.
FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an image processing method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another super-resolution network model and an image semantic network model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following describes the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention.
The image processing apparatus in the embodiments of the present invention may be provided in any electronic device and is used to perform high-resolution conversion operations on pictures. The electronic device includes, but is not limited to, smart mobile devices (such as mobile phones, PDAs, and media players), wearable devices, head-mounted devices, personal computers, server computers, and handheld or laptop devices.
The image processing method and related devices provided in this application are further described below.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of an image processing method according to an embodiment of the present invention. The method may be executed by an image processing apparatus, and the image processing apparatus is as described above. As shown in FIG. 1, the image processing method may include the following steps.
S101. Obtain a target image that needs super-resolution processing.
In the embodiment of the present invention, the image processing apparatus may obtain the target image that needs super-resolution processing from a local database, or download the target image that needs super-resolution processing from a network. The target image refers to an image whose resolution is lower than a preset resolution value, and the target image may be a captured image or any frame of a captured video.
S102. Input the target image into the super-resolution network model for processing to obtain a high-resolution image.
The network parameters of the super-resolution network model are adjusted according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by the image semantic network model.
In the embodiment of the present invention, since the network parameters of the super-resolution network model are adjusted according to the multiple frames of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the image processing apparatus can input the target image into the super-resolution network model for processing to obtain a high-resolution image, improving the quality of the high-resolution image. The high-resolution image may refer to an image whose resolution is greater than the preset resolution value, and the high-resolution image can provide users with more detailed feature information and edge structure information.
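A minimal usage sketch of step S102 is given below, assuming a trained super-resolution model is available as `sr_model`; the file handling and the value-range clamp are illustrative choices, not requirements from the patent:

```python
import torch
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image

def super_resolve(image_path, sr_model):
    """Run a trained super-resolution model (assumed to exist) on one image file."""
    low_res = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)  # 1 x C x H x W in [0, 1]
    with torch.no_grad():
        high_res = sr_model(low_res).clamp(0.0, 1.0)  # S102: super-resolution processing
    return to_pil_image(high_res.squeeze(0))
```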
The super-resolution network model and the image semantic network model can each be constructed from a convolutional neural network. In a convolutional neural network, there are usually multiple convolutional layers, and each convolutional layer includes multiple convolution kernels. A convolution kernel is three-dimensional and contains data in the three dimensions C, H, and W, which represent the depth, height, and width of the data, respectively. A convolution kernel is essentially a combination of a series of weights. By adjusting the weights of the convolution kernels in the super-resolution network model, the image conversion error of the super-resolution network model can be reduced; by adjusting the weights of the convolution kernels in the image semantic network model, the semantic recognition error of the image semantic network model can be reduced.
The network parameters of the super-resolution network model refer to the weights of the convolution kernels in the super-resolution network model.
In one embodiment, in order to improve the efficiency of obtaining a high-resolution image, the image processing apparatus may preprocess the target image and input the preprocessed target image into the super-resolution network model for processing to obtain a high-resolution image. For example, the preprocessing includes cropping the target image to extract the region of the target image that the user is interested in, such as the region where a face is located; or the preprocessing includes scaling the target image to a size suitable for processing by the super-resolution network model.
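As an illustrative sketch of such preprocessing with Pillow (the crop box and target size below are placeholder values rather than values specified in the patent):

```python
from PIL import Image

def preprocess(path, crop_box=None, target_size=(128, 128)):
    """Crop a region of interest (if given) and rescale to a size the model accepts."""
    img = Image.open(path).convert("RGB")
    if crop_box is not None:  # e.g. (left, top, right, bottom) around a detected face
        img = img.crop(crop_box)
    return img.resize(target_size, Image.BICUBIC)

# Example: keep a hypothetical face region and rescale it for the model.
target = preprocess("photo.jpg", crop_box=(40, 30, 168, 158), target_size=(128, 128))
```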
In one embodiment, the image processing apparatus may obtain the type of the target image, determine a super-resolution network model that matches the type of the target image, and input the target image into the matching super-resolution network model for processing to obtain a high-resolution image.
In order to improve the efficiency and accuracy of obtaining a high-resolution image, the image processing apparatus can obtain the type of the target image. Classified by the content included in the target image, the type of the target image includes a person image type, a scene image type, or an animal image type; classified by the state of the target image, the type of the target image includes a static image type or a dynamic image type. According to the relationship between image types and super-resolution network models, a super-resolution network model matching the type of the target image is determined, and the target image is input into the matching super-resolution network model for processing to obtain a high-resolution image. For example, if the target image is of the person image type, a super-resolution network model matching the person image type is obtained, and the target image is input into the matched super-resolution network model for processing to obtain a high-resolution image. The network parameters of the matched super-resolution network model are adjusted using multiple frames of person sample images and the semantic feature image corresponding to each frame of person sample image.
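One way to express this type-to-model matching is sketched below; the type names, the classifier that produces them, and the fallback choice are assumptions for illustration:

```python
def super_resolve_by_type(image, image_type, sr_models, default_type="scene"):
    """Apply the super-resolution model trained for this image type.

    sr_models: dict mapping type names (e.g. "person", "scene", "animal") to trained models.
    image_type: assumed to come from a separate classifier, which the patent does not specify.
    """
    model = sr_models.get(image_type, sr_models[default_type])  # fall back to a default model
    return model(image)
```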
In one embodiment, the image processing apparatus may train different types of super-resolution network models with different types of sample images and the semantic feature images corresponding to those sample images; for example, multiple frames of sample images containing animals and the semantic feature image corresponding to each frame are used to train a super-resolution network model suitable for processing images that contain animals.
It can be seen that, by implementing the method described in FIG. 1, since the network parameters of the super-resolution network model are adjusted according to a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information and high-definition edge structure information, which improves the quality of the high-resolution image.
请参见图2,图2是本发明实施例提供的一种图像处理方法的流程示意图,所述方法可以由图像处理装置执行,其中,图像处理装置的具体解释如前所述。本发明实施例与图1所述的实施例的区别在于,本发明实施例通过多帧样本图像及每帧样本图像的语义特征图像计算超分网络模型的误差,在误差大于预设误差值时,对超分网络模型的网络参数进行调整,以得到误差小于或等于预设误差值的超分网络模型。本发明实施例如图2所示,该图像处理方法可以包括如下步骤。Please refer to FIG. 2. FIG. 2 is a schematic flowchart of an image processing method according to an embodiment of the present invention. The method may be executed by an image processing apparatus. The specific explanation of the image processing apparatus is as described above. The difference between the embodiment of the present invention and the embodiment described in FIG. 1 is that the embodiment of the present invention calculates the error of the super-scoring network model by using multiple frame sample images and the semantic feature images of each frame sample image. When the error is greater than a preset error value , Adjusting the network parameters of the super-scoring network model to obtain a super-scoring network model with an error less than or equal to a preset error value. An embodiment of the present invention is shown in FIG. 2. The image processing method may include the following steps.
S201. Determine the error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image.
In this embodiment of the present invention, the image processing apparatus may determine the error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image. In one embodiment, step S201 includes steps S11 to S15.
S11. Obtain the high-resolution sub-image and the low-resolution sub-image corresponding to each frame of sample image in the multiple frames of sample images.
S12. Input each frame of target sub-image into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images.
S13. Input each frame of low-resolution sub-image into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image.
S14. Superimpose the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image.
S15. Determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; compute the sum of the degrees of difference, and take the sum as the error of the super-resolution network model.
In steps S11 to S15, the image processing apparatus may perform sampling processing on each frame of sample image in the multiple frames of sample images to obtain the low-resolution sub-image corresponding to each frame of sample image, and perform enhancement processing on each frame of sample image to obtain the high-resolution sub-image corresponding to each frame of sample image. Each frame of low-resolution sub-image is input into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image, and each frame of target sub-image is input into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the semantic feature image contains the detailed feature information and edge structure information of the sample image.
Further, the high-resolution sub-image of each frame of sample image is superimposed with the semantic feature image of the corresponding sample image to obtain a superimposed image, which is a semantically enhanced high-resolution image. The superimposed image is compared with the high-resolution feature image of the corresponding frame of sample image to determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image. A larger degree of difference indicates a lower similarity between the high-resolution feature image produced by the super-resolution network model and the superimposed image (that is, the semantically enhanced high-resolution image), meaning the high-resolution feature image produced by the model is of poorer quality; conversely, a smaller degree of difference indicates a higher similarity, meaning the high-resolution feature image produced by the model is of better quality. Therefore, the sum of the degrees of difference can be computed and taken as the error of the super-resolution network model. This error refers to the error introduced when the model converts an image into a high-resolution image: a larger error indicates that the high-resolution images produced by the model are of poorer quality, and a smaller error indicates that they are of better quality.
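For concreteness only, the preparation of the sub-images and the superposition might be sketched as below; bicubic downsampling as the sampling step, an unsharp-mask filter as the enhancement step, and pixel-wise addition as the superposition are assumptions, since the text does not fix particular operators:

```python
import torch
import torch.nn.functional as F

def make_sub_images(sample: torch.Tensor, scale: int = 2):
    """Return (high_res_sub_image, low_res_sub_image) for one sample frame of shape (N, C, H, W)."""
    # "Sampling": bicubic downsampling to obtain the low-resolution sub-image.
    low_res = F.interpolate(sample, scale_factor=1.0 / scale, mode="bicubic",
                            align_corners=False)
    # "Enhancement": a simple unsharp-mask sharpening to obtain the high-resolution sub-image.
    blurred = F.avg_pool2d(sample, kernel_size=3, stride=1, padding=1)
    high_res = sample + 0.5 * (sample - blurred)
    return high_res, low_res

def superimpose(high_res_sub: torch.Tensor, semantic_map: torch.Tensor) -> torch.Tensor:
    """Pixel-wise superposition of a high-resolution sub-image and its semantic feature image."""
    return high_res_sub + semantic_map
```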
For example, as shown in FIG. 2, suppose the super-resolution network model consists of two consecutive convolutional layers, each containing N k*k convolution kernels, where N may be any integer in [20, 100] and k may be 3 or 5. The image processing apparatus may obtain the high-resolution sub-image and the low-resolution sub-image of each frame of sample image in N frames of sample images, input the low-resolution sub-image of each frame into the super-resolution network model for processing to obtain the high-resolution feature image corresponding to each frame, and extract the feature information of each frame of high-resolution feature image, denoted f_W(x_j), where x_j denotes the j-th frame of sample image. The target sub-image is subjected to the S operation (semantic recognition by the image semantic network model) to obtain the semantic feature image; the semantic feature image is superimposed with the high-resolution sub-image to obtain the superimposed image, and the feature information of the superimposed image is extracted, denoted f_s(y_j)+z_j, where y_j denotes the target sub-image corresponding to the j-th frame of sample image, f_s(y_j) denotes the feature information of the semantic feature image of that target sub-image, and z_j denotes the feature information of the high-resolution sub-image of the j-th frame of sample image. The feature information of each frame of high-resolution feature image is compared with the feature information of the corresponding superimposed image to determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; the sum of the degrees of difference is computed and taken as the error of the super-resolution network model with network parameters W. The error of the super-resolution network can be expressed by equation (1).
$$\mathrm{Error}(W)=\sum_{j=1}^{N}\mathrm{MSE}\bigl(f_{W}(x_{j}),\,f_{s}(y_{j})+z_{j}\bigr)\qquad(1)$$
In equation (1), MSE(f_W(x_j), f_s(y_j)+z_j) denotes the degree of difference between the feature information of the superimposed image of the j-th frame of sample image and the feature information of the high-resolution feature image of the j-th frame of sample image.
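A minimal sketch of this error computation (assuming mean-squared error over feature tensors, with f_W, f_s, and the per-frame tensors supplied by the caller) could read:

```python
import torch
import torch.nn.functional as F

def superres_error(low_res_subs, high_res_subs, target_subs, f_W, f_s):
    """Sum over frames j of MSE(f_W(x_j), f_s(y_j) + z_j), as in equation (1).

    low_res_subs:  iterable of x_j tensors (low-resolution sub-images)
    high_res_subs: iterable of z_j tensors (high-resolution sub-image features)
    target_subs:   iterable of y_j tensors (target sub-images)
    f_W, f_s:      the super-resolution network and the semantic feature extractor
    """
    total = torch.zeros(())
    for x_j, z_j, y_j in zip(low_res_subs, high_res_subs, target_subs):
        total = total + F.mse_loss(f_W(x_j), f_s(y_j) + z_j)
    return total
```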
In one embodiment, the image processing apparatus may set a weight for the image output by the image semantic network model, process the semantic feature image of each frame of sample image according to the weight to obtain a processed semantic feature map, and superimpose the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain the superimposed image.
The image processing apparatus may set the weight for the image output by the image semantic network model according to the scene or according to the needs of the user, process the semantic feature image corresponding to each frame of sample image according to the weight to obtain the processed semantic feature image, and superimpose the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain the superimposed image. A larger weight value means the semantic feature image contributes more information to the superimposed image and the superimposed image has higher definition, which in turn makes the high-resolution image output by the super-resolution network model closer to the semantic feature image; conversely, a smaller weight value means the semantic feature image contributes less information to the superimposed image and the superimposed image has lower definition, which in turn makes the high-resolution image output by the super-resolution network model closer to the target sub-image.
For example, suppose the weight set for the image output by the semantic network model is λ. The semantic feature image corresponding to each frame of sample image is processed according to the weight λ to obtain the processed semantic feature image, and the high-resolution sub-image of each frame of sample image is superimposed with the processed semantic feature image of the corresponding sample image to obtain the superimposed image. The feature information of the superimposed image is extracted and may be denoted λf_s(y_j)+z_j, where λf_s(y_j) is the feature information of the processed semantic feature image. Further, the error of the super-resolution network can be expressed by equation (2).
$$\mathrm{Error}(W)=\sum_{j=1}^{N}\mathrm{MSE}\bigl(f_{W}(x_{j}),\,\lambda f_{s}(y_{j})+z_{j}\bigr)\qquad(2)$$
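The weighted variant in equation (2) differs only in scaling the semantic term by λ; a sketch, with an arbitrary default weight, might be:

```python
import torch
import torch.nn.functional as F

def weighted_superres_error(low_res_subs, high_res_subs, target_subs, f_W, f_s, lam=0.5):
    """Sum over frames j of MSE(f_W(x_j), lam * f_s(y_j) + z_j), as in equation (2)."""
    total = torch.zeros(())
    for x_j, z_j, y_j in zip(low_res_subs, high_res_subs, target_subs):
        total = total + F.mse_loss(f_W(x_j), lam * f_s(y_j) + z_j)
    return total
```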
In one embodiment, step S12 includes: inputting the target sub-image into the image semantic network model, and performing semantic recognition through the multi-layer neural network included in the image semantic network model to output multiple frames of candidate feature images, with each layer of the neural network outputting one frame of candidate feature image; performing grayscale processing on each frame of candidate feature image to obtain a grayscale image; determining a parameter value for each frame of grayscale image; and taking the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
To output a higher-quality semantic feature image, the image processing apparatus may input the target sub-image into the image semantic network model and perform semantic recognition through the multi-layer neural network included in the model to output multiple frames of candidate feature images. Grayscale processing is performed on each frame of candidate feature image to obtain a grayscale image, the parameter value of each frame of grayscale image is determined, and the grayscale image with the largest parameter value is taken as the semantic feature image of the sample image corresponding to the target sub-image. In other words, a grayscale image with a clear edge structure that provides rich detailed feature information is used as the semantic feature image, so that the network parameters of the super-resolution network model can be trained with higher-quality semantic feature images, yielding a super-resolution network model capable of outputting high-quality high-resolution images.
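One possible realization of this selection step is sketched below; Laplacian variance as the sharpness measure, Shannon entropy as the information measure, and channel averaging as the grayscale reduction are assumptions, since the disclosure leaves these choices open:

```python
import torch
import torch.nn.functional as F

# 3x3 Laplacian kernel used as a simple sharpness probe.
LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def parameter_value(gray: torch.Tensor) -> float:
    """Score a grayscale candidate of shape (1, 1, H, W), assumed normalized to [0, 1]."""
    sharpness = F.conv2d(gray, LAPLACIAN, padding=1).var().item()
    hist = torch.histc(gray, bins=256, min=0.0, max=1.0)
    p = hist / hist.sum().clamp(min=1.0)
    nonzero = p[p > 0]
    entropy = -(nonzero * nonzero.log2()).sum().item()
    return sharpness + entropy

def pick_semantic_feature_image(candidates):
    """candidates: per-layer feature images of shape (1, C, H, W); return the best grayscale map."""
    grays = [c.mean(dim=1, keepdim=True) for c in candidates]  # simple grayscale reduction
    return max(grays, key=parameter_value)
```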
S202. Determine whether the error is less than or equal to the preset error value. The image processing apparatus may determine whether the error is less than or equal to the preset error value. If the error is less than or equal to the preset error value, the super-resolution network model can output high-quality high-resolution images, and step S204 may be performed; otherwise, if the error is greater than the preset error value, the super-resolution network model cannot yet output high-quality high-resolution images, and step S203 may be performed.
S203. Adjust the network parameters of the super-resolution network model.
When the error is greater than the preset error value, the network parameters of the super-resolution network model are adjusted and S201 is performed again, repeating until the error of the super-resolution network model is less than or equal to the preset error value, so that the super-resolution network model can output high-quality high-resolution images.
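As a sketch only, the S201-S203 loop could be written as follows, assuming gradient-based optimization (Adam) as the parameter-adjustment rule; the learning rate, iteration cap, and preset error value shown are placeholders:

```python
import torch

def train_until_converged(sr_model, compute_error, preset_error=1e-3,
                          lr=1e-4, max_iters=10000):
    """Repeat S201-S203: compute the error, stop once it is small enough, otherwise adjust parameters."""
    optimizer = torch.optim.Adam(sr_model.parameters(), lr=lr)
    for _ in range(max_iters):
        error = compute_error(sr_model)       # S201: error over all sample frames
        if error.item() <= preset_error:      # S202: error at or below the preset value
            break
        optimizer.zero_grad()                 # S203: adjust the network parameters
        error.backward()
        optimizer.step()
    return sr_model
```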
S204. Obtain a target image requiring super-resolution processing.
When the error of the super-resolution network model is less than or equal to the preset error value, the super-resolution network model can output high-quality high-resolution images, and the image processing apparatus may obtain a target image requiring super-resolution processing.
S205. Input the target image into the super-resolution network model for processing to obtain a high-resolution image.
The target image is input into the super-resolution network model for processing to obtain a high-resolution image, so that more detailed feature information and sharper edge feature information can be obtained from the high-resolution image.
It can be seen that, by implementing the method described in FIG. 2, because the network parameters of the super-resolution network model are adjusted based on a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain the detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information as well as sharp edge structure information, which improves the quality of the high-resolution image.
Refer to FIG. 4, which is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention. The image processing apparatus described in this embodiment includes:
an obtaining module 401, configured to obtain a target image requiring super-resolution processing;
a processing module 402, configured to input the target image into a super-resolution network model for processing to obtain a high-resolution image,
where the network parameters of the super-resolution network model are adjusted according to multiple frames of sample images and the semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by an image semantic network model;
a determining module 403, configured to determine the error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image; and
an adjusting module 404, configured to adjust the network parameters of the super-resolution network model when the error is greater than a preset error value.
The determining module 403 is specifically configured to: obtain the high-resolution sub-image and the low-resolution sub-image corresponding to each frame of sample image in the multiple frames of sample images; input each frame of target sub-image into the image semantic network model for semantic recognition to obtain the semantic feature image corresponding to each frame of sample image, where the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images; input each frame of low-resolution sub-image into the super-resolution network model for processing to obtain the high-resolution feature image of each frame of sample image; superimpose the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image; determine the degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; and compute the sum of the degrees of difference and take the sum as the error of the super-resolution network model.
A setting module 405 is configured to set a weight for the image output by the image semantic network model.
The processing module 402 is further configured to process the semantic feature image of each frame of sample image according to the weight to obtain a processed semantic feature map.
The determining module 403 is specifically configured to superimpose the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain the superimposed image.
The determining module 403 is specifically configured to: input the target sub-image into the image semantic network model, and perform semantic recognition through the multi-layer neural network included in the image semantic network model to output multiple frames of candidate feature images, with each layer of the neural network outputting one frame of candidate feature image; perform grayscale processing on each frame of candidate feature image to obtain a grayscale image; and determine the parameter value of each frame of grayscale image and take the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, where the parameter value is determined according to the sharpness of the grayscale image and/or the amount of information provided by the grayscale image.
The obtaining module 401 is further configured to obtain the type of the target image and determine a super-resolution network model matching the type of the target image.
The processing module 402 is configured to input the target image into the super-resolution network model matching the type of the target image for processing to obtain a high-resolution image.
It can be seen that, by implementing the apparatus described in FIG. 4, because the network parameters of the super-resolution network model are adjusted based on a large number of sample images and the semantic feature image of each frame of sample image, and the semantic feature images contain the detailed feature information and edge structure information of the sample images, the high-resolution image obtained through the super-resolution network model can provide more detailed feature information as well as sharp edge structure information, which improves the quality of the high-resolution image.
Refer to FIG. 5, which is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device includes a processor 501, a memory 502, a communication interface 503, and a power supply 504, which are connected to one another through a bus.
The processor 501 may be one or more CPUs. Where the processor 501 is a CPU, it may be a single-core CPU or a multi-core CPU. The processor 501 may include a modem configured to modulate or demodulate signals received by the transceiver 805.
The memory 502 includes but is not limited to a RAM, a ROM, an EPROM, or a CD-ROM, and is configured to store instructions, an operating system, various applications, and data.
The communication interface 503 is connected to a forwarding plane device or other control plane devices. For example, the communication interface 503 includes multiple interfaces that are separately connected to multiple terminals or to a forwarding plane device. The communication interface 503 may be a wired interface, a wireless interface, or a combination thereof. The wired interface may be, for example, an Ethernet interface, and the Ethernet interface may be an optical interface, an electrical interface, or a combination thereof. The wireless interface may be, for example, a wireless local area network (WLAN) interface, a cellular network interface, or a combination thereof.
The power supply 504 is configured to supply power to the control plane device.
The memory 502 is further configured to store program instructions. The processor 501 may invoke the program instructions stored in the memory 502 to implement the image processing method shown in the foregoing embodiments of this application.
Based on the same inventive concept, the problem-solving principle of the control plane device provided in this embodiment of the present invention is similar to that of the method embodiments of the present invention. Therefore, for the implementation and beneficial effects of the device, reference may be made to the implementation and beneficial effects of the foregoing method embodiments, and details are not repeated here.
The present invention further provides a computer-readable storage medium having a computer program stored thereon. For the implementation and beneficial effects of the program in solving the problem, reference may be made to the implementations and beneficial effects of the image processing methods in FIG. 1 and FIG. 2 above, and details are not repeated here.
An embodiment of the present invention further provides a computer program product. The computer program product includes a non-volatile computer-readable storage medium storing a computer program, and when the computer program is executed, the computer performs the steps of the image processing methods in the embodiments corresponding to FIG. 1 and FIG. 2 above. For the implementation and beneficial effects of the computer program product in solving the problem, reference may be made to the implementations and beneficial effects of the image processing methods in FIG. 1 and FIG. 2 above, and details are not repeated here.
A person of ordinary skill in the art may understand that all or some of the processes of the methods in the foregoing embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and when executed, may include the processes of the foregoing method embodiments.

Claims (15)

  1. An image processing method, wherein the method comprises:
    obtaining a target image requiring super-resolution processing; and
    inputting the target image into a super-resolution network model for processing to obtain a high-resolution image,
    wherein network parameters of the super-resolution network model are adjusted according to multiple frames of sample images and a semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by an image semantic network model.
  2. The method according to claim 1, further comprising:
    determining an error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image; and
    adjusting the network parameters of the super-resolution network model when the error is greater than a preset error value.
  3. The method according to claim 2, wherein the determining an error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image comprises:
    obtaining a high-resolution sub-image and a low-resolution sub-image corresponding to each frame of sample image in the multiple frames of sample images;
    inputting each frame of target sub-image into the image semantic network model for semantic recognition to obtain a semantic feature image corresponding to each frame of sample image, wherein the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images;
    inputting each frame of low-resolution sub-image into the super-resolution network model for processing to obtain a high-resolution feature image of each frame of sample image;
    superimposing the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image;
    determining a degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; and
    computing a sum of the degrees of difference, and taking the sum as the error of the super-resolution network model.
  4. The method according to claim 3, further comprising:
    setting a weight for an image output by the image semantic network model; and
    processing the semantic feature image of each frame of sample image according to the weight to obtain a processed semantic feature image,
    wherein the superimposing the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image comprises:
    superimposing the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain the superimposed image.
  5. The method according to claim 3, wherein the image semantic network model comprises a multi-layer neural network, and the inputting each frame of target sub-image into the image semantic network model for semantic recognition to obtain a semantic feature image corresponding to each frame of sample image comprises:
    inputting the target sub-image into the image semantic network model, and performing semantic recognition through the multi-layer neural network comprised in the image semantic network model to output multiple frames of candidate feature images, wherein each layer of the neural network outputs one frame of candidate feature image;
    performing grayscale processing on each frame of candidate feature image to obtain a grayscale image; and
    determining a parameter value of each frame of grayscale image, and taking the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, wherein the parameter value is determined according to a sharpness of the grayscale image and/or an amount of information provided by the grayscale image.
  6. The method according to any one of claims 1 to 5, further comprising:
    obtaining a type of the target image; and
    determining a super-resolution network model matching the type of the target image,
    wherein the inputting the target image into a super-resolution network model for processing to obtain a high-resolution image comprises:
    inputting the target image into the super-resolution network model matching the type of the target image for processing to obtain the high-resolution image.
  7. An image processing apparatus, wherein the apparatus comprises:
    an obtaining module, configured to obtain a target image requiring super-resolution processing; and
    a processing module, configured to input the target image into a super-resolution network model for processing to obtain a high-resolution image,
    wherein network parameters of the super-resolution network model are adjusted according to multiple frames of sample images and a semantic feature map corresponding to each frame of sample image, and the semantic feature maps are obtained through semantic recognition by an image semantic network model.
  8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a determining module, configured to determine an error of the super-resolution network model according to the multiple frames of sample images and the semantic feature map corresponding to each frame of sample image; and
    an adjusting module, configured to adjust the network parameters of the super-resolution network model when the error is greater than a preset error value.
  9. The apparatus according to claim 8, wherein
    the determining module is specifically configured to: obtain a high-resolution sub-image and a low-resolution sub-image corresponding to each frame of sample image in the multiple frames of sample images; input each frame of target sub-image into the image semantic network model for semantic recognition to obtain a semantic feature image corresponding to each frame of sample image, wherein the target sub-image is the high-resolution sub-image or the low-resolution sub-image corresponding to any sample image in the multiple frames of sample images; input each frame of low-resolution sub-image into the super-resolution network model for processing to obtain a high-resolution feature image of each frame of sample image; superimpose the high-resolution sub-image of each frame of sample image with the semantic feature image of the corresponding sample image to obtain a superimposed image; determine a degree of difference between the high-resolution feature image of each frame of sample image and the superimposed image of the corresponding sample image; and compute a sum of the degrees of difference and take the sum as the error of the super-resolution network model.
  10. The apparatus according to claim 9, wherein the apparatus further comprises:
    a setting module, configured to set a weight for an image output by the image semantic network model, wherein
    the processing module is further configured to process the semantic feature image of each frame of sample image according to the weight to obtain a processed semantic feature image; and
    the determining module is specifically configured to superimpose the high-resolution sub-image of each frame of sample image with the processed semantic feature image of the corresponding sample image to obtain the superimposed image.
  11. The apparatus according to claim 9, wherein
    the determining module is specifically configured to: input the target sub-image into the image semantic network model, and perform semantic recognition through the multi-layer neural network comprised in the image semantic network model to output multiple frames of candidate feature images, wherein each layer of the neural network outputs one frame of candidate feature image; perform grayscale processing on each frame of candidate feature image to obtain a grayscale image; and determine a parameter value of each frame of grayscale image and take the grayscale image with the largest parameter value as the semantic feature image of the sample image corresponding to the target sub-image, wherein the parameter value is determined according to a sharpness of the grayscale image and/or an amount of information provided by the grayscale image.
  12. The apparatus according to any one of claims 7 to 11, wherein
    the obtaining module is further configured to obtain a type of the target image and determine a super-resolution network model matching the type of the target image; and
    the processing module is configured to input the target image into the super-resolution network model matching the type of the target image for processing to obtain the high-resolution image.
  13. An electronic device, comprising at least one processor, a memory, and instructions that are stored on the memory and executed by the at least one processor, wherein the at least one processor executes the instructions to implement the steps of the image processing method according to any one of claims 1 to 6.
  14. A computer-readable storage medium, wherein the computer storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the steps of the image processing method according to any one of claims 1 to 6.
  15. A computer program product, wherein the computer program product comprises a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to implement the steps of the image processing method according to any one of claims 1 to 6.