CN114022361A - Image processing method, medium, device and computing equipment

Info

Publication number
CN114022361A
Authority
CN
China
Prior art keywords
image
loss function
sample image
function value
quality factor
Prior art date
Legal status
Pending
Application number
CN202111341884.3A
Other languages
Chinese (zh)
Inventor
周琛晖
阮良
陈功
陈丽
Current Assignee
Hangzhou Netease Zhiqi Technology Co Ltd
Original Assignee
Hangzhou Netease Zhiqi Technology Co Ltd
Application filed by Hangzhou Netease Zhiqi Technology Co Ltd filed Critical Hangzhou Netease Zhiqi Technology Co Ltd
Priority to CN202111341884.3A
Publication of CN114022361A

Classifications

    • G06T 3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution (under G06T 3/40, geometric image transformations in the plane of the image)
    • G06N 3/045: Combinations of networks (computing arrangements based on biological models; neural network architectures)
    • G06N 3/08: Learning methods (neural networks)
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 2207/20081: Training; Learning (indexing scheme for image analysis or image enhancement; special algorithmic details)
    • G06T 2207/20084: Artificial neural networks [ANN] (indexing scheme for image analysis or image enhancement; special algorithmic details)


Abstract

Embodiments of the disclosure provide an image processing method, medium, apparatus, and computing device. The image processing method comprises: acquiring an image to be processed, wherein the image to be processed is a degraded image; extracting a feature map and a feature vector corresponding to the image to be processed; determining at least one quality factor according to the feature vector; and obtaining a target image according to the feature map and the at least one quality factor. Because the quality factor corresponding to the image to be processed is taken into account during super-resolution processing, the blocking artifacts caused by compression can be removed while the loss of image detail is effectively reduced, yielding a higher-quality high-resolution image.

Description

Image processing method, medium, device and computing equipment
Technical Field
Embodiments of the present disclosure relate to the field of image processing technologies, and in particular, to an image processing method, medium, apparatus, and computing device.
Background
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
In recent years, with the rapid development of deep learning, super-resolution technology has shown broad application prospects in fields such as image restoration and image enhancement, has become a research hotspot in computer vision, and has attracted attention from both academia and industry.
At present, in deep-learning-based image super-resolution processing, a low-resolution image produced by a pre-assumed degradation mode (such as downsampling) is generally input to a convolutional neural network model for training, and the trained model is then used to perform super-resolution processing on images. However, when the actual degradation mode of an image is inconsistent with the pre-assumed degradation mode, the performance of the trained convolutional neural network model may degrade, resulting in low quality of the obtained image.
Disclosure of Invention
The disclosure provides an image processing method, medium, apparatus, and computing device, which address the problem that, in current image super-resolution processing technology, the performance of a trained convolutional neural network model degrades when the degradation mode of an image is inconsistent with the pre-assumed degradation mode, so that the quality of the obtained image is low.
In a first aspect of embodiments of the present disclosure, there is provided an image processing method comprising:
acquiring an image to be processed, wherein the image to be processed is an image subjected to degradation processing;
extracting a feature map and a feature vector corresponding to an image to be processed;
determining at least one quality factor according to the feature vector, wherein the quality factor is used for representing the quality feature of the image to be processed;
and obtaining a target image according to the feature map and the at least one quality factor, wherein the target image is a high-resolution image corresponding to the image to be processed.
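For concreteness, the following is a minimal PyTorch-style sketch of these four steps; the class and sub-module names (SuperResolutionModel, backbone, factor_head, reconstructor) are illustrative assumptions, not the patent's actual implementation.

```python
# Hypothetical sketch only: module names and wiring are assumptions.
import torch
import torch.nn as nn

class SuperResolutionModel(nn.Module):
    def __init__(self, backbone: nn.Module, factor_head: nn.Module,
                 reconstructor: nn.Module):
        super().__init__()
        self.backbone = backbone            # extracts feature map + feature vector
        self.factor_head = factor_head      # determines quality factors from the vector
        self.reconstructor = reconstructor  # builds the high-resolution target image

    def forward(self, lr_image: torch.Tensor) -> torch.Tensor:
        feature_map, feature_vec = self.backbone(lr_image)        # extraction step
        quality_factors = self.factor_head(feature_vec)           # at least one quality factor
        return self.reconstructor(feature_map, quality_factors)   # target image
```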
In one possible implementation, the image processing method further includes implementing the following steps through an image super-resolution processing model: extracting a feature map and a feature vector corresponding to the image to be processed; determining at least one quality factor according to the feature vector; obtaining a target image according to the feature map and the at least one quality factor; the image super-resolution processing model is obtained by training based on a training set containing a plurality of sample image pairs, wherein each sample image pair contains a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
In one possible embodiment, the image super-resolution processing model is obtained by: acquiring a training set, wherein the training set comprises a plurality of sample image pairs, each sample image pair comprises a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image; training the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of a target quality factor obtained by the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image; and adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In one possible embodiment, adjusting the parameters of the image super-resolution processing model according to the first loss function value and the second loss function value comprises: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
In one possible embodiment, determining the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value includes: determining an updated first loss function value according to the first loss function value and the preset weight; and determining the third loss function value as the sum of the updated first loss function value and the second loss function value.
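In symbols, writing L_1 for the first loss function value, w for its preset weight, and L_2 for the second loss function value, the relation described in the two preceding paragraphs is simply:

```latex
L_3 = w \cdot L_1 + L_2
```

where L_3 is the third loss function value used to adjust the model parameters.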
In one possible embodiment, training the image super-resolution processing model through the first sample image to obtain the first loss function value and the second loss function value includes: inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtaining the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, wherein the target quality factor comprises at least one of the following: image compression rate, blur degree, downsampling multiple, contrast, and saturation; and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation, obtaining the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image includes: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation, obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image includes: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In one possible embodiment, obtaining a training set includes: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
In a second aspect, an embodiment of the present disclosure provides a model training method, including:
acquiring a training set, wherein the training set comprises a plurality of sample image pairs, each sample image pair comprises a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image;
training an image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of a target quality factor obtained by the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image;
and adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In one possible embodiment, adjusting the parameters of the image super-resolution processing model according to the first loss function value and the second loss function value comprises: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
In one possible embodiment, determining the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value includes: determining an updated first loss function value according to the first loss function value and the preset weight; and determining the third loss function value as the sum of the updated first loss function value and the second loss function value.
In one possible embodiment, training the image super-resolution processing model through the first sample image to obtain the first loss function value and the second loss function value includes: inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtaining the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, wherein the target quality factor comprises at least one of the following: image compression rate, blur degree, downsampling multiple, contrast, and saturation; and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation, obtaining the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image includes: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation, obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image includes: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In one possible embodiment, obtaining a training set includes: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
In a third aspect, an embodiment of the present disclosure provides an image processing apparatus, including:
the first acquisition module is used for acquiring an image to be processed, wherein the image to be processed is a degraded image;
the extraction module is used for extracting a feature map and a feature vector corresponding to the image to be processed;
the determining module is used for determining at least one quality factor according to the feature vector, wherein the quality factor is used for representing the quality feature of the image to be processed;
and the second acquisition module is used for obtaining a target image according to the feature map and the at least one quality factor, wherein the target image is a high-resolution image corresponding to the image to be processed.
In one possible embodiment, the image processing apparatus includes a processing module configured to implement the following steps through an image super-resolution processing model: extracting a feature map and a feature vector corresponding to the image to be processed; determining at least one quality factor according to the feature vector; obtaining a target image according to the feature map and the at least one quality factor; the image super-resolution processing model is obtained by training based on a training set containing a plurality of sample image pairs, wherein each sample image pair contains a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
In a possible implementation, the image processing apparatus further includes a third obtaining module, configured to: acquire a training set, wherein the training set comprises a plurality of sample image pairs, each sample image pair comprises a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image; train the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of a target quality factor obtained by the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image; and adjust parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In a possible implementation manner, the third obtaining module, when configured to adjust the parameters of the image super-resolution processing model according to the first loss function value and the second loss function value, is specifically configured to: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
In a possible implementation manner, the third obtaining module, when configured to determine the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value, is specifically configured to: determining an updated first loss function value according to the first loss function value and a preset weight; determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
In a possible implementation manner, the third obtaining module, when configured to train the image super-resolution processing model through the first sample image to obtain the first loss function value and the second loss function value, is specifically configured to: inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtaining the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, wherein the target quality factor comprises at least one of the following: image compression rate, blur degree, downsampling multiple, contrast, and saturation; and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation manner, the third obtaining module, when configured to obtain the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, is specifically configured to: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation manner, the third obtaining module, when configured to obtain the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image, is specifically configured to: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In a possible implementation manner, the third obtaining module, when being configured to obtain the training set, is specifically configured to: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
In a fourth aspect, an embodiment of the present disclosure provides a model training apparatus, including:
the acquisition module is used for acquiring a training set, the training set comprises a plurality of sample image pairs, the sample image pairs comprise a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image;
the training module is used for training the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of a target quality factor obtained by the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of a target image obtained by the image super-resolution processing model relative to the second sample image;
and the processing module is used for adjusting the parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In a possible implementation, the processing module is specifically configured to: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
In a possible implementation manner, the processing module, when configured to determine the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value, is specifically configured to: determining an updated first loss function value according to the first loss function value and a preset weight; determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
In a possible implementation, the training module is specifically configured to: inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtaining the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, wherein the target quality factor comprises at least one of the following: image compression rate, blur degree, downsampling multiple, contrast, and saturation; and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation manner, the training module, when configured to obtain the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, is specifically configured to: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation manner, the training module, when configured to obtain the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image, is specifically configured to: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In a possible implementation manner, the obtaining module is specifically configured to: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
In a fifth aspect, an embodiment of the present disclosure provides a computing device, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the image processing method according to the first aspect of the present disclosure.
In a sixth aspect, an embodiment of the present disclosure provides a computing device, including: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored by the memory to implement the model training method according to the second aspect of the present disclosure.
In a seventh aspect, the present disclosure provides a storage medium, in which computer program instructions are stored, and when executed, the image processing method according to the first aspect of the present disclosure is implemented.
In an eighth aspect, an embodiment of the present disclosure provides a storage medium, in which computer program instructions are stored, and when the computer program instructions are executed, the model training method according to the second aspect of the present disclosure is implemented.
In a ninth aspect, the present disclosure provides a computer program product comprising a computer program, which when executed by a processor implements the image processing method according to the first aspect of the present disclosure.
In a tenth aspect, embodiments of the present disclosure provide a computer program product comprising a computer program which, when executed by a processor, implements a model training method according to the second aspect of the present disclosure.
According to the image processing method, medium, apparatus, and computing device of the disclosure, the image to be processed (a degraded image) is acquired, the feature map and the feature vector corresponding to the image to be processed are extracted, at least one quality factor is determined according to the feature vector, and the target image is obtained according to the feature map and the at least one quality factor. Because the quality factor corresponding to the image to be processed is taken into account, super-resolution processing can be performed adaptively based on different quality factors to obtain the corresponding target image, so that the blocking artifacts caused by compression are removed while the loss of image detail is effectively reduced, and a higher-quality high-resolution image is obtained.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 is a schematic view of an application scenario provided in the embodiment of the present disclosure;
fig. 2 is a flowchart of an image processing method according to an embodiment of the disclosure;
FIG. 3 is a flowchart of an image super-resolution processing model obtaining method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of obtaining training sets from different application scenarios according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of a method for obtaining an image super-resolution processing model according to another embodiment of the present disclosure;
FIG. 6 is a flow chart of a model training method provided by an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 10 is a schematic illustration of a storage medium provided by an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of a computing device according to an embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the disclosure, an image processing method, medium, apparatus, and computing device are provided.
In this context, the following terms are to be understood as: Super-Resolution (SR): reconstructing a corresponding high-resolution image from a low-resolution image using a preset algorithm or a preset model while recovering as much detail information as possible, an important research direction in the field of computer vision; Convolutional Neural Network (CNN): an important algorithm in the field of artificial intelligence, a class of neural networks that include convolution operations and have a deep structure, and one of the representative algorithms of deep learning. Moreover, any number of elements in the drawings is by way of example and not limitation, and any nomenclature is used solely for differentiation and not limitation.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.
Summary of The Invention
The inventors found that current deep-learning-based image super-resolution processing generally performs super-resolution on a low-resolution image produced by a pre-assumed degradation mode (such as downsampling) to obtain a corresponding high-resolution image. In practical scenarios, however, image degradation is very complex and may combine multiple degradation modes (such as repeated compression, downsampling, blur degradation, etc.). When the actual degradation mode of an image is inconsistent with the pre-assumed one, the image quality obtained by current deep-learning-based super-resolution processing is low.
In addition, in the related art, super-resolution processing may be performed as follows: an input image is downsampled to obtain a low-resolution image; the low-resolution image is encoded and decoded with a standard x265 codec; and the decoded low-resolution image is input into a trained deep convolutional neural network model for super-resolution processing to obtain a high-resolution image corresponding to the input image. The deep convolutional neural network model in this approach contains about fifty convolutional layers with 48 channels each, so the number of network parameters is very large and the model is huge. When training this model, an x265 encoder with fixed quantization parameters (QP) is used to preprocess the downsampled low-resolution images to obtain the training data set. However, this approach has the following disadvantages: (1) the model is trained only on images compressed by an x265 encoder, whereas in practical applications image degradation does not come only from encoder compression; multiple degradation modes (such as downsampling and several different kinds of compression) may exist simultaneously, and when the degradation mode is inconsistent with x265 compression the model may not perform well; (2) the low-resolution image is reconstructed directly without considering the quality characteristics of the image, which lacks flexibility, ignores the balance between decompression and detail retention, and often causes excessive loss of detail or incomplete removal of the blocking artifacts caused by compression.
Therefore, current deep-learning-based image super-resolution processing has the following limitations: (1) the adopted CNN schemes usually train a model for one specific degradation mode, lack flexibility, and cannot control the strength of image reconstruction, which in real scenes often causes loss of detail or incomplete removal of the blocking artifacts caused by compression; (2) most network models are trained on synthetic data with a fixed QP and only a single round of compression, whereas images in practical applications are often compressed many times, so such models usually do not perform well in real application scenarios.
In view of the above problems, the present disclosure provides an image processing method, medium, apparatus, and computing device that learn quality factors of an image, such as its degree of compression and downsampling, and perform super-resolution processing according to these quality factors, so that the quality of the super-resolved image is greatly improved: the blocking artifacts caused by compression are removed while the loss of image detail is effectively reduced.
Application scene overview
An application scenario of the scheme provided by the present disclosure is first illustrated with reference to fig. 1. Fig. 1 is a schematic view of an application scenario provided by an embodiment of the present disclosure. As shown in fig. 1, in this application scenario, a client 101 requests video on demand; a server 102 receives the video-on-demand request sent by the client, performs degradation processing (such as downsampling and compression) on the video images, and transmits the degraded video images to the client 101 through a network; the client 101 processes the received video images through a trained image super-resolution processing model to obtain high-resolution video images, and displays them. For the specific implementation process by which the client 101 processes the received video images through the trained image super-resolution processing model, refer to the schemes of the following embodiments.
It should be noted that fig. 1 is only a schematic diagram of an application scenario provided by the embodiment of the present disclosure, and the embodiment of the present disclosure does not limit the devices included in fig. 1, nor does it limit the positional relationship between the devices in fig. 1. For example, in the application scenario shown in fig. 1, a data storage device may be further included, and the data storage device may be an external memory with respect to the client 101 or the server 102, or may be an internal memory integrated in the client 101 or the server 102.
Exemplary method
A method for image processing according to an exemplary embodiment of the present disclosure is described below with reference to fig. 2 in conjunction with the application scenario of fig. 1. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present disclosure, and the embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
First, an image processing method is described by way of a specific embodiment.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present disclosure. The method of the disclosed embodiments may be applied in a computing device, which may be a server or a server cluster or the like. As shown in fig. 2, the method of the embodiment of the present disclosure includes:
s201, obtaining an image to be processed, wherein the image to be processed is an image subjected to degradation processing.
In the embodiment of the present disclosure, the image to be processed is, for example, a video image in different application scenes such as entertainment live broadcast, sports, games, tv series, and the like, and the video image is subjected to degradation processing, specifically, degradation modes such as compression and downsampling, which is not limited in this disclosure. It is understood that the image to be processed is a low resolution image after the degradation processing. Illustratively, the image to be processed may be input by a user to a computing device executing an embodiment of the present disclosure, or transmitted by another device to the computing device executing an embodiment of the present disclosure, and thus, the image to be processed may be obtained.
S202, extracting a feature map and a feature vector corresponding to the image to be processed.
In this step, after the image to be processed is obtained, the feature map and the feature vector corresponding to the image to be processed may be extracted through a pre-trained image super-resolution processing model. Specifically, the image super-resolution processing model includes a convolutional neural network for extracting the feature map and the feature vector corresponding to the image to be processed, and they are extracted through the convolutional layers of this network. The feature map corresponding to the image to be processed includes information such as image texture features, image structure features (such as shapes) and image color features; the feature vector corresponding to the image to be processed is, for example, a 512-dimensional image quality feature vector. Optionally, if the image to be processed is a single current frame, the current frame is input into the pre-trained image super-resolution processing model and the corresponding feature map and feature vector are extracted; if the image to be processed is a multi-frame sequence comprising the current frame and its preceding and following adjacent frames, the frames are input together into the pre-trained image super-resolution processing model, and the feature map and feature vector corresponding to the current frame are extracted. For how to train the image super-resolution processing model, reference may be made to the following embodiments, which are not described herein again.
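As a rough illustration of this extraction step (a sketch under assumed layer sizes, not the patent's actual network), a small convolutional stack can emit the feature map, while global average pooling followed by a fully connected layer emits the 512-dimensional quality feature vector:

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Illustrative only: emits a feature map plus a 512-dim quality vector."""
    def __init__(self, in_ch: int = 3, width: int = 64, vec_dim: int = 512):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(width, vec_dim)

    def forward(self, x: torch.Tensor):
        fmap = self.convs(x)                       # texture/structure/color features
        vec = self.fc(self.pool(fmap).flatten(1))  # 512-dim image quality feature vector
        return fmap, vec
```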
S203, determining at least one quality factor according to the feature vector.
The quality factor is used for representing the quality characteristic of the image to be processed.
Illustratively, the quality factors include, for example, the compression rate, downsampling multiple, blur degree, contrast, and saturation of the image to be processed, which the present disclosure does not limit. After the feature vector corresponding to the image to be processed is obtained, it may be input to the pre-trained image super-resolution processing model, and at least one quality factor corresponding to the image to be processed is determined through the convolutional neural network, included in the model, that determines quality factors from the feature vector.
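The patent describes this branch as a convolutional neural network; purely as a simplifying assumption, the sketch below uses a small fully connected head that maps the 512-dimensional feature vector to K predicted quality factors:

```python
import torch
import torch.nn as nn

class QualityFactorHead(nn.Module):
    """Assumed head: 512-dim feature vector -> K quality factors
    (e.g., compression rate, downsampling multiple, blur degree)."""
    def __init__(self, vec_dim: int = 512, num_factors: int = 5):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(vec_dim, 128), nn.ReLU(inplace=True),
            nn.Linear(128, num_factors),
        )

    def forward(self, vec: torch.Tensor) -> torch.Tensor:
        return self.mlp(vec)
```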
And S204, obtaining a target image according to the feature map and at least one quality factor.
And the target image is a high-resolution image corresponding to the image to be processed.
For example, after the feature map and the at least one quality factor corresponding to the image to be processed are obtained, they may be input to the pre-trained image super-resolution processing model, and the target image is obtained through the convolutional neural network, included in the model, that obtains the target image according to the feature map and the at least one quality factor. For instance, if the quality factors include the compression rate and downsampling multiple of the image to be processed, the pre-trained image super-resolution processing model processes the image according to this compression rate and downsampling multiple to obtain the corresponding high-resolution target image.
After a high resolution target image corresponding to the image to be processed is obtained, the target image may be displayed.
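The patent does not spell out here how the quality factors condition the reconstruction; one plausible sketch, assuming FiLM-style channel modulation of the feature map followed by pixel-shuffle upsampling, is:

```python
import torch
import torch.nn as nn

class Reconstructor(nn.Module):
    """Assumed design: quality factors scale/shift feature channels, then
    a pixel-shuffle layer upsamples to the high-resolution target image."""
    def __init__(self, width: int = 64, num_factors: int = 5, scale: int = 2):
        super().__init__()
        self.film = nn.Linear(num_factors, 2 * width)  # per-channel scale and shift
        self.upsample = nn.Sequential(
            nn.Conv2d(width, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, fmap: torch.Tensor, factors: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.film(factors).chunk(2, dim=1)
        fmap = fmap * gamma[..., None, None] + beta[..., None, None]
        return self.upsample(fmap)
```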
According to the image processing method provided by the embodiment of the disclosure, the image to be processed (a degraded image) is acquired, the feature map and the feature vector corresponding to the image to be processed are extracted, at least one quality factor is determined according to the feature vector, and the target image is obtained according to the feature map and the at least one quality factor. Because the quality factor corresponding to the image to be processed is taken into account, super-resolution processing can be performed adaptively based on different quality factors to obtain the corresponding target image, so that the blocking artifacts caused by compression are removed while the loss of image detail is effectively reduced, and a higher-quality high-resolution image is obtained.
On the basis of the above embodiment, optionally, the following steps may be implemented through an image super-resolution processing model: extracting the feature map and the feature vector corresponding to the image to be processed; determining at least one quality factor according to the feature vector; obtaining the target image according to the feature map and the at least one quality factor; the image super-resolution processing model is obtained by training based on a training set containing a plurality of sample image pairs, wherein each sample image pair contains a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
It should be noted that the image super-resolution processing model is obtained by training based on a training set including a plurality of sample image pairs; for how to train it, reference may be made to the subsequent embodiments, which are not described herein again. It can be understood that inputting the image to be processed into the image super-resolution processing model yields the corresponding high-resolution target image. Specifically, the image to be processed is first input into the image super-resolution processing model, through which the corresponding feature map and feature vector are extracted; the model then determines at least one quality factor corresponding to the image to be processed according to the feature vector; finally, the model obtains the high-resolution target image corresponding to the image to be processed according to the feature map and the at least one quality factor.
Next, an image super-resolution processing model acquisition method will be described by way of a specific embodiment.
Fig. 3 is a flowchart of an image super-resolution processing model obtaining method according to an embodiment of the present disclosure. The method of the disclosed embodiments may be applied in a computing device, which may be a server, a server cluster, or the like. As shown in fig. 3, the method of the embodiment of the present disclosure includes:
s301, a training set is obtained, wherein the training set comprises a plurality of sample image pairs, each sample image pair comprises a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
In the embodiment of the present disclosure, the second sample images are, for example, a plurality of high-definition, high-resolution images used as label images for obtaining the image super-resolution processing model. Performing degradation processing, such as repeated compression and downsampling, on a second sample image yields the corresponding first sample image. It is understood that the first sample image is a low-resolution image obtained by degrading the second sample image. The quality factor label is used to represent the quality characteristics of the second sample image; exemplarily, the quality factor label is a one-dimensional vector such as (10, 15, 50), where 10 represents the compression rate, 15 the blur degree, and 50 the contrast. Illustratively, a sample image pair comprises a high-definition, high-resolution second sample image, the low-resolution first sample image obtained by degrading it, and the corresponding quality factor label.
Further, obtaining the training set may include: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
Illustratively, fig. 4 is a schematic diagram of acquiring training sets from different application scenarios according to an embodiment of the present disclosure. As shown in fig. 4, training sets can be obtained from application scenarios such as live entertainment, games, sports, and TV series, and the disclosure is not limited thereto. First, high-definition, high-resolution second sample images are obtained in different application scenarios, and degradation processing in different degradation modes is performed on them to obtain the corresponding low-resolution first sample images; then, the quality factor labels corresponding to the second sample images are obtained, and a plurality of sample image pairs, i.e., the training set, are obtained from the first sample images, the second sample images and the quality factor labels.
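A possible synthetic degradation pipeline for building such sample image pairs is sketched below; the operators, parameter values, and label layout are illustrative assumptions (the patent's own label example is (compression rate, blur degree, contrast)):

```python
import cv2
import numpy as np

def make_sample_pair(hr: np.ndarray, jpeg_quality: int = 10,
                     blur_sigma: float = 1.5, down: int = 2):
    """Degrade a high-resolution second sample image into a first sample
    image and return both with an illustrative quality factor label."""
    img = cv2.GaussianBlur(hr, (5, 5), blur_sigma)      # blur degradation
    h, w = img.shape[:2]
    img = cv2.resize(img, (w // down, h // down),
                     interpolation=cv2.INTER_AREA)      # downsampling
    _, buf = cv2.imencode('.jpg', img,
                          [cv2.IMWRITE_JPEG_QUALITY, jpeg_quality])  # compression
    lr = cv2.imdecode(buf, cv2.IMREAD_COLOR)
    label = np.array([jpeg_quality, blur_sigma, down], dtype=np.float32)
    return lr, hr, label   # (first sample image, second sample image, label)
```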
S302, training the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of a target quality factor obtained by the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of a target image obtained by the image super-resolution processing model relative to the second sample image.
In this step, the first sample image is input to the image super-resolution processing model for training, and a first loss function and a second loss function are set in the training process, from which the first loss function value and the second loss function value are obtained. The degree of loss of the target quality factor obtained by the image super-resolution processing model relative to the quality factor label is determined according to the first loss function value; the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image is determined according to the second loss function value; and the training of the image super-resolution processing model is supervised through the two loss function values. For how to obtain the first loss function value and the second loss function value, reference may be made to the following embodiments, which are not repeated herein.
And S303, adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
For example, when the image super-resolution processing model is trained, the parameters of the model may be adjusted according to the obtained first loss function value and second loss function value, and the trained image super-resolution processing model is obtained when a preset number of iterations is reached or when a target loss function value derived from the first and second loss function values converges toward a target loss threshold. For how to adjust the parameters of the image super-resolution processing model according to the first and second loss function values, reference may be made to subsequent embodiments, which are not described herein again.
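Steps S302 and S303 can be summarized in one schematic training step, reusing the sketched sub-modules from the earlier illustrations; the choice of L1 losses and the weight value are assumptions:

```python
import torch.nn.functional as F

def train_step(model, optimizer, lr_img, hr_img, qf_label, weight: float = 0.1):
    # Forward pass through the sketched sub-networks
    fmap, vec = model.backbone(lr_img)
    pred_factors = model.factor_head(vec)
    sr_img = model.reconstructor(fmap, pred_factors)

    loss1 = F.l1_loss(pred_factors, qf_label)  # first loss: factors vs. label
    loss2 = F.l1_loss(sr_img, hr_img)          # second loss: target vs. second sample image
    loss3 = weight * loss1 + loss2             # third loss drives the update (S303)

    optimizer.zero_grad()
    loss3.backward()
    optimizer.step()
    return loss3.item()
```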
In the image super-resolution processing model obtaining method provided by the embodiment of the disclosure, the image super-resolution processing model is trained based on a training set including a plurality of sample image pairs, where each sample image pair includes a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image; and the parameters of the model are adjusted according to the first and second loss function values obtained during training, so that the trained image super-resolution processing model is obtained. In this way, the generalization ability of the image super-resolution processing model can be greatly improved, and a higher-quality high-resolution image can be reconstructed from a low-resolution image through the model provided by the embodiment of the disclosure.
Fig. 5 is a flowchart of an image super-resolution processing model obtaining method according to another embodiment of the present disclosure. On the basis of the above embodiments, this embodiment further explains how the image super-resolution processing model is obtained. As shown in fig. 5, the method of this embodiment may include:
S501, a training set is obtained, where the training set includes a plurality of sample image pairs, each sample image pair includes a first sample image, a second sample image, and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
For a detailed description of this step, reference may be made to the related description of S301 in the embodiment shown in fig. 3, and details are not repeated here.
In this embodiment of the present disclosure, the step S302 in fig. 3 may further include the following three steps S502 to S504:
S502, inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image.
Illustratively, the image super-resolution processing model includes a convolutional neural network for obtaining the feature map and the feature vector corresponding to the first sample image, and this convolutional neural network includes, for example, three scales. Each scale may be understood as a group of convolution layers, and each scale provides a skip connection to the convolutional neural network of the image super-resolution processing model that is used to obtain the target image (corresponding to step S504). The numbers of channels at the successive scales are, for example, n, 2n, 4n, and 8n, where n may be, for example, 8, 16, 32, or 64. In this step, the first sample image is input to the image super-resolution processing model, and the feature map and the feature vector corresponding to the first sample image can be obtained through the convolution layers of this convolutional neural network. The feature map corresponding to the first sample image includes information such as image texture features, image structure features (such as shapes), and image color features; the feature vector corresponding to the first sample image is, for example, a 512-dimensional image quality feature vector.
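A minimal sketch of such an encoder follows. The exact layer composition, the strides, and the use of global average pooling plus a linear layer to produce the 512-dimensional vector are assumptions; the disclosure fixes only the multi-scale structure, the channel widths, and the two outputs.

```python
import torch.nn as nn

class Encoder(nn.Module):
    """Multi-scale CNN yielding per-scale feature maps and a quality feature vector."""
    def __init__(self, n=64):
        super().__init__()
        chs = [n, 2 * n, 4 * n, 8 * n]   # channel widths across successive scales
        self.scales = nn.ModuleList()
        in_ch = 3
        for i, ch in enumerate(chs):
            self.scales.append(nn.Sequential(
                nn.Conv2d(in_ch, ch, 3, stride=1 if i == 0 else 2, padding=1),
                nn.ReLU(inplace=True)))
            in_ch = ch
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(chs[-1], 512)  # 512-dimensional image quality feature vector

    def forward(self, x):
        feats = []
        for scale in self.scales:
            x = scale(x)
            feats.append(x)               # kept for skip connections to the decoder
        vec = self.fc(self.pool(x).flatten(1))
        return feats, vec
```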
S503, obtaining a first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image.
Wherein the target quality factor comprises at least one of: image compression ratio, blur degree, down-sampling multiple, contrast, and saturation.
In this step, the image compression ratio is the ratio between the data size of the image after compression by the encoder and the data size of the original image; image blur is, for example, caused by the loss of high-frequency content during image compression, and the blur degree measures how blurred the image is; the down-sampling multiple is the factor by which the image is down-sampled to obtain the corresponding low-resolution image; contrast is a measure of the difference in brightness between the brightest white and the darkest black regions of the image; saturation is the degree of vividness of the image's colors. Illustratively, the image super-resolution processing model includes a convolutional neural network for outputting a predicted quality factor according to the feature vector corresponding to the first sample image. During training of the image super-resolution processing model, because the image blocks obtained according to the feature vector corresponding to the first sample image are small, the predicted quality factor may not be accurate enough and training may be unstable, so a first loss function is used for supervision. After the feature vector corresponding to the first sample image is obtained, it is input to this convolutional neural network to output the predicted quality factor, a first loss function value is obtained from the predicted quality factor and the quality factor label through the following first loss function, and the target quality factor is then obtained based on the first loss function value:
$$L_{IQ} = \left\| \hat{q} - q \right\|$$

wherein $L_{IQ}$ represents the first loss function value, $\hat{q}$ represents the predicted quality factor output by the image super-resolution processing model, and $q$ represents the quality factor label.
Further, obtaining the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image may include: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
Illustratively, the predicted quality factor, that is, the quality factor corresponding to the first sample image, can be obtained by inputting the feature vector corresponding to the first sample image into the image super-resolution processing model. According to the quality factor and the quality factor label corresponding to the first sample image, the first loss function value can be obtained through the first loss function, and then the target quality factor is obtained based on the first loss function value.
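As a sketch, the quality predictor and the first loss value might look like the code below. The two-layer head, the factor count of five, and the L1 distance are assumptions; the disclosure specifies only the 512-dimensional feature vector input and the predicted quality factor output.

```python
import torch.nn as nn

class QualityPredictor(nn.Module):
    """Maps the 512-dim feature vector to a predicted quality factor vector
    (e.g. compression ratio, blur degree, down-sampling multiple, contrast, saturation)."""
    def __init__(self, in_dim=512, num_factors=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, num_factors))

    def forward(self, feature_vector):
        return self.head(feature_vector)

def first_loss(pred_qf, qf_label):
    # L_IQ: distance between the predicted quality factor and the quality factor label
    return (pred_qf - qf_label).abs().mean()
```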
S504, obtaining a second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
Illustratively, the image super-resolution processing model includes a convolutional neural network for generating the target image according to the feature map and the target quality factor corresponding to the first sample image. After the feature map and the target quality factor corresponding to the first sample image are obtained, they are input to this convolutional neural network to generate the target image, and a second loss function value is obtained from the target image and the second sample image through the following second loss function:
$$L_{SR} = \left\| \hat{y} - y \right\|$$

wherein $L_{SR}$ represents the second loss function value, $\hat{y}$ represents the target image generated by the image super-resolution processing model, and $y$ represents the second sample image corresponding to the first sample image, i.e., the label image.
Further, obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image may include: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
For example, the feature map and the target quality factor corresponding to the first sample image are input to the image super-resolution processing model to generate the target image corresponding to the first sample image, and the second loss function value is then obtained through the second loss function according to the target image corresponding to the first sample image and the second sample image.
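A sketch of the reconstruction network and the second loss value follows. Conditioning on the target quality factor by broadcasting it as extra feature channels, and the sub-pixel (PixelShuffle) upsampling, are assumptions about mechanisms this passage leaves open.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Reconstructs the target image from the feature map, conditioned on the
    target quality factor (broadcast over the spatial dimensions)."""
    def __init__(self, feat_ch=512, num_factors=5, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(feat_ch + num_factors, feat_ch, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale))      # sub-pixel upsampling

    def forward(self, feature_map, quality_factor):
        b, _, h, w = feature_map.shape
        qf = quality_factor.view(b, -1, 1, 1).expand(-1, -1, h, w)
        return self.body(torch.cat([feature_map, qf], dim=1))

def second_loss(target_img, second_img):
    # L_SR: distance between the reconstructed target image and the label image
    return (target_img - second_img).abs().mean()
```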
In the embodiment of the present disclosure, the step S303 in fig. 3 may further include the following two steps S505 and S506:
and S505, determining a third loss function value according to the first loss function value, the preset weight corresponding to the first loss function and the second loss function value.
In this step, the preset weight is, for example, an empirically determined weight value corresponding to the first loss function. After the first loss function value and the second loss function value are obtained, the third loss function value may be determined according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value.
Further, determining a third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value may include: determining an updated first loss function value according to the first loss function value and a preset weight; determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
Illustratively, the third loss function value is determined by a third loss function as follows:
$$L = L_{SR} + \alpha L_{IQ}$$
wherein L represents a third loss function value, and α represents a preset weight corresponding to the first loss function.
It will be appreciated that the third loss function value is the overall loss value of the image super-resolution processing model.
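In code, the third loss function value is simply this weighted sum (the value of α is an empirical choice; 0.1 here is only a placeholder):

```python
import torch

def third_loss(loss_sr: torch.Tensor, loss_iq: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    # L = L_SR + alpha * L_IQ: the overall loss supervising the whole model
    return loss_sr + alpha * loss_iq
```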
S506, adjusting parameters of the image super-resolution processing model according to the third loss function value to obtain the trained image super-resolution processing model.
For example, after the third loss function value is obtained, parameters of the image super-resolution processing model may be adjusted according to the third loss function value, and the trained image super-resolution processing model is obtained when a preset number of iterations is reached or when the third loss function value converges to the target loss threshold.
The method for acquiring the image super-resolution processing model provided by the embodiment of the disclosure can greatly improve the generalization capability of the image super-resolution processing model, and can reconstruct a high-resolution image with higher quality from a low-resolution image through the image super-resolution processing model provided by the embodiment of the disclosure.
Fig. 6 is a flowchart of a model training method according to an embodiment of the present disclosure. The method of the disclosed embodiments may be applied in a computing device, which may be a server or a server cluster or the like. As shown in fig. 6, the method of the embodiment of the present disclosure includes:
S601, a training set is obtained, where the training set includes a plurality of sample image pairs, each sample image pair includes a first sample image, a second sample image, and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
Illustratively, referring to fig. 4, the training set may be obtained from different application scenarios such as entertainment live streaming, games, sports, and TV series, and the disclosure is not limited thereto. First, high-definition, high-resolution second sample images in different application scenarios are obtained, and degradation processing in different degradation modes is performed on the second sample images to obtain the corresponding low-resolution first sample images; then, the quality factor label corresponding to each second sample image is obtained, and a plurality of sample image pairs are formed from the first sample images, the second sample images, and the quality factor labels, that is, the training set is obtained.
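One way such a sample pair could be constructed is sketched below with OpenCV. The specific degradation operators, their parameter ranges, and the composition of the quality factor label are illustrative assumptions; the disclosure requires only that the first sample image be a degraded version of the second and that the label record its quality characteristics.

```python
import random
import cv2
import numpy as np

def make_sample_pair(second_img: np.ndarray):
    """Degrade a high-resolution image and record a quality factor label."""
    h, w = second_img.shape[:2]
    down = random.choice([2, 3, 4])            # down-sampling multiple
    blur_sigma = random.uniform(0.0, 3.0)      # blur degree
    jpeg_q = random.randint(30, 95)            # controls the compression ratio
    img = cv2.GaussianBlur(second_img, (0, 0), blur_sigma + 1e-6)
    img = cv2.resize(img, (w // down, h // down), interpolation=cv2.INTER_AREA)
    ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, jpeg_q])
    first_img = cv2.imdecode(buf, cv2.IMREAD_COLOR)
    compression_ratio = buf.size / img.nbytes  # compressed size vs. raw size
    qf_label = np.array([compression_ratio, blur_sigma, down], dtype=np.float32)
    return first_img, second_img, qf_label
```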
S602, training the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the degree of loss of the target quality factor obtained according to the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image.
In this step, the first sample image is input to the image super-resolution processing model for training, and the first loss function and the second loss function shown in the above embodiments are set for the training process, from which the first loss function value and the second loss function value can be obtained. The degree of loss of the target quality factor obtained according to the image super-resolution processing model relative to the quality factor label can be determined from the first loss function value; the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image can be determined from the second loss function value. Training of the image super-resolution processing model is supervised through the first loss function value and the second loss function value.
S603, adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
For example, when the image super-resolution processing model is trained, parameters of the model may be adjusted according to the obtained first loss function value and second loss function value, and the trained image super-resolution processing model is obtained when a preset number of iterations is reached or when a target loss function value derived from the first loss function value and the second loss function value converges to a target loss threshold.
In some embodiments, adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value may include: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
Optionally, determining a third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value may include: determining an updated first loss function value according to the first loss function value and a preset weight; determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
In some embodiments, training the image super-resolution processing model through the first sample image to obtain the first loss function value and the second loss function value may include: inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtaining the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, where the target quality factor includes at least one of: image compression ratio, blur degree, down-sampling multiple, contrast, and saturation; and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
Optionally, obtaining the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image may include: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
Optionally, obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image may include: obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In some embodiments, obtaining the training set may include: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
It should be noted that the implementation principle and technical effects of the model training method provided in the embodiment of the present disclosure are similar to those of the image super-resolution processing model obtaining method, and are not described here again.
The technical scheme provided by the embodiments of the present disclosure performs well when applying super-resolution processing to images from real video scenes, and the reconstructed high-resolution images give better subjective visual perception. For example, in online live-streaming and video-on-demand scenarios, under the same bandwidth, the technical scheme provided by the embodiments of the present disclosure can reconstruct the images of a video after transmission and decoding with a higher-definition effect. Meanwhile, the transmitted video can be further down-sampled and compressed, transmitted at a lower bit rate, and then restored through the technical scheme provided by the embodiments of the present disclosure, that is, the reconstructed high-resolution images are obtained, thereby reducing the bit rate and saving bandwidth resources. According to the technical scheme provided by the embodiments of the present disclosure, a complex neural network is used to learn the degradation mode of the video images to obtain the quality factors of the images, and image reconstruction is performed adaptively according to the different quality factors, which can remove the blocking artifacts caused by video compression, effectively reduce the loss of detail, and yield better reconstructed video quality.
Exemplary devices
Having described the method of the exemplary embodiment of the present disclosure, the apparatus of the exemplary embodiment of the present disclosure will next be described with reference to fig. 7. The apparatus according to the exemplary embodiment of the present disclosure can implement the processes in the foregoing image processing method embodiments, and achieve the same functions and effects.
Fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure, and as shown in fig. 7, an image processing apparatus 700 according to an embodiment of the present disclosure includes: a first obtaining module 701, an extracting module 702, a determining module 703 and a second obtaining module 704. Wherein:
the first obtaining module 701 is configured to obtain an image to be processed, where the image to be processed is an image after degradation processing.
An extracting module 702 is configured to extract a feature map and a feature vector corresponding to the image to be processed.
A determining module 703, configured to determine at least one quality factor according to the feature vector, where the quality factor is used to represent a quality feature of the image to be processed.
And a second obtaining module 704, configured to obtain a target image according to the feature map and the at least one quality factor, where the target image is a high-resolution image corresponding to the image to be processed.
In one possible implementation, the image processing apparatus includes a processing module 705 operable to implement the following steps through an image super-resolution processing model: extracting the feature map and the feature vector corresponding to the image to be processed; determining at least one quality factor according to the feature vector; and obtaining the target image according to the feature map and the at least one quality factor. The image super-resolution processing model is obtained by training based on a training set containing a plurality of sample image pairs, where each sample image pair contains a first sample image, a second sample image, and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
In a possible implementation, the image processing apparatus further includes a third obtaining module 706, configured to: acquire a training set, where the training set includes a plurality of sample image pairs, each sample image pair includes a first sample image, a second sample image, and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing the quality characteristics of the second sample image; train the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, where the first loss function value is used for indicating the degree of loss of a target quality factor obtained according to the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image; and adjust parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In a possible implementation manner, when configured to adjust parameters of the image super-resolution processing model according to the first loss function value and the second loss function value, the third obtaining module 706 is specifically configured to: determine a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value; and adjust the parameters of the image super-resolution processing model according to the third loss function value.
In a possible implementation manner, when configured to determine the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value, the third obtaining module 706 is specifically configured to: determine an updated first loss function value according to the first loss function value and the preset weight; and determine the third loss function value as the sum of the updated first loss function value and the second loss function value.
In a possible implementation, when configured to train the image super-resolution processing model through the first sample image to obtain the first loss function value and the second loss function value, the third obtaining module 706 is specifically configured to: input the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtain the first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, where the target quality factor includes at least one of: image compression ratio, blur degree, down-sampling multiple, contrast, and saturation; and obtain the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation manner, when configured to obtain the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, the third obtaining module 706 is specifically configured to: obtain a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtain the first loss function value and the target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation manner, when configured to obtain the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image, the third obtaining module 706 is specifically configured to: obtain a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtain the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In a possible implementation, the third obtaining module 706, when used for obtaining the training set, is specifically configured to: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
The apparatus of the embodiment of the present disclosure may be configured to execute the scheme of the image processing method in any one of the method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 8 is a schematic structural diagram of an image processing apparatus according to another embodiment of the present disclosure, and as shown in fig. 8, an image processing apparatus 800 according to an embodiment of the present disclosure includes: an encoder 801, a quality predictor 802 and a decoder 803. Wherein:
The encoder 801 is configured to extract the feature map and the feature vector corresponding to the image to be processed from the input degradation-processed image to be processed. As shown in fig. 8, the encoder 801 comprises three scales, each providing a skip connection to the decoder 803, and the numbers of channels at the scales are, for example, n, 2n, 4n, and 8n, where n may be, for example, 8, 16, 32, or 64.
The quality predictor 802 is a convolutional neural network, and is configured to output a predicted target quality factor, i.e., a target quality factor corresponding to the image to be processed, based on the feature vector corresponding to the image to be processed output by the encoder 801, where the target quality factor is, for example, (a, b, c).
The decoder 803, which may also be referred to as a reconstruction module, is configured to obtain a high-resolution target image corresponding to the image to be processed according to the feature map corresponding to the image to be processed output by the encoder 801 and the target quality factor output by the quality predictor 802.
It is understood that the function of the encoder in the embodiments of the present disclosure is similar to that of the extraction module of the image processing apparatus in the above-described embodiments; the function of the quality predictor in the embodiment of the present disclosure is similar to that of the determination module of the image processing apparatus in the above-described embodiment; the function of the decoder in the embodiment of the present disclosure is similar to that of the second acquisition module of the image processing apparatus in the above-described embodiment.
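Tying the components together, inference through such an apparatus could look like the sketch below, reusing the illustrative Encoder, QualityPredictor, and Decoder classes sketched earlier. The class names and the choice to feed only the deepest feature map to the decoder, omitting the skip connections, are simplifying assumptions, not names from the disclosure.

```python
import torch
import torch.nn as nn

class SuperResolutionModel(nn.Module):
    """Sketch of the encoder / quality predictor / decoder pipeline."""
    def __init__(self):
        super().__init__()
        self.encoder = Encoder(n=64)                 # cf. 801: feature map + vector
        self.quality_predictor = QualityPredictor()  # cf. 802: target quality factor
        self.decoder = Decoder()                     # cf. 803: high-resolution image

    def forward(self, degraded_img):
        feats, vec = self.encoder(degraded_img)
        quality_factor = self.quality_predictor(vec)     # e.g. (a, b, c, ...)
        # simplified: only the deepest feature map is decoded; skip connections omitted
        target_img = self.decoder(feats[-1], quality_factor)
        return quality_factor, target_img

# usage sketch with a dummy degraded input
model = SuperResolutionModel().eval()
with torch.no_grad():
    qf, high_res = model(torch.randn(1, 3, 64, 64))
```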
Fig. 9 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure, and as shown in fig. 9, a model training apparatus 900 according to an embodiment of the present disclosure includes: an acquisition module 901, a training module 902, and a processing module 903. Wherein:
the obtaining module 901 is configured to obtain a training set, where the training set includes a plurality of sample image pairs, each sample image pair includes a first sample image, a second sample image, and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used to represent a quality feature of the second sample image.
A training module 902, configured to train the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, where the first loss function value is used for indicating the degree of loss of the target quality factor obtained according to the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the degree of loss of the target image obtained by the image super-resolution processing model relative to the second sample image.
A processing module 903, configured to adjust parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
In a possible implementation, the processing module 903 may be specifically configured to: determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function and the second loss function value; and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
In a possible implementation manner, the processing module 903, when configured to determine the third loss function value according to the first loss function value, the preset weight corresponding to the first loss function, and the second loss function value, may specifically be configured to: determining an updated first loss function value according to the first loss function value and a preset weight; determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
In one possible implementation, the training module 902 may be specifically configured to: input the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image; obtain a first loss function value and a target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, where the target quality factor includes at least one of: image compression ratio, blur degree, down-sampling multiple, contrast, and saturation; and obtain a second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image.
In a possible implementation manner, the training module 902, when configured to obtain the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image, may specifically be configured to: obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image; and obtaining a first loss function value and a target quality factor according to the quality factor and the quality factor label corresponding to the first sample image.
In a possible implementation manner, when configured to obtain the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image, the training module 902 may specifically be configured to: obtain a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor; and obtain the second loss function value according to the target image corresponding to the first sample image and the second sample image.
In a possible implementation, the obtaining module 901 may be specifically configured to: acquiring second sample images in different application scenes; carrying out degradation processing in different degradation modes on the second sample image to obtain a corresponding first sample image; acquiring a quality factor label corresponding to the second sample image; and acquiring a training set according to the first sample image, the second sample image and the quality factor label.
The model training device provided in the embodiments of the present disclosure may be used in a scheme for executing the model training method in any one of the above method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
Exemplary Medium
Having described the method of the exemplary embodiment of the present disclosure, next, a storage medium of the exemplary embodiment of the present disclosure will be described with reference to fig. 10.
Fig. 10 is a schematic diagram of a storage medium according to an embodiment of the disclosure. Referring to fig. 10, a storage medium 1000 stores therein a program product for implementing the above method according to an embodiment of the present disclosure, which may employ a portable compact disc read only memory (CD-ROM) and include program codes, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. The readable signal medium may also be any readable medium other than a readable storage medium.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN).
Exemplary computing device
Having described the methods, media, and apparatus of the exemplary embodiments of the present disclosure, a computing device of the exemplary embodiments of the present disclosure is described next with reference to fig. 11.
The computing device 1100 shown in fig. 11 is only one example and should not impose any limitations on the functionality or scope of use of embodiments of the disclosure.
Fig. 11 is a schematic structural diagram of a computing device provided in an embodiment of the present disclosure, and as shown in fig. 11, the computing device 1100 is represented in the form of a general-purpose computing device. Components of computing device 1100 may include, but are not limited to: the at least one processing unit 1101, the at least one storage unit 1102, and a bus 1103 connecting different system components (including the processing unit 1101 and the storage unit 1102).
The bus 1103 includes a data bus, a control bus, and an address bus.
The storage unit 1102 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 11021 and/or cache memory 11022, and may further include readable media in the form of non-volatile memory, such as Read Only Memory (ROM) 11023.
The memory unit 1102 may also include a program/utility 11025 having a set (at least one) of program modules 11024, such program modules 11024 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Computing device 1100 can also communicate with one or more external devices 1104 (e.g., keyboard, pointing device, etc.). Such communication may occur via input/output (I/O) interfaces 1105. Moreover, the computing device 1100 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 1106. As shown in fig. 11, the network adapter 1106 communicates with the other modules of the computing device 1100 over the bus 1103. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the computing device 1100, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
It should be noted that although in the above detailed description several units/modules or sub-units/modules of the image processing apparatus and the model training apparatus are mentioned, such a division is merely exemplary and not mandatory. Indeed, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module, in accordance with embodiments of the present disclosure. Conversely, the features and functions of one unit/module described above may be further divided into embodiments by a plurality of units/modules.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, nor is the division of aspects, which is for convenience only as the features in such aspects may not be combined to benefit. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. An image processing method, comprising:
acquiring an image to be processed, wherein the image to be processed is an image subjected to degradation processing;
extracting a characteristic diagram and a characteristic vector corresponding to the image to be processed;
determining at least one quality factor according to the feature vector, wherein the quality factor is used for representing the quality feature of the image to be processed;
and obtaining a target image according to the characteristic diagram and the at least one quality factor, wherein the target image is a high-resolution image corresponding to the image to be processed.
2. The image processing method according to claim 1, further comprising:
the following steps are realized through an image super-resolution processing model:
extracting a characteristic diagram and a characteristic vector corresponding to the image to be processed;
determining the at least one quality factor according to the feature vector;
obtaining the target image according to the feature map and the at least one quality factor;
the image super-resolution processing model is obtained by training based on a training set comprising a plurality of sample image pairs, wherein the sample image pairs comprise a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image.
3. The image processing method according to claim 2, wherein the image super-resolution processing model is obtained by:
acquiring a training set, wherein the training set comprises a plurality of sample image pairs, the sample image pairs comprise the first sample image, the second sample image and the quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image;
training the image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the loss degree of a target quality factor obtained according to the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the loss degree of a target image obtained by the image super-resolution processing model relative to the second sample image;
and adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
4. The image processing method of claim 3, wherein said adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value comprises:
determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function and the second loss function value;
and adjusting the parameters of the image super-resolution processing model according to the third loss function value.
5. The method according to claim 4, wherein determining a third loss function value according to the first loss function value, a preset weight corresponding to the first loss function, and the second loss function value comprises:
determining an updated first loss function value according to the first loss function value and the preset weight;
determining a third loss function value as a sum of the updated first loss function value and the second loss function value.
6. The image processing method of any of claims 3 to 5, wherein the training the image super-resolution processing model by the first sample image to obtain a first loss function value and a second loss function value comprises:
inputting the first sample image into the image super-resolution processing model to obtain a feature map and a feature vector corresponding to the first sample image;
obtaining the first loss function value and the target quality factor according to the feature vector corresponding to the first sample image and the quality factor label, wherein the target quality factor comprises at least one of: image compression ratio, blur degree, down-sampling multiple, contrast, and saturation;
and obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor and the second sample image.
7. The method according to claim 6, wherein the obtaining the first loss function value and the target quality factor according to the feature vector and the quality factor label corresponding to the first sample image comprises:
obtaining a quality factor corresponding to the first sample image according to the feature vector corresponding to the first sample image;
and obtaining the first loss function value and the target quality factor according to the quality factor corresponding to the first sample image and the quality factor label.
8. The method according to claim 6, wherein obtaining the second loss function value according to the feature map corresponding to the first sample image, the target quality factor, and the second sample image comprises:
obtaining a target image corresponding to the first sample image according to the feature map corresponding to the first sample image and the target quality factor;
and obtaining the second loss function value according to the target image corresponding to the first sample image and the second sample image.
9. The image processing method according to any one of claims 3 to 5, wherein the acquiring a training set comprises:
acquiring the second sample image under different application scenes;
carrying out degradation processing in different degradation modes on the second sample image to obtain the corresponding first sample image;
acquiring the quality factor label corresponding to the second sample image;
and acquiring the training set according to the first sample image, the second sample image and the quality factor label.
10. A method of model training, comprising:
acquiring a training set, wherein the training set comprises a plurality of sample image pairs, the sample image pairs comprise a first sample image, a second sample image and a quality factor label, the first sample image is an image obtained by performing degradation processing on the second sample image, and the quality factor label is used for representing quality characteristics of the second sample image;
training an image super-resolution processing model through the first sample image to obtain a first loss function value and a second loss function value, wherein the first loss function value is used for indicating the loss degree of a target quality factor obtained according to the image super-resolution processing model relative to the quality factor label, and the second loss function value is used for indicating the loss degree of a target image obtained by the image super-resolution processing model relative to the second sample image;
and adjusting parameters of the image super-resolution processing model according to the first loss function value and the second loss function value to obtain the trained image super-resolution processing model.
CN202111341884.3A 2021-11-12 2021-11-12 Image processing method, medium, device and computing equipment Pending CN114022361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111341884.3A CN114022361A (en) 2021-11-12 2021-11-12 Image processing method, medium, device and computing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111341884.3A CN114022361A (en) 2021-11-12 2021-11-12 Image processing method, medium, device and computing equipment

Publications (1)

Publication Number Publication Date
CN114022361A true CN114022361A (en) 2022-02-08

Family

ID=80063958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111341884.3A Pending CN114022361A (en) 2021-11-12 2021-11-12 Image processing method, medium, device and computing equipment

Country Status (1)

Country Link
CN (1) CN114022361A (en)

Similar Documents

Publication Publication Date Title
CN109472270B (en) Image style conversion method, device and equipment
CN110717868B (en) Video high dynamic range inverse tone mapping model construction and mapping method and device
CN112950471A (en) Video super-resolution processing method and device, super-resolution reconstruction model and medium
CN112967207A (en) Image processing method and device, electronic equipment and storage medium
CN113724136A (en) Video restoration method, device and medium
CN114972001A (en) Image sequence rendering method and device, computer readable medium and electronic equipment
Luo et al. Masked360: Enabling Robust 360-degree Video Streaming with Ultra Low Bandwidth Consumption
CN112261417B (en) Video pushing method and system, equipment and readable storage medium
CN113658122A (en) Image quality evaluation method, device, storage medium and electronic equipment
US11928855B2 (en) Method, device, and computer program product for video processing
CN115205117B (en) Image reconstruction method and device, computer storage medium and electronic equipment
CN114900717B (en) Video data transmission method, device, medium and computing equipment
CN113014745B (en) Video image noise reduction method and device, storage medium and electronic equipment
CN115861121A (en) Model training method, image processing method, device, electronic device and medium
CN114022361A (en) Image processing method, medium, device and computing equipment
CN113658073A (en) Image denoising processing method and device, storage medium and electronic equipment
Singh et al. A content adaptive method of de-blocking and super-resolution of compressed images
CN116366852A (en) Video coding and decoding method, device, equipment and medium for machine vision task
CN111062886A (en) Super-resolution method, system, electronic product and medium for hotel pictures
Liu et al. Soft-IntroVAE for Continuous Latent Space Image Super-Resolution
US11647153B1 (en) Computer-implemented method, device, and computer program product
CN112073731B (en) Image decoding method, image decoding device, computer-readable storage medium and electronic equipment
US20240185388A1 (en) Method, electronic device, and computer program product for image processing
CN117036177A (en) Image restoration model determining method, image restoration method and device
CN117649353A (en) Image processing method, device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination