WO2023169318A1 - Image quality determination method, apparatus, device, and storage medium - Google Patents

Image quality determination method, apparatus, device, and storage medium

Info

Publication number
WO2023169318A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
pixel
weight
semantic
saliency
Prior art date
Application number
PCT/CN2023/079505
Other languages
French (fr)
Chinese (zh)
Inventor
王帅
石雅南
Original Assignee
百果园技术(新加坡)有限公司
王帅
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司 and 王帅
Publication of WO2023169318A1 publication Critical patent/WO2023169318A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular to an image quality determination method, apparatus, device, and storage medium.
  • Image quality evaluation refers to the quantitative description of the degree of distortion between two images with similar subject content in a subjective and objective manner. It plays a very important role in algorithm analysis and comparison, system performance evaluation, etc. in the field of image/video processing.
  • Image quality evaluation can be divided into subjective evaluation and objective evaluation in terms of methods.
  • Subjective evaluation refers to evaluating image quality through people's subjective feelings; that is, given the original reference image and the distorted image, observers are asked to evaluate the distorted image. It is generally described by the mean subjective opinion score (MOS) or the mean subjective opinion score difference (DMOS).
  • Subjective evaluation requires a lot of manpower and material resources, and the evaluation results are easily affected by the tester's subjective factors and external conditions. The complexity of the evaluation process seriously affects its accuracy and versatility, so applying it to actual video processing systems is extremely difficult.
  • objective evaluation uses mathematical models to directly give quantitative values of distortion, is simple to operate, and is widely used in various fields.
  • image quality can be objectively evaluated through a series of objective evaluation indicators commonly used in the industry.
  • although objective evaluation indicators are simple to operate and easy to implement, different objective indicators match the subjective perception of the human eye to different extents, resulting in unsatisfactory image quality evaluation results.
  • the embodiments of the present application provide an image quality determination method, device, equipment and storage medium, which solve the problem in related technologies that image quality evaluation results cannot well reflect subjective image quality perception, so that the final image quality evaluation results are more in line with the subjective perception of the human eye.
  • embodiments of the present application provide an image quality determination method, which method includes:
  • Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information; determine the pixel weight of each pixel according to the semantic information and the saliency information; and calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
  • embodiments of the present application also provide an image quality determination device, including:
  • An image acquisition module configured to acquire the distorted image and the corresponding original image
  • An information extraction module configured to extract information from the original image to obtain semantic information and saliency information
  • a weight calculation module configured to determine the pixel weight of each pixel based on the semantic information and the saliency information
  • An image quality calculation module configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixels of the original image and the distorted image.
  • embodiments of the present application also provide an image quality determination device, which includes:
  • one or more processors
  • a storage device for storing one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image quality determination method described in the embodiments of this application.
  • embodiments of the present application also provide a storage medium that stores computer-executable instructions, which, when executed by a computer processor, are used to perform the image quality determination method described in the embodiments of the present application.
  • embodiments of the present application also provide a computer program product.
  • the computer program product includes a computer program.
  • the computer program is stored in a computer-readable storage medium.
  • At least one processor of the device reads the computer program from the computer-readable storage medium and executes it, so that the device performs the image quality determination method described in the embodiments of the present application.
  • the pixel weight of each pixel is determined based on the semantic information and saliency information, and image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image, which improves the accuracy of image quality evaluation and makes the final image quality evaluation result more consistent with the subjective perception of the human eye while meeting the objectivity requirements.
  • Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application.
  • Figure 2 is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application
  • Figure 3 is a flow chart of another image quality determination method provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram illustrating semantic information obtained through information extraction provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram illustrating saliency information obtained through information extraction provided by an embodiment of the present application.
  • Figure 6 is a flow chart of another image quality determination method provided by an embodiment of the present application.
  • Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application.
  • Figure 8 is a structural block diagram of an image quality determination device provided by an embodiment of the present application.
  • Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application.
  • first, second, etc. in the description and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It is to be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. Objects distinguished by "first" and "second" are usually of one type, and the number of objects is not limited; for example, the first object can be one or multiple.
  • "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.
  • Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application, which can be used to evaluate image and video quality.
  • the method can be executed by computing devices such as servers, smart terminals, notebooks, and tablets, and specifically includes the following steps:
  • Step S101 Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.
  • a comparison calculation between the distorted image and the original image is performed to obtain the quantified value of the distorted image and thereby obtain the corresponding image quality evaluation.
  • the original image may be a clear undistorted image, that is, the reference image of the distorted image.
  • Distorted images are images with different distortions relative to the original image, that is, noise images.
  • the embodiment of this application builds on the objective image quality evaluation method and introduces the attention mechanism to achieve a final, more reasonable image quality evaluation.
  • the acquired original image may be one image or a video frame sequence composed of multiple image frames, such as multiple consecutive frames in a live video.
  • Semantic information and saliency information are obtained by extracting information from the original image.
  • the information extraction method can be to extract information from the original image through a trained neural network model, or an image feature extraction algorithm can be used to extract information.
  • the semantic information represents the category of each pixel in the image.
  • the category of the pixel can be one of multiple preset object categories. Examples of the object categories include faces, microphones, chairs, hair, keyboards, etc.
  • the saliency information represents the grayscale of each pixel in the image, which represents the degree of attention of each pixel to human vision, and can be a normalized grayscale image.
  • the semantic information and saliency information are presented through image masks.
  • Step S102 Determine the pixel weight of each pixel according to the semantic information and the saliency information.
  • Figure 2 is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application, which specifically includes:
  • Step S1021 Determine the semantic weight of each pixel according to the semantic information and the corresponding basic weight.
  • the semantic information includes the semantic index value of each pixel obtained through information extraction, where different semantic index values correspond to different object categories. For example, taking the live broadcast scene as an example, there are 13 categories corresponding to index values 0 to 12; that is, semantic index value 0 corresponds to category 1, semantic index value 1 corresponds to category 2, semantic index value 2 corresponds to category 3, and so on. Different object categories are assigned corresponding basic weights; for example, category 1 corresponds to basic weight 1, category 2 corresponds to basic weight 2, and so on.
  • the method of determining the semantic weight of each pixel based on the semantic information and the corresponding basic weight may be: determining the semantic index value of each pixel, and determining the basic weight corresponding to the semantic index value of each pixel as the semantic weight.
  • the basic weights are recorded as ω_0, ω_1, ..., ω_N, corresponding in turn to semantic index value 0 through semantic index value N, and the semantic weight of the pixel at position (i, j) is recorded as W_se(i, j); then the semantic weight is W_se(i, j) = ω_{p_se(i, j)}, where p_se(i, j) represents the semantic index value of the pixel at position (i, j) in the image.
  • the specific size of the basic weight can be set based on subjective experience or measured through statistical methods.
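  • as an illustrative sketch (not part of the patent's claims), the per-pixel lookup W_se(i, j) = ω_{p_se(i, j)} can be expressed with NumPy integer-array indexing; the category weights and the semantic index map below are hypothetical values:

```python
import numpy as np

# Hypothetical basic weights omega_0 .. omega_N, one per object category
# (e.g. face = 1.0, hair = 0.9, microphone = 0.7, chair = 0.6).
base_weights = np.array([1.0, 0.9, 0.7, 0.6])

# p_se: semantic index map from information extraction, one index per pixel.
p_se = np.array([[0, 0, 1],
                 [2, 3, 3]])

# W_se(i, j) = omega_{p_se(i, j)} as a per-pixel table lookup.
W_se = base_weights[p_se]
```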
  • Step S1022 Calculate the saliency weight of each pixel based on the saliency information and the corresponding predefined weight.
  • the saliency information includes the gray value of each pixel obtained by information extraction.
  • the gray value can be a normalized value, ranging from 0 to 1.
  • the value 0 represents "easiest to ignore", and the value 1 represents "most interesting".
  • the predefined weight includes a predefined minimum weight value and a predefined maximum weight value.
  • exemplarily, the predefined minimum weight value is recorded as ω_min and the predefined maximum weight value is recorded as ω_max.
  • the specific sizes of the predefined maximum weight value and the predefined minimum weight value can be set based on subjective experience or measured through statistical methods.
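  • a minimal sketch of this step, assuming (consistent with the weight calculation module described later) that the saliency weight linearly interpolates between the two predefined bounds according to the normalized gray value; the bound values here are illustrative:

```python
import numpy as np

# Hypothetical predefined bounds; in practice they would be set by
# subjective experience or measured statistically.
omega_min, omega_max = 0.5, 1.0

# g: normalized saliency (grayscale) map in [0, 1]; 0 = easiest to ignore.
g = np.array([[0.0, 0.5],
              [1.0, 0.25]])

# W_sa(i, j) = omega_min + g(i, j) * (omega_max - omega_min)
W_sa = omega_min + g * (omega_max - omega_min)
```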
  • Step S1023 Calculate the pixel weight of each pixel based on the semantic weight of each pixel and the corresponding saliency weight.
  • the final pixel weight can be calculated by determining the product of the semantic weight of each pixel and the corresponding saliency weight as the pixel weight of that pixel.
  • the semantic weight is recorded as W se (i, j)
  • the saliency weight is recorded as W sa (i, j)
  • the pixel weight, that is, the total weight, of the pixel at the (i, j) position is recorded as W(i, j)
  • W(i, j) = W_se(i, j) * W_sa(i, j).
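  • combining the two steps above, the total weight is the elementwise product of the two weight maps; a small sketch with hypothetical values:

```python
import numpy as np

# Hypothetical per-pixel weights from the semantic and saliency steps.
W_se = np.array([[1.0, 0.9],
                 [0.7, 0.6]])   # semantic weights
W_sa = np.array([[1.0, 0.5],
                 [0.75, 1.0]])  # saliency weights

# Total weight: W(i, j) = W_se(i, j) * W_sa(i, j)
W = W_se * W_sa
```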
  • Step S103 Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
  • the pixel weight is combined with an objective image quality evaluation method to calculate the final image quality information.
  • the objective image quality evaluation method can be the PSNR (Peak Signal-to-Noise Ratio) algorithm, the SSIM (Structural SIMilarity) algorithm, the VMAF (Video Multimethod Assessment Fusion) algorithm, etc.
  • the image quality information is calculated based on the pixel weight and the pixel values of the corresponding pixels of the original image and the distorted image; that is, the quantitative calculation of the degree of distortion of the distorted image relative to the original image is performed in combination with the pixel weight, making the final calculation result more consistent with the subjective perception of the human eye while meeting the objectivity requirements.
  • information is extracted from the original image to obtain semantic information and saliency information, and the pixel weight of each pixel is determined based on the semantic information and saliency information.
  • image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image, which improves the accuracy of image quality evaluation.
  • the attention mechanism is introduced so that the final image quality evaluation result is more in line with the subjective feeling of the human eye while meeting the objectivity requirements.
  • Figure 3 is a flow chart of another image quality determination method provided by the embodiment of the present application. It provides a method of extracting information from the original image to obtain semantic information and saliency information. As shown in Figure 3, it includes:
  • Step S201 Obtain the distorted image and the corresponding original image, and extract information from the original image through the trained convolutional neural network with a dual-branch structure to obtain semantic information and saliency information.
  • semantic information and saliency information are obtained by extracting information from the original image through a neural network model.
  • a convolutional neural network with a dual-branch structure is specifically used, in which semantic information is extracted through one branch and saliency information is extracted through the other branch.
  • Figure 4 is a schematic diagram of extracting semantic information from the original image provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of information extraction to obtain saliency information provided by an embodiment of the present application.
  • Step S202 Determine the pixel weight of each pixel according to the semantic information and the saliency information.
  • Step S203 Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
  • the semantic information and saliency information are obtained by extracting information from the original image through the trained convolutional neural network with a dual-branch structure, making the obtained semantic information and saliency information more accurate and reasonable, so that the final image quality information is significantly more consistent with the subjective perception of the human eye while meeting the objectivity requirements.
  • Figure 6 is a flow chart of another image quality determination method provided by the embodiment of the present application. It provides a process of extracting semantic information and saliency information using a dual-branch network architecture. As shown in Figure 6, it includes:
  • Step S301 Obtain the distorted image and the corresponding original image.
  • Step S302 Extract semantic information from the original image through the semantic branch of the convolutional neural network to obtain semantic information.
  • the semantic branch uses global pooling as a feature weight factor, and the number of channels in each layer of the network is less than the preset number of channels.
  • the semantic branch has a smaller number of feature channels and a deeper number of layers, which is beneficial to extracting semantic context information of the image.
  • since the semantic branch only needs a large receptive field to capture the semantic context features of the image, it adopts a lightweight structural design with a small number of channels in each layer, and the feature map is quickly downsampled as the layers deepen. In addition, to further expand the receptive field, it uses global pooling as the feature weight factor.
  • the preset channel number can be 1 or 2.
  • Step S303 Extract saliency information from the original image through the detail branch of the convolutional neural network, where the number of network levels of the detail branch is less than the preset number of levels.
  • the detail branch has a larger number of feature channels and a shallower number of layers to capture the spatial detail information of the image, such as texture, edges, etc.
  • the detail branch is mainly used to extract image spatial detail information, using fewer levels, more feature channels, and a larger resolution.
  • it includes three stages, each stage consisting of a cascade of several convolutional layers, batch normalization, and activation functions.
  • the stride of the first layer of convolution in each stage is 2, and the other layers have the same number of convolution kernels.
  • the output feature map size of this detail branch is 1/8 of the source size.
  • the preset number of layers may be 4 layers, 5 layers or 6 layers.
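  • the 1/8 output size follows from the three stages, each of which halves the spatial resolution via its stride-2 first convolution; a toy sketch in which plain subsampling stands in for the convolution/batch-normalization/activation cascade:

```python
import numpy as np

def stride2_stage(x):
    """Stand-in for one detail-branch stage whose first convolution has
    stride 2: the spatial size is halved (real stages also apply
    convolutions, batch normalization, and activations)."""
    return x[::2, ::2]

x = np.zeros((64, 64))          # toy single-channel feature map
for _ in range(3):              # three stages, each halving the resolution
    x = stride2_stage(x)

# After three stride-2 stages the feature map is 1/8 of the source size.
print(x.shape)
```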
  • Step S304 Fusion the semantic information and the saliency information through the fusion layer of the convolutional neural network to obtain a semantic information graph containing the updated semantic information and saliency information.
  • the fusion process is performed through the fusion layer.
  • the features output by the detail branch and the semantic branch are complementary features. After the fusion layer, a more complex and precise feature expression is obtained, so that the updated semantic information and saliency information are more accurate.
  • for example, in the salient area, that is, the area on which the user's eyes focus, the basic weights of the object categories represented by the semantic information are: face 1, hair 0.9, microphone 0.7, and chair 0.6; in the non-salient area, that is, the area the user's eyes tend to ignore, the basic weight of the face is 0.5, that of the hair is 0.4, that of the microphone is 0.3, and that of the chair is 0.3.
  • Step S305 Determine the pixel weight of each pixel based on the updated semantic information and saliency information.
  • the method of determining the pixel weight of each pixel based on the updated semantic information and saliency information can be referred to the aforementioned method of determining the pixel weight of each pixel based on the semantic information and saliency information, which will not be described again here.
  • Step S306 Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
  • through specially set branch structures, the semantic information is extracted from the original image through the semantic branch of the convolutional neural network, and the saliency information is extracted from the original image through the detail branch, which achieves more accurate and targeted information extraction.
  • a more complex and accurate feature expression is obtained through the fusion layer, so that the final image quality information is more in line with the subjective perception of the human eye while meeting the objectivity requirements.
  • the convolutional neural network used further includes auxiliary segmentation head nodes to improve the convergence speed during training.
  • the auxiliary segmentation head is inserted into different positions of each branch to perform channel control in different dimensions by adjusting the computational complexity of the auxiliary segmentation head and the corresponding main segmentation head to achieve rapid convergence during training.
  • Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application. It provides a process of calculating image quality information, as shown in Figure 7, including:
  • Step S401 Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.
  • Step S402 Determine the pixel weight of each pixel according to the semantic information and the saliency information.
  • Step S403 Calculate the mean square error between the distorted image and the original image based on the pixel weight, and substitute the mean square error into the peak signal-to-noise ratio formula to calculate image quality information.
  • the calculation is illustrated by combining the obtained pixel weight with the PSNR algorithm as an example.
  • I is the original image, K is the distorted image, and i and j represent the coordinate position of a pixel in the image.
  • the weighted mean square error is normalized by the total weight, WMSE = Σ_{i,j} W(i, j) · (I(i, j) - K(i, j))² / Σ_{i,j} W(i, j), and WPSNR = 10 · log10(MAX_I² / WMSE), where MAX_I is the maximum possible pixel value.
  • by mining the semantic and saliency features of the image, WPSNR adapts its weighting to each image.
  • the weighting strategy introduces the attention mechanism, so it is more in line with the subjective perception of the human eye.
  • the percentages in the table above represent the proportion of cases where the correlation coefficient (Pearson coefficient or Spearman coefficient) between the current evaluation method and MOS data is the highest among a batch of ordinary live broadcast scene videos. It can be seen that WPSNR is significantly better than other evaluation methods.
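  • a minimal sketch of the weighted PSNR calculation described above; the function name and the exact normalization of the weighted mean square error by the total weight are this sketch's assumptions, reconstructed from the surrounding description rather than taken from the patent's formulas:

```python
import numpy as np

def wpsnr(I, K, W, max_val=255.0):
    """Weighted PSNR: the squared pixel errors are weighted by W(i, j),
    normalized by the total weight, and substituted into the standard
    PSNR formula."""
    wmse = np.sum(W * (I - K) ** 2) / np.sum(W)
    if wmse == 0:
        return float("inf")
    return 10 * np.log10(max_val ** 2 / wmse)

I = np.full((4, 4), 128.0)      # original image
K = I + 10.0                    # uniformly distorted image
W = np.ones((4, 4))             # per-pixel weights W(i, j)

# With uniform weights, WPSNR reduces to the ordinary PSNR.
print(round(wpsnr(I, K, W), 2))
```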
  • FIG 8 is a structural block diagram of an image quality determination device provided by an embodiment of the present application.
  • the device is used to execute the image quality determination method provided by the above embodiment, and has functional modules and beneficial effects corresponding to the execution method.
  • the device specifically includes: image acquisition module 101, information extraction module 102, weight calculation module 103 and image quality calculation module 104, wherein,
  • the image acquisition module 101 is configured to acquire the distorted image and the corresponding original image
  • the information extraction module 102 is configured to extract information from the original image to obtain semantic information and saliency information
  • the weight calculation module 103 is configured to determine the pixel weight of each pixel according to the semantic information and the saliency information;
  • the image quality calculation module 104 is configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixels of the original image and the distorted image.
  • information is extracted from the original image to obtain semantic information and saliency information, the pixel weight of each pixel is determined based on the semantic information and saliency information, and image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image, which improves the accuracy of image quality evaluation and makes the final image quality evaluation result more consistent with the subjective perception of the human eye while meeting the objectivity requirements.
  • the weight calculation module 103 is configured as:
  • the semantic weight of each pixel is determined based on the semantic information and the corresponding basic weight; the saliency weight of each pixel is calculated based on the saliency information and the corresponding predefined weight; and the pixel weight of each pixel is calculated based on the semantic weight of each pixel and the corresponding saliency weight.
  • the semantic information includes the semantic index value of each pixel obtained through information extraction. Different semantic index values correspond to different object categories.
  • the weight calculation module 103 is configured as:
  • the semantic index value of each pixel is determined, and the basic weight corresponding to the semantic index value of each pixel is determined as the semantic weight.
  • the saliency information includes the gray value of each pixel obtained by information extraction
  • the predefined weight includes a predefined minimum weight value and a predefined maximum weight value
  • the weight calculation module 103 is configured as:
  • the saliency weight of each pixel is obtained by linearly interpolating between the predefined minimum weight value and the predefined maximum weight value according to the gray value of each pixel.
  • the weight calculation module 103 is configured as:
  • the product of the semantic weight of each pixel and the corresponding saliency weight is determined as the pixel weight of the pixel.
  • the information extraction module 102 is configured as:
  • Semantic information and saliency information are obtained by extracting information from the original image through a trained convolutional neural network with a dual-branch structure.
  • the information extraction module 102 is configured as:
  • the semantic information is extracted from the original image through the semantic branch of the convolutional neural network to obtain the semantic information
  • the semantic branch uses global pooling as the feature weight factor, and the number of channels in each layer of the network is less than the preset number of channels;
  • the saliency information is extracted from the original image through the detail branch of the convolutional neural network, and the number of network levels of the detail branch is less than the preset number of levels;
  • the semantic information and the saliency information are fused through the fusion layer of the convolutional neural network to obtain a semantic information graph containing the updated semantic information and saliency information.
  • the image quality calculation module 104 is configured as:
  • the mean square error between the distorted image and the original image is calculated based on the pixel weight, and the mean square error is substituted into the peak signal-to-noise ratio formula to calculate image quality information.
  • Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application.
  • the device includes a processor 201, a memory 202, an input device 203 and an output device 204; the number of processors 201 in the device can be one or more.
  • one processor 201 is taken as an example; the processor 201, memory 202, input device 203 and output device 204 in the device can be connected through a bus or other means, with a bus connection taken as the example here.
  • the memory 202 can be used to store software programs, computer-executable programs and modules, such as program instructions/modules corresponding to the image quality determination method in the embodiment of the present application.
  • the processor 201 executes various functional applications and data processing of the device by running software programs, instructions and modules stored in the memory 202, that is, implementing the above image quality determination method.
  • the input device 203 may be used to receive input numeric or character information and generate key signal inputs related to user settings and functional control of the device.
  • the output device 204 may include a display device such as a display screen.
  • Embodiments of the present application also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image quality determination method described in the above embodiments, including:
  • Image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
  • various aspects of the method provided by this application can also be implemented in the form of a program product, which includes program code.
  • when the program product is run on a computer device, the program code is used to cause the computer device to execute the steps in the methods described above in this specification according to various exemplary embodiments of the present application.
  • the computer device may execute the image quality determination method described in the embodiments of the present application.
  • the program product may be implemented in any combination of one or more readable media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide an image quality determination method, an apparatus, a device, and a storage medium. The method comprises: obtaining a distorted image and a corresponding original image, and performing information extraction on the original image to obtain semantic information and saliency information; determining a pixel weight of each pixel point according to the semantic information and the saliency information; and calculating image quality information on the basis of the pixel weights and pixel values of pixel points corresponding to the distorted image and the original image. The image quality evaluation result ultimately determined by the present solution better conforms to the subjective perception of the human eye while satisfying the objectivity requirement.

Description

Image quality determination method, device, equipment and storage medium
This application claims priority to the Chinese patent application with application number 202210238500.3, filed with the China Patent Office on March 11, 2022, the entire content of which is incorporated into this application by reference.
Technical Field
The embodiments of the present application relate to the field of image processing technology, and in particular to an image quality determination method, device, equipment and storage medium.
Background
Image quality evaluation refers to the quantitative description, in subjective and objective terms, of the degree of distortion between two images with similar subject content. It plays a very important role in the field of image/video processing, for example in algorithm analysis and comparison and in system performance evaluation.
Image quality evaluation methods can be divided into subjective evaluation and objective evaluation. Subjective evaluation refers to evaluating image quality through human subjective perception: given the original reference image and the distorted image, observers are asked to rate the distorted image, and the result is generally described by the mean opinion score or the differential mean opinion score. Subjective evaluation requires a lot of manpower and material resources, and the results are easily affected by the testers' subjective factors and by external conditions; the complexity of the evaluation process seriously limits its accuracy and versatility, making it extremely difficult to apply in practical video processing systems. In contrast, objective evaluation uses mathematical models to directly give quantitative values of the distortion; it is simple to operate and is widely used in various fields, for example through a series of objective evaluation indicators commonly used in the industry. However, although objective evaluation indicators are simple to operate and easy to implement, different objective indicators match the subjective perception of the human eye to different extents, so the resulting image quality evaluation is often unsatisfactory.
Summary
The embodiments of the present application provide an image quality determination method, device, equipment and storage medium, which solve the problem in the related art that image quality evaluation results cannot well reflect the subjective perception of image quality, so that the final image quality evaluation result better conforms to the subjective perception of the human eye.
In a first aspect, embodiments of the present application provide an image quality determination method, which includes:
obtaining a distorted image and a corresponding original image, and performing information extraction on the original image to obtain semantic information and saliency information;
determining the pixel weight of each pixel point according to the semantic information and the saliency information;
calculating image quality information based on the pixel weight and the pixel values of corresponding pixel points in the original image and the distorted image.
In a second aspect, embodiments of the present application also provide an image quality determination device, including:
an image acquisition module, configured to acquire a distorted image and a corresponding original image;
an information extraction module, configured to perform information extraction on the original image to obtain semantic information and saliency information;
a weight calculation module, configured to determine the pixel weight of each pixel point according to the semantic information and the saliency information;
an image quality calculation module, configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixel points in the original image and the distorted image.
In a third aspect, embodiments of the present application also provide an image quality determination device, which includes:
one or more processors; and
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image quality determination method described in the embodiments of this application.
In a fourth aspect, embodiments of the present application also provide a storage medium storing computer-executable instructions which, when executed by a computer processor, are used to perform the image quality determination method described in the embodiments of this application.
In a fifth aspect, embodiments of the present application also provide a computer program product. The computer program product includes a computer program stored in a computer-readable storage medium; at least one processor of a device reads and executes the computer program from the computer-readable storage medium, so that the device executes the image quality determination method described in the embodiments of this application.
In the embodiments of this application, a distorted image and a corresponding original image are obtained, information extraction is performed on the original image to obtain semantic information and saliency information, the pixel weight of each pixel point is determined based on the semantic information and the saliency information, and image quality information is calculated based on the pixel weight and the pixel values of corresponding pixel points in the original image and the distorted image. This improves the accuracy of image quality evaluation, so that the final image quality evaluation result better conforms to the subjective perception of the human eye while meeting the objectivity requirement.
Brief Description of the Drawings
Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application;
Figure 2 is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application;
Figure 3 is a flow chart of another image quality determination method provided by an embodiment of the present application;
Figure 4 is a schematic diagram of information extraction to obtain semantic information provided by an embodiment of the present application;
Figure 5 is a schematic diagram of information extraction to obtain saliency information provided by an embodiment of the present application;
Figure 6 is a flow chart of another image quality determination method provided by an embodiment of the present application;
Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application;
Figure 8 is a structural block diagram of an image quality determination device provided by an embodiment of the present application;
Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application.
Detailed Description
The embodiments of the present application are described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described here are only used to explain the embodiments of the present application and are not intended to limit them. It should also be noted that, for convenience of description, only the parts related to the embodiments of the present application, rather than all structures, are shown in the drawings.
The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. Objects distinguished by "first", "second", etc. are usually of one type, and the number of objects is not limited; for example, the first object can be one or multiple. In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the associated objects are in an "or" relationship.
Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application, which can be used to evaluate image and video quality. The method can be executed by a computing device such as a server, smart terminal, notebook or tablet, and specifically includes the following steps:
Step S101: Obtain a distorted image and a corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.
In one embodiment, a comparison calculation between the distorted image and the original image is performed to obtain a quantified value for the distorted image and thereby the corresponding image quality evaluation. The original image can be a clear, undistorted image, that is, the reference image of the distorted image. The distorted image is an image with some distortion relative to the original image, that is, a noisy image. The embodiments of this application build on objective image quality evaluation methods and introduce an attention mechanism to achieve a final, more reasonable image quality evaluation.
In one embodiment, the acquired original image may be a single image or a video frame sequence composed of multiple image frames, such as consecutive live-picture frames in a live video. Semantic information and saliency information are obtained by performing information extraction on the original image. Optionally, the information may be extracted from the original image through a trained neural network model, or through an image feature extraction algorithm. The semantic information represents the category of each pixel point in the image; optionally, the category of a pixel point can be one of multiple preset object categories, examples of which are face, microphone, chair, hair and keyboard. The saliency information represents the grayscale of each pixel point in the image, which indicates the degree of human visual attention to each pixel, and can be a normalized grayscale map. Optionally, the semantic information and the saliency information are presented in the form of image masks.
Step S102: Determine the pixel weight of each pixel point according to the semantic information and the saliency information.
In one embodiment, after the semantic information and saliency information of each pixel point are obtained through information extraction, the pixel weight of each pixel point is calculated based on the semantic information and the saliency information. Optionally, as shown in Figure 2, which is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application, the method specifically includes:
Step S1021: Determine the semantic weight of each pixel point according to the semantic information and the corresponding basic weights.
In one embodiment, the semantic information includes the semantic index value of each pixel point obtained through information extraction, where different semantic index values correspond to different object categories. For example, taking a live-streaming scene as an example, 13 categories are set, corresponding to index values 0 to 12: semantic index value 0 corresponds to category 1, semantic index value 1 corresponds to category 2, semantic index value 2 corresponds to category 3, and so on. Different object categories are assigned corresponding basic weights, e.g. category 1 corresponds to basic weight 1, category 2 corresponds to basic weight 2, and so on. Correspondingly, the semantic weight of each pixel point can be determined from the semantic information and the corresponding basic weights as follows: determine the semantic index value of each pixel point, and take the basic weight corresponding to that semantic index value as the semantic weight. For example, let the basic weights be denoted ω0, ω1, …, ωN, corresponding in turn to semantic index values 0 to N, and let the semantic weight of the pixel point at position (i, j) be denoted Wse(i, j); then the semantic weight is Wse(i, j) = ω_pse(i, j), i.e. the basic weight indexed by pse(i, j), where pse(i, j) represents the semantic index value of the image mask at position (i, j). The specific values of the basic weights can be set based on subjective experience or measured through statistical methods.
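This table lookup can be sketched as follows (the index-to-weight values below are illustrative placeholders, not the weights actually used in the embodiments):

```python
import numpy as np

# Hypothetical basic weights ω0..ωN, one entry per semantic index (object category).
BASIC_WEIGHTS = np.array([0.3, 1.0, 0.9, 0.7, 0.6])  # e.g. background, face, hair, microphone, chair

def semantic_weight(p_se: np.ndarray) -> np.ndarray:
    """Wse(i, j) = ω_pse(i, j): index the basic-weight table with the semantic mask."""
    return BASIC_WEIGHTS[p_se]

mask = np.array([[1, 0],
                 [2, 4]])  # semantic index values per pixel
semantic_weight(mask)      # → [[1.0, 0.3], [0.9, 0.6]]
```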
Step S1022: Calculate the saliency weight of each pixel point based on the saliency information and the corresponding predefined weights.
In one embodiment, the saliency information includes the gray value of each pixel point obtained through information extraction. Optionally, the gray value can be a normalized value in the range 0 to 1, where 0 represents "easiest to ignore" and 1 represents "most interesting". The predefined weights include a predefined minimum weight value and a predefined maximum weight value, denoted ωmin and ωmax respectively. Optionally, calculating the saliency weight of each pixel point based on the saliency information and the corresponding predefined weights includes: linearly interpolating the gray value of each pixel point between the predefined minimum weight value and the predefined maximum weight value. Denoting the saliency weight Wsa(i, j), we have Wsa(i, j) = (ωmax − ωmin) · psa(i, j) + ωmin, where psa(i, j) represents the gray value of the image mask at position (i, j). The specific values of the predefined maximum and minimum weight values can be set based on subjective experience or measured through statistical methods.
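The linear interpolation between ωmin and ωmax can be sketched as follows (the concrete values 0.5 and 1.0 for ωmin and ωmax are assumptions for illustration):

```python
import numpy as np

def saliency_weight(p_sa: np.ndarray, w_min: float = 0.5, w_max: float = 1.0) -> np.ndarray:
    """Wsa(i, j) = (ωmax - ωmin) * psa(i, j) + ωmin, for normalized gray values in [0, 1]."""
    return (w_max - w_min) * p_sa + w_min

gray = np.array([0.0, 0.5, 1.0])  # "easiest to ignore" → "most interesting"
saliency_weight(gray)             # → [0.5, 0.75, 1.0]
```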
Step S1023: Calculate the pixel weight of each pixel point based on its semantic weight and the corresponding saliency weight.
In one embodiment, after the semantic weight and the corresponding saliency weight of each pixel point are determined, the final pixel weight can be calculated as the product of the two. With the semantic weight denoted Wse(i, j) and the saliency weight denoted Wsa(i, j), the pixel weight of the pixel point at position (i, j), i.e. the total weight W(i, j), is W(i, j) = Wse(i, j) · Wsa(i, j).
Step S103: Calculate image quality information based on the pixel weight and the pixel values of corresponding pixel points in the original image and the distorted image.
In one embodiment, after the pixel weights are determined based on the semantic information and the saliency information, the pixel weights are combined with an objective image quality evaluation method to calculate the final image quality information. Optionally, the objective image quality evaluation method can be the PSNR (Peak Signal-to-Noise Ratio) algorithm, the SSIM (Structural SIMilarity) algorithm or the VMAF (Video Multimethod Assessment Fusion) algorithm, among others.
In one embodiment, image quality information is calculated based on the pixel weights and the pixel values of corresponding pixel points in the original image and the distorted image. That is, when the degree of distortion of the distorted image relative to the original image is quantified, the pixel weights are incorporated into the calculation of the image quality information, so that the final result better conforms to the subjective perception of the human eye while meeting the objectivity requirement.
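As a hedged sketch of combining the pixel weights with the PSNR metric (the exact weighted formula is not spelled out in the text; normalizing the weighted squared error by the sum of weights is our assumption):

```python
import numpy as np

def weighted_psnr(orig: np.ndarray, dist: np.ndarray, weight: np.ndarray,
                  peak: float = 255.0) -> float:
    """PSNR over a weight-averaged squared error, where weight holds the total
    pixel weight W(i, j) = Wse(i, j) * Wsa(i, j) for every pixel."""
    err = (orig.astype(np.float64) - dist.astype(np.float64)) ** 2
    wmse = float(np.sum(weight * err) / np.sum(weight))  # weighted MSE (assumed normalization)
    if wmse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / wmse)
```

Compared with plain PSNR, the same error lowers the score more when it falls on high-weight pixels (e.g. a face in a salient region) than on low-weight background pixels.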
From the above scheme it can be seen that, by obtaining a distorted image and a corresponding original image, performing information extraction on the original image to obtain semantic information and saliency information, determining the pixel weight of each pixel point based on the semantic information and the saliency information, and calculating image quality information based on the pixel weights and the pixel values of corresponding pixel points in the original image and the distorted image, the accuracy of image quality evaluation is improved. Introducing an attention mechanism into the image quality evaluation makes the final evaluation result better conform to the subjective perception of the human eye while meeting the objectivity requirement.
Figure 3 is a flow chart of another image quality determination method provided by an embodiment of the present application, which presents a method of performing information extraction on the original image to obtain semantic information and saliency information. As shown in Figure 3, the method includes:
Step S201: Obtain a distorted image and a corresponding original image, and perform information extraction on the original image through a trained convolutional neural network with a dual-branch structure to obtain semantic information and saliency information.
In one embodiment, semantic information and saliency information are obtained by performing information extraction on the original image through a neural network model. Specifically, a convolutional neural network with a dual-branch structure is used, in which one branch extracts the semantic information and the other branch extracts the saliency information. A schematic diagram of extracting semantic information from the original image is shown in Figure 4; a schematic diagram of extracting saliency information from the original image is shown in Figure 5.
Step S202: Determine the pixel weight of each pixel point according to the semantic information and the saliency information.
Step S203: Calculate image quality information based on the pixel weight and the pixel values of corresponding pixel points in the original image and the distorted image.
From the above it can be seen that, when determining image quality, performing information extraction on the original image through a trained convolutional neural network with a dual-branch structure yields semantic information and saliency information of higher accuracy and stronger rationality, so that the final image quality information conforms markedly better to the subjective perception of the human eye while meeting the objectivity requirement.
Figure 6 is a flow chart of another image quality determination method provided by an embodiment of the present application, which presents the process by which a dual-branch network architecture extracts semantic information and saliency information. As shown in Figure 6, the method includes:
Step S301: Obtain a distorted image and a corresponding original image.
Step S302: Extract semantic information from the original image through the semantic branch of the convolutional neural network, where the semantic branch uses global pooling as a feature weight factor and the number of channels in each layer of the network is less than a preset number of channels.
In one embodiment, the semantic branch has a smaller number of feature channels and a deeper stack of layers, which is beneficial for extracting the semantic context information of the image. Since the semantic branch only needs a large receptive field to capture the semantic context features of the image, it adopts a lightweight structural design with a small number of channels in each layer, and the feature map is rapidly downsampled as the layers deepen. In addition, to further enlarge the receptive field, global pooling is used as a feature weight factor. The preset number of channels can be, for example, 1 or 2.
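A minimal sketch of using global pooling as a feature weight factor (an SE-style channel re-weighting; the sigmoid gate is an assumption, since the text does not specify the exact gating):

```python
import numpy as np

def global_pool_reweight(features: np.ndarray) -> np.ndarray:
    """Re-weight each channel by a factor derived from its global average pool.

    features: array of shape (C, H, W).
    """
    pooled = features.mean(axis=(1, 2), keepdims=True)  # global pooling → (C, 1, 1)
    gate = 1.0 / (1.0 + np.exp(-pooled))                # sigmoid gate (assumption)
    return features * gate                              # broadcast over H and W
```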
Step S303: Extract saliency information from the original image through the detail branch of the convolutional neural network, where the number of network levels of the detail branch is less than a preset number of levels.
The detail branch has a larger number of feature channels and a shallower stack of layers, and is used to capture the spatial-domain detail information of the image, such as textures and edges. It is mainly used to extract spatial detail information, using a shallower number of levels, a richer set of features and a large resolution. In one embodiment, it contains 3 stages, each consisting of a cascade of several convolutional layers, batch normalization and activation functions. The first convolutional layer of each stage has a stride of 2, and the other layers have the same number of convolution kernels. The output feature map of the detail branch is 1/8 of the source size. The preset number of levels can be, for example, 4, 5 or 6.
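The 1/8 output size follows directly from the three stride-2 stages; a quick check of the spatial arithmetic (assuming source dimensions divisible by 8):

```python
def detail_branch_output_size(h: int, w: int, num_stages: int = 3, stride: int = 2):
    """Each stage opens with a stride-2 convolution that halves the resolution,
    so three stages leave the feature map at 1/8 of the source size."""
    for _ in range(num_stages):
        h, w = h // stride, w // stride
    return h, w

detail_branch_output_size(1080, 1920)  # → (135, 240), i.e. 1/8 of 1080x1920
```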
Step S304: Fuse the semantic information and the saliency information through the fusion layer of the convolutional neural network to obtain a semantic information map containing the updated semantic information and saliency information.
In one embodiment, after the semantic information and the saliency information output by the semantic branch and the detail branch respectively are obtained, fusion processing is performed through the fusion layer. The features output by the detail branch and the semantic branch are complementary; the fusion layer produces a more complex and precise feature expression, so that the updated semantic information and saliency information are more accurate. For example, in the saliency region, i.e. the region of interest to the user's eye, the basic weights of the object categories represented by the semantic information are: face 1, hair 0.9, microphone 0.7, chair 0.6; in the non-saliency region, i.e. the region not of interest to the user's eye, the basic weights are: face 0.5, hair 0.4, microphone 0.3, chair 0.3.
步骤S305、根据更新后的语义信息和显著度信息确定每个像素点的像素权重。Step S305: Determine the pixel weight of each pixel based on the updated semantic information and saliency information.
For the manner of determining the pixel weight of each pixel based on the updated semantic information and saliency information, refer to the aforementioned manner of determining the pixel weight of each pixel based on the semantic information and saliency information, which will not be repeated here.
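For illustration, the per-pixel weighting referred to here can be sketched as follows: the semantic base weight of a pixel is multiplied by a saliency weight obtained by linearly interpolating the pixel's saliency gray value between a predefined minimum and maximum weight. The concrete `w_min`/`w_max` values and the [0, 255] gray range are assumptions for this sketch, not values given in the application:

```python
import numpy as np

def pixel_weights(semantic_weight, saliency_gray, w_min=0.5, w_max=1.5):
    """Combine semantic and saliency cues into one weight per pixel.

    semantic_weight : per-pixel base weight from the semantic index map
    saliency_gray   : per-pixel saliency gray value in [0, 255]
    w_min, w_max    : predefined weight range (hypothetical values here)
    """
    # Linearly interpolate the gray value into [w_min, w_max].
    saliency_weight = w_min + (saliency_gray / 255.0) * (w_max - w_min)
    # The pixel weight is the product of the semantic and saliency weights.
    return semantic_weight * saliency_weight

sem = np.array([[1.0, 0.5]])    # e.g. a face pixel and a chair pixel
sal = np.array([[255.0, 0.0]])  # fully salient vs. not salient
W = pixel_weights(sem, sal)     # [[1.5, 0.25]]
```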
Step S306: calculate image quality information based on the pixel weights and the pixel values of corresponding pixels in the original image and the distorted image.
It can be seen from the above that, when determining image quality, semantic information is extracted from the original image through the semantic branch of the convolutional neural network, and saliency information is extracted from the original image through the detail branch of the convolutional neural network, i.e., through specially designed branch structures. This achieves more accurate and targeted information extraction; at the same time, a more complex and accurate feature expression is obtained through the fusion layer, so that the final image quality information better matches the subjective perception of the human eye while still meeting objectivity requirements.
On the basis of the above technical solution, the convolutional neural network used further includes auxiliary segmentation heads to improve the convergence speed during training. In one embodiment, auxiliary segmentation heads are inserted at different positions of each branch, so that channel control in different dimensions is performed by adjusting the computational complexity of the auxiliary segmentation heads and the corresponding main segmentation head, achieving fast convergence during training.
Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application, showing a process of calculating image quality information. As shown in Figure 7, the method includes:
Step S401: obtain a distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.
Step S402: determine the pixel weight of each pixel according to the semantic information and the saliency information.
Step S403: calculate the mean square error between the distorted image and the original image based on the pixel weights, and substitute the mean square error into the peak signal-to-noise ratio formula to obtain image quality information.
In one embodiment, combining the obtained pixel weights with the PSNR algorithm is taken as an example for explanation. Exemplarily, taking an original image and a corresponding distorted image of size (m, n), let I be the original image, K the distorted image, and i, j the coordinates of a pixel in the image. The weight-based mean square error WMSE is first calculated, and the weight-based image quality information WPSNR is then obtained from it by normalization, calculated as:

WMSE = (1/(m·n)) · Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} W(i,j) · [I(i,j) − K(i,j)]²

WPSNR = 10 · log10(MAX_I² / WMSE)

where W(i,j) is the pixel weight at coordinate (i,j) and MAX_I is the maximum possible pixel value (255 for 8-bit images).
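A minimal NumPy sketch of the weighted metric: a weighted mean square error is substituted into the standard PSNR formula, so that uniform weights W(i,j) = 1 recover ordinary PSNR. The array shapes and the 8-bit maximum value are assumptions for illustration:

```python
import numpy as np

def wpsnr(original, distorted, weights, max_val=255.0):
    """Weighted PSNR: plug a weighted MSE into the standard PSNR formula.

    With all weights equal to 1 this reduces to the ordinary PSNR,
    matching the comparison made in the surrounding text.
    """
    original = original.astype(np.float64)
    distorted = distorted.astype(np.float64)
    # Weighted mean square error over all pixels.
    wmse = np.mean(weights * (original - distorted) ** 2)
    return 10.0 * np.log10(max_val ** 2 / wmse)

rng = np.random.default_rng(0)
I = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
K = np.clip(I + rng.normal(0, 5, size=I.shape), 0, 255)
# Uniform weights give classic PSNR; non-uniform weights emphasize
# semantically important / salient pixels.
score = wpsnr(I, K, np.ones_like(I))
```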
It can be seen from the above that, compared with traditional PSNR, which assigns equal weight to the error of each pixel (i.e., W(i,j) takes the value 1 in the above formula), WPSNR mines the semantic and saliency features of the image and adopts an adaptive weighting strategy, introducing an attention mechanism, and is therefore more consistent with the subjective perception of the human eye. Comparison on experimental data shows that using the WPSNR calculated by this solution as an image evaluation index outperforms other objective image quality evaluation algorithms such as PSNR, SSIM, and VMAF; see the table below for details:
It can be seen from the above that compared with traditional PSNR, which uses equal weight for the error of each pixel (that is, the value of W (i, j) in the above formula is 1), WPSNR uses adaptive methods by mining the semantic and saliency features of the image. The weighting strategy introduces the attention mechanism, so it is more in line with the subjective perception of the human eye. Through comparison of experimental data, it is found that the WPSNR calculated using this scheme as an image evaluation index is better than other objective image quality evaluation algorithms PSNR, SSIM, VMAF is better, see the table below for details:
The percentages in the table above represent the proportion of cases, over a batch of ordinary live-streaming scene videos, in which the current evaluation method has the highest correlation coefficient (Pearson or Spearman coefficient) with the MOS data. It can be seen that WPSNR is clearly superior to the other evaluation methods.
Figure 8 is a structural block diagram of an image quality determination apparatus provided by an embodiment of the present application. The apparatus is used to execute the image quality determination method provided by the above embodiments, and has functional modules and beneficial effects corresponding to the executed method. As shown in Figure 8, the apparatus includes: an image acquisition module 101, an information extraction module 102, a weight calculation module 103, and an image quality calculation module 104, wherein:
the image acquisition module 101 is configured to acquire a distorted image and the corresponding original image;
the information extraction module 102 is configured to extract information from the original image to obtain semantic information and saliency information;
the weight calculation module 103 is configured to determine the pixel weight of each pixel according to the semantic information and the saliency information;
the image quality calculation module 104 is configured to calculate image quality information based on the pixel weights and the pixel values of corresponding pixels of the original image and the distorted image.
It can be seen from the above solution that, by acquiring a distorted image and the corresponding original image, extracting information from the original image to obtain semantic information and saliency information, determining the pixel weight of each pixel according to the semantic information and the saliency information, and calculating image quality information based on the pixel weights and the pixel values of corresponding pixels of the original image and the distorted image, the accuracy of image quality evaluation is improved, so that the finally determined image quality evaluation result better matches the subjective perception of the human eye while meeting objectivity requirements.
In a possible embodiment, the weight calculation module 103 is configured to:
determine the semantic weight of each pixel according to the semantic information and the corresponding base weight;
calculate the saliency weight of each pixel according to the saliency information and the corresponding predefined weight;
calculate the pixel weight of each pixel according to the semantic weight of each pixel and the corresponding saliency weight.
In a possible embodiment, the semantic information includes a semantic index value, obtained by information extraction, for each pixel, with different semantic index values corresponding to different object categories, and the weight calculation module 103 is configured to:
determine the semantic index value of each pixel;
determine the base weight corresponding to the semantic index value of each pixel as the semantic weight.
In a possible embodiment, the saliency information includes a gray value, obtained by information extraction, for each pixel, the predefined weights include a predefined minimum weight value and a predefined maximum weight value, and the weight calculation module 103 is configured to:
obtain the saliency weight of each pixel by linearly interpolating the gray value of each pixel between the predefined minimum weight value and the predefined maximum weight value.
In a possible embodiment, the weight calculation module 103 is configured to:
determine the product of the semantic weight of each pixel and the corresponding saliency weight as the pixel weight of the pixel.
In a possible embodiment, the information extraction module 102 is configured to:
extract information from the original image through a trained convolutional neural network with a dual-branch structure to obtain semantic information and saliency information.
In a possible embodiment, the information extraction module 102 is configured to:
extract semantic information from the original image through the semantic branch of the convolutional neural network, where the semantic branch uses global pooling as a feature weighting factor and the number of channels in each network layer is less than a preset number of channels;
extract saliency information from the original image through the detail branch of the convolutional neural network, where the number of network levels of the detail branch is less than a preset number of levels;
fuse the semantic information and the saliency information through the fusion layer of the convolutional neural network to obtain a semantic information map containing the updated semantic information and saliency information.
In a possible embodiment, the image quality calculation module 104 is configured to:
calculate the mean square error between the distorted image and the original image based on the pixel weights;
substitute the mean square error into the peak signal-to-noise ratio formula to obtain image quality information.
Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application. As shown in Figure 9, the device includes a processor 201, a memory 202, an input apparatus 203, and an output apparatus 204. The number of processors 201 in the device may be one or more; one processor 201 is taken as an example in Figure 9. The processor 201, the memory 202, the input apparatus 203, and the output apparatus 204 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in Figure 9. As a computer-readable storage medium, the memory 202 may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the image quality determination method in the embodiments of the present application. The processor 201 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 202, that is, implements the above image quality determination method. The input apparatus 203 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output apparatus 204 may include a display device such as a display screen.
Embodiments of the present application further provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image quality determination method described in the above embodiments, including:
obtaining a distorted image and the corresponding original image, and performing information extraction on the original image to obtain semantic information and saliency information;
determining the pixel weight of each pixel according to the semantic information and the saliency information;
calculating image quality information based on the pixel weights and the pixel values of corresponding pixels in the original image and the distorted image.
It is worth noting that, in the above embodiments of the image quality determination apparatus, the units and modules included are divided only according to functional logic, but the division is not limited to the above, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for the convenience of distinguishing them from each other and are not used to limit the scope of protection of the embodiments of the present application.
In some possible implementations, various aspects of the method provided by the present application may also be implemented in the form of a program product, which includes program code. When the program product runs on a computer device, the program code is used to cause the computer device to execute the steps of the methods described above in this specification according to the various exemplary embodiments of the present application; for example, the computer device may execute the image quality determination method described in the embodiments of the present application. The program product may be implemented using any combination of one or more readable media.

Claims (12)

  1. An image quality determination method, comprising:
    obtaining a distorted image and a corresponding original image, and performing information extraction on the original image to obtain semantic information and saliency information;
    determining a pixel weight of each pixel according to the semantic information and the saliency information;
    calculating image quality information based on the pixel weights and pixel values of corresponding pixels in the original image and the distorted image.
  2. The image quality determination method according to claim 1, wherein determining the pixel weight of each pixel according to the semantic information and the saliency information comprises:
    determining a semantic weight of each pixel according to the semantic information and a corresponding base weight;
    calculating a saliency weight of each pixel according to the saliency information and a corresponding predefined weight;
    calculating the pixel weight of each pixel according to the semantic weight of each pixel and the corresponding saliency weight.
  3. The image quality determination method according to claim 2, wherein the semantic information comprises a semantic index value, obtained by information extraction, of each pixel, different semantic index values corresponding to different object categories, and determining the semantic weight of each pixel according to the semantic information and the corresponding base weight comprises:
    determining the semantic index value of each pixel;
    determining the base weight corresponding to the semantic index value of each pixel as the semantic weight.
  4. The image quality determination method according to claim 2 or 3, wherein the saliency information comprises a gray value, obtained by information extraction, of each pixel, the predefined weight comprises a predefined minimum weight value and a predefined maximum weight value, and calculating the saliency weight of each pixel according to the saliency information and the corresponding predefined weight comprises:
    obtaining the saliency weight of each pixel by linearly interpolating the gray value of each pixel between the predefined minimum weight value and the predefined maximum weight value.
  5. The image quality determination method according to claim 2, wherein calculating the pixel weight of each pixel according to the semantic weight of each pixel and the corresponding saliency weight comprises:
    determining the product of the semantic weight of each pixel and the corresponding saliency weight as the pixel weight of the pixel.
  6. The image quality determination method according to any one of claims 1-5, wherein performing information extraction on the original image to obtain semantic information and saliency information comprises:
    extracting information from the original image through a trained convolutional neural network with a dual-branch structure to obtain semantic information and saliency information.
  7. The image quality determination method according to claim 6, wherein extracting information from the original image through the trained convolutional neural network with the dual-branch structure to obtain semantic information and saliency information comprises:
    extracting semantic information from the original image through a semantic branch of the convolutional neural network, the semantic branch using global pooling as a feature weighting factor, the number of channels in each network layer being less than a preset number of channels;
    extracting saliency information from the original image through a detail branch of the convolutional neural network, the number of network levels of the detail branch being less than a preset number of levels;
    fusing the semantic information and the saliency information through a fusion layer of the convolutional neural network to obtain a semantic information map containing updated semantic information and saliency information.
  8. The image quality determination method according to any one of claims 1-7, wherein calculating image quality information based on the pixel weights and the pixel values of corresponding pixels in the original image and the distorted image comprises:
    calculating a mean square error between the distorted image and the original image based on the pixel weights;
    substituting the mean square error into a peak signal-to-noise ratio formula to obtain image quality information.
  9. An image quality determination apparatus, comprising:
    an image acquisition module configured to acquire a distorted image and a corresponding original image;
    an information extraction module configured to extract information from the original image to obtain semantic information and saliency information;
    a weight calculation module configured to determine a pixel weight of each pixel according to the semantic information and the saliency information;
    an image quality calculation module configured to calculate image quality information based on the pixel weights and pixel values of corresponding pixels of the original image and the distorted image.
  10. An image quality determination device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image quality determination method according to any one of claims 1-8.
  11. A storage medium storing computer-executable instructions which, when executed by a computer processor, are used to perform the image quality determination method according to any one of claims 1-8.
  12. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the image quality determination method according to any one of claims 1-8.
PCT/CN2023/079505 2022-03-11 2023-03-03 Image quality determination method, apparatus, device, and storage medium WO2023169318A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210238500.3A CN114596287A (en) 2022-03-11 2022-03-11 Image quality determination method, device, equipment and storage medium
CN202210238500.3 2022-03-11

Publications (1)

Publication Number Publication Date
WO2023169318A1 true WO2023169318A1 (en) 2023-09-14

Family

ID=81808691

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/079505 WO2023169318A1 (en) 2022-03-11 2023-03-03 Image quality determination method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN114596287A (en)
WO (1) WO2023169318A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114596287A (en) * 2022-03-11 2022-06-07 百果园技术(新加坡)有限公司 Image quality determination method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923703A (en) * 2010-08-27 2010-12-22 北京工业大学 Semantic-based image adaptive method by combination of slit cropping and non-homogeneous mapping
CN104574381A (en) * 2014-12-25 2015-04-29 南京邮电大学 Full reference image quality evaluation method based on LBP (local binary pattern)
US20170270653A1 (en) * 2016-03-15 2017-09-21 International Business Machines Corporation Retinal image quality assessment, error identification and automatic quality correction
CN107967480A (en) * 2016-10-19 2018-04-27 北京联合大学 A kind of notable object extraction method based on label semanteme
CN114596287A (en) * 2022-03-11 2022-06-07 百果园技术(新加坡)有限公司 Image quality determination method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114596287A (en) 2022-06-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23765896

Country of ref document: EP

Kind code of ref document: A1