WO2023169318A1

WO2023169318A1 - Image quality determination method, apparatus, device, and storage medium

Info

Publication number: WO2023169318A1
Application number: PCT/CN2023/079505
Authority: WO
Inventors: 王帅; 石雅南
Original assignee: 百果园技术(新加坡)有限公司; 王帅
Priority date: 2022-03-11
Filing date: 2023-03-03
Publication date: 2023-09-14
Also published as: CN114596287A

Abstract

Embodiments of the present application provide an image quality determination method, an apparatus, a device, and a storage medium. The method comprises: obtaining a distorted image and a corresponding original image, and performing information extraction on the original image to obtain semantic information and saliency information; determining a pixel weight of each pixel point according to the semantic information and the saliency information; and calculating image quality information on the basis of the pixel weights and pixel values of pixel points corresponding to the distorted image and the original image. An image quality evaluation result ultimately determined by the present solution better conforms to a subjective feeling of the human eye while satisfying an objectivity requirement.

Description

Image quality determination method, device, equipment and storage medium

This application claims priority to the Chinese patent application with application number 202210238500.3, which was submitted to the China Patent Office on March 11, 2022. The entire content of this application is incorporated into this application by reference.

Technical field

The embodiments of the present application relate to the field of image processing technology, and in particular, to an image quality determination method, device, equipment and storage medium.

Background technique

Image quality evaluation refers to the quantitative description of the degree of distortion between two images with similar subject content in a subjective and objective manner. It plays a very important role in algorithm analysis and comparison, system performance evaluation, etc. in the field of image/video processing.

Image quality evaluation can be divided into subjective evaluation and objective evaluation in terms of methods. Subjective evaluation refers to the evaluation of image quality through people's subjective feelings, that is, the original reference image and the distorted image are given, and the observer is allowed to evaluate the distorted image. It is generally described by the average subjective opinion score or the average subjective opinion score difference. Subjective evaluation requires a lot of manpower and material resources, and the evaluation results are easily affected by the tester's subjective factors and external conditions. The complexity of the evaluation process seriously affects its accuracy and versatility, so it should be applied to actual video processing systems. is extremely difficult. In contrast, objective evaluation uses mathematical models to directly give quantitative values of distortion, is simple to operate, and is widely used in various fields. For example, image quality can be objectively evaluated through a series of objective evaluation indicators commonly used in the industry. Although objective evaluation indicators are simple to operate and easy to implement, different objective indicators are not consistent with the subjective perception of the human eye to the same extent, resulting in unsatisfactory image quality evaluation results.

Contents of the invention

The embodiments of the present application provide an image quality determination method, device, equipment and storage medium, which solves the problem in related technologies that image quality evaluation results cannot well reflect subjective image quality feelings, so that the final image quality evaluation results are given More in line with the subjective perception of the human eye.

In a first aspect, embodiments of the present application provide an image quality determination method, which method includes:

Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain the language meaning information and salience information;

Determine the pixel weight of each pixel according to the semantic information and the saliency information;

Image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.

In a second aspect, embodiments of the present application also provide an image quality determination device, including:

An image acquisition module configured to acquire the distorted image and the corresponding original image;

An information extraction module configured to extract information from the original image to obtain semantic information and saliency information;

A weight calculation module configured to determine the pixel weight of each pixel based on the semantic information and the saliency information;

An image quality calculation module configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixels of the original image and the distorted image.

In a third aspect, embodiments of the present application also provide an image quality determination device, which includes:

one or more processors;

a storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image quality determination method described in the embodiments of this application.

In a fourth aspect, embodiments of the present application also provide a storage medium that stores computer-executable instructions, which, when executed by a computer processor, are used to perform the image quality determination method described in the embodiments of the present application.

In a fifth aspect, embodiments of the present application also provide a computer program product. The computer program product includes a computer program. The computer program is stored in a computer-readable storage medium. At least one processor of the device reads the computer program from the computer-readable storage medium. The computer program is fetched and executed, so that the device executes the image quality determination method described in the embodiment of the present application.

In the embodiment of this application, by obtaining the distorted image and the corresponding original image, information is extracted from the original image to obtain semantic information and saliency information, and the pixel weight of each pixel is determined based on the semantic information and saliency information. Based on the pixel weight and The pixel values corresponding to the original image and the distorted image are calculated to obtain image quality information, which improves the accuracy of image quality evaluation and makes the final image quality evaluation result more consistent with the subjective perception of the human eye on the premise of meeting the objectivity requirements. .

Description of drawings

Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application;

Figure 2 is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application;

Figure 3 is a flow chart of another image quality determination method provided by an embodiment of the present application;

Figure 4 is a schematic diagram illustrating semantic information obtained through information extraction provided by an embodiment of the present application;

Figure 5 is a schematic diagram illustrating saliency information obtained through information extraction provided by an embodiment of the present application;

Figure 6 is a flow chart of another image quality determination method provided by an embodiment of the present application;

Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application;

Figure 8 is a structural block diagram of an image quality determination device provided by an embodiment of the present application;

Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application.

Detailed ways

The embodiments of the present application will be further described in detail below with reference to the accompanying drawings and examples. It can be understood that the specific embodiments described here are only used to explain the embodiments of the present application, but are not intended to limit the embodiments of the present application. In addition, it should be noted that, for convenience of description, only some but not all structures related to the embodiments of the present application are shown in the drawings.

The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It is to be understood that the figures so used are interchangeable under appropriate circumstances so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein, and that "first," "second," etc. are distinguished Objects are usually of one type, and the number of objects is not limited. For example, the first object can be one or multiple. In addition, "and/or" in the description and claims indicates at least one of the connected objects, and the character "/" generally indicates that the related objects are in an "or" relationship.

Figure 1 is a flow chart of an image quality determination method provided by an embodiment of the present application, which can be used to evaluate image and video quality. The method can be executed by computing devices such as servers, smart terminals, notebooks, tablets, etc. Specifically Includes the following steps:

Step S101: Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.

In one embodiment, a comparison calculation between the distorted image and the original image is performed to obtain the quantified value of the distorted image and thereby obtain the corresponding image quality evaluation. The original image may be a clear undistorted image, that is, the reference image of the distorted image. Distorted images are images with different distortions relative to the original image, that is, noise images. The embodiment of this application is based on the objective image quality evaluation method. Method, the attention mechanism is introduced to achieve a final more reasonable image quality evaluation.

In one embodiment, the acquired original image may be one image or a video frame sequence composed of multiple image frames, such as consecutive multiple frames of live image images in a live video. Semantic information and saliency information are obtained by extracting information from the original image. Optionally, the information extraction method can be to extract information from the original image through a trained neural network model, or an image feature extraction algorithm can be used to extract information. The semantic information represents the category of each pixel in the image. Optionally, the category of the pixel can be one of multiple preset object categories. Examples of the object categories can be faces, microphones, and chairs. , hair, keyboard, etc. The saliency information represents the grayscale of each pixel in the image, which represents the degree of attention of each pixel to human vision, and can be a normalized grayscale image. Optionally, the semantic information and saliency information are presented through image masks.

Step S102: Determine the pixel weight of each pixel according to the semantic information and the saliency information.

In one embodiment, after obtaining the semantic information and saliency information of each pixel through information extraction, the pixel weight of each pixel is calculated based on the semantic information and saliency information. Optionally, as shown in Figure 2, Figure 2 is a flow chart of a method for determining the pixel weight of a pixel point provided by an embodiment of the present application, which specifically includes:

Step S1021: Determine the semantic weight of each pixel according to the semantic information and the corresponding basic weight.

In one embodiment, the semantic information includes the semantic index value of each pixel obtained through information extraction, where different semantic index values correspond to different object categories. For example, taking the live broadcast scene as an example, there are 13 categories, corresponding to index values 0 to 12, that is, the semantic index value 0 corresponds to category 1, the semantic index value 1 corresponds to category 2, and the semantic index value 2 corresponds to category 3. analogy. Different object categories are set with corresponding basic weights. For example, category 1 corresponds to basic weight 1, category 2 corresponds to basic weight 2, and so on. Correspondingly, the method of determining the semantic weight of each pixel based on the semantic information and the corresponding basic weight may be: determining the semantic index value of each pixel, and determining the basic weight corresponding to the semantic index value of each pixel as the semantic Weights. For example, the basic weights are recorded as ω ₀ , ω ₁ ,..., ω _N , which in turn correspond to the basic weight values corresponding to the semantic index value 0 to the semantic index value N, and the semantics of the pixel point at position (i, j) The weight is recorded as W _se (i,j), then the semantic weight where p _se (i,j) represents the graph The semantic index value of the image mask at the (i,j) position. The specific size of the basic weight can be set based on subjective experience or measured through statistical methods.

Step S1022: Calculate the saliency weight of each pixel based on the saliency information and the corresponding predefined weight.

In one embodiment, the saliency information includes the gray value of each pixel obtained by information extraction. Optionally, the gray value can be a normalized value, ranging from 0 to 1. Optionally The value 0 represents "easiest to ignore", and the value 1 represents "most interesting". The predefined weight includes a predefined minimum weight value and a predefined maximum weight value. The exemplary predefined minimum weight value is recorded as ω _min and the predefined maximum weight value is recorded as ω _max . Optionally, calculate the saliency weight of each pixel based on the saliency information and the corresponding predefined weight, including: calculating the linear grayscale value of each pixel based on the predefined minimum weight value and the predefined maximum weight value. The difference is the saliency weight of each pixel. For example, the saliency weight is recorded as W _sa (i,j), then W _sa (i,j)=(ω _max -ω _min )p _sa (i,j)+ω _min , where, p _sa (i , j) represents the gray value of the image mask at the (i, j) position. The specific sizes of the predefined maximum weight value and the predefined minimum weight value can be set based on subjective experience or measured through statistical methods.

Step S1023: Calculate the pixel weight of each pixel based on the semantic weight of each pixel and the corresponding saliency weight.

In one embodiment, after determining the semantic weight of each pixel and the corresponding saliency weight, the final pixel weight can be calculated by: determining the product of the semantic weight of each pixel and the corresponding saliency weight. is the pixel weight of the pixel. For example, the semantic weight is recorded as W _se (i, j), the saliency weight is recorded as W _sa (i, j), and the pixel weight of the pixel at the (i, j) position, that is, the total weight is recorded as W ( i,j), then W(i,j)= _Wse (i,j)* _Wsa (i,j).

Step S103: Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.

In one embodiment, after the pixel weight is determined based on semantic information and saliency information, the pixel weight is combined with an objective image quality evaluation method to calculate the final image quality information. Optionally, the objective image quality evaluation method can be PSNR (Peak Signal-to-Noise Ratio, peak signal-to-noise ratio) algorithm, SSIM (Structural SIMilarity, structural similarity) algorithm or VMAF (Video Multitime Assessment Fusion, video multimedia evaluation) Fusion) algorithm, etc.

In one embodiment, the image quality information is calculated based on the pixel weight and the pixel values of the corresponding pixels of the original image and the distorted image. That is, the quantitative calculation of the degree of distortion of the distorted image relative to the original image is performed in combination with the pixel weight. The image quality information is calculated, making the final settlement result more consistent with the subjective perception of the human eye on the premise of meeting the objectivity requirements.

It can be seen from the above scheme that by obtaining the distorted image and the corresponding original image, the original image is extracted to obtain semantic information and saliency information, and the pixel weight of each pixel is determined based on the semantic information and saliency information. Based on the pixel weight and the original The pixel values of corresponding pixels in the image and the distorted image are calculated to obtain image quality information, which improves the accuracy of image quality evaluation. When performing image quality evaluation, the attention mechanism is introduced so that the final image quality evaluation result meets the objectivity requirements. Under the premise, it is more in line with the subjective feeling of the human eye.

Figure 3 is a flow chart of another image quality determination method provided by the embodiment of the present application. It provides a method of extracting information from the original image to obtain semantic information and saliency information. As shown in Figure 3, it includes:

Step S201: Obtain the distorted image and the corresponding original image, and extract information from the original image through the trained convolutional neural network with a dual-branch structure to obtain semantic information and saliency information.

In one embodiment, semantic information and saliency information are obtained by extracting information from the original image through a neural network model. Among them, a convolutional neural network with a dual-branch structure is specifically used, in which semantic information is extracted through one branch and saliency information is extracted through the other branch. Among them, a schematic diagram of extracting information from the original image to obtain semantic information is shown in Figure 4. Figure 4 is a schematic diagram of extracting semantic information provided by an embodiment of the present application; extracting information from the original image to obtain saliency information The schematic diagram of is shown in Figure 5. Figure 5 is a schematic diagram of information extraction to obtain saliency information provided by an embodiment of the present application.

Step S202: Determine the pixel weight of each pixel according to the semantic information and the saliency information.

Step S203: Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.

It can be seen from the above that when determining image quality, the semantic information and saliency information are obtained by extracting information from the original image through the trained convolutional neural network with a dual-branch structure, making the obtained semantic information and saliency information more accurate. , the rationality is stronger, so that the final image quality information can be significantly more consistent with the subjective perception of the human eye on the premise of meeting the objectivity requirements.

Figure 6 is a flow chart of another image quality determination method provided by the embodiment of the present application. It provides a process of extracting semantic information and saliency information using a dual-branch network architecture. As shown in Figure 6, it includes:

Step S301: Obtain the distorted image and the corresponding original image.

Step S302: Extract semantic information from the original image through the semantic branch of the convolutional neural network to obtain semantic information. The semantic branch uses global pooling as a feature weight factor, and the number of channels in each layer of the network is less than the preset number of channels.

In one embodiment, the semantic branch has a smaller number of feature channels and a deeper number of layers, which is beneficial to extracting semantic context information of the image. Among them, since the semantic branch only needs a large receptive field to capture the semantic context features of the image, the semantic branch adopts a lightweight structural design with a small number of channels in each layer. As the layer deepens, the feature map is quickly processed. Downsampling. In addition, in order to further expand the receptive field, it uses global pooling as the feature weight factor. Wherein, the preset channel number can be 1 or 2.

Step S303: Extract saliency information from the original image through the detail branch of the convolutional neural network, where the number of network levels of the detail branch is less than the preset number of levels.

Among them, the detail branch has a larger number of feature channels and a shallower number of layers to capture the spatial detail information of the image, such as texture, edges, etc. The detail branch is mainly used to extract image spatial detail information, using a shallower number of levels, a richer number of features and a large resolution. In one embodiment, it includes three stages, each stage consisting of a cascade of several convolutional layers, batch normalization, and activation functions. The stride of the first layer of convolution in each stage is 2, and the other layers have the same number of convolution kernels. The output feature map size of this detail branch is 1/8 of the source size. For example, the preset number of layers may be 4 layers, 5 layers or 6 layers.

Step S304: Fusion the semantic information and the saliency information through the fusion layer of the convolutional neural network to obtain a semantic information graph containing the updated semantic information and saliency information.

In one embodiment, after obtaining the semantic information and saliency information respectively output by the semantic branch and the detail branch, the fusion process is performed through the fusion layer. The features output by the detail branch and the semantic branch are complementary features. After the fusion layer, a more complex and precise feature expression is obtained, so that the updated semantic information and saliency information are more accurate. For example, in the salience area, that is, the user's eyes are interested in In the area, the weights of different object categories represented by semantic information are: the basic weight of the face is 1, the basic weight of the hair is 0.9, the basic weight of the microphone is 0.7, and the basic weight of the chair is 0.6; in the non-salient area, that is, the user's human eyes For areas that are not of interest, the basic weight of the face is 0.5, the basic weight of the hair is 0.4, the basic weight of the microphone is 0.3, and the basic weight of the chair is 0.3.

Step S305: Determine the pixel weight of each pixel based on the updated semantic information and saliency information.

The method of determining the pixel weight of each pixel based on the updated semantic information and saliency information can be referred to the aforementioned method of determining the pixel weight of each pixel based on the semantic information and saliency information, which will not be described again here.

Step S306: Calculate image quality information based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.

It can be seen from the above that when determining the image quality, the semantic information is extracted from the original image through the semantic branch of the convolutional neural network through specially set branch structures, and the semantic information is extracted from the original image through the detail branch of the convolutional neural network. The saliency information is extracted to obtain the saliency information, which achieves more accurate and targeted information extraction. At the same time, a more complex and accurate feature expression is obtained through the fusion layer, so that the final image quality information can meet the objectivity requirements. More in line with the subjective perception of the human eye.

Based on the above technical solution, the convolutional neural network used further includes auxiliary segmentation head nodes to improve the convergence speed during training. In one embodiment, the auxiliary segmentation head is inserted into different positions of each branch to perform channel control in different dimensions by adjusting the computational complexity of the auxiliary segmentation head and the corresponding main segmentation head to achieve rapid convergence during training.

Figure 7 is a flow chart of another image quality determination method provided by an embodiment of the present application. It provides a process of calculating image quality information, as shown in Figure 7, including:

Step S401: Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information.

Step S402: Determine the pixel weight of each pixel according to the semantic information and the saliency information.

Step S403: Calculate the mean square error between the distorted image and the original image based on the pixel weight, and substitute the mean square error into the peak signal-to-noise ratio formula to calculate image quality information.

In one embodiment, the calculation is performed by combining the obtained pixel weight with the PSNR algorithm as an example. bright. For example, taking the size of the original image and the corresponding distorted image as (m, n), I is the original image, K is the distorted image, i and j represent the coordinate position of a certain pixel in the image, first calculate the weight-based The mean square error WMSE, and then the weight-based image quality information WPSNR calculated by normalization, are calculated as:

It can be seen from the above that compared with traditional PSNR, which uses equal weight for the error of each pixel (that is, the value of W (i, j) in the above formula is 1), WPSNR uses adaptive methods by mining the semantic and saliency features of the image. The weighting strategy introduces the attention mechanism, so it is more in line with the subjective perception of the human eye. Through comparison of experimental data, it is found that the WPSNR calculated using this scheme as an image evaluation index is better than other objective image quality evaluation algorithms PSNR, SSIM, VMAF is better, see the table below for details:

The percentages in the table above represent the proportion of cases where the correlation coefficient (Pearson coefficient or Spearman coefficient) between the current evaluation method and MOS data is the highest among a batch of ordinary live broadcast scene videos. It can be seen that WPSNR is significantly better than other evaluation methods.

Figure 8 is a structural block diagram of an image quality determination device provided by an embodiment of the present application. The device is used to execute the image quality determination method provided by the above embodiment, and has functional modules and beneficial effects corresponding to the execution method. As shown in Figure 8, the device specifically includes: image acquisition module 101, information extraction module 102, weight calculation module 103 and image quality calculation module 104, wherein,

The image acquisition module 101 is configured to acquire the distorted image and the corresponding original image;

The information extraction module 102 is configured to extract information from the original image to obtain semantic information and saliency information;

The weight calculation module 103 is configured to determine the pixel weight of each pixel according to the semantic information and the saliency information;

The image quality calculation module 104 is configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixels of the original image and the distorted image.

It can be seen from the above scheme that by obtaining the distorted image and the corresponding original image, the original image is extracted to obtain semantic information and saliency information, and the pixel weight of each pixel is determined based on the semantic information and saliency information. Based on the pixel weight and the original The pixel values corresponding to the image and the distorted image are calculated to obtain image quality information, which improves the accuracy of image quality evaluation and makes the final image quality evaluation result more consistent with the subjective perception of the human eye on the premise of meeting the objectivity requirements.

In a possible embodiment, the weight calculation module 103 is configured as:

Determine the semantic weight of each pixel according to the semantic information and the corresponding basic weight;

Calculate the saliency weight of each pixel based on the saliency information and the corresponding predefined weight;

The pixel weight of each pixel is calculated based on the semantic weight of each pixel and the corresponding saliency weight.

In a possible embodiment, the semantic information includes the semantic index value of each pixel obtained through information extraction. Different semantic index values correspond to different object categories. The weight calculation module 103 is configured as:

Determine the semantic index value of each pixel;

The basic weight corresponding to the semantic index value of each pixel is determined as the semantic weight.

In a possible embodiment, the saliency information includes the gray value of each pixel obtained by information extraction, the predefined weight includes a predefined minimum weight value and a predefined maximum weight value, and the weight calculation module 103 is configured as:

The linear difference of the gray value of each pixel is calculated based on the predefined minimum weight value and the predefined maximum weight value to obtain the saliency weight of each pixel.

In a possible embodiment, the weight calculation module 103 is configured as:

The product of the semantic weight of each pixel and the corresponding saliency weight is determined as the pixel weight of the pixel.

In a possible embodiment, the information extraction module 102 is configured as:

Semantic information and saliency information are obtained by extracting information from the original image through a trained convolutional neural network with a dual-branch structure.

The semantic information of the original image is extracted through the semantic branch of the convolutional neural network to obtain the semantic Information, the semantic branch uses global pooling as the feature weight factor, and the number of channels in each layer of the network is less than the preset number of channels;

The saliency information is extracted from the original image through the detail branch of the convolutional neural network, and the number of network levels of the detail branch is less than the preset number of levels;

The semantic information and the saliency information are fused through the fusion layer of the convolutional neural network to obtain a semantic information graph containing the updated semantic information and saliency information.

In a possible embodiment, the image quality calculation module 104 is configured as:

Calculate the mean square error of the distorted image and the original image based on the pixel weight;

The mean square error is substituted into the peak signal-to-noise ratio formula to calculate image quality information.

Figure 9 is a schematic structural diagram of an image quality determination device provided by an embodiment of the present application. As shown in Figure 9, the device includes a processor 201, a memory 202, an input device 203 and an output device 204; the number of processors 201 in the device It can be one or more. In Figure 9, one processor 201 is taken as an example; the processor 201, memory 202, input device 203 and output device 204 in the device can be connected through a bus or other means. In Figure 9, the processor 201, the memory 202, the input device 203 and the output device 204 are connected through a bus. For example. As a computer-readable storage medium, the memory 202 can be used to store software programs, computer-executable programs and modules, such as program instructions/modules corresponding to the image quality determination method in the embodiment of the present application. The processor 201 executes various functional applications and data processing of the device by running software programs, instructions and modules stored in the memory 202, that is, implementing the above image quality determination method. The input device 203 may be used to receive input numeric or character information and generate key signal inputs related to user settings and functional control of the device. The output device 204 may include a display device such as a display screen.

Embodiments of the present application also provide a storage medium containing computer-executable instructions, which when executed by a computer processor are used to perform an image quality determination method described in the above embodiments, which includes:

Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information;

It is worth noting that in the above embodiments of the image quality determination device, the various units and modules included are only divided according to functional logic, but are not limited to the above divisions, as long as the corresponding functions can be realized; in addition, The specific names of each functional unit are only for the convenience of distinguishing each other and are not It is not used to limit the scope of protection of the embodiments of this application.

In some possible implementations, various aspects of the method provided by this application can also be implemented in the form of a program product, which includes program code. When the program product is run on a computer device, the program code is used to The computer device is caused to execute the steps in the methods described above in this specification according to various exemplary embodiments of the present application. For example, the computer device may execute the image quality determination method described in the embodiments of the present application. The program product may be implemented in any combination of one or more readable media.

Claims

Image quality determination methods, including:

Obtain the distorted image and the corresponding original image, and perform information extraction on the original image to obtain semantic information and saliency information;

Determine the pixel weight of each pixel according to the semantic information and the saliency information;

Image quality information is calculated based on the pixel weight and the pixel values of corresponding pixels in the original image and the distorted image.
The image quality determination method according to claim 1, wherein the determining the pixel weight of each pixel point according to the semantic information and the saliency information includes:

Determine the semantic weight of each pixel according to the semantic information and the corresponding basic weight;

Calculate the saliency weight of each pixel based on the saliency information and the corresponding predefined weight;

The pixel weight of each pixel is calculated based on the semantic weight of each pixel and the corresponding saliency weight.
The image quality determination method according to claim 2, wherein the semantic information includes a semantic index value of each pixel obtained by information extraction, and different semantic index values correspond to different object categories. According to the semantic information and the corresponding basic weight determine the semantic weight of each pixel, including:

Determine the semantic index value of each pixel;

The basic weight corresponding to the semantic index value of each pixel is determined as the semantic weight.
The image quality determination method according to claim 2 or 3, wherein the saliency information includes the gray value of each pixel obtained by information extraction, and the predefined weight includes a predefined minimum weight value and a predefined maximum The weight value, which is calculated to obtain the saliency weight of each pixel based on the saliency information and the corresponding predefined weight, includes:

The linear difference of the gray value of each pixel is calculated based on the predefined minimum weight value and the predefined maximum weight value to obtain the saliency weight of each pixel.
The image quality determination method according to claim 2, wherein the pixel weight of each pixel is calculated based on the semantic weight of each pixel and the corresponding saliency weight, including:

The product of the semantic weight of each pixel and the corresponding saliency weight is determined as the pixel weight of the pixel.
The image quality determination method according to any one of claims 1 to 5, wherein the information extraction of the original image to obtain semantic information and saliency information includes:

Semantic information and saliency information are obtained by extracting information from the original image through a trained convolutional neural network with a dual-branch structure.
The image quality determination method according to claim 6, wherein the semantic information and saliency information are obtained by extracting information from the original image through a trained convolutional neural network with a dual-branch structure, including:

Semantic information is extracted from the original image through the semantic branch of the convolutional neural network. The semantic branch uses global pooling as the feature weight factor, and the number of channels in each layer of the network is less than the preset number of channels;

The saliency information is extracted from the original image through the detail branch of the convolutional neural network, and the number of network levels of the detail branch is less than the preset number of levels;

The semantic information and the saliency information are fused through the fusion layer of the convolutional neural network to obtain a semantic information graph containing the updated semantic information and saliency information.
The image quality determination method according to any one of claims 1 to 7, wherein the image quality information is calculated based on the pixel weight and the pixel values corresponding to the original image and the distorted image, including :

Calculate the mean square error of the distorted image and the original image based on the pixel weight;

The mean square error is substituted into the peak signal-to-noise ratio formula to calculate image quality information.
Image quality determining device, including:

An image acquisition module configured to acquire the distorted image and the corresponding original image;

An information extraction module configured to extract information from the original image to obtain semantic information and saliency information;

A weight calculation module configured to determine the pixel weight of each pixel based on the semantic information and the saliency information;

An image quality calculation module configured to calculate image quality information based on the pixel weight and the pixel values of corresponding pixels of the original image and the distorted image.
An image quality determination device, the device includes: one or more processors; a storage device for storing one or more programs, when the one or more programs are executed by the one or more processors, The one or more processors are caused to implement the image quality determination method according to any one of claims 1-8.
A storage medium storing computer-executable instructions that, when executed by a computer processor, are used to perform the image quality determination method of any one of claims 1-8.
A computer program product includes a computer program, wherein when the computer program is executed by a processor, the image quality determination method according to any one of claims 1-8 is implemented.