CN115700744A - Video quality evaluation method, device, equipment and medium based on coding information - Google Patents

Video quality evaluation method, device, equipment and medium based on coding information

Info

Publication number
CN115700744A
CN115700744A CN202110858544.1A CN202110858544A CN115700744A CN 115700744 A CN115700744 A CN 115700744A CN 202110858544 A CN202110858544 A CN 202110858544A CN 115700744 A CN115700744 A CN 115700744A
Authority
CN
China
Prior art keywords
image
image frame
video stream
quantization parameter
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110858544.1A
Other languages
Chinese (zh)
Inventor
张文杰
陈莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ape Power Future Technology Co Ltd
Original Assignee
Beijing Ape Power Future Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ape Power Future Technology Co Ltd filed Critical Beijing Ape Power Future Technology Co Ltd
Priority to CN202110858544.1A priority Critical patent/CN115700744A/en
Publication of CN115700744A publication Critical patent/CN115700744A/en
Pending legal-status Critical Current


Abstract

The present disclosure provides a video quality evaluation method, apparatus, device, and storage medium based on coding information, relating to the technical field of data processing. The implementation scheme is as follows: acquiring a first video stream and a corresponding encoded second video stream, wherein the first video stream comprises a plurality of first image frames, and the second video stream comprises a plurality of second image frames corresponding to the plurality of first image frames respectively; decoding the second video stream, and determining the type of each second image frame and a corresponding first quantization parameter; determining a weight value for each position pixel point in each second image frame according to the type of each second image frame and the corresponding first quantization parameter; and determining a quality parameter corresponding to the second video stream based on the weight value of each position pixel point and the pixel values of each position pixel point in the first image frame and the second image frame. In this way, the video quality can be accurately evaluated in multiple dimensions, from both the user's subjective perception and the data level, with high accuracy.

Description

Video quality evaluation method, device, equipment and medium based on coding information
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for evaluating video quality based on coding information.
Background
With the rapid development of video coding and network technology, the demand for high-quality video images keeps growing. Video quality evaluation can be divided into subjective evaluation and objective evaluation according to the evaluation mode: subjective evaluation means that video or image quality is scored according to people's subjective perception, while objective evaluation means that a computer scores video quality according to a specific algorithm.
In the related art, the video quality distortion before and after video encoding can be reflected at the data level by objective video quality evaluation indexes such as peak signal-to-noise ratio or structural similarity, but these indexes cannot reflect the subjective perception of human eyes well. Therefore, how to evaluate video quality effectively and accurately is a problem that urgently needs to be solved.
Disclosure of Invention
The disclosure provides a video quality evaluation method, device, equipment and storage medium based on coding information.
According to a first aspect of the present disclosure, there is provided a video quality evaluation method based on coding information, including:
acquiring a first video stream and a corresponding encoded second video stream, wherein the first video stream comprises a plurality of first image frames, and the second video stream comprises a plurality of second image frames corresponding to the plurality of first image frames respectively;
decoding the second video stream to determine the type of each second image frame and a corresponding first quantization parameter;
determining a weight value of each position pixel point in each second image frame according to the type of each second image frame and a corresponding first quantization parameter;
and determining a quality parameter corresponding to the second video stream based on the weight value of each position pixel point and the pixel value of each position pixel point in the first image frame and the second image frame.
Optionally, the decoding the second video stream to determine the type of each second image frame and the corresponding first quantization parameter includes:
decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame;
and determining the average value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
Optionally, the determining, according to the type of each second image frame and the corresponding first quantization parameter, a weight value of each position pixel point in each second image frame includes:
classifying each second image frame according to the type of each second image frame so as to determine an intra-coding frame image set, a predictive coding frame image set and a bidirectional predictive coding frame image set contained in the second video stream;
determining a reference quantization parameter corresponding to each image set according to a first quantization parameter corresponding to each second image frame in each image set;
determining an adjusting coefficient corresponding to each image set according to the reference quantization parameter corresponding to each image set;
adjusting the second quantization parameter corresponding to each image block in each second image frame based on the adjustment coefficient corresponding to each image set to determine a third quantization parameter corresponding to each image block;
and determining the weight value of each position pixel point in each second image frame according to each third quantization parameter.
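The weight-derivation steps above can be sketched as follows. This is a hypothetical illustration: the patent does not specify how the adjustment coefficient maps each set's reference quantization parameter onto the reference parameter, so the multiplicative ratio against the intra-coded (I) set, the function name, and the sample QP values are all assumptions.

```python
def adjustment_coefficients(frames):
    """frames: list of (frame_type, block_qps) with frame_type in 'I', 'P', 'B'."""
    # Group the frame-level (first) quantization parameters by frame type.
    sets = {}
    for ftype, block_qps in frames:
        first_qp = sum(block_qps) / len(block_qps)  # mean of the block-level QPs
        sets.setdefault(ftype, []).append(first_qp)
    # Reference QP of each image set: mean of its frames' first QPs.
    ref = {t: sum(qps) / len(qps) for t, qps in sets.items()}
    # Assumption: take the I-set's reference QP as the reference parameter
    # and scale every set's reference QP onto it.
    base = ref.get('I', next(iter(ref.values())))
    return {t: base / r for t, r in ref.items()}
```

Under this sketch, a block's third quantization parameter would be its second quantization parameter multiplied by its set's coefficient, after which the weight of each pixel position follows from the third quantization parameter.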
Optionally, the determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set includes:
and determining the mean value of the first quantization parameters respectively corresponding to the second image frames in each image set as the reference quantization parameter corresponding to each image set.
Optionally, determining an adjustment coefficient corresponding to each image set according to the reference quantization parameter corresponding to each image set, includes:
and determining an adjusting coefficient corresponding to the reference quantization parameter corresponding to each of the rest image sets when the reference quantization parameter corresponding to any image set is adjusted to be the reference parameter.
Optionally, after the adjusting the second quantization parameter corresponding to each image block in each second image frame to determine the third quantization parameter corresponding to each image block, the method further includes:
under the condition that a third quantization parameter corresponding to any image block is smaller than a first threshold, determining the third quantization parameter corresponding to any image block as the first threshold;
alternatively,
and determining a third quantization parameter corresponding to any image block as a second threshold value when the third quantization parameter corresponding to any image block is larger than the second threshold value, wherein the second threshold value is larger than the first threshold value.
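The thresholding described above amounts to clamping each third quantization parameter into a fixed range. A minimal sketch, assuming illustrative threshold values (the patent only requires that the second threshold be larger than the first):

```python
def clamp_third_qp(qp, first_threshold=10, second_threshold=50):
    # Values below the first threshold are raised to it; values above the
    # second threshold are lowered to it. The threshold values here are
    # illustrative assumptions, not taken from the patent.
    if qp < first_threshold:
        return first_threshold
    if qp > second_threshold:
        return second_threshold
    return qp
```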
Optionally, the determining the quality parameter corresponding to the second video stream based on the weight value of each location pixel and the pixel value of each location pixel in the first image frame and the second image frame includes:
determining the mean square error between each second image frame and the corresponding first image frame based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame respectively;
determining a reference mean square error according to each mean square error and the number of image frames contained in the second video stream;
normalizing the reference mean square error to determine a mean square error between the second video stream and the first video stream;
and determining a peak signal-to-noise ratio corresponding to the second video stream according to the mean square error between the second video stream and the first video stream.
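The PSNR computation above can be sketched as follows. Normalizing by the sum of weights and averaging the per-frame errors over the number of frames are one plausible reading of the steps (the patent does not give the exact formulas), and all names are illustrative.

```python
import math

def weighted_mse(first_frame, second_frame, weights):
    # Pixel-wise squared error, scaled by each position's weight and
    # normalized by the total weight (an assumed normalization).
    err = sum(w * (a - b) ** 2
              for a, b, w in zip(first_frame, second_frame, weights))
    return err / sum(weights)

def stream_psnr(first_frames, second_frames, weight_maps, max_val=255):
    # Reference MSE: the per-frame weighted MSEs averaged over the number
    # of image frames contained in the second video stream.
    mses = [weighted_mse(a, b, w)
            for a, b, w in zip(first_frames, second_frames, weight_maps)]
    ref_mse = sum(mses) / len(mses)
    if ref_mse == 0:
        return float('inf')  # identical streams
    return 10 * math.log10(max_val ** 2 / ref_mse)
```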
Optionally, the determining the quality parameter corresponding to the second video stream based on the weight value of each location pixel and the pixel value of each location pixel in the first image frame and the second image frame includes:
determining a first mean value and a first variance corresponding to a first image frame according to a weight value corresponding to each position pixel point and a first pixel value in the first image frame;
determining a second mean value and a second variance corresponding to a second image frame according to a weight value corresponding to each position pixel point and a second pixel value in the second image frame;
determining covariance between the first image frame and the corresponding second image frame according to a first pixel value, a second pixel value, a weight value, a first average value corresponding to the first image frame and a second average value corresponding to the second image frame corresponding to each position pixel point;
determining the structural similarity between the first image frame and the corresponding second image frame according to the first mean value, the first variance, the second mean value, the second variance and the covariance;
and determining the structural similarity corresponding to the second video stream according to the structural similarities.
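The weighted structural similarity described above might be sketched as below, using the standard SSIM stabilizing constants; the weighting of the means, variances, and covariance is one plausible interpretation, and the names are assumptions.

```python
def weighted_ssim(first_frame, second_frame, weights, max_val=255):
    # Standard SSIM stabilizing constants.
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    wsum = sum(weights)
    # Weighted first and second means.
    mu1 = sum(w * a for w, a in zip(weights, first_frame)) / wsum
    mu2 = sum(w * b for w, b in zip(weights, second_frame)) / wsum
    # Weighted first and second variances, and the weighted covariance.
    var1 = sum(w * (a - mu1) ** 2 for w, a in zip(weights, first_frame)) / wsum
    var2 = sum(w * (b - mu2) ** 2 for w, b in zip(weights, second_frame)) / wsum
    cov = sum(w * (a - mu1) * (b - mu2)
              for w, a, b in zip(weights, first_frame, second_frame)) / wsum
    return ((2 * mu1 * mu2 + c1) * (2 * cov + c2)) / \
           ((mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2))
```

The per-frame structural similarities could then be combined, for example by averaging over the stream, to obtain the structural similarity corresponding to the second video stream.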
According to a second aspect of the present disclosure, there is provided an apparatus for evaluating video quality based on coding information, comprising:
an obtaining module, configured to obtain a first video stream and a corresponding encoded second video stream, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively;
a first determining module, configured to perform decoding processing on the second video stream to determine a type of each second image frame and a corresponding first quantization parameter;
the second determining module is used for determining a weight value of each position pixel point in each second image frame according to the type of each second image frame and the corresponding first quantization parameter;
and the third determining module is used for determining the quality parameter corresponding to the second video stream based on the weight value of each position pixel point and the pixel value of each position pixel point in the first image frame and the second image frame.
Optionally, the first determining module is specifically configured to:
decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame;
and determining the mean value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
Optionally, the second determining module includes:
a first determining unit, configured to classify each of the second image frames according to a type of each of the second image frames to determine an intra-coded frame image set, a predictive-coded frame image set, and a bidirectional predictive-coded frame image set included in the second video stream;
the second determining unit is used for determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set;
a third determining unit, configured to determine, according to the reference quantization parameter corresponding to each image set, an adjustment coefficient corresponding to each image set;
a fourth determining unit, configured to adjust the second quantization parameter corresponding to each image block in each second image frame based on the adjustment coefficient corresponding to each image set, so as to determine a third quantization parameter corresponding to each image block;
and the fifth determining unit is used for determining the weight value of each position pixel point in each second image frame according to each third quantization parameter.
Optionally, the second determining unit is specifically configured to:
and determining the mean value of the first quantization parameters respectively corresponding to the second image frames in each image set as the reference quantization parameter corresponding to each image set.
Optionally, the third determining unit is specifically configured to:
and determining an adjustment coefficient corresponding to the reference quantization parameter corresponding to each of the rest image sets when the reference quantization parameter corresponding to any image set is adjusted to be the reference parameter.
Optionally, the fourth determining unit is further configured to:
under the condition that a third quantization parameter corresponding to any image block is smaller than a first threshold, determining the third quantization parameter corresponding to any image block as the first threshold;
alternatively,
and determining a third quantization parameter corresponding to any image block as a second threshold value when the third quantization parameter corresponding to any image block is larger than the second threshold value, wherein the second threshold value is larger than the first threshold value.
Optionally, the third determining module is specifically configured to:
determining the mean square error between each second image frame and the corresponding first image frame based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame respectively;
determining a reference mean square error according to each mean square error and the number of image frames contained in the second video stream;
normalizing the reference mean square error to determine a mean square error between the second video stream and the first video stream;
and determining a peak signal-to-noise ratio corresponding to the second video stream according to the mean square error between the second video stream and the first video stream.
Optionally, the third determining module is specifically configured to:
determining a first mean value and a first variance corresponding to a first image frame according to a weight value corresponding to each position pixel point and a first pixel value in the first image frame;
determining a second mean value and a second variance corresponding to a second image frame according to the weight value corresponding to each position pixel point and a second pixel value in the second image frame;
determining covariance between the first image frame and the corresponding second image frame according to a first pixel value, a second pixel value and a weight value corresponding to each position pixel point, a first average value corresponding to the located first image frame and a second average value corresponding to the located second image frame;
determining the structural similarity between the first image frame and the corresponding second image frame according to the first mean value, the first variance, the second mean value, the second variance and the covariance;
and determining the structural similarity corresponding to the second video stream according to the structural similarities.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to cause the at least one processor to perform a method as described in an embodiment of the above aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method described in the embodiment of the above aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of an embodiment of the above-mentioned aspect.
In the embodiment of the disclosure, a first video stream and a corresponding encoded second video stream are obtained first, then the second video stream is decoded to determine the type of each second image frame and a corresponding first quantization parameter, then a weight value of each pixel point in each second image frame is determined according to the type of each second image frame and the corresponding first quantization parameter, and finally a quality parameter corresponding to the second video stream is determined based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame. Therefore, different weights are given to different pixel points by combining with the coded information, and then the video quality is evaluated based on the weight value of each pixel point and the corresponding pixel values before and after compression, so that the video quality is accurately evaluated by multiple dimensions from the subjective perception and data level of a user, and the accuracy is high.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic flow chart of a video quality evaluation method based on coding information according to the present disclosure;
fig. 2 is a schematic flow chart of another video quality evaluation method based on coding information provided according to the present disclosure;
fig. 3 is a schematic flowchart of still another video quality evaluation method based on coding information according to the present disclosure;
fig. 4 is a schematic flowchart of still another video quality evaluation method based on coding information according to the present disclosure;
fig. 5 is a block diagram illustrating a structure of an apparatus for evaluating video quality based on coding information according to the present disclosure;
fig. 6 is a block diagram of an electronic device provided in the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The video quality evaluation method based on coding information provided by the present disclosure may be executed by the video quality evaluation apparatus based on coding information provided by the present disclosure, or by the electronic device provided by the present disclosure. The electronic device may include, but is not limited to, a terminal device such as a desktop computer or a tablet computer, and may also be a server. In the following, without limiting the present disclosure, the method is described as being executed by the video quality evaluation apparatus based on coding information, hereinafter referred to as the "apparatus" for short.
A video quality evaluation method, apparatus, computer device, and storage medium based on coding information provided by the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a video quality evaluation method based on coding information according to an embodiment of the disclosure.
As shown in fig. 1, the video quality evaluation method based on coding information may include the steps of:
step 101, a first video stream and a corresponding encoded second video stream are obtained, wherein the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively.
That is, the first video stream is the original video, and the second video stream may be a video obtained by encoding the original video with any encoding device.
It should be noted that, in the present disclosure, the second video stream may be an encoded file generated based on any encoding standard. Optionally, the second video stream may be a video stream generated by encoding the first video stream based on the H.264 encoding standard, or based on any other encoding standard such as H.265, AV1, or AVS, which is not limited herein.
Optionally, the first video stream may be original video data of a teacher end or a student end acquired by a video device or a camera device in an online education scene, and the second video stream may be video stream data to be transmitted generated after encoding the acquired video data based on a preset encoding standard.
Step 102, decoding the second video stream to determine the type of each second image frame and the corresponding first quantization parameter.
It should be noted that, in order to reduce the amount of video data, video coding usually removes redundant information from the original video, i.e., the first video stream. To ensure that the decoding end can recover the original video, Intra Frames (I Frames), Predicted Frames (P Frames), and Bi-directional Predicted Frames (B Frames) are formed after video encoding.
The I frame, also called an intra-coded frame, is an independent frame carrying all of its own information; it can be decoded independently without reference to other images and can be simply understood as a static picture. The P frame, also called an inter-frame predictive coded frame, must be encoded with reference to the previous frame (which may be an I frame or a P frame), and records the difference between the current frame picture and that previous frame. When decoding, the difference defined by this frame is superimposed on the previously buffered picture to generate the final picture. The B frame, also called a bidirectional predictive coded frame, records the differences between the current frame and both the previous and subsequent frames. That is, to decode a B frame, both the previously buffered picture and the subsequently decoded picture must be obtained, and the final picture is obtained by superimposing the data of the current frame on the previous and subsequent pictures.
In addition, the Quantization Parameter (QP) is an important parameter in video coding and can be used to characterize the degree of quantization. The larger the quantization parameter, the greater the distortion of the corresponding encoded video.
That is, the first quantization parameter corresponding to each second image frame may be a quantization parameter corresponding to the second image frame, and may be used to characterize a quantization degree of the second image frame.
It is understood that the viewing end watches the video stream generated by decoding the encoded second video stream; that is, the quantization parameter may affect the subjective perception of the user to some extent.
Step 103, determining a weight value of each position pixel point in each second image frame according to the type of each second image frame and the corresponding first quantization parameter.
It should be noted that, during the encoding process, since different types of image frames may be referenced to different degrees, the quantization parameters used for them may also differ; accordingly, the confidence of the degree of video distortion characterized by the quantization parameter also differs. Thus, in the present disclosure, different weights may be determined for the pixel points in different types of second image frames.
For example, the I frame, the P frame, and the B frame in the second image frame may respectively determine the weight value of each pixel point according to the corresponding first quantization parameter based on different manners.
Alternatively, considering that the reference relationship of a frame has no correlation with the video quality when the quality is evaluated, the apparatus may first adjust the first quantization parameter corresponding to each second image frame based on the type of that frame, so as to remove the influence of the frame type on the corresponding first quantization parameter, and then determine the weight value of each position pixel point in the second image frame according to the adjusted quantization parameter. This avoids the influence of the type of the second image frame on the quantization parameter, and so on; the present disclosure is not limited in this respect.
And 104, determining a quality parameter corresponding to the second video stream based on the weight value of each position pixel point, the pixel value of each position pixel point in the first image frame and the second image frame.
Optionally, the determined quality parameter corresponding to the second video stream may be a Peak Signal-to-Noise Ratio (PSNR), a Structural Similarity Index (SSIM), or the like, which is not limited herein.
It can be understood that, in the present disclosure, the quality parameter of each second image frame in the second video stream may be determined according to the weight value of each position pixel point and the corresponding pixel values before and after encoding, and then the quality parameters of each second image frame are fused to determine the quality parameter corresponding to the second video stream.
In the present disclosure, different weights are given to pixels in different types of second image frames according to the type of the second image frame in the encoded second video stream and corresponding first quantization parameters, and then quality parameters corresponding to the second video stream are determined according to a weight value and a pixel value corresponding to each pixel point.
Alternatively, the target areas in the first image frame and the second image frame may be determined first, and then the second video stream may be evaluated based on the pixel value of each position pixel point in the target area and the weight corresponding to the position pixel point in each target area, which is not limited herein.
In the embodiment of the disclosure, a first video stream and a corresponding encoded second video stream are obtained first, then the second video stream is decoded to determine the type of each second image frame and a corresponding first quantization parameter, then a weight value of each pixel point in each second image frame is determined according to the type of each second image frame and the corresponding first quantization parameter, and finally a quality parameter corresponding to the second video stream is determined based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame. Therefore, different weights are given to different pixel points by combining coding information, and then the video quality is evaluated based on the weight value of each pixel point and the corresponding pixel values before and after compression, so that the video quality is accurately evaluated in multiple dimensions from the subjective perception and data level of a user, and the accuracy is high.
As can be seen from the above analysis, in the present disclosure, the type of each second image frame in the encoded second video stream and the corresponding first quantization parameter may be combined to determine the weight of each pixel point in the second image frame in the video quality evaluation, so as to avoid that the accuracy of the finally determined video quality is reduced due to the influence of the type of the second image frame on the quantization parameter. In an actual implementation process, the second video stream may usually include a plurality of I frames, a plurality of P frames, and/or a plurality of B frames, and the following describes in detail how to determine the weight of each pixel point in this case with reference to fig. 2.
Fig. 2 is a flowchart illustrating a video quality evaluation method based on coding information according to another embodiment of the present disclosure.
As shown in fig. 2, the video quality evaluation method based on coding information may include the steps of:
step 201, a first video stream and a corresponding encoded second video stream are obtained, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively.
It should be noted that, the specific implementation process of step 201 may refer to step 101, which is not described herein again.
Step 202, performing decoding processing on the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame.
It can be understood that video decoding is the inverse process of video encoding. In the present disclosure, the binary second video stream may be decompressed by any decoding device to obtain the second image frame sequence, the type of each second image frame, and the quantization parameter information of each image block contained therein.
The image block may be a coding block included in the second image frame, and the second image frame may be composed of a plurality of image blocks. The second quantization parameter may be a quantization parameter corresponding to each image block, and is used to characterize a quantization degree of each image block in an encoding process. During the encoding process, since the degree to which each image block in the second image frame is referred to may be different, the corresponding second quantization parameter used may also be different, and accordingly, the confidence of the distortion degree of the video characterized by the second quantization parameter is also different.
Step 203, determining an average value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
For convenience of description, let M be the number of image blocks in any second image frame, let QP_j denote the quantization parameter of the j-th image block in that second image frame, i.e. the second quantization parameter, and let QP_i denote the first quantization parameter. The first quantization parameter may be calculated by the following formula:

QP_i = (1/M) · Σ_{j=1}^{M} QP_j
it should be noted that, by using the above formula, the mean value of the second quantization parameters of each image block in any second image frame is first calculated, and then the mean value of the second quantization parameters corresponding to each image block is determined as the first quantization parameter corresponding to each second image frame. It is to be understood that the quantization parameter of any one of the second image frames may be calculated in the above manner, and is not limited herein.
Optionally, in this embodiment of the present disclosure, the image blocks in which the region of interest of the second image frame is located may be determined first, and then the average value of the second quantization parameters of these image blocks may be used as the first quantization parameter corresponding to the second image frame, which is not limited herein. This reduces the amount of calculation and improves the efficiency of processing the image.
The region of interest may be a region of greater interest to human eyes in the second image frame, such as a target region of a human face, an animal, and the like in the second image frame, and the region of no interest may be other regions in the original image and the compressed image except the region of interest, which is not limited herein.
It should be noted that a rectangular region of the detection frame corresponding to the target object in the image may be determined as the region of interest, or a non-rectangular region including the target object, which is segmented according to the target edge, may be determined as the region of interest, which is not limited in this disclosure.
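As a sketch of how the first quantization parameter of step 203 might be computed, the following Python snippet averages the per-block quantization parameters of one frame, optionally restricted to the blocks covering a region of interest. The function and parameter names are illustrative, not taken from the source.

```python
def frame_qp(block_qps, roi_mask=None):
    """First quantization parameter of one second image frame: the mean of the
    second quantization parameters of its image blocks. If roi_mask is given,
    only the blocks inside the region of interest are averaged (the optional
    variant described above)."""
    if roi_mask is not None:
        block_qps = [qp for qp, in_roi in zip(block_qps, roi_mask) if in_roi]
    return sum(block_qps) / len(block_qps)

# Full-frame mean over M = 4 blocks, and an ROI restricted to two of them.
print(frame_qp([22, 24, 26, 28]))                              # 25.0
print(frame_qp([22, 24, 26, 28], [True, True, False, False]))  # 23.0
```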
And step 204, classifying the second image frames according to the type of each second image frame to determine an intra-coded frame image set, a predictive-coded frame image set and a bidirectional predictive-coded frame image set contained in the second video stream.
Step 205, determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set.
Optionally, an average value of the first quantization parameters respectively corresponding to the second image frames in each image set may be determined as the reference quantization parameter corresponding to each image set.
For example, let QP_i denote the first quantization parameter of the i-th second image frame in the I-frame image set, let QP_I denote the reference quantization parameter of that image set, and let N be the number of second image frames in the current image set. The reference quantization parameter can then be calculated by the following formula:

QP_I = (1/N) · Σ_{i=1}^{N} QP_i

It is understood that, by the same calculation method, the reference quantization parameter QP_P corresponding to the P-frame image set and the reference quantization parameter QP_B corresponding to the B-frame image set can also be calculated.
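A minimal sketch of steps 204 and 205 in Python, grouping frames by type and averaging the per-frame first quantization parameters within each image set (function and variable names are illustrative):

```python
def reference_qps(frames):
    """frames: iterable of (frame_type, first_qp) pairs, frame_type in {'I', 'P', 'B'}.
    Returns the reference quantization parameter (mean first QP) per image set."""
    totals = {}
    for ftype, qp in frames:
        s, n = totals.get(ftype, (0.0, 0))
        totals[ftype] = (s + qp, n + 1)
    return {t: s / n for t, (s, n) in totals.items()}

ref = reference_qps([('I', 20), ('P', 26), ('P', 30), ('B', 34), ('B', 36)])
print(ref)  # {'I': 20.0, 'P': 28.0, 'B': 35.0}
```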
And step 206, determining an adjustment coefficient corresponding to each image set according to the reference quantization parameter corresponding to each image set.
Since the degree to which a frame is referenced has no causal relationship with the video quality, this influence needs to be eliminated during video quality evaluation. In the present disclosure, it can be eliminated by adjusting the reference quantization parameters corresponding to the different types of image sets to the same value.
Optionally, the apparatus may take the reference quantization parameter corresponding to any one image set as the reference parameter, and determine, for each of the remaining image sets, the adjustment coefficient with which its reference quantization parameter is adjusted to the reference parameter.
For example, taking the reference quantization parameter QP_I corresponding to the I-frame image set as the reference parameter, the adjustment coefficients F_P and F_B can be determined, corresponding to adjusting the reference quantization parameter QP_P of the P-frame image set and the reference quantization parameter QP_B of the B-frame image set to the reference parameter.

Specifically, the calculation can be performed by the following formulas:

F_P = QP_I / QP_P,  F_B = QP_I / QP_B
In addition, when the reference quantization parameter QP_P corresponding to the P-frame image set is taken as the reference parameter, the adjustment coefficients F_I and F_B, corresponding to adjusting the reference quantization parameter QP_I of the I-frame image set and the reference quantization parameter QP_B of the B-frame image set to the reference parameter, can be calculated by the following formulas:

F_I = QP_P / QP_I,  F_B = QP_P / QP_B
After the adjustment coefficient corresponding to each image set is determined, the quantization parameters of every second image frame in that image set are adjusted based on the same adjustment coefficient.
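Under the multiplicative adjustment of step 207 (QP' = F × QP), one consistent choice for step 206 is to make each set's adjustment coefficient the ratio of the chosen reference parameter to that set's own reference quantization parameter. This is an assumed reading of the step, sketched below with illustrative names:

```python
def adjustment_coeffs(ref_qps, anchor='I'):
    """Coefficient F per image set so that F * ref_qps[set] equals ref_qps[anchor].
    Assumption: the adjustment is multiplicative, as in step 207."""
    target = ref_qps[anchor]
    return {t: target / qp for t, qp in ref_qps.items()}

coeffs = adjustment_coeffs({'I': 20.0, 'P': 28.0, 'B': 35.0}, anchor='I')
# The anchor set is left unchanged; the others are scaled onto it.
print(coeffs['I'])         # 1.0
print(coeffs['P'] * 28.0)  # 20.0
```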
Step 207, based on the adjustment coefficient corresponding to each image set, adjusting the second quantization parameter corresponding to each image block in each second image frame to determine a third quantization parameter corresponding to each image block.
For convenience of description, let F be the adjustment coefficient, let QP be the second quantization parameter corresponding to each image block in each second image frame, and let QP' be the third quantization parameter. Then QP' may be calculated by the following formula:

QP' = F · QP
in the present disclosure, in order to make the determined weight value of each pixel point more reliable, the third quantization parameter corresponding to each image block may be limited within a certain range as required.
Optionally, when the third quantization parameter corresponding to any image block is smaller than the first threshold, the apparatus determines the third quantization parameter corresponding to any image block as the first threshold. And when the third quantization parameter corresponding to any image block is larger than a second threshold, the device determines the third quantization parameter corresponding to any image block as the second threshold, wherein the second threshold is larger than the first threshold.
The first threshold may be the minimum value of the quantization parameter threshold interval, and the second threshold may be the maximum value of the quantization parameter threshold interval. For example, let QP' be the third quantization parameter corresponding to any image block, QP_min the first threshold, and QP_max the second threshold. If QP' < QP_min, the device may use QP_min as the third quantization parameter; if QP' > QP_max, the device may use QP_max as the third quantization parameter.
It should be noted that different encoding modes may correspond to different encoding standards, and therefore, the thresholds of the quantization parameters corresponding to the image blocks may also differ. For example, for encoding standards h.264 and h.265, their corresponding quantization parameter threshold intervals may be [10, 40].
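The thresholding described above is a plain clamp. A sketch, using the [10, 40] interval mentioned for H.264/H.265:

```python
def clamp_qp(qp, qp_min=10, qp_max=40):
    """Limit an adjusted (third) quantization parameter to the threshold interval;
    a value below the first threshold or above the second is replaced by it."""
    return max(qp_min, min(qp_max, qp))

print(clamp_qp(7), clamp_qp(25), clamp_qp(43))  # 10 25 40
```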
Alternatively, if the second video stream contains only I frames, the apparatus may leave the second quantization parameters unadjusted. If it contains only I frames and P frames, the apparatus may adjust only the P frames. If it contains only I frames and B frames, the apparatus may adjust only the B frames. If it contains only P frames and B frames, the apparatus may adjust the B frames, and so on, which is not limited herein.
Alternatively, when there are I frames, P frames, and B frames in the second video stream, the apparatus may also adjust the I frames and the P frames, or adjust the I frames and the B frames, and so on.
And step 208, determining a weight value of each pixel point in each second image frame according to each third quantization parameter.
It should be noted that, because the third quantization parameter corresponding to each image block eliminates the referenced relationship of the image in the video stream, and may reflect the importance of the image block in the video stream, the weight value corresponding to each pixel point in the image block may be accurately determined based on the third quantization parameter corresponding to the image block.
For example, all pixel points in an image block n of the t-th frame image in the video share the same weight: the weight W_{t,i,j} of a pixel point (t, i, j) in the image block is determined as a function of the adjusted third quantization parameter QP'_{t,n} of the block to which it belongs.
therefore, according to the mode, the weight value of the pixel point of each image block can be obtained through calculation, and then the weight value of each pixel point in each second image frame is determined.
Step 209, determining a quality parameter corresponding to the second video stream based on the weight value of each position pixel, and the pixel values of each position pixel in the first image frame and the second image frame.
It should be noted that, the specific implementation process of step 209 may refer to any embodiment of the present disclosure, and is not described herein again.
In the embodiment of the disclosure, the mean values of the quantization coefficients corresponding to different types of image frames are adjusted to be consistent to eliminate the influence of the referenced degree of each image block on the quantization parameter of the image block, then the weight corresponding to each pixel point is determined based on the adjusted quantization parameter, and then the video quality is evaluated based on the weight value of each pixel point and the corresponding pixel values before and after compression, so that the video quality is accurately evaluated in multiple dimensions from the user subjective perception and data level, and the accuracy is high.
In the embodiment of the disclosure, a first video stream and a corresponding encoded second video stream are first obtained; the second video stream is then decoded to determine the type of each second image frame and the second quantization parameter corresponding to each image block in each second image frame; the average value of the second quantization parameters corresponding to the image blocks in each second image frame is determined as the first quantization parameter corresponding to that second image frame; the second image frames are classified according to their types to determine the intra-coded frame image set, the predictive-coded frame image set and the bidirectional predictive-coded frame image set contained in the second video stream; a reference quantization parameter corresponding to each image set is determined according to the first quantization parameters of the second image frames in that set; an adjustment coefficient corresponding to each image set is determined according to its reference quantization parameter; the second quantization parameter corresponding to each image block in each second image frame is then adjusted based on the adjustment coefficient of its image set to determine the corresponding third quantization parameter; and finally, the weight value of each position pixel point in each second image frame is determined according to the third quantization parameters.
Therefore, the weight of each pixel point in the second image frame in the video quality evaluation is determined by combining the type of each second image frame in the coded second video stream and the corresponding first quantization parameter, the video quality is evaluated, and therefore the video quality is accurately evaluated in multiple dimensions from the aspect of user subjective perception and data, and the accuracy is high.
Fig. 3 is a flowchart illustrating a video quality evaluation method based on coding information according to another embodiment of the disclosure.
As shown in fig. 3, the video quality evaluation method based on coding information may include the steps of:
step 301, a first video stream and a corresponding encoded second video stream are obtained, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively.
Step 302, decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame.
Step 303, determining an average value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
And step 304, classifying the second image frames according to the type of each second image frame to determine an intra-coded frame image set, a predictive-coded frame image set and a bidirectional predictive-coded frame image set contained in the second video stream.
Step 305, determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set.
Step 306, determining an adjustment coefficient corresponding to each image set according to the reference quantization parameter corresponding to each image set.
Step 307, based on the adjustment coefficient corresponding to each image set, adjusting the second quantization parameter corresponding to each image block in each second image frame to determine a third quantization parameter corresponding to each image block.
And 308, determining the weight value of each pixel point in each second image frame according to each third quantization parameter.
It should be noted that, for the specific implementation process of steps 301 to 308, reference may be made to any of the above embodiments, which are not described herein again.
Step 309, determining a mean square error between each second image frame and the corresponding first image frame based on the weight value of each position pixel point and the pixel values of each position pixel point in the first image frame and the second image frame.
For convenience of explanation, let w_{t,i,j} be the weight value corresponding to the pixel point with coordinates (i, j) in the t-th frame image, X_{t,i,j} the pixel value of that position in the first image frame, Y_{t,i,j} the pixel value of that position in the second image frame, H the height of the t-th frame image, and W its width. The mean square error wMSE_t between the second image frame and the first image frame can then be calculated by the following formula:

wMSE_t = (1/(H·W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j} · (X_{t,i,j} − Y_{t,i,j})²

It should be noted that the mean square error between each second image frame and the corresponding first image frame can be calculated by the above formula. The mean square error measures the degree of difference between the second image frame and the first image frame: the larger the mean square error, the larger the difference between the two frames.
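The weighted per-frame mean square error of step 309 can be sketched as follows; normalizing by H·W is an assumption here, since the source gives the formula only as an equation image.

```python
def weighted_mse(x, y, w):
    """Weighted mean square error between a first image frame x and the
    corresponding second image frame y, with per-pixel weights w (all H-by-W
    nested lists). Normalization by H*W is an assumption."""
    h, wd = len(x), len(x[0])
    total = sum(w[i][j] * (x[i][j] - y[i][j]) ** 2
                for i in range(h) for j in range(wd))
    return total / (h * wd)

# Identical frames give zero error; a one-pixel difference is scaled by its weight.
print(weighted_mse([[10, 20]], [[10, 20]], [[1.0, 1.0]]))  # 0.0
print(weighted_mse([[10, 20]], [[10, 24]], [[1.0, 0.5]]))  # 4.0
```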
Step 310, determining a reference mean square error according to each mean square error and the number of image frames included in the second video stream.
Optionally, if the number of image frames contained in the second video stream is N, a reference mean square error wMSE can be calculated from the mean square error wMSE_t of each image frame by the following formula:

wMSE = (1/N) · Σ_{t=1}^{N} wMSE_t

That is, the reference mean square error may be the average value of the respective per-frame mean square errors.
Step 311, the reference mean square error is normalized to determine the mean square error between the second video stream and the first video stream.
Optionally, the reference mean square error may be normalized to determine the mean square error wMSE_norm between the second video stream and the first video stream.
It will be appreciated that normalizing the reference mean square error transforms it into a dimensionless quantity, i.e. a scalar, so that quality can be compared across different video streams on the basis of this scalar.
Step 312, determining a peak signal-to-noise ratio corresponding to the second video stream according to the mean square error between the second video stream and the first video stream.
After determining the mean square error between the second video stream and the first video stream, the apparatus may substitute it into the formula for the peak signal-to-noise ratio, from which the peak signal-to-noise ratio corresponding to the second video stream may be determined:

wPSNR = 10 · log₁₀(MAX² / wMSE_norm)

where wPSNR is the peak signal-to-noise ratio, wMSE_norm is the mean square error, and MAX is the peak pixel value (for example, 255 for 8-bit images).

In addition, in the present disclosure, the peak signal-to-noise ratio obtained by the above calculation may be used as the quality evaluation result of the second video stream. The peak signal-to-noise ratio, i.e., the ratio of the energy of the peak signal to the average energy of the noise, can be used to evaluate the quality of an image based on the error between pixels. It will be appreciated that the peak signal-to-noise ratio can be used to assess the quality of the second video stream: a larger peak signal-to-noise ratio indicates better quality of the second video stream, and a smaller peak signal-to-noise ratio indicates worse quality.
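The PSNR step can be sketched as follows; peak = 255 assumes 8-bit pixel values:

```python
import math

def wpsnr(mse_norm, peak=255.0):
    """Peak signal-to-noise ratio from the normalized mean square error.
    peak is the maximum pixel value (255 assumes 8-bit samples)."""
    return 10.0 * math.log10(peak * peak / mse_norm)

# A tenfold smaller error raises the PSNR by exactly 10 dB.
print(round(wpsnr(100.0) - wpsnr(1000.0), 6))  # 10.0
```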
In the embodiment of the disclosure, the weight value of each position pixel point in the encoded second video stream is first determined based on the encoding information; the weight values are then used to determine the mean square error between the second video stream and the first video stream, and further the peak signal-to-noise ratio corresponding to the second video stream. The obtained peak signal-to-noise ratio thus reflects the quality of the second video stream objectively while better matching the subjective perception of image quality by human eyes, improving the accuracy and reliability of the evaluation result.
Fig. 4 is a flowchart illustrating a video quality evaluation method based on coding information according to another embodiment of the present disclosure.
As shown in fig. 4, the video quality evaluation method based on coding information may include the steps of:
step 401, a first video stream and a corresponding encoded second video stream are obtained, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively.
Step 402, decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame.
Step 403, determining an average value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
In step 404, according to the type of each second image frame, the second image frames are classified to determine an intra-coded frame image set, a predictive-coded frame image set, and a bidirectional predictive-coded frame image set included in the second video stream.
Step 405, determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set.
Step 406, determining an adjustment coefficient corresponding to each image set according to the reference quantization parameter corresponding to each image set.
Step 407, based on the adjustment coefficient corresponding to each image set, adjust the second quantization parameter corresponding to each image block in each second image frame to determine a third quantization parameter corresponding to each image block.
Step 408, determining a weight value of each pixel point in each second image frame according to each third quantization parameter.
It should be noted that, the specific implementation process of steps 401 to 408 may refer to any of the above embodiments, which is not described herein again.
Step 409, determining a first mean value and a first variance corresponding to a first image frame according to the weight value corresponding to each position pixel point and a first pixel value in the first image frame.
Step 410, determining a second mean value and a second variance corresponding to a second image frame according to the weight value corresponding to each position pixel point and a second pixel value in the second image frame.
The first mean value may be a mean value of each pixel value in the first image frame, the second mean value may be a mean value of each pixel value in the second image frame, the first variance may be a variance of a first pixel value corresponding to each pixel in the first image frame, and the second variance may be a variance of each second pixel value in the second image frame.
For convenience of explanation, let H be the height of the first image frame, W its width, and w_{t,i,j} the weight corresponding to the pixel point X(i, j) of the first image frame. The first mean value μ_X of the pixel points in the first image frame can be calculated by the following formula:

μ_X = Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j} · X_{t,i,j} / Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j}

Further, after determining the first mean value, the first variance σ_X² may be calculated by the following formula:

σ_X² = Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j} · (X_{t,i,j} − μ_X)² / Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j}

It should be noted that the second mean value μ_Y and the second variance σ_Y² corresponding to the second image frame can be calculated similarly by the above formulas.
Step 411, determining the covariance between the first image frame and the corresponding second image frame according to the first pixel value, the second pixel value and the weight value corresponding to each position pixel point, the first mean value of the first image frame, and the second mean value of the second image frame.
Specifically, the covariance σ_XY can be calculated by the following formula:

σ_XY = Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j} · (X_{t,i,j} − μ_X) · (Y_{t,i,j} − μ_Y) / Σ_{i=1}^{H} Σ_{j=1}^{W} w_{t,i,j}
Step 412, determining the structural similarity between the first image frame and the corresponding second image frame according to the first mean, the first variance, the second mean, the second variance and the covariance.
Alternatively, the structural similarity may be calculated by the following formula:

SSIM = (2·μ_X·μ_Y + c₁)(2·σ_XY + c₂) / ((μ_X² + μ_Y² + c₁)(σ_X² + σ_Y² + c₂))

where SSIM is the structural similarity between the second image frame and the corresponding first image frame, μ_Y is the second mean value, μ_X is the first mean value, σ_X² is the first variance, σ_Y² is the second variance, σ_XY is the covariance, and c₁, c₂ are reference coefficients.
Step 413, determining the structural similarity corresponding to the second video stream according to the structural similarities.
It should be noted that, after determining the structural similarity between each second image frame and the first image frame, the average value of each structural similarity may be used as the corresponding structural similarity of the second video stream. Alternatively, the average value of the structural similarities of the second image frames corresponding to the key image frames may also be used as the structural similarity corresponding to the second video stream, and the disclosure is not limited herein.
Alternatively, when determining the structural similarity corresponding to the second video stream, a weighted combination of the per-frame structural similarities may be calculated. For example, the structural similarity of each second image frame may be multiplied by the corresponding weight, and the sum divided by the total weight of all second image frames, so that normalization is performed, thereby obtaining the structural similarity of the second video stream, i.e. the video evaluation result.
It should be noted that the structural similarity may represent a similarity between the second image frame and the corresponding first image frame. The greater the structural similarity, the better the quality of the second video stream; conversely, the smaller the structural similarity, the worse the quality of the second video stream.
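Steps 409 to 412 can be sketched with weight-normalized statistics. This reading (dividing by the total weight) is an assumption, as are the constants c1 = (0.01·255)² and c2 = (0.03·255)², the common SSIM defaults for 8-bit data:

```python
def weighted_ssim(x, y, w, c1=6.5025, c2=58.5225):
    """Structural similarity between a first frame x and second frame y with
    per-pixel weights w (H-by-W nested lists). The weighted mean, variance and
    covariance below are an assumed reading of steps 409-411."""
    h, wd = len(x), len(x[0])
    idx = [(i, j) for i in range(h) for j in range(wd)]
    n = sum(w[i][j] for i, j in idx)
    mx = sum(w[i][j] * x[i][j] for i, j in idx) / n
    my = sum(w[i][j] * y[i][j] for i, j in idx) / n
    vx = sum(w[i][j] * (x[i][j] - mx) ** 2 for i, j in idx) / n
    vy = sum(w[i][j] * (y[i][j] - my) ** 2 for i, j in idx) / n
    cov = sum(w[i][j] * (x[i][j] - mx) * (y[i][j] - my) for i, j in idx) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

# Identical frames have structural similarity 1.0.
print(weighted_ssim([[10, 20]], [[10, 20]], [[1.0, 2.0]]))  # 1.0
```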
In the embodiment of the disclosure, the weight value of each pixel point in the video stream is first determined based on the coding information of the compressed second video stream; the structural similarity between each second image frame in the compressed video stream and the corresponding first image frame is then determined according to the weight values, and further the structural similarity corresponding to the second video stream. The determined structural similarity thus characterizes the quality of the second video stream objectively, the evaluation result better accords with the subjective perception of human eyes, and the accuracy and reliability of the evaluation result are improved.
In order to implement the foregoing embodiments, an apparatus for evaluating video quality based on coding information is also provided in the embodiments of the present disclosure. Fig. 5 is a block diagram of a video quality evaluation apparatus based on coding information according to an embodiment of the present disclosure.
As shown in fig. 5, the video quality evaluation apparatus based on coding information includes: an acquisition module 510, a first determination module 520, a second determination module 530, and a third determination module 540.
An obtaining module 510, configured to obtain a first video stream and a corresponding encoded second video stream, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively;
a first determining module 520, configured to perform a decoding process on the second video stream to determine a type of each of the second image frames and a corresponding first quantization parameter;
a second determining module 530, configured to determine, according to the type of each second image frame and the corresponding first quantization parameter, a weight value of each location pixel point in each second image frame;
a third determining module 540, configured to determine a quality parameter corresponding to the second video stream based on the weight value of each location pixel and the pixel value of each location pixel in the first image frame and the second image frame.
Optionally, the first determining module is specifically configured to:
decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame;
and determining the average value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
Optionally, the second determining module includes:
a first determining unit, configured to classify each of the second image frames according to a type of each of the second image frames to determine an intra-coded frame image set, a predictive-coded frame image set, and a bidirectional predictive-coded frame image set included in the second video stream;
the second determining unit is used for determining a reference quantization parameter corresponding to each image set according to the first quantization parameter corresponding to each second image frame in each image set;
a third determining unit, configured to determine, according to the reference quantization parameter corresponding to each image set, an adjustment coefficient corresponding to each image set;
a fourth determining unit, configured to adjust the second quantization parameter corresponding to each image block in each second image frame based on the adjustment coefficient corresponding to each image set, so as to determine a third quantization parameter corresponding to each image block;
and the fifth determining unit is used for determining the weight value of each position pixel point in each second image frame according to each third quantization parameter.
Optionally, the second determining unit is specifically configured to:
and determining the mean value of the first quantization parameters respectively corresponding to the second image frames in each image set as the reference quantization parameter corresponding to each image set.
Optionally, the third determining unit is specifically configured to:
and taking the reference quantization parameter corresponding to any one image set as the reference parameter, and determining the adjustment coefficient corresponding to adjusting the reference quantization parameter of each of the remaining image sets to the reference parameter.
Optionally, the fourth determining unit is further configured to:
under the condition that a third quantization parameter corresponding to any image block is smaller than a first threshold, determining the third quantization parameter corresponding to any image block as the first threshold;
or, alternatively,
and determining a third quantization parameter corresponding to any image block as a second threshold value when the third quantization parameter corresponding to any image block is larger than the second threshold value, wherein the second threshold value is larger than the first threshold value.
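The adjustment and clamping steps above can be sketched as follows. The patent only states that each block's second quantization parameter is adjusted by its image set's coefficient, clamped between a first and a second threshold, and that pixel weights follow from the resulting third quantization parameter; the multiplicative form of the adjustment, the threshold values, and the weight mapping below are all assumptions for illustration:

```python
import numpy as np

def third_qp(block_qp, set_coeff, qp_min=10.0, qp_max=50.0):
    """Adjust a block's second QP by its image set's adjustment
    coefficient, then clamp to [qp_min, qp_max] (illustrative values
    standing in for the patent's first and second thresholds)."""
    return float(np.clip(block_qp * set_coeff, qp_min, qp_max))

def block_weights(third_qps):
    """One hypothetical weight mapping: blocks quantized more coarsely
    (higher third QP) receive proportionally larger weights."""
    qps = np.asarray(third_qps, dtype=np.float64)
    return qps / qps.mean()
```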
Optionally, the third determining module is specifically configured to:
determining a mean square error between each second image frame and the corresponding first image frame based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame respectively;
determining a reference mean square error according to each mean square error and the number of image frames contained in the second video stream;
normalizing the reference mean square error to determine a mean square error between the second video stream and the first video stream;
and determining a peak signal-to-noise ratio corresponding to the second video stream according to the mean square error between the second video stream and the first video stream.
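As one concrete reading of these steps, a weighted mean square error can be computed per frame, averaged over the number of frames in the stream, and converted to a peak signal-to-noise ratio. Normalizing each frame's error by the sum of its weights and using the 8-bit peak value of 255 are assumptions; the patent does not fix either choice:

```python
import numpy as np

def weighted_psnr(first_frames, second_frames, weights, peak=255.0):
    """Weighted PSNR between the source (first) and encoded-then-decoded
    (second) streams, given per-pixel weight maps for every frame."""
    frame_mses = []
    for ref, dec, w in zip(first_frames, second_frames, weights):
        err = (ref.astype(np.float64) - dec.astype(np.float64)) ** 2
        frame_mses.append(np.sum(w * err) / np.sum(w))   # weighted MSE
    stream_mse = sum(frame_mses) / len(frame_mses)       # per-stream MSE
    return 10.0 * np.log10(peak ** 2 / stream_mse)
```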
Optionally, the third determining module is specifically configured to:
determining a first mean value and a first variance corresponding to a first image frame according to a weight value corresponding to each position pixel point and a first pixel value in the first image frame;
determining a second mean value and a second variance corresponding to a second image frame according to a weight value corresponding to each position pixel point and a second pixel value in the second image frame;
determining covariance between the first image frame and the corresponding second image frame according to a first pixel value, a second pixel value, a weight value, a first average value corresponding to the first image frame and a second average value corresponding to the second image frame corresponding to each position pixel point;
determining the structural similarity between the first image frame and the corresponding second image frame according to the first mean value, the first variance, the second mean value, the second variance and the covariance;
and determining the structural similarity corresponding to the second video stream according to the structural similarities.
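These weighted statistics can then be combined in the standard SSIM form. The sketch below assumes weights normalized to sum to one and the customary SSIM stabilizing constants C1 = (0.01·255)² and C2 = (0.03·255)², none of which is specified by the patent; the per-stream value would then be an average of the per-frame similarities:

```python
import numpy as np

def weighted_ssim(x, y, w, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Weighted structural similarity between a first image frame x and
    its corresponding second image frame y, with per-pixel weights w."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    w = np.asarray(w, dtype=np.float64)
    w = w / w.sum()                                   # normalized weights
    mu_x, mu_y = np.sum(w * x), np.sum(w * y)         # weighted means
    var_x = np.sum(w * (x - mu_x) ** 2)               # weighted variances
    var_y = np.sum(w * (y - mu_y) ** 2)
    cov = np.sum(w * (x - mu_x) * (y - mu_y))         # weighted covariance
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
```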
In the embodiment of the disclosure, a first video stream and a corresponding encoded second video stream are obtained first; the second video stream is then decoded to determine the type of each second image frame and its corresponding first quantization parameter; a weight value is then determined for each position pixel point in each second image frame according to that type and first quantization parameter; and finally a quality parameter corresponding to the second video stream is determined based on the weight value of each position pixel point and the pixel values of each position pixel point in the first image frame and the second image frame. In this way, coding information is used to assign different weights to different pixel points, and the video quality is then evaluated from the weight value of each pixel point and the corresponding pixel values before and after compression, so that the video quality is evaluated accurately, in multiple dimensions, from both the user's subjective perception and the data level.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 6 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 6, the device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the device 600 can also be stored. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, and the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be any of various general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the respective methods and processes described above, such as the video quality evaluation method based on encoding information. For example, in some embodiments, the video quality evaluation method based on encoding information may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the video quality evaluation method based on encoding information described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured by any other suitable means (e.g., by means of firmware) to perform the video quality evaluation method based on encoding information.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above, reordering, adding or deleting steps, may be used. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (10)

1. A video quality evaluation method based on coding information is characterized by comprising the following steps:
acquiring a first video stream and a corresponding encoded second video stream, wherein the first video stream comprises a plurality of first image frames, and the second video stream comprises a plurality of second image frames respectively corresponding to the plurality of first image frames;
decoding the second video stream to determine the type of each second image frame and a corresponding first quantization parameter;
determining a weight value of each position pixel point in each second image frame according to the type of each second image frame and the corresponding first quantization parameter;
and determining a quality parameter corresponding to the second video stream based on the weight value of each position pixel point and the pixel value of each position pixel point in the first image frame and the second image frame.
2. The method as claimed in claim 1, wherein said decoding said second video stream to determine a type of each of said second image frames and a corresponding first quantization parameter comprises:
decoding the second video stream to determine the type of each second image frame and a second quantization parameter corresponding to each image block in each second image frame;
and determining the mean value of the second quantization parameters corresponding to the image blocks in each second image frame as the first quantization parameter corresponding to each second image frame.
3. The method as claimed in claim 2, wherein said determining a weight value for each position pixel point in each of said second image frames according to a type of said each of said second image frames and a corresponding first quantization parameter comprises:
classifying each second image frame according to the type of each second image frame so as to determine an intra-coding frame image set, a predictive coding frame image set and a bidirectional predictive coding frame image set contained in the second video stream;
determining a reference quantization parameter corresponding to each image set according to a first quantization parameter corresponding to each second image frame in each image set;
determining an adjustment coefficient corresponding to each image set according to a reference quantization parameter corresponding to each image set;
adjusting the second quantization parameter corresponding to each image block in each second image frame based on the adjustment coefficient corresponding to each image set to determine a third quantization parameter corresponding to each image block;
and determining the weight value of each position pixel point in each second image frame according to each third quantization parameter.
4. The method of claim 3, wherein determining the reference quantization parameter for each image set based on the first quantization parameter for each second image frame in each image set comprises:
and determining the mean value of the first quantization parameters respectively corresponding to the second image frames in each image set as the reference quantization parameter corresponding to each image set.
5. The method of claim 3, wherein determining the adjustment factor for each of the image sets based on the reference quantization parameter for each of the image sets comprises:
and, taking the reference quantization parameter corresponding to any one of the image sets as the benchmark, determining the adjustment coefficient for the reference quantization parameter corresponding to each of the remaining image sets.
6. The method as claimed in claim 3, wherein after said adjusting the second quantization parameter corresponding to each image block in each of the second image frames to determine the third quantization parameter corresponding to each image block, further comprising:
under the condition that a third quantization parameter corresponding to any image block is smaller than a first threshold, determining the third quantization parameter corresponding to any image block as the first threshold;
or, alternatively,
and determining a third quantization parameter corresponding to any image block as a second threshold when the third quantization parameter corresponding to any image block is greater than the second threshold, wherein the second threshold is greater than the first threshold.
7. The method according to any one of claims 1-6, wherein the determining the quality parameter corresponding to the second video stream based on the weight value of each of the position pixels and the pixel value of each of the position pixels in the first image frame and the second image frame comprises:
determining a mean square error between each second image frame and the corresponding first image frame based on the weight value of each pixel point and the pixel value of each pixel point in the first image frame and the second image frame respectively;
determining a reference mean square error according to each mean square error and the number of image frames contained in the second video stream;
normalizing the reference mean square error to determine a mean square error between the second video stream and the first video stream;
and determining a peak signal-to-noise ratio corresponding to the second video stream according to the mean square error between the second video stream and the first video stream.
8. The method of claim 7, wherein the determining the quality parameter corresponding to the second video stream based on the weight value of each of the position pixel points and the pixel value of each of the position pixel points in the first image frame and the second image frame comprises:
determining a first mean value and a first variance corresponding to a first image frame according to a weight value corresponding to each position pixel point and a first pixel value in the first image frame;
determining a second mean value and a second variance corresponding to a second image frame according to the weight value corresponding to each position pixel point and a second pixel value in the second image frame;
determining covariance between the first image frame and the corresponding second image frame according to a first pixel value, a second pixel value, a weight value, a first average value corresponding to the first image frame and a second average value corresponding to the second image frame corresponding to each position pixel point;
determining the structural similarity between the first image frame and the corresponding second image frame according to the first mean value, the first variance, the second mean value, the second variance and the covariance;
and determining the structural similarity corresponding to the second video stream according to the structural similarities.
9. An apparatus for evaluating video quality based on coding information, comprising:
an obtaining module, configured to obtain a first video stream and a corresponding encoded second video stream, where the first video stream includes a plurality of first image frames, and the second video stream includes a plurality of second image frames corresponding to the plurality of first image frames, respectively;
a first determining module, configured to perform decoding processing on the second video stream to determine a type and a corresponding first quantization parameter of each second image frame;
the second determining module is used for determining a weight value of each position pixel point in each second image frame according to the type of each second image frame and the corresponding first quantization parameter;
and the third determining module is used for determining the quality parameter corresponding to the second video stream based on the weight value of each position pixel point and the pixel value of each position pixel point in the first image frame and the second image frame.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
CN202110858544.1A 2021-07-28 2021-07-28 Video quality evaluation method, device, equipment and medium based on coding information Pending CN115700744A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110858544.1A CN115700744A (en) 2021-07-28 2021-07-28 Video quality evaluation method, device, equipment and medium based on coding information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110858544.1A CN115700744A (en) 2021-07-28 2021-07-28 Video quality evaluation method, device, equipment and medium based on coding information

Publications (1)

Publication Number Publication Date
CN115700744A true CN115700744A (en) 2023-02-07

Family

ID=85120616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110858544.1A Pending CN115700744A (en) 2021-07-28 2021-07-28 Video quality evaluation method, device, equipment and medium based on coding information

Country Status (1)

Country Link
CN (1) CN115700744A (en)

Similar Documents

Publication Publication Date Title
US11310509B2 (en) Method and apparatus for applying deep learning techniques in video coding, restoration and video quality analysis (VQA)
CN108780499B (en) System and method for video processing based on quantization parameters
CN111182303A (en) Encoding method and device for shared screen, computer readable medium and electronic equipment
CN113596442B (en) Video processing method and device, electronic equipment and storage medium
CN114071189A (en) Video processing device and video streaming processing method
CN111524110B (en) Video quality evaluation model construction method, evaluation method and device
US11917163B2 (en) ROI-based video coding method and device
CN112449182A (en) Video encoding method, device, equipment and storage medium
CN110536138B (en) Lossy compression coding method and device and system-on-chip
CN111626178B (en) Compressed domain video motion recognition method and system based on new spatio-temporal feature stream
US20230319292A1 (en) Reinforcement learning based rate control
CN115700744A (en) Video quality evaluation method, device, equipment and medium based on coding information
CN116980604A (en) Video encoding method, video decoding method and related equipment
CN116389768A (en) Video encoding method and apparatus, electronic device, and computer-readable storage medium
US20220030233A1 (en) Interpolation filtering method and apparatus for intra-frame prediction, medium, and electronic device
CN115174774A (en) Depth image compression method, device, equipment and storage medium
CN116827921A (en) Audio and video processing method, device and equipment for streaming media
CN113099241A (en) Reference frame list updating method, device, equipment and storage medium
Xia et al. GAN-based image compression with improved RDO process
CN114827583A (en) System and method for objective video quality assessment in lightweight real-time video communication
US20200084449A1 (en) Adaptively encoding video frames based on complexity
Changbi et al. Research on video quality evaluation of sparring motion based on BPNN perception
CN116260973B (en) Time domain filtering method and device, electronic equipment and storage medium
EP3637769B1 (en) Method and device for determining video frame complexity measure
WO2022258055A1 (en) Point cloud attribute information encoding and decoding method and apparatus, and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination