WO2022156534A1

WO2022156534A1 - Video quality assessment method and device

Info

Publication number: WO2022156534A1
Application number: PCT/CN2022/070276
Authority: WO
Inventors: 周芳汝; 安山
Original assignee: 北京沃东天骏信息技术有限公司; 北京京东世纪贸易有限公司
Priority date: 2021-01-21
Filing date: 2022-01-05
Publication date: 2022-07-28
Also published as: CN113781384A

Abstract

The present application relates to the technical field of video processing, and discloses a video quality assessment method and device. In one specific embodiment, the method comprises: obtaining a video frame sequence to be assessed; determining at least one sub video frame sequence from the video frame sequence to be assessed; scoring sub video frame sequences according to a preset scoring rule; and performing, according to scores of the sub video frame sequences, quality assessment on the video frame sequence to be assessed. The embodiment facilitates performing reasonable and effective assessment on video quality.

Description

Video quality assessment method and device

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application claims priority to Chinese patent application No. 202110083566.5, titled "video quality assessment method and device" and filed on January 21, 2021, the full text of which is incorporated into this application by reference.

Technical field

The present application relates to the field of computer technologies, specifically to the field of video processing technologies, and in particular to a video quality assessment method and apparatus.

Background

With the advent of the information age, video users have grown rapidly, and videos on various platforms have experienced explosive growth. In order to ensure users' viewing experience of videos, we need to evaluate the quality of a large number of videos.

At present, there are two main ways to assess video quality: first, video quality is assessed by manual methods, using indicators such as the click rate, like rate, and viewing time of the video; second, video quality is evaluated by comparing the quality of each frame of the video.

SUMMARY OF THE INVENTION

Embodiments of the present application provide a video quality assessment method, apparatus, device, and storage medium.

According to a first aspect, an embodiment of the present application provides a video quality assessment method, the method including: acquiring a video frame sequence to be evaluated; determining at least one sub-video frame sequence from the video frame sequence to be evaluated, where a sub-video frame sequence is a group of consecutive video frames whose color mean gradients satisfy a preset condition; scoring each sub-video frame sequence according to a preset scoring rule, where the preset scoring rule is associated with attribute information of each sub-video frame sequence; and evaluating the quality of the video frame sequence to be evaluated according to the score of each sub-video frame sequence.

In some embodiments, the preset scoring rule includes at least one of the following: a first scoring rule associated with the average duration of each sub-video frame sequence; and a second scoring rule associated with the target display object of each sub-video frame sequence.

In some embodiments, the preset scoring rule includes a first scoring rule and a second scoring rule, and scoring each sub-video frame sequence according to the preset scoring rule includes: scoring each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence; scoring each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence; and scoring each sub-video frame sequence according to the first score and the second score.

In some embodiments, the second scoring rule includes at least one of the following: a third scoring rule associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence; and a fourth scoring rule associated with the area occupied by the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence.

In some embodiments, the preset scoring rule includes a third scoring rule and a fourth scoring rule, and scoring each sub-video frame sequence according to the preset scoring rule includes: scoring each sub-video frame sequence according to the third scoring rule to obtain a third score of each sub-video frame sequence; scoring each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score of each sub-video frame sequence; and scoring each sub-video frame sequence according to the third score and the fourth score.

According to a second aspect, an embodiment of the present application provides a video quality assessment apparatus, the apparatus including: an obtaining module configured to obtain a video frame sequence to be evaluated; a determining module configured to determine at least one sub-video frame sequence from the video frame sequence to be evaluated, where a sub-video frame sequence is a group of consecutive video frames whose color mean gradients satisfy a preset condition; a scoring module configured to score each sub-video frame sequence according to a preset scoring rule, where the preset scoring rule is associated with attribute information of each sub-video frame sequence; and an evaluation module configured to perform quality evaluation on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.

In some embodiments, the preset scoring rule includes a first scoring rule and a second scoring rule, and the scoring module is further configured to: score each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence; score each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence; and score each sub-video frame sequence according to the first score and the second score.

In some embodiments, the preset scoring rule includes a third scoring rule and a fourth scoring rule, and the scoring module is further configured to: score each sub-video frame sequence according to the third scoring rule to obtain a third score of each sub-video frame sequence; score each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score of each sub-video frame sequence; and score each sub-video frame sequence according to the third score and the fourth score.

According to a third aspect, an embodiment of the present application provides an electronic device, the electronic device including at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to implement the video quality assessment method according to any embodiment of the first aspect.

According to a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions are used to cause a computer to implement the video quality assessment method according to any embodiment of the first aspect.

According to a fifth aspect, an embodiment of the present application provides a computer program product including a computer program that, when executed by a processor, implements the video quality assessment method according to any embodiment of the first aspect.

The present application obtains a video frame sequence to be evaluated; determines at least one sub-video frame sequence from the video frame sequence to be evaluated; scores each sub-video frame sequence according to a preset scoring rule; and evaluates the quality of the video frame sequence to be evaluated according to the score of each sub-video frame sequence. That is, the video frame sequence to be evaluated is first divided into different pictures, each picture is then scored according to its attribute information, and the video quality is evaluated from these scores, which effectively improves the reasonableness and validity of video quality assessment.

It should be understood that what is described in this section is not intended to identify key or critical features of embodiments of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become readily understood from the following description.

Brief description of the drawings

FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;

FIG. 2 is a flowchart of an embodiment of a video quality assessment method according to the present application;

FIG. 3 is a schematic diagram of an application scenario of the video quality assessment method according to the present application;

FIG. 4 is a flowchart of another embodiment of a video quality assessment method according to the present application;

FIG. 5 is a schematic diagram of an embodiment of a video quality assessment apparatus according to the present application;

FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the server of the embodiment of the present application.

Detailed description

Exemplary embodiments of the present application are described below with reference to the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.

It should be noted that the embodiments in the present application and the features of the embodiments may be combined with each other in the case of no conflict. The present application will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of the video quality assessment method of the present application may be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is a medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The terminal devices 101, 102, and 103 interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, and 103, for example, video playback applications and communication applications.

The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with display screens, including but not limited to mobile phones and notebook computers. When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above, and can be implemented as multiple pieces of software or software modules (for example, to provide video quality assessment services) or as a single piece of software or software module. There is no specific limitation here.

The server 105 may be a server that provides various services, for example: acquiring the video frame sequence to be evaluated; determining at least one sub-video frame sequence from the video frame sequence to be evaluated; scoring each sub-video frame sequence according to a preset scoring rule; and performing quality assessment on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.

It should be noted that the server 105 may be hardware or software. When the server 105 is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or can be implemented as a single server. When the server is software, it can be implemented as a plurality of software or software modules (for example, used to provide a video quality assessment service), or can be implemented as a single software or software module. There is no specific limitation here.

It should be pointed out that the video quality assessment method provided by the embodiments of the present disclosure may be executed by the server 105, by the terminal devices 101, 102, 103, or by the server 105 and the terminal devices 101, 102, 103 in cooperation with each other. Correspondingly, the parts (for example, each unit, sub-unit, module, and sub-module) included in the video quality assessment apparatus may all be set in the server 105, may all be set in the terminal devices 101, 102, 103, or may be distributed between the server 105 and the terminal devices 101, 102, and 103.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.

FIG. 2 shows a schematic flowchart 200 of an embodiment of a video quality assessment method that can be applied to the present application. In this embodiment, the video quality assessment method includes the following steps:

Step 201, acquiring a sequence of video frames to be evaluated.

In this embodiment, the execution subject (the server 105 or the terminal devices 101, 102, 103 shown in FIG. 1) can obtain the video frame sequence to be evaluated locally, or can obtain it from a remote video database server in a wired or wireless manner, which is not limited in this application.

Step 202: Determine at least one sub-video frame sequence from the to-be-evaluated video frame sequence.

In this embodiment, after acquiring the video frame sequence to be evaluated, the execution subject may determine at least one sub-video frame sequence from the video frame sequence to be evaluated, where a sub-video frame sequence is a group of consecutive video frames whose color mean gradients satisfy a preset condition.

The preset condition may be whether the color mean gradient of the video frame is greater than or equal to a preset color mean gradient threshold, whether it satisfies the preset color mean gradient threshold range, and the like. Here, the color mean gradient threshold and the color mean gradient threshold range can be determined according to experience, actual needs and specific application scenarios, which are not limited in this application.

Here, the color mean gradient is expressed in the following form. Suppose the video frame sequence to be evaluated consists of K video frames, where the k-th (k = 1, 2, ..., K) video frame is denoted $I_k$, the pixel value in the i-th row and j-th column of that frame is $I_k(i, j)$, the size of the video frame is (W, H), and the RGB (Red, Green, Blue) channels of the video frame are denoted $R_k$, $G_k$, and $B_k$, respectively. The pixel mean of each of the R, G, and B channels is calculated for each video frame, so the means of the three channels of the k-th frame are:

$$\bar{R}_k = \frac{1}{WH}\sum_{i=1}^{H}\sum_{j=1}^{W} R_k(i,j), \qquad \bar{G}_k = \frac{1}{WH}\sum_{i=1}^{H}\sum_{j=1}^{W} G_k(i,j), \qquad \bar{B}_k = \frac{1}{WH}\sum_{i=1}^{H}\sum_{j=1}^{W} B_k(i,j)$$

The color mean gradient of the k-th frame is denoted by $D_k$.

Specifically, the color mean gradient threshold is denoted by th (for example, th = 100). If the color mean gradient of the k-th frame is greater than or equal to the preset color mean gradient threshold, this is expressed as $T_k = 1$; if the color mean gradient of the k-th frame is less than the preset color mean gradient threshold, this is expressed as $T_k = 0$, as shown below:

$$T_k = \begin{cases} 1, & D_k \ge th \\ 0, & D_k < th \end{cases}$$

The video frames in each sub-video frame sequence of the at least one sub-video frame sequence may be composed of video frames with $T_k = 0$. Video frames with $T_k = 0$ are usually consecutive frames that together represent one picture, whereas video frames with $T_k = 1$ generally represent transition frames between the sub-video frame sequences that characterize pictures.
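The thresholding and grouping described above can be sketched in a few lines of Python. This is only an illustrative sketch: the text does not reproduce the formula for D_k, so the code assumes D_k is the sum over the three channels of the absolute difference between the channel means of frame k and frame k-1 (with the first frame assigned a gradient of 0). The function names and the nested-list frame representation are likewise hypothetical.

```python
from typing import List, Tuple

# A frame is a list of rows; each row is a list of (R, G, B) pixel tuples.
Frame = List[List[Tuple[int, int, int]]]


def channel_means(frame: Frame) -> Tuple[float, float, float]:
    """Mean of the R, G and B channels over all pixels of one frame."""
    n = sum(len(row) for row in frame)
    sums = [0, 0, 0]
    for row in frame:
        for pixel in row:
            for c in range(3):
                sums[c] += pixel[c]
    return (sums[0] / n, sums[1] / n, sums[2] / n)


def color_mean_gradients(frames: List[Frame]) -> List[float]:
    """ASSUMPTION (not from the source): D_k is the sum over channels of
    |mean_k - mean_{k-1}|, and the first frame gets D = 0."""
    means = [channel_means(f) for f in frames]
    grads = [0.0]
    for prev, cur in zip(means, means[1:]):
        grads.append(sum(abs(a - b) for a, b in zip(cur, prev)))
    return grads


def split_sub_sequences(grads: List[float], th: float = 100.0) -> List[List[int]]:
    """Group consecutive frame indices with T_k = 0 (D_k < th).
    Frames with T_k = 1 (D_k >= th) are treated as transitions and dropped."""
    sequences: List[List[int]] = []
    current: List[int] = []
    for k, d in enumerate(grads):
        if d >= th:  # T_k = 1: transition frame closes the current run
            if current:
                sequences.append(current)
            current = []
        else:        # T_k = 0: frame belongs to the current picture
            current.append(k)
    if current:
        sequences.append(current)
    return sequences
```

Under these assumptions, a sharp change in average color between two frames closes the current sub-video frame sequence, and the frames that follow start a new one.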

Step 203: Score each sub-video frame sequence according to a preset scoring rule.

In this embodiment, after determining each sub-video frame sequence, the execution body may further score each sub-video frame sequence according to a preset scoring rule. Here, the preset scoring rule is associated with the attribute information of each sub-video frame sequence; that is, the preset scoring rule is set according to the attribute information of each sub-video frame sequence.

Here, the attribute information of each sub-video frame sequence may include the target presentation object corresponding to each sub-video frame sequence, the average presentation duration of each sub-video frame sequence, the number of presentation objects included in each sub-video frame sequence, and the like. The preset scoring rule may be set according to one or more of the attribute information of each sub-video frame sequence, which is not limited in this application.

Here, the target display object is the object that each sub-video frame sequence is mainly intended to display. The execution subject can determine the target display object according to the area of each display object in the sub-video frame sequence, or according to the frequency with which each display object appears in the sub-video frame sequence, which is not limited in this application.

Optionally, the execution body may determine, as the target display object, the display object corresponding to the category with the highest occurrence frequency in the display object category set corresponding to each sub-video frame sequence, where the display object category set is composed of the categories to which the display objects of each video frame in the sub-video frame sequence belong. Here, if the categories determined for the display objects of a video frame contain duplicates, the duplicates are deleted and only one instance of each category is retained.

Specifically, a sub-video frame sequence includes N (N = 1, 2, ...) video frames, and the n-th video frame contains $M_n$ display objects, where the display object category corresponding to the i-th ($i = 1, \ldots, M_n$) display object is denoted $l_n^i$. The set of display object categories of the n-th video frame is

$$L_n = \{l_n^1, l_n^2, \ldots, l_n^{M_n}\}$$

If the same category exists among the $M_n$ display objects, the duplicates are removed. If there are $C_n$ different categories of display objects in the n-th video frame, then the category set after deduplication of $L_n$ is:

$$L_n' = \{l_n^{1\prime}, l_n^{2\prime}, \ldots, l_n^{C_n\prime}\}, \qquad C_n \le M_n$$

The display object category set composed of the deduplicated category sets of the N video frames is $L = \{l^1, l^2, \ldots, l^M\}$, which includes M object categories in total. After category deduplication, the category set corresponding to the sub-video frame sequence is $L' = \{l^{1\prime}, l^{2\prime}, \ldots, l^{C\prime}\}$, including C categories in total, and the frequency with which each display object category of $L'$ occurs is calculated. For the display object category $l^{i\prime}$, this frequency is denoted $p^i$.

The category of the target display object is $l^{i^*\prime}$, whose index is

$$i^* = \arg\max_{i} \, p^i$$

The display objects in the sub-video frame sequence whose category is $l^{i^*\prime}$ can be considered the target display objects.

The execution subject can use existing or future-developed techniques to detect the categories of display objects in video frames, for example SSD (Single Shot MultiBox Detector, a one-stage multi-box detection algorithm) or R-CNN (Region-based Convolutional Neural Networks, a region-based convolutional neural network algorithm), to detect the categories of the display objects contained in each video frame of each sub-video frame sequence.

Specifically, suppose the sub-video frame sequence M includes video frame A (whose display objects are two people and a dog) and video frame B (whose display objects are one person and three vehicles). Video frame A and video frame B are detected, yielding that the categories to which the display objects in video frame A belong are person and animal, and the categories to which the display objects in video frame B belong are person and car, so the display object category set corresponding to the sub-video frame sequence M is {person, animal, person, car}. The execution subject may determine the display objects corresponding to the category person in the display object category set (the two people in video frame A and the one person in video frame B) as the target display objects.
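The category selection in this example can be sketched as follows. This is a minimal illustration, assuming the object detector has already produced one list of category labels per video frame; the function name `target_category` is hypothetical, and ties are broken by first occurrence, which the text does not specify.

```python
from collections import Counter
from typing import Dict, List, Tuple


def target_category(frame_categories: List[List[str]]) -> Tuple[str, Dict[str, float]]:
    """Return the most frequent category and the frequency of every category.

    Assumes at least one display object was detected in the sub-sequence.
    """
    merged: List[str] = []
    for cats in frame_categories:
        # Deduplicate within one frame, keeping first-seen order.
        merged.extend(dict.fromkeys(cats))
    counts = Counter(merged)
    freqs = {cat: n / len(merged) for cat, n in counts.items()}
    best = max(freqs, key=freqs.get)  # ties: first category encountered wins
    return best, freqs
```

For video frame A with labels person, person, animal and video frame B with labels person, car, car, car, the merged set is {person, animal, person, car}, and the category person has the highest frequency, matching the example above.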

In some optional manners, the preset scoring rule includes at least one of the following scoring sub-rules: a first scoring rule associated with the average duration of each sub-video frame sequence; and a second scoring rule associated with the target display object of each sub-video frame sequence.

In this implementation manner, the execution subject scores each sub-video frame sequence according to at least one of the first scoring rule and the second scoring rule.

The first scoring rule is associated with the average duration of each sub-video frame sequence, that is, the first scoring rule is set according to the average duration of each sub-video frame sequence.

Here, the execution subject may score each sub-video frame sequence according to the first scoring rule by using a preset duration score comparison table, or by checking whether the average duration of the sub-video frame sequences satisfies a preset duration threshold range, which is not limited in this application.

The preset duration threshold range may be determined according to experience, actual needs and specific application scenarios, which is not limited in this application.

Specifically, suppose Q sub-video frame sequences are determined from the video frame sequence L to be evaluated, the frame rate of the video frame sequence L is f, and the number of frames of the q-th (q = 1, ..., Q) sub-video frame sequence is $F_q$. Then the average duration MT of the sub-video frame sequences is

$$MT = \frac{1}{Q}\sum_{q=1}^{Q}\frac{F_q}{f}$$

Further, the execution subject can determine whether the MT meets the preset duration threshold range, and if so, the score is 1, and if not, the score is 0.
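As a rough sketch of the first scoring rule, the average duration can be computed from the frame counts and the frame rate and then compared with a threshold range. The function names and the range bounds used in the usage note are hypothetical; the text leaves the actual range to experience and application needs.

```python
from typing import List


def average_duration(frame_counts: List[int], fps: float) -> float:
    """Average duration in seconds of the sub-sequences: mean of F_q / f."""
    return sum(n / fps for n in frame_counts) / len(frame_counts)


def first_score(mt: float, low: float, high: float) -> int:
    """Score 1 if MT lies inside the (assumed inclusive) range, else 0."""
    return 1 if low <= mt <= high else 0
```

For instance, sub-sequences of 75, 150, and 225 frames at 25 frames per second last 3 s, 6 s, and 9 s, so `average_duration([75, 150, 225], 25.0)` gives 6.0, and with an illustrative range of 3 s to 10 s the first score would be 1.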

The second scoring rule is associated with the target display object of each sub-video frame sequence, that is, the second scoring rule is set according to the target display object of each sub-video frame sequence.

Here, the second scoring rule may be set according to the target display object of each sub-video frame sequence in a variety of ways. For example, it may be set according to the category of the target display object of each sub-video frame sequence, according to the frequency with which the target display object appears in the corresponding sub-video frame sequence, or according to the area occupied by the target display object in the corresponding sub-video frame sequence, which is not limited in this application.

Correspondingly, the execution subject may score each sub-video frame sequence according to the second scoring rule in various ways, for example: scoring each sub-video frame sequence according to a preset target display object category score comparison table; scoring each sub-video frame sequence according to whether the frequency with which its target display object appears in the corresponding sub-video frame sequence satisfies a preset frequency threshold range; or scoring each sub-video frame sequence according to whether the ratio of the area occupied by its target display object to the area of the video frame containing the target display object satisfies a preset ratio threshold range, which is not limited in this application.

Specifically, suppose the video frame sequence to be evaluated includes a sub-video frame sequence A and a sub-video frame sequence B, where the target display object of sub-video frame sequence A is a person and the target display object of sub-video frame sequence B is an item. The preset scoring rule is: when the target display object is a person, the corresponding sub-video frame sequence is scored 1 point; when the target display object is an item, the corresponding sub-video frame sequence is scored 0 points. Therefore, in the video frame sequence to be evaluated, the score of sub-video frame sequence A is 1 point and the score of sub-video frame sequence B is 0 points.

It should be noted that, if the preset scoring rule includes the first scoring rule and the second scoring rule, the execution subject will score each sub-video frame sequence according to the first scoring rule and the second scoring rule.

Specifically, suppose the first scoring rule is associated with the average duration of each sub-video frame sequence, the second scoring rule is associated with the category of the target display object of each sub-video frame sequence, and the video frame sequence to be evaluated includes sub-video frame sequence A and sub-video frame sequence B, where the target display object of sub-video frame sequence A is a person and its average duration is 5 s, while the target display object of sub-video frame sequence B is an item and its average duration is 5 s. The preset scoring rule is: when the target display object is a person and the average duration is greater than or equal to 3 s, the corresponding sub-video frame sequence is scored 1 point; otherwise, the corresponding sub-video frame sequence is scored 0 points. Therefore, in the video frame sequence to be evaluated, the score of sub-video frame sequence A is 1 point and the score of sub-video frame sequence B is 0 points.
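The combined rule in this example can be written as a small predicate. The sketch below is illustrative only: the `SubSequence` record and the parameter names are hypothetical, while the category "person" and the 3 s threshold are taken from the example.

```python
from dataclasses import dataclass


@dataclass
class SubSequence:
    target_category: str        # category of the target display object
    average_duration_s: float   # average duration in seconds


def combined_score(seq: SubSequence,
                   required_category: str = "person",
                   min_duration_s: float = 3.0) -> int:
    """Score 1 only if both the category and the duration condition hold."""
    category_ok = seq.target_category == required_category
    duration_ok = seq.average_duration_s >= min_duration_s
    return 1 if category_ok and duration_ok else 0
```

With these defaults, `combined_score(SubSequence("person", 5.0))` evaluates to 1 and `combined_score(SubSequence("item", 5.0))` evaluates to 0, as in the example for sub-video frame sequences A and B.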

In this implementation, each sub-video frame sequence is scored according to the first scoring rule and/or the second scoring rule, and the quality of the video frame sequence to be evaluated is then evaluated according to the score of each sub-video frame sequence. This fully considers the influence on video quality of the average duration and/or the target display object in the attribute information of each sub-video frame sequence, and effectively improves the accuracy and validity of the evaluated video quality.

Step 204, according to the score of each sub-video frame sequence, perform quality assessment on the video frame sequence to be evaluated.

In this embodiment, after determining the score of each sub-video frame sequence, the execution body may evaluate the video frame sequence to be evaluated according to the score of each sub-video frame sequence and the corresponding weight coefficient.

The weight coefficients respectively corresponding to the first score and the second score may be determined according to experience, actual needs and specific application scenarios, which are not limited in this application.
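One possible reading of the weighted evaluation is sketched below. The text does not fix how the weighted scores are aggregated, so this sketch assumes that each sub-video frame sequence's score is a weighted sum of its first and second scores and that the overall result is the sum over all sub-video frame sequences. The weights 0.5 and 0.5 and the function name are hypothetical.

```python
from typing import List, Tuple


def overall_quality(scores: List[Tuple[float, float]],
                    w1: float = 0.5, w2: float = 0.5) -> float:
    """ASSUMPTION: overall quality = sum over sub-sequences of
    w1 * first_score + w2 * second_score."""
    return sum(w1 * s1 + w2 * s2 for s1, s2 in scores)
```

For two sub-video frame sequences with (first, second) scores of (1, 1) and (1, 0), this sketch yields 1.0 + 0.5 = 1.5; a mean instead of a sum would be an equally plausible aggregation.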

Continuing to refer to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the video quality assessment method according to this embodiment.

In the application scenario of FIG. 3, the execution body 301 acquires the video frame sequence 302 to be evaluated and determines three sub-video frame sequences from it, where a sub-video frame sequence is a group of consecutive video frames whose color mean gradients satisfy a preset condition (for example, being less than a preset color mean gradient threshold). The three sub-video frame sequences are sub-video frame sequence A 303 (containing a number of display objects), sub-video frame sequence B 304 (containing 10 display objects), and sub-video frame sequence C 305 (containing 20 display objects). Each sub-video frame sequence is scored according to the preset scoring rule to obtain the score 306 of sub-video frame sequence A, the score 307 of sub-video frame sequence B, and the score 308 of sub-video frame sequence C. Here, the preset scoring rule is associated with the attribute information of each sub-video frame sequence, the attribute information includes the number of display objects, and the preset scoring rule is a quantity score comparison table: for example, a quantity of 20 corresponds to 2 points, a quantity of 10 corresponds to 1 point, and a quantity of 5 corresponds to 0 points. Further, the execution body evaluates the quality of the video frame sequence to be evaluated according to the score of each sub-video frame sequence; for example, the score 309 of the video frame sequence to be evaluated is obtained by directly adding the score 306 of sub-video frame sequence A, the score 307 of sub-video frame sequence B, and the score 308 of sub-video frame sequence C.
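The quantity score comparison table and the direct summation in this scenario can be sketched as a dictionary lookup. The table values come from the example; scoring unlisted quantities as 0 is an assumption, and the quantity of 5 used for sub-video frame sequence A in the usage note is purely hypothetical, since the text does not state it.

```python
from typing import Dict, List

# Quantity score comparison table from the example: quantity -> points.
QUANTITY_SCORES: Dict[int, int] = {20: 2, 10: 1, 5: 0}


def quantity_score(count: int) -> int:
    """Look up the score for a quantity; unlisted quantities score 0 (assumption)."""
    return QUANTITY_SCORES.get(count, 0)


def total_score(counts: List[int]) -> int:
    """Directly add the scores of all sub-video frame sequences."""
    return sum(quantity_score(c) for c in counts)
```

With quantities of 10 and 20 for sub-video frame sequences B and C and a hypothetical quantity of 5 for A, `total_score([5, 10, 20])` adds 0, 1, and 2 points.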

The video quality assessment method of the present disclosure obtains the video frame sequence to be evaluated; determines at least one sub-video frame sequence from the video frame sequence to be evaluated; scores each sub-video frame sequence according to a preset scoring rule; and evaluates the quality of the video frame sequence to be evaluated according to the score of each sub-video frame sequence, which effectively improves the rationality and effectiveness of video quality assessment.

Further reference is made to FIG. 4 , which shows a flow 400 of yet another embodiment of the video quality assessment method shown in FIG. 2 . In this embodiment, the preset scoring rules include a first scoring rule and a second scoring rule. The process 400 of the video quality assessment method of the present embodiment may include the following steps:

Step 401, acquiring a sequence of video frames to be evaluated.

In this embodiment, for the implementation details and technical effects of step 401, reference may be made to the description of step 201, and details are not repeated here.

Step 402: Determine at least one sub-video frame sequence based on the color mean gradient of each video frame in the to-be-evaluated video frame sequence.

In this embodiment, the implementation details and technical effects of step 402 can be referred to the description of step 202, which will not be repeated here.

Step 403: Score each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence.

In this embodiment, the execution subject may score each sub-video frame sequence according to the first scoring rule by using a preset duration score comparison table, or by checking whether the average duration of the sub-video frame sequences satisfies the preset duration threshold range, which is not limited in this application.

Step 404: Score each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence.

In this embodiment, the execution subject may score each sub-video frame sequence according to the second scoring rule in a variety of ways, for example: scoring each sub-video frame sequence according to a preset target display object category score comparison table; scoring each sub-video frame sequence according to whether the frequency of occurrence of its target display object satisfies a preset frequency threshold range; or scoring each sub-video frame sequence according to whether the ratio of the area occupied by its target display object to the area of the video frame containing the target display object satisfies a preset ratio threshold range, which is not limited in this application.

In some optional manners, the second scoring rule includes at least one of the following: a third scoring rule associated with the frequency of occurrence of the target display object of each sub-video frame sequence; and a fourth scoring rule associated with the area of the target display object of each sub-video frame sequence.

In this implementation manner, the execution subject scores each sub-video frame sequence according to at least one of the third scoring rule and the fourth scoring rule.

The third scoring rule is associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence; that is, the third scoring rule is set according to that frequency.

Here, the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence may be the frequency with which the target display object itself appears in that sub-video frame sequence, or the frequency with which the category of the target display object appears in the set of display object categories corresponding to that sub-video frame sequence, which is not limited in this application.

The third scoring rule associated with the frequency of occurrence of the target display object may be a preset frequency-score comparison table, or may score each sub-video frame sequence according to whether the frequency with which its target display object appears in the corresponding sub-video frame sequence falls within a preset frequency threshold range.

Specifically, suppose the display object category set corresponding to sub-video frame sequence M is {person, person, person, animal}; the category of the target display object is then person, and the frequency with which person appears in the display object category set is 0.75. Suppose the display object category set corresponding to sub-video frame sequence N is {car, car, animal}; the category of the target display object is then car, and the frequency with which car appears in the display object category set is approximately 0.67. If the preset frequency threshold range is greater than or equal to 0.7 and less than or equal to 0.8, and the third scoring rule assigns a score of 1 when the frequency falls within the preset frequency threshold range and a score of 0 otherwise, then according to the third scoring rule the score of sub-video frame sequence M is 1 and the score of sub-video frame sequence N is 0.
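The example above can be reproduced with a short sketch. It assumes that the target display object category is the most frequent category in the set, which is consistent with the example but is not stated in this passage; the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def target_category_frequency(categories: List[str]) -> Tuple[str, float]:
    """Return the most frequent category and its relative frequency.

    Assumption: the target display object category is the most frequent one.
    """
    counts = Counter(categories)
    target, count = counts.most_common(1)[0]
    return target, count / len(categories)


def third_score(frequency: float, low: float = 0.7, high: float = 0.8) -> int:
    """Return 1 if low <= frequency <= high, otherwise 0."""
    return 1 if low <= frequency <= high else 0


target_m, freq_m = target_category_frequency(["person", "person", "person", "animal"])
target_n, freq_n = target_category_frequency(["car", "car", "animal"])
score_m = third_score(freq_m)  # frequency 0.75 lies inside [0.7, 0.8]
score_n = third_score(freq_n)  # frequency of about 0.67 lies outside [0.7, 0.8]
```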

The fourth scoring rule is associated with the area of the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence; that is, the fourth scoring rule is set according to the area of the target display object of each sub-video frame sequence.

Here, the fourth scoring rule associated with the area of the target display object in each sub-video frame sequence may be associated with a first ratio, namely the ratio of the total area of the target display objects in each sub-video frame sequence to the total area of the video frames containing the target display object. It may also be associated with a second ratio, namely the ratio of the sum of the areas of the largest target display object in each video frame containing the target display object to the total area of the video frames containing the target display object. This is not limited in this application.

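Specifically, suppose a sub-video frame sequence includes $N$ ($N = 1, 2, \ldots$) video frames, and the $n$-th video frame contains $M_n$ display objects forming the set $S_n = \{s_n^1, s_n^2, \ldots, s_n^{M_n}\}$, where the $i$-th display object is $s_n^i$.

The corresponding display object category is $l_n^i$.

The collection of display object categories is $L_n = \{l_n^1, l_n^2, \ldots, l_n^{M_n}\}$.

The set of categories after deduplication of $L_n$ is $L_n' = \{l_n^{1\prime}, l_n^{2\prime}, \ldots, l_n^{C_n\prime}\}$, where $C_n \le M_n$.

For a category $l_n^{c\prime}$, among all display objects of that category, the display object with the largest area is kept. Denoting the area of display object $s_n^i$ as $\mathrm{area}(s_n^i)$, the display object with the largest area retained for category $l_n^{c\prime}$ after screening is

$$\hat{s}_n^{c} = \underset{s_n^i \in S_n,\; l_n^i = l_n^{c\prime}}{\arg\max}\ \mathrm{area}(s_n^i).$$

The set of display objects of the filtered $n$-th video frame is $S_n' = \{\hat{s}_n^1, \hat{s}_n^2, \ldots, \hat{s}_n^{C_n}\}$.

The display object set composed of the display object sets corresponding to the $N$ video frames is:

$$S = S_1' \cup S_2' \cup \cdots \cup S_N' = \{s^1, s^2, \ldots, s^M\}.$$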

The display object category set corresponding to the display object set $S$ is $L = \{l^1, l^2, \ldots, l^M\}$, and after the categories are deduplicated the obtained category set is $L' = \{l^{1\prime}, l^{2\prime}, \ldots, l^{C\prime}\}$. All display objects in $S$ whose category is the target display object category are found, and the proportion of the video frame that each of them occupies is calculated. With each video frame of size $W \times H$ and the area of $s^i$ denoted $\mathrm{area}(s^i)$, the average area ratio $R$ of the target display object of the $q$-th sub-video frame sequence is the mean of $\mathrm{area}(s^i) / (W \times H)$ over those display objects.

Correspondingly, the fourth scoring rule may be a ratio-score comparison table, or may score each sub-video frame sequence according to whether the ratio of that sub-video frame sequence falls within a preset ratio threshold range.
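The per-frame screening described above (keeping, for each category in a frame, only the display object with the largest area) and an average area ratio can be sketched as follows. The sketch assumes the average area ratio is the mean of area/(W×H) over the retained target display objects; the `DisplayObject` tuple layout and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

DisplayObject = Tuple[str, float]  # (category, area in pixels)


def keep_largest_per_category(frame_objects: List[DisplayObject]) -> List[DisplayObject]:
    """For each category in one frame, keep only the display object with the largest area."""
    largest: Dict[str, float] = {}
    for category, area in frame_objects:
        if category not in largest or area > largest[category]:
            largest[category] = area
    return list(largest.items())


def average_area_ratio(
    frames: List[List[DisplayObject]], target: str, width: int, height: int
) -> float:
    """Mean of area / (width * height) over the retained target display objects."""
    ratios = [
        area / (width * height)
        for frame_objects in frames
        for category, area in keep_largest_per_category(frame_objects)
        if category == target
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0


frames = [
    [("person", 200.0), ("person", 500.0), ("car", 300.0)],
    [("person", 300.0), ("animal", 100.0)],
]
filtered_first = keep_largest_per_category(frames[0])
ratio = average_area_ratio(frames, target="person", width=40, height=25)
```

In this toy input the first frame retains the 500-pixel person and the car, the second frame retains its only person, and the two retained persons cover 0.5 and 0.3 of a 40×25 frame respectively.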

Specifically, suppose the sub-video frame sequence M includes video frame C1 (containing two people and one car) and video frame C2 (containing two people and one animal); the corresponding display object category set is {person, car, person, animal}, and the target display object is person. The execution subject calculates the ratio of the total area of the target display object (the sum of the areas of the two people in video frame C1 and the two people in video frame C2) to the total area of the video frames containing the target display object (the total area of video frame C1 and video frame C2). The fourth scoring rule judges whether this ratio falls within the preset ratio threshold range: if so, the score is 1; if not, the score is 0. If the ratio calculated above is 0.5 and the preset ratio threshold range is greater than 0.3 and less than or equal to 0.6, the score of the sub-video frame sequence M is 1.
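A minimal sketch of the first-ratio variant used in this example follows. The pixel areas and the frame size are made-up values chosen so that the ratio comes out at 0.5; the function names are hypothetical.

```python
from typing import List


def total_area_ratio(
    target_areas_per_frame: List[List[float]], frame_width: int, frame_height: int
) -> float:
    """Total target area divided by the total area of the frames that contain the target."""
    frames_with_target = [areas for areas in target_areas_per_frame if areas]
    if not frames_with_target:
        return 0.0
    total_target_area = sum(sum(areas) for areas in frames_with_target)
    total_frame_area = len(frames_with_target) * frame_width * frame_height
    return total_target_area / total_frame_area


def fourth_score(ratio: float, low: float = 0.3, high: float = 0.6) -> int:
    """Return 1 if low < ratio <= high, otherwise 0."""
    return 1 if low < ratio <= high else 0


# Areas of the two people in C1 and the two people in C2 (hypothetical pixel counts).
people_areas = [[300.0, 200.0], [250.0, 250.0]]
ratio = total_area_ratio(people_areas, frame_width=50, frame_height=20)
score = fourth_score(ratio)
```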

In this implementation, each sub-video frame sequence is scored according to the third scoring rule and/or the fourth scoring rule, and the quality of the video frame sequence to be evaluated is then assessed according to the score of each sub-video frame sequence. This fully considers the influence on video quality of the area and/or the frequency of occurrence of the target display object in each sub-video frame sequence, further improving the accuracy and validity of the video quality assessment.

In some optional manners, scoring each sub-video frame sequence according to a preset scoring rule includes: scoring each sub-video frame sequence according to the third scoring rule to obtain a third score; scoring each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score; and scoring each sub-video frame sequence according to the third score and the fourth score.

In this implementation manner, the preset scoring rules include a third scoring rule and a fourth scoring rule. The execution subject may first score each sub-video frame sequence according to the third scoring rule, which is associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence, to obtain a third score; then score each sub-video frame sequence according to the fourth scoring rule, which is associated with the area of the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence, to obtain a fourth score; and finally obtain the score of each sub-video frame sequence according to the third score and the fourth score.

Here, the execution subject may obtain the score of each sub-video frame sequence according to the third score, the fourth score and their respective weight coefficients.

The weight coefficients corresponding to the third score and the fourth score respectively may be set according to experience, actual needs and specific application scenarios, which are not limited in this application.
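A weighted combination of this kind can be sketched as below. The weights 0.6 and 0.4 are arbitrary illustrative values, and `combine_scores` is a hypothetical helper rather than a function defined by the application.

```python
from typing import Sequence


def combine_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of partial scores; scores and weights must have the same length."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(weight * score for weight, score in zip(weights, scores))


# Third score 1 and fourth score 0, weighted 0.6 and 0.4 respectively.
combined = combine_scores([1, 0], [0.6, 0.4])
```

Setting every weight to 1 reduces this to the plain sum of the partial scores.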

In this implementation, each sub-video frame sequence is scored independently according to the third scoring rule and the fourth scoring rule. This takes into account the influence on video quality of both the frequency of occurrence and the area of the target display object of each sub-video frame sequence, which improves the accuracy of the score obtained for each sub-video frame sequence and thereby further improves the validity and rationality of the video quality evaluation.

In addition, it should be pointed out that if the preset scoring rules include the first scoring rule, the third scoring rule and the fourth scoring rule, then scoring each sub-video frame sequence according to the preset scoring rules includes: scoring each sub-video frame sequence according to the first scoring rule to obtain a first score; scoring each sub-video frame sequence according to the third scoring rule to obtain a third score; scoring each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score; and scoring each sub-video frame sequence according to the first score, the third score and the fourth score.

Here, if a video frame sequence includes a total of Q sub-video frame sequences, each sub-video frame sequence is scored according to the first scoring rule, the third scoring rule and the fourth scoring rule. For each sub-video frame sequence, the first score is Score₁, the third score is Score₃, and the fourth score is Score₄. The final score Score of the video frame sequence can then be expressed as a combination of the first, third and fourth scores of the Q sub-video frame sequences.

Specifically, suppose the sub-video frame sequence M includes video frame C1 (containing two people and one car) and video frame C2 (containing two people and one animal), the average duration of the sub-video frame sequence M is 2 s, the corresponding display object category set is {person, car, person, animal}, and the target display object is person. The execution subject calculates the ratio of the total area of the target display object (the sum of the areas of the two people in video frame C1 and the two people in video frame C2) to the total area of the video frames containing the target display object (the total area of video frame C1 and video frame C2), and the frequency, 0.5, with which the target display object appears in the display object category set. The first scoring rule judges whether the average duration of the sub-video frame sequence falls within the preset duration threshold range: if so, the score is 1; if not, the score is 0. The third scoring rule judges whether the frequency falls within the preset frequency threshold range: if so, the score is 1; if not, the score is 0. The fourth scoring rule judges whether the ratio falls within the preset ratio threshold range: if so, the score is 1; if not, the score is 0. If the ratio calculated above is 0.5, the preset ratio threshold range is greater than 0.3 and less than or equal to 0.6, the preset duration threshold range is greater than 1 s and less than or equal to 2 s, and the preset frequency threshold range is greater than 0.4 and less than or equal to 0.6, then each of the three rules yields a score of 1, and the score of the sub-video frame sequence M can be expressed as the sum of the first score, the third score and the fourth score, namely 3.
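The arithmetic of this example can be checked with a short sketch that uses the threshold ranges stated above; the function names and the half-open range convention encoded in `in_range` are illustrative choices.

```python
from typing import Tuple

Range = Tuple[float, float]  # (exclusive lower bound, inclusive upper bound)


def in_range(value: float, bounds: Range) -> int:
    """Return 1 if lower < value <= upper, otherwise 0."""
    low, high = bounds
    return 1 if low < value <= high else 0


def sub_sequence_score(
    avg_duration_s: float,
    frequency: float,
    area_ratio: float,
    duration_range: Range = (1.0, 2.0),
    frequency_range: Range = (0.4, 0.6),
    ratio_range: Range = (0.3, 0.6),
) -> int:
    """Sum of the first, third and fourth scores of one sub-video frame sequence."""
    first = in_range(avg_duration_s, duration_range)
    third = in_range(frequency, frequency_range)
    fourth = in_range(area_ratio, ratio_range)
    return first + third + fourth


# Sub-video frame sequence M: average duration 2 s, frequency 0.5, area ratio 0.5.
score_m = sub_sequence_score(avg_duration_s=2.0, frequency=0.5, area_ratio=0.5)
```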

In this implementation, each sub-video frame sequence is scored independently according to the first scoring rule, the third scoring rule and the fourth scoring rule. This takes into account the influence on video quality of the average duration of each sub-video frame sequence, the frequency of occurrence of its target display object and the area of its target display object, which improves the accuracy of the score obtained for each sub-video frame sequence and thereby further improves the validity and rationality of the video quality evaluation.

Step 405: Obtain a score of each sub-video frame sequence according to the first score of each sub-video frame sequence and the second score of each sub-video frame sequence.

In this embodiment, the execution subject may obtain the score of each sub-video frame sequence directly from the first score and the second score, or from the first score, the second score and their respective weight coefficients, which is not limited in this application.

Step 406: Perform quality assessment on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.

In this embodiment, for the implementation details and technical effects of step 406, reference may be made to the description of step 204, and details are not repeated here.

As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the process 400 of the video quality evaluation method in this embodiment highlights that each sub-video frame sequence is scored according to the first scoring rule and the second scoring rule respectively to obtain a first score and a second score, each sub-video frame sequence is then scored according to the first score and the second score, and the quality of the video frame sequence to be evaluated is assessed according to the score of each sub-video frame sequence. Because the solution described in this embodiment scores each sub-video frame sequence independently under the first scoring rule and the second scoring rule, it takes into account the influence on video quality of both the average duration of each sub-video frame sequence and its target display object, which improves the accuracy of the obtained scores and the validity and rationality of the video quality evaluation.

Further referring to FIG. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a device for evaluating video quality. This device embodiment corresponds to the method embodiment shown in FIG. 1, and the device can be applied to various electronic devices.

As shown in FIG. 5, the video quality evaluation apparatus 500 in this embodiment includes: an acquisition module 501, a determination module 502, a scoring module 503 and an evaluation module 504.

The acquisition module 501 may be configured to obtain a sequence of video frames to be evaluated.

The determination module 502 may be configured to determine at least one sub-video frame sequence from the video frame sequence to be evaluated.

The scoring module 503 may be configured to score each sub-video frame sequence according to a preset scoring rule.

The evaluation module 504 may be configured to perform quality evaluation on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.
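The division into four modules can be sketched as a small pipeline. Each module is modeled here as a plain callable; the class name, field names and the toy callables in the usage example (fixed-size splitting, length-based scoring, mean aggregation) are hypothetical stand-ins and not the behavior defined by the application.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class VideoQualityApparatus:
    acquire: Callable[[], List[int]]                 # acquisition module 501
    determine: Callable[[List[int]], List[List[int]]]  # determination module 502
    score: Callable[[List[int]], float]              # scoring module 503
    evaluate: Callable[[List[float]], float]         # evaluation module 504

    def run(self) -> float:
        """Chain the four modules: acquire, split, score each part, evaluate."""
        frames = self.acquire()
        sub_sequences = self.determine(frames)
        scores = [self.score(sub_sequence) for sub_sequence in sub_sequences]
        return self.evaluate(scores)


# Toy stand-ins: frames are integers, split into chunks of three frames,
# each chunk scored by its length, overall quality taken as the mean score.
apparatus = VideoQualityApparatus(
    acquire=lambda: list(range(6)),
    determine=lambda frames: [frames[i:i + 3] for i in range(0, len(frames), 3)],
    score=lambda sub_sequence: float(len(sub_sequence)),
    evaluate=lambda scores: mean(scores),
)
quality = apparatus.run()
```

Keeping the modules as injectable callables mirrors the way the optional scoring rules can be swapped without changing the overall flow.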

In some optional ways of this embodiment, the preset scoring rules include at least one of the following: a first scoring rule associated with the average duration of each sub-video frame sequence; and a second scoring rule associated with the target display object of each sub-video frame sequence.

In some optional ways of this embodiment, the preset scoring rule includes a first scoring rule and a second scoring rule, and the scoring module is further configured to: score each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence; score each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence; and score each sub-video frame sequence according to the first score and the second score.

In some optional manners of this embodiment, the second scoring rule includes at least one of the following: a third scoring rule associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence; and a fourth scoring rule associated with the area of the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence.

In some optional ways of this embodiment, the preset scoring rules include a third scoring rule and a fourth scoring rule, and the scoring module is further configured to: score each sub-video frame sequence according to the third scoring rule to obtain a third score of each sub-video frame sequence; score each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score of each sub-video frame sequence; and score each sub-video frame sequence according to the third score and the fourth score.

According to the embodiments of the present application, the present application further provides an electronic device and a readable storage medium.

FIG. 6 is a block diagram of an electronic device 600 for the video quality assessment method according to an embodiment of the present application.

Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the application described and/or claimed herein.

As shown in FIG. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or otherwise as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multiprocessor system). One processor 601 is taken as an example in FIG. 6.

The memory 602 is the non-transitory computer-readable storage medium provided by the present application. Wherein, the memory stores instructions executable by at least one processor, so that the at least one processor executes the video quality assessment method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions, and the computer instructions are used to cause the computer to execute the video quality assessment method provided by the present application.

As a non-transitory computer-readable storage medium, the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as program instructions/modules corresponding to the video quality assessment method in the embodiments of the present application (for example, the acquisition module 501, the determination module 502, the scoring module 503 and the evaluation module 504 shown in FIG. 5). The processor 601 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 602, i.e., implements the video quality assessment method in the above method embodiments.

The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by the use of electronic equipment for video quality assessment, and the like. Additionally, memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 602 may optionally include memory located remotely relative to processor 601, and these remote memories may be connected via a network to the electronic device for video quality assessment. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device of the video quality assessment method may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or in other ways, and the connection by a bus is taken as an example in FIG. 6.

The input device 603 may receive input numeric or character information; examples of such input devices include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick, and other input devices. The output device 604 may include display devices, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be implemented in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor, which may be a special-purpose or general-purpose programmable processor, may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device, and/or apparatus (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented on a computing system that includes back-end components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end components, middleware components, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.

A computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.

According to the technical solutions of the embodiments of the present application, it is helpful to evaluate the video quality more reasonably and effectively.

It should be understood that steps may be reordered, added or deleted using the various forms of flow shown above. For example, the steps described in the present application can be performed in parallel, sequentially or in different orders, and as long as the desired results of the technical solutions disclosed in the present application can be achieved, no limitation is imposed herein.

The above-mentioned specific embodiments do not constitute a limitation on the protection scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may occur depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of this application shall be included within the protection scope of this application.

Claims

A video quality assessment method, the method comprising:

obtaining a video frame sequence to be evaluated;

determining at least one sub-video frame sequence from the video frame sequence to be evaluated, wherein the sub-video frame sequence is a group of consecutive video frames whose color mean gradients all satisfy a preset condition;

scoring each sub-video frame sequence according to a preset scoring rule, wherein the preset scoring rule is associated with attribute information of each sub-video frame sequence; and

performing quality assessment on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.
The method according to claim 1, wherein the preset scoring rules include at least one of the following:

a first scoring rule associated with the average duration of each sub-video frame sequence;

a second scoring rule associated with the target display object of each sub-video frame sequence.
The method according to claim 2, wherein the preset scoring rule comprises the first scoring rule and the second scoring rule, and the scoring of each sub-video frame sequence according to the preset scoring rule comprises:

scoring each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence;

scoring each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence; and

scoring each sub-video frame sequence according to the first score and the second score.
The method of claim 2, wherein the second scoring rule includes at least one of the following:

a third scoring rule associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence;

a fourth scoring rule associated with the area of the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence.
The method according to claim 4, wherein the preset scoring rule includes the third scoring rule and the fourth scoring rule, and the scoring of each sub-video frame sequence according to the preset scoring rule comprises:

scoring each sub-video frame sequence according to the third scoring rule to obtain a third score of each sub-video frame sequence;

scoring each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score of each sub-video frame sequence; and

scoring each sub-video frame sequence according to the third score and the fourth score.
A device for evaluating video quality, the device comprising:

an acquisition module, configured to acquire a sequence of video frames to be evaluated;

a determination module, configured to determine at least one sub-video frame sequence from the video frame sequence to be evaluated, wherein the sub-video frame sequence is a group of consecutive video frames whose color mean gradients all satisfy a preset condition;

a scoring module, configured to score each sub-video frame sequence according to a preset scoring rule, wherein the preset scoring rule is associated with attribute information of each sub-video frame sequence; and

an evaluation module, configured to perform quality assessment on the video frame sequence to be evaluated according to the score of each sub-video frame sequence.
The device according to claim 6, wherein the preset scoring rules include at least one of the following:

a first scoring rule associated with the average duration of each sub-video frame sequence;

a second scoring rule associated with the target display object of each sub-video frame sequence.
The device according to claim 7, wherein the preset scoring rule includes the first scoring rule and the second scoring rule, and the scoring module is further configured to:

score each sub-video frame sequence according to the first scoring rule to obtain a first score of each sub-video frame sequence;

score each sub-video frame sequence according to the second scoring rule to obtain a second score of each sub-video frame sequence; and

score each sub-video frame sequence according to the first score and the second score.
The device according to claim 7, wherein the second scoring rule includes at least one of the following:

a third scoring rule associated with the frequency with which the target display object of each sub-video frame sequence appears in the corresponding sub-video frame sequence;

a fourth scoring rule associated with the area of the target display object of each sub-video frame sequence in the corresponding sub-video frame sequence.
The device according to claim 9, wherein the preset scoring rule includes the third scoring rule and the fourth scoring rule, and the scoring module is further configured to:

score each sub-video frame sequence according to the third scoring rule to obtain a third score of each sub-video frame sequence;

score each sub-video frame sequence according to the fourth scoring rule to obtain a fourth score of each sub-video frame sequence; and

score each sub-video frame sequence according to the third score and the fourth score.
An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to perform the method of any one of claims 1-5.
A computer program product comprising a computer program which, when executed by a processor, implements the method of any of claims 1-5.