WO2017084256A1 - A video quality evaluation method and apparatus - Google Patents

A video quality evaluation method and apparatus

Info

Publication number
WO2017084256A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
gop
video frame
content feature
Prior art date
Application number
PCT/CN2016/082223
Other languages
English (en)
French (fr)
Inventor
鞠增伟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2017084256A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 17/00 Diagnosis, testing or measuring for television systems or their details

Definitions

  • the present invention relates to the field of communications technologies, and in particular, to a video quality evaluation method and apparatus.
  • the methods for video quality evaluation are mainly divided into two categories.
  • One is the subjective video quality evaluation method, in which a large number of observers rate the test video on a pre-specified evaluation scale according to the merits of its visual effect, and the evaluation scores given by all observers are weighted and averaged to obtain the subjective quality evaluation value of the test video; the other is the objective video quality evaluation method, which uses a designed computational model in place of the Human Visual System (HVS) to evaluate the test video.
  • the objective video quality evaluation method is divided into full reference, partial reference and non-reference video quality evaluation methods based on the use of reference source information.
  • the video quality evaluation method without reference is mainly adopted.
  • the network video quality evaluation methods without reference mainly include a packet layer evaluation method based on data packet header information, a bitstream layer evaluation method based on packet header information and payload information, and a media layer evaluation method based on pixel level features.
  • the packet layer evaluation method, which evaluates network video quality based on data packet header information, cannot correctly analyze the content characteristics of the video, so the accuracy of its network video quality evaluation is low.
  • a bitstream layer evaluation method, which evaluates network video quality based on packet header information and payload information, can use the feature information of the video content and thus ensure the accuracy of video quality prediction, but it requires separate and complete decoding, so its computational complexity is high; moreover, the parameters a terminal such as a set top box obtains by decoding the video cannot be reused to reconstruct the video picture, and a set top box cannot bear the overhead of complete decoding, so the use of bitstream layer evaluation methods based on packet header information and payload information is limited.
  • the media layer evaluation method for network video quality evaluation based on pixel-level features uses pixel-level feature extraction and calculation, and the computational complexity is also high.
  • the embodiment of the invention provides a video quality evaluation method and device, which comprehensively utilizes the content features of the video to evaluate the network video quality to improve the accuracy of the video quality prediction and reduce the computational complexity.
  • a video quality evaluation method obtains frame parameters of a video frame and decoding complexity information of the terminal decoding the video frame, where the frame parameters may be, for example, information such as the frame length obtained by packet header detection.
  • the decoding complexity information is the computational cost of the terminal decoding the video frame, which may be the computational cost of each part involved in the terminal's decoding process; no separate decoding is needed, which guarantees low computational overhead and reduces computational complexity.
  • the state variables representing the content features of the video frame may represent changes in the coding information of the video frame and changes in its format or residual; the content features of the video frame are then extracted based on these state variables.
  • the video quality evaluation is performed by using the content feature information of all video frames in the video, and the video content feature can be fully and completely used for video quality evaluation, which can improve the accuracy of video quality prediction.
  • the frame parameters include the frame header length of the video frame, the frame length, and the number of video blocks contained in the video frame.
  • the decoding complexity information comprises the computational cost values of entropy decoding, reordering, inverse quantization and inverse transform, motion compensation, intra prediction, and inter prediction. Determining the state variables that characterize the content features of the video frame from the frame parameters and the decoding complexity information includes: acquiring the average decoding complexity C_0 of decoding each symbol in the entropy decoding process, the average decoding complexity C_1 of decoding each video block in the reordering process, the average decoding complexity C_2 of decoding each video block in the inverse quantization and inverse transform process, the average computational complexity C_3 required for pre-fetching data for each symbol in the motion compensation, intra prediction, and inter prediction processes, and the amount of computation C_4 required to acquire the reference block and add it to the residual in motion compensation, intra prediction, and inter prediction; and then determining a state variable K characterizing the change of the coding information of the video frame and a state variable B characterizing the change of its format or residual.
  • the state variables characterizing the content characteristics of the video frame are determined by the frame parameters and the decoding complexity information, and the following manner can be adopted:
  • C_ED is the decoding complexity of entropy decoding
  • C_ReO is the decoding complexity of reordering
  • C_IQ&IT is the decoding complexity of inverse quantization and inverse transform
  • C_MC/IP is the decoding complexity of the motion compensation, intra prediction, and inter prediction modules
  • C_ED = C_0 × L
  • C_ReO = C_1 × N
  • C_IQ&IT = C_2 × N
  • C_MC/IP = C_3 × L_h + C_4 × N
  • N is the number of video blocks contained in the video frame, and L_h is the frame header length of the video frame.
  • the state variables characterizing the content features of the video frame may satisfy the expressions K = C_0 + C_3 × L_h / L and B = (C_1 + C_2 + C_4) × N.
  • the K value is affected by coding information such as the prediction mode and motion vectors of the video, that is, it is related to the coding information and can represent changes in the video coding information, mainly the prediction mode, motion vector precision, and motion vector range.
  • B is related to the number of blocks, so it can be used to characterize changes in the video format; when the format is the same across frames, B can represent changes in the residual. K and B can therefore serve as state variables characterizing the video content features.
  • the packet loss concentration is used to indicate the degree of concentration of the packet losses occurring within the corresponding GOP range.
  • the packet loss that occurs in the actual network is mostly non-uniform packet loss.
  • the higher the packet loss concentration, the greater the impact on the video quality.
  • taking the packet loss concentration into account therefore makes the evaluation of the impact of packet loss on video quality more accurate.
  • the packet loss concentration may be determined from the distance L_loss between the first packet loss and the last packet loss in the corresponding GOP range and the total number N of lost packets of the video frames in the GOP.
  • when N is fixed, the growth trend of the packet loss concentration is opposite to that of L_loss; when L_loss is fixed, the growth trend of the packet loss concentration is the same as that of N.
  • the packet loss concentration satisfies a formula in L_loss and N, where L_loss represents the distance from the first packet loss to the last packet loss in the GOP, N is the total number of lost packets in the GOP, and c and k are pending constants.
  • the video quality evaluation is performed according to the content feature information of all the video frames, including:
  • the GOP refers to a video frame set from an I frame to a next I frame occurrence;
  • the packet loss distortion value of each GOP is determined according to the packet loss concentration of the video frames in that GOP, and the quality value of each GOP is determined according to its basic quality score and its packet loss distortion value. The entire video stream is composed of GOPs, so after the quality value of each GOP is obtained, the quality value of the entire video stream can be computed by a weighting method, that is, the quality values of the GOPs in the video are weighted to obtain the video quality value. Specifically, since the human eye is most impressed by the poor portions, a larger weight can be set for GOPs with low quality and for GOPs with longer duration.
  • the video quality value may be corrected by a resolution correction factor, so that the finally obtained video quality value not only reflects the video quality received by the terminal device under evaluation but also the video quality under the influence of the viewing conditions, realizing the adaptability of the evaluation results in different scenarios.
  • the resolution correction factor is a value obtained by modeling the video resolution and the resolution of the device viewing the video; the value conforms to the representation of a sine function in the range f_ratio ∈ (0, 1) and to the representation of an inverse S function in the range f_ratio ∈ (1, +∞), where f_ratio is the ratio of the viewing device resolution to the video resolution.
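  • As an illustrative sketch only (the patent's exact model is not reproduced in this text), the two-branch shape described above can be coded as follows; the specific sine and inverse-S parameterizations, and the function name `resolution_correction`, are assumptions chosen to match the stated shapes:

```python
import math

def resolution_correction(f_ratio):
    """Resolution correction factor vs. f_ratio = viewing resolution / video resolution.

    Sine-like rise on (0, 1) and inverse-S-like decay on (1, +inf), per the text;
    the exact parameterization here is an illustrative assumption.
    """
    if f_ratio <= 0:
        raise ValueError("f_ratio must be positive")
    if f_ratio < 1:
        # rises like a quarter sine wave, reaching 1.0 as f_ratio approaches 1
        return math.sin(f_ratio * math.pi / 2)
    # decays toward a floor as the display resolution outgrows the video resolution
    return 0.5 + 0.5 / (1.0 + math.log(f_ratio))
```

Both branches meet at f_ratio = 1 with value 1.0, so the factor is continuous at the matched-resolution point.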
  • the video quality value may be corrected by using a screen size correction factor, so that the finally obtained video quality value not only reflects the video quality received by the evaluation terminal device, but also reflects the influence of the screen size of the device viewing the video. Video quality, to achieve the adaptability of evaluation results in different scenarios.
  • the screen size correction factor is a value obtained by modeling the screen size of the device viewing the video, and satisfies a formula in which:
  • M is the screen size correction factor
  • S_base is the set reference size of the viewing device screen
  • S_min is the set minimum size of the viewing device screen
  • S_true is the actual size of the viewing device screen
  • min is the video quality value corresponding to S_min
  • max is the video quality value corresponding to the set maximum size of the viewing device screen.
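  • The patent's exact screen size formula is not reproduced in this text; the sketch below is only one plausible reconstruction, interpolating between the quality values at the minimum and reference screen sizes. The function name and the clamped linear interpolation are assumptions:

```python
def screen_size_correction(s_true, s_min, s_base, q_min, q_max):
    """Illustrative screen size correction factor M (assumed form).

    s_true: actual viewing screen size (S_true)
    s_min:  set minimum screen size (S_min), mapped to quality value q_min
    s_base: set reference screen size (S_base), mapped to quality value q_max
    """
    s = max(s_true, s_min)                         # screens below S_min get the minimum score
    frac = min((s - s_min) / (s_base - s_min), 1.0)  # clamp at the reference size
    return q_min + frac * (q_max - q_min)
```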
  • correction of the video quality by the screen size correction factor and correction by the resolution correction factor may be used alone or in combination.
  • when the video quality value already corrected by the resolution correction factor is corrected again by the screen size correction factor, it must first be re-assigned so that it falls within the set standard range; in this way, the embodiment of the present invention can be applied to video quality evaluation methods that use different standards.
  • in a second aspect, a video quality evaluation apparatus includes an acquisition unit and a processing unit, and the acquisition unit is configured to acquire frame parameters of a video frame and decoding complexity information of the terminal decoding the video frame.
  • the frame parameter may be, for example, information such as a frame length obtained by packet header detection.
  • the decoding complexity information is a calculation cost value of each part in the process of decoding the video frame by the terminal.
  • a processing unit configured to determine, by using the frame parameter and the decoding complexity information acquired by the acquiring unit, a state variable that characterizes a video frame content feature, and determining content feature information of the video frame based on the state variable, And repeating the above steps until the content feature information of all the video frames in the video is determined, and the video quality evaluation is performed according to the content feature information of all the video frames.
  • the video quality evaluation apparatus determines, from the frame parameters and the decoding complexity information, state variables that characterize the content features of the video frame, and determines the content feature information of the video frame based on those state variables, so the video content features are used comprehensively and completely and the video quality evaluation result is more accurate.
  • the decoding complexity information is the computational cost corresponding to the terminal decoding the video frame, so no separate decoding is needed; the hard decoding function inherent in the terminal is used without increasing the computational overhead, which guarantees low computational cost and reduces computational complexity.
  • the frame parameters include the frame header length of the video frame, the frame length, and the number of video blocks contained in the video frame, and the decoding complexity information comprises the computational cost values of entropy decoding, reordering, inverse quantization and inverse transform, motion compensation, intra prediction, and inter prediction.
  • the processing unit determines the state variables characterizing the content features of the video frame from the frame parameters and the decoding complexity information, specifically using the average decoding complexity C_0 of decoding each symbol in the entropy decoding process, the average decoding complexity C_1 of decoding each video block in the reordering process, the average decoding complexity C_2 of decoding each video block in the inverse quantization and inverse transform process, and the decoding complexity of motion compensation, intra prediction, and inter prediction.
  • the processing unit determines the state variable K that characterizes the change of the coding information of the video frame and the state variable B that characterizes the change of its format or residual.
  • K satisfies K = C_0 + C_3 × L_h / L
  • L is the frame length
  • N is the number of video blocks contained in the video frame
  • L_h is the frame header length of the video frame.
  • the processing unit is specifically configured to determine content feature information of the video frame based on the state variable in the following manner:
  • the processing unit may determine the packet loss distortion value of a GOP according to the packet loss concentration within the GOP, determine the quality value of the GOP according to the basic quality score of the GOP and its packet loss distortion value, and weight the quality values of all GOPs in the video to obtain the video quality value.
  • the packet loss concentration indicates the degree of concentration of packet loss within the corresponding GOP range, and may be determined from the distance L_loss between the first and last packet loss in that range and the total number N of lost packets of the video frames in the GOP, where the growth trend of the packet loss concentration is opposite to that of L_loss and the same as that of N.
  • the processing unit is specifically configured to perform video quality evaluation according to content feature information of all the video frames in the following manner:
  • the processing unit is further configured to correct the video quality value by a resolution correction factor, so that the finally obtained video quality value not only reflects the video quality received by the terminal device under evaluation but also the video quality under the influence of the viewing conditions, realizing the adaptability of the evaluation results in different scenarios.
  • the resolution correction factor is a value obtained by modeling the video resolution and the resolution of the device viewing the video; the value conforms to the representation of a sine function in the range f_ratio ∈ (0, 1) and to the representation of an inverse S function in the range f_ratio ∈ (1, +∞), where f_ratio is the ratio of the viewing device resolution to the video resolution.
  • the processing unit is further configured to correct the video quality value by using a screen size correction factor.
  • the processing unit may use the screen size correction factor to correct again the video quality value already corrected by the resolution correction factor.
  • the screen size correction factor is a value obtained by modeling the screen size of the device viewing the video, and satisfies a formula in which:
  • M is the screen size correction factor
  • S_base is the set reference size of the viewing device screen
  • S_min is the set minimum size of the viewing device screen
  • S_true is the actual size of the viewing device screen
  • min is the video quality value corresponding to S_min
  • max is the video quality value corresponding to the set maximum size of the viewing device screen.
  • the final video quality value can then not only reflect the video quality received by the terminal device under evaluation but also the video quality under the influence of the viewing conditions, realizing the adaptability of the evaluation results in different scenarios.
  • a video quality evaluation apparatus comprises a processor and a memory, wherein the memory stores a computer readable program, and the processor implements the video quality evaluation method of the first aspect by running the program in the memory.
  • a computer storage medium for storing computer software instructions for use in the video quality evaluation apparatus, comprising a program designed to perform the video quality evaluation method of the first aspect.
  • Figure 1 shows the IPTV end-to-end architecture
  • FIG. 2 is a flowchart of an implementation of a video quality evaluation method according to an embodiment of the present invention
  • FIG. 3 is a block diagram of a video quality evaluation process according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of another implementation of a video quality evaluation method according to another embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a video quality evaluation apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of another structure of a video quality evaluation apparatus according to another embodiment of the present invention.
  • the video quality evaluation method provided by the embodiment of the present invention can be applied to an Internet Protocol Television (IPTV) video service.
  • the network element devices mainly included in the IPTV end-to-end architecture are: a program source; an IPTV core node (such as a live broadcast server or core streaming server); an IPTV edge node (such as an edge streaming server); a bearer network, such as access and aggregation layer devices; access nodes, such as digital subscriber line access multiplexers and optical line terminals; and terminals, such as set top boxes.
  • IPTV video quality evaluation can be performed by monitoring video service quality at different network element nodes.
  • the service quality monitoring in the IPTV video service includes: head-end video quality monitoring, terminal service quality monitoring, network video quality monitoring, and IPTV platform quality monitoring.
  • the video quality evaluation method provided by the embodiment of the present invention is not limited to the IPTV video service.
  • the video quality evaluation method provided by the embodiment of the present invention is mainly directed to terminal-side service quality monitoring.
  • the video quality evaluation device involved in the present application requires no reference and can reasonably be deployed on the terminal side.
  • the video quality evaluation device may be part of the terminal, or may exist independently.
  • the video quality evaluation may be performed only by acquiring the transmission video stream on the terminal side, without obtaining the original reference video as the evaluation basis.
  • the terminal involved in the present invention may be a set top box (STB), an optical network terminal (ONT), or a personal computer (PC), which is not limited in the embodiment of the present invention.
  • an implementation process in which the video quality evaluation device performs video quality evaluation is shown in FIG. 2 and includes:
  • S101 Acquire a frame parameter of the video frame and decoding complexity information.
  • the video quality evaluation apparatus receives the video stream, and parses the video frame parameter according to the video stream, where the video frame parameter includes a frame header length of the video frame, a frame length, and a number of video blocks included in the video frame.
  • the decoding complexity information in the embodiment of the present invention may be understood as the computational cost corresponding to the terminal decoding the video frame, for example the sum of the computational cost values of entropy decoding, reordering, inverse quantization and inverse transform, motion compensation, intra prediction, and inter prediction.
  • S102 Determine, by the frame parameter and the decoding complexity information, a state variable that characterizes a video frame content feature.
  • S105 Perform video quality evaluation according to content feature information of all the video frames.
  • to extract the video content feature information and perform the video quality evaluation, the following manner may be adopted:
  • the decoding complexity is reflected on the terminal by parameters such as CPU usage. Different devices have different processing capabilities, so the CPU usage for the same decoding complexity may differ. Extensive experiments show that the decoding complexity of a video frame is related to its frame length: when the frame length changes drastically, the decoding complexity also changes significantly.
  • let the decoding complexity be represented by C and the frame length by L; the relationship between the two is modeled as C = f(L).
  • the correlation coefficient between the decoding complexity and the frame length is mostly close to 1, so f(L) can be defined as a linear function, that is, the frame length and the decoding complexity are modeled linearly as a first expression: C = K × L + B.
  • L is the frame length of the video frame, which can be obtained by parsing the packet header information of the video stream; C is the decoding complexity; and K and B are fitting coefficients.
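  • The linear model above can also be recovered empirically. A minimal sketch (the function name and the synthetic measurements are illustrative, not part of the patent) fits K and B by ordinary least squares:

```python
def fit_decoding_complexity(frame_lengths, decode_costs):
    """Least-squares fit of the first expression C = K * L + B.

    frame_lengths: frame lengths L parsed from packet headers
    decode_costs:  measured decoding cost C per frame (hypothetical units)
    Returns the fitting coefficients (K, B).
    """
    n = len(frame_lengths)
    mean_l = sum(frame_lengths) / n
    mean_c = sum(decode_costs) / n
    cov = sum((l - mean_l) * (c - mean_c)
              for l, c in zip(frame_lengths, decode_costs))
    var = sum((l - mean_l) ** 2 for l in frame_lengths)
    k = cov / var                 # slope: cost per unit of frame length
    b = mean_c - k * mean_l       # intercept: length-independent cost
    return k, b

# Synthetic check: costs generated from K = 0.5, B = 20 are recovered exactly.
K, B = fit_decoding_complexity([100, 200, 400], [70, 120, 220])
```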
  • the decoding complexity of each part involved in the terminal decoding the video frame may be determined and summed to obtain a second expression of the decoding complexity; the second expression is then rewritten in the form of the first expression to obtain the fitting coefficients K and B.
  • the decoding complexity can be characterized as the sum of the decoding complexity of the parts involved in decoding the video frame by the terminal.
  • decoding the video frame by the terminal may involve entropy decoding, reordering, inverse quantization and inverse transform, motion compensation, intra prediction, and inter prediction, so the following formula may be used: C = C_ED + C_ReO + C_IQ&IT + C_MC/IP.
  • C_ED is the decoding complexity of entropy decoding
  • C_ReO is the decoding complexity of reordering
  • C_IQ&IT is the decoding complexity of inverse quantization and inverse transform
  • C_MC/IP is the decoding complexity of motion compensation, intra prediction, and inter prediction
  • substituting the component costs gives the second expression of the decoding complexity: C = C_0 × L + C_1 × N + C_2 × N + C_3 × L_h + C_4 × N.
  • C_0 is the average decoding complexity of decoding each symbol in the entropy decoding process, where a symbol is the basic unit in which the frame length is measured
  • N is the number of video blocks contained in the video frame
  • C_1 is the average decoding complexity of decoding each video block in the reordering process
  • C_2 is the average decoding complexity of decoding each video block in the inverse quantization and inverse transform process
  • C_3 is the average computational complexity required for pre-fetching data for each symbol in the motion compensation, intra prediction, and inter prediction processes
  • L_h is the frame header length of the video frame
  • C_4 is the amount of computation required to acquire the reference block in motion compensation, intra prediction, and inter prediction and to add the reference block to the residual.
  • a state variable B characterizing the change in the format or residual of the video frame is determined based on C_1, C_2, C_4 and the number of video blocks contained in the video frame.
  • the second expression may be expressed in the form of the first expression, that is, formula (4) is rewritten into the format of formula (2), and K and B satisfy K = C_0 + C_3 × L_h / L and B = (C_1 + C_2 + C_4) × N.
  • C_0, C_1, C_2, C_3, and C_4 can be obtained by extensive experimental statistics on I, P, and B frames over a large set of video sequences on various platforms; the value of N can be determined from the video resolution and the block size, and L and L_h can be obtained during video stream packet detection.
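  • Once C_0 through C_4 are known, the state variables follow directly from the rewrite of the second expression into C = K × L + B. A minimal sketch (the function name is illustrative):

```python
def state_variables(L, Lh, N, c0, c1, c2, c3, c4):
    """Compute the state variables K and B from the component cost model:

      C_ED    = c0 * L            # entropy decoding, per symbol
      C_ReO   = c1 * N            # reordering, per video block
      C_IQIT  = c2 * N            # inverse quantization & inverse transform, per block
      C_MC/IP = c3 * Lh + c4 * N  # motion compensation / intra / inter prediction

    Rewriting the sum as C = K * L + B gives the expressions below.
    """
    K = c0 + c3 * Lh / L      # tracks coding information (prediction mode, motion vectors)
    B = (c1 + c2 + c4) * N    # tracks format / residual changes via the block count
    return K, B
```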
  • the content feature of a video frame can also be understood as the complexity of the video frame.
  • the complexity of the video frame includes time complexity and spatial complexity.
  • the spatial complexity refers to information within a single frame, and the time complexity refers to the relationship between frames.
  • the K value is related to C_0, C_3, L_h, and L, where C_0 is the average decoding complexity of decoding each symbol in the entropy decoding process and C_3 is the decoding complexity involved in the motion compensation, intra prediction, and inter prediction processes, so the K value is affected by coding information such as the prediction mode and motion vectors of the video.
  • B is related to N, that is, to the number of blocks. K and B can thus serve as the state variables of the video content features: K represents changes in the video coding information, mainly the prediction mode, motion vector precision, and motion vector range, while B represents changes in the video format and can represent changes in the residual.
  • since the decoding complexity is represented linearly, the state variables are modeled linearly with the content feature information of the video frame, and they satisfy the formula F(K, B) = α × K + β × B
  • F(K, B) is the content feature information of the video frame
  • K and B are the state variables characterizing the content features of the video frame
  • α and β are set constants, which can be set to different values according to the actual situation.
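  • The linear combination above can be sketched as follows; the α and β values used here are illustrative placeholders, not values from the patent:

```python
# Set constants; the patent leaves these application-tuned, so 0.6/0.4 is arbitrary.
ALPHA, BETA = 0.6, 0.4

def content_feature(K, B):
    """F(K, B) = alpha * K + beta * B: content feature information of one video frame."""
    return ALPHA * K + BETA * B
```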
  • the video stream is composed of Group of Pictures (GOP).
  • a GOP refers to the set of video frames from one I frame to the occurrence of the next I frame; therefore a GOP contains one I frame and several P and B frames, where the I frame is a reference frame and the P and B frames are non-reference frames.
  • content feature information of all video frames in each GOP may be determined by using the above method in a GOP unit, and video quality evaluation is performed according to content feature information of all video frames in each GOP.
  • the evaluation may be performed in the following manner:
  • Q_code_loss is the video quality degradation value caused by compression distortion
  • Bitrate is the code rate
  • a1V, a2V, a3V, and a4V are training coefficients, which can be set according to the actual application scenario
  • F_GOP(K, B) is the content feature information of all video frames within the GOP
  • Q_code is the basic quality score of the GOP
  • 1 + a is the highest subjective quality score that the video can achieve, where a is assigned different values according to different resolutions.
  • the quality value of the GOP can be determined according to the basic quality score of each GOP and the packet loss distortion value of each GOP.
  • the packet loss distortion value may be determined from the frame damage condition and the frame loss condition; for example, the packet headers of the video stream may be analyzed to extract information such as the frame type and packet loss, from which a frame parameter set concerning frame damage and frame loss, the proportion of packet loss damage for different frame types, the loss of different types of frames, and so on can be obtained.
  • the influence of the packet loss concentration of the video frames in each GOP may be further considered.
  • the packet loss concentration indicates the degree of concentration of packet loss within a set range; for non-uniform packet loss, the smaller the packet loss distance, the higher the packet loss concentration and the greater the impact on the video quality, so using the packet loss concentration in the embodiment of the present invention makes the evaluation of the impact of packet loss on video quality more accurate.
  • the packet loss concentration indicates the degree of concentration of the packet losses occurring within the corresponding GOP range, and may be determined from the distance L_loss between the first and last packet loss in that range and the total number N of lost packets of the video frames in the GOP; when N is fixed, the growth trend of the packet loss concentration is opposite to that of L_loss, and when L_loss is fixed, it is the same as that of N. For example, it can be determined by a formula in L_loss and N.
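  • The patent's exact concentration formula is not reproduced in this text; the sketch below uses one form, L_focus = c × N / L_loss^k, that is consistent with the stated monotonic trends (decreasing in L_loss for fixed N, increasing in N for fixed L_loss), with c and k playing the role of the pending constants. Both the form and the function name are assumptions:

```python
def packet_loss_concentration(l_loss, n, c=1.0, k=1.0):
    """Hypothetical packet loss concentration L_focus = c * n / l_loss**k.

    l_loss: distance from the first to the last packet loss in the GOP
    n:      total number of lost packets in the GOP
    c, k:   pending constants (defaults are arbitrary)
    """
    if n == 0:
        return 0.0              # no packet loss, no concentration
    l_loss = max(l_loss, 1)     # a single burst has distance 0; avoid division by zero
    return c * n / float(l_loss) ** k
```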
  • In the embodiment of the present invention, the quality fraction Q_frameloss lost due to frame loss and the quality fraction Q_framedamage lost due to frame damage within the GOP may be determined, and Q_frameloss and Q_framedamage may be corrected with the packet loss concentration.
  • the packet loss distortion value of the GOP may be expressed as f(L_fcous_GOP) × (Q_frameloss + Q_framedamage), where L_fcous_GOP is the packet loss concentration of the GOP, Q_frameloss is the quality fraction lost due to frame loss within the GOP, Q_framedamage is the quality fraction lost due to frame damage, and f(L_fcous_GOP) is an adjustment function of the packet loss concentration.
• The quality value of the GOP is obtained based on the basic quality score of the GOP and the packet loss distortion value of the GOP; the quality value of the GOP satisfies the formula Q_GOP = Q_code − f(L_fcous_GOP) × (Q_frameloss + β × Q_framedamage), where Q_GOP is the quality value of the GOP, Q_code is the basic quality value of the GOP, L_fcous_GOP is the packet loss concentration, f(L_fcous_GOP) is the adjustment function of the packet loss concentration, Q_frameloss is the quality score lost due to frame loss within the GOP, Q_framedamage is the quality score lost due to frame damage, and β is a weighting coefficient.
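Putting the pieces above together, a minimal numeric sketch of the GOP quality computation might look as follows; the linear adjustment function given later in the text is used, and all coefficient values (k, b, L_threshold, and the weighting coefficient beta) are illustrative assumptions:

```python
def adjustment(l_focus: float, k: float = 0.5, b: float = 1.0, l_threshold: float = 0.2) -> float:
    """Adjustment function of the packet-loss concentration,
    f(L) = k * (L - L_threshold) + b with k > 0 (coefficients illustrative)."""
    return k * (l_focus - l_threshold) + b


def gop_quality(q_code: float, l_focus: float, q_frameloss: float,
                q_framedamage: float, beta: float = 0.5) -> float:
    """Quality value of one GOP: the basic quality score minus the
    packet-loss distortion value f(L) * (Q_frameloss + beta * Q_framedamage)."""
    return q_code - adjustment(l_focus) * (q_frameloss + beta * q_framedamage)
```

A higher packet loss concentration enlarges the adjustment factor and thus lowers the GOP quality value.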
• The entire video stream is composed of GOPs. After the quality value of each GOP is obtained, the quality value of the entire video stream can be calculated by weighting: the quality values of the GOPs in the video are weighted to obtain the video quality value. Specifically, since the human eye is more impressed by the poor portions, a larger weight may be set for a GOP with low quality, and a larger weight for a GOP with a longer duration.
• The video quality value can be further corrected based on the viewing device and the viewing conditions, achieving device adaptability of the evaluation result. An evaluation that ultimately reflects the user's experience of viewing video quality not only evaluates the video quality received by the terminal, but also reflects the differences between different viewing devices and viewing conditions.
• The main considerations are the screen size, the screen resolution, video scaling, the viewing distance, and the viewing angle. With too many parameters selected, the calculation becomes complicated, the relationships among the parameters are difficult to determine, and the calculation accuracy is difficult to guarantee; in addition, some parameters are difficult to extract, so flexibility and practicability are poor. To simplify the calculation and enhance practicability, the Contrast Sensitivity Function (CSF) is used to model the screen size, the screen resolution, the device type, and the video resolution, and to correct the video quality value.
• The following takes as an example the correction of the video quality value obtained by the video quality evaluation method mentioned above; the method of correcting the video quality value using the viewing device and the viewing conditions is applicable to the video quality value obtained by any video quality evaluation method, which is not limited in the embodiment of the present invention.
  • the video quality value may be further modified as follows:
• The video resolution and the device resolution are modeled. The video resolution is denoted v_res and the device resolution is denoted d_res; the resolution correction factor is obtained by modeling the video resolution v_res and the resolution of the device used to view the video.
• Let f_ratio be the ratio between the viewing device resolution and the video resolution.
• The ultimate benchmark for evaluating video quality assessment methods is subjective test data, which best reflects the human experience.
• The resolution correction factor R is the value obtained by modeling the video resolution and the resolution of the device viewing the video. Its relationship to f_ratio is close to a sine function and an inverse S function in different intervals: in the range f_ratio ∈ (0, 1] it follows the form of a sine function, and in the range f_ratio ∈ (1, +∞) it follows the form of an inverse S function, where the inverse S function is a Sigmoid function.
• f_ratio is the ratio between the resolution of the viewing device and the video resolution, A > 2(B − 1), and B is the f_ratio value at which the video quality value is halved when f_ratio is greater than 1.
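One way to realize such a piecewise factor is sketched below: a sine shape on (0, 1] peaking at f_ratio = 1, and a falling Sigmoid ("inverse S") on (1, +∞) passing through one half at f_ratio = B. The concrete expressions and the example values A = 6 and B = 2 (which satisfy A > 2(B − 1)) are assumptions:

```python
import math


def resolution_factor(f_ratio: float, a: float = 6.0, b: float = 2.0) -> float:
    """Illustrative resolution correction factor R(f_ratio), where
    f_ratio is the viewing-device resolution over the video resolution.

    On (0, 1] R follows a sine shape rising to 1 at f_ratio = 1; on
    (1, +inf) it follows a falling Sigmoid that equals 0.5 at f_ratio = b.
    """
    if f_ratio <= 0:
        raise ValueError("f_ratio must be positive")
    if f_ratio <= 1.0:
        return math.sin(f_ratio * math.pi / 2.0)
    return 1.0 / (1.0 + math.exp(a * (f_ratio - b)))
```

The two branches are nearly continuous at f_ratio = 1, and the factor decays as the device resolution exceeds the video resolution ever further.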
  • the screen size correction factor is a value obtained by modeling a screen size of a device for viewing the video, and the screen size correction factor has a linear relationship with a screen size of a device for viewing the video.
• The screen size correction factor can be expressed by the following formula (11), i.e., the screen size correction factor satisfies M = min + (max − min) × (S_true − S_min) / (S_base − S_min), where M is the screen size correction factor, S_base is the set reference size of the viewing device screen, S_min is the set minimum size of the viewing device screen, S_true is the actual size of the viewing device screen, min is the video quality value corresponding to S_min, and max is the video quality value corresponding to the set maximum size of the viewing device screen.
• [S_min, S_base] is a set standard size range of the viewing device, which can be adjusted as needed. A size smaller than S_min is set to S_min, and a size larger than S_base is set to S_base. min and max can be flexibly adjusted; the selection of min can be set according to the S_min set above.
• For example, the screen size of a laptop computer is generally 11 to 17 inches, while the screen size of a smartphone is generally 3 to 5.5 inches; accordingly, the min and max corresponding to the computer are set larger than the min and max corresponding to the smartphone.
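A sketch of the screen size correction factor with the clamping described above; the linear interpolation between the factor values at S_min and S_base, and the example sizes and endpoint values, are assumptions:

```python
def screen_size_factor(s_true: float, s_min: float = 3.0, s_base: float = 17.0,
                       v_min: float = 0.5, v_max: float = 1.0) -> float:
    """Screen-size correction factor M, linear in the (clamped) screen size.

    Sizes below s_min are treated as s_min and sizes above s_base as s_base,
    as the text specifies; v_min / v_max are the factor values assumed at the
    two ends of the standard size range.
    """
    s = min(max(s_true, s_min), s_base)
    return v_min + (v_max - v_min) * (s - s_min) / (s_base - s_min)
```

Screens outside the standard range are simply pinned to its endpoints, so the factor never extrapolates.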
• Video quality correction using the screen size correction factor and video quality correction using the resolution correction factor may each be used alone, or the two may be used in combination; this is described below.
• When the two are used in combination, the resolution correction factor is first used to correct the video quality, and the screen size correction factor is then used to correct again the video quality value already corrected by the resolution correction factor. Before this second correction, the video quality value corrected by the resolution correction factor needs to be re-assigned, so that the re-assigned video quality value falls within the set standard range; this makes the approach applicable to video quality evaluation methods that use different standards.
• The specific implementation process is as follows:
• The range of possible video quality values [1, MAX] is further defined for the various viewing devices, and the corrected value is mapped into this interval according to the minimum value 1; the re-assigned video quality value satisfies the corresponding formula, where V_R is the re-assigned video quality value, MAX is the maximum of the set video quality values, and V is the video quality value corrected by the resolution correction factor.
• V_corrected = M × V_R, where V_corrected is the corrected video quality value, M is the screen size correction factor, and V_R is the re-assigned video quality value.
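The re-assignment and final correction can be sketched as below; the linear map into [1, MAX] is an assumed realization (the text only requires the re-assigned value to fall within the set standard range):

```python
def reassign(v: float, v_lo: float, v_hi: float, max_score: float = 5.0) -> float:
    """Linearly map a quality value from its own range [v_lo, v_hi]
    into the standard range [1, max_score]."""
    return 1.0 + (max_score - 1.0) * (v - v_lo) / (v_hi - v_lo)


def corrected_quality(v: float, v_lo: float, v_hi: float, m: float) -> float:
    """Final value per the text: V_corrected = M * V_R, where V_R is the
    re-assigned quality value and M the screen-size correction factor."""
    return m * reassign(v, v_lo, v_hi)
```

The screen-size factor is applied only after the value has been brought onto the standard scale, matching the ordering described above.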
• The video quality evaluation apparatus involved in the embodiment of the present invention determines state variables characterizing the video frame content features by using the frame length and the decoding complexity, and determines the content feature information of the video frame based on the state variables, thereby comprehensively and completely utilizing the video content features and making the video quality evaluation result more accurate. Since the decoding complexity is the calculation overhead value corresponding to the terminal decoding the video frame, no separate decoding is needed; the terminal's inherent hardware decoding function is used, which adds essentially no calculation overhead, guarantees low computational overhead, and reduces computational complexity.
  • the video quality evaluation apparatus can more accurately evaluate the impact of packet loss on video quality by using packet loss concentration.
  • the viewing device and the viewing conditions are modeled, and the adaptability of the evaluation results in different scenarios can be achieved.
  • Figure 3 shows the model framework for video quality evaluation.
  • the input information is video stream information, and finally the video quality value, such as VMOS value, is output.
• The video quality evaluation method according to the embodiment of the present invention can be understood as a packet layer evaluation method that performs network video quality evaluation based on data packet header information, to which the following are added: modeling of the decoding complexity and the frame length to obtain the video frame content feature information, modeling of the packet loss concentration for video quality evaluation, and modeling of the viewing device and the viewing conditions to correct the initial video quality.
• The above three links can be implemented separately or used in combination; Figure 3 is only an illustrative description. The specific implementation process is described in detail below.
  • FIG. 4 is a flowchart of implementing a video quality evaluation method according to an embodiment of the present invention, and FIG. 4 includes:
• S202a: The packet header of the video stream is detected, and the frame length of the video frame and the frame header length of the video frame are obtained.
• Packet header analysis and detection of the obtained video stream may yield User Datagram Protocol (UDP) packet information, from which information such as the frame length of the video frame and the frame header length of the video frame may be acquired.
• S202b: Network packet capture is performed, and the file obtained by the packet capture is parsed to obtain the number of video blocks included in the video frame.
• The file obtained by the packet capture is parsed to obtain information including the resolution and the block size, and the number of video blocks included in the video frame is determined from the resolution and the block size.
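The block count in S202b can be derived from the parsed resolution and block size with a ceiling division; counting partial edge blocks as whole blocks is an assumption:

```python
def num_video_blocks(width: int, height: int, block_size: int) -> int:
    """Number of video blocks in a frame, from resolution and block size.
    Partial blocks at the right/bottom edges count as whole blocks."""
    blocks_x = -(-width // block_size)   # ceiling division
    blocks_y = -(-height // block_size)
    return blocks_x * blocks_y
```

For a 1920x1080 frame with 16-pixel blocks this gives 120 columns and 68 rows of blocks.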
  • S203 Acquire decoding complexity information of the video frame, and obtain content feature information of the video frame according to the decoding complexity information of the video frame and the frame parameter.
• For the specific implementation of obtaining the content feature information of the video frame according to the decoding complexity information of the video frame and the frame parameters, the frame length and the decoding complexity involved in the foregoing embodiment may be modeled to determine the state variables characterizing the video frame content features; details are not repeated here.
  • the part between one I frame and the next I frame in the video is a GOP, so there is one I frame and several P and B frames in the GOP.
• Through frame type detection, the numbers of P frames and B frames are obtained, and the related parameters of the GOP are determined according to the parameters of each video frame in the GOP; the related parameters of the GOP may be, for example, the numbers of P frames and B frames.
• In units of GOPs, each frame in the GOP is represented by formula (5), that is, the state variables K and B characterizing the video content features in each frame are obtained. The content feature information F(K, B) of each frame can then be obtained by combining the state variables K and B characterizing the video content features in that frame.
• F_GOP(K, B) is the content feature information of all video frames in the GOP, F_I(K, B) is the content feature information of the I frame in the GOP, F_P(K, B) is the content feature information of a P frame in the GOP, F_B(K, B) is the content feature information of a B frame in the GOP, and m and n are the numbers of P frames and B frames in the GOP, respectively.
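The per-frame features can be aggregated over a GOP as sketched below; taking the aggregate as a plain sum of the I-frame term, the m P-frame terms, and the n B-frame terms is an assumption consistent with the terms listed above:

```python
def gop_content_feature(f_i: float, f_p: list, f_b: list) -> float:
    """Content feature information of all frames in one GOP, as the sum of
    the I-frame feature f_i, the m P-frame features and the n B-frame
    features (m = len(f_p), n = len(f_b))."""
    return f_i + sum(f_p) + sum(f_b)
```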
• The code rate parameter of the GOP can be obtained through network packet capture and parsing of the captured packets, and the video quality degradation value caused by compression distortion is determined according to the code rate parameter of the GOP and the content feature information of all video frames in the GOP.
  • the video quality degradation value caused by the compression distortion satisfies the formula (6)
• The basic quality score of the GOP is determined based on the video quality degradation value Q_code_loss caused by the compression distortion, and the basic quality score of the GOP satisfies formula (7).
  • S205 Determine a packet loss concentration within the GOP, and determine a packet loss distortion value of the GOP.
  • the packet loss concentration is used to indicate the degree of concentration of packet loss in the corresponding GOP range, and the packet loss concentration in the GOP satisfies the formula (8).
  • the packet loss distortion value of the GOP may be determined based on the packet loss concentration.
• b1V, b2V, and c are training coefficients; Number_frameloss is the number of dropped frames, which can be accumulated from frame-dropping events; Bitrate is the code rate.
• The quality score lost in a GOP due to frame damage may be calculated as follows:
• Q_framedamage is the quality score lost due to frame damage; Damage_ratio is the frame damage rate, which can be calculated from the number of lost packets in the video frame; the three remaining coefficients in the formula are constant coefficients; c1V and c2V are training coefficients.
• Q_code is the basic quality value of the GOP. A lower limit is set for Q_code according to the application; below this lower limit, the linear expression in Q_code takes a constant value of 1.
• The packet loss distortion value of the GOP can be expressed as f(L_fcous_GOP) × (Q_frameloss + β × Q_framedamage), where L_fcous_GOP is the packet loss concentration of the GOP, β is a weighting coefficient, and f(L_fcous_GOP) is the adjustment function of the packet loss concentration, which can be formulated according to L_fcous_GOP as f(L_fcous_GOP) = k × (L_fcous_GOP − L_threshold) + b, where k is a positive number, b is a constant to be determined, and L_threshold is a defined threshold.
  • the quality value of the GOP may be obtained based on the base quality score of the GOP and the packet loss distortion value of the GOP, and the quality value of the GOP satisfies the formula (9).
  • S207 Determine, according to the foregoing method, a quality value of all GOPs in the video, and weight the quality values of all GOPs in the video to obtain an initial video quality value.
• The entire video is composed of individual GOPs. After the quality value of each GOP is obtained, the quality values of all GOPs in the video are weighted to obtain the video quality value.
• The resolution and the screen size are modeled to obtain the resolution correction factor and the screen size correction factor; for the process of correcting the video quality value, reference may be made to the description of the foregoing embodiment, and details are not described herein again.
• The video quality evaluation method involved in the embodiment of the present invention determines state variables characterizing the video frame content features by using the frame parameters and the decoding complexity information, and determines the content feature information of the video frame based on the state variables, thereby comprehensively and completely utilizing the video content features and making the video quality evaluation result more accurate. Since the decoding complexity is the calculation overhead corresponding to the terminal decoding the video frame, no separate decoding is needed; the terminal's inherent hardware decoding function is used, which adds essentially no computational overhead, guarantees low computational overhead, and reduces computational complexity.
  • the use of packet loss concentration can more accurately evaluate the impact of packet loss on video quality.
  • the video quality value is further modified by using the resolution correction factor and the screen size correction factor, so as to achieve the adaptability of the evaluation results in different scenarios.
• FIG. 5 is a simplified functional block diagram of the video quality evaluation apparatus 100. As shown in FIG. 5, the video quality evaluation apparatus 100 includes an obtaining unit 101 and a processing unit 102, wherein:
• The obtaining unit 101 is configured to obtain the frame parameters and the decoding complexity information of the terminal decoding the video frame, where the decoding complexity information is the calculation overhead value corresponding to the decoding of the video frame by the terminal.
• The processing unit 102 is configured to determine, by using the frame parameters and the decoding complexity information acquired by the obtaining unit, state variables characterizing the content features of the video frame, determine the content feature information of the video frame based on the state variables, repeat the above steps until the content feature information of all video frames in the video is determined, and perform video quality evaluation according to the content feature information of all the video frames.
• The video quality evaluation apparatus 100 determines, by using the frame parameters and the decoding complexity information, state variables characterizing the video frame content features, and determines the content feature information of the video frame based on the state variables, thereby fully utilizing the video content features. The decoding complexity information is the calculation overhead value corresponding to the decoding of the video frame by the terminal, so no separate decoding is needed; the terminal's inherent hardware decoding function is used, which adds essentially no calculation overhead, guarantees low computational overhead, and reduces computational complexity.
• For the manner in which the processing unit 102 determines, by using the frame parameters and the decoding complexity information, the state variables characterizing the content features of a video frame, reference may be made to the method embodiment, in which the frame length and the decoding complexity are modeled to determine the state variables characterizing the video frame content features and the content feature information of the video frame is determined based on the state variables; details are not described herein again.
• F(K, B) denotes the content feature information of the video frame, K and B are state variables characterizing the content features of the video frame, and the remaining two coefficients are constants.
• The video quality evaluation apparatus involved in the embodiment of the present invention determines state variables characterizing the video frame content features by using the frame length and the decoding complexity, and determines the content feature information of the video frame based on the state variables, thereby comprehensively and completely utilizing the video content features and making the video quality evaluation result more accurate. Since the decoding complexity is the calculation overhead value corresponding to the terminal decoding the video frame, no separate decoding is needed; the terminal's inherent hardware decoding function is used, which adds essentially no calculation overhead, guarantees low computational overhead, and reduces computational complexity.
  • the processing unit 102 may perform video quality evaluation according to the content feature information of all the video frames in the following manner, including:
• The quality value of each GOP is determined based on the basic quality score of each GOP and the packet loss distortion value of each GOP; the quality values of the GOPs in the video are then weighted to obtain the video quality value.
  • the packet loss concentration may be determined according to the distance L loss of the first packet loss to the last packet loss in the corresponding GOP range, and the total number of packet loss N of the video frame in the GOP, where In the case where N is fixed, the growth trend of the packet loss concentration is opposite to the growth trend of L loss . In the case where L loss is fixed, the growth trend of the packet loss concentration is the same as the growth trend of N.
  • the embodiment of the present invention can more accurately evaluate the impact of packet loss on video quality by using packet loss concentration.
  • the processing unit 102 is further configured to modify the video quality value by using a resolution correction factor.
• The resolution correction factor is a value obtained by modeling the video resolution and the resolution of the device used to view the video, and the value follows the form of a sine function in the range f_ratio ∈ (0, 1] and the form of an inverse S function in the range f_ratio ∈ (1, +∞).
  • f_ratio is the ratio between the viewing device resolution and the video resolution.
  • the processing unit 102 is further configured to perform re-correction of the video quality value corrected by the resolution correction factor by using a screen size correction factor.
• The screen size correction factor is a value obtained by modeling the screen size of the device viewing the video; the screen size correction factor satisfies the formula M = min + (max − min) × (S_true − S_min) / (S_base − S_min), where M is the screen size correction factor, S_base is the set reference size of the viewing device screen, S_min is the set minimum size of the viewing device screen, S_true is the actual size of the viewing device screen, min is the video quality value corresponding to S_min, and max is the video quality value corresponding to the set maximum size of the viewing device screen.
  • the embodiment of the invention further corrects the video quality value by using the resolution correction factor and the screen size correction factor, and can realize the adaptability of the evaluation result in different scenarios.
• The video quality evaluation apparatus 100 provided by the embodiment of the present invention can be used to implement the video quality evaluation method according to the foregoing embodiments, and implements all the functions in the video quality evaluation process. For the specific implementation process, refer to the foregoing embodiments and the accompanying drawings; the related description is not repeated here.
  • FIG. 6 is a schematic structural diagram of a video quality evaluation apparatus 200 according to another embodiment of the present invention.
  • the video quality evaluation apparatus 200 employs a general computer system structure including a bus, a processor 201, a memory 202, and a communication interface 203.
  • the program code for executing the inventive scheme is stored in the memory 202 and controlled by the processor 201 for execution.
  • the bus can include a path to transfer information between various components of the computer.
  • the processor 201 can be a general purpose central processing unit (CPU), a microprocessor, an application specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the present invention.
• The computer system includes one or more memories, which may be read-only memory (ROM) or other types of static storage devices capable of storing static information and instructions, random access memory (RAM) or other types of dynamic storage devices capable of storing information and instructions, or disk storage. These memories are connected to the processor via the bus.
  • the communication interface 203 can use devices such as any transceiver to communicate with other devices or communication networks, such as Ethernet, Radio Access Network (RAN), Wireless Local Area Network (WLAN), and the like.
  • a memory 202 such as a RAM, holds an operating system and a program for executing the inventive arrangements.
  • the operating system is a program that controls the running of other programs and manages system resources.
• The program stored in the memory 202 is used to instruct the processor 201 to perform the video quality evaluation method, including: acquiring frame parameters of a video frame and decoding complexity information of the terminal decoding the video frame, where the decoding complexity information is the calculation overhead value corresponding to the decoding of the video frame by the terminal; determining, according to the frame parameters and the decoding complexity information, state variables that characterize the video frame content features; determining the content feature information of the video frame based on the state variables; repeating the above steps until the content feature information of all video frames in the video is determined; and performing video quality evaluation based on the content feature information of all the video frames.
• The video quality evaluation apparatus 200 of this embodiment can be used to implement all the functions involved in the foregoing method embodiments. For the specific implementation process, reference may be made to the related description of the foregoing method embodiments; details are not described herein again.
  • the embodiment of the present invention further provides a computer storage medium for storing the computer software instructions used in the video evaluation apparatus described in FIG. 5 or FIG. 6, which includes a program for executing the foregoing method embodiments.
  • the evaluation of video quality can be achieved by executing a stored program.
  • embodiments of the present invention can be provided as a method, apparatus (device), or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program is stored/distributed in a suitable medium, provided with other hardware or as part of the hardware, or in other distributed forms, such as over the Internet or other wired or wireless telecommunication systems.
• The present invention is described with reference to flowcharts and/or block diagrams of the method, apparatus (device), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions.
• These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
• These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
• These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.


Abstract

A video quality evaluation method and apparatus: frame parameters of a video frame and decoding complexity information are acquired, where the decoding complexity information is the calculation overhead value corresponding to the terminal decoding the video frame; state variables characterizing the video frame content features are determined from the frame parameters and the decoding complexity information; the content feature information of the video frame is determined based on the state variables; the above steps are repeated until the content feature information of all video frames in the video is determined; and video quality evaluation is performed according to the content feature information of all the video frames. The content features of the video can thus be utilized comprehensively and completely, and the computational complexity is reduced.

Description

Video quality evaluation method and apparatus
This application claims priority to Chinese Patent Application No. 201510793958.5, filed with the Chinese Patent Office on November 18, 2015 and entitled "Video quality evaluation method and apparatus", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a video quality evaluation method and apparatus.
Background
With the development of communications technologies and multimedia technologies, network video services have been widely developed and applied, and their application scenarios have become more complex. Real-time and accurate evaluation of network video quality is therefore crucial to the development of video coding and decoding technologies, network planning, in-network video service quality monitoring, and the optimization of algorithms and parameter tuning in video application systems.
Methods for video quality evaluation fall mainly into two categories. One is the subjective video quality evaluation method, in which a large number of observers score the test video according to a pre-specified evaluation scale based on the merits of its visual effect, and the scores given by all observers are weighted and averaged as the subjective quality evaluation value of the test video. The other is the objective video quality evaluation method, in which a designed computational model is used in place of the Human Visual System (HVS) to analyze and compare the test video, obtaining an objective quality evaluation value of the test video.
Objective video quality evaluation methods are divided, according to the use of reference source information, into full-reference, reduced-reference, and no-reference video quality evaluation methods. For network video quality evaluation, no-reference methods are mainly used. Current no-reference network video quality evaluation methods mainly include the packet layer evaluation method based on data packet header information, the bitstream layer evaluation method based on packet header information and payload information, and the media layer evaluation method based on pixel-level features.
The packet layer evaluation method, which evaluates network video quality based on data packet header information, cannot correctly parse the content features of the video, so the accuracy of its evaluation results is low. The bitstream layer evaluation method, which evaluates network video quality based on packet header information and payload information, can use the feature information of the video content for video quality evaluation and can guarantee the accuracy of video quality prediction, but it requires separate decoding; with full decoding, the video stream is reconstructed into video pictures, and the computational complexity is high. Moreover, for a terminal such as a set-top box, the parameters obtained by decoding the video cannot be used to reconstruct video pictures, and a set-top box cannot bear the overhead of full decoding, so the use of the bitstream layer evaluation method is limited. The media layer evaluation method, which evaluates network video quality based on pixel-level features, uses pixel-level feature extraction and computation, so its computational complexity is also high.
It is therefore imperative to provide a network video quality evaluation method that can both guarantee the accuracy of video quality prediction and reduce computational complexity.
Summary
Embodiments of the present invention provide a video quality evaluation method and apparatus, so as to evaluate network video quality by comprehensively and completely utilizing the content features of the video, thereby improving the accuracy of video quality prediction and reducing computational complexity.
According to a first aspect, a video quality evaluation method is provided. In the method, frame parameters of a video frame and decoding complexity information of the terminal decoding the video frame are acquired; the frame parameters may be, for example, information such as the frame length obtained through packet header detection. The decoding complexity information is the calculation overhead value corresponding to the terminal decoding the video frame, and may comprise the calculation overhead values corresponding to the parts involved in the terminal's decoding process; no separate decoding is required, which guarantees low calculation overhead and reduces computational complexity. From the frame parameters and the decoding complexity information of the video frame, state variables characterizing the content features of the video frame are determined; these state variables can characterize changes in the coding information of the video frame as well as changes in its picture size or residual. The content feature information of the video frame is then extracted based on the state variables, and video quality evaluation is performed using the content feature information of all video frames in the video, so that the video content features are comprehensively and completely utilized for video quality evaluation and the accuracy of video quality prediction is improved.
In a possible design, the frame parameters include the frame header length of the video frame, the frame length, and the number of video blocks included in the video frame, and the decoding complexity information comprises the calculation overhead values involved in entropy coding, reordering, inverse quantization and inverse transform, and motion compensation, intra prediction, and inter prediction. Determining, from the frame parameters and the decoding complexity information, the state variables characterizing the video frame content features specifically includes: obtaining the average decoding complexity C0 of decoding each symbol in the entropy coding process, the average decoding complexity C1 of decoding each video block in the reordering process, the average decoding complexity C2 of decoding each video block in the inverse quantization and inverse transform process, the average computational complexity C3 required to prefetch data for each symbol in the motion compensation, intra prediction, and inter prediction process, and the amount of computation C4 required by motion compensation, intra prediction, and inter prediction to obtain a reference block and to add the reference block to the residual; determining, from C0, C3, and the frame header length and frame length of the video frame, a state variable K characterizing changes in the coding information of the video frame; and determining, from C1, C2, C4, and the number of video blocks included in the video frame, a state variable B characterizing changes in the picture size or residual of the video frame.
In a possible design, the state variables characterizing the video frame content features may be determined from the frame parameters and the decoding complexity information in the following manner:
A. Linear modeling is performed on the frame length and the terminal decoding complexity to obtain a first expression for the terminal decoding complexity, the first expression including the state variables characterizing the content features of the video frame; for example, it may be expressed as C = K × L + B, where K and B are the state variables characterizing the video frame content features.
B. The decoding complexities of the parts involved in the terminal decoding the video frame are determined, and the decoding complexities of the parts are summed to obtain a second expression for the terminal decoding complexity. The parts involved in the decoding process include entropy coding, reordering, inverse quantization and inverse transform, and motion compensation, intra prediction, and inter prediction, so the second expression can be written as:
C = C_ED + C_ReO + C_IQ&IT + C_MC/IP = (C0 × L) + (C1 × N) + (C2 × N) + (C3 × Lh + C4 × N)
where C_ED is the decoding complexity of entropy coding, C_ReO is the decoding complexity of reordering, C_IQ&IT is the decoding complexity of inverse quantization and inverse transform, and C_MC/IP is the decoding complexity of the motion compensation, intra prediction, and inter prediction modules;
C_ED = C0 × L, C_ReO = C1 × N, C_IQ&IT = C2 × N, C_MC/IP = C3 × Lh + C4 × N, N is the number of video blocks included in the video frame, and Lh is the length of the frame header of the video frame.
C. The second expression is written in the form of the first expression, yielding the state variables characterizing the content features of the video frame. For example, the state variables characterizing the content features of the video frame may satisfy the following expressions:
Figure PCTCN2016082223-appb-000001
In the above expressions for K and B, the value of K is affected by coding information such as the prediction mode and motion vectors of the video; that is, K is related to the coding information and can characterize changes in the video coding information, mainly including the prediction mode, the motion vector precision, and the motion vector range. B is related to the number of blocks and can characterize changes in the video picture size; when the picture size is unchanged, B can characterize changes in the residual. K and B are therefore state variables that can characterize the video content features.
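The matching of the two complexity expressions above can be written out directly; the grouping below (K = C0 + C3·Lh/L, B = (C1 + C2 + C4)·N) reconstructs the omitted formula image and is consistent with the statement that K depends on C0, C3, the frame header length, and the frame length, while B depends on C1, C2, C4, and the block count:

```python
def content_state_variables(l: float, lh: float, n: int,
                            c0: float, c1: float, c2: float,
                            c3: float, c4: float):
    """State variables K and B from the linear decoding-complexity model.

    Matching C = K*L + B against
    C = C0*L + C1*N + C2*N + C3*Lh + C4*N gives:
      K = C0 + C3*Lh/L      (coding-information term)
      B = (C1+C2+C4)*N      (picture-size / residual term)
    """
    k = c0 + c3 * lh / l
    b = (c1 + c2 + c4) * n
    return k, b
```

By construction, K·L + B reproduces the summed per-part decoding complexity exactly.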
在一种可能的设计中,针对视频质量评价过程中,在确定丢包失真值时,依据所述GOP内的丢包集中度进行确定,所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度,在实际网络中发生的丢包多是非均匀丢包,在相同的丢包率情况下,丢包集中度越高,则对视频质量的影响越大,考虑丢包集中度的影响,故可使得评测丢包对视频质量评价的影响更加准确。
所述丢包集中度可依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss,以及该GOP内视频帧的总丢包数N所确定,其中,在N固定的情况下,所述丢包集中度的增长趋势与Lloss的增长趋势相反,在Lloss固定的情况下,所述丢包集中度的增长趋势与N的增长趋势相同。
例如,丢包集中度满足公式
丢包集中度=c×N/Lloss+k
其中,Lloss表示所述GOP内第一个丢包到最后一个丢包的距离,N为所述GOP内的总丢包数,c和k为待定常量。
在一种可能的设计中,根据所述全部视频帧的内容特征信息,进行视频质量评价,包括:
根据所述全部视频帧的内容特征信息,确定视频中各图像组GOP的基础质量分值;其中,所述GOP是指从一个I帧开始到下一个I帧出现前的视频帧集合;获取各GOP内视频帧的丢包集中度,依据所述各GOP内视频帧的丢包集中度,确定各GOP的丢包失真值;根据各GOP的基础质量分值以及各GOP的丢包失真值,确定所述各GOP的质量值;整个视频流是由各GOP组成的,得到各GOP的质量值后,计算整个视频流的质量值,可以以加权的方法实现,即对视频中各GOP的质量值进行加权,得到视频质量值。具体的,由于人眼对于较差的部分印象深刻,因此可以设置质量低的GOP的权重较大,另外持续时间较长的GOP设置的权重较大。
进一步的，在得到视频质量值后，可利用分辨率修正因子，对所述视频质量值进行修正，以使得最终得到的视频质量值不仅可体现评价终端设备接收的视频质量，还可体现观看条件影响下的视频质量，实现不同场景下评价结果的自适应性。分辨率修正因子为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值，且所述数值在f_ratio∈(0,1]范围内符合正弦函数的表现形式，在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式；其中，f_ratio为观看设备分辨率与视频分辨率之间的比值。
进一步的,还可利用屏幕尺寸修正因子对视频质量值进行修正,以使得最终得到的视频质量值不仅可体现评价终端设备接收的视频质量,还可体现观看所述视频的设备屏幕尺寸影响下的视频质量,实现不同场景下评价结果的自适应性。
其中,所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子满足公式
M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max
其中,M为屏幕尺寸修正因子,S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸,min为S_min对应的视频质量值,max为设定观看设备屏幕的最大尺寸对应的视频质量值。
需要说明的是，利用屏幕尺寸修正因子对视频质量修正，以及利用分辨率修正因子进行视频质量修正，可以单独使用，也可二者结合使用。在利用屏幕尺寸修正因子对视频质量修正和利用分辨率修正因子进行视频质量修正结合使用过程中，需先利用分辨率修正因子进行视频质量修正，然后再利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正，并且在利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正之前，需将利用分辨率修正因子修正后的视频质量值重新赋值，使得重新赋值后的视频质量值落在设定的标准范围内，以使本发明实施例可适用于利用不同标准进行视频质量评价的方法。
第二方面,提供一种视频质量评价装置,该视频质量评价装置包括获取单元和处理单元,所述获取单元,用于获取帧参数以及终端解码所述视频帧的解码复杂度信息。所述帧参数例如可以是通过分组头检测得到的帧长等信息。所述解码复杂度信息为终端解码所述视频帧过程中各部分的计算开销值。
处理单元,用于通过所述获取单元获取的所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量,基于所述状态变量,确定所述视频帧的内容特征信息,并重复执行以上步骤,直至确定出视频中的全部视频帧的内容特征信息,根据所述全部视频帧的内容特征信息,进行视频质量评价。
第二方面提供的视频质量评价装置,通过帧参数与解码复杂度信息,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定所述视频帧的内容特征信息,能够全面完整的利用视频内容特征,使得视频质量评价结果更为准确,并且所述解码复杂度信息为终端解码视频帧对应的计算开销值,故无需进行单独解码,使用终端固有的硬解码功能,基本不增加计算开销,能够保证低计算开销,降低计算复杂度。
所述帧参数包括视频帧的帧头长度、帧长以及所述视频帧所包含的视频块数，所述解码复杂度信息为熵编码、重排序、反量化和反变换，以及运动补偿、帧内预测和帧间预测各部分涉及的计算开销值，所述处理单元具体采用如下方式通过帧参数与解码复杂度信息，确定表征视频帧内容特征的状态变量：获取熵编码过程中解码每个码元的平均解码复杂度C0，重排序过程中解码每个视频块的平均解码复杂度C1，反量化和反变换过程中解码每个视频块的平均解码复杂度C2，运动补偿、帧内预测和帧间预测过程中每个码元预取数据所需平均计算复杂度C3，以及运动补偿、帧内预测和帧间预测获取参考块所需的计算量和将参考块与残差相加所需计算量C4；根据所述C0、C3和所述视频帧的帧头长度、帧长，确定表征所述视频帧的编码信息变化的状态变量K；根据所述C1、C2、C4和所述视频帧所包含的视频块数，确定表征所述视频帧的幅面或残差变化的状态变量B。
在第二方面的一种可实现方式中,所述处理单元确定的所述表征所述视频帧的编码信息变化的状态变量K和所述表征所述视频帧的幅面或残差变化的状态变量B满足公式
K=C0+C3×Lh/L，B=(C1+C2+C4)×N
其中,L为帧长,N为所述视频帧所包含的视频块数,Lh为所述视频帧帧头的长度。
所述处理单元,具体用于采用如下方式基于所述状态变量,确定所述视频帧的内容特征信息:
对所述状态变量与所述视频帧的内容特征信息进行线性建模,所述状态变量与所述视频帧的内容特征信息之间满足公式F(K,B)=αK+βB;其中,所述F(K,B)表征为所述视频帧的内容特征信息,α和β为常数。
在第二方面的另一种可实现方式中,所述处理单元可依据GOP内的丢包集中度确定所述GOP的丢包失真值,并根据GOP的基础质量分值以及GOP的丢包失真值,确定所述GOP的质量值,对所述视频中全部GOP的质量值进行加权,得到视频质量值。
所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度，可依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss，以及该GOP内视频帧的总丢包数N所确定，其中，在N固定的情况下，所述丢包集中度的增长趋势与Lloss的增长趋势相反，在Lloss固定的情况下，所述丢包集中度的增长趋势与N的增长趋势相同。
所述处理单元,具体用于采用如下方式根据所述全部视频帧的内容特征信息,进行视频质量评价:
根据所述全部视频帧的内容特征信息,确定视频中各GOP的基础质量分值;获取各GOP内视频帧的丢包集中度,其中,所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度;依据所述各GOP内视频帧的丢包集中度,确定各GOP的丢包失真值;根据各GOP的基础质量分值以及各GOP的丢包失真值,确定所述各GOP的质量值;对视频中各GOP的质量值进行加权,得到视频质量值。具体的,由于人眼对于较差的部分印象深刻,因此可以设置质量低的GOP的权重较大,另外持续时间较长的GOP设置的权重较大。
所述处理单元,还用于利用分辨率修正因子,对所述视频质量值进行修正,以使得最终得到的视频质量值不仅可体现评价终端设备接收的视频质量,还可体现观看条件影响下的视频质量,实现不同场景下评价结果的自适应性。其中,所述分辨率修正因子为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值,且所述数值在f_ratio∈(0,1]范围内符合正弦函数的表现形式,在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式,其中,f_ratio为观看设备分辨率与视频分辨率之间的比值。
进一步的,所述处理单元还用于利用屏幕尺寸修正因子对视频质量值进行修正。例如,所述处理单元可利用屏幕尺寸修正因子对利用分辨率修正因子修 正后的视频质量值进行再次修正。
所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子满足公式
M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max
其中,M为屏幕尺寸修正因子,S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸,min为S_min对应的视频质量值,max为设定观看设备屏幕的最大尺寸对应的视频质量值。
利用分辨率修正因子和屏幕尺寸修正因子对视频质量值进行修正，可使得最终得到的视频质量值不仅可体现评价终端设备接收的视频质量，还可体现观看条件影响下的视频质量，实现不同场景下评价结果的自适应性。
第三方面,提供一种视频质量评价装置,该视频质量评价装置包括处理器和存储器,其中,所述存储器中存有计算机可读程序,所述处理器通过运行所述存储器中的程序,实现第一方面涉及的进行视频质量评价的方法。
第四方面,提供一种计算机存储介质,用于储存上述视频质量评价装置所用的计算机软件指令,其包含用于执行上述第一方面涉及的视频质量评价方法所设计的程序。
附图说明
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付 出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为IPTV端到端体系结构;
图2为本发明一实施例提供的视频质量评价方法的一种实现流程图;
图3为本发明实施例提供的视频质量评价流程框图;
图4为本发明另一实施例提供的视频质量评价方法另一种实现流程图;
图5为本发明一实施例提供的一种视频质量评价装置的构成示意图;
图6为本发明另一实施例提供的视频质量评价装置的另一种构成示意图。
具体实施方式
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚地描述。
本发明实施例提供的视频质量评价方法可应用于网络协议电视(Internet Protocol Television,IPTV)视频业务。如图1所示,IPTV端到端体系结构中主要包括的网元设备为:节目源、IPTV核心节点(如直播转发服务器、核心流媒体服务器)、IPTV边缘节点(如边缘流媒体服务器)、承载网络(如接入汇聚层设备)、接入节点(如数字用户线路接入复用器、光线路终端)和终端等。视频数据由节目源传输到IPTV终端过程中,由于压缩编码引起的失真以及传输信道误码产生的失真,会影响视频质量。为了保证用户接收到高质量的IPTV视频内容,需要确定引起视频质量下降的因素,对视频质量进行评价。IPTV视频业务中,进行视频质量评价,可通过在不同的网元节点进行视频服务质量监控。IPTV视频业务中的服务质量监控包括:头端视频质量监控、终端业务质量监控、网络视频质量监控和IPTV平台质量监控。
需要说明的是,本发明实施例提供的视频质量评价方法并不限定于IPTV视频业务。
本发明实施例提供的视频质量评价方法主要是针对终端业务质量监控环节，换言之，本申请涉及的视频质量评价装置可使用无参考方式部署在终端侧，布置位置合理。具体的，视频质量评价装置可以作为终端的一部分，也可以独立存在，其只需获取终端侧的传输视频流即可进行视频质量评估，而无需获取原始参考视频作为评价依据。可以理解的是，本发明涉及的终端可以是机顶盒(Set Top Box，STB)、光网络终端(optical network terminal，ONT)，或个人计算机(personal computer，PC)，本发明实施例不做限定。
视频质量评价装置进行视频质量评价的一种实现流程,可参阅图2所示,如图2所示,包括:
S101:获取视频帧的帧参数以及解码复杂度信息。
具体的,视频质量评价装置接收视频流,根据所述视频流解析得到视频帧参数,该视频帧参数包括视频帧的帧头长度、帧长以及所述视频帧所包含的视频块数。
本发明实施例中所述解码复杂度信息可以理解为终端解码所述视频帧对应的计算开销值，例如可以是熵编码、重排序、反量化和反变换，以及运动补偿、帧内预测和帧间预测各部分涉及的计算开销值之和。
S102:通过所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量。
S103:基于所述状态变量,确定所述视频帧的内容特征信息。
S104:重复执行以上步骤,直至确定出视频中全部视频帧的内容特征信息。
S105:根据所述全部视频帧的内容特征信息,进行视频质量评价。
以下将对上述各执行步骤进行详细说明。
本发明实施例中实现根据视频帧的帧参数与解码复杂度信息,提取视频内容特征信息,进行视频质量评价,可采用如下方式:
(1)对帧长与解码复杂度进行线性建模,得到所述解码复杂度的第一表达式。
解码复杂度在终端上是通过CPU占用率等参数来体现的,对于不同的设备,处理能力不同,例如相同的解码复杂度对应的CPU占用率可能不同。经过大量实验表明:视频帧的解码任务计算量即解码复杂度与视频帧的帧长之间走势一致,当帧长发生剧烈变化时,解码复杂度也发生明显变化,用C表示解码复杂度,L表示帧长,二者之间的关系建模为:
C=f(L)                                 (1)
根据相关系数的统计,解码复杂度和帧长之间的相关系数大多数情况下接近1,因此可以将f(L)定义为线性函数,即所述帧长与所述解码复杂度线性建模后得到的解码复杂度的第一表达式可满足公式
C=K×L+B                                  (2)
其中,L为视频帧的帧长,可通过解析视频流分组头信息得到。C为解码复杂度,K和B为拟合系数。
(2)通过所述帧长与所述解码复杂度,确定拟合系数K和B:
本发明实施例中,可确定终端解码所述视频帧过程中涉及的各部分解码复杂度,并对各部分解码复杂度求和得到所述解码复杂度的第二表达式,将所述第二表达式表示为所述第一表达式的形式,得到拟合系数K和B。
解码复杂度可以表征为终端解码所述视频帧过程中涉及的各部分解码复杂度之和。例如，终端对视频帧进行解码过程中可包括熵编码、重排序、反量化和反变换、以及运动补偿、帧内预测和帧间预测等部分，故可采用如下公式表示：
C=CED+CReO+CIQ&IT+CMC/IP                (3)
其中，CED为熵编码的解码复杂度，CReO为重排序的解码复杂度，CIQ&IT为反量化和反变换的解码复杂度，CMC/IP为运动补偿、帧内预测和帧间预测的解码复杂度。
其中,
CED=C0×L;
CReO=C1×N;
CIQ&IT=C2×N;
CMC/IP=C3×Lh+C4×N;
故,可得到解码复杂度的第二表达式可满足公式
C=CED+CReO+CIQ&IT+CMC/IP=(C0×L)+(C1×N)+(C2×N)+(C3×Lh+C4×N)  (4)
其中，C0为解码每个码元的平均解码复杂度，所述码元为帧长的基本单元长度，即长度计量单位、基本组成单位，N为所述视频帧所包含的视频块数，C1为重排序过程中解码每个视频块的平均解码复杂度，C2为反量化和反变换过程中解码每个视频块的平均解码复杂度，C3为运动补偿、帧内预测和帧间预测过程中所述视频帧帧头中每个码元预取数据所需平均计算复杂度，Lh为所述视频帧帧头的长度，C4为运动补偿、帧内预测和帧间预测过程中获取参考块所需的计算量和将参考块与残差相加所需计算量。
根据所述C0、C3和所述视频帧的帧头长度、帧长,确定表征所述视频帧的编码信息变化的状态变量K。根据所述C1、C2、C4和所述视频帧所包含的视频块数,确定表征所述视频帧的幅面或残差变化的状态变量B。例如可将所述第二表达式表示为所述第一表达式的形式,即将公式(4)改写为公式(2)的格式,可得到K和B满足公式
K=C0+C3×Lh/L，B=(C1+C2+C4)×N      (5)
其中，C0、C1、C2、C3和C4可通过各种平台上对大型视频序列集，针对I、P、B帧分别进行的大量实验统计得到，N值可由视频分辨率和分块大小确定，Lh/L可在视频流分组检测时得到。
视频帧的内容特征也可理解为是视频帧的复杂度,所述视频帧的复杂度包括时间复杂度和空间复杂度,空间复杂度是指单帧上的信息,时间复杂度指帧与帧之间的信息。本发明实施例中K值与C0、C3、Lh和L有关,而C0为熵编码过程中解码每个码元的平均解码复杂度,C3为运动补偿、帧内预测和帧间预测过程中涉及的解码复杂度,故K值受视频的预测模式、运动矢量等编码信息影响。B与N有关,即B与分块数有关。因此K和B可以表征视频内容特征的状态变量,K表示视频编码信息变化,主要包括预测模式、运动矢量精度及运动矢量范围,B表征视频幅面的变化,当幅面相同时,B可以表征残差的变化。
(3)基于K和B,确定视频帧的内容特征信息。
本发明实施例中,解码复杂度是线性表示的,故对所述状态变量与所述视频帧的内容特征信息进行线性建模,所述状态变量与所述视频帧的内容特征信息之间满足公式F(K,B)=αK+βB;
其中,所述F(K,B)表征为所述视频帧的内容特征信息,所述K和B为表征所述视频帧内容特征的状态变量,α和β为设定的常数,该常数依据实际情况可设置为不同的值。
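上述状态变量与内容特征信息的计算可用如下示意性代码表示（为最小示例，K、B的具体表达式为按前述由两种解码复杂度表达式对比推导整理的形式，各复杂度常数C0至C4及系数α、β均为假设取值，实际取值需按本实施例所述通过实验统计或训练得到）：

```python
def content_feature(L, Lh, N, C0, C1, C2, C3, C4, alpha=1.0, beta=1.0):
    """按 K=C0+C3×Lh/L、B=(C1+C2+C4)×N 计算状态变量,
    并按 F(K,B)=αK+βB 得到视频帧的内容特征信息。"""
    K = C0 + C3 * Lh / L        # 表征编码信息变化
    B = (C1 + C2 + C4) * N      # 表征幅面或残差变化
    return K, B, alpha * K + beta * B

# 示例: 假设的复杂度常数与帧参数(帧长L、帧头长度Lh、视频块数N)
K, B, F = content_feature(L=8000, Lh=200, N=3600,
                          C0=1.0, C1=0.5, C2=0.8, C3=2.0, C4=1.2)
```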
(4)确定全部视频帧的内容特征信息,进行视频质量评价。
视频流是由各个图像组(Group of Pictures,GOP)组成的,GOP是指从一个I帧开始到下一个I帧出现前的视频帧集合,故GOP内有一个I帧以及若干个P、B帧,I帧为参考帧,P帧和B帧为非参考帧。本发明实施例中可通过上述方法以GOP为单位,确定每个GOP内的全部视频帧的内容特征信息,根据各GOP内的全部视频帧的内容特征信息,进行视频质量评价。
本发明实施例中，根据各GOP内的全部视频帧的内容特征信息，进行视频质量评价，可采用如下方式：
首先,确定视频中各GOP的基础质量分值,例如可采用如下方式:
根据GOP的码率等参数以及GOP内的全部视频帧的内容特征信息,确定压缩失真造成的视频质量下降值,所述压缩失真造成的视频质量下降值满足公式
Qcode_loss=a1V·e^(a2V·Bitrate)+a3V·FGOP(K,B)+a4V      (6)
其中,Qcode_loss为压缩失真造成的视频质量下降值,Bitrate为码率,a1V、a2V、a3V和a4V为训练系数,可通过实际应用场景进行设定,FGOP(K,B)为GOP内的全部视频帧的内容特征信息。
基于所述压缩失真造成的视频质量下降值,确定GOP的基础质量分值,GOP的基础质量分值满足公式
Qcode=1+a-Qcode_loss;                   (7)
其中,Qcode为所述GOP的基础质量分值,1+a为所述视频能够达到的最高主观质量分值,a值依据分辨率的不同赋予不同分值。
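基础质量分值的计算过程（公式(6)(7)的形式）可示意如下，其中训练系数a1V至a4V、最高主观质量分值1+a以及指数项仅作用于码率项的写法均为示例性假设：

```python
import math

def gop_base_quality(bitrate, f_gop, a1, a2, a3, a4, a=4.0):
    """Qcode_loss = a1*e^(a2*Bitrate) + a3*F_GOP + a4;
    Qcode = 1 + a - Qcode_loss, 其中 1+a 为可达到的最高主观质量分值。"""
    q_code_loss = a1 * math.exp(a2 * bitrate) + a3 * f_gop + a4
    return (1 + a) - q_code_loss

# 假设的训练系数: a2 取负值, 码率越高压缩失真越小
q = gop_base_quality(bitrate=2000, f_gop=10.0,
                     a1=2.0, a2=-0.001, a3=0.05, a4=0.2)
```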
其次,本发明实施例中可根据各GOP的基础质量分值以及各GOP的丢包失真值,确定GOP的质量值。
所述丢包失真值可通过帧丢失情况以及帧损失情况确定,例如可分析视频流的分组头,提取帧类型、丢包等信息,得到关于帧损伤和整帧丢失的帧参数集、不同类型帧的丢包损伤比例、不同类型帧的丢失情况等。
优选的,本发明实施例中确定丢包失真值过程中,可进一步考虑各GOP内视频帧的丢包集中度的影响。所述丢包集中度用于表示设定范围内丢包发生的集中程度,对于非均匀丢包,在相同的丢包率情况下,包丢失的距离越小,表明丢包集中度越高,则对视频质量的影响越大,故本发明实施例中应用丢包集中度可使得评测丢包对视频质量评价的影响更加准确。
本发明实施例中所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度，可依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss，以及该GOP内视频帧的总丢包数N所确定，其中，在N固定的情况下，所述丢包集中度的增长趋势与Lloss的增长趋势相反，在Lloss固定的情况下，所述丢包集中度的增长趋势与N的增长趋势相同。例如，可采用如下公式确定：
Lfcous_GOP=c×N/Lloss+k      (8)
其中,c和k为待定常量。
本发明实施例中可确定GOP内由于帧丢失损失的质量分数Qframeloss以及GOP内由于帧损伤而损失的质量分数Qframedamage，利用所述丢包集中度对所述Qframeloss和所述Qframedamage修正，得到GOP的丢包失真值，例如GOP的丢包失真值可表示为f(Lfcous_GOP)×(Qframeloss+∑Qframedamage)，其中，Lfcous_GOP为GOP的丢包集中度，Qframeloss为GOP内由于帧丢失损失的质量分数，Qframedamage为由于帧损伤而损失的质量分数，f(Lfcous_GOP)为根据Lfcous_GOP制定的丢包集中度的调节函数。例如可采用f(Lfcous_GOP)=k×(Lfcous_GOP-Lthreshold)+b的表示形式，其中，k为正数，b为待定常数，Lthreshold为预设的门限值，Lfcous_GOP低于Lthreshold时，取f(Lfcous_GOP)=b。
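丢包集中度及其调节函数可用如下代码示意（丢包集中度此处采用满足上述单调性要求的一种示例形式c×N/Lloss+k，c、k及门限等常量均为假设取值）：

```python
def loss_concentration(l_loss, n, c=1.0, k=0.0):
    """丢包集中度: N 固定时随 Lloss 增大而减小, Lloss 固定时随 N 增大而增大。"""
    return c * n / l_loss + k

def f_adjust(l_fcous, l_threshold=2.0, k=0.1, b=1.0):
    """调节函数 f(Lfcous)=k×(Lfcous-Lthreshold)+b, 低于门限时取常数 b。"""
    if l_fcous < l_threshold:
        return b
    return k * (l_fcous - l_threshold) + b

lf = loss_concentration(l_loss=5, n=20)   # 丢包越集中(Lloss 越小), 取值越大
adj = f_adjust(lf)
```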
基于GOP的基础质量分值以及GOP的丢包失真值,得到GOP的质量值;
所述GOP的质量值满足公式
QGOP=Qcode-f(Lfcous_GOP)×(Qframeloss+∑Qframedamage)      (9);
其中，所述QGOP为GOP的质量值，Qcode为GOP的基础质量值，Lfcous_GOP为丢包集中度，f(Lfcous_GOP)为丢包集中度的调节函数，Qframeloss为GOP内由于丢帧损失的质量分数，Qframedamage为由于帧损伤而损失的质量分数。
整个视频流是由各GOP组成的，得到各GOP的质量值后，计算整个视频流的质量值，可以以加权的方法实现，对所述视频中各GOP的质量值进行加权，得到视频质量值。具体的，由于人眼对于较差的部分印象深刻，因此可以设置质量低的GOP的权重较大，另外持续时间较长的GOP设置的权重较大。
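对各GOP质量值加权得到视频质量值的过程可示意如下（本文未给出权重的具体形式，此处假设"质量越低、持续时间越长权重越大"的一种简单权重）：

```python
def video_quality(gop_scores, gop_durations):
    """按质量低者权重大、持续时间长者权重大的假设权重, 对各 GOP 质量值加权平均。"""
    max_q = max(gop_scores)
    weights = [(max_q - q + 1.0) * d for q, d in zip(gop_scores, gop_durations)]
    return sum(w * q for w, q in zip(weights, gop_scores)) / sum(weights)

v = video_quality([4.5, 3.0, 4.0], [2.0, 2.0, 2.0])
# 低质量 GOP 权重更大, 结果低于三者的简单平均值
```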
进一步的,在得到视频质量值之后,可利用用户观看设备和观看条件,对视频质量值进行进一步的修正,实现评价结果的设备自适应性。
不同用户的观看设备和观看条件是不同的,因此最终能反映用户观看视频质量体验的评价不仅包括评价终端接收的视频质量,还应体现不同观看设备和观看条件下的不同。利用观看设备和观看条件对视频质量值进行修正时,主要考虑的因素有:屏幕的尺寸、屏幕分辨率、视频缩放、观看距离和观看角度等。由于选取参数过多会使计算复杂,参数间的关系难以判定,计算精度难以保障,另外,部分参数难以提取,灵活性差,实用性也较差,为了简化计算并增强实用性,利用对比敏感度函数(Contrast Sensitivity Function,CSF),对屏幕尺寸、屏幕分辨率、设备类型以及视频分辨率等进行建模,并对视频质量值进行修正。
需要说明的是,本发明实施例中是以上述涉及的视频质量评价方法得到的视频质量值进行修正为例进行说明的,对于利用观看设备和观看条件对视频质量值进行修正的方法,可适用于任何视频质量评价方法得到的视频质量值,本发明实施例并不限定。
本发明实施例中可采用如下方式对视频质量值做进一步修正:
(1)将待修正的视频质量值规范化,例如规范化到普遍广泛使用的[1,5]的范围。
(2)对分辨率进行建模,获取分辨率修正因子。
本发明实施例中对视频分辨率和设备分辨率进行建模,视频分辨率用v_res表示,设备分辨率用d_res表示,所述分辨率修正因子为对视频分辨率v_res以及观看所述视频的设备分辨率d_res进行建模得到的数值。
(一)计算设备分辨率与视频分辨率间的比值
设f_ratio为观看设备分辨率与视频分辨率之间的比值,则
f_ratio=d_res/v_res;
(二)根据f_ratio,获取分辨率修正因子R:
评价视频质量评价方法的终极基准是最体现人的体验的主观方法数据,R值为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值,R值与人的主观观看分值间的关系趋势在f_ratio的不同区间上分别接近正弦函数和反向的S函数,其中,在f_ratio∈(0,1]范围内符合正弦函数的表现形式,在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式。其中,所述反向的S函数即Sigmoid函数。
故,f_ratio∈(0,1]时,取
R=sin((π/2)×f_ratio)
在f_ratio∈(1,+∞)时,R的推导过程如下:
1、Sigmoid函数:
f(x)=1/(1+e^(-x))
其中,x∈(-∞,+∞),本发明实施例中需要对横坐标限定范围,得到:
f(x)=1/(1+e^(-x))
其中x∈(-A,+A)。
2、对横坐标取反,得到:
f(x)=1/(1+e^x)
其中x∈(-A,+A)。
3、选取期望的横坐标B,B为当f_ratio大于1的情况下令视频质量值降半所对应的f_ratio值。
由于f_ratio∈(1,+∞),即横坐标值以1为左侧边界,所以将x的取值范围由[-A,A]压缩到[1-B,B-1],得到:
f(x)=1/(1+e^((A/(B-1))×x))，x∈[1-B,B-1]
4、将横坐标取值区间平移,左侧取1,得到:
f(x)=1/(1+e^((A/(B-1))×(x-B)))
其中
x∈[1,2B-1]
令f_ratio=x,R=f(x),区间左侧取开区间,向右拓展至无穷大,最终得到:
R=1/(1+e^((A/(B-1))×(f_ratio-B)))
其中f_ratio∈(1,+∞)。
最终得出分辨率修正因子:
R=sin((π/2)×f_ratio)，f_ratio∈(0,1]；R=1/(1+e^((A/(B-1))×(f_ratio-B)))，f_ratio∈(1,+∞)      (10)
其中,f_ratio为观看设备分辨率与视频分辨率之间的比值,A>2(B-1),B为当f_ratio大于1的情况下令视频质量值降半所对应的f_ratio值。
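分辨率修正因子R的分段计算可示意如下（A、B为示例取值，需满足A>2(B-1)；分段表达式为按上文推导整理的形式）：

```python
import math

def resolution_factor(f_ratio, A=10.0, B=3.0):
    """f_ratio∈(0,1] 段取正弦形式; f_ratio>1 段取压缩并平移后的反向 Sigmoid 形式。"""
    if f_ratio <= 1.0:
        return math.sin(math.pi / 2 * f_ratio)
    return 1.0 / (1.0 + math.exp(A / (B - 1) * (f_ratio - B)))

r1 = resolution_factor(1.0)   # 设备分辨率与视频分辨率一致时不做衰减
r2 = resolution_factor(3.0)   # f_ratio 等于 B 时视频质量值降半
```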
(三)利用所述分辨率修正因子,对所述视频质量值进行修正,得到修正后的视频质量值。
本发明实施例中,所述修正后的视频质量值满足公式VR=R*V,其中,VR为修正后的视频质量值,V为未进行修正的原始视频质量值。
(3)对观看设备的屏幕尺寸进行建模,得到屏幕尺寸修正因子
本发明实施例中,所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子与观看所述视频的设备屏幕尺寸之间符合线性关系。
本发明实施例中,采用如下公式(11)的方式,表述所述屏幕尺寸修正因子,即所述屏幕尺寸修正因子满足公式
M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max      (11)
其中,M为屏幕尺寸修正因子,S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸,min为S_min对应的视频质量值,max为设定观看设备屏幕的最大尺寸对应的视频质量值。
本发明实施例中,[S_min,S_base]是指设定观看设备的标准尺寸范围,可根据需要调整,对于小于S_min的尺寸,将该小于S_min的尺寸设定为S_min,对于大于S_base的尺寸,将该大于S_base的尺寸设定为S_base。
需要说明的是,对于不同种类设备,min和max可灵活调整,min的选取可依据上述设定的S_min进行设定。比如笔记本电脑的屏幕尺寸一般为11寸至17寸,智能手机的屏幕尺寸一般为3寸至5.5寸,对于同一视频在智能手机上观看效果一般不如在笔记本电脑上观看效果好,故可将笔记本电脑对应的min和max设置为大于智能手机对应的min和max。
本发明实施例中利用屏幕尺寸修正因子对视频质量修正,以及利用分辨率修正因子进行视频质量修正,可以单独使用,也可二者结合使用,本发明实施例以下以二者结合使用为例进行说明。
本发明实施例中在利用屏幕尺寸修正因子对视频质量修正和利用分辨率修正因子进行视频质量修正结合使用过程中,需先利用分辨率修正因子进行视频质量修正,然后再利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正,并且在利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正之前,需将利用分辨率修正因子修正后的视频质量值重新赋值,使得重新赋值后的视频质量值落在设定的标准范围内,以适用于利用不同标准进行视频质量评价的方法,具体实现过程如下:
(一)对修正后的视频质量值重新赋值,得到重新赋值后的视频质量值。
本发明实施例中以上述进行分辨率建模过程中将视频质量值规范化的[1,5]为依据，对于各类观看设备，进一步限定视频质量值可能的最大范围[1,MAX]，根据最小值均取1及区间映射，得到重新赋值后的视频质量值满足公式
V_R=1+(VR-1)×(MAX-1)/4
其中，V_R为重新赋值后的视频质量值，MAX为设定的视频质量值的最大值，VR为修正后的视频质量值。
(二)利用所述屏幕尺寸修正因子,对所述修正后的视频质量值再次进行修正,得到再次修正后的视频质量值。
所述再次修正后的视频质量值,满足公式V修正=M*V_R;
其中,V修正为再次修正后的视频质量值,M为屏幕尺寸修正因子,V_R为重新赋值后的视频质量值。
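重新赋值与屏幕尺寸修正的完整流程可示意如下（屏幕基准/最小尺寸及min、max分值均为假设取值，M采用线性插值并以max归一化的示例形式）：

```python
def screen_factor(s_true, s_base=15.0, s_min=3.0, q_min=2.0, q_max=5.0):
    """屏幕尺寸修正因子 M: 实际尺寸截断到 [S_min, S_base] 后线性插值,
    再除以 max 归一化为 (0,1] 内的系数。"""
    s = min(max(s_true, s_min), s_base)
    return (q_min + (q_max - q_min) * (s - s_min) / (s_base - s_min)) / q_max

def corrected_quality(v_r, max_score, s_true):
    v = 1.0 + (v_r - 1.0) * (max_score - 1.0) / 4.0   # 将 [1,5] 映射到 [1,MAX]
    return screen_factor(s_true) * v                   # V修正 = M × V_R

vq = corrected_quality(v_r=4.0, max_score=5.0, s_true=15.0)
```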
本发明实施例中涉及的视频质量评价装置,通过帧长与解码复杂度,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定所述视频帧的内容特征信息,能够全面完整的利用视频内容特征,使得视频质量评价结果更为准确,并且所述解码复杂度为终端解码视频帧对应的计算开销值,故无需进行单独解码,使用终端固有的硬解码功能,基本不增加计算开销,能够保证低计算开销,降低计算复杂度。
进一步的,本发明实施例涉及的视频质量评价装置,利用丢包集中度,能够更加准确的评价丢包对视频质量造成的影响。
进一步的,在视频质量评价过程中,对观看设备和观看条件进行建模,可实现不同场景下评价结果的自适应性。
本发明实施例以下将结合实际应用,对上述涉及的视频质量评价装置实现视频质量评价的过程进行详细说明。
图3所示为进行视频质量评价的模型框架,输入信息为视频流信息,最终输出视频质量值,例如VMOS值。
由图3可知，本发明实施例涉及的视频质量评价方法可以理解为是在基于数据分组头信息进行网络视频质量评价的分组层评价方法基础上，加入了通过解码复杂度与帧长建模并获取视频帧内容特征信息，丢包集中度建模进行视频质量评价，以及观看设备和观看条件建模对初始视频质量进行修正三个环节，在具体实施时，上述三个环节可单独实施，也可结合使用，图3仅是进行示意性说明。以下对具体的实施流程进行详细说明。
图4所示为本发明一具体实施例中视频质量评价方法的实现流程图,如图4所示包括:
S201:获取视频流。
S202a:视频流分组头检测,得到视频帧的帧长以及视频帧的帧头长度。
对获取到的视频流的数据分组头分析检测,可获取用户数据包协议(User Datagram Protocol,UDP)分组信息,通过所述UDP分组信息可以获取到视频帧的帧长以及视频帧的帧头长度等信息。
S202b:网络抓包并对抓包得到的文件进行文件解析,得到视频帧所包含的视频块数。
本发明实施例中,通过网络抓包对抓包得到的文件进行文件解析可得到包括分辨率(Resolution)以及分块大小等信息,通过所述分辨率以及分块大小确定视频帧所包含的视频块数。
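由分辨率与分块大小确定视频帧所含视频块数N的计算可示意如下（假设不足一个分块时按向上取整处理）：

```python
def blocks_per_frame(width, height, block_size=16):
    """N = ceil(width/block_size) × ceil(height/block_size)。"""
    bw = -(-width // block_size)    # 整数向上取整除法
    bh = -(-height // block_size)
    return bw * bh

n = blocks_per_frame(1920, 1080)    # 1080p、16×16 分块: 120×68 块
```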
S203:获取视频帧的解码复杂度信息,并根据视频帧的解码复杂度信息以及帧参数,得到视频帧的内容特征信息。
本发明实施例中,根据视频帧的解码复杂度信息以及帧参数,得到视频帧的内容特征信息的具体实现过程,可按照上述实施例涉及的对帧长以及解码复杂度进行建模,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定视频帧的内容特征信息的过程,在此不再赘述。
S204:确定视频中各GOP的基础质量分值。
在视频中一个I帧到下一个I帧间的部分是一个GOP,因此GOP内有一个I帧以及若干个P、B帧。根据帧类型检测,得到P帧、B帧的个数,并依据GOP内每一视频帧的参数,确定GOP的相关参数,所述GOP的相关参数例如可以是P帧、B帧的个数信息等。
本发明实施例中,以GOP为单位,对GOP内的每一帧用公式(5)表示, 即得到每一帧中表征视频内容特征的状态变量K和B。
结合每一帧中表征视频内容特征的状态变量的K、B,即可得到每一帧的内容特征信息F(K,B)。
由于以GOP为单位进行计算,GOP内有I帧、P帧和B帧,故可得到GOP内的全部视频帧的内容特征信息,GOP内的全部视频帧的内容特征信息满足公式:
FGOP(K,B)=(FI(K,B)+∑FP(K,B)+∑FB(K,B))/(1+m+n)      (12)
其中,FGOP(K,B)为所述GOP内的全部视频帧的内容特征信息,FI(K,B)为所述GOP内I帧的内容特征信息,FP(K,B)为所述GOP内P帧的内容特征信息,FB(K,B)为所述GOP内B帧的内容特征信息,m和n分别为GOP内的P、B帧数。
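GOP内全部视频帧内容特征信息的聚合可用如下代码示意（此处假设对GOP内I帧与各P、B帧共1+m+n帧的内容特征信息取平均）：

```python
def gop_content_feature(f_i, f_p_list, f_b_list):
    """由 I 帧特征 f_i、m 个 P 帧特征、n 个 B 帧特征求 GOP 的内容特征信息。"""
    total = f_i + sum(f_p_list) + sum(f_b_list)
    return total / (1 + len(f_p_list) + len(f_b_list))

f_gop = gop_content_feature(10.0, [4.0, 6.0], [2.0])   # m=2, n=1
```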
本发明实施例中可通过网络抓包并对抓包得到的文件进行解析,获取所述GOP的码率参数,根据所述GOP的码率参数以及所述GOP内的全部视频帧的内容特征信息,确定压缩失真造成的视频质量下降值。
本发明实施例中，所述压缩失真造成的视频质量下降值满足公式(6)。
基于所述压缩失真造成的视频质量下降值Qcode_loss,确定所述GOP的基础质量分值,所述GOP的基础质量分值满足公式(7)。
S205:确定GOP内的丢包集中度,并确定GOP的丢包失真值。
本发明实施例中,所述丢包集中度用于表示相应GOP范围内丢包发生的集中程度,且GOP内的丢包集中度满足公式(8)。
本发明实施例中,可基于所述丢包集中度,确定所述GOP的丢包失真值。
具体的,一个GOP内由于丢帧损失的质量分数:
Qframeloss=b1V·log(b2V·Numberframeloss·Bitrate+c)        (13)
其中，b1V、b2V、c为训练系数，Numberframeloss为丢帧数，Numberframeloss可通过丢帧事件累加得出，Bitrate为码率。
进一步的,本发明实施例中可再计算一个GOP内,由于帧损伤而损失的质量分数:
Figure PCTCN2016082223-appb-000021
其中，Qframedamage为由于帧损伤而损失的质量分数，Damageratio为帧损伤率，Damageratio可由视频帧内丢包数计算得出，γ、δ和θ为常系数，c1V和c2V为训练系数。Qcode为GOP的基础质量值，此处为Qcode设定一个下限值，根据应用情况进行设定，低于此下限值时，γQcode+δ取值为常数1。
最后，所述GOP的丢包失真值可表示为f(Lfcous_GOP)×(Qframeloss+∑Qframedamage)。
其中,Lfcous_GOP为GOP的丢包集中度,f(Lfcous_GOP)为丢包集中度的调节函数,可根据Lfcous_GOP制定。例如可采用f(Lfcous_GOP)=k×(Lfcous_GOP-Lthreshold)+b的表示形式,其中,k为正数,b为待定常数,Lthreshold为制定的门限值,Lfcous_GOP低于Lthreshold时,取f(Lfcous_GOP)=b。
S206:确定视频中各GOP的质量值。
本发明实施例中可基于所述GOP的基础质量分值以及所述GOP的丢包失真值,得到所述GOP的质量值,所述GOP的质量值满足公式(9)。
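按公式(9)计算GOP质量值的过程可示意如下（调节函数中的k、b及门限Lthreshold为示例取值）：

```python
def gop_quality(q_code, l_fcous, q_frameloss, q_framedamage_list,
                l_threshold=2.0, k=0.1, b=1.0):
    """QGOP = Qcode - f(Lfcous) × (Qframeloss + ΣQframedamage)。"""
    f = b if l_fcous < l_threshold else k * (l_fcous - l_threshold) + b
    return q_code - f * (q_frameloss + sum(q_framedamage_list))

q_gop = gop_quality(q_code=4.5, l_fcous=4.0, q_frameloss=0.5,
                    q_framedamage_list=[0.2, 0.1])
```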
S207:按照上述方法,确定视频中全部GOP的质量值,对所述视频中全部GOP的质量值进行加权,得到初始视频质量值。
整个视频是由各个GOP组成的,得到各个GOP的质量值后,计算视频中全部GOP的质量值,对所述视频中全部GOP的质量值进行加权,得到视频质量值。
S208:对分辨率和屏幕尺寸建模,对得到的初始视频质量值进行修正,得到最终视频质量值。
本发明实施例中对分辨率和屏幕尺寸进行建模,获取分辨率修正因子和屏幕尺寸修正因子,并对视频质量值进行修正的过程,可参阅上述实施例的描述,在此不再赘述。
本发明实施例中涉及的视频质量评价方法,通过帧参数与解码复杂度信息,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定所述视频帧的内容特征信息,能够全面完整的利用视频内容特征,使得视频质量评价结果更为准确,并且所述解码复杂度为终端解码视频帧对应的计算开销值,故无需进行单独解码,使用终端固有的硬解码功能,基本不增加计算开销,能够保证低计算开销,降低计算复杂度。进一步的,利用丢包集中度,能够更加准确的评价丢包对视频质量造成的影响。更进一步的,利用分辨率修正因子和屏幕尺寸修正因子对视频质量值进行进一步的修正,可实现不同场景下评价结果的自适应性。
基于上述实施例涉及的视频质量评价方法,本发明一实施例中还提供了一种视频质量评价装置100,图5所示为视频质量评价装置100的简化功能方框图,如图5所示,视频质量评价装置100包括获取单元101和处理单元102,其中:
获取单元101，用于获取视频帧的帧参数以及终端解码所述视频帧的解码复杂度信息，所述解码复杂度信息为终端解码所述视频帧对应的计算开销值。
处理单元102,用于通过所述获取单元获取的所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量,基于所述状态变量,确定所述视频帧的内容特征信息,并重复执行以上步骤,直至确定出视频中的全部视频帧的内容特征信息,根据所述全部视频帧的内容特征信息,进行视频质量评价。
视频质量评价装置100，通过帧参数与解码复杂度信息，确定表征视频帧内容特征的状态变量，并基于所述状态变量，确定所述视频帧的内容特征信息，能够全面完整的利用视频内容特征，使得视频质量评价结果更为准确，并且所述解码复杂度信息为终端解码视频帧对应的计算开销值，故无需进行单独解码，使用终端固有的硬解码功能，基本不增加计算开销，能够保证低计算开销，降低计算复杂度。
本发明实施例中,所述处理单元102通过所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量的方法可以参考方法实施例涉及的对帧长以及解码复杂度进行建模,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定视频帧的内容特征信息的过程,这里不再进行赘述。
本发明实施例中,所述处理单元102,基于所述状态变量,确定所述视频帧的内容特征信息的过程可以为:对所述状态变量与所述视频帧的内容特征信息进行线性建模,所述状态变量与所述视频帧的内容特征信息之间满足公式F(K,B)=αK+βB。
其中,所述F(K,B)表征为所述视频帧的内容特征信息,所述K和B为表征所述视频帧内容特征的状态变量,α和β为常数。
本发明实施例中涉及的视频质量评价装置,通过帧长与解码复杂度,确定表征视频帧内容特征的状态变量,并基于所述状态变量,确定所述视频帧的内容特征信息,能够全面完整的利用视频内容特征,使得视频质量评价结果更为准确,并且所述解码复杂度为终端解码视频帧对应的计算开销值,故无需进行单独解码,使用终端固有的硬解码功能,基本不增加计算开销,能够保证低计算开销,降低计算复杂度。
具体的,所述处理单元102,可采用如下方式根据所述全部视频帧的内容特征信息,进行视频质量评价,包括:
根据所述全部视频帧的内容特征信息，确定视频中各GOP的基础质量分值。获取各GOP内视频帧的丢包集中度，其中，所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度，依据各GOP内视频帧的丢包集中度，确定各GOP的丢包失真值；根据各GOP的基础质量分值以及各GOP的丢包失真值，确定所述各GOP的质量值；对视频中各GOP的质量值进行加权，得到视频质量值。
本发明实施例中，所述丢包集中度可依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss，以及该GOP内视频帧的总丢包数N所确定，其中，在N固定的情况下，所述丢包集中度的增长趋势与Lloss的增长趋势相反，在Lloss固定的情况下，所述丢包集中度的增长趋势与N的增长趋势相同。
本发明实施例利用丢包集中度,能够更加准确的评价丢包对视频质量造成的影响。
可选的,所述处理单元102,还可用于利用分辨率修正因子,对所述视频质量值进行修正。
其中,所述分辨率修正因子为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值,且所述数值在f_ratio∈(0,1]范围内符合正弦函数的表现形式,在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式。
其中,f_ratio为观看设备分辨率与视频分辨率之间的比值。
本发明实施例中,所述处理单元102,还用于利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正。
其中,所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子满足公式
M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max
其中，M为屏幕尺寸修正因子，S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸，min为S_min对应的视频质量值，max为设定观看设备屏幕的最大尺寸对应的视频质量值。
本发明实施例利用分辨率修正因子和屏幕尺寸修正因子对视频质量值进行进一步的修正,可实现不同场景下评价结果的自适应性。
本发明实施例提供的视频质量评价装置100可用于实现上述实施例涉及的视频质量评价方法,具备上述实施例实现视频质量评价过程中的所有功能,其具体实现过程可参阅上述实施例及附图的相关描述,在此不再赘述。
本发明实施例还提供一种视频质量评价装置,用于对终端的视频质量进行评价。图6所示的是本发明另一实施例提供的视频质量评价装置200的结构示意图。视频质量评价装置200采用通用计算机系统结构,包括总线,处理器201,存储器202和通信接口203,执行本发明方案的程序代码保存在存储器202中,并由处理器201来控制执行。
总线可包括一通路,在计算机各个部件之间传送信息。
处理器201可以是一个通用中央处理器(CPU),微处理器,特定应用集成电路application-specific integrated circuit(ASIC),或一个或多个用于控制本发明方案程序执行的集成电路。计算机系统中包括的一个或多个存储器,可以是只读存储器read-only memory(ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器random access memory(RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是磁盘存储器。这些存储器通过总线与处理器相连接。
通信接口203,可以使用任何收发器一类的装置,以便与其他设备或通信网络通信,如以太网,无线接入网(RAN),无线局域网(WLAN)等。
存储器202,如RAM,保存有操作系统和执行本发明方案的程序。操作系统是用于控制其他程序运行,管理系统资源的程序。
存储器202中存储的程序用于指令处理器201执行一种视频质量评价方法，包括：获取视频帧的帧参数以及终端解码所述视频帧的解码复杂度信息，所述解码复杂度信息为终端解码所述视频帧对应的计算开销值；通过所述帧参数与所述解码复杂度信息，确定表征视频帧内容特征的状态变量；基于所述状态变量，确定所述视频帧的内容特征信息；重复执行以上步骤，直至确定出视频中的全部视频帧的内容特征信息；根据所述全部视频帧的内容特征信息，进行视频质量评价。
可以理解的是,本实施例的视频质量评价装置200可用于实现上述方法实施例中涉及的所有功能,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。
本发明实施例还提供了一种计算机存储介质,用于储存上述图5或图6所述的视频评价装置所用的计算机软件指令,其包含用于执行上述方法实施例所涉及的程序。通过执行存储的程序,可以实现对视频质量的评价。
尽管在此结合各实施例对本发明进行了描述,然而,在实施所要求保护的本发明过程中,本领域技术人员通过查看所述附图、公开内容、以及所附权利要求书,可理解并实现所述公开实施例的其他变化。在权利要求中,“包括”(comprising)一词不排除其他组成部分或步骤,“一”或“一个”不排除多个的情况。单个处理器或其他单元可以实现权利要求中列举的若干项功能。相互不同的从属权利要求中记载了某些措施,但这并不表示这些措施不能组合起来产生良好的效果。
本领域技术人员应明白,本发明的实施例可提供为方法、装置(设备)、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。计算机程序存储/分布在合适的介质中,与其它硬件一起提供或作为硬件的一部分,也可以采用其他分布形式,如通过Internet或其它有线或无线电信系统。
本发明是参照本发明实施例的方法、装置(设备)和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
尽管结合具体特征及其实施例对本发明进行了描述,显而易见的,在不脱离本发明的精神和范围的情况下,可对其进行各种修改和组合。相应地,本说明书和附图仅仅是所附权利要求所界定的本发明的示例性说明,且视为已覆盖本发明范围内的任意和所有修改、变化、组合或等同物。显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。

Claims (17)

  1. 一种视频质量评价方法,其特征在于,包括:
    获取视频帧的帧参数以及终端解码所述视频帧的解码复杂度信息,所述解码复杂度信息为终端解码所述视频帧对应的计算开销值;
    通过所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量;
    基于所述状态变量,确定所述视频帧的内容特征信息;
    重复执行以上步骤,直至确定出视频中的全部视频帧的内容特征信息;
    根据所述全部视频帧的内容特征信息,进行视频质量评价。
  2. 如权利要求1所述的方法，其特征在于，所述帧参数包括视频帧的帧头长度、帧长以及所述视频帧所包含的视频块数；所述解码复杂度信息为熵编码、重排序、反量化和反变换，以及运动补偿、帧内预测和帧间预测各部分涉及的计算开销值；
    所述通过所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量,具体包括:
    获取熵编码过程中解码每个码元的平均解码复杂度C0，重排序过程中解码每个视频块的平均解码复杂度C1，反量化和反变换过程中解码每个视频块的平均解码复杂度C2，运动补偿、帧内预测和帧间预测过程中每个码元预取数据所需平均计算复杂度C3，以及运动补偿、帧内预测和帧间预测获取参考块所需的计算量和将参考块与残差相加所需计算量C4；
    根据所述C0、C3和所述视频帧的帧头长度、帧长,确定表征所述视频帧的编码信息变化的状态变量K;
    根据所述C1、C2、C4和所述视频帧所包含的视频块数,确定表征所述视频帧的幅面或残差变化的状态变量B。
  3. 如权利要求2所述的方法,其特征在于,所述表征所述视频帧的编码信息变化的状态变量K和所述表征所述视频帧的幅面或残差变化的状态变量B满足公式:
    K=C0+C3×Lh/L，B=(C1+C2+C4)×N
    其中,L为帧长,N为所述视频帧所包含的视频块数,Lh为所述视频帧帧头的长度。
  4. 如权利要求3所述的方法,其特征在于,所述基于所述状态变量,确定所述视频帧的内容特征信息,包括:
    对所述状态变量与所述视频帧的内容特征信息进行线性建模,所述状态变量与所述视频帧的内容特征信息之间满足公式F(K,B)=αK+βB;
    其中,所述F(K,B)表征为所述视频帧的内容特征信息,α和β为常数。
  5. 如权利要求1至4任一项所述的方法,其特征在于,根据所述全部视频帧的内容特征信息,进行视频质量评价,包括:
    根据所述全部视频帧的内容特征信息,确定视频中各图像组GOP的基础质量分值;其中,所述GOP是指从一个I帧开始到下一个I帧出现前的视频帧集合;
    获取各GOP内视频帧的丢包集中度,其中,所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度;
    依据所述各GOP内视频帧的丢包集中度,确定各GOP的丢包失真值;
    根据各GOP的基础质量分值以及各GOP的丢包失真值,确定所述各GOP的质量值;
    对视频中各GOP的质量值进行加权,得到视频质量值。
  6. 如权利要求5所述的方法，其特征在于，所述丢包集中度依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss，以及该GOP内视频帧的总丢包数N所确定，其中，在N固定的情况下，所述丢包集中度的增长趋势与Lloss的增长趋势相反，在Lloss固定的情况下，所述丢包集中度的增长趋势与N的增长趋势相同。
  7. 如权利要求1至6任一项所述的方法,其特征在于,所述方法还包括:
    利用分辨率修正因子,对所述视频质量值进行修正;
    其中,所述分辨率修正因子为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值,且所述数值在f_ratio∈(0,1]范围内符合正弦函数的表现形式,在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式;
    其中,f_ratio为观看设备分辨率与视频分辨率之间的比值。
  8. 如权利要求7所述的方法,其特征在于,所述方法还包括:
    利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正;
    其中,所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子满足公式
    M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max
    其中,M为屏幕尺寸修正因子,S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸,min为S_min对应的视频质量值,max为设定观看设备屏幕的最大尺寸对应的视频质量值。
  9. 一种视频质量评价装置,其特征在于,包括:
    获取单元,用于获取视频帧的帧参数以及终端解码所述视频帧的解码复杂度信息,所述解码复杂度信息为终端解码所述视频帧对应的计算开销值;
    处理单元,用于通过所述获取单元获取的所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量,基于所述状态变量,确定所述视频帧的内容特征信息,并重复执行以上步骤,直至确定出视频中的全部视频帧的内容特征信息,根据所述全部视频帧的内容特征信息,进行视频质量评价。
  10. 如权利要求9所述的装置,其特征在于,所述帧参数包括视频帧的帧头长度、帧长以及所述视频帧所包含的视频块数;
    所述解码复杂度信息为熵编码、重排序、反量化和反变换，以及运动补偿、帧内预测和帧间预测各部分涉及的计算开销值；
    所述处理单元具体采用如下方式通过所述帧参数与所述解码复杂度信息,确定表征视频帧内容特征的状态变量:
    获取熵编码过程中解码每个码元的平均解码复杂度C0，重排序过程中解码每个视频块的平均解码复杂度C1，反量化和反变换过程中解码每个视频块的平均解码复杂度C2，运动补偿、帧内预测和帧间预测过程中每个码元预取数据所需平均计算复杂度C3，以及运动补偿、帧内预测和帧间预测获取参考块所需的计算量和将参考块与残差相加所需计算量C4；
    根据所述C0、C3和所述视频帧的帧头长度、帧长,确定表征所述视频帧的编码信息变化的状态变量K;
    根据所述C1、C2、C4和所述视频帧所包含的视频块数,确定表征所述视频帧的幅面或残差变化的状态变量B。
  11. 如权利要求10所述的装置,其特征在于,所述表征所述视频帧的编码信息变化的状态变量K和所述表征所述视频帧的幅面或残差变化的状态变量B满足公式:
    K=C0+C3×Lh/L，B=(C1+C2+C4)×N
    其中,L为帧长,N为所述视频帧所包含的视频块数,Lh为所述视频帧帧头的长度。
  12. 如权利要求11所述的装置,其特征在于,所述处理单元,具体用于采用如下方式基于所述状态变量,确定所述视频帧的内容特征信息,包括:
    对所述状态变量与所述视频帧的内容特征信息进行线性建模,所述状态变量与所述视频帧的内容特征信息之间满足公式F(K,B)=αK+βB;
    其中,所述F(K,B)表征为所述视频帧的内容特征信息,α和β为常数。
  13. 如权利要求9至12任一项所述的装置,其特征在于,所述处理单元,具体用于采用如下方式根据所述全部视频帧的内容特征信息,进行视频质量评价,包括:
    根据所述全部视频帧的内容特征信息,确定视频中各图像组GOP的基础质量分值,其中,所述GOP是指从一个I帧开始到下一个I帧出现前的视频帧集合;
    获取各GOP内视频帧的丢包集中度，其中，所述丢包集中度用于表示对应GOP范围内丢包发生的集中程度；
    依据所述各GOP内视频帧的丢包集中度,确定各GOP的丢包失真值;
    根据各GOP的基础质量分值以及各GOP的丢包失真值,确定所述各GOP的质量值;
    对视频中各GOP的质量值进行加权,得到视频质量值。
  14. 如权利要求13所述的装置，其特征在于，所述丢包集中度依据对应GOP范围内的第一个丢包到最后一个丢包的距离Lloss，以及该GOP内视频帧的总丢包数N所确定，其中，在N固定的情况下，所述丢包集中度的增长趋势与Lloss的增长趋势相反，在Lloss固定的情况下，所述丢包集中度的增长趋势与N的增长趋势相同。
  15. 如权利要求9至14任一项所述的装置,其特征在于,所述处理单元,还用于:
    利用分辨率修正因子,对所述视频质量值进行修正;
    其中,所述分辨率修正因子为对视频分辨率以及观看所述视频的设备分辨率进行建模得到的数值,且所述数值在f_ratio∈(0,1]范围内符合正弦函数的表现形式,在f_ratio∈(1,+∞)范围内符合反向的S函数的表现形式;
    其中,f_ratio为观看设备分辨率与视频分辨率之间的比值。
  16. 如权利要求15所述的装置,其特征在于,所述处理单元,还用于:
    利用屏幕尺寸修正因子对利用分辨率修正因子修正后的视频质量值进行再次修正;
    其中,所述屏幕尺寸修正因子为对观看所述视频的设备屏幕尺寸进行建模得到的数值,所述屏幕尺寸修正因子满足公式
    M=[min+(max-min)×(S_true-S_min)/(S_base-S_min)]/max
    其中,M为屏幕尺寸修正因子,S_base为设定观看设备屏幕的基准尺寸、S_min为设定观看设备屏幕的最小尺寸、S_true为观看设备屏幕的实际尺寸,min为S_min对应的视频质量值,max为设定观看设备屏幕的最大尺寸对应的视频质量值。
  17. 一种视频质量评价装置,其特征在于,包括:处理器和存储器,其中,所述存储器中存有计算机可读程序;
    所述处理器通过运行所述存储器中的程序,以用于完成上述权利要求1至8所述的方法。
PCT/CN2016/082223 2015-11-18 2016-05-16 一种视频质量评价方法及装置 WO2017084256A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510793958.5A CN106713901B (zh) 2015-11-18 2015-11-18 一种视频质量评价方法及装置
CN201510793958.5 2015-11-18

Publications (1)

Publication Number Publication Date
WO2017084256A1 true WO2017084256A1 (zh) 2017-05-26

Family

ID=58717249

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/082223 WO2017084256A1 (zh) 2015-11-18 2016-05-16 一种视频质量评价方法及装置

Country Status (2)

Country Link
CN (1) CN106713901B (zh)
WO (1) WO2017084256A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062527A (zh) * 2019-12-10 2020-04-24 北京爱奇艺科技有限公司 一种视频集流量预测方法及装置
CN111639235A (zh) * 2020-06-01 2020-09-08 重庆紫光华山智安科技有限公司 一种视频录像质量检测方法、装置、存储介质及电子设备
CN113595830A (zh) * 2021-07-30 2021-11-02 百果园技术(新加坡)有限公司 一种网络丢包状态的检测方法、装置、设备及存储介质
CN114079777A (zh) * 2020-08-20 2022-02-22 华为技术有限公司 视频处理方法和装置

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109005402A (zh) * 2017-06-07 2018-12-14 中国移动通信集团甘肃有限公司 一种视频的评估方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090244289A1 (en) * 2008-03-28 2009-10-01 Deutsche Telekom Ag Audio-visual quality estimation
US20110141289A1 (en) * 2009-12-16 2011-06-16 Electronics And Telecommunications Research Institute Method and system for assessing quality of multi-level video
CN102223565A (zh) * 2010-04-15 2011-10-19 上海未来宽带技术及应用工程研究中心有限公司 一种基于视频内容特征的流媒体视频质量评估方法
CN102257831A (zh) * 2011-06-09 2011-11-23 华为技术有限公司 一种视频质量评估的方法和网络节点
CN103634594A (zh) * 2012-08-21 2014-03-12 华为技术有限公司 一种获得视频编码压缩质量的方法及装置

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101356827B (zh) * 2005-12-05 2011-02-02 英国电讯有限公司 非介入式视频质量测量
JP2009260940A (ja) * 2008-03-21 2009-11-05 Nippon Telegr & Teleph Corp <Ntt> 映像品質客観評価方法、映像品質客観評価装置、およびプログラム
EP2144449A1 (en) * 2008-07-07 2010-01-13 BRITISH TELECOMMUNICATIONS public limited company Video quality measurement
EP2296379A4 (en) * 2008-07-21 2011-07-20 Huawei Tech Co Ltd METHOD, SYSTEM AND DEVICE FOR EVALUATING A VIDEO QUALITY
CN101742353B (zh) * 2008-11-04 2012-01-04 工业和信息化部电信传输研究所 无参考视频质量评价方法
CN101448175B (zh) * 2008-12-25 2010-12-01 华东师范大学 一种无参考的流视频质量评估方法
CN101790107B (zh) * 2009-01-22 2012-10-17 华为技术有限公司 一种测量视频质量的方法、装置及系统
US10728538B2 (en) * 2010-01-11 2020-07-28 Telefonaktiebolaget L M Ericsson(Publ) Technique for video quality estimation
EP2373049A1 (en) * 2010-03-31 2011-10-05 British Telecommunications Public Limited Company Video quality measurement
CN102740108B (zh) * 2011-04-11 2015-07-08 华为技术有限公司 一种视频数据质量评估方法和装置
EP2783513A4 (en) * 2011-11-25 2015-08-05 Thomson Licensing VIDEO QUALITY ASSIGNMENT TAKING INTO ACCOUNT SCENARIO FACTORS
EP2792144B1 (en) * 2011-12-15 2017-02-01 Thomson Licensing Method and apparatus for video quality measurement
CN103379360B (zh) * 2012-04-23 2015-05-27 华为技术有限公司 一种视频质量评估方法和装置



Also Published As

Publication number Publication date
CN106713901B (zh) 2018-10-19
CN106713901A (zh) 2017-05-24

