US9203708B2 - Estimating user-perceived quality of an encoded stream - Google Patents
- Publication number
- US9203708B2 (application US13/388,818)
- Authority
- US
- United States
- Prior art keywords
- sequence
- user
- stream
- perceived quality
- category
- Prior art date
- Legal status: Active, expires (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/50—Network service management, e.g. ensuring proper service fulfilment according to agreements
- H04L41/5061—Network service management, e.g. ensuring proper service fulfilment according to agreements characterised by the interaction between service providers and their network customers, e.g. customer relationship management
- H04L41/5067—Customer-centric QoS measurements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/004—Diagnosis, testing or measuring for television systems or their details for digital television systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/647—Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
Definitions
- the present invention generally relates to objectively estimating the quality of an encoded video stream, as perceived by a user, and particularly relates to estimating such quality in dependence on the video content of that stream.
- One objective approach, for example, directly evaluates decoded video frames of a stream for certain characteristics that are modeled as being associated with user-perceived quality. Such characteristics may include the degree of motion present in the decoded video frames (e.g., jerkiness), the degree of detail present in the decoded video frames (e.g., blockiness or blurriness), and other visual artifacts present in the decoded video frames. While this objective approach proves advantageous for a number of reasons, one serious drawback remains: it cannot be used if the video stream payload is encrypted.
- packet-layer objective approaches fail to adequately account for the effect that the underlying video content of a video stream has on the user-perceived quality of that stream. Indeed, such known approaches may estimate that different video streams have the same user-perceived quality for a given bitrate, even though the streams in fact have very different qualities (e.g., differing by several MOS values) because they relate to very different video content. Such inaccurate estimation of user-perceived quality ultimately degrades that quality, since any network adjustments made based on that estimation may be less effective in improving end-to-end performance.
- One or more embodiments herein dynamically and objectively estimate user-perceived quality of an encoded video stream in dependence on the underlying video content of that stream. But rather than necessarily evaluating the underlying video content directly (e.g., after decoding the stream's video frames), the embodiments estimate the user-perceived quality based on analyzing a chronological sequence of the absolute or relative sizes of the stream's encoded video frames. Notably, this sequence may be generated from an inspection of one or more packet-layer parameters. This means that user-perceived quality may be estimated based on a stream's underlying video content and thereby improve the accuracy of the quality estimate (as compared to estimates not based on video content), even if the video stream payload is encrypted.
- embodiments herein include a method for objectively estimating user-perceived quality of an encoded video stream.
- Processing according to this method includes generating a chronological sequence of the absolute or relative sizes of encoded video frames in the stream. For convenience, this sequence may be referred to herein simply as a frame size sequence.
- Processing according to the method then entails analyzing this sequence to identify a plurality of reference characteristics that are defined in a reference model. These reference characteristics are more specifically defined in the reference model as parameters that characterize or are otherwise associated with content-dependent variations in user-perceived quality.
- Processing of the method finally includes estimating the user-perceived quality of the stream based on the identified reference characteristics, according to the reference model.
- the reference model in one or more embodiments comprises a pre-determined linear regression model that expresses the user-perceived quality of a video stream as a linear combination of two or more reference statistical measures of the frame size sequence generated for that stream.
- the reference model in one or more other embodiments comprises a pre-determined logistic regression model that expresses a category of the sequence as a linear combination of two or more of the identified reference statistical measures of the sequence.
- the estimation process may entail obtaining a first estimate of a stream's quality based on reference patterns identified in the stream's frame size sequence, and then separately obtaining a second estimate of the stream's quality based on reference statistical measures identified for the stream's sequence. These two separate estimates may then be combined to obtain a final quality estimate for the stream.
- FIG. 2 is a logic flow diagram of a method for objectively estimating user-perceived quality of an encoded video stream according to one or more embodiments.
- FIGS. 4A-4B illustrate a detailed example of embodiments that utilize reference patterns in a sequence of video frame sizes for estimating user-perceived quality according to one or more embodiments.
- FIG. 5 is a logic flow diagram of a method for objectively estimating user-perceived quality of an encoded video stream based on both reference patterns in the stream and reference statistical measures of the stream, according to one or more embodiments.
- FIG. 6 is a graphical diagram that quantitatively illustrates advantageous results that are achieved in terms of improved quality estimation, according to one or more embodiments.
- FIG. 7 is a block diagram of an apparatus configured to objectively estimate user-perceived quality of an encoded video stream according to one or more embodiments.
- FIG. 8 is a block diagram of a control circuit in an apparatus configured to objectively estimate user-perceived quality of an encoded video stream according to one or more embodiments.
- FIG. 1 depicts a communications system 10 that includes a content provider server 12 , one or more end nodes 14 , and one or more delivery networks 16 .
- the content provider server 12 encodes and otherwise prepares a video stream for delivery.
- the one or more delivery networks 16 which may include for instance a packet-data network 16 A, a wireless communications network 16 B, or the like, receive the encoded video stream from the content provider server 12 and deliver that stream to the one or more end nodes 14 .
- any one of a variety of different physical nodes within the system 10 dynamically and objectively estimates the quality of the stream, as perceived by a user of an end node 14 viewing the stream. This user-perceived quality may be dynamically estimated, for instance, as a Mean Opinion Score (MOS), based on one or more objective factors, rather than based on the actual subjective opinion of the user.
- the physical node estimating this user-perceived quality is one of the end nodes 14 to which the video stream is destined.
- the end node 14 may objectively estimate the user-perceived quality in dependence on transmission errors detected upon decoding the encoded video stream, delay in the end-to-end delivery of the stream, or any other objective factors available to the end node 14 that contribute to the quality ultimately perceived by the user of that node 14 .
- the node 14 may report the quality as feedback to the content provider server 12 , or to the delivery network 16 , for in-service quality monitoring.
- the physical node estimating the user-perceived quality is an intermediate network node 18 via which the encoded video stream is relayed to an end node 14 .
- Such an intermediate network node 18 may be included in the packet-data network 16 A (as shown in FIG. 1 ) or any other delivery network 16 serving as an intermediary between the content provider server 12 and an end node 14 .
- an intermediate network node 18 may estimate the user-perceived quality in dependence on various objective factors, but those factors are necessarily limited to those related to transmission from the content provider server 12 to the intermediate network node 18 , not from the intermediate network node 18 to an end node 14 .
- the node estimates the user-perceived quality based on analyzing a chronological sequence of the absolute or relative sizes of the stream's encoded video frames. Notably, this sequence may be generated from an inspection of one or more packet-layer parameters. This means that the node may estimate user-perceived quality based on a stream's underlying video content and thereby improve the accuracy of the quality estimate as compared to estimates not based on video content, even if the video stream payload is encrypted.
- FIG. 2 illustrates a method for objectively estimating user-perceived quality of an encoded video stream in this regard.
- processing according to this method includes generating a chronological sequence of the absolute or relative sizes of encoded video frames in the stream (Block 100 ).
- this sequence may be referred to herein simply as a frame size sequence, regardless of whether the sizes represented by the sequence are absolute sizes or relative sizes.
- Generation of the frame size sequence may entail inspecting one or more packet-layer parameters of the stream to obtain information pertaining to the sizes of the stream's encoded video frames. In such case, the sequence may then be generated based on this size information to describe the relationship over time between the sizes of the stream's frames.
- sequence may describe the absolute sizes of the stream's frames over time, may describe relative differences in the sizes of the stream's frames, or the like.
- sizes may be expressed in terms of bits, bytes, or any other fundamental unit that directly describes an amount of information.
- Reference characteristics are more specifically defined in the reference model as parameters that characterize or are otherwise associated with content-dependent variations in user-perceived quality. That is, the reference model models user-perceived quality of a stream as a function of the reference characteristics, and these reference characteristics reflect variations in the modeled quality that are due to the underlying video content of the stream.
- reference characteristics herein may comprise reference patterns in the sequence, reference statistical measures of the sequence, or any other such characteristics that are derived from an analysis of the sequence and that are parameters in the reference model for describing content-dependent variations in user-perceived quality.
- the reference characteristics thereby grossly describe relationships over time between video frames in terms of their size, without specifically distinguishing the frames in terms of their encoded frame type (i.e., I-frame, P-frame, or B-frame).
- processing of the method finally includes estimating the user-perceived quality of the stream based on the identified reference characteristics, according to the reference model (Block 120 ). Estimating user-perceived quality in this way, the method advantageously leverages the reference characteristics identified in the frame size sequence as indirect indicators of content-dependent variations in the user-perceived quality of a stream.
- the frame size sequence may be advantageously generated based on an inspection of one or more packet-layer parameters.
- One such parameter may include, for instance, a timestamp parameter, e.g., the timestamp field contained in the RTP (Real-time Transport Protocol) header.
- each video frame is delivered over the packet-data network 16 A within one or more packets.
- Each packet associated with the same video frame includes an identical timestamp parameter, because each of those packets is to be played out in the video stream at the same time.
- generating a chronological sequence of the absolute sizes of video frames may entail inspecting the timestamp parameters of packets and identifying, based on that inspection, which packets have the same timestamps.
- Generation may then include summing the payload sizes of those packets that have the same timestamps, to obtain the absolute size of any given video frame. Finally, generation may entail arranging or assembling those frame sizes in chronological playout order, as indicated by the identified timestamps.
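The timestamp-based generation described above can be sketched as follows. This is an illustrative Python sketch, not code from the patent; it assumes the stream's packets have already been parsed into (RTP timestamp, payload size) pairs, and the function name is hypothetical:

```python
from collections import defaultdict

def frame_size_sequence(packets):
    # packets: iterable of (rtp_timestamp, payload_size_bits) tuples;
    # packets sharing an RTP timestamp carry pieces of the same video
    # frame, so their payload sizes are summed to get the frame size
    sizes = defaultdict(int)
    for timestamp, payload_bits in packets:
        sizes[timestamp] += payload_bits
    # arrange the summed frame sizes in chronological playout order,
    # as indicated by the timestamps
    return [sizes[ts] for ts in sorted(sizes)]
```

For example, two packets with timestamp 100 and payloads of 400 and 300 bits contribute a single 700-bit frame to the sequence.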
- the graph in FIG. 3A plots the absolute sizes (in bits) of video frames that have been encoded according to this encoding scheme.
- the encoding scheme can be considered to effectively produce two different sequences of frame sizes, one sequence 130 from frames with an odd numbered temporal position and another sequence 140 from frames with an even numbered temporal position.
- the size of each frame in one sequence (e.g., sequence 130 ) thus generally differs from the size of a similarly positioned frame in the other sequence (e.g., sequence 140 ).
- processing for estimating user-perceived quality may employ both of the frame size sequences.
- the processing produces separate estimates of user-perceived quality based on each of the frame size sequences and then linearly combines those separate estimates to obtain a final quality estimate.
- the processing linearly combines the sequences to generate a combined frame size sequence and then produces a single estimate of user-perceived quality based on that combined sequence. Linearly combining the sequences in this way may entail, for instance, generating a combined sequence from the difference between the sequences (or more specifically from the difference between the sizes of paired frames).
- the combined sequence may be generated by adding the sequences together (or more specifically by adding together the sizes of paired frames).
- FIG. 3B illustrates the envelope 150 of a combined sequence generated according to this latter embodiment, with half the sampling rate.
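The two combination variants just described (summing or differencing paired frame sizes) might be sketched as follows; the function name and interface are illustrative assumptions:

```python
def combine_sequences(odd_seq, even_seq, mode="sum"):
    # pairs similarly positioned frames from the two sequences;
    # "sum" adds the paired sizes, "diff" subtracts them, and either
    # way the combined sequence has half the original sampling rate
    if mode == "sum":
        return [a + b for a, b in zip(odd_seq, even_seq)]
    return [a - b for a, b in zip(odd_seq, even_seq)]
```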
- One or more embodiments include still other variations to the frame size sequence generation process. Such embodiments generate a frame size sequence as just described above, but that sequence is referred to as a raw frame size sequence, since this raw sequence is further processed before being analyzed. Processing of the raw sequence may entail, for instance, segmentation and quantization as illustrated in FIG. 3C .
- one or more embodiments generate a raw frame size sequence 160 as described above.
- the embodiments then process this raw sequence 160 to obtain a processed version 170 of the raw sequence 160 .
- Processing of the raw sequence 160 as shown in FIG. 3C includes segmenting the raw sequence 160 into a plurality of segments of approximately equal size. This size may be predefined according to one or more size parameters in order to reduce the dimensionality of the raw sequence 160 .
- Processing then includes computing an average over each of these equally sized segments, and then quantizing the computed averages to obtain the processed version 170 of the raw sequence 160 .
- Quantization levels in this regard may also be predefined according to one or more parameters. Moreover, these quantization levels may be specified in terms of numerical size values or even arbitrary symbol values (which may even be, for instance, alphabetical letters). Specifying quantization levels in terms of symbol values may be particularly applicable for embodiments described below that employ reference patterns, since such patterns need not be actually represented as numerical patterns.
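A minimal sketch of the segmentation-and-quantization step follows. Mapping each segment average to the *nearest* predefined quantization level is an assumption (the patent only requires predefined levels), and representing levels by the letters A, B, C, ... mirrors the symbol-value option described above:

```python
def process_sequence(raw, segment_len, levels):
    # segment the raw sequence into equally sized segments, average
    # each segment, and quantize each average to the nearest predefined
    # level, emitting that level's symbol value (A, B, C, ...)
    symbols = []
    usable = len(raw) - len(raw) % segment_len  # drop a partial tail segment
    for i in range(0, usable, segment_len):
        avg = sum(raw[i:i + segment_len]) / segment_len
        idx = min(range(len(levels)), key=lambda j: abs(levels[j] - avg))
        symbols.append(chr(ord("A") + idx))
    return "".join(symbols)
```

With segment length 2 and levels [10, 30, 50], the raw sequence [10, 10, 30, 30, 50, 50] reduces to the symbol string "ABC".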
- estimation of user-perceived quality herein is based on analyzing the sequence to identify reference characteristics.
- These reference characteristics are parameters in a reference model for describing content-dependent variations in user-perceived quality.
- the reference characteristics that serve as parameters in the reference model comprise reference patterns.
- Reference model creation employs a plurality of reference video streams. These reference video streams have been identically encoded with the same encoding scheme, the same encoding parameters, and so on, but are known to have different video content. This ensures that any differences in the video frame sizes of the reference streams are attributable to differences in video content, not differences in encoding, and that any differences in the user-perceived qualities of those streams inherently capture content-dependent variations in quality.
- the reference model creation process thus first includes obtaining such reference streams, as well as obtaining user-perceived qualities of those reference streams that have been previously determined (e.g., empirically or otherwise).
- Model creation then includes defining different ranges of user-perceived qualities that will be modeled.
- the number of ranges to be defined may be specified by a particular input parameter to the model creation process, while the particular qualities collectively covered by the different ranges may depend on the qualities obtained for the reference streams. For example, if the qualities obtained for the reference streams vary between a MOS score of 0 and 5 and an input parameter specifies that the model is to define three ranges of qualities, the model may be created to define one range of qualities between 0 and 1.67, another range of qualities between 1.67 and 3.33, and yet another range of qualities between 3.33 and 5.
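The range-splitting step can be illustrated with a short sketch (a hypothetical helper, not from the patent):

```python
def quality_ranges(q_min, q_max, n_ranges):
    # splits the span of obtained qualities [q_min, q_max] into
    # n_ranges equally wide quality ranges
    step = (q_max - q_min) / n_ranges
    return [(q_min + i * step, q_min + (i + 1) * step) for i in range(n_ranges)]
```

For MOS scores between 0 and 5 and three ranges, this yields ranges of roughly 0 to 1.67, 1.67 to 3.33, and 3.33 to 5, matching the example above.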
- model creation includes generating frame size sequences for the reference streams, in a manner analogous to that described above. Model creation then entails analyzing those sequences in order to generally categorize the types of frame size sequences that are characteristic of video streams associated with the different quality ranges.
- the created reference model thus defines different categories of frame size sequences as being characteristic of video streams that have qualities within different ranges.
- the reference model broadly categorizes frame size sequences in this way based on patterns in those sequences. For example, in categorizing the types of frame size sequences that are associated with a particular quality range, the model creation process identifies a set of commonly occurring patterns within the frame size sequences that have been generated for reference streams with qualities known to be within that range. The model creation process then defines the patterns in this set as being so-called reference patterns for the considered category, in order to exploit the reference patterns as being characteristic of frame size sequences that belong to the category.
- the reference model defines different categories of frame size sequences as being characterized by different sets of reference patterns, and as being associated with different ranges of user-perceived qualities.
- model creation populates the set of reference patterns that is to characterize that category by evaluating the reference streams that have a quality within the quality range of the category.
- Such evaluation entails generating a frame size sequence for each of those reference streams, and then analyzing those sequences to identify patterns in the sequences.
- Each of these identified patterns serves as a candidate for being included in the set of reference patterns that characterize the category; that is, each identified pattern serves as a candidate reference pattern. But, only the most commonly identified candidate reference patterns are selected for inclusion in the set of reference patterns for the category.
- reference patterns effectively serve as parameters of the reference model. Indeed, certain reference patterns map to a particular frame size sequence category, and that category in turn maps to a particular quality range.
- This estimation process specifically entails analyzing the frame size sequence generated from the video stream in order to identify any reference patterns in that sequence. This may involve searching for any of the reference patterns defined in the reference model, without distinguishing between the different reference pattern sets of the model. In any case, estimation processing then includes determining which set of reference patterns includes the most reference patterns identified in the frame size sequence, and categorizing the sequence as belonging to the category that is characterized by the determined set of reference patterns. Finally, processing entails estimating the user-perceived quality of the stream as a function of an average quality associated with the category to which the sequence belongs.
- FIGS. 4A-4B illustrate a helpful example of both the model creation process and the estimation process according to these embodiments.
- a node that creates the reference model obtains a group G of reference streams RS 1 . . . RS 6 .
- the node also obtains previously determined user-perceived qualities Q of those streams.
- these known qualities Q comprise MOS scores and vary among the streams, with RS 1 having a MOS score of 0, RS 2 having a MOS score of 0.5, RS 3 having a MOS score of 2, RS 4 having a MOS score of 2.25, RS 5 having a MOS score of 2.5, and RS 6 having a MOS score of 1.5.
- Range 1 spans MOS scores Q greater than or equal to 0 but less than 1
- range 2 spans MOS scores Q greater than or equal to 1 but less than 2
- range 3 spans MOS scores Q greater than or equal to 2 but less than 3.
- the node then defines different categories of frame size sequences that are to be associated with the different quality ranges.
- the node defines category 1 to be associated with this quality range 1 .
- the node determines a set of reference patterns that is to characterize category 1 (i.e., set 1 ) by evaluating reference streams RS 1 and RS 2 , since those streams have MOS scores within quality range 1 .
- Such evaluation entails generating a frame size sequence for each of reference streams RS 1 and RS 2 , and then analyzing each sequence to identify patterns in the sequence as candidate reference patterns.
- the node selects the most commonly identified candidate reference patterns for inclusion in set 1 .
- each frame size sequence is segmented and quantized with symbol values as described above.
- Each frame size sequence therefore includes a series of symbol values that are simply represented by alphabetical letters A-Z.
- candidate reference patterns that are identified from analysis of the sequence for reference stream RS 1 include: ABC, BDB, CCC, CDA, EFG, and so on.
- candidate reference patterns that are identified from analysis of the sequence for reference stream RS 2 include: ABC, CBA, CCB, EFG, HIK, and so on.
- the node correspondingly determines that candidate patterns ABC and EFG have been identified in the sequence for both reference streams RS 1 and RS 2 , and are therefore the most commonly identified candidate reference patterns.
- the node only selects these candidate patterns ABC and EFG for inclusion in set 1 , thereby excluding the other candidate patterns BDB, CCC, CDA, CBA, CCB, HIK, and so on.
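The candidate-selection step of this example can be sketched as follows; the function name and the `min_streams` threshold (requiring a candidate to appear in at least that many of the category's reference streams) are illustrative assumptions:

```python
from collections import Counter

def select_reference_patterns(candidates_per_stream, min_streams=2):
    # counts each candidate pattern once per reference stream, then
    # keeps only the most commonly identified candidates, i.e. those
    # found in at least `min_streams` of the category's streams
    counts = Counter()
    for candidates in candidates_per_stream:
        counts.update(set(candidates))
    return {p for p, n in counts.items() if n >= min_streams}
```

Applied to the candidate lists of RS 1 and RS 2 above, only ABC and EFG survive, reproducing set 1.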
- the node in this example creates a reference model that defines the first category 1 of frame size sequences as being characterized by the first set 1 of reference patterns ABC and EFG, and as being associated with the first quality range 1 spanning qualities greater than or equal to 0 but less than 1.
- the node creates a reference model that defines the second category 2 of frame size sequences as being characterized by a second set 2 of reference patterns BBB and EEE, and as being associated with the second quality range 2 spanning qualities greater than or equal to 1 but less than 2.
- the reference model similarly defines the third category 3 of frame size sequences as being characterized by a third set 3 of reference patterns MNO and LMN, and as being associated with the third quality range 3 spanning qualities greater than or equal to 2 but less than 3.
- reference patterns ABC, EFG, BBB, EEE, MNO, and LMN effectively serve as parameters of the reference model, since those reference patterns map to respective frame size sequence categories 1 - 3 , which in turn map to respective quality ranges 1 - 3 .
- the model creation process described in FIG. 4A has been simplified in a number of respects for purposes of illustration.
- the process in FIG. 4A only identified patterns of a certain length (e.g., 3 symbols)
- the process may actually identify patterns of multiple different lengths.
- FIG. 4A did not describe the process as manipulating the different sets of reference patterns in any way, the process may indeed perform such manipulation according to predefined rules.
- the predefined rules may specify that any given reference pattern cannot be included in multiple different sets of reference patterns (i.e., the reference patterns must be unique across the sets).
- the predefined rules may specify that each set must be of equal size, so as to eliminate any bias between sets.
- FIG. 4B illustrates the process of using this reference model to estimate the user-perceived quality of an example video stream.
- the estimation process includes generating a frame size sequence from the video stream, in the same way that frame size sequences were generated from the reference video streams during model creation.
- the estimation process then analyzes that sequence to search for and identify any reference patterns defined as parameters in the reference model.
- this analysis identifies three different reference patterns, namely reference patterns EFG, MNO, and LMN.
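The categorization and estimation steps illustrated in FIGS. 4A-4B can be sketched as follows. The model layout (each category mapped to a pattern set and a quality range), the substring-matching rule, and the use of the range midpoint as the category's average quality are illustrative assumptions:

```python
def categorize_and_estimate(sequence, model):
    # model: category -> (set of reference patterns, (q_low, q_high));
    # the sequence joins the category whose reference-pattern set has
    # the most patterns found in the sequence, and the quality estimate
    # is that category's average (mid-range) quality
    def matches(patterns):
        return sum(1 for p in patterns if p in sequence)
    best = max(model, key=lambda cat: matches(model[cat][0]))
    q_low, q_high = model[best][1]
    return best, (q_low + q_high) / 2
```

For the example sequence containing EFG, MNO, and LMN, category 3 scores two matches against one for category 1, so the stream is estimated at the average quality of range 3.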
- the reference characteristics that serve as parameters in the reference model comprise reference statistical measures of the generated frame size sequence, rather than or in addition to reference patterns in that sequence. These reference statistical measures each comprise a single measure of some statistical attribute of the frame size sequence, computed according to a predetermined function or algorithm.
- Example reference statistical measures in this regard include the mean of the sequence, the variance of the sequence, the standard deviation of the sequence, the mode of the sequence, the median of the sequence, and the like.
- Other reference statistical measures may include a so-called longest calm period measure (defined as the longest period in the sequence over which the difference between the sizes of two consecutive video frames remains smaller than a defined ratio (e.g., 0.1) of the mean of the sequence), a longest active period measure (defined as the longest period over which the difference between the sizes of two consecutive frames remains greater than a defined ratio (e.g., 0.0025) of the mean of the sequence), a longest small period measure (defined as the longest period over which the sizes of video frames remain smaller than a defined ratio (e.g., 0.9) of the mean of the sequence), and a longest large period measure (defined as the longest period over which the sizes of video frames remain greater than the mean of the sequence).
- Still other reference statistical measures may include the number of bursts in the sequence (defined as the number of times in the sequence where the difference between the sizes of two consecutive frames is greater than a defined ratio (e.g., 0.045) of the mean of the sequence) and the number of passes through the median of the sequence (defined as the number of times the sequence goes from greater to smaller than the mean of the sequence, or vice versa).
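Two of these measures can be sketched as follows; treating the frame-size difference as an absolute difference is an assumption not stated in the source, and the function names are illustrative:

```python
def longest_calm_period(sizes, ratio=0.1):
    # longest run of consecutive-frame size differences that stay
    # smaller than ratio * mean of the sequence
    mean = sum(sizes) / len(sizes)
    longest = run = 0
    for prev, cur in zip(sizes, sizes[1:]):
        if abs(cur - prev) < ratio * mean:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest

def burst_count(sizes, ratio=0.045):
    # number of consecutive-frame size differences that exceed
    # ratio * mean of the sequence
    mean = sum(sizes) / len(sizes)
    return sum(1 for prev, cur in zip(sizes, sizes[1:])
               if abs(cur - prev) > ratio * mean)
```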
- estimation processing entails analyzing the frame size sequence to compute or otherwise identify these measures for the sequence. Processing then includes estimating the quality of the video stream as a function of those measures, according to the reference model.
- the reference model in one or more embodiments comprises a pre-determined linear regression model that expresses the user-perceived quality of a video stream as a linear combination of two or more reference statistical measures of the frame size sequence generated for that stream.
- Creation of this linear regression model may entail, for instance, computing the reference statistical measures for each of a plurality of reference video streams, and performing linear regression analysis on the results with respect to the known user-perceived qualities of those reference streams.
- the reference statistical measures thereby become the predictor (i.e., independent) variables in the linear regression model, while user-perceived quality is the response (i.e., dependent) variable.
- a linear regression model herein may express user-perceived quality Q of a video stream as a linear combination of the form Q = c0 + c1·ME + c2·LA + c3·LL − c4·LC − c5·LS − c6·NP, with non-negative coefficients ci obtained from the regression analysis.
- a linear regression model models user-perceived quality of a stream as increasing with ME, LA, and LL, and as decreasing with LC, LS, and NP.
- long calm periods represented by LC may indicate that video frames are being encoded with the maximum number of available bits, meaning that the encoder likely needs more bits per frame to achieve better quality.
- the long active periods represented by LA may indicate the converse; namely, that the encoder has enough bits per frame to achieve high quality.
- Creation of this logistic regression model may entail, for instance, computing the reference statistical measures for each of a plurality of reference video streams, and performing logistic regression analysis on the results with respect to the known user-perceived qualities of those reference streams. Such analysis may use an ordinal model for fitting, and assume that there is no interaction between different frame size sequence categories. Regardless, the reference statistical measures thereby become the predictor (i.e., independent) variables in the logistic regression model, while frame size sequence category is the response (i.e., dependent) variable.
- a logistic regression model herein may express an intercept value I for a frame size sequence category as:
- M is the mean of the sequence
- V is the variance of the sequence
- S is the standard deviation of the sequence
- MO is the mode of the sequence
- LC is the longest calm period of the sequence
- LA is the longest active period of the sequence
- LS is the longest small period of the sequence
- NP is the number of passes through the median of the sequence.
- Intercept values for the different categories may correspondingly be [1.7580, 2.9797, 4.0756, 5.3036].
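The intercept values above can be read as thresholds of a cumulative-logit (ordinal) model. The sketch below assumes the four intercepts separate five ordered categories, and takes as input a linear predictor `eta` (a weighted sum of M, V, S, MO, LC, LA, LS and NP whose coefficients are not reproduced in this excerpt):

```python
import math

# Intercept values from the description, assumed here to act as
# cumulative-logit thresholds separating five ordered categories.
INTERCEPTS = [1.7580, 2.9797, 4.0756, 5.3036]

def category_probabilities(eta):
    """Return per-category probabilities under a cumulative-logit model,
    given the linear predictor eta (coefficients omitted in the source)."""
    # P(category <= k) for each threshold I_k.
    cumulative = [1.0 / (1.0 + math.exp(-(i - eta))) for i in INTERCEPTS]
    # Differences of consecutive cumulative probabilities give the
    # probability of each individual category.
    probs = [cumulative[0]]
    probs += [b - a for a, b in zip(cumulative, cumulative[1:])]
    probs.append(1.0 - cumulative[-1])
    return probs
```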
- the reference model used for quality estimation may define both reference patterns and reference statistical measures as parameters.
- the estimation process may entail obtaining a first estimate of a stream's quality based on reference patterns identified in the stream's frame size sequence, and then separately obtaining a second estimate of the stream's quality based on reference statistical measures identified for the stream's sequence. These two separate estimates may then be combined to obtain a final quality estimate for the stream.
- estimation processing as depicted in FIG. 5 includes generating a frame size sequence for the stream (Block 200 ). Processing then includes identifying reference patterns in the sequence (Block 210 ) as well as identifying the reference statistical measures for the sequence (Block 230 ). Processing further includes determining separate estimates of a category to which the stream's frame size sequence belongs based respectively on the identified reference patterns in the sequence (Block 220 ) and on two or more of the identified reference statistical measures of the sequence (e.g., via logistic regression as shown in Block 240 ).
- processing includes combining the separate category estimates to obtain a combined category estimate and then obtaining the first quality estimate as a function of the quality associated with the combined category estimate (Block 250 ). Meanwhile, the second quality estimate is computed based solely on the reference statistical measures, e.g., via linear regression (Block 260 ). Finally, the first and second quality estimates are combined (e.g., averaged together) to obtain the final quality estimate (Block 260 ).
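The two-branch estimation of FIG. 5 can be sketched as follows. The four callables are placeholders for the pattern-matching and regression steps described above, and the rounded-mean combination of category estimates and the simple averaging of quality estimates are illustrative choices, not the only ones the description permits:

```python
def estimate_quality(frame_sizes,
                     category_from_patterns,    # Blocks 210-220 (placeholder)
                     category_from_measures,    # Blocks 230-240 (placeholder)
                     quality_for_category,      # category -> associated quality
                     quality_from_regression):  # linear regression branch
    """Sketch of the combined estimation process of FIG. 5."""
    # First branch: combine the two category estimates (here: rounded mean)
    # and map the combined category to its associated quality.
    cat_a = category_from_patterns(frame_sizes)
    cat_b = category_from_measures(frame_sizes)
    combined_category = round((cat_a + cat_b) / 2)
    first_estimate = quality_for_category(combined_category)

    # Second branch: quality directly from the statistical measures.
    second_estimate = quality_from_regression(frame_sizes)

    # Final estimate: e.g., the average of the two branch estimates.
    return (first_estimate + second_estimate) / 2
```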
- estimation processing may include generating first and second frame size sequences that respectively comprise a sequence of absolute frame sizes and a sequence of relative frame sizes. These frame size sequences may be analyzed separately to obtain separate quality estimates for the stream, which are then combined to produce a final estimate.
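Generating the two sequence types might look like the following. The excerpt does not define "relative" frame size; normalizing each size by the sequence mean is one plausible interpretation and is an assumption of this sketch:

```python
def absolute_sequence(frame_sizes):
    """Chronological sequence of absolute encoded-frame sizes."""
    return list(frame_sizes)

def relative_sequence(frame_sizes):
    """Chronological sequence of relative frame sizes; here each size is
    normalized by the sequence mean (an assumed interpretation)."""
    mean = sum(frame_sizes) / len(frame_sizes)
    return [s / mean for s in frame_sizes]
```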
- the model creation process may be correspondingly adjusted to model user-perceived quality with respect to multiple different types of frame size sequences.
- the reference model may actually model user-perceived quality differently for different video stream bitrates. Such may entail for instance defining different reference patterns for different bitrates, different linear or logistic regression expressions (e.g., coefficients) for different bitrates, or the like.
- the graph in FIG. 6 illustrates these embodiments while also demonstrating the advantages achieved herein by estimating user-perceived quality in dependence on video content.
- MOS scores for a video stream vary as a function of the stream's bitrate.
- As illustrated by line 300, known approaches to objectively estimating MOS scores based on packet-layer parameters model two different video streams with different video content as having the same MOS scores as a function of bitrate, simply because they are identically encoded.
- By contrast, lines 310 and 330 illustrate that embodiments herein model those two video streams as having different MOS scores as a function of bitrate, because they have different video content. Modeling MOS scores in this way indeed proves advantageous, as the model produces MOS score estimates that are closer to the actual MOS scores 340 and 350 for those streams (e.g., by as much as half a MOS point).
- This apparatus 20 may be, for instance, an end node 14 that receives the video stream or may be an intermediate network node 18 .
- the apparatus 20 includes one or more interfaces 22 and one or more processing circuits 24 .
- the one or more interfaces 22 use known signal processing techniques, typically according to one or more communication standards, for communicatively coupling the apparatus 20 to the content provider server 12 (directly or indirectly via other nodes) for receiving the video stream.
- the one or more interfaces 22 may also be configured to format digital data and condition a communication signal, from that data, for transmission over a communications link.
- the one or more processing circuits 24 are configured to extract digital data from the one or more interfaces 22 for processing, and to generate digital data for transmission over the one or more interfaces 22 . More particularly, the one or more processing circuits 24 comprise one or several microprocessors 26 , digital signal processors, and the like, as well as other digital hardware 28 and memory circuit 30 .
- Memory 30, which may comprise one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc., stores program code 32 for executing one or more data communications protocols and for carrying out one or more of the techniques described herein. Memory 30 further stores program data 34 and user data 36 for carrying out such techniques, and also stores various parameters and/or other program data for controlling the operation of the apparatus 20.
- FIG. 8 presents a more generalized view of a control circuit 40 configured to carry out the method shown in FIG. 2 .
- This control circuit 40 may have a physical configuration that corresponds directly to processing circuits 24 , for example, or may be embodied in two or more modules or units. In either case, the control circuit 40 includes one or more modules or sub-circuits to carry out operations in accordance with the method in FIG. 2 . These one or more units are pictured in FIG. 8 as a perceived quality estimator 42 .
- the perceived quality estimator 42 is configured to generate a chronological sequence of the absolute or relative sizes of encoded video frames in a received video stream. The estimator 42 is then configured to analyze that sequence to identify a plurality of reference characteristics that are defined in a reference model as parameters associated with content-dependent variations in user-perceived quality. Finally, the estimator 42 is configured to estimate the user-perceived quality of the stream based on the identified reference characteristics, according to the reference model.
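The estimator's three configured operations can be sketched as a small class. The class and method names are illustrative, and the reference model is a placeholder object supplying the characteristic identification and quality mapping described above:

```python
class PerceivedQualityEstimator:
    """Sketch of perceived quality estimator 42 (FIG. 8); the reference
    model object is a placeholder for the model described in the text."""

    def __init__(self, reference_model):
        self.reference_model = reference_model

    def generate_sequence(self, encoded_frames):
        # Chronological sequence of encoded-frame sizes (absolute sizes here).
        return [len(frame) for frame in encoded_frames]

    def identify_characteristics(self, sequence):
        # Identify the reference characteristics that the model defines as
        # parameters (reference patterns and/or statistical measures).
        return self.reference_model.characteristics(sequence)

    def estimate(self, encoded_frames):
        # Estimate user-perceived quality from the identified
        # characteristics, according to the reference model.
        sequence = self.generate_sequence(encoded_frames)
        characteristics = self.identify_characteristics(sequence)
        return self.reference_model.quality(characteristics)
```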
- the techniques herein may be implemented in any one of a number of different physical nodes in the system 10. If implemented in an end node 14, the techniques can be implemented at the hardware level or the software level (e.g., as an API in the OS), with or without standardization. Video player developers could also include the techniques in their decoders/depacketizers, although this might be unnecessary since the decoder would then have access to the decoded and decrypted payload, which could be used to directly account for content-dependent quality variations. Updating the reference model can be done via an OS update, software update, or the like.
Abstract
Description
where ME is the median of the stream's frame size sequence, LC is the longest calm period of the sequence, LA is the longest active period of the sequence, LS is the longest small period of the sequence, LL is the longest large period of the sequence, and NP is the number of passes through the median of the sequence.
Claims (24)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/388,818 US9203708B2 (en) | 2011-09-26 | 2012-01-16 | Estimating user-perceived quality of an encoded stream |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161539124P | 2011-09-26 | 2011-09-26 | |
US13/388,818 US9203708B2 (en) | 2011-09-26 | 2012-01-16 | Estimating user-perceived quality of an encoded stream |
PCT/SE2012/050031 WO2013048300A1 (en) | 2011-09-26 | 2012-01-16 | Estimating user-perceived quality of an encoded video stream |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130081071A1 (en) | 2013-03-28 |
US9203708B2 (en) | 2015-12-01 |
Family
ID=47912733
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/388,818 Active 2032-10-07 US9203708B2 (en) | 2011-09-26 | 2012-01-16 | Estimating user-perceived quality of an encoded stream |
Country Status (1)
Country | Link |
---|---|
US (1) | US9203708B2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9674516B2 (en) * | 2014-11-25 | 2017-06-06 | Echostar Technologies L.L.C. | Systems and methods for picture quality monitoring |
CN107360431A (en) * | 2017-06-30 | 2017-11-17 | 武汉斗鱼网络科技有限公司 | A kind of determination methods and device of frame type |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150163271A1 (en) * | 2011-12-22 | 2015-06-11 | Telefonaktiebolaget L M Ericsson (Publ) | Apparatus and method for monitoring performance in a communications network |
US11871052B1 (en) * | 2018-09-27 | 2024-01-09 | Apple Inc. | Multi-band rate control |
WO2021256969A1 (en) * | 2020-06-15 | 2021-12-23 | Telefonaktiebolaget Lm Ericsson (Publ) | Method and system of a communication network for estimating bitrate of an encrypted video stream |
Citations (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6011868A (en) * | 1997-04-04 | 2000-01-04 | Hewlett-Packard Company | Bitstream quality analyzer |
US6285797B1 (en) | 1999-04-13 | 2001-09-04 | Sarnoff Corporation | Method and apparatus for estimating digital video quality without using a reference video |
WO2004008780A1 (en) | 2002-07-17 | 2004-01-22 | Koninklijke Philips Electronics N.V. | A method and apparatus for measuring the quality of video data |
US20070133608A1 (en) * | 2005-05-27 | 2007-06-14 | Psytechnics Limited | Video quality assessment |
EP1804519A1 (en) | 2004-10-18 | 2007-07-04 | Nippon Telegraph and Telephone Corporation | Video quality objective evaluation device, evaluation method, and program |
EP2077672A1 (en) | 2007-08-22 | 2009-07-08 | Nippon Telegraph and Telephone Corporation | Video quality estimation device, video quality estimation method, frame type judgment method, and recording medium |
US20090273676A1 (en) * | 2008-04-24 | 2009-11-05 | Psytechnics Limited | Method and apparatus for measuring blockiness in video images |
US20100110199A1 (en) | 2008-11-03 | 2010-05-06 | Stefan Winkler | Measuring Video Quality Using Partial Decoding |
US20100125871A1 (en) * | 2008-11-14 | 2010-05-20 | Google Inc. | Video play through rates |
US20100265334A1 (en) | 2009-04-21 | 2010-10-21 | Vasudev Bhaskaran | Automatic adjustments for video post-processor based on estimated quality of internet video content |
EP2252073A1 (en) | 2008-03-21 | 2010-11-17 | Nippon Telegraph and Telephone Corporation | Method, device, and program for objectively evaluating video quality |
WO2011048829A1 (en) | 2009-10-22 | 2011-04-28 | 日本電信電話株式会社 | Video quality estimation device, video quality estimation method, and video quality estimation program |
WO2011082719A1 (en) | 2010-01-11 | 2011-07-14 | Telefonaktiebolaget L M Ericsson (Publ) | Technique for video quality estimation |
US20110176060A1 (en) * | 2010-01-21 | 2011-07-21 | Qualcomm Incorporated | Data feedback for broadcast applications |
US20110222669A1 (en) | 2008-11-13 | 2011-09-15 | Luca Buriano | Method for estimating the quality of experience of a user in respect of audio and/or video contents distributed through telecommunications networks |
WO2011134113A1 (en) | 2010-04-30 | 2011-11-03 | Thomson Licensing | Method and apparatus for assessing quality of video stream |
US20120099793A1 (en) * | 2010-10-20 | 2012-04-26 | Mrityunjay Kumar | Video summarization using sparse basis function combination |
US20130016770A1 (en) * | 2010-04-02 | 2013-01-17 | Panasonic Corporation | Wireless communication device and wireless communication method |
US20130057705A1 (en) * | 2011-09-02 | 2013-03-07 | Verizon Patent And Licensing Inc. | Video quality scoring |
US8396323B2 (en) * | 2008-04-24 | 2013-03-12 | Psytechnics Limited | Method and apparatus for measuring blockiness in video images |
US20130111538A1 (en) * | 2010-07-05 | 2013-05-02 | Mitsubishi Electric Corporation | Video quality management system |
US20130208814A1 (en) * | 2010-07-30 | 2013-08-15 | Deutsche Telekom Ag | Methods and apparatuses for temporal synchronisation between the video bit stream and the output video sequence |
US20130265445A1 (en) * | 2010-12-10 | 2013-10-10 | Deutsche Telekom Ag | Method and apparatus for assessing the quality of a video signal during encoding and transmission of the video signal |
Patent Citations (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6011868A (en) * | 1997-04-04 | 2000-01-04 | Hewlett-Packard Company | Bitstream quality analyzer |
US6285797B1 (en) | 1999-04-13 | 2001-09-04 | Sarnoff Corporation | Method and apparatus for estimating digital video quality without using a reference video |
WO2004008780A1 (en) | 2002-07-17 | 2004-01-22 | Koninklijke Philips Electronics N.V. | A method and apparatus for measuring the quality of video data |
US20040012675A1 (en) * | 2002-07-17 | 2004-01-22 | Koninklikje Philips Electronics N. V. Corporation. | Method and apparatus for measuring the quality of video data |
EP1804519A1 (en) | 2004-10-18 | 2007-07-04 | Nippon Telegraph and Telephone Corporation | Video quality objective evaluation device, evaluation method, and program |
US20070133608A1 (en) * | 2005-05-27 | 2007-06-14 | Psytechnics Limited | Video quality assessment |
EP2077672A1 (en) | 2007-08-22 | 2009-07-08 | Nippon Telegraph and Telephone Corporation | Video quality estimation device, video quality estimation method, frame type judgment method, and recording medium |
EP2252073A1 (en) | 2008-03-21 | 2010-11-17 | Nippon Telegraph and Telephone Corporation | Method, device, and program for objectively evaluating video quality |
US20090273676A1 (en) * | 2008-04-24 | 2009-11-05 | Psytechnics Limited | Method and apparatus for measuring blockiness in video images |
US8396323B2 (en) * | 2008-04-24 | 2013-03-12 | Psytechnics Limited | Method and apparatus for measuring blockiness in video images |
US20100110199A1 (en) | 2008-11-03 | 2010-05-06 | Stefan Winkler | Measuring Video Quality Using Partial Decoding |
US20110222669A1 (en) | 2008-11-13 | 2011-09-15 | Luca Buriano | Method for estimating the quality of experience of a user in respect of audio and/or video contents distributed through telecommunications networks |
US20100125871A1 (en) * | 2008-11-14 | 2010-05-20 | Google Inc. | Video play through rates |
US20100265334A1 (en) | 2009-04-21 | 2010-10-21 | Vasudev Bhaskaran | Automatic adjustments for video post-processor based on estimated quality of internet video content |
WO2011048829A1 (en) | 2009-10-22 | 2011-04-28 | 日本電信電話株式会社 | Video quality estimation device, video quality estimation method, and video quality estimation program |
EP2493205A1 (en) | 2009-10-22 | 2012-08-29 | Nippon Telegraph And Telephone Corporation | Video quality estimation device, video quality estimation method, and video quality estimation program |
WO2011082719A1 (en) | 2010-01-11 | 2011-07-14 | Telefonaktiebolaget L M Ericsson (Publ) | Technique for video quality estimation |
US20110176060A1 (en) * | 2010-01-21 | 2011-07-21 | Qualcomm Incorporated | Data feedback for broadcast applications |
US20130016770A1 (en) * | 2010-04-02 | 2013-01-17 | Panasonic Corporation | Wireless communication device and wireless communication method |
WO2011134113A1 (en) | 2010-04-30 | 2011-11-03 | Thomson Licensing | Method and apparatus for assessing quality of video stream |
US20130044224A1 (en) * | 2010-04-30 | 2013-02-21 | Thomson Licensing | Method and apparatus for assessing quality of video stream |
US20130111538A1 (en) * | 2010-07-05 | 2013-05-02 | Mitsubishi Electric Corporation | Video quality management system |
US20130208814A1 (en) * | 2010-07-30 | 2013-08-15 | Deutsche Telekom Ag | Methods and apparatuses for temporal synchronisation between the video bit stream and the output video sequence |
US20120099793A1 (en) * | 2010-10-20 | 2012-04-26 | Mrityunjay Kumar | Video summarization using sparse basis function combination |
US20130265445A1 (en) * | 2010-12-10 | 2013-10-10 | Deutsche Telekom Ag | Method and apparatus for assessing the quality of a video signal during encoding and transmission of the video signal |
US20130057705A1 (en) * | 2011-09-02 | 2013-03-07 | Verizon Patent And Licensing Inc. | Video quality scoring |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9674516B2 (en) * | 2014-11-25 | 2017-06-06 | Echostar Technologies L.L.C. | Systems and methods for picture quality monitoring |
US9854232B2 (en) | 2014-11-25 | 2017-12-26 | Echostar Technologies L.L.C. | Systems and methods for picture quality monitoring |
CN107360431A (en) * | 2017-06-30 | 2017-11-17 | 武汉斗鱼网络科技有限公司 | A kind of determination methods and device of frame type |
CN107360431B (en) * | 2017-06-30 | 2019-08-02 | 武汉斗鱼网络科技有限公司 | A kind of judgment method and device of frame type |
Also Published As
Publication number | Publication date |
---|---|
US20130081071A1 (en) | 2013-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5302342B2 (en) | Method, apparatus and system for evaluating the quality of a video code stream | |
US8902782B2 (en) | Method of determining video quality | |
JP5172440B2 (en) | Video quality estimation apparatus, method and program | |
Yang et al. | Bitstream-based quality assessment for networked video: a review | |
US9723329B2 (en) | Method and system for determining a quality value of a video stream | |
RU2589474C2 (en) | Device and method for monitoring performance in communication network | |
US9203708B2 (en) | Estimating user-perceived quality of an encoded stream | |
US10541894B2 (en) | Method for assessing the perceived quality of adaptive video streaming | |
Yang et al. | Content-adaptive packet-layer model for quality assessment of networked video services | |
Bentaleb et al. | Data-driven bandwidth prediction models and automated model selection for low latency | |
US12069122B2 (en) | System and method for managing video streaming quality of experience | |
JP4308227B2 (en) | Video quality estimation device, video quality management device, video quality estimation method, video quality management method, and program | |
Li et al. | Real‐Time QoE Monitoring System for Video Streaming Services with Adaptive Media Playout | |
Kumar et al. | Quality of experience driven rate adaptation for adaptive HTTP streaming | |
Hu et al. | A new approach for packet loss measurement of video streaming and its application | |
Zhang et al. | Towards influence of chunk size variation on video streaming in wireless networks | |
JP5390369B2 (en) | Video quality estimation apparatus and method, coding bit rate estimation apparatus and method, and program | |
Botia Valderrama et al. | Nonintrusive method based on neural networks for video quality of experience assessment | |
García-Pineda et al. | Adaptive SDN-based architecture using QoE metrics in live video streaming on Cloud Mobile Media | |
EP2745518B1 (en) | Estimating user-perceived quality of an encoded video stream | |
JP4668851B2 (en) | Quality class determination apparatus, quality class determination method, and program | |
Hosek et al. | On predicting video quality expectations of mobile users | |
Xu | Improving ABR video streaming design with systematic QoE measurement and cross layer analysis | |
Vega et al. | Cognitive real-time QoE management in video streaming services | |
Klein et al. | Evaluation of video quality monitoring based on pre-computed frame distortions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL), SWEDEN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TANO, RICHARD;LINDEGREN, DAVID;SIGNING DATES FROM 20120116 TO 20120117;REEL/FRAME:027649/0932 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD., CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TELEFONAKTIEBOLAGET LM ERICSSON (PUBL);REEL/FRAME:049149/0276 Effective date: 20190403 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |