WO2018044338A1 - Quantization parameter reporting for video streaming - Google Patents

Quantization parameter reporting for video streaming

Info

Publication number
WO2018044338A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
value
representations
circuitry
representation
Application number
PCT/US2016/060257
Other languages
French (fr)
Inventor
Ozgur Oyman
Original Assignee
Intel IP Corporation
Application filed by Intel IP Corporation
Publication of WO2018044338A1 publication Critical patent/WO2018044338A1/en

Classifications

    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/23439 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements, for generating different versions
    • H04N21/26258 Content or additional data distribution scheduling for generating a list of items to be played back in a given order, e.g. playlist, or scheduling item distribution according to such list
    • H04N21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/85406 Content authoring involving a specific file format, e.g. MP4 format

Definitions

  • Embodiments of the present disclosure generally relate to the field of networks, and more particularly, to apparatuses, systems, and methods for quantization parameter reporting for video streaming through the networks.
  • HTTP: hypertext transfer protocol
  • TCP: transmission control protocol
  • IP: Internet protocol
  • DASH: dynamic adaptive streaming over HTTP
  • 3GPP: Third Generation Partnership Project
  • TS: Technical Specification
  • MPEG: Moving Picture Experts Group
  • ISO: International Organization for Standardization
  • IEC: International Electrotechnical Commission
  • an MPD metadata file provides information on the structure and the different versions of the media content representations stored in a server.
  • the information may include, for example, different bit rates, frame rates, resolutions, codec types, etc.
  • the MPD may include information on initialization and media segments for the media engine to ensure mapping of segments into a media presentation timeline for switching and synchronous presentation with other representations. Based on this MPD metadata information, which describes the relation of the segments and how they form a media presentation, clients request the segments using HTTP GET or partial GET methods.
  • the client may fully control the streaming session, for example, the client may manage the on-time request and smooth play out of the sequence of segments, potentially adjusting bit rates or other attributes to, for example, react to changes of the device state or the user preferences.
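  • as an illustrative, non-normative sketch, the client-driven session control above might look like the following Python; the MPD URL, the namespace handling, and the segment-URL extraction are simplified assumptions rather than part of this disclosure:

```python
# Minimal sketch of a client controlling a streaming session via HTTP GET:
# fetch the MPD, extract segment URLs, then request segments in order.
import urllib.request
import xml.etree.ElementTree as ET

MPD_URL = "https://media.example.com/stream/manifest.mpd"  # hypothetical URL
DASH_NS = "{urn:mpeg:dash:schema:mpd:2011}"

def http_get(url: str) -> bytes:
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def segment_urls(mpd_xml: bytes) -> list:
    # a full client would walk Period/AdaptationSet/Representation and
    # resolve BaseURLs; here we only collect SegmentURL@media attributes
    root = ET.fromstring(mpd_xml)
    return [el.get("media") for el in root.iter(DASH_NS + "SegmentURL")]

def handle_segment(segment: bytes) -> None:
    pass  # stand-in for handing the segment to the media engine

def play() -> None:
    mpd = http_get(MPD_URL)            # initial HTTP GET for the manifest
    for url in segment_urls(mpd):      # on-time request of each segment
        handle_segment(http_get(url))  # HTTP GET (or partial GET via Range)
```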
  • Figure 1 illustrates an adaptive streaming system according to some embodiments.
  • Figure 2 illustrates a media presentation description file according to some embodiments.
  • Figure 3 illustrates a QP metadata file according to some embodiments.
  • Figure 4 illustrates an example of an extensible markup language syntax of common group and representation attributes and elements according to some embodiments.
  • Figure 5 illustrates an example operation flow/algorithmic structure of a user equipment according to some embodiments.
  • Figure 6 illustrates an example operation flow/algorithmic structure of a user equipment according to some embodiments.
  • Figure 7 illustrates an example operation flow/algorithmic structure of a server according to some embodiments.
  • Figure 8 illustrates an example operation flow/algorithmic structure of a server according to some embodiments.
  • Figure 9 illustrates a computer system according to some embodiments.
  • phrases “A or B,” “A and/or B,” and “A/B” mean (A), (B), or (A and B).
  • Figure 1 illustrates an adaptive streaming system 100 in accordance with some embodiments.
  • the system 100 may include a server 104 and a user equipment (“UE”) 108.
  • the system 100 may provide a DASH-based streaming framework; however, the concepts described herein may equally apply to other adaptive streaming frameworks.
  • the server 104 may be a web server, a media server, or a metadata management server. In some embodiments, operations described as being provided by the server 104 may be provided by one or more different devices that are networked together.
  • the server 104 may receive audio/video input 112 that is to be streamed.
  • the audio/video input 112 may be sourced locally from the server 104 or provided to the server 104 from another device, for example, a remote media content server.
  • the server 104 may include encoding circuitry 116 and segmentation circuitry 120.
  • the encoding circuitry 116 may receive the audio/video input 112 and encode the media content using any of a number of codecs, such as those standardized for video and audio compression by the International Organization for Standardization/International Electrotechnical Commission (“ISO/IEC”) Moving Picture Experts Group (“MPEG”). In other embodiments, other video/audio codecs may be used.
  • the encoding circuitry 116 may provide the encoded media content to the segmentation circuitry 120, which may split the input media into a series of segments that may then be provided to a streaming host circuitry 124 that is to control delivery of the adaptive streaming information.
  • the streaming host circuitry 124 may control lower-level communication circuitry (“CC”) 126.
  • the streaming host circuitry 124 may provide communication processing at the Internet layer (for example, Internet protocol (“IP”)) and above, while the communication circuitry 126 provides communication processing at a link layer and below.
  • the encoding circuitry 116 or streaming host circuitry 124 may also generate a metadata file, for example, an MPD file, that may be used to convey an index of each segment and associated metadata information about the encoded content. If the encoding circuitry 116 generates the MPD file, it may provide the MPD file to the streaming host circuitry 124.
  • when the encoding circuitry 116 encodes the media content, it may also record a video quantization parameter (“QP”) value that is associated with the corresponding encoded media content.
  • the video QP value may indicate how much a particular set of media content was quantized during the encoding process. In general, the higher the video QP value, the greater the amount of quantization performed during the encoding process.
  • the encoding circuitry 116 may also provide the QP information, for example, the video QP value, to the streaming host circuitry 124. As described in further detail below, the QP information may be included in the MPD file or may be in a metadata file that is separate from the MPD file.
  • the streaming host circuitry 124 may control delivery of the adaptive streaming information, for example, the encoded segments, the MPD file, and the QP information, to the user equipment 108 using HTTP protocols over one or more networks such as, for example, a wireless access network, a core network, an IP network, a public network, etc.
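  • purely as a sketch of the server-side flow just described (encode, record the video QP, hand both to the streaming host), with all class and field names invented for illustration:

```python
# Toy model: the encoder records an average QP per encoded section, and the
# streaming host indexes it for inclusion in the MPD or a QP metadata file.
from dataclasses import dataclass, field

@dataclass
class EncodedSection:
    segment_id: str
    media: bytes
    avg_qp: int  # average video QP recorded while encoding this section

@dataclass
class StreamingHost:
    sections: dict = field(default_factory=dict)
    qp_index: dict = field(default_factory=dict)

    def publish(self, section: EncodedSection) -> None:
        self.sections[section.segment_id] = section.media
        self.qp_index[section.segment_id] = section.avg_qp

def encode(segment_id: str, raw: bytes) -> EncodedSection:
    # a real codec derives QP during rate control; 28 is a placeholder
    return EncodedSection(segment_id, media=raw, avg_qp=28)

host = StreamingHost()
host.publish(encode("seg-001", b"\x00" * 16))
print(host.qp_index)  # {'seg-001': 28}
```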
  • circuitry may refer to, be part of, or include any combination of integrated circuits (for example, a field-programmable gate array (“FPGA”) or an application-specific integrated circuit (“ASIC”)), discrete circuits, or other hardware components that provide the described functionality.
  • circuitry may execute one or more software or firmware modules to provide the described functions.
  • circuitry may include logic, at least partially operable in hardware.
  • the UE 108 may include communication circuitry ("CC") 130 that provides communication processing that complements the communication processing provided by communication circuitry 126.
  • the UE 108 may also include streaming client circuitry 128 that provides communication processing that complements the communication processing provided by the streaming host circuitry 124.
  • the streaming client circuitry 128 may manage the request, delivery, and reception of the adaptive streaming content at the Internet, Transport, and Application layers.
  • the streaming client circuitry 128 may implement, control, or otherwise utilize a web browser at an Application layer to facilitate operations of the streaming client circuitry 128.
  • the streaming client circuitry 128 may transmit an HTTP GET request to the streaming host circuitry 124 to request streaming of media content corresponding to the audio/video input 112.
  • the streaming host circuitry 124 may respond by transmitting the MPD file to the streaming client circuitry 128. While it is described that the streaming client circuitry 128 transmits messages to and receives messages from the streaming host circuitry 124, it will be understood that the streaming client circuitry 128 will cooperate with or otherwise control the communication circuitry 130 to effectuate the transmission/reception of the described messages. Similarly, the streaming host circuitry 124 will cooperate with or otherwise control the communication circuitry 126 to effectuate the transmission/reception of the described messages.
  • the streaming client circuitry 128 may use the file to determine the index of each segment and associated metadata information. Subsequently, the streaming client circuitry 128 may transmit a uniform resource locator ("URL") in, for example, an HTTP GET URL request, to obtain a specific segment. For example, the streaming client circuitry 128 may transmit a first HTTP GET URL request that corresponds to a request for a first segment ("Segment 1"). The streaming host circuitry 124 may then respond with Segment 1. The streaming client circuitry 128 may continue to request desired segments to control the adaptive streaming session.
  • the streaming client circuitry 128 may provide the received segments to decoding circuitry 132, which may decode the encoded segment and provide the decoded segments to media player circuitry 136.
  • the media player circuitry 136 may control output circuitry 140 to render the decoded segments as video on display device 144 or audio on or by audio device 148.
  • Video quality associated with the adaptive streaming session may be monitored in a number of ways.
  • video quality may be determined based on quality metrics such as, but not limited to, video multi-scale structural similarity (“MS-SSIM”), video mean opinion score (“MOS”), video quality metrics (“VQM”), structural similarity metrics (“SSIM”), peak signal-to-noise ratio (PSNR), perceptual evaluation of video quality metrics (“PEVQ”), etc.
  • Video resolution, frame rate, encoding bit rate, etc. may play an important role in video quality.
  • the video QP values may also play an important role because different video segments may have very different content characteristics and may yield a wide range of video quality even when encoded at the same bit rate.
  • Table 1, which corresponds to Table 4-10 of 3GPP TR 26.909, shows quality and video QP values (or, simply, "QP") of different video clips encoded at approximately the same bit rate:

    Video clip       Bit rate (kbps)   QP     PSNR (dB)   SSIM    MOS
    tractor          1456              32.2   32.6        0.917   3.7
    frontend         1429              23.7   30.8        0.946   3.9
    pedestrianarea   1463              27.6   35.7        0.944   3.6
    speedbag         1455              25.8   38.5        0.971   4.2
    sunflower        1460              25.9   38.2        0.975   4.4
  • the MOS, which may be a subjective video quality metric, may still vary over a wide range due to differences in content characteristics, which may be well captured by the QP value.
  • Table 2, which corresponds to Table 4-11 of 3GPP TR 26.909, shows the correlation of the bit rate/QP and video quality scores.
  • for videos encoded at the same bit rate, the video MOS has a strong correlation with the video QP value, which may provide an indication of the video content complexity.
  • embodiments of the present disclosure provide a QP-aware streaming process in which the streaming client circuitry 128 can adaptively choose the content that best matches the display capabilities of the display device 144. This may facilitate delivery of the streaming media content with a satisfactory video quality (for example, video MOS). Furthermore, providing the QP information at the session/HTTP level as described, rather than relying on feedback from the codec processing level at the decoding circuitry 132, may facilitate timely and efficient determination of video quality metrics by, for example, a video MOS estimation engine. Various embodiments also describe mechanisms to perform video QP-aware streaming and content adaptation at the streaming client circuitry 128 based on quality characteristics determined for different configurations of video QP.
  • the QP information generated by the encoding circuitry 116 may further include quality metrics corresponding to different video QP configurations.
  • the streaming client circuitry 128 may use this information to perform rate adaptation and bit rate selection to improve quality for its chosen set of video QP parameters.
  • the streaming client circuitry 128 may transmit to the server 104, via the communication circuitry 130, a desired video QP configuration for the display device 144.
  • the server 104 may process the feedback information of the desired video QP configuration and perform transcoding to generate new content, for example, new DASH content, with the desired video QP configuration, which may then be provided or otherwise made available to the user equipment 108 as described elsewhere herein.
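  • the feedback path might be sketched as below; the endpoint path and JSON field are invented here, since the disclosure does not fix a message format for reporting the desired video QP configuration:

```python
# Hypothetical UE-to-server feedback of a desired video QP configuration;
# the server could transcode new DASH content in response.
import json
import urllib.request

def report_desired_qp(server_url: str, desired_qp: int) -> None:
    body = json.dumps({"desired_qp": desired_qp}).encode("utf-8")
    req = urllib.request.Request(
        server_url + "/qp-feedback",  # invented endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)  # server may transcode and update the MPD
```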
  • Figure 2 illustrates an MPD file 200 in accordance with some embodiments.
  • MPD file 200 may be a manifest file for an HTTP adaptive stream that defines hierarchical levels that include information about encoded portions of media content available on a server, for example, server 104, for adaptive streaming.
  • the top-level of the MPD file 200 may include a number of periods (including, for example, period 204 and period 208) that describe a part of the media content with a start and a stop time. Each period may include one or more adaptation sets (for example, adaptation set 212 and adaptation set 216).
  • An adaptation set may be a set of interchangeable encoded versions of one or several media content components for a given period. For example, a period may have one adaptation set that provides video content and one or more other adaptation sets that provide audio content (possibly in one or more languages). In another example, a first adaptation set may provide high definition video content while a second adaptation set provides standard definition video content. For each period, one or more of the available adaptation sets may be selected by a UE based on, for example, user preferences, device capabilities, etc.
  • Individual adaptation sets may include one or more representations that represent a collection and encapsulation of one or more media streams and a delivery format associated with descriptive metadata.
  • the media streams of various representations may be formatted in various ways.
  • adaptation set 212 may include representations 220 and 224, each of which may be associated with different screen sizes, bandwidths, codecs, etc.
  • the streaming client circuitry for example, streaming client circuitry 128, may select between the different representations based on client preferences, network conditions, etc.
  • the streaming client circuitry 128 may switch between different representations of an adaptation set in a dynamic manner during a streaming session to maintain, for example, a desired video quality.
  • Individual representations may include sub-representations that include information that is specific to a particular media stream for the entire period.
  • the representation 220 may include sub-representations 228 and 232 each associated with a distinct set of information such as, for example, codecs, sampling rates, embedded subtitles, etc. for the period 204.
  • Individual representations may include segments that represent the unit of data associated with an HTTP URL and, optionally, a byte range that is specified by the MPD file 200.
  • the representation 220 may include segments 236 and 240.
  • the segments 236 and 240 may be individually requested by the HTTP GET requests transmitted by the streaming client circuitry 128.
  • the segments may also have sub-segments that are units of the segments that are associated with the specific information of the corresponding sub-representation.
  • the segment 236 may include a sub-segment 244 that is associated with the information of the sub-representation 228 and may also include a sub-segment 248 that is associated with the information of the sub-representation 232.
  • QP information 252 may be provided at an adaptation set level, a representation level, or a sub-representation level.
  • the QP information may be provided directly in adaptation set 212, as shown in Figure 2.
  • Providing the QP information 252 at the representation or sub-representation levels may also be employed in a direct manner by including the QP information 252 in, for example, the representation 220, sub-representation 228, or sub-representation 232.
  • the QP information 252 is shown with a dotted-line border in Figure 2 to indicate that the illustrated locations for the QP information 252 are options.
  • Embodiments may include the QP information 252 in any one or combination of the various options.
  • the QP information 252 may be provided at a plurality of granularity levels.
  • first-level QP information may be provided with respect to a first hierarchical level (for example, an adaptation set or a representation) and second-level QP information may be provided with respect to a second hierarchical level (for example, a representation or a sub-representation).
  • adaptation set 212 may include QP information 252 that represents an average QP value for all the encoded content in the adaptation set 212.
  • Further QP information may be associated with lower levels, for example, representations 220 and 224 or sub-representations 228 and 232.
  • the lower-level QP information (at, for example, the sub-representations 228 and 232) may be the same as or different from the higher-level QP information (at, for example, the adaptation set 212). This may, for example, allow a UE to identify a second video QP value for a sub-representation that is different from a first video QP value for an adaptation set that the sub-representation is in, and request sub-segments of the sub-representation having the second video QP value.
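  • a small sketch of reading QP at two hierarchy levels follows; the "qp" attribute name and the sample document are assumptions for illustration (the disclosure defers the exact syntax to Table 3 and Figure 4):

```python
# Reading an adaptation-set-level QP and comparing it against finer-grained
# sub-representation QPs, mirroring the multi-granularity signaling above.
import xml.etree.ElementTree as ET

MPD_SAMPLE = """
<MPD><Period>
  <AdaptationSet qp="30">
    <Representation id="r1" bandwidth="1500000">
      <SubRepresentation id="s1" qp="26"/>
      <SubRepresentation id="s2" qp="34"/>
    </Representation>
  </AdaptationSet>
</Period></MPD>
"""

root = ET.fromstring(MPD_SAMPLE)
aset = root.find("./Period/AdaptationSet")
set_qp = int(aset.get("qp"))  # average QP for the whole adaptation set
for sub in aset.iter("SubRepresentation"):
    sub_qp = int(sub.get("qp"))
    if sub_qp != set_qp:  # finer-grained QP differs from the set average
        print(f"sub-representation {sub.get('id')}: QP {sub_qp} vs set {set_qp}")
```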
  • QP information may be provided in a file that is separate from the manifest file.
  • Figure 3 illustrates a QP metadata file 300 in accordance with some embodiments.
  • the QP metadata file 300 may have an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project (“3GPP”) file format.
  • the QP metadata file 300 may have video QP information embedded into a timed metadata track linked to a corresponding metadata file, for example, MPD file 200.
  • the video QP metadata may include a vectorized set of video QP values that describe video QP across various content sections.
  • the QP metadata file 300 may include references to one or more periods that are defined in the MPD file 200. For individual periods, the QP metadata file 300 may include one or more video QP values associated with a corresponding section of encoded content.
  • the QP metadata file 300 shows periods 1-n, which may correspond to periods 204-208 of the MPD file 200. Each of the periods may have i encoded content sections.
  • the QP metadata file 300 may then indicate that, for period 1, video QP values (1) 304 correspond to encoded content (1) 308, and video QP values (i) 312 may correspond to encoded content (i) 316; ... ; for period n, video QP values (1) 320 correspond to encoded content (1) 324, ... , and video QP values (i) 328 correspond to encoded content (i) 332.
  • the QP metadata file 300 may associate video QP values with encoded content for each of the periods of a corresponding manifest file in a similar manner.
  • the encoded content sections of the QP metadata file 300 may correspond to a hierarchical level of a manifest file.
  • an encoded content section may be provided for each of one or more adaptation sets, representations, or sub-representations.
  • the encoded content sections may be associated with a different or finer granularity.
  • the encoded content sections may correspond to one or more video frames of encoded content or media time intervals.
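  • an in-memory sketch of the structure of QP metadata file 300 (per period, QP-value vectors keyed to encoded content sections); the field names are invented, and a real file would use a timed metadata track as described above:

```python
# Toy model of the QP metadata file: periods map to (section, QP vector) pairs.
from dataclasses import dataclass

@dataclass
class QpEntry:
    section_id: str   # e.g., an adaptation set, representation, or frame range
    qp_values: list   # vectorized video QP values across the section

qp_metadata = {
    "period-1": [QpEntry("content-1", [24, 26, 25]),
                 QpEntry("content-i", [30, 31])],
    "period-n": [QpEntry("content-1", [28]),
                 QpEntry("content-i", [22, 23, 22])],
}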
  • the manifest file may be configured with an indication of the existence of the metadata file.
  • a codecs parameter of the MPD may be provided with a dedicated new value that may be used to indicate the presence of video QP information in a corresponding metadata file.
  • the UE upon receiving the indication, may then send a separate request for the metadata file.
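  • the indication-and-fetch step could be as simple as the following sketch; the "qpmd" token is purely an invented stand-in for the dedicated codecs-parameter value mentioned above:

```python
# Check a hypothetical token in the MPD @codecs parameter; if present, the UE
# would issue a separate HTTP GET for the QP metadata file.
def qp_metadata_advertised(codecs_attr: str) -> bool:
    return "qpmd" in codecs_attr.split(",")

if qp_metadata_advertised("avc1.640028,qpmd"):
    print("QP metadata file advertised; request it with a separate HTTP GET")
```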
  • the QP information provided to the streaming client circuitry 128 may include a variety of QP information that pertains to the encoded content section with which it is associated.
  • the QP information may include a single value that is associated with the entire encoded content section.
  • the single value may be an average (for example, a mean, a median, or a modal average) QP value over the encoded content section.
  • the average may be a mean luma QP averaged over a duration of the encoded content section.
  • the QP information may additionally or alternatively include maximum or minimum QP values over the duration of the encoded content section.
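  • the single-value summaries named above (mean, median, mode, minimum, maximum) can be computed directly from per-frame luma QPs, as in this sketch with illustrative input values:

```python
# Summarizing per-frame luma QPs into the single-value QP information
# that may be reported for an encoded content section.
from statistics import mean, median, multimode

frame_luma_qps = [24, 26, 26, 28, 31, 26, 29]  # hypothetical per-frame QPs

qp_summary = {
    "mean": mean(frame_luma_qps),      # mean luma QP over the section duration
    "median": median(frame_luma_qps),
    "mode": multimode(frame_luma_qps)[0],
    "min": min(frame_luma_qps),
    "max": max(frame_luma_qps),
}
print(qp_summary)
```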
  • information in addition to, but associated with, the QP information may be provided.
  • the QP information associated with an encoded content section may also include corresponding quality information. This quality information may be used to signal quality variations in a more granular fashion to enable more dynamic quality-based adaptations by the UE.
  • minimum/maximum quality attributes may be signaled at an adaptation set level.
  • a minimum quality attribute may be used to specify a minimum quality value in all representations of an adaptation set for each given video QP value; and a maximum quality attribute may be used to specify a maximum quality value in all representations in the adaptation set for each given video QP configuration.
  • the minimum and maximum quality values may quantify the minimum and maximum quality levels over a specified timeline that may correspond to a period, segment, or sub-segment. In some embodiments, these values may indicate long-term (or average) minimum and maximum quality measurements over an entire duration of an adaptation set.
  • vectorized sets of quality values may be provided to specify the minimum and maximum quality levels for the adaptation set across different segments and sub-segments. The signaling may be done such that, for each video QP value, a corresponding set of quality metrics may be provided.
  • the quality/QP information may be used by the streaming client circuitry 128 to maintain a high quality of experience ("QoE") for a user.
  • QoE quality of experience
  • the streaming client circuitry 128 may receive the manifest/metadata file and identify an adaptation set that best matches its display capabilities in terms of video QP values towards delivering satisfactory video quality.
  • the streaming client circuitry 128 may then request segments and sub-segments corresponding to the various DASH representations, taking into account the video QP information as well as the corresponding quality information in the manifest/metadata file.
  • the streaming client circuitry 128 may switch across different representations over random access points (also known as segment access points in DASH) by continuously tracking bandwidth, quality, central processor unit (“CPU”) load, etc., in an effort to optimize user QoE.
  • the streaming client circuitry 128 may use the quality information in the manifest/metadata file to decide which representation to switch to, and when/where the switching should occur for the highest QoE.
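  • one way the quality/QP information could drive a switching decision is sketched below; the representation list, the minimum-quality figures, and the "prefer lowest QP" policy are assumptions, not requirements of the disclosure:

```python
# Pick a representation whose bit rate fits the measured bandwidth and whose
# signaled minimum quality meets the target, preferring the least quantization.
from dataclasses import dataclass

@dataclass
class Rep:
    rep_id: str
    bitrate_kbps: int
    qp: int
    min_quality: float  # e.g., a signaled minimum MOS for this representation

def pick_representation(reps, bandwidth_kbps, target_quality):
    feasible = [r for r in reps
                if r.bitrate_kbps <= bandwidth_kbps
                and r.min_quality >= target_quality]
    return min(feasible, key=lambda r: r.qp, default=None)

reps = [Rep("low", 800, 36, 3.2), Rep("mid", 1500, 30, 3.8),
        Rep("high", 3000, 24, 4.3)]
print(pick_representation(reps, bandwidth_kbps=2000, target_quality=3.5).rep_id)
# -> "mid": "high" exceeds bandwidth, "low" misses the quality target
```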
  • Table 3 lists common group and representation attributes and elements:
  • @profiles (O): specifies the Media Presentation profiles as described in 7.3. The value shall be a subset of the respective value in any higher level of the document hierarchy (Representation, Adaptation Set, MPD). If not present, the value is inferred to be the same as in the next higher level of the document hierarchy. For example, if the value is not present for a Representation, then @profiles at the Adaptation Set level is valid for the Representation.
  • @frameRate (O): specifies the output frame rate of the video media type in the Representation. If the frame rate is varying, the value is the average frame rate over the entire duration of the Representation. The value is coded as a string, either containing two integers separated by a "/" ("F/D") or a single integer "F"; the frame rate is the division F/D, or F, respectively, per second (i.e., the default value of D is "1").
  • @audioSamplingRate (O): specifies either a single decimal integer value for the sampling rate or a whitespace-separated pair of decimal integer values specifying the minimum and maximum sampling rate of the audio media component type. The values are in samples per second.
  • @codecs (O): specifies the codecs parameter specifying the media types. The codecs parameter shall also include the profile and level information where applicable.
  • @maximumSAPPeriod (O): when present, specifies the maximum SAP interval in seconds of all contained media streams, where the SAP interval is the maximum time interval between the TSAP of any two successive SAPs of types 1 to 3 inclusive of one media stream in the associated Representations.
  • a Media Segment starts with a SAP in a media stream if the stream contains a SAP in that Media Segment, ISAU is the index of the first access unit that follows ISAP, and ISAP is contained in the Media Segment.
  • ... pictures are intra coded). When not present, there may or may not be coding dependency between access units.
  • AudioChannelConfiguration (0...N): specifies the audio channel configuration of the audio media component type.
  • ContentProtection (0...N): specifies information about the use of content protection for the associated Representations.
  • EssentialProperty (0...N): specifies information about the containing element that is considered essential by the Media Presentation author for processing the containing element.
  • SupplementalProperty (0...N): specifies supplemental information about the containing element that may be used by the DASH client to optimize the processing.
  • in the table, an "O" may indicate the attribute is optional, while an "M" may indicate the attribute is mandatory. For elements, the use column may be in the form <minOccurs>...<maxOccurs>, where "N" indicates an unbounded value.
  • an example of an extensible markup language (“XML”) syntax of common group and representation attributes and elements, such as those described above with respect to Table 3, is shown in Figure 4.
  • a video QP attribute, shown as “QP” in Figure 4, may be provided with an unsigned integer data type to indicate that the video QP value may be specified by a numeric value without a fractional component.
  • the integer that represents the video QP value may be a 32-bit integer in some embodiments.
  • Figure 5 illustrates an example operation flow/algorithmic structure 500 of the UE 108 according to some embodiments.
  • the operation flow/algorithmic structure 500 may include, at 504, receiving a manifest file.
  • the manifest file may be received by streaming client circuitry 128.
  • the manifest file which may correspond to an HTTP adaptive stream, may define hierarchical levels that include information about encoded portions of media content available for adaptive streaming on a server such as, for example, server 104.
  • the manifest file may be an MPD file such as, for example, MPD file 200.
  • the manifest file may be received as a result of an original request for media content such as, for example, an HTTP GET request transmitted by the streaming client circuitry 128.
  • the operation flow/algorithmic structure 500 may further include, at 508, selecting a video QP value that corresponds to at least a portion of the encoded media content represented in the manifest file received at 504.
  • the selecting of the video QP value may be performed by the streaming client circuitry 128.
  • the video QP value may be selected from a plurality of video QP values that respectively correspond to a plurality of sections of the encoded media content.
  • the plurality of video QP values, and their respective correspondences, may be provided to the UE in the manifest file received at 504 or embedded in a metadata file such as, for example, QP metadata file 300.
  • the streaming client circuitry 128 may select the video QP value of interest based on display capabilities of the display device 144 and a desired video quality. For example, streaming client circuitry 128 may select the video QP value from the plurality of video QP values based on a determination that content encoded with the selected video QP value can be decoded and rendered on the display device 144 in a manner to meet the desired video quality.
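  • the selection at 508 might be modeled as follows; the (QP, height, predicted MOS) triples and the capability check are invented for illustration:

```python
# Select the least-quantized option that the display can render and that is
# predicted to meet the desired video quality.
def select_qp(candidates, display_max_height, desired_mos):
    # candidates: (qp, height, predicted_mos) tuples from the manifest/metadata
    ok = [c for c in candidates
          if c[1] <= display_max_height and c[2] >= desired_mos]
    return min(ok, key=lambda c: c[0], default=None)

print(select_qp([(36, 720, 3.1), (30, 1080, 3.9), (24, 2160, 4.4)],
                display_max_height=1080, desired_mos=3.5))
# -> (30, 1080, 3.9): the 2160-line option exceeds the display's capability
```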
  • the operation flow/algorithmic structure 500 may further include, at 512, requesting a hierarchical level based on the video QP value.
  • the requesting at 512 may be performed by the streaming client circuitry 128 transmitting an HTTP GET URL message to streaming host circuitry 124.
  • the hierarchical level may be an adaptation set, a representation, or a sub-representation.
  • the operation flow/algorithmic structure 500 may further include, at 516, receiving and decoding content corresponding to the requested hierarchical level.
  • the receiving at 516 may be performed by the streaming client circuitry 128 and the decoding at 516 may be performed by the decoding circuitry 132.
  • the decoding circuitry 132 may, during the decoding operations, record QP values.
  • the recorded QP values may be fed back to the server 104 if certain conditions are met. For example, if the QP value determined at the decoding operation differs from the QP value that was provided in the QP metadata, the user equipment 108 may feed back an indication of the mismatch.
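  • the mismatch feedback could be as small as this sketch; the report callback stands in for whatever HTTP feedback message an implementation chooses:

```python
# Compare the QP observed during decoding against the advertised QP metadata
# and report only on mismatch, as described above.
def check_qp_mismatch(advertised_qp: int, decoded_qp: int, report) -> None:
    if decoded_qp != advertised_qp:
        report({"advertised_qp": advertised_qp, "decoded_qp": decoded_qp})

check_qp_mismatch(28, 31, report=print)  # stand-in for an HTTP feedback call
```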
  • the media player circuitry 136 may control the output circuitry 140, including the display device 144 and the audio device 148, to render the media content.
  • Figure 6 illustrates an example operation flow/algorithmic structure 600 of the UE 108 according to some embodiments.
  • the operation flow/algorithmic structure 600 may include, at 604, receiving a manifest file.
  • the receiving of the manifest file at 604 may be similar to the receiving of the manifest file at 504 as discussed above with respect to Figure 5.
  • the operation flow/algorithmic structure 600 may further include, at 608, identifying video QP values for individual adaptation sets of one or more adaptation sets in the manifest file for a selected period.
  • the identifying at 608 may be performed by the streaming client circuitry 128.
  • the video QP values may be identified directly from the manifest file.
  • the video QP values may be included in the manifest file.
  • the video QP values may be identified from a QP metadata file that corresponds to the manifest file.
  • the manifest file may include an indication of a presence of the corresponding QP metadata file.
  • the UE 108 may transmit a request for the QP metadata file upon learning of its existence through the manifest file.
  • the operation flow/algorithmic structure 600 may further include, at 612, requesting first encoded content for a selected period having a first video QP value.
  • the requesting at 612 may be performed by the streaming client circuitry 128.
  • the first encoded content may correspond to a first adaptation set, representation, etc.
  • the first encoded content may be selected based on a determination that the UE 108 is capable of decoding and rendering content of the first encoded content on the display device 144 in a manner to meet the desired video quality.
  • the operation flow/algorithmic structure 600 may further include, at 616, dynamically switching streaming between different encoded portions of the media content to maintain a desired video quality value.
  • the dynamically switching at 616 may be performed by the streaming client circuitry 128 in order to maintain a desired level of video quality.
  • the first encoded content may correspond to an adaptation set.
  • the streaming client circuitry 128 may initially request a first representation of the adaptation set. While decoding and rendering the content of the first representation, the streaming client circuitry 128 may make a determination that the first representation fails, or has a high probability of failing, to meet the desired video quality. This may be the result of changing network conditions (for example, interference, loading, etc.), central processor unit (“CPU”) load, or other dynamic factors.
  • the streaming client circuitry 128 may issue a request to receive the content of the second representation.
  • the second representation may include the same video QP value (given that the representation is within the adaptation set and, therefore, is associated with the video QP value that is associated with the adaptation set) but have other parameters that differ, for example, different bit rates, frame rates, resolutions, codec types, quality information, etc.
  • the second representation may have a QP value that is different from the first representation. This may be the case when general QP information (for example, overall average video QP value) is provided with respect to the adaptation set and more specific QP information is provided with respect to the different representations.
  • the first encoded content may correspond to a granularity less than the adaptation set, for example, a representation, a sub-representation, one or more video frames, etc.
  • the first encoded content may correspond to a first representation having a first video QP value.
  • the streaming client circuitry 128 may select an adaptation set, which may or may not be selected based on a video QP value associated with the adaptation set, and may subsequently select a first representation based on a representation-level video QP value.
  • the streaming client circuitry 128 may request a second representation.
  • the second representation may be associated with a different representation-level video QP value.
  • Figure 7 illustrates an example operation flow/algorithmic structure 700 of the server 104 according to some embodiments.
  • the operation flow/algorithmic structure 700 may include, at 704, encoding content sections of an HTTP adaptive stream.
  • the encoding at 704 may be performed by encoding circuitry 116 to compress audio/video input 112 using one or more video or audio codecs.
  • the operation flow/algorithmic structure 700 may further include, at 708, recording video QP values associated with the encoded content sections.
  • the recording of the video QP values may be performed by encoding circuitry 116 or streaming host circuitry 124.
  • the operation flow/algorithmic structure 700 may further include, at 712, generating a manifest file.
  • the manifest file as described above, may define hierarchical levels that include information about the encoded content that is available for adaptive streaming.
  • the video QP values may be recorded in the manifest file.
  • the video QP values may be recorded in a metadata file that is associated with the manifest file.
  • the generating of the manifest file at 712 may be performed by the encoding circuitry 116 or the streaming host circuitry 124.
  • the operation flow/algorithmic structure 700 may further include, at 716, transmitting the manifest file and video QP values.
  • the streaming host circuitry 124 may perform the transmitting at 716, by controlling the communication circuitry 126.
  • the manifest file and the video QP values may be transmitted separately.
  • the manifest file may be originally transmitted to the UE 108. If the UE 108 sends an additional request for the QP metadata file, the streaming host circuitry 124 may respond by transmitting the video QP values in the QP metadata file.
  • Figure 8 illustrates an example operation flow/algorithmic structure 800 of the server 104 according to some embodiments.
  • the operation flow/algorithmic structure 800 may include, at 804, transmitting video QP metadata.
  • the transmitting of the video QP metadata may be performed by the streaming host circuitry 124 controlling the communication circuitry 126.
  • the video QP metadata may apply to a byte range for a plurality of representations of an HTTP adaptive stream.
  • the video QP metadata may include a vectorized set of video QP values that describe video QP across different video frames or different media time intervals.
  • the vectorized set of video QP values may be provided in a timed metadata track of a file having an ISO base media file format or 3GPP file format.
  • the operation flow/algorithmic structure 800 may further include, at 808, receiving a request for segments.
  • the streaming host circuitry 124 may receive the request at 808 from the UE 108.
  • the segments may be within the byte range and in a representation selected by the UE 108 from a plurality of representations in a manifest file.
  • the request may be received as an HTTP request with a URL identifying the specifically requested segments.
  • the operation flow/algorithmic structure 800 may further include, at 812, transmitting the requested segments.
  • the streaming host circuitry 124 may transmit the requested segments by controlling the communication circuitry 126.
  • Figure 9 is a block diagram illustrating components, according to some example embodiments, able to read instructions from a machine-readable or computer- readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein (for example, the techniques described with respect to operation flow/algorithmic structures of Figures 5-8).
  • Figure 9 shows a diagrammatic representation of computer system 900 including one or more processors (or processor cores) 910, one or more computer-readable media 920, and one or more communication resources 930, each of which is communicatively coupled via one or more interconnects 940.
  • the processors 910 may include one or more central processing units (“CPUs”), reduced instruction set computing (“RISC”) processors, complex instruction set computing (“CISC”) processors, graphics processing units (“GPUs”), digital signal processors (“DSPs”) such as a baseband processor, application-specific integrated circuits (“ASICs”), radio-frequency integrated circuits (“RFICs”), etc.
  • the processors 910 may include a processor 912 and a processor 914.
  • the computer-readable media 920 may be suitable for use to store instructions 950 that cause the computer system 900, in response to execution of the instructions 950 by one or more of the processors 910, to practice selected aspects of the present disclosure.
  • the computer-readable media 920 may be non-transitory.
  • computer-readable storage medium 920 may include instructions 950.
  • the instructions 950 may be programming instructions or computer program code configured to enable the computer system 900, which may be implemented as the UE 108 or the server 104, in response to execution of the instructions 950, to implement (aspects of) any of the methods or elements described throughout this disclosure related to adaptive video streaming.
  • programming instructions 950 may be disposed on computer-readable media 920 that is transitory in nature, such as signals.
  • the computer-readable media 920 may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
  • more specific examples of the computer-readable media include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (“RAM”), read-only memory (“ROM”), an erasable programmable read-only memory (for example, EPROM, EEPROM, or Flash memory), an optical fiber, a portable compact disc read-only memory (“CD-ROM”), an optical storage device, transmission media such as those supporting the Internet or an intranet, or a magnetic storage device.
  • the computer-usable or computer-readable media could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
  • a computer-usable or computer-readable media may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the computer-usable media may include a propagated data signal with the computer-usable program code embodied therewith, either in baseband or as part of a carrier wave.
  • the computer-usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency, etc.
  • Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • the instructions 950 may reside, completely or partially, within at least one of the processors 910 (e.g., within the processor's cache memory), the computer-readable media 920, or any suitable combination thereof. Furthermore, any portion of the instructions 950 may be transferred to the processors 910 from any combination of the peripheral devices 904 and/or the databases 906. Accordingly, the memory of processors 910, the peripheral devices 904, and the databases 906 are additional examples of computer-readable media.
  • the communication resources 930 may include interconnection and/or network interface components or other suitable devices to communicate with one or more peripheral devices 904 and/or one or more remote devices, for example, databases 906, via a network 908.
  • the communication resources 930 may include wired communication components (e.g., for coupling via a Universal Serial Bus (“USB”)), cellular communication components, Near Field Communication (“NFC”) components, Bluetooth® components (e.g., Bluetooth® Low Energy), Wi-Fi® components, and other suitable communication components.
  • the communication resources 930 may include a cellular modem to communicate over a cellular network, an Ethernet controller to communicate over an Ethernet network, etc.
  • one or more components of the computer system 900 may be included as a part of the UE 108 or the server 104 described with respect to Figure 1, or one or more components of the UE 108 or server 104 described with respect to Figure 1 may be included as a part of the computer system 900.
  • encoding circuitry 116, segmentation circuitry 120, streaming host circuitry 124, or communication circuitry 126 may include processors 910, computer-readable media 920, or communication resources 930 to facilitate operations described above with respect to the server 104.
  • output circuitry 140 may include processors 910, computer-readable media 920, or communication resources 930 to facilitate operations described above with respect to the UE 108.
  • These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means that implement the function/act specified in the flowchart or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart or block diagram block or blocks.
  • Example 1 may include an apparatus comprising: streaming client circuitry to: receive a manifest file for a hypertext transfer protocol (“HTTP”) adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments, with individual adaptation sets to include one or more representations and individual representations to include one or more segments.
  • Example 2 may include the apparatus of example 1, wherein the streaming client circuitry is further to: determine display capabilities of a user equipment; determine a desired video quality; and select, based on the display capabilities and the desired video quality, the video QP value from a plurality of video QP values that correspond to a respective plurality of encoded content sections.
  • Example 3 may include the apparatus of example 2, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity (“MS-SSIM”) metric, a mean opinion score (“MOS”) metric, a structural similarity (“SSIM”) metric, peak signal to noise ratio (“PSNR”) or a perceptual evaluation of video quality (“PEVQ”) metric.
  • Example 4 may include the apparatus of any one of examples 1-3, wherein the video QP value is included in: the manifest file or in metadata embedded in a Third Generation Partnership Project (“3GPP”) file format.
  • Example 5 may include the apparatus of example 4, wherein the video QP value is included in the manifest file and the manifest file is a media presentation description for a dynamic adaptive streaming over HTTP (DASH) adaptation set.
  • Example 6 may include the apparatus of any one of examples 1-5, wherein the video QP value is a first video QP value and the streaming client circuitry is to report, to a content server, a second video QP value for a display device of a user equipment during reception, decoding, or rendering of the HTTP adaptive stream.
  • Example 7 may include the apparatus of any one of examples 1-6, wherein the video QP value corresponds to a mean luma QP averaged over a duration of the first encoded content section.
  • Example 8 may include the apparatus of any one of examples 1-7, wherein the streaming client circuitry is further to: select the video QP value from the manifest file for a plurality of sub-representations for a given period; and request sub-segments that are in at least one of the sub-representations and have the video QP value.
  • Example 9 may include the apparatus of any one of examples 1-8, wherein the video QP value is a first video QP value that corresponds to a first representation and the streaming client circuitry is further to switch to a second representation having a second video QP value at a selected access point in the HTTP adaptive stream to change a display performed by a mobile device in rendering the HTTP adaptive stream.
  • Example 10 may include the apparatus of any one of examples 1-9, wherein the streaming client circuitry is to receive an indication of the video QP value from a metadata management server.
  • Example 11 may include the apparatus of any one of examples 1-10, wherein the video QP value corresponds to an adaptation set, a representation, or a sub-representation of the manifest file.
  • Example 12 may include a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the media server comprising: an encoder to: encode content sections of an HTTP adaptive stream, and to record video quantization parameter ("QP") values associated with the encoded content sections; and streaming host circuitry to transmit a manifest file for the HTTP adaptive stream and the video QP values to a user equipment, the manifest file to define hierarchical levels that include information about the encoded content sections.
  • Example 13 may include the media server of example 12, wherein the hierarchical levels are to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments.
  • Example 14 may include the media server of example 13, wherein each representation of an adaptation set includes a same media file over a same byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
  • Example 15 may include the media server of any one of examples 12-14, wherein individual video QP values correspond to an adaptation set, a representation, or a sub-representation of the manifest file and the streaming host circuitry is to: receive, from the user equipment, an indication of a selection of an adaptation set, a representation, or a sub-representation, wherein the selection is based on a video QP value corresponding to the selected adaptation set, representation, or sub-representation; and transmit encoded portions of media content that correspond to the selected adaptation set, representation, or sub-representation based on receipt of the indication.
  • Example 16 may include the media server of any one of examples 12-15, wherein the encoder is further to record a vectorized set of video QP values describing video QP across different video frames or media time intervals and the streaming host circuitry is further to cause transmission of the vectorized set in a timed metadata track of an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
  • Example 17 may include the media server of any one of examples 12-16, wherein the indication of the video QP values is in the manifest file, which is a media presentation description file, or in metadata embedded in a file having a Third Generation Partnership Project (“3GPP”) file format.
  • Example 18 may include the media server of any one of examples 12-17, wherein the encoded content sections correspond to adaptation sets, representations, or segments.
  • Example 19 may include the media server of any one of examples 12-18, wherein the encoded content sections correspond to one or more frames of the HTTP adaptive stream.
  • Example 20 may include a user equipment ("UE") operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the UE comprising: communication circuitry to receive a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and streaming client circuitry to: identify video quantization parameter (QP) values for individual adaptation sets of the one or more adaptation sets in the manifest file for a selected period; request a first adaptation set for the selected period having a first video QP value; and dynamically switch streaming between different encoded portions of the media content in the first adaptation set to maintain a desired video quality value.
  • Example 21 may include the UE of example 20, wherein the streaming client circuitry is further to: determine the desired video quality based on capabilities of a display device of the UE; and identify the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
  • Example 22 may include the UE of example 20 or 21, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity (“MS-SSIM”) metric, a mean opinion score (“MOS”) metric, a structural similarity (“SSIM”) metric, peak signal to noise ratio (“PSNR”) or a perceptual evaluation of video quality (“PEVQ”) metric.
  • Example 23 may include the UE of any one of examples 20-22, wherein the video QP value is included in: the manifest file, which is a media presentation description ("MPD") file; or metadata embedded in a file having a Third Generation Partnership Project (“3GPP”) file format.
  • Example 24 may include the UE of any one of examples 20-23, wherein the streaming client circuitry is further to transmit, to a content server, via the communication circuitry, a desired video QP configuration for a display device of the UE.
  • Example 25 may include the UE of any one of examples 20-24, wherein the video QP value corresponds to a mean luma QP averaged over a segment duration.
  • Example 26 may include the UE of any one of examples 20-25, wherein the streaming client circuitry is further to: identify a second video QP value for a plurality of sub-representations for the selected period; and request sub-segments that are in at least one of the plurality of sub-representations and have the second video QP value.
  • Example 27 may include the UE of any one of examples 20-26, wherein the UE is to receive video QP configuration information from a metadata management server.
  • Example 28 may include one or more computer-readable media having instructions that, when executed, cause a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming to: transmit video QP metadata to a user equipment ("UE") over a byte range for a plurality of representations of an HTTP adaptive stream; receive a request for segments of the HTTP adaptive stream in the byte range in a representation selected from the plurality of representations; and transmit the segments in the byte range to the UE.
  • Example 29 may include the one or more computer-readable media of example 28, wherein the byte range is for a period, an adaptation set, a representation, a segment or a sub-segment.
  • Example 30 may include the one or more computer-readable media of example 28 or 29, wherein the video QP metadata includes a vectorized set of video QP values that describe video QP across different video frames or different media time intervals, wherein the vectorized set of video QP values is provided in a timed metadata track of a file having an International Organization for Standardization ("ISO") base media file format or Third Generation Partnership Project ("3GPP") file format.
  • Example 31 may include the one or more computer-readable media of any one of examples 28-30, wherein the instructions, when executed, further cause the media server to transmit the video QP metadata in a manifest file for the HTTP adaptive stream that is a media presentation description ("MPD") for a dynamic adaptive streaming over HTTP (DASH) adaptation set, or in metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
  • Example 32 may include the one or more computer-readable media of any one of examples 28-31, wherein the requested segments are located in an adaptation set that includes the plurality of representations, and at least one representation includes a plurality of segments.
  • Example 33 may include the one or more computer-readable media of any one of examples 28-32, wherein individual representations of the plurality of representations contain a same media file over the byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
  • Example 34 may include the one or more computer-readable media of any one of examples 28-33, wherein the request is a first request for first segments, the representation is a first representation, and the instructions, when executed, further cause the media server to: receive a second request for second segments of the HTTP adaptive stream in the byte range in a second representation selected from the plurality of representations; and transmit the second segments in the byte range to the UE.
  • Example 35 may include a user equipment ("UE") operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the UE comprising: communication circuitry to receive a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and streaming client circuitry to: identify video quantization parameter (QP) values for individual representations of a plurality of representations of the manifest file for a selected period; request a first representation for the selected period having a first video QP value; and dynamically switch streaming from the first representation to a second representation to maintain a desired video quality value.
  • Example 36 may include the UE of example 35, wherein the streaming client circuitry is further to: determine the desired video quality based on capabilities of a display device of the UE; and identify the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
  • Example 37 may include the UE of example 35 or 36, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity (“MS-SSIM”) metric, a mean opinion score (“MOS”) metric, a structural similarity (“SSIM”) metric, peak signal to noise ratio (“PSNR”) or a perceptual evaluation of video quality (“PEVQ”) metric.
  • Example 38 may include the UE of any one of examples 35-37, wherein the first video QP value corresponds to a mean luma QP averaged over a segment duration.
  • Example 39 may include the UE of example 35 or 36, wherein the streaming client circuitry is further to: identify a second video QP value in the manifest file for a plurality of sub-representations for the selected period; and request sub-segments that are in at least one of the sub-representations and have a desired video QP configuration.
  • Example 40 may include the UE of any one of examples 35-39, wherein the streaming client circuitry is to dynamically switch streaming to the second representation at a selected access point in the HTTP adaptive stream to change performance of a display device of the UE in rendering the HTTP adaptive stream.
  • Example 41 may include the UE of any one of examples 35-40, wherein the UE is to receive video QP configuration information from a metadata management server.
  • Example 42 may include one or more computer-readable media having instructions that, when executed, cause a user equipment to: identify a manifest file for a hypertext transport protocol (HTTP) adaptive stream, the manifest file to define hierarchical levels that include information about encoded media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; select a video quantization parameter ("QP") value that corresponds to at least a portion of the encoded media content; request a hierarchical level based on the video QP value; and receive a first encoded content section that corresponds to the requested hierarchical level.
  • Example 43 may include the one or more computer-readable media of example 42, wherein the instructions, when executed, further cause the user equipment to: determine display capabilities of the user equipment; determine a desired video quality; and select, based on the display capabilities and the desired video quality, the video QP value from a plurality of video QP values that correspond to a respective plurality of encoded content sections.
  • Example 44 may include the one or more computer-readable media of example 43, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity (“MS-SSIM”) metric, a mean opinion score (“MOS”) metric, a structural similarity (“SSIM”) metric, peak signal to noise ratio (“PSNR”) or a perceptual evaluation of video quality (“PEVQ”) metric.
  • Example 45 may include the one or more computer-readable media of any one of examples 42-44, wherein the video QP value is included in: the manifest file or in metadata embedded in a Third Generation Partnership Project (“3GPP”) file format.
  • Example 46 may include the one or more computer-readable media of example 45, wherein the video QP value is included in the manifest file and the manifest file is a media presentation description for a dynamic adaptive streaming over HTTP (DASH) adaptation set.
  • Example 47 may include the one or more computer-readable media of any one of examples 42-46, wherein the video QP value is a first video QP value and the instructions, when executed, further cause the user equipment to report, to a content server, a second video QP value for a display device of the user equipment during reception, decoding, or rendering of the HTTP adaptive stream.
  • Example 48 may include the one or more computer-readable media of any one of examples 42-47, wherein the video QP value corresponds to a mean luma QP averaged over a duration of a first encoded content section.
  • Example 49 may include the one or more computer-readable media of any one of examples 42-48, wherein the instructions, when executed, further cause the user equipment to: select the video QP value from the manifest file for a plurality of sub-representations for a given period; and request sub-segments that are in at least one of the sub-representations and have the video QP value.
  • Example 50 may include the one or more computer-readable media of any one of examples 42-49, wherein the video QP value corresponds to an adaptation set, a representation, or a sub-representation of the manifest file.
  • Example 51 may include a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the media server comprising: means to encode content sections of an HTTP adaptive stream, and to record video quantization parameter ("QP") values associated with the encoded content sections; and means to transmit a manifest file for the HTTP adaptive stream and the video QP values to a user equipment, the manifest file to define hierarchical levels that include information about the encoded content sections.
  • Example 52 may include the media server of example 51, wherein the hierarchical levels are to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments.
  • Example 53 may include the media server of example 52, wherein each representation of an adaptation set includes a same media file over a same byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
  • Example 54 may include the media server of any one of examples 51-53, wherein individual video QP values correspond to an adaptation set, a representation, or a sub-representation of the manifest file and the media server further comprises: means to receive, from the user equipment, an indication of a selection of an adaptation set, a representation, or a sub-representation, wherein the selection is based on a video QP value corresponding to the selected adaptation set, representation, or sub-representation; and means to transmit encoded portions of media content that correspond to the selected adaptation set, representation, or sub-representation based on receipt of the indication.
  • Example 55 may include the media server of any one of examples 51-54, wherein the means to encode is further to record a vectorized set of video QP values describing video QP across different video frames or media time intervals and the means to transmit is further to cause transmission of the vectorized set in a timed metadata track of an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
  • Example 56 may include the media server of any one of examples 51-55, wherein the indication of the video QP values is in the manifest file, which is a media presentation description file, or in metadata embedded in a file having a Third Generation Partnership Project (“3GPP”) file format.
  • Example 57 may include the media server of any one of examples 51-56, wherein the encoded content sections correspond to adaptation sets, representations, or segments.
  • Example 58 may include the media server of any one of examples 51-57, wherein the encoded content sections correspond to one or more frames of the HTTP adaptive stream.
  • Example 59 may include a method of hypertext transfer protocol (HTTP) adaptive streaming, the method comprising: receiving a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; identifying video quantization parameter (QP) values for individual adaptation sets of the one or more adaptation sets in the manifest file for a selected period; requesting a first adaptation set for the selected period having a first video QP value; and dynamically switching streaming between different encoded portions of the media content in the first adaptation set to maintain a desired video quality value.
  • Example 60 may include the method of example 59, further comprising: determining the desired video quality based on capabilities of a display device of a user equipment ("UE"); and identifying the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
  • Example 61 may include the method of example 59 or 60, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity (“MS-SSIM”) metric, a mean opinion score (“MOS”) metric, a structural similarity (“SSIM”) metric, peak signal to noise ratio (“PSNR”) or a perceptual evaluation of video quality (“PEVQ”) metric.
  • Example 62 may include the method of any one of examples 59-61, wherein the video QP value is included in: the manifest file, which is a media presentation description ("MPD”) file; or metadata embedded in a file having a Third Generation Partnership Project (“3GPP”) file format.
  • Example 63 may include the method of any one of examples 59-62, wherein the method further comprises transmitting, to a content server, a desired video QP configuration for a display device of the UE.
  • Example 64 may include the method of any one of examples 59-63, wherein the video QP value corresponds to a mean luma QP averaged over a segment duration.
  • Example 65 may include the method of any one of examples 59-64, further comprising: identifying a second video QP value for a plurality of sub-representations for the selected period; and requesting sub-segments that are in at least one of the plurality of sub-representations and have the second video QP value.
  • Example 66 may include the method of any one of examples 59-65, further comprising receiving video QP configuration information from a metadata management server.
  • Example 67 may include a method of hypertext transfer protocol (HTTP) adaptive streaming comprising: transmitting video QP metadata to a user equipment ("UE") over a byte range for a plurality of representations of an HTTP adaptive stream; receiving a request for segments of the HTTP adaptive stream in the byte range in a representation selected from the plurality of representations; and transmitting the segments in the byte range to the UE.
  • Example 68 may include the method of example 67, wherein the byte range is for a period, an adaptation set, a representation, a segment or a sub-segment.
  • Example 69 may include the method of example 67 or 68, wherein the video QP metadata includes a vectorized set of video QP values that describe video QP across different video frames or different media time intervals, wherein the vectorized set of video QP values is provided in a timed metadata track of a file having an International Organization for Standardization ("ISO") base media file format or Third Generation Partnership Project ("3GPP") file format.
  • Example 70 may include the method of any one of examples 67-69, further comprising transmitting the video QP metadata in a manifest file for the HTTP adaptive stream that is a media presentation description ("MPD") for a dynamic adaptive streaming over HTTP (DASH) adaptation set, or in metadata embedded in a file having a Third Generation Partnership Project (“3GPP”) file format.
  • Example 71 may include the method of any one of examples 67-70, wherein the requested segments are located in an adaptation set that includes the plurality of representations, and at least one representation includes a plurality of segments.
  • Example 72 may include the method of any one of examples 67-71, wherein individual representations of the plurality of representations contain a same media file over the byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
  • Example 73 may include the method of any one of examples 67-72, wherein the request is a first request for first segments, the representation is a first representation, and further comprising: receiving a second request for second segments of the HTTP adaptive stream in the byte range in a second representation selected from the plurality of representations; and transmitting the second segments in the byte range to the UE.
  • Example 74 may include a method of hypertext transfer protocol (HTTP) adaptive streaming, the method comprising: receiving a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; identifying video quantization parameter (QP) values for individual representations of a plurality of representations of the manifest file for a selected period; requesting a first representation for the selected period having a first video QP value; and dynamically switching streaming from the first representation to a second representation to maintain a desired video quality value.
  • Example 75 may include the method of example 74, further comprising: determining the desired video quality based on capabilities of a display device of a user equipment ("UE"); and identifying the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
  • Example 76 may include the method of example 74 or 75, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.
  • Example 77 may include the method of any one of examples 74-76, further comprising: identifying a third video QP configuration in the manifest file for a plurality of sub-representations for the selected period; and requesting sub-segments that are in at least one of the sub-representations and have a desired video QP configuration.
  • Example 78 may include an apparatus configured to perform the method of any one of examples 59-77.
  • Example 79 may include one or more computer-readable media having instructions that, when executed, cause a device to perform the method of any one of examples 59-77.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

An apparatus comprising: streaming client circuitry to: receive a manifest file for a hypertext transport protocol (HTTP) adaptive stream, the manifest file to define hierarchical levels that include information about encoded media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; select a video quantization parameter, QP, value that corresponds to at least a portion of the encoded media content; request a hierarchical level based on the video QP value; and receive a first encoded content section, the first encoded content section to correspond to the requested hierarchical level; and decoding circuitry, coupled with the streaming client circuitry, to decode the first encoded content section.

Description

QUANTIZATION PARAMETER REPORTING FOR VIDEO STREAMING
Field
Embodiments of the present disclosure generally relate to the field of networks, and more particularly, to apparatuses, systems, and methods for quantization parameter reporting for video streaming through the networks.
Background
Hypertext transfer protocol ("HTTP") streaming is spreading widely as a form of multimedia delivery of Internet video. HTTP-based delivery provides reliability and deployment simplicity due to the already broad adoption of both HTTP and its underlying protocols, transmission control protocol ("TCP")/Internet protocol ("IP"). Dynamic adaptive streaming over HTTP ("DASH") is a newer technology standardized in Third Generation Partnership Project (3GPP) Technical Specification (TS) 26.247, v13.3.0 (June 24, 2016) and Moving Picture Experts Group ("MPEG") International Organization for Standardization ("ISO")/International Electrotechnical Commission ("IEC") 23009-1 (May 15, 2014).
In DASH, a manifest file referred to as a media presentation description ("MPD") metadata file provides information on the structure and different versions of the media content representations stored in a server. The information may include, for example, different bit rates, frame rates, resolutions, codec types, etc. In addition, the MPD may include information on initialization and media segments for the media engine to ensure mapping of segments into a media presentation timeline for switching and synchronous presentation with other representations. Based on this MPD metadata information that describes the relation of the segments and how they form a media presentation, clients request the segments using HTTP GET or partial GET methods. The client may fully control the streaming session; for example, the client may manage the on-time request and smooth play out of the sequence of segments, potentially adjusting bit rates or other attributes to, for example, react to changes of the device state or the user preferences.
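For illustration, the request pattern described above can be sketched in a few lines of Python using only the standard library; the manifest and segment URLs below are hypothetical:

```python
import urllib.request

MPD_URL = "https://example.com/media/stream.mpd"  # hypothetical manifest location

def http_get(url, byte_range=None):
    """Issue an HTTP GET, or a partial GET when a byte range is given."""
    request = urllib.request.Request(url)
    if byte_range is not None:
        first, last = byte_range
        request.add_header("Range", f"bytes={first}-{last}")  # partial GET
    with urllib.request.urlopen(request) as response:
        return response.read()

# Fetch the MPD first, then request the media segments it describes.
mpd_bytes = http_get(MPD_URL)
segment_1 = http_get("https://example.com/media/rep1/segment1.m4s")
# A partial GET retrieves only part of a resource, e.g. the first kilobyte.
chunk = http_get("https://example.com/media/rep1/segment2.m4s", byte_range=(0, 1023))
```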
Brief Description of the Drawings
Embodiments will be readily understood by the following detailed description in conjunction with the accompanying drawings. To facilitate this description, like reference numerals designate like structural elements. Embodiments are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings.
Figure 1 illustrates an adaptive streaming system according to some embodiments.
Figure 2 illustrates a media presentation description file according to some embodiments.
Figure 3 illustrates a QP metadata file according to some embodiments.
Figure 4 illustrates an example of an extensible markup language syntax of common group and representation attributes and elements according to some embodiments.
Figure 5 illustrates an example operation flow/algorithmic structure of a user equipment according to some embodiments.
Figure 6 illustrates an example operation flow/algorithmic structure of a user equipment according to some embodiments.
Figure 7 illustrates an example operation flow/algorithmic structure of a server according to some embodiments.
Figure 8 illustrates an example operation flow/algorithmic structure of a server according to some embodiments.
Figure 9 illustrates a computer system according to some embodiments.
Detailed Description
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof wherein like numerals designate like parts throughout, and in which is shown by way of illustration embodiments that may be practiced. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present disclosure.
Various operations may be described as multiple discrete actions or operations in turn, in a manner that is most helpful in understanding the claimed subject matter.
However, the order of description should not be construed as to imply that these operations are necessarily order dependent. In particular, these operations may not be performed in the order of presentation. Operations described may be performed in a different order than the described embodiment. Various additional operations may be performed or described operations may be omitted in additional embodiments.
For the purposes of the present disclosure, the phrases "A or B," "A and/or B," and "A/B" mean (A), (B), or (A and B).
The description may use the phrases "in an embodiment," or "in embodiments," which may each refer to one or more of the same or different embodiments. Furthermore, the terms "comprising," "including," "having," and the like, as used with respect to embodiments of the present disclosure, are synonymous.

Figure 1 illustrates an adaptive streaming system 100 in accordance with some embodiments. The system 100 may include a server 104 and a user equipment ("UE") 108. In this embodiment, the system 100 may provide a DASH-based streaming framework; however, the concepts described herein may equally apply to other adaptive streaming frameworks.
The server 104 may be a web server, a media server, or a metadata management server. In some embodiments, operations described as being provided by the server 104 may be provided by one or more different devices that are networked together.
The server 104 may receive audio/video input 112 that is to be streamed. The audio/video input 112 may be sourced locally from the server 104 or provided to the server 104 from another device, for example, a remote media content server.
The server 104 may include encoding circuitry 116 and segmentation circuitry 120. The encoding circuitry 116 may receive the audio/video input 112 and encode the media content using any of a number of codecs such as those standardized for video and audio compression by International Organization for Standardization/International Electrotechnical Commission ("ISO/IEC") Moving Picture Experts Group ("MPEG"). In other embodiments, other video/audio codecs may be used.
The encoding circuitry 116 may provide the encoded media content to the segmentation circuitry 120, which may split the input media into a series of segments that may then be provided to streaming host circuitry 124 that is to control delivery of the adaptive streaming information. The streaming host circuitry 124 may control lower-level communication circuitry ("CC") 126. In general, the streaming host circuitry 124 may provide communication processing at the Internet layer (for example, Internet protocol ("IP")) and above, while the communication circuitry 126 provides communication processing at a link layer and below.
The encoding circuitry 116 or streaming host circuitry 124 may also generate a metadata file, for example, an MPD file, that may be used to convey an index of each segment and associated metadata information about the encoded content. If the encoding circuitry 116 generates the MPD file, it may provide the MPD file to the streaming host circuitry 124.
When the encoding circuitry 116 encodes the media content it may also record a video quantization parameter ("QP") value that is associated with the corresponding encoded media content. The video QP value may indicate how much a particular set of media content was quantized during the encoding process. In general, the higher the video QP value, the greater the amount of quantization performed during the encoding process. The encoding circuitry 116 may also provide the QP information, for example, the video QP value, to the streaming host circuitry 124. As described in further detail below, the QP information may be included in the MPD file or may be in a metadata file that is separate from the MPD file.
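A minimal sketch of this bookkeeping step, assuming the encoder exposes per-frame luma QP values (the input list and record layout are illustrative, not part of the disclosure):

```python
from statistics import mean

def record_segment_qp(frame_luma_qps):
    # Summarize per-frame luma QP over one encoded content section: the mean
    # value for signaling, plus extremes for optional min/max reporting.
    return {
        "mean_qp": round(mean(frame_luma_qps)),
        "min_qp": min(frame_luma_qps),
        "max_qp": max(frame_luma_qps),
    }

# e.g. luma QP values reported by the encoder for the frames of one segment
print(record_segment_qp([30, 32, 31, 29, 33]))
# {'mean_qp': 31, 'min_qp': 29, 'max_qp': 33}
```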
The streaming host circuitry 124, working in conjunction with the communication circuitry 126, may control delivery of the adaptive streaming information, for example, the encoded segments, the MPD file, and the QP information, to the user equipment 108 using HTTP protocols over one or more networks such as, for example, a wireless access network, a core network, an IP network, a public network, etc.
As used herein, the term "circuitry" may refer to, be part of, or include any combination of integrated circuits (for example, a field-programmable gate array ("FPGA"), an application specific integrated circuit ("ASIC"), etc.), discrete circuits, combinational logic circuits, a system on a chip ("SOC"), or a system in a package ("SiP") that provides the described functionality. In some embodiments, the circuitry may execute one or more software or firmware modules to provide the described functions. In some embodiments, circuitry may include logic, at least partially operable in hardware.
The UE 108 may include communication circuitry ("CC") 130 that provides communication processing that complements the communication processing provided by communication circuitry 126. The UE 108 may also include streaming client circuitry 128 that provides communication processing that complements the communication processing provided by the streaming host circuitry 124. In particular, the streaming client circuitry 128 may manage the request, delivery, and reception of the adaptive streaming content at the Internet, Transport, and Application layers. In some embodiments, the streaming client circuitry 128 may implement, control, or otherwise utilize a web browser at an Application layer to facilitate operations of the streaming client circuitry 128.
The streaming client circuitry 128 may transmit an HTTP GET request to the streaming host circuitry 124 to request streaming of media content corresponding to the audio/video input 112. The streaming host circuitry 124 may respond by transmitting the MPD file to the streaming client circuitry 128. While it is described that the streaming client circuitry 128 transmits messages to and receives messages from the streaming host circuitry 124, it will be understood that the streaming client circuitry 128 will cooperate with or otherwise control communication circuitry 130 to effectuate the transmission/reception of the described messages. Similarly, the streaming host circuitry 124 will cooperate with or otherwise control communication circuitry 126 to effectuate the transmission/reception of described messages.
Upon receiving the MPD file, the streaming client circuitry 128 may use the file to determine the index of each segment and associated metadata information. Subsequently, the streaming client circuitry 128 may transmit a uniform resource locator ("URL") in, for example, an HTTP GET URL request, to obtain a specific segment. For example, the streaming client circuitry 128 may transmit a first HTTP GET URL request that corresponds to a request for a first segment ("Segment 1"). The streaming host circuitry 124 may then respond with Segment 1. The streaming client circuitry 128 may continue to request desired segments to control the adaptive streaming session.
The streaming client circuitry 128 may provide the received segments to decoding circuitry 132, which may decode the encoded segment and provide the decoded segments to media player circuitry 136. The media player circuitry 136 may control output circuitry 140 to render the decoded segments as video on display device 144 or audio on or by audio device 148.
Video quality associated with the adaptive streaming session may be monitored in a number of ways. In some instances, video quality may be determined based on quality metrics such as, but not limited to, video multi-scale structural similarity ("MS-SSIM"), video mean opinion score ("MOS"), video quality metrics ("VQM"), structural similarity metrics ("SSIM"), peak signal-to-noise ratio ("PSNR"), perceptual evaluation of video quality metrics ("PEVQ"), etc.
Video resolution, frame rate, encoding bit rate, etc. may play an important role in video quality. In addition to those parameters, the video QP values may also play an important role because different video segments may have very different content characteristics and may yield a wide range of video quality even when encoded at the same bit rate. Consider, for example, the following tables from 3GPP TR 26.909, v0.4.0 (2016-06).
[The header row of Table 1 was lost with the source image; column identities are inferred from the discussion below.]

Clip            Bit rate (kbps)  QP    PSNR (dB)  SSIM   MOS
aspen           1454             35.9  26.9       0.834  2.2
crowdrun        1482             39.3  23.9       0.745  2.0
redkayak        1451             36.5  31.1       0.829  2.2
westwindeasy    1467             35.9  28.0       0.880  3.4
backsneak       1459             32.5  35.0       0.930  3.4
bbscore         1466             29.7  32.4       0.907  3.5
controlledburn  1476             29.1  31.5       0.924  3.5
tractor         1456             32.2  32.6       0.917  3.7
frontend        1429             23.7  30.8       0.946  3.9
pedestrianarea  1463             27.6  35.7       0.944  3.6
speedbag        1455             25.8  38.5       0.971  4.2
sunflower       1460             25.9  38.2       0.975  4.4

Table 1
[Table 2 survives only as an image in the source; per the discussion below, it tabulates the correlation of bit rate/QP with video quality scores.]

Table 2

Table 1, which corresponds with Table 4-10 of 3GPP TR 26.909, shows quality and video QP values (or, simply, "QP") of different video clips encoded at the same bit rate. When the videos share input parameters such as resolution, frame rate, and bit rate, and are displayed on the same device, the MOS, which may be a subjective video quality metric, may still vary over a wide range due to differences in content characteristics, which may be well captured by the QP value.
Table 2, which corresponds with Table 4-11 of 3GPP TR 26.909, shows the correlation between bit rate/QP and video quality scores. For videos encoded at the same bit rate, their video MOS has a strong correlation to the video QP value, which may provide an indication of the video content complexity.
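That relationship can be checked directly against the values transcribed from Table 1; note that the pairing of the QP and MOS columns is inferred from the surrounding discussion, since the table's header row was lost:

```python
from statistics import correlation  # Python 3.10+

# (QP, MOS) pairs transcribed from Table 1, in row order.
qp  = [35.9, 39.3, 36.5, 35.9, 32.5, 29.7, 29.1, 32.2, 23.7, 27.6, 25.8, 25.9]
mos = [2.2, 2.0, 2.2, 3.4, 3.4, 3.5, 3.5, 3.7, 3.9, 3.6, 4.2, 4.4]

# Pearson's r is strongly negative: higher QP tracks lower subjective quality.
print(f"r(QP, MOS) = {correlation(qp, mos):.2f}")
```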
Given the correlation between video quality, as displayed on a particular display device, and video QP value, embodiments of the present disclosure provide a QP-aware streaming process in which the streaming client circuitry 128 can adaptively choose the content that best matches display capabilities of the display device 144. This may facilitate delivery of the streaming media content with a satisfactory video quality (for example, video MOS). Furthermore, providing the QP information at the session/HTTP level as described, rather than relying on feedback from the codec processing level at the decoding circuitry 132, may facilitate timely and efficient determination of video quality metrics by, for example, a video MOS estimation engine.

Various embodiments also describe mechanisms to perform video QP-aware streaming and content adaptation at the streaming client circuitry 128 based on quality characteristics determined for different configurations of video QP. To this end, the QP information generated by the encoding circuitry 116 may further include quality metrics corresponding to different video QP configurations. The streaming client circuitry 128 may use this information to perform rate adaptation and bit rate selection to improve quality for its chosen set of video QP parameters.
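One way such QP-aware selection could look, as a sketch; the representation fields, QP ceiling, and bandwidth figures are all hypothetical:

```python
def pick_representation(representations, max_qp, bandwidth_bps):
    """Pick the highest-bit-rate representation that fits the measured
    bandwidth and whose advertised video QP is at or below the target."""
    candidates = [
        r for r in representations
        if r["qp"] <= max_qp and r["bandwidth"] <= bandwidth_bps
    ]
    if not candidates:
        # Fall back to whatever fits the bandwidth, preferring the lowest QP.
        candidates = sorted(
            (r for r in representations if r["bandwidth"] <= bandwidth_bps),
            key=lambda r: r["qp"],
        )[:1]
    return max(candidates, key=lambda r: r["bandwidth"], default=None)

reps = [
    {"id": "720p-low",  "bandwidth": 1_500_000, "qp": 38},
    {"id": "720p-high", "bandwidth": 3_000_000, "qp": 30},
    {"id": "1080p",     "bandwidth": 6_000_000, "qp": 27},
]
print(pick_representation(reps, max_qp=32, bandwidth_bps=4_000_000))  # 720p-high
```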
In some embodiments, the streaming client circuitry 128 may transmit to the server 104, via the communication circuitry 130, a desired video QP configuration for the display device 144. In these embodiments, the server 104 may process the feedback information of the desired video QP configuration and perform transcoding to generate new content, for example, new DASH content, with the desired video QP configuration, which may then be provided or otherwise made available to the user equipment 108 as described elsewhere herein.
Figure 2 illustrates an MPD file 200 in accordance with some embodiments. The MPD file 200 may be a manifest file for an HTTP adaptive stream that defines hierarchical levels that include information about encoded portions of media content available on a server, for example, server 104, for adaptive streaming. The top-level of the MPD file 200 may include a number of periods (including, for example, period 204 and period 208) that describe a part of the media content with a start and a stop time. Each period may include one or more adaptation sets (for example, adaptation set 212 and adaptation set 216).
An adaptation set may be a set of interchangeable encoded versions of one or several media content components for a given period. For example, a period may have one adaptation set that provides video content and one or more other adaptation sets that provide audio content (possibly in one or more languages). In another example, a first adaptation set may provide high definition video content while a second adaptation set provides standard definition video content. For each period, one or more of the available adaptation sets may be selected by a UE based on, for example, user preferences, device capabilities, etc.
Individual adaptation sets may include one or more representations that represent a collection and encapsulation of one or more media streams and a delivery format associated with descriptive metadata. The media streams of various representations may be formatted in various ways. For example, adaptation set 212 may include representations 220 and 224, each of which may be associated with different screen sizes, bandwidths, codecs, etc. The streaming client circuitry, for example, streaming client circuitry 128, may select between the different representations based on client preferences, network conditions, etc. In some embodiments, the streaming client circuitry 128 may switch between different representations of an adaptation set in a dynamic manner during a streaming session to maintain, for example, a desired video quality.
Individual representations may include sub-representations that include information that is specific to a particular media stream for the entire period. For example, the representation 220 may include sub-representations 228 and 232 each associated with a distinct set of information such as, for example, codecs, sampling rates, embedded subtitles, etc. for the period 204.
Individual representations may include segments that represent the unit of data associated with an HTTP URL and, optionally, a byte range that is specified by the MPD file 200. For example, the representation 220 may include segments 236 and 240. The segments 236 and 240 may be individually requested by the HTTP GET requests transmitted by the streaming client circuitry 128. In embodiments in which a representation has a sub-representation, the segments may also have sub-segments that are units of the segments that are associated with the specific information of the corresponding sub-representation. For example, the segment 236 may include a sub-segment 244 that is associated with the information of the sub-representation 228 and may also include a sub-segment 248 that is associated with the information of the sub-representation 232.
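To make the hierarchy concrete, here is a toy in-memory mirror of the structure just described; all identifiers, URLs, and byte ranges are hypothetical:

```python
# One period containing one adaptation set, one representation with two
# sub-representations, and one segment split into two sub-segments.
mpd = {
    "periods": [{
        "id": "period-204",
        "adaptation_sets": [{
            "id": "adaptation-set-212",
            "representations": [{
                "id": "representation-220",
                "sub_representations": ["sub-rep-228", "sub-rep-232"],
                "segments": [{
                    "url": "https://example.com/media/seg236.m4s",
                    # Byte range of each sub-segment within the segment; a
                    # sub-segment is fetched with an HTTP partial GET, e.g.
                    # "Range: bytes=0-49999" against the segment URL.
                    "sub_segments": {
                        "sub-rep-228": (0, 49_999),
                        "sub-rep-232": (50_000, 99_999),
                    },
                }],
            }],
        }],
    }],
}
```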
In various embodiments, QP information 252 may be provided at an adaptation set level, a representation level, or a sub-representation level. To provide the QP information 252 at an adaptation set level for adaptation set 212, for example, the QP information may be provided directly in adaptation set 212, as shown in Figure 2. Providing the QP information 252 at the representation or sub-representation levels may also be employed in a direct manner by including the QP information 252 in, for example, the representation 220, sub-representation 228, or sub-representation 232. The QP information 252 is shown with a dotted line border in Figure 2 to represent that the locations for the QP information 252 are various options. Embodiments may include the QP information 252 in any one or combination of the various options.
In some embodiments, the QP information 252 may be provided at a plurality of granularity levels. For example, in one embodiment first-level QP information may be provided with respect to a first hierarchical level (for example, an adaptation set or a representation) and second-level QP information may be provided with respect to a second hierarchical level (for example, a representation or a sub-representation). This may be useful to provide the UE with general QP information at a relatively higher hierarchical level and providing more specific QP information at a relatively lower hierarchical level. For example, adaptation set 212 may include QP information 252 that represents an average QP value for all the encoded content in the adaptation set 212. Further QP information may be associated with lower levels, for example, representations 220 and 224 or sub-representations 228 and 232.
In some embodiments, the lower-level QP information (at, for example, the sub-representations 228 and 232) may be the same or different than the higher-level QP information (at, for example, the adaptation set 212). This may, for example, allow a UE to identify a second video QP value for a sub-representation that is different from a first video QP value for an adaptation set that the sub-representation is in, and request sub-segments of the sub-representation having the second video QP value.
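A small sketch of how a client might resolve the effective QP across these levels, assuming that a value signaled at a lower level simply overrides the higher-level one:

```python
def effective_qp(adaptation_set_qp, representation_qp=None, sub_representation_qp=None):
    # Walk from the most specific level to the least specific and return
    # the first QP value that is actually signaled.
    for qp in (sub_representation_qp, representation_qp, adaptation_set_qp):
        if qp is not None:
            return qp
    return None

print(effective_qp(32))                            # 32: only the adaptation-set average
print(effective_qp(32, sub_representation_qp=28))  # 28: the sub-representation overrides
```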
In some embodiments, QP information may be provided in a file that is separate from the manifest file. For example, Figure 3 illustrates a QP metadata file 300 in accordance with some embodiments. The QP metadata file 300 may have an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
The QP metadata file 300 may have video QP information embedded into a timed metadata track linked to a corresponding metadata file, for example, MPD file 200. The video QP metadata may include a vectorized set of video QP values that describe video QP across various content sections.
In some embodiments, the QP metadata file 300 may include references to one or more periods that are defined in the MPD file 200. For individual periods, the QP metadata file 300 may include one or more video QP values associated with a corresponding section of encoded content. For example, the QP metadata file 300 shows periods 1-n, which may correspond to periods 204-208 of the MPD file 200. Each of the periods may have i encoded content sections. The QP metadata file 300 may then indicate that, for period 1, video QP values (1) 304 correspond to encoded content (1) 308, and video QP values (i) 312 may correspond to encoded content (i) 316; ... ; for period n, video QP values (1) 320 correspond to encoded content (1) 324, ... , and video QP values (i) 328 correspond to encoded content (i) 332. In this manner, the QP metadata file 300 may associate video QP values with encoded content for each of the periods of a corresponding manifest file.

In some embodiments, the encoded content sections of the QP metadata file 300 may correspond to a hierarchical level of a manifest file. For example, an encoded content section may be provided for each of one or more adaptation sets, representations, or sub-representations. In some embodiments, the encoded content sections may be associated with a different or finer granularity. For example, in some embodiments the encoded content sections may correspond to one or more video frames of encoded content or media time intervals.
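For example, the vectorized layout might be mirrored in memory as follows; the period keys, section indices, and QP values are all hypothetical:

```python
# Per period, one QP vector per encoded content section (e.g. a run of video
# frames or a media time interval), mirroring the layout of QP metadata file 300.
qp_metadata = {
    "period-1": [
        {"content_section": 1, "qp_values": [30, 31, 29, 32]},
        {"content_section": 2, "qp_values": [35, 36, 34]},
    ],
    "period-n": [
        {"content_section": 1, "qp_values": [26, 27, 27, 28]},
    ],
}
```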
In embodiments relying on a metadata file to provide the QP information, the manifest file may be configured with an indication of the existence of the metadata file. For example, a codecs parameter of the MPD may be provided with a dedicated new value that may be used to indicate the presence of video QP information in a corresponding metadata file. The UE, upon receiving the indication, may then send a separate request for the metadata file.
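A sketch of that client-side check; the indicator value and the metadata file location are pure placeholders, since neither is specified by the text above:

```python
def qp_metadata_url(mpd_codecs_value, base_url):
    """If the (hypothetical) codecs indicator is present, derive the URL of
    the separate metadata file carrying the video QP information."""
    QP_METADATA_INDICATOR = "vqpm"  # placeholder for the dedicated new value
    if QP_METADATA_INDICATOR in mpd_codecs_value.split(","):
        return base_url + "/qp_metadata.mp4"  # hypothetical location
    return None

print(qp_metadata_url("avc1.64001f,vqpm", "https://example.com/media"))
```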
The QP information provided to the streaming client circuitry 128 may include a variety of QP information that pertains to the encoded content section with which it is associated. The QP information may include a single value that is associated with the entire encoded content section. The single value may be an average (for example, a mean, a median, or a modal average) QP value over the encoded content section. In some embodiments, the average may be a mean luma QP averaged over a duration of the encoded content section. In some embodiments, the QP information may additionally or alternatively include maximum or minimum QP values over the duration of the encoded content section.
In some embodiments, information in addition to, but associated with, the QP information may be provided. For example, in some embodiments the QP information associated with an encoded content section may also include corresponding quality information. This quality information may be used to signal quality variations in a more granular fashion to enable more dynamic quality -based adaptations by the UE.
In some embodiments, minimum/maximum quality attributes may be signaled at an adaptation set level. For example, a minimum quality attribute may be used to specify a minimum quality value in all representations of an adaptation set for each given video QP value; and a maximum quality attribute may be used to specify a maximum quality value in all representations in the adaptation set for each given video QP configuration. The minimum and maximum quality values may quantify the minimum and maximum quality levels over a specified timeline that may correspond to a period, segment, or sub-segment. In some embodiments, these values may indicate long-term (or average) minimum and maximum quality measurements over an entire duration of an adaptation set. In another embodiment, vectorized sets of quality values may be provided to specify the minimum and maximum quality levels for the adaptation set across different segments and sub-segments. The signaling may be done such that, for each video QP value, a corresponding set of quality metrics may be provided.
The quality/QP information may be used by the streaming client circuitry 128 to maintain a high quality of experience ("QoE") for a user. For example, the streaming client circuitry 128 may receive the manifest/metadata file and identify an adaptation set that best matches its display capabilities in terms of video QP values towards delivering satisfactory video quality. The streaming client circuitry 128 may then request segments and sub-segments corresponding to the various DASH representations, taking into account the video QP information as well as the corresponding quality information in the manifest/metadata file. The streaming client circuitry 128 may switch across different representations over random access points (also known as segment access points in DASH) by continuously tracking bandwidth, quality, central processing unit ("CPU") load, etc. in an effort to optimize user QoE. To avoid large quality variations during bitstream switching over random access points, the streaming client circuitry 128 may use the quality information in the manifest/metadata file to decide which representation to switch to, and when/where the switching should occur for the highest QoE.
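A possible shape for such a quality-aware switching decision at an access point; the field names, score scale, and thresholds are assumptions for illustration:

```python
def choose_switch(representations, current_id, bandwidth_bps, quality_floor):
    """At a random access point, switch only to a representation that fits the
    measured bandwidth and whose signaled minimum quality stays above the floor."""
    viable = [
        r for r in representations
        if r["bandwidth"] <= bandwidth_bps and r["min_quality"] >= quality_floor
    ]
    if not viable:
        return current_id  # no safe target; avoid a large quality variation
    return max(viable, key=lambda r: r["min_quality"])["id"]

reps = [
    {"id": "rep-A", "bandwidth": 2_000_000, "min_quality": 3.1},  # MOS-like scores
    {"id": "rep-B", "bandwidth": 4_000_000, "min_quality": 3.8},
]
print(choose_switch(reps, "rep-A", bandwidth_bps=5_000_000, quality_floor=3.5))  # rep-B
```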
In 3GPP TS 26.247 v13.3.0 (June 24, 2016), common attributes and elements are defined for AdaptationSet, Representation, and SubRepresentation in Table 8-9. Table 3, below, shows attributes and elements that may be used in place of Table 8-9 of TS 26.247 to provide for, among other things, attributes and elements for video QP. The attributes and elements listed in Table 3 may be present in AdaptationSets, Representations, or SubRepresentations along with the semantics of these attributes. Unless otherwise noted, the sections/clauses referred to in the Description column refer to sections/clauses of 3GPP TS 26.247.
[The header row and the start of Table 3 appear only as an image in the source; the table lists each element or attribute name, its use, and its description. The recoverable entries follow, with the Use value in parentheses.]

@profiles (O): [beginning lost] ... Presentation profiles as described in 7.3. The value shall be a subset of the respective value in any higher level of the document hierarchy (Representation, Adaptation Set, MPD). If not present, the value is inferred to be the same as in the next higher level of the document hierarchy. For example, if the value is not present for a Representation, then @profiles at the Adaptation Set level is valid for the Representation. The same syntax as defined in 8.4.1 shall be used.

@width (O): Specifies the horizontal visual presentation size of the video media type in pixels. If not present on any level, the value is unknown.

@height (O): Specifies the vertical visual presentation size of the video media type in pixels. If not present on any level, the value is unknown.

@frameRate (O): Specifies the output frame rate of the video media type in the Representation. If the frame rate is varying, the value is the average frame rate over the entire duration of the Representation. The value is coded as a string, either containing two integers separated by a "/" ("F/D") or a single integer "F". The frame rate is the division F/D, or F, respectively, per second (i.e., the default value of D is "1"). If not present on any level, the value is unknown.

@qp (O): Specifies the video quantization parameter.

@audioSamplingRate (O): Either a single decimal integer value specifying the sampling rate or a whitespace-separated pair of decimal integer values specifying the minimum and maximum sampling rate of the audio media component type. The values are in samples per second. If not present on any level, the value is unknown.

@mimeType (O): Specifies the MIME type of the concatenation of the Initialization Segment, if present, and all consecutive Media Segments in the Representation.

@codecs (O): Specifies the codecs parameter specifying the media types. The codec parameters shall also include the profile and level information where applicable. The contents of this attribute shall conform to either the simp-list or fancy-list productions of RFC6381 [26], clause 3.2, without the enclosing DQUOTE characters. The codec identifier for the media format, mapped into the name space for codecs as specified in RFC6381 [26], clause 3.3, shall be used.

@maximumSAPPeriod (O): When present, specifies the maximum SAP interval in seconds of all contained media streams, where the SAP interval is the maximum time interval between the TSAP of any two successive SAPs of types 1 to 3 inclusive of one media stream in the associated Representations. If not present on any level, the value is unknown.

@startWithSAP (O): When present and greater than 0, specifies that in the associated Representations, each Media Segment starts with a SAP of type less than or equal to the value of this attribute in each media stream. A Media Segment starts with a SAP in a media stream if the stream contains a SAP in that Media Segment, ISAU is the index of the first access unit that follows ISAP, and ISAP is contained in the Media Segment. If not present on any level, the value is unknown.

@maxPlayoutRate (O): Specifies the maximum play out rate as a multiple of the regular play out rate, which is supported with the same decoder profile and level requirements as the normal play out rate. If not present on any level, the value is 1.

@codingDependency (O): When present and "true", for all contained media streams, specifies that there is at least one access unit that depends on one or more other access units for decoding. When present and "false", for any media type, there is no access unit that depends on any other access unit for decoding (e.g., for video all the pictures are intra coded). When not present, there may or may not be coding dependency between access units.

FramePacking (0...N): Specifies frame-packing arrangement information of the video media component type. When no FramePacking element is provided for a video component, frame-packing shall not be used for the video media component. For details see 8.6.3.1 and 8.6.3.8.

AudioChannelConfiguration (0...N): Specifies the audio channel configuration of the audio media component type. For details see clauses 8.6.3.1 and 8.6.3.7.

ContentProtection (0...N): Specifies information about the use of content protection for the associated Representations. For details, refer to clauses 8.6.3.1 and 8.6.3.2.

EssentialProperty (0...N): Specifies information about the containing element that is considered essential by the Media Presentation author for processing the containing element. For details see clause 8.6.3.9.

SupplementalProperty (0...N): Specifies supplemental information about the containing element that may be used by the DASH client in optimizing the processing. For details see clause 8.6.3.10.

Table 3
For the attributes of Table 3 (those preceded with an "@"), an "O" may indicate the attribute is optional, while an "M" may indicate the attribute is mandatory. For the elements of Table 3, the use column may be in the form <minOccurs>...<maxOccurs>, where "N" indicates an unbounded value.
An example of an extensible markup language ("XML") syntax of common group and representation attributes and elements, such as those described above with respect to Table 3, is shown in Figure 4. As can be seen, a video QP attribute, shown as "QP" in Figure 4, may be provided with an unsigned integer data type to indicate that the video QP value may be specified by a numeric value without a fractional component. The integer that represents the video QP value may be a 32-bit integer in some embodiments.
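To make the convention concrete, the following sketch shows a hypothetical MPD fragment carrying such a QP attribute and a client reading it. The attribute name "qp", the element values, and the parsing code are illustrative assumptions, not text from any standard or from Figure 4 itself.

```python
# Sketch only: a hypothetical MPD fragment carrying a video QP attribute
# ("qp", an unsigned integer) on an AdaptationSet and its Representations,
# read with the standard library. All names and values are illustrative.
import xml.etree.ElementTree as ET

MPD_FRAGMENT = """\
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="1">
    <AdaptationSet mimeType="video/mp4" qp="30">
      <Representation id="720p" bandwidth="3000000" width="1280" height="720" qp="28"/>
      <Representation id="480p" bandwidth="1500000" width="854" height="480" qp="34"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}
root = ET.fromstring(MPD_FRAGMENT)
for rep in root.iterfind(".//dash:Representation", NS):
    # @qp is modeled as an unsigned integer, per the convention above.
    print(rep.get("id"), int(rep.get("qp")))
```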
Figure 5 illustrates an example operation flow/algorithmic structure 500 of the UE 108 according to some embodiments. The operation flow/algorithmic structure 500 may include, at 504, receiving a manifest file. The manifest file may be received by streaming client circuitry 128. The manifest file, which may correspond to an HTTP adaptive stream, may define hierarchical levels that include information about encoded portions of media content available for adaptive streaming on a server such as, for example, server 104. In some embodiments, the manifest file may be an MPD file such as, for example, MPD file 200. In some embodiments, the manifest file may be received as a result of an original request for media content such as, for example, an HTTP GET request transmitted by the streaming client circuitry 128.
The operation flow/algorithmic structure 500 may further include, at 508, selecting a video QP value that corresponds to at least a portion of the encoded media content represented in the manifest file received at 504. The selecting of the video QP value may be performed by the streaming client circuitry 128. In some embodiments, the video QP value may be selected from a plurality of video QP values that respectively correspond to a plurality of sections of the encoded media content. In some embodiments, the plurality of video QP values, and their respective correspondences, may be provided to the UE in the manifest file received at 504 or embedded in a metadata file such as, for example, QP metadata file 300.
In some embodiments, the streaming client circuitry 128 may select the video QP value of interest based on display capabilities of the display device 144 and a desired video quality. For example, streaming client circuitry 128 may select the video QP value from the plurality of video QP values based on a determination that content encoded with the selected video QP value can be decoded and rendered on the display device 144 in a manner to meet the desired video quality.
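A minimal sketch of this selection logic follows. The capability model, the QP ceiling, and all names are illustrative assumptions rather than a definitive implementation of the streaming client circuitry 128; it assumes only that a lower QP means less quantization and hence higher fidelity.

```python
# Sketch: select a representation by video QP. The display-capability
# check and the QP ceiling are illustrative stand-ins for the logic of
# the streaming client circuitry.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Representation:
    rep_id: str
    width: int
    height: int
    qp: int  # video QP advertised in the manifest or QP metadata

def select_representation(reps: List[Representation], max_width: int,
                          max_height: int, qp_ceiling: int) -> Optional[Representation]:
    # Keep only representations the display can render...
    fits = [r for r in reps if r.width <= max_width and r.height <= max_height]
    # ...then prefer those meeting the desired quality (QP at or below
    # the ceiling), falling back to whatever fits if none do.
    candidates = [r for r in fits if r.qp <= qp_ceiling] or fits
    return min(candidates, key=lambda r: r.qp) if candidates else None

reps = [Representation("720p", 1280, 720, 28), Representation("480p", 854, 480, 34)]
print(select_representation(reps, max_width=1280, max_height=720, qp_ceiling=30))
```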
The operation flow/algorithmic structure 500 may further include, at 512, requesting a hierarchical level based on the video QP value. The requesting at 512 may be performed by the streaming client circuitry 128 transmitting an HTTP GET URL message to streaming host circuitry 124. The hierarchical level may be an adaptation set, a representation, or a sub-representation.
The operation flow/algorithmic structure 500 may further include, at 516, receiving and decoding content corresponding to the requested hierarchical level. The receiving at 516 may be performed by the streaming client circuitry 128 and the decoding at 516 may be performed by the decoding circuitry 132. In some embodiments, the decoding circuitry 132 may, during the decoding operations, record QP values. The recorded QP values may be fed back to the server 104 if certain conditions are met. For example, if the QP value determined at the decoding operation differs from the QP value that was provided in the QP metadata, the user equipment 108 may feed back an indication of the mismatch.
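One way such mismatch feedback might be implemented is sketched below; the report URL, payload fields, and tolerance are hypothetical, as the text does not fix a feedback format.

```python
# Sketch: compare the QP observed while decoding against the QP advertised
# in the manifest/metadata and, on a mismatch, feed an indication back to
# the server. The report URL, payload shape, and tolerance are hypothetical.
import json
from urllib import request

def report_qp_mismatch(report_url: str, segment_id: str,
                       advertised_qp: int, decoded_qp: int, tolerance: int = 0) -> None:
    if abs(advertised_qp - decoded_qp) <= tolerance:
        return  # values agree; no feedback needed
    payload = json.dumps({
        "segment": segment_id,
        "advertised_qp": advertised_qp,
        "decoded_qp": decoded_qp,
    }).encode("utf-8")
    req = request.Request(report_url, data=payload,
                          headers={"Content-Type": "application/json"})
    request.urlopen(req)  # an HTTP POST carrying the mismatch indication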
Following the decoding of the content by the decoding circuitry 132, the media player circuitry 136 may control the output circuitry 140, including the display device 144 and the audio device 148, to render the media content.
Figure 6 illustrates an example operation flow/algorithmic structure 600 of the UE 108 according to some embodiments.
The operation flow/algorithmic structure 600 may include, at 604, receiving a manifest file. The receiving of the manifest file at 604 may be similar to the receiving of the manifest file at 504 as discussed above with respect to Figure 5.
The operation flow/algorithmic structure 600 may further include, at 608, identifying video QP values for individual adaptation sets of one or more adaptation sets in the manifest file for a selected period. The identifying at 608 may be performed by the streaming client circuitry 128. In some embodiments, the video QP values may be identified directly from the manifest file. For example, the video QP values may be included in the manifest file. In some embodiments, the video QP values may be identified from a QP metadata file that corresponds to the manifest file. In some embodiments, the manifest file may include an indication of the presence of the corresponding QP metadata file. In the event the UE 108 is a QP-aware UE, that is, a UE capable of discerning and selecting specific QP values to increase the video quality of streamed content it renders, the UE 108 may transmit a request for the QP metadata file upon learning of its existence through the manifest file.
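A sketch of how a QP-aware client might discover and fetch the QP metadata file follows. The SupplementalProperty scheme URI and the use of its "value" attribute as a URL are invented for illustration; the text above does not fix a signaling mechanism.

```python
# Sketch: a QP-aware client checks the manifest for a pointer to a QP
# metadata file and fetches it if present. Scheme URI and attribute
# usage are hypothetical.
import xml.etree.ElementTree as ET
from typing import Optional
from urllib import request

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

def fetch_qp_metadata(mpd_xml: str) -> Optional[bytes]:
    root = ET.fromstring(mpd_xml)
    for prop in root.iterfind(".//dash:SupplementalProperty", NS):
        if prop.get("schemeIdUri") == "urn:example:qp-metadata":
            url = prop.get("value")  # URL of the QP metadata file
            with request.urlopen(url) as resp:  # HTTP GET for the metadata
                return resp.read()
    return None  # manifest does not indicate QP metadata
```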
The operation flow/algorithmic structure 600 may further include, at 612, requesting first encoded content for a selected period having a first video QP value. The requesting at 612 may be performed by the streaming client circuitry 128. The first encoded content may correspond to a first adaptation set, representation, etc. The first encoded content may be selected based on a determination that the UE 108 is capable of decoding and rendering the first encoded content on the display device 144 in a manner that meets the desired video quality.
The operation flow/algorithmic structure 600 may further include, at 616, dynamically switching streaming between different encoded portions of the media content to maintain a desired video quality value. The dynamically switching at 616 may be performed by the streaming client circuitry 128 in order to maintain a desired level of video quality.
For example, in some embodiments, the first encoded content may correspond to an adaptation set. The streaming client circuitry 128 may initially request a first representation of the adaptation set. While decoding and rendering the content of the first representation, the streaming client circuitry 128 may determine that the first representation fails, or has a high probability of failing, to meet the desired video quality. This may be the result of changing network conditions (for example, interference, loading, etc.), CPU load, or other dynamic factors. If the streaming client circuitry 128 determines that a second representation of the adaptation set, which may have a different bit rate, codec, quality information, etc., has an increased probability of being decoded and rendered in a manner that meets the desired video quality, the streaming client circuitry 128 may issue a request to receive the content of the second representation. In various embodiments, the second representation may include the same video QP value (given that the representation is within the adaptation set and, therefore, is associated with the video QP value that is associated with the adaptation set) but have other parameters that differ, for example, different bit rates, frame rates, resolutions, codec types, quality information, etc. In other embodiments, the second representation may have a QP value that is different from the first representation. This may be the case when general QP information (for example, an overall average video QP value) is provided with respect to the adaptation set and more specific QP information is provided with respect to the different representations.
In some embodiments, the first encoded content may correspond to a finer granularity than the adaptation set, for example, a representation, a sub-representation, one or more video frames, etc. For example, in one embodiment, the first encoded content may correspond to a first representation having a first video QP value. The streaming client circuitry 128 may select an adaptation set, which may or may not be selected based on a video QP value associated with the adaptation set, and may subsequently select a first representation based on a representation-level video QP value. In the event the streaming client circuitry 128 determines that the first representation fails, or has a high probability of failing, to meet the desired video quality, the streaming client circuitry 128 may request a second representation. The second representation may be associated with a different representation-level video QP value. The dynamic switching of streaming between different encoded portions described above may be done during an active adaptive streaming session. In some instances, the dynamic switching of streaming may be transparent to a user.
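The switching behavior described above can be sketched as follows. The quality predicate, the switch policy (prefer the least-quantized alternative), and the callbacks are illustrative assumptions.

```python
# Sketch: dynamic switching at segment boundaries, driven by a quality
# predicate standing in for the streaming client circuitry's decision.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Representation:
    rep_id: str
    qp: int  # representation-level video QP

def stream(reps: List[Representation], n_segments: int,
           fetch: Callable[[Representation, int], None],
           meets_quality: Callable[[Representation], bool]) -> None:
    current = reps[0]
    for i in range(n_segments):
        fetch(current, i)  # request and render segment i of `current`
        if not meets_quality(current):
            # Switch at the next segment boundary to the alternative with
            # the lowest QP, i.e., the least-quantized encoding.
            others = [r for r in reps if r is not current]
            if others:
                current = min(others, key=lambda r: r.qp)
```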
Figure 7 illustrates an example operation flow/algorithmic structure 700 of the server 104 according to some embodiments.
The operation flow/algorithmic structure 700 may include, at 704, encoding content sections of an HTTP adaptive stream. The encoding at 704 may be performed by encoding circuitry 116 to compress audio/video input 112 using one or more video or audio codecs.
The operation flow/algorithmic structure 700 may further include, at 708, recording video QP values associated with the encoded content sections. The recording of the video QP values may be performed by encoding circuitry 116 or streaming host circuitry 124.
The operation flow/algorithmic structure 700 may further include, at 712, generating a manifest file. The manifest file, as described above, may define hierarchical levels that include information about the encoded content that is available for adaptive streaming. In some embodiments, the video QP values may be recorded in the manifest file. In other embodiments, the video QP values may be recorded in a metadata file that is associated with the manifest file. The generating of the manifest file at 712 may be performed by the encoding circuitry 116 or the streaming host circuitry 124.
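For illustration, the server-side recording of per-representation QP values into a manifest might look like the following sketch; the "qp" attribute and document structure follow the same hypothetical convention used earlier.

```python
# Sketch: record per-representation video QP values into an MPD-like
# manifest on the server side. Attribute names are illustrative.
import xml.etree.ElementTree as ET

def build_manifest(period_id: str, reps) -> str:
    mpd = ET.Element("MPD", xmlns="urn:mpeg:dash:schema:mpd:2011")
    period = ET.SubElement(mpd, "Period", id=period_id)
    aset = ET.SubElement(period, "AdaptationSet", mimeType="video/mp4")
    for rep_id, bandwidth, qp in reps:
        # Attach the QP recorded during encoding to each Representation.
        ET.SubElement(aset, "Representation", id=rep_id,
                      bandwidth=str(bandwidth), qp=str(qp))
    return ET.tostring(mpd, encoding="unicode")

print(build_manifest("1", [("720p", 3000000, 28), ("480p", 1500000, 34)]))
```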
The operation flow/algorithmic structure 700 may further include, at 716, transmitting the manifest file and video QP values. The streaming host circuitry 124 may perform the transmitting at 716, by controlling the communication circuitry 126. In some embodiments, the manifest file and the video QP values may be transmitted separately. For example, in an embodiment in which the video QP values are recorded in a QP metadata file, the manifest file may be originally transmitted to the UE 108. If the UE 108 sends an additional request for the QP metadata file, the streaming host circuitry 124 may respond by transmitting the video QP values in the QP metadata file.
Figure 8 illustrates an example operation flow/algorithmic structure 800 of the server 104 according to some embodiments.
The operation flow/algorithmic structure 800 may include, at 804, transmitting video QP metadata. The transmitting of the video QP metadata may be performed by the streaming host circuitry 124 controlling the communication circuitry 126. The video QP metadata may apply to a byte range for a plurality of representations of an HTTP adaptive stream. In some embodiments, the video QP metadata may include a vectorized set of video QP values that describe video QP across different video frames or different media time intervals. The vectorized set of video QP values may be provided in a timed metadata track of a file having an ISO base media file format or 3GPP file format.
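A sketch of such a vectorized QP set follows. The record layout and lookup are illustrative; actual carriage would use a timed metadata track of a file in the ISO base media file format or 3GPP file format, as stated above.

```python
# Sketch: a vectorized set of video QP values keyed by media time
# intervals, with a simple lookup by playback time.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class QpSample:
    start_s: float     # media time at which the interval begins, seconds
    duration_s: float  # interval length, seconds
    qp: int            # mean video QP over the interval

def qp_at(samples: List[QpSample], t: float) -> Optional[int]:
    for s in samples:
        if s.start_s <= t < s.start_s + s.duration_s:
            return s.qp
    return None  # t falls outside the described media time range

track = [QpSample(0.0, 2.0, 28), QpSample(2.0, 2.0, 31), QpSample(4.0, 2.0, 26)]
print(qp_at(track, 3.0))  # -> 31
```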
The operation flow/algorithmic structure 800 may further include, at 808, receiving a request for segments. The streaming host circuitry 124 may receive the request at 808 from the UE 108. The segments may be within the byte range and in a representation selected by the UE 108 from a plurality of representations in a manifest file. The request may be received as an HTTP request with a URL identifying the specifically requested segments.
The operation flow/algorithmic structure 800 may further include, at 812, transmitting the requested segments. The streaming host circuitry 124 may transmit the requested segments by controlling the communication circuitry 126.
Figure 9 is a block diagram illustrating components, according to some example embodiments, able to read instructions from a machine-readable or computer- readable medium (e.g., a machine-readable storage medium) and perform any one or more of the methodologies discussed herein (for example, the techniques described with respect to operation flow/algorithmic structures of Figures 5-8). Specifically, Figure 9 shows a diagrammatic representation of computer system 900 including one or more processors (or processor cores) 910, one or more computer-readable media 920, and one or more communication resources 930, each of which are communicatively coupled via one or more interconnects 940.
The processors 910 may include one or more central processing units ("CPUs"), reduced instruction set computing ("RISC") processors, complex instruction set computing ("CISC") processors, graphics processing units ("GPUs"), digital signal processors ("DSPs") such as baseband processors, application specific integrated circuits ("ASICs"), radio-frequency integrated circuits ("RFICs"), etc. As shown, the processors 910 may include a processor 912 and a processor 914.
The computer-readable media 920 may be suitable for use to store instructions 950 that cause the computer system 900, in response to execution of the instructions 950 by one or more of the processors 910, to practice selected aspects of the present disclosure. In some embodiments, the computer-readable media 920 may be non-transitory. As shown, computer-readable storage medium 920 may include instructions 950. The instructions 950 may be programming instructions or computer program code configured to enable the computer system 900, which may be implemented as the UE 108 or the server 104, in response to execution of the instructions 950, to implement (aspects of) any of the methods or elements described throughout this disclosure related to adaptive video streaming. In some embodiments, the instructions 950 may be configured to enable a device, in response to execution of the programming instructions 950, to implement
(aspects of) any of the methods or elements described throughout this disclosure related to encoding video/audio content, recording QP information, generating manifest/metadata files, requesting and providing encoded content and metadata, etc. In some
embodiments, programming instructions 950 may be disposed on computer-readable media 920 that is transitory in nature, such as signals.
Any combination of one or more computer-usable or computer-readable media may be utilized as the computer-readable media 920. The computer-readable media 920 may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable media would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read only memory (ROM), an erasable programmable read-only memory (for example, EPROM, EEPROM, or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a transmission media such as those supporting the Internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable media could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. In the context of this document, a computer-usable or computer-readable media may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer- usable media may include a propagated data signal with the computer-usable program code embodied therewith, either in baseband or as part of a carrier wave. The computer- usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency, etc.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
As shown in Figure 9, the instructions 950 may reside, completely or partially, within at least one of the processors 910 (e.g., within the processor's cache memory), the computer-readable media 920, or any suitable combination thereof. Furthermore, any portion of the instructions 950 may be transferred to the processors 910 from any combination of the peripheral devices 904 and/or the databases 906. Accordingly, the memory of processors 910, the peripheral devices 904, and the databases 906 are additional examples of computer-readable media.
The communication resources 930 may include interconnection and/or network interface components or other suitable devices to communicate with one or more peripheral devices 904 and/or one or more remote devices, for example, databases 906, via a network 908. For example, the communication resources 930 may include wired communication components (e.g., for coupling via a Universal Serial Bus (USB)), cellular communication components, Near Field Communication (NFC) components, Bluetooth® components (e.g., Bluetooth® Low Energy), Wi-Fi® components, and other
communication components. In some embodiments, the communication resources 930 may include a cellular modem to communicate over a cellular network, an Ethernet controller to communicate over an Ethernet network, etc.
In some embodiments, one or more components of the computer system 900 may be included as a part of the UE 108 or the server 104 described with respect to Figure 1, or one or more components of the UE 108 or server 104 described with respect to Figure 1 may be included as a part of the computer system 900. For example, encoding circuitry 116, segmentation circuitry 120, streaming host circuitry 124, or communication circuitry 126 may include processors 910, computer-readable media 920, or communication resources 930 to facilitate operations described above with respect to the server 104. Similarly, output circuitry 140, media player circuitry 136, decoding circuitry 132, streaming client circuitry 128, or communication circuitry 130 may include processors 910, computer-readable media 920, or communication resources 930 to facilitate operations described above with respect to the UE 108.
The present disclosure is described with reference to flowchart illustrations or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations or block diagrams, and combinations of blocks in the flowchart illustrations or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for
implementing the functions/acts specified in the flowchart or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means that implement the function/act specified in the flowchart or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart or block diagram block or blocks.
Some non-limiting examples are provided below.
Example 1 may include an apparatus comprising: streaming client circuitry to: receive a manifest file for a hypertext transport protocol (HTTP) adaptive stream, the manifest file to define hierarchical levels that include information
about encoded media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with
individual adaptation sets to include one or more representations and
individual representations to include one or more segments; select a video quantization parameter (QP) value that corresponds to at least a portion of the encoded media content; request a hierarchical level based on the video QP value; and receive a first encoded content section, the first encoded content section to correspond to the requested hierarchical level; and decoding circuitry, coupled with the streaming client circuitry, to decode the first encoded content section.

Example 2 may include the apparatus of example 1, wherein the streaming client circuitry is further to: determine display capabilities of a user equipment; determine a desired video quality; and select, based on the display capabilities and the desired video quality, the video QP value from a plurality of video QP values that correspond to a respective plurality of encoded content sections.

Example 3 may include the apparatus of example 2, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.

Example 4 may include the apparatus of any one of examples 1-3, wherein the video QP value is included in: the manifest file or in metadata embedded in a Third Generation Partnership Project ("3GPP") file format.
Example 5 may include the apparatus of example 4, wherein the video QP value is included in the manifest file and the manifest file is a media presentation description for a dynamic adaptive streaming over HTTP (DASH) adaptation set.
Example 6 may include the apparatus of any one of examples 1-5, wherein the video QP value is a first video QP value and the streaming client circuitry is to report, to a content server, a second video QP value for a display device of a user equipment during reception, decoding, or rendering of the HTTP adaptive stream.

Example 7 may include the apparatus of any one of examples 1-6, wherein the video QP value corresponds to a mean luma QP averaged over a duration of the first encoded content section.
Example 8 may include the apparatus of any one of examples 1-7, wherein the streaming client circuitry is further to: select the video QP value from the manifest file for a plurality of sub-representations for a given period; and request sub-segments that are in at least one of the sub-representations and have the video QP value.
Example 9 may include the apparatus of any one of examples 1-8, wherein the video QP value is a first video QP value that corresponds to a first representation and the streaming client circuitry is further to switch to a second representation having a second video QP value at a selected access point in the HTTP adaptive stream to change a display performed by a mobile device in rendering the HTTP adaptive stream.
Example 10 may include the apparatus of any one of examples 1-9, wherein the streaming client circuitry is to receive an indication of the video QP value from a metadata management server.
Example 11 may include the apparatus of any one of examples 1-10, wherein the video QP value corresponds to an adaptation set, a representation, or a sub-representation of the manifest file.
Example 12 may include a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the media server comprising: an encoder to: encode content sections of an HTTP adaptive stream, and to record video quantization parameter ("QP") values associated with the encoded content sections; and streaming host circuitry to transmit a manifest file for the HTTP adaptive stream and the video QP values to a user equipment, the manifest file to define hierarchical levels that include information about the encoded content sections.
Example 13 may include the media server of example 12, wherein the hierarchical levels are to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments.
Example 14 may include the media server of example 13, wherein each representation of an adaptation set includes a same media file over a same byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.

Example 15 may include the media server of any one of examples 12-14, wherein individual video QP values correspond to an adaptation set, a representation, or a sub-representation of the manifest file and the streaming host circuitry is to: receive, from the user equipment, an indication of a selection of an adaptation set, a representation, or a sub-representation, wherein the selection is based on a video QP value corresponding to the selected adaptation set, representation, or sub-representation; and transmit encoded portions of media content that correspond to the selected adaptation set, representation, or sub-representation based on receipt of the indication.
Example 16 may include the media server of any one of examples 12-15, wherein the encoder is further to record a vectorized set of video QP values describing video QP across different video frames or media time intervals and the streaming host circuitry is further to cause transmission of the vectorized set in a timed metadata track of an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
Example 17 may include the media server of any one of examples 12-16, wherein the indication of the video QP values is in the manifest file, which is a media presentation description file, or in metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
Example 18 may include the media server of any one of examples 12-17, wherein the encoded content sections correspond to adaptation sets, representations, or segments.
Example 19 may include the media server of any one of examples 12-18, wherein the encoded content sections correspond to one or more frames of the HTTP adaptive stream.
Example 20 may include a user equipment ("UE") operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the UE comprising: communication circuitry to receive a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of
media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and streaming client circuitry to: identify video quantization parameter (QP) values for individual adaptation sets of the one or more adaptation sets in the manifest file for a selected period; request a first adaptation set for the selected period having a first video QP value; and dynamically switch streaming between different encoded portions of the media content in the first adaptation set to maintain a desired video quality value.
Example 21 may include the UE of example 20, wherein the streaming client circuitry is further to: determine the desired video quality based on capabilities of a display device of the UE; and identify the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
Example 22 may include the UE of example 20 or 21, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.
Example 23 may include the UE of any one of examples 20-22, wherein the video QP value is included in: the manifest file, which is a media presentation description ("MPD") file; or metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.

Example 24 may include the UE of any one of examples 20-23, wherein the streaming client circuitry is further to transmit, to a content server, via the communication circuitry, a desired video QP configuration for a display device of the UE.
Example 25 may include the UE of any one of examples 20-24, wherein the video QP value corresponds to a mean luma QP averaged over a segment duration.

Example 26 may include the UE of any one of examples 20-25, wherein the streaming client circuitry is further to: identify a second video QP value for a plurality of sub-representations for the selected period; and request sub-segments that are in at least one of the plurality of sub-representations and have the second video QP value.

Example 27 may include the UE of any one of examples 20-26, wherein the UE is to receive video QP configuration information from a metadata management server.
Example 28 may include one or more computer-readable media having instructions that, when executed, cause a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming to: transmit video QP metadata to a user equipment ("UE") over a byte range for a plurality of representations of an HTTP adaptive stream; receive a request for segments of the HTTP adaptive stream in the byte range in
a representation selected from the plurality of representations; and transmit the requested segments in the byte range to a mobile device.

Example 29 may include the one or more computer-readable media of example 28, wherein the byte range is for a period, an adaptation set, a representation, a segment or a sub-segment.
Example 30 may include the one or more computer-readable media of example 28 or 29, wherein the video QP metadata includes a vectorized set of video QP values that describe video QP across different video frames or different media time intervals, wherein the vectorized set of video QP values is provided in a timed metadata track of a file having an International Organization for Standardization ("ISO") base media file format or Third Generation Partnership Project ("3GPP") file format.
Example 31 may include the one or more computer-readable media of any one of examples 28-30, wherein the instructions, when executed, further cause the media server to transmit the video QP metadata in a manifest file for the HTTP adaptive stream that is a media presentation description ("MPD") for a dynamic adaptive streaming over HTTP (DASH) adaptation set, or in metadata embedded in a file having a Third
Generation Partnership Project ("3GPP") file format.

Example 32 may include the one or more computer-readable media of any one of examples 28-31, wherein the requested segments are located in an adaptation set that includes the plurality of representations, and at least one representation includes a plurality of segments.
Example 33 may include the one or more computer-readable media of any one of examples 28-32, wherein individual representations of the plurality of representations contain a same media file over the byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
Example 34 may include the one or more computer-readable media of any one of examples 28-33, wherein the request is a first request for first segments, the representation is a first representation, and the instructions, when executed, further cause the media server to: receive a second request for second segments of the HTTP adaptive stream in the byte range in a second representation selected from the plurality of representations; and transmit the second segments in the byte range to the UE.

Example 35 may include a user equipment ("UE") operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the UE comprising: communication circuitry to receive a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and streaming client circuitry to: identify video quantization parameter (QP) values for individual representations of a plurality of representations of the manifest file for a selected period; request a first representation for the selected period having a first video QP value; and dynamically switch streaming from the first representation to a
second representation to maintain a desired video quality value.
Example 36 may include the UE of example 35, wherein the streaming client circuitry is further to: determine the desired video quality based on capabilities of a display device of the UE; and identify the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
Example 37 may include the UE of example 35 or 36, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.

Example 38 may include the UE of any one of examples 35-37, wherein the first video QP value corresponds to a mean luma QP averaged over a segment duration.
Example 39 may include the UE of any one of examples 35 or 36, wherein the streaming client circuitry is further to: identify a second video QP value in the manifest file for a plurality of sub-representations for the selected period; and request sub-segments that are in at least one of the sub-representations and have a desired video
QP configuration.
Example 40 may include the UE of any one of examples 35-39, wherein the streaming client circuitry is to dynamically switch streaming to the second representation at a selected access point in the HTTP adaptive stream to change performance of a display device of the UE in rendering the HTTP adaptive stream.
Example 41 may include the UE of any one of examples 35-40, wherein the UE is to receive video QP configuration information from a metadata management server.
Example 42 may include one or more computer-readable media having instructions that, when executed, cause a user equipment to: identify a manifest file for a hypertext transport protocol (HTTP) adaptive stream, the manifest file to define hierarchical levels that include information about encoded media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; select a video
quantization parameter (QP) value that corresponds to at least a portion of the
encoded media content; request a hierarchical level based on the video QP value.
Example 43 may include the one or more computer-readable media of example 42, wherein the instructions, when executed, further cause the user equipment to: determine display capabilities of the user equipment; determine a desired video quality; and select, based on the display capabilities and the desired video quality, the video QP value from a plurality of video QP values that correspond to a respective plurality of encoded content sections.
Example 44 may include the one or more computer-readable media of example 43, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.

Example 45 may include the one or more computer-readable media of any one of examples 42-44, wherein the video QP value is included in: the manifest file or in metadata embedded in a Third Generation Partnership Project ("3GPP") file format.
Example 46 may include the one or more computer-readable media of example 45, wherein the video QP value is included in the manifest file and the manifest file is a media presentation description for a dynamic adaptive streaming over HTTP (DASH) adaptation set.
Example 47 may include the one or more computer-readable media of any one of examples 42-46, wherein the video QP value is a first video QP value and the instructions, when executed, further cause the user equipment to report, to a content server, a second video QP value for a display device of the user equipment during reception, decoding, or rendering of the HTTP adaptive stream.
Example 48 may include the one or more computer-readable media of any one of examples 42-47, wherein the video QP value corresponds to a mean luma QP averaged over a duration of a first encoded content section.

Example 49 may include the one or more computer-readable media of any one of examples 42-48, wherein the instructions, when executed, further cause the user equipment to: select the video QP value from the manifest file for a plurality of sub-representations for a given period; and request sub-segments that are in at least one of the sub-representations and have the video QP value.

Example 50 may include the one or more computer-readable media of any one of examples 42-49, wherein the video QP value corresponds to an adaptation set, a representation, or a sub-representation of the manifest file.
Example 51 may include a media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the media server comprising: means to encode content sections of an HTTP adaptive stream, and to record video quantization parameter ("QP") values associated with the encoded content sections; and means to transmit a manifest file for the HTTP adaptive stream and the video QP values to a user equipment, the manifest file to define hierarchical levels that include information about
the encoded content sections.
Example 52 may include the media server of example 51, wherein the hierarchical levels are to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments.

Example 53 may include the media server of example 52, wherein each representation of an adaptation set includes a same media file over a same byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
Example 54 may include the media server of any one of examples 51-53, wherein individual video QP values correspond to an adaptation set, a representation, or a sub-representation of the manifest file and the media server further comprises: means to receive, from the user equipment, an indication of a selection of an adaptation set, a representation, or a sub-representation, wherein the selection is based on a video QP value corresponding to the selected adaptation set, representation, or sub-representation; and means to transmit encoded portions of the media content that correspond to the selected adaptation set, representation, or sub-representation based on receipt of the indication.
Example 55 may include the media server of any one of examples 51-54, wherein the means to encode is further to record a vectorized set of video QP values describing video QP across different video frames or media time intervals and means to transmit is further to cause transmission of the vectorized set in a timed metadata track of an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
Example 56 may include the media server of any one of examples 51-55, wherein the indication of the video QP values is in the manifest file, which is a media presentation description file, or in metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
Example 57 may include the media server of any one of examples 51-56, wherein the encoded content sections correspond to adaptation sets, representations, or segments.

Example 58 may include the media server of any one of examples 51-57, wherein the encoded content sections correspond to one or more frames of the HTTP adaptive stream.
Example 59 may include a method of hypertext transfer protocol (HTTP) adaptive streaming, the method comprising: receiving a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and identifying video quantization parameter (QP) values for individual adaptation sets of the one or more adaptation sets in the manifest file for a selected period; requesting a first adaptation set for the selected period having a first video QP value; and dynamically switching streaming between different encoded portions of the media content in the first adaptation set to maintain a desired video quality value.
Example 60 may include the method of example 59, further comprising:
determining the desired video quality based on capabilities of a display device of the UE; and identifying the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
Example 61 may include the method of example 59 or 60, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.
Example 62 may include the method of any one of examples 59-61, wherein the video QP value is included in: the manifest file, which is a media presentation description ("MPD") file; or metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
Example 63 may include the method of any one of examples 59-62, wherein the method further comprises transmitting, to a content server, a desired video QP configuration for a display device of the UE.
Example 64 may include the method of any one of examples 59-63, wherein the video QP value corresponds to a mean luma QP averaged over a segment duration.
Example 65 may include the method of any one of examples 59-64, further comprising: identifying a second video QP value for a plurality of sub-representations for the selected period; and requesting sub-segments that are in at least one of the plurality of sub-representations and have the second video QP value.
Example 66 may include the method of any one of examples 59-65, further comprising receiving video QP configuration information from a metadata management server.
Example 67 may include a method of hypertext transfer protocol (HTTP) adaptive streaming comprising: transmitting video QP metadata to a user equipment ("UE") over a byte range for a plurality of representations of an HTTP adaptive stream; receiving a request for segments of the HTTP adaptive stream in the byte range in
a representation selected from the plurality of representations; and transmitting the requested segments in the byte range to a mobile device.
Example 68 may include the method of example 67, wherein the byte range is for a period, an adaptation set, a representation, a segment or a sub-segment.
Example 69 may include the method of example 67 or 68, wherein the video QP metadata includes a vectorized set of video QP values that describe video QP across different video frames or different media time intervals, wherein the vectorized set of video QP values is provided in a timed metadata track of a file having an International Organization for Standardization ("ISO") base media file format or Third Generation Partnership Project ("3GPP") file format.

Example 70 may include the method of any one of examples 67-69, further comprising transmitting the video QP metadata in a manifest file for the HTTP adaptive stream that is a media presentation description ("MPD") for a dynamic adaptive streaming over HTTP (DASH) adaptation set, or in metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
Example 71 may include the method of any one of examples 67-70, wherein the requested segments are located in an adaptation set that includes the plurality of representations, and at least one representation includes a plurality of segments.
Example 72 may include the method of any one of examples 67-71, wherein individual representations of the plurality of representations contain a same media file over the byte range that is encoded differently than other representations, wherein the encoding includes a bit rate, a frame rate, a resolution, or a codec type.
Example 73 may include the method of any one of examples 67-72, wherein the request is a first request for first segments, the representation is a first representation, and further comprising: receiving a second request for second segments of the HTTP adaptive stream in the byte range in a second representation selected from the plurality of representations; and transmitting the second segments in the byte range to the UE.
Example 74 may include a method of hypertext transfer protocol (HTTP) adaptive streaming, the method comprising: receiving a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; identifying video quantization parameter (QP) values for individual representations of a plurality of representations of the manifest file for a selected period; requesting a first representation for the selected period having a first video QP value; and dynamically switching streaming from the first representation to a second representation to maintain a desired video quality value.
Example 75 may include the method of example 74, further comprising:
determining the desired video quality based on capabilities of a display device of a user equipment (UE); and identifying the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
Example 76 may include the method of example 74 or 75, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.
Example 77 may include the method of any one of examples 74-76, further comprising: identifying a third video QP configuration in the manifest file for a plurality of sub-representations for the selected period; and requesting sub-segments that are in at least one of the sub-representations and have a desired video QP configuration.
Example 78 may include an apparatus configured to perform the method of any one of examples 59-77.
Example 79 may include one or more computer-readable media having instructions that, when executed, cause a device to perform the method of any one of examples 59-77.
The description herein of illustrated implementations, including what is described in the Abstract, is not intended to be exhaustive or to limit the present disclosure to the precise forms disclosed. While specific implementations and examples are described herein for illustrative purposes, a variety of alternate or equivalent embodiments or implementations calculated to achieve the same purposes may be made in light of the above detailed description, without departing from the scope of the present disclosure, as those skilled in the relevant art will recognize.

CLAIMS

What is claimed is:
1. An apparatus comprising:
streaming client circuitry to:
receive a manifest file for a hypertext transport protocol (HTTP) adaptive stream, the manifest file to define hierarchical levels that include information about encoded media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual
representations to include one or more segments;
select a video quantization parameter (QP) value that corresponds to at least a portion of the encoded media content;
request a hierarchical level based on the video QP value; and receive a first encoded content section, the first encoded content section to correspond to the requested hierarchical level; and
decoding circuitry, coupled with the streaming client circuitry, to decode the first encoded content section.
2. The apparatus of claim 1, wherein the streaming client circuitry is further to:
determine display capabilities of a user equipment;
determine a desired video quality; and
select, based on the display capabilities and the desired video quality, the video QP value from a plurality of video QP values that correspond to a respective plurality of encoded content sections.
3. The apparatus of claim 2, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, peak signal to noise ratio ("PSNR") or a perceptual evaluation of video quality ("PEVQ") metric.
4. The apparatus of claim 1, wherein the video QP value is included in: the manifest file or in metadata embedded in a Third Generation Partnership Project ("3GPP") file format.
5. The apparatus of claim 4, wherein the video QP value is included in the manifest file and the manifest file is a media presentation description for a dynamic adaptive streaming over HTTP (DASH) adaptation set.
6. The apparatus of claim 1, wherein the video QP value is a first video QP value and the streaming client circuitry is to report, to a content server, a second video QP value for a display device of a user equipment during reception, decoding, or rendering of the HTTP adaptive stream.
7. The apparatus of claim 1, wherein the video QP value corresponds to a mean luma QP averaged over a duration of the first encoded content section.
8. The apparatus of claim 1, wherein the streaming client circuitry is further to:
select the video QP value from the manifest file for a plurality of sub-representations for a given period; and
request sub-segments that are in at least one of the sub-representations and have the video QP value.
9. The apparatus of claim 1, wherein the video QP value is a first video QP value that corresponds to a first representation and the streaming client circuitry is further to switch to a second representation having a second video QP value at a selected access point in the HTTP adaptive stream to change a display performed by a mobile device in rendering the HTTP adaptive stream.
10. The apparatus of claim 1, wherein the streaming client circuitry is to receive an indication of the video QP value from a metadata management server.
11. The apparatus of any one of claims 1-10, wherein the video QP value corresponds to an adaptation set, a representation, or a sub-representation of the manifest file.
12. A media server operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the media server comprising:
an encoder to encode content sections of an HTTP adaptive stream and to record video quantization parameter ("QP") values associated with the encoded content sections; and
streaming host circuitry to transmit a manifest file for the HTTP adaptive stream and the video QP values to a user equipment, the manifest file to define hierarchical levels that include information about the encoded content sections.
13. The media server of claim 12, wherein the hierarchical levels are to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments.
14. The media server of claim 13, wherein each representation of an adaptation set includes a same media file over a same byte range that is encoded differently from the other representations, wherein the encoding differs in a bit rate, a frame rate, a resolution, or a codec type.
15. The media server of claim 12, wherein individual video QP values correspond to an adaptation set, a representation, or a sub-representation of the manifest file and the streaming host circuitry is to:
receive, from the user equipment, an indication of a selection of an adaptation set, a representation, or a sub-representation, wherein the selection is based on a video QP value corresponding to the selected adaptation set, representation, or sub-representation; and
transmit encoded portions of media content that correspond to the selected adaptation set, representation, or sub-representation based on receipt of the indication.
16. The media server of claim 12, wherein the encoder is further to record a vectorized set of video QP values describing video QP across different video frames or media time intervals and the streaming host circuitry is further to cause transmission of the vectorized set in a timed metadata track of an International Organization for Standardization ("ISO") base media file format or a Third Generation Partnership Project ("3GPP") file format.
17. The media server of claim 12, wherein an indication of the video QP values is in the manifest file, which is a media presentation description file, or in metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
18. The media server of claim 12, wherein the encoded content sections correspond to adaptation sets, representations, or segments.
19. The media server of claim 12, wherein the encoded content sections correspond to one or more frames of the HTTP adaptive stream.
20. A user equipment ("UE") operable to provide hypertext transfer protocol (HTTP) adaptive streaming, the UE comprising:
communication circuitry to receive a manifest file for an HTTP adaptive stream, the manifest file to define hierarchical levels that include information about encoded portions of media content available for adaptive streaming, the hierarchical levels to include one or more adaptation sets, representations, and segments with individual adaptation sets to include one or more representations and individual representations to include one or more segments; and
streaming client circuitry to:
identify video quantization parameter (QP) values for individual adaptation sets of the one or more adaptation sets in the manifest file for a selected period;
request a first adaptation set for the selected period having a first video QP value; and
dynamically switch streaming between different encoded portions of the media content in the first adaptation set to maintain a desired video quality value.
21. The UE of claim 20, wherein the streaming client circuitry is further to:
determine the desired video quality based on capabilities of a display device of the UE; and
identify the first video QP value based on a determination that the UE is to decode and render content with the first video QP value with the desired video quality.
22. The UE of claim 20 or 21, wherein the desired video quality corresponds to an objective video quality attribute, a subjective video quality attribute, or a quality metric that is a multi-scale structural similarity ("MS-SSIM") metric, a mean opinion score ("MOS") metric, a structural similarity ("SSIM") metric, a peak signal to noise ratio ("PSNR") metric, or a perceptual evaluation of video quality ("PEVQ") metric.
23. The UE of claim 20, wherein the video QP value is included in:
the manifest file, which is a media presentation description ("MPD") file; or metadata embedded in a file having a Third Generation Partnership Project ("3GPP") file format.
24. The UE of claim 20, wherein the streaming client circuitry is further to transmit, to a content server, via the communication circuitry, a desired video QP configuration for a display device of the UE.
25. The UE of claim 20, wherein the video QP value corresponds to a mean luma QP averaged over a segment duration.
PCT/US2016/060257 2016-08-30 2016-11-03 Quantization parameter reporting for video streaming WO2018044338A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201662381244P 2016-08-30 2016-08-30
US62/381,244 2016-08-30

Publications (1)

Publication Number Publication Date
WO2018044338A1 (en) 2018-03-08

Family

ID=57321452

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/060257 WO2018044338A1 (en) 2016-08-30 2016-11-03 Quantization parameter reporting for video streaming

Country Status (1)

Country Link
WO (1) WO2018044338A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220353435A1 (en) * 2021-04-29 2022-11-03 Cloudinary Ltd. System, Device, and Method for Enabling High-Quality Object-Aware Zoom-In for Videos

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140019593A1 (en) * 2012-07-10 2014-01-16 Vid Scale, Inc. Quality-driven streaming
US20140317308A1 (en) * 2013-04-19 2014-10-23 Futurewei Technologies, Inc Media Quality Information Signaling In Dynamic Adaptive Video Streaming Over Hypertext Transfer Protocol
US20160182594A1 (en) * 2014-12-19 2016-06-23 Cable Television Laboratories, Inc. Adaptive streaming

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
"3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Study on improved streaming Quality of Experience (QoE) reporting in 3GPP services and networks (Release 14)", 1 July 2016 (2016-07-01), XP051122507, Retrieved from the Internet <URL:http://www.3gpp.org/ftp/tsg_sa/WG4_CODEC/TSGS4_89/Docs/> [retrieved on 20160701] *
3GPP TR 26.909, June 2016 (2016-06-01)
3GPP TS 26.247 V13.3.0, 24 June 2016 (2016-06-24)
ADZIC VELIBOR ET AL: "Optimized adaptive HTTP streaming for mobile devices", APPLICATIONS OF DIGITAL IMAGE PROCESSING XXXIV, SPIE, 1000 20TH ST. BELLINGHAM WA 98225-6705 USA, vol. 8135, no. 1, 12 September 2011 (2011-09-12), pages 1 - 10, XP060021110, DOI: 10.1117/12.895546 *
ALBERTI CLAUDIO ET AL: "Automated QoE evaluation of Dynamic Adaptive Streaming over HTTP", 2013 FIFTH INTERNATIONAL WORKSHOP ON QUALITY OF MULTIMEDIA EXPERIENCE (QOMEX), IEEE, 3 July 2013 (2013-07-03), pages 58 - 63, XP032484707, DOI: 10.1109/QOMEX.2013.6603211 *
INTERNATIONAL ELECTROTECHNICAL COMMISSION ("IEC") 23009-1, 15 May 2014 (2014-05-15)
TAKAGI MOTOHIRO ET AL: "Subjective video quality estimation to determine optimal spatio-temporal resolution", 2013 PICTURE CODING SYMPOSIUM (PCS), IEEE, 8 December 2013 (2013-12-08), pages 422 - 425, XP032566997, DOI: 10.1109/PCS.2013.6737773 *
TECHNICAL SPECIFICATION (TS) 26.247, 24 June 2016 (2016-06-24)
THANG TRUONG CONG ET AL: "Adaptive video streaming over HTTP with dynamic resource estimation", JOURNAL OF COMMUNICATIONS AND NETWORKS, NEW YORK, NY, USA,IEEE, US, vol. 15, no. 6, 1 December 2013 (2013-12-01), pages 635 - 644, XP011537203, ISSN: 1229-2370, [retrieved on 20140115], DOI: 10.1109/JCN.2013.000112 *
TRUONG CONG THANG ET AL: "Adaptive streaming of audiovisual content using MPEG DASH", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE SERVICE CENTER, NEW YORK, NY, US, vol. 58, no. 1, 1 February 2012 (2012-02-01), pages 78 - 85, XP011434215, ISSN: 0098-3063, DOI: 10.1109/TCE.2012.6170058 *

Similar Documents

Publication Publication Date Title
US9351020B2 (en) On the fly transcoding of video on demand content for adaptive streaming
KR102266325B1 (en) Video quality enhancement
KR101912072B1 (en) Methods for quality-aware adaptive streaming over hypertext transfer protocol
US8782165B2 (en) Method and transcoding proxy for transcoding a media stream that is delivered to an end-user device over a communications network
TWI526062B (en) Quality-aware rate adaptation techniques for dash streaming
US9042449B2 (en) Systems and methods for dynamic transcoding of indexed media file formats
US10003626B2 (en) Adaptive real-time transcoding method and streaming server therefor
US20160037176A1 (en) Automatic and adaptive selection of profiles for adaptive bit rate streaming
CN107634930B (en) Method and device for acquiring media data
KR20150110603A (en) Method and apparatus for performing adaptive streaming on media contents
US10834161B2 (en) Dash representations adaptations in network
US20140226711A1 (en) System and method for self-adaptive streaming of multimedia content
CN106454271A (en) Video processing system and method
US10419581B2 (en) Data cap aware video streaming client
US11196795B2 (en) Method and apparatus for predicting video decoding time
WO2018044338A1 (en) Quantization parameter reporting for video streaming
EP4068779A1 (en) Cross-validation of video encoding
US20140133573A1 (en) Methods and apparatus for transcoding digital video data
US20220272394A1 (en) Systems and methods for improved adaptive video streaming

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 16795518
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: pct application non-entry in european phase
Ref document number: 16795518
Country of ref document: EP
Kind code of ref document: A1