WO2016004237A1 - Media presentation description signaling in typical broadcast content - Google Patents

Media presentation description signaling in typical broadcast content

Info

Publication number
WO2016004237A1
Authority
WO
WIPO (PCT)
Prior art keywords
component
segment
audio
processor
mpd
Prior art date
Application number
PCT/US2015/038879
Other languages
French (fr)
Inventor
Alexander GILADI
Original Assignee
Vid Scale, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vid Scale, Inc.
Publication of WO2016004237A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26258Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for generating a list of items to be played back in a given order, e.g. playlist, or scheduling item distribution according to such list
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43072Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of multiple content streams on the same device
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09CCIPHERING OR DECIPHERING APPARATUS FOR CRYPTOGRAPHIC OR OTHER PURPOSES INVOLVING THE NEED FOR SECRECY
    • G09C1/00Apparatus or methods whereby a given sequence of signs, e.g. an intelligible text, is transformed into an unintelligible sequence of signs by transposing the signs or groups of signs or by replacing them by others according to a predetermined system
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8106Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Definitions

  • OTT streaming utilizes the Internet as a delivery medium.
  • Network capabilities have evolved to make video delivery over the Internet viable.
  • Media Presentation Description signaling may be used to handle broadcast content in Dynamic Adaptive Streaming over HTTP (DASH).
  • DASH Dynamic Adaptive Streaming over HTTP
  • a device may include a processor configured to receive multimedia presentation description (MPD) information relating to content.
  • the processor may determine, based upon the MPD information, that the content comprises a multiplexed representation comprising a closed captioning component multiplexed with at least one of an audio component or a video component.
  • the processor may determine whether the closed captioning component is a CEA-608 closed captioning component or a CEA-708 closed captioning component.
  • the processor may determine whether display of the closed captioning component is supported on the device.
  • the processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
  • a device may include a processor configured to receive MPD information relating to content.
  • the processor may determine, based upon the MPD information, that the content comprises a multiplexed representation comprising two audio components multiplexed together, the audio components encoded according to different audio codecs.
  • the two audio components are in different languages.
  • the processor may determine a language of at least one audio component.
  • the processor may determine whether the two audio codecs are supported.
  • the processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
  • a device may include a processor configured to receive MPD information relating to content.
  • the processor may determine, based upon the MPD information, a packet identifier of a segment included in a multiplexed representation of the content.
  • the segment may be a video component.
  • the processor may determine a track identifier of at least one of a video component and an audio component in the segment.
  • the processor may determine an earliest presentation time (EPT) of a video component or an audio component in the segment.
  • the processor may determine the relative offset of an audio component in the segment from the EPT of a video component of the segment.
  • the processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
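  • As an illustration of the device behavior described above, the following is a minimal Python sketch of MPD-driven selection of a multiplexed representation; the dictionary fields and device-capability structure are illustrative assumptions, not MPD schema.

      def select_multiplexed_representation(representations, device):
          """Hypothetical selection logic: request a multiplexed representation only
          if its closed captioning format (e.g., CEA-608/708) and all of its
          multiplexed audio/video codecs are supported on the device."""
          for rep in representations:
              captions_ok = all(c["scheme"] in device["caption_formats"]
                                for c in rep["components"] if c["type"] == "caption")
              codecs_ok = all(c["codec"] in device["codecs"]
                              for c in rep["components"] if c["type"] in ("audio", "video"))
              if captions_ok and codecs_ok:
                  return rep  # the client would then issue HTTP GETs for its segments
          return None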
  • FIG. 1A is a system diagram of an example communications system in which one or more disclosed embodiments may be implemented.
  • FIG. 1B is a system diagram of an example wireless transmit/receive unit (WTRU) that may be used within the communications system illustrated in FIG. 1A.
  • WTRU wireless transmit/receive unit
  • FIG. 1C is a system diagram of an example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
  • FIG. 1D is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
  • FIG. 1E is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
  • FIG. 2 is a diagram of an example DASH system model.

DETAILED DESCRIPTION
  • FIG. 1A is a diagram of an example communications system 100 in which one or more disclosed embodiments may be implemented.
  • the communications system 100 may be a multiple access system that provides content, such as voice, data, video, messaging, broadcast, etc., to multiple wireless users.
  • the communications system 100 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth.
  • the communications systems 100 may employ one or more channel access methods, such as code division multiple access (CDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal FDMA (OFDMA), single-carrier FDMA (SC-FDMA), and the like.
  • CDMA code division multiple access
  • TDMA time division multiple access
  • FDMA frequency division multiple access
  • OFDMA orthogonal FDMA
  • SC-FDMA single-carrier FDMA
  • the communications system 100 may include wireless transmit/receive units (WTRUs) 102a, 102b, 102c, and/or 102d (which generally or collectively may be referred to as WTRU 102), a radio access network (RAN) 103/104/105, a core network 106/107/109, a public switched telephone network (PSTN) 108, the Internet 110, and other networks 112, though it will be appreciated that the disclosed embodiments contemplate any number of WTRUs, base stations, networks, and/or network elements.
  • Each of the WTRUs 102a, 102b, 102c, 102d may be any type of device configured to operate and/or communicate in a wireless environment.
  • the WTRUs 102a, 102b, 102c, 102d may be configured to transmit and/or receive wireless signals and may include user equipment (UE), a mobile station, a fixed or mobile subscriber unit, a pager, a cellular telephone, a personal digital assistant (PDA), a smartphone, a laptop, a netbook, a personal computer, a wireless sensor, consumer electronics, and the like.
  • UE user equipment
  • PDA personal digital assistant
  • the communications systems 100 may also include a base station 114a and a base station 114b.
  • Each of the base stations 114a, 114b may be any type of device configured to wirelessly interface with at least one of the WTRUs 102a, 102b, 102c, 102d to facilitate access to one or more communication networks, such as the core network 106/107/109, the Internet 110, and/or the networks 112.
  • the base stations 114a, 114b may be a base transceiver station (BTS), a Node-B, an eNode B, a Home Node B, a Home eNode B, a site controller, an access point (AP), a wireless router, and the like. While the base stations 114a, 114b are each depicted as a single element, it will be appreciated that the base stations 114a, 114b may include any number of interconnected base stations and/or network elements.
  • the base station 114a may be part of the RAN 103/104/105, which may also include other base stations and/or network elements (not shown), such as a base station controller (BSC), a radio network controller (RNC), relay nodes, etc.
  • BSC base station controller
  • RNC radio network controller
  • the base station 114a and/or the base station 114b may be configured to transmit and/or receive wireless signals within a particular geographic region, which may be referred to as a cell (not shown).
  • the cell may further be divided into cell sectors.
  • the cell associated with the base station 114a may be divided into three sectors.
  • the base station 114a may include three transceivers, i.e., one for each sector of the cell.
  • the base station 114a may employ multiple-input multiple-output (MIMO) technology and, therefore, may utilize multiple transceivers for each sector of the cell.
  • MIMO multiple-input multiple-output
  • the base stations 114a, 114b may communicate with one or more of the WTRUs 102a, 102b, 102c, 102d over an air interface 115/116/117, which may be any suitable wireless communication link (e.g., radio frequency (RF), microwave, infrared (IR), ultraviolet (UV), visible light, etc.).
  • the air interface 115/116/117 may be established using any suitable radio access technology (RAT).
  • RAT radio access technology
  • the communications system 100 may be a multiple access system and may employ one or more channel access schemes, such as CDMA, TDMA, FDMA, OFDMA, SC-FDMA, and the like.
  • the base station 114a in the RAN 103/104/105 and the WTRUs 102a, 102b, 102c may implement a radio technology such as Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may establish the air interface 115/116/117 using wideband CDMA (WCDMA).
  • UMTS Universal Mobile Telecommunications System
  • UTRA UMTS Terrestrial Radio Access
  • WCDMA wideband CDMA
  • WCDMA may include communication protocols such as High-Speed Packet Access (HSPA) and/or Evolved HSPA (HSPA+).
  • HSPA may include High-Speed Downlink Packet Access (HSDPA) and/or High-Speed Uplink Packet Access (HSUPA).
  • HSPA High-Speed Packet Access
  • HSDPA High-Speed Downlink Packet Access
  • HSUPA High-Speed Uplink Packet Access
  • the base station 114a and the WTRUs 102a, 102b, 102c may implement a radio technology such as Evolved UMTS Terrestrial Radio Access (E-UTRA), which may establish the air interface 115/116/117 using Long Term Evolution (LTE) and/or LTE-Advanced (LTE-A).
  • E-UTRA Evolved UMTS Terrestrial Radio Access
  • LTE Long Term Evolution
  • LTE-A LTE-Advanced
  • the base station 114a and the WTRUs 102a, 102b, 102c may implement radio technologies such as IEEE 802.16 (i.e., Worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000, CDMA2000 1X, CDMA2000 EV-DO, Interim Standard 2000 (IS-2000), Interim Standard 95 (IS-95), Interim Standard 856 (IS-856), Global System for Mobile communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), GSM EDGE (GERAN), and the like.
  • IEEE 802.16 i.e., Worldwide Interoperability for Microwave Access (WiMAX)
  • CDMA2000 Code Division Multiple Access 2000 (e.g., CDMA2000 1X, CDMA2000 EV-DO)
  • IS-95 Interim Standard 95
  • IS-856 Interim Standard 856
  • GSM Global System for Mobile communications
  • EDGE Enhanced Data rates for GSM Evolution
  • GERAN GSM EDGE Radio Access Network
  • the base station 114b in FIG. 1 A may be a wireless router, Home Node B, Home eNode B, or access point, for example, and may utilize any suitable RAT for facilitating wireless connectivity in a localized area, such as a place of business, a home, a vehicle, a campus, and the like.
  • the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.11 to establish a wireless local area network (WLAN).
  • the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.15 to establish a wireless personal area network (WPAN).
  • WLAN wireless local area network
  • WPAN wireless personal area network
  • the base station 114b and the WTRUs 102c, 102d may utilize a cellular-based RAT (e.g., WCDMA, CDMA2000, GSM, LTE, LTE-A, etc.) to establish a picocell or femtocell.
  • a cellular-based RAT e.g., WCDMA, CDMA2000, GSM, LTE, LTE-A, etc.
  • the base station 114b may have a direct connection to the Internet 110.
  • the base station 114b may not be required to access the Internet 110 via the core network 106/107/109.
  • the RAN 103/104/105 may be in communication with the core network 106/107/109, which may be any type of network configured to provide voice, data, applications, and/or voice over internet protocol (VoIP) services to one or more of the WTRUs 102a, 102b, 102c, 102d.
  • the core network 106/107/109 may provide call control, billing services, mobile location-based services, pre-paid calling, Internet connectivity, video distribution, etc., and/or perform high-level security functions, such as user authentication.
  • VoIP voice over internet protocol
  • the RAN 103/104/105 and/or the core network 106/107/109 may be in direct or indirect communication with other RANs that employ the same RAT as the RAN 103/104/105 or a different RAT.
  • in addition to being connected to the RAN 103/104/105, which may be utilizing an E-UTRA radio technology, the core network 106/107/109 may also be in communication with another RAN (not shown) employing a GSM radio technology.
  • the core network 106/107/109 may also serve as a gateway for the WTRUs 102a, 102b, 102c, 102d to access the PSTN 108, the Internet 110, and/or other networks 112.
  • the PSTN 108 may include circuit-switched telephone networks that provide plain old telephone service (POTS).
  • POTS plain old telephone service
  • the Internet 110 may include a global system of interconnected computer networks and devices that use common communication protocols, such as the transmission control protocol (TCP), user datagram protocol (UDP) and the internet protocol (IP) in the TCP/IP internet protocol suite.
  • the networks 112 may include wired or wireless communications networks owned and/or operated by other service providers.
  • the networks 112 may include another core network connected to one or more RANs, which may employ the same RAT as the RAN 103/104/105 or a different RAT.
  • FIG. 1B is a system diagram of an example WTRU 102.
  • the WTRU 102 may include a processor 118, a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, non-removable memory 130, removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and other peripherals 138. It will be appreciated that the WTRU 102 may include any subcombination of the foregoing elements while remaining consistent with an embodiment.
  • GPS global positioning system
  • the base stations 114a and 114b, and/or the nodes that base stations 114a and 114b may represent, such as but not limited to a base transceiver station (BTS), a Node-B, a site controller, an access point (AP), a home node-B, an evolved home node-B (eNodeB), a home evolved node-B (HeNB), a home evolved node-B gateway, and proxy nodes, among others, may include some or all of the elements depicted in FIG. 1B and described herein.
  • BTS transceiver station
  • AP access point
  • eNodeB evolved home node-B
  • HeNB home evolved node-B
  • the processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGA) circuits, any other type of integrated circuit (IC), a state machine, and the like.
  • DSP digital signal processor
  • the processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment.
  • the processor 118 may be coupled to the transceiver 120, which may be coupled to the transmit/receive element 122.
  • While FIG. 1B depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
  • the transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station (e.g., the base station 114a) over the air interface 115/116/117.
  • a base station e.g., the base station 114a
  • the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals.
  • the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, for example.
  • the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
  • the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 115/116/117.
  • the transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122.
  • the WTRU 102 may have multi-mode capabilities.
  • the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, for example.
  • the processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit).
  • the processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128.
  • the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132.
  • the non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device.
  • the removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like.
  • SIM subscriber identity module
  • SD secure digital
  • the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
  • the processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102.
  • the power source 134 may be any suitable device for powering the WTRU 102.
  • the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), etc.), solar cells, fuel cells, and the like.
  • the processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102.
  • location information e.g., longitude and latitude
  • the WTRU 102 may receive location information over the air interface 115/116/117 from a base station (e.g., base stations 114a, 114b) and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
  • the processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity.
  • the peripherals 138 may include an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
  • FIG. 1C is a system diagram of the RAN 103 and the core network 106 according to an embodiment.
  • the RAN 103 may employ a UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 115.
  • the RAN 103 may also be in communication with the core network 106.
  • the RAN 103 may include Node-Bs 140a, 140b, 140c, which may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 115.
  • the Node-Bs 140a, 140b, 140c may each be associated with a particular cell (not shown) within the RAN 103.
  • the RAN 103 may also include RNCs 142a, 142b. It will be appreciated that the RAN 103 may include any number of Node-Bs and RNCs while remaining consistent with an embodiment.
  • the Node-Bs 140a, 140b may be in communication with the RNC 142a. Additionally, the Node-B 140c may be in communication with the RNC 142b.
  • the Node-Bs 140a, 140b, 140c may communicate with the respective RNCs 142a, 142b via an Iub interface.
  • the RNCs 142a, 142b may be in communication with one another via an Iur interface.
  • Each of the RNCs 142a, 142b may be configured to control the respective Node-Bs 140a, 140b, 140c to which it is connected.
  • each of the RNCs 142a, 142b may be configured to carry out or support other functionality, such as outer loop power control, load control, admission control, packet scheduling, handover control, macrodiversity, security functions, data encryption, and the like.
  • the core network 106 shown in FIG. 1C may include a media gateway (MGW) 144, a mobile switching center (MSC) 146, a serving GPRS support node (SGSN) 148, and/or a gateway GPRS support node (GGSN) 150. While each of the foregoing elements is depicted as part of the core network 106, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
  • MGW media gateway
  • MSC mobile switching center
  • SGSN serving GPRS support node
  • GGSN gateway GPRS support node
  • the RNC 142a in the RAN 103 may be connected to the MSC 146 in the core network 106 via an IuCS interface.
  • the MSC 146 may be connected to the MGW 144.
  • the MSC 146 and the MGW 144 may provide the WTRUs 102a, 102b, 102c with access to circuit- switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land- line communications devices.
  • the RNC 142a in the RAN 103 may also be connected to the SGSN 148 in the core network 106 via an IuPS interface.
  • the SGSN 148 may be connected to the GGSN 150.
  • the SGSN 148 and the GGSN 150 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
  • the core network 106 may also be connected to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
  • FIG. 1D is a system diagram of the RAN 104 and the core network 107 according to an embodiment.
  • the RAN 104 may employ an E-UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 116.
  • the RAN 104 may also be in communication with the core network 107.
  • the RAN 104 may include eNode-Bs 160a, 160b, 160c, though it will be appreciated that the RAN 104 may include any number of eNode-Bs while remaining consistent with an embodiment.
  • the eNode-Bs 160a, 160b, 160c may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 116.
  • In one embodiment, the eNode-Bs 160a, 160b, 160c may implement MIMO technology.
  • the eNode-B 160a for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a.
  • Each of the eNode-Bs 160a, 160b, 160c may be associated with a particular cell (not shown) and may be configured to handle radio resource management decisions, handover decisions, scheduling of users in the uplink and/or downlink, and the like. As shown in FIG. 1D, the eNode-Bs 160a, 160b, 160c may communicate with one another over an X2 interface.
  • the core network 107 shown in FIG. 1D may include a mobility management entity (MME) 162, a serving gateway 164, and a packet data network (PDN) gateway 166. While each of the foregoing elements is depicted as part of the core network 107, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
  • MME mobility management entity
  • PDN packet data network
  • the MME 162 may be connected to each of the eNode-Bs 160a, 160b, 160c in the RAN 104 via an S1 interface and may serve as a control node.
  • the MME 162 may be responsible for authenticating users of the WTRUs 102a, 102b, 102c, bearer activation/deactivation, selecting a particular serving gateway during an initial attach of the WTRUs 102a, 102b, 102c, and the like.
  • the MME 162 may also provide a control plane function for switching between the RAN 104 and other RANs (not shown) that employ other radio technologies, such as GSM or WCDMA.
  • the serving gateway 164 may be connected to each of the eNode-Bs 160a, 160b, 160c in the RAN 104 via the S1 interface.
  • the serving gateway 164 may generally route and forward user data packets to/from the WTRUs 102a, 102b, 102c.
  • the serving gateway 164 may also perform other functions, such as anchoring user planes during inter-eNode B handovers, triggering paging when downlink data is available for the WTRUs 102a, 102b, 102c, managing and storing contexts of the WTRUs 102a, 102b, 102c, and the like.
  • the serving gateway 164 may also be connected to the PDN gateway 166, which may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
  • the PDN gateway 166 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
  • the core network 107 may facilitate communications with other networks.
  • the core network 107 may provide the WTRUs 102a, 102b, 102c with access to circuit- switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land-line communications devices.
  • the core network 107 may include, or may communicate with, an IP gateway (e.g., an IP multimedia subsystem (IMS) server) that serves as an interface between the core network 107 and the PSTN 108.
  • IMS IP multimedia subsystem
  • the core network 107 may provide the WTRUs 102a, 102b, 102c with access to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
  • FIG. 1E is a system diagram of the RAN 105 and the core network 109 according to an embodiment.
  • the RAN 105 may be an access service network (ASN) that employs IEEE 802.16 radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 117.
  • ASN access service network
  • the communication links between the different functional entities of the WTRUs 102a, 102b, 102c, the RAN 105, and the core network 109 may be defined as reference points.
  • the RAN 105 may include base stations 180a, 180b, 180c, and an ASN gateway 182, though it will be appreciated that the RAN 105 may include any number of base stations and ASN gateways while remaining consistent with an embodiment.
  • the base stations 180a, 180b, 180c may each be associated with a particular cell (not shown) in the RAN 105 and may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 117.
  • the base stations 180a, 180b, 180c may implement MIMO technology.
  • the base station 180a may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a.
  • the base stations 180a, 180b, 180c may also provide mobility management functions, such as handoff triggering, tunnel establishment, radio resource management, traffic classification, quality of service (QoS) policy enforcement, and the like.
  • the ASN gateway 182 may serve as a traffic aggregation point and may be responsible for paging, caching of subscriber profiles, routing to the core network 109, and the like.
  • the air interface 117 between the WTRUs 102a, 102b, 102c and the RAN 105 may be defined as an R1 reference point that implements the IEEE 802.16 specification.
  • each of the WTRUs 102a, 102b, 102c may establish a logical interface (not shown) with the core network 109.
  • the logical interface between the WTRUs 102a, 102b, 102c and the core network 109 may be defined as an R2 reference point, which may be used for authentication, authorization, IP host configuration management, and/or mobility management.
  • the communication link between each of the base stations 180a, 180b, 180c may be defined as an R8 reference point that includes protocols for facilitating WTRU handovers and the transfer of data between base stations.
  • the communication link between the base stations 180a, 180b, 180c and the ASN gateway 182 may be defined as an R6 reference point.
  • the R6 reference point may include protocols for facilitating mobility management based on mobility events associated with each of the WTRUs 102a, 102b, 102c.
  • the RAN 105 may be connected to the core network 109.
  • the communication link between the RAN 105 and the core network 109 may be defined as an R3 reference point that includes protocols for facilitating data transfer and mobility management capabilities, for example.
  • the core network 109 may include a mobile IP home agent (MIP-HA) 184, an authentication, authorization, accounting (AAA) server 186, and a gateway 188. While each of the foregoing elements is depicted as part of the core network 109, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
  • MIP-HA mobile IP home agent
  • AAA authentication, authorization, accounting
  • the MIP-HA 184 may be responsible for IP address management, and may enable the WTRUs 102a, 102b, 102c to roam between different ASNs and/or different core networks.
  • the MIP-HA 184 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
  • the AAA server 186 may be responsible for user authentication and for supporting user services.
  • the gateway 188 may facilitate interworking with other networks. For example, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land-line communications devices. In addition, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
  • the RAN 105 may be connected to other ASNs and the core network 109 may be connected to other core networks.
  • the communication link between the RAN 105 and the other ASNs may be defined as an R4 reference point, which may include protocols for coordinating the mobility of the WTRUs 102a, 102b, 102c between the RAN 105 and the other ASNs.
  • the communication link between the core network 109 and the other core networks may be defined as an R5 reference point, which may include protocols for facilitating interworking between home core networks and visited core networks.
  • OTT streaming may utilize the Internet as a delivery medium.
  • Video-capable devices may range from mobile devices to Internet set-top boxes (STBs) to network TVs.
  • STBs Internet set-top boxes
  • Network capabilities have evolved to make high-quality video delivery over the Internet viable.
  • “Closed” networks may be controlled by a multi-system operator (MSO).
  • MSO multi-system operator
  • the Internet may be a “best effort” environment.
  • bandwidth and latency may change (e.g., constantly).
  • Network conditions may be volatile in mobile networks. Dynamic adaptation to network changes may provide a tolerable user experience, for example, in volatile mobile networks.
  • Adaptive streaming may be considered similar to HTTP streaming.
  • UDP User Datagram Protocol
  • HTTP streaming may be used for internet video streaming.
  • the use of HTTP for Internet video streaming may be attractive and scalable, for example, due to the existing HTTP infrastructure.
  • the existing HTTP infrastructure may include, for example, one or more content distribution networks (CDNs) and the ubiquity of HTTP support on multiple platforms and devices.
  • CDNs content distribution networks
  • the use of HTTP streaming for internet video streaming may be attractive, for example, due to firewall penetration. Firewalls may disallow UDP traffic. Video over HTTP may be available behind firewalls.
  • HTTP streaming may be desirable for rate-adaptive streaming.
  • An asset may be segmented virtually or physically, for example, in HTTP adaptive streaming.
  • An asset may be published to CDNs, for example, in HTTP adaptive streaming.
  • Intelligence may reside in the client (e.g., DASH client).
  • the client may acquire the knowledge of the published alternative encodings (e.g., representations), for example, via a Media Presentation Description (MPD).
  • the client may learn how to construct URLs to download a segment from a given representation.
  • An adaptive bit rate (ABR) client may observe network conditions.
  • An ABR client may decide which combination of bitrate, resolution, etc. may provide the best quality of experience on the client device at a given instance of time.
  • the ABR client may determine the optimal URL to use.
  • the ABR client may issue an HTTP GET request to download a segment, for example, once the ABR client determines the optimal URL to use.
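  • A minimal sketch of such an ABR decision loop, assuming each representation exposes its MPD @bandwidth value and that throughput is measured in bits per second (the 0.8 safety factor is an illustrative choice):

      import urllib.request

      def pick_representation(representations, measured_bps, safety=0.8):
          """Choose the highest-bandwidth representation that fits the measured throughput."""
          viable = [r for r in representations if r["bandwidth"] <= measured_bps * safety]
          if viable:
              return max(viable, key=lambda r: r["bandwidth"])
          return min(representations, key=lambda r: r["bandwidth"])  # fall back to lowest

      def fetch_segment(url):
          """Issue the HTTP GET for the chosen segment URL."""
          with urllib.request.urlopen(url) as response:
              return response.read()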
  • DASH may be built on top of the HTTP/TCP/IP stack.
  • DASH may define a manifest format (e.g., Media Presentation Description (MPD)), and/or segment formats for ISO Base Media File Format and/or MPEG-2 Transport Streams.
  • MPD Media Presentation Description
  • DASH may define a set of quality metrics at the network, client operation, media presentation levels, etc. The set of quality metrics may enable an interoperable way of monitoring Quality of Experience (QoE) and Quality of Service (QoS).
  • QoE Quality of Experience
  • QoS Quality of Service
  • a representation may be a core concept of DASH.
  • a representation may be a single encoded version of the complete asset.
  • a representation may be a single encoded version of a subset of the components of a complete set.
  • a representation may be, for example, an ISO-BMFF file including unmultiplexed 2.5 Mbps 720p AVC video, with separate ISO-BMFF representations for 96 Kbps MPEG-4 AAC audio in different languages.
  • DASH may provide an unmultiplexed representation for audio and a separate unmultiplexed representation for video.
  • a single transport stream including two or more of video, audio, and/or subtitles may be a single multiplexed representation.
  • a combined structure may comprise video and English audio as a single multiplexed representation, along with for example, Spanish and Chinese audio tracks as separate unmultiplexed representations.
  • a segment may be the minimal individually addressable unit of media data.
  • a segment may be the entity that may be downloaded using URLs advertised via the MPD.
  • a media segment may be a four second part of a live broadcast, starting at a given playout time.
  • a media segment may be a complete on-demand movie available for the whole period the movie is licensed.
  • the MPD may be an XML document.
  • the MPD may be an XML document that may advertise the available media.
  • the MPD may provide information requested by the client to select a representation, make adaptation decisions, and/or retrieve segments from the network.
  • the MPD may be independent of a segment.
  • the MPD may signal the properties required to determine whether a representation may be successfully played.
  • the MPD may signal the functional properties (e.g., whether segments start at random access points).
  • the MPD may use a hierarchical data model, for example, to describe the complete presentation.
  • the MPD may signal information, such as at the lowest conceptual level of hierarchical data.
  • the information signaled may include bandwidth and/or codecs that may be used for successful presentation and/or ways of constructing URLs for accessing segments.
  • Additional information may be provided at the lowest conceptual level of hierarchical data, such as trick mode, random access information, layer and view information for scalable and multiview codecs, and generic schemes which may be supported by a client wishing to play a given representation.
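  • The hierarchy can be sketched with a toy MPD walked top-down; the element values below are illustrative, while the namespace is the standard MPD namespace:

      import xml.etree.ElementTree as ET

      NS = "{urn:mpeg:dash:schema:mpd:2011}"
      mpd_xml = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
        <Period>
          <AdaptationSet mimeType="video/mp2t">
            <Representation id="v1" bandwidth="2500000" codecs="avc1.4d401f"/>
            <Representation id="v2" bandwidth="1000000" codecs="avc1.4d401e"/>
          </AdaptationSet>
        </Period>
      </MPD>"""

      root = ET.fromstring(mpd_xml)
      for period in root.iter(NS + "Period"):               # period-level properties
          for aset in period.iter(NS + "AdaptationSet"):    # group of switchable representations
              for rep in aset.iter(NS + "Representation"):  # lowest conceptual level
                  print(rep.get("id"), rep.get("bandwidth"), rep.get("codecs"))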
  • DASH may provide a flexible URL construction functionality.
  • Single monolithic per-segment URLs may provide rigid URL construction functionality.
  • Single monolithic per-segment URLs may be possible in DASH.
  • DASH may allow dynamic construction of URLs.
  • DASH may allow dynamic construction of URLs, for example, by combining parts of the URL (e.g., base URLs) that may appear at different levels of the hierarchical data model.
  • Segments may provide multi-path functionality, for example, if multiple base URLs are used, with segments requested from one or more locations. Multi-path functionality may improve performance and reliability.
  • a list of URLs and byte ranges may reach several thousand elements per representation.
  • DASH may allow the use of predefined variables (e.g., segment number, segment time, etc.). DASH may allow the use of printf-style syntax for quick construction of URLs using templates.
  • a number of segments may be listed explicitly as all segments (e.g., seg_00001.ts, seg_00002.ts, ..., seg_03600.ts).
  • a number of segments may be expressed as a single line (e.g., seg_$Index%05d$.ts), for example, even if the segments cannot be retrieved at the time the MPD is fetched. Templates may be helpful for multi-segment representations, for example, due to template efficiency.
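  • A sketch of template expansion in the printf style described above, keeping the document's seg_$Index%05d$.ts example (standardized MPDs use identifiers such as $Number$ and $Time$; $Index$ is retained here to match the text):

      import re

      def expand_template(template, **values):
          """Expand a simplified DASH-style $Identifier%05d$ template."""
          def repl(match):
              name, fmt = match.group(1), match.group(2)
              return ("%" + fmt) % values[name] if fmt else str(values[name])
          return re.sub(r"\$(\w+)(?:%(\d*d))?\$", repl, template)

      print(expand_template("seg_$Index%05d$.ts", Index=42))  # seg_00042.ts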
  • Adaptation sets may be groups of different representations of the same asset and/or the same component, for example, in the un-multiplexed case.
  • a client may switch between representations within an adaptation set.
  • an adaptation set may be a collection of ten representations with video encoded in different bitrates and/or resolutions. Representation switching may occur at a segment and/or a subsegment border, for example, while presenting the same content to the viewer.
  • Segment-level restrictions may be used in practical applications, such as DASH profiles and DASH subsets adopted by multiple SDOs. Segment-level restrictions may be applied to representations within an adaptation set.
  • a period may be a time-limited subset of a presentation. Adaptation sets may be valid within the period. Adaptation sets in different periods may include different representations (e.g., in terms of codecs, bitrates, etc.).
  • An MPD may include a single period for the whole duration of the asset. Periods may be used for ad markup, for example, where separate periods are dedicated to parts of the asset and/or to an advertisement.
  • the MPD may be an XML document that presents a hierarchy.
  • the hierarchy may, for example, start at global presentation-level properties (e.g., timing) and continue to period- level properties and/or adaptation sets available for that period. Representations may be at the lowest level of this hierarchy.
  • DASH may use a simplified version of XLink, for example, to allow loading parts of the MPD (e.g., periods) in real time from a remote location. For example, in ad insertion, precise timing of ad breaks may be known ahead of time and ad servers may determine the exact ad in real time.
  • An MPD may be dynamic or static.
  • a dynamic MPD may change.
  • a dynamic MPD may be periodically reloaded by the client.
  • a static MPD may be valid for the whole
  • a static MPD may be used in VoD applications.
  • a dynamic MPD may be used for live and PVR applications, for example.
  • Media segments may be time-bounded parts of a representation. Media segment durations may approximate the segment durations that appear in the MPD. Segment duration may be different for one or more segments. Segment durations may be constant and/or close to constant for segments (e.g., DASH-AVC/264 may use segments with durations within a 25% tolerance margin).
  • the MPD may include information regarding media segments that may be unavailable at the time the MPD is read by the client, for example, in a live broadcast scenario. Segments may be available within a defined availability time window. The time window may be calculated from the wall-clock time and/or segment duration.
  • An index segment may be a segment type. Index segments may appear as side files. Index segments may appear within media segments. Index segments may include timing and/or random access information. Index segments may make efficient implementation of random access and trick modes. Index segments may be used for more efficient bitstream switching. Index segments may be used for VoD and PVR type of applications. Index segments may be used less in live cases.
  • Segment-level and/or representation-level properties may be used to implement efficient bitstream switching.
  • DASH may provide functional requirements for segment-level and/or representation-level properties.
  • Segment-level and/or representation-level properties may be expressed in the MPD, for example, in a format-independent way.
  • Segment format specifications may include the format-level restrictions that may correspond to generic requirements.
  • a media segment may be denoted as i.
  • a representation may be denoted as R.
  • a media segment i of a representation R may be denoted as S_R(i).
  • the duration of S_R(i) may be denoted as D(S_R(i)).
  • the earliest presentation time of S_R(i) may be denoted as EPT(S_R(i)). EPT may correspond to the earliest presentation time of the segment. The earliest presentation time may not be the time at which a segment may be successfully played out at random access.
  • Time alignment may be used for efficient switching between representations in an adaptation set. Efficient switching between representations Ra and Rb in an adaptation set may follow EPT(S_Ra(i)) = EPT(S_Rb(i)) for each segment i in the adaptation set.
  • The ability to switch at a segment border without overlapped downloads and/or dual decoding may occur, for example, when switching follows this time alignment.
  • The ability to switch at a segment border without overlapped downloads and/or dual decoding may occur, for example, when a segment starts with a random access point of certain types.
  • Bitstream switching at a subsegment level may occur, for example, when indexing is used. Bitstream switching at a subsegment level may occur, for example, when subsegment switching follows EPT(S_Ra(i)) ≥ EPT(S_Rb(i-1)) + D(S_Rb(i-1)). Bitstream switching at a subsegment level may occur, for example, when a subsegment starts with a random access point of certain types.
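  • These conditions reduce to simple comparisons on the EPT and D values defined above; a sketch, with Rb the representation being switched from and Ra the representation being switched to:

      def segments_time_aligned(ept_ra_i, ept_rb_i):
          """Segment-level time alignment: EPT(S_Ra(i)) == EPT(S_Rb(i))."""
          return ept_ra_i == ept_rb_i

      def can_switch_at_subsegment(ept_ra_i, ept_rb_prev, dur_rb_prev):
          """Subsegment switching condition from the text:
          EPT(S_Ra(i)) >= EPT(S_Rb(i-1)) + D(S_Rb(i-1))."""
          return ept_ra_i >= ept_rb_prev + dur_rb_prev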
  • Systems may utilize time alignment and/or random access point placement restrictions. Restrictions may correspond to encodings with matching instantaneous decoder refresh (IDR) frames at segment borders and/or closed group of pictures (GOPs), for example, in video encoding.
  • IDR instantaneous decoder refresh
  • FIG. 2 is an example DASH system model 200.
  • a DASH client may include one or more of an access client (e.g., a DASH access engine 202), a media engine (e.g., a media engine 204), and/or an application (e.g., an application 206).
  • the DASH access engine 202 may be an HTTP client.
  • the DASH access engine 202 may receive an MPD and/or segment data, for example, via a CDN (not shown).
  • the DASH access engine 202 may send media (e.g., and timing) to the media engine 204.
  • the media may be in an MPEG format (e.g., MPEG-2 TS) or an ISO format (e.g., ISO-BMFF).
  • the media engine 204 may decode and present the media provided from the DASH access engine 202.
  • The DASH access engine 202 may pass events (e.g., and timing) to an application 206.
  • the on-the-wire format interfaces of the MPD and/or segments may be defined. Other interfaces may be defined according to implementers' discretion.
  • Timing behavior of a DASH client may be complex. Segments mentioned in a manifest may be valid, for example, in Apple HLS. A client may poll for new manifests, for example, in Apple HLS. DASH MPD may reduce polling behavior. DASH MPD may reduce polling behavior, for example, by defining MPD update frequency and/or allowing calculation of segment availability.
  • a static MPD may be valid.
  • a static MPD may always be valid.
  • a dynamic MPD may be valid.
  • a dynamic MPD may be valid, for example, from the time the dynamic MPD was fetched by the client for the duration of a refresh period.
  • An MPD may expose publication time.
  • MPD may provide the availability time of the earliest segment of a period.
  • the availability time of the earliest segment of a period may be denoted as T_A(0).
  • a media segment may be denoted as n.
  • a media segment n may be available, for example, starting from time T_A(n) = T_A(0) + Σ D(S_R(i)), with the sum taken over segments i = 1, ..., n.
  • a time shift buffer may be denoted as Ts.
  • the time shift buffer Ts may be stated in the MPD.
  • the window size availability may impact the catch-up TV functionality of a DASH deployment.
  • Segment availability time may be relied upon by the access client, for example, if the segment availability is within the MPD validity period.
  • MPD may declare bandwidth, for example, for a representation R. Bandwidth may be denoted as B_R.
  • MPD may define a global minimum buffering time. A global minimum buffering time may be denoted as BT_min.
  • An access client may be able to pass a segment to the media engine.
  • An access client may be able to pass a segment to the media engine, for example, after B_R × BT_min bits were downloaded.
  • a segment may start with a random access point.
  • the earliest time segment n that may be passed to the media engine may be denoted as T_A(n) + T_D(n) + BT_min.
  • the download time of segment n may be denoted as T_D(n).
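  • Both timing rules are direct sums; a minimal sketch using the notation above, with durations given as a list of D(S_R(i)) values in seconds:

      def availability_time(t_a_0, durations, n):
          """T_A(n) = T_A(0) + sum of D(S_R(i)) over segments i = 1..n."""
          return t_a_0 + sum(durations[:n])

      def earliest_pass_time(t_a_n, t_d_n, bt_min):
          """Earliest time segment n may be passed to the media engine:
          T_A(n) + T_D(n) + BT_min."""
          return t_a_n + t_d_n + bt_min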
  • a DASH client may start the playout immediately, for example, to minimize delay.
  • MPD may propose a presentation delay (e.g., as an offset from T_A(n)).
  • MPD may propose a presentation delay, for example, to ensure tight synchronization between different clients.
  • MPD validity and/or segment availability may be calculated using absolute (e.g., wall-clock) time.
  • Media time may be expressed within the segments.
  • Drift may develop between the encoder and client clocks, for example, in the live case.
  • Drift may be addressed at the container level.
  • MPEG-2 TS and ISO-BMFF provide synchronization functionality.
  • HTTP may be stateless and/or client-driven. "Push"-style events may be emulated using polls (e.g., frequent polls). Upcoming ad breaks may be signaled three to eight seconds before the start of the ad break, for example, in current ad insertion practice in cable/IPTV systems. A poll-based implementation may be inefficient. Events may address inefficient ad insertion.
  • Events may include timing information. Events may include time and duration information. Events may include payload information. Payload information may include arbitrary information. Events may have application-specific payloads. Inband events may be small message boxes. Small message boxes may appear at the beginning of media segments. MPD events may be a period-level list of timed elements. DASH may define an MPD validity expiration event. An MPD validity event may identify the earliest MPD version valid after a given presentation time.
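  • DASH inband events are carried in 'emsg' message boxes at the start of media segments; the following is a hedged sketch of a parser for the version-0 'emsg' body defined in ISO/IEC 23009-1 (version-1 boxes use a different field layout and are not handled):

      import struct

      def parse_emsg_v0(body):
          """Parse a version-0 'emsg' box body (the bytes after the box header)."""
          if body[0] != 0:
              raise ValueError("only version-0 emsg handled in this sketch")
          pos = 4  # skip version (1 byte) and flags (3 bytes)

          def cstring(buf, start):
              end = buf.index(b"\x00", start)
              return buf[start:end].decode("utf-8"), end + 1

          scheme_id_uri, pos = cstring(body, pos)
          value, pos = cstring(body, pos)
          timescale, time_delta, duration, event_id = struct.unpack_from(">IIII", body, pos)
          message_data = body[pos + 16:]  # application-specific payload
          return scheme_id_uri, value, timescale, time_delta, duration, event_id, message_data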
  • DASH may be agnostic to digital rights management (DRM).
  • DASH may support signaling a DRM scheme.
  • DASH may support signaling DRM scheme properties within the MPD.
  • a DRM scheme may be signaled via the ContentProtection descriptor.
  • An opaque value may be passed within a DRM scheme.
  • a unique identifier for a scheme may be used to signal a DRM scheme.
  • the meaning of the opaque value may be defined to signal a DRM scheme.
  • a scheme-specific namespace may be used to signal a DRM scheme.
  • MPEG may provide content protection standards, such as Common Encryption for ISO-BMFF (CENC) and Segment Encryption and Authentication.
  • Common encryption may standardize which parts of a sample are encrypted and/or how encryption metadata may be signaled within a track.
  • the DRM module may be responsible for delivering the keys to the client, for example, given the encryption metadata in the segment.
  • Decryption may use standard AES-CTR and/or AES-CBC modes.
  • the CENC framework may be extensible. The CENC framework may use other encryption algorithms beyond AES-CTR and/or AES-CBC modes, for example, if other encryption algorithms are defined.
  • Common Encryption may be used with several commercial DRM systems. Common encryption may be the system used in DASH (e.g., DASH264).
  • DASH Segment Encryption and Authentication may be agnostic to the segment format.
  • Encryption metadata may be passed via the MPD.
  • MPD may include information regarding the key that may be used for decryption of a segment.
  • MPD may include information regarding how to obtain the key that may be used for decryption of a segment.
  • the baseline system may be equivalent to the one defined in HLS, for example, with AES-CBC encryption and HTTPS-based key transport.
  • the baseline system equivalent to the one defined in HLS may make MPEG-2 TS media segments compatible with encrypted HLS segments.
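  • A minimal sketch of the baseline decryption path, assuming an AES-128 key retrieved over HTTPS and a per-segment IV, as in HLS; it uses the pyca/cryptography package, and padding removal is omitted:

      from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

      def decrypt_segment(ciphertext, key, iv):
          """AES-CBC decryption of an encrypted media segment (HLS-compatible baseline)."""
          decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
          return decryptor.update(ciphertext) + decryptor.finalize()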
  • the DASH-SEA standard may be extensible.
  • the DASH-SEA standard may allow other encryption algorithms and/or more DRM systems, for example, similar to CENC.
  • DASH-SEA may offer a segment authenticity framework.
  • Segment authenticity framework may ensure that the segment received by the client is the same and/or similar to the segment the MPD author intended the client to receive.
  • Segment authenticity framework may use MAC and/or digest algorithms, for example, to prevent content modification within the network (e.g., ad replacement, altering inband events, etc.).
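  • A sketch of a digest-based authenticity check, assuming the expected SHA-256 digest of each segment is conveyed to the client (the transport of the digest is an illustrative assumption):

      import hashlib
      import hmac

      def segment_is_authentic(segment_bytes, expected_sha256_hex):
          """Compare the received segment's digest against the advertised digest."""
          actual = hashlib.sha256(segment_bytes).hexdigest()
          return hmac.compare_digest(actual, expected_sha256_hex)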
  • each representation may include one content component (e.g., video, audio in a specific language, etc.).
  • MPEG-2 transport stream segments may be used as an example herein, but all examples are equally relevant to all segment types.
  • a single asset may be split into several periods.
  • the content component used for timing calculations may be difficult to express.
  • a drift may develop between audio and video.
  • a glitch in a transition may occur, for example, if an inserted period is removed.
  • a glitch in a transition may occur, for example, as audio and video durations of previous periods may differ.
  • determinations made in Internet radio may be performed relative to audio, closed captioning, video, etc.
  • Video may be a slide show at a very slow, possibly variable framerate.
  • a "master" component relative to which all calculations may be made may be expressed.
  • L1 and L2 may use different codecs (e.g., respectively C1 and C2).
  • C1 may be needed for L1 and C2 may be needed for L2.
  • the lack of support for codec C2 may disallow playing in language L2.
  • the playing in language L1 may be allowed.
  • languages L1 and L2 may be English AC-3 and Spanish MP2.
  • a language may be disallowed, for example, if legacy US content is being retrofitted without re-encoding.
  • Closed captioning may be CEA-608 or CEA-708 as defined in SCTE 128-1.
  • An implementer may recognize differences between multiplexed and burned-in closed captioning and/or subtitles.
  • CEA 608/708 closed captioning may be turned on and off. Burned-in captioning may be unable to be turned on and off.
  • CEA 608/708 implementation may be used with a video codec to display CEA 608/708 closed captioning. The video codec may be used independently to display burned-in closed captioning.
  • signaling and/or handling of a changing aspect ratio, and/or signaling and/or handling of an intended display, may use inband signaling, for example, when multiple assets are in a single period (e.g., ad splicing done upstream, with different video characteristics).
  • Signaling aspect ratio in DASH may be utilized at the MPD level.
  • the player may be unaware that the signaled aspect ratio may change, for example, when signaling aspect ratio in DASH.
  • Time discontinuities may occur, for example, when different streams are spliced together. Discontinuities may be announced, for example, in MPEG-2 TS. Discontinuities may result in a PCR-to-PCR difference, for example, in MPEG-2 TS. Signaling the player to estimate the EPT of a segment using segment duration (e.g., rather than directly using timestamps) may be utilized to handle such discontinuities and the PCR-to-PCR difference.
  • ContentComponent type in MPEG DASH may be used to express different content components of a multiplexed representation.
  • ContentComponent element definitions may appear in ISO/IEC 23009-1:2014, Section 5.3.4.
  • ContentComponent may be extended to provide additional attributes to express parameters.
  • ContentComponent may be extended to provide mapping between ContentComponent and appropriate packet identifiers (PID) (e.g., or track) in a stream.
  • ContentComponent may be defined per application set.
  • ContentComponent may be defined per representation.
  • Table 1 is an example of ContentComponent element definitions, with attributes added to express parameters.
  • the @codecs element in the ContentComponent may be signaled.
  • the @codecs element in the ContentComponent may be signaled when different audio characteristics are present (e.g., two audio streams with different languages in a single multiplex use different codecs).
  • the @codecs element in the ContentComponent may be signaled to distinguish between closed captioning or subtitles multiplexed in a video bitstream and burned-in closed captioning or subtitles in which text is part of pre-encoded video.
  • the @codecs element in the ContentComponent may indicate that a multiplexed representation comprises two audio components multiplexed together, and/or that the audio components are encoded according to different audio codecs.
  • the multiplexed audio components may be in different languages (e.g., as indicated by @lang).
  • the client may determine whether one or more of the audio components are supported by the device.
  • For example, signaling of multi-channel audio and/or sampling rates may be added to the ContentComponent when different audio characteristics are present (a sketch following this list illustrates such ContentComponent signaling).
  • a 1:1 mapping may be established between the ContentComponent and the track ID and/or PID.
  • a 1:1 mapping may be established between the ContentComponent and the track ID and/or PID when a single asset is split into several periods.
  • the relative offsets of the "non-master" components from the "master" component may be determined.
  • the relative offsets of the "non-master" components from the "master" component may be determined when a single asset is split into several periods.
  • a client may determine an EPT of a video component or an audio component in a segment.
  • the processor may determine the relative offset of an audio component in the segment from the EPT of a video component of the segment.
  • Codecs may be registered for CEA-608 and/or CEA-708.
  • a 4cc value may be registered for CEA 608/708.
  • a 4cc value may be registered separately for each standard (e.g., c608 and c708).
  • Alternatively, a general 4cc value may be defined (e.g., cc08).
  • a profile may indicate whether CEA-608 or CEA-708 is used (e.g., let cc08.6 stand for CEA-608 and cc08.7 stand for CEA-708).
  • a PAR value may be defined.
  • a value for AFD may be defined.
  • a value for AFD may be defined when multiple assets are in a single period.
  • the definition of RatioType in ISO/IEC 23009-1:2014 Annex B may permit values that may be disallowed for normal PAR calculation (e.g., the values "0:" and ":0"). Disallowed values may serve as an indicator that the aspect ratio and its handling may be specified inband.
  • a Role value may be used to indicate the main component that may be used as a "master" component.
  • a master component may be the component assumed for time calculation.
  • the role "master" may be added to Table 22 of ISO/IEC 23009-1:2014, Section 5.8.5.5. Equivalent semantics may be added to the same table.
  • One "master" content component may appear. @trackId may be present.
  • a SupplementalProperty may be introduced, for example, to signal the expected receiver behavior.
  • the expected receiver behavior may be to calculate the mapping of presentation time specified in the bitstream relative to an estimation of segment duration and/or @presentationTimeOffset.
  • An initialization segment and/or first available segment may be fetched (e.g., requested and/or received). An initialization segment and/or first available segment may be fetched when different audio characteristics are present. An initialization segment and/or first available segment may be fetched to distinguish between closed captioning or subtitles multiplexed in a video bitstream and burned-in closed captioning or subtitles in which text is part of pre-encoded video.
  • the ContentComponent may be made into an extension of RepresentationBaseType.
  • the ContentComponent may be made into an extension of RepresentationBaseType, for example, to resolve signaling issues.
  • Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs).
  • a processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.
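
As a rough illustration of the extended ContentComponent signaling described in this list, the following Python sketch parses a hypothetical multiplexed Representation whose ContentComponent elements carry @codecs and @lang, per the extension proposed above, and decides which multiplexed audio languages the device can play. The MPD fragment, the PIDs, the codec strings, and the device capability set are illustrative assumptions, not normative syntax.

import xml.etree.ElementTree as ET

# Hypothetical multiplexed Representation with per-component @codecs/@lang,
# following the ContentComponent extension described in the list above.
MPD_FRAGMENT = """
<Representation id="mux1" bandwidth="3500000" mimeType="video/mp2t">
  <ContentComponent id="481" contentType="video" codecs="avc1.64001f"/>
  <ContentComponent id="482" contentType="audio" lang="en" codecs="ac-3"/>
  <ContentComponent id="483" contentType="audio" lang="es" codecs="mp4a.40.2"/>
</Representation>
"""

# Assumed device capabilities (base 4cc values the decoder supports).
SUPPORTED_CODECS = {"avc1", "ac-3"}

def playable_audio_languages(fragment):
    """Return the audio languages whose signaled codec the device supports."""
    root = ET.fromstring(fragment)
    languages = []
    for cc in root.findall("ContentComponent"):
        if cc.get("contentType") != "audio":
            continue
        base = cc.get("codecs", "").split(".")[0]  # "mp4a.40.2" -> "mp4a"
        if base in SUPPORTED_CODECS:
            languages.append(cc.get("lang", "und"))
    return languages

print(playable_audio_languages(MPD_FRAGMENT))  # ['en']

In the L1/L2 scenario above, this is the kind of check that lets the client play English AC-3 while skipping Spanish MP2 when the MP2 decoder is absent.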

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A device may comprise a processor that may be configured to receive an audio bitstream in a single multiplex. The device may determine languages included in the single multiplex. The device may determine codecs contained in the single multiplex. The device may signal codecs used by the ContentComponent. The device may receive instructions from the ContentComponent. The device may signal multi-channel audio. The device may signal sampling rates.

Description

MEDIA PRESENTATION DESCRIPTION SIGNALING
IN TYPICAL BROADCAST CONTENT
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application No. 62/019,784, filed July 1, 2014, which is incorporated herein by reference in its entirety.
BACKGROUND
[0002] In recent years, "over-the-top" (OTT) streaming has emerged as a delivery medium. OTT streaming utilizes the Internet as a delivery medium. Network capabilities have evolved to make video delivery over the Internet viable. Media Presentation Description signaling may be used to handle broadcast content in Dynamic Adaptive Streaming over HTTP (DASH).
SUMMARY
[0003] A device may include a processor configured to receive multimedia presentation description (MPD) information relating to content. The processor may determine, based upon the MPD information, that the content comprises a multiplexed representation comprising a closed captioning component multiplexed with at least one of an audio component or a video component. The processor may determine whether the closed captioning component is a CEA-608 closed captioning component or a CEA-708 closed captioning component. The processor may determine whether display of the closed captioning component is supported on the device. The processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
[0004] A device may include a processor configured to receive MPD information relating to content. The processor may determine, based upon the MPD information, that the content comprises a multiplexed representation comprising two audio components multiplexed together, the audio components encoded according to different audio codecs. The two audio components are in different languages. The processor may determine a language of at least one audio component. The processor may determine whether the two audio codecs are supported. The processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
[0005] A device may include a processor configured to receive MPD information relating to content. The processor may determine, based upon the MPD information, that the content comprises a multiplexed representation comprising a packet identifier of a segment included in a multiplexed representation. The segment may be a video component. The processor may determine a track identifier of at least one of a video component and an audio component in the segment. The processor may determine an earliest presentation time (EPT) of a video component or an audio component in the segment. The processor may determine the relative offset of an audio component in the segment from the EPT of a video component of the segment. The processor may request the multiplexed representation, and receive content segments comprising the multiplexed representation.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] FIG. 1A is a system diagram of an example communications system in which one or more disclosed embodiments may be implemented.
[0007] FIG. 1B is a system diagram of an example wireless transmit/receive unit (WTRU) that may be used within the communications system illustrated in FIG. 1A.
[0008] FIG. 1C is a system diagram of an example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
[0009] FIG. 1D is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
[0010] FIG. 1E is a system diagram of another example radio access network and an example core network that may be used within the communications system illustrated in FIG. 1A.
[0011] FIG. 2 is a diagram of an example DASH system model.
DETAILED DESCRIPTION
[0012] A detailed description of illustrative embodiments will now be described with reference to the various Figures. Although this description provides a detailed example of possible implementations, it should be noted that the details are intended to be exemplary and in no way limit the scope of the application.
[0013] FIG. 1A is a diagram of an example communications system 100 in which one or more disclosed embodiments may be implemented. The communications system 100 may be a multiple access system that provides content, such as voice, data, video, messaging, broadcast, etc., to multiple wireless users. The communications system 100 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth. For example, the communications system 100 may employ one or more channel access methods, such as code division multiple access (CDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal FDMA (OFDMA), single-carrier FDMA (SC-FDMA), and the like.
[0014] As shown in FIG. 1A, the communications system 100 may include wireless transmit/receive units (WTRUs) 102a, 102b, 102c, and/or 102d (which generally or collectively may be referred to as WTRU 102), a radio access network (RAN) 103/104/105, a core network 106/107/109, a public switched telephone network (PSTN) 108, the Internet 110, and other networks 112, though it will be appreciated that the disclosed embodiments contemplate any number of WTRUs, base stations, networks, and/or network elements. Each of the WTRUs 102a, 102b, 102c, 102d may be any type of device configured to operate and/or communicate in a wireless environment. By way of example, the WTRUs 102a, 102b, 102c, 102d may be configured to transmit and/or receive wireless signals and may include user equipment (UE), a mobile station, a fixed or mobile subscriber unit, a pager, a cellular telephone, a personal digital assistant (PDA), a smartphone, a laptop, a netbook, a personal computer, a wireless sensor, consumer electronics, and the like.
[0015] The communications systems 100 may also include a base station 114a and a base station 114b. Each of the base stations 114a, 114b may be any type of device configured to wirelessly interface with at least one of the WTRUs 102a, 102b, 102c, 102d to facilitate access to one or more communication networks, such as the core network 106/107/109, the Internet 110, and/or the networks 112. By way of example, the base stations 114a, 114b may be a base transceiver station (BTS), a Node-B, an eNode B, a Home Node B, a Home eNode B, a site controller, an access point (AP), a wireless router, and the like. While the base stations 114a,
114b are each depicted as a single element, it will be appreciated that the base stations 114a,
114b may include any number of interconnected base stations and/or network elements.
[0016] The base station 114a may be part of the RAN 103/104/105, which may also include other base stations and/or network elements (not shown), such as a base station controller (BSC), a radio network controller (RNC), relay nodes, etc. The base station 114a and/or the base station
114b may be configured to transmit and/or receive wireless signals within a particular geographic region, which may be referred to as a cell (not shown). The cell may further be divided into cell sectors. For example, the cell associated with the base station 114a may be divided into three sectors. Thus, in one embodiment, the base station 114a may include three transceivers, i.e., one for each sector of the cell. In another embodiment, the base station 114a may employ multiple-input multiple-output (MIMO) technology and, therefore, may utilize multiple transceivers for each sector of the cell.
[0017] The base stations 114a, 114b may communicate with one or more of the WTRUs 102a, 102b, 102c, 102d over an air interface 115/116/117, which may be any suitable wireless communication link (e.g., radio frequency (RF), microwave, infrared (IR), ultraviolet (UV), visible light, etc.). The air interface 115/116/117 may be established using any suitable radio access technology (RAT).
[0018] More specifically, as noted above, the communications system 100 may be a multiple access system and may employ one or more channel access schemes, such as CDMA, TDMA, FDMA, OFDMA, SC-FDMA, and the like. For example, the base station 114a in the RAN 103/104/105 and the WTRUs 102a, 102b, 102c may implement a radio technology such as Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may establish the air interface 115/116/117 using wideband CDMA (WCDMA).
WCDMA may include communication protocols such as High-Speed Packet Access (HSPA) and/or Evolved HSPA (HSPA+). HSPA may include High-Speed Downlink Packet Access (HSDPA) and/or High-Speed Uplink Packet Access (HSUPA).
[0019] In another embodiment, the base station 114a and the WTRUs 102a, 102b, 102c may implement a radio technology such as Evolved UMTS Terrestrial Radio Access (E-UTRA), which may establish the air interface 115/116/117 using Long Term Evolution (LTE) and/or LTE-Advanced (LTE-A).
[0020] In other embodiments, the base station 114a and the WTRUs 102a, 102b, 102c may implement radio technologies such as IEEE 802.16 (i.e., Worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000, CDMA2000 1X, CDMA2000 EV-DO, Interim Standard 2000 (IS-2000), Interim Standard 95 (IS-95), Interim Standard 856 (IS-856), Global System for Mobile communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), GSM EDGE (GERAN), and the like.
[0021] The base station 114b in FIG. 1A may be a wireless router, Home Node B, Home eNode B, or access point, for example, and may utilize any suitable RAT for facilitating wireless connectivity in a localized area, such as a place of business, a home, a vehicle, a campus, and the like. In one embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.11 to establish a wireless local area network (WLAN). In another embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.15 to establish a wireless personal area network (WPAN). In yet another embodiment, the base station 114b and the WTRUs 102c, 102d may utilize a cellular-based RAT (e.g., WCDMA, CDMA2000, GSM, LTE, LTE-A, etc.) to establish a picocell or femtocell. As shown in FIG. 1A, the base station 114b may have a direct connection to the Internet 110. Thus, the base station 114b may not be required to access the Internet 110 via the core network 106/107/109.
[0022] The RAN 103/104/105 may be in communication with the core network 106/107/109, which may be any type of network configured to provide voice, data, applications, and/or voice over internet protocol (VoIP) services to one or more of the WTRUs 102a, 102b, 102c, 102d. For example, the core network 106/107/109 may provide call control, billing services, mobile location-based services, pre-paid calling, Internet connectivity, video distribution, etc., and/or perform high-level security functions, such as user authentication. Although not shown in FIG. 1A, it will be appreciated that the RAN 103/104/105 and/or the core network 106/107/109 may be in direct or indirect communication with other RANs that employ the same RAT as the RAN 103/104/105 or a different RAT. For example, in addition to being connected to the RAN 103/104/105, which may be utilizing an E-UTRA radio technology, the core network
106/107/109 may also be in communication with another RAN (not shown) employing a GSM radio technology.
[0023] The core network 106/107/109 may also serve as a gateway for the WTRUs 102a, 102b, 102c, 102d to access the PSTN 108, the Internet 110, and/or other networks 112. The PSTN 108 may include circuit-switched telephone networks that provide plain old telephone service (POTS). The Internet 110 may include a global system of interconnected computer networks and devices that use common communication protocols, such as the transmission control protocol (TCP), user datagram protocol (UDP) and the internet protocol (IP) in the TCP/IP internet protocol suite. The networks 112 may include wired or wireless
communications networks owned and/or operated by other service providers. For example, the networks 112 may include another core network connected to one or more RANs, which may employ the same RAT as the RAN 103/104/105 or a different RAT.
[0024] Some or all of the WTRUs 102a, 102b, 102c, 102d in the communications system
100 may include multi-mode capabilities, i.e., the WTRUs 102a, 102b, 102c, 102d may include multiple transceivers for communicating with different wireless networks over different wireless links. For example, the WTRU 102c shown in FIG. 1A may be configured to communicate with the base station 114a, which may employ a cellular-based radio technology, and with the base station 114b, which may employ an IEEE 802 radio technology.
[0025] FIG. 1B is a system diagram of an example WTRU 102. As shown in FIG. 1B, the WTRU 102 may include a processor 118, a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, non-removable memory 130, removable memory 132, a power source 134, a global positioning system (GPS) chipset 136, and other peripherals 138. It will be appreciated that the WTRU 102 may include any subcombination of the foregoing elements while remaining consistent with an embodiment. Also, embodiments contemplate that the base stations 114a and 114b, and/or the nodes that base stations 114a and 114b may represent, such as but not limited to a transceiver station (BTS), a Node-B, a site controller, an access point (AP), a home node-B, an evolved home node-B (eNodeB), a home evolved node-B (HeNB), a home evolved node-B gateway, and proxy nodes, among others, may include some or all of the elements depicted in FIG. 1B and described herein.
[0026] The processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller,
Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGA) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functionality that enables the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to the transceiver 120, which may be coupled to the
transmit/receive element 122. While FIG. 1B depicts the processor 118 and the transceiver 120 as separate components, it will be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
[0027] The transmit/receive element 122 may be configured to transmit signals to, or receive signals from, a base station (e.g., the base station 114a) over the air interface 115/116/117. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive IR, UV, or visible light signals, for example. In yet another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF and light signals. It will be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
[0028] In addition, although the transmit/receive element 122 is depicted in FIG. 1B as a single element, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 115/116/117.
[0029] The transceiver 120 may be configured to modulate the signals that are to be transmitted by the transmit/receive element 122 and to demodulate the signals that are received by the transmit/receive element 122. As noted above, the WTRU 102 may have multi-mode capabilities. Thus, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11, for example.
[0030] The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a liquid crystal display (LCD) display unit or organic light-emitting diode (OLED) display unit). The processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128. In addition, the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132. The non-removable memory 130 may include random-access memory (RAM), read-only memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a subscriber identity module (SIM) card, a memory stick, a secure digital (SD) memory card, and the like. In other embodiments, the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
[0031] The processor 118 may receive power from the power source 134, and may be configured to distribute and/or control the power to the other components in the WTRU 102. The power source 134 may be any suitable device for powering the WTRU 102. For example, the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), etc.), solar cells, fuel cells, and the like.
[0032] The processor 118 may also be coupled to the GPS chipset 136, which may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to, or in lieu of, the information from the GPS chipset 136, the WTRU 102 may receive location information over the air interface 115/116/117 from a base station (e.g., base stations 114a, 114b) and/or determine its location based on the timing of the signals being received from two or more nearby base stations. It will be appreciated that the WTRU 102 may acquire location information by way of any suitable location-determination method while remaining consistent with an embodiment.
[0033] The processor 118 may further be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality and/or wired or wireless connectivity. For example, the peripherals 138 may include an accelerometer, an e-compass, a satellite transceiver, a digital camera (for photographs or video), a universal serial bus (USB) port, a vibration device, a television transceiver, a hands-free headset, a Bluetooth® module, a frequency modulated (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
[0034] FIG. 1C is a system diagram of the RAN 103 and the core network 106 according to an embodiment. As noted above, the RAN 103 may employ a UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 115. The RAN 103 may also be in communication with the core network 106. As shown in FIG. 1C, the RAN 103 may include Node-Bs 140a, 140b, 140c, which may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 115. The Node-Bs 140a, 140b, 140c may each be associated with a particular cell (not shown) within the RAN 103. The RAN 103 may also include RNCs 142a, 142b. It will be appreciated that the RAN 103 may include any number of Node-Bs and RNCs while remaining consistent with an embodiment.
[0035] As shown in FIG. 1C, the Node-Bs 140a, 140b may be in communication with the RNC 142a. Additionally, the Node-B 140c may be in communication with the RNC 142b. The Node-Bs 140a, 140b, 140c may communicate with the respective RNCs 142a, 142b via an Iub interface. The RNCs 142a, 142b may be in communication with one another via an Iur interface. Each of the RNCs 142a, 142b may be configured to control the respective Node-Bs 140a, 140b, 140c to which it is connected. In addition, each of the RNCs 142a, 142b may be configured to carry out or support other functionality, such as outer loop power control, load control, admission control, packet scheduling, handover control, macrodiversity, security functions, data encryption, and the like.
[0036] The core network 106 shown in FIG. 1C may include a media gateway (MGW) 144, a mobile switching center (MSC) 146, a serving GPRS support node (SGSN) 148, and/or a gateway GPRS support node (GGSN) 150. While each of the foregoing elements are depicted as part of the core network 106, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
[0037] The RNC 142a in the RAN 103 may be connected to the MSC 146 in the core network 106 via an IuCS interface. The MSC 146 may be connected to the MGW 144. The MSC 146 and the MGW 144 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land-line communications devices.
[0038] The RNC 142a in the RAN 103 may also be connected to the SGSN 148 in the core network 106 via an IuPS interface. The SGSN 148 may be connected to the GGSN 150. The SGSN 148 and the GGSN 150 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
[0039] As noted above, the core network 106 may also be connected to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
[0040] FIG. 1D is a system diagram of the RAN 104 and the core network 107 according to an embodiment. As noted above, the RAN 104 may employ an E-UTRA radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 116. The RAN 104 may also be in communication with the core network 107.
[0041] The RAN 104 may include eNode-Bs 160a, 160b, 160c, though it will be appreciated that the RAN 104 may include any number of eNode-Bs while remaining consistent with an embodiment. The eNode-Bs 160a, 160b, 160c may each include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 116. In one
embodiment, the eNode-Bs 160a, 160b, 160c may implement MIMO technology. Thus, the eNode-B 160a, for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a.
[0042] Each of the eNode-Bs 160a, 160b, 160c may be associated with a particular cell (not shown) and may be configured to handle radio resource management decisions, handover decisions, scheduling of users in the uplink and/or downlink, and the like. As shown in FIG. 1D, the eNode-Bs 160a, 160b, 160c may communicate with one another over an X2 interface.
[0043] The core network 107 shown in FIG. 1D may include a mobility management entity (MME) 162, a serving gateway 164, and a packet data network (PDN) gateway 166. While each of the foregoing elements are depicted as part of the core network 107, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
[0044] The MME 162 may be connected to each of the eNode-Bs 160a, 160b, 160c in the
RAN 104 via an S1 interface and may serve as a control node. For example, the MME 162 may be responsible for authenticating users of the WTRUs 102a, 102b, 102c, bearer
activation/deactivation, selecting a particular serving gateway during an initial attach of the WTRUs 102a, 102b, 102c, and the like. The MME 162 may also provide a control plane function for switching between the RAN 104 and other RANs (not shown) that employ other radio technologies, such as GSM or WCDMA.
[0045] The serving gateway 164 may be connected to each of the eNode-Bs 160a, 160b, 160c in the RAN 104 via the S1 interface. The serving gateway 164 may generally route and forward user data packets to/from the WTRUs 102a, 102b, 102c. The serving gateway 164 may also perform other functions, such as anchoring user planes during inter-eNode B handovers, triggering paging when downlink data is available for the WTRUs 102a, 102b, 102c, managing and storing contexts of the WTRUs 102a, 102b, 102c, and the like.
[0046] The serving gateway 164 may also be connected to the PDN gateway 166, which may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices.
[0047] The core network 107 may facilitate communications with other networks. For example, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land-line communications devices. For example, the core network 107 may include, or may communicate with, an IP gateway (e.g., an IP multimedia subsystem (IMS) server) that serves as an interface between the core network 107 and the PSTN 108. In addition, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
[0048] FIG. 1E is a system diagram of the RAN 105 and the core network 109 according to an embodiment. The RAN 105 may be an access service network (ASN) that employs IEEE 802.16 radio technology to communicate with the WTRUs 102a, 102b, 102c over the air interface 117. As will be further discussed below, the communication links between the different functional entities of the WTRUs 102a, 102b, 102c, the RAN 105, and the core network 109 may be defined as reference points.
[0049] As shown in FIG. 1E, the RAN 105 may include base stations 180a, 180b, 180c, and an ASN gateway 182, though it will be appreciated that the RAN 105 may include any number of base stations and ASN gateways while remaining consistent with an embodiment. The base stations 180a, 180b, 180c may each be associated with a particular cell (not shown) in the RAN
105 and may each include one or more transceivers for communicating with the WTRUs 102a,
102b, 102c over the air interface 117. In one embodiment, the base stations 180a, 180b, 180c may implement MIMO technology. Thus, the base station 180a, for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a. The base stations 180a, 180b, 180c may also provide mobility management functions, such as handoff triggering, tunnel establishment, radio resource management, traffic classification, quality of service (QoS) policy enforcement, and the like. The ASN gateway 182 may serve as a traffic aggregation point and may be responsible for paging, caching of subscriber profiles, routing to the core network 109, and the like.
[0050] The air interface 117 between the WTRUs 102a, 102b, 102c and the RAN 105 may be defined as an R1 reference point that implements the IEEE 802.16 specification. In addition, each of the WTRUs 102a, 102b, 102c may establish a logical interface (not shown) with the core network 109. The logical interface between the WTRUs 102a, 102b, 102c and the core network 109 may be defined as an R2 reference point, which may be used for authentication,
authorization, IP host configuration management, and/or mobility management.
[0051] The communication link between each of the base stations 180a, 180b, 180c may be defined as an R8 reference point that includes protocols for facilitating WTRU handovers and the transfer of data between base stations. The communication link between the base stations 180a, 180b, 180c and the ASN gateway 182 may be defined as an R6 reference point. The R6 reference point may include protocols for facilitating mobility management based on mobility events associated with each of the WTRUs 102a, 102b, 102c.
[0052] As shown in FIG. 1E, the RAN 105 may be connected to the core network 109. The communication link between the RAN 105 and the core network 109 may be defined as an R3 reference point that includes protocols for facilitating data transfer and mobility management capabilities, for example. The core network 109 may include a mobile IP home agent (MIP-HA) 184, an authentication, authorization, accounting (AAA) server 186, and a gateway 188. While each of the foregoing elements are depicted as part of the core network 109, it will be appreciated that any one of these elements may be owned and/or operated by an entity other than the core network operator.
[0053] The MIP-HA may be responsible for IP address management, and may enable the
WTRUs 102a, 102b, 102c to roam between different ASNs and/or different core networks. The
MIP-HA 184 may provide the WTRUs 102a, 102b, 102c with access to packet-switched networks, such as the Internet 110, to facilitate communications between the WTRUs 102a,
102b, 102c and IP-enabled devices. The AAA server 186 may be responsible for user authentication and for supporting user services. The gateway 188 may facilitate interworking with other networks. For example, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to circuit-switched networks, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and traditional land-line communications devices. In addition, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to the networks 112, which may include other wired or wireless networks that are owned and/or operated by other service providers.
[0054] Although not shown in FIG. 1E, it will be appreciated that the RAN 105 may be connected to other ASNs and the core network 109 may be connected to other core networks. The communication link between the RAN 105 and the other ASNs may be defined as an R4 reference point, which may include protocols for coordinating the mobility of the WTRUs 102a, 102b, 102c between the RAN 105 and the other ASNs. The communication link between the core network 109 and the other core networks may be defined as an R5 reference point, which may include protocols for facilitating interworking between home core networks and visited core networks.
[0055] "Over-the-top" (OTT) streaming may utilize the Internet as a delivery medium.
Hardware capabilities have evolved to create a wide range of video-capable devices. Video-capable devices may range from mobile devices to Internet set-top boxes (STBs) to network TVs. Network capabilities have evolved to make high-quality video delivery over the Internet viable.
[0056] "Closed" networks may be controlled by a multi-system operator (MSO). Internet may be a "best effort" environment. In a "best-effort" environment, bandwidth and latency may change (e.g., constantly). Network conditions may be volatile in mobile networks. Dynamic adaptation to network changes may provide a tolerable user experience, for example, in volatile mobile networks.
[0057] Adaptive streaming may be considered similar to HTTP streaming. User datagram protocol (UDP) may be used for internet video streaming. HTTP streaming may be used for internet video streaming. The use of HTTP for Internet video streaming may be attractive and scalable, for example, due to the existing HTTP infrastructure. The existing HTTP infrastructure may include, for example, one or more content distribution networks (CDNs) and the ubiquity of HTTP support on multiple platforms and devices. The use of HTTP streaming for internet video streaming may be attractive, for example, due to firewall penetration. Firewalls may disallow UDP traffic. Video over HTTP may be available behind firewalls. HTTP streaming may be desirable for rate-adaptive streaming.
[0058] An asset may be segmented virtually or physically, for example, in HTTP adaptive streaming. An asset may be published to CDNs, for example, in HTTP adaptive streaming. Intelligence may reside in the client (e.g., DASH client). The client may acquire the knowledge of the published alternative encodings (e.g., representations), for example, via a Media
Presentation Description (MPD) file. The client may acquire the way to construct URLs to download a segment from a given representation. An adaptive bit rate (ABR) client may observe network conditions. An ABR client may decide which combination of bitrate, resolution, etc. may provide best quality of experience on the client device at an instance of time. The ABR client may determine the optimal URL to use. The ABR client may issue an HTTP GET request to download a segment, for example, as the ABR client determines the optimal URL to use.
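
For illustration, a minimal Python sketch of such an ABR client loop follows. The representation list, CDN URL templates, throughput estimate, and safety margin are illustrative assumptions rather than anything defined by DASH.

import urllib.request

# (bandwidth in bits/s, URL template) pairs as they might be advertised in an
# MPD; the host names and templates are hypothetical.
REPRESENTATIONS = [
    (500_000, "http://cdn.example.com/low/seg_{i:05d}.ts"),
    (1_500_000, "http://cdn.example.com/mid/seg_{i:05d}.ts"),
    (3_000_000, "http://cdn.example.com/high/seg_{i:05d}.ts"),
]

def pick_representation(throughput_bps, safety=0.8):
    """Pick the highest advertised bandwidth that fits the observed throughput."""
    fitting = [r for r in REPRESENTATIONS if r[0] <= throughput_bps * safety]
    return max(fitting) if fitting else min(REPRESENTATIONS)

def fetch_segment(index, throughput_bps):
    """Construct the segment URL for the chosen representation and HTTP GET it."""
    _, template = pick_representation(throughput_bps)
    with urllib.request.urlopen(template.format(i=index)) as response:
        return response.read()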
[0059] DASH may be built on top of the HTTP/TCP/IP stack. DASH may define a manifest format (e.g., Media Presentation Description (MPD)), and/or segment formats for ISO Base Media File Format and/or MPEG-2 Transport Streams. DASH may define a set of quality metrics at the network, client operation, media presentation levels, etc. The set of quality metrics may enable an interoperable way of monitoring Quality of Experience (QoE) and Quality of Service (QoS).
[0060] A representation may be a core concept of DASH. A representation may be a single encoded version of the complete asset. A representation may be a single encoded version of a subset of the components of a complete set. For example, a representation may be ISO-BMFF including unmultiplexed 2.5 Mbps 720p AVC video and/or separate ISO-BMFF representations for 96 Kbps MPEG-4 AAC audio in different languages. DASH may provide an unmultiplexed representation for audio and a separate unmultiplexed representation for video. A single transport stream including two or more of video, audio, and/or subtitles may be a single multiplexed representation. For example, a combined structure may comprise video and English audio as a single multiplexed representation, along with, for example, Spanish and Chinese audio tracks as separate unmultiplexed representations.
[0061] A segment may be the minimal individually addressable unit of media data. A segment may be the entity that may be downloaded using URLs advertised via the MPD. For example, a media segment may be a four second part of a live broadcast, starting at playout time
0:42:38, ending at 0:42:42, and available within a three minute time window. A media segment may be a complete on-demand movie available for the whole period the movie is licensed.
[0062] The MPD may be an XML document. The MPD may be an XML document that may advertise the available media. The MPD may provide information requested by the client to select a representation, make adaptation decisions, and/or retrieve segments from the network.
The MPD may be independent of a segment. The MPD may signal the properties requested to determine whether a representation may be successfully played. The MPD may signal the functional properties (e.g., whether segments start at random access points). The MPD may use a hierarchical data model, for example, to describe the complete presentation.
[0063] Representations may be the lowest conceptual level of the hierarchical data model. The MPD may signal information, such as at the lowest conceptual level of hierarchical data. For example, the information signaled may include bandwidth and/or codecs that may be used for successful presentation and/or ways of constructing URLs for accessing segments.
Additional information may be provided at the lowest conceptual level of hierarchical data, such as trick mode, random access information, layer and view information for scalable and multiview codecs, generic schemes which may be supported by a client wishing to play a given
representation, etc.
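
As a rough illustration, a client might read these representation-level properties with a generic XML parser, as in the Python sketch below; the MPD fragment is an illustrative assumption, while the namespace URI is the one defined for DASH MPDs.

import xml.etree.ElementTree as ET

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}

MPD = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <Representation id="v1" bandwidth="1500000" codecs="avc1.64001f"/>
      <Representation id="v2" bandwidth="3000000" codecs="avc1.640028"/>
    </AdaptationSet>
  </Period>
</MPD>"""

root = ET.fromstring(MPD)
for rep in root.findall(".//mpd:Representation", NS):
    # Bandwidth and codecs are among the properties a client uses to decide
    # whether a representation can be played successfully.
    print(rep.get("id"), rep.get("bandwidth"), rep.get("codecs"))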
[0064] DASH may provide a flexible URL construction functionality. A single monolithic per-segment URL may provide rigid URL construction functionality. A single monolithic per-segment URL may be possible in DASH. DASH may allow dynamic construction of URLs. DASH may allow dynamic construction of URLs, for example, by combining parts of the URL (e.g., base URLs) that may appear at different levels of the hierarchical data model. Multi-path functionality may be available, for example, if multiple base URLs are used with segments requested from one or more locations. Multi-path functionality may improve performance and reliability.
[0065] A list of URLs and byte ranges may reach several thousand elements per representation, for example, if short segments are used. DASH may allow the use of predefined variables (e.g., segment number, segment time, etc.). DASH may allow the use of printf-style syntax for quick construction of URLs using templates. A number of segments may be listed as all segments (e.g., seg_00001.ts, seg_00002.ts, ..., seg_03600.ts). A number of segments may be expressed as a single line (e.g., seg_$Index%05d$.ts), for example, if the segments cannot be retrieved at the time the MPD is fetched. Templates may be helpful for multi-segment representations, for example, due to their efficiency.
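
A simplified Python sketch of such template expansion follows; the grammar handled here is a reduced, illustrative stand-in for the DASH template syntax, not the normative definition.

import re

def expand(template, index):
    """Replace a $Index%05d$-style variable with a zero-padded segment number."""
    def substitute(match):
        fmt = match.group(1) or "%d"  # printf-style width tag, e.g., "%05d"
        return fmt % index
    return re.sub(r"\$Index(%0\d+d)?\$", substitute, template)

print([expand("seg_$Index%05d$.ts", i) for i in (1, 2, 3600)])
# ['seg_00001.ts', 'seg_00002.ts', 'seg_03600.ts']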
[0066] Adaptation sets may be groups of different representations of the same asset and/or the same component, for example, in the un-multiplexed case. Representations (e.g., all representations) within an adaptation set may render the same content. A client may switch between representations within an adaptation set.
[0067] In an example, an adaptation set may be a collection of ten representations with video encoded in different bitrates and/or resolutions. Representation switching may occur at a segment and/or a subsegment, for example, while presenting the same content to the viewer.
Seamless representation switching may be possible, for example, under some segment-level restrictions. Segment-level restrictions may be used in practical applications, such as DASH profiles and DASH subsets adopted by multiple SDOs. Segment-level restrictions may be applied to representations within an adaptation set.
[0068] A period may be a time-limited subset of a presentation. Adaptation sets may be valid within the period. Adaptation sets in different periods may include different representations (e.g., in terms of codecs, bitrates, etc.). An MPD may include a single period for the whole duration of the asset. Periods may be used for ad markup, for example, where separate periods are dedicated to parts of the asset and/or to an advertisement.
[0069] The MPD may be an XML document that presents a hierarchy. The hierarchy may, for example, start at global presentation-level properties (e.g., timing) and continue to period- level properties and/or adaptation sets available for that period. Representations may be at the lowest level of this hierarchy.
[0070] DASH may use a simplified version of XLink, for example, to allow loading parts of the MPD (e.g., periods) in real time from a remote location. For example, in ad insertion, precise timing of ad breaks may be known ahead of time and ad servers may determine the exact ad in real time.
[0071] An MPD may be dynamic or static. A dynamic MPD may change. A dynamic MPD may be periodically reloaded by the client. A static MPD may be valid for the whole
presentation. A static MPD may be used in VoD applications. A dynamic MPD may be used for live and PVR applications, for example.
[0072] Media segments may be time-bounded parts of a representation. Media segment durations may approximate the segment durations that appear in the MPD. Segment duration may be different for one or more segments. Segment durations may be constant and/or close to constant for segments (e.g., DASH-AVC/264 may use segments with durations within a 25% tolerance margin).
[0073] The MPD may include information regarding media segments that may be unavailable at the time the MPD is read by the client, for example, in a live broadcast scenario. Segments may be available within a defined availability time window. The time window may be calculated from the wall-clock time and/or segment duration.
[0074] An index segment may be a segment type. Index segments may appear as side files. Index segments may appear within media segments. Index segments may include timing and/or random access information. Index segments may enable efficient implementation of random access and trick modes. Index segments may be used for more efficient bitstream switching. Index segments may be used for VoD and PVR types of applications. Index segments may be used less in live cases.
[0075] Segment-level and/or representation-level properties may be used to implement efficient bitstream switching. DASH may provide functional requirements for segment-level and/or representation-level properties. Segment-level and/or representation-level properties may be expressed in the MPD, for example, in a format-independent way. Segment format specifications may include the format-level restrictions that may correspond to generic requirements.
[0076] A media segment may be denoted as i. A representation may be denoted as R. A media segment i of a representation R may be denoted as SR(i). The duration of SR(i) may be denoted as D(SR(i)). The earliest presentation time of SR(i) may be denoted as EPT(SR(i)). EPT may correspond to the earliest presentation time of the segment. The earliest presentation time may not be the time at which a segment may be successfully played out at random access.
[0077] Time alignment may be used for efficient switching between representations in an adaptation set. Efficient switching in an adaptation set may follow EPT(SRa(i)) ≤ EPT(SRb(i-1)) + D(SRb(i-1)), for a pair (e.g., any pair) of representations Ra and Rb and a segment i in an adaptation set. The ability to switch at a segment border without overlapped downloads and/or dual decoding may occur, for example, when switching follows EPT(SRa(i)) ≤ EPT(SRb(i-1)) + D(SRb(i-1)). The ability to switch at a segment border without overlapped downloads and/or dual decoding may occur, for example, when a segment starts with a random access point of certain types.
[0078] Bitstream switching at a subsegment level may occur, for example, when indexing is used. Bitstream switching at a subsegment level may occur, for example, when subsegment switching follows EPT(SRa(i)) ≤ EPT(SRb(i-1)) + D(SRb(i-1)). Bitstream switching at a subsegment level may occur, for example, when a subsegment starts with a random access point of certain types.
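
For illustration, the following Python sketch checks the above condition over every ordered pair of representations; the segment timelines and time units are illustrative assumptions.

from itertools import permutations

# Per-representation lists of (EPT, duration) pairs for consecutive segments,
# in media-time seconds; these aligned 4-second timelines are hypothetical.
TIMELINES = {
    "Ra": [(0.0, 4.0), (4.0, 4.0), (8.0, 4.0)],
    "Rb": [(0.0, 4.0), (4.0, 4.0), (8.0, 4.0)],
}

def time_aligned(timelines):
    """Check EPT(SRa(i)) <= EPT(SRb(i-1)) + D(SRb(i-1)) for all pairs and i."""
    for ra, rb in permutations(timelines, 2):
        pairs = min(len(timelines[ra]), len(timelines[rb]))
        for i in range(1, pairs):
            ept_a = timelines[ra][i][0]
            ept_b, dur_b = timelines[rb][i - 1]
            if not ept_a <= ept_b + dur_b:
                return False
    return True

print(time_aligned(TIMELINES))  # True for these aligned timelines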
[0079] Systems may utilize time alignment and/or random access point placement restrictions. Restrictions may correspond to encodings with matching instantaneous decoder refresh (IDR) frames at segment borders and/or closed group of pictures (GOPs), for example, in video encoding.
[0080] FIG. 2 is an example DASH system model 200. A DASH client may include one or more of an access client (e.g., a DASH access engine 202), a media engine (e.g., a media engine 204), and/or an application (e.g., an application 206). The DASH access engine 202 may be an HTTP client. The DASH access engine 202 may receive an MPD and/or segment data, for example, via a CDN (not shown). The DASH access engine 202 may send media (e.g., and timing) to the media engine 204. The media may be in an MPEG format (e.g., MPEG-2 TS) or an ISO format (e.g., ISO-BMFF). The media engine 204 may decode and present the media provided from the DASH access engine 202. The DASH access engine 202 may pass events (e.g., and timing) to an application 206. The on-the-wire format interfaces of the MPD and/or segments may be defined. Other interfaces may be defined according to implementers' discretion.
[0081] Timing behavior of a DASH client may be complex. Segments mentioned in a manifest may be valid, for example, in Apple HLS. A client may poll for new manifests, for example, in Apple HLS. DASH MPD may reduce polling behavior. DASH MPD may reduce polling behavior, for example, by defining MPD update frequency and/or allowing calculation of segment availability.
[0082] A static MPD may be valid. A static MPD may always be valid. A dynamic MPD may be valid. A dynamic MPD may be valid, for example, from the time the dynamic MPD was fetched by the client for the duration of a refresh period. An MPD may expose publication time.
[0083] MPD may provide the availability time of the earliest segment of a period. The availability time of the earliest segment of a period may be denoted as TA(0). A media segment may be denoted as n. A media segment n may be available, for example, starting from time TA(n) = TA(0) + Σ(i=0..n-1) D(SR(i)). A media segment n may be available for the duration of the timeshift buffer. A time shift buffer may be denoted as Ts. The time shift buffer Ts may be stated in the MPD. The availability window size may impact the catch-up TV functionality of a DASH deployment. Segment availability time may be relied upon by the access client, for example, if the segment availability is within the MPD validity period.
[0084] MPD may declare bandwidth, for example, for a representation R. Bandwidth may be denoted as BR. MPD may define a global minimum buffering time. A global minimum buffering time may be denoted as BTmin. An access client may be able to pass a segment to the media engine. An access client may be able to pass a segment to the media engine, for example, after BR × BTmin bits were downloaded. A segment may start with a random access point. The earliest time at which segment n may be passed to the media engine may be denoted as TA(n) + TD(n) + BTmin. The download time of segment n may be denoted as TD(n). A DASH client may start the playout immediately, for example, to minimize delay. MPD may propose a presentation delay (e.g., as an offset from TA(n)). MPD may propose a presentation delay, for example, to ensure synchronization between different clients. Tight synchronization of segment HTTP GET requests may impact infrastructure.
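
For illustration, the following Python sketch works through this availability and earliest hand-off arithmetic; the segment durations, availability start, time-shift buffer, buffering time, and download time are illustrative values, and the helper names are not part of DASH.

# D(SR(i)) per segment, TA(0), BTmin, and Ts as they might be derived from an
# MPD; all values here are hypothetical.
SEGMENT_DURATIONS = [4.0, 4.0, 4.0, 4.0]
T_A0 = 1_000_000.0   # TA(0): availability time of the earliest segment
BT_MIN = 2.0         # global minimum buffering time BTmin
T_S = 30.0           # time-shift buffer Ts

def availability_start(n):
    """TA(n) = TA(0) + sum of D(SR(i)) for i = 0..n-1."""
    return T_A0 + sum(SEGMENT_DURATIONS[:n])

def is_available(n, wall_clock):
    """One plausible reading of the window: [TA(n), TA(n) + D(SR(n)) + Ts]."""
    start = availability_start(n)
    return start <= wall_clock <= start + SEGMENT_DURATIONS[n] + T_S

def earliest_handoff(n, download_time):
    """TA(n) + TD(n) + BTmin: earliest pass of segment n to the media engine."""
    return availability_start(n) + download_time + BT_MIN

print(availability_start(2))         # 1000008.0
print(is_available(2, 1_000_010.0))  # True
print(earliest_handoff(2, 1.5))      # 1000011.5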
[0085] MPD validity and/or segment availability may be calculated using absolute (e.g., wall-clock) time. Media time may be expressed within the segments. Drift may develop between the encoder and client clocks, for example, in the live case. Media time may be addressed at the container level. For example, both MPEG-2 TS and ISO-BMFF provide synchronization functionality.
[0086] HTTP may be stateless and/or client-driven. "Push"-style events may be emulated using polls (e.g., frequent polls). Upcoming ad breaks may be signaled three to eight seconds before the start of the ad break, for example, in current ad insertion practice in cable/IPTV systems. A poll-based implementation may be inefficient. Events may address inefficient ad insertion.
[0087] Events may include timing information. Events may include time and duration information. Events may include payload information. Payload information may include arbitrary information. Events may have application-specific payloads. Inband events may be small message boxes. Small message boxes may appear at the beginning of media segments. MPD events may be a period-level list of timed elements. DASH may define an MPD validity expiration event. An MPD validity event may identify the earliest MPD version valid after a given presentation time.
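
As a rough illustration, the following Python sketch dispatches period-level MPD events against media presentation time. The field names and the ad-break scheme URI are illustrative assumptions; the validity-expiration scheme URI shown is the one DASH defines for MPD events.

from dataclasses import dataclass

@dataclass
class MpdEvent:
    presentation_time: float  # start, in seconds of media presentation time
    duration: float
    scheme_id_uri: str
    payload: str

EVENTS = [
    # Hypothetical ad-break event, announced well before the break starts.
    MpdEvent(120.0, 30.0, "urn:example:ad-break", "ad-id-42"),
    # MPD validity expiration (DASH-defined scheme); the payload is illustrative.
    MpdEvent(300.0, 0.0, "urn:mpeg:dash:event:2012", "new-mpd-publish-time"),
]

def due_events(now, events):
    """Events whose window [start, start + duration] contains the current time."""
    return [e for e in events
            if e.presentation_time <= now <= e.presentation_time + e.duration]

for event in due_events(125.0, EVENTS):
    print(event.scheme_id_uri, event.payload)  # urn:example:ad-break ad-id-42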
[0088] DASH may be agnostic to digital rights management (DRM). DASH may support signaling a DRM scheme. DASH may support signaling DRM scheme properties within the MPD. A DRM scheme may be signaled via the ContentProtection descriptor. An opaque value may be passed within a DRM scheme. A unique identifier for a scheme may be used to signal a DRM scheme. The meaning of the opaque value may be defined to signal a DRM scheme. A scheme-specific namespace may be used to signal a DRM scheme.
[0089] MPEG may provide content protection standards, such as Common Encryption for ISO-BMFF (CENC) and Segment Encryption and Authentication. Common encryption may standardize which parts of a sample are encrypted and/or how encryption metadata may be signaled within a track. The DRM module may be responsible for delivering the keys to the client, for example, given the encryption metadata in the segment. Decryption may use standard AES-CTR and/or AES-CBC modes. The CENC framework may be extensible. The CENC framework may use other encryption algorithms beyond AES-CTR and/or AES-CBC modes, for example, if other encryption algorithms are defined. Common Encryption may be used with several commercial DRM systems. Common encryption may be the system used in DASH (e.g., DASH264).
[0090] DASH Segment Encryption and Authentication (DASH-SEA) may be agnostic to the segment format. Encryption metadata may be passed via the MPD. For example, MPD may include information regarding the key that may be used for decryption of a segment. MPD may include information regarding how to obtain the key that may be used for decryption of a segment. The baseline system may be equivalent to the one defined in HLS, for example, with AES-CBC encryption and HTTPS-based key transport. The baseline system equivalent to the one defined in HLS may make MPEG-2 TS media segments compatible with encrypted HLS segments. The DASH-SEA standard may be extensible. The DASH-SEA standard may allow other encryption algorithms and/or more DRM systems, for example, similar to CENC.
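A minimal sketch of the baseline decryption path, assuming AES-CBC encryption with PKCS#7 padding and a key fetched over HTTPS as in HLS; the key URL and IV would come from the MPD's encryption metadata, and the third-party cryptography package is used:

```python
import urllib.request
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def decrypt_segment(segment: bytes, key_url: str, iv: bytes) -> bytes:
    """Sketch: AES-128-CBC segment decryption with an HTTPS-fetched key."""
    key = urllib.request.urlopen(key_url).read()  # 16-byte AES-128 key
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(segment) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()      # strip PKCS#7 padding
    return unpadder.update(padded) + unpadder.finalize()
```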
[0091] DASH-SEA may offer a segment authenticity framework. Segment authenticity framework may ensure that the segment received by the client is the same as and/or similar to the segment the MPD author intended the client to receive. Segment authenticity framework may use MAC and/or digest algorithms, for example, to prevent content modification within the network (e.g., ad replacement, altering inband events, etc.).
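For example, a digest check may look like the following sketch, which assumes the MPD conveys a SHA-256 digest for the segment (the algorithm choice is an assumption for illustration; DASH-SEA also allows MAC-based schemes):

```python
import hashlib
import hmac

def segment_is_authentic(segment: bytes, expected_sha256_hex: str) -> bool:
    """Sketch: verify a segment against an MPD-conveyed SHA-256 digest."""
    actual = hashlib.sha256(segment).hexdigest()
    # constant-time comparison avoids leaking how much of the digest matched
    return hmac.compare_digest(actual, expected_sha256_hex)
```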
[0092] In an example in which one or more (e.g., all) representations are unmultiplexed, each representation may include one content component (e.g., video, audio in a specific language, etc.). MPEG-2 transport stream segments may be used as an example herein, but all examples are equally relevant to all segment types.
[0093] A single asset may be split into several periods. In transport streams there may be a time distance between audio and video. Start times of two components may be different. The content component used for timing calculations may be difficult to express. A drift may develop between audio and video. A glitch in a transition may occur, for example, if an inserted period is removed. A glitch in a transition may occur, for example, as audio and video durations of previous periods may differ.
[0094] In an example, determinations made in Internet radio may be performed relative to audio, closed captioning, video, etc. Video may be a slide show at a very slow, possibly variable framerate. A "master" component relative to which all calculations may be made may be expressed.

[0095] In an example, there may be two audio streams with different languages (e.g., L1 and L2) in a single multiplex. L1 and L2 may use different codecs (e.g., respectively C1 and C2). C1 may be needed for L1 and C2 may be needed for L2. The lack of support for codec C2 may disallow playing in language L2. The playing in language L1 may be allowed. For example, languages L1 and L2 may be English AC-3 and Spanish MP2. A language may be disallowed, for example, if legacy US content is being retrofitted without re-encoding.
[0096] In an example, it may be difficult to distinguish between closed captioning and/or subtitles multiplexed in a video bitstream (e.g., using CEA-608 or CEA-708, as defined in SCTE 128-1) and "burned in" closed captioning and/or subtitles in which text is a part of the pre-encoded video (e.g., text cannot be removed without special processing). An implementer may recognize differences between multiplexed and burned-in closed captioning and/or subtitles. For example, CEA 608/708 closed captioning may be turned on and off. Burned-in captioning may not be turned on and off. CEA 608/708 implementation may be used with a video codec to display CEA 608/708 closed captioning. The video codec may be used independently to display burned-in closed captioning.
[0097] In an example, signaling and/or handling of a changing aspect ratio, and/or signaling and/or handling of an intended display (e.g., "letterbox"), may use inband signaling, for example, when multiple assets are in a single period (e.g., ad splicing done upstream, with different video characteristics). Signaling aspect ratio in DASH may be utilized at the Representation/AdaptationSet level. The player may be unaware that the aspect ratio may change, for example, when signaling aspect ratio in DASH. Signaling discontinuities may occur. For example, time discontinuities may occur, such as when different streams are spliced together. Discontinuities may be announced, for example, in MPEG-2 TS. Discontinuities may result in a PCR-to-PCR difference, for example, in MPEG-2 TS. Signaling the player to estimate the earliest presentation time (EPT) of a segment using segment duration (e.g., rather than directly using timestamps) may be utilized to lessen the effect of discontinuities and PCR-to-PCR differences.
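A sketch of such duration-based EPT estimation, assuming per-segment durations and @presentationTimeOffset were read from the MPD (names and values are illustrative):

```python
# Sketch: estimate the EPT of segment n from accumulated segment durations
# and @presentationTimeOffset, rather than trusting inband timestamps that
# may jump at splice points.

def estimated_ept(presentation_time_offset, durations, n):
    # EPT(n) is approximately the offset plus the durations of segments 0..n-1
    return presentation_time_offset + sum(durations[:n])

print(estimated_ept(0.0, [2.0, 2.0, 2.002], 2))  # 4.0
```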
[0098] The ContentComponent type in MPEG DASH may be used to express different content components of a multiplexed representation. ContentComponent element definitions may appear in ISO/IEC 23009-1:2014, Section 5.3.4. ContentComponent may be extended to provide additional attributes to express parameters. ContentComponent may be extended to provide a mapping between ContentComponent and the appropriate packet identifiers (PID) (or track identifiers) in a stream. ContentComponent may be defined per adaptation set. ContentComponent may be defined per representation. Table 1 is an example of ContentComponent element definitions, with attributes added to express parameters.
TABLE 1 [reproduced as an image in the original publication]
[0099] The @codecs element in the ContentComponent may be signaled. For example, the @codecs element in the ContentComponent may be signaled when different audio characteristics are present (e.g., two audio streams with different languages in a single multiplex use different codecs). For example, the @codecs element in the ContentComponent may be signaled to distinguish between closed captioning or subtitles multiplexed in a video bitstream and burned-in closed captioning or subtitles in which text is part of pre-encoded video. For example, the @codecs element in the ContentComponent may indicate that a multiplexed representation comprises two audio components multiplexed together, and/or that the audio components are encoded according to different audio codecs. The multiplexed audio components may be in different languages (e.g., as indicated by @lang). The client may determine whether one or more of the audio components are supported by the device.
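As a sketch of how a client might act on this signaling, the following reads per-component @codecs and @lang from ContentComponent elements of a multiplexed adaptation set; the MPD snippet uses the extended @codecs attribute proposed herein, and the supported-codec list is hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical adaptation set with per-component @codecs/@lang, per the
# ContentComponent extension proposed above (Table 1).
MPD_SNIPPET = """
<AdaptationSet xmlns="urn:mpeg:dash:schema:mpd:2011" mimeType="video/mp2t">
  <ContentComponent id="1" contentType="video" />
  <ContentComponent id="2" contentType="audio" lang="en" codecs="ac-3" />
  <ContentComponent id="3" contentType="audio" lang="es" codecs="mp4a.40.2" />
</AdaptationSet>
"""

NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}
SUPPORTED = {"ac-3"}  # codecs this hypothetical device can decode

root = ET.fromstring(MPD_SNIPPET)
for cc in root.findall("mpd:ContentComponent", NS):
    if cc.get("contentType") == "audio":
        playable = cc.get("codecs") in SUPPORTED
        print(cc.get("lang"), cc.get("codecs"),
              "playable" if playable else "skipped")
```

In this sketch the English AC-3 component would be selected while the Spanish MP2/AAC component is skipped, mirroring the L1/L2 scenario above.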
[0100] Signaling of multi-channel audio and/or sampling rates may be added to the
ContentComponent. For example, signaling of multi-channel audio and/or sampling rates may be added to the ContentComponent when different audio characteristics are present.
[0101] A 1:1 mapping may be established between the ContentComponent and the track ID and/or PID. For example, a 1:1 mapping may be established between the ContentComponent and the track ID and/or PID when a single asset is split into several periods.
[0102] The relative offsets of the "non-master" components from the "master" component may be determined, for example, by indicating the offset using the @componentPresentationTimeOffset attribute. The relative offsets may be determined when a single asset is split into several periods. A client may determine an EPT of a video component or an audio component in a segment. The client may determine the relative offset of an audio component in the segment from the EPT of a video component of the segment.
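For example, assuming @componentPresentationTimeOffset and the master component's EPT are expressed in the same timescale, the offset calculation may look like the following sketch (values are illustrative):

```python
# Sketch: offset of a non-master (e.g., audio) component from the master
# (e.g., video) EPT, both expressed in the same timescale.

def relative_offset_ticks(master_ept_ticks, component_pto_ticks):
    """Offset of the non-master component from the master's EPT, in ticks."""
    return component_pto_ticks - master_ept_ticks

offset = relative_offset_ticks(900000, 901800)  # 90 kHz timescale
print(offset, offset / 90000)  # 1800 ticks -> audio offset of 0.02 seconds
```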
[0103] Codecs may be registered for CEA-608 and/or CEA-708. A 4cc value may be registered for CEA 608/708. A 4cc value may be a separate value for each standard (e.g., c608 and c708), or a defined general value (e.g., cc08). A profile may indicate which standard is used (e.g., let cc08.6 stand for CEA-608 and cc08.7 stand for CEA-708).
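A sketch of interpreting such values under the hypothetical registration above (cc08.6 for CEA-608, cc08.7 for CEA-708):

```python
# Sketch: map the hypothetical caption 4cc values discussed above to the
# closed-captioning standard they would indicate.

def caption_standard(codecs_value: str):
    if codecs_value in ("c608", "cc08.6"):
        return "CEA-608"
    if codecs_value in ("c708", "cc08.7"):
        return "CEA-708"
    return None

print(caption_standard("cc08.7"))  # CEA-708
```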
[0104] A PAR value may be defined. A value for AFD may be defined. A value for AFD may be defined when multiple assets are in a single period. The definition of RatioType in ISO/IEC 23009-1:2014 Annex B may permit values that may be disallowed for normal PAR calculation (e.g., the values "0:" and ":0"). Disallowed values may serve as an indicator that the aspect ratio and its handling may be specified inband.
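A sketch of using the disallowed RatioType values as an inband indicator, per the convention described above:

```python
# Sketch: treat the RatioType values "0:" and ":0", which are disallowed for
# normal PAR calculation, as an indicator that aspect ratio is handled inband.

def par_signaled_inband(par: str) -> bool:
    return par in ("0:", ":0")

print(par_signaled_inband("16:9"))  # False -> use the MPD-declared ratio
print(par_signaled_inband("0:"))    # True  -> read aspect ratio/AFD inband
```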
[0105] A Role value may be used to indicate the main component that may be used as a "master" component. A master component may be the component assumed for time calculation. In an example, the role "master", with equivalent semantics, may be added to Table 22 of ISO/IEC 23009-1:2014, Section 5.8.5.5. One "master" content component may appear. @trackId may be present.
[0106] A SupplementalProperty may be introduced, for example, to signal that discontinuities may be present. The expected receiver behavior may be to calculate the mapping of the presentation time specified in the bitstream relative to an estimation of segment duration and/or @presentationTimeOffset.
[0107] An initialization segment and/or first available segment may be fetched (e.g., requested and/or received). An initialization segment and/or first available segment may be fetched when different audio characteristics are present. An initialization segment and/or first available segment may be fetched to distinguish between closed captioning or subtitles multiplexed in a video bitstream and burned-in closed captioning or subtitles in which text is part of pre-encoded video.
[0108] The ContentComponent may be made into an extension of RepresentationBaseType. The ContentComponent may be made into an extension of RepresentationBaseType, for example, to resolve signaling issues.
[0109] Although features and elements are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in any combination with the other features and elements. In addition, the methods described herein may be implemented in a computer program, software, or firmware incorporated in a computer-readable medium for execution by a computer or processor. Examples of computer-readable media include electronic signals (transmitted over wired or wireless connections) and computer-readable storage media. Examples of computer-readable storage media include, but are not limited to, a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs). A processor in association with software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host computer.

Claims

What is Claimed:
1. A device comprising:
a processor configured to:
receive multimedia presentation description (MPD) information relating to content;
determine, based upon the MPD information, that the content comprises a multiplexed representation comprising a closed captioning component multiplexed with at least one of an audio component or a video component;
request the multiplexed representation; and
receive content segments comprising the multiplexed representation.
2. The device of claim 1, wherein the processor is further configured to determine whether the closed captioning component is a CEA-608 closed captioning component or a CEA-708 closed captioning component.
3. The device of claim 2, wherein the processor is further configured to determine whether display of the closed captioning component is supported on the device.
4. The device of claim 2, wherein the processor is further configured to determine whether the device is capable of decoding and displaying the closed captioning component.
5. The device of claim 1, wherein the processor is further configured to determine a language of the closed captioning component.
6. The device of claim 5, wherein the processor is further configured to determine whether the language is supported on the device.
7. The device of claim 5, wherein the processor is further configured to determine whether the device is capable of displaying closed captions in the determined language.
8. A device comprising:
a processor configured to:
receive multimedia presentation description (MPD) information relating to content;
determine, based upon the MPD information, that the content comprises a multiplexed representation comprising two audio components multiplexed together, the audio components encoded according to different audio codecs;
request the multiplexed representation; and
receive content segments comprising the multiplexed representation.
9. The device of claim 8, wherein the processor is further configured to determine a language of at least one audio component.
10. The device of claim 9, wherein the processor is further configured to determine whether the audio codecs are supported.
11. The device of claim 10, wherein the two audio components are in different languages.
12. The device of claim 8, wherein the processor is further configured to distinguish between audio sampling rates for the two audio components of the multiplexed representation.
13. The device of claim 8, wherein the MPD information includes sampling rates for one or more segments included in the multiplexed representation.
14. The device of claim 8, wherein the processor is further configured to determine that an audio codec of an audio segment of the multiplexed representation is not supported, and select a segment of the other audio component of the multiplexed representation for playback.
15. A device comprising:
a processor configured to:
receive multimedia presentation description (MPD) information relating to content;
determine, based upon the MPD information, that the content comprises a multiplexed representation comprising a packet identifier of a segment included in the multiplexed representation;
request the multiplexed representation; and
receive content segments comprising the multiplexed representation.
16. The device of claim 15, wherein the segment is a video component.
17. The device of claim 15, wherein the processor is further configured to determine a track identifier of at least one of a video component and an audio component in the segment.
18. The device of claim 15, wherein the processor is further configured to determine an earliest presentation time (EPT) of a video component or an audio component in the segment.
19. The device of claim 18, wherein the processor is further configured to determine the relative offset of an audio component in the segment from the EPT of a video component of the segment.
20. The device of claim 18, wherein the MPD information includes an indication that one of the video component or the audio component is the main content component.
21. The device of claim 20, wherein the processor is further configured to determine the relative offset of at least one of a video component or an audio component in the segment from the EPT of the main content component of the segment.
22. The device of claim 15, wherein the processor is further configured to determine the aspect ratio of a video component of the segment.
23. A method comprising:
receiving multimedia presentation description (MPD) information relating to content;
determining, based upon the MPD information, that the content comprises a multiplexed representation comprising a closed captioning component multiplexed with at least one of an audio component or a video component;
requesting the multiplexed representation; and
receiving content segments comprising the multiplexed representation.
24. The method of claim 23, further comprising determining whether the closed captioning component is a CEA-608 closed captioning component or a CEA-708 closed captioning component.
25. The method of claim 23, further comprising determining whether display of the closed captioning component is supported.
26. The method of claim 24, further comprising determining whether a device is capable of decoding and displaying the closed captioning component.
27. The method of claim 23, further comprising determining a language of the closed captioning component.
28. The method of claim 27, further comprising determining whether the language is supported on the device.
29. The method of claim 27, further comprising determining whether the device is capable of displaying closed captions in the determined language.
30. A method comprising:
receiving multimedia presentation description (MPD) information relating to content;
determining, based upon the MPD information, that the content comprises a multiplexed representation comprising two audio components multiplexed together, the audio components encoded according to different audio codecs;
requesting the multiplexed representation; and
receiving content segments comprising the multiplexed representation.
31. The method of claim 30, further comprising determining a language of at least one audio component.
32. The method of claim 31, further comprising determining whether the audio codecs are supported.
33. The method of claim 30, wherein the two audio components are in different languages.
34. The method of claim 30, further comprising distinguishing between audio sampling rates for the two audio components of the multiplexed representation.
35. The method of claim 30, wherein the MPD information includes sampling rates for one or more segments included in the multiplexed representation.
36. The method of claim 30, further comprising determining that an audio codec of an audio segment of the multiplexed representation is not supported, and selecting a segment of the other audio component of the multiplexed representation for playback.
37. A method comprising:
receiving multimedia presentation description (MPD) information relating to content;
determining, based upon the MPD information, that the content comprises a multiplexed representation comprising a packet identifier of a segment included in the multiplexed representation;
requesting the multiplexed representation; and
receiving content segments comprising the multiplexed representation.
38. The method of claim 37, wherein the segment is a video component.
39. The method of claim 37, further comprising determining a track identifier of at least one of a video component and an audio component in the segment.
40. The method of claim 37, further comprising determining an earliest presentation time (EPT) of a video component or an audio component in the segment.
41. The method of claim 40, further comprising determining the relative offset of an audio component in the segment from the EPT of a video component of the segment.
42. The method of claim 40, wherein the MPD information includes an indication that one of the video component or the audio component is the main content component.
43. The method of claim 42, further comprising determining the relative offset of at least one of a video component or an audio component in the segment from the EPT of the main content component of the segment.
44. The method of claim 37, further comprising determining the aspect ratio of a video component of the segment.
PCT/US2015/038879 2014-07-01 2015-07-01 Media presentation description signaling in typical broadcast content WO2016004237A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462019784P 2014-07-01 2014-07-01
US62/019,784 2014-07-01

Publications (1)

Publication Number Publication Date
WO2016004237A1 true WO2016004237A1 (en) 2016-01-07

Family

ID=53765535

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/038879 WO2016004237A1 (en) 2014-07-01 2015-07-01 Media presentation description signaling in typical broadcast content

Country Status (2)

Country Link
TW (1) TW201603568A (en)
WO (1) WO2016004237A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120317303A1 (en) * 2011-06-08 2012-12-13 Futurewei Technologies, Inc. System and Method of Media Content Streaming with a Multiplexed Representation

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Liaison statement to DASH IF on DASH", 108. MPEG MEETING;31-3-2014 - 4-4-2014; VALENCIA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. N14370, 8 April 2014 (2014-04-08), XP030021106 *
"Text of ISO/IEC PDTR 23009-3 2nd edition DASH Implementation Guidelines", 108. MPEG MEETING;31-3-2014 - 4-4-2014; VALENCIA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. N14353, 25 April 2014 (2014-04-25), XP030021090 *
IRAJ SODAGAR (ON BEHALF OF DASH-IF): "DASH-IF liaison to MPEG", 108. MPEG MEETING; 31-3-2014 - 4-4-2014; VALENCIA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m33290, 29 March 2014 (2014-03-29), XP030061742 *
THOMAS STOCKHAMMER ET AL: "Dash in mobile networks and services", VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2012 IEEE, IEEE, 27 November 2012 (2012-11-27), pages 1 - 6, XP032309240, ISBN: 978-1-4673-4405-0, DOI: 10.1109/VCIP.2012.6410826 *

Also Published As

Publication number Publication date
TW201603568A (en) 2016-01-16

Similar Documents

Publication Publication Date Title
US12021883B2 (en) Detecting man-in-the-middle attacks in adaptive streaming
US20230209109A1 (en) Systems and methods for generalized http headers in dynamic adaptive streaming over http (dash)
AU2016200390B2 (en) Streaming with coordination of video orientation (cvo)
US20140019635A1 (en) Operation and architecture for dash streaming clients
WO2016172328A1 (en) Content protection and modification detection in adaptive streaming and transport streams
WO2016205674A1 (en) Dynamic adaptive contribution streaming
WO2017100569A1 (en) Trick mode restrictions for mpeg dash
WO2016004237A1 (en) Media presentation description signaling in typical broadcast content

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15745017

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15745017

Country of ref document: EP

Kind code of ref document: A1