FI20225190A1 - Signaling three-dimensional video information in communication networks - Google Patents

Signaling three-dimensional video information in communication networks

Publication number
FI20225190A1
FI20225190A1 FI20225190A FI20225190A FI20225190A1 FI 20225190 A1 FI20225190 A1 FI 20225190A1 FI 20225190 A FI20225190 A FI 20225190A FI 20225190 A FI20225190 A FI 20225190A FI 20225190 A1 FI20225190 A1 FI 20225190A1
Authority
FI
Finland
Prior art keywords
frame
dash
profile
media
video
Prior art date
Application number
FI20225190A
Other languages
Finnish (fi)
Swedish (sv)
Inventor
Ozgur Oyman
Original Assignee
Apple Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Apple Inc
Publication of FI20225190A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/172 Processing image signals image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/178 Metadata, e.g. disparity information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

Embodiments of the present disclosure describe devices, methods, computer-readable media and system configurations for signaling stereoscopic three-dimensional video content capabilities of a device in a communications network. Other embodiments may be described and claimed.

Description

SIGNALING THREE DIMENSIONAL VIDEO INFORMATION IN COMMUNICATION NETWORKS
Field
[0001] Embodiments of the present invention relate generally to the field of communications, and more particularly, to signaling three-dimensional video information in communication networks.
Background
[0002] Three-dimensional (3-D) video offers a high-quality and immersive multimedia experience, which has only recently become feasible on consumer electronics and mobile platforms through advances in display technology, signal processing, transmission technology, and circuit design. It is currently being introduced to the home through various channels, including by Blu-ray Disc™, cable and satellite transmission, etc., as well as to mobile networks through 3-D enabled smartphones, etc. Concepts related to delivery of such content through wireless networks are being developed.
Brief Description of the Drawings
[0003] Embodiments will be readily understood by the following detailed description in conjunction with the accompanying drawings. To facilitate this description, like reference numerals designate like structural elements. Embodiments are illustrated by way of example and not by way of limitation in the figures of the accompanying drawings.
[0004] Figure 1 schematically illustrates a wireless communication network in accordance with various embodiments.
[0005] Figures 2a-b illustrate adaptation of streamed content and/or associated session description and metadata files in accordance with various embodiments.
[0006] Figure 3 illustrates a setup of a streaming session in accordance with an embodiment.
[0007] Figure 4 illustrates frame compatible packing formats in accordance with various embodiments.
[0008] Figure 5 illustrates a method of signaling 3-D video device capabilities in accordance with various embodiments.
[0009] Figure 6 illustrates a method of signaling 3-D video content in accordance with various embodiments.
[00010] Figure 7 schematically depicts an example system in accordance with various embodiments.
Detailed Description
[00011] Illustrative embodiments of the present disclosure include, but are not limited to, methods, systems, computer-readable media, and apparatuses for signaling stereoscopic three-dimensional video content capabilities of a client device in a communication network. Some embodiments in this context may be directed to methods, systems, computer-readable media, and apparatuses for signaling stereoscopic three-dimensional video content capabilities of a mobile device in a wireless communications network.
[00012] Various aspects of the illustrative embodiments will be described using terms commonly employed by those skilled in the art to convey the substance of their work to others skilled in the art. However, it will be apparent to those skilled in the art that alternate embodiments may be practiced with only some of the described aspects. For purposes of explanation, specific numbers, materials, and configurations are set forth in order to provide a thorough understanding of the illustrative embodiments. However, it will be apparent to one skilled in the art that alternate embodiments may be practiced without the specific details. In other instances, well-known features are omitted or simplified in order not to obscure the illustrative embodiments.
[00013] Further, various operations will be described as multiple discrete operations, in turn, in a manner that is most helpful in understanding the illustrative embodiments; however, the order of description should not be construed as to imply that these operations are necessarily order dependent. In particular, these operations need not be performed in the order of presentation.
[00014] The phrase “in some embodiments” is used repeatedly. The phrase generally does not refer to the same embodiments; however, it may. The terms “comprising,” “having,” and “including” are synonymous, unless the context dictates otherwise. The phrase “A and/or B” means (A), (B), or (A and B). The phrases “A/B” and “A or B” mean (A), (B), or (A and B), similar to the phrase “A and/or B”. The phrase “at least one of A, B and C” means (A), (B), (C), (A and B), (A and C), (B and C) or (A, B and C). The phrase “(A) B” means (B) or (A and B), that is, A is optional.
[00015] Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a wide variety of alternate and/or equivalent implementations may be substituted for the specific embodiments shown and described, without departing from the scope of the embodiments of the present disclosure. This application is intended to cover any adaptations or variations of the embodiments discussed herein. Therefore, it is manifestly intended that the embodiments of the present disclosure be limited only by the claims and the equivalents thereof.
[00016] As used herein, the term “module” may refer to, be part of, or include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (shared, dedicated, or group) and/or memory (shared, dedicated, or group) that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that provide the described functionality.
[00017] Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing the standard, the joint video team of the ITU-T Video Coding Experts Group (VCEG) and the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Moving Picture Experts Group (MPEG) has also standardized an extension of AVC that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of the video scene, such as multiple synchronized video cameras.
[00018] In stereoscopic 3-D video applications, two views are displayed: one for the left eye and one for the right eye. There are various ways of formatting the views of stereoscopic 3-D video content. In one embodiment, the encoding of stereo-paired 3-D video may be a special case of MVC, where left and right eye views are produced via MVC. Other encoding formats for producing 3-D video content are also possible. Various devices may have different capabilities with respect to decoding and rendering these different formats. Embodiments described herein provide for various parameters of a device capability exchange that may facilitate delivery and viewing of the 3-D video content in a communication network such as a wireless network, e.g., an evolved universal terrestrial radio access network (EUTRAN).
[00019] Figure 1 schematically illustrates a network environment 100 in accordance with various embodiments. The network environment 100 includes a user equipment (UE) 104, which may also be referred to as a client terminal or mobile device, wirelessly coupled with a radio access network (RAN) 108. The RAN 108 may include an enhanced node base station (eNB) 112 configured to communicate with the UE 104 via an over-the-air (OTA) interface. The RAN 108 may be part of a third generation partnership project (3GPP) long-term evolution (LTE) advanced network and may be referred to as an EUTRAN. In other embodiments, other radio access network technologies may be utilized.
[00020] The UE 104 may communicate with a remote media server 116 through the RAN 108. While the eNB 112 is shown communicating directly with the media server, it will be understood that the communications may flow through a number of intermediate networking components, e.g., switches, routers, gateways, etc., in various embodiments. For example, in some embodiments, the RAN 108 may be coupled with a core services network (CSN) that communicatively couples the RAN 108 with a larger network, e.g., a wide area network, of which the media server 116 may be considered a part.
[00021] While Figure 1 describes the network environment as a wireless communication network, other embodiments may be used in other types of networks, e.g., wire-line networks. It may be understood that other network environments in which embodiments of the present invention may be employed may include additional, fewer, or different components than those explicitly shown in the example depicted in Figure 1. For example, embodiments of the present invention employed in a wire-line network may have the media server 116 and the UE 104 communicating with one another without the RAN 108.
[00022] The UE 104 and media server 116 may have a number of components that are configured to facilitate access, storage, transmission, and display of 3-D video content. For example, the UE 104 may include a content management module 120, a media player 124 having a streaming application 126, and a display 128. The streaming application 126 may have sufficient functionality to receive 3-D video content and associated information; decode, unpack, and otherwise re-assemble the 3-D video; and render the 3-D video on the display 128. In various embodiments, the streaming application 126 may be referred to in the context of the streaming technology employed. For example, in embodiments in which the content is streamed by a packet-switched streaming service (PSS), the streaming application 126 may be referred to as a PSS application. The content management module 120 may negotiate or otherwise communicate streaming parameters including, e.g., device capability parameters, to enable receipt of the data in a manner that facilitates operation of the media player 124.
[00023] The media server 116 may include a content delivery module 132 having a streaming application 134, a content management module 136, and a content storage 140. The content delivery module 132 may encode, pack, or otherwise assemble 3-D video content, stored in the content storage 140, for transmission to one or more UEs, e.g., UE 104. The content management module 136 may negotiate or otherwise communicate streaming parameters including, e.g., device capability parameters, and control the content delivery module 132 in a manner to facilitate delivery of the 3-D content.
[00024] In some embodiments, one or more of the components that are shown as being part of the media server 116 may be disposed separately from the media server 116 and communicatively coupled with the media server over a communication link. For example, in some embodiments, content storage 140 may be disposed remotely from the content delivery module 132 and the content management module 136.
[00025] In some embodiments, the content delivery module 132 may deliver, through eNB 112 in one example, the 3-D video content to the UE 104 in accordance with a 3GPP streaming standard. For example, the 3-D video content may be transmitted in accordance with a PSS standard, e.g., 3GPP TS 26.234 V11.0.0 (March 16, 2012), a dynamic adaptive streaming over HTTP (DASH) standard, e.g., 3GPP TS 26.247 V.11.0.0 (March 16, 2012), a multimedia broadcast and multicast service (MBMS) standard, e.g., TS 26.346 V11.1.0 (June 29, 2012), and/or an IMS-based PSS and MBMS services (IMS_PSS_MBMS) standard, e.g., TS 26.237 V.11.0.0 (June 29, 2012). The streaming application 126 may be configured to receive the 3-D video content over any of a number of transport protocols, e.g., real-time transport protocol (RTP), hypertext transfer protocol (HTTP), etc.
[00026] Capability exchange enables media streaming servers, such as media server 116, to provide a wide range of devices with video content suitable for the particular device in question. To facilitate server-side content negotiation for streaming, the media server 116 may determine the specific capabilities of the UE 104.
[00027] The content management module 120 and the content management module 136 may negotiate or otherwise communicate parameters of a 3-D video content streaming session. This negotiation may take place through session-level signaling via the RAN 108. In some embodiments, the session-level signaling may include transmissions related to device capability information that includes stereoscopic 3-D video decoding and rendering capabilities of the media player 124. In various embodiments, the device capability information may further include pre-decoder buffer size, initial buffering, decoder capability, display properties (screen size, resolution, bit depth, etc.), streaming method (real-time streaming protocol (RTSP), HTTP, etc.), adaptation support, quality of experience (QoE) support, extended real-time transport control protocol (RTCP) reporting support, fast content switching support, supported RTP profiles, session description protocol (SDP) attributes, etc.
[00028] During the setup of the streaming session, the content management module 136 may use the device capability information to control the content delivery module 132 in a manner to provide the UE 104 with the proper type of multimedia content. For example, the media server 116 may determine which variants of multiple available variants of a video stream are desired based on the actual capabilities of the UE 104 to determine the best-suited streams for that terminal. This may allow for improved delivery of 3-D video content and associated session description and metadata files, for example an SDP file or a media presentation description (MPD) file, to the UE 104.
[00029] The content delivery module 132 may access the content in the content storage 140 and adapt the content and/or associated session description and metadata files, e.g., SDP/MPD files, according to the negotiated session parameters prior to delivery of the content/associated files. The content, when delivered to the UE 104, may be decoded by the media player 124 and rendered on the display 128.
[00030] Adaptation of content and/or associated session description and metadata files is shown in accordance with some specific examples with reference to Figures 2a-b, while setup of a streaming session is shown in accordance with a specific example with reference to Figure 3.
[00031] Figure 2a illustrates a DASH-based streaming embodiment with adaptation of 3-D video formats in accordance with some embodiments. In particular, Figure 2a illustrates an HTTP server 204 in communication with a DASH client 208 and implementing a pull-based streaming embodiment, in which the streaming control is maintained by the client rather than the server, where the client downloads content from the server through a series of HTTP-based request-response transactions after the inspection of the MPD. In DASH-based streaming, the MPD metadata file provides information on the structure and different versions of the media content representations stored in the HTTP server 204 (including different bitrates, frame rates, resolutions, codec types, etc.). Based on this MPD metadata information that describes the relation of the segments and how they form a media presentation, DASH client 208 may request the media segments using HTTP GET or partial GET methods. The HTTP server 204 and DASH client 208 may be similar to and substantially interchangeable with media server 116 and UE 104, respectively.
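The difference between a GET and a partial GET of a media segment can be illustrated with a minimal sketch. It only builds request objects and does not contact a server; the URL and the helper name `segment_request` are hypothetical and not taken from the disclosure.

```python
from urllib.request import Request


def segment_request(url, first_byte=None, last_byte=None):
    """Build (but do not send) an HTTP GET for a media segment.

    When a byte range is given, the request becomes a partial GET
    by carrying a Range header.
    """
    req = Request(url, method="GET")
    if first_byte is not None:
        end = "" if last_byte is None else str(last_byte)
        req.add_header("Range", f"bytes={first_byte}-{end}")
    return req


# Hypothetical segment URL, used for illustration only.
full = segment_request("http://example.com/video/seg1.m4s")
partial = segment_request("http://example.com/video/seg1.m4s", 0, 1023)
print(full.get_method(), full.get_header("Range"))
print(partial.get_method(), partial.get_header("Range"))
```

A client would typically pick the URLs and byte ranges from the MPD before issuing such requests.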
[00032] In DASH, the set of 3-D video formats and corresponding content information may be signaled to the DASH client 208 in the MPD. Depending on the capability profile of the DASH client 208 and its supported 3-D formats, the HTTP server 204 may offer different formatted content, e.g., the HTTP server 204 may exclude the 3-D formats that are not supported by the DASH client 208 in the MPD and only include those that are supported by the DASH client 208. In this context, the HTTP server 204 may provide the content optimized for different 3-D video formats to the DASH client 208. In doing this, the HTTP server 204 may use the device capability exchange signaling from the DASH client 208 describing the various supported 3-D video formats. The DASH client 208 may then request the corresponding versions of the 3-D video content supported by the DASH client 208. Moreover, when retrieving an MPD with HTTP, the DASH client 208 may include 3-D video codec and format information in a GET request, including any temporary adjustments to the 3-D video formats based on profile difference (ProfDiff). In an example, the difference may be configured to temporarily modify one or more MPD parameters for a content presentation session. For example, the difference may be configured to modify the MPD until the content presentation session ends or a subsequent difference (corresponding to the first communicated difference) is communicated to the HTTP server 204. This way the HTTP server 204 may deliver an optimized MPD to the DASH client 208.
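As a rough illustration of the server-side tailoring described above, the following sketch filters a list of representations down to those whose 3-D format the client reported as supported. The `Representation` type, its field names, and the format labels are assumptions made for this example; a real MPD is an XML document with its own schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    rep_id: str
    frame_format: str  # hypothetical label, e.g. "side-by-side"
    bitrate_kbps: int


def tailor_mpd(representations, supported_formats):
    """Keep only representations whose 3-D format the client supports."""
    return [r for r in representations if r.frame_format in supported_formats]


catalog = [
    Representation("r1", "side-by-side", 1500),
    Representation("r2", "top-bottom", 1500),
    Representation("r3", "checkerboard", 2500),
]
offered = tailor_mpd(catalog, {"side-by-side", "top-bottom"})
print([r.rep_id for r in offered])
```

The unsupported checkerboard representation is simply left out of what the client is offered.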
[00033] Figure 2b illustrates an RTSP-based streaming embodiment with adaptation of 3-D video formats in accordance with some embodiments. In particular, Figure 2b illustrates a server 212 and a client 216 implementing a push-based streaming method, in which the streaming and session control are maintained by the server 212 rather than the client 216. The server 212 and client 216 may be similar to and substantially interchangeable with media server 116 and UE 104, respectively.
[00034] Examples of push-based streaming include PSS and IMS_PSS_MBMS services based on the RTSP and session initiation protocol (SIP), respectively. In this context, the server 212 receives the set of supported 3-D video codecs and formats from the client 216 and adapts the content based on this information, e.g., the server 212 selects the most suited content version among stored content versions or dynamically transcodes the content based on the supported 3-D video formats and streams the content to the client 216. The session-related metadata carried in the SDP may carry the 3-D video format information for the streamed content.
[00035] Figure 3 illustrates a service discovery with subscribe/notify for IMS_PSS_MBMS service in accordance with some embodiments. In particular, Figure 3 illustrates interactions between a UE 304, an IP Multimedia (IM) Core Network (CN) subsystem 308, and a service discovery function (SDF) 312. The UE 304 may be similar to and substantially interchangeable with UE 104. The IM CN subsystem 308 and the SDF 312 may be part of a core network domain that interfaces with the access network domain, e.g., the RAN 108.
[00036] In the IMS_PSS_MBMS service, the UE 304 can send device capability information, e.g., supported 3-D video codecs and formats, in a SIP SUBSCRIBE message to the IM CN subsystem 308 during service discovery. The IM CN subsystem 308 may then forward the message to the SDF 312. The SDF 312 determines the proper service discovery information, e.g., according to the capabilities of the UE 304 as described in the user's profile (Personalized Service Discovery). The SDF 312 may then send a SIP 200 OK message to the IM CN subsystem 308, which is relayed to the UE 304 to confirm the session initialization based on the sent device capability information that also includes the supported 3-D video codecs and formats. Afterward, the SDF 312 may send a SIP NOTIFY message, with service discovery information, to the IM CN subsystem 308, which relays the SIP NOTIFY message back to the UE 304. The UE 304 may then respond by sending a SIP 200 OK message to the IM CN subsystem 308, which is then relayed to the SDF 312.
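The message exchange described for Figure 3 can be summarized as an ordered list of hops. The sketch below is illustrative only: the strings are labels, not real SIP syntax, and the function name `service_discovery_flow` is an assumption made for this example.

```python
def service_discovery_flow(capabilities):
    """Return the ordered (sender, receiver, message) hops described in the text."""
    ue, core, sdf = "UE", "IM CN subsystem", "SDF"
    # The capability list rides along in the SUBSCRIBE (label only, not SIP syntax).
    subscribe = f"SIP SUBSCRIBE {sorted(capabilities)}"
    return [
        (ue, core, subscribe),         # UE reports supported 3-D codecs/formats
        (core, sdf, subscribe),        # forwarded to the SDF
        (sdf, core, "SIP 200 OK"),     # SDF confirms
        (core, ue, "SIP 200 OK"),      # relayed to the UE
        (sdf, core, "SIP NOTIFY"),     # service discovery information
        (core, ue, "SIP NOTIFY"),      # relayed to the UE
        (ue, core, "SIP 200 OK"),      # UE acknowledges the NOTIFY
        (core, sdf, "SIP 200 OK"),     # relayed to the SDF
    ]


flow = service_discovery_flow({"side-by-side", "top-bottom"})
for sender, receiver, message in flow:
    print(f"{sender} -> {receiver}: {message}")
```

Every message passes through the IM CN subsystem, so the UE and the SDF never exchange messages directly in this flow.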
[00037] Such a framework enables optimized service discovery utilizing the supported 3-D video formats in IMS-based PSS and MBMS user services. Later during the IMS session, the UE 304 may also use SIP signaling to indicate updates including any temporary adjustments to the set of supported 3-D video codecs and formats based on ProfDiff (e.g., if the current device orientation is different from the default device orientation). This may be done by refreshing the subscription through further SIP SUBSCRIBE messages including information on the updates to the 3-D video format information.
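One way to picture a temporary, ProfDiff-style adjustment is as a set of overrides layered on top of a stored capability profile for the duration of a session. The sketch below assumes a plain dictionary representation; the attribute names (`formats_3d`, `orientation`) are hypothetical, and the actual ProfDiff encoding is not reproduced here.

```python
def effective_profile(base_profile, overrides):
    """Overlay temporary overrides on a base profile without mutating either."""
    merged = dict(base_profile)
    merged.update(overrides)
    return merged


# Hypothetical attribute names, for illustration only.
base = {"formats_3d": ["side-by-side", "top-bottom"], "orientation": "landscape"}
temporary = {"formats_3d": ["top-bottom"], "orientation": "portrait"}

session_view = effective_profile(base, temporary)  # while the adjustment applies
after_session = effective_profile(base, {})        # once it no longer applies
print(session_view)
print(after_session)
```

Because the base profile is never modified, dropping the overrides restores the default capabilities.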
[00038] Referring again to Figure 1, in some embodiments, the media server 116 may be coupled with a device profile server 144 that has profile information of the UE 104. The profile information may include some or all of the device capability information. In such embodiments, the media server 116 may receive identification information from the UE 104 and then retrieve the profile information from the device profile server 144. This may be done as part of the session-level signaling.
[00039] In some embodiments, the UE 104 may supplement the profile information retrieved from the device profile server 144 with extra attributes or overrides for attributes already defined in its device capability profile, based on ProfDiff signaling. In one example, such a temporary adjustment may be triggered by user preferences, for example if the user for a particular session only would like to receive two-dimensional (2-D) video even though the terminal is capable of rendering 3-D video.
[00040] The streaming application 134 may encode the 3-D video content for transmission in the network environment 100 in accordance with a number of different stream types, with each stream type having associated frame types. Frame types could include frame packing, simulcast, or 2-D plus auxiliary frame types.
[00041] Frame packing may include frame-compatible packing formats and full-resolution per view (FRPV) packing format. In frame-compatible packing formats, the streaming application 134 may spatially pack constituent frames of a stereo pair into a single frame and encode the single frame. Output frames produced by the streaming application 126 contain constituent frames of a stereo pair. The spatial resolution of the original frames of each view and the packaged single frame may be the same. In this case, the streaming application 134 may down-sample the two constituent frames before the packing operation. The frame-compatible packing formats may use a vertical interleaving, horizontal interleaving, side-by-side, top-bottom, or checkerboard format as illustrated in Figures 4a-e, respectively, and the down sampling may be performed accordingly.
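A minimal sketch of the side-by-side case may make the down-sample-then-pack step concrete. It assumes frames are lists of rows and down-samples by simply keeping every other column; a practical encoder would normally low-pass filter before decimating, and nothing here is taken from a specific codec.

```python
def pack_side_by_side(left, right):
    """Pack two equally sized views into one frame of the same size.

    Each view is horizontally down-sampled by keeping every other column,
    then the two halves are placed next to each other.
    """
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("views must have identical dimensions")
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


left = [["L00", "L01", "L02", "L03"],
        ["L10", "L11", "L12", "L13"]]
right = [["R00", "R01", "R02", "R03"],
         ["R10", "R11", "R12", "R13"]]

packed = pack_side_by_side(left, right)
for row in packed:
    print(row)
```

The packed frame has the same width and height as each original view, which is what lets it pass through a 2-D encoding pipeline unchanged.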
[00042] In some embodiments, the streaming application 134 may indicate the frame-packing format that was used by including one or more frame packing arrangement supplemental enhancement information (SEI) messages as specified in the H.264/AVC standard into the bitstream. The streaming application 126 may decode the frame, unpack the two constituent frames from the output frames of the decoder, up sample the frames to revert the encoder side down sampling process, and render the constituent frames on the display 128.
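The receiver-side steps (unpack, then up-sample) can be sketched for a side-by-side frame as follows. This is an assumption-laden illustration: frames are lists of rows, and up-sampling is done by repeating each column, whereas a real renderer would typically interpolate.

```python
def unpack_side_by_side(packed):
    """Split a side-by-side frame into two views and restore the original width.

    Up-sampling here just repeats each column twice (nearest neighbour).
    """
    left, right = [], []
    for row in packed:
        half = len(row) // 2
        left.append([px for px in row[:half] for _ in range(2)])
        right.append([px for px in row[half:] for _ in range(2)])
    return left, right


packed = [["a", "b", "x", "y"],
          ["c", "d", "z", "w"]]
left, right = unpack_side_by_side(packed)
print(left)
print(right)
```

Each recovered view has the width of the packed frame, at the cost of the horizontal detail lost during the encoder-side down-sampling.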
[00043] A FRPV packing format may include temporal interleaving. In temporal interleaving, the 3-D video may be encoded at double the frame rate of the original video with each pair of consecutive pictures constituting a stereo pair (left and right view). The rendering of the time interleaved stereoscopic video may typically be performed at a high frame rate, where active (shutter) glasses are used to block the incorrect view in each eye. This may rely on accurate synchronization between the glasses and the screen.
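Temporal interleaving can be illustrated with a short sketch that alternates left and right frames and doubles the frame rate. The frame labels and the function name are placeholders chosen for this example.

```python
def interleave_temporally(left_frames, right_frames, fps):
    """Alternate left/right frames; the output runs at twice the input frame rate."""
    if len(left_frames) != len(right_frames):
        raise ValueError("both views need the same number of frames")
    sequence = []
    for left, right in zip(left_frames, right_frames):
        sequence.extend([left, right])  # each consecutive pair is one stereo pair
    return sequence, fps * 2


sequence, out_fps = interleave_temporally(["L0", "L1"], ["R0", "R1"], 30)
print(sequence, out_fps)
```

Each view keeps its full spatial resolution, which is why this format is grouped under full-resolution per view rather than frame-compatible packing.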
[00044] In embodiments using simulcast frame types, the left and the right views may be transmitted in separate, simulcast streams. The separately transmitted streams may be combined by the streaming application 126 and jointly decoded.
[00045] In embodiments using 2-D plus auxiliary frame types, 2-D video content may be sent by the streaming application 134 in conjunction with auxiliary information that may be used by the streaming application 126 to render 3-D video on the display 128. This auxiliary information may be, e.g., a depth/parallax map that is a 2-D map with each pixel defining a depth/parallax of one or more pixels in an associated 2-D video frame.
[00046] In some embodiments, other frame types may be used. For example, in some embodiments the streaming application 134 may be capable of encoding stereoscopic views into a base view stream and a non-base view stream, which may be transmitted in the same or different streams. In some embodiments, this may be referred to as MVC-based for stereoscopic video. The non-base view stream may include inter-view prediction frames that provide spatial/temporal predictive information. The base view stream may be sufficient for a single-view, e.g., 2-D, decoder to render the base view as 2-D video, while the non-base view stream may provide 3-D decoders, e.g., streaming application 126, with sufficient information to render 3-D video. If the media server 116 is aware of UEs’ capabilities, it can omit sending the non-base view stream to a device that does not support 3-D video or does not have sufficient bitrate to support 3-D video.
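The decision to omit the non-base view stream can be sketched as a simple rule. The parameter names and the additive bitrate check are assumptions made for illustration; the disclosure does not specify how sufficiency of bitrate is evaluated.

```python
def streams_to_send(supports_3d, available_kbps, base_kbps, non_base_kbps):
    """Always send the base view; add the non-base view only if it can be used."""
    streams = ["base"]
    # Assumed rule: both streams must fit within the available bitrate.
    if supports_3d and available_kbps >= base_kbps + non_base_kbps:
        streams.append("non-base")
    return streams


print(streams_to_send(True, 3000, 1500, 1000))   # 3-D capable, enough bitrate
print(streams_to_send(False, 3000, 1500, 1000))  # 2-D only device
print(streams_to_send(True, 2000, 1500, 1000))   # 3-D capable, too little bitrate
```

In the last two cases the device still receives a decodable 2-D presentation from the base view alone.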
[00047] In various embodiments, the device capability information, transmitted from content management module 120 and/or device profile server 144 to content management module 136, may include a 3-D format attribute that includes a list of one or more formats relevant for streaming of stereoscopic 3-D video over relevant transmission protocol, e.g., RTP or HTTP, supported by the streaming application 126. In some embodiments, the 3-D format attribute may be a streaming frame packing format for RTP or HTTP having an integer value “1” for vertical interleaving, “2” for horizontal interleaving, “3” for side-by-side, “4” for top-bottom, “0” for checkerboard, or “5” for temporal interleaving. In some embodiments, the same 3-D format attributes may be used to indicate frame packing formats supported in a specific file or container format. In some embodiments, the 3-D format attribute may include a more generalized value, e.g., “FP” for frame packing.
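The integer values listed above map naturally onto an enumeration. In the sketch below the numeric values come from the text, while the class name, member names, and the parsing helper are assumptions made for this example.

```python
from enum import IntEnum


class FramePackingFormat(IntEnum):
    """Integer values as listed in the text for the 3-D format attribute."""
    CHECKERBOARD = 0
    VERTICAL_INTERLEAVING = 1
    HORIZONTAL_INTERLEAVING = 2
    SIDE_BY_SIDE = 3
    TOP_BOTTOM = 4
    TEMPORAL_INTERLEAVING = 5


def parse_supported_formats(values):
    """Turn a list of attribute values (strings or ints) into enum members."""
    return [FramePackingFormat(int(v)) for v in values]


supported = parse_supported_formats(["3", "4", "0"])
print([fmt.name for fmt in supported])
```

An unknown value such as "9" raises `ValueError`, which gives a server an explicit point at which to reject or ignore unrecognized formats.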
[00048] In some embodiments, the 3-D format attribute may be another streaming format having a value “SC” for simulcast or “2DA” for 2-D video plus
N auxiliary information.
[00049] In embodiments in which the UE 104 supports more than one format type, it may further indicate one or more preferred format types. This could be done by listing the format types in an order of preference, associating a preference indicator with select format types, etc.
[00050] In some embodiments, in addition to providing a frame type attribute, the content management module 120 and/or the device profile server 144 may provide one or more component type attributes. The component type attributes may provide additional details about specific types of video components, which are constituent elements of the stereoscopic 3-D video, supported and/or preferred by the streaming application 126.
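One way a server might use a preference-ordered list of format types is sketched below. The function name `choose_format` is hypothetical, and the policy of taking the first mutually supported entry is an assumption made for illustration; the disclosure only states that an order of preference may be signaled.

```python
def choose_format(ue_preference_order, server_available):
    """Pick the first format type in the UE's preference order that the
    server can offer. Returns None when there is no overlap."""
    available = set(server_available)
    for fmt in ue_preference_order:
        if fmt in available:
            return fmt
    return None
```

For example, a UE listing “FP”, “SC”, “2DA” in that order would be served “SC” by a server that offers only “SC” and “2DA”.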
[00051] The component type attributes may have a value “C” for indicating a center-view stream, “CD” for indicating a center-view stream and a depth map, “CP” for indicating a center-view stream and a parallax map, “D” for indicating a depth map, “P” for indicating a parallax map, “L” for indicating a left-view stream, “LD” for indicating a left-view stream and a depth map, “LIL” for indicating video frames that include alternating scan lines from the left and right views, “LP” for indicating a left-view stream and a parallax map, “R” for indicating a right-view stream, “Seq” to indicate frame sequential (e.g., a video stream that includes alternating frames from the left and right streams; additional signaling, e.g., AVC SEI messages, may be needed to signal which frames contain left and right views), “SbS” for indicating side-by-side, and “TaB” for indicating top and bottom.
[00052] Each format type attribute may be associated with a respective set of component type attributes. For example, if the format type is SC, the associated component type may be L or R to indicate left and right views, respectively.
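The association between format types and component types may be sketched as a lookup table. In the sketch below, the component type codes are those listed above, and only the “SC” row of `FORMAT_COMPONENTS` is stated in the text; the function name `check_components` and the choice to accept format types that have no known association are assumptions of this example.

```python
# Component type codes and meanings, as listed in the description.
COMPONENT_TYPES = {
    "C": "center-view stream",
    "CD": "center-view stream and depth map",
    "CP": "center-view stream and parallax map",
    "D": "depth map",
    "P": "parallax map",
    "L": "left-view stream",
    "LD": "left-view stream and depth map",
    "LIL": "alternating scan lines from left and right views",
    "LP": "left-view stream and parallax map",
    "R": "right-view stream",
    "Seq": "frame sequential",
    "SbS": "side-by-side",
    "TaB": "top and bottom",
}

# Only the "SC" row comes from the text; other rows would be added per deployment.
FORMAT_COMPONENTS = {"SC": {"L", "R"}}


def check_components(format_type, components, table=FORMAT_COMPONENTS):
    """Return True if every component type is allowed for the format type.

    Unknown component codes raise ValueError. A format type with no entry
    in the table is accepted (assumption: no restriction is known).
    """
    unknown = [c for c in components if c not in COMPONENT_TYPES]
    if unknown:
        raise ValueError(f"unknown component type(s): {unknown}")
    allowed = table.get(format_type)
    if allowed is None:
        return True
    return all(c in allowed for c in components)
```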
[00053] The device capability exchange signaling defined in the PSS specification 3GPP TS 26.234 enables servers to provide a wide range of devices with content suitable for the particular device in question. In order to improve delivery of stereoscopic 3-D video content to the client terminal, the present disclosure describes a new set of attributes that may be included in the PSS vocabulary for device capability exchange signaling. These proposed attributes may describe the 3-D video decoding and rendering capabilities of the client terminal, including which 3-D video frame packing formats the client supports. This may, for example, allow the server and network to provide an optimized RTSP SDP or DASH MPD to the client terminal, as well as to perform the appropriate transcoding and 3-D format conversions in order to match the transmitted 3-D video content to the capabilities of the client device.
[00054] The device capability exchange signaling of supported 3-D video codecs and formats may be enabled in 3GPP TS 26.234 with the inclusion of three new attributes in the PSS vocabulary: (1) for the Streaming component, two attributes indicating the list of supported frame packing formats relevant for streaming of stereoscopic 3-D video over RTP and HTTP, respectively, and (2) for the ThreeGPFileFormat component, one attribute indicating the list of supported frame packing formats relevant for stereoscopic 3-D video that can be included in a 3GPP file format (3GP) file, which is a multimedia container format commonly used for 3GPP-based multimedia services. The details of the attribute definitions are presented below in accordance with some embodiments.
[00055] Attribute name: StreamingFramePackingFormatsRTP
[00056] Attribute definition: List of supported frame packing formats relevant for streaming of stereoscopic 3-D video over RTP supported by the PSS application. The frame packing formats within scope for stereoscopic 3-D video include:
[00057] Frame Compatible Packing Formats: 1 = Vertical interleaving, 2 = Horizontal interleaving, 3 = Side-by-Side, 4 = Top-Bottom, 0 = Checkerboard
[00058] Full-Resolution per View Packing Formats: 5 = Temporal Interleaving
[00059] Component: Streaming
[00060] Type: Literal (Bag)
[00061] Legal values: List of integer values corresponding to the supported frame packing formats.
[00062] Resolution rule: Append
[00063] EXAMPLE: <StreamingFramePackingFormatsRTP>
[00064] <rdf:Bag>
[00065] <rdf:li>3</rdf:li>
[00066] <rdf:li>4</rdf:li>
[00067] </rdf:Bag>
[00068] </StreamingFramePackingFormatsRTP>
[00069] Attribute name: StreamingFramePackingFormatsHTTP
[00070] Attribute definition: List of supported frame packing formats relevant for streaming of stereoscopic 3-D video over HTTP supported by the PSS application. The frame packing formats within scope for stereoscopic 3-D video include:
[00071] Frame Compatible Packing Formats: 1 = Vertical interleaving, 2 = Horizontal interleaving, 3 = Side-by-Side, 4 = Top-Bottom, 0 = Checkerboard
[00072] Full-Resolution per View Packing Formats: 5 = Temporal Interleaving
[00073] Component: Streaming
[00074] Type: Literal (Bag)
[00075] Legal values: List of integer values corresponding to the supported frame packing formats.
[00076] Resolution rule: Append
[00077] EXAMPLE: <StreamingFramePackingFormatsHTTP>
[00078] <rdf:Bag>
[00079] <rdf:li>3</rdf:li>
[00080] <rdf:li>4</rdf:li>
[00081] </rdf:Bag>
[00082] </StreamingFramePackingFormatsHTTP>
[00083] Attribute name: ThreeGPFramePackingFormats
[00084] Attribute definition: List of supported frame packing formats relevant for stereoscopic 3-D video that can be included in a 3GP file and handled by the PSS application.
[00085] Component: ThreeGPFileFormat
[00086] Type: Literal (Bag)
[00087] Legal values: List of integer values corresponding to the supported frame packing formats. Integer values shall be either 3 or 4, corresponding to the Side-by-Side and Top-and-Bottom frame packing formats, respectively.
[00088] Resolution rule: Append
[00089] EXAMPLE: <ThreeGPFramePackingFormats>
[00090] <rdf:Bag>
[00091] <rdf:li>3</rdf:li>
[00092] <rdf:li>4</rdf:li>
[00093] </rdf:Bag>
[00094] </ThreeGPFramePackingFormats>
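A receiver of such a capability profile might read the RDF Bag as sketched below, using the Python standard library. This is a simplified illustration: the fragment is wrapped with an `xmlns:rdf` declaration so that it parses on its own, the attribute element is left without a vocabulary namespace, and the names `read_bag` and `validate_3gp_formats` are hypothetical. The restriction to values 3 and 4 follows the legal values stated for ThreeGPFramePackingFormats.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example fragment above, with the rdf prefix declared so it is well-formed alone.
EXAMPLE = f"""
<ThreeGPFramePackingFormats xmlns:rdf="{RDF_NS}">
  <rdf:Bag>
    <rdf:li>3</rdf:li>
    <rdf:li>4</rdf:li>
  </rdf:Bag>
</ThreeGPFramePackingFormats>
"""


def read_bag(xml_text):
    """Return (attribute_name, [integer values]) for one Bag-typed attribute."""
    root = ET.fromstring(xml_text.strip())
    items = root.findall(f"{{{RDF_NS}}}Bag/{{{RDF_NS}}}li")
    return root.tag, [int(li.text) for li in items]


def validate_3gp_formats(values):
    """Raise ValueError if any value is outside the legal set {3, 4}."""
    bad = [v for v in values if v not in (3, 4)]
    if bad:
        raise ValueError(f"illegal ThreeGPFramePackingFormats value(s): {bad}")
    return values
```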
[00095] In some embodiments, a media presentation, as described in an MPD, for example, may include attributes and elements common to Adaptation Set, Representation, and SubRepresentation. One such common element may be a FramePacking element. A FramePacking element may specify frame packing arrangement information of the video media component type. When no FramePacking element is provided for a video component, frame-packing may not be used for the video media component.
[00096] The FramePacking element may include an @schemeIdUri attribute that includes a uniform resource identifier (URI) to identify the frame packing configuration scheme employed. In some embodiments, the FramePacking element may further include an @value attribute to provide a value for the descriptor element.
[00097] In some embodiments, multiple FramePacking elements may be present. If so, each element may contain sufficient information to select or reject the described representation.
[00098] If the scheme or the value of all FramePacking elements is not recognized, the client may ignore the described Representations. A client may reject the Adaptation Set on the basis of observing a FramePacking element.
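The client-side rule in the preceding paragraphs may be sketched as a small predicate. The function name `client_accepts` and the data shapes are hypothetical: each FramePacking element is modeled as a (scheme URI, value) pair, and the client's capabilities as a mapping from scheme URI to the set of values it recognizes.

```python
def client_accepts(frame_packings, supported):
    """Decide whether a client may keep the described Representations.

    frame_packings: list of (scheme_id_uri, value) pairs, one per
        FramePacking element on the Adaptation Set or Representation.
    supported: dict mapping a scheme URI to the set of value strings
        the client recognizes for that scheme.
    """
    if not frame_packings:
        # No FramePacking element: frame packing is not used.
        return True
    # Keep the Representations if at least one element is recognized;
    # ignore them when every scheme or value is unrecognized.
    return any(value in supported.get(scheme, ()) for scheme, value in frame_packings)
```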
[00099] For Adaptation Sets or Representations that contain a video component that conforms to ISO/IEC Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding (ISO/IEC 14496-10:2012), a uniform resource name (URN) for FramePacking@schemeIdUri may be urn:mpeg:dash:14496:10:frame_packing_arrangement_type:2011, which may be defined to indicate the frame-packing arrangement as defined by Table D-8 of ISO/IEC 14496-10:2012 (‘Definition of frame_packing_arrangement_type’) to be contained in the FramePacking element. The @value may be the ‘Value’ column as specified in Table D-8 of ISO/IEC 14496-10:2012 and may be interpreted according to the ‘Interpretation’ column in the same table.
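A FramePacking descriptor of this kind might be built as sketched below. The sketch omits the MPD XML namespace for brevity, and the function name `frame_packing_element` is hypothetical. The `ARRANGEMENTS` labels mirror the integer codes listed earlier in this description; the authoritative interpretation remains Table D-8 of ISO/IEC 14496-10.

```python
import xml.etree.ElementTree as ET

SCHEME = "urn:mpeg:dash:14496:10:frame_packing_arrangement_type:2011"

# Labels follow the integer codes listed earlier in this description.
ARRANGEMENTS = {
    0: "checkerboard",
    1: "vertical interleaving",
    2: "horizontal interleaving",
    3: "side-by-side",
    4: "top-bottom",
    5: "temporal interleaving",
}


def frame_packing_element(value):
    """Build a FramePacking descriptor element for the given arrangement code."""
    if value not in ARRANGEMENTS:
        raise ValueError(f"unsupported frame packing value: {value!r}")
    return ET.Element("FramePacking", {"schemeIdUri": SCHEME, "value": str(value)})


def add_to_adaptation_set(adaptation_set, value):
    """Append a FramePacking descriptor to an AdaptationSet element and return it."""
    element = frame_packing_element(value)
    adaptation_set.append(element)
    return element
```

A caller would typically create an `AdaptationSet` element, call `add_to_adaptation_set(adaptation_set, 3)` for side-by-side content, and serialize the result with `ET.tostring`.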
[000100] Figure 5 illustrates a method 500 of signaling 3-D video device capabilities in accordance with some embodiments. Method 500 may be performed by components of a UE, e.g., UE 104. In some embodiments, the UE may include and/or have access to one or more computer-readable media having instructions stored thereon, that, when executed, cause the UE, or components thereof, to perform the method 500.
[000101] At 504, the UE may determine device capability information. As described above, the device capability information may include information as to the decoding and rendering capabilities of a media player. In some embodiments, a content management module, located on the UE or elsewhere, may determine this information by running one or more scripts on the UE to directly test the capabilities. In other embodiments, the content management module may access one or more stored files that contain the relevant information.
[000102] At 508, the UE may provide device capability information to the media server 116 or device profile server 144, including stereoscopic 3-D video decoding and rendering capabilities of the media player at the UE. As described above, the device capability information may include one or more format type attributes that represent a list of frame types supported by a streaming application of the UE. In some embodiments, the device capability information may be provided prior to or after the request at 512.
[000103] In some embodiments, some or all of the device capability information may be provided to the media server by another entity, e.g., a device profile server.
[000104] At 512, the UE may request 3-D video content. In some embodiments, the request may be in accordance with appropriate streaming/transport protocols, e.g., HTTP, RTP, RTSP, DASH, MBMS, PSS, IMS_PSS_MBMS, etc. The request may be directed to the media server and may include a uniform resource locator (URL) or some other indicator of the requested content or portions thereof. In some embodiments, a temporary adjustment to the device capability information (e.g., via ProfDiff signaling) may also be provided along with the request at 512. Accordingly, the UE may supplement the profile information retrieved from the device profile server with extra attributes or overrides for attributes already defined in its device capability profile, based on ProfDiff signaling. In one example, such a temporary adjustment may be triggered by user preferences, for example if the user would like to receive only two-dimensional (2-D) video for a particular session even though the terminal is capable of rendering 3-D video.
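The effect of a session-level adjustment on a stored profile may be sketched as a dictionary merge. This is an illustration under stated assumptions rather than the ProfDiff wire format: profiles are modeled as plain dictionaries, the function name `apply_profdiff` is hypothetical, the attribute `SessionPrefers2D` is invented for the example, and attributes without a listed resolution rule are treated as overridable.

```python
def apply_profdiff(base_profile, profdiff, resolution_rules):
    """Return a new profile with the session-level adjustment applied.

    resolution_rules maps an attribute name to "Append" or "Override".
    With "Append", list values from the adjustment are added after the
    stored values; otherwise the adjustment replaces the stored value.
    Attributes absent from the base profile are simply added.
    """
    merged = {k: list(v) if isinstance(v, list) else v for k, v in base_profile.items()}
    for name, value in profdiff.items():
        rule = resolution_rules.get(name, "Override")  # default is an assumption
        if rule == "Append" and isinstance(merged.get(name), list):
            merged[name] = merged[name] + list(value)
        else:
            merged[name] = value
    return merged
```

The base profile is copied rather than modified, which matches the temporary, per-session nature of the adjustment.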
[000106] At 516, the UE may receive the requested 3-D video content and render the content on a display of the UE. The rendering of the content may include a variety of processes such as, but not limited to, decoding, upconverting, unpacking, sequencing, etc.
[000107] Figure 6 illustrates a method 600 of signaling 3-D video content in accordance with some embodiments. Method 600 may be performed by components of a media server, e.g., media server 116. In some embodiments, the media server may include and/or have access to one or more computer-readable media having instructions stored thereon, that, when executed, cause the media server, or components thereof, to perform the method 600.
[000108] At 604, the media server may determine device capability information. In some embodiments, the media server may determine the device capability information by receiving, e.g., as part of session-level signaling, the information from the UE or a device profile server.
[000109] At 608, the media server may receive a request for 3-D video content. In some embodiments, the request may be in accordance with appropriate streaming/transport protocols, e.g., HTTP, RTP, RTSP, DASH, MBMS, PSS, IMS_PSS_MBMS, etc. The request may be from the UE and may include a uniform resource locator (URL) or some other indicator of the requested content or portions thereof. In some embodiments, the request received at 608 may occur simultaneously with determination of the device capability information at 604, before the determination, or after the determination. In some embodiments, a temporary adjustment to the device capability information (e.g., via ProfDiff signaling) may also be received along with the request at 608. Accordingly, the profile information that the media server retrieves from the device profile server may be supplemented with extra attributes or overrides for attributes already defined in the device capability profile, based on ProfDiff signaling.
[000110] At 612, the media server may generate session description and/or metadata files to establish a streaming session, for example an SDP file or a media presentation description (MPD), based on the device capability information, accounting for the stereoscopic 3-D video decoding and rendering capabilities of the media player at the UE.
[000111] At 616, the media server may encode the 3-D video content in a format type indicated as being supported by the UE in the device capability information. The 3-D video content may then be streamed to the mobile device.
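The server-side steps at 612 and 616 may be sketched as a filter over the available representations. The `Representation` record and the function `representations_for_ue` are hypothetical names for this illustration; a frame packing value of `None` is used here, by assumption, to mark ordinary 2-D content that any client can render.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Representation:
    """Hypothetical stand-in for one DASH representation on the server."""
    rep_id: str
    frame_packing: Optional[int]  # None means plain 2-D video


def representations_for_ue(all_reps: Iterable[Representation],
                           supported_formats: Iterable[int]) -> List[Representation]:
    """Keep only representations the UE reported it can decode and render.

    The result is what a server might describe in the MPD it generates
    for this UE; unsupported representations are left out.
    """
    supported = set(supported_formats)
    return [r for r in all_reps
            if r.frame_packing is None or r.frame_packing in supported]
```

For a UE that reported formats 3 and 4, a temporally interleaved representation (value 5) would be omitted while side-by-side and 2-D representations would remain.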
[000112] The components described herein, e.g., UE 104, media server 116, and/or device profile server 144, may be implemented into a system using any suitable hardware and/or software to configure as desired. Figure 7 illustrates, for one embodiment, an example system 700 comprising one or more processor(s) 704, system control logic 708 coupled with at least one of the processor(s) 704, system memory 712 coupled with system control logic 708, non-volatile memory (NVM)/storage 716 coupled with system control logic 708, a network interface 720 coupled with system control logic 708, and input/output (I/O) devices 732 coupled with system control logic 708.
[000113] The processor(s) 704 may include one or more single-core or multi-core processors. The processor(s) 704 may include any combination of general-purpose processors and dedicated processors (e.g., graphics processors, application processors, baseband processors, etc.).
[000114] System control logic 708 for one embodiment may include any suitable interface controllers to provide for any suitable interface to at least one of the processor(s) 704 and/or to any suitable device or component in communication with system control logic 708.
[000115] System control logic 708 for one embodiment may include one or more memory controller(s) to provide an interface to system memory 712. System memory 712 may be used to load and store data and/or instructions, e.g., logic 724. System memory 712 for one embodiment may include any suitable volatile memory, such as suitable dynamic random access memory (DRAM), for example.
[000116] NVM/storage 716 may include one or more tangible, non-transitory computer-readable media used to store data and/or instructions, e.g., logic 724. NVM/storage 716 may include any suitable non-volatile memory, such as flash memory, for example, and/or may include any suitable non-volatile storage device(s), such as one or more hard disk drive(s) (HDD(s)), one or more compact disk (CD) drive(s), and/or one or more digital versatile disk (DVD) drive(s), for example.
[000117] The NVM/storage 716 may include a storage resource physically part of a device on which the system 700 is installed or it may be accessible by, but not necessarily a part of, the device. For example, the NVM/storage 716 may be accessed over a network via the network interface 720 and/or over Input/Output (I/O) devices 732.
[000118] The logic 724, when executed by at least one of the processors 704, may cause the system to perform the operations described herein with respect to the UE 104, media server 116, and/or device profile server 144. The logic 724 may be disposed additionally/alternatively in other components of the system, e.g., in system control logic 708, and may include any combination of hardware, software, or firmware components.
[000119] Network interface 720 may have a transceiver 722 to provide a radio interface for system 700 to communicate over one or more network(s) and/or with any other suitable device. In various embodiments, the transceiver 722 may be integrated with other components of system 700. For example, the transceiver 722 may include a processor of the processor(s) 704, memory of the system memory 712, and NVM/Storage of NVM/Storage 716. Network interface 720 may include any suitable hardware and/or firmware. Network interface 720 may include a plurality of antennas to provide a multiple input, multiple output radio interface. Network interface 720 for one embodiment may include, for example, a wired network adapter, a wireless network adapter, a telephone modem, and/or a wireless modem.
[000120] For one embodiment, at least one of the processor(s) 704 may be packaged together with logic for one or more controller(s) of system control logic 708. For one embodiment, at least one of the processor(s) 704 may be packaged together with logic for one or more controllers of system control logic 708 to form a System in Package (SiP). For one embodiment, at least one of the processor(s) 704 may be integrated on the same die with logic for one or more controller(s) of system control logic 708. For one embodiment, at least one of the processor(s) 704 may be integrated on the same die with logic for one or more controller(s) of system control logic 708 to form a System on Chip (SoC).
[000121] In various embodiments, the I/O devices 732 may include user interfaces designed to enable user interaction with the system 700, peripheral component interfaces designed to enable peripheral component interaction with the system 700, and/or sensors designed to determine environmental conditions and/or location information related to the system 700.
[000122] In various embodiments, the user interfaces could include, but are not limited to, a display for rendering 3-D video (e.g., a liquid crystal display, a touch screen display, an auto-stereoscopic display, etc.), a speaker, a microphone, one or more cameras (e.g., a still camera and/or a video camera), a flashlight (e.g., a light emitting diode flash), and a keyboard.
[000123] In various embodiments, the peripheral component interfaces may include, but are not limited to, a non-volatile memory port, a universal serial bus (USB) port, an audio jack, and a power supply interface.
[000124] In various embodiments, the sensors may include, but are not limited to, a gyro sensor, an accelerometer, a proximity sensor, an ambient light sensor, and a positioning unit. The positioning unit may also be part of, or interact with, the network interface 720 to communicate with components of a positioning network, e.g., a global positioning system (GPS) satellite.
[000125] In various embodiments, the system 700 may be a mobile computing device such as, but not limited to, a laptop computing device, a tablet computing device, a netbook, a smartphone, etc. In various embodiments, system 700 may have more or fewer components, and/or different architectures.
[000126] Although certain embodiments have been illustrated and described herein for purposes of description, a wide variety of alternate and/or equivalent embodiments or implementations calculated to achieve the same purposes may be substituted for the embodiments shown and described without departing from the scope of the present disclosure. This application is intended to cover any adaptations or variations of the embodiments discussed herein. Therefore, it is manifestly intended that embodiments described herein be limited only by the claims and the equivalents thereof.

Claims (18)

  1. An apparatus to be employed by a user equipment (UE), the apparatus comprising: content management circuitry to: transmit, via a Long Term Evolution (LTE) wireless communication network, an identifier of a Third Generation Partnership (3GP)-Dynamic Adaptive Streaming over Hypertext Transfer Protocol (DASH) profile associated with the UE, the 3GP-DASH profile to indicate one or more restrictions associated with stereoscopic three-dimensional (3-D) video content supported by the UE; receive a media presentation description (MPD) that includes information associated with a first media presentation that complies with the 3GP-DASH profile and excludes information associated with one or more other media presentations that do not comply with the 3GP-DASH profile; and transmit a Hypertext Transfer Protocol (HTTP) GET or partial GET request for a DASH representation associated with the first media presentation; and a media player coupled to the content management circuitry, the media player to receive and render the DASH representation.
  2. The apparatus of claim 1, wherein the 3GP-DASH profile is a multiview stereoscopic 3D video profile to indicate that the UE supports multiview stereoscopic 3D video content that includes a base view and a non-base view that are temporally interleaved.
  3. The apparatus of claim 1, wherein the 3GP-DASH profile is a frame-packed stereoscopic 3D video profile to indicate that the UE supports frame-packed 3D video content that includes a base view and a non-base view packed in a same frame.
  4. The apparatus of claim 3, wherein the MPD includes a frame packing element to indicate a type of frame packing format used for the first media presentation.
  5. The apparatus of claim 4, wherein the frame packing element indicates that the type of frame packing format used is a vertical interleaving frame compatible packing format, a horizontal interleaving frame compatible packing format, a side-by-side frame compatible packing format, a top-bottom frame compatible packing format, or a checkerboard frame compatible packing format.
  6. The apparatus of claim 1, wherein the MPD includes one or more attributes associated with individual DASH representations of the first media presentation, wherein the individual representations include representations associated with different time periods of the media presentation.
  7. The apparatus of claim 1, wherein the UE is to receive the DASH representation via a multimedia broadcast and multicast service (MBMS).
  8. The apparatus of claim 1, further comprising an auto stereoscopic display coupled to the media player to display the rendered DASH representation.
  9. A media server, comprising: content management circuitry to: obtain a Third Generation Partnership (3GP)-Dynamic Adaptive Streaming over Hypertext Transfer Protocol (DASH) profile associated with a user equipment (UE) of a wireless communication network, wherein the 3GP-DASH profile is a multiview stereoscopic three-dimensional (3D) video profile to indicate that the UE supports multiview stereoscopic 3D video content that includes a base view and a non-base view that are temporally interleaved or a frame-packed stereoscopic 3D video profile to indicate that the UE supports frame-packed 3D video content that includes a base view and a non-base view packed in a same frame; generate, based on the obtained 3GP-DASH profile, a media presentation description (MPD) that includes one or more attributes associated with a first DASH representation of a media presentation that complies with the 3GP-DASH profile and excludes attributes associated with a second DASH representation of the media presentation that does not comply with the 3GP-DASH profile; and transmit the generated MPD to the UE;
    content delivery circuitry, coupled to the content management circuitry, to deliver 3D video content associated with the first DASH representation to the UE.
  10. The media server of claim 9, wherein the MPD includes a frame packing element to indicate a type of frame packing format used for the first DASH representation.
  11. The media server of claim 10, wherein the frame packing element indicates that the type of frame packing format used is a vertical interleaving frame compatible packing format, a horizontal interleaving frame compatible packing format, a side-by-side frame compatible packing format, a top-bottom frame compatible packing format, or a checkerboard frame compatible packing format.
  12. The media server of claim 9, wherein the content management circuitry is to receive an identifier from the UE, and wherein the content management circuitry is to obtain the 3GP-DASH profile from a device profile server based on the identifier.
  13. One or more non-transitory computer-readable media having instructions, stored thereon, that when executed cause a user equipment (UE) to: transmit, via a Long Term Evolution (LTE) wireless communication network, an identifier of a Third Generation Partnership (3GP)-Dynamic Adaptive Streaming over Hypertext Transfer Protocol (DASH) profile associated with the UE, the 3GP-DASH profile to indicate one or more restrictions associated with stereoscopic three-dimensional (3D) video content supported by the UE; receive a media presentation description (MPD) that includes one or more attributes associated with a first individual DASH representation of a media presentation that complies with the 3GP-DASH profile and excludes one or more attributes associated with a second individual DASH representation of the media presentation that does not comply with the 3GP-DASH profile; transmit a Hypertext Transfer Protocol (HTTP) GET or partial GET request for a DASH representation associated with the media presentation; obtain the DASH representation; and render the obtained DASH representation.
  14. The one or more media of claim 13, wherein the 3GP-DASH profile is a multiview stereoscopic 3D video profile to indicate that the UE supports multiview stereoscopic 3D video content that includes a base view and a non-base view that are temporally interleaved.
  15. The one or more media of claim 13, wherein the 3GP-DASH profile is a frame-packed stereoscopic 3D video profile to indicate that the UE supports frame-packed 3D video content that includes a base view and a non-base view packed in a same frame.
  16. The one or more media of claim 15, wherein the MPD includes a frame packing element to indicate a type of frame packing format used for the media presentation.
  17. The one or more media of claim 16, wherein the frame packing element indicates that the type of frame packing format used is a vertical interleaving frame compatible packing format, a horizontal interleaving frame compatible packing format, a side-by-side frame compatible packing format, a top-bottom frame compatible packing format, or a checkerboard frame compatible packing format.
  18. The one or more media of claim 13, wherein the MPD excludes information associated with one or more other media presentations that do not comply with the 3GP-DASH profile.
FI20225190A 2012-04-09 2013-04-03 Signaling three-dimensional video information in communication networks FI20225190A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201261621939P 2012-04-09 2012-04-09
US201261679627P 2012-08-03 2012-08-03
US201261626767P 2012-09-25 2012-09-25

Publications (1)

Publication Number Publication Date
FI20225190A1 true FI20225190A1 (en) 2022-03-03

Family

ID=85783731

Family Applications (1)

Application Number Title Priority Date Filing Date
FI20225190A FI20225190A1 (en) 2012-04-09 2013-04-03 Signaling three-dimensional video information in communication networks

Country Status (1)

Country Link
FI (1) FI20225190A1 (en)
