WO2021144139A1 - Method, apparatus and computer program product for signaling of viewport orientation timing in panoramic video delivery - Google Patents

Method, apparatus and computer program product for signaling of viewport orientation timing in panoramic video delivery

Info

Publication number
WO2021144139A1
WO2021144139A1 (PCT/EP2020/088035)
Authority
WO
WIPO (PCT)
Prior art keywords
viewport
stream
timestamp value
client device
motion
Prior art date
Application number
PCT/EP2020/088035
Other languages
English (en)
Inventor
Saba AHSAN
Igor Danilo Diego Curcio
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy
Publication of WO2021144139A1

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • H04N21/6587Control parameters, e.g. trick play commands, viewpoint selection
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • G02B27/0172Head mounted characterised by optical features
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/21805Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/242Synchronization processes, e.g. processing of PCR [Program Clock References]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/0138Head-up displays characterised by optical features comprising image capture systems, e.g. camera
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/014Head-up displays characterised by optical features comprising information/image processing systems
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0179Display position adjusting means not related to the information to be displayed
    • G02B2027/0187Display position adjusting means not related to the information to be displayed slaved to motion of at least a part of the body of the user, e.g. head, eye

Definitions

  • An example embodiment relates generally to immersive content consumption, and, more particularly, to techniques for signaling of viewport orientation timing in panoramic video delivery.
  • video content may be delivered via streaming, such as for virtual reality applications or other types of applications.
  • the video content that is captured and delivered may be expansive and may provide a panoramic view (e.g., a 180° view, 360° view, omnidirectional view, and/or the like).
  • omnidirectional video enables spherical viewing direction with support for head-mounted displays, providing an interactive and immersive experience for users.
  • users may have only a limited field of view at any one instant and may change their viewing direction, such as by rotating their head when wearing a head-mounted display (HMD) while continuing to view the panoramic video content.
  • the entire content of a panoramic video may be streamed to a player.
  • users of a virtual reality application generally have a limited field of view such that at any point in time the user views only a portion of the panoramic video content.
  • Viewport-dependent streaming is based upon the viewing direction of the user equipped with the HMD such that the video content located in the viewing direction is delivered at a high quality (e.g., comprising a high resolution, a high framerate, and/or the like), while all other video content is delivered at a lower quality.
  • new viewport orientation information may be sent to a streaming source using a feedback message.
  • the streaming source may then update the transmitted video stream based on the new viewport orientation information.
  • current viewport-dependent streaming methods have limitations, including, for example, inaccurate calculation of important quality parameters.

BRIEF SUMMARY
  • a method, apparatus, and computer program product are disclosed for providing for signaling of viewport orientation timing in panoramic video delivery.
  • the method, apparatus and computer program product are configured to receive a first stream based on a first viewport orientation and receive a second stream based on a second viewport orientation.
  • the method, apparatus and computer program product are further configured to determine a first timestamp value associated with a frame of the first stream, determine a second timestamp value associated with a frame of the second stream, and determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value which may then be signaled to a source device.
  • a source device may make more informed decisions on how to use available bandwidth. Further benefits include improved session monitoring, higher-level metrics in RTP Control Protocol (RTCP) reports, and lower overhead, as sending motion to high-quality delay values or, in some embodiments, only relevant timing information, provides more accurate information in an efficient manner. Additionally, aggregated and/or periodic transmission of motion to high-quality delay values may further reduce overhead.
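As one illustration of how such aggregated feedback might be carried, the sketch below packs an averaged motion to high-quality delay value into an RTCP APP-style payload. The four-character name "M2HQ" and the field layout are hypothetical assumptions; the application does not define a concrete packet format here.

```python
# Hypothetical RTCP APP-style packet carrying an aggregated motion to
# high-quality delay report (all field choices are illustrative).
import struct

def pack_m2hq_report(ssrc: int, avg_delay_ms: int, count: int) -> bytes:
    # First byte: version=2, padding=0, subtype=0 -> 0x80.
    # PT=204 is the RTCP APP packet type; the length field counts
    # 32-bit words minus one (5 words total here, so 4).
    header = struct.pack("!BBH", 0x80, 204, 4)
    body = struct.pack("!I4sII", ssrc, b"M2HQ", avg_delay_ms, count)
    return header + body

packet = pack_m2hq_report(ssrc=0x1234, avg_delay_ms=300, count=8)
print(len(packet))  # -> 20 bytes
```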
  • a method comprising receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the method further comprises detecting an event comprising a change in the first viewport orientation to a second viewport orientation.
  • the method further comprises in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation.
  • the method further comprises causing transmission of the feedback message to a source device.
  • the method further comprises receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation.
  • the method further comprises determining a first timestamp value associated with a corresponding frame of the first stream.
  • the method further comprises determining a second timestamp value associated with a corresponding frame of the second stream.
  • the method further comprises determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and causing transmission of the motion to high-quality delay value to the source device.
  • the method further comprises storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.
  • the method further comprises determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred.
  • the method further comprises determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device and causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.
  • the frame of the first stream comprises a last frame rendered at the client device in association with the first stream.
  • the frame of the second stream comprises a first frame rendered at the client device in association with the second stream.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
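The timestamp arithmetic in the method above can be sketched in a few lines. This is an illustrative reconstruction, not code from the application: the 90 kHz clock rate is the common RTP video clock rate (an assumption here), and all function and variable names are hypothetical.

```python
# Sketch of the client-side motion to high-quality (M2HQ) delay
# computation: the difference between the RTP timestamp of the last
# frame rendered from the first stream and that of the first frame
# rendered from the second (new-viewport) stream.

RTP_CLOCK_RATE_HZ = 90_000  # common RTP clock rate for video (assumed)

def motion_to_hq_delay_ms(first_ts: int, second_ts: int,
                          clock_rate: int = RTP_CLOCK_RATE_HZ) -> float:
    """Convert an RTP timestamp difference into milliseconds."""
    # RTP timestamps are 32-bit values and may wrap around zero.
    diff = (second_ts - first_ts) % (1 << 32)
    return diff * 1000.0 / clock_rate

# Storing values in a data structure allows aggregated and/or periodic
# reporting to the source device, reducing feedback overhead.
delays: list[float] = []
delays.append(motion_to_hq_delay_ms(first_ts=90_000, second_ts=117_000))
print(delays[0])  # 27_000 ticks at 90 kHz -> 300.0 ms
```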
  • an apparatus comprising processing circuitry and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to receive a first stream comprising panoramic video content based on a first viewport orientation, to detect an event comprising a change in the first viewport orientation to a second viewport orientation, to, in response to the event, generate a feedback message comprising one or more updated parameters of the second viewport orientation, to cause transmission of the feedback message to a source device, to receive, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation, to determine a first timestamp value associated with a frame of the first stream, and to determine a second timestamp value associated with a frame of the second stream.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and cause transmission of the motion to high-quality delay value to the source device.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further store the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further determine a viewport-delivered timestamp value based on a time at which the second stream is first received and cause transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.
  • the frame of the first stream comprises a last frame rendered in association with the first stream.
  • the frame of the second stream comprises a first frame rendered in association with the second stream.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • a computer program product comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to receive a first stream comprising panoramic video content based on a first viewport orientation.
  • the program code portions are further configured to, upon execution, detect an event comprising a change in the first viewport orientation to a second viewport orientation.
  • the program code portions are further configured to, upon execution, in response to the event, generate a feedback message comprising one or more updated parameters of the second viewport orientation.
  • the program code portions are further configured to, upon execution, cause transmission of the feedback message.
  • the program code portions are further configured to, upon execution, receive, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation.
  • the program code portions are further configured to, upon execution, determine a first timestamp value associated with a corresponding frame of the first stream.
  • the program code portions are further configured to, upon execution, determine a second timestamp value associated with a corresponding frame of the second stream.
  • the program code portions are further configured to, upon execution, determine, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and cause transmission of the motion to high-quality delay value.
  • the program code portions are further configured to, upon execution, store the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.
  • the program code portions are further configured to, upon execution, determine a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred.
  • the program code portions are further configured to, upon execution, determine a viewport-delivered timestamp value based on a time at which the second stream is first received and cause transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value.
  • the frame of the first stream comprises a last frame rendered in association with the first stream.
  • the frame of the second stream comprises a first frame rendered in association with the second stream.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • an apparatus comprising means for receiving, at a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the apparatus further comprises means for detecting an event comprising a change in the first viewport orientation to a second viewport orientation.
  • the apparatus further comprises means for in response to the event, generating a feedback message comprising one or more updated parameters of the second viewport orientation.
  • the apparatus further comprises means for causing transmission of the feedback message to a source device.
  • the apparatus further comprises means for receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation.
  • the apparatus further comprises means for determining a first timestamp value associated with a corresponding frame of the first stream.
  • the apparatus further comprises means for determining a second timestamp value associated with a corresponding frame of the second stream.
  • the apparatus further comprises means for determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value and causing transmission of the motion to high-quality delay value to the source device.
  • the apparatus further comprises means for storing the motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.
  • the apparatus further comprises means for determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation to the second viewport orientation occurred.
  • the apparatus further comprises means for determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device and causing transmission of one or more of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.
  • the frame of the first stream comprises a last frame rendered at the client device in association with the first stream.
  • the frame of the second stream comprises a first frame rendered at the client device in association with the second stream.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • a method comprising causing transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the method further comprises receiving a feedback message comprising one or more updated parameters of a second viewport orientation.
  • the method further comprises generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation.
  • the method further comprises causing transmission of the second stream to the client device.
  • the method further comprises, in response to the transmission of the second stream, receiving one or more determined parameters from the client device.
  • the method further comprises updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device.
  • the one or more determined parameters comprises a motion to high-quality delay value.
  • the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
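On the source side, the received parameters might be used as sketched below. This is a hypothetical illustration: the report fields mirror the four timestamp values named above, while the margin thresholds and all identifiers are assumptions, not anything specified in the application.

```python
# Hypothetical source-side handling of the client's reported timing
# parameters when deciding how to use available bandwidth.
from dataclasses import dataclass

@dataclass
class ViewportTimingReport:
    first_ts: int                 # RTP timestamp, last old-viewport frame
    second_ts: int                # RTP timestamp, first new-viewport frame
    viewport_change_ts: float     # wall clock: orientation change occurred
    viewport_delivered_ts: float  # wall clock: second stream first received

def choose_hq_margin_deg(m2hq_delay_ms: float) -> float:
    """Widen the high-quality region around the viewport when the
    reported delay is large, so that head motion is less likely to
    expose low-quality content before the updated stream arrives."""
    if m2hq_delay_ms > 250:
        return 30.0  # generous margin (degrees) for slow adaptation
    if m2hq_delay_ms > 100:
        return 15.0
    return 5.0

print(choose_hq_margin_deg(300.0))  # -> 30.0
```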
  • an apparatus comprising processing circuitry and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to cause transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further receive a feedback message comprising one or more updated parameters of a second viewport orientation.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to further generate, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation and cause transmission of the second stream to the client device.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to, in response to the transmission of the second stream, receive one or more determined parameters from the client device.
  • the at least one memory and computer program code are further configured, with the processing circuitry, to cause the apparatus at least to update the second stream based on the received one or more determined parameters and cause transmission of the updated second stream to the client device.
  • the one or more determined parameters comprises a motion to high-quality delay value.
  • the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • a computer program product comprising a non-transitory computer readable storage medium having program code portions stored thereon, the program code portions configured, upon execution, to cause transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the program code portions are further configured to, upon execution, receive a feedback message comprising one or more updated parameters of a second viewport orientation.
  • the program code portions are further configured to, upon execution, generate, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation.
  • the program code portions are further configured to, upon execution, cause transmission of the second stream to the client device.
  • the program code portions are further configured to, upon execution, in response to the transmission of the second stream, receive one or more determined parameters from the client device.
  • the program code portions are further configured to, upon execution, update the second stream based on the received one or more determined parameters and cause transmission of the updated second stream to the client device.
  • the one or more determined parameters comprises a motion to high-quality delay value.
  • the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • an apparatus comprising means for causing transmission of, to a client device, a first stream comprising panoramic video content based on a first viewport orientation.
  • the apparatus further comprises means for receiving a feedback message comprising one or more updated parameters of a second viewport orientation.
  • the apparatus further comprises means for generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation.
  • the apparatus further comprises means for causing transmission of the second stream to the client device.
  • the apparatus further comprises means for, in response to the transmission of the second stream, receiving one or more determined parameters from the client device.
  • the apparatus further comprises means for updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device.
  • the one or more determined parameters comprises a motion to high-quality delay value.
  • the one or more determined parameters comprises a first timestamp value, a second timestamp value, a viewport-change timestamp value, and a viewport-delivered timestamp value.
  • the first timestamp value and the second timestamp value are Real-time Transport Protocol (RTP) timestamp values.
  • FIG. 1 is a block diagram of a system including a source device and a client device configured to communicate via a network in accordance with an example embodiment.
  • FIG. 2 is a block diagram of an apparatus that may be specifically configured in accordance with an example embodiment of the present disclosure.
  • FIG. 3A illustrates a view of an example 360-degree video conference in accordance with some embodiments.
  • FIG. 3B illustrates a view of an example 360-degree video conference in accordance with some embodiments.
  • FIG. 4 is a signal diagram of an example data flow in accordance with an example embodiment.
  • FIG. 5A is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2, in accordance with an example embodiment.
  • FIG. 5B is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2, in accordance with an example embodiment.
  • FIG. 5C is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2, in accordance with an example embodiment.
  • FIG. 6 is a signal diagram of an example data flow in accordance with an example embodiment.
  • FIG. 7 is a flow chart illustrating the operations performed, such as by the apparatus of FIG. 2 when embodied by a source device in accordance with an example embodiment.
  • circuitry refers to (a) hardware-only circuit implementations (e.g., implementations in analog circuitry and/or digital circuitry); (b) combinations of circuits and computer program product(s) comprising software and/or firmware instructions stored on one or more computer readable memories that work together to cause an apparatus to perform one or more functions described herein; and (c) circuits, such as, for example, a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation even if the software or firmware is not physically present.
  • This definition of ‘circuitry’ applies to all uses of this term herein, including in any claims.
  • circuitry also includes an implementation comprising one or more processors and/or portion(s) thereof and accompanying software and/or firmware.
  • circuitry as used herein also includes, for example, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, other network device (such as a core network apparatus), field programmable gate array, and/or other computing device.
  • Referring now to FIG. 1, a block diagram of a system 100 is illustrated for providing for signaling of viewport orientation timing in panoramic video delivery, according to an example embodiment.
  • system 100 of FIG. 1 as well as the illustrations in other figures are each provided as an example of some embodiments and should not be construed to narrow the scope or spirit of the disclosure in any way.
  • the scope of the disclosure encompasses many potential embodiments in addition to those illustrated and described herein.
  • FIG. 1 illustrates one example of a configuration of a system providing for signaling of viewport orientation timing.
  • panoramic video content may refer to any type of immersive video content, such as 360-degree video, 180-degree video content, omnidirectional (e.g., spherical) video content, and/or the like.
  • the system 100 may include one or more first devices, e.g., client devices 104 (examples of which include but are not limited to a head-mounted display, camera, panoramic video device, virtual reality system, augmented reality system, video playback device, mobile phones, smartphones, tablets, and/or the like), and/or one or more second devices, e.g., source devices 102 (also known as a server, remote computing device, remote server, streaming server, and/or the like).
  • the disclosure describes an example embodiment comprising a client-server architecture
  • the disclosure is not limited to a client-server architecture, but is also applicable to other architectures, such as peer-to-peer, broadcast/multicast, distributed, multi-party, point-to-point or point-to-multipoint, etc.
  • the first device can encompass other types of computing and/or communication devices other than a client device 104 and the second device can encompass other types of computing and/or communication devices other than a source device 102.
  • the first device and the second device may be of the same type of device, such as mobile phones or tablets with 360-degree content display and/or streaming capabilities in a stand-alone mode or tethered with another virtual reality or augmented reality display device.
  • the system 100 may further comprise a network 106.
  • the network 106 may comprise one or more wireline networks, one or more wireless networks, or some combination thereof.
  • the network 106 may, for example, comprise a serving network (e.g., a serving cellular network) for one or more source devices 102.
  • the network 106 may comprise, in certain embodiments, one or more source devices 102 and/or one or more client devices 104.
  • the network 106 may comprise the Internet.
  • the network 106 may comprise a wired access link connecting one or more source devices 102 to the rest of the network 106 using, for example, Digital Subscriber Line (DSL), optical and/or coaxial cable technology.
  • the network 106 may comprise a public land mobile network (for example, a cellular network), such as a network that may be implemented by a network operator (for example, a cellular access provider).
  • the network 106 may operate in accordance with universal terrestrial radio access network (UTRAN) standards, evolved UTRAN (E-UTRAN) standards, current and future implementations of Third Generation Partnership Project (3GPP) long term evolution (LTE) (also referred to as LTE-A) standards, third generation (3G), fourth generation (4G), and fifth generation (5G) standards, current and future implementations of International Telecommunications Union (ITU) International Mobile Telecommunications Advanced (IMT-A) system standards, and/or the like.
  • 3GPP Third Generation Partnership Project
  • LTE long term evolution
  • 4G fourth generation
  • 5G fifth generation
  • ITU International Telecommunications Union
  • IMT-A International Mobile Telecommunications Advanced
  • one or more source devices 102 may be configured to connect directly with one or more client devices 104 via, for example, an air interface without routing communications via one or more elements of the network 106. Additionally, or alternatively, one or more of the source devices 102 may be configured to communicate with one or more of the client devices 104 over the network 106. In this regard, the client devices 104 may comprise one or more nodes of the network 106.
  • the system 100 may be configured according to an architecture for providing for panoramic video streaming.
  • the system 100 may be configured to provide for immersive, panoramic video streaming and techniques to support a wide variety of applications including virtual reality and augmented reality applications.
  • FIG. 2 One example of an apparatus 200 that may be configured to function as the source device 102 and/or client device 104 is depicted in FIG. 2.
  • the apparatus 200 includes, is associated with or is in communication with processing circuitry 22, a memory 24 and a communication interface 26.
  • the processing circuitry 22 may be in communication with the memory 24 via a bus for passing information among components of the apparatus.
  • the memory device may be non-transitory and may include, for example, one or more volatile and/or non-volatile memories.
  • the memory 24 may be an electronic storage device (e.g., a computer readable storage medium) comprising gates configured to store data (e.g., bits) that may be retrievable by a machine (e.g., a computing device like the processing circuitry).
  • the memory 24 may be configured to store information, data, content, applications, instructions, or the like for enabling the apparatus to carry out various functions in accordance with an example embodiment of the present disclosure.
  • the memory 24 could be configured to buffer input data for processing by the processing circuitry 22.
  • the memory device 24 could be configured to store instructions for execution by the processing circuitry 22.
  • the apparatus 200 may, in some embodiments, be embodied in various computing devices as described above.
  • the apparatus may be embodied as an integrated circuit, also denoted as a chip or chip set.
  • the apparatus may comprise one or more physical packages (e.g., a packaged integrated circuit and/or chips) including materials, components and/or wires on a structural assembly (e.g., a baseboard).
  • the structural assembly may provide physical strength, conservation of size, and/or limitation of electrical interaction for component circuitry included thereon.
  • the apparatus 200 may therefore, in some cases, be configured to implement an embodiment of the present invention on a single chip or as a single “system on a chip.”
  • a chip or chipset may constitute means for performing one or more operations for providing the functionalities described herein.
  • the processing circuitry 22 may be embodied in a number of different ways.
  • the processing circuitry may be embodied as one or more of various hardware processing means such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing element with or without an accompanying DSP, or various other circuitry including integrated circuits such as, for example, an ASIC (application specific integrated circuit), an FPGA (field programmable gate array), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like.
  • the processing circuitry 22 may include one or more processing cores configured to perform independently.
  • a multi-core processing circuitry may enable multiprocessing within a single physical package.
  • the processing circuitry 22 may include one or more processors configured in tandem via the bus to enable independent execution of instructions, pipelining and/or multithreading.
  • the processing circuitry 22 may be configured to execute instructions stored in the memory 24 or otherwise accessible to the processing circuitry.
  • the processing circuitry 22 may be configured to execute hard coded functionality.
  • the processing circuitry may represent an entity (e.g., physically embodied in circuitry) capable of performing operations according to an embodiment of the present disclosure while configured accordingly.
  • when the processing circuitry is embodied as an ASIC, FPGA or the like, the processing circuitry may be specifically configured hardware for conducting the operations described herein.
  • when the processing circuitry is embodied as an executor of instructions, the instructions may specifically configure the processor to perform the algorithms and/or operations described herein when the instructions are executed.
  • the processing circuitry 22 may be a general-purpose processor or a processor of a specific device (e.g., an image or video processing system) configured to employ an embodiment of the present invention by further configuration of the processing circuitry by instructions for performing the algorithms and/or operations described herein.
  • the processing circuitry 22 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processing circuitry.
  • ALU arithmetic logic unit
  • the communication interface 26 may be any means such as a device or circuitry embodied in either hardware or a combination of hardware and software that is configured to receive and/or transmit data, including media content in the form of video or image files, one or more audio tracks or the like.
  • the communication interface 26 may include, for example, an antenna (or multiple antennas) and supporting hardware and/or software for enabling communications with a wireless communication network.
  • the communication interface 26 may include the circuitry for interacting with the antenna(s) to cause transmission of signals via the antenna(s) or to handle receipt of signals received via the antenna(s).
  • the communication interface 26 may alternatively or also support wired communication.
  • the communication interface may include a communication modem and/or other hardware/software for supporting communication via cable, digital subscriber line (DSL), universal serial bus (USB) or other mechanisms.
  • signaling between the source device 102 and the client device 104 may be carried out over any protocol at any layer of the International Organization for Standardization (ISO) Open Systems Interconnection (OSI) protocol stack (e.g., Session Description Protocol (SDP), Session Initiation Protocol (SIP), Real Time Streaming Protocol (RTSP), Real-Time Transport Protocol (RTP), Real-Time Transport Control Protocol (RTCP), Moving Pictures Experts Group Dynamic Adaptive Streaming over Hypertext Transfer Protocol (HTTP) (MPEG DASH) and the like).
  • SDP Session Description Protocol
  • SIP Session Initiation Protocol
  • RTSP Real Time Streaming Protocol
  • RTP Real-Time Transport Protocol
  • RTCP Real-Time Transport Control Protocol
  • HTTP Hypertext Transfer Protocol
  • MPEG DASH Moving Pictures Experts Group Dynamic Adaptive Streaming over Hypertext Transfer Protocol
  • the 3rd Generation Partnership Project (3GPP) is a standards organization which develops protocols and is known for the development and maintenance of various standards.
  • 3GPP is presently developing virtual reality (VR) conferencing solutions using the Multimedia Telephony Service for Internet Protocol Multimedia Subsystems (IMS) (MTSI) and Telepresence standards, which enable multi-party video conferences over mobile networks.
  • a typical VR conference comprises one or more parties that are sending immersive content (e.g., via one or more source devices 102) to one or more parties equipped with devices capable of viewing the immersive content (e.g., a client device 104, such as an HMD and/or the like).
  • FIG. 3 A an example VR conference is depicted wherein a source device located in a conference room A is sending viewport-dependent 360-degree video to two remote participants B and C, each equipped with a client device: participant B viewing the 360-degree video via an HMD and participant C viewing the 360-degree video via a mobile phone.
  • Viewport information is sent from the remote participants (e.g., the client devices) to the 360-degree content source device in order to receive updated viewport- dependent video.
  • Viewport-dependent delivery is often utilized for VR content in order to deliver improved quality for video frames within the viewport (e.g., the video frames which the user is presently viewing) in comparison to video content outside of the viewport.
  • This is beneficial for bandwidth saving and optimized delivery.
  • a user receives (e.g., via streaming and/or downloading) and consumes panoramic content (referenced hereinafter as 360-degree content by way of example but not of limitation) on a client device 104 (e.g., a virtual reality device such as a head-mounted display) of FIG. 1, content within a viewport area is visible to the user. In other words, content that the user is viewing at any given time is viewport area content.
  • the entirety of the 360-degree content may be received or downloaded, with viewport area content being received or downloaded at a higher quality than content not currently being viewed by the user.
  • content not currently being viewed by a user may be referred to as a background area.
  • background area content may be received or downloaded in a lower quality than the viewport area content in order to preserve bandwidth.
  • a margin area (sometimes referred to as guard band(s)) may be extended around the viewport area.
  • the margin area of an example embodiment may comprise a left margin area, right margin area, top margin area, and/or bottom margin area.
  • the margin area may be received or downloaded with an intermediate quality between the viewport area quality (higher quality) and the background area quality (lower quality).
  • the margin area may also be received in the same quality as a viewport area (e.g., higher quality). Margin areas may be useful during rendering.
  • a margin area may be extended, fully or partially, around a viewport area to compensate for any deviation of the actual viewport at the time of viewing from the predicted viewport at the time of rendering.
  • margins may be symmetrical on opposite sides of the viewport, regardless of the state of head motion (e.g., a user turning their head while wearing a head-mounted display).
  • a margin area may be extended asymmetrically in accordance with a direction of a head motion and/or the speed of the head motion.
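The asymmetric-margin idea above can be sketched as follows. The function name, the velocity input, and the gain constant are illustrative assumptions for this sketch, not values taken from the disclosure:

```python
def asymmetric_margins(yaw_velocity_deg_s, base_margin_deg=10.0, gain=0.05):
    """Extend the margin area asymmetrically in the direction of head motion.

    A positive yaw velocity (head turning right) grows the leading (right)
    margin proportionally to the speed of the motion; the trailing margin
    stays at the base width. Gain and base width are hypothetical.
    """
    extra = gain * abs(yaw_velocity_deg_s)
    left = base_margin_deg + (extra if yaw_velocity_deg_s < 0 else 0.0)
    right = base_margin_deg + (extra if yaw_velocity_deg_s > 0 else 0.0)
    # top/bottom margins are left symmetric in this yaw-only sketch
    return {"left": left, "right": right,
            "top": base_margin_deg, "bottom": base_margin_deg}
```

For example, a rightward head turn at 40 deg/s yields a right margin of 12 degrees while the left margin stays at the 10-degree base.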
  • a sender e.g., source device 102
  • FIG. 3B depicts another example VR conference, wherein a source device located in a conference room A is sending viewport-independent panoramic video to two remote participants B and C, each equipped with a client device.
  • there may be an intermediary device such as a Media Resource Function (MRF) and/or a Media Control Unit (MCU) as shown in Figure 3B.
  • MRF Media Resource Function
  • MCU Media Control Unit
  • the MRF and/or MCU may receive the viewport-independent video from a video source and deliver viewport-dependent video to the remote participant(s).
  • viewport information may be sent from the remote participants (e.g., the client devices) to the MRF and/or MCU in order to receive updated viewport-dependent video.
  • the transport protocol used for media delivery in MTSI and Telepresence is the Real-Time Transport Protocol (RTP).
  • the Real-Time Transport Control Protocol (RTCP) may be used as the control protocol for sending control information during the session.
  • SIP Session Initiation Protocol
  • SDP Session Description Protocol
  • SIP and/or SDP may be used for session establishment, session management, and the exchange and, in some cases, negotiation of session parameters that influence the media session, including device capabilities, available media streams, media stream characteristics, and/or the like.
  • RTCP may be used for sending real-time control information such as Quality of Service (QoS) parameters from a receiving device (e.g., a client device 104) to the streaming source (e.g., source device 102).
  • QoS Quality of Service
  • QoS parameters may include parameters related to jitter, packets dropped, and/or the like.
  • Other real-time control parameters such as viewport information, may also be sent using RTCP feedback messages.
  • the source device 102 may use adaptive mechanisms during media delivery to optimize the session quality based on transport characteristics.
  • the area outside the viewport may be provided at a lower quality than the viewport area.
  • new viewport orientation information may be sent to the source device using a feedback message (e.g., an RTCP feedback message).
  • the source device may then update the transmitted video stream based on the new viewport orientation information received.
  • the sender e.g., source device
  • However, a delay of at least one Round Trip Time (RTT) is involved before the updated viewport area becomes visible to the user at the client device 104. The actual time taken may be longer depending on other parameters, such as processing delays at the sender and/or receiver, as shown in FIG. 4.
  • FIG. 4 depicts an example signal diagram of a sender of panoramic video content (e.g., a source device 102) transmitting a viewport-dependent video stream over RTP to a receiver (e.g., a client device 104). The viewport orientation is then changed at the receiver due to an event, such as head and/or body motion or a change in the gaze of the viewer.
  • RTT Round Trip Time
  • the receiver may be configured to then send an RTCP feedback message with the new viewport orientation information to the sender at the earliest possible time. In some embodiments, this time may be dependent on a timing rule associated with the RTCP protocol and/or the particular message.
  • the sender may then update the RTP media stream according to the new viewport orientation information. However, additional sender-side delays may be involved, such as delays incurred due to encoding the additional content and/or the like.
  • the receiver may then begin receiving the updated content and rendering the new viewport area at high quality.
  • Motion to high-quality delay may refer to the time taken after the viewport change event (e.g., the receiver’s head motion) for the new viewport to be updated to high quality and rendered at the client device 104.
  • Motion to high-quality delay is an important QoS parameter for not only 360-degree video conferencing but any panoramic video session in general, including VR streaming.
  • An RTP sender (e.g., source device 102) may adapt media delivery based on the motion to high-quality delay. This adaptation includes adapting viewport-dependent delivery, e.g., delivering viewports with high-quality margins when the motion to high-quality delay is large.
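One way a sender might act on a reported motion to high-quality delay is sketched below. The threshold values and quality-tier names are hypothetical, chosen only to illustrate the adaptation decision:

```python
def choose_margin_quality(motion_to_hq_ms, low_threshold_ms=150, high_threshold_ms=400):
    """Pick a margin-area quality tier from the reported motion-to-HQ delay.

    With a small delay the sender can keep margins at background quality;
    a large delay argues for high-quality margins so that head motion lands
    on good pixels while the viewport update is still in flight.
    Thresholds are illustrative assumptions.
    """
    if motion_to_hq_ms >= high_threshold_ms:
        return "high"          # margins match viewport quality
    if motion_to_hq_ms >= low_threshold_ms:
        return "intermediate"  # margins between viewport and background quality
    return "background"        # margins at background quality
```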
  • the RTP sender e.g., source device 102
  • the RTCP feedback message which carries real time control and feedback information during an RTP session, has a structured format with fields that follow a predefined specification to which both source devices and client devices must adhere.
  • RTP and RTCP traffic may not be treated the same way in a network (e.g., carried over channels with differing Quality of Service parameters), and since motion to high-quality delay is affected by the delays experienced by both types of messages and on both the sending and receiving sides, any RTT values calculated using only RTCP may not accurately represent and/or include the network delays involved in the RTCP reporting of the viewport change (e.g., head motion) and/or updated RTP delivery of the new viewport area.
  • an example embodiment is herein described providing for signaling of viewport orientation timing, such as a motion to high-quality delay value, in panoramic video delivery.
  • the motion to high-quality delay value may be representative of a total amount of time taken from when a motion (e.g., change in viewport orientation due to head and/or body motion) occurs at the client device 104 to when a first frame associated with an updated viewport orientation is rendered in high-quality (e.g., within a viewport area relative to a quality level of a background area) at the client device 104.
  • signaling may be used between a source device 102 and a client device 104.
  • FIGs. 5A, 5B, and 5C depict operations for two example embodiments. In both cases, viewport dependent 360-degree content is fetched from a source device 102 by a client device 104 over the network 106. The fetched content is played at the client device 104 using a VR display (e.g., HMD).
  • FIG. 5A illustrates a method 500 that may be performed by the apparatus 200 when embodied by a client device 104.
  • the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving a first stream comprising panoramic video content based on a first viewport orientation.
  • the stream may comprise video frames which comprise packets of data used for rendering the video frames.
  • the client device 104 may receive the panoramic video stream from the source device 102 and/or via the network 106.
  • a user may stream panoramic (e.g., 360-degree, omnidirectional, and/or the like) video content at a client device 104, such as a head-mounted display.
  • the panoramic video content may originate and be streamed from a source device 102, such as a server.
  • the source device 102 may be the original source of the panoramic video content.
  • the source device 102 may be one or more intermediary devices (e.g., an MRF and/or MCU as described above) that receive the panoramic video content directly or indirectly from an original source and relay the panoramic video content to one or more client devices 104.
  • the intermediary devices may further process the video content prior to transmission of the video content to a client device.
  • the panoramic video content may be received after a request by the client device 104 is provided to the source device 102.
  • the first viewport orientation, Vi, may comprise an orientation at which the client device 104 is presently positioned.
  • the panoramic video content that is received by the client device 104 may be received using viewport-dependent delivery and comprise viewport and non-viewport content.
  • viewport content may be defined as a portion of the panoramic video content, such as one or more tiles and/or video frames, which a user is currently viewing at the client device 104.
  • Non-viewport content may be a portion of the panoramic video content which is not being viewed by a user at the client device 104.
  • the panoramic video content based on a first viewport orientation received at the client device 104 may be viewport-dependent content comprising a viewport area (e.g., an area currently viewed by a user at the client device 104), a margin area as described above, and a background area (e.g., an area currently not being viewed by a user at the client device 104).
  • the client device 104 may receive panoramic video content for all of these areas (the viewport area, margin area, and background area) in varying quality levels.
  • the viewport area may be delivered at a higher quality than the background area and/or margin area.
  • non-viewport area content (e.g., a background area and/or margin area) may be handled differently than viewport area content in viewport-dependent delivery.
  • non-viewport area content may be downloaded at a lower quality than viewport content.
  • portions of the non-viewport content may not be downloaded at all, or be delivered at a higher quality (e.g., margin areas).
  • the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for detecting an event comprising a change in the first viewport orientation of the client device to a second viewport orientation.
  • the client device 104 such as via processing circuitry 22 and/or a component such as accelerometer and/or the like of the client device, may detect the change in viewport orientation by measuring a motion of the client device.
  • the client device 104 may comprise circuitry for determining motion of the client device (e.g., HMD) and generating data associated with an updated viewport orientation, such as a second viewport orientation, Vi+1, which may be provided by the circuitry to the client device 104 for processing.
  • the second viewport orientation may occur at a later time than the first viewport orientation.
  • the client device 104 may generate a feedback message.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, or the like, for generating a feedback message comprising one or more updated parameters of the second viewport orientation.
  • the feedback message may be generated in response to the detected event comprising the change in the first viewport orientation, Vi, to the second viewport orientation, Vi+1.
  • the feedback message may only be generated in an instance in which the degree of change in viewport orientation is above a predefined threshold.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, or the like, for determining whether to generate a feedback message based on a comparison of a degree of change of the first viewport orientation and second viewport orientation with a predefined threshold.
  • small motions (e.g., head or body motions) that fall below the predefined threshold may therefore not trigger generation of a feedback message.
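The threshold check described above can be sketched as follows, assuming viewport orientations are given as (yaw, pitch) pairs in degrees; the threshold value and the use of great-circle distance are illustrative assumptions:

```python
import math

def should_send_feedback(v1, v2, threshold_deg=5.0):
    """Decide whether a viewport change warrants a feedback message.

    v1 and v2 are (yaw, pitch) orientations in degrees. The great-circle
    angle between the two viewing directions is compared against a
    hypothetical threshold so that small head or body motions do not
    trigger feedback traffic.
    """
    yaw1, pitch1 = (math.radians(a) for a in v1)
    yaw2, pitch2 = (math.radians(a) for a in v2)
    # spherical angular distance between the two viewing directions
    cos_angle = (math.sin(pitch1) * math.sin(pitch2)
                 + math.cos(pitch1) * math.cos(pitch2) * math.cos(yaw1 - yaw2))
    angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angle_deg > threshold_deg
```

For example, a 10-degree yaw change exceeds a 5-degree threshold and triggers feedback, while a 2-degree change does not.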
  • the feedback message may comprise an RTCP feedback message. It is to be appreciated that while embodiments herein describe use of RTCP, other protocols may be used and/or employed as described above.
  • the feedback message may be generated to comprise one or more updated parameters of the second viewport orientation, Vi+1.
  • the client device may cause transmission of the feedback message, e.g., to a source device 102.
  • the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of the feedback message.
  • the client device 104 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving, in response to the transmission, a second stream comprising panoramic video content based on the second viewport orientation of the client device.
  • the client device 104 may receive the second stream from the source device 102 after transmission of the feedback message to the source device.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a first timestamp value associated with a corresponding frame of the first stream. For example, the client device may determine the RTP timestamp value of any of the packets belonging to the video frame which was rendered last at the client device 104 during the first stream based on the first viewport orientation, Vi, prior to the gaze change event (such as head or eye pupil motion); this value may be referred to herein as Vi_Highest_TS. As described above, the source device 102 may continue to send viewport-dependent content based on Vi until the source device receives the feedback message comprising a new viewport orientation Vi+1.
  • transmission of the feedback message from the client device 104 to the source device 102 comprising new viewport information may be carried over an RTP or an RTCP message, which may be transmitted at a different time compared to the message comprising Vi_Highest_TS.
  • the source device 102 may then update the viewport-dependent stream to use the new viewport orientation, Vi+1.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a second timestamp value associated with a corresponding frame of the second stream.
  • the RTP timestamp value of any of the packets belonging to the first rendered video frame using viewport-dependent delivery with the second viewport orientation, Vi+1, may be referred to herein as Vi+1_Lowest_TS.
  • Although FIG. 5A depicts operations performed in a particular numerical order, alternative orders of the operations are possible.
  • the determination of a first timestamp value associated with a frame of the first stream may occur prior to transmission of the feedback message to the source device 102 (e.g., step 504).
  • the client device 104 may continue to carry out operations depicted for example in FIG. 5B.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining, based at least on a difference between the first timestamp value and the second timestamp value, a motion to high-quality delay value.
  • the motion to high-quality delay value may herein be referred to as Motion_to_HQ and may be determined by the client device 104 using the following formula:
  • Motion_to_HQ = Vi+1_Lowest_TS - Vi_Highest_TS
  • Motion_to_HQ may be expressed and/or stored in RTP timestamp units.
  • the motion to high-quality delay value may be expressed in other units as well, including but not limited to seconds, milliseconds, and/or the like.
  • the motion to high-quality delay value may be normalized and/or expressed as a range without any units.
  • a zero (0) value may be assigned as the motion to high-quality delay value in an instance in which the viewport orientation updates so quickly that the first rendered frame after the event (e.g., head motion) is already at the new, updated viewport quality. Higher motion to high-quality delay values indicate proportionally larger delays and a correspondingly decreased user experience.
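The Motion_to_HQ computation can be illustrated with a short sketch. The 32-bit wraparound handling and the 90 kHz clock rate are assumptions about a typical RTP video session, not requirements stated in the disclosure:

```python
RTP_CLOCK_HZ = 90_000  # common RTP clock rate for video; an assumption here

def motion_to_hq(vi_highest_ts, vi1_lowest_ts):
    """Motion_to_HQ = Vi+1_Lowest_TS - Vi_Highest_TS, in RTP timestamp units.

    RTP timestamps are 32-bit and wrap around, so the difference is taken
    modulo 2**32. A result of 0 means the first frame rendered after the
    motion was already at the updated viewport quality.
    """
    return (vi1_lowest_ts - vi_highest_ts) % (1 << 32)

def to_milliseconds(delta_ts, clock_hz=RTP_CLOCK_HZ):
    """Optionally express the delay in milliseconds given the RTP clock rate."""
    return delta_ts * 1000.0 / clock_hz
```

The modulo keeps the difference correct even when the second timestamp falls just after a 32-bit wrap of the RTP clock.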
  • the client device 104 may optionally store the motion to high-quality delay value, e.g., in memory 24.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for storing the motion to high-quality delay value.
  • the client device 104 may store the determined motion to high-quality delay value in a data structure comprising a plurality of previously determined motion to high-quality delay values.
  • the motion to high-quality delay value may be sent as a single, unaltered value, representing only one viewport orientation change event resulting in a viewport orientation update.
  • the motion to high-quality delay value may also be sent as a result of a statistical function using the motion to high-quality delay value and/or one or more previously determined motion to high-quality delay values as inputs, e.g., as a representation of the maximum, average, minimum, median value, and/or the like.
  • This statistical aggregation may be for the entire streaming session, for multiple streaming sessions, or, in some embodiments, for a particular time window.
  • the client device 104 may store (e.g., in memory 24) an array, list, or other data structure for storing previously determined motion to high-quality delay values for the purpose of sending to the source device 102, regardless of whether statistical aggregation is performed.
  • the client device 104 may cause transmission of more than one motion to high- quality delay value to the source device 102, such as a list, array or other data structure that comprises multiple values together.
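The storage and statistical aggregation described above might look like the following sketch. The class name, window size, and the particular set of statistics are illustrative choices; the disclosure permits any of maximum, average, minimum, median, or a single raw value:

```python
from collections import deque
from statistics import median

class MotionToHQLog:
    """Keep recent motion-to-HQ values and report simple aggregates.

    A bounded deque approximates aggregation over "a particular time
    window"; aggregation over a whole session would simply use an
    unbounded list instead.
    """
    def __init__(self, window=32):
        self.values = deque(maxlen=window)

    def record(self, delay):
        """Store one newly determined motion-to-HQ delay value."""
        self.values.append(delay)

    def summary(self):
        """Return min/max/average/median over the stored values, or None."""
        vals = list(self.values)
        if not vals:
            return None
        return {
            "min": min(vals),
            "max": max(vals),
            "avg": sum(vals) / len(vals),
            "median": median(vals),
        }
```

The client could transmit either `summary()` or the raw contents of `values`, matching the single-value and multi-value reporting options described above.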
  • the client device 104 may cause transmission of the motion to high-quality delay value, e.g., to the source device 102.
  • the client device 104 includes means, such as the communication interface 26 and/or the like, for causing transmission of the motion to high-quality delay value.
  • the client device 104 may continue to carry out operations depicted for example in FIG. 5C.
  • the client device 104 may cause transmission of timing information to the source device 102 for processing.
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a viewport-change timestamp value based on a time at which the change in the first viewport orientation of the client device to the second viewport orientation occurred. For example, during an event in which a change in viewport orientation occurs at the client device 104, the client device may determine a viewport-change (VC) timestamp value (e.g., an RTP timestamp value) and optionally store the VC timestamp value (e.g., in memory 24).
  • VC viewport-change
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for determining a viewport-delivered timestamp value based on a time at which the second stream is first received at the client device. For example, after the second stream is received at the client device 104 as described above, the client device may determine a viewport-delivered (VD) timestamp value (e.g., an RTP timestamp value) and optionally store the VD timestamp value (e.g., in memory 24).
  • VD viewport-delivered
  • the client device 104 includes means, such as the processing circuitry 22, memory 24, and/or the like, for causing transmission of the first timestamp value, the second timestamp value, the viewport-change timestamp value, and the viewport-delivered timestamp value to the source device.
  • the values may all comprise RTP timestamp units or, in other embodiments, any other time unit and/or format, as long as the client device and source device are aware of the units and/or formats used.
  • the values (e.g., timestamp values) may comprise either absolute or relative times.
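A hypothetical serialization of the four timestamp values is sketched below. The disclosure does not mandate any particular wire format, only that the client device and source device agree on the units and on absolute versus relative times; the fixed 32-bit network-order layout here is purely illustrative:

```python
import struct

def pack_timing_report(vc_ts, vd_ts, vi_highest_ts, vi1_lowest_ts):
    """Serialize the four values as 32-bit unsigned ints in network byte order.

    Order: viewport-change TS, viewport-delivered TS, Vi_Highest_TS,
    Vi+1_Lowest_TS. All four are assumed to be RTP timestamp units.
    """
    return struct.pack("!IIII", vc_ts, vd_ts, vi_highest_ts, vi1_lowest_ts)

def unpack_timing_report(payload):
    """Parse a 16-byte timing report back into named fields."""
    vc_ts, vd_ts, vi_highest_ts, vi1_lowest_ts = struct.unpack("!IIII", payload)
    return {"viewport_change": vc_ts, "viewport_delivered": vd_ts,
            "vi_highest": vi_highest_ts, "vi1_lowest": vi1_lowest_ts}
```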
  • a strategy associated with the viewport-dependent delivery may change; for example, the viewport area’s level of quality may increase or decrease.
  • it is not necessary that the quality of the frames used to determine the VC timestamp value, VD timestamp value, Vi+1_Lowest_TS and Vi_Highest_TS be the same; however, the quality assigned to the viewport in the frames used to determine these timestamp values must be the same or comparable.
  • a source device 102 may temporarily change the strategy of the viewport-dependent delivery during head motion and the timing is associated with when the viewport quality after motion is equivalent or comparable to the viewport quality before motion.
  • FIG. 7 illustrates a method 700 that may be performed by the apparatus 200 when embodied by a source device 102.
  • the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of a first stream comprising panoramic video content based on a first viewport orientation of the client device.
  • the source device 102 may cause transmission of panoramic video content to one or more client devices 104.
  • the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving a feedback message comprising one or more updated parameters of a second viewport orientation.
  • the client device 104 may alter a viewport orientation (e.g., through head and/or body motion or the like) such that the client device 104 generates a feedback message comprising one or more updated parameters of a second viewport orientation and causes transmission of the feedback message to the source device 102.
  • the source device 102 may be configured to receive the feedback message.
  • the source device 102 includes means, such as the processing circuitry 22, memory 24, and/or the like, for generating, in response to the feedback message, a second stream comprising panoramic video content based on the second viewport orientation of the client device. For example, in some embodiments, the source device 102 may update the first stream and/or generate a new stream in accordance with the one or more updated parameters of the feedback message received from the client device 104.
  • the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for causing transmission of the second stream to the client device.
  • the client device 104 may receive the second stream.
  • the source device 102 includes means, such as the processing circuitry 22, communication interface 26, or the like, for receiving one or more determined parameters from the client device 104.
  • the one or more determined parameters may be received in response to transmission of the second stream.
  • the client device 104 may determine one or more parameters, such as timestamp values, a motion to high-quality delay value, and/or the like.
  • the one or more determined parameters may comprise a motion to high-quality delay value determined at the client device 104 (e.g., as described above with reference to FIG. 5B).
  • the one or more determined parameters may comprise a first timestamp value (e.g., an RTP timestamp value of any of the packets belonging to the video frame which was rendered last at the client device 104 during the first stream based on the first viewport orientation), a second timestamp value (e.g., an RTP timestamp value of any of the packets belonging to the first rendered video frame using viewport-dependent delivery with the second viewport orientation), a viewport-change timestamp value (e.g., a timestamp value based on a time at which the change from the first viewport orientation of the client device to the second viewport orientation occurred), and a viewport-delivered timestamp value (e.g., a timestamp value based on a time at which the second stream is first received at the client device), as described above.
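One plausible way to derive a motion to high-quality delay value from the reported timestamps is sketched below. This is an illustration only; the exact formula depends on the embodiment, and the 90 kHz clock rate is an assumption.

```python
# Illustrative computation of a motion to high-quality delay from reported
# timestamp values (all in the same RTP clock units). One plausible reading:
# the delay spans from the viewport-orientation change until the first frame
# rendered with viewport-dependent delivery at the new orientation.

def motion_to_high_quality_delay(vc_ts: int, vi1_lowest_ts: int,
                                 rtp_clock_hz: int = 90_000) -> float:
    """Return the delay in seconds between the viewport change (VC) and the
    first rendered frame of the new orientation (Vi+1_Lowest_TS)."""
    return (vi1_lowest_ts - vc_ts) / rtp_clock_hz
```

For example, a VC timestamp of 0 and a Vi+1_Lowest_TS of 45000 ticks would correspond to a half-second delay at 90 kHz.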
  • the source device 102 includes means, such as the processing circuitry 22, memory 24, or the like, for updating the second stream based on the received one or more determined parameters and causing transmission of the updated second stream to the client device 104.
  • the received one or more determined parameters (e.g., a motion to high-quality delay value and/or the timestamp values described above)
  • the source device 102 may adjust and/or alter a panoramic video stream to the client device 104 by not providing margin area extension.
  • a predefined threshold (e.g., a low numerical value)
  • the source device 102 may adjust and/or alter a panoramic video stream to the client device 104 by extending the margin area in order to improve user experience at the client device 104.
  • the source device 102 may be enabled to make more informed decisions and determinations, for example, on how to use available bandwidth.
  • the source device 102 may prioritize transmission of viewport areas with margin areas when bandwidth is available.
  • the source device 102 may reduce and/or eliminate extension of margin areas and instead, for example, upgrade a quality level of a viewport area.
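The source-side adaptation described in the bullets above might look like the following sketch. The threshold value and margin widths are invented for illustration; the disclosure only specifies a comparison against a predefined threshold.

```python
# Sketch of the source-side margin-adaptation policy. The threshold and
# margin sizes below are illustrative assumptions, not from the disclosure.

LOW_DELAY_THRESHOLD_S = 0.1  # assumed "predefined threshold (e.g., a low numerical value)"

def choose_margin_degrees(m2hq_delay_s: float, bandwidth_available: bool) -> int:
    """Return a margin extension (in degrees around the viewport) to deliver."""
    if m2hq_delay_s <= LOW_DELAY_THRESHOLD_S:
        # Delay is already low: skip margin extension and spend bandwidth on
        # upgrading the quality level of the viewport area instead.
        return 0
    if bandwidth_available:
        # High delay with spare bandwidth: extend margins to mask head motion.
        return 30
    # High delay, constrained bandwidth: deliver a modest margin.
    return 10
```

The design choice mirrors the bullets above: a reported delay below the threshold suggests margins are unnecessary, while a high delay makes margin extension worthwhile when bandwidth permits.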
  • a method, apparatus, and computer program product are disclosed for providing for signaling of viewport orientation timing, such as in panoramic video delivery.
  • user experience during immersive content consumption may be improved while avoiding unnecessary increases in latency and bandwidth consumption.
  • the determined values may be used directly by the source device for improved network adaptation.
  • the source device 102 may extend margins around the viewport during VR content delivery.
  • the margins may be delivered at the same or comparable quality as the viewport area in order to reduce quality degradation during head motion.
  • when the motion to high-quality delay is low, the margin extensions may be unnecessary, whereas when the motion to high-quality delay is high, extending the margins may significantly improve the user experience.
  • the source device 102 can make more informed decisions on how to use available bandwidth.
  • the source device may prioritize sending viewport area content with margins when bandwidth is available.
  • the source device 102 may reduce margin extensions and upgrade the quality of the viewport area instead.
  • RTCP messages provide low-level metrics, such as, for example, jitter and RTT.
  • Higher level metrics that directly impact user experience, such as the motion to high-quality delay value discussed herein, are more suited for use as indicators for bitrate adaptation as they incorporate buffering and application-level delays in addition to network delays.
  • the present disclosure additionally reduces overhead for source devices 102.
  • Conventional methods of using RTCP XR reports (e.g., as described in Internet Engineering Task Force (IETF) Request For Comments 3611)
  • the Packet Receipt Times Report Block may be used by a sender to obtain receipt times of RTP packets.
  • using the report block for calculating motion to high-quality delay at the source device 102 requires that the source device is aware of the sequence numbers associated with viewport orientation changes and, additionally, requires the source device 102 to keep track of these sequence numbers at least until the relevant reports are received.
  • packet receipt times are expressed in RTP timestamp units for a block of sequence numbers; this manner of expression creates significant signaling overhead.
  • sending motion to high-quality delay values, or even only the relevant timing information (e.g., the VC timestamp value, VD timestamp value, Vi+1_Lowest_TS, and Vi_Highest_TS), requires significantly less signaling overhead.
  • Aggregated and/or periodically received motion to high-quality delay values can further reduce the signaling overhead.
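A back-of-the-envelope comparison of the two signaling approaches is sketched below. The block sizes follow the RFC 3611 Packet Receipt Times Report Block layout (a 4-byte block header, 8 bytes of SSRC and sequence-number range, then 4 bytes per receipt time); the per-feedback value count is an assumption.

```python
# Rough overhead comparison: per-packet receipt times (RFC 3611 XR) versus
# reporting a handful of 32-bit timing values per viewport change.

def xr_receipt_times_bytes(num_packets: int) -> int:
    """Bytes for one Packet Receipt Times Report Block covering num_packets:
    4-byte block header + 4-byte SSRC + 4 bytes of begin/end sequence numbers
    + one 4-byte receipt time per packet."""
    return 12 + 4 * num_packets

def timing_feedback_bytes(num_values: int = 4) -> int:
    """Bytes for the timing values discussed above (e.g., VC, VD,
    Vi+1_Lowest_TS, Vi_Highest_TS) as 32-bit quantities."""
    return 4 * num_values
```

Even for a modest block of 100 packets, the receipt-times approach carries hundreds of bytes per report, whereas the four timing values fit in 16 bytes, which is the overhead reduction the bullets above refer to.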
  • the signaled values of the present disclosure enable improved session monitoring capabilities at the source device and other network elements.
  • the metrics collected using the signaled values may be used for live monitoring as well as to enable engineers to further optimize the system 100.
  • FIG. 5A, 5B, 5C, and 7 illustrate flowcharts depicting methods according to an example embodiment of the present invention. It will be understood that each block of the flowcharts and combination of blocks in the flowcharts may be implemented by various means, such as hardware, firmware, processor, circuitry, and/or other communication devices associated with execution of software including one or more computer program instructions. For example, one or more of the procedures described above may be embodied by computer program instructions. In this regard, the computer program instructions which embody the procedures described above may be stored by a memory device 24 of an apparatus employing an embodiment of the present invention and executed by a processor 22.
  • any such computer program instructions may be loaded onto a computer or other programmable apparatus (for example, hardware) to produce a machine, such that the resulting computer or other programmable apparatus implements the functions specified in the flowchart blocks.
  • These computer program instructions may also be stored in a computer-readable memory that may direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture the execution of which implements the function specified in the flowchart blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operations to be performed on the computer or other programmable apparatus to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide operations for implementing the functions specified in the flowchart blocks.
  • blocks of the flowcharts support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will also be understood that one or more blocks of the flowcharts, and combinations of blocks in the flowcharts, can be implemented by special purpose hardware-based computer systems which perform the specified functions, or combinations of special purpose hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Optics & Photonics (AREA)
  • Databases & Information Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Disclosed are a method, an apparatus, and a computer program product for signaling of viewport orientation timing, for example in panoramic video delivery. In the context of a method, the method comprises: receiving a first stream comprising panoramic video content based on a first viewport orientation; generating a feedback message comprising one or more updated parameters of a second viewport orientation and causing transmission of the feedback message; receiving a second stream comprising panoramic video content based on the second viewport orientation of the client device; and determining a motion to high-quality delay value based on timestamps associated with the first and second streams and causing transmission of the motion to high-quality delay value.
PCT/EP2020/088035 2020-01-14 2020-12-30 Method, apparatus and computer program product for signaling of viewport orientation timing in panoramic video delivery WO2021144139A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062961004P 2020-01-14 2020-01-14
US62/961,004 2020-01-14

Publications (1)

Publication Number Publication Date
WO2021144139A1 true WO2021144139A1 (fr) 2021-07-22

Family

ID=74141572

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/088035 WO2021144139A1 (fr) 2020-01-14 2020-12-30 Method, apparatus and computer program product for signaling of viewport orientation timing in panoramic video delivery

Country Status (1)

Country Link
WO (1) WO2021144139A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018049221A1 (fr) * 2016-09-09 2018-03-15 Vid Scale, Inc. Procédés et appareil de réduction de la latence pour une diffusion continue adaptative de fenêtre d'affichage à 360 degrés
WO2019120638A1 (fr) * 2017-12-22 2019-06-27 Huawei Technologies Co., Ltd. Fov+ échelonnable pour distribution de vidéo de réalité virtuelle (vr) à 360° à des utilisateurs finaux distants

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YONG HE (INTERDIGITAL) ET AL: "MPEG-I: VR Experience Metrics", no. m41120, 11 July 2017 (2017-07-11), XP030069463, Retrieved from the Internet <URL:http://phenix.int-evry.fr/mpeg/doc_end_user/documents/119_Torino/wg11/m41120-v1-m41120_VR_Experience_Metrics.zip m41120_VR_Experience_Metrics.doc> [retrieved on 20170711] *

Similar Documents

Publication Publication Date Title
JP7486527B2 Presentation of immersive media content and interactive 360° video communication
CN109891850B Method and apparatus for reducing latency of 360-degree viewport adaptive streaming
US9042444B2 (en) System and method for transmission of data signals over a wireless network
RU2497304C2 Dynamic modification of video properties
CN111147893B Video adaptation method, related device, and storage medium
US9282133B2 (en) Communicating control information within a real-time stream
US12108097B2 (en) Combining video streams in composite video stream with metadata
JP2015536594A Aggressive video frame dropping
KR102486847B1 Link-aware streaming adaptation
KR101863965B1 Apparatus and method for providing adaptive multimedia services
WO2022127605A1 Network switching method and apparatus
KR100982630B1 Device and process for adjusting the bitrate of a content stream, and related products
US10659190B1 (en) Optimizing delay-sensitive network-based communications with latency guidance
Bentaleb et al. Performance analysis of ACTE: A bandwidth prediction method for low-latency chunked streaming
KR20230002784 Method and server for transmitting audio and/or video content
CN109862400 Streaming media transmission method, apparatus, and system
US20180227231A1 (en) Identifying Network Conditions
US20220294555A1 (en) Optimizing delay-sensitive network-based communications with latency guidance
WO2021144139A1 Method, apparatus and computer program product for signaling of viewport orientation timing in panoramic video delivery
US10270832B1 (en) Method and system for modifying a media stream having a variable data rate
US9363574B1 (en) Video throttling based on individual client delay
EP3013013B1 Networking device and method for managing buffers in video streaming over a network
US20140244798A1 (en) TCP-Based Weighted Fair Video Delivery
CN109379127 Data processing method and apparatus
JP2011239232 Transmission device, transmission method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20838579

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20838579

Country of ref document: EP

Kind code of ref document: A1