WO2015197818A1 - Hevc-tiled video streaming - Google Patents

Hevc-tiled video streaming

Info

Publication number
WO2015197818A1
Authority
WO
WIPO (PCT)
Prior art keywords
hevc
spatial
video data
tiled
tiles
Prior art date
Application number
PCT/EP2015/064527
Other languages
French (fr)
Inventor
Emmanuel Thomas
Ray Van Brandenburg
Original Assignee
Koninklijke Kpn N.V.
Nederlandse Organisatie Voor Toegepast-Natuurwetenschappelijk Onderzoek Tno
Priority date
Filing date
Publication date
Application filed by Koninklijke Kpn N.V., Nederlandse Organisatie Voor Toegepast-Natuurwetenschappelijk Onderzoek Tno filed Critical Koninklijke Kpn N.V.
Priority to US15/318,619 priority Critical patent/US10694192B2/en
Priority to EP15734102.5A priority patent/EP3162075B1/en
Publication of WO2015197818A1 publication Critical patent/WO2015197818A1/en


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/858Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60General implementation details not specific to a particular type of compression
    • H03M7/6017Methods or arrangements to increase the throughput
    • H03M7/6023Parallelization
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/65Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/02Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/06Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/23439Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements for generating different versions
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334Recording operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/437Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/654Transmission by server directed to the client
    • H04N21/6547Transmission by server directed to the client comprising parameters, e.g. for client setup
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8451Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/85406Content authoring involving a specific file format, e.g. MP4 format

Definitions

  • the invention relates to HEVC-tiled video streaming and, in particular, though not exclusively, to a method of streaming HEVC-tiled video data to a client device, a client device for processing HEVC-tiled video data, a non-transitory computer-readable storage medium comprising a recording area for storing HEVC-tiled video data and data structures associated with HEVC-tiled video data, and a computer program product using such a method.
  • High-resolution panorama videos, however, allow a user (and/or director) a certain degree of interaction with the video the user (and/or director) is watching (directing) without having to manipulate the camera in a physical sense.
  • Using pan-tilt-zoom interaction, it is possible to extract from the high-resolution panorama video a sub-region of the video a user or director is interested in. This sub-region may be referred to as the region of interest (ROI).
  • A tiled streaming technique, with which the full video panorama is divided into multiple independently encoded videos, requires the client device, also referred to as the client, to have multiple decoders allowing it to reconstruct any part of the full video panorama, if necessary by stitching together a number of such independent videos.
  • WO2012/168365 describes content delivery systems, e.g. CDNs, for streaming spatially segmented content to clients.
  • the client should be able to synchronize the decoders and to stitch the decoded video tiles into the full video.
  • the client processes may become complex and resource intensive.
  • Another form of tiled streaming is known from the HEVC standard, which provides a very efficient encoding and decoding scheme for video data.
  • HEVC tiles were originally introduced in the HEVC standard for decoding of the video data using multi-core processors so that tiles in a HEVC- tiled video stream may be processed (encoded/decoded) in parallel.
  • HEVC-tiles may also be used for playout of only a subset of the HEVC tiles in the video frames of a HEVC-tiled stream.
  • the subset may e.g. relate to a region-of-interest (ROI) in the image area of the (raw) panorama video.
  • the HEVC tiles should be independently encoded so that the decoder is able to decode only a subset of the HEVC tiles.
  • the HEVC standard allows an HEVC encoder to be configured for restricting the spatial and temporal predictions in the video coding (e.g. motion vectors and in-loop filters) within the boundaries of one or more HEVC tiles.
  • When managing multiple independent HEVC tiles at the transport level, one could format the video data as a single HEVC-tiled stream. In that case, however, the video data of all HEVC tiles would have to be transmitted to the client, and tiles can only be manipulated at the decoder level.
  • the invention may relate to a method of selecting and/or streaming HEVC-tiled video data to a client device.
  • the method may comprise: providing said client device with a spatial manifest file comprising one or more spatial segment identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset, preferably a plurality, of HEVC tiles of a HEVC-tiled (panorama) video stream; and, selecting a spatial segment identifier of said spatial manifest file for requesting a delivery node to deliver at least part of the video data of a spatial segment as a HEVC-tiled video stream to the client device.
  • the method may include using said selected spatial segment identifier for sending a request, preferably an HTTP request, to said delivery node for delivering video data associated with said spatial segment to said client device.
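As an illustration of the selection and request steps above, the following Python sketch shows a client picking a spatial segment identifier out of a spatial manifest and building an HTTP request line for the corresponding delivery node. The manifest layout, field names and URLs are purely illustrative assumptions, not the format defined by the invention.

```python
# Hypothetical spatial manifest: each entry maps a spatial segment
# identifier to a delivery-node URL (illustrative values only).
SPATIAL_MANIFEST = {
    "segments": [
        {"id": "seg_top_left", "url": "http://delivery.example/seg_tl.mp4"},
        {"id": "seg_center", "url": "http://delivery.example/seg_c.mp4"},
    ]
}

def select_segment(manifest, segment_id):
    """Return the spatial segment entry whose identifier matches."""
    for segment in manifest["segments"]:
        if segment["id"] == segment_id:
            return segment
    raise KeyError(segment_id)

def build_request(segment):
    """Build the HTTP GET request line for the chosen spatial segment."""
    return "GET %s" % segment["url"]

# The client selects a segment identifier and requests its video data.
request = build_request(select_segment(SPATIAL_MANIFEST, "seg_center"))
```

In a real HAS client the request would of course be issued over HTTP and the response fed to the HEVC decoder; the sketch only shows the manifest lookup.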
  • the invention thus relates to a data structure defining a subset of HEVC tiles (i.e. one or more HEVC tiles) of a full set of HEVC tiles of a HEVC-tiled video (e.g. a HEVC-tiled panorama video).
  • the subset of HEVC-tiles may be referred to as a so-called spatial segment, wherein the spatial segment defines part of an image frame within the full image frames of the full HEVC-tiled video stream (e.g. a HEVC tiled panorama video).
  • the video data of a spatial segment may be stored as independently decodable video data in a file that can be accessed by the client device, e.g. an HAS-enabled client device (also referred to as a HAS client device or simply HAS client), using the spatial manifest file. Different spatial segments may be defined in the spatial manifest file, which is used by the client device to locate a delivery node (e.g. a media server) that can send the requested data to the client device.
  • the spatial manifest file comprising the spatial segments may be used by the client device in order to retrieve HEVC-tiled video data.
  • a user may select a region-of-interest (ROI), e.g. the centre of a rendered HEVC-tiled panorama video, via the user interface of the device, wherein the position of the ROI in the image area of the panorama video may at least partly coincide with the position of the spatial segment in the full image area of the panorama video.
  • the spatial manifest file comprising spatial segment instances allows requesting a set of HEVC tiles on the basis of a single request message, or at least a reduced number of request messages, e.g. HTTP request messages, and receiving the video data of the spatial segment in a single HEVC-tiled video stream that can be directly processed by the HEVC decoder.
  • the video data of a spatial segment is formatted as a HEVC-tiled video stream so that the data can be readily decoded by a single HEVC decoder.
  • the concept of tiles as described in this disclosure may be supported by different video codecs.
  • HEVC stands for High Efficiency Video Coding.
  • HEVC tiles may be created by an encoder that divides each video frame of a media stream into a number of rows and columns ("a grid of tiles") defining tiles of a predefined width and height expressed in units of coding tree blocks (CTB).
  • An HEVC bitstream may comprise decoder information for informing a decoder how the video frames should be divided in tiles.
  • the decoder information may inform the decoder on the tile division of the video frames in different ways.
  • the decoder information may comprise information on a uniform grid of n by m tiles, wherein the size of the tiles in the grid can be deduced on the basis of the width of the frames and the CTB size. Because of rounding inaccuracies, not all tiles may have the exact same size.
  • the decoder information may comprise explicit information on the widths and heights of the tiles (e.g. in terms of coding tree block units). This way video frames may be divided into tiles of different sizes. Only for the tiles of the last row and the last column may the size be derived from the remaining number of CTBs. Thereafter, a packetizer may packetize the raw HEVC bitstream into a suitable media container that is used by a transport protocol.
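For the uniform n-by-m grid case described above, the following sketch derives per-column tile widths (in CTBs) from the frame width and CTB size. Giving the whole rounding remainder to the last column is an illustrative assumption that mirrors the "last row and last column" remark, not the exact derivation mandated by the HEVC specification.

```python
def uniform_tile_widths(frame_width_px, num_columns, ctb_size=64):
    """Split a frame into num_columns tile columns measured in CTBs.

    All columns get the same CTB count except possibly the last one,
    which absorbs the rounding remainder (illustrative assumption)."""
    ctbs = -(-frame_width_px // ctb_size)      # ceil division: CTBs per row
    base = ctbs // num_columns
    widths = [base] * num_columns
    widths[-1] += ctbs - base * num_columns    # remainder goes to the last column
    return widths

# A 1920-pixel-wide frame is 30 CTBs wide at CTB size 64; four columns
# therefore cannot all have the exact same size.
widths = uniform_tile_widths(1920, 4, 64)
```

This is why, as noted above, not all tiles of a uniform grid necessarily have exactly the same size.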
  • Video codecs that allow encoding/decoding dependencies between tiles to be precisely defined, such that tiles within a spatial segment may contain encoding dependencies but tiles across the boundaries of spatial segments do not, may include the video codec VP9 of Google or - to some extent - MPEG-4 Part 10 AVC/H.264, the Advanced Video Coding (AVC) standard.
  • In VP9, coding dependencies are broken along vertical tile boundaries, which means that two tiles in the same tile row may be decoded at the same time.
  • In AVC/H.264, slices may be used to divide each frame into multiple rows, wherein each of these rows defines a tile in the sense that the media data is independently decodable.
  • The term "HEVC tile" is not restricted to tiles according to the HEVC standard, but generally defines a subregion of arbitrary shape and/or dimensions within the image region of the video frames, wherein the encoding process can be configured such that the media data within the boundaries of the tile, or, when more than one tile is comprised in a spatial segment, within the boundaries of such a segment, is independently decodable.
  • The term "segment" or "slice" may be used instead of the term "tile".
  • the invention is equally suitable for use with video codecs that are different from HEVC (e.g. VP9) or are (future) derivatives from HEVC, as long as these codecs have the characteristic that they are suitable for encoding a video, whereby different regions (sub areas) of images representing the video can be independently encoded within the boundaries of a spatial segment, in a single encoding process, and whereby the independently encoded regions can be decoded in a single decoding process.
  • the term independently refers to the notion that the coding is performed in a manner that no encoding dependencies exist between these regions across the boundaries of spatial segments.
  • video data of HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependencies on video data of HEVC tiles in said HEVC-tiled video stream that are not part of the spatial segment.
  • coding (decoding/encoding) constraints for a spatial segment may be summarized as follows:
  • HEVC tiles in a first spatial segment A may not have any coding dependencies on HEVC tiles in a second spatial segment B;
  • HEVC tiles in a first spatial segment A may have coding dependencies on other HEVC tiles in the spatial segment A, under the conditions that: a. a first HEVC tile 1 in a spatial segment A at a time instance Frame N may not have any coding dependencies on a second HEVC tile 2 in the spatial segment A at time instance Frame N; and, b. a HEVC tile 2 in spatial segment A at a time instance Frame N may have coding dependencies on HEVC tile 2 of spatial segment A at an earlier time instance (e.g. Frame N-1) or a later time instance (e.g. Frame N+1).
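The coding constraints summarized above can be sketched as a small predicate. The (segment, tile) data model is a hypothetical illustration; cross-tile temporal dependencies within one segment are treated as allowed here, consistent with the statement elsewhere in this disclosure that dependencies between tiles of the same spatial segment may exist.

```python
def dependency_allowed(src, ref, frame_offset):
    """Return True if tile `src` may have a coding dependency on tile `ref`.

    src/ref are (segment_id, tile_id) pairs; frame_offset is the temporal
    distance of the reference (0 = same frame, +/-k = other frames)."""
    src_seg, src_tile = src
    ref_seg, ref_tile = ref
    if src_seg != ref_seg:
        return False                  # no dependencies across spatial segments
    if frame_offset == 0:
        return src_tile == ref_tile   # no same-frame dependency on another tile
    return True                       # temporal dependency within the segment
```

For instance, tile 2 of segment A at Frame N may reference tile 2 of segment A at Frame N-1, but never any tile of segment B.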
  • the spatial segment may be defined by segment boundaries that coincide with the HEVC tile boundaries in a row and column direction of said HEVC-tiled video stream.
  • the segment boundaries may enclose a rectangular area comprising an integer number of HEVC tiles (a subset of HEVC tiles) that is smaller than the number of HEVC tiles in said HEVC-tiled video stream.
  • the integer number is preferably larger than 1.
  • the HEVC tiles of said rectangular area are preferably a plurality (i.e. a multiple).
  • the image area associated with a spatial segment may define a small part of the full image area of the full HEVC-tiled panorama video.
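The relationship between a rectangular spatial segment and the subset of HEVC tiles it encloses can be sketched as follows; the row-major tile indexing over a grid of `cols` columns is an assumption made purely for illustration.

```python
def tiles_in_segment(cols, col_range, row_range):
    """List tile indices (row-major) inside a rectangular spatial segment.

    cols is the number of tile columns in the full HEVC-tiled stream;
    col_range/row_range are (first, last) tile coordinates, inclusive."""
    c0, c1 = col_range
    r0, r1 = row_range
    return [r * cols + c
            for r in range(r0, r1 + 1)
            for c in range(c0, c1 + 1)]

# In a 4-column grid, a 2x2 segment at columns 1-2, rows 0-1 covers
# a subset of tiles smaller than the full set.
subset = tiles_in_segment(4, (1, 2), (0, 1))
```

The resulting subset contains an integer number of tiles and covers only a small part of the full image area, as described above.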
  • video data of at least part of said HEVC tiles in said spatial segment are decoded in parallel by a HEVC decoder.
  • the spatial segment may comprise multiple HEVC tiles
  • the tiles may be decoded in parallel by multiple processor cores.
  • video data of the HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependency.
  • the video data of each HEVC tile in the spatial segment can be decoded by the HEVC decoder without any information of the other HEVC tiles in the spatial segment.
  • video data of at least part of said HEVC tiles in said spatial segment have one or more spatial and/or temporal decoding dependencies.
  • the HEVC-tiled video stream is encoded such that, within the boundaries of a spatial segment, dependencies of video data between different HEVC tiles belonging to the same spatial segment are allowed and/or exist. In that case, the HEVC tiles in the spatial segment (i.e. a subset of HEVC tiles of a HEVC-tiled video stream) can be efficiently compressed without any further quality loss.
  • said spatial manifest file may further comprise one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment.
  • the HEVC tiles in a spatial segment may be individually accessible by a client device on the basis of the spatial manifest file.
  • said spatial manifest file may further comprise metadata associated with said selected spatial segment, wherein said metadata may include at least one of: information for determining that the selected spatial segment is related to HEVC-tiled video data; information for determining the number and/or size of HEVC-tiles in the selected spatial segment; information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream; and/or, information for determining whether the video data of a HEVC tile of said spatial segment have one or more temporal decoding dependencies on video data of other HEVC tiles in said spatial segment.
  • the spatial manifest file may comprise information (metadata) for the decoder so that the decoder can be initialized or configured before it receives HEVC-tiled video data that is requested by the client device on the basis of the spatial manifest file.
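The decoder-initialization metadata described above could, purely as an illustration, look as follows; every field name and value is a hypothetical assumption and does not reflect an actual manifest syntax.

```python
# Hypothetical metadata record for one spatial segment, carrying the
# four kinds of information listed above: codec marker, tile count/size,
# segment position in the panorama, and dependency signalling.
segment_metadata = {
    "codec": "hevc-tiled",                      # marks HEVC-tiled video data
    "tile_grid": {"columns": 2, "rows": 2},     # number of HEVC tiles
    "tile_size": {"width": 960, "height": 540}, # size of each tile in pixels
    "segment_position": {"x": 1920, "y": 0},    # offset in the full image area
    "temporal_dependencies": True,              # deps between tiles in segment
}

def is_hevc_tiled(meta):
    """Check whether the metadata marks the segment as HEVC-tiled video."""
    return meta.get("codec") == "hevc-tiled"

def tile_count(meta):
    """Derive the number of HEVC tiles in the segment from the grid."""
    grid = meta["tile_grid"]
    return grid["columns"] * grid["rows"]
```

A client could use such a record to configure the HEVC decoder before the first media data of the segment arrives.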
  • video data associated with a spatial segment are stored in a separate track, wherein video data in said track may at least be partly accessible by said client device on the basis of spatial segment identifiers.
  • video data associated with a HEVC tile in a spatial segment are stored in a separate track, wherein video data in said track may at least be partly accessible by said client device on the basis of one or more HEVC tile identifiers.
  • the video data may be stored as separate tracks on a computer-readable storage medium.
  • Video data in a track may be linked with a spatial segment identifier or a HEVC tile identifier in order to allow the client device to request delivery of video data stored in one or more of the tracks to a device comprising a HEVC decoder for decoding the video data and rendering video content.
  • one or more spatial segments, preferably identifiable by spatial segment identifiers, spatially overlap.
  • This configuration of the spatial segments has the advantage that, for instance, a user interaction such as panning "off screen" may be performed in an improved manner.
  • video data related to an image area that is "off screen" at the moment it is requested may be retrieved. If such an "off screen" image area (which is not yet displayed) is partly comprised in the video data of the tracks with spatial segments that are being retrieved, enough time may be gained before partially overlapping spatial segments of another track are retrieved, such that the "off screen" panning action may be perceived as seamless.
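The seamless-panning idea above can be sketched with overlapping segment rectangles: the client keeps streaming the current spatial segment while the ROI still fits inside it, and only switches once the ROI drifts into the region covered solely by an overlapping neighbour. Rectangles are (x, y, width, height); all values are illustrative assumptions.

```python
def contains(seg, roi):
    """True if rectangle `seg` fully covers rectangle `roi`."""
    sx, sy, sw, sh = seg
    rx, ry, rw, rh = roi
    return (sx <= rx and sy <= ry and
            rx + rw <= sx + sw and ry + rh <= sy + sh)

def choose_segment(segments, roi, current):
    """Stay on `current` while it still covers the ROI; otherwise pick the
    first overlapping segment that does, gaining time for a seamless switch."""
    if contains(segments[current], roi):
        return current
    for name, rect in segments.items():
        if contains(rect, roi):
            return name
    return current  # no segment fully covers the ROI; keep the current one

# Two horizontally overlapping spatial segments of a 1920x720 panorama.
SEGMENTS = {"left": (0, 0, 1280, 720), "right": (640, 0, 1280, 720)}
```

Because the two segments overlap between x = 640 and x = 1280, an ROI panning rightwards is already partly covered by the "right" track before the switch happens.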
  • the invention may relate to a client device, wherein said client device may be configured for: parsing a spatial manifest file comprising one or more spatial segment identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled video stream; and, using a spatial segment identifier of said spatial manifest file for requesting from a delivery node the delivery of video data of a spatial segment.
  • a HEVC decoder may be used for decoding video data of said spatial segment that are requested by said client device on the basis of said spatial manifest file.
  • the invention may relate to a non-transitory computer-readable storage medium comprising a recording area for storing video data, wherein said recording area may comprise: video data associated with a spatial segment, said spatial segment comprising a subset of HEVC tiles of a HEVC-tiled video stream, the video data of said spatial segment being accessible on the basis of a spatial segment identifier.
  • video data of said one or more spatial segments are accessible on the basis of one or more spatial segment tile identifiers.
  • video data of said one or more spatial segments are stored as separate video tracks in said recording area.
  • said recording area may further comprise at least one base track comprising one or more extractors, wherein an extractor is pointing to a video track.
  • the invention may relate to a non-transitory computer-readable storage medium comprising a stored data structure, preferably a spatial manifest file for use by a device, preferably a client device, or for use in the methods as described above, wherein said data structure may comprise: one or more spatial segment identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled video stream.
  • said data structure may comprise one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment.
  • said data structure may further comprise metadata associated with said selected spatial segment, wherein said metadata may include at least one of: information for determining that the selected spatial segment is related to HEVC-tiled video data; information for determining the number and/or size of HEVC tiles in the selected spatial segment; information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream; and, information for determining whether the video data of a HEVC tile of said spatial segment have one or more spatial decoding dependencies on other HEVC tiles in said spatial segment.
• the invention may relate to a video tiling system configured for: receiving video data, preferably wide field of view (panorama) video data; encoding said video data into HEVC-tiled video comprising one or more spatial segments, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled panorama video stream; generating a spatial manifest file associated with said HEVC-tiled video data, said spatial manifest file comprising one or more spatial segment identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers, preferably (part of) one or more URLs, to a client device, preferably said spatial manifest file further comprising information for determining the position of at least part of said one or more spatial segments and/or the position of HEVC tiles in a spatial segment within the tiled image area of said HEVC-tiled video stream.
• the invention may also relate to a computer program product comprising software code portions configured for, when run in the memory of a computer, executing the method steps according to any of the above claims.
  • Fig. 1 depicts the concept of a spatial HEVC segment according to an embodiment of the invention.
  • Fig. 2 depicts a data structure for HEVC-tiled video data according to an embodiment of the invention.
  • Fig. 3 depicts a schematic of a temporally segmented HEVC-tiled stream comprising spatial HEVC segments according to an embodiment of the invention.
  • Fig. 4 schematically depicts a spatial manifest file for use by a HAS client device according to an embodiment of the invention.
  • Fig. 5A and 5B schematically depict a spatial manifest file comprising spatial segments according to an embodiment of the invention.
  • Fig. 6A and 6B schematically depict a spatial manifest file comprising spatial segments according to an embodiment of the invention.
• Fig. 7 depicts a client device configured for rendering HEVC-tiled video data on the basis of a spatial manifest file according to an embodiment of the invention.
• Fig. 8 schematically depicts a flow diagram of streaming HEVC-tiled video data on the basis of a spatial manifest file according to an embodiment of the invention.
• Fig. 9 depicts a schematic of a process for generating a HEVC-tiled stream or file comprising one or more spatial segments according to an embodiment of the invention.
• Fig. 10 is a block diagram illustrating an exemplary data processing system that may be used in systems and methods as described with reference to Fig. 1-9.

Detailed description
• Fig. 1A and 1B depict schematics of a HEVC-tiled video stream according to various embodiments of the invention.
• a video stream, e.g. a high-definition (HD) or ultra high-definition (UHD) wide field-of-view or panorama video stream, may be encoded on the basis of the HEVC video compression standard.
• in HEVC, a video image is partitioned into so-called coding tree units (CTUs), which are the basic processing units used in the HEVC standard for the encoding and decoding process.
• the HEVC encoder may be configured to divide video frames 100 in the HEVC stream into so-called HEVC tiles 102, wherein a HEVC tile is a rectangular area defined by a particular grouping of CTUs.
  • the HEVC tiles may divide the image area of the panorama video into multiple adjacent rectangular regions (which may be of different size), wherein the boundaries of the HEVC tiles are defined by HEVC tile boundaries in the column and row direction 104,106.
  • a HEVC stream comprising HEVC tiles may be referred to as a HEVC-tiled video stream.
  • HEVC tiles were originally introduced in the HEVC standard for encoding and decoding of the video data using multi-core processors so that tiles in a HEVC-tiled stream may be processed (encoded and decoded) in parallel.
  • HEVC-tiles may also be used for playout of only a subset of the HEVC tiles in the video frames of a HEVC-tiled stream.
  • the subset may e.g. relate to a region-of-interest (ROI) in the image area of the (raw) panorama video.
  • the HEVC tiles should be independently encoded over time so that the decoder is able to decode only a subset of the HEVC tiles over multiple frames.
  • the HEVC standard allows an HEVC encoder to be configured for restricting the temporal predictions in the video coding (e.g. motion vectors and in-loop filters) within the boundaries of one or more HEVC tiles.
• when managing multiple independent HEVC tiles at transport level, one could format the video data as a single HEVC-tiled stream. In that case, however, the video data of all HEVC tiles would have to be transmitted to the client device, and tiles can only be manipulated at decoder level.
• such a scheme would introduce a large number of HTTP requests in order to request all temporal segments of the desired set of HEVC tiles.
• a set of HEVC tiles may be grouped into a so-called spatial segment. For example, Fig. 1A depicts a spatial segment 108₁ comprising a subset of HEVC tiles (in this example 8 HEVC tiles) from the full set of HEVC tiles building the HEVC-tiled video frame of a panorama video (in this example 24 HEVC tiles).
• Fig. 1B depicts an example of multiple spatial segments 108₂ (in this case four spatial segments) each comprising one or more HEVC tiles (in this example six) from the full set of HEVC tiles.
• while the spatial segments in Fig. 1B are of equal size, multiple spatial segments of different size (a different number of HEVC tiles) are also envisaged.
• the boundaries of a spatial segment coincide with the HEVC tile column and row boundaries 104,106 so that it encloses an integer number (one or more), preferably a plurality, of HEVC tiles.
  • the spatial segment defines a sub-region in the image frame comprising multiple HEVC tiles.
• the spatial segment thus defines an image area that is larger than the image area associated with an individual HEVC tile and smaller than the image area of the full panorama video.
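• as an illustration (not part of the patent text), the mapping from a spatial segment to the HEVC tiles it encloses can be sketched in a few lines of Python; the function name and the uniform tile grid are assumptions made here:

```python
def tiles_in_segment(grid_cols, seg_col0, seg_row0, seg_cols, seg_rows):
    """Return the raster-scan indices of the HEVC tiles enclosed by a
    spatial segment whose boundaries coincide with tile column/row
    boundaries; (seg_col0, seg_row0) is the segment's top-left tile."""
    return [
        (seg_row0 + r) * grid_cols + (seg_col0 + c)
        for r in range(seg_rows)
        for c in range(seg_cols)
    ]

# Fig. 1A-style example: a 6x4 tile grid (24 tiles) with a segment of
# 4x2 tiles (8 tiles) starting at the top-left corner of the grid.
segment_tiles = tiles_in_segment(6, 0, 0, 4, 2)
```

For the example above, `segment_tiles` contains the tile indices 0-3 and 6-9, i.e. an integer number of whole tiles, as required.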
• the data format associated with a spatial segment may be defined such that a spatial segment can be accessed by the client device at transport level by requesting the spatial segment from a media server and receiving the video data in the spatial segment as an independent HEVC stream (a spatial segment stream).
  • one or more HEVC tiles in a spatial segment may be configured to have one or more (temporal) decoding dependencies in previous and/or future frames and no (temporal) decoding dependencies between tiles within the spatial segment and tiles outside the spatial segment. In that case, the video data may be efficiently compressed without any further loss of quality.
  • the HEVC tiles in a spatial segment may be configured to have one or more (temporal) decoding dependencies in previous and/or future frames and no decoding dependency between tiles within the spatial segment and tiles outside the spatial segment.
• the rendering of the HEVC tiles in the spatial segment may be controlled at decoder level.
  • the position of HEVC tiles and the position of one or more spatial segments in the full image region may be determined by tile position information and segment position information respectively.
• the position information may be defined on the basis of a coordinate system associated with the full image region.
  • Tile position information may comprise coordinates of tile regions within the image region of said source video. This way, every HEVC tile may be related to a tile region in the image region of the HEVC video stream.
  • the full image region of the HEVC stream may be reconstructed by the HEVC decoder.
  • a coordinate system that is used for defining the tile position information may also be used for defining the position of a spatial segment.
  • a Cartesian coordinate system may be used.
  • curvilinear coordinate systems may be used, e.g. cylindrical, spherical or polar coordinate system.
  • a spatial segment defines HEVC-tiled video data comprising a subset of HEVC tiles from the complete set of HEVC tiles of a HEVC-tiled wide field of view video (e.g. a panorama video).
  • a spatial segment may be defined using the concept of a so-called motion-constrained tile set.
  • the information defining such tile set may be defined as a SEI message in the MPEG stream.
• the motion-constrained tile set is defined as follows: temporal_motion_constrained_tile_sets( payloadSize ) { ... }, with the following parameters:
• each_tile_one_tile_set_flag equal to 1 means that each tile forms a tile set (a spatial segment) of its own;
  • num_sets_in_message_minus1 defines the number of tile sets (the number of spatial segments);
  • mcts_id[i] gives an arbitrary id for the i-th tile set
  • num_tile_rects_in_set_minus1 [i] defines the number of HEVC tiles in the motion-constrained tile set
• top_left_tile_index[i][j] and bottom_right_tile_index[i][j] define the top left and bottom right indexes of the tiles in the tile set.
• the mc_all_tiles_exact_sample_value_match_flag parameter is set to 1.
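• purely as an illustration of the listing above, the SEI fields can be mirrored in a plain data structure; the class names and the example values (one tile set spanning tile indexes 0 to 9) are hypothetical and not taken from the standard:

```python
from dataclasses import dataclass, field

@dataclass
class TileRect:
    # one rectangle per num_tile_rects_in_set_minus1[i] + 1
    top_left_tile_index: int
    bottom_right_tile_index: int

@dataclass
class MotionConstrainedTileSet:
    mcts_id: int          # arbitrary id for the i-th tile set
    tile_rects: list

@dataclass
class TemporalMctsSei:
    each_tile_one_tile_set_flag: int = 0
    mc_all_tiles_exact_sample_value_match_flag: int = 1
    tile_sets: list = field(default_factory=list)

    @property
    def num_sets_in_message_minus1(self):
        # the SEI message encodes the number of tile sets minus one
        return len(self.tile_sets) - 1

# one spatial segment defined as a single motion-constrained tile set
sei = TemporalMctsSei(tile_sets=[
    MotionConstrainedTileSet(mcts_id=0, tile_rects=[TileRect(0, 9)]),
])
```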
• the HEVC standard thus allows defining sets of tiles within the bitstream.
• the spatial segment data structure allows a client device to access and retrieve these tile sets at transport level (e.g. MPEG DASH level).
• Fig. 2 depicts an example of a data structure 200 of an HEVC-tiled video file or stream, in this particular example an MPEG-4 file 202, comprising one or more spatial segments.
• the video file or stream may comprise one or more (video) tracks 206₁₋₄, which serve as a container for independently decodable video data associated with one or more spatial segments and, optionally, one or more HEVC tiles.
• a track may define a container comprising video data 210 wherein the spatial and temporal predictions for the video coding (e.g. motion vectors and in-loop filters) are within the boundaries of the spatial segment.
  • a track may further comprise position information 208.
  • the decoder may use the position information in order to determine the position of the spatial segment within the HEVC-tiled video image.
  • position information in a track may comprise an origin and size information in order to allow the decoder to position a spatial segment or a HEVC tile in a reference space wherein a position in the space may be determined by a coordinate system associated with the full image.
  • the data structure 200 may further comprise a so-called base track 204.
  • the base track may comprise information that determines the sequence of the tracks that need to be decoded by HEVC decoder.
  • the base track may comprise extractors 212, wherein an extractor defines a reference to one or more corresponding tracks.
  • the decoder may replace an extractor with audio and/or video data of a track it refers to.
• the HEVC decoder thus uses the information in the base track in order to generate, from the video data in the tracks, a coherent bitstream for decoding.
• when a track is not retrieved, the decoder may simply ignore its corresponding extractor. In that case, the absence of such a track may be interpreted by the decoder as "missing data". Since the video data in the tracks are independently decodable, the absence of data from one or more tracks does not prevent the decoder from decoding other tracks that can be retrieved.
  • an HEVC tile may be decoded independently from the other HEVC tiles, so that the absence of data from one or more tracks does not prevent the decoder from decoding other tracks that can be retrieved.
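• the extractor behaviour described above can be sketched as follows; `resolve_base_track` is a hypothetical helper, and tracks are simplified to lists of sample bytes:

```python
def resolve_base_track(extractors, tracks):
    """Concatenate track samples in the order given by the base track's
    extractors. An extractor pointing at a track that was not retrieved
    is simply ignored ('missing data'), which is safe because each
    track is independently decodable."""
    bitstream = []
    for track_ref in extractors:   # each extractor references one track
        samples = tracks.get(track_ref)
        if samples is None:
            continue               # track not retrieved: skip extractor
        bitstream.extend(samples)
    return bitstream
```

A client that only retrieved one of two spatial-segment tracks would still obtain a decodable bitstream containing that track's samples.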
  • the base track may comprise video data associated with the full image region of the source video, e.g. a panorama video.
  • the video may be selected in a quality such that it can be transported in the HEVC stream without taking up too much bandwidth.
  • the data format depicted in Fig. 2 may be used for storing spatial segments and HEVC tiles as independent files such that a client device may request delivery of these files.
  • the streams depicted in Fig. 1 and 2 may be delivered to a client device (also simply referred to as a client throughout this application) for playout using an adaptive streaming protocol such as an HTTP adaptive streaming (HAS) protocol.
• HTTP adaptive streaming protocols include Apple HTTP Live Streaming [http://tools.ietf.org/html/draft-pantos-http-live-streaming-13], Microsoft Smooth Streaming [http://www.iis.net/download/SmoothStreaming], Adobe HTTP Dynamic Streaming [http://www.adobe.com/products/httpdynamicstreaming], 3GPP-DASH [TS 26.247 Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP] and MPEG Dynamic Adaptive Streaming over HTTP [MPEG DASH ISO/IEC 23001-6].
  • HTTP allows an efficient, firewall-friendly and scalable scheme for delivering tile streams (and segments) to clients.
  • the spatially divided, independently decodable video data (i.e. the video data of the spatial segments) may be temporally divided in so-called temporal segments of a predetermined time period as shown in Fig. 3.
  • Fig. 3 depicts a schematic of a temporally segmented HEVC-tiled stream comprising spatial segments according to an embodiment of the invention.
• in Fig. 3, the video frames are divided into a plurality of spatial segments 302₁₋₄ (in this particular example 4 spatial segments), wherein each spatial segment comprises a plurality of HEVC tiles 304.
• the video data associated with each spatial segment may be temporally divided into temporal segments 308₁,₂.
• a temporal segment may start with a media unit, e.g. an I-frame, that has no coding dependencies on other frames in the temporal segment or other temporal segments so that the decoder can directly start decoding video data in the spatial segment.
• the video data of a spatial segment 302₁ may not have any decoding dependency on other spatial segments 302₂₋₄ of the same video frame or earlier video frames in the same temporal segment or earlier temporal segments.
  • the video data in a temporal segment may start with a frame that can be decoded without the need of other frames. This way, a client may receive a spatial segment of a spatial segment stream and start decoding the video data of the first video frame in the spatial segment without the need of other video data.
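• this playout condition can be sketched with a toy frame model (a frame type plus the in-segment indexes of the frames it references; the representation is ours, not part of any standard):

```python
def is_independently_decodable(segment_frames):
    """A temporal segment can be decoded on its own only if its first
    frame is a random-access point (e.g. an I-frame) and every frame
    references only frames inside the same temporal segment."""
    if not segment_frames or segment_frames[0]["type"] != "I":
        return False
    n = len(segment_frames)
    # every referenced frame index must fall inside this segment
    return all(
        0 <= ref < n
        for frame in segment_frames
        for ref in frame.get("refs", [])
    )
```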
  • video data associated with each spatial segment may be delivered as separate HEVC-tiled streams to the client.
  • video data associated with two or more spatial segments may be delivered in one HEVC-tile stream to the client.
  • an HAS streaming protocol is used for delivering video data to an HAS client (which is a client device configured for processing video data delivered on the basis of HTTP Adaptive Streaming)
  • a HEVC- tiled stream may be further divided in temporal segments.
  • tile constraints for a spatial segment may be summarized as follows: 1. HEVC tiles in a first spatial segment A may not have any coding dependencies on HEVC tiles in a second spatial segment B;
• 2. HEVC tiles in a first spatial segment A may have coding dependencies on other tiles in the spatial segment A, under the condition that:
• a. a first HEVC tile 1 in a spatial segment A at a time instance Frame N may not have any coding dependencies on a second HEVC tile 2 in the spatial segment A at time instance Frame N;
• b. a HEVC tile 2 in spatial segment A at a time instance Frame N may have coding dependencies on HEVC tile 2 of spatial segment A at an earlier time instance (e.g. Frame N-1) or a later time instance (e.g. Frame N+1).
• the latter condition ensures that encoding and decoding processes can be parallelized between different CPU cores.
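• the constraints above can be captured in a small predicate; representing a tile reference as a (segment, tile, frame) triple is a simplification introduced here:

```python
def dependency_allowed(src, dst):
    """src and dst are (segment, tile, frame) triples; return True if a
    coding dependency of src on dst respects the tile constraints."""
    s_seg, s_tile, s_frame = src
    d_seg, d_tile, d_frame = dst
    if s_seg != d_seg:
        return False   # rule 1: no dependencies on another spatial segment
    if s_frame == d_frame and s_tile != d_tile:
        return False   # rule 2a: no cross-tile dependencies within a frame
    return True        # rule 2b: temporal dependencies inside the segment
```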
  • the HAS client may be provided with a so-called spatial manifest file (SMF) in order to inform the HAS client about the spatial and temporal relation of the spatial segments in the HEVC-tiled stream.
• an SMF may comprise stream identifiers (e.g. (part of) an URL), which a client may use in order to locate and access one or more delivery nodes, e.g. one or more media servers or a content delivery network (CDN), which are capable of delivering the temporally segmented video data associated with one or more spatial segments on the basis of a HAS protocol to a HAS client.
  • the client device may parse the manifest file and use the information in the manifest file to request the desired (temporal and spatial) segments in order to render the video data.
• the user interface may be configured to allow a user to interact with a displayed imaging region and select e.g. a region of interest (ROI) that at least partly coincides with a predefined spatial segment.
  • the user interface may generate an instruction for the client device to request HEVC-tiled video data of the spatial segment and render the video data of the spatial segment on the screen.
• a user may move and/or expand the ROI and - in response - an appropriate spatial segment within that tile representation may be selected in order to render a video image that at least partly coincides with the ROI.
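• selecting the spatial segments that at least partly coincide with a ROI reduces to a rectangle-intersection test; the sketch below assumes segment positions (x, y, w, h) as they would be read from the SMF, with hypothetical segment identifiers:

```python
def segments_for_roi(roi, segments):
    """Return the identifiers of the spatial segments whose image area
    overlaps the ROI rectangle (x, y, w, h); all coordinates share the
    coordinate system of the full image region."""
    rx, ry, rw, rh = roi
    hits = []
    for seg_id, (x, y, w, h) in segments.items():
        # two axis-aligned rectangles overlap iff they overlap per axis
        if rx < x + w and x < rx + rw and ry < y + h and y < ry + rh:
            hits.append(seg_id)
    return hits
```

A ROI straddling the boundary between two spatial segments would yield both identifiers, so the client would request both segment streams.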
  • Fig. 4 schematically depicts a spatial manifest file for a HAS client device according to an embodiment of the invention.
• the spatial manifest file may define one or more hierarchical data levels.
• the first data level 402 may relate to a Spatial Composition defining one or more Spatial Representations 404.
  • the Spatial Representation may form a second data level.
  • the source video may be formed on the basis of one or more high-resolution and, often, wide field-of-view HD or even UHD video streams or files.
  • a Spatial Composition may comprise different Spatial Representations 404
• the video frames of the source file may be encoded into a HEVC-tiled video file or stream comprising one or more spatial segments.
  • the Spatial Representation may comprise metadata.
• the metadata in the Segment representation 404₂ may comprise video resolution information 416.
  • a Spatial Representation may comprise one or more Spatial Segments 410 as described in detail with reference to Fig. 1 -3.
• a Spatial Segment may define one or more HEVC tiles 406₁₋₄.
  • a Spatial Segment may comprise metadata, e.g. segment position information 412 defining the position of a spatial segment in the HEVC-tiled video image.
  • the spatial segment instance may comprise a segment identifier 414, e.g. an URL, which may be used for retrieving video data associated with a Spatial Segment.
• the HEVC tiles in a Spatial Segment may be defined by HEVC tile instances 406₁₋₄.
• a HEVC tile instance may comprise a tile identifier 418₁,₂ for identifying a HEVC tile in the video data of a Spatial Segment. Further, in an embodiment, a HEVC tile instance may comprise tile position information (e.g. tile coordinates) 422₁,₂ defining the position of a HEVC tile in video frames of the HEVC-tiled stream.
  • the segment position information and the tile position information in the SMF may be generally referred to as position information.
  • the coordinates used for defining the position of the spatial segment or an HEVC tile may be based on an absolute or a relative coordinate system and used by the HEVC decoder to spatially position the HEVC tiles into a seamless video image for display.
  • Fig. 5A and 5B schematically depict a spatial manifest file for streaming HEVC-tiled video data to a device according to an embodiment of the invention.
  • Fig. 5A and 5B depict an example of an MPEG-DASH MPD defining a HEVC-tiled video stream comprising spatial segments.
  • the MPD may comprise different MPD video elements 502,504,506 which are associated with an identifier, e.g. (part of) an URL or URI.
• the DASH-enabled client device is also referred to as a DASH client.
  • the first MPD video element 502 may be associated with at least one HEVC-tiled panorama video (a wide field-of-view video defined by the URI "full_panorama_2_4K.mp4") comprising 2x4 HEVC tiles.
• the second and third MPD video elements may define spatial segments within the tiled image area of the HEVC-tiled panorama video.
• the second MPD video element 504 may be associated with a first spatial segment defined by a first spatial segment identifier (the URI "full_panorma-left.mp4"). This first spatial segment may comprise 4 HEVC tiles (2 by 2) and may be associated with a first (left) part of the HEVC-tiled panorama video.
• the third MPD video element 506 may be associated with a second spatial segment defined by a second spatial segment identifier (the URI "full_panorma-right.mp4").
  • This second spatial segment may comprise 4 HEVC tiles (2 by 2) and may be associated with a second (right) part of the HEVC- tiled panorama video.
  • the spatial relationship between the MPD video elements is defined on the basis of position information, which will be described hereunder in more detail.
  • An MPD video element may be defined as an "AdaptationSet” attribute comprising one or more representations (different versions of the same or associated content wherein the difference may be defined by one or more encoding parameters).
  • a DASH client may use the information in the MPD to request video data associated with a MPD video element from the network. Furthermore, a DASH client may use information (metadata) in the MPD to configure the HEVC decoder so that it may start decoding the HEVC-tiled video data as soon as the video data are received.
  • the information (metadata) for configuring the HEVC decoder may include the spatial relationship between the MPD video elements.
  • the MPD author may include position information in the MPD.
• the position information may be defined by one or more spatial relationship descriptors (SRDs) 508, 510₁₋₅, 512₁₋₅.
  • An SRD may be used in the EssentialProperty attribute (information that is required to be understood by the client when processing a descriptor) or a SupplementalProperty attribute (information that may be discarded by a client when processing a descriptor) in order to inform the client that a spatial relationship between the MPD video elements exist.
• the spatial relationship descriptor (schemeIdUri "urn:mpeg:dash:srd:2014") may be used as a data structure for formatting the position information.
• the position information may be defined on the basis of the @value attribute 509, 511₁₋₅, 513₁₋₅, which may comprise a sequence of parameters including but not limited to:
  • the source_id parameter 514 may define the set of MPD video elements (AdaptationSet or SubRepresentation) that have a spatial relationship with each other.
  • the position parameters 516 x,y,w,h may define the position of a MPD video element wherein the coordinates x,y define the origin of the image region of the MPD video element and w and h define the width and height of the image region.
  • the position parameters may be expressed in a given arbitrary unit, e.g. pixel units.
• the tuple W and H 518 defines the dimension of the reference space, expressed in the same arbitrary unit as x, y, w and h.
  • the spatial_set_id 520 allows grouping of MPD video elements in a coherent group. Such group of MPD video elements may be e.g. used as a resolution layer indicator.
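• a client parsing the @value attribute of such an SRD could proceed as sketched below; the helper name is ours and the example string is illustrative only:

```python
def parse_srd_value(value):
    """Split an srd:2014 @value string of the form
    'source_id, x, y, w, h, W, H, spatial_set_id'
    into a dict of named integer parameters."""
    names = ("source_id", "x", "y", "w", "h", "W", "H", "spatial_set_id")
    parts = [int(p.strip()) for p in value.split(",")]
    return dict(zip(names, parts))

# e.g. a tile covering the top-left quarter of a 3840x2160 reference space
srd = parse_srd_value("1, 0, 0, 1920, 1080, 3840, 2160, 3")
```

Two MPD video elements belong to the same spatial composition when their parsed `source_id` values match.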
• the source parameter "1" in the position information of the different MPD video elements indicates that the different MPD video elements are spatially related to each other.
  • the first MPD video element 502 may be defined as an AdaptationSet wherein the values x,y,w,h,W,H of the SRD are set to 0, indicating that this MPD video element defines a base track of an MPEG4 stream wherein the base track comprises "extractors" (pointers) to the video data in the tracks defined in the other MPD video elements (in a similar way as described with reference to Fig. 2).
• the second and third MPD video elements 504,506 may be defined as an AdaptationSet comprising a Representation 503 and one or more SubRepresentations 505₁ (i.e. parts composing this Representation which can be linked to the concept of tracks at the container level).
• a spatial segment may thus be defined at Representation level, comprising a set of one or more HEVC tiles (in this example four HEVC tiles) that are defined at SubRepresentation level.
• the SubRepresentations can also be selectively requested when the range of bytes delimiting each track within a SubSegment is accessible to the client.
  • a spatial segment may have a data format that is similar to the one depicted in Fig. 2.
  • Each spatial segment may be stored as a separate track in the MPEG stream.
  • the video data in a track may be encoded such that independent playout of the (temporal segments of) a spatial segment by the HEVC decoder is possible.
• each tile track may comprise HEVC encoded video data as defined by the encoder attribute "codecs" 522, which in this example refers to an "hvt1" type codec, wherein the "t" in "hvt1" refers to HEVC-tiled video data.
• each HEVC tile in the SubRepresentation may be associated with an SRD 510₂₋₅ comprising one or more position parameters 511₂₋₅ for defining the position of the HEVC tile.
• the client not only uses the information in the SMF to locate delivery nodes in the network that can deliver the desired video data to the client, but also uses metadata of the HEVC-tiled video streams defined in the SMF in order to allow a client to select a particular ROI (e.g. a spatial segment) and to configure the decoder before the HEVC-tiled video data are received by the client.
  • This metadata may include for example:
• - information for determining that the video data relate to HEVC-tiled video data, e.g. a codec attribute "hvt1";
  • - information for determining the number and/or size of HEVC-tiles in the selected spatial segment using e.g. the number of HEVC-tiles that are represented as a SubRepresentation and/or part of the position information associated with the SRDs;
• the SegmentBase indexRange 524₁,₂ may be used in order to define a byte range (in this example bytes 0 to 7632), which allows a client to link a temporal segment number with a particular range of bytes.
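• turning a SegmentBase indexRange value into the Range header of the corresponding HTTP request can be sketched as follows (the helper name is ours):

```python
def range_header(index_range):
    """Build the HTTP Range header a DASH client would use to fetch the
    segment index identified by a SegmentBase@indexRange value such as
    '0-7632' (an inclusive byte range)."""
    first, last = index_range.split("-")
    return {"Range": "bytes=%s-%s" % (first.strip(), last.strip())}
```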
  • Fig. 6A and 6B schematically depict a data structure, in particular a spatial manifest file, for streaming HEVC-tiled video data to a device, preferably a client device, according to another embodiment of the invention.
  • Fig. 6A and 6B depict an example of an MPEG-DASH MPD defining a HEVC-tiled video stream, comprising spatial segments.
  • the MPD may comprise a number of MPD video elements 602,604,606,608,610,612 wherein the spatial relationship between the different MPD video elements is described on the basis of position information in the MPD in a similar way as described above with reference to Fig. 5A and 5B.
• a first MPD video element 602 may define a first (low-resolution) panorama stream identified by a stream identifier 614₁, in this case the URI
  • the first MPD video element may be defined as an AdaptationSet wherein the values 618 x,y,w,h,W,H of the SRD 616 are used to describe the spatial position of the video data of the first panorama stream and its spatial relation with respect to the other MPD video elements.
  • the video data in this stream may be encoded as a conventional HEVC stream.
  • the other MPD video elements may form a group of MPD video elements defining a HEVC-tiled stream comprising one or more spatial segments.
  • the grouping of these MPD video elements may be realized on the basis of the spatial_set_id in the SRDs of these MPD video elements.
  • the spatial_set_id's of these video elements are set to "3"
  • the spatial_set_id of the first video element is set to "1 ".
  • the second MPD video element 604 may define a high-resolution HEVC-tiled panorama stream at Representation level, identified by a stream identifier 620, in this case the URI "panorama_8K-base.mp4" wherein its position is defined on the basis of an SRD 622.
• the values x,y,w,h,W,H of the SRD of the second MPD video element are set to 0, indicating that this MPD video element defines a base track of an MPEG4 stream wherein the base track comprises "extractors" (pointers) to the video data in the tracks defined in the other MPD video elements.
  • each track may comprise video data of a spatial segment comprising one or more HEVC tiles.
  • the other four MPD video elements 606,608,610,612 define four spatial segments that are also defined at Representation level.
• each spatial segment may be identified by a spatial segment identifier 624₁₋₄ and each spatial segment may be associated with an SRD 626₁₋₄ and parameter values 628₁₋₄ in order to describe the spatial position of the spatial segment and its spatial relation with respect to the other MPD video elements.
• the HEVC tiles in the spatial segment may be described on the basis of a number of SubRepresentations (as in Fig. 5A) within the respective Representation of a spatial segment. Hence, the number of SubRepresentations in the Representation of the spatial segment provides the number of HEVC tiles in a spatial segment.
  • a spatial segment may be stored as a separate track in the MPEG stream as described with reference to Fig. 2 above.
  • the video data in a track may be encoded such that independent playout of the (temporal segments of) a spatial segment by the decoder is possible.
• a spatial segment may comprise a number of HEVC tiles as shown by the encoder attribute "codecs", which refers to an "hvt1" type codec (the "t" in "hvt1" refers to tile). Further, a HEVC tile in the SubRepresentation may be associated with an SRD comprising one or more position parameters for defining the position of the HEVC tile and its position with respect to other MPD video elements defined in the MPD in a similar way as described with reference to Fig. 5A and 5B.
• a DASH client may request different streams on the basis of the information associated with the MPD video elements, e.g. a low-resolution panorama video stream, a HEVC-tiled high-resolution panorama video stream or one or more spatial segments comprising HEVC tiles. Further, on the basis of the MPDs as depicted in Fig. 5A and 5B and Fig. 6A and 6B, a DASH client may send metadata associated with a requested stream to the HEVC decoder in order to configure and initialize the decoder so that it is ready for decoding the video data as soon as the data are received by the client.
  • the SegmentBase indexRange may be used in order to define a byte range (in this example bytes 0 to 7632) which allows a client to link a temporal segment number with a particular range of bytes.
  • Fig. 7 depicts a client device for retrieving and processing HEVC-tiled video data according to one embodiment of the invention.
  • the client device 702 may comprise a user navigation function 704 for interpreting user interaction with the (tiled) content that is processed and rendered by a media player 706.
• the user navigation function may be connected to a user interface that may include a touch-screen, a camera, keyboard, mouse, trackball, joystick, microphone, head tracking sensor, eye and gaze tracking, buttons or any other man-machine interface that allows manipulation (e.g. panning, zooming and/or tilting) of the displayed content.
  • the client device may further comprise a manifest cache 708 for receiving and storing one or more manifest files from a content provider and/or a content source in the network (e.g. a media server or a CDN).
  • the cache may comprise one or more SMFs 710 wherein an SMF may comprise one or more spatial segments as described in detail with reference to Fig. 1 -6 above.
  • the manifest cache may be connected to a stream selector 712 that may parse the SMF and use the information in the SMF to select one or more streams and instruct a stream processor 714 to request the one or more selected streams from the network 716 on the basis of a suitable protocol, e.g. the HTTP protocol or the like.
  • the stream processor may send information 722 (metadata) on the selected stream to configure the HEVC decoder 720.
  • the HEVC decoder may be configured and initialized for processing the requested HEVC-tiled stream before the video data have been received.
  • the stream processor may use the SMF in order to request a desired spatial segment.
  • the stream processor may send requests for the base track and one or more tracks associated with one or more spatial segments.
  • the stream processor may relay the data of the base track and the video data associated with one or more spatial tracks to a buffer 718 before the data are decoded by an HEVC decoder 720 in the media player.
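The client-side pipeline of Fig. 7 (stream selector, stream processor, buffer, decoder) can be sketched as follows; all class and method names, and the flat SMF layout, are illustrative assumptions rather than elements of the specification:

```python
# Sketch of the client pipeline: the stream selector picks a spatial
# segment from the parsed SMF, and the stream processor queues the base
# track and the segment's media data in a buffer ahead of the decoder.

from collections import deque

class StreamSelector:
    def __init__(self, smf: dict):
        self.smf = smf                       # parsed spatial manifest file

    def select(self, segment_id: str) -> dict:
        # pick a spatial segment entry (identifier + metadata) from the SMF
        return next(s for s in self.smf["spatial_segments"]
                    if s["id"] == segment_id)

class StreamProcessor:
    def __init__(self):
        self.buffer = deque()                # buffer 718 in Fig. 7

    def request(self, segment: dict) -> None:
        # a real client would issue HTTP GETs for the base track and the
        # temporal segments of the selected spatial segment
        self.buffer.append(("base_track", segment["base_track"]))
        self.buffer.append(("media", segment["id"]))

smf = {"spatial_segments": [
    {"id": "full_panorama-left.MP4",
     "base_track": "full_panorama_2_4K-base.mp4"},
]}
selector = StreamSelector(smf)
processor = StreamProcessor()
processor.request(selector.select("full_panorama-left.MP4"))
```

The buffered items would then be handed to the HEVC decoder in order, base track first, so the decoder is configured before media data arrive.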
  • Fig. 8 depicts a HEVC-tiled streaming process according to an embodiment of the invention.
  • the video data may be distributed by a so-called content delivery network (CDN) to clients (i.e. client devices) using an HAS protocol.
  • the process may start with a client requesting and receiving a spatial manifest file SMF from a content provider CP (steps 802 and 804).
  • the SMF may for example relate to an MPD as described with reference to Fig. 5A and 5B, defining a HEVC-tiled panorama video (2 x 4 HEVC tiles) with two spatial segments: a first spatial segment "full_panorama-left.MP4" with 2 x 2 HEVC tiles defining the left sub-region of the full panorama and a second spatial segment "full_panorama-right.MP4" with 2 x 2 HEVC tiles defining the right sub-region of the full panorama.
  • the client may parse the SMF and select a spatial segment identifier, e.g. "full_panorama-left.MP4", and the associated base track "full_panorama_2_4K-base.mp4" (step 806). Further, it may retrieve metadata associated with the selected segment (e.g. information regarding the fact that the selected video data relate to HEVC-tiled video data, that the selected video data are defined as a spatial segment comprising a number of HEVC tiles, the position information associated with the spatial segment and its HEVC tiles, etc.).
  • the client may send a request for the base track to the network.
  • the client may send a request message (step 808), e.g. an HTTP GET message, comprising an identifier of the base track (e.g. an URL) to a so-called request routing (RR) node of the CDN.
  • the request routing node may locate the delivery node (e.g. a media server) on which the data of the requested base track is stored and send the URL of the localized delivery node in a redirect message back to the client (step 810).
  • the client may use the URL for sequentially requesting the temporal segments of the selected first spatial segment as identified in the SMF.
  • the client may send a request message to the delivery node that is configured to deliver the base track to the client (step 812).
  • the delivery node may send the base track to the client (step 814).
  • the data of the base track may be buffered and sent to the HEVC decoder, while the retrieval process for the temporal segments comprising the video data of the first spatial segment is continued by sending a request for the first temporal segment "full_panorama-left_1.MP4" of the first spatial segment to the network (step 816).
  • the client may receive the first temporal segment in a response message (step 818) and start decoding and rendering the video on the basis of the information in the base track (step 820). This process may be repeated for subsequent temporal segments of the first spatial segment.
  • the client device may be triggered by the user navigation function to switch from a first spatial segment to a second spatial segment (step 830).
  • the user navigation function may detect a user interaction that is interpreted by the user navigation function as a panning action.
  • the stream selector in the client device may parse the SMF and select a second spatial segment (step 832) defining the right sub-region of the full panorama, configure the decoder on the basis of the metadata in the SMF of the second spatial segment and start requesting temporal segments of the second spatial segment from the CDN (step 834) in a similar way as illustrated by steps 816-828.
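The switch triggered by the user navigation function (steps 830-834) can be sketched as a simple mapping from a panning direction to the next spatial segment to request; the segment names follow the example SMF, while the switching rule itself is an illustrative assumption:

```python
# Sketch: choose the spatial segment to request next when the user
# navigation function reports a panning gesture.

SEGMENTS = {
    "left": "full_panorama-left.MP4",
    "right": "full_panorama-right.MP4",
}

def on_pan(current: str, direction: str) -> str:
    """Return the spatial segment identifier to request after a pan."""
    if direction == "right" and current == SEGMENTS["left"]:
        return SEGMENTS["right"]
    if direction == "left" and current == SEGMENTS["right"]:
        return SEGMENTS["left"]
    return current        # the pan stays within the current segment
```

When the returned identifier differs from the current one, the stream selector would configure the decoder with the new segment's metadata and start requesting its temporal segments.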
  • Fig. 9 schematically depicts a process for generating HEVC-tiled video data according to an embodiment of the invention.
  • one or more cameras 902, e.g. one or more high-resolution, wide field-of-view cameras, may be used to generate or compose a source video.
  • An HEVC encoder 904 may be used to generate one or more HEVC-tiled streams on the basis of a source video.
  • the HEVC-tiled streams may define a Spatial Composition on the basis of the source video.
  • the Spatial Composition may comprise one or more Spatial Representations 908₁,₂. In an embodiment, a tile representation may relate to a panorama HEVC-tiled stream comprising a particular number of HEVC tiles per video frame. Further, for one or more Spatial Representations, one or more Spatial Segments 910₁₋₃ may be generated, wherein a spatial segment may relate to a HEVC-tiled stream comprising a subset of HEVC tiles of a HEVC-tiled video stream.
  • information (metadata) on the generated video data, i.e. identifiers and spatial and temporal information of a set of streams, may be formatted in an SMF 912 as described with reference to Fig. 4-6.
  • the thus generated one or more (HEVC-tiled) streams and SMFs may be stored at one or more delivery nodes 922₁,₂ in the network 916.
  • a delivery node is configured to deliver a HEVC-tiled stream to a client device 924.
  • a delivery node may be a media server.
  • at least part of the delivery nodes (sometimes also referred to as surrogate nodes) may be part of a dedicated content delivery network (CDN).
  • the CDN may comprise a Content Delivery Network Control Function (CDNCF) 920.
  • the CDNCF may distribute the HEVC-tiled streams over different delivery nodes so that efficient distribution of the streams is ensured.
  • the CDN may update the tile (and segment) identifiers (the URLs) in the MPD such that a client device may efficiently access delivery nodes of the CDN in order to request the delivery of HEVC-tiled content.
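The URL update performed by the CDN can be sketched as a rewrite of segment URLs in the manifest; the host names and the flat manifest layout below are hypothetical, chosen only to illustrate the idea:

```python
# Sketch: a CDNCF-style rewrite of segment URLs so that a client
# requests content directly from the delivery node selected by the CDN.

def rewrite_urls(manifest: dict, delivery_host: str) -> dict:
    """Return a copy of the manifest with segment URLs pointing at the CDN node."""
    out = dict(manifest)
    out["segments"] = [
        url.replace("origin.example.com", delivery_host)
        for url in manifest["segments"]
    ]
    return out

mpd = {"segments": ["http://origin.example.com/full_panorama-left_1.MP4"]}
updated = rewrite_urls(mpd, "node1.cdn.example.com")
```

A client parsing the updated manifest would then address its HTTP requests to the delivery node without a further redirect.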
  • when the client device 924 would like to access the video data, it may be provided with an SMF from a content provider or the CDN and use the SMF to request and playout HEVC-tiled video data.
  • the client device may generally relate to a (mobile) video data processing device such as an electronic tablet, a smart-phone, a notebook, a media player, a home gateway or a DASH-enabled device such as a DASH-enabled HbbTV display device.
  • the device may be a set-top box or content storage device configured for processing and temporarily storing content for future consumption by a content play-out device, which has access to the stored content.
  • the delivery of the video data to the client device may be based on any data transmission scheme.
  • a unicast scheme may be used to transmit data from a delivery node to the client.
  • in another embodiment, a broadcast or a multicast scheme, e.g. an IP multicast scheme, may be used to transmit the data to the client.
  • Fig. 10 is a block diagram illustrating an exemplary data processing system that may be used in systems and methods as described with reference to Fig. 1 -9.
  • Data processing system 1000 may include at least one processor 1002 coupled to memory elements 1004 through a system bus 1006. As such, the data processing system may store program code within memory elements 1004. Further, processor 1002 may execute the program code accessed from memory elements 1004 via system bus 1006.
  • data processing system may be implemented as a computer that is suitable for storing and/or executing program code. It should be appreciated, however, that data processing system 1000 may be implemented in the form of any system including a processor and memory that is capable of performing the functions described within this specification.
  • Memory elements 1004 may include one or more physical memory devices such as, for example, local memory 1008 and one or more bulk storage devices 1010.
  • Local memory may refer to random access memory or other non-persistent memory device(s) generally used during actual execution of the program code.
  • a bulk storage device may be implemented as a hard drive or other persistent data storage device.
  • the processing system 1000 may also include one or more cache memories (not shown) that provide temporary storage of at least some program code in order to reduce the number of times program code must be retrieved from bulk storage device 1010 during execution.
  • I/O devices depicted as input device 1012 and output device 1014 optionally can be coupled to the data processing system.
  • input devices may include, but are not limited to, a keyboard, a pointing device such as a mouse, or the like.
  • output devices may include, but are not limited to, a monitor or display, speakers, or the like.
  • Input device and/or output device may be coupled to data processing system either directly or through intervening I/O controllers.
  • a network adapter 1016 may also be coupled to data processing system to enable it to become coupled to other systems, computer systems, remote network devices, and/or remote storage devices through intervening private or public networks.
  • the network adapter may comprise a data receiver for receiving data that is transmitted by said systems, devices and/or networks to said data processing system, and a data transmitter for transmitting data to said systems, devices and/or networks.
  • Modems, cable modems, and Ethernet cards are examples of different types of network adapter that may be used with data processing system 1050.
  • memory elements 1004 may store an application 1018. It should be appreciated that data processing system 1000 may further execute an operating system (not shown) that can facilitate execution of the application. The application, being implemented in the form of executable program code, can be executed by data processing system 1000, e.g., by processor 1002. Responsive to executing the application, the data processing system may be configured to perform one or more operations to be described herein in further detail.
  • data processing system 1000 may represent a client data processing system.
  • application 1018 may represent a client application that, when executed, configures data processing system 1000 to perform the various functions described herein with reference to a "client", which is, for the purpose of this application, sometimes also referred to as a "client device".
  • a client, also referred to as a client device, may include, but is not limited to, a personal computer, a portable computer, a mobile phone, or the like.

Abstract

A method is described of streaming HEVC-tiled video data to a client device comprising: providing said client device with a spatial manifest file comprising one or more spatial segment identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled (panorama) video stream; and, selecting a spatial segment identifier in said spatial manifest file for requesting a delivery node to deliver at least part of the video data of a spatial segment as a HEVC-tiled video stream to the client device.

Description

HEVC-tiled video streaming
Field of the invention
The invention relates to HEVC-tiled video streaming, and, in particular, though not exclusively, to a method of streaming HEVC-tiled video data to a client device, a client device for processing HEVC-tiled video data, a non-transitory computer-readable storage medium comprising a recording area for storing HEVC-tiled video data and data structures associated with HEVC-tiled video data and a computer program product using such method.
Background of the invention
Over the past few years, advances in both camera and image processing technologies not only enable recording in ever higher resolutions, but also enable stitching the output of multiple cameras together, allowing a set of cameras that together record in full 360 degrees in even higher resolutions than 8Kx4K. These developments make it possible to change the way users experience video. Conventionally, a broadcast of e.g. a football match comprises a sequence of camera shots carefully aligned and controlled by a director. In such a broadcast stream, each camera movement in the final stream corresponds to a physical alteration to the position, angle or zoom level of a camera itself. High-resolution panorama videos, however, enable a user (and/or director) a certain degree of interaction with the video the user (and/or director) is watching (directing) without having to manipulate the camera in a physical sense. Using pan-tilt-zoom interaction, it is possible to extract from the high-resolution panorama video a sub-region of the video a user or director is interested in. This sub-region may be referred to as the region of interest (ROI).
Since in this particular use case a specific user is, at any given instant in time, only watching a subset of the full video panorama, bandwidth requirements can be reduced by sending only the part of the video the user is interested in. There are a number of techniques with which such functionality can be achieved. One of these techniques is the so-called tiled streaming technique, with which the full video panorama is divided into multiple independently encoded videos, whereby the client device, also referred to as client, has multiple decoders allowing it to reconstruct any part of the full video panorama, if necessary by stitching together a number of such independent videos.
WO2012/168365 describes content delivery systems, e.g. CDNs, for streaming spatially segmented content to clients. After requesting multiple tile streams from the network, the client (i.e. the client device) needs to buffer the different streams, and multiple instances of the decoder need to be started. The client should be able to synchronize the decoders and to stitch the decoded video tiles into the full video. Hence, when a tiled streaming mode comprises a large number of tile streams, the client processes may become complex and resource-intensive. Another form of tiled streaming is known from the HEVC standard, which provides a very efficient encoding and decoding scheme for video data. HEVC tiles were originally introduced in the HEVC standard for decoding of the video data using multi-core processors so that tiles in a HEVC-tiled video stream may be processed (encoded/decoded) in parallel.
Besides parallel processing, HEVC-tiles may also be used for playout of only a subset of the HEVC tiles in the video frames of a HEVC-tiled stream. The subset may e.g. relate to a region- of-interest (ROI) in the image area of the (raw) panorama video.
In that case, the HEVC tiles should be independently encoded so that the decoder is able to decode only a subset of the HEVC tiles. In order to generate such sets of independently decodable HEVC tiles, the HEVC standard allows an HEVC encoder to be configured for restricting the spatial and temporal predictions in the video coding (e.g. motion vectors and in-loop filters) within the boundaries of one or more HEVC tiles.
The absence of spatial and temporal predictions between the tiles (that is, between the video data of the tiles) would however reduce the compression efficiency, which could lead to a loss in video quality or an increase in the bitrate.
Hence, in order to achieve high compression rates one would require division of the frames into a few relatively large tiles. Reducing the number of tiles, however, would reduce the amount of parallelism that can be achieved, thereby limiting the encoding and decoding speed. When dividing the frames of a video into a large number of small tiles, a high level of parallelism could be achieved; however, the compression efficiency would be substantially reduced.
Furthermore, when managing multiple independent HEVC tiles at the transport level, one could format the video data as a single HEVC-tiled stream. In that case, however, the video data of all HEVC tiles should be transmitted to the client and tiles can only be manipulated at the decoder level. Alternatively, one could format the multiple independent HEVC tiles as separate streams so that only a subset of HEVC tiles needs to be streamed to the client. Such a scheme would introduce a large number of HTTP requests in order to request all temporal segments of the desired set of HEVC tiles.
Hence, there is a need in the art for improved methods and systems for streaming HEVC-tiled video data. In particular, there is a need in the art for methods and systems for streaming HEVC-tiled video data that reduces the amount of network traffic and does not increase the processor load of the device.
Summary of the invention
It is an objective of the invention to reduce or eliminate at least one of the drawbacks known in the prior art. In a first aspect the invention may relate to a method of selecting and/or streaming HEVC-tiled video data to a client device. In an embodiment, the method may comprise providing said client device with a spatial manifest file comprising one or more spatial segment identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset, preferably a plurality, of HEVC tiles of a HEVC-tiled (panorama) video stream; and, selecting a spatial segment identifier of said spatial manifest file for requesting a delivery node to deliver at least part of the video data of a spatial segment as a HEVC-tiled video stream to the client device.
In an embodiment, the method may include using said selected spatial segment identifier for sending a request, preferably an HTTP request, to said delivery node for delivering video data associated with said spatial segment to said client device.
The invention thus relates to a data structure defining a subset of HEVC tiles (i.e. one or more HEVC tiles) of a full set of HEVC tiles of a HEVC-tiled video (e.g. a HEVC-tiled panorama video). The subset of HEVC tiles may be referred to as a so-called spatial segment, wherein the spatial segment defines part of an image frame within the full image frames of the full HEVC-tiled video stream (e.g. a HEVC-tiled panorama video). The video data of a spatial segment may be stored as independently decodable video data in a file that can be accessed by the client device, e.g. an HAS enabled client device (also referred to as a HAS client device or simply HAS client), using the spatial manifest file. Different spatial segments may be defined in the spatial manifest file, which is used by the client device to locate a delivery node (e.g. a media server) that can send the requested data to the client device.
The spatial manifest file comprising the spatial segments may be used by the client device in order to retrieve HEVC-tiled video data. For example, a user may select a region-of-interest (ROI), e.g. the centre of a rendered HEVC-tiled panorama video, via the user interface of the device, wherein the position of the ROI in the image area of the panorama video may at least partly coincide with the position of the spatial segment in the full image area of the panorama video. By requesting the spatial segment that (partly) spatially coincides with the ROI, video data associated with the selected ROI may be provided to the client device in a HEVC-tiled video stream.
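The mapping from a selected ROI to a spatial segment can be sketched as follows; the segment list mirrors the left/right halves of the 2 x 4 HEVC-tiled panorama example, but the pixel coordinates and all function names are illustrative assumptions:

```python
# Sketch: select the spatial segment whose image area covers a requested
# region of interest (ROI). Rectangles are (x, y, width, height) in
# pixels of the full panorama image area.

SEGMENTS = [
    {"id": "full_panorama-left.MP4",  "rect": (0,    0, 1920, 1080)},
    {"id": "full_panorama-right.MP4", "rect": (1920, 0, 1920, 1080)},
]

def covers(rect, roi):
    """True if the segment rectangle fully contains the ROI rectangle."""
    rx, ry, rw, rh = rect
    x, y, w, h = roi
    return rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh

def segment_for_roi(roi):
    """Return the identifier of the first spatial segment covering the ROI."""
    for seg in SEGMENTS:
        if covers(seg["rect"], roi):
            return seg["id"]
    return None          # ROI spans several segments; more data needed
```

A client could run such a lookup on every navigation event and request the returned segment identifier from the manifest.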
Thus, instead of sending a plurality of requests associated with a plurality of independent tile streams before the video data can be decoded, the spatial manifest file comprising spatial segment instances allows requesting a set of HEVC-tiles on the basis of a single or at least a reduced number of request messages, e.g. HTTP request messages, and receive the video data of the spatial segment in a single HEVC-tiled video stream that can be directly processed by the HEVC decoder.
The video data of a spatial segment is formatted as a HEVC-tiled video stream so that the data can be readily decoded by a single HEVC decoder. This way, network traffic and processor resources of the client device (also referred to as a user device) can be substantially reduced when compared with known tiled streaming schemes. The concept of tiles as described in this disclosure may be supported by different video codecs. For example, the High Efficiency Video Coding (HEVC) standard allows the use of independently decodable tiles (HEVC tiles). HEVC tiles may be created by an encoder that divides each video frame of a media stream into a number of rows and columns ("a grid of tiles") defining tiles of a predefined width and height expressed in units of coding tree blocks (CTB). An HEVC bitstream may comprise decoder information for informing a decoder how the video frames should be divided into tiles. The decoder information may inform the decoder on the tile division of the video frames in different ways. In one variant, the decoder information may comprise information on a uniform grid of n by m tiles, wherein the size of the tiles in the grid can be deduced on the basis of the width of the frames and the CTB size. Because of rounding inaccuracies, not all tiles may have the exact same size. In another variant, the decoder information may comprise explicit information on the widths and heights of the tiles (e.g. in terms of coding tree block units). This way, video frames may be divided into tiles of different sizes. Only for the tiles of the last row and the last column may the size be derived from the remaining number of CTBs. Thereafter, a packetizer may packetize the raw HEVC bitstream into a suitable media container that is used by a transport protocol.
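The rounding behaviour of a uniform tile grid can be sketched as follows; the function and its parameters are an illustrative assumption, showing only how per-column widths could be deduced from the frame width and CTB size when the bitstream signals a uniform n-by-m grid:

```python
# Sketch: derive per-tile column widths for a uniform grid. Because the
# number of CTBs per row rarely divides evenly by the number of columns,
# neighbouring columns may differ by one CTB - the rounding effect
# mentioned above.

import math

def uniform_tile_widths(frame_width: int, ctb_size: int, columns: int):
    """Return the width in pixels of each tile column of a uniform grid."""
    ctbs = math.ceil(frame_width / ctb_size)      # CTBs per frame row
    widths = []
    prev = 0
    for i in range(1, columns + 1):
        edge = (i * ctbs) // columns              # CTB index of column edge
        widths.append((edge - prev) * ctb_size)
        prev = edge
    return widths
```

For a 1920-pixel-wide frame with 64-pixel CTBs and 4 columns, the 30 CTBs split as 7/8/7/8, so not all tiles have the exact same width.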
Other video codecs that allow encoding/decoding dependencies between tiles to be precisely defined, such that tiles within a spatial segment may contain encoding dependencies but tiles across the boundaries of spatial segments do not, may include the video codec VP9 of Google or, to some extent, MPEG-4 Part 10 AVC/H.264, the Advanced Video Coding (AVC) standard. In VP9, coding dependencies are broken along vertical tile boundaries, which means that two tiles in the same tile row may be decoded at the same time. Similarly, in AVC encoding, slices may be used to divide each frame into multiple rows, wherein each of these rows defines a tile in the sense that the media data is independently decodable. Hence, in this disclosure the term "HEVC tile" is not restricted to only tiles according to the HEVC standard, but generally defines a sub-region of arbitrary shape and/or dimensions within the image region of the video frames, wherein the encoding process can be configured such that the media data within the boundaries of the tile, or, when more than one tile is comprised in a spatial segment, within the boundaries of such segment, is independently decodable. In other video codecs other terms such as segment or slice may be used instead of the term "tile".
It should thus further be noted that the invention is equally suitable for use with video codecs that are different from HEVC (e.g. VP9) or are (future) derivatives from HEVC, as long as these codecs have the characteristic that they are suitable for encoding a video, whereby different regions (sub areas) of images representing the video can be independently encoded within the boundaries of a spatial segment, in a single encoding process, and whereby the independently encoded regions can be decoded in a single decoding process. The term independently refers to the notion that the coding is performed in a manner that no encoding dependencies exist between these regions across the boundaries of spatial segments. In an embodiment, video data of HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependencies with video data of HEVC tiles in said HEVC-tiled video stream that are not part of the spatial segment.
In an embodiment, coding (decoding/encoding) constraints for a spatial segment may be summarized as follows:
1. HEVC tiles in a first spatial segment A may not have any coding dependencies on HEVC tiles in a second spatial segment B;
2. HEVC tiles in a first spatial segment A may have coding dependencies on other tiles in the spatial segment A, under the conditions that:
a. a first HEVC tile 1 in spatial segment A at a time instance Frame N may not have any coding dependencies on a second HEVC tile 2 in spatial segment A at time instance Frame N; and,
b. an HEVC tile 2 in spatial segment A at a time instance Frame N may have coding dependencies on HEVC tile 2 of spatial segment A at an earlier time instance (e.g. Frame N-1) or a later time instance (e.g. Frame N+1).
The latter condition ensures that the coding processes can be executed in parallel by different CPU cores.
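The coding constraints above can be sketched as a small validation rule; the tuple representation of a dependency is an illustrative assumption, used only to make the two rules concrete:

```python
# Sketch: check whether a coding dependency between two (spatial
# segment, HEVC tile, frame number) positions satisfies the rules:
# (1) never across spatial segment boundaries, and (2a) never between
# two different tiles of the same frame; (2b) temporal dependencies on
# the same tile within a segment are allowed.

def dependency_allowed(src, dst):
    """src and dst are (spatial_segment, hevc_tile, frame_number) tuples."""
    s_seg, s_tile, s_frame = src
    d_seg, d_tile, d_frame = dst
    if s_seg != d_seg:
        return False            # rule 1: no cross-segment dependencies
    if s_frame == d_frame and s_tile != d_tile:
        return False            # rule 2a: no same-frame cross-tile deps
    return True                 # rule 2b: temporal deps within a segment
```

Rule 2a is precisely what lets different CPU cores decode the tiles of one frame in parallel, while rule 2b preserves temporal prediction for compression efficiency.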
In an embodiment, the spatial segment may be defined by segment boundaries that coincide with the HEVC tile boundaries in a row and column direction of said HEVC-tiled video stream. In another embodiment, the segment boundaries may enclose a rectangular area comprising an integer number of HEVC tiles (a subset of HEVC tiles) that is smaller than the number of HEVC tiles in said HEVC-tiled video stream. The integer number is preferably larger than 1; thus the HEVC tiles of said rectangular area are preferably a plurality (i.e. a multiple). Hence, the image area associated with a spatial segment may define a small part of the full image area of the full HEVC-tiled panorama video.
In an embodiment, video data of at least part of said HEVC tiles in said spatial segment are decoded in parallel by a HEVC decoder. As the spatial segment may comprise multiple HEVC tiles, the tiles may be decoded in parallel by multiple processor cores.
In an embodiment, video data of the HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependencies. Hence, the video data of each HEVC tile in the spatial segment can be decoded by the HEVC decoder without any information of the other HEVC tiles in the spatial segment. In another embodiment, video data of at least part of said HEVC tiles in said spatial segment have one or more spatial and/or temporal decoding dependencies. In such an embodiment, the HEVC-tiled video stream is encoded such that, within the boundaries of a spatial segment, dependencies of video data between different HEVC tiles belonging to the same spatial segment are allowed and/or exist. In that case, the HEVC tiles in the spatial segment (i.e. a subset of HEVC tiles of a HEVC-tiled video stream) can be efficiently compressed without any further quality loss.
In an embodiment, said spatial manifest file may further comprise one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment. Hence, in this embodiment, the HEVC tiles in a spatial segment may be individually accessible by a client device on the basis of the spatial manifest file.
In an embodiment, said spatial manifest file may further comprise metadata associated with said selected spatial segment, wherein said metadata may include at least one of: information for determining that the selected spatial segment is related to HEVC-tiled video data; information for determining the number and/or size of HEVC-tiles in the selected spatial segment; information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream; and/or, information for determining whether the video data of a HEVC tile of said spatial segment have one or more temporal decoding dependencies on video data of other HEVC tiles in said spatial segment. Hence, the spatial manifest file may comprise information (metadata) for the decoder so that the decoder can be initialized or configured before it receives HEVC-tiled video data that is requested by the client device on the basis of the spatial manifest file.
In an embodiment, video data associated with a spatial segment are stored in a separate track, wherein video data in said track may at least be partly accessible by said client device on the basis of spatial segment identifiers. In an embodiment, video data associated with a HEVC tile in a spatial segment are stored in a separate track, wherein video data in said track may at least be partly accessible by said client device on the basis of one or more HEVC tile identifiers. Hence, in order to allow a client device to request video data of a spatial segment and/or video data of one or more HEVC tiles in a spatial segment, the video data may be stored as separate tracks on a computer-readable storage medium. Video data in a track may be linked with a spatial segment identifier or a HEVC tile identifier in order to allow the client device to request delivery of video data stored in one or more of said tracks to a device comprising a HEVC decoder for decoding the video data and rendering video content.
In an embodiment of the invention, one or more spatial segments, preferably identifiable by spatial segment identifiers, spatially overlap. This configuration of the spatial segments has the advantage that, for instance, a user interaction such as panning "off screen" may be performed in an improved manner. In certain embodiments, video data related to an image area that is "off screen" at the moment that it is requested may be retrieved. If such "off screen" image area (which is not yet displayed) is partly comprised in the video data of the tracks with spatial segments that are being retrieved, enough time may be gained before partially overlapping spatial segments of another track are being retrieved, such that the "off screen" panning action may be perceived as seamless.
In a further aspect, the invention may relate to a client device, wherein said client device may be configured for: parsing a spatial manifest file comprising one or more spatial segment identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled video stream; and, using a spatial segment identifier of said spatial manifest file for requesting a delivery node to deliver video data of a spatial segment.
In an embodiment, a HEVC decoder may be used for decoding video data of said spatial segment that are requested by said client device on the basis of said spatial manifest file.
In another aspect, the invention may relate to a non-transitory computer-readable storage medium comprising a recording area for storing video data, wherein said recording area may comprise: video data associated with a spatial segment, said spatial segment comprising a subset of HEVC tiles of a HEVC-tiled video stream, the video data of said spatial segment being accessible on the basis of a spatial segment identifier.
In a further embodiment, video data of said one or more spatial segments are accessible on the basis of one or more spatial segment tile identifiers.
In an embodiment, video data of said one or more spatial segments are stored as separate video tracks in said recording area. In another embodiment, said recording area may further comprise at least one base track comprising one or more extractors, wherein an extractor is pointing to a video track.
In yet another aspect, the invention may relate to a non-transitory computer-readable storage medium comprising a stored data structure, preferably a spatial manifest file for use by a device, preferably a client device, or for use in the methods as described above, wherein said data structure may comprise: one or more spatial segment identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers, preferably (part of) one or more URLs, to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled video stream.
In an embodiment, said data structure may comprise one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment.
In an embodiment, said data structure may further comprise metadata associated with said selected spatial segment, wherein said metadata may include at least one of: information for determining that the selected spatial segment is related to HEVC-tiled video data; information for determining the number and/or size of HEVC tiles in the selected spatial segment; information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream; and, information for determining whether the video data of a HEVC tile of said spatial segment have one or more spatial decoding dependencies on other HEVC tiles in said spatial segment.
In a further aspect, the invention may relate to a video tiling system configured for: receiving video data, preferably wide field-of-view (panorama) video data; encoding said video data into a HEVC-tiled video comprising one or more spatial segments, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a HEVC-tiled panorama video stream; generating a spatial manifest file associated with said HEVC-tiled video data, said spatial manifest file comprising one or more spatial segment identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segment identifiers, preferably (part of) one or more URLs, to a client device, preferably said spatial manifest file further comprising information for determining the position of at least part of said one or more spatial segments and/or the position of HEVC tiles in a spatial segment within the tiled image area of said HEVC-tiled video stream.
The invention may also relate to a computer program product comprising software code portions configured for, when run in the memory of a computer, executing the method steps according to any of the above claims.
The invention will be further illustrated with reference to the attached drawings, which schematically will show embodiments according to the invention. It will be understood that the invention is not in any way restricted to these specific embodiments.
Brief description of the drawings
Fig. 1 depicts the concept of a spatial HEVC segment according to an embodiment of the invention.
Fig. 2 depicts a data structure for HEVC-tiled video data according to an embodiment of the invention.
Fig. 3 depicts a schematic of a temporally segmented HEVC-tiled stream comprising spatial HEVC segments according to an embodiment of the invention.
Fig. 4 schematically depicts a spatial manifest file for use by a HAS client device according to an embodiment of the invention.
Fig. 5A and 5B schematically depict a spatial manifest file comprising spatial segments according to an embodiment of the invention.
Fig. 6A and 6B schematically depict a spatial manifest file comprising spatial segments according to an embodiment of the invention.
Fig. 7 depicts a client device configured for rendering HEVC-tiled video data on the basis of a spatial manifest file according to an embodiment of the invention.
Fig. 8 schematically depicts a flow diagram of streaming HEVC-tiled video data on the basis of a spatial manifest file according to an embodiment of the invention.
Fig. 9 depicts a schematic of a process for generating a HEVC-tiled stream or file comprising one or more spatial segments according to an embodiment of the invention.
Fig. 10 is a block diagram illustrating an exemplary data processing system that may be used in systems and methods as described with reference to Fig. 1-9.
Detailed description
Fig. 1A and 1B depict schematics of a HEVC-tiled video stream according to various embodiments of the invention.
A video stream, e.g. a high-definition (HD) or ultra high-definition (UHD) wide field-of-view or panorama video stream, may be encoded on the basis of the HEVC video compression standard. In HEVC a video image is partitioned in so-called coding tree units (CTUs), which are the basic processing units used in the HEVC standard for the encoding and decoding process.
The HEVC encoder may be configured to divide video frames 100 in the HEVC stream in so-called HEVC tiles 102, wherein a HEVC tile is a rectangular area defined by a particular grouping of CTUs. The HEVC tiles may divide the image area of the panorama video into multiple adjacent rectangular regions (which may be of different size), wherein the boundaries of the HEVC tiles are defined by HEVC tile boundaries in the column and row direction 104, 106. In this application, a HEVC stream comprising HEVC tiles may be referred to as a HEVC-tiled video stream.
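The tile partitioning described above can be illustrated with a small sketch (not taken from the patent text): given the column and row boundaries of the tile grid, the pixel rectangle of every HEVC tile follows directly. The boundary lists here are hypothetical inputs; in a real HEVC bitstream they are signalled in the picture parameter set.

```python
# Sketch: derive the pixel rectangle (x, y, w, h) of each HEVC tile from the
# tile column/row boundaries. Boundary lists include 0 and the frame
# width/height; values below are illustrative.
def tile_rectangles(col_bounds, row_bounds):
    rects = []
    for r in range(len(row_bounds) - 1):
        for c in range(len(col_bounds) - 1):
            x, y = col_bounds[c], row_bounds[r]
            w = col_bounds[c + 1] - x
            h = row_bounds[r + 1] - y
            rects.append((x, y, w, h))
    return rects

# A 2x2 tile grid on a 4096x2160 frame
rects = tile_rectangles([0, 2048, 4096], [0, 1080, 2160])
```

The tiles are enumerated in raster-scan order, which matches the tile indexing used later for spatial segments.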
HEVC tiles were originally introduced in the HEVC standard for encoding and decoding of the video data using multi-core processors so that tiles in a HEVC-tiled stream may be processed (encoded and decoded) in parallel. HEVC-tiles may also be used for playout of only a subset of the HEVC tiles in the video frames of a HEVC-tiled stream. The subset may e.g. relate to a region-of-interest (ROI) in the image area of the (raw) panorama video.
In that case, the HEVC tiles should be independently encoded over time so that the decoder is able to decode only a subset of the HEVC tiles over multiple frames. In order to generate such sets of independently decodable HEVC tiles, the HEVC standard allows an HEVC encoder to be configured for restricting the temporal predictions in the video coding (e.g. motion vectors and in-loop filters) within the boundaries of one or more HEVC tiles.
The absence of temporal decoding dependency between the tiles (i.e. between the video data of different tiles) however reduces the compression efficiency, which could lead to a loss in video quality or an increase in the bitrate.
Hence, in order to achieve high compression rates one would require division of the frames into a few relatively large tiles. Reducing the number of tiles however would reduce the amount of parallelism that can be achieved, thereby limiting the encoding and decoding speed. When dividing the frames of a video into a large number of small tiles, a high level of parallelism could be achieved, but the compression efficiency would be substantially reduced.
Furthermore, when managing multiple independent HEVC tiles at transport level one could format the video data as a single HEVC-tiled stream. In that case however the video data of all HEVC tiles should be transmitted to the client device and tiles can only be manipulated at decoder level. Alternatively, one could format the multiple independent HEVC tiles as separate streams (HEVC tile streams) so that only a subset of HEVC tiles needs to be streamed to the client device. Such a scheme would introduce a large number of HTTP requests in order to request all temporal segments of the desired set of HEVC tiles.
In order to address the above-mentioned problems, a set of HEVC tiles may be grouped into a so-called spatial segment. For example, Fig. 1A depicts a spatial segment 108₁ comprising a subset of HEVC tiles (in this example 8 HEVC tiles) from the full set of HEVC tiles building the HEVC-tiled video frame of a panorama video (in this example 24 HEVC tiles). Similarly, Fig. 1B depicts an example of multiple spatial segments 108₂ (in this case four spatial segments), each comprising one or more HEVC tiles (in this example six) from the full set of HEVC tiles. Although the spatial segments in Fig. 1B are of equal size, multiple spatial segments of different size (different number of HEVC tiles) are also envisaged.
The boundaries of a spatial segment coincide with the HEVC tile column and row boundaries 104, 106 so that it encloses an integer number (one or more), preferably a plurality, of HEVC tiles. Hence, the spatial segment defines a sub-region in the image frame comprising multiple HEVC tiles. The spatial segment thus defines an image area that is larger than the image area associated with an individual HEVC tile and smaller than the image area of the full panorama video. Furthermore, the data format associated with a spatial segment may be defined such that a spatial segment can be accessed by the client device on transport level by requesting the spatial segment from a media server and receiving the video data in the spatial segment as an independent HEVC stream (a spatial segment stream).
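A minimal sketch of this grouping, under the assumption of raster-scan tile indexing in a grid of `cols` columns (the function name and parameters are illustrative, not from the patent): a rectangular spatial segment can be described by the indices of its top-left and bottom-right tiles, from which the enclosed tiles follow.

```python
# Sketch: enumerate the HEVC tiles enclosed by a rectangular spatial segment,
# given the raster-scan indices of its top-left and bottom-right tiles in a
# grid of `cols` tile columns.
def segment_tiles(top_left, bottom_right, cols):
    r0, c0 = divmod(top_left, cols)
    r1, c1 = divmod(bottom_right, cols)
    return [r * cols + c for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]

# Fig. 1A style example: a 6x4 grid (24 tiles) with a segment of 8 tiles
# spanning a 4-wide, 2-high block of tiles
tiles = segment_tiles(top_left=1, bottom_right=10, cols=6)
```

The same top-left/bottom-right indexing reappears in the motion-constrained tile set SEI message discussed below in the document.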
In an embodiment, one or more HEVC tiles in a spatial segment may be configured to have one or more (temporal) decoding dependencies in previous and/or future frames and no (temporal) decoding dependencies between tiles within the spatial segment and tiles outside the spatial segment. In that case, the video data may be efficiently compressed without any further loss of quality.
In another embodiment, the HEVC tiles in a spatial segment may be configured to have one or more (temporal) decoding dependencies in previous and/or future frames and no decoding dependency between tiles within the spatial segment and tiles outside the spatial segment. In that case, the rendering of the HEVC tiles in the spatial segment may be controlled at decoder level.
The position of HEVC tiles and the position of one or more spatial segments in the full image region may be determined by tile position information and segment position information respectively. The position information may be defined on the basis of a coordinate system associated with the full image region. Tile position information may comprise coordinates of tile regions within the image region of said source video. This way, every HEVC tile may be related to a tile region in the image region of the HEVC video stream. On the basis of the full set of HEVC tiles and the tile position information, the full image region of the HEVC stream may be reconstructed by the HEVC decoder.
A coordinate system that is used for defining the tile position information may also be used for defining the position of a spatial segment. In case the image region relates to a 2D or 3D image region, a Cartesian coordinate system may be used. Alternatively, in case the image region relates to a curved image region, other non-Cartesian, curvilinear coordinate systems may be used, e.g. a cylindrical, spherical or polar coordinate system.
Hence, from the above, it follows that a spatial segment defines HEVC-tiled video data comprising a subset of HEVC tiles from the complete set of HEVC tiles of a HEVC-tiled wide field-of-view video (e.g. a panorama video). In case of an HEVC-encoded bitstream, a spatial segment may be defined using the concept of a so-called motion-constrained tile set. The information defining such a tile set may be defined as an SEI message in the MPEG stream. The motion-constrained tile set is defined as follows:

temporal_motion_constrained_tile_sets( payloadSize ) {
    mc_all_tiles_exact_sample_value_match_flag
    each_tile_one_tile_set_flag
    if( !each_tile_one_tile_set_flag ) {
        num_sets_in_message_minus1
        for( i = 0; i <= num_sets_in_message_minus1; i++ ) {
            mcts_id[ i ]
            num_tile_rects_in_set_minus1[ i ]
            for( j = 0; j <= num_tile_rects_in_set_minus1[ i ]; j++ ) {
                top_left_tile_index[ i ][ j ]
                bottom_right_tile_index[ i ][ j ]
            }
            if( !mc_all_tiles_exact_sample_value_match_flag )
                exact_sample_value_match_flag[ i ]
        }
    }
}

wherein:
- each_tile_one_tile_set_flag equal to 1 means that each tile is a tile set (a spatial segment);
- num_sets_in_message_minus1 defines the number of tile sets (the number of spatial segments);
- mcts_id[ i ] gives an arbitrary id for the i-th tile set;
- num_tile_rects_in_set_minus1[ i ] defines the number of HEVC tiles in the motion-constrained tile set;
- top_left_tile_index[ i ][ j ] and bottom_right_tile_index[ i ][ j ] define the top left and bottom right indexes of the tiles in the tile set.

Further, the mc_all_tiles_exact_sample_value_match_flag parameter is set to 1. The HEVC standard thus allows defining sets of tiles within the bitstream. As will be described hereunder in more detail, the spatial segment data structure allows a client device to access and retrieve these tile sets on a transport level (e.g. MPEG DASH level).
Fig. 2 depicts an example of a data structure 200 of an HEVC-tiled video file or stream, in this particular example an MPEG-4 file 202, comprising one or more spatial segments.
In an embodiment, the video file or stream may comprise one or more (video) tracks 206₁₋₄, which serve as a container for independently decodable video data associated with one or more spatial segments and, optionally, one or more HEVC tiles. Hence, a track may define a container comprising video data 210 wherein the spatial and temporal predictions for the video coding (e.g. motion vectors and in-loop filters) are within the boundaries of the spatial segment.
In an embodiment, a track may further comprise position information 208. The decoder may use the position information in order to determine the position of the spatial segment within the HEVC-tiled video image. In an embodiment, position information in a track may comprise an origin and size information in order to allow the decoder to position a spatial segment or a HEVC tile in a reference space wherein a position in the space may be determined by a coordinate system associated with the full image.
In a further embodiment, the data structure 200 may further comprise a so-called base track 204. The base track may comprise information that determines the sequence of the tracks that need to be decoded by the HEVC decoder. In particular, the base track may comprise extractors 212, wherein an extractor defines a reference to one or more corresponding tracks. By parsing the base track, the decoder may replace an extractor with the audio and/or video data of the track it refers to. The HEVC decoder thus uses the information in the base track in order to generate a coherent bitstream for decoding from the video data in the tracks.
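A simplified sketch of this extractor mechanism (the track model is illustrative: real ISOBMFF extractors reference byte ranges within samples of the referenced track, not whole tracks):

```python
# Sketch: flatten a base track whose samples are either inline NAL data or
# extractors referencing another track, producing the decoder's input order.
def resolve_base_track(base_samples, tracks):
    bitstream = []
    for kind, payload in base_samples:
        if kind == "extractor":
            # Missing tracks are simply skipped; since tracks are
            # independently decodable, the rest still decodes ("missing data").
            bitstream.extend(tracks.get(payload, []))
        else:
            bitstream.append(payload)
    return bitstream

tracks = {1: ["segment1-slices"], 2: ["segment2-slices"]}
out = resolve_base_track(
    [("nal", "parameter-sets"), ("extractor", 1), ("extractor", 3)], tracks)
```

Note how the unresolved extractor (track 3) is ignored, mirroring the "missing data" behaviour described above.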
If a particular video application does not require a particular spatial segment (or one or more HEVC tiles), the decoder may simply ignore its corresponding extractor. In that case, the absence of such track may be interpreted by the decoder as "missing data". Since the video data in the tracks are independently decodable, the absence of data from one or more tracks does not prevent the decoder from decoding other tracks that can be retrieved.
In an embodiment, an HEVC tile may be decoded independently from the other HEVC tiles, so that the absence of data from one or more tracks does not prevent the decoder from decoding other tracks that can be retrieved.
In a further embodiment, the base track may comprise video data associated with the full image region of the source video, e.g. a panorama video. The video may be selected in a quality such that it can be transported in the HEVC stream without taking up too much bandwidth. In an embodiment, the data format depicted in Fig. 2 may be used for storing spatial segments and HEVC tiles as independent files such that a client device may request delivery of these files.
The streams depicted in Fig. 1 and 2 may be delivered to a client device (also simply referred to as a client throughout this application) for playout using an adaptive streaming protocol such as an HTTP adaptive streaming (HAS) protocol. Examples of HTTP adaptive streaming protocols include Apple HTTP Live Streaming [http://tools.ietf.org/html/draft-pantos-http-live-streaming-13], Microsoft Smooth Streaming [http://www.iis.net/download/SmoothStreaming], Adobe HTTP Dynamic Streaming [http://www.adobe.com/products/httpdynamicstreaming], 3GPP-DASH [TS 26.247 Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP] and MPEG Dynamic Adaptive Streaming over HTTP [MPEG DASH ISO/IEC 23001-6]. HTTP allows an efficient, firewall-friendly and scalable scheme for delivering tile streams (and segments) to clients.
When using a HAS protocol, the spatially divided, independently decodable video data (i.e. the video data of the spatial segments) may be temporally divided in so-called temporal segments of a predetermined time period as shown in Fig. 3.
In particular, Fig. 3 depicts a schematic of a temporally segmented HEVC-tiled stream comprising spatial segments according to an embodiment of the invention. The video frames 306₁₋ₙ are divided in a plurality of spatial segments 302₁₋₄ (in this particular example 4 spatial segments), wherein each spatial segment comprises a plurality of HEVC tiles 304. The video data associated with each spatial segment may be temporally divided in temporal segments 308₁,₂.
In an embodiment, a temporal segment may start with a media unit, e.g. an I frame, that has no coding dependencies on other frames in the temporal segment or other temporal segments, so that the decoder can directly start decoding video data in the spatial segment.
The video data of a spatial segment 302₁ may not have any decoding dependency on other spatial segments 302₂₋₄ of the same video frame or earlier video frames in the same temporal segment or earlier temporal segments. The video data in a temporal segment may start with a frame that can be decoded without the need of other frames. This way, a client may receive a spatial segment of a spatial segment stream and start decoding the video data of the first video frame in the spatial segment without the need of other video data.
In an embodiment, video data associated with each spatial segment may be delivered as separate HEVC-tiled streams to the client. In another embodiment, video data associated with two or more spatial segments may be delivered in one HEVC-tiled stream to the client. In case an HAS streaming protocol is used for delivering video data to an HAS client (which is a client device configured for processing video data delivered on the basis of HTTP Adaptive Streaming), a HEVC-tiled stream may be further divided in temporal segments.
Hence, the tile constraints for a spatial segment may be summarized as follows:
1. HEVC tiles in a first spatial segment A may not have any coding dependencies on HEVC tiles in a second spatial segment B;
2. HEVC tiles in a first spatial segment A may have coding dependencies on other tiles in the spatial segment A, under the condition that:
a. a first HEVC tile 1 in a spatial segment A at a time instance Frame N may not have any coding dependencies on a second HEVC tile 2 in the spatial segment A at time instance Frame N;
b. a HEVC tile 2 in spatial segment A at a time instance Frame N may have coding dependencies on HEVC tile 2 of spatial segment A at an earlier time instance (e.g. Frame N-1) or a later time instance (e.g. Frame N+1).
The latter condition ensures that encoding and decoding processes can be parallelized between different CPU cores.
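The constraints above can be expressed as a small predicate over individual coding dependencies (a toy model for illustration, not an analysis of an actual HEVC bitstream; a dependency is modelled as a pair of (segment, tile, frame) tuples):

```python
# Sketch: check one coding dependency against the spatial segment constraints.
# dep = ((seg_a, tile_a, frame_a), (seg_b, tile_b, frame_b)) meaning the first
# element depends on the second.
def violates_constraints(dep):
    (seg_a, tile_a, frame_a), (seg_b, tile_b, frame_b) = dep
    if seg_a != seg_b:
        return True   # rule 1: no dependency across spatial segments
    if frame_a == frame_b and tile_a != tile_b:
        return True   # rule 2a: no same-frame dependency on another tile
    return False      # rule 2b: other-frame, same-segment dependency allowed

ok = violates_constraints((("A", 2, 5), ("A", 2, 4)))   # temporal, same tile
bad = violates_constraints((("A", 1, 5), ("B", 1, 5)))  # crosses segments
```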
The HAS client may be provided with a so-called spatial manifest file (SMF) in order to inform the HAS client about the spatial and temporal relation of the spatial segments in the HEVC-tiled stream. As will be described hereunder in more detail, an SMF may comprise stream identifiers (e.g. (part of) an URL), which a client may use in order to locate and access one or more delivery nodes (e.g. one or more media servers or a content delivery network (CDN)), which are capable of delivering the temporally segmented video data associated with one or more spatial segments on the basis of a HAS protocol to a HAS client.
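As a sketch of how such stream identifiers translate into per-period requests, the following expands a hypothetical segment URL template in the style of MPEG-DASH's $Number$ substitution (the base URL and template names are illustrative, not from the document):

```python
# Sketch: expand a DASH-style segment URL template into the sequence of
# temporal segment URLs a HAS client would request from a delivery node.
def segment_urls(base_url, template, first, count):
    return [base_url + template.replace("$Number$", str(n))
            for n in range(first, first + count)]

urls = segment_urls("http://cdn.example.com/",
                    "spatial_segment_left_$Number$.m4s", 1, 3)
```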
The client (i.e. client device) may parse the manifest file and use the information in the manifest file to request the desired (temporal and spatial) segments in order to render the video data.
As will be described hereunder in more detail, when the video data are rendered on a device with a suitable user interface (e.g. a touch screen or a pointing device), the user interface may be configured to allow a user to interact with a displayed imaging region and select e.g. a region of interest (ROI) that at least partly coincides with a predefined spatial segment. In response to the user interaction, the user interface may generate an instruction for the client device to request HEVC-tiled video data of the spatial segment and render the video data of the spatial segment on the screen. Hence, a user may move and/or expand the ROI and, in response, an appropriate spatial segment within that tile representation may be selected in order to render a video image that at least partly coincides with the ROI.
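The ROI-to-segment selection step can be sketched with a plain rectangle-overlap test over the segment position information from the SMF (segment names and coordinates below are illustrative assumptions):

```python
# Sketch: select the spatial segments that a requested ROI overlaps, given
# each segment's position information (x, y, w, h) in the full image region.
def overlapping_segments(roi, segments):
    rx, ry, rw, rh = roi
    hits = []
    for name, (x, y, w, h) in segments.items():
        if rx < x + w and x < rx + rw and ry < y + h and y < ry + rh:
            hits.append(name)
    return hits

# Two segments splitting a 4096x2160 panorama (cf. Fig. 5's left/right halves)
segments = {"left":  (0, 0, 2048, 2160),
            "right": (2048, 0, 2048, 2160)}
hits = overlapping_segments((1500, 500, 1000, 800), segments)
```

An ROI straddling the segment boundary selects both halves, so the client would request both spatial segment streams.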
Fig. 4 schematically depicts a spatial manifest file for a HAS client device according to an embodiment of the invention. The spatial manifest file (SMF) may define one or more hierarchical data levels. The first data level 402 may relate to a Spatial Composition defining one or more Spatial Representations 404₁₋₃ of a source video (e.g. source1.mp4). The Spatial Representations may form a second data level. Typically, the source video may be formed on the basis of one or more high-resolution and, often, wide field-of-view HD or even UHD video streams or files.
A Spatial Composition may comprise different Spatial Representations 404 generated by an HEVC encoder and other representations of the source file, e.g. a non-tiled low-resolution video. The Spatial Representations may differ in HEVC tile sizes, format (2D or 3D), video and/or audio qualities and/or resolutions (e.g. SD/HD/UHD, bitrates, etc.), fields-of-view, camera angles, etc. In order to generate a Spatial Representation, the video frames of the source file may be encoded into a HEVC-tiled video file or stream comprising one or more (independently) decodable Spatial Segments that may form a third data level in the SMF.
The Spatial Representation may comprise metadata. For example, in Fig. 4 the metadata in the Spatial Representation 404₂ may comprise video resolution information 416 indicating that the HEVC tiles of the video data of a particular Spatial Representation are associated with a 4096 x 2160 video data format.
A Spatial Representation may comprise one or more Spatial Segments 410 as described in detail with reference to Fig. 1-3. A Spatial Segment may define one or more HEVC tiles 406₁₋₄. Further, a Spatial Segment may comprise metadata, e.g. segment position information 412 defining the position of a spatial segment in the HEVC-tiled video image. Further, the spatial segment instance may comprise a segment identifier 414, e.g. an URL, which may be used for retrieving video data associated with a Spatial Segment.
The HEVC tiles in a Spatial Segment may be defined by HEVC tile instances 406₁₋₄. A HEVC tile instance may comprise a tile identifier 418₁,₂ for identifying a HEVC tile in the video data of a Spatial Segment. Further, in an embodiment, a HEVC tile instance may comprise tile position information (e.g. tile coordinates) 422₁,₂ defining the position of a HEVC tile in video frames of the HEVC-tiled stream.
The segment position information and the tile position information in the SMF may be generally referred to as position information. The coordinates used for defining the position of a spatial segment or an HEVC tile may be based on an absolute or a relative coordinate system and are used by the HEVC decoder to spatially position the HEVC tiles into a seamless video image for display.
Fig. 5A and 5B schematically depict a spatial manifest file for streaming HEVC-tiled video data to a device according to an embodiment of the invention. In particular, Fig. 5A and 5B depict an example of an MPEG-DASH MPD defining a HEVC-tiled video stream comprising spatial segments. DASH (Dynamic Adaptive Streaming over HTTP) is a streaming protocol belonging to the family of HAS protocols. The MPD may comprise different MPD video elements 502, 504, 506 which are associated with an identifier, e.g. (part of) an URL or URI. The DASH-enabled client device (also referred to as DASH client) may use the identifier to access and retrieve the video data associated with the MPD video elements. For example, in this example, the first MPD video element 502 may be associated with at least one HEVC-tiled panorama video (a wide field-of-view video defined by the URI "full_panorama_2_4K.mp4") comprising 2x4 HEVC tiles. The second and third MPD video elements may define spatial segments within the tiled image area of the HEVC-tiled panorama video. The second MPD video element 504 may be associated with a first spatial segment defined by a first spatial segment identifier (the URI "full_panorma-left.mp4"). This first spatial segment may comprise 4 HEVC tiles (2 by 2) and may be associated with a first (left) part of the HEVC-tiled panorama video. Similarly, the third MPD video element 506 may be associated with a second spatial segment defined by a second spatial segment identifier (the URI "full_panorma-right.mp4"). This second spatial segment may comprise 4 HEVC tiles (2 by 2) and may be associated with a second (right) part of the HEVC-tiled panorama video. The spatial relationship between the MPD video elements is defined on the basis of position information, which will be described hereunder in more detail.
An MPD video element may be defined as an "AdaptationSet" attribute comprising one or more representations (different versions of the same or associated content wherein the difference may be defined by one or more encoding parameters).
A DASH client may use the information in the MPD to request video data associated with a MPD video element from the network. Furthermore, a DASH client may use information (metadata) in the MPD to configure the HEVC decoder so that it may start decoding the HEVC-tiled video data as soon as the video data are received. The information (metadata) for configuring the HEVC decoder may include the spatial relationship between the MPD video elements. To that end, the MPD author may include position information in the MPD. The position information may be defined by one or more spatial relationship descriptors (SRDs) 508, 510₁₋₅, 512₁₋₅. An SRD may be used in the EssentialProperty attribute (information that is required to be understood by the client when processing a descriptor) or a SupplementalProperty attribute (information that may be discarded by a client when processing a descriptor) in order to inform the client that a spatial relationship between the MPD video elements exists. In an embodiment, the spatial relationship descriptor (schemeIdUri "urn:mpeg:dash:srd:2014") may be used as a data structure for formatting the position information.
In an embodiment, the position information may be defined on the basis of the @value attribute 509, 511₁₋₅, 513₁₋₅, which may comprise a sequence of parameters including but not limited to:
- The source_id parameter 514 may define the set of MPD video elements (AdaptationSet or SubRepresentation) that have a spatial relationship with each other.
- The position parameters 516 x,y,w,h may define the position of a MPD video element wherein the coordinates x,y define the origin of the image region of the MPD video element and w and h define the width and height of the image region. The position parameters may be expressed in a given arbitrary unit, e.g. pixel units.
- The tuple W and H 518 define the dimension of the reference space expressed in an arbitrary unit which is the same as the x,y,w and h.
- The spatial_set_id 520 allows grouping of MPD video elements in a coherent group. Such a group of MPD video elements may e.g. be used as a resolution layer indicator. The source_id parameter "1" in the position information of the different MPD video elements indicates that the different MPD video elements are spatially related to each other.
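The parameters above can be recovered from an SRD @value string with a trivial parser (a sketch assuming the eight-field layout listed above; the example value string is illustrative, not taken from the document's figures):

```python
# Sketch: split an SRD @value string into the named parameters
# source_id, x, y, w, h, W, H, spatial_set_id.
def parse_srd(value):
    parts = [int(p) for p in value.split(",")]
    keys = ("source_id", "x", "y", "w", "h", "W", "H", "spatial_set_id")
    return dict(zip(keys, parts))

# e.g. the right half of a 4096x2160 reference space, spatial set 3
srd = parse_srd("1,2048,0,2048,2160,4096,2160,3")
```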
The first MPD video element 502 may be defined as an AdaptationSet wherein the values x,y,w,h,W,H of the SRD are set to 0, indicating that this MPD video element defines a base track of an MPEG4 stream wherein the base track comprises "extractors" (pointers) to the video data in the tracks defined in the other MPD video elements (in a similar way as described with reference to Fig. 2).
The second and third MPD video elements 504, 506 may be defined as an AdaptationSet comprising a Representation 503 and one or more SubRepresentations 505₁₋₄ (i.e. parts composing this Representation which can be linked to the concept of tracks at the container level). This way the second and third MPD video elements may define spatial segments at Representation level comprising a set of one or more HEVC tiles (in this example four HEVC tiles) that are defined at SubRepresentation level.
In an embodiment, the SubRepresentations can be also selectively requested when the range of bytes delimiting each track within a SubSegment is accessible to the client.
In an embodiment, a spatial segment may have a data format that is similar to the one depicted in Fig. 2. Each spatial segment may be stored as a separate track in the MPEG stream. The video data in a track may be encoded such that independent playout of the (temporal segments of) a spatial segment by the HEVC decoder is possible. Each tile track may comprise HEVC-encoded video data as defined by the encoder attribute "codecs" 522, which in this example refers to an "hvt1" type codec (wherein the "t" in "hvt1" refers to HEVC-tiled video data). Further, each HEVC tile in the SubRepresentation may be associated with an SRD 510₂₋₅ comprising one or more position parameters 511₂₋₅ for defining the position of the HEVC tile.
Hence, from the above it follows that the client not only uses the information in the SMF to locate delivery nodes in the network that can deliver the desired video data to the client, but also uses metadata of the HEVC-tiled video streams defined in the SMF in order to allow a client to select a particular ROI (e.g. a spatial segment) and to configure the decoder before the HEVC-tiled video data are received by the client. This metadata may include for example:
- information for determining that the selected spatial segment is related to HEVC-tiled video data (e.g. a codec attribute "hvt1");
- information for determining the number and/or size of HEVC tiles in the selected spatial segment (e.g. the number of HEVC tiles that are represented as a SubRepresentation and/or part of the position information associated with the SRDs);
- information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream (e.g. part of the position information associated with the SRDs); - information for determining whether there the video data of a HEVC tile of said spatial segment have one or more spatial decoding dependencies on other HEVC tiles in said spatial segment. The SegmentBase indexRange 5241 2 may be used in order to define a byte range (in this example bytes 0 to 7632), which allows a client to link a temporal segment number with a particular range of bytes.
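For illustration, the SRD-driven part of this metadata can be sketched as follows. This is a minimal sketch: the field layout follows the comma-separated value string of the MPEG-DASH spatial relationship descriptor used in the examples, and the helper names are hypothetical, not part of any standard API.

```python
from dataclasses import dataclass


@dataclass
class SRD:
    """Fields of an MPEG-DASH spatial relationship descriptor value string."""
    source_id: int
    x: int
    y: int
    w: int
    h: int
    W: int
    H: int
    spatial_set_id: int


def parse_srd(value: str) -> SRD:
    """Parse an SRD value string such as "1,0,0,4096,4320,8192,4320,3"."""
    return SRD(*(int(field) for field in value.split(",")))


def is_hevc_tiled(codecs: str) -> bool:
    # The example MPD signals HEVC-tiled video with an "hvt1" codecs attribute.
    return codecs.startswith("hvt1")
```

A client could run such checks on each MPD video element before deciding which spatial segment to request and how to configure the decoder.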
Fig. 6A and 6B schematically depict a data structure, in particular a spatial manifest file, for streaming HEVC-tiled video data to a device, preferably a client device, according to another embodiment of the invention. In particular Fig. 6A and 6B depict an example of an MPEG-DASH MPD defining a HEVC-tiled video stream, comprising spatial segments. In this particular case, the MPD may comprise a number of MPD video elements 602,604,606,608,610,612 wherein the spatial relationship between the different MPD video elements is described on the basis of position information in the MPD in a similar way as described above with reference to Fig. 5A and 5B.
In this particular example, a first MPD video element 602 may define a first (low- resolution) panorama stream identified by a stream identifier 614₁, in this case the URI
full_panorama-HD.mp4. The first MPD video element may be defined as an AdaptationSet wherein the values 618 x,y,w,h,W,H of the SRD 616 are used to describe the spatial position of the video data of the first panorama stream and its spatial relation with respect to the other MPD video elements. The video data in this stream may be encoded as a conventional HEVC stream.
In contrast, the other MPD video elements may form a group of MPD video elements defining a HEVC-tiled stream comprising one or more spatial segments. The grouping of these MPD video elements may be realized on the basis of the spatial_set_id in the SRDs of these MPD video elements. In this particular example, the spatial_set_ids of these video elements are set to "3", while the spatial_set_id of the first video element is set to "1".
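Assuming the SRD value string ends with the spatial_set_id (as in the examples above), this grouping could be sketched as follows; the function name and the dictionary layout of the video elements are illustrative only.

```python
def group_by_spatial_set(video_elements):
    """Group MPD video elements by the spatial_set_id carried as the last
    field of their SRD value string."""
    groups = {}
    for element in video_elements:
        spatial_set_id = int(element["srd"].split(",")[-1])
        groups.setdefault(spatial_set_id, []).append(element["id"])
    return groups
```

Elements sharing a spatial_set_id then form one HEVC-tiled composition, while the low-resolution panorama stays in its own group.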
The second MPD video element 604 may define a high-resolution HEVC-tiled panorama stream at Representation level, identified by a stream identifier 620, in this case the URI "panorama_8K-base.mp4" wherein its position is defined on the basis of an SRD 622. The values x,y,w,h,W,H of the SRD of the second MPD video element are set to 0, indicating that this MPD video element defines a base track of an MPEG4 stream wherein the base track comprises "extractors"
(pointers) to the video data in the tracks defined in the other MPD video elements in a similar way as described with reference to Fig. 2. In this particular case, each track may comprise video data of a spatial segment comprising one or more HEVC tiles.
In this example, the other four MPD video elements 606,608,610,612 define four spatial segments that are also defined at Representation level. Each spatial segment may be identified by a spatial segment identifier 624₁₋₄, and each spatial segment may be associated with an SRD 626₁₋₄ and parameter values 628₁₋₄ in order to describe the spatial position of the spatial segment and its spatial relation with respect to the other MPD video elements. The HEVC tiles in the spatial segment may be described on the basis of a number of SubRepresentations (in a similar way as in Fig. 5A) within the respective Representation of a spatial segment. Hence, the number of SubRepresentations in the Representation of the spatial segment provides the number of HEVC tiles in a spatial segment.
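This tile count can be read directly from the MPD: counting the SubRepresentation children of a Representation yields the number of HEVC tiles in the spatial segment. The sketch below works on a namespace-free XML fragment for brevity (a real MPD carries the DASH namespace).

```python
import xml.etree.ElementTree as ET


def tile_count(representation_xml: str) -> int:
    """Number of HEVC tiles in a spatial segment = number of
    SubRepresentation elements in its Representation."""
    representation = ET.fromstring(representation_xml)
    return len(representation.findall("SubRepresentation"))
```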
A spatial segment may be stored as a separate track in the MPEG stream as described with reference to Fig. 2 above. The video data in a track may be encoded such that independent playout of the (temporal segments of) a spatial segment by the decoder is possible. A spatial segment may comprise a number of HEVC tiles as shown by the encoder attribute "codecs", which refers to an "hvt1" type codec (the "t" in "hvt1" refers to tile). Further, a HEVC tile in the SubRepresentation may be associated with an SRD comprising one or more position parameters for defining the position of the HEVC tile and its position with respect to other MPD video elements defined in the MPD in a similar way as described with reference to Fig. 5A and 5B.
On the basis of the MPDs as depicted in Fig. 5A and 5B and Fig. 6A and 6B, a DASH client may request different streams on the basis of the information associated with the MPD video elements, e.g. a low-resolution panorama video stream, a HEVC-tiled high-resolution panorama video stream or one or more spatial segments comprising HEVC tiles. Further, on the basis of these MPDs, a DASH client may send metadata associated with a requested stream to the HEVC decoder in order to configure and initialize the decoder so that it is ready for decoding the video data as soon as the data are received by the client.
The SegmentBase indexRange may be used in order to define a byte range (in this example bytes 0 to 7632), which allows a client to link a temporal segment number with a particular range of bytes.
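As a sketch, such an indexRange string maps directly onto the HTTP Range header with which a client can fetch the segment index; the helper name is illustrative, not a standard API.

```python
def range_header(index_range: str) -> dict:
    """Turn a SegmentBase indexRange value such as "0-7632" into the
    HTTP Range header for fetching the segment index."""
    first_byte, last_byte = index_range.split("-")
    return {"Range": f"bytes={int(first_byte)}-{int(last_byte)}"}
```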
Fig. 7 depicts a client device for retrieving and processing HEVC-tiled video data according to one embodiment of the invention. The client device 702 may comprise a user navigation function 704 for interpreting user interaction with the (tiled) content that is processed and rendered by a media player 706. The user navigation function may be connected to a user interface that may include a touch-screen, a camera, keyboard, mouse, trackball, joystick, microphone, head tracking sensor, eye and gaze tracking, buttons or any other man-machine interface that allows manipulation (e.g. panning, zooming and/or tilting) of the displayed content.
The client device may further comprise a manifest cache 708 for receiving and storing one or more manifest files from a content provider and/or a content source in the network (e.g. a media server or a CDN). The cache may comprise one or more SMFs 710, wherein an SMF may comprise one or more spatial segments as described in detail with reference to Fig. 1-6 above.
The manifest cache may be connected to a stream selector 712 that may parse the SMF and use the information in the SMF to select one or more streams and instruct a stream processor 714 to request the one or more selected streams from the network 716 on the basis of a suitable protocol, e.g. the HTTP protocol or the like. When selecting a HEVC-tiled stream, e.g. a spatial segment, the stream processor may send information 722 (metadata) on the selected stream to configure the HEVC decoder 720. For example, when a particular spatial segment is selected from the SMF, the stream processor may send metadata associated with the spatial segment (e.g. information regarding the fact that the video data relate to HEVC-tiled video data, that the video data are defined as a spatial segment comprising a number of HEVC tiles, the position information associated with the spatial segment and its HEVC tiles, etc.). This way, the HEVC decoder may be configured and initialized for processing the requested HEVC-tiled stream before the video data have been received. Thereafter, the stream processor may use the SMF in order to request a desired spatial segment. To that end, the stream processor may send requests for the base track and one or more tracks associated with one or more spatial segments.
Once the stream processor starts receiving the requested video data, it may relay the data of the base track and video data associated with one or more spatial tracks to a buffer 718 before the data are decoded by the HEVC decoder 720 in the media player.
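The decoder-configuration step described above can be modelled as collecting a few fields from the selected SMF entry before any media is fetched. The entry layout below is hypothetical; the fields simply mirror the metadata listed earlier (codecs attribute, SubRepresentation count, SRD position values).

```python
def decoder_config(smf_entry: dict) -> dict:
    """Metadata the stream processor hands to the HEVC decoder before the
    video data of the selected spatial segment arrive."""
    source_id, x, y, w, h, W, H, spatial_set_id = smf_entry["srd"]
    return {
        "hevc_tiled": smf_entry["codecs"].startswith("hvt1"),
        "tile_count": len(smf_entry["sub_representations"]),
        "segment_position": (x, y, w, h),   # within the full tiled image area
        "full_canvas": (W, H),              # size of the full panorama
    }
```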
Fig. 8 depicts a HEVC-tiled streaming process according to an embodiment of the invention. In this particular example, the video data may be distributed by a so-called content delivery network (CDN) to clients (i.e. client devices) using a HAS protocol. The process may start with a client requesting and receiving a spatial manifest file (SMF) from a content provider CP (steps 802 and 804). The SMF may for example relate to an MPD as described with reference to Fig. 5A and 5B defining a HEVC-tiled panorama video (2 x 4 HEVC tiles) with two spatial segments: a first spatial segment "full_panorama-left.MP4" with 2 x 2 HEVC tiles defining the left sub-region of the full panorama and a second spatial segment "full_panorama-right.MP4" with 2 x 2 HEVC tiles defining the right sub-region of the full panorama.
The client, in particular the stream selector in the client, may parse the SMF and select e.g. a spatial segment identifier, e.g. "full_panorama-left.MP4", and the associated base track "full_panorama_2_4K-base.mp4" (step 806). Further, it may retrieve metadata associated with the selected segment (e.g. information regarding the fact that the selected video data relate to HEVC-tiled video data, that the selected video data are defined as a spatial segment comprising a number of HEVC tiles, the position information associated with the spatial segment and its HEVC tiles, etc.).
Thereafter the client may send a request for the base track to the network. In particular, the client may send a request message (step 808), e.g. an HTTP GET message, comprising an identifier of the base track (e.g. a URL) to a so-called request routing (RR) node of the CDN. The request routing node may locate the delivery node (e.g. a media server) on which the data of the requested base track is stored and send the URL of the located delivery node in a redirect message back to the client (step 810).
The client may use the URL for sequentially requesting the temporal segments of the selected first spatial segment as identified in the SMF. Thus, after having received the redirect message, the client may send a request message to the delivery node that is configured to deliver the base track to the client (step 812). In response, the delivery node may send the base track to the client (step 814). The data of the base track may be buffered and sent to the HEVC decoder, while the retrieval process for the temporal segments comprising the video data of the first spatial segment is continued by sending a request for the first temporal segment "full_panorama-left_1.MP4" of the first spatial segment to the network (step 816). The client may receive the first temporal segment in a response message (step 818) and start decoding and rendering the video on the basis of the information in the base track (step 820). This process may be repeated for subsequent temporal segments of the first spatial segment.
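The sequential retrieval of temporal segments can be sketched as generating the numbered segment URLs from the delivery-node base URL obtained in the redirect. The URL scheme and template below are illustrative only; the numbering follows the example names in the process above.

```python
def temporal_segment_urls(node_base_url: str, template: str, count: int) -> list:
    """Numbered temporal-segment URLs for a spatial segment, following the
    naming in the example ("full_panorama-left_1.MP4", "..._2.MP4", ...)."""
    return [f"{node_base_url}/{template.format(n)}" for n in range(1, count + 1)]
```

A client would issue one HTTP GET per generated URL, in order, buffering each response as it arrives.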
After a certain time, e.g. during the retrieval of the sixth temporal segment (steps 826 and 828), the client device may be triggered by the user navigation function to switch from a first spatial segment to a second spatial segment (step 830). For example, the user navigation function may detect a user interaction that is interpreted by the user navigation function as a panning action. The stream selector in the client device may parse the SMF and select a second spatial segment (step 832) defining the right sub-region of the full panorama, configure the decoder on the basis of the metadata in the SMF of the second spatial segment and start requesting temporal segments of the second spatial segment from the CDN (step 834) in a similar way as illustrated by steps 816-828.
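The pan-triggered switch can be modelled as picking the spatial segment whose SRD region contains the new viewport centre. This is a minimal sketch; the segment names and coordinates follow the 2 x 4 tile example, and the region layout is an assumption for illustration.

```python
def select_spatial_segment(segments: dict, centre_x: int, centre_y: int):
    """Return the name of the spatial segment whose (x, y, w, h) region
    contains the viewport centre, or None if it falls outside all segments."""
    for name, (x, y, w, h) in segments.items():
        if x <= centre_x < x + w and y <= centre_y < y + h:
            return name
    return None
```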
Fig. 9 schematically depicts a process for generating HEVC-tiled video data according to an embodiment of the invention. In this example, one or more cameras 902, e.g. one or more high-resolution, wide field-of-view cameras, may be used to generate or compose a source video. An HEVC encoder 904 may be used to generate one or more HEVC-tiled streams on the basis of the source video. The HEVC-tiled streams may define a Spatial Composition on the basis of the source video. The Spatial Composition may comprise one or more Spatial Representations 908₁,₂. In an embodiment, a Spatial Representation may relate to a panorama HEVC-tiled stream comprising a particular number of HEVC tiles per video frame. Further, for one or more Spatial Representations, one or more Spatial Segments 910₁₋₃ may be generated, wherein a spatial segment may relate to a HEVC-tiled stream comprising a subset of HEVC tiles of a HEVC-tiled video stream. During the encoding process, information (metadata) on the generated video data (identifiers and spatial and temporal information of a set of streams) may be formatted in an SMF 912 as described with reference to Fig. 4-6.
The thus generated one or more (HEVC-tiled) streams and SMFs may be stored at one or more delivery nodes 922₁,₂ in the network 916. A delivery node is configured to deliver a HEVC-tiled stream to a client device 924.
In an embodiment, a delivery node may be a media server. In another embodiment, at least part of the delivery nodes (sometimes also referred to as surrogate nodes) may be part of a dedicated content delivery network (CDN). In that case, HEVC-tiled streams may be ingested by a Content Delivery Network Control Function 920 (CDNCF), sometimes also referred to as a request routing node. The CDNCF may distribute the HEVC-tiled streams over different delivery nodes so that efficient distribution of the streams is ensured. In an embodiment, the CDN may update the tile (and segment) identifiers (the URLs) in the MPD such that a client device may efficiently access delivery nodes of the CDN in order to request the delivery of HEVC-tiled content.
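The URL update performed by the CDNCF can be sketched as rewriting each segment identifier in a (much simplified) SMF so that it points at the delivery node chosen for it. The dictionary structure and the node-assignment callback are illustrative assumptions, not the actual manifest format.

```python
def rewrite_segment_urls(smf: dict, node_for_segment) -> dict:
    """Return a copy of the (simplified) SMF in which every segment
    identifier is prefixed with the base URL of its assigned delivery node."""
    return {
        **smf,
        "segments": [f"{node_for_segment(seg)}/{seg}" for seg in smf["segments"]],
    }
```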
When the client device 924 would like to access the video data, it may be provided with an SMF from a content provider or the CDN and use the SMF to request and playout HEVC-tiled video data. Here, the client device may generally relate to a (mobile) video data processing device such as an electronic tablet, a smart-phone, a notebook, a media player, a home gateway or a DASH-enabled device such as a DASH-enabled HbbTV display device. Alternatively, the device may be a set-top box or content storage device configured for processing and temporarily storing content for future consumption by a content play-out device, which has access to the stored content.
The delivery of the video data to the client device may be based on any data transmission scheme. For example, a unicast scheme may be used to transmit data from a delivery node to the client. Alternatively, a broadcast or a multicast scheme (e.g. an IP multicast scheme) may be used to transmit the data to the client.
Fig. 10 is a block diagram illustrating an exemplary data processing system that may be used in systems and methods as described with reference to Fig. 1-9. Data processing system 1000 may include at least one processor 1002 coupled to memory elements 1004 through a system bus 1006. As such, the data processing system may store program code within memory elements 1004. Further, processor 1002 may execute the program code accessed from memory elements 1004 via system bus 1006. In one aspect, the data processing system may be implemented as a computer that is suitable for storing and/or executing program code. It should be appreciated, however, that data processing system 1000 may be implemented in the form of any system including a processor and memory that is capable of performing the functions described within this specification.
Memory elements 1004 may include one or more physical memory devices such as, for example, local memory 1008 and one or more bulk storage devices 1010. Local memory may refer to random access memory or other non-persistent memory device(s) generally used during actual execution of the program code. A bulk storage device may be implemented as a hard drive or other persistent data storage device. The processing system 1000 may also include one or more cache memories (not shown) that provide temporary storage of at least some program code in order to reduce the number of times program code must be retrieved from bulk storage device 1010 during execution.
Input/output (I/O) devices depicted as input device 1012 and output device 1014 optionally can be coupled to the data processing system. Examples of input devices may include, but are not limited to, for example, a keyboard, a pointing device such as a mouse, or the like. Examples of output devices may include, but are not limited to, for example, a monitor or display, speakers, or the like. Input device and/or output device may be coupled to the data processing system either directly or through intervening I/O controllers. A network adapter 1016 may also be coupled to the data processing system to enable it to become coupled to other systems, computer systems, remote network devices, and/or remote storage devices through intervening private or public networks. The network adapter may comprise a data receiver for receiving data that is transmitted by said systems, devices and/or networks to said data processing system and a data transmitter for transmitting data to said systems, devices and/or networks. Modems, cable modems, and Ethernet cards are examples of the different types of network adapter that may be used with data processing system 1000.
As pictured in Fig. 10, memory elements 1004 may store an application 1018. It should be appreciated that data processing system 1000 may further execute an operating system (not shown) that can facilitate execution of the application. The application, being implemented in the form of executable program code, can be executed by data processing system 1000, e.g., by processor 1002. Responsive to executing the application, the data processing system may be configured to perform one or more operations to be described herein in further detail.
In one aspect, for example, data processing system 1000 may represent a client data processing system. In that case, application 1018 may represent a client application that, when executed, configures data processing system 1000 to perform the various functions described herein with reference to a "client", which, for the purpose of this application, is sometimes also referred to as a "client device". Examples of a client, a.k.a. client device, may include, but are not limited to, a personal computer, a portable computer, a mobile phone, or the like.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Claims

1. Method of streaming HEVC-tiled video data to a client device comprising:
providing said client device with a spatial manifest file comprising one or more spatial segments identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segments identifiers, preferably (part of) one or more URLs, to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset, preferably a plurality, of HEVC tiles of a HEVC-tiled (panorama) video stream, preferably said spatial manifest file further comprising information for determining the position of at least part of said one or more spatial segments and/or the position of HEVC tiles in a spatial segment within the tiled image area of said HEVC-tiled video stream; and,
selecting, preferably by said client device, a spatial segment identifier of said spatial manifest file for requesting a delivery node to deliver at least part of the video data of a spatial segment as a HEVC-tiled video stream to said client device.
2. Method according to claim 1 further comprising:
using, preferably by said client device, said selected spatial segment identifier for sending a request, preferably an HTTP request, to said delivery node for delivering video data associated with said spatial segment to said client device.
3. Method according to claims 1 or 2 wherein said spatial segment is defined by segment boundaries that coincide with the HEVC tile boundaries in a row and column direction of said HEVC-tiled video stream, preferably said segment boundaries enclosing a rectangular area comprising an integer number, preferably a plurality, of HEVC tiles (a subset of HEVC tiles), preferably said number of HEVC tiles being smaller than the number of HEVC tiles in said HEVC-tiled video stream.
4. Method according to any of claims 1-3 wherein video data of at least part of said HEVC tiles in said spatial segment are decoded in parallel by a HEVC decoder.
5. Method according to any of claims 1-4 wherein video data of HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependency with video data of HEVC tiles in said HEVC-tiled video stream that are not part of the spatial segment.
6. Method according to any of claims 1-5 wherein video data of the HEVC tiles in said spatial segment do not have spatial and/or temporal decoding dependency; or, wherein video data of at least part of said HEVC tiles in said spatial segment have spatial and/or temporal decoding dependency.
7. Method according to any of claims 1-6 wherein said spatial manifest file further comprises one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment.
8. Method according to any of claims 1-7 wherein said spatial manifest file further comprises metadata associated with said selected spatial segment, said metadata including at least one of:
information for determining that the selected spatial segment is related to HEVC-tiled video data;
information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream;
information for determining whether the video data of a HEVC tile of said spatial segment have one or more spatial decoding dependencies on other HEVC tiles in said spatial segment.
9. Method according to any of claims 1-8
wherein video data associated with a spatial segment are stored as separate video track, preferably an MPEG-type file, said video data being accessible by said client device on the basis of spatial segment identifiers and/or HEVC tile identifiers.
10. Client device for processing HEVC-tiled video data, said client device being configured for:
parsing a spatial manifest file comprising one or more spatial segments identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segments identifiers to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset, preferably a plurality, of HEVC tiles of a HEVC-tiled video stream; preferably said spatial manifest file further comprising information for determining the position of at least part of said one or more spatial segments and/or the position of HEVC tiles in a spatial segment within the tiled image area of said HEVC-tiled video stream; and,
selecting a spatial segment identifier in said spatial manifest file for requesting a delivery node to deliver at least part of the video data of a spatial segment as a HEVC-tiled video stream to the device; and, optionally, using said selected spatial segment identifier for sending a request, preferably an HTTP request, to said delivery node for delivering video data associated with said spatial segment to said device.
11. Non-transitory computer-readable storage medium comprising a recording area for storing video data, preferably HEVC-tiled video data, said recording area comprising:
video data associated with one or more spatial segments, a spatial segment comprising a subset of HEVC tiles of a HEVC-tiled video stream, the video data of said one or more spatial segments being accessible on the basis of one or more spatial segment identifiers, and, optionally, video data of HEVC tiles in a spatial segment being accessible on the basis of one or more HEVC tile identifiers.
12. Non-transitory computer-readable storage medium for storing video data according to claim 11, wherein video data of said one or more spatial segments are stored as separate video tracks in said recording area; and, optionally, wherein said recording area further comprises at least one base track comprising one or more extractors, an extractor pointing to a video track.
13. Non-transitory computer-readable storage medium comprising a stored data structure, preferably a spatial manifest file, and preferably for use by a client device according to claim 10, said data structure comprising:
one or more spatial segments identifiers for locating one or more delivery nodes configured for delivering video data associated with a spatial segment identified by at least one of said one or more spatial segments identifiers, preferably (part of) one or more URLs, to said client device, a spatial segment being associated with HEVC-tiled video data comprising a subset of HEVC tiles of a
HEVC-tiled video stream; and, optionally, one or more HEVC tile identifiers, preferably (part of) one or more URLs, for locating one or more delivery nodes configured for delivering video data associated with at least one HEVC tile of the subset of HEVC tiles of a spatial segment.
14. Non-transitory computer-readable storage medium according to claim 13 further comprising metadata associated with said selected spatial segment, said metadata including at least one of:
information for determining that the selected spatial segment is related to HEVC-tiled video data;
information for determining the number and/or size of HEVC-tiles in the selected spatial segment; information for determining the position of the spatial segment and/or the position of the HEVC tiles in said spatial segment within the tiled image area of said HEVC-tiled (panorama) video stream;
information for determining whether the video data of a HEVC tile of said spatial segment have one or more spatial decoding dependencies on other HEVC tiles in said spatial segment.
15. A computer program product comprising software code portions configured for, when run in the memory of a computer, executing the method steps according to any of claims 1-9.
PCT/EP2015/064527 2014-06-27 2015-06-26 Hevc-tiled video streaming WO2015197818A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/318,619 US10694192B2 (en) 2014-06-27 2015-06-26 HEVC-tiled video streaming
EP15734102.5A EP3162075B1 (en) 2014-06-27 2015-06-26 Hevc-tiled video streaming

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP14174761.8 2014-06-27
EP14174761 2014-06-27

Publications (1)

Publication Number Publication Date
WO2015197818A1 true WO2015197818A1 (en) 2015-12-30

Family

ID=51059303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/064527 WO2015197818A1 (en) 2014-06-27 2015-06-26 Hevc-tiled video streaming

Country Status (3)

Country Link
US (1) US10694192B2 (en)
EP (1) EP3162075B1 (en)
WO (1) WO2015197818A1 (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017140948A1 (en) * 2016-02-17 2017-08-24 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
WO2017140945A1 (en) * 2016-02-17 2017-08-24 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
EP3223524A1 (en) * 2016-03-22 2017-09-27 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
EP3249928A1 (en) * 2016-05-23 2017-11-29 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
EP3249929A1 (en) * 2016-05-25 2017-11-29 Thomson Licensing Method and network equipment for establishing a manifest
WO2017202699A1 (en) * 2016-05-23 2017-11-30 Canon Kabushiki Kaisha Method, device, and computer program for adaptive streaming of virtual reality media content
WO2017205504A1 (en) * 2016-05-24 2017-11-30 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over http
WO2017202899A1 (en) * 2016-05-25 2017-11-30 Koninklijke Kpn N.V. Spatially tiled omnidirectional video streaming
EP3293981A1 (en) 2016-09-08 2018-03-14 Koninklijke KPN N.V. Partial video decoding method, device and system
EP3301951A1 (en) 2016-09-30 2018-04-04 Koninklijke KPN N.V. Audio object processing based on spatial listener information
WO2018071666A1 (en) * 2016-10-12 2018-04-19 Arris Enterprises Llc Coding schemes for virtual reality (vr) sequences
WO2019002662A1 (en) 2017-06-26 2019-01-03 Nokia Technologies Oy An apparatus, a method and a computer program for omnidirectional video
EP3457706A4 (en) * 2016-05-13 2019-03-20 Sony Corporation File generation device and file generation method, and reproduction device and reproduction method
CN109565616A (en) * 2016-08-22 2019-04-02 谷歌有限责任公司 Interactive video multi-screen experience on a cellular telephone
CN109691103A (en) * 2016-07-14 2019-04-26 皇家Kpn公司 Video coding
CN110024400A (en) * 2016-12-07 2019-07-16 高通股份有限公司 The system and method that the signal of region of interest is sent
US10397666B2 (en) 2014-06-27 2019-08-27 Koninklijke Kpn N.V. Determining a region of interest on the basis of a HEVC-tiled video stream
JP2019524004A (en) * 2016-05-23 2019-08-29 キヤノン株式会社 Method, device and computer program for improving streaming of virtual reality media content
WO2019226369A1 (en) * 2018-05-25 2019-11-28 Microsoft Technology Licensing, Llc Adaptive panoramic video streaming using overlapping partitioned sections
US10659815B2 (en) 2018-03-08 2020-05-19 At&T Intellectual Property I, L.P. Method of dynamic adaptive streaming for 360-degree videos
US10674185B2 (en) 2015-10-08 2020-06-02 Koninklijke Kpn N.V. Enhancing a region of interest in video frames of a video stream
US10694192B2 (en) 2014-06-27 2020-06-23 Koninklijke Kpn N.V. HEVC-tiled video streaming
WO2020130912A1 (en) * 2018-12-20 2020-06-25 Telefonaktiebolaget Lm Ericsson (Publ) Improved tile address signalling in video encoding and decoding
US10715843B2 (en) 2015-08-20 2020-07-14 Koninklijke Kpn N.V. Forming one or more tile streams on the basis of one or more video streams
US10721530B2 (en) 2013-07-29 2020-07-21 Koninklijke Kpn N.V. Providing tile video streams to a client
US10764494B2 (en) 2018-05-25 2020-09-01 Microsoft Technology Licensing, Llc Adaptive panoramic video streaming using composite pictures
US10762710B2 (en) 2017-10-02 2020-09-01 At&T Intellectual Property I, L.P. System and method of predicting field of view for immersive video streaming
US10805614B2 (en) 2016-10-12 2020-10-13 Koninklijke Kpn N.V. Processing spherical video data on the basis of a region of interest
EP3761647A1 (en) 2019-07-05 2021-01-06 Tiledmedia B.V. Methods and devices for rendering a video on a display
US11523185B2 (en) 2019-06-19 2022-12-06 Koninklijke Kpn N.V. Rendering video stream in sub-area of visible display area

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8718445B1 (en) 2013-09-03 2014-05-06 Penthera Partners, Inc. Commercials on mobile devices
US9244916B2 (en) * 2013-10-01 2016-01-26 Penthera Partners, Inc. Downloading media objects
US10015527B1 (en) * 2013-12-16 2018-07-03 Amazon Technologies, Inc. Panoramic video distribution and viewing
WO2017036953A1 (en) * 2015-09-02 2017-03-09 Thomson Licensing Method, apparatus and system for facilitating navigation in an extended scene
KR102301352B1 (en) 2016-02-09 2021-09-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Concept for picture/video data streams allowing efficient reducibility or efficient random access
US10999602B2 (en) 2016-12-23 2021-05-04 Apple Inc. Sphere projected motion estimation/compensation and mode decision
US11259046B2 (en) 2017-02-15 2022-02-22 Apple Inc. Processing of equirectangular object data to compensate for distortion by spherical projections
US10924747B2 (en) 2017-02-27 2021-02-16 Apple Inc. Video coding techniques for multi-view video
JP6843655B2 (en) * 2017-03-09 2021-03-17 Canon Inc. Transmitter, receiver, information processing method and program
US11093752B2 (en) 2017-06-02 2021-08-17 Apple Inc. Object tracking in multi-view video
GB2563439B (en) * 2017-06-16 2022-02-16 Canon Kk Methods, devices, and computer programs for improving streaming of portions of media data
GB2563865A (en) * 2017-06-27 2019-01-02 Canon Kk Method, device, and computer program for transmitting media content
US20190005709A1 (en) * 2017-06-30 2019-01-03 Apple Inc. Techniques for Correction of Visual Artifacts in Multi-View Images
US10754242B2 (en) 2017-06-30 2020-08-25 Apple Inc. Adaptive resolution and projection format in multi-direction video
WO2019009473A1 (en) * 2017-07-04 2019-01-10 LG Electronics Inc. Area-based processing method and apparatus for 360-degree video
EP3454566B1 (en) 2017-09-11 2021-05-05 Tiledmedia B.V. Streaming frames of spatial elements to a client device
US11025919B2 (en) * 2017-10-03 2021-06-01 Koninklijke Kpn N.V. Client-based adaptive streaming of nonlinear media
WO2019107175A1 (en) * 2017-11-30 2019-06-06 Sony Corporation Transmission device, transmission method, reception device, and reception method
EP3721636A1 (en) 2017-12-07 2020-10-14 Koninklijke KPN N.V. Method for adaptive streaming of media
EP3503561A1 (en) * 2017-12-19 2019-06-26 Advanced Digital Broadcast S.A. System and method for optimization of video bitrate
EP3759922A1 (en) * 2018-04-03 2021-01-06 Huawei Technologies Co. Ltd. Error mitigation in sub-picture bitstream based viewport dependent video coding
US10812828B2 (en) 2018-04-10 2020-10-20 At&T Intellectual Property I, L.P. System and method for segmenting immersive video
CN112771884B (en) 2018-04-13 2023-02-10 Huawei Technologies Co., Ltd. Immersive media metrics for virtual reality content with multiple positions
US10419738B1 (en) 2018-06-14 2019-09-17 Telefonaktiebolaget Lm Ericsson (Publ) System and method for providing 360° immersive video based on gaze vector information
WO2020008106A1 (en) * 2018-07-02 2020-01-09 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
US10841662B2 (en) 2018-07-27 2020-11-17 Telefonaktiebolaget Lm Ericsson (Publ) System and method for inserting advertisement content in 360° immersive video
CN112585978B (en) 2018-07-30 2023-07-25 Koninklijke KPN N.V. Generating a composite video stream for display in VR
JP2021192470A (en) * 2018-09-07 2021-12-16 Sony Group Corporation Content distribution system and content distribution method, and program
US10757389B2 (en) * 2018-10-01 2020-08-25 Telefonaktiebolaget Lm Ericsson (Publ) Client optimization for providing quality control in 360° immersive video during pause
US11924442B2 (en) 2018-11-20 2024-03-05 Koninklijke Kpn N.V. Generating and displaying a video stream by omitting or replacing an occluded part
CN116743997A (en) 2019-12-27 2023-09-12 阿里巴巴(中国)有限公司 Method and apparatus for signaling sub-image division information
KR20210107409A (en) 2020-02-24 2021-09-01 Samsung Electronics Co., Ltd. Method and apparatus for transmitting video content using edge computing service
CN111614975B (en) * 2020-05-08 2022-07-12 深圳拙河科技有限公司 Hundred million-level pixel video playing method, device, medium and equipment
EP4109917A1 (en) 2021-06-24 2022-12-28 Tiledmedia B.V. Method and system for requesting tile streams
EP4138401A1 (en) * 2021-08-17 2023-02-22 Nokia Technologies Oy A method, an apparatus and a computer program product for video encoding and video decoding

Family Cites Families (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2411852A1 (en) 2000-06-09 2001-12-13 Imove, Inc. Streaming panoramic video
JP2005142654A (en) 2003-11-04 2005-06-02 Matsushita Electric Ind Co Ltd Video transmitting apparatus and video receiving apparatus
US7440626B2 (en) 2004-12-02 2008-10-21 Mitsubishi Electric Research Laboratories, Inc. Image transcoding
US7480701B2 (en) 2004-12-15 2009-01-20 Microsoft Corporation Mixed-media service collections for multimedia platforms
US7894531B1 (en) 2005-02-15 2011-02-22 Grandeye Ltd. Method of compression for wide angle digital video
FR2884027B1 (en) 2005-04-04 2007-06-01 Canon Kk METHOD AND DEVICE FOR TRANSMITTING AND RECEIVING IMAGE SEQUENCES BETWEEN A SERVER AND A CLIENT
EP3343904A1 (en) 2006-09-29 2018-07-04 Rovi Guides, Inc. Systems and methods for a modular media guidance dashboard application
WO2008088741A2 (en) 2007-01-12 2008-07-24 Ictv, Inc. Interactive encoded content system including object models for viewing on a remote device
JP2010532628A (en) 2007-06-29 2010-10-07 トムソン ライセンシング Apparatus and method for reducing artifacts in images
KR101488548B1 (en) 2007-06-29 2015-02-02 Thomson Licensing Video indexing method, and video indexing device
US20090300692A1 (en) 2008-06-02 2009-12-03 Mavlankar Aditya A Systems and methods for video streaming and display
CN101742324A (en) 2008-11-14 2010-06-16 北京中星微电子有限公司 Video encoding and decoding methods, video encoding and decoding systems and encoder-decoder
US20100232504A1 (en) 2009-03-13 2010-09-16 The State of Oregon acting by and through the State Board of Higher Education on behalf of the Supporting region-of-interest cropping through constrained compression
JP5443299B2 (en) 2010-08-26 2014-03-19 Nippon Telegraph and Telephone Corporation Information transmission/reception system and information transmission/reception method
WO2013021656A1 (en) 2011-08-11 2013-02-14 Panasonic Corporation Playback device, playback method, integrated circuit, broadcasting system, and broadcasting method
US10349077B2 (en) 2011-11-21 2019-07-09 Canon Kabushiki Kaisha Image coding apparatus, image coding method, image decoding apparatus, image decoding method, and storage medium
JP6295951B2 (en) 2012-06-25 2018-03-20 Sony Corporation Image decoding apparatus and image decoding method
WO2014025319A1 (en) 2012-08-08 2014-02-13 National University Of Singapore System and method for enabling user control of live video stream(s)
GB2505912B (en) 2012-09-14 2015-10-07 Canon Kk Method and device for generating a description file, and corresponding streaming method
TWI620435B (en) 2012-09-18 2018-04-01 Vid Scale, Inc. Method and apparatus for region of interest video coding using tiles and tile groups
GB2513139A (en) 2013-04-16 2014-10-22 Canon Kk Method and corresponding device for streaming video data
CN104704827B (en) 2012-11-13 2019-04-12 英特尔公司 Content-adaptive transform decoding for next-generation video
KR102539065B1 (en) 2013-01-04 2023-06-01 GE Video Compression, LLC Efficient scalable coding concept
GB2509954B (en) 2013-01-18 2016-03-23 Canon Kk Method of displaying a region of interest in a video stream
US9749627B2 (en) 2013-04-08 2017-08-29 Microsoft Technology Licensing, Llc Control data for motion-constrained tile set
JP6269813B2 (en) 2013-04-08 2018-01-31 Sony Corporation Region of interest scalability in SHVC
GB2513140B (en) 2013-04-16 2016-05-04 Canon Kk Methods, devices, and computer programs for streaming partitioned timed media data
GB2513303B (en) 2013-04-16 2017-06-07 Canon Kk Method and device for partitioning an image
JP6419173B2 (en) 2013-07-12 2018-11-07 Canon Inc. An adaptive data streaming method with push message control
RU2671946C2 (en) 2013-07-19 2018-11-08 Sony Corporation Information processing device and method
GB2516825B (en) 2013-07-23 2015-11-25 Canon Kk Method, device, and computer program for encapsulating partitioned timed media data using a generic signaling for coding dependencies
GB2516826B (en) 2013-07-23 2016-06-22 Canon Kk Method, device and computer program for encapsulating partitioned timed media data by creating tracks to be independently encapsulated in at least one media file
EP2973228B1 (en) 2013-07-26 2019-08-28 Huawei Technologies Co., Ltd. Spatial adaptation in adaptive streaming
US10721530B2 (en) 2013-07-29 2020-07-21 Koninklijke Kpn N.V. Providing tile video streams to a client
US20150095450A1 (en) 2013-09-30 2015-04-02 Qualcomm Incorporated Utilizing multiple switchable adaptation sets for streaming media data
GB2519746B (en) 2013-10-22 2016-12-14 Canon Kk Method, device and computer program for encapsulating scalable partitioned timed media data
EP3092806A4 (en) 2014-01-07 2017-08-23 Nokia Technologies Oy Method and apparatus for video coding and decoding
US10542274B2 (en) 2014-02-21 2020-01-21 Microsoft Technology Licensing, Llc Dictionary encoding and decoding of screen content
US10182241B2 (en) 2014-03-04 2019-01-15 Microsoft Technology Licensing, Llc Encoding strategies for adaptive switching of color spaces, color sampling rates and/or bit depths
US20150264404A1 (en) 2014-03-17 2015-09-17 Nokia Technologies Oy Method and apparatus for video coding and decoding
GB2524726B (en) 2014-03-25 2018-05-23 Canon Kk Image data encapsulation with tile support
GB2524531B (en) * 2014-03-25 2018-02-07 Canon Kk Methods, devices, and computer programs for improving streaming of partitioned timed media data
JP6440747B2 (en) 2014-06-27 2018-12-19 Koninklijke KPN N.V. Region of interest determination based on HEVC tiled video stream
US10694192B2 (en) 2014-06-27 2020-06-23 Koninklijke Kpn N.V. HEVC-tiled video streaming
GB2564731B (en) 2014-10-14 2019-05-29 Canon Kk Description of image composition with HEVC still image file format
US20180242028A1 (en) 2015-08-20 2018-08-23 Koninklijke Kpn N.V. Forming A Tiled Video On The Basis Of Media Streams
WO2017029400A1 (en) 2015-08-20 2017-02-23 Koninklijke Kpn N.V. Forming one or more tile streams on the basis of one or more video streams
US10674185B2 (en) 2015-10-08 2020-06-02 Koninklijke Kpn N.V. Enhancing a region of interest in video frames of a video stream
US10542258B2 (en) 2016-01-25 2020-01-21 Google Llc Tile copying for video compression
GB2550912B (en) 2016-05-27 2019-09-04 Canon Kk Method, device and computer program for encapsulating and parsing timed media data

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012168356A1 (en) * 2011-06-08 2012-12-13 Koninklijke Kpn N.V. Locating and retrieving segmented content
WO2012168365A1 (en) 2011-06-08 2012-12-13 Koninklijke Kpn N.V. Spatially-segmented content delivery
WO2013063094A1 (en) * 2011-10-24 2013-05-02 Qualcomm Incorporated Grouping of tiles for video coding
WO2014057131A1 (en) * 2012-10-12 2014-04-17 Canon Kabushiki Kaisha Method and corresponding device for streaming video data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAN YE ET AL: "ROI tile sections", 102. MPEG MEETING; 15-10-2012 - 19-10-2012; SHANGHAI; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. m26587, 10 October 2012 (2012-10-10), XP030054920 *

Cited By (91)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10721530B2 (en) 2013-07-29 2020-07-21 Koninklijke Kpn N.V. Providing tile video streams to a client
US10397666B2 (en) 2014-06-27 2019-08-27 Koninklijke Kpn N.V. Determining a region of interest on the basis of a HEVC-tiled video stream
US10694192B2 (en) 2014-06-27 2020-06-23 Koninklijke Kpn N.V. HEVC-tiled video streaming
US10715843B2 (en) 2015-08-20 2020-07-14 Koninklijke Kpn N.V. Forming one or more tile streams on the basis of one or more video streams
US10674185B2 (en) 2015-10-08 2020-06-02 Koninklijke Kpn N.V. Enhancing a region of interest in video frames of a video stream
US11323723B2 (en) 2016-02-17 2022-05-03 Nokia Technologies Oy Apparatus, a method and a computer program for video coding and decoding
CN108702503A (en) * 2016-02-17 2018-10-23 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
WO2017140945A1 (en) * 2016-02-17 2017-08-24 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
KR102089457B1 (en) * 2016-02-17 2020-03-17 Nokia Technologies Oy Apparatus, method and computer program for video coding and decoding
WO2017140948A1 (en) * 2016-02-17 2017-08-24 Nokia Technologies Oy An apparatus, a method and a computer program for video coding and decoding
KR20180113584A (en) * 2016-02-17 2018-10-16 Nokia Technologies Oy Apparatus, method and computer program for video coding and decoding
RU2733218C2 (en) * 2016-03-22 2020-09-30 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting an immersive video for legacy and immersive rendering devices
EP3223524A1 (en) * 2016-03-22 2017-09-27 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
WO2017162479A1 (en) * 2016-03-22 2017-09-28 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
JP7177034 2016-03-22 2022-11-22 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting immersive video for legacy and immersive rendering devices
US10958950B2 (en) 2016-03-22 2021-03-23 Interdigital Vc Holdings, Inc. Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
JP2019514313A (en) * 2016-03-22 2019-05-30 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting immersive video for legacy and immersive rendering devices
KR20190029505A (en) * 2016-03-22 2019-03-20 InterDigital VC Holdings, Inc. Method, apparatus, and stream for formatting immersive video for legacy and immersive rendering devices
CN109314791A (en) * 2016-03-22 2019-02-05 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting an immersive video for legacy and immersive rendering devices
KR102308604B1 (en) * 2016-03-22 2021-10-06 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting immersive video for legacy and immersive rendering devices
EP3457706A4 (en) * 2016-05-13 2019-03-20 Sony Corporation File generation device and file generation method, and reproduction device and reproduction method
GB2578227A (en) * 2016-05-23 2020-04-22 Canon Kk Method, device, and computer program for adaptive streaming of virtual reality media content
US10523980B2 (en) 2016-05-23 2019-12-31 Interdigital Vc Holdings, Inc. Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
WO2017202699A1 (en) * 2016-05-23 2017-11-30 Canon Kabushiki Kaisha Method, device, and computer program for adaptive streaming of virtual reality media content
RU2742344C2 (en) * 2016-05-23 2021-02-04 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting an immersive video for legacy and immersive rendering devices
CN109155874A (en) * 2016-05-23 2019-01-04 Canon Inc. Method, device, and computer program for adaptive streaming of virtual reality media content
KR20190008325A (en) * 2016-05-23 2019-01-23 Canon Kabushiki Kaisha Method, device, and computer program for adaptive streaming of virtual reality media content
JP7223106 2016-05-23 2023-02-15 Canon Inc. Method, device and computer program for adaptive streaming of virtual reality media content
KR102247399B1 (en) * 2016-05-23 2021-05-03 Canon Kabushiki Kaisha Method, device, and computer program for adaptive streaming of virtual reality media content
EP3249928A1 (en) * 2016-05-23 2017-11-29 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
CN107454468A (en) * 2016-05-23 2017-12-08 Thomson Licensing Method, apparatus and stream for formatting an immersive video
RU2711591C1 (en) * 2016-05-23 2020-01-17 Canon Kabushiki Kaisha Method, apparatus and computer program for adaptive streaming of virtual reality multimedia content
JP2022031346A (en) * 2016-05-23 2022-02-18 Canon Inc. Method, device, and computer program for adaptively streaming virtual reality media content
EP3249930A1 (en) * 2016-05-23 2017-11-29 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
JP2019519149A (en) * 2016-05-23 2019-07-04 Canon Inc. Method, device and computer program for adaptive streaming of virtual reality media content
KR102307819B1 (en) * 2016-05-23 2021-10-05 InterDigital VC Holdings, Inc. Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
KR20170132098A (en) * 2016-05-23 2017-12-01 Thomson Licensing Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
GB2578227B (en) * 2016-05-23 2021-09-15 Canon Kk Method, device, and computer program for adaptive streaming of virtual reality media content
JP2019524004A (en) * 2016-05-23 2019-08-29 Canon Inc. Method, device and computer program for improving streaming of virtual reality media content
CN107454468B (en) * 2016-05-23 2021-09-14 InterDigital VC Holdings, Inc. Method, apparatus and stream for formatting immersive video
JP2019521584A (en) * 2016-05-24 2019-07-25 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
US11375291B2 (en) 2016-05-24 2022-06-28 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
US10587934B2 (en) 2016-05-24 2020-03-10 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
CN109076238B (en) * 2016-05-24 2022-08-05 Qualcomm Incorporated Signaling virtual reality video in dynamic adaptive streaming over HTTP
KR20190014500A (en) * 2016-05-24 2019-02-12 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
KR102534899B1 (en) * 2016-05-24 2023-05-22 Qualcomm Incorporated Virtual Reality Video Signaling in Dynamic Adaptive Streaming over HTTP
CN109076238A (en) * 2016-05-24 2018-12-21 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over HTTP
WO2017205504A1 (en) * 2016-05-24 2017-11-30 Qualcomm Incorporated Virtual reality video signaling in dynamic adaptive streaming over http
CN107438203B (en) * 2016-05-25 2021-11-23 InterDigital Madison Patent Holdings, SAS Method for establishing and receiving a manifest, network equipment and terminal
US11363086B2 (en) 2016-05-25 2022-06-14 Interdigital Madison Patent Holdings, Sas Method and network equipment for establishing a manifest
WO2017202899A1 (en) * 2016-05-25 2017-11-30 Koninklijke Kpn N.V. Spatially tiled omnidirectional video streaming
CN107438203A (en) * 2016-05-25 2017-12-05 Thomson Licensing Method and network equipment for establishing a manifest
EP3249929A1 (en) * 2016-05-25 2017-11-29 Thomson Licensing Method and network equipment for establishing a manifest
EP3249931A1 (en) * 2016-05-25 2017-11-29 Thomson Licensing Method and network equipment for establishing a manifest
JP2018014710A (en) * 2016-05-25 2018-01-25 Thomson Licensing Method and network equipment for establishing a manifest
US11284124B2 (en) 2016-05-25 2022-03-22 Koninklijke Kpn N.V. Spatially tiled omnidirectional video streaming
JP7041472 2016-05-25 2022-03-24 InterDigital Madison Patent Holdings, SAS Method and network equipment for establishing a manifest
KR20170133274A (en) * 2016-05-25 2017-12-05 Thomson Licensing Method and network equipment for establishing a manifest
KR102401666B1 (en) * 2016-05-25 2022-05-25 InterDigital Madison Patent Holdings, SAS Method and network equipment for establishing a manifest
US11943452B2 (en) 2016-07-14 2024-03-26 Koninklijke Kpn N.V. Systems and methods for video encoding and decoding
CN109691103A (en) * 2016-07-14 2019-04-26 Koninklijke KPN N.V. Video coding
CN109565616A (en) * 2016-08-22 2019-04-02 Google LLC Interactive video multi-screen experience on mobile phones
CN109565616B (en) * 2016-08-22 2022-03-18 Google LLC Interactive video multi-screen experience on mobile phones
EP3783905A1 (en) 2016-09-08 2021-02-24 Koninklijke KPN N.V. Partial video decoding method , device and system
WO2018046705A2 (en) 2016-09-08 2018-03-15 Koninklijke Kpn N.V. Partial video decoding method, device and system
EP3293981A1 (en) 2016-09-08 2018-03-14 Koninklijke KPN N.V. Partial video decoding method, device and system
US11153580B2 (en) 2016-09-08 2021-10-19 Koninklijke Kpn N.V. Partial video decoding method, device and system
EP3301951A1 (en) 2016-09-30 2018-04-04 Koninklijke KPN N.V. Audio object processing based on spatial listener information
EP3301952A1 (en) 2016-09-30 2018-04-04 Koninklijke KPN N.V. Audio object processing based on spatial listener information
US11062482B2 (en) 2016-10-12 2021-07-13 Arris Enterprises Llc Coding schemes for virtual reality (VR) sequences
US10805614B2 (en) 2016-10-12 2020-10-13 Koninklijke Kpn N.V. Processing spherical video data on the basis of a region of interest
US11527015B2 (en) 2016-10-12 2022-12-13 Arris Enterprises Llc Coding schemes for virtual reality (VR) sequences
WO2018071666A1 (en) * 2016-10-12 2018-04-19 Arris Enterprises Llc Coding schemes for virtual reality (vr) sequences
CN110024400A (en) * 2016-12-07 2019-07-16 Qualcomm Incorporated System and method for signaling a region of interest
CN110024400B (en) * 2016-12-07 2021-08-24 Qualcomm Incorporated System and method for signaling a region of interest
US10893256B2 (en) 2017-06-26 2021-01-12 Nokia Technologies Oy Apparatus, a method and a computer program for omnidirectional video
WO2019002662A1 (en) 2017-06-26 2019-01-03 Nokia Technologies Oy An apparatus, a method and a computer program for omnidirectional video
EP3646593A4 (en) * 2017-06-26 2021-03-31 Nokia Technologies Oy An apparatus, a method and a computer program for omnidirectional video
US11282283B2 (en) 2017-10-02 2022-03-22 At&T Intellectual Property I, L.P. System and method of predicting field of view for immersive video streaming
US10818087B2 (en) 2017-10-02 2020-10-27 At&T Intellectual Property I, L.P. Selective streaming of immersive video based on field-of-view prediction
US10762710B2 (en) 2017-10-02 2020-09-01 At&T Intellectual Property I, L.P. System and method of predicting field of view for immersive video streaming
US10659815B2 (en) 2018-03-08 2020-05-19 At&T Intellectual Property I, L.P. Method of dynamic adaptive streaming for 360-degree videos
WO2019226369A1 (en) * 2018-05-25 2019-11-28 Microsoft Technology Licensing, Llc Adaptive panoramic video streaming using overlapping partitioned sections
US10764494B2 (en) 2018-05-25 2020-09-01 Microsoft Technology Licensing, Llc Adaptive panoramic video streaming using composite pictures
US10666863B2 (en) 2018-05-25 2020-05-26 Microsoft Technology Licensing, Llc Adaptive panoramic video streaming using overlapping partitioned sections
WO2020130912A1 (en) * 2018-12-20 2020-06-25 Telefonaktiebolaget Lm Ericsson (Publ) Improved tile address signalling in video encoding and decoding
US11272178B2 (en) 2018-12-20 2022-03-08 Telefonaktiebolaget Lm Ericsson (Publ) Video encoding and decoding
US11523185B2 (en) 2019-06-19 2022-12-06 Koninklijke Kpn N.V. Rendering video stream in sub-area of visible display area
EP3761647A1 (en) 2019-07-05 2021-01-06 Tiledmedia B.V. Methods and devices for rendering a video on a display
US11936838B2 (en) 2019-07-05 2024-03-19 Tiledmedia B.V. Methods and devices for rendering a video on a display
WO2021004918A1 (en) 2019-07-05 2021-01-14 Tiledmedia B.V. Methods and devices for rendering a video on a display

Also Published As

Publication number Publication date
EP3162075B1 (en) 2020-04-08
US20170155912A1 (en) 2017-06-01
US10694192B2 (en) 2020-06-23
EP3162075A1 (en) 2017-05-03

Similar Documents

Publication Publication Date Title
EP3162075B1 (en) Hevc-tiled video streaming
JP6440747B2 (en) Region of interest determination based on HEVC tiled video stream
US10862943B2 (en) Methods, devices, and computer programs for improving streaming of partitioned timed media data
EP3510438B1 (en) Method and apparatus for controlled observation point and orientation selection audiovisual content
US11019408B2 (en) Methods, devices, and computer programs for streaming partitioned timed media data
US20180324283A1 (en) Method and corresponding device for streaming video data
EP3466091B1 (en) Method, device, and computer program for improving streaming of virtual reality media content
WO2019202207A1 (en) Processing video patches for three-dimensional content
KR101944601B1 (en) Method for identifying objects across time periods and corresponding device
Kammachi‐Sreedhar et al. Omnidirectional video delivery with decoder instance reduction
EP3777219B1 (en) Method and apparatus for signaling and storage of multiple viewpoints for omnidirectional audiovisual content

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15734102

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2015734102

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015734102

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 15318619

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE