WO2020141995A1 - Augmented reality support in an omnidirectional media format - Google Patents

Augmented reality support in an omnidirectional media format

Info

Publication number
WO2020141995A1
Authority
WO
WIPO (PCT)
Prior art keywords
bitstream
background layer
media
transparency
background
Prior art date
Application number
PCT/SE2019/051329
Other languages
English (en)
Inventor
Martin Pettersson
Mitra DAMGHANIAN
Rickard Sjöberg
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Publication of WO2020141995A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/23 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding with coding of regions that are present throughout a whole video segment, e.g. sprites, background or mosaic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8146 Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/161 Encoding, multiplexing or demultiplexing different image signal components

Definitions

  • The application relates to methods and apparatuses for video encoding and decoding.
  • VR: virtual reality. HMD: head-mounted display.
  • Use cases for VR include gaming and VR video, also referred to as omnidirectional video or 360° video.
  • The Moving Picture Experts Group (MPEG) is currently working on activities for immersive video to be published in the MPEG-I set of standards.
  • One activity is about 3 degrees of freedom (3DoF), a.k.a. 360° video, where the user may look in all directions of the sphere using a head-mounted display (HMD), but with the head position fixed at the origin.
  • A related activity is 3 degrees of freedom plus (3DoF+), where the user may additionally move the head slightly. A 3DoF+ scene is built up from a large number of views containing both texture and depth information.
  • Intermediate views are synthesized using texture and depth from neighboring views.
  • MPEG also has an ongoing activity for six degrees of freedom (6DoF) video.
  • In 6DoF video, the user has full flexibility to look around objects in a much larger volume compared to 3DoF+, enough to let the user stand and possibly walk around.
  • The plan to realize 6DoF video includes using a combination of background video and point cloud objects.
  • Point cloud objects are described with geometry information (points in 3D space) and attributes attached to each point in the point cloud. Attributes may include color values (e.g. RGB textures), reflectance, occupancy and opacity.
  • OMAF: Omnidirectional MediA Format. AVC: Advanced Video Coding. HEVC: High Efficiency Video Coding. VVC: Versatile Video Coding.
  • References to the current version of OMAF, the current version of OMAF v2, the latest version of OMAF or the latest version of OMAF v2 refer to OMAF v2 WD3 (w17963-v1).
  • Two common projection formats supported by OMAF are equirectangular projection and cube map projection.
  • The equirectangular projection format can be explained with a world map, where the equator ranges from left to right and the pixels at the poles are stretched along the top and bottom borders. The granularity is thus higher at the poles.
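  • To make the pole oversampling concrete, the following is a minimal sketch (in Python, not part of the OMAF specification) of mapping an equirectangular pixel to a unit viewing direction on the sphere; the coordinate conventions are illustrative assumptions. Rows near the top and bottom of the picture map to tiny circles around the poles, which is why those pixels are effectively stretched.

      import math

      def equirect_pixel_to_direction(x, y, width, height):
          """Map a pixel in an equirectangular picture to a unit view direction.

          Assumes the common convention: x spans longitude [-pi, pi] left to
          right, y spans latitude [pi/2, -pi/2] top to bottom (a concrete
          format such as OMAF may differ in sign and origin).
          """
          lon = ((x + 0.5) / width - 0.5) * 2.0 * math.pi   # longitude in radians
          lat = (0.5 - (y + 0.5) / height) * math.pi        # latitude in radians
          # Unit vector on the sphere; rows near |lat| ~ pi/2 collapse toward
          # a single point, which is why the projection oversamples the poles.
          return (math.cos(lat) * math.cos(lon),
                  math.cos(lat) * math.sin(lon),
                  math.sin(lat))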
  • The cube map projection is built up by six 2D video projections in the shape of a cube.
  • A cube map video may be created by capturing 2D video with six cameras in six orthogonal directions: up, down, front, back, left and right.
  • Figure 2 illustrates an example of a typical cube map.
  • The projected video format captured by the camera is typically packed into a picture that is more suitable for compression. This picture is referred to as a packed picture. After decoding, the packed picture is unpacked to a picture in the projection format, referred to as a projected picture, before being rendered to an HMD or display.
  • A simple use case where packing a projected picture into a packed picture is useful is the equirectangular projection format. Since the poles are oversampled in the projected picture, the packed picture may reduce the number of pixels spent on the poles. This is exemplified in Figure 3, where the top and bottom areas are shrunk in the packed picture compared to the projected picture.
  • Figure 4 illustrates an example of a typical use case for 360° video, where a 360° camera captures a scene and sends the 360° video to a server. The server then packs the projected format and sends the packed-format video to a user with an HMD. Note that the resolution of the projected picture may be larger than the resolution of the packed picture.
  • Tiles of a picture are sometimes referred to as regions of a picture.
  • A tiled video is built up by a number of tiles that are coded independently from each other. This means that a single tile can be extracted from the encoded bitstream and independently decoded. This is utilized in 360° video streaming by only sending, in high quality, the tiles that cover the current viewport. The other tiles may be skipped or transmitted at a lower quality.
  • Figure 5 illustrates an example of a typical use case where tiles are utilized to enable high quality for the viewport and lower quality for the other area.
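  • As an illustration of this viewport-dependent scheme, the following is a toy sketch in Python; all names are hypothetical, and a real player would derive the viewport tile set from the current viewing orientation rather than receive it directly.

      def select_tile_qualities(num_tiles, viewport_tiles,
                                low_quality="low", high_quality="high"):
          """Assign a quality per tile: tiles covering the current viewport
          are fetched in high quality, the rest in low quality (or skipped).
          Tile indexing and viewport intersection are assumed to be
          computed elsewhere."""
          return {t: (high_quality if t in viewport_tiles else low_quality)
                  for t in range(num_tiles)}

      # e.g. a 4x3 tile grid where tiles 5 and 6 cover the viewport
      qualities = select_tile_qualities(12, {5, 6})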
  • An overlay is a layer of visual media (e.g. a video, image item or timed text) that is rendered on top of another layer or background.
  • The layer or background that the overlay is rendered on top of is called the background visual media and may comprise a video, an image item, or timed text.
  • Overlays in OMAF may be fully opaque or semi-opaque over the whole overlay, or have individual transparency levels for each pixel in the overlay.
  • This is realized by sending an associated alpha plane with transparency values for each co-located pixel in the overlay. For instance, with an 8-bit alpha plane, 256 different transparency levels can be expressed for each pixel, where 0 means that the co-located pixel in the overlay is fully transparent and 255 means that the co-located pixel in the overlay is fully opaque.
  • The process of rendering a semi-transparent overlay onto another overlay or background layer is referred to as alpha-blending, or superimposing the layer on top of the other layer or background layer.
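  • The per-pixel blending described above can be sketched as follows (Python with NumPy); this is a simplified illustration of alpha compositing, not text from the OMAF specification. For alpha 0 the background pixel is kept unchanged; for alpha 255 the overlay pixel replaces it.

      import numpy as np

      def alpha_blend(overlay_rgb, alpha_plane, background_rgb):
          """Blend an overlay onto a background using an 8-bit alpha plane:
          0 = fully transparent overlay pixel, 255 = fully opaque."""
          a = alpha_plane.astype(np.float32)[..., np.newaxis] / 255.0
          out = (a * overlay_rgb.astype(np.float32)
                 + (1.0 - a) * background_rgb.astype(np.float32))
          return np.clip(out + 0.5, 0, 255).astype(np.uint8)  # round and clamp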
  • The current version of OMAF v2 comprises an overlay structure that specifies the overlay-related metadata for each overlay.
  • The overlay structure may be contained in a ProjectedOmniVideoBox, an OverlayConfigBox, an OverlayConfigProperty, an OverlayProperty or an OverlaySample.
  • The syntax and semantics for the overlay structure in the current version of OMAF v2 are provided below:
  • num_overlays specifies the number of overlays described by this structure. num_overlays equal to 0 is reserved.
  • num_flag_bytes specifies the number of bytes allocated collectively by the overlay_control_flag[i] syntax elements. num_flag_bytes shall be equal to 1 or 2 in this version of this document. OMAF players shall allow num_flag_bytes greater than 2 to appear in the syntax.
  • overlay_id provides a unique identifier for the overlay. No two overlays shall have the same overlay_id value.
  • overlay_control_flag[i] when set to 1 defines that the structure as defined by the i-th overlay_control_struct[i] is present.
  • overlay_control_flag[i] shall not be equal to 1 when i is greater than LastControlIdx. OMAF players shall allow both values of overlay_control_flag[i] to appear in the syntax for all values of i.
  • overlay_control_essential_flag[i] equal to 0 specifies that OMAF players are not required to process the structure as defined by the i-th overlay_control_struct[i].
  • overlay_control_essential_flag[i] equal to 1 specifies that OMAF players shall process the structure as defined by the i-th overlay_control_struct[i].
  • OMAF players shall be able to parse and process the structure overlay_control_struct[9], i.e., the OverlayPriority control structure.
  • When overlay_control_essential_flag[i] is equal to 1 and an OMAF player is not capable of parsing or processing the structure as defined by overlay_control_struct[i], the following applies:
  • the OMAF player shall display neither the overlays specified by this OverlayStruct nor the background visual media; or
  • the OMAF player shall not display the overlay specified by this SingleOverlayStruct.
  • byte_count[i] gives the byte count of the structure represented by the i-th overlay_control_struct[i].
  • overlay_control_struct[i][byte_count[i]] defines the i-th structure with a byte count as defined by byte_count[i].
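  • For illustration only, the following Python sketch parses a byte buffer along the lines of the semantics above. The exact field widths (16-bit num_overlays and overlay_id, a 1-bit essential flag packed with a 15-bit byte_count) are assumptions for this sketch, not the normative OMAF syntax.

      import struct

      def parse_overlay_struct(buf):
          """Hedged sketch of an OverlayStruct parser per the semantics above."""
          pos = 0
          num_overlays, num_flag_bytes = struct.unpack_from(">HB", buf, pos)
          pos += 3
          overlays = []
          for _ in range(num_overlays):
              (overlay_id,) = struct.unpack_from(">H", buf, pos)
              pos += 2
              flags = int.from_bytes(buf[pos:pos + num_flag_bytes], "big")
              pos += num_flag_bytes
              controls = {}
              n_flags = num_flag_bytes * 8
              for i in range(n_flags):
                  if (flags >> (n_flags - 1 - i)) & 1:  # overlay_control_flag[i] set
                      (word,) = struct.unpack_from(">H", buf, pos)
                      pos += 2
                      essential = word >> 15        # overlay_control_essential_flag[i]
                      byte_count = word & 0x7FFF    # byte_count[i]
                      controls[i] = (essential, buf[pos:pos + byte_count])
                      pos += byte_count
              overlays.append((overlay_id, controls))
          return overlays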
  • Positional information: overlay region (position/size), rotation, depth (region_depth_minus1), and layering order (layering_order).
  • Rendering options: opacity, and blending mode when applicable.
  • To render an overlay, a 2D plane (i.e. a rectangle) is created.
  • The overlay plane shall be located in front of other planes when it is associated with a lower layering_order value than the planes with higher layering_order values.
  • OMAF v2 comprises an entity group for grouping overlays and background visual media that are intended to be presented together.
  • The definition, syntax and semantics for the entity group are shown below:
  • An EntityToGroupBox with grouping_type equal to 'ovbg' specifies tracks and image items containing overlays and background visual media that are intended to be presented together.
  • When an entity in the group includes overlays, overlay_flag[i] shall be equal to 1. Otherwise, overlay_flag[i] shall be equal to 0.
  • Whether an entity includes overlays may be determined as follows: When a track in an 'ovbg' entity group contains an OverlayConfigBox in its sample entry, it includes overlays. When an image item in an 'ovbg' entity group is associated with an overlay item property (i.e., OverlayConfigProperty), it includes overlays.
  • When an entity in the group includes background visual media, background_flag[i] shall be equal to 1. Otherwise, background_flag[i] shall be equal to 0.
  • An 'ovbg' entity group shall contain either a background visual media track or a background image item but not both. Additionally, any two background visual media tracks in the same 'ovbg' entity group shall be alternatives to each other, indicated by the same value of alternate_group in their TrackHeaderBox, or shall belong to the same 2D spatial relationship track group. Any two background visual image items in the same 'ovbg' entity group shall belong to the same 'altr' entity group.
  • When both one or more overlays and background visual media are region-wise packed into the same video track or image item included in an 'ovbg' entity group, the same 'ovbg' entity group shall contain no other track or image item containing background visual media.
  • bit(6) reserved = 0;
  • overlay_flag[i] equal to 0 specifies that the entity does not contain any overlays.
  • overlay_flag[i] equal to 1 specifies that the entity contains one or more overlays.
  • background_flag[i] equal to 0 specifies that the entity does not contain background visual media.
  • background_flag[i] equal to 1 specifies that the entity contains background visual media.
  • At least one of overlay_flag[i] and background_flag[i] shall be equal to 1 for each value of i in the range of 0 to num_entities_in_group - 1, inclusive.
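  • A minimal sketch of consuming these per-entity flags follows (Python); the bit layout of one byte per entity, with overlay_flag and background_flag in the two most significant bits followed by the 6 reserved bits, is an assumption based on the syntax fragment above.

      def parse_ovbg_flags(flag_bytes):
          """Read (overlay_flag, background_flag) pairs of an 'ovbg' entity
          group, assuming one byte per entity: overlay_flag (1 bit),
          background_flag (1 bit), then 6 reserved bits."""
          entities = []
          for b in flag_bytes:
              overlay_flag = (b >> 7) & 1
              background_flag = (b >> 6) & 1
              assert overlay_flag or background_flag, \
                  "at least one of the flags shall be 1 for each entity"
              entities.append((overlay_flag, background_flag))
          return entities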
  • The current version of OMAF v2 comprises a set of (restricted) scheme types that define the scope of what the OMAF media bitstream supports.
  • Two scheme types defined in the current version of OMAF v2 are the packed equirectangular or cubemap projected video ('ercm') scheme type and the equirectangular or cubemap projected video with overlays ('ecov') scheme type, with the following definitions:
  • The 'ercm' scheme type is defined as a closed scheme type for projected omnidirectional video.
  • The value of version of ProjectionFormatBox, StereoVideoBox (when present), RegionWisePackingBox (when present), RotationBox (when present), and CoverageInformationBox (when present) shall be equal to 0.
  • SchemeInformationBox shall not directly or indirectly contain any boxes other than ProjectedOmniVideoBox, ProjectionFormatBox, StereoVideoBox, RegionWisePackingBox, RotationBox, and CoverageInformationBox.
  • For the 'ecov' scheme type, the track conforms to the constraints of scheme type equal to 'ercm' except that the ProjectedOmniVideoBox contained in the SchemeInformationBox is additionally allowed to contain OverlayConfigBox and ViewingSpaceBox.
  • The value of version of ViewingSpaceBox (when present) shall be equal to 0.
  • Augmented reality (AR) may be realized in a number of ways.
  • AR may be realized in AR glasses, such as the Microsoft HoloLens, Google Glass or Magic Leap Lightwear.
  • With AR glasses, the overlaid objects are projected directly (or indirectly by reflection in the glass) into one or both of the eyes, while the user sees the real-world environment through the semi-transparent glasses.
  • AR may also be used in handheld devices with a camera, such as smartphones and pads. The camera captures the real world behind the screen, and objects or overlays are superimposed on top of the captured video and shown on the screen of the device.
  • AR is sometimes used interchangeably with mixed reality (MR).
  • While OMAF includes functionalities for displaying omnidirectional video, with features such as overlays, multiple viewpoints, viewpoint switching and zooming, OMAF is not allowed to use overlays without the omnidirectional video, which would be useful for AR use cases. OMAF does not support AR due to at least the following constraints:
  • An overlay can only be rendered over an omnidirectional video or image item, or over a viewport, where a viewport is defined as "a region of omnidirectional image or video suitable for display and viewing by the user".
  • The 'ovbg' entity group shall contain either a background visual media track or a background image item. It is thus not possible for an 'ovbg' entity group to only group overlays.
  • Some embodiments provide a method and apparatus for decoding and rendering an omnidirectional media bitstream, wherein the media derived from the omnidirectional media bitstream comprises a background layer.
  • An omnidirectional media bitstream is acquired.
  • A determination is made, from the omnidirectional media bitstream, whether or not there is transparency in the background layer. Responsive to determining that there is transparency in the background layer, the omnidirectional media bitstream is decoded and rendered with transparency in the background layer. Responsive to determining that there is no transparency in the background layer, the omnidirectional media bitstream is decoded and rendered without transparency in the background layer.
  • On the encoding side, an omnidirectional media bitstream is acquired. Responsive to determining that there is to be transparency in the background layer, the omnidirectional media bitstream is encoded and outputted with an indicator indicating that transparency is allowed in the background layer. Responsive to determining that there is to be no transparency in the background layer, the omnidirectional media bitstream is encoded and outputted with the indicator indicating that transparency is not allowed in the background layer.
  • One potential advantage provided by the inventive concepts includes supporting AR in an omnidirectional media bitstream.
  • OMAF is enabled to allow overlays to be used in AR use cases.
  • AR use cases include gaming such as Pokemon Go, remote social interaction, industrial design, medical use cases such as remote surgery, flight training with virtual feedback, translation of signs, and use cases for tourism and sightseeing with rendered information for the current geographic location.
  • Another potential advantage is enabling functionality developed for an omnidirectional media bitstream to also be used for AR.
  • Functionality and resources may be reused between the different use cases. For instance, overlays may be created that are useful for both omnidirectional virtual reality (VR) and AR use cases.
  • Figure 1 is a diagram illustrating an example of 3DoF (360° video), 3DoF+, and 6DoF video according to some embodiments;
  • Figure 2 is a block diagram illustrating a cube map according to some embodiments.
  • Figure 3 is a block diagram illustrating an example of packing a projected equirectangular picture packed into a packed picture that is more suitable for compression according to some embodiments;
  • Figure 4 is a diagram illustrating a use case for 360° video according to some embodiments.
  • Figure 5 is a diagram illustrating a use case for 360° video with tiles according to some embodiments.
  • Figure 6 is a block diagram illustrating an environment in which augmented reality may be implemented according to some embodiments.
  • Figure 7 is a block diagram illustrating an encoder according to some embodiments.
  • Figure 8 is a block diagram illustrating a decoder according to some embodiments.
  • Figures 9-18 are flow charts illustrating operations of a decoder or an encoder in accordance with some embodiments of inventive concepts.
  • Use cases for AR include gaming such as Pokemon Go, remote social interaction, industrial design, medical use cases such as remote surgery, flight training with virtual feedback, translation of signs, and use cases for tourism and sightseeing with rendered information for the current geographic location.
  • Figure 6 illustrates an example of an operating environment of an encoder 600 that may be used to encode OMAF bitstreams as described herein to enable OMAF to allow overlays to be used in AR use cases.
  • The encoder 600 receives media, such as omnidirectional media, from network 602 and/or from storage 604, encodes the media into bitstreams as described below, and transmits the encoded media to decoder 606 via network 608.
  • Storage device 604 may be part of a storage repository of videos, such as the repository of a store or a streaming video service, a separate storage component, a component of a mobile device, etc.
  • The decoder 606 may be part of an augmented reality (AR) device 610 having an AR display 612, such as a head-mounted display, AR glasses, etc.
  • The AR device 610 may be a mobile device, a set-top device, a head-mounted display, AR glasses, an AR desktop computer, and the like.
  • FIG. 7 is a block diagram illustrating elements of encoder 600 configured to encode video frames according to some embodiments of inventive concepts.
  • The encoder 600 may include a network interface circuit 705 (also referred to as a network interface) configured to provide communications with other devices/entities/functions/etc.
  • The encoder 600 may also include a processor circuit 701 (also referred to as a processor) coupled to the network interface circuit 705, and a memory circuit 703 (also referred to as memory) coupled to the processor circuit.
  • The memory circuit 703 may include computer-readable program code that, when executed by the processor circuit 701, causes the processor circuit to perform operations according to embodiments disclosed herein.
  • According to other embodiments, processor circuit 701 may be defined to include memory so that a separate memory circuit is not required. As discussed herein, operations of the encoder 600 may be performed by processor 701 and/or network interface 705. For example, processor 701 may control network interface 705 to transmit communications, such as encoded media, to other devices.
  • Modules may be stored in memory 703, and these modules may provide instructions so that when instructions of a module are executed by processor 701, processor 701 performs respective operations.
  • FIG. 8 is a block diagram illustrating elements of decoder 606 configured to decode video frames according to some embodiments of inventive concepts.
  • The decoder 606 may include a network interface circuit 805 (also referred to as a network interface) configured to provide communications with other devices/entities/functions/etc.
  • The decoder 606 may also include a processor circuit 801 (also referred to as a processor) coupled to the network interface circuit 805, and a memory circuit 803 (also referred to as memory) coupled to the processor circuit.
  • The memory circuit 803 may include computer-readable program code that, when executed by the processor circuit 801, causes the processor circuit to perform operations according to embodiments disclosed herein.
  • According to other embodiments, processor circuit 801 may be defined to include memory so that a separate memory circuit is not required. As discussed herein, operations of the decoder 606 may be performed by processor 801 and/or network interface 805. For example, processor 801 may control network interface 805 to receive communications, such as encoded media, from other devices.
  • Modules may be stored in memory 803, and these modules may provide instructions so that when instructions of a module are executed by processor 801, processor 801 performs respective operations.
  • In various embodiments, an indicator (e.g. a flag) is signaled in an omnidirectional media bitstream (e.g. an OMAF bitstream) to specify whether or not transparency is allowed in the background layer.
  • The indicator may also more directly specify whether or not the bitstream is intended for augmented reality.
  • Operations of the decoder 606 and encoder 600 that are common to various embodiments shall now be described.
  • The decoder 606, in operation 901, acquires an omnidirectional media bitstream.
  • In operation 903, an indicator is determined from syntax elements in the bitstream, the indicator specifying whether or not the bitstream allows transparency in a background layer.
  • When the omnidirectional media bitstream allows transparency, in operation 905 it is determined, from the omnidirectional media bitstream, whether there is transparency in the background layer. Responsive to determining that there is transparency in the background layer, the omnidirectional media bitstream is decoded and rendered in operation 907 with transparency in the background. Responsive to determining that there is no transparency in the background layer, the omnidirectional media bitstream is decoded and rendered in operation 909 without transparency in the background.
  • In some embodiments, operation 903 is optional.
  • In these embodiments, operation 905 is performed without first determining that the omnidirectional media bitstream allows transparency.
  • In other words, the decoder determines, from the omnidirectional media bitstream, whether there is transparency in the background layer.
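  • The decision flow of operations 901-909 can be sketched as follows (Python); the dictionary keys and the renderer callable are hypothetical stand-ins, not OMAF-defined structures.

      def decode_and_render(media, renderer):
          """Sketch of the decoder flow (operations 901-909). `media` is a
          dict standing in for a parsed omnidirectional media bitstream."""
          # Operation 903 (optional in some embodiments): read the indicator.
          if not media.get("transparency_allowed", False):
              return renderer(media["frames"], transparent_background=False)
          # Operation 905: check for actual transparency in the background layer.
          if media.get("background_has_transparency", False):
              return renderer(media["frames"], transparent_background=True)   # operation 907
          return renderer(media["frames"], transparent_background=False)      # operation 909

      # Example use with a trivial renderer stub:
      decode_and_render(
          {"transparency_allowed": True, "background_has_transparency": True, "frames": []},
          lambda frames, transparent_background: ("rendered", transparent_background),
      )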
  • Turning to the encoder 600, in operation 1001 the encoder 600 receives an indication to encode an omnidirectional media bitstream.
  • The indication may be an instruction, receipt of an input for encoding, etc.
  • An indicator to use is determined from one or more syntax elements, the indicator specifying whether or not the bitstream allows transparency in a background layer. Responsive to determining that there is transparency in the background layer, the omnidirectional media bitstream is encoded in operation 1005 with transparency in the background. Responsive to determining that there is no transparency in the background layer, the omnidirectional media bitstream is encoded in operation 1007 without transparency in the background.
  • A first embodiment, described in more detail below, provides an indicator that specifies if transparency in the background layer is allowed.
  • A second embodiment, described in more detail below, provides a method for signaling a flag to determine whether the background layer is a fully opaque background visual media or a fully transparent background layer.
  • A third embodiment, described in more detail below, provides a method for signaling a flag to determine whether the background layer is an overlay (i.e. a semi-transparent layer) or a fully opaque background visual media.
  • A fourth embodiment, described in more detail below, provides a method for using alpha blending in the background visual media.
  • A fifth embodiment, described in more detail below, provides a method for signaling an indicator comprising a set of constraints (e.g. a restricted scheme type) that specifies whether or not the media bitstream allows transparency in the background layer. It may also specify that the bitstream is intended for augmented reality.
  • In this disclosure, a bitstream refers to a series of bits transmitted over a network.
  • A bitstream may alternatively be one or more data files stored on a physical medium, such as an HDD, RAM or flash memory.
  • The terms media bitstream, file and media file are used interchangeably.
  • A media player is in this context a collective term for file/segment reception or file access; file/segment decapsulation; decoding of audio, video, image, or timed text bitstreams; and rendering of audio, video, images, or timed text.
  • An overlay is a piece of visual media rendered over an omnidirectional video or image item, or over a viewport.
  • Omnidirectional media is media, such as image or video and its associated audio, that enables rendering according to the user's viewing orientation, if consumed with a head-mounted device, or according to the user's desired viewport otherwise, as if the user were in the spot where and when the media was captured.
  • A background layer is a layer on which an overlay is superimposed.
  • A background layer can be either background visual media or a transparent background layer.
  • A background visual media is a piece of visual media on which an overlay is superimposed.
  • A transparent background layer is a background layer intended for realizing AR functionality.
  • In some embodiments, the transparent background layer is not signaled in a track.
  • A viewpoint is an omnidirectional media corresponding to one omnidirectional camera.
  • A viewport is a region of omnidirectional image or video suitable for display and viewing by the user.
  • A visual media is a video, image item, or timed text.
  • Rendering is the process of generating audio-visual content for playback from the decoded audio-visual data, according to the user's viewing orientation, if consumed with a head-mounted device, or according to the user's desired viewport otherwise.
  • A track is a high-level ISO Base Media File Format structure (box), inherited by OMAF, for carrying media data and some related information for the track, including a timed sequence of related samples.
  • A track corresponds to a sequence of images or sampled audio of a specific type (e.g. an HEVC or AAC media type).
  • A file decoder is a collective term for file/segment decapsulation and decoding of video, audio or image bitstreams.
  • The terms decoder, bitstream decoder and media bitstream decoder are used interchangeably with file decoder.
  • A file decoding process is a process specified as a part of a media profile specification that takes as input a set of ISOBMFF tracks or items and derives either of the following:
  • The terms decoding, bitstream decoding and media bitstream decoding are used interchangeably with file decoding.
  • Opaque is used in the meaning fully opaque, i.e. no light is allowed to pass through.
  • An opaque pixel in an image is not blended with any pixel in a layer below.
  • Transparent is used in the meaning that some or all light is allowed to pass through.
  • A transparent pixel in an image may be blended with a pixel in a layer below.
  • Semi-transparent is used when the meaning of translucent is intended, i.e. allowing some light to pass through.
  • Fully transparent is used when allowing all light to pass through. A fully transparent image is thus invisible.
  • In the first embodiment, an indicator is provided that, when decoded, specifies whether transparency in the background layer is allowed.
  • An omnidirectional media bitstream is acquired (i.e., in operation 901 or operation 1001), where the media bitstream may comprise video, point cloud or graphics elements.
  • Video may for instance be mono (2D) video, stereo video, 3DoF video, 3DoF+ video or 6DoF video.
  • The media bitstream may for instance be an OMAF media bitstream.
  • The media derived from the media bitstream may be a background layer.
  • The background layer may for instance be a video played in the background or an image displayed as a background image in the omnidirectional output.
  • The background layer may also be a solid color or be fully transparent. Additional layers, for instance layers comprising overlays, point clouds or graphics elements, may be superimposed on the background layer.
  • The media bitstream may have an indicator, encoded with one or more syntax elements in the bitstream, that specifies whether or not the bitstream allows transparency in the background layer.
  • The indicator may for instance be a flag, a pointer or an object.
  • In some embodiments, the indicator is encoded in the bitstream in a struct, container, box, property or atom, either as part of metadata or sample data.
  • The decoder, in operation 1101, determines the indicator from a flag in the omnidirectional media bitstream.
  • In some embodiments, the indicator is decoded (operation 1103) from a struct, a container, a box, a property or an atom.
  • The indicator may be used to determine whether the omnidirectional bitstream is intended for AR.
  • In that case, an output is decoded that is semi-transparent visual media.
  • Semi-transparent visual media may for instance be rendered in AR glasses, AR-compatible smartphones, AR-compatible pads and AR-compatible VR head-mounted displays.
  • The AR compatibility of these devices may for instance comprise a front-facing camera that captures what the user would see if the device were not in the way; on top of this captured video, the semi-transparent visual media is superimposed.
  • A media decoder and/or an omnidirectional media player may execute the method described in this embodiment by all or a subset of the following steps:
  • Acquire a media bitstream. This media bitstream may be an OMAF media bitstream or OMAF media file.
  • The media bitstream may for instance comprise video, point cloud, graphics or any combination of these.
  • Decode a flag from the bitstream. The flag may for instance be decoded from a struct, structure, container, box, property or atom in the bitstream.
  • The flag may specifically be decoded from an overlay structure containing overlay-related metadata.
  • The decoding comprises determining whether or not there is transparency in the background layer.
  • In the second embodiment, a flag may be used to determine whether the background layer is a fully opaque background visual media or a fully transparent background layer.
  • The second embodiment is similar to the first embodiment.
  • Here, the indicator in the media bitstream specifies whether the background layer is a fully opaque background visual media or a fully transparent background layer.
  • In some embodiments, the indicator is placed in a box containing related metadata for augmented reality.
  • In some embodiments, a no-media-samples indication is signaled for the fully transparent background layer.
  • The fully transparent background layer could for instance be the background of the real world when the media bitstream is rendered on an AR device.
  • The fully transparent layer may also be the local camera capture of an AR device.
  • In other embodiments, samples are signaled for the fully transparent background layer.
  • In some embodiments, an alpha plane is signaled for the fully transparent background layer.
  • In some embodiments, the indicator is a flag encoded in the bitstream in an overlay structure containing overlay-related metadata.
  • The overlay structure may for instance be in a ProjectedOmniVideoBox, an OverlayConfigBox, an OverlayConfigProperty, an OverlayProperty or an OverlaySample in the OMAF media bitstream.
  • bit(4) reserved = 0;
  • Semantics: background_layer_type specifies the type of the background layer on which the overlays specified in the OverlayStruct are superimposed. background_layer_type equal to 0 specifies that the background layer is background visual media.
  • background_layer_type equal to 1 specifies that the background layer is a transparent background layer.
  • When the indicator specifies that the background layer is background visual media, a background visual media is decoded from a track in the media bitstream and at least one overlay is decoded in operation 1303 from a track in the media bitstream.
  • The background visual media and the at least one overlay may be decoded from the same track or from different tracks in the bitstream.
  • The at least one overlay is superimposed on the background visual media in operation 1305.
  • The background visual media is output in operation 1307 with the superimposed at least one overlay.
  • When the indicator specifies that the background layer is a fully transparent background layer, at least one overlay is decoded in operation 1309 from a track in the media bitstream.
  • The at least one overlay may be superimposed on the fully transparent background layer in operation 1311 to form a semi-transparent visual media.
  • The semi-transparent visual media is output in operation 1313.
  • In some embodiments, the fully transparent background layer is not signaled in the bitstream since it contains no data.
  • In that case, the at least one overlay may be rendered directly to the output of the media player, where the output matrix before rendering can be defined as the transparent background layer and the output of the media player after rendering is the semi-transparent visual media.
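  • The two decode paths of this embodiment can be sketched as follows (Python); compose() and the stand-in layer objects are hypothetical, and a real player would superimpose decoded pictures rather than placeholder values.

      def render_by_background_layer_type(background_layer_type, overlays, background=None):
          """background_layer_type follows the semantics above:
          0 = background visual media, 1 = transparent background layer."""
          def compose(base, layers):
              # Hypothetical helper: superimpose layers onto base
              # (base=None stands for a fully transparent background).
              return {"base": base, "layers": list(layers)}

          if background_layer_type == 0:
              # Operations 1301-1307: decode background + overlays, superimpose, output.
              return compose(background, overlays)
          # Operations 1309-1313: no background track; the output itself is
          # semi-transparent visual media, e.g. for compositing over an AR camera feed.
          return compose(None, overlays)

      out = render_by_background_layer_type(1, overlays=["overlay_track_0"])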
  • The encoder 600 performs similar operations during encoding.
  • When the background layer is a fully opaque background visual media, the background visual media is encoded in a track in the omnidirectional media bitstream.
  • At least one overlay is encoded in a track in the omnidirectional media bitstream.
  • The background visual media and the at least one overlay may be encoded in the same track or in different tracks in the bitstream.
  • The encoded omnidirectional media bitstream is provided with the encoded background visual media and the at least one overlay.
  • The encoded omnidirectional media bitstream may be provided to internal or external storage or to a decoder, such as decoder 606, via network 608.
  • When the background layer is a fully transparent background layer, at least one overlay is encoded in a track in the omnidirectional media bitstream to generate an encoded omnidirectional media bitstream.
  • The encoded omnidirectional bitstream is provided.
  • The encoded omnidirectional bitstream may be provided to the decoder 606 via network 608, or provided to internal or external storage, etc.
  • the rendering camera has to
  • The bitstream may comprise an entity group for grouping overlays and background visual media.
  • This entity group may comprise a combination of background visual media tracks and overlays, or comprise only overlays.
  • An 'ovbg' entity group may contain either a background visual media track or a background image item, but not both. Additionally, any two background visual media tracks in the same 'ovbg' entity group shall be alternatives to each other, and any two background visual image items in the same 'ovbg' entity group shall belong to the same 'altr' entity group. The entity group may contain only overlays.
  • A media decoder and/or an omnidirectional media player may execute the method described in this embodiment by all or a subset of the following steps:
  • Acquire a media bitstream. This media bitstream may be an OMAF media bitstream or OMAF media file.
  • The media bitstream may for instance comprise video, point cloud, graphics or any combination of these.
  • Decode a flag from the bitstream. The flag may for instance be decoded from a struct, structure, container, box, property or atom in the bitstream.
  • The flag may specifically be decoded from an overlay structure containing overlay-related metadata.
  • A media encoder and/or an omnidirectional media bitstream creator may execute the method described in this embodiment by all or a subset of the following steps to encode and/or create an omnidirectional media bitstream (where the media bitstream may be an OMAF media bitstream comprising at least one of video, point cloud, or graphics) that is intended for augmented reality (AR):
  • Encode a flag in the bitstream. The flag may for instance be encoded in a struct, structure, container, box, property or atom in the bitstream.
  • The flag may specifically be encoded in an overlay structure containing overlay-related metadata.
  • The indicator is an indicator of at least one of the following:
  • In the third embodiment, a flag is used to determine whether the background layer is an overlay (a semi-transparent layer) or a fully opaque background visual media.
  • Here, the indicator specifies whether the background layer is an overlay or a background visual media.
  • In operation 1501, it is determined whether the background layer is an overlay.
  • When it is, an overlay is decoded in operation 1503 from a track in the bitstream and, in operation 1505, a semi-transparent visual media is derived from the overlay.
  • In operation 1507, the semi-transparent visual media is output. In practice, this may be done as in the second embodiment, where the overlay may be rendered directly to the output of the media player, and the output of the media player after rendering is a semi-transparent visual media.
  • On the encoder side, an encoder determines whether the background layer is an overlay. When it is determined that the background layer is an overlay, an overlay is encoded in operation 1603 in a track in the bitstream and, in operation 1605, the omnidirectional media bitstream is provided with the encoded track.
  • The omnidirectional media bitstream may be provided to internal or external storage, to a decoder, etc.
  • In the fourth embodiment, alpha blending may be used in the background visual media.
  • In this embodiment, the background visual media may be semi-transparent.
  • An alpha plane to be used for the background visual media may be signaled in the bitstream.
  • The alpha plane may be a binary matrix indicating which pixels of the background visual media should be transparent and which should be opaque.
  • The alpha plane may also comprise a level of transparency for each (or some) pixel in the background visual media. The level of transparency may for instance be signaled with 8 bits per pixel, giving 256 levels of transparency.
  • On the decoder side, a track comprising a background visual media is decoded.
  • The indicator is used to determine whether the background visual media has an alpha plane.
  • That is, the indicator may specify whether or not the background visual media has an associated alpha plane.
  • When it does, the alpha plane is blended with the background visual media to form a semi-transparent visual media. For example, each (or some) pixels are blended during rendering to form semi-transparent visual media output for the background layer.
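  • A minimal sketch of this step follows (Python with NumPy); widening a binary alpha plane to the 8-bit range is an illustrative choice, not mandated by the embodiment.

      import numpy as np

      def attach_background_alpha(background_rgb, alpha_plane):
          """Attach a signaled alpha plane to the background visual media,
          yielding RGBA semi-transparent visual media for the compositor.
          A binary alpha plane (0/1) is widened to 8 bits; an 8-bit plane
          (0..255) is used as-is."""
          alpha = alpha_plane.astype(np.uint8)
          if alpha.max() <= 1:          # binary matrix: 0 = transparent, 1 = opaque
              alpha = alpha * 255
          return np.dstack([background_rgb, alpha])  # H x W x 4 RGBA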
  • On the encoder side, the encoder, in operation 1801, encodes a track comprising the background visual media.
  • The indicator is used to indicate whether the background visual media has an alpha plane.
  • In the fifth embodiment, a restricted scheme type is used.
  • The indicator may be a scheme type, a restricted scheme type, a profile, a level, a set of constraints, or a subset of elements that specifies whether or not the bitstream allows transparency in the background layer.
  • The indicator may also specify that the bitstream is intended for augmented reality.
  • The specification for decoding the media bitstream contains a number of scheme types that specify the scope of the bitstream.
  • Examples include scheme types for projected omnidirectional video, equirectangular projected video, packed equirectangular or cubemap projected video, fish-eye omnidirectional video, and a scheme type for packed equirectangular or cubemap projected video where overlays and viewing space information may additionally be present.
  • In this embodiment, a scheme type is introduced that restricts the scheme for the bitstream to have the scope of AR, i.e. it shall allow transparent backgrounds and overlays.
  • In some embodiments, the scheme type mandates that each video track of the file shall be an overlay video and each image item in the file shall be an overlay image item.
  • An indicator, such as the indicator of embodiment 1, contains the scheme type that specifies that the bitstream allows a transparent background and may comprise AR content.
  • The OverlayConfigBox may contain a ViewingSpaceBox. The value of version of these boxes (when present) shall be equal to 0.
  • In a sixth embodiment, a flag is used to indicate whether or not background visual media will be displayed.
  • Here, the indicator is used to determine whether or not the background visual media will be rendered.
  • If an omnidirectional media bitstream contains a background visual media and an overlay, and the indicator specifies that the background visual media shall be displayed, then the background visual media is rendered to the display and the overlay is superimposed on top of the background visual media.
  • If an omnidirectional media bitstream contains a background visual media and an overlay, and the indicator specifies that the background visual media shall not be displayed, then only the overlay is rendered to the display.
  • In a seventh embodiment, an indicator (e.g. a flag) is used to indicate whether the output media (e.g. OMAF output media, where the media is either video or an image item) should be rendered and/or displayed on a transparent background or an opaque background.
  • In some embodiments, the output media contains an alpha plane to specify the transparency for each pixel.
  • The flag is used to determine whether the background layer should be rendered and/or displayed as transparent on an AR display, and rendered and/or displayed with a solid color on a VR display.
  • In some embodiments, the color is a default color.
  • In other embodiments, the color is signaled in the media bitstream.
  • decoding and rendering (907) the omnidirectional media bitstream with transparency in the background layer
  • decoding and rendering (909) the omnidirectional media bitstream without transparency in the background layer.
  • Embodiment 2 The method of Embodiment 1 wherein the omnidirectional media bitstream is an Omnidirectional MediA Format, OMAF, omnidirectional media bitstream.
  • determining the indicator comprises determining (1101) the indicator from a flag in the omnidirectional media bitstream.
  • determining the indicator comprises decoding (1103) the indicator from a struct, a container, a box, a property or an atom.
  • the omnidirectional media bitstream is one of a video bitstream, a point cloud bitstream, a graphics bitstream or a combination of two or more of the video bitstream, the point cloud bitstream, and the graphics bitstream.
  • Embodiment 9 The method of Embodiment 8 wherein, responsive to determining the omnidirectional media bitstream is intended for AR, decoding (1203) an output that is semi-transparent visual media.
  • decoding (1309) at least one overlay from a track in the omnidirectional media bitstream
  • A decoder for a communication network comprising: a processor (801); and
  • memory (803) coupled with the processor, wherein the memory comprises instructions that when executed by the processor cause the processor to perform operations according to any of Embodiments 1-18.
  • a computer program comprising computer-executable instructions configured to cause a device to perform the method according to any one of Embodiments 1-18, when the computer-executable instructions are executed on a processor (801) comprised in the device.
  • a computer program product comprising a computer-readable storage medium (803), the computer-readable storage medium having computer-executable instructions configured to cause a device to perform the method according to any one of Embodiments 1-18 when the computer-executable instructions are executed on a processor (801) comprised in the device.
  • An apparatus comprising:
  • memory (803) communicatively coupled to the processor, said memory comprising instructions executable by the processor, which cause the processor to perform operations comprising: acquiring (901) an omnidirectional media bitstream;
  • decoding and rendering (907) the omnidirectional media bitstream with transparency in the background layer
  • decoding and rendering (909) the omnidirectional media bitstream without transparency in the background layer.
  • Embodiment 23 The apparatus of Embodiment 22 wherein the omnidirectional media bitstream is an Omnidirectional MediA Format, OMAF, omnidirectional media bitstream.
  • determining the indicator comprises further operations comprising determining (1101) the indicator from a flag in the omnidirectional media bitstream.
  • determining the indicator comprises further operations comprising decoding (1103) the indicator from a struct, container, box, property or atom.
  • the omnidirectional media bitstream is one of a video bitstream, a point cloud bitstream, a graphics bitstream or a combination of two or more of the video bitstream, the point cloud bitstream, and the graphics bitstream.
  • the instructions contain further instructions which cause the processor to perform operations comprising using the indicator to determine (1201) whether the omnidirectional media bitstream is intended for augmented reality (AR).
  • AR augmented reality
  • Embodiment 29 The apparatus of Embodiment 28 wherein the instructions contain further instructions which cause the processor to perform operations comprising responsive to determining the omnidirectional media bitstream is intended for AR, decoding (1203) an output that is semi-transparent visual media.
  • decoding (1303) at least one overlay from the track in the omnidirectional media bitstream
  • decoding (1309) at least one overlay from a track in the omnidirectional media bitstream
  • decoding (1503) an overlay from a track in the bitstream; and deriving (1505) a semi-transparent visual media from the overlay; and outputting (1507) the semi-transparent visual media.
  • A method for encoding an omnidirectional media bitstream comprising a background layer, the method comprising:
  • Embodiment 40 The method of Embodiment 39 wherein the omnidirectional media bitstream is an Omnidirectional MediA Format, OMAF, omnidirectional media bitstream.
  • encoding the omnidirectional media bitstream with the indicator comprises encoding the indicator in a struct, a container, a box, a property or an atom.
  • the omnidirectional media bitstream is one of a video bitstream, a point cloud bitstream, a graphics bitstream or a combination of two or more of the video bitstream, the point cloud bitstream, and the graphics bitstream.
  • encoding (1603) an overlay in a track in the omnidirectional media bitstream.
  • An encoder for a communication network comprising: a processor (701); and
  • memory coupled with the processor, wherein the memory comprises instructions that when executed by the processor cause the processor to perform operations according to any of Embodiments 39-53.
  • a computer program comprising computer-executable instructions configured to cause an encoder to perform the method according to any one of Embodiments 38-53, when the computer-executable instructions are executed on a processor (701) comprised in the encoder.
  • a computer program product comprising a computer-readable storage medium (703), the computer-readable storage medium having computer-executable instructions configured to cause an encoder to perform the method according to any one of Embodiments 38-53 when the computer-executable instructions are executed on a processor (701) comprised in the encoder.
  • An apparatus comprising:
  • memory communicatively coupled to the processor, said memory comprising instructions executable by the processor, which cause the processor to perform operations comprising:
  • Embodiment 58 The apparatus of Embodiment 57 wherein the omnidirectional media bitstream is an Omnidirectional MediA Format, OMAF, omnidirectional media bitstream.
  • The apparatus of any of Embodiments 57-59 comprising further instructions to cause the processor to perform operations comprising selecting an indicator to use from one of a flag, a pointer, or an object.
  • instructions to encode the indicator comprises further instructions to cause the processor to perform operations comprising encoding the indicator in a struct, a container, a box, a property or an atom.
  • the omnidirectional media bitstream is one of a video bitstream, a point cloud bitstream, a graphics bitstream or a combination of two or more of the video bitstream, the point cloud bitstream, and the graphics bitstream.
  • The apparatus of any of Embodiments 57-62 comprising further instructions to cause the processor to perform operations comprising using the indicator to provide an indication of whether the media bitstream is intended for augmented reality (AR).
  • encoding (1603) an overlay in a track in the omnidirectional media bitstream.
  • The terms "coupled", "connected", or "responsive" as used herein may include wirelessly coupled, connected, or responsive.
  • As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
  • Well-known functions or constructions may not be described in detail for brevity and/or clarity.
  • the term “and/or” includes any and all combinations of one or more of the associated listed items.
  • The terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components or functions but do not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions or groups thereof.
  • The common abbreviation "e.g.", which derives from the Latin phrase "exempli gratia", may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item.
  • The common abbreviation "i.e.", which derives from the Latin phrase "id est", may be used to specify a particular item from a more general recitation.
  • Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits.
  • These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
  • any appropriate steps, methods, features, functions, or benefits disclosed herein may be performed through one or more functional units or modules of one or more virtual apparatuses.
  • Each virtual apparatus may comprise a number of these functional units.
  • These functional units may be implemented via processing circuitry, which may include one or more microprocessors or microcontrollers, as well as other digital hardware, which may include digital signal processors (DSPs), special-purpose digital logic, and the like.
  • the processing circuitry may be configured to execute program code stored in memory, which may include one or several types of memory such as read-only memory (ROM), random-access memory (RAM), cache memory, flash memory devices, optical storage devices, etc.
  • Program code stored in memory includes program instructions for executing one or more telecommunications and/or data communications protocols as well as instructions for carrying out one or more of the techniques described herein.
  • The processing circuitry may be used to cause the respective functional unit to perform corresponding functions according to one or more embodiments of the present disclosure.
  • The term unit may have the conventional meaning in the field of electronics, electrical devices and/or electronic devices and may include, for example, electrical and/or electronic circuitry, devices, modules, processors, memories, logic, solid state and/or discrete devices, computer programs or instructions for carrying out respective tasks, procedures, computations, outputs, and/or displaying functions, and so on, such as those described herein.

Abstract

A method, an apparatus and a computer program are provided for decoding and rendering an omnidirectional media bitstream, wherein the media derived from the omnidirectional media bitstream comprises a background layer. The method comprises acquiring an omnidirectional media bitstream. The method further comprises determining, from the omnidirectional media bitstream, whether there is transparency in the background layer. The method further comprises, responsive to determining that there is transparency in the background layer, decoding and rendering the omnidirectional media bitstream with transparency in the background layer. The method further comprises, responsive to determining that there is no transparency in the background layer, decoding and rendering the omnidirectional media bitstream without transparency in the background layer.
PCT/SE2019/051329 2019-01-03 2019-12-20 Augmented reality support in an omnidirectional media format WO2020141995A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962787912P 2019-01-03 2019-01-03
US62/787,912 2019-01-03

Publications (1)

Publication Number Publication Date
WO2020141995A1 true WO2020141995A1 (fr) 2020-07-09

Family

ID=71407093

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2019/051329 WO2020141995A1 (fr) 2019-12-20 Augmented reality support in an omnidirectional media format

Country Status (1)

Country Link
WO (1) WO2020141995A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332324A (zh) * 2021-12-27 2022-04-12 北京字节跳动网络技术有限公司 Image processing method, apparatus, device and medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6987518B2 (en) * 2002-03-27 2006-01-17 Sony Corporation Graphics and video integration with alpha and video blending
US20110149037A1 (en) * 2008-08-26 2011-06-23 Koninklijke Philips Electronics N.V. Method and system for encoding a 3D video signal, encoder for encoding a 3-D video signal, encoded 3D video signal, method and system for decoding a 3D video signal, decoder for decoding a 3D video signal.
US20160112723A1 (en) * 2014-10-17 2016-04-21 Ross Video Limited Transfer of video and related data over serial data interface (sdi) links
US20170177150A1 (en) * 2015-12-21 2017-06-22 Mediatek Inc. Display control for transparent display
WO2018069215A1 (fr) * 2016-10-12 2018-04-19 Thomson Licensing Procédé, appareil et flux permettant de coder une transparence et des informations d'ombre d'un format vidéo immersif
EP3396958A1 (fr) * 2017-04-24 2018-10-31 INTEL Corporation Codage vidéo de réalité mixte avec des superpositions
US20180376125A1 (en) * 2017-06-23 2018-12-27 Media Tek Inc. Methods and apparatus for deriving composite tracks

Similar Documents

Publication Publication Date Title
US10249019B2 (en) Method and apparatus for mapping omnidirectional image to a layout output format
US11651752B2 (en) Method and apparatus for signaling user interactions on overlay and grouping overlays to background for omnidirectional content
US11051040B2 (en) Method and apparatus for presenting VR media beyond omnidirectional media
US10944977B2 (en) Methods and apparatus for encoding and decoding overlay compositions
EP3698551A1 Method, apparatus and stream for volumetric video format
EP2235685B1 Image processor for overlaying a graphics object
JP2022532302A Immersive video coding technologies for 3DoF+/MIV and V-PCC
WO2019191205A1 Method, apparatus and stream for volumetric video format
US11979546B2 (en) Method and apparatus for encoding and rendering a 3D scene with inpainting patches
EP3632101A1 Coordinate mapping for rendering a panoramic scene
US20230042874A1 (en) Volumetric video with auxiliary patches
WO2020141995A1 (fr) Augmented reality support in an omnidirectional media format
Chen et al. Simplified carriage of MPEG immersive video in HEVC bitstream
US20220377302A1 (en) A method and apparatus for coding and decoding volumetric video with view-driven specularity
EP4038880A1 Method and apparatus for encoding, transmitting and decoding volumetric video
JP2023531579A Volumetric media processing method and apparatus
US20240013475A1 (en) Transparency range for volumetric video
US11743559B2 (en) Methods and systems for derived immersive tracks
US20230224501A1 (en) Different atlas packings for volumetric video
US20230345020A1 (en) Method for processing video data stream, video decoding apparatus, and method for encoding data stream
EP4173295A1 Method and apparatus for encoding and decoding volumetric content in and from a data stream
Le Feuvre et al. Graphics Composition for Multiview Displays

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19907952

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19907952

Country of ref document: EP

Kind code of ref document: A1