WO2021176139A1 - Efficient culling of volumetric video atlas bitstreams - Google Patents
- Publication number
- WO2021176139A1 (PCT/FI2021/050146)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- atlas
- metadata
- view
- volumetric video
- culling
- Prior art date
Links
- 238000013507 mapping Methods 0.000 claims abstract description 223
- 230000000153 supplemental effect Effects 0.000 claims description 124
- 230000000007 visual effect Effects 0.000 claims description 74
- 238000000034 method Methods 0.000 claims description 44
- 238000004590 computer program Methods 0.000 claims description 41
- 230000002688 persistence Effects 0.000 claims description 40
- 230000006978 adaptation Effects 0.000 claims description 18
- 238000009877 rendering Methods 0.000 claims description 8
- 230000011664 signaling Effects 0.000 description 23
- 230000004048 modification Effects 0.000 description 13
- 238000012986 modification Methods 0.000 description 13
- 230000008569 process Effects 0.000 description 8
- 238000013461 design Methods 0.000 description 5
- 230000008859 change Effects 0.000 description 4
- 238000000605 extraction Methods 0.000 description 4
- 230000002123 temporal effect Effects 0.000 description 4
- 238000004891 communication Methods 0.000 description 3
- 230000006835 compression Effects 0.000 description 3
- 238000007906 compression Methods 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000003287 optical effect Effects 0.000 description 2
- 238000003491 array Methods 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 238000005538 encapsulation Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 238000005192 partition Methods 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 238000000638 solvent extraction Methods 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/08—Volume rendering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/001—Model-based coding, e.g. wire frame
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/194—Transmission of image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
- H04N21/2353—Processing of additional data, e.g. scrambling of additional data or processing content descriptors specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8146—Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g 3D video
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
Definitions
- the examples and non-limiting embodiments relate generally to video codecs, and more particularly, to efficient culling of volumetric video atlas bitstreams.
- an apparatus includes means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- atlas-to-view mapping metadata indicates an association between patches in at least one atlas and at least one view
- an apparatus includes means for providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for transmitting the information to a receiving device.
- an apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: receive information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and cull the one or more sets of components belonging to the same atlas from at least one volumetric video bit
- an apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmit the information to a receiving device.
- a method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- a method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
- a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video
- a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
- FIG. 1A, FIG. 1B, and FIG. 1C depict a 3VC elementary stream structure for one atlas (patch data and video encoded components).
- FIG. 2 is a diagram depicting relationships between objects and V-PCC elements (patches and volumetric 2D rectangles).
- FIG. 3 shows an example modified miv_view_params_list () sub-structure of the adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted.
- FIG. 4 shows an example modified miv_view_params_update_extrinsics () sub-structure of the adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted.
- FIG. 5 shows an example modified adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted which includes a new structure miv_atlas_map_update ().
- FIG. 6 shows an example miv_atlas_map_update () structure.
- FIG. 7 shows an example modified patch information SEI message, with the modification highlighted.
- FIG. 8A shows a first part of an example modified scene object information SEI message, and wherein collectively FIG. 8A, FIG. 8B, and FIG. 8C are FIG. 8.
- FIG. 8B shows a second part of the example modified scene object information SEI message, with the modification highlighted.
- FIG. 8C shows a third part of the example modified scene object information SEI message.
- FIG. 9 is an example apparatus, which may be implemented in hardware, configured to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
- FIG. 10 is an example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
- FIG. 11 is another example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
- FIG. 12 is another example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
- abbreviations: volumetric video coding standard; 3GPP: 3rd Generation Partnership Project; 3VC: video-based volumetric video coding standard, or volumetric video coding
- the examples referred to herein relate to volumetric video coding, where dynamic 3D objects or scenes are coded into video streams for delivery and playback.
- the MPEG standards V-PCC (Video-based Point Cloud Compression) and MIV (Metadata for Immersive Video) are two examples of such volumetric video compression. These standards share a common base standard, 3VC (Volumetric Video Coding).
- the 3D scene is segmented into a number of regions according to heuristics based on, for example, spatial proximity and/or similarity of the data in the region.
- the segmented regions are projected into 2D patches, where each patch contains at least surface texture and depth channels, the depth channel giving the displacement of the surface pixels from the 2D view plane associated with that patch.
- the patches are further packed into an atlas that can be encoded and streamed as a regular 2D video.
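The projection-and-packing step above can be illustrated with a minimal sketch. The `Patch` type and the naive shelf-packing strategy are assumptions for illustration only, not the normative 3VC packing algorithm:

```python
# Hypothetical sketch: 2D patches (each carrying texture + depth channels)
# are packed into an atlas image using naive shelf packing.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Patch:
    width: int    # patch size in atlas pixels
    height: int

def shelf_pack(patches: List[Patch], atlas_width: int) -> Tuple[List[Tuple[int, int]], int]:
    """Place patches left-to-right on horizontal shelves; return the
    (x, y) position of each patch and the total atlas height used."""
    positions = []
    x = y = shelf_height = 0
    for p in patches:
        if x + p.width > atlas_width:          # patch does not fit: new shelf
            x, y = 0, y + shelf_height
            shelf_height = 0
        positions.append((x, y))
        x += p.width
        shelf_height = max(shelf_height, p.height)
    return positions, y + shelf_height

positions, used_height = shelf_pack(
    [Patch(64, 32), Patch(64, 64), Patch(32, 16)], atlas_width=128)
# positions -> [(0, 0), (64, 0), (0, 64)], used_height -> 80
```

The packed atlas can then be encoded and streamed as a regular 2D video, as noted above.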
- a 3VC bitstream may contain one or more atlases.
- An atlas consists of an atlas metadata bitstream (atlas_sub_bitstream) and video encoded component bitstreams (video_sub_bitstreams).
- the atlas metadata bitstream carries patch layout information for related video encoded component bitstreams.
- MIV introduced the concept of a special atlas, or master atlas, of specific type 0x3F. This master atlas only contains the atlas metadata bitstream, where common parameters such as view or camera parameters may be signaled.
- FIG. 1A, FIG. 1B, and FIG. 1C (collectively FIG. 1) describe the 3VC bitstream structure 100 for a single atlas, where atlases are signaled in vpcc_unit_headers.
- the 3VC bitstream structure 100 includes a V-PCC bitstream 102, and atlas sub-bitstream 104, and an atlas tile group layer RBSP 106.
- the V-PCC bitstream includes a plurality of VPCC unit headers 110 (including 110-2, 110-3, 110-4, and 110-5), a VPCC sample stream precision 112, a plurality of VPCC sample stream sizes 114 (including 114-2, 114-3, 114-4, and 114-5), a VPS 115 associated with a VPCC unit payload, an atlas sub-bitstream 117 associated with a VPCC unit payload, and a plurality of video sub-bitstreams (116-3, 116-4, and 116-5), each associated with a VPCC unit payload.
- VPCC unit header 110 has a volumetric unit header type of VPCC_VPS for VPS
- VPCC unit header 110-2 has a volumetric unit header type of VPCC_AD for atlas data
- VPCC unit header 110-3 has a volumetric unit header type of VPCC_OVD for occupancy video data
- VPCC unit header 110-4 has a volumetric unit header type of VPCC_GVD for geometry video data
- VPCC unit header 110-5 has a volumetric unit header type of VPCC_AVD for attribute video data.
- size 114 corresponds to the size of items 110 and 115
- size 114-2 corresponds to the size of items 110-2 and 117
- size 114-3 corresponds to the size of items 110-3 and 116-3
- size 114-4 corresponds to the size of items 110-4 and 116-4
- size 114-5 corresponds to the size of 110-5 and 116-5 (where for example the unit of size is the number of RBSP bytes).
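The framing described above (a precision value followed by length-prefixed units, where each length covers a unit header plus its payload) can be exercised with a small round-trip sketch. Function names and the exact byte layout are illustrative assumptions, not the normative ISO/IEC 23090-5 sample stream syntax:

```python
# Illustrative writer/parser for a length-prefixed sample stream: a one-byte
# precision indicator, then (size, unit) pairs in big-endian byte order.
def write_sample_stream(units, precision=4):
    out = bytes([precision - 1])               # precision signalled as minus1
    for u in units:
        out += len(u).to_bytes(precision, "big") + u
    return out

def read_sample_stream(data):
    precision = data[0] + 1
    units, pos = [], 1
    while pos < len(data):
        size = int.from_bytes(data[pos:pos + precision], "big")
        pos += precision
        units.append(data[pos:pos + size])     # unit header + payload bytes
        pos += size
    return units

units = [b"VPS....", b"AD.....", b"OVD...."]
assert read_sample_stream(write_sample_stream(units)) == units
```

A parser built this way can skip over a unit it does not need by advancing `pos` by `size`, which is what makes size-prefixed framing convenient for culling.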
- Atlas sub-bitstream 104 includes a NAL sample stream precision 122, a plurality of NAL sample stream sizes 124 (including 124-2, 124-3, 124-4, 124-5, 124-6, and 124-7), a plurality of NAL unit headers 120 (including 120-2, 120-3, 120-4, 120-5, 120-6, and 120-7), an ASPS 126 having a number of RBSP bytes, an AFPS 127 having a number of RBSP bytes, a NAL prefix SEI 128 having a number of RBSP bytes, a plurality of atlas tile group layer raw byte sequence payloads 130 (including 130-2 and 130-3) having a number of RBSP bytes, and a NAL suffix SEI 132 having a number of RBSP bytes.
- size 124 corresponds to the size of items 120 and 126
- size 124-2 corresponds to the size of items 120-2 and 127
- size 124-3 corresponds to the size of items 120-3 and 128
- size 124-4 corresponds to the size of items 120-4 and 130
- size 124-5 corresponds to the size of 120-5 and 130-2
- size 124-6 corresponds to the size of 120-6 and 130-3
- size 124-7 corresponds to the size of 120-7 and 132 (where for example the unit of size is the number of RBSP bytes).
- the atlas tile group layer RBSP 106 includes an atlas tile group data unit 140, an atlas tile group header 142, a plurality of atlas tile group data unit patch modes 144 (including 144-2, 144-3, 144-4, 144-5, and 144-6), and a plurality of patch information data 146 (including 146-2, 146-3, 146-4, 146-5, and 146-6).
- vpcc_units with different headers may be stored in separate tracks. Tracks with the same atlas_id may reference each other in order to establish a logical hierarchy.
- a master atlas may be used to provide a single entry point in the file.
- the master atlas may refer to other atlases as described in U.S. provisional application no. 62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF".
- in MIV, in addition to the patch information, there is additional view metadata that describes the projection parameters, such as depth range and camera intrinsic and extrinsic parameters, for the patches.
- the patches in the patch atlas reference the view metadata by view id, and there are typically much fewer views than there are patches.
- the 3VC bitstream supports a special "master atlas" that may only contain atlas metadata without an actual video bitstream.
- each patch in a 3VC (V-PCC or MIV) atlas comes with sufficient metadata for determining whether that patch may be visible in a view of the scene rendered with given camera parameters.
- This view frustum culling of scene elements is a common rendering optimization in 3D graphics and can be applied to volumetric video as well.
- view frustum culling can also be applied to each MIV view, enabling coarser (or more conservative) culling at the view level followed by further culling at the patch level.
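The frustum test described above can be sketched with a standard bounding-sphere-versus-plane check from 3D graphics. The plane convention (unit normal n, offset d, inside where dot(n, p) + d >= 0) is a common choice, not something mandated by the specification:

```python
# Minimal view frustum culling sketch: an element's bounding sphere is culled
# when it lies entirely on the outside of any frustum plane. The test is
# conservative: surviving elements are only *potentially* visible.
def sphere_visible(center, radius, planes):
    for (nx, ny, nz, d) in planes:
        dist = nx * center[0] + ny * center[1] + nz * center[2] + d
        if dist < -radius:        # completely behind this plane: cull
            return False
    return True

# Frustum reduced to a single near plane z >= 1 for illustration.
near_plane = (0.0, 0.0, 1.0, -1.0)
assert sphere_visible((0, 0, 5), 1.0, [near_plane])        # in front: keep
assert not sphere_visible((0, 0, -3), 1.0, [near_plane])   # behind: cull
```

Applied per view first and then per patch, this gives the coarse-then-fine culling order described above.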
- Culling may refer to removing or ignoring information that is not relevant, whereas extraction can be done for relevant or irrelevant information. For example, consider the difference between extracting a track from a file and culling a track from a file.
- the content may be too large for the client to access, decode, and/or render all at once.
- Larger scenes may typically be split into multiple video atlases in any case due to video decoder resolution limits, so it is desirable to facilitate partial access at the atlas level and/or use smaller partitions inside atlases.
- HEVC supports highly flexible partitioning of a video sequence. Each frame of the sequence is split up into rectangular or square regions (Units or Blocks), each of which is predicted from previously coded data. After prediction, any residual information is transformed, and entropy encoded.
- Each coded video frame, or picture is partitioned into Tiles and/or Slices, which are further partitioned into Coding Tree Units (CTUs).
- the CTU is the basic unit of coding, analogous to the Macroblock in earlier standards, and can be up to 64x64 pixels in size.
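The CTU grid described above follows directly from the picture and CTU dimensions: a picture is covered by ceil(width/ctu_size) by ceil(height/ctu_size) coding tree units. A small sketch, with illustrative example values:

```python
# Compute the CTU grid covering a picture; partial CTUs at the right and
# bottom edges still count as whole units, hence the ceiling division.
def ctu_grid(width, height, ctu_size=64):
    cols = -(-width // ctu_size)     # ceiling division without math.ceil
    rows = -(-height // ctu_size)
    return cols, rows, cols * rows

# A 1920x1080 picture with the maximum 64x64 CTU size:
cols, rows, total = ctu_grid(1920, 1080)   # -> (30, 17, 510)
```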
- Multiple atlases in a V-PCC elementary stream: after the MPEG 128 meeting, the V-PCC elementary bitstream may contain more than one atlas. This functionality was added to carry data encoded according to the MIV specification (23090-12).
- vuh_atlas_id was added to V-PCC unit header for V-PCC units with types: VPCC_AD, VPCC_GVD, VPCC_OVD, and VPCC_AVD, corresponding respectively to Atlas Data, Geometry Video Data, Occupancy Video Data, and Attribute Video Data.
- a V-PCC sample allows only one V-PCC unit payload to be stored. Consequently, one V-PCC track per atlas would have to be created.
- V-PCC Component Tracks can be created without modification, as from their perspective vuh_atlas_id is yet another identifier of a track similar to vuh_unit_type, vuh_attribute_index, vuh_map_index, and vuh_attribute_dimension_index.
- V-PCC Object Annotation in 3VC (ISO/IEC 23090-5).
- in V-PCC it is possible to annotate each region of the volumetric bitstream, i.e. the patches or groups of patches that are identified using a "rectangular" shaped volumetric rectangle, with different information. This information may include whether these elements are associated with a particular object (likely an object in the physical/world space) and certain properties that could be useful for their extraction and rendering.
- Such information may include labeling of objects, the size and shape of the points that correspond to the object, whether the object is visible or not, visibility cone information, material ids, and collision information, among others.
- Shown in FIG. 2 is an object 202, where object 202 has an object ID.
- the object 202 is associated with a tile/patch object 204, shown as TileX.Patches where a tile (indexed from 0 to m) may access patches by dereferencing the patches object.
- the object 202 is also associated with a plurality of 2D volumetric rectangles 206 (indexed from 0 to n).
- the object 202 has a number of properties 208 including, as shown in FIG. 2, labels, 3D bounding box information, collision shapes, point size, whether the object is hidden or visible, a priority, visibility cones, and object relationships.
- the properties 208 have labels 210, which in the example shown in FIG. 2 are indexed from 0 to 255, where each label has a label ID, label text, and a label language.
- Objects may correspond to "real", i.e. physical, objects within a scene, or even conceptual objects that may relate to physical or other properties. Objects may be associated with different parameters, or properties (e.g. properties 208), which may also correspond to information provided during the creation or editing process of the point cloud, scene graph, etc. It is possible that some objects may relate to one another and in some cases an object could be part of another object.
- An object could be persistent in time and could also be updated at any time/frame while the associated information may persist from that point onward.
- Multiple patches or 2D volumetric rectangles (e.g. rectangles 206) may be associated with the same object.
- Such relationships could persist or also need to change in time because objects may move or their placement in the atlas may have changed.
- the number of cameras, and the camera extrinsic and intrinsic information, are not fixed and may change on a group-of-pictures (GOP) basis.
- 23090-12 introduces in WD4 of the specification an adaptation params structure that can carry this information.
- Adaptation params are carried by a NAL unit with a particular NAL unit type.
- the adaptation params structure is carried in an atlas with a unique value of atlas_id equal to 0x3F.
- each camera (view) has a unique index determined within the miv_view_params_list.
- Box-structured file formats: box-structured and hierarchical file format concepts have been widely used for media storage and sharing.
- the most well-known file formats in this regard are the ISO Base Media File Format (ISOBMFF, ISO/IEC 14496-12) and its variants such as MP4 and 3GPP file formats.
- ISOBMFF allows storage of timed audio/visual media streams, called media tracks.
- the metadata which describes the track is separated from the encoded bitstream itself.
- the format provides mechanisms to access media data in a codec-agnostic fashion from a file parser perspective.
- a 3VC (V-PCC/MIV) bitstream containing a coded point cloud sequence (CPCS) is composed of VPCC units carrying V-PCC parameter set (VPS) data, an atlas information bitstream, and 2D video encoded bitstreams (e.g. an occupancy map bitstream, a geometry bitstream, and zero or more attribute bitstreams).
- V-PCC/MIV bitstream
- a V-PCC/MIV bitstream can be stored in an ISOBMFF container according to ISO/IEC 23090-10. Two modes are supported: single-track container and multi-track container.
- Single-track container is utilized in the case of simple ISOBMFF encapsulation of a V-PCC encoded bitstream.
- a V-PCC bitstream is directly stored as a single track without further processing.
- a single-track container should use a sample entry type of 'vpe1' or 'vpeg'.
- atlas parameter sets (as defined in ISO/IEC 23090-5) are stored in the setupUnit of the sample entry. Under the 'vpeg' sample entry, the atlas parameter sets may be present in the setupUnit array of the sample entry, or in the elementary stream.
- Multi-track container maps V-PCC units of a 3VC (V-PCC/MIV) elementary stream to individual tracks within the container file based on their types.
- V-PCC track is a track carrying the volumetric visual information in the V-PCC bitstream, which includes the atlas sub-bitstream and the atlas sequence parameter sets.
- V-PCC component tracks are restricted video scheme tracks which carry 2D video encoded data for the occupancy map, geometry, and attribute sub-bitstreams of the 3VC (V-PCC/MIV) bitstream. A multi-track container should use, for the V-PCC track, a sample entry type of 'vpc1' or 'vpcg'.
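The multi-track mapping described above (units routed to separate tracks by type, with the atlas sub-bitstream in the V-PCC track and the 2D video sub-bitstreams in component tracks) can be sketched as a simple router. The unit type names follow 23090-5; the routing key and data structures are illustrative:

```python
# Sketch: route V-PCC units to one track per (unit type, atlas id) pair,
# mirroring how a multi-track ISOBMFF container separates the atlas
# sub-bitstream from the occupancy/geometry/attribute video sub-bitstreams.
from collections import defaultdict

VPCC_VPS, VPCC_AD, VPCC_OVD, VPCC_GVD, VPCC_AVD = range(5)

def route_units(units):
    """units: iterable of (vuh_unit_type, vuh_atlas_id, payload).
    Returns a dict mapping (type, atlas_id) -> list of payloads."""
    tracks = defaultdict(list)
    for unit_type, atlas_id, payload in units:
        tracks[(unit_type, atlas_id)].append(payload)
    return dict(tracks)

tracks = route_units([
    (VPCC_AD,  0, b"atlas-metadata"),
    (VPCC_OVD, 0, b"occupancy"),
    (VPCC_GVD, 0, b"geometry"),
    (VPCC_AVD, 0, b"texture"),
    (VPCC_AD,  1, b"atlas-metadata-2"),
])
assert len(tracks) == 5   # five distinct (type, atlas) tracks
```

Keying on the atlas id as well as the unit type is what allows a whole atlas, i.e. its track group, to be skipped at the file format level.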
- Atlas culling is not currently possible, however. While the view metadata is available in the "master atlas" and each view can be culled against the rendering view frustum, the connection to the actual scene data corresponding to each view is through the patch metadata that resides in each atlas metadata bitstream.
- every atlas metadata bitstream must be accessed before it is possible to determine whether a given atlas is relevant for the client at a given moment. This makes at least network streaming optimizations impossible, and hinders optimization of bitstream parsing and decoding in general.
- the adaptation_params_rbsp structure that contains MIV related view metadata is contained in the universally accessible "master" atlas (i.e. the atlas with vuh_atlas_id equal to 0x3F). New elements in the adaptation_params_rbsp structure are added to provide information about the mapping from views to atlases. This mapping may indicate, for every view, the atlas that contains patches referring back to the view in question. The renderer may apply view frustum culling to each view first.
- the mapping metadata can be, for example, a bitmask of N bits, where N is the number of atlas sub-bitstreams. Each bit in the mask therefore corresponds to one atlas. In each view, the mask may have a bit set for every atlas if the atlas corresponding to the bit contains patches for that view, and a bitwise OR operation over the potentially visible views may produce the combined bitmask.
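The bitmask scheme described above can be sketched as follows; this is an illustrative example rather than normative 3VC syntax, and the function names are assumptions.

```python
def combined_atlas_mask(view_masks, visible_views):
    """OR together the atlas bitmasks of all potentially visible views.

    view_masks: list of N-bit integers, one per view; bit i is set when
    atlas i contains patches referring back to that view.
    visible_views: indices of views that survived view-frustum culling.
    """
    mask = 0
    for v in visible_views:
        mask |= view_masks[v]
    return mask

def required_atlases(mask, num_atlases):
    """Expand the combined bitmask into the set of atlas indices to decode."""
    return [i for i in range(num_atlases) if mask & (1 << i)]

# Example: 4 atlases, 3 views; views 0 and 2 are potentially visible.
masks = [0b0011, 0b0100, 0b1000]
combined = combined_atlas_mask(masks, [0, 2])   # -> 0b1011
```

Only the atlas sub-bitstreams whose bit survives the OR need to be fetched and decoded.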
- the bitmask may be embedded in the miv_view_params_list() sub-structure of the adaptation_params_rbsp () structure in 3VC.
- the newly added mvp_atlas_map_flag indicates whether atlas map mask information is available for a given view.
- the newly added mvp_atlas_map_mask contains the bitmask of atlases where patches linking to the given view may be found.
- the length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
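As a sketch of how a decoder might read these elements, assuming a simple MSB-first bit reader (the BitReader class and the byte layout are assumptions for illustration, not the normative parsing process of ISO/IEC 23090-5):

```python
class BitReader:
    def __init__(self, data: bytes):
        self.data, self.pos = data, 0

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer, MSB first."""
        val = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            val = (val << 1) | bit
            self.pos += 1
        return val

def parse_view_atlas_map(reader, num_views, vps_atlas_count_minus1):
    """Read mvp_atlas_map_flag / mvp_atlas_map_mask for each view."""
    mask_len = vps_atlas_count_minus1 + 1   # one mask bit per atlas
    masks = {}
    for view_idx in range(num_views):
        if reader.u(1):                      # mvp_atlas_map_flag
            masks[view_idx] = reader.u(mask_len)  # mvp_atlas_map_mask
    return masks
```

For example, with 2 views and 4 atlases (vps_atlas_count_minus1 equal to 3), the first view can signal a mask while the second omits it.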
- FIG. 3 also shows the example modified miv_view_params_list () sub-structure 300 of the adaptation_params_rbsp () structure in 3VC, with the modification highlighted as item 302.
- a temporal update of the atlas map can be done together with a camera extrinsic in the sub-structure miv_view_params_update_extrinsics() of the adaptation_params_rbsp () structure in 3VC.
- An example modified miv_view_params_update_extrinsics () structure is as follows:
- the newly added mvpue_atlas_map_flag indicates whether atlas map mask information is available for a given view.
- the newly added mvpue_atlas_map_mask contains the bitmask of atlases where patches linking to the given view may be found.
- the length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
- FIG. 4 also shows the example modified miv_view_params_update_extrinsics () sub-structure 400 of the adaptation_params_rbsp () structure in 3VC, with the modification highlighted as item 402.
- a temporal update is done as a newly added structure miv_atlas_map_update () of the adaptation_params_rbsp () structure in 3VC.
- An example modified adaptation_params_rbsp () structure is as follows:
- FIG. 5 also shows the example modified adaptation_params_rbsp () structure 500 in 3VC, with the modification highlighted as item 502 which includes a new structure miv_atlas_map_update ().
- FIG. 6 also shows the example miv_atlas_map_update() structure 600.
- the encoder may optimize the patch layout so that patches belonging to a certain view are grouped together in a single atlas. This makes the view-based culling of atlases more effective.
- a patch information SEI message is extended to include an atlas map element that would inform a renderer in what other atlases the object is present.
- Each object can have visibility information, and a renderer can perform culling based on this information.
- a renderer could request the needed atlases (that can be mapped to tracks) from a file parser.
- An example modified patch_information(payload_size) structure is provided below.
- the newly added pi_patch_atlas_map_mask contains the bitmask of atlases where patches linking to the given object can be found.
- the length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
- FIG. 7 also shows the example modified patch information SEI message 700, with the modification highlighted as item 702.
- In another embodiment, the scene_object_information SEI message is contained in the universally accessible "master" atlas (i.e. the atlas with vuh_atlas_id equal to 0x3F).
- the scene_object_information SEI message is extended to provide mapping of object IDs to atlases. This metadata may indicate, for every object, the atlas that contains patches referring back to the object in question.
- An example modified scene_object_information (payloadSize) SEI message is as follows.
- soi_object_atlas_map_mask contains the bitmask of atlases where patches linking to the given object can be found.
- the length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
- FIG. 8A, FIG. 8B, and FIG. 8C also show the example modified scene object information SEI message as collectively items 800, 810, and 820, with the modification highlighted as item 802 within FIG. 8B.
- FIG. 8A, FIG. 8B, and FIG. 8C collectively constitute FIG. 8.
- a special atlas with a predefined atlas ID is specified to contain view metadata, while the patch metadata is contained in per-atlas metadata units.
- these per-atlas metadata units are moved from separate atlases to the "master atlas" in order to make them universally available.
- the signaling related aspects of this embodiment are largely covered in U.S. provisional application no. 62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF".
- the novelty of this embodiment includes the decoding process, which allows a decoder to cull whole atlases based on patch metadata.
- the renderer may cull all patches against the current rendering viewing frustum, and decode only the atlas sub-bitstreams that contain potentially visible patches. This can be implemented in several ways, of which two examples are:
- Perform the per-view visibility culling of Embodiment 1 (1. View-to-atlas mapping metadata) first, then process only patches referring to a potentially visible view, and mark the relevant atlases as required.
- After finding the required atlases, access to those atlases may continue as in Embodiment 1 (1. View-to-atlas mapping metadata), potentially via a network request before decoding the relevant atlas sub-bitstream.
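The two decoding strategies above (looping over patch atlases versus culling views first) might be sketched as follows; the patch model and visibility tests are illustrative assumptions, only the control flow follows the text.

```python
def atlases_by_patch_culling(atlases, patch_visible):
    """Option 1: loop over all patch atlases; mark an atlas as required
    once a first potentially visible patch is found, then move on."""
    required = set()
    for atlas_id, patches in atlases.items():
        if any(patch_visible(p) for p in patches):
            required.add(atlas_id)      # no need to test remaining patches
    return required

def atlases_by_view_culling(atlases, visible_views):
    """Option 2: cull views first, then keep only atlases that contain
    patches referring back to a potentially visible view."""
    return {
        atlas_id
        for atlas_id, patches in atlases.items()
        if any(p["view"] in visible_views for p in patches)
    }
```

Either result set can then drive network requests or decoder scheduling for the corresponding atlas sub-bitstreams.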
- the V-PCC bitstream containing the coded point cloud sequence (CPCS) that is composed of VPCC units carrying V-PCC parameter set (VPS) data, more than one atlas bitstream, and more than one 2D video encoded bitstreams is stored in ISOBMFF.
- An example of such a V-PCC bitstream is one carrying volumetric video compressed according to MPEG Immersive Video (MIV) defined in ISO/IEC 23090-12.
- each atlas bitstream is encapsulated in a separate V-PCC track.
- One of those tracks is interpreted as a parameter track that is part of the multi-atlas V-PCC bitstream, while other tracks are interpreted as normal V-PCC tracks that are part of the multi-atlas V-PCC bitstream.
- a V-PCC track is part of the multi-atlas V-PCC bitstream when it contains a 'mapt' track reference to another V-PCC track and has a sample entry type equal to 'vpc1' or 'vpcg'.
- This referenced track is referred to as the parameter track of the multi-atlas V-PCC bitstream and could have a sample entry type equal to 'vpcP'.
- a parameter track does not include ACL NAL units.
- a normal track does not carry ACL NAL units belonging to more than one atlas.
- all the atlas NAL units that apply to the entire V-PCC access unit are carried in the parameter track.
- These atlas NAL units include (but are not limited to) adaptation_params_rbsp, SEI messages as well as EOB and EOS NAL units, when present.
- the atlas NAL units that do not apply to a given atlas are not carried in the normal track containing that atlas.
- the NAL units that apply to an atlas are carried in the normal track containing that atlas.
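The NAL-unit partitioning rules above can be illustrated with a small routing sketch; the NAL unit model (a type string plus an optional atlas_id) is an assumed simplification, not the real bitstream syntax.

```python
def route_nal_units(nal_units):
    """Route atlas NAL units following the rules above: units that apply to
    the entire access unit (parameter sets, adaptation_params_rbsp, SEI,
    EOB/EOS) go to the parameter track; ACL units go to the normal track
    of the atlas they belong to.

    nal_units: dicts with 'type' and, for ACL units, an 'atlas_id'.
    """
    parameter_track = []
    normal_tracks = {}          # atlas_id -> list of ACL NAL units
    for nal in nal_units:
        if nal["type"] == "ACL":
            normal_tracks.setdefault(nal["atlas_id"], []).append(nal)
        else:
            parameter_track.append(nal)
    return parameter_track, normal_tracks
```

This keeps every normal track free of NAL units belonging to other atlases, as required by the design.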
- In order to enable view-frustum culling, i.e. culling objects outside of the user's current view of the scene, a sample group is defined. It provides a mapping of an atlas to a view. Due to the use of a sample group (or one or more sample groups), the mapping can change along the timeline of the volumetric video.
- Quantity: Zero or more
- a view information sample group entry identifies which views are carried by samples.
- the grouping_type_parameter is not defined for the SampleToGroupBox with grouping type 'vpvi'.
- a view information sample group entry may also provide information on which other tracks, atlases, or tile groups carry samples with data containing the same view.
- Semantics group_id specifies the unique identifier of the group.
- num_views specifies the number of views carried by samples.
- view_index specifies the index of a view carried by samples. The index is mapped to the view index in the active adaptation_params_rbsp.
- num_atlases specifies the number of atlases, other than the atlas contained in the track the sample group belongs to, that contain samples with the view with the index equal to view_index.
- atlas_id specifies the id of an atlas that contains samples with the view with the index equal to view_index.
- num_tile_groups specifies the number of tile groups within the atlas with id equal to atlas_id that contain samples with the view with the index equal to view_index.
- When num_tile_groups equals 0, all tile groups belonging to the atlas with id equal to atlas_id contain samples with the view with the index equal to view_index.
- tile_groups_address specifies the address of the tile group within the atlas with id equal to atlas_id that contains samples with the view with the index equal to view_index.
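The semantics above can be captured as a plain data structure; the field names follow the text, while the container classes themselves are assumptions for illustration, not the normative ISOBMFF box definition.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TileGroupRef:
    tile_group_address: int          # tile group within the referenced atlas

@dataclass
class AtlasRef:
    atlas_id: int                    # another atlas carrying the same view
    tile_groups: List[TileGroupRef] = field(default_factory=list)
    # an empty list corresponds to num_tile_groups == 0, i.e. all tile
    # groups of the atlas carry the view

@dataclass
class ViewRef:
    view_index: int                  # index into the active adaptation_params_rbsp
    other_atlases: List[AtlasRef] = field(default_factory=list)

@dataclass
class ViewInfoSampleGroupEntry:      # grouping type 'vpvi'
    group_id: int                    # unique identifier of the group
    views: List[ViewRef] = field(default_factory=list)  # length == num_views
```

A parser can populate these entries per track and consult them when deciding which tracks to fetch for a given view frustum.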
- In order to enable view-frustum culling, i.e. culling objects outside of the user's current view of the scene, a sample group is defined. It provides a mapping of an atlas to an object.
- the object may include visibility cone information that can be used for culling. Due to the use of a sample group (or one or more sample groups), the mapping can change along the timeline of the volumetric video.
- an object information sample group entry identifies which objects are carried by samples.
- the grouping_type_parameter is not defined for the SampleToGroupBox with grouping type 'vpoi'.
- an object information sample group entry may also provide information on which other tracks carry samples with data containing the same object.
- Semantics group_id specifies the unique identifier of the group.
- num_objects specifies the number of objects carried by samples.
- object_index specifies the index of an object carried by samples. The index is mapped to object index soi_object_idx in the active scene_object_information SEI message.
- num_atlases specifies the number of atlases, other than the atlas contained in the track the sample group belongs to, that contain samples with the object with the index equal to object_index.
- atlas_id specifies the id of an atlas that contains samples with the object with the index equal to object_index.
- num_tile_groups specifies the number of tile groups within the atlas with id equal to atlas_id that contain samples with the object with the index equal to object_index. When num_tile_groups equals 0, all tile groups belonging to the atlas with id equal to atlas_id contain samples with the object with the index equal to object_index.
- tile_groups_address specifies the address of the tile group within the atlas with id equal to atlas_id that contains samples with the object with the index equal to object_index.
- V-PCC parameter tracks contain adaptation_params_rbsp with additional signaling of atlas_map per view as described in 1. 'View-to-atlas mapping metadata'.
- Each atlas is carried by one track, and based on adaptation_params_rbsp an application informs a file parser which atlases are required at a given time.
- the file parser maps atlas ids to track ids based on the atlas_id in VPCCUnitHeaderBox that is contained in the VPCCSampleEntry of every V-PCC track carrying atlas data.
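The file-parser step described above might look as follows; the track records are assumed stand-ins (dicts with 'atlas_id' read from the VPCCUnitHeaderBox in each track's VPCCSampleEntry), not real ISOBMFF parsing.

```python
def build_atlas_to_track_map(tracks):
    """tracks: iterable of dicts with 'atlas_id' (from the VPCCUnitHeaderBox
    in the track's VPCCSampleEntry) and 'track_id'."""
    return {t["atlas_id"]: t["track_id"] for t in tracks}

def tracks_for_required_atlases(tracks, required_atlas_ids):
    """Resolve the atlas ids requested by the application to track ids,
    skipping any atlas id not present in the file."""
    mapping = build_atlas_to_track_map(tracks)
    return [mapping[a] for a in required_atlas_ids if a in mapping]
```

The application supplies required_atlas_ids (e.g. from the adaptation_params_rbsp atlas map), and the parser then reads samples only from the returned tracks.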
- V-PCC parameter tracks contain scene_object_information with additional signaling of atlas_map per view as described in 1. 'View-to-atlas mapping metadata'. Each atlas is carried by one track, and based on scene_object_information an application informs a file parser which atlases are required at a given time. The file parser maps atlas ids to track ids based on the atlas_id in VPCCUnitHeaderBox that is contained in the VPCCSampleEntry of every V-PCC track carrying atlas data.
- a V-PCC bitstream containing a coded point cloud sequence (CPCS) that is composed of VPCC units carrying V-PCC parameter set (VPS) data, one atlas bitstream, and more than one 2D video encoded bitstreams is stored in ISOBMFF.
- An example of such a V-PCC bitstream is one carrying volumetric video compressed according to V-PCC defined in ISO/IEC 23090-5.
- The atlas bitstream is encapsulated in separate V-PCC tracks. One of those tracks is interpreted as a tile parameter track that is part of the V-PCC bitstream, while other tracks are interpreted as tile tracks that are part of the V-PCC bitstream. Each tile track carries samples containing one or more atlas_tile_group_layer_rbsp structures.
- a tile track is part of the V-PCC bitstream when it contains a 'mtpt' track reference to another V-PCC track and has a sample entry type equal to 'vpt1' or 'vptg'.
- This referenced track is referred to as the tile parameter track of the V-PCC bitstream and could have a sample entry type equal to 'vptP'.
- a tile parameter track does not include ACL NAL units.
- For any atlas access unit carried by samples in a tile parameter track and a number of tile tracks, all the atlas NAL units that apply to the entire atlas access unit are carried in the tile parameter track.
- These atlas NAL units include (but are not limited to) adaptation_params_rbsp, atlas_sequence_parameters_rbsp, atlas_frame_parameters_rbsp, SEI messages as well as EOB and EOS NAL units, when present.
- Each of the tile tracks may contain ObjectInformationSampleGroupEntry or
- the proposed signaling does not enable extraction of components arbitrarily, and the components always need to relate to the same atlas.
- the extraction happens for one or more sets of components belonging to the same atlas, rather than just extracting a component. Belonging in the same atlas means that the components share the same atlas id.
- the component can be atlas data or video coded occupancy, attribute or geometry data.
- the embodiments described herein do not necessarily relate to extracting or culling single components, but sets of components that represent a partial portion of the scene. With the atlas to view and atlas to object mapping, entire atlases may be culled.
- FIG. 9 is an example apparatus 900, which may be implemented in hardware, configured to implement efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein.
- the apparatus 900 comprises at least one processor 902 and at least one non-transitory memory 904 including computer program code 905, wherein the at least one memory 904 and the computer program code 905 are configured to, with the at least one processor 902, cause the apparatus to implement a process, component, module, or function (collectively 906) to implement efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein.
- the apparatus 900 optionally includes a display and/or I/O interface 908 that may be used to display a culled bitstream.
- the apparatus 900 also includes one or more network (NW) interfaces (I/F(s)) 910.
- the I/F(s) 910 may be wired and/or wireless and communicate over a channel or the Internet/other network(s) via any communication technique.
- the NW I/F(s) 910 may comprise one or more transmitters and one or more receivers.
- the NW I/F(s) 910 may comprise standard well-known components such as an amplifier, filter, frequency-converter, (de)modulator, and encoder/decoder circuitry(ies) and one or more antennas.
- the processor 902 is configured to implement item 906 without use of memory 904.
- the apparatus 900 may be a remote, virtual or cloud apparatus.
- the apparatus 900 may be either a writer or a reader (e.g. parser), or both a writer and a reader (e.g. parser).
- the apparatus 900 may be either a coder or a decoder, or both a coder and a decoder.
- the apparatus 900 may be a user equipment (UE), a head mounted display (HMD), or any other fixed or mobile device.
- the memory 904 may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory.
- the memory 904 may comprise a database for storing data.
- Interface 912 enables data communication between the various items of apparatus 900, as shown in FIG. 9.
- Interface 912 may be one or more buses, or interface 912 may be one or more software interfaces configured to pass data between the items of apparatus 900.
- the interface 912 may be one or more buses such as address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like.
- the apparatus 900 need not comprise each of the features mentioned, or may comprise other features as well.
- FIG. 10 is an example method 1000 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein.
- the method includes providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of items 1004, 1006, 1008, or 1010.
- the method includes wherein the cull signaling comprises view-to-atlas mapping metadata that enables culling of sub bitstreams via per-view visibility culling.
- the method includes wherein the cull signaling comprises object- to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling.
- the method includes wherein the cull signaling comprises patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling.
- the method includes wherein the cull signaling comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
- Method 1000 may be implemented by apparatus 900.
- FIG. 11 is another example method 1100 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein.
- the method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream, wherein the information comprises one or more of 1104, 1106, 1108, or 1110.
- the method includes wherein the information comprises atlas- to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view.
- the method includes wherein the information comprises atlas- to-object mapping metadata that indicates an association between at least one object and the at least one atlas.
- the method includes wherein the information comprises patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling.
- the method includes wherein the information comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level.
- the method includes culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- Method 1100 may be implemented by a decoder apparatus, or by apparatus 900.
- FIG. 12 is another example method 1200 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein.
- the method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream, wherein the information comprises one or more of 1204, 1206, 1208, or 1210.
- the method includes wherein the information comprises atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view.
- the method includes wherein the information comprises atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas.
- the method includes wherein the information comprises patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling.
- the method includes wherein the information comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level.
- the method includes transmitting the information to a receiving device.
- Method 1200 may be implemented by an encoder apparatus, or by apparatus 900.
- references to a 'computer', 'processor', etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other processing circuitry.
- References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device such as instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device, etc.
- circuitry may refer to any of the following: (a) hardware circuit implementations, such as implementations in analog and/or digital circuitry, and (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
- circuitry would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware.
- the term 'circuitry' would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device.
- Circuitry may also be used to mean a function or a process, such as one implemented by an encoder or decoder, or a codec.
- an example apparatus may be provided that includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to perform: provide signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: cull at least one volumetric video atlas using the provided signaling; and render a view frustum corresponding to the non-culled volumetric video atlas bitstreams.
- the apparatus may further include wherein the view- to-atlas mapping metadata is a bitmask of N bits, where N is a number of atlas sub-bitstreams.
- the apparatus may further include wherein the bitmask is embedded in a view parameter substructure of an adaptation parameter structure.
- the view- to-atlas mapping metadata comprises a temporal update of an atlas map together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
- the apparatus may further include wherein the view- to-atlas mapping metadata comprises a temporal update as an atlas map update substructure of an adaptation parameter structure.
- the apparatus may further include wherein the at least one volumetric video atlas is culled after the volumetric video atlas has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
- the apparatus may further include wherein the object- to-atlas mapping metadata comprises an atlas map element to inform a renderer in what other atlases an object is present, wherein the atlas map element extends a patch information supplemental enhancement information message.
- the apparatus may further include wherein the object- to-atlas mapping metadata comprises an extension to a scene object information supplemental enhancement information message to provide a mapping of object identifiers (IDs) to atlases.
- the apparatus may further include wherein the extension is implemented as a bitmask and indicates, for every object, an atlas that contains patches referring back to the respective object.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform, to implement the fine-grained patch visibility culling, either: loop over all patch atlases, detect potentially visible patches, mark an atlas as required once a first potentially visible patch is found, and move to a next patch atlas; or perform the per-view visibility culling, process patches referring to a potentially visible view, and mark relevant atlases as required.
- the apparatus may further include wherein when the at least one volumetric video atlas bitstream contains a coded point cloud sequence (CPCS) that is composed of units carrying V-PCC parameter set (VPS) data, more than one atlas bitstream, and more than one 2D video encoded bitstreams, the at least one volumetric video atlas bitstream is stored in ISOBMFF.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: define a sample group entry to provide a mapping of an atlas to a view to enable a view-frustum culling.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: define a sample group entry to provide a mapping of an atlas to an object to enable a view-frustum culling.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: inform a file parser which atlases are required at a given time with signaling within an adaptation parameters structure.
- the apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: inform a file parser which atlases are required at a given time with signaling within a supplemental enhancement information scene object information message.
- the apparatus may further include wherein when the at least one volumetric video atlas bitstream contains a coded point cloud sequence (CPCS) that is composed of units carrying V-PCC parameter set (VPS) data, one atlas bitstream, and more than one 2D video encoded bitstreams, the at least one volumetric video atlas bitstream is stored in ISOBMFF.
- the apparatus may further include wherein the at least one volumetric video atlas is culled without having to access every atlas metadata bitstream.
- an example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations may be provided, the operations comprising: providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
- an example method includes providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
- an example apparatus includes means for providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
- An example apparatus includes means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to- view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to- object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- the apparatus may further include means for rendering a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
- the atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
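The bitmask-based mapping described above can be sketched as follows. This is a minimal illustration only: the byte-aligned one-bitmask-per-atlas layout and the function name are assumptions, not the normative SEI syntax defined in the specification.

```python
def parse_atlas_view_bitmask(payload: bytes, num_views: int) -> dict:
    """Decode a hypothetical atlas-view mapping bitmask payload.

    Assumes one byte-aligned bitmask per atlas, with bit i set when
    view i contributes patches to that atlas. The actual supplemental
    enhancement information message syntax is defined normatively;
    this layout is only an illustration of the mapping concept.
    """
    bytes_per_atlas = (num_views + 7) // 8
    num_atlases = len(payload) // bytes_per_atlas
    mapping = {}
    for a in range(num_atlases):
        chunk = payload[a * bytes_per_atlas:(a + 1) * bytes_per_atlas]
        # Collect the indices of views whose bit is set for this atlas.
        mapping[a] = [v for v in range(num_views)
                      if chunk[v // 8] & (1 << (v % 8))]
    return mapping
```

Given such a mapping, a client can decide per atlas whether any of its contributing views is visible before decoding the corresponding sub-bitstreams.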
- the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message.
- the persistence may be specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
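The persistence-flag semantics above amount to a small state machine over atlas frames in decoding order. The sketch below illustrates that logic under assumed field names (`sei`, `persistence_flag`, `new_sequence`); the normative behavior is given by the message definition itself.

```python
def resolve_active_sei(frames):
    """Track which atlas-view SEI message applies to each atlas frame.

    `frames` is a list of dicts with optional keys:
      'sei'          - an atlas-view SEI carried on this frame,
                       holding a 'persistence_flag' field
      'new_sequence' - True when a new coded sequence begins here
    Returns the SEI active for each frame, or None. Field names are
    illustrative assumptions, not normative syntax.
    """
    active = None
    result = []
    for frame in frames:
        if frame.get('new_sequence'):
            active = None                # persistence ends at a new sequence
        sei = frame.get('sei')
        if sei is not None:
            # A frame carrying its own SEI uses that message; it persists
            # to later frames only when the persistence flag equals one.
            result.append(sei)
            active = sei if sei.get('persistence_flag') == 1 else None
        else:
            result.append(active)
    return result
```

A flag equal to zero therefore affects only the carrying frame, while a flag equal to one carries forward until a new sequence, the end of the bitstream, or the next frame with its own message.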
- the at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the apparatus may further include means for interpreting a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
- the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
- the one or more sets of components belonging to the same atlas may share an atlas identifier.
- a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene.
- the apparatus may further include means for culling an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
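Whole-atlas culling against a view frustum can be sketched as below. The function and input shapes are illustrative assumptions: `atlas_to_views` stands for the atlas-to-view mapping metadata, and `visible_views` for the outcome of the renderer's frustum test on the views.

```python
def atlases_to_keep(atlas_to_views, visible_views):
    """Decide which atlases survive per-view visibility culling.

    An atlas is kept when at least one of its contributing views is
    inside the view frustum; otherwise every sub-bitstream sharing
    that atlas identifier can be dropped before decoding.

    atlas_to_views: dict mapping atlas id -> list of view ids
                    (derived from atlas-to-view mapping metadata)
    visible_views:  set of view ids the frustum test marked visible
    """
    return {atlas_id for atlas_id, views in atlas_to_views.items()
            if any(v in visible_views for v in views)}
```

The same shape of test applies to per-object culling by substituting the atlas-to-object mapping and the set of visible objects.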
- the atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and may indicate a value of the at least one object given an atlas identifier and an index of the at least one object.
- the atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object.
- the atlas-to-object mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
- the atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
- An example apparatus includes means for providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for transmitting the information to a receiving device.
- the information may be provided using at least one sample group entry object.
- the atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
- the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the apparatus may further include means for encoding the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the apparatus may further include means for defining a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
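A sample group entry of the kind described, stored alongside per-atlas tracks, could be modeled as below. The grouping type code, field layout, and function names are hypothetical stand-ins for the file-format syntax, shown only to illustrate track-level culling.

```python
from dataclasses import dataclass, field

@dataclass
class AtlasObjectSampleGroupEntry:
    """Illustrative sample group entry mapping an atlas track to objects.

    With each atlas stored in its own track, an entry of this shape
    would let a file reader skip whole tracks whose objects fall
    outside the view frustum, without parsing their samples. The
    'a2om' code and field names are assumptions, not normative syntax.
    """
    grouping_type: str = "a2om"
    atlas_id: int = 0
    object_ids: list = field(default_factory=list)

def tracks_to_fetch(entries, visible_objects):
    """Keep only atlas tracks containing at least one visible object."""
    return [e.atlas_id for e in entries
            if any(o in visible_objects for o in e.object_ids)]
```

This realizes culling on the file format level: the decision is made from grouping metadata alone, before any sub-bitstream is extracted or decoded.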
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
- the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
- the one or more sets of components belonging to the same atlas may share an atlas identifier.
- a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene.
- An entire atlas may be culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
- the atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and may indicate a value of the at least one object given an atlas identifier and an index of the at least one object.
- the atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object.
- the atlas-to-object mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
- the atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the apparatus may further include means for encoding the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
- An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: receive information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and cull the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: render a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
- the atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
- the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: interpret a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
- the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
- the one or more sets of components belonging to the same atlas may share an atlas identifier.
- a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: cull an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
- the atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object.
- the atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object.
- the atlas-to-object mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
- the atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: render a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled; and wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
- the atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
- the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message.
- the persistence may be specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and where the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: interpret a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas, the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure, the one or more sets of components belonging to the same atlas may share an atlas identifier, or a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: cull an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata, wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
- the atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmit the information to a receiving device.
- the information may be provided using at least one sample group entry object.
- the atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
- the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video- based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: define a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
- the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
- the one or more sets of components belonging to the same atlas may share an atlas identifier.
- a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene.
- An entire atlas may be culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
- the atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object.
- the atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object.
- the atlas-to-object mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
- the atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message.
- the persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
- Other aspects of the apparatus may include the following.
- the information may be provided using at least one sample group entry object, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
- the atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- the at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video- based coding sequences comprise at least one video based point cloud coding parameter set, and where the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: define a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view- frustum culling.
- the information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
- the volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas, where the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure, the one or more sets of components belonging to the same atlas share an atlas identifier, or a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene.
- the at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas, wherein an entire atlas is culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
- the atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object, wherein the atlas-to-object mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
- An example method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- the method may further include rendering a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
- An example method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
- An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations comprising: receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
- An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
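The atlas-to-object mapping message described in these claims pairs a persistence flag with a bitmask over atlases. A minimal interpretation sketch follows; the field layout and all names here are hypothetical illustrations, not the normative SEI syntax:

```python
# Illustrative (non-normative) interpretation of an atlas-to-object bitmask:
# bit j of the mask for object i means atlas j contains patches referring
# back to object i. The persistence flag follows the claim language: 0 means
# the message applies to the current atlas frame only; 1 means it persists
# for subsequent atlas frames in decoding order.
def parse_atlas_object_mask(persistence_flag, masks, num_atlases):
    """masks: {object_index: int bitmask}. Returns a mapping from each
    object index to the list of atlas indices whose bit is set."""
    mapping = {
        obj: [a for a in range(num_atlases) if mask & (1 << a)]
        for obj, mask in masks.items()
    }
    persists = (persistence_flag == 1)
    return mapping, persists
```

A receiver could then cull every atlas whose index appears in no visible object's list.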
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Computer Security & Cryptography (AREA)
- Library & Information Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An apparatus includes means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream, the information comprising one or more of (1102): atlas-to-view mapping metadata indicating an association between patches in at least one atlas and at least one view (1104); atlas-to-object mapping metadata indicating an association between at least one object and the at least one atlas (1106); patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling (1108); or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level (1110); and means for culling the one or more sets of components from the at least one volumetric video bitstream, based on the information (1112).
Description
Efficient Culling Of Volumetric Video Atlas Bitstreams
TECHNICAL FIELD
[0001] The examples and non-limiting embodiments relate generally to video codecs, and more particularly, to efficient culling of volumetric video atlas bitstreams.
BACKGROUND
[0002] It is known to perform video coding and decoding.
SUMMARY
[0003] In accordance with an aspect, an apparatus includes means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[0004] In accordance with an aspect, an apparatus includes means for providing information related to a culling of one or
more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for transmitting the information to a receiving device.
[0005] In accordance with an aspect, an apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: receive information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and cull the one or more sets of components
belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[0006] In accordance with an aspect, an apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmit the information to a receiving device.
[0007] In accordance with an aspect, a method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained
patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[0008] In accordance with an aspect, a method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
[0009] In accordance with an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view;
atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[0010] In accordance with an aspect, a non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The foregoing aspects and other features are explained in the following description, taken in connection with the accompanying drawings, wherein:
[0012] FIG. 1A, FIG. 1B, and FIG. 1C (collectively FIG. 1) depict a 3VC elementary stream structure for one atlas (patch data and video encoded components).
[0013] FIG. 2 is a diagram depicting relationships between objects and V-PCC elements (patches and volumetric 2D rectangles).
[0014] FIG. 3 shows an example modified miv_view_params_list () sub-structure of the adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted.
[0015] FIG. 4 shows an example modified miv_view_params_update_extrinsics () sub-structure of the adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted.
[0016] FIG. 5 shows an example modified adaptation_params_rbsp () structure in 3VC (as specified in WD4 d24 of ISO/IEC 23090-12), with the modification highlighted which includes a new structure miv_atlas_map_update ().
[0017] FIG. 6 shows an example miv_atlas_map_update () structure.
[0018] FIG. 7 shows an example modified patch information SEI message, with the modification highlighted.
[0019] FIG. 8A shows a first part of an example modified scene object information SEI message, and wherein collectively FIG. 8A, FIG. 8B, and FIG. 8C are FIG. 8.
[0020] FIG. 8B shows a second part of the example modified scene object information SEI message, with the modification highlighted.
[0021] FIG. 8C shows a third part of the example modified scene object information SEI message.
[0022] FIG. 9 is an example apparatus, which may be implemented in hardware, configured to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
[0023] FIG. 10 is an example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
[0024] FIG. 11 is another example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
[0025] FIG. 12 is another example method to implement efficient culling of volumetric video atlas bitstreams, based on the examples described herein.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
[0026] The following acronyms and abbreviations that may be found in the specification and/or the drawing figures are defined as follows:
2D two-dimensional
3D or 3d three-dimensional
3GPP 3rd Generation Partnership Project
3VC video-based volumetric video coding standard, or volumetric video coding
ACL atlas coding layer
AFPS atlas frame parameter set
ASIC application specific integrated circuit
ASPS atlas sequence parameter set
ATGDU atlas tile group data unit
CD committee draft
CPCS coded point cloud sequence
CTU coding tree unit
DIS draft international standard
EOB end of bitstream
EOS end of sequence
Exp exponential
FDIS final draft international standard
f(n) fixed-pattern bit string using n bits written (from left to right) with the left bit first
FPGA field programmable gate array
GOP group of picture(s)
HEVC high efficiency video coding
HMD head mounted display
ID or id identifier
IEC International Electrotechnical Commission
info information
I/O input/output
IRAP intra random access picture
ISO International Organization for Standardization
ISOBMFF ISO base media file format (ISO/IEC 14496-12)
MIV MPEG Immersive Video standard, or Metadata for Immersive Video
MP4 MPEG-4 Part 14
MPEG moving picture experts group
NAL network abstraction layer
NW network
params parameters
RBSP or rbsp raw byte sequence payload
SEI supplemental enhancement information
u(n) unsigned integer using n bits
u(v) unsigned integer where the number of bits varies in a manner dependent on the value of other syntax elements
UE user equipment
ue(v) unsigned integer 0-th order Exp-Golomb-coded syntax element with the left bit first
V3C visual volumetric video-based coding
VPCC or V-PCC video based point cloud coding standard or video-based point cloud compression
VPS V-PCC parameter set
WD4 working draft 4
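The ue(v) descriptor listed above (0-th order Exp-Golomb, left bit first) can be decoded in a few lines. This sketch takes a string of '0'/'1' characters for clarity; a real parser would operate on a bit reader:

```python
def decode_ue(bits):
    """Decode one ue(v) syntax element (0-th order Exp-Golomb, left bit
    first) from a string of '0'/'1' chars. Returns (value, bits_consumed)."""
    leading_zeros = 0
    while leading_zeros < len(bits) and bits[leading_zeros] == '0':
        leading_zeros += 1
    # Codeword is N zeros, a '1', then N suffix bits: value = 2^N - 1 + suffix
    total = 2 * leading_zeros + 1
    suffix = bits[leading_zeros + 1:total]
    value = (1 << leading_zeros) - 1 + (int(suffix, 2) if suffix else 0)
    return value, total
```

For example, the codeword "1" decodes to 0 and "011" decodes to 2, matching the Exp-Golomb code table used by HEVC and the V3C atlas syntax.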
[0027] The examples referred to herein relate to volumetric video coding, where dynamic 3D objects or scenes are coded into video streams for delivery and playback. The MPEG standards V-PCC (Video-based Point Cloud Compression) and MIV (Metadata for Immersive Video) are two examples of such volumetric video compression. These standards share a common base standard, 3VC (Volumetric Video Coding).
[0028] In both V-PCC and MIV, a similar methodology is adopted: the 3D scene is segmented into a number of regions according to heuristics based on, for example, spatial proximity and/or similarity of the data in the region. The segmented regions are projected into 2D patches, where each patch contains at least surface texture and depth channels, the depth channel giving the displacement of the surface pixels from the 2D view plane associated with that patch. The patches
are further packed into an atlas that can be encoded and streamed as a regular 2D video.
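The projection step described above can be illustrated with a minimal sketch: each point of a segmented region is projected orthographically onto a 2D view plane, and the displacement along the projection axis becomes the depth channel. The function name and data layout are illustrative only, not part of any specification:

```python
# Hypothetical sketch: orthographically project the 3D points of one
# segmented region onto an axis-aligned view plane, producing a per-pixel
# depth (displacement of the surface from the view plane).
def project_patch(points, axis):
    """points: iterable of (x, y, z); axis: 0, 1, or 2, the projection axis.
    Returns {(u, v): depth}, keeping the nearest surface point per pixel."""
    u_axis, v_axis = [a for a in (0, 1, 2) if a != axis]
    depth_map = {}
    for p in points:
        u, v, d = p[u_axis], p[v_axis], p[axis]
        # Keep the point closest to the view plane (smallest displacement)
        if (u, v) not in depth_map or d < depth_map[(u, v)]:
            depth_map[(u, v)] = d
    return depth_map
```

The resulting depth maps, together with the corresponding texture samples, are what get packed into the 2D atlas and compressed with a regular video codec.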
[0029] As defined in ISO/IEC 23090-5, a 3VC bitstream may contain one or more atlases. An atlas consists of an atlas metadata bitstream (atlas_sub_bitstream) and video encoded component bitstreams (video_sub_bitstreams). The atlas metadata bitstream carries patch layout information for related video encoded component bitstreams. To support signaling of shared parameter sets across atlases, MIV introduced the concept of a special atlas, or master atlas, of specific type 0x3F. This master atlas only contains the atlas metadata bitstream, where common parameters such as view or camera parameters may be signaled. FIG. 1A, FIG. 1B, and FIG. 1C (collectively FIG. 1) depict the 3VC bitstream structure 100 for a single atlas, where atlases are signaled in vpcc_unit_headers.
[0030] As shown in FIG. 1, the 3VC bitstream structure 100 includes a V-PCC bitstream 102, an atlas sub-bitstream 104, and an atlas tile group layer RBSP 106. Included in the V-PCC bitstream are a plurality of VPCC unit headers 110 (including 110-2, 110-3, 110-4, and 110-5), a VPCC sample stream precision 112, a plurality of VPCC sample stream sizes 114 (including 114-2, 114-3, 114-4, and 114-5), a VPS 115 associated with a VPCC unit payload, an atlas sub-bitstream 117 associated with a VPCC unit payload, and a plurality of video sub-bitstreams (116-3, 116-4, and 116-5) each associated with a VPCC unit payload. As shown in FIG. 1, VPCC unit header 110 has a volumetric unit header type of VPCC_VPS for VPS, VPCC unit header 110-2 has a volumetric unit header type of VPCC_AD for atlas data, VPCC unit header 110-3 has a volumetric unit header type of VPCC_OVD for occupancy video data, VPCC unit header 110-4 has a volumetric unit header type of VPCC_GVD for geometry video data, and VPCC unit header 110-5 has a volumetric unit header type of VPCC_AVD for attribute video data. In some examples, size 114 corresponds to the size of items 110 and 115, size 114-2 corresponds to the size of items 110-2 and 117, size 114-3 corresponds to the size of items 110-3 and 116-3, size 114-4 corresponds to the size of items 110-4 and 116-4, and size 114-5 corresponds to the size of items 110-5 and 116-5 (where, for example, the unit of size is the number of RBSP bytes).
[0031] As further shown in FIG. 1, atlas sub-bitstream 104 includes a NAL sample stream precision 122, a plurality of NAL sample stream sizes 124 (including 124-2, 124-3, 124-4, 124-5, 124-6, and 124-7), a plurality of NAL unit headers 120 (including 120-2, 120-3, 120-4, 120-5, 120-6, and 120-7), an ASPS 126 having a number of RBSP bytes, an AFPS 127 having a number of RBSP bytes, a NAL prefix SEI 128 having a number of RBSP bytes, a plurality of atlas tile group layer raw byte sequence payloads 130 (including 130-2 and 130-3) having a number of RBSP bytes, and a NAL suffix SEI 132 having a number of RBSP bytes. In some examples, size 124 corresponds to the size of items 120 and 126, size 124-2 corresponds to the size of items 120-2 and 127, size 124-3 corresponds to the size of items 120-3 and 128, size 124-4 corresponds to the size of items 120-4 and 130, size 124-5 corresponds to the size of items 120-5 and 130-2, size 124-6 corresponds to the size of items 120-6 and 130-3, and size 124-7 corresponds to the size of items 120-7 and 132 (where, for example, the unit of size is the number of RBSP bytes).
[0032] As further shown in FIG. 1, the atlas tile group layer RBSP 106 includes an atlas tile group data unit 140, an atlas tile group header 142, a plurality of atlas tile group data unit patch modes 144 (including 144-2, 144-3, 144-4, 144-5,
and 144-6), and a plurality of patch information data 146 (including 146-2, 146-3, 146-4, 146-5, and 146-6).
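The per-unit size fields described in paragraphs [0030]-[0031], which prefix each unit in a sample stream, suggest a simple parsing loop. The following is a simplified, non-normative sketch; the exact field layout is an assumption for illustration, and ISO/IEC 23090-5 defines the normative sample stream syntax:

```python
# Simplified sketch of walking a V-PCC sample stream: a one-byte precision
# header, then repeated (size, unit) pairs, where each size field occupies
# `precision` bytes and gives the byte length of the following unit
# (header plus payload).
def split_sample_stream(data):
    precision = (data[0] >> 5) + 1  # ssvh_unit_size_precision_bytes_minus1 + 1
    pos, units = 1, []
    while pos < len(data):
        size = int.from_bytes(data[pos:pos + precision], "big")
        pos += precision
        units.append(data[pos:pos + size])  # one unit, e.g. header + payload
        pos += size
    return units
```

Each returned unit would then be dispatched on its unit header type (VPCC_VPS, VPCC_AD, VPCC_OVD, VPCC_GVD, or VPCC_AVD) as in FIG. 1.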
[0033] Over the course of the standardization process, the naming of the syntax structures and elements defined in ISO/IEC FDIS 23090-5 and ISO/IEC DIS 23090-12 has been modified in comparison to the terms used in this disclosure. However, the functionality of those structures and elements remains the same, and the naming changes do not impact the ideas presented in this disclosure. Some of the notable name changes are: 3VC is renamed to V3C (Visual Volumetric Video-based Coding); the V-PCC bitstream is the V3C bitstream, and all unit, header, and payload naming is changed accordingly; and the atlas tile group layer was renamed to the atlas tile layer, and all syntax element names were modified accordingly.
[0034] From a file format perspective sequences of vpcc_units with different headers may be stored in separate tracks. Tracks with the same atlas_id may reference each other in order to establish a logical hierarchy. In addition, a master atlas may be used to provide a single entry point in the file. The master atlas may refer to other atlases as described in U.S. provisional application no. 62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF".
[0035] In addition to the patch information, in MIV in particular there is additional view metadata that describes the projection parameters, such as depth range and camera intrinsic and extrinsic parameters, for the patches. The patches in the patch atlas reference the view metadata by view id, and there are typically much fewer views than there are patches. In order to support the MIV multi-camera model, the
3VC bitstream supports a special "master atlas" that may only contain atlas metadata without an actual video bitstream.
[0036] Thus, each patch in a 3VC (V-PCC or MIV) atlas comes with sufficient metadata for determining whether that patch may be visible in a view of the scene rendered with given camera parameters. This view frustum culling of scene elements is a common rendering optimization in 3D graphics and can be applied to volumetric video as well. In MIV, view frustum culling can also be applied to each MIV view, enabling coarser (or more conservative) culling at the view level followed by further culling at the patch level. Culling may refer to removing or ignoring information that is not relevant, where extraction can be done for relevant or irrelevant information. For example, consider the difference between extracting a track from a file versus culling a track from a file.
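The view frustum test underlying this optimization can be sketched as follows: a patch's (or view's) 3D bounding box is tested against the frustum planes, and anything entirely outside some plane can be culled before decoding. This is a generic, illustrative sketch of frustum culling, not a normative procedure:

```python
# Illustrative frustum culling test: the frustum is given as half-space
# planes (a, b, c, d) with the inside satisfying a*x + b*y + c*z + d >= 0.
def box_in_frustum(box_min, box_max, planes):
    """Returns False only when the axis-aligned box lies entirely outside
    some frustum plane; otherwise True (conservative visibility)."""
    corners = [(x, y, z)
               for x in (box_min[0], box_max[0])
               for y in (box_min[1], box_max[1])
               for z in (box_min[2], box_max[2])]
    for a, b, c, d in planes:
        if all(a * x + b * y + c * z + d < 0 for x, y, z in corners):
            return False  # all eight corners outside this plane: cull
    return True
```

Applied at the view level first, this gives the coarse cull; repeating it per patch bounding box gives the finer-grained cull.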
[0037] In larger volumetric video scenes, the content may be too large for the client to access, decode, and/or render all at once. Larger scenes may typically be split into multiple video atlases in any case due to video decoder resolution limits, so it is desirable to facilitate partial access at the atlas level and/or use smaller partitions inside atlases.
[0038] HEVC supports highly flexible partitioning of a video sequence. Each frame of the sequence is split up into rectangular or square regions (Units or Blocks), each of which is predicted from previously coded data. After prediction, any residual information is transformed, and entropy encoded.
[0039] Each coded video frame, or picture, is partitioned into Tiles and/or Slices, which are further partitioned into Coding Tree Units (CTUs). The CTU is the basic unit of coding, analogous to the Macroblock in earlier standards, and can be up to 64x64 pixels in size.
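The CTU partitioning described above implies a simple grid computation, with partial CTUs at the right and bottom edges rounded up. A small sketch (the 64x64 maximum CTU size comes from the paragraph above):

```python
# Number of CTUs needed to cover a frame, rounding partial CTUs up,
# as in an HEVC-style CTU grid.
def ctu_grid(width, height, ctu_size=64):
    cols = -(-width // ctu_size)   # ceiling division
    rows = -(-height // ctu_size)
    return cols, rows, cols * rows
```

For a 1920x1080 frame with 64x64 CTUs this gives a 30x17 grid, the bottom row covering only 56 of its 64 pixel rows.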
[0040] Multiple Atlases in V-PCC elementary stream. After the MPEG 128 meeting, the V-PCC elementary bitstream may contain more than one atlas. This functionality was added to carry data encoded according to the MIV specification (23090-12). In order to enable this functionality, vuh_atlas_id was added to the V-PCC unit header for V-PCC units with types: VPCC_AD, VPCC_GVD, VPCC_OVD, and VPCC_AVD, corresponding respectively to Atlas Data, Geometry Video Data, Occupancy Video Data, and Attribute Video Data.
[0041] Addition of vuh_atlas_id creates implications in the design of a multi-track container structure when it comes to V-PCC Track:
• V-PCC sample allows only one V-PCC unit payload to be stored. Consequently, a V-PCC Track per atlas would have to be created.
• No functionality that links a number of V-PCC tracks to the same V-PCC elementary stream.
• No design for how VPCC_VPS would be stored (e.g., whether it would be duplicated for each V-PCC Track).
• No design to signal the main V-PCC track.
• No design to signal shared data between V-PCC tracks, like configurations.
[0042] V-PCC Component Tracks can be created without modification, as from their perspective vuh_atlas_id is yet another identifier of a track similar to vuh_unit_type, vuh_attribute_index, vuh_map_index, and vuh_attribute_dimension_index .
[0043] Object Annotation in 3VC 23090-5. In V-PCC it is possible to annotate each region of the volumetric bitstream, i.e. the patches or groups of patches that are identified using a "rectangular" shaped volumetric rectangle, with different
information. This process may include whether these elements are associated with a particular object (likely an object in the physical/world space) and certain properties that could be useful for their extraction and rendering. Such information may include labeling of objects, the size and shape of the points that correspond to the object, whether the object is visible or not, visibility cone information, material ids, and collision information, among others.
[0044] Such relationships can be seen in the diagram 200 of FIG. 2, where it is apparent that the relationships are similar to the ones encountered in "relational databases". Shown in FIG. 2 is an object 202, where object 202 has an object ID. The object 202 is associated with a tile/patch object 204, shown as TileX.Patches where a tile (indexed from 0 to m) may access patches by dereferencing the patches object. The object 202 is also associated with a plurality of 2D volumetric rectangles 206 (indexed from 0 to n). The object 202 has a number of properties 208 including, as shown in FIG. 2, labels, 3D bounding box information, collision shapes, point size, whether the object is hidden or visible, a priority, visibility cones, and object relationships. The properties 208 have labels 210, which in the example shown in FIG. 2 are indexed from 0 to 255, where each label has a label ID, label text, and a label language.
[0045] Objects may correspond to "real", i.e. physical, objects within a scene, or even conceptual objects that may relate to physical or other properties. Objects may be associated with different parameters, or properties (e.g. properties 208), which may also correspond to information provided during the creation or editing process of the point cloud, scene graph, etc. It is possible that some objects may
relate to one another and in some cases an object could be part of another object.
[0046] An object could be persistent in time and could also be updated at any time/frame while the associated information may persist from that point onward. Multiple patches or 2D volumetric rectangles (e.g. rectangles 206), which can contain themselves multiple patches, could be associated with a single object, but there may be a desire to also associate the same patch or 2D volumetric rectangles with multiple objects. Such relationships could persist or also need to change in time because objects may move or their placement in the atlas may have changed.
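The many-to-many relationships of FIG. 2 resemble a small relational schema, which can be sketched in a few lines. All class and field names here are hypothetical illustrations of the object annotation model, not syntax from the specification:

```python
# Illustrative sketch of FIG. 2's relationships: an object may map to
# several patches (addressed per tile) and several 2D volumetric
# rectangles, and the same patch may serve several objects.
class SceneObject:
    def __init__(self, object_id, label=None, hidden=False):
        self.object_id = object_id
        self.label = label          # one of up to 256 labels in FIG. 2
        self.hidden = hidden        # "hidden or visible" property
        self.patches = set()        # (tile_index, patch_index) pairs
        self.rectangles = set()     # 2D volumetric rectangle indices

objects = {7: SceneObject(7, label="chair")}
objects[7].patches.add((0, 3))      # tile 0, patch 3
objects[7].patches.add((1, 0))      # the same object spans another tile
objects[7].rectangles.add(2)
```

Updating such a table per frame mirrors the persistence behavior described above: associations persist from the frame where they are signaled until they are updated.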
[0049] For a more detailed description of the structure presented in FIG. 2, see input contribution m52705 to the MPEG 129 meeting, Brussels, January 2020.
[0050] Multiple camera views in 3VC 23090-5 and MIV 23090-12. In contrast to a fixed number of camera views, in the MIV specification the number of cameras, and the camera extrinsic and camera intrinsic information, is not fixed and may change on a group-of-pictures (GOP) basis. In order to address this in the 3VC specification, 23090-12 introduces in WD4 of the specification an adaptation params structure that can carry this information. Adaptation params are carried by a NAL unit with a particular NAL unit type. In case there is more than one atlas in 3VC, the adaptation params are carried in an atlas with a unique value of atlas_id equal to 0x3F.
[0051] The adaptation params structure is as follows:
[0052] Each camera (view) has a unique index determined within the miv_view_params_list () structure.
[0053] Partial Access in V-PCC ISOBMFF. The CD text of 23090-10 also introduces a high-level solution for partial access to 3-dimensional space, where tracks are grouped based on what spatial region the data in those tracks belongs to. The specification does not mention, however, whether those tracks are from one V-PCC elementary stream or from independent V-PCC elementary streams. In the case of being from one V-PCC elementary stream, the tool would only allow splitting video tracks (occupancy, geometry, attribute) based on what spatial region the data in those tracks belongs to. The atlas data in the V-PCC track remains in its original form. Consequently, all atlas data needs to be downloaded/decoded even if only one spatial region is displayed to the end user. Splitting the atlas data into a number of atlases that correspond to some spatial regions would help the partial access scenario.
[0054] However, as mentioned above, the storage of multiple atlases in a multi-track container structure is not fully supported. U.S. provisional application no. 62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF", aims to clarify these concepts.
[0055] Box-structured file formats. Box-structured and hierarchical file format concepts have been widely used for media storage and sharing. The most well-known file formats in this regard are the ISO Base Media File Format (ISOBMFF, ISO/IEC 14496-12) and its variants such as MP4 and 3GPP file formats.
[0056] ISOBMFF allows storage of timed audio/visual media streams, called media tracks. The metadata which describes a track is separated from the encoded bitstream itself. The format provides mechanisms to access media data in a codec-agnostic fashion from a file parser perspective.
[0057] A 3VC (V-PCC/MIV) bitstream, containing a coded point cloud sequence (CPCS), is composed of V-PCC units carrying V-PCC parameter set (VPS) data, an atlas information bitstream, and 2D video encoded bitstreams (e.g. an occupancy map bitstream, a geometry bitstream, and zero or more attribute bitstreams). A 3VC (V-PCC/MIV) bitstream can be stored in an ISOBMFF container according to ISO/IEC 23090-10. Two modes are supported: single-track container and multi-track container.
[0058] A single-track container is utilized in the case of simple ISOBMFF encapsulation of a V-PCC encoded bitstream. In this case, a V-PCC bitstream is directly stored as a single track without further processing. A single track should use a sample entry type of 'vpe1' or 'vpeg'.
[0059] Under the 'vpe1' sample entry, all atlas parameter sets (as defined in ISO/IEC 23090-5) are stored in the setupUnit of the sample entry. Under the 'vpeg' sample entry, the atlas parameter sets may be present in the setupUnit array of the sample entry, or in the elementary stream.
[0060] A multi-track container maps V-PCC units of a 3VC (V-PCC/MIV) elementary stream to individual tracks within the container file based on their types. There are two types of tracks in a multi-track container: the V-PCC track and V-PCC component tracks. The V-PCC track is a track carrying the volumetric visual information in the V-PCC bitstream, which includes the atlas sub-bitstream and the atlas sequence parameter sets. V-PCC component tracks are restricted video scheme tracks which carry 2D video encoded data for the occupancy map, geometry, and attribute sub-bitstreams of the 3VC (V-PCC/MIV) bitstream. A multi-track container should use a sample entry type of 'vpc1' or 'vpcg' for the V-PCC track.
[0061] Under the 'vpc1' sample entry, all atlas parameter sets (as defined in ISO/IEC 23090-5) shall be in the setupUnit array of the sample entry. Under the 'vpcg' sample entry, the atlas parameter sets may be present in this array, or in the stream.
[0062] In large and/or complex scenes, it is highly desirable to implement partial access at the atlas level so that entire atlases can be ignored if they are not necessary for rendering during the current intra period. This enables savings both in the network streaming layer as well as the video decoder layer.
[0063] Atlas culling is not currently possible, however. While the view metadata is available in the "master atlas" and each view can be culled against the rendering view frustum, the connection to the actual scene data corresponding to each view is through the patch metadata that resides in each atlas metadata bitstream.
[0064] Thus, every atlas metadata bitstream must be accessed before it is possible to determine whether a given atlas is
relevant for the client at a given moment. This makes at least network streaming optimizations impossible, and hinders optimization of bitstream parsing and decoding in general.
[0065] U.S. provisional application no.
62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF", clarifies how metadata for different atlases may be signaled inside a single bitstream or track.
[0066] Described herein are four alternative and complementary embodiments to address the problem:
- adding view-to-atlas mapping metadata to enable culling of sub-bitstreams via per-view visibility culling
- adding object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling
- moving patch atlas metadata to the MIV "master" atlas to enable more fine-grained patch visibility culling, leading to more effective culling of atlas sub-bitstreams
- leveraging 3VC partial access metadata to implement atlas bitstream culling
  o store each atlas in its own track and provide sample grouping information to allow atlas bitstream culling on a file format level
[0067] These embodiments can be used individually or in combination with each other. Corresponding encoder embodiments are also described.
[0068] 1. View-to-atlas mapping metadata. In one embodiment, the adaptation_params_rbsp structure that contains MIV-related view metadata is contained in the universally accessible "master" atlas (i.e. the atlas with vuh_atlas_id equal to 0x3F). New elements in the adaptation_params_rbsp structure are added to provide information about the mapping from views to atlases. This mapping may indicate, for every view, the atlas that contains patches referring back to the view in question.
[0069] The renderer may apply view frustum culling to each view first. All views that are deemed potentially visible may then be queried for the atlas mapping metadata, and the combined atlas mapping metadata may indicate the atlases that must be accessed in order to render the visible views.
[0070] The mapping metadata can be, for example, a bitmask of N bits, where N is the number of atlas sub-bitstreams. Each bit in the mask therefore corresponds to one atlas. In each view, the mask may have a bit set for every atlas whose corresponding sub-bitstream contains patches for that view, and a bitwise OR operation over the potentially visible views may produce the combined bitmask. As an example, the bitmask may be embedded in the miv_view_params_list() sub-structure of the adaptation_params_rbsp() structure in 3VC.
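The bitmask combination described in paragraph [0070] can be sketched as follows. This is an illustrative sketch, not text from the specification; the per-view masks and the list of visible view indices are assumed to have been obtained already from the adaptation params and from a frustum test.

```python
def combined_atlas_mask(view_masks, visible_view_indices):
    """OR together the per-view atlas map masks of all potentially
    visible views; the result marks every atlas that must be accessed."""
    mask = 0
    for v in visible_view_indices:
        mask |= view_masks[v]
    return mask

def required_atlases(mask, num_atlases):
    """Expand the combined bitmask into a list of atlas indices
    (bit i corresponds to atlas i, as described above)."""
    return [i for i in range(num_atlases) if mask & (1 << i)]

# Example: 4 atlases (vps_atlas_count_minus1 == 3) and 3 views.
view_masks = [0b0011, 0b0100, 0b1100]  # hypothetical per-view masks
visible = [0, 2]                       # views surviving frustum culling
mask = combined_atlas_mask(view_masks, visible)
```

Only the atlas sub-bitstreams whose bits are set in the combined mask then need to be fetched and decoded.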
[0072] The newly added mvp_atlas_map_flag indicates whether atlas map mask information is available for a given view.
[0073] The newly added mvp_atlas_map_mask contains the bitmask of atlases where patches linking to the given view may be found. The length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
[0074] FIG. 3 also shows the example modified miv_view_params_list() sub-structure 300 of the adaptation_params_rbsp() structure in 3VC, with the modification highlighted as item 302.
[0075] In another embodiment, a temporal update of the atlas map can be done together with the camera extrinsics in the sub-structure miv_view_params_update_extrinsics() of the adaptation_params_rbsp() structure in 3VC. An example modified miv_view_params_update_extrinsics() structure is as follows:
[0076] The newly added mvpue_atlas_map_flag indicates whether atlas map mask information is available for a given view.
[0077] The newly added mvpue_atlas_map_mask contains the bitmask of atlases where patches linking to the given view may be found. The length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
[0078] FIG. 4 also shows the example modified miv_view_params_update_extrinsics() sub-structure 400 of the adaptation_params_rbsp() structure in 3VC, with the modification highlighted as item 402.
[0079] In another embodiment, a temporal update is done as a newly added structure miv_atlas_map_update() of the adaptation_params_rbsp() structure in 3VC. An example modified adaptation_params_rbsp() structure is as follows:
[0080] FIG. 5 also shows the example modified adaptation_params_rbsp() structure 500 in 3VC, with the modification highlighted as item 502, which includes a new structure miv_atlas_map_update().
[0082] FIG. 6 also shows the example miv_atlas_map_update() structure 600.
[0083] In another embodiment, the encoder may optimize the patch layout so that patches belonging to a certain view are grouped together in a single atlas. This makes the view-based culling of atlases more effective.
[0084] 2. Object-to-atlas mapping metadata. In one embodiment, a patch information SEI message is extended to include an atlas map element that informs a renderer in which other atlases the object is present. Each object can have visibility information, and a renderer can perform culling based on this information. Based on the object description and on the information about which atlases contain patches describing the object, a renderer could request the needed atlases (which can be mapped to tracks) from a file parser. An example modified patch_information(payload_size) structure is provided below.
[0085] The newly added pi_patch_atlas_map_mask contains the bitmask of atlases where patches linking to the given object can be found. The length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
[0086] FIG. 7 also shows the example modified patch information SEI message 700, with the modification highlighted as item 702.
[0087] In another embodiment, the scene object information SEI message is contained in the universally accessible "master" atlas (i.e. the atlas with vuh_atlas_id equal to 0x3F). The scene_object_information SEI message is extended to provide a mapping of object IDs to atlases. This metadata may indicate, for every object, the atlas that contains patches referring back to the object in question.
[0089] The newly added soi_object_atlas_map_mask contains the bitmask of atlases where patches linking to the given object can be found. The length of the bitmask depends on the number of atlases, i.e. vps_atlas_count_minus1 as defined in ISO/IEC 23090-5.
[0090] FIG. 8A, FIG. 8B, and FIG. 8C also show the example modified scene object information SEI message, collectively as items 800, 810, and 820, with the modification highlighted as item 802 within FIG. 8B. FIG. 8A, FIG. 8B, and FIG. 8C collectively form FIG. 8.
[0091] 3. Grouping of patch metadata to MIV "master atlas".
In the MIV bitstream format, a special atlas with a predefined atlas ID is specified to contain view metadata, while the patch metadata is contained in per-atlas metadata units. In this embodiment, these per-atlas metadata units are moved from separate atlases to the "master atlas" in order to make them universally available. The signaling related aspects of this embodiment are largely covered in U.S. provisional application no. 62/959,449 (corresponding to U.S. nonprovisional application no. 17/140,580), entitled "Storage Of Multiple Atlases From One V-PCC Elementary Stream In ISOBMFF". The
novelty of this embodiment includes the decoding process, which allows a decoder to cull whole atlases based on patch metadata.
[0092] In this embodiment, the renderer may cull all patches against the current rendering viewing frustum, and decode only the atlas sub-bitstreams that contain potentially visible patches. This can be implemented in several ways, of which two examples are:
- loop over all patch atlases, detect potentially visible patches, and once a first potentially visible patch is found, mark that atlas as required and move to the next one, or
- perform the view culling of Embodiment 1 (1. View-to- atlas mapping metadata) first, then process only patches referring to a potentially visible view, and mark the relevant atlases as required
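The first of the two strategies above can be sketched as follows. This is an illustrative sketch: the per-atlas patch lists and the visibility predicate are hypothetical stand-ins for the decoded patch metadata and a real frustum test against each patch's 3D extent.

```python
def select_required_atlases(atlas_patches, patch_potentially_visible):
    """Scan each atlas's patch list and mark the atlas as required as
    soon as one potentially visible patch is found (early exit)."""
    required = set()
    for atlas_id, patches in atlas_patches.items():
        for patch in patches:
            if patch_potentially_visible(patch):
                required.add(atlas_id)  # one visible patch is enough
                break                   # move on to the next atlas
    return required

# Toy example using a dummy visibility flag instead of a frustum test.
atlas_patches = {
    0: [{"visible": False}, {"visible": True}],
    1: [{"visible": False}],
    2: [{"visible": True}],
}
needed = select_required_atlases(atlas_patches, lambda p: p["visible"])
```

In the second strategy, the inner loop would instead iterate only over patches whose view index survived the view culling of Embodiment 1.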
[0093] After finding the required atlases, access to those may continue as in Embodiment 1 (1. View-to-atlas mapping metadata), potentially via a network request before decoding the relevant atlas sub-bitstream.
[0094] 4. 3VC partial access-based embodiment. At the MPEG #129 meeting, partial access-related functionality in 3VC was adopted. However, the signaling at the file format level has not been defined. Considering embodiments 1, 2, and 3 (respectively 1. View-to-atlas mapping metadata, 2. Object-to-atlas mapping metadata, and 3. Grouping of patch metadata to MIV "master atlas"), where atlas culling is performed using information on views, objects, or patches, this embodiment focuses on atlas culling using partial access functionality by providing a file format level design.
[0095] Input contribution m52705 to the MPEG 129 meeting defines partial access functionality, which consists of concepts for defining objects with various characteristics, including visibility cones and bounding boxes, as well as linking objects with other objects, tile groups, patches, and volumetric rectangles.
[0096] In one embodiment, the V-PCC bitstream containing the coded point cloud sequence (CPCS), composed of V-PCC units carrying V-PCC parameter set (VPS) data, more than one atlas bitstream, and more than one 2D video encoded bitstream, is stored in ISOBMFF. An example of such a V-PCC bitstream is one carrying volumetric video compressed according to MPEG Immersive Video as defined in ISO/IEC 23090-12.
[0097] In case the V-PCC bitstream contains multiple atlases, each atlas bitstream is encapsulated in a separate V-PCC track. One of those tracks is interpreted as a parameter track that is part of the multi-atlas V-PCC bitstream, while other tracks are interpreted as normal V-PCC tracks that are part of the multi-atlas V-PCC bitstream.
[0098] A V-PCC track is part of the multi-atlas V-PCC bitstream when it contains a 'mapt' track reference to another V-PCC track and has a sample entry type equal to 'vpc1' or 'vpcg'. This referenced track is referred to as the parameter track of the multi-atlas V-PCC bitstream and could have a sample entry type equal to 'vpcP'.
[0099] A parameter track does not include ACL NAL units. A normal track does not carry ACL NAL units belonging to more than one atlas. For any V-PCC access unit carried by samples in a parameter track and a number of normal tracks, all the atlas NAL units that apply to the entire V-PCC access unit are carried in the parameter track. These atlas NAL units include
(but are not limited to) adaptation_params_rbsp, SEI messages as well as EOB and EOS NAL units, when present. The atlas NAL units that do not apply to a given atlas are not carried in the normal track containing that atlas. The NAL units that apply to an atlas are carried in the normal track containing that atlas.
[00100] In another embodiment, in order to enable view-frustum culling, i.e. culling objects outside of the user's current view of the scene, a sample group is defined. It provides a mapping of an atlas to a view. Due to the use of a sample group (or one or more sample groups), the mapping can change along the timeline of the volumetric video.
[00101] View Information Sample Group Entry Definition
Box Type: 'vpvi'
Container: Sample Group Description Box ('sgpd')
Mandatory: No
Quantity: Zero or more
[00102] A view information sample group entry identifies which views are carried by samples. The grouping_type_parameter is not defined for the SampleToGroupBox with grouping type 'vpvi'. A view information sample group entry may also provide information about which other tracks, atlases, or tile groups carry samples with data containing the same view.
[00103] Syntax

aligned(8) class ViewInformationSampleGroupEntry
    extends VisualSampleGroupEntry('vpvi', version = 0, flags) {
    unsigned int(8) group_id;
    unsigned int(8) num_views;
    for (i = 0; i < num_views; i++) {
        unsigned int(8) view_index;
        unsigned int(8) num_atlases;
        for (j = 0; j < num_atlases; j++) {
            unsigned int(32) atlas_id;
            unsigned int(32) num_tile_groups;
            for (k = 0; k < num_tile_groups; k++) {
                unsigned int(32) tile_groups_address;
            }
        }
    }
}
[00104] Semantics

group_id specifies the unique identifier of the group.
num_views specifies the number of views carried by samples.
view_index specifies the index of a view carried by samples. The index is mapped to the view index in the active adaptation_params_rbsp.
num_atlases specifies the number of atlases, other than the atlas contained in the track the sample group belongs to, that contain samples with the view with index equal to view_index.
atlas_id specifies the id of an atlas that contains samples with the view with index equal to view_index.
num_tile_groups specifies the number of tile groups within the atlas with id equal to atlas_id that contain samples with the view with index equal to view_index. When num_tile_groups is equal to 0, all tile groups belonging to the atlas with id equal to atlas_id contain samples with the view with index equal to view_index.
tile_groups_address specifies the address of the tile group within the atlas with id equal to atlas_id that contains samples with the view with index equal to view_index.
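A minimal parser for the 'vpvi' entry payload described above can be sketched as follows, assuming big-endian fields as is conventional in ISOBMFF. This is an illustrative sketch, not a conformant box parser; the VisualSampleGroupEntry header is assumed to have been consumed already.

```python
import struct

def parse_vpvi(payload: bytes) -> dict:
    """Decode group_id, then the nested view/atlas/tile-group loops."""
    off = 0
    group_id, num_views = struct.unpack_from(">BB", payload, off)
    off += 2
    views = []
    for _ in range(num_views):
        view_index, num_atlases = struct.unpack_from(">BB", payload, off)
        off += 2
        atlases = []
        for _ in range(num_atlases):
            atlas_id, num_tile_groups = struct.unpack_from(">II", payload, off)
            off += 8
            tg = list(struct.unpack_from(">%dI" % num_tile_groups, payload, off))
            off += 4 * num_tile_groups
            atlases.append({"atlas_id": atlas_id, "tile_groups": tg})
        views.append({"view_index": view_index, "atlases": atlases})
    return {"group_id": group_id, "views": views}

# Round-trip check: one view (index 7) mapped to atlas 42, tile group 5.
payload = struct.pack(">BBBBIII", 1, 1, 7, 1, 42, 1, 5)
entry = parse_vpvi(payload)
```

A parser for the 'vpoi' entry defined below would be identical apart from the field names.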
[00105] In another embodiment, in order to enable view-frustum culling, i.e. culling objects outside of the user's current view of the scene, a sample group is defined. It provides a mapping of an atlas to an object. The object may include visibility cone information that can be used for culling. Due to the use of a sample group (or one or more sample groups), the mapping can change along the timeline of the volumetric video.
[00106] Object Information Sample Group Entry
Definition
Box Type: 'vpoi'
Container: Sample Group Description Box ('sgpd')
Mandatory: No
Quantity: Zero or more
[00107] An object information sample group entry identifies which objects are carried by samples. The grouping_type_parameter is not defined for the SampleToGroupBox with grouping type 'vpoi'. An object information sample group entry may also provide information about which other tracks carry samples with data containing the same object.
[00108] Syntax

aligned(8) class ObjectInformationSampleGroupEntry
    extends VisualSampleGroupEntry('vpoi', version = 0, flags) {
    unsigned int(8) group_id;
    unsigned int(8) num_objects;
    for (i = 0; i < num_objects; i++) {
        unsigned int(8) object_index;
        unsigned int(8) num_atlases;
        for (j = 0; j < num_atlases; j++) {
            unsigned int(32) atlas_id;
            unsigned int(32) num_tile_groups;
            for (k = 0; k < num_tile_groups; k++) {
                unsigned int(32) tile_groups_address;
            }
        }
    }
}
[00109] Semantics

group_id specifies the unique identifier of the group.
num_objects specifies the number of objects carried by samples.
object_index specifies the index of an object carried by samples. The index is mapped to the object index soi_object_idx in the active scene_object_information SEI message.
num_atlases specifies the number of atlases, other than the atlas contained in the track the sample group belongs to, that contain samples with the object with index equal to object_index.
atlas_id specifies the id of an atlas that contains samples with the object with index equal to object_index.
num_tile_groups specifies the number of tile groups within the atlas with id equal to atlas_id that contain samples with the object with index equal to object_index. When num_tile_groups is equal to 0, all tile groups belonging to the atlas with id equal to atlas_id contain samples with the object with index equal to object_index.
tile_groups_address specifies the address of the tile group within the atlas with id equal to atlas_id that contains samples with the object with index equal to object_index.
[00110] In another embodiment, V-PCC parameter tracks contain adaptation_params_rbsp with additional signaling of atlas_map per view as described in 1. 'View-to-atlas mapping metadata'.
Each atlas is carried by one track, and based on adaptation_params_rbsp an application informs a file parser which atlases are required at a given time. The file parser maps atlas ids to track ids based on the atlas_id in VPCCUnitHeaderBox that is contained in the VPCCSampleEntry of every V-PCC track carrying atlas data.
[00111] In another embodiment, V-PCC parameter tracks contain scene_object_information with additional signaling of atlas_map per object as described in 2. 'Object-to-atlas mapping metadata'. Each atlas is carried by one track, and based on scene_object_information an application informs a file parser which atlases are required at a given time. The file parser maps atlas ids to track ids based on the atlas_id in the VPCCUnitHeaderBox that is contained in the VPCCSampleEntry of every V-PCC track carrying atlas data.
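The file-parser step described above (mapping requested atlas ids to track ids) can be sketched as follows. The dictionary of per-track atlas ids is a hypothetical stand-in for the atlas_id values parsed from each track's VPCCUnitHeaderBox.

```python
def tracks_for_atlases(track_atlas_ids, required_atlas_ids):
    """Return the ids of the tracks carrying the required atlases,
    so that only those tracks need to be fetched and decoded."""
    return {track_id
            for track_id, atlas_id in track_atlas_ids.items()
            if atlas_id in required_atlas_ids}

# Example: three V-PCC tracks, each carrying one atlas.
track_atlas_ids = {101: 0, 102: 1, 103: 2}  # track_id -> atlas_id
needed_tracks = tracks_for_atlases(track_atlas_ids, {0, 2})
```

The set of required atlas ids would come from any of the culling embodiments above (view-, object-, or patch-based).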
[00112] In another embodiment, a V-PCC bitstream containing a coded point cloud sequence (CPCS) that is composed of V-PCC units carrying V-PCC parameter set (VPS) data, one atlas bitstream, and more than one 2D video encoded bitstream is stored in ISOBMFF. An example of such a V-PCC bitstream is one carrying volumetric video compressed according to V-PCC as defined in ISO/IEC 23090-5.
[00113] The single atlas bitstream is encapsulated in a number of separate V-PCC tracks. One of those tracks is interpreted as a tile parameter track that is part of the V-PCC bitstream, while the other tracks are interpreted as tile tracks that are part of the V-PCC bitstream. Each tile track carries samples containing one or more atlas_tile_group_layer_rbsp structures.
[00114] A tile track is part of the V-PCC bitstream when it contains a 'mtpt' track reference to another V-PCC track and has a sample entry type equal to 'vpt1' or 'vptg'. This referenced track is referred to as the tile parameter track of the V-PCC bitstream and could have a sample entry type equal to 'vptP'.
[00115] A tile parameter track does not include ACL NAL units. For any atlas access unit carried by samples in a tile parameter track and a number of tile tracks, all the atlas NAL units that apply to the entire atlas access unit are carried in the tile parameter track. These atlas NAL units include (but are not limited to) adaptation_params_rbsp, atlas_sequence_parameters_rbsp, atlas_frame_parameters_rbsp, SEI messages as well as EOB and EOS NAL units, when present.
[00116] Each of the tile tracks may contain an ObjectInformationSampleGroupEntry or a ViewInformationSampleGroupEntry as defined in the previous embodiments.
[00117] As described herein, while components are extracted, the proposed signaling does not enable extraction of components arbitrarily; the components always need to relate to the same atlas. The extraction happens for one or more sets of components belonging to the same atlas, rather than just extracting a single component. Belonging to the same atlas means that the components share the same atlas id. A component can be atlas data or video coded occupancy, attribute, or geometry data. Thus the embodiments described herein do not necessarily relate to extracting or culling single components, but sets of components that represent a partial portion of the scene. With the atlas-to-view and atlas-to-object mapping, entire atlases may be culled.
[00118] FIG. 9 is an example apparatus 900, which may be implemented in hardware, configured to implement efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein. The apparatus 900 comprises a processor 902, at least one non-transitory memory 904 including computer program code 905, wherein the at least one memory 904 and the computer program code 905 are configured to, with the at least one processor 902, cause the apparatus to implement a process, component, module, or function (collectively 906) to implement efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein. The apparatus 900 optionally includes a display and/or I/O interface 908 that may be used to display a culled bitstream. The apparatus 900 also includes one or more network (NW) interfaces (I/F(s)) 910. The NW
I/F(s) 910 may be wired and/or wireless and communicate over a channel or the Internet/other network(s) via any communication technique. The NW I/F(s) 910 may comprise one or more transmitters and one or more receivers. The N/W I/F(s) 910 may comprise standard well-known components such as an amplifier, filter, frequency-converter, (de)modulator, and encoder/decoder circuitry(ies) and one or more antennas. In some examples, the processor 902 is configured to implement item 906 without use of memory 904.
[00119] The apparatus 900 may be a remote, virtual or cloud apparatus. The apparatus 900 may be either a writer or a
reader (e.g. parser), or both a writer and a reader (e.g. parser). The apparatus 900 may be either a coder or a decoder, or both a coder and a decoder. The apparatus 900 may be a user equipment (UE), a head mounted display (HMD), or any other fixed or mobile device.
[00120] The memory 904 may be implemented using any suitable data storage technology, such as semiconductor based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The memory 904 may comprise a database for storing data. Interface 912 enables data communication between the various items of apparatus 900, as shown in FIG. 9. Interface 912 may be one or more buses, or interface 912 may be one or more software interfaces configured to pass data between the items of apparatus 900. For example, the interface 912 may be one or more buses such as address, data, or control buses, and may include any interconnection mechanism, such as a series of lines on a motherboard or integrated circuit, fiber optics or other optical communication equipment, and the like. The apparatus 900 need not comprise each of the features mentioned, or may comprise other features as well.
[00121] FIG. 10 is an example method 1000 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein. At 1002, the method includes providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of items 1004, 1006, 1008, or 1010. At 1004, the method includes wherein the cull signaling comprises view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling. At 1006, the method includes wherein the cull signaling comprises object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling. At 1008, the method includes wherein the cull signaling comprises patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling. At 1010, the method includes wherein the cull signaling comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level. Method 1000 may be implemented by apparatus 900.
[00122] FIG. 11 is another example method 1100 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein. At 1102, the method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream, wherein the information comprises one or more of 1104, 1106, 1108, or 1110. At 1104, the method includes wherein the information comprises atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view. At 1106, the method includes wherein the information comprises atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas. At 1108, the method includes wherein the information comprises patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling. At 1110, the method includes wherein the information comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level. At 1112, the method includes culling the one or more sets of components belonging to the same atlas
from the at least one volumetric video bitstream, based on the information. Method 1100 may be implemented by a decoder apparatus, or by apparatus 900.
[00123] FIG. 12 is another example method 1200 for implementing efficient culling of volumetric video atlas bitstreams based on the example embodiments described herein. At 1202, the method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream, wherein the information comprises one or more of 1204, 1206, 1208, or 1210. At 1204, the method includes wherein the information comprises atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view. At 1206, the method includes wherein the information comprises atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas. At 1208, the method includes wherein the information comprises patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling. At 1210, the method includes wherein the information comprises partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level. At 1212, the method includes transmitting the information to a receiving device. Method 1200 may be implemented by an encoder apparatus, or by apparatus 900.
[00124] References to a 'computer', 'processor', etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also
specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other processing circuitry. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device such as instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device, etc.
[00125] As used herein, the term 'circuitry' may refer to any of the following: (a) hardware circuit implementations, such as implementations in analog and/or digital circuitry, and (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present. As a further example, as used herein, the term 'circuitry' would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware. The term 'circuitry' would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device. Circuitry may also be used to mean a function or a process, such as one implemented by an encoder or decoder, or a codec.
[00126] Based on the examples referred to herein, an example apparatus may be provided that includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to perform: provide signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
[00127] The apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: cull at least one volumetric video atlas using the provided signaling; and render a view frustum corresponding to the non-culled volumetric video atlas bitstreams.
[00128] The apparatus may further include wherein the view-to-atlas mapping metadata is a bitmask of N bits, where N is a number of atlas sub-bitstreams.
[00129] The apparatus may further include wherein the bitmask is embedded in a view parameter substructure of an adaptation parameter structure.
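For illustration only (this sketch is not part of the claimed subject matter), the N-bit view-to-atlas bitmask of paragraphs [00128] and [00129] may be modeled as follows in Python; the function names `view_atlas_mask` and `required_atlases` are hypothetical and do not appear in any specification.

```python
# Illustrative sketch: per-view atlas bitmasks for view-frustum culling.
# Names and structure are hypothetical, not taken from the MIV specification.

def view_atlas_mask(atlas_indices, num_atlases):
    """Build an N-bit mask marking which atlas sub-bitstreams hold patches
    of one view (bit i set means atlas i is needed for this view)."""
    mask = 0
    for idx in atlas_indices:
        if not 0 <= idx < num_atlases:
            raise ValueError(f"atlas index {idx} out of range")
        mask |= 1 << idx
    return mask

def required_atlases(visible_views, view_masks, num_atlases):
    """OR together the masks of all potentially visible views; atlases
    whose bit stays unset may be culled without decoding them."""
    combined = 0
    for view in visible_views:
        combined |= view_masks[view]
    return [a for a in range(num_atlases) if combined & (1 << a)]
```

As a usage sketch, with four atlases where view 0 maps to atlases 0 and 1, view 1 to atlas 1, and view 2 to atlas 3, culling for visible views 0 and 2 would retain atlases 0, 1, and 3 and discard atlas 2.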
[00130] The apparatus may further include wherein the view-to-atlas mapping metadata comprises a temporal update of an atlas map together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
[00131] The apparatus may further include wherein the view-to-atlas mapping metadata comprises a temporal update as an atlas map update substructure of an adaptation parameter structure.
[00132] The apparatus may further include wherein the at least one volumetric video atlas is culled after the volumetric video atlas has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
[00133] The apparatus may further include wherein the object-to-atlas mapping metadata comprises an atlas map element to inform a renderer in what other atlases an object is present, wherein the atlas map element extends a patch information supplemental enhancement information message.
[00134] The apparatus may further include wherein the object-to-atlas mapping metadata comprises an extension to a scene object information supplemental enhancement information message to provide a mapping of object identifiers (IDs) to atlases.
[00135] The apparatus may further include wherein the extension is implemented as a bitmask and indicates, for every object, an atlas that contains patches referring back to the respective object.
[00136] The apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform, to implement the fine-grained patch
visibility culling, either: loop over all patch atlases, detect potentially visible patches, mark an atlas as required once a first potentially visible patch is found, and move to a next patch atlas; or perform the per-view visibility culling, process patches referring to a potentially visible view, and mark relevant atlases as required.
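For illustration only, the first strategy recited in paragraph [00136] (loop over all patch atlases and mark an atlas as required once a first potentially visible patch is found) may be sketched as follows; the `patch_visible` callback stands in for an actual view-frustum test, and all names are hypothetical.

```python
# Illustrative sketch of the early-exit patch-visibility loop described above.
# The patch representation and the visibility test are hypothetical stand-ins.

def mark_required_atlases(atlases, patch_visible):
    """Return the set of atlas indices containing at least one potentially
    visible patch.

    `atlases` maps atlas index -> list of patches; `patch_visible` is a
    caller-supplied visibility predicate (e.g. a view-frustum check).
    As soon as the first visible patch is found, the atlas is marked
    required and the loop moves on to the next patch atlas.
    """
    required = set()
    for atlas_id, patches in atlases.items():
        for patch in patches:
            if patch_visible(patch):
                required.add(atlas_id)
                break  # early exit: one visible patch is enough
    return required
```

The early `break` reflects the design point in the paragraph above: fine-grained patch visibility only needs to establish that an atlas contains some visible content, not enumerate every visible patch.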
[00137] The apparatus may further include wherein when the at least one volumetric video atlas bitstream contains a coded point cloud sequence (CPCS) that is composed of units carrying V-PCC parameter set (VPS) data, more than one atlas bitstream, and more than one 2D video encoded bitstream, the at least one volumetric video atlas bitstream is stored in ISOBMFF.
[00138] The apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: define a sample group entry to provide a mapping of an atlas to a view to enable a view-frustum culling.
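For illustration only, the file-format-level culling enabled by such a sample group entry may be sketched as follows; the entry fields (`atlas_id`, `view_ids`) and the grouping shape are invented for this sketch and do not correspond to any registered ISOBMFF grouping type.

```python
# Hypothetical sketch of a file-format-level atlas-to-view mapping, loosely
# modelled on an ISOBMFF sample group description entry. Field names are
# invented for illustration only.
from dataclasses import dataclass
from typing import List

@dataclass
class AtlasViewGroupEntry:
    atlas_id: int        # track-level identifier of one atlas
    view_ids: List[int]  # views whose patches this atlas track carries

def tracks_to_fetch(entries, visible_views):
    """File-level culling: keep only atlas tracks whose view set intersects
    the set of potentially visible views, so a file parser can skip the
    remaining tracks without touching their atlas metadata bitstreams."""
    visible = set(visible_views)
    return [e.atlas_id for e in entries if visible & set(e.view_ids)]
```

This reflects the benefit noted elsewhere in the document: because each atlas is stored in its own track, the decision to discard a track can be taken from the grouping information alone.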
[00139] The apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: define a sample group entry to provide a mapping of an atlas to an object to enable a view-frustum culling.
[00140] The apparatus may further include wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to perform: inform a file parser which atlases are required at a given time with signaling within an adaptation parameters structure.
[00141] The apparatus may further include wherein the at least one memory and the computer program code are further configured
to, with the at least one processor, cause the apparatus at least to perform: inform a file parser which atlases are required at a given time with signaling within a supplemental enhancement information scene object information message.
[00142] The apparatus may further include wherein when the at least one volumetric video atlas bitstream contains a coded point cloud sequence (CPCS) that is composed of units carrying V-PCC parameter set (VPS) data, one atlas bitstream, and more than one 2D video encoded bitstream, the at least one volumetric video atlas bitstream is stored in ISOBMFF.
[00143] The apparatus may further include wherein the at least one volumetric video atlas is culled without having to access every atlas metadata bitstream.
[00144] Based on the examples referred to herein, an example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations may be provided, the operations comprising: providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
[00145] Based on the examples referred to herein, an example method may be provided that includes providing signaling to
cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
[00146] Based on the examples referred to herein, an example apparatus may be provided that includes means for providing signaling to cull at least one volumetric video atlas bitstream, wherein the cull signaling comprises one or more of: view-to-atlas mapping metadata that enables culling of sub-bitstreams via per-view visibility culling; object-to-atlas mapping metadata to enable culling of sub-bitstreams via per-object visibility culling; patch atlas metadata within a metadata for immersive video master atlas to enable sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each atlas in its own track, and providing sample grouping information to allow culling on a file format level.
[00147] An example apparatus includes means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to- view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-
object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[00148] Other aspects of the apparatus may include the following. The apparatus may further include means for rendering a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled. The atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases. The atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message. The persistence may be specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded
such that patches belonging to a certain view are grouped together in a single atlas. The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video-based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit. The apparatus may further include means for interpreting a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas. The atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure. The one or more sets of components belonging to the same atlas may share an atlas identifier. A component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene. The apparatus may further include means for culling an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata. The atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and may indicate a value of the at least one object given an atlas identifier and an index of the at least one object. 
The atlas-to-object mapping metadata may
indicate, for the at least one object, an atlas that contains patches referring back to the at least one object. The atlas-to-object mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases. The atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
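For illustration only, the persistence-flag semantics recited above (a flag equal to zero applies the message to the current atlas frame only; a flag equal to one persists it until a new sequence begins, the bitstream ends, or an atlas frame carries a new message) may be sketched as follows; the frame dictionaries are a hypothetical stand-in for decoded bitstream state.

```python
# Illustrative sketch of SEI persistence semantics described above.
# Each frame is a dict with keys 'new_sequence' (bool) and 'sei'
# (None, or a (payload, persistence_flag) tuple); this representation
# is invented for the sketch.

def resolve_sei_persistence(frames):
    """Return the SEI payload in effect for each atlas frame, or None."""
    active = None
    result = []
    for frame in frames:
        if frame["new_sequence"]:
            active = None                    # persistence ends at a new sequence
        if frame["sei"] is not None:
            payload, persist = frame["sei"]  # a frame with a message replaces the old one
            result.append(payload)
            active = payload if persist else None  # flag==0: current frame only
        else:
            result.append(active)            # flag==1 from earlier: still in effect
    return result
```

The end-of-bitstream condition is implicit here: persistence simply stops being evaluated when the frame list is exhausted.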
[00149] An example apparatus includes means for providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial
access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for transmitting the information to a receiving device.
[00150] Other aspects of the apparatus may include the following. The information may be provided using at least one sample group entry object. The atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases. The atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The apparatus may further include means for encoding the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas. The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video-based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a
visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit. The apparatus may further include means for defining a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas. The atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure. The one or more sets of components belonging to the same atlas may share an atlas identifier. A component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene. An entire atlas may be culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata. The atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and may indicate a value of the at least one object given an atlas identifier and an index of the at least one object. The atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object. The atlas-to-object mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases. The atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message. 
The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement
information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The apparatus may further include means for encoding the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
[00151] An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: receive information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and cull the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[00152] Other aspects of the apparatus may include the following. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: render a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled. The atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases. The atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas. 
The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video-based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding
intra random access picture sub-bitstream unit. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: interpret a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas. The atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure. The one or more sets of components belonging to the same atlas may share an atlas identifier. A component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: cull an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata. The atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object. The atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object. The atlas-to-object mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases. 
The atlas-to-object mapping metadata may specify a persistence of a previous atlas
object supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may be culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
[00153] Other aspects of the apparatus may include the following. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: render a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled; and wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas. The atlas-to-view mapping metadata may be received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases, the atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message, the persistence may be specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a
current atlas frame, and wherein the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video-based point cloud coding parameter set, and where the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: interpret a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. 
The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas, the atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure, the one or more sets of components belonging to the same atlas may share an atlas identifier, or a component of the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or
more sets of components may represent a partial portion of a scene. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: cull an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata, wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas. The atlas-to-object mapping metadata may be received as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
[00154] An example apparatus includes at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmit the information to a receiving device.
[00155] Other aspects of the apparatus may include the following. The information may be provided using at least one sample group entry object. The atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases. The atlas-to-view mapping metadata may specify a persistence of a previous atlas view supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to
the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas. The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video-based point cloud coding parameter set, and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: define a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas. The atlas-to-view mapping metadata or the atlas-to-object mapping metadata may be provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure. 
The one or more sets of components belonging to the same atlas may share an atlas identifier. A component of
the one or more sets of components may be atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components may represent a partial portion of a scene. An entire atlas may be culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata. The atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object. The atlas-to-object mapping metadata may indicate, for the at least one object, an atlas that contains patches referring back to the at least one object. The atlas-to-object mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases. The atlas-to-object mapping metadata may specify a persistence of a previous atlas object supplemental enhancement information message. The persistence may be specified using a flag, where the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
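By way of illustration, the bitmask-based atlas-to-view mapping and the flag-based persistence semantics described above may be realized at a decoder as in the following Python sketch. The payload layout (one byte carrying the persistence flag in its least significant bit, followed by a little-endian bitmask in which bit i marks view i) and all names are illustrative assumptions of this sketch, not normative syntax.

```python
def decode_atlas_view_sei(payload: bytes, num_views: int):
    """Decode an illustrative atlas-view SEI payload: the first byte
    carries the persistence flag in its least significant bit; the
    remaining bytes form a little-endian bitmask where bit i set means
    view i has patches in the atlas this message refers to."""
    persistence_flag = payload[0] & 0x01
    bitmask = int.from_bytes(payload[1:], "little")
    views = [i for i in range(num_views) if (bitmask >> i) & 1]
    return persistence_flag, views


class AtlasViewMappingState:
    """Track the active atlas-to-view mapping across atlas frames,
    honoring the persistence semantics described above."""

    def __init__(self):
        self.current = None      # mapping valid for the current frame
        self.persistent = False  # whether it carries over to later frames

    def on_sei(self, persistence_flag, views):
        self.current = set(views)
        self.persistent = persistence_flag == 1

    def on_new_frame(self):
        # flag == 0: the mapping applied to the previous frame only
        if not self.persistent:
            self.current = None

    def on_new_sequence(self):
        # a new coded sequence always ends persistence
        self.current = None
        self.persistent = False
```

A message with persistence flag 0 thus expires at the next frame, while one with flag 1 remains active until a new sequence begins, the bitstream ends, or a new mapping message replaces it.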
[00156] Other aspects of the apparatus may include the following. The information may be provided using at least one sample group entry object, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas. The atlas-to-view mapping metadata may be provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present. The at least one volumetric video bitstream may comprise one or more coded visual volumetric video-based coding sequences, where the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set, and where the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
The at least one memory and the computer program code may be further configured to, with the
at least one processor, cause the apparatus at least to: define a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling. The information may signal partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family. The volumetric video bitstream may be a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas, where the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure, the one or more sets of components belonging to the same atlas share an atlas identifier, or a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene. The at least one memory and the computer program code may be further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas, wherein an entire atlas is culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
The atlas-to-object mapping metadata may be provided as a supplemental enhancement information message, and indicate a value of the at least one object given an atlas identifier and an index of the at least one object, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object, wherein the atlas-to-object mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas
object supplemental enhancement information message, wherein the persistence is specified using a flag, wherein the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame, and wherein the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
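Once the atlas-to-object mapping metadata has been decoded, a client may discard every sub-bitstream sharing an atlas identifier whose mapped objects all lie outside the set of currently visible objects. The following is a minimal illustrative sketch, assuming the mapping is already available as a dictionary from atlas identifier to object indices; the names are assumptions of this sketch.

```python
def atlases_to_cull(atlas_to_objects, visible_objects):
    """Return the identifiers of atlases none of whose mapped objects
    appear in the visible set. Every component sharing such an atlas
    identifier (atlas data, occupancy, geometry, attributes) can then
    be skipped as a whole."""
    visible_objects = set(visible_objects)
    return {atlas_id for atlas_id, objects in atlas_to_objects.items()
            if not set(objects) & visible_objects}
```

For example, with atlases mapped as {0: {1, 2}, 1: {3}, 2: {2, 3}} and only object 2 visible, atlas 1 can be culled while atlases 0 and 2 are retained.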
[00157] An example method includes receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[00158] The method may further include rendering a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
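View-frustum culling of atlases may, for example, test a bounding volume derived for each atlas (e.g., from the extents of its mapped views or objects) against the viewer's frustum, so that only surviving components are decoded and rendered. The following is an illustrative bounding-sphere test, assuming frustum planes are supplied as inward-pointing (nx, ny, nz, d) plane coefficients; this convention and the function names are assumptions of this sketch.

```python
def sphere_in_frustum(center, radius, planes):
    """Return True if a bounding sphere may intersect the view frustum.
    Each plane is (nx, ny, nz, d) with its normal pointing into the
    frustum, so a point p is inside the half-space when n . p + d >= 0.
    The sphere is culled when it lies entirely behind any one plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False  # fully outside this plane: cull
    return True
```

An atlas whose bounding sphere fails this test for the current frustum can be culled in its entirety, and the remaining atlases rendered as described above.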
[00159] An example method includes providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
[00160] An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file
format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
[00161] An example non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations is provided, the operations comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
[00162] It should be understood that the foregoing description is only illustrative. Various alternatives and modifications may be devised by those skilled in the art. For example, features recited in the various dependent claims could be combined with each other in any suitable combination(s). In addition, features from different embodiments described above could be selectively combined into a new embodiment. Accordingly, the description is intended to embrace all such alternatives, modifications and variances which fall within the scope of the appended claims.
Claims
1. An apparatus comprising: means for receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
2. The apparatus of claim 1, further comprising:
means for rendering a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
3. The apparatus of any one of claims 1 to 2, wherein the atlas-to-view mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
4. The apparatus of any one of claims 1 to 3, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message.
5. The apparatus of claim 4, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
6. The apparatus of any one of claims 1 to 5, wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
7. The apparatus of any one of claims 1 to 6, wherein: the at least one volumetric video bitstream comprises one or more coded visual volumetric video-based coding sequences; the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set; and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
8. The apparatus of any one of claims 1 to 7, further comprising: means for interpreting a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
9. The apparatus of any one of claims 1 to 8, wherein the information signals partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
10. The apparatus of any one of claims 1 to 9, wherein the volumetric video bitstream is a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
11. The apparatus of any one of claims 1 to 10, wherein the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
12. The apparatus of any one of claims 1 to 11, wherein the one or more sets of components belonging to the same atlas share an atlas identifier.
13. The apparatus of any one of claims 1 to 12, wherein a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene.
14. The apparatus of any one of claims 1 to 13, further comprising: means for culling an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
15. The apparatus of any one of claims 1 to 14, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message, and indicates a value of the at least one object given an atlas identifier and an index of the at least one object.
16. The apparatus of any one of claims 1 to 15, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object.
17. The apparatus of any one of claims 1 to 16, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
18. The apparatus of any one of claims 1 to 17, wherein the atlas-to-object mapping metadata specifies a persistence of a
previous atlas object supplemental enhancement information message.
19. The apparatus of claim 18, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
20. The apparatus of any one of claims 1 to 19, wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
21. An apparatus comprising: means for providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view;
atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and means for transmitting the information to a receiving device.
22. The apparatus of claim 21, wherein the information is provided using at least one sample group entry object.
23. The apparatus of any one of claims 21 to 22, wherein the atlas-to-view mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
24. The apparatus of any one of claims 21 to 23, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message.
25. The apparatus of claim 24, wherein: the persistence is specified using a flag;
the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
26. The apparatus of any one of claims 21 to 25, further comprising: means for encoding the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas.
27. The apparatus of any one of claims 21 to 26, wherein: the at least one volumetric video bitstream comprises one or more coded visual volumetric video-based coding sequences; the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set; and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
28. The apparatus of any one of claims 21 to 27, further comprising:
means for defining a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
29. The apparatus of any one of claims 21 to 28, wherein the information signals partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
30. The apparatus of any one of claims 21 to 29, wherein the volumetric video bitstream is a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
31. The apparatus of any one of claims 21 to 30, wherein the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
32. The apparatus of any one of claims 21 to 31, wherein the one or more sets of components belonging to the same atlas share an atlas identifier.
33. The apparatus of any one of claims 21 to 32, wherein a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene.
34. The apparatus of any one of claims 21 to 33, wherein an entire atlas is culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
35. The apparatus of any one of claims 21 to 34, wherein the atlas-to-object mapping metadata is provided as a supplemental
enhancement information message, and indicates a value of the at least one object given an atlas identifier and an index of the at least one object.
36. The apparatus of any one of claims 21 to 35, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object.
37. The apparatus of any one of claims 21 to 36, wherein the atlas-to-object mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
38. The apparatus of any one of claims 21 to 37, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message.
39. The apparatus of claim 38, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
40. The apparatus of any one of claims 21 to 39, further comprising: means for encoding the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
41. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: receive information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at
least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and cull the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
42. The apparatus of claim 41, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: render a view frustum corresponding to one or more sets of components of the volumetric video bitstream that have not been culled.
43. The apparatus of any one of claims 41 to 42, wherein the atlas-to-view mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
44. The apparatus of any one of claims 41 to 43, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message.
45. The apparatus of claim 44, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the
current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
46. The apparatus of any one of claims 41 to 45, wherein the at least one volumetric video bitstream is culled after the at least one volumetric video bitstream has been encoded such that patches belonging to a certain view are grouped together in a single atlas.
47. The apparatus of any one of claims 41 to 46, wherein: the at least one volumetric video bitstream comprises one or more coded visual volumetric video-based coding sequences; the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set; and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
48. The apparatus of any one of claims 41 to 47, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: interpret a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
49. The apparatus of any one of claims 41 to 48, wherein the information signals partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
50. The apparatus of any one of claims 41 to 49, wherein the volumetric video bitstream is a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
51. The apparatus of any one of claims 41 to 50, wherein the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is received together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
52. The apparatus of any one of claims 41 to 51, wherein the one or more sets of components belonging to the same atlas share an atlas identifier.
53. The apparatus of any one of claims 41 to 52, wherein a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene.
54. The apparatus of any one of claims 41 to 53, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: cull an entire atlas using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
55. The apparatus of any one of claims 41 to 54, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message, and indicates a value of the
at least one object given an atlas identifier and an index of the at least one object.
56. The apparatus of any one of claims 41 to 55, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object.
57. The apparatus of any one of claims 41 to 56, wherein the atlas-to-object mapping metadata is received as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
58. The apparatus of any one of claims 41 to 57, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message.
59. The apparatus of claim 58, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
60. The apparatus of any one of claims 41 to 59, wherein the at least one volumetric video bitstream is culled after the at
least one volumetric video bitstream has been encoded such that patches belonging to a certain object are grouped together in a single atlas.
61. An apparatus comprising: at least one processor; and at least one non-transitory memory including computer program code; wherein the at least one memory and the computer program code are configured to, with the at least one processor, cause the apparatus at least to: provide information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and
transmit the information to a receiving device.
62. The apparatus of claim 61, wherein the information is provided using at least one sample group entry object.
63. The apparatus of any one of claims 61 to 62, wherein the atlas-to-view mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between views and atlases.
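The bitmask-based mapping of claim 63 can be illustrated with a small sketch. This is not the normative supplemental enhancement information syntax (which is defined by the visual volumetric video-based coding specifications); the function and field layout here are hypothetical, assuming one bitmask per atlas in which bit v is set when patches from view v are packed into that atlas:

```python
def decode_atlas_view_bitmasks(num_atlases, num_views, bitmasks):
    """Decode one per-atlas bitmask into the set of view indices whose
    patches are packed into that atlas (illustrative only)."""
    mapping = {}
    for atlas_id in range(num_atlases):
        mask = bitmasks[atlas_id]
        mapping[atlas_id] = [v for v in range(num_views) if (mask >> v) & 1]
    return mapping

# Example: atlas 0 holds patches from views 0 and 2; atlas 1 from view 1.
mapping = decode_atlas_view_bitmasks(2, 3, [0b101, 0b010])
# mapping == {0: [0, 2], 1: [1]}
```

A renderer can then skip fetching any atlas whose mapped views fall entirely outside the current viewport.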
64. The apparatus of any one of claims 61 to 63, wherein the atlas-to-view mapping metadata specifies a persistence of a previous atlas view supplemental enhancement information message.
65. The apparatus of claim 64, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas view supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas view supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
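The persistence semantics of claim 65 can be modeled as a small state machine. This is an illustrative sketch, not the normative decoding process; the frame representation (dicts with hypothetical keys 'sei' and 'new_sequence') is an assumption made for the example:

```python
def active_sei_per_frame(frames):
    """Return, for each atlas frame, the atlas-view SEI message that applies
    to it. Each frame is a dict with an optional 'sei' entry holding
    (message, persistence_flag) and an optional 'new_sequence' boolean
    (illustrative model of the claimed persistence rules)."""
    active = None
    result = []
    for frame in frames:
        if frame.get("new_sequence"):
            active = None  # persistence ends at the beginning of a new sequence
        if "sei" in frame:
            msg, persist = frame["sei"]
            result.append(msg)  # a present SEI message applies to this frame
            # flag == 1: persists for subsequent frames; flag == 0: this frame only
            active = msg if persist else None
        else:
            result.append(active)
    return result

frames = [{"sei": ("A", 1)}, {}, {"new_sequence": True}, {"sei": ("B", 0)}, {}]
# active_sei_per_frame(frames) == ["A", "A", None, "B", None]
```

The end-of-bitstream condition is implicit here: iteration simply stops.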
66. The apparatus of any one of claims 61 to 65, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to:
encode the at least one volumetric video bitstream such that patches belonging to a certain view are grouped together in a single atlas.
67. The apparatus of any one of claims 61 to 66, wherein: the at least one volumetric video bitstream comprises one or more coded visual volumetric video-based coding sequences; the one or more coded visual volumetric video-based coding sequences comprise at least one video based point cloud coding parameter set; and the one or more coded visual volumetric video-based coding sequences comprise at least one visual volumetric video-based coding sub-bitstream associated with a visual volumetric video-based coding component that starts with a corresponding intra random access picture sub-bitstream unit.
68. The apparatus of any one of claims 61 to 67, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: define a sample group entry that provides a mapping of the at least one atlas to the at least one object that configures a view-frustum culling.
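The atlas-to-object sample grouping of claim 68 enables view-frustum culling at the atlas level. The sketch below is a hypothetical illustration of that selection step, assuming the mapping and per-object bounds have already been parsed from the file; it is not an actual file-format parser:

```python
def atlases_to_fetch(frustum_contains, object_bounds, atlas_to_objects):
    """Select atlas identifiers worth fetching: keep an atlas if any object
    it contains patches for intersects the view frustum (illustrative)."""
    keep = set()
    for atlas_id, objects in atlas_to_objects.items():
        if any(frustum_contains(object_bounds[obj]) for obj in objects):
            keep.add(atlas_id)
    return keep

# Hypothetical 1D bounds for brevity; a real frustum test uses 3D boxes.
bounds = {"car": (0, 10), "tree": (50, 60)}
mapping = {0: ["car"], 1: ["tree"], 2: ["car", "tree"]}
in_frustum = lambda b: b[0] < 20
# atlases_to_fetch(in_frustum, bounds, mapping) == {0, 2}
```

Atlases outside the returned set, together with their occupancy, geometry, and attribute tracks, can be culled without decoding.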
69. The apparatus of any one of claims 61 to 68, wherein the information signals partial access utilizing a visual volumetric video-based coding supplemental enhancement information message family.
70. The apparatus of any one of claims 61 to 69, wherein the volumetric video bitstream is a set of visual volumetric video-based coding sub-bitstreams, which belong to the same atlas.
71. The apparatus of any one of claims 61 to 70, wherein the atlas-to-view mapping metadata or the atlas-to-object mapping metadata is provided together with a camera extrinsic in a view parameter extrinsic substructure of an adaptation parameter structure.
72. The apparatus of any one of claims 61 to 71, wherein the one or more sets of components belonging to the same atlas share an atlas identifier.
73. The apparatus of any one of claims 61 to 72, wherein a component of the one or more sets of components is atlas data, or video coded occupancy, attribute or geometry data, and the one or more sets of components represent a partial portion of a scene.
74. The apparatus of any one of claims 61 to 73, wherein an entire atlas is culled using the atlas-to-view mapping metadata and the atlas-to-object mapping metadata.
75. The apparatus of any one of claims 61 to 74, wherein the atlas-to-object mapping metadata is provided as a supplemental enhancement information message, and indicates a value of the at least one object given an atlas identifier and an index of the at least one object.
76. The apparatus of any one of claims 61 to 75, wherein the atlas-to-object mapping metadata indicates, for the at least one object, an atlas that contains patches referring back to the at least one object.
77. The apparatus of any one of claims 61 to 76, wherein the atlas-to-object mapping metadata is provided as a supplemental enhancement information message comprising a payload size and bitmask indicating mapping information between objects and atlases.
78. The apparatus of any one of claims 61 to 77, wherein the atlas-to-object mapping metadata specifies a persistence of a previous atlas object supplemental enhancement information message.
79. The apparatus of claim 78, wherein: the persistence is specified using a flag; the flag being equal to zero specifies that the atlas object supplemental enhancement information message applies to a current atlas frame; and the flag being equal to one specifies that the atlas object supplemental enhancement information message applies to the current atlas frame and persists for subsequent atlas frames in decoding order until meeting at least one condition comprising a beginning of a new sequence, an ending of the at least one volumetric video bitstream, or an atlas frame having a supplemental enhancement information message present.
80. The apparatus of any one of claims 61 to 79, wherein the at least one memory and the computer program code are further configured to, with the at least one processor, cause the apparatus at least to: encode the at least one volumetric video bitstream such that patches belonging to a certain object are grouped together in a single atlas.
81. A method comprising:
receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
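The receiving-side culling step of claim 81 drops every component sharing a culled atlas identifier (atlas data plus the video-coded occupancy, geometry, and attribute sub-bitstreams). A minimal sketch, with a hypothetical component representation:

```python
def cull_components(components, atlases_to_cull):
    """Drop every component whose atlas identifier is in the cull set.
    Components are dicts like {"atlas_id": 1, "type": "geometry"}
    (illustrative model of the claimed culling operation)."""
    return [c for c in components if c["atlas_id"] not in atlases_to_cull]

stream = [
    {"atlas_id": 0, "type": "atlas_data"},
    {"atlas_id": 0, "type": "geometry"},
    {"atlas_id": 1, "type": "occupancy"},
]
# cull_components(stream, {1}) keeps only the two atlas-0 components.
```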
82. A method comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of:
atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
83. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: receiving information to cull one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view;
atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas; patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and culling the one or more sets of components belonging to the same atlas from the at least one volumetric video bitstream, based on the information.
84. A non-transitory program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine for performing operations, the operations comprising: providing information related to a culling of one or more sets of components belonging to the same atlas from at least one volumetric video bitstream; wherein the information comprises one or more of: atlas-to-view mapping metadata that indicates an association between patches in at least one atlas and at least one view; atlas-to-object mapping metadata that indicates an association between at least one object and the at least one atlas;
patch atlas metadata within a metadata for immersive video master atlas to indicate sub-bitstream culling based on fine-grained patch visibility culling; or partial access metadata, wherein leveraging the partial access metadata comprises storing each of the at least one atlas in its own track, and providing sample grouping information to indicate culling on a file format level; and transmitting the information to a receiving device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21764191.9A EP4115624A4 (en) | 2020-03-03 | 2021-03-01 | Efficient culling of volumetric video atlas bitstreams |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202062984410P | 2020-03-03 | 2020-03-03 | |
US62/984,410 | 2020-03-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021176139A1 true WO2021176139A1 (en) | 2021-09-10 |
Family
ID=77555167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/FI2021/050146 WO2021176139A1 (en) | 2020-03-03 | 2021-03-01 | Efficient culling of volumetric video atlas bitstreams |
Country Status (3)
Country | Link |
---|---|
US (1) | US11240532B2 (en) |
EP (1) | EP4115624A4 (en) |
WO (1) | WO2021176139A1 (en) |
Families Citing this family (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11818401B2 (en) | 2017-09-14 | 2023-11-14 | Apple Inc. | Point cloud geometry compression using octrees and binary arithmetic encoding with adaptive look-up tables |
US11113845B2 (en) | 2017-09-18 | 2021-09-07 | Apple Inc. | Point cloud compression using non-cubic projections and masks |
US11010928B2 (en) | 2018-04-10 | 2021-05-18 | Apple Inc. | Adaptive distance based point cloud compression |
US11017566B1 (en) | 2018-07-02 | 2021-05-25 | Apple Inc. | Point cloud compression with adaptive filtering |
US11202098B2 (en) | 2018-07-05 | 2021-12-14 | Apple Inc. | Point cloud compression with multi-resolution video encoding |
US11012713B2 (en) | 2018-07-12 | 2021-05-18 | Apple Inc. | Bit stream structure for compressed point cloud data |
US11367224B2 (en) | 2018-10-02 | 2022-06-21 | Apple Inc. | Occupancy map block-to-patch information compression |
US11711544B2 (en) | 2019-07-02 | 2023-07-25 | Apple Inc. | Point cloud compression with supplemental information messages |
WO2021002592A1 (en) * | 2019-07-03 | 2021-01-07 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
US11627314B2 (en) | 2019-09-27 | 2023-04-11 | Apple Inc. | Video-based point cloud compression with non-normative smoothing |
US11562507B2 (en) * | 2019-09-27 | 2023-01-24 | Apple Inc. | Point cloud compression using video encoding with time consistent patches |
US11538196B2 (en) | 2019-10-02 | 2022-12-27 | Apple Inc. | Predictive coding for point cloud compression |
US11895307B2 (en) | 2019-10-04 | 2024-02-06 | Apple Inc. | Block-based predictive coding for point cloud compression |
US11798196B2 (en) | 2020-01-08 | 2023-10-24 | Apple Inc. | Video-based point cloud compression with predicted patches |
US11625866B2 (en) | 2020-01-09 | 2023-04-11 | Apple Inc. | Geometry encoding using octrees and predictive trees |
CN115668938A (en) * | 2020-03-18 | 2023-01-31 | Lg电子株式会社 | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device, and point cloud data receiving method |
WO2021206333A1 (en) * | 2020-04-11 | 2021-10-14 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method |
EP4124032A4 (en) * | 2020-04-12 | 2023-05-31 | LG Electronics, Inc. | Device for transmitting point cloud data, method for transmitting point cloud data, device for receiving point cloud data, and method for receiving point cloud data |
WO2021210837A1 (en) * | 2020-04-13 | 2021-10-21 | 엘지전자 주식회사 | Device for transmitting point cloud data, method for transmitting point cloud data, device for receiving point cloud data, and method for receiving point cloud data |
US11838485B2 (en) * | 2020-04-16 | 2023-12-05 | Electronics And Telecommunications Research Institute | Method for processing immersive video and method for producing immersive video |
WO2021256909A1 (en) * | 2020-06-19 | 2021-12-23 | 엘지전자 주식회사 | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method |
US11615557B2 (en) | 2020-06-24 | 2023-03-28 | Apple Inc. | Point cloud compression using octrees with slicing |
US11620768B2 (en) | 2020-06-24 | 2023-04-04 | Apple Inc. | Point cloud geometry compression using octrees with multiple scan orders |
CN114554243B (en) * | 2020-11-26 | 2023-06-20 | 腾讯科技(深圳)有限公司 | Data processing method, device and equipment of point cloud media and storage medium |
US11831920B2 (en) * | 2021-01-08 | 2023-11-28 | Tencent America LLC | Method and apparatus for video coding |
US11948338B1 (en) | 2021-03-29 | 2024-04-02 | Apple Inc. | 3D volumetric content encoding using 2D videos and simplified 3D meshes |
CN113849579B (en) * | 2021-09-27 | 2024-06-28 | 支付宝(杭州)信息技术有限公司 | Knowledge graph data processing method and system based on knowledge view |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019202207A1 (en) * | 2018-04-19 | 2019-10-24 | Nokia Technologies Oy | Processing video patches for three-dimensional content |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11902540B2 (en) * | 2019-10-01 | 2024-02-13 | Intel Corporation | Immersive video coding using object metadata |
2021
- 2021-03-01 EP EP21764191.9A patent/EP4115624A4/en active Pending
- 2021-03-01 US US17/188,295 patent/US11240532B2/en active Active
- 2021-03-01 WO PCT/FI2021/050146 patent/WO2021176139A1/en unknown
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019202207A1 (en) * | 2018-04-19 | 2019-10-24 | Nokia Technologies Oy | Processing video patches for three-dimensional content |
Non-Patent Citations (4)
Title |
---|
A. M. TOURAPIS (APPLE), J. KIM, K. MAMMOU, V. ZAKHARCHENKO, J. BOYCE, B. SALAHIEH, L. KONRAD, R. JOSHI,: "[V-PCC] Object Annotation of Patches and Volumetric Rectangles in V-PCC", 129. MPEG MEETING; 20200113 - 20200117; BRUSSELS; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 11 January 2020 (2020-01-11), XP030225282 * |
CHUCHU WANG, BIN WANG, YULE SUN, LU YU: "[MPEG-I Visual] Immersive Video CE1.2:Culling of subblocks for viewport rendering", 129. MPEG MEETING; 20200113 - 20200117; BRUSSELS; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 8 January 2020 (2020-01-08), XP030224938 * |
See also references of EP4115624A4 * |
SEJIN OH (LGE): "[PCC-SYSTEM]Mapping between 3D spatial regions of point cloud data and subset of V-PCC components for partial access of point cloud data", 129. MPEG MEETING; 20200113 - 20200117; BRUSSELS; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), 11 January 2020 (2020-01-11), XP030224886 * |
Also Published As
Publication number | Publication date |
---|---|
US11240532B2 (en) | 2022-02-01 |
US20210281879A1 (en) | 2021-09-09 |
EP4115624A1 (en) | 2023-01-11 |
EP4115624A4 (en) | 2024-03-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11240532B2 (en) | Efficient culling of volumetric video atlas bitstreams | |
CN114930863B (en) | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device and point cloud data receiving method | |
CN114930813B (en) | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device and point cloud data receiving method | |
CN114946178B (en) | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device and point cloud data receiving method | |
KR20210117142A (en) | Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus and point cloud data reception method | |
EP4135319A1 (en) | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method | |
US11601634B2 (en) | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method | |
JP2023509190A (en) | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method | |
EP4171040A1 (en) | Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method | |
EP4088480A1 (en) | Storage of multiple atlases from one v-pcc elementary stream in isobmff | |
CN115428442B (en) | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device and point cloud data receiving method | |
WO2021062645A1 (en) | File format for point cloud data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21764191 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
ENP | Entry into the national phase |
Ref document number: 2021764191 Country of ref document: EP Effective date: 20221004 |