WO2021251141A1 - Information processing device and method
- Publication number: WO2021251141A1
- Application number: PCT/JP2021/019969
- Authority: WO, WIPO (PCT)
- Prior art keywords: tile, information, file, subsample, data unit
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/85406—Content authoring involving a specific file format, e.g. MP4 format
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/116—Details of conversion of file system types or formats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/001—Model-based coding, e.g. wire frame
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g 3D video
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Definitions
- The present disclosure relates to an information processing device and method, and more particularly to an information processing device and method capable of suppressing an increase in the load of reproduction processing.
- G-PCC: Geometry-based Point Cloud Compression.
- ISOBMFF: International Organization for Standardization Base Media File Format.
- A method of storing the G-PCC bitstream in ISOBMFF is being standardized in MPEG-I Part 18 (ISO/IEC 23090-18) for the purpose of improving the efficiency of playback of the G-PCC-encoded bitstream from local storage and of network distribution (see, for example, Non-Patent Document 3).
- The G-PCC bitstream can have a partial access structure in which the bitstream of some of the points can be decoded and reproduced independently.
- A data unit that can be independently decoded and played back (independently accessible) is called a tile.
- A profile has been proposed that decodes only the part of the point cloud within the field of view, or decodes the part closer to the viewpoint position at a higher resolution (see, for example, Non-Patent Document 4).
- In the method described in Non-Patent Document 3, a G-PCC bitstream having such a partial access structure is stored in a different track for each partially playable partial point cloud. In other words, the granularity of partial access depends on the number of tracks.
- In general, the larger the point cloud, the more diverse the partial access that may be required. That is, with the method described in Non-Patent Document 3, more tracks are required. As the number of tracks increases, the file size may increase. Further, as the number of tracks increases, managing the tracks becomes more complex, so the load of the reproduction process may increase.
- Further, in the method described in Non-Patent Document 3, the information indicating the relationship between a tile and a data unit is stored only in the header of each data unit in the G-PCC bitstream. Therefore, the G-PCC bitstream must be parsed in order to identify the data units to be extracted. That is, parts of the G-PCC bitstream that are not needed for reproduction must also be parsed, which may increase the load of the reproduction process.
- This disclosure has been made in view of such a situation, and is intended to make it possible to suppress an increase in the load of the reproduction process.
- The information processing device of one aspect of the present technology is an information processing device including: a tile management information generation unit that, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which expresses a three-dimensionally shaped object as a set of points, generates tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and a file generation unit that generates the file storing the bitstream and the tile management information.
- The information processing method of one aspect of the present technology is an information processing method including: generating, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which expresses a three-dimensionally shaped object as a set of points, tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and generating the file storing the bitstream and the tile management information.
- The information processing device of the other aspect of the present technology is an information processing device including an extraction unit that extracts, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud expressing a three-dimensionally shaped object as a set of points, and that is information for managing, using tile identification information indicating the tile of the point cloud, the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- The information processing method of the other aspect of the present technology is an information processing method including extracting, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud expressing a three-dimensionally shaped object as a set of points, and that is information for managing, using tile identification information indicating the tile of the point cloud, the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- In the information processing device and method of one aspect of the present technology, tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which expresses a three-dimensionally shaped object as a set of points, is used to generate tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample, and a file storing the bitstream and the tile management information is generated.
- In the information processing device and method of the other aspect of the present technology, the portion of the bitstream necessary for reproducing a desired tile is extracted from the file based on the tile management information, which is stored in the file together with the bitstream of the point cloud and which manages, using the tile identification information indicating the tile of the point cloud, the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- FIG. 1 is a diagram explaining the outline of G-PCC.
- FIG. 2 is a diagram explaining partial access.
- FIG. 3 is a diagram showing a structural example of a G-PCC bitstream.
- FIG. 4 is a diagram showing an example of the syntax of the tile inventory.
- FIG. 5 is a diagram showing an example of a file structure.
- FIG. 6 is a diagram showing an example of a file structure.
- FIG. 7 is a diagram explaining scalable decoding.
- FIG. 8 is a diagram explaining the signaling of the tile identification information.
- FIG. 9 is a diagram showing an example of the file structure in the case of a single track.
- FIG. 10 is a diagram explaining an example of signaling of tile identification information.
- FIG. 11 is a diagram explaining an example of the SubSampleInformationBox.
- FIG. 12 is a diagram showing an example of codec_specific_parameters.
- Non-Patent Document 1 (above)
- Non-Patent Document 2 (above)
- Non-Patent Document 3 (above)
- Non-Patent Document 4 (above)
- Non-Patent Document 5 https://www.matroska.org/index.html
- <Point cloud> Conventionally, there has been 3D data such as a point cloud, which represents a three-dimensional structure by point position information and attribute information.
- In the case of a point cloud, a three-dimensional structure (an object with a three-dimensional shape) is expressed as a set of a large number of points.
- The point cloud is composed of position information (also referred to as geometry) and attribute information (also referred to as attributes) of each point.
- Attributes can contain any information.
- For example, the attributes may include color information, reflectance information, normal information, and the like for each point.
- The point cloud has a relatively simple data structure and can express an arbitrary three-dimensional structure with sufficient accuracy by using a sufficiently large number of points.
- Non-Patent Document 1 discloses a coding technique called Geometry-based Point Cloud Compression (G-PCC), which encodes the point cloud separately as geometry and attributes.
- G-PCC is in the process of being standardized in MPEG-I Part 9 (ISO/IEC 23090-9).
- Octree coding as shown in FIG. 1 is applied to the compression of the geometry.
- Octree coding is a method of expressing, with an octree as shown on the right of FIG. 1, the presence or absence of points in each block of voxel-represented data as shown on the left of FIG. 1. In this method, as shown in FIG. 1, a block in which a point exists is expressed as 1, and a block in which no point exists is expressed as 0.
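- The occupancy idea described above can be illustrated with a minimal Python sketch. It is not the G-PCC geometry coder: the function name `octree_occupancy`, the child ordering (x, y, z mapped to bits 2, 1, 0), and the breadth-first output order are assumptions made only for illustration. Each node emits one 8-bit code in which a bit is 1 when the corresponding child block contains at least one point and 0 otherwise.

```python
from typing import Dict, Iterable, List, Set, Tuple

Point = Tuple[int, int, int]


def octree_occupancy(points: Iterable[Point], depth: int) -> List[int]:
    """Return breadth-first 8-bit occupancy codes for integer points
    lying in a cube of side 2**depth whose corner is at the origin."""
    size = 1 << depth
    # Each node is (origin_x, origin_y, origin_z, points_inside).
    nodes = [(0, 0, 0, set(points))]
    codes: List[int] = []
    while size > 1:
        half = size // 2
        next_nodes = []
        for ox, oy, oz, pts in nodes:
            children: Dict[int, Set[Point]] = {}
            for (x, y, z) in pts:
                # Hypothetical child index: bit 2 = x half, bit 1 = y half, bit 0 = z half.
                idx = (int(x - ox >= half) << 2) | (int(y - oy >= half) << 1) | int(z - oz >= half)
                children.setdefault(idx, set()).add((x, y, z))
            code = 0
            for idx in children:
                code |= 1 << idx  # 1 = block contains a point, 0 = empty block
            codes.append(code)
            for idx in sorted(children):
                cx = ox + half * ((idx >> 2) & 1)
                cy = oy + half * ((idx >> 1) & 1)
                cz = oz + half * (idx & 1)
                next_nodes.append((cx, cy, cz, children[idx]))
        nodes = next_nodes
        size = half
    return codes
```

- Only occupied blocks are subdivided further, which is why sparse point clouds yield short code sequences.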
- The coded data (bitstream) generated by encoding the geometry as described above is also referred to as a geometry bitstream.
- The bitstream generated by encoding the attributes is also referred to as an attribute bitstream.
- A bitstream that combines a geometry bitstream and an attribute bitstream into one is also referred to as a G-PCC bitstream.
- The G-PCC bitstream can have a partial access structure in which the bitstream of some of the points can be decoded and reproduced independently.
- In a point cloud with this partial access structure, tiles and slices are the data units that can be independently decoded and played back (independently accessible).
- The bounding box 21 is set so as to include the object 20 having a three-dimensional shape.
- The tile 22 is a rectangular parallelepiped region within the bounding box 21.
- The slice 24 is a set of points in the tile 23. Points may overlap between slices (that is, one point may belong to multiple slices).
- The point cloud at a certain time is called a point cloud frame.
- This frame is a data unit corresponding to a frame in a two-dimensional moving image.
- <G-PCC bitstream with partial access structure> FIG. 3 shows an example of the main structure of a G-PCC bitstream in which such a partially accessible point cloud is encoded (an example of the type-length-value bytestream format defined in Annex B of Non-Patent Document 1). That is, the G-PCC bitstream shown in FIG. 3 has a partial access structure, and a part of it can be extracted and decoded independently of the rest.
- In FIG. 3, each square represents one type-length-value encapsulation structure (tlv_encapsulation()).
- The G-PCC bitstream has a sequence parameter set (SPS), a geometry parameter set (GPS), one or more attribute parameter sets (APS(s)), a tile inventory, geometry data units, and attribute data units.
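- As a rough illustration of walking a sequence of type-length-value structures, the following Python sketch iterates over a byte buffer. The header layout used here (a 1-byte type followed by a 4-byte big-endian payload length) is an assumption for this sketch and should be checked against Annex B of Non-Patent Document 1; the function name `iter_tlv` is hypothetical.

```python
import struct
from typing import Iterator, Tuple


def iter_tlv(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, payload) pairs from a buffer of back-to-back TLV structures.

    Assumed header: 1-byte type, then 4-byte big-endian payload length.
    """
    pos = 0
    while pos < len(buf):
        if pos + 5 > len(buf):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from(">BI", buf, pos)
        pos += 5
        if pos + length > len(buf):
            raise ValueError("truncated TLV payload")
        yield tlv_type, buf[pos:pos + length]
        pos += length
```

- Because each structure carries its own length, a reader can skip a payload without decoding it, but it still has to visit every header in order to find the units it wants.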
- The sequence parameter set is a parameter set that has parameters related to the entire sequence.
- A geometry parameter set is a parameter set that has parameters related to geometry.
- An attribute parameter set is a parameter set that has parameters related to attributes. There may be multiple geometry parameter sets and attribute parameter sets. The geometry parameter set and the attribute parameter set may differ from slice to slice (they can be set on a slice-by-slice basis).
- The tile inventory manages information about tiles.
- For example, the tile inventory stores identification information, position information, size information, and the like for each tile.
- FIG. 4 shows an example of the syntax of the tile inventory.
- As shown in FIG. 4, the tile inventory stores tile identification information (tile_id), tile position and size information (tile_bounding_box_offset_xyz, tile_bounding_box_size_xyz), and the like for each tile.
- The tile inventory is variable on a frame-by-frame basis (it can be set for each frame).
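- A minimal Python sketch of how a client might use the tile inventory fields named above: it models one inventory entry and returns the tile_id values whose bounding boxes overlap a region of interest. The class name `TileEntry`, the function `tiles_intersecting`, and the half-open box convention are assumptions for illustration; only the three field names come from the syntax described above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Vec3 = Tuple[int, int, int]


@dataclass(frozen=True)
class TileEntry:
    tile_id: int
    tile_bounding_box_offset_xyz: Vec3
    tile_bounding_box_size_xyz: Vec3


def tiles_intersecting(inventory: Iterable[TileEntry],
                       region_min: Vec3, region_max: Vec3) -> List[int]:
    """Return tile_id values whose boxes overlap [region_min, region_max)."""
    hits = []
    for tile in inventory:
        lo = tile.tile_bounding_box_offset_xyz
        hi = tuple(o + s for o, s in zip(lo, tile.tile_bounding_box_size_xyz))
        # Two axis-aligned boxes overlap only if they overlap on every axis.
        if all(lo[i] < region_max[i] and region_min[i] < hi[i] for i in range(3)):
            hits.append(tile.tile_id)
    return hits
```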
- A data unit is a unit of data that can be extracted independently of the others.
- A geometry data unit is a data unit of geometry.
- An attribute data unit is a data unit of attributes.
- An attribute data unit is generated for each kind of attribute information included in the attributes.
- A slice is composed of one geometry data unit and zero or more attribute data units.
- A slice is composed of a single data unit or a plurality of consecutive data units in the G-PCC bitstream.
- Each data unit stores slice identification information (slice_id) indicating the slice to which the data unit belongs. That is, the same slice identification information is stored in the data units belonging to the same slice. In this way, the geometry data unit and the attribute data units belonging to the same slice are associated with each other by the slice identification information.
- A tile is composed of a single slice or a plurality of consecutive slices in the G-PCC bitstream.
- Each geometry data unit stores tile identification information (tile_id) indicating the tile to which its slice belongs. That is, the geometry data units belonging to the same tile store the same tile identification information, and slices belonging to the same tile are associated with each other by the tile identification information.
- This tile identification information is managed in the tile inventory as described above, and is linked to information such as the position and size in three-dimensional space of the tile corresponding to each tile identification information.
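- The two-step association described above (attribute data unit to slice by slice_id, slice to tile by the tile_id in the geometry data unit) can be sketched in Python as follows. The `DataUnit` class and `group_by_tile` function are hypothetical; the sketch assumes the data unit headers have already been parsed, and that parsing is exactly the work a player needs when this mapping exists only inside the bitstream.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class DataUnit:
    kind: str                      # "geometry" or "attribute"
    slice_id: int
    tile_id: Optional[int] = None  # carried only by geometry data units


def group_by_tile(units: Sequence[DataUnit]) -> Dict[int, List[int]]:
    """Map each tile_id to the indices of the data units that belong to it."""
    # Step 1: the geometry data unit of each slice tells us the slice's tile.
    slice_to_tile = {u.slice_id: u.tile_id for u in units if u.kind == "geometry"}
    # Step 2: every data unit inherits the tile of its slice.
    tiles: Dict[int, List[int]] = {}
    for index, unit in enumerate(units):
        tiles.setdefault(slice_to_tile[unit.slice_id], []).append(index)
    return tiles
```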
- Non-Patent Document 2 discloses ISOBMFF (International Organization for Standardization Base Media File Format), which is the file container specification of MPEG-4 (Moving Picture Experts Group-4), an international standard for video compression.
- Non-Patent Document 3 discloses a method of storing a G-PCC bitstream in ISOBMFF for the purpose of improving the efficiency of playback of the G-PCC-encoded bitstream from local storage and of network distribution. This method is being standardized in MPEG-I Part 18 (ISO/IEC 23090-18).
- FIG. 5 is a diagram showing an example of the file structure in that case.
- A G-PCC bitstream stored in ISOBMFF is called a G-PCC file.
- The sequence parameter set is stored in the GPCCDecoderConfigurationRecord of the G-PCC file.
- The GPCCDecoderConfigurationRecord may further include a geometry parameter set, an attribute parameter set, and a tile inventory, depending on the sample entry type.
- A sample in the media data box includes the geometry slices and attribute slices corresponding to one point cloud frame.
- The sample may also include geometry parameter sets, attribute parameter sets, and tile inventories, depending on the sample entry type.
- The G-PCC file has a structure for accessing and decoding a partial point cloud based on three-dimensional spatial information.
- The G-PCC file stores each partial point cloud in a different track.
- A partial point cloud consists of one or more tiles.
- For example, the point cloud frame 61 is composed of the partial point cloud 61A, the partial point cloud 61B, and the partial point cloud 61C.
- The partial point cloud 61A, the partial point cloud 61B, and the partial point cloud 61C are each stored in a different track (G-PCC track) of the G-PCC file. With such a structure, the tiles to be played can be selected by selecting the track to be played.
- Non-Patent Document 4 discloses a profile that supports scalable decoding of a point cloud. This profile enables decoding and rendering according to the viewpoint position, for example, during local playback of a large-scale point cloud still image. For example, in FIG. 7, the area outside the field of view 73 when looking in the line-of-sight direction 72 from the viewpoint 71 (the white background area in the figure) is not decoded, and only the partial point cloud within the field of view 73 is decoded and reproduced.
- The profile further enables decoding and rendering according to the viewpoint position such that the partial point cloud in the area near the viewpoint 71 is decoded and played at high LoD (high resolution), while the partial point cloud in the area far from the viewpoint 71 (the light gray area in the figure) is decoded and played at low LoD (low resolution). As a result, the reproduction of unnecessary information is reduced, so an increase in the load of the reproduction process can be suppressed.
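- The viewpoint-dependent behavior described above can be sketched as a simple per-tile decision in Python. This is an illustrative policy, not the one defined by the profile in Non-Patent Document 4: the function name `choose_lod`, the cone-shaped field of view, the single distance threshold `near_distance`, and the use of a tile center point are all assumptions. `view_dir` is assumed to be a nonzero vector.

```python
import math
from typing import Optional, Sequence


def choose_lod(center: Sequence[float], viewpoint: Sequence[float],
               view_dir: Sequence[float], half_fov_deg: float,
               near_distance: float) -> Optional[str]:
    """Return None (skip decoding), "high", or "low" for one tile."""
    to_tile = [c - v for c, v in zip(center, viewpoint)]
    dist = math.sqrt(sum(d * d for d in to_tile))
    if dist == 0.0:
        return "high"  # the viewpoint sits on the tile center
    norm = math.sqrt(sum(d * d for d in view_dir))
    cos_angle = sum(a * b for a, b in zip(to_tile, view_dir)) / (dist * norm)
    if cos_angle < math.cos(math.radians(half_fov_deg)):
        return None    # outside the field of view: not decoded
    return "high" if dist <= near_distance else "low"
```

- A player would evaluate such a rule for each tile listed in the tile inventory and then fetch only the tiles that return a level.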
- In the method described in Non-Patent Document 3, a G-PCC bitstream having such a partial access structure is stored in a different track for each partially playable partial point cloud. In other words, the granularity of partial access depends on the number of tracks.
- In general, the larger the point cloud, the more diverse the partial access that may be required. That is, with the method described in Non-Patent Document 3, more tracks are required. As the number of tracks increases, the file size may increase. Further, as the number of tracks increases, managing the tracks becomes more complex, so the load of the reproduction process may increase.
- Further, in the method described in Non-Patent Document 3, the information indicating the relationship between a tile and a data unit is stored only in the header of each data unit in the G-PCC bitstream. Therefore, the G-PCC bitstream must be parsed in order to identify the data units to be extracted. That is, parts of the G-PCC bitstream that are not needed for reproduction must also be parsed, which may increase the load of the reproduction process.
- Therefore, the tile identification information is stored in the G-PCC file.
- For example, subsamples are formed within a sample in the G-PCC file, and the tile identification information is stored in the G-PCC file as tile management information for managing the tile corresponding to each subsample.
- For example, an information processing device is provided with: a tile management information generation unit that, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which expresses a three-dimensionally shaped object as a set of points, generates tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and a file generation unit that generates a file storing the bitstream and the tile management information.
- For example, in an information processing method, tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which expresses a three-dimensionally shaped object as a set of points, is used to generate tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample, and a file storing the bitstream and the tile management information is generated.
- For example, an information processing device is provided with an extraction unit that extracts, from a file, the portion of the bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud expressing a three-dimensionally shaped object as a set of points, and that manages, using tile identification information indicating the tile of the point cloud, the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- For example, in an information processing method, the portion of the bitstream necessary for reproducing a desired tile is extracted from a file, based on tile management information that is stored in the file together with the bitstream of a point cloud expressing a three-dimensionally shaped object as a set of points, and that manages, using tile identification information indicating the tile of the point cloud, the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- Use cases of the G-PCC bitstream include the encoding of large-scale point cloud data such as point cloud map data and virtual assets in movie production (digitized data of real movie sets).
- Such a large-scale point cloud is mainly expected to be played back in part rather than as a whole. Since the client generally has a limited cache size, only the required region is decoded and rendered each time, instead of decoding the entire G-PCC bitstream.
- For example, it is expected that only the partial point cloud in the visible area will be decoded and rendered according to the viewpoint position, or that the partial point cloud in a nearby area will be decoded and rendered at high LoD (high resolution) while the partial point cloud in a distant area will be decoded and rendered at low LoD (low resolution).
- Therefore, as described above, by extracting and decoding only the information necessary for reproducing the desired tile, based on the tile identification information managed in the tile management information, and generating the presentation information from it, an increase in the load on the client can be suppressed.
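- A minimal Python sketch of the extraction step, assuming the tile management information has already been read from the file as a list of (subsample size, tile identification information) pairs for one sample. The function name `extract_for_tiles`, the use of `None` to mark subsamples that are not tied to a tile, and the choice to always keep those subsamples are assumptions for illustration. The point is that the byte ranges are found from the file-level list alone, without parsing data unit headers in the bitstream.

```python
from typing import Iterable, Optional, Set, Tuple

# (subsample_size in bytes, tile_id or None when the subsample is not tied to a tile)
SubsampleEntry = Tuple[int, Optional[int]]


def extract_for_tiles(sample: bytes, entries: Iterable[SubsampleEntry],
                      wanted: Set[int]) -> bytes:
    """Return the bytes of the subsamples needed to play the wanted tiles."""
    out = bytearray()
    pos = 0
    for size, tile_id in entries:
        # Keep subsamples not tied to a tile (assumed to be needed for decoding)
        # and the subsamples of the requested tiles; skip the rest.
        if tile_id is None or tile_id in wanted:
            out += sample[pos:pos + size]
        pos += size
    if pos != len(sample):
        raise ValueError("subsample sizes do not add up to the sample size")
    return bytes(out)
```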
- FIG. 9 is a diagram showing a main configuration example of a G-PCC file in the case of a single track. As shown in FIG. 9, in the case of a single track, both geometry data units and attribute data units may be stored in a sample.
- A sample that stores G-PCC data is also referred to as a G-PCC sample.
- The tile management information may be generated for each track of the file and may include a list of the tile identification information corresponding to the subsamples stored in that track.
- For example, when the G-PCC file is an ISOBMFF file, the tile management information may be stored in the box, within the moov box of the G-PCC file, that stores information about subsamples.
- As shown in FIG. 10, the moov box of the G-PCC file contains a SubSampleInformationBox ('subs') as specified by ISO/IEC 23090-18.
- The tile management information (list of tile identification information) may be stored in this SubSampleInformationBox (method 1-1).
- The SubSampleInformationBox may also be stored in the moof box of the G-PCC file.
- For example, the file generator that generates the ISOBMFF file parses the G-PCC bitstream, extracts the tile identification information and the like, and stores the extracted information as tile management information in the SubSampleInformationBox.
- Alternatively, the file generator acquires the tile identification information and the like from the encoder and stores it as tile management information in the SubSampleInformationBox. In either way, the tile management information (list of tile identification information) can be stored in the SubSampleInformationBox.
- FIG. 11 shows an example of the syntax of this SubSampleInformationBox.
- As shown in FIG. 11, the SubSampleInformationBox provides codec_specific_parameters, which can store arbitrary parameters.
- This codec_specific_parameters may be extended to store the tile management information (list of tile identification information).
- By storing the tile management information in an existing box in this way, the affinity with the conventional standard can be improved. This makes it possible to realize a file that can be processed even by a general-purpose encoder or decoder.
- the G-PCC data in the G-PCC sample may be subsampled for each tile (method 1-1-1).
- The G-PCC data stored in the sample is subsampled for each tile. That is, a single data unit or consecutive data units of the bitstream stored in the sample, consisting of geometry data units and/or attribute data units that belong to the same tile, may be set as a subsample.
- the tile management information may include information for associating such a subsample with the tile identification information corresponding to the data unit of the geometry contained in the subsample.
- A slice contains one geometry data unit, and a tile is made up of one or more slices. Therefore, in the case of a single track, such a subsample can also be said to be composed of a single data unit or consecutive data units that include one or more geometry data units.
- a single or continuous parameter set and tile inventory are also set as subsamples.
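- The per-tile subsampling described above can be sketched as follows. This is a minimal illustration, not the method of the specification: the `DataUnit` class, its `kind` labels, and the `subsample_per_tile` function are hypothetical names, and each data unit is assumed to already carry the tile it belongs to (`None` for parameter sets and the tile inventory).

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class DataUnit:
    # Hypothetical labels: "parameter_set", "tile_inventory", "geometry", "attribute"
    kind: str
    # Tile the unit belongs to; None for units that belong to no tile
    tile_id: Optional[int] = None


def subsample_per_tile(units: List[DataUnit]) -> List[Tuple[Optional[int], List[DataUnit]]]:
    """Group consecutive data units that share the same tile into one subsample."""
    return [(tile_id, list(run)) for tile_id, run in groupby(units, key=lambda u: u.tile_id)]


sample = [
    DataUnit("parameter_set"),
    DataUnit("tile_inventory"),
    DataUnit("geometry", 0), DataUnit("attribute", 0),
    DataUnit("geometry", 1), DataUnit("attribute", 1),
    DataUnit("geometry", 1), DataUnit("attribute", 1),  # second slice of tile 1
]
subsamples = subsample_per_tile(sample)
```

- In this sketch the leading parameter set and tile inventory form one subsample that belongs to no tile, and the two slices of tile 1 end up in a single subsample because their data units are consecutive.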
- In the tile management information described above, as shown in FIG. 10, information (data_units_for_tile) indicating whether or not the subsample is a tile is stored for each subsample. If data_units_for_tile is false (for example, 0), the subsample consists of a single or consecutive parameter sets and tile inventory. If data_units_for_tile is true (for example, 1), the subsample consists of a single data unit or consecutive data units constituting the same tile.
- For a subsample whose data_units_for_tile is true, tile identification information (tile_id) is further stored.
- the tile_id indicates the tile corresponding to the subsample (that is, the tile to which the data unit constituting the subsample belongs).
- The association between the geometry data unit and the attribute data unit is omitted in the tile management information (the two are linked by being included in the same subsample).
- FIG. 12 is a diagram showing an example of the syntax of codec specific parameters.
- In the codec specific parameters, data_units_for_tile and tile_id are stored for each subsample as tile management information. That is, in the tile management information, information (data_units_for_tile) indicating whether or not the subsample is a tile is stored for each subsample, and if data_units_for_tile is true, tile identification information (tile_id) is further stored.
- the codec specific parameters may be extended by using flags.
- the contents of codec specific parameters can be switched depending on the value of flags. Therefore, tile management information (data_units_for_tile, tile_id, etc.) can be stored while keeping the existing parameters. This makes it possible to improve the affinity with the conventional standard. This makes it possible to realize a file that can be processed even by a general-purpose encoder or decoder.
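- To make the storage in the codec specific parameters concrete, the following sketch packs data_units_for_tile and tile_id into one 32-bit value. The bit layout is an assumption made only for illustration (the flag in the most significant bit and tile_id in the low 16 bits); the actual syntax is the one shown in FIG. 12, which is not reproduced here. The function names are hypothetical.

```python
from typing import Optional, Tuple

FLAG_BIT = 31          # assumed position of data_units_for_tile
TILE_ID_MASK = 0xFFFF  # assumed width of tile_id (16 bits)


def pack_tile_params(data_units_for_tile: bool, tile_id: int = 0) -> int:
    """Build a 32-bit codec_specific_parameters value under the assumed layout."""
    if not data_units_for_tile:
        return 0  # no tile_id is stored for parameter sets / tile inventory
    if not 0 <= tile_id <= TILE_ID_MASK:
        raise ValueError("tile_id must fit in 16 bits in this sketch")
    return (1 << FLAG_BIT) | tile_id


def unpack_tile_params(value: int) -> Tuple[bool, Optional[int]]:
    """Recover (data_units_for_tile, tile_id); tile_id is None when the flag is false."""
    flag = bool((value >> FLAG_BIT) & 1)
    return flag, (value & TILE_ID_MASK) if flag else None


packed = pack_tile_params(True, 5)
```

- A reader that does not understand this extension can still skip the 32-bit field, which is the kind of compatibility with existing parsers that the text above refers to.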
- The subsample may be set for each slice. That is, as shown in the fifth row from the top of the table shown in FIG. 8, the G-PCC data in the G-PCC sample may be subsampled for each slice (method 1-1-2).
- The G-PCC data stored in the sample is subsampled for each slice. That is, a single data unit or consecutive data units of the bitstream stored in the sample, consisting of geometry data units and/or attribute data units that belong to the same slice, may be set as a subsample.
- the tile identification information indicates the tile to which the slice corresponding to the subsample belongs.
- the tile management information may include information for associating such a subsample with the tile identification information corresponding to the data unit of the geometry contained in the subsample.
- A slice contains one geometry data unit. Therefore, in the case of a single track, such a subsample can also be said to be composed of a single data unit or a plurality of consecutive data units that include one geometry data unit.
- a single or continuous parameter set and tile inventory are also set as subsamples.
- In the tile management information described above, as shown in FIG. 13, information (data_units_for_slice) indicating whether or not the subsample is a slice is stored for each subsample. If data_units_for_slice is false (for example, 0), the subsample consists of a single or consecutive parameter sets and tile inventory. If data_units_for_slice is true (for example, 1), the subsample consists of a single data unit or consecutive data units constituting the same slice.
- For a subsample whose data_units_for_slice is true, tile identification information (tile_id) is further stored.
- the tile_id indicates the tile to which the slice corresponding to the subsample belongs (that is, the tile to which the data unit constituting the subsample belongs).
- The association between the geometry data unit and the attribute data unit is omitted in the tile management information (the two are linked by being included in the same subsample).
- FIG. 14 is a diagram showing an example of the syntax of codec specific parameters in this case.
- In the codec specific parameters, data_units_for_slice and tile_id are stored for each subsample as tile management information. That is, in the tile management information, information (data_units_for_slice) indicating whether or not the subsample is a slice is stored for each subsample, and if data_units_for_slice is true, tile identification information (tile_id) is further stored.
- the codec specific parameters may be extended by using flags as in the example shown in FIG. By doing so, the affinity with the conventional standard can be improved. This makes it possible to realize a file that can be processed even by a general-purpose encoder or decoder.
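- When subsamples are set per slice, several subsamples may carry the same tile_id. The following sketch builds a lookup from tile to subsample indices. The `Entry` tuple and the `index_by_tile` function are hypothetical names; each entry is assumed to hold the already-parsed (data_units_for_slice, tile_id) pair of one subsample.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# One entry per subsample: (data_units_for_slice, tile_id)
Entry = Tuple[bool, Optional[int]]


def index_by_tile(entries: List[Entry]) -> Dict[int, List[int]]:
    """Map each tile_id to the indices of the slice subsamples that belong to it."""
    index: Dict[int, List[int]] = defaultdict(list)
    for i, (is_slice, tile_id) in enumerate(entries):
        if is_slice:  # skip subsamples made of parameter sets / tile inventory
            index[tile_id].append(i)
    return dict(index)


entries = [(False, None), (True, 0), (True, 1), (True, 1)]
index = index_by_tile(entries)
```

- Here tile 1 is made of two slices, so it maps to two subsample indices, while the first subsample (parameter sets and tile inventory) is not listed under any tile.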
- Subsample for each data unit: The subsample may be set for each data unit. That is, as shown in the sixth row from the top of the table shown in FIG. 8, the G-PCC data in the G-PCC sample may be subsampled for each data unit (method 1-1-3). In the example of FIG. 15, the G-PCC data stored in the sample is subsampled for each data unit. That is, a single geometry data unit or attribute data unit of the bitstream stored in the sample may be set as a subsample. In this case, the tile identification information indicates the tile to which the subsample (data unit) belongs.
- The tile management information may include information that associates a subsample consisting of a geometry data unit with the tile identification information and the slice identification information corresponding to that geometry data unit, and information that associates a subsample consisting of an attribute data unit with the slice identification information corresponding to that attribute data unit.
- the slice identification information is information indicating a slice of a point cloud corresponding to a bitstream data unit.
- each parameter set and tile inventory are also set as subsamples.
- For the subsample of the geometry data unit, the payload type, tile identification information (tile id), and slice identification information (slice id) are stored.
- the payload type (payload type) and slice identification information (slice id) are stored for the subsample of the attribute data unit.
- the payload type is stored for the other subsamples.
- the payload type indicates the type of data that constitutes the subsample (for example, whether it is a geometry data unit, an attribute data unit, or something else).
- the tile_id indicates the tile corresponding to the subsample (that is, the tile to which the data unit constituting the subsample belongs).
- the slice id indicates the slice corresponding to the subsample (that is, the slice to which the data unit constituting the subsample belongs). In this case, the geometry data unit and the attribute data unit are associated with each other by the slice identification information.
- FIG. 16 is a diagram showing an example of the syntax of codec specific parameters in this case.
- the payload type, tile_id, and geom_slice_id are stored as the tile management information for the subsample of the geometry data unit.
- geom_slice_id is slice identification information indicating the slices formed by the geometry data unit.
- payload type and attr_slice_id are stored for the subsample of the attribute data unit.
- attr_slice_id is slice identification information indicating the slices formed by the attribute data unit.
- The payload type is stored for other subsamples.
- the codec specific parameters may be extended by using flags as in the example shown in FIG. By doing so, the affinity with the conventional standard can be improved. This makes it possible to realize a file that can be processed even by a general-purpose encoder or decoder.
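- With per-data-unit subsamples, attribute data units carry no tile_id, so they are reached through the slice identification information. The sketch below illustrates that two-step lookup. The `Payload` enum uses illustrative labels rather than the numeric payload type values of the standard, and `SubsampleInfo` and `subsamples_for_tile` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Payload(Enum):
    # Illustrative labels only; not the numeric values defined by the standard
    PARAMETER_SET = "parameter_set"
    TILE_INVENTORY = "tile_inventory"
    GEOMETRY = "geometry"
    ATTRIBUTE = "attribute"


@dataclass
class SubsampleInfo:
    payload_type: Payload
    tile_id: Optional[int] = None        # stored for geometry data units only
    geom_slice_id: Optional[int] = None  # stored for geometry data units only
    attr_slice_id: Optional[int] = None  # stored for attribute data units only


def subsamples_for_tile(infos: List[SubsampleInfo], wanted_tile: int) -> List[int]:
    """Return indices of the geometry and attribute subsamples that make up one tile."""
    # Step 1: slices of the wanted tile, found through the geometry subsamples
    slice_ids = {s.geom_slice_id for s in infos
                 if s.payload_type is Payload.GEOMETRY and s.tile_id == wanted_tile}
    # Step 2: geometry units of the tile plus attribute units of those slices
    picked = []
    for i, s in enumerate(infos):
        if s.payload_type is Payload.GEOMETRY and s.tile_id == wanted_tile:
            picked.append(i)
        elif s.payload_type is Payload.ATTRIBUTE and s.attr_slice_id in slice_ids:
            picked.append(i)
    return picked


infos = [
    SubsampleInfo(Payload.PARAMETER_SET),
    SubsampleInfo(Payload.TILE_INVENTORY),
    SubsampleInfo(Payload.GEOMETRY, tile_id=0, geom_slice_id=0),
    SubsampleInfo(Payload.ATTRIBUTE, attr_slice_id=0),
    SubsampleInfo(Payload.GEOMETRY, tile_id=1, geom_slice_id=1),
    SubsampleInfo(Payload.ATTRIBUTE, attr_slice_id=1),
]
picked = subsamples_for_tile(infos, wanted_tile=1)
```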
- In the above, the SubSampleInformationBox is extended to store tile management information (tile identification information), but SubSampleItemProperty may be extended instead of the SubSampleInformationBox to store the tile management information (tile identification information).
- the extension method is the same as in the case of the SubSampleInformationBox described above. By storing tile management information in SubSampleItemProperty, the same effect can be obtained for still images.
- the tile identification information may be stored in the timed metadata (method 1-2).
- the timed metadata track is associated with the G-PCC track by track reference ('gsli'). This method can be applied to each of the above-mentioned methods. That is, the information stored in each method may be stored in the timed metadata.
- The geometry data unit and the attribute data unit are stored in different tracks, and each track may store tile management information that manages the tile identification information corresponding to the subsamples in that track.
- each track is the same as for a single track. Therefore, in the case of multi-track, the same effect as in the case of single track can be obtained. In addition, each method described in the case of a single track can be applied to this multi-track.
- data_units_for_tile is stored as tile management information for each subsample in the geometry track (geometry track) and the attribute track (attribute track). For the subsamples that make up the tile, the tile_id is also stored.
- data_units_for_slice is stored as tile management information for each subsample in the geometry track (geometry track) and the attribute track (attribute track). For the subsamples that make up the slice, the tile_id is also stored.
- In the geometry track, the payload type is stored as tile management information for each subsample in the track.
- For the subsamples of geometry data units, the tile_id and slice id are also stored.
- In the attribute track, the payload type is stored as tile management information for each subsample in the track.
- For the subsamples of attribute data units, a slice id is also stored.
- In the case of a geometry track, data_units_for_tile and data_units_for_slice are true (for example, 1) when the subsample consists of geometry data units; in the case of an attribute track, they are true when the subsample consists of attribute data units that make up the same slice.
- the tile identification information may be stored in the timed metadata as in the case of the single track.
- the same effect as in the case of single track can be obtained.
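- In the multi-track case each track holds its own tile management information, so the same lookup is simply repeated per track. A minimal sketch, assuming the per-subsample (data_units_for_tile, tile_id) pairs of each track have already been parsed; the track names and the `select_in_tracks` function are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# One entry per subsample of a track: (data_units_for_tile, tile_id)
Entry = Tuple[bool, Optional[int]]


def select_in_tracks(tracks: Dict[str, List[Entry]], wanted_tile: int) -> Dict[str, List[int]]:
    """For every track, list the subsample indices that belong to the wanted tile."""
    return {
        name: [i for i, (flag, tile_id) in enumerate(entries)
               if flag and tile_id == wanted_tile]
        for name, entries in tracks.items()
    }


tracks = {
    "geometry": [(False, None), (True, 0), (True, 1)],
    "attribute": [(False, None), (True, 0), (True, 1)],
}
picked = select_in_tracks(tracks, wanted_tile=1)
```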
- <2-3. Matroska media container> An example of applying ISOBMFF as a file format has been described above, but the file that stores the G-PCC bitstream is arbitrary and may be other than ISOBMFF.
- the G-PCC bitstream may be stored in a Matroska Media Container, as shown at the bottom of the table shown in FIG. 8 (Method 3).
- FIG. 21 shows a main configuration example of the Matroska media container.
- the tile management information may be stored as a newly defined element under the Track Entry element.
- When the tile management information (tile identification information) is stored in the timed metadata, the timed metadata may be stored in a Track entry different from the Track entry in which the G-PCC bitstream is stored.
- FIG. 22 is a block diagram showing an example of the configuration of a file generation device, which is an aspect of an information processing device to which the present technology is applied.
- the file generation device 300 shown in FIG. 22 is a device that encodes point cloud data by applying G-PCC and stores the G-PCC bitstream generated by the coding in ISOBMFF.
- the file generation device 300 applies the above-mentioned technology and stores the G-PCC bitstream in ISOBMFF so as to enable partial access. That is, the file generation device 300 stores the tile identification information of each subsample as tile management information in the G-PCC file.
- FIG. 22 shows only main elements such as processing units and data flows, and does not necessarily show everything. That is, in the file generation device 300, there may be a processing unit that is not shown as a block in FIG. 22, or there may be a process or data flow that is not shown as an arrow or the like in FIG. 22.
- the file generation device 300 has an extraction unit 311, a coding unit 312, a bit stream generation unit 313, a tile management information generation unit 314, and a file generation unit 315.
- the coding unit 312 includes a geometry coding unit 321, an attribute coding unit 322, and a metadata generation unit 323.
- the extraction unit 311 extracts geometry data and attribute data from the point cloud data input to the file generation device 300, respectively.
- the extraction unit 311 supplies the data of the extracted geometry to the geometry coding unit 321 of the coding unit 312. Further, the extraction unit 311 supplies the extracted attribute data to the attribute coding unit 322 of the coding unit 312.
- the coding unit 312 encodes the data in the point cloud.
- the geometry coding unit 321 encodes the geometry data supplied from the extraction unit 311 and generates a geometry bit stream.
- the geometry coding unit 321 supplies the generated geometry bitstream to the metadata generation unit 323. Further, the geometry coding unit 321 also supplies the generated geometry bitstream to the attribute coding unit 322.
- the attribute coding unit 322 encodes the attribute data supplied from the extraction unit 311 and generates an attribute bit stream.
- the attribute coding unit 322 supplies the generated attribute bit stream to the metadata generation unit 323.
- the metadata generation unit 323 refers to the supplied geometry bitstream and attribute bitstream, and generates metadata.
- the metadata generation unit 323 supplies the generated metadata to the bitstream generation unit 313 together with the geometry bitstream and the attribute bitstream.
- the bitstream generation unit 313 multiplexes the supplied geometry bitstream, attribute bitstream, and metadata to generate a G-PCC bitstream.
- the bitstream generation unit 313 supplies the generated G-PCC bitstream to the tile management information generation unit 314.
- The tile management information generation unit 314 applies the technology described above in <2. Signaling of tile identification information> and, using the tile identification information indicating the tile of the point cloud corresponding to each data unit of the supplied G-PCC bitstream, generates tile management information, which is information for managing the tiles corresponding to subsamples each consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample. The tile management information generation unit 314 supplies the tile management information to the file generation unit 315 together with the G-PCC bitstream.
- The file generation unit 315 applies the technology described above in <2. Signaling of tile identification information> to generate a G-PCC file that stores the supplied G-PCC bitstream and tile management information (tile identification information).
- the file generation unit 315 outputs the G-PCC file generated as described above to the outside of the file generation device 300.
- When subsampling is performed for each tile, the tile management information generation unit 314 generates tile management information (list of tile identification information) according to the syntax as shown in FIG.
- the file generation unit 315 stores the tile management information in the codec specific parameters of the SubSampleInformationBox.
- When subsampling is performed for each slice, the tile management information generation unit 314 generates tile management information (list of tile identification information) according to the syntax as shown in FIG.
- the file generation unit 315 stores the tile management information in the codec specific parameters of the SubSampleInformationBox.
- When subsampling is performed for each data unit, the tile management information generation unit 314 generates tile management information (list of tile identification information) according to the syntax as shown in FIG.
- the file generation unit 315 stores the tile management information in the codec specific parameters of the SubSampleInformationBox.
- The file generation unit 315 can also store the tile management information in SubSampleItemProperty or in timed metadata. Also, as described above in <2. Signaling of tile identification information>, even in the case of multi-track, the tile management information generation unit 314 can generate tile management information, and the file generation unit 315 can store the tile management information in a file.
- In step S301, the extraction unit 311 of the file generation device 300 extracts the geometry and the attributes from the point cloud.
- In step S302, the coding unit 312 encodes the geometry and the attributes extracted in step S301 to generate a geometry bitstream and an attribute bitstream.
- The coding unit 312 further generates the corresponding metadata.
- In step S303, the bitstream generation unit 313 multiplexes the geometry bitstream, attribute bitstream, and metadata generated in step S302 to generate a G-PCC bitstream.
- In step S304, the tile management information generation unit 314 applies the technology described above in <2. Signaling of tile identification information> to generate tile management information for managing the tile identification information included in the G-PCC bitstream generated in step S303.
- In step S305, the file generation unit 315 generates other information, applies the above-mentioned technology, and generates a G-PCC file that stores the G-PCC bitstream and the tile management information.
- When the process of step S305 is completed, the file generation process ends.
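- The flow of steps S301 to S305 can be sketched end to end as follows. This is a toy illustration only: the "encoding" is a JSON stand-in rather than G-PCC coding, the resulting dictionary is not an ISOBMFF file, and all function and field names (`extract`, `encode_per_tile`, `multiplex`, `tile_management_info`, `generate_file`, `mdat`, `subsample_size`) are hypothetical. Each input point is assumed to already be assigned to a tile.

```python
import json
from collections import defaultdict
from itertools import groupby


def extract(points):                                   # S301
    geometry = [(p["tile"], p["xyz"]) for p in points]
    attributes = [(p["tile"], p["rgb"]) for p in points]
    return geometry, attributes


def encode_per_tile(items, kind):                      # S302 (stand-in for G-PCC coding)
    by_tile = defaultdict(list)
    for tile_id, value in items:
        by_tile[tile_id].append(value)
    return [{"kind": kind, "tile_id": t, "payload": json.dumps(v).encode()}
            for t, v in sorted(by_tile.items())]


def multiplex(geom_units, attr_units):                 # S303
    stream = []
    for g in geom_units:
        stream.append(g)  # geometry first, then the attributes of the same tile
        stream.extend(a for a in attr_units if a["tile_id"] == g["tile_id"])
    return stream


def tile_management_info(stream):                      # S304 (one subsample per tile)
    return [{"data_units_for_tile": True, "tile_id": t,
             "subsample_size": sum(len(u["payload"]) for u in run)}
            for t, run in groupby(stream, key=lambda u: u["tile_id"])]


def generate_file(points):                             # S305
    geometry, attributes = extract(points)
    stream = multiplex(encode_per_tile(geometry, "geometry"),
                       encode_per_tile(attributes, "attribute"))
    return {"mdat": b"".join(u["payload"] for u in stream),
            "tile_management": tile_management_info(stream)}


points = [
    {"tile": 0, "xyz": [0, 0, 0], "rgb": [255, 0, 0]},
    {"tile": 1, "xyz": [9, 9, 9], "rgb": [0, 255, 0]},
]
gpcc_file = generate_file(points)
```

- The point of the sketch is the last two steps: the tile management information is derived once at file generation time, so that a reader does not have to parse the bitstream to learn which bytes belong to which tile.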
- As described above, the file generation device 300 applies the technology described in <2. Signaling of tile identification information> and stores the tile identification information in the G-PCC file. By doing so, it is possible to reduce the processing (decoding and the like) of unnecessary information, and it is possible to suppress an increase in the load of the reproduction processing.
- FIG. 24 is a block diagram showing an example of a configuration of a reproduction device, which is an aspect of an information processing device to which the present technology is applied.
- the playback device 400 shown in FIG. 24 is a device that decodes a G-PCC file, constructs a point cloud, renders it, and generates presentation information.
- The reproduction device 400 can apply the above-mentioned technology, extract the information necessary for reproducing the desired tile in the point cloud from the G-PCC file, decode the extracted information, and reproduce it. That is, the reproduction device 400 can decode and reproduce only a part of the point cloud.
- FIG. 24 shows only main elements such as processing units and data flows, and does not necessarily show everything. That is, in the reproduction device 400, there may be a processing unit that is not shown as a block in FIG. 24, or there may be a process or data flow that is not shown as an arrow or the like in FIG. 24.
- the reproduction device 400 has a control unit 401, a file acquisition unit 411, a reproduction processing unit 412, and a presentation processing unit 413.
- the reproduction processing unit 412 includes a file processing unit 421, a decoding unit 422, and a presentation information generation unit 423.
- the control unit 401 controls each processing unit in the reproduction device 400.
- the file acquisition unit 411 acquires a G-PCC file for storing the point cloud to be reproduced and supplies it to the reproduction processing unit 412 (file processing unit 421).
- the reproduction processing unit 412 performs processing related to reproduction of the point cloud stored in the supplied G-PCC file.
- The file processing unit 421 of the reproduction processing unit 412 acquires the G-PCC file supplied from the file acquisition unit 411 and extracts the bitstream from the G-PCC file. At that time, the file processing unit 421 applies the technology described above in <2. Signaling of tile identification information> and extracts only the bitstream necessary for reproducing the desired tile.
- the file processing unit 421 supplies the extracted bit stream to the decoding unit 422.
- the decoding unit 422 decodes the supplied bitstream and generates geometry and attribute data.
- the decoding unit 422 supplies the generated geometry and attribute data to the presentation information generation unit 423.
- the presentation information generation unit 423 constructs a point cloud using the supplied geometry and attribute data, and generates presentation information which is information for presenting (for example, displaying) the point cloud. For example, the presentation information generation unit 423 renders using the point cloud, and generates a display image of the point cloud as the presentation information as viewed from a predetermined viewpoint. The presentation information generation unit 423 supplies the presentation information thus generated to the presentation processing unit 413.
- the presentation processing unit 413 performs a process of presenting the supplied presentation information.
- the presentation processing unit 413 supplies the presentation information to a display device or the like outside the reproduction device 400 and causes the presentation information to be presented.
- FIG. 25 is a block diagram showing a main configuration example of the reproduction processing unit 412.
- the file processing unit 421 has a bitstream extraction unit 431.
- the decoding unit 422 has a geometry decoding unit 441 and an attribute decoding unit 442.
- the presentation information generation unit 423 has a point cloud construction unit 451 and a presentation processing unit 452.
- The bitstream extraction unit 431 applies the technology described above in <2. Signaling of tile identification information>, refers to the tile management information included in the supplied G-PCC file, and, based on the tile management information (the tile identification information contained in it), extracts from the G-PCC file the bitstream needed to reproduce the desired tile (that is, the geometry bitstream and attribute bitstream corresponding to that tile).
- the bitstream extraction unit 431 specifies the tile identification information corresponding to the desired tile based on the information such as the tile inventory. Then, the bitstream extraction unit 431 refers to the tile management information stored in the codec specific parameters and the like of the SubSampleInformationBox, and identifies the subsample corresponding to the tile identification information corresponding to the desired tile. Then, the bitstream extraction unit 431 extracts the specified subsample bitstream.
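- The extraction described above can be sketched as a byte-range computation. The SubSampleInformationBox records a size for each subsample, so the position of a subsample inside the sample is the running sum of the preceding sizes. The sketch assumes the sizes and the per-subsample tile_id values (None where no tile applies) have already been read from the box; `byte_ranges_for_tile` is a hypothetical name.

```python
from typing import List, Optional, Tuple


def byte_ranges_for_tile(sizes: List[int],
                         tile_ids: List[Optional[int]],
                         wanted_tile: int) -> List[Tuple[int, int]]:
    """Return (start, end) byte ranges, relative to the sample, of the wanted tile."""
    ranges = []
    offset = 0
    for size, tile_id in zip(sizes, tile_ids):
        if tile_id == wanted_tile:
            ranges.append((offset, offset + size))
        offset += size  # subsamples are laid out back to back in the sample
    return ranges


# Toy sample: 4 bytes of parameter sets, 6 bytes of tile 0, 3 bytes of tile 1
sample = b"PPPP" + b"aaaaaa" + b"bbb"
ranges = byte_ranges_for_tile([4, 6, 3], [None, 0, 1], wanted_tile=1)
extracted = b"".join(sample[start:end] for start, end in ranges)
```

- Only the bytes in the returned ranges need to be read and passed to the decoder; the rest of the sample is never parsed.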
- The bitstream extraction unit 431 analyzes the tile management information stored in the codec specific parameters and the like of the SubSampleInformationBox based on the syntax that corresponds to how the subsamples are set (for each tile, for each slice, or for each data unit).
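- When the tile management information is stored in SubSampleItemProperty, the bitstream extraction unit 431 refers to the SubSampleItemProperty.
- When the tile management information is stored in timed metadata, the bitstream extraction unit 431 refers to the timed metadata.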
- the G-PCC file may be multitrack.
- the bitstream extraction unit 431 supplies the extracted geometry bitstream to the geometry decoding unit 441. Further, the bitstream extraction unit 431 supplies the extracted attribute bitstream to the attribute decoding unit 442.
- the geometry decoding unit 441 decodes the supplied geometry bitstream and generates geometry data.
- the geometry decoding unit 441 supplies the generated geometry data to the point cloud construction unit 451.
- the attribute decoding unit 442 decodes the supplied attribute bit stream and generates attribute data.
- the attribute decoding unit 442 supplies the generated attribute data to the point cloud construction unit 451.
- the point cloud construction unit 451 constructs a point cloud using the supplied geometry and attribute data. That is, the point cloud construction unit 451 can construct a desired tile of the point cloud.
- the point cloud construction unit 451 supplies the constructed point cloud data to the presentation processing unit 452.
- the presentation processing unit 452 generates presentation information using the supplied point cloud data.
- the presentation processing unit 452 supplies the generated presentation information to the presentation processing unit 413.
- In this way, the reproduction device 400 can more easily extract, decode, construct, and present only the desired tiles based on the tile management information (tile identification information) stored in the G-PCC file, without having to parse the entire bitstream. Therefore, it is possible to suppress an increase in the load of the reproduction process.
- In step S401, the file acquisition unit 411 of the reproduction device 400 acquires the G-PCC file to be reproduced.
- In step S402, the bitstream extraction unit 431 extracts the parameter sets and data units required to decode and display the desired tile, based on the tile management information (tile identification information) stored in the G-PCC file acquired in step S401. That is, the bitstream extraction unit 431 applies the technology described above in <2. Signaling of tile identification information> and extracts the geometry bitstream or attribute bitstream corresponding to the desired tile from the G-PCC file.
- the bitstream extraction unit 431 identifies and extracts the sequence parameter set, geometry parameter set, attribute parameter set, and tile inventory based on the payload type stored in the SubSampleInformationBox of the G-PCC file.
- the bitstream extraction unit 431 determines the decoding method for each tile based on the position information of the tiles shown in the extracted tile inventory.
- The bitstream extraction unit 431 identifies and extracts the subsamples that constitute the tile to be decoded (that is, the geometry data units and attribute data units that constitute the tile) based on the tile management information (tile identification information) stored in the SubSampleInformationBox of the G-PCC file.
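- One possible way to decide which tiles to decode from the position information in the tile inventory is a bounding-box intersection test against a region of interest, sketched below. The selection criterion itself is an assumption made for illustration; the text above only states that the decision is based on the tile positions. `TileBox`, `intersects`, and `tiles_to_decode` are hypothetical names, and each tile is assumed to be described by an origin and a size.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class TileBox:
    tile_id: int
    origin: Vec3  # minimum corner of the tile bounding box
    size: Vec3    # extent of the bounding box along x, y, z


def intersects(box: TileBox, region_min: Vec3, region_max: Vec3) -> bool:
    """True if the tile bounding box overlaps the axis-aligned region on every axis."""
    return all(o < hi and o + s > lo
               for o, s, lo, hi in zip(box.origin, box.size, region_min, region_max))


def tiles_to_decode(inventory: List[TileBox], region_min: Vec3, region_max: Vec3) -> List[int]:
    """Return the tile_id of every tile that overlaps the region of interest."""
    return [b.tile_id for b in inventory if intersects(b, region_min, region_max)]


inventory = [
    TileBox(0, (0, 0, 0), (10, 10, 10)),
    TileBox(1, (10, 0, 0), (10, 10, 10)),
]
wanted = tiles_to_decode(inventory, (12, 2, 2), (15, 5, 5))
```

- The resulting tile_id values are then the ones looked up in the tile management information to find the subsamples to extract.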
- In step S403, the geometry decoding unit 441 of the decoding unit 422 decodes the geometry bitstream extracted in step S402 and generates geometry data. Further, the attribute decoding unit 442 decodes the attribute bitstream extracted in step S402 and generates attribute data.
- In step S404, the point cloud construction unit 451 constructs a point cloud using the geometry and attribute data generated in step S403. That is, the point cloud construction unit 451 can construct a desired tile (a part of the point cloud).
- In step S405, the presentation processing unit 452 generates presentation information by rendering using the point cloud constructed in step S404.
- the presentation processing unit 413 supplies the presentation information to the outside of the reproduction device 400 and causes the presentation information to be presented.
- When the process of step S405 is completed, the reproduction process ends.
- As described above, the reproduction device 400 applies the technology described in <2. Signaling of tile identification information>, and extracts and reproduces the information corresponding to the desired tile using the tile identification information stored in the G-PCC file. By doing so, it is possible to reduce the processing (decoding and the like) of unnecessary information, and it is possible to suppress an increase in the load of the reproduction processing.
- <Addendum> <Computer> The series of processes described above can be executed by hardware or software.
- the programs constituting the software are installed in the computer.
- the computer includes a computer embedded in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 27 is a block diagram showing a configuration example of computer hardware that executes the above-mentioned series of processes programmatically.
- The CPU (Central Processing Unit) 901, the ROM (Read Only Memory) 902, and the RAM (Random Access Memory) 903 are connected to one another via the bus 904.
- the input / output interface 910 is also connected to the bus 904.
- An input unit 911, an output unit 912, a storage unit 913, a communication unit 914, and a drive 915 are connected to the input / output interface 910.
- the input unit 911 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like.
- the output unit 912 includes, for example, a display, a speaker, an output terminal, and the like.
- the storage unit 913 is composed of, for example, a hard disk, a RAM disk, a non-volatile memory, or the like.
- the communication unit 914 is composed of, for example, a network interface.
- the drive 915 drives a removable medium 921 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- The CPU 901 loads the program stored in the storage unit 913 into the RAM 903 via the input / output interface 910 and the bus 904 and executes it, whereby the above-mentioned series of processes is performed.
- the RAM 903 also appropriately stores data and the like necessary for the CPU 901 to execute various processes.
- The program executed by the computer can be provided, for example, by being recorded on the removable media 921 as package media or the like.
- the program can be installed in the storage unit 913 via the input / output interface 910 by mounting the removable media 921 in the drive 915.
- This program can also be provided via wired or wireless transmission media such as local area networks, the Internet, and digital satellite broadcasting. In that case, the program can be received by the communication unit 914 and installed in the storage unit 913.
- this program can also be installed in advance in ROM 902 or storage unit 913.
- this technology can be applied to any configuration.
- the present technology can be applied to various electronic devices.
- The present technology can also be implemented as a processor serving as a system LSI (Large Scale Integration) (for example, a video processor), a module using a plurality of processors (for example, a video module), or a unit using a plurality of modules (for example, a video unit).
- The present technology can also be applied to a network system composed of a plurality of devices.
- For example, the present technology may be implemented as cloud computing in which processing is shared and performed jointly by a plurality of devices via a network.
- For example, the present technology may be implemented in a cloud service that provides services related to images (moving images) to any terminal such as a computer, an AV (Audio Visual) device, a portable information processing terminal, or an IoT (Internet of Things) device.
- In this specification, a system means a set of a plurality of components (devices, modules (parts), and the like), regardless of whether all the components are in the same housing. Therefore, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
- Systems, devices, processing units, and the like to which the present technology is applied can be used in any field, such as transportation, medical care, crime prevention, agriculture, livestock farming, mining, beauty, factories, home appliances, weather, and nature monitoring. Their use is also arbitrary.
- For example, the present technology can be applied to systems and devices used to provide content for viewing and the like.
- The present technology can also be applied to systems and devices used for transportation, such as traffic condition monitoring and automated driving control.
- The present technology can also be applied to systems and devices used for security purposes.
- The present technology can also be applied to systems and devices used for automatic control of machines and the like.
- The present technology can also be applied to systems and devices used for agriculture and livestock farming.
- The present technology can also be applied to systems and devices that monitor natural conditions such as volcanoes, forests, and oceans, as well as wildlife. Furthermore, for example, the present technology can be applied to systems and devices used for sports.
- In this specification, a "flag" is information for identifying a plurality of states. It includes not only information used to identify the two states true (1) and false (0) but also information that can identify three or more states. Therefore, the value that this "flag" can take may be, for example, one of the two values 1/0, or one of three or more values. That is, the number of bits constituting this "flag" is arbitrary and may be one bit or a plurality of bits.
- Identification information (including flags) is assumed to take not only the form in which the identification information itself is included in the bitstream but also the form in which difference information of the identification information relative to some reference information is included in the bitstream. Therefore, in this specification, "flag" and "identification information" cover not only that information but also the difference information relative to the reference information.
- Various kinds of information (metadata and the like) about the coded data may be transmitted or recorded in any form as long as they are associated with the coded data.
- The term "associate" means, for example, making one piece of data available (linkable) when processing another. That is, pieces of data associated with each other may be combined into one piece of data or may remain individual pieces of data.
- For example, the information associated with the coded data (image) may be transmitted on a transmission path different from that of the coded data (image).
- For example, the information associated with the coded data (image) may be recorded on a recording medium different from that of the coded data (image), or in a different recording area of the same recording medium.
- This "association" may apply to a part of the data rather than the entire data.
- For example, an image and the information corresponding to the image may be associated with each other in any unit, such as a plurality of frames, one frame, or a part of a frame.
- The embodiments of the present technology are not limited to the embodiments described above, and various changes can be made without departing from the gist of the present technology.
- For example, a configuration described as one device (or processing unit) may be divided and configured as a plurality of devices (or processing units).
- Conversely, configurations described above as a plurality of devices (or processing units) may be combined and configured as one device (or processing unit).
- A part of the configuration of one device (or processing unit) may be included in the configuration of another device (or another processing unit).
- The above-described program may be executed in any device.
- In that case, the device only needs to have the necessary functions (functional blocks and the like) and be able to obtain the necessary information.
- Each step of one flowchart may be executed by one device, or may be shared and executed by a plurality of devices.
- When one step includes a plurality of processes, one device may execute them, or a plurality of devices may share and execute them.
- A plurality of processes included in one step can also be executed as processes of a plurality of steps.
- Conversely, processes described as a plurality of steps can be executed collectively as one step.
- The processing of the steps describing the program executed by the computer may be executed in chronological order in the order described in this specification, or may be executed in parallel or individually at a required timing, such as when a call is made. That is, as long as no contradiction arises, the processes of the steps may be executed in an order different from the order described above. Furthermore, the processing of the steps describing this program may be executed in parallel with the processing of another program, or may be executed in combination with the processing of another program.
- A plurality of the present technologies can each be implemented independently as long as no contradiction arises.
- Any plurality of the present technologies can also be used in combination.
- For example, some or all of the techniques described in any of the embodiments may be combined with some or all of the techniques described in other embodiments.
- A part or all of any of the above-described techniques may be carried out in combination with other techniques not described above.
- The present technology can also have the following configurations.
- (1) An information processing device including: a tile management information generation unit that, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, generates tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and a file generation unit that generates the file storing the bitstream and the tile management information.
- (2) The information processing device according to (1), wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track.
- (3) The information processing device according to (2), wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and the tile management information is stored in a box that stores information about the subsamples within the moov box or the moof box of the file.
- (4) The information processing device according to (3), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
- (5) The information processing device according to (3), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
- (6) The information processing device according to (3), wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and the tile management information includes: information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
- (7) The information processing device according to any one of (3) to (6), wherein the tile management information is stored in timed metadata of the file.
- (8) The information processing device according to any one of (1) to (7), wherein the file generation unit stores geometry data units and attribute data units in different tracks of the file, and the tile management information generation unit generates the tile management information for each of the tracks.
- (9) The information processing device according to any one of (1) to (8), further including a coding unit that encodes the data of the point cloud and generates the bitstream, wherein the file generation unit generates the file storing the bitstream generated by the coding unit.
- (10) An information processing method including: generating, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and generating the file storing the bitstream and the tile management information.
- (11) An information processing device including an extraction unit that extracts, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- (12) The information processing device according to (11), wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track, and the extraction unit identifies the subsample corresponding to the desired tile based on the list and extracts the identified subsample.
- (13) The information processing device according to (12), wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and the extraction unit identifies the subsample corresponding to the desired tile based on the list in the tile management information stored in the moov box or the moof box of the file, and extracts the identified subsample.
- (14) The information processing device according to (13), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
- (15) The information processing device according to (13), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
- (16) The information processing device according to (13), wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and the tile management information includes: information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
- (17) The information processing device according to any one of (13) to (16), wherein the tile management information is stored in timed metadata of the file.
- (18) The information processing device according to any one of (11) to (17), wherein, in the file, geometry data units and attribute data units are stored in different tracks, and the extraction unit extracts from the file, in each of the tracks and based on the tile management information, the portion of the bitstream necessary for reproducing the desired tile.
- (19) The information processing device according to any one of (11) to (18), further including a decoding unit that decodes the portion of the bitstream, extracted by the extraction unit, that is necessary for reproducing the desired tile.
- (20) An information processing method including extracting, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
- 300 file generation device, 311 extraction unit, 312 coding unit, 313 bitstream generation unit, 314 tile management information generation unit, 315 file generation unit, 321 geometry coding unit, 322 attribute coding unit, 323 metadata generation unit, 400 playback device, 401 control unit, 411 file acquisition unit, 412 playback processing unit, 413 presentation processing unit, 421 file processing unit, 422 decoding unit, 423 presentation information generation unit, 431 bitstream extraction unit, 441 geometry decoding unit, 442 attribute decoding unit, 451 point cloud construction unit, 452 presentation processing unit
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Studio Circuits (AREA)
Abstract
Description
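1. Partial access to the G-PCC bitstream
2. Signaling of tile identification information
3. First embodiment (file generation device)
4. Second embodiment (playback device)
5. Supplementary notes
<Documents and the like supporting technical content and technical terms>
The scope disclosed in the present technology includes not only the content described in the embodiments but also the content described in the following non-patent documents and the like that were publicly known at the time of filing, as well as the content of other documents referenced in the following non-patent documents.
Non-Patent Document 2: (see above)
Non-Patent Document 3: (see above)
Non-Patent Document 4: (see above)
Non-Patent Document 5: https://www.matroska.org/index.html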
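Conventionally, there has been 3D data such as a point cloud, which represents a three-dimensional structure by point position information, attribute information, and the like.
Non-Patent Document 1 discloses a coding technique called Geometry-based Point Cloud Compression (G-PCC), which encodes such a point cloud separately as geometry and attributes. G-PCC is being standardized as MPEG-I Part 9 (ISO/IEC 23090-9).
A G-PCC bitstream can have a partial access structure that allows the bitstream of some points to be decoded and reproduced independently of the rest. In a point cloud with this partial access structure, tiles and slices are the data units that can be decoded and reproduced independently (that is, accessed independently).
Fig. 3 shows an example of the main structure of a G-PCC bitstream obtained by encoding a point cloud that allows such partial access (an example of the Type-length-value bytestream format defined in Annex B of Non-Patent Document 1). That is, the G-PCC bitstream shown in Fig. 3 has a partial access structure, and a part of it can be extracted and decoded independently of the rest.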
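As a rough illustration of why a type-length-value layout lends itself to partial extraction, the following Python sketch walks a TLV byte stream and returns each unit without interpreting its payload. The 1-byte type and 4-byte big-endian length are assumptions made for this sketch, as are the names `iter_tlv_units` and `build_tlv`; the normative layout is the one defined in Annex B of Non-Patent Document 1.

```python
import struct
from typing import Iterable, Iterator, Tuple


def iter_tlv_units(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, payload) pairs from a TLV byte stream.

    Assumed layout for this sketch: 1-byte type, 4-byte big-endian
    payload length, then the payload itself.
    """
    offset = 0
    while offset < len(stream):
        if offset + 5 > len(stream):
            raise ValueError("truncated TLV header")
        unit_type, length = struct.unpack_from(">BI", stream, offset)
        offset += 5
        payload = stream[offset:offset + length]
        if len(payload) != length:
            raise ValueError("truncated TLV payload")
        offset += length
        yield unit_type, payload


def build_tlv(units: Iterable[Tuple[int, bytes]]) -> bytes:
    """Serialize (type, payload) pairs using the same assumed layout."""
    return b"".join(struct.pack(">BI", t, len(p)) + p for t, p in units)
```

Because every unit announces its own length, a reader can skip units it does not need by advancing the offset, which is the property that partial access relies on.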
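Non-Patent Document 2 discloses ISOBMFF (International Organization for Standardization Base Media File Format), the file container specification of MPEG-4 (Moving Picture Experts Group - 4), an international standard for video compression.
Non-Patent Document 3 discloses a method of storing a G-PCC bitstream in ISOBMFF, with the aim of making playback from local storage and network distribution of bitstreams encoded with G-PCC more efficient. This method is being standardized as MPEG-I Part 18 (ISO/IEC 23090-18).
A G-PCC file has a structure for accessing and decoding a partial point cloud based on three-dimensional spatial information. The G-PCC file stores each partial point cloud in a different track. A partial point cloud consists of one or more tiles. For example, suppose that, as shown in Fig. 6, a point cloud frame 61 consists of partial point clouds 61A, 61B, and 61C. In that case, partial point clouds 61A, 61B, and 61C are each stored in a different track (G-PCC track) of the G-PCC file. With this structure, the tiles to be reproduced can be selected by selecting the tracks to be reproduced.
Non-Patent Document 4 discloses a profile that supports scalable decoding of point clouds. This profile enables, for example, decoding and rendering according to the viewpoint position during local playback of a large-scale point cloud still image. For example, in Fig. 7, the region outside the field of view 73 when looking from the viewpoint 71 in the line-of-sight direction 72 (the white region in the figure) is not decoded; only the partial point clouds inside the field of view 73 are decoded and reproduced. In addition, the partial point clouds in the region close to the viewpoint 71 (the dark gray region in the figure) can be decoded and reproduced at a high LoD (high resolution), while those in the region far from the viewpoint 71 (the light gray region in the figure) can be decoded and reproduced at a low LoD (low resolution). Decoding and rendering can thus follow the viewpoint position. This reduces the reproduction of unnecessary information, so an increase in the playback processing load can be suppressed.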
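The viewpoint-dependent behavior described above can be sketched in Python as a simple planning step that decides, for each partial point cloud, whether to skip it or decode it at a high or low LoD. This is only an illustrative sketch: representing each partial point cloud by a single center point, using a cone-shaped field of view, and the names `plan_decoding`, `half_fov_deg`, and `near_distance` are all assumptions and do not come from the cited profile.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def plan_decoding(centers: Dict[str, Vec3], viewpoint: Vec3, view_dir: Vec3,
                  half_fov_deg: float, near_distance: float) -> Dict[str, str]:
    """Map each partial point cloud to 'skip', 'high_lod', or 'low_lod'."""
    norm = math.sqrt(sum(c * c for c in view_dir))
    unit_dir = tuple(c / norm for c in view_dir)
    cos_limit = math.cos(math.radians(half_fov_deg))
    plan = {}
    for name, center in centers.items():
        offset = tuple(c - v for c, v in zip(center, viewpoint))
        distance = math.sqrt(sum(c * c for c in offset))
        if distance == 0.0:
            plan[name] = "high_lod"  # the viewpoint sits on this center
            continue
        cos_angle = sum(o * d for o, d in zip(offset, unit_dir)) / distance
        if cos_angle < cos_limit:
            plan[name] = "skip"       # outside the field of view
        elif distance <= near_distance:
            plan[name] = "high_lod"   # close to the viewpoint
        else:
            plan[name] = "low_lod"    # visible but far away
    return plan
```

A player would then request only the non-skipped partial point clouds from the file, which is where per-tile signaling in the file becomes useful.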
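As described above, decoding and rendering of partial point clouds according to the viewpoint position is useful particularly for local playback of large-scale point clouds.
Therefore, as shown in the top row of the table in Fig. 8, tile identification information is stored in the G-PCC file. For example, subsamples are formed within a sample in the G-PCC file, and the tile identification information is stored in the G-PCC file as tile management information for managing the tile corresponding to each subsample.
A G-PCC file can use either a structure that stores geometry and attributes in one track (also called single track, or single track encapsulation structure) or a structure that stores geometry and attributes in different tracks (also called multi-track, or multi-track encapsulation structure). Here, storage of the tile identification information in the single-track case is described, as shown in the second row from the top of the table in Fig. 8 (Method 1).
The tile management information may be generated for each track of the file and may include a list of the tile identification information corresponding to the subsamples stored in that track.
As shown in the fourth row from the top of the table in Fig. 8, the G-PCC data in a G-PCC sample may be divided into subsamples tile by tile (Method 1-1-1). In the example of Fig. 10, the G-PCC data stored in the sample is divided into subsamples for each tile. That is, a single data unit or a plurality of consecutive data units of the bitstream stored in the sample, consisting of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, may be set as a subsample. The tile management information may then include information that associates, with such a subsample, the tile identification information corresponding to the geometry data units contained in the subsample.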
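The tile-by-tile subsampling of Method 1-1-1 can be sketched in Python as follows. The sketch models the tile management information as a per-track list of entries, one per subsample, each holding the subsample size and its tile identification information, and uses that list to pull out only the bytes of the desired tiles. The class `SubSampleEntry`, its field names, and `extract_tiles` are hypothetical names for illustration and are not the box syntax of the file format.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class SubSampleEntry:
    size: int      # byte size of the subsample inside the sample
    tile_id: int   # tile identification information for this subsample


def extract_tiles(sample: bytes, entries: List[SubSampleEntry],
                  wanted_tiles: Set[int]) -> bytes:
    """Return only the bytes of the subsamples whose tile is wanted."""
    if sum(e.size for e in entries) != len(sample):
        raise ValueError("subsample sizes do not cover the sample")
    out, offset = bytearray(), 0
    for entry in entries:
        if entry.tile_id in wanted_tiles:
            out += sample[offset:offset + entry.size]
        offset += entry.size  # always advance, even when skipping
    return bytes(out)
```

The point of the design is that the selection uses only the list, so the payload of unwanted tiles never has to be parsed or decoded.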
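Subsamples may instead be set for each slice. That is, as shown in the fifth row from the top of the table in Fig. 8, the G-PCC data in a G-PCC sample may be divided into subsamples slice by slice (Method 1-1-2). In the example of Fig. 13, the G-PCC data stored in the sample is divided into subsamples for each slice. That is, a single data unit or a plurality of consecutive data units of the bitstream stored in the sample, consisting of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, may be set as a subsample. In this case, the tile identification information indicates the tile to which the slice corresponding to the subsample belongs. The tile management information may then include information that associates, with such a subsample, the tile identification information corresponding to the geometry data units contained in the subsample.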
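With slice-by-slice subsamples (Method 1-1-2), several subsamples may carry the same tile identification information, because a tile can contain more than one slice. The following Python sketch, under the assumption that the tile management information is available as parallel lists of subsample sizes and tile IDs, computes the byte ranges to read for one tile and merges ranges that are adjacent. The function name `byte_ranges_for_tile` is hypothetical.

```python
from typing import List, Tuple


def byte_ranges_for_tile(sizes: List[int], tile_ids: List[int],
                         wanted_tile: int) -> List[Tuple[int, int]]:
    """Return merged (offset, length) ranges of subsamples in the tile."""
    ranges: List[Tuple[int, int]] = []
    offset = 0
    for size, tile_id in zip(sizes, tile_ids):
        if tile_id == wanted_tile:
            if ranges and ranges[-1][0] + ranges[-1][1] == offset:
                # Adjacent to the previous range: extend it.
                start, length = ranges[-1]
                ranges[-1] = (start, length + size)
            else:
                ranges.append((offset, size))
        offset += size
    return ranges
```

Merging adjacent ranges keeps the number of read operations small when the slices of one tile are stored next to each other.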
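Subsamples may also be set for each data unit. That is, as shown in the sixth row from the top of the table in Fig. 8, the G-PCC data in a G-PCC sample may be divided into subsamples data unit by data unit (Method 1-1-3). In the example of Fig. 15, the G-PCC data stored in the sample is divided into subsamples for each data unit. That is, a single geometry or attribute data unit of the bitstream stored in the sample may be set as a subsample. In this case, the tile identification information indicates the tile to which the subsample (data unit) belongs. The tile management information may then include information that associates, with a subsample consisting of a geometry data unit, the tile identification information and the slice identification information corresponding to that geometry data unit, and information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit. The slice identification information is information indicating the slice of the point cloud corresponding to a data unit of the bitstream.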
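In Method 1-1-3 only geometry subsamples carry tile identification information, while attribute subsamples carry slice identification information, so finding all data units of a tile takes two steps: collect the slices whose geometry belongs to the tile, then pick every subsample in those slices. A minimal Python sketch of that lookup follows; the class `DataUnitSubSample`, its fields, and `select_for_tile` are illustrative names assumed here, not structures defined by the file format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataUnitSubSample:
    kind: str                      # "geometry" or "attribute"
    slice_id: int                  # slice identification information
    tile_id: Optional[int] = None  # signalled for geometry only


def select_for_tile(entries: List[DataUnitSubSample],
                    wanted_tile: int) -> List[int]:
    """Return indices of all subsamples needed to reproduce the tile."""
    # Step 1: slices whose geometry data unit belongs to the wanted tile.
    slices = {e.slice_id for e in entries
              if e.kind == "geometry" and e.tile_id == wanted_tile}
    # Step 2: every data unit (geometry or attribute) in those slices.
    return [i for i, e in enumerate(entries) if e.slice_id in slices]
```

The slice identification information acts as the join key between an attribute data unit and the geometry data unit that determines its tile.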
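In the above description, the SubSampleInformationBox is extended to store the tile management information (tile identification information). Instead of the SubSampleInformationBox, however, the SubSampleItemProperty may be extended to store the tile management information (tile identification information). The extension method is the same as for the SubSampleInformationBox described above. Storing the tile management information in the SubSampleItemProperty provides the same effect for still images.
As shown in the seventh row from the top of the table in Fig. 8, the tile identification information may be stored in timed metadata (Method 1-2). As shown in Fig. 17, the timed metadata track is linked to the G-PCC track by a track reference ('gsli'). This method can be applied to each of the methods described above. That is, the information stored in each method may be stored in timed metadata.
The present technology can also be applied when the G-PCC file is multi-track (multi-track encapsulation structure). Storage of the tile identification information in the multi-track case is described here, as shown in the eighth row from the top of the table in Fig. 8 (Method 2). In the multi-track case, the G-PCC file structure is as shown in Fig. 18. That is, geometry data units and attribute data units are stored in different tracks. Accordingly, as shown in Fig. 19, the tile management information may be stored in each track.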
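For the multi-track case, the same per-subsample lookup is simply repeated in each track using that track's own tile management information. The following Python sketch models a track as a dictionary holding one sample and a list of (size, tile ID) pairs; this data model and the name `extract_from_tracks` are assumptions for illustration only.

```python
from typing import Dict, List, Tuple, TypedDict


class Track(TypedDict):
    sample: bytes
    subsamples: List[Tuple[int, int]]  # (size in bytes, tile ID)


def extract_from_tracks(tracks: Dict[str, Track],
                        wanted_tile: int) -> Dict[str, bytes]:
    """Extract, per track, the bytes that belong to the wanted tile."""
    result: Dict[str, bytes] = {}
    for name, track in tracks.items():
        picked, offset = [], 0
        for size, tile_id in track["subsamples"]:
            if tile_id == wanted_tile:
                picked.append(track["sample"][offset:offset + size])
            offset += size
        result[name] = b"".join(picked)
    return result
```

Because each track carries its own list, the geometry track and the attribute track can be filtered independently and their results handed to the respective decoders.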
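An example in which ISOBMFF is applied as the file format has been described above, but the file that stores the G-PCC bitstream is arbitrary and may be other than ISOBMFF. For example, as shown in the bottom row of the table in Fig. 8, the G-PCC bitstream may be stored in a Matroska Media Container (Method 3). Fig. 21 shows a main configuration example of the Matroska Media Container.
<File generation device>
The encoding-side device will be described. The present technology (each of its methods) described above can be applied to any device. Fig. 22 is a block diagram showing an example of the configuration of a file generation device, which is one aspect of an information processing device to which the present technology is applied. The file generation device 300 shown in Fig. 22 encodes point cloud data by applying G-PCC and stores the G-PCC bitstream generated by that encoding in ISOBMFF.
An example of the flow of the file generation process executed by the file generation device 300 will be described with reference to the flowchart of Fig. 23.
<Playback device>
Fig. 24 is a block diagram showing an example of the configuration of a playback device, which is one aspect of an information processing device to which the present technology is applied. The playback device 400 shown in Fig. 24 decodes a G-PCC file, constructs a point cloud, and renders it to generate presentation information. In doing so, the playback device 400 can apply the present technology described above to extract from the G-PCC file the information necessary for reproducing a desired tile of the point cloud, and decode and reproduce the extracted information. That is, the playback device 400 can decode and reproduce only a part of the point cloud.
An example of the flow of the playback process executed by the playback device 400 will be described with reference to the flowchart of Fig. 26.
<Computer>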
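The series of processes described above can be executed by hardware or by software. When the series of processes is executed by software, the programs constituting the software are installed on a computer. Here, the computer includes a computer built into dedicated hardware and, for example, a general-purpose personal computer that can execute various functions by installing various programs.
The above describes the case where the present technology is applied to encoding and decoding of point cloud data, but the present technology is not limited to these examples and can be applied to encoding and decoding of 3D data of any standard. That is, as long as they do not contradict the present technology described above, the specifications of various processes such as the encoding and decoding methods, and of various data such as 3D data and metadata, are arbitrary. In addition, some of the processes and specifications described above may be omitted as long as this does not contradict the present technology.
Systems, devices, processing units, and the like to which the present technology is applied can be used in any field, such as transportation, medical care, crime prevention, agriculture, livestock farming, mining, beauty, factories, home appliances, weather, and nature monitoring. Their use is also arbitrary.
In this specification, a "flag" is information for identifying a plurality of states. It includes not only information used to identify the two states true (1) and false (0) but also information that can identify three or more states. Therefore, the value that this "flag" can take may be, for example, one of the two values 1/0, or one of three or more values. That is, the number of bits constituting this "flag" is arbitrary and may be one bit or a plurality of bits. In addition, identification information (including flags) is assumed to take not only the form in which the identification information itself is included in the bitstream but also the form in which difference information of the identification information relative to some reference information is included in the bitstream. Therefore, in this specification, "flag" and "identification information" cover not only that information but also the difference information relative to the reference information.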
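The two points made about flags and identification information, namely that a flag may span more than one bit and that identification information may be carried as a difference from a reference value, can be illustrated with a small Python sketch. The function names and the choice of a single fixed reference value are assumptions for illustration and do not describe any actual bitstream syntax.

```python
from typing import List


def pack_flag(value: int, num_bits: int) -> str:
    """Render a flag value as a fixed-width bit string."""
    if not 0 <= value < (1 << num_bits):
        raise ValueError("value does not fit in the given number of bits")
    return format(value, f"0{num_bits}b")


def ids_to_differences(ids: List[int], reference: int) -> List[int]:
    """Express each identifier as a difference from the reference."""
    return [i - reference for i in ids]


def differences_to_ids(diffs: List[int], reference: int) -> List[int]:
    """Recover the identifiers from their differences."""
    return [d + reference for d in diffs]
```

A one-bit flag distinguishes two states, a two-bit flag up to four, and a receiver that knows the reference value can recover the original identifiers from the differences alone.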
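(1) An information processing device including:
a tile management information generation unit that, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, generates tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and
a file generation unit that generates the file storing the bitstream and the tile management information.
(2) The information processing device according to (1), wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track.
(3) The information processing device according to (2), wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and
the tile management information is stored in a box that stores information about the subsamples within the moov box or the moof box of the file.
(4) The information processing device according to (3), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and
the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
(5) The information processing device according to (3), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and
the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
(6) The information processing device according to (3), wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and
the tile management information includes:
information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and
information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
(7) The information processing device according to any one of (3) to (6), wherein the tile management information is stored in timed metadata of the file.
(8) The information processing device according to any one of (1) to (7), wherein the file generation unit stores geometry data units and attribute data units in different tracks of the file, and
the tile management information generation unit generates the tile management information for each of the tracks.
(9) The information processing device according to any one of (1) to (8), further including a coding unit that encodes the data of the point cloud and generates the bitstream, wherein
the file generation unit generates the file storing the bitstream generated by the coding unit.
(10) An information processing method including:
generating, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and
generating the file storing the bitstream and the tile management information.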
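(11) An information processing device including an extraction unit that extracts, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
(12) The information processing device according to (11), wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track, and
the extraction unit identifies the subsample corresponding to the desired tile based on the list and extracts the identified subsample.
(13) The information processing device according to (12), wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and
the extraction unit identifies the subsample corresponding to the desired tile based on the list in the tile management information stored in the moov box or the moof box of the file, and extracts the identified subsample.
(14) The information processing device according to (13), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and
the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
(15) The information processing device according to (13), wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and
the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
(16) The information processing device according to (13), wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and
the tile management information includes:
information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and
information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
(17) The information processing device according to any one of (13) to (16), wherein the tile management information is stored in timed metadata of the file.
(18) The information processing device according to any one of (11) to (17), wherein, in the file, geometry data units and attribute data units are stored in different tracks, and
the extraction unit extracts from the file, in each of the tracks and based on the tile management information, the portion of the bitstream necessary for reproducing the desired tile.
(19) The information processing device according to any one of (11) to (18), further including a decoding unit that decodes the portion of the bitstream, extracted by the extraction unit, that is necessary for reproducing the desired tile.
(20) An information processing method including extracting, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.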
Claims (20)
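1. An information processing device comprising: a tile management information generation unit that, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, generates tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and a file generation unit that generates the file storing the bitstream and the tile management information.
2. The information processing device according to claim 1, wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track.
3. The information processing device according to claim 2, wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and the tile management information is stored in a box that stores information about the subsamples within the moov box or the moof box of the file.
4. The information processing device according to claim 3, wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
5. The information processing device according to claim 3, wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
6. The information processing device according to claim 3, wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and the tile management information includes: information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
7. The information processing device according to claim 3, wherein the tile management information is stored in timed metadata of the file.
8. The information processing device according to claim 1, wherein the file generation unit stores geometry data units and attribute data units in different tracks of the file, and the tile management information generation unit generates the tile management information for each of the tracks.
9. The information processing device according to claim 1, further comprising a coding unit that encodes the data of the point cloud and generates the bitstream, wherein the file generation unit generates the file storing the bitstream generated by the coding unit.
10. An information processing method comprising: generating, using tile identification information indicating the tile of a point cloud corresponding to a data unit of a bitstream of the point cloud, which represents a three-dimensionally shaped object as a set of points, tile management information, which is information for managing the tile corresponding to a subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream stored in a file as a sample; and generating the file storing the bitstream and the tile management information.
11. An information processing device comprising an extraction unit that extracts, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.
12. The information processing device according to claim 11, wherein the tile management information is generated for each track of the file and includes a list of the tile identification information corresponding to the subsamples stored in the track, and the extraction unit identifies the subsample corresponding to the desired tile based on the list and extracts the identified subsample.
13. The information processing device according to claim 12, wherein the file is an ISOBMFF (International Organization for Standardization Base Media File Format) file, and the extraction unit identifies the subsample corresponding to the desired tile based on the list in the tile management information stored in the moov box or the moof box of the file, and extracts the identified subsample.
14. The information processing device according to claim 13, wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same tile of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
15. The information processing device according to claim 13, wherein the subsample consists of geometry data units, attribute data units, or both that belong to the same slice of the bitstream, and the tile management information includes information that associates, with the subsample, the tile identification information corresponding to the geometry data units contained in the subsample.
16. The information processing device according to claim 13, wherein the subsample consists of a single geometry or attribute data unit of the bitstream, and the tile management information includes: information that associates, with a subsample consisting of a geometry data unit, the tile identification information corresponding to that geometry data unit and slice identification information that corresponds to that geometry data unit and indicates the slice of the point cloud corresponding to a data unit of the bitstream; and information that associates, with a subsample consisting of an attribute data unit, the slice identification information corresponding to that attribute data unit.
17. The information processing device according to claim 13, wherein the tile management information is stored in timed metadata of the file.
18. The information processing device according to claim 11, wherein, in the file, geometry data units and attribute data units are stored in different tracks, and the extraction unit extracts from the file, in each of the tracks and based on the tile management information, the portion of the bitstream necessary for reproducing the desired tile.
19. The information processing device according to claim 11, further comprising a decoding unit that decodes the portion of the bitstream, extracted by the extraction unit, that is necessary for reproducing the desired tile.
20. An information processing method comprising extracting, from a file, the portion of a bitstream necessary for reproducing a desired tile, based on tile management information that is stored in the file together with the bitstream of a point cloud representing a three-dimensionally shaped object as a set of points, the tile management information being information for managing the tile corresponding to each subsample stored in the file by using tile identification information indicating the tile of the point cloud corresponding to the subsample, the subsample consisting of a single data unit or a plurality of consecutive data units of the bitstream.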
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MX2022015293A MX2022015293A (es) | 2020-06-09 | 2021-05-26 | Dispositivo y metodo de procesamiento de informacion. |
JP2022530117A JPWO2021251141A1 (ja) | 2020-06-09 | 2021-05-26 | |
US18/000,520 US20230222693A1 (en) | 2020-06-09 | 2021-05-26 | Information processing apparatus and method |
KR1020227039220A KR20230021646A (ko) | 2020-06-09 | 2021-05-26 | 정보 처리 장치 및 방법 |
EP21822483.0A EP4164213A4 (en) | 2020-06-09 | 2021-05-26 | DEVICE AND METHOD FOR PROCESSING INFORMATION |
CN202180039811.0A CN115868156A (zh) | 2020-06-09 | 2021-05-26 | 信息处理装置和方法 |
BR112022024646A BR112022024646A2 (pt) | 2020-06-09 | 2021-05-26 | Aparelho e método de processamento de informações |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063036656P | 2020-06-09 | 2020-06-09 | |
US63/036,656 | 2020-06-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021251141A1 true WO2021251141A1 (ja) | 2021-12-16 |
Family
ID=78846004
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/019969 WO2021251141A1 (ja) | 2020-06-09 | 2021-05-26 | 情報処理装置および方法 |
Country Status (8)
Country | Link |
---|---|
US (1) | US20230222693A1 (ja) |
EP (1) | EP4164213A4 (ja) |
JP (1) | JPWO2021251141A1 (ja) |
KR (1) | KR20230021646A (ja) |
CN (1) | CN115868156A (ja) |
BR (1) | BR112022024646A2 (ja) |
MX (1) | MX2022015293A (ja) |
WO (1) | WO2021251141A1 (ja) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016540410A (ja) * | 2013-10-22 | 2016-12-22 | キヤノン株式会社 | スケーラブルな分割タイムドメディアデータをカプセル化するための方法、デバイス、およびコンピュータプログラム |
US20190318488A1 (en) * | 2018-04-12 | 2019-10-17 | Samsung Electronics Co., Ltd. | 3d point cloud compression systems for delivery and access of a subset of a compressed 3d point cloud |
WO2020013249A1 (ja) * | 2018-07-13 | 2020-01-16 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
WO2020075781A1 (ja) * | 2018-10-09 | 2020-04-16 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 三次元データ符号化方法、三次元データ復号方法、三次元データ符号化装置、及び三次元データ復号装置 |
WO2021049333A1 (ja) * | 2019-09-11 | 2021-03-18 | ソニー株式会社 | 情報処理装置、情報処理方法、再生処理装置及び再生処理方法 |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020071111A1 (ja) * | 2018-10-02 | 2020-04-09 | ソニー株式会社 | 情報処理装置および情報処理方法、並びにプログラム |
-
2021
- 2021-05-26 EP EP21822483.0A patent/EP4164213A4/en active Pending
- 2021-05-26 BR BR112022024646A patent/BR112022024646A2/pt unknown
- 2021-05-26 KR KR1020227039220A patent/KR20230021646A/ko unknown
- 2021-05-26 JP JP2022530117A patent/JPWO2021251141A1/ja active Pending
- 2021-05-26 US US18/000,520 patent/US20230222693A1/en active Pending
- 2021-05-26 CN CN202180039811.0A patent/CN115868156A/zh active Pending
- 2021-05-26 MX MX2022015293A patent/MX2022015293A/es unknown
- 2021-05-26 WO PCT/JP2021/019969 patent/WO2021251141A1/ja unknown
Non-Patent Citations (5)
Title |
---|
"Information technology - Coding of audio-visual objects - Part 12: ISO base media file format", ISO/IEC 14496-12, 20 February 2015 (2015-02-20) |
"Information technology - MPEG-I (Coded Representation of Immersive Media) - Part 9: Geometry-based Point Cloud Compression", ISO/IEC 23090-9:2020(E) |
SATORU KUMA, OHJI NAKAGAMI: "[G-PCC] (New proposal) On scalability profile", ISO/IEC JTC1/SC29/WG11 MPEG2020/M53292, April 2020 (2020-04-01) |
See also references of EP4164213A4 |
SEJIN OH, RYOHEI TAKAHASHI, YOUNGKWON LIM: "WD of ISO/IEC 23090-18 Carriage of Geometry-based Point Cloud Compression Data", ISO/IEC JTC 1/SC 29/WG 11 N19286, 5 June 2020 (2020-06-05) |
Also Published As
Publication number | Publication date |
---|---|
MX2022015293A (es) | 2023-01-04 |
EP4164213A4 (en) | 2023-10-04 |
US20230222693A1 (en) | 2023-07-13 |
JPWO2021251141A1 (ja) | 2021-12-16 |
CN115868156A (zh) | 2023-03-28 |
KR20230021646A (ko) | 2023-02-14 |
EP4164213A1 (en) | 2023-04-12 |
BR112022024646A2 (pt) | 2022-12-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11532103B2 (en) | | Information processing apparatus and information processing method |
WO2021049333A1 (ja) | | Information processing device, information processing method, reproduction processing device, and reproduction processing method |
WO2021117859A1 (ja) | | Image processing device and method |
GB2509953A (en) | | Displaying a Region of Interest in a Video Stream by Providing Links Between Encapsulated Video Streams |
WO2021251173A1 (ja) | | Information processing device and method |
US20240107049A1 (en) | | Information processing device and information processing method |
WO2021251185A1 (ja) | | Information processing device and method |
JP2024003189A (ja) | | Information processing device and method |
WO2022059495A1 (ja) | | Information processing device and method |
WO2023176419A1 (ja) | | Information processing device and method |
WO2022054744A1 (ja) | | Information processing device and method |
WO2021251141A1 (ja) | | Information processing device and method |
US20240046562A1 (en) | | Information processing device and method |
EP3972260A1 (en) | | Information processing device, information processing method, reproduction processing device, and reproduction processing method |
WO2023277062A1 (ja) | | Information processing device and method |
WO2019138928A1 (ja) | | Information processing device and method |
WO2022138221A1 (ja) | | Information processing device and method |
EP4325870A1 (en) | | Information processing device and method |
WO2022075078A1 (ja) | | Image processing device and method |
EP4325871A1 (en) | | Information processing device and method |
US20220076485A1 (en) | | Information processing apparatus and information processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21822483 Country of ref document: EP Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022530117 Country of ref document: JP Kind code of ref document: A |
| | REG | Reference to national code | Ref country code: BR Ref legal event code: B01A Ref document number: 112022024646 Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 112022024646 Country of ref document: BR Kind code of ref document: A2 Effective date: 20221201 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2021822483 Country of ref document: EP Effective date: 20230109 |