CN114902284A - Information processing apparatus and method - Google Patents

Information processing apparatus and method

Info

Publication number
CN114902284A
Authority
CN
China
Prior art keywords: point, reference point, information, attribute information, unit
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080091116.4A
Other languages
Chinese (zh)
Inventor
加藤毅
隈智
中神央二
安田弘幸
矢野幸司
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Application filed by Sony Group Corp
Publication of CN114902284A


Classifications

    • G: PHYSICS
        • G06: COMPUTING; CALCULATING OR COUNTING
            • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 9/00: Image coding
                    • G06T 9/001: Model-based coding, e.g. wire frame
                    • G06T 9/004: Predictors, e.g. intraframe, interframe coding
                • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • H: ELECTRICITY
        • H04: ELECTRIC COMMUNICATION TECHNIQUE
            • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
                • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
                    • H04N 19/30: using hierarchical techniques, e.g. scalability
                    • H04N 19/50: using predictive coding
                        • H04N 19/597: using predictive coding specially adapted for multi-view video sequence encoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure relates to an information processing apparatus and method capable of suppressing a reduction in encoding efficiency. For attribute information of each point of a point cloud representing a three-dimensional object as a set of points, the attribute information is layered by recursively repeating the classification of points into prediction points, for which a difference between the attribute information and a predicted value is derived, and reference points, whose attribute information is used to derive that predicted value; in doing so, the reference points are set based on the centroid of the points. The present disclosure is applicable to, for example, an information processing apparatus, an image processing apparatus, an encoding apparatus, a decoding apparatus, an electronic instrument, an information processing method, and a program.

Description

Information processing apparatus and method
Technical Field
The present disclosure relates to an information processing apparatus and method, and more particularly, to an information processing apparatus and method capable of suppressing a decrease in encoding efficiency.
Background
Conventionally, for example, a method of encoding 3D data representing a three-dimensional structure such as a point cloud has been considered (for example, see non-patent document 1). The data of the point cloud includes geometric data (also referred to as position information) and attribute data (also referred to as attribute information) of each point. Thus, the point cloud is encoded for each piece of geometric data and attribute data. Various methods have been proposed as a method of encoding attribute data. For example, it has been proposed to use a technique called lifting (lifting) (for example, see non-patent document 2). Further, a method capable of scalable decoding of attribute data has also been proposed (for example, see non-patent document 3).
In such a lifting scheme, attribute data is layered by recursively repeating, on the reference points, the process of classifying each point as a reference point or a prediction point. Then, according to the hierarchical structure, a predicted value of the attribute data of a prediction point is derived using the attribute data of reference points, and the difference between the predicted value and the actual attribute data is encoded. For such layering of attribute data, the following method has been proposed: the first point and the last point in Morton order are alternately selected as reference points from among the candidates for the reference points in each layer (for example, see non-patent document 4).
[ list of references ]
[ non-patent document ]
Non-patent document 1: R. Mekuria (IEEE Student Member), K. Blom, P. Cesar (IEEE Member), "Design, Implementation and Evaluation of a Point Cloud Codec for Tele-Immersive Video", tcsvt_paper_submitted_february.pdf.
Non-patent document 2: Khaled Mammou, Alexis Tourapis, Jungsun Kim, Fabrice Robinet, Valery Valentin, Yeping Su, "Lifting Scheme for Lossy Attribute Encoding in TMC1", ISO/IEC JTC1/SC29/WG11 MPEG2018/m42640, April 2018, San Diego, USA.
Non-patent document 3: Ohji Nakagami, Satoru Kuma, "[G-PCC] Spatial scalability support for G-PCC", ISO/IEC JTC1/SC29/WG11 MPEG2019/m47352, March 2019, Geneva, Switzerland.
Non-patent document 4: Hyejung Hur, Sejin Oh, "[G-PCC] [New Proposal] On improved spatial scalable lifting", ISO/IEC JTC1/SC29/WG11 MPEG2019/m51408, October 2019, Geneva, Switzerland.
Disclosure of Invention
[ problem to be solved by the invention ]
However, the method described in non-patent document 4 is not always optimal, and other methods are also required.
The present disclosure has been made in view of such circumstances and can suppress a decrease in encoding efficiency.
[ solution of problem ]
An information processing apparatus according to an aspect of the present technology is an information processing apparatus including: a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value, with respect to the reference point, wherein the layering unit sets the reference point based on a centroid of each point.
An information processing method according to an aspect of the present technology is an information processing method including: layering attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, wherein the reference point is set based on a centroid of the points.
An information processing apparatus according to another aspect of the present technology is an information processing apparatus including: a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value, with respect to the reference point, wherein the layering unit sets the reference point based on a distribution manner of the respective points.
An information processing method according to another aspect of the present technology is an information processing method including: when the layering of attribute information is performed for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, with respect to the reference point, the reference point is set based on a distribution manner of the points.
An information processing apparatus according to still another aspect of the present technology is an information processing apparatus including: a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value; and an encoding unit that encodes information on the setting of the reference point by the layering unit.
An information processing method according to still another aspect of the present technology is an information processing method including: layering attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, and encoding information relating to the setting of the reference point.
An information processing apparatus according to still another aspect of the present technology is an information processing apparatus including: a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value with respect to the reference point, wherein the layering unit alternately selects, as the reference point, a point closer to a center of a bounding box and a point farther from the center of the bounding box from candidates for the reference point for each layer.
An information processing method according to still another aspect of the present technology is an information processing method including: layering attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, wherein, from among the candidates for the reference point in each layer, a point closer to the center of a bounding box and a point farther from the center of the bounding box are alternately selected as the reference point.
In the information processing apparatus and method according to one aspect of the present technology, when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, the reference point is set based on the centroid of the points.
In the information processing apparatus and method according to another aspect of the present technology, when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, the reference point is set based on the distribution manner of the points.
In the information processing apparatus and method according to still another aspect of the present technology, attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to the reference points, classification of prediction points for deriving a difference between the attribute information and predicted values of the attribute information and reference points for deriving the predicted values, and information relating to the setting of the reference points is encoded.
In the information processing apparatus and method according to still another aspect of the present technology, when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to the reference point, classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value, a point closer to the center of a bounding box and a point farther from the center of the bounding box are alternately selected as the reference point from among the candidates for the reference point in each layer.
Drawings
Fig. 1 is a diagram describing an example of a state of lifting.
Fig. 2 is a diagram describing an example of a method of setting a reference point based on morton order.
Fig. 3 is a diagram describing an example of a method of setting a reference point based on morton order.
Fig. 4 is a diagram describing an example of a method of setting a reference point based on morton order.
Fig. 5 is a diagram describing an example of a method of setting a reference point based on morton order.
Fig. 6 is a diagram describing an example of the reference point setting method.
Fig. 7 is a diagram describing an example of a method of setting a reference point based on a centroid (centroid).
Fig. 8 is a diagram describing an example of a method of deriving a centroid.
Fig. 9 is a diagram describing an example of a method of deriving a centroid.
Fig. 10 is a diagram showing an example of an area from which a centroid is to be derived.
Fig. 11 is a diagram describing an example of a method of selecting a point.
Fig. 12 is a diagram describing an example of the same condition.
Fig. 13 is a block diagram showing a main configuration example of an encoding apparatus.
Fig. 14 is a block diagram showing a main configuration example of the attribute information encoding unit.
Fig. 15 is a block diagram showing a main configuration example of the hierarchical processing unit.
Fig. 16 is a flowchart describing an example of the flow of the encoding process.
Fig. 17 is a flowchart describing an example of the flow of the attribute information encoding process.
Fig. 18 is a flowchart describing an example of the flow of the hierarchical processing.
Fig. 19 is a flowchart describing an example of the flow of the reference point setting process.
Fig. 20 is a block diagram showing a main configuration example of a decoding apparatus.
Fig. 21 is a block diagram showing a main configuration example of the attribute information decoding unit.
Fig. 22 is a flowchart describing an example of the flow of the decoding process.
Fig. 23 is a flowchart describing an example of the flow of the attribute information decoding process.
Fig. 24 is a flowchart describing an example of the flow of the delaminating process.
Fig. 25 is a diagram showing an example of the table.
Fig. 26 is a diagram describing an example of table and signaling transmission.
Fig. 27 is a flowchart describing an example of the flow of the reference point setting process.
Fig. 28 is a diagram showing an example of information to be signaled.
Fig. 29 is a diagram showing an example of a target notified by signaling.
Fig. 30 is a diagram showing an example of information to be signaled.
Fig. 31 is a diagram showing an example of information to be signaled.
Fig. 32 is a diagram showing an example of syntax in the case of signaling fixed-length-bit information.
Fig. 33 is a diagram showing an example of syntax in the case of signaling fixed-length-bit information.
Fig. 34 is a diagram showing an example of variable-length-bit information to be signaled.
Fig. 35 is a diagram showing an example of variable-length-bit information to be signaled.
Fig. 36 is a diagram showing an example of syntax in the case of signaling information of variable-length bits.
Fig. 37 is a diagram showing an example of syntax in the case of signaling information of variable-length bits.
Fig. 38 is a flowchart describing an example of the flow of the reference point setting process.
Fig. 39 is a diagram describing an example of the search order.
Fig. 40 is a flowchart describing an example of the flow of the reference point setting process.
Fig. 41 is a block diagram showing a main configuration example of a computer.
Detailed Description
Hereinafter, a mode for carrying out the present disclosure (hereinafter referred to as an embodiment) will be described. Note that description will be made in the following order.
1. Setting of reference points
2. First embodiment (method 1)
3. Second embodiment (method 2)
4. Third embodiment (method 3)
5. Fourth embodiment (method 4)
6. Appendix
<1. setting of reference Point >
< documents supporting technical contents and technical terminology, etc. >
The scope of the disclosure in the present technology includes not only the contents described in the embodiments but also the contents described in the following non-patent documents known at the time of filing.
Non-patent document 1: (as described above)
Non-patent document 2: (as described above)
Non-patent document 3: (as described above)
Non-patent document 4: (as described above)
That is, the contents described in the above-mentioned non-patent documents, the contents of other documents referred to in the above-mentioned non-patent documents, and the like are also the basis for determining the support requirement.
< Point cloud >
Conventionally, there is 3D data such as a point cloud representing a three-dimensional structure by position information, attribute information, and the like of points and a mesh which is composed of vertices, edges, and faces and defines a three-dimensional shape using polygon representation.
For example, in the case of a point cloud, a three-dimensional structure (three-dimensional object) is expressed as a collection of a large number of points. The data of the point cloud (also referred to as point cloud data) includes position information (also referred to as geometric data) and attribute information (also referred to as attribute data) of each point. The attribute data may include any information. For example, color information, reflectance information, normal line information, and the like of each point may be included in the attribute data. As described above, the point cloud data has a relatively simple data structure, and can express any three-dimensional structure with sufficient accuracy by using a sufficiently large number of points.
< quantification of positional information using voxels >
Since such point cloud data has a relatively large data amount, in order to compress the data amount by encoding or the like, an encoding method using voxels has been conceived. Voxels are three-dimensional regions used to quantify geometric data (location information).
That is, a three-dimensional region containing the point cloud, also called a bounding box, is divided into small three-dimensional regions called voxels, and for each voxel it is indicated whether or not a point is included. In this way, the position of each point is quantized in units of voxels. Therefore, by converting the point cloud data into such voxel data, an increase in the amount of information can be suppressed (typically, the amount of information is reduced).
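As an informal illustration of this voxel quantization (not code from the patent; the function name and parameters are hypothetical), float positions can be snapped to an integer voxel grid, with points that fall in the same voxel collapsing to one occupied voxel:

```python
import math

def quantize_to_voxels(points, bbox_min, voxel_size):
    """Quantize float XYZ positions to integer voxel coordinates;
    points falling in the same voxel collapse to one occupied voxel."""
    occupied = set()
    for x, y, z in points:
        vx = math.floor((x - bbox_min[0]) / voxel_size)
        vy = math.floor((y - bbox_min[1]) / voxel_size)
        vz = math.floor((z - bbox_min[2]) / voxel_size)
        occupied.add((vx, vy, vz))
    return sorted(occupied)

pts = [(0.1, 0.2, 0.3), (0.12, 0.21, 0.33), (0.9, 0.8, 0.7)]
print(quantize_to_voxels(pts, (0.0, 0.0, 0.0), 0.5))
# -> [(0, 0, 0), (1, 1, 1)]  (the first two points share a voxel)
```

In a real codec the attributes of the merged points would also be combined (e.g. averaged) at this step; the sketch only records occupancy.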
< Octree >
Furthermore, it has been conceived for geometric data to construct octrees using such voxel data. Octree is obtained by converting voxel data into a tree structure. The value of each bit of the lowest node of the octree indicates the presence or absence of a point in each voxel. For example, a value of "1" indicates that a voxel contains a point, while a value of "0" indicates that a voxel does not contain a point. In an octree, one node corresponds to eight voxels. That is, each node of the octree includes 8-bit data, and the 8-bit data indicates the presence or absence of a point in eight voxels.
Then, the higher node of the octree indicates the presence or absence of a point in a region in which eight voxels corresponding to the lower nodes belonging to the node are combined into one voxel. That is, the higher nodes are generated by collecting information of voxels of the lower nodes. Note that in the case of a node having a "0" value, i.e. all the corresponding eight voxels do not include a point, the node is deleted.
In this way, a tree structure (octree) including only nodes having values other than "0" is constructed. That is, the octree can indicate the presence or absence of points in voxels at each resolution. By converting to an octree and encoding, the position information can be decoded from the highest level of the hierarchy down to a desired level (resolution), so that point cloud data of that resolution can be restored. That is, decoding at any resolution is easily possible without decoding information of unnecessary levels (resolutions). In other words, voxel (resolution) scalability can be realized.
Further, as described above, by omitting nodes having a value of "0", the resolution of voxels in the region where no point is present can be reduced, so that an increase in the amount of information can be further suppressed (in general, the amount of information is reduced).
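The construction of one octree level from occupied voxels can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the child-bit ordering inside the 8-bit occupancy byte is an assumed convention:

```python
def octree_level_up(voxels):
    """Collapse integer voxel coordinates one octree level.

    Returns {parent_voxel: occupancy_byte}, where bit i of the byte
    marks the child whose low coordinate bits form the 3-bit index
    i = 4x + 2y + z (an assumed bit-ordering convention). Parents
    with occupancy 0 simply never appear, mirroring the omission of
    all-zero nodes described above."""
    parents = {}
    for x, y, z in voxels:
        parent = (x >> 1, y >> 1, z >> 1)
        child_idx = ((x & 1) << 2) | ((y & 1) << 1) | (z & 1)
        parents[parent] = parents.get(parent, 0) | (1 << child_idx)
    return parents

level0 = [(0, 0, 0), (1, 1, 1), (2, 0, 0)]
print(octree_level_up(level0))
# -> {(0, 0, 0): 129, (1, 0, 0): 1}
```

Applying the function repeatedly until one node remains yields the full octree, coarsest level last.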
< promotion >
On the other hand, when attribute data (attribute information) is encoded, it is assumed that geometric data (positional information) including degradation caused by encoding is known, and encoding is performed using the positional relationship between points. As a method of encoding such attribute data, a method using a Region Adaptive Hierarchical Transform (RAHT) or a transform called lifting described in non-patent document 2 has been considered. By applying these techniques, the attribute data can be layered like an octree of geometry data.
For example, in the case of the promotion described in non-patent document 2, attribute data is layered by recursively repeating a process of setting a point as a reference point or a predicted point with respect to the reference point. Then, according to the hierarchical structure, a prediction value of the attribute data of the prediction point is derived using the attribute data of the reference point, and a difference value between the prediction value and the attribute data is encoded.
For example, in fig. 1, it is assumed that the point P5 is selected as the reference point. In this case, the search for the predicted point is performed in a circular area of radius R centered on the point P5. In this case, since the point P9 is located in the area, the point P9 is set as a prediction point (a prediction point using the point P5 as a reference point) from which a prediction value is derived with reference to the point P5.
By such processing, for example, respective differences of the point P7 to the point P9 indicated by a white circle, respective differences of the point P1, the point P3, and the point P6 indicated by oblique lines, and respective differences of the point P0, the point P2, the point P4, and the point P5 indicated by a gray circle are derived as differences of different levels.
Note that although the point cloud is arranged in a three-dimensional space and the above-described processing is actually performed in the three-dimensional space, the three-dimensional space is schematically illustrated using a two-dimensional plane for convenience of description in fig. 1. That is, the description made with reference to fig. 1 can be similarly applied to processing, a phenomenon, and the like in a three-dimensional space.
In the following description, a three-dimensional space is appropriately described using a two-dimensional plane. Unless otherwise specified, the description can be basically similarly applied to processes, phenomena, and the like in a three-dimensional space.
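The reference/prediction classification described above can be sketched as a greedy radius search. This is a simplified illustration under assumed rules (real G-PCC lifting orders the points and repeats this over multiple levels); the function name is hypothetical:

```python
import math

def split_reference_prediction(points, radius):
    """One pass of lifting-style classification: walk the points in
    order; the first unassigned point becomes a reference, and every
    other unassigned point within `radius` of it becomes a prediction
    point predicted from that reference."""
    role = {}  # index -> ("ref", None) or ("pred", reference_index)
    for i, p in enumerate(points):
        if i in role:
            continue
        role[i] = ("ref", None)
        for j in range(i + 1, len(points)):
            if j not in role and math.dist(p, points[j]) <= radius:
                role[j] = ("pred", i)
    return role

pts = [(0, 0, 0), (0.5, 0, 0), (5, 5, 5)]
print(split_reference_prediction(pts, radius=1.0))
# point 0 -> reference; point 1 -> predicted from point 0; point 2 -> reference
```

The references surviving one pass become the candidate set of the next (higher) level, which is how the recursion builds the hierarchy.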
For example, the selection of reference points in the hierarchy has been performed according to morton order. For example, in the tree structure shown in fig. 2, in the case where one reference point is selected from a plurality of nodes of a certain hierarchy and the selected reference point is set as a node of a higher hierarchy, searches for the plurality of nodes are performed in morton order, and the node appearing first is selected as the reference point. In fig. 2, each circle represents a node, and a black circle represents a node selected as a reference point (i.e., the node is selected as a node of a higher hierarchy). In fig. 2, the corresponding nodes are ordered from left to right in morton order. That is, in the case of the example of fig. 2, the leftmost node is always selected.
On the other hand, in such a hierarchy of attribute data, non-patent document 4 proposes the following method: the first point and the last point in morton order are alternately selected as the reference points among the candidates of the reference points of each hierarchy. That is, as in the example of fig. 3, in the hierarchy of LoD N, the first node in morton order is selected as the reference point, and in the next hierarchy (LoD N-1), the last node in morton order is selected as the reference point.
Fig. 4 shows an example of how the reference points are selected in three-dimensional space using a two-dimensional plane. Each square in a of fig. 4 indicates a voxel in a certain level. Further, the circle indicates a candidate of a reference point as a processing target. For example, in the case of selecting a reference point from the 2 × 2 points shown in a of fig. 4, the first point (gray point) of the morton sequence is selected as the reference point. In the higher hierarchical level shown in B of fig. 4, the last point (gray point) in morton order of the 2 × 2 points is selected as a reference point. Further, in a higher-level hierarchy as shown in C of fig. 4, the first point (gray point) in morton order of 2 × 2 points is selected as a reference point.
The respective arrows shown in a to C of fig. 4 indicate the movement of the reference point. In this case, the movement range of the reference point is limited to a narrow range as indicated by a broken-line box shown in C of fig. 4, and thus a decrease in prediction accuracy is suppressed.
However, when the reference point is selected at the positions shown in fig. 5 in the same manner as in fig. 4, the position of the reference point moves as in A to C of fig. 5. Fig. 5 shows another example of reference point selection, again schematically depicting a three-dimensional space on a two-dimensional plane. In this case, the movement range of the reference point, indicated by the broken-line box in C of fig. 5, is wider than in the case of fig. 4, and the prediction accuracy may be lowered.
As described above, in the method described in non-patent document 4, the prediction accuracy decreases depending on the position of a point, and the coding efficiency may decrease.
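A minimal sketch of Morton-order selection, including the alternating first/last rule of non-patent document 4 (function names and the x-high bit-interleaving convention are illustrative assumptions):

```python
def morton3(x, y, z, bits=10):
    """Interleave the bits of (x, y, z) into a 3D Morton code
    (x taken as the highest of each bit triple, by convention)."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b + 2)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b)
    return code

def pick_reference(points, level):
    """Pick the first point in Morton order on even levels and the
    last on odd levels (the alternating rule of non-patent doc. 4)."""
    ordered = sorted(points, key=lambda p: morton3(*p))
    return ordered[0] if level % 2 == 0 else ordered[-1]

pts = [(1, 1, 1), (0, 0, 0), (1, 0, 0)]
print(pick_reference(pts, level=0))  # first in Morton order: (0, 0, 0)
print(pick_reference(pts, level=1))  # last in Morton order: (1, 1, 1)
```

As figs. 4 and 5 show, this positional rule ignores where the points actually lie inside their voxels, which is exactly the weakness the methods below address.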
< method of setting reference Point >
Thus, for example, as in method 1 shown in the top row of the table of fig. 6, in the layering of attribute data, the centroid of the points may be derived, and a reference point may be set based on that centroid. For example, a point near the derived centroid may be selected as the reference point.
Further, for example, as in method 2 shown in the second row from the top of the table of fig. 6, in the hierarchy of attribute data, a reference point may be selected according to the distribution pattern (distribution manner) of points.
Further, for example, as in method 3 shown in the third row from the top of the table of fig. 6, in the hierarchy of attribute data, information on the setting of the reference point may be transmitted from the encoding side to the decoding side.
Further, for example, as in method 4 shown in the fourth row from the top of the table of fig. 6, in the hierarchy of attribute data, a point near the center of the bounding box and a point far from the center of the bounding box may be alternately selected as reference points for each hierarchy.
By applying any of these methods, a decrease in coding efficiency can be suppressed. Note that the methods described above may be applied in any combination. Further, each of the above-described methods may be applied to encoding or decoding of attribute data compatible with scalable decoding, and may also be applied to encoding or decoding of attribute data incompatible with scalable decoding.
< 2> first embodiment
< method 1>
A case where the above-described "method 1" is applied will be described. In the case of "method 1", the centroid of a point is derived, and a reference point is selected based on the centroid. Any point may be set as a reference point relative to the derived centroid. For example, a point closer to the derived centroid (e.g., a point located closer to the centroid) may be selected as the reference point.
A of fig. 7 shows an example of a target area in which a reference point is set. In A of fig. 7, a square indicates a voxel, and a circle indicates a point. That is, A of fig. 7 is a diagram schematically showing an example of a voxel structure in a three-dimensional space using a two-dimensional plane. For example, when points A to C arranged as shown in A of fig. 7 are candidates for the reference point, point B, which is close to the centroid of these candidates as shown in B of fig. 7, may be selected as the reference point. B of fig. 7 shows a hierarchical structure of attribute data similar to fig. 2 and the like, and the black circle indicates the reference point. That is, point B is selected as the reference point for points A to C.
By selecting a point near the centroid in this manner, a point that is close to more of the other points can be set as the reference point. Therefore, the reference point can be set so as to suppress a decrease in the prediction accuracy of more prediction points, and a decrease in the encoding efficiency can be suppressed.
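The selection described above can be sketched as follows. This Python sketch is not part of the patent disclosure; the function name `select_reference_point` and the use of Euclidean distance are illustrative assumptions.

```python
import math

def select_reference_point(candidates):
    """"Method 1" sketch (illustrative): derive the centroid of the
    candidate points, then pick the candidate closest to it."""
    n = len(candidates)
    centroid = tuple(sum(p[i] for p in candidates) / n for i in range(3))
    # the point nearest the centroid becomes the reference point
    return min(candidates, key=lambda p: math.dist(p, centroid))

# three candidates like points A to C of fig. 7: the middle point is
# closest to the centroid and is selected as the reference point
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)]
print(select_reference_point(points))  # → (1.0, 1.0, 0.0)
```
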
< method of deriving centroid >
The method of deriving the centroid is arbitrary. Which points are used to derive the centroid for selecting the reference point is also arbitrary. For example, the centroid of the points located within a predetermined range may be derived and used to select the reference point. In this way, an increase in the number of points used to derive the centroid can be suppressed, and an increase in load can be suppressed.
The range of points used to derive the centroid (also referred to as the centroid derivation target range) may be any range. For example, the centroid of candidates for the reference point may be derived as in method (1) shown in the second row from the top of the table of "method of deriving centroids" shown in fig. 8. That is, for example, as shown in A of fig. 9, a voxel region including 2 × 2 × 2 voxels in which the points to be candidates of the reference point exist may be set as the centroid derivation target range. In A of fig. 9, a 2 × 2 × 2 voxel region in a three-dimensional space is schematically shown on a two-dimensional plane (as a 2 × 2 square). In this case, the centroid of the three points indicated by circles in A of fig. 9 is derived, and this centroid is used to set the reference point.
In this way, since it is sufficient to derive the centroid of the point (reference point candidate) as the processing target, it is not necessary to search for other points or the like, and the centroid can be easily derived.
Note that the voxel region to be set as the centroid derivation target range is arbitrary and is not limited to 2 × 2 × 2. For example, the centroid of the points located in a voxel region of N × N × N (N ≥ 2) voxels may be derived. That is, an N × N × N voxel region may be set as the centroid derivation target range.
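Method (1) can be sketched as below: integer voxel coordinates are grouped into n × n × n regions and one centroid is derived per region from the candidates inside it, so no search over other points is needed. This sketch is illustrative and not from the patent; `centroids_per_region` is an assumed name.

```python
from collections import defaultdict

def centroids_per_region(points, n=2):
    """Method (1) sketch (illustrative): group integer voxel
    coordinates into n x n x n voxel regions and derive one centroid
    per region from the candidate points inside that region."""
    groups = defaultdict(list)
    for p in points:
        key = tuple(c // n for c in p)   # index of the n^3 voxel region
        groups[key].append(p)
    return {
        key: tuple(sum(q[i] for q in pts) / len(pts) for i in range(3))
        for key, pts in groups.items()
    }

regions = centroids_per_region([(0, 0, 0), (1, 1, 0), (3, 2, 0)], n=2)
# (0,0,0) and (1,1,0) share region (0,0,0); (3,2,0) falls in region (1,1,0)
```
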
For example, as shown in A of fig. 10, a voxel region to be set as the centroid derivation target range (the voxel region indicated by a bold line in A of fig. 10) and the voxel region as the target from which the reference point is derived (the voxel region indicated in gray in A of fig. 10) may be at the same position (the two ranges may perfectly match). Note that in fig. 10, a voxel region actually arranged in a three-dimensional space is schematically shown on a two-dimensional plane. Further, in A of fig. 10, for convenience of description, the centroid derivation target range and the voxel region as the target from which the reference point is derived are drawn slightly shifted from each other, but in practice the two ranges perfectly match.
Further, for example, as shown in B of fig. 10, a voxel region to be set as the centroid derivation target range (a voxel region indicated by a thick line in B of fig. 10) may be wider than a voxel region as a target from which a reference point is derived (a voxel region indicated in gray in B of fig. 10). In the example in B of fig. 10, a 4 × 4 × 4 voxel region is set as the centroid derivation target range.
Further, for example, as shown in C of fig. 10, the center of a voxel region to be set as the centroid derivation target range (a voxel region indicated by a thick line in C of fig. 10) may not coincide with the center of a voxel region as a target from which a reference point is derived (a voxel region indicated in gray in C of fig. 10). That is, the centroid derivation target range may extend unevenly in a predetermined direction with respect to the voxel region as the target from which the reference point is derived. For example, in order to prevent the centroid derivation target range from protruding from the bounding box when near the edge of the bounding box or the like, the extension of the centroid derivation target range may be biased in this manner.
Further, for example, as in method (2) shown in the third row from the top of the table of "method of deriving centroids" shown in fig. 8, the centroid of N nearby points may be obtained. That is, for example, as shown in B of fig. 9, N points may be searched for in order of increasing distance from the center coordinates of the voxel region including 2 × 2 × 2 voxels in which the points to be reference point candidates exist, and the centroid of these N points may be derived. In B of fig. 9, the distribution of points actually arranged in a three-dimensional space is schematically shown on a two-dimensional plane. Further, the black circle indicates the center coordinates of the voxel region as the target from which the reference point is derived. That is, N points (white circles) are selected in order of increasing distance from the black circle, and their centroid is derived.
In this way, the number of points to be searched can be limited to N, and thus an increase in load caused by the search can be suppressed.
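Method (2) amounts to a bounded nearest-neighbor search followed by a centroid computation. The following sketch is illustrative (not from the patent); `centroid_of_n_nearest` and the brute-force sort are assumptions for clarity — a real implementation would likely use a spatial index.

```python
import math

def centroid_of_n_nearest(points, region_center, n):
    """Method (2) sketch (illustrative): take the n points closest to
    the center coordinates of the voxel region and derive their
    centroid; limiting the search to n points bounds the search cost."""
    nearest = sorted(points, key=lambda p: math.dist(p, region_center))[:n]
    return tuple(sum(p[i] for p in nearest) / len(nearest) for i in range(3))

c = centroid_of_n_nearest(
    [(0, 0, 0), (2, 0, 0), (10, 0, 0)], region_center=(1.0, 1.0, 1.0), n=2)
# the two points nearest (1,1,1) are (0,0,0) and (2,0,0) → centroid (1,0,0)
```
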
Further, for example, as in method (3) shown in the fourth row from the top of the table of "method of deriving centroids" shown in fig. 8, the candidates of the reference point (the points existing in the voxel region including 2 × 2 × 2 voxels) may be excluded from the N nearby points of method (2). That is, as shown in C of fig. 9, the 2 × 2 × 2 voxel region may be excluded from the centroid derivation target range, and the centroid of the points located outside the 2 × 2 × 2 voxel region may be derived. In C of fig. 9, the distribution of points actually arranged in a three-dimensional space is schematically shown on a two-dimensional plane.
Further, for example, as in method (4) shown in the fifth row from the top of the table of "method of deriving a centroid" shown in fig. 8, the centroid of the points in a region of radius r centered on the center coordinates of the voxel region including 2 × 2 × 2 voxels in which the candidate points of the reference point exist may be derived. That is, in this case, for example, as shown in D of fig. 9, the centroid of the points located in a region of radius r centered on the center coordinates of the voxel region including 2 × 2 × 2 voxels indicated by a dashed-line box is derived. Note that, in D of fig. 9, the distribution of points actually arranged in a three-dimensional space is schematically shown on a two-dimensional plane. Further, the black circle indicates the center coordinates of the voxel region as the target from which the reference point is derived, and the white circles indicate the points in the region of radius r centered on those center coordinates.
In this way, the present technique can also be applied to lifting that is not compatible with scalable decoding and does not use a voxel structure.
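Because method (4) relies only on a distance test rather than a voxel grid, it can be sketched as a simple radius filter. This sketch is illustrative and not from the patent; `centroid_within_radius` is an assumed name.

```python
import math

def centroid_within_radius(points, center, r):
    """Method (4) sketch (illustrative): derive the centroid of the
    points inside a sphere of radius r centered on the region's center
    coordinates; works even without a voxel structure."""
    inside = [p for p in points if math.dist(p, center) <= r]
    if not inside:
        return None
    return tuple(sum(p[i] for p in inside) / len(inside) for i in range(3))

c = centroid_within_radius(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    center=(0.5, 0.0, 0.0), r=1.0)
# only the two points within radius 1.0 contribute → (0.5, 0.0, 0.0)
```
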
< method of selecting reference Point >
In the case of "method 1" as described above, for example, a point close to the derived centroid may be set as the reference point. In the case where there are a plurality of "points near the centroid", one of these points is selected as the reference point. The method of selection is arbitrary. For example, the reference point may be set in accordance with any of the methods shown in the table of "method of selecting a reference point from a plurality of candidates" in fig. 11.
For example, as shown in A of fig. 12, there may be a plurality of points at equal distances from the centroid. Further, for example, as shown in B of fig. 12, in order to suppress an increase in the computational load, all points located sufficiently close to the centroid may be treated as "points close to the centroid". In the case of B of fig. 12, all points located within a radius Dth from the centroid are regarded as "points close to the centroid". In either case, there may be a plurality of "points close to the centroid".
In such a case, for example, as in method (1) shown in the second row from the top of the table of "method of selecting a reference point from a plurality of candidates" shown in fig. 11, the first point to be a processing target in a predetermined search order may be selected.
Further, for example, as in method (2) shown in the third row from the top of the table of "method of selecting a reference point from a plurality of candidates" shown in fig. 11, the first point or the last point to be a processing target in a predetermined search order may be selected. For example, whether to select the first point to be a processing target in a predetermined search order or to select the last point to be a processing target in a predetermined search order may be switched for each hierarchy.
Further, for example, as in method (3) shown in the fourth row from the top of the table of "method of selecting a reference point from a plurality of candidates" shown in fig. 11, the middle point (the (number of candidates / 2)-th point) to be a processing target in a predetermined search order may be selected.
Further, for example, as in method (4) shown in the fifth row from the top of the table of "method of selecting a reference point from a plurality of candidates" shown in fig. 11, the point at a specified position in a predetermined search order may be selected. That is, the Nth point to be a processing target in the predetermined search order may be selected. The specified position (N) may be predetermined or may be settable by a user, an application, or the like. Further, in the case where the specified position can be set, information on the specified position (N) may be signaled (transmitted).
As in the above methods (1) to (4), in the case where there are a plurality of candidates having substantially the same condition with respect to the centroid, the reference point may be set based on a predetermined search order from among the plurality of candidates.
Note that the search order is arbitrary. For example, the order may be Morton order or an order other than Morton order. Further, the search order may be predefined by a standard or the like, or may be settable by a user, an application, or the like. In the case where the search order can be set, information on the search order may be signaled (transmitted).
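Tie-breaking by search order, as in methods (1) to (4), can be illustrated with Morton order. The sketch below is not from the patent; the bit layout (x in the least significant interleaved bit) and the function names are assumptions — the standard's actual Morton convention may differ.

```python
def morton_code(x, y, z, bits=10):
    """Interleave the bits of (x, y, z) into a Morton (Z-order) code;
    the x-first bit layout is an assumed convention."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

def first_candidate_in_morton_order(candidates):
    """Methods (1)/(2) sketch (illustrative): among candidates equally
    close to the centroid, pick the first one in the search order."""
    return min(candidates, key=lambda p: morton_code(*p))

print(first_candidate_in_morton_order([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
# → (1, 0, 0)
```
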
Further, for example, as in method (5) shown in the sixth row from the top of the table of "method of selecting a reference point from a plurality of candidates" shown in fig. 11, the centroid derivation target range may be widened, the centroid of the points within the widened centroid derivation target range may be derived, and the newly derived centroid may be used to select the point. That is, the centroid may be derived again under changed conditions.
< encoding device >
Next, an apparatus to which the present technology is applied will be described. Fig. 13 is a block diagram showing an example of the configuration of an encoding apparatus as one aspect of an information processing apparatus to which the present technology (method 1) is applied. The encoding apparatus 100 shown in fig. 13 is an apparatus that encodes a point cloud (3D data). The encoding device 100 encodes the point cloud by applying the present technique described in the present embodiment.
Note that although fig. 13 shows main elements such as processing units and data flows, fig. 13 does not necessarily show all of them. That is, in the encoding apparatus 100, there may be a processing unit not shown as a block in fig. 13, or there may be processing or a data flow not shown as an arrow or the like in fig. 13.
As shown in fig. 13, the encoding apparatus 100 includes a positional information encoding unit 101, a positional information decoding unit 102, a point cloud generating unit 103, an attribute information encoding unit 104, and a bitstream generating unit 105.
The positional information encoding unit 101 encodes geometric data (position information) of a point cloud (3D data) input to the encoding apparatus 100. The method of encoding is arbitrary as long as it is compatible with scalable decoding. For example, the positional information encoding unit 101 layers the geometric data to generate an octree, and encodes the octree. Further, for example, processing such as filtering or quantization for noise suppression (noise removal) may be performed. The positional information encoding unit 101 supplies the generated encoded data of the geometric data to the positional information decoding unit 102 and the bitstream generation unit 105.
The positional information decoding unit 102 acquires the encoded data of the geometric data supplied from the positional information encoding unit 101, and decodes the encoded data. The method of decoding is arbitrary as long as the method is a method corresponding to the encoding of the positional information encoding unit 101. For example, processing such as filtering or inverse quantization for denoising may be performed. The positional information decoding unit 102 supplies the generated geometric data (decoding result) to the point cloud generating unit 103.
The point cloud generating unit 103 acquires attribute data (attribute information) of the point cloud input to the encoding apparatus 100 and geometric data (decoding result) supplied from the positional information decoding unit 102. The point cloud generating unit 103 performs a process (recoloring process) of matching the attribute data with the geometric data (decoding result). The point cloud generating unit 103 supplies attribute data (decoding result) corresponding to the geometric data to the attribute information encoding unit 104.
The attribute information encoding unit 104 acquires the geometric data (decoding result) and the attribute data supplied from the point cloud generating unit 103. The attribute information encoding unit 104 encodes the attribute data using the geometric data (decoding result), and generates encoded data of the attribute data.
At this time, the attribute information encoding unit 104 encodes the attribute data by applying the present technique (method 1) described above. The attribute information encoding unit 104 supplies the generated encoded data of the attribute data to the bitstream generation unit 105.
The bitstream generation unit 105 acquires the encoded data of the geometric data supplied from the positional information encoding unit 101. Further, the bitstream generation unit 105 acquires the encoded data of the attribute data supplied from the attribute information encoding unit 104. The bitstream generation unit 105 generates a bitstream including these pieces of encoded data, and outputs the generated bitstream to the outside of the encoding apparatus 100.
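The data flow through units 101 to 105 can be sketched as a small pipeline. This sketch is illustrative only and not from the patent: the class name and callables are stand-ins for the real codec stages, which the text leaves arbitrary.

```python
class EncodingPipeline:
    """Illustrative sketch of the data flow of the encoding apparatus
    100 (fig. 13); each stage is a caller-supplied callable."""

    def __init__(self, encode_geometry, decode_geometry,
                 recolor, encode_attributes):
        self.encode_geometry = encode_geometry      # unit 101
        self.decode_geometry = decode_geometry      # unit 102
        self.recolor = recolor                      # unit 103
        self.encode_attributes = encode_attributes  # unit 104

    def run(self, geometry, attributes):
        geo_coded = self.encode_geometry(geometry)
        geo_decoded = self.decode_geometry(geo_coded)
        attrs_matched = self.recolor(attributes, geo_decoded)
        attr_coded = self.encode_attributes(attrs_matched, geo_decoded)
        # unit 105: bundle both encoded streams into one bitstream
        return {"geometry": geo_coded, "attributes": attr_coded}

# trivial stand-in codecs just to show the flow
p = EncodingPipeline(lambda g: ("geo", g), lambda gc: gc[1],
                     lambda a, g: a, lambda a, g: ("attr", a))
out = p.run([1, 2], [3, 4])
```

Note that, as in the text, the attribute encoder receives the *decoded* geometry, so prediction on the encoder side matches what a decoder will reconstruct.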
With such a configuration, the encoding apparatus 100 can obtain the centroid of the points in the hierarchy of the attribute data and set the reference point based on the centroid. By referring to the point near the centroid in this way, a point near more other points can be set as the reference point. Therefore, in short, the reference point can be set so as to suppress a decrease in the prediction accuracy of more predicted points, and a decrease in the coding efficiency can be suppressed.
Note that each of these processing units (the positional information encoding unit 101 to the bitstream generation unit 105) of the encoding device 100 has an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Further, each processing unit may include, for example, a Central Processing Unit (CPU), a Read Only Memory (ROM), a Random Access Memory (RAM), and the like, and may execute programs using them, thereby realizing the above-described processing. Of course, each processing unit may combine both configurations, realizing a part of the above-described processing by a logic circuit and another part by executing a program. The configurations of the processing units may be independent of each other; for example, one processing unit may realize a part of the above-described processing by a logic circuit, another may realize the processing by executing a program, and still another may realize the processing by both a logic circuit and execution of a program.
< Attribute information encoding means >
Fig. 14 is a block diagram showing a main configuration example of the attribute information encoding unit 104 (fig. 13). Note that although fig. 14 shows main elements such as processing units and data flows, fig. 14 does not necessarily show all of them. That is, in the attribute information encoding unit 104, there may be a processing unit not shown as a block in fig. 14, or there may be processing or a data flow not shown as an arrow or the like in fig. 14.
As shown in fig. 14, the attribute information encoding unit 104 includes a hierarchical processing unit 111, a quantization unit 112, and an encoding unit 113.
The hierarchical processing unit 111 performs processing related to the layering of the attribute data. For example, the hierarchical processing unit 111 acquires the attribute data and the geometric data (decoding result) supplied from the point cloud generating unit 103. The hierarchical processing unit 111 layers the attribute data using the geometric data. At this time, the hierarchical processing unit 111 performs the layering by applying the present technique (method 1) described above. That is, the hierarchical processing unit 111 derives the centroid of the points in each hierarchy, and selects the reference point based on the centroid. Then, the hierarchical processing unit 111 sets a reference relationship in each hierarchy of the hierarchical structure, derives a predicted value of the attribute data of each prediction point using the attribute data of the reference points based on the reference relationship, and derives a difference value between the attribute data and the predicted value. The hierarchical processing unit 111 supplies the layered attribute data (difference values) to the quantization unit 112.
At this time, the hierarchical processing unit 111 may also generate control information regarding the hierarchy. The hierarchical processing unit 111 can also supply the generated control information to the quantization unit 112 together with the attribute data (difference value).
The quantization unit 112 acquires the attribute data (difference value) and the control information supplied from the hierarchical processing unit 111. The quantization unit 112 quantizes the attribute data (difference value). The method of quantization is arbitrary. The quantization unit 112 supplies the quantized attribute data (difference value) and control information to the encoding unit 113.
The encoding unit 113 acquires the quantized attribute data (difference values) and the control information supplied from the quantization unit 112. The encoding unit 113 encodes the quantized attribute data (difference values) and generates encoded data of the attribute data. The method of encoding is arbitrary. Further, the encoding unit 113 includes the control information in the generated encoded data. In other words, encoded data of the attribute data including the control information is generated. The encoding unit 113 supplies the generated encoded data to the bitstream generation unit 105.
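The quantize-then-code path of units 112 and 113 can be sketched as below. This is an illustrative sketch, not the patent's method: the uniform quantization step `qstep` and the fixed-width serialization standing in for the arbitrary entropy coder are assumptions.

```python
def encode_attribute_differences(diffs, qstep=4):
    """Sketch of the quantization unit 112 and encoding unit 113
    (illustrative): quantize the per-point difference values with an
    assumed uniform step, then serialize them as a stand-in for the
    arbitrary entropy coder."""
    quantized = [round(d / qstep) for d in diffs]
    # 2 bytes per value, big-endian, signed — purely illustrative
    encoded = b"".join(q.to_bytes(2, "big", signed=True) for q in quantized)
    return quantized, encoded

quantized, encoded = encode_attribute_differences([7, -3, 12])
# quantized: [2, -1, 3]; encoded: 6 bytes (2 bytes per value)
```
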
By performing the layering as described above, the attribute information encoding unit 104 can set a point close to the centroid as the reference point, and thus can set a point close to more other points as the reference point. Therefore, in short, the reference point can be set to suppress a decrease in prediction accuracy of more prediction points, and a decrease in coding efficiency can be suppressed.
Note that these processing units (the hierarchical processing unit 111 to the encoding unit 113) have an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Further, each processing unit may include, for example, a CPU, a ROM, a RAM, and the like, and may execute programs using them, thereby realizing the above-described processing. Of course, each processing unit may combine both configurations, realizing a part of the above-described processing by a logic circuit and another part by executing a program. The configurations of the processing units may be independent of each other; for example, one processing unit may realize a part of the above-described processing by a logic circuit, another may realize the processing by executing a program, and still another may realize the processing by both a logic circuit and execution of a program.
< layered processing Unit >
Fig. 15 is a block diagram showing a main configuration example of the hierarchical processing unit 111 (fig. 14). Note that although fig. 15 shows main elements such as processing units and data flows, fig. 15 does not necessarily show all of them. That is, in the hierarchical processing unit 111, there may be a processing unit not shown as a block in fig. 15, or there may be processing or a data flow not shown as an arrow or the like in fig. 15.
As shown in fig. 15, the hierarchical processing unit 111 includes a reference point setting unit 121, a reference relationship setting unit 122, an inversion unit 123, and a weighting value derivation unit 124.
The reference point setting unit 121 performs processing related to the setting of reference points. For example, the reference point setting unit 121 classifies a group of points as processing targets into reference points, whose attribute data is referred to, and prediction points, for which predicted values of the attribute data are derived, based on the geometric data of each point. That is, the reference point setting unit 121 sets the reference points and the prediction points. The reference point setting unit 121 recursively repeats this processing on the reference points. That is, the reference point setting unit 121 sets the reference points and the prediction points of the hierarchy as the processing target, using the reference points set in the previous hierarchy as the processing targets. In this way, a layered structure is constructed. That is, the attribute data is layered. The reference point setting unit 121 supplies information indicating the reference points and prediction points set for each hierarchy to the reference relationship setting unit 122.
The reference relationship setting unit 122 performs processing related to the setting of the reference relationship of each hierarchy based on the information supplied from the reference point setting unit 121. That is, the reference relationship setting unit 122 sets, for each prediction point of each hierarchy, the reference points (i.e., the reference destinations) to be referred to for deriving the predicted value. Then, the reference relationship setting unit 122 derives the predicted value of the attribute data of each prediction point based on the reference relationship. That is, the reference relationship setting unit 122 derives the predicted value of the attribute data of the prediction point using the attribute data of the reference points set as the reference destinations. Further, the reference relationship setting unit 122 derives the difference value between the attribute data of the prediction point and the derived predicted value. The reference relationship setting unit 122 supplies the derived difference values of each hierarchy (the layered attribute data) to the inverting unit 123.
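The prediction-and-difference step of unit 122 can be sketched as follows. This is illustrative and not from the patent: the inverse-distance weighting is an assumed lifting-style choice, and `predict_and_difference` is a hypothetical name.

```python
import math

def predict_and_difference(prediction_points):
    """Sketch of the reference relationship setting unit 122
    (illustrative): derive a predicted attribute value for each
    prediction point from its reference points using assumed
    inverse-distance weights, then output the difference to be coded."""
    diffs = []
    for pos, attr, refs in prediction_points:
        # refs: list of (reference position, reference attribute value)
        weights = [1.0 / max(math.dist(pos, rpos), 1e-9) for rpos, _ in refs]
        total = sum(weights)
        predicted = sum(w * rattr
                        for w, (_, rattr) in zip(weights, refs)) / total
        diffs.append(attr - predicted)
    return diffs

# one prediction point with two equidistant reference points:
d = predict_and_difference(
    [((1.0, 0.0, 0.0), 10.0,
      [((0.0, 0.0, 0.0), 8.0), ((2.0, 0.0, 0.0), 12.0)])])
# equal weights → predicted value 10.0 → difference 0.0
```
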
Note that the reference point setting unit 121 may generate control information and the like on the hierarchy of the attribute data as described above, supply the control information and the like to the quantization unit 112, and transmit the control information and the like to the decoding side.
The inverting unit 123 performs processing related to the inversion of the hierarchy. For example, the inverting unit 123 acquires the layered attribute data supplied from the reference relationship setting unit 122. In this attribute data, the information of each hierarchy is arranged in the order of generation. The inverting unit 123 inverts the hierarchy of the attribute data. For example, the inverting unit 123 assigns a hierarchy number (a number for identifying a hierarchy, in which the highest hierarchy is 0 and the value is incremented by 1 each time the hierarchy is lowered by one level, so that the lowest hierarchy has the maximum value) to each hierarchy of the attribute data in the reverse of the generation order, so that the generation order becomes the order from the lowest hierarchy to the highest hierarchy. The inverting unit 123 supplies the hierarchy-inverted attribute data to the weighting value derivation unit 124.
The weighted value derivation unit 124 performs processing related to weighting. For example, the weighting value deriving unit 124 acquires the attribute data supplied from the inverting unit 123. The weighted value deriving unit 124 derives weighted values of the acquired attribute data. The method of deriving the weighting values is arbitrary. The weighted value derivation unit 124 supplies the attribute data (difference value) and the derived weighted value to the quantization unit 112 (fig. 14). Further, the weighted value derivation unit 124 may provide the derived weighted value as control information to the quantization unit 112 and transmit the weighted value to the decoding side.
In the above-described hierarchical processing unit 111, the present technology described above can be applied to the reference point setting unit 121. That is, the reference point setting unit 121 may apply the above-described "method 1", derive the centroid of the points, and set the reference point based on the centroid. In this way, a decrease in prediction accuracy and a decrease in coding efficiency can be suppressed.
Note that the procedure of this layering is arbitrary. For example, the processing of the reference point setting unit 121 and the processing of the reference relationship setting unit 122 may be performed in parallel. For example, each time the reference point setting unit 121 sets the reference points and prediction points of a hierarchy, the reference relationship setting unit 122 may set the reference relationship of that hierarchy.
Note that these processing units (the reference point setting unit 121 to the weighting value derivation unit 124) have an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Further, each processing unit may include, for example, a CPU, a ROM, a RAM, and the like, and may execute programs using them, thereby realizing the above-described processing. Of course, each processing unit may combine both configurations, realizing a part of the above-described processing by a logic circuit and another part by executing a program. The configurations of the processing units may be independent of each other; for example, one processing unit may realize a part of the above-described processing by a logic circuit, another may realize the processing by executing a program, and still another may realize the processing by both a logic circuit and execution of a program.
< flow of encoding processing >
Next, the processing performed by the encoding apparatus 100 will be described. The encoding apparatus 100 encodes data of the point cloud by performing an encoding process. An example of the flow of the encoding process will be described with reference to the flowchart of fig. 16.
When the encoding process starts, in step S101, the positional information encoding unit 101 of the encoding apparatus 100 encodes geometric data (positional information) of the input point cloud, and generates encoded data of the geometric data.
In step S102, the positional information decoding unit 102 decodes the encoded data of the geometric data generated in step S101, and generates positional information.
In step S103, the point cloud generation unit 103 performs a recoloring process using the attribute data (attribute information) of the input point cloud and the geometric data (decoding result) generated in step S102, and associates the attribute data with the geometric data.
In step S104, the attribute information encoding unit 104 performs the attribute information encoding process, thereby encoding the attribute data subjected to the recoloring process in step S103, and generates encoded data of the attribute data. At this time, the attribute information encoding unit 104 executes the process by applying the present technique (method 1) described above. For example, in layering the attribute data, the attribute information encoding unit 104 derives the centroid of the points and sets the reference point based on the centroid. Details of the attribute information encoding process will be described later.
In step S105, the bit stream generation unit 105 generates and outputs a bit stream including the encoded data of the geometry data generated in step S101 and the encoded data of the attribute data generated in step S104.
When the process of step S105 ends, the encoding process ends.
By performing the processing of each step in this way, the encoding device 100 can suppress a decrease in prediction accuracy and can suppress a decrease in encoding efficiency.
< flow of attribute information encoding processing >
Next, an example of the flow of the attribute information encoding process executed in step S104 of fig. 16 will be described with reference to the flowchart of fig. 17.
When the attribute information encoding process is started, the hierarchical processing unit 111 of the attribute information encoding unit 104 layers the attribute data by performing the layering process in step S111. That is, the reference points and prediction points of each hierarchy are set, and the reference relationships are also set. At this time, the hierarchical processing unit 111 performs the layering by applying the present technique (method 1) described above. For example, in layering the attribute data, the hierarchical processing unit 111 derives the centroid of the points and sets the reference point based on the centroid. Details of the layering process will be described later.
In step S112, the hierarchical processing unit 111 derives a predicted value of the attribute data of each predicted point in each hierarchy of the attribute data hierarchical in step S111, and derives a difference value between the attribute data of the predicted point and the predicted value.
In step S113, the quantization unit 112 quantizes each difference value derived in step S112.
In step S114, the encoding unit 113 encodes the difference value quantized in step S112, and generates encoded data of the attribute data.
When the process of step S114 ends, the attribute information encoding process ends, and the process returns to fig. 16.
By performing the processing of each step in this way, the layering processing unit 111 can apply the above-described "method 1", derive the centroid of the points in layering the attribute data, and set the reference point based on the centroid. Therefore, the layering processing unit 111 can layer the attribute data so as to suppress a decrease in prediction accuracy, and thus can suppress a decrease in coding efficiency.
< flow of layering processing >
Next, an example of the flow of the layering process performed in step S111 of fig. 17 will be described with reference to the flowchart of fig. 18.
When the layering process is started, in step S121, the reference point setting unit 121 of the layering processing unit 111 sets the value of the variable LoD index indicating the hierarchy to be the processing target to an initial value (e.g., "0").
In step S122, the reference point setting unit 121 performs the reference point setting process, and sets the reference point in the hierarchy as a processing target (i.e., the prediction point is also set). The details of the reference point setting process will be described later.
In step S123, the reference relationship setting unit 122 sets the reference relationship of the hierarchy as the processing target (which reference point is referred to in deriving the predicted value of each predicted point).
In step S124, the reference point setting unit 121 increments the LoD index and sets the processing target to the next level.
In step S125, the reference point setting unit 121 determines whether all points have been processed. In a case where it is determined that there is an unprocessed point, that is, in a case where it is determined that the layering is not completed, the process returns to step S122 and the processes of step S122 and subsequent steps are repeated. As described above, the processing of steps S122 to S125 is performed for each hierarchy, and in the case where it is determined in step S125 that all the points have been processed, the processing proceeds to step S126.
In step S126, the reversing unit 123 reverses the hierarchies of the attribute data generated as described above, and assigns a hierarchy number to each hierarchy in the reverse direction of the generation order.
In step S127, the weighted value deriving unit 124 derives a weighted value for the attribute data of each hierarchy.
When the process of step S127 ends, the layering process ends, and the process returns to fig. 17.
By performing the processing of each step in this way, the layering processing unit 111 can apply the above-described "method 1", derive the centroid of the points in layering the attribute data, and set the reference point based on the centroid. Therefore, the layering processing unit 111 can layer the attribute data so as to suppress a decrease in prediction accuracy, and thus can suppress a decrease in coding efficiency.
< flow of reference point setting processing >
Next, an example of the flow of the reference point setting process executed in step S122 of fig. 18 will be described with reference to the flowchart of fig. 19.
When the reference point setting process is started, in step S141, the reference point setting unit 121 specifies a set of points for deriving the centroid, and derives the centroid of the set of points as the processing target. As described above, the method of deriving the centroid is arbitrary. For example, the centroid may be derived using any of the methods shown in the table of fig. 8.
In step S142, the reference point setting unit 121 selects a point close to the centroid derived in step S141 as the reference point. The method of selecting the reference point is arbitrary. For example, the reference point may be selected using any of the methods shown in the table of fig. 11.
When the processing of step S142 ends, the reference point setting processing ends, and the processing returns to fig. 18.
By performing the processing of each step in this way, the reference point setting unit 121 can apply the above-described "method 1", derive the centroid of the points in layering the attribute data, and set the reference point based on the centroid. Therefore, the layering processing unit 111 can layer the attribute data so as to suppress a decrease in prediction accuracy, and thus can suppress a decrease in coding efficiency.
< decoding apparatus >
Next, another example of an apparatus to which the present technology is applied will be described. Fig. 20 is a block diagram showing a configuration example of a decoding apparatus as an aspect of an information processing apparatus to which the present technology is applied. The decoding apparatus 200 shown in fig. 20 is an apparatus that decodes encoded data of a point cloud (3D data). The decoding device 200 decodes the encoded data of the point cloud by applying the present technique (method 1) described in the present embodiment.
Note that although fig. 20 shows main elements such as processing units and data flows, fig. 20 does not necessarily show all the elements. That is, in the decoding apparatus 200, there may be a processing unit not shown as a block in fig. 20, or there may be processing or a data flow not shown as an arrow or the like in fig. 20.
As shown in fig. 20, the decoding apparatus 200 includes an encoded data extracting unit 201, a position information decoding unit 202, an attribute information decoding unit 203, and a point cloud generating unit 204.
The encoded data extraction unit 201 acquires and holds a bit stream input to the decoding apparatus 200. The encoded data extraction unit 201 extracts encoded data of geometric data (position information) and attribute data (attribute information) from the held bit stream. The encoded data extracting unit 201 supplies the encoded data of the extracted geometric data to the positional information decoding unit 202. The encoded data extracting unit 201 supplies the encoded data of the extracted attribute data to the attribute information decoding unit 203.
The positional information decoding unit 202 acquires the encoded data of the geometric data supplied from the encoded data extracting unit 201. The positional information decoding unit 202 decodes the encoded data of the geometry data, and generates geometry data (decoding result). The method of decoding is arbitrary as long as the method is a method similar to that in the case of the positional information decoding unit 102 of the encoding apparatus 100. The position information decoding unit 202 supplies the generated geometric data (decoding result) to the attribute information decoding unit 203 and the point cloud generating unit 204.
The attribute information decoding unit 203 acquires encoded data of the attribute data supplied from the encoded data extracting unit 201. The attribute information decoding unit 203 acquires the geometric data (decoding result) supplied from the position information decoding unit 202. The attribute information decoding unit 203 decodes the encoded data of the attribute data by applying the method of the present technology (method 1) described above using the position information (decoding result), and generates the attribute data (decoding result). The attribute information decoding unit 203 supplies the generated attribute data (decoding result) to the point cloud generating unit 204.
The point cloud generating unit 204 acquires geometric data (decoding result) supplied from the positional information decoding unit 202. The point cloud generating unit 204 acquires the attribute data (decoding result) supplied from the attribute information decoding unit 203. The point cloud generating unit 204 generates a point cloud (decoding result) using the geometric data (decoding result) and the attribute data (decoding result). The point cloud generating unit 204 outputs data of the generated point cloud (decoding result) to the outside of the decoding apparatus 200.
With such a configuration, the decoding apparatus 200 can select a point close to the centroid of the points as a reference point in de-layering. Therefore, for example, the decoding apparatus 200 can correctly decode the encoded data of the attribute data encoded by the encoding apparatus 100. Therefore, a decrease in prediction accuracy can be suppressed, and a decrease in coding efficiency can be suppressed.
Note that these processing units (the encoded data extraction unit 201 to the point cloud generation unit 204) have an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Further, each processing unit may include, for example, a CPU, a ROM, a RAM, and the like, and realize the above-described processing by executing a program using them. Of course, each processing unit may have both configurations, in which a part of the above-described processing is realized by a logic circuit and another part is realized by executing a program. The configurations of the processing units may be independent of each other; for example, some of the processing units may realize a part of the above-described processing by a logic circuit, other processing units may realize the above-described processing by executing a program, and still other processing units may realize the above-described processing by both a logic circuit and execution of a program.
< Attribute information decoding unit >
Fig. 21 is a block diagram showing a main configuration example of the attribute information decoding unit 203 (fig. 20). Note that although fig. 21 shows main elements such as processing units and data flows, fig. 21 does not necessarily show all the elements. That is, in the attribute information decoding unit 203, there may be a processing unit not shown as a block in fig. 21, or there may be processing or a data flow not shown as an arrow or the like in fig. 21.
As shown in fig. 21, the attribute information decoding unit 203 includes a decoding unit 211, an inverse quantization unit 212, and a de-layering processing unit 213.
The decoding unit 211 performs processing related to decoding of encoded data of attribute data. For example, the decoding unit 211 acquires encoded data of the attribute data supplied to the attribute information decoding unit 203.
The decoding unit 211 decodes the encoded data of the attribute data, and generates attribute data (decoding result). The method of decoding is arbitrary as long as the method is a method corresponding to the encoding method of the encoding unit 113 (fig. 14) of the encoding device 100. Further, the generated attribute data (decoding result) corresponds to the attribute data before encoding, is a difference value between the attribute data and its predicted value, and is quantized. The decoding unit 211 supplies the generated attribute data (decoding result) to the inverse quantization unit 212.
Note that, in the case where the encoded data of the attribute data includes control information on the weighting value and control information on the hierarchy of the attribute data, the decoding unit 211 also supplies the control information to the inverse quantization unit 212.
The inverse quantization unit 212 performs processing related to inverse quantization of attribute data. For example, the inverse quantization unit 212 acquires the attribute data (decoding result) and the control information supplied from the decoding unit 211.
The inverse quantization unit 212 inversely quantizes the attribute data (decoding result). At this time, in the case where control information on the weighted value is supplied from the decoding unit 211, the inverse quantization unit 212 also acquires the control information and inversely quantizes the attribute data (decoding result) based on the control information (using the weighted value derived based on the control information).
Further, in the case where control information on the hierarchy of the attribute data is supplied from the decoding unit 211, the inverse quantization unit 212 also acquires the control information.
The inverse quantization unit 212 supplies the inversely quantized attribute data (decoding result) to the de-layering processing unit 213. Further, in the case where control information on the hierarchy of the attribute data is acquired from the decoding unit 211, the inverse quantization unit 212 also supplies the control information to the de-layering processing unit 213.
The de-layering processing unit 213 acquires the inversely quantized attribute data (decoding result) supplied from the inverse quantization unit 212. As described above, the attribute data is a difference value. Further, the de-layering processing unit 213 acquires the geometric data (decoding result) supplied from the positional information decoding unit 202. Using the geometric data, the de-layering processing unit 213 performs de-layering, which is the inverse process of the layering performed by the layering processing unit 111 (fig. 14) of the encoding apparatus 100, on the acquired attribute data (difference values).
Here, de-layering will be described. For example, the de-layering processing unit 213 layers the attribute data by a method similar to that of the encoding apparatus 100 (layering processing unit 111) based on the geometric data supplied from the positional information decoding unit 202. That is, the de-layering processing unit 213 sets a reference point and a prediction point of each hierarchy based on the decoded geometric data, and sets the hierarchical structure of the attribute data. The de-layering processing unit 213 also sets the reference relationship (the reference destination of each prediction point) of each level of the hierarchical structure using the reference points and prediction points.
Then, the de-layering processing unit 213 de-layers the acquired attribute data (difference values) using the hierarchical structure and the reference relationship of each hierarchy. That is, the de-layering processing unit 213 derives the predicted value of each prediction point from the reference points according to the reference relationship, and restores the attribute data of each prediction point by adding the predicted value to the difference value. The de-layering processing unit 213 performs this processing for each hierarchy from a higher hierarchy to a lower hierarchy. That is, the de-layering processing unit 213 restores the attribute data of the prediction points of the hierarchy that is the processing target by using, as reference points, the prediction points whose attribute data has already been restored in hierarchies higher than the hierarchy that is the processing target.
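A minimal sketch of this hierarchy-by-hierarchy restoration, assuming the reference relationship is given as indices into the already-restored points (an illustrative representation, not the patent's data format):

```python
def delayer(hierarchies):
    """hierarchies: list (highest hierarchy first) of dicts
       {"diffs": [...], "refs": [[indices into already-restored points], ...]}.
       In the top hierarchy, "refs" entries are empty and "diffs" hold raw values."""
    restored = []
    for h in hierarchies:
        for diff, ref_idx in zip(h["diffs"], h["refs"]):
            if not ref_idx:
                # Top hierarchy: no reference points, the value itself is coded.
                restored.append(diff)
            else:
                # Predicted value from reference points (simple average here,
                # an assumed prediction; the actual formula may be weighted).
                prediction = sum(restored[i] for i in ref_idx) / len(ref_idx)
                restored.append(prediction + diff)
    return restored

restored = delayer([
    {"diffs": [10.0], "refs": [[]]},             # highest hierarchy
    {"diffs": [2.0, -1.0], "refs": [[0], [0]]},  # lower hierarchy
])
```

Each lower-hierarchy value is recovered as prediction plus decoded difference, mirroring the encoder's layering in reverse.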
In the de-layering performed in such a procedure, the de-layering processing unit 213 sets a reference point by applying the present technique (method 1) described above when layering the attribute data based on the decoded geometric data. That is, the de-layering processing unit 213 derives the centroid of the points and selects a point close to the centroid as a reference point. The de-layering processing unit 213 supplies the de-layered attribute data to the point cloud generating unit 204 (fig. 20) as a decoding result.
By performing the de-layering as described above, the de-layering processing unit 213 can set a point close to the centroid as a reference point, and thus can layer the attribute data so as to suppress a decrease in prediction accuracy. That is, the attribute information decoding unit 203 can correctly decode encoded data encoded by a similar method. For example, the attribute information decoding unit 203 can correctly decode the encoded data of the attribute data encoded by the attribute information encoding unit 104 described above. Therefore, a decrease in coding efficiency can be suppressed.
Note that these processing units (the decoding unit 211 to the de-layering processing unit 213) have an arbitrary configuration. For example, each processing unit may be configured by a logic circuit that implements the above-described processing. Further, each processing unit may include, for example, a CPU, a ROM, a RAM, and the like, and realize the above-described processing by executing a program using them. Of course, each processing unit may have both configurations, in which a part of the above-described processing is realized by a logic circuit and another part is realized by executing a program. The configurations of the processing units may be independent of each other; for example, some of the processing units may realize a part of the above-described processing by a logic circuit, other processing units may realize the above-described processing by executing a program, and still other processing units may realize the above-described processing by both a logic circuit and execution of a program.
< flow of decoding processing >
Next, the processing performed by the decoding apparatus 200 will be described. The decoding apparatus 200 decodes the encoded data of the point cloud by performing a decoding process. An example of the flow of the decoding process will be described with reference to the flowchart of fig. 22.
When the decoding process starts, in step S201, the encoded data extraction unit 201 of the decoding apparatus 200 acquires and holds a bitstream, and extracts encoded data of geometry data and encoded data of attribute data from the bitstream.
In step S202, the positional information decoding unit 202 decodes the encoded data of the extracted geometric data, and generates geometric data (decoding result).
In step S203, the attribute information decoding unit 203 executes attribute information decoding processing, decodes the encoded data of the attribute data extracted in step S201, and generates attribute data (decoding result). At this time, the attribute information decoding unit 203 executes processing by applying the present technique (method 1) described above. For example, in the hierarchy of the attribute data, the attribute information decoding unit 203 derives the centroid of the points and sets the points close to the centroid as the reference points. Details of the attribute information decoding process will be described later.
In step S204, the point cloud generating unit 204 generates and outputs a point cloud (decoding result) using the geometric data (decoding result) generated in step S202 and the attribute data (decoding result) generated in step S203.
When the process of step S204 ends, the decoding process ends.
By performing the processing of each step in this way, the decoding device 200 can correctly decode encoded data of attribute data encoded by a similar method. For example, the decoding device 200 can correctly decode the encoded data of the attribute data encoded by the encoding device 100. Therefore, a decrease in prediction accuracy can be suppressed, and a decrease in coding efficiency can be suppressed.
< flow of attribute information decoding processing >
Next, an example of the flow of the attribute information decoding process executed in step S203 of fig. 22 will be described with reference to the flowchart of fig. 23.
When the attribute information decoding process starts, in step S211, the decoding unit 211 of the attribute information decoding unit 203 decodes the encoded data of the attribute data and generates the attribute data (decoding result). The attribute data (decoding result) is quantized as described above.
In step S212, the inverse quantization unit 212 inversely quantizes the attribute data (decoding result) generated in step S211 by performing an inverse quantization process.
In step S213, the de-layering processing unit 213 performs the de-layering process to de-layer the attribute data (difference values) inversely quantized in step S212 and derive the attribute data of each point. At this time, the de-layering processing unit 213 performs de-layering by applying the present technique (method 1) described above. For example, in layering the attribute data, the de-layering processing unit 213 derives the centroid of the points and sets a point close to the centroid as the reference point. Details of the de-layering process will be described later.
When the process of step S213 ends, the attribute information decoding process ends, and the process returns to fig. 22.
By performing the processing of each step in this way, the attribute information decoding unit 203 can apply the above-described "method 1" and set a point close to the centroid of the points as a reference point in layering the attribute data. Therefore, the de-layering processing unit 213 can layer the attribute data so as to suppress a decrease in prediction accuracy. That is, the attribute information decoding unit 203 can correctly decode encoded data encoded by a similar method. For example, the attribute information decoding unit 203 can correctly decode the encoded data of the attribute data encoded by the attribute information encoding unit 104. Therefore, a decrease in coding efficiency can be suppressed.
< flow of de-layering processing >
Next, an example of the flow of the delaminating process performed in step S213 of fig. 23 will be described with reference to the flowchart of fig. 24.
When the de-layering process starts, in step S221, the de-layering processing unit 213 performs the layering process on the attribute data (decoding result) using the geometric data (decoding result), restores the reference point and the prediction point of each hierarchy set on the encoding side, and also restores the reference relationship of each hierarchy. That is, the de-layering processing unit 213 performs processing similar to the layering process performed by the layering processing unit 111, setting a reference point and a prediction point for each hierarchy and also setting a reference relationship for each hierarchy.
For example, the de-layering processing unit 213 applies the above-described "method 1" similarly to the layering processing unit 111, derives the centroid of the points, and sets a point close to the centroid as the reference point.
In step S222, the de-layering processing unit 213 de-layers the attribute data (decoding result) using the hierarchical structure and the reference relationship, and restores the attribute data of each point. That is, the de-layering processing unit 213 derives a predicted value of the attribute data of the predicted point from the attribute data of the reference point based on the reference relationship, and adds the predicted value and a difference value of the attribute data (decoding result) to restore the attribute data.
When the process of step S222 ends, the de-layering process ends, and the process returns to fig. 23.
By performing the processing of each step in this way, the de-layering processing unit 213 can realize layering similar to that at the time of encoding. That is, the attribute information decoding unit 203 can correctly decode encoded data encoded by a similar method. For example, the attribute information decoding unit 203 can correctly decode the encoded data of the attribute data encoded by the attribute information encoding unit 104. Therefore, a decrease in coding efficiency can be suppressed.
<3. Second embodiment>
< method 2>
Next, a case where the "method 2" described above with reference to fig. 6 is applied will be described. In the case of "method 2", in the hierarchy of attribute data, a reference point is selected according to a distribution pattern (distribution manner) of points.
For example, the table (table information) shown in fig. 25 associates the distribution pattern (distribution manner) of the points in the processing target region in which the reference point is set with information (an index) indicating the point to be selected in that case. For example, the second row from the top of the table shows that, in the case where the distribution pattern of points in the 2 × 2 × 2 voxel region in which the reference point is set is "10100001", the point with an index of "2", that is, the point appearing second, is selected. Each bit value of the distribution pattern "10100001" indicates the presence or absence of a point in each voxel of the 2 × 2 × 2 region: a value "1" indicates that a point is present in the voxel to which the bit is allocated, and a value "0" indicates that no point is present in the voxel to which the bit is allocated.
In this table, similarly, an index of the point to be selected is indicated for each distribution pattern. That is, in layering the attribute data, the table is referred to, and the point with the index corresponding to the distribution pattern of points in the processing target region in which the reference point is set is selected as the reference point.
In this way, the reference point can be more easily selected.
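A sketch of this table lookup, where the occupancy of the 2 × 2 × 2 voxel region is expressed as an 8-bit pattern; only the "10100001" → 2 entry follows the text, and the remaining table contents as well as the function names are illustrative assumptions:

```python
# Table mapping a distribution pattern to the index of the voxel whose
# point becomes the reference point (only one entry is taken from the text).
TABLE = {"10100001": 2}

def select_by_pattern(occupied_voxels, table):
    """occupied_voxels: set of voxel indices 0..7 that contain a point."""
    # Build the 8-bit distribution pattern: "1" = voxel contains a point.
    pattern = "".join("1" if v in occupied_voxels else "0" for v in range(8))
    # Look up the index of the voxel holding the reference point.
    return table[pattern]

idx = select_by_pattern({0, 2, 7}, TABLE)
```

A region with points in voxels 0, 2, and 7 yields the pattern "10100001", so the table selects the point in voxel 2 without any centroid computation.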
< Table information >
The table information may be any information as long as it associates the distribution pattern of points with information indicating the point to be selected. For example, as in method (1) shown in the second row from the top of the table shown in A of fig. 26, table information that selects a point close to the centroid for each point distribution pattern may be used. That is, an index of a point close to the centroid position in the case of each distribution pattern may be associated with that distribution pattern.
Further, for example, as in method (2) shown in the third row from the top of the table shown in A of fig. 26, table information that selects an arbitrary point for each point distribution pattern may be used. Further, for example, as in method (3) shown in the fourth row from the top of the table, a table to be used may be selected from a plurality of tables. For example, the table to be used may be switched according to the hierarchy (depth of the LoD).
< Signaling of Table information >
Note that the table information may be prepared in advance. For example, the predetermined table information may be defined by a standard. In this case, signaling transmission of table information (transmission from the encoding side to the decoding side) is unnecessary.
Further, the positional information decoding unit 102 may derive the table information from the geometric data. Similarly, the positional information decoding unit 202 may derive the table information from the geometric data (decoding result). In this case, signaling of the table information (transmission from the encoding side to the decoding side) is unnecessary.
Of course, the table information may be generated (or updated) by a user, application, or the like. In this case, the generated (or updated) table information may be signaled. That is, for example, the encoding unit 113 may perform encoding of information on table information and include its encoded data in a bitstream or the like in order to perform signaling transmission.
Further, as described above, the table information may be switched according to the hierarchy (depth of the LoD). In this case, as in method (1) shown in the second row from the top of the table shown in B of fig. 26, the switching method may be defined in advance by a standard or the like, so that information indicating the switching method need not be signaled.
Further, as in method (2) shown in the third row from the top of the table shown in B of fig. 26, an index (identification information) indicating the selected table may be signaled. For example, the index may be signaled in an attribute parameter set (Attribute Parameter Set).
Further, for example, as in method (3) shown in the fourth row from the top of the table shown in B of fig. 26, the selected table information itself may be signaled. For example, the table information may be signaled in an attribute brick header (Attribute Brick Header).
Further, as in method (4) shown in the fifth row from the top of the table shown in B of fig. 26, a part of the selected table information may be signaled. That is, the table information may be partially updatable. For example, the partial table information may be signaled in an attribute brick header (Attribute Brick Header).
Also in the case of applying this method 2, the configurations of the encoding device 100 and the decoding device 200 are substantially similar to those of the encoding device 100 and the decoding device 200 in the case of applying the above-described method 1. Therefore, the encoding device 100 can execute each process, such as an encoding process, an attribute information encoding process, and a layering process, in a flow similar to that in the case of the first embodiment.
< flow of reference Point setting processing >
An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of fig. 27. When the reference point setting process is started, the reference point setting unit 121 refers to the table information and selects a reference point according to the point distribution pattern in step S301.
In step S302, the reference point setting unit 121 determines whether to signal information on the table used. In the case where the determination is signaled, the process proceeds to step S303.
In step S303, the reference point setting unit 121 signals information on the table used. When the processing of step S303 ends, the reference point setting processing ends, and the processing returns to fig. 18.
Since the encoding apparatus 100 transmits the table information in this manner, the decoding apparatus 200 can perform decoding using the table information.
Note that the decoding apparatus 200 may execute each process, such as a decoding process, an attribute information decoding process, and a de-layering process, in a flow similar to that in the case of the first embodiment.
<4. Third embodiment>
< method 3>
Next, a case where the "method 3" described above with reference to fig. 6 is applied will be described. In the case of "method 3", the reference point set in the hierarchy of attribute data may be signaled.
For example, as in method (1) shown in the second row from the top of the "signaled target" table shown in fig. 28, information indicating, for all nodes (all points), whether each node is referred to, i.e., whether it is set as a reference point or a prediction point, may be signaled. For example, as shown in A of fig. 29, all nodes in all hierarchies may be sorted in Morton order, and each node may be assigned an index (index 0 to index K). In other words, each node (and the information for each node) can be identified by an index from 0 to K.
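The Morton-order index assignment described above can be sketched as follows; the bit-interleaving function is the standard Morton (Z-order) code, and the function names are assumptions:

```python
def morton_code(x, y, z, bits=10):
    """Interleave the bits of x, y, z into a single Morton (Z-order) code."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code

def assign_indices(points):
    """Sort points in Morton order and assign each an index 0..K."""
    ordered = sorted(points, key=lambda p: morton_code(*p))
    return {p: i for i, p in enumerate(ordered)}

indices = assign_indices([(1, 0, 0), (0, 0, 0), (0, 1, 0)])
```

The signaled per-node information can then be identified by these indices regardless of the order in which the points were originally listed.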
Further, for example, as in method (2) shown in the third row from the top of the "signaled target" table shown in fig. 28, information indicating which node (point) is referred to, i.e., selected as a reference point, may be signaled for only a part of the hierarchies. For example, as shown in B of fig. 29, a hierarchy (LoD) to be the target of signaling may be specified, all nodes of that hierarchy may be sorted in Morton order, and an index may be assigned to each node.
For example, it is assumed that the attribute data has a hierarchical structure as shown in A of fig. 30. That is, points are selected as reference points one by one from the point #0 of LoD2, the point #1 of LoD2, and the point #2 of LoD2, so as to form the point #0 of LoD1. In this case, when indexes are allocated to the points of LoD2 in the search order shown in B of fig. 30, the point #0 of LoD1 is indicated by indexes of LoD2 as shown in C of fig. 30. In other words, the distribution of the points of the point #0 of LoD1 can be expressed by specifying "LoD2 0, LoD2 1, LoD2 0".
In this way, a node of LoD N-1 can be specified by an index within the corresponding 2 × 2 × 2 voxel region of LoD N. That is, one voxel in the 2 × 2 × 2 region can be specified by the LoD specification (the hierarchy) and the index (the m-th node in the search order). By specifying the hierarchy and the index in this way, signaling can be performed for only a part of the hierarchies as necessary, and therefore an increase in the code amount can be suppressed and a decrease in the encoding efficiency can be suppressed as compared with the case of method (1).
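The mapping from a point of the lower hierarchy to an index within its 2 × 2 × 2 voxel region can be sketched as follows. This is a hypothetical illustration: grouping by a coordinate right-shift and the within-region search order (a simple tuple sort here, standing in for Morton order) are assumptions, not the patent's defined procedure.

```python
from collections import defaultdict

def parent_voxel(p):
    # A 2x2x2 voxel region of LoD N maps to one node of LoD N-1.
    return tuple(c >> 1 for c in p)

def index_within_voxel(lod_points):
    """Return {point: (parent_voxel, local_index)}, with local indices
    assigned in a fixed search order inside each 2x2x2 region."""
    groups = defaultdict(list)
    for p in lod_points:
        groups[parent_voxel(p)].append(p)
    out = {}
    for parent, pts in groups.items():
        for k, p in enumerate(sorted(pts)):  # assumed search order
            out[p] = (parent, k)
    return out
```

Signaling "index 1 of this region" then identifies one concrete point of the lower hierarchy as the reference point.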
Further, for example, as in method (3) shown in the fourth row from the top of the table of "signaled target" shown in fig. 28, the points to be signaled may be limited by the number of points in the N × N × N voxel region in the lower hierarchy. That is, information on the setting of a reference point may be signaled only for points satisfying a predetermined condition. In this way, as shown in C of fig. 29, the number of nodes to be targets of signaling can be further reduced. Therefore, a decrease in the encoding efficiency can be suppressed.
For example, it is assumed that the attribute data has a hierarchical structure as shown in A of fig. 31. In this case, the signaled targets are limited to 2 × 2 × 2 voxel regions containing three or more points. When indices are assigned to the points of LoD2 in the search order shown in B of fig. 31, the voxel region shown on the right side of LoD2 is excluded from the signaled targets, and therefore no index is assigned to this voxel region. Accordingly, as shown in C of fig. 31, compared with the case of C of fig. 30, the amount of data to be signaled can be reduced and an increase in the code amount can be suppressed.
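A minimal sketch of the restriction in method (3), assuming the condition is "three or more points in the 2 × 2 × 2 region" as in the example of fig. 31; the function name and data layout are illustrative assumptions.

```python
from collections import defaultdict

def signaled_regions(points, min_count=3):
    """Keep only the 2x2x2 voxel regions containing at least min_count
    points; regions below the threshold get no index (method (3))."""
    groups = defaultdict(list)
    for p in points:
        groups[tuple(c >> 1 for c in p)].append(p)
    return {parent: pts for parent, pts in groups.items()
            if len(pts) >= min_count}
```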
Note that method (2) and method (3) may be applied in combination, as in method (4) shown in the fifth row from the top of the table of "signaled target" shown in fig. 28.
<Fixed-length signaling>
The signaling described above may be performed with fixed-length data. For example, the signaling may be performed using the syntax shown in A of fig. 32. In the syntax of A of fig. 32, num_Lod is a parameter indicating the number of LoDs to be signaled. lodNa[i] is a parameter indicating the LoD number. voxelType[i] is a parameter indicating the type of voxel to be signaled; by specifying this parameter, the transmission targets in the LoD can be restricted. num_node is a parameter indicating the number of nodes actually signaled, and can be derived from the geometry data. node[k] represents the signaled information for each 2 × 2 × 2 voxel region, where k denotes the node number in Morton order.
Note that in the case where parsing needs to be performed before the geometry data is obtained, for parallel processing or the like, signaling may be performed using the syntax shown in B of fig. 32. In such a case, only num_node needs to be signaled.
Further, in the case of performing signaling with fixed-length data, the syntax shown in A of fig. 33 may be applied. In this case, a flag for controlling the signaling, flag[k], is signaled. Also in this case, when parsing needs to be performed before the geometry data is obtained, for parallel processing or the like, signaling can be performed using the syntax shown in B of fig. 33, and only num_node needs to be signaled.
<Variable-length signaling>
Further, the signaling described above may be performed using variable-length data. For example, as in the example of fig. 34, the positions of nodes in a 2 × 2 × 2 voxel region may be signaled. In this case, the bit length of the signaling may be set according to the number of nodes in the 2 × 2 × 2 voxel region, for example, based on the table information shown in A of fig. 34.
On the decoding side, the number of nodes included in each 2 × 2 × 2 voxel region can be determined from the geometry data. Therefore, as in the example shown in B of fig. 34, even when a bit string such as "10111010..." is input, the information of each voxel region can be correctly obtained by dividing the string with the appropriate bit lengths, since the number of nodes included in each 2 × 2 × 2 voxel region can be grasped from the geometry data.
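The decoder-side splitting described above can be sketched as follows. The bit-length rule here (enough bits to distinguish the nodes in the region) is an assumed stand-in for the table information of A of fig. 34, since the actual table is only given in the figure.

```python
import math

def bit_length_for(num_nodes):
    # Assumed rule: enough bits to distinguish num_nodes positions.
    if num_nodes <= 1:
        return 0
    return max(1, math.ceil(math.log2(num_nodes)))

def parse_signaling(bitstring, node_counts):
    """Split bitstring into one field per voxel region, using the node
    count of each region (known from the geometry) to fix the field width."""
    fields, pos = [], 0
    for n in node_counts:
        width = bit_length_for(n)
        fields.append(bitstring[pos:pos + width])
        pos += width
    return fields
```

For example, with node counts [4, 2, 8, 3] derived from the geometry, the input "10111010" splits unambiguously into fields of 2, 1, 3, and 2 bits.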
Further, as in the example of fig. 35, an index of the table information to be used may be signaled. In this case, for example, as shown in A of fig. 35, the bit length of the signaling may be made variable according to the number of nodes in the 2 × 2 × 2 voxel region.
For example, in the case where the number of nodes is five to eight, based on the table information shown in A of fig. 35, two bits are allocated and the table information in B of fig. 35 is selected. In this case, the bit string "00" indicates that the first node in a predetermined search order (e.g., Morton order) is selected, and the bit string "01" indicates that the first node in the reverse of that search order is selected. Similarly, the bit string "10" indicates that the second node in the search order is selected (not reversed), and the bit string "11" indicates that the second node in the reverse of the search order is selected.
Further, for example, in the case where the number of nodes is three or four, based on the table information shown in A of fig. 35, one bit is allocated and the table information in C of fig. 35 is selected. In this case, the bit string "0" indicates that the first node in a predetermined search order (e.g., Morton order) is selected, and the bit string "1" indicates that the first node in the reverse of that search order is selected.
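A hedged sketch of the table selection just described: two bits when a region holds five to eight nodes, one bit for three or four. The function name and the (rank, reversed) return convention are assumptions for illustration.

```python
def decode_selection(bits, num_nodes):
    """Return (rank, reversed): pick the rank-th node in the search order,
    or in the reverse of that order when reversed is True."""
    if 5 <= num_nodes <= 8:      # two-bit table (B of fig. 35)
        table = {"00": (1, False), "01": (1, True),
                 "10": (2, False), "11": (2, True)}
    elif 3 <= num_nodes <= 4:    # one-bit table (C of fig. 35)
        table = {"0": (1, False), "1": (1, True)}
    else:
        raise ValueError("no table assumed for this node count")
    return table[bits]
```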
On the decoding side, the number of nodes included in each 2 × 2 × 2 voxel region can be determined from the geometry data. Therefore, as in the example shown in D of fig. 35, even when a bit string such as "10111010..." is input, the information of each voxel region can be correctly obtained by dividing the string with the appropriate bit lengths, since the number of nodes included in each 2 × 2 × 2 voxel region can be grasped from the geometry data.
An example of the syntax in the variable-length case is shown in fig. 36. In the syntax of fig. 36, bitLength is a parameter indicating the bit length, and signalType[i] is a parameter indicating the variable-length coding method for each LoD.
Note that also in this variable-length case, when parsing needs to be performed before the geometry data is obtained, for parallel processing or the like, num_node or flag[j] may be signaled as in the syntax shown in fig. 37.
< flow of reference Point setting processing >
An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of fig. 38. When the reference point setting process is started, the reference point setting unit 121 selects a reference point in step S321.
In step S322, the reference point setting unit 121 signals the information about the reference point set in step S321. When the processing of step S322 ends, the reference point setting processing ends, and the processing returns to fig. 18. Since the encoding apparatus 100 transmits the information about the reference points in this manner, the decoding apparatus 200 can perform decoding using that information.
<5. Fourth embodiment>
< method 4>
Next, a case where the "method 4" described above with reference to fig. 6 is applied will be described. In the case of "method 4", a point closer to the center of the bounding box and a point farther from the center of the bounding box among the candidates of the reference point may be alternately selected as the reference points for each hierarchy.
For example, in the case where the reference points are selected at the positions shown in fig. 39, a point closer to the center of the bounding box and a point farther from the center of the bounding box are alternately selected for each hierarchy, as shown in A of fig. 39 to C of fig. 39. In this way, as shown in C of fig. 39, the movement range of the reference point is limited to a narrow range, as indicated by the dashed-line box, and thus a decrease in the prediction accuracy is suppressed.
In this way, by determining the direction in which points are selected with reference to the center of the bounding box, it is possible to suppress a decrease in the prediction accuracy regardless of the position of the bounding box.
Note that such point selection can be realized by changing the search order of the points according to the position of the bounding box. For example, the search order may be the order of distance from the center of the bounding box; the search order then changes with the position of the bounding box. Further, for example, the search order may be changed for each of the eight regions obtained by dividing the bounding box into eight (two in each of the x, y, and z directions).
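The distance-based search order mentioned above can be sketched as follows. This is an illustrative assumption: squared Euclidean distance is used as the key, and ties are left to the sort's stability.

```python
def search_order(points, bbox_center, farthest_first=False):
    """Sort candidate points by squared distance from the bounding box
    center; toggling farthest_first reverses the direction, which can be
    alternated per hierarchy."""
    def d2(p):
        return sum((a - b) ** 2 for a, b in zip(p, bbox_center))
    return sorted(points, key=d2, reverse=farthest_first)
```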
< flow of reference Point setting processing >
An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of fig. 40. When the reference point setting process is started, in step S341, the reference point setting unit 121 determines whether a point closer to the center of the bounding box has been selected as the reference point of the previous hierarchy.
In a case where it is determined that the closer point has been selected, the process proceeds to step S342.
In step S342, the reference point setting unit 121 selects, as the reference point, the point farthest from the center of the bounding box among the reference point candidates. When the processing of step S342 ends, the reference point setting processing ends, and the processing returns to fig. 18.
Further, in the case where it is determined in step S341 that the closer point is not selected, the processing proceeds to step S343.
In step S343, the reference point setting unit 121 selects, as the reference point, the point closest to the center of the bounding box among the reference point candidates. When the processing of step S343 ends, the reference point setting processing ends, and the processing returns to fig. 18.
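The flow of steps S341 to S343 can be sketched as follows. This is illustrative only: the candidate list, the center coordinates, and the returned toggle for the next hierarchy are assumptions about the interface, not the patent's defined implementation.

```python
def select_reference_point(candidates, center, prev_was_closer):
    """Alternate per hierarchy: if the previous hierarchy took the point
    closer to the bounding box center (S341: yes), take the farthest
    candidate now (S342); otherwise take the closest (S343)."""
    def d2(p):
        return sum((a - b) ** 2 for a, b in zip(p, center))
    if prev_was_closer:
        chosen = max(candidates, key=d2)   # farthest from the center
    else:
        chosen = min(candidates, key=d2)   # closest to the center
    return chosen, not prev_was_closer     # toggle for the next hierarchy
```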
By selecting the reference point in this way, the encoding device 100 can suppress a decrease in the prediction accuracy of the reference point. Therefore, a decrease in coding efficiency can be suppressed.
<6. appendix >
<Methods of layering and de-layering>
In the above description, Lifting has been described as an example of a method for layering and de-layering attribute information, but the present technology can be applied to any technique for layering attribute information. That is, the method of layering and de-layering the attribute information may be a method other than Lifting. Further, the method of layering and de-layering the attribute information may be a non-scalable method or a scalable method as described in non-patent document 3.
< control information >
The control information regarding the present technology described in each of the above embodiments may be transmitted from the encoding side to the decoding side. For example, control information (e.g., enabled _ flag) that controls whether the application of the present technology described above is enabled (or disabled) may be transmitted. Further, for example, control information specifying a range (e.g., an upper limit or a lower limit or both of block sizes, slices, pictures, sequences, components, views, layers, and the like) in which the application of the present technology described above is permitted (or prohibited) may be transmitted.
< surroundings and vicinity >
Note that in the present description, positional relationships such as "vicinity" or "surrounding" may include not only spatial positional relationships but also temporal positional relationships.
< computer >
The series of processes described above may be performed by hardware or may be performed by software. In the case where a series of processes is executed by software, a program constituting the software is installed in a computer. Here, the computer includes a computer incorporated in dedicated hardware, a general-purpose personal computer or the like that can execute various functions by installing various programs, for example.
Fig. 41 is a block diagram showing a configuration example of hardware of a computer that executes the above-described series of processing by a program.
In a computer 900 shown in fig. 41, a Central Processing Unit (CPU)901, a Read Only Memory (ROM)902, and a Random Access Memory (RAM)903 are connected to each other via a bus 904.
An input-output interface 910 is also connected to bus 904. An input unit 911, an output unit 912, a storage unit 913, a communication unit 914, and a drive 915 are connected to the input-output interface 910.
The input unit 911 includes, for example, a keyboard, a mouse, a microphone, a touch panel, an input terminal, and the like. The output unit 912 includes, for example, a display, a speaker, an output terminal, and the like. The storage unit 913 includes, for example, a hard disk, a RAM disk, a nonvolatile memory, and the like. The communication unit 914 includes, for example, a network interface. The drive 915 drives a removable medium 921 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
In the computer configured as described above, the CPU 901 loads a program stored in, for example, the storage unit 913 into the RAM 903 via the input-output interface 910 and the bus 904 and executes the program so as to execute the above-described series of processing. The RAM 903 also appropriately stores data and the like necessary for the CPU 901 to execute various processes.
For example, the program executed by the computer may be applied by being recorded in a removable medium 921 which is a package medium or the like. In this case, the program can be installed in the storage unit 913 via the input-output interface 910 by attaching the removable medium 921 to the drive 915.
Further, the program may be provided via a wired or wireless transmission medium such as a local area network, the internet, or digital satellite broadcasting. In this case, the program may be received by the communication unit 914 and installed in the storage unit 913.
In addition, the program may be installed in advance in the ROM 902 or the storage unit 913.
< objectives of the present technology >
Although the case where the present technology is applied to encoding and decoding of point cloud data has been described above, the present technology is not limited to these examples and can be applied to encoding and decoding of 3D data of any standard. That is, as long as there is no contradiction with the present technology described above, various types of processing such as encoding and decoding methods and specifications of various types of data such as 3D data and metadata are arbitrary. Further, as long as there is no contradiction with the present technology, a part of the above-described processing and specification may be omitted.
Further, in the above description, the encoding apparatus 100 and the decoding apparatus 200 have been described as application examples of the present technology, but the present technology can be applied to any configuration.
For example, the present technology can be applied to various electronic devices such as transmitters and receivers (e.g., television receivers and mobile phones) of satellite broadcasting, cable broadcasting such as cable television, distribution on the internet, and distribution to terminals through cellular communication, or devices that record images on media such as optical disks, magnetic disks, and flash memories or reproduce images from storage media (e.g., hard disk recorders and cameras).
Further, for example, the present technology can also be implemented as a configuration of a part of a device such as a processor (e.g., a video processor) as a system large-scale integration (LSI) or the like, a module (e.g., a video module) using a plurality of processors or the like, a unit (e.g., a video unit) using a plurality of modules or the like, or a set (e.g., a video set) obtained by further adding other functions to the unit.
Further, for example, the present technology can also be applied to a network system including a plurality of devices. For example, the present technology may be implemented as cloud computing that is shared and cooperatively processed by a plurality of devices via a network. For example, the present technology may be implemented in a cloud service that provides a service related to an image (moving image) to any terminal such as a computer, an Audio Visual (AV) device, a portable information processing terminal, or an internet of things (IoT) device.
Note that in this specification, the system means a collection of a plurality of components (devices, modules (parts), and the like), and it is not important whether all the components are in the same housing. Therefore, a plurality of apparatuses accommodated in different housings and connected via a network, and one apparatus in which a plurality of modules are accommodated in one housing are all systems.
< fields and applications to which the present technology is applied >
Note that the system, apparatus, processing unit, etc. to which the present technology is applied may be used in any field, such as transportation, medical treatment, crime prevention, agriculture, animal husbandry, mining, beauty, factories, home appliances, weather, nature monitoring, etc. Further, the use thereof is arbitrary.
< others >
Note that in this specification, a "flag" is information for identifying a plurality of states, and includes not only information for identifying two states of true (1) or false (0), but also information that can identify three or more states. Thus, the value that the "flag" may take may be, for example, two values of 1 and 0, or three or more values. That is, the number of bits constituting the "flag" is arbitrary, and may be one bit or a plurality of bits. Further, it is assumed that identification information (including a flag) includes not only identification information thereof in a bitstream but also difference information of the identification information with respect to some reference information in the bitstream, and therefore, in this specification, "flag" and "identification information" include not only information thereof but also difference information with respect to the reference information.
Further, various types of information (metadata, etc.) related to the encoded data (bit stream) may be transmitted or recorded in any form as long as the information is associated with the encoded data. Herein, the term "associated" means that one data can be used (linked) when processing other data, for example. That is, data associated with each other may be combined into one data, or may be separate data. For example, information associated with the encoded data (image) may be transmitted on a transmission path different from that of the encoded data (image). Further, for example, information associated with the encoded data (image) may be recorded in a different recording medium (or another recording area of the same recording medium) from the encoded data (image). Note that the "association" may be a portion of the data rather than the entire data. For example, an image and information corresponding to the image may be associated with each other in an arbitrary unit such as a plurality of frames, one frame, or a part of a frame.
Note that in this specification, terms such as "combine", "multiplex", "add", "integrate", "include", "store", "put", "insert", and "embed" mean to combine a plurality of items into one, for example, to combine encoded data and metadata into one data, and mean one method of "associating" described above.
Furthermore, the embodiments of the present technology are not limited to the above-described embodiments, and various modifications are possible without departing from the scope of the present technology.
For example, a configuration described as one apparatus (or processing unit) may be divided and configured as a plurality of apparatuses (or processing units). Conversely, configurations described above as a plurality of devices (or processing units) may be combined and configured as one device (or processing unit). Further, a configuration other than the above-described configurations may of course be added to the configuration of each apparatus (or each processing unit). Further, as long as the configuration and operation of the entire system remain substantially the same, a part of the configuration of a certain apparatus (or processing unit) may be included in the configuration of another apparatus (or another processing unit).
Further, for example, the above-described program may be executed in any device. In this case, it is sufficient that the apparatus has necessary functions (function blocks and the like) and can acquire necessary information.
Further, for example, each step of one flowchart may be executed by one apparatus, or may be shared and executed by a plurality of apparatuses. Similarly, in the case where a plurality of processes are included in one step, the plurality of processes may be executed by one apparatus, or may be shared and executed by a plurality of apparatuses. In other words, a plurality of processes included in one step can be executed as processes of a plurality of steps. Conversely, processes described as a plurality of steps may be collectively executed as one step.
Further, for example, in a program executed by a computer, processing describing steps of the program may be performed in time series in the order described in the present specification, or may be performed in parallel or individually at necessary timing such as when a call is made. That is, as long as no contradiction occurs, the processing in the respective steps may be performed in an order different from the above-described order. Further, the processing in the step for describing the program may be executed in parallel with the processing in another program, or may be executed in combination with the processing in another program.
Further, for example, a plurality of techniques related to the present technique may be independently implemented as a single subject as long as there is no contradiction. Of course, any number of the present techniques may also be used and implemented in combination. For example, part or all of the present technology described in any one of the embodiments may be implemented in combination with part or all of the present technology described in the other embodiments. Further, some or all of any of the present techniques described above may be implemented by being used with another technique not described above.
Note that the present technology may have the following configuration.
(1) An information processing apparatus comprising:
a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value with respect to the reference point, wherein:
the layering unit sets the reference point based on the centroid of the points.
(2) The information processing apparatus according to (1), wherein:
the layering unit sets a point closer to the centroid among the candidate points as the reference point.
(3) The information processing apparatus according to (1) or (2), wherein:
the layering unit sets the reference point based on the centroid of the points located within a predetermined range.
(4) The information processing apparatus according to any one of (1) to (3), wherein:
the layering unit sets the reference point based on a predetermined search order from among a plurality of candidates that are under substantially the same condition with respect to the centroid.
(5) An information processing method comprising:
when layering of attribute information is performed, for the attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value with respect to the reference point, the reference point is set based on a centroid of the points.
(6) An information processing apparatus comprising:
a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value with respect to the reference point, wherein:
the layering unit sets the reference points based on the distribution of the points.
(7) The information processing apparatus according to (6), wherein:
the layering unit sets the reference point based on table information that specifies a point close to the centroid of the points for each distribution pattern of the points.
(8) The information processing apparatus according to (6) or (7), wherein:
the layering unit sets the reference point based on table information that specifies a predetermined point for each distribution pattern of the points.
(9) The information processing apparatus according to (8), further comprising:
an encoding unit that encodes information relating to table information.
(10) An information processing method comprising:
when the layering of attribute information is performed by recursively repeating classification of a prediction point for deriving a difference between attribute information and a predicted value of the attribute information and a reference point for deriving a predicted value, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, with respect to the reference point, the reference point is set based on a distribution manner of the points.
(11) An information processing apparatus comprising:
a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value, with respect to the reference point; and
an encoding unit that encodes information on the setting of the reference point by the hierarchical unit.
(12) The information processing apparatus according to (11), wherein:
the encoding unit encodes information on the setting of the reference point for all points.
(13) The information processing apparatus according to (11), wherein:
the encoding unit encodes information on a setting of a reference point for a part of hierarchical points.
(14) The information processing apparatus according to (13), wherein:
the encoding unit also encodes information on the setting of the reference point for the point satisfying the predetermined condition.
(15) An information processing method comprising:
for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layering the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value with respect to the reference point; and
information relating to the setting of the reference point is encoded.
(16) An information processing apparatus comprising:
a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value with respect to the reference point, wherein:
the layering unit alternately selects, as the reference point, a point closer to the center of the bounding box and a point farther from the center of the bounding box from among the candidates for the reference point at each hierarchy.
(17) The information processing apparatus according to (16), wherein:
the layering unit selects a point closer to the center of the bounding box and a point farther from the center of the bounding box from among the candidates, based on a search order according to the position of the bounding box.
(18) The information processing apparatus according to (17), wherein:
the search order is the order of distances from the center of the bounding box.
(19) The information processing apparatus according to (17), wherein:
the search order is set for each region obtained by dividing the bounding box into eight regions.
(20) An information processing method comprising:
when layering of attribute information is performed, for the attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, by recursively repeating classification of a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and a reference point for deriving the predicted value with respect to the reference point, a point closer to the center of a bounding box and a point farther from the center of the bounding box are alternately selected as the reference point from among the candidates for the reference point at each hierarchy.
[ list of reference numerals ]
100: encoding device
101: position information encoding unit
102: position information decoding unit
103: point cloud generating unit
104: attribute information encoding unit
105: bit stream generation unit
111: hierarchical processing unit
112: quantization unit
113: encoding unit
121: reference point setting unit
122: reference relation setting unit
123: inversion unit
124: weight value derivation unit
200: decoding device
201: coded data extraction unit
202: position information decoding unit
203: attribute information decoding unit
204: point cloud generating unit
211: decoding unit
212: inverse quantization unit
213: de-layering processing unit

Claims (20)

1. An information processing apparatus comprising:
a layering unit that, for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating classification of a prediction point for deriving a difference value between the attribute information and a prediction value of the attribute information and a reference point for deriving the prediction value, with respect to the reference point, wherein,
the layering unit sets the reference point based on a centroid of the points.
2. The information processing apparatus according to claim 1,
the layering unit sets a point closer to the centroid among candidate points as the reference point.
3. The information processing apparatus according to claim 1,
the layering unit sets the reference point based on a centroid of points located within a predetermined range.
4. The information processing apparatus according to claim 1,
the layering unit sets the reference point, in accordance with a predetermined search order, from among a plurality of candidates that satisfy substantially the same condition with respect to the centroid.
5. An information processing method comprising:
when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value, setting the reference point based on a centroid of the points.
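The centroid-based selection of claims 1 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure; the function name `select_reference_point` and the sample coordinates are hypothetical.

```python
import numpy as np

def select_reference_point(candidates: np.ndarray) -> int:
    """Return the index of the candidate point closest to the
    centroid of the candidates (the rule of claim 2)."""
    centroid = candidates.mean(axis=0)
    distances = np.linalg.norm(candidates - centroid, axis=1)
    return int(np.argmin(distances))

# Four candidate points in one neighborhood (hypothetical data).
pts = np.array([[0.0, 0.0, 0.0],
                [2.0, 0.0, 0.0],
                [0.0, 2.0, 0.0],
                [1.0, 1.0, 0.0]])
ref_index = select_reference_point(pts)  # (1, 1, 0) is nearest the centroid (0.75, 0.75, 0)
```

Claim 3 would restrict `candidates` to the points within a predetermined range before computing the centroid; claim 4 would break ties between equidistant candidates by a predetermined search order.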
6. An information processing apparatus comprising:
a layering unit that layers attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference value between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value, wherein
the layering unit sets the reference point based on a distribution pattern of the points.
7. The information processing apparatus according to claim 6,
the layering unit sets the reference point based on table information that specifies, for each distribution pattern of the points, a point close to the centroid of the points.
8. The information processing apparatus according to claim 6,
the layering unit sets the reference point based on table information that specifies a predetermined point for each distribution pattern of the points.
9. The information processing apparatus according to claim 8, further comprising:
an encoding unit that encodes information relating to the table information.
10. An information processing method comprising:
when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value, setting the reference point based on a distribution pattern of the points.
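The table-driven selection of claims 6 to 10 can be sketched as a lookup keyed by the occupancy pattern of a neighborhood. The table contents, the bitmask keying, and the fallback rule below are all hypothetical, since the excerpt does not give concrete table values.

```python
# Hypothetical table: an 8-bit occupancy pattern over the eight child
# positions of a node maps to the index of the occupied position to
# use as the reference point. A real table would cover every pattern.
REFERENCE_TABLE = {
    0b00000011: 0,  # two points along one axis: pick position 0
    0b00010001: 4,  # two points along another axis: pick position 4
    0b11111111: 0,  # fully occupied: pick position 0
}

def reference_from_pattern(occupancy: int) -> int:
    """Look up the reference position for an occupancy pattern,
    falling back to the lowest occupied position."""
    if occupancy in REFERENCE_TABLE:
        return REFERENCE_TABLE[occupancy]
    # Isolate the lowest set bit and convert it to a position index.
    return (occupancy & -occupancy).bit_length() - 1
```

Claim 9 adds that information relating to such table information is itself encoded, so a decoder can reproduce the same lookup.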
11. An information processing apparatus comprising:
a layering unit that layers attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference value between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value; and
an encoding unit that encodes information relating to setting of the reference point by the layering unit.
12. The information processing apparatus according to claim 11,
the encoding unit encodes information on the setting of the reference point for all points.
13. The information processing apparatus according to claim 11,
the encoding unit encodes information on the setting of the reference point for some of the points to be layered.
14. The information processing apparatus according to claim 13,
the encoding unit also encodes information on the setting of the reference point for a point satisfying a predetermined condition.
15. An information processing method, comprising:
for attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, layering the attribute information by recursively repeating a classification of a prediction point and a reference point with respect to the reference point, the prediction point being used to derive a difference between the attribute information and a prediction value of the attribute information, the reference point being used to derive the prediction value, and
encoding information relating to the setting of the reference point.
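Claims 11 to 15 add an encoding unit for the reference-point setting information. This excerpt does not fix a bitstream syntax, so the container below, a uint32 count followed by one uint16 index per point, is purely a hypothetical sketch of how per-point reference selections could be signaled and recovered.

```python
import struct

def encode_reference_info(indices: list[int]) -> bytes:
    """Serialize reference-point selection indices.

    Hypothetical layout: uint32 little-endian entry count,
    then one uint16 per entry."""
    return struct.pack("<I", len(indices)) + struct.pack(f"<{len(indices)}H", *indices)

def decode_reference_info(data: bytes) -> list[int]:
    """Recover the selection indices written by encode_reference_info."""
    (count,) = struct.unpack_from("<I", data, 0)
    return list(struct.unpack_from(f"<{count}H", data, 4))
```

Claims 12 to 14 would vary what this unit writes: selections for all points, for only some of the layered points, or only for points satisfying a predetermined condition.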
16. An information processing apparatus comprising:
a layering unit that layers attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points, by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference value between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value, wherein
the layering unit alternately selects, as the reference point for each layer, a point closer to a center of a bounding box and a point farther from the center of the bounding box from among candidates for the reference point.
17. The information processing apparatus according to claim 16,
the layering unit selects the point closer to the center of the bounding box and the point farther from the center of the bounding box from among the candidates, based on a search order corresponding to position in the bounding box.
18. The information processing apparatus according to claim 17,
the search order is an order of distances from the center of the bounding box.
19. The information processing apparatus according to claim 17,
the search order is set for each region obtained by dividing the bounding box into eight regions.
20. An information processing method comprising:
when attribute information of each point of a point cloud representing an object having a three-dimensional shape as a set of points is layered by recursively repeating, with respect to a reference point, classification of points into a prediction point for deriving a difference between the attribute information and a predicted value of the attribute information and the reference point for deriving the predicted value, alternately selecting, as the reference point for each layer, a point closer to a center of a bounding box and a point farther from the center of the bounding box from among candidates for the reference point.
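The alternating rule of claims 16 to 20 can be sketched as follows; the function name `select_alternating` and the sample data are illustrative only, and the even/odd assignment of near and far layers is an assumption.

```python
import numpy as np

def select_alternating(candidates, bbox_center, layer: int) -> int:
    """At even layers pick the candidate nearest the bounding-box
    center; at odd layers pick the one farthest from it."""
    d = np.linalg.norm(
        np.asarray(candidates, dtype=float) - np.asarray(bbox_center, dtype=float),
        axis=1)
    return int(np.argmin(d)) if layer % 2 == 0 else int(np.argmax(d))

cands = [[1.0, 1.0, 1.0], [7.0, 7.0, 7.0]]
center = [0.0, 0.0, 0.0]
near = select_alternating(cands, center, layer=0)  # index 0, nearer the center
far = select_alternating(cands, center, layer=1)   # index 1, farther from the center
```

Claims 17 to 19 refine the same idea: ties are broken by a search order, e.g. ordering candidates by distance from the bounding-box center, with a separate order per octant of the bounding box.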
CN202080091116.4A 2020-01-07 2020-12-24 Information processing apparatus and method Pending CN114902284A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2020-000679 2020-01-07
JP2020000679 2020-01-07
PCT/JP2020/048358 WO2021140930A1 (en) 2020-01-07 2020-12-24 Information processing device and method

Publications (1)

Publication Number Publication Date
CN114902284A 2022-08-12

Family

ID=76788592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080091116.4A Pending CN114902284A (en) 2020-01-07 2020-12-24 Information processing apparatus and method

Country Status (6)

Country Link
US (1) US20230023219A1 (en)
EP (1) EP4071715A4 (en)
JP (1) JPWO2021140930A1 (en)
KR (1) KR20220122995A (en)
CN (1) CN114902284A (en)
WO (1) WO2021140930A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023127052A1 (en) * 2021-12-27 2023-07-06 日本電信電話株式会社 Decoding device, encoding device, decoding program, encoding program, decoding method, and encoding method
WO2023163387A1 (en) * 2022-02-24 2023-08-31 엘지전자 주식회사 Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10861196B2 (en) * 2017-09-14 2020-12-08 Apple Inc. Point cloud compression
JP7268598B2 (en) * 2017-09-29 2023-05-08 ソニーグループ株式会社 Information processing device and method
KR20210019445A (en) * 2018-06-15 2021-02-22 파나소닉 인텔렉츄얼 프로퍼티 코포레이션 오브 아메리카 3D data encoding method, 3D data decoding method, 3D data encoding device, and 3D data decoding device

Also Published As

Publication number Publication date
JPWO2021140930A1 (en) 2021-07-15
WO2021140930A1 (en) 2021-07-15
KR20220122995A (en) 2022-09-05
EP4071715A4 (en) 2023-04-05
EP4071715A1 (en) 2022-10-12
US20230023219A1 (en) 2023-01-26

Similar Documents

Publication Publication Date Title
CN113455007B (en) Method and device for encoding and decoding inter-frame point cloud attribute
US11943457B2 (en) Information processing apparatus and method
CN113256746A (en) Point cloud coding and decoding method and device and storage medium
US11328440B2 (en) Point cloud data transmission apparatus, point cloud data transmission method, point cloud data reception apparatus, and point cloud data reception method
US11902348B2 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN113795868B (en) Method, equipment and storage medium for encoding attribute information of point cloud
CN114902284A (en) Information processing apparatus and method
WO2021002214A1 (en) Information processing device and method
WO2021010134A1 (en) Information processing device and method
US20230154052A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method
WO2021010200A1 (en) Information processing device and method
CN113179411A (en) Point cloud attribute coding and decoding method and device, computer equipment and storage medium
US20230177735A1 (en) Information processing apparatus and method
WO2021140928A1 (en) Information processing device and method
WO2022145214A1 (en) Information processing device and method
EP4325853A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
US20230316581A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
US20230206510A1 (en) Point cloud data processing device and processing method
EP4325851A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
US20230281878A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device and point cloud data reception method
US20240163426A1 (en) Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
JP2023053827A (en) Point group decoding device, point group decoding method and program
CN117121487A (en) Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device and point cloud data receiving method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination