WO2024042909A1

WO2024042909A1 - Decoding method, encoding method, decoding device, and encoding device

Info

Publication number: WO2024042909A1
Application number: PCT/JP2023/025991
Authority: WO
Inventors: 敦伊藤; 敏康杉尾; 賀敬井口; 孝啓西
Original assignee: パナソニックインテレクチュアルプロパティコーポレーションオブアメリカ
Priority date: 2022-08-26
Filing date: 2023-07-14
Publication date: 2024-02-29

Abstract

Provided is a decoding method for decoding a plurality of three-dimensional points, the decoding method comprising: acquiring, from a bit stream, a plurality of nodes having an octree structure and included in a first slice (S301); acquiring, from the bit stream, information for deriving the shape of a first node among the plurality of nodes (S302); and decoding the first node according to the information (S303). The shape of the first node is different from the prescribed shape of the other nodes among the plurality of nodes. For example, the shape is a rectangular parallelepiped shape, and need not be a cube shape.

Description

Decoding method, encoding method, decoding device, and encoding device

The present disclosure relates to a decoding method, an encoding method, a decoding device, and an encoding device.

In the future, devices and services that utilize three-dimensional data are expected to become more widespread in a wide range of fields, including computer vision for autonomous operation of cars or robots, map information, monitoring, infrastructure inspection, and video distribution. Three-dimensional data is acquired by various methods, such as a distance sensor such as a range finder, a stereo camera, or a combination of multiple monocular cameras.

One method of representing three-dimensional data is the point cloud, which represents the shape of a three-dimensional structure using a group of points in three-dimensional space. A point cloud stores the positions and colors of the points. Point clouds are expected to become the mainstream method for representing three-dimensional data, but they require a very large amount of data. Therefore, when storing or transmitting three-dimensional data, it is essential to compress the data through encoding, just as with two-dimensional moving images (for example, MPEG-4 AVC or HEVC standardized by MPEG).

Additionally, point cloud compression is partially supported by a public library (Point Cloud Library) that performs point cloud-related processing.

Additionally, there is a known technology that uses three-dimensional map data to search for and display facilities located around a vehicle (for example, see Patent Document 1).

Patent Document 1: International Publication No. 2014/020663

In such encoding and decoding methods, it is desired to improve the encoding efficiency.

An object of the present disclosure is to provide a decoding method, an encoding method, a decoding device, or an encoding device that can improve encoding efficiency.

A decoding method according to an aspect of the present disclosure is a decoding method for decoding a plurality of three-dimensional points. The method includes: acquiring, from a bitstream, a plurality of nodes having an octree structure and included in a first slice; acquiring, from the bitstream, information for deriving the shape of a first node among the plurality of nodes; and decoding the first node according to the information. The shape differs from the prescribed shape of the other nodes among the plurality of nodes.

An encoding method according to an aspect of the present disclosure is an encoding method for encoding a plurality of three-dimensional points. The method includes: generating a bitstream by encoding a plurality of nodes having an octree structure and included in a first slice; and storing, in the bitstream, information for deriving the shape of a first node among the plurality of nodes. The shape differs from the prescribed shape of the other nodes among the plurality of nodes.

The present disclosure can provide a decoding method, an encoding method, a decoding device, or an encoding device that can improve encoding efficiency.

FIG. 1 is a diagram illustrating an example of an original point group according to the embodiment.
FIG. 2 is a diagram illustrating an example of a pruned octree according to the embodiment.
FIG. 3 is a diagram illustrating a two-dimensional display of a leaf node according to the embodiment.
FIG. 4 is a diagram for explaining a method of generating centroid vertices according to the embodiment.
FIG. 5 is a diagram for explaining a method of generating centroid vertices according to the embodiment.
FIG. 6 is a diagram illustrating an example of vertex information according to the embodiment.
FIG. 7 is a diagram illustrating an example of a TriSoup surface according to the embodiment.
FIG. 8 is a diagram for explaining point cloud restoration processing according to the embodiment.
FIG. 9 is a diagram illustrating an example of slice division according to the embodiment.
FIG. 10 is a diagram showing an example of vertices according to the embodiment.
FIG. 11 is a diagram illustrating an example of a TriSoup surface that should originally be generated according to the embodiment.
FIG. 12 is a diagram illustrating an example of a TriSoup surface when no edge vertices are generated according to the embodiment.
FIG. 13 is a diagram illustrating an example of a restored point group according to the embodiment.
FIG. 14 is a diagram showing an example of vertices according to the embodiment.
FIG. 15 is a diagram illustrating an example of a TriSoup surface according to the embodiment.
FIG. 16 is a diagram illustrating an example of transmission information according to the embodiment.
FIG. 17 is a diagram illustrating an example syntax of a GDU header according to the embodiment.
FIG. 18 is a diagram illustrating an example of setting the adjustment width of a non-standard width node according to the embodiment.
FIG. 19 is a flowchart of encoding processing by the encoding device according to the embodiment.
FIG. 20 is a flowchart of decoding processing by the decoding device according to the embodiment.
FIG. 21 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 22 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 23 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 24 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 25 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 26 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 27 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 28 is a diagram showing the relationship between the node width and the width of the inclusive point coordinates according to the embodiment.
FIG. 29 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 30 is a diagram illustrating an example of setting a non-standard width node according to the embodiment.
FIG. 31 is a diagram illustrating an example syntax of GPS and GDU headers according to the embodiment.
FIG. 32 is a diagram illustrating an example of slices and nodes according to the embodiment.
FIG. 33 is a diagram illustrating an example of slices and nodes according to the embodiment.
FIG. 34 is a diagram illustrating an example of processing when omitting according to the embodiment.
FIG. 35 is a diagram illustrating an example syntax of GPS and GDU headers according to the embodiment.
FIG. 36 is a flowchart of the omission determination process for the start-end adjustment process according to the embodiment.
FIG. 37 is a flowchart of the omission determination process for the terminal-end adjustment process according to the embodiment.
FIG. 38 is a flowchart of node position determination processing according to the embodiment.
FIG. 39 is a flowchart of node width determination processing according to the embodiment.
FIG. 40 is a flowchart of decoding processing according to the embodiment.
FIG. 41 is a block diagram of a decoding device according to the embodiment.
FIG. 42 is a flowchart of encoding processing according to the embodiment.
FIG. 43 is a block diagram of a decoding device according to the embodiment.

A three-dimensional data decoding method according to an aspect of the present disclosure is a decoding method for decoding a plurality of three-dimensional points. The method includes: acquiring, from a bitstream, a plurality of nodes having an octree structure and included in a first slice; acquiring, from the bitstream, information for deriving the shape of a first node among the plurality of nodes; and decoding the first node according to the information. The shape differs from the prescribed shape of the other nodes among the plurality of nodes. According to this, a node having a shape different from the prescribed shape can be set. Nodes of variable shape can therefore be set according to the size of the slice or the distribution of the point group, so encoding efficiency may be improved.

For example, the shape may be a rectangular parallelepiped and need not be a cube. For example, an end of the first slice may coincide with an end of any one of the plurality of nodes. According to this, even when a node end and the slice end would not coincide under the prescribed shape, they can be made to coincide. This prevents the failure to generate a vertex at the end of the node, which suppresses the generation of blank areas at slice boundaries. Therefore, the accuracy of the decoded point cloud can be improved.

For example, the information may indicate the size of the shape or the positions of both ends of the side of the first node. According to this, the decoding device can generate a node having a shape different from the prescribed shape using the information.

For example, the information may include adjustment information for adjusting the prescribed shape to the shape. According to this, the decoding device can generate a node having a shape different from the prescribed shape using the adjustment information. Furthermore, compared to the case where the absolute amount of position information is sent, there is a possibility that the amount of information can be made smaller.

For example, the decoding may be performed according to a compression method in which the plurality of three-dimensional points are approximated by a plane or a curved surface within the first node. For example, the compression method may be a Triangle-Soup compression method.

For example, the shape may be determined to generate the plane or the curved surface within the first node. According to this, by setting a node having a shape different from the prescribed shape, a plane or a curved surface can be generated within the first node.

For example, an edge of the shape may have a vertex on it, and the plane or curved surface may intersect the edge at that vertex. According to this, by setting a node having a shape different from the prescribed shape, a plane or a curved surface can be generated within the first node.

For example, the first node may be provided in contact with a second slice adjacent to the first slice. According to this, for example, when a node end and a slice end do not coincide, the failure to generate a vertex at the node end can be prevented, which suppresses the generation of blank areas at slice boundaries. For example, if only nodes of the prescribed shape are provided at the slice boundary, the slice boundary may cut through a node. In that case, the first node cannot be encoded or decoded using the points in the second slice adjacent to the first slice, so the reconstruction accuracy of the three-dimensional point cloud near the slice boundary may deteriorate. In this aspect, by setting a node having a shape different from the prescribed shape, the end of the first node can be made to coincide with the end of the first slice, for example. This suppresses deterioration in the reconstruction accuracy of the three-dimensional point cloud near the slice boundary. Note that when this aspect is applied to the TriSoup method, it can prevent edge vertices from failing to be generated appropriately.

For example, the information may be provided for each slice. The information of the second slice may be used to derive the shape of a second node among a plurality of nodes that have an octree structure and are included in the second slice, and the shape of the second node may differ from the prescribed shape. According to this, the shape of the node can be set for each slice.

For example, the size of the prescribed shape may be expressed as a power of two, and the size of the shape may be different from the size expressed as a power of two.

For example, the shape of the first node may be defined by a first length along a first direction, a second length along a second direction, and a third length along a third direction, where the first direction, the second direction, and the third direction are orthogonal to one another. Among the first length, the second length, and the third length, only the first length may differ from the corresponding length of the prescribed shape of the nodes other than the first node; alternatively, two of the three lengths, or all three lengths, may each differ from the corresponding length of the prescribed shape.

For example, the first node may be the node closest to the origin of the first slice among the plurality of nodes in one of a first direction, a second direction, and a third direction, where the first direction, the second direction, and the third direction are orthogonal to one another. According to this, when the start position of the slice does not match the origin, for example, the start position of the node can be adjusted in accordance with the start position of the slice.

For example, the plurality of nodes may include a third node having a shape different from the prescribed shape, and the third node may be farthest from the origin in the one direction among the plurality of nodes. According to this, when the end position of a slice does not match the end position of a node, for example, the end position of a node can be adjusted in accordance with the end position of a slice.

For example, if the start position of the first slice does not coincide with the origin, the bitstream may include the information, and if the start position of the first slice coincides with the origin, the bitstream need not include the information. According to this, the occurrence of processing for adjusting the shape of the node can be reduced. Furthermore, transmission of information for deriving the shape of the node can be omitted. Therefore, it is possible to reduce the amount of processing and the amount of bitstream data.

For example, if the end position of the first slice does not coincide with the end of the first node, the bitstream may include the information, and if the end position of the first slice coincides with the end of the first node, the bitstream need not include the information. According to this, the occurrence of processing for adjusting the shape of the node can be reduced. Furthermore, transmission of information for deriving the shape of the node can be omitted. Therefore, it is possible to reduce the amount of processing and the amount of bitstream data.

A three-dimensional data encoding method according to one aspect of the present disclosure is an encoding method for encoding a plurality of three-dimensional points. The method includes: generating a bitstream by encoding a plurality of nodes having an octree structure and included in a first slice; and storing, in the bitstream, information for deriving the shape of a first node among the plurality of nodes. The shape differs from the prescribed shape of the other nodes among the plurality of nodes. According to this, a node having a shape different from the prescribed shape can be set. Nodes of variable shape can therefore be set according to the size of the slice or the distribution of the point group, so encoding efficiency may be improved.

Further, a decoding device according to an aspect of the present disclosure is a decoding device that decodes a plurality of three-dimensional points, and includes a processor and a memory. Using the memory, the processor acquires, from a bitstream, a plurality of nodes having an octree structure and included in a first slice; acquires, from the bitstream, information for deriving the shape of a first node among the plurality of nodes; and decodes the first node according to the information. The shape differs from the prescribed shape of the other nodes among the plurality of nodes.

Further, an encoding device according to an aspect of the present disclosure is an encoding device that encodes a plurality of three-dimensional points, and includes a processor and a memory. Using the memory, the processor generates a bitstream by encoding a plurality of nodes having an octree structure and included in a first slice, and stores, in the bitstream, information for deriving the shape of a first node among the plurality of nodes. The shape differs from the prescribed shape of the other nodes among the plurality of nodes.

Note that these general or specific aspects may be realized by a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or by any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.

Hereinafter, embodiments will be specifically described with reference to the drawings. Note that each of the embodiments described below represents a specific example of the present disclosure. The numerical values, shapes, materials, components, arrangement positions and connection forms of the components, steps, order of steps, etc. shown in the following embodiments are examples, and do not limit the present disclosure. Further, among the constituent elements in the following embodiments, constituent elements that are not described in the independent claims will be described as arbitrary constituent elements.

(Embodiment)
An encoding device (three-dimensional data encoding device) and a decoding device (three-dimensional data decoding device) according to this embodiment will be described below. The encoding device generates a bitstream by encoding three-dimensional data. The decoding device generates three-dimensional data by decoding the bitstream.

The three-dimensional data is, for example, three-dimensional point group data (also referred to as point group data). A point cloud is a collection of three-dimensional points and indicates the three-dimensional shape of an object. The point cloud data includes position information and attribute information of a plurality of three-dimensional points. The position information indicates the three-dimensional position of each three-dimensional point. Note that the position information may also be referred to as geometry information. For example, position information is expressed in a rectangular coordinate system or a polar coordinate system.

The attribute information indicates, for example, color information, reflectance, transmittance, infrared information, normal vector, or time information. One three-dimensional point may have a single attribute information, or may have multiple types of attribute information.

Note that although the encoding and decoding of position information will be mainly described below, the encoding device and the decoding device may also encode and decode attribute information.

[TriSoup method]
The encoding device according to this embodiment encodes position information using a TriSoup (Triangle-Soup) method.

The TriSoup method is one of the methods for encoding position information of point cloud data, and is a lossy (irreversible) compression method. In the TriSoup method, the original point group to be processed is replaced with a set of triangles, so that the point group is approximated by planes. Specifically, the original point group is replaced with vertex information within each node, and the vertices are connected to generate a group of triangles. The vertex information for generating the triangles is stored in a bitstream and sent to the decoding device.

First, encoding processing using the trisoap method will be explained. FIG. 1 is a diagram showing an example of a group of original points. As shown in FIG. 1, a point group 102 of the object is included in the object space 101 and includes a plurality of points 103.

First, the encoding device divides the original point group into an octree to a predetermined depth. In octree division, a target space is divided into eight nodes (subspaces), and 8-bit information (occupancy code) indicating whether a point group is included in each node is generated. Further, the node including the point cloud is further divided into eight nodes, and 8-bit information indicating whether or not the point group is included in each of the eight nodes is generated. This process is repeated up to a predetermined hierarchy.
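The occupancy code described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the syntax of any standard: the function name `occupancy_code`, the use of integer coordinates, and the child ordering (x as the most significant selector bit) are all hypothetical choices made for this example, and every point is assumed to lie inside the node.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int, int]


def occupancy_code(points: Iterable[Point], origin: Point, size: int) -> int:
    """Return the 8-bit occupancy code of one cubic node.

    Bit i is set when child i contains at least one point. The child index
    is packed as (x << 2) | (y << 1) | z, an ordering assumed for this sketch.
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        # Each selector is 1 when the point falls in the upper half of the axis.
        cx = int(x - origin[0] >= half)
        cy = int(y - origin[1] >= half)
        cz = int(z - origin[2] >= half)
        code |= 1 << ((cx << 2) | (cy << 1) | cz)
    return code


points = [(0, 0, 0), (5, 1, 1), (7, 7, 7)]
code = occupancy_code(points, origin=(0, 0, 0), size=8)
print(f"{code:08b}")  # children 0, 4, and 7 are occupied
```

Only the occupied children would then be divided again, which is the repetition described in the paragraph above.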

Here, in normal octree encoding, division is repeated until, for example, the number of points included in a node becomes one or no more than a threshold value. In the TriSoup method, on the other hand, octree division is performed down to an intermediate layer but is not performed on the layers below it. Such an octree that stops at an intermediate layer is called a pruned octree.

FIG. 2 is a diagram showing an example of a pruned octree. As shown in FIG. 2, the point cloud 102 is divided into a plurality of leaf nodes 104 (lowest layer nodes) of a pruned octree.
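As a rough sketch of how points end up in the leaf nodes of a pruned octree, the following assumes a cubic root node whose size is a power of two, non-negative integer coordinates, and a fixed stopping depth. Under those assumptions, stopping the division at a given depth is equivalent to bucketing each point by integer division, which is the shortcut used here instead of an explicit recursion. The function name `pruned_octree_leaves` is hypothetical.

```python
from collections import defaultdict


def pruned_octree_leaves(points, root_size, depth):
    """Group points by the leaf node of a pruned octree that contains them.

    Stopping the division at `depth` leaves cubic nodes of width
    root_size / 2**depth, so the leaf of a point follows from integer division.
    """
    leaf_width = root_size >> depth
    leaves = defaultdict(list)
    for x, y, z in points:
        # A leaf is identified by its corner closest to the origin.
        corner = tuple(c // leaf_width * leaf_width for c in (x, y, z))
        leaves[corner].append((x, y, z))
    return leaf_width, dict(leaves)


leaf_width, leaves = pruned_octree_leaves(
    [(0, 0, 0), (3, 3, 3), (4, 0, 0), (15, 15, 15)], root_size=16, depth=2
)
print(leaf_width, sorted(leaves))
```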

Next, the encoding device performs the following processing on each leaf node 104 of the pruned octree. In the following, leaf nodes are also simply referred to as nodes. The encoding device generates a vertex on an edge as a representative point of the points close to that edge of the node. This vertex is called an edge vertex. For example, edge vertices are generated for each of a plurality of edges (e.g., four parallel edges).

FIG. 3 is an example of a two-dimensional display of the leaf node 104, and is a diagram showing, for example, an xy plane viewed from the z direction shown in FIG. 1. As shown in FIG. 3, edge vertices 112 are generated on the edge based on points near the edge among the plurality of points 111 in the leaf node 104.

Note that in FIG. 3, the dotted line on the outer periphery of the leaf node 104 is an edge. In this example, the edge vertex 112 is generated at the weighted average of the positions of the points within a distance of 1 from the edge (the points included in the range 113 in FIG. 3). The unit of distance is, for example, the resolution of the point group, but is not limited to this. Although this distance (threshold value) is 1 in this example, it may be a value other than 1 and may be variable.
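The edge vertex computation can be sketched as follows. The text does not specify the weights of the weighted average, so this sketch assumes uniform weights; it also assumes that each nearby point is first projected onto the edge so that the resulting vertex lies on the edge. The function name `edge_vertex` and its parameters are hypothetical.

```python
import math


def edge_vertex(points, a, b, threshold=1.0):
    """Return a vertex on the edge a-b representing the points near that edge.

    Points farther than `threshold` from the edge are ignored. Returns None
    when no point is close enough, in which case no edge vertex is generated.
    """
    d = [b[i] - a[i] for i in range(3)]
    dd = sum(c * c for c in d)
    params = []
    for p in points:
        ap = [p[i] - a[i] for i in range(3)]
        # t is the position of the projected point along the edge (0 at a, 1 at b).
        t = sum(ap[i] * d[i] for i in range(3)) / dd
        if not 0.0 <= t <= 1.0:
            continue
        foot = [a[i] + t * d[i] for i in range(3)]
        distance = math.sqrt(sum((p[i] - foot[i]) ** 2 for i in range(3)))
        if distance <= threshold:
            params.append(t)
    if not params:
        return None
    t_mean = sum(params) / len(params)  # uniform weights (assumption)
    return tuple(a[i] + t_mean * d[i] for i in range(3))


near = [(2, 0.5, 0), (6, 0.5, 0), (4, 5, 0)]
print(edge_vertex(near, a=(0, 0, 0), b=(8, 0, 0)))
```

The `None` return corresponds to the situation in which an edge has no nearby points and therefore no edge vertex.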

Next, the encoding device also generates vertices inside the node based on the point group existing in the normal direction of the plane including the plurality of edge vertices. This vertex is called a centroid vertex.

FIGS. 4 and 5 are diagrams for explaining the method of generating centroid vertices. First, the encoding device selects, for example, four points as representative points from a group of edge vertices. In the example shown in FIG. 4, edge vertices v1 to v4 are selected. Next, the encoding device calculates an approximate plane 121 passing through the four points. Next, the encoding device calculates the normal n of the approximate plane 121 and the average coordinate M of the four points. Next, the encoding device generates the centroid vertex C at the weighted average coordinates of one or more points (for example, the points included in the range 122 shown in FIG. 5) that are close to a half-line extending from the average coordinate M in the direction of the normal n.
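The steps above can be sketched as follows. Several choices here are assumptions, since the text does not fix them: the normal of the approximate plane is estimated with Newell's method (a swapped-in technique, which requires the edge vertices to be listed in order around the node), the weights are uniform, "close to the half-line" is taken to mean within a hypothetical `radius`, and M is returned when no point qualifies.

```python
import math


def newell_normal(verts):
    """Unit normal of the approximately planar polygon through verts."""
    nx = ny = nz = 0.0
    for i, (x0, y0, z0) in enumerate(verts):
        x1, y1, z1 = verts[(i + 1) % len(verts)]
        nx += (y0 - y1) * (z0 + z1)
        ny += (z0 - z1) * (x0 + x1)
        nz += (x0 - x1) * (y0 + y1)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("edge vertices are degenerate (no plane)")
    return (nx / length, ny / length, nz / length)


def centroid_vertex(edge_vertices, points, radius=1.0):
    """Average the points lying near the half-line from M along the normal n."""
    count = len(edge_vertices)
    m = tuple(sum(v[i] for v in edge_vertices) / count for i in range(3))
    n = newell_normal(edge_vertices)
    selected = []
    for p in points:
        mp = tuple(p[i] - m[i] for i in range(3))
        s = sum(mp[i] * n[i] for i in range(3))  # signed distance along n
        if s < 0.0:
            continue  # behind M, so not on the half-line side
        off = tuple(mp[i] - s * n[i] for i in range(3))
        if math.sqrt(sum(c * c for c in off)) <= radius:
            selected.append(p)
    if not selected:
        return m  # fallback chosen for this sketch
    return tuple(sum(p[i] for p in selected) / len(selected) for i in range(3))


v = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]
cloud = [(2, 2, 1), (2, 2, 3), (2, 2, -2), (0, 0, 1)]
print(centroid_vertex(v, cloud))
```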

Next, the encoding device entropy encodes vertex information, which is information on the edge vertices and centroid vertices, and stores the encoded vertex information in a geometry data unit (hereinafter referred to as GDU) included in the bitstream. Note that the GDU includes information indicating the pruned octree in addition to the vertex information.

FIG. 6 is a diagram showing an example of vertex information. Through the above processing, the point group 102 is converted into vertex information 123, as shown in FIG. 6.

Next, the decoding process of the bitstream generated above will be explained. First, the decoding device decodes the GDU from the bitstream and obtains vertex information. Next, the decoding device connects the vertices to generate a TriSoup-Surface, which is a group of triangles.

FIG. 7 is a diagram showing an example of a TriSoup surface. In the example shown in FIG. 7, four edge vertices v1 to v4 and a centroid vertex C are generated based on the vertex information. Further, triangles 131 (the TriSoup surface), each having the centroid vertex C and two edge vertices as vertices, are generated. For example, each pair of edge vertices on two adjacent edges is selected, and a triangle 131 having the selected pair and the centroid vertex as vertices is generated.
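The triangle generation can be sketched as a fan around the centroid vertex. This sketch assumes that the edge vertices are already listed in order around the node, so that consecutive entries lie on adjacent edges; how that ordering is obtained is not covered here, and the function name `trisoup_triangles` is hypothetical.

```python
def trisoup_triangles(edge_vertices, centroid):
    """Return one triangle per pair of consecutive edge vertices.

    Each triangle is (centroid, v_i, v_{i+1}), with the last vertex paired
    with the first so that the fan closes around the centroid.
    """
    n = len(edge_vertices)
    return [
        (centroid, edge_vertices[i], edge_vertices[(i + 1) % n])
        for i in range(n)
    ]


v1, v2, v3, v4 = (0, 0, 2), (4, 0, 2), (4, 4, 2), (0, 4, 2)
c = (2, 2, 3)
for triangle in trisoup_triangles([v1, v2, v3, v4], c):
    print(triangle)
```

With four edge vertices this yields four triangles, matching the configuration described for FIG. 7.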

FIG. 8 is a diagram for explaining the point cloud restoration process. By performing the above processing for each leaf node, a three-dimensional model in which the object is represented by a plurality of triangles 131 is generated, as shown in FIG. 8.

Next, the decoding device restores the position information of the point group 133 by generating points 132 at regular intervals on the surface of the triangle 131.
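One simple way to place points at regular intervals on a triangle is a barycentric lattice, sketched below. The text does not specify the sampling scheme, so the lattice, the `steps` parameter, and the function name `sample_triangle` are assumptions made for illustration only.

```python
def sample_triangle(a, b, c, steps):
    """Return points on triangle a-b-c arranged on a regular barycentric lattice.

    The triangle is covered by (steps + 1) * (steps + 2) / 2 points, including
    the three corners.
    """
    points = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            u, v = i / steps, j / steps  # u + v <= 1 keeps the point inside
            points.append(
                tuple(a[k] + u * (b[k] - a[k]) + v * (c[k] - a[k]) for k in range(3))
            )
    return points


samples = sample_triangle((0, 0, 0), (4, 0, 0), (0, 4, 0), steps=2)
print(len(samples), samples)
```

A decoder would typically round such samples to the coordinate grid of the point group and discard duplicates; that step is omitted here.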

[Slice boundary processing]
Hereinafter, a case will be described in which a point group is divided into a plurality of slices and encoded using the TriSoup method. In such a case, if the slice width is not an integral multiple of the leaf node width, points may not be restored at the slice boundaries. Specifically, when a point group extends across adjacent first and second slices, a leaf node that belongs to the first slice and spans the first and second slices does not contain the points located in the second slice, so a blank area occurs inside the node.

On an edge of the node that is in contact with the blank area, there are no points close to the edge, so an edge vertex cannot be generated for that edge. Alternatively, if an edge vertex is generated in this state, the distance between the points and the edge is large, so the vertex position does not reflect the actual point distribution, and the accuracy of the decoded point group deteriorates.

FIG. 9 is a diagram showing an example where a point group extending in the vertical direction is divided into slices. FIG. 10 is a diagram showing an example of vertices generated in this case. In the example shown in FIG. 10, the slice is divided from the bottom by nodes having a specified width. In this case, node 1 at the top of the slice includes a blank area. As a result, edge vertex v2 should be generated at the upper edge, but edge vertex v2 cannot be generated because there is no point group near the edge. Note that the slice boundaries shown in FIGS. 9 and 10 are boundaries of bounding boxes of slices.

FIG. 11 is a diagram showing an example of the TriSoup surface that should originally be generated. FIG. 12 is a diagram showing an example of a TriSoup surface when no edge vertices are generated as shown in FIG. 10. As shown in FIG. 12, if edge vertices v2 and v3 are not generated on the upper edge, a TriSoup surface is generated only with vertices v1 and v4 on the lower edge and the centroid vertex C. As a result, the TriSoup surface that should originally traverse the node is generated only at the bottom of the node.

FIG. 13 is a diagram showing an example of the restored point group in this case. As shown in FIG. 13, since points are not restored in areas where there is no TriSoup surface, holes are created across the restored point group, as in area 201 shown in FIG. 13.

In this embodiment, the width of a node located at the edge of the bounding box of a slice is set to a width different from the specified width. This suppresses the occurrence of blank areas within nodes and makes it possible to generate the edge vertices that could not be generated in the situation described above.

FIG. 14 is a diagram showing an example of vertices generated in this case. FIG. 15 is a diagram showing an example of a TriSoup surface in this case. As shown in FIG. 14, a node 1 having a non-specified width is provided at the end of the slice. As a result, the blank area within the node 1 can be eliminated and the edge vertex v2 can be generated. Therefore, as shown in FIG. 15, a TriSoup surface that traverses the inside of the node can be generated, which eliminates the transverse holes in the restored point group.

Additionally, the encoding device stores, for example, information indicating the slice width in the GDU header as adjustment width information that is information for calculating the adjustment width of this non-standard width node.

Here, a non-specified width node is a node in which the length of a side along at least one of the length, width, and height is different from the specified length (specified width). The non-specified width node has a rectangular parallelepiped or cubic shape that is different from the cubic shape of the specified width. Further, the adjustment width information is information for adjusting the specified length of the side of the node to an adjustment width of a side length different from the specified width (non-specified width). For example, the adjustment width information may indicate the length of the adjustment width itself, may indicate the difference between the specified width and the adjustment width, or may indicate the ratio between the specified width and the adjustment width.

For example, the adjustment width (non-specified width) of a non-specified width node is expressed as min (slice width - node position, specified width). In other words, the non-specified width of a certain node is set to the minimum value between (slice width - node position) and the specified width. Here, the node position is the position (coordinates) of the corner of the node that is closest to the origin, as shown in FIG.

With the above, edge vertices can also be generated at the end nodes of the bounding box of the slice. As a result, the trisoap surface can be arranged without interruption, which eliminates the occurrence of holes in the restored point cloud.

Note that as another method, for example, the following method may be used. Note that the main focus here is on solving problems in encoding processing and standards.

The point group included in the first slice is also included in the leaf node of the second slice. In other words, the first slice and the second slice have overlapping point groups.

The point group is divided into slices by matching the boundary coordinates between the first slice and the second slice with the leaf node width. In other words, nodes with blank spaces due to slice division are not generated.
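The second alternative above, aligning the slice boundary with the leaf node width, can be sketched as follows. Rounding to the nearest multiple is an assumption of this sketch (the text only requires that the boundary match the leaf node width), and the function names `snap_boundary` and `split_by_boundary` are hypothetical.

```python
def snap_boundary(boundary, leaf_width):
    """Move a slice boundary to the nearest multiple of the leaf node width."""
    return (boundary + leaf_width // 2) // leaf_width * leaf_width


def split_by_boundary(points, boundary, axis=2):
    """Split points into a first slice (below the boundary) and a second slice."""
    first = [p for p in points if p[axis] < boundary]
    second = [p for p in points if p[axis] >= boundary]
    return first, second


boundary = snap_boundary(100, leaf_width=32)
first, second = split_by_boundary([(0, 0, 10), (0, 0, 97), (0, 0, 120)], boundary)
print(boundary, first, second)
```

Because every slice then spans a whole number of leaf nodes, no leaf node is cut by a slice boundary.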

[Syntax]
Transmission information transmitted from the encoding device to the decoding device in order to implement the above method will be described below. For example, transmission information is stored in a bitstream.

FIG. 16 is a diagram showing an example of this transmission information. As shown in FIG. 16, the GDU header included in the bitstream includes a non-standard width processing flag and slice width information. The non-standard width processing flag is information indicating whether or not to set the above-mentioned non-standard width node. For example, a value of 1 indicates that a non-standard width node is set, and a value of 0 indicates that a non-standard width node is not set. The slice width information indicates the slice width (the width of the bounding box of the slice).

FIG. 17 is a diagram showing an example of the syntax of the GDU header (geometry_data_unit_header). As shown in FIG. 17, slice width information is stored in the GDU header only when the non-standard width processing flag has a value of 1 (true). Further, the slice width information includes, for example, information indicating the width of each slice in the x, y, and z axes.

The slice width information is information for each slice, and here, the non-standard width processing flag and slice width information are stored in the GDU header, which is a header for each slice.

Note that the non-standard width processing flag may be information indicating whether slice width information is included in the bitstream. Further, the names of flags, information, etc. shown in this embodiment are merely examples, and any names can be used.
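As a minimal sketch of the conditional storage described above, the following Python code reads the flag and, only when the flag is 1, the slice widths for the x, y, and z axes. The class name GduHeaderFields, the function name read_gdu_header_fields, the field names, and the use of an iterator of already-decoded integers in place of a real bitstream reader are all assumptions made for illustration; the actual binarization is defined by the syntax of FIG. 17, which is not reproduced here. The widths 80 and 60 are arbitrary illustrative values.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class GduHeaderFields:
    # Hypothetical container; the field names are illustrative only.
    nonstandard_width_flag: int
    slice_width_xyz: Optional[Tuple[int, int, int]] = None


def read_gdu_header_fields(values: Iterator[int]) -> GduHeaderFields:
    """Read the flag, then the x, y, z slice widths only if the flag is 1."""
    flag = next(values)
    if flag == 1:
        width_x = next(values)
        width_y = next(values)
        width_z = next(values)
        return GduHeaderFields(flag, (width_x, width_y, width_z))
    # Flag is 0: no slice width information is present in the header.
    return GduHeaderFields(flag)


with_widths = read_gdu_header_fields(iter([1, 100, 80, 60]))
without_widths = read_gdu_header_fields(iter([0]))
```

With these inputs, the first header carries three widths and the second carries none, which mirrors the condition that slice width information is present only when the flag is true.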

FIG. 18 is a diagram showing an example of setting the adjustment width of the non-standard width node. As described above, the adjustment width is expressed as min(slice width - node position, specified width). FIG. 18 shows an example of setting the adjustment width in the x-axis direction. In this example, the adjustment width = min(100 - 96, 32) = 4. Note that the calculation can be made in the same manner for the y-axis and z-axis directions.
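The calculation above can be sketched in Python as follows. The function name adjustment_width is hypothetical, and the node positions 0, 32, and 64 are assumed for illustration (only the position 96 of the end node, the slice width 100, and the specified width 32 are taken from the example).

```python
def adjustment_width(slice_width: int, node_pos: int, specified_width: int) -> int:
    """Node width along one axis, clipped at the far edge of the slice bounding box."""
    return min(slice_width - node_pos, specified_width)


# One axis of a slice of width 100 covered by nodes of specified width 32.
# Positions 0, 32 and 64 are assumed; 96 is the end node of the example.
widths = [adjustment_width(100, pos, 32) for pos in (0, 32, 64, 96)]
print(widths)  # [32, 32, 32, 4]
```

Only the last node is shortened; nodes that lie entirely inside the slice keep the specified width because the minimum selects it.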

Although an example has been described here in which the non-standard width processing flag and slice width information are stored for each slice, these pieces of information may be common to a plurality of slices. In this case, these pieces of information may be stored in a header higher than the GDU header, such as SPS or GPS. SPS (Sequence Parameter Set) is metadata (parameter set) common to multiple frames. GPS (Geometry Parameter Set) is metadata (parameter set) related to encoding of position information. For example, GPS is metadata common to multiple frames.

Additionally, a flag indicating whether this information is stored in the upper header or for each slice may be stored in the SPS or GPS. In this case, the storage location of these pieces of information is switched based on the flag.

Furthermore, since these pieces of information are for processing specific to the trisoup method, they may be stored in the bitstream only when the encoding method is the trisoup method.

[Processing flow]
The flow of processing in the encoding device and decoding device will be described below. FIG. 19 is a flowchart of encoding processing by the encoding device.

First, the encoding device generates a pruned octree, and stores octree information indicating the pruned octree in the GDU (S101). For example, the encoding device entropy encodes the octree information and stores the encoded octree information in the GDU.

Next, the encoding device determines whether the slice width (the width of the bounding box of the slice) is an integral multiple of the specified width (S102). If the slice width is not an integral multiple of the specified width (No in S102), the encoding device stores a non-standard width processing flag=1 and slice width information in the GDU header (S103).

On the other hand, if the slice width is an integral multiple of the specified width (Yes in S102), the encoding device stores a non-standard width processing flag = 0 in the GDU header (S104).
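Steps S102 to S104 can be sketched for a single axis as follows. This is only an illustrative sketch: the function name choose_width_header_fields and the dictionary keys are hypothetical, and a real encoding device would make the determination for the x, y, and z axes and write the result into the GDU header.

```python
def choose_width_header_fields(slice_width: int, specified_width: int) -> dict:
    """S102: test divisibility; S103/S104: choose the header fields to store."""
    if slice_width % specified_width != 0:
        # S103: the far edge of the slice falls inside a node,
        # so the flag is set and the slice width is signalled.
        return {"nonstandard_width_flag": 1, "slice_width": slice_width}
    # S104: node boundaries already line up with the slice boundary.
    return {"nonstandard_width_flag": 0}


fields_fractional = choose_width_header_fields(100, 32)
fields_aligned = choose_width_header_fields(96, 32)
```

For a slice width of 100 and a specified width of 32 the flag is 1 and the width is stored, whereas a slice width of 96 yields the flag 0 alone.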

Next, the encoding device performs the following steps S105 to S109 on each of the plurality of leaf nodes of the pruned octree.

First, the encoding device determines whether the non-standard width processing flag=1 (S105). If the non-standard width processing flag=1 (Yes in S105), the encoding device determines whether the target node exists at the edge of the slice (the edge of the bounding box of the slice) (S106). For example, if the target node includes a slice boundary, the encoding device determines that the target node exists at the slice edge, and if the target node does not include the slice boundary, the encoding device determines that the target node does not exist at the slice edge.

If the target node exists at the edge of the slice (Yes in S106), the encoding device calculates the adjustment width from the slice width indicated by the slice width information and the node position of the target node, and adjusts the target node to the calculated adjustment width (S107).

On the other hand, if the non-standard width processing flag = 0 (No in S105) or if the target node does not exist at the slice end (No in S106), the encoding device sets the width of the target node to the specified width (S108).

Next, the encoding device generates edge vertices on the edges of the target node and generates centroid vertices inside the target node based on the point group distribution within the target node (S109). With the above, the loop processing for the target node is completed.

After the loop processing for all leaf nodes is completed, the encoding device entropy encodes the vertex information indicating the positions of the edge vertices and centroid vertices of the plurality of leaf nodes, and stores the encoded vertex information in the GDU (S110).

Next, the encoding device generates a bitstream including the GDU header and GDU, and outputs the bitstream (S111). That is, the encoding device transmits the bitstream to the decoding device.

FIG. 20 is a flowchart of the decoding process by the decoding device. First, the decoding device obtains the GDU header and GDU from the bitstream (S121). Next, the decoding device obtains the non-standard width processing flag from the GDU header, and determines whether the non-standard width processing flag = 1 (S122).

If the non-standard width processing flag = 1 (Yes in S122), the decoding device acquires slice width information from the GDU header (S123). On the other hand, if the non-standard width processing flag=0 (No in S122), the decoding device does not acquire slice width information from the GDU header.

Next, the decoding device obtains octree information from the GDU. For example, the decoding device acquires the octree information by entropy decoding the encoded octree information included in the GDU. Next, the decoding device generates a group of leaf nodes of the pruned octree using the octree information (S124).

Next, the decoding device performs the following steps S125 to S130 on each of the plurality of leaf nodes of the pruned octree.

First, the decoding device determines whether the non-standard width processing flag=1 (S125). If the non-standard width processing flag=1 (Yes in S125), the decoding device determines whether the target node exists at the edge of the slice (the edge of the bounding box of the slice) (S126). For example, if the target node includes a slice boundary, the decoding device determines that the target node exists at the slice edge, and if the target node does not include the slice boundary, the decoding device determines that the target node does not exist at the slice edge.

If the target node exists at the edge of the slice (Yes in S126), the decoding device calculates the adjustment width from the slice width indicated by the slice width information and the node position of the target node, and adjusts the target node to the calculated adjustment width (S127).

On the other hand, if the non-standard width processing flag is not 1 (No in S125), or if the target node does not exist at the edge of the slice (No in S126), the decoding device sets the width of the target node to the specified width (S128).
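The branch structure of steps S125 to S128 can be sketched for one axis as follows. The function name leaf_width is hypothetical, and the slice-edge test (the slice boundary lies strictly inside the node) is one possible reading of "the target node includes a slice boundary"; other tests may be used.

```python
from typing import Optional


def leaf_width(flag: int, slice_width: Optional[int],
               node_pos: int, specified_width: int) -> int:
    """Return the width of one leaf node along one axis."""
    # S125: is non-standard width processing enabled?
    # S126: does the slice boundary fall strictly inside this node?
    if flag == 1 and node_pos < slice_width < node_pos + specified_width:
        # S127: shorten the node so that it ends at the slice boundary.
        return slice_width - node_pos
    # S128: keep the specified width.
    return specified_width


edge_node = leaf_width(1, 100, 96, 32)       # boundary 100 lies inside [96, 128)
inner_node = leaf_width(1, 100, 64, 32)      # node lies inside the slice
flag_off = leaf_width(0, None, 96, 32)       # no slice width was signalled
```

Because the flag is tested first, the slice width is never consulted when it has not been acquired from the header.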

Next, the decoding device obtains vertex information indicating the positions of edge vertices and centroid vertices from the GDU (S129). For example, the decoding device acquires vertex information by entropy decoding encoded vertex information included in the GDU.

Next, the decoding device generates a triangle group by connecting a plurality of vertices indicated by the vertex information (S130). With the above, the loop processing for the target node is completed.

After completing the loop processing for all leaf nodes, the decoding device generates a decoded point group by generating points at regular intervals on the surfaces of multiple triangles (S131).

[Modified example]
In the above description, an example has been described in which the non-standard width node is generated at the end of the slice (the right end in FIG. 18), but the position where the non-standard width node is generated is not limited to this. FIGS. 21 to 24 are diagrams showing examples of setting non-standard width nodes.

For example, as shown in FIG. 21, the position of the non-standard width node may be in the middle of the slice. Alternatively, as shown in FIG. 22, the position of the non-standard width node may be at the beginning of the slice.

Furthermore, as shown in FIGS. 23 and 24, a plurality of non-standard width nodes may be set. For example, as shown in FIG. 23, two non-standard width nodes may be set at both ends of the slice. Alternatively, as shown in FIG. 24, two non-standard width nodes may be set at the end of the slice and at the middle of the slice.

In this way, when setting a plurality of non-standard width nodes, the sum of the widths of the plurality of non-standard width nodes only needs to be the above-mentioned adjustment width.

Based on these considerations, the following can be used as a method for calculating the adjustment width and as non-standard width node information, which is information for non-standard width nodes, that is stored in the bitstream.

As shown in FIGS. 21 and 22, when the non-specified width node is placed at a location other than the end of the slice, the non-specified width node information includes adjustment width information indicating the adjustment width and insertion position information indicating the insertion position of the non-specified width node. For example, in the example shown in FIG. 21, the insertion position information indicates the bit string "4b0010". Furthermore, in the example shown in FIG. 22, the insertion position information indicates the bit string "4b1000". Note that each bit in the bit string corresponds to one node; the bit corresponding to a non-standard width node is set to 1, and the other bits are set to 0.

Note that, as another method, the insertion position information may indicate information that specifies a non-standard width node. For example, a serial number (identifier) may be assigned to each node, and the insertion position information may indicate the serial number of a non-standard width node. For example, a serial number may be set for each axis within a slice. In this case, in the example shown in FIG. 21, serial numbers 1, 2, 3, and 4 are set in order from the left node, and the insertion position information indicates the value 3. Note that the serial number may be a serial number for all nodes within a slice.

In this case, the decoding device can use the insertion position information to determine whether the target node is a non-standard width node. Further, the adjustment width can be calculated using the same method as shown in FIG.

On the other hand, as shown in FIGS. 23 and 24, when non-standard width nodes are arranged at multiple positions within a slice, the insertion position information indicates the individual insertion positions of the multiple non-standard width nodes. For example, in the example shown in FIG. 23, the insertion position information indicates the bit string "5b1001". In the example shown in FIG. 24, the insertion position information indicates the bit string "5b10010". Note that an example has been shown here in which one bit string is used for one axis (the x-axis in this example), that is, an example in which a different bit string is used for each axis, but a single bit string may instead be used for all nodes in the slice.
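As an illustrative sketch, a bit string of the form used above could be turned into the list of non-standard width nodes as follows. The function name nonstandard_node_indices is hypothetical, and the sketch assumes that the number before "b" is the number of nodes on the axis and that the bits are listed from the left node; both are readings of the examples, not definitions taken from the syntax.

```python
from typing import List


def nonstandard_node_indices(bit_string: str) -> List[int]:
    """Return 0-based indices of nodes whose bit is 1, e.g. '4b0010' gives [2]."""
    count, bits = bit_string.split("b", 1)
    if int(count) != len(bits):
        raise ValueError("declared length does not match the number of bits")
    return [index for index, bit in enumerate(bits) if bit == "1"]


middle = nonstandard_node_indices("4b0010")    # third node from the left
leading = nonstandard_node_indices("4b1000")   # first node
multiple = nonstandard_node_indices("5b10010") # first and fourth nodes
```

A 0-based index of 2 corresponds to the serial number 3 when serial numbers are counted from 1 starting at the left node.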

Furthermore, in the examples shown in FIGS. 23 and 24, the adjustment width information indicates the adjustment width of each of the two non-standard width nodes. Specifically, the adjustment width information indicates a value 2, which is the adjustment width of the first non-standard width node, and a value 2, which is the adjustment width of the second non-standard width node.

Alternatively, the adjustment width information may indicate the total adjustment width of the two non-standard width nodes and the adjustment width of one of them. Even in this case, the decoding device can calculate the adjustment widths of the two non-standard width nodes from the adjustment width information. In this case, for example, a rule may be determined in advance that the adjustment width information of the last non-standard width node in a row of non-standard width nodes on one axis is omitted, and the decoding device may calculate the adjustment widths of the two non-standard width nodes from the adjustment width information according to this rule. Further, the non-standard width node information may be composed of individual information of all non-standard width nodes within a slice, instead of information for each axis.

For example, in the example shown in FIG. 24, if the adjustment width information of the fourth node is omitted, the adjustment width of the fourth node is found as the total adjustment width (4) - the adjustment width of the first node (2) = 2.
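The derivation of the omitted adjustment width can be sketched as follows, assuming the predetermined rule is that only the last non-standard width node on the axis is omitted. The function name complete_adjustment_widths is hypothetical.

```python
from typing import List


def complete_adjustment_widths(total_width: int, signalled: List[int]) -> List[int]:
    """Append the omitted last width, derived as the total minus the signalled widths."""
    omitted = total_width - sum(signalled)
    return signalled + [omitted]


# Total adjustment width 4, first non-standard width node signalled as 2.
widths = complete_adjustment_widths(4, [2])
print(widths)  # [2, 2]
```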

Additionally, the starting position of the bounding box of a slice may have an offset from the slice boundary. FIG. 25 is a diagram showing an example of setting the non-standard width node in this case. The example shown in FIG. 25 is an example in which the node at the end of the slice is set to a non-standard width node.

In this case, the non-standard width node information includes the amount of offset from the origin coordinates to the start position of the bounding box of the slice. The decoding device can calculate the adjustment width using this offset amount.

Specifically, the adjustment width is found as min (slice width (100) - node position (121) + offset amount (25), specified width) = min (4, 32) = 4.

Additionally, the decoding device may use octree information to determine whether the target node is located at the slice end. For example, for a certain coordinate axis, by referring to the occupancy code and continuing to trace only the side closest to the origin or the side farthest from the origin, starting from the root node with depth=0, the slice end node can be determined. In this case, the non-standard width node information includes information indicating the fractional node width, which is the adjusted width of the node at the end of the slice. The decoding device uses this information to set the adjustment width of the node at the edge of the slice to the fractional node width.

FIG. 26 is a diagram showing an example of setting the non-standard width node in this case. In this example, the fractional node width=4, and the decoding device sets the width of the node to 4 if the target node is the slice end node, and otherwise sets the width of the node to 32.

Additionally, the non-specified width may be a value larger than the specified width. For example, in the example shown in FIG. 21, the third node and the fourth node may be combined. That is, three nodes whose widths are 32, 32, and 36 from the left may be set.

Further, as a result of adopting non-standard widths for all three axes of the x-axis, y-axis, and z-axis for a certain node, the node may become a cube.

Alternatively, instead of the slice width information, the position information of the corner of the node may be transmitted so that the size of the non-standard width node can be restored. For example, the position information may be coordinate information of two corners in the direction of the non-standard width, or may be coordinate information of all eight corners.

The encoding device may quantize the above-mentioned non-standard width node information for calculating the adjustment width, and transmit the quantized information to the decoding device. In this case, the decoding device may perform the above processing by dequantizing the quantized information and using the obtained non-standard width node information. In this case, a case may occur in which the width of the blank area within the non-standard width node does not become completely zero. On the other hand, by performing quantization, the amount of data to be transmitted can be reduced.

Information on non-standard width nodes may be transmitted by combining each of the above ideas.

Note that the default size (specified width) of the node is determined depending on the size of the bounding box to be octree encoded (the bit depth of the original data) and the depth to which the octree division proceeds. This specified width is expressed, for example, as a power of two.

In addition, in the above explanation, a non-standard width node is a node in which the length of a side along at least one of the length, width, and height differs from the specified width; the length in only one direction, or in only two directions, may be different from the specified width. That is, the non-standard width node may be a rectangular parallelepiped.

Furthermore, in the above description, four edge vertices existing on four parallel edges of a node are found, but the number of edge vertices to be found is not limited to four. The number of edge vertices is not particularly limited as long as an approximate plane is obtained.

Furthermore, the method of finding the centroid apex is not limited to the above method. As long as the decoding device can determine the plane of the triangle, the centroid vertices may be determined using other methods.

Furthermore, in the above description, the trisoup method was used as the compression method, but the method of this embodiment is also effective for compression methods other than the trisoup method. In other words, the method of this embodiment is effective for any compression method that approximates a point group using a plane or curved surface within a node and that requires edge vertices to generate the plane or curved surface.

Furthermore, in the above description, the non-standard length of the side of the non-standard width node is determined, but it is not essential to determine the non-standard length. It is only necessary to find the shape of the non-standard width node; for example, the positions of the two ends of the side with the non-standard length may be found. That is, the position of the non-standard width node may be determined.

[How to adjust the origin side]
In FIG. 25, a case has been described in which the starting position of the bounding box of a slice does not coincide with the origin. The details of this case will be explained below.

The origin of the slice is given by an offset amount that is not constrained, and the starting position of the bounding box of the slice does not necessarily match the origin. Therefore, there may be a case where the point group of the slice after subtracting this offset amount is distributed away from the origin of the encoding coordinate system. That is, there are cases where the origin of the encoding coordinate system and the origin of the bounding box of the slice are different. Furthermore, in this case, the boundaries of the bounding box of the slice may not coincide with the boundaries of the leaf nodes. In this case, it is necessary to generate non-standard width nodes on both the origin side of the bounding box of the slice and the side of the bounding box far from the origin.

The encoding device transmits to the decoding device the slice position, which is the start coordinate of the bounding box of the slice, and the slice width, which is the width of the bounding box, as information that the decoding device uses to calculate the node position and node width (adjustment width) of these non-standard width nodes.

FIG. 27 is a diagram showing an example of setting the non-standard width node in this case. In FIG. 27, the encoded coordinate system is expressed one-dimensionally. Furthermore, the shaded area is an area (slice) in which the point group is distributed. Further, the origin shown in FIG. 27 is the origin of the encoding coordinate system.

Here, if the specified width of the node is W, the slice position is A, the slice width is B, and the original node position is nodePos, the adjusted node position newNodePos and the adjusted node width newNodeWidth are calculated as follows.

newNodePos=(nodePos<A)? A: nodePos;
newNodeWidth=(nodePos<A)? (W-(A-nodePos)):min(A+B-nodePos+1,W)

According to the above, the node position is adjusted to A if nodePos<A, otherwise it is not adjusted. That is, the node position of node 1, which is the leading node, is changed from P1 to A, and the node positions of the other nodes are not changed.

Further, if nodePos<A, the node width is adjusted to (W-(A-nodePos)), otherwise it is set to min(A+B-nodePos+1, W). That is, the node width of node 1 is adjusted to W1=W-(A-nodePos)=W-(A-P1). The node width of node 2, which is the end node, is set to W2=min(A+B-nodePos+1, W)=A+B-P2+1. The node widths of other nodes are set to W. The reason why "+1" is used here is that there is a relationship: node width=width of the inclusive point coordinates+1. FIG. 28 is a diagram showing the relationship between the node width and the width of the inclusive point coordinates. For example, when W=8, A=0, B=5, and nodePos=0, newNodeWidth=6 as shown in FIG. 28.
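The two formulas above can be transcribed into Python as a minimal sketch for one axis. The function name adjust_node is hypothetical; the parameter names W, A, and B follow the definitions above.

```python
from typing import Tuple


def adjust_node(node_pos: int, W: int, A: int, B: int) -> Tuple[int, int]:
    """Return (newNodePos, newNodeWidth) for one node along one axis."""
    if node_pos < A:
        # The node starts before the slice: move its start to A
        # and shorten it by the skipped amount.
        return A, W - (A - node_pos)
    # Otherwise keep the position and clip the width at the far slice edge.
    # The "+1" follows: node width = width of the inclusive point coordinates + 1.
    return node_pos, min(A + B - node_pos + 1, W)


# Example values from the text: W=8, A=0, B=5, nodePos=0.
new_pos, new_width = adjust_node(0, 8, 0, 5)
print(new_pos, new_width)  # 0 6
```

For nodes that lie entirely inside the slice, the minimum selects W, so a single function covers the leading node, the end node, and the unadjusted nodes.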

Furthermore, when the non-standard width processing flag included in the header in the bitstream is 1, the decoding device obtains A and B from the transmission information, and uses the above formulas to calculate the adjusted node positions and adjusted node widths of nodes 1 and 2 from the initial node position and initial node width of each node.

Note that, as shown in FIGS. 21 to 24, in this case as well, the position of the non-standard width node may be located at the beginning of the slice, in the middle of the slice, at both ends of the slice, or at multiple positions.

Furthermore, when non-standard width nodes are provided at multiple positions on the same axis, the total node width may be the above-mentioned adjustment width.

Based on these considerations, the following can be used as the calculation method and the non-standard width node information, which is information for non-standard width nodes, that is stored in the bitstream.

FIG. 29 is a diagram showing an example of setting a non-standard width node. As shown in FIG. 29, suppose that the start position of the bounding box of the slice has an offset from the origin within the frame of the slice boundary, and that the slice width (95) is not a multiple of the specified width (32) of the node. In this case, if the encoding device transmits the offset amount from the origin to the start position of the bounding box to the decoding device, the decoding device can calculate the adjusted position and adjustment width of node 1 on the origin side.

Furthermore, if the encoding device transmits the slice width to the decoding device, the decoding device can calculate the adjustment width of the node 2 on the far side from the origin.

Specifically, here, if the node position before adjustment of node 1 on the origin side is 32 as shown in FIG. 29, then the adjusted position of node 1 = (32<37)? 37:32=37. Also, adjustment width of node 1 = (32<37)? (32-(37-32)): min(37+95-32+1, 32)=27.

Similarly, the adjustment width of node 2 on the side far from the origin = (128<37)? (32-(37-128)): min(37+95-128+1, 32)=5.

In addition to the information for calculating from the positional relationship between the node position and the slice bounding box, the non-standard width node information may include information specifying a non-standard width node and information indicating the adjustment position and adjustment width of the specified node. The information specifying the non-standard width node indicates, for example, a serial number assigned to the node.

FIG. 30 is a diagram showing an example of setting the non-standard width node in this case. In the example shown in FIG. 30, 0 to 3 are assigned as node numbers, which are serial numbers of nodes. Assuming that the adjustment position and adjustment width of the node with node number = 0 are P0 and W0, and the adjustment position and adjustment width of the node with node number = 3 are P1 and W1, in the example shown in FIG. 30, P0=37, W0=27, P1=128, and W1=5, and these pieces of information are transmitted.
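As a sketch of how explicitly signalled values of this kind might be applied, the following Python code overrides the position and width of the specified nodes and leaves the others unchanged. The function name apply_explicit_adjustments and the dictionary layout are hypothetical, and the unadjusted positions 64 and 96 of node numbers 1 and 2 are assumed for illustration.

```python
from typing import Dict, Tuple

NodeGeometry = Tuple[int, int]  # (position, width) along one axis


def apply_explicit_adjustments(
    nodes: Dict[int, NodeGeometry],
    signalled: Dict[int, NodeGeometry],
) -> Dict[int, NodeGeometry]:
    """Replace the geometry of every node number that has signalled values."""
    return {number: signalled.get(number, geometry)
            for number, geometry in nodes.items()}


# Nodes of specified width 32; positions of nodes 1 and 2 are assumed.
initial = {0: (32, 32), 1: (64, 32), 2: (96, 32), 3: (128, 32)}
# Signalled (P0, W0) for node number 0 and (P1, W1) for node number 3.
signalled = {0: (37, 27), 3: (128, 5)}
adjusted = apply_explicit_adjustments(initial, signalled)
```

In this approach the decoding device does not evaluate any formula; it only looks up the node number, at the cost of transmitting one position and one width per non-standard width node.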

In addition, although a method for determining a non-standard width node and calculating its position and width has been explained with reference to FIG. 29, any of the methods explained in FIGS. may also be used.

Furthermore, the transmission information transmitted from the encoding device to the decoding device to implement this method is shown below. FIG. 31 is a diagram illustrating an example of the syntax of GPS (geometry_parameter_set) and GDU header.

The GPS includes a first non-standard width processing flag and a second non-standard width processing flag. The first non-standard width processing flag is information indicating whether or not to perform the above-described node position and node width adjustment on the node located at the end of the slice closer to the origin. For example, a value of 1 indicates that the adjustment is performed, and a value of 0 indicates that the adjustment is not performed.

The second non-standard width processing flag is information indicating whether or not to perform the above-described node width adjustment on the node located at the slice end on the side far from the origin. For example, a value of 1 indicates that the adjustment is performed, and a value of 0 indicates that the adjustment is not performed.

When the first non-standard width processing flag is 1, the GDU header includes first bit length information, a first quantization parameter, and slice position information.

The slice position information indicates the slice position, which is the position (coordinates) of the bounding box of the slice, and for example, indicates the three-dimensional coordinates (x, y, z coordinates) of the corner of the bounding box of the slice that is closest to the origin.

The first bit length information indicates the bit length of slice position information. The first quantization parameter indicates a quantization parameter (quantization value) used for quantizing slice position information.

When the second non-standard width processing flag is 1, the GDU header includes second bit length information, a second quantization parameter, and slice width information.

The slice width information indicates the slice width, which is the width of the bounding box of the slice. For example, the slice width information indicates the width of the bounding box in each of the x, y, and z directions.

The second bit length information indicates the bit length of the slice width. The second quantization parameter indicates a quantization parameter used for quantizing slice width information.

Here, the slice position is represented by slice position information<<first quantization parameter, and the slice width is represented by slice width information<<second quantization parameter.
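The left-shift reconstruction above can be sketched as follows. The function dequantize follows the expression in the text; the function quantize, which uses a right shift on the encoding side, is an assumption made only so that the example is self-contained, since the text specifies only the decoder-side reconstruction. Both function names are hypothetical.

```python
def quantize(value: int, qp: int) -> int:
    """Encoder side (assumed): drop the qp least significant bits."""
    return value >> qp


def dequantize(coded_value: int, qp: int) -> int:
    """Decoder side: coded value << quantization parameter."""
    return coded_value << qp


exact = dequantize(quantize(100, 2), 2)  # 100 is a multiple of 4
lossy = dequantize(quantize(95, 2), 2)   # 95 is not a multiple of 4
print(exact, lossy)  # 100 92
```

When the original value is not a multiple of 2 to the power of the quantization parameter, the reconstructed value differs from the original, so the adjusted node may not end exactly at the slice boundary; in exchange, fewer bits are needed for the signalled value.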

Here, an example has been described in which these pieces of information are stored for each slice, but if these pieces of information are common to multiple slices, they may be stored in a header higher than the GDU header, such as the SPS or GPS. Further, a flag indicating whether this information is stored in the upper header or for each slice may be stored in the SPS or GPS. In this case, the storage location of these pieces of information is switched based on the flag. Furthermore, this information may be provided individually for each axis of x, y, and z.

Furthermore, the encoding device does not need to transmit the first non-standard width processing flag and the second non-standard width processing flag shown in FIG. 31. In this case, when the first quantization parameter = 0, the decoding device does not perform the above-mentioned adjustment of the node position and node width on the node located at the edge of the slice closer to the origin; in other cases, it performs the adjustment. Furthermore, when the second quantization parameter = 0, the decoding device does not perform the above-mentioned node width adjustment on the node located at the edge of the slice far from the origin; in other cases, it performs the adjustment.

Furthermore, since the slice position information and the slice width information are numerical values in the encoded coordinate system, they may be determined to be positive values.

[Omission of transmission and processing]
In the above description, it is necessary to always store slice width information and slice position information in the header for a point group to be encoded, and to recalculate node positions and widths for all nodes of all slices. In reality, the start end of a slice to be encoded may coincide with the origin, or the end end of a slice may happen to coincide with the end end of a node. In this case, there is no need to adjust the node position or node width for the start end node or end end node of the slice, so the transmission of information for this adjustment process and the recalculation of the node position and node width can be omitted.

The encoding device performs this omission determination and transmits information indicating the determination result to the decoding device. A dedicated flag may be used for this transmission, or the first bit length information and second bit length information described above may be used. Specifically, when the first bit length information indicates 0, it means that transmission of slice position information is omitted, and when the second bit length information indicates 0, it means that transmission of slice width information is omitted. This allows the amount of header data to be suppressed and the processing time for encoding processing to be shortened.

FIG. 32 is a diagram showing an example of a slice and a node when the starting end of the slice coincides with the origin. In this case, there is no need to adjust the starting end.

FIG. 33 is a diagram showing an example of a slice and a node in a case where the start end of the slice coincides with the origin and the end end of the slice coincides with the end end of the node. In this case, adjustment processing is not required for both the start end and the end end.

FIG. 34 is a diagram showing an example of processing (syntax) when the above omission is performed. slice_bb_pos_bits shown in the figure is first bit length information indicating the bit length of slice position information. slice_bb_width_bits is second bit length information indicating the bit length of slice width information. Also, A is the slice position, B is the slice width, nodePos is the node position (node position before adjustment), nodeWidth is the node width (node width before adjustment), and newNodeWidth is the node width after adjustment. W is the specified width of the node.

In this process, the adjustment process is performed only when the bit length of the slice position information is a value greater than 0, so the processing time of the encoding process can be reduced. Furthermore, if the start end of a slice coincides with the origin or the end end coincidentally coincides with the end end of a node, the amount of data in the header can be reduced.

FIG. 35 is a diagram showing an example of the syntax of the GPS and GDU headers in this case. The syntax shown in FIG. 35 differs from the syntax shown in FIG. 31 in that two conditions are added: the first quantization parameter and slice position information are stored in the GDU header only when the first bit length information is greater than 0, and the second quantization parameter and slice width information are stored in the GDU header only when the second bit length information is greater than 0.

By adding this condition, the amount of data in the header can be reduced when the start end of the slice matches the origin or the end end matches the end end of the node.
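The conditional storage described for FIG. 35 can be illustrated with a minimal sketch of a header reader. This is not the actual GDU syntax: the `BitReader` class, the function `parse_gdu_slice_fields`, the dictionary keys, the 4-bit width `QP_BITS` assumed for each quantization parameter, and the single-axis treatment are all hypothetical choices made for illustration. Only the two conditions (read the position fields when the first bit length is greater than 0, read the width fields when the second bit length is greater than 0) come from the description.

```python
class BitReader:
    """Hypothetical reader over a string of '0'/'1' characters."""

    def __init__(self, bits: str):
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        """Read n bits as an unsigned integer."""
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise ValueError("bitstream exhausted")
        self.pos += n
        return int(chunk, 2) if n > 0 else 0


QP_BITS = 4  # assumed width of each quantization parameter (not from the source)


def parse_gdu_slice_fields(reader, slice_bb_pos_bits, slice_bb_width_bits):
    """Read the slice fields for one axis; omitted fields stay None."""
    fields = {"pos_qp": None, "slice_pos": None,
              "width_qp": None, "slice_width": None}
    # Position fields are present only when the first bit length is > 0.
    if slice_bb_pos_bits > 0:
        fields["pos_qp"] = reader.read(QP_BITS)
        fields["slice_pos"] = reader.read(slice_bb_pos_bits)
    # Width fields are present only when the second bit length is > 0.
    if slice_bb_width_bits > 0:
        fields["width_qp"] = reader.read(QP_BITS)
        fields["slice_width"] = reader.read(slice_bb_width_bits)
    return fields
```

When both bit lengths are 0, the reader consumes no bits at all for these fields, which is where the header saving comes from.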

FIG. 36 is a flowchart of the omission determination process of the start end adjustment process (slice position transmission). The encoding device determines whether the start position of the slice (coordinates of the start end of the bounding box of the slice) matches the origin (S201).

If the start position of the slice matches the origin (Yes in S201), the encoding device does not transmit slice position information (S202). That is, the encoding device stores slice_bb_pos_bits=0 in the bitstream and does not store slice position information in the bitstream.

On the other hand, if the start position of the slice does not match the origin (No in S201), the encoding device transmits slice position information (S203). That is, the encoding device stores slice_bb_pos_bits set to a value greater than 0 and slice position information in the bitstream.

Note that the above determination may be performed only when the first non-standard width processing flag is 1. Further, when the first non-standard width processing flag is 0, slice position information is not transmitted.

FIG. 37 is a flowchart of the omission determination process of the end end adjustment process (slice width transmission). The encoding device determines whether the end position of the slice (coordinates of the end of the bounding box of the slice) matches the end of the node (S211).

If the end position of the slice matches the end end of the node (Yes in S211), the encoding device does not transmit slice width information (S212). That is, the encoding device stores slice_bb_width_bits=0 in the bitstream and does not store slice width information in the bitstream.

On the other hand, if the end position of the slice does not match the end of the node (No in S211), the encoding device transmits slice width information (S213). That is, the encoding device stores slice_bb_width_bits set to a value greater than 0 and slice width information in the bitstream.

Note that the above determination may be performed only when the second non-standard width processing flag is 1. Further, when the second non-standard width processing flag is 0, slice width information is not transmitted.
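The two omission determinations of FIG. 36 and FIG. 37 can be sketched together for a single axis as follows. This is a sketch under stated assumptions, not the encoder of the embodiment. The function name `omission_decision` is hypothetical. The rule used to pick a non-zero bit length (Python's `int.bit_length()`) is an assumption. The test for "the end of the slice matches the end end of a node" also assumes that nodes of the specified width are laid out from the origin and that the slice end is the inclusive coordinate slice position + slice width, as suggested by the "+ 1" in the node width formulas.

```python
def omission_decision(slice_start, slice_width, node_width):
    """Return (slice_bb_pos_bits, slice_bb_width_bits) for one axis."""
    # S201: does the slice start coincide with the origin?
    if slice_start == 0:
        pos_bits = 0                          # S202: position is not transmitted
    else:
        pos_bits = slice_start.bit_length()   # S203: assumed bit-length rule, > 0

    # S211: does the slice end coincide with the end of a node?
    slice_end = slice_start + slice_width     # assumed inclusive end coordinate
    if (slice_end + 1) % node_width == 0:
        width_bits = 0                        # S212: width is not transmitted
    else:
        # S213: assumed bit-length rule, forced to be at least 1
        width_bits = max(1, slice_width.bit_length())
    return pos_bits, width_bits
```

For example, with a specified node width of 8, a slice starting at 0 with width 15 yields `(0, 0)`, so neither field is transmitted, whereas a slice starting at 3 with width 10 yields non-zero values for both.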

FIG. 38 is a flowchart of the node position determination process. For example, the process shown in FIG. 38 is performed for each node.

First, the encoding device determines whether the first bit length information (slice_bb_pos_bits) is greater than 0 (S221).

If the first bit length information is greater than 0 (Yes in S221), the encoding device determines whether the node position (nodePos) of the target node is smaller than the slice position (S222).

If the node position is smaller than the slice position (Yes in S222), the encoding device sets the slice position as the adjusted node position (S223).

On the other hand, if the first bit length information is 0 (No in S221) or if the node position is equal to or greater than the slice position (No in S222), the encoding device does not change (adjust) the node position (S224).

Note that similar processing is performed in the decoding device as well.
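As a minimal sketch, the node position determination of FIG. 38 can be written for one axis as below. The function name `adjust_node_pos` and its parameter names are hypothetical; the branch structure follows steps S221 to S224.

```python
def adjust_node_pos(node_pos, slice_pos, slice_bb_pos_bits):
    """Return the adjusted node position for one axis."""
    # S221: is slice position information present?
    # S222: does the node start before the slice?
    if slice_bb_pos_bits > 0 and node_pos < slice_pos:
        return slice_pos   # S223: clamp the node start to the slice start
    return node_pos        # S224: leave the node position unchanged
```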

FIG. 39 is a flowchart of the node width determination process. For example, the process shown in FIG. 39 is performed for each node.

First, the encoding device determines whether the first bit length information (slice_bb_pos_bits) is greater than 0 (S231).

If the first bit length information is greater than 0 (Yes in S231), the encoding device determines whether the node position (nodePos) of the target node is smaller than the slice position (S232).

If the node position is smaller than the slice position (Yes in S232), the encoding device sets the specified width (W) - (slice position (A) - node position (nodePos)) as the adjusted node width (nodeWidth) (S233).

On the other hand, if the node position is greater than or equal to the slice position (No in S232), the encoding device determines whether the second bit length information (slice_bb_width_bits) is greater than 0 (S234).

If the second bit length information is greater than 0 (Yes in S234), the encoding device sets min(slice position (A) + slice width (B) - node position (nodePos) + 1, specified width (W)) as the node width (nodeWidth) (S235).

On the other hand, if the second bit length information is 0 (No in S234), the encoding device does not change the node width (nodeWidth) (S236). That is, the encoding device sets the node width to the specified width (W).

Furthermore, if the first bit length information is 0 (No in S231), the encoding device determines whether the second bit length information (slice_bb_width_bits) is greater than 0 (S237). Note that the case where the first bit length information is 0 (No in S231) means that the slice position information is not transmitted, and the start end of the slice coincides with the origin.

If the second bit length information is greater than 0 (Yes in S237), the encoding device sets min(slice width (B) - node position (nodePos) + 1, specified width (W)) as the node width (nodeWidth) (S238).

On the other hand, if the second bit length information is 0 (No in S237), the encoding device does not change the node width (nodeWidth) (S236). That is, the encoding device sets the node width to the specified width (W).
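The node width determination of FIG. 39 can be sketched for one axis as follows. The function name `adjust_node_width` and the parameter names are hypothetical. `node_pos` is the node position before adjustment, and W, A, and B are the specified width, slice position, and slice width defined for FIG. 34. The formulas are taken from steps S233, S235, and S238 as written.

```python
def adjust_node_width(node_pos, W, A, B, slice_bb_pos_bits, slice_bb_width_bits):
    """Return the adjusted node width for one axis."""
    if slice_bb_pos_bits > 0:                       # S231
        if node_pos < A:                            # S232
            return W - (A - node_pos)               # S233: trim at the start end
        if slice_bb_width_bits > 0:                 # S234
            return min(A + B - node_pos + 1, W)     # S235: trim at the end end
        return W                                    # S236: specified width
    # No in S231: the slice starts at the origin, so A is not used.
    if slice_bb_width_bits > 0:                     # S237
        return min(B - node_pos + 1, W)             # S238
    return W                                        # S236: specified width
```

For instance, with W = 8, A = 3, and B = 10 and both bit lengths greater than 0, the node at position 0 gets width 8 - (3 - 0) = 5, and the node at position 8 gets width min(3 + 10 - 8 + 1, 8) = 6.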

[summary]
As described above, the decoding device (three-dimensional data decoding device) according to the embodiment performs the processing shown in FIG. 40. The decoding device is a decoding device that decodes a plurality of three-dimensional points. The decoding device obtains, from the bitstream, a plurality of nodes having an octree structure and included in the first slice (S301), obtains, from the bitstream, information for deriving the shape of the first node among the plurality of nodes (S302), and decodes the first node according to the information (S303). The shape of the first node is different from the prescribed shape of other nodes among the plurality of nodes. According to this, it is possible to set a node having a shape different from the prescribed shape. Therefore, variable nodes can be set according to the size of the slice or the distribution of the point group. Therefore, there is a possibility that encoding efficiency can be improved.

For example, the shape of the first node is a rectangular parallelepiped, not a cube. For example, the edge of the first slice coincides with the edge of any one of the plurality of nodes. According to this, even if the node end and the slice end do not match, the node end and the slice end can be made to match. Therefore, since it is possible to prevent a vertex from being generated at the end of the node, it is possible to suppress the generation of blank areas at slice boundaries. Therefore, the accuracy of the point cloud to be decoded can be improved.

For example, the size of the shape of the first node is different from the prescribed size of the prescribed shape. For example, the length of the side of the shape of the first node is different from the prescribed length (e.g., prescribed width) of the side of the prescribed shape.

For example, the information for deriving the shape of the first node indicates the size of the shape of the first node or the positions of both ends of the sides of the first node. According to this, the decoding device can generate a node having a shape different from the prescribed shape using the information.

For example, the information for deriving the shape of the first node includes adjustment information (for example, slice width information or slice position information) for adjusting the prescribed shape to the shape of the first node. According to this, the decoding device can generate a node having a shape different from the prescribed shape using the adjustment information. Furthermore, compared to the case where the absolute amount of position information is sent, there is a possibility that the amount of information can be made smaller.

For example, the decoding of the first node is performed according to a compression method in which a plurality of three-dimensional points are approximated by a plane or curved surface within the first node. For example, the compression method is the Triangle-Soup compression method.

For example, the shape of the first node is determined in order to generate a plane or a curved surface within the first node. According to this, by setting a node having a shape different from the prescribed shape, a plane or a curved surface can be generated within the first node.

For example, the edge of the shape of the first node has a vertex on it, and the plane or curved surface intersects the edge at the vertex. According to this, by setting a node having a shape different from the prescribed shape, a plane or a curved surface can be generated within the first node. For example, the plurality of three-dimensional points include a first three-dimensional point located near the vertex.

For example, the first node is provided in contact with a second slice adjacent to the first slice. According to this, for example, when a node end and a slice end do not match, it is possible to prevent a vertex from being generated at the node end, so it is possible to suppress the generation of blank areas at slice boundaries. For example, if only nodes of a prescribed shape are provided at the slice boundary, the slice boundary may separate the nodes. In that case, the first node cannot be encoded or decoded using the point cloud in the second slice adjacent to the first slice, so the accuracy of restoring the three-dimensional point cloud near the slice boundary may deteriorate. Therefore, in this aspect, by setting a node having a shape different from the prescribed shape, the end of the first node can be made to coincide with the end of the first slice, for example. Thereby, it is possible to suppress deterioration in the reconstruction accuracy of the three-dimensional point group near the slice boundary. Note that if this aspect is applied to the Triangle-Soup method, it is possible to prevent edge vertices from failing to be generated appropriately.

For example, the information for deriving the shape of a node is provided for each slice. The information for the second slice is used to derive the shape of a second node among a plurality of nodes that have an octree structure and are included in the second slice, and the shape of the second node is different from the prescribed shape. According to this, the shape of the node can be set for each slice.

For example, the size of the prescribed shape is expressed as a power of two, and the size of the first node shape is different from the size expressed as a power of two.

For example, the shape of the first node is defined by a first length along a first direction, a second length along a second direction, and a third length along a third direction, and the first direction, second direction, and third direction are orthogonal to each other. Among the first length, second length, and third length, only the first length is different from the length of the prescribed shape of the nodes other than the first node, or only the first length and the second length are each different from the length of the prescribed shape.

For example, the first node is the closest to the origin of the first slice among the plurality of nodes in one of the first direction, the second direction, and the third direction, and the first direction, the second direction, and the third direction are orthogonal to each other. According to this, when the start position of the slice does not match the origin, for example, the start position of the node can be adjusted in accordance with the start position of the slice. Note that the origin is a reference position for defining the position or shape of a slice, a node, or a three-dimensional point.

For example, the plurality of nodes include a third node having a shape different from the prescribed shape, and the third node is the farthest from the origin in one direction among the plurality of nodes. According to this, when the end position of a slice does not match the end position of a node, for example, the end position of a node can be adjusted in accordance with the end position of a slice.

For example, if the starting position of the first slice does not coincide with the origin, the bitstream contains information for deriving the shape of the first node, and if the starting position of the first slice coincides with the origin, the bitstream does not contain information for deriving the shape of the first node. According to this, the occurrence of processing for adjusting the shape of the node can be reduced. Furthermore, transmission of information for deriving the shape of the node can be omitted. Therefore, it is possible to reduce the amount of processing and the amount of bitstream data. Note that the start position (start end) of a slice is the position of the end of the slice closer to the origin, and the end position (end end) of the slice is the position of the end of the slice farther from the origin. Similarly, the start position (start end) of a node is the position of the end of the node closer to the origin, and the end position (end end) of a node is the position of the end of the node farther from the origin.

For example, if the ending position of the first slice does not match the end end of the first node, the bitstream includes information for deriving the shape of the first node, and if the ending position of the first slice matches the end end of the first node, the bitstream does not contain information for deriving the shape of the first node. According to this, the occurrence of processing for adjusting the shape of the node can be reduced. Furthermore, transmission of information for deriving the shape of the node can be omitted. Therefore, it is possible to reduce the amount of processing and the amount of bitstream data.

FIG. 41 is a block diagram of the decoding device 10. For example, the decoding device 10 includes a processor 11 and a memory 12, and the processor 11 uses the memory 12 to perform the above processing.

Furthermore, the encoding device (three-dimensional data encoding device) according to the embodiment performs the processing shown in FIG. 42. The encoding device is an encoding device that encodes a plurality of three-dimensional points. The encoding device generates a bitstream by encoding a plurality of nodes that have an octree structure and are included in the first slice (S311), and stores, in the bitstream, information for deriving the shape of the first node among the plurality of nodes (S312). The shape of the first node is different from the prescribed shape of other nodes among the plurality of nodes. According to this, it is possible to set a node having a shape different from the prescribed shape. Therefore, variable nodes can be set according to the size of the slice or the distribution of the point group. Therefore, there is a possibility that encoding efficiency can be improved. Furthermore, the encoding device may perform the same processing as the decoding device described above.

FIG. 43 is a block diagram of the encoding device 20. For example, the encoding device 20 includes a processor 21 and a memory 22, and the processor 21 uses the memory 22 to perform the above processing.

Although the encoding device (three-dimensional data encoding device), decoding device (three-dimensional data decoding device), etc. according to the embodiment and modification of the present disclosure have been described above, the present disclosure is not limited to this embodiment.

Furthermore, each processing unit included in the encoding device, decoding device, etc. according to the above embodiments is typically realized as an LSI, which is an integrated circuit. These may be integrated into one chip individually, or may be integrated into one chip including some or all of them.

Further, circuit integration is not limited to LSI, and may be realized using a dedicated circuit or a general-purpose processor. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor that can reconfigure the connections and settings of circuit cells inside the LSI may be used.

Furthermore, in each of the above embodiments, each component may be configured with dedicated hardware, or may be realized by executing a software program suitable for each component. Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory.

The present disclosure may also be realized as an encoding method (three-dimensional data encoding method), a decoding method (three-dimensional data decoding method), or the like.

Additionally, the present disclosure may be implemented as a program that executes the encoding method or decoding method on a computer, processor, or device. Further, the present disclosure may be implemented as a bitstream generated by the above encoding method. Further, the present disclosure may be realized as a recording medium on which the program or the bitstream is recorded. For example, the present disclosure may be realized as a non-transitory computer-readable recording medium on which the program or the bitstream is recorded.

Furthermore, the division of functional blocks in the block diagram is just an example; multiple functional blocks may be realized as one functional block, one functional block may be divided into multiple functional blocks, or some functions may be moved to other functional blocks. Further, functions of a plurality of functional blocks having similar functions may be processed in parallel or in a time-sharing manner by a single piece of hardware or software.

Furthermore, the order in which the steps in the flowchart are executed is for illustrative purposes to specifically explain the present disclosure, and may be in an order other than the above. Further, some of the above steps may be executed simultaneously (in parallel) with other steps.

Although the encoding device, decoding device, etc. according to one or more aspects have been described above based on the embodiment, the present disclosure is not limited to this embodiment. Unless departing from the spirit of the present disclosure, various modifications to this embodiment that can be thought of by those skilled in the art, and forms constructed by combining components of different embodiments, may also be included within the scope of one or more aspects.

The present disclosure can be applied to encoding devices and decoding devices.

10 Decoding device
11, 21 Processor
12, 22 Memory
20 Encoding device
101 Target space
102, 133 Point group
103, 111, 132 Point
104 Leaf node
112 Edge vertex
113, 122 Range
121 Approximate plane
123 Vertex information
131 Triangle
201 Region

Claims

A decoding method for decoding a plurality of three-dimensional points, the method comprising:
obtaining, from a bitstream, a plurality of nodes having an octree structure and included in a first slice;
obtaining, from the bitstream, information for deriving a shape of a first node among the plurality of nodes; and
decoding the first node according to the information,
wherein the shape is different from a prescribed shape of other nodes among the plurality of nodes.
The decoding method according to claim 1, wherein the shape is a rectangular parallelepiped and not a cube.
The decoding method according to claim 1, wherein an edge of the first slice coincides with an edge of any one of the plurality of nodes.
The decoding method according to claim 1, wherein the information indicates the size of the shape or the positions of both ends of the side of the first node.
The decoding method according to claim 1, wherein the information includes adjustment information for adjusting the prescribed shape to the shape.
The decoding method according to claim 1, wherein the decoding is performed according to a compression method in which the plurality of three-dimensional points are approximated by a plane or a curved surface within the first node.
The decoding method according to claim 6, wherein the compression method is a Triangle-Soup compression method.
The decoding method according to claim 6, wherein the shape is determined to generate the plane or the curved surface within the first node.
The decoding method according to claim 8, wherein an edge of the shape has a vertex on it, and the plane or the curved surface intersects the edge at the vertex.
The decoding method according to claim 1, wherein the first node is provided in contact with a second slice adjacent to the first slice.
The decoding method according to claim 10, wherein the information is provided for each slice, the information of the second slice is used to derive a shape of a second node among a plurality of nodes having an octree structure and included in the second slice, and the shape of the second node is different from the prescribed shape.
The decoding method according to claim 1, wherein the size of the prescribed shape is expressed as a power of 2, and the size of the shape is different from the size expressed as a power of 2.
The decoding method according to claim 1, wherein the shape of the first node is defined by a first length along a first direction, a second length along a second direction, and a third length along a third direction, the first direction, the second direction, and the third direction are orthogonal to each other, and, of the first length, the second length, and the third length, only the first length is different from the length of the prescribed shape of the nodes other than the first node, or only the first length and the second length are each different from the length of the prescribed shape.
The decoding method according to claim 1, wherein the first node is the closest to an origin of the first slice among the plurality of nodes in one direction among a first direction, a second direction, and a third direction, and the first direction, the second direction, and the third direction are orthogonal to each other.
The decoding method according to claim 14, wherein the plurality of nodes include a third node having a shape different from the prescribed shape, and the third node is the farthest from the origin in the one direction among the plurality of nodes.
The decoding method according to claim 1, wherein if a start position of the first slice does not coincide with an origin, the bitstream includes the information, and if the start position of the first slice coincides with the origin, the bitstream does not include the information.
The decoding method according to claim 1, wherein if an end position of the first slice does not coincide with an end of the first node, the bitstream includes the information, and if the end position of the first slice coincides with the end of the first node, the bitstream does not include the information.
An encoding method for encoding a plurality of three-dimensional points, the method comprising:
generating a bitstream by encoding a plurality of nodes having an octree structure and included in a first slice; and
storing, in the bitstream, information for deriving a shape of a first node among the plurality of nodes,
wherein the shape is different from a prescribed shape of other nodes among the plurality of nodes.
A decoding device that decodes a plurality of three-dimensional points, the decoding device comprising:
a processor; and
a memory,
wherein the processor, using the memory:
obtains, from a bitstream, a plurality of nodes having an octree structure and included in a first slice;
obtains, from the bitstream, information for deriving a shape of a first node among the plurality of nodes; and
decodes the first node according to the information, and
the shape is different from a prescribed shape of other nodes among the plurality of nodes.
An encoding device that encodes a plurality of three-dimensional points, the encoding device comprising:
a processor; and
a memory,
wherein the processor, using the memory:
generates a bitstream by encoding a plurality of nodes having an octree structure and included in a first slice; and
stores, in the bitstream, information for deriving a shape of a first node among the plurality of nodes, and
the shape is different from a prescribed shape of other nodes among the plurality of nodes.