WO2024120326A1

WO2024120326A1 - Geometric coding method, geometric decoding method and terminal

Info

Publication number: WO2024120326A1
Application number: PCT/CN2023/136035
Authority: WO
Inventors: 张伟; 王贵旗; 杨付正; 吕卓逸
Original assignee: 维沃移动通信有限公司
Priority date: 2022-12-09
Filing date: 2023-12-04
Publication date: 2024-06-13
Also published as: CN118175341A

Abstract

Disclosed in the present application are a geometric coding method, a geometric decoding method and a terminal, belonging to the technical field of coding and decoding. The geometric coding method provided by the embodiment of the present application comprises: acquiring geometric information of a point cloud to be coded; according to the geometric information of said point cloud, generating a bounding box corresponding to said point cloud, the bounding box comprising at least two nodes to be coded, and said nodes being determined on the basis of multiway-tree division on the bounding box; with respect to each of said nodes, according to node parameters corresponding to said point cloud, determining the maximum N coded nodes associated with the node to be coded; according to occupation information of the maximum N coded nodes, generating context information corresponding to each node to be coded; and on the basis of the context information corresponding to the nodes to be coded, performing geometric coding on the nodes to be coded, so as to generate a target data stream.

Description

Geometric encoding method, geometric decoding method and terminal

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese patent application No. 202211583723.X filed in China on December 9, 2022, the entire contents of which are incorporated herein by reference.

Technical Field

The present application belongs to the field of coding and decoding technology, and specifically relates to a geometric coding method, a geometric decoding method and a terminal.

Background

A point cloud is a set of irregularly distributed discrete points in space that express the spatial structure and surface properties of a three-dimensional object or scene.

In the multi-tree-based geometric coding of point clouds, the current node to be coded needs to be geometrically coded according to the context information corresponding to the current node to be coded, and the above context information is composed of placeholder codes of coded nodes. In this way, for the nodes to be coded close to the leaf nodes, a large number of placeholder codes of coded nodes need to be stored, which takes up a lot of memory space.

Summary of the invention

The embodiments of the present application provide a geometric encoding method, a geometric decoding method and a terminal, which can solve the problem in the related art that a large number of placeholder codes of encoded nodes need to be stored, which occupies a large amount of memory space.

In a first aspect, a geometric encoding method is provided, comprising:

The encoding end obtains the geometric information of the point cloud to be encoded;

The encoding end generates a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded; the bounding box includes at least two nodes to be encoded, and the nodes to be encoded are determined based on multi-branch tree division of the bounding box;

The encoder determines, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded, where the node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1;

The encoder generates context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

The encoding end performs geometric encoding on the node to be encoded based on context information corresponding to the node to be encoded to generate a target bitstream.

In a second aspect, a geometric decoding method is provided, comprising:

The decoding end obtains the target bitstream;

The decoding end decodes the target code stream to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded;

The decoding end determines, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded, wherein the node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1;

The decoding end generates context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

The decoding end performs geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generates reconstructed geometric information corresponding to the point cloud to be decoded.

In a third aspect, a geometric encoding device is provided, comprising:

The acquisition module is used to obtain the geometric information of the point cloud to be encoded;

A first generating module is used to generate a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded; the bounding box includes at least two nodes to be encoded, and the nodes to be encoded are determined based on multi-branch tree partitioning of the bounding box;

A determination module, configured to determine, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to a node parameter corresponding to the point cloud to be encoded, wherein the node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1;

A second generating module, configured to generate context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

The third generating module is used to perform geometric coding on the node to be coded based on the context information corresponding to the node to be coded, so as to generate a target bit stream.

In a fourth aspect, a geometric decoding device is provided, comprising:

An acquisition module is used to acquire a target bitstream;

A decoding module, used for decoding the target code stream to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded;

A determination module, configured to determine, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded, wherein the node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1;

A first generating module, configured to generate context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

The second generating module is used to perform geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generate reconstructed geometric information corresponding to the point cloud to be decoded.

In a fifth aspect, a terminal is provided, the terminal comprising a processor and a memory, the memory storing a program or instruction that can be run on the processor, the program or instruction, when executed by the processor, implementing the steps of the method described in the first aspect, or the steps of the method described in the second aspect.

In a sixth aspect, a readable storage medium is provided, on which a program or instruction is stored. When the program or instruction is executed by a processor, the steps of the method described in the first aspect are implemented, or the steps of the method described in the second aspect are implemented.

In a seventh aspect, a chip is provided, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the steps of the method described in the first aspect, or to implement the steps of the method described in the second aspect.

In an eighth aspect, a computer program/program product is provided, wherein the computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the steps of the method described in the first aspect, or to implement the steps of the method described in the second aspect.

In the embodiment of the present application, for each node to be encoded, geometric encoding of the node to be encoded can be achieved only based on the placeholder information of at most N encoded nodes associated with the node to be encoded. Compared with the solution in the related art that requires the placeholder information of all encoded nodes to perform geometric encoding on the node to be encoded, the embodiment of the present application can perform geometric encoding on the node to be encoded through the placeholder information of a small number of encoded nodes, thereby reducing the storage space of the placeholder information of the encoded node and freeing up a large amount of memory space.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of the framework of an AVS point cloud encoding device;

FIG. 2 is a schematic diagram of the framework of an AVS point cloud decoding device;

FIG. 3 is a schematic flow chart of a geometric encoding method provided in an embodiment of the present application;

FIG. 4 is a schematic flow chart of a geometric decoding method provided in an embodiment of the present application;

FIG. 5 is a structural diagram of a geometric encoding device provided in an embodiment of the present application;

FIG. 6 is a structural diagram of a geometric decoding device provided in an embodiment of the present application;

FIG. 7 is a structural diagram of a communication device provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of the hardware structure of a terminal provided in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application fall within the scope of protection of this application.

The terms "first", "second", etc. in the specification and claims of this application are used to distinguish similar objects, and are not used to describe a specific order or sequence. It should be understood that the terms used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present application can be implemented in an order other than those illustrated or described herein, and the objects distinguished by "first" and "second" are generally of the same type, and do not limit the number of objects. For example, the first object can be one, and the second can also be one. In addition, in the specification and claims, "and/or" means at least one of the connected objects, and the character "/" generally means that the objects connected before and after are in an "or" relationship.

The geometric encoding device corresponding to the geometric encoding method in the embodiment of the present application and the geometric decoding device corresponding to the geometric decoding method can both be terminals, which can also be called terminal equipment or user equipment (User Equipment, UE). The terminal can be a terminal-side device such as a mobile phone, a tablet computer (Tablet Personal Computer), a laptop computer (Laptop Computer, also called a notebook computer), a personal digital assistant (Personal Digital Assistant, PDA), a handheld computer, a netbook, an ultra-mobile personal computer (Ultra-Mobile Personal Computer, UMPC), a mobile Internet device (Mobile Internet Device, MID), an augmented reality (Augmented Reality, AR)/virtual reality (Virtual Reality, VR) device, a robot, a wearable device (Wearable Device), a vehicle-mounted device (VUE), a pedestrian terminal (PUE), a smart home device (a home appliance with wireless communication functions, such as a refrigerator, a TV, a washing machine or furniture), a game console, a personal computer (Personal Computer, PC), a teller machine or a self-service machine. Wearable devices include smart watches, smart bracelets, smart headphones, smart glasses, smart jewelry (smart bracelets, smart rings, smart necklaces, smart anklets, etc.), smart wristbands, smart clothing, etc. It should be noted that the specific type of the terminal is not limited in the embodiment of the present application.

For ease of understanding, some contents involved in the embodiments of the present application are described below:

Please refer to FIG. 1. As shown in FIG. 1, in the current digital audio and video coding standard, the geometric information and attribute information of a point cloud are encoded separately by the Audio Video coding Standard (AVS) point cloud encoding device. First, coordinate translation is performed on the geometric information so that the entire point cloud is contained in a bounding box, and the coordinates are then quantized. Quantization mainly serves to scale the coordinates. Because quantization rounds the geometric coordinates, some points end up with identical geometric information; these are called duplicate points. Whether duplicate points are removed is determined by a parameter. The two steps of quantization and duplicate-point removal are together also called voxelization. Next, multi-tree division, such as octree, quadtree or binary-tree division, is performed on the bounding box. In the multi-tree-based geometric information encoding framework, taking octree division as an example, the bounding box is divided into 8 equal sub-cubes, and each non-empty sub-cube is divided further until the leaf nodes are 1x1x1 unit cubes; the number of points in each leaf node is then encoded to generate a binary code stream.
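The voxelization and octree division steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the AVS reference implementation: the names `voxelize` and `occupancy_code`, the `scale` parameter, and the bit order of the 8 child sub-cubes are all hypothetical choices made for the example.

```python
from typing import Iterable

Point = tuple[int, int, int]


def voxelize(points: Iterable[tuple[float, float, float]],
             scale: float,
             remove_duplicates: bool = True) -> list[Point]:
    """Translate points to the origin, quantize them, optionally drop duplicates."""
    pts = list(points)
    # Coordinate translation: move the minimum corner of the cloud to (0, 0, 0).
    mins = [min(p[axis] for p in pts) for axis in range(3)]
    # Quantization: scale and round each coordinate to an integer.
    quantized = [
        tuple(int(round((p[axis] - mins[axis]) * scale)) for axis in range(3))
        for p in pts
    ]
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each point, in order.
        quantized = list(dict.fromkeys(quantized))
    return quantized


def occupancy_code(points: list[Point], origin: Point, size: int) -> int:
    """Return an 8-bit code with bit k set when child sub-cube k is non-empty.

    Assumed child numbering: bit 2 = upper half in x, bit 1 = y, bit 0 = z.
    """
    half = size // 2
    code = 0
    for x, y, z in points:
        child = (((x - origin[0]) >= half) << 2
                 | ((y - origin[1]) >= half) << 1
                 | ((z - origin[2]) >= half))
        code |= 1 << child
    return code
```

Applying `occupancy_code` recursively to each non-empty child, with `size` halved at every level, would yield the octree traversal down to unit cubes.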

After the geometric encoding is completed, the geometric information is reconstructed for use in the subsequent recoloring. Attribute encoding mainly targets color and reflectance information. First, whether to perform color space conversion is determined according to a parameter. If color space conversion is performed, the color information is converted from the red-green-blue (RGB) color space to the luminance-chrominance (YUV) color space, where Y represents luminance and U and V represent chrominance. Then, the original point cloud is used to recolor the geometrically reconstructed point cloud so that the not-yet-encoded attribute information corresponds to the reconstructed geometric information. In color information encoding, after the point cloud is sorted by Morton code or Hilbert code, the nearest neighbors of the point to be predicted are searched for using geometric spatial relationships, and the reconstructed attribute values of the neighbors are used to predict the point and obtain a predicted attribute value. The difference between the real attribute value and the predicted attribute value gives the prediction residual, which is finally quantized and encoded to generate a binary code stream.
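The Morton ordering and residual computation described above can be sketched as follows. This is a simplified illustration rather than the AVS algorithm: `morton_code` and `prediction_residuals` are hypothetical names, each point is predicted from its predecessor in Morton order instead of from a nearest-neighbor search, and quantization is omitted so that reconstructed and original attribute values coincide.

```python
def morton_code(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def prediction_residuals(points: list[tuple[int, int, int]],
                         attributes: list[int]) -> tuple[list[int], list[int]]:
    """Sort points by Morton code and predict each attribute from the previous one."""
    order = sorted(range(len(points)), key=lambda i: morton_code(*points[i]))
    residuals = []
    previous = 0  # assumed default predictor for the first point
    for i in order:
        # Residual = real attribute value minus predicted attribute value.
        residuals.append(attributes[i] - previous)
        previous = attributes[i]
    return order, residuals
```

Because neighboring points in Morton order tend to be close in space, the residuals are usually smaller than the raw attribute values, which is what makes them cheaper to encode.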

It should be understood that the decoding process in the digital audio and video coding and decoding technical standard corresponds to the above-mentioned encoding process. Specifically, the framework of the AVS point cloud decoding device is shown in Figure 2.

The geometric encoding method provided in the embodiments of the present application is described in detail below through some embodiments and their application scenarios, in conjunction with the accompanying drawings.

Please refer to Figure 3, which is a flow chart of a geometric encoding method in an embodiment of the present application. The geometric encoding method provided in this embodiment includes the following steps:

S301: The encoding end obtains geometric information of the point cloud to be encoded.

S302: The encoding end generates a bounding box corresponding to the point cloud to be encoded according to geometric information of the point cloud to be encoded.

In this step, after obtaining the geometric information of the point cloud to be encoded, coordinate translation and coordinate quantization can be performed on the geometric position of the point cloud to be encoded to generate a bounding box containing the point cloud to be encoded. By performing multi-branch tree division on the bounding box, at least two nodes to be encoded included in the bounding box are determined.

Among them, the above-mentioned multi-tree partitioning includes but is not limited to binary tree partitioning, quadtree partitioning and octree partitioning.

S303: The encoding end determines, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded.

The node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1. For specific implementation methods of determining the maximum N encoded nodes associated with the node to be encoded according to the node parameters, please refer to the subsequent embodiments.

It should be noted that, for the initial node in the point cloud to be encoded, the initial node may be geometrically encoded according to a default value preset for the initial node.

It should be noted that the number of coded nodes associated with different nodes to be coded may be the same or different, but the number of coded nodes associated with each node to be coded does not exceed N.

S304: The encoder generates context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes.

The above placeholder information includes a placeholder code, that is, an occupancy code indicating which child nodes of a node are occupied by points.

In this step, an arithmetic operation may be performed on at least part of the placeholder codes of all the associated encoded nodes to obtain context information corresponding to the node to be encoded.
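As a hedged illustration of such an arithmetic operation, the sketch below derives a context index for one child position by counting how many of the associated encoded nodes have that child occupied. The source does not specify the operation; the function name `context_for_child`, the counting rule and the `max_context` cap are all assumptions chosen only to make the idea concrete.

```python
def context_for_child(neighbor_codes: list[int],
                      child: int,
                      max_context: int = 3) -> int:
    """Map the occupancy codes of the associated nodes to a small context index.

    Assumed rule: count the associated nodes whose bit `child` is set,
    then clamp the count so the number of contexts stays bounded.
    """
    occupied = sum((code >> child) & 1 for code in neighbor_codes)
    return min(occupied, max_context)
```

With at most N associated nodes, the count can never exceed N, so the number of distinct contexts is bounded by `min(N, max_context) + 1` in this sketch.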

It should be understood that the placeholder information of the above-mentioned encoded nodes can be stored in the cache of the encoding end.

S305: The encoder performs geometric coding on the node to be coded based on the context information corresponding to the node to be coded to generate a target bitstream.

Optionally, the node parameter is used to characterize the number of encoded nodes, and determining the maximum N encoded nodes associated with the node to be encoded according to the node parameter corresponding to the point cloud to be encoded includes:

Based on the encoding order of the node to be encoded, the encoding end determines a maximum of N encoded nodes whose encoding order is before the node to be encoded as the maximum N encoded nodes associated with the node to be encoded.

In this embodiment, the node parameter is used to characterize the number of encoded nodes. In this implementation, the encoded node that is located before the node to be encoded in the encoding order can be determined as the encoded node associated with the node to be encoded, and then the context information corresponding to the node to be encoded is generated based on the placeholder information of the encoded node, wherein the number of encoded nodes represented by the node parameter does not exceed N.

In this embodiment, the number of encoded nodes associated with the node to be encoded is limited by node parameters, and then in subsequent steps, the node to be encoded can be geometrically encoded by the placeholder information of a smaller number of encoded nodes, thereby reducing the storage space of the placeholder information of the encoded nodes and freeing up a large amount of memory space.
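The memory saving in this order-based variant can be sketched with a fixed-size buffer that keeps only the placeholder codes of the most recent N encoded nodes. This is a minimal sketch under assumptions: the class name `RecentNodeHistory` and its methods are hypothetical, and the source does not prescribe a particular data structure.

```python
from collections import deque


class RecentNodeHistory:
    """Keep the placeholder codes of at most N most recently encoded nodes."""

    def __init__(self, n: int):
        if n < 1:
            raise ValueError("N must be an integer greater than or equal to 1")
        # A deque with maxlen drops the oldest entry automatically when full.
        self._codes: deque[int] = deque(maxlen=n)

    def associated(self) -> list[int]:
        """Codes of the encoded nodes associated with the next node to be encoded."""
        return list(self._codes)

    def push(self, code: int) -> None:
        """Record the placeholder code of a node once it has been encoded."""
        self._codes.append(code)
```

For the initial node, `associated()` returns an empty list, which corresponds to the case where a preset default value is used instead of context from earlier nodes.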

Optionally, the determining, according to the node parameters corresponding to the to-be-encoded point cloud, a maximum of N encoded nodes associated with the to-be-encoded node comprises:

The encoding end performs a search operation on the point cloud to be encoded with the geometric position corresponding to the node to be encoded as the search center, and the search range of the search operation is determined based on the node parameter;

The encoding end determines the searched encoded nodes as the maximum N encoded nodes associated with the node to be encoded.

In this embodiment, the node parameters can represent the search range. In this implementation, the geometric position corresponding to the node to be encoded can be used as the search center to perform a search operation on the point cloud to be encoded, and the encoded nodes within the search range can be determined as the encoded nodes associated with the node to be encoded.

As described above, the placeholder information of the encoded nodes can be stored in the cache of the encoding end. In the case where the node parameter represents the search range, the above cache can be represented by an array, where the array subscript is the Morton code calculated from the geometric coordinates of the node to be encoded, the array value is the placeholder information of each encoded node associated with the node to be encoded, and the array capacity is the size of the search range represented by the node parameter.

One way to update the array is to update the array index according to the Morton code calculated from the geometric coordinates of the node to be encoded, thereby updating the array.

Another way to update the array is: in the process of determining the maximum N encoded nodes associated with the node to be encoded, if the array has reached the upper limit of the cache capacity, clear the placeholder information of the encoded nodes outside the search range represented by the node parameters.
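The range-based cache and its second update strategy can be sketched as follows. Several points here are assumptions not stated in the source: the search range is interpreted as a per-axis radius around the node, the capacity is taken to be the number of positions inside that range, a dictionary keyed by Morton code stands in for the array, and the names `OccupancyCache`, `neighbors` and `store` are hypothetical.

```python
Point = tuple[int, int, int]


def morton_code(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


class OccupancyCache:
    """Cache of placeholder codes indexed by the Morton code of the node position."""

    def __init__(self, search_range: int):
        self.search_range = search_range
        # Assumed capacity: number of positions within the search range.
        self.capacity = (2 * search_range + 1) ** 3
        self._entries: dict[int, tuple[Point, int]] = {}

    def __len__(self) -> int:
        return len(self._entries)

    def _in_range(self, center: Point, pos: Point) -> bool:
        return all(abs(c - p) <= self.search_range for c, p in zip(center, pos))

    def neighbors(self, center: Point) -> list[int]:
        """Placeholder codes of cached nodes inside the search range of `center`."""
        return [code for pos, code in self._entries.values()
                if self._in_range(center, pos)]

    def store(self, pos: Point, code: int) -> None:
        """Store a newly encoded node, evicting out-of-range entries when full."""
        if len(self._entries) >= self.capacity:
            self._entries = {key: entry for key, entry in self._entries.items()
                             if self._in_range(pos, entry[0])}
        self._entries[morton_code(*pos)] = (pos, code)
```

In use, an encoder would call `neighbors` for the node to be encoded, derive the context from the returned codes, and then call `store` once the node has been encoded.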

Optionally, the performing geometric coding on the node to be coded based on the context information corresponding to the node to be coded to generate a target bitstream includes:

The encoding end encodes the context information corresponding to the node to be encoded through the context model corresponding to the point cloud to be encoded, generates a target code stream, and writes a model index into the target code stream; the model index is used to characterize the context model corresponding to the point cloud to be encoded.

In this embodiment, the encoder selects one context model from multiple context models through adaptive judgment, uses the selected context model to encode the context information, and writes the model index representing that context model into the target bitstream. It should be understood that if the encoder stores only one set of context models, the model index is not written into the target bitstream.
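One possible form of this selection is sketched below. The source does not define the adaptive judgment, so the sketch assumes a simple criterion: each candidate model is a table mapping a context to the probability that the coded bit is 1, and the model with the lowest estimated code length is chosen. The names `estimated_bits` and `choose_model` and the list used as a stand-in for the bitstream header are hypothetical.

```python
import math


def estimated_bits(model: dict[int, float],
                   symbols: list[tuple[int, int]]) -> float:
    """Estimate the code length of (context, bit) pairs under one model.

    `model[context]` is the assumed probability that the bit is 1 and must
    lie strictly between 0 and 1.
    """
    total = 0.0
    for context, bit in symbols:
        p_one = model[context]
        total += -math.log2(p_one if bit else 1.0 - p_one)
    return total


def choose_model(models: list[dict[int, float]],
                 symbols: list[tuple[int, int]]) -> tuple[int, list[int]]:
    """Pick the cheapest model and return (index, header fields to write)."""
    costs = [estimated_bits(model, symbols) for model in models]
    index = costs.index(min(costs))
    # The model index is only signalled when there is a choice to signal.
    header = [index] if len(models) > 1 else []
    return index, header
```

The decoder side would read the index from the header, when present, and use the same table, so both ends stay synchronized without the decoder repeating the cost estimation.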

Optionally, the node parameters are parameters agreed upon by a protocol, or the node parameters are determined based on pre-acquired indication information.

An optional implementation is that the node parameter is the number of shifts agreed upon by the protocol. In this implementation, the node parameter can be directly specified at the encoding end.

Another optional implementation is that the node parameters are determined by pre-acquired indication information. In this implementation, the node parameters are determined by parsing the indication information.

Please refer to Figure 4, which is a schematic diagram of the flow of the geometric decoding method provided in the embodiment of the present application. The geometric decoding method provided in this embodiment includes the following steps:

S401: The decoding end obtains the target bit stream.

S402: The decoding end decodes the target code stream to obtain a point cloud to be decoded.

In this step, the acquired target code stream is decoded to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded.

S403: The decoding end determines, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded.

The above-mentioned node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1.

It should be noted that, for the initial node in the point cloud to be decoded, the initial node may be geometrically decoded according to the default value of the initial node obtained by decoding the target code stream.

It should be noted that the number of decoded nodes associated with different nodes to be decoded may be the same or different, but the number of decoded nodes associated with each node to be decoded does not exceed N.

S404: The decoding end generates context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes.

For specific implementation methods of how to determine the maximum N decoded nodes associated with the node to be decoded according to the node parameters, please refer to the subsequent embodiments.

S405: The decoding end performs geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generates reconstructed geometric information corresponding to the point cloud to be decoded.

In the embodiment of the present application, for each node to be decoded, geometric decoding of the node to be decoded can be achieved only based on the placeholder information of at most N decoded nodes associated with the node to be decoded. In other words, the node to be decoded can be geometrically decoded using the placeholder information of a relatively small number of decoded nodes, thereby reducing the storage space of the placeholder information of the decoded nodes and freeing up a large amount of memory space.

Optionally, the node parameter is used to characterize the number of decoded nodes, and determining the maximum N decoded nodes associated with the node to be decoded according to the node parameter corresponding to the point cloud to be decoded includes:

The decoding end determines, based on the decoding order of the node to be decoded, a maximum of N decoded nodes whose decoding order is before the node to be decoded as the maximum N decoded nodes associated with the node to be decoded.

In this embodiment, the node parameter is used to characterize the number of decoded nodes. In this implementation, the decoded nodes that are located before the node to be decoded in decoding order can be determined as decoded nodes associated with the node to be decoded, wherein the number of decoded nodes characterized by the node parameter does not exceed N.

Optionally, determining, according to the node parameters corresponding to the point cloud to be decoded, a maximum of N decoded nodes associated with the node to be decoded comprises:

The decoding end performs a search operation on the point cloud to be decoded with the geometric position corresponding to the node to be decoded as the search center, and the search range of the search operation is determined based on the node parameter;

The decoding end determines the searched decoded nodes as the maximum N decoded nodes associated with the node to be decoded.

In this embodiment, the node parameters can represent the search range. In this implementation, the geometric position corresponding to the node to be decoded can be used as the search center to perform a search operation on the point cloud to be decoded, and the decoded nodes within the search range can be determined as decoded nodes associated with the node to be decoded.

Optionally, the geometric decoding of the node to be decoded based on the context information corresponding to the node to be decoded to generate the reconstructed geometric information corresponding to the point cloud to be decoded includes:

The decoding end decodes the context information corresponding to the node to be decoded through the context model corresponding to the point cloud to be decoded, and generates reconstructed geometric information corresponding to the point cloud to be decoded; the context model is determined based on the model index carried by the target code stream.

In this embodiment, the target code stream may be decoded to obtain a model index, and then the context information corresponding to the node to be decoded may be decoded using a context model represented by the model index to generate reconstructed geometric information corresponding to the point cloud to be decoded.

Optionally, the node parameter is a parameter agreed upon by a protocol, or the node parameter is determined based on indication information carried by the target code stream.

It should be noted that the geometric decoding method provided in this embodiment is the inverse process of the geometric encoding provided in the above embodiments.
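The reason the decoder can mirror the encoder is that the context of each node depends only on nodes processed earlier, which the decoder has already reconstructed. The sketch below illustrates this causality under assumptions: the function name `context_sequence` is hypothetical, the arithmetic operation is taken to be a bitwise OR of the associated placeholder codes, and 0 is used as the default context for the initial node.

```python
from collections import deque


def context_sequence(codes: list[int], n: int) -> list[int]:
    """Return the context used for each node, given codes in coding order."""
    history: deque[int] = deque(maxlen=n)  # at most N earlier nodes are kept
    contexts = []
    for code in codes:
        # Assumed operation: OR together the codes of the associated nodes.
        context = 0
        for previous in history:
            context |= previous
        contexts.append(context)
        # Only after the node is coded does its own code enter the history.
        history.append(code)
    return contexts
```

Since the context for node k never reads the code of node k or any later node, running the same procedure on the decoded codes reproduces the encoder's contexts exactly.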

The geometric coding method provided in the embodiment of the present application can be executed by a geometric coding device. In the embodiment of the present application, a geometric coding device executing the geometric coding method is taken as an example to illustrate the geometric coding device provided in the embodiment of the present application.

As shown in FIG. 5, the embodiment of the present application further provides a geometric encoding device 500, including:

The acquisition module 501 is used to acquire geometric information of the point cloud to be encoded;

A first generating module 502 is used to generate a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded; the bounding box includes at least two nodes to be encoded, and the nodes to be encoded are determined based on multi-branch tree partitioning of the bounding box;

A determination module 503 is used to determine, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded, wherein the node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1;

A second generating module 504 is used to generate context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

The third generating module 505 is used to perform geometric coding on the node to be coded based on the context information corresponding to the node to be coded, so as to generate a target bitstream.

Optionally, the node parameter is used to characterize the number of encoded nodes, and the determination module 503 is specifically used to:

Based on the coding order of the node to be coded, a maximum of N coded nodes whose coding order is before the node to be coded are determined as the maximum of N coded nodes associated with the node to be coded.

Optionally, the determining module 503 is further specifically configured to:

Taking the geometric position corresponding to the node to be encoded as the search center, performing a search operation on the point cloud to be encoded, wherein the search range of the search operation is determined based on the node parameter;

The searched coded nodes are determined as the maximum N coded nodes associated with the node to be coded.
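The search-based selection above can be sketched as follows, assuming that the node parameter gives a search range measured as a Chebyshev (cube-shaped) distance around the search center, and that the nearest nodes are preferred when more than N are found. Both assumptions, and the name `search_coded_nodes`, are hypothetical; the embodiment only states that the search range is determined based on the node parameter.

```python
from typing import List, Sequence, Tuple

Position = Tuple[int, int, int]


def search_coded_nodes(center: Position,
                       coded_positions: Sequence[Position],
                       search_range: int, n: int) -> List[Position]:
    """Return at most n coded nodes lying within search_range of center."""

    def chebyshev(p: Position) -> int:
        # Largest per-axis offset from the search center.
        return max(abs(a - b) for a, b in zip(p, center))

    in_range = [p for p in coded_positions if chebyshev(p) <= search_range]
    in_range.sort(key=chebyshev)  # stable sort: nearest first, ties keep order
    return in_range[:n]
```

Because only already-coded nodes are searched, a decoder holding the same node parameter can repeat the search and obtain the same set.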

Optionally, the third generating module 505 is specifically configured to:

The context information corresponding to the node to be encoded is encoded through the context model corresponding to the point cloud to be encoded, a target code stream is generated, and a model index is written into the target code stream; the model index is used to characterize the context model corresponding to the point cloud to be encoded.
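Writing the model index into the target code stream can be sketched as follows. The one-byte index field placed at the start of the stream, and the name `write_target_stream`, are assumptions for illustration; the embodiment does not specify where or how the model index is carried.

```python
def write_target_stream(model_index: int, coded_payload: bytes) -> bytes:
    """Prefix the entropy-coded payload with a one-byte model index.

    Hypothetical layout: [model_index (1 byte)] [coded_payload ...].
    """
    if not 0 <= model_index <= 255:
        raise ValueError("model index must fit in one byte in this sketch")
    return bytes([model_index]) + coded_payload
```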

This device embodiment corresponds to the geometric coding method embodiment shown in FIG. 3 above. All implementation processes and implementation methods of the encoding end in the above method embodiment are applicable to this device embodiment and can achieve the same technical effect.

The geometric decoding method provided in the embodiment of the present application can be executed by a geometric decoding device. In the embodiment of the present application, a geometric decoding device executing the geometric decoding method is taken as an example to illustrate the geometric decoding device provided in the embodiment of the present application.

As shown in FIG. 6, the embodiment of the present application further provides a geometric decoding device 600, including:

The acquisition module 601 is used to acquire the target bit stream;

A decoding module 602 is used to decode the target code stream to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded;

A determination module 603 is used to determine, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded, wherein the node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1;

A first generating module 604 is used to generate context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

The second generating module 605 is used to perform geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generate reconstructed geometric information corresponding to the point cloud to be decoded.

Optionally, the node parameter is used to characterize the number of decoded nodes, and the determination module 603 is specifically used to:

Based on the decoding order of the node to be decoded, a maximum of N decoded nodes whose decoding order is before the node to be decoded are determined as the maximum of N decoded nodes associated with the node to be decoded.

Optionally, the determining module 603 is further specifically configured to:

Taking the geometric position corresponding to the node to be decoded as the search center, performing a search operation on the point cloud to be decoded, wherein the search range of the search operation is determined based on the node parameters;

The searched decoded nodes are determined as the maximum N decoded nodes associated with the node to be decoded.

Optionally, the second generating module 605 is specifically configured to:

The context information corresponding to the node to be decoded is decoded through the context model corresponding to the point cloud to be decoded, so as to generate reconstructed geometric information corresponding to the point cloud to be decoded; the context model is determined based on the model index carried by the target code stream.
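On the decoding side, selecting the context model from the model index carried by the target code stream can be sketched as follows. The one-byte index field at the start of the stream, the table `CONTEXT_MODELS` with its placeholder entries, and the name `select_context_model` are all hypothetical and serve only to illustrate the lookup.

```python
from typing import Dict, Tuple

# Hypothetical table of context models known to both encoder and decoder.
CONTEXT_MODELS: Dict[int, str] = {0: "model_a", 1: "model_b"}


def select_context_model(target_stream: bytes) -> Tuple[str, bytes]:
    """Read the model index and return (context model, remaining payload)."""
    if not target_stream:
        raise ValueError("empty target code stream")
    model_index = target_stream[0]  # assumed one-byte index field
    if model_index not in CONTEXT_MODELS:
        raise ValueError(f"unknown model index {model_index}")
    return CONTEXT_MODELS[model_index], target_stream[1:]
```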

The geometric decoding device provided in the embodiment of the present application can implement each process implemented by the method embodiment of Figure 4 and achieve the same technical effect. To avoid repetition, it will not be repeated here.

The geometric encoding device and the geometric decoding device in the embodiments of the present application may be electronic devices, such as electronic devices with an operating system, or components in electronic devices, such as integrated circuits or chips. The electronic device may be a terminal, or may be a device other than a terminal. Exemplarily, the terminal may include but is not limited to the types of terminals listed above, and the other device may be a server, a network attached storage (NAS) device, etc., which is not specifically limited in the embodiments of the present application.

Optionally, as shown in Figure 7, an embodiment of the present application also provides a communication device 700, including a processor 701 and a memory 702, and the memory 702 stores programs or instructions that can be executed on the processor 701. For example, when the communication device 700 is a terminal, the program or instructions are executed by the processor 701 to implement the various steps of the above-mentioned geometric encoding method embodiment, or to implement the various steps of the above-mentioned geometric decoding method embodiment, and can achieve the same technical effect.

The embodiment of the present application further provides a terminal, including a processor and a communication interface, wherein the processor is configured to perform the following operations:

Get the geometric information of the point cloud to be encoded;

Generate a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded;

For each node to be encoded, determine a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded;

Generate context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

Based on the context information corresponding to the node to be encoded, geometric encoding is performed on the node to be encoded to generate a target bitstream.
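The bounding-box generation step listed above can be sketched as follows, assuming integer point coordinates, a cube-shaped bounding box whose side is a power of two, and an octree as the multi-branch tree. These choices and the names `bounding_cube` and `child_occupancy_code` are assumptions for illustration; the embodiment covers multi-branch tree partitioning in general.

```python
from typing import Iterable, List, Tuple

Point = Tuple[int, int, int]


def bounding_cube(points: Iterable[Point]) -> Tuple[Point, int]:
    """Return (origin, side) of a power-of-two cube enclosing all points."""
    pts: List[Point] = list(points)
    if not pts:
        raise ValueError("point cloud is empty")
    origin = tuple(min(p[i] for p in pts) for i in range(3))
    extent = max(max(p[i] for p in pts) - origin[i] for i in range(3)) + 1
    side = 1
    while side < extent:
        side *= 2
    return origin, side


def child_occupancy_code(points: Iterable[Point], origin: Point, side: int) -> int:
    """Return the 8-bit occupancy code of the cube's eight child nodes."""
    if side < 2:
        raise ValueError("a cube of side 1 cannot be divided further")
    half = side // 2
    code = 0
    for x, y, z in points:
        ix = int(x - origin[0] >= half)
        iy = int(y - origin[1] >= half)
        iz = int(z - origin[2] >= half)
        # Child index is built from the three half-space bits (x, y, z).
        code |= 1 << ((ix << 2) | (iy << 1) | iz)
    return code
```

Applying `child_occupancy_code` recursively to each occupied child yields the nodes to be encoded, whose occupancy is then coded with the context derived from previously encoded nodes.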

Alternatively, the processor is used to perform the following operations:

Get the target code stream;

Decoding the target code stream to obtain a point cloud to be decoded;

For each node to be decoded, determine a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded;

Generate context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

Based on the context information corresponding to the node to be decoded, the node to be decoded is geometrically decoded to generate reconstructed geometric information corresponding to the point cloud to be decoded.

The terminal embodiment corresponds to the above-mentioned encoding end and decoding end method embodiments, and each implementation process and implementation mode of the above-mentioned method embodiment can be applied to the terminal embodiment and can achieve the same technical effect. Specifically, Figure 8 is a hardware structure diagram of a terminal implementing the embodiment of the present application.

The terminal 800 includes but is not limited to: a radio frequency unit 801, a network module 802, an audio output unit 803, an input unit 804, a sensor 805, a display unit 806, a user input unit 807, an interface unit 808, a memory 809, and a processor 810.

Those skilled in the art will appreciate that the terminal 800 may also include a power source (such as a battery) for supplying power to each component, and the power source may be logically connected to the processor 810 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The terminal structure shown in FIG. 8 does not constitute a limitation on the terminal, and the terminal may include more or fewer components than shown in the figure, or combine certain components, or arrange components differently, which will not be described in detail here.

It should be understood that in the embodiment of the present application, the input unit 804 may include a graphics processing unit (GPU) 8041 and a microphone 8042, and the graphics processing unit 8041 processes the image data of static pictures or videos obtained by an image capture device (such as a camera) in the video capture mode or the image capture mode. The display unit 806 may include a display panel 8061, and the display panel 8061 may be configured in the form of a liquid crystal display, an organic light emitting diode, etc. The user input unit 807 includes at least one of a touch panel 8071 and other input devices 8072. The touch panel 8071 is also called a touch screen. The touch panel 8071 may include two parts: a touch detection device and a touch controller. Other input devices 8072 may include, but are not limited to, a physical keyboard, function keys (such as a volume control key, a switch key, etc.), a trackball, a mouse, and a joystick, which will not be repeated here.

In the embodiment of the present application, after receiving downlink data from the network side device, the radio frequency unit 801 can transmit the data to the processor 810 for processing; in addition, the radio frequency unit 801 can send uplink data to the network side device. Generally, the radio frequency unit 801 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, etc.

The memory 809 can be used to store software programs or instructions and various data. The memory 809 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system, an application program or instruction required for at least one function (such as a sound playback function, an image playback function, etc.), etc. In addition, the memory 809 may include a volatile memory or a non-volatile memory, or the memory 809 may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), or another volatile memory. The memory 809 in the embodiment of the present application includes but is not limited to these and any other suitable types of memory.

The processor 810 may include one or more processing units; optionally, the processor 810 integrates an application processor and a modem processor, wherein the application processor mainly processes operations related to an operating system, a user interface, and application programs, and the modem processor mainly processes wireless communication signals, such as a baseband processor. It is understandable that the modem processor may not be integrated into the processor 810.

The processor 810 is configured to perform the following operations:

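Get the geometric information of the point cloud to be encoded;

Generate a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded;

For each node to be encoded, determine a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded;

Generate context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

Based on the context information corresponding to the node to be encoded, geometric encoding is performed on the node to be encoded to generate a target bitstream.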

Alternatively, the processor 810 is further configured to perform the following operations:

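Get the target code stream;

Decoding the target code stream to obtain a point cloud to be decoded;

For each node to be decoded, determine a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded;

Generate context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

Based on the context information corresponding to the node to be decoded, the node to be decoded is geometrically decoded to generate reconstructed geometric information corresponding to the point cloud to be decoded.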

An embodiment of the present application also provides a readable storage medium, on which a program or instruction is stored. When the program or instruction is executed by a processor, the various processes of the above-mentioned geometric encoding method embodiment are implemented, or the various processes of the above-mentioned geometric decoding method embodiment are implemented, and the same technical effect can be achieved. To avoid repetition, it will not be repeated here.

The processor is the processor in the terminal described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

An embodiment of the present application further provides a chip, which includes a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the various processes of the above-mentioned geometric encoding method embodiment, or to implement the various processes of the above-mentioned geometric decoding method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.

It should be understood that the chip mentioned in the embodiments of the present application can also be called a system-level chip, a system chip, a chip system or a system-on-chip, etc.

The embodiments of the present application further provide a computer program/program product, which is stored in a storage medium. The computer program/program product is executed by at least one processor to implement the various processes of the above-mentioned geometric encoding method embodiment, or to implement the various processes of the above-mentioned geometric decoding method embodiment, and can achieve the same technical effect. To avoid repetition, it will not be repeated here.

It should be noted that, in this document, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such process, method, article or device. In the absence of further restrictions, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article or device including the element. In addition, it should be noted that the scope of the methods and devices in the embodiments of the present application is not limited to performing functions in the order shown or discussed, and may also include performing functions in a substantially simultaneous manner or in reverse order according to the functions involved. For example, the described method may be performed in an order different from that described, and various steps may also be added, omitted, or combined. In addition, the features described with reference to certain examples may be combined in other examples.

Through the description of the above implementation methods, those skilled in the art can clearly understand that the above-mentioned embodiment methods can be implemented by means of software plus a necessary general hardware platform, and of course by hardware, but in many cases the former is a better implementation method. Based on such an understanding, the technical solution of the present application, or the part that contributes to the prior art, can be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), and includes a number of instructions for enabling a terminal (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to execute the methods described in each embodiment of the present application.

The embodiments of the present application are described above in conjunction with the accompanying drawings, but the present application is not limited to the above-mentioned specific implementations, which are merely illustrative and not restrictive. Under the guidance of the present application, those of ordinary skill in the art can also devise many other forms without departing from the purpose of the present application and the scope of protection of the claims, all of which fall within the protection of the present application.

Claims

A geometric encoding method, comprising:

The encoding end obtains the geometric information of the point cloud to be encoded;

The encoding end generates a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded; the bounding box includes at least two nodes to be encoded, and the nodes to be encoded are determined based on multi-branch tree division of the bounding box;

The encoder determines, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to the node parameters corresponding to the point cloud to be encoded, where the node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1;

The encoder generates context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

The encoding end performs geometric encoding on the node to be encoded based on context information corresponding to the node to be encoded to generate a target bitstream.
The method according to claim 1, wherein the node parameter is used to characterize the number of encoded nodes, and the determining, according to the node parameter corresponding to the point cloud to be encoded, a maximum of N encoded nodes associated with the node to be encoded comprises:

The encoding end determines, based on the encoding order of the node to be encoded, a maximum of N encoded nodes whose encoding order is before the node to be encoded as the maximum of N encoded nodes associated with the node to be encoded.
The method according to claim 1, wherein the determining, according to the node parameters corresponding to the point cloud to be encoded, a maximum of N encoded nodes associated with the node to be encoded comprises:

The encoder performs a search operation on the point cloud to be encoded with the geometric position corresponding to the node to be encoded as the search center, and the search range of the search operation is determined based on the node parameter;

The encoding end determines the searched encoded nodes as the maximum N encoded nodes associated with the node to be encoded.
The method according to any one of claims 1 to 3, wherein the step of geometrically encoding the node to be encoded based on the context information corresponding to the node to be encoded to generate a target bitstream comprises:

The encoding end encodes the context information corresponding to the node to be encoded through the context model corresponding to the point cloud to be encoded, generates a target code stream, and writes a model index into the target code stream; the model index is used to characterize the context model corresponding to the point cloud to be encoded.
The method according to any one of claims 1 to 4, wherein the node parameters are parameters agreed upon by a protocol, or the node parameters are determined based on pre-acquired indication information.
A geometric decoding method, comprising:

The decoding end obtains the target bitstream;

The decoding end decodes the target code stream to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded;

The decoding end determines, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded, wherein the node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1;

The decoding end generates context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

The decoding end performs geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generates reconstructed geometric information corresponding to the point cloud to be decoded.
The method according to claim 6, wherein the node parameter is used to characterize the number of decoded nodes, and the determining, according to the node parameter corresponding to the point cloud to be decoded, a maximum of N decoded nodes associated with the node to be decoded comprises:

The decoding end determines, based on the decoding order of the node to be decoded, a maximum of N decoded nodes whose decoding order is before the node to be decoded as the maximum of N decoded nodes associated with the node to be decoded.
The method according to claim 6, wherein the determining, according to the node parameters corresponding to the point cloud to be decoded, a maximum of N decoded nodes associated with the node to be decoded comprises:

The decoding end performs a search operation on the point cloud to be decoded with the geometric position corresponding to the node to be decoded as the search center, and the search range of the search operation is determined based on the node parameter;

The decoding end determines the searched decoded nodes as the maximum N decoded nodes associated with the node to be decoded.
The method according to any one of claims 6 to 8, wherein the step of geometrically decoding the node to be decoded based on the context information corresponding to the node to be decoded and generating the reconstructed geometric information corresponding to the point cloud to be decoded comprises:

The decoding end decodes the context information corresponding to the node to be decoded through the context model corresponding to the point cloud to be decoded, and generates reconstructed geometric information corresponding to the point cloud to be decoded; the context model is determined based on the model index carried by the target code stream.
The method according to any one of claims 6 to 9, wherein the node parameter is a parameter agreed upon by a protocol, or the node parameter is determined based on indication information carried by the target code stream.
A geometric encoding device, comprising:

The acquisition module is used to obtain the geometric information of the point cloud to be encoded;

A first generating module is used to generate a bounding box corresponding to the point cloud to be encoded according to the geometric information of the point cloud to be encoded; the bounding box includes at least two nodes to be encoded, and the nodes to be encoded are determined based on multi-branch tree partitioning of the bounding box;

A determination module, configured to determine, for each node to be encoded, a maximum of N encoded nodes associated with the node to be encoded according to a node parameter corresponding to the point cloud to be encoded, wherein the node to be encoded is a non-initial node in the point cloud to be encoded, and N is an integer greater than or equal to 1;

A second generating module, configured to generate context information corresponding to the node to be encoded according to the placeholder information of the maximum N encoded nodes;

The third generating module is used to perform geometric coding on the node to be coded based on the context information corresponding to the node to be coded, so as to generate a target bit stream.
The apparatus according to claim 11, wherein the node parameter is used to characterize the number of encoded nodes, and the determination module is specifically used to:

Based on the coding order of the node to be coded, a maximum of N coded nodes whose coding order is before the node to be coded are determined as the maximum of N coded nodes associated with the node to be coded.
The apparatus according to claim 11, wherein the determining module is further specifically configured to:

Taking the geometric position corresponding to the node to be encoded as the search center, performing a search operation on the point cloud to be encoded, wherein the search range of the search operation is determined based on the node parameter;

The searched coded nodes are determined as the maximum N coded nodes associated with the node to be coded.
The device according to any one of claims 11 to 13, wherein the third generating module is specifically used to:

The context information corresponding to the node to be encoded is encoded through the context model corresponding to the point cloud to be encoded, a target code stream is generated, and a model index is written into the target code stream; the model index is used to characterize the context model corresponding to the point cloud to be encoded.
The device according to any one of claims 11 to 14, wherein the node parameters are parameters agreed upon by a protocol, or the node parameters are determined based on pre-acquired indication information.
A geometric decoding device, comprising:

An acquisition module is used to acquire a target bitstream;

A decoding module, used for decoding the target code stream to obtain a point cloud to be decoded, wherein the point cloud to be decoded includes at least two nodes to be decoded;

A determination module, configured to determine, for each node to be decoded, a maximum of N decoded nodes associated with the node to be decoded according to the node parameters corresponding to the point cloud to be decoded, wherein the node to be decoded is a non-initial node in the point cloud to be decoded, and N is an integer greater than or equal to 1;

A first generating module, configured to generate context information corresponding to the node to be decoded according to the placeholder information of the maximum N decoded nodes;

The second generating module is used to perform geometric decoding on the node to be decoded based on the context information corresponding to the node to be decoded, and generate reconstructed geometric information corresponding to the point cloud to be decoded.
The apparatus according to claim 16, wherein the node parameter is used to characterize the number of decoded nodes, and the determination module is specifically used to:

Based on the decoding order of the node to be decoded, a maximum of N decoded nodes whose decoding order is before the node to be decoded are determined as the maximum of N decoded nodes associated with the node to be decoded.
The apparatus according to claim 16, wherein the determining module is further specifically configured to:

Taking the geometric position corresponding to the node to be decoded as the search center, a search operation is performed on the point cloud to be decoded. The search range of the search operation is determined based on the node parameters;

The searched decoded nodes are determined as the maximum N decoded nodes associated with the node to be decoded.
The device according to any one of claims 16 to 18, wherein the second generating module is specifically configured to:

The context information corresponding to the node to be decoded is decoded through the context model corresponding to the point cloud to be decoded, so as to generate reconstructed geometric information corresponding to the point cloud to be decoded; the context model is determined based on the model index carried by the target code stream.
The device according to any one of claims 16 to 19, wherein the node parameter is a parameter agreed upon by a protocol, or the node parameter is determined based on indication information carried by the target code stream.
A terminal comprises a processor and a memory, wherein the memory stores programs or instructions that can be run on the processor, and when the programs or instructions are executed by the processor, the steps of the geometric encoding method described in any one of claims 1 to 5 are implemented, or the steps of the geometric decoding method described in any one of claims 6 to 10 are implemented.
A readable storage medium storing a program or instruction, wherein the program or instruction, when executed by a processor, implements the steps of a geometric encoding method as described in any one of claims 1 to 5, or implements the steps of a geometric decoding method as described in any one of claims 6 to 10.
A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is used to run a program or instruction to implement the steps of the geometric encoding method as described in any one of claims 1 to 5, or to implement the steps of the geometric decoding method as described in any one of claims 6 to 10.