WO2023051783A1 - Encoding method, decoding method, apparatus, device, and readable storage medium - Google Patents

Encoding method, decoding method, apparatus, device, and readable storage medium

Info

Publication number
WO2023051783A1
Authority
WO
WIPO (PCT)
Prior art keywords
point
point cloud
target
matrix
sub
Prior art date
Application number
PCT/CN2022/123245
Other languages
French (fr)
Chinese (zh)
Inventor
冯亚楠
李琳
周冰
徐嵩
邢刚
马思伟
王苫社
徐逸群
胡玮
Original Assignee
咪咕文化科技有限公司
中国移动通信集团有限公司
北京大学
Priority date
Filing date
Publication date
Application filed by 咪咕文化科技有限公司 (Migu Culture Technology Co., Ltd.), 中国移动通信集团有限公司 (China Mobile Communications Group Co., Ltd.), 北京大学 (Peking University)
Publication of WO2023051783A1 publication Critical patent/WO2023051783A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • H04N19/517Processing of motion vectors by encoding
    • H04N19/52Processing of motion vectors by encoding by predictive encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present application relates to the technical field of image processing, and in particular, but not exclusively, to an encoding method, a decoding method, an apparatus, a device, and a readable storage medium.
  • Point cloud data consists of a large number of unordered three-dimensional points, each of which includes position information (X, Y, Z) and several items of attribute information (color, normal vector, etc.).
  • In order to facilitate the storage and transmission of point cloud data, point cloud compression technology has gradually become a focus of attention.
  • the prior art provides a scheme that selectively encodes one or more 3D point cloud blocks using inter-coding (e.g., motion compensation) techniques based on previously encoded/decoded frames; however, the encoding and other processing performance of this scheme is poor.
  • Embodiments of the present application provide an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, so as to improve processing performance.
  • An embodiment of the present application provides an encoding method, applied to an encoding device, including: clustering the point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and a corresponding point of the target point; using the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud; and quantizing and encoding the transformed sub-point clouds respectively to obtain an encoded code stream.
  • the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • the embodiment of the present application also provides a decoding method, applied to a decoding device, including: obtaining an encoded code stream; performing a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result; and obtaining a decoded code stream based on the transform result.
  • the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  • the embodiment of the present application also provides an encoding device, including:
  • the first acquisition module is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
  • the first generation module is configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and a corresponding point of the target point;
  • the first transform module is configured to use the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud;
  • the first encoding module is configured to quantize and encode the transformed sub-point clouds respectively to obtain an encoded code stream;
  • wherein the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • the embodiment of the present application also provides a decoding device, including:
  • the second obtaining module is configured to obtain the encoded code stream
  • the second transform module is configured to perform a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result;
  • the first decoding module is configured to obtain a decoded code stream based on the transform result;
  • wherein the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  • the embodiment of the present application also provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps in the encoding method or the decoding method described above.
  • the embodiment of the present application further provides a readable storage medium, where a program is stored on the readable storage medium, and when the program is executed by a processor, the steps in the encoding method or decoding method as described above are implemented.
  • the point cloud data to be processed in the current frame is clustered to obtain multiple sub-point clouds; for any target sub-point cloud, a generalized Laplacian matrix is generated according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point; and the generalized Laplacian matrix is used to perform inter-frame prediction and a graph Fourier residual transform on the multiple sub-point clouds respectively, so that an encoded code stream is obtained based on the transform result.
  • Since the generalized Laplacian matrix is generated using the Euclidean distances between points, the global correlation feature can be used to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the coding performance can be improved.
  • Since the performance of the encoding end is improved and the data to be decoded is correspondingly optimized, the decoding efficiency and performance of the decoding end can be improved accordingly.
  • Fig. 1 is a flow chart of the encoding method provided by the embodiment of the present application.
  • Fig. 2 and Fig. 3 are the schematic diagrams comparing the effect of the method of the embodiment of the present application and the method of the prior art;
  • Fig. 4 is a flow chart of the decoding method provided by the embodiment of the present application.
  • FIG. 5 is a structural diagram of an encoding device provided in an embodiment of the present application.
  • FIG. 6 is a structural diagram of a decoding device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of an encoding method provided by an embodiment of the present application, which is applied to an encoding device. As shown in Figure 1, the following steps are included:
  • Step 101 Cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds.
  • a 3D grid with a preset size is constructed, the point cloud data to be processed is placed in the constructed 3D grid to obtain the coordinates of each point, and each 3D grid cell containing points is used as a point cloud voxel, so that multiple point cloud voxels are obtained.
  • the coordinates and attribute information of each point cloud voxel can also be obtained.
  • the attribute information includes intensity, color, and so on.
  • the coordinates of a point cloud voxel can be the coordinates of the center point of the points in the point cloud voxel; the color information of a point cloud voxel can be the average of the color information of the points in the point cloud voxel.
  • the point cloud data to be processed can also be voxelized by means of an octree to obtain multiple point cloud voxels.
  • the method of uniform space division is used to cluster the point cloud data.
  • as the clustering method, for example, a K-means clustering method can be used.
  • the point cloud data to be processed is divided into multiple sub-point clouds based on the position information, partitioning the space uniformly.
  • Each sub-point cloud can be encoded independently.
  • Step 102 For any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point.
  • the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  • any sub-point cloud in the plurality of sub-point clouds can be used as the target sub-point cloud.
  • each target sub-point cloud is treated in the same way.
  • the target sub-point cloud may include multiple points, and every two points constitute a point pair in this embodiment.
  • the Euclidean distance between two points in each point pair is calculated. For example, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point is calculated.
  • the Euclidean distance d(i, j) between point i = (x_1, x_2, ..., x_n) and point j = (y_1, y_2, ..., y_n) can be calculated according to the following formula (1): d(i, j) = √((x_1 - y_1)^2 + (x_2 - y_2)^2 + ... + (x_n - y_n)^2).
  • 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
  • weights are calculated according to formula (2), and the weights are used to form the weight matrix W.
  • W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud; distance represents the Euclidean distance between the i-th point and the j-th point; σ is a constant not equal to 0, representing a tuning parameter.
  • the difference between the degree matrix and the weight matrix is used as a Laplacian matrix.
  • L = D − W, where L represents the Laplacian matrix, D represents the degree matrix, and W represents the weight matrix.
  • the diagonal elements of the degree matrix are d_i = Σ_j W_ij, and the other elements are 0.
  • d_i represents the i-th diagonal element of the degree matrix.
  • W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud.
  • the diagonal matrix is generated according to the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point.
  • the reference point cloud of the target sub-point cloud may first be determined in the reference frame. For example, motion estimation is performed in the reference frame to find a matching reference point cloud; the target sub-point cloud and the reference point cloud are in one-to-one correspondence. For example, using an iterative closest point algorithm, the reference point cloud of the target sub-point cloud can be determined in the reference frame based on Euclidean distances. Then, the diagonal matrix D_w is generated based on the Euclidean distance between each point in the target sub-point cloud and the corresponding point of that point in the reference point cloud.
  • the value on the i-th diagonal of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and other elements are 0.
  • point p is the corresponding point of the i-th point in the reference point cloud.
  • the sum of the diagonal matrix and the Laplacian matrix is used as the generalized Laplacian matrix.
  • Lg = L + D_w, where Lg represents the generalized Laplacian matrix, L represents the Laplacian matrix, and D_w represents the diagonal matrix.
  • Step 103 Using the generalized Laplacian matrix, perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud.
  • inter-frame prediction and graph Fourier residual transform can be understood as inter-frame prediction and graph Fourier residual transform based on Euclidean distance weights, which may include the following:
  • an inter-frame prediction method is used to predict the attribute value of the current frame by using the reference frame.
  • the attributes may include color, intensity, normal vector and so on.
  • the target attribute can be any attribute.
  • the attribute prediction value of the reference frame for the target attribute of the current frame is obtained according to formula (3).
  • x_t-1 indicates the attribute value of the target attribute of the reference frame.
  • Lg indicates the generalized Laplacian matrix.
  • the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame for the target attribute of the current frame can be used as the residual, as in formula (4): δ = x_t - x̂_t.
  • δ represents the residual of the target attribute of the current frame, x̂_t represents the attribute prediction value of the reference frame for the target attribute of the current frame, and x_t represents the attribute value of the target attribute of the current frame.
  • the residual is obtained through the inter-frame prediction method, so that as much of the difference between the two frames as possible is captured. Since the parts that are identical between the two frames do not require additional processing, computing the residual saves bit rate.
  • a transformation matrix is obtained by using the generalized Laplacian matrix, and then the residual of the target attribute of the current frame is transformed by using the transformation matrix.
  • Lg represents the generalized Laplacian matrix, Φ represents the transformation matrix obtained from Lg, θ represents the transform result, and δ represents the residual of the target attribute of the current frame.
  • the processing method for other sub-point clouds is the same as that for the target sub-point cloud.
  • Step 104 Quantize and code the transformed sub-point clouds respectively to obtain coded code streams.
  • the color can be decomposed into three 3×1 vectors (for example, in the YUV (luminance, blue-difference chrominance, red-difference chrominance) color space model or the RGB (red, green, blue) color space model).
  • taking the Y component as an example, the attribute values of the current frame are predicted according to the process in S1031, and a residual is generated according to S1032. The residual is then transformed according to S1033.
  • the transformed Y component is uniformly quantized and arithmetically encoded to obtain a code stream. Each component can be processed in the same way as the Y component.
  • since the generalized Laplacian matrix is generated using the Euclidean distances between points, the global correlation feature can be used to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the coding performance can be improved.
  • Figure 2 shows a performance comparison between the method of the embodiment of the present application, the Region-Adaptive Hierarchical Transform (RAHT) method, and the main-direction weighted graph Fourier transform (NWCFT) method.
  • Figure 2 shows the comparison in terms of rate-distortion performance, where benchmark data sets 1 to 9 are the point cloud data Longdress, Loot, Redandblack, Soldier, Andrew, David, Phil, Ricardo and Sarah, respectively.
  • in addition, a data comparison with the RAHT method was performed, as shown in Figure 3.
  • FIG. 4 is a flowchart of a decoding method provided by an embodiment of the present application, which is applied to a decoding device. As shown in Figure 4, the following steps are included:
  • Step 401 Acquire an encoded code stream.
  • the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  • Step 402 Perform a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result.
  • Step 403 Obtain a decoded code stream based on the transform result.
  • since the generalized Laplacian matrix is generated using the Euclidean distances between points, the global correlation feature can be used to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the coding performance can be improved. Since the performance of the encoding end is improved and the data to be decoded is correspondingly optimized, the decoding efficiency and performance of the decoding end can be improved accordingly.
  • FIG. 5 is a structural diagram of an encoding device provided by an embodiment of the present application. Since the problem-solving principle of the encoding device is similar to the encoding method in the embodiment of the present application, reference may be made to the implementation of the method for the implementation of the encoding device.
  • the encoding device 500 includes:
  • the first acquisition module 501 is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
  • the first generation module 502 is configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and a corresponding point of the target point;
  • the first transform module 503 is configured to use the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud;
  • the first encoding module 504 is configured to quantize and encode the transformed multiple sub-point clouds respectively to obtain an encoded code stream; wherein the corresponding point is located in the reference point cloud of the target sub-point cloud, and the reference point cloud is located in the reference frame of the current frame.
  • the first acquisition module 501 includes: a first processing submodule, configured to voxelize the point cloud data to be processed to obtain point cloud voxels; and a first acquisition submodule, configured to cluster the voxelized point cloud data to obtain multiple sub-point clouds.
  • the first generation module 502 includes: a second acquisition submodule, configured to obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud; a third acquisition submodule, configured to obtain a Laplacian matrix according to the degree matrix and the weight matrix; a first generation submodule, configured to generate a diagonal matrix; and a second generation submodule, configured to obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
  • the second acquisition submodule includes: a first calculation unit, configured to calculate the Euclidean distance between the i-th point and the j-th point in the target sub-point cloud; and a first acquisition unit, configured to calculate the weights according to formula (2) and form the weight matrix from the weights, where W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud, distance represents the Euclidean distance between the i-th point and the j-th point, σ is a constant not equal to 0 representing an adjustment parameter, 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
  • the first generation submodule includes: a first determination unit, configured to determine a reference point cloud of the target sub-point cloud in the reference frame; and a first generation unit, configured to generate the diagonal matrix according to the Euclidean distance between each point in the target sub-point cloud and the corresponding point of that point in the reference point cloud, where the value of the i-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and point p is the corresponding point of the i-th point in the reference point cloud.
  • the first determination unit is configured to determine the reference point cloud of the target sub-point cloud in the reference frame by using an iterative closest point algorithm.
  • the first transform module 503 includes: a fourth acquisition submodule, configured to acquire the attribute prediction value of the reference frame for the target attribute of the current frame; a third generation submodule, configured to generate a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame for the target attribute of the current frame; and a first transform submodule, configured to transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
  • the fourth acquisition submodule is configured to obtain the attribute prediction value of the reference frame for the target attribute of the current frame according to formula (3), where x̂_t indicates the attribute prediction value of the reference frame for the target attribute of the current frame, x_t-1 indicates the attribute value of the target attribute of the reference frame, and Lg indicates the generalized Laplacian matrix.
  • the third generation submodule is configured to use the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame for the target attribute of the current frame as the residual.
  • the first transform submodule includes: a second acquisition unit, configured to obtain a transformation matrix by using the generalized Laplacian matrix; and a first transform unit, configured to transform the residual of the target attribute of the current frame by using the transformation matrix.
  • the second acquisition unit is configured to solve formula (5), in which Lg represents the generalized Laplacian matrix, to obtain the transformation matrix Φ.
  • the first transform unit is configured to obtain the transform result according to formula (6), where θ represents the transform result, Φ represents the transformation matrix, and δ represents the residual of the target attribute of the current frame.
  • the encoding device 500 provided in the embodiment of the present application can implement the corresponding embodiment of the above-mentioned encoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
  • FIG. 6 is a structural diagram of a decoding device provided by an embodiment of the present application. Since the principle by which the decoding device solves the problem is similar to that of the decoding method in the embodiment of the present application, the implementation of the decoding device can refer to the implementation of the method, and repeated descriptions will not be given.
  • the decoding device 600 includes:
  • the second obtaining module 601 is configured to obtain an encoded code stream
  • the second transform module 602 is configured to perform a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result;
  • the first decoding module 603 is configured to obtain a decoded code stream based on the transform result;
  • wherein the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
  • the second transform module includes: a second processing submodule, configured to dequantize the encoded code stream; and a second transform submodule, configured to perform a graph Fourier inverse transform based on Euclidean distance weights on the dequantized encoded code stream to obtain the transform result.
  • the second transform submodule is configured to perform the graph Fourier inverse transform based on Euclidean distance weights on the dequantized encoded code stream according to a formula in which the inverse-transformed residual value is obtained from the transformation matrix Φ, the quantized residual value of the target attribute of the current frame, and the inverse quantization coefficient (a hedged sketch of this decoder-side step is given after this list).
  • the decoding device 600 provided in the embodiment of the present application can execute the corresponding embodiment of the above-mentioned decoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a processor-readable storage medium.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • the embodiment of the present application also provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor, when executing the program, implements the steps in the encoding method or the decoding method described above.
  • the embodiment of the present application also provides a readable storage medium, on which a program is stored, and when the program is executed by a processor, each process of the above encoding or decoding method embodiment can be achieved, and the same technical effect can be achieved. To avoid repetition, it is not described here.
  • the readable storage medium can be any available medium or data storage device that can be accessed by the processor, including but not limited to magnetic storage (such as a floppy disk, hard disk, magnetic tape, or magneto-optical (MO) disc), optical storage (such as a Compact Disc (CD), Digital Versatile Disc (DVD), Blu-ray Disc (BD), or High-Definition Versatile Disc (HVD)), and semiconductor memory (such as ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Non-Volatile Memory (NVM), or a Solid State Disk (SSD)).
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present application.
  • the application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relates to the technical field of image processing, and aims to improve processing performance.
  • the method includes: clustering the point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point; using the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud; and quantizing and encoding the transformed sub-point clouds respectively to obtain an encoded code stream; wherein the corresponding point is located in the reference point cloud of the target sub-point cloud, and the reference point cloud is located in the reference frame of the current frame.
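The decoder-side steps above (dequantization, graph Fourier inverse transform, adding back the inter-frame prediction) can be sketched as follows. This is a minimal Python sketch under stated assumptions: the exact inverse-transform formula is only available as an image in this text, so the transformation matrix Phi is assumed to be an orthonormal graph Fourier basis (inverse transform Phi @ theta), and all function and parameter names are hypothetical.
```python
import numpy as np

def decode_sub_point_cloud(q_theta, Phi, x_ref_pred, step=1.0):
    """Decoder-side sketch: dequantize, apply the graph Fourier inverse
    transform, and add back the inter-frame prediction.

    Assumptions: Phi is an orthonormal graph Fourier basis, so the inverse
    transform is Phi @ theta; `step` plays the role of the inverse
    quantization coefficient; x_ref_pred is the attribute prediction
    obtained from the reference frame in the same way as at the encoder.
    """
    theta_hat = q_theta.astype(np.float64) * step   # inverse quantization
    delta_hat = Phi @ theta_hat                     # graph Fourier inverse transform (assumed form)
    x_rec = x_ref_pred + delta_hat                  # reconstructed attribute values
    return x_rec
```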

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The present application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relating to the technical field of image processing, so as to improve processing performance. The method comprises: clustering point cloud data to be processed in a current frame to obtain multiple sub-point clouds; for any target sub-point cloud of the multiple sub-point clouds, generating a generalized Laplacian matrix according to Euclidean distances between multiple point pairs in the target sub-point cloud and a Euclidean distance between a target point in the target sub-point cloud and a point corresponding to the target point; using the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud; and separately quantizing and encoding the transformed multiple sub-point clouds to obtain an encoded code stream; the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.

Description

Encoding method, decoding method, apparatus, device, and readable storage medium
Cross-Reference to Related Applications
This application is based on the Chinese patent application with application number 202111160289.X, filed on September 30, 2021, by applicants Migu Culture Technology Co., Ltd., China Mobile Communications Group Co., Ltd., and Peking University, entitled "Encoding method, decoding method, apparatus, device and readable storage medium", and claims priority to that Chinese patent application, the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular, but not exclusively, to an encoding method, a decoding method, an apparatus, a device, and a readable storage medium.
Background
With the development of computer hardware and algorithms, the acquisition of three-dimensional point cloud data is becoming more and more convenient, and the volume of point cloud data is also increasing. Point cloud data consists of a large number of unordered three-dimensional points, each of which includes position information (X, Y, Z) and several items of attribute information (color, normal vector, etc.).
In order to facilitate the storage and transmission of point cloud data, point cloud compression technology has gradually become a focus of attention. The prior art provides a scheme that selectively encodes one or more 3D point cloud blocks using inter-coding (e.g., motion compensation) techniques based on previously encoded/decoded frames. However, the encoding and other processing performance of this scheme is poor.
Summary
Embodiments of the present application provide an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, so as to improve processing performance.
An embodiment of the present application provides an encoding method, applied to an encoding device, including:
clustering the point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and a corresponding point of the target point;
using the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud; and
quantizing and encoding the transformed sub-point clouds respectively to obtain an encoded code stream;
wherein the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
An embodiment of the present application further provides a decoding method, applied to a decoding device, the method including:
obtaining an encoded code stream;
performing a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result; and
obtaining a decoded code stream based on the transform result;
wherein the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
An embodiment of the present application further provides an encoding apparatus, including:
a first acquisition module, configured to cluster the point cloud data to be processed in a current frame to obtain multiple sub-point clouds;
a first generation module, configured to, for any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and a corresponding point of the target point;
a first transform module, configured to use the generalized Laplacian matrix to perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud; and
a first encoding module, configured to quantize and encode the transformed sub-point clouds respectively to obtain an encoded code stream;
wherein the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
An embodiment of the present application further provides a decoding apparatus, including:
a second acquisition module, configured to obtain an encoded code stream;
a second transform module, configured to perform a graph Fourier inverse transform based on Euclidean distance weights on the encoded code stream to obtain a transform result; and
a first decoding module, configured to obtain a decoded code stream based on the transform result;
wherein the encoded code stream is obtained by an encoding device by encoding the result of performing inter-frame prediction and a graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix.
An embodiment of the present application further provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, wherein when the processor executes the program, the steps in the encoding method or the decoding method described above are implemented.
An embodiment of the present application further provides a readable storage medium on which a program is stored, wherein when the program is executed by a processor, the steps in the encoding method or the decoding method described above are implemented.
In the embodiments of the present application, the point cloud data to be processed in the current frame is clustered to obtain multiple sub-point clouds; for any target sub-point cloud, a generalized Laplacian matrix is generated according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point; and the generalized Laplacian matrix is used to perform inter-frame prediction and a graph Fourier residual transform on the multiple sub-point clouds respectively, so that an encoded code stream is obtained based on the transform result. Since the generalized Laplacian matrix is generated using the Euclidean distances between points, the embodiments of the present application can use the global correlation feature to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the encoding performance can be improved. At the same time, since the performance of the encoding end is improved and the data to be decoded is correspondingly optimized, the decoding efficiency and performance of the decoding end can be improved accordingly.
Brief Description of the Drawings
Fig. 1 is a flowchart of an encoding method provided by an embodiment of the present application;
Fig. 2 and Fig. 3 are schematic diagrams comparing the effect of the method of an embodiment of the present application with methods of the prior art;
Fig. 4 is a flowchart of a decoding method provided by an embodiment of the present application;
Fig. 5 is a structural diagram of an encoding apparatus provided by an embodiment of the present application;
Fig. 6 is a structural diagram of a decoding apparatus provided by an embodiment of the present application.
Detailed Description
In the embodiments of the present application, the term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, both A and B exist, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The term "multiple" in the embodiments of the present application refers to two or more, and other quantifiers are interpreted similarly.
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present application.
Referring to Fig. 1, Fig. 1 is a flowchart of an encoding method provided by an embodiment of the present application, applied to an encoding device. As shown in Fig. 1, the method includes the following steps:
Step 101: Cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds.
In this step, the point cloud data to be processed is voxelized to obtain point cloud voxels, and the voxelized point cloud data is then clustered to obtain multiple sub-point clouds.
Exemplarily, a three-dimensional grid of a preset size is constructed, the point cloud data to be processed is placed in the constructed three-dimensional grid to obtain the coordinates of each point, and each three-dimensional grid cell containing points is used as a point cloud voxel, so that multiple point cloud voxels are obtained. In addition, the coordinates and attribute information of each point cloud voxel can also be obtained, where the attribute information includes intensity, color, and so on. In the embodiments of the present application, the coordinates of a point cloud voxel may be the coordinates of the center point of the points in the point cloud voxel, and the color information of a point cloud voxel may be the average of the color information of the points in the point cloud voxel. In practical applications, the point cloud data to be processed can also be voxelized by means of an octree to obtain multiple point cloud voxels.
The point cloud data is clustered using a spatially uniform partitioning method. As the clustering method, for example, a K-means clustering method can be used.
In the embodiments of the present application, the point cloud data to be processed is divided into multiple sub-point clouds based on the position information, partitioning the space uniformly. Each sub-point cloud can be encoded independently.
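As an illustration of Step 101, the following Python sketch voxelizes a point cloud on a uniform grid and partitions the voxels into sub-point clouds with K-means. The voxel size and the number of clusters are assumed parameters not specified in the text, and the helper names are hypothetical.
```python
import numpy as np
from sklearn.cluster import KMeans

def voxelize(points, attrs, voxel_size=1.0):
    """Snap points to a uniform 3D grid; one voxel per occupied cell.

    Returns the centroid coordinates and the mean attribute (e.g. color)
    of the points falling into each occupied cell.
    """
    keys = np.floor(points / voxel_size).astype(np.int64)            # grid cell index per point
    _, inverse, counts = np.unique(keys, axis=0,
                                   return_inverse=True, return_counts=True)
    vox_xyz = np.zeros((counts.shape[0], 3))
    vox_attr = np.zeros((counts.shape[0], attrs.shape[1]))
    np.add.at(vox_xyz, inverse, points)
    np.add.at(vox_attr, inverse, attrs)
    vox_xyz /= counts[:, None]                                        # center point of each voxel
    vox_attr /= counts[:, None]                                       # average attribute per voxel
    return vox_xyz, vox_attr

def cluster_sub_point_clouds(vox_xyz, n_clusters=64):
    """Partition the voxelized cloud into sub-point clouds with K-means on position."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vox_xyz)
```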
Step 102: For any target sub-point cloud among the multiple sub-point clouds, generate a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point.
The corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
Any sub-point cloud among the multiple sub-point clouds can serve as the target sub-point cloud. In practical applications, each target sub-point cloud is processed in the same way.
Exemplarily, this step may include the following:
S1021: Obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud.
The target sub-point cloud may include multiple points, and every two points constitute a point pair in this embodiment. In the embodiments of the present application, the Euclidean distance between the two points in each point pair is calculated. For example, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point is calculated. Exemplarily, the Euclidean distance d(i, j) between point i = (x_1, x_2, ..., x_n) and point j = (y_1, y_2, ..., y_n) can be calculated according to the following formula (1):
d(i, j) = √((x_1 - y_1)^2 + (x_2 - y_2)^2 + ... + (x_n - y_n)^2)   (1)
where 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
Exemplarily, the weights are calculated according to the following formula (2), and the weights are used to form the weight matrix W:
(Formula (2): image PCTCN2022123245-appb-000002)
where W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud, distance represents the Euclidean distance between the i-th point and the j-th point, and σ is a constant not equal to 0, representing a tuning parameter.
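The following sketch computes the pairwise Euclidean distances of formula (1) and assembles a weight matrix. Since formula (2) is only available as an image here, the Gaussian kernel exp(-distance^2 / σ^2) is used as an assumed, commonly chosen distance-based edge weight; the function name and parameters are illustrative only.
```python
import numpy as np

def weight_matrix(points, sigma=1.0):
    """Pairwise Euclidean distances (formula (1)) and a dense weight matrix W.

    The exact form of formula (2) is not legible in this text; a Gaussian
    kernel of the distance, exp(-d^2 / sigma^2), is assumed here, with
    sigma playing the role of the tuning parameter.
    """
    diff = points[:, None, :] - points[None, :, :]   # (M, M, 3) coordinate differences
    dist = np.sqrt((diff ** 2).sum(-1))              # d(i, j), formula (1)
    W = np.exp(-(dist ** 2) / (sigma ** 2))          # assumed weight kernel for formula (2)
    np.fill_diagonal(W, 0.0)                         # no self-loops
    return dist, W
```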
S1022: Obtain a Laplacian matrix according to the degree matrix and the weight matrix.
In this step, the difference between the degree matrix and the weight matrix is used as the Laplacian matrix.
Exemplarily: L = D − W, where L represents the Laplacian matrix, D represents the degree matrix, and W represents the weight matrix.
The diagonal elements of the degree matrix are d_i = Σ_j W_ij, and the other elements are 0, where d_i represents the i-th diagonal element of the degree matrix and W_ij represents the weight corresponding to the edge from the i-th point to the j-th point in the target sub-point cloud.
S1023: Generate a diagonal matrix.
The diagonal matrix is generated according to the Euclidean distance between each target point in the target sub-point cloud and the corresponding point of that target point.
Exemplarily, the reference point cloud of the target sub-point cloud may first be determined in the reference frame; for example, motion estimation is performed in the reference frame to find a matching reference point cloud, where the target sub-point cloud and the reference point cloud are in one-to-one correspondence. For example, using an iterative closest point algorithm, the reference point cloud of the target sub-point cloud can be determined in the reference frame based on Euclidean distances. Then, the diagonal matrix D_w is generated based on the Euclidean distance between each point in the target sub-point cloud and the corresponding point of that point in the reference point cloud. The i-th diagonal element of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, and the other elements are 0, where point p is the corresponding point of the i-th point in the reference point cloud.
S1024: Obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
In this step, the sum of the diagonal matrix and the Laplacian matrix is used as the generalized Laplacian matrix.
Exemplarily, Lg = L + D_w, where Lg represents the generalized Laplacian matrix, L represents the Laplacian matrix, and D_w represents the diagonal matrix.
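A minimal sketch of S1022 to S1024 follows, assuming a nearest-neighbour search stands in for the iterative-closest-point correspondence described above: it forms L = D − W, builds the diagonal matrix D_w from the reciprocals of the distances to the corresponding reference points, and returns Lg = L + D_w.
```python
import numpy as np
from scipy.spatial import cKDTree

def generalized_laplacian(points_t, points_ref, W, eps=1e-12):
    """Build Lg = L + D_w for one target sub-point cloud.

    L = D - W is the graph Laplacian of the intra-frame weight matrix W.
    D_w is diagonal; its i-th entry is the reciprocal of the Euclidean
    distance between point i and its corresponding point p in the
    reference point cloud (found here by nearest-neighbour search, a
    simple stand-in for the ICP matching mentioned in the text).
    """
    D = np.diag(W.sum(axis=1))                              # degree matrix, d_i = sum_j W_ij
    L = D - W                                               # Laplacian matrix
    dist_to_ref, _ = cKDTree(points_ref).query(points_t)    # |x_i - p| for each target point
    D_w = np.diag(1.0 / np.maximum(dist_to_ref, eps))       # reciprocal distances on the diagonal
    return L + D_w, D_w
```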
Step 103: Using the generalized Laplacian matrix, perform inter-frame prediction and a graph Fourier residual transform on the target sub-point cloud.
In this step, the inter-frame prediction and graph Fourier residual transform can be understood as inter-frame prediction and a graph Fourier residual transform based on Euclidean distance weights, which may include the following:
S1031: Obtain the attribute prediction value of the reference frame for the target attribute of the current frame.
In the embodiments of the present application, an inter-frame prediction method is adopted, and the reference frame is used to predict the attribute values of the current frame. The attributes may include color, intensity, normal vector, and so on, and the target attribute may be any one of these attributes.
Exemplarily, in this step, the attribute prediction value of the reference frame for the target attribute of the current frame is obtained according to the following formula (3):
(Formula (3): image PCTCN2022123245-appb-000003)
where x̂_t represents the attribute prediction value of the reference frame for the target attribute of the current frame, x_t-1 represents the attribute value of the target attribute of the reference frame, and Lg represents the generalized Laplacian matrix.
S1032、根据所述当前帧的目标属性的属性值、所述参考帧对所述当前帧的目标属性的属性预测值,生成所述当前帧的目标属性的残差。S1032. Generate a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame by the reference frame.
示例性地,在此,可利用所述当前帧的目标属性的属性值与所述参考帧对所述当前帧的目标属性的属性预测值之差,作为所述残差,如公式(4):Exemplarily, here, the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame by the reference frame can be used as the residual, as shown in formula (4) :
Figure PCTCN2022123245-appb-000005
Figure PCTCN2022123245-appb-000005
其中,δ表示所述当前帧的目标属性的残差,
Figure PCTCN2022123245-appb-000006
表示参考帧对当前帧的目标属性的属性预测值,x t表示当前帧的目标属性的属性值。
where δ represents the residual of the target attribute of the current frame,
Figure PCTCN2022123245-appb-000006
Indicates the attribute prediction value of the target attribute of the reference frame to the current frame, x t represents the attribute value of the target attribute of the current frame.
通过帧间预测方法获得残差,从而能够尽可能多的获得两帧之间的差异。由于两帧之间相同的部分无需额外处理,从而通过计算残差的方式, 可节省码率。The residual is obtained through an inter-frame prediction method, so as to obtain as much difference between two frames as possible. Since the same part between two frames does not require additional processing, the bit rate can be saved by calculating the residual.
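下面给出一个仅作说明的Python（NumPy）示意片段：由于公式(3)在本页中仅以图像形式出现，其精确表达无法转写，这里采用文献中常见且与本文符号一致的一种表达 x̂_t = (L + D_w)^{-1}·D_w·x_{t-1} 作为假设性写法进行帧间预测，再按正文所述以 δ = x_t − x̂_t 计算残差；该表达式并不一定是本申请的公式(3)。The following illustrative Python (NumPy) sketch is an assumption rather than a transcription: formula (3) appears only as an image here, so the sketch uses a formulation that is common in the literature and consistent with the symbols defined above, x̂_t = (L + D_w)^{-1}·D_w·x_{t-1}, for the inter-frame prediction, and then forms the residual δ = x_t − x̂_t as stated in the text; this expression is not necessarily the patent's formula (3).

import numpy as np

def predict_and_residual(L, D_w, x_prev, x_curr):
    # L: (M, M) Laplacian matrix of the target sub-point cloud
    # D_w: (M, M) diagonal matrix built from the point-to-corresponding-point distances
    # x_prev: (M,) attribute values of the target attribute in the reference frame
    # x_curr: (M,) attribute values of the target attribute in the current frame
    Lg = L + D_w                                # generalized Laplacian, Lg = L + D_w
    # Assumed prediction form, not necessarily the patent's formula (3):
    x_hat = np.linalg.solve(Lg, D_w @ x_prev)   # x_hat_t = (L + D_w)^{-1} D_w x_{t-1}
    delta = x_curr - x_hat                      # residual, delta = x_t - x_hat_t (formula (4))
    return x_hat, delta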
S1033、基于所述广义拉普拉斯矩阵对所述当前帧的目标属性的残差进行变换。S1033. Transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
在此步骤中,利用所述广义拉普拉斯矩阵得到变换矩阵,之后,利用所述变换矩阵对所述当前帧的目标属性的残差进行变换。In this step, a transformation matrix is obtained by using the generalized Laplacian matrix, and then the residual of the target attribute of the current frame is transformed by using the transformation matrix.
示例性地,对以下公式(5)求解,得到变换矩阵:Exemplarily, the following formula (5) is solved to obtain the transformation matrix:
[公式(5)在原文中以图像形式给出：PCTCN2022123245-appb-000007]
其中，Lg表示广义拉普拉斯矩阵，式中的另一符号（原文图像PCTCN2022123245-appb-000008，下文记为Φ）表示变换矩阵。
[Formula (5) is given as an image in the original: PCTCN2022123245-appb-000007]
where Lg denotes the generalized Laplacian matrix, and the other symbol in the formula (image PCTCN2022123245-appb-000008 in the original, denoted Φ below) denotes the transform matrix.
利用以下公式(6)，得到所述变换结果：The transform result is obtained by using the following formula (6):
[公式(6)在原文中以图像形式给出：PCTCN2022123245-appb-000009]
其中，θ表示变换结果，Φ（原文图像PCTCN2022123245-appb-000010）表示变换矩阵，δ表示所述当前帧的目标属性的残差。
[Formula (6) is given as an image in the original: PCTCN2022123245-appb-000009]
where θ denotes the transform result, Φ (image PCTCN2022123245-appb-000010 in the original) denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
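下面给出一个仅作说明的Python（NumPy）示意片段：按图傅里叶变换的通常做法，变换矩阵取广义拉普拉斯矩阵Lg的特征向量矩阵，变换结果为其转置与残差的乘积；由于公式(5)、(6)在本页中均为图像，这一具体取法是一个假设，并非对原文公式的转写。The following illustrative Python (NumPy) sketch follows the usual graph Fourier transform convention, taking the transform matrix as the eigenvector matrix of the generalized Laplacian Lg and the transform result as its transpose applied to the residual; since formulas (5) and (6) appear only as images here, this specific choice is an assumption rather than a transcription of the original formulas.

import numpy as np

def graph_fourier_transform(Lg, delta):
    # Lg is symmetric, so np.linalg.eigh returns real eigenvalues and orthonormal eigenvectors.
    _, Phi = np.linalg.eigh(Lg)     # Phi: assumed transform matrix (eigenvector matrix of Lg)
    theta = Phi.T @ delta           # transform coefficients of the residual
    return theta, Phi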
在本申请实施例中，在传统的图傅里叶变换的基础上，引入广义图傅里叶变换的概念，对于点云数据的帧间属性进行预测与残差变换，从而能够去除数据间的冗余，提高编码效率。In the embodiment of the present application, the concept of the generalized graph Fourier transform is introduced on the basis of the traditional graph Fourier transform, and the inter-frame attributes of the point cloud data are predicted and residual-transformed, so that redundancy between the data can be removed and the coding efficiency improved.
其中，对其他子点云的处理方式和对目标子点云的处理方式相同。The other sub-point clouds are processed in the same way as the target sub-point cloud.
步骤104、分别对变换后的多个子点云进行量化和编码,得到编码码流。Step 104: Quantize and code the transformed sub-point clouds respectively to obtain coded code streams.
在此步骤中,对变换后的多个子点云进行均匀量化和算数编码,得到编码码流。In this step, uniform quantization and arithmetic coding are performed on the transformed sub-point clouds to obtain coded streams.
以目标属性为颜色为例，在此，可将颜色分解为三个3×1的向量（例如：色彩空间模型（Luminance Chrominance-Blue Chrominance-Red，YUV）或（Red Green Blue，RGB））。以Y分量为例，按照S1031中的过程对当前帧的属性值进行预测，并按照S1032生成残差。之后，利用S1033对残差进行变换。变换后的Y分量进行均匀量化和算数编码，得到码流。对于每个分量，都可按照Y分量相同的处理方式进行处理。Taking color as the target attribute as an example, the color can be decomposed into three 3×1 vectors (for example, in the YUV (Luminance, Blue Chrominance, Red Chrominance) or RGB (Red, Green, Blue) color space). Taking the Y component as an example, the attribute values of the current frame are predicted according to the process in S1031, the residual is generated according to S1032, and the residual is then transformed according to S1033. The transformed Y component is uniformly quantized and arithmetically coded to obtain the code stream. Each component can be processed in the same way as the Y component.
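下面给出一个仅作说明的Python（NumPy）示意片段：对单个颜色分量变换后的系数做均匀量化；其中的量化步长qstep是为举例而假设的参数，示意中省略了算术编码环节（实际码流还需对量化结果做熵编码）。The following illustrative Python (NumPy) sketch uniformly quantizes the transformed coefficients of a single color component; the quantization step qstep is a hypothetical parameter introduced for the example, and the arithmetic coding stage is omitted from the sketch (in practice the quantized values would still be entropy-coded into the code stream).

import numpy as np

def quantize_component(theta, qstep):
    # theta: transformed coefficients of one color component (e.g. the Y component)
    # qstep: hypothetical uniform quantization step, not a value taken from the patent
    return np.round(theta / qstep).astype(np.int32)

# Hypothetical usage, one component at a time (Y, U, V); an arithmetic coder would
# then entropy-code the quantized values into the encoded code stream:
# q_y = quantize_component(theta_y, qstep=16)
# q_u = quantize_component(theta_u, qstep=16)
# q_v = quantize_component(theta_v, qstep=16)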
在本申请实施例中，由于广义拉普拉斯矩阵是利用点与点之间的欧式距离生成的，因此，在本申请实施例中可利用全局相关性特征，更加充分的表达出点与点之间的相关性，从而能够尽可能的去除点云数据之间的相似之处，进而能够提升编码性能。In the embodiment of the present application, since the generalized Laplacian matrix is generated from the Euclidean distances between points, the global correlation characteristics can be exploited to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the coding performance can thereby be improved.
在实际应用中，在实际点云序列上进行了测试。在测试中，首先针对16帧动态点云进行了测试，图2中显示了本申请实施例的方法与区域自适应层级变换（Region Adaptive Hierarchical Transform，RAHT）、主方向权重图傅里叶变换（Main direction Weight Chart Fourier Transform，NWCFT）方法性能的对比。其中，如图2所示，给出了9个不同点云数据，即基准数据集1至9，分别基于方法一（RAHT）、方法二（NWCFT）以及方法三（本申请实施例提供的编码方法）确定的速率失真性能的比较，这里，基准数据集1至9可以分别是点云数据：Longdress、Loot、Redandblack、Soldier、Andrew、David、Phil、Ricardo以及Sarah。为了量化增益，在实验中，又做了和RAHT方法的数据对比，如图3所示。In practical applications, tests were performed on real point cloud sequences. The tests were first run on 16 frames of dynamic point clouds. Figure 2 compares the performance of the method of the embodiment of the present application with the Region Adaptive Hierarchical Transform (RAHT) method and the main-direction-weighted graph Fourier transform (NWCFT) method. As shown in Figure 2, nine different point cloud data sets are given, namely benchmark data sets 1 to 9, and the rate-distortion performance determined with method 1 (RAHT), method 2 (NWCFT) and method 3 (the encoding method provided by the embodiment of the present application) is compared; here, benchmark data sets 1 to 9 may respectively be the point cloud data Longdress, Loot, Redandblack, Soldier, Andrew, David, Phil, Ricardo and Sarah. To quantify the gain, a further data comparison with the RAHT method was made in the experiments, as shown in Figure 3.
从图2和图3中可以看出，本申请实施例中，在传统的图傅里叶变换的基础上，引入广义图傅里叶变换的概念，对于点云帧间属性进行预测与残差变换，从而可进一步去除数据间的冗余，提高编码效率。实验结果表明，本申请实施例的方法能够提高主客观性能，可有效地应用在实际点云的压缩、传输、存储系统中。It can be seen from Figures 2 and 3 that, in the embodiment of the present application, the concept of the generalized graph Fourier transform is introduced on the basis of the traditional graph Fourier transform, and the inter-frame attributes of the point cloud are predicted and residual-transformed, so that redundancy between the data can be further removed and the coding efficiency improved. The experimental results show that the method of the embodiment of the present application improves both subjective and objective performance and can be effectively applied in practical point cloud compression, transmission and storage systems.
参见图4，图4是本申请实施例提供的解码方法的流程图，应用于解码设备。如图4所示，包括以下步骤：Referring to FIG. 4, FIG. 4 is a flowchart of a decoding method provided by an embodiment of the present application, which is applied to a decoding device. As shown in FIG. 4, the method includes the following steps:
步骤401、获取编码码流。 Step 401. Acquire an encoded code stream.
其中，所述编码码流是编码设备利用广义拉普拉斯矩阵对子点云进行帧间预测与图傅里叶残差变换的结果进行编码得到的。The encoded code stream is obtained by the encoding device by using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds and encoding the result.
步骤402、对所述编码码流进行基于欧式距离权重的图傅里叶反变换，得到变换结果。Step 402: Perform an inverse graph Fourier transform based on Euclidean distance weights on the encoded code stream to obtain a transform result.
在解码端，对编码码流进行熵解码后，对所述编码码流进行反量化。之后，对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换，得到变换结果。At the decoding end, entropy decoding is first performed on the encoded code stream, and the result is then inverse-quantized. Afterwards, an inverse graph Fourier transform based on Euclidean distance weights is performed on the dequantized data to obtain the transform result.
示例性地,在此可利用以下公式(7)对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换:Exemplarily, the following formula (7) can be used to perform inverse Fourier transform of the inverse quantized code stream based on the Euclidean distance weight:
[公式(7)在原文中以图像形式给出：PCTCN2022123245-appb-000011]
其中，反变换残差值（原文图像PCTCN2022123245-appb-000012）、变换矩阵（原文图像PCTCN2022123245-appb-000013）、当前帧的目标属性的量化残差值（原文图像PCTCN2022123245-appb-000014）均以图像符号表示，ε表示反量化系数。
[Formula (7) is given as an image in the original: PCTCN2022123245-appb-000011]
where the inverse-transformed residual value (image PCTCN2022123245-appb-000012 in the original), the transform matrix (image PCTCN2022123245-appb-000013) and the quantized residual value of the target attribute of the current frame (image PCTCN2022123245-appb-000014) are denoted by image symbols, and ε denotes the inverse quantization coefficient.
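下面给出一个仅作说明的Python（NumPy）示意片段：对熵解码后的量化系数做反量化并做图傅里叶反变换，再叠加预测值以重建属性；由于公式(7)在本页中仅为图像，用反量化系数ε直接缩放量化系数、以及使用前文记为Φ的变换矩阵，均属于假设性写法。The following illustrative Python (NumPy) sketch inverse-quantizes the entropy-decoded coefficients, applies the inverse graph Fourier transform and adds the prediction back to reconstruct the attribute; since formula (7) appears only as an image here, scaling the quantized coefficients directly by the inverse quantization coefficient ε and using the transform matrix denoted Φ above are both assumptions.

import numpy as np

def decode_component(q, Phi, epsilon, x_hat):
    # q: quantized transform coefficients obtained after entropy decoding
    # Phi: transform matrix (assumed eigenvector matrix of the generalized Laplacian)
    # epsilon: inverse quantization coefficient; its exact placement in formula (7) is assumed
    # x_hat: attribute prediction obtained from the reference frame
    theta_hat = epsilon * q             # inverse quantization (assumed form)
    delta_hat = Phi @ theta_hat         # inverse graph Fourier transform of the residual
    return x_hat + delta_hat            # reconstructed attribute values of the current frame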
步骤403、基于所述变换结果,得到解码码流。Step 403: Obtain a decoded code stream based on the transformation result.
在本申请实施例中，由于广义拉普拉斯矩阵是利用点与点之间的欧式距离生成的，因此，在本申请实施例中可利用全局相关性特征，更加充分的表达出点与点之间的相关性，从而能够尽可能的去除点云数据之间的相似之处，进而能够提升编码性能。由于编码端的性能得到了提升，相应的，对于解码端来说，由于需要解码的数据得到优化，那么，可相应的提高解码效率和性能。In the embodiment of the present application, since the generalized Laplacian matrix is generated from the Euclidean distances between points, the global correlation characteristics can be exploited to express the correlation between points more fully, so that the similarities between point cloud data can be removed as much as possible and the coding performance can be improved. Since the performance of the encoding end is improved and the data to be decoded is accordingly optimized, the decoding efficiency and performance at the decoding end can be improved as well.
本申请实施例还提供了一种编码装置。参见图5,图5是本申请实施例提供的编码装置的结构图。由于编码装置解决问题的原理与本申请实施例中编码方法相似,因此该编码装置的实施可以参见方法的实施。The embodiment of the present application also provides an encoding device. Referring to FIG. 5 , FIG. 5 is a structural diagram of an encoding device provided by an embodiment of the present application. Since the problem-solving principle of the encoding device is similar to the encoding method in the embodiment of the present application, reference may be made to the implementation of the method for the implementation of the encoding device.
如图5所示,编码装置500包括:As shown in Figure 5, the encoding device 500 includes:
第一获取模块501,配置为将当前帧的待处理的点云数据进行聚类,得到多个子点云;The first acquisition module 501 is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
第一生成模块502,配置为对于所述多个子点云中的任一目标子点云,根据所述目标子点云中多个点对之间的欧式距离,以及,所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离,生成广义拉普拉斯矩阵;The first generating module 502 is configured to, for any target sub-point cloud in the plurality of sub-point clouds, according to the Euclidean distance between multiple point pairs in the target sub-point cloud, and the target sub-point cloud The Euclidean distance between the target point in and the corresponding point of the target point generates a generalized Laplacian matrix;
第一变换模块503，配置为利用所述广义拉普拉斯矩阵，对所述目标子点云进行帧间预测与图傅里叶残差变换；The first transformation module 503 is configured to perform inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix;
第一编码模块504，配置为分别对变换后的多个子点云进行量化和编码，得到编码码流；其中，所述对应点位于所述目标子点云的参考点云中，所述参考点云位于所述当前帧的参考帧中。The first encoding module 504 is configured to respectively quantize and encode the transformed sub-point clouds to obtain an encoded code stream; wherein the corresponding point is located in the reference point cloud of the target sub-point cloud, and the reference point cloud is located in the reference frame of the current frame.
在一些实施例中，所述第一获取模块501包括：第一处理子模块，配置为将所述待处理的点云数据进行体元化，得到点云体元；第一获取子模块，配置为将所述体元化点云数据进行聚类，得到多个子点云。In some embodiments, the first acquisition module 501 includes: a first processing submodule configured to voxelize the point cloud data to be processed to obtain point cloud voxels; and a first acquisition submodule configured to cluster the voxelized point cloud data to obtain the multiple sub-point clouds.
在一些实施例中，所述第一生成模块502包括：第二获取子模块，配置为根据所述目标子点云中多个点对之间的欧式距离，得到权重矩阵；第三获取子模块，配置为根据度矩阵和所述权重矩阵，得到拉普拉斯矩阵；第一生成子模块，配置为生成对角矩阵；第二生成子模块，配置为根据所述对角矩阵和所述拉普拉斯矩阵，得到所述广义拉普拉斯矩阵。In some embodiments, the first generation module 502 includes: a second acquisition submodule configured to obtain a weight matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud; a third acquisition submodule configured to obtain a Laplacian matrix according to the degree matrix and the weight matrix; a first generation submodule configured to generate a diagonal matrix; and a second generation submodule configured to obtain the generalized Laplacian matrix according to the diagonal matrix and the Laplacian matrix.
在一些实施例中，所述第三获取子模块，配置为利用所述度矩阵和所述权重矩阵的差，作为拉普拉斯矩阵；所述第二生成子模块，配置为利用所述对角矩阵和所述拉普拉斯矩阵的和，作为所述广义拉普拉斯矩阵；其中，所述度矩阵的对角线元素d_i=Σ_j W_ij，其中，d_i表示度矩阵的第i个对角线元素，W_ij表示所述目标子点云中的第i个点到第j个点的边所对应的权重；1≤i≤M，1≤j≤M，i、j、M为整数，M为所述目标子点云中包括的点的总数；所述对角矩阵是根据所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离生成的。In some embodiments, the third acquisition submodule is configured to use the difference between the degree matrix and the weight matrix as the Laplacian matrix, and the second generation submodule is configured to use the sum of the diagonal matrix and the Laplacian matrix as the generalized Laplacian matrix; the diagonal elements of the degree matrix satisfy d_i = Σ_j W_ij, where d_i denotes the i-th diagonal element of the degree matrix and W_ij denotes the weight of the edge from the i-th point to the j-th point in the target sub-point cloud; 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud; the diagonal matrix is generated according to the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point.
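下面给出一个仅作说明的Python（NumPy）示意片段：构建权重矩阵W、度矩阵D（d_i = Σ_j W_ij）和拉普拉斯矩阵L = D − W；由于权重公式在本页中仅以图像形式出现，这里采用图信号处理中常见、且与σ作为调节参数相符的高斯核形式 W_ij = exp(−distance²/σ²) 作为假设，并不一定是原文的公式。The following illustrative Python (NumPy) sketch builds the weight matrix W, the degree matrix D (d_i = Σ_j W_ij) and the Laplacian L = D − W; since the weight formula appears only as an image on this page, the sketch assumes the Gaussian-kernel form W_ij = exp(−distance²/σ²), which is common in graph signal processing and consistent with σ acting as a tuning parameter, and is not necessarily the formula of the original.

import numpy as np

def build_graph_matrices(points, sigma=1.0):
    # points: (M, 3) coordinates of the target sub-point cloud; sigma: nonzero tuning parameter
    diffs = points[:, None, :] - points[None, :, :]
    distance = np.linalg.norm(diffs, axis=2)       # (M, M) pairwise Euclidean distances
    W = np.exp(-(distance ** 2) / (sigma ** 2))    # assumed Gaussian-kernel weights
    np.fill_diagonal(W, 0.0)                       # no self-loops in the graph
    D = np.diag(W.sum(axis=1))                     # degree matrix, d_i = sum_j W_ij
    L = D - W                                      # Laplacian = degree matrix - weight matrix
    return W, D, L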
在一些实施例中，所述第二获取子模块包括：第一计算单元，配置为对于所述目标子点云中的第i个点和第j个点，计算得到第i个点和第j个点之间的欧式距离；第一获取单元，配置为按照以下公式计算权重，并利用所述权重形成所述权重矩阵：
[该公式在原文中以图像形式给出：PCTCN2022123245-appb-000015]
其中，W_ij表示目标子点云中的第i个点到第j个点的边所对应的权重；distance表示第i个点到第j个点之间的欧式距离；σ为不等于0的常数，表示调节参数；1≤i≤M，1≤j≤M，i、j、M为整数，M为所述目标子点云中包括的点的总数。
In some embodiments, the second acquisition submodule includes: a first calculation unit configured to calculate, for the i-th point and the j-th point in the target sub-point cloud, the Euclidean distance between the i-th point and the j-th point; and a first acquisition unit configured to calculate weights according to the following formula and form the weight matrix from the weights:
[The formula is given as an image in the original: PCTCN2022123245-appb-000015]
where W_ij denotes the weight of the edge from the i-th point to the j-th point in the target sub-point cloud; distance denotes the Euclidean distance between the i-th point and the j-th point; σ is a nonzero constant serving as a tuning parameter; 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
在一些实施例中，所述第一生成子模块包括：第一确定单元，配置为在所述参考帧中确定所述目标子点云的参考点云；第一生成单元，配置为基于目标子点云中的每个点与所述每个点在所述参考点云中的对应点之间的欧式距离，生成所述对角矩阵；其中，所述对角矩阵的第i个对角线上的值，为第i个点与点p之间的欧式距离的倒数，其中，点p为第i个点在所述参考点云中的对应点。In some embodiments, the first generating submodule includes: a first determining unit configured to determine the reference point cloud of the target sub-point cloud in the reference frame; and a first generating unit configured to generate the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and the corresponding point of that point in the reference point cloud; the i-th diagonal value of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, where point p is the corresponding point of the i-th point in the reference point cloud.
在一些实施例中，所述第一确定单元，配置为利用迭代最近点算法，在所述参考帧中确定所述目标子点云的参考点云。In some embodiments, the first determination unit is configured to determine the reference point cloud of the target sub-point cloud in the reference frame by using an iterative closest point algorithm.
在一些实施例中，所述第一变换模块503包括：第四获取子模块，配置为获取所述参考帧对所述当前帧的目标属性的属性预测值；第三生成子模块，配置为根据所述当前帧的目标属性的属性值、所述参考帧对所述当前帧的目标属性的属性预测值，生成所述当前帧的目标属性的残差；第一变换子模块，配置为基于所述广义拉普拉斯矩阵对所述当前帧的目标属性的残差进行变换。In some embodiments, the first transformation module 503 includes: a fourth acquisition submodule configured to acquire the attribute prediction value of the target attribute of the current frame from the reference frame; a third generation submodule configured to generate the residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame from the reference frame; and a first transformation submodule configured to transform the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
在一些实施例中，所述第四获取子模块，用于按照如下公式，得到所述参考帧对所述当前帧的目标属性的属性预测值：
[该公式在原文中以图像形式给出：PCTCN2022123245-appb-000016]
其中，式中的预测值符号（原文图像PCTCN2022123245-appb-000017）表示参考帧对当前帧的目标属性的属性预测值，x_{t-1}表示参考帧的目标属性的属性值，Lg表示广义拉普拉斯矩阵。
In some embodiments, the fourth acquisition submodule is configured to obtain the attribute prediction value of the target attribute of the current frame from the reference frame according to the following formula:
[The formula is given as an image in the original: PCTCN2022123245-appb-000016]
where the prediction symbol in the formula (image PCTCN2022123245-appb-000017 in the original) denotes the attribute prediction value of the target attribute of the current frame obtained from the reference frame, x_{t-1} denotes the attribute value of the target attribute of the reference frame, and Lg denotes the generalized Laplacian matrix.
在一些实施例中,所述第三生成子模块,配置为利用所述当前帧的目标属性的属性值与所述参考帧对所述当前帧的目标属性的属性预测值之差,作为所述残差。In some embodiments, the third generating submodule is configured to use the difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame by the reference frame as the residual.
在一些实施例中，所述第一变换子模块包括：第二获取单元，配置为利用所述广义拉普拉斯矩阵得到变换矩阵；第一变换单元，配置为利用所述变换矩阵对所述当前帧的目标属性的残差进行变换。In some embodiments, the first transformation submodule includes: a second acquisition unit configured to obtain a transform matrix by using the generalized Laplacian matrix; and a first transformation unit configured to transform the residual of the target attribute of the current frame by using the transform matrix.
在一些实施例中，所述第二获取单元，配置为对以下公式求解，得到变换矩阵：
[该公式在原文中以图像形式给出：PCTCN2022123245-appb-000018]
其中，Lg表示广义拉普拉斯矩阵，式中的另一符号（原文图像PCTCN2022123245-appb-000019）表示变换矩阵。
In some embodiments, the second acquisition unit is configured to solve the following formula to obtain the transform matrix:
[The formula is given as an image in the original: PCTCN2022123245-appb-000018]
where Lg denotes the generalized Laplacian matrix and the other symbol in the formula (image PCTCN2022123245-appb-000019 in the original) denotes the transform matrix.
在一些实施例中，所述第一变换单元，配置为利用以下公式，得到所述变换结果：
[该公式在原文中以图像形式给出：PCTCN2022123245-appb-000020]
其中，θ表示变换结果，式中的另一符号（原文图像PCTCN2022123245-appb-000021）表示变换矩阵，δ表示所述当前帧的目标属性的残差。
In some embodiments, the first transformation unit is configured to obtain the transform result by using the following formula:
[The formula is given as an image in the original: PCTCN2022123245-appb-000020]
where θ denotes the transform result, the other symbol in the formula (image PCTCN2022123245-appb-000021 in the original) denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
本申请实施例提供的编码装置500,可以执行上述编码方法对应的实施例,其实现原理和技术效果类似,本实施例此处不再描述。The encoding device 500 provided in the embodiment of the present application can implement the corresponding embodiment of the above-mentioned encoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
本申请实施例还提供了一种解码装置。参见图6,图6是本申请实施例提供的解码装置的结构图。由于解码装置解决问题的原理与本申请实施例中解码方法相似,因此该解码装置的实施可以参见方法的实施,重复之处不再赘述。The embodiment of the present application also provides a decoding device. Referring to FIG. 6, FIG. 6 is a structural diagram of a decoding device provided by an embodiment of the present application. Since the principle of the decoding device to solve the problem is similar to the decoding method in the embodiment of the present application, the implementation of the decoding device can refer to the implementation of the method, and the repetition will not be repeated.
如图6所示,解码装置600包括:As shown in Figure 6, the decoding device 600 includes:
第二获取模块601,配置为获取编码码流;The second obtaining module 601 is configured to obtain an encoded code stream;
第二变换模块602,配置为对所述编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果;The second transformation module 602 is configured to perform an inverse Fourier transformation of the encoded code stream based on Euclidean distance weights to obtain a transformation result;
第一解码模块603,配置为基于所述变换结果,得到解码码流;The first decoding module 603 is configured to obtain a decoded code stream based on the transformation result;
其中，所述编码码流是编码设备利用广义拉普拉斯矩阵对子点云进行帧间预测与图傅里叶残差变换的结果进行编码得到的。The encoded code stream is obtained by the encoding device by using the generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on the sub-point clouds and encoding the result.
在一些实施例中，所述第二变换模块包括：第二处理子模块，配置为对所述编码码流进行反量化；第二变换子模块，配置为对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换，得到变换结果。In some embodiments, the second transformation module includes: a second processing submodule configured to dequantize the encoded code stream; and a second transformation submodule configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the dequantized code stream to obtain the transform result.
在一些实施例中，所述第二变换子模块，配置为利用以下公式对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换：
[该公式在原文中以图像形式给出：PCTCN2022123245-appb-000022]
其中，反变换残差值（原文图像PCTCN2022123245-appb-000023）、变换矩阵（原文图像PCTCN2022123245-appb-000024）、当前帧的目标属性的量化残差值（原文图像PCTCN2022123245-appb-000025）均以图像符号表示，ε表示反量化系数。
In some embodiments, the second transformation submodule is configured to perform an inverse graph Fourier transform based on Euclidean distance weights on the dequantized code stream by using the following formula:
[The formula is given as an image in the original: PCTCN2022123245-appb-000022]
where the inverse-transformed residual value (image PCTCN2022123245-appb-000023 in the original), the transform matrix (image PCTCN2022123245-appb-000024) and the quantized residual value of the target attribute of the current frame (image PCTCN2022123245-appb-000025) are denoted by image symbols, and ε denotes the inverse quantization coefficient.
本申请实施例提供的解码装置600,可以执行上述解码方法对应的实施例,其实现原理和技术效果类似,本实施例此处不再描述。The decoding device 600 provided in the embodiment of the present application can execute the corresponding embodiment of the above-mentioned decoding method, and its implementation principle and technical effect are similar, so this embodiment will not be described here again.
需要说明的是,本申请实施例中对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。It should be noted that the division of units in the embodiment of the present application is schematic, and is only a logical function division, and there may be another division manner in actual implementation. In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented in the form of a software function unit and sold or used as an independent product, it can be stored in a processor-readable storage medium. Based on this understanding, the technical solution of the present application is essentially or part of the contribution to the prior art or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium , including several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) execute all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other various media that can store program codes.
本申请实施例还提供一种电子设备,包括:存储器、处理器及存储在存储器上并可在处理器上运行的程序,所述处理器执行所述程序时实现如上所述的编码方法或解码方法中的步骤。The embodiment of the present application also provides an electronic device, including: a memory, a processor, and a program stored in the memory and operable on the processor. When the processor executes the program, the above-mentioned encoding method or decoding steps in the method.
本申请实施例还提供一种可读存储介质,可读存储介质上存储有程序,该程序被处理器执行时实现上述编码或解码方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再描述。其中,所述的可读存储介质,可以是处理器能够存取的任何可用介质或数据存储设备,包括但不限于磁性存储器(例如软盘、硬盘、磁带、磁光盘(Magneto Optical disc,MO disc)等)、光学存储器(例如:激光唱片(Compact Disk,CD)、数字通用光盘(Digital Versatile Disc,DVD)、蓝光光盘(Blu-ray Disc,BD)、高清通用光盘(High-Definition Versatile Disc,HVD)等)、以及半导体存储器;例如:ROM、可擦除可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)、带电可擦可编程只读存储器(Electrically Erasable Programmable Read Only Memory,EEPROM)、非易失性存储器(Non Volatile Memory NVM)、固态硬盘(Solid State Disk,SSD)等。The embodiment of the present application also provides a readable storage medium, on which a program is stored, and when the program is executed by a processor, each process of the above encoding or decoding method embodiment can be achieved, and the same technical effect can be achieved. To avoid repetition, it is not described here. Wherein, the readable storage medium can be any available medium or data storage device that can be accessed by the processor, including but not limited to magnetic storage (such as floppy disk, hard disk, magnetic tape, magneto optical disc (Magneto Optical disc, MO disc) etc.), optical storage (such as: laser disc (Compact Disk, CD), digital versatile disc (Digital Versatile Disc, DVD), Blu-ray Disc (Blu-ray Disc, BD), high-definition universal disc (High-Definition Versatile Disc, HVD ), etc.), and semiconductor memory; for example: ROM, Erasable Programmable Read Only Memory (Erasable Programmable Read Only Memory, EPROM), Electrically Erasable Programmable Read Only Memory (EEPROM), non- Volatile Memory (Non Volatile Memory NVM), Solid State Disk (Solid State Disk, SSD), etc.
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多 限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。It should be noted that, in this document, the term "comprising", "comprising" or any other variation thereof is intended to cover a non-exclusive inclusion such that a process, method, article or apparatus comprising a set of elements includes not only those elements, It also includes other elements not expressly listed, or elements inherent in the process, method, article, or device. Without further limitations, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus comprising that element.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。根据这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁盘、光盘)中,包括若干指令用以使得一台终端(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各个实施例所述的方法。Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is better implementation. Based on this understanding, the technical solution of the present application is essentially or the part that contributes to the prior art can be embodied in the form of a software product, and the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk, etc.) ) includes several instructions to make a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the method described in each embodiment of the present application.
上面结合附图对本申请的实施例进行了描述,但是本申请并不局限于上述的具体实施方式,上述的具体实施方式仅仅是示意性的,而不是限制性的,本领域的普通技术人员在本申请的启示下,在不脱离本申请宗旨和权利要求所保护的范围情况下,还可做出很多形式,均属于本申请的保护之内。The embodiments of the present application have been described above in conjunction with the accompanying drawings, but the present application is not limited to the above-mentioned specific implementations. The above-mentioned specific implementations are only illustrative and not restrictive. Those of ordinary skill in the art will Under the inspiration of this application, without departing from the purpose of this application and the scope of protection of the claims, many forms can also be made, all of which belong to the protection of this application.
工业实用性Industrial Applicability
本申请公开了一种编码方法、解码方法、装置、设备及可读存储介质，涉及图像处理技术领域，以提高处理性能。该方法包括：将当前帧的待处理的点云数据进行聚类，得到多个子点云；对于所述多个子点云中的任一目标子点云，根据所述目标子点云中多个点对之间的欧式距离，以及，所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离，生成广义拉普拉斯矩阵；利用所述广义拉普拉斯矩阵，对目标子点云进行帧间预测与图傅里叶残差变换；分别对变换后的多个子点云进行量化和编码，得到编码码流；其中，所述对应点位于所述目标子点云的参考点云中，所述参考点云位于所述当前帧的参考帧中。The application discloses an encoding method, a decoding method, an apparatus, a device, and a readable storage medium, relating to the technical field of image processing, so as to improve processing performance. The method includes: clustering the point cloud data to be processed of the current frame to obtain multiple sub-point clouds; for any target sub-point cloud among the multiple sub-point clouds, generating a generalized Laplacian matrix according to the Euclidean distances between multiple point pairs in the target sub-point cloud and the Euclidean distance between a target point in the target sub-point cloud and the corresponding point of the target point; performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix; and respectively quantizing and encoding the transformed sub-point clouds to obtain an encoded code stream; wherein the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.

Claims (20)

  1. 一种编码方法,应用于编码设备,包括:An encoding method applied to an encoding device, comprising:
    将当前帧的待处理的点云数据进行聚类,得到多个子点云;Cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
    对于所述多个子点云中的任一目标子点云,根据所述目标子点云中多个点对之间的欧式距离,以及,所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离,生成广义拉普拉斯矩阵;For any target sub-point cloud in the plurality of sub-point clouds, according to the Euclidean distance between multiple point pairs in the target sub-point cloud, and the target point in the target sub-point cloud and the target The Euclidean distance between the corresponding points of the point generates a generalized Laplacian matrix;
    利用所述广义拉普拉斯矩阵，对所述目标子点云进行帧间预测与图傅里叶残差变换；performing inter-frame prediction and graph Fourier residual transform on the target sub-point cloud by using the generalized Laplacian matrix;
    分别对变换后的多个子点云进行量化和编码,得到编码码流;Quantize and encode the transformed multiple sub-point clouds respectively to obtain the encoded code stream;
    其中,所述对应点位于所述目标子点云的参考点云中,所述参考点云位于所述当前帧的参考帧中。Wherein, the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  2. 根据权利要求1所述的方法,其中,所述将当前帧的待处理的点云数据进行聚类,得到多个子点云,包括:The method according to claim 1, wherein said clustering the point cloud data to be processed of the current frame to obtain a plurality of sub-point clouds comprises:
    将所述待处理的点云数据进行体元化,得到点云体元;Voxelize the point cloud data to be processed to obtain point cloud voxels;
    将所述体元化点云数据进行聚类,得到所述多个子点云。Clustering the voxelized point cloud data to obtain the multiple sub-point clouds.
  3. 根据权利要求1所述的方法,其中,所述根据所述目标子点云中多个点对之间的欧式距离,以及,所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离,生成广义拉普拉斯矩阵,包括:The method according to claim 1, wherein, according to the Euclidean distance between a plurality of point pairs in the target sub-point cloud, and the correspondence between the target point in the target sub-point cloud and the target point Euclidean distance between points, generating a generalized Laplacian matrix including:
    根据所述目标子点云中多个点对之间的欧式距离,得到权重矩阵;According to the Euclidean distance between multiple point pairs in the target sub-point cloud, a weight matrix is obtained;
    根据度矩阵和所述权重矩阵,得到拉普拉斯矩阵;According to the degree matrix and the weight matrix, a Laplacian matrix is obtained;
    生成对角矩阵;generate a diagonal matrix;
    根据所述对角矩阵和所述拉普拉斯矩阵,得到所述广义拉普拉斯矩阵。According to the diagonal matrix and the Laplacian matrix, the generalized Laplacian matrix is obtained.
  4. 根据权利要求3所述的方法,其中,所述根据度矩阵和所述权重矩阵,得到拉普拉斯矩阵,包括:The method according to claim 3, wherein said obtaining a Laplacian matrix according to the degree matrix and said weight matrix comprises:
    利用所述度矩阵和所述权重矩阵的差,作为所述拉普拉斯矩阵;using the difference between the degree matrix and the weight matrix as the Laplacian matrix;
    所述根据所述对角矩阵和所述拉普拉斯矩阵,得到所述广义拉普拉斯矩阵,包括:According to the diagonal matrix and the Laplacian matrix, the generalized Laplacian matrix is obtained, including:
    利用所述对角矩阵和所述拉普拉斯矩阵的和,作为所述广义拉普拉斯矩阵;Using the sum of the diagonal matrix and the Laplacian matrix as the generalized Laplacian matrix;
    其中，所述度矩阵的对角线元素d_i=∑_j W_ij，其中，d_i表示度矩阵的第i个对角线元素，W_ij表示所述目标子点云中的第i个点到第j个点的边所对应的权重；1≤i≤M，1≤j≤M，i、j、M为整数，M为所述目标子点云中包括的点的总数；wherein the diagonal elements of the degree matrix satisfy d_i = ∑_j W_ij, where d_i denotes the i-th diagonal element of the degree matrix and W_ij denotes the weight of the edge from the i-th point to the j-th point in the target sub-point cloud; 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud;
    所述对角矩阵是根据所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离生成的。The diagonal matrix is generated according to the Euclidean distance between the target point in the target sub-point cloud and the corresponding point of the target point.
  5. 根据权利要求3所述的方法,其中,所述根据所述目标子点云中多 个点对之间的欧式距离,得到权重矩阵,包括:The method according to claim 3, wherein, according to the Euclidean distance between a plurality of point pairs in the target sub-point cloud, obtain a weight matrix, comprising:
    对于所述目标子点云中的第i个点和第j个点,计算得到所述第i个点和所述第j个点之间的欧式距离;For the i-th point and the j-th point in the target sub-point cloud, calculate the euclidean distance between the i-th point and the j-th point;
    按照以下公式计算权重,并利用所述权重形成所述权重矩阵:Calculate the weight according to the following formula, and use the weight to form the weight matrix:
    [该公式在原文中以图像形式给出：PCTCN2022123245-appb-100001]
    其中，W_ij表示目标子点云中的第i个点到第j个点的边所对应的权重；distance表示第i个点到第j个点之间的欧式距离；σ为不等于0的常数，表示调节参数；1≤i≤M，1≤j≤M，i、j、M为整数，M为所述目标子点云中包括的点的总数。[The formula is given as an image in the original: PCTCN2022123245-appb-100001] where W_ij denotes the weight of the edge from the i-th point to the j-th point in the target sub-point cloud; distance denotes the Euclidean distance between the i-th point and the j-th point; σ is a nonzero constant serving as a tuning parameter; 1 ≤ i ≤ M, 1 ≤ j ≤ M, i, j and M are integers, and M is the total number of points included in the target sub-point cloud.
  6. 根据权利要求3所述的方法,其中,所述生成对角矩阵,包括:The method according to claim 3, wherein said generating a diagonal matrix comprises:
    在所述参考帧中确定所述目标子点云的参考点云；determining a reference point cloud of the target sub-point cloud in the reference frame;
    基于目标子点云中的每个点与所述每个点在所述参考点云中的对应点之间的欧式距离,生成所述对角矩阵;Generate the diagonal matrix based on the Euclidean distance between each point in the target sub-point cloud and the corresponding point of each point in the reference point cloud;
    其中，所述对角矩阵的第i个对角线上的值，为第i个点与点p之间的欧式距离的倒数，其中，点p为第i个点在所述参考点云中的对应点。Wherein, the value on the i-th diagonal of the diagonal matrix is the reciprocal of the Euclidean distance between the i-th point and point p, where point p is the corresponding point of the i-th point in the reference point cloud.
  7. 根据权利要求6所述的方法，其中，所述在所述参考帧中确定所述目标子点云的参考点云，包括：The method according to claim 6, wherein said determining the reference point cloud of the target sub-point cloud in the reference frame comprises:
    利用迭代最近点算法，在所述参考帧中确定所述目标子点云的参考点云。The reference point cloud of the target sub-point cloud is determined in the reference frame by using an iterative closest point algorithm.
  8. 根据权利要求1所述的方法，其中，所述利用所述广义拉普拉斯矩阵，对所述目标子点云进行帧间预测与图傅里叶残差变换，包括：The method according to claim 1, wherein said using said generalized Laplacian matrix to perform inter-frame prediction and graph Fourier residual transform on said target sub-point cloud comprises:
    获取所述参考帧对所述当前帧的目标属性的属性预测值;Obtaining an attribute prediction value of the reference frame to the target attribute of the current frame;
    根据所述当前帧的目标属性的属性值、所述参考帧对所述当前帧的目标属性的属性预测值,生成所述当前帧的目标属性的残差;generating a residual of the target attribute of the current frame according to the attribute value of the target attribute of the current frame and the predicted value of the target attribute of the reference frame to the target attribute of the current frame;
    基于所述广义拉普拉斯矩阵对所述当前帧的目标属性的残差进行变换。Transforming the residual of the target attribute of the current frame based on the generalized Laplacian matrix.
  9. 根据权利要求8所述的方法,其中,所述获取所述参考帧对所述当前帧的目标属性的属性预测值,包括:The method according to claim 8, wherein said obtaining the attribute prediction value of the reference frame to the target attribute of the current frame comprises:
    按照如下公式,得到所述参考帧对所述当前帧的目标属性的属性预测值:According to the following formula, the attribute prediction value of the target attribute of the reference frame to the current frame is obtained:
    [该公式在原文中以图像形式给出：PCTCN2022123245-appb-100002]
    其中，式中的预测值符号（原文图像PCTCN2022123245-appb-100003）表示参考帧对当前帧的目标属性的属性预测值，x_{t-1}表示参考帧的目标属性的属性值，Lg表示广义拉普拉斯矩阵。[The formula is given as an image in the original: PCTCN2022123245-appb-100002] where the prediction symbol in the formula (image PCTCN2022123245-appb-100003 in the original) denotes the attribute prediction value of the target attribute of the current frame obtained from the reference frame, x_{t-1} denotes the attribute value of the target attribute of the reference frame, and Lg denotes the generalized Laplacian matrix.
  10. 根据权利要求8所述的方法,其中,所述根据所述当前帧的目标属性的属性值、所述参考帧对所述当前帧的目标属性的属性预测值,生成所述当前帧的目标属性的残差,包括:The method according to claim 8, wherein the target attribute of the current frame is generated according to the attribute value of the target attribute of the current frame and the attribute prediction value of the reference frame to the target attribute of the current frame residuals, including:
    利用所述当前帧的目标属性的属性值与所述参考帧对所述当前帧的目标属性的属性预测值之差,作为所述残差。The difference between the attribute value of the target attribute of the current frame and the attribute prediction value of the target attribute of the current frame by the reference frame is used as the residual.
  11. 根据权利要求8所述的方法,其中,所述基于所述广义拉普拉斯矩阵对所述当前帧的目标属性的残差进行变换,包括:The method according to claim 8, wherein said transforming the residual of the target attribute of the current frame based on the generalized Laplacian matrix comprises:
    利用所述广义拉普拉斯矩阵得到变换矩阵;Using the generalized Laplacian matrix to obtain a transformation matrix;
    利用所述变换矩阵对所述当前帧的目标属性的残差进行变换。The residual of the target attribute of the current frame is transformed by using the transformation matrix.
  12. 根据权利要求11所述的方法,其中,所述利用所述广义拉普拉斯矩阵得到变换矩阵,包括:The method according to claim 11, wherein said utilizing said generalized Laplacian matrix to obtain a transformation matrix comprises:
    对以下公式求解,得到变换矩阵:Solve the following formula to obtain the transformation matrix:
    [该公式在原文中以图像形式给出：PCTCN2022123245-appb-100004]
    其中，Lg表示广义拉普拉斯矩阵，式中的另一符号（原文图像PCTCN2022123245-appb-100005）表示变换矩阵。[The formula is given as an image in the original: PCTCN2022123245-appb-100004] where Lg denotes the generalized Laplacian matrix and the other symbol in the formula (image PCTCN2022123245-appb-100005 in the original) denotes the transform matrix.
  13. 根据权利要求11所述的方法,其中,所述利用所述变换矩阵对所述当前帧的目标属性的残差进行变换,包括:The method according to claim 11, wherein said transforming the residual of the target attribute of the current frame using the transformation matrix comprises:
    利用以下公式，得到所述变换结果：The transform result is obtained by using the following formula:
    [该公式在原文中以图像形式给出：PCTCN2022123245-appb-100006]
    其中，θ表示变换结果，式中的另一符号（原文图像PCTCN2022123245-appb-100007）表示变换矩阵，δ表示所述当前帧的目标属性的残差。[The formula is given as an image in the original: PCTCN2022123245-appb-100006] where θ denotes the transform result, the other symbol in the formula (image PCTCN2022123245-appb-100007 in the original) denotes the transform matrix, and δ denotes the residual of the target attribute of the current frame.
  14. 一种解码方法,应用于解码设备,所述方法包括:A decoding method applied to a decoding device, the method comprising:
    获取编码码流;Get the code stream;
    对所述编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果;Performing an inverse Fourier transform of the graph based on the Euclidean distance weight on the coded code stream to obtain a transform result;
    基于所述变换结果,得到解码码流;Obtaining a decoded code stream based on the transformation result;
    其中，所述编码码流是编码设备利用广义拉普拉斯矩阵对子点云进行帧间预测与图傅里叶残差变换的结果进行编码得到的。wherein the encoded code stream is obtained by an encoding device by performing inter-frame prediction and graph Fourier residual transform on sub-point clouds using a generalized Laplacian matrix and encoding the result.
  15. 根据权利要求14所述的方法,其中,所述对所述编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果,包括:The method according to claim 14, wherein said performing an inverse Fourier transform on the encoded code stream based on Euclidean distance weights to obtain a transform result includes:
    对所述编码码流进行反量化;Dequantizing the coded code stream;
    对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果。The inverse Fourier transform of the graph based on the Euclidean distance weight is performed on the coded code stream after inverse quantization to obtain the transform result.
  16. 根据权利要求15所述的方法,其中,所述对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果,包括:The method according to claim 15, wherein said performing inverse Fourier transform on the dequantized encoded code stream based on Euclidean distance weights to obtain a transform result, comprising:
    利用以下公式对反量化后的编码码流进行基于欧式距离权重的图傅里叶反变换:Use the following formula to perform the inverse Fourier transform of the graph based on the Euclidean distance weight on the coded stream after dequantization:
    [该公式在原文中以图像形式给出：PCTCN2022123245-appb-100008]
    其中，反变换残差值（原文图像PCTCN2022123245-appb-100009）、变换矩阵（原文图像PCTCN2022123245-appb-100010）、当前帧的目标属性的量化残差值（原文图像PCTCN2022123245-appb-100011）均以图像符号表示，ε表示反量化系数。[The formula is given as an image in the original: PCTCN2022123245-appb-100008] where the inverse-transformed residual value (image PCTCN2022123245-appb-100009 in the original), the transform matrix (image PCTCN2022123245-appb-100010) and the quantized residual value of the target attribute of the current frame (image PCTCN2022123245-appb-100011) are denoted by image symbols, and ε denotes the inverse quantization coefficient.
  17. 一种编码装置,包括:An encoding device comprising:
    第一获取模块,配置为将当前帧的待处理的点云数据进行聚类,得到多个子点云;The first acquisition module is configured to cluster the point cloud data to be processed in the current frame to obtain multiple sub-point clouds;
    第一生成模块,配置为对于所述多个子点云中的任一目标子点云,根 据所述目标子点云中多个点对之间的欧式距离,以及,所述目标子点云中的目标点与所述目标点的对应点之间的欧式距离,生成广义拉普拉斯矩阵;The first generation module is configured to, for any target sub-point cloud in the plurality of sub-point clouds, according to the Euclidean distance between multiple point pairs in the target sub-point cloud, and, in the target sub-point cloud The Euclidean distance between the target point and the corresponding point of the target point generates a generalized Laplacian matrix;
    第一变换模块,配置为利用所述广义拉普拉斯矩阵,对所述目标子点云进行帧间预测与图傅里叶残差变换;The first transformation module is configured to use the generalized Laplacian matrix to perform inter-frame prediction and image Fourier residual transformation on the target sub-point cloud;
    第一编码模块,配置为分别对变换后的多个子点云进行量化和编码,得到编码码流;The first encoding module is configured to respectively quantize and encode the transformed sub-point clouds to obtain encoded code streams;
    其中,所述对应点位于所述目标子点云的参考点云中,所述参考点云位于所述当前帧的参考帧中。Wherein, the corresponding point is located in a reference point cloud of the target sub-point cloud, and the reference point cloud is located in a reference frame of the current frame.
  18. 一种解码装置,包括:A decoding device, comprising:
    第二获取模块,配置为获取编码码流;The second obtaining module is configured to obtain the encoded code stream;
    第二变换模块,配置为对所述编码码流进行基于欧式距离权重的图傅里叶反变换,得到变换结果;The second transformation module is configured to perform an inverse Fourier transformation of the encoded code stream based on Euclidean distance weights to obtain a transformation result;
    第一解码模块,配置为基于所述变换结果,得到解码码流;The first decoding module is configured to obtain a decoded code stream based on the transformation result;
    其中,所述编码码流是编码设备利用广义拉普拉斯矩阵对子点云进行帧间预测与图傅里叶残差变换的结果进行编码得到的。Wherein, the coded code stream is obtained by the coding device using the generalized Laplacian matrix to perform inter-frame prediction and image Fourier residual transformation on the sub-point cloud.
  19. 一种电子设备,包括:存储器、处理器及存储在所述存储器上并可在所述处理器上运行的程序;An electronic device, comprising: a memory, a processor, and a program stored in the memory and operable on the processor;
    所述处理器,用于读取存储器中的程序实现如权利要求1至13中任一项所述的编码方法中的步骤;或者实现如权利要求14至16中任一项所述的解码方法中的步骤。The processor is used to read the program in the memory to implement the steps in the encoding method according to any one of claims 1 to 13; or to implement the decoding method according to any one of claims 14 to 16 in the steps.
  20. 一种可读存储介质,用于存储程序,所述程序被处理器执行时实现如权利要求1至13中任一项所述的编码方法中的步骤;或者实现如权利要求14至16中任一项所述的解码方法中的步骤。A readable storage medium for storing a program, and when the program is executed by a processor, implements the steps in the encoding method according to any one of claims 1 to 13; or implements the steps in any one of claims 14 to 16 A step in the described decoding method.
PCT/CN2022/123245 2021-09-30 2022-09-30 Encoding method, decoding method, apparatus, device, and readable storage medium WO2023051783A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111160289.X 2021-09-30
CN202111160289.XA CN113766229B (en) 2021-09-30 2021-09-30 Encoding method, decoding method, device, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
WO2023051783A1 true WO2023051783A1 (en) 2023-04-06

Family

ID=78798550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/123245 WO2023051783A1 (en) 2021-09-30 2022-09-30 Encoding method, decoding method, apparatus, device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN113766229B (en)
WO (1) WO2023051783A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113766229B (en) * 2021-09-30 2023-04-28 咪咕文化科技有限公司 Encoding method, decoding method, device, equipment and readable storage medium
WO2023173238A1 (en) * 2022-03-12 2023-09-21 Oppo广东移动通信有限公司 Encoding method, decoding method, code stream, encoder, decoder, and storage medium
CN114785998A (en) * 2022-06-20 2022-07-22 北京大学深圳研究生院 Point cloud compression method and device, electronic equipment and storage medium
WO2023245982A1 (en) * 2022-06-20 2023-12-28 北京大学深圳研究生院 Point cloud compression method and apparatus, electronic device, and storage medium
WO2024077911A1 (en) * 2022-10-13 2024-04-18 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for point cloud coding
CN116797625B (en) * 2023-07-20 2024-04-19 无锡埃姆维工业控制设备有限公司 Monocular three-dimensional workpiece pose estimation method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171761A (en) * 2017-12-13 2018-06-15 北京大学 A kind of point cloud inner frame coding method and device that transformation is schemed based on Fourier
CN110418135A (en) * 2019-08-05 2019-11-05 北京大学深圳研究生院 A kind of the point cloud intra-frame prediction method and equipment of the weight optimization based on neighbours
CN110572655A (en) * 2019-09-30 2019-12-13 北京大学深圳研究生院 method and equipment for encoding and decoding point cloud attribute based on neighbor weight parameter selection and transmission
WO2020197086A1 (en) * 2019-03-25 2020-10-01 엘지전자 주식회사 Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN112385238A (en) * 2019-07-10 2021-02-19 深圳市大疆创新科技有限公司 Data encoding method, data decoding method, equipment and storage medium
CN113766229A (en) * 2021-09-30 2021-12-07 咪咕文化科技有限公司 Encoding method, decoding method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN113766229A (en) 2021-12-07
CN113766229B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
WO2023051783A1 (en) Encoding method, decoding method, apparatus, device, and readable storage medium
Liu et al. Random walk graph Laplacian-based smoothness prior for soft decoding of JPEG images
US10366698B2 (en) Variable length coding of indices and bit scheduling in a pyramid vector quantizer
US20170026665A1 (en) Method and device for compressing local feature descriptor, and storage medium
Lin et al. Rate-adaptive compact fisher codes for mobile visual search
CN111727445A (en) Data compression for partial entropy coding
WO2021081913A1 (en) Vector query method and apparatus, electronic device and storage medium
WO2023179096A1 (en) Graph dictionary learning-based three-dimensional point cloud encoding and decoding method, compression method and apparatus
Zhang et al. Transformer and upsampling-based point cloud compression
CN104392207A (en) Characteristic encoding method for recognizing digital image content
CN107231556B (en) Image cloud storage device
Sun et al. A novel fractal coding method based on MJ sets
Wang et al. Fast sparse fractal image compression
Km et al. Secure image transformation using remote sensing encryption algorithm
Wang et al. Fractal image encoding with flexible classification sets
Hajizadeh et al. Predictive compression of animated 3D models by optimized weighted blending of key‐frames
Kim et al. A fractal vector quantizer for image coding
KR20100083554A (en) Method and device for computing discrete cosine transform/inverse discrete cosine transform
Guo et al. Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach
Zhang et al. Blind image separation based on reorganization of block DCT
Zhang et al. Mutual information-based context template modeling for bitplane coding in remote sensing image compression
Bayazit et al. 3-D mesh geometry compression with set partitioning in the spectral domain
Hu et al. A highly efficient method for improving the performance of GLA-based algorithms
US20240005562A1 (en) Point cloud encoding method and apparatus, electronic device, medium and program product
JP4871246B2 (en) Vector quantization method and apparatus, their program, and computer-readable recording medium recording the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22875164

Country of ref document: EP

Kind code of ref document: A1