WO2022133755A1 - Point cloud decoding method, encoding method, decoder, and encoder - Google Patents
- Publication number: WO2022133755A1 (application PCT/CN2020/138435)
- Authority: WIPO (PCT)
Classifications
- G06T9/40—Tree coding, e.g. quadtree, octree
- H04N19/597—Predictive coding specially adapted for multi-view video sequence encoding
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T9/001—Model-based coding, e.g. wire frame
- G06T9/004—Predictors, e.g. intraframe, interframe coding
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/124—Quantisation
- H04N19/186—Adaptive coding where the coding unit is a colour or a chrominance component
- H04N19/30—Hierarchical techniques, e.g. scalability
- G06T2210/36—Level of detail
Definitions
- the embodiments of the present application relate to the field of point cloud encoding and decoding, and more particularly, to a point cloud decoding method, encoding method, decoder, and encoder.
- Point clouds have begun to spread to various fields, such as virtual/augmented reality, robotics, geographic information systems, medical fields, etc.
- a large number of points on the surfaces of objects can now be accurately captured, often corresponding to hundreds of thousands of points in a single scene.
- Such a large number of points poses challenges for computer storage and transmission. Point cloud compression has therefore become a key research topic.
- In one scheme, octree encoding is performed on the position information of the point cloud; the color information is then predicted from the octree-encoded position information, and the difference between the predicted color information and the original color information is encoded, thereby encoding the point cloud.
- Embodiments of the present application provide a point cloud decoding method, an encoding method, a decoder, and an encoder, which can improve the reconstruction accuracy in the point cloud decoding process, and further, can improve the decoding effect.
- a decoding method for point clouds including:
- the initial chrominance value is filtered by using the Kalman filtering algorithm to obtain the final chrominance value
- a decoded point cloud is obtained according to the final reconstructed value of the attribute information of the target point.
- a method for encoding a point cloud including:
- the code stream is obtained by encoding the number of points whose attribute-information residual values are losslessly encoded and the residual value of the attribute information of the target point, and writing them into the code stream.
- a decoder comprising:
- a parsing unit configured to parse the code stream of the point cloud to obtain an initial reconstruction value of attribute information of a target point in the point cloud
- a conversion unit configured to convert the initial reconstruction value into an initial luminance value and an initial chrominance value
- a filtering unit configured to filter the initial chrominance value using a Kalman filtering algorithm to obtain a final chrominance value
- a first processing unit configured to obtain a final reconstructed value of the attribute information of the target point based on the final chrominance value and the initial luminance value
- the second processing unit is configured to obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
- an encoder comprising:
- a first processing unit configured to process the position information of a target point in the point cloud to obtain reconstruction information of the position information of the target point;
- a second processing unit configured to obtain the predicted value of the attribute information of the target point according to the reconstruction information of the position information of the target point;
- a third processing unit configured to process the attribute information of the target point in the point cloud to obtain the true value of the attribute information of the target point;
- a fourth processing unit configured to obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the true value of the attribute information of the target point;
- an encoding unit configured to encode the number of points whose attribute-information residual values are losslessly encoded and the residual value of the attribute information of the target point, and write them into the code stream to obtain the code stream.
- an embodiment of the present application provides a data processing device for point cloud media, and the data processing device for point cloud media includes:
- a processor adapted to implement computer instructions
- a computer-readable storage medium where computer instructions are stored in the computer-readable storage medium, and the computer instructions are suitable for being loaded by a processor and executing the above-mentioned data processing method for point cloud media.
- an embodiment of the present application provides a computer-readable storage medium in which computer instructions are stored; when the computer instructions are read and executed by a processor of a computer device, the computer device can execute the above data processing method for point cloud media.
- the reconstruction accuracy of the attribute information of the target point can be improved.
- the quality of the point cloud reconstruction process is enhanced; correspondingly, the decoding effect of point cloud coding can be improved.
- using the Kalman filtering algorithm to filter only the initial chrominance value can improve the filtering effect and further improve the decoding effect of point cloud encoding.
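The chrominance filtering step can be illustrated with a minimal one-dimensional Kalman filter applied to a sequence of reconstructed chrominance values. The noise parameters (`process_var`, `measurement_var`) and the constant-value state model are illustrative assumptions, not values taken from this application:

```python
# Minimal scalar Kalman filter sketch for smoothing decoded chrominance values.
# process_var / measurement_var are assumed tuning values for illustration.
def kalman_smooth(measurements, process_var=1e-3, measurement_var=1.0):
    estimate = measurements[0]   # initial state: first decoded chrominance value
    error_cov = 1.0              # initial estimate uncertainty
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: constant-value model, so only the uncertainty grows.
        error_cov += process_var
        # Update: blend prediction and measurement via the Kalman gain.
        gain = error_cov / (error_cov + measurement_var)
        estimate = estimate + gain * (z - estimate)
        error_cov = (1.0 - gain) * error_cov
        smoothed.append(estimate)
    return smoothed
```

Because the gain lies in (0, 1), each output is a convex combination of the prediction and the measurement, so the smoothed values stay within the range of the observed chrominance values.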
- FIG. 1 is a schematic block diagram of a coding framework provided by an embodiment of the present application.
- FIG. 2 is a schematic block diagram of an LOD layer provided by an embodiment of the present application.
- FIG. 3 is a schematic block diagram of a decoding framework provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of a method for decoding a point cloud provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of the principle of Kalman filtering provided by an embodiment of the present application.
- FIG. 6 is another schematic flowchart of a point cloud decoding method provided by an embodiment of the present application.
- FIG. 7 is a schematic flowchart of a point cloud encoding method provided by an embodiment of the present application.
- FIG. 8 is a schematic block diagram of a decoder provided by an embodiment of the present application.
- FIG. 9 is a schematic block diagram of an encoder provided by an embodiment of the present application.
- FIG. 10 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- a point cloud is a set of discrete points that are randomly distributed in space and express the spatial structure and surface properties of a three-dimensional object or three-dimensional scene.
- Point cloud data is the specific recording form of point cloud.
- the point cloud data of each point in the point cloud can include geometric information and attribute information.
- the geometric information of each point in the point cloud refers to the Cartesian three-dimensional coordinate data of that point.
- the attribute information of each point in the point cloud may include but not limited to at least one of the following: color information, material information, and laser reflection intensity information.
- Color information can be information in any color space.
- the color information may be Red Green Blue (RGB) information.
- the color information may also be luminance-chrominance (YCbCr, YUV) information.
- Y represents brightness (Luma)
- Cb (U) represents blue color difference
- Cr (V) represents red color difference
- U and V represent chroma (Chroma)
- chroma is used to describe color difference information.
- Each point in the point cloud has the same amount of attribute information.
- For example, each point in a point cloud may have two kinds of attribute information: color information and laser reflection intensity information.
- Alternatively, each point in a point cloud may have three kinds of attribute information: color information, material information, and laser reflection intensity information.
- the geometric information of points can also be called geometric components or geometric components of point cloud media
- the attribute information of points can also be called attribute components or attribute components of point cloud media.
- Point cloud media may include one geometric component and one or more attribute components.
- point clouds can be divided into two categories, namely, machine-perceived point clouds and human-eye-perceived point clouds.
- the application scenarios of machine perception point cloud include but are not limited to: autonomous navigation system, real-time inspection system, geographic information system, visual sorting robot, rescue robot and other point cloud application scenarios.
- the application scenarios of human eye perception point cloud include but are not limited to: digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, 3D immersive interaction and other point cloud application scenarios.
- the acquisition methods of point clouds include but are not limited to computer generation, 3D laser scanning, and 3D photogrammetry. Computers can generate point clouds of virtual 3D objects and scenes.
- 3D scanning can obtain point clouds of static real-world 3D objects or scenes, acquiring millions of points per second.
- 3D cameras can obtain point clouds of dynamic real-world three-dimensional objects or scenes, acquiring tens of millions of points per second.
- the point cloud on the surface of the object can be collected through acquisition equipment such as photoelectric radar, lidar, laser scanner, and multi-view camera.
- a point cloud obtained according to the principle of laser measurement may include three-dimensional coordinate information of the points and the laser reflection intensity (reflectance) of the points.
- a point cloud obtained according to the principle of photogrammetry may include three-dimensional coordinate information of the points and color information of the points.
- a point cloud obtained by combining the principles of laser measurement and photogrammetry may include three-dimensional coordinate information of the points, the laser reflection intensity of the points, and the color information of the points.
- Based on the acquisition method, point clouds can also be divided into three types: the first type, static point clouds; the second type, dynamic point clouds; and the third type, dynamically acquired point clouds.
- For the first type, static point clouds: the object is stationary, and the device acquiring the point cloud is also stationary.
- For the second type, dynamic point clouds: the object is moving, but the device acquiring the point cloud is stationary.
- For the third type, dynamically acquired point clouds: the device acquiring the point cloud is moving.
- In the medical field, point clouds of biological tissues and organs can be obtained from magnetic resonance imaging (MRI), computed tomography (CT), and electromagnetic localization information.
- These technologies reduce the cost and time period of point cloud acquisition and improve the accuracy of the data.
- the change in the acquisition method of point clouds makes it possible to acquire a large number of point clouds. With the continuous accumulation of large-scale point clouds, efficient storage, transmission, publishing, sharing and standardization of point clouds have become the key to point cloud applications.
- Point cloud data can be used to form point cloud media, which can be a media file.
- the point cloud media may include multiple media frames, each media frame in the point cloud media being composed of point cloud data.
- Point cloud media can express the spatial structure and surface properties of 3D objects or 3D scenes flexibly and conveniently, so it is widely used.
- After the point cloud media is encoded, the encoded code stream is encapsulated to form an encapsulation file, which can be transmitted to users.
- On the receiving side, the encapsulation file needs to be decapsulated first, then decoded, and finally the decoded data stream is presented.
- Encapsulation files can also be referred to as point cloud files.
- point clouds can be encoded through the point cloud encoding framework.
- the point cloud coding framework can be the Geometry-based Point Cloud Compression (G-PCC) codec framework or the Video-based Point Cloud Compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by the Audio Video coding Standard (AVS) working group.
- G-PCC codec framework can be used to compress the first static point cloud and the third type of dynamically acquired point cloud
- the V-PCC codec framework can be used to compress the second type of dynamic point cloud.
- the G-PCC codec framework is also called point cloud codec TMC13
- the V-PCC codec framework is also called point cloud codec TMC2.
- FIG. 1 is a schematic block diagram of an encoding framework 100 provided by an embodiment of the present application.
- the encoding framework 100 can obtain the location information and attribute information of the point cloud from the acquisition device.
- the encoding of point cloud includes position encoding and attribute encoding.
- the process of position encoding includes: performing preprocessing on the original point cloud, such as coordinate transformation, quantization and removing duplicate points; and encoding to form a geometric code stream after constructing an octree.
- the attribute coding process includes: given the reconstruction information of the position information of the input point cloud and the true values of the attribute information, selecting one of three prediction modes for point cloud prediction, quantizing the prediction result, and performing arithmetic coding to form the attribute code stream.
- position encoding can be achieved by the following units:
- a coordinate transformation (Transform coordinates) unit 101, a quantize and remove duplicate points (Quantize and remove points) unit 102, an octree analysis (Analyze octree) unit 103, a geometric reconstruction (Reconstruct geometry) unit 104, and a first arithmetic encoding (Arithmetic encode) unit 105.
- the coordinate transformation unit 101 can be used to transform the world coordinates of the points in the point cloud into relative coordinates. For example, the minimum values on the x, y, and z coordinate axes are respectively subtracted from the geometric coordinates of the points, which is equivalent to a DC removal operation, thereby transforming the coordinates of the points in the point cloud from world coordinates to relative coordinates.
- the quantization and removal of duplicate points unit 102 can reduce the number of distinct coordinates through quantization; points that were originally different may be assigned the same coordinates after quantization, and on this basis, duplicate points can be deleted through a deduplication operation; for example, multiple points with the same quantized position but different attribute information can be merged into one point through attribute transformation.
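The quantize-and-deduplicate step can be sketched as follows. Merging attributes by simple averaging is one possible attribute-transfer rule, assumed here for illustration; the helper name `quantize_and_dedup` is hypothetical:

```python
# Sketch: quantize positions to an integer grid, then merge points that
# collapse onto the same quantized position (averaging their attributes).
def quantize_and_dedup(points, qstep):
    merged = {}
    for pos, attr in points:
        # Quantized grid cell of this point.
        key = tuple(int(c // qstep) for c in pos)
        merged.setdefault(key, []).append(attr)
    # One surviving point per occupied cell, attribute = mean of merged attrs.
    return {key: sum(attrs) / len(attrs) for key, attrs in merged.items()}
```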
- the quantization and removal of duplicate points unit 102 is an optional unit module.
- the octree analysis unit 103 may encode the position information of the quantized points using an octree encoding method.
- the point cloud is divided in the form of an octree, so that the positions of the points correspond one-to-one to occupied positions of the octree.
- For each occupied position in the octree, a flag is recorded as 1 for geometry encoding.
- the first arithmetic encoding unit 105 can perform arithmetic coding on the position information output by the octree analysis unit 103 using an entropy coding method; that is, a geometric code stream is generated from the position information output by the octree analysis unit 103 by arithmetic coding. The geometric code stream can also be called a geometry bitstream.
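One octree subdivision step can be sketched as computing, for a cube, which of its 8 child octants contain at least one point; the resulting 8-bit occupancy pattern is the kind of flag sequence the octree analysis unit passes to entropy coding. The coordinate layout and bit ordering here are illustrative assumptions:

```python
# Sketch: 8-bit occupancy of one octree node with corner `origin` and
# half-side `half`; bit i set means child octant i contains a point.
def occupancy_byte(points, origin, half):
    occ = 0
    for x, y, z in points:
        # One bit per axis: which half of the cube the point falls in.
        idx = ((x >= origin[0] + half) << 2) | ((y >= origin[1] + half) << 1) | (z >= origin[2] + half)
        occ |= 1 << idx
    return occ
```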
- Attribute encoding can be achieved through the following units:
- a color space transform (Transform colors) unit 110, an attribute transform (Transfer attributes) unit 111, a Region Adaptive Hierarchical Transform (RAHT) unit 112, a predicting transform unit 113, a lifting transform unit 114, a quantization (Quantize) unit 115, and a second arithmetic encoding unit 116.
- the color space transformation unit 110 may be used to transform the RGB color space of the points in the point cloud into YCbCr format or other formats.
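The RGB-to-YCbCr conversion performed by the color space transform unit can be sketched with the common full-range BT.601 coefficients; the application does not fix a particular conversion matrix, so these coefficients are an assumption for illustration:

```python
# Sketch: RGB -> YCbCr using full-range BT.601 coefficients (assumed).
def rgb_to_ycbcr(r, g, b):
    y  =  0.299 * r + 0.587 * g + 0.114 * b         # luma (Y)
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128   # blue color difference (Cb/U)
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128   # red color difference (Cr/V)
    return y, cb, cr
```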
- the attribute transformation unit 111 may be used to transform attribute information of points in the point cloud to minimize attribute distortion.
- the attribute transform unit 111 may also be used to obtain the true values of the attribute information of the points.
- For example, the attribute information may be the color information of the points.
- any prediction unit can be selected to predict the point in the point cloud.
- the unit for predicting points in the point cloud may include at least one of: RAHT 112 , predicting transform unit 113 and lifting transform unit 114 .
- any one of the RAHT unit 112, the predicting transform unit 113, and the lifting transform unit 114 can be used to predict the attribute information of points in the point cloud to obtain predicted values of the attribute information; further, the residual value of the attribute information of a point can be obtained based on its predicted value.
- the residual value of the attribute information of the point may be the actual value of the attribute information of the point minus the predicted value of the attribute information of the point.
- the predicting transform unit 113 may also be used to generate a level of detail (LOD) structure, sequentially predict the attribute information of points in the LOD order, and calculate prediction residuals for subsequent quantization and coding. Specifically, for each point in the LOD, the three nearest neighbor points located before it in the LOD order are found, and the reconstructed values of these three neighbors are used to predict the current point to obtain its predicted value; on this basis, the residual value of the current point can be obtained from the predicted value and the true value of the current point. For example, the residual value can be determined based on the following equation:
- AttrResidualQuant = (attrValue - attrPred) / Qstep
- AttrResidualQuant represents the residual value of the current point
- attrPred represents the predicted value of the current point
- attrValue represents the real value of the current point
- Qstep represents the quantization step size.
- Qstep is calculated by the quantization parameter (Quantization Parameter, Qp).
- the current point will be used as the nearest neighbor of the subsequent point, and the reconstructed value of the current point will be used to predict the attribute information of the subsequent point.
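The neighbor-based prediction described above can be sketched as a weighted average over the three nearest already-coded neighbors. The inverse-distance weighting rule and the helper name `predict_attribute` are illustrative assumptions, not details taken from this application:

```python
import math

# Sketch: predict the current point's attribute as the inverse-distance-
# weighted average of its neighbors' reconstructed attribute values.
def predict_attribute(current_pos, neighbors):
    # neighbors: list of (position, reconstructed_attribute) pairs
    weights = [1.0 / max(math.dist(current_pos, pos), 1e-9) for pos, _ in neighbors]
    total = sum(weights)
    return sum(w * attr for w, (_, attr) in zip(weights, neighbors)) / total
```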
- the reconstructed value of the attribute information of the current point can be obtained by the following formula:
- reconstructedColor = attrResidualQuant × Qstep + attrPred
- reconstructedColor represents the reconstructed value of the current point
- attrResidualQuant represents the residual value of the current point
- Qstep represents the quantization step size
- attrPred represents the predicted value of the current point.
- Qstep is calculated by the quantization parameter (Quantization Parameter, Qp).
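The residual quantization and reconstruction relations above (AttrResidualQuant and reconstructedColor) can be sketched as a round trip; the round-to-nearest rule is an illustrative assumption, as the codec's exact rounding behavior may differ:

```python
# Sketch of the quantization formula: AttrResidualQuant = (attrValue - attrPred) / Qstep
def quantize_residual(attr_value, attr_pred, qstep):
    return round((attr_value - attr_pred) / qstep)

# Sketch of the reconstruction formula: reconstructedColor = attrPred + AttrResidualQuant * Qstep
def reconstruct(attr_pred, attr_residual_quant, qstep):
    return attr_pred + attr_residual_quant * qstep
```

The reconstruction error introduced by quantization is bounded by roughly half the quantization step.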
- the LOD generation process includes: obtaining the Euclidean distance between points according to the position information of the points in the point cloud; dividing the points into different LOD layers according to the Euclidean distance.
- different ranges of Euclidean distance may be assigned to different LOD layers. For example, a point can be randomly picked as the first LOD layer. Then the Euclidean distances between the remaining points and that point are calculated, and points whose Euclidean distance meets a first threshold are classified into the second LOD layer.
- Next, taking the centroid of the points in the second LOD layer, the Euclidean distances between the points outside the first and second LOD layers and this centroid are calculated, and points whose Euclidean distance meets a second threshold are classified into the third LOD layer; and so on, until all points are assigned to LOD layers.
- By adjusting the Euclidean distance thresholds, the number of points in each LOD layer can be made to increase layer by layer.
- the manner of dividing the LOD layers may also adopt other manners, which are not limited in this application. It should be noted that the point cloud can be directly divided into one or more LOD layers, or the point cloud can be divided into multiple point cloud slices first, and then each point cloud slice can be divided into one or more LOD layers.
- the point cloud can be divided into multiple point cloud slices, and the number of points in each point cloud slice can be between 550,000 and 1.1 million.
- Each point cloud tile can be seen as a separate point cloud.
- Each point cloud slice can be divided into multiple LOD layers, and each LOD layer includes multiple points. In one embodiment, the division of the LOD layer may be performed according to the Euclidean distance between points.
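The distance-based LOD division described above can be sketched as a greedy subsampling: a point joins the current (coarser) refinement layer only if it is farther than the layer threshold from every point already retained, and the threshold shrinks for each finer layer. The halving schedule and the helper name `build_lods` are illustrative assumptions:

```python
import math

# Sketch: distance-threshold LOD layering (coarse -> fine).
def build_lods(points, initial_threshold, num_layers):
    retained = []          # points assigned to any layer so far
    layers = []
    remaining = list(points)
    threshold = initial_threshold
    for _ in range(num_layers - 1):
        layer, rest = [], []
        for p in remaining:
            # Keep p in this layer only if it is far from everything kept so far.
            if all(math.dist(p, q) >= threshold for q in retained + layer):
                layer.append(p)
            else:
                rest.append(p)
        layers.append(layer)
        retained += layer
        remaining = rest
        threshold /= 2   # assumed halving schedule per refinement layer
    layers.append(remaining)  # finest layer takes everything left
    return layers
```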
- FIG. 2 is a schematic block diagram of an LOD layer provided by an embodiment of the present application.
- the point cloud includes multiple points arranged in the original order, namely P0, P1, P2, P3, P4, P5, P6, P7, P8 and P9.
- Assume that, based on the Euclidean distances between points, the point cloud can be divided into 3 LOD layers, namely LOD0, LOD1 and LOD2.
- LOD0 may include P0, P5, P4 and P2
- LOD1 may include P1, P6 and P3
- LOD2 may include P9, P8 and P7.
- LOD0, LOD1 and LOD2 can be used to form the LOD-based order of the point cloud, namely P0, P5, P4, P2, P1, P6, P3, P9, P8 and P7.
- the LOD-based order can be used as the encoding order of the point cloud.
- the quantization unit 115 may be used to quantize residual values of attribute information of points. For example, if the quantization unit 115 and the predictive transformation unit 113 are connected, the quantization unit can be used to quantize the residual value of the attribute information of the point output by the predictive transformation unit 113 . For example, the residual value of the attribute information of the point output by the predictive transform unit 113 is quantized using a quantization step size, so as to improve the system performance.
- the second arithmetic coding unit 116 may perform entropy coding on the residual value of the attribute information of the point by using zero run length coding, so as to obtain the attribute code stream.
- the attribute code stream may be bit stream information
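The zero run-length idea used before entropy coding can be sketched as replacing runs of zero residuals with (zero_count, value) pairs; the exact symbol layout in the codec differs, so this pairing is an illustrative assumption:

```python
# Sketch: zero run-length coding of quantized attribute residuals.
def zero_run_length_encode(residuals):
    out, run = [], 0
    for v in residuals:
        if v == 0:
            run += 1
        else:
            out.append((run, v))  # `run` zeros followed by a non-zero residual
            run = 0
    if run:
        out.append((run, None))   # trailing zeros with no terminating value
    return out

def zero_run_length_decode(pairs):
    res = []
    for run, v in pairs:
        res += [0] * run
        if v is not None:
            res.append(v)
    return res
```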
- the predicted value (predicted value) of the attribute information of the point in the point cloud may also be referred to as the color predicted value (predicted Color) in the LOD mode.
- a residual value of the point can be obtained by subtracting the predicted value of the attribute information of the point from the actual value of the attribute information of the point.
- the residual value of the attribute information of the point may also be referred to as a color residual value (residualColor) in the LOD mode.
- the predicted value of the attribute information of the point and the residual value of the attribute information of the point are added to generate a reconstructed value of the attribute information of the point.
- the reconstructed value of the attribute information of the point may also be referred to as a color reconstructed value (reconstructedColor) in the LOD mode.
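The relationships among the predicted, residual, and reconstructed values described above can be summarized in a short sketch (illustrative only; an actual codec operates on quantized integer components):

```python
def residual(actual, predicted):
    # residualColor = actual value - predicted value
    return actual - predicted

def reconstruct(predicted, res):
    # reconstructedColor = predicted value + residual value
    return predicted + res

# When the residual is coded losslessly, reconstruction recovers
# the actual attribute value exactly.
```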
- FIG. 3 is a schematic block diagram of a decoding framework 200 provided by an embodiment of the present application.
- the decoding framework 200 can obtain the code stream of the point cloud from the encoding device, and obtain the position information and attribute information of the points in the point cloud by parsing the code stream.
- the decoding of point cloud includes position decoding and attribute decoding.
- the position decoding process includes: performing arithmetic decoding on the geometric code stream; constructing an octree and merging, and reconstructing the position information of the point to obtain the reconstruction information of the position information of the point; and performing coordinate transformation on the reconstruction information of the position information of the point to obtain the position information of the point.
- the position information of the point may also be referred to as the geometric information of the point.
- the attribute decoding process includes: parsing the attribute code stream to obtain the residual value of the attribute information of the point in the point cloud; inverse quantizing the residual value of the attribute information of the point to obtain the inverse-quantized residual value; based on the reconstruction information of the position information of the point obtained during position decoding, selecting one of the three prediction modes to predict the point cloud and obtain the predicted value; obtaining the reconstructed value of the attribute information of the point based on the predicted value and the residual value; and inversely transforming the reconstructed value of the attribute information of the point into the color space to obtain the decoded point cloud.
- the position decoding can be implemented by the following units: a first arithmetic decoding unit 201, an octree analysis (synthesize octree) unit 202, a geometric reconstruction (Reconstruct geometry) unit 203, and an inverse coordinate transformation (inverse transform coordinates) unit 204.
- Attribute decoding can be implemented by the following units: a second arithmetic decoding unit 210, an inverse quantize unit 211, a RAHT unit 212, a predicting transform unit 213, a lifting transform unit 214, and an inverse color space transform (inverse transform colors) unit 215.
- each unit in the decoding framework 200 may refer to the functions of the corresponding units in the encoding framework 100 .
- the decoding framework 200 may divide the point cloud into a plurality of LODs according to the Euclidean distances between the points in the point cloud, and then sequentially decode the attribute information of the points in the LODs. For example, the number of zeros in the zero-run coding technique (zero_cnt) is calculated, and the residual value is decoded based on the number of zeros; the decoding framework 200 may then perform inverse quantization on the decoded residual value.
- the inverse-quantized residual value is added to the predicted value of the current point to obtain the reconstructed value of the point, until all points in the point cloud are decoded.
- the decoding operation is performed according to the zero-run coding technology. First, the first zero_cnt in the code stream is parsed. If zero_cnt is greater than 0, the residual value of the attribute information of this point is 0; if zero_cnt is 0, the residual value of the attribute information of this point is not 0. If the residual value is 0, the first parsing function is used to parse the residual value; if the residual value is not 0, the second parsing function is used to parse the residual value.
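The zero-run decoding step above can be sketched as follows. This is a simplified model, assuming the stream alternates a zero_cnt with one non-zero residual; the real TMC13 syntax and its two parsing functions differ:

```python
def decode_residuals(stream, num_points):
    # Simplified zero-run decoding: each zero_cnt emits that many zero
    # residuals, followed by one non-zero residual (if points remain).
    residuals = []
    it = iter(stream)
    while len(residuals) < num_points:
        zero_cnt = next(it)
        residuals.extend([0] * zero_cnt)
        if len(residuals) < num_points:
            residuals.append(next(it))
    return residuals

# decode_residuals([2, 5, 0, -3, 1, 4], 6) yields [0, 0, 5, -3, 0, 4]
```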
- the current point will be used as the nearest neighbor of the subsequent LOD midpoint, and the reconstructed value of the current point will be used to predict the attribute information of the subsequent point.
- inverse transform and inverse quantization (scale/scaling) are the inverse operations of the transform and quantization performed at the encoder
- for an orthogonal transform, if one matrix of a pair is used for the transform, the other matrix is used for the inverse transform
- the matrices used in the decoder may be referred to as "transform" matrices.
- Embodiments of the present application provide a point cloud decoding method, an encoding method, a decoder, and an encoder, which can improve the reconstruction accuracy in the point cloud decoding process, and further, can improve the decoding effect.
- FIG. 4 shows a schematic flowchart of a point cloud decoding method 300 according to an embodiment of the present application, and the decoding method 300 may be executed by a decoding end.
- the decoding framework 200 shown in FIG. 3 may be, for example, the point cloud decoder TMC13.
- the technical solution of the present application will be described below by taking the decoder as the execution body.
- the decoding method 300 may include:
- S305 Obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
- the initial reconstructed value can be a value in RGB format
- when the three primary color components are all 0 (the weakest), the mixture is black; when the three primary color components are all k (the strongest), the mixture is white.
- various colors between black and white can be mixed.
- the luminance value, that is, the luminance signal Y
- the chrominance value, that is, the color difference signal Cr and the color difference signal Cb
- together these form the YCbCr value
- Kalman filtering is performed on the converted original chrominance values, so as to improve the decoding effect.
- the reconstruction accuracy of the attribute information of the target point can be improved.
- the quality of the point reconstruction process is enhanced, correspondingly, the decoding effect of point cloud coding can be improved.
- using the Kalman filtering algorithm to filter only the initial chrominance value can improve the filtering effect and further improve the decoding effect of point cloud encoding.
- BDBR represents the code rate difference under the same Peak Signal to Noise Ratio (PSNR), the smaller the BDBR, the better the performance of the encoding algorithm.
- the A-type point cloud sequence represents a point cloud whose points include both the color information of the point and other attribute information
- the B-type point cloud sequence represents a point cloud whose points include only the color information of the point.
- the BDBR average values of the A-type and B-type point cloud sequences can objectively and truly reflect that the performance of the encoding algorithm can be improved by introducing quantization weights.
- the code rate decreases and the PSNR increases, which can indicate that the new method has better performance.
- the PSNR indicates the quality of the video: the higher the PSNR, the better the video quality.
- BDBR can be used to measure the performance of the encoding algorithm.
- other parameters can also be used to measure the performance of the encoding algorithm to characterize the changes in the bit rate and PSNR of the video obtained by the new method compared to the video obtained by the original method. This is not specifically limited.
- the Bjøntegaard delta peak signal-to-noise rate (delta peak signal-to-noise rate, BD-PSNR or BDPSNR) can also be used to measure the performance of the coding algorithm
- BDPSNR represents the difference in PSNR under the same code rate
- the larger the BDPSNR the better the performance of the coding algorithm.
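The BD-PSNR idea above can be sketched numerically. This simplified version averages the PSNR gap over the overlapping log-rate range using piecewise-linear interpolation, whereas the original Bjøntegaard metric fits cubic polynomials; all names here are illustrative:

```python
import math

def bd_psnr(rate_a, psnr_a, rate_b, psnr_b, samples=100):
    # Average PSNR difference (B minus A) over the shared log10(rate) range.
    def interp(x, xs, ys):
        for i in range(len(xs) - 1):
            if xs[i] <= x <= xs[i + 1]:
                t = (x - xs[i]) / (xs[i + 1] - xs[i])
                return ys[i] + t * (ys[i + 1] - ys[i])
        return ys[-1]
    la = [math.log10(r) for r in rate_a]
    lb = [math.log10(r) for r in rate_b]
    lo, hi = max(la[0], lb[0]), min(la[-1], lb[-1])
    diffs = []
    for k in range(samples + 1):
        x = lo + (hi - lo) * k / samples
        diffs.append(interp(x, lb, psnr_b) - interp(x, la, psnr_a))
    return sum(diffs) / len(diffs)
```

A positive BD-PSNR means method B achieves higher PSNR at the same rate, i.e. better coding performance, matching the statement above.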
- the Kalman filtering process in the embodiment of the present application may be combined with the post-processing process of the decoder, or may be combined with the filtering processing process in the loop, which is not specifically limited in the embodiment of the present application.
- the point cloud involved in this application may be a complete point cloud, or may be a point cloud formed by dividing the complete point cloud into slices.
- FIG. 5 is a schematic diagram of the principle of Kalman filtering provided by an embodiment of the present application.
- the initial chromaticity value of the attribute information of the target point can be represented by a curve 371
- the final chromaticity value of the attribute information of the target point can be represented by a curve 372
- the measured value of the attribute information of the target point can be represented by a curve 373
- in other words, the initial chromaticity value of the attribute information of the target point can be filtered by using the measured value of the attribute information of the target point, so as to obtain the final chromaticity value of the attribute information of the target point.
- the initial chrominance value may also be referred to as a priori estimated value; the final chrominance value may also be referred to as an optimal estimated value or a posteriori estimated value.
- Kalman filtering of target points in a point cloud can be divided into a prediction process and a correction process.
- assume the target point is the kth point in the point cloud. In the prediction process, the state of the kth point can be estimated according to the initial chromaticity value of the (k-1)th point, and the measured value of the kth point can be obtained; in the correction process, the measured value of the kth point is used to correct the initial chromaticity value of the kth point to obtain the final chromaticity value of the kth point.
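The prediction and correction processes above can be sketched as a scalar Kalman filter. This is a minimal sketch with assumed process-noise and measurement-noise parameters q and r; the codec's actual state model and covariances are not specified in the text:

```python
def kalman_scalar(priors, measurements, p0=1.0, q=1e-3, r=1.0):
    # priors: initial (a priori) chrominance estimates, one per point
    # measurements: measured values derived from previously decoded points
    p = p0  # estimate error covariance
    finals = []
    for prior, z in zip(priors, measurements):
        p = p + q                        # prediction: covariance grows
        k = p / (p + r)                  # Kalman gain
        final = prior + k * (z - prior)  # correction toward the measurement
        p = (1.0 - k) * p                # covariance update
        finals.append(final)             # a posteriori (final) estimate
    return finals
```

With r = 0 the filter trusts the measurement completely; with a very large r the prior (initial) estimate is kept nearly unchanged.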
- the S303 may include:
- the Kalman filtering algorithm is used to filter the initial chrominance value to obtain a final chrominance value.
- the reconstructed value of the attribute information of one or more points before the target point may be the reconstructed value of the attribute information of a point before the target point.
- the reconstructed value of the attribute information of one or more points before the target point may be an average value, a maximum value or a minimum value of the reconstructed values of the attribute information of a plurality of points before the target point. For example, taking the chrominance value of the initial reconstructed value of the attribute information of one or more points before the target point as the measurement value, and using the Kalman filtering algorithm to filter the initial chrominance value to obtain the final chrominance value.
- one or more points before the target point are one or more points located before the target point in decoding order.
- one or more points before the target point may also be one or more points determined in another order. It should be noted that one or more points before the target point can be understood as one or more points located before the target point in the LOD layer where the target point is located, or as one or more points located before the target point in the coding order.
- the decoder will use the chrominance value of the reconstructed value as a measurement value, and use the Kalman filtering algorithm to filter the initial chrominance value to obtain the final chrominance value.
- the S303 may include:
- the residual value of the attribute information of the target point is obtained after performing inverse quantization processing based on a target quantization step size among multiple quantization step sizes; the S302 may include: when the target quantization step size is greater than or equal to a threshold N, converting the initial reconstruction value into the initial luminance value and the initial chrominance value, where N is a non-negative integer.
- the multiple quantization step sizes are sorted in ascending order, N is the value of the nth quantization step size among the multiple quantization step sizes, and n is an integer greater than 0.
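The threshold rule above can be sketched as follows, assuming n indexes 1-based into the ascending list of quantization step sizes (the exact indexing convention is not fixed by the text):

```python
def should_filter(target_qstep, qsteps, n):
    # Threshold N is the value of the n-th quantization step size in
    # ascending order; filtering applies when target_qstep >= N.
    threshold_n = sorted(qsteps)[n - 1]
    return target_qstep >= threshold_n
```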
- the initial reconstructed value is converted into the initial luminance value and the initial chrominance value
- the S302 may include:
- the initial reconstructed value is converted into the initial luminance value and the initial chrominance value using a color space conversion function, and the initial luminance value and the initial chrominance value are located in the color space supported by the display screen.
- the color space conversion function may be transformGbrToYCbCrBt709.
- the color space may also be referred to as a color gamut space.
- the purpose of this application is to protect the use of a color space conversion function to convert the initial reconstructed value into a color space or color gamut space supported by the display screen; the embodiment of this application does not limit the specific conversion function.
- different conversion functions may be used according to the properties of the display screen, and even different conversion functions may be used based on the color spaces or color gamut spaces involved in different codec standards.
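As one concrete possibility, a BT.709 RGB-to-YCbCr conversion can be sketched as below. This is a full-range floating-point sketch; the actual transformGbrToYCbCrBt709 function may use fixed-point arithmetic and studio-range offsets:

```python
def rgb_to_ycbcr_bt709(r, g, b):
    # BT.709 luma coefficients: Kr = 0.2126, Kg = 0.7152, Kb = 0.0722
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    cb = (b - y) / 1.8556 + 0.5   # 1.8556 = 2 * (1 - Kb)
    cr = (r - y) / 1.5748 + 0.5   # 1.5748 = 2 * (1 - Kr)
    return y, cb, cr
```

For white (1, 1, 1) this yields Y = 1 and neutral chrominance Cb = Cr = 0.5, as expected.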
- the S301 may include:
- the attribute information of the point is obtained.
- M is a positive integer greater than 1
- m is a positive integer greater than or equal to 0.
- the quantization process may introduce a large quantization error.
- the residual value of the attribute information of some points is not quantized, which means that for these points the reconstructed value of the attribute information equals the real value. This can improve the accuracy of the reconstructed values of the attribute information of all points in the point cloud.
- the improvement in reconstruction accuracy will make the update and iteration process of the Kalman filter more accurate, and further improve the decoding effect.
- correspondingly, the decoder does not perform inverse quantization on the points at the same positions, so as to decode correct and better-quality reconstructed values.
- the method 300 may further include:
- the code stream is analyzed to obtain the number of points of the residual value of the attribute information of the points that do not need inverse quantization in the code stream.
- the S301 may include:
- the quantization process may introduce a large quantization error.
- the final reconstructed values of some points are replaced with real values, which can improve the accuracy of the reconstructed values of the attribute information of all points in the point cloud.
- the update and iteration process of the Kalman filter will be more accurate, further improving the decoding effect.
- correspondingly, the decoder uses the real value for the points at the same positions, so as to decode correct and better-quality reconstructed values.
- the method 300 may further include:
- the code stream is parsed to obtain the number of points whose final reconstructed value of the attribute information of the point in the code stream is the real value, and the real value of the attribute information of the point.
- the method 300 may further include:
- the point cloud is divided into one or more LOD layers, each LOD layer including one or more points.
- the S301 may include:
- FIG. 6 is another schematic diagram of a decoding method provided by an embodiment of the present application.
- when the decoder traverses the point cloud, the initial reconstructed value of the attribute information of the current point in the point cloud is an RGB value, and the decoder calculates the average reconstructed value of the reconstructed values of the attribute information of the three points before the current point. Then, the decoder can use the function transformGbrToYCbCrBt709 to convert the initial reconstructed value of the current point and the average reconstructed value of the previous three points from the RGB domain to the YUV domain; that is, the decoder obtains the average chrominance value and average luminance value of the average reconstructed value, as well as the initial chrominance value and initial luminance value of the current point.
- the decoder performs the Kalman filter calculation using the obtained average chrominance value as the measured value and the initial chrominance value as the predicted value, so as to calculate the final chrominance value of the current point, and then obtains the final reconstructed value based on the final chrominance value and the initial luminance value of the current point.
- the UV components of the YUV value of the average reconstructed value are used as an input parameter required by the Kalman filter, that is, the measured value
- the UV component of the YUV component of the current point is used as another input parameter of the Kalman filter, that is, the predicted value.
- the filtered final reconstructed value will overwrite the initial reconstructed value of the current point.
- Kalman filtering is only performed on the initial chrominance value, which avoids the quality loss of the initial luminance value and can improve the filtering performance.
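The chrominance-only filtering flow above can be condensed into a short sketch; the fixed gain stands in for the adaptively computed Kalman gain, and all names are illustrative:

```python
def filter_point(init_ycbcr, measured_ycbcr, gain=0.5):
    # Keep the initial luminance untouched; correct only the chrominance
    # toward the measurement (e.g. the average of previous reconstructions).
    y, cb, cr = init_ycbcr
    _, cb_m, cr_m = measured_ycbcr
    return (y, cb + gain * (cb_m - cb), cr + gain * (cr_m - cr))
```

Because the first component is returned unchanged, the luminance suffers no quality loss from filtering, as the text notes.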
- FIG. 7 shows a schematic flowchart of a point cloud encoding method 400 according to an embodiment of the present application.
- the method 400 may be performed by the encoding side.
- for example, the encoding framework 100 shown in FIG. 1, or the encoder, may execute the method 400.
- the method 400 may include:
- S405 Encode the number of residual values of the attribute information of the lossless encoded points to be written in the code stream and the residual value of the attribute information of the target point to obtain the code stream.
- the S405 may include:
- the encoder processes the position information of a target point in the point cloud to obtain reconstruction information of the position information of the target point; obtains the predicted value of the attribute information of the target point according to the reconstruction information of the position information of the target point; processes the attribute information of the target point in the point cloud to obtain the real value of the attribute information of the target point; obtains the residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the real value of the attribute information of the target point; and encodes the residual value of the attribute information of the target point to obtain the code stream.
- the number of points whose residual values of attribute information do not need to be quantized may be encoded; for example, when the number of points in one LOD layer in the point cloud is less than the threshold M, for the points in the LOD layer, the residual values of the attribute information of the points are encoded without quantization to obtain the code stream; when the number of points in one LOD layer in the point cloud is greater than or equal to M, for the first m*M points in the LOD layer, the residual values of the attribute information of the points are encoded without quantization to obtain the code stream, where M is a positive integer greater than 1 and m is a positive integer greater than or equal to 0.
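The per-layer selection rule above can be sketched as follows (illustrative only; the real encoder signals these points in the code stream rather than recomputing them):

```python
def lossless_point_indices(lod_sizes, threshold_m, m):
    # For a layer with fewer than threshold_m points, all residuals are
    # kept unquantized; otherwise only the first m * threshold_m points.
    indices, base = [], 0
    for size in lod_sizes:
        count = size if size < threshold_m else min(m * threshold_m, size)
        indices.extend(range(base, base + count))
        base += size
    return indices
```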
- the method 400 may further include:
- the point cloud is divided into one or more LOD layers, each LOD layer including one or more points.
- the size of the sequence numbers of the above-mentioned processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
- FIG. 8 is a schematic block diagram of a point cloud decoder 500 provided by an embodiment of the present application.
- the decoder 500 may include:
- the parsing unit 501 is used for parsing the code stream of the point cloud to obtain the initial reconstruction value of the attribute information of a target point in the point cloud;
- a conversion unit 502 configured to convert the initial reconstruction value into an initial luminance value and an initial chrominance value
- a filtering unit 503 configured to filter the initial chrominance value using a Kalman filtering algorithm to obtain a final chrominance value
- a first processing unit 504 configured to obtain a final reconstruction value of the attribute information of the target point based on the final chrominance value and the initial luminance value;
- the second processing unit 505 is configured to obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
- the filtering unit 503 is specifically configured to:
- the Kalman filtering algorithm is used to filter the initial chromaticity value to obtain the final chromaticity value.
- one or more points before the target point are one or more points located before the target point in decoding order.
- the residual value of the attribute information of the target point is obtained after performing inverse quantization processing based on the target quantization step size among multiple quantization step sizes; the conversion unit 502 is specifically configured to:
- when the target quantization step size is greater than or equal to the threshold N, the initial reconstruction value is converted into the initial luminance value and the initial chrominance value, where N is a non-negative integer.
- the multiple quantization step sizes are sorted in ascending order, N is the value of the nth quantization step size among the multiple quantization step sizes, and n is an integer greater than 0.
- the conversion unit 502 is specifically configured to:
- the initial reconstructed value is converted into the initial luminance value and the initial chrominance value using a color space conversion function, and the initial luminance value and the initial chrominance value are located in the color space supported by the display screen.
- the parsing unit 501 is specifically configured to:
- the attribute information of the point is obtained.
- M is a positive integer greater than 1
- m is a positive integer greater than or equal to 0.
- the parsing unit 501 is further configured to:
- the code stream is analyzed to obtain the number of points of the residual value of the attribute information of the points that do not need inverse quantization in the code stream.
- the parsing unit 501 is specifically configured to:
- the parsing unit 501 is further configured to:
- the code stream is parsed to obtain the number of points whose final reconstructed value of the attribute information of the point in the code stream is the real value, and the real value of the attribute information of the point.
- the parsing unit 501 is further configured to:
- the point cloud is divided into one or more LOD layers, each LOD layer including one or more points.
- the parsing unit 501 is specifically configured to:
- the initial reconstruction value of the attribute information of the target point is obtained.
- FIG. 9 is a schematic block diagram of a point cloud encoder 600 provided by an embodiment of the present application.
- the encoder 600 may include:
- a first processing unit 601 configured to process the position information of a target point in the point cloud to obtain reconstruction information of the position information of the target point;
- the second processing unit 602 is used to obtain the predicted value of the attribute information of the target point according to the reconstruction information of the position information of the target point;
- a third processing unit 603 configured to process the attribute information of the target point in the point cloud to obtain the real value of the attribute information of the target point;
- a fourth processing unit 604 configured to obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the real value of the attribute information of the target point;
- the encoding unit 605 is configured to encode the number of residual values of the attribute information of the lossless encoded points that need to be written into the code stream and the residual value of the attribute information of the target point to obtain the code stream.
- the encoding unit 605 is specifically used for:
- the first processing unit 601 is specifically configured to:
- the point cloud is divided into one or more LOD layers, each LOD layer including one or more points.
- the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here.
- the decoder 500 may correspond to the corresponding subject executing the method 300 of the embodiments of the present application, and each unit in the decoder 500 is configured to implement the corresponding process in the method 300; similarly, the encoder 600 may correspond to the corresponding subject executing the method 400 of the embodiments of the present application, and each unit in the encoder 600 is configured to implement the corresponding process in the method 400, which is not repeated here for brevity.
- each unit in the encoder or decoder involved in the embodiments of the present application may be separately or wholly merged into one or several other units, or some unit(s) may be further split into multiple functionally smaller units, which can realize the same operation without affecting the realization of the technical effects of the embodiments of the present application.
- the above units are divided based on logical functions.
- the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit.
- the encoder or decoder may also include other units, and in practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by cooperation of multiple units.
- the encoder or decoder involved in the embodiments of the present application may be constructed by running a computer program (including program code) capable of executing the steps of the corresponding method on a general-purpose computing device, which includes a general-purpose computer with processing elements and storage elements such as a central processing unit (CPU), a random access storage medium (RAM) and a read-only storage medium (ROM), so as to implement the encoding method or decoding method provided by the embodiments of the present application.
- the computer program may be recorded on, for example, a computer-readable storage medium, and loaded on any electronic device with processing capability through the computer-readable storage medium, and executed in it, to implement the corresponding methods of the embodiments of the present application.
- the units mentioned above can be implemented in the form of hardware, can also be implemented by instructions in the form of software, and can also be implemented in the form of a combination of software and hardware.
- the steps of the method embodiments in the embodiments of the present application may be completed by hardware integrated logic circuits in the processor and/or instructions in the form of software; the steps of the methods disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software in the decoding processor.
- the software may be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, and other storage media mature in the art.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
- FIG. 10 is a schematic structural diagram of an electronic device 700 provided by an embodiment of the present application.
- the electronic device 700 includes at least a processor 710 and a computer-readable storage medium 720 .
- the processor 710 and the computer-readable storage medium 720 may be connected through a bus or other means.
- the computer-readable storage medium 720 is used for storing a computer program 721
- the computer program 721 includes computer instructions
- the processor 710 is used for executing the computer instructions stored in the computer-readable storage medium 720 .
- the processor 710 is the computing core and the control core of the electronic device 700, which is suitable for implementing one or more computer instructions, and is specifically suitable for loading and executing one or more computer instructions to implement corresponding method processes or corresponding functions.
- the processor 710 may also be referred to as a central processing unit (Central Processing Unit, CPU).
- the processor 710 may include, but is not limited to: a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field Programmable Gate Array (Field Programmable Gate Array, FPGA) Or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
- the computer-readable storage medium 720 may be a high-speed RAM memory, or a non-volatile memory (Non-Volatile Memory), such as at least one disk memory.
- the computer-readable storage medium 720 includes, but is not limited to, volatile memory and/or non-volatile memory.
- the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or a flash memory.
- Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
- RAM: Random Access Memory
- SRAM: Static RAM
- DRAM: Dynamic RAM
- SDRAM: Synchronous DRAM (synchronous dynamic random access memory)
- DDR SDRAM: Double Data Rate SDRAM (double data rate synchronous dynamic random access memory)
- ESDRAM: Enhanced SDRAM
- SLDRAM: Synchlink DRAM (synchronous link dynamic random access memory)
- DR RAM: Direct Rambus RAM
- for example, the electronic device 700 may be the decoding framework 200 shown in FIG. 3 or the decoder 500 shown in FIG. 8; the computer-readable storage medium 720 stores first computer instructions; the processor 710 loads and executes the first computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps in the method embodiment shown in FIG. 4; in specific implementation, the first computer instructions in the computer-readable storage medium 720 are loaded by the processor 710 to execute the corresponding steps, which are not repeated here to avoid repetition.
- the electronic device 700 may be the encoding framework 100 shown in FIG. 1 or the encoder 600 shown in FIG. 9; the computer-readable storage medium 720 stores second computer instructions; the processor 710 loads and executes the second computer instructions stored in the computer-readable storage medium 720 to implement the corresponding steps in the method embodiment shown in FIG. 7. In a specific implementation, the second computer instructions in the computer-readable storage medium 720 are loaded by the processor 710 to perform the corresponding steps, which are not repeated here.
- an embodiment of the present application further provides a computer-readable storage medium (Memory), where the computer-readable storage medium is a memory device in the electronic device 700 for storing programs and data.
- the computer-readable storage medium 720 may include a built-in storage medium in the electronic device 700, and may also include an extended storage medium supported by the electronic device 700.
- the computer-readable storage medium provides storage space in which the operating system of the electronic device 700 is stored.
- one or more computer instructions suitable for being loaded and executed by the processor 710 are also stored in the storage space; these computer instructions may be one or more computer programs 721 (including program code).
- an embodiment of the present application further provides a computer program product or computer program, which comprises computer instructions stored in a computer-readable storage medium.
- the electronic device 700 may be a computer; the processor 710 reads the computer instructions from the computer-readable storage medium 720 and executes them, so that the computer executes the encoding method or decoding method provided in the above-mentioned various optional manners.
- the computer program product includes one or more computer instructions.
- the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
- the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Test sequence | Luma | Chroma (Cb) | Chroma (Cr) |
---|---|---|---|
Class A point cloud sequences | -0.8% | -5.7% | -7.4% |
Class B point cloud sequences | -0.3% | -3.6% | -4.2% |
Average over Class A and Class B point cloud sequences | -0.5% | -4.6% | -5.7% |
Claims (20)
- A point cloud decoding method, characterized by comprising: parsing a bitstream of a point cloud to obtain an initial reconstructed value of attribute information of a target point in the point cloud; converting the initial reconstructed value into an initial luma value and an initial chroma value; filtering the initial chroma value by using a Kalman filtering algorithm to obtain a final chroma value; obtaining a final reconstructed value of the attribute information of the target point based on the final chroma value and the initial luma value; and obtaining a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
- The method according to claim 1, wherein the filtering the initial chroma value by using the Kalman filtering algorithm to obtain the final chroma value comprises: using a chroma value of a reconstructed value of attribute information of one or more points preceding the target point as a measurement value, and filtering the initial chroma value by using the Kalman filtering algorithm to obtain the final chroma value.
- The method according to claim 2, wherein the using the chroma value of the reconstructed value of the attribute information of the one or more points preceding the target point as the measurement value and filtering the initial chroma value by using the Kalman filtering algorithm to obtain the final chroma value comprises: calculating an average reconstructed value of the reconstructed values of the attribute information of the one or more points preceding the target point; converting the average reconstructed value into an average luma value and an average chroma value; and using the average chroma value as the measurement value, filtering the initial chroma value by using the Kalman filtering algorithm to obtain the final chroma value.
- The method according to claim 2, wherein the one or more points preceding the target point are one or more points located before the target point in decoding order.
- The method according to claim 1, wherein a residual value of the attribute information of the target point is obtained by performing inverse quantization based on a target quantization step size among a plurality of quantization step sizes; and the converting the initial reconstructed value into the initial luma value and the initial chroma value comprises: when the target quantization step size is greater than or equal to a threshold N, converting the initial reconstructed value into the initial luma value and the initial chroma value, where N is a non-negative integer.
- The method according to claim 5, wherein the plurality of quantization step sizes are sorted in ascending order, N is the value of the n-th quantization step size among the plurality of quantization step sizes, and n is an integer greater than 0.
- The method according to claim 1, wherein the converting the initial reconstructed value into the initial luma value and the initial chroma value comprises: converting the initial reconstructed value into the initial luma value and the initial chroma value by using a color space conversion function, where the initial luma value and the initial chroma value are located in a color space supported by a display screen.
- The method according to claim 1, wherein the parsing the bitstream of the point cloud to obtain the initial reconstructed value of the attribute information of the target point in the point cloud comprises: when the number of points in one LOD layer of the point cloud is less than a threshold M, for the points in the LOD layer, obtaining an initial reconstructed value of attribute information of a point based on a losslessly encoded residual value of the attribute information of the point and a predicted value of the attribute information of the point; and when the number of points in one LOD layer of the point cloud is greater than or equal to M, for the (m*M)-th point in the LOD layer, obtaining the initial reconstructed value of the attribute information of the point based on the losslessly encoded residual value of the attribute information of the point and the predicted value of the attribute information of the point, where M is a positive integer greater than 1, and m is an integer greater than or equal to 0.
- The method according to claim 8, further comprising: parsing the bitstream to obtain the number of points in the bitstream whose residual values of attribute information do not need to be inverse quantized.
- The method according to claim 1, wherein the parsing the bitstream of the point cloud to obtain the initial reconstructed value of the attribute information of the target point in the point cloud comprises: when the number of points in one LOD layer of the point cloud is less than a threshold T, for the points in the LOD layer, replacing the final reconstructed value of the attribute information of a point with the real value of the attribute information; and when the number of points in one LOD layer of the point cloud is greater than or equal to T, replacing the final reconstructed value of the attribute information of the (t*T)-th point in the LOD layer with the real value, where T is a positive integer greater than 1, and t is an integer greater than or equal to 0.
- The method according to claim 10, further comprising: parsing the bitstream to obtain the number of points in the bitstream whose final reconstructed values of attribute information are real values, and the real values of the attribute information of those points.
- The method according to claim 1, further comprising: dividing the point cloud into one or more LOD layers, each LOD layer comprising one or more points.
- The method according to claim 1, wherein the parsing the bitstream of the point cloud to obtain the initial reconstructed value of the attribute information of the target point in the point cloud comprises: parsing the bitstream to obtain reconstructed information of position information of the target point; obtaining a predicted value of the attribute information of the target point according to the reconstructed information of the position information of the target point; parsing the bitstream to obtain a residual value of the attribute information of the target point; and obtaining the initial reconstructed value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the residual value of the attribute information of the target point.
- A point cloud encoding method, characterized by comprising: processing position information of a target point in a point cloud to obtain reconstructed information of the position information of the target point; obtaining a predicted value of attribute information of the target point according to the reconstructed information of the position information of the target point; processing the attribute information of the target point in the point cloud to obtain a real value of the attribute information of the target point; obtaining a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the real value of the attribute information of the target point; and encoding the number of residual values of attribute information of losslessly encoded points to be written into a bitstream and the residual value of the attribute information of the target point, to obtain the bitstream.
- The method according to claim 14, wherein the encoding the number of residual values of attribute information of losslessly encoded points to be written into the bitstream and the residual value of the attribute information of the target point to obtain the bitstream comprises: when the number of points in one LOD layer of the point cloud is less than a threshold M, for the points in the LOD layer, encoding residual values of attribute information of points without inverse quantization to obtain the bitstream; and when the number of points in one LOD layer of the point cloud is greater than or equal to M, for the (m*M)-th point in the LOD layer, encoding the residual value of the attribute information of the point without inverse quantization to obtain the bitstream, where M is a positive integer greater than 1, and m is an integer greater than or equal to 0.
- The method according to claim 14, further comprising: dividing the point cloud into one or more LOD layers, each LOD layer comprising one or more points.
- A decoder, characterized by comprising: a parsing unit configured to parse a bitstream of a point cloud to obtain an initial reconstructed value of attribute information of a target point in the point cloud; a conversion unit configured to convert the initial reconstructed value into an initial luma value and an initial chroma value; a filtering unit configured to filter the initial chroma value by using a Kalman filtering algorithm to obtain a final chroma value; a first processing unit configured to obtain a final reconstructed value of the attribute information of the target point based on the final chroma value and the initial luma value; and a second processing unit configured to obtain a decoded point cloud according to the final reconstructed value of the attribute information of the target point.
- An encoder, characterized by comprising: a first processing unit configured to process position information of a target point in a point cloud to obtain reconstructed information of the position information of the target point; a second processing unit configured to obtain a predicted value of attribute information of the target point according to the reconstructed information of the position information of the target point; a third processing unit configured to process the attribute information of the target point in the point cloud to obtain a real value of the attribute information of the target point; a fourth processing unit configured to obtain a residual value of the attribute information of the target point according to the predicted value of the attribute information of the target point and the real value of the attribute information of the target point; and an encoding unit configured to encode the number of residual values of attribute information of losslessly encoded points to be written into a bitstream and the residual value of the attribute information of the target point, to obtain the bitstream.
- An electronic device, characterized by comprising: a processor adapted to execute a computer program; and a computer-readable storage medium storing a computer program that, when executed by the processor, implements the point cloud decoding method according to any one of claims 1 to 13 or the point cloud encoding method according to any one of claims 14 to 16.
- A computer-readable storage medium, characterized in that the computer-readable storage medium comprises computer instructions adapted to be loaded by a processor to execute the point cloud decoding method according to any one of claims 1 to 13 or the point cloud encoding method according to any one of claims 14 to 16.
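The chroma filtering recited in claims 1 to 3 can be sketched as a scalar Kalman filter: the initially reconstructed chroma value of each point serves as the prediction, and the average chroma of previously decoded points serves as the measurement. This is a minimal illustrative sketch, not the filter specified in the application; the function name and the noise variances `q` and `r` are assumptions.

```python
def kalman_filter_chroma(initial_chroma, q=0.01, r=4.0):
    """Filter a decode-ordered list of initial chroma values.

    q: assumed process-noise variance; r: assumed measurement-noise variance.
    """
    filtered = []
    x, p = initial_chroma[0], 1.0   # state estimate and its error variance
    filtered.append(x)
    for k in range(1, len(initial_chroma)):
        # Predict: the initially reconstructed chroma value drives the state.
        x_pred = initial_chroma[k]
        p_pred = p + q
        # Measure: average chroma of the previously filtered points
        # (the "points preceding the target point" of claims 2 and 3).
        z = sum(filtered) / len(filtered)
        # Update with the Kalman gain.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (z - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x)
    return filtered
```

Each final chroma value is pulled toward the running average of earlier points by an amount set by the gain, which matches the smoothing effect the claims attribute to the filtering; the luma value is left untouched, as in claim 1.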
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/138435 WO2022133755A1 (zh) | 2020-12-22 | 2020-12-22 | 点云的解码方法、编码方法、解码器以及编码器 |
CN202080107853.9A CN116601947A (zh) | 2020-12-22 | 2020-12-22 | 点云的解码方法、编码方法、解码器以及编码器 |
JP2023537455A JP2024505796A (ja) | 2020-12-22 | 2020-12-22 | 点群復号化方法、点群符号化方法、復号器及び符号器 |
KR1020237024384A KR20230124018A (ko) | 2020-12-22 | 2020-12-22 | 포인트 클라우드 디코딩 방법, 포인트 클라우드 인코딩방법, 디코더 및 인코더 |
EP20966335.0A EP4270949A4 (en) | 2020-12-22 | 2020-12-22 | DECODING METHOD AND POINT CLOUD ENCODING METHOD, AND DECODER AND ENCODER |
CN202311874021.1A CN117710587A (zh) | 2020-12-22 | 2020-12-22 | 点云的解码方法、编码方法、解码器以及编码器 |
US18/335,714 US20230326090A1 (en) | 2020-12-22 | 2023-06-15 | Point cloud decoding method, point cloud encoding method, and decoder |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/138435 WO2022133755A1 (zh) | 2020-12-22 | 2020-12-22 | 点云的解码方法、编码方法、解码器以及编码器 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/335,714 Continuation US20230326090A1 (en) | 2020-12-22 | 2023-06-15 | Point cloud decoding method, point cloud encoding method, and decoder |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022133755A1 true WO2022133755A1 (zh) | 2022-06-30 |
Family
ID=82157037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/138435 WO2022133755A1 (zh) | 2020-12-22 | 2020-12-22 | 点云的解码方法、编码方法、解码器以及编码器 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230326090A1 (zh) |
EP (1) | EP4270949A4 (zh) |
JP (1) | JP2024505796A (zh) |
KR (1) | KR20230124018A (zh) |
CN (2) | CN117710587A (zh) |
WO (1) | WO2022133755A1 (zh) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190311499A1 (en) * | 2018-04-10 | 2019-10-10 | Apple Inc. | Adaptive distance based point cloud compression |
CN110708560A (zh) * | 2018-07-10 | 2020-01-17 | 腾讯美国有限责任公司 | 点云数据处理方法和装置 |
CN110996098A (zh) * | 2018-10-02 | 2020-04-10 | 腾讯美国有限责任公司 | 处理点云数据的方法和装置 |
CN111095929A (zh) * | 2017-09-14 | 2020-05-01 | 苹果公司 | 点云压缩 |
CN111145090A (zh) * | 2019-11-29 | 2020-05-12 | 鹏城实验室 | 一种点云属性编码方法、解码方法、编码设备及解码设备 |
CN111242997A (zh) * | 2020-01-13 | 2020-06-05 | 北京大学深圳研究生院 | 一种基于滤波器的点云属性预测方法及设备 |
WO2020146539A1 (en) * | 2019-01-08 | 2020-07-16 | Apple Inc. | Point cloud compression using a space filling curve for level of detail generation |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10909725B2 (en) * | 2017-09-18 | 2021-02-02 | Apple Inc. | Point cloud compression |
EP3866115B1 (en) * | 2018-10-09 | 2024-03-27 | Panasonic Intellectual Property Corporation of America | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
CN110418135B (zh) * | 2019-08-05 | 2022-05-27 | 北京大学深圳研究生院 | 一种基于邻居的权重优化的点云帧内预测方法及设备 |
2020
- 2020-12-22 CN CN202311874021.1A patent/CN117710587A/zh active Pending
- 2020-12-22 CN CN202080107853.9A patent/CN116601947A/zh active Pending
- 2020-12-22 WO PCT/CN2020/138435 patent/WO2022133755A1/zh active Application Filing
- 2020-12-22 KR KR1020237024384A patent/KR20230124018A/ko unknown
- 2020-12-22 JP JP2023537455A patent/JP2024505796A/ja active Pending
- 2020-12-22 EP EP20966335.0A patent/EP4270949A4/en active Pending
2023
- 2023-06-15 US US18/335,714 patent/US20230326090A1/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111095929A (zh) * | 2017-09-14 | 2020-05-01 | 苹果公司 | 点云压缩 |
US20190311499A1 (en) * | 2018-04-10 | 2019-10-10 | Apple Inc. | Adaptive distance based point cloud compression |
CN110708560A (zh) * | 2018-07-10 | 2020-01-17 | 腾讯美国有限责任公司 | 点云数据处理方法和装置 |
CN110996098A (zh) * | 2018-10-02 | 2020-04-10 | 腾讯美国有限责任公司 | 处理点云数据的方法和装置 |
WO2020146539A1 (en) * | 2019-01-08 | 2020-07-16 | Apple Inc. | Point cloud compression using a space filling curve for level of detail generation |
CN111145090A (zh) * | 2019-11-29 | 2020-05-12 | 鹏城实验室 | 一种点云属性编码方法、解码方法、编码设备及解码设备 |
CN111242997A (zh) * | 2020-01-13 | 2020-06-05 | 北京大学深圳研究生院 | 一种基于滤波器的点云属性预测方法及设备 |
Non-Patent Citations (1)
Title |
---|
See also references of EP4270949A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP4270949A4 (en) | 2024-03-06 |
CN117710587A (zh) | 2024-03-15 |
CN116601947A (zh) | 2023-08-15 |
KR20230124018A (ko) | 2023-08-24 |
EP4270949A1 (en) | 2023-11-01 |
JP2024505796A (ja) | 2024-02-08 |
US20230326090A1 (en) | 2023-10-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114788264B (zh) | 用于发出虚拟边界和环绕运动补偿的信号的方法 | |
US20230342985A1 (en) | Point cloud encoding and decoding method and point cloud decoder | |
EP4258671A1 (en) | Point cloud attribute predicting method, encoder, decoder, and storage medium | |
WO2022067775A1 (zh) | 点云的编码、解码方法、编码器、解码器以及编解码系统 | |
CN114902670B (zh) | 用信号通知子图像划分信息的方法和装置 | |
US20230237704A1 (en) | Point cloud decoding and encoding method, and decoder, encoder and encoding and decoding system | |
WO2022133755A1 (zh) | 点云的解码方法、编码方法、解码器以及编码器 | |
WO2022133752A1 (zh) | 点云的编码方法、解码方法、编码器以及解码器 | |
WO2023159428A1 (zh) | 编码方法、编码器以及存储介质 | |
WO2022217472A1 (zh) | 点云编解码方法、编码器、解码器及计算机可读存储介质 | |
WO2023240455A1 (zh) | 点云编码方法、编码装置、编码设备以及存储介质 | |
WO2023097694A1 (zh) | 解码方法、编码方法、解码器以及编码器 | |
WO2023197337A1 (zh) | 索引确定方法、装置、解码器以及编码器 | |
WO2023023918A1 (zh) | 解码方法、编码方法、解码器以及编码器 | |
WO2022257155A1 (zh) | 解码方法、编码方法、解码器、编码器以及编解码设备 | |
US20230290012A1 (en) | Point cloud encoding method and system, point cloud decoding method and system, point cloud encoder, and point cloud decoder | |
WO2023197338A1 (zh) | 索引确定方法、装置、解码器以及编码器 | |
WO2023240660A1 (zh) | 解码方法、编码方法、解码器以及编码器 | |
US20230051431A1 (en) | Method and apparatus for selecting neighbor point in point cloud, encoder, and decoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20966335 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202080107853.9 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2023537455 Country of ref document: JP |
|
ENP | Entry into the national phase |
Ref document number: 20237024384 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2020966335 Country of ref document: EP Effective date: 20230724 |