WO2022246724A1 - Point cloud decoding and upsampling and model training methods and apparatus - Google Patents



Publication number
WO2022246724A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2021/096287
Other languages
French (fr)
Chinese (zh)
Inventor
元辉
刘昊
王婷婷
李明
Original Assignee
Oppo广东移动通信有限公司
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to CN202180096083.7A priority Critical patent/CN117242493A/en
Priority to PCT/CN2021/096287 priority patent/WO2022246724A1/en
Publication of WO2022246724A1 publication Critical patent/WO2022246724A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • the present application relates to the field of point cloud technology, and in particular to a point cloud decoding, upsampling and model training method and device.
  • the surface of an object is sampled by a collection device to form point cloud data, which may include hundreds of thousands of points or more.
  • point cloud data is transmitted between the point cloud encoding device and the point cloud decoding device in the form of point cloud media files.
  • point cloud encoding equipment needs to compress the point cloud data before transmission.
  • the point cloud decoding end decodes the point cloud code stream to obtain the reconstructed point cloud.
  • a lot of post-processing is required to improve the accuracy of point clouds, so as to improve driving safety.
  • current point cloud upsampling methods have poor point cloud upsampling effects and low accuracy.
  • the embodiment of the present application provides a point cloud decoding, upsampling and model training method and device, so as to improve the accuracy of point cloud upsampling.
  • the embodiment of the present application provides a point cloud decoding method, including:
  • the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module, where the feature extraction module is used to extract first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into geometric space, so as to obtain the upsampled geometric information of the point cloud block.
  • the present application provides a point cloud upsampling method, including:
  • the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module, where the feature extraction module is used to extract first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into geometric space, so as to obtain the upsampled geometric information of the point cloud block.
  • the present application provides a model training method, including:
  • the geometric information of the training point cloud block is input into the feature extraction module of the generator for feature extraction, to obtain first feature information of the training point cloud block;
  • the feature extraction module, feature upsampling module and geometry generation module in the generator are trained to obtain the trained generator.
  • a point cloud decoder configured to execute the method in the above first aspect or various implementations thereof.
  • the point cloud decoder includes a functional unit for executing the method in the above first aspect or its implementations.
  • a point cloud decoder including a processor and a memory.
  • the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the above first aspect or its various implementations.
  • a device for upsampling a point cloud configured to execute the method in the above second aspect or its various implementations.
  • the point cloud upsampling device includes a functional unit for executing the method in the above second aspect or its various implementations.
  • a point cloud upsampling device including a processor and a memory.
  • the memory is used to store a computer program
  • the processor is used to invoke and run the computer program stored in the memory, so as to execute the method in the above second aspect or its various implementations.
  • a model training device configured to execute the method in the above third aspect or various implementations thereof.
  • the model training device includes a functional unit for executing the method in the above third aspect or its various implementations.
  • a model training device including a processor and a memory.
  • the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to execute the method in the above third aspect or its various implementations.
  • a chip configured to implement any one of the foregoing first to third aspects or the method in each implementation manner thereof.
  • the chip includes: a processor, configured to call and run a computer program from the memory, so that the device installed with the chip executes the method in any one of the above-mentioned first to third aspects or the implementations thereof.
  • a computer-readable storage medium for storing a computer program, and the computer program causes a computer to execute any one of the above-mentioned first to third aspects or the method in each implementation manner thereof.
  • a twelfth aspect provides a computer program product, including computer program instructions, the computer program instructions cause a computer to execute any one of the above first to third aspects or the method in each implementation manner.
  • a thirteenth aspect provides a computer program, which, when running on a computer, causes the computer to execute any one of the above first to third aspects or the method in each implementation manner.
  • the point cloud is divided into at least one point cloud block through the geometric information of the point cloud; the geometric information of the point cloud block is input into the generator for upsampling to obtain the upsampled geometric information of the point cloud block; the generator includes a feature extraction module, a feature upsampling module and a geometry generation module, where the feature extraction module is used to extract first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into geometric space, so as to obtain the upsampled geometric information of the point cloud block.
  • the generator in the embodiment of the present application is a generator based on deep learning, through which more feature information of the point cloud can be learned. When this generator is used for upsampling of the point cloud, a high-precision point cloud can be generated, and the features of the high-precision point cloud are close to the true values of the point cloud, thereby improving the accuracy of point cloud upsampling.
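The three-module data flow described above (feature extraction, feature upsampling by a rate r, and mapping features back to geometric space) can be illustrated with a toy sketch. This is not the learned network of the patent; every function body here is an illustrative placeholder that only shows the shapes flowing between the modules.

```python
def extract_features(block):
    """Stand-in feature extraction: coordinates centered on the block mean."""
    n = len(block)
    cx = sum(p[0] for p in block) / n
    cy = sum(p[1] for p in block) / n
    cz = sum(p[2] for p in block) / n
    return [(x - cx, y - cy, z - cz) for (x, y, z) in block]

def upsample_features(features, r):
    """Stand-in feature upsampling: N features -> r*N features (replication)."""
    return [f for f in features for _ in range(r)]

def generate_geometry(features, centroid):
    """Stand-in geometry generation: map features back into geometric space."""
    cx, cy, cz = centroid
    return [(fx + cx, fy + cy, fz + cz) for (fx, fy, fz) in features]

def upsample_block(block, r):
    n = len(block)
    centroid = (sum(p[0] for p in block) / n,
                sum(p[1] for p in block) / n,
                sum(p[2] for p in block) / n)
    f1 = extract_features(block)    # first feature information
    f2 = upsample_features(f1, r)   # second feature information
    return generate_geometry(f2, centroid)

block = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
up = upsample_block(block, 4)
print(len(up))  # 8 points: upsampling rate 4 applied to a 2-point block
```

In the actual scheme each of these stages is a trained deep network, so the upsampled points carry learned detail rather than plain copies.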
  • FIG. 1 is a schematic block diagram of a point cloud encoding and decoding system involved in an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a point cloud encoder provided by an embodiment of the present application.
  • FIG. 3 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
  • FIG. 4 is a schematic flow chart of a model training method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a network of a generator according to an embodiment of the present application.
  • FIG. 6A is a schematic structural diagram of a feature extraction module involved in an embodiment of the present application.
  • FIG. 6B is a schematic structural diagram of a feature extraction block involved in an embodiment of the present application.
  • FIG. 6C is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application.
  • FIG. 6D is a schematic structural diagram of a residual block involved in the embodiment of the present application.
  • FIG. 6E is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application.
  • FIG. 6F is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application.
  • FIG. 6G is a schematic structural diagram of a gating unit involved in an embodiment of the present application.
  • FIG. 7A is a schematic structural diagram of a feature upsampling module involved in an embodiment of the present application.
  • FIG. 7B is another schematic structural diagram of the feature upsampling module involved in the embodiment of the present application.
  • FIG. 7C is a schematic structural diagram of the feature extraction submodule involved in the embodiment of the present application.
  • FIG. 7D is a schematic diagram of a specific network structure of the feature upsampling module provided by the embodiment of the present application.
  • FIG. 8 is a schematic diagram of a specific network structure of the geometry generation module provided by the embodiment of the present application.
  • FIG. 9 is a schematic diagram of a training process involving a generator according to an embodiment of the present application.
  • FIG. 10 is another schematic diagram of the training process involving the generator according to the embodiment of the present application.
  • FIG. 11 is a schematic diagram of a network structure of a discriminator.
  • FIG. 12 is a schematic flowchart of a model training method provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a specific network structure of the discriminator provided in the embodiment of the present application.
  • FIG. 14 is a schematic flow diagram of a point cloud upsampling method provided in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a network structure of a generator involved in an embodiment of the present application.
  • FIG. 16 is a schematic flow diagram of a point cloud decoding method provided in an embodiment of the present application.
  • FIG. 17 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
  • FIG. 18 is a schematic block diagram of a point cloud upsampling device provided by an embodiment of the present application.
  • FIG. 19 is a schematic block diagram of a model training device provided by an embodiment of the present application.
  • FIG. 20 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the present application can be applied to the technical field of point cloud upsampling, for example, can be applied to the technical field of point cloud compression.
  • Point cloud refers to a set of discrete point sets randomly distributed in space, expressing the spatial structure and surface properties of 3D objects or 3D scenes.
  • Point cloud data is a specific record form of point cloud, and the points in the point cloud can include point location information and point attribute information.
  • the point position information may be three-dimensional coordinate information of the point.
  • the location information of a point may also be referred to as geometric information of a point.
  • the attribute information of a point may include color information and/or reflectivity and the like.
  • the color information may be information on any color space.
  • the color information may be red, green and blue (RGB) information.
  • the color information may be luminance and chrominance (YCbCr, YUV) information, where Y represents luminance (Luma), Cb (U) represents the blue color difference, Cr (V) represents the red color difference, and U and V denote chrominance (Chroma), used to describe color difference information.
  • the points in the point cloud may include the three-dimensional coordinate information of the point and the laser reflection intensity (reflectance) of the point.
  • the points in the point cloud may include three-dimensional coordinate information and color information of the point.
  • the points in the point cloud may include the three-dimensional coordinate information of the point, the laser reflection intensity (reflectance) of the point, and the color information of the point.
  • Ways to obtain point cloud data may include, but are not limited to, at least one of the following: (1) generation by computer equipment.
  • the computer device can generate point cloud data according to virtual three-dimensional objects and virtual three-dimensional scenes.
  • Point cloud data of static real-world 3D objects or 3D scenes can be obtained through 3D laser scanning, with millions of points acquired per second.
  • 3D photography equipment, that is, a group of cameras or camera equipment with multiple lenses and sensors, can obtain point cloud data of dynamic real-world 3D objects or 3D scenes.
  • point cloud data of biological tissues and organs can be obtained through medical equipment such as magnetic resonance imaging (MRI), computed tomography (CT), and electromagnetic positioning.
  • Point clouds can be divided into dense point clouds and sparse point clouds according to the way of acquisition.
  • according to whether the object and the acquisition device are in motion, point clouds are divided into:
  • the first type, static point clouds: the object is stationary, and the device acquiring the point cloud is also stationary;
  • the second type, dynamic point clouds: the object is moving, but the device acquiring the point cloud is stationary;
  • the third type, dynamically acquired point clouds: the device acquiring the point cloud is in motion.
  • according to the purpose of the point cloud, it can be divided into two categories:
  • Category 1: machine-perceived point clouds, which can be used in scenarios such as autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, and emergency rescue robots;
  • Category 2: human-eye-perceived point clouds, which can be used in point cloud application scenarios such as digital cultural heritage, free-viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
  • the point cloud upsampling method provided by the embodiment of the present application can be applied to the point cloud encoding and decoding framework, for example, the geometric information of the point cloud parsed from the code stream by the point cloud decoder is upsampled to obtain Upsampled point clouds with higher accuracy.
  • FIG. 1 is a schematic block diagram of a point cloud encoding and decoding system involved in an embodiment of the present application. It should be noted that FIG. 1 is just an example, and the point cloud encoding and decoding system in the embodiment of the present application includes but is not limited to what is shown in FIG. 1 .
  • the point cloud encoding and decoding system 100 includes an encoding device 110 and a decoding device 120 .
  • the encoding device is used to encode the point cloud data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream generated by the encoding device to obtain decoded point cloud data.
  • the encoding device 110 in the embodiment of the present application can be understood as a device having a point cloud encoding function
  • the decoding device 120 can be understood as a device having a point cloud decoding function.
  • Such devices include, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, point cloud gaming consoles, and vehicle-mounted computers.
  • the encoding device 110 can transmit the encoded point cloud data (eg code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded point cloud data from encoding device 110 to decoding device 120 .
  • channel 130 includes one or more communication media that enable encoding device 110 to transmit encoded point cloud data directly to decoding device 120 in real-time.
  • the encoding device 110 may modulate the encoded point cloud data according to the communication standard, and transmit the modulated point cloud data to the decoding device 120 .
  • the communication medium includes a wireless communication medium, such as a radio frequency spectrum.
  • the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
  • the channel 130 includes a storage medium, which can store the point cloud data encoded by the encoding device 110 .
  • the storage medium includes a variety of local access data storage media, such as optical discs, DVDs, flash memory, and the like.
  • the decoding device 120 can acquire encoded point cloud data from the storage medium.
  • the channel 130 may include a storage server, and the storage server may store the point cloud data encoded by the encoding device 110 .
  • the decoding device 120 may download the stored encoded point cloud data from the storage server.
  • the storage server can store the encoded point cloud data and transmit it to the decoding device 120; examples include a web server (for example, for a website), a file transfer protocol (FTP) server, and the like.
  • the encoding device 110 includes a point cloud encoder 112 and an output interface 113 .
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • the encoding device 110 may include a point cloud source 111 in addition to the point cloud encoder 112 and the output interface 113 .
  • the point cloud source 111 may include at least one of a point cloud acquisition device (for example, a scanner), a point cloud archive, a point cloud input interface, and a computer graphics system, wherein the point cloud input interface is used to receive from a point cloud content provider Point cloud data, computer graphics system is used to generate point cloud data.
  • the point cloud encoder 112 encodes the point cloud data from the point cloud source 111 to generate a code stream.
  • the point cloud encoder 112 directly transmits the encoded point cloud data to the decoding device 120 via the output interface 113 .
  • the encoded point cloud data can also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
  • the decoding device 120 includes an input interface 121 and a point cloud decoder 122 .
  • the decoding device 120 may further include a display device 123 in addition to the input interface 121 and the point cloud decoder 122 .
  • the input interface 121 includes a receiver and/or a modem.
  • the input interface 121 can receive the encoded point cloud data through the channel 130 .
  • the point cloud decoder 122 is used to decode the encoded point cloud data to obtain decoded point cloud data, and transmit the decoded point cloud data to the display device 123 .
  • the display device 123 displays the decoded point cloud data.
  • the display device 123 may be integrated with the decoding device 120 or external to the decoding device 120 .
  • the display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
  • FIG. 1 is only an example, and the technical solution of the embodiment of the present application is not limited to FIG. 1 .
  • the technology of the present application can also be applied to one-sided point cloud encoding or one-sided point cloud decoding.
  • the current point cloud encoder can use the Geometry-based Point Cloud Compression (G-PCC) codec framework or the Video-based Point Cloud Compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by the Audio Video coding Standard workgroup (AVS). Both G-PCC and AVS-PCC are aimed at static sparse point clouds, and their coding frameworks are roughly the same.
  • the G-PCC codec framework can be used to compress the first type of static point clouds and the third type of dynamically acquired point clouds, and the V-PCC codec framework can be used to compress the second type of dynamic point clouds.
  • the G-PCC codec framework is also called point cloud codec TMC13, and the V-PCC codec framework is also called point cloud codec TMC2.
  • Fig. 2 is a schematic block diagram of a point cloud encoder provided by an embodiment of the present application.
  • the points in the point cloud can include the position information of the point and the attribute information of the point, therefore, the encoding of the point in the point cloud mainly includes the position encoding and the attribute encoding.
  • the position information of the points in the point cloud is also called geometric information, and the corresponding position codes of the points in the point cloud may also be called geometric codes.
  • the process of position encoding includes: preprocessing the points in the point cloud, such as coordinate transformation, quantization and removal of duplicate points; then geometrically encoding the preprocessed point cloud, for example by constructing an octree and performing geometric encoding based on the constructed octree, to form a geometric code stream. At the same time, based on the position information output by the constructed octree, the position information of each point in the point cloud data is reconstructed to obtain the reconstructed value of the position information of each point.
  • the attribute encoding process includes: given the reconstruction information of the position information of the input point cloud and the original values of the attribute information, selecting one of three prediction modes for point cloud prediction, quantizing the predicted results, and performing arithmetic coding to form an attribute code stream.
  • position coding can be achieved by the following units:
  • Coordinate transformation (Transform coordinates) unit 201, quantization and removal of duplicate points (Quantize and remove points) unit 202, octree analysis (Analyze octree) unit 203, geometric reconstruction (Reconstruct geometry) unit 204 and first arithmetic encoding (Arithmetic encode) unit 205.
  • the coordinate transformation unit 201 can be used to transform the world coordinates of points in the point cloud into relative coordinates. For example, subtracting the minimum value of each of the x, y and z coordinate axes from the geometric coordinates of each point is equivalent to a DC-removal operation, converting the coordinates of the points in the point cloud from world coordinates to relative coordinates.
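The per-axis minimum subtraction described for unit 201 can be sketched in a few lines. This is a minimal illustration of the described operation, not an implementation of the codec unit itself:

```python
def world_to_relative(points):
    """Shift each axis so the minimum coordinate becomes zero (the DC-removal
    operation described above), yielding non-negative relative coordinates."""
    mins = [min(p[i] for p in points) for i in range(3)]
    return [tuple(p[i] - mins[i] for i in range(3)) for p in points]

points = [(10, 25, -3), (12, 20, 0), (11, 22, -1)]
rel = world_to_relative(points)
print(rel)  # [(0, 5, 0), (2, 0, 3), (1, 2, 2)]
```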
  • the quantization and removal of duplicate points unit 202 can reduce the number of coordinates by quantization; originally distinct points may be given the same coordinates after quantization, and on this basis duplicate points can be deleted through a de-duplication operation; for example, multiple points with the same quantized position but different attribute information can be merged into one point through attribute conversion.
  • the Quantize and Remove Duplicate Points unit 202 is an optional unit module.
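The quantize-then-merge behavior of unit 202 can be illustrated as follows. Averaging the attributes of merged points is just one possible "attribute conversion"; the actual merge rule used by a codec may differ:

```python
def quantize_and_dedupe(points, attrs, step):
    """Quantize coordinates by a step size, then merge points that land on
    the same quantized position, averaging their attributes (illustrative)."""
    merged = {}  # quantized position -> list of attribute values
    for p, a in zip(points, attrs):
        q = tuple(round(c / step) for c in p)
        merged.setdefault(q, []).append(a)
    positions = sorted(merged)
    attributes = [sum(merged[q]) / len(merged[q]) for q in positions]
    return positions, attributes

pts = [(0.9, 0.0, 0.0), (1.1, 0.0, 0.0), (3.0, 0.0, 0.0)]
pos, att = quantize_and_dedupe(pts, [10.0, 20.0, 30.0], step=1.0)
print(pos)  # [(1, 0, 0), (3, 0, 0)] -> the first two points merged
print(att)  # [15.0, 30.0]
```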
  • the octree analysis unit 203 may use an octree encoding method to encode the position information of the quantized points.
  • the point cloud is divided in the form of an octree, so that the positions of points correspond one-to-one to positions in the octree; the occupied positions in the octree are counted, and their flags are recorded as 1 for geometric encoding.
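One subdivision step of the octree partition above can be sketched like this: each point in a cubic node falls into one of eight children, and the node's occupancy flags record which children contain points (the flags "recorded as 1"). The child-index layout here is illustrative:

```python
def occupancy(points, origin, size):
    """Return the 8 occupancy flags of one octree node covering a cube of
    the given origin and edge length (1 = child contains at least one point)."""
    half = size / 2
    flags = [0] * 8
    for (x, y, z) in points:
        ix = 1 if x >= origin[0] + half else 0
        iy = 1 if y >= origin[1] + half else 0
        iz = 1 if z >= origin[2] + half else 0
        flags[(ix << 2) | (iy << 1) | iz] = 1  # pack the 3 axis bits
    return flags

pts = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)]
flags = occupancy(pts, (0.0, 0.0, 0.0), 1.0)
print(flags)  # [1, 0, 0, 0, 0, 0, 0, 1] -> only children 0 and 7 occupied
```

Recursing on each occupied child down to the target precision yields the one-to-one correspondence between point positions and octree positions.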
  • the geometry reconstruction unit 204 may perform position reconstruction based on the position information output by the octree analysis unit 203 to obtain reconstruction values of the position information of each point in the point cloud data.
  • the first arithmetic encoding unit 205 can entropy-encode the position information output by the octree analysis unit 203, that is, arithmetically encode the position information output by the octree analysis unit 203 to generate a geometric code stream; the geometric code stream may also be called a geometry bitstream.
  • Attribute coding can be achieved by the following units:
  • Color space conversion (Transform colors) unit 210, attribute conversion (Transfer attributes) unit 211, region adaptive hierarchical transform (Region Adaptive Hierarchical Transform, RAHT) unit 212, predicting transform unit 213, lifting transform unit 214, quantization coefficients (Quantize coefficients) unit 215, and second arithmetic encoding unit 216.
  • point cloud encoder 200 may include more, less or different functional components than those shown in FIG. 2 .
  • the color space conversion unit 210 can be used to convert the RGB color space of points in the point cloud into YCbCr format or other formats.
  • the attribute conversion unit 211 can be used to convert attribute information of points in the point cloud to minimize attribute distortion.
  • the attribute conversion unit 211 can be used to obtain the original value of the attribute information of the point.
  • the attribute information may be color information of dots.
  • any prediction unit can be selected to predict the point in the point cloud.
  • the prediction unit may include: RAHT 212, predicting transform unit 213, and lifting transform unit 214.
  • any one of the RAHT 212, the predicting transform unit 213 and the lifting transform unit 214 can be used to predict the attribute information of the points in the point cloud, so as to obtain the predicted values of the attribute information of the points, Furthermore, the residual value of the attribute information of the point is obtained based on the predicted value of the attribute information of the point.
  • the residual value of the point's attribute information may be the original value of the point's attribute information minus the predicted value of the point's attribute information.
  • the predictive transformation unit 213 may also be used to generate a level of detail (LOD).
  • LOD level of detail
  • the generation process of LOD includes: obtaining the Euclidean distances between points according to the position information of the points in the point cloud, and dividing the points into different detail expression layers according to the Euclidean distances.
  • Euclidean distances in different ranges can be assigned to different detail expression layers. For example, a point can be randomly selected as the first detail expression layer; then the Euclidean distances between the remaining points and this point are calculated, and the points whose Euclidean distance meets a first threshold requirement are classified into the second detail expression layer.
  • the point cloud can be directly divided into one or more detail expression layers, or the point cloud can be divided into multiple point cloud slices first, and then each point cloud slice can be divided into one or more LOD layers.
  • the point cloud can be divided into multiple point cloud slices, and the number of points in each point cloud slice can be between 550,000 and 1.1 million.
  • Each point cloud slice can be regarded as a separate point cloud.
  • Each point cloud slice can be divided into multiple detail expression layers, and each detail expression layer includes multiple points.
  • the detail expression layer can be divided according to the Euclidean distance between points.
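The distance-threshold rule described above can be sketched as follows. Real LOD construction in G-PCC is iterative and more involved; this toy version only illustrates assigning points to layers by their Euclidean distance to a seed point against a list of thresholds (the seed, thresholds and layer layout are all assumptions for illustration):

```python
import math

def lod_partition(points, thresholds):
    """Seed the first detail layer with one point, then bucket the remaining
    points into layers by Euclidean distance to the seed."""
    seed = points[0]
    layers = [[seed]] + [[] for _ in thresholds]
    for p in points[1:]:
        d = math.dist(p, seed)
        for i, t in enumerate(thresholds):
            if d <= t:
                layers[i + 1].append(p)
                break
        else:
            layers[-1].append(p)  # farther than every threshold: last layer
    return layers

pts = [(0, 0, 0), (1, 0, 0), (5, 0, 0)]
layers = lod_partition(pts, [2.0, 10.0])
print([len(l) for l in layers])  # [1, 1, 1]
```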
  • the quantization unit 215 may be used to quantize the residual values of the attribute information of points. For example, if the quantization unit 215 is connected to the predicting transform unit 213, it may quantize the residual values of the point attribute information output by the predicting transform unit 213.
  • the residual value of the point attribute information output by the predictive transformation unit 213 is quantized using the quantization step size, so as to improve system performance.
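The residual-plus-quantization-step scheme above can be sketched numerically. The values are illustrative; note that Python's `round` uses banker's rounding, whereas a real codec specifies its own rounding rule:

```python
def quantize(residual, qstep):
    """Quantize an attribute residual with the quantization step size."""
    return round(residual / qstep)

def dequantize(level, qstep):
    """Inverse operation performed at the decoder."""
    return level * qstep

original, predicted, qstep = 130, 120, 4
residual = original - predicted           # 10: original minus predicted value
level = quantize(residual, qstep)         # round(10 / 4) = 2 (banker's rounding)
recon = predicted + dequantize(level, qstep)
print(level, recon)  # 2 128 -> lossy: a quantization error of 2 remains
```

A larger quantization step gives a smaller code stream at the cost of a larger reconstruction error, which is the system-performance trade-off the step size controls.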
  • the second arithmetic coding unit 216 may use zero run length coding to perform entropy coding on the residual value of the attribute information of the point to obtain an attribute code stream.
  • the attribute code stream may be bit stream information.
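Zero run length coding as used on the quantized residuals can be sketched as follows: runs of zeros are replaced by a count (zero_cnt) that the decoder parses to restore them. The symbol layout here is an assumption for illustration, not the exact syntax of the code stream:

```python
def zrl_encode(values):
    """Replace runs of zeros with ('zero_cnt', n) symbols; emit nonzero
    residuals as ('value', v) symbols."""
    out, zeros = [], 0
    for v in values:
        if v == 0:
            zeros += 1
        else:
            out.append(('zero_cnt', zeros))
            out.append(('value', v))
            zeros = 0
    out.append(('zero_cnt', zeros))  # trailing run of zeros
    return out

def zrl_decode(symbols):
    """Inverse: expand each zero_cnt back into that many zero residuals."""
    out = []
    for kind, v in symbols:
        if kind == 'zero_cnt':
            out.extend([0] * v)
        else:
            out.append(v)
    return out

res = [0, 0, 3, 0, -1, 0, 0, 0]
enc = zrl_encode(res)
assert zrl_decode(enc) == res  # lossless round trip
print(enc)
```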
  • Fig. 3 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
  • the decoder 300 can obtain the point cloud code stream from the encoding device, and obtain the position information and attribute information of the points in the point cloud by parsing the code stream.
  • the decoding of point cloud includes position decoding and attribute decoding.
  • the process of position decoding includes: performing arithmetic decoding on the geometric code stream; merging after constructing the octree, and reconstructing the position information of the points to obtain reconstruction information of the position information of the points; and then performing inverse coordinate transformation on the reconstruction information to obtain the position information of the points.
  • the location information of a point may also be referred to as geometric information of a point.
• the attribute decoding process includes: obtaining the residual values of the attribute information of the point cloud by parsing the attribute code stream; dequantizing the residual values of the attribute information of the points to obtain the dequantized residual values of the attribute information of the points; based on the reconstruction information of the position information of the points obtained in the position decoding process, selecting one of the following three prediction modes: RAHT, predicting transform and lifting transform, to predict the point cloud and obtain predicted values; adding the predicted values to the residual values to obtain reconstruction values of the attribute information of the points; and performing inverse color space transformation on the reconstruction values of the attribute information of the points to obtain the decoded point cloud.
  • position decoding can be achieved by the following units:
• a first arithmetic decoding unit 301, an octree analysis unit 302, a geometry reconstruction unit 304 and an inverse transform coordinates unit 305.
• Attribute decoding can be achieved by the following units:
  • decompression is an inverse process of compression
  • the functions of each unit in the decoder 300 may refer to the functions of corresponding units in the encoder 200 .
  • the point cloud decoder 300 may include more, fewer or different functional components than in FIG. 3 .
• the decoder 300 can divide the point cloud into multiple LODs according to the Euclidean distance between points in the point cloud; then decode the attribute information of the points in each LOD in sequence, for example by decoding the number of zeros (zero_cnt) in zero run length coding and decoding the residual based on zero_cnt; then the decoder 300 can perform dequantization on the decoded residual value, and add the dequantized residual value to the predicted value of the current point to obtain the reconstruction value of the point, until the whole point cloud is decoded.
• the current point will be used as a nearest neighbor point for points in subsequent LODs, and the attribute information of subsequent points will be predicted by using the reconstructed value of the current point.
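As a rough illustration of the zero run length step above, here is a minimal sketch (not the codec's actual entropy-decoding code) that expands (zero_cnt, value) pairs back into a residual sequence; the function name and symbol layout are illustrative assumptions:

```python
def decode_zero_run_length(symbols):
    """Expand (zero_cnt, value) pairs into a flat residual sequence.

    Each pair contributes zero_cnt zero residuals followed by one
    nonzero residual value -- a simplified model of zero run length
    decoding driven by the decoded zero_cnt.
    """
    residuals = []
    for zero_cnt, value in symbols:
        residuals.extend([0] * zero_cnt)  # run of zero residuals
        residuals.append(value)           # the nonzero residual that ends the run
    return residuals
```

The decoded residuals would then be dequantized and added to the per-point predicted values, as described above.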
• the current point cloud encoding and decoding methods reconstruct the point cloud at its original scale, but some application scenarios require high-quality point clouds of higher precision than the original. For example, the sparse point clouds collected by radar in areas such as autonomous driving often require extensive post-processing to improve their precision and thereby improve driving safety.
  • the embodiment of the present application provides a point cloud upsampling method, which uses deep learning to upsample point cloud geometric information to obtain a higher resolution (or precision) point cloud, thereby meeting the task requirements for high-precision point clouds.
• the point cloud upsampling method provided in this application uses a generator trained by deep learning to upsample the geometric information of the point cloud; the generator may be a piece of software code or a chip with data processing functions. Based on this, the training process of the generator is introduced first.
  • Fig. 4 is a schematic flow chart of the model training method provided by an embodiment of the present application. As shown in Fig. 4, the training process of the generator includes:
  • the embodiment of the present application records the point cloud used for generator training as a training point cloud.
  • the above-mentioned training point cloud is a point cloud in the training set, which includes multiple point clouds, and the process of using each point cloud in the training set to train the generator is consistent.
• the embodiment of the present application takes one point cloud in the training set as an example.
• in the process of up-sampling the point cloud, the point cloud is divided into point cloud blocks, and the geometric information of the point cloud is up-sampled with the point cloud blocks as objects.
  • the ways of dividing the training point cloud into at least one training point cloud block in S402 include but are not limited to the following ways:
  • Method 1 Divide the training point cloud into at least one training point cloud block of equal size according to the geometric information of the training point cloud. That is to say, the geometric scale of each point cloud block is the same.
  • Method 2 Divide the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud, and each training point cloud block includes the same number of points.
  • Method 3 Obtain at least one seed point from the training point cloud according to the geometric information of the training point cloud, for example, randomly sample a specified number of seed points from the training point cloud by using Monte Carlo random sampling method. For each seed point, determine the neighboring points of the seed point, divide the seed point and the neighboring points of the seed point into a training point cloud block, and obtain at least one training point cloud block.
• the training point cloud blocks obtained in this way are also called point cloud patches (Patch), and each such training point cloud block includes the same number of points.
• the training point cloud blocks obtained above are recorded as matrices of size NX3, where N is the number of points included in the training point cloud block, and 3 is the geometric information dimension of the training point cloud block.
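Method 3 above can be sketched as follows; `extract_patches`, the seed count, and the patch size are illustrative names and values, and plain random index sampling plus a brute-force nearest-neighbour search stands in for the Monte Carlo sampling and neighbour search the text describes:

```python
import numpy as np

def extract_patches(points, num_seeds=24, patch_size=256, rng=None):
    """Split a point cloud (N, 3) into patches with equal point counts.

    Seed points are drawn at random; each patch consists of the seed
    plus its nearest neighbours by Euclidean distance.
    """
    rng = np.random.default_rng(rng)
    seeds = rng.choice(len(points), size=num_seeds, replace=False)
    patches = []
    for s in seeds:
        # Euclidean distance from the seed to every point in the cloud.
        d = np.linalg.norm(points - points[s], axis=1)
        nearest = np.argsort(d)[:patch_size]  # the seed itself is included (d = 0)
        patches.append(points[nearest])
    return np.stack(patches)                  # (num_seeds, patch_size, 3)
```

Each returned patch is an NX3 matrix as in the notation above, with N equal to `patch_size`.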
  • the network structure of the generator involved in the embodiment of the present application will be introduced below in conjunction with FIG. 5. It should be noted that the network structure of the generator in the embodiment of the present application includes but is not limited to the modules shown in FIG. more or less modules.
• Fig. 5 is a schematic network diagram of the generator of the embodiment of the present application. As shown in Fig. 5, the generator includes a feature extraction module, a feature upsampling module and a geometry generation module. The feature extraction module is used to extract the first feature information of the training point cloud block; the feature upsampling module is used to upsample the first feature information of the training point cloud block into the second feature information; and the geometry generation module is used to map the second feature information of the training point cloud block into the geometric space, so as to obtain the upsampled geometric information of the training point cloud block.
• the feature extraction module is used to extract an expressive feature for each point from the geometric information of the low-resolution training point cloud block, whose size is NX3.
• the feature extraction module outputs the feature information of the training point cloud block, whose size is NXC, where N is the number of points in the training point cloud block, 3 is the geometric information dimension, and C is the feature dimension.
  • the feature information output by the feature extraction module is recorded as the first feature information of the training point cloud block.
• the present application proposes a feature extraction module based on a Dynamic Graph Hierarchical Residual Aggregation (DGHRA) unit. As shown in FIG. 6A, the feature extraction module includes M densely connected feature extraction blocks (Feature Extraction Block, FEB for short); that is, the outputs of the preceding FEBs are used as the input of each subsequent FEB, as shown in Figure 6A.
  • the above S403 includes the following S403-A1 to S403-A4:
  • i is a positive integer smaller than M.
• the embodiment of the present application further includes: determining the initial feature information of the training point cloud block according to the geometric information of the training point cloud block; and inputting the initial feature information of the training point cloud block into the first feature extraction block to obtain the first third feature information of the training point cloud block extracted by the first feature extraction block.
  • the above S403-A2 includes: acquiring the third feature information extracted by each feature extraction block before the i-th feature extraction block among the M feature extraction blocks;
• the third feature information extracted by each feature extraction block located before the i-th feature extraction block is concatenated with the third feature information extracted by the i-th feature extraction block, and the result is used as the i-th fourth feature information of the training point cloud block.
  • the first third feature information extracted by the first feature extraction block in the M feature extraction units is used as the first fourth feature information of the training point cloud block .
• for example, the feature extraction module includes 4 feature extraction blocks (FEB), as shown in Figure 6A. First, the initial feature information of the training point cloud block is determined according to the geometric information of the training point cloud block; the initial feature information of the training point cloud block is input into the first feature extraction block to obtain the first third feature information of the training point cloud block extracted by the first feature extraction block. Since there is no other feature extraction block before the first feature extraction block, the first third feature information is input into the second FEB as the first fourth feature information, and the second FEB outputs the second third feature information based on the first fourth feature information.
• the first third feature information and the second third feature information are concatenated as the second fourth feature information and input into the third FEB, which outputs the third third feature information according to the second fourth feature information. The third third feature information, the second third feature information and the first third feature information are concatenated as the third fourth feature information, and the third fourth feature information is input into the fourth FEB; the fourth FEB outputs the fourth third feature information according to the third fourth feature information, and the fourth third feature information is used as the first feature information of the training point cloud block.
• a convolutional network may be set between FEBs, for example a convolutional network with a convolution kernel set to 1X1, to reduce the feature dimension input to each FEB.
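The dense connectivity between FEBs, with a 1X1-style reduction before each block, can be sketched as below; `feb` and `reduce_1x1` are random-weight placeholders for the learned layers, not the patent's actual network, and the sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, M = 128, 32, 4          # points, feature dim, number of FEBs

def feb(x, c_out=C):
    """Stand-in for one Feature Extraction Block: any map to (N, c_out)."""
    w = rng.normal(size=(x.shape[1], c_out))
    return np.maximum(x @ w, 0)            # linear + ReLU placeholder

def reduce_1x1(x, c_out=C):
    """Stand-in for the 1x1 convolution that shrinks the concatenated dim."""
    w = rng.normal(size=(x.shape[1], c_out))
    return x @ w

x0 = rng.normal(size=(N, 3))               # geometric info of the patch
outputs = [feb(x0)]                        # first FEB eats the initial features
for _ in range(M - 1):
    dense_in = np.concatenate(outputs, axis=1)  # outputs of all earlier FEBs
    outputs.append(feb(reduce_1x1(dense_in)))   # 1x1 reduction keeps dim at C
first_feature = outputs[-1]                # (N, C): first feature information
```

The key point is that each FEB after the first consumes the concatenation of all earlier FEB outputs, reduced back to C channels before entering the block.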
• each feature extraction block FEB includes a first feature extraction unit and at least one second feature extraction unit connected in series, wherein the first feature extraction unit is a Dynamic Graph Hierarchical Residual Aggregation (DGHRA) unit, and the second feature extraction unit is a Hierarchical Residual Aggregation (HRA) block, which is used to extract more detailed features.
• the processing process of each FEB is the same, and the FEBs operate iteratively one after another.
• since the processing process of each FEB is the same, for convenience of description the i+1th feature extraction block is taken as an example.
  • the above S403-A3 includes S403-A31 To S403-A32:
• the size of the i-th fourth feature information of the training point cloud block is NXC.
• the i-th fourth feature information of the training point cloud block, with a size of NXC, is input into the first feature extraction unit. For each point (the current point) in the training point cloud block, the first feature extraction unit searches for K neighboring points of the current point, for example, dynamically searching for the K neighboring points of the current point through a feature-space nearest neighbor search.
• the fourth feature information of the current point is obtained from the i-th fourth feature information of the training point cloud block and copied K times, and the fourth feature information of each of the K neighboring points is subtracted from the copied fourth feature information of the current point to obtain K pieces of residual feature information, with a size of 1XKXC.
• the K pieces of residual feature information of the current point are concatenated with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point.
• in this way, the i-th concatenated feature information of each point in the training point cloud block can be obtained, and then the i-th concatenated feature information of the training point cloud block is obtained, with a size of NXKX2C.
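The grouping step above (feature-space KNN, residual features, concatenation with the centre feature) can be sketched with NumPy as follows; the function name and the subtraction direction (centre minus neighbour) are assumptions, and a brute-force distance matrix stands in for the dynamic nearest-neighbour search:

```python
import numpy as np

def dynamic_graph_grouping(feat, k=8):
    """Group each point with its k feature-space neighbours.

    feat: (N, C). Returns (N, k, 2C): residual features (centre minus
    neighbour) concatenated with the centre point's own feature.
    """
    # Pairwise squared distances in feature space (brute force).
    d2 = ((feat[:, None, :] - feat[None, :, :]) ** 2).sum(-1)   # (N, N)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]                    # drop self
    neighbours = feat[knn]                                      # (N, k, C)
    centre = np.repeat(feat[:, None, :], k, axis=1)             # K copies of centre
    residual = centre - neighbours                              # (N, k, C)
    return np.concatenate([residual, centre], axis=-1)          # (N, k, 2C)
```

The output shape matches the NXKX2C size stated above for the i-th concatenated feature information.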
• for example, the i+1th feature extraction block includes 3 second feature extraction units. The i-th concatenated feature information of the training point cloud block, with a size of NXKX2C, is input into the first second feature extraction unit; the first second feature extraction unit extracts the detailed features of the training point cloud block and outputs the first fifth feature information of the training point cloud block, which is input into the second second feature extraction unit; the second second feature extraction unit outputs the second fifth feature information of the training point cloud block according to the first fifth feature information, which is input into the third second feature extraction unit; the third second feature extraction unit outputs the third fifth feature information of the training point cloud block according to the second fifth feature information, and the third fifth feature information is used as the i+1th third feature information of the training point cloud block.
  • the size of the i+1th third feature information is NXC.
  • the second feature extraction unit includes P residual blocks (Residual block, RB for short), and P is a positive integer.
  • the above S403-A32 includes:
  • the network structure of the residual block is as shown in Figure 6D
  • the residual block includes multiple linear layers with a linear rectification function (Relu)
• the residual block used in the embodiment of the present application is used for feature mining and helps the network converge.
• for example, the first second feature extraction unit includes 4 residual blocks (RB). The i-th concatenated feature information is input into the first RB, and the first RB outputs first residual information 1; the first residual information 1 and the i-th concatenated feature information are added and input into the second RB, which outputs first residual information 2; the first residual information 2 and the i-th concatenated feature information are added and input into the third RB to obtain first residual information 3 output by the third RB; the first residual information 3 and the i-th concatenated feature information are added and input into the fourth RB to obtain first residual information 4 output by the fourth RB. Then, the fifth feature information output by the first second feature extraction unit is determined according to the first residual information output by each RB and the i-th concatenated feature information.
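The chain of residual blocks with skip connections back to the block input can be sketched as below; `residual_block` uses random weights as a placeholder for the learned ReLU linear layers of Fig. 6D, and `hra_chain` keeps every RB's residual output for the later aggregation step:

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x):
    """Stand-in for one RB: two linear layers with a ReLU in between."""
    w1 = rng.normal(size=(x.shape[-1], x.shape[-1])) * 0.1
    w2 = rng.normal(size=(x.shape[-1], x.shape[-1])) * 0.1
    return np.maximum(x @ w1, 0) @ w2

def hra_chain(x, p=4):
    """Run P residual blocks; each RB sees (previous residual + block input).

    Returns all P residual outputs (kept for the aggregation step) and the
    running feature after the last skip addition.
    """
    residuals = []
    h = x
    for _ in range(p):
        r = residual_block(h)
        residuals.append(r)
        h = r + x            # skip connection back to the block input
    return residuals, h
```

Keeping the per-RB residuals matches the text: the final fifth feature information is later built from all of them, not just the last one.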
  • the ways of outputting the fifth feature information include but not limited to the following:
  • Way 1 add the first residual information output by the last residual block to the i-th concatenated feature information, and use it as the fifth feature information output by the first second feature extraction unit.
  • Method 2 the above S403-A323 includes:
• Step B1: concatenate the first residual information output by the last residual block among the P residual blocks in the first second feature extraction unit with the first residual information output by at least one residual block among the remaining P-1 residual blocks, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks.
• for example, the first residual information output by the last residual block in the P residual blocks is concatenated with the first residual information output by each residual block in the P-1 residual blocks to obtain the concatenated feature information.
  • Step B2 according to the concatenated feature information and the i-th concatenated feature information, determine the fifth feature information output by the first second feature extraction unit.
• the implementation methods of the above step B2 include but are not limited to the following:
  • Way 1 add the concatenated feature information to the i-th concatenated feature information, and use it as the fifth feature information output by the first second feature extraction unit.
• the second feature extraction unit also includes a gating unit; at this time, the above step B2 includes: inputting the concatenated feature information into the gating unit for de-redundancy to obtain de-redundant feature information; and adding the de-redundant feature information to the i-th concatenated feature information, using the result as the fifth feature information output by the first second feature extraction unit.
  • the residual block of the embodiment of the present application provides detailed residual information, and inputs the detailed residual information obtained by each residual block into the gating unit to collect more feature details and realize full learning of the network.
  • the embodiment of the present application does not limit the network structure of the above-mentioned gate control unit.
• the gating unit includes a Squeeze-and-Excitation network (SE-Net for short) and a linear layer, where the SE-Net includes a global average pooling layer and a fully connected (FC) layer. The SE-Net can perform global average pooling on feature information with a size of NXKXC to obtain feature information with a size of 1X1XC.
• the feature information with a size of 1X1XC is processed by the fully connected layer and then multiplied channel-wise with the feature information of size NXKXC to obtain de-redundant feature information of size NXKXC; finally, the de-redundant feature information of size NXKXC is input into the linear layer, which outputs the final de-redundant feature information.
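The SE-Net-style gating described above (global average pooling, a fully connected layer, a channel-wise multiply, then a linear layer) can be sketched as follows; the sigmoid activation and the random placeholder weights are assumptions, since the text does not fix these details:

```python
import numpy as np

rng = np.random.default_rng(0)

def se_gate(feat):
    """SE-Net-style gating over (N, K, C) features.

    Pipeline: global average pool to (1, 1, C) -> FC layer ->
    channel-wise multiply with the input -> final linear layer.
    Weights are random stand-ins for the learned parameters.
    """
    n, k, c = feat.shape
    squeeze = feat.mean(axis=(0, 1), keepdims=True)      # (1, 1, C) global pool
    w_fc = rng.normal(size=(c, c)) * 0.1
    excite = 1.0 / (1.0 + np.exp(-(squeeze @ w_fc)))     # sigmoid gate, (1, 1, C)
    gated = feat * excite                                # channel-wise multiply
    w_out = rng.normal(size=(c, c)) * 0.1
    return gated @ w_out                                 # final linear layer
```

The gate rescales each channel globally, which is how the unit suppresses redundant feature channels before the residual sum.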
  • the process of inputting the geometric information of the training point cloud block into the feature extraction module to obtain the first feature information of the training point cloud block is described in detail.
• after obtaining the first feature information of the training point cloud block, the following S404 is performed to up-sample the first feature information to obtain the second feature information of the training point cloud block.
• the feature upsampling module is used to upsample the first feature information of the training point cloud block, with a size of NXC, into the second feature information of the training point cloud block, with a size of rNXC′, where r is a preset sampling rate (a positive integer) and C′ is the feature dimension of the second feature information of the training point cloud block after upsampling.
  • the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule, wherein the feature upsampling submodule is used to upsample the first feature information of the training point cloud block , to obtain the upsampled feature information of the training point cloud block.
  • the feature extraction sub-module is used to perform feature extraction on the upsampled feature information of the training point cloud block to obtain expressive features of the training point cloud block, and use the expressive feature as the second feature information of the training point cloud block.
  • S404 includes S404-A1 and S404-A2:
• first, copy r copies of the first feature information F of the training point cloud block, and for each copied feature, append an n-dimensional vector to its feature dimension so that there is a clear difference between the copies; at this time, the feature dimension of each point is C+n.
• the feature information whose feature dimension is C+n is recorded as the upsampled feature information of the training point cloud block.
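The duplication-based upsampling of S404-A1 can be sketched as below; the exact n-dimensional vector appended to each copy is not specified in the text, so a simple per-copy code is used here as an illustrative choice:

```python
import numpy as np

def duplicate_upsample(feat, r, n_extra=2):
    """Copy (N, C) features r times and append a distinguishing code.

    Returns (r*N, C + n_extra) upsampled features: each of the r copies
    carries a distinct n_extra-dimensional vector so that duplicated
    points are not identical in feature space.
    """
    n, c = feat.shape
    copies = np.tile(feat, (r, 1))                       # (r*N, C)
    # One distinguishing code per copy, repeated over the N points.
    codes = np.repeat(np.linspace(-1, 1, r)[:, None], n_extra, axis=1)  # (r, n_extra)
    codes = np.repeat(codes, n, axis=0)                  # (r*N, n_extra)
    return np.concatenate([copies, codes], axis=1)
```

With n_extra=2 this reproduces the rNX(C+2) size used later in the description of Fig. 7D.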
  • the upsampled feature information of the training point cloud block is input into the feature extraction sub-module to perform detail feature extraction to obtain the second feature information of the training point cloud block.
  • the feature upsampling module further includes a first autocorrelation attention network.
• the above S404-A2 includes: inputting the upsampled feature information of the training point cloud block into the first autocorrelation attention network for feature interaction to obtain the upsampled feature information of the training point cloud block after feature interaction; and inputting the upsampled feature information of the training point cloud block after feature interaction into the feature extraction sub-module for feature extraction to obtain the second feature information of the training point cloud block.
  • the feature dimension of the upsampled feature information of the training point cloud block after the feature interaction is the same as the feature dimension of the upsampled feature information of the training point cloud block. That is, the first autocorrelation attention network is used for feature interaction, allowing the network to learn more detailed features.
• the first autocorrelation attention network in the embodiment of the present application can also reduce the feature dimension of the feature information, so that the feature dimension of the upsampled feature information of the training point cloud block after feature interaction is lower than that of the upsampled feature information of the training point cloud block before feature interaction.
  • the first autocorrelation attention network is not only used for feature interaction, but also used to reduce the feature dimension to reduce the training complexity of the network, thereby improving the training speed of the network.
  • the feature extraction submodule includes Q third feature extraction units connected in series, Q is a positive integer, wherein the feature extraction process of each third feature extraction unit is the same, at this time,
  • the above S404-A2 includes:
• for example, when Q is 3, the upsampling feature information of the training point cloud block is input into the first third feature extraction unit for feature extraction to obtain the first enhanced upsampling feature information of the training point cloud block; the first enhanced upsampling feature information is input into the second third feature extraction unit for feature extraction to obtain the second enhanced upsampling feature information of the training point cloud block; the second enhanced upsampling feature information is input into the third third feature extraction unit for feature extraction to obtain the third enhanced upsampling feature information of the training point cloud block, and the third enhanced upsampling feature information of the training point cloud block is recorded as the second feature information of the training point cloud block.
  • the network structure of the above-mentioned third feature extraction unit is the same as that of the above-mentioned second feature extraction unit.
  • the network structures of the third feature extraction unit and the second feature extraction unit are not completely the same.
  • the third feature extraction unit includes L residual blocks, where L is a positive integer.
  • S404-A22 includes:
• S404-A221: input the kth enhanced upsampling feature information of the training point cloud block into the k+1th third feature extraction unit, and obtain the second residual information output by the lth residual block in the k+1th third feature extraction unit, where l is a positive integer less than or equal to L;
  • the network structure of the residual block is as shown in Figure 6D
  • the residual block includes multiple linear layers with a linear rectification function (Relu)
• the residual block used in the embodiment of the present application is used for feature mining and helps the network converge.
• for example, the k+1th third feature extraction unit includes 4 residual blocks (RB). The kth enhanced upsampling feature information of the training point cloud block is input into the first RB of the k+1th third feature extraction unit, and the first RB outputs second residual information 1; the second residual information 1 and the kth enhanced upsampling feature information are added and input into the second RB, which outputs second residual information 2; the second residual information 2 and the kth enhanced upsampling feature information are added and input into the third RB to obtain second residual information 3 output by the third RB; the second residual information 3 and the kth enhanced upsampling feature information are added and input into the fourth RB to obtain second residual information 4 output by the fourth RB. Then, the k+1th enhanced upsampling feature information of the training point cloud block is obtained according to the second residual information output by each RB and the kth enhanced upsampling feature information.
• the ways of obtaining the k+1th enhanced upsampling feature information of the training point cloud block include but are not limited to the following:
• Method 1: add the second residual information output by the last residual block in the L residual blocks to the kth enhanced upsampling feature information, and use the result as the k+1th enhanced upsampling feature information of the training point cloud block.
  • Method 2 the above S404-A223 includes step C1 and step C2:
  • Step C1 concatenate the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one residual block in the L-1 residual blocks, where L -1 residual block is a residual block except the last residual block among the L residual blocks.
  • the second residual information output by the last residual block in the L residual blocks is compared with the second residual information output by each residual block in the L-1 residual blocks. Cascade to obtain the feature information after cascading.
  • Step C2 according to the concatenated feature information and the kth upsampling feature information, determine the k+1th enhanced upsampling feature information of the training point cloud block.
• the implementation methods of the above step C2 include but are not limited to the following:
  • Way 1 add the cascaded feature information to the kth enhanced upsampling feature information, and use it as the k+1th enhanced upsampling feature information of the training point cloud block.
  • the third feature extraction unit also includes a gating unit, and at this time, the above step C2 includes: inputting the cascaded feature information into the gating unit for de-redundancy to obtain de-redundant feature information; The final feature information is added to the k-th enhanced up-sampled feature information, and used as the k+1-th enhanced up-sampled feature information of the training point cloud block.
  • FIG. 7D is a schematic diagram of a specific network structure of the feature upsampling module provided in the embodiment of the present application.
• the feature upsampling submodule upsamples the first feature information, with a size of NXC, to upsampled feature information with a size of rNX(C+2).
• the upsampled feature information with a size of rNX(C+2) is input into the first autocorrelation attention network (Self-attention) for feature interaction, and the upsampled feature information after feature interaction is obtained.
  • the embodiment of the present application does not limit the network structure of the above-mentioned gate control unit.
  • the network structure of the above-mentioned gate control unit is as shown in FIG. 6G , and for details, refer to the above-mentioned description of S403 .
• the above describes in detail, in combination with the network structure of the feature upsampling module, the process of inputting the first feature information of the training point cloud block into the feature upsampling module to obtain the second feature information of the training point cloud block. After obtaining the second feature information of the training point cloud block, the following S405 is performed to spatially convert the second feature information to obtain the upsampled geometric information of the training point cloud block.
• the function of the geometry generation module in the embodiment of the present application is to remap the second feature information of the training point cloud block, obtained by upsampling, from the feature space back to the geometric space, and finally obtain the upsampled point cloud; that is, to map F up back to the geometric space and obtain the upsampled geometric information of the training point cloud block, with a size of rNX3, where 3 refers to the geometric information dimension and rN is the number of points included in the training point cloud block after upsampling.
  • the embodiment of the present application does not limit the specific network structure of the geometry generation module.
  • the geometry generation module includes a plurality of fully connected layers
  • the above S405 includes: inputting the second feature information of the training point cloud block into a plurality of fully connected layers for space conversion, and obtaining the upsampled geometry of the training point cloud block information.
• however, directly outputting the upsampled geometric information cannot generate a uniformly distributed point cloud well, nor can it suppress noise at the boundary.
• therefore, this application first upsamples the point cloud by r+m times; then generates the upsampled geometric information through the FC layers; then uses a high-pass image filter to explicitly remove multiple high-frequency points (i.e. noise) in each upsampled Patch; and finally downsamples the point cloud to r times through the farthest point sampling (FPS) algorithm and outputs the result.
  • the geometry generation module includes: a geometry reconstruction unit, a filter unit, and a downsampling unit, wherein the geometry reconstruction unit includes multiple fully connected layers.
  • the above S405 includes:
• the filtering unit may be a high-pass image filter, which explicitly removes a plurality of (for example, 5) high-frequency points (i.e., noise points) in each upsampled patch.
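The farthest point sampling step used to bring the over-sampled patch back down to r times can be sketched as follows (a standard greedy FPS; the starting index is arbitrary):

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Downsample (N, 3) points to m points with greedy FPS.

    Repeatedly pick the point that is farthest from everything
    selected so far, which spreads the samples uniformly.
    """
    chosen = [0]  # start from an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        idx = int(np.argmax(dist))        # farthest from the chosen set
        chosen.append(idx)
        # Keep, for each point, its distance to the nearest chosen point.
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[np.array(chosen)]
```

In the geometry generation module this would be applied to the (r+m)-times upsampled, filtered patch to keep exactly rN points.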
  • the implementation of the above S406 includes but is not limited to the following methods:
• Method 1: according to the loss between the predicted upsampled geometric information of the training point cloud block and the upsampled true value of the geometric information of the training point cloud block, reversely train the feature extraction module, feature upsampling module and geometry generation module in the generator to obtain the trained generator.
• the training process of the embodiment of the present application is an iterative process in which each iteration is consistent, and the parameters in the generator (such as the weight matrices) are updated once during each iteration until the model training end condition is reached.
  • the model training end condition includes that the number of training times reaches a preset number of times, or the prediction error of the generator reaches a preset value, and the like.
• Fig. 9 is a schematic diagram of the training process of the generator in the embodiment of the present application.
• the geometric information of the training point cloud block is input into the generator to obtain the predicted upsampled geometric information of the training point cloud block output by the generator. According to the predicted upsampled geometric information of the training point cloud and the upsampled true value of its geometric information, the parameters of the feature extraction module, feature upsampling module, and geometry generation module in the generator are adjusted; for example, the loss between the predicted upsampled geometric information and the upsampled true value is used to update the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module to obtain a trained generator.
• the upsampled true value of the geometric information of the training point cloud can be understood as the data, included in the training data, obtained after upsampling the geometric information of the training point cloud.
• the resolution of the upsampled true value of the geometric information of the training point cloud is lower than that of the upsampled geometric information of the training point cloud output by the generator.
• Method 2: Train the generator with the help of a discriminator. In this case, the above S406 includes:
• the discriminator may be a piece of software code or a chip with a data processing function.
• Fig. 10 is another schematic diagram of the training process of the generator in the embodiment of the present application.
• the geometric information of the training point cloud block is input into the generator to obtain the upsampled geometric information of the training point cloud block output by the generator.
• the upsampled geometric information of the training point cloud block is input into the discriminator to obtain the first discrimination result output by the discriminator.
• according to the first discrimination result, the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module in the generator are adjusted to realize the training of the generator.
• specifically, the training point cloud is divided into at least one training point cloud block. For each training point cloud block, its geometric information is input into the generator, and the generator upsamples the geometric information of the training point cloud block to obtain the predicted upsampled geometric information of the training point cloud block. After its geometric information is upsampled, the training point cloud block becomes a dense training point cloud block.
• the dense training point cloud block should have the same geometric distribution as the upsampled true value of the training point cloud block; that is, if the generator has high precision, the geometric distribution of the training point cloud block upsampled by the generator should be close to the geometric distribution of the upsampled ground truth of the training point cloud block.
  • the predicted upsampled geometric information of the training point cloud block is input to the discriminator, so that the discriminator judges whether the data input to the discriminator is the upsampled true value of the training point cloud block, and outputs the first discrimination result.
• if the first discrimination result is a first value, such as 0, it means the discriminator judges that the data input to it is an upsampled training point cloud block, indicating that the generator is not yet fully trained, and the parameter matrices in the generator are reversely adjusted.
• if the first discrimination result is a second value, such as 1, it means the discriminator judges that the data input to it is the upsampled ground truth of the training point cloud block.
• in this case, the training of the generator is complete, and the trained generator is then used to upsample the geometric information of the point cloud.
  • the above S406-A2 includes:
  • the embodiment of the present application does not limit the specific type of the loss function used when determining the first loss according to the first discrimination result.
  • a least squares loss function is used to determine the first loss of the generator.
• the first loss of the generator is determined according to the following formula (1):
• where L_gen(P_up) is the first loss, P_up is the upsampled geometric information of the training point cloud block, and D(P_up) is the first discrimination result output by the discriminator when the upsampled geometric information of the training point cloud block is input to it.
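Since the body of formula (1) is not reproduced in this text, the sketch below assumes the standard least-squares (LSGAN-style) generator objective, which pushes the discriminator's score on upsampled patches toward 1; the exact form in the patent may differ.

```python
import numpy as np

def generator_first_loss(d_up):
    """Least-squares generator loss L_gen(P_up), assumed LSGAN form.

    d_up: discriminator output D(P_up) for the upsampled patch; the loss
    is zero when the discriminator scores the generator output as 1
    (i.e., mistakes it for the upsampled true value).
    """
    d_up = np.asarray(d_up, dtype=float)
    return 0.5 * float(np.mean((d_up - 1.0) ** 2))
```

With this form, a perfectly fooled discriminator (D(P_up) = 1) gives zero loss, matching the training goal described above.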
• the ways of determining the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module in the generator include but are not limited to the following:
• Method 1: Determine the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module in the generator based on the first loss. For example, when the first loss is greater than a preset value, the accuracy of the generator has not reached the preset requirement, and the parameter matrices of the three modules are reversely adjusted. If the first loss is less than the preset value, the accuracy of the generator meets the preset requirement, and the parameter matrices of the three modules are fixed.
• Step A1: Determine at least one second loss of the generator;
• Step A2: Determine the target loss of the generator according to the first loss of the generator and the at least one second loss of the generator;
• Step A3: According to the target loss of the generator, determine the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module in the generator.
• in Method 2, at least one second loss of the generator is determined, and the parameter matrices of the feature extraction module, feature upsampling module, and geometry generation module are adjusted according to the first loss and the at least one second loss, so as to improve the training accuracy of the generator.
  • the embodiment of the present application does not limit the manner of determining at least one second loss of the generator in the above step A1, which is specifically determined according to actual needs.
• the above step A1 includes: determining a second loss of the generator using the earth mover's distance (EMD) according to the upsampled geometric information of the training point cloud block and the upsampled true value of its geometric information.
• this second loss, determined using the earth mover's distance, is also called the reconstruction loss.
• its purpose is to make the upsampled training point cloud block and the upsampled true value of the training point cloud block have a consistent geometric distribution.
• a second loss of the generator is determined according to the following formula (2):
• L_rec = EMD(P_up, P_T) = min over bijections φ: P_up → P_T of Σ_i ||p_i − φ(p_i)||_2  (2)
• where L_rec is the second loss, EMD denotes the earth mover's distance, P_up is the geometric information of the upsampled training point cloud block, P_T is the upsampled true value of the geometric information of the training point cloud block, φ: P_up → P_T is a bijection between the two equal-sized sets P_up and P_T, p_i is the i-th point in P_up, and φ(p_i) is the point in P_T corresponding to p_i under the bijection φ.
  • the above step A1 includes: determining at least one second loss of the generator according to a uniform loss function.
• a second loss of the generator is determined according to the uniform loss function, formula (3):
• where L_uni is the second loss, S_i denotes the i-th local surface obtained by the radius-ball (radius r_q) method, T is the number of seed points obtained, and d_i,j denotes the distance between the j-th point in the i-th local surface and its nearest neighbor.
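The body of the uniform loss formula is not reproduced in this text. One hedged interpretation, penalizing how much the nearest-neighbour distances d_i,j inside a local surface S_i deviate from their mean, can be sketched as follows; this is only an illustrative stand-in, not the patent's exact formula.

```python
import numpy as np

def uniformity_penalty(surface):
    """Variance of nearest-neighbour distances within one local surface S_i.

    A perfectly uniform patch gives 0.  Illustrative interpretation only:
    the patent's uniform loss formula (3) is not given in this text.
    """
    d = []
    for j in range(len(surface)):
        others = np.delete(surface, j, axis=0)
        d.append(float(np.min(np.linalg.norm(others - surface[j], axis=1))))
    d = np.asarray(d)
    return float(np.mean((d - d.mean()) ** 2))
```

Evenly spaced points yield zero penalty, while clustered points raise it, which matches the stated goal of driving the upsampled patch toward a uniform distribution.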
  • step A1 includes:
• Step A11: Downsample the upsampled geometric information of the training point cloud block, for example by farthest point sampling (FPS), to obtain a downsampled training point cloud block with the same number of points as the training point cloud block.
• Step A12: According to the geometric information of the downsampled training point cloud block and the geometric information of the training point cloud block, determine a second loss of the generator using the earth mover's distance.
• a second loss of the generator is determined according to the following formula (4):
• L_id = EMD(P_low, P_ori) = min over bijections ψ: P_low → P_ori of Σ_k ||p_k − ψ(p_k)||_2  (4)
• where L_id is the second loss of the generator, P_ori is the low-resolution training point cloud block, P_low is the downsampled training point cloud block, ψ: P_low → P_ori is the unique bijection that moves P_low onto P_ori with the minimum total distance between the two point sets, p_k is the k-th point in P_low, and ψ(p_k) is the corresponding point in P_ori.
• the target loss of the generator is determined according to the first loss of the generator and the at least one second loss of the generator; for example, a weighted average of the first loss and the at least one second loss determines the target loss of the generator.
• the target loss of the generator is determined according to the following formula (5):
• L_G = w_gen · L_gen(P_up) + w_rec · L_rec + w_uni · L_uni + w_id · L_id  (5)
• where L_G is the target loss of the generator, L_gen(P_up) is the first loss of the generator, L_rec, L_uni, and L_id are the second losses of the generator, w_gen is the weight corresponding to the first loss, and w_rec, w_uni, and w_id are the weights corresponding to the respective second losses.
• for example, w_gen = 1.
• for example, w_rec = 100.
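Formula (5) combines the four terms as a weighted sum, which can be sketched directly. The defaults below use the example weights w_gen = 1 and w_rec = 100 given in the text; the defaults for w_uni and w_id are placeholders, since the text does not specify them.

```python
def generator_target_loss(l_gen, l_rec, l_uni, l_id,
                          w_gen=1.0, w_rec=100.0, w_uni=1.0, w_id=1.0):
    """L_G = w_gen*L_gen + w_rec*L_rec + w_uni*L_uni + w_id*L_id (formula 5).

    w_uni and w_id defaults are illustrative placeholders.
    """
    return w_gen * l_gen + w_rec * l_rec + w_uni * l_uni + w_id * l_id
```

The large w_rec reflects that the reconstruction term dominates the objective, with the adversarial, uniformity, and identity terms acting as regularizers.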
• the embodiment of the present application uses the training point cloud to train the generator to obtain a trained generator, so that in practical applications the trained generator can upsample the geometric information of a point cloud to obtain a high-precision point cloud. Further, the embodiment of the present application divides the training point cloud into training point cloud blocks, uses the training point cloud blocks to train the generator, and uses the discriminator to supervise the training process of the generator, thereby improving the training accuracy and reliability of the generator.
  • the training process of the generator is introduced above in combination with the network structure of the generator, and the discriminator involved in the above S406-A1 is introduced below.
  • the above discriminator is a pre-trained discriminator.
  • the discriminator is not pre-trained, that is, the embodiment of the present application also involves a training process of the discriminator.
• in some embodiments, before the geometric information of the training point cloud block is used to train the generator, the discriminator is trained once, and then S406-A1 is executed using the trained discriminator.
• in some embodiments, the discriminator and the generator are trained alternately: in each round, the geometric information of the training point cloud block is first used to train the discriminator, and after the discriminator is trained, the generator is trained using the geometric information of the training point cloud block. The training of the discriminator and the generator proceeds alternately until the training of both is completed.
  • the training process of the discriminator specifically includes the following steps:
• Step 21: Input the predicted upsampled geometric information of the training point cloud block generated by the generator into the discriminator to obtain the second discrimination result output by the discriminator; input the upsampled true value of the geometric information of the training point cloud block into the discriminator to obtain the third discrimination result output by the discriminator;
• Step 22: Determine the loss of the discriminator according to the second discrimination result and the third discrimination result;
• Step 23: Adjust the parameters in the discriminator according to the loss of the discriminator.
• the embodiment of the present application does not limit the type of loss function used in step 22 to determine the loss of the discriminator according to the second discrimination result and the third discrimination result.
• optionally, step 22 includes: determining the loss of the discriminator using a least squares loss function according to the second discrimination result and the third discrimination result.
• the loss of the discriminator is determined as follows:
• where L_dis(P_up, P_T) represents the loss of the discriminator, P_T is the upsampled true value of the training point cloud block, and P_up is the point cloud obtained by the generator's upsampling, that is, the upsampled training point cloud block.
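The body of the discriminator loss is not written out in this text; the sketch below assumes the standard least-squares (LSGAN) form, which scores the ground truth toward 1 and the generator output toward 0, consistent with the confidence values described later.

```python
import numpy as np

def discriminator_loss(d_true, d_up):
    """Least-squares discriminator loss L_dis(P_up, P_T), assumed LSGAN form.

    d_true: D(P_T), pushed toward 1; d_up: D(P_up), pushed toward 0.
    """
    d_true = np.asarray(d_true, dtype=float)
    d_up = np.asarray(d_up, dtype=float)
    return (0.5 * float(np.mean((d_true - 1.0) ** 2))
            + 0.5 * float(np.mean(d_up ** 2)))
```

A discriminator that labels both inputs correctly incurs zero loss; one that confuses them is penalized, which is the signal used in step 23 to adjust its parameters.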
• in this way, the discriminator is trained according to the difference between its discrimination results for the predicted upsampled geometric information of the training point cloud block and for the upsampled true value of the geometric information of the training point cloud block, thereby improving the training accuracy of the discriminator.
• the process by which the discriminator obtains a discrimination result based on the geometric information of a point cloud block is described in detail below, that is, the process by which the discriminator generates the first discrimination result, the second discrimination result, and the third discrimination result.
  • Figure 11 is a schematic diagram of a network structure of the discriminator.
• the discriminator includes a global discrimination module, a boundary discrimination module, and a fully connected module, wherein the global discrimination module is used to extract the global feature information of the point cloud, the boundary discrimination module is used to extract the boundary feature information of the point cloud, and the fully connected module is used to process the global feature information and boundary feature information of the point cloud to obtain the discrimination result.
  • Fig. 12 is a schematic flow chart of the model training method provided by an embodiment of the present application. As shown in Fig. 11 and Fig. 12, the process for the discriminator to obtain the discriminant result includes:
  • a high-pass graph filter is used to extract the geometric information of the boundary points of the target point cloud block.
  • S602. Input the geometric information of the boundary points of the target point cloud block into the boundary discrimination module to perform boundary feature extraction, and obtain boundary feature information of the target point cloud block.
  • the global feature information and boundary feature information of the target point cloud block are concatenated; the concatenated global feature information and boundary feature information are input into the fully connected module to obtain the target discrimination result of the discriminator.
• the discriminator in the embodiment of the present application can be understood as a double-headed discriminator, which can perform discrimination in two dimensions, global and boundary, thereby improving the accuracy of the discrimination.
• specifically, for each input target point cloud block, the boundary points of the target point cloud block are first extracted; for example, the R boundary points P_b of the target point cloud block are extracted through a high-pass graph filter. Then the complete target point cloud block and P_b are explicitly fed into the double-headed discriminator shown in Fig. 12, obtaining the global feature information of the target point cloud block output by the global discrimination module and the boundary feature information of the target point cloud block output by the boundary discrimination module. The global feature information and boundary feature information of the target point cloud block are input into the fully connected module to obtain the target discrimination result of the discriminator.
• the discriminator obtains the first discrimination result, the second discrimination result, and the third discrimination result through the same process.
• if the target point cloud block is a training point cloud block upsampled by the generator, and the discriminator has been trained with the training point cloud block, the above target discrimination result is the first discrimination result. If the target point cloud block is a training point cloud block upsampled by the generator, and the discriminator has not been trained with the training point cloud block, the target discrimination result is the second discrimination result. If the target point cloud block is the upsampled true value of the training point cloud block, the target discrimination result is the third discrimination result.
• for example, the training process of the discriminator is as follows: the training point cloud block is input into a generator that has not been trained with the training point cloud block, and the generator generates the upsampled training point cloud block 1.
• the upsampled training point cloud block 1 generated by the generator is input into a discriminator that has not been trained with the training point cloud block, and the discriminator outputs the second discrimination result.
• optionally, the upsampled true value of the training point cloud block is input into the discriminator, and the discriminator outputs the third discrimination result; the parameter matrix of the discriminator is updated according to the second discrimination result and the third discrimination result, realizing one training of the discriminator.
• then the upsampled training point cloud block 1 generated by the generator is input into the above discriminator trained with the training point cloud block, and the discriminator outputs the first discrimination result.
• the following introduces the global discrimination module and the boundary discrimination module in the discriminator, respectively.
• in some embodiments, the global discrimination module includes, in order along the network depth direction: a first number of multi-layer perceptrons, a first max pooling layer, a second self-attention network, a second number of multi-layer perceptrons, and a second max pooling layer.
  • the above S603 includes:
• optionally, the first global feature information and the second global feature information are concatenated; the concatenated first and second global feature information are input into the second self-attention network for feature interaction to obtain the third global feature information of the target point cloud block.
• specifically, the geometric information of the target point cloud block is input into the first number of multi-layer perceptrons (MLP) for feature extraction to obtain the first global feature information of the target point cloud block; then the first global feature information is input into the first max pooling layer for dimension reduction, and the second global feature information of the target point cloud block is obtained through the max pooling operation; next, the first and second global feature information are input into the second self-attention network for feature interaction, improving the feature interaction between points and obtaining the third global feature information of the target point cloud block; then the third global feature information is input into the second number of multi-layer perceptrons (MLP) for further feature extraction to obtain the fourth global feature information of the target point cloud block; finally, the fourth global feature information is input into the second max pooling layer for dimension reduction to obtain the global feature information of the target point cloud block.
  • the first quantity is equal to the second quantity.
  • both the first quantity and the second quantity are equal to 2.
• for example, the first number of multi-layer perceptrons includes a first-layer multi-layer perceptron and a second-layer multi-layer perceptron, and the second number of multi-layer perceptrons includes a third-layer multi-layer perceptron and a fourth-layer multi-layer perceptron; the feature dimensions of the first-, second-, third-, and fourth-layer multi-layer perceptrons increase gradually.
• for example, the feature dimension of the first-layer multi-layer perceptron is 32, that of the second-layer multi-layer perceptron is 64, that of the third-layer multi-layer perceptron is 128, and that of the fourth-layer multi-layer perceptron is 256.
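A minimal numpy sketch of the global branch: shared pointwise MLPs followed by max pooling over points. The self-attention stage is omitted, the weights are random placeholders, and the 32/64 dimensions follow the example above; this is illustrative only, not the patent's implementation.

```python
import numpy as np

def shared_mlp(features, weight, bias):
    """Apply the same linear layer + ReLU to every point (a pointwise MLP)."""
    return np.maximum(features @ weight + bias, 0.0)

def global_branch(points, rng):
    """points: (N, 3) geometry -> fixed-size global feature via MLPs + max pooling."""
    w1, b1 = 0.1 * rng.standard_normal((3, 32)), np.zeros(32)
    w2, b2 = 0.1 * rng.standard_normal((32, 64)), np.zeros(64)
    f1 = shared_mlp(points, w1, b1)   # (N, 32): first global feature information
    f2 = shared_mlp(f1, w2, b2)       # (N, 64): per-point features
    return f2.max(axis=0)             # max pool over points -> (64,) global vector
```

The max pooling step is what makes the output invariant to the ordering of points, which is essential for point cloud inputs.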
• in some embodiments, the boundary discrimination module includes, in order along the network depth direction: a third number of multi-layer perceptrons, a third max pooling layer, a third self-attention network, a fourth number of multi-layer perceptrons, and a fourth max pooling layer. In this case, the above S602 includes:
• optionally, S602-A3 includes: concatenating the first boundary feature information and the second boundary feature information; inputting the concatenated first and second boundary feature information into the third self-attention network for feature interaction to obtain the third boundary feature information of the target point cloud block.
• specifically, the geometric information of the boundary points of the target point cloud block is input into the third number of multi-layer perceptrons for feature extraction to obtain the first boundary feature information; the first boundary feature information is then input into the third max pooling layer for dimension reduction, and the second boundary feature information of the target point cloud block is obtained through the max pooling operation; next, the first and second boundary feature information are input into the third self-attention network for feature interaction, enhancing the feature interaction between points and obtaining the third boundary feature information of the target point cloud block; then the third boundary feature information is input into the fourth number of multi-layer perceptrons (MLP) for further feature extraction to obtain the fourth boundary feature information of the target point cloud block; finally, the fourth boundary feature information is input into the fourth max pooling layer for dimension reduction to obtain the boundary feature information of the target point cloud block.
  • the third quantity is equal to the fourth quantity.
  • both the third quantity and the fourth quantity are equal to 2.
• for example, the third number of multi-layer perceptrons includes a fifth-layer multi-layer perceptron and a sixth-layer multi-layer perceptron, and the fourth number of multi-layer perceptrons includes a seventh-layer multi-layer perceptron and an eighth-layer multi-layer perceptron; the feature dimensions of the fifth-, sixth-, seventh-, and eighth-layer multi-layer perceptrons increase gradually.
  • the feature dimension of the eighth-layer multi-layer perceptron is greater than or equal to the feature dimension of the seventh-layer multi-layer perceptron, and smaller than or equal to the feature dimension of the fourth-layer multi-layer perceptron.
  • the feature dimension of the eighth-layer multilayer perceptron is greater than or equal to 128 and less than or equal to 256.
• for example, the feature dimension of the fifth-layer multi-layer perceptron is 32, that of the sixth-layer multi-layer perceptron is 64, that of the seventh-layer multi-layer perceptron is 128, and that of the eighth-layer multi-layer perceptron is 192.
• the global feature information of the target point cloud block output by the global discrimination module and the boundary feature information output by the boundary discrimination module are concatenated, and the concatenated global and boundary feature information is input into the fully connected module, which obtains the confidence value of the discriminator, that is, the discrimination result, through three fully connected (FC) layers. If the input of the discriminator is the upsampled point cloud output by the generator, the confidence value is close to 0; if the input is the upsampled true value of the point cloud, the confidence value is close to 1.
• in this way, the training of the generator can be supervised according to the discrimination results of the discriminator, improving the training accuracy of the generator, so that the distribution of the point cloud upsampled by the trained generator is close to the upsampled true value of the point cloud, ensuring the accuracy of the upsampled point cloud.
• in addition, the discriminator includes a global discrimination module and a boundary discrimination module, which discriminate the global information and the boundary information of the point cloud respectively, thereby improving the discrimination accuracy of the discriminator and, when the discriminator is used to assist the training of the generator, the training accuracy of the generator.
  • the training process of the generator is introduced above, and the upsampling process of the geometric information of the point cloud using the trained generator is introduced below.
  • the above trained generator can realize the upsampling of the geometric information of the point cloud.
  • Fig. 14 is a schematic flow chart of the point cloud upsampling method provided by the embodiment of the present application. As shown in Fig. 14, the point cloud upsampling process includes:
  • the point cloud to be upsampled may be collected in real time by a point cloud collection device.
  • the point cloud to be upsampled may be obtained from other storage devices.
• the point cloud to be upsampled is decoded by the decoding device from the code stream generated by the encoding device.
  • the embodiment of the present application does not limit the specific process of obtaining the point cloud to be processed.
  • the methods of dividing the point cloud to be upsampled into at least one point cloud block in S702 include but are not limited to the following methods:
• Method 1: Divide the point cloud to be upsampled into at least one point cloud block of equal size according to its geometric information; that is, the geometric scale of each point cloud block is the same.
• Method 2: Divide the point cloud to be upsampled into at least one point cloud block according to its geometric information, with each point cloud block including the same number of points.
• Method 3: Obtain at least one seed point from the point cloud to be upsampled according to its geometric information; for example, use Monte Carlo random sampling to randomly sample a specified number of seed points from the point cloud to be upsampled. For each seed point, determine its neighboring points, and group the seed point with its neighboring points into one point cloud block, thereby obtaining at least one point cloud block. In Method 3, the obtained point cloud blocks are also called point cloud patches (Patch), and each obtained point cloud block includes the same number of points.
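Method 3 above can be sketched as follows. The use of k-nearest neighbours to define "neighboring points" is an assumption, since the text does not specify how the neighborhood is chosen.

```python
import numpy as np

def extract_patches(points, n_seeds, k, rng):
    """Randomly sample seed points, then group each seed with its nearest
    neighbours so that every patch contains exactly k points."""
    seed_idx = rng.choice(len(points), size=n_seeds, replace=False)
    patches = []
    for s in seed_idx:
        d = np.linalg.norm(points - points[s], axis=1)
        nearest = np.argsort(d)[:k]  # includes the seed itself (distance 0)
        patches.append(points[nearest])
    return patches
```

Because every patch has the same point count, the patches can be batched directly as fixed-size inputs to the generator.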
• Fig. 15 is a schematic diagram of a network structure of the generator involved in the embodiment of the present application.
• the generator includes: a feature extraction module, a feature upsampling module, and a geometry generation module, wherein the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into geometric space to obtain the upsampled geometric information of the point cloud block.
  • the network structure of the feature extraction module, feature upsampling module and geometry generation module in the generator is introduced below.
• in some embodiments, the feature extraction module includes M densely connected feature extraction blocks.
• the (i+1)-th feature extraction block is used to output the (i+1)-th third feature information according to the input i-th fourth feature information, where the i-th fourth feature information is determined according to the i-th third feature information output by the i-th feature extraction block, the first feature information of the point cloud block is determined according to the output of the M-th feature extraction block among the M feature extraction blocks, and i is a positive integer smaller than M.
• optionally, if i is greater than 1, the i-th fourth feature information is the feature information obtained by concatenating the third feature information extracted by each feature extraction block preceding the i-th feature extraction block among the M feature extraction blocks with the third feature information extracted by the i-th feature extraction block. If i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block among the M feature extraction blocks.
• in some embodiments, the feature extraction block includes: a first feature extraction unit and S second feature extraction units connected in series, where S is a positive integer.
• the first feature extraction unit in the (i+1)-th feature extraction block is used to: for the current point in the point cloud block, search for K neighboring points of the current point; based on the i-th fourth feature information of the point cloud block, subtract the fourth feature information of the current point from the fourth feature information of each neighboring point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point; obtain the i-th concatenated feature information of the point cloud block according to the i-th concatenated feature information of each current point; and input the i-th concatenated feature information of the point cloud block into the first of the S second feature extraction units.
• the first second feature extraction unit is used to output the first fifth feature information to the second second feature extraction unit according to the i-th concatenated feature information of the point cloud block, wherein the (i+1)-th third feature information of the point cloud block is the fifth feature information output by the last of the S second feature extraction units.
  • the second feature extraction unit includes P residual blocks, where P is a positive integer
• the j+1th residual block in the sth second feature extraction unit is used to output the j+1th first residual information according to the jth first residual information output by the jth residual block in the sth second feature extraction unit and the fifth feature information input to the sth second feature extraction unit, where j is a positive integer less than P and s is a positive integer less than or equal to S.
• the fifth feature information output by the sth second feature extraction unit is determined based on the first residual information output by at least one residual block in the sth second feature extraction unit and the fifth feature information input to the sth second feature extraction unit.
  • the fifth feature information output by the sth second feature extraction unit is based on the first residual information output by the last residual block in the sth second feature extraction unit, and The feature information after concatenation of the first residual information output by at least one residual block in the P-1 residual blocks and the fifth feature information input to the s-th second feature extraction unit are determined, wherein, P The -1 residual block is a residual block except the last residual block among the P residual blocks of the s-th second feature extraction unit.
  • the fifth feature information output by the sth second feature extraction unit is based on the first residual information output by the last residual block in the sth second feature extraction unit, and The first residual information output by at least one residual block in the P-1 residual blocks is concatenated and then determined by adding the fifth feature information input to the s-th second feature extraction unit.
  • the second feature extraction unit further includes a gating unit
  • the gating unit in the sth second feature extraction unit is used for the first residual information output by the last residual block in the sth second feature extraction unit, and P-1 De-redundancy is performed on the feature information after concatenation of the first residual information output by at least one residual block in the residual block, and the de-redundant feature information is output; the fifth feature information output by the sth second feature extraction unit It is determined after adding the feature information after de-redundancy to the fifth feature information input to the sth second feature extraction unit.
  • the network structure of the feature extraction module in the generator is introduced above with reference to FIGS. 6A to 6F
  • the network structure of the feature upsampling module in the generator is introduced below with reference to FIGS. 7A to 7D .
  • the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule;
  • the feature upsampling submodule is used to copy r copies of the first feature information of the point cloud block according to the preset upsampling rate r, and add an n-dimensional vector to the feature dimension of the copied first feature information, Obtain the upsampling feature information of the point cloud block, and input the upsampling feature information of the point cloud block into the feature extraction submodule, wherein the values of the n-dimensional vectors corresponding to different first feature information are different;
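The copy-and-tag upsampling performed by the feature upsampling submodule can be sketched as follows; the evenly spaced codes are just one possible choice for the n-dimensional vectors (the text only requires that the values differ between copies).

```python
import numpy as np

def upsample_features(feat, r, n=1):
    """Replicate each of the N point features r times along the point axis
    and append a distinct n-dim code to each copy, so the r duplicates of a
    point are no longer identical."""
    N, C = feat.shape
    rep = np.repeat(feat, r, axis=0)                       # (N*r, C) copies
    codes = np.linspace(-1.0, 1.0, r)                      # one code per copy
    codes = np.tile(codes, N)[:, None] * np.ones((1, n))   # (N*r, n) appended dims
    return np.concatenate([rep, codes], axis=-1)           # (N*r, C + n)

feat = np.arange(6, dtype=float).reshape(3, 2)   # 3 points, 2-dim features
up = upsample_features(feat, r=4, n=1)
print(up.shape)   # (12, 3): 3 points x 4 copies, feature dim 2 + 1
```

Without the appended codes the r copies of a point would collapse to the same location after geometry generation; the codes break that symmetry.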
  • the feature extraction sub-module is used to output the second feature information of the point cloud block according to the up-sampled feature information of the point cloud block.
  • the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer
• the k+1th third feature extraction unit is used to output the k+1th enhanced upsampling feature information of the point cloud block according to the kth enhanced upsampling feature information of the point cloud block extracted by the kth third feature extraction unit, where k is a positive integer less than Q;
  • the second feature information of the point cloud block is the Qth enhanced upsampling feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
• the third feature extraction unit is the HRA in Figure 7D; the third feature extraction unit includes L residual blocks, where L is a positive integer. For example, the third feature extraction unit includes 4 residual blocks (RB);
  • the l+1th residual block in the k+1th third feature extraction unit is used according to the lth residual in the k+1th third feature extraction unit
  • the lth second residual information output by the block and the kth enhanced upsampling feature information input to the k+1th third feature extraction unit, and the l+1th second residual information is output, where l is less than L Positive integer; optionally, after adding the l-th second residual information output by the l-th residual block and the k-th enhanced upsampling feature information, input the l+1th residual block.
  • the k+1th enhanced upsampling feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the k+1th third feature extraction unit, and the kth enhanced upsampling feature information of.
  • the k+1th enhanced upsampled feature information of the above point cloud block is based on the second residual information output by the last residual block in the L residual blocks, and the L-1 residual
  • the second residual information output by at least one residual block in the difference block is determined by concatenating the feature information and the kth enhanced upsampling feature information, wherein, the L-1 residual block is the k+1th A residual block except the last residual block among the L residual blocks of the three-feature extraction unit.
  • the k+1th enhanced upsampled feature information of the above point cloud block is based on the second residual information output by the last residual block in the L residual blocks, and the L-1 residual It is determined by adding the concatenated feature information of the second residual information output by at least one residual block in the difference block and the kth enhanced upsampling feature information.
  • the third feature extraction unit further includes a gating unit
• the gating unit is used to perform de-redundancy on the feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second residual information output by at least one residual block in the L-1 residual blocks, and to output the de-redundant feature information;
  • the k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
  • the network structure of the third feature extraction unit is the same as that of the above-mentioned second feature extraction unit.
  • the feature upsampling module further includes a first autocorrelation attention network
  • the first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
  • the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
  • the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
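The feature interaction performed by the autocorrelation (self-)attention network, together with the dimensionality reduction noted above, can be sketched with a single-head attention pass; the projection sizes are illustrative assumptions.

```python
import numpy as np

def self_attention_interaction(x, wq, wk, wv):
    """Minimal self-attention over the upsampled point features: every
    output feature is a similarity-weighted mixture of all features, so
    the replicated copies interact. The value projection maps to a lower
    dimension, reducing the feature dimension as described."""
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(q.shape[-1])          # pairwise similarities
    logits -= logits.max(axis=-1, keepdims=True)     # numerically stable softmax
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v                                     # (N, C_out), C_out < C

rng = np.random.default_rng(3)
N, C, Cq, Cout = 32, 16, 8, 12                       # Cout < C: reduced dimension
x = rng.standard_normal((N, C))                      # upsampled feature information
out = self_attention_interaction(
    x, rng.standard_normal((C, Cq)), rng.standard_normal((C, Cq)),
    rng.standard_normal((C, Cout)))
print(out.shape)   # (32, 12)
```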
• the network structure of the feature upsampling module in the generator is introduced above with reference to FIG. 7A to FIG. 7D, and the network structure of the geometry generation module in the generator is introduced below in conjunction with FIG. 8 and FIG. 15.
  • the geometry generation module includes a plurality of fully connected layers
  • the multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
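The fully connected mapping from second feature information to coordinates can be sketched as a small MLP; the layer widths and the two-layer depth are arbitrary assumptions.

```python
import numpy as np

def mlp_geometry(feat, weights):
    """Map per-point features to 3-D coordinates with a stack of fully
    connected layers (ReLU between layers, none after the last)."""
    h = feat
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h                                   # (N, 3) upsampled geometry

rng = np.random.default_rng(4)
feat = rng.standard_normal((48, 32))           # second feature information
ws = [rng.standard_normal((32, 16)) * 0.1,     # hidden layer
      rng.standard_normal((16, 3)) * 0.1]      # output layer -> xyz
xyz = mlp_geometry(feat, ws)
print(xyz.shape)   # (48, 3)
```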
  • the geometry generation module includes: a geometry reconstruction unit, a filtering unit and a downsampling unit;
  • the geometric reconstruction unit is used to geometrically reconstruct the second feature information of the point cloud block, and output the initial upsampling geometric information of the point cloud block to the filtering unit;
• the filtering unit is used to denoise the initial upsampling geometric information of the point cloud block, and output the noise-filtered initial upsampling geometric information of the point cloud block to the downsampling unit;
• the down-sampling unit is used to down-sample the noise-filtered initial up-sampling geometric information of the point cloud block to a target up-sampling rate, and output the up-sampled geometric information of the point cloud block.
  • the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
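The reconstruct-filter-downsample pipeline above can be sketched as follows; the statistical outlier filter and greedy farthest point sampling are common choices used here as assumptions, since the text does not fix the filtering or downsampling algorithms.

```python
import numpy as np

def farthest_point_downsample(xyz, m):
    """Greedy farthest point sampling: keep m well-spread points."""
    chosen = [0]
    d = ((xyz - xyz[0]) ** 2).sum(-1)
    for _ in range(m - 1):
        nxt = int(d.argmax())                          # farthest from chosen set
        chosen.append(nxt)
        d = np.minimum(d, ((xyz - xyz[nxt]) ** 2).sum(-1))
    return xyz[chosen]

def generate_geometry(feat, reconstruct, n_in, target_rate, radius=2.5):
    """Sketch of the geometry generation pipeline: reconstruct coordinates
    from features, drop statistical outliers as noise, then downsample to
    n_in * target_rate points (target rate <= feature upsampling rate)."""
    pts = reconstruct(feat)                            # initial upsampled geometry
    d = np.linalg.norm(pts - pts.mean(0), axis=1)      # crude outlier score
    pts = pts[d < d.mean() + radius * d.std()]         # filter noise
    return farthest_point_downsample(pts, n_in * target_rate)

rng = np.random.default_rng(5)
feat = rng.standard_normal((100 * 4, 8))               # features for r = 4 upsampling
w = rng.standard_normal((8, 3))                        # toy reconstruction projection
out = generate_geometry(feat, lambda f: f @ w, n_in=100, target_rate=2)
print(out.shape)   # (200, 3): downsampled to the target rate of 2
```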
• the scheme proposed in the embodiment of the present application is implemented on the test platform, and the Chamfer distance (CD), the Hausdorff distance (HD), and the point-to-surface distance (P2FD) are used respectively to measure the similarity between the upsampled point cloud and the ground truth of the upsampled point cloud.
  • the upsampling rate r is set to 4.
• the technical solution of the embodiment of the present application was compared with the optimization-based method EAR and the state-of-the-art point cloud upsampling networks PU-Net, MPU, and PU-GAN; the results on the test data set are shown in Table 1:
• the difference between the point cloud generated by the method proposed in this application and the upsampling ground truth of the point cloud is the smallest: the Chamfer distance (CD), Hausdorff distance (HD), and point-to-face distance (P2FD) are 0.258, 3.571 and 2.392 respectively. This shows that the point cloud upsampling method proposed in this application achieves effective upsampling of the point cloud.
• the point cloud upsampling method of the embodiment of the present application obtains the geometric information of the point cloud to be upsampled, divides the point cloud to be upsampled into at least one point cloud block according to that geometric information, inputs the geometric information of the point cloud block into the generator for upsampling, and obtains the upsampled geometric information of the point cloud block.
  • the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module.
• the feature extraction module is used to extract the first feature information of the point cloud block;
• the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information;
• the geometry generation module is used to map the second feature information of the point cloud block into the geometric space to obtain the upsampled geometric information of the point cloud block.
• the generator in the embodiment of the present application is a deep-learning-based generator, through which more feature information of the point cloud can be learned; when this generator is used for point cloud upsampling, it can therefore generate a high-precision point cloud whose features are close to the upsampling ground truth of the point cloud, thereby improving the accuracy of point cloud upsampling.
  • the point cloud upsampling method provided in the embodiment of the present application can also be applied to a point cloud encoding and decoding framework, for example, it can be applied to a point cloud decoding end.
  • Fig. 16 is a schematic flow chart of the point cloud decoding method provided by the embodiment of the present application. As shown in Fig. 16, the point cloud decoding method includes:
  • the point cloud code stream includes attribute code stream and geometry code stream. By decoding the geometry code stream, the geometric information of the point cloud can be obtained, and by decoding the attribute code stream, the attribute information of the point cloud can be obtained.
• the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into a geometric space, so as to obtain the upsampled geometric information of the point cloud block.
• the network structures of the feature extraction module, the feature upsampling module and the geometry generation module in this generator are the same as those introduced above with reference to FIG. 6A to FIG. 6F, FIG. 7A to FIG. 7D, FIG. 8 and FIG. 15, and are not repeated here.
  • the target upsampling rate is a preset value.
  • the above target upsampling rate is parsed from the point cloud code stream.
• the embodiment of the present application upsamples the geometric information of the point cloud reconstructed at the point cloud decoding end to generate a high-precision reconstructed point cloud, which can satisfy application scenarios requiring high-precision point clouds and further enriches the point cloud decoding process.
• the sequence numbers of the above-mentioned processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
• the term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist. Specifically, A and/or B may mean: A exists alone, both A and B exist, or B exists alone.
  • the character "/" in this article generally indicates that the contextual objects are an "or" relationship.
  • Fig. 17 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
• the point cloud decoder 20 may comprise:
  • the decoding unit 21 is used to decode the point cloud code stream to obtain the geometric information of the point cloud;
  • a division unit 22 configured to divide the point cloud into at least one point cloud block according to the geometric information of the point cloud
  • An up-sampling unit 23 configured to input the geometric information of the point cloud block into the generator for up-sampling, to obtain the up-sampled geometric information of the point cloud block;
  • the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module, the feature extraction module is used to extract the first feature information of the point cloud block, and the feature sampling module is used to extract the The first feature information of the point cloud block is up-sampled to the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into a geometric space, so as to obtain the up-sampling of the point cloud block geometric information.
  • the feature extraction module comprises densely connected M feature extraction blocks
  • the i+1th feature extraction block in the M feature extraction blocks is used to output the i+1th third feature according to the input i-th fourth feature information information, the i-th fourth feature information is determined according to the i-th third feature information output by the i-th feature extraction block, and the first feature information of the point cloud block is determined according to the M feature extraction blocks Determined by the Mth third feature information output by the Mth feature extraction block, the i is a positive integer smaller than M.
• the i-th fourth feature information is the feature information obtained by concatenating the third feature information extracted by each feature extraction block before the i-th feature extraction block among the M feature extraction blocks with the third feature information extracted by the i-th feature extraction block; if i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block in the M feature extraction blocks.
  • the feature extraction block includes: a first feature extraction unit and S second feature extraction units connected in series, wherein S is a positive integer;
  • the first extraction unit in the i+1th feature extraction block is used to search for K neighboring points of the current point for the current point in the point cloud block, and based on For the ith fourth feature information of the point cloud block, the fourth feature information of the current point is subtracted from the fourth feature information of the adjacent point to obtain K residual feature information, and the The K residual feature information is concatenated with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point, and according to the i-th concatenated feature information of the current point, the The i-th concatenated feature information of the point cloud block, and input the i-th concatenated feature information of the point cloud block into the first second feature extraction unit in the S second feature extraction units;
  • the first second feature extraction unit is used to output the first fifth feature information to the second second feature extraction unit according to the i-th cascaded feature information of the point cloud block, wherein the point cloud
  • the i+1th third feature information of the block is the fifth feature information output by the last second feature extraction unit among the S second feature extraction units.
  • the second feature extraction unit includes P residual blocks, where P is a positive integer
  • the j+1th residual block in the sth second feature extraction unit is used according to the jth residual in the sth second feature extraction unit
  • the j-th first residual information output by the block and the fifth feature information input to the s-th second feature extraction unit, and the j+1-th first residual information is output, wherein the j is less than P A positive integer, the s is a positive integer less than or equal to S;
  • the fifth feature information output by the sth second feature extraction unit is based on the first residual information output by at least one residual block in the sth second feature extraction unit, and input to the sth second feature extraction unit determined by the fifth feature information of the second feature extraction unit.
• the up-sampling unit 23 is further configured to add the jth first residual information output by the jth residual block in the sth second feature extraction unit to the fifth feature information input to the sth second feature extraction unit, and then input the result into the j+1th residual block in the sth second feature extraction unit.
• the fifth feature information output by the sth second feature extraction unit is determined according to the feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and the fifth feature information input to the sth second feature extraction unit, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the sth second feature extraction unit.
• the fifth feature information output by the sth second feature extraction unit is determined by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and then adding the concatenated feature information to the fifth feature information input to the sth second feature extraction unit.
  • the second feature extraction unit further includes a gating unit
• the gating unit in the sth second feature extraction unit is used to perform de-redundancy on the feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and to output the de-redundant feature information; the fifth feature information output by the sth second feature extraction unit is determined by adding the de-redundant feature information to the fifth feature information input to the sth second feature extraction unit.
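By way of illustration only, one possible reading of a second feature extraction unit, with its chained residual blocks, concatenation of intermediate residual outputs, gating (modelled here as a plain learned projection, which is an assumption), and final skip connection to the unit input, is:

```python
import numpy as np

def second_feature_extraction_unit(x, block_ws, gate_w):
    # x: (N, C) fifth feature information input to the unit (shape assumed)
    outs, h = [], x
    for j, w in enumerate(block_ws):
        inp = h if j == 0 else h + x   # later blocks also receive the unit input
        h = inp @ w                    # "first residual information" of block j
        outs.append(h)
    cat = np.concatenate(outs, axis=1) # concatenate the P residual outputs
    deredundant = cat @ gate_w         # gating unit, modelled as a projection
    return deredundant + x             # final skip: add the unit's input
```

The projection restores the original channel width, so the gated output can be added directly to the unit input; the actual gating operation of the embodiment may differ.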
  • the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule;
• the feature upsampling submodule is used to copy r copies of the first feature information of the point cloud block according to the preset upsampling rate r, and to append an n-dimensional vector to the feature dimension of each copied first feature information, to obtain the upsampling feature information of the point cloud block, and input the upsampling feature information of the point cloud block into the feature extraction submodule, wherein the values of the n-dimensional vectors corresponding to different copies of the first feature information are different;
  • the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
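By way of illustration only, the replication-plus-distinct-vector scheme of the feature upsampling submodule can be sketched as follows (the grid-style per-copy codes are an assumption; the embodiment only requires that different copies receive different n-dimensional values):

```python
import numpy as np

def feature_upsample(features, r, n=2):
    # features: (N, C); result: (N*r, C+n) (shapes are assumptions)
    N, C = features.shape
    copies = np.repeat(features, r, axis=0)        # each feature copied r times
    codes = np.linspace(-1.0, 1.0, r)              # r distinct scalar codes
    grid = np.stack([codes] * n, axis=1)           # (r, n) distinct code rows
    grid = np.tile(grid, (N, 1))                   # tile the codes over all points
    return np.concatenate([copies, grid], axis=1)  # append codes to the copies
```

Appending distinct vectors makes the r otherwise identical copies distinguishable, so later layers can push them toward different spatial positions.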
  • the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer
• the k+1th third feature extraction unit is used to output the k+1th enhanced upsampling feature information of the point cloud block according to the extracted kth enhanced upsampling feature information of the point cloud block, and the k is a positive integer less than Q;
  • the second feature information of the point cloud block is the Qth enhanced upsampling feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
  • the third feature extraction unit includes L residual blocks, and the L is a positive integer
• the l+1th residual block is used to output the l+1th second residual information according to the lth second residual information output by the lth residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information input to the k+1th third feature extraction unit, where the l is a positive integer less than L;
• the k+1th enhanced upsampling feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information.
• the upsampling unit 23 is further configured to add the lth second residual information output by the lth residual block to the kth enhanced upsampling feature information, and then input the result into the l+1th residual block.
• the k+1th enhanced upsampling feature information of the point cloud block is determined according to the feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one residual block in the L-1 residual blocks, and the kth enhanced upsampling feature information, wherein the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the k+1th third feature extraction unit.
• the k+1th enhanced upsampling feature information of the point cloud block is determined by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one residual block in the L-1 residual blocks, and then adding the concatenated feature information to the kth enhanced upsampling feature information.
  • the third feature extraction unit further includes a gating unit
• the gating unit is used to perform de-redundancy on the feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second residual information output by at least one residual block in the L-1 residual blocks, and to output the de-redundant feature information;
  • the k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
  • the feature upsampling module further includes a first autocorrelation attention network
  • the first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
  • the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
  • the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
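By way of illustration only, a feature-interaction step of the kind attributed to the first autocorrelation attention network can be sketched as a single self-attention layer whose value projection reduces the feature dimension, matching the note above that the interacted features have a lower dimension (all projection shapes are assumptions):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    # x: (N, C_in) upsampled features; wv maps to a LOWER dimension C_out
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[1])        # scaled dot-product scores
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over all points
    return attn @ v                               # (N, C_out), C_out < C_in
```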
  • the geometry generation module includes a plurality of fully connected layers
  • the multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
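By way of illustration only, the fully connected geometry generation path can be sketched as a small MLP mapping each second feature vector to 3-D coordinates (the layer widths and the ReLU nonlinearity are assumptions of this sketch):

```python
import numpy as np

def geometry_from_features(feats, w1, b1, w2, b2):
    # feats: (N, C) second feature information (shape assumed)
    h = np.maximum(feats @ w1 + b1, 0.0)   # fully connected layer + ReLU
    return h @ w2 + b2                     # fully connected layer to (x, y, z)
```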
  • the geometry generation module includes: a geometry reconstruction unit, a filtering unit, and a downsampling unit;
  • the geometric reconstruction unit is used to geometrically reconstruct the second feature information of the point cloud block, and output the initial upsampling geometric information of the point cloud block to the filtering unit;
• the filtering unit is used to denoise the initial upsampling geometric information of the point cloud block, and to output the noise-filtered initial upsampling geometric information of the point cloud block to the downsampling unit;
  • the down-sampling unit is configured to down-sample the initial up-sampled geometric information of the point cloud block after filtering noise to a target up-sampling rate, and output the up-sampled geometric information of the point cloud block.
  • the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
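By way of illustration only, one way to realise the downsampling unit is farthest point sampling, which keeps an evenly spread subset of the filtered points; the embodiment does not name a specific downsampling method, so this choice is an assumption:

```python
import numpy as np

def farthest_point_sample(pts, m):
    # pts: (N, 3) filtered geometry; keep m points covering the cloud evenly
    chosen = [0]                                 # start from an arbitrary point
    d = ((pts - pts[0]) ** 2).sum(axis=1)        # sq. distance to the chosen set
    for _ in range(m - 1):
        nxt = int(np.argmax(d))                  # farthest remaining point
        chosen.append(nxt)
        d = np.minimum(d, ((pts - pts[nxt]) ** 2).sum(axis=1))
    return pts[chosen]
```

To hit a target upsampling rate t on a block of N input points, m would be set to roughly t·N.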
  • the decoding unit 21 is further configured to: decode the point cloud code stream to obtain the target upsampling rate.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
• the point cloud decoder 20 shown in FIG. 17 may correspond to the corresponding subject in the point cloud decoding method of the embodiment of the present application, and the aforementioned and other operations and/or functions of each unit in the point cloud decoder 20 are respectively used to realize the corresponding processes in the point cloud decoding method; for the sake of brevity, details are not repeated here.
  • Fig. 18 is a schematic block diagram of a point cloud upsampling device provided by an embodiment of the present application.
• the point cloud upsampling device 40 includes:
• An acquisition unit 41 configured to acquire the geometric information of the point cloud to be upsampled;
  • a division unit 42 configured to divide the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled;
  • An up-sampling unit 43 configured to input the geometric information of the point cloud block into the generator for up-sampling, to obtain the up-sampling geometric information of the point cloud block;
• the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into a geometric space, so as to obtain the upsampling geometric information of the point cloud block.
  • the feature extraction module comprises densely connected M feature extraction blocks
• the i+1th feature extraction block in the M feature extraction blocks is used to output the i+1th third feature information according to the input ith fourth feature information, the ith fourth feature information is determined according to the ith third feature information output by the ith feature extraction block, the first feature information of the point cloud block is determined according to the Mth third feature information output by the Mth feature extraction block in the M feature extraction blocks, and the i is a positive integer smaller than M.
• if the i is not equal to 1, the ith fourth feature information is obtained by concatenating the third feature information extracted by each feature extraction block located before the ith feature extraction block among the M feature extraction blocks with the ith third feature information extracted by the ith feature extraction block.
• if the i is equal to 1, the ith fourth feature information is the first third feature information output by the first feature extraction block in the M feature extraction blocks.
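By way of illustration only, the dense connection pattern described above, in which each later feature extraction block consumes the concatenated outputs of all earlier blocks, can be sketched as follows (blocks are modelled as plain linear maps, which is an assumption):

```python
import numpy as np

def dense_feature_extraction(x, blocks):
    # blocks[0] consumes the block input; every later block consumes the
    # concatenation of ALL earlier block outputs (the dense connection)
    outs = [blocks[0](x)]
    for blk in blocks[1:]:
        outs.append(blk(np.concatenate(outs, axis=1)))
    return outs[-1]    # the Mth third feature information
```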
  • the feature extraction block includes: a first feature extraction unit and S second feature extraction units connected in series, wherein S is a positive integer;
• the first feature extraction unit in the i+1th feature extraction block is used to search, for the current point in the point cloud block, for K neighboring points of the current point; based on the ith fourth feature information of the point cloud block, subtract the fourth feature information of the current point from the fourth feature information of each neighboring point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain the ith concatenated feature information of the current point; obtain the ith concatenated feature information of the point cloud block according to the ith concatenated feature information of the current point; and input the ith concatenated feature information of the point cloud block into the first second feature extraction unit among the S second feature extraction units;
• the first second feature extraction unit is used to output the first fifth feature information to the second second feature extraction unit according to the ith concatenated feature information of the point cloud block, wherein the i+1th third feature information of the point cloud block is the fifth feature information output by the last second feature extraction unit among the S second feature extraction units.
  • the second feature extraction unit includes P residual blocks, where P is a positive integer
• the j+1th residual block in the sth second feature extraction unit is used to output the j+1th first residual information according to the jth first residual information output by the jth residual block in the sth second feature extraction unit and the fifth feature information input to the sth second feature extraction unit, wherein the j is a positive integer less than P, and the s is a positive integer less than or equal to S;
• the fifth feature information output by the sth second feature extraction unit is determined according to the first residual information output by at least one residual block in the sth second feature extraction unit and the fifth feature information input to the sth second feature extraction unit.
• the up-sampling unit 43 is further configured to add the jth first residual information output by the jth residual block in the sth second feature extraction unit to the fifth feature information input to the sth second feature extraction unit, and then input the result into the j+1th residual block in the sth second feature extraction unit.
• the fifth feature information output by the sth second feature extraction unit is determined according to the feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and the fifth feature information input to the sth second feature extraction unit, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the sth second feature extraction unit.
• the fifth feature information output by the sth second feature extraction unit is determined by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and then adding the concatenated feature information to the fifth feature information input to the sth second feature extraction unit.
  • the second feature extraction unit further includes a gating unit
• the gating unit in the sth second feature extraction unit is used to perform de-redundancy on the feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one residual block in the P-1 residual blocks, and to output the de-redundant feature information; the fifth feature information output by the sth second feature extraction unit is determined by adding the de-redundant feature information to the fifth feature information input to the sth second feature extraction unit.
  • the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule;
• the feature upsampling submodule is used to copy r copies of the first feature information of the point cloud block according to the preset upsampling rate r, and to append an n-dimensional vector to the feature dimension of each copied first feature information, to obtain the upsampling feature information of the point cloud block, and input the upsampling feature information of the point cloud block into the feature extraction submodule, wherein the values of the n-dimensional vectors corresponding to different copies of the first feature information are different;
  • the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
  • the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer
• the k+1th third feature extraction unit is used to output the k+1th enhanced upsampling feature information of the point cloud block according to the extracted kth enhanced upsampling feature information of the point cloud block, and the k is a positive integer less than Q;
  • the second feature information of the point cloud block is the Qth enhanced upsampling feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
  • the third feature extraction unit includes L residual blocks, and the L is a positive integer
• the l+1th residual block is used to output the l+1th second residual information according to the lth second residual information output by the lth residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information input to the k+1th third feature extraction unit, where the l is a positive integer less than L;
• the k+1th enhanced upsampling feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information.
• the upsampling unit 43 is further configured to add the lth second residual information output by the lth residual block to the kth enhanced upsampling feature information, and then input the result into the l+1th residual block.
• the k+1th enhanced upsampling feature information of the point cloud block is determined according to the feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one residual block in the L-1 residual blocks, and the kth enhanced upsampling feature information, wherein the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the k+1th third feature extraction unit.
• the k+1th enhanced upsampling feature information of the point cloud block is determined by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one residual block in the L-1 residual blocks, and then adding the concatenated feature information to the kth enhanced upsampling feature information.
  • the third feature extraction unit further includes a gating unit
• the gating unit is used to perform de-redundancy on the feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second residual information output by at least one residual block in the L-1 residual blocks, and to output the de-redundant feature information;
  • the k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
  • the feature upsampling module further includes a first autocorrelation attention network
  • the first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
  • the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
  • the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
  • the geometry generation module includes a plurality of fully connected layers
  • the multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
  • the geometry generation module includes: a geometry reconstruction unit, a filtering unit, and a downsampling unit;
  • the geometric reconstruction unit is used to geometrically reconstruct the second feature information of the point cloud block, and output the initial upsampling geometric information of the point cloud block to the filtering unit;
• the filtering unit is used to denoise the initial upsampling geometric information of the point cloud block, and to output the noise-filtered initial upsampling geometric information of the point cloud block to the downsampling unit;
  • the down-sampling unit is configured to down-sample the initial up-sampled geometric information of the point cloud block after filtering noise to a target up-sampling rate, and output the up-sampled geometric information of the point cloud block.
  • the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
• the point cloud upsampling device 40 shown in FIG. 18 may correspond to the corresponding subject in the point cloud upsampling method of the embodiment of the present application, and the aforementioned and other operations and/or functions of each unit in the point cloud upsampling device 40 are respectively used to realize the corresponding processes in the point cloud upsampling method; for the sake of brevity, no more details are given here.
  • Fig. 19 is a schematic block diagram of a model training device provided by an embodiment of the present application.
  • the model training device 10 includes:
• An acquisition unit 11 configured to acquire the geometric information of a training point cloud;
  • a division unit 12 configured to divide the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud
  • the training unit 13 is used to input the geometric information of the training point cloud block into the feature extraction module of the generator for feature extraction, and obtain the first feature information of the training point cloud block; the first feature information of the training point cloud block is Feature information is input into the feature upsampling module of the generator for upsampling to obtain the second feature information of the training point cloud block; the second feature information of the training point cloud block is input into the geometry generation module of the generator Perform geometric reconstruction to obtain the predicted upsampling geometric information of the training point cloud block; according to the predicted upsampling geometric information of the training point cloud block, the feature extraction module, feature upsampling module and geometric generation in the generator The module is trained to obtain the trained generator.
• the training unit 13 is specifically configured to input the predicted upsampled geometric information of the training point cloud block into a discriminator to obtain the first discrimination result of the discriminator, where the discriminator is used to judge whether the data input into the discriminator is the upsampled true value of the training point cloud block; and to train the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the first discrimination result of the discriminator, to obtain the trained generator.
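By way of illustration only, the adversarial objective implied by the discriminator can be sketched as a least-squares GAN generator loss combined with a reconstruction term (both the least-squares form and the weighting `lam` are assumptions of this sketch, not stated by the embodiment):

```python
import numpy as np

def generator_adversarial_loss(disc_scores_fake, recon_err, lam=1.0):
    # disc_scores_fake: discriminator scores on the generator's upsampled
    # output; the generator is pushed to make them look "real" (score 1)
    adv = np.mean((disc_scores_fake - 1.0) ** 2)   # least-squares GAN form (assumed)
    return adv + lam * recon_err                   # plus a reconstruction term
```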
  • the feature extraction module includes M densely connected feature extraction blocks
• the training unit 13 is specifically configured to input the geometric information of the training point cloud block into the feature extraction module to obtain the ith third feature information of the training point cloud block extracted by the ith feature extraction block in the M feature extraction blocks, where the i is a positive integer less than M; obtain the ith fourth feature information of the training point cloud block according to the ith third feature information of the training point cloud block; input the ith fourth feature information of the training point cloud block into the i+1th feature extraction block to obtain the i+1th third feature information of the training point cloud block; and use the Mth third feature information of the training point cloud block extracted by the Mth feature extraction block as the first feature information of the training point cloud block.
• the training unit 13 is specifically configured to, if the i is not equal to 1, obtain the third feature information extracted by each feature extraction block located before the ith feature extraction block among the M feature extraction blocks, and concatenate the third feature information extracted by each feature extraction block located before the ith feature extraction block with the third feature information extracted by the ith feature extraction block, as the ith fourth feature information of the training point cloud block;
• if the i is equal to 1, the first third feature information extracted by the first feature extraction block in the M feature extraction blocks is used as the ith fourth feature information of the training point cloud block.
• the feature extraction block includes: a first feature extraction unit and at least one second feature extraction unit connected in series, and the training unit 13 is specifically configured to input the ith fourth feature information of the training point cloud block into the first feature extraction unit in the i+1th feature extraction block, so that the first feature extraction unit searches, for the current point in the training point cloud block, for K neighboring points of the current point, subtracts the fourth feature information of the current point from the fourth feature information of each neighboring point to obtain K pieces of residual feature information, concatenates the K pieces of residual feature information with the fourth feature information of the current point to obtain the ith concatenated feature information of the current point, and obtains the ith concatenated feature information of the training point cloud block according to the ith concatenated feature information of the current point; input the ith concatenated feature information of the training point cloud block into the first second feature extraction unit in the i+1th feature extraction block to obtain the first fifth feature information, and input the first fifth feature information into the second second feature extraction unit in the i+1th feature extraction block to obtain the second fifth feature information; and use the fifth feature information extracted by the last second feature extraction unit in the i+1th feature extraction block as the i+1th third feature information of the training point cloud block.
  • the second feature extraction unit includes P residual blocks, where P is a positive integer
• the training unit 13 is specifically configured to input the ith concatenated feature information into the first second feature extraction unit in the i+1th feature extraction block to obtain the first residual information output by the jth residual block in the first second feature extraction unit, where the j is a positive integer less than or equal to P; input the first residual information output by the jth residual block and the ith concatenated feature information into the j+1th residual block in the first second feature extraction unit to obtain the first residual information output by the j+1th residual block; and determine the fifth feature information output by the first second feature extraction unit according to the first residual information output by at least one of the P residual blocks in the first second feature extraction unit and the ith concatenated feature information.
• the training unit 13 is specifically configured to add the first residual information output by the jth residual block to the ith concatenated feature information, and input the added feature information into the j+1th residual block to obtain the first residual information output by the j+1th residual block.
• the training unit 13 is specifically configured to concatenate the first residual information output by the last residual block in the P residual blocks with the first residual information output by at least one residual block in the P-1 residual blocks, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks; and determine the fifth feature information output by the first second feature extraction unit according to the concatenated feature information and the ith concatenated feature information.
• the training unit 13 is specifically configured to add the concatenated feature information to the ith concatenated feature information as the fifth feature information output by the first second feature extraction unit.
• the second feature extraction unit further includes a gating unit, and the training unit 13 is specifically configured to input the concatenated feature information into the gating unit for de-redundancy to obtain de-redundant feature information, and add the de-redundant feature information to the ith concatenated feature information as the fifth feature information output by the first second feature extraction unit.
• the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule, and the training unit 13 is specifically configured to input the first feature information of the training point cloud block into the feature upsampling submodule, so that the feature upsampling submodule copies r copies of the first feature information of the training point cloud block according to the preset upsampling rate r and appends an n-dimensional vector to the feature dimension of each copied first feature information, to obtain the upsampling feature information of the training point cloud block, wherein the values of the n-dimensional vectors corresponding to different copies of the first feature information are different; and input the upsampling feature information of the training point cloud block into the feature extraction submodule to obtain the second feature information of the training point cloud block extracted by the feature extraction submodule.
• the feature extraction submodule includes Q third feature extraction units connected in series, the Q is a positive integer, and the training unit 13 is specifically configured to input the upsampling feature information of the training point cloud block into the feature extraction submodule to obtain the kth enhanced upsampling feature information of the training point cloud block extracted by the kth third feature extraction unit; input the kth enhanced upsampling feature information of the training point cloud block into the k+1th third feature extraction unit to obtain the k+1th enhanced upsampling feature information of the training point cloud block extracted by the k+1th third feature extraction unit; and use the Qth enhanced upsampling feature information of the training point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units as the second feature information of the training point cloud block.
  • the third feature extraction unit includes L residual blocks, where L is a positive integer, and the training unit 12 is specifically configured to input the k-th enhanced upsampled feature information of the training point cloud block into the L residual blocks, and to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced upsampled feature information.
  • the training unit 12 is specifically configured to add the second residual information output by the l-th residual block and the k-th enhanced upsampled feature information, input the summed feature information into the (l+1)-th residual block, and determine the second residual information output by the (l+1)-th residual block.
  • the training unit 12 is specifically configured to concatenate the second residual information output by the last of the L residual blocks with the second residual information output by at least one of the remaining L-1 residual blocks, where the L-1 residual blocks are the residual blocks other than the last one among the L residual blocks; and to determine the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced upsampled feature information.
  • the training unit 12 is specifically configured to add the concatenated feature information and the k-th enhanced upsampled feature information to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block.
  • the third feature extraction unit further includes a gating unit, and the training unit 12 is specifically configured to input the concatenated feature information into the gating unit for redundancy removal to obtain de-redundant feature information; the de-redundant feature information is added to the k-th enhanced upsampled feature information to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block.
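The residual chain with concatenation and a final skip connection described in the bullets above can be sketched as follows (a minimal numpy stand-in: the linear ReLU residual block and the fusion matrix standing in for the gating/de-redundancy step are assumptions, only the data flow follows the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, W):
    """One hypothetical residual block: a linear map followed by ReLU."""
    return np.maximum(x @ W, 0.0)

def third_feature_extraction_unit(x, weights, W_fuse):
    """Sketch of one third feature extraction unit with L residual blocks.

    x: (N, C) k-th enhanced upsampled feature information.
    Each block receives the previous block's residual added to x; the
    residuals are concatenated, fused back to C dims (stand-in for the
    gating/de-redundancy step), and added to x.
    """
    residuals, inp = [], x
    for W in weights:                       # L residual blocks in series
        res = residual_block(inp, W)        # second residual information
        residuals.append(res)
        inp = res + x                       # sum fed to the next block
    cat = np.concatenate(residuals, axis=1)      # concatenated features
    fused = cat @ W_fuse                         # de-redundancy stand-in
    return fused + x                             # (k+1)-th feature info

N, C, L = 64, 16, 3
x = rng.standard_normal((N, C))
weights = [rng.standard_normal((C, C)) * 0.1 for _ in range(L)]
W_fuse = rng.standard_normal((L * C, C)) * 0.1
y = third_feature_extraction_unit(x, weights, W_fuse)   # shape (64, 16)
```

The output keeps the input's shape, so Q such units can be chained in series exactly as the feature extraction submodule requires.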
  • the feature upsampling module further includes a first autocorrelation attention network, and the training unit 12 is specifically configured to input the upsampled feature information of the training point cloud block into the first autocorrelation attention network for feature interaction to obtain the feature-interacted upsampled feature information of the training point cloud block, and to input the feature-interacted upsampled feature information into the feature extraction submodule for feature extraction to obtain the second feature information of the training point cloud block.
  • the feature dimension of the upsampled feature information of the training point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the training point cloud block.
  • the geometry generation module includes multiple fully connected layers, and the training unit 12 is specifically configured to input the second feature information of the training point cloud block into the multiple fully connected layers to obtain the predicted upsampled geometric information of the training point cloud block.
  • the geometry generation module includes a geometric reconstruction unit, a filter unit and a downsampling unit, and the training unit 12 is specifically configured to input the second feature information of the training point cloud block into the geometric reconstruction unit for geometric reconstruction to obtain the initial upsampled geometric information of the training point cloud block; to input the initial upsampled geometric information of the training point cloud block into the filter unit for noise removal to obtain noise-filtered initial upsampled geometric information; and to input the noise-filtered initial upsampled geometric information into the downsampling unit for downsampling to obtain the predicted upsampled geometric information of the training point cloud block.
  • the upsampling rate corresponding to the upsampling geometric information of the training point cloud block is less than or equal to the upsampling rate of the feature upsampling module.
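The text above does not fix the algorithm used by the downsampling unit; one common choice consistent with it is farthest point sampling, sketched here (the helper name is hypothetical, and the reconstruction/filtering stages are represented only by their output):

```python
import numpy as np

def farthest_point_sample(points: np.ndarray, m: int) -> np.ndarray:
    """Farthest point sampling: greedily keep the point farthest from the
    already-selected set, giving an evenly spread subset of m points."""
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        idx = int(np.argmax(dist))
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

# e.g. initial upsampled geometry produced at rate 4 (after reconstruction
# and filtering), downsampled to the target rate 2 as the final prediction
initial = np.random.rand(1024, 3)
pred = farthest_point_sample(initial, 512)   # shape (512, 3)
```

This matches the constraint that the final upsampling rate of the predicted geometry is less than or equal to the rate of the feature upsampling module.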
  • the discriminator is a pre-trained discriminator.
  • the training unit 12 is further configured to use the geometric information of the training point cloud block to train the discriminator.
  • the training unit 12 is specifically configured to input the predicted upsampling geometric information of the training point cloud block generated by the generator into the discriminator, and obtain a second discrimination result of the discriminator; Inputting the upsampling true value of the geometric information of the training point cloud block into the discriminator to obtain a third discriminant result of the discriminator; according to the second discriminant result and the third discriminant result, determine the discriminator The loss; according to the loss of the discriminator, the discriminator is trained.
  • the training unit 12 is specifically configured to determine the loss of the discriminator by using a least square loss function according to the second discrimination result and the third discrimination result.
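The least-squares discriminator loss described above can be written out as follows (a sketch under the usual LSGAN convention that real samples target 1 and generated samples target 0; the application names the loss family but not these targets):

```python
import numpy as np

def discriminator_ls_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    """Least-squares discriminator loss.

    d_real: discriminator scores for ground-truth upsampled geometry
            (the third discrimination result), pushed toward 1.
    d_fake: scores for generated upsampled geometry
            (the second discrimination result), pushed toward 0.
    """
    return 0.5 * float(np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))

# A perfectly separating discriminator has zero loss:
loss = discriminator_ls_loss(np.array([1.0, 1.0]), np.array([0.0, 0.0]))
```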
  • the discriminator includes a global discrimination module, a boundary discrimination module and a fully connected module, and the training unit 12 is specifically configured to obtain the geometric information of the boundary points of the target point cloud block; to input the geometric information of the boundary points into the boundary discrimination module to obtain the boundary feature information of the target point cloud block; and to input the geometric information of the target point cloud block into the global discrimination module to obtain the global feature information of the target point cloud block.
  • the training unit 12 is specifically configured to use a high-pass graph filter to extract the geometric information of the boundary points of the target point cloud block.
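One way a high-pass filter can pick out boundary points, consistent with the bullet above, is to score each point by its distance to the mean of its k nearest neighbours, i.e. the response of (I − A)x on the kNN graph; the value of k and the retained ratio below are assumptions for illustration:

```python
import numpy as np

def boundary_points(points: np.ndarray, k: int = 8, ratio: float = 0.1) -> np.ndarray:
    """Sketch of boundary extraction via a high-pass operation on the kNN graph.

    Each point's response is its distance to the centroid of its k nearest
    neighbours; points with the largest responses lie on sharp/boundary
    regions and their geometric information is returned.
    """
    n = points.shape[0]
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]                  # k nearest neighbours
    response = np.linalg.norm(points - points[nn].mean(axis=1), axis=1)
    m = max(1, int(ratio * n))
    return points[np.argsort(response)[-m:]]           # strongest responses

pts = np.random.rand(200, 3)
bd = boundary_points(pts, k=8, ratio=0.1)              # shape (20, 3)
```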
  • the training unit 12 is specifically configured to concatenate the global feature information and the boundary feature information of the target point cloud block, and to input the concatenated global feature information and boundary feature information into the fully connected module to obtain the target discrimination result of the discriminator.
  • the global discrimination module includes, sequentially along the network depth direction: a first number of multi-layer perceptrons, a first maximum pooling layer, a second autocorrelation attention network, a second number of multi-layer perceptrons and a second maximum pooling layer; the training unit 12 is specifically configured to input the geometric information of the target point cloud block into the first number of multi-layer perceptrons for feature extraction to obtain the first global feature information of the target point cloud block; to input the first global feature information into the first maximum pooling layer for dimensionality reduction to obtain the second global feature information of the target point cloud block; to input the first global feature information and the second global feature information into the second autocorrelation attention network for feature interaction to obtain the third global feature information of the target point cloud block; to input the third global feature information into the second number of multi-layer perceptrons for further feature extraction to obtain the fourth global feature information of the target point cloud block; and to input the fourth global feature information into the second maximum pooling layer for dimensionality reduction to obtain the global feature information of the target point cloud block.
  • the training unit 12 is specifically configured to concatenate the first global feature information and the second global feature information, and to input the concatenated first global feature information and second global feature information into the second autocorrelation attention network for feature interaction to obtain the third global feature information of the target point cloud block.
  • the first quantity is equal to the second quantity.
  • the first number and the second number are both equal to 2.
  • the first number of multi-layer perceptrons includes a first-layer multi-layer perceptron and a second-layer multi-layer perceptron, and the second number of multi-layer perceptrons includes a third-layer multi-layer perceptron and a fourth-layer multi-layer perceptron; the feature dimensions of the first-layer, second-layer, third-layer and fourth-layer multi-layer perceptrons increase sequentially.
  • the feature dimension of the first-layer multi-layer perceptron is 32, the feature dimension of the second-layer multi-layer perceptron is 64, the feature dimension of the third-layer multi-layer perceptron is 128, and the feature dimension of the fourth-layer multi-layer perceptron is 256.
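The dimension flow of the global discrimination module (3 → 32 → 64 → max pool → attention over per-point features concatenated with the pooled vector → 128 → 256 → max pool) can be sketched with random placeholder weights; only the shapes follow the text above, while the shared-MLP form, the attention stand-in and the weight values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def shared_mlp(x, w):
    """Pointwise MLP layer shared across all points (ReLU activation)."""
    return np.maximum(x @ w, 0.0)

def self_attention(x):
    """Minimal self-attention stand-in for the autocorrelation network."""
    a = x @ x.T / np.sqrt(x.shape[1])
    a = np.exp(a - a.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)
    return a @ x

def global_discrimination_module(xyz):
    """xyz: (N, 3) geometric information of the target point cloud block."""
    f1 = shared_mlp(shared_mlp(xyz, rng.standard_normal((3, 32))),
                    rng.standard_normal((32, 64)))          # first global (N, 64)
    g = f1.max(axis=0)                                      # second global (64,)
    mixed = self_attention(np.concatenate(
        [f1, np.broadcast_to(g, f1.shape)], axis=1))        # third global (N, 128)
    f2 = shared_mlp(shared_mlp(mixed, rng.standard_normal((128, 128))),
                    rng.standard_normal((128, 256)))        # fourth global (N, 256)
    return f2.max(axis=0)                                   # pooled (256,)

feat = global_discrimination_module(np.random.rand(100, 3))  # shape (256,)
```

The boundary discrimination module follows the same pattern with its own layer widths (32, 64, 128, 192).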
  • the boundary discrimination module includes, sequentially along the network depth direction: a third number of multi-layer perceptrons, a third maximum pooling layer, a third autocorrelation attention network, a fourth number of multi-layer perceptrons and a fourth maximum pooling layer; the training unit 12 is specifically configured to input the geometric information of the boundary points of the target point cloud block into the third number of multi-layer perceptrons for feature extraction to obtain the first boundary feature information of the target point cloud block; to input the first boundary feature information into the third maximum pooling layer for dimensionality reduction to obtain the second boundary feature information of the target point cloud block; to input the first boundary feature information and the second boundary feature information into the third autocorrelation attention network for feature interaction to obtain the third boundary feature information of the target point cloud block; to input the third boundary feature information into the fourth number of multi-layer perceptrons for feature extraction to obtain the fourth boundary feature information of the target point cloud block; and to input the fourth boundary feature information into the fourth maximum pooling layer for dimensionality reduction to obtain the boundary feature information of the target point cloud block.
  • the training unit 12 is specifically configured to concatenate the first boundary feature information and the second boundary feature information, and to input the concatenated first boundary feature information and second boundary feature information into the third autocorrelation attention network for feature interaction to obtain the third boundary feature information of the target point cloud block.
  • the third quantity is equal to the fourth quantity.
  • both the third quantity and the fourth quantity are equal to 2.
  • the third number of multi-layer perceptrons includes a fifth-layer multi-layer perceptron and a sixth-layer multi-layer perceptron, and the fourth number of multi-layer perceptrons includes a seventh-layer multi-layer perceptron and an eighth-layer multi-layer perceptron; the feature dimensions of the fifth-layer, sixth-layer, seventh-layer and eighth-layer multi-layer perceptrons increase sequentially.
  • the feature dimension of the eighth-layer multi-layer perceptron is greater than or equal to the feature dimension of the seventh-layer multi-layer perceptron, and smaller than or equal to the feature dimension of the fourth-layer multi-layer perceptron.
  • the feature dimension of the fifth-layer multi-layer perceptron is 32, the feature dimension of the sixth-layer multi-layer perceptron is 64, the feature dimension of the seventh-layer multi-layer perceptron is 128, and the feature dimension of the eighth-layer multi-layer perceptron is 192.
  • the training unit 12 is specifically configured to determine the first loss of the generator according to the first discrimination result, and to determine the parameter matrices of the feature extraction module, the feature upsampling module and the geometry generation module of the generator according to the first loss.
  • the training unit 12 is specifically configured to determine the first loss of the generator by using a least squares loss function according to the first discrimination result.
  • the training unit 12 is specifically configured to determine at least one second loss of the generator; according to the first loss of the generator and at least one second loss of the generator, determine the generation The target loss of the generator; according to the target loss of the generator, determine the parameter matrix of the feature extraction module, feature upsampling module and geometry generation module in the generator.
  • the training unit 12 is specifically configured to determine a second loss of the generator.
  • the training unit 12 is specifically configured to downsample the upsampled geometric information of the training point cloud block to obtain a downsampled training point cloud block with the same number of points as the training point cloud block, and to determine a second loss of the generator according to the geometric information of the downsampled training point cloud block and the geometric information of the training point cloud block, using the Earth Mover's Distance method.
  • the training unit 12 is specifically configured to determine a second loss of the generator according to the following formula:

    L_id = min_{φ: P_low → P_ori} Σ_{x_k ∈ P_low} ||x_k − φ(x_k)||_2

    where L_id is the second loss of the generator, P_ori is the training point cloud block, P_low is the downsampled training point cloud block, φ: P_low → P_ori denotes the bijection between P_low and P_ori that moves the points of P_low onto the points of P_ori with the minimum total distance between the two point sets, x_k is the k-th point in P_low, and φ(x_k) is the point in P_ori corresponding to x_k.
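For equal-size point sets, the bijection φ above is exactly the solution of a linear assignment problem over pairwise distances, so the second loss can be computed as follows (an exact but O(n³) sketch; a production trainer would typically use an approximate, differentiable EMD solver):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_loss(p_low: np.ndarray, p_ori: np.ndarray) -> float:
    """Earth Mover's Distance between two equal-size (n, 3) point sets.

    Builds the pairwise distance matrix and finds the bijection with
    minimum total matching cost via the Hungarian algorithm.
    """
    cost = np.linalg.norm(p_low[:, None, :] - p_ori[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].sum())

p = np.random.rand(64, 3)
# Identical sets match point-to-point with zero cost; a pure translation
# by t yields exactly n * ||t|| under the optimal matching.
zero = emd_loss(p, p)
shifted = emd_loss(p, p + np.array([1.0, 0.0, 0.0]))
```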
  • the training unit 12 is specifically configured to determine at least one second loss of the generator according to a uniform loss function.
  • the training unit 12 is specifically configured to use a weighted average of the first loss of the generator and the at least one second loss to determine the target loss of the generator.
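The weighted combination of the first loss and the at least one second loss can be sketched as a weighted average (the weight values are hyperparameters not fixed by the application):

```python
import numpy as np

def generator_target_loss(first_loss: float, second_losses, weights=None) -> float:
    """Weighted average of the generator's adversarial (first) loss and its
    reconstruction/uniformity (second) losses; equal weights by default."""
    losses = np.array([first_loss, *second_losses], dtype=float)
    weights = np.ones_like(losses) if weights is None else np.asarray(weights, dtype=float)
    return float((weights * losses).sum() / weights.sum())

# e.g. adversarial loss 0.2, EMD loss 0.5, uniform loss 0.1, EMD weighted 2x
total = generator_target_loss(0.2, [0.5, 0.1], weights=[1.0, 2.0, 1.0])
```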
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • the point cloud upsampling device 10 shown in FIG. 18 may correspond to the corresponding subject in the model training method of the embodiment of the present application, and the foregoing and other operations and/or functions of each unit are respectively intended to realize the corresponding processes in the methods such as the model training method; for the sake of brevity, details are not repeated here.
  • the functional unit may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software units.
  • each step of the method embodiments in the embodiments of the present application can be completed by an integrated logic circuit of the hardware in the processor and/or instructions in the form of software, and the steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software units in the decoding processor.
  • the software unit may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Fig. 20 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 30 may be the point cloud upsampling device described in the embodiment of the present application, or a point cloud decoder, or a model training device, and the electronic device 30 may include:
  • a memory 33 and a processor 32 the memory 33 is used to store a computer program 34 and transmit the program code 34 to the processor 32 .
  • the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
  • the processor 32 can be used to execute the steps in the above-mentioned method 200 according to the instructions in the computer program 34 .
  • the processor 32 may include, but is not limited to:
  • a Digital Signal Processor (DSP)
  • an Application Specific Integrated Circuit (ASIC)
  • a Field Programmable Gate Array (FPGA)
  • the memory 33 includes but is not limited to:
  • non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electronically programmable Erase Programmable Read-Only Memory (Electrically EPROM, EEPROM) or Flash.
  • the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
  • Static Random Access Memory (SRAM)
  • Dynamic Random Access Memory (DRAM)
  • Synchronous Dynamic Random Access Memory (SDRAM)
  • Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM)
  • Enhanced Synchronous Dynamic Random Access Memory (ESDRAM)
  • Synchlink Dynamic Random Access Memory (SLDRAM)
  • Direct Rambus Random Access Memory (DR RAM)
  • the computer program 34 can be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the methods provided in the present application.
  • the one or more units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30 .
  • the electronic device 30 may also include:
  • a transceiver 33, and the transceiver 33 can be connected to the processor 32 or the memory 33.
  • the processor 32 can control the transceiver 33 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 33 may include a transmitter and a receiver.
  • the transceiver 33 may further include antennas, and the number of antennas may be one or more.
  • bus system includes not only a data bus, but also a power bus, a control bus and a status signal bus.
  • the present application also provides a computer storage medium, on which a computer program is stored, and when the computer program is executed by a computer, the computer can execute the methods of the above method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transferred from a website, computer, server, or data center by wire (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (such as infrared, wireless, microwave, etc.) to another website site, computer, server or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (DVD)), or a semiconductor medium (such as a solid state disk (SSD)), etc.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other division methods in actual implementation.
  • multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

Abstract

The present application provides point cloud decoding and upsampling and model training methods and apparatus. The point cloud decoding method comprises: obtaining geometric information of a point cloud; according to the geometric information of the point cloud, dividing the point cloud into at least one point cloud block; and inputting the geometric information of the point cloud blocks into a generator for upsampling, so as to obtain upsampled geometric information of the point cloud blocks, wherein the generator comprises: a feature extraction module, a feature upsampling module, and a geometric generation module, the feature extraction module is used to extract first feature information of the point cloud blocks, the feature upsampling module is used to upsample the first feature information of the point cloud blocks to second feature information, and the geometric generation module is used to map the second feature information of the point cloud blocks into a geometric space, so as to obtain the upsampled geometric information of the point cloud blocks. That is, the generator of the present application is a deep learning-based generator, which is used to generate a high-precision point cloud with high accuracy.

Description

Point cloud decoding, upsampling and model training method and device

Technical Field

The present application relates to the field of point cloud technology, and in particular to a point cloud decoding, upsampling and model training method and device.

Background

The surface of an object is collected by a collection device to form point cloud data, which includes hundreds of thousands of points or more. During video production, point cloud data is transmitted between a point cloud encoding device and a point cloud decoding device in the form of point cloud media files. However, such a large number of points brings challenges to transmission; therefore, the point cloud encoding device needs to compress the point cloud data before transmission.

The point cloud decoding end decodes the point cloud code stream to obtain a reconstructed point cloud. However, in some application scenarios, a high-quality point cloud with higher precision than the original is needed. For example, in the field of autonomous driving, the sparse point clouds collected by radar require a large amount of post-processing to improve their precision, so as to improve driving safety. However, current point cloud upsampling methods have poor upsampling effects and low accuracy.

Summary
The embodiments of the present application provide a point cloud decoding, upsampling and model training method and device, so as to improve the accuracy of point cloud upsampling.

In a first aspect, an embodiment of the present application provides a point cloud decoding method, including:

decoding a point cloud code stream to obtain geometric information of a point cloud;

dividing the point cloud into at least one point cloud block according to the geometric information of the point cloud;

inputting the geometric information of the point cloud block into a generator for upsampling to obtain upsampled geometric information of the point cloud block;

wherein the generator includes a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into a geometric space to obtain the upsampled geometric information of the point cloud block.
In a second aspect, the present application provides a point cloud upsampling method, including:

obtaining geometric information of a point cloud to be upsampled;

dividing the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled;

inputting the geometric information of the point cloud block into a generator for upsampling to obtain upsampled geometric information of the point cloud block;

wherein the generator includes a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into a geometric space to obtain the upsampled geometric information of the point cloud block.
In a third aspect, the present application provides a model training method, including:

obtaining geometric information of a training point cloud, and dividing the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud;

inputting the geometric information of the training point cloud block into a feature extraction module of a generator for feature extraction to obtain first feature information of the training point cloud block;

inputting the first feature information of the training point cloud block into a feature upsampling module of the generator for upsampling to obtain second feature information of the training point cloud block;

inputting the second feature information of the training point cloud block into a geometry generation module of the generator for geometric reconstruction to obtain predicted upsampled geometric information of the training point cloud block;

training the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the predicted upsampled geometric information of the training point cloud block to obtain a trained generator.
第四方面,提供了一种点云解码器,用于执行上述第一方面或其各实现方式中的方法。具体地,该点云解码器包括用于执行上述第一方面或其各实现方式中的方法的功能单元。In a fourth aspect, a point cloud decoder is provided, configured to execute the method in the above first aspect or various implementations thereof. Specifically, the point cloud decoder includes a functional unit for executing the method in the above first aspect or its implementations.
第五方面,提供了一种点云解码器,包括处理器和存储器。该存储器用于存储计算机程序,该处 理器用于调用并运行该存储器中存储的计算机程序,以执行上述第一方面或其各实现方式中的方法。In a fifth aspect, a point cloud decoder is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method in the above first aspect or its various implementations.
第六方面,提供了一种点云上采样装置,用于执行上述第二方面或其各实现方式中的方法。具体地,该点云上采样装置包括用于执行上述第二方面或其各实现方式中的方法的功能单元。In a sixth aspect, a device for upsampling a point cloud is provided, configured to execute the method in the above second aspect or its various implementations. Specifically, the point cloud upsampling device includes a functional unit for executing the method in the above second aspect or its various implementations.
第七方面,提供了一种点云上采样设备,包括处理器和存储器。该存储器用于存储计算机程序,该处理器用于调用并运行该存储器中存储的计算机程序,以执行上述第二方面或其各实现方式中的方法。In a seventh aspect, a point cloud upsampling device is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to invoke and run the computer program stored in the memory, so as to execute the method in the above second aspect or its various implementations.
第八方面,提供了一种模型训练装置,用于执行上述第三方面或其各实现方式中的方法。具体地,该模型训练装置包括用于执行上述第三方面或其各实现方式中的方法的功能单元。In an eighth aspect, a model training device is provided, configured to execute the method in the above third aspect or various implementations thereof. Specifically, the model training device includes a functional unit for executing the method in the above third aspect or its various implementations.
第九方面,提供了一种模型训练设备,包括处理器和存储器。该存储器用于存储计算机程序,该处理器用于调用并运行该存储器中存储的计算机程序,以执行上述第三方面或其各实现方式中的方法。In a ninth aspect, a model training device is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to execute the method in the above third aspect or its various implementations.
第十方面，提供了一种芯片，用于实现上述第一方面至第三方面中的任一方面或其各实现方式中的方法。具体地，该芯片包括：处理器，用于从存储器中调用并运行计算机程序，使得安装有该芯片的设备执行如上述第一方面至第三方面中的任一方面或其各实现方式中的方法。In a tenth aspect, a chip is provided, configured to implement the method in any one of the foregoing first to third aspects or the implementations thereof. Specifically, the chip includes: a processor, configured to call and run a computer program from a memory, so that a device installed with the chip executes the method in any one of the above first to third aspects or the implementations thereof.
第十一方面,提供了一种计算机可读存储介质,用于存储计算机程序,该计算机程序使得计算机执行上述第一方面至第三方面中的任一方面或其各实现方式中的方法。In an eleventh aspect, there is provided a computer-readable storage medium for storing a computer program, and the computer program causes a computer to execute any one of the above-mentioned first to third aspects or the method in each implementation manner thereof.
第十二方面,提供了一种计算机程序产品,包括计算机程序指令,该计算机程序指令使得计算机执行上述第一方面至第三方面中的任一方面或其各实现方式中的方法。A twelfth aspect provides a computer program product, including computer program instructions, the computer program instructions cause a computer to execute any one of the above first to third aspects or the method in each implementation manner.
第十三方面,提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面至第三方面中的任一方面或其各实现方式中的方法。A thirteenth aspect provides a computer program, which, when running on a computer, causes the computer to execute any one of the above first to third aspects or the method in each implementation manner.
基于以上技术方案，通过点云的几何信息，将点云划分成至少一个点云块；将点云块的几何信息输入生成器中进行上采样，得到点云块的上采样几何信息；生成器包括：特征提取模块、特征上采样模块和几何生成模块，特征提取模块用于提取点云块的第一特征信息，特征上采样模块用于将点云块的第一特征信息上采样为第二特征信息，几何生成模块用于将点云块的第二特征信息映射至几何空间中，以得到点云块的上采样几何信息。即本申请实施例的生成器为基于深度学习的生成器，通过深度学习可以学习到点云的更多特征信息，进而使用该生成器进行点云上采样时，可以生成高精度的点云，且该高精度的点云的特征与点云的真值接近，进而提高了点云上采样的准确性。Based on the above technical solutions, the point cloud is divided into at least one point cloud block according to the geometric information of the point cloud, and the geometric information of each point cloud block is input into a generator for up-sampling to obtain the up-sampled geometric information of the block. The generator includes a feature extraction module, a feature up-sampling module and a geometry generation module: the feature extraction module extracts first feature information of the point cloud block, the feature up-sampling module up-samples the first feature information into second feature information, and the geometry generation module maps the second feature information into geometric space to obtain the up-sampled geometric information of the block. That is, the generator in the embodiments of the present application is based on deep learning, through which more feature information of the point cloud can be learned; when this generator is used for point cloud up-sampling, it can generate a high-precision point cloud whose features are close to the ground truth of the point cloud, thereby improving the accuracy of point cloud up-sampling.
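As an illustration only, the three-stage data flow just described (feature extraction, feature up-sampling, geometry generation) can be sketched as below; the fixed random linear maps are hypothetical stand-ins for the learned network modules, and the up-sampling ratio of 4 and feature dimension of 8 are made-up example values:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(points, dim=8):
    # Hypothetical stand-in for the learned feature extraction module:
    # a fixed random linear map from xyz to a C-dimensional feature.
    w = rng.standard_normal((3, dim))
    return points @ w                       # (N, 3) -> (N, C)

def upsample_features(feats, ratio=4):
    # Hypothetical stand-in for the feature up-sampling module:
    # duplicate each point's feature `ratio` times.
    return np.repeat(feats, ratio, axis=0)  # (N, C) -> (r*N, C)

def generate_geometry(feats):
    # Hypothetical stand-in for the geometry generation module:
    # a fixed random linear map from feature space back to xyz.
    w = rng.standard_normal((feats.shape[1], 3))
    return feats @ w                        # (r*N, C) -> (r*N, 3)

patch = rng.standard_normal((128, 3))       # one point cloud block, N = 128
up = generate_geometry(upsample_features(extract_features(patch)))
print(up.shape)                             # (512, 3): 4x as many points
```

The point to notice is only the shape contract between the three modules: N input points with 3 geometry channels become r*N output points with 3 geometry channels, with all learning happening in the intermediate feature space.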
附图说明Description of drawings
图1为本申请实施例涉及的一种点云编解码系统的示意性框图;FIG. 1 is a schematic block diagram of a point cloud encoding and decoding system involved in an embodiment of the present application;
图2是本申请实施例提供的点云编码器的示意性框图;Fig. 2 is a schematic block diagram of a point cloud encoder provided by an embodiment of the present application;
图3是本申请实施例提供的点云解码器的示意性框图;Fig. 3 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application;
图4为本申请一实施例提供的模型训练方法流程示意图;FIG. 4 is a schematic flow chart of a model training method provided by an embodiment of the present application;
图5为本申请实施例的生成器的一种网络示意图;FIG. 5 is a schematic diagram of a network of a generator according to an embodiment of the present application;
图6A为本申请实施例涉及的特征提取模块的一种结构示意图;FIG. 6A is a schematic structural diagram of a feature extraction module involved in an embodiment of the present application;
图6B为本申请实施例涉及的特征提取块的一种结构示意图;FIG. 6B is a schematic structural diagram of a feature extraction block involved in an embodiment of the present application;
图6C为本申请实施例涉及的第二特征提取单元HRA的一种结构示意图;FIG. 6C is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application;
图6D为本申请实施例涉及的残差块的一种结构示意图;FIG. 6D is a schematic structural diagram of a residual block involved in the embodiment of the present application;
图6E为本申请实施例涉及的第二特征提取单元HRA的一种结构示意图;FIG. 6E is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application;
图6F为本申请实施例涉及的第二特征提取单元HRA的一种结构示意图;FIG. 6F is a schematic structural diagram of the second feature extraction unit HRA involved in the embodiment of the present application;
图6G为本申请实施例涉及的门控单元的一种结构示意图;FIG. 6G is a schematic structural diagram of a gating unit involved in an embodiment of the present application;
图7A为本申请实施例涉及的特征上采样模块的一种结构示意图;FIG. 7A is a schematic structural diagram of a feature upsampling module involved in an embodiment of the present application;
图7B为本申请实施例涉及的特征上采样模块的另一种结构示意图;FIG. 7B is another schematic structural diagram of the feature upsampling module involved in the embodiment of the present application;
图7C为本申请实施例涉及的特征提取子模块的一种结构示意图;FIG. 7C is a schematic structural diagram of the feature extraction submodule involved in the embodiment of the present application;
图7D为本申请实施例提供的特征上采样模块的一种具体网络结构示意图;FIG. 7D is a schematic diagram of a specific network structure of the feature upsampling module provided by the embodiment of the present application;
图8为本申请实施例提供的几何生成模块的一种具体网络结构示意图;FIG. 8 is a schematic diagram of a specific network structure of the geometry generation module provided by the embodiment of the present application;
图9为本申请实施例涉及生成器的训练过程的一种示意图;FIG. 9 is a schematic diagram of a training process involving a generator according to an embodiment of the present application;
图10为本申请实施例涉及生成器的训练过程的另一种示意图;FIG. 10 is another schematic diagram of the training process involving the generator according to the embodiment of the present application;
图11为判别器的一种网络结构示意图;FIG. 11 is a schematic diagram of a network structure of a discriminator;
图12为本申请一实施例提供的模型训练方法的流程示意图;FIG. 12 is a schematic flowchart of a model training method provided by an embodiment of the present application;
图13为本申请实施例提供的判别器的一种具体网络结构示意图;FIG. 13 is a schematic diagram of a specific network structure of the discriminator provided in the embodiment of the present application;
图14为本申请实施例提供的点云上采样方法的流程示意图;FIG. 14 is a schematic flow diagram of a point cloud upsampling method provided in an embodiment of the present application;
图15为本申请实施例涉及的生成器的一种网络结构示意图;FIG. 15 is a schematic diagram of a network structure of a generator involved in an embodiment of the present application;
图16为本申请实施例提供的点云解码方法的流程示意图;FIG. 16 is a schematic flow diagram of a point cloud decoding method provided in an embodiment of the present application;
图17是本申请实施例提供的点云解码器的示意性框图;Fig. 17 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application;
图18是本申请实施例提供的点云上采样装置的示意性框图;Fig. 18 is a schematic block diagram of a point cloud upsampling device provided by an embodiment of the present application;
图19是本申请实施例提供的模型训练装置的示意性框图;Fig. 19 is a schematic block diagram of a model training device provided by an embodiment of the present application;
图20是本申请实施例提供的电子设备的示意性框图。Fig. 20 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
具体实施方式Detailed Description
本申请可应用于点云上采样技术领域,例如可以应用于点云压缩技术领域。The present application can be applied to the technical field of point cloud upsampling, for example, can be applied to the technical field of point cloud compression.
为了便于理解本申请的实施例,首先对本申请实施例涉及到的相关概念进行如下简单介绍:In order to facilitate the understanding of the embodiments of the present application, firstly, the relevant concepts involved in the embodiments of the present application are briefly introduced as follows:
点云(Point Cloud)是指空间中一组无规则分布的、表达三维物体或三维场景的空间结构及表面属性的离散点集。Point cloud refers to a set of discrete point sets randomly distributed in space, expressing the spatial structure and surface properties of 3D objects or 3D scenes.
点云数据（Point Cloud Data）是点云的具体记录形式，点云中的点可以包括点的位置信息和点的属性信息。例如，点的位置信息可以是点的三维坐标信息。点的位置信息也可称为点的几何信息。例如，点的属性信息可包括颜色信息和/或反射率等等。例如，所述颜色信息可以是任意一种色彩空间上的信息。例如，所述颜色信息可以是RGB信息。再如，所述颜色信息可以是亮度色度(YCbCr,YUV)信息。例如，Y表示明亮度(Luma)，Cb(U)表示蓝色色差，Cr(V)表示红色色差，U和V表示为色度(Chroma)，用于描述色差信息。例如，根据激光测量原理得到的点云，所述点云中的点可以包括点的三维坐标信息和点的激光反射强度(reflectance)。再如，根据摄影测量原理得到的点云，所述点云中的点可以包括点的三维坐标信息和点的颜色信息。再如，结合激光测量和摄影测量原理得到的点云，所述点云中的点可以包括点的三维坐标信息、点的激光反射强度(reflectance)和点的颜色信息。Point cloud data is the concrete record form of a point cloud; a point in the point cloud may include the position information of the point and the attribute information of the point. For example, the position information of a point may be its three-dimensional coordinate information; the position information of a point may also be called the geometric information of the point. The attribute information of a point may include color information and/or reflectance, among others. The color information may be information in any color space, for example RGB information, or luminance-chrominance (YCbCr, YUV) information, where Y denotes luma, Cb (U) denotes the blue color difference, Cr (V) denotes the red color difference, and U and V together are the chroma components used to describe color-difference information. For example, in a point cloud obtained by laser measurement, a point may include the three-dimensional coordinate information of the point and the laser reflection intensity (reflectance) of the point. In a point cloud obtained by photogrammetry, a point may include the three-dimensional coordinate information and the color information of the point. In a point cloud obtained by combining laser measurement and photogrammetry, a point may include the three-dimensional coordinate information, the laser reflection intensity (reflectance) and the color information of the point.
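For concreteness, a point cloud record of the kind just described (geometry plus color and reflectance attributes) can be held as parallel per-point arrays; the values below are made up:

```python
import numpy as np

positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])            # (N, 3) geometry: xyz
colors = np.array([[255, 0, 0],
                   [0, 255, 0],
                   [0, 0, 255]], dtype=np.uint8)   # (N, 3) RGB attribute
reflectance = np.array([0.10, 0.55, 0.90])         # (N,) laser reflectance

# One record per point: position, color and reflectance share the index.
assert positions.shape[0] == colors.shape[0] == reflectance.shape[0]
```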
点云数据的获取途径可以包括但不限于以下至少一种：(1)计算机设备生成。计算机设备可以根据虚拟三维物体及虚拟三维场景生成点云数据。(2)3D(3-Dimension，三维)激光扫描获取。通过3D激光扫描可以获取静态现实世界三维物体或三维场景的点云数据，每秒可以获取百万级点云数据；(3)3D摄影测量获取。通过3D摄影设备(即一组摄像机或具有多个镜头和传感器的摄像机设备)对现实世界的视觉场景进行采集以获取现实世界的视觉场景的点云数据，通过3D摄影可以获得动态现实世界三维物体或三维场景的点云数据。(4)通过医学设备获取生物组织器官的点云数据。在医学领域可以通过磁共振成像(Magnetic Resonance Imaging，MRI)、电子计算机断层扫描(Computed Tomography，CT)、电磁定位信息等医学设备获取生物组织器官的点云数据。Ways of obtaining point cloud data may include, but are not limited to, at least one of the following: (1) generation by a computer device, which can generate point cloud data from virtual three-dimensional objects and virtual three-dimensional scenes; (2) 3D (3-dimension) laser scanning, by which point cloud data of static real-world three-dimensional objects or scenes can be obtained at a rate of millions of points per second; (3) 3D photogrammetry, in which a 3D photography device (i.e., a group of cameras, or a camera device with multiple lenses and sensors) captures a real-world visual scene to obtain its point cloud data, so that point cloud data of dynamic real-world three-dimensional objects or scenes can be obtained; (4) acquisition of point cloud data of biological tissues and organs by medical equipment; in the medical field, such data can be obtained by magnetic resonance imaging (MRI), computed tomography (CT), electromagnetic positioning information and other medical means.
点云可以按获取的途径分为：密集型点云和稀疏型点云。Point clouds can be divided into dense point clouds and sparse point clouds according to the way they are acquired.
点云按照数据的时序类型划分为:According to the time series type of data, point cloud is divided into:
第一类静态点云:即物体是静止的,获取点云的设备也是静止的;The first type of static point cloud: that is, the object is stationary, and the device for obtaining the point cloud is also stationary;
第二类动态点云:物体是运动的,但获取点云的设备是静止的;The second type of dynamic point cloud: the object is moving, but the device for obtaining the point cloud is still;
第三类动态获取点云:获取点云的设备是运动的。The third type of dynamic acquisition of point clouds: the equipment for acquiring point clouds is in motion.
按点云的用途分为两大类:According to the purpose of point cloud, it can be divided into two categories:
类别一:机器感知点云,其可以用于自主导航系统、实时巡检系统、地理信息系统、视觉分拣机器人、抢险救灾机器人等场景;Category 1: Machine perception point cloud, which can be used in scenarios such as autonomous navigation systems, real-time inspection systems, geographic information systems, visual sorting robots, and emergency rescue robots;
类别二:人眼感知点云,其可以用于数字文化遗产、自由视点广播、三维沉浸通信、三维沉浸交互等点云应用场景。Category 2: Human eyes perceive point clouds, which can be used in point cloud application scenarios such as digital cultural heritage, free viewpoint broadcasting, 3D immersive communication, and 3D immersive interaction.
在一些实施例中，本申请实施例提供的点云上采样方法，可以应用于点云编解码框架中，例如对点云解码端从码流中解析出的点云的几何信息进行上采样，得到精度更高的上采样点云。In some embodiments, the point cloud up-sampling method provided by the embodiments of the present application can be applied in a point cloud codec framework, for example, to up-sample the geometric information of the point cloud parsed from the code stream at the point cloud decoding end, so as to obtain an up-sampled point cloud with higher accuracy.
下面对点云编解码的相关知识进行介绍。The following is an introduction to the relevant knowledge of point cloud encoding and decoding.
图1为本申请实施例涉及的一种点云编解码系统的示意性框图。需要说明的是,图1只是一种示例,本申请实施例的点云编解码系统包括但不限于图1所示。如图1所示,该点云编解码系统100包含编码设备110和解码设备120。其中编码设备用于对点云数据进行编码(可以理解成压缩)产生码流,并将码流传输给解码设备。解码设备对编码设备编码产生的码流进行解码,得到解码后的点云数据。FIG. 1 is a schematic block diagram of a point cloud encoding and decoding system involved in an embodiment of the present application. It should be noted that FIG. 1 is just an example, and the point cloud encoding and decoding system in the embodiment of the present application includes but is not limited to what is shown in FIG. 1 . As shown in FIG. 1 , the point cloud encoding and decoding system 100 includes an encoding device 110 and a decoding device 120 . The encoding device is used to encode the point cloud data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device. The decoding device decodes the code stream generated by the encoding device to obtain decoded point cloud data.
本申请实施例的编码设备110可以理解为具有点云编码功能的设备，解码设备120可以理解为具有点云解码功能的设备，即本申请实施例的编码设备110和解码设备120可以包括更广泛的装置，例如包含智能手机、台式计算机、移动计算装置、笔记本(例如，膝上型)计算机、平板计算机、机顶盒、电视、相机、显示装置、数字媒体播放器、点云游戏控制台、车载计算机等。The encoding device 110 in the embodiments of the present application can be understood as a device with a point cloud encoding function, and the decoding device 120 as a device with a point cloud decoding function; that is, the encoding device 110 and the decoding device 120 cover a wide range of devices, including, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, point cloud gaming consoles, vehicle-mounted computers, and so on.
在一些实施例中,编码设备110可以经由信道130将编码后的点云数据(如码流)传输给解码设备120。信道130可以包括能够将编码后的点云数据从编码设备110传输到解码设备120的一个或多个媒体和/或装置。In some embodiments, the encoding device 110 can transmit the encoded point cloud data (eg code stream) to the decoding device 120 via the channel 130 . Channel 130 may include one or more media and/or devices capable of transmitting encoded point cloud data from encoding device 110 to decoding device 120 .
在一个实例中,信道130包括使编码设备110能够实时地将编码后的点云数据直接发射到解码设备120的一个或多个通信媒体。在此实例中,编码设备110可根据通信标准来调制编码后的点云数据,且将调制后的点云数据发射到解码设备120。其中通信媒体包含无线通信媒体,例如射频频谱,可选的,通信媒体还可以包含有线通信媒体,例如一根或多根物理传输线。In one example, channel 130 includes one or more communication media that enable encoding device 110 to transmit encoded point cloud data directly to decoding device 120 in real-time. In this instance, the encoding device 110 may modulate the encoded point cloud data according to the communication standard, and transmit the modulated point cloud data to the decoding device 120 . The communication medium includes a wireless communication medium, such as a radio frequency spectrum. Optionally, the communication medium may also include a wired communication medium, such as one or more physical transmission lines.
在另一实例中,信道130包括存储介质,该存储介质可以存储编码设备110编码后的点云数据。存储介质包含多种本地存取式数据存储介质,例如光盘、DVD、快闪存储器等。在该实例中,解码设备120可从该存储介质中获取编码后的点云数据。In another example, the channel 130 includes a storage medium, which can store the point cloud data encoded by the encoding device 110 . The storage medium includes a variety of local access data storage media, such as optical discs, DVDs, flash memory, and the like. In this example, the decoding device 120 can acquire encoded point cloud data from the storage medium.
在另一实例中，信道130可包含存储服务器，该存储服务器可以存储编码设备110编码后的点云数据。在此实例中，解码设备120可以从该存储服务器中下载存储的编码后的点云数据。可选的，该存储服务器可以存储编码后的点云数据且可以将该编码后的点云数据发射到解码设备120，例如web服务器(例如，用于网站)、文件传送协议(FTP)服务器等。In another example, the channel 130 may include a storage server that can store the point cloud data encoded by the encoding device 110. In this example, the decoding device 120 may download the stored encoded point cloud data from the storage server. Optionally, the storage server may store the encoded point cloud data and transmit it to the decoding device 120; such a server may be, for example, a web server (e.g., for a website), a file transfer protocol (FTP) server, etc.
一些实施例中,编码设备110包含点云编码器112及输出接口113。其中,输出接口113可以包含调制器/解调器(调制解调器)和/或发射器。In some embodiments, the encoding device 110 includes a point cloud encoder 112 and an output interface 113 . Wherein, the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
在一些实施例中，编码设备110除了包括点云编码器112和输出接口113外，还可以包括点云源111。In some embodiments, the encoding device 110 may include a point cloud source 111 in addition to the point cloud encoder 112 and the output interface 113.
点云源111可包含点云采集装置(例如,扫描仪)、点云存档、点云输入接口、计算机图形系统中的至少一个,其中,点云输入接口用于从点云内容提供者处接收点云数据,计算机图形系统用于产生点云数据。The point cloud source 111 may include at least one of a point cloud acquisition device (for example, a scanner), a point cloud archive, a point cloud input interface, and a computer graphics system, wherein the point cloud input interface is used to receive from a point cloud content provider Point cloud data, computer graphics system is used to generate point cloud data.
点云编码器112对来自点云源111的点云数据进行编码,产生码流。点云编码器112经由输出接口113将编码后的点云数据直接传输到解码设备120。编码后的点云数据还可存储于存储介质或存储服务器上,以供解码设备120后续读取。The point cloud encoder 112 encodes the point cloud data from the point cloud source 111 to generate a code stream. The point cloud encoder 112 directly transmits the encoded point cloud data to the decoding device 120 via the output interface 113 . The encoded point cloud data can also be stored on a storage medium or a storage server for subsequent reading by the decoding device 120 .
在一些实施例中,解码设备120包含输入接口121和点云解码器122。In some embodiments, the decoding device 120 includes an input interface 121 and a point cloud decoder 122 .
在一些实施例中,解码设备120除包括输入接口121和点云解码器122外,还可以包括显示装置123。In some embodiments, the decoding device 120 may further include a display device 123 in addition to the input interface 121 and the point cloud decoder 122 .
其中,输入接口121包含接收器及/或调制解调器。输入接口121可通过信道130接收编码后的点云数据。Wherein, the input interface 121 includes a receiver and/or a modem. The input interface 121 can receive the encoded point cloud data through the channel 130 .
点云解码器122用于对编码后的点云数据进行解码,得到解码后的点云数据,并将解码后的点云数据传输至显示装置123。The point cloud decoder 122 is used to decode the encoded point cloud data to obtain decoded point cloud data, and transmit the decoded point cloud data to the display device 123 .
显示装置123显示解码后的点云数据。显示装置123可与解码设备120整合或在解码设备120外部。显示装置123可包括多种显示装置,例如液晶显示器(LCD)、等离子体显示器、有机发光二极管(OLED)显示器或其它类型的显示装置。The display device 123 displays the decoded point cloud data. The display device 123 may be integrated with the decoding device 120 or external to the decoding device 120 . The display device 123 may include various display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or other types of display devices.
此外,图1仅为实例,本申请实施例的技术方案不限于图1,例如本申请的技术还可以应用于单侧的点云编码或单侧的点云解码。In addition, FIG. 1 is only an example, and the technical solution of the embodiment of the present application is not limited to FIG. 1 . For example, the technology of the present application can also be applied to one-sided point cloud encoding or one-sided point cloud decoding.
目前的点云编码器可以采用运动图像专家组(Moving Picture Experts Group，MPEG)提供的基于几何的点云压缩(Geometry-based Point Cloud Compression，G-PCC)编解码框架或基于视频的点云压缩(Video-based Point Cloud Compression，V-PCC)编解码框架，也可以采用音视频编码标准(Audio Video Standard，AVS)提供的AVS-PCC编解码框架。G-PCC及AVS-PCC均针对静态的稀疏型点云，其编码框架大致相同。G-PCC编解码框架可用于针对第一类静态点云和第三类动态获取点云进行压缩，V-PCC编解码框架可用于针对第二类动态点云进行压缩。G-PCC编解码框架也称为点云编解码器TMC13，V-PCC编解码框架也称为点云编解码器TMC2。Current point cloud encoders may adopt the geometry-based point cloud compression (G-PCC) codec framework or the video-based point cloud compression (V-PCC) codec framework provided by the Moving Picture Experts Group (MPEG), or the AVS-PCC codec framework provided by the Audio Video Standard (AVS). Both G-PCC and AVS-PCC target static sparse point clouds, and their coding frameworks are roughly the same. The G-PCC codec framework can be used to compress the first type of static point clouds and the third type of dynamically acquired point clouds, while the V-PCC codec framework can be used to compress the second type of dynamic point clouds. The G-PCC codec framework is also known as point cloud codec TMC13, and the V-PCC codec framework is also known as point cloud codec TMC2.
下面以G-PCC编解码框架为例,对本申请实施例可适用的点云编码器和点云解码器进行说明。The following uses the G-PCC codec framework as an example to describe the applicable point cloud encoder and point cloud decoder in this embodiment of the present application.
图2是本申请实施例提供的点云编码器的示意性框图。Fig. 2 is a schematic block diagram of a point cloud encoder provided by an embodiment of the present application.
由上述可知点云中的点可以包括点的位置信息和点的属性信息,因此,点云中的点的编码主要包括位置编码和属性编码。在一些示例中点云中点的位置信息又称为几何信息,对应的点云中点的位置编码也可以称为几何编码。From the above, it can be seen that the points in the point cloud can include the position information of the point and the attribute information of the point, therefore, the encoding of the point in the point cloud mainly includes the position encoding and the attribute encoding. In some examples, the position information of the points in the point cloud is also called geometric information, and the corresponding position codes of the points in the point cloud may also be called geometric codes.
位置编码的过程包括：对点云中的点进行预处理，例如坐标变换、量化和移除重复点等；接着，对预处理后的点云进行几何编码，例如构建八叉树，基于构建的八叉树进行几何编码形成几何码流。同时，基于构建的八叉树输出的位置信息，对点云数据中各点的位置信息进行重建，得到各点的位置信息的重建值。The position encoding process includes: preprocessing the points in the point cloud, for example coordinate transformation, quantization and removal of duplicate points; then geometrically encoding the preprocessed point cloud, for example by constructing an octree and performing geometry coding based on the constructed octree to form a geometry code stream. At the same time, the position information of each point in the point cloud data is reconstructed based on the position information output by the constructed octree, to obtain reconstructed values of the position information of each point.
属性编码过程包括：通过给定输入点云的位置信息的重建信息和属性信息的原始值，选择三种预测模式的一种进行点云预测，对预测后的结果进行量化，并进行算术编码形成属性码流。The attribute encoding process includes: given the reconstructed position information and the original attribute values of the input point cloud, selecting one of three prediction modes to perform point cloud prediction, quantizing the prediction result, and performing arithmetic coding to form an attribute code stream.
如图2所示,位置编码可通过以下单元实现:As shown in Figure 2, position coding can be achieved by the following units:
坐标转换(Transform coordinates)单元201、量化和移除重复点(Quantize and remove points)单元202、八叉树分析(Analyze octree)单元203、几何重建(Reconstruct geometry)单元204以及第一算术编码(Arithmetic encode)单元205。A coordinate transform (Transform coordinates) unit 201, a quantize-and-remove-duplicate-points (Quantize and remove points) unit 202, an octree analysis (Analyze octree) unit 203, a geometry reconstruction (Reconstruct geometry) unit 204, and a first arithmetic encoding (Arithmetic encode) unit 205.
坐标转换单元201可用于将点云中点的世界坐标变换为相对坐标。例如,点的几何坐标分别减去xyz坐标轴的最小值,相当于去直流操作,以实现将点云中的点的坐标从世界坐标转换为相对坐标。The coordinate transformation unit 201 can be used to transform the world coordinates of points in the point cloud into relative coordinates. For example, subtracting the minimum values of the xyz coordinate axes from the geometric coordinates of the point is equivalent to a DC operation to convert the coordinates of the points in the point cloud from world coordinates to relative coordinates.
量化和移除重复点单元202可通过量化减少坐标的数目；量化后原先不同的点可能被赋予相同的坐标，基于此，可通过去重操作将重复的点删除；例如，具有相同量化位置和不同属性信息的多个点可通过属性转换合并到一个点中。在本申请的一些实施例中，量化和移除重复点单元202为可选的单元模块。The quantize-and-remove-duplicate-points unit 202 can reduce the number of coordinates by quantization. After quantization, originally distinct points may be assigned the same coordinates, and such duplicate points can then be deleted by a de-duplication operation; for example, multiple points with the same quantized position but different attribute information can be merged into one point through attribute conversion. In some embodiments of the present application, the quantize-and-remove-duplicate-points unit 202 is an optional unit module.
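A minimal sketch of this quantize-and-deduplicate step, assuming a uniform quantization step and using the coordinate minimum as the "remove DC" offset described above:

```python
import numpy as np

def quantize_and_dedup(points, step):
    # Shift to the coordinate minimum (the "remove DC" coordinate
    # conversion described above), quantize to an integer grid, then
    # collapse points that fall into the same grid cell.
    shifted = points - points.min(axis=0)
    grid = np.floor(shifted / step).astype(np.int64)
    return np.unique(grid, axis=0)

pts = np.array([[0.00, 0.0, 0.0],
                [0.04, 0.0, 0.0],   # same 0.1-sized cell as the first point
                [1.00, 2.0, 3.0]])
print(quantize_and_dedup(pts, step=0.1))  # two distinct grid points remain
```

In a real codec the attributes of the merged points would additionally be combined (attribute conversion); this sketch keeps only the geometry.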
八叉树分析单元203可利用八叉树(octree)编码方式编码量化的点的位置信息。例如,将点云按照八叉树的形式进行划分,由此,点的位置可以和八叉树的位置一一对应,通过统计八叉树中有点的位置,并将其标识(flag)记为1,以进行几何编码。The octree analysis unit 203 may use an octree encoding method to encode the position information of the quantized points. For example, the point cloud is divided in the form of an octree, so that the position of the point can be in one-to-one correspondence with the position of the octree, and the position of the point in the octree is counted, and its flag (flag) is recorded as 1 for geometric encoding.
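The occupancy idea above (a flag of 1 for each occupied octree position) can be sketched as follows; the depth-first traversal order and the child-index bit layout are illustrative choices, not the codec's normative syntax:

```python
def octree_occupancy(points, depth):
    # points: a set of integer (x, y, z) coordinates in [0, 2**depth).
    # Returns one occupancy byte per non-empty internal node, depth-first:
    # bit i is set iff child octant i contains at least one point.
    if depth == 0 or not points:
        return []
    shift = depth - 1
    mask = (1 << shift) - 1
    children = [set() for _ in range(8)]
    for x, y, z in points:
        i = (((x >> shift) & 1) << 2) | (((y >> shift) & 1) << 1) | ((z >> shift) & 1)
        children[i].add((x & mask, y & mask, z & mask))
    codes = [sum(1 << i for i, c in enumerate(children) if c)]
    for c in children:
        codes += octree_occupancy(c, depth - 1)
    return codes

# Two opposite corners of a 4x4x4 cube occupy child 0 and child 7 at the root:
print(octree_occupancy({(0, 0, 0), (3, 3, 3)}, depth=2))  # [129, 1, 128]
```

The resulting byte sequence is what an arithmetic coder would then entropy-code into the geometry bitstream.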
几何重建单元204可以基于八叉树分析单元203输出的位置信息进行位置重建,得到点云数据中各点的位置信息的重建值。The geometry reconstruction unit 204 may perform position reconstruction based on the position information output by the octree analysis unit 203 to obtain reconstruction values of the position information of each point in the point cloud data.
第一算术编码单元205可以采用熵编码方式对八叉树分析单元203输出的位置信息进行算术编码，即将八叉树分析单元203输出的位置信息利用算术编码方式生成几何码流；几何码流也可称为几何比特流(geometry bitstream)。The first arithmetic coding unit 205 may perform arithmetic coding on the position information output by the octree analysis unit 203 in an entropy-coding manner, that is, use arithmetic coding to generate a geometry code stream from the position information output by the octree analysis unit 203; the geometry code stream may also be called a geometry bitstream.
属性编码可通过以下单元实现:Attribute coding can be achieved by the following units:
颜色空间转换(Transform colors)单元210、属性转化(Transfer attributes)单元211、区域自适应分层变换(Region Adaptive Hierarchical Transform,RAHT)单元212、预测变化(predicting transform)单元213以及提升变化(lifting transform)单元214、量化系数(Quantize coefficients)单元215以及第二算术编码单元216。Color space conversion (Transform colors) unit 210, attribute conversion (Transfer attributes) unit 211, region adaptive layered transformation (Region Adaptive Hierarchical Transform, RAHT) unit 212, prediction change (predicting transform) unit 213 and lifting transform (lifting transform) ) unit 214, a quantization coefficient (Quantize coefficients) unit 215, and a second arithmetic coding unit 216.
需要说明的是,点云编码器200可包含比图2更多、更少或不同的功能组件。It should be noted that the point cloud encoder 200 may include more, less or different functional components than those shown in FIG. 2 .
颜色空间转换单元210可用于将点云中点的RGB色彩空间变换为YCbCr格式或其他格式。The color space conversion unit 210 can be used to convert the RGB color space of points in the point cloud into YCbCr format or other formats.
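As a concrete illustration of such a color space conversion, one common RGB-to-YCbCr mapping is the full-range BT.601 conversion below; the patent does not fix a particular conversion matrix, so these coefficients are an assumption:

```python
def rgb_to_ycbcr(r, g, b):
    # Full-range BT.601 conversion (one common convention): Y is luma,
    # Cb is the blue color difference, Cr is the red color difference.
    y  =          0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

print(rgb_to_ycbcr(255, 0, 0))  # pure red: modest luma, Cr near its maximum
```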
属性转化单元211可用于转换点云中点的属性信息,以最小化属性失真。例如,属性转化单元211可用于得到点的属性信息的原始值。例如,所述属性信息可以是点的颜色信息。The attribute conversion unit 211 can be used to convert attribute information of points in the point cloud to minimize attribute distortion. For example, the attribute conversion unit 211 can be used to obtain the original value of the attribute information of the point. For example, the attribute information may be color information of dots.
经过属性转化单元211转换得到点的属性信息的原始值后,可选择任一种预测单元,对点云中的点进行预测。预测单元可包括:RAHT 212、预测变化(predicting transform)单元213以及提升变化 (lifting transform)单元214。After the original value of the attribute information of the point is converted by the attribute conversion unit 211, any prediction unit can be selected to predict the point in the point cloud. The prediction unit may include: RAHT 212, predicting transform unit 213, and lifting transform unit 214.
换言之,RAHT 212、预测变化(predicting transform)单元213以及提升变化(lifting transform)单元214中的任一项可用于对点云中点的属性信息进行预测,以得到点的属性信息的预测值,进而基于点的属性信息的预测值得到点的属性信息的残差值。例如,点的属性信息的残差值可以是点的属性信息的原始值减去点的属性信息的预测值。In other words, any one of the RAHT 212, the predicting transform unit 213 and the lifting transform unit 214 can be used to predict the attribute information of the points in the point cloud, so as to obtain the predicted values of the attribute information of the points, Furthermore, the residual value of the attribute information of the point is obtained based on the predicted value of the attribute information of the point. For example, the residual value of the point's attribute information may be the original value of the point's attribute information minus the predicted value of the point's attribute information.
在本申请的一实施例中,预测变换单元213还可用于生成细节层(level of detail,LOD)。LOD的生成过程包括:根据点云中点的位置信息,获取点与点之间的欧式距离;根据欧式距离,将点分为不同的细节表达层。在一个实施例中,可以将欧式距离进行排序后,将不同范围的欧式距离划分为不同的细节表达层。例如,可以随机挑选一个点,作为第一细节表达层。然后计算剩余点与该点的欧式距离,并将欧式距离符合第一阈值要求的点,归为第二细节表达层。获取第二细节表达层中点的质心,计算除第一、第二细节表达层以外的点与该质心的欧式距离,并将欧式距离符合第二阈值的点,归为第三细节表达层。以此类推,将所有的点都归到细节表达层中。通过调整欧式距离的阈值,可以使得每层LOD层的点的数量是递增的。应理解,LOD划分的方式还可以采用其它方式,本申请对此不进行限制。In an embodiment of the present application, the predictive transformation unit 213 may also be used to generate a level of detail (LOD). The generation process of LOD includes: according to the position information of the points in the point cloud, the Euclidean distance between the points is obtained; according to the Euclidean distance, the points are divided into different detail expression layers. In one embodiment, after sorting the Euclidean distances, the Euclidean distances in different ranges can be divided into different detail expression layers. For example, a point can be randomly selected as the first detail expression layer. Then calculate the Euclidean distance between the remaining points and this point, and classify the points whose Euclidean distance meets the first threshold requirement as the second detailed expression layer. Obtain the centroid of the point in the second detail expression layer, calculate the Euclidean distance between points other than the first and second detail expression layer and the centroid, and classify the points whose Euclidean distance meets the second threshold as the third detail expression layer. By analogy, all points are classified into the detail expression layer. By adjusting the threshold of the Euclidean distance, the number of points in each LOD layer can be increased. It should be understood that other manners may also be used for LOD division, which is not limited in this application.
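A minimal sketch of the threshold-based LOD split described above; for brevity it seeds the first layer with the first point rather than a random one, and uses the previous layer's centroid as the reference for the next threshold, as in the text:

```python
import math

def build_lods(points, thresholds):
    # points: list of (x, y, z); thresholds: increasing Euclidean distances.
    # Each threshold peels off the remaining points within that distance of
    # the current reference point; leftovers form the final layer.
    def centroid(pts):
        return tuple(sum(c) / len(pts) for c in zip(*pts))

    lods = [[points[0]]]          # first detail expression layer
    remaining = list(points[1:])
    ref = points[0]
    for t in thresholds:
        layer = [p for p in remaining if math.dist(p, ref) <= t]
        remaining = [p for p in remaining if math.dist(p, ref) > t]
        lods.append(layer)
        if layer:
            ref = centroid(layer)
    lods.append(remaining)        # whatever is left forms the last layer
    return lods

lod_pts = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
lods = build_lods(lod_pts, thresholds=[1.5, 3.0])
print([len(layer) for layer in lods])  # [1, 1, 1, 1]
```

Raising the thresholds makes each successive layer capture more points, which is how the increasing layer sizes mentioned above are obtained.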
需要说明的是，可以直接将点云划分为一个或多个细节表达层，也可以先将点云划分为多个点云切块(slice)，再将每一个点云切块划分为一个或多个LOD层。It should be noted that the point cloud can be directly divided into one or more detail expression layers, or the point cloud can first be divided into multiple point cloud slices, and then each point cloud slice can be divided into one or more LOD layers.
例如，可将点云划分为多个点云切块，每个点云切块的点的个数可以在55万-110万之间。每个点云切块可看成单独的点云。每个点云切块又可以划分为多个细节表达层，每个细节表达层包括多个点。在一个实施例中，可根据点与点之间的欧式距离，进行细节表达层的划分。For example, the point cloud can be divided into multiple point cloud slices, and the number of points in each slice can be between 550,000 and 1,100,000. Each point cloud slice can be regarded as a separate point cloud. Each point cloud slice can in turn be divided into multiple detail expression layers, each of which includes multiple points. In one embodiment, the detail expression layers can be divided according to the Euclidean distance between points.
量化单元215可用于量化点的属性信息的残差值。例如,若所述量化单元215和所述预测变换单元213相连,则所述量化单元可用于量化所述预测变换单元213输出的点的属性信息的残差值。The quantization unit 215 may be used to quantize residual values of attribute information of points. For example, if the quantization unit 215 is connected to the predictive transformation unit 213, the quantization unit may be used to quantize the residual value of the point attribute information output by the predictive transformation unit 213.
例如,对预测变换单元213输出的点的属性信息的残差值使用量化步长进行量化,以实现提升系统性能。For example, the residual value of the point attribute information output by the predictive transformation unit 213 is quantized using the quantization step size, so as to improve system performance.
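Step-size quantization of an attribute residual, together with the matching inverse quantization performed on the decoder side, can be sketched as below. The function names and the step size 0.5 are illustrative assumptions; the actual step size is a codec parameter.

```python
def quantize(residual, step):
    # Uniform quantization: map the residual to an integer level.
    return round(residual / step)

def dequantize(level, step):
    # Inverse quantization: reconstruct an approximate residual.
    return level * step

level = quantize(3.2, 0.5)               # residual 3.2 -> level 6
approx = dequantize(level, 0.5)          # reconstructed residual 3.0
```

The gap between 3.2 and 3.0 is the quantization error traded for a smaller code stream.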
第二算术编码单元216可使用零行程编码(Zero run length coding)对点的属性信息的残差值进行熵编码,以得到属性码流。所述属性码流可以是比特流信息。The second arithmetic coding unit 216 may use zero run length coding to perform entropy coding on the residual value of the attribute information of the point to obtain an attribute code stream. The attribute code stream may be bit stream information.
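Zero run length coding replaces runs of consecutive zero residuals with a single count (zero_cnt). A simplified sketch follows; the entropy coding of the counts and values, which the second arithmetic coding unit 216 would actually perform, is omitted, and the pair representation is an assumption for illustration.

```python
def zero_run_encode(residuals):
    """Encode residuals as (zero_cnt, value) pairs: each nonzero value is
    preceded by the number of zeros that ran before it."""
    out, zeros = [], 0
    for r in residuals:
        if r == 0:
            zeros += 1
        else:
            out.append((zeros, r))
            zeros = 0
    if zeros:                        # trailing run of zeros, no value follows
        out.append((zeros, None))
    return out

def zero_run_decode(pairs):
    """Inverse of zero_run_encode: expand each count back into zeros."""
    out = []
    for zero_cnt, value in pairs:
        out.extend([0] * zero_cnt)
        if value is not None:
            out.append(value)
    return out
```

Because quantized attribute residuals are frequently zero, this run representation is much shorter than the raw residual list.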
图3是本申请实施例提供的点云解码器的示意性框图。Fig. 3 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
如图3所示，解码器300可以从编码设备获取点云码流，通过解析码流得到点云中的点的位置信息和属性信息。点云的解码包括位置解码和属性解码。As shown in FIG. 3, the decoder 300 can obtain the point cloud code stream from the encoding device, and obtain the position information and attribute information of the points in the point cloud by parsing the code stream. The decoding of a point cloud includes position decoding and attribute decoding.
位置解码的过程包括：对几何码流进行算术解码；构建八叉树后进行合并，对点的位置信息进行重建，以得到点的位置信息的重建信息；对点的位置信息的重建信息进行坐标变换，得到点的位置信息。点的位置信息也可称为点的几何信息。The process of position decoding includes: performing arithmetic decoding on the geometry code stream; constructing an octree and then merging, and reconstructing the position information of the points to obtain the reconstruction information of the position information of the points; and performing coordinate transformation on the reconstruction information of the position information of the points to obtain the position information of the points. The position information of a point may also be referred to as the geometric information of the point.
属性解码过程包括：通过解析属性码流，获取点云中点的属性信息的残差值；通过对点的属性信息的残差值进行反量化，得到反量化后的点的属性信息的残差值；基于位置解码过程中获取的点的位置信息的重建信息，选择如下RAHT、预测变化和提升变化三种预测模式中的一种进行点云预测，得到预测值，预测值与残差值相加得到点的属性信息的重建值；对点的属性信息的重建值进行颜色空间反转化，以得到解码点云。The attribute decoding process includes: parsing the attribute code stream to obtain the residual value of the attribute information of the points in the point cloud; performing inverse quantization on the residual value of the attribute information of the points to obtain the dequantized residual value of the attribute information of the points; based on the reconstruction information of the position information of the points obtained in the position decoding process, selecting one of the following three prediction modes, RAHT, predicting transform and lifting transform, to perform point cloud prediction and obtain the predicted value; adding the predicted value and the residual value to obtain the reconstructed value of the attribute information of the points; and performing inverse color space transformation on the reconstructed value of the attribute information of the points to obtain the decoded point cloud.
如图3所示,位置解码可通过以下单元实现:As shown in Figure 3, position decoding can be achieved by the following units:
第一算数解码单元301、八叉树分析(synthesize octree)单元302、几何重建(Reconstruct geometry)单元304以及坐标反转换(inverse transform coordinates)单元305。A first arithmetic decoding unit 301 , an octree analysis unit 302 , a geometry reconstruction unit 304 and an inverse transform coordinates unit 305 .
属性解码可通过以下单元实现：Attribute decoding can be achieved by the following units:
第二算数解码单元310、反量化(inverse quantize)单元311、RAHT单元312、预测变化(predicting transform)单元313、提升变化(lifting transform)单元314以及颜色空间反转换(inverse transform colors)单元315。A second arithmetic decoding unit 310, an inverse quantization unit 311, a RAHT unit 312, a predicting transform unit 313, a lifting transform unit 314, and an inverse transform colors unit 315.
需要说明的是,解压缩是压缩的逆过程,类似的,解码器300中的各个单元的功能可参见编码器200中相应的单元的功能。另外,点云解码器300可包含比图3更多、更少或不同的功能组件。It should be noted that decompression is an inverse process of compression, and similarly, the functions of each unit in the decoder 300 may refer to the functions of corresponding units in the encoder 200 . In addition, the point cloud decoder 300 may include more, fewer or different functional components than in FIG. 3 .
例如，解码器300可根据点云中点与点之间的欧式距离将点云划分为多个LOD；然后，依次对LOD中点的属性信息进行解码；例如，计算零行程编码技术中零的数量(zero_cnt)，以基于zero_cnt对残差进行解码；接着，解码器300可基于解码出的残差值进行反量化，并基于反量化后的残差值与当前点的预测值相加得到该点的重建值，直到解码完所有的点。当前点将会作为后续LOD中点的最邻近点，并利用当前点的重建值对后续点的属性信息进行预测。For example, the decoder 300 can divide the point cloud into multiple LODs according to the Euclidean distance between points in the point cloud; then decode the attribute information of the points in each LOD in sequence, for example, calculate the number of zeros (zero_cnt) in the zero run length coding technique and decode the residual based on zero_cnt; then, the decoder 300 can perform inverse quantization on the decoded residual value, and add the dequantized residual value to the predicted value of the current point to obtain the reconstructed value of the point, until all points are decoded. The current point will serve as the nearest neighbor of points in subsequent LODs, and the reconstructed value of the current point will be used to predict the attribute information of subsequent points.
上述是基于G-PCC编解码框架下的点云编解码器的基本流程，随着技术的发展，该框架或流程的一些模块或步骤可能会被优化，本申请适用于该基于G-PCC编解码框架下的点云编解码器的基本流程，但不限于该框架及流程。The above is the basic process of a point cloud codec under the G-PCC codec framework. As technology develops, some modules or steps of this framework or process may be optimized. The present application is applicable to the basic process of the point cloud codec under the G-PCC codec framework, but is not limited to this framework and process.
目前的点云编解码方式，将点云重建为原始尺度，但是在一些应用场景中，需要使用比原始精度更高的高质量点云，例如，在自动驾驶等领域对雷达采集的稀疏点云，经常需要做大量的后处理工作，来提升点云的精度，以提升驾驶的安全性。Current point cloud codecs reconstruct the point cloud at its original scale, but some application scenarios require high-quality point clouds with higher precision than the original. For example, in fields such as autonomous driving, the sparse point clouds collected by radar often require a large amount of post-processing to improve their precision and thereby improve driving safety.
本申请实施例提供一种点云上采样方法,使用深度学习来上采样点云几何信息以获取更高分辨率(或精度)的点云,进而满足对高精度点云的任务需求。The embodiment of the present application provides a point cloud upsampling method, which uses deep learning to upsample point cloud geometric information to obtain a higher resolution (or precision) point cloud, thereby meeting the task requirements for high-precision point clouds.
下面结合具体的实施例,对本申请实施例涉及的点云上采样方法进行介绍。The point cloud upsampling method involved in the embodiment of the present application will be introduced below in combination with specific embodiments.
本申请提供的点云上采样方法是使用深度学习后的生成器对点云的几何信息进行上采样的，该生成器为一段软件代码或者为具有数据处理功能的芯片。基于此，首先对生成器的训练过程进行介绍。The point cloud upsampling method provided in this application uses a deep-learning-trained generator to upsample the geometric information of the point cloud; the generator is a piece of software code or a chip with data processing functions. On this basis, the training process of the generator is introduced first.
图4为本申请一实施例提供的模型训练方法流程示意图,如图4所示,生成器的训练过程包括:Fig. 4 is a schematic flow chart of the model training method provided by an embodiment of the present application. As shown in Fig. 4, the training process of the generator includes:
S401、获取训练点云的几何信息。S401. Obtain geometric information of the training point cloud.
需要说明的是,为了便于描述,本申请实施例将用于生成器训练的点云记为训练点云。It should be noted that, for ease of description, the embodiment of the present application records the point cloud used for generator training as a training point cloud.
上述训练点云为训练集中的一个点云，该训练集中包括多个点云，其中使用训练集中的每个点云对生成器进行训练的过程一致，为了便于描述，本申请实施例以一个训练点云为例。The above training point cloud is one point cloud in a training set that includes multiple point clouds; the process of training the generator with each point cloud in the training set is the same. For ease of description, the embodiment of the present application takes one training point cloud as an example.
S402、根据训练点云的几何信息,将训练点云划分成至少一个训练点云块。S402. Divide the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud.
本申请实施例在点云上采样的过程中,是将点云划分成点云块,以点云块为对象进行点云几何信息的上采样。In the embodiment of the present application, in the process of up-sampling the point cloud, the point cloud is divided into point cloud blocks, and the point cloud geometric information is up-sampled with the point cloud blocks as objects.
在一些实施例中,上述S402中将训练点云划分为至少一个训练点云块的方式包括但不限于如下几种方式:In some embodiments, the ways of dividing the training point cloud into at least one training point cloud block in S402 include but are not limited to the following ways:
方式一,根据训练点云的几何信息,将训练点云划分成至少一个大小相等的训练点云块。也就是说每个点云块的几何尺度相同。Method 1: Divide the training point cloud into at least one training point cloud block of equal size according to the geometric information of the training point cloud. That is to say, the geometric scale of each point cloud block is the same.
方式二,根据训练点云的几何信息,将训练点云划分为至少一个训练点云块,每个训练点云块中包括相同数量个点。Method 2: Divide the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud, and each training point cloud block includes the same number of points.
方式三,根据训练点云的几何信息,从训练点云中获取至少一个种子点,例如采用蒙特卡洛随机采样法随机地从训练点云中采样指定个数的种子点。对于每个种子点,确定该种子点的邻近点,将该种子点与该种子点的邻近点划分为一个训练点云块,得到至少一个训练点云块。在该方式三中,得到训练点云块也称为点云补丁(Patch),该方式得到的训练点云块中每个训练点云块所包括的点的个数相同。Method 3: Obtain at least one seed point from the training point cloud according to the geometric information of the training point cloud, for example, randomly sample a specified number of seed points from the training point cloud by using Monte Carlo random sampling method. For each seed point, determine the neighboring points of the seed point, divide the seed point and the neighboring points of the seed point into a training point cloud block, and obtain at least one training point cloud block. In the third way, the obtained training point cloud blocks are also called point cloud patches (Patch), and the number of points included in each training point cloud block in the training point cloud blocks obtained in this way is the same.
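Method 3 above (seed sampling followed by nearest-neighbour grouping) can be sketched as follows. The function name `extract_patches`, the parameters `num_seeds` and `patch_size`, and the use of uniform random sampling in place of Monte Carlo sampling are assumptions for illustration; a naive brute-force neighbour search is used for clarity.

```python
import numpy as np

def extract_patches(points, num_seeds, patch_size, rng=None):
    """Split a point cloud (N, 3) into patches of `patch_size` points each:
    sample seed points, then take each seed's nearest neighbours."""
    rng = np.random.default_rng(rng)
    seeds = rng.choice(len(points), size=num_seeds, replace=False)
    patches = []
    for s in seeds:
        d = np.linalg.norm(points - points[s], axis=1)   # distance to the seed
        nearest = np.argsort(d)[:patch_size]             # seed itself included (d = 0)
        patches.append(points[nearest])
    return np.stack(patches)                             # (num_seeds, patch_size, 3)
```

As the text notes, every patch produced this way contains the same number of points, which keeps the tensor shapes fixed during training.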
在一些实施例中，将上述得到的训练点云块记为P∈R^(N×3)，其中N为训练点云块中所包括的点的个数，3为训练点云块的几何信息维度。In some embodiments, the training point cloud block obtained above is denoted as P∈R^(N×3), where N is the number of points included in the training point cloud block, and 3 is the dimension of the geometric information of the training point cloud block.
S403、将训练点云块的几何信息输入生成器的特征提取模块进行特征提取,得到训练点云块的第一特征信息。S403. Input the geometric information of the training point cloud block into the feature extraction module of the generator to perform feature extraction, and obtain the first feature information of the training point cloud block.
下面结合图5对本申请实施例涉及的生成器的网络结构进行介绍，需要说明的是，本申请实施例的生成器的网络结构包括但不限于图5所示的模块，还可以包括比图5更多或更少的模块。The network structure of the generator involved in the embodiment of the present application is introduced below with reference to FIG. 5. It should be noted that the network structure of the generator in the embodiment of the present application includes but is not limited to the modules shown in FIG. 5, and may also include more or fewer modules than those shown in FIG. 5.
图5为本申请实施例的生成器的一种网络示意图，如图5所示，生成器包括特征提取模块、特征上采样模块和几何生成模块，特征提取模块用于提取训练点云块的第一特征信息，特征上采样模块用于将训练点云块的第一特征信息上采样为第二特征信息，几何生成模块用于将训练点云块的第二特征信息映射至几何空间中，以得到训练点云块的上采样几何信息。FIG. 5 is a schematic diagram of a network of the generator according to an embodiment of the present application. As shown in FIG. 5, the generator includes a feature extraction module, a feature upsampling module and a geometry generation module. The feature extraction module is used to extract the first feature information of the training point cloud block, the feature upsampling module is used to upsample the first feature information of the training point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the training point cloud block into geometric space to obtain the upsampled geometric information of the training point cloud block.
如图5所示，特征提取模块用于提取每个点的有表现力特征，例如将低分辨率的训练点云块的几何信息P∈R^(N×3)输入特征提取模块，特征提取模块用于提取训练点云块中每个点有表现力的特征信息，输出训练点云块的特征信息F∈R^(N×C)，其中N为训练点云块中点的个数，3为几何信息维度，C为特征维度。As shown in FIG. 5, the feature extraction module is used to extract expressive features for each point. For example, the geometric information P∈R^(N×3) of a low-resolution training point cloud block is input into the feature extraction module, which extracts expressive feature information for each point in the training point cloud block and outputs the feature information F∈R^(N×C) of the training point cloud block, where N is the number of points in the training point cloud block, 3 is the dimension of the geometric information, and C is the feature dimension.
为了便于描述,将特征提取模块输出的特征信息记为训练点云块的第一特征信息。For the convenience of description, the feature information output by the feature extraction module is recorded as the first feature information of the training point cloud block.
在一些实施例中,本申请提出了一个基于动态图分层残差聚合(DGHRA)单元的特征提取模块,如图6A所示,上述特征提取模块包括:M个密集连接的特征提取块(Feature Extraction Block,简称FEB),即前一个FEB的输出作为后面每一个FEB的输入,具体的如图6A所示。In some embodiments, the present application proposes a feature extraction module based on a Dynamic Graph Hierarchical Residual Aggregation (DGHRA) unit, as shown in FIG. 6A , the feature extraction module includes: M densely connected feature extraction blocks (Feature Extraction Block, referred to as FEB), that is, the output of the previous FEB is used as the input of each subsequent FEB, as shown in Figure 6A.
在一些实施例中,上述S403包括如下S403-A1至S403-A4:In some embodiments, the above S403 includes the following S403-A1 to S403-A4:
S403-A1、将训练点云块的几何信息输入特征提取模块中,获取M个特征提取块中第i个特征提取块所提取的训练点云块的第i个第三特征信息。S403-A1. Input the geometric information of the training point cloud block into the feature extraction module, and acquire the i-th third feature information of the training point cloud block extracted by the i-th feature extraction block among the M feature extraction blocks.
其中,i为小于M的正整数。Wherein, i is a positive integer smaller than M.
在一些实施例中，若i=1，则本申请实施例还包括：根据训练点云块的几何信息，确定训练点云块的初始特征信息；将训练点云块的初始特征信息输入第一个特征提取块中，得到第一特征提取块所提取的训练点云块的第一个第三特征信息。In some embodiments, if i=1, the embodiment of the present application further includes: determining the initial feature information of the training point cloud block according to the geometric information of the training point cloud block; and inputting the initial feature information of the training point cloud block into the first feature extraction block to obtain the first third feature information of the training point cloud block extracted by the first feature extraction block.
S403-A2、根据训练点云块的第i个第三特征信息,得到训练点云块的第i个第四特征信息。S403-A2. Obtain the i-th fourth feature information of the training point cloud block according to the i-th third feature information of the training point cloud block.
在一种可能的实现方式中，若i不等于1，则上述S403-A2包括：获取M个特征提取块中位于第i个特征提取块之前的各特征提取块所提取的第三特征信息；将位于第i个特征提取块之前的各特征提取块所提取的第三特征信息、与第i个特征提取块所提取的第三特征信息进行级联，作为训练点云块的第i个第四特征信息。In a possible implementation, if i is not equal to 1, the above S403-A2 includes: acquiring the third feature information extracted by each feature extraction block located before the i-th feature extraction block among the M feature extraction blocks; and concatenating the third feature information extracted by each feature extraction block located before the i-th feature extraction block with the third feature information extracted by the i-th feature extraction block, as the i-th fourth feature information of the training point cloud block.
在一种可能的实现方式中,若i等于1,则M个特征提取单元中第一特征提取块所提取的第一个第三特征信息,作为训练点云块的第一个第四特征信息。In a possible implementation, if i is equal to 1, the first third feature information extracted by the first feature extraction block in the M feature extraction units is used as the first fourth feature information of the training point cloud block .
S403-A3、将训练点云块的第i个第四特征信息输入第i+1个特征提取块中,得到训练点云块的第i+1个第三特征信息。S403-A3. Input the ith fourth feature information of the training point cloud block into the i+1th feature extraction block to obtain the i+1th third feature information of the training point cloud block.
S403-A4、将训练点云块的第M个特征提取块所提取的第M个第三特征信息,作为训练点云块的第一特征信息。S403-A4. Use the Mth third feature information extracted by the Mth feature extraction block of the training point cloud block as the first feature information of the training point cloud block.
举例说明，假设M=4，即特征提取模块包括4个特征提取块FEB，如图6A所示，首先根据训练点云块的几何信息，确定训练点云块的初始特征信息；将训练点云块的初始特征信息输入第一个特征提取块中，得到第一特征提取块所提取的训练点云块的第一个第三特征信息。由于第一个特征提取块之前没有其他的特征提取块，因此将第一个第三特征信息作为第一个第四特征信息输入第二个FEB，第二个FEB根据第一个第四特征信息输出第二个第三特征信息。将第二个第三特征信息和第一个第三特征信息级联，作为第二个第四特征信息，将该第二个第四特征信息输入第三个FEB中，第三个FEB根据第二个第四特征信息输出第三个第三特征信息。将第三个第三特征信息、第二个第三特征信息和第一个第三特征信息级联，作为第三个第四特征信息，将该第三个第四特征信息输入第四个FEB中，第四个FEB根据第三个第四特征信息输出第四个第三特征信息，将该第四个第三特征信息作为训练点云块的第一特征信息。For example, assume that M=4, that is, the feature extraction module includes 4 feature extraction blocks (FEBs), as shown in FIG. 6A. First, the initial feature information of the training point cloud block is determined according to the geometric information of the training point cloud block; the initial feature information of the training point cloud block is input into the first feature extraction block to obtain the first third feature information of the training point cloud block extracted by the first feature extraction block. Since there is no other feature extraction block before the first one, the first third feature information is input into the second FEB as the first fourth feature information, and the second FEB outputs the second third feature information based on the first fourth feature information. The second third feature information and the first third feature information are concatenated as the second fourth feature information, which is input into the third FEB; the third FEB outputs the third third feature information based on the second fourth feature information. The third, second and first third feature information are concatenated as the third fourth feature information, which is input into the fourth FEB; the fourth FEB outputs the fourth third feature information based on the third fourth feature information, and the fourth third feature information is used as the first feature information of the training point cloud block.
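The dense connection pattern in the M=4 walkthrough generalizes to any M: each FEB after the first receives the concatenation of all earlier FEB outputs. A minimal sketch follows, with the FEBs abstracted as callables (the 1x1 dimension-reducing convolutions discussed below are omitted); the function name and signature are illustrative assumptions.

```python
import numpy as np

def dense_feb_forward(init_feat, febs):
    """init_feat: (N, C0) initial features derived from the block's geometry.
    febs: list of M callables; each maps an (N, C_in) array to an (N, C)
    array, i.e. that FEB's "third feature information"."""
    thirds = [febs[0](init_feat)]                 # first FEB: no predecessors
    for feb in febs[1:]:
        fourth = np.concatenate(thirds, axis=1)   # i-th fourth feature information
        thirds.append(feb(fourth))                # (i+1)-th third feature information
    return thirds[-1]                             # first feature info of the block
```

A toy FEB that records its input width shows the widths growing with depth, which is why the dimension-reducing convolutions described next become useful.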
在一些实施例中，如图6A所示，随着特征提取模块网络的深入，位于网络深处的FEB的特征维度过多时，为了降低网络的训练复杂度，则在FEB之间设置有卷积网络，例如设置卷积核为1X1的卷积网络，以降低输入FEB中的特征维度。In some embodiments, as shown in FIG. 6A, as the feature extraction module network deepens, when an FEB located deep in the network has too many feature dimensions, a convolutional network is set between FEBs in order to reduce the training complexity of the network, for example a convolutional network with a 1X1 convolution kernel, so as to reduce the dimension of the features input into the FEB.
在一些实施例中，如图6B所示，每个特征提取块FEB包括第一特征提取单元和串联连接的至少一个第二特征提取单元，其中，第一特征提取单元为动态图分层残差聚合(Dynamic graph hierarchical residual aggregation，DGHRA)单元，第二特征提取单元为分层残差聚合块(Hierarchical residual aggregation，简称HRA)，用于提取更多细节特征。各FEB的处理过程相同，且相互之间迭代进行，为了便于描述，以第i+1个特征提取块为例，此时上述S403-A3包括S403-A31至S403-A32：In some embodiments, as shown in FIG. 6B, each feature extraction block FEB includes a first feature extraction unit and at least one second feature extraction unit connected in series, where the first feature extraction unit is a dynamic graph hierarchical residual aggregation (DGHRA) unit, and the second feature extraction unit is a hierarchical residual aggregation (HRA) block used to extract more detailed features. The processing of each FEB is the same and the FEBs iterate on one another; for ease of description, the i+1-th feature extraction block is taken as an example. In this case, the above S403-A3 includes S403-A31 to S403-A32:
S403-A31、将训练点云块的第i个第四特征信息输入第i+1个特征提取块中的第一特征提取单元，以使第一特征提取单元针对训练点云块中的当前点，搜索当前点的K个邻近点，并基于第i个第四特征信息，将当前点的第四特征信息与邻近点的第四特征信息进行相减，得到K个残差特征信息；将K个残差特征信息与当前点的第四特征信息进行级联，得到当前点的第i个级联特征信息，并根据当前点的第i个级联特征信息，得到训练点云块的第i个级联特征信息。S403-A31. Input the i-th fourth feature information of the training point cloud block into the first feature extraction unit in the i+1-th feature extraction block, so that, for the current point in the training point cloud block, the first feature extraction unit searches for K neighboring points of the current point and, based on the i-th fourth feature information, subtracts the fourth feature information of the neighboring points from the fourth feature information of the current point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point, and obtain the i-th concatenated feature information of the training point cloud block according to the i-th concatenated feature information of each point.
例如，训练点云块的第i个第四特征信息的大小为NXC，将大小为NXC的训练点云块的第i个第四特征信息输入第一特征提取单元中，针对训练点云块中的当前点，第一特征提取单元搜索当前点的K个邻近点，例如通过特征空间最近邻搜索方法动态地搜索当前点的K个邻近点。接着，从训练点云块的第i个第四特征信息中获取当前点的第四特征信息，并将当前点的第四特征信息复制K份，将当前点的第四特征信息与K个邻近点中每个邻近点的第四特征信息进行相减，得到K个残差特征信息，大小为1XKXC。将当前点的K个残差特征信息与当前点的第四特征信息进行级联，得到当前点的第i个级联特征信息。参照上述方式可以得到训练点云块中每个点的第i个级联特征信息，进而得到训练点云块的第i个级联特征信息，其大小为NXKX2C。For example, the size of the i-th fourth feature information of the training point cloud block is NXC. The i-th fourth feature information of size NXC is input into the first feature extraction unit. For the current point in the training point cloud block, the first feature extraction unit searches for K neighboring points of the current point, for example, dynamically searching for the K neighboring points through a nearest neighbor search in feature space. Then, the fourth feature information of the current point is obtained from the i-th fourth feature information of the training point cloud block and copied K times, and the fourth feature information of each of the K neighboring points is subtracted from the fourth feature information of the current point to obtain K pieces of residual feature information of size 1XKXC. The K pieces of residual feature information of the current point are concatenated with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point. In the above manner, the i-th concatenated feature information of each point in the training point cloud block can be obtained, and then the i-th concatenated feature information of the training point cloud block, whose size is NXKX2C.
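The feature-space kNN grouping and residual concatenation just described can be sketched as follows. A naive O(N^2) neighbour search is used for clarity, and the function name and the subtraction direction (current point minus neighbour) are reading choices made for illustration rather than details fixed by the text.

```python
import numpy as np

def dghra_group(feat, k):
    """feat: (N, C) per-point fourth feature information.
    Returns an (N, K, 2C) tensor: for each point, K neighbour residuals
    concatenated with K copies of the point's own feature, with neighbours
    searched dynamically in feature space."""
    n, c = feat.shape
    # pairwise squared distances in feature space
    d2 = ((feat[:, None, :] - feat[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)               # exclude the point itself
    idx = np.argsort(d2, axis=1)[:, :k]        # (N, K) neighbour indices
    neighbours = feat[idx]                     # (N, K, C)
    residual = feat[:, None, :] - neighbours   # current point minus each neighbour
    centre = np.broadcast_to(feat[:, None, :], (n, k, c))  # K copies of own feature
    return np.concatenate([residual, centre], axis=-1)     # (N, K, 2C)
```

The output shape NXKX2C matches the size stated above: K residuals of width C plus K copies of the C-wide centre feature.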
S403-A32、将训练点云块的第i个级联特征信息输入第i+1个特征提取块中的第一个第二特征提取单元，得到第一个第五特征信息，并将第一个第五特征信息输入第i+1个特征提取块中的第二个第二特征提取单元中，得到第二个第五特征信息，依次进行，将第i+1个特征提取块中最后一个第二特征提取单元提取的第五特征信息，作为训练点云块的第i+1个第三特征信息。S403-A32. Input the i-th concatenated feature information of the training point cloud block into the first second feature extraction unit in the i+1-th feature extraction block to obtain the first fifth feature information, input the first fifth feature information into the second second feature extraction unit in the i+1-th feature extraction block to obtain the second fifth feature information, and so on; the fifth feature information extracted by the last second feature extraction unit in the i+1-th feature extraction block is used as the i+1-th third feature information of the training point cloud block.
例如，第i+1个特征提取块包括3个第二特征提取单元，将上述训练点云块大小为NXKX2C的第i个级联特征信息输入第i+1个特征提取块中的第一个第二特征提取单元，该第一个第二特征提取单元对训练点云块的细节特征进行提取，输出训练点云块的第一个第五特征信息，并将第一个第五特征信息输入第二个第二特征提取单元，第二个第二特征提取单元根据第一个第五特征信息输出训练点云块的第二个第五特征信息，并将第二个第五特征信息输入第三个第二特征提取单元，第三个第二特征提取单元根据第二个第五特征信息输出训练点云块的第三个第五特征信息，将该第三个第五特征信息作为训练点云块的第i+1个第三特征信息。For example, the i+1-th feature extraction block includes 3 second feature extraction units. The i-th concatenated feature information of the training point cloud block, of size NXKX2C, is input into the first second feature extraction unit in the i+1-th feature extraction block; this unit extracts detailed features of the training point cloud block and outputs the first fifth feature information, which is input into the second second feature extraction unit. The second unit outputs the second fifth feature information based on the first fifth feature information, which is input into the third second feature extraction unit; the third unit outputs the third fifth feature information based on the second fifth feature information, and this third fifth feature information is used as the i+1-th third feature information of the training point cloud block.
可选的,该第i+1个第三特征信息的大小为NXC。Optionally, the size of the i+1th third feature information is NXC.
在一些实施例中,如图6C所示,第二特征提取单元包括P个残差块(Residual block,简称RB),P为正整数,此时,上述S403-A32包括:In some embodiments, as shown in FIG. 6C, the second feature extraction unit includes P residual blocks (Residual block, RB for short), and P is a positive integer. At this time, the above S403-A32 includes:
S403-A321、将第i个级联特征信息输入第i+1个特征提取块中的第一个第二特征提取单元，获得第一个第二特征提取单元中第j个残差块输出的第一残差信息，j为小于P的正整数。S403-A321. Input the i-th concatenated feature information into the first second feature extraction unit in the i+1-th feature extraction block, and obtain the first residual information output by the j-th residual block in the first second feature extraction unit, where j is a positive integer less than P.
S403-A322、将第j个残差块输出的第一残差信息和第i个级联特征信息输入第一个第二特征提取单元中的第j+1个残差块中，得到第j+1个残差块输出的第一残差信息。S403-A322. Input the first residual information output by the j-th residual block and the i-th concatenated feature information into the j+1-th residual block in the first second feature extraction unit to obtain the first residual information output by the j+1-th residual block.
在一种可能的实现方式中，对第j个残差块输出的第一残差信息和第i个级联特征信息进行相加，并将相加后的特征信息输入第j+1个残差块中，得到第j+1个残差块输出的第一残差信息。In a possible implementation, the first residual information output by the j-th residual block and the i-th concatenated feature information are added, and the added feature information is input into the j+1-th residual block to obtain the first residual information output by the j+1-th residual block.
S403-A323、根据第一个第二特征提取单元中的P个残差块中至少一个残差块输出的第一残差信息，以及第i个级联特征信息，确定第一个第二特征提取单元输出的第五特征信息。S403-A323. Determine the fifth feature information output by the first second feature extraction unit according to the first residual information output by at least one of the P residual blocks in the first second feature extraction unit and the i-th concatenated feature information.
接着，将第一个第二特征提取单元输出的第五特征信息输入第二个第二特征提取单元，依次进行，直到最后一个第二特征提取单元执行完为止；将第i+1个特征提取块中最后一个第二特征提取单元输出的第五特征信息，确定为训练点云块的第i+1个第三特征信息。Then, the fifth feature information output by the first second feature extraction unit is input into the second second feature extraction unit, and so on, until the last second feature extraction unit has been executed; the fifth feature information output by the last second feature extraction unit in the i+1-th feature extraction block is determined as the i+1-th third feature information of the training point cloud block.
需要说明的是,本申请实施例对残差块的具体网络结构不做限制。It should be noted that the embodiment of the present application does not limit the specific network structure of the residual block.
在一些可能的实现方式中，残差块的网络结构如图6D所示，残差块包括多个带有线性整流函数(Relu)的线性层，本申请实施例的残差块用于特征挖掘，帮助网络收敛。In some possible implementations, the network structure of the residual block is shown in FIG. 6D. The residual block includes multiple linear layers with rectified linear unit (ReLU) activations; the residual block in the embodiment of the present application is used for feature mining and helps the network converge.
举例说明，假设上述P=4，即第二特征提取单元包括4个残差块，将第i个级联特征信息输入第二特征提取单元HRA中的第一个残差块RB中，该第一个RB输出第一残差信息1，将该第一残差信息1和第i个级联特征信息相加输入第二个RB，得到第二个RB输出的第一残差信息2，将第一残差信息2和第i个级联特征信息相加输入第三个RB，得到第三个RB输出的第一残差信息3，将第一残差信息3和第i个级联特征信息相加输入第四个RB，得到第四个RB输出的第一残差信息4，进而根据各RB输出的第一残差信息以及第i个级联特征信息，确定该第一个第二特征提取单元输出的第五特征信息。For example, assume that P=4, that is, the second feature extraction unit includes 4 residual blocks. The i-th concatenated feature information is input into the first residual block RB in the second feature extraction unit HRA, and the first RB outputs first residual information 1. The first residual information 1 and the i-th concatenated feature information are added and input into the second RB to obtain first residual information 2 output by the second RB; the first residual information 2 and the i-th concatenated feature information are added and input into the third RB to obtain first residual information 3 output by the third RB; the first residual information 3 and the i-th concatenated feature information are added and input into the fourth RB to obtain first residual information 4 output by the fourth RB. Then, the fifth feature information output by this first second feature extraction unit is determined according to the first residual information output by each RB and the i-th concatenated feature information.
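The RB chaining in the example above can be sketched generically with the residual blocks abstracted as callables. This sketch follows Mode 1 of S403-A323 below (last residual plus the block input); the function name and signature are illustrative assumptions.

```python
import numpy as np

def hra_forward(x, rbs):
    """x: the i-th concatenated feature information (block input).
    rbs: list of P callables, one per residual block."""
    res = rbs[0](x)                # first RB consumes the block input directly
    for rb in rbs[1:]:
        res = rb(res + x)          # S403-A322: previous residual + block input
    return res + x                 # Mode 1: last residual + block input
```

With toy RBs that double their input and an all-ones input, the residuals are 2, 6 and 14, so the output is 15.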
上述S403-A323中根据第一个第二特征提取单元中的P个残差块中至少一个残差块输出的特征信息，以及第i个级联特征信息，确定第一个第二特征提取单元输出的第五特征信息的方式包括但不限于如下几种：In the above S403-A323, the manner of determining the fifth feature information output by the first second feature extraction unit according to the feature information output by at least one of the P residual blocks in the first second feature extraction unit and the i-th concatenated feature information includes but is not limited to the following:
方式一,将最后一个残差块输出的第一残差信息与第i个级联特征信息进行相加,作为第一个第二特征提取单元输出的第五特征信息。Way 1: add the first residual information output by the last residual block to the i-th concatenated feature information, and use it as the fifth feature information output by the first second feature extraction unit.
方式二,上述S403-A323包括: Method 2, the above S403-A323 includes:
步骤B1、将第一个第二特征提取单元中P个残差块中最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息进行级联，其中P-1个残差块为P个残差块中除最后一个残差块之外的残差块；Step B1. Concatenate the first residual information output by the last residual block among the P residual blocks in the first second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, where the P-1 residual blocks are the residual blocks among the P residual blocks other than the last residual block;
例如,如图6E所示,将P个残差块中最后一个残差块输出的第一残差信息、与P-1个残差块中每一个残差块输出的第一残差信息进行级联,得到级联后的特征信息。For example, as shown in FIG. 6E, the first residual information output by the last residual block in the P residual blocks is compared with the first residual information output by each residual block in the P-1 residual blocks. Cascade to obtain the feature information after cascading.
步骤B2、根据级联后的特征信息和第i个级联特征信息,确定第一个第二特征提取单元输出的第五特征信息。Step B2, according to the concatenated feature information and the i-th concatenated feature information, determine the fifth feature information output by the first second feature extraction unit.
上述步骤B2的实现方式包括但不限于如下几种:The implementation methods of the above step B2 include but are not limited to the following:
方式一,将级联后的特征信息和第i个级联特征信息进行相加,作为第一个第二特征提取单元输出的第五特征信息。Way 1: add the concatenated feature information to the i-th concatenated feature information, and use it as the fifth feature information output by the first second feature extraction unit.
方式二,如图6F所示,第二特征提取单元还包括门控单元,此时上述步骤B2包括将级联后的特征信息输入门控单元进行去冗余,得到去冗余后的特征信息;将去冗余后的特征信息与第i个级联特征信息进行相加,作为第一个第二特征提取单元输出的第五特征信息。 Mode 2, as shown in Figure 6F, the second feature extraction unit also includes a gating unit, and at this time the above step B2 includes inputting the cascaded feature information into the gating unit for de-redundancy, and obtaining the de-redundant feature information ; Add the feature information after de-redundancy to the i-th cascaded feature information, and use it as the fifth feature information output by the first second feature extraction unit.
The residual blocks of the embodiments of the present application provide detailed residual information, and the detailed residual information obtained by each residual block is input into the gating unit to collect more feature details, enabling the network to learn fully.
The embodiments of the present application do not limit the network structure of the above gating unit.
In a possible implementation, as shown in FIG. 6G, the gating unit consists of a squeeze-and-excitation network (SE-Net) and one linear layer, where the SE-Net includes a global average pooling layer and a fully connected (FC) layer. The SE-Net performs global average pooling on feature information of size N×K×C to obtain feature information of size 1×1×C; after the fully connected layer, feature information of size 1×1×C is obtained. The 1×1×C feature information processed by the fully connected layer is multiplied channel-wise with the N×K×C feature information to obtain de-redundant feature information of size N×K×C. Finally, the de-redundant N×K×C feature information is input into the linear layer, which outputs the de-redundant feature information.
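The channel-gating path just described can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the sigmoid on the excitation branch and the single-matrix FC and linear layers are stand-ins the patent does not specify.

```python
import numpy as np

rng = np.random.default_rng(1)

def gating_unit(feat, w_fc, w_linear):
    # feat: concatenated feature information, shape (N, K, C)
    # Squeeze: global average pooling over the N and K axes -> (1, 1, C)
    squeezed = feat.mean(axis=(0, 1), keepdims=True)
    # Excitation: FC layer producing per-channel weights (sigmoid assumed)
    excite = 1.0 / (1.0 + np.exp(-(squeezed @ w_fc)))      # (1, 1, C)
    # Channel-wise multiplication re-weights the (N, K, C) features
    dered = feat * excite
    # Final linear layer on the de-redundant features
    return dered @ w_linear

N, K, C = 4, 6, 8
feat = rng.standard_normal((N, K, C))
w_fc = rng.standard_normal((C, C)) * 0.1
w_linear = rng.standard_normal((C, C)) * 0.1
out = gating_unit(feat, w_fc, w_linear)
print(out.shape)  # (4, 6, 8)
```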
The above describes in detail, with reference to the network structure of the feature extraction module, the process of inputting the geometric information of a training point cloud block into the feature extraction module to obtain the first feature information of the training point cloud block. After the first feature information of the training point cloud block is obtained, the following S404 is performed to up-sample the first feature information to obtain the second feature information of the training point cloud block.
S404: input the first feature information of the training point cloud block into the feature up-sampling module of the generator for up-sampling to obtain the second feature information of the training point cloud block.
As shown in FIG. 5, the feature up-sampling module is used to up-sample the first feature information of the training point cloud block to obtain the second feature information of the training point cloud block. For example, the first feature information F of size N×C of the training point cloud block is up-sampled into the second feature information F_up of size rN×C′, where r is a preset sampling rate (a positive integer) and C′ is the feature dimension of the second feature information of the training point cloud block after up-sampling.
In some embodiments, as shown in FIG. 7A, the feature up-sampling module includes a feature up-sampling sub-module and a feature extraction sub-module. The feature up-sampling sub-module is used to up-sample the first feature information of the training point cloud block to obtain up-sampled feature information of the training point cloud block. The feature extraction sub-module is used to perform feature extraction on the up-sampled feature information of the training point cloud block to obtain expressive features of the training point cloud block, and these expressive features are used as the second feature information of the training point cloud block.
Based on FIG. 7A above, the above S404 includes S404-A1 and S404-A2:
S404-A1: input the first feature information of the training point cloud block into the feature up-sampling sub-module, so that the feature up-sampling sub-module copies the first feature information of the training point cloud block r times according to the preset up-sampling rate r and appends an n-dimensional vector to each copy in the feature dimension, obtaining the up-sampled feature information of the training point cloud block, where the n-dimensional vectors corresponding to different copies of the first feature information have different values;
S404-A2: input the up-sampled feature information of the training point cloud block into the feature extraction sub-module to obtain the second feature information of the training point cloud block extracted by the feature extraction sub-module.
For example, the first feature information F of the training point cloud block is copied r times, and an n-dimensional vector is appended to the feature dimension of each copy so that the copies are clearly differentiated; the feature dimension of each point is then C+n. For example, with n=2, a 2-dimensional vector is appended to the feature dimension of each point, where the values of the vector are distributed at equal intervals, e.g., at equal intervals from -0.2 to 0.2; the feature dimension of each point is then C+2. The feature information with feature dimension C+n is recorded as the up-sampled feature information of the training point cloud block. Next, the up-sampled feature information of the training point cloud block is input into the feature extraction sub-module for detail feature extraction to obtain the second feature information of the training point cloud block.
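The copy-and-tag up-sampling in S404-A1 can be sketched as follows. This is an illustrative NumPy sketch; giving every one of the n appended dimensions the same per-copy value is an assumption, since the patent only requires that the appended vectors differ between copies.

```python
import numpy as np

def upsample_features(feats, r, n=2, lo=-0.2, hi=0.2):
    # feats: first feature information, shape (N, C)
    # Copy each feature r times and append an n-dim vector whose values
    # differ between copies (equally spaced in [lo, hi], as in the example).
    N, C = feats.shape
    grid = np.linspace(lo, hi, r)                          # r distinct offsets
    copies = np.repeat(feats[None, :, :], r, axis=0)       # (r, N, C)
    tags = np.broadcast_to(grid[:, None, None], (r, N, n)) # per-copy tag (assumption)
    out = np.concatenate([copies, tags], axis=2)           # (r, N, C + n)
    return out.reshape(r * N, C + n)                       # (rN, C + n)

feats = np.zeros((5, 3))          # N=5 points with C=3 features
up = upsample_features(feats, r=4)
print(up.shape)  # (20, 5)
```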
In some embodiments, as shown in FIG. 7B, the feature up-sampling module further includes a first self-correlation attention network. In this case, the above S404-A2 includes: inputting the up-sampled feature information of the training point cloud block into the first self-correlation attention network for feature interaction to obtain the up-sampled feature information of the training point cloud block after feature interaction; and inputting the up-sampled feature information of the training point cloud block after feature interaction into the feature extraction sub-module for feature extraction to obtain the second feature information of the training point cloud block.
Optionally, the feature dimension of the up-sampled feature information of the training point cloud block after feature interaction is the same as the feature dimension of the up-sampled feature information of the training point cloud block. That is, the first self-correlation attention network is used for feature interaction, enabling the network to learn more detailed features.
Optionally, since the feature dimension of the up-sampled feature information is C+n, and the value of C+n is relatively large (e.g., around 700), network training with data of such a large feature dimension is inefficient or even infeasible. Based on this, the first self-correlation attention network of the embodiments of the present application has a dimensionality-reduction function to reduce the feature dimension of the feature information, so that the feature dimension of the up-sampled feature information of the training point cloud block after feature interaction is lower than the feature dimension of the up-sampled feature information of the training point cloud block. That is, the first self-correlation attention network is used not only for feature interaction but also to reduce the feature dimension, so as to reduce the training complexity of the network and thereby improve its training speed.
In some embodiments, as shown in FIG. 7C, the feature extraction sub-module includes Q third feature extraction units connected in series, where Q is a positive integer and each third feature extraction unit performs the same feature extraction process. In this case, the above S404-A2 includes:
S404-A21: input the up-sampled feature information of the training point cloud block into the feature extraction sub-module to obtain the k-th enhanced up-sampled feature information of the training point cloud block extracted by the k-th third feature extraction unit;
S404-A22: input the k-th enhanced up-sampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit for feature extraction to obtain the (k+1)-th enhanced up-sampled feature information of the training point cloud block;
S404-A23: use the Q-th enhanced up-sampled feature information of the training point cloud block, extracted by the last of the Q third feature extraction units, as the second feature information of the training point cloud block.
For example, with Q=3, the up-sampled feature information of the training point cloud block is input into the first third feature extraction unit for feature extraction to obtain the 1st enhanced up-sampled feature information of the training point cloud block; the 1st enhanced up-sampled feature information is input into the second third feature extraction unit for feature extraction to obtain the 2nd enhanced up-sampled feature information of the training point cloud block; the 2nd enhanced up-sampled feature information is input into the third third feature extraction unit for feature extraction to obtain the 3rd enhanced up-sampled feature information of the training point cloud block; and the 3rd enhanced up-sampled feature information of the training point cloud block is recorded as the second feature information of the training point cloud block.
In some embodiments, the third feature extraction unit has the same network structure as the second feature extraction unit.
In some embodiments, the network structures of the third feature extraction unit and the second feature extraction unit are not completely the same.
In some embodiments, the third feature extraction unit includes L residual blocks, where L is a positive integer. In this case, S404-A22 includes:
S404-A221: input the k-th enhanced up-sampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit to obtain the second residual information output by the l-th residual block in the (k+1)-th third feature extraction unit, where l is a positive integer less than or equal to L;
S404-A222: input the second residual information output by the l-th residual block and the k-th enhanced up-sampled feature information into the (l+1)-th residual block to obtain the second residual information output by the (l+1)-th residual block;
For example, the second residual information output by the l-th residual block and the k-th enhanced up-sampled feature information are added, and the sum is input into the (l+1)-th residual block to determine the second residual information output by the (l+1)-th residual block.
S404-A223: obtain the (k+1)-th enhanced up-sampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced up-sampled feature information.
It should be noted that the embodiments of the present application do not limit the specific network structure of the residual block.
In some possible implementations, the network structure of the residual block is shown in FIG. 6D: the residual block includes multiple linear layers with rectified linear units (ReLU). The residual blocks of the embodiments of the present application are used for feature mining and help the network converge.
For example, assume L = 4, i.e., the third feature extraction unit contains 4 residual blocks. The k-th enhanced up-sampled feature information of the training point cloud block is input into the first residual block (RB) of the (k+1)-th third feature extraction unit, and the first RB outputs second residual information 1. The sum of second residual information 1 and the k-th enhanced up-sampled feature information is input into the second RB, which outputs second residual information 2. The sum of second residual information 2 and the k-th enhanced up-sampled feature information is input into the third RB, which outputs second residual information 3. The sum of second residual information 3 and the k-th enhanced up-sampled feature information is input into the fourth RB, which outputs second residual information 4. Then, according to the second residual information output by each RB and the k-th enhanced up-sampled feature information, the (k+1)-th enhanced up-sampled feature information of the training point cloud block is obtained.
In the above S404-A223, the ways of obtaining the (k+1)-th enhanced up-sampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced up-sampled feature information include but are not limited to the following:
Mode 1: add the second residual information output by the last of the L residual blocks to the k-th enhanced up-sampled feature information, and use the sum as the (k+1)-th enhanced up-sampled feature information of the training point cloud block.
Mode 2: the above S404-A223 includes steps C1 and C2:
Step C1: concatenate the second residual information output by the last of the L residual blocks with the second residual information output by at least one of the L-1 residual blocks, where the L-1 residual blocks are the L residual blocks other than the last residual block.
For example, as shown in FIG. 7E, the second residual information output by the last of the L residual blocks is concatenated with the second residual information output by each of the L-1 residual blocks to obtain concatenated feature information.
Step C2: determine the (k+1)-th enhanced up-sampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced up-sampled feature information.
Implementations of the above step C2 include but are not limited to the following:
Mode 1: add the concatenated feature information to the k-th enhanced up-sampled feature information, and use the sum as the (k+1)-th enhanced up-sampled feature information of the training point cloud block.
Mode 2: the third feature extraction unit further includes a gating unit. In this case, the above step C2 includes: inputting the concatenated feature information into the gating unit for de-redundancy to obtain de-redundant feature information; and adding the de-redundant feature information to the k-th enhanced up-sampled feature information to obtain the (k+1)-th enhanced up-sampled feature information of the training point cloud block.
FIG. 7D is a schematic diagram of a specific network structure of the feature up-sampling module provided by the embodiments of the present application. As shown in FIG. 7D, the feature up-sampling sub-module up-samples the first feature information of size N×C into up-sampled feature information of size rN×(C+2); the up-sampled feature information of size rN×(C+2) is input into the first self-correlation attention network (self-attention) to obtain the up-sampled feature information after feature interaction; the up-sampled feature information after feature interaction is input into the feature extraction sub-module, which includes multiple third feature extraction units (optionally with the same network structure as the second feature extraction unit HRA); and the second feature information of the training point cloud block is output.
The embodiments of the present application do not limit the network structure of the above gating unit.
In a possible implementation, the network structure of the above gating unit is shown in FIG. 6G; for details, refer to the description of S403 above.
The above describes in detail, with reference to the network structure of the feature up-sampling module, the process of inputting the first feature information of the training point cloud block into the feature up-sampling module to obtain the second feature information of the training point cloud block. After the second feature information of the training point cloud block is obtained, the following S405 is performed to spatially convert the second feature information to obtain the up-sampled geometric information of the training point cloud block.
S405: input the second feature information of the training point cloud block into the geometry generation module of the generator to obtain the up-sampled geometric information of the training point cloud block.
The function of the geometry generation module of the embodiments of the present application is to remap the up-sampled second feature information F_up of size rN×C′ of the training point cloud block from the feature space back to the geometric space, finally obtaining the up-sampled point cloud; that is, F_up is regressed to the geometric space to obtain the up-sampled geometric information of size rN×3 of the training point cloud block, where 3 refers to the dimension of the geometric information and rN is the number of points included in the training point cloud block after up-sampling.
The embodiments of the present application do not limit the specific network structure of the geometry generation module.
In some embodiments, the geometry generation module includes multiple fully connected layers. The above S405 then includes: inputting the second feature information of the training point cloud block into the multiple fully connected layers for spatial conversion to obtain the up-sampled geometric information of the training point cloud block.
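The feature-space-to-geometry mapping through a chain of fully connected layers can be sketched as follows. This NumPy sketch is illustrative only: the number of layers, hidden sizes, and ReLU activations on the hidden layers are assumptions; the patent only requires that the final output be rN×3 coordinates.

```python
import numpy as np

rng = np.random.default_rng(4)

def geometry_reconstruction(f_up, weights):
    # f_up: second feature information, shape (rN, C'); the chain of
    # fully connected layers maps it to 3-D coordinates, shape (rN, 3).
    x = f_up
    for w in weights[:-1]:
        x = np.maximum(x @ w, 0.0)   # hidden FC layers with ReLU (assumed)
    return x @ weights[-1]           # last layer outputs 3-D geometry

rN, Cp = 24, 32
f_up = rng.standard_normal((rN, Cp))
weights = [rng.standard_normal((32, 16)) * 0.1,
           rng.standard_normal((16, 3)) * 0.1]
coords = geometry_reconstruction(f_up, weights)
print(coords.shape)  # (24, 3)
```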
In some embodiments, directly outputting the up-sampled geometric information cannot generate a uniformly distributed point cloud well or suppress noise at the boundary. To solve this technical problem, the present application up-samples the point cloud by a factor of r+m; then generates the up-sampled geometric information through the FC layers; then uses a high-pass graph filter to explicitly remove multiple high-frequency points (i.e., noise points) in each up-sampled patch; and finally down-samples the point cloud to r times through the farthest point sampling (FPS) algorithm to output the up-sampled geometric information of size rN×3.
Based on this, as shown in FIG. 8, the geometry generation module includes a geometry reconstruction unit, a filtering unit, and a down-sampling unit, where the geometry reconstruction unit includes multiple fully connected layers. In this case, the above S405 includes:
S405-A1: input the second feature information of the training point cloud block into the geometry reconstruction unit for geometric reconstruction to obtain the initial up-sampled geometric information of the training point cloud block.
S405-A2: input the initial up-sampled geometric information of the training point cloud block into the filtering unit for denoising to obtain the denoised initial up-sampled geometric information of the training point cloud block.
Optionally, the filtering unit may be a high-pass graph filter that explicitly removes multiple, e.g., 5, high-frequency points (i.e., noise points) in each up-sampled patch.
S405-A3: input the denoised initial up-sampled geometric information of the training point cloud block into the down-sampling unit for down-sampling to obtain the predicted up-sampled geometric information of the training point cloud block.
For example, finally, through the farthest point sampling (FPS) algorithm, the denoised initial up-sampled geometric information of the training point cloud block is down-sampled from r+m times to r times, outputting the up-sampled geometric information of size rN×3, where m is a positive integer, e.g., m=2. That is, the up-sampling rate corresponding to the up-sampled geometric information of the training point cloud block is less than or equal to the up-sampling rate of the feature up-sampling module.
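The denoise-then-down-sample pipeline of S405-A2 and S405-A3 can be sketched in NumPy as follows. This is illustrative only: using each point's distance from the centroid of its k nearest neighbours as the high-frequency response is a stand-in for the high-pass graph filter, and the greedy FPS below is the standard formulation, whose starting point and metric the patent does not spell out.

```python
import numpy as np

def remove_high_frequency_points(points, k=8, n_remove=5):
    # Stand-in for the high-pass graph filter: score each point by its
    # distance from the centroid of its k nearest neighbours and drop
    # the n_remove highest-scoring points as noise.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]
    centroids = points[nbrs].mean(axis=1)
    response = np.linalg.norm(points - centroids, axis=1)
    return points[np.argsort(response)[:-n_remove]]

def farthest_point_sampling(points, m):
    # Greedy FPS: repeatedly pick the point farthest from those chosen.
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

# (r+m)*N points in, denoise, then keep r*N points.
N, r, m = 10, 4, 2
patch = np.random.default_rng(5).standard_normal(((r + m) * N, 3))  # 60 points
denoised = remove_high_frequency_points(patch)                      # 55 points
final = farthest_point_sampling(denoised, r * N)                    # 40 points
print(final.shape)  # (40, 3)
```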
S406: train the feature extraction module, feature up-sampling module, and geometry generation module in the generator according to the predicted up-sampled geometric information of the training point cloud block to obtain the trained generator.
In the embodiments of the present application, the implementations of the above S406 include but are not limited to the following:
Mode 1: reversely train the feature extraction module, feature up-sampling module, and geometry generation module in the generator according to the loss between the predicted up-sampled geometric information of the training point cloud block and the up-sampled ground truth of the geometric information of the training point cloud block, obtaining the trained generator.
It should be noted that the training process of the embodiments of the present application is an iterative process; each training iteration is the same, and in each iteration the parameters of the generator (e.g., the weight matrices) are updated once until the model training end condition is reached.
Optionally, the model training end condition includes the number of training iterations reaching a preset number, the prediction error of the generator reaching a preset value, or the like.
FIG. 9 is a schematic diagram of the training process of the generator according to the embodiments of the present application. As shown in FIG. 9, the geometric information of the training point cloud block is input into the generator to obtain the predicted up-sampled geometric information of the training point cloud block output by the generator. According to the predicted up-sampled geometric information of the training point cloud block and the up-sampled ground truth of its geometric information, the parameters of the feature extraction module, feature up-sampling module, and geometry generation module in the generator are adjusted; for example, the parameter matrices of the feature extraction module, feature up-sampling module, and geometry generation module are updated according to the loss between the up-sampled geometric information of the training point cloud block and the up-sampled ground truth of its geometric information, so as to obtain the trained generator.
The up-sampled ground truth of the geometric information of the training point cloud can be understood as data already included in the training data, obtained by up-sampling the geometric information of the training point cloud.
Optionally, the resolution of the up-sampled ground truth of the geometric information of the training point cloud is lower than that of the up-sampled geometric information of the training point cloud output by the generator.
Mode 2: train the generator with the aid of a discriminator. In this case, the above S406 includes:
S406-A1: input the predicted up-sampled geometric information of the training point cloud block into the discriminator to obtain a first discrimination result of the discriminator, where the discriminator is used to judge whether the data input into the discriminator is the up-sampled ground truth of the training point cloud block.
S406-A2: train the feature extraction module, feature up-sampling module, and geometry generation module in the generator according to the first discrimination result of the discriminator to obtain the trained generator.
The discriminator may be a piece of software code or a chip with data processing functions.
FIG. 10 is another schematic diagram of the training process of the generator according to the embodiments of the present application. As shown in FIG. 10, the geometric information of the training point cloud block is input into the generator to obtain the up-sampled geometric information of the training point cloud block output by the generator; then, the up-sampled geometric information of the training point cloud block is input into the discriminator to obtain the first discrimination result output by the discriminator. According to the first discrimination result of the discriminator, the parameter matrices of the feature extraction module, feature up-sampling module, and geometry generation module in the generator are adjusted, so as to train the generator.
Specifically, the training point cloud is divided into at least one training point cloud block, and for each of the at least one training point cloud block, the geometric information of the training point cloud block is input into the generator, which up-samples the geometric information of the training point cloud block to obtain the predicted up-sampled geometric information of the training point cloud block. After up-sampling, the geometric information of the training point cloud block becomes a dense training point cloud block. This dense training point cloud block should have a geometric distribution consistent with the up-sampled ground truth of the training point cloud block; that is, if the generator is highly accurate, the geometric distribution of the up-sampled training point cloud block should be close to the geometric distribution of the up-sampled ground truth of the training point cloud block.
Based on this, the embodiments of the present application input the predicted up-sampled geometric information of the training point cloud block into the discriminator, so that the discriminator judges whether the data input into it is the up-sampled ground truth of the training point cloud block and outputs the first discrimination result. When the first discrimination result is a first value, e.g., 0, the discriminator judges the input data to be an up-sampled training point cloud block, indicating that the generator is not yet fully trained, and the parameter matrices of the generator are reversely adjusted. When the first discrimination result is a second value, e.g., 1, the discriminator judges the input data to be the up-sampled ground truth of the training point cloud block, indicating that the training of the generator is complete, and the trained generator is then used to up-sample the geometric information of point clouds.
In some embodiments, the above S406-A2 includes:
S406-A21. Determine a first loss of the generator according to the first discrimination result.
This embodiment of the present application does not limit the specific type of loss function used to determine the first loss from the first discrimination result.
In a possible implementation, the first loss of the generator is determined from the first discrimination result using a least-squares loss function.
For example, the first loss of the generator is determined according to the following formula (1):

L_gen(P_up) = (1/2)[D(P_up) - 1]^2  (1)

where L_gen(P_up) is the first loss, P_up is the upsampled geometric information of the training point cloud block, and D(P_up) is the first discrimination result output by the discriminator when the upsampled geometric information of the training point cloud block is input into it.
S406-A22. Determine the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the first loss.
In the above S406-A22, the ways of determining the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module according to the first loss include, but are not limited to, the following:
Method 1: determine the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module based on the first loss. For example, when the first loss is greater than a preset value, the accuracy of the generator has not reached the preset requirement, and the parameter matrices of the three modules are adjusted by back-propagation. If the first loss is less than the preset value, the accuracy of the generator meets the preset requirement, and the current parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module are fixed.
Method 2: the above S406-A22 includes the following steps:
Step A1. Determine at least one second loss of the generator;
Step A2. Determine a target loss of the generator according to the first loss of the generator and the at least one second loss of the generator;
Step A3. Determine the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the target loss of the generator.
In Method 2, in order to further improve the training accuracy of the generator, at least one second loss of the generator is determined, and the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module are adjusted according to the first loss and the at least one second loss, thereby improving the training accuracy of the generator.
This embodiment of the present application does not limit the manner of determining the at least one second loss of the generator in step A1, which is determined according to actual needs.
In one example, the above step A1 includes: determining a second loss of the generator from the upsampled geometric information of the training point cloud block and the upsampled ground truth of the geometric information of the training point cloud block, using the Earth Mover's Distance (EMD).
The second loss determined using the Earth Mover's Distance is also called the reconstruction loss; its purpose is to make the upsampled training point cloud block and the upsampled ground truth of the training point cloud block have a consistent geometric distribution.
For example, a second loss of the generator is determined according to the following formula (2):

L_rec = EMD(P_up, P_T) = min_{φ: P_up→P_T} Σ_{p_i ∈ P_up} ||p_i - φ(p_i)||_2  (2)

where L_rec is the second loss, EMD denotes the Earth Mover's Distance, P_up is the geometric information of the upsampled training point cloud block, P_T is the upsampled ground truth of the geometric information of the training point cloud block, φ: P_up → P_T is a bijection between the two equal-size point sets P_up and P_T, p_i is the i-th point in P_up, and φ(p_i) is the point in P_T corresponding to p_i under the bijection φ.
In one example, the above step A1 includes: determining at least one second loss of the generator according to a uniform loss function.
For example, a second loss of the generator is determined according to the following formula (3):

L_uni = Σ_{i=1}^{T} [(|S_i| - n̂)^2 / n̂] · [Σ_{j=1}^{|S_i|} (d_{i,j} - d̂)^2 / d̂]  (3)

where L_uni is the second loss, S_i is the i-th local surface obtained by the radius-ball (radius r_q) method, T is the number of seed points obtained, d_{i,j} is the distance between the j-th point in the i-th local surface and its nearest neighbor, n̂ is the expected number of points contained in each local surface, and d̂ is the expected spatial distance between each point in a local surface and its nearest neighbor.
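A hedged sketch of the uniform loss: for each local surface S_i it combines a term penalising deviation of the point count from the expected count with a term penalising deviation of each nearest-neighbour distance d_{i,j} from the expected spacing. The chi-square-style weighting and the multiplicative combination follow the PU-GAN formulation and should be treated as an assumption here; the function takes precomputed neighbourhood distances rather than performing the ball query itself.

```python
# Hypothetical sketch of the uniform loss of formula (3).
# local_surfaces: one list of nearest-neighbour distances d_ij per local
# surface S_i; n_hat / d_hat are the expected count and spacing.

def uniform_loss(local_surfaces, n_hat, d_hat):
    loss = 0.0
    for dists in local_surfaces:                    # one entry per S_i
        imbalance = (len(dists) - n_hat) ** 2 / n_hat   # point-count deviation
        clutter = sum((d - d_hat) ** 2 / d_hat for d in dists)  # spacing deviation
        loss += imbalance * clutter
    return loss
```

A perfectly uniform neighbourhood (expected count and expected spacing) contributes zero to the loss.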
In one example, the above step A1 includes:
Step A11. Downsample the upsampled geometric information of the training point cloud block to obtain a downsampled training point cloud block with the same number of points as the training point cloud block.
For example, farthest point sampling (FPS) is used to downsample the upsampled training point cloud block P_up to the same number of points as the low-resolution training point cloud block P_ori, obtaining the downsampled training point cloud block P_low, i.e., P_low = FPS(P_up).
Step A12. Determine a second loss of the generator from the geometric information of the downsampled training point cloud block and the geometric information of the training point cloud block, using the Earth Mover's Distance.
For example, a second loss of the generator is determined according to the following formula (4):

L_id = EMD(P_low, P_ori) = min_{φ: P_low→P_ori} Σ_k ||p_k - φ(p_k)||_2  (4)

where L_id is the second loss of the generator, P_ori is the low-resolution training point cloud block, P_low is the downsampled training point cloud block, φ: P_low → P_ori denotes the bijection between P_low and P_ori, of which there is one and only one that minimizes the total distance between the two point sets, p_k is the k-th point in P_low, and φ(p_k) is the point in P_ori corresponding to p_k.
After the at least one second loss of the generator is determined in the above manner, the target loss of the generator is determined according to the first loss and the at least one second loss of the generator, for example as a weighted combination of the first loss and the at least one second loss.
Exemplarily, the target loss of the generator is determined according to the following formula (5):

L_G = w_gen·L_gen(P_up) + w_rec·L_rec + w_uni·L_uni + w_id·L_id  (5)

where L_G is the target loss of the generator, L_gen(P_up) is the first loss of the generator, L_rec, L_uni, and L_id are the second losses of the generator, w_gen is the weight of the first loss, and w_rec, w_uni, and w_id are the weights corresponding to the respective second losses.
It should be noted that this embodiment of the present application does not limit the specific values of the weights corresponding to the above losses, which are determined according to actual needs.
Optionally, w_gen = 1.
Optionally, w_rec = 100.
Optionally, w_uni = 10.
Optionally, w_id = 1.
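With those optional weights, the target loss of formula (5) is a plain weighted sum of the four losses. A minimal sketch, with illustrative names:

```python
# Sketch of the weighted target loss of formula (5), using the optional
# weights quoted above (w_gen=1, w_rec=100, w_uni=10, w_id=1).
WEIGHTS = {"gen": 1.0, "rec": 100.0, "uni": 10.0, "id": 1.0}

def generator_target_loss(losses, weights=WEIGHTS):
    """L_G = w_gen*L_gen + w_rec*L_rec + w_uni*L_uni + w_id*L_id."""
    return sum(weights[name] * value for name, value in losses.items())
```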
In this embodiment of the present application, the generator is trained with the training point cloud to obtain a trained generator, so that in practical applications the trained generator can be used to upsample the geometric information of a point cloud and obtain a high-precision point cloud. Further, the training point cloud is divided into training point cloud blocks, the generator is trained with the training point cloud blocks, and the discriminator supervises the training process of the generator, thereby improving the training accuracy and reliability of the generator.
The above describes the training process of the generator in combination with its network structure; the discriminator involved in the above S406-A1 is described below.
In some embodiments, the above discriminator is a pre-trained discriminator.
In some embodiments, the above discriminator is not pre-trained; that is, this embodiment of the present application also involves a training process of the discriminator.
In this embodiment of the present application, before the generator is trained with the geometric information of a training point cloud block, the discriminator is first trained once, and S406-A1 is then executed with the trained discriminator.
In a possible training manner, the discriminator and the generator are trained alternately: the geometric information of a training point cloud block is first used to train the discriminator, and after the discriminator training ends, the geometric information of the same training point cloud block is used to train the generator. The training of the discriminator and of the generator alternates in this way until both are trained.
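The alternating schedule can be sketched as a loop in which, for each training block, one discriminator update precedes one generator update. `update_discriminator` and `update_generator` are stand-in callables for illustration, not the patent's actual update rules.

```python
# Schematic of the alternating discriminator/generator training schedule:
# for each block, update the discriminator first, then the generator.

def train_alternating(blocks, update_discriminator, update_generator, epochs=1):
    history = []
    for _ in range(epochs):
        for block in blocks:
            history.append(("D", update_discriminator(block)))  # discriminator step
            history.append(("G", update_generator(block)))      # generator step
    return history
```

The returned history makes the D-then-G ordering explicit, which is the property the text describes.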
In some embodiments, the training process of the discriminator specifically includes the following steps:
Step 21. Input the predicted upsampled geometric information of the training point cloud block generated by the generator into the discriminator to obtain a second discrimination result output by the discriminator; input the upsampled ground truth of the geometric information of the training point cloud block into the discriminator to obtain a third discrimination result output by the discriminator;
Step 22. Determine the loss of the discriminator according to the second discrimination result and the third discrimination result;
Step 23. Adjust the parameters in the discriminator according to the loss of the discriminator.
This embodiment of the present application does not limit the type of loss function used in step 22 to determine the loss of the discriminator from the second discrimination result and the third discrimination result.
In a possible implementation, step 22 includes: determining the loss of the discriminator from the second discrimination result and the third discrimination result using a least-squares loss function.
For example, the loss of the discriminator is determined according to the following formula (7):

L_dis(P_up, P_T) = (1/2)[D(P_up)^2 + (D(P_T) - 1)^2]  (7)

where L_dis(P_up, P_T) is the loss of the discriminator, P_T is the upsampled ground truth of the training point cloud block, and P_up is the point cloud obtained by upsampling in the generator, i.e., the upsampled training point cloud block.
In this embodiment of the present application, the discriminator is trained according to the difference between its discrimination results on the predicted upsampled geometric information of the training point cloud block and on the upsampled ground truth of the geometric information of the training point cloud block, thereby improving the training accuracy of the discriminator.
The following describes in detail, in combination with the network structure of the discriminator, the process by which the discriminator obtains a discrimination result from the geometric information of a point cloud block, that is, the process by which the discriminator generates the first, second, and third discrimination results.
Fig. 11 is a schematic diagram of a network structure of the discriminator. As shown in Fig. 11, the discriminator includes a global discrimination module, a boundary discrimination module, and a fully connected module, where the global discrimination module is used to extract the global feature information of the point cloud, the boundary discrimination module is used to extract the boundary feature information of the point cloud, and the fully connected module is used to process the global feature information and the boundary feature information of the point cloud to obtain the discrimination result.
Fig. 12 is a schematic flowchart of the model training method provided by an embodiment of the present application. As shown in Fig. 11 and Fig. 12, the process by which the discriminator obtains a discrimination result includes:
S601. Acquire the geometric information of the boundary points of the target point cloud block.
For example, a high-pass graph filter is used to extract the geometric information of the boundary points of the target point cloud block.
S602. Input the geometric information of the boundary points of the target point cloud block into the boundary discrimination module for boundary feature extraction to obtain the boundary feature information of the target point cloud block.
S603. Input the geometric information of the target point cloud block into the global discrimination module for global feature extraction to obtain the global feature information of the target point cloud block.
S604. Input the global feature information and the boundary feature information of the target point cloud block into the fully connected module to obtain the target discrimination result of the discriminator.
For example, the global feature information and the boundary feature information of the target point cloud block are concatenated, and the concatenated global feature information and boundary feature information are input into the fully connected module to obtain the target discrimination result of the discriminator.
The discriminator of this embodiment of the present application can be understood as a double-headed discriminator, which performs judgments in both the global and boundary dimensions, thereby improving discrimination accuracy.
Specifically, in order to efficiently suppress boundary noise in the generated point cloud, for each input target point cloud block P ∈ R^{N'×3}, the boundary points of the target point cloud block are first extracted; for example, R boundary points P_b ∈ R^{R×3} (R < N') are extracted with a high-pass graph filter. The complete target point cloud block and P_b are then explicitly fed into the double-headed discriminator shown in Fig. 12 to obtain, respectively, the global feature information of the target point cloud block output by the global discrimination module and the boundary discrimination features output by the boundary discrimination module; the global feature information and the boundary feature information of the target point cloud block are input into the fully connected module to obtain the target discrimination result of the discriminator.
In this embodiment of the present application, the discriminator obtains the first, second, and third discrimination results through the same process.
If the target point cloud block is a training point cloud block upsampled by the generator, and the discriminator has been trained with that training point cloud block, the above target discrimination result is the first discrimination result. If the target point cloud block is a training point cloud block upsampled by the generator, and the discriminator has not been trained with that training point cloud block, the target discrimination result is the second discrimination result. If the target point cloud block is the upsampled ground truth of a training point cloud block, the target discrimination result is the third discrimination result. That is, the discriminator is first trained with the training point cloud block: the training point cloud block is input into a generator not yet trained with that block, and the generator produces upsampled training point cloud block 1; upsampled training point cloud block 1 is then input into a discriminator not yet trained with that block, and the discriminator outputs the second discrimination result. Next, the upsampled ground truth of the training point cloud block is input into the discriminator, which outputs the third discrimination result; the parameter matrix of the discriminator is updated according to the second and third discrimination results, completing one training pass of the discriminator. Then, upsampled training point cloud block 1 generated by the generator is input into the discriminator trained with that training point cloud block, and the discriminator outputs the first discrimination result.
The global discrimination module and the boundary discrimination module in the discriminator are introduced separately below.
In some embodiments, as shown in Fig. 13, the global discrimination module includes, in order along the network depth direction: a first number of multilayer perceptrons, a first max pooling layer, a second self-attention network, a second number of multilayer perceptrons, and a second max pooling layer. In this case, the above S603 includes:
S603-A1. Input the geometric information of the target point cloud block into the first number of multilayer perceptrons for feature extraction to obtain first global feature information of the target point cloud block;
S603-A2. Input the first global feature information into the first max pooling layer for dimensionality reduction to obtain second global feature information of the target point cloud block;
S603-A3. Input the first global feature information and the second global feature information into the second self-attention network for feature interaction to obtain third global feature information of the target point cloud block;
In a possible implementation, the first global feature information and the second global feature information are concatenated, and the concatenated first and second global feature information is input into the second self-attention network for feature interaction to obtain the third global feature information of the target point cloud block.
S603-A4. Input the third global feature information into the second number of multilayer perceptrons for feature extraction to obtain fourth global feature information of the target point cloud block;
S603-A5. Input the fourth global feature information into the second max pooling layer for dimensionality reduction to obtain the global feature information of the target point cloud block.
Specifically, the geometric information of the target point cloud block is first input into the first number of multilayer perceptrons (MLPs) for feature extraction to obtain the first global feature information of the target point cloud block. The first global feature information is then input into the first max pooling layer for dimensionality reduction, and the second global feature information of the target point cloud block is obtained through the max pooling operation. Subsequently, the first and second global feature information are input into the second self-attention network for feature interaction, enhancing the feature interaction between points and yielding the third global feature information of the target point cloud block. The third global feature information is then input into the second number of MLPs for feature extraction to obtain the fourth global feature information of the target point cloud block. Finally, the fourth global feature information is input into the second max pooling layer for dimensionality reduction to obtain the global feature information of the target point cloud block.
Optionally, the first number equals the second number.
Optionally, the first number and the second number both equal 2.
Optionally, the first number of multilayer perceptrons includes a first-layer MLP and a second-layer MLP, the second number of multilayer perceptrons includes a third-layer MLP and a fourth-layer MLP, and the feature dimensions of the first-, second-, third-, and fourth-layer MLPs increase in sequence.
Optionally, the feature dimension of the first-layer MLP is 32, that of the second-layer MLP is 64, that of the third-layer MLP is 128, and that of the fourth-layer MLP is 256.
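The shared-MLP-plus-max-pooling pattern used by the global discrimination module can be sketched as follows: the same small per-point transform is applied to every point, and a channel-wise max over all points yields one global feature vector, which is therefore invariant to the ordering of the points. The single-layer transform and its weights below are illustrative, not the patent's four-layer 32/64/128/256 configuration.

```python
# Pure-Python sketch of "shared MLP + max pooling" over a point set.

def shared_mlp_layer(point, weights, bias):
    # one fully connected layer with ReLU, applied independently to each point
    return [max(0.0, sum(w * x for w, x in zip(row, point)) + b)
            for row, b in zip(weights, bias)]

def global_feature(points, weights, bias):
    per_point = [shared_mlp_layer(p, weights, bias) for p in points]
    # channel-wise max pooling over all points -> one global feature vector
    return [max(col) for col in zip(*per_point)]
```

Because the max is taken over points, permuting the input point order leaves the global feature unchanged, which is the key property of this design.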
In some embodiments, continuing to refer to Fig. 13, the boundary discrimination module includes, in order along the network depth direction: a third number of multilayer perceptrons, a third max pooling layer, a third self-attention network, a fourth number of multilayer perceptrons, and a fourth max pooling layer. In this case, the above S602 includes:
S602-A1. Input the geometric information of the boundary points of the target point cloud block into the third number of multilayer perceptrons for feature extraction to obtain first boundary feature information of the target point cloud block;
S602-A2. Input the first boundary feature information into the third max pooling layer for dimensionality reduction to obtain second boundary feature information of the target point cloud block;
S602-A3. Input the first boundary feature information and the second boundary feature information into the third self-attention network for feature interaction to obtain third boundary feature information of the target point cloud block.
In a possible implementation, S602-A3 includes: concatenating the first boundary feature information and the second boundary feature information, and inputting the concatenated first and second boundary feature information into the third self-attention network for feature interaction to obtain the third boundary feature information of the target point cloud block.
S602-A4. Input the third boundary feature information into the fourth number of multilayer perceptrons for feature extraction to obtain fourth boundary feature information of the target point cloud block;
S602-A5. Input the fourth boundary feature information into the fourth max pooling layer for dimensionality reduction to obtain the boundary feature information of the target point cloud block.
Specifically, the boundary geometric information of the target point cloud block is first input into the third number of multilayer perceptrons (MLPs) for feature extraction to obtain the first boundary feature information of the target point cloud block. The first boundary feature information is then input into the third max pooling layer for dimensionality reduction, and the second boundary feature information of the target point cloud block is obtained through the max pooling operation. Subsequently, the first and second boundary feature information are input into the third self-attention network for feature interaction, enhancing the feature interaction between points and yielding the third boundary feature information of the target point cloud block. The third boundary feature information is then input into the fourth number of MLPs for feature extraction to obtain the fourth boundary feature information of the target point cloud block. Finally, the fourth boundary feature information is input into the fourth max pooling layer for dimensionality reduction to obtain the boundary feature information of the target point cloud block.
可选的,第三数量等于第四数量。Optionally, the third quantity is equal to the fourth quantity.
可选的,第三数量与第四数量均等于2。Optionally, both the third quantity and the fourth quantity are equal to 2.
可选的,第三数量个多层感知机包括第五层多层感知机和第六层多层感知机,第四数量个多层感知机包括第七层多层感知机和第八层多层感知机,第五层多层感知机、第六层多层感知机、第七层多层感知机和第八层多层感知机的特征维度依次逐渐增加。Optionally, the third number of multi-layer perceptrons includes a fifth-layer multi-layer perceptron and a sixth-layer multi-layer perceptron, and the fourth number of multi-layer perceptrons includes a seventh-layer multi-layer perceptron and an eighth-layer multi-layer perceptron. Layer perceptron, the feature dimension of the fifth layer multilayer perceptron, sixth layer multilayer perceptron, seventh layer multilayer perceptron and eighth layer multilayer perceptron gradually increases.
可选的,第八层多层感知机的特征维度大于或等于第七层多层感知机的特征维度,且小于或等于第四层多层感知机的特征维度。例如,第八层多层感知机的特征维度大于或等于128且小于或等于256。Optionally, the feature dimension of the eighth-layer multi-layer perceptron is greater than or equal to the feature dimension of the seventh-layer multi-layer perceptron, and smaller than or equal to the feature dimension of the fourth-layer multi-layer perceptron. For example, the feature dimension of the eighth-layer multilayer perceptron is greater than or equal to 128 and less than or equal to 256.
可选的,第五层多层感知机的特征维度为32,第六层多层感知机的特征维度为64,第七层多层感知机的特征维度为128,第八层多层感知机的特征维度为192。Optionally, the feature dimension of the fifth-layer multi-layer perceptron is 32, the feature dimension of the sixth-layer multi-layer perceptron is 64, the feature dimension of the seventh-layer multi-layer perceptron is 128, and the eighth-layer multi-layer perceptron The feature dimension of is 192.
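As an illustration of the boundary discrimination pipeline described above, the following NumPy sketch runs one patch boundary through MLPs with the example dimensions 32/64/128/192, max pooling, and a self-attention step. The random weights, the single-head dot-product attention, and the concatenation of the pooled feature back onto the per-point features before attention are illustrative assumptions, not the network defined by this application:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w):
    # shared per-point MLP: same weight matrix applied to every point, then ReLU
    return np.maximum(x @ w, 0.0)

def self_attention(x):
    # minimal dot-product self-attention over the points, with a residual connection
    scores = x @ x.T / np.sqrt(x.shape[1])
    a = np.exp(scores - scores.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)
    return x + a @ x

n = 16                                # boundary points of one patch
xyz = rng.normal(size=(n, 3))         # boundary geometry of the target patch

# MLPs 5 and 6 (feature dims 32 then 64) -> first boundary feature information
f1 = mlp(mlp(xyz, rng.normal(size=(3, 32))), rng.normal(size=(32, 64)))
f2 = f1.max(axis=0, keepdims=True)    # max pooling -> second boundary feature information
# feature interaction over first + second boundary features -> third boundary features
f3 = self_attention(np.concatenate([f1, np.repeat(f2, n, axis=0)], axis=1))
# MLPs 7 and 8 (feature dims 128 then 192) -> fourth boundary feature information
f4 = mlp(mlp(f3, rng.normal(size=(128, 128))), rng.normal(size=(128, 192)))
boundary_feat = f4.max(axis=0)        # final max pooling -> boundary feature vector
```

The final 192-dimensional vector plays the role of the patch's boundary feature information that is later fed to the fully connected layers.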
继续参照图13所示，将全局判别模块输出的目标点云块的全局特征信息和边界判别模块输出的边界特征信息进行级联，将连接在一起的全局特征信息和边界特征信息输入全连接层模块，通过3个全连接层（FC）获得判别器的置信值，即判别器的判别结果，若判别器输入的是生成器所输出的上采样点云，则置信值接近0，若判别器输入的为点云的上采样真值，则置信值接近1。Continuing to refer to Fig. 13, the global feature information of the target point cloud block output by the global discrimination module and the boundary feature information output by the boundary discrimination module are cascaded, and the concatenated global and boundary feature information is input into the fully connected layer module, which obtains the confidence value of the discriminator, i.e., the discrimination result of the discriminator, through three fully connected layers (FC). If the input of the discriminator is the upsampled point cloud output by the generator, the confidence value is close to 0; if the input of the discriminator is the upsampled ground truth of the point cloud, the confidence value is close to 1.
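The fully connected head can be sketched as follows; the input feature dimensions (256 global, 192 boundary) and the hidden sizes of the three FC layers are assumptions for illustration, and a sigmoid maps the last layer's output to a confidence value in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(1)

def fc(x, w, b, relu=False):
    # one fully connected layer, optionally followed by ReLU
    y = x @ w + b
    return np.maximum(y, 0.0) if relu else y

global_feat = rng.normal(size=256)    # from the global discrimination module (dim assumed)
boundary_feat = rng.normal(size=192)  # from the boundary discrimination module (dim assumed)
x = np.concatenate([global_feat, boundary_feat])   # cascade the two feature vectors

# three fully connected layers with illustrative hidden sizes
h = fc(x, rng.normal(size=(448, 64)) * 0.05, np.zeros(64), relu=True)
h = fc(h, rng.normal(size=(64, 16)) * 0.05, np.zeros(16), relu=True)
logit = fc(h, rng.normal(size=(16, 1)) * 0.05, np.zeros(1))
confidence = 1.0 / (1.0 + np.exp(-logit[0]))  # near 1 for ground truth, near 0 for generated
```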
基于此,可以根据判别器的判别结果,来监督生成器的训练,进而提高了生成器的训练准确性,使得训练完成的生成器上采样后的点云的分布接近点云的上采样真值,保证上采样后的点云的准确性。Based on this, the training of the generator can be supervised according to the discrimination results of the discriminator, thereby improving the training accuracy of the generator, so that the distribution of the upsampled point cloud of the trained generator is close to the true value of the upsampled point cloud , to ensure the accuracy of the upsampled point cloud.
本申请实施例，为了提高判别器的判断准确性，提出一种新的判别器，该判别器包括全局判别模块和边界判别模块，分别对点云的全局信息和边界信息进行判别，进而提高判别器的判别准确性，从而在使用该判别器辅助训练生成器时，提高生成器的训练精度。In the embodiment of the present application, in order to improve the judgment accuracy of the discriminator, a new discriminator is proposed. The discriminator includes a global discrimination module and a boundary discrimination module, which discriminate the global information and the boundary information of the point cloud respectively, thereby improving the discrimination accuracy of the discriminator, and in turn improving the training accuracy of the generator when the discriminator is used to assist in training the generator.
上文对生成器的训练过程进行了介绍,下面对使用训练好的生成器进行点云的几何信息的上采样过程进行介绍。上述训练好的生成器可以实现对点云的几何信息进行上采样。The training process of the generator is introduced above, and the upsampling process of the geometric information of the point cloud using the trained generator is introduced below. The above trained generator can realize the upsampling of the geometric information of the point cloud.
图14为本申请实施例提供的点云上采样方法的流程示意图,如图14所示,点云上采样过程包括:Fig. 14 is a schematic flow chart of the point cloud upsampling method provided by the embodiment of the present application. As shown in Fig. 14, the point cloud upsampling process includes:
S701、获取待上采样点云的几何信息。S701. Acquire geometric information of the point cloud to be upsampled.
可选的,该待上采样点云可以为点云采集设备实时采集的。Optionally, the point cloud to be upsampled may be collected in real time by a point cloud collection device.
可选的,上述待上采样点云可以是从其他存储设备中获取的。Optionally, the point cloud to be upsampled may be obtained from other storage devices.
可选的，上述待上采样点云为解码设备从编码设备获取的码流中解码出的。Optionally, the point cloud to be upsampled is decoded by the decoding device from the code stream obtained from the encoding device.
本申请实施例对获取待处理的点云的具体过程不做限制。The embodiment of the present application does not limit the specific process of obtaining the point cloud to be processed.
S702、根据待上采样点云的几何信息,将待上采样点云划分成至少一个点云块。S702. Divide the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled.
在一些实施例中,上述S702中将待上采样点云划分为至少一个点云块的方式包括但不限于如下几种方式:In some embodiments, the methods of dividing the point cloud to be upsampled into at least one point cloud block in S702 include but are not limited to the following methods:
方式一,根据待上采样点云的几何信息,将待上采样点云划分成至少一个大小相等的点云块。也就是说每个点云块的几何尺度相同。Method 1: Divide the point cloud to be upsampled into at least one point cloud block of equal size according to the geometric information of the point cloud to be upsampled. That is to say, the geometric scale of each point cloud block is the same.
方式二,根据待上采样点云的几何信息,将待上采样点云划分为至少一个点云块,每个点云块中包括相同数量个点。Method 2: Divide the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled, and each point cloud block includes the same number of points.
方式三，根据待上采样点云的几何信息，从待上采样点云中获取至少一个种子点，例如采用蒙特卡洛随机采样法随机地从待上采样点云中采样指定个数的种子点。对于每个种子点，确定该种子点的邻近点，将该种子点与该种子点的邻近点划分为一个点云块，进而得到至少一个点云块。在该方式三中，得到的点云块也称为点云补丁（Patch），该方式得到的点云块中每个点云块所包括的点的个数相同。Method 3: Obtain at least one seed point from the point cloud to be upsampled according to its geometric information, for example, use the Monte Carlo random sampling method to randomly sample a specified number of seed points from the point cloud to be upsampled. For each seed point, determine the neighboring points of the seed point, and divide the seed point and its neighboring points into one point cloud block, thereby obtaining at least one point cloud block. In this third method, the obtained point cloud blocks are also called point cloud patches (Patch), and each point cloud block obtained in this way includes the same number of points.
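Method 3 can be sketched as below; the seed count, the patch size k, and the brute-force nearest-neighbor search are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
cloud = rng.uniform(size=(1000, 3))   # geometry of the point cloud to be upsampled

num_seeds, k = 8, 64                  # illustrative number of patches and points per patch
# Monte Carlo random sampling of seed points
seed_idx = rng.choice(len(cloud), size=num_seeds, replace=False)

patches = []
for s in seed_idx:
    d = np.linalg.norm(cloud - cloud[s], axis=1)  # distances to the seed point
    nn = np.argsort(d)[:k]                        # the seed (distance 0) and its neighbors
    patches.append(cloud[nn])                     # one patch per seed, k points each
```

Every patch produced this way contains the same number of points, matching the property stated above.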
S703、将点云块的几何信息输入生成器中进行上采样,得到点云块的上采样几何信息。S703. Input the geometric information of the point cloud block into the generator for up-sampling, and obtain the up-sampling geometric information of the point cloud block.
图15为本申请实施例涉及的生成器的一种网络结构示意图，如图15所示，生成器包括：特征提取模块、特征上采样模块和几何生成模块，其中，特征提取模块用于提取点云块的第一特征信息，特征上采样模块用于将点云块的第一特征信息上采样为第二特征信息，几何生成模块用于将点云块的第二特征信息映射至几何空间中，以得到点云块的上采样几何信息。Fig. 15 is a schematic diagram of a network structure of the generator involved in the embodiment of the present application. As shown in Fig. 15, the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module, wherein the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into the geometric space, so as to obtain the upsampled geometric information of the point cloud block.
下面对生成器中特征提取模块、特征上采样模块和几何生成模块的网络结构进行介绍。The network structure of the feature extraction module, feature upsampling module and geometry generation module in the generator is introduced below.
在一些实施例中,如图6A所示,特征提取模块包括密集连接的M个特征提取块;In some embodiments, as shown in Figure 6A, the feature extraction module includes densely connected M feature extraction blocks;
对于M个特征提取块中的第i+1个特征提取块，第i+1个特征提取块用于根据输入的第i个第四特征信息输出第i+1个第三特征信息，第i个第四特征信息是根据第i个特征提取块输出的第i个第三特征信息确定的，点云块的第一特征信息是根据M个特征提取块中第M个特征提取块所输出的第M个第三特征信息确定的，i为小于M的正整数，具体参照上述S403的描述，在此不再赘述。For the i+1th feature extraction block among the M feature extraction blocks, the i+1th feature extraction block is used to output the i+1th third feature information according to the input i-th fourth feature information; the i-th fourth feature information is determined according to the i-th third feature information output by the i-th feature extraction block, and the first feature information of the point cloud block is determined according to the M-th third feature information output by the M-th feature extraction block among the M feature extraction blocks, where i is a positive integer smaller than M. For details, refer to the description of S403 above, which will not be repeated here.
在一些实施例中,若i不等于1,则第i个第四特征信息为M个特征提取块中位于第i个特征提取块之前的各特征提取块所提取的第三特征信息、与第i个特征提取块所提取的第三特征信息进行级联后的特征信息。若i等于1,则第i个第四特征信息为M个特征提取块中第一个特征提取块所输出的第一个第三特征信息。In some embodiments, if i is not equal to 1, the i-th fourth feature information is the third feature information extracted by each feature extraction block before the i-th feature extraction block in the M feature extraction blocks, and The feature information obtained by cascading the third feature information extracted by the i feature extraction blocks. If i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block in the M feature extraction blocks.
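The dense connection pattern just described (each later block consuming the concatenation of all earlier blocks' outputs) can be sketched with stand-in blocks; the random linear map inside `feature_block` is a placeholder for the real feature extraction block:

```python
import numpy as np

rng = np.random.default_rng(3)

def feature_block(x, out_dim, rng):
    # stand-in for one feature extraction block: random linear map + ReLU
    w = rng.normal(size=(x.shape[1], out_dim))
    return np.maximum(x @ w, 0.0)

n, M, d = 32, 4, 24            # points per patch, number of blocks, per-block feature dim
x0 = rng.normal(size=(n, 8))   # initial per-point features

outputs = []                   # the "third feature information" of each block
inp = x0
for i in range(M):
    out = feature_block(inp, d, rng)
    outputs.append(out)
    # dense connection: the next block's input concatenates all earlier outputs
    inp = np.concatenate(outputs, axis=1)

first_feature = outputs[-1]    # patch features come from the M-th block's output
```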
在一些实施例中,如图6B所示,特征提取块包括:第一特征提取单元和串联连接的S个第二特征提取单元,S为正整数;In some embodiments, as shown in Figure 6B, the feature extraction block includes: a first feature extraction unit and S second feature extraction units connected in series, where S is a positive integer;
对于第i+1个特征提取块中的第一提取单元，第一提取单元用于针对点云块中的当前点，搜索当前点的K个邻近点，并基于点云块的第i个第四特征信息，将当前点的第四特征信息与邻近点的第四特征信息进行相减，得到K个残差特征信息，并将K个残差特征信息与当前点的第四特征信息进行级联，得到当前点的第i个级联特征信息，根据当前点的第i个级联特征信息，得到点云块的第i个级联特征信息，并将点云块的第i个级联特征信息输入S个第二特征提取单元中的第一个第二特征提取单元；For the first extraction unit in the i+1th feature extraction block, the first extraction unit is used to search, for the current point in the point cloud block, the K neighboring points of the current point, subtract the fourth feature information of the neighboring points from the fourth feature information of the current point based on the i-th fourth feature information of the point cloud block to obtain K pieces of residual feature information, and concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain the i-th cascaded feature information of the current point; the i-th cascaded feature information of the point cloud block is obtained according to the i-th cascaded feature information of the current point, and is input into the first second feature extraction unit among the S second feature extraction units;
第一个第二特征提取单元用于根据点云块的第i个级联特征信息，输出第一个第五特征信息至第二个第二特征提取单元，其中点云块的第i+1个第三特征信息为S个第二特征提取单元中最后一个第二特征提取单元输出的第五特征信息，具体参照上述S403-A31的描述，在此不再赘述。The first second feature extraction unit is used to output the first fifth feature information to the second second feature extraction unit according to the i-th cascaded feature information of the point cloud block, wherein the i+1th third feature information of the point cloud block is the fifth feature information output by the last second feature extraction unit among the S second feature extraction units. For details, refer to the description of S403-A31 above, which will not be repeated here.
在一些实施例中,如图6C所示,第二特征提取单元包括P个残差块,P为正整数;In some embodiments, as shown in FIG. 6C, the second feature extraction unit includes P residual blocks, where P is a positive integer;
对于第s个第二特征提取单元中的第j+1个残差块，第j+1个残差块用于根据第s个第二特征提取单元中的第j个残差块所输出的第j个第一残差信息和输入第s个第二特征提取单元的第五特征信息，输出第j+1个第一残差信息，其中，j为小于P的正整数，s为小于或等于S的正整数。可选的，将第s个第二特征提取单元中的第j个残差块所输出的第j个第一残差信息和输入第s个第二特征提取单元的第五特征信息进行相加后，输入第s个第二特征提取单元中的第j+1个残差块。For the j+1th residual block in the s-th second feature extraction unit, the j+1th residual block is used to output the j+1th first residual information according to the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit and the fifth feature information input to the s-th second feature extraction unit, where j is a positive integer less than P, and s is a positive integer less than or equal to S. Optionally, the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit and the fifth feature information input to the s-th second feature extraction unit are added and then input into the j+1th residual block in the s-th second feature extraction unit.
第s个第二特征提取单元输出的第五特征信息是根据第s个第二特征提取单元中至少一个残差块输出的第一残差信息，以及输入第s个第二特征提取单元的第五特征信息确定的。The fifth feature information output by the s-th second feature extraction unit is determined according to the first residual information output by at least one residual block in the s-th second feature extraction unit and the fifth feature information input to the s-th second feature extraction unit.
在一种可能的实现方式中，上述第s个第二特征提取单元输出的第五特征信息是根据第s个第二特征提取单元中最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息进行级联后的特征信息、与输入第s个第二特征提取单元的第五特征信息确定的，其中，P-1个残差块为第s个第二特征提取单元的P个残差块中除最后一个残差块之外的残差块。In a possible implementation, the fifth feature information output by the s-th second feature extraction unit is determined according to the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, together with the fifth feature information input to the s-th second feature extraction unit, where the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the s-th second feature extraction unit.
在一种可能的实现方式中，上述第s个第二特征提取单元输出的第五特征信息是根据第s个第二特征提取单元中最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息进行级联后的特征信息、与输入第s个第二特征提取单元的第五特征信息进行相加后确定的。In a possible implementation, the fifth feature information output by the s-th second feature extraction unit is determined by adding the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, to the fifth feature information input to the s-th second feature extraction unit.
在一些实施例中,如图6F所示,第二特征提取单元还包括门控单元,In some embodiments, as shown in FIG. 6F, the second feature extraction unit further includes a gating unit,
对于第s个第二特征提取单元中的门控单元，该门控单元用于对第s个第二特征提取单元中的最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息级联后的特征信息进行去冗余，输出去冗余后的特征信息；第s个第二特征提取单元输出的第五特征信息是根据去冗余后的特征信息和输入第s个第二特征提取单元的第五特征信息进行相加后确定的。For the gating unit in the s-th second feature extraction unit, the gating unit is used to de-redundantize the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, and output the de-redundant feature information; the fifth feature information output by the s-th second feature extraction unit is determined by adding the de-redundant feature information to the fifth feature information input to the s-th second feature extraction unit.
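One way to read the residual blocks plus gating unit of a second feature extraction unit is the following sketch; the sigmoid gate and the projection used for de-redundancy are assumed forms, since their internals are not fixed here:

```python
import numpy as np

rng = np.random.default_rng(4)

def residual_block(x, rng):
    # residual branch: two small linear maps with a ReLU in between
    w1 = rng.normal(size=(x.shape[1], x.shape[1])) * 0.1
    w2 = rng.normal(size=(x.shape[1], x.shape[1])) * 0.1
    return np.maximum(x @ w1, 0.0) @ w2

def gate(x, out_dim, rng):
    # assumed gating unit: sigmoid gate reweights features, projection removes redundancy
    g = 1.0 / (1.0 + np.exp(-(x @ (rng.normal(size=(x.shape[1], x.shape[1])) * 0.1))))
    return (g * x) @ (rng.normal(size=(x.shape[1], out_dim)) * 0.1)

n, d, P = 32, 16, 3
f_in = rng.normal(size=(n, d))        # fifth feature information entering the unit

res, x = [], f_in
for _ in range(P):
    r = residual_block(x, rng)
    res.append(r)
    x = f_in + r                      # residual info + unit input feeds the next block

concat = np.concatenate(res, axis=1)  # cascade the residual outputs
f_out = f_in + gate(concat, d, rng)   # de-redundant features added back to the unit input
```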
上文结合图6A至图6F对生成器中的特征提取模块的网络结构进行了介绍,下面结合图7A至图7D对生成器中的特征上采样模块的网络结构进行介绍。The network structure of the feature extraction module in the generator is introduced above with reference to FIGS. 6A to 6F , and the network structure of the feature upsampling module in the generator is introduced below with reference to FIGS. 7A to 7D .
在一些实施例中,如图7A所示,特征上采样模块包括:特征上采样子模块和特征提取子模块;In some embodiments, as shown in FIG. 7A, the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule;
其中,特征上采样子模块用于按照预设的上采样率r,将点云块的第一特征信息复制r份,并对复制后的第一特征信息在特征维度上增加一个n维向量,得到点云块的上采样特征信息,并将点云块的上采样特征信息输入特征提取子模块,其中不同第一特征信息对应的n维向量的值不相同;Wherein, the feature upsampling submodule is used to copy r copies of the first feature information of the point cloud block according to the preset upsampling rate r, and add an n-dimensional vector to the feature dimension of the copied first feature information, Obtain the upsampling feature information of the point cloud block, and input the upsampling feature information of the point cloud block into the feature extraction submodule, wherein the values of the n-dimensional vectors corresponding to different first feature information are different;
特征提取子模块用于根据点云块的上采样特征信息,输出点云块的第二特征信息。The feature extraction sub-module is used to output the second feature information of the point cloud block according to the up-sampled feature information of the point cloud block.
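The duplication step of the feature upsampling submodule can be sketched as follows; the 2-D code appended to distinguish the r copies is an illustrative choice of the n-dimensional vector:

```python
import numpy as np

rng = np.random.default_rng(5)

n_pts, d, r, n_dim = 128, 64, 4, 2    # points, feature dim, upsampling rate, vector size
feat = rng.normal(size=(n_pts, d))    # first feature information of the patch

# replicate each point's features r times (copies of one point are contiguous)
rep = np.repeat(feat, r, axis=0)                                   # (n_pts * r, d)
# append a small vector that differs between the r copies of the same feature
codes = np.linspace(-1.0, 1.0, r)
tags = np.stack([np.tile(codes, n_pts), np.tile(codes[::-1], n_pts)], axis=1)
upsampled = np.concatenate([rep, tags], axis=1)                    # (n_pts * r, d + n_dim)
```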
在一些实施例中,如图7C所示,上述特征提取子模块包括Q个第三特征提取单元,Q为正整数;In some embodiments, as shown in FIG. 7C, the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer;
针对Q个第三特征提取单元中的第k+1个第三特征提取单元,第k+1个第三特征提取单元用于根据第k个第三特征提取单元所提取的点云块的第k个增强上采样特征信息,输出点云块的第k+1个增强上采样特征信息,k为小于Q的正整数;For the k+1th third feature extraction unit among the Q third feature extraction units, the k+1th third feature extraction unit is used to extract the point cloud block according to the kth third feature extraction unit. k enhanced upsampling feature information, the k+1th enhanced upsampling feature information of the output point cloud block, k is a positive integer less than Q;
点云块的第二特征信息为Q个第三特征提取单元中最后一个第三特征提取单元所提取的点云块的第Q个增强上采样特征信息。The second feature information of the point cloud block is the Qth enhanced upsampling feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
在一些实施例中,如图7D所示,第三特征提取单元为图7D中的HRA,第三特征提取单元包括L个残差块,L为正整数,例如第三特征提取单元包括4个残差块RB;In some embodiments, as shown in Figure 7D, the third feature extraction unit is the HRA in Figure 7D, the third feature extraction unit includes L residual blocks, L is a positive integer, for example, the third feature extraction unit includes 4 residual block RB;
对于第k+1个第三特征提取单元中的第l+1个残差块，第l+1个残差块用于根据第k+1个第三特征提取单元中的第l个残差块输出的第l个第二残差信息和输入第k+1个第三特征提取单元的第k个增强上采样特征信息，输出第l+1个第二残差信息，l为小于L的正整数；可选的，对第l个残差块输出的第l个第二残差信息和第k个增强上采样特征信息进行相加后，输入第l+1个残差块。For the l+1th residual block in the k+1th third feature extraction unit, the l+1th residual block is used to output the l+1th second residual information according to the l-th second residual information output by the l-th residual block in the k+1th third feature extraction unit and the k-th enhanced upsampling feature information input to the k+1th third feature extraction unit, where l is a positive integer less than L; optionally, the l-th second residual information output by the l-th residual block and the k-th enhanced upsampling feature information are added and then input into the l+1th residual block.
点云块的第k+1个增强上采样特征信息是根据第k+1个第三特征提取单元中至少一个残差块输出的第二残差信息,以及第k个增强上采样特征信息确定的。The k+1th enhanced upsampling feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the k+1th third feature extraction unit, and the kth enhanced upsampling feature information of.
在一种可能的实现方式中，上述点云块的第k+1个增强上采样特征信息是根据L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和第k个增强上采样特征信息确定的，其中，L-1个残差块为第k+1个第三特征提取单元的L个残差块中除最后一个残差块之外的残差块。In a possible implementation, the k+1th enhanced upsampling feature information of the point cloud block is determined according to the feature information obtained by concatenating the second residual information output by the last residual block among the L residual blocks with the second residual information output by at least one of the L-1 residual blocks, together with the k-th enhanced upsampling feature information, where the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the k+1th third feature extraction unit.
在一种可能的实现方式中,上述点云块的第k+1个增强上采样特征信息根据L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和第k个增强上采样特征信息进行相加后确定的。In a possible implementation, the k+1th enhanced upsampled feature information of the above point cloud block is based on the second residual information output by the last residual block in the L residual blocks, and the L-1 residual It is determined by adding the concatenated feature information of the second residual information output by at least one residual block in the difference block and the kth enhanced upsampling feature information.
在一些实施例中,如图7D所示,第三特征提取单元还包括门控单元;In some embodiments, as shown in FIG. 7D, the third feature extraction unit further includes a gating unit;
对于第k+1个第三特征提取单元中的门控单元，门控单元用于对第k+1个第三特征提取单元中的最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息进行去冗余，输出去冗余后的特征信息；For the gating unit in the k+1th third feature extraction unit, the gating unit is used to de-redundantize the feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second residual information output by at least one of the L-1 residual blocks, and output the de-redundant feature information;
点云块的第k+1个增强上采样特征信息是根据去冗余后的特征信息与第k个增强上采样特征信息进行相加后确定的。The k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
在一些实施例中,第三特征提取单元与上述第二特征提取单元的网络结构相同。In some embodiments, the network structure of the third feature extraction unit is the same as that of the above-mentioned second feature extraction unit.
在一些实施例中,如图7B所示,特征上采样模块还包括第一自相关注意力网络;In some embodiments, as shown in Figure 7B, the feature upsampling module further includes a first autocorrelation attention network;
其中,第一自相关注意力网络用于对特征上采样子模块输出的点云块的上采样特征信息进行特征交互,输出特征交互后的点云块的上采样特征信息至特征提取子模块;Wherein, the first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
此时,特征提取子模块用于根据特征交互后的点云块的上采样特征信息,输出点云块的第二特征信息。At this time, the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
可选的,特征交互后的点云块的上采样特征信息的特征维度低于点云块的上采样特征信息的特征维度。Optionally, the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
上文结合图7A至图7D对生成器中的特征提取模块的网络结构进行了介绍,下面结合图8和图15对生成器中的几何生成模块的网络结构进行介绍。The network structure of the feature extraction module in the generator is introduced above with reference to FIG. 7A to FIG. 7D , and the network structure of the geometry generation module in the generator is introduced below in conjunction with FIG. 8 and FIG. 15 .
在一些实施例中,几何生成模块包括多个全连接层;In some embodiments, the geometry generation module includes a plurality of fully connected layers;
该多个全连接层用于根据点云块的第二特征信息,输出点云块的上采样几何信息。The multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
在一些实施例中,如图8和图15所示,几何生成模块包括:几何重建单元、滤波单元和下采样单元;In some embodiments, as shown in Figure 8 and Figure 15, the geometry generation module includes: a geometry reconstruction unit, a filtering unit and a downsampling unit;
其中，几何重建单元用于对点云块的第二特征信息进行几何重建，输出点云块的初始上采样几何信息至滤波单元；Wherein, the geometric reconstruction unit is used to perform geometric reconstruction on the second feature information of the point cloud block, and output the initial upsampled geometric information of the point cloud block to the filtering unit;
滤波单元用于对点云块的初始上采样几何信息进行除噪，输出点云块滤除噪点后的初始上采样几何信息至下采样单元；The filtering unit is used to denoise the initial upsampled geometric information of the point cloud block, and output the denoised initial upsampled geometric information of the point cloud block to the down-sampling unit;
下采样单元用于将点云块滤除噪点后的初始上采样几何信息下采样至目标上采样率，输出点云块的上采样几何信息。The down-sampling unit is used to down-sample the denoised initial upsampled geometric information of the point cloud block to the target upsampling rate, and output the upsampled geometric information of the point cloud block.
可选的,目标上采样率小于或等于特征上采样模块的上采样率。Optionally, the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
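A sketch of the three geometry generation units, with assumed implementations for the two that the text leaves open: statistical outlier removal stands in for the filtering unit, and farthest point sampling stands in for the down-sampling unit:

```python
import numpy as np

rng = np.random.default_rng(6)

n, r_feat, r_target = 100, 4, 2       # input points, feature upsampling rate, target rate
# stand-in for the geometric reconstruction unit's output: initial upsampled geometry
recon = rng.uniform(size=(n * r_feat, 3))

# filtering unit (assumed): drop points whose mean distance to their 8 nearest
# neighbors is unusually large, a simple statistical outlier-removal rule
d = np.linalg.norm(recon[:, None, :] - recon[None, :, :], axis=2)
knn_mean = np.sort(d, axis=1)[:, 1:9].mean(axis=1)
keep = recon[knn_mean < knn_mean.mean() + 3 * knn_mean.std()]

# down-sampling unit (assumed): farthest point sampling down to n * r_target points
target = n * r_target
sel = [0]
dist = np.linalg.norm(keep - keep[0], axis=1)
for _ in range(target - 1):
    nxt = int(dist.argmax())                      # point farthest from the selected set
    sel.append(nxt)
    dist = np.minimum(dist, np.linalg.norm(keep - keep[nxt], axis=1))
out = keep[sel]                                   # upsampled geometry at the target rate
```

This illustrates why the target rate must not exceed the feature upsampling rate: the down-sampling unit can only discard points from the denoised set, never add new ones.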
为了进一步说明本申请的技术效果,将本申请实施例提出的方案在测试平台上进行实现,且分别使用倒角距离(CD),豪斯多夫距离(HD),以及点到面距离(P2FD)来衡量上采样的点云与点云上采样真值之间的相似程度。可选的,上采样率r设置为4。将本申请实施例的技术方案与基于优化的方法EAR、最先进的点云上采样网络PU-Net、MPU、PU-GAN分别进行测试,在测试数据集上的结果如表1所示:In order to further illustrate the technical effect of the present application, the scheme proposed in the embodiment of the present application is implemented on the test platform, and the chamfering distance (CD), the Hausdorff distance (HD), and the point-to-surface distance (P2FD) are used respectively ) to measure the similarity between the upsampled point cloud and the upsampled ground truth of the point cloud. Optionally, the upsampling rate r is set to 4. The technical solution of the embodiment of the application was tested with the optimization-based method EAR, the most advanced point cloud upsampling network PU-Net, MPU, and PU-GAN, and the results on the test data set are shown in Table 1:
表1Table 1
Figure PCTCN2021096287-appb-000024
如表1所示,本申请提出的方法生成的点云与点云的上采样真值之间的差异最小,例如,倒角距离(Chamfer distance,简称CD)、豪斯多夫距离(Hausdorff distance,简称HD)、以及点到面距离(Point to face distance,简称P2FD)分别为0.258、3.571和2.392,因此,说明本申请提出的点云上采样方法可以实现点云的有效上采样。As shown in Table 1, the difference between the point cloud generated by the method proposed in this application and the upsampled true value of the point cloud is the smallest, for example, Chamfer distance (CD for short), Hausdorff distance (Hausdorff distance , referred to as HD), and the point-to-face distance (Point to face distance, referred to as P2FD) are 0.258, 3.571 and 2.392 respectively. Therefore, it is shown that the point cloud upsampling method proposed in this application can realize effective upsampling of the point cloud.
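For reference, the CD and HD metrics cited above can be computed as in the sketch below (one common symmetric formulation; P2FD additionally requires the ground-truth surface, so it is omitted):

```python
import numpy as np

def chamfer(a, b):
    # symmetric Chamfer distance: mean nearest-neighbor distance in both directions
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def hausdorff(a, b):
    # symmetric Hausdorff distance: worst-case nearest-neighbor distance in both directions
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
assert chamfer(pts, pts) == 0.0 and hausdorff(pts, pts) == 0.0
assert chamfer(pts, pts + 1.0) > 0.0   # shifted copy is strictly farther
```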
本申请实施例的点云上采样方法，通过获取待上采样点云的几何信息，根据待上采样点云的几何信息，将待上采样点云划分成至少一个点云块，将点云块的几何信息输入生成器中进行上采样，得到点云块的上采样几何信息，其中生成器包括：特征提取模块、特征上采样模块和几何生成模块，特征提取模块用于提取点云块的第一特征信息，特征上采样模块用于将点云块的第一特征信息上采样为第二特征信息，几何生成模块用于将点云块的第二特征信息映射至几何空间中，以得到点云块的上采样几何信息。即本申请实施例的生成器为基于深度学习的生成器，通过深度学习可以学习到点云的更多特征信息，进而使用该生成器进行点云上采样时，可以生成高精度的点云，且该高精度的点云的特征与点云的上采样真值接近，进而提高了点云上采样的准确性。In the point cloud upsampling method of the embodiment of the present application, the geometric information of the point cloud to be upsampled is obtained, the point cloud to be upsampled is divided into at least one point cloud block according to its geometric information, and the geometric information of the point cloud block is input into the generator for upsampling to obtain the upsampled geometric information of the point cloud block, where the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into the geometric space to obtain the upsampled geometric information of the point cloud block. That is, the generator of the embodiment of the present application is a deep-learning-based generator; through deep learning, more feature information of the point cloud can be learned, and when this generator is used for point cloud upsampling, a high-precision point cloud can be generated whose features are close to the upsampled ground truth of the point cloud, thereby improving the accuracy of point cloud upsampling.
在一些实施例中,本申请实施例提供的点云的上采样方法还可以应用于点云编解码框架中,例如可以应用于点云解码端。In some embodiments, the point cloud upsampling method provided in the embodiment of the present application can also be applied to a point cloud encoding and decoding framework, for example, it can be applied to a point cloud decoding end.
图16为本申请实施例提供的点云解码方法的流程示意图,如图16所示,点云解码方法包括:Fig. 16 is a schematic flow chart of the point cloud decoding method provided by the embodiment of the present application. As shown in Fig. 16, the point cloud decoding method includes:
S801、解码点云码流,得到点云的几何信息。S801. Decode the point cloud code stream to obtain geometric information of the point cloud.
点云码流包括属性码流和几何码流,解码几何码流可以得到点云的几何信息,解码属性码流可以得到点云的属性信息。The point cloud code stream includes attribute code stream and geometry code stream. By decoding the geometry code stream, the geometric information of the point cloud can be obtained, and by decoding the attribute code stream, the attribute information of the point cloud can be obtained.
其中解码几何码流,得到点云的几何信息的过程参照已有技术,本申请实施例在此不再赘述。The process of decoding the geometric code stream and obtaining the geometric information of the point cloud refers to the prior art, and will not be repeated in this embodiment of the present application.
S802、根据点云的几何信息,将点云划分成至少一个点云块。S802. Divide the point cloud into at least one point cloud block according to the geometric information of the point cloud.
上述S802的执行过程与上述S702一致,参照上述S702的描述,在此不再赘述。The execution process of the above S802 is consistent with that of the above S702, refer to the description of the above S702, and will not be repeated here.
S803、将点云块的几何信息输入生成器中进行上采样,得到点云块的上采样几何信息。S803. Input the geometric information of the point cloud block into the generator for up-sampling, and obtain the up-sampled geometric information of the point cloud block.
参照上述图15所示的生成器，该生成器包括：特征提取模块、特征上采样模块和几何生成模块，特征提取模块用于提取点云块的第一特征信息，特征上采样模块用于将点云块的第一特征信息上采样为第二特征信息，几何生成模块用于将点云块的第二特征信息映射至几何空间中，以得到点云块的上采样几何信息。Referring to the generator shown in Fig. 15 above, the generator includes: a feature extraction module, a feature upsampling module and a geometry generation module. The feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into the second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into the geometric space, so as to obtain the upsampled geometric information of the point cloud block.
下面对生成器中特征提取模块、特征上采样模块和几何生成模块的网络结构进行介绍。The network structure of the feature extraction module, feature upsampling module and geometry generation module in the generator is introduced below.
首先,结合图6A至图6F对特征提取模块的网络结构进行介绍。First, the network structure of the feature extraction module is introduced with reference to FIG. 6A to FIG. 6F .
在一些实施例中,如图6A所示,特征提取模块包括密集连接的M个特征提取块;In some embodiments, as shown in Figure 6A, the feature extraction module includes densely connected M feature extraction blocks;
对于M个特征提取块中的第i+1个特征提取块，第i+1个特征提取块用于根据输入的第i个第四特征信息输出第i+1个第三特征信息，第i个第四特征信息是根据第i个特征提取块输出的第i个第三特征信息确定的，点云块的第一特征信息是根据M个特征提取块中第M个特征提取块所输出的第M个第三特征信息确定的，i为小于M的正整数，具体参照上述S403的描述，在此不再赘述。For the i+1th feature extraction block among the M feature extraction blocks, the i+1th feature extraction block is used to output the i+1th third feature information according to the input i-th fourth feature information; the i-th fourth feature information is determined according to the i-th third feature information output by the i-th feature extraction block, and the first feature information of the point cloud block is determined according to the M-th third feature information output by the M-th feature extraction block among the M feature extraction blocks, where i is a positive integer smaller than M. For details, refer to the description of S403 above, which will not be repeated here.
在一些实施例中,若i不等于1,则第i个第四特征信息为M个特征提取块中位于第i个特征提取块之前的各特征提取块所提取的第三特征信息、与第i个特征提取块所提取的第三特征信息进行级联后的特征信息。若i等于1,则第i个第四特征信息为M个特征提取块中第一个特征提取块所输出的第一个第三特征信息。In some embodiments, if i is not equal to 1, the i-th fourth feature information is the third feature information extracted by each feature extraction block before the i-th feature extraction block in the M feature extraction blocks, and The feature information obtained by cascading the third feature information extracted by the i feature extraction blocks. If i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block in the M feature extraction blocks.
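The dense connection between feature extraction blocks described above can be sketched as follows. This is a minimal pure-Python illustration of the data flow only: each block's input is the concatenation of all previously extracted third feature information, and the final output comes from the M-th block. The `extract` function is a stand-in for the block's learned layers (names and values here are illustrative, not from the application).

```python
def extract(block_idx, fourth_info):
    """Placeholder for a feature extraction block's learned transform:
    takes fourth feature information, returns third feature information."""
    return [x + block_idx for x in fourth_info]

def dense_feature_extraction(initial_features, M):
    third_infos = []                 # third feature info output by blocks 1..M
    fourth_info = initial_features   # input to the first block
    for i in range(1, M + 1):
        third = extract(i, fourth_info)
        third_infos.append(third)
        # Fourth feature information for the next block: concatenation of
        # ALL third feature information extracted so far (dense connection).
        fourth_info = [x for t in third_infos for x in t]
    # The block's first feature information is determined from the
    # M-th third feature information.
    return third_infos[-1]

feat = dense_feature_extraction([0.0, 1.0], M=3)
```

Note how the feature dimension grows with each dense concatenation: with a 2-channel input and M=3, the third block already receives 4 channels.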
In some embodiments, as shown in Fig. 6B, the feature extraction block includes a first feature extraction unit and S second feature extraction units connected in series, where S is a positive integer.
For the first feature extraction unit in the (i+1)-th feature extraction block, the first feature extraction unit is configured to: for a current point in the point cloud block, search for K neighboring points of the current point; based on the i-th fourth feature information of the point cloud block, subtract the fourth feature information of the neighboring points from the fourth feature information of the current point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain i-th concatenated feature information of the current point; obtain i-th concatenated feature information of the point cloud block according to the i-th concatenated feature information of the current point; and input the i-th concatenated feature information of the point cloud block into the first of the S second feature extraction units.
The first second feature extraction unit is configured to output first fifth feature information to the second second feature extraction unit according to the i-th concatenated feature information of the point cloud block, where the (i+1)-th third feature information of the point cloud block is the fifth feature information output by the last of the S second feature extraction units. For details, refer to the description of S403-A31 above, which is not repeated here.
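The neighbor search and residual concatenation performed by the first feature extraction unit can be sketched as below. This is a simplified pure-Python version using scalar per-point features for readability; in the generator each feature is a learned vector, and the K-nearest-neighbor search runs in 3-D geometric space. All function names are illustrative.

```python
def knn(points, current, K):
    """Indices of the K points nearest to `current` (squared 3-D distance)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    order = sorted(range(len(points)), key=lambda i: d2(points[i], current))
    return order[:K]   # may include the current point itself

def cascade_feature(points, feats, cur_idx, K):
    neigh = knn(points, points[cur_idx], K)
    # K residual feature informations: current point's feature minus each
    # neighboring point's feature.
    residuals = [feats[cur_idx] - feats[i] for i in neigh]
    # Concatenated feature info: the K residuals plus the current feature.
    return residuals + [feats[cur_idx]]

pts = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
fts = [1.0, 2.0, 4.0, 8.0]
cf = cascade_feature(pts, fts, cur_idx=0, K=2)
```

With K=2, the two nearest points to the origin are itself and (1, 0, 0), giving residuals 0.0 and -1.0 followed by the current feature 1.0.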
In some embodiments, as shown in Fig. 6C, the second feature extraction unit includes P residual blocks, where P is a positive integer.
For the (j+1)-th residual block in the s-th second feature extraction unit, the (j+1)-th residual block is configured to output (j+1)-th first residual information according to j-th first residual information output by the j-th residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit, where j is a positive integer less than P and s is a positive integer less than or equal to S. Optionally, the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit is added to the fifth feature information input into the s-th second feature extraction unit, and the sum is input into the (j+1)-th residual block in the s-th second feature extraction unit.
The fifth feature information output by the s-th second feature extraction unit is determined according to first residual information output by at least one residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit.
In a possible implementation, the fifth feature information output by the s-th second feature extraction unit is determined according to feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, and the fifth feature information input into the s-th second feature extraction unit, where the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the s-th second feature extraction unit.
In a possible implementation, the fifth feature information output by the s-th second feature extraction unit is determined by adding the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks to the fifth feature information input into the s-th second feature extraction unit.
In some embodiments, as shown in Fig. 6F, the second feature extraction unit further includes a gating unit.
For the gating unit in the s-th second feature extraction unit, the gating unit is configured to perform redundancy removal on the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, and to output the redundancy-removed feature information. The fifth feature information output by the s-th second feature extraction unit is determined by adding the redundancy-removed feature information to the fifth feature information input into the s-th second feature extraction unit.
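The residual-block chain with gating described above can be sketched as one unit. This is a pure-Python data-flow illustration only: each residual block receives the previous residual output added to the unit input, the residual outputs are concatenated, a gating stand-in reduces the channel count, and the result is added back to the unit input. The learned transforms are replaced by toy placeholders.

```python
def residual_block(j, x):
    """Placeholder for the j-th residual block's learned transform."""
    return [v * 0.1 * j for v in x]

def gate(concatenated, out_dim):
    """Placeholder gating unit: redundancy removal, modelled here as
    truncating the concatenation to `out_dim` channels. The real unit
    is a learned gate, not a truncation."""
    return concatenated[:out_dim]

def second_feature_extraction_unit(fifth_in, P):
    residuals = []
    r = residual_block(1, fifth_in)
    residuals.append(r)
    for j in range(1, P):
        # Input to block j+1: j-th first residual info plus the unit input.
        nxt_in = [a + b for a, b in zip(r, fifth_in)]
        r = residual_block(j + 1, nxt_in)
        residuals.append(r)
    concat = [v for res in residuals for v in res]   # concatenate residuals
    gated = gate(concat, len(fifth_in))
    # Output fifth feature info: gated features plus the unit input.
    return [a + b for a, b in zip(gated, fifth_in)]

out = second_feature_extraction_unit([1.0, 2.0], P=3)
```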
The network structure of the feature extraction module in the generator has been described above with reference to Fig. 6A to Fig. 6F; the network structure of the feature upsampling module in the generator is described below with reference to Fig. 7A to Fig. 7D.
In some embodiments, as shown in Fig. 7A, the feature upsampling module includes a feature upsampling submodule and a feature extraction submodule.
The feature upsampling submodule is configured to duplicate the first feature information of the point cloud block r times according to a preset upsampling rate r, append an n-dimensional vector to each copy of the first feature information in the feature dimension to obtain upsampled feature information of the point cloud block, and input the upsampled feature information of the point cloud block into the feature extraction submodule, where the values of the n-dimensional vectors corresponding to different pieces of first feature information are different.
The feature extraction submodule is configured to output the second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
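The duplicate-and-tag step of the feature upsampling submodule can be sketched directly. In this pure-Python illustration, each point feature is copied r times and a distinct n-dimensional code is appended along the feature dimension; the particular code values (spread over [-1, 1], as in folding-style upsamplers) are an illustrative assumption, since the text above only requires that different copies receive different vectors.

```python
def upsample_features(first_feats, r, n=1):
    """Copy each feature vector r times and append a distinct n-dim code."""
    upsampled = []
    for feat in first_feats:              # feature vector of one point
        for copy_idx in range(r):
            # Distinct n-dimensional vector per copy, spread over [-1, 1].
            code = [-1.0 + 2.0 * copy_idx / max(r - 1, 1)] * n
            upsampled.append(list(feat) + code)
    return upsampled

up = upsample_features([[0.5, 0.5], [0.7, 0.1]], r=4, n=1)
```

With 2 input points, r=4, and n=1, the result has 8 feature vectors of dimension 3, so the subsequent feature extraction submodule sees an r-fold larger point-feature set.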
In some embodiments, as shown in Fig. 7C, the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer.
For the (k+1)-th third feature extraction unit among the Q third feature extraction units, the (k+1)-th third feature extraction unit is configured to output (k+1)-th enhanced upsampled feature information of the point cloud block according to k-th enhanced upsampled feature information of the point cloud block extracted by the k-th third feature extraction unit, where k is a positive integer less than Q.
The second feature information of the point cloud block is Q-th enhanced upsampled feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
In some embodiments, as shown in Fig. 7D, the third feature extraction unit is the HRA in Fig. 7D and includes L residual blocks, where L is a positive integer; for example, the third feature extraction unit includes 4 residual blocks (RB).
For the (l+1)-th residual block in the (k+1)-th third feature extraction unit, the (l+1)-th residual block is configured to output (l+1)-th second residual information according to l-th second residual information output by the l-th residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information input into the (k+1)-th third feature extraction unit, where l is a positive integer less than L. Optionally, the l-th second residual information output by the l-th residual block is added to the k-th enhanced upsampled feature information, and the sum is input into the (l+1)-th residual block.
The (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to second residual information output by at least one residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information.
In a possible implementation, the (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to feature information obtained by concatenating the second residual information output by the last residual block among the L residual blocks with the second residual information output by at least one of L-1 residual blocks, and the k-th enhanced upsampled feature information, where the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the (k+1)-th third feature extraction unit.
In a possible implementation, the (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the feature information obtained by concatenating the second residual information output by the last residual block among the L residual blocks with the second residual information output by at least one of the L-1 residual blocks to the k-th enhanced upsampled feature information.
In some embodiments, as shown in Fig. 7D, the third feature extraction unit further includes a gating unit.
For the gating unit in the (k+1)-th third feature extraction unit, the gating unit is configured to perform redundancy removal on the feature information obtained by concatenating the second residual information output by the last residual block in the (k+1)-th third feature extraction unit with the second residual information output by at least one of the L-1 residual blocks, and to output the redundancy-removed feature information.
The (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the redundancy-removed feature information to the k-th enhanced upsampled feature information.
In some embodiments, the third feature extraction unit has the same network structure as the second feature extraction unit described above.
In some embodiments, as shown in Fig. 7B, the feature upsampling module further includes a first self-attention network.
The first self-attention network is configured to perform feature interaction on the upsampled feature information of the point cloud block output by the feature upsampling submodule, and to output the feature-interacted upsampled feature information of the point cloud block to the feature extraction submodule.
In this case, the feature extraction submodule is configured to output the second feature information of the point cloud block according to the feature-interacted upsampled feature information of the point cloud block.
Optionally, the feature dimension of the feature-interacted upsampled feature information of the point cloud block is lower than the feature dimension of the upsampled feature information of the point cloud block.
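The feature-interaction step can be sketched as a single-head dot-product self-attention over the upsampled feature vectors, with an output projection that lowers the feature dimension as in the optional variant above. This is an assumption about the attention form, since the text does not specify it; the identity Q/K/V mappings and the truncating projection are toy stand-ins for learned weights.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(feats, out_dim):
    n, d = len(feats), len(feats[0])
    attended = []
    for i in range(n):
        # Scaled dot-product scores between row i and every row j.
        scores = [sum(a * b for a, b in zip(feats[i], feats[j])) / math.sqrt(d)
                  for j in range(n)]
        w = softmax(scores)
        # Each output row is a weighted sum over ALL input rows, so every
        # upsampled feature interacts with every other one.
        row = [sum(w[j] * feats[j][c] for j in range(n)) for c in range(d)]
        attended.append(row[:out_dim])    # stand-in dimension-lowering projection
    return attended

out = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], out_dim=1)
```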
The network structure of the feature upsampling module in the generator has been described above with reference to Fig. 7A to Fig. 7D; the network structure of the geometry generation module in the generator is described below with reference to Fig. 8 and Fig. 15.
In some embodiments, the geometry generation module includes a plurality of fully connected layers.
The plurality of fully connected layers are configured to output the upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
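A stack of fully connected layers mapping each point's second feature information to 3-D coordinates can be sketched as follows. Layer sizes, weights, and the ReLU nonlinearity are illustrative assumptions; the text above only specifies "a plurality of fully connected layers".

```python
def linear(x, W, b):
    """Fully connected layer: y = W x + b (W given as rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def geometry_mlp(second_feat):
    """Toy 4 -> 3 -> 3 MLP producing (x, y, z) from a 4-dim feature."""
    W1 = [[0.5, 0.0, 0.0, 0.5],
          [0.0, 0.5, 0.5, 0.0],
          [0.25, 0.25, 0.25, 0.25]]
    b1 = [0.0, 0.0, 0.0]
    W2 = [[1.0, 0.0, 0.0],          # identity second layer for the sketch
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]
    b2 = [0.0, 0.0, 0.0]
    return linear(relu(linear(second_feat, W1, b1)), W2, b2)

xyz = geometry_mlp([1.0, 2.0, 3.0, 4.0])
```

Applied pointwise to every upsampled feature vector, this maps the second feature information into geometric space.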
In some embodiments, as shown in Fig. 8 and Fig. 15, the geometry generation module includes a geometry reconstruction unit, a filtering unit, and a downsampling unit.
The geometry reconstruction unit is configured to perform geometry reconstruction on the second feature information of the point cloud block and output initial upsampled geometric information of the point cloud block to the filtering unit.
The filtering unit is configured to denoise the initial upsampled geometric information of the point cloud block and output the denoised initial upsampled geometric information of the point cloud block to the downsampling unit.
The downsampling unit is configured to downsample the denoised initial upsampled geometric information of the point cloud block to a target upsampling rate and output the upsampled geometric information of the point cloud block.
Optionally, the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
Optionally, the target upsampling rate is a preset value.
Optionally, the target upsampling rate is parsed from the point cloud bitstream.
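The downsampling-to-target-rate step can be sketched with farthest point sampling, a common choice for reducing a denoised, over-generated point set to exactly target_rate x N points while keeping coverage uniform. The text above does not fix the sampling method, so FPS here is an assumption.

```python
def farthest_point_sampling(points, m):
    """Pick m points, each maximizing its distance to the points chosen so far."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    chosen = [0]                              # seed with the first point
    dist = [d2(p, points[0]) for p in points]  # distance to nearest chosen point
    while len(chosen) < m:
        nxt = max(range(len(points)), key=lambda i: dist[i])
        chosen.append(nxt)
        dist = [min(dist[i], d2(points[i], points[nxt]))
                for i in range(len(points))]
    return [points[i] for i in chosen]

def downsample_to_target(points, n_original, target_rate):
    """Reduce an over-upsampled set to target_rate x n_original points."""
    return farthest_point_sampling(points, n_original * target_rate)

# 2 original points upsampled (after denoising) to 8 candidates; target rate 2.
dense = [(0.0, 0, 0), (0.1, 0, 0), (1.0, 0, 0), (1.1, 0, 0),
         (2.0, 0, 0), (2.1, 0, 0), (3.0, 0, 0), (3.1, 0, 0)]
sparse = downsample_to_target(dense, n_original=2, target_rate=2)
```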
In the embodiments of the present application, the geometric information of the point cloud generated by the point cloud decoding end is upsampled to generate a high-precision reconstructed point cloud, which can satisfy application scenarios requiring high-precision point clouds, thereby improving the diversity of point cloud decoding.
It should be understood that Fig. 4 to Fig. 16 are merely examples of the present application and should not be construed as limiting the present application.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings. However, the present application is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the present application, various simple variations may be made to the technical solutions of the present application, and these simple variations all fall within the protection scope of the present application. For example, the specific technical features described in the above specific embodiments may be combined in any suitable manner provided there is no contradiction; to avoid unnecessary repetition, the various possible combinations are not described separately in the present application. For another example, the various embodiments of the present application may also be combined arbitrarily, and as long as such combinations do not depart from the idea of the present application, they should likewise be regarded as content disclosed in the present application.
It should also be understood that, in the various method embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application. In addition, in the embodiments of the present application, the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist. Specifically, A and/or B may represent three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship.
The method embodiments of the present application have been described in detail above with reference to Fig. 4 to Fig. 16; the apparatus embodiments of the present application are described in detail below with reference to Fig. 17 to Fig. 20.
Fig. 17 is a schematic block diagram of a point cloud decoder provided by an embodiment of the present application.
As shown in Fig. 17, the point cloud decoder 20 may include:
a decoding unit 21, configured to decode a point cloud bitstream to obtain geometric information of a point cloud;
a division unit 22, configured to divide the point cloud into at least one point cloud block according to the geometric information of the point cloud;
an upsampling unit 23, configured to input geometric information of the point cloud block into a generator for upsampling to obtain upsampled geometric information of the point cloud block;
where the generator includes a feature extraction module, a feature upsampling module, and a geometry generation module; the feature extraction module is configured to extract first feature information of the point cloud block; the feature upsampling module is configured to upsample the first feature information of the point cloud block into second feature information; and the geometry generation module is configured to map the second feature information of the point cloud block into geometric space to obtain the upsampled geometric information of the point cloud block.
In some embodiments, the feature extraction module includes M densely connected feature extraction blocks.
For the (i+1)-th feature extraction block among the M feature extraction blocks, the (i+1)-th feature extraction block is configured to output (i+1)-th third feature information according to input i-th fourth feature information, where the i-th fourth feature information is determined according to i-th third feature information output by the i-th feature extraction block, the first feature information of the point cloud block is determined according to M-th third feature information output by the M-th feature extraction block among the M feature extraction blocks, and i is a positive integer less than M.
In some embodiments, if i is not equal to 1, the i-th fourth feature information is feature information obtained by concatenating the third feature information extracted by each feature extraction block preceding the i-th feature extraction block among the M feature extraction blocks with the third feature information extracted by the i-th feature extraction block;
if i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block among the M feature extraction blocks.
In some embodiments, the feature extraction block includes a first feature extraction unit and S second feature extraction units connected in series, where S is a positive integer.
For the first feature extraction unit in the (i+1)-th feature extraction block, the first feature extraction unit is configured to: for a current point in the point cloud block, search for K neighboring points of the current point; based on the i-th fourth feature information of the point cloud block, subtract the fourth feature information of the neighboring points from the fourth feature information of the current point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain i-th concatenated feature information of the current point; obtain i-th concatenated feature information of the point cloud block according to the i-th concatenated feature information of the current point; and input the i-th concatenated feature information of the point cloud block into the first of the S second feature extraction units.
The first second feature extraction unit is configured to output first fifth feature information to the second second feature extraction unit according to the i-th concatenated feature information of the point cloud block, where the (i+1)-th third feature information of the point cloud block is the fifth feature information output by the last of the S second feature extraction units.
In some embodiments, the second feature extraction unit includes P residual blocks, where P is a positive integer.
For the (j+1)-th residual block in the s-th second feature extraction unit, the (j+1)-th residual block is configured to output (j+1)-th first residual information according to j-th first residual information output by the j-th residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit, where j is a positive integer less than P and s is a positive integer less than or equal to S.
The fifth feature information output by the s-th second feature extraction unit is determined according to first residual information output by at least one residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit.
In some embodiments, the upsampling unit 23 is further configured to add the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit to the fifth feature information input into the s-th second feature extraction unit, and input the sum into the (j+1)-th residual block in the s-th second feature extraction unit.
In some embodiments, the fifth feature information output by the s-th second feature extraction unit is determined according to feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, and the fifth feature information input into the s-th second feature extraction unit, where the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the s-th second feature extraction unit.
In some embodiments, the fifth feature information output by the s-th second feature extraction unit is determined by adding the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks to the fifth feature information input into the s-th second feature extraction unit.
In some embodiments, the second feature extraction unit further includes a gating unit.
For the gating unit in the s-th second feature extraction unit, the gating unit is configured to perform redundancy removal on the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, and to output the redundancy-removed feature information; the fifth feature information output by the s-th second feature extraction unit is determined by adding the redundancy-removed feature information to the fifth feature information input into the s-th second feature extraction unit.
In some embodiments, the feature upsampling module includes a feature upsampling submodule and a feature extraction submodule.
The feature upsampling submodule is configured to duplicate the first feature information of the point cloud block r times according to a preset upsampling rate r, append an n-dimensional vector to each copy of the first feature information in the feature dimension to obtain upsampled feature information of the point cloud block, and input the upsampled feature information of the point cloud block into the feature extraction submodule, where the values of the n-dimensional vectors corresponding to different pieces of first feature information are different.
The feature extraction submodule is configured to output the second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
In some embodiments, the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer.
For the (k+1)-th third feature extraction unit among the Q third feature extraction units, the (k+1)-th third feature extraction unit is configured to output (k+1)-th enhanced upsampled feature information of the point cloud block according to k-th enhanced upsampled feature information of the point cloud block extracted by the k-th third feature extraction unit, where k is a positive integer less than Q.
The second feature information of the point cloud block is Q-th enhanced upsampled feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
在一些实施例中,所述第三特征提取单元包括L个残差块,所述L为正整数;In some embodiments, the third feature extraction unit includes L residual blocks, and the L is a positive integer;
对于所述第k+1个第三特征提取单元中的第l+1个残差块,所述第l+1个残差块用于根据所述第k+1个第三特征提取单元中的第l个残差块输出的第l个第二残差信息和输入所述第k+1个第三特征提取单元的第k个增强上采样特征信息,输出第l+1个第二残差信息,所述l为小于L的正整数;For the l+1th residual block in the k+1th third feature extraction unit, the l+1th residual block is used according to the k+1th third feature extraction unit The lth second residual information output by the lth residual block and the kth enhanced upsampling feature information input to the k+1th third feature extraction unit, output the l+1th second residual difference information, the l is a positive integer less than L;
所述点云块的第k+1个增强上采样特征信息是根据所述第k+1个第三特征提取单元中至少一个残差块输出的第二残差信息,以及所述第k个增强上采样特征信息确定的。The k+1th enhanced upsampling feature information of the point cloud block is the second residual information output from at least one residual block in the k+1th third feature extraction unit, and the kth Enhanced upsampling feature information determined.
在一些实施例中,上采样单元23还用于:对所述第l个残差块输出的第l个第二残差信息和所述第k个增强上采样特征信息进行相加后,输入所述第l+1个残差块。In some embodiments, the upsampling unit 23 is further configured to: after adding the lth second residual information output by the lth residual block and the kth enhanced upsampling feature information, input The l+1th residual block.
在一些实施例中,所述点云块的第k+1个增强上采样特征信息是根据所述L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和所述第k个增强上采样特征信息确定的,其中,所述L-1个残差块为所述第k+1个第三特征提取单元的L个残差块中除最后一个残差块之外的残差块。In some embodiments, the k+1th enhanced upsampling feature information of the point cloud block is determined according to feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one of L-1 residual blocks, and the kth enhanced upsampling feature information, wherein the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the k+1th third feature extraction unit.
在一些实施例中,所述点云块的第k+1个增强上采样特征信息是根据所述L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和所述第k个增强上采样特征信息进行相加后确定的。In some embodiments, the k+1th enhanced upsampling feature information of the point cloud block is determined by adding feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one of L-1 residual blocks, to the kth enhanced upsampling feature information.
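As a non-limiting illustration (plain Python; the scalar "learned" transform and all names are hypothetical, not from this application), the residual-block chain described above, with the l-th residual added back to the unit input before block l+1 and a final skip connection to the k-th enhanced upsampling feature, may be sketched as:

```python
def residual_block(x, weight):
    # stand-in for a learned transform: elementwise scaling
    return [weight * v for v in x]

def third_feature_extraction_unit(enhanced_feat, weights):
    """Sketch of one 'third feature extraction unit' with L = len(weights)
    residual blocks: block l+1 receives the l-th residual added to the unit
    input; the residuals are fused (mean here, standing in for concatenation
    plus reduction); and the unit input is added back as a skip connection."""
    residuals = [residual_block(enhanced_feat, weights[0])]
    for w in weights[1:]:
        x = [r + e for r, e in zip(residuals[-1], enhanced_feat)]
        residuals.append(residual_block(x, w))
    dim = len(enhanced_feat)
    fused = [sum(r[i] for r in residuals) / len(residuals) for i in range(dim)]
    return [f + e for f, e in zip(fused, enhanced_feat)]
```

The output has the same width as the input feature, so units can be chained as described for the Q third feature extraction units.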
在一些实施例中,所述第三特征提取单元还包括门控单元;In some embodiments, the third feature extraction unit further includes a gating unit;
对于所述第k+1个第三特征提取单元中的门控单元,所述门控单元用于对所述第k+1个第三特征提取单元中的最后一个残差块输出的第二残差信息、与所述L-1个残差块中至少一个残差块输出的第二特征信息进行级联后的特征信息进行去冗余,输出去冗余后的特征信息;For the gating unit in the k+1th third feature extraction unit, the gating unit is used to perform de-redundancy on feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second feature information output by at least one of the L-1 residual blocks, and to output the de-redundant feature information;
所述点云块的第k+1个增强上采样特征信息是根据去冗余后的特征信息与所述第k个增强上采样特征信息进行相加后确定的。The k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
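One simple way to realize such a de-redundancy gate (a hypothetical sketch; the application does not fix the gate's form) is a channel-wise sigmoid gate whose output is then added to the k-th enhanced upsampling feature:

```python
import math

def gating_unit(concat_feat, gate_logits):
    """Channel-wise gate: each concatenated channel is scaled by a sigmoid
    value in (0, 1), so channels with very negative logits are suppressed,
    playing the 'de-redundancy' role described above."""
    return [v / (1.0 + math.exp(-g)) for v, g in zip(concat_feat, gate_logits)]

def gated_output(concat_feat, gate_logits, enhanced_feat):
    """De-redundant features added to the k-th enhanced upsampling feature.
    Assumes the gated vector already matches the enhanced feature's width."""
    gated = gating_unit(concat_feat, gate_logits)
    return [g + e for g, e in zip(gated, enhanced_feat)]
```

In practice the gate logits would themselves be produced by a learned layer; here they are passed in directly to keep the sketch self-contained.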
在一些实施例中,所述特征上采样模块还包括第一自相关注意力网络;In some embodiments, the feature upsampling module further includes a first autocorrelation attention network;
所述第一自相关注意力网络用于对所述特征上采样子模块输出的所述点云块的上采样特征信息进行特征交互,输出特征交互后的所述点云块的上采样特征信息至所述特征提取子模块;The first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
所述特征提取子模块用于根据特征交互后的所述点云块的上采样特征信息,输出所述点云块的第二特征信息。The feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
在一些实施例中,所述特征交互后的所述点云块的上采样特征信息的特征维度低于所述点云块的上采样特征信息的特征维度。In some embodiments, the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
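A minimal sketch of such feature interaction (dot-product self-attention over the points of a patch, with an output projection to a lower feature dimension; all shapes and the projection matrix are illustrative, not from this application):

```python
import math

def self_attention(feats, w_out):
    """feats: N points x d upsampled features. Attention scores are dot
    products between point features; each output is the attention-weighted
    mixture of all features, projected by w_out (d x d_out, d_out < d),
    so the interacted features have a lower feature dimension."""
    n, d = len(feats), len(feats[0])
    d_out = len(w_out[0])
    out = []
    for i in range(n):
        scores = [sum(feats[i][c] * feats[j][c] for c in range(d)) for j in range(n)]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]   # softmax, numerically stable
        z = sum(exps)
        mixed = [sum(exps[j] * feats[j][c] for j in range(n)) / z for c in range(d)]
        out.append([sum(mixed[c] * w_out[c][k] for c in range(d)) for k in range(d_out)])
    return out
```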
在一些实施例中,所述几何生成模块包括多个全连接层;In some embodiments, the geometry generation module includes a plurality of fully connected layers;
所述多个全连接层用于根据所述点云块的第二特征信息,输出所述点云块的上采样几何信息。The multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
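The fully connected mapping from second feature information to geometry can be sketched as follows (plain Python; layer widths and weights are illustrative, with the final layer of width 3 producing xyz coordinates):

```python
def fully_connected(x, weights, biases):
    # weights: one row per output unit; biases: one per output unit
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, biases)]

def geometry_generation(second_feat, layers):
    """Pass a point's second feature information through successive fully
    connected layers; the last layer has width 3, yielding the point's
    upsampled xyz coordinates."""
    h = second_feat
    for weights, biases in layers:
        h = fully_connected(h, weights, biases)
    return h
```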
在一些实施例中,所述几何生成模块包括:几何重建单元、滤波单元和下采样单元;In some embodiments, the geometry generation module includes: a geometry reconstruction unit, a filtering unit, and a downsampling unit;
所述几何重建单元用于对所述点云块的第二特征信息进行几何重建,输出所述点云块的初始上采样几何信息至所述滤波单元;The geometric reconstruction unit is used to perform geometric reconstruction on the second feature information of the point cloud block, and output the initial upsampled geometric information of the point cloud block to the filtering unit;
所述滤波单元用于对所述点云块的初始上采样几何信息进行除噪,输出所述点云块滤除噪点的初始上采样几何信息至所述下采样单元;The filtering unit is used to denoise the initial upsampling geometric information of the point cloud block, and output the initial upsampling geometric information of the point cloud block to filter noise to the downsampling unit;
所述下采样单元用于对所述点云块滤除噪点的初始上采样几何信息下采样至目标上采样率,输出所述点云块的上采样几何信息。The down-sampling unit is configured to down-sample the initial up-sampled geometric information of the point cloud block after filtering noise to a target up-sampling rate, and output the up-sampled geometric information of the point cloud block.
在一些实施例中,所述目标上采样率小于或等于所述特征上采样模块的上采样率。In some embodiments, the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
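The filtering and downsampling steps above can be sketched as follows (a toy version: nearest-neighbour distance thresholding stands in for the denoising filter, and uniform striding stands in for a sampler such as farthest-point sampling; names and thresholds are illustrative):

```python
def denoise(points, max_nn_dist):
    """Drop isolated points: any point whose nearest neighbour is farther
    than max_nn_dist is treated as noise and filtered out."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [p for i, p in enumerate(points)
            if min(d2(p, q) for j, q in enumerate(points) if j != i)
            <= max_nn_dist ** 2]

def downsample_to_rate(points, n_input, target_rate):
    """Keep n_input * target_rate points by uniform striding, a simple
    stand-in for farthest-point sampling; target_rate must not exceed the
    feature upsampling module's rate, matching the constraint above."""
    want = int(n_input * target_rate)
    step = len(points) / want
    return [points[int(i * step)] for i in range(want)]
```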
在一些实施例中,所述解码单元21还用于:解码所述点云码流,得到所述目标上采样率。In some embodiments, the decoding unit 21 is further configured to: decode the point cloud code stream to obtain the target upsampling rate.
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图17所示的点云解码器20可以对应于执行本申请实施例的点云解码方法中的相应主体,并且点云解码器20中的各个单元的前述和其它操作和/或功能分别为了实现点云解码方法中的相应流程,为了简洁,在此不再赘述。It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the point cloud decoder 20 shown in FIG. 17 may correspond to the corresponding subject performing the point cloud decoding method of the embodiments of the present application, and the foregoing and other operations and/or functions of the units in the point cloud decoder 20 are respectively intended to implement the corresponding processes in the point cloud decoding method; for brevity, details are not repeated here.
图18是本申请实施例提供的点云上采样装置的示意性框图。Fig. 18 is a schematic block diagram of a point cloud upsampling device provided by an embodiment of the present application.
如图18所示,点云上采样装置40包括:As shown in Figure 18, the point cloud upsampling device 40 includes:
获取单元41,用于获取待上采样点云的几何信息;An acquisition unit 41, configured to acquire geometric information of the point cloud to be upsampled;
划分单元42,用于根据所述待上采样点云的几何信息,将所述待上采样点云划分成至少一个点云块;A division unit 42, configured to divide the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled;
上采样单元43,用于将所述点云块的几何信息输入生成器中进行上采样,得到所述点云块的上采样几何信息;An up-sampling unit 43, configured to input the geometric information of the point cloud block into the generator for up-sampling, to obtain the up-sampling geometric information of the point cloud block;
其中,所述生成器包括:特征提取模块、特征上采样模块和几何生成模块,所述特征提取模块用于提取所述点云块的第一特征信息,所述特征上采样模块用于将所述点云块的第一特征信息上采样为第二特征信息,所述几何生成模块用于将所述点云块的第二特征信息映射至几何空间中,以得到所述点云块的上采样几何信息。Wherein, the generator includes a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is used to extract the first feature information of the point cloud block, the feature upsampling module is used to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is used to map the second feature information of the point cloud block into geometric space to obtain the upsampled geometric information of the point cloud block.
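The three-module generator pipeline above can be summarized with a toy end-to-end sketch (all internals are deliberately trivial placeholders, not the learned modules of this application): per-point "features" are just the coordinates, feature upsampling duplicates each feature with a distinguishing index, and geometry generation maps each upsampled feature back to xyz.

```python
def generator(patch_xyz, rate):
    """Toy sketch of feature extraction -> feature upsampling -> geometry
    generation for one point cloud block: rate copies are produced per input
    point, then mapped back to 3-D coordinates (here: an index-dependent
    offset separates the copies)."""
    feats = [list(p) for p in patch_xyz]                              # feature extraction
    upsampled = [f + [float(k)] for f in feats for k in range(rate)]  # feature upsampling
    return [[c + 0.01 * f[-1] for c in f[:-1]] for f in upsampled]    # geometry generation
```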
在一些实施例中,所述特征提取模块包括密集连接的M个特征提取块;In some embodiments, the feature extraction module comprises densely connected M feature extraction blocks;
对于所述M个特征提取块中的第i+1个特征提取块,所述第i+1个特征提取块用于根据输入的第i个第四特征信息输出第i+1个第三特征信息,所述第i个第四特征信息是根据第i个特征提取块输出的第i个第三特征信息确定的,所述点云块的第一特征信息是根据所述M个特征提取块中第M个特征提取块所输出的第M个第三特征信息确定的,所述i为小于M的正整数。For the i+1th feature extraction block in the M feature extraction blocks, the i+1th feature extraction block is used to output the i+1th third feature according to the input i-th fourth feature information information, the i-th fourth feature information is determined according to the i-th third feature information output by the i-th feature extraction block, and the first feature information of the point cloud block is determined according to the M feature extraction blocks Determined by the Mth third feature information output by the Mth feature extraction block, the i is a positive integer smaller than M.
在一些实施例中,若i不等于1,则所述第i个第四特征信息为所述M个特征提取块中位于所述第i个特征提取块之前的各特征提取块所提取的第三特征信息、与所述第i个特征提取块所提取的第三特征信息进行级联后的特征信息;In some embodiments, if i is not equal to 1, the i-th fourth feature information is the first feature extracted by each feature extraction block before the i-th feature extraction block among the M feature extraction blocks. Three feature information, feature information concatenated with the third feature information extracted by the ith feature extraction block;
若i等于1,则所述第i个第四特征信息为所述M个特征提取块中第一个特征提取块所输出的第一个第三特征信息。If i is equal to 1, the ith fourth feature information is the first third feature information output by the first feature extraction block in the M feature extraction blocks.
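The dense connectivity described above (each block's input is the concatenation of all previous blocks' outputs) can be sketched as follows, with blocks passed in as plain callables for illustration:

```python
def dense_feature_extraction(patch_feat, blocks):
    """Dense connectivity: block 1 sees the input feature; block i+1 sees the
    concatenation of the outputs of blocks 1..i; the module output is the
    last block's third feature information."""
    outputs = []
    x = patch_feat
    for block in blocks:
        outputs.append(block(x))
        x = [v for out in outputs for v in out]  # concatenate all outputs so far
    return outputs[-1]
```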
在一些实施例中,所述特征提取块包括:第一特征提取单元和串联连接的S个第二特征提取单元,所述S为正整数;In some embodiments, the feature extraction block includes: a first feature extraction unit and S second feature extraction units connected in series, wherein S is a positive integer;
对于所述第i+1个特征提取块中的第一提取单元,所述第一提取单元用于针对所述点云块中的当前点,搜索所述当前点的K个邻近点,并基于所述点云块的第i个第四特征信息,将所述当前点的第四特征信息与所述邻近点的第四特征信息进行相减,得到K个残差特征信息,并将所述K个残差特征信息与所述当前点的第四特征信息进行级联,得到所述当前点的第i个级联特征信息,根据所述当前点的第i个级联特征信息,得到所述点云块的第i个级联特征信息,并将所述点云块的第i个级联特征信息输入所述S个第二特征提取单元中的第一个第二特征提取单元;For the first extraction unit in the i+1th feature extraction block, the first extraction unit is used to search for K neighboring points of the current point for the current point in the point cloud block, and based on For the ith fourth feature information of the point cloud block, the fourth feature information of the current point is subtracted from the fourth feature information of the adjacent point to obtain K residual feature information, and the The K residual feature information is concatenated with the fourth feature information of the current point to obtain the i-th concatenated feature information of the current point, and according to the i-th concatenated feature information of the current point, the The i-th concatenated feature information of the point cloud block, and input the i-th concatenated feature information of the point cloud block into the first second feature extraction unit in the S second feature extraction units;
所述第一个第二特征提取单元用于根据所述点云块的第i个级联特征信息,输出第一个第五特征信息至第二个第二特征提取单元,其中所述点云块的第i+1个第三特征信息为所述S个第二特征提取单元中最后一个第二特征提取单元输出的第五特征信息。The first second feature extraction unit is used to output the first fifth feature information to the second second feature extraction unit according to the i-th cascaded feature information of the point cloud block, wherein the point cloud The i+1th third feature information of the block is the fifth feature information output by the last second feature extraction unit among the S second feature extraction units.
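The K-nearest-neighbour residual step of the first extraction unit can be sketched as follows (plain Python, one point at a time; the ordering of the concatenation and the residual sign convention are illustrative choices, not fixed by the text):

```python
def knn_residual_concat(feats, idx, k):
    """For the point at index idx, find its k nearest neighbours in feature
    space, form the k residuals (neighbour feature minus the point's own
    feature), and concatenate them with the point's own feature."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cur = feats[idx]
    order = sorted((j for j in range(len(feats)) if j != idx),
                   key=lambda j: d2(feats[j], cur))
    out = list(cur)
    for j in order[:k]:
        out.extend(n - c for n, c in zip(feats[j], cur))
    return out
```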
在一些实施例中,所述第二特征提取单元包括P个残差块,所述P为正整数;In some embodiments, the second feature extraction unit includes P residual blocks, where P is a positive integer;
对于第s个第二特征提取单元中的第j+1个残差块,所述第j+1个残差块用于根据所述第s个第二特征提取单元中的第j个残差块所输出的第j个第一残差信息和输入所述第s个第二特征提取单元的第五特征信息,输出第j+1个第一残差信息,其中,所述j为小于P的正整数,所述s为小于或等于S的正整数;For the j+1th residual block in the sth second feature extraction unit, the j+1th residual block is used according to the jth residual in the sth second feature extraction unit The j-th first residual information output by the block and the fifth feature information input to the s-th second feature extraction unit, and the j+1-th first residual information is output, wherein the j is less than P A positive integer, the s is a positive integer less than or equal to S;
所述第s个第二特征提取单元输出的第五特征信息是根据所述第s个第二特征提取单元中至少一个残差块输出的第一残差信息,以及输入所述第s个第二特征提取单元的第五特征信息确定的。The fifth feature information output by the sth second feature extraction unit is determined according to the first residual information output by at least one residual block in the sth second feature extraction unit and the fifth feature information input to the sth second feature extraction unit.
在一些实施例中,上采样单元43还用于将所述第s个第二特征提取单元中的第j个残差块所输出的第j个第一残差信息和输入所述第s个第二特征提取单元的第五特征信息进行相加后,输入所述第s个第二特征提取单元中的第j+1个残差块。In some embodiments, the up-sampling unit 43 is further configured to input the j-th first residual information and the j-th residual information output by the j-th residual block in the s-th second feature extraction unit to the s-th After the fifth feature information of the second feature extraction unit is added, it is input to the j+1th residual block in the sth second feature extraction unit.
在一些实施例中,所述第s个第二特征提取单元输出的第五特征信息是根据所述第s个第二特征提取单元中最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息进行级联后的特征信息,以及输入所述第s个第二特征提取单元的第五特征信息确定的,其中,所述P-1个残差块为所述第s个第二特征提取单元的P个残差块中除最后一个残差块之外的残差块。In some embodiments, the fifth feature information output by the sth second feature extraction unit is determined according to feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, and the fifth feature information input to the sth second feature extraction unit, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the sth second feature extraction unit.
在一些实施例中,所述第s个第二特征提取单元输出的第五特征信息是根据所述第s个第二特征提取单元中最后一个残差块输出的第一残差信息、与P-1个残差块中至少一个残差块输出的第一残差信息进行级联后的特征信息,与输入所述第s个第二特征提取单元的第五特征信息进行相加后确定的。In some embodiments, the fifth feature information output by the sth second feature extraction unit is determined by adding feature information obtained by concatenating the first residual information output by the last residual block in the sth second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, to the fifth feature information input to the sth second feature extraction unit.
在一些实施例中,所述第二特征提取单元还包括门控单元,In some embodiments, the second feature extraction unit further includes a gating unit,
对于所述第s个第二特征提取单元中的门控单元,所述门控单元用于对所述第s个第二特征提取单元中的最后一个残差块输出的第一残差信息、与所述P-1个残差块中至少一个残差块输出的第一残差信息级联后的特征信息进行去冗余,输出去冗余后的特征信息;所述第s个第二特征提取单元输出的第五特征信息是根据所述去冗余后的特征信息和所述输入所述第s个第二特征提取单元的第五特征信息进行相加后确定的。For the gating unit in the s th second feature extraction unit, the gating unit is used for the first residual information output by the last residual block in the s th second feature extraction unit, De-redundancy is performed on the feature information concatenated with the first residual information output by at least one residual block in the P-1 residual blocks, and the de-redundant feature information is output; the s second The fifth feature information output by the feature extraction unit is determined after adding the de-redundant feature information and the fifth feature information input to the s-th second feature extraction unit.
在一些实施例中,所述特征上采样模块包括:特征上采样子模块和特征提取子模块;In some embodiments, the feature upsampling module includes: a feature upsampling submodule and a feature extraction submodule;
所述特征上采样子模块用于按照预设的上采样率r,将所述点云块的第一特征信息复制r份,并对复制后的第一特征信息在特征维度上增加一个n维向量,得到所述点云块的上采样特征信息,并将所述点云块的上采样特征信息输入特征提取子模块,其中不同第一特征信息对应的n维向量的值不相同;The feature upsampling submodule is used to copy r copies of the first feature information of the point cloud block according to the preset upsampling rate r, and add an n dimension to the feature dimension of the copied first feature information vector, obtain the upsampling feature information of the point cloud block, and input the upsampling feature information of the point cloud block into the feature extraction submodule, wherein the values of n-dimensional vectors corresponding to different first feature information are different;
所述特征提取子模块用于根据所述点云块的上采样特征信息,输出所述点云块的第二特征信息。The feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
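The copy-and-tag upsampling described above can be sketched directly (a one-hot-like n-dimensional code is one illustrative way to make the r copies of each first feature distinguishable; the application does not fix the code's values):

```python
def feature_upsample(first_feats, r, n):
    """Duplicate each first feature r times and append a distinct n-dim
    vector to each copy along the feature dimension, so that copies of the
    same feature carry different codes."""
    up = []
    for f in first_feats:
        for k in range(r):
            code = [1.0 if i == k % n else 0.0 for i in range(n)]
            up.append(list(f) + code)
    return up
```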
在一些实施例中,所述特征提取子模块包括Q个第三特征提取单元,所述Q为正整数;In some embodiments, the feature extraction submodule includes Q third feature extraction units, where Q is a positive integer;
针对所述Q个第三特征提取单元中的第k+1个第三特征提取单元,所述第k+1个第三特征提取单元用于根据第k个第三特征提取单元所提取的所述点云块的第k个增强上采样特征信息,输出所述点云块的第k+1个增强上采样特征信息,所述k为小于Q的正整数;For the k+1th third feature extraction unit among the Q third feature extraction units, the k+1th third feature extraction unit is used to extract all The kth enhanced upsampling feature information of the point cloud block, output the k+1th enhanced upsampling feature information of the point cloud block, and the k is a positive integer less than Q;
所述点云块的第二特征信息为所述Q个第三特征提取单元中最后一个第三特征提取单元所提取的所述点云块的第Q个增强上采样特征信息。The second feature information of the point cloud block is the Qth enhanced upsampling feature information of the point cloud block extracted by the last third feature extraction unit among the Q third feature extraction units.
在一些实施例中,所述第三特征提取单元包括L个残差块,所述L为正整数;In some embodiments, the third feature extraction unit includes L residual blocks, and the L is a positive integer;
对于所述第k+1个第三特征提取单元中的第l+1个残差块,所述第l+1个残差块用于根据所述第k+1个第三特征提取单元中的第l个残差块输出的第l个第二残差信息和输入所述第k+1个第三特征提取单元的第k个增强上采样特征信息,输出第l+1个第二残差信息,所述l为小于L的正整数;For the l+1th residual block in the k+1th third feature extraction unit, the l+1th residual block is used to output the l+1th second residual information according to the lth second residual information output by the lth residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information input to the k+1th third feature extraction unit, where the l is a positive integer less than L;
所述点云块的第k+1个增强上采样特征信息是根据所述第k+1个第三特征提取单元中至少一个残差块输出的第二残差信息,以及所述第k个增强上采样特征信息确定的。The k+1th enhanced upsampling feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the k+1th third feature extraction unit and the kth enhanced upsampling feature information.
在一些实施例中,上采样单元43还用于:对所述第l个残差块输出的第l个第二残差信息和所述第k个增强上采样特征信息进行相加后,输入所述第l+1个残差块。In some embodiments, the upsampling unit 43 is further configured to: after adding the lth second residual information output by the lth residual block and the kth enhanced upsampling feature information, input The l+1th residual block.
在一些实施例中,所述点云块的第k+1个增强上采样特征信息是根据所述L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和所述第k个增强上采样特征信息确定的,其中,所述L-1个残差块为所述第k+1个第三特征提取单元的L个残差块中除最后一个残差块之外的残差块。In some embodiments, the k+1th enhanced upsampling feature information of the point cloud block is determined according to feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one of L-1 residual blocks, and the kth enhanced upsampling feature information, wherein the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the k+1th third feature extraction unit.
在一些实施例中,所述点云块的第k+1个增强上采样特征信息是根据所述L个残差块中最后一个残差块输出的第二残差信息、与L-1个残差块中至少一个残差块输出的第二残差信息进行级联后的特征信息和所述第k个增强上采样特征信息进行相加后确定的。In some embodiments, the k+1th enhanced upsampling feature information of the point cloud block is determined by adding feature information obtained by concatenating the second residual information output by the last residual block in the L residual blocks with the second residual information output by at least one of L-1 residual blocks, to the kth enhanced upsampling feature information.
在一些实施例中,所述第三特征提取单元还包括门控单元;In some embodiments, the third feature extraction unit further includes a gating unit;
对于所述第k+1个第三特征提取单元中的门控单元,所述门控单元用于对所述第k+1个第三特征提取单元中的最后一个残差块输出的第二残差信息、与所述L-1个残差块中至少一个残差块输出的第二特征信息进行级联后的特征信息进行去冗余,输出去冗余后的特征信息;For the gating unit in the k+1th third feature extraction unit, the gating unit is used to perform de-redundancy on feature information obtained by concatenating the second residual information output by the last residual block in the k+1th third feature extraction unit with the second feature information output by at least one of the L-1 residual blocks, and to output the de-redundant feature information;
所述点云块的第k+1个增强上采样特征信息是根据去冗余后的特征信息与所述第k个增强上采样特征信息进行相加后确定的。The k+1th enhanced upsampling feature information of the point cloud block is determined after adding the deredundant feature information to the kth enhanced upsampling feature information.
在一些实施例中,所述特征上采样模块还包括第一自相关注意力网络;In some embodiments, the feature upsampling module further includes a first autocorrelation attention network;
所述第一自相关注意力网络用于对所述特征上采样子模块输出的所述点云块的上采样特征信息进行特征交互,输出特征交互后的所述点云块的上采样特征信息至所述特征提取子模块;The first autocorrelation attention network is used to perform feature interaction on the upsampling feature information of the point cloud block output by the feature upsampling submodule, and output the upsampling feature information of the point cloud block after feature interaction to the feature extraction submodule;
所述特征提取子模块用于根据特征交互后的所述点云块的上采样特征信息,输出所述点云块的第二特征信息。The feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block after feature interaction.
在一些实施例中,所述特征交互后的所述点云块的上采样特征信息的特征维度低于所述点云块的上采样特征信息的特征维度。In some embodiments, the feature dimension of the upsampled feature information of the point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the point cloud block.
在一些实施例中,所述几何生成模块包括多个全连接层;In some embodiments, the geometry generation module includes a plurality of fully connected layers;
所述多个全连接层用于根据所述点云块的第二特征信息,输出所述点云块的上采样几何信息。The multiple fully connected layers are used to output upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
在一些实施例中,所述几何生成模块包括:几何重建单元、滤波单元和下采样单元;In some embodiments, the geometry generation module includes: a geometry reconstruction unit, a filtering unit, and a downsampling unit;
所述几何重建单元用于对所述点云块的第二特征信息进行几何重建,输出所述点云块的初始上采样几何信息至所述滤波单元;The geometric reconstruction unit is used to perform geometric reconstruction on the second feature information of the point cloud block, and output the initial upsampled geometric information of the point cloud block to the filtering unit;
所述滤波单元用于对所述点云块的初始上采样几何信息进行除噪,输出所述点云块滤除噪点的初始上采样几何信息至所述下采样单元;The filtering unit is used to denoise the initial upsampling geometric information of the point cloud block, and output the initial upsampling geometric information of the point cloud block to filter noise to the downsampling unit;
所述下采样单元用于对所述点云块滤除噪点的初始上采样几何信息下采样至目标上采样率,输出所述点云块的上采样几何信息。The down-sampling unit is configured to down-sample the initial up-sampled geometric information of the point cloud block after filtering noise to a target up-sampling rate, and output the up-sampled geometric information of the point cloud block.
在一些实施例中,所述目标上采样率小于或等于所述特征上采样模块的上采样率。In some embodiments, the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
应理解,装置实施例与方法实施例可以相互对应,类似的描述可以参照方法实施例。为避免重复,此处不再赘述。具体地,图18所示的点云上采样装置40可以对应于执行本申请实施例的点云上采样方法中的相应主体,并且点云上采样装置40中的各个单元的前述和其它操作和/或功能分别为了实现点云上采样方法中的相应流程,为了简洁,在此不再赘述。It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the point cloud upsampling device 40 shown in FIG. 18 may correspond to the corresponding subject performing the point cloud upsampling method of the embodiments of the present application, and the foregoing and other operations and/or functions of the units in the point cloud upsampling device 40 are respectively intended to implement the corresponding processes in the point cloud upsampling method; for brevity, details are not repeated here.
图19是本申请实施例提供的模型训练装置的示意性框图。Fig. 19 is a schematic block diagram of a model training device provided by an embodiment of the present application.
如图19所示,模型训练装置10包括:As shown in Figure 19, the model training device 10 includes:
获取单元11,用于获取训练点云的几何信息;Acquisition unit 11, for obtaining the geometric information of training point cloud;
划分单元12,用于根据所述训练点云的几何信息,将所述训练点云划分成至少一个训练点云块;A division unit 12, configured to divide the training point cloud into at least one training point cloud block according to the geometric information of the training point cloud;
训练单元13,用于将所述训练点云块的几何信息输入生成器的特征提取模块进行特征提取,得到所述训练点云块的第一特征信息;将所述训练点云块的第一特征信息输入所述生成器的特征上采样模块进行上采样,得到所述训练点云块的第二特征信息;将所述训练点云块的第二特征信息输入所述生成器的几何生成模块进行几何重建,得到所述训练点云块的预测上采样几何信息;根据所述训练点云块的预测上采样几何信息,对所述生成器中的特征提取模块、特征上采样模块和几何生成模块进行训练,得到训练后的生成器。The training unit 13 is used to input the geometric information of the training point cloud block into the feature extraction module of the generator for feature extraction, and obtain the first feature information of the training point cloud block; the first feature information of the training point cloud block is Feature information is input into the feature upsampling module of the generator for upsampling to obtain the second feature information of the training point cloud block; the second feature information of the training point cloud block is input into the geometry generation module of the generator Perform geometric reconstruction to obtain the predicted upsampling geometric information of the training point cloud block; according to the predicted upsampling geometric information of the training point cloud block, the feature extraction module, feature upsampling module and geometric generation in the generator The module is trained to obtain the trained generator.
在一些实施例中,训练单元13,具体用于将所述训练点云块的预测上采样几何信息输入判别器,得到所述判别器的第一判别结果,所述判别器用于判断输入所述判别器的数据是否为所述训练点云块的上采样真值;根据所述判别器的第一判别结果,对所述生成器中的特征提取模块、特征上采样模块和几何生成模块进行训练,得到训练后的生成器。In some embodiments, the training unit 13 is specifically configured to input the predicted upsampled geometric information of the training point cloud block into a discriminator to obtain a first discrimination result of the discriminator, where the discriminator is used to judge whether the data input to the discriminator is the upsampled ground truth of the training point cloud block; and to train the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the first discrimination result of the discriminator, so as to obtain a trained generator.
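The adversarial arrangement above pairs the generator with a discriminator. One common choice of objectives (LSGAN-style; the text does not fix the loss functions, so this is an illustrative sketch) is:

```python
def discriminator_loss(real_scores, fake_scores):
    """Push the discriminator's scores on ground-truth patches toward 1 and
    its scores on generated patches toward 0 (least-squares GAN form)."""
    return (sum((s - 1.0) ** 2 for s in real_scores)
            + sum(s ** 2 for s in fake_scores))

def generator_loss(fake_scores, reconstruction_error, lam):
    """The generator combines an adversarial term (make generated patches
    score like real ones) with a reconstruction term weighted by lam; both
    the form and the weight are illustrative."""
    return sum((s - 1.0) ** 2 for s in fake_scores) + lam * reconstruction_error
```

In an actual training loop, the two losses would be minimized alternately over the parameters of the discriminator and the generator respectively.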
在一些实施例中,所述特征提取模块包括M个密集连接的特征提取块,训练单元13,具体用于将所述训练点云块的几何信息输入所述特征提取模块中,获取所述M个特征提取块中第i个特征提取块所提取的所述训练点云块的第i个第三特征信息,所述i为小于M的正整数;根据所述训练点云块的第i个第三特征信息,得到所述训练点云块的第i个第四特征信息;将所述训练点云块的第i个第四特征信息输入第i+1个特征提取块中,得到所述训练点云块的第i+1个第三特征信息;将所述训练点云块的第M个特征提取块所提取的第M个第三特征信息,作为所述训练点云块的第一特征信息。In some embodiments, the feature extraction module includes M densely connected feature extraction blocks, and the training unit 13 is specifically configured to: input the geometric information of the training point cloud block into the feature extraction module, and obtain the ith third feature information of the training point cloud block extracted by the ith feature extraction block among the M feature extraction blocks, where the i is a positive integer less than M; obtain the ith fourth feature information of the training point cloud block according to the ith third feature information of the training point cloud block; input the ith fourth feature information of the training point cloud block into the i+1th feature extraction block to obtain the i+1th third feature information of the training point cloud block; and use the Mth third feature information extracted by the Mth feature extraction block as the first feature information of the training point cloud block.
在一些实施例中,训练单元13,具体用于若i不等于1,则获取所述M个特征提取块中位于所述第i个特征提取块之前的各特征提取块所提取的第三特征信息;并将位于所述第i个特征提取块之前的各特征提取块所提取的第三特征信息、与所述第i个特征提取块所提取的第三特征信息进行级联,作为所述训练点云块的第i个第四特征信息;In some embodiments, the training unit 13 is specifically configured to: if i is not equal to 1, obtain the third feature information extracted by each feature extraction block located before the ith feature extraction block among the M feature extraction blocks, and concatenate the third feature information extracted by each feature extraction block located before the ith feature extraction block with the third feature information extracted by the ith feature extraction block, as the ith fourth feature information of the training point cloud block;
若i等于1,则所述M个特征提取单元中第一特征提取块所提取的第一个第三特征信息,作为所述训练点云块的第i个第四特征信息。If i is equal to 1, the first third feature information extracted by the first feature extraction block in the M feature extraction units is used as the ith fourth feature information of the training point cloud block.
在一些实施例中,所述特征提取块包括:第一特征提取单元和串联连接的至少一个第二特征提取单元,训练单元13,具体用于将所述训练点云块的第i个第四特征信息输入所述第i+1个特征提取块中的第一特征提取单元,以使所述第一特征提取单元针对所述训练点云块中的当前点,搜索所述当前点的K个邻近点,并基于所述第i个第四特征信息,将所述当前点的第四特征信息与所述邻近点的第四特征信息进行相减,得到K个残差特征信息;将所述K个残差特征信息与所述当前点的第四特征信息进行级联,得到所述当前点的第i个级联特征信息,并根据所述当前点的第i个级联特征信息,得到所述训练点云块的第i个级联特征信息;将所述训练点云块的第i个级联特征信息输入所述第i+1个特征提取块中的第一个第二特征提取单元,得到第一个第五特征信息,并将所述第一个第五特征信息输入所述第i+1个特征提取块中的第二个第二特征提取单元中,得到第二个第五特征信息;将所述第i+1个特征提取块中最后一个第二特征提取单元提取的第五特征信息,作为所述训练点云块的第i+1个第三特征信息。In some embodiments, the feature extraction block includes a first feature extraction unit and at least one second feature extraction unit connected in series, and the training unit 13 is specifically configured to: input the ith fourth feature information of the training point cloud block into the first feature extraction unit in the i+1th feature extraction block, so that the first feature extraction unit, for the current point in the training point cloud block, searches for K neighboring points of the current point and, based on the ith fourth feature information, subtracts the fourth feature information of the current point and the fourth feature information of the neighboring points to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain the ith concatenated feature information of the current point, and obtain the ith concatenated feature information of the training point cloud block according to the ith concatenated feature information of the current point; input the ith concatenated feature information of the training point cloud block into the first second feature extraction unit in the i+1th feature extraction block to obtain the first fifth feature information, and input the first fifth feature information into the second second feature extraction unit in the i+1th feature extraction block to obtain the second fifth feature information; and use the fifth feature information extracted by the last second feature extraction unit in the i+1th feature extraction block as the i+1th third feature information of the training point cloud block.
在一些实施例中,所述第二特征提取单元包括P个残差块,所述P为正整数,训练单元12,具体用于将所述第i个级联特征信息输入所述第i+1个特征提取块中的第一个第二特征提取单元,获得所述第一个第二特征提取单元中第j个残差块输出的第一残差信息,所述j为小于或等于P的正整数;将所述第j个残差块输出的第一残差信息和所述第i个级联特征信息输入所述第一个第二特征提取单元中的第j+1个残差块中,得到所述第j+1个残差块输出的第一残差信息;根据所述第一个第二特征提取单元中的P个残差块中至少一个残差块输出的第一残差信息,以及所述第i个级联特征信息,确定所述第一个第二特征提取单元输出的第五特征信息。In some embodiments, the second feature extraction unit includes P residual blocks, where P is a positive integer, and the training unit 12 is specifically configured to input the i-th concatenated feature information into the i+th The first second feature extraction unit in one feature extraction block obtains the first residual information output by the jth residual block in the first second feature extraction unit, where j is less than or equal to P is a positive integer; input the first residual information output by the jth residual block and the ith concatenated feature information into the j+1th residual in the first second feature extraction unit In the block, the first residual information output by the j+1th residual block is obtained; according to the first residual information output by at least one of the P residual blocks in the first second feature extraction unit The residual information, as well as the i-th concatenated feature information, determine fifth feature information output by the first second feature extraction unit.
在一些实施例中,训练单元12,具体用于对所述第j个残差块输出的第一残差信息和所述第i个级联特征信息进行相加,并将相加后的特征信息输入所述第j+1个残差块中,得到所述第j+1个残差块输出的第一残差信息。In some embodiments, the training unit 12 is specifically configured to add the first residual information output by the jth residual block and the ith concatenated feature information, and add the added feature The information is input into the j+1th residual block, and the first residual information output by the j+1th residual block is obtained.
In some embodiments, the training unit 12 is specifically configured to concatenate the first residual information output by the last of the P residual blocks with the first residual information output by at least one of the remaining P-1 residual blocks (that is, the P residual blocks other than the last one), and determine the fifth feature information output by the first second feature extraction unit according to the concatenated feature information and the i-th concatenated feature information.
In some embodiments, the training unit 12 is specifically configured to add the concatenated feature information to the i-th concatenated feature information, and use the sum as the fifth feature information output by the first second feature extraction unit.
In some embodiments, the second feature extraction unit further includes a gating unit. The training unit 12 is specifically configured to input the concatenated feature information into the gating unit for de-redundancy to obtain de-redundant feature information, and to add the de-redundant feature information to the i-th concatenated feature information as the fifth feature information output by the first second feature extraction unit.
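The second feature extraction unit described in the embodiments above, namely P chained residual blocks with a skip connection from the unit's input, concatenation of the residual outputs, a gating unit for de-redundancy, and a final addition of the unit's input, can be sketched roughly as follows. This is only an illustrative numpy sketch under assumed shapes; `residual_block`, the single-layer sigmoid gate, and all weight matrices are toy stand-ins, not the patented network:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, P = 8, 16, 3                      # points, feature dim, residual blocks

# Hypothetical weights: each residual block is a single ReLU-linear layer,
# and the gating unit is a sigmoid gate followed by a projection back to C.
W = [rng.standard_normal((C, C)) * 0.1 for _ in range(P)]
W_gate = rng.standard_normal((P * C, P * C)) * 0.1
W_proj = rng.standard_normal((P * C, C)) * 0.1

def residual_block(x, w):
    return np.maximum(x @ w, 0.0)       # toy stand-in for a real block

def second_feature_extraction_unit(F):
    # 1) chain the P residual blocks; block j+1 sees R_j + F (skip connection)
    residuals = [residual_block(F, W[0])]
    for j in range(1, P):
        residuals.append(residual_block(residuals[-1] + F, W[j]))
    # 2) concatenate the residual blocks' outputs
    cat = np.concatenate(residuals, axis=1)          # (N, P*C)
    # 3) gating unit removes redundancy from the concatenated features
    gated = cat * (1.0 / (1.0 + np.exp(-(cat @ W_gate))))
    fused = gated @ W_proj                           # back to (N, C)
    # 4) add the unit's input, giving the fifth feature information
    return fused + F

F_in = rng.standard_normal((N, C))
F_out = second_feature_extraction_unit(F_in)
print(F_out.shape)                                   # (8, 16)
```

Swapping the toy linear blocks for real convolutional or MLP residual blocks does not change the data flow shown here.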
In some embodiments, the feature upsampling module includes a feature upsampling submodule and a feature extraction submodule. The training unit 12 is specifically configured to input the first feature information of the training point cloud block into the feature upsampling submodule, so that the feature upsampling submodule duplicates the first feature information of the training point cloud block r times according to a preset upsampling rate r and appends an n-dimensional vector to each copy in the feature dimension, obtaining the upsampled feature information of the training point cloud block, where the n-dimensional vectors corresponding to different copies of the first feature information have different values; and to input the upsampled feature information of the training point cloud block into the feature extraction submodule to obtain the second feature information of the training point cloud block extracted by the feature extraction submodule.
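The duplicate-and-tag upsampling step described above can be sketched as follows. The particular choice of n-dimensional codes (a 1-D grid repeated across n dimensions) is an assumption for illustration; the embodiment only requires that the r copies of a feature receive different n-dimensional vectors:

```python
import numpy as np

def upsample_features(F, r, n=2):
    """Replicate each point's feature r times and append an n-dim code
    that differs between the r copies (illustrative sketch; the grid-code
    construction is an assumption)."""
    N, C = F.shape
    rep = np.repeat(F, r, axis=0)                    # (r*N, C), copies adjacent
    codes = np.linspace(-1.0, 1.0, r)                # r distinct scalar codes
    code_vecs = np.stack([codes] * n, axis=1)        # (r, n), one row per copy
    code_full = np.tile(code_vecs, (N, 1))           # (r*N, n)
    return np.concatenate([rep, code_full], axis=1)  # (r*N, C+n)

F = np.arange(12, dtype=float).reshape(3, 4)         # N=3 points, C=4 dims
U = upsample_features(F, r=4, n=2)
print(U.shape)                                       # (12, 6)
```

Rows 0 to 3 of `U` carry the same first feature but four different appended codes, which is what lets the later layers separate the copies into distinct points.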
In some embodiments, the feature extraction submodule includes Q third feature extraction units connected in series, where Q is a positive integer. The training unit 12 is specifically configured to: input the upsampled feature information of the training point cloud block into the feature extraction submodule, and obtain the k-th enhanced upsampled feature information of the training point cloud block extracted by the k-th third feature extraction unit; input the k-th enhanced upsampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit, and obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block extracted by the (k+1)-th third feature extraction unit; and use the Q-th enhanced upsampled feature information of the training point cloud block, extracted by the last of the Q third feature extraction units, as the second feature information of the training point cloud block.
In some embodiments, the third feature extraction unit includes L residual blocks, where L is a positive integer. The training unit 12 is specifically configured to: input the k-th enhanced upsampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit, and obtain the second residual information output by the l-th residual block in the (k+1)-th third feature extraction unit, where l is a positive integer less than or equal to L; input the second residual information output by the l-th residual block, together with the k-th enhanced upsampled feature information, into the (l+1)-th residual block, and obtain the second residual information output by the (l+1)-th residual block; and obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced upsampled feature information.
In some embodiments, the training unit 12 is specifically configured to add the second residual information output by the l-th residual block to the k-th enhanced upsampled feature information, and input the summed feature information into the (l+1)-th residual block to determine the second residual information output by the (l+1)-th residual block.
In some embodiments, the training unit 12 is specifically configured to concatenate the second residual information output by the last of the L residual blocks with the second residual information output by at least one of the remaining L-1 residual blocks (that is, the L residual blocks other than the last one), and determine the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced upsampled feature information.
In some embodiments, the training unit 12 is specifically configured to add the concatenated feature information to the k-th enhanced upsampled feature information, and use the sum as the (k+1)-th enhanced upsampled feature information of the training point cloud block.
In some embodiments, the third feature extraction unit further includes a gating unit. The training unit 12 is specifically configured to input the concatenated feature information into the gating unit for de-redundancy to obtain de-redundant feature information, and to add the de-redundant feature information to the k-th enhanced upsampled feature information as the (k+1)-th enhanced upsampled feature information of the training point cloud block.
In some embodiments, the feature upsampling module further includes a first autocorrelation attention network. The training unit 12 is specifically configured to input the upsampled feature information of the training point cloud block into the first autocorrelation attention network for feature interaction to obtain feature-interacted upsampled feature information of the training point cloud block, and to input the feature-interacted upsampled feature information of the training point cloud block into the feature extraction submodule for feature extraction to obtain the second feature information of the training point cloud block.
可选的,所述特征交互后的所述训练点云块的上采样特征信息的特征维度低于所述训练点云块的上采样特征信息的特征维度。Optionally, the feature dimension of the upsampled feature information of the training point cloud block after the feature interaction is lower than the feature dimension of the upsampled feature information of the training point cloud block.
In some embodiments, the geometry generation module includes a plurality of fully connected layers. The training unit 12 is specifically configured to input the second feature information of the training point cloud block into the plurality of fully connected layers to obtain the predicted upsampled geometric information of the training point cloud block.
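A minimal sketch of such a stack of fully connected layers mapping the second feature information to predicted upsampled coordinates is shown below; the layer widths are illustrative assumptions, since the embodiment fixes neither the widths nor the number of layers:

```python
import numpy as np

rng = np.random.default_rng(1)
rN, C = 12, 32                      # upsampled point count, feature dim

# Hypothetical fully connected stack C -> 64 -> 16 -> 3.
W1, W2, W3 = (rng.standard_normal(s) * 0.1
              for s in [(C, 64), (64, 16), (16, 3)])

def geometry_generation(F2):
    h = np.maximum(F2 @ W1, 0.0)    # FC + ReLU
    h = np.maximum(h @ W2, 0.0)     # FC + ReLU
    return h @ W3                   # final FC: predicted upsampled xyz

xyz = geometry_generation(rng.standard_normal((rN, C)))
print(xyz.shape)                    # (12, 3)
```

The only structural requirement carried over from the text is that the last layer outputs one 3-D coordinate per upsampled feature row.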
In some embodiments, the geometry generation module includes a geometry reconstruction unit, a filtering unit, and a downsampling unit. The training unit 12 is specifically configured to: input the second feature information of the training point cloud block into the geometry reconstruction unit for geometry reconstruction to obtain the initial upsampled geometric information of the training point cloud block; input the initial upsampled geometric information of the training point cloud block into the filtering unit for denoising to obtain denoised initial upsampled geometric information of the training point cloud block; and input the denoised initial upsampled geometric information of the training point cloud block into the downsampling unit for downsampling to obtain the predicted upsampled geometric information of the training point cloud block.
可选的,所述训练点云块的上采样几何信息对应的上采样率小于或等于所述特征上采样模块的上采样率。Optionally, the upsampling rate corresponding to the upsampling geometric information of the training point cloud block is less than or equal to the upsampling rate of the feature upsampling module.
可选的,所述判别器为预先训练好的判别器。Optionally, the discriminator is a pre-trained discriminator.
在一些实施例中,训练单元12,还用于使用所述训练点云块的几何信息对所述判别器进行训练。In some embodiments, the training unit 12 is further configured to use the geometric information of the training point cloud block to train the discriminator.
In some embodiments, the training unit 12 is specifically configured to: input the predicted upsampled geometric information of the training point cloud block generated by the generator into the discriminator to obtain a second discrimination result of the discriminator; input the upsampling ground truth of the geometric information of the training point cloud block into the discriminator to obtain a third discrimination result of the discriminator; determine the loss of the discriminator according to the second discrimination result and the third discrimination result; and train the discriminator according to the loss of the discriminator.
在一些实施例中,训练单元12,具体用于根据所述第二判别结果和第三判别结果,采用最小二乘损失函数,确定所述判别器的损失。In some embodiments, the training unit 12 is specifically configured to determine the loss of the discriminator by using a least square loss function according to the second discrimination result and the third discrimination result.
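With targets 1 for real and 0 for generated patches (a common convention that the embodiment does not spell out), the least-squares discriminator loss over the second and third discrimination results can be sketched as:

```python
import numpy as np

def discriminator_ls_loss(d_fake, d_real):
    """Least-squares discriminator loss in the standard LSGAN form.
    The targets (1 for real, 0 for generated) are an assumption; the
    text only names the least-squares family."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

d_fake = np.array([0.2, 0.4])       # second discrimination result (generated)
d_real = np.array([0.9, 0.8])       # third discrimination result (ground truth)
loss = discriminator_ls_loss(d_fake, d_real)
print(round(loss, 4))               # 0.0625
```

The loss is minimized when the discriminator scores ground-truth patches near 1 and generated patches near 0.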
In some embodiments, the discriminator includes a global discrimination module, a boundary discrimination module, and a fully connected module. The training unit 12 is specifically configured to: obtain the geometric information of the boundary points of a target point cloud block; input the geometric information of the boundary points of the target point cloud block into the boundary discrimination module to obtain the boundary feature information of the target point cloud block; input the geometric information of the target point cloud block into the global discrimination module to obtain the global feature information of the target point cloud block; and input the global feature information and the boundary feature information of the target point cloud block into the fully connected module to obtain a target discrimination result of the discriminator. Here, if the target point cloud block is a training point cloud block upsampled by the generator and the discriminator has not been trained with the training point cloud block, the target discrimination result is the second discrimination result; if the target point cloud block is the upsampling ground truth of the training point cloud block, the target discrimination result is the third discrimination result; and if the target point cloud block is a training point cloud block upsampled by the generator and the discriminator has been trained with the training point cloud block, the target discrimination result is the first discrimination result.
In some embodiments, the training unit 12 is specifically configured to extract the geometric information of the boundary points of the target point cloud block by using a high-pass graph filter.
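One way such a high-pass graph filter can pick out boundary points is sketched below: build a k-nearest-neighbor graph over the patch, apply the combinatorial Laplacian L = D - A (a standard high-pass graph filter) to the point coordinates, and keep the points with the largest filter response. The choices of k, the Laplacian variant, and the number of retained points are assumptions for illustration:

```python
import numpy as np

def boundary_points(xyz, k=2, m=1):
    """Return indices of the m points with the largest high-pass
    graph-filter response on a k-NN graph (illustrative sketch)."""
    N = xyz.shape[0]
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)            # no self-edges
    A = np.zeros((N, N))
    for i in range(N):
        A[i, np.argsort(d2[i])[:k]] = 1.0   # k nearest neighbors
    A = np.maximum(A, A.T)                  # symmetrize the adjacency
    L = np.diag(A.sum(1)) - A               # combinatorial Laplacian (high-pass)
    response = np.linalg.norm(L @ xyz, axis=1)
    return np.argsort(response)[::-1][:m]   # strongest responses = boundary

# A flat 2x2 cluster plus one far-away point; the far point responds most.
xyz = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [5, 5, 0]])
idx = boundary_points(xyz, k=2, m=1)
print(idx)                                  # [4]
```

Interior points sit near the average of their neighbors, so the Laplacian response is small there and large at edges and isolated regions, which is exactly the behavior the boundary discrimination module relies on.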
In some embodiments, the training unit 12 is specifically configured to concatenate the global feature information and the boundary feature information of the target point cloud block, and to input the concatenated global and boundary feature information into the fully connected module to obtain the target discrimination result of the discriminator.
In some embodiments, the global discrimination module includes, in order along the network depth: a first number of multilayer perceptrons, a first max pooling layer, a second autocorrelation attention network, a second number of multilayer perceptrons, and a second max pooling layer. The training unit 12 is specifically configured to: input the geometric information of the target point cloud block into the first number of multilayer perceptrons for feature extraction to obtain first global feature information of the target point cloud block; input the first global feature information into the first max pooling layer for dimensionality reduction to obtain second global feature information of the target point cloud block; input the first global feature information and the second global feature information into the second autocorrelation attention network for feature interaction to obtain third global feature information of the target point cloud block; input the third global feature information into the second number of multilayer perceptrons for feature extraction to obtain fourth global feature information of the target point cloud block; and input the fourth global feature information into the second max pooling layer for dimensionality reduction to obtain the global feature information of the target point cloud block.
In some embodiments, the training unit 12 is specifically configured to concatenate the first global feature information and the second global feature information, and to input the concatenated first and second global feature information into the second autocorrelation attention network for feature interaction to obtain the third global feature information of the target point cloud block.
可选的,所述第一数量等于所述第二数量。Optionally, the first quantity is equal to the second quantity.
可选的,所述第一数量与所述第二数量均等于2。Optionally, the first number and the second number are both equal to 2.
In some embodiments, the first number of multilayer perceptrons includes a first multilayer perceptron and a second multilayer perceptron, and the second number of multilayer perceptrons includes a third multilayer perceptron and a fourth multilayer perceptron; the feature dimensions of the first, second, third, and fourth multilayer perceptrons increase in sequence.
Optionally, the feature dimension of the first multilayer perceptron is 32, the feature dimension of the second multilayer perceptron is 64, the feature dimension of the third multilayer perceptron is 128, and the feature dimension of the fourth multilayer perceptron is 256.
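Under those dimensions, the shape flow through the global discrimination module can be traced with a toy sketch. Each "multilayer perceptron" is reduced here to one shared linear-plus-ReLU layer, and the second autocorrelation attention network is replaced by a simple concatenation of per-point and pooled features, so only the tensor shapes, not the learned behavior, are faithful:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10                                   # points in the target patch

def mlp(x, c_out):
    # shared per-point perceptron, reduced to one random linear + ReLU layer
    return np.maximum(x @ (rng.standard_normal((x.shape[1], c_out)) * 0.1), 0.0)

x = rng.standard_normal((N, 3))          # geometry of the target patch
f1 = mlp(mlp(x, 32), 64)                 # first two MLPs: dims 32 then 64
g = f1.max(axis=0, keepdims=True)        # 1st max pool -> (1, 64) pooled vector
# stand-in for the attention network: join per-point and pooled features
f3 = np.concatenate([f1, np.repeat(g, N, axis=0)], axis=1)   # (N, 128)
f4 = mlp(mlp(f3, 128), 256)              # last two MLPs: dims 128 then 256
global_feat = f4.max(axis=0)             # 2nd max pool -> (256,) global feature
print(global_feat.shape)                 # (256,)
```

The concatenation step conveniently lands on 128 dimensions, matching the stated input width of the third perceptron.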
In some embodiments, the boundary discrimination module includes, in order along the network depth: a third number of multilayer perceptrons, a third max pooling layer, a third autocorrelation attention network, a fourth number of multilayer perceptrons, and a fourth max pooling layer. The training unit 12 is specifically configured to: input the geometric information of the boundary points of the target point cloud block into the third number of multilayer perceptrons for feature extraction to obtain first boundary feature information of the target point cloud block; input the first boundary feature information into the third max pooling layer for dimensionality reduction to obtain second boundary feature information of the target point cloud block; input the first boundary feature information and the second boundary feature information into the third autocorrelation attention network for feature interaction to obtain third boundary feature information of the target point cloud block; input the third boundary feature information into the fourth number of multilayer perceptrons for feature extraction to obtain fourth boundary feature information of the target point cloud block; and input the fourth boundary feature information into the fourth max pooling layer for dimensionality reduction to obtain the boundary feature information of the target point cloud block.
In some embodiments, the training unit 12 is specifically configured to concatenate the first boundary feature information and the second boundary feature information, and to input the concatenated first and second boundary feature information into the third autocorrelation attention network for feature interaction to obtain the third boundary feature information of the target point cloud block.
可选的,所述第三数量等于所述第四数量。Optionally, the third quantity is equal to the fourth quantity.
可选的,所述第三数量与所述第四数量均等于2。Optionally, both the third quantity and the fourth quantity are equal to 2.
In some embodiments, the third number of multilayer perceptrons includes a fifth multilayer perceptron and a sixth multilayer perceptron, and the fourth number of multilayer perceptrons includes a seventh multilayer perceptron and an eighth multilayer perceptron; the feature dimensions of the fifth, sixth, seventh, and eighth multilayer perceptrons increase in sequence.
Optionally, the feature dimension of the eighth multilayer perceptron is greater than or equal to the feature dimension of the seventh multilayer perceptron, and less than or equal to the feature dimension of the fourth multilayer perceptron.
Optionally, the feature dimension of the fifth multilayer perceptron is 32, the feature dimension of the sixth multilayer perceptron is 64, the feature dimension of the seventh multilayer perceptron is 128, and the feature dimension of the eighth multilayer perceptron is 192.
In some embodiments, the training unit 12 is specifically configured to determine a first loss of the generator according to the first discrimination result, and to determine the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the first loss.
In some embodiments, the training unit 12 is specifically configured to determine the first loss of the generator by using a least-squares loss function according to the first discrimination result.
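The corresponding least-squares generator loss pushes the discriminator's score on generated patches toward the "real" target, taken as 1 here (a common convention that the embodiment does not spell out):

```python
import numpy as np

def generator_ls_loss(d_fake):
    """Least-squares generator loss in the standard LSGAN form: the
    generator is rewarded when the discriminator scores its upsampled
    patches near the real target 1 (target value is an assumption)."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

d_fake = np.array([0.2, 0.4])          # first discrimination result
loss = generator_ls_loss(d_fake)
print(round(loss, 2))                  # 0.25
```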
In some embodiments, the training unit 12 is specifically configured to: determine at least one second loss of the generator; determine a target loss of the generator according to the first loss of the generator and the at least one second loss of the generator; and determine the parameter matrices of the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the target loss of the generator.
In some embodiments, the training unit 12 is specifically configured to determine one second loss of the generator by using the Earth Mover's Distance (EMD) according to the upsampled geometric information of the training point cloud block and the upsampling ground truth of the geometric information of the training point cloud block.
In some embodiments, the training unit 12 is specifically configured to downsample the upsampled geometric information of the training point cloud block to obtain a downsampled training point cloud block with the same number of points as the training point cloud block, and to determine one second loss of the generator by using the Earth Mover's Distance according to the geometric information of the downsampled training point cloud block and the geometric information of the training point cloud block.
In some embodiments, the training unit 12 is specifically configured to determine one second loss of the generator according to the following formula:

$$L_{id} = \min_{\phi:\,P_{low}\to P_{ori}} \sum_{x_k^{low}\in P_{low}} \left\lVert x_k^{low} - \phi\left(x_k^{low}\right) \right\rVert_2$$

where $L_{id}$ is the second loss of the generator, $P_{ori}$ is the training point cloud block, $P_{low}$ is the downsampled training point cloud block, and $\phi: P_{low} \to P_{ori}$ denotes a bijection from $P_{low}$ to $P_{ori}$: there is exactly one mapping that moves the points of $P_{low}$ onto $P_{ori}$ with the minimum total distance between the two point sets. $x_k^{low}$ is the k-th point in $P_{low}$, and $\phi(x_k^{low})$ is the point in $P_{ori}$ corresponding to $x_k^{low}$.
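For intuition, the bijection-based distance above can be evaluated by brute force on tiny point sets: enumerate all bijections φ and keep the one with the minimum total moving distance. Real implementations use an assignment solver or an approximation rather than enumeration; this sketch is for illustration only:

```python
import itertools
import numpy as np

def emd_bijection(P_low, P_ori):
    """Brute-force Earth Mover's Distance for tiny, equal-sized point
    sets: search all bijections P_low -> P_ori and return the minimum
    total point-to-point distance (O(K!) -- illustration only)."""
    K = len(P_low)
    best = np.inf
    for perm in itertools.permutations(range(K)):
        cost = sum(np.linalg.norm(P_low[k] - P_ori[perm[k]])
                   for k in range(K))
        best = min(best, cost)
    return best

P_ori = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0]])
P_low = np.array([[0., 1, 0], [0, 0, 0], [1, 0, 0]])   # same points, permuted
print(emd_bijection(P_low, P_ori))                      # 0.0
```

Because `P_low` here is just a permutation of `P_ori`, the optimal bijection moves every point a distance of zero, so the loss vanishes exactly when the downsampled prediction reproduces the input patch.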
在一些实施例中,训练单元12,具体用于根据均匀损失函数,确定所述生成器的至少一个第二损失。In some embodiments, the training unit 12 is specifically configured to determine at least one second loss of the generator according to a uniform loss function.
在一些实施例中,训练单元12,具体用于将所述生成器的第一损失和所述至少一个第二损失的加权平均值,确定所述生成器的目标损失。In some embodiments, the training unit 12 is specifically configured to use a weighted average of the first loss of the generator and the at least one second loss to determine the target loss of the generator.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, details are not repeated here. Specifically, the point cloud upsampling apparatus 10 shown in FIG. 18 may correspond to the entity performing the model training method of the embodiments of the present application, and the foregoing and other operations and/or functions of the units in the point cloud upsampling apparatus 10 are respectively intended to implement the corresponding processes of the methods such as the model training method; for brevity, details are not repeated here.
The apparatus and system of the embodiments of the present application are described above from the perspective of functional units with reference to the accompanying drawings. It should be understood that a functional unit may be implemented in hardware, by instructions in software, or by a combination of hardware and software units. Specifically, the steps of the method embodiments of the present application may be completed by integrated logic circuits of hardware in a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software units in a decoding processor. Optionally, the software unit may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.
图20是本申请实施例提供的电子设备的示意性框图。Fig. 20 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
如图20所示,该电子设备30可以为本申请实施例所述的点云上采样装置,或者点云解码器,或者为模型训练装置,该电子设备30可包括:As shown in Figure 20, the electronic device 30 may be the point cloud upsampling device described in the embodiment of the present application, or a point cloud decoder, or a model training device, and the electronic device 30 may include:
存储器33和处理器32,该存储器33用于存储计算机程序34,并将该程序代码34传输给该处理器32。换言之,该处理器32可以从存储器33中调用并运行计算机程序34,以实现本申请实施例中的方法。A memory 33 and a processor 32 , the memory 33 is used to store a computer program 34 and transmit the program code 34 to the processor 32 . In other words, the processor 32 can call and run the computer program 34 from the memory 33 to implement the method in the embodiment of the present application.
例如,该处理器32可用于根据该计算机程序34中的指令执行上述方法200中的步骤。For example, the processor 32 can be used to execute the steps in the above-mentioned method 200 according to the instructions in the computer program 34 .
在本申请的一些实施例中,该处理器32可以包括但不限于:In some embodiments of the present application, the processor 32 may include, but is not limited to:
a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
在本申请的一些实施例中,该存储器33包括但不限于:In some embodiments of the present application, the memory 33 includes but is not limited to:
a volatile memory and/or a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchlink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DR RAM).
In some embodiments of the present application, the computer program 34 may be divided into one or more units, and the one or more units are stored in the memory 33 and executed by the processor 32 to complete the methods provided in the present application. The one or more units may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program 34 in the electronic device 30.
As shown in Figure 20, the electronic device 30 may further include:
a transceiver 33, where the transceiver 33 may be connected to the processor 32 or the memory 33.
The processor 32 may control the transceiver 33 to communicate with other devices; specifically, it may send information or data to other devices, or receive information or data sent by other devices. The transceiver 33 may include a transmitter and a receiver. The transceiver 33 may further include one or more antennas.
It should be understood that the various components of the electronic device 30 are connected through a bus system, where the bus system includes, in addition to a data bus, a power bus, a control bus and a status signal bus.
The present application also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a computer, the computer is enabled to perform the methods of the above method embodiments. In other words, the embodiments of the present application also provide a computer program product containing instructions; when the instructions are executed by a computer, the computer is caused to perform the methods of the above method embodiments.
When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)), and so on.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as going beyond the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is only a logical functional division, and there may be other divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically separately, or two or more units may be integrated into one unit.
The above is only the specific implementation of the present application, but the protection scope of the present application is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed in the present application, and these should all be covered within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (100)

  1. A point cloud decoding method, characterized by comprising:
    decoding a point cloud bitstream to obtain geometric information of a point cloud;
    dividing the point cloud into at least one point cloud block according to the geometric information of the point cloud; and
    inputting geometric information of the point cloud block into a generator for upsampling, to obtain upsampled geometric information of the point cloud block;
    wherein the generator comprises a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is configured to extract first feature information of the point cloud block, the feature upsampling module is configured to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is configured to map the second feature information of the point cloud block into a geometric space to obtain the upsampled geometric information of the point cloud block.
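By way of illustration and not limitation, the three-module generator recited in claim 1 can be sketched as a plain pipeline. The callables below are toy stand-ins introduced only for this sketch; the claim fixes the modules' roles (feature extraction, feature upsampling, geometry generation), not their internals, which in practice are neural networks:

```python
def run_generator(patch, extract, upsample, generate):
    """Claim-1 pipeline: block geometry -> first feature information
    -> second feature information -> upsampled geometry."""
    first_feats = extract(patch)           # feature extraction module
    second_feats = upsample(first_feats)   # feature upsampling module
    return generate(second_feats)          # geometry generation module

# Toy stand-ins: features are the coordinates themselves, upsampling
# duplicates every feature, and geometry generation is the identity map.
extract = lambda pts: [list(p) for p in pts]
upsample = lambda fs: [f for f in fs for _ in range(2)]
generate = lambda fs: fs

up = run_generator([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
                   extract, upsample, generate)
# the sketch returns twice as many points as the input patch
```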
  2. The method according to claim 1, wherein the feature extraction module comprises M densely connected feature extraction blocks;
    for an (i+1)-th feature extraction block among the M feature extraction blocks, the (i+1)-th feature extraction block is configured to output (i+1)-th third feature information according to input i-th fourth feature information, the i-th fourth feature information being determined according to i-th third feature information output by an i-th feature extraction block; the first feature information of the point cloud block is determined according to M-th third feature information output by an M-th feature extraction block among the M feature extraction blocks, and i is a positive integer smaller than M.
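An illustrative, non-limiting sketch of the dense connection pattern of claim 2 (the blocks here are placeholders, not the claimed networks): block 1 receives the input, and each later block receives the concatenation of all earlier blocks' outputs, with the first feature information determined from the last block's output:

```python
def dense_extract(x, blocks):
    """DenseNet-style wiring of claim 2: block i+1 (i >= 1) sees the
    concatenation (along the feature dimension) of the third feature
    information output by blocks 1..i; the return value is the M-th
    block's output, from which the first feature information is taken."""
    outputs = [blocks[0](x)]
    for block in blocks[1:]:
        concatenated = [v for out in outputs for v in out]  # feature concat
        outputs.append(block(concatenated))
    return outputs[-1]

# Toy blocks that just record how wide their input was.
blocks = [lambda v: [len(v)] for _ in range(3)]
result = dense_extract([1, 2, 3, 4], blocks)
```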
  3. The method according to claim 2, wherein:
    if i is not equal to 1, the i-th fourth feature information is feature information obtained by concatenating the third feature information extracted by each feature extraction block preceding the i-th feature extraction block among the M feature extraction blocks with the third feature information extracted by the i-th feature extraction block;
    if i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block among the M feature extraction blocks.
  4. The method according to claim 2, wherein the feature extraction block comprises a first feature extraction unit and S second feature extraction units connected in series, S being a positive integer;
    for the first extraction unit in the (i+1)-th feature extraction block, the first extraction unit is configured to: for a current point in the point cloud block, search for K neighboring points of the current point; based on the i-th fourth feature information of the point cloud block, subtract the fourth feature information of the neighboring points from the fourth feature information of the current point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain i-th concatenated feature information of the current point; obtain i-th concatenated feature information of the point cloud block according to the i-th concatenated feature information of the current point; and input the i-th concatenated feature information of the point cloud block into the first one of the S second feature extraction units;
    the first second feature extraction unit is configured to output first fifth feature information to the second second feature extraction unit according to the i-th concatenated feature information of the point cloud block, wherein the (i+1)-th third feature information of the point cloud block is the fifth feature information output by the last one of the S second feature extraction units.
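The neighbour-residual step of the first extraction unit in claim 4 can be illustrated with a brute-force sketch; the 1-D toy features and exhaustive nearest-neighbour search are assumptions made only for this example:

```python
def knn_residual_concat(points, feats, k):
    """For each current point: find its k nearest neighbours (brute
    force, squared Euclidean distance), form k residual features
    (current feature minus neighbour feature), and concatenate the
    residuals with the current point's own feature, as in claim 4."""
    out = []
    for i, (p, f) in enumerate(zip(points, feats)):
        ranked = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        concat = []
        for _, j in ranked[:k]:
            concat.extend(a - b for a, b in zip(f, feats[j]))  # residual
        out.append(concat + list(f))  # k residuals, then own feature
    return out

pts = [(0.0,), (1.0,), (3.0,)]
fts = [[10.0], [20.0], [40.0]]
cat = knn_residual_concat(pts, fts, k=2)
# each output row has k * d residual channels plus d own-feature channels
```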
  5. The method according to claim 4, wherein the second feature extraction unit comprises P residual blocks, P being a positive integer;
    for a (j+1)-th residual block in an s-th second feature extraction unit, the (j+1)-th residual block is configured to output (j+1)-th first residual information according to j-th first residual information output by a j-th residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit, wherein j is a positive integer smaller than P, and s is a positive integer smaller than or equal to S;
    the fifth feature information output by the s-th second feature extraction unit is determined according to first residual information output by at least one residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit.
  6. The method according to claim 5, further comprising: adding the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit to the fifth feature information input into the s-th second feature extraction unit, and inputting the result into the (j+1)-th residual block in the s-th second feature extraction unit.
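Claims 5 and 6 describe a chain of residual blocks with a skip addition back to the unit's input. A minimal sketch, using toy half-scaling blocks purely for illustration:

```python
def residual_chain(fifth_feature, blocks):
    """Claims 5-6 wiring inside one second feature extraction unit:
    each residual block's output (first residual information) is added
    element-wise to the unit's input fifth feature information before
    entering the next block. Returns each block's residual output."""
    h = list(fifth_feature)
    residual_infos = []
    for block in blocks:
        r = block(h)                                    # j-th first residual info
        residual_infos.append(r)
        h = [a + b for a, b in zip(r, fifth_feature)]   # skip addition (claim 6)
    return residual_infos

# Toy residual blocks: each scales its input by 0.5.
blocks = [lambda v: [0.5 * x for x in v]] * 2
res = residual_chain([2.0, 4.0], blocks)
```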
  7. The method according to claim 5, wherein the fifth feature information output by the s-th second feature extraction unit is determined according to feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, and the fifth feature information input into the s-th second feature extraction unit, wherein the P-1 residual blocks are the residual blocks other than the last residual block among the P residual blocks of the s-th second feature extraction unit.
  8. The method according to claim 7, wherein the fifth feature information output by the s-th second feature extraction unit is determined by adding the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, to the fifth feature information input into the s-th second feature extraction unit.
  9. The method according to claim 8, wherein the second feature extraction unit further comprises a gating unit;
    for the gating unit in the s-th second feature extraction unit, the gating unit is configured to perform de-redundancy on the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, and output de-redundant feature information; the fifth feature information output by the s-th second feature extraction unit is determined by adding the de-redundant feature information to the fifth feature information input into the s-th second feature extraction unit.
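The claims do not fix the internal form of the gating unit. One common reading, shown here only as an assumption, is a per-channel sigmoid gate that scales the concatenated residual features so that redundant channels can be suppressed before the skip addition:

```python
import math

def gated_deredundancy(concat_feats, gate_weights):
    """Hypothetical claim-9 gating unit: each channel of the
    concatenated residual features is multiplied by a sigmoid gate;
    channels with very negative gate weights are driven toward zero
    (de-redundancy), channels with large weights pass through."""
    return [f * (1.0 / (1.0 + math.exp(-w)))
            for f, w in zip(concat_feats, gate_weights)]
```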
  10. The method according to any one of claims 1-9, wherein the feature upsampling module comprises a feature upsampling submodule and a feature extraction submodule;
    the feature upsampling submodule is configured to replicate the first feature information of the point cloud block r times according to a preset upsampling rate r, append an n-dimensional vector to the replicated first feature information along the feature dimension to obtain upsampled feature information of the point cloud block, and input the upsampled feature information of the point cloud block into the feature extraction submodule, wherein the values of the n-dimensional vectors corresponding to different pieces of first feature information are different;
    the feature extraction submodule is configured to output the second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
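A minimal sketch of the feature upsampling submodule of claim 10, assuming (as one possibility) that the appended n-dimensional code vector is simply the copy index repeated n times; the claim only requires that the appended vectors differ:

```python
def expand_features(first_feats, r, n=2):
    """Claim-10 feature expansion: replicate each first feature r
    times and append an n-dimensional code vector along the feature
    dimension so the r copies become distinguishable. The choice of
    code vector below is an illustrative assumption."""
    expanded_out = []
    for feat in first_feats:
        for copy_idx in range(r):
            code = [float(copy_idx)] * n   # differs per copy
            expanded_out.append(list(feat) + code)
    return expanded_out

expanded_out = expand_features([[0.1, 0.2]], r=3, n=2)
# 3 copies, each widened from 2 to 2 + n feature channels
```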
  11. The method according to claim 10, wherein the feature extraction submodule comprises Q third feature extraction units, Q being a positive integer;
    for a (k+1)-th third feature extraction unit among the Q third feature extraction units, the (k+1)-th third feature extraction unit is configured to output (k+1)-th enhanced upsampled feature information of the point cloud block according to k-th enhanced upsampled feature information of the point cloud block extracted by a k-th third feature extraction unit, k being a positive integer smaller than Q;
    the second feature information of the point cloud block is the Q-th enhanced upsampled feature information of the point cloud block extracted by the last one of the Q third feature extraction units.
  12. The method according to claim 11, wherein the third feature extraction unit comprises L residual blocks, L being a positive integer;
    for an (l+1)-th residual block in the (k+1)-th third feature extraction unit, the (l+1)-th residual block is configured to output (l+1)-th second residual information according to l-th second residual information output by an l-th residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information input into the (k+1)-th third feature extraction unit, l being a positive integer smaller than L;
    the (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to second residual information output by at least one residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information.
  13. The method according to claim 12, further comprising: adding the l-th second residual information output by the l-th residual block to the k-th enhanced upsampled feature information, and inputting the result into the (l+1)-th residual block.
  14. The method according to claim 13, wherein the (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to feature information obtained by concatenating second residual information output by the last residual block among the L residual blocks with second residual information output by at least one of L-1 residual blocks, and the k-th enhanced upsampled feature information, wherein the L-1 residual blocks are the residual blocks other than the last residual block among the L residual blocks of the (k+1)-th third feature extraction unit.
  15. The method according to claim 14, wherein the (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the feature information obtained by concatenating the second residual information output by the last residual block among the L residual blocks with the second residual information output by at least one of the L-1 residual blocks, to the k-th enhanced upsampled feature information.
  16. The method according to claim 15, wherein the third feature extraction unit further comprises a gating unit;
    for the gating unit in the (k+1)-th third feature extraction unit, the gating unit is configured to perform de-redundancy on the feature information obtained by concatenating the second residual information output by the last residual block in the (k+1)-th third feature extraction unit with the second residual information output by at least one of the L-1 residual blocks, and output de-redundant feature information;
    the (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the de-redundant feature information to the k-th enhanced upsampled feature information.
  17. The method according to claim 10, wherein the feature upsampling module further comprises a first autocorrelation attention network;
    the first autocorrelation attention network is configured to perform feature interaction on the upsampled feature information of the point cloud block output by the feature upsampling submodule, and output the feature-interacted upsampled feature information of the point cloud block to the feature extraction submodule;
    the feature extraction submodule is configured to output the second feature information of the point cloud block according to the feature-interacted upsampled feature information of the point cloud block.
  18. The method according to claim 17, wherein a feature dimension of the feature-interacted upsampled feature information of the point cloud block is lower than a feature dimension of the upsampled feature information of the point cloud block.
  19. The method according to claim 1, wherein the geometry generation module comprises a plurality of fully connected layers;
    the plurality of fully connected layers are configured to output the upsampled geometric information of the point cloud block according to the second feature information of the point cloud block.
  20. The method according to claim 1, wherein the geometry generation module comprises a geometry reconstruction unit, a filtering unit and a downsampling unit;
    the geometry reconstruction unit is configured to perform geometric reconstruction on the second feature information of the point cloud block, and output initial upsampled geometric information of the point cloud block to the filtering unit;
    the filtering unit is configured to denoise the initial upsampled geometric information of the point cloud block, and output the denoised initial upsampled geometric information of the point cloud block to the downsampling unit;
    the downsampling unit is configured to downsample the denoised initial upsampled geometric information of the point cloud block to a target upsampling rate, and output the upsampled geometric information of the point cloud block.
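Claim 20's post-processing chain (reconstruct, filter noise, downsample to a target rate) might look as follows; the centroid-distance filter and the stride downsampling are illustrative assumptions, since the claim does not fix either method:

```python
def postprocess_geometry(initial_points, n_input, target_rate):
    """Toy claim-20 post-processing of the initially reconstructed
    geometry: drop points far from the centroid (a stand-in for the
    filtering unit), then downsample to n_input * target_rate points
    by uniform stride (a stand-in for the downsampling unit)."""
    dim = len(initial_points[0])
    centroid = [sum(p[d] for p in initial_points) / len(initial_points)
                for d in range(dim)]
    dists = [sum((p[d] - centroid[d]) ** 2 for d in range(dim)) ** 0.5
             for p in initial_points]
    cutoff = 2.0 * sum(dists) / len(dists)      # toy outlier threshold
    kept = [p for p, d in zip(initial_points, dists) if d <= cutoff]
    target_n = n_input * target_rate
    step = max(len(kept) // target_n, 1)        # uniform-stride downsample
    return kept[::step][:target_n]

noisy = [(0.0, 0.0)] * 8 + [(100.0, 100.0)]     # 8 dense points + 1 outlier
refined = postprocess_geometry(noisy, n_input=2, target_rate=2)
```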
  21. The method according to claim 20, wherein the target upsampling rate is less than or equal to an upsampling rate of the feature upsampling module.
  22. The method according to claim 20, further comprising:
    decoding the point cloud bitstream to obtain the target upsampling rate.
  23. A point cloud decoder, characterized by comprising:
    a decoding unit, configured to decode a point cloud bitstream to obtain geometric information of a point cloud;
    a division unit, configured to divide the point cloud into at least one point cloud block according to the geometric information of the point cloud; and
    an upsampling unit, configured to input geometric information of the point cloud block into a generator for upsampling, to obtain upsampled geometric information of the point cloud block;
    wherein the generator comprises a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is configured to extract first feature information of the point cloud block, the feature upsampling module is configured to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is configured to map the second feature information of the point cloud block into a geometric space to obtain the upsampled geometric information of the point cloud block.
  24. A point cloud decoder, characterized by comprising a processor and a memory;
    the memory is configured to store a computer program;
    the processor is configured to invoke and run the computer program stored in the memory to perform the point cloud decoding method according to any one of claims 1-22.
  25. A point cloud upsampling method, characterized by comprising:
    obtaining geometric information of a point cloud to be upsampled;
    dividing the point cloud to be upsampled into at least one point cloud block according to the geometric information of the point cloud to be upsampled; and
    inputting geometric information of the point cloud block into a generator for upsampling, to obtain upsampled geometric information of the point cloud block;
    wherein the generator comprises a feature extraction module, a feature upsampling module and a geometry generation module; the feature extraction module is configured to extract first feature information of the point cloud block, the feature upsampling module is configured to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is configured to map the second feature information of the point cloud block into a geometric space to obtain the upsampled geometric information of the point cloud block.
  26. The method according to claim 25, wherein the feature extraction module comprises M densely connected feature extraction blocks;
    for an (i+1)-th feature extraction block among the M feature extraction blocks, the (i+1)-th feature extraction block is configured to output (i+1)-th third feature information according to input i-th fourth feature information, the i-th fourth feature information being determined according to i-th third feature information output by an i-th feature extraction block; the first feature information of the point cloud block is determined according to M-th third feature information output by an M-th feature extraction block among the M feature extraction blocks, and i is a positive integer smaller than M.
  27. The method according to claim 26, wherein:
    if i is not equal to 1, the i-th fourth feature information is feature information obtained by concatenating the third feature information extracted by each feature extraction block preceding the i-th feature extraction block among the M feature extraction blocks with the third feature information extracted by the i-th feature extraction block;
    if i is equal to 1, the i-th fourth feature information is the first third feature information output by the first feature extraction block among the M feature extraction blocks.
  28. The method according to claim 26, wherein the feature extraction block comprises a first feature extraction unit and S second feature extraction units connected in series, S being a positive integer;
    for the first extraction unit in the (i+1)-th feature extraction block, the first extraction unit is configured to: for a current point in the point cloud block, search for K neighboring points of the current point; based on the i-th fourth feature information of the point cloud block, subtract the fourth feature information of the neighboring points from the fourth feature information of the current point to obtain K pieces of residual feature information; concatenate the K pieces of residual feature information with the fourth feature information of the current point to obtain i-th concatenated feature information of the current point; obtain i-th concatenated feature information of the point cloud block according to the i-th concatenated feature information of the current point; and input the i-th concatenated feature information of the point cloud block into the first one of the S second feature extraction units;
    the first second feature extraction unit is configured to output first fifth feature information to the second second feature extraction unit according to the i-th concatenated feature information of the point cloud block, wherein the (i+1)-th third feature information of the point cloud block is the fifth feature information output by the last one of the S second feature extraction units.
  29. The method according to claim 28, wherein the second feature extraction unit comprises P residual blocks, P being a positive integer;
    for a (j+1)-th residual block in an s-th second feature extraction unit, the (j+1)-th residual block is configured to output (j+1)-th first residual information according to j-th first residual information output by a j-th residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit, wherein j is a positive integer smaller than P, and s is a positive integer smaller than or equal to S;
    the fifth feature information output by the s-th second feature extraction unit is determined according to first residual information output by at least one residual block in the s-th second feature extraction unit and the fifth feature information input into the s-th second feature extraction unit.
  30. The method according to claim 29, further comprising: adding the j-th first residual information output by the j-th residual block in the s-th second feature extraction unit to the fifth feature information input into the s-th second feature extraction unit, and inputting the sum into the (j+1)-th residual block in the s-th second feature extraction unit.
  31. The method according to claim 29, wherein the fifth feature information output by the s-th second feature extraction unit is determined according to feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of P-1 residual blocks, and according to the fifth feature information input into the s-th second feature extraction unit, the P-1 residual blocks being the residual blocks, other than the last residual block, among the P residual blocks of the s-th second feature extraction unit.
  32. The method according to claim 31, wherein the fifth feature information output by the s-th second feature extraction unit is determined by adding the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks to the fifth feature information input into the s-th second feature extraction unit.
  33. The method according to claim 32, wherein the second feature extraction unit further comprises a gating unit;
    for the gating unit in the s-th second feature extraction unit, the gating unit is configured to remove redundancy from the feature information obtained by concatenating the first residual information output by the last residual block in the s-th second feature extraction unit with the first residual information output by at least one of the P-1 residual blocks, and to output the de-redundant feature information; the fifth feature information output by the s-th second feature extraction unit is determined by adding the de-redundant feature information to the fifth feature information input into the s-th second feature extraction unit.
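The residual aggregation recited in claims 29-33 can be sketched as follows. This is a minimal NumPy illustration, not the claimed implementation: the residual blocks are stood in for by single linear transforms, and the gating unit is modeled as a learned projection that reduces the concatenated residuals back to the input width before the skip addition.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, w):
    """Placeholder residual block: one linear transform of its input
    (a real unit would use conv/MLP layers)."""
    return x @ w

def gated_unit(x, res_ws, gate_w):
    """Sketch of one 'second feature extraction unit' (claims 29-33):
    P residual blocks in series with skip additions of the unit input,
    residual outputs concatenated, reduced by a gating projection,
    then added back to the unit input."""
    residuals = []
    h = x
    for w in res_ws:
        r = residual_block(h, w)
        residuals.append(r)
        h = r + x                  # claim 30: add unit input before the next block
    concat = np.concatenate(residuals, axis=-1)   # claim 31: concatenate residuals
    gated = concat @ gate_w                       # claim 33: gating removes redundancy
    return gated + x                              # claims 32-33: add the unit input

N, C, P = 16, 8, 3
x = rng.normal(size=(N, C))
res_ws = [rng.normal(size=(C, C)) for _ in range(P)]
gate_w = rng.normal(size=(P * C, C))
out = gated_unit(x, res_ws, gate_w)
assert out.shape == (N, C)
```

The gating projection here concatenates all P residuals; the claims only require that at least one of the first P-1 residuals be concatenated with the last, so narrower variants are equally consistent with the text.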
  34. The method according to any one of claims 25-33, wherein the feature upsampling module comprises a feature upsampling submodule and a feature extraction submodule;
    the feature upsampling submodule is configured to make r copies of the first feature information of the point cloud block according to a preset upsampling rate r, append an n-dimensional vector to each copy of the first feature information in the feature dimension, to obtain upsampled feature information of the point cloud block, and input the upsampled feature information of the point cloud block into the feature extraction submodule, wherein the n-dimensional vectors corresponding to different pieces of first feature information have different values;
    the feature extraction submodule is configured to output second feature information of the point cloud block according to the upsampled feature information of the point cloud block.
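The duplicate-and-tag upsampling of claim 34 can be sketched directly. In this assumed minimal form, n = 1 and the appended code is simply the copy index; any scheme that gives the r replicas of one feature distinct codes satisfies the claim.

```python
import numpy as np

def duplicate_upsample(features, r):
    """Claim 34 sketch: copy each point's first feature information r times
    and append a distinct n-dimensional code (here n = 1, the copy index)
    to each copy so that the r replicas are distinguishable."""
    n_pts, c = features.shape
    tiled = np.repeat(features, r, axis=0)                      # r copies per point
    codes = np.tile(np.arange(r, dtype=features.dtype), n_pts)[:, None]
    return np.concatenate([tiled, codes], axis=1)               # (r*N, C+1)

feat = np.arange(6, dtype=float).reshape(3, 2)   # 3 points, 2-dim features
up = duplicate_upsample(feat, r=4)
assert up.shape == (12, 3)
assert not np.array_equal(up[0], up[1])          # copies differ only in the code
```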
  35. The method according to claim 34, wherein the feature extraction submodule comprises Q third feature extraction units, Q being a positive integer;
    for the (k+1)-th of the Q third feature extraction units, the (k+1)-th third feature extraction unit is configured to output (k+1)-th enhanced upsampled feature information of the point cloud block according to the k-th enhanced upsampled feature information of the point cloud block extracted by the k-th third feature extraction unit, k being a positive integer less than Q;
    the second feature information of the point cloud block is the Q-th enhanced upsampled feature information of the point cloud block extracted by the last of the Q third feature extraction units.
  36. The method according to claim 35, wherein the third feature extraction unit comprises L residual blocks, L being a positive integer;
    for the (l+1)-th residual block in the (k+1)-th third feature extraction unit, the (l+1)-th residual block is configured to output (l+1)-th second residual information according to the l-th second residual information output by the l-th residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information input into the (k+1)-th third feature extraction unit, l being a positive integer less than L;
    the (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to the second residual information output by at least one residual block in the (k+1)-th third feature extraction unit and the k-th enhanced upsampled feature information.
  37. The method according to claim 36, further comprising: adding the l-th second residual information output by the l-th residual block to the k-th enhanced upsampled feature information, and inputting the sum into the (l+1)-th residual block.
  38. The method according to claim 37, wherein the (k+1)-th enhanced upsampled feature information of the point cloud block is determined according to feature information obtained by concatenating the second residual information output by the last of the L residual blocks with the second residual information output by at least one of L-1 residual blocks, and according to the k-th enhanced upsampled feature information, the L-1 residual blocks being the residual blocks, other than the last residual block, among the L residual blocks of the (k+1)-th third feature extraction unit.
  39. The method according to claim 38, wherein the (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the feature information obtained by concatenating the second residual information output by the last of the L residual blocks with the second residual information output by at least one of the L-1 residual blocks to the k-th enhanced upsampled feature information.
  40. The method according to claim 39, wherein the third feature extraction unit further comprises a gating unit;
    for the gating unit in the (k+1)-th third feature extraction unit, the gating unit is configured to remove redundancy from the feature information obtained by concatenating the second residual information output by the last residual block in the (k+1)-th third feature extraction unit with the second residual information output by at least one of the L-1 residual blocks, and to output the de-redundant feature information;
    the (k+1)-th enhanced upsampled feature information of the point cloud block is determined by adding the de-redundant feature information to the k-th enhanced upsampled feature information.
  41. The method according to claim 34, wherein the feature upsampling module further comprises a first self-correlation attention network;
    the first self-correlation attention network is configured to perform feature interaction on the upsampled feature information of the point cloud block output by the feature upsampling submodule, and to output the feature-interacted upsampled feature information of the point cloud block to the feature extraction submodule;
    the feature extraction submodule is configured to output the second feature information of the point cloud block according to the feature-interacted upsampled feature information of the point cloud block.
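A minimal form of the feature interaction in claims 41-42 is single-head self-attention over the upsampled features. The single-head structure, the scaled-dot-product form, and the projection widths below are assumptions; the claims fix only that features interact and that the output feature dimension may be lower than the input one.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Sketch of the self-correlation attention network (claims 41-42):
    every upsampled feature attends to all others; projecting values with
    wv to a smaller width yields the lower output dimension of claim 42."""
    q, k, v = x @ wq, x @ wk, x @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # pairwise interaction
    return attn @ v

rng = np.random.default_rng(1)
n, c, c_out = 10, 8, 4            # output width below input width (claim 42)
x = rng.normal(size=(n, c))
out = self_attention(x,
                     rng.normal(size=(c, c)),
                     rng.normal(size=(c, c)),
                     rng.normal(size=(c, c_out)))
assert out.shape == (n, c_out)
```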
  42. The method according to claim 41, wherein the feature dimension of the feature-interacted upsampled feature information of the point cloud block is lower than the feature dimension of the upsampled feature information of the point cloud block.
  43. The method according to claim 25, wherein the geometry generation module comprises a plurality of fully connected layers;
    the plurality of fully connected layers are configured to output the upsampled geometry information of the point cloud block according to the second feature information of the point cloud block.
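Claim 43's fully connected mapping from feature space to geometry can be sketched as a small MLP. The layer sizes and the ReLU nonlinearity are assumptions; only the final output width (three coordinates per point) is fixed by the claim.

```python
import numpy as np

def geometry_mlp(features, weights, biases):
    """Claim 43 sketch: several fully connected layers map each upsampled
    feature vector to a 3-D coordinate."""
    h = features
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)       # hidden layers with ReLU (assumed)
    return h @ weights[-1] + biases[-1]      # last layer outputs x, y, z

rng = np.random.default_rng(2)
sizes = [16, 8, 3]                           # assumed layer widths
feats = rng.normal(size=(32, 16))
ws = [rng.normal(size=(a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(b) for b in sizes[1:]]
xyz = geometry_mlp(feats, ws, bs)
assert xyz.shape == (32, 3)
```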
  44. The method according to claim 25, wherein the geometry generation module comprises a geometry reconstruction unit, a filtering unit, and a downsampling unit;
    the geometry reconstruction unit is configured to perform geometry reconstruction on the second feature information of the point cloud block, and output initial upsampled geometry information of the point cloud block to the filtering unit;
    the filtering unit is configured to denoise the initial upsampled geometry information of the point cloud block, and output the denoised initial upsampled geometry information of the point cloud block to the downsampling unit;
    the downsampling unit is configured to downsample the denoised initial upsampled geometry information of the point cloud block to a target upsampling rate, and output the upsampled geometry information of the point cloud block.
  45. The method according to claim 44, wherein the target upsampling rate is less than or equal to the upsampling rate of the feature upsampling module.
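The filter-then-downsample pipeline of claims 44-45 might look as follows. The statistical outlier rule and the random subsampling are assumptions standing in for whatever denoising and rate-matching the actual units use; the target count plays the role of the target upsampling rate, which claim 45 bounds by the feature upsampling rate.

```python
import numpy as np

def filter_and_downsample(points, target_count, k=4, std_ratio=2.0):
    """Claims 44-45 sketch: drop statistical outliers (points whose mean
    k-NN distance is far above average), then subsample the survivors to
    the target count implied by the target upsampling rate."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                       # exclude self-distances
    knn_mean = np.sort(np.sqrt(d2), axis=1)[:, :k].mean(axis=1)
    keep = knn_mean <= knn_mean.mean() + std_ratio * knn_mean.std()
    kept = points[keep]                                # denoised geometry
    idx = np.random.default_rng(0).choice(len(kept), size=target_count,
                                          replace=False)
    return kept[idx]                                   # downsampled output

pts = np.random.default_rng(4).normal(size=(40, 3))
out = filter_and_downsample(pts, target_count=20)
assert out.shape == (20, 3)
```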
  46. A point cloud upsampling apparatus, comprising:
    an acquisition unit, configured to acquire geometry information of a point cloud to be upsampled;
    a division unit, configured to divide the point cloud to be upsampled into at least one point cloud block according to the geometry information of the point cloud to be upsampled;
    an upsampling unit, configured to input geometry information of the point cloud block into a generator for upsampling, to obtain upsampled geometry information of the point cloud block;
    wherein the generator comprises a feature extraction module, a feature upsampling module, and a geometry generation module; the feature extraction module is configured to extract first feature information of the point cloud block, the feature upsampling module is configured to upsample the first feature information of the point cloud block into second feature information, and the geometry generation module is configured to map the second feature information of the point cloud block into a geometric space, to obtain the upsampled geometry information of the point cloud block.
  47. A point cloud upsampling device, comprising: a processor and a memory;
    the memory is configured to store a computer program;
    the processor is configured to invoke and run the computer program stored in the memory, to perform the method according to any one of claims 25-45.
  48. A model training method, comprising:
    acquiring geometry information of a training point cloud, and dividing the training point cloud into at least one training point cloud block according to the geometry information of the training point cloud;
    inputting geometry information of the training point cloud block into a feature extraction module of a generator for feature extraction, to obtain first feature information of the training point cloud block;
    inputting the first feature information of the training point cloud block into a feature upsampling module of the generator for upsampling, to obtain second feature information of the training point cloud block;
    inputting the second feature information of the training point cloud block into a geometry generation module of the generator for geometry reconstruction, to obtain predicted upsampled geometry information of the training point cloud block;
    training the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the predicted upsampled geometry information of the training point cloud block, to obtain a trained generator.
  49. The method according to claim 48, wherein the training the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the predicted upsampled geometry information of the training point cloud block, to obtain a trained generator, comprises:
    inputting the predicted upsampled geometry information of the training point cloud block into a discriminator, to obtain a first discrimination result of the discriminator, the discriminator being configured to judge whether data input into the discriminator is an upsampled ground truth of the training point cloud block;
    training the feature extraction module, the feature upsampling module, and the geometry generation module in the generator according to the first discrimination result of the discriminator, to obtain the trained generator.
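The adversarial training of claim 49 follows the usual generator/discriminator pattern: the discriminator scores predicted patches against ground truth, and its score on the prediction drives the generator update. The binary cross-entropy losses and the fixed example scores below are illustrative assumptions, not values from the application.

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy over discriminator scores in (0, 1)."""
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))

# Stand-in discriminator outputs (claim 49's 'first discrimination result'):
scores_on_fake = np.array([0.2, 0.3, 0.25])   # on predicted upsampled geometry
scores_on_real = np.array([0.9, 0.8, 0.85])   # on the upsampled ground truth

# Generator is trained so the discriminator labels its output as real;
# the discriminator is trained to separate the two.
gen_loss = bce(scores_on_fake, np.ones_like(scores_on_fake))
disc_loss = (bce(scores_on_real, np.ones_like(scores_on_real))
             + bce(scores_on_fake, np.zeros_like(scores_on_fake)))
assert gen_loss > 0 and disc_loss > 0
```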
  50. The method according to claim 48 or 49, wherein the feature extraction module comprises M densely connected feature extraction blocks, and the inputting the geometry information of the training point cloud block into the feature extraction module of the generator for feature extraction, to obtain the first feature information of the training point cloud block, comprises:
    inputting the geometry information of the training point cloud block into the feature extraction module, and acquiring i-th third feature information of the training point cloud block extracted by the i-th of the M feature extraction blocks, i being a positive integer less than M;
    obtaining i-th fourth feature information of the training point cloud block according to the i-th third feature information of the training point cloud block;
    inputting the i-th fourth feature information of the training point cloud block into the (i+1)-th feature extraction block, to obtain (i+1)-th third feature information of the training point cloud block;
    taking the M-th third feature information extracted by the M-th feature extraction block as the first feature information of the training point cloud block.
  51. The method according to claim 50, wherein the obtaining the i-th fourth feature information of the training point cloud block according to the i-th third feature information of the training point cloud block comprises:
    if i is not equal to 1, acquiring the third feature information extracted by each of the M feature extraction blocks preceding the i-th feature extraction block, and concatenating the third feature information extracted by each feature extraction block preceding the i-th feature extraction block with the third feature information extracted by the i-th feature extraction block, as the i-th fourth feature information of the training point cloud block;
    if i is equal to 1, taking the first third feature information extracted by the first of the M feature extraction blocks as the i-th fourth feature information of the training point cloud block.
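The dense connectivity of claims 50-51 reduces to a simple concatenation rule over the per-block outputs. A minimal sketch, with 1-based block indices as in the claims:

```python
import numpy as np

def fourth_feature(third_features, i):
    """Claim 51 sketch: the i-th 'fourth' feature information is the
    concatenation of the third feature information of blocks 1..i
    (dense connections); for i == 1 it is the first block's output."""
    if i == 1:
        return third_features[0]
    return np.concatenate(third_features[:i], axis=-1)

n, c = 5, 4
thirds = [np.full((n, c), float(b)) for b in range(1, 4)]   # 3 blocks' outputs
assert fourth_feature(thirds, 1).shape == (5, 4)
assert fourth_feature(thirds, 3).shape == (5, 12)
```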
  52. The method according to claim 50, wherein the feature extraction block comprises a first feature extraction unit and at least one second feature extraction unit connected in series, and the inputting the i-th fourth feature information of the training point cloud block into the (i+1)-th feature extraction block, to obtain the (i+1)-th third feature information of the training point cloud block, comprises:
    inputting the i-th fourth feature information of the training point cloud block into the first feature extraction unit in the (i+1)-th feature extraction block, so that the first feature extraction unit, for a current point in the training point cloud block, searches for K neighboring points of the current point and, based on the i-th fourth feature information, subtracts the fourth feature information of the neighboring points from the fourth feature information of the current point, to obtain K pieces of residual feature information; concatenating the K pieces of residual feature information with the fourth feature information of the current point, to obtain i-th concatenated feature information of the current point, and obtaining i-th concatenated feature information of the training point cloud block according to the i-th concatenated feature information of the current point;
    inputting the i-th concatenated feature information of the training point cloud block into the first second feature extraction unit in the (i+1)-th feature extraction block, to obtain first fifth feature information, and inputting the first fifth feature information into the second second feature extraction unit in the (i+1)-th feature extraction block, to obtain second fifth feature information;
    taking the fifth feature information extracted by the last second feature extraction unit in the (i+1)-th feature extraction block as the (i+1)-th third feature information of the training point cloud block.
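The K-nearest-neighbor residual step shared by claims 28 and 52 can be sketched with brute-force distances. The exhaustive pairwise search and the output layout (K residuals followed by the point's own feature) are assumptions for illustration; a real implementation would use a spatial index.

```python
import numpy as np

def knn_residual_concat(points, feats, k):
    """Claims 28/52 sketch: for each point, find its K nearest neighbors by
    geometry, subtract the neighbor features from the point's own feature,
    and concatenate the K residuals with that feature: width (K + 1) * C."""
    n, c = feats.shape
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # a point is not its own neighbor
    nbr = np.argsort(d2, axis=1)[:, :k]          # (N, K) neighbor indices
    residual = feats[:, None, :] - feats[nbr]    # current minus neighbor features
    return np.concatenate([residual.reshape(n, k * c), feats], axis=1)

rng = np.random.default_rng(3)
pts = rng.normal(size=(12, 3))
f = rng.normal(size=(12, 4))
out = knn_residual_concat(pts, f, k=5)
assert out.shape == (12, (5 + 1) * 4)
```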
  53. The method according to claim 52, wherein the second feature extraction unit comprises P residual blocks, P being a positive integer, and the inputting the i-th concatenated feature information of the training point cloud block into the first second feature extraction unit in the (i+1)-th feature extraction block, to obtain the first fifth feature information, comprises:
    inputting the i-th concatenated feature information into the first second feature extraction unit in the (i+1)-th feature extraction block, and obtaining first residual information output by the j-th residual block in the first second feature extraction unit, j being a positive integer less than or equal to P;
    inputting the first residual information output by the j-th residual block and the i-th concatenated feature information into the (j+1)-th residual block in the first second feature extraction unit, to obtain first residual information output by the (j+1)-th residual block;
    determining the fifth feature information output by the first second feature extraction unit according to the first residual information output by at least one of the P residual blocks in the first second feature extraction unit and the i-th concatenated feature information.
  54. The method according to claim 53, wherein the inputting the first residual information output by the j-th residual block and the i-th concatenated feature information into the (j+1)-th residual block in the first second feature extraction unit, to obtain the first residual information output by the (j+1)-th residual block, comprises:
    adding the first residual information output by the j-th residual block to the i-th concatenated feature information, and inputting the added feature information into the (j+1)-th residual block, to obtain the first residual information output by the (j+1)-th residual block.
  55. The method according to claim 53, wherein the determining the fifth feature information output by the first second feature extraction unit according to the first residual information output by at least one of the P residual blocks in the first second feature extraction unit and the i-th concatenated feature information comprises:
    concatenating the first residual information output by the last of the P residual blocks with the first residual information output by at least one of P-1 residual blocks, the P-1 residual blocks being the residual blocks, other than the last residual block, among the P residual blocks;
    determining the fifth feature information output by the first second feature extraction unit according to the concatenated feature information and the i-th concatenated feature information.
  56. The method according to claim 55, wherein the determining the fifth feature information output by the first second feature extraction unit according to the concatenated feature information and the i-th concatenated feature information comprises:
    adding the concatenated feature information to the i-th concatenated feature information, as the fifth feature information output by the first second feature extraction unit.
  57. The method according to claim 55, wherein the second feature extraction unit further comprises a gating unit, and the determining the fifth feature information output by the first second feature extraction unit according to the concatenated feature information and the i-th concatenated feature information comprises:
    inputting the concatenated feature information into the gating unit for redundancy removal, to obtain de-redundant feature information;
    adding the de-redundant feature information to the i-th concatenated feature information, as the fifth feature information output by the first second feature extraction unit.
  58. The method according to claim 48 or 49, wherein the feature upsampling module comprises a feature upsampling submodule and a feature extraction submodule, and the inputting the first feature information of the training point cloud block into the feature upsampling module of the generator for upsampling, to obtain the second feature information of the training point cloud block, comprises:
    inputting the first feature information of the training point cloud block into the feature upsampling submodule, so that the feature upsampling submodule makes r copies of the first feature information of the training point cloud block according to a preset upsampling rate r, and appends an n-dimensional vector to each copy of the first feature information in the feature dimension, to obtain upsampled feature information of the training point cloud block, wherein the n-dimensional vectors corresponding to different pieces of first feature information have different values;
    inputting the upsampled feature information of the training point cloud block into the feature extraction submodule, to obtain the second feature information of the training point cloud block extracted by the feature extraction submodule.
  59. The method according to claim 58, wherein the feature extraction submodule comprises Q third feature extraction units connected in series, Q being a positive integer, and the inputting the upsampled feature information of the training point cloud block into the feature extraction submodule, to obtain the second feature information of the training point cloud block extracted by the feature extraction submodule, comprises:
    inputting the upsampled feature information of the training point cloud block into the feature extraction submodule, and obtaining k-th enhanced upsampled feature information of the training point cloud block extracted by the k-th third feature extraction unit;
    inputting the k-th enhanced upsampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit, to obtain (k+1)-th enhanced upsampled feature information of the training point cloud block extracted by the (k+1)-th third feature extraction unit;
    taking the Q-th enhanced upsampled feature information of the training point cloud block extracted by the last of the Q third feature extraction units as the second feature information of the training point cloud block.
  60. The method according to claim 59, wherein the third feature extraction unit comprises L residual blocks, L being a positive integer, and the inputting the k-th enhanced upsampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit, to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block extracted by the (k+1)-th third feature extraction unit, comprises:
    inputting the k-th enhanced upsampled feature information of the training point cloud block into the (k+1)-th third feature extraction unit, and obtaining second residual information output by the l-th residual block in the (k+1)-th third feature extraction unit, l being a positive integer less than or equal to L;
    inputting the second residual information output by the l-th residual block and the k-th enhanced upsampled feature information into the (l+1)-th residual block, to obtain second residual information output by the (l+1)-th residual block;
    obtaining the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced upsampled feature information.
  61. The method according to claim 60, wherein the inputting the second residual information output by the l-th residual block and the k-th enhanced upsampled feature information into the (l+1)-th residual block, to obtain the second residual information output by the (l+1)-th residual block, comprises:
    adding the second residual information output by the l-th residual block to the k-th enhanced upsampled feature information, and inputting the added feature information into the (l+1)-th residual block, to determine the second residual information output by the (l+1)-th residual block.
  62. The method according to claim 60, wherein determining the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the second residual information output by at least one of the L residual blocks and the k-th enhanced upsampled feature information comprises:
    concatenating the second residual information output by the last residual block of the L residual blocks with the second residual information output by at least one of L-1 residual blocks, the L-1 residual blocks being the residual blocks of the L residual blocks other than the last residual block; and
    determining the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced upsampled feature information.
  63. The method according to claim 62, wherein determining the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced upsampled feature information comprises:
    adding the concatenated feature information and the k-th enhanced upsampled feature information to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block.
  64. The method according to claim 62, wherein the third feature extraction unit further comprises a gating unit, and wherein determining the (k+1)-th enhanced upsampled feature information of the training point cloud block according to the concatenated feature information and the k-th enhanced upsampled feature information comprises:
    inputting the concatenated feature information into the gating unit for redundancy removal to obtain de-redundant feature information; and
    adding the de-redundant feature information and the k-th enhanced upsampled feature information to obtain the (k+1)-th enhanced upsampled feature information of the training point cloud block.
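Claims 62 to 64 (concatenate the block residuals, remove redundancy in a gating unit, add the result back onto the unit's input) can be sketched like this. The claims do not specify the internals of the gating unit; modelling it as a learned projection modulated by a sigmoid gate is an assumption, as are all layer sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM, L = 8, 3            # feature dim and residual-block count (assumed)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

residuals = [rng.standard_normal(DIM) for _ in range(L)]   # per-block residuals
feat_k = rng.standard_normal(DIM)                          # unit input feature

concat = np.concatenate(residuals)                         # claim 62: (L*DIM,)

# gating unit (assumed form): project back to DIM and gate element-wise
w_proj = rng.standard_normal((L * DIM, DIM)) * 0.1
w_gate = rng.standard_normal((L * DIM, DIM)) * 0.1
projected = concat @ w_proj
gate = sigmoid(concat @ w_gate)
de_redundant = gate * projected                            # claim 64: de-redundant feature

feat_k_plus_1 = de_redundant + feat_k                      # add the skip connection
print(feat_k_plus_1.shape)  # (8,)
```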
  65. The method according to claim 58, wherein the feature upsampling module further comprises a first autocorrelation attention network, and wherein inputting the upsampled feature information of the training point cloud block into the feature extraction sub-module to obtain the second feature information of the training point cloud block extracted by the feature extraction unit comprises:
    inputting the upsampled feature information of the training point cloud block into the first autocorrelation attention network for feature interaction to obtain feature-interacted upsampled feature information of the training point cloud block; and
    inputting the feature-interacted upsampled feature information of the training point cloud block into the feature extraction sub-module for feature extraction to obtain the second feature information of the training point cloud block.
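A minimal single-head self-attention pass is one plausible reading of the "autocorrelation attention network" of claim 65, which lets the upsampled point features interact with each other. The projection sizes below are assumptions; the only constraint taken from the claims is that the interacted feature dimension may be lower than the input dimension (claim 66).

```python
import numpy as np

rng = np.random.default_rng(4)
N, D_IN, D_OUT = 16, 32, 16        # points, input dim, reduced output dim (assumed)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

x = rng.standard_normal((N, D_IN))           # upsampled features of the block
wq, wk, wv = (rng.standard_normal((D_IN, D_OUT)) * 0.1 for _ in range(3))
q, k, v = x @ wq, x @ wk, x @ wv
attn = softmax(q @ k.T / np.sqrt(D_OUT))     # (N, N) pairwise interaction weights
out = attn @ v                               # (N, D_OUT): feature-interacted output
print(out.shape)  # (16, 16)
```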
  66. The method according to claim 65, wherein a feature dimension of the feature-interacted upsampled feature information of the training point cloud block is lower than a feature dimension of the upsampled feature information of the training point cloud block.
  67. The method according to claim 48 or 49, wherein the geometry generation module comprises a plurality of fully connected layers, and wherein inputting the second feature information of the training point cloud block into the geometry generation module of the generator for geometry reconstruction to obtain the predicted upsampled geometry information of the training point cloud block comprises:
    inputting the second feature information of the training point cloud block into the plurality of fully connected layers to obtain the predicted upsampled geometry information of the training point cloud block.
  68. The method according to claim 48 or 49, wherein the geometry generation module comprises a geometry reconstruction unit, a filtering unit and a downsampling unit, and wherein inputting the second feature information of the training point cloud block into the geometry generation module of the generator to obtain the predicted upsampled geometry information of the training point cloud block comprises:
    inputting the second feature information of the training point cloud block into the geometry reconstruction unit for geometry reconstruction to obtain initial upsampled geometry information of the training point cloud block;
    inputting the initial upsampled geometry information of the training point cloud block into the filtering unit for denoising to obtain denoised initial upsampled geometry information of the training point cloud block; and
    inputting the denoised initial upsampled geometry information of the training point cloud block into the downsampling unit for downsampling to obtain the predicted upsampled geometry information of the training point cloud block.
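The claims leave the downsampling unit of claim 68 unspecified. Farthest point sampling (FPS) is one common way to reduce an over-generated point set to the target point count while preserving coverage, and is sketched here purely as an example of such a unit; the point counts are assumed values.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Greedily select n_samples points that maximize mutual distance."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    chosen = [int(rng.integers(n))]
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dist))       # farthest point from the chosen set
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

# e.g. reduce an over-generated block of 256 points down to 128 predicted
# points (i.e. 4x the 32 input points, an assumed upsampling rate)
dense = np.random.default_rng(2).random((256, 3))
pred = farthest_point_sampling(dense, 128)
print(pred.shape)  # (128, 3)
```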
  69. The method according to claim 68, wherein an upsampling rate corresponding to the upsampled geometry information of the training point cloud block is less than or equal to an upsampling rate of the feature upsampling module.
  70. The method according to claim 49, wherein the discriminator is a pre-trained discriminator.
  71. The method according to claim 49, wherein the method further comprises:
    training the discriminator by using the geometry information of the training point cloud block.
  72. The method according to claim 71, wherein training the discriminator by using the geometry information of the training point cloud block comprises:
    inputting the predicted upsampled geometry information of the training point cloud block generated by the generator into the discriminator to obtain a second discrimination result of the discriminator;
    inputting an upsampling ground truth of the geometry information of the training point cloud block into the discriminator to obtain a third discrimination result of the discriminator;
    determining a loss of the discriminator according to the second discrimination result and the third discrimination result; and
    training the discriminator according to the loss of the discriminator.
  73. The method according to claim 72, wherein determining the loss of the discriminator according to the second discrimination result and the third discrimination result comprises:
    determining the loss of the discriminator by using a least squares loss function according to the second discrimination result and the third discrimination result.
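The least squares losses of claims 73 and 91 have the standard LSGAN form. The claims do not state the regression targets, so the conventional choice of 1 for real (ground-truth) inputs and 0 for generated inputs is an assumption here:

```python
import numpy as np

# discriminator: L_D = 1/2 E[(D(real) - 1)^2] + 1/2 E[D(fake)^2]
def d_loss(d_real, d_fake):
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

# generator: L_G = 1/2 E[(D(fake) - 1)^2]
def g_loss(d_fake):
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

d_real = np.array([0.9, 0.8])   # third discrimination result (ground truth input)
d_fake = np.array([0.2, 0.1])   # second discrimination result (generated input)
print(d_loss(d_real, d_fake))   # 0.025
print(g_loss(d_fake))           # 0.3625
```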
  74. The method according to claim 72, wherein the discriminator comprises a global discrimination module, a boundary discrimination module and a fully connected module, and wherein the method further comprises:
    acquiring geometry information of boundary points of a target point cloud block;
    inputting the geometry information of the boundary points of the target point cloud block into the boundary discrimination module to obtain boundary feature information of the target point cloud block;
    inputting the geometry information of the target point cloud block into the global discrimination module to obtain global feature information of the target point cloud block; and
    inputting the global feature information and the boundary feature information of the target point cloud block into the fully connected module to obtain a target discrimination result of the discriminator;
    wherein, if the target point cloud block is a training point cloud block upsampled by the generator and the discriminator has not been trained with the training point cloud block, the target discrimination result is the second discrimination result; if the target point cloud block is the upsampling ground truth of the training point cloud block, the target discrimination result is the third discrimination result; and if the target point cloud block is a training point cloud block upsampled by the generator and the discriminator has been trained with the training point cloud block, the target discrimination result is the first discrimination result.
  75. The method according to claim 74, wherein acquiring the geometry information of the boundary points of the target point cloud block comprises:
    extracting the geometry information of the boundary points of the target point cloud block by using a high-pass graph filter.
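One way to realize the high-pass graph filter of claim 75 is to build a k-nearest-neighbour graph over the block, apply the operator I - A (identity minus a row-normalized adjacency, here the neighbourhood mean), and keep the points with the largest filter response; boundary points and sharp features respond strongly. The choice of k, the keep ratio, and this particular filter form are assumptions, not details from the claims.

```python
import numpy as np

def boundary_points(points, k=4, keep=0.25):
    """Keep the `keep` fraction of points with the largest high-pass response."""
    n = points.shape[0]
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]            # k nearest neighbours per point
    neigh_mean = points[nn].mean(axis=1)         # A @ x with row-stochastic A
    response = np.linalg.norm(points - neigh_mean, axis=1)  # (I - A) x
    m = max(1, int(keep * n))
    return points[np.argsort(response)[-m:]]

# a dense 4x4 interior grid plus one far-away point: the isolated point has
# by far the largest high-pass response and is selected first
grid = np.array([[x, y, 0.0] for x in range(4) for y in range(4)], float)
pts = np.vstack([grid, [[10.0, 10.0, 0.0]]])
edge = boundary_points(pts, k=4, keep=0.1)
print(edge)
```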
  76. The method according to claim 74, wherein inputting the global feature information and the boundary feature information of the target point cloud block into the fully connected module to obtain the target discrimination result of the discriminator comprises:
    concatenating the global feature information and the boundary feature information of the target point cloud block; and
    inputting the concatenated global feature information and boundary feature information into the fully connected module to obtain the target discrimination result of the discriminator.
  77. The method according to claim 74, wherein the global discrimination module comprises, in order along the network depth direction, a first number of multilayer perceptrons, a first max pooling layer, a second autocorrelation attention network, a second number of multilayer perceptrons and a second max pooling layer, and wherein inputting the geometry information of the target point cloud block into the global discrimination module to obtain the global feature information of the target point cloud block comprises:
    inputting the geometry information of the target point cloud block into the first number of multilayer perceptrons for feature extraction to obtain first global feature information of the target point cloud block;
    inputting the first global feature information into the first max pooling layer for dimensionality reduction to obtain second global feature information of the target point cloud block;
    inputting the first global feature information and the second global feature information into the second autocorrelation attention network for feature interaction to obtain third global feature information of the target point cloud block;
    inputting the third global feature information into the second number of multilayer perceptrons for feature extraction to obtain fourth global feature information of the target point cloud block; and
    inputting the fourth global feature information into the second max pooling layer for dimensionality reduction to obtain the global feature information of the target point cloud block.
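A shape-level sketch of the global discrimination module of claim 77, using the feature dimensions of claim 82 (32, 64, 128, 256). Sharing each perceptron across points (PointNet style) is an assumption, the weights are drawn at call time purely for the shape illustration, and the attention step is approximated by concatenating the pooled vector back onto every point's feature, which is the feature interaction of claim 78 without the attention weighting.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 64                                   # points in the target block (assumed)

def shared_mlp(x, out_dim):
    """Per-point linear + ReLU, shared across all points (weights illustrative)."""
    w = rng.standard_normal((x.shape[1], out_dim)) * 0.1
    return np.maximum(x @ w, 0.0)

x = rng.random((N, 3))                   # geometry of the target point cloud block
f = shared_mlp(shared_mlp(x, 32), 64)    # first two MLPs        -> (N, 64)
g = f.max(axis=0)                        # first max pooling     -> (64,)

# feature interaction: broadcast the pooled vector back onto each point
f = np.concatenate([f, np.tile(g, (N, 1))], axis=1)   #          -> (N, 128)
f = shared_mlp(shared_mlp(f, 128), 256)  # second pair of MLPs   -> (N, 256)
global_feature = f.max(axis=0)           # second max pooling    -> (256,)
print(global_feature.shape)  # (256,)
```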
  78. The method according to claim 77, wherein inputting the first global feature information and the second global feature information into the second autocorrelation attention network for feature interaction to obtain the third global feature information of the target point cloud block comprises:
    concatenating the first global feature information and the second global feature information; and
    inputting the concatenated first global feature information and second global feature information into the second autocorrelation attention network for feature interaction to obtain the third global feature information of the target point cloud block.
  79. The method according to claim 77, wherein the first number is equal to the second number.
  80. The method according to claim 79, wherein the first number and the second number are both equal to 2.
  81. The method according to claim 80, wherein the first number of multilayer perceptrons comprise a first-layer multilayer perceptron and a second-layer multilayer perceptron, the second number of multilayer perceptrons comprise a third-layer multilayer perceptron and a fourth-layer multilayer perceptron, and feature dimensions of the first-layer multilayer perceptron, the second-layer multilayer perceptron, the third-layer multilayer perceptron and the fourth-layer multilayer perceptron increase in sequence.
  82. The method according to claim 81, wherein the feature dimension of the first-layer multilayer perceptron is 32, the feature dimension of the second-layer multilayer perceptron is 64, the feature dimension of the third-layer multilayer perceptron is 128, and the feature dimension of the fourth-layer multilayer perceptron is 256.
  83. The method according to claim 74, wherein the boundary discrimination module comprises, in order along the network depth direction, a third number of multilayer perceptrons, a third max pooling layer, a third autocorrelation attention network, a fourth number of multilayer perceptrons and a fourth max pooling layer, and wherein inputting the geometry information of the boundary points of the target point cloud block into the boundary discrimination module to obtain the boundary feature information of the target point cloud block comprises:
    inputting the geometry information of the boundary points of the target point cloud block into the third number of multilayer perceptrons for feature extraction to obtain first boundary feature information of the target point cloud block;
    inputting the first boundary feature information into the third max pooling layer for dimensionality reduction to obtain second boundary feature information of the target point cloud block;
    inputting the first boundary feature information and the second boundary feature information into the third autocorrelation attention network for feature interaction to obtain third boundary feature information of the target point cloud block;
    inputting the third boundary feature information into the fourth number of multilayer perceptrons for feature extraction to obtain fourth boundary feature information of the target point cloud block; and
    inputting the fourth boundary feature information into the fourth max pooling layer for dimensionality reduction to obtain the boundary feature information of the target point cloud block.
  84. The method according to claim 83, wherein inputting the first boundary feature information and the second boundary feature information into the third autocorrelation attention network for feature interaction to obtain the third boundary feature information of the target point cloud block comprises:
    concatenating the first boundary feature information and the second boundary feature information; and
    inputting the concatenated first boundary feature information and second boundary feature information into the third autocorrelation attention network for feature interaction to obtain the third boundary feature information of the target point cloud block.
  85. The method according to claim 83, wherein the third number is equal to the fourth number.
  86. The method according to claim 85, wherein the third number and the fourth number are both equal to 2.
  87. The method according to claim 86, wherein the third number of multilayer perceptrons comprise a fifth-layer multilayer perceptron and a sixth-layer multilayer perceptron, the fourth number of multilayer perceptrons comprise a seventh-layer multilayer perceptron and an eighth-layer multilayer perceptron, and feature dimensions of the fifth-layer multilayer perceptron, the sixth-layer multilayer perceptron, the seventh-layer multilayer perceptron and the eighth-layer multilayer perceptron increase in sequence.
  88. The method according to claim 87, wherein the feature dimension of the eighth-layer multilayer perceptron is greater than or equal to the feature dimension of the seventh-layer multilayer perceptron, and is less than or equal to the feature dimension of the fourth-layer multilayer perceptron.
  89. The method according to claim 88, wherein the feature dimension of the fifth-layer multilayer perceptron is 32, the feature dimension of the sixth-layer multilayer perceptron is 64, the feature dimension of the seventh-layer multilayer perceptron is 128, and the feature dimension of the eighth-layer multilayer perceptron is 192.
  90. The method according to claim 49, wherein training the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the first discrimination result of the discriminator to obtain the trained generator comprises:
    determining a first loss of the generator according to the first discrimination result; and
    determining parameter matrices of the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the first loss.
  91. The method according to claim 90, wherein determining the first loss of the generator according to the first discrimination result comprises:
    determining the first loss of the generator by using a least squares loss function according to the first discrimination result.
  92. The method according to claim 90, wherein determining the parameter matrices of the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the first loss comprises:
    determining at least one second loss of the generator;
    determining a target loss of the generator according to the first loss of the generator and the at least one second loss of the generator; and
    determining the parameter matrices of the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the target loss of the generator.
  93. The method according to claim 92, wherein determining the at least one second loss of the generator comprises:
    determining one second loss of the generator by using an Earth Mover's Distance according to the upsampled geometry information of the training point cloud block and the upsampling ground truth of the geometry information of the training point cloud block.
  94. The method according to claim 92, wherein determining the at least one second loss of the generator comprises:
    downsampling the upsampled geometry information of the training point cloud block to obtain a downsampled training point cloud block with the same number of points as the training point cloud block; and
    determining one second loss of the generator by using an Earth Mover's Distance according to the geometry information of the downsampled training point cloud block and the geometry information of the training point cloud block.
  95. The method according to claim 94, wherein determining the one second loss of the generator by using the Earth Mover's Distance according to the geometry information of the downsampled training point cloud block and the geometry information of the training point cloud block comprises:
    determining the one second loss of the generator according to the following formula:

      L_id = min_{φ: P_low → P_ori} Σ_{x_k ∈ P_low} ‖x_k − φ(x_k)‖₂

    where L_id is the second loss of the generator, P_ori is the training point cloud block, P_low is the downsampled training point cloud block, φ: P_low → P_ori denotes a bijection between P_low and P_ori, there being exactly one such assignment for which the total distance by which the points of P_low are moved onto the point set of P_ori is minimal, x_k is the k-th point in P_low, and φ(x_k) is the point in P_ori corresponding to x_k.
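The L_id loss of claim 95 can be illustrated by brute force: the Earth Mover's Distance between two equal-sized point sets is the minimum, over all bijections φ, of the summed distances ‖x_k − φ(x_k)‖. Enumerating permutations is only feasible for tiny sets and is used here solely to make the definition concrete; practical implementations solve the assignment problem instead.

```python
import itertools
import numpy as np

def emd(p_low, p_ori):
    """Exact EMD between equal-sized point sets via permutation enumeration."""
    n = len(p_low)
    best = float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(np.linalg.norm(p_low[k] - p_ori[perm[k]]) for k in range(n))
        best = min(best, cost)
    return best

p_low = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
p_ori = np.array([[1.1, 0.0, 0.0], [0.1, 0.0, 0.0]])
# optimal bijection maps 0.0 -> 0.1 and 1.0 -> 1.1, total cost 0.2
print(emd(p_low, p_ori))  # 0.2
```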
  96. The method according to claim 92, wherein determining the at least one second loss of the generator comprises:
    determining at least one second loss of the generator according to a uniform loss function.
  97. The method according to claim 92, wherein determining the target loss of the generator according to the first loss of the generator and the at least one second loss comprises:
    determining a weighted average of the first loss of the generator and the at least one second loss as the target loss of the generator.
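The target loss of claim 97 is a weighted average of the adversarial (first) loss and the second losses. A minimal sketch, with the weight values being hyperparameters not fixed by the claims:

```python
def target_loss(first_loss, second_losses, weights):
    """Weighted average of the first loss followed by the second losses."""
    terms = [first_loss] + list(second_losses)
    assert len(weights) == len(terms)
    return sum(w * t for w, t in zip(weights, terms)) / sum(weights)

# illustrative values: adversarial loss 0.4, two second losses 0.2 and 0.1,
# with assumed weights 1, 1 and 2
print(target_loss(0.4, [0.2, 0.1], [1.0, 1.0, 2.0]))  # (0.4+0.2+0.2)/4 = 0.2
```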
  98. A generator training apparatus, comprising:
    an acquisition unit, configured to acquire geometry information of a training point cloud;
    a division unit, configured to divide the training point cloud into at least one training point cloud block according to the geometry information of the training point cloud; and
    a training unit, configured to: input geometry information of the training point cloud block into a feature extraction module of a generator for feature extraction to obtain first feature information of the training point cloud block; input the first feature information of the training point cloud block into a feature upsampling module of the generator for upsampling to obtain second feature information of the training point cloud block; input the second feature information of the training point cloud block into a geometry generation module of the generator for geometry reconstruction to obtain predicted upsampled geometry information of the training point cloud block; and train the feature extraction module, the feature upsampling module and the geometry generation module in the generator according to the predicted upsampled geometry information of the training point cloud block to obtain a trained generator.
  99. A generator training device, comprising a processor and a memory, wherein:
    the memory is configured to store a computer program; and
    the processor is configured to invoke and run the computer program stored in the memory to perform the method according to any one of claims 48 to 97.
  100. A computer-readable storage medium, configured to store a computer program, wherein the computer program causes a computer to perform the method according to any one of claims 1 to 23, 25 to 45, or 48 to 97.
PCT/CN2021/096287 2021-05-27 2021-05-27 Point cloud decoding and upsampling and model training methods and apparatus WO2022246724A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180096083.7A CN117242493A (en) 2021-05-27 2021-05-27 Point cloud decoding, upsampling and model training method and device
PCT/CN2021/096287 WO2022246724A1 (en) 2021-05-27 2021-05-27 Point cloud decoding and upsampling and model training methods and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/096287 WO2022246724A1 (en) 2021-05-27 2021-05-27 Point cloud decoding and upsampling and model training methods and apparatus

Publications (1)

Publication Number Publication Date
WO2022246724A1 true WO2022246724A1 (en) 2022-12-01

Family

ID=84229438

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096287 WO2022246724A1 (en) 2021-05-27 2021-05-27 Point cloud decoding and upsampling and model training methods and apparatus

Country Status (2)

Country Link
CN (1) CN117242493A (en)
WO (1) WO2022246724A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664645A (en) * 2023-07-28 2023-08-29 之江实验室 Point cloud matching method and device, electronic device and storage medium
CN117456078A (en) * 2023-12-19 2024-01-26 北京渲光科技有限公司 Neural radiation field rendering method, system and equipment based on various sampling strategies

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020190093A1 (en) * 2019-03-20 2020-09-24 엘지전자 주식회사 Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN111862289A (en) * 2020-08-04 2020-10-30 天津大学 Point cloud up-sampling method based on GAN network
CN112218077A (en) * 2019-07-11 2021-01-12 腾讯美国有限责任公司 Method and device for encoding point cloud attributes between channels and readable storage medium
CN112241997A (en) * 2020-09-14 2021-01-19 西北大学 Three-dimensional model establishing and repairing method and system based on multi-scale point cloud up-sampling
CN112565734A (en) * 2020-12-03 2021-03-26 西安电子科技大学 Point cloud attribute coding and decoding method and device based on hybrid coding
US20210104013A1 (en) * 2019-10-03 2021-04-08 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020190093A1 (en) * 2019-03-20 2020-09-24 엘지전자 주식회사 Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN112218077A (en) * 2019-07-11 2021-01-12 腾讯美国有限责任公司 Method and device for encoding point cloud attributes between channels and readable storage medium
US20210104013A1 (en) * 2019-10-03 2021-04-08 Lg Electronics Inc. Point cloud data transmission device, point cloud data transmission method, point cloud data reception device, and point cloud data reception method
CN111862289A (en) * 2020-08-04 2020-10-30 天津大学 Point cloud up-sampling method based on GAN network
CN112241997A (en) * 2020-09-14 2021-01-19 西北大学 Three-dimensional model establishing and repairing method and system based on multi-scale point cloud up-sampling
CN112565734A (en) * 2020-12-03 2021-03-26 西安电子科技大学 Point cloud attribute coding and decoding method and device based on hybrid coding

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116664645A (en) * 2023-07-28 2023-08-29 Zhejiang Lab Point cloud matching method and device, electronic device and storage medium
CN116664645B (en) * 2023-07-28 2023-10-20 Zhejiang Lab Point cloud matching method and device, electronic device and storage medium
CN117456078A (en) * 2023-12-19 2024-01-26 Beijing Xuanguang Technology Co., Ltd. Neural radiation field rendering method, system and equipment based on various sampling strategies
CN117456078B (en) * 2023-12-19 2024-03-26 Beijing Xuanguang Technology Co., Ltd. Neural radiation field rendering method, system and equipment based on various sampling strategies

Also Published As

Publication number Publication date
CN117242493A (en) 2023-12-15

Similar Documents

Publication Publication Date Title
CN112771583B (en) Method, device and storage medium for processing point cloud data
KR20230074137A (en) Instance adaptive image and video compression using machine learning systems
WO2022246724A1 (en) Point cloud decoding and upsampling and model training methods and apparatus
US20230047400A1 (en) Method for predicting point cloud attribute, encoder, decoder, and storage medium
WO2023130333A1 (en) Encoding and decoding method, encoder, decoder, and storage medium
Balasubramani et al. Efficient image compression techniques for compressing multimodal medical images using neural network radial basis function approach
WO2022133753A1 (en) Point cloud encoding and decoding methods and systems, point cloud encoder, and point cloud decoder
WO2022100140A1 (en) Compression encoding method and apparatus, and decompression method and apparatus
WO2022067775A1 (en) Point cloud encoding and decoding method, encoder, decoder and codec system
Feng et al. Neural subspaces for light fields
WO2022257145A1 (en) Point cloud attribute prediction method and apparatus, and codec
CN115086716B (en) Method and device for selecting neighbor points in point cloud and coder-decoder
WO2022140937A1 (en) Point cloud encoding method and system, point cloud decoding method and system, point cloud encoder, and point cloud decoder
WO2024026712A1 (en) Point cloud coding method and apparatus, point cloud decoding method and apparatus, and device and storage medium
WO2023024842A1 (en) Point cloud encoding/decoding method, apparatus and device, and storage medium
Ungureanu et al. Image-Compression Techniques: Classical and “Region-of-Interest-Based” Approaches Presented in Recent Papers
US20230082456A1 (en) Point cloud attribute prediction method and apparatus, and related device
WO2023103565A1 (en) Point cloud attribute information encoding and decoding method and apparatus, device, and storage medium
US20230237704A1 (en) Point cloud decoding and encoding method, and decoder, encoder and encoding and decoding system
WO2022257150A1 (en) Point cloud encoding and decoding methods and apparatus, point cloud codec, and storage medium
WO2023050381A1 (en) Image and video coding using multi-sensor collaboration
WO2024011381A1 (en) Point cloud encoding method and apparatus, point cloud decoding method and apparatus, device and storage medium
CN115086658B (en) Point cloud data processing method and device, storage medium and encoding and decoding equipment
WO2024007144A1 (en) Encoding method, decoding method, code stream, encoders, decoders and storage medium
WO2023178662A1 (en) Image and video coding using multi-sensor collaboration and frequency adaptive processing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21942304

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE