CN113160068B - Point cloud completion method and system based on image - Google Patents
- Publication number
- CN113160068B (application CN202110204647.6A)
- Authority
- CN
- China
- Prior art keywords
- point cloud
- area
- incomplete
- fine
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T7/00—Image analysis; G06T7/50—Depth or shape recovery
- G06T2207/00—Indexing scheme for image analysis or image enhancement; G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10024—Color image
- G06T2207/10028—Range image; Depth image; 3D point clouds
Abstract
The application provides an image-based point cloud completion method and system, relating to the technical fields of computer vision and computational photography. The method comprises the following steps: acquiring a single true-color RGB image of a target object and incomplete point clouds in different scenes, and performing point cloud reconstruction on the RGB image to obtain a sparse point cloud; unifying the sparse point cloud and the incomplete point cloud to the same viewing angle, merging them, and sampling by farthest point sampling to obtain a coarse point cloud; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud, and obtaining a fine point cloud area and a coarse point cloud area according to the nearest-neighbor distance and a threshold; predicting displacement vectors of different lengths for the fine point cloud area and the coarse point cloud area respectively, adding the displacement vectors to the respective point clouds, and merging the enhanced fine point cloud area and coarse point cloud area to obtain a high-precision point cloud. View information is thereby introduced to improve the precision of point cloud completion and to achieve view-based cross-modal completion of the three-dimensional point cloud.
Description
Technical Field
The application relates to the technical field of computer vision and computational photography, in particular to a point cloud completion method and system based on an image.
Background
Stereoscopic vision has important application value in fields such as augmented reality (AR), virtual reality (VR), film production, and industrial flaw detection, where accurate description and complete presentation of three-dimensional objects are essential. The three-dimensional point cloud is a common representation of three-dimensional objects that accurately describes the coordinates of each point in space. Each point is a triplet (x, y, z) giving the position of a point in space, and a point cloud consisting of a series of such points describes a three-dimensional object within that space. Owing to its simple structure and ease of processing, the point cloud has become the main acquisition format of depth cameras. A depth camera senses depth information in three-dimensional space with elements such as structured light or a time-of-flight sensor and stores the depth information in point cloud form. Because of limited acquisition viewing angles, object occlusion, object motion, and other factors, a depth camera cannot capture the complete three-dimensional structure of a three-dimensional object in a single acquisition, or the acquired point cloud is incomplete to varying degrees. An incomplete point cloud cannot effectively and completely describe the structure of the three-dimensional object, has low practical value, and is difficult to apply effectively in various production scenarios. Completion and enhancement of incomplete point clouds is therefore an urgently needed technical means in the industry.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present application is to provide an image-based point cloud completion method, which introduces view information to improve the precision of point cloud completion and achieve view-based cross-modal completion of three-dimensional point cloud.
A second objective of the present application is to provide an image-based point cloud completion system.
In order to achieve the above object, an embodiment of a first aspect of the present application provides an image-based point cloud completion method, including:
acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline;
aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud;
calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using a chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value;
respectively predicting displacement vectors with different lengths by using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain an enhanced fine point cloud area and an enhanced rough point cloud area;
and merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
According to the image-based point cloud completion method of the embodiments of the present application, a single true-color RGB image of the target object and incomplete point clouds in different scenes are acquired, and point cloud reconstruction is performed on the RGB image to obtain a sparse point cloud; the sparse point cloud and the incomplete point cloud are unified to the same viewing angle, merged, and sampled by farthest point sampling to obtain a coarse point cloud; the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud is calculated, and a fine point cloud area and a coarse point cloud area are obtained according to the nearest-neighbor distance and a threshold; displacement vectors of different lengths are predicted for the fine point cloud area and the coarse point cloud area respectively and added to the respective point clouds, and the enhanced fine point cloud area and coarse point cloud area are merged to obtain a high-precision point cloud. View information is thereby introduced to improve the precision of point cloud completion and to achieve view-based cross-modal completion of the three-dimensional point cloud.
Optionally, in an embodiment of the present application, acquiring incomplete point clouds of the target object in different scenes includes:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, wherein P represents the three-dimensional structure of the target object, and N represents the number of points in the point cloud.
Optionally, in an embodiment of the present application, the performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete contour includes:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
Optionally, in an embodiment of the present application, the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfer distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest-neighbor distance and a threshold, comprises:
dividing the coarse point cloud into two parts, calculating the average nearest-neighbor distance between the two parts with formula (1) and taking it as the threshold; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud with formula (1), wherein points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
Optionally, in an embodiment of the present application, the predicting displacement vectors with different lengths for the fine point cloud area and the coarse point cloud area by using a dynamic displacement prediction method, respectively, adding the displacement vectors to different point clouds, and obtaining an enhanced fine point cloud area and an enhanced coarse point cloud area includes:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
In order to achieve the above object, a second aspect of the present application provides an image-based point cloud completion system, including:
the first acquisition module is used for acquiring a single true color RGB image of a target object;
the acquisition module is used for acquiring incomplete point clouds of the target object in different scenes;
the reconstruction module is used for performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete outline;
the second acquisition module is used for aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to acquire a coarse point cloud;
the third acquisition module is used for calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold;
the fourth acquisition module is used for predicting displacement vectors with different lengths by using a dynamic displacement prediction method respectively aiming at the fine point cloud area and the rough point cloud area, and adding the displacement vectors with different point clouds respectively to acquire the reinforced fine point cloud area and the reinforced rough point cloud area;
and the merging module is used for merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
According to the image-based point cloud completion system of the embodiments of the present application, a single true-color RGB image of the target object and incomplete point clouds in different scenes are acquired, and point cloud reconstruction is performed on the RGB image to obtain a sparse point cloud; the sparse point cloud and the incomplete point cloud are unified to the same viewing angle, merged, and sampled by farthest point sampling to obtain a coarse point cloud; the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud is calculated, and a fine point cloud area and a coarse point cloud area are obtained according to the nearest-neighbor distance and a threshold; displacement vectors of different lengths are predicted for the fine point cloud area and the coarse point cloud area respectively and added to the respective point clouds, and the enhanced fine point cloud area and coarse point cloud area are merged to obtain a high-precision point cloud. View information is thereby introduced to improve the precision of point cloud completion and to achieve view-based cross-modal completion of the three-dimensional point cloud.
Optionally, in an embodiment of the present application, the acquisition module is specifically configured to:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the point cloud as a whole can be represented as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, wherein P represents the three-dimensional structure of the target object, and N represents the number of points in the point cloud.
Optionally, in an embodiment of the present application, the reconstruction module is specifically configured to:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
Optionally, in an embodiment of the present application, the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module is specifically configured to:
dividing the coarse point cloud into two parts, calculating the average nearest-neighbor distance between the two parts with formula (1) and taking it as the threshold; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud with formula (1), wherein points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
Optionally, in an embodiment of the application, the fourth obtaining module is specifically configured to:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a point cloud completion method based on an image according to an embodiment of the present disclosure;
FIG. 2 is an exemplary diagram of incomplete point clouds in two scenarios according to the embodiment of the present application;
FIG. 3 is an exemplary diagram of a neural network establishing a mapping relationship between a view and a point cloud according to an embodiment of the present application;
FIG. 4 is a diagram illustrating an example of a structure of a dynamic displacement predictor according to an embodiment of the present disclosure;
FIG. 5 is an exemplary diagram of incomplete point cloud and complete point cloud data according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an image-based point cloud completion system according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The image-based point cloud completion method and system according to the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a point cloud completion method based on an image according to an embodiment of the present disclosure.
In particular, RGB image data is easy to acquire, places low demands on equipment, and provides rich view information. Thanks to the rapid development of deep learning, the complete structure of a three-dimensional object can be roughly estimated from a single RGB image. The method therefore uses the rich information of the RGB image as support and completes the incomplete point cloud data acquired by the depth camera through a cross-modal fusion technique.
As shown in fig. 1, the image-based point cloud completion method can implement incomplete point cloud completion tasks in a plurality of different scenes, and includes the following steps:
101, acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline.
In the embodiment of the present application, a single RGB image may be directly acquired using a conventional RGB camera.
In the embodiment of the application, a depth camera or a laser radar is used to shoot the target object in a self-occlusion scene to obtain an incomplete point cloud; and/or the target object is shot in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, where P represents the three-dimensional structure of the target object and N represents the number of points in the point cloud.
Specifically, the incomplete point cloud is acquired in different scenes such as self-occlusion and mutual occlusion, and can be captured with a depth camera or a laser radar. As shown in fig. 2, the incomplete point cloud data is acquired directly by shooting from a certain viewing angle with the depth camera or the laser radar. Each point is a triplet (x, y, z), and the point cloud as a whole can be represented as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, where P represents the three-dimensional structure of the corresponding object and N represents the number of points in the point cloud. The view-based three-dimensional point cloud cross-modal completion can complete incomplete point clouds in the two scenes shown in fig. 2.
In the embodiment of the application, an image encoder is constructed from a series of convolutions to extract the image features of the RGB image and encode them into a hidden space vector; a series of deconvolutions is then used as a point cloud decoder to recover the hidden space vector into a coarse reconstructed point cloud, obtaining the sparse point cloud with a complete outline.
Specifically, the point cloud reconstruction neural network reconstructs a sparse point cloud with a complete outline from the input view, i.e. the RGB image; that is, the neural network establishes a mapping between the view and the point cloud. The specific structure is shown in fig. 3: the input is a single view, the output is a sparse point cloud, and an encoder-decoder structure is used. In the encoding stage, an image encoder built from a series of convolutions extracts the image features of the input view and encodes the image into a 7 × 7 × 512 hidden space vector. In the decoding stage, a series of deconvolutions serves as the point cloud decoder to restore the hidden space vector to a coarse reconstructed point cloud. At the same time, the encoder records the convolutional feature maps of each layer for 2D guidance in the partition enhancement stage.
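For illustration only, a minimal PyTorch sketch of such an encoder-decoder is given below. Only the 7 × 7 × 512 latent size comes from the description above; the input resolution, layer counts, channel widths, and the number of output points are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class ViewToSparsePointCloud(nn.Module):
    """Single RGB view -> sparse point cloud (illustrative sketch, not the patented network)."""
    def __init__(self):
        super().__init__()
        # Encoder: five stride-2 convolutions take an assumed 224x224x3 image to a 7x7x512 latent.
        chans = [3, 32, 64, 128, 256, 512]
        enc = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            enc += [nn.Conv2d(cin, cout, kernel_size=3, stride=2, padding=1), nn.ReLU(inplace=True)]
        self.encoder = nn.Sequential(*enc)              # output: (B, 512, 7, 7)
        # Decoder: deconvolutions expand the latent into a coordinate grid read out as points.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 3, kernel_size=1),           # 3 channels = (x, y, z) per grid cell
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        feat = self.encoder(img)                        # hidden space vector, (B, 512, 7, 7)
        grid = self.decoder(feat)                       # (B, 3, 28, 28)
        return grid.flatten(2).transpose(1, 2)          # (B, 784, 3) sparse point cloud

# Example: pts = ViewToSparsePointCloud()(torch.randn(1, 3, 224, 224))  # -> (1, 784, 3)
```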
And 102, aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud.
Specifically, there is a viewing-angle deviation between the sparse point cloud and the input incomplete point cloud, so the two are aligned with the camera external parameter (extrinsic) matrix, i.e. they are unified to the same viewing angle.
Specifically, the sparse point cloud and the input incomplete point cloud are combined by a point-union operation. Because the combined point cloud suffers from problems such as uneven density, it is downsampled with farthest point sampling, which preserves density uniformity, to obtain the coarse point cloud.
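The two operations in this step can be sketched in NumPy as follows. This is a minimal illustration under assumptions: the extrinsic parameters are given as a rotation R and translation t, the transform direction matches the convention in use, and the 2048-point sample size is arbitrary.

```python
import numpy as np

def align_with_extrinsics(cloud: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Rigidly transform an (N, 3) cloud with rotation R (3x3) and translation t (3,).
    Whether R, t or their inverse is needed depends on the extrinsic convention."""
    return cloud @ R.T + t

def farthest_point_sampling(points: np.ndarray, n_samples: int) -> np.ndarray:
    """Return indices of n_samples points chosen so each new point is farthest
    from the already-selected set, which keeps the sampled density even."""
    n = points.shape[0]
    chosen = np.empty(n_samples, dtype=np.int64)
    chosen[0] = np.random.randint(n)                     # random seed point
    min_sq_dist = np.full(n, np.inf)                     # squared distance to nearest chosen point
    for i in range(1, n_samples):
        diff = points - points[chosen[i - 1]]
        min_sq_dist = np.minimum(min_sq_dist, np.einsum("ij,ij->i", diff, diff))
        chosen[i] = int(np.argmax(min_sq_dist))          # farthest from the chosen set
    return chosen

# Merge the aligned sparse cloud with the incomplete cloud, then downsample to a coarse cloud:
# sparse_aligned = align_with_extrinsics(sparse, R, t)
# merged = np.concatenate([sparse_aligned, incomplete], axis=0)
# coarse = merged[farthest_point_sampling(merged, 2048)]   # 2048 points is an assumed size
```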
And 103, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value.
In the embodiment of the present application, the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
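Formula (1) itself appears in the source only as an image. For reference, the chamfer distance between two point sets P and Q is commonly written in the following standard form; whether the patent's formula (1) squares the distances or averages over the sets in exactly this way is an assumption here:

$$
d_{CD}(P, Q) = \frac{1}{|P|} \sum_{p \in P} \min_{q \in Q} \lVert p - q \rVert_2^2 + \frac{1}{|Q|} \sum_{q \in Q} \min_{p \in P} \lVert q - p \rVert_2^2
$$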
Calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfer distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest-neighbor distance and a threshold, comprises: dividing the coarse point cloud into two parts, calculating the average nearest-neighbor distance between the two parts with formula (1) and taking it as the threshold; then calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud with formula (1), wherein the points whose nearest-neighbor distance is smaller than the threshold form the fine point cloud, thereby obtaining the fine point cloud area and the coarse point cloud area.
Specifically, the coarse point cloud is randomly divided into two parts, the distances between the closest points of the two parts are calculated with the chamfer distance, and their average value is taken as the threshold T. The nearest-point distance between the coarse point cloud and the input incomplete point cloud is then calculated with the chamfer distance, and the points whose distance is less than the threshold T are determined to be the fine point cloud.
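A sketch of this partition step is shown below. The use of SciPy's KD-tree for the nearest-neighbor queries is an implementation choice, not taken from the patent, and whether the distances are squared is likewise an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def split_fine_coarse(coarse: np.ndarray, incomplete: np.ndarray):
    """Partition the (N, 3) coarse cloud into a fine region and a coarse region."""
    # Threshold T: mean nearest-neighbor distance between two random halves of the coarse cloud.
    perm = np.random.permutation(len(coarse))
    half_a = coarse[perm[: len(coarse) // 2]]
    half_b = coarse[perm[len(coarse) // 2:]]
    d_ab = cKDTree(half_b).query(half_a)[0]              # closest-point distances A -> B
    d_ba = cKDTree(half_a).query(half_b)[0]              # closest-point distances B -> A
    threshold = np.concatenate([d_ab, d_ba]).mean()
    # Nearest-neighbor distance from each coarse point to the input incomplete cloud.
    d = cKDTree(incomplete).query(coarse)[0]
    fine_mask = d < threshold
    return coarse[fine_mask], coarse[~fine_mask]          # fine area, coarse area
```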
And 104, predicting displacement vectors of different lengths for the fine point cloud area and the coarse point cloud area respectively by using a dynamic displacement prediction method, and adding the displacement vectors to the respective point clouds to obtain the enhanced fine point cloud area and the enhanced coarse point cloud area.
And 105, merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
In the embodiment of the application, the point cloud three-dimensional feature, the view two-dimensional feature, and the coarse point cloud global feature are obtained and fused to obtain a fusion feature; a dynamic displacement predictor is constructed for the fine point cloud area to predict a small-scale displacement vector from the input fusion feature, and the small-scale displacement vector is added to the spatial coordinates of the fine point cloud area to obtain the enhanced fine point cloud; for the coarse point cloud area, the dynamic displacement predictor predicts a large-scale displacement vector from the input fusion feature, which is added to the spatial coordinates of the coarse point cloud area to obtain the enhanced coarse point cloud.
Specifically, the dynamic displacement prediction method predicts displacement vectors of different lengths for the fine point cloud area and the coarse point cloud area, and the displacement vectors are added to the respective point clouds so that each area is strengthened in a targeted manner.
Specifically, the point cloud 3D features, the view 2D features, and the coarse point cloud global feature are fused to obtain the fusion features. A dynamic displacement predictor is constructed; for the fine point cloud area it predicts a small-scale displacement vector from the input fusion features, which is added to the spatial coordinates of the fine point cloud area to obtain an accurate point cloud A; for the coarse point cloud area it predicts a large-scale displacement vector from the input fusion features, which is added to the spatial coordinates of the coarse point cloud area to obtain an accurate point cloud B.
The structure of the dynamic displacement predictor is shown in fig. 4.
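For illustration only, a minimal PyTorch sketch of such a per-point displacement predictor is given below. The MLP depth, the tanh bound, and the max_offset scaling used to realize the small-scale versus large-scale displacements are assumptions; fig. 4 defines the actual structure.

```python
import torch
import torch.nn as nn

class DynamicDisplacementPredictor(nn.Module):
    """Maps a fused per-point feature (3D + 2D view + global) to a bounded 3D offset
    that is added to the point coordinates (illustrative sketch)."""
    def __init__(self, feat_dim: int, max_offset: float):
        super().__init__()
        self.max_offset = max_offset                    # small for the fine area, large for the coarse area
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 3),
        )

    def forward(self, coords: torch.Tensor, fused_feat: torch.Tensor) -> torch.Tensor:
        # coords: (B, N, 3); fused_feat: (B, N, feat_dim)
        offset = torch.tanh(self.mlp(fused_feat)) * self.max_offset
        return coords + offset                          # refined point coordinates

# Possible usage (offset scales are assumed values):
# cloud_a = DynamicDisplacementPredictor(feat_dim, max_offset=0.01)(fine_pts, fine_feat)
# cloud_b = DynamicDisplacementPredictor(feat_dim, max_offset=0.10)(coarse_pts, coarse_feat)
```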
Finally, the obtained accurate point clouds A and B are combined to obtain the final high-precision point cloud.
For example, as shown in fig. 5, a single view is acquired using an RGB camera; an incomplete point cloud is acquired using a depth camera; a sparse point cloud is reconstructed from the single view; the sparse point cloud and the input incomplete point cloud are aligned using the camera parameters; the sparse point cloud and the input incomplete point cloud are combined to obtain a coarse point cloud; the fine point cloud area and the coarse point cloud area are separated using the chamfer distance; the fine point cloud area and the coarse point cloud area are strengthened in a targeted, partitioned manner using the dynamic displacement prediction method; and the strengthened fine point cloud area and coarse point cloud area are merged to obtain the final high-precision point cloud. On the basis of the traditional single-modality point cloud completion technique, the method introduces view information to improve the precision of the completed point cloud and achieves view-based cross-modal completion of the three-dimensional point cloud.
According to the image-based point cloud completion method of the embodiments of the present application, a single true-color RGB image of the target object and incomplete point clouds in different scenes are acquired, and point cloud reconstruction is performed on the RGB image to obtain a sparse point cloud; the sparse point cloud and the incomplete point cloud are unified to the same viewing angle, merged, and sampled by farthest point sampling to obtain a coarse point cloud; the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud is calculated, and a fine point cloud area and a coarse point cloud area are obtained according to the nearest-neighbor distance and a threshold; displacement vectors of different lengths are predicted for the fine point cloud area and the coarse point cloud area respectively and added to the respective point clouds, and the enhanced fine point cloud area and coarse point cloud area are merged to obtain a high-precision point cloud. View information is thereby introduced to improve the precision of point cloud completion and to achieve view-based cross-modal completion of the three-dimensional point cloud.
In order to implement the above embodiment, the present application further provides an image-based point cloud completion system.
Fig. 6 is a schematic structural diagram of an image-based point cloud completion system according to an embodiment of the present disclosure.
As shown in fig. 6, the image-based point cloud completion system includes: a first acquisition module 610, an acquisition module 620, a reconstruction module 630, a second acquisition module 640, a third acquisition module 650, a fourth acquisition module 660, and a merging module 670.
The first acquiring module 610 is configured to acquire a single true color RGB image of a target object.
And the acquisition module 620 is used for acquiring incomplete point clouds of the target object in different scenes.
A reconstruction module 630, configured to perform point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network, to obtain a sparse point cloud with a complete contour.
A second obtaining module 640, configured to align the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to be unified to a same viewing angle, combine the sparse point cloud and the incomplete point cloud, and perform sampling by using a farthest point sampling method, so as to obtain a coarse point cloud.
A third obtaining module 650, configured to calculate a nearest neighboring point distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and obtain a fine point cloud area and a coarse point cloud area according to the nearest neighboring point distance and a threshold.
A fourth obtaining module 660, configured to predict displacement vectors with different lengths for the fine point cloud area and the coarse point cloud area by using a dynamic displacement prediction method, add the displacement vectors to different point clouds, and obtain an enhanced fine point cloud area and a reinforced coarse point cloud area.
And a merging module 670, configured to merge the strengthened fine point cloud area and the rough point cloud area to obtain a high-precision point cloud.
In this embodiment of the application, the acquisition module 620 is specifically configured to: shoot the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or shoot the target object in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, where P represents the three-dimensional structure of the target object and N represents the number of points in the point cloud.
In this embodiment of the application, the reconstruction module 630 is specifically configured to: using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors; and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
In the embodiment of the present application, the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module 650 is specifically configured to: dividing the coarse point cloud into two parts, calculating the average value of the nearest neighbor distance between the two parts of the point cloud by using a formula (1) as the threshold, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the formula (1), wherein the distance between the nearest neighbors is smaller than the threshold, namely the distance between the nearest neighbors is the fine point cloud, and acquiring the fine point cloud area and the coarse point cloud area.
In the embodiment of the present application, the fourth obtaining module 660 is specifically configured to: obtain a point cloud three-dimensional feature, a view two-dimensional feature, and a coarse point cloud global feature and fuse them to obtain a fusion feature; construct a dynamic displacement predictor for the fine point cloud area, predict a small-scale displacement vector from the input fusion features, and add the small-scale displacement vector to the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area; and have the dynamic displacement predictor predict a large-scale displacement vector for the coarse point cloud area from the input fusion features and add it to the spatial coordinates of the coarse point cloud area to obtain the strengthened coarse point cloud area.
According to the image-based point cloud completion system of the embodiments of the present application, a single true-color RGB image of the target object and incomplete point clouds in different scenes are acquired, and point cloud reconstruction is performed on the RGB image to obtain a sparse point cloud; the sparse point cloud and the incomplete point cloud are unified to the same viewing angle, merged, and sampled by farthest point sampling to obtain a coarse point cloud; the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud is calculated, and a fine point cloud area and a coarse point cloud area are obtained according to the nearest-neighbor distance and a threshold; displacement vectors of different lengths are predicted for the fine point cloud area and the coarse point cloud area respectively and added to the respective point clouds, and the enhanced fine point cloud area and coarse point cloud area are merged to obtain a high-precision point cloud. View information is thereby introduced to improve the precision of point cloud completion and to achieve view-based cross-modal completion of the three-dimensional point cloud.
It should be noted that the foregoing explanation of the embodiment of the image-based point cloud completion method is also applicable to the image-based point cloud completion system of the embodiment, and details are not repeated here.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
Claims (8)
1. An image-based point cloud completion method is characterized by comprising the following steps:
acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline;
aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud;
calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using a chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value;
respectively predicting displacement vectors with different lengths by using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain an enhanced fine point cloud area and an enhanced rough point cloud area;
merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud;
the method for predicting the displacement vectors with different lengths by respectively using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain the reinforced fine point cloud area and the reinforced rough point cloud area comprises the following steps:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector with the spatial coordinates of the fine point cloud area, and acquiring an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
2. The method of claim 1, wherein collecting a cloud of incomplete points of the target object under different scenes comprises:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the point cloud as a whole is expressed as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, wherein P represents the three-dimensional structure of the target object, and N represents the number of points in the point cloud.
3. The method of claim 1, wherein the point cloud reconstructing the RGB image using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete contour comprises:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
4. The method of claim 1, wherein the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the method comprises the following steps of calculating the nearest neighbor distance between the rough point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a rough point cloud area according to the nearest neighbor distance and a threshold value, wherein the method comprises the following steps:
dividing the coarse point cloud into two parts, calculating the average nearest-neighbor distance between the two parts with formula (1) and taking it as the threshold; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud with formula (1), wherein points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
5. An image-based point cloud completion system, comprising:
the first acquisition module is used for acquiring a single true color RGB image of a target object;
the acquisition module is used for acquiring incomplete point clouds of the target object in different scenes;
the reconstruction module is used for performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete outline;
the second acquisition module is used for aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to acquire a coarse point cloud;
the third acquisition module is used for calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold;
the fourth acquisition module is used for predicting displacement vectors with different lengths by using a dynamic displacement prediction method respectively aiming at the fine point cloud area and the rough point cloud area, and adding the displacement vectors with different point clouds respectively to acquire the reinforced fine point cloud area and the reinforced rough point cloud area;
the merging module is used for merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud;
the fourth obtaining module is specifically configured to:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
6. The system of claim 5, wherein the acquisition module is specifically configured to:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in a scene with both self-occlusion and mutual occlusion by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the point cloud as a whole is expressed as $P=\{(x_i, y_i, z_i)\}_{i=1}^{N}$, wherein P represents the three-dimensional structure of the target object, and N represents the number of points in the point cloud.
7. The system of claim 5, wherein the reconstruction module is specifically configured to:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
8. The system of claim 5, wherein the chamfer distance is defined by formula (1), wherein P and Q are two point sets in three-dimensional space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module is specifically configured to:
dividing the coarse point cloud into two parts, calculating the average nearest-neighbor distance between the two parts with formula (1) and taking it as the threshold; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud with formula (1), wherein points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110204647.6A CN113160068B (en) | 2021-02-23 | 2021-02-23 | Point cloud completion method and system based on image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110204647.6A CN113160068B (en) | 2021-02-23 | 2021-02-23 | Point cloud completion method and system based on image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113160068A CN113160068A (en) | 2021-07-23 |
CN113160068B (en) | 2022-08-05
Family
ID=76883873
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110204647.6A Active CN113160068B (en) | 2021-02-23 | 2021-02-23 | Point cloud completion method and system based on image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113160068B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113486988B (en) * | 2021-08-04 | 2022-02-15 | 广东工业大学 | Point cloud completion device and method based on adaptive self-attention transformation network |
CN113808224A (en) * | 2021-08-18 | 2021-12-17 | 南京航空航天大学 | Point cloud geometric compression method based on block division and deep learning |
CN113409227B (en) * | 2021-08-19 | 2021-11-30 | 深圳市信润富联数字科技有限公司 | Point cloud picture repairing method and device, electronic equipment and storage medium |
CN113808261B (en) * | 2021-09-30 | 2022-10-21 | 大连理工大学 | Panorama-based self-supervised learning scene point cloud completion data set generation method |
CN114627351B (en) * | 2022-02-18 | 2023-05-16 | 电子科技大学 | Fusion depth estimation method based on vision and millimeter wave radar |
CN115496881B (en) * | 2022-10-19 | 2023-09-22 | 南京航空航天大学深圳研究院 | Monocular image-assisted point cloud complement method for large aircraft |
CN118505909B (en) * | 2024-07-17 | 2024-10-11 | 浙江大学 | Map-assisted incomplete cloud completion method and system |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017023210A1 (en) * | 2015-08-06 | 2017-02-09 | Heptagon Micro Optics Pte. Ltd. | Generating a merged, fused three-dimensional point cloud based on captured images of a scene |
CN109613557A (en) * | 2018-11-28 | 2019-04-12 | 南京莱斯信息技术股份有限公司 | A kind of system and method for completion laser radar three-dimensional point cloud target |
CN112102472A (en) * | 2020-09-01 | 2020-12-18 | 北京航空航天大学 | Sparse three-dimensional point cloud densification method |
Non-Patent Citations (2)
Title |
---|
Deep Neural Network for 3D Point Cloud Completion with Multistage Loss Function;Haohao Huang 等;《2019 Chinese Control And Decision Conference (CCDC)》;20190912;全文 * |
Point Cloud Completion Network Based on a Multi-Branch Structure; Luo Kaiqian et al.; Laser & Optoelectronics Progress; 2020-12-31; full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |