CN113160068B - Point cloud completion method and system based on image - Google Patents


Info

Publication number
CN113160068B
Authority
CN
China
Prior art keywords
point cloud
area
incomplete
fine
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110204647.6A
Other languages
Chinese (zh)
Other versions
CN113160068A (en)
Inventor
赵曦滨
张轩诚
高跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202110204647.6A priority Critical patent/CN113160068B/en
Publication of CN113160068A publication Critical patent/CN113160068A/en
Application granted granted Critical
Publication of CN113160068B publication Critical patent/CN113160068B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10004 - Still image; Photographic image
    • G06T2207/10024 - Color image
    • G06T2207/10028 - Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Image Processing (AREA)

Abstract

The application provides an image-based point cloud completion method and system, relating to the technical fields of computer vision and computational photography. The method comprises the following steps: acquiring a single true-color RGB image of a target object and incomplete point clouds in different scenes, and performing point cloud reconstruction on the RGB image to obtain a sparse point cloud; unifying the sparse point cloud and the incomplete point cloud to the same viewing angle, merging them, and sampling by farthest point sampling to obtain a coarse point cloud; calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud, and obtaining a fine point cloud area and a coarse point cloud area according to the nearest-neighbor distance and a threshold value; predicting displacement vectors of different lengths for the fine point cloud area and the coarse point cloud area respectively, adding the displacement vectors to the respective point clouds, and merging the strengthened fine point cloud area and coarse point cloud area to obtain a high-precision point cloud. In this way, view information is introduced to improve the precision of point cloud completion and achieve view-based cross-modal completion of the three-dimensional point cloud.

Description

Point cloud completion method and system based on image
Technical Field
The application relates to the technical field of computer vision and computational photography, in particular to a point cloud completion method and system based on an image.
Background
Generally, stereoscopic vision has important application value in fields such as augmented reality (AR), virtual reality (VR), film production and industrial flaw detection, where accurate description and complete presentation of three-dimensional objects are essential. The three-dimensional point cloud is a common representation of a three-dimensional object that accurately describes the coordinates of each point in space. Each point is a triplet (x, y, z) describing a position in space, and a point cloud consisting of a series of such points describes a three-dimensional object within that space. Because of its simple structure and ease of processing, the point cloud has become the main acquisition format of depth cameras. A depth camera senses depth information in three-dimensional space by means of elements such as structured light or time-of-flight sensors and stores it in point cloud form. However, because the acquisition viewing angle is limited, the object may be occluded, or the captured object may be in motion, the depth camera usually cannot acquire the complete three-dimensional structure of an object at one time, and the acquired point cloud has defects of varying degrees. An incomplete point cloud cannot effectively and completely describe the structure of the three-dimensional object, has low practical value and is difficult to apply effectively in production scenarios. Completing and enhancing incomplete point clouds is therefore an urgent technical need in the industry.
Disclosure of Invention
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present application is to provide an image-based point cloud completion method, which introduces view information to improve the precision of point cloud completion and achieve view-based cross-modal completion of three-dimensional point cloud.
A second objective of the present application is to provide an image-based point cloud completion system.
In order to achieve the above object, an embodiment of a first aspect of the present application provides an image-based point cloud completion method, including:
acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline;
aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud;
calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using a chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value;
respectively predicting displacement vectors with different lengths by using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain an enhanced fine point cloud area and an enhanced rough point cloud area;
and merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
According to the point cloud completion method based on the image, the point cloud reconstruction is carried out on a single true color RGB image of a target object and incomplete point clouds under different scenes to obtain sparse point clouds; unifying the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud; calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value; and respectively predicting displacement vectors with different lengths for the fine point cloud area and the rough point cloud area, respectively adding the displacement vectors with different point clouds, and merging the reinforced fine point cloud area and the rough point cloud area to obtain the high-precision point cloud. Therefore, view information is introduced to improve the precision of the point cloud completion and achieve cross-mode completion of the three-dimensional point cloud based on the view.
Optionally, in an embodiment of the present application, acquiring incomplete point clouds of the target object in different scenes includes:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
Optionally, in an embodiment of the present application, the performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete contour includes:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
Optionally, in an embodiment of the present application, the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the method comprises the following steps of calculating the nearest neighbor distance between the rough point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a rough point cloud area according to the nearest neighbor distance and a threshold value, wherein the method comprises the following steps:
dividing the coarse point cloud into two parts, calculating the average value of the nearest neighbor distance between the two parts of the point cloud by using a formula (1) as the threshold, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the formula (1), wherein the distance between the nearest neighbors is smaller than the threshold, namely the distance between the nearest neighbors is the fine point cloud, and acquiring the fine point cloud area and the coarse point cloud area.
Optionally, in an embodiment of the present application, the predicting displacement vectors with different lengths for the fine point cloud area and the coarse point cloud area by using a dynamic displacement prediction method, respectively, adding the displacement vectors to different point clouds, and obtaining an enhanced fine point cloud area and an enhanced coarse point cloud area includes:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
In order to achieve the above object, a second aspect of the present application provides an image-based point cloud completion system, including:
the first acquisition module is used for acquiring a single true color RGB image of a target object;
the acquisition module is used for acquiring incomplete point clouds of the target object in different scenes;
the reconstruction module is used for performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete outline;
the second acquisition module is used for aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to acquire a coarse point cloud;
the third acquisition module is used for calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold;
the fourth acquisition module is used for predicting displacement vectors with different lengths by using a dynamic displacement prediction method respectively aiming at the fine point cloud area and the rough point cloud area, and adding the displacement vectors with different point clouds respectively to acquire the reinforced fine point cloud area and the reinforced rough point cloud area;
and the merging module is used for merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
According to the point cloud completion system based on the image, the point cloud reconstruction is carried out on the RGB image to obtain the sparse point cloud by obtaining a single true color RGB image of the target object and incomplete point clouds under different scenes; unifying the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud; calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value; and respectively predicting displacement vectors with different lengths for the fine point cloud area and the rough point cloud area, respectively adding the displacement vectors with different point clouds, and merging the reinforced fine point cloud area and the rough point cloud area to obtain the high-precision point cloud. Therefore, view information is introduced to improve the precision of the point cloud completion and achieve cross-mode completion of the three-dimensional point cloud based on the view.
Optionally, in an embodiment of the present application, the acquisition module is specifically configured to:
shoot the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shoot the target object in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
Optionally, in an embodiment of the present application, the reconstruction module is specifically configured to:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
Optionally, in an embodiment of the present application, the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module is specifically configured to:
dividing the coarse point cloud randomly into two parts, calculating the average nearest-neighbor distance between the two parts by using formula (1) and taking it as the threshold, then calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud by using formula (1); the points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
Optionally, in an embodiment of the application, the fourth obtaining module is specifically configured to:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a point cloud completion method based on an image according to an embodiment of the present disclosure;
FIG. 2 is an exemplary diagram of incomplete point clouds in two scenarios according to the embodiment of the present application;
FIG. 3 is an exemplary diagram of a neural network establishing a mapping relationship between a view and a point cloud according to an embodiment of the present application;
FIG. 4 is a diagram illustrating an example of a structure of a dynamic displacement predictor according to an embodiment of the present disclosure;
FIG. 5 is an exemplary diagram of incomplete point cloud and complete point cloud data according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of an image-based point cloud completion system according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The image-based point cloud completion method and system according to the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a point cloud completion method based on an image according to an embodiment of the present disclosure.
In particular, RGB image data is easy to acquire, has low equipment requirements, and provides rich view information. Thanks to the rapid development of deep learning technology, the complete structure of a three-dimensional object can be roughly estimated from a single RGB image. The method therefore uses the rich data of the RGB image as support and completes the incomplete point cloud data acquired by the depth camera through a cross-modal fusion technique.
As shown in fig. 1, the image-based point cloud completion method can implement incomplete point cloud completion tasks in a plurality of different scenes, and includes the following steps:
101, acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline.
In the embodiment of the present application, a single RGB image may be directly acquired using a conventional RGB camera.
In the embodiment of the application, a depth camera or a laser radar is used to shoot the target object in a self-occlusion scene to obtain an incomplete point cloud; and/or the target object is shot in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
Specifically, incomplete point clouds are acquired in different scenes such as self-occlusion and mutual occlusion, and can be captured with a depth camera or a laser radar. As shown in fig. 2, the incomplete point cloud data is acquired directly by shooting from a certain viewing angle with the depth camera or the laser radar. Each point is a triplet (x, y, z), and the point cloud as a whole can be represented as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the corresponding object and N represents the number of points of the point cloud. The view-based cross-modal completion of the three-dimensional point cloud can complete incomplete point clouds in these two scenes.
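To make this representation concrete, the following short NumPy sketch stores such an incomplete point cloud as an N × 3 array. The point count and the random coordinates are hypothetical placeholders for a partial scan, not data from the application.

```python
import numpy as np

# A point cloud P with N points; each row is one triplet (x, y, z).
# The values are hypothetical placeholders for a partial scan captured
# by a depth camera or a laser radar from a single viewing angle.
N = 2048
incomplete_points = np.random.default_rng(0).uniform(-1.0, 1.0, size=(N, 3))

# P describes the (partial) three-dimensional structure of the object,
# and N is simply the number of points it contains.
assert incomplete_points.shape == (N, 3)
print("point count:", incomplete_points.shape[0])
print("first point (x, y, z):", incomplete_points[0])
```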
In the embodiment of the application, an image encoder is constructed by using a series of convolutions, the image features of an RGB image are extracted, and the image features are encoded into hidden space vectors; and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder to obtain the sparse point cloud with a complete outline.
Specifically, a point cloud reconstruction neural network is used to reconstruct a sparse point cloud with a complete outline from the input view, namely the RGB image; that is, the neural network establishes a mapping relation between the view and the point cloud. The specific structure is shown in fig. 3: the input is a single view, the output is a sparse point cloud, and an encoder-decoder structure is used. In the encoding stage, an image encoder is constructed from a series of convolutions that extract the image features of the input view and encode the image into a 7 × 7 × 512 hidden space vector. In the decoding stage, a series of deconvolutions is used as a point cloud decoder to restore the hidden space vector to a coarse reconstructed point cloud. At the same time, the encoder records the convolutional feature maps of each layer for 2D guidance in the partition enhancement stage.
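The text only specifies "a series of convolutions", a 7 × 7 × 512 hidden space vector and "a series of deconvolutions", so the PyTorch sketch below is an illustrative stand-in rather than the network of fig. 3: the layer counts, channel widths, the 224 × 224 input size and the 3136-point output are all assumptions.

```python
import torch
import torch.nn as nn

class ImageToSparsePointCloud(nn.Module):
    """Illustrative encoder-decoder: single RGB view -> sparse point cloud."""
    def __init__(self):
        super().__init__()
        chans = [3, 32, 64, 128, 256, 512]          # assumed channel widths
        layers = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            layers += [nn.Conv2d(cin, cout, 3, stride=2, padding=1), nn.ReLU(inplace=True)]
        self.encoder = nn.ModuleList(layers)        # 224x224x3 -> 7x7x512 hidden vector
        self.decoder = nn.Sequential(               # "a series of deconvolutions"
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 3, 4, stride=2, padding=1),   # -> 3x56x56
        )

    def forward(self, image):                       # image: (B, 3, 224, 224)
        feature_maps = []                           # kept for 2D guidance later on
        x = image
        for layer in self.encoder:
            x = layer(x)
            if isinstance(layer, nn.ReLU):
                feature_maps.append(x)
        coarse = self.decoder(x)                    # (B, 3, 56, 56)
        points = coarse.flatten(2).transpose(1, 2)  # (B, 3136, 3) sparse point cloud
        return points, feature_maps

sparse_pc, feats = ImageToSparsePointCloud()(torch.randn(1, 3, 224, 224))
print(sparse_pc.shape)                              # torch.Size([1, 3136, 3])
```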
And 102, aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud.
Specifically, there is a viewing-angle deviation between the sparse point cloud and the input incomplete point cloud, so the two are aligned by using the camera external parameter matrix, that is, they are unified to the same viewing angle.
Specifically, the sparse point cloud and the input incomplete point cloud are combined by a point union, and Farthest Point Sampling is used to obtain the coarse point cloud. In other words, after the sparse point cloud and the input incomplete point cloud are merged, the combined point cloud suffers from problems such as uneven density, so Farthest Point Sampling is used to downsample the point cloud while keeping its density uniform, yielding the coarse point cloud.
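A minimal NumPy sketch of this align-merge-resample step follows. The identity extrinsic matrix, the point counts and the simple greedy farthest-point-sampling loop are generic illustrations under assumed conventions (row-vector homogeneous coordinates); they do not reproduce the application's exact procedure.

```python
import numpy as np

def align_with_extrinsics(points, extrinsic):
    """Map an (N, 3) point cloud into the target view frame with a 4x4 extrinsic matrix."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])      # homogeneous coordinates
    return (homo @ extrinsic.T)[:, :3]

def farthest_point_sampling(points, n_samples):
    """Greedy FPS: repeatedly keep the point farthest from all points chosen so far."""
    chosen = [0]                                                   # start from an arbitrary point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dist))
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

rng = np.random.default_rng(0)
sparse_pc = rng.uniform(-1, 1, (1024, 3))       # reconstructed from the RGB view
incomplete_pc = rng.uniform(-1, 1, (2048, 3))   # captured by the depth camera
extrinsic = np.eye(4)                           # placeholder camera external parameter matrix

aligned_sparse = align_with_extrinsics(sparse_pc, extrinsic)
merged = np.vstack([aligned_sparse, incomplete_pc])   # point union of the two clouds
coarse_pc = farthest_point_sampling(merged, 2048)     # evenly resampled coarse point cloud
print(coarse_pc.shape)                                # (2048, 3)
```

Farthest point sampling is used here because uniform random downsampling would keep the density imbalance of the merged cloud, whereas greedily picking the farthest remaining point spreads the kept points evenly over the object.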
And 103, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value.
In the embodiment of the present application, the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
Using the chamfer distance to calculate the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud, and acquiring the fine point cloud area and the coarse point cloud area according to the nearest-neighbor distance and the threshold, comprises the following steps: dividing the coarse point cloud randomly into two parts, calculating the average nearest-neighbor distance between the two parts by using formula (1) and taking it as the threshold, then calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud by using formula (1); the points whose nearest-neighbor distance is smaller than the threshold form the fine point cloud, thereby obtaining the fine point cloud area and the coarse point cloud area.
Specifically, the coarse point cloud is randomly divided into two parts, the distance between the closest points of the two parts is calculated using the chamfer distance, and the average value of these distances is taken as the threshold T. The nearest-point distance between the coarse point cloud and the input incomplete point cloud is then calculated using the chamfer distance, and the points whose distance is less than the threshold T are determined to be the fine point cloud; the remaining points form the coarse point cloud area.
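The following NumPy sketch illustrates this partitioning step with a brute-force nearest-neighbor search consistent with formula (1). The point counts are placeholders and the implementation is deliberately simple; it is not the implementation used in the application.

```python
import numpy as np

def nearest_neighbor_dist(P, Q):
    """For every point in P, the distance to its nearest neighbour in Q."""
    diff = P[:, None, :] - Q[None, :, :]                 # (|P|, |Q|, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(-1)).min(axis=1)

def chamfer_distance(P, Q):
    """Symmetric chamfer distance of formula (1): averaged nearest-neighbour distances."""
    return nearest_neighbor_dist(P, Q).mean() + nearest_neighbor_dist(Q, P).mean()

rng = np.random.default_rng(0)
coarse_pc = rng.uniform(-1, 1, (1024, 3))
incomplete_pc = rng.uniform(-1, 1, (1024, 3))

# Threshold T: randomly split the coarse cloud into two halves and take the
# average nearest-neighbour distance between the halves.
perm = rng.permutation(len(coarse_pc))
half_a, half_b = coarse_pc[perm[:512]], coarse_pc[perm[512:]]
T = nearest_neighbor_dist(half_a, half_b).mean()

# Points of the coarse cloud that lie close to the input scan are "fine", the rest "coarse".
d = nearest_neighbor_dist(coarse_pc, incomplete_pc)
fine_area = coarse_pc[d < T]
coarse_area = coarse_pc[d >= T]
print(chamfer_distance(coarse_pc, incomplete_pc), len(fine_area), len(coarse_area))
```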
And 104, respectively predicting displacement vectors with different lengths by using a dynamic displacement prediction method for the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain the reinforced fine point cloud area and the reinforced rough point cloud area.
And 105, merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud.
In the embodiment of the application, the point cloud three-dimensional feature, the view two-dimensional feature and the coarse point cloud global feature are obtained and subjected to fusion processing, and the fusion feature is obtained; constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud; and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud.
Specifically, a dynamic displacement prediction method is used to predict displacement vectors of different lengths for the fine point cloud area and the coarse point cloud area, and the displacement vectors are added to the corresponding point clouds, so that the point clouds in different areas are strengthened in a targeted manner.
Specifically, the point cloud 3D features, the view 2D features and the coarse point cloud global features are fused to obtain fusion features. A dynamic displacement predictor is constructed; for the fine point cloud area it predicts a small-scale displacement vector from the input fusion features, which is added to the spatial coordinates of the fine point cloud area to obtain an accurate point cloud A; for the coarse point cloud area it predicts a large-scale displacement vector from the input fusion features, which is added to the spatial coordinates of the coarse point cloud area to obtain an accurate point cloud B.
The structure of the dynamic displacement predictor is shown in fig. 4.
And finally, combining the obtained accurate point clouds A and B to obtain the final high-precision point cloud.
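As a purely structural illustration (the dynamic displacement predictor of fig. 4 is not reproduced here), the sketch below fuses per-point 3D features with broadcast view 2D features and a global feature, predicts per-point displacement vectors at two assumed scales, adds them to the coordinates and merges the results. All feature dimensions, point counts and scale bounds are assumptions.

```python
import torch
import torch.nn as nn

class DisplacementPredictor(nn.Module):
    """Shared MLP mapping fused per-point features to bounded 3D displacement vectors."""
    def __init__(self, feat_dim, max_scale):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 3), nn.Tanh())
        self.max_scale = max_scale                    # bounds the displacement length

    def forward(self, coords, fused_feat):
        return coords + self.max_scale * self.mlp(fused_feat)

def fuse(point_feat, view_feat, global_feat):
    """Concatenate per-point 3D features with the broadcast 2D view and global features."""
    n = point_feat.shape[0]
    return torch.cat([point_feat, view_feat.expand(n, -1), global_feat.expand(n, -1)], dim=1)

# Hypothetical sizes: 64-d point features, 128-d view feature, 256-d global feature.
fine_coords, coarse_coords = torch.rand(1500, 3), torch.rand(548, 3)
fine_feat, coarse_feat = torch.rand(1500, 64), torch.rand(548, 64)
view_feat, global_feat = torch.rand(1, 128), torch.rand(1, 256)

fine_predictor = DisplacementPredictor(64 + 128 + 256, max_scale=0.05)    # small-scale moves
coarse_predictor = DisplacementPredictor(64 + 128 + 256, max_scale=0.5)   # large-scale moves

precise_a = fine_predictor(fine_coords, fuse(fine_feat, view_feat, global_feat))
precise_b = coarse_predictor(coarse_coords, fuse(coarse_feat, view_feat, global_feat))
high_precision_pc = torch.cat([precise_a, precise_b], dim=0)              # final merged cloud
print(high_precision_pc.shape)                                            # torch.Size([2048, 3])
```

Giving the fine area a small displacement bound and the coarse area a large one mirrors the idea that points already close to the input scan only need slight refinement, while points far from it may have to move substantially.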
For example, as shown in fig. 5: a single view is acquired using an RGB camera; an incomplete point cloud is acquired using a depth camera; a sparse point cloud is reconstructed from the single view; the sparse point cloud and the input incomplete point cloud are aligned using the camera parameters; the sparse point cloud and the input incomplete point cloud are combined to obtain a coarse point cloud; the fine point cloud area and the coarse point cloud area are filtered out using the chamfer distance; targeted partition strengthening is carried out on the fine point cloud area and the coarse point cloud area using the dynamic displacement prediction method; and the strengthened fine point cloud area and coarse point cloud area are merged to obtain the final high-precision point cloud. On the basis of the traditional single-modality point cloud completion technology, the method and the device introduce view information to improve the precision of the completed point cloud and achieve view-based cross-modal completion of the three-dimensional point cloud.
According to the point cloud completion method based on the image, the point cloud reconstruction is carried out on a single true color RGB image of a target object and incomplete point clouds under different scenes to obtain sparse point clouds; unifying the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud; calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value; and respectively predicting displacement vectors with different lengths for the fine point cloud area and the rough point cloud area, respectively adding the displacement vectors with different point clouds, and merging the reinforced fine point cloud area and the rough point cloud area to obtain the high-precision point cloud. Therefore, view information is introduced to improve the precision of the point cloud completion and achieve cross-mode completion of the three-dimensional point cloud based on the view.
In order to implement the above embodiment, the present application further provides an image-based point cloud completion system.
Fig. 6 is a schematic structural diagram of an image-based point cloud completion system according to an embodiment of the present disclosure.
As shown in fig. 6, the image-based point cloud complementing system includes: a first acquisition module 610, an acquisition module 620, a reconstruction module 630, a second acquisition module 640, a third acquisition module 650, a fourth acquisition module 660, and a merge module 670.
The first acquiring module 610 is configured to acquire a single true color RGB image of a target object.
And the acquisition module 620 is used for acquiring incomplete point clouds of the target object in different scenes.
A reconstruction module 630, configured to perform point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network, to obtain a sparse point cloud with a complete contour.
A second obtaining module 640, configured to align the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to be unified to a same viewing angle, combine the sparse point cloud and the incomplete point cloud, and perform sampling by using a farthest point sampling method, so as to obtain a coarse point cloud.
A third obtaining module 650, configured to calculate a nearest neighboring point distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and obtain a fine point cloud area and a coarse point cloud area according to the nearest neighboring point distance and a threshold.
A fourth obtaining module 660, configured to predict displacement vectors with different lengths for the fine point cloud area and the coarse point cloud area by using a dynamic displacement prediction method, add the displacement vectors to different point clouds, and obtain an enhanced fine point cloud area and a reinforced coarse point cloud area.
And a merging module 670, configured to merge the strengthened fine point cloud area and the rough point cloud area to obtain a high-precision point cloud.
In this embodiment of the application, the acquisition module 620 is specifically configured to: shoot the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or shoot the target object in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the whole point cloud can be represented as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
In this embodiment of the application, the reconstruction module 630 is specifically configured to: using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors; and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
In the embodiment of the present application, the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module 650 is specifically configured to: dividing the coarse point cloud into two parts, calculating the average value of the nearest neighbor distance between the two parts of the point cloud by using a formula (1) as the threshold, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the formula (1), wherein the distance between the nearest neighbors is smaller than the threshold, namely the distance between the nearest neighbors is the fine point cloud, and acquiring the fine point cloud area and the coarse point cloud area.
In the embodiment of the present application, the fourth obtaining module 660 is specifically configured to: obtain a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature and perform fusion processing to obtain a fusion feature; construct a dynamic displacement predictor for the fine point cloud area, predict a small-scale displacement vector from the input fusion features and add it to the spatial coordinates of the fine point cloud area to obtain the strengthened fine point cloud area; and, for the coarse point cloud area, have the dynamic displacement predictor predict a large-scale displacement vector from the input fusion features and add it to the spatial coordinates of the coarse point cloud area to obtain the strengthened coarse point cloud area.
According to the point cloud completion system based on the image, the point cloud reconstruction is carried out on the RGB image to obtain the sparse point cloud by obtaining a single true color RGB image of the target object and incomplete point clouds under different scenes; unifying the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud; calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value; and respectively predicting displacement vectors with different lengths for the fine point cloud area and the rough point cloud area, respectively adding the displacement vectors with different point clouds, and merging the reinforced fine point cloud area and the rough point cloud area to obtain the high-precision point cloud. Therefore, view information is introduced to improve the precision of the point cloud completion and achieve cross-mode completion of the three-dimensional point cloud based on the view.
It should be noted that the foregoing explanation of the embodiment of the image-based point cloud completion method is also applicable to the image-based point cloud completion system of the embodiment, and details are not repeated here.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are well known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (8)

1. An image-based point cloud completion method is characterized by comprising the following steps:
acquiring a single true color RGB image of a target object, acquiring incomplete point clouds of the target object in different scenes, and performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to acquire a sparse point cloud with a complete outline;
aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, merging the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to obtain a coarse point cloud;
calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using a chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold value;
respectively predicting displacement vectors with different lengths by using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain an enhanced fine point cloud area and an enhanced rough point cloud area;
merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud;
the method for predicting the displacement vectors with different lengths by respectively using a dynamic displacement prediction method aiming at the fine point cloud area and the rough point cloud area, and respectively adding the displacement vectors with different point clouds to obtain the reinforced fine point cloud area and the reinforced rough point cloud area comprises the following steps:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector with the spatial coordinates of the fine point cloud area, and acquiring an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
2. The method of claim 1, wherein collecting a cloud of incomplete points of the target object under different scenes comprises:
shooting the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shooting the target object in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the point cloud as a whole is expressed as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
3. The method of claim 1, wherein the point cloud reconstructing the RGB image using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete contour comprises:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
4. The method of claim 1, wherein the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the method comprises the following steps of calculating the nearest neighbor distance between the rough point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a rough point cloud area according to the nearest neighbor distance and a threshold value, wherein the method comprises the following steps:
dividing the coarse point cloud into two parts, calculating the average value of the nearest neighbor distance between the two parts of the point cloud by using a formula (1) as the threshold, calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the formula (1), wherein the nearest neighbor distance is smaller than the threshold and is a fine point cloud, and acquiring the fine point cloud area and the coarse point cloud area.
5. An image-based point cloud completion system, comprising:
the first acquisition module is used for acquiring a single true color RGB image of a target object;
the acquisition module is used for acquiring incomplete point clouds of the target object in different scenes;
the reconstruction module is used for performing point cloud reconstruction on the RGB image by using a point cloud reconstruction neural network to obtain a sparse point cloud with a complete outline;
the second acquisition module is used for aligning the sparse point cloud and the incomplete point cloud by using a camera external parameter matrix so as to unify the sparse point cloud and the incomplete point cloud to the same visual angle, combining the sparse point cloud and the incomplete point cloud, and sampling by using a farthest point sampling mode to acquire a coarse point cloud;
the third acquisition module is used for calculating the nearest neighbor distance between the coarse point cloud and the incomplete point cloud by using the chamfering distance, and acquiring a fine point cloud area and a coarse point cloud area according to the nearest neighbor distance and a threshold;
the fourth acquisition module is used for predicting displacement vectors with different lengths by using a dynamic displacement prediction method respectively aiming at the fine point cloud area and the rough point cloud area, and adding the displacement vectors with different point clouds respectively to acquire the reinforced fine point cloud area and the reinforced rough point cloud area;
the merging module is used for merging the strengthened fine point cloud area and the rough point cloud area to obtain high-precision point cloud;
the fourth obtaining module is specifically configured to:
acquiring a point cloud three-dimensional feature, a view two-dimensional feature and a coarse point cloud global feature, and performing fusion processing to acquire a fusion feature;
constructing a dynamic displacement predictor for the fine point cloud area, predicting a small-scale displacement vector according to the input fusion characteristics, adding the small-scale displacement vector and the spatial coordinates of the fine point cloud area to obtain an enhanced fine point cloud area;
and the dynamic displacement predictor predicts a large-scale displacement vector for the rough point cloud area according to the input fusion characteristics, and adds the large-scale displacement vector with the spatial coordinates of the rough point cloud area to obtain the strengthened rough point cloud area.
6. The system of claim 5, wherein the acquisition module is specifically configured to:
shoot the target object in a self-occlusion scene by using a depth camera or a laser radar to obtain an incomplete point cloud; and/or,
shoot the target object in self-occlusion and mutual-occlusion scenes by using a depth camera or a laser radar to obtain an incomplete point cloud; wherein each point is a triplet (x, y, z), and the point cloud as a whole is expressed as
P = {(x_i, y_i, z_i)}_{i=1}^N ⊂ ℝ³,
wherein P represents the three-dimensional structure of the target object, and N represents the number of points of the point cloud.
7. The system of claim 5, wherein the reconstruction module is specifically configured to:
using a series of convolution to construct an image encoder, extracting image features of the RGB image, and encoding the image features into hidden space vectors;
and recovering the hidden space vector into a rough reconstruction point cloud by using a series of deconvolution as a point cloud decoder, and acquiring the sparse point cloud with the complete contour.
8. The system of claim 5, wherein the chamfer distance is defined as:
d_CD(P, Q) = (1/|P|) Σ_{p∈P} min_{q∈Q} ||p - q||_2 + (1/|Q|) Σ_{q∈Q} min_{p∈P} ||q - p||_2    (1)
wherein P and Q are two point sets in ℝ³ space, and the chamfer distance describes the distance between the closest points of the two point clouds;
the third obtaining module is specifically configured to:
dividing the coarse point cloud randomly into two parts, calculating the average nearest-neighbor distance between the two parts by using formula (1) and taking it as the threshold, then calculating the nearest-neighbor distance between the coarse point cloud and the incomplete point cloud by using formula (1); the points whose nearest-neighbor distance is smaller than the threshold constitute the fine point cloud, thereby acquiring the fine point cloud area and the coarse point cloud area.
CN202110204647.6A 2021-02-23 2021-02-23 Point cloud completion method and system based on image Active CN113160068B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110204647.6A CN113160068B (en) 2021-02-23 2021-02-23 Point cloud completion method and system based on image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110204647.6A CN113160068B (en) 2021-02-23 2021-02-23 Point cloud completion method and system based on image

Publications (2)

Publication Number Publication Date
CN113160068A CN113160068A (en) 2021-07-23
CN113160068B true CN113160068B (en) 2022-08-05

Family

ID=76883873

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110204647.6A Active CN113160068B (en) 2021-02-23 2021-02-23 Point cloud completion method and system based on image

Country Status (1)

Country Link
CN (1) CN113160068B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486988B (en) * 2021-08-04 2022-02-15 广东工业大学 Point cloud completion device and method based on adaptive self-attention transformation network
CN113808224A (en) * 2021-08-18 2021-12-17 南京航空航天大学 Point cloud geometric compression method based on block division and deep learning
CN113409227B (en) * 2021-08-19 2021-11-30 深圳市信润富联数字科技有限公司 Point cloud picture repairing method and device, electronic equipment and storage medium
CN113808261B (en) * 2021-09-30 2022-10-21 大连理工大学 Panorama-based self-supervised learning scene point cloud completion data set generation method
CN114627351B (en) * 2022-02-18 2023-05-16 电子科技大学 Fusion depth estimation method based on vision and millimeter wave radar
CN115496881B (en) * 2022-10-19 2023-09-22 南京航空航天大学深圳研究院 Monocular image-assisted point cloud complement method for large aircraft
CN118505909B (en) * 2024-07-17 2024-10-11 浙江大学 Map-assisted incomplete cloud completion method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017023210A1 (en) * 2015-08-06 2017-02-09 Heptagon Micro Optics Pte. Ltd. Generating a merged, fused three-dimensional point cloud based on captured images of a scene
CN109613557A (en) * 2018-11-28 2019-04-12 南京莱斯信息技术股份有限公司 A kind of system and method for completion laser radar three-dimensional point cloud target
CN112102472A (en) * 2020-09-01 2020-12-18 北京航空航天大学 Sparse three-dimensional point cloud densification method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017023210A1 (en) * 2015-08-06 2017-02-09 Heptagon Micro Optics Pte. Ltd. Generating a merged, fused three-dimensional point cloud based on captured images of a scene
CN109613557A (en) * 2018-11-28 2019-04-12 南京莱斯信息技术股份有限公司 A kind of system and method for completion laser radar three-dimensional point cloud target
CN112102472A (en) * 2020-09-01 2020-12-18 北京航空航天大学 Sparse three-dimensional point cloud densification method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Deep Neural Network for 3D Point Cloud Completion with Multistage Loss Function; Haohao Huang et al.; 2019 Chinese Control And Decision Conference (CCDC); 2019-09-12; full text *
Point cloud completion network based on multi-branch structure; 罗开乾 et al.; 《激光与光电子学进展》 (Laser & Optoelectronics Progress); 2020-12-31; full text *

Also Published As

Publication number Publication date
CN113160068A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN113160068B (en) Point cloud completion method and system based on image
Panek et al. Meshloc: Mesh-based visual localization
CN111340922B (en) Positioning and map construction method and electronic equipment
CA2650557C (en) System and method for three-dimensional object reconstruction from two-dimensional images
CN111612728B (en) 3D point cloud densification method and device based on binocular RGB image
US9380293B2 (en) Method for generating a model of a flat object from views of the object
US20090296984A1 (en) System and Method for Three-Dimensional Object Reconstruction from Two-Dimensional Images
US8634637B2 (en) Method and apparatus for reducing the memory requirement for determining disparity values for at least two stereoscopically recorded images
CN111639663A (en) Method for multi-sensor data fusion
EP3293700B1 (en) 3d reconstruction for vehicle
CN112270701B (en) Parallax prediction method, system and storage medium based on packet distance network
CN113256699B (en) Image processing method, image processing device, computer equipment and storage medium
CN113888458A (en) Method and system for object detection
CN112630469B (en) Three-dimensional detection method based on structured light and multiple light field cameras
CN115035235A (en) Three-dimensional reconstruction method and device
CN114119992A (en) Multi-mode three-dimensional target detection method and device based on image and point cloud fusion
Meerits et al. Real-time scene reconstruction and triangle mesh generation using multiple RGB-D cameras
Lu et al. Stereo disparity optimization with depth change constraint based on a continuous video
Wei et al. A stereo matching algorithm for high‐precision guidance in a weakly textured industrial robot environment dominated by planar facets
Kang et al. 3D urban reconstruction from wide area aerial surveillance video
Zhu et al. Toward the ghosting phenomenon in a stereo-based map with a collaborative RGB-D repair
CN115170745B (en) Unmanned aerial vehicle distance measurement method based on stereoscopic vision
CN114266900B (en) Monocular 3D target detection method based on dynamic convolution
CN116977810B (en) Multi-mode post-fusion long tail category detection method and system
CN115187743B (en) Subway station internal environment arrangement prediction and white mode acquisition method and system

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant