CN107862735B - RGBD three-dimensional scene reconstruction method based on structural information

Info

Publication number: CN107862735B
Application number: CN201710865372.4A
Authority: CN
Inventors: 齐越; 王晨; 衡亦舒
Original assignee: Qingdao Research Institute Of Beijing University Of Aeronautics And Astronautics; Beihang University
Current assignee: Qingdao Research Institute Of Beijing University Of Aeronautics And Astronautics; Beihang University
Priority date: 2017-09-22
Filing date: 2017-09-22
Publication date: 2021-03-05
Anticipated expiration: 2037-09-22
Also published as: CN107862735A

Abstract

The invention belongs to the technical field of three-dimensional image processing, and particularly relates to an RGBD three-dimensional scene reconstruction method based on structural information, comprising the following steps: S1, detecting scene information in the i-th frame and marking it in the model data corresponding to the i-th frame; S2, estimating the camera pose corresponding to the (i+1)-th frame, using the scene information marked in the model data corresponding to the i-th frame in the calculation; S3, fusing the (i+1)-th frame into the model data corresponding to the i-th frame according to the camera pose corresponding to the (i+1)-th frame, to obtain the model data corresponding to the (i+1)-th frame; S4, detecting scene information in a projection map of the model data corresponding to the (i+1)-th frame, and back-projecting it into that model data; and S5, repeating steps S1 to S4 for i = 1, 2, 3, ..., N-1, where N is the total number of frames, to complete the three-dimensional reconstruction. The method detects the geometric structure information present in the scene in real time and uses it to minimize registration error during camera pose estimation, thereby improving real-time three-dimensional reconstruction.

Description

RGBD three-dimensional scene reconstruction method based on structural information

Technical Field

The invention belongs to the technical field of three-dimensional image processing, and particularly relates to an RGBD three-dimensional scene reconstruction method based on structural information.

Background

Real-time three-dimensional reconstruction has long been a research focus in the modeling field. The spread and development of depth sensors have provided favorable preconditions for it and greatly improved its feasibility and accuracy. In real-time modeling, camera pose estimation is the core problem, and its stability and reliability determine the quality of the final model. The commonly used pose estimation algorithm is ICP; however, because the depth acquired during actual scanning is inaccurate, ICP accumulates error to varying degrees during continuous operation, which may eventually cause pose estimation to fail. Common existing solutions combine geometric and color image information, including pre-registration based on color image features, point-to-point registration weighted by color information, and the use of edge structure information as a weighting criterion. These methods all work on the raw acquired image data and compute directly at the pixel or point-cloud level, without considering prior knowledge of the structure in the scene.

Disclosure of Invention

The invention aims to provide an RGBD three-dimensional scene reconstruction method based on structural information that detects the geometric structure information present in a scene in real time and uses it to minimize registration error during camera pose estimation, thereby improving real-time three-dimensional reconstruction.

To achieve this purpose, the invention adopts the following technical scheme: an RGBD three-dimensional scene reconstruction method based on structural information, comprising the following steps:

S1, detecting scene information in the i-th frame, and marking the scene information in the model data corresponding to the i-th frame;

S2, estimating the camera pose corresponding to the (i+1)-th frame, using the scene information marked in the model data corresponding to the i-th frame in the calculation;

S3, fusing the (i+1)-th frame into the model data corresponding to the i-th frame according to the camera pose corresponding to the (i+1)-th frame, to obtain the model data corresponding to the (i+1)-th frame;

S4, detecting scene information in a projection map of the model data corresponding to the (i+1)-th frame, and back-projecting the scene information into the model data corresponding to the (i+1)-th frame;

S5, repeating steps S1 to S4 for i = 1, 2, 3, ..., N-1, where N is the total number of frames, to complete the three-dimensional reconstruction.
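The control flow of steps S1 to S5 can be sketched as a simple driver loop. This is only an illustration under stated assumptions: the four callables (`detect_structure`, `estimate_pose`, `fuse_frame`, `relabel_model`) are hypothetical placeholders for steps S1 to S4 and are not names used in the patent, and the model data is treated as an opaque object.

```python
from typing import Any, Callable, Sequence


def reconstruct(
    frames: Sequence[Any],
    detect_structure: Callable[[Any, Any], Any],
    estimate_pose: Callable[[Any, Any], Any],
    fuse_frame: Callable[[Any, Any, Any], Any],
    relabel_model: Callable[[Any, Any], Any],
) -> Any:
    """Drive steps S1-S4 for each consecutive frame pair (step S5).

    All four callables are hypothetical placeholders, not patent names.
    """
    model = None  # model data; assumed to be created by the first S1 call
    for i in range(len(frames) - 1):  # the patent's i = 1 .. N-1, zero-based here
        model = detect_structure(frames[i], model)      # S1: mark structure of frame i
        pose = estimate_pose(frames[i + 1], model)      # S2: pose of frame i+1
        model = fuse_frame(model, frames[i + 1], pose)  # S3: fuse frame i+1 into the model
        model = relabel_model(model, pose)              # S4: re-detect and back-project
    return model
```

With N frames the loop body runs N-1 times, so each of the four steps is invoked once per consecutive frame pair.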

Further, the scene information is plane structure information.

Further, step S1 specifically comprises:

S11, preprocessing the i-th frame to obtain the coordinates and normal vector of each point of the i-th frame in the world coordinate system;

S12, detecting the plane structure information in the i-th frame, and marking it in the model data corresponding to the i-th frame.

Further, the specific steps of detecting the plane structure information in step S12 are:

S121, selecting plane points in the three-dimensional point cloud of the i-th frame in the world coordinate system;

S122, setting initial candidate planes;

S123, clustering initial plane regions;

S124, obtaining the planes in the three-dimensional point cloud, and recalculating the plane equation of each plane;

S125, merging planes.

Further, the specific steps of step S121 are:

(1) calculating the curvature value $k(u, v)$ of each point in the three-dimensional point cloud,

where $k(u, v)$ is the curvature value of the point with pixel coordinates $(u, v)$, $n_0$ is the normal vector of the triangle formed by the right neighborhood point and the upper neighborhood point of the position being evaluated, and $n_1, n_2, n_3$ are the normal vectors of the corresponding triangles in the other three directions;

(2) dividing the points in the three-dimensional point cloud into plane area points and non-plane area points according to their curvature values: a threshold is set, and any point whose curvature value is greater than the threshold is regarded as a non-plane area point and excluded from subsequent calculation.

Further, the specific steps of step S122 are:

(1) dividing the plane area points into blocks;

(2) calculating the plane equation corresponding to each block,

$$C = A_{m \times 4}^{T} A_{m \times 4}$$

where $m$ is the total number of points in the block and $A$ is the $m \times 4$ matrix formed by all of those points; the eigenvalues and eigenvectors of $C$ are solved, the eigenvector corresponding to the minimum eigenvalue is taken as the equation parameters of the plane for that block and normalized, and the resulting set $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ serves as the initial candidate planes.

Further, the specific steps of step S123 are:

(1) calculating the geometric relation between every plane area point and the initial candidate planes, namely the distance from the point to each initial candidate plane and the angle between the point's normal vector and the plane's normal vector;

(2) setting thresholds: if, for a plane area point, both the distance to an initial candidate plane and the angle between its normal vector and that plane's normal vector are smaller than the thresholds, the point is regarded as belonging to the corresponding initial plane region;

(3) taking the plane parameters with the minimum metric value as the plane on which the point lies, thereby clustering the initial plane regions.

Further, the specific steps of step S124 are: calculating the area of each initial plane region obtained by clustering and setting a threshold; if the area of a region is larger than the threshold, the region is regarded as a plane in the three-dimensional point cloud, and otherwise it is removed.

Further, the specific steps of step S125 are:

(1) selecting any two planes, calculating the angle between their normal vectors and the average distance from all points in one plane to the other plane, setting thresholds, and merging the two planes if both values are smaller than the thresholds;

(2) calculating, for all retained initial plane regions, the similarity between the corresponding plane equations, and merging two regions when their plane equations are highly similar.

Further, the specific steps of step S4 are:

S41, projecting the model data corresponding to the (i+1)-th frame to obtain the model projection map corresponding to the (i+1)-th frame;

S42, determining whether the unmarked points in the model projection map corresponding to the (i+1)-th frame belong to a known plane;

S43, determining whether the unmarked points in that model projection map that do not belong to any known plane form a new plane;

S44, marking the points determined in step S42 to belong to a known plane and the new planes generated in step S43, and back-projecting them into the model data corresponding to the (i+1)-th frame.

The method analyzes in depth the requirements placed on RGBD key frames in three-dimensional reconstruction and, compared with prior key frame extraction techniques for three-dimensional reconstruction, has the following advantages:

(1) To meet the real-time requirement, a blocking strategy is adopted when calculating the plane structure so that plane marking is parallelized, which greatly improves operating efficiency.

(2) The structural information of the scene is treated as useful prior knowledge and added to the ICP point cloud registration process; structural constraints are applied both when searching for point pairs and in the energy optimization, so that registration error is reduced as far as possible.

Drawings

FIG. 1 shows the structure detection result for the initial frame of scene 1, wherein (a) is the original color image, (b) is the normal vector projection map, and (c) is the detection result;

FIG. 2 shows the structure detection result for the second frame of scene 1, wherein (a) is the original color image, (b) is the normal vector projection map of the model, and (c) is the detection result;

FIG. 3 shows the overall structure detection result of the model after data of scene 1 has been acquired for a period of time;

FIG. 4 shows the modeling result for part of the data of scene 1;

FIG. 5 shows the structure detection result for the initial frame of scene 2, wherein (a) is the original color image and (b) is the detection result;

FIG. 6 shows the structure detection result after data of scene 2 has been acquired for a period of time, wherein (a) is the original color image and (b) is the detection result;

FIG. 7 shows the modeling result for part of the data of scene 2.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The principle of the invention is as follows. The first frame of the scene is acquired, the two-dimensional data are converted into a three-dimensional point cloud using the intrinsic and extrinsic parameters of the camera, the normal vector of each point is calculated from its neighborhood, the camera position corresponding to this frame is taken as the origin of the world coordinate system, and plane structures are detected in this initial frame. First, the curvature value of each point is calculated and non-planar structure points are excluded; the remaining planar structure points are partitioned into blocks, and the plane parameters of each block are estimated as a candidate plane. Partitioning allows the work to run in parallel and meet the real-time requirement. Next, the geometric relationship between each point and the candidate planes is calculated, mainly the distance from the point to the plane and the angle between the point's normal vector and the plane's normal vector. All points in the projection map are clustered according to this point-to-plane test, the regions with larger areas are retained as clustering results, and the plane parameters of those regions are recalculated. Because of the earlier blocking, however, several blocks may yield planes that belong to the same region in the real scene, so a region merging operation is performed on all calculated plane regions; two plane regions are merged only if their plane equations are highly similar. These steps complete the marking of structural information for the first frame.

Subsequent frames are then acquired one after another, and the camera pose is estimated for each. First, matched point pairs between the model projection map and the current frame's point cloud are searched for. For reliability, the marked structure points in the model projection map are used as candidate points, and for each one the point whose geometric information is closest is searched for around the corresponding position in the current frame. Second, an energy equation is constructed in which an energy term measuring structural similarity is added to the original geometric measure of the distance between point pairs, so that registration error is minimized as far as possible. The current frame is then fused with the model according to the calculated camera pose to form new model data, a new model projection map is obtained under the current camera pose, and the structure labels of the projection map are recalculated. This calculation has two parts: region expansion of the existing planes and detection of new planes. A region growing method is used, which greatly improves the operation speed. All marking results are fused into the new model data, providing reliable structural information for subsequent pose estimation and modeling.

Based on the principle, the RGBD three-dimensional scene reconstruction method based on the structural information specifically comprises the following steps:

First, the depth image data of the i-th frame are obtained and preprocessed, mainly with a fast bilateral filtering method (equation (1)). The filter is expressed in terms of the symbols $s$, $p_i$, $p_j$, $I_i$, $I_j$, $\delta_1$ and $\delta_2$; the equation itself and the definitions and value ranges of these symbols are missing from this text.

Then, the two-dimensional data are converted into a three-dimensional point cloud according to the intrinsic parameters of the depth camera:

$$v(u, v) = K^{-1} \cdot (u, v, d)^{T} \quad (2)$$

where $u, v$ are the pixel coordinates in the filtered depth map, $d$ is the corresponding depth value, and $K^{-1}$ is the inverse of the depth camera's intrinsic matrix.
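A minimal sketch of this back-projection is given here for illustration. It assumes a pinhole intrinsic matrix with focal lengths `fx`, `fy`, principal point `cx`, `cy` and zero skew (these parameter names are not from the patent), and it uses the common convention $d \cdot K^{-1} (u, v, 1)^{T}$, which may differ from equation (2) as printed. Non-positive depths are treated as missing measurements, which is also an assumption.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def back_project(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[List[Optional[Vec3]]]:
    """Convert a depth map into an organized point cloud in the camera frame.

    Assumes a zero-skew pinhole model; invalid (non-positive) depths become None.
    """
    cloud: List[List[Optional[Vec3]]] = []
    for v, row in enumerate(depth):          # v: pixel row
        out_row: List[Optional[Vec3]] = []
        for u, d in enumerate(row):          # u: pixel column
            if d <= 0.0:
                out_row.append(None)         # no measurement at this pixel
            else:
                # d * K^-1 * (u, v, 1)^T for K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
                out_row.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
        cloud.append(out_row)
    return cloud
```

Keeping the cloud organized as a pixel grid preserves the neighborhood structure that the later normal and curvature calculations rely on.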

The camera position corresponding to the initial frame is set as the origin of the world coordinate system, and the i-th frame is converted into a three-dimensional point cloud in the world coordinate system to obtain the coordinates of each point. The normal vector of each point in this point cloud is then calculated in turn from the three-dimensional coordinates of its adjacent pixels,

where $n(u, v)$ is the normal vector of the point with pixel coordinates $(u, v)$.
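Because the normal equation is not reproduced in this text, the following is only a sketch of one common way to compute a normal from adjacent pixels: the normalized cross product of the vectors to the right and lower neighbors. The choice of neighbors, the orientation of the result, and the function name `normal_at` are assumptions, not the patent's formula.

```python
import math
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def normal_at(cloud: List[List[Optional[Vec3]]], u: int, v: int) -> Optional[Vec3]:
    """Unit normal at pixel (u, v) from its right and lower neighbors, or None."""
    if v + 1 >= len(cloud) or u + 1 >= len(cloud[v]):
        return None  # border pixel: a neighbor is missing
    p, right, down = cloud[v][u], cloud[v][u + 1], cloud[v + 1][u]
    if p is None or right is None or down is None:
        return None  # invalid depth at the point or a neighbor
    a = (right[0] - p[0], right[1] - p[1], right[2] - p[2])
    b = (down[0] - p[0], down[1] - p[1], down[2] - p[2])
    # Cross product a x b is perpendicular to the local surface patch.
    n = (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if length == 0.0:
        return None  # degenerate (collinear) neighbors
    return (n[0] / length, n[1] / length, n[2] / length)
```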

S12, detecting the plane structure information in the i-th frame, and marking it in the model data corresponding to the i-th frame;

The specific steps for detecting the plane structure information in the i-th frame are as follows:

S121, selecting plane points: calculating the curvature value $k(u, v)$ of each point in the three-dimensional point cloud,

where $k(u, v)$ is the curvature value of the point with pixel coordinates $(u, v)$, $n_0$ is the normal vector of the triangle formed by the right neighborhood point and the upper neighborhood point of the position being evaluated, and $n_1, n_2, n_3$ are the normal vectors of the corresponding triangles in the other three directions. The points in the three-dimensional point cloud are divided into plane area points and non-plane area points according to their curvature values: a threshold is set, and any point whose curvature value is greater than the threshold is regarded as a non-plane area point and excluded from subsequent calculation. In this embodiment the threshold is set to 0.01: when $k(u, v)$ is greater than 0.01, the point $(u, v)$ is regarded as a non-planar point; otherwise it is a planar point.
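The thresholding in this step can be illustrated with a short sketch. It assumes the curvature values have already been computed into a per-pixel grid (the curvature equation itself is not reproduced in this text), with `None` marking pixels that have no valid value; the function name `planar_mask` is hypothetical.

```python
from typing import List, Optional


def planar_mask(
    curvature: List[List[Optional[float]]], threshold: float = 0.01
) -> List[List[bool]]:
    """True where a pixel counts as a plane area point (curvature <= threshold).

    Pixels with no curvature value (None) are treated as non-planar.
    """
    return [[k is not None and k <= threshold for k in row] for row in curvature]
```

A value exactly equal to the threshold is kept as planar, since the embodiment rejects a point only when its curvature is greater than 0.01.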

S122, setting initial candidate planes;

The specific operation of step S122 is as follows: the plane area points are first partitioned into blocks (in this embodiment, blocks of size 30 × 30), and the plane equation corresponding to each block is calculated,

$$C = A_{m \times 4}^{T} A_{m \times 4} \quad (5)$$

where $m$ is the total number of points in the block and $A$ is the $m \times 4$ matrix formed by all of those points. The eigenvalues and eigenvectors of $C$ are solved, the eigenvector corresponding to the minimum eigenvalue is taken as the equation parameters of the plane for that block and normalized, and the resulting set $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ serves as the initial candidate planes.
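The plane fit of equation (5) can be sketched with the standard library alone. The sketch assumes that each row of $A$ is the homogeneous point $(x, y, z, 1)$ and that normalization means scaling the eigenvector so that the plane normal $(a, b, c)$ has unit length; neither detail is stated in the text. The symmetric eigenproblem is solved here with a cyclic Jacobi iteration, which is a substitute chosen for self-containment, not a method named in the patent.

```python
import math
from typing import List, Sequence, Tuple


def jacobi_eigh(matrix: Sequence[Sequence[float]], max_sweeps: int = 50):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    vecs = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    scale = sum(x * x for row in a for x in row) or 1.0
    for _ in range(max_sweeps):
        off = sum(a[r][c] ** 2 for r in range(n) for c in range(n) if r != c)
        if off <= 1e-24 * scale:
            break  # off-diagonal part is negligible: matrix is diagonalized
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                cs, sn = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (rotate columns p, q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = cs * akp - sn * akq, sn * akp + cs * akq
                for k in range(n):  # A <- J^T A (rotate rows p, q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = cs * apk - sn * aqk, sn * apk + cs * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = vecs[k][p], vecs[k][q]
                    vecs[k][p], vecs[k][q] = cs * vkp - sn * vkq, sn * vkp + cs * vkq
    return [a[r][r] for r in range(n)], vecs


def fit_plane(points: Sequence[Tuple[float, float, float]]) -> Tuple[float, float, float, float]:
    """Plane (a, b, c, d) with a*x + b*y + c*z + d = 0 and unit normal."""
    rows = [(x, y, z, 1.0) for x, y, z in points]  # assumed homogeneous rows of A
    c_mat = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    values, vectors = jacobi_eigh(c_mat)
    k = min(range(4), key=lambda idx: values[idx])  # smallest eigenvalue
    a, b, c, d = (vectors[r][k] for r in range(4))
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0.0:
        raise ValueError("degenerate block: no plane normal")
    return (a / norm, b / norm, c / norm, d / norm)
```

The sign of the returned normal is arbitrary, so later comparisons between plane normals should not depend on orientation.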

S123, clustering initial plane regions;

First, the geometric relation between every plane area point and the initial candidate planes is calculated, namely, for each plane area point, its distance to each initial candidate plane and the angle between its normal vector and the plane's normal vector. Thresholds are set: if both the distance and the angle are smaller than their thresholds, the point is regarded as belonging to the corresponding initial plane region. The plane parameters with the minimum metric value are taken as the plane on which the point lies, and the initial plane regions are clustered accordingly. In this embodiment, the point-to-plane distance threshold is 0.04 × the depth value of the point, and the threshold for the angle between the point's normal vector and the plane's normal vector is 15 degrees.
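The per-point test of step S123 might look like the following sketch, using the embodiment's thresholds (0.04 × depth and 15 degrees). Two details are assumptions because the text does not define them: the metric used to pick the best plane (here the sum of the two threshold-normalized residuals), and the use of the absolute dot product so that the arbitrary sign of a fitted plane normal does not matter. Planes are assumed to be stored as $(a, b, c, d)$ with a unit normal.

```python
import math
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d), unit normal assumed


def assign_to_plane(
    point: Vec3,
    normal: Vec3,
    depth: float,
    planes: Sequence[Plane],
    dist_factor: float = 0.04,
    max_angle_deg: float = 15.0,
) -> Optional[int]:
    """Index of the best-matching candidate plane, or None if none passes."""
    best_index, best_metric = None, math.inf
    max_dist = dist_factor * depth  # distance threshold scales with depth
    for index, (a, b, c, d) in enumerate(planes):
        dist = abs(a * point[0] + b * point[1] + c * point[2] + d)
        cos_angle = abs(a * normal[0] + b * normal[1] + c * normal[2])
        angle = math.degrees(math.acos(min(1.0, cos_angle)))
        if dist < max_dist and angle < max_angle_deg:
            # Assumed metric: sum of residuals normalized by their thresholds.
            metric = dist / max_dist + angle / max_angle_deg
            if metric < best_metric:
                best_index, best_metric = index, metric
    return best_index
```

Because each point is tested independently, this step parallelizes naturally over pixels, in line with the blocking strategy described above.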

S124, obtaining the planes in the three-dimensional point cloud and recalculating their plane equations;

The area of each initial plane region obtained by clustering is calculated and a threshold is set: if the area of a region is greater than the threshold, the region is regarded as a plane in the three-dimensional point cloud; otherwise it is removed. Specifically, the bounding box of each initial plane region is calculated and the corresponding plane area is derived from it; in this embodiment, an initial plane region is removed when its area is less than 0.06 m × 0.06 m. The plane equations of the remaining planes are then recalculated with the RANSAC algorithm.

S125, merging planes;

In a real scene, a common planar structure such as the floor may be divided by furniture and identified as two plane regions, so the planes obtained in step S124 need to be merged. Specifically, any two planes are selected, the angle between their normal vectors and the average distance from all points in one plane to the other plane are calculated, thresholds are set, and the two planes are merged if both values are smaller than the thresholds. In this embodiment, the threshold for the angle between the two normal vectors is 2 degrees, and the threshold for the average distance from all points of one plane to the other plane is 0.01 m.

For all retained initial plane regions, the similarity between the corresponding plane equations is calculated, and two regions are merged when their plane equations are highly similar,

where $p_i, p_j$ are the plane parameters corresponding to two different regions, $A_k$ is the matrix composed of all the points contained in region $i$, and $n_i$ is the number of points in region $i$.
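The first merge test of step S125 (normal angle below 2 degrees and mean point-to-plane distance below 0.01 m) can be sketched as follows. The similarity formula of the second test is not reproduced in this text and is not modeled here. Planes are assumed to be $(a, b, c, d)$ tuples with unit normals, the absolute dot product is used so that normal orientation does not matter, and the function name `should_merge` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]  # (a, b, c, d), unit normal assumed


def should_merge(
    plane_a: Plane,
    plane_b: Plane,
    points_a: Sequence[Vec3],
    max_angle_deg: float = 2.0,
    max_mean_dist: float = 0.01,
) -> bool:
    """True if plane_a's region may be merged into plane_b under both thresholds."""
    if not points_a:
        return False  # nothing to measure the distance with
    cos_angle = abs(sum(x * y for x, y in zip(plane_a[:3], plane_b[:3])))
    angle = math.degrees(math.acos(min(1.0, cos_angle)))
    a, b, c, d = plane_b
    # Mean distance from every point of region A to plane B.
    mean_dist = sum(abs(a * x + b * y + c * z + d) for x, y, z in points_a) / len(points_a)
    return angle < max_angle_deg and mean_dist < max_mean_dist
```

For example, two floor patches separated by furniture would typically have nearly identical normals and offsets and so pass both checks, while a tabletop parallel to the floor passes the angle check but fails the distance check.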

And finally, marking the obtained plane structure information in model data corresponding to the ith frame.

S21, calculating the three-dimensional point cloud of the (i+1)-th frame in the world coordinate system, and transforming it into the model coordinate system corresponding to the i-th frame, where $T^{i+1 \to i}$ is the camera pose matrix to be solved.

S22, obtaining the projected point cloud of the model data corresponding to the i-th frame.

S23, calculating the matching point pairs between the transformed point cloud of the (i+1)-th frame and the projected point cloud of the model. The matching formula combines three terms: a measure of the distance between the points of a pair, a measure of the angle between the normal vectors of the pair, and a point-to-plane distance measure based on the parametric equation of the associated plane. The formula is evaluated within the 3 × 3 neighborhood of the corresponding position, and the point with the minimum result is taken as the matching point.

S24, adding the constraint of the plane structure information when constructing the energy equation, to obtain the camera pose corresponding to the (i+1)-th frame. The energy equation includes the parametric equation of the associated plane; it is iterated with a least-squares optimization method until convergence, yielding the camera pose $T^{i+1 \to i}$.

S41, projecting the model data corresponding to the (i+1)-th frame to obtain the model projection map corresponding to the (i+1)-th frame; some points in this projection map were already marked in the previous calculation.

S42, determining whether the unmarked points in the model projection map corresponding to the (i+1)-th frame belong to a known plane,

where $v$ and $n$ are the position and normal vector of the point being evaluated, $P$ is the set of all known plane parameters, and $d_v$ is the depth value of point $v$.

S43, determining, by a region growing method, whether the points in the model projection map corresponding to the (i+1)-th frame that are unmarked and do not belong to any known plane form a new plane;

First, a seed point is selected arbitrarily and its relation to each of its four neighborhood points is evaluated in turn. When the value $D$ of the following formula is close to 0, the neighborhood point and the seed point are regarded as belonging to the same plane; the neighborhood point is then used as a new seed point and the calculation is iterated until the condition is no longer met.

$$D = \arccos(n_{neighbor} \cdot n_{seed}) \cdot \frac{180}{\pi} - 15 \quad (10)$$
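A four-neighborhood region growing pass over a grid of normals can be sketched as follows. The acceptance rule is an assumption: the sketch accepts a neighbor when $D < 0$, that is, when the angle between its normal and the current seed's normal is below 15 degrees, which is one reading of the condition on $D$ in the text. The grid layout, the label convention (0 means unmarked) and the function name `grow_region` are also illustrative.

```python
import math
from collections import deque
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def grow_region(
    normals: List[List[Optional[Vec3]]],
    seed: Tuple[int, int],
    labels: List[List[int]],
    label: int,
    max_angle_deg: float = 15.0,
) -> int:
    """Grow a region from seed (row, col); writes label into labels in place.

    Returns the number of pixels assigned. Unit normals are assumed.
    """
    height, width = len(normals), len(normals[0])
    labels[seed[0]][seed[1]] = label
    queue = deque([seed])
    count = 1
    while queue:
        r, c = queue.popleft()  # current seed point
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # four-neighborhood
            nr, nc = r + dr, c + dc
            if not (0 <= nr < height and 0 <= nc < width):
                continue
            if labels[nr][nc] != 0 or normals[nr][nc] is None:
                continue  # already marked, or no valid normal
            dot = sum(x * y for x, y in zip(normals[nr][nc], normals[r][c]))
            dot = max(-1.0, min(1.0, dot))  # guard acos against rounding
            d_value = math.degrees(math.acos(dot)) - max_angle_deg
            if d_value < 0.0:  # assumed acceptance rule
                labels[nr][nc] = label
                queue.append((nr, nc))  # neighbor becomes a new seed
                count += 1
    return count
```

Each pixel is enqueued at most once, so the cost is linear in the number of unmarked pixels visited.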

After all points have been evaluated and marked, they are back-projected into the model data corresponding to the (i+1)-th frame, and the data of the whole model are updated.

To verify the effectiveness and practicability of the method, simulation experiments were carried out on two kinds of data: an existing data set and a real scene. FIGS. 1-4 show the results on the data set. Comparing FIG. 1(c), FIG. 2(c) and FIG. 3 shows that, as the scene model is accumulated, the data are effectively supplemented and the plane information of each region in the scene is effectively segmented, finally forming the relatively complete three-dimensional scene model of FIG. 4. FIGS. 5-7 show data acquisition and modeling in a real scene: the plane geometric information in the scene is effectively detected and marked, giving the results of FIG. 5(b) and FIG. 6(b), and finally forming the relatively complete three-dimensional scene model of FIG. 7.

Compared with other existing modeling methods, the method first uses the structural information of the scene as prior knowledge in the optimization of the camera pose, which reduces registration error as far as possible, and second uses an efficient parallel method to detect the structural information, thereby achieving higher time efficiency and more accurate positioning.

It will be understood that modifications and variations can be made by persons skilled in the art in light of the above teachings and all such modifications and variations are intended to be included within the scope of the invention as defined in the appended claims.

Claims

1. An RGBD three-dimensional scene reconstruction method based on structural information is characterized in that,

2. The RGBD three-dimensional scene reconstruction method based on the structural information as claimed in claim 1, wherein the scene information is planar structural information.

3. The RGBD three-dimensional scene reconstruction method based on structural information according to claim 2, wherein the specific steps of the step S1 are as follows:

4. The RGBD three-dimensional scene reconstruction method based on the structural information according to claim 3, wherein the specific step of detecting the plane structural information in step S12 is as follows:

S122, setting initial candidate planes;

S123, clustering initial plane regions;

S125, merging planes.

5. The RGBD three-dimensional scene reconstruction method based on the structural information as claimed in claim 4, wherein the step S121 comprises the following steps:

where $k(u, v)$ represents the curvature value of the point with pixel coordinates $(u, v)$, and $n_0$ is the normal vector of the triangle formed by the right neighborhood point and the upper neighborhood point of the position being evaluated,

6. The RGBD three-dimensional scene reconstruction method based on the structural information as claimed in claim 5, wherein the step S122 comprises the following steps:

(1) dividing the plane area points;

(2) calculating a plane equation corresponding to each area,

$$C = A_{m \times 4}^{T} A_{m \times 4}$$

7. The RGBD three-dimensional scene reconstruction method based on the structural information as claimed in claim 6, wherein the step S123 comprises the following steps:

8. The RGBD three-dimensional scene reconstruction method based on structural information according to claim 7, wherein the specific steps of the step S124 are as follows: calculating the area of the initial plane region obtained by clustering, setting a threshold value, regarding each initial plane region obtained by clustering, if the area of the initial plane region is larger than the threshold value, considering the plane as the plane in the three-dimensional point cloud, and otherwise, removing the plane.

9. The RGBD three-dimensional scene reconstruction method based on structural information according to claim 7, wherein the specific steps of the step S125 are as follows:

10. The RGBD three-dimensional scene reconstruction method based on structural information according to any of claims 2-9, wherein the specific steps of step S4 are:

and S44, marking the points which are determined in the step S42 and belong to the known plane and the new plane generated in the step S43, and back projecting the points to the model data corresponding to the (i + 1) th frame.