CN112435206B - Method for reconstructing three-dimensional information of object by using depth camera - Google Patents

Method for reconstructing three-dimensional information of object by using depth camera

Info

Publication number
CN112435206B
CN112435206B
Authority
CN
China
Prior art keywords
depth camera
diamond
feature
pictures
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011330054.6A
Other languages
Chinese (zh)
Other versions
CN112435206A (en)
Inventor
杨东学
黄华
尹辉
金泰辰
许宏丽
高亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University filed Critical Beijing Jiaotong University
Priority to CN202011330054.6A priority Critical patent/CN112435206B/en
Publication of CN112435206A publication Critical patent/CN112435206A/en
Application granted granted Critical
Publication of CN112435206B publication Critical patent/CN112435206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention provides a method for reconstructing three-dimensional information of an object by using a depth camera. The method comprises the following steps: shooting a current frame picture of the object with a depth camera, and acquiring and storing the feature points and corresponding feature descriptors of each frame picture; performing stereo matching on the feature points of the two pictures of each pair of adjacent frames, calculating the rotation matrix and translation vector between the two pictures from the stereo matching result of their feature points, and unifying the three-dimensional coordinates of all feature points of the two pictures into the same coordinate system using the rotation matrix and translation vector to obtain a full-surface point cloud of the object; constructing a graph model from the full-surface point cloud of the object, and obtaining the three-dimensional information of the object reconstructed by the depth camera from the optimized graph model. The invention recovers the pose relation of adjacent frames by stereo matching of external marker feature points; the matching process does not depend on the texture information of the object, and the whole process requires no calibration step.

Description

Method for reconstructing three-dimensional information of object by using depth camera
Technical Field
The invention relates to the technical field of three-dimensional vision measurement reconstruction, in particular to a method for reconstructing three-dimensional information of an object by using a depth camera.
Background
Full-surface three-dimensional reconstruction of objects is a popular research direction in the field of computer vision. There are two methods for generating three-dimensional information of an object: one is to manually design the geometry of the object using geometric modeling software; the other is to recover the three-dimensional information of the object from two-dimensional projection pictures using visual methods. The technical key points of full-surface three-dimensional reconstruction are the generation of single-view point clouds and the stitching of point clouds from different view angles. For the stitched point cloud, reducing the accumulated error is also an important step in the three-dimensional reconstruction process.
Generation methods for single-view point clouds are divided into passive optical methods and active optical methods. Passive optical reconstruction performs stereo matching on pictures taken by left and right cameras and calculates the three-dimensional coordinates of the matched points by triangulation. The passive optical method is suitable when the object surface has rich texture; when the texture is weak or absent, stereo matching of the pictures is error-prone, which degrades the accuracy of the generated point cloud. Structured light, the representative active optical method, can overcome insufficient object texture: it recovers the three-dimensional information of the object by projecting stripes or coded patterns carrying prior information and acquiring the pattern information with a depth camera.
Laser three-dimensional reconstruction combined with a rotary table can obtain a high-precision point cloud of the object. However, this approach requires calibrating the rotation axis before every reconstruction; the operation is complex and the device cost is high. In addition, if the equipment is accidentally touched during reconstruction, the device must be recalibrated. When the turntable and the laser are rigidly bound, they tend to sit relatively close together, so the reconstructed object cannot be too large.
For full-surface three-dimensional reconstruction, two adjacent point clouds usually need to be found and stitched. Prior-art stereo matching methods, lacking a true prior on the feature descriptors, typically take the pair with the minimum descriptor distance; but this minimum-distance assumption does not guarantee that the two feature points are a true match. Prior-art point cloud stitching generally relies on iterative optimization with ICP (Iterative Closest Point), which is time-consuming on the one hand and prone to degeneration on the other.
Therefore, developing an effective depth-camera-based method for object three-dimensional reconstruction built on coded marker points is a problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a method for reconstructing three-dimensional information of an object by using a depth camera, which aims to overcome the problems in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme.
A method for reconstructing three-dimensional information of an object using a depth camera, comprising:
printing and pasting diamond coding patterns on the surface of a round table, placing the round table with the coding patterns on a rotary table, placing an object on the round table, and placing a depth camera in front of the object;
shooting a current frame picture of an object by using a depth camera, sequentially rotating a rotary table by a certain angle, shooting the picture of the object at each angle until the rotary table rotates by 360 degrees, and acquiring and storing feature points and corresponding feature descriptors of each frame of picture;
performing stereo matching on the feature points of the two pictures of each pair of adjacent frames, calculating the rotation matrix and translation vector between the two pictures from the stereo matching result of their feature points, and unifying the three-dimensional coordinates of all feature points of the two pictures into the same coordinate system using the rotation matrix and translation vector; performing this process for all pairs of adjacent frames, and unifying the three-dimensional coordinates of all feature points of all frames into the same coordinate system to obtain a full-surface point cloud of the object;
and constructing a graph model by using the full-surface point cloud of the object, performing global optimization on the graph model by using an optimization function, and obtaining three-dimensional information of the object reconstructed by using a depth camera according to the optimized graph model.
Preferably, the printing and pasting of the diamond coding pattern on the surface of the round table, placing the round table with the coding pattern on the rotary table, placing the object on the round table, and placing the depth camera in front of the object comprises:
according to the size of the rotary table, designing the diamond coding pattern, filling each diamond of the diamond coding pattern with red, green or blue according to the corresponding number, and printing and pasting the color-filled diamond coding pattern on the surface of the round table;
and placing the round table with the diamond coding pattern on the rotary table, placing the object to be reconstructed on the round table, and placing the depth camera in front of the object, ensuring that the field of view of the depth camera contains both the object and the diamond coding pattern on the round table.
Preferably, designing the diamond coding pattern according to the size of the rotary table comprises:
according to the recurrence relation $x_n = x_{n-5} + 2x_{n-6}$ and the initial sequence $x_0 = x_1 = x_2 = x_3 = x_4 = x_5 = 1$, recursively generating an array of length 728, and filling each element of the array, in generation order, into a matrix of size 8×91, cycling from top-left to bottom-right;
each element of the matrix is replaced by a diamond in one of the three colors red, green and blue, corresponding to the three matrix elements, against a white background, yielding the diamond coding pattern; several rows of the diamond coding pattern are cut according to the size of the rotary table, and after the rows of the pattern are pasted around the round table, a column of black diamonds finally joins the pattern end to end.
Preferably, the step of photographing the current frame of the object with the depth camera, sequentially rotating the rotation table by a certain angle, photographing the image of the object at each angle until the rotation table rotates 360 degrees, and obtaining and storing the feature points and the corresponding feature descriptors of each frame of image, includes:
for a current frame picture shot by the current depth camera, taking the corner points between diamonds in the diamond coding pattern on the picture as feature points, taking the 2×3 diamond colors around each feature point as its feature descriptor, and storing the feature points and feature descriptors corresponding to the current frame picture;
rotating the rotary table by a certain angle, moving the depth camera, ensuring that the visual field of the depth camera can contain the object and diamond coding patterns on the round table, shooting the next frame of picture again by using the depth camera, and storing the feature points and feature descriptors corresponding to the next frame of picture again;
and repeatedly executing the processing process until the rotating table rotates 360 degrees, wherein a certain overlapping area is needed between two pictures of adjacent frames.
Preferably, the storing the feature points and feature descriptors corresponding to the current frame picture includes:
for the pixel point (i, j) at row i and column j of the picture, calculating the response S(i, j), where k is an empirical threshold and p(i, j) denotes the gray value of the pixel at row i, column j;
if S(i, j) > 0, the pixel is a class-one feature point, and if S(i, j) < 0, the pixel is a class-two feature point;
for a class-one feature point, detecting the colors of the nearest diamonds above and below it; for a class-two feature point, detecting the colors of the nearest diamonds to its left and right; representing the five colors black, red, green, blue and white by the numbers 0, 1, 2, 3 and 4 respectively; for the current pixel (i, j), with color channel values r(i, j), g(i, j) and b(i, j), letting Mic(i, j) = min(r(i, j), g(i, j), b(i, j)) and Mac(i, j) = max(r(i, j), g(i, j), b(i, j)), and judging the color of the current pixel (i, j) by computing H(i, j) from Mic(i, j) and Mac(i, j);
for a class-one feature point, the six diamond colors at its upper-left, upper-middle, upper-right, lower-left, lower-middle and lower-right are stored as a six-digit base-5 number; this number F(i, j) is the feature descriptor of the class-one feature point.
Preferably, the stereo matching is performed on feature points of two pictures of an adjacent frame, a rotation matrix and a translation vector between the two pictures of the adjacent frame are calculated by using a stereo matching result of the feature points of the two pictures of the adjacent frame, and three-dimensional coordinates of all feature points of the two pictures of the adjacent frame are unified to the same coordinate system by using the rotation matrix and the translation vector, including:
for two adjacent pictures, called the left and right images: judging whether the feature descriptor $F_{left}(i, j)$ of the current feature point of the left image and the feature descriptor $F_{right}(i, j)$ of the current feature point of the right image are the same; if so, the current feature points of the left and right images are matching points, and all matching points with identical feature descriptors are saved;
calculating the de-meaned three-dimensional coordinates $wl_i = (x_i, y_i, z_i)^T$ of the left-image feature point set and $wr_i = (x_i, y_i, z_i)^T$ of the right-image feature point set, where $R^*$ is the rotation matrix to be solved and $wl_i = R^* wr_i$; letting $A = \sum_i wr_i \, wl_i^T$, singular value decomposition gives $A = U \Sigma V^T$ and $R^* = V U^T$;
calculating the centroid $p_1$ of the three-dimensional coordinates of the left-image feature point set and the centroid $p_2$ of the right-image feature point set; from the rotation matrix $R^*$ calculated above, obtaining the translation vector $T^* = p_1 - R^* p_2$;
using the rotation matrix $R^*$ and translation vector $T^*$, unifying the three-dimensional coordinates of all feature points of the two pictures of the adjacent frames into the same coordinate system; performing this process for all pairs of adjacent frames, unifying the three-dimensional coordinates of all feature points of all frames into the world coordinate system, with the depth camera position at the first frame taken as the origin of the world coordinate system, to obtain the full-surface point cloud of the object.
Preferably, the constructing a graph model by using the full-surface point cloud of the object, and performing global optimization on the graph model by using an optimization function includes:
constructing a graph model from the full-surface point cloud of the object, the graph model comprising vertices, edges, edge weights and a global optimization function; as the rotary table rotates one full circle, the position of the depth camera corresponding to each frame picture is a node in the world coordinate system, and the depth camera nodes together with the three-dimensional coordinates of all feature points are defined as the vertices of the graph model;
if the depth camera i can observe the feature point j, connecting the node where the depth camera is currently located with the feature point j, and defining the weight of the edge as the reprojection error from the feature point j to the depth camera: e=z-h (T, p);
wherein z is the pixel coordinate of the feature point observed by the camera, h(T, p) is the pixel coordinate obtained by the camera model re-projecting the world coordinate p under the current extrinsic parameters T, and e is the reprojection error;
and optimizing the whole graph model using the LM method so that the sum of the edge weights of the graph model is minimized, with the loss function $\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\left\|e_{ij}\right\|^{2}$, where $e_{ij}$ denotes the reprojection error of depth camera pose i observing world point j, m is the number of depth camera poses, and n is the number of feature points.
Preferably, the obtaining three-dimensional information of the object reconstructed by the depth camera according to the optimized graph model includes:
obtaining the optimized rotation matrices and translation vectors from the depth camera poses in the optimized graph model, and obtaining the optimized full-surface point cloud of the object from the optimized rotation matrices, translation vectors and camera poses, wherein the full-surface point cloud comprises the three-dimensional coordinates and corresponding RGB color information of each point, and reconstructing the three-dimensional information of the object from the three-dimensional coordinates and corresponding RGB color information of all the points.
According to the technical scheme provided by the embodiment of the invention, external marker feature points are used for stereo matching to recover the pose relation of adjacent frames, and the matching process does not depend on the texture information of the object. The coding information used by the invention provides exact feature descriptors, so a more accurate matching relation is obtained and ICP computation is avoided. The invention recovers pose using the matching of image feature points and calculates the rotation matrix and translation vector from depth camera measurements; the whole process requires no calibration step.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a process flow diagram of a method for reconstructing three-dimensional information of an object using a depth camera based on coded marker points according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a partial diamond-shaped coding pattern filled with colors according to an embodiment of the present invention;
FIG. 3 is a diagram of the stereo matching effect between two adjacent frame pictures after rotation by an angle.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the purpose of facilitating an understanding of the embodiments of the invention, the following description refers to several specific embodiments illustrated in the accompanying drawings; these in no way limit the embodiments of the invention.
The depth camera is a new technology of recent years; compared with a traditional camera, it adds depth measurement, making it easier and more accurate to perceive the surrounding environment and its changes. The equipment used in the embodiment of the invention comprises a depth camera, a rotary table, a round table pasted with the diamond coding pattern, and the object to be reconstructed.
The processing flow of the method for reconstructing three-dimensional information of an object by using a depth camera based on coding mark points provided by the embodiment of the invention is shown in fig. 1, and the method comprises the following processing steps:
and S10, designing diamond coding patterns corresponding to the M matrix according to the size of the rotary table, filling three colors of red, green or blue in each diamond of the diamond coding patterns according to corresponding numbers, and printing and pasting the diamond coding patterns filled with the colors on the surface of the rotary table.
M-array definition: if all $m \times n$ windows ($m \le r$, $n \le s$) of an $r \times s$ periodic array A over k elements are pairwise distinct, and the windows cover exactly all combinations of the elements, then A is called an (r, s, m, n)-M-array, subject to the constraints $rs = k^{mn}$ and $k = p^q$, where p is a prime number.
The M-array used in this experiment contains 3 elements and has size 9×81, in which any two 2×3 sub-matrices differ from each other. The diamond pattern is then constructed by mapping the three matrix elements to the three colors red, green and blue.
Fig. 2 is a schematic diagram of a partial diamond-shaped coding pattern filled with colors according to an embodiment of the present invention.
Step S20: set up the device: place the round table with the diamond coding pattern on the rotary table, place the object to be reconstructed on the round table, and place the depth camera in front of the object, ensuring that the field of view of the depth camera contains both the object and the diamond coding pattern on the round table.
step S30, regarding a current frame picture shot by a current depth camera, taking corner points among diamonds in a diamond coding pattern on the picture as key points, taking 2*3 diamond colors around the key points as feature descriptors, and storing feature points and feature descriptors corresponding to the current frame picture. And the rotary table is rotated for a certain angle, the depth camera is moved, and the visual field of the depth camera is ensured to contain the object and diamond coding patterns on the round table. And shooting the next frame of picture again by using the depth camera, and storing the feature points and the feature descriptors corresponding to the next frame of picture again.
And repeatedly executing the processing process until the rotating table rotates 360 degrees, wherein a certain overlapping area is needed between adjacent frames.
Step S40: acquire the three-dimensional coordinates corresponding to the feature points of the two pictures using the depth camera, and perform stereo matching on the two adjacent pictures according to the stored feature points to obtain the matching relation between them. Using the three-dimensional coordinates of the matched feature points, calculate the rotation matrix R and translation vector t between the two pictures. According to R and t, unify the three-dimensional coordinates of all feature points of the two pictures into the same coordinate system to obtain the full-surface point cloud of the object.
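The patent relies on the depth camera to supply the three-dimensional coordinate of each feature pixel directly. As an illustration, the following is a minimal Python sketch of this lifting step under an assumed standard pinhole model; the factory intrinsics fx, fy, cx and cy are assumptions, since the patent does not specify a camera model.

```python
import numpy as np

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with its metric depth reading to a 3D point in the
    camera frame, using a standard pinhole model. fx, fy, cx, cy are the
    depth camera's factory intrinsics (an assumption; the patent does not
    name a camera model or require a calibration step)."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```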
Step S50: take the three-dimensional coordinates of each feature point and the pose coordinates of the depth camera at the different angles as vertices, and the depth camera's observations of the feature points as edges, to construct a graph optimization model. The error of each edge is a reprojection error; the sum of the errors of all edges serves as the optimization function, which is used to globally optimize the graph model and obtain the three-dimensional information of the object to be reconstructed.
In step S10, the M-array is constructed from the primitive polynomial $f(x) = x^6 + x + 2$ over the Galois field GF(3), and the constructed M-array size is set to 8×91. The generation process of the M-array is as follows: first, according to the recurrence relation $x_n = x_{n-5} + 2x_{n-6}$ and the initial sequence $x_0 = x_1 = x_2 = x_3 = x_4 = x_5 = 1$, recursively generate an array of length 728, where the addition and multiplication operations are computed over GF(3). The elements of the array are then filled, in generation order, into a matrix of size 8×91, cycling from the upper left to the lower right.
Each element of the matrix is replaced by a diamond in one of the three colors red, green and blue, corresponding to the three matrix elements, against a white background, yielding the diamond coding pattern. Several rows of the pattern are cut out and pasted on the surface of the round table; for a round table of 40 cm diameter and 10 cm height, three rows of the pattern are typically required. After the rows of the diamond coding pattern are pasted around the round table, a column of black diamonds finally joins the pattern end to end, sealing the seam so that the diamond pattern uniformly covers the white table surface; the black column also serves as a marker for one full revolution.
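The pattern generation can be illustrated with a short Python sketch. The recurrence coefficients are reconstructed from the stated degree-6 polynomial over GF(3) and the six initial values, so treat $x_n = x_{n-5} + 2x_{n-6} \pmod 3$ as an assumption; the diagonal "cycling" fill and the 2×3-window uniqueness check follow the M-array definition given earlier.

```python
import numpy as np

def generate_sequence(length=728):
    """Recurrence x_n = x_{n-5} + 2*x_{n-6} (mod 3) — reconstructed
    coefficients, an assumption — with six initial values of 1."""
    x = [1, 1, 1, 1, 1, 1]
    while len(x) < length:
        x.append((x[-5] + 2 * x[-6]) % 3)  # arithmetic over GF(3)
    return x[:length]

def fold_to_matrix(seq, rows=8, cols=91):
    """Fill an 8 x 91 matrix in generation order, cycling from top-left
    toward bottom-right; gcd(8, 91) = 1, so 728 steps visit every cell."""
    m = np.zeros((rows, cols), dtype=int)
    r = c = 0
    for v in seq:
        m[r, c] = v
        r = (r + 1) % rows
        c = (c + 1) % cols
    return m

def windows_distinct(m, h=2, w=3):
    """Check the M-array property: every h x w window (with wraparound)
    of the periodic array appears exactly once."""
    rows, cols = m.shape
    seen = set()
    for i in range(rows):
        for j in range(cols):
            win = tuple(m[(i + di) % rows, (j + dj) % cols]
                        for di in range(h) for dj in range(w))
            seen.add(win)
    return len(seen) == rows * cols
```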
For the pixel point (i, j) at row i and column j of the currently captured picture $I_1$, the response S(i, j) is computed, where k is an empirical threshold and p(i, j) denotes the gray value of the pixel at row i, column j.
If S(i, j) > 0, the pixel is a class-one feature point; if S(i, j) < 0, the pixel is a class-two feature point.
Let $I_2(i, j) = |S(i, j)|$, where $I_1$ is the grayscale picture converted from the color picture: S(i, j) is computed from $I_1$ by traversing i and j, and its absolute value is assigned to $I_2$, so that $I_2$ is the processed single-channel response picture. The maximum points in all connected regions of $I_2$ are then extracted using Otsu's thresholding method.
For a class-one feature point, the colors of the nearest diamonds above and below it are detected; for a class-two feature point, the colors of the nearest diamonds to its left and right are detected. The five colors black, red, green, blue and white are represented by the numbers 0, 1, 2, 3 and 4 respectively: red, green and blue are the coding colors of the diamonds, black is the color of the black diamond column that seals the pattern seam, and white is the background color of the pattern, so five colors must be distinguished in the subsequent detection.
For the current pixel p(i, j), the corresponding color channel values are r(i, j), g(i, j) and b(i, j). Let Mic(i, j) = min(r(i, j), g(i, j), b(i, j)) and Mac(i, j) = max(r(i, j), g(i, j), b(i, j)); the color of the current pixel is judged by computing H(i, j) from Mic(i, j) and Mac(i, j).
For a class-one feature point, the six diamond colors at its upper-left, upper-middle, upper-right, lower-left, lower-middle and lower-right are stored as a six-digit base-5 number; this number F(i, j) is the feature descriptor of the class-one feature point.
For example: for a class-one feature point $P_1$ whose feature descriptor is the base-5 number 123323, the colors of the 2×3 diamonds around the point are red, green, blue, blue, green and blue respectively. If the descriptor 123323 is found in both the left and right pictures during stereo matching, the points corresponding to that descriptor are the same point in three-dimensional space.
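The color classification and descriptor packing can be sketched as follows. The color-to-digit mapping (0 = black, 1 = red, 2 = green, 3 = blue, 4 = white) is from the text; the classification thresholds below are illustrative assumptions rather than the patent's H(i, j) rule.

```python
import numpy as np

def classify_color(r, g, b, dark_thr=60, sat_thr=40):
    """Map an RGB sample to a digit: 0=black, 1=red, 2=green, 3=blue,
    4=white. dark_thr and sat_thr are illustrative assumptions standing
    in for the H(i, j) decision rule."""
    mic = min(r, g, b)                  # Mic(i, j) in the text
    mac = max(r, g, b)                  # Mac(i, j) in the text
    if mac < dark_thr:
        return 0                        # dark everywhere -> black
    if mac - mic < sat_thr:
        return 4                        # bright but unsaturated -> white
    return {0: 1, 1: 2, 2: 3}[int(np.argmax([r, g, b]))]  # dominant channel

def descriptor(colors_2x3):
    """Pack the six diamond colors around a class-one feature point
    (upper-left .. lower-right, row by row) into one base-5 number."""
    f = 0
    for digit in colors_2x3:
        f = f * 5 + digit
    return f

# Worked example from the text: digits 1,2,3,3,2,3 -> base-5 "123323"
# -> red, green, blue, blue, green, blue.
assert descriptor([1, 2, 3, 3, 2, 3]) == int("123323", 5)
```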
The algorithm computes feature descriptors only for class-one feature points; class-two feature points assist the class-one points in localization and diamond color detection.
FIG. 3 shows the stereo matching effect for adjacent frames according to an embodiment of the present invention: for two adjacent pictures, the left and right images, the class-one feature points are matched through their feature descriptors.
Judge whether the feature descriptor $F_{left}(i, j)$ of the current feature point of the left image and the feature descriptor $F_{right}(i, j)$ of the current feature point of the right image are the same; if so, the two current feature points are matching points. All matching points with identical feature descriptors are saved.
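Because the descriptors are exact codes rather than floating-point vectors, matching reduces to an equality test; a minimal hash-map sketch, assuming each descriptor occurs at most once per frame:

```python
def match_by_descriptor(left_feats, right_feats):
    """left_feats / right_feats: dicts mapping descriptor -> 3D point.
    Two points correspond iff their six-digit descriptors are identical,
    so a hash lookup replaces the usual nearest-neighbour descriptor
    search (and the ICP iteration it would otherwise require)."""
    common = left_feats.keys() & right_feats.keys()
    return [(left_feats[f], right_feats[f]) for f in sorted(common)]
```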
Calculate the de-meaned three-dimensional coordinates $wl_i = (x_i, y_i, z_i)^T$ of the left-image feature point set and $wr_i = (x_i, y_i, z_i)^T$ of the right-image feature point set, where $R^*$ is the rotation matrix to be solved and $wl_i = R^* wr_i$. Let $A = \sum_i wr_i \, wl_i^T$; singular value decomposition gives $A = U \Sigma V^T$, and thus $R^* = V U^T$.
Calculate the centroid $p_1$ of the three-dimensional coordinates of the left-image feature point set and the centroid $p_2$ of the right-image feature point set; from the rotation matrix $R^*$ calculated above, the translation vector $T^* = p_1 - R^* p_2$ is obtained.
Using the rotation matrix $R^*$ and translation vector $T^*$, unify the three-dimensional coordinates of all feature points of the two pictures of the adjacent frames into the same coordinate system. Perform this process for all pairs of adjacent frames, unifying the three-dimensional coordinates of all feature points of all frames into the world coordinate system, with the depth camera position at the first frame taken as the origin of the world coordinate system, to obtain the full-surface point cloud of the object.
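A sketch of the SVD solution above; the covariance $A = \sum_i wr_i \, wl_i^T$ follows the standard Kabsch algorithm (an assumption, chosen to match $R^* = VU^T$ in the text).

```python
import numpy as np

def align(left_pts, right_pts):
    """Estimate R*, T* with wl_i ~ R* wr_i + T* from matched 3D points
    (one point per row). Mirrors the A = U S V^T, R* = V U^T and
    T* = p1 - R* p2 steps described in the text."""
    p1 = left_pts.mean(axis=0)          # centroid of the left point set
    p2 = right_pts.mean(axis=0)         # centroid of the right point set
    wl = left_pts - p1                  # de-meaned coordinates wl_i
    wr = right_pts - p2                 # de-meaned coordinates wr_i
    A = wr.T @ wl                       # sum_i wr_i wl_i^T (assumed form)
    U, S, Vt = np.linalg.svd(A)
    R = Vt.T @ U.T                      # R* = V U^T
    if np.linalg.det(R) < 0:            # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    T = p1 - R @ p2                     # T* = p1 - R* p2
    return R, T
```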
A graph model is constructed from the full-surface point cloud of the object. The graph model comprises vertices, edges, edge weights and a global optimization function. Vertices: the three-dimensional coordinates of the feature points and the three-dimensional position coordinates of the camera over one revolution. Edges: an edge is created whenever the camera observes a feature point. Edge weight: the reprojection error. Global optimization function: minimize the sum of all reprojection errors.
As the rotary table rotates one full circle, the position of the depth camera corresponding to each frame picture is a node in the world coordinate system, and the depth camera nodes together with the feature points' three-dimensional coordinates are jointly defined as the vertices of the graph model.
If depth camera i can observe feature point j, an edge connects the depth camera's node with feature point j. The weight of the edge is defined as the reprojection error of feature point j in that depth camera: e = z − h(T, p).
The whole graph model is optimized using the LM (Levenberg-Marquardt) method so that the sum of the graph model edge weights is minimized, with the loss function $\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\left\|e_{ij}\right\|^{2}$, where $e_{ij}$ denotes the reprojection error of depth camera pose i observing world point j, m is the number of depth camera poses, and n is the number of feature points.
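A minimal sketch of this graph optimization, using scipy's Levenberg-Marquardt solver as a stand-in for a dedicated graph-optimization library; the angle-axis pose parameterisation and the pinhole form of h(T, p) are assumptions, since the patent does not fix either.

```python
import numpy as np
from scipy.optimize import least_squares

def project(pose, point, fx, fy, cx, cy):
    """h(T, p): rotate/translate a world point into the camera frame and
    apply a pinhole projection. pose = (rx, ry, rz, tx, ty, tz) uses an
    angle-axis rotation (assumed parameterisation)."""
    rvec, t = pose[:3], pose[3:]
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        pc = point + t
    else:
        k = rvec / theta                # Rodrigues' rotation formula
        pc = (point * np.cos(theta) + np.cross(k, point) * np.sin(theta)
              + k * np.dot(k, point) * (1 - np.cos(theta))) + t
    return np.array([fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy])

def residuals(x, m, n, observations, intr):
    """Stacked residuals e_ij = z - h(T_i, p_j). x packs m 6-DoF poses
    followed by n 3D points; observations is a list of (i, j, z) with z
    a length-2 pixel measurement."""
    poses = x[:6 * m].reshape(m, 6)
    points = x[6 * m:].reshape(n, 3)
    res = []
    for i, j, z in observations:
        res.extend(z - project(poses[i], points[j], *intr))
    return np.asarray(res)

# x0 stacks the initial poses and points recovered above; method='lm'
# selects scipy's Levenberg-Marquardt (it needs at least as many
# residuals as variables):
# result = least_squares(residuals, x0, method='lm',
#                        args=(m, n, observations, intr))
```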
Three-dimensional information of the object reconstructed by the depth camera is obtained from the optimized graph model. After optimization, the optimized rotation matrices and translation vectors are obtained from the depth camera poses in the graph model; from these optimized rotation matrices, translation vectors and camera poses, the optimized full-surface point cloud of the object is obtained. The full-surface point cloud contains the three-dimensional coordinates and corresponding RGB color information of each point, and the three-dimensional information of the object is reconstructed from the three-dimensional coordinates and corresponding RGB color information of all the points.
In summary, the embodiment of the invention uses external marker feature points for stereo matching to recover the pose relation of adjacent frames, and the matching process does not depend on the object's texture information. Traditional stereo matching determines the matching relation by computing feature similarity, which yields unstable results and requires computationally expensive ICP iteration. The coding information used by the invention provides exact feature descriptors, so a more accurate matching relation is obtained and ICP computation is avoided.
The invention recovers pose using the matching of image feature points and calculates the rotation matrix and translation vector from depth camera measurements, with no calibration step in the whole process. Compared with full-surface reconstruction on a laser rotary table, the operation is therefore simpler, tolerant of accidental contact with the equipment during operation, and highly fault-tolerant. Moreover, a laser cannot move during reconstruction, so the top of the object is difficult to reconstruct; the invention can move the depth camera in the vertical direction to obtain three-dimensional information of the top of the object.
By designing the feature coding pattern and the corresponding feature detection method, the method performs relative pose recovery; compared with laser rotary-table three-dimensional reconstruction, it needs no recalibration each time, saving time. Because the designed features are pasted on the rotary table, the method provides stable feature points without affecting the surface texture of the object. Finally, since pose errors accumulate over time, the method performs global optimization using the Bundle Adjustment method. The invention can generate a full-surface point cloud of an object cheaply, conveniently and quickly, with accuracy not inferior to that of laser or handheld scanners.
Those of ordinary skill in the art will appreciate that: the drawing is a schematic diagram of one embodiment and the modules or flows in the drawing are not necessarily required to practice the invention.
From the above description of embodiments, it will be apparent to those skilled in the art that the present invention may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present invention.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for apparatus or system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, with reference to the description of method embodiments in part. The apparatus and system embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.

Claims (6)

1. A method for reconstructing three-dimensional information of an object using a depth camera, comprising:
printing and pasting diamond coding patterns on the surface of a round table, placing the round table with the coding patterns on a rotary table, placing an object on the round table, and placing a depth camera in front of the object;
shooting a current frame picture of an object by using a depth camera, sequentially rotating a rotary table by a certain angle, shooting the picture of the object at each angle until the rotary table rotates by 360 degrees, and acquiring and storing feature points and corresponding feature descriptors of each frame of picture;
performing stereo matching on the feature points of the two pictures of each pair of adjacent frames, calculating the rotation matrix and translation vector between the two pictures from the stereo matching result of their feature points, and unifying the three-dimensional coordinates of all feature points of the two pictures into the same coordinate system using the rotation matrix and translation vector; performing this process for all pairs of adjacent frames, and unifying the three-dimensional coordinates of all feature points of all frames into the same coordinate system to obtain a full-surface point cloud of the object;
constructing a graph model by using the full-surface point cloud of the object, performing global optimization on the graph model by using an optimization function, and obtaining three-dimensional information of the object reconstructed by using a depth camera according to the optimized graph model;
the storing of the feature points and the feature descriptors corresponding to the current frame picture comprises:
for the pixel point (i, j) at row i and column j of the picture, calculating the response S(i, j), where k is an empirical threshold and p(i, j) denotes the gray value of the pixel at row i, column j;
if S(i, j) > 0, the pixel is a class-one feature point, and if S(i, j) < 0, the pixel is a class-two feature point;
for a class-one feature point, detecting the colors of the nearest diamonds above and below it; for a class-two feature point, detecting the colors of the nearest diamonds to its left and right; representing the five colors black, red, green, blue and white by the numbers 0, 1, 2, 3 and 4 respectively; for the current pixel (i, j), with color channel values r(i, j), g(i, j) and b(i, j), letting Mic(i, j) = min(r(i, j), g(i, j), b(i, j)) and Mac(i, j) = max(r(i, j), g(i, j), b(i, j)), and judging the color of the current pixel (i, j) by computing H(i, j) from Mic(i, j) and Mac(i, j);
for a class-one feature point, storing the six diamond colors at its upper-left, upper-middle, upper-right, lower-left, lower-middle and lower-right as a six-digit base-5 number, this number F(i, j) being the feature descriptor of the class-one feature point;
the method for three-dimensionally matching the characteristic points of two pictures of the adjacent frame comprises the steps of calculating a rotation matrix and a translation vector between the two pictures of the adjacent frame by utilizing a three-dimensional matching result of the characteristic points of the two pictures of the adjacent frame, unifying three-dimensional coordinates of all the characteristic points of the two pictures of the adjacent frame to the same coordinate system by utilizing the rotation matrix and the translation vector, and comprises the following steps:
for two adjacent pictures, called the left and right images: judging whether the feature descriptor $F_{left}(i, j)$ of the current feature point of the left image and the feature descriptor $F_{right}(i, j)$ of the current feature point of the right image are the same; if so, the current feature points of the left and right images are matching points, and the matching points with identical feature descriptors are saved;
calculating the de-meaned three-dimensional coordinates $wl_i = (x_i, y_i, z_i)^T$ of the left-image feature point set and $wr_i = (x_i, y_i, z_i)^T$ of the right-image feature point set, where $R^*$ is the rotation matrix to be solved and $wl_i = R^* wr_i$; letting $A = \sum_i wr_i \, wl_i^T$, singular value decomposition gives $A = U \Sigma V^T$ and $R^* = V U^T$;
calculating the centroid $p_1$ of the three-dimensional coordinates of the left-image feature point set and the centroid $p_2$ of the right-image feature point set; from the rotation matrix $R^*$ calculated above, obtaining the translation vector $T^* = p_1 - R^* p_2$;
using the rotation matrix $R^*$ and translation vector $T^*$, unifying the three-dimensional coordinates of all feature points of the two pictures of the adjacent frames into the same coordinate system; performing this process for all pairs of adjacent frames, unifying the three-dimensional coordinates of all feature points of all frames into the world coordinate system, with the depth camera position at the first frame taken as the origin of the world coordinate system, to obtain the full-surface point cloud of the object.
2. The method of claim 1, wherein printing and adhering the diamond-shaped code pattern to the surface of the table, placing the table with the code pattern on the turntable, placing the object on the table, and placing the depth camera in front of the object, comprises:
according to the size of the rotary table, designing the diamond coding pattern, filling each diamond of the diamond coding pattern with red, green or blue according to the corresponding number, and printing and pasting the color-filled diamond coding pattern on the surface of the round table;
placing the round table with the diamond coding pattern on the rotary table, placing the object to be reconstructed on the round table, and placing the depth camera in front of the object, ensuring that the field of view of the depth camera contains both the object and the diamond coding pattern on the round table.
3. The method of claim 2, wherein designing the diamond coding pattern according to the size of the rotary table comprises:
according to the recurrence relation $x_n = x_{n-5} + 2x_{n-6}$ and the initial sequence $x_0 = x_1 = x_2 = x_3 = x_4 = x_5 = 1$, recursively generating an array of length 728, and filling each element of the array, in generation order, into a matrix of size 8×91, cycling from top-left to bottom-right;
each element of the matrix is replaced by a diamond in one of the three colors red, green and blue, corresponding to the three matrix elements, against a white background, yielding the diamond coding pattern; several rows of the diamond coding pattern are cut according to the size of the rotary table, and after the rows of the pattern are pasted around the round table, a column of black diamonds finally joins the pattern end to end.
4. A method according to claim 3, wherein the capturing the current frame of the image of the object with the depth camera sequentially rotates the rotation table by a certain angle, capturing the image of the object at each angle until the rotation table is rotated by 360 degrees, and acquiring and storing the feature points and the corresponding feature descriptors of each frame of the image, includes:
for a current frame picture shot by the current depth camera, taking the corner points between diamonds in the diamond coding pattern on the picture as feature points, taking the 2×3 diamond colors around each feature point as its feature descriptor, and storing the feature points and feature descriptors corresponding to the current frame picture;
rotating the rotary table by a certain angle, moving the depth camera, ensuring that the visual field of the depth camera can contain the object and diamond coding patterns on the round table, shooting the next frame of picture again by using the depth camera, and storing the feature points and feature descriptors corresponding to the next frame of picture again;
and repeatedly executing the processing process until the rotating table rotates 360 degrees, wherein a certain overlapping area is needed between two pictures of adjacent frames.
5. The method of claim 4, wherein constructing a graph model using the full-surface point cloud of the object, and globally optimizing the graph model using an optimization function comprises:
constructing a graph model by using the full-surface point cloud of the object, wherein the construction of the graph model comprises vertexes, edges, edge weights and global optimization functions; when the rotary table rotates for one circle, the position of the depth camera corresponding to each frame of picture is at a certain node under the world coordinate system, and the three-dimensional coordinates of the depth camera nodes and all the characteristic points are jointly defined as the vertex of the graph model;
if the depth camera i can observe the feature point j, connecting the node where the depth camera is currently located with the feature point j, and defining the weight of the edge as the reprojection error from the feature point j to the depth camera: e=z-h (T, p);
wherein z is the pixel coordinate of the feature point observed by the camera, h (T, p) is the pixel coordinate of the camera model after re-projecting the world coordinate p under the current external parameter T, and e is a re-projection error;
and optimizing the whole graph model by using an LM method so that the sum of the edge weights of the graph model is minimized, with the loss function $\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\left\|e_{ij}\right\|^{2}$,
wherein $e_{ij}$ denotes the reprojection error of depth camera pose i observing world point j, m is the number of depth camera poses, and n is the number of feature points.
6. The method of claim 5, wherein obtaining three-dimensional information of the object reconstructed with a depth camera from the optimized graph model comprises:
obtaining the optimized rotation matrices and translation vectors from the depth camera poses in the optimized graph model, obtaining the optimized full-surface point cloud of the object from the optimized rotation matrices, translation vectors and camera poses, wherein the full-surface point cloud comprises the three-dimensional coordinates and corresponding RGB color information of each point, and reconstructing the three-dimensional information of the object from the three-dimensional coordinates and corresponding RGB color information of all the points.
CN202011330054.6A 2020-11-24 2020-11-24 Method for reconstructing three-dimensional information of object by using depth camera Active CN112435206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011330054.6A CN112435206B (en) 2020-11-24 2020-11-24 Method for reconstructing three-dimensional information of object by using depth camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011330054.6A CN112435206B (en) 2020-11-24 2020-11-24 Method for reconstructing three-dimensional information of object by using depth camera

Publications (2)

Publication Number Publication Date
CN112435206A CN112435206A (en) 2021-03-02
CN112435206B true CN112435206B (en) 2023-11-21

Family

ID=74694064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011330054.6A Active CN112435206B (en) 2020-11-24 2020-11-24 Method for reconstructing three-dimensional information of object by using depth camera

Country Status (1)

Country Link
CN (1) CN112435206B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113052917B (en) * 2021-03-29 2024-05-07 黑芝麻智能科技(上海)有限公司 Method for acquiring image coordinates of invisible position of camera, calibration method and system
CN113610918A (en) * 2021-07-29 2021-11-05 Oppo广东移动通信有限公司 Pose calculation method and device, electronic equipment and readable storage medium
CN114022619B (en) * 2021-11-26 2022-09-23 贝壳找房(北京)科技有限公司 Image pose optimization method and apparatus, device, storage medium, and program product

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996052A (en) * 2014-05-12 2014-08-20 深圳市唯特视科技有限公司 Three-dimensional face gender classification device and method based on three-dimensional point cloud
CN108038902A (en) * 2017-12-07 2018-05-15 合肥工业大学 A kind of high-precision three-dimensional method for reconstructing and system towards depth camera
DE102016125589A1 (en) * 2016-12-23 2018-06-28 Gritworld GmbH Method and system for an image record for photogrammetric 3D reconstruction
CN108398229A (en) * 2017-12-27 2018-08-14 中国航天空气动力技术研究院 A kind of aircraft three-dimensional surface flow distribution wind-tunnel measurements method
CN108537876A (en) * 2018-03-05 2018-09-14 清华-伯克利深圳学院筹备办公室 Three-dimensional rebuilding method, device, equipment based on depth camera and storage medium
CN108898630A (en) * 2018-06-27 2018-11-27 清华-伯克利深圳学院筹备办公室 A kind of three-dimensional rebuilding method, device, equipment and storage medium
CN110503688A (en) * 2019-08-20 2019-11-26 上海工程技术大学 A kind of position and orientation estimation method for depth camera

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996052A (en) * 2014-05-12 2014-08-20 深圳市唯特视科技有限公司 Three-dimensional face gender classification device and method based on three-dimensional point cloud
DE102016125589A1 (en) * 2016-12-23 2018-06-28 Gritworld GmbH Method and system for an image record for photogrammetric 3D reconstruction
CN108038902A (en) * 2017-12-07 2018-05-15 合肥工业大学 A kind of high-precision three-dimensional method for reconstructing and system towards depth camera
CN108398229A (en) * 2017-12-27 2018-08-14 中国航天空气动力技术研究院 A kind of aircraft three-dimensional surface flow distribution wind-tunnel measurements method
CN108537876A (en) * 2018-03-05 2018-09-14 清华-伯克利深圳学院筹备办公室 Three-dimensional rebuilding method, device, equipment based on depth camera and storage medium
CN108898630A (en) * 2018-06-27 2018-11-27 清华-伯克利深圳学院筹备办公室 A kind of three-dimensional rebuilding method, device, equipment and storage medium
CN110503688A (en) * 2019-08-20 2019-11-26 上海工程技术大学 A kind of position and orientation estimation method for depth camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on feature description and recognition-reconstruction technology of three-dimensional point clouds; 申志强; China Master's Theses Full-text Database; full text *

Also Published As

Publication number Publication date
CN112435206A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
CN112435206B (en) Method for reconstructing three-dimensional information of object by using depth camera
CN109697688B (en) Method and device for image processing
JP6619893B2 (en) Three-dimensional scanning system and scanning method thereof
US10217293B2 (en) Depth camera-based human-body model acquisition method and network virtual fitting system
CN102663820B (en) Three-dimensional head model reconstruction method
Łuczyński et al. The pinax-model for accurate and efficient refraction correction of underwater cameras in flat-pane housings
CN106940704B (en) Positioning method and device based on grid map
CN112013792B (en) Surface scanning three-dimensional reconstruction method for complex large-component robot
CN108898630A (en) A kind of three-dimensional rebuilding method, device, equipment and storage medium
CN108401461A (en) Three-dimensional mapping method, device and system, cloud platform, electronic equipment and computer program product
CN113012293B (en) Stone carving model construction method, device, equipment and storage medium
Sajadi et al. Autocalibration of multiprojector cave-like immersive environments
CN107369204B (en) Method for recovering basic three-dimensional structure of scene from single photo
CN110782521A (en) Mobile terminal three-dimensional reconstruction and model restoration method and system
EP3756163B1 (en) Methods, devices, and computer program products for gradient based depth reconstructions with robust statistics
CN102855620B (en) Pure rotation camera self-calibration method based on spherical projection model
CN114419028A (en) Transmission line insulator defect duplication removing method and device integrating space multiple visual angles
CN109242951A (en) A kind of face's real-time three-dimensional method for reconstructing
CN111981982A (en) Multi-directional cooperative target optical measurement method based on weighted SFM algorithm
Pagani et al. Dense 3D Point Cloud Generation from Multiple High-resolution Spherical Images.
CN107330980A (en) A kind of virtual furnishings arrangement system based on no marks thing
CN115830135A (en) Image processing method and device and electronic equipment
CN112598789A (en) Image texture reconstruction method, device and equipment and storage medium
CN110232664A (en) A kind of mask restorative procedure of exorcising based on augmented reality
KR100944293B1 (en) Mechanism for reconstructing full 3D model using single-axis turntable images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant