CN106981043B - High-precision three-dimensional information rapid acquisition method based on random forest

Info

Publication number: CN106981043B
Application number: CN201611037639.2A
Authority: CN
Inventors: 王琼华; 熊召龙; 邢妍; 董衍煜; 罗令; 邓欢
Original assignee: Sichuan University
Current assignee: Sichuan University
Priority date: 2016-11-23
Filing date: 2016-11-23
Publication date: 2020-04-10
Anticipated expiration: 2036-11-23
Also published as: CN106981043A

Abstract

The invention provides a random-forest-based method for rapid, high-precision acquisition of three-dimensional information. The method requires no stereo matching and so avoids the loss of precision and acquisition rate caused by computationally intensive processing. It comprises three processes: acquisition of learning-model training data, random forest learning, and classification and regression of the optimal depth of deformed speckles. The hardware system consists of a digital projector, a camera, and the measured scene. The method has two working states, an initialization state and a running state, and in the running state three-dimensional information is acquired rapidly and with high precision.

Description

High-precision three-dimensional information rapid acquisition method based on random forest

Technical Field

The invention relates to a three-dimensional information acquisition technology, in particular to a high-precision three-dimensional information rapid acquisition method based on a random forest.

Background

Three-dimensional information acquisition technology obtains the three-dimensional coordinates of a measured scene, and has important significance and broad application prospects in fields such as physical object profiling, industrial inspection, virtual reality, machine vision, and intelligent interaction. Three-dimensional information acquisition methods fall into two main categories: passive and active. Passive methods are based on the stereo matching principle: two or more cameras capture two-dimensional images of the measured scene from different viewing angles. When the texture of the measured scene is too simple, or when its reflectance varies with direction, the computational precision of these methods drops sharply while the computational complexity and processing time rise sharply, so high-precision three-dimensional information cannot be acquired quickly. Active methods mainly use structured-light illumination and offer high precision, but that precision depends heavily on the number of structured-light frames projected and on the degree of dynamic path optimization during computation, and these two factors are precisely what limit the acquisition speed of active methods.

Disclosure of Invention

The invention aims to provide a three-dimensional information acquisition method that is both high-precision and fast. To this end, the invention provides a random-forest-based method for rapid, high-precision acquisition of three-dimensional information. The method requires no stereo matching and so avoids the loss of acquisition precision and acquisition rate caused by computationally intensive processing. It comprises three processes: acquisition of learning-model training data, random forest learning, and classification and regression of the optimal depth of deformed speckles.

The hardware system of the invention consists of a digital projector, a camera, and the measured scene, as shown in Figure 1. The digital projector and the camera are precisely calibrated. The digital projector projects pseudo-random digital speckles onto the surface of the measured scene, and the camera synchronously captures the corresponding deformed speckles. The lens of the digital projector and the lens of the camera have the same focal length, denoted f; the optical centers of the two lenses are separated by a distance b, and this arrangement remains unchanged during operation.

The invention operates in two working states: an initialization state and a running state. The workflow is shown in Figure 2. When the method runs for the first time, it first enters the initialization state: depth values of a number of training scenes are obtained with a high-precision structured-light measurement method and, combined with the local binary feature label of each pixel of the deformed speckles, are used to train the classifier and the regressor of the random forest. The method then enters the running state: the camera captures the deformed speckles of the measured scene, the random forest classifies the local binary feature of each pixel to obtain local binary feature labels, and the random forest then regresses these labels to obtain the optimal depth values corresponding to the deformed speckles. On any later run the method enters the running state directly and performs classification and optimal-depth regression on the deformed speckles. Because the pixels of the deformed speckles are processed independently of one another, in the running state the classification and regression of the optimal depth of the measured scene's deformed speckles are processed in parallel on a graphics processing unit (GPU).
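The two working states can be pictured as a check for a previously trained forest. The following is a minimal sketch only: the function name `get_forest`, the `train_forest` callback, and the use of a pickle file to persist the forest are illustrative assumptions and are not described in the source.

```python
import os
import pickle


def get_forest(model_path, train_forest):
    """Return (forest, state), training the forest only on the first run.

    model_path and train_forest are hypothetical names for this sketch.
    """
    if os.path.exists(model_path):
        # Running state: reuse the forest produced by an earlier initialization.
        with open(model_path, "rb") as fh:
            return pickle.load(fh), "running"
    # Initialization state: train once, then persist the result for later runs.
    forest = train_forest()
    with open(model_path, "wb") as fh:
        pickle.dump(forest, fh)
    return forest, "initialization"
```

On the first call the training callback runs; every later call with the same path skips training and goes straight to inference.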

In the learning-model training data acquisition process, a high-precision structured-light measurement method is first used to acquire the depth values of T training scenes, where the depth value of the t-th training scene is denoted D_t(x, y). From the parameters of the digital projector and the camera, the training scene disparity value d_t(x, y) corresponding to the training scene depth value D_t(x, y) can be obtained:

（1）

where (x, y) are the coordinates, in the pixel coordinate system, of each pixel of the training scene depth values and training scene disparity values. Next, the digital projector projects a pseudo-random speckle image R(x, y) onto the training scene, and the camera captures the corresponding deformed speckle image I_t(x, y). A sliding window of M × N pixels is set and traverses every pixel (x, y) of the deformed speckle image I_t(x, y) to obtain the local binary feature at (x, y). The same processing is applied to all T training scenes, and the resulting training scene disparity values d_t(x, y) and local binary features serve as the learning-model training data, which are input to the random forest for learning.
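The text does not spell out how the local binary feature is computed from the sliding window. The sketch below assumes one plausible definition: each bit is 1 when a window pixel is at least as bright as the pixel at (x, y). The function name, the edge-replication padding at image borders, and the reading of M as window rows and N as window columns are all assumptions made for illustration; the 32 × 32 default follows the window size given in the embodiment.

```python
import numpy as np


def local_binary_feature(image, x, y, m=32, n=32):
    """Binary feature vector for pixel (x, y) from an m-by-n window.

    Assumption (not from the source): a bit is 1 when the window pixel is
    at least as bright as the pixel at (x, y).
    """
    half_m, half_n = m // 2, n // 2
    # Replicate edge pixels so that border pixels also get a full window.
    padded = np.pad(image, ((half_m, half_m), (half_n, half_n)), mode="edge")
    # After padding, image[y, x] sits at index (half_m, half_n) of this window.
    window = padded[y:y + m, x:x + n]
    return (window >= image[y, x]).astype(np.uint8).ravel()
```

Calling the function for every (x, y) of a deformed speckle image yields one feature vector per pixel, which is the per-pixel input the random forest is described as consuming.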

In the random forest learning process, the local binary feature of each pixel of the input deformed speckle image I_t(x, y) and the speckle image R(x, y) projected by the digital projector are first used to compute the disparity label c_t(x, y) of each pixel of I_t(x, y). When the digital projector and the camera are precisely calibrated, the disparity label c_t(x, y) satisfies:

（2）

（3）

where (x', y') are the pixel coordinates in the speckle image R(x, y) that correspond to the disparity label c_t(x, y). The training scene disparity values d_t(x, y) obtained from the T training scenes and the corresponding disparity labels c_t(x, y) form the training set data S, on which multiple trees are trained independently. In the invention each tree has N layers in total. The nodes of the first k layers take integer values and solve a classification problem; the nodes of the last N-k layers take fractional values, and sub-pixel-precision predictions are obtained through a regression function.

The random forest learning process of the invention starts from the root node of each tree. A series of random discrimination parameters δ is drawn; each discrimination parameter δ divides the training set data S into left training set data S_L(δ) and right training set data S_R(δ), and the objective function O(δ) is computed at the same time:

（4）

where E(S), E(S_L(δ)), and E(S_R(δ)) are the information entropies of the training set data S, the left training set data S_L(δ), and the right training set data S_R(δ), respectively. Among the random discrimination parameters δ, the one that maximizes the objective function O(δ) is taken as the final discrimination parameter of the node. The same procedure is applied recursively to the left training set data S_L(δ) and the right training set data S_R(δ) until the training depth reaches the k-th layer of the tree or the training set data can no longer be split. The independently trained trees become decision trees, and these decision trees form the random forest F used in the running state.
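Equation (4) is not reproduced in this text. A common objective that fits the description (entropies of S, S_L(δ), and S_R(δ), maximized over δ) is information gain, O(δ) = E(S) − Σ_{d∈{L,R}} (|S_d(δ)| / |S|) · E(S_d(δ)). The sketch below assumes that form, and further assumes that each δ simply selects one binary feature to test; both are assumptions, and all function names are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy, in bits, of an integer label array."""
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def information_gain(features, labels, delta):
    """Assumed O(delta): entropy of S minus the weighted entropy of S_L and S_R."""
    go_left = features[:, delta] == 0          # samples routed to S_L(delta)
    s_left, s_right = labels[go_left], labels[~go_left]
    weighted = (s_left.size * entropy(s_left)
                + s_right.size * entropy(s_right)) / labels.size
    return entropy(labels) - weighted


def best_split(features, labels, num_candidates, rng):
    """Draw random candidate parameters and keep the one maximizing O(delta)."""
    candidates = rng.choice(features.shape[1], size=num_candidates, replace=False)
    gains = [information_gain(features, labels, int(d)) for d in candidates]
    best = int(np.argmax(gains))
    return int(candidates[best]), gains[best]
```

Applying `best_split` recursively to the two resulting subsets, and stopping at a depth limit or when a subset contains a single label, mirrors the recursion described in the text.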

In the classification and regression of the optimal depth of the deformed speckles, shown in Figure 3, the digital projector projects pseudo-random digital speckles onto the surface of the measured scene; the pixel arrangement and locally magnified structure of the pseudo-random digital speckles are shown in Figure 4. The camera captures the deformed speckle image I'(x, y) of the measured scene. The first k layers of the decision trees of the random forest F classify the local binary feature of each pixel to obtain the local binary feature label c'_I(x, y); the last N-k layers of the decision trees of F then regress this label to obtain the sub-pixel-precision local binary feature label c'(x, y) corresponding to I'(x, y). Because the digital projector and the camera are precisely calibrated, the optimal depth value D'(x, y) of the measured scene satisfies:

（5）

Because each pixel of the measured scene's deformed speckle image I'(x, y) is processed independently of the others, in the running state the classification and regression of the optimal depth of the deformed speckles are processed in parallel on the GPU to obtain a higher processing speed.
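One way to read the two-stage prediction is as a single walk down a tree in which the node reached at the end of the classification layers supplies an integer label and the final leaf supplies a fractional refinement. The sketch below illustrates that reading only: the `Node` layout, the rule that the sub-pixel label is the integer label plus a leaf offset, and the assumption that every internal node has both children are all hypothetical and are not stated in the source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    bit: int = 0                      # index of the binary feature tested here
    left: Optional["Node"] = None     # child taken when the tested bit is 0
    right: Optional["Node"] = None    # child taken when the tested bit is 1
    label: Optional[int] = None       # integer label, set at the end of the first k layers
    offset: float = 0.0               # fractional refinement, read at the final leaf


def predict_label(root: Node, feature: Sequence[int]) -> float:
    """Walk one tree for one pixel; return integer label plus sub-pixel offset."""
    node, label = root, None
    while True:
        if node.label is not None:
            label = node.label        # classification result for this pixel
        child = node.right if feature[node.bit] else node.left
        if child is None:             # no further split: regression leaf reached
            break
        node = child
    if label is None:
        raise ValueError("no classification label found on this path")
    return label + node.offset
```

Because the walk for one pixel never reads another pixel's result, calls to `predict_label` for different pixels can run concurrently, which is the property the text relies on for GPU parallelism.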

In summary, the invention provides a random-forest-based method for rapid, high-precision acquisition of three-dimensional information. The method requires no stereo matching and avoids the loss of precision and acquisition rate caused by computationally intensive processing. The hardware system consists of a digital projector, a camera, and the measured scene. The method has two working states, an initialization state and a running state, and in the running state the three-dimensional information of the measured scene is acquired rapidly and with high precision.

Drawings

FIG. 1 is a hardware system diagram of a high-precision three-dimensional information rapid acquisition method based on random forest

FIG. 2 is a flow chart of the method of the present invention

FIG. 3 is a classification regression process of the optimal depth of deformed speckles in the present invention

FIG. 4 shows the pseudo-random digital speckle pixel arrangement and local magnification structure of the present invention

The reference numbers in the figures are:

1 digital projector, 2 camera, 3 measured scene, 4 deformed speckles of the measured scene, 5 random forest node, 6 first k layers of a random forest decision tree, 7 last N-k layers of a random forest decision tree, 8 optimal depth value of the measured scene, 9 pseudo-random digital speckle pixel arrangement, 10 locally magnified structure of the pseudo-random digital speckles.

It should be understood that the above-described figures are merely schematic and are not drawn to scale.

Detailed Description

An exemplary embodiment of the random-forest-based method for rapid, high-precision acquisition of three-dimensional information is described in detail below to further explain the invention. The following embodiment is for illustration only and should not be construed as limiting the scope of the invention; those skilled in the art may make modifications and adaptations without departing from the scope of the invention.

The invention provides a high-precision three-dimensional information rapid acquisition method based on random forests.

The hardware system of the invention consists of a digital projector, a camera, and the measured scene, as shown in Figure 1. The digital projector and the camera are precisely calibrated. The digital projector projects pseudo-random digital speckles onto the surface of the measured scene, and the camera synchronously captures the corresponding deformed speckles. The lens of the digital projector and the lens of the camera have the same focal length, f = 45 mm; the optical centers of the two lenses are separated by b = 570 mm, and this arrangement remains unchanged during operation.

The invention operates in two working states: an initialization state and a running state. The workflow is shown in Figure 2. When the method runs for the first time, it first enters the initialization state: depth values of a number of training scenes are obtained with a high-precision structured-light measurement method and, combined with the local binary feature label of each pixel of the deformed speckles, are used to train the classifier and the regressor of the random forest. The method then enters the running state: the camera captures the deformed speckles of the measured scene, the random forest classifies the local binary feature of each pixel to obtain local binary feature labels, and the random forest then regresses these labels to obtain the optimal depth values corresponding to the deformed speckles. On any later run the method enters the running state directly and performs classification and optimal-depth regression on the deformed speckles. Because the pixels of the deformed speckles are processed independently of one another, in the running state the classification and regression of the optimal depth of the measured scene's deformed speckles are processed in parallel on the GPU.

In the learning-model training data acquisition process, a high-precision structured-light measurement method is first used to acquire the depth values of T = 13000 training scenes in total, where the depth value of the t-th training scene is denoted D_t(x, y). From the parameters of the digital projector and the camera, the training scene disparity value d_t(x, y) corresponding to the training scene depth value D_t(x, y) can be obtained:

（1）

where (x, y) are the coordinates, in the pixel coordinate system, of each pixel of the training scene depth values and training scene disparity values. Next, the digital projector projects a pseudo-random speckle image R(x, y) onto the training scene, and the camera captures the corresponding deformed speckle image I_t(x, y). A sliding window of M × N = 32 × 32 pixels is set and traverses every pixel (x, y) of the deformed speckle image I_t(x, y) to obtain the local binary feature at (x, y). The same processing is applied to all 13000 training scenes, and the resulting training scene disparity values d_t(x, y) and local binary features serve as the learning-model training data, which are input to the random forest for learning.
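Equation (1) is not reproduced in this text. For a calibrated projector-camera pair with a shared focal length f and optical-center spacing b, the standard triangulation relation is d = f · b / D, and the sketch below assumes equation (1) has that form. The values f = 45 mm and b = 570 mm come from the embodiment; the `pixel_pitch_mm` parameter, which converts the disparity from millimetres on the sensor to pixels, is a hypothetical addition, since the source gives no sensor pixel size.

```python
import numpy as np

F_MM = 45.0    # lens focal length from the embodiment
B_MM = 570.0   # optical-center spacing from the embodiment


def depth_to_disparity(depth_mm, pixel_pitch_mm):
    """Disparity in pixels under the assumed relation d = f * b / D."""
    depth_mm = np.asarray(depth_mm, dtype=np.float64)
    return F_MM * B_MM / (depth_mm * pixel_pitch_mm)


def disparity_to_depth(disparity_px, pixel_pitch_mm):
    """Inverse mapping: depth in millimetres from disparity in pixels."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    return F_MM * B_MM / (disparity_px * pixel_pitch_mm)
```

Both functions accept whole depth or disparity maps as arrays, so one call converts every pixel of a training scene at once.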

（2）

（3）

where (x', y') are the pixel coordinates in the speckle image R(x, y) that correspond to the disparity label c_t(x, y). The training scene disparity values d_t(x, y) obtained from the 13000 training scenes and the corresponding disparity labels c_t(x, y) form the training set data S, on which multiple trees are trained independently. In the invention each tree has N = 15 layers in total. The nodes of the first k = 9 layers take integer values and solve a classification problem; the nodes of the last N-k = 6 layers take fractional values, and sub-pixel-precision predictions are obtained through a regression function.

（4）

（5）

Claims

1. A high-precision three-dimensional information rapid acquisition method based on random forests, characterized by comprising three processes: acquisition of learning-model training data, random forest learning, and classification and regression of the optimal depth of deformed speckles; wherein, in the acquisition of learning-model training data, the depth values of T training scenes are first acquired by a high-precision structured-light measurement method, the depth value of the t-th training scene being denoted D_t(x, y), and, from the parameters of the digital projector and the camera, the training scene disparity value d_t(x, y) corresponding to the training scene depth value D_t(x, y) is obtained as

where (x, y) are the coordinates, in the pixel coordinate system, of each pixel of the training scene depth values and training scene disparity values, f is the focal length of the lens of the digital projector and of the lens of the camera, and b is the spacing between the optical centers of the two lenses; then, the digital projector projects a pseudo-random speckle image R(x, y) onto the training scene, the camera captures the corresponding deformed speckle image I_t(x, y), and a sliding window of M × N pixels is set and traverses every pixel (x, y) of the deformed speckle image I_t(x, y) to obtain the local binary feature at (x, y); finally, the same processing is applied to all T training scenes, and the resulting training scene disparity values d_t(x, y) and local binary features serve as the learning-model training data, which are input to the random forest for learning; the random forest learning process starts from the root node of each tree, a series of random discrimination parameters δ is drawn, each discrimination parameter δ divides the training set data S into left training set data S_L(δ) and right training set data S_R(δ), and the objective function is computed at the same time as

where E(S), E(S_L(δ)), and E(S_R(δ)) are the information entropies of the training set data S, the left training set data S_L(δ), and the right training set data S_R(δ), respectively, and, among the random discrimination parameters δ, the one that maximizes the objective function O(δ) is the final discrimination parameter of the node; the above procedure is applied recursively to the left training set data S_L(δ) and the right training set data S_R(δ) until the training depth reaches the k-th layer of the tree or the training set data can no longer be split, the independently trained trees become decision trees, and the decision trees form the random forest F used in the running state; in the classification and regression of the optimal depth of the deformed speckles, the digital projector projects pseudo-random digital speckles onto the surface of the measured scene, the camera captures the deformed speckle image I'(x, y) of the measured scene, the first k layers of the decision trees of the random forest F classify the local binary feature of each pixel to obtain the local binary feature label c'_I(x, y), the last N-k layers of the decision trees of the random forest F regress the local binary feature label to obtain the sub-pixel-precision local binary feature label c'(x, y) corresponding to the deformed speckle image I'(x, y), and, because the digital projector and the camera are precisely calibrated, the optimal depth value D'(x, y) of the measured scene satisfies

where f is the focal length of the lens of the digital projector and of the lens of the camera, b is the spacing between the optical centers of the lens of the digital projector and the lens of the camera, and (x, y) are the coordinates, in the pixel coordinate system, of each pixel of the training scene depth values and training scene disparity values.