Background technology
LiDAR (Light Detection and Ranging), commonly called lidar, refers to laser scanning and detection systems. LiDAR systems fall into two broad classes: airborne LiDAR systems and terrestrial LiDAR systems; the present invention is mainly directed at terrestrial LiDAR systems in complicated indoor environments. Compared with traditional image-based three-dimensional reconstruction, reconstruction based on 3D laser scanning has the advantages of being fast, accurate and contactless. Because a laser scanner samples discretely, the point cloud data obtained from multiple scanning stations must be transformed into a unified coordinate system to form a point cloud model, which requires image registration. Image registration techniques fall into two broad classes. The first emphasizes the extraction and localization of discrete features in the image data; such techniques need no initial position estimate, but are unsuitable when features are inconspicuous. The second is the well-known ICP (Iterative Closest Point) algorithm, first proposed by Besl and McKay; this algorithm repeatedly computes the rigid-body transformation between corresponding points in the overlap area of two images and iterates until it finds the rotation matrix and translation vector that minimize the squared distance error between corresponding points. This class has better robustness and accuracy than the first class of registration algorithms, but the convergence of the original ICP algorithm depends mainly on the choice of corresponding point pairs and the minimization of the error function; much subsequent research has therefore been devoted to strengthening the stability of ICP through different error functions and different methods of choosing corresponding point pairs.
Kinect is a motion-sensing device developed by Microsoft. It belongs to the class of novel sensors known as RIM (Range Imaging) cameras [2] and is mainly used to capture skeleton structure, realizing the concept of using the body as a controller. Kinect consists of an emitting unit (pulsed light, modulated light or structured light), a photosensitive sensor (CCD, CMOS or APD), an optical system, and some driving circuits and computing units. The device synchronously acquires scene depth and image information and can automatically complete scene texture mapping; the depth data obtained by Kinect is structured point cloud data, and the present invention adopts a voxel-based data fusion method to reconstruct the three-dimensional model. (Reference: Remondino, F., Heritage Recording and 3D Modeling with Photogrammetry and 3D Scanning. Remote Sensing, 2011.)
Data fusion mainly adopts voxel techniques. As early as 1974, Baumgart proposed the Volume Intersection method, in which each voxel has only two states, 0 and 1: 1 means the voxel is occupied by the reconstructed target, 0 means it is not. In 1989, A. Elfes proposed the Occupancy Grid concept while using sonar for mobile robot navigation; it divides the voxels of the whole space into three classes, Occupied, Free and Unknown, and expresses the occupancy state of space with a probability function, enabling autonomous robot navigation and localization. Hoppe reconstructed object surfaces by constructing a signed distance field (Signed Distance Function) of points on the surface. To improve the unsatisfactory boundary reconstruction of Hoppe's method, and at the same time to solve the local "island" problem that may occur during normal propagation, Curless and Levoy store two values in every voxel, a weight and a distance value, converting each depth image into a Weighted Signed Distance Function.
After Kinect was released, the University of Washington and Intel jointly developed the research project RGB-D Dense Point Cloud Mapping, intended to use Kinect to realize automatic robot mapping. Because the project paid little attention to mapping precision, the reconstruction results were not ideal: the scene contained many "holes" and exhibited "ghosting". In August 2011, Microsoft demonstrated its KinectFusion project achievements at the SIGGRAPH conference, aiming to use Kinect to realize Augmented Reality. To address the low precision of the raw Kinect depth map, KinectFusion exploits the high-frequency output of the device, proposes a volumetric reconstruction technique together with a point-to-plane ICP algorithm, and adopts Truncated Signed Distance Functions with GPU parallel acceleration, achieving real-time fine three-dimensional modeling with millimeter-level precision.
Summary of the invention
The present invention mainly solves the technical problems of the prior art that the scanning range of traditional laser scanners is limited and that, especially in complicated indoor scenes, data collection cannot be completed for narrow regions. It provides an indoor LiDAR missing-data completion method based on Kinect that fully exploits the features of the novel motion-sensing device Kinect: it is cheap, obtains scene depth and image information simultaneously, outputs at 30 frames per second, acquires a large amount of depth information, and keeps the collected information complete, thereby achieving collection of local data.
The above technical problems of the present invention are mainly solved by the following technical proposals:
An indoor LiDAR missing-data completion method based on Kinect, characterised in that it comprises the following steps:
Step 1: scan the indoor scene with LiDAR equipment to obtain scan data, i.e. the images scanned by the LiDAR device. For the regions missed during LiDAR scanning, scan again with a Kinect device and extract the key frames of the scanning process to obtain sparse scan data, i.e. sparse RGB-D images.
Step 2: apply the SIFT algorithm to extract features from the RGB-D images collected by the Kinect device in step 1, and use the RANSAC operator to reject abnormal feature matching points among the extracted features.
Step 3: merge the SIFT features from which abnormal matching points were rejected in step 2.
Step 4: apply the SIFT algorithm to extract features from the images scanned by the LiDAR device in step 1, coarsely match them with the merged Kinect SIFT features from step 3, and obtain a transformation matrix.
Step 5: finely match the coarsely matched LiDAR images from step 4 with the Kinect RGB-D images to obtain a refined transformation matrix.
Step 6: fuse the finely matched LiDAR images from step 5 with the partial missing data scanned by Kinect as RGB-D images, obtaining a complete scan image.
It should be explained here that an image is simply an image, while a model contains three-dimensional coordinate information; here it mainly refers to the point cloud model. The lidar and Kinect data collection processes yield not only image data (RGB images and depth information) but also point cloud data (a series of point clouds from which a point cloud model can be generated).
In the above indoor LiDAR missing-data completion method based on Kinect, in step 3, because the images obtained from adjacent key frames have considerable overlap, the SIFT features obtained in step 2 are likewise repetitive. Through feature extraction and feature mapping, each SIFT feature point has a one-to-one corresponding point (x, y, z) in the point cloud data, called its three-dimensional mapping point. The three-dimensional mapping points are obtained from the voxel model and are therefore in a unified coordinate system, so if two SIFT features are identical, the coordinates of their corresponding three-dimensional mapping points are close or identical. Feature merging is therefore performed on the three-dimensional mapping points, i.e. SIFT features are merged by proximity-point clustering, described as follows:
Given a multidimensional space R^k, a vector in R^k is a sample point, and a finite set of such sample points is called a sample set. Given a sample set E and a sample point s', the nearest neighbour of s' is the sample point s ∈ E that satisfies Nearest(E, s', s), where Nearest is defined as:

Nearest(E, s', s) ⟺ ∀s'' ∈ E: dist(s', s) ≤ dist(s', s'')
The distance metric in the above formula is the Euclidean distance, namely

dist(s', s) = ( Σ_{i=1}^{k} (s'_i − s_i)² )^{1/2}

where s_i is the i-th component of the vector s.
In the above indoor LiDAR missing-data completion method based on Kinect, step 4 further comprises the following sub-steps:
Step 4.1: choose three pairs of points from the registered data sets and obtain the three-dimensional coordinates of these three pairs in the source point cloud data set S and the target point cloud data set T.
Step 4.2: compute the transformation from the three point pairs of step 4.1, obtaining a transformation matrix H.
Step 4.3: count the number of inliers under the transformation matrix H_i of the current iteration; if it is greater than the set threshold, update the model parameters using a least-squares computation over all inliers; if it is less than the set threshold, proceed to the next iteration.
Step 4.4: after K iterations, find the H_f containing the greatest number of inliers and take it as the final transformation matrix.
In the above indoor LiDAR missing-data completion method based on Kinect, step 6 further comprises the following sub-steps:
Step 6.1: construct a voxel model for the data scanned by Kinect. Define a three-dimensional volume of size N×M×Q and divide this volume into voxels of size L×P×O, where N, M, Q, L, P, O are positive integers. (The voxel division may be freely defined, but for more efficient data organization it is usually divided into equal voxels.) At the initial time, no voxel contains any data, and each voxel is given an initial value.
Step 6.2: embed the LiDAR model into the Kinect voxel model. Because the data volume of the LiDAR scan is large, building an independent voxel model would seriously affect data-processing efficiency; therefore the LiDAR model is embedded into the voxel model already built for Kinect. Since the LiDAR images and Kinect images were finely matched in step 5, the LiDAR model points among the matched point pairs are chosen and added into the Kinect voxel model.
Step 6.3: assign values to the voxels. Because the LiDAR data and Kinect data have been normalized into a unified coordinate system, the obtained depth data are now deposited into the voxel model, and the data stored by each voxel is updated accordingly.
Step 6.4: fuse the depth data. After the above steps have been executed, the same voxel holds multiple depth values; because these are affected by the surface conditions of the object and by the sensor, their errors differ, so the depth data must be fused. The concrete method is:
Step A: first perform weight analysis on each vertex. The weight range is 0 to 1, and the weight depends on the angle between the vertex normal vector and the viewing ray: the larger the angle, the smaller the weight. At the same time, boundary points receive smaller weights. Weights of non-vertex points are computed by linear interpolation:
Step B: then fuse the depth data according to the above weights. The function D(x) is the fused depth value, computed from the depth values {d_i(x) | i = 0, …, n}, where d_i(x) is the depth value of each model, and the corresponding weight information {w_i(x) | i = 0, …, n}, where w_i(x) is the weight value of each model, by the following formulas:

D(x) = Σ_i w_i(x) d_i(x) / W(x)

W(x) = Σ_i w_i(x)
Therefore, the present invention has the following advantages: it fully exploits the features of the novel motion-sensing device Kinect, which is cheap, obtains scene depth and image information simultaneously, outputs at 30 frames per second, acquires a large amount of depth information, and keeps the collected information complete, achieving collection of local data.
Embodiment
The technical scheme of the present invention is described in further detail below by way of embodiments and with reference to the accompanying drawings.
First, the theoretical basis of the present invention is introduced:
1.1, RGB-D data
RGB-D data is in fact point cloud data with color information; it is referred to here as RGB-D data mainly to emphasize that Kinect synchronously obtains a color image while acquiring depth information, and that the two sets of data can be mapped to each other one-to-one, which is significant for reconstructing three-dimensional models with photorealism. Different devices may record in different ways, but there are broadly two classes: one records a scalar distance value, the other records the three-dimensional coordinates of the point. The depth image obtained by Kinect scanning records a scalar distance value, as shown in formula (1).
R(i, j) = { r_{i,j} | 0 ≤ i ≤ row−1, 0 ≤ j ≤ col−1 } (1)
The row and column counts of a depth image obtained with Kinect are 480 and 640 respectively. Concretely, each pixel value of the depth image can only lie within a certain range because of the limits of the sensor; for Kinect, the farthest detection depth can reach 7 meters, but the detection results are best between 1.2 and 3.5 meters. In practice the raw value recorded by the Kinect depth device lies in the range 0 to 2047 and must be converted into an actual distance, as shown in formula (2).
RealDistance = 1.0 / (raw_depth × (−0.0030711016) + 3.3309495161) (2)
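Formula (2) can be applied per pixel as a simple function; in this sketch, treating the maximum raw value 2047 as an invalid ("no return") reading is an assumption based on the stated 0–2047 range, not something the formula itself specifies:

```python
def raw_depth_to_meters(raw_depth):
    """Convert an 11-bit Kinect raw depth value (0..2047) to a distance
    in meters using the inverse-linear calibration of formula (2)."""
    if raw_depth >= 2047:  # assumed sentinel for an invalid pixel
        return None
    return 1.0 / (raw_depth * -0.0030711016 + 3.3309495161)

# A raw value of 0 maps to roughly 0.30 m, the near limit of the sensor;
# larger raw values map to larger distances.
near = raw_depth_to_meters(0)
```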
1.2, ICP algorithm
Among point cloud registration algorithms, the most widely used is the ICP algorithm [21, 30] proposed by Besl and Chen. The method is applicable not only to point cloud data but also to other surface data. Its process can be divided into two steps: the first establishes the corresponding point set, and the second establishes the coordinate transformation matrix between the point clouds from the corresponding point set. The two steps are then iterated until the error function meets the accuracy requirement; the essence of ICP is thus an optimal matching method based on least squares. Assume the point sets sampled from the two point clouds are P1 and P2, and denote the transformation matrix T = [R | t]; then the final result of ICP convergence minimizes the error function, as shown in formula (3).

f(R, t) = Σ_{i=1}^{N} || p2_i − (R · p1_i + t) ||² (3)

where p1_i ∈ P1 and p2_i ∈ P2 are corresponding points.
The key to the ICP algorithm is establishing the corresponding point set, which determines the convergence speed and the final splicing precision. The main steps of ICP are: (1) sample the original point cloud data; (2) determine an initial corresponding point set; (3) remove erroneous corresponding points; (4) solve the coordinate transformation.
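The two-step ICP loop described above can be illustrated with a minimal sketch. This is a generic two-dimensional example for clarity only, not the algorithm of the invention (which operates on 3-D point clouds): each iteration pairs every source point with its nearest destination point, then solves the least-squares rotation and translation for those pairs in closed form.

```python
import math

def icp_2d(src, dst, iters=20):
    """Minimal point-to-point ICP sketch in 2-D (illustrative only)."""
    src = [list(p) for p in src]
    for _ in range(iters):
        # Step 1: establish correspondences by nearest neighbour.
        pairs = [(p, min(dst, key=lambda d: (d[0]-p[0])**2 + (d[1]-p[1])**2))
                 for p in src]
        n = len(pairs)
        # Step 2: closed-form least-squares rigid transform (2-D Horn method).
        cpx = sum(p[0] for p, _ in pairs) / n
        cpy = sum(p[1] for p, _ in pairs) / n
        cqx = sum(q[0] for _, q in pairs) / n
        cqy = sum(q[1] for _, q in pairs) / n
        sxx = sum((p[0]-cpx)*(q[0]-cqx) + (p[1]-cpy)*(q[1]-cqy) for p, q in pairs)
        sxy = sum((p[0]-cpx)*(q[1]-cqy) - (p[1]-cpy)*(q[0]-cqx) for p, q in pairs)
        theta = math.atan2(sxy, sxx)
        c, s = math.cos(theta), math.sin(theta)
        tx, ty = cqx - (c*cpx - s*cpy), cqy - (s*cpx + c*cpy)
        # Apply the estimated transform to the source points.
        for p in src:
            x, y = p
            p[0], p[1] = c*x - s*y + tx, s*x + c*y + ty
    return src
```

When the initial displacement is small enough that nearest-neighbour pairing is correct, a single iteration already recovers the exact rigid transform; real data needs the iteration because correspondences are initially wrong.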
1.3, Voxel fusion
A so-called volume element, or voxel (Voxel), rests on the core idea of subdividing two- or three-dimensional space into contiguous small blocks; in three dimensions these subdivided regions are called voxels. When fusing multiple depth images to reconstruct a three-dimensional model of the real world, the real world must first be mapped into cyberspace: a three-dimensional volume (3D Volume) is predefined, say of size 3m × 3m × 3m, and this volume is then divided into regular small blocks, i.e. voxels, say 512 × 512 × 512 of them; initially no voxel contains any data. After camera position tracking, every frame of data is normalized into the same coordinate system; the newly obtained depth data can then be deposited into this voxel model, and the data stored by each voxel is updated accordingly.
After two sets of point cloud data have been registered, they can be stitched together well, and the method of surface merging can then be adopted to reconstruct the complete object surface. To solve the problems existing in surface merging, an implicit function can be used to reconstruct the complete object surface. The key idea of voxel reconstruction is to represent every sampling point with a continuous implicit function D(x); in this function every point x carries weight information and distance information (Weighted Signed Distance). The distance from the voxel x along the viewing direction P to the intersection with the object surface is exactly the value that the distance function d_i(x) needs to record, as shown in Figure 1.
As can be seen, the distance value obtained by computing the signed distance function (Signed Distance Function, SDF) may be positive or negative: a positive value means that along the ray direction the voxel lies in front of the reconstructed surface, a negative value means it lies behind it, and the positive-negative crossing is the real surface of the reconstructed object; that is, D(x) = 0 is exactly the reconstructed object surface. As data keep arriving, the value recorded by each voxel must be updated, and we know that when depth data of the same area are gathered from different directions, the errors differ because of the surface conditions of the object and the performance of the sensor; therefore a weighted average should be taken during data fusion. If the weights are set improperly, the reconstructed surface is likely to exhibit "folds", as shown in Figures 2 and 3.
Using voxel techniques, a high-accuracy three-dimensional model can be reconstructed by fusing the depth data, where D(x) = 0 is the reconstructed object surface; that is, what the above method actually reconstructs is the surface model of the object, which can be regarded as 2.5-dimensional. Therefore, among the numerous voxels, only a small portion actually record useful surface information. After the scanning of a scene is completed, only the part of the volume data where D(x) = 0 needs to be kept, which reduces the data volume, increases data-processing efficiency, and can be used for the model splicing described below.
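The weighted signed-distance update described above can be sketched in miniature. The sketch below reduces the volume to one dimension for brevity (a real volume is 3-D, e.g. 512 × 512 × 512); the running-average update rule D ← (W·D + w·d)/(W + w), W ← W + w is the standard weighted-fusion form this section describes, and the zero crossings of D mark the reconstructed surface D(x) = 0.

```python
class TSDFVolume:
    """Tiny 1-D illustration of weighted signed-distance fusion.
    Each voxel stores a running weighted distance D and a cumulated weight W."""

    def __init__(self, n):
        self.D = [0.0] * n  # weighted signed distance per voxel
        self.W = [0.0] * n  # accumulated weight per voxel

    def integrate(self, distances, weights):
        """Fold one new depth observation into the volume."""
        for i, (d, w) in enumerate(zip(distances, weights)):
            if w <= 0.0:
                continue
            self.D[i] = (self.W[i] * self.D[i] + w * d) / (self.W[i] + w)
            self.W[i] += w

    def surface_crossings(self):
        """Indices where D changes sign, i.e. where D(x) = 0 is crossed:
        the reconstructed object surface."""
        return [i for i in range(len(self.D) - 1)
                if self.D[i] * self.D[i + 1] < 0]
```

Integrating two noisy observations of the same surface averages their distance values, so the recovered zero crossing is steadier than either single scan.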
Specific embodiments of the present invention are introduced below:
Step 1: extract key frames from the Kinect scanning process to obtain sparser scan data. The regions missed by the LiDAR equipment during its scan are scanned with the Kinect device. Because the Kinect device gathers data at 30 frames per second and the repeated region between adjacent frames is large, key-frame extraction obtains effective local scan data under the premise of guaranteeing the integrity of the missing scene data, reducing later data-processing time. The present invention determines the insertion of key frames by directly computing the angular deflection and translation of the camera.
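The key-frame rule just described might be sketched as follows. The pose representation (a yaw angle plus a 3-D position) and the threshold values are illustrative assumptions of this sketch; the invention states only that angular deflection and translation relative to the last key frame decide insertion.

```python
import math

def select_keyframes(poses, angle_thresh=math.radians(15), trans_thresh=0.3):
    """Keep a frame as a key frame when its camera pose differs from the
    last key frame by more than an angular or translational threshold.
    Each pose is (yaw_radians, x, y, z) -- an assumed representation."""
    keys = [0]  # the first frame is always a key frame
    for i, (a, x, y, z) in enumerate(poses[1:], start=1):
        ka, kx, ky, kz = poses[keys[-1]]
        d_angle = abs(a - ka)
        d_trans = math.sqrt((x - kx)**2 + (y - ky)**2 + (z - kz)**2)
        if d_angle > angle_thresh or d_trans > trans_thresh:
            keys.append(i)
    return keys
```

Frames that barely move relative to the last key frame are skipped, which is what thins the 30 fps stream down to sparse scan data.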
Step 2: feature extraction based on the Kinect RGB images. Because the RGB images and the point cloud data are registered, the feature points extracted from the RGB images can be mapped onto the point cloud data and used as features of the point cloud. Feature extraction on the RGB images is realized with the SIFT operator. Note, however, that the point cloud data contain a large amount of invalid data, so validity must be checked when mapping SIFT features onto the point cloud. Invalid data generally occur in two situations: first, the visual range of the depth image differs from that of the color image, so a region present in the color image may receive no depth information in the depth image; second, a SIFT feature may lie on the edge of an object, and we know that point cloud data are unreliable at edges. At the same time, the SIFT features are obtained mainly to compute the rigid transformation matrix, so there is no need to preserve too many of them.
Step 3: merge the features from step 2. Because the images obtained from adjacent key frames have considerable overlap, the SIFT features obtained in step 2 are likewise repetitive. Through feature extraction and feature mapping, each SIFT feature point has a one-to-one corresponding point (x, y, z) in the point cloud data, which for convenience of the following description we call its three-dimensional mapping point. The three-dimensional mapping points are obtained from the voxel model and are therefore in a unified coordinate system. If two SIFT features are identical, the coordinates of their corresponding three-dimensional mapping points should be very close, as shown in Figure 6. Using this property, the three-dimensional mapping points can be used for feature merging. The present invention merges SIFT features by proximity-point clustering, described as follows:
Given a multidimensional space R^k, a vector in R^k is a sample point, and a finite set of such sample points is called a sample set. Given a sample set E and a sample point s', the nearest neighbour of s' is the sample point s ∈ E that satisfies Nearest(E, s', s), where Nearest is defined as:

Nearest(E, s', s) ⟺ ∀s'' ∈ E: dist(s', s) ≤ dist(s', s'')
The distance metric in the above formula is the Euclidean distance, namely

dist(s', s) = ( Σ_{i=1}^{k} (s'_i − s_i)² )^{1/2}

where s_i is the i-th component of the vector s.
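The proximity-point clustering above might be realized as a simple greedy merge over the three-dimensional mapping points; the tolerance radius below is an assumed parameter, since the invention does not fix a numeric value here.

```python
def merge_mapped_points(points, radius=0.01):
    """Greedy proximity clustering sketch: a 3-D mapping point whose
    Euclidean distance to an already-kept point is below `radius` is
    treated as the same SIFT feature and dropped."""
    merged = []
    for p in points:
        dup = any(sum((a - b) ** 2 for a, b in zip(p, q)) < radius ** 2
                  for q in merged)
        if not dup:
            merged.append(p)
    return merged
```

Because overlapping key frames map repeated features to nearly identical coordinates in the unified system, this collapses each repeated feature to a single representative.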
Step 4: extract features from the LiDAR images and coarsely match them with the Kinect device features to obtain a transformation matrix. As in step 2, feature extraction on the LiDAR images also uses the SIFT operator. The features obtained from the LiDAR equipment and the Kinect device are coarsely matched using the RANSAC algorithm, which mainly comprises the following steps:
Step 4.1: choose three pairs of points from the registered data sets and obtain the three-dimensional coordinates of these three pairs in the source point cloud data set S and the target point cloud data set T.
Step 4.2: compute the transformation from the above three point pairs, obtaining a transformation matrix H.
Step 4.3: count the number of inliers under the transformation matrix H_i of the current iteration; if it is greater than the set threshold, update the model parameters using a least-squares computation over all inliers; if it is less than the set threshold, proceed to the next iteration.
Step 4.4: after K iterations, find the H_f containing the greatest number of inliers and take it as the final transformation matrix.
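The hypothesise-count-refine structure of steps 4.1–4.4 can be sketched as below. To keep the example short, it estimates a pure 2-D translation from single-pair minimal samples instead of the full rigid transform H computed from three 3-D pairs; this simplification is an assumption of the sketch, not the method of the invention.

```python
import random

def ransac_translation(matches, iters=100, inlier_tol=0.05, min_inliers=3, seed=0):
    """RANSAC loop sketch: hypothesise a model from a minimal sample,
    count inliers, keep the model with the most inliers, then refine it
    by least squares over all its inliers (here: averaging)."""
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    for _ in range(iters):
        (sx, sy), (tx, ty) = rng.choice(matches)   # minimal sample: one pair
        t = (tx - sx, ty - sy)                     # hypothesised translation
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - t[0]) < inlier_tol
                   and abs(m[1][1] - m[0][1] - t[1]) < inlier_tol]
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = t, inliers
    if len(best_inliers) >= min_inliers:           # least-squares refinement
        n = len(best_inliers)
        best_t = (sum(q[0] - p[0] for p, q in best_inliers) / n,
                  sum(q[1] - p[1] for p, q in best_inliers) / n)
    return best_t, best_inliers
```

An outlier match can never gather a majority of inliers, so the final model is computed from consistent pairs only.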
Step 5: finely match the LiDAR images with the Kinect RGB-D images. A reliable transformation matrix can be computed by the RANSAC algorithm, but applying that transformation matrix only realizes coarse matching between the two point clouds; matching them with this matrix alone does not yield a perfect splicing result, so on the basis of the coarse matching the ICP algorithm must be used to realize fine matching between the point clouds. Traditional ICP, however, requires that any point in one depth image can find a corresponding matching point in the other, whereas in the present invention the LiDAR image overlaps the Kinect depth image only in part, so traditional ICP is insufficient for fine matching. Addressing this problem, the present invention makes two improvements to ICP: removing matching points that are far apart, and removing point pairs on mesh (face sheet) boundaries, i.e. all matching points at boundaries are rejected. A new matrix can be computed from the improved ICP matching, and applying this matrix to the two point clouds realizes accurate splicing of the data.
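The two pair-rejection improvements of step 5 amount to a filtering pass over the candidate correspondences before the transform is solved. In this sketch, the distance threshold and the representation of boundary membership as a per-pair flag are illustrative assumptions:

```python
def filter_pairs(pairs, boundary_flags, max_dist=0.1):
    """Reject corresponding point pairs that are (a) farther apart than
    `max_dist`, or (b) on a mesh-sheet boundary, as marked by
    boundary_flags[i]. Each pair is ((x,y,z), (x,y,z))."""
    kept = []
    for (p, q), on_boundary in zip(pairs, boundary_flags):
        d2 = sum((a - b) ** 2 for a, b in zip(p, q))
        if on_boundary or d2 > max_dist ** 2:
            continue  # rejected pair: does not contribute to the transform
        kept.append((p, q))
    return kept
```

Running this filter inside each ICP iteration keeps points from the non-overlapping LiDAR region, which have no true counterpart, from dragging the transform estimate.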
Step 6: fuse the LiDAR model with the missing-part model scanned by Kinect, mainly comprising two steps, voxel construction and depth data fusion, as follows.
Step 6.1: construct a voxel model for the data scanned by Kinect. Define a three-dimensional volume (3D Volume) of size 3m × 3m × 3m and divide it into 512 × 512 × 512 voxels; initially no voxel contains any data, and each voxel is given an initial value.
Step 6.2: embed the LiDAR model into the Kinect voxel model. Because the data volume of the LiDAR scan is large, building an independent voxel model would seriously affect data-processing efficiency, so the LiDAR model is embedded into the voxel model already built for Kinect. In step 5 the LiDAR images and Kinect images were finely matched; the LiDAR model points among the matched point pairs are chosen and added into the Kinect voxel model.
Step 6.3: assign values to the voxels. Because the LiDAR data and Kinect data have been normalized into a unified coordinate system, the obtained depth data are now deposited into the voxel model, and the data stored by each voxel is updated accordingly.
Step 6.4: fuse the depth data. After the above steps have been executed, the same voxel holds multiple depth values; because these are affected by the surface conditions of the object and by the sensor, their errors differ, so the depth data must be fused.
Step A: first perform weight analysis on each vertex. The weight range is 0 to 1, and the weight depends on the angle between the vertex normal vector and the viewing ray: the larger the angle, the smaller the weight. At the same time, boundary points receive smaller weights. Weights of non-vertex points are computed by linear interpolation, as shown in Figure 1.
Step B: then fuse the depth data according to the above weights. The function D(x) is the fused depth value, computed from the depth values {d_i(x) | i = 0, …, n}, where d_i(x) is the depth value of each model, and the corresponding weight information {w_i(x) | i = 0, …, n}, where w_i(x) is the weight value of each model, by formulas (6) and (7):

D(x) = Σ_i w_i(x) d_i(x) / W(x) (6)

W(x) = Σ_i w_i(x) (7)
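For a single voxel x, formulas (6) and (7) reduce to a weighted average, sketched below; returning None for a voxel with no observations is an assumption of the sketch.

```python
def fuse_depths(depths, weights):
    """Weighted fusion of the per-model depth values of one voxel:
    D(x) = sum_i w_i(x) * d_i(x) / W(x), with W(x) = sum_i w_i(x)."""
    W = sum(weights)            # W(x), formula (7)
    if W == 0:
        return None             # no observation recorded for this voxel
    return sum(w * d for w, d in zip(weights, depths)) / W

# Example: two depth readings 1.0 and 2.0 with weights 1 and 3
# fuse to (1*1.0 + 3*2.0) / 4 = 1.75, pulled toward the higher-weight reading.
```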
The specific embodiments described herein are merely illustrative of the spirit of the present invention. Those skilled in the art may make various modifications or supplements to the described embodiments, or substitute them in similar ways, without deviating from the spirit of the present invention or exceeding the scope defined by the appended claims.