Kinect-based real-time depth map and color map registration and optimization method
Technical Field
The invention relates to the fields of image processing, numerical analysis, three-dimensional reconstruction, computer science and parallel computation, and in particular to a Kinect-based method for registering and optimizing a real-time depth map and a color map.
Background
Depth generation from video is a classic problem in computer vision. To date, no algorithm can guarantee accurate depth generation for complex scenes while also guaranteeing real-time performance. External capture devices are therefore preferred for acquiring depth and video simultaneously, ensuring both the real-time availability and the accuracy of the depth information.
In recent years, Microsoft has introduced the Kinect, a light, small and inexpensive device that acquires a depth image and a color image of a scene in real time. However, because environmental factors in the scene interfere with the infrared light, the depth image contains many hole points, and the device performs poorly when high depth-map quality is required. Moreover, the positions and resolutions of the Kinect's infrared receiving camera and color camera differ, so the pixels of the depth image and the color image do not correspond. Although Microsoft provides a depth-to-color mapping method in the Kinect SDK, that method is based on DIBR technology and projects a low-resolution image onto a high-resolution one, which generates a large number of hole points and causes ghosting, occlusion and other problems.
Disclosure of Invention
To overcome the poor real-time performance, stability and accuracy of existing methods for mapping the depth map to the color map, the invention provides a Kinect-based real-time depth map and color map registration and optimization method with good real-time performance, good stability and high accuracy.
To this end, the invention adopts the following technical scheme:
a Kinect-based real-time depth map and color map registration and optimization method comprises the following steps:
(1) connecting the Kinect device by using the application programming interface Kinect SDK 2.0 provided by Microsoft;
(2) acquiring an original depth data stream and a color data stream;
(3) filling fine holes of the depth map by using the background frame;
(4) filling large holes of the depth map by bidirectional linear-scan polling interpolation;
(5) obtaining the coordinate mapping relation m_pColorCoordinates between the depth image and the color image through the function MapDepthFrameToColorSpace provided by Kinect SDK 2.0;
(6) handling the abnormal-coordinate parts of the mapping table m_pColorCoordinates;
(7) establishing a depth image color_depth aligned with the color image;
(8) cropping the color image and the aligned depth image, removing the regions where the shooting areas of the depth camera and the color camera do not overlap, thereby achieving complete alignment of the depth image and the color image.
Further, in step (3), the method for filling the small holes of the depth map using the background frame is as follows:
3.1) background frame generation: the first depth image read, Dep0, is taken as the background frame; the background frame is then updated according to
BacDep(x, y) = Dep0(x, y),  if Dep0(x, y) > BacDep(x, y) + 10
where BacDep(x, y) is the depth value of each pixel of the background frame and Dep0(x, y) is the depth value of each pixel of the current depth image;
3.2) filling the small holes of the current depth frame using the background frame: for each pixel of the depth map, if and only if Dep0(x, y) < boundary value and BacDep(x, y) > boundary value (where Dep0 is the depth image, BacDep is the background image, and the boundary value distinguishes whether a pixel is a hole point), the following steps are performed:
3.2.1) set the expansion radius expand to 1 and execute 3.2.2);
3.2.2) extend the point (x, y) in Dep0 to the adjacent pixels (x + expand, y), (x - expand, y), (x, y + expand), (x, y - expand); if any of these points belongs to the background frame, that is, its absolute difference from the corresponding background-frame pixel is less than 10, then set Dep0(x, y) = BacDep(x, y);
3.2.3) if expand has not reached the upper limit of expansion scans, increase expand by 1 and jump to 3.2.2); otherwise, end this step.
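As a minimal sketch of steps 3.1) and 3.2), assuming NumPy arrays of 16-bit depth values; the hole boundary of 50 and the background tolerance of 10 come from the text, while MAX_EXPAND is an illustrative upper limit that the text leaves unspecified:

```python
import numpy as np

HOLE_BOUND = 50   # a depth value below this is treated as a hole point
BG_DIFF = 10      # tolerance for matching a pixel against the background
MAX_EXPAND = 3    # assumed upper limit on the expansion radius

def update_background(bac_dep, dep0):
    """Step 3.1: keep the farther depth per pixel as the background frame."""
    mask = dep0 > bac_dep + BG_DIFF
    bac_dep[mask] = dep0[mask]
    return bac_dep

def fill_small_holes(dep0, bac_dep):
    """Steps 3.2.1-3.2.3: fill hole points of dep0 from the background frame
    by expanding to the four neighbours at growing radii."""
    h, w = dep0.shape
    for y in range(h):
        for x in range(w):
            if not (dep0[y, x] < HOLE_BOUND and bac_dep[y, x] > HOLE_BOUND):
                continue
            filled = False
            for expand in range(1, MAX_EXPAND + 1):
                for nx, ny in ((x + expand, y), (x - expand, y),
                               (x, y + expand), (x, y - expand)):
                    # neighbour "belongs to the background frame" when its
                    # absolute difference from the background is below BG_DIFF
                    if (0 <= nx < w and 0 <= ny < h and
                            abs(int(dep0[ny, nx]) - int(bac_dep[ny, nx])) < BG_DIFF):
                        dep0[y, x] = bac_dep[y, x]
                        filled = True
                        break
                if filled:
                    break
    return dep0
```

The per-pixel Python loops only illustrate the control flow; a real-time implementation would vectorize or parallelize them.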
Still further, in step (4), the method for filling the large holes of the depth map by bidirectional linear-scan polling interpolation is as follows:
4.1) set endSC as the scan counter, initialize endSC to 0, and execute 4.2);
4.2) scan Dep0 from the upper right to the lower left of the image; for each pixel (x, y), if it is a hole point, execute 4.3); otherwise skip the point and continue the traversal, executing 4.5) after the traversal finishes;
4.3) scan up to endSC×3 points to the left from (x, y), until (0, y) or (x - endSC×3, y) is reached or a non-hole point (index, y) is found, where x - endSC×3 ≤ index < x. When a non-hole point (index, y) is found, scan back from (index, y) to (x, y), applying
Dep0(i, y) = Dep0(index, y),  index < i ≤ x
after this scan, if (x, y) is still a hole point, execute 4.4); otherwise return to 4.2);
4.4) scan up to endSC×3 points downward from (x, y), until (x, depthHeight) or (x, y + endSC×3) is reached or a non-hole point (x, index) is found, where y < index ≤ y + endSC×3 and depthHeight is the height of the depth image. When a non-hole point (x, index) is found, scan back from (x, index) to (x, y), applying
Dep0(x, i) = Dep0(x, index),  y < i ≤ index
then return to 4.2) after the scan finishes;
4.5) scan the image from the lower left to the upper right, filling holes once more according to steps 4.2) to 4.4), and execute 4.6) after the traversal finishes;
4.6) if endSC has not reached the upper limit of scans, increase endSC by 1 and jump to 4.2); otherwise, end this step.
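A simplified sketch of this polling interpolation, assuming hole points hold the value 0 and with max_sc standing in for the unspecified scan-count upper limit; the reverse lower-left-to-upper-right pass of 4.5) is approximated here by flipping the traversal order while keeping the leftward/downward fill directions:

```python
import numpy as np

def fill_large_holes(dep, max_sc):
    """Bidirectional linear-scan polling interpolation (steps 4.1-4.6).
    In poll endSC a hole may borrow a value from a non-hole point at most
    endSC*3 pixels to its left, then at most endSC*3 pixels below it;
    small radii are polled first so nearby values win."""
    h, w = dep.shape
    for end_sc in range(max_sc + 1):
        reach = end_sc * 3
        for forward in (True, False):   # upper-right→lower-left, then reverse
            ys = range(h) if forward else range(h - 1, -1, -1)
            xs = range(w - 1, -1, -1) if forward else range(w)
            for y in ys:
                for x in xs:
                    if dep[y, x] != 0:            # not a hole point
                        continue
                    # 4.3) scan left up to `reach` points for a non-hole value
                    for i in range(x - 1, max(x - reach, 0) - 1, -1):
                        if dep[y, i] != 0:
                            dep[y, i + 1:x + 1] = dep[y, i]
                            break
                    if dep[y, x] != 0:
                        continue
                    # 4.4) still a hole: scan downward up to `reach` points
                    for j in range(y + 1, min(y + reach, h - 1) + 1):
                        if dep[j, x] != 0:
                            dep[y:j, x] = dep[j, x]
                            break
    return dep
```

Because endSC grows by one per poll, values propagate only three pixels farther per iteration, which keeps distant depths from leaking across large holes too early.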
Further, in step (6), the abnormal-coordinate parts of the mapping table m_pColorCoordinates are handled as follows:
6.1) handling abnormal coordinates at the left edge
For each point (0, y) at the leftmost end of the mapping table, where 0 ≤ y < depthHeight and depthHeight is the height of the depth map, the following steps are performed:
6.1.1) scan to the right from (0, y) of the mapping table; when a point (index, y) is reached, take the three consecutive pixels (index, y), (index + 1, y), (index + 2, y), extract their corresponding coordinate points A, B, C from the mapping table, and execute 6.1.2); if index reaches depthWidth, the width of the depth map, stop scanning and end this step;
6.1.2) if the Manhattan distances of both A, B and B, C are less than a predefined threshold, then A, B, C are considered continuous, anomaly-free coordinate points, and 6.1.3) is executed; otherwise, return to 6.1.1). The Manhattan distance is
d(i, j) = |xi - xj| + |yi - yj|
where d(i, j) is the Manhattan distance, (xi, yi) are the coordinates of the first point and (xj, yj) those of the second;
6.1.3) go back from (index, y) to (0, y); for each point (i, y) in between, 0 ≤ i < index, reset its mapping-table value m_pColorCoordinates(i, y) to a coordinate D given by
D.X = A.X - (index - i) * 3
D.Y = A.Y
where D.X and D.Y are the X and Y coordinate values of D, and A.X and A.Y are those of A;
return to 6.1.1) after the traversal finishes;
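The left-edge repair of 6.1.1)-6.1.3) can be sketched as follows, assuming each row of the mapping table is a list of (X, Y) color coordinates and using an illustrative distance threshold (the text only says "predefined"):

```python
def manhattan(p, q):
    """Manhattan distance d(i, j) = |xi - xj| + |yi - yj|."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def fix_left_edge(coords, threshold=5):
    """Steps 6.1.1-6.1.3: in each row, find the first three consecutive
    anomaly-free coordinate points A, B, C, then rewrite every entry to
    their left by extrapolating from A with D.X = A.X - (index - i) * 3."""
    for row in coords:
        for index in range(len(row) - 2):
            a, b, c = row[index], row[index + 1], row[index + 2]
            if manhattan(a, b) < threshold and manhattan(b, c) < threshold:
                for i in range(index):
                    row[i] = (a[0] - (index - i) * 3, a[1])
                break
    return coords
```

The step of 3 in the extrapolation reflects the roughly 3:1 horizontal resolution ratio between the Kinect color and depth frames implied by the method; the right-edge handling of 6.2) would mirror this scan from the other side.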
6.2) handling abnormal coordinates at the right edge
Process the rightmost point (depthWidth, y) of each row according to step 6.1), where 0 ≤ y < depthHeight, and depthWidth and depthHeight are the width and height of the depth map;
6.3) marking the ghost coordinate region and the internal abnormal-coordinate region
Establish a ghost mark table map_use of size depthWidth × depthHeight and initialize every point of map_use to 0, the value that marks a region as neither ghost nor abnormal coordinates. For each point (0, y) at the leftmost end of the mapping table, where 0 ≤ y < depthHeight, the following steps are performed:
6.3.1) scan to the right from (0, y) of the mapping table m_pColorCoordinates; when a point (index, y) is reached, take the two consecutive pixels (index, y), (index + 1, y), extract their corresponding coordinate points A and B from the mapping table, and execute 6.3.2); if index reaches depthWidth, stop scanning and end this step;
6.3.2) if the X coordinates of A and B satisfy A.X > B.X, then B is considered a coordinate outlier and 6.3.3) is executed; otherwise, return to 6.3.1);
6.3.3) scan to the right from (index + 1, y); let the scanned point be (i, y) with corresponding coordinate C. If C.X < A.X, set map_use(i, y) = 1, indicating that the point lies in the ghost or abnormal-point region, and continue scanning; otherwise, stop the scan and return to 6.3.1).
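The ghost-marking scan of 6.3.1)-6.3.3) can be sketched as follows, with the mapping table again represented row-wise as (X, Y) tuples; the working assumption is that a backward jump of the mapped X coordinate signals the start of a ghost or abnormal run:

```python
def mark_ghost_regions(coords):
    """Steps 6.3.1-6.3.3: build the ghost mark table map_use. A point whose
    mapped X coordinate jumps backward (A.X > B.X) starts a run that extends
    while the mapped X stays below A.X; points in the run are marked 1."""
    h, w = len(coords), len(coords[0])
    map_use = [[0] * w for _ in range(h)]
    for y in range(h):
        index = 0
        while index < w - 1:
            a, b = coords[y][index], coords[y][index + 1]
            if a[0] > b[0]:                       # X coordinate jumps backward
                i = index + 1
                while i < w and coords[y][i][0] < a[0]:
                    map_use[y][i] = 1             # ghost / abnormal region
                    i += 1
                index = i                          # resume after the run
            else:
                index += 1
    return map_use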
In step (7), the depth image color_depth aligned with the color map is established as follows:
7.1) establishing the depth image color_depth aligned with the color map
Initialize every point of color_depth to 0, then take depth values in reverse according to the coordinate mapping table:
color_depth(m_pColorCoordinates(i, j)) = Dep0(i, j),  if map_use(i, j) ≠ 1
where m_pColorCoordinates(i, j) is the pixel coordinate in the color image corresponding to point (i, j) of the depth image, and map_use(i, j) = 1 means that point (i, j) lies in the ghost region;
7.2) filling the holes by linear-scan polling interpolation, comprising the following steps:
7.2.1) set endSC as the scan counter, initialize endSC to 0, and execute 7.2.2);
7.2.2) scan color_depth from the upper right to the lower left of the image; for each pixel (x, y), if it is a hole point, execute 7.2.3); otherwise skip the point and continue the traversal, jumping to 7.2.5) after the traversal finishes;
7.2.3) scan up to endSC×3 points to the left from (x, y), until (0, y) or (x - endSC×3, y) is reached or a non-hole point (index, y) is found, where x - endSC×3 ≤ index < x. When a non-hole point (index, y) is found, scan back from (index, y) to (x, y), applying
color_depth(i, y) = color_depth(index, y),  index < i ≤ x
after this scan, if (x, y) is still a hole point, execute 7.2.4); otherwise return to 7.2.2);
7.2.4) scan up to endSC×3 points downward from (x, y), until (x, colorHeight) or (x, y + endSC×3) is reached or a non-hole point (x, index) is found, where y < index ≤ y + endSC×3 and colorHeight is the height of the color image. When a non-hole point (x, index) is found, scan back from (x, index) to (x, y), applying
color_depth(x, i) = color_depth(x, index),  y < i ≤ index
then return to 7.2.2);
7.2.5) if endSC has not reached the upper limit of scans, increase endSC by 1 and jump to 7.2.2); otherwise, end this step.
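Assuming the mapping table and ghost table are available as Python containers, the reverse lookup of 7.1) might be sketched as below; the hole filling of 7.2) then reuses the same polling interpolation as step (4), applied to color_depth:

```python
import numpy as np

def build_color_depth(dep0, coords, map_use, color_w, color_h):
    """Step 7.1: write each valid depth value to the color-space position
    named by its mapping-table entry; unwritten positions stay 0 (holes)
    and ghost/abnormal points (map_use == 1) are skipped."""
    color_depth = np.zeros((color_h, color_w), dtype=dep0.dtype)
    depth_h, depth_w = dep0.shape
    for j in range(depth_h):        # depth row
        for i in range(depth_w):    # depth column
            if map_use[j][i] == 1:
                continue
            cx, cy = coords[j][i]   # color coordinates of depth point (i, j)
            if 0 <= cx < color_w and 0 <= cy < color_h:
                color_depth[cy, cx] = dep0[j, i]
    return color_depth
```

Because the depth frame has fewer pixels than the color frame, most positions of color_depth remain 0 after this pass, which is exactly why the polling interpolation of 7.2) is needed.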
The invention has the beneficial effects of good real-time performance, good stability and high accuracy.
Drawings
Fig. 1 is a flowchart of the Kinect-based real-time depth map and color map registration and optimization method.
Fig. 2 is the resulting depth image from the final registration.
Fig. 3 is the image resulting from the final registration.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 3, a method for registering and optimizing a real-time depth map and a color map based on Kinect includes the following steps:
(1) connecting the Kinect device by using the application programming interface Kinect SDK 2.0 provided by Microsoft;
(2) acquiring an original depth data stream and a color data stream;
(3) preliminary processing of depth-map holes, filling the small holes of the depth map using the background frame, comprising the following steps:
3.1) background frame generation: the first depth image read, Dep0, is taken as the background frame; the background frame is then updated according to
BacDep(x, y) = Dep0(x, y),  if Dep0(x, y) > BacDep(x, y) + 10
where BacDep(x, y) is the depth value of each pixel of the background frame and Dep0(x, y) is the depth value of each pixel of the current depth image;
3.2) filling the small holes of the current depth frame using the background frame: for each pixel of the depth map, if and only if Dep0(x, y) < 50 and BacDep(x, y) > 50, the following steps are performed:
3.2.1) set the expansion radius expand to 1 and execute 3.2.2);
3.2.2) extend the point (x, y) in Dep0 to the adjacent pixels (x + expand, y), (x - expand, y), (x, y + expand), (x, y - expand); if any of these points belongs to the background frame, that is, its absolute difference from the corresponding background-frame pixel is less than 10, then set Dep0(x, y) = BacDep(x, y);
3.2.3) if expand has not reached the upper limit of expansion scans, increase expand by 1 and jump to 3.2.2); otherwise, end this step;
(4) deep processing of depth-map holes, filling holes by bidirectional linear-scan polling interpolation, comprising the following steps:
4.1) set endSC as the scan counter, initialize endSC to 0, and execute 4.2);
4.2) scan Dep0 from the upper right to the lower left of the image; for each pixel (x, y), if it is a hole point, execute 4.3); otherwise skip the point and continue the traversal, executing 4.5) after the traversal finishes;
4.3) scan up to endSC×3 points to the left from (x, y), until (0, y) or (x - endSC×3, y) is reached or a non-hole point (index, y) is found, where x - endSC×3 ≤ index < x. When a non-hole point (index, y) is found, scan back from (index, y) to (x, y), applying
Dep0(i, y) = Dep0(index, y),  index < i ≤ x
after this scan, if (x, y) is still a hole point, execute 4.4); otherwise return to 4.2);
4.4) scan up to endSC×3 points downward from (x, y), until (x, depthHeight) or (x, y + endSC×3) is reached or a non-hole point (x, index) is found, where y < index ≤ y + endSC×3 and depthHeight is the height of the depth image. When a non-hole point (x, index) is found, scan back from (x, index) to (x, y), applying
Dep0(x, i) = Dep0(x, index),  y < i ≤ index
then return to 4.2) after the scan finishes;
4.5) scan the image from the lower left to the upper right, filling holes once more according to steps 4.2) to 4.4), and execute 4.6) after the traversal finishes;
4.6) if endSC has not reached the upper limit of scans, increase endSC by 1 and jump to 4.2); otherwise, end this step.
(5) Establishing the coordinate mapping relation between the depth image and the color image
The coordinate mapping relation m_pColorCoordinates between the depth image and the color image is obtained through the function MapDepthFrameToColorSpace provided by Kinect SDK 2.0; m_pColorCoordinates stores, for each depth point, its corresponding coordinates in the color image Col0.
(6) The abnormal-coordinate parts of the mapping table m_pColorCoordinates are processed as follows:
6.1) handling abnormal coordinates at the left edge
For each point (0, y) at the leftmost end of the mapping table, where 0 ≤ y < depthHeight and depthHeight is the height of the depth map, the following steps are performed:
6.1.1) scan to the right from (0, y) of the mapping table; when a point (index, y) is reached, take the three consecutive pixels (index, y), (index + 1, y), (index + 2, y), extract their corresponding coordinate points A, B, C from the mapping table, and execute 6.1.2); if index reaches depthWidth, the width of the depth map, stop scanning and end this step;
6.1.2) if the Manhattan distances of both A, B and B, C are less than a predefined threshold, then A, B, C are considered continuous, anomaly-free coordinate points, and 6.1.3) is executed; otherwise, return to 6.1.1). The Manhattan distance is
d(i, j) = |xi - xj| + |yi - yj|
where d(i, j) is the Manhattan distance, (xi, yi) are the coordinates of the first point and (xj, yj) those of the second;
6.1.3) go back from (index, y) to (0, y); for each point (i, y) in between, 0 ≤ i < index, reset its mapping-table value m_pColorCoordinates(i, y) to a coordinate D given by
D.X = A.X - (index - i) * 3
D.Y = A.Y
where D.X and D.Y are the X and Y coordinate values of D, and A.X and A.Y are those of A;
return to 6.1.1) after the traversal finishes;
6.2) handling abnormal coordinates at the right edge
Process the rightmost point (depthWidth, y) of each row according to step 6.1), where 0 ≤ y < depthHeight, and depthWidth and depthHeight are the width and height of the depth map;
6.3) marking the ghost coordinate region and the internal abnormal-coordinate region
Establish a ghost mark table map_use of size depthWidth × depthHeight and initialize every point of map_use to 0, the value that marks a region as neither ghost nor abnormal coordinates. For each point (0, y) at the leftmost end of the mapping table, where 0 ≤ y < depthHeight, the following steps are performed:
6.3.1) scan to the right from (0, y) of the mapping table m_pColorCoordinates; when a point (index, y) is reached, take the two consecutive pixels (index, y), (index + 1, y), extract their corresponding coordinate points A and B from the mapping table, and execute 6.3.2); if index reaches depthWidth, stop scanning and end this step;
6.3.2) if the X coordinates of A and B satisfy A.X > B.X, then B is considered a coordinate outlier and 6.3.3) is executed; otherwise, return to 6.3.1);
6.3.3) scan to the right from (index + 1, y); let the scanned point be (i, y) with corresponding coordinate C. If C.X < A.X, set map_use(i, y) = 1, indicating that the point lies in the ghost or abnormal-point region, and continue scanning; otherwise, stop the scan and return to 6.3.1).
(7) Establishing the depth image color_depth aligned with the color image proceeds as follows:
7.1) establishing the depth image color_depth aligned with the color map
Initialize every point of color_depth to 0, then take depth values in reverse according to the coordinate mapping table:
color_depth(m_pColorCoordinates(i, j)) = Dep0(i, j),  if map_use(i, j) ≠ 1
where m_pColorCoordinates(i, j) is the pixel coordinate in the color image corresponding to point (i, j) of the depth image, and map_use(i, j) = 1 means that point (i, j) lies in the ghost region;
7.2) filling the holes by linear-scan polling interpolation, comprising the following steps:
7.2.1) set endSC as the scan counter, initialize endSC to 0, and execute 7.2.2);
7.2.2) scan color_depth from the upper right to the lower left of the image; for each pixel (x, y), if it is a hole point, execute 7.2.3); otherwise skip the point and continue the traversal, jumping to 7.2.5) after the traversal finishes;
7.2.3) scan up to endSC×3 points to the left from (x, y), until (0, y) or (x - endSC×3, y) is reached or a non-hole point (index, y) is found, where x - endSC×3 ≤ index < x. When a non-hole point (index, y) is found, scan back from (index, y) to (x, y), applying
color_depth(i, y) = color_depth(index, y),  index < i ≤ x
after this scan, if (x, y) is still a hole point, execute 7.2.4); otherwise return to 7.2.2);
7.2.4) scan up to endSC×3 points downward from (x, y), until (x, colorHeight) or (x, y + endSC×3) is reached or a non-hole point (x, index) is found, where y < index ≤ y + endSC×3 and colorHeight is the height of the color image. When a non-hole point (x, index) is found, scan back from (x, index) to (x, y), applying
color_depth(x, i) = color_depth(x, index),  y < i ≤ index
then return to 7.2.2);
7.2.5) if endSC has not reached the upper limit of scans, increase endSC by 1 and jump to 7.2.2); otherwise, end this step.
(8) Cropping the color image and the aligned depth image
Remove the regions where the shooting areas of the depth camera and the color camera do not overlap, thereby achieving complete alignment of the depth image and the color image.
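As an illustrative sketch of this cropping step, approximating the overlapping field of view by the bounding box of valid (non-zero) aligned depth; in practice the overlap region would come from the camera calibration rather than the data:

```python
import numpy as np

def crop_to_overlap(color_img, color_depth):
    """Step (8): crop both images to the bounding box of valid aligned depth,
    discarding regions that only one of the two cameras can see."""
    ys, xs = np.nonzero(color_depth > 0)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return color_img[y0:y1, x0:x1], color_depth[y0:y1, x0:x1]
```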
(9) Output the registered depth image and color image: fig. 2 is the final registered depth image, and fig. 3 is the final registered image (displayed in gray scale; the actual processing yields a color image).