KR20220052784A

KR20220052784A - Depth Image based Real-time ground detection method

Info

Publication number: KR20220052784A
Application number: KR1020200137127A
Authority: KR
Inventors: 김동규
Original assignee: (주)베라시스
Priority date: 2020-10-21
Filing date: 2020-10-21
Publication date: 2022-04-28
Also published as: KR102547333B1

Abstract

The present invention relates to a depth image-based real-time floor detection method that requires only a small amount of computation. With a time-of-flight (TOF) camera mounted so as to photograph objects located within a predetermined space, the method detects the floor in real time within a region of interest (ROI) of the TOF camera image. The method comprises the following steps: dividing the depth image into cells after setting a floor search area; applying the normal distribution transformation (NDT) to each image cell and determining whether the cell is planar by using its eigenvalues; extracting only the cells close to a floor criterion as candidate cells; determining whether the number of candidate cells is greater than a minimum number of candidate cells; determining that there is no floor plane when the number is less than the minimum, and modeling the floor with cell-based RANSAC when it is greater; and adding distance-based inliers other than the candidate cells.

Description

Depth Image based Real-time ground detection method

The present invention relates to a depth image-based real-time floor detection method and, more particularly, to a depth image-based real-time floor detection method that is robust to depth-image noise and requires little enough computation to run in real time.

Accurate distance information cannot be obtained for objects in a color image from a mono camera. This is because, when light reflected from the background and objects of the 3D real world strikes the camera image sensor, the resulting image loses one dimension, depth, and much information is lost with it.

Therefore, camera calibration must be performed first in order to obtain distance information from an image. Calibration establishes how a point in the real world, refracted through the camera lens and striking the sensor, corresponds to the x, y coordinates of the image, but even calibration does not yield the scale factor, so a further specific condition must be satisfied to obtain it. Examples of such conditions are that the object stands on the ground and the ground is flat, or that the actual size of the object is known.

Sensors that obtain depth information include lidar, image matching with a stereo camera, and TOF cameras. A time-of-flight (TOF) camera has m x n pixels like an ordinary camera, but each pixel holds a distance from the camera rather than a color or brightness. To detect an object and obtain information about it, the floor area must first be detected and removed, and the remaining distance data must then be processed and grouped.

When converted to the 3D real-world coordinate system, 3D point cloud data consists of x, y, and z values. Processing it requires computing the distances to other points and evaluating their relationships, which demands a large amount of computation.

In addition, 3D point clouds are inherently noisy, which makes it difficult to apply algorithms to them.

Image-based distance detection

The exact distance cannot be determined from a single mono-camera image. In general, distance can be measured from images captured simultaneously by a stereo camera together with the geometry of the stereo camera. This method has blind spots that are cut off during matching, and its accuracy decreases as the distance increases. It has the advantage, however, that distance can be measured from images alone. If the actual size of an object is known and there is no camera distortion, the distance can be obtained from a simple proportion.
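The simple proportion mentioned above can be written out under an ideal pinhole camera model. The symbols are introduced here for illustration only and do not appear in the original text: H is the known real height of the object, h is its height in the image in pixels, f is the focal length in pixels, and Z is the distance to the object.

```latex
\frac{h}{f} = \frac{H}{Z}
\quad\Longrightarrow\quad
Z = \frac{f\,H}{h}
```

For example, with the assumed values f = 500 px, H = 1.7 m and h = 100 px, the proportion gives Z = 500 × 1.7 / 100 = 8.5 m.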

A depth image obtained by image-based distance measurement and a depth image obtained from a TOF camera are the same in principle. In both, each pixel holds a depth value rather than color and brightness information; they differ only in the performance and characteristics of the technology used to obtain that value.

Depth image-based floor detection

Definition of depth image

A depth image is a camera image in which each pixel holds a depth value rather than color and brightness information. Depth values are commonly rendered in rainbow colors and may therefore look like color information, but the colors merely map distance to color so that distances can be read at a glance. Each pixel of a depth image corresponds to a distance, but it is difficult to compute with only the x, y position in the depth image and the distance value. The distance values therefore need to be converted into coordinates in the 3D real world. Because a TOF camera also follows the usual camera model, that model can be used to convert the distance values into the 3D real-world coordinate system.
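The conversion from a depth pixel to a 3D coordinate described above can be sketched as follows. This is a minimal illustration assuming an ideal, distortion-free pinhole model and a depth value measured along the optical axis; the function name and the intrinsic parameters fx, fy, cx, cy are hypothetical and are not given in the source. A real TOF camera may report radial distance instead, which would need an additional correction.

```python
def depth_pixel_to_point(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth along the optical axis to a
    camera-frame point (X, Y, Z). fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point; all are assumed calibration values."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# The pixel at the assumed principal point maps onto the optical axis.
center_point = depth_pixel_to_point(160, 120, 2.0, fx=200.0, fy=200.0, cx=160.0, cy=120.0)
```

Applying this function to every pixel of a cell gives the set of 3D points on which the per-cell statistics are computed.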

Research on 3D point clouds

Not every point in a 3D point cloud belongs to the floor, and floor detection must take this into account. In addition, a 3D point cloud contains many points, each stored as real-valued X, Y, Z data, so the volume of data is large. For this reason, methods are being studied that, instead of using each point individually, divide real-world space into a 3D grid, analyze the distribution of the 3D points in each grid element, and represent it as voxels for efficient management. Dividing the real-world space that contains the 3D point cloud is better suited to a map in which multiple depth measurements have been accumulated and registered by an algorithm such as SLAM (Simultaneous Localization and Mapping) than to a single TOF depth image.

A floor plane can be modeled from a 3D point cloud with the RANSAC (RANdom SAmple Consensus) algorithm. Three points are drawn, and the normal vector of the plane is computed from the equation of the plane through those three points to model the floor plane. The consensus of the other points is then counted, and modeling and consensus counting are repeated, keeping the model with the highest count. Checking consensus uses the distance of each point from the plane, and because the number of points is large, the computational cost is high. The choice of three points for modeling is also problematic: 3D point cloud data generally contains noise, and this noise is said to be unpredictable. If three noisy points are drawn and the resulting floor model is slightly tilted, true inliers are often judged to be outliers and excluded during consensus counting. Incorrect models caused by noise make many iterations necessary, which increases the amount of computation.
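The two building blocks of the point-based approach described above, a plane through three points and a distance-based consensus count, can be sketched as follows. The function names and the distance threshold are illustrative assumptions, not taken from the source.

```python
import math


def plane_from_points(p1, p2, p3):
    """Return (unit normal n, offset d) of the plane n . x + d = 0 through
    three points, or None if the points are collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The cross product of two in-plane vectors is normal to the plane.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p1[0] + n[1] * p1[1] + n[2] * p1[2])
    return n, d


def point_plane_distance(point, plane):
    """Unsigned distance from a point to a plane given as (unit normal, d)."""
    n, d = plane
    return abs(n[0] * point[0] + n[1] * point[1] + n[2] * point[2] + d)


def count_consensus(points, plane, max_distance):
    """Count the points lying within max_distance of the plane."""
    return sum(1 for p in points if point_plane_distance(p, plane) <= max_distance)
```

Every candidate model requires one distance evaluation per point, which is the cost that grows with the size of the cloud.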

Camera calibration

Camera calibration is a geometric description of how a point in the 3D real-world coordinate system passes through the camera lens, is projected onto the image sensor, and is thereby transformed into the u, v coordinates of image space.

RANSAC

RANSAC is a method for estimating model parameters from a data set that contains outliers such as noise. A model is built from a randomly sampled subset of the data, the number of data points that support (reach consensus with) the model is counted, and the model with the highest support is kept. The procedure is repeated, and when the ratio of inliers to outliers is known, the probability that the algorithm succeeds can be calculated.
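The relation between the inlier ratio and the success probability mentioned above is the standard RANSAC estimate, which the source does not write out. The symbols are introduced here for illustration: w is the fraction of inliers in the data, n is the number of samples drawn per model (3 for a plane), k is the number of iterations, and p is the probability that at least one iteration draws only inliers.

```latex
p = 1 - \left(1 - w^{n}\right)^{k}
\quad\Longrightarrow\quad
k = \frac{\log(1 - p)}{\log\left(1 - w^{n}\right)}
```

For example, with the assumed values w = 0.5, n = 3 and p = 0.99, this gives k = log(0.01) / log(0.875) ≈ 34.5, so 35 iterations are needed. A higher inlier ratio among the samples reduces k sharply.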

The advantage of RANSAC is that, in theory, it yields a parametric model built only from inliers, with the outliers excluded.

The disadvantages of RANSAC are the number of iterations it needs and the possibility that the result is not optimal. The more outliers there are, the more iterations are required. In addition, only a single model can be obtained.

NDT

Paper: Remote Sens. 2017, 9, 433; doi:10.3390/rs9050433; www.mdpi.com/journal/remotesensing. Article: An Improved RANSAC for 3D Point Cloud Plane Segmentation Based on Normal Distribution Transformation Cells.

In the above paper, a 3D point cloud map is spatially partitioned in the 3D real world, NDT is applied, and each cell is then judged to be planar or not. The plane model is fitted through optimization.

NDT stands for Normal Distribution Transformation. Applying the NDT to 3D real-world data yields a covariance, which describes the spread of the data and the correlation between its coordinates. In two dimensions the covariance takes the form of an ellipse, and in three dimensions that of an ellipsoid (FIG. 1 shows the covariance with its eigenvectors and eigenvalues).
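As a rough sketch of the covariance computation described above, the following uses NumPy to obtain the mean, the covariance matrix, and its eigen-decomposition for one set of 3D points. The function name is hypothetical, and the use of the sample covariance (NumPy's default normalization) is an assumption; the source does not specify the normalization.

```python
import numpy as np


def ndt_cell_statistics(points):
    """Return (mean, covariance, eigenvalues, eigenvectors) for an (N, 3)
    collection of 3D points. Eigenvalues are in ascending order and the
    matching eigenvectors are the columns of the last return value."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)          # 3 x 3 sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric matrix, ascending order
    return mean, cov, eigvals, eigvecs


# Four points on the plane z = 0: the smallest eigenvalue is zero and its
# eigenvector points along the z axis, i.e. along the plane normal.
mean, cov, eigvals, eigvecs = ndt_cell_statistics(
    [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
)
```

The eigenvector belonging to the smallest eigenvalue is the direction of least spread, which for a flat patch coincides with the surface normal.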

[Prior art literature]

1. Paper: Remote Sens. 2017, 9, 433; doi:10.3390/rs9050433; www.mdpi.com/journal/remotesensing. Article: An Improved RANSAC for 3D Point Cloud Plane Segmentation Based on Normal Distribution Transformation Cells.

2. Republic of Korea Patent Publication No. 10-2015-0109868 (published on October 2, 2015) (Title of the invention: Method for processing a floor area using depth information, and processing device and computer-readable recording medium storing a program therefor)

The present invention has been proposed in view of the conventional situation described above, and its object is to provide a depth image-based real-time floor detection method that is robust to depth-image noise and requires little enough computation to run in real time.

According to a preferred embodiment of the present invention for achieving the above object,

a depth image-based real-time floor detection method is provided which, with a TOF (Time Of Flight) camera mounted so as to photograph objects located within a certain space, operates on an ROI of the TOF camera image and comprises:

dividing the depth image into cells after setting a floor search area;

applying the NDT (Normal Distribution Transformation) to each image cell and determining whether the cell is planar by using its eigenvalues;

extracting only the cells close to a floor criterion as candidate cells;

determining whether the number of candidate cells is greater than a minimum number of candidate cells;

determining that there is no floor plane if, in the determining step, the number of candidate cells is less than the minimum number of candidate cells;

modeling the floor with cell-based RANSAC if, in the determining step, the number of candidate cells is greater than the minimum number of candidate cells; and

adding distance-based inliers other than the candidate cells.

In addition, the step of modeling the floor with cell-based RANSAC comprises:

randomly selecting three candidate cells and modeling the floor plane from their mean values;

computing the plane angle and distance for each candidate cell and determining whether it satisfies the inlier cell condition;

then determining whether the consensus (support) count is greater than the maximum count (MAX COUNT);

if, in that determining step, the consensus (support) count is greater than the maximum count (MAX COUNT),

taking the current model as the best model and updating the inlier cells;

determining whether the inlier ratio of the best model is 95% or more;

exiting the RANSAC iteration if the inlier ratio of the best model is 95% or more;

determining whether the maximum (MAX) number of iterations has been reached if the inlier ratio of the best model is less than 95%; and

exiting the RANSAC iteration if the maximum (MAX) number of iterations has been reached.

In addition, the covariance is computed for the set of 3D real-world point data contained in each cell obtained by dividing image space into cells, and it is determined whether the spread of the points has the form of a plane.

In addition, in two dimensions the eigenvectors of the covariance of a data cluster give the directions of the two axes along which the spread is smallest and largest, the eigenvalues give the magnitudes of those axes, and the two axes are orthogonal. In three dimensions the eigenvectors of the covariance of a data cluster give the directions of the three axes of the ellipsoid, the eigenvalues give the magnitude of each axis, and the three axes are mutually orthogonal. The directions and magnitudes of these axes make it possible to distinguish whether the data distribution corresponds to a plane, to a line, or to neither.
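One way to turn the eigenvalue magnitudes into the plane, line, or neither decision described above is to compare ratios of the sorted eigenvalues. The sketch below is an assumption-laden illustration: the source does not state its decision rule or any threshold, so the ratio test and the value 0.05 are hypothetical, as is the function name. It expects at least three points per cell.

```python
import numpy as np


def classify_distribution(points, ratio=0.05):
    """Classify the spread of (N, 3) points as 'plane', 'line' or 'other'
    from the sorted eigenvalues of their covariance.
    ratio is an illustrative threshold, not a value from the source."""
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts, rowvar=False)
    l1, l2, l3 = np.linalg.eigvalsh(cov)  # ascending: l1 <= l2 <= l3
    if l3 <= 0:
        return "other"                    # all points coincide
    if l2 <= ratio * l3:
        return "line"                     # one dominant axis
    if l1 <= ratio * l2:
        return "plane"                    # two dominant axes, one thin axis
    return "other"                        # spread in all three directions


flat_patch = [(x, y, 0.0) for x in range(4) for y in range(4)]
label = classify_distribution(flat_patch)
```

A ratio test is scale-independent, which matters because cells far from the camera cover a larger physical area than cells close to it.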

According to another aspect of the present invention, there is provided a depth image-based real-time floor detection method that, with a TOF (Time Of Flight) camera mounted so as to photograph objects located within a certain space, operates on an ROI of the TOF camera image and comprises:

a first step of receiving a depth image from the TOF camera;

a second step of setting an ROI in the lower part of the image of a certain size;

a third step of dividing the ROI of a certain size into a grid of square pixel cells of a certain size;

a fourth step of obtaining the covariance of each cell that has passed the above steps by using the NDT (Normal Distribution Transformation), and computing the eigenvalues and eigenvectors of the ellipsoid-shaped covariance matrix;

a fifth step of determining whether the candidate planar cells that have passed the above procedure are numerous enough to compute a plane;

a sixth step of extracting three cells from the set of floor plane candidate cells;

a seventh step of forming three points from the mean values of the point sets in these cells, modeling a plane from them, and computing its normal vector;

an eighth step in which the equation of the plane serves as the RANSAC model against which the floor plane candidate cells are checked, a cell being counted toward the consensus (support) and added to the inliers only if the angle between the floor model and the cell is within ±10 degrees and the distance between them is 5 cm or less; and

a ninth step, performed last after the eighth step, of checking by distance all cells in the ROI other than the inlier cells of the current floor model and adding those within 5 cm to the inliers as floor.
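The inlier conditions of the eighth and ninth steps above can be sketched as two small predicates. The ±10 degree and 5 cm limits come from the text; everything else is an assumption made for illustration: the function names, the use of meters as the unit, the representation of a cell by its mean point and a unit normal (taken here to be the eigenvector of the smallest eigenvalue), and the plane form n . x + d = 0 with a unit normal n.

```python
import math


def is_inlier_cell(cell_mean, cell_normal, plane_normal, plane_d,
                   max_angle_deg=10.0, max_distance_m=0.05):
    """Eighth-step check: the cell must be both nearly parallel to the floor
    model and close to it. Normals are assumed to be unit vectors."""
    # abs() makes the test independent of the arbitrary sign of an eigenvector.
    cos_angle = min(1.0, abs(sum(a * b for a, b in zip(cell_normal, plane_normal))))
    angle_deg = math.degrees(math.acos(cos_angle))
    distance = abs(sum(a * b for a, b in zip(plane_normal, cell_mean)) + plane_d)
    return angle_deg <= max_angle_deg and distance <= max_distance_m


def is_floor_by_distance(cell_mean, plane_normal, plane_d, max_distance_m=0.05):
    """Ninth-step check: distance to the best floor model only."""
    distance = abs(sum(a * b for a, b in zip(plane_normal, cell_mean)) + plane_d)
    return distance <= max_distance_m
```

The ninth-step predicate drops the angle condition, so cells that failed the planarity or angle test but lie on the floor model can still be recovered.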

In addition, in the fourth step, the eigenvalues of each cell are sorted by magnitude before use.

In addition, a cell that fails either of the two conditions of the eighth step is excluded from the inliers. If the current floor model has a higher consensus (support) than the floor model that had previously received the most support, the best model is replaced by the current model and the inlier list is updated. This is repeated up to the maximum (Max) number of iterations, and if during the iterations the consensus (support) ratio of the floor plane candidate cells for a model exceeds 95%, the RANSAC algorithm is terminated without further iteration.
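The iteration described above, including the best-model update and the early exit at a 95% consensus ratio, can be sketched as a compact loop. This is an illustrative sketch rather than the patented implementation: the function names, the representation of a cell as a (mean point, unit normal) pair, the default of 50 iterations, and the fixed random seed are all assumptions. The angle and distance limits and the 95% ratio are taken from the text.

```python
import math
import random


def plane_from_means(p1, p2, p3):
    """Plane (unit normal, d) through three cell means, or None if collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    return n, -sum(n[i] * p1[i] for i in range(3))


def supports(cell, plane, max_angle_deg=10.0, max_distance_m=0.05):
    """True if the cell satisfies both the angle and the distance condition."""
    mean, normal = cell
    n, d = plane
    cos_angle = min(1.0, abs(sum(a * b for a, b in zip(normal, n))))
    distance = abs(sum(a * b for a, b in zip(n, mean)) + d)
    return math.degrees(math.acos(cos_angle)) <= max_angle_deg and distance <= max_distance_m


def cell_ransac(cells, max_iterations=50, stop_ratio=0.95, rng=None):
    """Return (best plane, indices of inlier cells) for a list of
    (mean, unit normal) candidate cells."""
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    best_plane, best_inliers = None, []
    if len(cells) < 3:
        return best_plane, best_inliers
    for _ in range(max_iterations):
        sample = rng.sample(cells, 3)
        plane = plane_from_means(*(mean for mean, _ in sample))
        if plane is None:
            continue  # degenerate sample, draw again
        inliers = [i for i, cell in enumerate(cells) if supports(cell, plane)]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
            if len(best_inliers) / len(cells) >= stop_ratio:
                break  # enough support, stop early
    return best_plane, best_inliers
```

Because each consensus check runs over candidate cells rather than individual points, the cost of one iteration scales with the number of cells in the ROI.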

In addition, the depth image cells are formed by clustering the data of the 3D real-world space according to the depth image cell to which they belong,

and planes are identified by applying the NDT algorithm on an image-cell basis, after which the model is fitted with RANSAC within the small candidate set.

As described above, the depth image-based real-time floor detection method according to the present invention is robust to depth-image noise while requiring little enough computation to run in real time.

In addition, although a pixel-level rather than cell-level approach was not taken at this stage, taking one would allow finer parts to be detected as floor area. Because the pixels have already been examined through the image cell-based NDT planarity check, most of the data belonging to the plane is already included. That is, the number of points that remain to be checked is itself small. In addition, since only the regions adjacent to the cells included in the floor plane model are checked, the computational burden is small. Since only the residual point pixels at the edges of the floor plane are added, they can be added using the distance to the best model plane.

FIG. 1 is a diagram showing the covariance with its eigenvectors and eigenvalues, illustrating that the covariance takes the form of an ellipse in two dimensions and of an ellipsoid in three dimensions.
FIG. 2 is a flowchart sequentially showing an example of the floor detection process in the depth image-based real-time floor detection method according to the present invention.
FIGS. 3 and 4 are diagrams showing result screens of a general point-based RANSAC floor detection method.
FIG. 5 is a diagram showing a depth image.
FIG. 6 is a diagram showing an example of an initial image for executing the depth image-based real-time floor detection method according to the present invention.
FIG. 7 is a diagram showing an example of an execution image of the depth image-based real-time floor detection method according to the present invention.

Hereinafter, the depth image-based real-time floor detection method according to the present invention is described in detail with reference to the accompanying drawings.

FIG. 2 is a flowchart sequentially showing an example of the floor detection process in the depth image-based real-time floor detection method according to the present invention; FIGS. 3 and 4 are diagrams showing result screens of a general point-based RANSAC floor detection method; FIG. 5 is a diagram showing a depth image; FIG. 6 is a diagram showing an example of an initial image for executing the depth image-based real-time floor detection method according to the present invention; and FIG. 7 is a diagram showing an example of an execution image of the depth image-based real-time floor detection method according to the present invention.

According to the depth image-based real-time floor detection method of the present invention,

the covariance is computed for the set of 3D real-world point data contained in each cell obtained by dividing image space into cells, and it is determined whether the spread of the points has the form of a plane. In two dimensions, the eigenvectors of the covariance of a data cluster give the directions of the two axes along which the spread is smallest and largest, and the eigenvalues give the magnitudes of those axes. The two axes are orthogonal. In three dimensions, the eigenvectors of the covariance of a data cluster give the directions of the three axes of the ellipsoid, the eigenvalues give the magnitude of each axis, and the three axes are mutually orthogonal. The directions and magnitudes of these axes make it possible to distinguish whether the data distribution corresponds to a plane, to a line, or to neither.

According to the present invention, the goals are as follows.

A single 320x240 depth image still shot obtained from a TOF camera is used.

A single still shot is used rather than map-type data in which 3D point cloud data has been accumulated.

The installation position and orientation are approximately known (because the task is floor detection, not general plane finding).

Real-time & Robust Algorithm

Real-time: instead of computing distance information between individual points, the computation is performed per cell, that is, per bundle of points. When the RANSAC algorithm is used, this reduces the maximum number of iterations and the number of comparisons needed for each voting decision, so real-time processing is possible.

Robust: because 3D point clouds carry unpredictable noise, modeling directly on the raw point data produces results that contain errors. When the points are grouped into cells, each cell is judged to be planar or not, and non-planar cells are excluded from the material used for floor modeling. In addition, a nearby object is covered by more points than a distant one, so simple point-distance voting is strongly influenced by it and a model that is not the actual floor is sometimes detected; processing per cell resolves this problem.

The present invention uses a cell-based RANSAC algorithm on the depth image, and inliers close to the plane are then added to the extracted model.

FIG. 2 is a flowchart sequentially showing an example of the floor detection process in the depth image-based real-time floor detection method according to the present invention.

The algorithm process is briefly described with reference to FIG. 2.

In the present invention, with the TOF camera mounted in a certain space, the floor is detected in real time from the depth image within the ROI (region of interest) set in the TOF camera image.

First, after the floor search area is set, the depth image is divided into cells (S2).

Next, the NDT algorithm is applied to each image cell, and its eigenvalues are used to determine whether the cell is planar (S4). The NDT algorithm was originally proposed for map-based spatial partitioning. For example, lidar is presented as a road map-based technology.

Next, only the cells close to the floor criterion are extracted as candidate cells (S6).

Next, it is determined whether the number of candidate cells is greater than the minimum number of candidate cells (S8).

If, in step S8, the number of candidate cells is less than the minimum number of candidate cells, it is determined that there is no floor plane (S7).

If, in step S8, the number of candidate cells is greater than the minimum number of candidate cells,

the floor is modeled with the cell-based RANSAC algorithm (S10). Once processing has reached step S10, an image such as that of FIG. 7 is displayed.

Next, distance-based inliers other than the candidate cells are added (S12).

These steps are processed by a depth image-based real-time floor detection program held in a program storage unit (not shown), as follows.

An image division unit (not shown) sets the floor search area and divides the depth image into cells.

A plane determination unit (not shown) applies the NDT algorithm to each image cell and uses the eigenvalues to determine whether the cell is planar.

A candidate extraction unit (not shown) extracts only the cells close to the floor reference as candidate cells.

A cell determination unit (not shown) determines whether the number of candidate cells is greater than the minimum number of candidate cells.

If, in step S8, the number of candidate cells is smaller than the minimum number of candidate cells, the cell determination unit determines that there is no floor plane.

If, in step S8, the number of candidate cells is greater than the minimum number of candidate cells,

the floor is modeled with cell-based RANSAC.

An inlier addition unit (not shown) adds distance-based inliers other than the candidate cells.

Meanwhile, the cell-based RANSAC floor modeling is performed by a cell-based RANSAC floor modeling program (not shown). That is, three candidate cells are selected at random, and a floor plane is modeled from the mean values of those cells.

The plane angle and distance between the model and each candidate cell are calculated, and the inlier cell condition is evaluated.

It is then determined whether the consensus (support) count is greater than the maximum count obtained so far (MAX COUNT).

If the consensus (support) count is greater than MAX COUNT, the best model and its inlier cells are updated.

It is then determined whether the inlier ratio of the best model is 95% or more.

If the inlier ratio of the best model is 95% or more, the RANSAC iteration is exited.

If the inlier ratio of the best model is less than 95%, it is determined whether the maximum number of iterations (MAX iteration) has been reached.

If the maximum number of iterations has been reached, the RANSAC iteration is exited.
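The control flow of this loop can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `ransac_loop`, the callbacks `fit` and `is_inlier`, the default of 100 iterations, and the fixed random seed are all assumptions introduced here. Only the best-model update, the 95% early exit, and the maximum-iteration exit come from the text.

```python
import random

def ransac_loop(candidates, fit, is_inlier, max_iterations=100, exit_ratio=0.95, seed=0):
    """Generic RANSAC control flow: sample three candidates, fit, count support, keep the best.

    `fit` and `is_inlier` are hypothetical callbacks supplied by the caller.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(max_iterations):                      # exit 2: maximum iteration reached
        picks = rng.sample(range(len(candidates)), 3)    # three distinct candidates
        model = fit([candidates[i] for i in picks])
        if model is None:                                # degenerate sample, try again
            continue
        inliers = [i for i, c in enumerate(candidates) if is_inlier(model, c)]
        if len(inliers) > len(best_inliers):             # consensus count beats the best so far
            best_model, best_inliers = model, inliers    # only indices are stored
        if len(best_inliers) >= exit_ratio * len(candidates):
            break                                        # exit 1: inlier ratio of 95% or more
    return best_model, best_inliers
```

Passing the model-specific parts in as callbacks keeps the two exit conditions visible in one place; in the method described here, `fit` would build a plane from three cell means and `is_inlier` would apply the angle and distance conditions.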

FIGS. 3 and 4 show a general point-based RANSAC floor detection method. In FIG. 3, only the distance between each point and the plane equation modeled by RANSAC is calculated, so the vertically standing partition also acts as an inlier of the RANSAC model.

FIG. 4 is similar to FIG. 3 but shows a case in which the actual floor area is detected, because the floor area is wider than the desk area. That is, FIG. 4 shows a case in which the floor is detected well.

RANSAC has the advantage of producing a robust model that excludes outliers. However, RANSAC-based floor detection on commonly used 3D point cloud data can produce an incorrect result as in FIG. 3, and it cannot filter out the outliers on vertical walls. It also requires many iterations, which prevents real-time processing.

FIG. 5 shows a typical depth image captured by a TOF camera.

FIG. 6 shows an example of an initial image for executing the depth image-based real-time floor detection method according to the present invention.

Referring to FIG. 6, the 320*240 image is shown with an ROI chosen in consideration of the installation position, and the ROI is divided into cells of 10 by 10 pixels.

In the present invention, data can be acquired and used in real time.

FIG. 7 shows an example of an execution image of the depth image-based real-time floor detection method according to the present invention.

Referring to FIG. 7, the RANSAC floor plane inliers are displayed after the NDT transformation within the ROI. The white parts are the inlier cells detected as floor, and the gray parts are inliers that were not floor plane candidates and were added at the end (see step S12).

The light blue parts are the three cells selected for the RANSAC best model (the number of light blue parts changes on every run), and the small circles mark non-planar areas.

Once the floor plane has been marked in step S12, the floor data can be put to good use. Objects in the image (for example, a standing person) are what matter, and recognizing an object reliably requires removing the floor data first.

An inlier is data that fits the model to be detected, and an outlier is data that should not be reflected in the detection.

The present invention proposes a method that raises the success rate while reducing the number of iterations, and in particular a method that can be processed in real time.

In the present invention, steps S2 and S4 reduce the amount of computation.

Floor modeling is then performed with cell-based RANSAC (S112, S114, S116, S118, S120, S122, S124).

Step S114 filters out image portions that may not be planar, such as L-shaped regions and other noise.

In the description below, storing the best model means storing only the indices of the cells that supported it (the consensus set).

Features of the present invention

According to the present invention, the process starts from a known approximate installation position of the camera. For example, when the camera is mounted on the rear of an autonomous vehicle, it is installed at a height of approximately 1 m with its central axis pointing at the horizon, but the pose can change for various reasons: the ground may be sloped, the camera may have been mounted slightly askew, or a rear tire may have lost air so that the vehicle sits slightly lower. The installation angle serves as the reference when searching for the floor plane.

- A depth image is received from the TOF camera. The depth image is 320*240 in size and has 76,800 pixels. If the camera resolution is different, the parameters are changed accordingly.

- An ROI is set in the lower part of the 320*240 image (it can be adjusted depending on whether the installation position shows more or less of the floor).

- The 320*120 ROI is divided into a grid of 10 by 10 pixel cells. The cell size can be changed, but if it is too small, planar features are not extracted well because of noise, so a moderately large size is preferable; with a higher-specification camera the size may be changed to match. Only cells in which more than 50% of the 10 by 10 pixels contain data proceed to the next step. Cells below this ratio are regarded as too small or as noise.

- For each cell that passes the step above, the covariance is obtained using the Normal Distribution Transformation (NDT). The eigenvalues and eigenvectors of the covariance matrix, which describes an ellipsoid, are then calculated.

- The eigenvalues of each cell are sorted by magnitude (λ1>λ2>λ3). If λ1 and λ2 are each at least 5 times larger than λ3, the cell is judged to be planar (this factor may vary with sensor accuracy: the more accurate the sensor, the lower the noise and the larger the differences among the λ values). A planar cell then remains as a floor plane candidate cell if its angle to the reference floor plane given by the installation position is within ±20 degrees. The remaining cells are kept with only their NDT computed (the 20-degree threshold may be reduced if the camera performs well).

- It is determined whether enough candidate plane cells passed the step above to compute a plane. When an object is too close or the camera faces a wall, little of the floor is visible. In that case the algorithm concludes that there is no floor plane and terminates. If the minimum number of floor cells is exceeded, the process continues.

- Three cells are drawn from the set of floor plane candidate cells. The mean of the point set in each cell gives three points, from which a plane is modeled and its normal vector calculated. This plane equation is the RANSAC model against which the floor plane candidate cells are checked. The angle between the floor model and a cell must be within ±10 degrees (not limited to this value; it may be ±5 or ±20 degrees) and the distance between them must be 5 cm or less (not limited to this value) for the cell to be counted in the consensus and added to the inliers. A cell that fails either condition is excluded from the inliers. If the current floor model has a higher consensus than the best-supported floor model so far, the best model is replaced with the current model and the inlier list is updated. This is repeated up to the maximum number of iterations. If, during the iterations, the consensus ratio of the floor plane candidate cells for the model exceeds 95%, the RANSAC algorithm terminates without further iterations.

- Finally, all cells in the ROI other than the inlier cells of the current floor model are checked by distance, and those within 5 cm of the model are also added to the floor as inliers.
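The numeric settings named in the steps above can be gathered into one parameter object, which makes it easier to retune them for a different sensor. The sketch below is illustrative only: the class name `FloorDetectionParams` and all field names are assumptions, `max_iterations = 100` is a hypothetical placeholder because the text gives no value, and `min_candidate_cells = 20` is taken from the detailed description later in the specification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FloorDetectionParams:
    """Example values quoted in the specification; every one may be tuned per sensor."""
    image_width: int = 320
    image_height: int = 240
    roi_height: int = 120               # lower half of the image
    cell_size: int = 10                 # 10 by 10 pixel cells
    min_valid_ratio: float = 0.5        # share of pixels in a cell that must hold data
    eigen_ratio: float = 5.0            # lambda1 and lambda2 vs. lambda3
    max_reference_angle_deg: float = 20.0
    max_model_angle_deg: float = 10.0
    max_distance_m: float = 0.05        # 5 cm
    min_candidate_cells: int = 20       # value given in the detailed description
    exit_ratio: float = 0.95
    max_iterations: int = 100           # hypothetical: not stated in the text

    @property
    def cells_in_roi(self) -> int:
        """Number of cells the ROI is divided into."""
        return (self.image_width // self.cell_size) * (self.roi_height // self.cell_size)
```

With the default values the ROI contains 32 by 12 cells, so at most 384 covariance matrices are computed per frame instead of processing up to 38,400 individual points.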

Hereinafter, the method according to the present invention is described in more detail.

Camera installation position and reference plane initialization

Floor detection is mainly performed before object detection and is used to separate and process objects. This patent aims to find a single floor, and assumes that the installation angle of the camera is roughly known.

The required information is the height of the camera above the floor and the equation of the reference floor plane with the camera pose applied. In the example, the camera was installed at a height of about 1 m, facing straight ahead. The reference parameters of the camera are chosen according to the environment in which the TOF camera is used and the object or position on which it is mounted.

The equation of a plane is ax+by+cz+d=0, where (a,b,c) is the normal vector of the plane. The real-world coordinate axes x, y, z are defined as follows: the x-axis is parallel to and points in the same direction as the u-axis of the image plane, the y-axis is parallel to the v-axis, and the z-axis is parallel to the central axis of the camera and points in its viewing direction. Under these conditions, suppose the camera is installed 1 m above the floor plane with its central axis parallel to the floor plane and with the u-axis of the image parallel to the horizon. The floor plane then has (a,b,c)=(0,1,0), and the equation simplifies to 1y+d=0. Because the camera is installed at a height of 1 m, substituting the floor point (0,1,0) for (x,y,z) gives d=-1. The distance between this reference floor plane and the candidate plane obtained by RANSAC (a distance calculated relative to the camera position) could be used, but the position and angle of the camera may shake, so the reference cannot be treated as fixed. Because this patent takes into account that the installation position and angle of the camera may be inaccurate, the distance to the reference plane equation is not used. If the actual installation position or the reference floor plane changes, the normal vector of the new reference floor plane is obtained and applied. Even if the installation height changes, the values of a, b, c do not change as long as the angle is the same. Therefore the plane angle between the reference plane and a floor candidate cell plane, or between the reference plane and the floor model plane extracted by RANSAC, is unaffected, because only a, b, c are used in the calculation.
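As a worked example of this initialization, the sketch below builds the reference plane coefficients from a mounting height and, as an extension the text does not spell out, a downward pitch angle. The function name `reference_floor_plane`, the `pitch_deg` parameter, and the sign convention (y pointing down the image, positive pitch tilting the camera toward the floor) are assumptions made for illustration.

```python
import math

def reference_floor_plane(height_m, pitch_deg=0.0):
    """Return (a, b, c, d) of the reference floor plane in camera coordinates.

    Assumed frame: x along image u, y along image v (downward), z along the optical axis.
    Assumed convention: pitch_deg > 0 tilts the camera down toward the floor.
    """
    t = math.radians(pitch_deg)
    # In the camera frame the "down" direction is (0, cos t, sin t); floor points p
    # satisfy down . p = height, i.e. a*x + b*y + c*z + d = 0 with d = -height.
    return (0.0, math.cos(t), math.sin(t), -height_m)

a, b, c, d = reference_floor_plane(1.0)   # level camera mounted 1 m above the floor
print((a, b, c, d))                       # (0.0, 1.0, 0.0, -1.0)
```

Changing `height_m` changes only d, which is why the angle tests that use (a, b, c) alone are unaffected by the mounting height.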

With the camera installed as above, if distance could be measured out to infinity, the floor would occupy half of the v-axis of the TOF image, so the region of interest (ROI) was set to 320*120. This value may vary with the installation position and direction. Under the conditions described above, the lower half of the 320*240 image was set as the ROI and divided into cells of 10 by 10 pixels. 3D point cloud data generally contains noise, and if the cells are too small, the noise makes it difficult to obtain planar characteristics even after the NDT transformation. If the sensor performs well and the noise is very small, small cells can still give acceptable results, so the cell size should be adjusted to the specification and performance of the sensor. If the cells are too large, curved data is represented by a single cell and the result may differ from the actual floor area. The division size must therefore be adjusted according to the size of the image received from the TOF sensor and the accuracy of the data.
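The ROI selection and cell division can be sketched with NumPy as below. The function name `split_roi_into_cells` and the array layout (rows, columns, pixel row, pixel column) are assumptions made for illustration; the 120-row ROI and the 10-pixel cell size are the example values from the text.

```python
import numpy as np

def split_roi_into_cells(depth, roi_height=120, cell=10):
    """Cut the bottom `roi_height` rows of a depth image into cell x cell blocks.

    Returns an array of shape (rows, cols, cell, cell), where [r, c] is one image cell.
    """
    h, w = depth.shape
    roi = depth[h - roi_height:, :]                  # lower part of the image
    rows, cols = roi_height // cell, w // cell
    roi = roi[:rows * cell, :cols * cell]            # drop any remainder pixels
    # Split each axis into (block index, offset in block), then bring the
    # two block indices to the front.
    return roi.reshape(rows, cell, cols, cell).swapaxes(1, 2)

depth = np.zeros((240, 320), dtype=np.float32)       # placeholder frame
print(split_roi_into_cells(depth).shape)             # (12, 32, 10, 10)
```

The reshape works on a view of the image where possible, so the division itself adds almost no cost per frame.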

A cell of 10 by 10 pixels holds 100 3D points. In a TOF depth image, however, not every pixel has data. Data may be missing because the target is too close or too far, because of the limits of sensor performance, or because light reflects off a smooth surface without returning or is absorbed. Some cells are therefore partly or wholly empty, and with so little data the covariance of the cell is unlikely to come out planar in the NDT transformation. NDT is calculated only for cells in which at least 3/4 of the pixels have data. The 3/4 criterion can be set differently depending on sensor performance. Cells are filtered for NDT calculation on this basis.

If each pixel of the depth image simply stores the distance from the camera, the camera calibration information is used to convert it into real-world 3D values X, Y, Z in a three-axis Cartesian coordinate system before NDT is performed. NDT stands for Normal Distribution Transformation. For the three-dimensional point set, the mean position m and the 3 by 3 covariance matrix are obtained, and then the eigenvalues and eigenvectors of the covariance matrix. There are three eigenvectors; they are mutually perpendicular principal component vectors and give the directions of three axes. Each eigenvalue gives the magnitude along its eigenvector. The magnitudes of the eigenvalues indicate whether the covariance of the data is disk-shaped, and the eigenvectors, together with the equation of the reference floor plane, give the angle to the reference floor plane. The eigenvalues λ1, λ2, λ3 of each cell are sorted in descending order so that λ1>λ2>λ3. When λ1>λ3×5 and λ2>λ3×5 are both satisfied, the covariance is judged to have the shape of a plane.

If NDT were instead applied after partitioning the real-world 3D point cloud into square cells, a plane would yield a disk rather than an ellipse. That approach needs many spatial partitions depending on the map size and is effective when there is a lot of data, but real-world spatial partitioning is not appropriate for a single depth image from a TOF camera: much of the space is empty, and the number of points inside a partition falls sharply with distance, so accuracy suffers even if a covariance is computed. Image cell-based NDT guarantees the number of data points in a cell. However, the covariance becomes an ellipse rather than a disk as distance increases, so only λ1 versus λ3 and λ2 versus λ3 are compared; comparing λ1 with λ2 is not meaningful.
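The per-cell NDT test can be sketched as below, assuming the pixels of a cell have already been converted to 3D points and invalid pixels removed. The function name `ndt_cell` and its return layout are assumptions made for illustration; the 3/4 fill criterion and the factor of 5 are the example values from the text. Taking the eigenvector of the smallest eigenvalue as the cell normal is a standard property of a planar point set, not something the text states explicitly.

```python
import numpy as np

def ndt_cell(points, n_pixels=100, min_valid_ratio=0.75, eigen_ratio=5.0):
    """NDT statistics of one image cell.

    points: (N, 3) array holding the valid 3D points of the cell.
    Returns (mean, normal, is_planar), or None if the cell holds too little data.
    """
    points = np.asarray(points, dtype=float)
    if len(points) < min_valid_ratio * n_pixels:
        return None                               # sparse cell: skipped
    mean = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)            # 3 by 3 covariance matrix
    evals, evecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    lam3, lam2, lam1 = evals                      # so that lam1 >= lam2 >= lam3
    is_planar = lam1 > eigen_ratio * lam3 and lam2 > eigen_ratio * lam3
    normal = evecs[:, 0]                          # direction of least spread
    return mean, normal, bool(is_planar)
```

Note that λ1 and λ2 are never compared with each other, matching the remark that distant floor cells become elongated ellipses rather than disks.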

For a cell judged to be planar, the angle to the reference floor plane is calculated. The angle between two planes is obtained from the dot product of their normal vectors. When the magnitude of this angle is less than 20 degrees, the cell is added to the list of floor plane candidate cells. A cell that fails this condition is planar but is not a floor candidate; the face of a wall or a box is an example. Distance values are not compared here, because if the installation angle is slightly off, the distance between a cell and the reference floor plane grows with range and cannot be handled with a single threshold. The floor plane candidate cells selected by NDT are counted, and if there are fewer than 20, the algorithm reports that there is no floor and ends. This threshold is a matter of choice: if it is too large the floor detection algorithm will not run, and if it is too small the RANSAC floor plane modeling performs poorly.
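The candidate selection can be sketched as follows. The function name `select_floor_candidates` is an assumption, and so is the use of the absolute value of the dot product: it is added here because the sign of a normal obtained from an eigen-decomposition is arbitrary. The 20-degree limit and the minimum of 20 cells are the example values from the text.

```python
import numpy as np

def select_floor_candidates(normals, ref_normal=(0.0, 1.0, 0.0),
                            max_angle_deg=20.0, min_cells=20):
    """Indices of planar cells whose normal lies within max_angle_deg of the reference.

    Returns None when fewer than min_cells remain ("no floor plane").
    """
    normals = np.asarray(normals, dtype=float)
    ref = np.asarray(ref_normal, dtype=float)
    cos = np.abs(normals @ ref) / (np.linalg.norm(normals, axis=1) * np.linalg.norm(ref))
    angles = np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
    idx = np.flatnonzero(angles < max_angle_deg)
    return idx if len(idx) >= min_cells else None
```

Only the direction (a, b, c) of the reference plane enters the test, so an error in the assumed mounting height has no effect at this stage.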

If the minimum number of candidate cells is met, floor plane detection by RANSAC is applied. RANSAC builds a model from data selected at random from the data set, checks the support for that model, and repeats this to select a model with high support. It can estimate a model robustly from data containing outliers by reducing their influence, but it does not guarantee the optimal model. Using RANSAC, three distinct cells are drawn at random from the floor plane candidate cells, and the general plane equation is calculated from the mean values of the drawn cells. Calling these three points Q, R, S, the vectors QR and QS are calculated and from them the values a, b, c, d in ax+by+cz+d=0. For every floor plane candidate cell, the distance from the cell mean to this plane and the angle between the model plane and the cell plane are calculated. If the distance is 5 cm or less and the angle is 10 degrees or less, the cell is added to the support count and classified as an inlier. If the support count is greater than the count of the previous best model, the current model, its support count, and its inlier list become the new best. If the support (consensus) ratio of the best model exceeds 95%, the RANSAC exit condition, the RANSAC algorithm ends; otherwise the iteration repeats. The angles, minimum cell count, exit condition, and other parameters used here may be changed to suit the environment. The formulas used in detecting the floor with RANSAC are as follows.
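One hypothesis-and-verify step of the cell-based RANSAC can be sketched as below. The function names `plane_from_points` and `consensus` are assumptions, as is the use of unit-length model normals and of the absolute dot product for the angle. The 5 cm and 10-degree limits are the example values from the text.

```python
import numpy as np

def plane_from_points(q, r, s):
    """Plane through three cell means Q, R, S as (unit normal n, offset d), or None if collinear."""
    n = np.cross(r - q, s - q)                    # QR x QS
    length = np.linalg.norm(n)
    if length < 1e-12:
        return None
    n = n / length
    return n, -float(n @ q)                       # n . p + d = 0 for points p on the plane

def consensus(plane, means, normals, max_dist=0.05, max_angle_deg=10.0):
    """Indices of candidate cells that support the plane by both distance and angle."""
    n, d = plane
    dist = np.abs(means @ n + d)                  # n has unit length
    cos = np.abs(normals @ n) / np.linalg.norm(normals, axis=1)
    angle = np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))
    return np.flatnonzero((dist <= max_dist) & (angle <= max_angle_deg))
```

Because each cell contributes one vote regardless of how many points it contains, a nearby object with dense points cannot outvote a larger but more distant floor area.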

Equation of a plane

Equation of a plane from three points

Angle between two planes

Distance between a plane and a point
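In standard form, and using the symbols of the description (plane coefficients a, b, c, d and sample points Q, R, S), these four relations can be written as below. This is a reconstruction from textbook geometry and may differ in notation from the original formulas; the absolute value in the angle formula, which yields the acute angle between the planes, is an assumption.

```latex
% Equation of a plane with normal vector n = (a, b, c)
\[ ax + by + cz + d = 0, \qquad \mathbf{n} = (a, b, c) \]

% Plane through three points Q, R, S
\[ \mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}, \qquad d = -\,\mathbf{n} \cdot Q \]

% Angle between two planes with normals n_1 and n_2
\[ \cos\theta = \frac{\lvert \mathbf{n}_1 \cdot \mathbf{n}_2 \rvert}{\lVert \mathbf{n}_1 \rVert \, \lVert \mathbf{n}_2 \rVert} \]

% Distance from a point P = (x_0, y_0, z_0) to the plane
\[ D = \frac{\lvert a x_0 + b y_0 + c z_0 + d \rvert}{\sqrt{a^2 + b^2 + c^2}} \]
```

The angle formula contains no d term, which is consistent with the statement that a change in mounting height does not affect the plane angle.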

After the best floor model extracted by RANSAC has been obtained as a plane equation, all cells in the ROI are checked and qualifying cells are added to the inliers. This adds, on a distance basis, cells that NDT found to be non-planar. For example, an area containing a small stone is not computed as planar, but the stone is far too small to classify as an object, so the area is added to the floor. Likewise, a cell covering the right-angled junction where a wall meets the floor may also be added to the floor.
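This final pass can be sketched as follows, assuming the mean position of every ROI cell that has NDT statistics is available. The function name `add_distance_inliers` and its arguments are assumptions; the 5 cm limit is the example value given earlier in the specification.

```python
import numpy as np

def add_distance_inliers(plane, roi_cell_means, inlier_idx, max_dist=0.05):
    """Indices of ROI cells, not already inliers, whose mean lies within max_dist of the plane."""
    n, d = plane
    n = np.asarray(n, dtype=float)
    dist = np.abs(np.asarray(roi_cell_means, dtype=float) @ n + d) / np.linalg.norm(n)
    near = np.flatnonzero(dist <= max_dist)
    return np.setdiff1d(near, inlier_idx)         # keep only the newly added cells
```

No angle test is applied here, so a cell that failed the planarity check, such as one containing a small stone, can still be absorbed into the floor.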

This step was carried out per cell rather than per pixel, but a per-pixel pass would allow finer portions to be detected as floor area. Because the pixels have already been examined in image cell-based NDT plane form, the data for most of the planes is already included; that is, the number of points left to check is small. In addition, only the regions adjacent to cells included in the floor plane model need to be checked, so the computational burden is low. Since only the residual point pixels at the edges of the floor plane are added, they can be added using their distance from the best model plane.

S2: Setting the floor search area and dividing the depth image into cells
S4: Applying NDT to each cell and determining whether it is planar using the eigenvalues
S6: Extracting only the cells close to the floor reference as candidate cells
S8: Determining whether the number of candidate cells is greater than the minimum number of candidate cells
S10: Modeling the floor with cell-based RANSAC
S12: Adding distance-based inliers other than the candidate cells

Claims

A depth image-based real-time floor detection method for detecting a floor in real time from a depth image within an ROI of an image from a TOF (Time of Flight) camera mounted so as to photograph objects located within a given space, the method comprising:
setting a floor search area and dividing the depth image into cells;
applying NDT (Normal Distribution Transformation) to each image cell and determining whether the cell is planar using eigenvalues;
extracting only the cells close to a floor reference as candidate cells;
determining whether the number of candidate cells is greater than a minimum number of candidate cells;
determining that there is no floor plane if, in the determining step, the number of candidate cells is smaller than the minimum number of candidate cells;
modeling the floor with cell-based RANSAC if, in the determining step, the number of candidate cells is greater than the minimum number of candidate cells; and
adding distance-based inliers other than the candidate cells.

The method of claim 1,
wherein the step of modeling the floor with the cell-based RANSAC comprises:
randomly selecting three cells and modeling the floor plane from the mean values of the selected candidate cells;
determining an inlier cell condition after calculating the plane angle and distance with respect to the candidate cells;
thereafter, determining whether the consensus (support) count is greater than the maximum count (MAX COUNT);
if, in the determination step, the consensus (support) count is greater than the maximum count (MAX COUNT),
determining the best model and updating the inlier cells;
determining whether the inlier ratio of the best model is 95% or more;
if the inlier ratio of the best model is 95% or more, exiting the RANSAC iteration;
if the inlier ratio of the best model is less than 95%, determining whether the maximum (MAX) iteration has been reached; and
if the maximum (MAX) iteration has been reached, exiting the RANSAC iteration.
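For illustration only, the loop recited in this claim might be sketched as below. The cell representation (a mean point paired with a unit normal), the function names, the fixed random seed and the iteration limit of 100 are assumptions made for the example; the ±10 degree and 5 cm limits are taken from the thresholds stated in claim 5, and the 95% exit ratio from this claim.

```python
import math
import random
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]
Cell = Tuple[Vec, Vec]  # (mean point, unit normal): hypothetical representation


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_from_points(p0: Vec, p1: Vec, p2: Vec) -> Optional[Tuple[Vec, float]]:
    """Return (unit normal, d) with normal . x + d = 0, or None if collinear."""
    n = _cross(_sub(p1, p0), _sub(p2, p0))
    length = math.sqrt(_dot(n, n))
    if length < 1e-12:
        return None
    n = (n[0] / length, n[1] / length, n[2] / length)
    return n, -_dot(n, p0)


def cell_ransac(cells: List[Cell], max_iter: int = 100,
                max_angle_deg: float = 10.0, max_dist: float = 0.05,
                stop_ratio: float = 0.95, seed: int = 0):
    """Fit a floor plane to cell means; returns (model, inlier indices)."""
    if len(cells) < 3:
        return None, []
    rng = random.Random(seed)
    cos_limit = math.cos(math.radians(max_angle_deg))
    best_model, best_inliers = None, []
    for _ in range(max_iter):
        a, b, c = rng.sample(cells, 3)          # three random candidate cells
        model = plane_from_points(a[0], b[0], c[0])
        if model is None:
            continue
        normal, d = model
        # Inlier condition: both the angle and the distance limits must hold.
        inliers = [i for i, (mean, cell_n) in enumerate(cells)
                   if abs(_dot(normal, cell_n)) >= cos_limit
                   and abs(_dot(normal, mean) + d) <= max_dist]
        if len(inliers) > len(best_inliers):    # more support than best so far
            best_model, best_inliers = model, inliers
            if len(inliers) >= stop_ratio * len(cells):
                break                           # 95% or more: exit early
    return best_model, best_inliers
```

Because each hypothesis is scored against cells rather than individual pixels, the inner loop runs over at most a few hundred items per iteration under these assumptions.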

The method of claim 1, wherein the covariance of the set of 3D real-world point data that is divided into cells in image space and contained in one cell is calculated to determine whether the shape in which the points are spread is planar.

The method of claim 1, wherein, in two dimensions, the eigenvectors of the covariance of a data cluster indicate the directions of the two axes with the smallest and largest spread, the eigenvalues indicate the sizes of those axes, and the two axes are orthogonal; in three dimensions, the eigenvectors of the covariance of a data cluster indicate the directions of the three axes of an ellipsoid, the eigenvalues indicate the size of each axis, and the three axes are mutually orthogonal; and this is used to distinguish whether the distribution of the data corresponds to a plane, a line, or neither.

A depth image-based real-time floor detection method for detecting a floor in an ROI area of a TOF (Time Of Flight) camera image, with the TOF camera mounted so that objects located within a certain space can be photographed, the method comprising:
a first step of receiving a depth image from the TOF camera;
a second step of setting an ROI area in the lower part of an image of a certain size;
a third step of dividing the ROI area into a grid of square pixel cells of a certain size;
a fourth step of calculating, for each cell that has passed the above steps, the covariance using NDT (Normal Distribution Transformation), and calculating the eigenvalues and eigenvectors of the covariance matrix, which describes an ellipsoid;
a fifth step of determining whether the candidate planar cells that have passed the above procedure are numerous enough to calculate a plane;
a sixth step of extracting three cells from the set of floor plane candidate cells;
a seventh step of forming three points from the average values of the point sets in these cells, modeling a plane with them, and calculating its normal vector;
an eighth step in which the plane equation serves as the RANSAC model against which the floor plane candidate cells are examined, a cell being added to the inliers only if the angle between the floor model and the cell is within ±10 degrees and the distance between the two is less than 5 cm; and
a ninth step, after the eighth step, of checking by distance all cells in the ROI other than the inlier cells of the current floor model and adding those within 5 cm as floor inliers.
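The second and third steps (a lower-image ROI divided into a square cell grid) can be pictured with the short sketch below. The function name, the choice to describe a cell by its top-left corner and side length, and the example image and cell sizes are all assumptions; the claim only requires "a certain size".

```python
from typing import List, Tuple


def split_roi_into_cells(height: int, width: int, roi_top: int,
                         cell_size: int) -> List[Tuple[int, int, int]]:
    """List (row, col, size) for every full square cell in rows [roi_top, height).

    Partial cells at the right or bottom border are skipped (an assumption).
    """
    cells = []
    for row in range(roi_top, height - cell_size + 1, cell_size):
        for col in range(0, width - cell_size + 1, cell_size):
            cells.append((row, col, cell_size))
    return cells


if __name__ == "__main__":
    # Hypothetical 320x240 depth image, ROI = lower half, 40-pixel cells.
    grid = split_roi_into_cells(240, 320, 120, 40)
    print(len(grid), grid[0], grid[-1])
```

Each returned cell would then be passed to the covariance and eigenvalue computation of the fourth step.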

[Claim 6] The method of claim 5, wherein, in the fourth step, the eigenvalues of each cell are sorted by magnitude before use.

The method of claim 5, wherein a cell that fails either of the two conditions in the eighth step is excluded from the inliers; if the current floor model has higher consensus (support) than the existing floor model that had obtained the most support, the best model is replaced with the current model and the inlier list is updated; the iteration is repeated up to the maximum number of iterations; and if, during iteration, the consensus (support) ratio of the floor plane candidate cells for the model exceeds 95%, the RANSAC algorithm is terminated without further repetition.

6. The method of claim 5,
wherein the data in the 3D real-world space is clustered on a depth image cell basis,
and planarity is determined after image cell-based NDT and the floor is modeled with RANSAC over a small candidate group.