KR100930626B1

KR100930626B1 - Object Posture Recognition Method of Robot with Stereo Camera

Info

Publication number: KR100930626B1
Application number: KR1020070084856A
Authority: KR
Inventors: 전세웅; 김봉석; 이종배; 박창우
Original assignee: 전자부품연구원
Priority date: 2007-08-23
Filing date: 2007-08-23
Publication date: 2009-12-09
Also published as: KR20090020251A

Abstract

The present invention relates to a method by which a robot equipped with a stereo camera recognizes the pose of an object. Three-dimensional point cloud data (3D Point Cloud) consisting of effective feature points (constrained features) of the object to be recognized is acquired and stored in a database. 3D point cloud data consisting of the effective feature points of an image of the object captured with the stereo camera is then acquired, and the two sets of 3D point cloud data are matched to recognize the pose of the captured object.

According to the present invention, matching the two images with the ICP (Iterative Closest Point) algorithm allows recognition regardless of the position from which the object is viewed, and allows accurate pose recognition even when the object is partially occluded. In addition, finding and matching the same features in two images of the same object reduces the running time of the ICP algorithm.

Object Recognition, Robot, Stereo Camera, ICP Algorithm, Image Matching, Pose

Description

Method for object pose recognition of robot with stereo camera

The present invention relates to a method by which a robot equipped with a stereo camera recognizes the pose of an object, and more particularly to an object pose recognition method that allows a robot in a home or factory setting, given that the type of the object is already known, to accurately recognize the pose of the object and grasp it.

The robot industry, which began with industrial robots, has recently been extending its reach beyond military and scientific applications to household robots.

For a robot to perform a given task, object recognition technology is essential. Object recognition refers to technology that uses image processing to distinguish various objects placed in a space, such as cups, desks, chairs, and telephones.

For example, when a household robot is told to "bring the bread that is on the table," it must recognize what the "table" and the "bread" are and where they are.

Object recognition has been actively studied since computers became widespread in the 1970s. In the 1980s it was used mainly for part inspection in industrial vision, based on two-dimensional shape matching, and from the late 1980s three-dimensional model-based object recognition was actively studied. In particular, the alignment technique was successfully applied to the recognition of 3D polyhedra.

From the mid-1990s, as image-based techniques emerged, object recognition research began in earnest. Object recognition using principal component analysis (PCA) is one example.

Among object recognition technologies, 3D recognition proceeds through the following steps: extracting 3D information about the object, extracting the geometric features of the object from the 3D image data, verifying the validity of the extracted geometric features, comparing the measured data with the data in the database, and recognizing the object.

Conventional object recognition performs matching using the 3D point cloud data of the whole object, both for the object in the database and for the object to be recognized, and therefore takes a long time to recognize the object.

A preferred embodiment of the object pose recognition method of a robot equipped with a stereo camera according to the present invention comprises: acquiring 3D point cloud data (3D Point Cloud) of the entire object to be recognized and storing it in a database; photographing the object with a stereo camera and acquiring 3D point cloud data from the captured image; and matching the 3D point cloud data of the entire object stored in the database with the 3D point cloud data of the captured image to recognize the pose of the photographed object, wherein the 3D point cloud data consists of effective feature points (constrained features) of the object to be recognized.

Here, acquiring the 3D point cloud data of the entire object comprises: acquiring 3D point cloud data of the front of the object; acquiring 3D point cloud data of the rear of the object; and merging the front and rear 3D point cloud data.

Acquiring the 3D point cloud data comprises: obtaining a disparity map image from an image of the object to be recognized; detecting an edge image from the image of the object; and performing an AND operation on the disparity map image and the edge image.

Here, detecting the edge image comprises: smoothing the image of the object with a Gaussian filter; obtaining the gradient magnitude and gradient direction of the image with a differential operator; applying non-maximum suppression to the image according to the gradient direction; and detecting the edges of the image with a hysteresis technique.

According to the present invention, the ICP (Iterative Closest Point) algorithm is used to match the 3D point cloud data of the object currently being photographed with the full 3D point cloud data of the same object stored in the database. This allows recognition regardless of the position from which the object is viewed, and allows accurate pose recognition even when the object is partially occluded.

In addition, finding and matching the same features in two images of the same object reduces the amount of 3D point cloud data, which in turn reduces the running time of the ICP algorithm.

Because the position and pose of the object can be recognized accurately, tasks such as part inspection, automatic assembly, and welding can be performed efficiently.

Hereinafter, the object pose recognition method of a robot equipped with a stereo camera according to the present invention is described in detail with reference to FIGS. 1 to 5.

FIG. 1 is a flowchart showing the overall flow of the object pose recognition method of a robot equipped with a stereo camera according to the present invention.

As shown in the figure, the 3D Point Cloud (3D point cloud data) of the entire object that the robot is to grasp is first acquired and stored in the database (S100). Here, the 3D Point Cloud consists of effective feature points (constrained features) of the object.

The 3D Point Cloud DB of the entire object is generated as follows. The robot's stereo camera images the front of the object to obtain the 3D Point Cloud of the front, then images the rear of the object to obtain the 3D Point Cloud of the rear, and the two are precisely merged to obtain the full 3D Point Cloud of the object.

Next, the robot's stereo camera is used to acquire the 3D Point Cloud of the object to be recognized that is currently in the space (S110). Here, the 3D Point Cloud consists of effective feature points (constrained features) of the object to be recognized.

The process of acquiring the 3D Point Cloud of the object in steps S100 and S110 is described in detail later.

The 3D Point Cloud of the object to be recognized is then matched with the 3D Point Cloud stored in the database, which detects how far the pose of the object currently in the space has changed from the reference pose stored in the database (S120).

Because the 3D Point Clouds, which consist of spatial coordinates (X, Y, Z), are acquired from different positions and with the object in different poses, a mathematical method is needed to match them. The algorithm used for this is the ICP (Iterative Closest Point) algorithm.

The ICP algorithm has the advantage that it can be applied even when the correspondence between the two sets of 3D point cloud data is unknown.

The ICP algorithm yields the vectors that relate the two sets of 3D point cloud data, namely a rotation vector and a translation vector. From these it can be determined precisely how far the pose of the object currently in the space deviates from the reference pose of the object stored in the database.

The process of matching the 3D Point Cloud of the object to be recognized with the 3D Point Cloud stored in the database using the ICP algorithm is as follows.

i) Within a predefined region of the two input image data sets, find for each data point the closest corresponding model point, forming a point set. The distance between points is computed as the Euclidean distance.

ii) Compute the 3D transformation parameters, i.e., the rotation vector and the translation vector, that minimize the distance between the paired point sets.

iii) Transform the data points by applying the rotation vector and translation vector obtained in step ii), bringing the point sets into registration.

In other words, the target data is moved by the translation vector while the closest point sets in the two image data sets are found, and the rotation vector that minimizes their distance is computed, so that the two data sets are progressively brought into agreement.

iv) Repeat the above steps until the distance error is minimized, so that the two image data sets coincide.
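Steps i) to iv) can be sketched in Python with NumPy as follows. This is an illustrative sketch under stated assumptions, not the implementation of the invention: the function names (best_rigid_transform, icp), the brute-force nearest-neighbor search, the SVD-based least-squares solution for step ii), and the stopping tolerance are all assumptions, and the rotation is represented as a 3x3 matrix rather than the rotation vector named in the text.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (SVD method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    det = np.linalg.det(Vt.T @ U.T)
    D = np.diag([1.0, 1.0, 1.0 if det > 0 else -1.0])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(data, model, max_iter=50, tol=1e-8):
    """Align data (N, 3) to model (M, 3); returns accumulated R, t and the last mean error."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = data.astype(float).copy()
    prev_err = np.inf
    for _ in range(max_iter):
        # step i: closest model point for every data point (Euclidean distance)
        d2 = ((current[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
        nearest = model[d2.argmin(axis=1)]
        err = np.sqrt(d2.min(axis=1)).mean()
        # step iv: stop once the distance error no longer improves
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        # step ii: transform that minimizes the distance between the paired points
        R, t = best_rigid_transform(current, nearest)
        # step iii: apply the transform to the data points and accumulate it
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, err
```

The returned R and t describe how the observed cloud must be rotated and translated to coincide with the stored model, which is the pose change the description refers to. The brute-force search costs N x M distance evaluations per iteration, so the running time grows directly with the number of points in each cloud.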

FIG. 2 schematically shows this 3D image registration process using the ICP algorithm.

In the present invention, a reference model for which all 3D coordinates of the object are known is first established. The pose of a given 3D object is then recognized by computing how far the 3D coordinates of the object currently obtained from the stereo camera have changed from the reference model. The ICP algorithm is used to match the reference model with the currently obtained image.

Furthermore, when the 3D Point Cloud of an object is acquired in the present invention, the point cloud data used consists of the effective feature points (constrained features) of the object rather than points covering its entire surface.

FIG. 3 is a flowchart showing the process of acquiring the 3D Point Cloud according to the present invention. This process is used both when generating the 3D Point Cloud DB of the entire object (i.e., when acquiring the 3D Point Cloud of the front and of the rear of the object) and when acquiring the 3D Point Cloud of the object to be recognized that is currently in the space.

First, a disparity map is extracted from the image of the object to be recognized captured with the robot's stereo camera (S200).

In the present invention, the position of an object in 3D space is obtained through stereo vision processing with a stereo camera. Stereo vision processing is a technique that uses two or more images acquired from different viewpoints, together with the camera parameters, to sense distances in the observed space and the 3D shape of the observed object.

The basic principle of stereo vision is described below with reference to FIG. 4.

Let b (baseline) be the distance between the two viewpoints, f the focal length of the two lenses, d_l the distance between the center of the left image and the image of the object in the left image, and d_r the distance between the center of the right image and the image of the object in the right image.

The difference in the position of the subject between the two images captured by two cameras placed a distance b (baseline) apart is called the disparity, and is given by d = d_l - d_r.

Referring to Equation 1, the actual distance r from the camera to the object can be obtained from the disparity d, the focal length f of the lens, and the distance b between the two viewpoints.
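Equation 1 itself is not reproduced in this text. Under the standard parallel-camera triangulation model, the relation between these quantities is r = b * f / d, and the short Python sketch below assumes that form. The function name depth_from_disparity and the numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(d, f, b):
    """Distance r to the object, assuming r = b * f / d.

    d and f are in the same units (e.g. pixels); r comes out in the units of b.
    """
    if d <= 0:
        raise ValueError("disparity must be positive for a point at finite distance")
    return b * f / d

# Hypothetical values: baseline 0.12 m, focal length 700 px, disparity 35 px.
r = depth_from_disparity(d=35.0, f=700.0, b=0.12)   # about 2.4 m
```

Because r is inversely proportional to d, nearby objects produce large disparities and distant objects produce small ones.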

The disparity map is obtained through stereo image matching, which applies this basic principle of stereo vision to find corresponding points between the two images.

Among stereo image matching methods, correlation-based stereo finds corresponding points by computing the correlation between pixels of the two images, without extracting feature points from the images.

A window is set around the point to be matched in one image, and a window of the same size is applied to the other image and moved along the horizontal line. The best point within the search range is then selected according to a similarity criterion.
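The window search described above might look like the following Python sketch for a single pixel. The text does not name the similarity criterion, so the sum of absolute differences (SAD) used here is an assumption, as are the function name match_along_row, the window half-size, the search range, and the convention that a point at column x in the left image appears at column x - d in the right image of a rectified pair.

```python
import numpy as np

def match_along_row(left, right, y, x, half=2, max_disp=16):
    """Disparity whose window in `right` best matches the window at (y, x) in `left`."""
    ref = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        xr = x - d                       # same row, shifted along the horizontal line
        if xr - half < 0:
            break                        # window would leave the image
        cand = right[y - half:y + half + 1, xr - half:xr + half + 1].astype(float)
        cost = np.abs(ref - cand).sum()  # SAD: lower means more similar
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

The caller is expected to pass a pixel (y, x) at least `half` pixels away from the image border so that the reference window is complete.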

Using the correlation technique in this way, matching values over the range of disparities can be obtained from the two images, and these are stored in a three-dimensional disparity space as shown in FIG. 5.

The disparity map is obtained by mapping, from this three-dimensional space, the disparity that has the optimal value at each position. The 3D coordinates and distance information of the object can be obtained from the disparity map.
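One way to picture the three-dimensional disparity space is as an array indexed by (row, column, disparity) that holds a matching cost for every candidate, with the disparity map taken as the best candidate at each pixel. The Python sketch below follows that reading. The SAD cost, the edge-padded box window, and the name disparity_map are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def disparity_map(left, right, max_disp=16, half=2):
    """Build a (row, col, disparity) cost volume and pick the lowest-cost disparity."""
    left, right = left.astype(float), right.astype(float)
    h, w = left.shape
    k = 2 * half + 1
    volume = np.full((h, w, max_disp + 1), np.inf)   # the 3D disparity space
    for d in range(max_disp + 1):
        # per-pixel cost of pairing left column x with right column x - d
        diff = np.abs(left[:, d:] - right[:, :w - d])
        # aggregate the cost over a k x k window (box filter on an edge-padded array)
        padded = np.pad(diff, half, mode="edge")
        agg = sum(padded[i:i + h, j:j + (w - d)] for i in range(k) for j in range(k))
        volume[:, d:, d] = agg
    return volume.argmin(axis=2)                     # optimal disparity per pixel
```

Columns closer than d pixels to the left border have no valid candidate at disparity d, so their cost stays infinite and is never selected.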

That is, the disparity map expresses the distance information of the target object as the result of stereo image matching, displaying near objects brightly and more distant objects more darkly.

Next, the edges of the object to be recognized are extracted (S210).

The edges of the object are preferably extracted with the Canny edge algorithm.

The Canny edge algorithm performs well because it satisfies three criteria: i) good detection, meaning that most of the real edges present in the image are extracted; ii) good localization, meaning that the extracted edges lie as close as possible to the edges in the actual image; and iii) minimum response, meaning that a given edge in the image is marked only once and image noise does not create false edges.

In the present invention, the edges of the object to be recognized are extracted with the Canny edge algorithm as follows.

step 1. First, the image is smoothed with a Gaussian filter to reduce noise. The larger the Gaussian mask, the lower the sensitivity to noise.
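Step 1 can be sketched in Python with NumPy as a separable Gaussian filter. The mask size, the sigma value, the edge padding, and the function names are assumptions chosen for illustration. The text only states that a Gaussian filter is used and that a larger mask lowers the sensitivity to noise.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 1D Gaussian mask of odd length `size`."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_smooth(image, size=5, sigma=1.0):
    """Smooth a 2D image with a separable Gaussian; output has the input's shape."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    # filter along rows, then along columns
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
```

Increasing `size` and `sigma` averages over a wider neighborhood, which suppresses more noise at the cost of blurring fine edges.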

step 2. The gradient magnitude is computed with the x-axis and y-axis Sobel operators, and the gradient direction is obtained from the x and y components produced by the Sobel operators.
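A minimal Python sketch of step 2, using the standard 3x3 Sobel masks. The helper names filter3x3 and sobel_gradient and the edge padding at the image border are assumptions made for illustration.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Correlate the image with a 3x3 kernel; edge padding keeps the shape."""
    h, w = image.shape
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def sobel_gradient(image):
    """Return the gradient magnitude and direction (radians) of a 2D image."""
    gx = filter3x3(image, SOBEL_X)   # x-axis component
    gy = filter3x3(image, SOBEL_Y)   # y-axis component
    return np.hypot(gx, gy), np.arctan2(gy, gx)
```

For a vertical step edge, gy is zero and the direction is 0 radians, i.e., the gradient points across the edge along the x axis.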

step 3. Non-maximum suppression is applied along the determined gradient direction. Among the smoothed pixel values lying along the gradient direction, all but the maximum are set to 0, which yields minimal (thin) edges.
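Step 3 might be sketched in Python as follows. Quantizing the gradient direction into four sectors (0, 45, 90 and 135 degrees) and comparing each pixel with its two neighbors along that direction is a common way to implement non-maximum suppression and is assumed here. The function name is hypothetical, and image rows are taken to increase downward.

```python
import numpy as np

def non_max_suppression(mag, direction):
    """Keep a pixel only if it is the largest along its gradient direction (radians)."""
    h, w = mag.shape
    out = np.zeros((h, w))
    angle = np.rad2deg(direction) % 180.0       # fold direction into [0, 180)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:          # gradient along x: left/right neighbors
                n1, n2 = mag[y, x - 1], mag[y, x + 1]
            elif a < 67.5:                      # 45 degrees: up-left / down-right
                n1, n2 = mag[y - 1, x - 1], mag[y + 1, x + 1]
            elif a < 112.5:                     # gradient along y: up/down neighbors
                n1, n2 = mag[y - 1, x], mag[y + 1, x]
            else:                               # 135 degrees: up-right / down-left
                n1, n2 = mag[y - 1, x + 1], mag[y + 1, x - 1]
            if mag[y, x] >= n1 and mag[y, x] >= n2:
                out[y, x] = mag[y, x]           # local maximum survives
    return out
```

Border pixels are left at 0 in this sketch because they lack a full set of neighbors.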

step 4. Edges are finally determined by a technique called hysteresis. It prevents parts of an edge from being removed, as would happen with a single threshold when the pixel values along the edge vary widely.

Accordingly, two thresholds, Low and High, are used: values greater than High are regarded as edges and values smaller than Low are regarded as non-edges. Values between Low and High are regarded as edges if a neighboring value is at or above High.
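The two-threshold rule can be sketched in Python as below. Note one assumption that goes slightly beyond the text: as in common Canny implementations, a pixel between Low and High is accepted when it is connected, possibly through a chain of other in-between pixels, to a pixel above High, using 8-connectivity. The function name is hypothetical.

```python
import numpy as np
from collections import deque

def hysteresis(mag, low, high):
    """Boolean edge map from gradient magnitudes using Low/High thresholds."""
    strong = mag > high                  # definitely edges
    candidate = mag >= low               # anything below Low is a non-edge
    edges = strong.copy()
    h, w = mag.shape
    queue = deque(zip(*np.nonzero(strong)))
    while queue:                         # grow edges outward from strong pixels
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and candidate[ny, nx] and not edges[ny, nx]):
                    edges[ny, nx] = True
                    queue.append((ny, nx))
    return edges
```

An isolated pixel between the two thresholds is discarded, while the same value lying on a contour that contains a strong pixel is kept, which is how the method avoids breaking up edges of uneven strength.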

The disparity map image and the edge image of the object to be recognized are then ANDed to obtain the 3D Point Cloud (S220).

That is, a 2D disparity map containing 3D information is extracted from the image captured with the stereo camera, 2D edge data is extracted with the Canny edge algorithm, and the disparity map and the edge data are then ANDed.

The result of ANDing the disparity map image with the Canny edge image is the Feature Constrained Edge. This becomes the 3D Point Cloud of the captured image, and it carries the 3D information contained in the disparity map.
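One possible reading of the AND operation is sketched below in Python: the edge image acts as a mask that keeps only the disparity values lying on edges, and each kept pixel is converted to a 3D point. The back-projection through a pinhole model with focal length f in pixels, baseline b, and principal point (cx, cy) is an assumption that the text does not spell out, and the function name constrained_point_cloud is hypothetical.

```python
import numpy as np

def constrained_point_cloud(disparity, edges, f, b, cx, cy):
    """3D points (N, 3) for pixels that are edges AND have a valid disparity."""
    mask = edges.astype(bool) & (disparity > 0)   # the AND of the two images
    ys, xs = np.nonzero(mask)
    d = disparity[ys, xs].astype(float)
    z = f * b / d                                 # distance from disparity (assumed model)
    x = (xs - cx) * z / f                         # back-project pixel column
    y = (ys - cy) * z / f                         # back-project pixel row
    return np.column_stack([x, y, z])
```

Since only edge pixels survive the mask, the resulting cloud contains far fewer points than one built from every pixel of the disparity map.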

Thus the present invention is characterized, in the way it acquires the 3D Point Cloud, by extracting a disparity map image, extracting a Canny edge image, and then ANDing the extracted disparity map image with the Canny edge image.

Acquiring the 3D Point Cloud in this way reduces the number of points, which can greatly reduce the running time of the ICP (Iterative Closest Point) algorithm.

Because the 3D Point Cloud acquired according to the present invention consists of effective feature points (constrained features) of the object to be recognized, it still reflects the 3D information of the object well even though it contains fewer points, and therefore does not impair the matching of the two images by the ICP algorithm.

Previously, when the number of points in the 3D Point Cloud was reduced to shorten the running time of the ICP algorithm, the 3D Point Cloud no longer properly reflected the 3D information of the object and matching was poor. The present invention resolves this problem.

Although the present invention has been described in detail above through representative embodiments, those of ordinary skill in the art to which the invention pertains will understand that various modifications can be made to the embodiments described above without departing from the scope of the invention.

The scope of the present invention should therefore not be limited to the described embodiments, but should be determined by the claims below and their equivalents.

FIG. 1 is a flowchart showing the overall flow of the object pose recognition method of a robot equipped with a stereo camera according to the present invention.

FIG. 2 is a diagram showing the 3D image registration process using the ICP algorithm according to the present invention.

FIG. 3 is a flowchart showing the process of acquiring the 3D Point Cloud according to the present invention.

FIG. 4 is a diagram showing the basic principle of stereo vision according to the present invention.

FIG. 5 is a diagram showing how, in the correlation-based stereo image matching technique of the present invention, the values over the disparity range of the two images are stored in a three-dimensional disparity space.

Claims

Acquiring 3D point cloud data (3D Point Cloud) of the entire object to be recognized and storing it in a database;

Photographing the object to be recognized with a stereo camera, and then acquiring 3D point cloud data from the captured image; and

Matching the 3D point cloud data of the entire object stored in the database with the 3D point cloud data of the captured image to recognize the pose of the photographed object,

wherein the 3D point cloud data consists of effective feature points (constrained features) of the object to be recognized, and

wherein acquiring the 3D point cloud data of the entire object to be recognized comprises:

Acquiring 3D point cloud data of the front of the object to be recognized;

Acquiring 3D point cloud data of the rear of the object to be recognized; and

Merging the 3D point cloud data of the front of the object with the 3D point cloud data of the rear of the object.

(Deleted)

The method of claim 1,

wherein the matching of the 3D point cloud data of the entire object stored in the database with the 3D point cloud data of the captured image is performed using the ICP (Iterative Closest Point) algorithm.

The method of claim 3,

wherein recognizing the pose of the photographed object

uses the rotation vector and the translation vector obtained as a result of performing the ICP (Iterative Closest Point) algorithm.

The method of claim 1 or 3,

wherein acquiring the 3D point cloud data comprises:

Obtaining a disparity map image from an image of the object to be recognized;

Detecting an edge image from the image of the object to be recognized; and

Performing an AND operation on the disparity map image and the edge image.

The method of claim 5,

wherein the disparity map image is obtained

using a correlation-based stereo image matching technique.

The method of claim 5,

wherein the edge image is detected

using the Canny edge algorithm.

The method of claim 7,

wherein detecting the edge image comprises:

Smoothing the image of the object to be recognized with a Gaussian filter;

Obtaining the gradient magnitude and gradient direction of the image with a differential operator;

Applying non-maximum suppression to the image according to the gradient direction; and

Detecting the edges of the image with a hysteresis technique.

The method of claim 8,

wherein the differential operator is a Sobel operator.