KR101057936B1

KR101057936B1 - 3D Object Retrieval and Pose Estimation Based on Object's Appearance at Arbitrary Angles

Info

Publication number: KR101057936B1
Application number: KR1020090033527A
Authority: KR
Inventors: 황인준; 탁윤식
Original assignee: 고려대학교 산학협력단
Priority date: 2009-02-16
Filing date: 2009-04-17
Publication date: 2011-08-18
Also published as: KR20100093445A

Abstract

The present invention relates to a method for 3D object search and pose estimation based on the appearance of an object viewed from an arbitrary angle. The method comprises: (a) calculating a distance curve as appearance information for every 3D object stored in a database, combining images that are mirror images of one another as a result of camera rotation into a single feature, removing duplicate appearance patterns according to the symmetry of the object, and constructing an index from the distance curves of the remaining images; (b) calculating the distance curve of a query image and searching for similar objects based on the most similar appearance by comparing the query distance curve against the 3D object index structure constructed in step (a); (c) extracting candidate poses of the object, namely poses whose images have the same appearance as the most similar appearance found in step (b); and (d) applying the SIFT algorithm between the query image and the images extracted from the candidate poses of step (c), and taking the pose finally judged most similar as the estimated pose of the object in the query image.

Pose estimation, 3D object search, appearance-based search, distance curve

Description

SHAPE-BASED 3D OBJECT DETECTION AND POSE ESTIMATION METHOD USING ARBITRARY VIEW IMAGE

The present invention relates to a 3D object search and pose estimation method, and more particularly to a 3D object search and pose estimation method based on the appearance of an object viewed from an arbitrary angle.

As 3D data is generated and used in an increasingly wide range of fields, methods for searching for 3D objects and predicting their pose from a single query image are being actively studied in areas such as security, CAD, virtual reality, medical image analysis, robot automation, and place recognition. To achieve this with a single query image, every possible image of an object, depending on the camera position, must be considered. This requires an enormous amount of computation time and storage space, so an efficient index structure is needed.

One approach to 3D object search with a single query image uses the appearance information of the object. Much as a person recognizes objects, this approach yields satisfactory results in a relatively short time. However, because different poses of an object can produce the same appearance, appearance information alone cannot determine the exact pose of the object.

Another widely used method for 3D object search and pose prediction with a single query image is the Scale-Invariant Feature Transform (SIFT) algorithm. SIFT can search accurately even when the image is rotated or scaled, but because it extracts and uses a large amount of feature information to do so, both preprocessing and similarity comparison take a long time.

Speeded-Up Robust Features (SURF) was proposed to speed up the SIFT algorithm. SURF performs similarity comparison much faster than SIFT, but because it achieves this by using less feature information, its accuracy is considerably lower.

An effective index structure is therefore needed to compensate for these characteristics of SIFT. Various index structures for SIFT have been proposed, such as a binary tree based on the number of recognized points, but because no generalized SIFT index structure exists, a considerable number of related studies still compare the query image against the 3D data sequentially.

Accordingly, searching for 3D objects and predicting their pose in real time urgently requires a more effective index structure and a faster search method.

An object of the present invention is to provide a 3D object search method that efficiently reduces the amount of data needed for similar-object search without affecting search accuracy, and that achieves a fast search speed by using only the appearance information of 3D objects.

Another object of the present invention is to provide a pose estimation method for a 3D object that can predict the pose most similar to the query image.

To achieve the above objects, the 3D object search and pose estimation method of the present invention, based on the appearance of an object at an arbitrary angle, comprises: (a) calculating a distance curve as appearance information for every 3D object stored in a database, combining images that are mirror images of one another as a result of camera rotation into a single feature, removing duplicate appearance patterns according to the symmetry of the object, and constructing an index from the distance curves of the remaining images; (b) calculating the distance curve of a query image and searching for similar objects based on the most similar appearance by comparing the query distance curve against the 3D object index structure constructed in step (a); (c) extracting candidate poses of the object, namely poses whose images have the same appearance as the most similar appearance found in step (b); and (d) applying the SIFT algorithm between the query image and the images extracted from the candidate poses of step (c), and taking the pose finally judged most similar as the estimated pose of the object in the query image.

With the above configuration, the 3D object search and pose estimation method of the present invention enables fast and accurate 3D object search and pose estimation.

Searching for a 3D object with a single camera image as the query requires every possible image of the object to be considered, so an excessive amount of data must be examined during similar-object search. The present invention proposes a method that uses the symmetry of objects to reduce this amount of data dramatically without affecting search accuracy.

The present invention also proposes an appearance-based similar-object search method that uses only the appearance information of the various camera images of a 3D object. Furthermore, because using appearance information alone can return multiple results with the same appearance, the present invention proposes a method for predicting, among the possible poses that share that appearance, the pose closest to the query image.

It should be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the claimed invention. Reference numerals are indicated in detail in the preferred embodiments of the present invention, examples of which are shown in the accompanying drawings. Wherever possible, the same reference numerals are used in the description and the drawings to refer to the same or like parts. Hereinafter, embodiments of the present invention are described with reference to the accompanying drawings so that those of ordinary skill in the art can readily practice the technical idea of the invention.

FIG. 1 is a flowchart illustrating the 3D object search and pose estimation method according to the present invention. Referring to FIG. 1, a distance curve is first calculated as appearance information for every 3D object stored in the database, images that are mirror images of one another as a result of camera rotation are combined into a single feature, duplicate appearance patterns are removed according to the symmetry of the object, and an index is constructed from the distance curves of the remaining images (S101). The detailed procedure for constructing the index is described later with reference to FIG. 4.

In the next step, the distance curve of the query image is calculated, and similar objects are searched for based on the most similar appearance by comparing the query distance curve against the 3D object index structure constructed in step S101 (S103). Here, dynamic indexing based on Dynamic Time Warping (DTW) is used to ensure rotation invariance.

Specifically, two functions, UB_Dist() and LB_Dist(), are first used to remove the distance curves that are most dissimilar to the distance curve of the query image. UB_Dist() always returns a distance greater than or equal to the distance given by the matching method in use. Likewise, LB_Dist() always returns a distance less than or equal to the distance given by the matching method in use. Expressed as a formula:

UB_Dist() ≥ Matching method() ≥ LB_Dist()

The present invention uses Dynamic Time Warping (DTW) as the matching method. Accordingly, the LB_Keogh distance can be used as the LB_Dist function and the Euclidean distance as the UB_Dist function. After the smallest of the UB_Dist values over all distance curves is found, every distance curve whose LB_Dist value is larger than that smallest UB_Dist value is removed, which reduces the number of candidate distance curves. Once the many candidates have been eliminated in this way, only the few remaining candidates are indexed dynamically, and the distance curve with the minimum distance from the distance curve of the query image is retrieved.
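The pruning step described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes equal-length distance curves held as NumPy arrays, a band-constrained DTW with a hypothetical `window` parameter, and squared-difference costs. The function names (`dtw_distance`, `lb_keogh`, `prune_and_search`) are chosen here for illustration only.

```python
import numpy as np

def dtw_distance(a, b, window):
    """Band-constrained DTW distance between two equal-length curves."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        # Only cells within `window` of the diagonal are allowed.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))

def lb_keogh(query, candidate, window):
    """LB_Keogh lower bound: distance from the candidate to the query's envelope."""
    query, candidate = np.asarray(query, dtype=float), np.asarray(candidate, dtype=float)
    total = 0.0
    for i, c in enumerate(candidate):
        lo, hi = max(0, i - window), min(len(query), i + window + 1)
        upper, lower = query[lo:hi].max(), query[lo:hi].min()
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (c - lower) ** 2
    return float(np.sqrt(total))

def euclidean(a, b):
    """Euclidean distance, an upper bound on DTW for equal-length curves."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))

def prune_and_search(query, curves, window=3):
    """Drop curves whose lower bound exceeds the smallest upper bound, then run DTW."""
    ub = [euclidean(query, c) for c in curves]
    lb = [lb_keogh(query, c, window) for c in curves]
    best_ub = min(ub)
    survivors = [k for k in range(len(curves)) if lb[k] <= best_ub]
    best = min(survivors, key=lambda k: dtw_distance(query, curves[k], window))
    return best, survivors
```

The full DTW computation, which is the expensive part, runs only on the survivors; the two bounds are linear-time in the curve length for a fixed window.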

Next, candidate poses of the object are extracted, namely poses whose images have the same appearance as the most similar appearance found (S105). The procedure is as follows.

FIG. 2 shows an object in different poses: (a) and (d) are different camera images, (b) and (e) are their outlines, and (c) and (f) are their distance curves. As FIG. 2 shows, although the poses (visual features) differ, the outlines (distance curves) are identical or reversed with respect to each other. Because duplicate appearances were removed when the index structure was built in step S101, such candidate poses are no longer available.

Therefore, the candidate poses must either be calculated dynamically so that the corresponding images can be retrieved from the database, or be generated from the 3D object in the database using a software tool such as CAD (computer-aided design). In the latter case, the most similar pose and the coordinates of its angle are needed in order to generate the candidate poses. They are calculated as follows.

First, given the horizontal repetition period H_period, 180 / H_period candidate poses are obtained. The horizontal angle between the most similar pose and each candidate pose can be calculated easily using Equation 1 below.

Here, 1 ≤ i ≤ (180 / H_period), j is the horizontal angle between the most similar pose and the candidate pose, and k is the angle of the most similar pose.

The SIFT algorithm is then applied between the query image and the images extracted from the candidate poses obtained in step S105, and the pose finally judged most similar is taken as the estimated pose of the object in the query image (S107). Because the SIFT algorithm is well known in the technical field to which the present invention pertains, a detailed description is omitted.

FIG. 3 shows the SIFT features of two images taken from different camera viewpoints that share the same appearance. As FIG. 3 shows, images from different viewpoints can have different SIFT features even when their appearance is identical, so SIFT can be used to identify the most similar pose effectively. SIFT generally requires a considerable amount of time to compute features. In the present invention, however, SIFT features are needed only for the small number of candidate poses produced by step S105, so this is not a significant problem.

FIG. 4 is a flowchart illustrating the method of constructing the index structure for similar-object search according to the present invention. Referring to FIG. 4, distance curves are first calculated as the appearance information of the objects: for every 3D object stored in the database, the distance from the center point to the outline is calculated (S401). FIG. 5 illustrates the detailed procedure for calculating a distance curve: (a) is an image of a 3D object, (b) shows the distance between the center point and the outline being calculated along the outline, and (c) is the resulting distance curve of the image.
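As a rough illustration of step S401, the sketch below computes a distance curve from an ordered list of outline points. Several details are assumptions made here rather than taken from the text: the center point is taken to be the mean of the outline points, the outline is assumed to be already extracted and ordered, and the curve is resampled to a fixed length (`n_samples`, a hypothetical parameter) so that curves from different images can be compared.

```python
import numpy as np

def distance_curve(contour, n_samples=64):
    """Center-to-outline distances along an ordered, closed contour.

    contour: array-like of shape (N, 2) holding (x, y) outline points in order.
    Returns an array of length n_samples.
    """
    contour = np.asarray(contour, dtype=float)
    # Assumption: use the mean of the outline points as the center point.
    center = contour.mean(axis=0)
    dist = np.linalg.norm(contour - center, axis=1)
    # Resample to a fixed length, treating the outline as periodic.
    src = np.arange(len(dist)) / len(dist)
    dst = np.arange(n_samples) / n_samples
    return np.interp(dst, src, dist, period=1.0)
```

For a circular outline the curve is flat, while elongated shapes produce peaks where the outline is farthest from the center.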

Searching for a 3D object with a single query image requires every possible image of the object to be considered, which demands an enormous amount of computation time and storage space. Most objects have left-right and/or front-back symmetry, and as shown in FIG. 6 this can be categorized according to the type of object as follows: (1) for a symmetric object, mirror images have mirror-image appearances; (2) for any kind of object, the rear image has the mirror-image appearance of the front image; (3) an object and its projected image have the same appearance. Based on these properties, the present invention combines mirror-image views and removes duplicate appearance patterns, which minimizes the number of images that must be indexed without affecting search accuracy. The details are as follows.

Regardless of the type of object, the front image and the rear image have mirror-image appearances of each other. In this case their distance curves are reversed with respect to each other, and the discrete Fourier transform (DFT) therefore returns the same value for both. Consequently, if DFT values are used for indexing, such mirror-image views can be combined under a single DFT feature value (S403). By combining mirror-image views when the index is built and restoring the mirror-image appearance at search time, the index space can be halved while accuracy is maintained.
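The property used in step S403 can be checked numerically. The sketch below assumes that the "DFT feature value" refers to the magnitudes of the DFT coefficients, which is an interpretation made here: reversing a real-valued sequence changes only the phase of its DFT, so the magnitudes of a distance curve and of its reversed (mirror-image) counterpart coincide. The function name `dft_feature` is hypothetical.

```python
import numpy as np

def dft_feature(curve):
    """Magnitudes of the DFT coefficients of a distance curve."""
    return np.abs(np.fft.fft(np.asarray(curve, dtype=float)))

# A distance curve and its mirror image (the same samples in reverse order).
curve = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 2.5, 1.5, 3.5])
mirrored = curve[::-1]

# Both views map to the same feature, so one index entry can serve both.
same_feature = np.allclose(dft_feature(curve), dft_feature(mirrored))
```

Because the phase is discarded, the index entry alone cannot tell the two views apart, which is consistent with the text's remark that the mirror-image appearance is restored at search time.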

Because every viewpoint of a 3D object can be generated by combining horizontal and vertical camera movements, the symmetry of the object can be considered in two planes, horizontal and vertical. Four different classes can be defined for each plane according to the symmetry of the object. First, four horizontal classes H1 to H4 are defined as follows, according to the symmetry of the appearance seen from the front and from the side.

H1: This class contains objects, such as a sphere, for which every possible horizontal camera image is identical.

H2: This class contains objects, such as a car, whose appearance pattern repeats every 90 degrees.

H3: This class contains objects, such as a die, whose appearance pattern repeats more often than every 90 degrees. For a cube such as a die, for example, the appearance pattern repeats every 45 degrees.

H4: This class contains objects whose appearance pattern does not repeat in any horizontal camera image.

In the same way as the horizontal classes, vertical classes V1 to V4 are defined according to the symmetry of the appearance seen from the front and from the top. The properties of the vertical classes are the same as those of the horizontal classes. The present invention thus forms 16 classes from the horizontal and vertical symmetry of objects and assigns each object to a class. The duplicate appearance patterns are then analyzed and the duplicate appearances removed (S405), which reduces index space and search time without sacrificing search accuracy.
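As a small illustration of how the 16 classes arise and how a repetition period limits the views that need indexing, consider the sketch below. The sampling step, the helper `unique_view_angles`, and the idea of keeping only the angles inside one repetition period are assumptions made for illustration; the text does not specify how the duplicate views are selected.

```python
from itertools import product

H_CLASSES = ("H1", "H2", "H3", "H4")
V_CLASSES = ("V1", "V2", "V3", "V4")

# Each object is assigned one horizontal and one vertical class: 4 x 4 = 16.
ALL_CLASSES = list(product(H_CLASSES, V_CLASSES))

def unique_view_angles(step_deg, period_deg):
    """Camera angles (in degrees) inside one repetition period.

    Assumption: views are sampled every step_deg degrees and any view at an
    angle beyond period_deg duplicates an appearance already kept.
    """
    return list(range(0, period_deg, step_deg))

# With 15-degree sampling, a pattern repeating every 90 degrees needs 6 views,
# while a non-repeating object needs all 24 views around the full circle.
views_h2 = unique_view_angles(15, 90)
views_h4 = unique_view_angles(15, 360)
```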

After the mirror-image views have been combined in step S403 and the duplicate appearance patterns removed in step S405, the index structure is constructed from the distance curves of the remaining images (S407). Because considering every possible image of an object for every camera position requires an enormous database capacity, an efficient index structure is essential for efficient 3D object search. However, building an index structure over complete distance curves is inefficient because of the problem known as the "curse of dimensionality."

The present invention therefore reduces the dimension of the index using the same DFT that is used for combining mirror-image views in step S403. Because the DFT guarantees rotation invariance, it can be used to construct a rotation-invariant index structure.
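One common way to realize this kind of reduction is to keep only the magnitudes of the first few DFT coefficients as the index key. The sketch below follows that approach as an assumption; the number of retained coefficients (`n_coeffs`) and the function name `dft_index_key` are illustrative and are not specified in the text. A rotation of the object in the image plane shifts the starting point of the distance curve circularly, which leaves the magnitudes unchanged.

```python
import numpy as np

def dft_index_key(curve, n_coeffs=8):
    """Low-dimensional index key: magnitudes of the first n_coeffs DFT coefficients."""
    spectrum = np.fft.rfft(np.asarray(curve, dtype=float))
    return np.abs(spectrum[:n_coeffs])

# A 64-sample distance curve is reduced to an 8-dimensional key.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
curve = 2.0 + 0.5 * np.cos(2 * t) + 0.2 * np.sin(5 * t)
key = dft_index_key(curve)

# A circularly shifted curve (same outline, different starting point) gets the same key.
shifted_key = dft_index_key(np.roll(curve, 10))
```

Keeping the low-frequency coefficients preserves the coarse shape of the curve, which is usually what distinguishes one outline from another.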

Although specific embodiments have been described in the detailed description of the present invention, various modifications are possible without departing from the scope of the invention. The scope of the present invention should therefore not be limited to the embodiments described above, but should be determined by the claims below and their equivalents.

FIG. 1 is a flowchart illustrating the 3D object search and pose estimation method according to the present invention.

FIG. 2 is an exemplary view showing two different poses that share the same appearance.

FIG. 3 is an exemplary view showing the SIFT features of two images taken from different camera viewpoints that share the same appearance.

FIG. 4 is a flowchart illustrating the method of constructing the index structure for similar-object search according to the present invention.

FIG. 5 is an exemplary view showing the detailed procedure for calculating a distance curve as the appearance information of an object.

FIG. 6 is an exemplary view showing appearance patterns according to the symmetry of an object.

Claims

A 3D object search and pose estimation method based on the appearance of an object at an arbitrary angle, the method comprising:

(a) constructing an index using distance curves as appearance information of all 3D objects stored in a database;

(b) calculating a distance curve of a query image, and then searching for similar objects based on the most similar appearance by applying a similarity comparison against the distance curve of the query image through the 3D object data index structure constructed in step (a);

(c) extracting candidate poses of an object that include images having the same appearance as the most similar appearance found in step (b); and

(d) applying the SIFT algorithm between the query image and images extracted from the candidate poses obtained in step (c), and estimating the pose finally judged most similar as the pose of the object in the query image,

wherein step (a) comprises:

(a-1) calculating, as the appearance information of each object, a distance curve of the distances between the center point and the outline for all 3D objects stored in the database;

(a-2) combining images having mirror-image appearances generated by rotation of the camera into a single feature;

(a-3) analyzing duplicate appearance patterns according to the symmetry of the object and removing the duplicate appearances; and

(a-4) constructing the index using the distance curves of the images remaining after the duplicate appearances are removed in step (a-3).

(Claim 2 deleted)

The method of claim 1, wherein the step (a-2) combines the images having the mirror image into one feature information by using one discrete Fourier transform feature value.

The method of claim 1, wherein step (a-3) comprises:

forming 16 classes according to the symmetry of objects and classifying each object into one of the classes; and

analyzing the duplicate appearance patterns of the classified objects and removing the duplicate appearances.

The 3D object search and pose estimation method of claim 4, wherein the 16 classes are combinations of four different horizontal classes defined according to the horizontal symmetry of the object and four different vertical classes defined according to the vertical symmetry of the object.

The method of claim 1, wherein the step (a-4) reduces the dimension of the index using a Discrete Fourier Transform.

The method of claim 1, wherein step (b) comprises:

(b-1) removing the distance curves most dissimilar to the distance curve of the query image using two functions, UB_Dist() and LB_Dist();

(b-2) dynamically indexing only the candidates that remain after the plurality of candidates is removed in step (b-1); and

(b-3) searching for the distance curve having the minimum distance from the distance curve of the query image.

The method of claim 7, wherein the searching is performed by dynamic time warping.

The method of claim 8, wherein LB_Dist() is the LB_Keogh distance and UB_Dist() is the Euclidean distance.

The method of claim 1, wherein the step (c) is performed by dynamically calculating the candidate pose and searching for a corresponding image in a database.

The method of claim 1, wherein step (c) is performed by generating the candidate poses from the 3D objects in the database using a computer-aided design (CAD) tool.

The 3D object search and pose estimation method of claim 11, wherein 180 / H_period candidate poses are obtained according to a horizontal repetition period H_period, and the horizontal angle between the most similar pose and each candidate pose is calculated by the following equation.

Here, 1 ≤ i ≤ (180 / H_period), j is the horizontal angle between the most similar pose and the candidate pose, and k is the angle of the most similar pose.