KR102129060B1

KR102129060B1 - Content-based 3D model retrieval method using a single depth image, 3D model retrieval server for performing the method, and computer-readable recording medium thereof

Info

Publication number: KR102129060B1
Application number: KR1020130075640A
Authority: KR
Inventors: 배민수; 박인규
Original assignee: 네이버 주식회사; 인하대학교 산학협력단
Priority date: 2013-06-28
Filing date: 2013-06-28
Publication date: 2020-07-02
Also published as: KR20150002157A

Abstract

The present invention relates to an efficient content-based 3D model search technique using a single depth image. So that a 3D model search can be performed using only a single depth image as the query image, the invention proposes a database construction method that adaptively samples, on the basis of importance, the camera viewpoints from which depth images of a 3D model are to be acquired, and acquires a depth image at each sampled camera viewpoint, thereby representing the 3D model as a plurality of depth images. The invention also proposes a 3D model search method that uses the database constructed in this way to compare the similarity between an input query image and the depth images stored in the database, and thereby finds the 3D model matching the query image.

Description

Content-based 3D model retrieval method using a single depth image, 3D model retrieval server for performing the method, and computer-readable recording medium thereof

The present invention relates to a content-based 3D model search method using a single depth image, a 3D model search server that performs the method, and a computer-readable recording medium. More specifically, the invention relates to a method for constructing a 3D model search database, and a 3D model search server that performs it, in which camera viewpoints are adaptively sampled to determine the viewpoints of high importance for a 3D model, and a rotation-invariant descriptor is generated and stored for the depth image of each adaptively sampled viewpoint, so that the 3D model is represented as a set of depth images. The invention further relates to a 3D model search method, and a 3D model search server for performing it, in which a single depth image transmitted from a user terminal is received as a query image, the query image is compared with the rotation-invariant descriptors of the 3D models stored in the 3D model search database constructed as described above, and the 3D models whose rotation-invariant descriptors are similar to that of the query image are provided as search results.

With the recent development of animation, film, games, and web services, demand for 3D model data is increasing rapidly. The need for 3D models is accordingly growing in a wide range of applications, including the multimedia and entertainment markets. Services based on various forms of query input for 2D image search are already offered through portal sites, but apart from text-based 3D model search engines such as Google 3D Warehouse, hardly any widely used 3D model retrieval system exists online. To support user-interactive 3D model search, algorithms are therefore needed for content-based 3D model search systems that accept image input in addition to text.

Given this need, various studies have been conducted on content-based 3D model search techniques. 3D model search means searching a database for 3D models that are similar to an input query or belong to the same class. Content-based search in particular means searching for 3D models of similar shape by taking a 2D image or a 3D model itself as input instead of text keywords. Depending on the characteristics of the descriptor used to find similar 3D models, conventional content-based 3D model search techniques can be broadly classified into feature-based, graph-based, and view-based methods.

Conventional feature-based 3D model search measures similarity by extracting geometric and topological feature descriptors from the 3D model itself. Techniques have been studied that use the normal vectors of the 3D model surface, either extracting their angles as a descriptor or mapping them onto a Gaussian sphere. Such descriptors are easy to generate, but the precision with which the 3D model is constructed can strongly affect search performance. Feature-based search also includes histogram methods, which express the shape of a 3D model as a function and generate the corresponding probability distribution to measure similarity. Histogram-based techniques are robust to small distortions and to deformation of the 3D model, but because they are global methods they have difficulty distinguishing similar 3D models in detail, and analyzing and computing the histograms to determine similarity is complex.

Conventional graph-based methods differ from feature-based and view-based methods, which are vector-based descriptor methods, in that they process the 3D model itself into a form suited to search, or generate new data for use in the search. Graph-based techniques that have been studied include constructing a Reeb graph from the skeletal structure of a 3D model and computing similarity by comparing the relationships between nodes, and generating a skeletal graph as a shape descriptor so that topological matching of 3D models allows a search that is robust to geometric deformation. Graph-based search methods can handle non-rigid models and can be expected to give relatively good search performance. However, converting a 3D model into graph form is computationally difficult, and the methods are hard to apply across the various representations of 3D models, which is a major problem.

Conventional view-based search builds on depth image-based representation, the idea that a 3D model can be represented by several 2.5D depth images. Specific viewpoints of the 3D model are sampled and expressed as an aspect-relation graph, and similarity is measured by matching against the input image. Techniques that have been studied include generating silhouette images of a 3D model, applying a Fourier transform, and using the coefficients as a descriptor for computing 3D model similarity, and generating depth images from viewpoints distributed uniformly on a sphere surrounding the 3D model and measuring similarity through descriptors of those images. Similarly, techniques have been proposed that use the SIFT (Scale Invariant Feature Transform) algorithm to identify local features of depth images generated from a 3D model and search for similar 3D models, as have techniques that use a simplified descriptor obtained by representing the 3D model in voxel form. There are also comprehensive view-based search techniques that accept color images, depth images, and 3D models as queries, and convenient search techniques that take a user's binary sketch as input.
Because projecting a 3D model onto 2D images greatly reduces the effect of missing small polygons, holes, and similar defects in the 3D model, conventional view-based search has the advantage of being robust regardless of how complete the 3D model is. However, when only a small number of image samples is used, it is difficult to prevent the loss of 3D information caused by self-occlusion. In addition, if no viewpoint is sampled at the parts of the 3D model that a user is likely to photograph, or at the parts that contain much of the model's information, the search is likely to fail.

A new 3D model search technique is therefore needed that can solve the problems of the conventional content-based 3D model search techniques described above.

Meanwhile, entry-level 3D cameras such as Microsoft Kinect are spreading rapidly. They have made it possible to set up a depth image acquisition environment of good quality for the price, and with such systems it has become easy to extend the input data for search to real-time color and depth images. At present, however, no technique uses such entry-level 3D cameras for 3D model search. A new 3D model search technique that can use an entry-level 3D camera for 3D model search is therefore needed.

An object of the present invention is to solve the problems of the prior art mentioned above.

One object of the present invention is to provide a method for constructing a 3D model search database, a search server that performs the method, and a computer-readable recording medium, in which camera viewpoints are adaptively sampled for the parts of a 3D model that a user is likely to photograph as a query image and for the parts that can represent the 3D model, and the sampled camera viewpoints are set as the camera viewpoints for acquiring depth images of that 3D model.

Another object of the present invention is to provide a method for constructing a 3D model search database, a search server that performs the method, and a computer-readable recording medium, in which a 3D model is represented as a set of depth images acquired at each of a plurality of adaptively sampled camera viewpoints, and the 3D model search database is organized so that the geometric features of the 3D model are used in the search.

Yet another object of the present invention is to provide a content-based 3D model search method, a search server that performs the method, and a computer-readable recording medium, in which a single depth image acquired with an entry-level 3D camera is used as a query image to search for and provide the matching 3D model.

Yet another object of the present invention is to provide a content-based 3D model search method, a search server that performs the method, and a computer-readable recording medium, in which the similarity between the query image and each of the depth images of all 3D models stored in the 3D model search database can be measured simultaneously in parallel.

The characteristic configuration of the present invention for achieving the objects described above, and the distinctive effects described below, is as follows.

According to one aspect of the present invention, there is proposed a content-based 3D model search method using a single depth image, performed by a 3D model search server, the method comprising: (A) receiving a single depth image transmitted from a user terminal as a query image; (B) generating a rotation-invariant descriptor from the received query image; (C) calculating the similarity between the rotation-invariant descriptor of the query image and pre-stored rotation-invariant descriptors; and (D) selecting at least one 3D model on the basis of the calculated similarity, and transmitting a search result including information on the selected 3D model to the user terminal.
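Steps (B) to (D) can be sketched roughly as follows. This is an illustration under stated assumptions, not the descriptor or similarity measure of the invention: `radial_descriptor` (mean depth per concentric ring around the image center), the negative Euclidean distance in `similarity`, and the `(model_id, descriptor)` database layout are all hypothetical stand-ins chosen only to make the flow runnable.

```python
import math

def radial_descriptor(depth, bins=4):
    """Hypothetical rotation-invariant descriptor: mean depth per concentric ring."""
    h, w = len(depth), len(depth[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    rmax = math.hypot(cy, cx) + 1e-9
    sums, counts = [0.0] * bins, [0] * bins
    for y in range(h):
        for x in range(w):
            r = math.hypot(y - cy, x - cx) / rmax
            b = min(int(r * bins), bins - 1)
            sums[b] += depth[y][x]
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def similarity(a, b):
    """Assumed measure: negative Euclidean distance, so larger means more similar."""
    return -math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def search(query_depth, database, top_k=3):
    """database: list of (model_id, descriptor), one entry per stored depth image."""
    q = radial_descriptor(query_depth)              # step (B)
    best = {}
    for model_id, desc in database:                 # step (C)
        s = similarity(q, desc)
        if model_id not in best or s > best[model_id]:
            best[model_id] = s                      # keep each model's best view
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [model_id for model_id, _ in ranked[:top_k]]  # step (D)
```

Because a model is stored as many depth images, the sketch scores each model by its best-matching view before ranking, which is one plausible reading of step (D).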

According to another aspect of the present invention, there is proposed a 3D model search server comprising: a query image processing unit that receives a single depth image transmitted from a user terminal as a query image and generates a rotation-invariant descriptor from the received query image; and a search processing unit that calculates the similarity between the rotation-invariant descriptor of the query image and pre-stored rotation-invariant descriptors, selects at least one 3D model on the basis of the calculated similarity, and transmits a search result including information on the selected 3D model to the user terminal.

According to a preferred embodiment of the present invention, the importance of each of a plurality of camera viewpoints for a 3D model is determined, and the camera viewpoints for the 3D model are adaptively sampled on the basis of the determined importance. This can be expected to set camera viewpoints at the parts that a user is likely to photograph as a query image and at the parts that can represent the 3D model.

In addition, according to the present invention, setting camera viewpoints of high importance as the camera viewpoints for acquiring depth images of a 3D model can be expected to improve the accuracy and speed of 3D model search.

In addition, according to the present invention, representing a 3D model as a set of depth images acquired at each of a plurality of adaptively sampled camera viewpoints can be expected to allow the geometric features of the 3D model to be used efficiently in 3D model search.

In addition, according to the present invention, a single depth image acquired with an entry-level 3D camera can be used as a query image to search for and provide the matching 3D model.

In addition, according to the present invention, measuring the similarity to the query image in parallel for each of the depth images of all 3D models stored in the 3D model search database can be expected to improve the speed of 3D model search.
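The parallel measurement is possible because each comparison between the query descriptor and a stored descriptor is independent of the others. The sketch below only illustrates that structure with a thread pool from the Python standard library. The Euclidean `distance` and the function names are assumptions, the text here does not say what hardware or API the invention uses, and threads in CPython will not by themselves speed up pure-Python arithmetic.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def distance(a, b):
    """Assumed dissimilarity: Euclidean distance between two descriptors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def parallel_distances(query, descriptors, workers=4):
    """Compare the query against every stored descriptor; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: distance(query, d), descriptors))
```

Since no comparison depends on another, the same map could be moved to processes or to a GPU kernel without changing the surrounding logic.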

FIG. 1 is a block diagram of an overall system including a 3D model search server according to a preferred embodiment of the present invention.
FIG. 2 is a block diagram of a 3D model search server according to a preferred embodiment of the present invention.
FIG. 3 is a flowchart illustrating the process of constructing a 3D model search database, performed by a 3D model search server according to a preferred embodiment of the present invention.
FIG. 4 is an exemplary view showing the result of mesh subdivision of an icosahedron according to the prior art.
FIGS. 5A and 5B are exemplary views showing how importance changes with the camera viewpoint.
FIGS. 6A to 6D are exemplary views comparing camera viewpoint sampling results as the weights used to calculate importance are varied.
FIG. 7 is an exemplary view showing the result of sampling camera viewpoints using predetermined weights according to the present invention.
FIG. 8 is a flowchart illustrating the 3D model search process performed by a 3D model search server according to a preferred embodiment of the present invention.
FIG. 9 is an exemplary view showing search results obtained with the 3D model search method that uses a single depth image as the query image, according to a preferred embodiment of the present invention.
FIG. 10 is an exemplary view showing search results obtained with the 3D model search method that uses a single depth image as the query image, according to a preferred embodiment of the present invention, alongside search results obtained with a prior-art 3D model search method that uses a 3D model as the query.
FIG. 11 is a search performance comparison graph comparing search results obtained with adaptive viewpoint sampling according to a preferred embodiment of the present invention against search results obtained with rendered viewpoints.
FIG. 12 is a search performance comparison graph comparing the 3D model search method according to a preferred embodiment of the present invention with other search methods.

The detailed description of the present invention that follows refers to the accompanying drawings, which show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, are not necessarily mutually exclusive. For example, a particular shape, structure, or characteristic described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It should also be understood that the position or arrangement of individual elements within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the invention, if properly described, is limited only by the appended claims together with the full range of equivalents to which the claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings, so that those of ordinary skill in the art to which the invention pertains can readily practice it.

[Preferred embodiment of the present invention]

In the embodiments of the present invention, the term 'multi-depth image-based representation of a 3D model' refers to representing a 3D model as a set of depth images, each acquired from a different viewpoint, based on the idea that a 3D shape can be generated from a plurality of depth images.

In the embodiments of the present invention, the term 'importance of a camera viewpoint' refers to a value, computed by a predetermined algorithm, that quantifies how well the depth image acquired from that camera viewpoint represents the 3D model. A viewpoint of high importance is thus one whose depth image, compared with those of other viewpoints, is more likely to be input as a query image and/or contains more of the parts that can represent the 3D model. The importance of a camera viewpoint may be calculated from one or more of an area importance, a curvature importance, and a camera pose importance, but the present invention is not limited to this.
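One simple way to combine the three components named above is a weighted sum. The sketch below assumes that form, assumes each component score is already normalized to [0, 1], and uses placeholder weights. The formula actually used by the invention is not given in this passage, so the function names and defaults here are hypothetical.

```python
def viewpoint_importance(area, curvature, pose, weights=(1.0, 1.0, 1.0)):
    """Assumed form: weighted sum of area, curvature, and camera pose importance."""
    w_area, w_curv, w_pose = weights
    return w_area * area + w_curv * curvature + w_pose * pose

def rank_viewpoints(scores, weights=(1.0, 1.0, 1.0)):
    """scores: dict of viewpoint id -> (area, curvature, pose); most important first."""
    return sorted(
        scores,
        key=lambda v: viewpoint_importance(*scores[v], weights=weights),
        reverse=True,
    )
```

For example, `rank_viewpoints({"front": (0.9, 0.5, 1.0), "bottom": (0.4, 0.2, 0.1)})` places `"front"` first, and changing `weights` shifts which component dominates the ranking.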

In the embodiments of the present invention, an initial camera viewpoint is a camera viewpoint that is set first, as the starting point for adaptively sampling camera viewpoints for a 3D model that is to be added to the 3D model search database. For example, an icosahedron inscribed in a sphere of unit radius around the 3D model may be set up, and each vertex of the icosahedron may be set as an initial camera viewpoint. The present invention is not limited to this, however, and the initial camera viewpoints may be set in various ways.
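As a concrete illustration of the icosahedron example, the 12 vertices of a regular icosahedron inscribed in the unit sphere can be generated from the golden ratio. The sketch assumes the 3D model has already been centered at the origin and scaled to fit inside the unit sphere. The function name is illustrative.

```python
import math

def icosahedron_vertices():
    """Return the 12 vertices of a regular icosahedron on the unit sphere."""
    phi = (1 + math.sqrt(5)) / 2          # golden ratio
    raw = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            # cyclic permutations of (0, +-1, +-phi)
            raw += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    norm = math.sqrt(1 + phi * phi)       # distance of each raw vertex from the origin
    return [(x / norm, y / norm, z / norm) for x, y, z in raw]
```

Each returned point can serve as an initial camera position looking toward the origin.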

In the embodiments of the present invention, the term 'adaptive sampling of camera viewpoints' refers to sampling camera viewpoints on the basis of the importance of the initial camera viewpoints. The concept was devised so that more camera viewpoints are sampled in regions of relatively high importance. For example, adaptive sampling of camera viewpoints may be performed by subdividing the mesh of the icosahedron described above according to the importance of each of its vertices. The present invention is not limited to this, however. Various modifications and variations may be made to the embodiments, and it will be apparent to those skilled in the art that any such variation falls within the scope of the present invention as long as it retains the technical idea that adaptive sampling is performed according to the importance of camera viewpoints.
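Importance-driven mesh subdivision can be sketched as below. The rule used here (split a triangle into four when the mean importance of its three vertices exceeds a threshold, and project the new edge midpoints back onto the unit sphere) is an assumption made for illustration, as are the function names, the `importance` callback, and `threshold`. The passage above does not specify the exact subdivision criterion.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def midpoint_on_sphere(a, b):
    """Midpoint of an edge, pushed back out to the unit sphere."""
    return normalize(tuple((p + q) / 2 for p, q in zip(a, b)))

def adaptive_subdivide(vertices, faces, importance, threshold):
    """Split only the faces whose mean vertex importance exceeds threshold.

    Returns (points, faces); points holds the original viewpoints plus the
    new ones added in important regions.
    """
    points = list(vertices)
    cache = {}                            # edge (i, j) -> index of its midpoint

    def mid(i, j):
        key = (min(i, j), max(i, j))
        if key not in cache:
            points.append(midpoint_on_sphere(points[i], points[j]))
            cache[key] = len(points) - 1
        return cache[key]

    out = []
    for i, j, k in faces:
        score = (importance(points[i]) + importance(points[j]) + importance(points[k])) / 3
        if score <= threshold:
            out.append((i, j, k))         # unimportant region: keep the face as is
            continue
        a, b, c = mid(i, j), mid(j, k), mid(k, i)
        out += [(i, a, c), (a, j, b), (c, b, k), (a, b, c)]
    return points, out
```

The function works on any triangle mesh on the unit sphere, so it can be applied to the icosahedron's 20 faces, and it can be called repeatedly to refine important regions further.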

I. Configuration and function of the 3D model search system including the 3D model search server

FIG. 1 is a block diagram of an overall system including a 3D model search server 300 according to a preferred embodiment of the present invention. The configuration and function of the overall system including the 3D model search server 300 are outlined below with reference to FIG. 1.

As shown in FIG. 1, the 3D model search system according to the present invention may include a user terminal 100 equipped with a 3D camera 110, a network 200, and a 3D model search server 300.

The user terminal 100 according to an embodiment of the present invention is configured to acquire, under the user's operation, a single depth image of the object to be searched for through the connected 3D camera 110, and to transmit the acquired single depth image to the 3D model search server 300 as a query image. The user terminal 100 also receives from the 3D model search server 300 the 3D model search result found for the transmitted query image, and outputs the received search result through a display unit (not shown). Any digital device that has memory means and a microprocessor and therefore has computing capability may be adopted as the user terminal 100, including not only desktop computers but also notebook computers, workstations, palmtop computers, personal digital assistants (PDAs), web pads, navigation devices, and mobile communication terminals such as smartphones. However, because the user terminal 100 is a component for performing 3D model search, it may be limited to digital devices that include, or are connected to, a 3D camera 110 capable of acquiring a depth image of a specific object under the user's operation.

The 3D camera 110 connected to the user terminal 100 according to an embodiment of the present invention is configured to capture, under the user's operation, a color image of a specific object and a depth image registered to it, and to output them to the user terminal 100. Any of a variety of entry-level 3D cameras may be adopted as the 3D camera 110. For example, a Microsoft Kinect may be connected to the user terminal 100 as the 3D camera 110, but the present invention is not limited to this.

Looking at the query image acquisition process in more detail, the user terminal 100 first controls the 3D camera 110 to register the depth image to the acquired color image, so that the real object needed for the 3D model search can be segmented accurately. That is, the user terminal 100 associates the color image and the depth image of the object acquired through the 3D camera 110. Once the color image and the registered depth image of the search target object have been acquired together in this way, the user terminal 100 outputs the acquired color image through the display unit. The user checks the displayed color image and operates an input means such as a mouse (not shown) to select the object needed for the 3D model search. In this process, the user terminal 100 may be configured to perform object segmentation on the color image using the GrabCut technique. The user terminal 100 generates a mask for object segmentation on the color image according to the user's selection and, based on the generated mask, determines the corresponding region of the depth image as the query image. Once that region of the depth image has been determined as the query image, the user terminal 100 transmits it to the 3D model search server 300.
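The last step of the acquisition process, turning a segmentation mask into a query depth image, can be illustrated with a minimal sketch. It assumes the mask has already been produced on the color image (for example by GrabCut) and that the depth image is registered to the same pixel grid. The function name `extract_query_depth`, the use of 0 as the background depth value, and the cropping to the mask's bounding box are illustrative assumptions, not details given in this description.

```python
import numpy as np

def extract_query_depth(depth, mask):
    """Keep depth only inside the object mask and crop to the mask's bounding box.

    depth: 2D array of depth values registered to the color image.
    mask:  2D boolean array (True = object), e.g. a GrabCut result.
    Background pixels are set to 0.0 (an assumed convention).
    """
    depth = np.asarray(depth, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    if depth.shape != mask.shape:
        raise ValueError("depth and mask must share the same pixel grid")
    if not mask.any():
        raise ValueError("mask selects no pixels")
    rows = np.flatnonzero(mask.any(axis=1))   # rows containing object pixels
    cols = np.flatnonzero(mask.any(axis=0))   # columns containing object pixels
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    masked = np.where(mask, depth, 0.0)       # zero out the background
    return masked[r0:r1, c0:c1]
```

The cropped array is what would be sent to the server as the query image under these assumptions.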

According to an embodiment of the present invention, the network 200 may be configured regardless of communication mode, whether wired or wireless, and may be composed of various communication networks such as a personal area network (PAN), a local area network (LAN), a metropolitan area network (MAN), and a wide area network (WAN).

The 3D model search server 300 according to an embodiment of the present invention may be configured to perform two main functions. First, the 3D model search server 300 may be configured to build a 3D model search database 350 for a plurality of 3D models, so that a 3D model search can be performed using a single depth image as the query image.

That is, as described above, the 3D model search server 300 according to the present invention is configured to perform content-based 3D model search using a single depth image acquired from the consumer-grade 3D camera 110 connected to the user terminal 100. The present invention therefore has the advantage that query data can be obtained far more easily than in conventional techniques that use the 3D model data itself as the query. However, because the 3D model search system according to the present invention uses only a single depth image for the search, the information available for similarity comparison may be significantly less than in conventional techniques that use the 3D model itself as the query. To address this, the single depth image serving as the query image must be matched efficiently against the depth images of the 3D models stored in the database 350. Because the depth image obtained from the z-buffer by rendering a 3D model varies with the camera position, camera viewpoints should be sampled where the user is likely to shoot and where the view is representative of the 3D model. Accordingly, the 3D model search server 300 may be configured to determine the importance of a given camera viewpoint for the 3D model by considering factors such as the curvature of the 3D model, its projected area, and the camera pose, and to adaptively sample the camera viewpoints from which depth images of the 3D model are acquired according to that importance. Because the camera viewpoints are sampled adaptively based on importance, the viewpoints the user is likely to shoot from and/or the viewpoints that contain much information about the 3D model can be determined adaptively. The 3D model search server 300 can thus adaptively determine the relatively important camera viewpoints for a 3D model, acquire a depth image of the 3D model at each determined viewpoint, and build the 3D model database 350 by representing the 3D model as the resulting plurality of depth images. For example, the 3D model search server 300 may be configured to represent one 3D model as 100 depth images acquired from the adaptively sampled camera viewpoints and to store them in structured form in the database 350. Depending on the embodiment, instead of storing the acquired depth images themselves in the database 350, the 3D model search server 300 may be configured to generate from them, according to a predetermined algorithm, the information needed for similarity comparison (for example, rotation-invariant descriptors) and to store the generated information in the database 350. The database 350 construction function of the 3D model search server 300 described above is explained in more detail with reference to FIGS. 2 to 7.

Accordingly, in the 3D model search database 350 built by the 3D model search server 300 as described above, each of the plurality of 3D models is structured and stored as a plurality of depth images (or as information obtainable from the depth images that is needed for similarity comparison). Depending on the embodiment, the 3D model search database 350 may be configured to store the 3D models themselves, or to store only the set of depth images corresponding to each 3D model.

In addition, the 3D model search server 300 may be configured to receive a single depth image transmitted from the user terminal 100 as a query image, compare the similarity between the received query image and the depth images stored in the 3D model search database 350, and, according to the comparison result, extract information on the 3D model(s) matching the query image and transmit it to the user terminal 100. Depending on the embodiment, the 3D model search server 300 may be configured to compare the query image with the depth images themselves stored in the database 350. Alternatively, as described above, when rotation-invariant descriptors generated from the depth images are stored in the database 350, the 3D model search server 300 may be configured to apply the rotation-invariant descriptor generation algorithm used when building the database to the query image, thereby generating a rotation-invariant descriptor for the query image, and to determine similarity by comparing it with the rotation-invariant descriptors stored in the database 350. The 3D model search function of the 3D model search server 300 described above is explained in more detail with reference to FIGS. 2 and 8.
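The descriptor-based comparison described above can be sketched as a nearest-neighbor ranking. This passage does not specify the descriptor format or the similarity measure, so the sketch assumes fixed-length descriptor vectors, Euclidean distance, and a per-model score equal to the best-matching view; the function name `rank_models` and its parameters are hypothetical.

```python
import numpy as np

def rank_models(query_desc, db_descs, db_model_ids, top_k=5):
    """Rank 3D models by the distance of their closest stored view descriptor.

    query_desc:   1D descriptor of the query depth image.
    db_descs:     2D array, one descriptor per stored (model, viewpoint) pair.
    db_model_ids: model identifier for each row of db_descs.
    Returns up to top_k (model_id, distance) pairs, most similar first.
    Euclidean distance is an assumption made for illustration.
    """
    query_desc = np.asarray(query_desc, dtype=np.float64)
    db_descs = np.asarray(db_descs, dtype=np.float64)
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    best = {}
    for model_id, d in zip(db_model_ids, dists):
        # A model is scored by its best-matching viewpoint.
        if model_id not in best or d < best[model_id]:
            best[model_id] = float(d)
    return sorted(best.items(), key=lambda kv: kv[1])[:top_k]
```

Scoring each model by its closest view reflects the idea that a single query depth image only needs to match one of the stored viewpoints of the correct model.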

Meanwhile, although FIG. 1 shows the 3D model search server 300 performing both the 3D model database construction function and the 3D model search function, it will be apparent to those skilled in the art that, depending on the embodiment, physically and/or logically separate servers may be configured to perform the respective functions. The 3D model database construction function may be performed once offline when the database is initially built, and again for each 3D model that is added later. The 3D model search function, by contrast, is performed online in real time.

II. Configuration and functions of the 3D model search server

FIG. 2 is a block diagram of the 3D model search server 300 according to a preferred embodiment of the present invention. As shown in FIG. 2, the 3D model search server 300 may include a database generation unit 310 for building the 3D model database 350 described above, and a query image processing unit 320 and a search processing unit 330 for performing the 3D model search function.

1. Construction of the 3D model search database - multiple depth image-based representation of 3D models

The following describes an adaptive viewpoint sampling technique for the multiple depth image-based representation of a 3D model. Multiple depth image-based representation is a method of representing a 3D model as a set of several depth images acquired from different viewpoints, based on the idea that a 3D shape can be generated from a plurality of depth images. In the present invention, each 3D model stored in the database 350 is represented in this form so that a query depth image acquired through a 3D camera can be compared with the 3D models. However, whereas existing depth image-based search algorithms render the 3D model from viewpoints sampled uniformly on a sphere surrounding it, the present invention uses adaptive viewpoint sampling that takes the geometry of the 3D model into account, so that 3D model search using only a single query depth image can be performed more effectively.

As shown in FIG. 2, the database generation unit 310 may include a normalization module 312, a camera viewpoint determination module 314, and a rotation-invariant descriptor generation module 316. FIG. 3 is a flowchart illustrating the process of building the 3D model search database performed by the 3D model search server according to a preferred embodiment of the present invention. The construction process of the 3D model search database 350 performed by the 3D model search server 300 is described in detail below with reference to the block diagram of the database generation unit 310 in FIG. 2 and the flowchart of FIG. 3.

First, as shown in FIG. 3, the database generation unit 310 selects one of a plurality of 3D models in order to build the 3D model search database (S300). For the selected 3D model, the database generation unit 310 performs a 3D model normalization step (S302), an initial camera viewpoint setting step (S304), an importance calculation step (S306), an adaptive camera viewpoint sampling step (S308), and a rotation-invariant descriptor generation and storage step (S310), so that the selected 3D model is represented and stored as a set of rotation-invariant descriptors. Steps S302 to S310 are repeated for every 3D model to be included in the 3D model search database. Each step is described in more detail below.

1-1. Normalization of the 3D model (S302)

The 3D models in the database 350 all differ in size, origin, and orientation in their local coordinate systems. A normalization process is therefore required for all 3D models in the database 350 before they are represented as multiple depth images. Accordingly, the normalization module 312 is configured to perform at least one normalization on the selected 3D model. Depending on the embodiment, the normalization step (S302) may be performed individually on the selected 3D model, or may be performed in advance on all 3D models in the database 350. More specifically, the normalization module 312 first performs translation normalization by moving the 3D model so that its center lies at the origin of the coordinate system. The rotation-invariant descriptor proposed in the present invention is robust to rotation but sensitive to changes in size. The normalization module 312 therefore performs scale normalization so that the 3D model is inscribed in the unit sphere.
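The two normalizations can be sketched as follows. The description does not say how the model center is computed, so the sketch assumes the vertex centroid; the function name `normalize_model` is hypothetical.

```python
import numpy as np

def normalize_model(vertices):
    """Translate a model to the origin and scale it to fit the unit sphere.

    vertices: (N, 3) array of vertex positions in the model's local frame.
    The center is taken to be the vertex centroid (an assumption).
    """
    v = np.asarray(vertices, dtype=np.float64)
    centered = v - v.mean(axis=0)                    # translation normalization
    radius = np.linalg.norm(centered, axis=1).max()  # farthest vertex from center
    if radius == 0.0:
        raise ValueError("degenerate model: all vertices coincide")
    return centered / radius                         # scale normalization
```

After this step the farthest vertex lies exactly on the unit sphere, so rendered depth images of different models are comparable in scale.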

1-2. Initial camera viewpoint setup using an l-hedron (S304)

Before the rotation-invariant descriptors for view-based similarity measurement can be generated, each 3D model in the database must be rendered from specific camera viewpoints to acquire depth images. Accordingly, once normalization of the 3D model is completed in step S302, the camera viewpoint determination module 314 may be configured to set up an l-hedron (where l is a positive integer of 4 or more) inscribed in the unit sphere used for scale normalization, and to set each vertex of the l-hedron as an initial camera viewpoint. The l-hedron is preferably an icosahedron, each face of which is a triangle. For ease of understanding and explanation, the following description is based on an embodiment in which an icosahedron is used to determine the initial camera viewpoints, which are then adaptively sampled according to importance; however, the present invention is not limited thereto.
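As an illustration of the initial viewpoint setup, the sketch below builds an icosahedron inscribed in the unit sphere using the standard golden-ratio vertex coordinates and recovers its 20 triangular faces from vertex adjacency. The function name `icosahedron` and this particular construction are illustrative choices; the description only requires that the vertices of the inscribed polyhedron serve as initial camera viewpoints.

```python
import itertools
import numpy as np

def icosahedron():
    """Return (vertices, faces) of an icosahedron inscribed in the unit sphere.

    vertices: (12, 3) array; each row is an initial camera position that
              looks toward the origin.
    faces:    list of 20 index triples, one per triangular face.
    """
    phi = (1.0 + 5.0 ** 0.5) / 2.0  # golden ratio
    verts = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            # Cyclic permutations of (0, +-1, +-phi).
            verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    verts = np.array(verts)
    verts /= np.linalg.norm(verts, axis=1, keepdims=True)  # project to unit sphere

    # Two vertices share an edge when their distance equals the edge length.
    edge = 2.0 / np.sqrt(1.0 + phi * phi)
    dist = np.linalg.norm(verts[:, None, :] - verts[None, :, :], axis=2)
    adjacent = np.isclose(dist, edge)
    faces = [f for f in itertools.combinations(range(12), 3)
             if adjacent[f[0], f[1]] and adjacent[f[1], f[2]] and adjacent[f[0], f[2]]]
    return verts, faces
```

Deriving the faces from adjacency avoids hard-coding a face table; every triple of mutually adjacent vertices of an icosahedron is one of its faces.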

The camera viewpoint determination module 314 may be configured to sample camera viewpoints by applying mesh subdivision to an icosahedron inscribed in a sphere of unit radius. FIG. 4 illustrates the result of mesh subdivision of an icosahedron according to the prior art: the initial icosahedron is shown on the left, and the result of subdividing it according to the prior art is shown on the right. When mesh subdivision is applied to all vertices of the icosahedron, a unit sphere with uniformly distributed vertices on its surface is produced, as shown on the right of FIG. 4, and each vertex represents a camera viewpoint looking toward the center of the sphere. However, uniform subdivision of this kind does not take the characteristics of the 3D shape into account, and is therefore inefficient for an algorithm that uses only a single partial depth image as query data. Uniform sampling can produce viewpoints that are unnecessary for similarity comparison, which can reduce search accuracy and waste data. To solve this problem, the camera viewpoint determination module 314 may be configured to adaptively sample viewpoints suited to similarity search by considering the geometry of each 3D model. The adaptive sampling of camera viewpoints performed by the camera viewpoint determination module 314 is described in detail below.

1-3. Calculating the importance of depth images for different camera viewpoints (S306)

Once the initial camera viewpoints are set, the camera viewpoint determination module 314 may be configured to acquire a depth image of the 3D model at each initial camera viewpoint and to calculate the importance of each acquired depth image. The importance of a depth image is a normalized numerical measure of how well the depth image acquired at a given camera viewpoint represents the 3D model, and/or how likely it is to resemble the depth image the user will capture as a query image. A depth image with high importance therefore contains relatively more of the characteristic parts of the 3D model than one with low importance, and/or is relatively more likely to be captured by the user as a query image. The importance of a depth image is used interchangeably with the importance of the camera viewpoint at which it was acquired. The camera viewpoint determination module 314 may be configured to perform adaptive sampling of camera viewpoints by calculating the importance (saliency) of the 3D model, which varies with camera position. The importance may be determined based on at least one of the projected area of the 3D model, its curvature, and the camera pose, each of which varies with the camera viewpoint. Each importance factor takes a normalized value between 0.0 and 1.0, and the overall importance S(v) at camera viewpoint v may be calculated as the weighted sum of the three importance factors, as in Equation 1.

The weights used to calculate the importance may be set in various ways as needed. For example, when area matters more for a given 3D model, the weight assigned to area may be set higher than the other weights. More preferably, in one embodiment of the present invention, the weights may be set to α=1.0, β=1.5, and γ=0.8, in which case optimal search performance can be obtained.
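Equation 1 itself is not reproduced in this text. Based on the description (a weighted sum of three factors, each normalized to the range 0.0 to 1.0), a plausible form is S(v) = α·Area(v) + β·Curvature(v) + γ·Pose(v). The pairing of α with area, β with curvature, and γ with pose follows the order in which the factors are listed and is an assumption, as is the function name `total_saliency`.

```python
def total_saliency(area, curvature, pose, alpha=1.0, beta=1.5, gamma=0.8):
    """Weighted sum of the three importance factors for one camera viewpoint.

    Each factor is expected to be normalized to [0.0, 1.0]. The default
    weights are the example values from the description; which weight
    belongs to which factor is assumed from the order of presentation.
    """
    for name, value in (("area", area), ("curvature", curvature), ("pose", pose)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} importance must lie in [0, 1], got {value}")
    return alpha * area + beta * curvature + gamma * pose
```

With these example weights the overall importance ranges from 0.0 to 3.3, so it is a relative score for comparing viewpoints rather than a normalized value.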

1-3-1. Area importance

The camera viewpoint determination module 314 may be configured to calculate a higher area importance the larger the area of the 3D model projected from a given camera viewpoint. In general, the larger the projected area of an object, the better the viewpoint expresses the object's characteristics. However, because the maximum area each 3D model can present differs, the projected area cannot be used as the importance in absolute terms. To normalize the area importance, uniform viewpoints on the unit sphere are therefore sampled and the projected areas of the object at those viewpoints are compared to estimate the maximum area over which each 3D model can be projected. The final area importance Area(v) may then be calculated, as in Equation 2, as the ratio of the total number of pixels contained in F(x), the foreground region of the object projected from camera viewpoint v, to the maximum number of pixels the 3D model can present.

In Equation 2, Area_max denotes the maximum area over which the projected 3D model can appear.
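The area importance described above can be sketched as a pixel-count ratio. The sketch assumes that background pixels in a rendered depth image carry a known value (0 here) and that Area_max is estimated as the largest foreground count over a set of uniformly sampled views. The clamp to 1.0 is an added safeguard for views that exceed the estimate and is not taken from the description; the function names are hypothetical.

```python
import numpy as np

def foreground_pixels(depth, background=0.0):
    """Number of pixels covered by the projected model in a rendered depth image."""
    return int(np.count_nonzero(np.asarray(depth) != background))

def area_importance(depth, uniform_view_depths, background=0.0):
    """Foreground pixel count at this view divided by the estimated maximum.

    uniform_view_depths: depth images rendered from uniformly sampled
    viewpoints, used only to estimate Area_max.
    """
    area_max = max(foreground_pixels(d, background) for d in uniform_view_depths)
    if area_max == 0:
        raise ValueError("no foreground pixels in any uniformly sampled view")
    # Clamp in case this view is larger than every uniformly sampled view.
    return min(1.0, foreground_pixels(depth, background) / area_max)
```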

1-3-2. Curvature importance

The camera viewpoint determination module 314 may also be configured to calculate a higher curvature importance the more high-curvature regions the surface of the 3D model projected from a given camera viewpoint contains. Along with area importance, the more high-curvature regions the surface of the projected object contains, the better the viewpoint represents the characteristics of the 3D model. As with area importance, the absolute value cannot be used directly, so the sums of mean curvature of the 3D model projected from uniformly sampled viewpoints are compared to estimate the maximum sum of mean curvature, which is used for normalization. The curvature importance Curvature(v) may then be calculated, as in Equation 3, as the ratio of the sum of the absolute values of the mean curvature C(x) over all pixels x belonging to the surface F(x) of the object projected from camera viewpoint v to the maximum sum of mean curvature obtained above.

In Equation 3, Curvature_max is the maximum sum of curvature that the projected 3D model can present, and the mean curvature C(x) may be calculated, as in Equation 4, as the average of the principal curvatures with respect to adjacent vertices, namely the minimum curvature k_min and the maximum curvature k_max.
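The curvature importance can be sketched in the same way as the area importance. The sketch assumes a per-pixel mean curvature map has already been rendered for each view (how the principal curvatures are estimated on the mesh is outside its scope), computes C(x) = (k_min + k_max) / 2, and normalizes the sum of |C(x)| over the foreground by the largest such sum among uniformly sampled views. Function names and the clamp to 1.0 are illustrative assumptions.

```python
import numpy as np

def mean_curvature(k_min, k_max):
    """Mean curvature as the average of the two principal curvatures."""
    return 0.5 * (np.asarray(k_min, dtype=np.float64) + np.asarray(k_max, dtype=np.float64))

def curvature_sum(curv_map, fg_mask):
    """Sum of |C(x)| over the foreground pixels of one rendered view."""
    curv = np.asarray(curv_map, dtype=np.float64)
    mask = np.asarray(fg_mask, dtype=bool)
    return float(np.abs(curv[mask]).sum())

def curvature_importance(curv_map, fg_mask, uniform_views):
    """Curvature sum at this view divided by the estimated maximum.

    uniform_views: iterable of (curv_map, fg_mask) pairs rendered from
    uniformly sampled viewpoints, used to estimate Curvature_max.
    """
    curv_max = max(curvature_sum(c, m) for c, m in uniform_views)
    if curv_max == 0.0:
        raise ValueError("model appears flat in every uniformly sampled view")
    return min(1.0, curvature_sum(curv_map, fg_mask) / curv_max)
```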

1-3-3. Camera pose importance

The camera viewpoint determination module 314 may also be configured to calculate a higher camera pose importance the smaller the angle between the direction from which a given camera viewpoint views the 3D model and the direction in which the top of the 3D model points (the upright direction). More specifically, assuming that an image of an object in the real world is being acquired, the direction in which the top of the object points is called the upright direction. The smaller the angle between the camera direction and the upright direction, the more likely an image is to be acquired from that viewpoint, and the more likely the view is to contain many of the object's features. This is because ordinary objects resting on a roughly flat surface tend to have many features on their upper side. Accordingly, as in Equation 5, the camera viewpoint determination module 314 may be configured to calculate the camera pose importance from the angle θ_v between the camera direction and the upright direction, so that it is minimal at the lowest part of the object and maximal when the camera direction coincides with the upright direction.
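Equation 5 is not reproduced in this text, so its exact form is unknown. The sketch below uses (1 + cos θ_v) / 2 as a stand-in that satisfies the stated properties: it equals 1 when the camera direction coincides with the upright direction, 0 when the camera is directly below the object, and lies in the range 0.0 to 1.0 in between. It also assumes that the camera direction means the direction from the model center to the camera position and that the upright direction is the +z axis; both are assumptions, and `pose_importance` is a hypothetical name.

```python
import numpy as np

def pose_importance(camera_pos, upright=(0.0, 0.0, 1.0)):
    """Pose importance from the angle between camera direction and upright.

    camera_pos: camera position relative to the model center.
    upright:    direction in which the top of the model points.
    Uses (1 + cos(theta)) / 2, an assumed form chosen only to match the
    described minimum (camera below) and maximum (camera above).
    """
    c = np.asarray(camera_pos, dtype=np.float64)
    u = np.asarray(upright, dtype=np.float64)
    norm = np.linalg.norm(c) * np.linalg.norm(u)
    if norm == 0.0:
        raise ValueError("camera position and upright direction must be non-zero")
    cos_theta = np.clip(np.dot(c, u) / norm, -1.0, 1.0)
    return float(0.5 * (1.0 + cos_theta))
```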

FIGS. 5A and 5B illustrate how importance changes with camera viewpoint. They show, for two 3D models, the combined variation of the three importance factors described above as the camera viewpoint changes. In FIGS. 5A and 5B, the color of the 3D model indicates the mean curvature at the surface: red regions have high curvature and green regions have low curvature.

1-4. Adaptive sampling of camera viewpoints (S308)

After the importance at each camera viewpoint has been obtained, the camera viewpoint determination module 314 according to the present invention may be configured to calculate, for each face of the icosahedron, the importance of that face based on the importances of the camera viewpoints that make up the face. That is, in order to sample more camera viewpoints where the importance is highest, the camera viewpoint determination module 314 uses Equation 6 below to calculate the importance S(i,j,k) of every vertex set forming a triangle (that is, each face of the icosahedron). The face importance calculated in this way may be used to adaptively subdivide the icosahedron mesh.

If the mesh were subdivided using only the importance of each viewpoint, the newly created viewpoints would also have very high importance, so vertex subdivision could become concentrated in a single region. To prevent this, the camera viewpoint determination module 314 according to the present invention may be configured to additionally apply a weight according to the subdivision depth. Here, n denotes the maximum depth to which the vertices have been subdivided, and its initial value is set to 0. In one exemplary embodiment, the weight may be set to λ=0.8, which keeps the mesh subdivision of the icosahedron from concentrating on one side. Accordingly, the camera viewpoint determination module 314 according to the present invention is configured to calculate the importance of each face using Equation 6 as described above and, based on the calculated face importances, to subdivide the icosahedron mesh until a preset condition is satisfied. For example, the camera viewpoint determination module 314 may be configured to adaptively sample camera viewpoints by subdividing the mesh until 100 camera viewpoints are obtained.
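The subdivision loop described above can be sketched as follows. Equation 6 is not reproduced in this text, so the face score used here, `λ**n` times the mean importance of the three face vertices, is an assumption chosen to match the described effect of the depth weight. The `importance` callback, the function names, and the 1-to-4 midpoint split are likewise illustrative choices, not the patented implementation.

```python
import heapq
import itertools
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def icosahedron():
    """Return the 12 unit vertices and 20 triangular faces of an icosahedron."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    verts = []
    for a, b in itertools.product((-1.0, 1.0), (-phi, phi)):
        verts += [normalize(p) for p in ((0.0, a, b), (a, b, 0.0), (b, 0.0, a))]
    edge = min(math.dist(verts[0], v) for v in verts[1:])
    # A face is any vertex triple whose three pairwise distances equal the edge length.
    faces = [f for f in itertools.combinations(range(12), 3)
             if all(abs(math.dist(verts[i], verts[j]) - edge) < 1e-9
                    for i, j in itertools.combinations(f, 2))]
    return verts, faces


def adaptive_viewpoints(importance, target=100, lam=0.8):
    """Split the highest-scoring face until at least `target` viewpoints exist.

    importance: hypothetical callback mapping a unit viewpoint to a score >= 0.
    Assumed face score: lam**depth * mean(vertex importances).
    """
    verts, faces = icosahedron()
    heap, order, midpoints = [], itertools.count(), {}

    def push(face, depth):
        score = lam ** depth * sum(importance(verts[i]) for i in face) / 3.0
        heapq.heappush(heap, (-score, next(order), face, depth))

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoints:  # shared edges reuse the same new viewpoint
            verts.append(normalize(tuple((a + b) / 2.0 for a, b in zip(verts[i], verts[j]))))
            midpoints[key] = len(verts) - 1
        return midpoints[key]

    for face in faces:
        push(face, 0)
    while len(verts) < target:
        _, _, (i, j, k), depth = heapq.heappop(heap)
        a, b, c = midpoint(i, j), midpoint(j, k), midpoint(k, i)
        for child in ((i, a, c), (a, j, b), (c, b, k), (a, b, c)):
            push(child, depth + 1)
    return verts
```

Because each split adds up to three viewpoints, the loop may stop slightly above the target (for example at 101 or 102 rather than exactly 100). With `lam` closer to 1 the depth penalty weakens and splits concentrate in the most important region; with smaller `lam` the sampling approaches a uniform subdivision.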

FIGS. 6A to 6D are exemplary views comparing camera viewpoint sampling results as the weights used to calculate the importance are varied. FIGS. 6A to 6D show the adaptive sampling results for different values of the weights α, β, γ, and λ described so far. FIG. 6A shows the camera viewpoint sampling results when α is varied over 0.1, 0.3, 1.0, and 2.0 with the other weights set to β=0.1, γ=0.1, and λ=0.8. As can be seen in FIG. 6A, as the weight α of the area importance increases, more viewpoints are sampled at camera positions from which a larger area is projected (to the left and right of the 3D model). FIG. 6B shows the camera viewpoint sampling results when β is varied over 0.1, 0.3, 1.0, and 2.0 with the other weights set to α=0.1, γ=0.1, and λ=0.8. As can be seen in FIG. 6B, as the weight β of the curvature importance increases, more viewpoints are sampled where the curvature varies strongly (to the left and right of the 3D model).

FIG. 6C shows the camera viewpoint sampling results when γ is varied over 0.1, 0.3, 1.0, and 2.0 with the other weights set to α=0.1, β=0.1, and λ=0.8. As can be seen in FIG. 6C, as the weight γ of the camera pose importance increases, more viewpoints are sampled near the upright direction. Finally, FIG. 6D shows the camera viewpoint sampling results when λ is varied over 0.7, 0.8, 0.9, and 1.0 with the other weights set to α=0.1, β=1.5, and γ=0.8. As can be seen in FIG. 6D, the smaller the weight λ applied according to the recursive subdivision depth, the closer the result is to uniform viewpoint sampling; conversely, the larger λ becomes, the more the mesh subdivision of the icosahedron converges on a particular region. FIG. 7 is an exemplary view showing the result of sampling camera viewpoints using predetermined weights according to the present invention. FIG. 7 shows the camera viewpoint sampling result obtained with the weights finally used in one embodiment of the present invention (α=1.0, β=1.5, γ=0.8, λ=0.8).

1-5. Generation of rotation-invariant descriptors for the depth images (S310)

Once the camera viewpoint determination module 314 according to the present invention has performed the steps described above (S304 to S308) to adaptively sample camera viewpoints, and has determined the sampled viewpoints as the camera viewpoints from which depth images of the 3D model are to be acquired, the rotation-invariant descriptor generation module 316 according to the present invention is configured to acquire a depth image at each of the sampled camera viewpoints and to generate a rotation-invariant descriptor for each acquired depth image.

Neither the query depth image received from the 3D camera 110 of the user terminal 100 nor the images of the multi-depth-image representation contain information about the rotation of the camera. The similarity must therefore be measured using rotation-invariant descriptors generated for all rendered depth images. To this end, the rotation-invariant descriptor generation module 316 according to the present invention may be configured to generate a rotation-invariant descriptor for a depth image using Zernike moments. However, the present invention is not limited to the use of Zernike moments; various known techniques capable of generating information with rotation-invariant properties may be adopted as needed, and it will be apparent to those skilled in the art that such cases also fall within the scope of the present invention as long as they include its technical gist. Zernike moments are built on a set of complex polynomials that are orthogonal inside the unit circle and therefore have rotation-invariant properties; as shown in Equation 7 below, they are obtained by projecting the image f(x,y) onto the orthogonal basis functions. N denotes the resolution of the image, and k and m denote the orders of the Zernike moment. The order k is a natural number satisfying k∈N⁺, and m ranges over all natural numbers satisfying m ≤ k for which k − m is even. The rotation-invariant descriptor generation module 316 according to an exemplary embodiment of the present invention may be configured to compute orders k up to 13, generating 56 Zernike moments per depth image.
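The figure of 56 moments per depth image can be checked by enumerating the admissible order pairs. The sketch below assumes the usual Zernike constraints (0 ≤ m ≤ k with k − m even) and assumes that k starts at 0, since that is the range for which orders up to 13 give exactly 56 pairs. The function name is hypothetical.

```python
def zernike_orders(max_order=13):
    """List (k, m) pairs with 0 <= m <= k and k - m even (assumed constraints)."""
    return [(k, m)
            for k in range(max_order + 1)
            for m in range(k % 2, k + 1, 2)]


orders = zernike_orders(13)
print(len(orders))  # number of moments per depth image
```

Each order k contributes floor(k/2) + 1 pairs, so orders 0 through 13 contribute 1+1+2+2+3+3+4+4+5+5+6+6+7+7 = 56.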

In Equation 7, the radial polynomial R_km for the orders k and m is defined as in Equation 8 below.

Because this has a complexity of O(k³) in the order k, recomputing it each time a descriptor is generated for a depth image is very inefficient. Since the polynomials R_km for images of the same resolution all have the same size, the rotation-invariant descriptor generation module 316 according to the present invention may be configured to compute them only once, store the resulting spectrum, and reuse it when generating descriptors for other images, thereby minimizing computation.
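The compute-once-and-reuse idea can be sketched as below. Equations 7 and 8 are not reproduced in this text, so the code uses the standard textbook definitions of the Zernike radial polynomial and moment as an assumption; the pixel-to-unit-disk mapping, the normalization constant, and all function names are illustrative. `functools.lru_cache` stands in for the stored spectrum.

```python
import cmath
import math
from functools import lru_cache
from math import factorial


def radial(k, m, rho):
    """Standard Zernike radial polynomial R_km(rho), for 0 <= m <= k, k - m even."""
    total = 0.0
    for s in range((k - m) // 2 + 1):
        coeff = (-1) ** s * factorial(k - s) / (
            factorial(s) * factorial((k + m) // 2 - s) * factorial((k - m) // 2 - s))
        total += coeff * rho ** (k - 2 * s)
    return total


@lru_cache(maxsize=None)
def basis_table(n, max_order=13):
    """Conjugate basis values for every (k, m), computed once per resolution n."""
    table = {}
    for k in range(max_order + 1):
        for m in range(k % 2, k + 1, 2):
            entries = []
            for y in range(n):
                for x in range(n):
                    # Map pixel centres to [-1, 1] and keep those inside the unit disk.
                    u, v = (2 * x - n + 1) / n, (2 * y - n + 1) / n
                    rho = math.hypot(u, v)
                    if rho <= 1.0:
                        w = radial(k, m, rho) * cmath.exp(-1j * m * math.atan2(v, u))
                        entries.append((y, x, w))
            table[(k, m)] = entries
    return table


def zernike_descriptor(image, max_order=13):
    """Magnitudes of the Zernike moments of a square image (list of rows)."""
    n = len(image)
    descriptor = []
    for (k, m), entries in basis_table(n, max_order).items():
        z = sum(image[y][x] * w for y, x, w in entries)
        descriptor.append((k + 1) / math.pi * abs(z))
    return descriptor
```

Only the magnitudes are kept because a rotation of the image changes each complex moment by a phase factor and leaves its magnitude unchanged. The first call for a given resolution pays for the polynomial evaluation; later calls at the same resolution reuse the cached table.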

Through the process described above, one 3D model can be represented as a plurality of depth images acquired from a plurality of camera viewpoints adaptively sampled on the basis of importance, and can be structured and stored in the 3D model search database 350. More preferably, the 3D model is represented as the plurality of rotation-invariant descriptors generated from the respective depth images, and is structured and stored in the 3D model search database 350 in that form. For example, the database generation unit 310 according to the present invention may be configured to acquire depth images at each of 100 adaptively sampled camera viewpoints for one 3D model and to generate 56 Zernike moments for each depth image. In this case, one 3D model is represented by 5600 Zernike moments.

2. Content-based 3D model search using a single depth image

Referring again to FIG. 2, the 3D model search server 300 according to the present invention may include a query image processing unit 320 and a search processing unit 330, which receive a single depth image transmitted from the user terminal 100 as a query image and search for 3D models that match the received query image. The query image processing unit 320 according to the present invention is configured to process the received query image so that it is suitable for 3D model search, and the search processing unit 330 may be configured to perform the search by comparing the similarity between the processed query image output from the query image processing unit 320 and the depth images stored in the 3D model search database 350. To perform these functions, the query image processing unit 320 according to the present invention may include a filter module 322, a perspective correction module 324, and a rotation-invariant descriptor generation module 326. FIG. 8 is a flowchart illustrating the 3D model search process performed in the 3D model search server according to a preferred embodiment of the present invention. The content-based 3D model search process performed in the 3D model search server 300 according to the present invention is described in detail below with reference to FIGS. 2 and 8.

2-1. Filtering of the query image (S802)

The query image processing unit 320 according to the present invention may be configured to receive a single depth image transmitted from the user terminal 100 as a query image (S800) and to process the received query image so that it is suitable for 3D model search (S802, S804). To perform this function, the query image processing unit 320 may include a filter module 322 that removes 3D camera noise from the received query image and a perspective correction module 324 that removes the perspective component of the query image.

First, the filtering of the query image is described in detail. Because the 3D model search algorithm proposed in the present invention uses only a single, noisy depth image as query data, a cleaned depth image is essential. Accordingly, the filter module 322 according to the present invention may be configured to remove the 3D camera noise from the received query image by bilateral filtering, a smoothing filter that preserves edges. That is, the filtering result I_bilateral(p) at pixel position p is obtained by taking each neighboring pixel q contained in the mask S, multiplying its value I_q by two Gaussian weights, one that varies with the spatial distance between the pixels p and q and one that varies with the difference in their intensities, and normalizing by the number of pixels inside S. Expressed as a formula, this is Equation 9 below.

In this way, the filter module 322 according to the present invention filters the image by considering the spatial distance between pixels together with the difference in their intensities, and can thereby effectively remove the noise in the depth image while preserving the contour of the object.
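A bilateral filter of the kind described above can be sketched in a few lines. Equation 9 is not reproduced in this text; the sketch below normalizes by the sum of the weights, which is the common textbook form and is an assumption here. The window radius, `sigma_s`, and `sigma_r` are hypothetical parameters.

```python
import math


def bilateral_filter(depth, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Edge-preserving smoothing of a depth image given as a list of rows.

    sigma_s controls the spatial Gaussian, sigma_r the intensity (range) Gaussian.
    Normalization by the summed weights is an assumption (textbook form).
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for py in range(h):
        for px in range(w):
            acc, norm = 0.0, 0.0
            for qy in range(max(0, py - radius), min(h, py + radius + 1)):
                for qx in range(max(0, px - radius), min(w, px + radius + 1)):
                    dist2 = (py - qy) ** 2 + (px - qx) ** 2
                    diff = depth[py][px] - depth[qy][qx]
                    weight = (math.exp(-dist2 / (2.0 * sigma_s ** 2))
                              * math.exp(-diff * diff / (2.0 * sigma_r ** 2)))
                    acc += weight * depth[qy][qx]
                    norm += weight
            out[py][px] = acc / norm  # norm >= 1 because the centre pixel has weight 1
    return out
```

Pixels across a depth discontinuity differ strongly in value, so their range weight is close to zero and they barely contribute; this is what keeps object contours sharp while small sensor noise is averaged out.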

2-2. Perspective correction of the query image (S804)

The perspective correction module 324 according to the present invention may be configured to remove the perspective component from the filtered query image. For viewpoint-based 3D model search, all depth images must be produced with the same projection. Two approaches are possible: acquiring the depth images of the 3D models in the database 350 in an environment with the same perspective component as the query image, or removing the perspective component from the query image and acquiring the depth images of the 3D models in the database 350 by orthogonal projection. In the present invention, the depth images of the 3D models in the database 350 are orthogonally projected, and the perspective component is removed from the query depth image from the 3D camera 110, so that both kinds of image are orthogonal projections. Accordingly, the perspective correction module 324 according to the present invention is configured to remove the perspective component of the query depth image by representing every pixel of the query depth image as a point cloud in 3D space and orthogonally projecting that point cloud to acquire a new depth image. To represent the pixels of the depth image as a point cloud, the coordinates of each pixel must be obtained in actual 3D space rather than in the 2D image, and this is done using Equation 10 below.

In Equation 10, (u,v) are the coordinates of a pixel in the acquired depth image, and x, y, and z are the coordinates of the corresponding point of the point cloud along the three principal axes of actual 3D space. C_x and C_y denote the principal point of the camera, F is the focal length of the camera, and d(u,v) is the depth value at the depth image pixel (u,v).
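The two steps of perspective removal can be sketched as follows. Equation 10 is not reproduced in this text, so the back-projection uses the standard pinhole relation x = (u − C_x)·d/F, y = (v − C_y)·d/F, z = d as an assumption that is consistent with the symbols defined above. Treating a depth of 0 as a missing measurement, the bounding-box scaling, and the function names are also assumptions.

```python
def back_project(depth, focal, cx, cy):
    """Turn a depth image (list of rows) into a list of 3D points (x, y, z)."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d > 0:  # assumption: 0 marks a pixel with no depth measurement
                points.append(((u - cx) * d / focal, (v - cy) * d / focal, d))
    return points


def orthographic_depth(points, size):
    """Re-render the point cloud as a size x size orthographic depth image."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    extent = max(max(xs) - x0, max(ys) - y0) or 1.0
    image = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        col = min(size - 1, int((x - x0) / extent * size))
        row = min(size - 1, int((y - y0) / extent * size))
        # Keep the nearest surface when several points fall into one cell.
        if image[row][col] == 0.0 or z < image[row][col]:
            image[row][col] = z
    return image


depth = [[0.0, 0.0, 0.0],
         [0.0, 2.0, 2.0],
         [0.0, 2.0, 4.0]]
cloud = back_project(depth, focal=1.0, cx=1.0, cy=1.0)
ortho = orthographic_depth(cloud, size=4)
```

In this small example the far pixel (depth 4.0) sits next to the near pixels in the perspective image but lands farther away from them after orthographic re-projection, which is the distortion the correction is meant to undo. A practical implementation would also need to fill the holes that re-projection leaves between samples.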

2-3. Generation of the rotation-invariant descriptor of the query image (S806)

The rotation-invariant descriptor generation module 326 included in the query image processing unit 320 according to the present invention is configured to generate a rotation-invariant descriptor for the filtered, perspective-corrected query image, so that the query image can be compared with the rotation-invariant descriptors of the 3D models stored in the database 350. The rotation-invariant descriptor generation module 326 in the query image processing unit 320 generates the descriptor of the query image using substantially the same algorithm as the rotation-invariant descriptor generation module 316 in the database generation unit 310 described above, so a duplicate description is omitted. Moreover, because the module 326 in the query image processing unit 320 and the module 316 in the database generation unit 310 are substantially identical, one of the two rotation-invariant descriptor generation modules 316 and 326 may be omitted when the 3D model search server 300 is configured to perform both the database construction function and the 3D model search function. The rotation-invariant descriptor generation module 326 included in the query image processing unit 320 generates the rotation-invariant descriptor of the query image, that is, its Zernike moments, and outputs it to the search processing unit 330.

2-4. Similarity comparison and search result generation (S808 to S812)

The search processing unit 330 according to the present invention is configured to receive the rotation-invariant descriptor of the query image output from the query image processing unit 320 and to measure similarity by comparing the rotation-invariant descriptor of the query image with the rotation-invariant descriptors of the 3D models stored in the database 350.

2-4-1. Similarity calculation using descriptors

Finally, the search processing unit 330 according to the present invention compares the descriptor of the query image with the descriptors of every 3D model in the database 350. As shown in Equation 11 below, the dissimilarity D of each model is the difference to the viewpoint, among its n camera viewpoints, that minimizes the Zernike moment difference from the query data. The smaller D is, the more similar the 3D model is judged to be to the query image.

The search processing unit 330 according to the present invention then selects 3D models in ascending order of D, generates a search result containing information on the selected 3D models, and transmits it to the user terminal 100, which completes the 3D model search method.
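The ranking step can be sketched as below. Equation 11 is not reproduced in this text; the sketch assumes that the per-view difference is the sum of absolute differences between corresponding Zernike moment magnitudes and that D is the minimum of that difference over a model's views. The database layout (a dict from model name to a list of per-view descriptors) and the function names are hypothetical.

```python
def view_distance(query, view):
    """Assumed per-view difference: sum of absolute moment differences."""
    return sum(abs(a - b) for a, b in zip(query, view))


def model_dissimilarity(query, views):
    """D for one model: the smallest difference over all of its views."""
    return min(view_distance(query, view) for view in views)


def rank_models(query, database, top=5):
    """Return up to `top` (model, D) pairs in ascending order of D."""
    scored = [(name, model_dissimilarity(query, views))
              for name, views in database.items()]
    return sorted(scored, key=lambda item: item[1])[:top]


database = {
    "chair": [[1.0, 2.0], [5.0, 5.0]],
    "table": [[2.0, 2.0]],
    "lamp": [[9.0, 9.0]],
}
ranking = rank_models([1.0, 2.0], database)
```

Taking the minimum over views means a model is judged by its best-matching viewpoint, so a single depth image suffices as a query regardless of the direction from which it was captured.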

2-4-2. Parallelization of the similarity measurement

More preferably, the search processing unit 330 according to the present invention may be configured to apply NVIDIA CUDA (compute unified device architecture) to Equation 11 above and to perform the similarity measurement as a parallel computation on a GPGPU (general-purpose GPU). This is because, when the 3D model search algorithm according to the present invention is extended to a web service with a very large amount of data, similar models must be retrievable in real time, which requires the computation to be accelerated through parallelization. Equation 11 is also well suited to parallelization, because the computation can be performed independently in every thread on the same input data (that is, the same rotation-invariant descriptor generated from the query image).

The parallel processing of the similarity measurement is described in more detail as follows. First, the descriptors for all 3D models and viewpoints stored in the database 350 and the descriptor of the query image are copied to GPU memory. Once the data copy is complete, pqn threads are created, and within each thread the difference in Equation 11 between the descriptor of the query data and the descriptor for each model and viewpoint is calculated. As a result of this step, each thread obtains the difference between the descriptor of order (k, m) of the query data and the descriptor of the same order in the database 350. Then, where the total number of threads is denoted i, summing the results from the ni-th thread through the (ni+pq)-th thread gives the final similarity for the n-view.
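The thread layout described above can be emulated sequentially to make the data flow explicit. This is plain Python rather than CUDA, each list element plays the role of one GPU thread, and the mapping of the symbols p, q, and n onto models, views, and moments is not spelled out in this text, so the three dimensions below are named explicitly instead. The absolute-difference metric is the same assumption as for Equation 11.

```python
def parallel_style_dissimilarity(query, database):
    """Emulate one-thread-per-moment evaluation of the model dissimilarities.

    database[model][view] is a descriptor with len(query) entries; every model
    is assumed to have the same number of views.
    """
    n_models = len(database)
    n_views = len(database[0])
    n_moments = len(query)

    # "Copy to device": flatten all descriptors into one contiguous array.
    flat = [value for model in database for view in model for value in view]

    # Stage 1: one independent "thread" per (model, view, moment) element.
    diffs = [abs(flat[tid] - query[tid % n_moments]) for tid in range(len(flat))]

    # Stage 2: segmented sum, one result per (model, view).
    per_view = [sum(diffs[start:start + n_moments])
                for start in range(0, len(diffs), n_moments)]

    # Stage 3: minimum over the views of each model.
    return [min(per_view[m * n_views:(m + 1) * n_views]) for m in range(n_models)]


db = [[[1.0, 2.0], [5.0, 5.0]],
      [[2.0, 2.0], [3.0, 4.0]]]
result = parallel_style_dissimilarity([1.0, 2.0], db)
```

Stage 1 has no dependencies between elements, which is why it maps directly onto independent GPU threads; stages 2 and 3 are reductions that a GPU implementation would typically perform with a parallel reduction or a second kernel.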

III. Performance evaluation of the content-based 3D model search method using a single depth image according to the present invention

The search results and a performance analysis of the 3D model search method according to an embodiment of the present invention are presented below. The 3D model search interface was implemented in C using Visual Studio on Windows 7, with a 2.5GHz Intel Core2 Quad CPU, an NVIDIA GeForce GTX570 GPU, and 2GB of memory. The PSB (Princeton Shape Benchmark) 3D model database for testing search engines was used as the set of 3D models in the experiments. The PSB consists of 907 3D models in 131 classes. The multi-view depth images were acquired by rendering the 3D models of the database at 64×64 resolution in an OpenGL environment, while the importance analysis of the 3D models for the multi-depth-image representation was performed by rendering the 3D models at 512×512 resolution.

First, Table 1 shows the time required for 3D model search when a query image is input to the 3D model search server 300 with the 3D model search database 350 already built.

Table 1

Step                                      Execution time (msec)
Preprocessing of the input query image    14.3
Descriptor generation                     109.8
Similarity comparison (GPU)               400.4
Similarity comparison (CPU)               1381.6
Total execution time (GPU)                524.5
Total execution time (CPU)                1505.7

As shown in Table 1, parallel processing on the GPU reduces the similarity comparison time to about 29% of the CPU time (400.4 msec versus 1381.6 msec) and the total search time to about 35% of the CPU time (524.5 msec versus 1505.7 msec). The benefit is expected to grow as the size of the database 350 increases.

To analyze the performance of the 3D model search method according to an embodiment of the present invention, precision-recall measurements were carried out. Precision-recall is the most commonly used measure for comparing retrieval performance. Recall is the proportion of all models in the class corresponding to the query model that are actually retrieved, that is, the fraction of the class that the search finds. Precision is the proportion of the retrieved models that belong to the class corresponding to the query model, that is, the fraction of the search results that fall in the class the user wants. In a precision-recall graph, precision generally decreases as recall increases; the higher the curve lies and the smaller the drop in precision as recall increases, the better the search algorithm.
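The two measures can be computed from a ranked result list as in the following minimal sketch. The function name and the input layout (class labels of the retrieved models in rank order, plus the size of the query's class) are illustrative assumptions.

```python
def precision_recall_points(ranked_labels, query_label, class_size):
    """Return (recall, precision) at each rank where a relevant model appears.

    ranked_labels: class labels of the retrieved models, best match first.
    class_size:    number of models in the database that share query_label.
    """
    points = []
    hits = 0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            points.append((hits / class_size, hits / rank))
    return points


# Four results, three of them from the query's class "a", which has 3 members.
points = precision_recall_points(["a", "b", "a", "a"], "a", class_size=3)
```

Here the first relevant result gives recall 1/3 at precision 1, and the last gives recall 1 at precision 3/4; plotting such points and averaging over many queries yields a precision-recall curve of the kind discussed in this section.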

1. Performance analysis using images of real objects

The performance of similar 3D model search from real objects was tested using a Kinect as the consumer-grade 3D camera 110. FIG. 9 is an exemplary view showing search results obtained with the 3D model search method that uses a single depth image as the query image according to a preferred embodiment of the present invention. As FIG. 9 shows, the 3D model search method according to an embodiment of the present invention can acquire a single depth image from the consumer-grade 3D camera 110, use it as the input query, and retrieve geometrically similar models easily and accurately.

2. Performance analysis using depth images of database models

An accurate performance analysis of the 3D model search method requires querying with depth images that correspond to the 3D models in the database 350. However, it is practically impossible to acquire images of real objects that are similar to, or belong to the same class as, every 3D model. In the present invention, therefore, for a total of 211 3D models in 25 classes among the 3D models of the PSB 3D model database for testing search engines, depth images were acquired at viewpoints judged to be important through user analysis and were used as input queries, and a qualitative analysis and a quantitative analysis of the proposed algorithm were each carried out.

2-1. Qualitative performance analysis of the algorithm

FIG. 10 is an exemplary view showing search results obtained with the 3D model search method that uses a single depth image as the query image according to a preferred embodiment of the present invention, alongside search results obtained with a prior-art 3D model search method that uses a 3D model as the query. FIG. 10 compares the results of searching with the 3D model search method according to an embodiment of the present invention against the results of searching with the 3D model search engine provided by Princeton University. The prior-art 3D model search engine used a PSB 3D model as the input query, whereas the 3D model search method according to an embodiment of the present invention used as its query a single depth image acquired from the same 3D model. Because the 3D model search engine uses not only the PSB but a large database of some 36,000 3D models, its results do not exactly match those of the proposed algorithm; nevertheless, it can be seen qualitatively that the proposed algorithm produces results of a similar level even though it uses only a single depth image as the query.

2-2. Quantitative performance analysis of the algorithm

FIG. 11 is a search performance comparison graph comparing search results obtained with adaptive viewpoint sampling according to a preferred embodiment of the present invention with search results obtained with the rendered viewpoints. In FIG. 11, the two sets of search results are compared quantitatively using precision-recall values. The weights for area, curvature, and camera pose in Equation 1 described above were set to α=1.0, β=1.5, and γ=0.8, and the weight according to mesh subdivision depth in Equation 6 was set to λ=0.8. The results confirm that adaptive viewpoint sampling yields better and more stable search performance overall. This can be attributed to adaptive viewpoint sampling reducing the number of viewpoints in regions that are unnecessary or unimportant for search while sampling viewpoints densely in important regions.
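
By way of a non-limiting illustration, the precision-recall evaluation referred to above can be sketched as follows. The function name `precision_recall_curve` and the toy class labels are hypothetical; the sketch assumes that every retrieved model carries a class label and that a result counts as relevant when its class matches the class of the query.

```python
from typing import List, Tuple


def precision_recall_curve(ranked_labels: List[str], query_label: str,
                           num_relevant: int) -> List[Tuple[float, float]]:
    """Return (recall, precision) pairs, one per relevant item in the ranked list.

    ranked_labels: class labels of the retrieved models, best match first.
    num_relevant:  number of database models in the query's class (assumed known).
    """
    if num_relevant <= 0:
        raise ValueError("num_relevant must be positive")
    curve = []
    hits = 0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            # Recall: fraction of relevant models found so far.
            # Precision: fraction of the results so far that are relevant.
            curve.append((hits / num_relevant, hits / rank))
    return curve


if __name__ == "__main__":
    ranked = ["chair", "chair", "table", "chair", "lamp"]
    for recall, precision in precision_recall_curve(ranked, "chair", num_relevant=3):
        print(f"recall={recall:.2f} precision={precision:.2f}")
```

High precision at low recall means that the first few results are almost all of the correct class, which is the regime that matters when only the top-ranked models are shown to the user.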

FIG. 12 is a search performance comparison graph comparing the 3D model search method according to a preferred embodiment of the present invention with other search methods. The comparison is a precision-recall evaluation against a view-based algorithm that uses a user's sketch image as the query, which resembles the present invention in that query data is easy to acquire, and against algorithms that use the 3D model itself as query data and apply view-based or histogram-based search techniques. FIG. 12 confirms that, although only a partial depth image is used as the input query, the proposed method performs close to the prior-art algorithms that use the 3D model itself as the input query. In particular, the 3D model search algorithm according to the present invention shows very high precision at low recall, which supports its competitiveness in applications whose main function is to output the top-ranked 3D models in the similarity comparison. It is also worth noting that, because a depth image carries more information than a sketch, the proposed method shows markedly better search performance than the sketch-based method. Consequently, the 3D model search algorithm according to the present invention not only allows query data to be acquired very simply through the 3D camera 110 but also provides highly accurate 3D model search performance.

IV. Conclusion

In the present invention, a 3D model search technique using a depth image from a 3D camera has been proposed. As preprocessing of the input query image, noise in the depth image input as the query image was removed by bilateral filtering, and the perspective component of the depth image was removed. In the multi-depth-image-based representation of a 3D model, camera viewpoints were sampled adaptively according to the importance of the model, taking into account the curvature of the 3D model, the projected area, and the camera pose. By generating and comparing rotation-invariant descriptors for the depth images at the sampled camera viewpoints, accurate retrieval of similar models was achieved from a partial depth image alone. In addition, parallelization on a GPU shortened the search time, so that the technique can be applied efficiently to future online services or large-scale data services.

It should be noted that the content-based 3D model search technique proposed in the present invention uses only a single depth image as the query input, and can therefore be extended and applied to various environments such as stereo imaging devices and mobile devices as well as 3D cameras. Accordingly, the present invention is not limited to the exemplary embodiment using a 3D camera and can be applied broadly to various devices and/or fields.

Embodiments according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

As described above, the present invention has been described with reference to specific matters such as specific components, limited embodiments, and drawings, but these are provided only to aid overall understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Accordingly, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set forth below but also everything equal or equivalent to those claims falls within the scope of the spirit of the present invention.

100: user terminal 110: entry-level 3D camera
200: network 300: 3D model search server
350: 3D model search database
310: database generation unit 312: normalization module
314: camera viewpoint determination module 316: rotation-invariant descriptor generation module
320: query image processing unit 322: filter module
324: perspective correction module 324: rotation-invariant descriptor generation module
330: search processing unit

Claims

A 3D model search method performed by a 3D model search server, the method comprising:
(A) receiving a single depth image transmitted from a user terminal as a query image;
(B) generating a rotation-invariant descriptor from the received query image;
(C) calculating the similarity between the rotation-invariant descriptor of the query image and rotation-invariant descriptors previously stored in a 3D model search database; and
(D) selecting at least one 3D model based on the calculated similarity, and transmitting a search result including information on the selected 3D model to the user terminal,
wherein the 3D model search database
stores rotation-invariant descriptors of depth images obtained by adaptively sampling camera viewpoints for a 3D model, and
wherein, for the 3D model search database,
the importance of each face of an l-hedron (l being a positive integer greater than or equal to 4) corresponding to the 3D model is calculated based on the importance of each camera viewpoint constituting that face,
mesh subdivision is performed on the l-hedron based on the calculated importance of each face until a preset condition is satisfied, and
each vertex finally obtained by the mesh subdivision is determined as an adaptive camera viewpoint at which a depth image of the 3D model is acquired.

delete

The method according to claim 1,
further comprising removing the perspective component of the received query image.

The method according to claim 5,
wherein, in the step of removing the perspective component of the received query image, all pixels of the query image are expressed as a point cloud in 3D space, and the perspective component is removed by re-acquiring a depth image through orthographic projection of the point cloud in the 3D space.
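
By way of a non-limiting illustration, the perspective-removal step can be sketched as a back-projection followed by an orthographic re-rendering. The pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the output cell size `cell`, the use of 0 for missing depth, and both function names are assumptions made for this sketch and are not specified in the text.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def back_project(depth: List[List[float]], fx: float, fy: float,
                 cx: float, cy: float) -> List[Point]:
    """Convert every valid depth pixel (z > 0) to a 3D point with a pinhole model."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z > 0:
                points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def orthographic_depth(points: List[Point], width: int, height: int,
                       cell: float) -> List[List[float]]:
    """Re-render the point cloud with parallel rays along the z axis.

    Each output pixel covers a cell-by-cell square and the grid is centred on
    the optical axis. When several points fall into one pixel, the nearest one
    is kept. 0.0 marks empty pixels.
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        u = int(round(x / cell + (width - 1) / 2))
        v = int(round(y / cell + (height - 1) / 2))
        if 0 <= u < width and 0 <= v < height:
            if image[v][u] == 0.0 or z < image[v][u]:
                image[v][u] = z
    return image
```

Because the re-rendered pixel position depends only on the metric x and y coordinates, the apparent size of the object no longer changes with its distance from the camera.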

The method according to claim 1,
wherein the pre-stored rotation-invariant descriptors are generated and stored using Zernike moments for each of the depth images of the 3D model, and
wherein step (B) generates the rotation-invariant descriptor of the query image using Zernike moments for the query image.

The method according to claim 1,
wherein, in step (C), the similarity is determined as the minimum value among the differences between the Zernike moments for each of the depth images of the 3D model and the Zernike moments for the query image.
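
By way of a non-limiting illustration, the minimum-difference similarity can be sketched as follows. The descriptors are treated as plain lists of numbers; the Euclidean distance, the dictionary layout of the database, and all function names are assumptions made for this sketch.

```python
import math
from typing import Dict, List, Sequence


def descriptor_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors (the metric is an assumption)."""
    if len(a) != len(b):
        raise ValueError("descriptors must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def model_dissimilarity(query: Sequence[float],
                        view_descriptors: List[Sequence[float]]) -> float:
    """Smallest distance between the query and any stored view of one model."""
    return min(descriptor_distance(query, d) for d in view_descriptors)


def rank_models(query: Sequence[float],
                database: Dict[str, List[Sequence[float]]]) -> List[str]:
    """Model identifiers ordered from most to least similar to the query."""
    return sorted(database, key=lambda name: model_dissimilarity(query, database[name]))
```

Taking the minimum over views means that a model matches the query as soon as any one of its stored depth images resembles the query, which is what allows a single partial view to retrieve the whole model.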

delete

The method according to claim 1,
wherein the 3D model search database storing the pre-stored rotation-invariant descriptors is constructed by a 3D model search database construction method performed by the 3D model search server,
the 3D model search database construction method comprising:
(a) normalizing the 3D model; and
(b) calculating the importance of initial camera viewpoints for the normalized 3D model, and adaptively sampling camera viewpoints for the 3D model based on the calculated importance.

The method according to claim 10,
wherein step (a) comprises:
(a-1) a translation normalization step of moving the 3D model so that the center of the 3D model coincides with the origin of the coordinate system; and
(a-2) a size normalization step of scaling the translated 3D model so that it is inscribed in a unit sphere.
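
By way of a non-limiting illustration, the two normalization steps can be sketched as follows. The sketch assumes that the center of the model is the mean of its vertices and that "inscribed in a unit sphere" means the farthest vertex lies at distance 1 from the origin; the function name `normalize_model` is hypothetical.

```python
import math
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def normalize_model(vertices: List[Vertex]) -> List[Vertex]:
    """Translate the vertex centroid to the origin, then scale into the unit sphere."""
    if not vertices:
        raise ValueError("model has no vertices")
    n = len(vertices)
    cx = sum(v[0] for v in vertices) / n
    cy = sum(v[1] for v in vertices) / n
    cz = sum(v[2] for v in vertices) / n
    # (a-1) translation normalization
    moved = [(x - cx, y - cy, z - cz) for x, y, z in vertices]
    # (a-2) size normalization: the farthest vertex ends up on the unit sphere
    radius = max(math.sqrt(x * x + y * y + z * z) for x, y, z in moved)
    if radius == 0.0:
        raise ValueError("all vertices coincide")
    return [(x / radius, y / radius, z / radius) for x, y, z in moved]
```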

The method according to claim 11,
wherein step (b) comprises:
(b-1) setting an l-hedron (l being a positive integer greater than or equal to 4) inscribed in the unit sphere, and setting each vertex of the l-hedron as an initial camera viewpoint for the normalized 3D model;
(b-2) calculating the importance of each of the initial camera viewpoints; and
(b-3) adaptively sampling camera viewpoints for the 3D model based on the calculated importance of the initial camera viewpoints.

The method according to claim 12,
wherein, in step (b-2), the importance of each initial camera viewpoint is calculated based on at least one of area importance, curvature importance, and camera pose importance.

delete

A 3D model search method performed by a 3D model search server, the method comprising:
(A) receiving a single depth image transmitted from a user terminal as a query image;
(B) generating a rotation-invariant descriptor from the received query image;
(C) calculating the similarity between the rotation-invariant descriptor of the query image and pre-stored rotation-invariant descriptors; and
(D) selecting at least one 3D model based on the calculated similarity, and transmitting a search result including information on the selected 3D model to the user terminal,
wherein a 3D model search database storing the pre-stored rotation-invariant descriptors is constructed by a 3D model search database construction method performed by the 3D model search server,
the 3D model search database construction method comprising:
(a) normalizing the 3D model; and
(b) calculating the importance of initial camera viewpoints for the normalized 3D model, and adaptively sampling camera viewpoints for the 3D model based on the calculated importance,
wherein step (a) comprises:
(a-1) a translation normalization step of moving the 3D model so that the center of the 3D model coincides with the origin of the coordinate system; and
(a-2) a size normalization step of scaling the translated 3D model so that it is inscribed in a unit sphere,
wherein step (b) comprises:
(b-1) setting an l-hedron (l being a positive integer greater than or equal to 4) inscribed in the unit sphere, and setting each vertex of the l-hedron as an initial camera viewpoint for the normalized 3D model;
(b-2) calculating the importance of each of the initial camera viewpoints; and
(b-3) adaptively sampling camera viewpoints for the 3D model based on the calculated importance of the initial camera viewpoints, and
wherein step (b-2) comprises:
(b-2-1) calculating, for each face of the l-hedron, the importance of the face based on the importance of each camera viewpoint constituting the face;
(b-2-2) performing mesh subdivision on the l-hedron based on the calculated importance of each face until a preset condition is satisfied; and
(b-2-3) determining each vertex finally obtained by the mesh subdivision in step (b-2-2) as an adaptive camera viewpoint for acquiring a depth image of the 3D model.
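
By way of a non-limiting illustration, importance-driven mesh subdivision can be sketched as follows. The sketch starts from an octahedron (l = 8), takes the importance of a face to be the mean importance of its three vertices, and stops when a face falls below a threshold or a maximum depth is reached. The starting solid, the averaging rule, the stopping rule, the caller-supplied `importance` function, and all names are assumptions made for this sketch.

```python
import math
from typing import Callable, List, Set, Tuple

Vec = Tuple[float, float, float]
Face = Tuple[Vec, Vec, Vec]


def _on_sphere(p: Vec) -> Vec:
    """Project a point onto the unit sphere (rounded so shared vertices compare equal)."""
    n = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
    return (round(p[0] / n, 9), round(p[1] / n, 9), round(p[2] / n, 9))


def _midpoint(a: Vec, b: Vec) -> Vec:
    return _on_sphere(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2))


def octahedron() -> List[Face]:
    """Eight triangular faces of an octahedron inscribed in the unit sphere."""
    px, nx = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
    py, ny = (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)
    pz, nz = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
    return [(x, y, z) for x in (px, nx) for y in (py, ny) for z in (pz, nz)]


def adaptive_viewpoints(importance: Callable[[Vec], float], threshold: float,
                        max_depth: int) -> Set[Vec]:
    """Return camera viewpoints, denser where the face importance is high."""
    viewpoints: Set[Vec] = set()
    stack = [(face, 0) for face in octahedron()]
    while stack:
        (a, b, c), depth = stack.pop()
        viewpoints.update((a, b, c))
        face_importance = (importance(a) + importance(b) + importance(c)) / 3.0
        if depth < max_depth and face_importance > threshold:
            # Split the triangle into four, pushing the new vertices onto the sphere.
            ab, bc, ca = _midpoint(a, b), _midpoint(b, c), _midpoint(c, a)
            for child in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
                stack.append((child, depth + 1))
    return viewpoints
```

With an importance function that is high only near one pole, for example, only the faces touching that pole are refined, so more depth images are rendered from that side of the model.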

The method according to claim 10,
wherein the 3D model search database construction method further comprises:
(c) acquiring a depth image of the 3D model at each of the adaptively sampled camera viewpoints;
(d) generating a rotation-invariant descriptor for each of the acquired depth images; and
(e) expressing and storing the 3D model as a set of the generated rotation-invariant descriptors.

delete

A computer-readable recording medium for carrying out the method according to any one of claims 1, 5 to 8, 10 to 13, and 17 to 18.

A 3D model search server comprising:
a query image processing unit that receives a single depth image transmitted from a user terminal as a query image and generates a rotation-invariant descriptor from the received query image;
a search processing unit that calculates the similarity between the rotation-invariant descriptor of the query image and pre-stored rotation-invariant descriptors, selects at least one 3D model based on the calculated similarity, and transmits a search result including information on the selected 3D model to the user terminal; and
a 3D model search database in which each 3D model is structured and stored as one set of rotation-invariant descriptors for the depth images generated through an adaptive viewpoint sampling technique,
wherein, for the 3D model search database,
the importance of each face of an l-hedron (l being a positive integer greater than or equal to 4) corresponding to the 3D model is calculated based on the importance of each camera viewpoint constituting that face,
mesh subdivision is performed on the l-hedron based on the calculated importance of each face until a preset condition is satisfied, and
each vertex finally obtained by the mesh subdivision is determined as an adaptive camera viewpoint at which a depth image of the 3D model is acquired.

The 3D model search server according to claim 22,
wherein the query image processing unit comprises:
a filter module that filters 3D camera noise from the query image;
a perspective correction module that removes the perspective component of the query image; and
a rotation-invariant descriptor generation module that generates a rotation-invariant descriptor from the received query image.

delete

The 3D model search server according to claim 23,
wherein the pre-stored rotation-invariant descriptors are generated and stored using Zernike moments for each of the depth images of the 3D model, and
wherein the rotation-invariant descriptor generation module generates the rotation-invariant descriptor of the query image using Zernike moments for the query image.

The 3D model search server according to claim 23,
wherein the search processing unit calculates the similarity between the query image and the 3D model as the minimum value among the differences between the Zernike moments for each of the depth images of the 3D model and the Zernike moments for the query image.

The 3D model search server according to claim 27,
wherein the search processing unit calculates, independently and in parallel, the differences between the Zernike moments for each of the depth images of the 3D models and the Zernike moments for the query image.
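
By way of a non-limiting illustration, the independence of the per-view comparisons can be sketched with a thread pool. The description refers to parallelization on a GPU; the CPU thread pool used here is a stand-in chosen only to keep the sketch self-contained, and the Euclidean distance and function names are assumptions. In CPython, threads do not speed up this pure-Python arithmetic, so the sketch shows the structure of the computation rather than its performance.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def descriptor_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors (the metric is an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def parallel_distances(query: Sequence[float],
                       descriptors: List[Sequence[float]],
                       workers: int = 4) -> List[float]:
    """Compare the query against every stored descriptor as independent tasks.

    No task reads another task's result, so the work can be distributed
    freely; the returned list keeps the order of the input descriptors.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: descriptor_distance(query, d), descriptors))
```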

delete

The 3D model search server according to claim 22,
wherein the 3D model search server
further comprises a database generation unit that normalizes a 3D model, calculates the importance of initial camera viewpoints for the normalized 3D model, and adaptively samples camera viewpoints for the 3D model based on the calculated importance.

The 3D model search server according to claim 30,
wherein the database generation unit
comprises a normalization module that moves the 3D model so that the center of the 3D model coincides with the origin of the coordinate system, and scales the moved 3D model so that it is inscribed in a unit sphere.

The 3D model search server according to claim 31,
wherein the database generation unit
further comprises a camera viewpoint determination module that sets an l-hedron (l being a positive integer greater than or equal to 4) inscribed in the unit sphere, sets each vertex of the l-hedron as an initial camera viewpoint for the normalized 3D model, calculates the importance of each of the initial camera viewpoints, and adaptively samples camera viewpoints for the 3D model based on the calculated importance of the initial camera viewpoints.

The 3D model search server according to claim 32,
wherein the camera viewpoint determination module calculates the importance of each of the initial camera viewpoints based on at least one of area importance, curvature importance, and camera pose importance.

delete

A 3D model search server comprising:
a query image processing unit that receives a single depth image transmitted from a user terminal as a query image and generates a rotation-invariant descriptor from the received query image;
a search processing unit that calculates the similarity between the rotation-invariant descriptor of the query image and pre-stored rotation-invariant descriptors, selects at least one 3D model based on the calculated similarity, and transmits a search result including information on the selected 3D model to the user terminal;
a 3D model search database in which each 3D model is structured and stored as one set of rotation-invariant descriptors for the depth images generated through an adaptive viewpoint sampling technique; and
a database generation unit that normalizes a 3D model, calculates the importance of initial camera viewpoints for the normalized 3D model, and adaptively samples camera viewpoints for the 3D model based on the calculated importance,
wherein the database generation unit
comprises a camera viewpoint determination module that sets an l-hedron (l being a positive integer greater than or equal to 4) inscribed in a unit sphere, sets each vertex of the l-hedron as an initial camera viewpoint for the normalized 3D model, calculates the importance of each of the initial camera viewpoints, and adaptively samples camera viewpoints for the 3D model based on the calculated importance of the initial camera viewpoints, and
wherein the camera viewpoint determination module
calculates, for each face of the l-hedron, the importance of the face based on the importance of each camera viewpoint constituting the face,
performs mesh subdivision on the l-hedron based on the calculated importance of each face until a preset condition is satisfied, and
determines each vertex finally obtained by the mesh subdivision as an adaptive camera viewpoint for acquiring a depth image of the 3D model.

The 3D model search server according to claim 32,
wherein the database generation unit
further comprises a rotation-invariant descriptor generation module that acquires a depth image of the 3D model at each of the adaptively sampled camera viewpoints, generates a rotation-invariant descriptor for each of the acquired depth images, and constitutes the 3D model search database by expressing and storing the 3D model as a set of the generated rotation-invariant descriptors.

delete