KR20090065965A

KR20090065965A - 3D image model generation method and apparatus, image recognition method and apparatus using the same, and recording medium storing program for performing the method

Info

Publication number: KR20090065965A
Application number: KR1020070133530A
Authority: KR
Inventors: 조선영; 최현철; 오세영; 정석주; 김삼용; 오상훈
Original assignee: 주식회사 케이티; 포항공과대학교 산학협력단
Priority date: 2007-12-18
Filing date: 2007-12-18
Publication date: 2009-06-23
Also published as: KR100930994B1

Abstract

A method and an apparatus for generating a 3D image model, an image recognition method and apparatus using the same, and a recording medium storing a program for performing the method are provided. They improve the processing speed and accuracy of 3D image model generation by transforming images in 2D space. A 3D image feature selector (20) selects 3D image features of each 3D scan image. A 2D space projector (30) projects each 3D scan image onto a 2D space. An image modifier (40) determines one of the projected 2D images as a reference image and modifies the remaining 2D images according to the shape of the reference image. A 3D image recovering unit (50) extracts 2D image features from each modified image and recovers the 3D images by using the 2D image features. A model generator (60) generates a 3D image model by using the recovered 3D images.

Description

3D image model generation method and apparatus, image recognition method and apparatus using the same, and recording medium storing program for performing the method

The present invention relates to a method of generating a 3D image model used for 3D image reconstruction and to an image recognition method using the 3D image model. More particularly, it relates to a 3D image model generation method and a face recognition method and apparatus used in a biometric system that reconstructs an arbitrary face image and recognizes and verifies an individual's identity.

A 3D face model can reconstruct the 3D shape of a face from a 2D image of a specific person and enables face recognition that is robust to face pose, lighting, and similar factors. For these reasons it has been studied very actively over the last ten years, and 3D face model techniques are tending to replace conventional 2D face recognition and other image processing methods.

To generate a 3D face model, the dense correspondence problem must be solved: a dense set of facial feature points that correspond one-to-one must be obtained from a large number of 3D face scans.

There are two main existing approaches to this problem. One registers the feature points of a predefined, complex 3D face model shape to the 3D scan data to find the color information of each feature point. The other extracts the required feature points from the 3D scan data to produce the shape information and color information of the 3D face model. In both cases the feature points must be common feature points that correspond one-to-one across all scan data. Because their number typically reaches 10,000 to 20,000, extracting facial feature points efficiently is important.

One existing method for extracting many feature points uses optical flow to automatically extract 10,000 to 20,000 dense facial feature points between 3D face scans. However, because the optical flow method matches image blocks of a fixed size, it has several problems: matching accuracy is very low for 3D scan data in which face shape and color differ from scan to scan; the result is sensitive to the initial position at which matching starts; and when the block size is large, matching requires a large amount of computation.

In addition, conventional face recognition methods train the recognizer using the 3D facial feature points and color information of a reconstructed 3D face. The feature points and color information for a 2D image input for recognition must therefore be obtained by reconstructing a 3D face from that image, and the heavy computation of this 3D reconstruction makes face recognition very slow.

In the patent literature, Korean Patent No. 360487 discloses a method of mapping texture from a 2D face image onto a 3D face model. Because that method generates a 3D face image approximating the 2D face using a limited set of predefined control points, the improvement in recognition efficiency it can achieve is limited.

In view of the above limitations of the prior art, an object of the present invention is to provide a method and apparatus for generating a 3D image model that reduce the computation involved in generating the model and improve the accuracy of face recognition. This is achieved by performing transformations on images in 2D that bring a small number of 3D image feature points, selected from the 3D scan data, into correspondence with one another.

Another object of the present invention is to provide an image recognition method and apparatus that are robust to pose and lighting without reconstructing the 2D input image to be recognized into a 3D image. This is achieved by generating 3D images from 2D training images using the 3D image model, and using the 2D images obtained by varying pose and lighting as data for image recognition.

To achieve the above technical object, a method of generating a 3D image model according to the present invention includes: selecting 3D image feature points for each 3D scan image; projecting each 3D scan image onto a 2D space; determining one of the projected 2D images as a reference image and deforming the remaining 2D images to correspond to the shape of the reference image; extracting 2D image feature points from each deformed image and reconstructing 3D images using the extracted 2D image feature points; and generating a 3D image model using the reconstructed 3D images.

To achieve another technical object, an apparatus for generating a 3D image model according to the present invention includes: a 3D image feature point selector that selects 3D image feature points for each 3D scan image; a 2D space projector that projects each 3D scan image onto a 2D space; an image deformation unit that determines one of the projected 2D images as a reference image and deforms the remaining 2D images to correspond to the shape of the reference image; a 3D image reconstruction unit that extracts 2D image feature points from each deformed image and reconstructs 3D images using the extracted 2D image feature points; and a model generator that generates a 3D image model using the reconstructed 3D images.

To achieve another technical object, an image recognition method according to the present invention includes: selecting 3D image feature points for each 3D scan image and projecting each 3D scan image onto a 2D space; determining one of the projected 2D images as a reference image and deforming the remaining 2D images to correspond to the shape of the reference image; extracting 2D image feature points from each deformed image and reconstructing a 3D image from each using the extracted 2D image feature points; generating a 3D image model using the reconstructed 3D images; receiving 2D training images and, using the 3D image model, generating a 3D synthesized image predicted from each training image; extracting a feature vector for each training image from the generated 3D synthesized images; and receiving a 2D image of the input to be recognized, extracting a feature vector from it, and producing a recognition result for the input image using the similarity between the feature vector of the input image and the feature vectors of the training images.

To achieve another technical object, an image recognition apparatus according to the present invention includes: a 3D image feature point selector that selects 3D image feature points for each 3D scan image; a 2D space projector that projects each 3D scan image onto a 2D space; an image deformation unit that determines one of the projected 2D images as a reference image and deforms the remaining 2D images to correspond to the shape of the reference image; a 3D image reconstruction unit that extracts 2D image feature points from each deformed image and reconstructs a 3D image from each using the extracted 2D image feature points; a model generator that generates a 3D image model using the reconstructed 3D images; a 3D image generator that receives 2D training images and, using the 3D image model, generates a 3D synthesized image predicted from each training image; a feature vector extractor that extracts a feature vector for each training image from the generated 3D synthesized images; an image input unit that receives a 2D image of the input to be recognized; and an image determination unit that extracts a feature vector from the input image and produces a recognition result for the input image using the similarity between the feature vector of the input image and the feature vectors of the training images.
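The final matching step can be sketched as a nearest-neighbour search over feature vectors. The text only says that "similarity" between feature vectors is used, so the cosine measure, the function names `cosine_similarity` and `recognize`, and the gallery layout below are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize(query, gallery):
    """Return (label, score) of the training vector most similar to `query`.

    `gallery` maps an identity label to the feature vectors extracted from
    that person's training images (hypothetical layout, not from the patent).
    """
    best_label, best_score = None, -2.0  # cosine is always >= -1
    for label, vectors in gallery.items():
        for vec in vectors:
            score = cosine_similarity(query, vec)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


gallery = {"alice": [[1.0, 0.0], [0.9, 0.1]], "bob": [[0.0, 1.0]]}
label, score = recognize([0.8, 0.2], gallery)
```

Because each identity can hold several vectors, the pose and lighting variants generated from one training image could all be stored under the same label.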

According to the method and apparatus for generating a 3D image model of the present invention, the processing speed and accuracy of generating a 3D image model can be improved by performing transformations on images in 2D in order to bring the small number of feature points selected from the 3D scan data and the 3D image feature points into correspondence with one another.

Furthermore, according to the image recognition method of the present invention, a 3D image is generated from each 2D training image using the 3D image model, and pose and lighting are then varied to obtain a variety of 2D images that serve as data for image recognition. Image recognition that is robust to pose and lighting is therefore possible without reconstructing the 2D input image to be recognized into a 3D image. This also eliminates the processing time that reconstructing the 2D input image into a 3D image would require, and simplifies the system.

The method and apparatus for generating a 3D image model of the present invention, the image recognition method and apparatus using the same, and the recording medium on which a program for performing the methods is recorded are described in detail below with reference to the drawings.

FIG. 1 is a block diagram of an apparatus for generating a 3D image model according to an embodiment of the present invention. The 3D image model generation apparatus (1) shown in FIG. 1 includes a 3D scan image input unit (10), a 3D image feature point selector (20), a 2D space projector (30), an image deformation unit (40), a 3D image reconstruction unit (50), and a model generator (60).

The 3D scan image input unit (10) receives 3D scan image information and stores it. The 3D scan image information includes per-pixel position information and color information acquired from different people. For example, 3D face scan information consists of 50,000 to 100,000 3D positions and their associated color information. To generate a 3D face model covering everyone who belongs to a given group, 3D face scan information is needed for as many people's faces as possible.

The 3D image feature point selector (20) selects 3D image feature points for each 3D scan image according to predetermined criteria. Using known algorithms, it can extract just the small number of feature points that effectively characterize a face image. In particular, the 3D image feature point selector (20) can use OpenGL's picking algorithm to extract, from a 3D face scan image, a predetermined number of feature points associated with the positions of major facial features such as the eyes, eyebrows, nose, and mouth. Because only a few feature points that effectively represent the image need to be selected, the selection can also be implemented through manual input.

The 3D image feature point selector (20) may further include a mesh generator (not shown) that generates triangle meshes from the selected feature points. The mesh generator can determine the triangle meshes for the extracted feature points by Delaunay triangulation.

FIGS. 2A and 2B are reference diagrams showing an example of extracting a predetermined number of feature points from a 3D face scan image and generating a triangle mesh. The numbers in FIG. 2A are the indices of a total of 115 feature points associated with the main elements of the face: the eyes, eyebrows, nose, mouth, jaw line, forehead, and so on. FIG. 2B shows the triangle mesh obtained from the 115 feature points by Delaunay triangulation.

The 2D space projector (30) projects each 3D scan image onto a 2D space whose two axes are the rotation angle about a predetermined central axis and the height.

FIG. 3 is a reference diagram showing an example of a 3D face scan image projected onto the 2D cylinder space. In FIG. 3, a is an image showing the color information of each pixel projected onto the 2D cylinder space, and b is an image showing the rotation radius measured from the central axis of the face.
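A minimal sketch of such a cylindrical projection follows. It assumes the central axis is the vertical y axis, that z points from the face toward the viewer, and that angle and height are quantized with fixed steps; the function name `project_to_cylinder` and its parameters are hypothetical and not taken from the patent.

```python
import math


def project_to_cylinder(points, angle_step_deg=1.0, height_step=1.0):
    """Map scan points (x, y, z, color) onto a grid indexed by (angle, height).

    Returns two dicts keyed by the integer cell (u, v): a color map
    (compare image a of FIG. 3) and a radius map (compare image b).
    """
    color_map, radius_map = {}, {}
    for x, y, z, color in points:
        # Rotation about the y axis: negative to the left (x < 0),
        # positive to the right (x > 0). Assumed convention.
        theta = math.degrees(math.atan2(x, z))
        radius = math.hypot(x, z)
        u = round(theta / angle_step_deg)
        v = round(y / height_step)
        # If several scan points fall into one cell, keep the one farthest
        # from the axis (an arbitrary tie-break chosen for this sketch).
        if radius >= radius_map.get((u, v), 0.0):
            radius_map[(u, v)] = radius
            color_map[(u, v)] = color
    return color_map, radius_map


scan = [(0.0, 5.0, 10.0, "a"), (10.0, 0.0, 0.0, "b"), (-10.0, 0.0, 0.0, "c")]
colors, radii = project_to_cylinder(scan)
```

Storing the radius alongside the color is what later allows a cell of the 2D image to be mapped back to a 3D position.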

The image deformation unit (40) determines one of the projected 2D images as a reference image and deforms the remaining 2D images to correspond to the reference image. The image deformation unit (40) includes a reference image determiner (42), a mask region determiner (44), and an affine transformer (46).

The reference image determiner (42) determines one of the projected 2D images as the reference image. The mask region determiner (44) determines a mask region from the feature points located on the outer boundary among the feature points of the reference image. The affine transformer (46) performs an affine transform based either on the feature points that form the determined mask region or on all of the feature points. As a result of the affine transform, the feature point positions of every image other than the reference image are converted to correspond to those of the reference image.
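The text does not say how the affine transform is estimated from the feature points. One common choice, shown here purely as an assumption, is to solve for one affine map per mesh triangle from its three matched vertices. The names `affine_from_triangle` and `apply_affine` are hypothetical.

```python
def affine_from_triangle(src, dst):
    """Solve x' = a*x + b*y + c, y' = d*x + e*y + f from 3 point pairs.

    `src` and `dst` are sequences of three (x, y) tuples. Returns the
    2x3 matrix ((a, b, c), (d, e, f)) using Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for k in range(2):  # k = 0 solves the x' row, k = 1 the y' row
        t1, t2, t3 = dst[0][k], dst[1][k], dst[2][k]
        a = (t1 * (y2 - y3) - y1 * (t2 - t3) + (t2 * y3 - t3 * y2)) / det
        b = (x1 * (t2 - t3) - t1 * (x2 - x3) + (x2 * t3 - x3 * t2)) / det
        c = (x1 * (y2 * t3 - y3 * t2) - y1 * (x2 * t3 - x3 * t2)
             + t1 * (x2 * y3 - x3 * y2)) / det
        rows.append((a, b, c))
    return tuple(rows)


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    (a, b, c), (d, e, f) = matrix
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Feature points of a non-reference image and of the reference image.
other_tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
reference_tri = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
M = affine_from_triangle(other_tri, reference_tri)
```

With more than three matched points per region, a least-squares fit would be the natural replacement for the exact solve.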

FIG. 4 is a reference diagram showing an example of affine-transforming an image in the 2D cylinder space according to an embodiment of the present invention. In FIG. 4, image 102 is the reference image selected from the 2D images projected onto the 2D cylinder space, and image 104 is an image whose shape and color information differ from those of the reference image (102). Image 106 shows the mask region, whose boundary can be specified by connecting the outermost of the 115 feature points of the reference image. Image 108 is image 104 after deformation by an affine transformation relative to the reference image. For example, when image 104 is affine-transformed with respect to the mask region of image 102, the pixels of all face images can be deformed, through the reference face image and the affine transformation, so that they correspond one-to-one.

In this embodiment, the images output by the image deformation unit have been deformed by affine transformation so that their feature points correspond to those of the reference image. During the affine transformation, the image deformation unit can assign an index to each pixel. Because all pixels can be put in correspondence through these per-pixel indices, no later step for matching feature points, such as image registration, is needed.

The 3D image reconstruction unit (50) extracts dense 2D feature points from each deformed image and reconstructs a 3D image using the extracted dense feature points. The 3D image reconstruction unit (50) includes a dense feature point extractor (52), a mesh generator (54), an inverse affine transformer (56), and a back-projection unit (58).

The dense feature point extractor (52) extracts every coordinate, or pixel, contained in the mask region as a feature point of the 2D image.

To obtain a 3D face model, a dense set of facial feature points that correspond one-to-one across all scan data must be found. The dense feature point extractor (52) can extract very dense facial feature points by linear interpolation over the limited set of 115 feature points and the triangle mesh they form.
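Linear interpolation over a triangle is usually expressed with barycentric weights. The sketch below assumes that form: it enumerates the integer pixels inside one triangle and interpolates a per-vertex value (for example the radius) at each. The names `barycentric` and `dense_points` are hypothetical.

```python
import math


def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w_a, w_b, 1.0 - w_a - w_b


def dense_points(triangle, vertex_values, eps=1e-9):
    """Return {(x, y): value} for every integer pixel inside the triangle.

    `vertex_values` holds one scalar per vertex; the value at each pixel
    is the barycentric (linear) blend of the three vertex values.
    """
    a, b, c = triangle
    xs = [a[0], b[0], c[0]]
    ys = [a[1], b[1], c[1]]
    result = {}
    for x in range(math.floor(min(xs)), math.ceil(max(xs)) + 1):
        for y in range(math.floor(min(ys)), math.ceil(max(ys)) + 1):
            weights = barycentric((x, y), a, b, c)
            if all(w >= -eps for w in weights):  # inside or on an edge
                result[(x, y)] = sum(w * v for w, v in zip(weights, vertex_values))
    return result


tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
dense = dense_points(tri, (0.0, 4.0, 8.0))
```

Running this over every triangle of the sparse mesh turns a small number of feature points into a pixel-dense set with consistent indexing across images.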

The mesh generator (54) generates triangle meshes that connect the dense feature points extracted by the dense feature point extractor (52). In particular, the mesh generator (54) can generate the triangle mesh of dense feature points by Delaunay triangulation. Once the triangle mesh is determined, the 3D image model can be displayed on screen as surfaces defined by the mesh rather than as points.

In particular, a 3D face image is rendered without distortion only if the front of the face surface is drawn in its original color and the back is drawn in black. If the triangles of the mesh are defined with a consistent orientation (clockwise or counterclockwise), it can be determined whether each triangle shows its front or its back face, and distortion can be prevented. The triangle mesh can also be used during 3D image reconstruction to apply a weight to each pixel defined in the 3D image model.

For example, when the 2D face image is a side view rather than a frontal view, some pixels defined in the 3D face model may not be visible in the 2D face image. Assigning low weights to the invisible pixels and high weights to the visible ones makes 3D face reconstruction more accurate. The triangle mesh is used to compute these weights: a high weight can be assigned where the visible side is the front face of a triangle, and a low weight where it is the back face.
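The front/back test can be sketched with the signed area of a projected triangle. This assumes counter-clockwise vertex order marks a front face in a coordinate system with y pointing up; the weight values 1.0 and 0.1 and the function names are illustrative choices, not values from the patent.

```python
def signed_area(a, b, c):
    """Signed area of triangle (a, b, c); positive if counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def visibility_weight(projected_triangle, front_weight=1.0, back_weight=0.1):
    """High weight for a front-facing triangle, low weight otherwise.

    `projected_triangle` is the triangle's three vertices after projection
    into the 2D view being fitted. Degenerate (zero-area) triangles are
    treated as not visible.
    """
    if signed_area(*projected_triangle) > 0.0:
        return front_weight
    return back_weight


front = visibility_weight(((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))
back = visibility_weight(((0.0, 0.0), (0.0, 1.0), (1.0, 0.0)))
```

In image coordinates where y grows downward the sign of the area flips, so the comparison would need to be reversed.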

The inverse affine transformer (56) applies an inverse affine transform to the images for which the triangle mesh has been generated, converting them back into images in the 2D cylinder space. In particular, the inverse affine transformer (56) can perform the inverse affine transform according to Equation 1 below.

[Equation 1]

Here, m is the horizontal coordinate in cylinder space of the affine-transformed image, n is the vertical coordinate in cylinder space of the affine-transformed image, u is the horizontal coordinate in cylinder space of the image restored by the inverse affine transform, v is the vertical coordinate in cylinder space of that restored image, and M is the affine transformation matrix.
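A sketch of the inverse step, under the assumption that M is a 2x3 matrix mapping (u, v) to (m, n), so that the inverse transform recovers (u, v) from (m, n). The exact form of Equation 1 is not reproduced in this text, and the function names are hypothetical.

```python
def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix ((a, b, c), (d, e, f)) to an (x, y) point."""
    (a, b, c), (d, e, f) = matrix
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def invert_affine(matrix):
    """Return the 2x3 matrix of the inverse affine map."""
    (a, b, c), (d, e, f) = matrix
    det = a * e - b * d
    if abs(det) < 1e-12:
        raise ValueError("affine matrix is not invertible")
    ia, ib = e / det, -b / det
    ic, id_ = -d / det, a / det
    # Inverse translation is -A^{-1} t, where t = (c, f).
    return ((ia, ib, -(ia * c + ib * f)), (ic, id_, -(ic * c + id_ * f)))


M = ((2.0, 0.0, 2.0), (0.0, 3.0, 3.0))   # forward map: (u, v) -> (m, n)
m_n = apply_affine(M, (1.0, 1.0))        # coordinates in the deformed image
u_v = apply_affine(invert_affine(M), m_n)
```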

The back-projection unit (58) back-projects the image in the 2D cylinder space to restore the 3D image in the original 3D space. The restored 3D image has 3D dense feature point coordinates, corresponding to the 2D dense feature point coordinates, together with color information. The 3D image information can be obtained according to Equation 2 below from the rotation angle and radius carried by the image information in cylinder space.

[Equation 2]

Here, θ is the rotation angle in the cylinder coordinate system. Referring to FIG. 3, the rotation angle can be defined, for example, about the y axis, with negative values to the left and positive values to the right. T is the angle corresponding to one unit of length along the horizontal axis of the cylinder coordinate system; b, in the angle term, is the offset of the rotation angle obtained from the cylinder coordinates; R is the rotation radius obtained from the cylinder coordinates; r(u, v) is the radius value (702) at coordinates (u, v); x, y, and z are the 3D image coordinates; L is the length corresponding to one unit along the vertical axis of the cylinder coordinate system; and b, in the z term, is the offset of the z coordinate obtained from the cylinder coordinates.
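Equation 2 itself is not reproduced in this text, so the following is only a plausible sketch of a cylindrical back-projection built from the quantities just defined: an angle that is linear in u, a stored radius r(u, v), and a height that is linear in v. The axis labels (vertical axis called y here) and all parameter names are assumptions; the patent's own equation may assign the axes differently.

```python
import math


def back_project(u, v, radius, angle_per_unit_deg, angle_offset_deg,
                 height_per_unit, height_offset):
    """Map a cylinder-space cell (u, v) with stored radius to 3D (x, y, z).

    Assumed form: theta = T*u + b_angle, height = L*v + b_height, with the
    point placed at distance `radius` from the vertical axis.
    """
    theta = math.radians(angle_per_unit_deg * u + angle_offset_deg)
    x = radius * math.sin(theta)   # left of the axis is negative
    z = radius * math.cos(theta)   # toward the viewer
    y = height_per_unit * v + height_offset
    return x, y, z


right = back_project(90, 5, 10.0, 1.0, 0.0, 1.0, 0.0)
left = back_project(-90, 5, 10.0, 1.0, 0.0, 1.0, 0.0)
front = back_project(0, 0, 10.0, 1.0, 0.0, 2.0, -1.0)
```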

The model generator (60) generates a 3D image model using the reconstructed 3D images. In this embodiment the model generator builds the 3D model by principal component analysis (PCA) and includes a principal component analyzer (62), a 3D shape model generator (64), and a 3D texture model generator (66). The principal component analyzer (62) performs principal component analysis on the reconstructed 3D image information; for example, it can analyze the coordinates and the color information of the 3D facial feature points. The 3D shape model generator (64) uses the result of the analysis to generate a 3D shape model consisting of the mean and the variance vectors of the 3D coordinates. Likewise, the 3D texture model generator (66) uses the result to generate a 3D texture model consisting of the mean and the variance vectors of the color information at the 3D coordinates.
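As a rough illustration of the PCA step, the sketch below computes the mean vector and the first principal direction of a set of flattened shape (or texture) vectors by power iteration. This is a stand-in chosen to keep the example dependency-free, not the patent's procedure; a full model would extract several components, typically with an SVD routine. All names are hypothetical.

```python
import math


def mean_vector(samples):
    """Component-wise mean of equal-length sample vectors."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def first_component(samples, iterations=200):
    """Return (mean, unit vector along the direction of largest variance)."""
    mu = mean_vector(samples)
    centered = [[x - m for x, m in zip(s, mu)] for s in samples]
    dim = len(mu)
    v = [1.0 / (k + 1) for k in range(dim)]  # arbitrary non-zero start
    for _ in range(iterations):
        # w = X^T (X v), proportional to the covariance matrix times v.
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[k] for p, row in zip(proj, centered)) for k in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current direction
            break
        v = [x / norm for x in w]
    return mu, v


shapes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mu, direction = first_component(shapes)
```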

Although this embodiment has been described using PCA as an example, a 3D image model can also be generated using techniques such as FDA (Fisher Discriminant Analysis), ICA (Independent Component Analysis), or LDA (Linear Discriminant Analysis). PCA in particular has two advantages: it minimizes computation because it represents images in the fewest dimensions, and it can generate optimized basis vectors by analyzing the most important common characteristics of face images.

The 3D image model generated by the model generator can be expressed as a linear model, shown in Equation 3 below, built from a shape basis and a texture basis capable of representing a 3D image.

[Equation 3]

$$S = S_0 + \sum_{i=1}^{n} \alpha_i S_i, \qquad T = T_0 + \sum_{j=1}^{m} \beta_j T_j$$

Here S is the shape of an arbitrary 3D image, S_0 is the mean of the 3D image coordinates (the mean shape), S_i is the i-th variance vector of the 3D image coordinates, and α_i is the coefficient of that variance vector. T is the color information (texture) of an arbitrary 3D image, T_0 is the mean of the color information at the 3D image coordinates, T_j is the j-th variance vector of that color information, and β_j is the coefficient of that variance vector. n is the number of variance vectors of the 3D image coordinates, and m is the number of variance vectors of the color information. By adjusting the coefficients α_i and β_j in Equation 3, the 3D image model can generate an arbitrary 3D image corresponding to a 2D face image.
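
The linear model of Equation 3 can be sketched as follows: a shape (or texture) is the mean plus a weighted sum of variance vectors. The function name `synthesize` and the flat-list representation are assumptions made for illustration only.

```python
def synthesize(mean, variance_vectors, coefficients):
    """Evaluate mean + sum_i coefficients[i] * variance_vectors[i]."""
    if len(variance_vectors) != len(coefficients):
        raise ValueError("one coefficient is required per variance vector")
    result = list(mean)
    for coeff, vec in zip(coefficients, variance_vectors):
        for k, value in enumerate(vec):
            result[k] += coeff * value
    return result
```

The same function serves both halves of the model: called with the shape mean, shape variance vectors, and the α coefficients it yields S, and with the texture counterparts and the β coefficients it yields T.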

FIG. 5 is a flowchart illustrating a method of generating a 3D image model according to an embodiment of the present invention. The method shown in FIG. 5 includes the following steps, performed in time series by the 3D image model generation apparatus 1.

In step 210, the 3D image feature point selector 20 selects image feature points for each 3D scan image and generates triangular meshes from the selected feature points. One way to select feature points is image matching: for a face image, feature points that are easy to distinguish from other regions, such as the outer corners of the eyes, the eyebrows, the nostrils, and the lips, are located by searching the training face image for the position most similar to a template prepared in advance from the image around each feature point. Other options include detecting the feature points of a training face image with a detector trained by AdaBoost on the image around each feature point, and learning a probability model, computing the probability that each pixel is a feature point, and taking the probability-weighted mean position.
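
The template-based matching mentioned above can be illustrated with an exhaustive search. The sketch below scores each candidate position by the sum of squared differences; that similarity measure, the nested-list image format, and the name `best_match` are assumptions of this example, since the embodiment only requires finding the position most similar to the template.

```python
def best_match(image, template):
    """Return ((row, col), score) of the template position with the lowest
    sum of squared differences. Images are lists of rows of gray values."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```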

In step 220, the 2D space projector 30 projects each 3D scan image onto a 2D cylinder space. Preferably, the 2D space projector 30 projects the 3D scan image onto a 2D space whose two axes are the rotation angle about a predetermined central axis and the height.
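
A minimal sketch of this cylindrical projection is given below. It assumes that the central axis is the y axis and that the angle is measured from the z axis; the function name and these axis conventions are choices made for the example, not details stated in the embodiment. The radius is returned as well, so that it can be stored with each projected point.

```python
import math

def project_to_cylinder(point):
    """Map a 3D point (x, y, z) to (angle, height, radius) about the y axis."""
    x, y, z = point
    angle = math.atan2(x, z)    # rotation angle about the central axis
    radius = math.hypot(x, z)   # distance from the central axis
    return angle, y, radius
```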

In step 230, the image transformation unit 40 transforms the projected images into the shape of a reference image. In detail, this step includes: determining one of the projected 2D images as the reference image; selecting the outermost feature points among the feature points of the reference image and determining a mask area bounded by the line connecting them; and performing an affine transformation based on the selected feature points or the mask area so that the images other than the reference image are deformed to correspond to the reference image.
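
One way to realize such an affine transformation is to solve for it from three corresponding feature points, one triangle at a time. The sketch below does this with Cramer's rule; the function names and the tuple layout of the transform are assumptions made for illustration.

```python
def affine_from_triangles(src, dst):
    """Solve u = a*x + b*y + c, v = d*x + e*y + f from three point pairs.
    Returns ((a, b, c), (d, e, f))."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("source triangle is degenerate")
    rows = []
    for k in (0, 1):  # k = 0 solves the u row, k = 1 the v row
        t1, t2, t3 = dst[0][k], dst[1][k], dst[2][k]
        a = (t1 * (y2 - y3) - y1 * (t2 - t3) + (t2 * y3 - t3 * y2)) / det
        b = (x1 * (t2 - t3) - t1 * (x2 - x3) + (x2 * t3 - x3 * t2)) / det
        c = (x1 * (y2 * t3 - y3 * t2) - y1 * (x2 * t3 - x3 * t2)
             + t1 * (x2 * y3 - x3 * y2)) / det
        rows.append((a, b, c))
    return tuple(rows)

def apply_affine(rows, point):
    """Apply the transform returned by affine_from_triangles to (x, y)."""
    (a, b, c), (d, e, f) = rows
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

Here `src` would hold three feature points of a non-reference image and `dst` the matching feature points of the reference image.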

In step 240, the 3D image reconstructor 50 determines dense feature points and meshes of the 2D image from the images deformed in step 230. In this step, the 3D image reconstructor 50 extracts each coordinate or pixel inside the mask area as a feature point of the 2D image and generates triangular meshes connecting the extracted dense feature points.
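
Collecting every pixel inside the mask area can be sketched with a ray-casting point-in-polygon test, as below. The mask is assumed to be given as a polygon of outermost feature points, and pixel centers are tested; both the function names and these conventions are assumptions of this example. Triangulating the resulting points is not shown.

```python
def inside_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def dense_points(polygon, width, height):
    """List the (x, y) pixels whose centers fall inside the mask polygon."""
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if inside_polygon(x + 0.5, y + 0.5, polygon)
    ]
```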

In step 250, the 3D image reconstructor 50 reconstructs the 3D images through inverse affine transformation and back-projection. The 3D image reconstructor 50 converts the images on which the triangular meshes were generated back into images in the 2D cylinder space through inverse affine transformation, and then back-projects those images to reconstruct the 3D images in the original 3D space.
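
The two inverse operations of this step can be sketched as follows, assuming an affine transform stored as ((a, b, c), (d, e, f)) and a cylinder whose central axis is the y axis with the angle measured from the z axis. These conventions and the function names are assumptions made for illustration.

```python
import math

def invert_affine(rows):
    """Invert u = a*x + b*y + c, v = d*x + e*y + f."""
    (a, b, c), (d, e, f) = rows
    det = a * e - b * d
    if det == 0:
        raise ValueError("affine transform is not invertible")
    ia, ib = e / det, -b / det
    id_, ie = -d / det, a / det
    # The inverse translation is -(A^-1) applied to the original translation.
    return (ia, ib, -(ia * c + ib * f)), (id_, ie, -(id_ * c + ie * f))

def apply_affine(rows, point):
    """Apply an affine transform in the layout above to a point (x, y)."""
    (a, b, c), (d, e, f) = rows
    x, y = point
    return a * x + b * y + c, d * x + e * y + f

def back_project(angle, height, radius):
    """Map cylinder coordinates back to a 3D point (x, y, z)."""
    return radius * math.sin(angle), height, radius * math.cos(angle)
```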

In step 260, the 3D image model generator 60 generates a 3D image model from the reconstructed 3D images. For face images, the 3D image model generator 60 can select basis vectors effective for face recognition through principal component analysis of the coordinates and color information of the 3D facial feature points. Based on the analysis result, the 3D image model generator 60 can generate a 3D shape model consisting of the mean and variance vectors of the 3D coordinates, and a 3D texture model consisting of the mean and variance vectors of the color information carried by the 3D coordinates. An arbitrary 3D face image can then be generated by adjusting the coefficients of the 3D coordinate variance vectors and of the color information variance vectors.

Compare this with the conventional optical flow method, which builds a 3D face model by face image matching. In regions of the face image where pixel differences are very small, such as the cheeks and forehead, adjacent pixels are hard to distinguish, so the optical flow result can be inaccurate. Obtaining an optical flow result also requires iterative block image matching, and the amount of computation grows rapidly as the block size increases. In this embodiment, by contrast, a predetermined number of feature points is used and only one affine transformation is performed per triangular mesh formed by those feature points, so the amount of computation is small. In addition, because pixel values inside each triangular mesh are computed by linear interpolation, corresponding points and their color and radius values can be obtained accurately even in regions with uniform pixel values, such as the cheeks or forehead.
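
Linear interpolation inside a triangle is commonly done with barycentric weights. The following sketch interpolates a scalar value (for example one color channel or the radius) at a point from the values at the three vertices; the function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Weights (wa, wb, wc) such that p = wa*a + wb*b + wc*c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("triangle is degenerate")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def interpolate(p, triangle, values):
    """Linearly interpolate the vertex values at point p."""
    wa, wb, wc = barycentric_weights(p, *triangle)
    return wa * values[0] + wb * values[1] + wc * values[2]
```

Because the weights depend only on geometry, the result is well defined even where neighboring pixels have nearly identical values.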

FIG. 6 is a block diagram illustrating an image recognition apparatus according to an embodiment of the present invention. The image recognition apparatus 300 shown in FIG. 6 includes a 3D scan image input unit 302, a 3D image feature point selector 304, a 2D space projector 306, an image transformation unit 308, a 3D image reconstructor 310, a model generator 312, a 2D training image input unit 314, a 3D image generator 316, a first feature vector extractor 318, an image input unit 320, and an image determiner 322. Of these, the 3D scan image input unit 302 through the model generator 312 correspond to the components shown in FIG. 1, so descriptions common to both are omitted below.

First, the image recognition apparatus 300 may further include an identification information acquisition unit (not shown) for the main elements. For example, the identification information acquisition unit may store, for each main element of the face (eyes, eyebrows, nose, mouth, etc.), the indices of the feature points belonging to that element among the coordinates of the image in the 2D cylinder space generated by the 2D space projector 306. It may then store, as the identification information of each main element, the indices of the dense feature points of the 3D face model that lie inside the boundary region connecting the outermost feature points of that element.

The 2D training image input unit 314 receives 2D training images to be used in the preliminary learning for image recognition. For example, building a face recognition system for a group requires 2D face images, that is, training images, of every member of the group. Building the system also requires extracting feature vectors through preliminary learning on the training images.

The 3D image generator 316 uses the 3D image model to generate a 3D composite image predicted from each 2D training image.

The first feature vector extractor 318 projects the 3D composite images onto the 2D cylinder space and extracts a feature vector from each projected 2D image. The first feature vector extractor 318 includes an image adjuster (not shown) that applies various poses and illumination conditions to each training image, a projection unit (not shown) that projects the resulting images onto the 2D cylinder space, and a vector generator (not shown) that generates learning vectors for image recognition from the projected images.

FIG. 7 is a reference diagram showing an example of applying various rotations and translations to a 2D training face image and extracting the main elements. Taking a face image as an example, the pose and illumination adjuster applies various pose and illumination conditions to the 3D face image generated by the 3D image generator 316 to obtain a variety of 3D face images. The projection unit projects the obtained images onto the 2D cylinder space. The vector generator extracts sub-images of the main elements of the 2D face image (eyes with eyebrows, nose, mouth) as learning images and extracts feature vectors from them. Because the sub-images of the different main elements differ in size, they need to be normalized by enlarging or reducing them before the learning vectors are generated.
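
The size normalization of sub-images can be sketched with nearest-neighbor resampling, as below. The embodiment does not specify the resampling method, so nearest-neighbor, the nested-list image format, and the name `resize_nearest` are assumptions chosen for brevity.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a list-of-rows image to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```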

In this embodiment, the vector generator (not shown) preferably uses the identification information of the main elements of the face to find the maximum and minimum coordinates of the facial feature points belonging to each element, extracts the corresponding 2D sub-image, and generates from it a learning vector for face recognition as the first feature vector. The first feature vector is, for example, a vector in which the intensity of the main-element image and the edge components of that intensity are rearranged into a one-dimensional vector. The feature vector generated in this way can be used as an input vector for training the neural network learner 326 described below.
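
The sub-image extraction and vector rearrangement described above might look like the following sketch. The bounding box comes from the minimum and maximum feature point coordinates; the edge component is approximated here by the absolute horizontal intensity difference, which is an assumption of this example because the embodiment does not name an edge operator. All function names are hypothetical.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) of a list of (x, y) feature points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def crop(image, box):
    """Cut the inclusive bounding box out of a list-of-rows image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]

def feature_vector(sub_image):
    """Concatenate intensities and horizontal edge magnitudes into one 1D list."""
    intensity = [v for row in sub_image for v in row]
    edges = [
        abs(row[k + 1] - row[k])
        for row in sub_image
        for k in range(len(row) - 1)
    ]
    return intensity + edges
```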

The image input unit 320 receives a 2D image of the input image to be recognized. The image determiner 322 includes a second feature vector extractor 324 and a neural network learner 326.

The second feature vector extractor 324 extracts a second feature vector from the input image. It further includes a face region extractor (not shown), a main element detector (not shown), and a learning vector generator (not shown). The input image contains both the face region to be detected and a background region; the face region extractor detects the face region using skin color or the AdaBoost method. The main element detector finds the approximate positions of the main elements in the detected face region from an intensity histogram and then detects the main elements using a probability model (for example, a Gaussian model) trained on images of the main elements. The learning vector generator generates a learning vector in which the intensity components and edge components of the detected main elements are rearranged into a one-dimensional vector. In this document, this learning vector is called the second feature vector.

The neural network learner 326 receives the second feature vector and produces a recognition result for the input image by computing the similarity between the second feature vector and the first feature vectors previously extracted and stored for the training images. If the neural network learner 326 finds a similar first feature vector, it outputs the ID of the person associated with that vector. If no similar first feature vector is found, it outputs an authentication-denied message as the result.

FIG. 8 is a flowchart illustrating an image recognition method according to an embodiment of the present invention. The image recognition method shown in FIG. 8 includes the following steps, performed in time series by the image recognition apparatus 300.

In step 410, the 3D image generator 316 generates a 3D image predicted from each 2D training image.

FIG. 9 is a detailed flowchart of step 410 of FIG. 8. The 3D image generator 316 receives a 2D training image (step 412) and initializes the 3D image model (step 414). It then determines the parameters of the 3D image model, namely the coefficients of the 3D coordinate variance vectors and of the color information variance vectors, so that the difference between the 2D training image and the 3D image reconstructed by the model is minimized (step 416), and generates the 3D image corresponding to the determined coefficients (step 418). Here, initializing the image model means translating or rotating the 3D image model so that it roughly matches the size and position of the 2D image.
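
The coefficient determination of step 416 can be illustrated in a strongly simplified setting. The sketch below assumes that the target observation has already been brought into the same vector space as the model and that the variance vectors are orthonormal (as PCA vectors are), in which case the least-squares coefficients are simple projections. The actual step compares a 2D image with a rendered 3D model, which this example does not attempt; the function name is hypothetical.

```python
def fit_coefficients(target, mean, variance_vectors):
    """Least-squares coefficients for target ~ mean + sum_i c_i * vector_i,
    valid when the variance vectors are orthonormal (an assumption here)."""
    residual = [t - m for t, m in zip(target, mean)]
    return [sum(r * v for r, v in zip(residual, vec)) for vec in variance_vectors]
```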

In step 420, the first feature vector extractor 318 projects the 3D images generated in step 410 onto the 2D space. If, before this step, various poses and illumination conditions are applied to each 3D image to produce several different 3D images and the first feature vectors are extracted from them, the resulting feature vectors are robust to changes in pose and illumination.

In step 430, the first feature vector extractor 318 extracts sub-images using the identification information of the main elements. In step 440, the first feature vector extractor 318 extracts a feature vector from each sub-image extracted in step 430. In step 450, the second feature vector extractor 324 extracts sub-images from the input image to be recognized.

In step 460, the second feature vector extractor 324 extracts, from each extracted sub-image, a second feature vector corresponding to the first feature vector. Preferably, the method further includes detecting the face region and the main elements of the face in the input image before the second feature vector is extracted. The face region can be detected with an existing face detection method, such as skin color or the AdaBoost method. The main elements of the face are detected in the detected face region using an intensity histogram and a probability model trained in advance on the main elements of the training face images. The second feature vector of this step is preferably a vector in which the intensity components and edge components of the detected main elements are rearranged into a one-dimensional vector.

Conventionally, reconstructing the 2D image to be recognized into a 3D image lengthens the processing time. This embodiment has no such 3D reconstruction step at recognition time, so the recognition speed can be improved. In addition, because the present invention uses, as learning data, 2D images obtained by varying the pose and illumination of the face while computing feature vectors from the training images, the robustness of face recognition is maintained.

In step 470, the neural network learner 326 computes the similarity between the feature vectors. The method of computing the similarity is not particularly limited; for example, a similar feature vector can be found using the cosine distance, the Euclidean distance, or the Mahalanobis distance.

In step 480, the neural network learner 326 produces the recognition result. If the cosine distance between the feature vectors of the two face images compared in step 470 is smaller than a predetermined reference value, the faces are judged to be the same; if it is larger, they are judged to be different.
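
The distance computation and threshold decision of steps 470 and 480 can be sketched as follows. The gallery layout (a dictionary from person ID to stored first feature vector), the function names, and the use of `None` to stand for a rejected authentication are assumptions made for this example.

```python
import math

def cosine_distance(u, v):
    """1 minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def recognize(query, gallery, threshold):
    """Return the ID whose stored vector is closest to the query,
    or None if even the closest one is not below the threshold."""
    best_id, best_dist = None, None
    for person_id, stored in gallery.items():
        dist = cosine_distance(query, stored)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = person_id, dist
    if best_dist is not None and best_dist < threshold:
        return best_id
    return None
```

For example, with a gallery `{"alice": [1.0, 0.0], "bob": [0.0, 1.0]}` and a threshold of 0.1, the query `[0.9, 0.1]` is matched to "alice", while `[1.0, 1.0]` is rejected.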

The 3D image model generation method and the image recognition method of the present invention can also be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system.

Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as implementations in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers skilled in the art to which the present invention pertains.

The present invention has been described above with reference to preferred embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that it can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in a descriptive rather than a limiting sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

Because the method and apparatus for generating a 3D image model of the present invention improve the processing speed and accuracy of generating a 3D image model, they are suitable for use in 3D image synthesis apparatuses and 3D image recognition apparatuses. In addition, the image recognition apparatus of the present invention can perform image recognition that is robust to pose and illumination without reconstructing the 2D input image to be recognized into a 3D image, and it avoids the processing time such reconstruction would require, which improves the efficiency of the system.

FIG. 1 is a block diagram illustrating an apparatus for generating a 3D image model according to an embodiment of the present invention.

FIGS. 2A and 2B are reference diagrams showing an example of extracting feature points from a 3D face scan image and generating a triangular mesh.

FIG. 3 is a reference diagram showing an example of projecting a 3D face scan image onto a 2D cylinder space.

FIG. 4 shows an example of applying an affine transformation to an image in the 2D cylinder space according to the present invention.

FIG. 5 is a flowchart illustrating a method of generating a 3D image model according to an embodiment of the present invention.

FIG. 6 is a block diagram illustrating an image recognition apparatus according to an embodiment of the present invention.

FIG. 7 is a reference diagram showing an example of applying various rotations and translations to a 2D training face image and extracting the main elements.

FIG. 8 is a flowchart illustrating an image recognition method according to an embodiment of the present invention.

FIG. 9 is a detailed flowchart of step 410 of FIG. 8.

Claims

a) selecting 3D image feature points for each of a plurality of 3D scan images;

b) projecting each of the 3D scan images onto a 2D space;

c) determining one of the projected 2D images as a reference image, and deforming the 2D images other than the reference image to correspond to the shape of the reference image;

d) extracting 2D image feature points from each of the deformed images, and reconstructing 3D images using the extracted 2D image feature points; and

e) generating a 3D image model using the reconstructed 3D images.

The method of claim 1, wherein step c) comprises:

c1) determining one of the projected 2D images as the reference image; and

c2) performing an affine transformation so that the positions of the feature points of the images other than the reference image correspond to the positions of the feature points of the reference image.

The method of claim 2, wherein step c) further comprises

determining a mask area from the outermost feature points among the feature points of the reference image determined in step c1), and wherein the affine transformation of step c2) is performed within the mask area.

The method of claim 3, wherein step d) comprises:

d1) extracting the points at the respective coordinates included in the mask area as 2D image feature points;

d2) generating triangular meshes from the extracted 2D image feature points;

d3) generating images projected on the 2D cylinder space by performing an inverse affine transformation on the triangular meshes generated in step d2); and

d4) reconstructing 3D images by back-projecting each of the images on the 2D cylinder space generated in step d3).

The method of claim 1, wherein step b) comprises:

b1) generating triangular meshes from the selected feature points; and

b2) projecting each of the 3D scan images on which the triangular meshes were generated onto a 2D cylinder space.

The method of claim 1,

wherein the 3D image model is generated by applying principal component analysis (PCA) to the reconstructed 3D images.

The method of claim 1,

wherein the 3D image model includes a 3D shape model and a 3D texture model,

the 3D shape model is a model based on the mean and variance vectors of the 3D coordinates, generated through principal component analysis of the coordinate information of the 3D feature points of each of the reconstructed 3D images, and

the 3D texture model is a model based on the mean and variance vectors of the 3D color information, generated through principal component analysis of the color information of the 3D feature points of each of the reconstructed 3D images.

The method of claim 1, wherein step e) comprises:

e1) obtaining 3D coordinate information and color information of the feature points of each of the reconstructed 3D images; and

e2) generating a 3D image model for synthesizing an arbitrary 3D image using the obtained 3D coordinate information and color information.

A computer-readable recording medium having recorded thereon a program for executing the method of generating a 3D image model according to any one of claims 1 to 8.

A 3D image feature point selector for selecting 3D image feature points according to each of the 3D scan images;

A two-dimensional space projector configured to project each of the three-dimensional scanned images into a two-dimensional space;

An image transformation unit configured to determine one image from the projected two-dimensional images as a reference image, and to modify the remaining two-dimensional images other than the reference image to correspond to the shape of the reference image;

A 3D image reconstructing unit which extracts 2D image feature points from each of the modified images, and reconstructs 3D images using the extracted 2D image feature points; and

a model generator for generating a 3D image model using the reconstructed 3D images.

The apparatus of claim 10, wherein the image transformation unit comprises:

a reference image determiner which determines one image from the projected two-dimensional images as a reference image;

a mask area determiner configured to determine a mask area according to the outermost feature points among the feature points of the reference image; and

an affine transform unit that performs an affine transformation so that the positions of the feature points of the remaining images, other than the reference image, correspond to the positions of the feature points of the reference image.
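As an illustration of aligning feature points with an affine transformation, the following sketch estimates a single 2D affine transform by least squares. This is a simplification under stated assumptions: the claim does not say whether one global transform or one transform per mesh triangle is used, and the names `estimate_affine` and `apply_affine` are hypothetical.

```python
import numpy as np

def estimate_affine(src_points, dst_points):
    """Least-squares 2D affine transform mapping src_points onto dst_points.

    Returns a 2x3 matrix A such that dst is approximately A @ [x, y, 1].
    """
    src = np.asarray(src_points, dtype=float)
    dst = np.asarray(dst_points, dtype=float)
    # Append a column of ones so the translation is estimated together with the linear part.
    homogeneous = np.hstack([src, np.ones((len(src), 1))])
    solution, *_ = np.linalg.lstsq(homogeneous, dst, rcond=None)
    return solution.T

def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an array of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    return pts @ matrix[:, :2].T + matrix[:, 2]

# Hypothetical feature points of a non-reference image and of the reference image.
image_points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
reference_points = image_points * 2.0 + np.array([3.0, -1.0])
affine = estimate_affine(image_points, reference_points)
aligned = apply_affine(affine, image_points)
```

At least three non-collinear point pairs are needed for the estimate to be well determined.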

The apparatus of claim 10,

wherein the 3D image feature point selector further includes a mesh generator configured to generate triangular meshes from the selected feature points, and

the two-dimensional space projector includes a projection unit configured to project each of the 3D scan images, on which the triangular meshes have been generated, onto a two-dimensional cylindrical space.
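Projection onto a two-dimensional cylindrical space can be sketched as follows. The sketch assumes, as an illustration only, that the cylinder axis is the vertical y axis and that the scanned surface faces the +z direction; the claim does not fix these conventions, and `project_to_cylinder` is a hypothetical name.

```python
import math

def project_to_cylinder(point, axis_origin=(0.0, 0.0, 0.0)):
    """Map a 3D point to (theta, height) on a cylinder around the vertical (y) axis.

    theta is the angle around the axis in radians, measured from the +z direction;
    height is the offset along the axis.
    """
    x = point[0] - axis_origin[0]
    y = point[1] - axis_origin[1]
    z = point[2] - axis_origin[2]
    theta = math.atan2(x, z)  # 0 for a point straight ahead on +z
    return theta, y

# A mesh vertex straight ahead and one at the side (hypothetical coordinates).
front = project_to_cylinder((0.0, 1.0, 2.0))
side = project_to_cylinder((1.0, 0.5, 0.0))
```

Applying the function to every mesh vertex yields 2D coordinates in which the unrolled surface can be handled as an ordinary image.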

a) selecting three-dimensional image feature points according to each of the three-dimensional scanned images, and projecting each of the three-dimensional scanned images in a two-dimensional space;

b) determining one image from the projected two-dimensional images as a reference image, and deforming the other two-dimensional images except for the reference image to correspond to the shape of the reference image;

c) extracting two-dimensional image feature points according to each of the deformed images, and reconstructing a three-dimensional image by using the extracted two-dimensional image feature points, respectively;

d) generating a 3D image model using the reconstructed 3D images;

e) receiving 2D training images and generating 3D composite images predicted from the training images using the 3D image model;

f) extracting a feature vector for each training image from the generated three-dimensional composite images; and

g) receiving a 2D input image to be recognized, extracting a feature vector from the input 2D image, and calculating a recognition result for the input image using the similarity between the feature vector of the input image and the feature vectors of the training images.
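Step g) compares feature vectors by similarity. The sketch below uses cosine similarity with a nearest-neighbor decision, which is an assumption made for illustration; the claim does not name a similarity measure. The names `cosine_similarity`, `recognize`, and the labels in `gallery` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recognize(input_vector, training_vectors):
    """Return the label whose training vector is most similar to the input.

    training_vectors: dict mapping a label to a list of feature vectors.
    """
    best_label, best_score = None, -math.inf
    for label, vectors in training_vectors.items():
        for vector in vectors:
            score = cosine_similarity(input_vector, vector)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

# Hypothetical gallery built from training images.
gallery = {
    "person_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "person_b": [[0.0, 1.0, 0.0]],
}
label, score = recognize([0.8, 0.2, 0.0], gallery)
```

A practical system would typically also apply a threshold to the best score so that unknown inputs can be rejected.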

The method of claim 13,

further comprising obtaining identification information about the main elements constituting the image using the 2D images projected in step a),

wherein step f) comprises:

f1) projecting each of the three-dimensional composite images generated in step e) onto the two-dimensional space while adjusting the pose or the illumination, and extracting sub-images related to the main elements from the images projected onto the two-dimensional space using the identification information of the main elements; and

f2) normalizing the extracted sub-images, and extracting feature vectors from the normalized sub-images.

The method of claim 14, wherein obtaining the identification information comprises:

dividing each of the projected two-dimensional images according to the main elements, and obtaining, as the identification information, feature point order values defined by the pixels within a boundary region formed by the outermost feature points among the feature points included in each of the divided main elements.

The method of claim 14,

wherein the feature vector includes an intensity component and an edge component of the normalized sub-images.
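A feature vector that combines an intensity component and an edge component can be sketched as below. The edge component is computed here as a gradient magnitude from differences between neighboring pixels, which is an assumption; the claim does not specify the edge operator. The function name `intensity_and_edge_features` is hypothetical.

```python
import math

def intensity_and_edge_features(image):
    """Concatenate pixel intensities with per-pixel gradient magnitudes.

    image: list of rows of grayscale values (a normalized sub-image).
    """
    height, width = len(image), len(image[0])
    intensity = [float(value) for row in image for value in row]
    edges = []
    for r in range(height):
        for c in range(width):
            # Differences between the neighbors on either side, clamped at the border.
            gx = image[r][min(c + 1, width - 1)] - image[r][max(c - 1, 0)]
            gy = image[min(r + 1, height - 1)][c] - image[max(r - 1, 0)][c]
            edges.append(math.hypot(gx, gy))
    return intensity + edges

# Hypothetical 3x3 sub-image with a vertical edge on the right.
sub_image = [
    [0, 0, 10],
    [0, 0, 10],
    [0, 0, 10],
]
features = intensity_and_edge_features(sub_image)
```

The resulting vector has twice as many entries as the sub-image has pixels: intensities first, edge magnitudes second.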

The method of claim 14,

further comprising generating modified 3D images by applying a predetermined rotational movement and positional movement to the 3D composite images generated in step e),

wherein, in step f1), each of the modified 3D images is projected onto the 2D space, and sub-images related to the main elements are extracted from the images projected onto the 2D space using the identification information of the main elements,

the feature vector extracted in step f2) is a learning vector for neural network learning, and

calculating the recognition result for the input image in step g) uses the similarity between the learning vectors for the neural network learning and the feature vector of the input image.

The method of claim 13, wherein step e) comprises:

e1) receiving two-dimensional training images;

e2) initializing the 3D image model for each of the training images;

e3) determining, for each training image, the variables of the 3D image model that match the training image; and

e4) generating, for each training image, a three-dimensional synthesized image predicted according to the determined variables.
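Steps e2) to e4) can be illustrated for the shape part of a linear model. The sketch below is a simplification under several assumptions that are not stated in the claim: the model is a mean plus a linear combination of components, the training image supplies 2D feature-point positions, the projection is orthographic (z is dropped), and the variables are found in one least-squares solve instead of an iterative search. The name `fit_shape_coefficients` is hypothetical.

```python
import numpy as np

def fit_shape_coefficients(mean, components, observed_xy):
    """Fit model coefficients to observed 2D feature points and rebuild the 3D shape.

    mean: (3 * P,) flattened mean shape; components: (K, 3 * P);
    observed_xy: (P, 2) feature-point positions in the training image.
    """
    mean = np.asarray(mean, dtype=float)
    components = np.asarray(components, dtype=float)
    observed = np.asarray(observed_xy, dtype=float)
    num_points = len(observed)
    # Keep only the x and y entries of the model (orthographic projection).
    mean_xy = mean.reshape(num_points, 3)[:, :2].ravel()
    comps_xy = components.reshape(len(components), num_points, 3)[:, :, :2]
    comps_xy = comps_xy.reshape(len(components), -1)
    residual = observed.ravel() - mean_xy
    coefficients, *_ = np.linalg.lstsq(comps_xy.T, residual, rcond=None)
    # Predicted 3D shape, including the depth that the 2D image does not show.
    shape_3d = (mean + coefficients @ components).reshape(num_points, 3)
    return coefficients, shape_3d

# Hypothetical one-component model with two feature points.
model_mean = np.zeros(6)
model_components = np.array([[1.0, 0.0, 0.5, 1.0, 0.0, 0.5]])
coefficients, shape_3d = fit_shape_coefficients(
    model_mean, model_components, [[2.0, 0.0], [2.0, 0.0]]
)
```

The depth values in `shape_3d` come entirely from the model, which is what allows a 3D composite image to be predicted from a 2D training image.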

A computer-readable recording medium having recorded thereon a program for performing the image recognition method of claim 13 on a computer.

An image recognition apparatus comprising:

an image transformation unit which determines one image from the projected two-dimensional images as a reference image, and deforms the remaining two-dimensional images other than the reference image to correspond to the shape of the reference image;

a 3D image reconstructing unit which extracts 2D image feature points from each of the deformed images, and reconstructs a 3D image from each set of extracted 2D image feature points;

a model generator for generating a 3D image model using the reconstructed 3D images;

a 3D image generator which receives 2D training images and generates 3D composite images predicted from the training images using the 3D image model;

a feature vector extracting unit which extracts a feature vector for each training image from the generated 3D composite images;

an image input unit configured to receive a 2D input image to be recognized; and

an image determining unit configured to extract a feature vector from the input image and calculate a recognition result for the input image using the similarity between the feature vector of the input image and the feature vectors of the training images.

The apparatus of claim 20,

further comprising an identification information acquisition unit configured to obtain identification information about the main elements constituting the image using the projected two-dimensional images,

wherein the feature vector extracting unit includes:

an image controller configured to adjust the pose and illumination for each training image;

a projection unit configured to project the image onto a two-dimensional cylindrical space; and

a vector generator configured to generate a feature vector for image recognition from the projected image.