KR20040079637A

KR20040079637A - Method and apparatus for face recognition using 3D face descriptor

Info

Publication number: KR20040079637A
Application number: KR1020030014615A
Authority: KR
Inventors: 이원숙; 기석철
Original assignee: 삼성전자주식회사
Priority date: 2003-03-08
Filing date: 2003-03-08
Publication date: 2004-09-16

Abstract

PURPOSE: A face recognition method and apparatus using a three-dimensional (3D) face descriptor are provided. The 3D face descriptor is generated from images having multiple poses, a video stream containing face rotation, or a 3D face mesh model, so that a face photograph, a video including a face, or a 3D face mesh model can be recognized. CONSTITUTION: A basis generator (11) registers and trains on one of images having multiple poses, video streams containing face rotation, and 3D face mesh models to generate a full feature point space basis including pose values, and stores it in a database. A full face descriptor generator (13) generates a 3D full face descriptor from one of the images, the video stream, and the 3D face mesh model together with the pose values provided by the basis generator, and stores it in the database. A partial face descriptor generator (15) generates a 3D partial face descriptor from one of the images, the video stream, and the 3D face mesh model together with the pose values, and generates a 3D partial face descriptor database from the pose value and the 3D full face descriptors. A search unit (17) searches the 3D partial face descriptor database, measures similarity, and outputs a search result.

Description

{Method and apparatus for face recognition using 3D face descriptor}

The present invention relates to face recognition, and more particularly, to a face recognition method and apparatus using a three-dimensional (3D) face descriptor.

Principal component analysis (PCA) and independent component analysis (ICA) have conventionally been the main methods used for face recognition. More recently, a face recognition method using a face descriptor was disclosed in M. Abdel-Mottaleb, J. H. Connell, R. M. Bolle, and R. Chellappa, "Face Descriptor syntax," Merging proposals p181, p551, and p650, ISO/MPEG m5207, Melbourne, 1999.

However, because that face descriptor is generated in two dimensions and therefore carries limited information, its recognition rate is limited when it is applied to images with various pose changes. In addition, conventional techniques related to 3D face information must use a special device to obtain the 3D information of a face, require multiple cameras, or go through a complicated process of finding feature points in 2D images, building 3D information from them, and then generating a 3D mask.

Accordingly, a technical object of the present invention is to provide a face recognition method and apparatus using a 3D face descriptor that is generated from one of images having multiple poses, a video stream containing face rotation, and a 3D face mesh model, so that a face can be recognized whether the input is a face photograph, a video including a face, or a 3D face mesh model.

Another technical object of the present invention is to provide a method and apparatus for generating a two-dimensional (2D) face descriptor that is robust to facial expression, pose, and illumination and is also small in size.

To achieve the above technical object, a face recognition apparatus using a 3D face descriptor according to the present invention includes: a basis generator that registers and trains on one of images having multiple poses, videos containing face rotation, and 3D face mesh models to generate a full feature point space basis including pose values, and stores the basis in a database; a full face descriptor generator that generates, for each user, a 3D full face descriptor using one of N-pose images, a video stream containing face rotation, and a 3D face mesh model together with the full feature point space basis including pose values provided by the basis generator, and stores the descriptor in a database; a partial face descriptor generator that generates a 3D partial face descriptor using one of an image, a video stream, and a 3D face mesh model of an arbitrary user to be recognized together with the full feature point space basis including pose values provided by the basis generator, and that generates a 3D partial face descriptor database using the pose value of the user to be recognized and the 3D full face descriptors provided by the full face descriptor generator; and a search unit that searches the 3D partial face descriptor database using the partial face descriptor generated by the partial face descriptor generator for the user to be recognized, measures similarity, and outputs a search result.

To achieve the above technical object, a face recognition method using a 3D face descriptor according to the present invention includes: (a) registering registration data, which is one of images having multiple poses, videos containing face rotation, and 3D face mesh models, as full face descriptors through a full feature point space, and then expressing them as partial face descriptors to generate a partial face descriptor database; (b) expressing search data, which is one of an image having an arbitrary pose, a video containing face rotation, and a 3D face mesh model, as a partial face descriptor through a partial feature point space; and (c) searching the partial face descriptor database generated in step (a) using the partial face descriptor expressed in step (b).

To achieve the other technical object, a 2D face descriptor generating apparatus according to the present invention includes: a face tone feature generator that performs histogram and Gaussian analysis on the central portion of a face image to generate a face tone feature vector; a full Fourier feature generator that sequentially performs a Fourier transform, PCLDA projection, and LDA projection on a normalized face image to generate a full Fourier feature vector; and a component-wise Fourier feature generator that sequentially performs a Fourier transform, PCLDA projection, and LDA projection on k components of the face image to generate component-wise Fourier feature vectors.

To achieve the other technical object, a 2D face descriptor generating method according to the present invention includes: (a) performing histogram and Gaussian analysis on the central portion of a face image to generate a face tone feature vector; (b) sequentially performing a Fourier transform, PCLDA projection, and LDA projection on a normalized face image to generate a full Fourier feature vector; and (c) sequentially performing a Fourier transform, PCLDA projection, and LDA projection on k components of the face image to generate component-wise Fourier feature vectors.

FIG. 1 is a block diagram showing the configuration of a face recognition apparatus using a 3D face descriptor according to the present invention;

FIG. 2 is a block diagram showing the detailed configuration of the basis generator in FIG. 1;

FIG. 3 is a block diagram showing the detailed configuration of the full face descriptor generator in FIG. 1;

FIG. 4 is a block diagram showing the detailed configuration of the search unit in FIG. 1;

FIG. 5 is a diagram illustrating a face recognition method using a 3D face descriptor according to the present invention;

FIG. 6 is a diagram showing a solution for the case where an image arriving from a viewpoint that was not used when building the full feature point space must be recognized;

FIG. 7 is a block diagram illustrating a process of generating a 2D face descriptor according to the present invention;

FIG. 8 is a graph showing the face tone values of individuals;

FIG. 9 is a diagram illustrating the mosaic theory on a view sphere as applied to the present invention; and

FIG. 10 is a two-dimensional representation of the mosaic on the view sphere shown in FIG. 9.

Hereinafter, a preferred embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of a face recognition apparatus using a 3D face descriptor according to the present invention. The apparatus includes a basis generator (11), a full face descriptor generator (13), a partial face descriptor generator (15), and a search unit (17). The terms used in the present invention are defined first. The 3D full feature point space is the feature point space from which full face descriptors are produced; it is the space in which facial feature points from various viewpoints covering the full range of angles are collected. The 3D partial feature point space is the feature point space from which partial face descriptors are produced; it is generated from the full feature point space by projection or interpolation. A 3D full face descriptor is a descriptor carrying 3D face information and is obtained by projecting incoming data onto the 3D full feature point space. A 3D partial face descriptor is a descriptor carrying only part of the 3D face information; it can be obtained by projection from the full face descriptors, by subsequent interpolation, or by projection onto the 3D partial feature point space.

Referring to FIG. 1, the basis generator (11) generates the basis of the feature point space. It registers and trains on one of images having multiple poses, videos containing face rotation, and 3D face mesh models to generate a full feature point space basis including pose values, and stores the basis in a database.

The full face descriptor generator (13) generates, for each user, a 3D full face descriptor using one of N-pose images, a video stream containing face rotation, and a 3D face mesh model together with the full feature point space basis including pose values provided by the basis generator (11), and stores the descriptor in a database.

The partial face descriptor generator (15) generates a 3D partial face descriptor using one of an image, a video stream, and a 3D face mesh model of an arbitrary user to be recognized together with the full feature point space basis including pose values provided by the basis generator (11). It also generates a 3D partial face descriptor database using the pose value of the user to be recognized and the 3D full face descriptors provided by the full face descriptor generator (13).

The search unit (17) searches the 3D partial face descriptor database using the partial face descriptor that the partial face descriptor generator (15) produces for the user to be recognized, measures similarity, and outputs a search result.

Meanwhile, to generate a 3D face descriptor from a 3D face mesh model, the mesh model is projected into two dimensions and thereafter processed in the same way as a 2D image. Specifically, the 3D face mesh model is loaded with a 3D graphics tool and its texture information is extracted, and the eye positions are computed from the texture information to locate the eyes in the 3D model. The model is then rotated in 3D by each given angle and projected onto a 2D plane to produce an image file for each angle. The scale and horizontal translation parameters are computed from the eye positions of the 3D model and normalized.
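The rotate-then-project step can be illustrated with a minimal Python sketch. It assumes rotation about the vertical (y) axis only and an orthographic projection; the text specifies neither the rotation axes nor the projection model, and the function names are hypothetical.

```python
import math

def rotate_y(point, angle_deg):
    """Rotate a 3D point (x, y, z) about the vertical (y) axis.
    Assumption: poses differ only by this single rotation."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def project_orthographic(point):
    """Drop the depth coordinate to obtain a 2D image-plane point.
    Assumption: orthographic rather than perspective projection."""
    x, y, _ = point
    return (x, y)

def render_pose(vertices, angle_deg):
    """Rotate every mesh vertex to the given pose and project it to 2D."""
    return [project_orthographic(rotate_y(v, angle_deg)) for v in vertices]

# Hypothetical usage: one projected vertex list per pose angle.
mesh = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
views = {angle: render_pose(mesh, angle) for angle in (-30, 0, 30)}
```

Each entry of `views` would then be rasterized and handled like an ordinary 2D face image.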

FIG. 2 is a block diagram showing the detailed configuration of the basis generator (11) in FIG. 1. It includes first to third image databases (211, 212, 213), first to third signal processors (214, 215, 216), a pose database (217), a pose image database (218), a partial image feature point space generator (219), and a feature point space compressor (220). Here, the first to third signal processors (214, 215, 216), the pose image database (218), and the partial image feature point space generator (219) constitute a feature point space generator (200).

Referring to FIG. 2, the first to third image databases (211, 212, 213) hold images having multiple poses, videos containing face rotation, and 3D face mesh models, respectively.

The first signal processor (214) takes the multiple images, for example images in N poses, stored in the first image database (211), extracts the face region from each, locates the eyes, and performs a normalization process of scaling and translation to produce N pieces of pose data and N images normalized to the face region. The second signal processor (215) tracks and detects the face in the videos containing face rotation stored in the second image database (212), extracts images at the given poses by referring to the pose values stored in the pose database (217), and performs the normalization process to produce N pieces of pose data and N images normalized to the face region. The third signal processor (216) extracts the eye positions from the texture information of the 3D face mesh models stored in the third image database (213) and projects the model at the given poses by referring to the pose values stored in the pose database (217) to produce N pieces of pose data and N images normalized to the face region.
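The scaling-and-translation normalization driven by the eye positions can be sketched as follows. The canonical left-eye position and inter-eye distance are hypothetical values chosen for illustration; the text does not give them.

```python
import math

def eye_normalization(left_eye, right_eye,
                      target_left=(24.0, 24.0), target_dist=16.0):
    """Return (scale, tx, ty) that maps the detected eyes onto canonical
    positions using scaling and translation only (no rotation).
    target_left and target_dist are illustrative assumptions."""
    dist = math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
    if dist == 0:
        raise ValueError("eye positions coincide")
    scale = target_dist / dist
    tx = target_left[0] - scale * left_eye[0]
    ty = target_left[1] - scale * left_eye[1]
    return scale, tx, ty

def apply_normalization(point, params):
    """Map an image coordinate through the scale-and-translate transform."""
    scale, tx, ty = params
    return (scale * point[0] + tx, scale * point[1] + ty)

# Hypothetical usage: eyes detected 32 pixels apart are brought to 16 pixels.
params = eye_normalization((100.0, 120.0), (132.0, 120.0))
normalized_right_eye = apply_normalization((132.0, 120.0), params)
```

The same parameters would be applied to every pixel coordinate when resampling the face region.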

The pose image database (218) stores the N pieces of pose data and the N images normalized to the face region produced by each of the first to third signal processors (214, 215, 216). The partial image feature point space generator (219) uses the N pieces of pose data and the N normalized images stored in the pose image database (218) to generate first to N-th pose-based partial feature point spaces. The feature point space compressor (220) merges the first to N-th pose-based partial feature point spaces provided by the partial image feature point space generator (219) and then applies a space compression technique to produce the final full feature point space basis including pose values. Because the full feature point space basis thus carries viewpoint information, 3D partial face descriptors can be extracted from it at search time. The feature point space compressor (220) compresses the feature point space using principal component analysis (PCA), ordering the eigenvectors from the largest eigenvalue downward; the compression is optional and is performed as needed.
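As a rough illustration of the PCA compression step, the sketch below pools sample vectors, builds their covariance matrix, and keeps the leading eigenvectors ordered by eigenvalue. Pooling the samples of all poses into one list is an assumption about how the pose-based spaces are merged, and power iteration with deflation is used here only to stay within the Python standard library; the text does not name an eigen-solver.

```python
import math

def covariance(samples):
    """Return (mean, covariance matrix) of equally weighted sample vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    return mean, cov

def top_eigenpairs(matrix, k, iters=500):
    """Power iteration with deflation on a symmetric matrix.
    Returns k (eigenvalue, eigenvector) pairs, largest eigenvalue first.
    Caveat: a start vector orthogonal to an eigenvector would miss it."""
    d = len(matrix)
    m = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(d)) for i in range(d))
        pairs.append((lam, v))
        for i in range(d):          # deflate: remove the found component
            for j in range(d):
                m[i][j] -= lam * v[i] * v[j]
    return pairs

def compress(sample, mean, pairs):
    """Coordinates of a sample in the retained eigenvector basis."""
    return [sum((sample[j] - mean[j]) * v[j] for j in range(len(mean)))
            for _, v in pairs]

# Hypothetical pooled samples from all poses (2-dimensional for brevity).
pooled = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
mean, cov = covariance(pooled)
basis = top_eigenpairs(cov, k=2)
```

Keeping only the first few pairs of `basis` gives the compressed space; keeping all of them corresponds to the case where compression is skipped.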

FIG. 3 is a block diagram showing the detailed configuration of the full face descriptor generator (13) in FIG. 1. It includes first to third images (311, 312, 313), a feature point space generator (314), a feature point space projection unit (315), a full feature point space basis database (316) including pose values, and a feature descriptor compressor (317). Here, the feature point space generator (314) has the same components as the feature point space generator (200) shown in FIG. 2 and performs the same function.

Referring to FIG. 3, the first to third images (311, 312, 313) are, respectively, N-pose images for each user, a video stream containing face rotation, and a 3D face mesh model.

The feature point space generator (314) generates N pose-based partial feature point spaces in the same way as the feature point space generator (200) shown in FIG. 2. The feature point space projection unit (315) projects the N pose-based partial feature point spaces provided by the feature point space generator (314) onto the full feature point space basis database (316) including pose values to obtain N coordinate values. The feature descriptor compressor (317) compresses the feature descriptor using principal component analysis (PCA), ordering the eigenvectors from the largest eigenvalue downward; the compression is optional and is performed as needed. In other words, the first to third images (311, 312, 313) are each signal-processed separately to produce N pieces of viewpoint data and N images normalized to the face region, which are projected onto the N partial image face feature point spaces and passed through a predetermined space compression technique to generate a 3D full face descriptor that is stored in a database.

FIG. 4 is a block diagram showing the detailed configuration of the partial face descriptor generator (15, 400) in FIG. 1. It includes first to third images (411, 412, 413), first to third signal processors (414, 415, 416), a pose database (417), a pose extractor (418), a partial feature point space basis generator (419), a full feature point space basis database (420) including pose values, a 3D partial face descriptor generator (421), a projection/interpolation unit (422), a 3D full face descriptor database (423), and a 3D partial face descriptor database (424). The search unit (425) corresponds to the search unit (17) shown in FIG. 1.

Referring to FIG. 4, the first to third images (411, 412, 413) are, respectively, an image, a video stream, and a 3D face mesh model of an arbitrary user to be recognized. The first to third signal processors (414, 415, 416) operate in the same way as the first to third signal processors (214, 215, 216) shown in FIG. 2.

The pose extractor (418) extracts the pose value from the image data processed and provided by the first to third signal processors (414, 415, 416), and outputs the pose value and the normalized image part separately. The partial feature point space basis generator (419) generates a partial feature point space basis using the full feature point space basis database (420) including pose values and the pose value provided by the pose extractor (418).

The 3D partial face descriptor generator (421) generates a 3D partial face descriptor from the normalized image part provided by the pose extractor (418) and the output of the partial feature point space basis generator (419). The projection/interpolation unit (422) generates the 3D partial face descriptor database (424) either by projecting the 3D full face descriptor database (423) at the pose value provided by the pose extractor (418) or by interpolating within the 3D full face descriptor database (423).

The search unit (425) searches the 3D partial face descriptor database (424) using the partial face descriptor provided by the 3D partial face descriptor generator (421), measures similarity, and outputs the search result.
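A minimal sketch of the search step follows. The text does not name the similarity measure, so Euclidean distance is assumed here (smaller distance meaning higher similarity), and the dictionary layout of the partial face descriptor database is hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query, database):
    """Rank registered users by ascending distance to the query descriptor.
    `database` maps a user id to that user's partial face descriptor
    for the query pose (an assumed layout)."""
    scored = [(user, euclidean(query, desc)) for user, desc in database.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical partial face descriptor database for one pose.
partial_db = {"user_a": [0.0, 0.0], "user_b": [3.0, 4.0]}
ranking = search([0.0, 1.0], partial_db)
best_user = ranking[0][0]
```

The first entry of `ranking` is the closest registered user; a threshold on the distance could be added to reject unknown faces.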

In summary, searching means determining which registered user a given piece of input data belongs to. Because the input data does not contain enough information to express the user's complete 3D face descriptor, the search has to be carried out with the partial information that is available. The input data may be a photograph containing a face, a video, or a 3D mesh model. To pass on to the training stage, the input data is processed by the first to third signal processors (414, 415, 416) according to its type, and one normalized face image and its viewpoint data are extracted. For a 3D mesh model, the frontal face is extracted. The viewpoint data is used to generate, from the full feature space, the partial feature space corresponding to that pose, and the normalized face image is projected onto that space to generate a partial face descriptor. The viewpoint data is also used to generate, from the 3D full face descriptor database (423) of registered users, the partial face descriptors corresponding to that pose. Finally, the search measures the similarity between partial face descriptors.

FIG. 5 illustrates a face recognition method using a 3D face descriptor according to the present invention. In step 51, registration data, which is one of images having multiple poses, videos containing face rotation, and 3D face mesh models, is registered as full face descriptors through the full feature point space and then expressed as partial face descriptors to generate a partial face descriptor database.

In step 53, search data, which is one of an image having an arbitrary pose, a video containing face rotation, and a 3D face mesh model, is expressed as a partial face descriptor through the partial feature point space.

In step 55, the partial face descriptor database generated in step 51 is searched using the partial face descriptor expressed in step 53.

FIG. 6 illustrates a solution for the case where an image arriving from a viewpoint that was not used when building the full feature point space must be recognized. The 3D partial feature point spaces for pose A and pose B, the two viewpoints used at registration that bracket the viewpoint (pose) given at search time, are interpolated using a linear search technique to generate the partial feature point space for the given viewpoint. Likewise, interpolating the 3D partial feature point spaces for pose A and pose B with the linear search technique yields the partial face descriptor for the given viewpoint.
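The interpolation between the two bracketing poses can be sketched as below, under the assumption that it amounts to plain linear interpolation in a single pose angle; the text only refers to a linear technique and gives no formula. The function and parameter names are hypothetical.

```python
def interpolate_descriptor(pose, pose_a, desc_a, pose_b, desc_b):
    """Linearly interpolate between the descriptors registered at pose A
    and pose B to estimate the descriptor at an unseen pose in between.
    Assumption: a pose is a single angle in degrees."""
    if pose_a == pose_b:
        raise ValueError("pose A and pose B must differ")
    t = (pose - pose_a) / (pose_b - pose_a)
    if not 0.0 <= t <= 1.0:
        raise ValueError("query pose must lie between pose A and pose B")
    return [(1.0 - t) * a + t * b for a, b in zip(desc_a, desc_b)]

# Hypothetical usage: registered at 0 and 30 degrees, queried at 15 degrees.
estimated = interpolate_descriptor(15.0, 0.0, [0.0, 10.0], 30.0, [10.0, 20.0])
```

The same weighting could be applied to the basis vectors of the two partial feature point spaces instead of to descriptors.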

FIG. 7 is a block diagram illustrating the process of generating a 2D face descriptor according to the present invention. The process involves a face tone feature generator (710), a full Fourier feature generator (730), and a component-wise Fourier feature generator (750). The 2D face descriptor generation process shown in FIG. 7 can be applied to the feature point space generator (200) of the basis generator (11) in FIG. 2 and to the feature point space generator (314) of the full face descriptor generator (13).

Referring to FIG. 7, in the face tone feature generator (710), a histogram processor (711) generates a histogram of the central region of a face image having the i-th pose. A Gaussian analyzer (713) performs Gaussian analysis on the histogram generated by the histogram processor (711), assuming a Gaussian distribution, and finds the mean value $\mu^{skin}$. A scalar normalizer (715) performs scalar normalization on the mean value $\mu^{skin}$ produced by the Gaussian analyzer (713) so that it becomes a constant of fixed form, for example an integer, and outputs that value as the face tone feature vector. To generate face tone features with the face tone feature generator (710) for a case in which 50 people each took five poses, the mean value $\mu^{skin}$ of all faces in the same pose was found and shifted horizontally, which gives the graph shown in FIG. 8.
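The face tone feature can be sketched in a few lines. The sketch assumes an 8-bit grayscale image stored as a list of rows, takes the central half of the image as the "central region", uses the histogram mean as the mean of the fitted Gaussian, and maps it to one of 32 integer levels; the region size and the number of levels are illustrative choices that the text does not specify.

```python
def central_region(image, fraction=0.5):
    """Crop the centered sub-image covering `fraction` of each dimension."""
    h, w = len(image), len(image[0])
    rh, rw = max(1, int(h * fraction)), max(1, int(w * fraction))
    top, left = (h - rh) // 2, (w - rw) // 2
    return [row[left:left + rw] for row in image[top:top + rh]]

def histogram(region, bins=256):
    """Count how often each integer gray level in 0..bins-1 occurs."""
    counts = [0] * bins
    for row in region:
        for pixel in row:
            counts[pixel] += 1
    return counts

def gaussian_mean(counts):
    """Mean of a Gaussian fitted to the histogram (its maximum-likelihood
    estimate is simply the histogram mean)."""
    total = sum(counts)
    return sum(level * c for level, c in enumerate(counts)) / total

def face_tone_feature(image, levels=32, bins=256):
    """Scalar-normalize the mean tone to an integer in 0..levels-1."""
    mu = gaussian_mean(histogram(central_region(image), bins))
    return min(levels - 1, int(mu * levels / bins))

# Hypothetical 4x4 image whose central 2x2 block has gray level 128.
face = [[0, 0, 0, 0],
        [0, 128, 128, 0],
        [0, 128, 128, 0],
        [0, 0, 0, 0]]
tone = face_tone_feature(face)
```

The per-pose horizontal shift described for FIG. 8 is not modeled in this sketch.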

In the entire Fourier feature generator 730, the first Fourier transformer 731 performs a Fourier transform on the normalized face image f(x,y). The first feature vector generator 732 generates a first feature vector (x₁^h) whose elements are defined by the raster-scanned real and imaginary parts of the Fourier spectrum F(u,v) produced by the first Fourier transformer 731, and the second feature vector generator 733 generates a second feature vector (x₂^h) by raster-scanning the Fourier amplitude |F(u,v)| produced by the first Fourier transformer 731. The first and second PCLDA projection units 734 and 735 project the first and second feature vectors (x₁^h, x₂^h) from the first and second feature vector generators 732 and 733 onto discriminant spaces that are obtained by linear discriminant analysis (LDA) on the principal components (PC) of those feature vectors and are defined by the basis matrices (Ψ₁^h, Ψ₂^h). The first vector normalizer 736 normalizes the vectors projected by the first and second PCLDA projection units 734 and 735 into unit vectors (y₁^h, y₂^h). The first LDA projection unit 737 concatenates the vectors (y₁^h, y₂^h) normalized by the first vector normalizer 736 into a single vector and then projects it onto the discriminant space defined by the basis matrix (Ψ₃^h). The first quantizer 738 clips and quantizes the vector (Z^h) projected by the first LDA projection unit 737 into 5-bit unsigned integers using a predetermined method, thereby generating the entire Fourier feature vector.
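The chain of Fourier transform, feature vector generation, PCLDA projection, unit normalization, LDA projection, and 5-bit quantization can be illustrated with the following minimal sketch. It assumes that the basis matrices Ψ₁, Ψ₂, and Ψ₃ have already been trained and are supplied as plain lists of rows. The clipping range of [-1, 1] and the uniform 32-level mapping are assumptions, because the text only refers to "a predetermined method". All function names are hypothetical, and the naive DFT is meant for tiny examples only.

```python
import cmath
import math


def dft2(img):
    """Naive 2-D DFT of a small image given as a list of rows (illustration only)."""
    h, w = len(img), len(img[0])
    return [[sum(img[y][x] * cmath.exp(-2j * math.pi * (u * y / h + v * x / w))
                 for y in range(h) for x in range(w))
             for v in range(w)]
            for u in range(h)]


def matvec(matrix, vector):
    """Project a vector with a basis matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def unit(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(a * a for a in vector))
    return [a / norm for a in vector] if norm else list(vector)


def quantize5(z, clip=1.0):
    """Clip to [-clip, clip] and map uniformly onto 0..31 (assumed scheme)."""
    out = []
    for a in z:
        a = max(-clip, min(clip, a))
        out.append(min(31, int((a + clip) / (2 * clip) * 32)))
    return out


def fourier_feature(img, psi1, psi2, psi3):
    spectrum = [c for row in dft2(img) for c in row]          # raster scan of F(u,v)
    x1 = [c.real for c in spectrum] + [c.imag for c in spectrum]
    x2 = [abs(c) for c in spectrum]                           # Fourier amplitude
    y1 = unit(matvec(psi1, x1))                               # PCLDA projection + normalization
    y2 = unit(matvec(psi2, x2))
    z = matvec(psi3, y1 + y2)                                 # concatenate, then LDA projection
    return quantize5(z)


if __name__ == "__main__":
    img = [[1, 2], [3, 4]]
    psi1 = [[1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0]]  # toy 2x8 basis
    psi2 = [[1, 0, 0, 0], [0, 1, 0, 0]]                          # toy 2x4 basis
    psi3 = [[1, 0, 0, 0], [0, 0, 0, 1]]                          # toy 2x4 basis
    print(fourier_feature(img, psi1, psi2, psi3))
```

The toy basis matrices merely select coefficients; in practice they would come from PCA followed by LDA on training data, a step not shown here.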

In the component-wise Fourier feature generator 750, the second Fourier transformer 751 performs a Fourier transform on k components (f^j(x,y), j=1,...,k) of the face image. Here, the components correspond to the eyes, nose, mouth, and so on in the face image. The third feature vector generator 752 generates k first feature vectors (x₁^j) whose elements are defined by the raster-scanned real and imaginary parts of the k Fourier spectra (F^j(u,v)) produced by the second Fourier transformer 751, and the fourth feature vector generator 753 generates k second feature vectors (x₂^j) by raster-scanning the k Fourier amplitudes (|F^j(u,v)|) produced by the second Fourier transformer 751. The third and fourth PCLDA projection units 754 and 755 project the k first and second feature vectors (x₁^j, x₂^j) from the third and fourth feature vector generators 752 and 753 onto discriminant spaces that are obtained by linear discriminant analysis (LDA) on the principal components (PC) of those feature vectors and are defined by the basis matrices (Ψ₁^j, Ψ₂^j). The second vector normalizer 756 normalizes the vectors projected by the third and fourth PCLDA projection units 754 and 755 into unit vectors (y₁^j, y₂^j). The second LDA projection unit 757 concatenates the k vectors (y₁^j, y₂^j) normalized by the second vector normalizer 756 into a single vector and then projects it onto the discriminant space defined by the basis matrix (Ψ₃^j). The second quantizer 758 clips and quantizes the vector (Z^j) projected by the second LDA projection unit 757 into 5-bit unsigned integers using a predetermined method, thereby generating the component-wise Fourier feature vector.

As described above, when the two-dimensional face descriptor is generated from the face tone feature vector obtained from the center of the face image, together with the entire Fourier feature vector and the component-wise Fourier feature vector obtained from the entire face image and the component images, the descriptor is robust to expression, pose, and illumination, and its size is also reduced.

FIG. 9 illustrates the mosaic theory on the view sphere as applied to the present invention, in which independent models fill the respective regions of the view sphere surface. The nonlinear overall model is built as the sum of linear spaces constructed from specific viewpoints in the respective partial feature point spaces. As a result, viewpoint-invariant face recognition becomes possible. FIG. 9 shows an example of the region that can be recognized using the central frontal face.

FIG. 10 is a two-dimensional representation of the mosaic on the view sphere shown in FIG. 9. One viewpoint supports up to a quasi region, where the quasi region means a region extended up to a predetermined angle, preferably about 15°. Accordingly, the view sphere can be supported by collecting viewpoints spaced approximately 30° apart.
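The relation between the 15° quasi region and the 30° viewpoint spacing can be made concrete with a small sketch. It assumes, purely for illustration, that viewpoints lie on a regular grid in yaw and pitch, which is a simplification of tiling a sphere. The function name `nearest_view` and its parameters are hypothetical.

```python
def nearest_view(yaw_deg, pitch_deg, spacing=30.0, quasi=15.0):
    """Snap a pose to the closest grid viewpoint and report whether that
    viewpoint's quasi region covers the pose.

    `spacing` is the angular distance between neighboring viewpoints and
    `quasi` is the angular extent each viewpoint supports (both in degrees).
    """
    view_yaw = round(yaw_deg / spacing) * spacing
    view_pitch = round(pitch_deg / spacing) * spacing
    covered = (abs(yaw_deg - view_yaw) <= quasi
               and abs(pitch_deg - view_pitch) <= quasi)
    return (view_yaw, view_pitch), covered


if __name__ == "__main__":
    print(nearest_view(10, -20))              # pose falls inside a quasi region
    print(nearest_view(20, 0, spacing=40.0))  # wider spacing leaves a gap
```

With a spacing of twice the quasi-region angle, every pose on this grid lies within 15° of some viewpoint in each axis; a wider spacing leaves poses that no viewpoint covers.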

The present invention can also be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device in which data readable by a computer system is stored. Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include media implemented in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers skilled in the art to which the present invention pertains.

As described above, according to the present invention, a three-dimensional face descriptor generated from one of images having a plurality of poses, a video stream having face rotation, and a three-dimensional face mesh model is used, so that face recognition is possible regardless of which form of input arrives: a face photograph, a video containing a face, or a three-dimensional face mesh model. In addition, because images, video, and three-dimensional mesh models can all be represented by a single three-dimensional face descriptor, a very high recognition rate can be ensured compared with a conventional face descriptor that recognizes only frontal faces. Furthermore, because a pose-based three-dimensional face descriptor can be generated, a face can be recognized from any viewpoint, so the invention can be applied very efficiently to surveillance systems.

Optimal embodiments have been disclosed above in the drawings and the specification. Although specific terms have been used herein, they are used only to describe the present invention and not to limit its meaning or the scope of the invention set forth in the claims. Those of ordinary skill in the art will therefore understand that various modifications and equivalent other embodiments are possible. Accordingly, the true technical scope of protection of the present invention should be determined by the technical spirit of the appended claims.

Claims

A basis generator that registers and trains on one of images having a plurality of poses, a video having face rotation, and a three-dimensional face mesh model to generate an entire feature point space basis including pose values, and stores the basis in a database;

An entire face descriptor generator that, for each user, generates a three-dimensional entire face descriptor by using one of images of N poses, a video stream having face rotation, and a three-dimensional face mesh model, together with the entire feature point space basis including the pose values provided from the basis generator 11, and stores the descriptor in the database;

A partial face descriptor generator that generates a three-dimensional partial face descriptor by using one of an image, a video stream, and a three-dimensional face mesh model of an arbitrary user to be recognized, together with the entire feature point space basis including the pose values provided from the basis generator, and that generates a three-dimensional partial face descriptor database by using the pose value of the arbitrary user to be recognized and the three-dimensional entire face descriptors provided from the entire face descriptor generator; and

A search unit that searches the three-dimensional partial face descriptor database by using the partial face descriptor of the arbitrary user to be recognized, generated by the partial face descriptor generator, measures similarity, and outputs a search result, the foregoing elements being included in a face recognition apparatus using a three-dimensional face descriptor.

(a) registering images having multiple poses, a video having face rotation, and a three-dimensional face mesh model as entire face descriptors through the entire feature point space, and then expressing them as partial face descriptors to create a partial face descriptor database;

(b) expressing search data, which is one of an image having an arbitrary pose, a video having face rotation, and a three-dimensional face mesh model, as a partial face descriptor through the partial feature point space; and

(c) searching the partial face descriptor database created in step (a) by using the partial face descriptor expressed in step (b).

A face tone feature generator that generates a face tone feature vector by performing histogram and Gaussian analysis on the center of a face image;

An entire Fourier feature generator that generates an entire Fourier feature vector by sequentially performing a Fourier transform, PCLDA projection, and LDA projection on a normalized face image; and

A component-wise Fourier feature generator that generates a component-wise Fourier feature vector by sequentially performing a Fourier transform, PCLDA projection, and LDA projection on k components of the face image.

(a) generating a face tone feature vector by performing histogram and Gaussian analysis on the center of a face image;

(b) generating an entire Fourier feature vector by sequentially performing a Fourier transform, PCLDA projection, and LDA projection on a normalized face image; and

(c) generating a component-wise Fourier feature vector by sequentially performing a Fourier transform, PCLDA projection, and LDA projection on k components of the face image.