JP2012185545A

JP2012185545A - Face image processing device

Info

Publication number: JP2012185545A
Application number: JP2011046501A
Authority: JP
Inventors: Naoyuki Takada; 直幸高田; Yusuke Sano; 友祐佐野
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 2011-03-03
Filing date: 2011-03-03
Publication date: 2012-09-27
Anticipated expiration: 2031-03-03
Also published as: JP5751865B2

Abstract

PROBLEM TO BE SOLVED: To provide a face image processing device that, without using a special surface measuring device, uses a three-dimensional shape model representing the three-dimensional shape of a face to create, from a two-dimensional face image of a target person, a three-dimensional face image whose feature parts are positioned in good agreement with the target person's actual face. SOLUTION: A face image processing device 100 has: alignment means 111 for aligning the positions of three-dimensional face feature points with the positions of two-dimensional face feature points; derived face feature point calculation means 112 for generating derived face feature points by changing the position of each three-dimensional face feature point whose positional deviation from the corresponding two-dimensional face feature point is equal to or greater than a predetermined threshold; derived model creation means 105 for creating a derived shape model by combining the three-dimensional face feature points and the derived face feature points; similar shape selection means 109 for selecting, from the derived shape models and the three-dimensional shape model, the shape model most similar to the two-dimensional face image; and personal model creation means 110 for creating a three-dimensional face image by synthesizing the selected derived shape model or three-dimensional shape model with the two-dimensional face image.

Description

The present invention relates to a face image processing apparatus, and more particularly to a face image processing apparatus that creates a three-dimensional face image by mapping a two-dimensional face image of an individual onto three-dimensional face shape data.

Face authentication devices have been proposed that authenticate a target person by collating a two-dimensional face image, acquired by photographing the person's face, with a registered face image. Such a device registers the person's face image in advance and decides whether to authenticate based on a comparison with the face image acquired at the time of use. Consequently, if the face orientation, lighting conditions, facial expression, or the like differ between the registered image and the image acquired at the time of use, the error rate of the collation increases.

Patent Document 1 therefore proposes a sample image collection method that creates a plurality of two-dimensional images of a subject's face for use in face image collation. This method uses a surface measuring device to acquire three-dimensional surface shape data and color information of the face, and creates a three-dimensional computer graphics model of the subject's face from them. It then creates a plurality of two-dimensional face images by rendering the model while varying the face orientation or the illumination conditions.

Patent Document 2 proposes a face image collation device that creates a plurality of reference face images for use in collation by means of a standard frame model representing the three-dimensional shape of a standard face. This device combines the two-dimensional face image of the subject acquired at the time of collation with the standard frame model to create a three-dimensional face model (three-dimensional face image) corresponding to the subject's face. It then creates the reference face images by rendering the three-dimensional face model while taking into account variation factors such as face orientation, illumination conditions, and facial expression.

Patent Document 3 proposes a face image processing apparatus that creates a three-dimensional face image representing the face shape of a subject by using derived shape models obtained by deforming a three-dimensional shape model, prepared in advance, that represents the three-dimensional shape of a face. This apparatus creates each derived shape model by deforming the three-dimensional shape model so as to change its ratio in the height direction, or its ratio of vertical lengths, relative to a reference point such as the nose tip. It then selects, from the derived shape models and the three-dimensional shape model, the model most similar to the two-dimensional face image of the subject acquired from the image input means, and combines that model with the two-dimensional face image to create the three-dimensional face image.

Patent Document 1: JP-A-4-256185
Patent Document 2: JP 2003-6645 A
Patent Document 3: JP 2009-211148 A

The image collection method described in Patent Document 1 requires a three-dimensional computer graphics model of the face to be created for each subject. Creating that model requires measuring the subject's own face with a special surface measuring device, so it is not easy to create such a model for every subject. It has therefore been difficult to apply this image collection method to face authentication devices in general use.

In the face image collation device described in Patent Document 2, on the other hand, one or more standard frame models corresponding to a standard face are prepared as models representing a three-dimensional shape. A standard frame model needs to be created only once and can be used in any such face image collation device, so the problem described above does not arise. However, because a standard frame model corresponds to a standard face shape, it does not reflect the features of an individual's face shape, such as how far the brow bone or cheekbones protrude or the shape of the jaw. Patent Document 2 also discloses selecting the standard frame model used to create the three-dimensional face model according to attribute information of the subject such as sex and age. Even so, subjects with the same attribute information can differ in the features of their face shape. A face image processing apparatus that can create a three-dimensional face model suited to the subject's face has therefore been desired.

The face image processing apparatus described in Patent Document 3 creates derived shape models by deforming the three-dimensional shape model in accordance with the positions of the characteristic parts of the subject's face, and can therefore create a three-dimensional face image better suited to the subject's face. However, these derived shape models differ from the three-dimensional shape model prepared in advance only in the ratio in the height direction, or the ratio of vertical lengths, relative to the reference point. A face image processing apparatus that can create a three-dimensional face image in which the position of each characteristic part matches more closely is still desired.

An object of the present invention is therefore to provide a face image processing apparatus that, without using a special surface measuring device, uses a three-dimensional shape model representing the three-dimensional shape of a face to create, from a two-dimensional face image of a subject, a three-dimensional face image in which the positions of the characteristic parts match the subject's face more closely.

To solve this problem, the present invention provides a face image processing apparatus that creates a three-dimensional face image corresponding to a person's face by combining a two-dimensional face image containing the person's face with a three-dimensional shape model representing the three-dimensional shape of a face. The face image processing apparatus has: image input means for inputting the two-dimensional face image; storage means for storing in advance the three-dimensional shape model and a plurality of three-dimensional face feature points representing facial feature points in the three-dimensional shape model; face feature point extraction means for extracting, from the two-dimensional face image, a plurality of two-dimensional face feature points representing facial feature points; alignment means for performing alignment by calculating the positional deviation amount between each three-dimensional face feature point and the two-dimensional face feature point corresponding to it; derived face feature point calculation means for generating one or more derived face feature points by changing the position of each three-dimensional face feature point whose positional deviation amount is equal to or greater than a predetermined threshold, and for generating derived face feature point sets in which, for each such three-dimensional face feature point, a feature point selected from that three-dimensional face feature point and its derived face feature points, and, for each three-dimensional face feature point whose positional deviation amount is less than the predetermined threshold, that three-dimensional face feature point itself, are combined so as to form the shape of a face; derived model creation means for creating a derived shape model by deforming the three-dimensional shape model so that its three-dimensional face feature points coincide with a derived face feature point set; similar shape selection means for comparing the derived shape models and the three-dimensional shape model with the two-dimensional face image and selecting the derived shape model or three-dimensional shape model most similar to the two-dimensional face image; and personal model creation means for creating the three-dimensional face image corresponding to the person's face by combining the selected derived shape model or three-dimensional shape model with the two-dimensional face image.

Furthermore, in the face image processing apparatus according to the present invention, the derived face feature point calculation means preferably distributes the derived face feature points generated for a three-dimensional face feature point more widely the larger the positional deviation amount of that three-dimensional face feature point from the corresponding two-dimensional face feature point.

Furthermore, in the face image processing apparatus according to the present invention, the derived face feature point calculation means preferably generates a larger number of derived face feature points for a three-dimensional face feature point the larger the positional deviation amount of that three-dimensional face feature point from the corresponding two-dimensional face feature point.
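The two preferences above (a wider distribution and a larger number of candidates as the deviation grows) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure of the invention: the function name `derived_points`, the linear scaling by `deviation / threshold`, the uniform random offsets, and the parameters `base_count` and `base_spread` are all hypothetical.

```python
import math
import random

def derived_points(point, deviation, threshold,
                   base_count=2, base_spread=1.0, seed=0):
    """Return candidate positions for one 3D face feature point.

    Assumption: both the number of candidates and their spread grow
    linearly with deviation / threshold. The text does not state the rule.
    """
    if deviation < threshold:
        return []  # below the threshold the original point is kept as-is
    ratio = deviation / threshold
    count = base_count * math.ceil(ratio)  # more candidates for larger deviation
    spread = base_spread * ratio           # wider distribution for larger deviation
    rng = random.Random(seed)
    return [tuple(c + rng.uniform(-spread, spread) for c in point)
            for _ in range(count)]

# A point far from its 2D counterpart gets more, and more widely spread, candidates.
near = derived_points((0.0, 0.0, 0.0), deviation=1.0, threshold=1.0)
far = derived_points((0.0, 0.0, 0.0), deviation=2.5, threshold=1.0)
```

A derived face feature point set would then pick, for each such point, either the original point or one of these candidates.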

The face image processing apparatus according to the present invention has the effect that, without using a special surface measuring device, it can use a three-dimensional shape model representing the three-dimensional shape of a face to create, from a two-dimensional face image of a subject, a three-dimensional face image in which the positions of the characteristic parts match the subject's face more closely.

FIG. 1 is a schematic configuration diagram of an image processing apparatus to which the present invention is applied.
FIG. 2 is a schematic diagram showing the relationship between a face image and a three-dimensional shape model.
FIG. 3 is a graph showing the relationship between the positional deviation amount and the distance between a three-dimensional face feature point and the projection line of a two-dimensional face feature point.
FIG. 4 is a schematic diagram showing derived face feature points and derived face feature point sets.
FIG. 5 is a flowchart showing an example of the three-dimensional face image creation process in the image processing apparatus to which the present invention is applied.

Hereinafter, a face image processing apparatus according to an embodiment of the present invention will be described with reference to the drawings.
When the face image processing apparatus to which the present invention is applied acquires a two-dimensional face image of a subject, it extracts from that image the feature points corresponding to the facial feature points of a three-dimensional shape model, prepared in advance, that represents the three-dimensional shape of a face. The apparatus then performs alignment so that the arrangement of the feature points of the three-dimensional shape model best matches the arrangement of the feature points of the two-dimensional face image. For each feature point of the three-dimensional shape model whose positional deviation from the corresponding feature point of the two-dimensional face image is equal to or greater than a predetermined amount, it generates a plurality of derived face feature points at changed positions, and uses these derived face feature points to create derived shape models. Finally, the apparatus creates the three-dimensional face image from whichever of the derived shape models and the original three-dimensional shape model is most similar to the two-dimensional face image, thereby obtaining a three-dimensional face image in which the position of each characteristic part matches the subject's face well.

FIG. 1 is a diagram showing a schematic configuration of a face image processing apparatus 100 to which the present invention is applied. As shown in FIG. 1, the face image processing apparatus 100 has storage means 101, image input means 102, face feature point extraction means 103, alignment information calculation means 104, derived model creation means 105, light source direction estimation means 106, shadow image creation means 107, similarity calculation means 108, similar shape selection means 109, and personal model creation means 110.
Of these, the face feature point extraction means 103, the alignment information calculation means 104, the derived model creation means 105, the light source direction estimation means 106, the shadow image creation means 107, the similarity calculation means 108, the similar shape selection means 109, and the personal model creation means 110 are each a functional module implemented by a microprocessor, a memory, their peripheral circuits, and software running on the microprocessor. Alternatively, these means may be integrated in firmware, or some or all of them may be implemented as independent electronic circuits, firmware, microprocessors, or the like. Each part of the face image processing apparatus 100 is described in detail below.

The storage means 101 has a semiconductor memory such as a ROM (Read Only Memory) or RAM (Random Access Memory), or a magnetic recording medium and its access device, or an optical recording medium and its access device. The storage means 101 stores the computer program for controlling the face image processing apparatus 100, various parameters, data, and the like. The storage means 101 also stores a three-dimensional shape model of a face and a plurality of three-dimensional face feature points representing facial feature points in that three-dimensional shape model. There may be one three-dimensional shape model or several. The three-dimensional shape model is a frame model representing the three-dimensional shape of a person's face; a wire frame model, a surface model, or the like is used. The three-dimensional shape models may be created, for example, one from each of the face shapes of a plurality of persons, or by averaging face shape models that imitate the face shapes of many persons. Alternatively, many face shape models may be categorized so that faces with highly similar shapes fall into the same category, and the face shape models within each category may be averaged.

Here, a three-dimensional face feature point represents a point on a characteristic part, such as an eye, the nose, or the mouth, that differs from its surroundings in shape or color components, for example the center point or an end point of that part. Three-dimensional face feature points include, for example, the inner end of the eyebrow, the outer end of the eyebrow, the pupil center, the eye region center, the inner eye corner, the outer eye corner, the nose tip, the nostril center, the mouth point (mouth center), and the mouth corner points. There is no restriction on the types or number of three-dimensional face feature points stored, but they are preferably feature points of the same parts as the face feature points that the face feature point extraction means 103 can extract from a two-dimensional face image, or at least include all of those feature points. In this embodiment, ten three-dimensional face feature points are stored: the left and right eye region centers, inner eye corners, and outer eye corners; the nose tip; the mouth point; and the left and right mouth corner points.

The image input means 102 is, for example, an interface circuit connected to image acquisition means such as a surveillance camera, and inputs the face image of the subject, acquired by the image acquisition means as a two-dimensional face image, to the face image processing apparatus 100. If there is a recording medium on which history information containing one or more previously captured face images is recorded, the image input means 102 may have a reading device for accessing that recording medium and a user interface including input devices, such as a keyboard and mouse, and a display. In that case, the user selects one face image from the history information via the user interface of the image input means 102. The face in the face image may face forward or obliquely, provided that the image contains the entire face and each characteristic part of the face (eyes, nose, mouth, and so on) can be distinguished from the others. The face image may be a grayscale or color multi-tone image, but a color multi-tone image is preferable because the features of skin regions are easier to extract from it. In this embodiment, the face image is a color image of 128 × 128 pixels with 8 bits of luminance resolution for each of the RGB colors. Face images with other resolutions and gradations may also be used.
The image input means 102 outputs the acquired face image to the face feature point extraction means 103, the light source direction estimation means 106, the similarity calculation means 108, and the personal model creation means 110.

The face feature point extraction means 103 extracts face feature points representing facial feature points (hereinafter referred to as two-dimensional face feature points) from the face image output from the image input means 102. It then outputs the type of each extracted two-dimensional face feature point and its position on the face image (for example, two-dimensional coordinates with the upper left corner of the face image as the origin) to the alignment information calculation means 104. In this embodiment, the face feature point extraction means 103 extracts the two-dimensional face feature points corresponding to each of the three-dimensional face feature points stored in the storage means 101 (the ten points such as the eye region centers, the nose tip, and the mouth corner points). The face feature point extraction means 103 can use any of various known methods for extracting two-dimensional face feature points from a face image. For example, it performs edge extraction on the face image to extract edge pixels, that is, pixels with a large luminance difference from their neighbors. It then checks whether feature quantities obtained from the positions, patterns, and so on of the edge pixels satisfy conditions predetermined for parts such as the eyes, nose, and mouth, thereby locating each part and extracting each two-dimensional face feature point. Instead of edge extraction, the face feature point extraction means 103 may apply a Gabor transform or a wavelet transform to extract pixels with large local variation in several different spatial frequency bands. It may also extract the two-dimensional face feature points by locating each part of the face through template matching between the face image and a template corresponding to that part. Alternatively, it may obtain each two-dimensional face feature point by having the user specify its position via a user interface (not shown) consisting of a keyboard, a mouse, a display, and the like.
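The edge extraction step mentioned above can be illustrated with a small sketch. It assumes a grayscale image given as a list of rows and marks a pixel as an edge pixel when the central-difference gradient magnitude reaches a threshold; the function name `edge_pixels`, the gradient operator, and the threshold are illustrative choices, since the text only requires a large luminance difference from surrounding pixels.

```python
import math

def edge_pixels(gray, threshold):
    """Return (x, y) of interior pixels whose luminance gradient is large.

    gray is a list of rows of luminance values. Central differences are an
    assumption; any edge operator would serve the same purpose.
    """
    height, width = len(gray), len(gray[0])
    edges = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal luminance difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical luminance difference
            if math.hypot(gx, gy) >= threshold:
                edges.append((x, y))
    return edges

# A 5 x 5 image with a vertical dark-to-bright boundary between columns 2 and 3.
image = [[0, 0, 0, 255, 255] for _ in range(5)]
edges = edge_pixels(image, threshold=100)
```

The positions and patterns of such edge pixels would then be tested against the per-part conditions to locate the eyes, nose, and mouth.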

The alignment information calculation means 104 aligns the three-dimensional shape model stored in the storage means 101 with the two-dimensional face image input from the image input means 102, and generates the derived face feature point sets used by the derived model creation means 105 described later. Specifically, it performs alignment so that the arrangement of the three-dimensional face feature points of the stored three-dimensional shape model best matches the arrangement of the two-dimensional face feature points extracted from the face image by the face feature point extraction means 103. For each derived shape model that the derived model creation means 105 creates from a derived face feature point set, it also performs alignment with the two-dimensional face image using that derived face feature point set. Details of the derived face feature point sets are described later. The alignment information calculation means 104 outputs the alignment information obtained as a result of the alignment to the derived model creation means 105. For these purposes, the alignment information calculation means 104 has alignment means 111 and derived face feature point calculation means 112.

The alignment means 111 rotates, translates, and enlarges or reduces the three-dimensional face feature points of the three-dimensional shape model so that their arrangement best matches the arrangement of the two-dimensional face feature points, and calculates the corresponding alignment information. In doing so, the alignment means 111 preserves the relative positional relationship among the three-dimensional face feature points; that is, it moves them so that their arrangement after alignment is geometrically similar to their arrangement before alignment.
The alignment information includes, for example, a rotation angle about each axis of a three-dimensional orthonormal coordinate system (X, Y, Z) set for the three-dimensional shape model, a translation amount along each axis, and an enlargement/reduction ratio. In this coordinate system, for example, the origin is the centroid of the three-dimensional face feature points on the three-dimensional shape model, the X axis runs horizontally across the face from left to right, the Y axis runs horizontally from the back of the face to the front, and the Z axis runs vertically from bottom to top.
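As a concrete reading of this alignment information, the following sketch applies a rotation, a uniform scale, and a translation to a list of three-dimensional feature points. The function names, the rotation order (about X, then Y, then Z), and the convention p' = s·R·p + t are assumptions made for illustration; the text does not specify them. Because the scale is uniform, the transformed arrangement stays geometrically similar to the original.

```python
import math

def rotation_matrix(rx, ry, rz):
    """Rotation about X, then Y, then Z (radians). The order is an assumption."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    # R = Rz * Ry * Rx written out explicitly
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_alignment(points, rx, ry, rz, translation, scale):
    """Map each point p to scale * R * p + translation."""
    r = rotation_matrix(rx, ry, rz)
    aligned = []
    for p in points:
        rotated = [sum(r[i][k] * p[k] for k in range(3)) for i in range(3)]
        aligned.append(tuple(scale * rotated[i] + translation[i] for i in range(3)))
    return aligned

# Rotate 90 degrees about Z, double the size, then shift by (1, 1, 1).
moved = apply_alignment([(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
                        0.0, 0.0, math.pi / 2, (1.0, 1.0, 1.0), 2.0)
```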

To evaluate each candidate set of alignment information, the alignment means 111 then calculates the alignment error that results when each three-dimensional face feature point is aligned with the corresponding two-dimensional face feature point using that alignment information.
The alignment means 111 calculates the alignment error, for example, as follows. First, it projects each two-dimensional face feature point into the three-dimensional space of the three-dimensional shape model.
FIG. 2 schematically shows the relationship between the face image and the three-dimensional shape model. As shown in FIG. 2, the alignment means 111 projects the two-dimensional face feature points 11 and 12 of the face image 10 into the three-dimensional space of the three-dimensional shape model 20 with the camera position 30 as the origin. From the positional relationship between the resulting projection lines 31 and 32 and the three-dimensional face feature points 21 and 22, the alignment means 111 can compare the arrangement of the feature points of the three-dimensional shape model 20 with the arrangement of the feature points of the face image 10.
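The comparison between a projection line and a three-dimensional face feature point comes down to a point-to-line distance. The sketch below computes the shortest distance from a feature point to the line through the camera position and the two-dimensional feature point placed on the image plane. The function name and the representation of the image point as a three-dimensional coordinate are assumptions for illustration.

```python
import math

def distance_to_projection_line(camera, image_point, feature_point):
    """Shortest distance from feature_point to the line camera -> image_point.

    All three arguments are (x, y, z) tuples in the model coordinate system.
    """
    d = [image_point[k] - camera[k] for k in range(3)]    # line direction
    v = [feature_point[k] - camera[k] for k in range(3)]  # camera -> feature point
    cross = (v[1] * d[2] - v[2] * d[1],
             v[2] * d[0] - v[0] * d[2],
             v[0] * d[1] - v[1] * d[0])
    return math.sqrt(sum(c * c for c in cross)) / math.sqrt(sum(c * c for c in d))

# Line along the Z axis; the feature point is offset by (3, 4) in X and Y.
gap = distance_to_projection_line((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (3.0, 4.0, 10.0))
```

In this example the feature point lies 5 units from the projection line, regardless of its depth along the line.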

In the present embodiment, the three-dimensional shape model stored in the storage unit 101 is assumed to be a model representing a standard three-dimensional face shape, not a model of the subject's own face shape. The arrangement of the three-dimensional face feature points 21 and 22 therefore generally differs from the arrangement of the two-dimensional face feature points 11 and 12. As a result, some feature points, such as the three-dimensional face feature point 21 at the outer corner of the left eye, have a small shortest distance to their projection line 31, while others, such as the three-dimensional face feature point 22 at the left corner of the mouth, have a large shortest distance to their projection line 32. For this reason, it is generally impossible to perform alignment so that the arrangement of all the three-dimensional face feature points completely matches the arrangement of the two-dimensional face feature points.
On the other hand, if all the three-dimensional face feature points are aligned evenly with their corresponding two-dimensional face feature points, the resulting three-dimensional face image is of poor quality, with the texture information shifted evenly over the whole face shape. The shortest distances to the projection lines are therefore weighted unequally in the evaluation.

For each three-dimensional face feature point of interest, the alignment unit 111 calculates a positional deviation amount E_i with respect to the corresponding two-dimensional face feature point, and obtains the total positional deviation amount E as the sum of the positional deviation amounts E_i. The total positional deviation amount E is expressed, for example, by the following equation.

$$E = \sum_{i} E_i \tag{1}$$

The alignment unit 111 rotates each three-dimensional face feature point about, or translates it along, each axis of the three-dimensional orthonormal coordinate system (X, Y, Z), and enlarges or reduces the arrangement, so that the total positional deviation amount E is minimized. Using an optimization method such as simulated annealing or the steepest descent method, the alignment unit 111 can obtain the pose change (rotation angle, translation amount, and enlargement/reduction ratio) of the three-dimensional shape model at which the total positional deviation amount E is minimized. The alignment unit 111 takes the total positional deviation amount E minimized by this pose change as the alignment error of the three-dimensional face feature points with respect to the two-dimensional face feature points.

In equation (1), the positional deviation amount E_i is expressed, for example, by the following equation.

In equation (2), d_i is the shortest distance between the three-dimensional face feature point of interest and the projection line of the two-dimensional face feature point corresponding to it. In this case, the point in the three-dimensional space that corresponds to the two-dimensional face feature point (hereinafter referred to as the three-dimensional projection point) is the point on the projection line of the two-dimensional face feature point that is closest to the corresponding three-dimensional face feature point. σ is a coefficient that determines the relationship between the distance d_i and the positional deviation amount E_i.
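The distance d_i and the three-dimensional projection point defined above can be sketched as a point-to-line computation. This is a minimal example; the function names and the representation of the projection line by a camera position and a direction vector are assumptions made for illustration.

```python
import math


def nearest_point_on_line(camera, direction, point):
    """Point on the line camera + t * direction that is closest to `point`.

    This plays the role of the three-dimensional projection point.
    """
    norm2 = sum(c * c for c in direction)
    t = sum((p - c) * d for p, c, d in zip(point, camera, direction)) / norm2
    return tuple(c + t * d for c, d in zip(camera, direction))


def shortest_distance(camera, direction, point):
    """Shortest distance d_i from a 3D feature point to the projection line."""
    return math.dist(point, nearest_point_on_line(camera, direction, point))
```

For a camera at the origin looking along the Z axis, the feature point (3, 4, 5) has the projection point (0, 0, 5) and a distance d_i of 5.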

FIG. 3 shows the relationship between the distance d_i and the positional deviation amount E_i. In FIG. 3, the horizontal axis represents the distance d_i and the vertical axis represents the positional deviation amount E_i. Graphs 301, 302, and 303 show the relationship for σ = 0.0017, σ = 0.017, and σ = 0.17, respectively.
As shown in FIG. 3, the change in the positional deviation amount E_i per change in d_i is larger in the region where d_i is small than in the region where d_i is large. In other words, bringing a feature point whose distance d_i is small still closer reduces E_i by more than bringing a feature point whose distance d_i is large closer does, and therefore reduces the total positional deviation amount E by more.
In FIG. 3, for example, in the region where d_i > 0.2, the positional deviation amount E_i of graph 301 has approached 1 asymptotically, whereas that of graph 303 has not. That is, when σ is small, E_i approaches the predetermined value asymptotically at smaller distances d_i; when σ is large, it does so only at larger distances d_i. The relationship between the distance d_i and the positional deviation amount E_i can thus be adjusted through the value of σ, which is set as appropriate for the environment in which the face image processing apparatus 100 is installed, its purpose, and so on. In the present embodiment, σ = 0.017.
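Equation (2) itself is not reproduced in this text. Purely as an illustration of the behavior described for FIG. 3, the sketch below assumes a saturating form E_i = 1 - exp(-d_i / σ), which rises steeply for small d_i and approaches 1 for large d_i. This form and the function name are assumptions, not the equation of the embodiment.

```python
import math


def positional_deviation(d, sigma=0.017):
    """Assumed saturating deviation E_i in [0, 1); NOT the patent's equation (2)."""
    return 1.0 - math.exp(-d / sigma)


# Reducing an already small distance lowers E_i far more than
# reducing a large distance by the same amount.
gain_near = positional_deviation(0.02) - positional_deviation(0.01)
gain_far = positional_deviation(0.20) - positional_deviation(0.19)
```

Under this assumed form, a small σ saturates at short distances and a large σ only at long distances, which matches the qualitative difference described between graphs 301 and 303.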

In this way, by performing alignment so as to minimize the total positional deviation amount E using the positional deviation amounts E_i calculated by equation (2), the alignment unit 111 preferentially brings the three-dimensional face feature points with smaller distances d_i closer to their corresponding three-dimensional projection points. The alignment unit 111 can thereby perform alignment such that the smaller the positional deviation amount E_i calculated by equation (2) for a three-dimensional face feature point, the more closely that point is matched to its corresponding two-dimensional face feature point.

The positional deviation amount E_i may be calculated in any way, provided that the larger the distance d_i, the smaller the change in E_i per change in d_i. For example, the sigmoid function expressed by equation (3) or the function expressed by equation (4) may be used.

In equations (3) and (4), α is a coefficient that determines the relationship between the distance d_i and the positional deviation amount E_i, and its value is set as appropriate for the environment in which the face image processing apparatus 100 is installed, its purpose, and so on.
The distance d_i is not limited to the shortest distance between the three-dimensional face feature point of interest and the projection line of the corresponding two-dimensional face feature point. For example, the distance d_i may be the distance between the three-dimensional face feature point of interest and the intersection of the projection line of the corresponding two-dimensional face feature point with the surface of the three-dimensional shape model. In this case, that intersection is the three-dimensional projection point.
When occlusion occurs, in which part of the face is hidden behind a protruding part of the face such as the nose or behind an obstruction, so that information on part of the face is missing from the face image and some of the two-dimensional feature points cannot be extracted, the positional deviation amount E_i is not calculated for the corresponding three-dimensional feature points.
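The handling of occluded feature points can be sketched as follows: the total positional deviation amount E sums E_i only over feature points whose two-dimensional counterpart was extracted. The use of `None` to mark a missing two-dimensional feature point and the saturating form 1 - exp(-d / σ) are assumptions made for this example.

```python
import math


def total_deviation(distances, sigma=0.017):
    """Sum E_i over feature points, skipping occluded ones.

    distances: one d_i per 3D feature point, or None where the matching
    2D feature point could not be extracted (assumed convention).
    """
    total = 0.0
    for d in distances:
        if d is None:
            continue  # no E_i is calculated for this feature point
        total += 1.0 - math.exp(-d / sigma)  # assumed form of E_i
    return total
```

An optimizer such as simulated annealing or steepest descent would evaluate a function of this kind for each candidate pose and keep the pose that gives the smallest total.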

The alignment unit 111 takes as the alignment information the rotation angle about each axis of the orthonormal coordinate system (X, Y, Z), the translation amount, and the enlargement/reduction ratio of the three-dimensional face feature points at which the total positional deviation amount E is minimized. The alignment unit 111 also outputs the positional deviation amount E_i of each three-dimensional face feature point and the alignment information to the derived face feature point calculation unit 112.

The derived face feature point calculation unit 112 identifies each three-dimensional face feature point on the three-dimensional shape model whose positional deviation amount E_i with respect to the corresponding two-dimensional face feature point is equal to or greater than a predetermined value. It then generates one or more derived face feature points by shifting only the three-dimensional position information of each identified three-dimensional face feature point to nearby positions.
That is, for each facial feature whose three-dimensional face feature point has a positional deviation amount equal to or greater than the predetermined value, there exist both the original three-dimensional face feature point and the derived face feature points generated from it.
The derived face feature point calculation unit 112 then creates face feature point sets, each of which is a combination of three-dimensional face feature points forming one three-dimensional shape model. For a three-dimensional face feature point for which derived face feature points have been generated, one point is selected in turn, for each facial feature, from among the original three-dimensional face feature point and its derived face feature points. For a three-dimensional face feature point for which no derived face feature point has been generated, that three-dimensional face feature point is used as is. These face feature point sets are referred to here as derived face feature point sets.
The predetermined value can be set, for example, to the size of the range that is considered to fall on the same pixel of the two-dimensional face image when the three-dimensional shape model is projected onto the two-dimensional face image.

A three-dimensional face feature point with a large positional deviation amount lies far from the three-dimensional projection point of the corresponding two-dimensional face feature point, so the shape of the three-dimensional shape model is likely to differ greatly from the subject's face in the vicinity of that feature point. Accordingly, by varying each three-dimensional face feature point in accordance with its positional deviation amount, for example, the three-dimensional shape model can be deformed mainly at the feature points with large positional deviation amounts, that is, at the parts whose shape is likely to differ greatly, and a three-dimensional shape model similar to the subject's face can be created efficiently. In the example shown in FIG. 2, the face image processing apparatus 100 first performs alignment with reference to the three-dimensional face feature point 21 at the outer corner of the left eye and then varies the three-dimensional face feature point 22 at the left corner of the mouth by a large amount, which allows it to create a three-dimensional shape model similar to the subject's face efficiently. In the present embodiment, therefore, the derived face feature point calculation unit 112 generates a plurality of derived face feature points at changed positions for each three-dimensional face feature point whose positional deviation amount E_i with respect to the corresponding two-dimensional face feature point is equal to or greater than the predetermined value.

For example, for each three-dimensional face feature point, the derived face feature point calculation unit 112 generates derived face feature points by changing the position of that point according to a probability distribution obtained from the positional deviation amount E_i. Letting a three-dimensional face feature point be P_i = (X_i, Y_i, Z_i)^T and a derived face feature point be P_i' = (X_i', Y_i', Z_i')^T, the probability distribution p(P_i') can be expressed as a normal distribution obtained from the three-dimensional face feature point P_i and the positional deviation amount E_i, as in the following equation.

Here, d(E_i) is obtained from the positional deviation amount E_i and can be calculated, for example, by the following equation.

Here, β is a coefficient that determines the variance of each dimension of the normal distribution. The larger β is, the greater the distance between the three-dimensional face feature point P_i and the derived face feature point P_i' tends to be; the smaller β is, the smaller that distance tends to be. The value of β is set as appropriate for the environment in which the face image processing apparatus 100 is installed, its purpose, and so on, and can be set to β = 0.7, for example.

FIG. 4 is a schematic diagram showing the derived face feature points P_i' created by equation (5). In FIG. 4, the derived face feature points are represented by the symbol "+". As shown in FIG. 4, the derived face feature points 41 and 42 are distributed according to three-dimensional normal distributions centered on the three-dimensional face feature points 21 and 22. Around the three-dimensional face feature point 21 at the outer corner of the left eye, where the positional deviation amount E_i is small, the derived face feature points 41 are distributed over a narrow range, whereas around the three-dimensional face feature point 22 at the left corner of the mouth, where the positional deviation amount E_i is large, the derived face feature points 42 are distributed over a wide range.
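Equations (5) and (6) are not reproduced in this text. As a hedged sketch of the sampling they describe, the example below draws derived face feature points from an axis-aligned normal distribution centered on P_i, assuming that the per-axis standard deviation is d(E_i) = β · E_i. That form, the function name, and the parameters are assumptions made for illustration.

```python
import random


def sample_derived_points(p, e_i, count, beta=0.7, rng=None):
    """Draw `count` derived points around the 3D feature point `p`.

    Assumes a diagonal-covariance normal distribution whose per-axis
    standard deviation is beta * e_i (an assumed form of d(E_i)).
    """
    rng = rng or random.Random()
    spread = beta * e_i
    return [tuple(rng.gauss(c, spread) for c in p) for _ in range(count)]


# A point with a larger deviation E_i yields a wider cloud of candidates.
narrow = sample_derived_points((0.0, 0.0, 0.0), 0.1, 50, rng=random.Random(0))
wide = sample_derived_points((0.0, 0.0, 0.0), 0.9, 50, rng=random.Random(0))
```

With the same random seed, the two clouds differ only in scale, which mirrors the narrow distribution around feature point 21 and the wide distribution around feature point 22 in FIG. 4.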

Although equation (5) shows an example in which the probability distribution p(P_i') is a normal distribution whose covariance matrix is restricted to a diagonal matrix, the derived face feature point calculation unit 112 may instead use a normal distribution with an unrestricted covariance matrix.

The derived face feature point calculation unit 112 may also use the three-dimensional projection point of each two-dimensional face feature point as the center of the normal distribution. In this case, the center of the normal distribution is either the point on the projection line of each two-dimensional face feature point that is closest to the corresponding three-dimensional face feature point, or the intersection of the line projecting each two-dimensional face feature point into the three-dimensional space with the surface of the three-dimensional shape model. Letting the three-dimensional projection point of a two-dimensional face feature point be Q_i = (X_i, Y_i, Z_i)^T, the probability distribution p(P_i') can be expressed, for example, by the following equation.

For example, when the position of a two-dimensional face feature point extracted from the face image by the face feature point extraction unit 103 has only a small error relative to the position of the correct feature point on the subject's face, the correct feature point in the three-dimensional space is likely to lie near the three-dimensional projection point. In that case, obtaining the derived face feature points using equation (7) makes it more likely that a derived face feature point whose position matches the corresponding two-dimensional face feature point is generated, so the face image processing apparatus 100 can create a more reliable derived shape model.

The derived face feature point calculation unit 112 may also generate the derived face feature points P_i' using a non-normal distribution. For example, the correct feature point is unlikely to lie on the side of the three-dimensional face feature point P_i opposite to the side on which the corresponding three-dimensional projection point lies. The derived face feature point calculation unit 112 may therefore prevent derived face feature points P_i' from being distributed on the side of P_i on which the corresponding three-dimensional projection point does not lie. This allows the face image processing apparatus 100 to create the three-dimensional shape model more efficiently.
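One way to obtain such a one-sided distribution, shown here only as a sketch, is rejection sampling: candidates are drawn from a normal distribution around P_i and kept only if they lie in the half-space facing the three-dimensional projection point Q_i. The half-space test, the function name, and the parameters are assumptions, since the description does not specify a particular non-normal distribution.

```python
import random


def sample_one_sided(p, q, spread, count, rng=None):
    """Sample points around `p`, keeping only those on the side facing `q`.

    p: 3D face feature point, q: its 3D projection point (both 3-tuples).
    A candidate is kept when (candidate - p) . (q - p) >= 0.
    """
    rng = rng or random.Random()
    toward = [qc - pc for pc, qc in zip(p, q)]
    samples = []
    while len(samples) < count:
        cand = tuple(rng.gauss(c, spread) for c in p)
        offset = [cc - pc for cc, pc in zip(cand, p)]
        if sum(o * t for o, t in zip(offset, toward)) >= 0.0:
            samples.append(cand)
    return samples
```

Roughly half of the candidates are rejected, so the cost per accepted point is about twice that of unrestricted sampling, but every derived shape model built from the accepted points lies on the plausible side.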

The derived face feature point calculation unit 112 may generate derived face feature points for all the three-dimensional projection points, but it need not do so. For a three-dimensional feature point whose positional deviation amount with respect to the corresponding two-dimensional feature point is sufficiently small, a three-dimensional shape model similar to the subject's face image can be created even if that three-dimensional feature point is used as is. The derived face feature point calculation unit 112 therefore need not generate derived face feature points for three-dimensional feature points whose positional deviation amount is less than the predetermined value. As described above, this predetermined value can be set, for example, to the size of the range that is considered to fall on the same pixel of the two-dimensional face image when the three-dimensional shape model is projected onto the two-dimensional face image.
In this case, for example, the derived face feature point calculation unit 112 may generate a predetermined number of derived face feature points on a predetermined grid for each three-dimensional feature point whose positional deviation amount is equal to or greater than the predetermined value. The derived face feature point calculation unit 112 may also make the grid spacing for generating the derived face feature points larger as the positional deviation amount of the three-dimensional feature point is larger. The face image processing apparatus 100 can thereby create a three-dimensional shape model whose feature point arrangement is similar to that of the subject's face while limiting the load of the three-dimensional face image creation processing.
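The grid-based variant can be sketched as follows. The cubic grid, the linear relation between the grid spacing and E_i, and the parameter names (`threshold`, `base_width`, `steps`) are assumptions made for this example; the description only states that the grid is predetermined and that its spacing may grow with the positional deviation amount.

```python
import itertools


def grid_derived_points(p, e_i, threshold, steps=1, base_width=1.0):
    """Derived points on a cubic grid around `p`, or [] below the threshold.

    The grid spacing is base_width * e_i (assumed linear relation), and the
    grid center itself is excluded because it is the original feature point.
    """
    if e_i < threshold:
        return []  # the original 3D feature point is used as is
    width = base_width * e_i
    indices = range(-steps, steps + 1)
    return [
        (p[0] + i * width, p[1] + j * width, p[2] + k * width)
        for i, j, k in itertools.product(indices, repeat=3)
        if (i, j, k) != (0, 0, 0)
    ]
```

With `steps=1` the sketch yields 26 derived points per qualifying feature point, so the number of candidates is fixed in advance and does not depend on random sampling.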

The derived face feature point calculation unit 112 may also generate more derived face feature points as the positional deviation amount of a three-dimensional feature point is larger, and fewer as the positional deviation amount is smaller. The face image processing apparatus 100 thereby creates more derived shape models in which the three-dimensional face feature points are varied in parts whose shape is likely to differ, while limiting the creation of derived shape models in which the three-dimensional face feature points are varied in parts whose shape is likely to be similar. The face image processing apparatus 100 can therefore efficiently create a three-dimensional shape model whose feature point arrangement is similar to that of the subject's face while limiting the load of the three-dimensional face image creation processing.

The derived face feature point calculation unit 112 then creates the derived face feature point sets.
For example, as shown in FIG. 4, the derived face feature point calculation unit 112 creates derived face feature point sets 50, each of which combines, for each facial feature, either the three-dimensional face feature point 21 or 22 of the three-dimensional shape model 20 or one of the derived face feature points 41 or 42 derived from it. For example, when the same number of derived face feature points is generated for each three-dimensional face feature point, the number of derived face feature point sets 50 created equals the number of derived face feature points per three-dimensional face feature point plus 1 (for the original three-dimensional face feature point), raised to the power of the number of three-dimensional face feature points for which derived face feature points were generated. The more derived face feature point sets 50 are created, the longer the three-dimensional face image creation processing takes, but the more likely it is that the positions of the characteristic parts of the created three-dimensional face image resemble those of the subject. Conversely, the fewer derived face feature point sets 50 are created, the less likely it is that the positions of the characteristic parts of the created three-dimensional face image resemble those of the subject, but the shorter the three-dimensional face image creation processing takes. Alternatively, the derived face feature points may be limited to those close to each three-dimensional face feature point or three-dimensional projection point so as to place a fixed upper limit on the number of derived face feature point sets 50 created.
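The combination step can be sketched with a Cartesian product: each facial feature contributes its original three-dimensional face feature point plus any derived points, and one candidate is chosen per feature. The function name, the list-of-lists input, and the `limit` parameter (standing in for the upper limit mentioned above) are assumptions made for illustration.

```python
import itertools


def derived_feature_point_sets(candidates, limit=None):
    """Enumerate derived face feature point sets.

    candidates[i] lists the original 3D point of feature i followed by its
    derived points (just the original if none were generated).
    """
    combos = itertools.product(*candidates)
    if limit is not None:
        combos = itertools.islice(combos, limit)  # cap the number of sets
    return list(combos)


# Two features with 2 derived points each and one feature with none:
# (2 + 1) ** 2 = 9 sets.
sets = derived_feature_point_sets([
    ["eye", "eye_a", "eye_b"],
    ["mouth", "mouth_a", "mouth_b"],
    ["nose"],
])
```

Because the count grows exponentially with the number of varied features, a cap of this kind is one simple way to bound the processing time.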

Returning to the alignment unit 111, for each derived face feature point set acquired from the derived face feature point calculation unit 112, the alignment unit 111 rotates, translates, and enlarges or reduces the three-dimensional face feature points or derived face feature points contained in the set so that the arrangement of the set best matches the arrangement of the corresponding two-dimensional face feature points, and calculates the resulting alignment information. The alignment unit 111 then calculates the alignment error that results when the derived face feature point set is aligned with the two-dimensional face feature points using that alignment information. The alignment unit 111 also outputs, for each derived face feature point set, the positional deviation amounts E_i, the alignment information, and the position information to the derived model creation unit 105. The alignment method and the method of calculating the alignment error in this case are the same as when aligning the three-dimensional face feature points of the three-dimensional shape model with the two-dimensional face feature points, and their description is therefore omitted.

For each derived face feature point set, the derived model creation unit 105 uses the positional deviation amounts E_i, the alignment information, and the position information output from the alignment unit 111 for that set to create a derived shape model by deforming the three-dimensional shape model stored in the storage unit 101 in accordance with the derived face feature point set.
To create a derived shape model, the derived model creation unit 105 first deforms the three-dimensional shape model by moving each three-dimensional face feature point so that it coincides with the corresponding feature point of the derived face feature point set. For example, the derived model creation unit 105 represents the three-dimensional shape model as a three-dimensional mesh composed of the three-dimensional face feature points, and deforms the three-dimensional shape model by transforming that mesh into the three-dimensional mesh composed of the derived face feature point set, using a thin-plate spline, a piecewise affine transformation, or the like.
Next, using the alignment information of the derived face feature point set, the derived model creation unit 105 rotates the deformed three-dimensional shape model about, translates it along, and enlarges or reduces it with respect to each axis of the orthonormal coordinate system (X, Y, Z) of the three-dimensional space so that it faces the same direction as the face image, and takes the result as the derived shape model.

The derived model creation unit 105 creates as many derived shape models as there are derived face feature point sets output from the alignment unit 111. The derived model creation unit 105 then outputs the original three-dimensional shape model stored in the storage unit 101 and the created derived shape models to the light source direction estimation unit 106, the shadow image creation unit 107, and the similar shape selection unit 109.

The light source direction estimation unit 106 estimates the direction of the light source illuminating the face captured in the face image from the face image output from the image input unit 102 and from the original three-dimensional shape model and the derived shape models output from the derived model creation unit 105. Various known methods can be used to estimate the light source direction. For example, the light source direction estimation unit 106 estimates the light source direction from the luminance distribution in the face image by the following method.

First, the face surface is assumed to be a perfectly diffusing (Lambertian) surface, for which the intensity of light diffused by the surface is proportional to the cosine of the angle between the light direction and the surface normal. For example, in a two-dimensional coordinate system whose origin is the upper-left corner of the face image, with the x axis running horizontally from left to right and the y axis running vertically from top to bottom with respect to the face, the luminance E(x,y) at position (x,y) on the face image is considered to be determined by the three-dimensional shape of the face, the direction of the light source, and the reflectance of the face surface, according to equation (8).

In equation (8), ρ(x,y) is the reflectance of the face surface at position (x,y), l₀ is the light source coefficient, and l is the light source direction vector. n(x,y) is the normal vector to the face surface at the point on the original three-dimensional shape model or the derived shape model that corresponds to position (x,y). Alternatively, the normal vector at the corresponding point on the original three-dimensional shape model may be used as an approximation of n(x,y). In that case, a common normal vector can be used for all derived shape models created from the original three-dimensional shape model, which reduces the load of calculating normal vectors.
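Equation (8) itself is not reproduced in this text. Given the Lambertian assumption and the symbol definitions above, it presumably has the following form; this is a reconstruction from the surrounding definitions, not the published equation.

```latex
E(x,y) = \rho(x,y)\, l_0\, \mathbf{l} \cdot \mathbf{n}(x,y) \tag{8}
```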

Here, the skin of the face is assumed to have the same composition regardless of location, so that ρ(x,y) in equation (8) takes a constant value γ. Equation (8) then becomes an equation whose unknowns are the light source coefficient l₀ and the light source direction l. The light source direction estimation means 106 therefore sets up equation (8) at each pixel in the region of the face image corresponding to facial skin, and solves the resulting simultaneous equations to obtain the light source coefficient l₀ and the light source direction l for each derived shape model. The constant γ can be set to the mean, mode, or median of the luminance values in the region of the face image corresponding to facial skin, or to a value near one of these.

When the reflectance ρ(x,y) of the face surface is assumed to be constant, it is preferable to exclude regions that are not skin, such as the eyes, mouth, nostrils, eyebrows, and hair. The light source direction estimation means 106 therefore treats these non-skin regions as a light source estimation mask region and does not use pixels inside the mask region when setting up the simultaneous equations, which allows the light source direction to be estimated with high accuracy.
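The simultaneous-equation approach can be sketched as a linear least-squares problem. This is a minimal sketch, assuming equation (8) has the Lambertian form E = γ l₀ (l · n); the function name `estimate_light`, the array layouts, and the use of least squares are illustrative assumptions, not details given in the description.

```python
import numpy as np

def estimate_light(luminance, normals, gamma, skin_mask):
    """Least-squares estimate of the light coefficient l0 and unit direction l.

    luminance: (H, W) array of E(x, y)
    normals:   (H, W, 3) unit normals n(x, y) taken from the shape model
    gamma:     assumed constant skin reflectance
    skin_mask: (H, W) bool array, True for skin pixels
               (pixels in the light source estimation mask region are False)
    """
    n = normals[skin_mask]                 # (K, 3): one equation per skin pixel
    e = luminance[skin_mask] / gamma       # (K,)
    # Fold the unknowns into s = l0 * l and solve n @ s = e.
    s, *_ = np.linalg.lstsq(n, e, rcond=None)
    l0 = np.linalg.norm(s)
    return l0, s / l0
```

Folding l₀ and l into one vector keeps the system linear; the coefficient is recovered afterwards as the vector's length.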

Alternatively, the light source direction estimation means 106 may prepare in advance standard face images obtained by photographing a model person's face under various light source directions, or equivalent face images obtained by simulation, perform pattern matching between those images and the face image, and estimate the light source direction by determining the best-matching image. Furthermore, when the positional relationship between the illumination light source and the camera that acquired the face image, or the relationship between the direction of the illumination light and the shooting direction of the camera, is known in advance, the light source direction estimation means 106 may determine the light source direction based on that relationship.

The light source direction estimation means 106 then outputs light source direction information indicating the light source direction to the shadow image creation means 107, in association with the corresponding three-dimensional shape model and derived shape models.

The shadow image creation means 107 acquires the light source direction information output from the light source direction estimation means 106 and the three-dimensional shape model and derived shape models output from the derived model creation means 105. The shadow image creation means 107 then renders each derived shape model according to the light source direction indicated in the corresponding light source direction information, and creates a shadow image, which is a two-dimensional projection of the model as illuminated from that direction. For example, the shadow image creation means 107 creates a shadow image by the following procedure.

When a derived shape model is illuminated from the light source direction indicated in the light source direction information, the reflected light intensity at an arbitrary point (X,Y,Z) of that shape model, that is, the luminance E(X,Y,Z), can be expressed by equation (9).

In equation (9), l₀ and l are the light source coefficient and the light source direction vector obtained by the light source direction estimation means 106. γ is the reflectance of the part of the face surface corresponding to skin and has, for example, the same value as that set in the light source direction estimation means 106. n(X,Y,Z) is the normal vector to the face surface at point (X,Y,Z) on the derived shape model. Alternatively, n(X,Y,Z) may be the normal vector to the face surface at point (X,Y,Z) on the original three-dimensional shape model. In that case, a common normal vector can be used for all derived shape models created from that three-dimensional shape model, which reduces the load of calculating normal vectors.

The shadow image creation means 107 obtains the luminance at each point (X,Y,Z) on the derived shape model based on equation (9). It then projects each point (X,Y,Z) on the derived shape model onto the corresponding point (x,y) on a two-dimensional plane. Finally, the shadow image creation means 107 adjusts the luminance values so that the luminance of each projected point (x,y) neither overflows nor underflows, converts them to grayscale, and obtains the shadow image.
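The rendering procedure above can be sketched as follows. This is a minimal sketch, assuming equation (9) has the Lambertian form E = γ l₀ (l · n), an orthographic projection (x, y) = (X, Y), and a camera looking along the Z axis with larger Z nearer the camera. None of these choices, nor the function names, are specified in the description.

```python
import numpy as np

def shade_points(normals, l0, light_dir, gamma):
    """Per-point luminance under the assumed Lambertian form of equation (9)."""
    # Clamping at zero (points facing away from the light stay dark) is an added assumption.
    return gamma * l0 * np.clip(normals @ light_dir, 0.0, None)

def to_grayscale(luminance):
    """Rescale luminance to 0..255 so that no value overflows or underflows."""
    lo, hi = luminance.min(), luminance.max()
    if hi == lo:
        return np.zeros_like(luminance, dtype=np.uint8)
    return np.round(255.0 * (luminance - lo) / (hi - lo)).astype(np.uint8)

def render_shadow_image(points, normals, l0, light_dir, gamma, shape):
    """Project model points (X, Y, Z) to pixels (x, y) and write their gray levels."""
    gray = to_grayscale(shade_points(normals, l0, light_dir, gamma))
    image = np.zeros(shape, dtype=np.uint8)
    depth = np.full(shape, -np.inf)
    for (X, Y, Z), g in zip(points, gray):
        x, y = int(round(X)), int(round(Y))
        inside = 0 <= y < shape[0] and 0 <= x < shape[1]
        if inside and Z > depth[y, x]:
            depth[y, x] = Z        # keep only the point nearest the camera
            image[y, x] = g
    return image
```

A real renderer would rasterize triangles rather than isolated points; the point-wise loop only illustrates the shade, project, and rescale steps.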

Similarly, the shadow image creation means 107 renders the three-dimensional shape model according to the light source direction indicated in the corresponding light source direction information, and creates a shadow image of the model as illuminated from that direction.
The shadow image creation means 107 then outputs the shadow images obtained for the three-dimensional shape model and for each derived shape model to the similarity calculation means 108.

The similarity calculation means 108 calculates, for each shadow image, the similarity between the luminance distribution of the face image output from the image input means 102 and the luminance distribution of the shadow image output from the shadow image creation means 107. To do so, the similarity calculation means 108 converts the RGB luminance values at each point of the face image into grayscale luminance values to create a monochrome face image. It then calculates the reciprocal of the mean squared error between the luminance values of corresponding pixels of the monochrome face image and the shadow image, and uses this value as the similarity.

Instead of the mean squared error of the luminance values, the similarity calculation means 108 may use another index that can evaluate the similarity between the luminance value series of the two images, such as the normalized correlation between the monochrome face image and the shadow image.
The similarity calculation means 108 may also exclude the region corresponding to the light source estimation mask region from the region over which the similarity is calculated. In this case, the similarity does not depend on parts such as the eyes, nose, and mouth, whose luminance differs greatly from that of facial skin, and more accurately reflects the similarity of the facial contour shape and the like.
The similarity calculation means 108 may calculate a single similarity over the entire region representing the face in the face image and the shadow image, or may divide that region into a plurality of regions and calculate a similarity for each region.
The similarity calculation means 108 outputs the similarity calculated for each shadow image to the similar shape selection means 109.
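The two similarity measures described above can be sketched as follows. This is a minimal sketch; the BT.601 luma weights for the grayscale conversion, the `eps` guard against division by zero, and the function names are assumptions, since the description does not fix them.

```python
import numpy as np

def to_mono(rgb):
    """Convert an (H, W, 3) RGB image to grayscale (BT.601 weights, an assumption)."""
    return rgb.astype(np.float64) @ np.array([0.299, 0.587, 0.114])

def mse_similarity(mono, shadow, mask=None, eps=1e-12):
    """Reciprocal of the mean squared luminance error over the unmasked pixels."""
    if mask is None:
        mask = np.ones(mono.shape, dtype=bool)
    diff = mono[mask].astype(np.float64) - shadow[mask].astype(np.float64)
    return 1.0 / (np.mean(diff ** 2) + eps)

def ncc_similarity(mono, shadow, mask=None):
    """Normalized correlation between the two luminance value series."""
    if mask is None:
        mask = np.ones(mono.shape, dtype=bool)
    a = mono[mask].astype(np.float64)
    b = shadow[mask].astype(np.float64)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

Passing the complement of the light source estimation mask region as `mask` restricts either measure to skin pixels. Per-region similarities follow by calling the same functions with one mask per region.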

The similar shape selection means 109 acquires the three-dimensional shape model and derived shape models output from the derived model creation means 105 and the similarities output from the similarity calculation means 108. Referring to the similarity calculated for each shadow image, it selects the three-dimensional shape model or derived shape model corresponding to the shadow image with the highest similarity, and takes it as the most similar face shape model.

The similar shape selection means 109 may instead select the M derived shape models with the highest similarities and average them to obtain the most similar face shape model. The predetermined number M may be a fixed value such as 2 or 3, or a value corresponding to a predetermined proportion (for example, 5% or 10%) of the total number of three-dimensional shape models stored in the storage means 101.
When the similarity calculation means 108 divides the face image and the shadow image into a plurality of regions and calculates a similarity for each region, the similar shape selection means 109 divides the three-dimensional shape model and the derived shape models into corresponding regions. It may then select, for each region, the three-dimensional shape model or derived shape model corresponding to the shadow image with the highest similarity, and combine the selected regions into the most similar face shape model. Further, the similar shape selection means 109 may, for each region, select and average the M three-dimensional shape models or derived shape models with the highest similarities, and combine the averaged regions into the most similar face shape model. By creating the similar face shape model region by region in this way, the face image processing device 100 can create it with higher accuracy.
The similar shape selection means 109 outputs the resulting most similar face shape model to the personal model creation means 110.
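Selecting and averaging the top M candidates can be sketched as follows. This is a minimal sketch, assuming every candidate model is stored as an array of vertices in one-to-one correspondence so that averaging is meaningful; the description does not specify the mesh representation, and `most_similar_model` is a hypothetical name.

```python
import numpy as np

def most_similar_model(models, similarities, m=1):
    """Average the m candidate models with the highest similarity.

    models:       (K, V, 3) array, K candidates with V corresponding vertices each
    similarities: length-K sequence, one similarity per candidate
    """
    order = np.argsort(similarities)[::-1][:m]   # indices of the m best candidates
    return models[order].mean(axis=0)
```

With `m=1` this reduces to picking the single best candidate. For the region-wise variant, the same function would be applied to each region's vertices with that region's similarities.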

The personal model creation means 110 creates a three-dimensional face image of the subject by mapping the face image as a texture image onto the most similar face shape model output from the similar shape selection means 109. That is, using the alignment information, the personal model creation means 110 maps the face image as a texture image onto the most similar face shape model so that each two-dimensional face feature point of the face image coincides with the corresponding three-dimensional face feature point, or the corresponding point of the derived face feature point set, of the most similar face shape model.

As described above, occlusion may hide part of the face, so that information on part of the face is missing from the face image. When the three-dimensional face image onto which the face image is mapped therefore has a missing portion of face information or texture, the personal model creation means 110 performs interpolation (for example, spline interpolation or linear interpolation) using the luminance information of the pixels surrounding the missing portion, either before or after the mapping, to calculate the luminance values of the pixels in the missing portion.
Alternatively, the personal model creation means 110 may exploit the symmetry of the human face and use the luminance value of the pixel at the position symmetric to the missing portion as the luminance value of the missing pixel. For example, the luminance value of the pixel at the position that is line-symmetric to the missing portion about the midline of the face can be used.
Alternatively, the personal model creation means 110 may interpolate the missing portion using the texture image of the corresponding portion of the three-dimensional shape model.
With such interpolation, a three-dimensional face image representing the subject's entire face can be created even if, when the face image was acquired, the subject was facing in a direction that hid part of the face or part of the face was hidden behind an obstruction.
The personal model creation means 110 stores the created three-dimensional face image in the storage means 101, or outputs it to a face authentication device or the like that uses the face image processing device 100 to perform matching.
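The symmetry-based fill can be sketched as follows. This is a minimal sketch, assuming a frontal texture image whose facial midline is the central pixel column; in practice the midline would have to be located from the face feature points. The function name is hypothetical.

```python
import numpy as np

def fill_by_symmetry(texture, missing):
    """Fill missing pixels with the pixel mirrored about the vertical midline.

    texture: (H, W) luminance array
    missing: (H, W) bool array, True where texture information is missing
    """
    filled = texture.copy()
    mirrored = texture[:, ::-1]
    # Only copy where the mirror-image pixel is itself valid.
    usable = missing & ~missing[:, ::-1]
    filled[usable] = mirrored[usable]
    return filled
```

Pixels whose mirror image is also missing are left untouched and would still need one of the other interpolation methods.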

The operation of the three-dimensional face image creation process performed by the face image processing device 100 to which the present invention is applied is described below with reference to the flowchart shown in FIG. 5. The flow described below is controlled by control means (not shown) that runs on a microprocessor of the face image processing device 100 and controls the device as a whole.
First, the face image processing device 100 acquires the subject's face image, which is a two-dimensional image, via the image input means 102 (step S501). Next, the face feature point extraction means 103 extracts two-dimensional face feature points from the acquired face image (step S502).

When a plurality of three-dimensional shape models are stored in the storage means 101, the following steps S503 to S513 are performed for each of them. When the storage means 101 stores only one three-dimensional shape model, representing a standard face, step S514 described later is unnecessary. The alignment means 111 of the alignment information calculation means 104 rotates, translates, or scales the three-dimensional face feature points with respect to the axes of the orthonormal coordinate system (X,Y,Z) so as to minimize the overall positional deviation amount E, which is obtained from the positional deviation amount E_i of each three-dimensional face feature point relative to the corresponding two-dimensional face feature point extracted in step S502. The alignment means 111 then outputs to the derived face feature point calculation means 112 the positional deviation amount E_i of each three-dimensional face feature point at the point where the overall positional deviation amount E is minimized (step S503).
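The alignment step can be illustrated with a deliberately simplified stand-in. The sketch below replaces the rotation, translation, and scaling search described above with an unconstrained affine-camera fit solved by linear least squares, and takes E_i as the Euclidean distance between each projected three-dimensional feature point and its two-dimensional counterpart. Both choices, and the name `align_affine`, are assumptions for illustration rather than the alignment method of the description.

```python
import numpy as np

def align_affine(points3d, points2d):
    """Fit x = [X, Y, Z, 1] @ P by least squares and return per-point deviations.

    points3d: (K, 3) three-dimensional face feature points
    points2d: (K, 2) corresponding two-dimensional face feature points
    Returns the (4, 2) projection matrix P and the (K,) deviations E_i.
    """
    homog = np.hstack([points3d, np.ones((len(points3d), 1))])   # (K, 4)
    P, *_ = np.linalg.lstsq(homog, points2d, rcond=None)
    projected = homog @ P
    e_i = np.linalg.norm(projected - points2d, axis=1)
    return P, e_i
```

Feature points whose E_i stays large after the fit are the ones the base model cannot explain, which is what the derived face feature points in step S504 are meant to address.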

The following steps S504 to S510 are performed for each derived shape model. The derived face feature point calculation means 112 generates derived face feature points by changing the position of each three-dimensional face feature point of the three-dimensional shape model based on its positional deviation amount E_i (step S504). The derived face feature point calculation means 112 then creates a derived face feature point set by combining, for each facial feature point, a point selected from the corresponding three-dimensional face feature point and its derived face feature points. Next, the alignment means 111 aligns the derived face feature point set with the two-dimensional face feature points in the same manner as in step S503, and obtains alignment information for the derived face feature point set (step S505). Next, the derived model creation means 105 deforms the three-dimensional shape model so that its three-dimensional face feature points move to the corresponding points of the derived face feature point set, and further transforms it, using the alignment information of the derived face feature point set, so that it faces the same direction as the face image, thereby creating a derived shape model (step S506).

Next, the light source direction estimation means 106 estimates the light source direction from the face image using the derived shape model, and outputs light source direction information representing the estimated direction (step S507). The shadow image creation means 107 then renders the derived shape model according to the light source direction information, projects it onto a two-dimensional plane, and creates a shadow image (step S508). After that, the similarity calculation means 108 calculates the similarity between the created shadow image and the face image acquired by the image input means 102 (step S509).

It is then determined whether a predetermined number N of derived shape models have been created and their similarities calculated (step S510). The predetermined number N is the number of derived shape models whose similarity to the face image is calculated, that is, the number of derived shape models that the derived model creation means 105 should create; when an upper limit is placed on this number, it can be set to, for example, 500. If N derived shape models have not yet been created and their similarities calculated, steps S504 to S509 are repeated to create a new derived shape model and calculate its similarity.
If, on the other hand, it is determined in step S510 that N derived shape models have been created and their similarities calculated, the light source direction estimation means 106 estimates the light source direction using the three-dimensional shape model and outputs light source direction information representing the estimated direction (step S511). The shadow image creation means 107 then renders the three-dimensional shape model according to the light source direction information, projects it onto a two-dimensional plane, and creates a shadow image (step S512). After that, the similarity calculation means 108 calculates the similarity between the created shadow image and the face image (step S513).
It is then determined whether derived shape models have been created for all three-dimensional shape models and the similarity to the face image has been calculated for each three-dimensional shape model and each derived shape model (step S514). If there remains a three-dimensional shape model for which derived shape models have not yet been created and similarities have not yet been calculated, control returns to step S503 and steps S503 to S513 are repeated.

If, on the other hand, it is determined in step S514 that derived shape models have been created for all three-dimensional shape models and the similarity to the face image has been calculated for each three-dimensional shape model and each derived shape model, the similar shape selection means 109 determines the most similar face shape model, according to those similarities, from among the three-dimensional shape models and derived shape models closest to the face image (step S515). Finally, the personal model creation means 110 creates a three-dimensional face image of the subject by mapping the face image as a texture image onto the most similar face shape model (step S516).
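The nested loop structure of steps S503 to S515 can be sketched as follows. This is a minimal sketch in which `make_derived` and `score` are hypothetical callables standing in for derived model creation (steps S504 to S506) and for light source estimation, rendering, and similarity calculation (steps S507 to S509 and S511 to S513).

```python
def select_best_model(base_models, make_derived, score, n_derived):
    """Score every base model and its derived models; return the best candidate.

    base_models:  iterable of stored three-dimensional shape models
    make_derived: callable (base, k) -> k-th derived shape model (hypothetical)
    score:        callable (model) -> similarity to the face image (hypothetical)
    n_derived:    the predetermined number N of derived models per base model
    """
    best, best_score = None, float("-inf")
    for base in base_models:                       # outer loop, closed by S514
        # Inner loop, closed by S510: create N derived models.
        candidates = [make_derived(base, k) for k in range(n_derived)]
        candidates.append(base)                    # the original model is scored too
        for model in candidates:
            s = score(model)
            if s > best_score:                     # keep the highest similarity
                best, best_score = model, s
    return best, best_score
```

The flowchart scores each derived model as soon as it is created, whereas this sketch builds the candidate list first; the set of scored models is the same.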

As described above, the face image processing device 100 to which the present invention is applied calculates the positional deviation amounts between the three-dimensional face feature points of a three-dimensional shape model stored in advance in the storage means 101 and the two-dimensional face feature points extracted from the two-dimensional face image input through the image input means 102. The face image processing device 100 then generates derived face feature points by changing the positions of the three-dimensional face feature points whose positional deviation amount is equal to or greater than a predetermined threshold. It further generates a derived face feature point set by combining, for each three-dimensional face feature point whose positional deviation amount is equal to or greater than the threshold, a point selected from the original three-dimensional face feature point and its derived face feature points, and, for each three-dimensional face feature point whose positional deviation amount is below the threshold, the original three-dimensional face feature point.
The face image processing device 100 then creates a derived shape model by deforming the three-dimensional shape model so that its three-dimensional face feature points coincide with the derived face feature point set, and creates a three-dimensional face image based on whichever of the derived shape models and the original three-dimensional shape model is most similar to the two-dimensional face image. As a result, without using a special surface measuring device and without collecting face shape data from many people in advance, the face image processing device 100 can create, from a two-dimensional face image of the subject, a three-dimensional face image in which the positions of characteristic parts match the subject's face more closely. This also makes it possible to create a three-dimensional face image that reflects individual face shape characteristics such as the prominence of the brow ridge or cheekbones and the shape of the jaw. In addition, preparing a plurality of three-dimensional shape models allows a wider variety of derived shape models to be created, which raises the likelihood of obtaining a derived shape model whose face feature points match the subject's two-dimensional face image well.
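The threshold rule for building derived face feature point sets can be sketched as follows. This is a minimal sketch, assuming the alternative (derived) positions for each feature point have already been generated; exhaustive enumeration with `itertools.product` is an illustrative choice, and the function name is hypothetical.

```python
from itertools import product

def derived_point_sets(originals, derived, deviations, threshold):
    """Enumerate derived face feature point sets.

    originals:  list of original three-dimensional face feature points
    derived:    list of lists; derived[i] holds the alternatives for point i
    deviations: list of positional deviation amounts E_i
    threshold:  points with E_i >= threshold may be replaced by an alternative
    """
    choices = []
    for point, alternatives, e_i in zip(originals, derived, deviations):
        if e_i >= threshold:
            choices.append([point] + list(alternatives))   # original or any derived
        else:
            choices.append([point])                        # keep the original only
    return [list(combo) for combo in product(*choices)]
```

The first set returned keeps every original point and so corresponds to the undeformed model. The number of sets grows multiplicatively with the number of points above the threshold, which is why a cap such as the predetermined number N is useful.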

Furthermore, the larger the positional deviation amount of a three-dimensional face feature point relative to the corresponding two-dimensional face feature point after alignment, the wider the range over which the face image processing device 100 distributes the derived face feature points for that three-dimensional face feature point. This raises the likelihood of obtaining a derived shape model whose face feature points match the subject's two-dimensional face image well.
Likewise, the larger the positional deviation amount of a three-dimensional face feature point relative to the corresponding two-dimensional face feature point after alignment, the more derived face feature points the face image processing device 100 generates for that three-dimensional face feature point. This limits the number of derived shape models created by varying three-dimensional face feature points in parts that are already likely to resemble the subject's face, and so reduces the load of the three-dimensional face image creation process.

Preferred embodiments of the present invention have been described above, but the present invention is not limited to these embodiments. For example, the light source direction estimation by the light source direction estimation means in steps S507 and S511 may be omitted. In that case, in steps S508 and S512 the shadow image creation means creates two-dimensional images by rendering the derived shape models and the three-dimensional shape model without reference to a light source direction indicated in light source direction information. In steps S509 and S513, the similarity calculation means calculates the similarity between the luminance distributions of those two-dimensional images and the face image. This allows the face image processing device to reduce the load of the three-dimensional face image creation process.

In the derived face feature point generation of step S504, the derived face feature point calculation means may calculate all derived face feature points for the three-dimensional shape model at once. In that case, if the similarity has not yet been calculated for the predetermined number N of derived shape models in step S510, the face image processing device returns control to step S505 and repeats steps S505 to S509. This allows the face image processing device to reduce the load of the three-dimensional face image creation process.

Instead of outputting information on all created derived face feature point sets to the alignment means, the derived face feature point calculation means may output only the information on the derived face feature point set with the smallest positional deviation from the two-dimensional face feature points. In that case, the derived model creation means creates one derived shape model for each three-dimensional shape model. The light source direction estimation means, shadow image creation means, similarity calculation means, and similar shape selection means then perform light source direction estimation, shadow image creation, similarity calculation, and similar shape selection only on that derived shape model and the original three-dimensional shape model. This allows the face image processing device to reduce the load of the three-dimensional face image creation process.

As described above, those skilled in the art can make various modifications within the scope of the present invention to suit the embodiment being implemented.

DESCRIPTION OF SYMBOLS

100 Face image processing apparatus
101 Storage means
102 Image input means
103 Face feature point extraction means
104 Alignment information calculation means
105 Derived model creation means
106 Light source direction estimation means
107 Shadow image creation means
108 Similarity calculation means
109 Similar shape selection means
110 Personal model creation means
111 Alignment means
112 Derived face feature point calculation means

Claims

A face image processing device that generates a 3D face image corresponding to a face of a person by synthesizing a 2D face image including the face of the person and a 3D shape model representing the 3D shape of the face,
Image input means for inputting the two-dimensional face image;
Storage means for storing in advance the three-dimensional shape model and a plurality of three-dimensional face feature points representing facial feature points in the three-dimensional shape model;
Facial feature point extraction means for extracting a plurality of two-dimensional facial feature points representing the facial feature points from the two-dimensional facial image;
Alignment means for performing alignment by calculating a positional deviation amount between each three-dimensional face feature point and the two-dimensional face feature point corresponding to that three-dimensional face feature point;
Derived face feature point calculation means for generating one or more derived face feature points by changing the position of each three-dimensional face feature point whose positional deviation amount is equal to or greater than a predetermined threshold, and for generating a derived face feature point set that forms a face shape by combining, for each such three-dimensional face feature point, a feature point selected from that three-dimensional face feature point and its derived face feature points, and, for each three-dimensional face feature point whose positional deviation amount is less than the predetermined threshold, that three-dimensional face feature point itself;
Derived model creation means for creating a derived shape model by deforming the 3D shape model so that the 3D face feature point of the 3D shape model matches the derived face feature point set;
Similar shape selection means for comparing the derived shape model and the 3D shape model with the 2D face image and selecting the derived shape model most similar to the 2D face image or the 3D shape model;
A personal model creating means for creating the 3D face image corresponding to the face of the person by combining the 2D face image with the selected derived shape model or the 3D shape model;
A face image processing apparatus comprising:

The face image processing apparatus according to claim 1, wherein the derived face feature point calculation means generates the derived face feature points corresponding to a three-dimensional face feature point so that they are distributed over a wider range as the positional deviation amount of that three-dimensional face feature point with respect to the corresponding two-dimensional face feature point increases.

The face image processing apparatus according to claim 1, wherein the derived face feature point calculation means generates a larger number of derived face feature points corresponding to a three-dimensional face feature point as the positional deviation amount of that three-dimensional face feature point with respect to the corresponding two-dimensional face feature point increases.
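For illustration only, the behaviour recited in the two dependent claims above (a wider distribution and a larger number of derived face feature points as the positional deviation amount grows) might be sketched as follows. Everything here is an assumption made for the example and is not taken from the specification: the function name `derived_points`, the parameters `base_count`, `count_per_unit` and `spread_per_unit`, the linear scaling rules, and the placement of candidates on a circle in the x-y plane around the original point.

```python
import math


def derived_points(point_3d, displacement, threshold,
                   base_count=2, count_per_unit=1.0, spread_per_unit=0.5):
    """Generate candidate positions around one 3D face feature point.

    Hypothetical rule: points below the threshold get no derived points;
    otherwise the count grows with the excess over the threshold and the
    spread radius grows with the displacement itself.
    """
    if displacement < threshold:
        return []  # the original 3D face feature point is kept as-is
    excess = displacement - threshold
    count = base_count + int(excess * count_per_unit)
    radius = spread_per_unit * displacement
    x, y, z = point_3d
    # Assumption: candidates lie on a circle in the x-y plane, depth unchanged.
    return [
        (x + radius * math.cos(2 * math.pi * k / count),
         y + radius * math.sin(2 * math.pi * k / count),
         z)
        for k in range(count)
    ]


origin = (0.0, 0.0, 0.0)
small = derived_points(origin, displacement=4.0, threshold=2.0)
large = derived_points(origin, displacement=8.0, threshold=2.0)
```

With these assumed parameters, `large` contains more candidates than `small` and places them farther from the original point, which is the qualitative relationship the claims describe.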