JP2007141107A

JP2007141107A - Image processor and its method

Info

Publication number: JP2007141107A
Application number: JP2005336395A
Authority: JP
Inventors: Satoru Yashiro; 哲八代
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-11-21
Filing date: 2005-11-21
Publication date: 2007-06-07

Abstract

PROBLEM TO BE SOLVED: To provide an image processor, and a method therefor, that achieve shape representation with fewer dimensions for a model image referred to when an image is synthesized.

SOLUTION: A model image, referred to when a two-dimensional image of an object present in an image is synthesized, is synthesized based on synthesis parameters. The synthesis parameters are dimensionally compressed by principal component analysis of information representing feature points of the model image, expressed in local coordinate systems within a plurality of regions (for example, the eye region of a face image) obtained by partitioning the model image. This allows the shape of the model image to be represented with fewer dimensions.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image processing apparatus and method, and more particularly to an image processing apparatus and method that perform recognition of objects in an image.

In recent years, digital cameras, mobile phones, digital video camcorders, and the like have come onto the market at low prices. In addition, with the advent of television tuner cards and of encoder and decoder cards that implement video compression techniques such as MPEG2 and MPEG4 in hardware, the personal computer (PC) now functions as a video recorder. Such a PC can also easily obtain large amounts of digital video and images, for example through video distribution over the Internet.

Thanks to the growing capacity of hard disks and optical disks, even PCs in the popular price range can store large amounts of video and images, and there is a growing need to search the stored content for video segments and images with similar content. In particular, demand is high for finding images that show the same subject.

To search for images of the same subject, it is first necessary to detect the subject, in particular a face region, in the image. Known face detection techniques include the neural-network-based method proposed by Rowley et al. of Carnegie Mellon University, which yields the position, size, rotation angle, and so on of a face in an image.

As a face recognition technique for determining whose face has been detected, there is the face recognition descriptor that forms part of the international standard MPEG7 (Moving Picture Experts Group-7). This descriptor uses luminance information, or information converted into frequency space, taken from the luminance image of a face image obtained by cutting a face region out of an image and normalizing it. That is, principal component analysis and linear discriminant analysis are combined to obtain basis vectors for extracting face image features, and the face features are extracted by linear projection. In this method the face is cut out using both eyes as the reference, and conversion to frequency space tolerates slight positional shifts, but performance degrades because the individual facial parts are not brought into correspondence.

Techniques that deform the face so that the individual facial parts correspond include the method of Patent Document 1 and the AAM (Active Appearance Model) proposed by Cootes et al. of the University of Manchester.

The former sets grid-like reference points on a face image and deforms them while updating a displacement matrix of the reference points by a genetic method, thereby matching a reference image to the input image. The displacement of each reference point serves as the feature of the face image.

The latter prepares a large database of two-dimensional face images, extracts shape information and texture information from the coordinates, luminance values, and so on of a plurality of feature points for each face, and takes the average of the extracted shape information and texture information as the average face. Principal component analysis is then performed on the difference of each face image from the average face to obtain a subspace corresponding to changes in shape and expression. A face image is synthesized by varying the synthesis parameters along each coordinate axis of this subspace. When the model is used for image recognition, images are synthesized while the synthesis parameters are moved within the subspace, and the synthesis parameters of the synthesized image that differs least from the input face image to be recognized are obtained.

The synthesis parameters of a synthesized face image that correlates highly with the input face image are obtained as follows. A matrix that projects the difference information between the input face image and the synthesized face image onto a parameter correction is obtained in advance, and difference evaluation, parameter update, and face resynthesis are repeated. A multi-resolution technique is also used: several resolution levels are prepared, each half the resolution of the previous one, and a projection matrix is obtained for each. The synthesized face image is first fitted to the input image at a coarse resolution, the fitting is repeated while raising the resolution one level at a time, and at the final resolution synthesis parameters that correlate highly with the input image are obtained.

Patent Document 1: JP 2000-113197 A

In the conventional face representation models described above, principal component analysis is performed on the coordinates of facial feature points, or on the coordinates of a grid division, in order to represent the shape of the face. The variation of these physical features contains nonlinear components, which is why the number of dimensions does not decrease even after principal component analysis.

The present invention has been made to solve the above problem, and its object is to provide an image processing apparatus and method that enable shape representation with fewer dimensions in a model image referred to during image synthesis.

As one means of achieving the above object, the image processing method of the present invention comprises the following steps.

That is, it is an image processing method for constructing a model image that is referred to when a two-dimensional image of an object present in an image is synthesized, the method comprising: a setting step of setting synthesis parameters of the model image; and a synthesis step of synthesizing the model image based on the synthesis parameters set in the setting step, wherein the synthesis parameters have been dimensionally compressed by principal component analysis of information indicating feature points of the model image, based on local coordinate systems within a plurality of regions into which the model image is partitioned.

According to the present invention configured as above, the dimensional compression of the model image referred to during image synthesis becomes more effective, and the shape can be represented with fewer dimensions.

Hereinafter, the present invention will be described in detail based on preferred embodiments with reference to the accompanying drawings. The configurations shown in the following embodiments are merely examples, and the present invention is not limited to the illustrated configurations.

<First Embodiment>
● Apparatus Configuration
FIG. 1 is a block diagram showing the configuration of the image processing apparatus according to this embodiment. In this apparatus, reference numeral 201 denotes a CPU that executes instructions according to programs stored in the ROM 202 and the RAM 203. 202 denotes a ROM that stores the program implementing the operation of this embodiment as well as other programs and data needed for control, and 203 denotes a RAM used as a work area for temporarily storing data. 204 denotes a drive I/F that provides an interface, such as IDE or SCSI, to external storage devices, and 205 denotes an HDD serving as a storage device for images, feature values for image search, programs, and the like. 206 denotes an image input unit that receives images from devices such as digital cameras and scanners, and 208 denotes an operation input unit, such as a keyboard or mouse, that receives input from the operator. 209 denotes a display unit such as a cathode ray tube or liquid crystal display, and 210 denotes a network I/F, such as a modem or LAN adapter, for connecting to networks such as the Internet or an intranet. 211 denotes a bus that connects the above components and allows data to pass among them.

The image processing apparatus of this embodiment is implemented as an application running on a PC on which Microsoft WINDOWS (registered trademark) XP (registered trademark) is installed as the operating system.

● Face Model
FIG. 2 outlines the face model of this embodiment. As shown in the figure, the face model consists of shape information 301, texture information 302, constraint condition information 303, and a parameter update table 304. The shape information 301 consists of the mean of the shape vectors obtained from a large number of face images and the shape basis vectors of the subspace obtained by principal component analysis of the displacements from that mean. The texture information 302 consists of the mean luminance of the images obtained from a large number of face images and the texture basis vectors of the subspace obtained by principal component analysis of the displacements from that mean. The constraint condition information 303 gives the maximum and minimum values, that is, the permissible range, for each dimension of the shape vector. The parameter update table 304 holds one table for each of the resolutions 256×256, 128×128, 64×64, 32×32, 16×16, and 8×8.

The construction procedure for each part of the face model is described below. Because face images are processed statistically, a large number of images containing faces is prepared in advance for constructing the face model of this embodiment.

・Facial feature points
First, the method of setting facial feature points in this embodiment, which is needed to construct the shape information 301, is described with reference to FIGS. 3A to 3F.

FIGS. 3A to 3F show an example of shape information for a face looking straight ahead. In this embodiment, a local coordinate system is defined for each part of the face, the range that feature points can take within that local coordinate system is restricted, and the facial feature points are represented by a shape vector made up of information from which the coordinates can be determined under these restrictions.

FIG. 3A shows an example of the face coordinates. The face coordinate system is a plane whose X axis is the line connecting the centers of both eyes and whose Y axis is the line perpendicular to it at the midpoint between the eyes; the distance from the origin to the center of an eye is one unit.
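The face coordinate system of FIG. 3A can be illustrated with a minimal sketch. The function name `to_face_coords`, the choice of which eye defines the positive X direction, and the orientation of the Y axis (the X axis rotated by 90 degrees in image coordinates) are assumptions made here for illustration; the source only fixes the axes and the unit length.

```python
import math

def to_face_coords(point, eye_a, eye_b):
    """Map an image-space point into the face coordinate system.

    eye_a, eye_b: (x, y) centers of the two eyes in image coordinates.
    The origin is the midpoint between the eyes, the X axis points from
    the origin toward eye_b (an assumption), and one unit is the
    distance from the origin to an eye center.
    """
    ox = (eye_a[0] + eye_b[0]) / 2.0
    oy = (eye_a[1] + eye_b[1]) / 2.0
    ux, uy = eye_b[0] - ox, eye_b[1] - oy
    unit = math.hypot(ux, uy)
    if unit == 0:
        raise ValueError("eye centers must be distinct")
    ux, uy = ux / unit, uy / unit
    # Y axis: the X axis rotated by 90 degrees (orientation is assumed).
    vx, vy = -uy, ux
    dx, dy = point[0] - ox, point[1] - oy
    # Project onto both axes and rescale so an eye center sits at |x| = 1.
    return ((dx * ux + dy * uy) / unit, (dx * vx + dy * vy) / unit)
```

With this convention the two eye centers map to (-1, 0) and (1, 0) regardless of the scale or in-plane rotation of the face in the image.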

FIG. 3B shows the eye region in detail. The center of an eye is the midpoint of the line segment connecting the inner and outer corners of the eye. For the left eye, the length of this segment is L1, the angle it makes with the line connecting the centers of both eyes is A1, and the lengths from the points dividing the segment into four equal parts, along perpendiculars to the segment, to their intersections with the eyelids are L2 to L7. The same number of parameters is set for the right eye: L8 to L14, corresponding to L1 to L7 in FIG. 3B, and A2, corresponding to A1.

FIG. 3C shows the area around the nose in detail. The lowest point of the nose, P1 in the face coordinate system, is taken as the origin; the line from it to the midpoint of the segment connecting both eyes is the Y axis, and the X axis is orthogonal to the Y axis at the origin. This defines a nose coordinate system in which the midpoint of the segment connecting both eyes lies at Y = 1. In the nose coordinate system, the nose feature point P2 is expressed directly and P3 is expressed relative to P2. The end of the nose on the right as viewed is P4, expressed in coordinates relative to P3. If nostrils are visible, P2 and P3 are placed at the ends of the nostril; if not, they are placed along the edge of the nose at intervals that make P1-P2, P2-P3, and P3-P4 equal in length. P5 to P7 are set in the same way for the right-side portion of the nose. In addition, P8 and P9 are set on the nostril edges when nostrils are visible; otherwise they are set on the nose edge at the points bisecting the span between P2 and P3.

FIG. 3D shows the area around the eyebrows in detail. A3 is the angle between the line connecting the inner end of the eyebrow to the center of the eye below it and the line connecting the centers of both eyes. A4 is the angle between the line connecting the inner end of the eyebrow to the center of the eye below it and the line connecting the outer end of the eyebrow to the center of that eye. Three lines dividing the angle A4 into four equal parts are drawn, and their intersections with the eyebrow edge are taken as feature points. The lengths L8 to L12 from the feature points to the eye center and the eyebrow widths L13 to L15 are set. A5, A6, and L16 to L23 are set in the same way for the right eyebrow. This representation of the eyebrows is equivalent to coordinates in a polar coordinate system.

FIG. 3E shows the mouth region in detail. The nose coordinate system is used for the mouth region. On the Y axis of the nose coordinate system, the top of the upper lip is P11, the bottom of the upper lip P12 is expressed relative to P11, the top of the lower lip P13 relative to P12, and the bottom of the lower lip P14 relative to P13. The end of the lips on the right as viewed is P15, and P16 to P18 are placed so as to divide the lip edge from P11 to P15 into four equal parts. Similarly, P19 to P21 are set between P12 and P15, and P22 to P27 are set in the same way for the lower lip. The coordinates of P16 to P27 are each expressed relative to the neighboring feature point on the left. P28 to P40 are set in the same way for the left side of the lips as viewed.

FIG. 3F shows the contour of the face. The face contour is expressed in a form equivalent to a polar coordinate system that shares its origin with the nose coordinate system. In polar coordinates centered on P1 of the nose coordinate system (FIG. 3C), the direction of P11 is taken as angle 0, the range from -5/8π to 5/8π radians is divided equally in steps of 1/8, and the distances from the center (P1) to the face contour are denoted L28 to L33.

The facial feature points of this embodiment are set as described above.

The method of constructing each component of the face model shown in FIG. 2 is described in detail below.

・Shape information construction procedure
To construct the shape information 301, the shape vector x is defined as follows.

$$x = \{A_1, \ldots, A_6, L_1, \ldots, L_{33}, P_{1x}, P_{1y}, P_{2x}, P_{2y}, \ldots, P_{40x}, P_{40y}\}$$

In addition, the feature point coordinate vector s, consisting of the feature points on the face coordinate axes, is defined as follows.

$$s = \{x_1, x_2, \ldots, y_1, y_2, \ldots\}$$

The procedure for constructing the shape information 301 is described below with reference to the flowchart of FIG. 4. First, in step S501, a target image containing a face is displayed on the display unit 209. Next, in step S502, the coordinates of both eyes of the face in the image are specified through the input unit 208. The coordinate system used here is an XY coordinate system whose origin is the upper left corner of the image. Next, in step S503, an affine transformation is applied and the face image is cut out so that the eye coordinates specified in step S502 come to predetermined positions.
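As a rough illustration of step S503, the sketch below derives a 2×3 affine matrix that moves the two specified eye positions onto predetermined targets. It assumes the transform is restricted to rotation, uniform scaling, and translation, which the source does not state; the function names and the target coordinates used in the usage note are hypothetical.

```python
def eye_alignment_transform(src_a, src_b, dst_a, dst_b):
    """Return a 2x3 matrix mapping src_a -> dst_a and src_b -> dst_b.

    Points are (x, y) tuples. Treating each point as a complex number z,
    a rotation + uniform scale + translation is z' = a * z + b.
    """
    s1, s2 = complex(*src_a), complex(*src_b)
    d1, d2 = complex(*dst_a), complex(*dst_b)
    if s1 == s2:
        raise ValueError("source eye positions must be distinct")
    a = (d2 - d1) / (s2 - s1)   # rotation and scale
    b = d1 - a * s1             # translation
    return [[a.real, -a.imag, b.real],
            [a.imag,  a.real, b.imag]]

def apply_transform(matrix, point):
    """Apply a 2x3 affine matrix to one (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
```

For example, `eye_alignment_transform((100, 120), (160, 100), (96, 112), (160, 112))` gives a matrix that levels a tilted eye line; the same matrix would then be used to resample the whole face region.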

Next, in step S504, the coordinates of each feature point on the face image are specified through the input unit 208. The coordinates of the specified feature points are then converted into the local coordinate systems described above and thus into shape information. Input errors can be reduced by displaying the image enlarged and, for feature points that must be set at equal angles or intervals, by displaying guidelines each time or fine-tuning the positions by calculation.

When all feature points have been specified for one face image, the process proceeds to step S505, where the coordinates of both eyes on the image and the shape vector are recorded in the HDD 205 in association with the image file name.

Next, in step S506, it is determined whether feature points have been specified for all face images. If any face image remains unprocessed, the process returns to step S501; if all are done, the process proceeds to step S507.

In step S507, the average coordinates of each feature point (the average shape) are computed over all face images specified in the preceding steps. Next, in step S508, a difference shape vector is generated for each sample from its difference from the average shape. Then, in step S509, principal component analysis is performed on the shape vectors of all samples. This yields a basis matrix for a subspace made up of principal components taken in descending order of contribution rate until the cumulative contribution rate reaches a predetermined ratio. This basis matrix constitutes the shape basis vectors.
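Steps S507 to S509 can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the function name `build_shape_basis` is hypothetical, the 0.95 default stands in for the unspecified "predetermined ratio", and the principal axes are obtained through a singular value decomposition of the centered data, which is one common way to carry out principal component analysis.

```python
import numpy as np

def build_shape_basis(shapes, target_ratio=0.95):
    """Return (mean_shape, basis) from an (n_samples, n_dims) array.

    basis holds one principal axis per row, in descending order of
    contribution rate, keeping just enough axes for the cumulative
    contribution rate to reach target_ratio (an assumed value).
    """
    shapes = np.asarray(shapes, dtype=float)
    mean_shape = shapes.mean(axis=0)            # S507: average shape
    diffs = shapes - mean_shape                 # S508: difference vectors
    # S509: PCA via SVD; rows of vt are the principal axes.
    _, singular, vt = np.linalg.svd(diffs, full_matrices=False)
    variances = singular ** 2
    total = variances.sum()
    if total == 0:
        raise ValueError("all shape vectors are identical")
    cumulative = np.cumsum(variances) / total
    n_keep = int(np.searchsorted(cumulative, target_ratio)) + 1
    n_keep = min(n_keep, len(cumulative))
    return mean_shape, vt[:n_keep]
```

A shape vector x is then compressed to the parameters `basis @ (x - mean_shape)` and approximately restored by `mean_shape + basis.T @ params`.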

Then, in step S510, the obtained average shape and shape basis vectors are recorded in the HDD 205 as the shape information 301.

・Texture information construction procedure
The procedure for constructing the texture information 302 is described below with reference to the flowchart of FIG. 5.

First, in step S601, the coordinates of both eyes and the shape vector recorded in the HDD 205 in step S505 for one face image are read out, and the face image is cut out by affine transformation.

Next, in step S602, morphing is performed so that the feature points move to their average positions. The morphing is expressed as follows.

$$I_m = M(I, S_s, S_d) \quad (1)$$

In equation (1), $I_m$ is the morphed synthesized face image, $I$ is the cut-out face image, $S_s$ is the source feature point coordinate sequence, and $S_d$ is the destination feature point coordinate sequence. The feature point coordinate sequences are coordinates on the cut-out face image and can be determined uniquely from the shape vector.

Through this morphing, the parts of different faces come to the same positions in the face coordinate system, so that correspondence between them is ensured.

Next, in step S603, masking is performed. That is, the area outside the face contour of the average face shape, as given by the average feature point coordinate sequence, is masked so that regions other than the face are excluded from all subsequent processing.

Next, in step S604, the luminance distribution is normalized: the mean luminance and the luminance variance of the image are adjusted to predetermined values. Histogram equalization is also performed to flatten the luminance distribution.
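The mean and variance adjustment of step S604 might look like the sketch below. The target values 128 and 40, the function name, and the convention that masked pixels are set to 0 are all assumptions for illustration; the source only says the values are predetermined. The histogram equalization part of the step is not shown.

```python
import numpy as np

def normalize_luminance(image, mask, target_mean=128.0, target_std=40.0):
    """Rescale unmasked pixels to a fixed mean and standard deviation.

    image: 2-D array of luminance values.
    mask:  2-D boolean array, True for pixels inside the face region.
    Pixels outside the mask are returned as 0 (an assumed convention).
    """
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    face = image[mask]
    std = face.std()
    if std == 0:
        raise ValueError("face region has constant luminance")
    out = np.zeros_like(image)
    # Standardize, then map to the target mean and standard deviation.
    out[mask] = (face - face.mean()) / std * target_std + target_mean
    return out
```

Computing the statistics only over unmasked pixels keeps the background, which step S603 excluded, from influencing the normalization.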

Next, in step S605, it is determined whether the morphing, masking, and normalization described above have been completed for all face images. If any face image remains unprocessed, the process returns to step S601; if all are done, the process proceeds to step S606.

In step S606, the average texture is obtained by computing, for each pixel, the mean luminance over all face samples.

Next, in step S607, the difference from the average luminance obtained in the preceding step is taken for each face sample, producing a luminance difference vector whose dimension equals the number of unmasked pixels. Then, in step S608, principal component analysis is performed on the luminance difference vectors of all face samples. This yields a basis matrix for a subspace made up of principal components taken in descending order of contribution rate until the cumulative contribution rate reaches a predetermined ratio. This basis matrix constitutes the texture basis vectors.

Then, in step S609, the obtained average texture and texture basis vectors are recorded in the HDD 205 as the texture information 302.

・Constraint condition information construction procedure
The procedure for constructing the constraint condition information 303 is described below.

The constraint condition, a characteristic feature of this embodiment, is set in order to reduce the probability of synthesizing a face that is unlikely to exist. In this embodiment, a range of permissible values is set for each dimension of the shape vector x. Whether a value falls outside this range determines whether the face is judged unlikely to exist, and an out-of-range value is clipped back into the range to correct the shape vector. Specifically, the judgment and correction follow equation (2).

$$
f(x_i) =
\begin{cases}
x^{av}_i - k\sigma_i & \text{if } x_i \le x^{av}_i - k\sigma_i \\
x^{av}_i + k\sigma_i & \text{if } x_i \ge x^{av}_i + k\sigma_i \\
x_i & \text{otherwise}
\end{cases}
\quad (2)
$$

In equation (2), $x^{av}_i$ is the sample mean in the i-th dimension and $\sigma_i$ is the sample standard deviation in the i-th dimension. k is a predetermined constant that acts as the threshold on the probability of existence. The distribution of values in each dimension of the feature vectors of the sample face images is approximated by a normal distribution; for example, k = 3 covers 99.7% of the distribution.
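Equation (2) amounts to a per-dimension clip. The sketch below computes the limits from sample shape vectors and applies them; the function names are hypothetical, and the use of the population standard deviation (NumPy's default) is an assumption, since the source does not say which estimator is used.

```python
import numpy as np

def constraint_limits(samples, k=3.0):
    """Return (lower, upper) = mean -/+ k * sigma for each dimension.

    samples: (n_samples, n_dims) array of shape vectors.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    sigma = samples.std(axis=0)   # population standard deviation (assumed)
    return mean - k * sigma, mean + k * sigma

def apply_constraint(x, lower, upper):
    """Clip each dimension of shape vector x into [lower, upper], as in (2)."""
    return np.clip(np.asarray(x, dtype=float), lower, upper)
```

The two arrays returned by `constraint_limits` correspond to the minimum and maximum shape values that the embodiment stores as the constraint condition information 303.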

That is, in this embodiment, $x^{av}_i - k\sigma_i$ from equation (2) is stored in the HDD 205 as the constraint condition information 303 indicating the minimum shape value. Likewise, $x^{av}_i + k\sigma_i$ is stored in the HDD 205 as the constraint condition information 303 indicating the maximum shape value.

・Parameter update table construction
The procedure for constructing the parameter update table 304 is described below.

The parameter update table 304 is, in substance, a matrix that projects the difference information between the input face image and the synthesized face image onto a correction of the synthesis parameters for the next face image to be synthesized, so that the difference is reduced. This matrix is obtained for the face image at each of the resolutions 256×256, 128×128, 64×64, 32×32, 16×16, and 8×8 pixels.

The procedure for constructing the parameter update table 304 at one particular resolution is described below with reference to the flowchart of FIG. 6.

First, in step S701, the first synthesis parameters are set. Here the synthesis parameters of a face image are set at random as the first synthesis parameters. However, the upper limit of each parameter's displacement from the average model must be determined experimentally in advance from the probability of successful fitting.

Next, in step S702, the second synthesis parameters are set. The second synthesis parameters are the first synthesis parameters with one element changed in the positive or negative direction. The amount of change is determined from the variance of the data for that parameter element, but it must be found experimentally.

次にステップＳ７０３に進み、第１および第２合成パラメータに基づいて２つの顔画像を合成する。このとき、本実施形態の特徴である拘束条件が適用されるが、その詳細については後述する。 In step S703, two face images are synthesized based on the first and second synthesis parameters. At this time, the constraint condition that is a feature of the present embodiment is applied, and details thereof will be described later.

次にステップＳ７０４に進み、２つの合成画像の差分ベクトルを求める。この差分ベクトルは、画素数分の次元で、各画素の輝度差を値とするベクトルである。合成された顔の輪郭と平均の顔とでは顔の輪郭が異なり、画素数も異なる。ここでは、差分ベクトルを集めて行列として扱うために、解像度毎に画素数を一定にする必要がある。そこで、上記ステップＳ６０３と同じ形状のマスキングをかけることによって、平均顔の占める領域の画素により差分ベクトルを構成する。 In step S704, a difference vector between the two composite images is obtained. This difference vector is a vector having a dimension corresponding to the number of pixels and a luminance difference of each pixel as a value. The synthesized face outline and the average face have different face outlines and different numbers of pixels. Here, in order to collect the difference vectors and handle them as a matrix, it is necessary to make the number of pixels constant for each resolution. Therefore, by applying masking of the same shape as in step S603, a difference vector is formed by pixels in the area occupied by the average face.

次にステップＳ７０５に進み、所定回数のループを行ったか否かを判定する。ループ回数は最低でも顔の差分ベクトルの次元数分必要である。さらにループする必要がある場合はステップＳ７０１に戻るが、ループが所定回数に達した場合にはステップＳ７０６に進む。 Next, in step S705, it is determined whether the loop has been executed a predetermined number of times. The loop must be executed at least as many times as the number of dimensions of the face difference vector. If further iterations are needed, the process returns to step S701; once the predetermined count is reached, the process proceeds to step S706.

ステップＳ７０６では、多変数線形回帰を求める。具体的には、以下に示す行列式(3)となるような射影行列Ａを求める。 In step S706, a multivariate linear regression is computed. Specifically, a projection matrix A that satisfies the matrix equation (3) shown below is obtained.

$\Delta C = A\,\Delta I$ ・・・(3)
なお、式(3)において、ΔＣは第１合成パラメータと第２合成パラメータの差の組、ΔＩは対応する合成画像の差分ベクトルの組である。 In Equation (3), $\Delta C$ is the set of differences between the first and second synthesis parameters, and $\Delta I$ is the set of difference vectors of the corresponding synthesized images.

射影行列Ａは、以下に示す行列式(4)によって導き出すことができる。 The projection matrix A can be derived from the matrix equation (4) shown below.

$A = \Delta C\,\Delta I^{+}$ ・・・(4)
なお、式(4)において、ΔＩ⁺はΔＩの擬似逆行列である。 In Equation (4), $\Delta I^{+}$ is the pseudo-inverse of $\Delta I$.
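Equations (3) and (4) can be read as a least-squares fit. The following is a sketch under stated assumptions: N sample pairs are collected in steps S701 to S705, each parameter difference has p elements, and each difference vector has d elements. The symbols N, p, and d are introduced here for illustration and do not appear in the source.

```latex
\Delta C \in \mathbb{R}^{p \times N}, \qquad
\Delta I \in \mathbb{R}^{d \times N}, \qquad
A \in \mathbb{R}^{p \times d}

% Equation (4) gives a minimizer of the regression residual of Equation (3):
A = \Delta C\,\Delta I^{+}
  \in \arg\min_{A'} \bigl\lVert \Delta C - A'\,\Delta I \bigr\rVert_F^{2}

% When \Delta I has full row rank (which requires N \ge d):
\Delta I^{+} = \Delta I^{\top}\bigl(\Delta I\,\Delta I^{\top}\bigr)^{-1}
```

The full-row-rank condition is consistent with the remark in step S705 that the loop count must be at least the number of dimensions of the difference vector.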

次にステップＳ７０７に進み、求められた射影行列Ａをパラメータ更新テーブル３０４として、ＨＤＤ２０５に記録する。 In step S707, the obtained projection matrix A is recorded in the HDD 205 as the parameter update table 304.

・顔画像の合成方法 Face Image Synthesis Method
以下、上記ステップＳ７０３における、合成パラメータを使用した顔画像の合成方法について図７のフローチャートを用いて説明する。 The method of synthesizing a face image from the synthesis parameters in step S703 is described below with reference to the flowchart of FIG. 7.

まずステップＳ８０１において、以下の(5)式に示すように、平均テクスチャと主成分毎の強さである合成パラメータの線形和によって、テクスチャＧを作成する。 First, in step S801, as shown in the following equation (5), a texture G is created by a linear sum of an average texture and a synthesis parameter that is the strength of each principal component.

$G = g^{av} + P_g^{+}\,b_g$ ・・・(5)
式(5)において、ｇ^avは平均テクスチャ、Ｐ⁺ _gはテクスチャの基底ベクトルの擬似逆行列であり、事前に求めておいてよい。ｂ_gは輝度の合成パラメータである。 In Equation (5), $g^{av}$ is the average texture, and $P_g^{+}$ is the pseudo-inverse of the matrix of texture basis vectors, which may be computed in advance. $b_g$ is the luminance synthesis parameter.

次にステップＳ８０２に進み、平均形状ベクトルと主成分ごとの強さである合成パラメータの線形和によって、形状ベクトルＸを作成する。 In step S802, a shape vector X is created by a linear sum of an average shape vector and a synthesis parameter that is the strength of each principal component.

$X = x^{av} + P_x^{+}\,b_x$ ・・・(6)
式(6)において、ｘ^avは平均形状、Ｐ⁺ _xは形状の基底ベクトルの擬似逆行列であり、事前に求めておいてよい。ｂ_xは形状の合成パラメータである。 In Equation (6), $x^{av}$ is the average shape, and $P_x^{+}$ is the pseudo-inverse of the matrix of shape basis vectors, which may be computed in advance. $b_x$ is the shape synthesis parameter.
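Equations (5) and (6) share the same form: a mean vector plus a matrix applied to the synthesis parameters. A minimal sketch of that linear synthesis follows. The function name `synthesize_linear` and the plain-list matrix layout are assumptions made for illustration; `basis` stands in for the precomputed matrix that the text writes as P⁺.

```python
def synthesize_linear(mean, basis, params):
    """Return mean + basis * params, as in Equations (5) and (6).

    mean   : list of d floats (average texture or average shape)
    basis  : d rows of k floats (stand-in for the precomputed matrix P+)
    params : list of k floats (synthesis parameters b)
    """
    if len(basis) != len(mean):
        raise ValueError("basis must have one row per element of mean")
    if any(len(row) != len(params) for row in basis):
        raise ValueError("each basis row must have one entry per parameter")
    # Each output element is the mean plus a weighted sum of the parameters.
    return [m + sum(w * b for w, b in zip(row, params))
            for m, row in zip(mean, basis)]
```

With all parameters set to zero the function returns the mean itself, which corresponds to the average model used as the starting point of the search.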

次にステップＳ８０３に進み、合成された顔画像が実在する可能性のある顔を示しているか否かを判定する。ここではすなわち式(7)に示すように、合成された形状ベクトルＸの各次元の値が所定の範囲内にあるか否かを判定する。このとき、上述した拘束条件情報３０３（この場合、kσ_i）が参照される。 In step S803, it is determined whether or not the synthesized face image indicates a face that may actually exist. Here, as shown in Equation (7), it is determined whether or not the values of the dimensions of the combined shape vector X are within a predetermined range. At this time, the above-described constraint condition information 303 (in this case, kσ _i ) is referred to.

$|X_i - x^{av}_i| \le k\sigma_i$ ・・・(7)
ここで、当該顔画像は実在する可能性が無いものであると判定された場合はステップＳ８０４に進み、実在し得る場合はステップＳ８０５に進む。 If it is determined that the face image cannot actually exist, the process proceeds to step S804; if it can exist, the process proceeds to step S805.
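The range check of Equation (7) can be sketched as below, together with one plausible form of the correction applied when the check fails. Equation (2) itself lies outside this section, so the clamping in `clamp_shape` is an assumption: it limits each dimension to the stored minimum and maximum values $x^{av}_i \pm k\sigma_i$. Both function names are illustrative.

```python
def within_constraints(shape, mean, sigma, k):
    """Equation (7): every dimension lies within k standard deviations of the mean."""
    return all(abs(x - m) <= k * s for x, m, s in zip(shape, mean, sigma))


def clamp_shape(shape, mean, sigma, k):
    """Assumed form of the correction: clamp each dimension to
    [mean - k*sigma, mean + k*sigma], the stored shape minimum and maximum."""
    return [min(max(x, m - k * s), m + k * s)
            for x, m, s in zip(shape, mean, sigma)]
```

Dimensions that already satisfy the constraint pass through unchanged, so the corrected shape stays as close to the synthesized one as the limits allow.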

ステップＳ８０４では、上記式(2)を適用して、合成された顔に近く、実在する可能性のある顔形状ベクトルＸ'に変換する。 In step S804, Equation (2) above is applied to convert X into a face shape vector X′ that is close to the synthesized face and can actually exist.

$X' \leftarrow f(X)$ ・・・(8)
ステップＳ８０５では、式(9)に基づいて形状ベクトルＸを顔座標上での特徴点座標に変換する。 In step S805, the shape vector X is converted into feature point coordinates in the face coordinate system based on Equation (9).

$S = S(X')$ ・・・(9)
また、平均形状ベクトルから平均座標値への変換も同様に行うが、これは式(10)に示すように、平均形状ベクトルｘ^avに基づいて事前に求めておけばよい。 The average shape vector is converted into average coordinate values in the same way; as shown in Equation (10), this can be computed in advance from the average shape vector $x^{av}$.

$s^{av} = S(x^{av})$ ・・・(10)
次にステップＳ８０６において、式(11)に示すように平均的な顔形状から合成された形状へのモーフィングを行なうことによって、合成顔画像Ｉ_mが得られる。 Next, in step S806, a synthesized face image $I_m$ is obtained by morphing from the average face shape to the synthesized shape, as shown in Equation (11).

$I_m = M(G,\, s^{av},\, S)$ ・・・(11)
なお、ここでは説明を分かりやすくするために、形状とテクスチャの成分を分けて主成分分析を行う例を示したが、これらをまとめて主成分分析を行なって合成パラメータとした方が、次元数が削減できるという効果がある。ただし、ダイナミックレンジが大きく異なる次元が混在している場合には、計算誤差の影響が大きくなるため、ダイナミックレンジをそろえるためのスケーリング調整を行うことが好ましい。 For ease of explanation, an example has been shown in which principal component analysis is performed separately on the shape and texture components; performing principal component analysis on them jointly to obtain the synthesis parameters has the advantage of reducing the number of dimensions. However, when dimensions with greatly different dynamic ranges are mixed, the influence of calculation errors increases, so it is preferable to apply scaling that aligns the dynamic ranges.

●画像検索システム Image Search System
以上、本実施形態における顔モデルの構築方法について説明した。以下、該顔モデルを用いた、本実施形態における画像処理の概要について説明する。 The method of constructing the face model in this embodiment has been described above. An outline of the image processing in this embodiment using that face model is given below.

本実施形態では、画像中の顔領域に着目し、与えられたクエリ画像に対して類似した顔が映っている画像を類似画像として出力する画像検索システムを例として説明する。この場合の画像処理としては、検索に必要な画像から特徴量を抽出して記録する画像登録処理と、検索条件が与えられたときに特徴量の照合を行って最も類似した画像を獲得する画像検索処理から構成される。以下、この画像登録処理および画像検索処理のそれぞれについて説明する。 In this embodiment, an image search system that focuses on a face area in an image and outputs an image showing a similar face to a given query image as a similar image will be described as an example. As image processing in this case, an image registration process for extracting and recording feature amounts from images necessary for search, and an image for obtaining the most similar image by comparing feature amounts when a search condition is given Consists of search processing. Hereinafter, each of the image registration process and the image search process will be described.

・画像登録処理 Image Registration Processing
本実施形態における画像登録処理を、図８のフローチャートを用いて説明する。 The image registration processing in this embodiment is described with reference to the flowchart of FIG. 8.

まずステップＳ９０１にて画像入力部２０６から画像を入力する。次にステップＳ９０２に進み、入力された画像から顔領域を検出する。この検出処理としては、上述したＲｏｗｌｅｙ氏らによる手法を適用する。次にステップＳ９０３に進み、得られる両目の座標から顔の大きさや傾きを補正し、ステップＳ５０３と同様に顔画像の切り出しを行なう。次にステップＳ９０４に進み、輝度の正規化を行なう。すなわち、輝度分布が一定になるようにヒストグラムの均一化などの処理を行なう。 First, in step S901, an image is input from the image input unit 206. Next, in step S902, a face area is detected in the input image, using the method of Rowley et al. described above. In step S903, the size and tilt of the face are corrected from the coordinates of both eyes, and the face image is cropped in the same manner as in step S503. In step S904, the luminance is normalized; that is, processing such as histogram equalization is applied so that the luminance distribution is consistent.
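Step S904 names histogram equalization as one possible normalization. A minimal sketch of the standard cumulative-histogram method follows, for a flat list of integer luminance values. The function name and the `levels` parameter are illustrative, and the source does not say which variant is actually used.

```python
def equalize_histogram(pixels, levels=256):
    """Map integer luminance values in [0, levels) so that their cumulative
    distribution becomes approximately linear."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution: cdf[v] = number of pixels with value <= v.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: nothing to spread out.
        return list(pixels)
    scale = (levels - 1) / (n - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]
```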

次にステップＳ９０５に進み、特徴抽出を行なう。ここでは入力された顔画像に最も相関の高い合成顔モデルの合成パラメータを決定し、これを特徴量とする。この特徴抽出処理の詳細については後述する。 In step S905, feature extraction is performed. Here, a composite parameter of a composite face model having the highest correlation with the input face image is determined, and this is used as a feature amount. Details of this feature extraction processing will be described later.

次にステップＳ９０６に進み、特徴抽出処理で得られた合成パラメータをＨＤＤ２０５に記録する。次にステップＳ９０７に進み、画像入力部２０６において未処理の画像が残っていないか判定し、残っていなければステップＳ９０８に進むが、処理対象の画像が存在する場合はステップＳ９０１に戻る。 In step S906, the composite parameter obtained by the feature extraction process is recorded in the HDD 205. In step S907, the image input unit 206 determines whether an unprocessed image remains. If not, the process proceeds to step S908. If an image to be processed exists, the process returns to step S901.

ステップＳ９０８では、検索時の特徴量間の距離計算を行うために、線形判別分析を行って、判別用の空間の基底ベクトルを求める。線形判別分析は、特徴量のクラス分けを行ない、同一クラス内の特徴量間の距離とクラス間の距離の比が最大になるような判別空間の基底ベクトルを求めるものである。システムの目的に応じて、類似している顔とは同一人物の顔であるとか、類似している顔とは表情が似た顔であるなど、類似している顔を定義し、その定義に従ってクラス分けを行った上で線形判別分析を行なうことで、目的にあった機能が実現される。 In step S908, linear discriminant analysis is performed in order to calculate the distance between feature quantities at the time of search, and a basis vector of the discriminant space is obtained. The linear discriminant analysis classifies feature quantities and obtains a basis vector of a discriminant space that maximizes the ratio of the distance between feature quantities in the same class and the distance between classes. Depending on the purpose of the system, define a similar face, such as a similar face is a face of the same person, or a similar face is a face with a similar expression. By performing linear discriminant analysis after classifying, functions that meet the purpose are realized.
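The criterion described in step S908 is commonly written as the Fisher criterion. The form below is the standard textbook formulation, offered as an assumption about what the linear discriminant analysis optimizes; the symbols $S_W$, $S_B$, $\mu_c$, $\mu$, and $N_c$ are introduced here and do not appear in the source. Each $b$ is a feature vector (a set of synthesis parameters) belonging to class $c$.

```latex
S_W = \sum_{c} \sum_{b \in c} (b - \mu_c)(b - \mu_c)^{\top}, \qquad
S_B = \sum_{c} N_c\,(\mu_c - \mu)(\mu_c - \mu)^{\top}

W^{*} = \arg\max_{W} \frac{\bigl| W^{\top} S_B\, W \bigr|}{\bigl| W^{\top} S_W\, W \bigr|}
```

The columns of $W^{*}$ are the basis vectors of the discriminant space; how the classes are defined (same person, similar expression, and so on) determines what "similar" means at search time.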

判別空間への射影方法としては例えば、下記文献に記載されているような、非線形空間へ射影する技術も知られている。この詳細は下記文献に記載されているため、ここでは詳細な説明を省略する。 As a method for projecting to the discriminant space, for example, a technique for projecting to a non-linear space as described in the following document is also known. Since this detail is described in the following document, detailed description is omitted here.

J. Lu, K. Plateniotis, and A. Venetsanopoulos, "Face recognition using kernel direct discriminant analysis algorithms", IEEE Transactions on Neural Networks, 14(1), Jan. 2003
本実施形態では、画像入力部２０６から入力された画像を対象に登録処理を行う例を示したが、本発明はこれに限るものではない。例えば、登録対象となる画像を所定のフォルダに格納し、そのフォルダを指定することによって、該フォルダ下にある画像ファイルを対象として登録処理を行っても良い。この場合、登録処理のプロセスとしては、対象フォルダが新しく追加された場合は全てのファイルを対象に処理するが、その後は、更新されたファイルのみを対象にすればよい。また、バックグラウンドプロセスとして起動しておき、対象フォルダの更新を監視しておくことも可能である。 In the present embodiment, an example has been shown in which registration processing is performed on images input from the image input unit 206, but the present invention is not limited to this. For example, the images to be registered may be stored in a predetermined folder, and registration may be performed on the image files under that folder by designating it. In this case, all files are processed when the target folder is newly added, and thereafter only updated files need to be processed. The registration process may also be started as a background process that monitors the target folder for updates.

・特徴抽出処理 Feature Extraction Processing
以下、ステップＳ９０５に示した特徴抽出処理、すなわち、入力された顔画像に最も相関の高い合成顔モデルの合成パラメータを決定する処理について、図９のフローチャートを用いて詳細に説明する。 The feature extraction processing of step S905, that is, the processing that determines the synthesis parameters of the synthesized face model most highly correlated with the input face image, is described in detail below with reference to the flowchart of FIG. 9.

まずステップＳ１００１において、初期解像度を設定する。ここでは、解像度を８×８ピクセルに設定している。次にステップＳ１００２でループカウンタｎを１に初期化し、ステップＳ１００３で合成パラメータを初期設定する。初回では、平均的な形状とテクスチャとなるように、合成パラメータが設定される。そしてステップＳ１００４で、合成パラメータから上述した方法により顔画像を合成する。 First, in step S1001, an initial resolution is set. Here, the resolution is set to 8 × 8 pixels. In step S1002, the loop counter n is initialized to 1, and in step S1003, the synthesis parameter is initialized. At the first time, the synthesis parameters are set so as to obtain an average shape and texture. In step S1004, a face image is synthesized from the synthesis parameters by the method described above.

次にステップＳ１００５に進み、ループカウンタｎが２以上であるか否かを判定する。２回目以上でない場合はステップＳ１００７に進み、２回目以上であればステップＳ１００６に進む。 In step S1005, it is determined whether the loop counter n is 2 or more. If it is not the second time or more, the process proceeds to step S1007. If it is the second or more time, the process proceeds to step S1006.

ステップＳ１００６では、合成した顔が収束しているか否かを判定する。これは前回の合成顔からの差分が閾値以下であるか否かによって判定される。差分は各画素の輝度差の２乗和とし、これと解像度別に予め設定した所定の閾値とを比較して判定する。差分が閾値以下であれば収束していると判断され、この場合は近似失敗として処理を終了する。一方、差分が閾値よりも大きければ収束していないと判断され、ステップＳ１００７へ進む。 In step S1006, it is determined whether or not the synthesized face has converged. This is determined by whether or not the difference from the previous combined face is equal to or less than a threshold value. The difference is determined as a sum of squares of the luminance difference of each pixel, and this is compared with a predetermined threshold set in advance for each resolution. If the difference is equal to or smaller than the threshold, it is determined that the difference has converged. In this case, the processing is terminated as an approximation failure. On the other hand, if the difference is larger than the threshold value, it is determined that it has not converged, and the process proceeds to step S1007.

ステップＳ１００７では、入力画像の顔画像と合成された顔画像の差分ベクトルを求める。ここで、差分ベクトルは、入力画像と合成画像の対応する画素の輝度差であるが、ステップＳ７０７で記録したパラメータ更新テーブルを構築する際に求めた差分ベクトルの対応する解像度と同じ次元数すなわち画素数のものが必要である。そこで、差分画像は以下の式(12)に示すように、平均の顔形状にモーフィングしてステップＳ６０３と同じマスキングを行った画像として得る。 In step S1007, a difference vector between the face image of the input image and the synthesized face image is obtained. The difference vector consists of the luminance differences between corresponding pixels of the input image and the synthesized image, but it must have the same number of dimensions, that is, the same number of pixels, as the difference vectors obtained at the corresponding resolution when the parameter update table recorded in step S707 was constructed. Therefore, as shown in Equation (12) below, the difference image is obtained as an image morphed to the average face shape and masked in the same way as in step S603.

$\Delta I_{rn} = M(I_{rs} - I_{rn},\, S_{rn},\, s^{av})$ ・・・(12)
式(12)において、Ｉ_rsは解像度ｒでの入力画像の顔画像、Ｉ_rnは解像度ｒ，ループカウンタｎでの合成された顔画像である。また、Ｓ_rnはループカウンタｎにおける顔特徴座標ベクトル、ｓ^avは平均座標値ベクトルである。 In Equation (12), $I_{rs}$ is the face image of the input image at resolution r, and $I_{rn}$ is the face image synthesized at resolution r and loop counter n. $S_{rn}$ is the face feature coordinate vector at loop counter n, and $s^{av}$ is the average coordinate value vector.

次にステップＳ１００８に進み、以下の式(13)によって差分Ｅが所定値以下であるかを判定する。なお、差分Ｅは差分ベクトルのＬ２ノルム（ユークリッド距離）である。 In step S1008, it is determined whether the difference E is equal to or less than a predetermined value using the following equation (13). The difference E is the L2 norm (Euclidean distance) of the difference vector.

$E = \lVert \Delta I_{rn} \rVert_2$ ・・・(13)
差分Ｅが所定の値以下であれば、入力画像に対して合成画像の近似が成功しているとしてステップＳ１０１２に進むが、そうでない場合はステップＳ１００９に進む。 If the difference E is less than or equal to a predetermined value, the synthesized image is regarded as a successful approximation of the input image and the process proceeds to step S1012; otherwise, the process proceeds to step S1009.

ステップＳ１００９では、ループカウンタｎを評価し、所定回数（Ｌ回）の繰り返しがなされていたら近似失敗として終了する。所定回数以下の繰り返しであればステップＳ１０１０に進む。 In step S1009, the loop counter n is evaluated, and if it has been repeated a predetermined number of times (L times), it ends as an approximation failure. If it is repeated a predetermined number of times or less, the process proceeds to step S1010.

ステップＳ１０１０では、以下の式(14)によって差分から合成パラメータ変更量への射影を行うことによって、合成パラメータを更新する。 In step S1010, the composite parameter is updated by projecting the difference to the composite parameter change amount according to the following equation (14).

$b_{r(n+1)} = b_{rn} + A_r\,\Delta I_{rn}$ ・・・(14)
式(14)において、ｂ_rnは、解像度ｒ，ループカウンタｎの時の合成パラメータである。 In Equation (14), $b_{rn}$ is the synthesis parameter at resolution r and loop counter n.

その後、ステップＳ１０１１でループカウンタｎをインクリメントして、ステップＳ１００４に戻る。 Thereafter, the loop counter n is incremented in step S1011 and the process returns to step S1004.

一方、ステップＳ１００８で合成画像の近似が成功したと判定された場合にはステップＳ１０１２において、最終解像度に達しているか否かを判定する。ここで最終解像度は２５６×２５６ピクセルである。最終解像度であれば近似が成功したとして最終的な合成パラメータを特徴量として出力するが、最終解像度でなければステップＳ１０１３に進み、解像度を２倍に上げてステップＳ１００２に戻る。 On the other hand, if it is determined in step S1008 that the approximation of the composite image has succeeded, it is determined in step S1012 whether the final resolution has been reached. Here, the final resolution is 256 × 256 pixels. If it is the final resolution, the final synthesis parameter is output as a feature value if the approximation is successful. If it is not the final resolution, the process proceeds to step S1013, the resolution is doubled, and the process returns to step S1002.
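The control flow of steps S1001 to S1013 can be sketched as a coarse-to-fine loop. This is a minimal sketch, not the patented implementation: `synthesize`, `difference`, and `update_tables` are hypothetical stand-ins for the face synthesis of FIG. 7, the morphed and masked difference of Equation (12), and the per-resolution matrices $A_r$. It also assumes that the parameters found at one resolution are carried over as the starting point for the next.

```python
import math


def fit_parameters(synthesize, difference, update_tables, init_params,
                   resolutions=(8, 16, 32, 64, 128, 256),
                   error_tol=1e-6, stall_tol=1e-12, max_iters=20):
    """Coarse-to-fine parameter search; returns the parameters, or None on failure."""
    params = list(init_params)                 # S1003: start from the mean model
    for r in resolutions:                      # S1001 / S1013: 8x8 up to 256x256
        n, previous = 1, None                  # S1002
        while True:
            image = synthesize(params, r)      # S1004
            if n >= 2:                         # S1005
                change = sum((a - b) ** 2 for a, b in zip(image, previous))
                if change <= stall_tol:        # S1006: image stopped changing
                    return None
            delta = difference(image, r)       # S1007, Equation (12)
            error = math.sqrt(sum(d * d for d in delta))  # Equation (13)
            if error <= error_tol:             # S1008: approximation succeeded
                break                          # S1012: next resolution, or done
            if n >= max_iters:                 # S1009: iteration budget L used up
                return None
            table = update_tables[r]           # projection matrix A_r
            params = [b + sum(a * d for a, d in zip(row, delta))
                      for b, row in zip(params, table)]   # S1010, Equation (14)
            previous = image
            n += 1                             # S1011
    return params
```

Returning `None` for both failure paths keeps the sketch short; a caller that wants to retry with new initial parameters can distinguish the two cases by returning a reason code instead.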

本実施形態の特徴抽出処理においては、拘束条件情報３０３を課したために合成パラメータを変更しても合成した顔画像に変化がない場合が発生する。そこで、合成画像を評価して極所解での収束状態を検出し、収束状態から抜け出せるようにしている。 In the feature extraction processing of this embodiment, because the constraint condition information 303 is imposed, changing the synthesis parameters sometimes leaves the synthesized face image unchanged. The synthesized image is therefore evaluated to detect convergence at a local solution, so that the process can escape from that state.

なお、近似処理が失敗した場合にはステップＳ１００３での合成パラメータの設定値を新たな値に設定してリトライを行っても良い。新たな値の設定方法としては、簡単にはランダムに行なうことである。また、探索途中に得られる入力画像と相関の高い他のパラメータ候補や、遺伝的手法によって、変更パラメータでビット列を作って遺伝子とみなし、入力画像と相関の高かったパラメータのビット列を部分的に切り貼りしても良い。また、近似に残された時間や、リトライの数に応じて、ランダムに生成された合成パラメータと平均顔画像または前回の変更パラメータからの標準化ユークリッド距離での制限範囲を設定し、制限範囲内でランダムな合成パラメータに設定しても良い。 If the approximation process fails, retry may be performed by setting the composite parameter setting value in step S1003 to a new value. As a new value setting method, it is simply performed at random. In addition, other parameter candidates that are highly correlated with the input image obtained during the search, or by using genetic methods, a bit string is created with the changed parameters and regarded as a gene, and the bit string of the parameter highly correlated with the input image is partially cut and pasted You may do it. Also, depending on the time remaining in the approximation and the number of retries, a limit range is set for the standardized Euclidean distance from the randomly generated synthesis parameter and the average face image or the previous change parameter. Random synthesis parameters may be set.

また、複数の顔モデルを用意しておき、ある顔モデルでの近似が失敗した場合には、他のモデルでのリトライを行っても良い。 Also, a plurality of face models may be prepared, and when approximation with a certain face model fails, retry with another model may be performed.

・画像検索処理 Image Search Processing
以上のように抽出された特徴量すなわち合成パラメータに基づき、近似される合成画像を顔モデルから検索する。以下、本実施形態における画像検索処理を、図１０のフローチャートを用いて説明する。 Based on the feature amount extracted as described above, that is, the synthesis parameters, a synthesized image that approximates the input is searched for from the face model. The image search processing in this embodiment is described below with reference to the flowchart of FIG. 10.

まずステップＳ１１０１で、画像入力部２０６からクエリ画像を入力する。もしくはＨＤＤ２０５内にある画像ファイルを指定してもよい。次にステップＳ１１０２に進み、当該クエリ画像には特徴量が存在するか否かを判定する。すなわち、すでに登録処理を済ませた画像であるか否かを判定する。特徴量が存在すればステップＳ１１０８に進む。 In step S1101, a query image is input from the image input unit 206. Alternatively, an image file in the HDD 205 may be designated. Next, proceeding to step S1102, it is determined whether or not there is a feature amount in the query image. That is, it is determined whether or not the image has already undergone registration processing. If there is a feature amount, the process proceeds to step S1108.

一方、ステップＳ１１０２でクリエ画像に特徴量が存在しない、すなわち未登録画像であればステップＳ１１０３に進み、ステップＳ１１０７までの処理を行う。ここで、ステップＳ１１０３からステップＳ１１０６の処理については、上述した図８に示す画像登録処理におけるステップＳ９０２からステップＳ９０５と同様の処理をクエリ画像に対してかけるものである。したがって、ここでの詳細な説明は省略する。そして、ステップＳ１１０７において、ステップＳ１１０６で抽出した特徴量を判別空間へ線形射影する。 On the other hand, if no feature amount exists for the query image in step S1102, that is, if the image is unregistered, the process proceeds to step S1103 and the processing up to step S1107 is performed. Steps S1103 to S1106 apply to the query image the same processing as steps S902 to S905 of the image registration processing shown in FIG. 8, so a detailed description is omitted here. Then, in step S1107, the feature amount extracted in step S1106 is linearly projected onto the discriminant space.

そしてステップＳ１１０８およびステップＳ１１０９において、実質的な検索処理を行う。すなわち、まずステップＳ１１０８で判別空間上において、クエリ画像と検索対象の顔画像の特徴ベクトルについて、その差分ベクトルのＬ２ノルムを求め、これを距離とする。次にステップＳ１１０９において画像出力を行なう。例えば、検索対象画像をクエリ画像と距離の近い順に並べ替え、その縮小画像を一覧表示する。 In step S1108 and step S1109, substantial search processing is performed. That is, first in step S1108, the L2 norm of the difference vector is obtained for the feature vector of the query image and the face image to be searched in the discrimination space, and this is used as the distance. In step S1109, image output is performed. For example, the search target images are rearranged in order of distance from the query image, and the reduced images are displayed as a list.
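Steps S1108 and S1109 amount to a nearest-neighbour ranking by Euclidean distance. A minimal sketch follows, assuming the feature vectors have already been projected onto the discriminant space; the function name and the dictionary layout of `gallery` are illustrative.

```python
import math


def rank_by_distance(query, gallery):
    """Return (name, distance) pairs sorted from nearest to farthest.

    query   : feature vector of the query image in the discriminant space
    gallery : dict mapping an image name to its feature vector
    """
    # S1108: the L2 norm of the difference vector is used as the distance.
    scored = [(name, math.dist(query, vec)) for name, vec in gallery.items()]
    # S1109: order the search targets by increasing distance for display.
    return sorted(scored, key=lambda item: item[1])
```

For example, `rank_by_distance([0, 0], {"a": [3, 4], "b": [1, 0]})` places `"b"` first because it lies closer to the query.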

以上説明したように本実施形態によれば、顔形状を示すモデル画像を、顔の器官ごとのローカル座標系の組み合わせによって表現された顔特徴量として扱う。これを主成分分析することによって、寄与率が主成分側に集約され、次元圧縮効果が高くなる。したがって、特徴量記憶コストや特徴の照合コスト、パラメータ更新テーブルのサイズや学習コスト等、システム全体におけるコストダウンが実現される。 As described above, according to the present embodiment, a model image indicating a face shape is treated as a face feature amount expressed by a combination of local coordinate systems for each facial organ. By performing this principal component analysis, the contribution rate is concentrated on the principal component side, and the dimensional compression effect is enhanced. Therefore, cost reduction in the entire system such as feature amount storage cost, feature matching cost, parameter update table size and learning cost can be realized.

＜第２実施形態＞ <Second Embodiment>
以下、本発明に係る第２実施形態について説明する。 A second embodiment according to the present invention is described below.

上述した第１実施形態では、形状情報の多くをローカル座標系での相対的な座標などで表現する例を示したが、第２実施形態においては、これを従来のように各特徴点の顔座標系での座標値とする例を示す。 In the first embodiment described above, an example in which much of the shape information is expressed by relative coordinates or the like in the local coordinate system has been shown. In the second embodiment, this is represented by the face of each feature point as in the past. An example of coordinate values in the coordinate system is shown.

第２実施形態における顔の特徴点の設定例を図１１に示す。ここで拘束条件を各特徴値のとりうる範囲とした場合、これはすなわち、Ｘ座標、Ｙ座標の最大最小値を定義することであり、特徴点を含む座標軸と平行な辺を含む短形の領域を示すことになる。そのため、評価条件としては緩いものとなってしまう。そこで第２実施形態においては、より効果的な評価条件を与える。 FIG. 11 shows an example of how facial feature points are set in the second embodiment. If the constraint condition were the range each feature value can take, this would amount to defining the maximum and minimum X and Y coordinates, that is, a rectangular region that contains the feature point and whose sides are parallel to the coordinate axes. Such an evaluation condition is loose. The second embodiment therefore provides a more effective evaluation condition.

第２実施形態における、形状情報を座標値とした場合の拘束条件情報の構築手順を、図１２のフローチャートを用いて説明する。 The procedure for constructing the constraint condition information in the second embodiment, where the shape information consists of coordinate values, is described with reference to the flowchart of FIG. 12.

まずステップＳ１３０１において、顔の各特徴点について、すべての顔サンプルの対応する特徴点集合の凸包領域を求める。次にステップＳ１２０２進み、顔の各特徴点について平均座標を中心として、対応する凸包領域を所定倍率拡大する。次にステップＳ１２０３に進み、凸包領域を顔の各特徴点と対応付けてＨＤＤ２０５に記録する。 First, in step S1301, for each feature point of the face, the convex hull region of the corresponding set of feature points over all face samples is obtained. Next, in step S1202, for each feature point of the face, the corresponding convex hull region is enlarged by a predetermined magnification about the average coordinates. Then, in step S1203, each convex hull region is recorded in the HDD 205 in association with its feature point.

すると第２実施形態では、顔画像の合成を上述した第１実施形態と同様に図７に示す手順により行うが、ステップＳ８０３では合成される顔の各特徴点の座標が凸包領域内であるかを判定することによって、顔でない可能性を判定する。またステップＳ８０４では各特徴点の座標が凸包領域外に存在する場合、この点と平均的な特徴位置を結ぶ線と凸領域境界の交点の座標に変換する処理をすべての次元で行なうことによって、実在可能性のある顔に変換すればよい。 In the second embodiment, the face image is synthesized by the procedure shown in FIG. 7, as in the first embodiment. In step S803, however, whether the synthesized shape may not be a face is determined by checking whether the coordinates of each feature point of the synthesized face lie inside the corresponding convex hull region. In step S804, when the coordinates of a feature point lie outside its convex hull region, they are replaced by the coordinates of the intersection between the boundary of the convex region and the line connecting that point to the average feature position; applying this to every dimension converts the shape into a face that can actually exist.
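The convex-hull constraint of the second embodiment can be sketched for a single feature point as follows. This is an illustrative sketch under assumptions: the hull is built with the monotone chain algorithm (the source does not name one), all function names are hypothetical, and `center` is the average position of the feature point, assumed to lie inside the region.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain hull, returned counter-clockwise without repeats."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def scale_about(polygon, center, factor):
    """Enlarge the region by `factor` about the average position."""
    cx, cy = center
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in polygon]


def contains(polygon, p):
    """Inside-or-on-boundary test for a counter-clockwise convex polygon."""
    n = len(polygon)
    return all(cross(polygon[i], polygon[(i + 1) % n], p) >= 0 for i in range(n))


def pull_inside(polygon, center, p):
    """Return p if it is inside; otherwise the point where the segment from
    center to p crosses the polygon boundary."""
    if contains(polygon, p):
        return p
    dx, dy = p[0] - center[0], p[1] - center[1]
    best, n = 1.0, len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        ex, ey = b[0] - a[0], b[1] - a[1]
        denom = dx * ey - dy * ex
        if denom == 0:
            continue  # edge is parallel to the segment
        t = ((a[0] - center[0]) * ey - (a[1] - center[1]) * ex) / denom
        u = ((a[0] - center[0]) * dy - (a[1] - center[1]) * dx) / denom
        if 0 <= t <= 1 and 0 <= u <= 1:
            best = min(best, t)
    return (center[0] + best * dx, center[1] + best * dy)
```

Approximating each region by a rectangle or an ellipse would replace `contains` and `pull_inside` with closed-form tests at lower cost.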

なお、第２実施形態で示した凸領域は頂点数が不定の多角形であるが、これを四角形や楕円で近似すれば、処理コストを低減することも可能である。 The convex region in the second embodiment is a polygon with an indefinite number of vertices, but approximating it by a quadrilateral or an ellipse can reduce the processing cost.

以上説明したように第２実施形態によれば、形状情報を顔座標系での座標値として扱っても、第１実施形態と同様の効果が得られる。 As described above, according to the second embodiment, even if the shape information is handled as coordinate values in the face coordinate system, the same effect as in the first embodiment can be obtained.

＜第３実施形態＞ <Third Embodiment>
以下、本発明に係る第３実施形態について説明する。 A third embodiment according to the present invention is described below.

図１３は、第３実施形態における画像処理装置の構成を示すブロック図である。同図において、上述した第１実施形態に示す図１と共通する構成には同一番号を付し、説明を省略する。 FIG. 13 is a block diagram showing the configuration of the image processing apparatus in the third embodiment. In this figure, components common to FIG. 1 of the first embodiment are given the same reference numbers, and their description is omitted.

図１３によれば、第３実施形態においては本発明の特徴的なプログラムを記録したＤＶＤまたはＣＤのような光ディスク２１２およびそのインタフェースを構成として加えたことを特徴とする。すなわち、ドライブインタフェース２０４にＣＤ／ＤＶＤドライブなどの外部記憶読書装置２１３が接続されている。 According to FIG. 13, the third embodiment is characterized in that an optical disk 212 such as a DVD or CD in which a characteristic program of the present invention is recorded and its interface are added as components. That is, an external storage reading / writing device 213 such as a CD / DVD drive is connected to the drive interface 204.

本発明の特徴的なプログラムを記録した光ディスク２１２を外部記憶読書装置２１３に挿入すると、ＣＰＵ２０１が光ディスク２１２から当該プログラムを読み取ってＲＡＭ２０３に展開する。これにより、上述した第１および第２実施形態と同様の処理を実現することができる。 When the optical disk 212 on which the characteristic program of the present invention is recorded is inserted into the external storage reading / writing device 213, the CPU 201 reads the program from the optical disk 212 and develops it in the RAM 203. Thereby, the process similar to 1st and 2nd embodiment mentioned above is realizable.

＜その他の実施形態＞ <Other Embodiments>
本発明は上述した第１乃至第３実施形態に限定されず、その主旨を逸脱しない範囲で種々の変形が可能である。以下、各種変形例を挙げる。 The present invention is not limited to the first to third embodiments described above, and various modifications are possible without departing from its gist. Several modifications are listed below.

・奥行き情報 Depth Information
各特徴点に奥行き情報を持たせても良い。この場合、形状情報の次元が特徴点数分増えることになるが、これは顔特徴を表す有意な情報であり、顔の向きの影響とは独立なモデルとなる。奥行き情報を持った特徴点により、物体を３次元上で仮想的に合成し、さらに視点や照明方法のパラメータにより画像化した場合は本発明の主旨の範囲内である。奥行き情報は、たとえば間隔を空けた２台の入力装置による三角法や、レーザー測距計によって取得可能である。 Each feature point may carry depth information. In this case the dimension of the shape information increases by the number of feature points, but this is significant information representing facial features, and the model becomes independent of the influence of face orientation. Virtually synthesizing an object in three dimensions from feature points with depth information, and then rendering it as an image using viewpoint and illumination parameters, is within the scope of the gist of the present invention. The depth information can be acquired by, for example, triangulation using two input devices spaced apart, or a laser rangefinder.

・パラメータ更新テーブル Parameter Update Table
パラメータ更新テーブルの構築方法は第１実施形態に示した方法に限定されず、様々な変形が考えられる。例えば第１実施形態においては、ステップＳ７０１でランダムな合成パラメータを設定したが、これをサンプル顔画像から射影した合成パラメータとしてもよい。 The method of constructing the parameter update table is not limited to the one shown in the first embodiment, and various modifications are conceivable. For example, in the first embodiment random synthesis parameters are set in step S701, but synthesis parameters projected from sample face images may be used instead.

また、顔全体の多少の移動や拡大率、アスペクト比、回転などに対応可能とするために、合成パラメータの要素としてＸＹ移動量や拡大率、アスペクト比、回転などの新たな次元を追加しても良い。このとき特徴点として、対象となる物体における少なくとも２つの特徴点を基準にして拡大縮小、回転による正規化を行った画像上の座標情報を用いればよい。 In addition, new dimensions such as XY movement amount, enlargement ratio, aspect ratio, rotation, etc. are added as elements of the synthesis parameter in order to be able to cope with some movement, enlargement ratio, aspect ratio, rotation, etc. of the entire face. Also good. At this time, coordinate information on an image that has been normalized by enlargement / reduction and rotation based on at least two feature points in the target object may be used as the feature points.

また、パラメータ更新テーブルは、第１合成パラメータの設定方法によって近似成功確率が変わってくる。そこで、近似成功確率を評価関数として、評価値の高いパラメータ更新テーブルを求めた時の第１合成パラメータの組を遺伝的手法などを用いて交配し、近似成功確率を高めていく手法をとることも有効である。 Also, the parameter update table changes the approximate success probability depending on the first synthesis parameter setting method. Therefore, using the approximate success probability as an evaluation function, a method of increasing the approximate success probability by mating the first synthetic parameter set when a parameter update table with a high evaluation value is obtained using a genetic method or the like. Is also effective.

・制限時間 Time Limit
上述した第１実施形態においては、図９のステップＳ１００９でループカウンタｎのみを評価していたが、それに加え、制限時間内か否かの評価を加えてもよい。すなわち、制限時間を越えていたら近似失敗として終了することが考えられる。 In the first embodiment described above, only the loop counter n is evaluated in step S1009 of FIG. 9, but a check of whether the time limit has been exceeded may be added. That is, the processing may be terminated as an approximation failure when the time limit is exceeded.

・差分ベクトル Difference Vector
上述した第１実施形態では、合成した顔画像と入力画像中の顔画像の差分ベクトルを、平均顔形状の占める領域の各画素の輝度差として説明したが、本発明はそれに限定されるものではない。例えば、ＦＦＴやＤＣＴなどの周波数変換を行ない、各周波数成分における強度、位相情報を用いても良いし、オプティカルフローの手法を用いてもよい。また、平均顔形状の占める領域のすべての画素を利用することも限定しないが、一部の画素を利用することで入力画像の物体の一部が隠れていた場合のロバスト性が向上する。 In the first embodiment described above, the difference vector between the synthesized face image and the face image in the input image is the luminance difference of each pixel in the area occupied by the average face shape, but the present invention is not limited to this. For example, a frequency transform such as FFT or DCT may be applied and the intensity and phase information of each frequency component used, or an optical flow method may be used. Nor is it required to use all the pixels in the area occupied by the average face shape; using only some of the pixels improves robustness when part of the object in the input image is hidden.

Alternatively, the image may be convolved with a plurality of Gabor filters and the resulting intensity images used. Equation (15) below gives the Gabor filter G(x, y).

$$G(x,y) = \exp\left[-\pi\left\{\frac{(x-x_0)^2}{A} + \frac{(y-y_0)^2}{B}\right\}\right]\cdot\exp\left[2\pi i\left\{u(x-x_0)+v(y-y_0)\right\}\right] \qquad (15)$$

Here, $i$ is the imaginary unit, $x = 0, \dots, s-1$, $y = 0, \dots, s-1$, $x_0 = s/2$, $y_0 = s/2$, $A$ is the horizontal influence range, and $B$ is the vertical influence range. $\tan^{-1}(u/v)$ is the wave direction and $(u^2+v^2)^{1/2}$ is the frequency. $s$ is the height and width of the filter, which is square here.

Forty filters are generated in total: 8 wave directions × 5 frequencies, with one influence range (horizontal and vertical) per frequency.
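The filter of equation (15) and the bank of 40 filters can be sketched as follows. The description does not give concrete frequencies, influence ranges, or the spacing of the 8 directions, so the values used here (directions spaced evenly over half a turn, the lists `frequencies` and `extents`) are assumptions made only to produce a runnable example.

```python
import cmath
import math

def gabor_filter(s, A, B, u, v):
    """Complex s-by-s Gabor filter following equation (15), indexed as g[y][x]."""
    x0 = y0 = s / 2
    return [
        [
            math.exp(-math.pi * ((x - x0) ** 2 / A + (y - y0) ** 2 / B))
            * cmath.exp(2j * math.pi * (u * (x - x0) + v * (y - y0)))
            for x in range(s)
        ]
        for y in range(s)
    ]

def gabor_bank(s, frequencies, extents, n_directions=8):
    """Build len(frequencies) * n_directions filters, one (A, B) pair per frequency."""
    bank = []
    for f, (A, B) in zip(frequencies, extents):
        for k in range(n_directions):
            # Assumption: directions are spaced evenly over [0, pi).
            theta = math.pi * k / n_directions
            # tan^-1(u / v) is the direction and sqrt(u^2 + v^2) the frequency.
            u, v = f * math.sin(theta), f * math.cos(theta)
            bank.append(gabor_filter(s, A, B, u, v))
    return bank

# Illustrative values only; the description does not specify them.
frequencies = [0.05, 0.10, 0.15, 0.20, 0.25]
extents = [(64.0, 64.0), (32.0, 32.0), (21.0, 21.0), (16.0, 16.0), (12.0, 12.0)]
bank = gabor_bank(16, frequencies, extents)
print(len(bank))  # 5 frequencies x 8 directions
```

The real part of each entry corresponds to the real filter and the imaginary part to the imaginary filter referred to in the text.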

The filter output value at an arbitrary position in the image is computed by convolving the filter with the image. A Gabor filter has a real filter and an imaginary filter (the imaginary filter being the real filter shifted in phase by half a wavelength), so the filter output value is taken from the sum of the squares of the two responses. If the convolution of the real filter with the image is Rc and that of the imaginary filter is Ic, the output value P is calculated by equation (16) below.

$$P = \left(R_c^2 + I_c^2\right)^{1/2} \qquad (16)$$

Alternatively, the variation of the difference vectors obtained while building the parameter update table may be subjected to principal component analysis to reduce the number of dimensions, and the difference vector between the input image and the synthesized image may be reduced by the same projection. As long as a difference vector of a fixed dimension is obtained at each resolution, the approach is within the scope of the present invention.
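Equation (16) and the projection step can be sketched under simplifying assumptions. `gabor_magnitude` evaluates the response at one position as a plain sum of products over an image patch of the same size as the filter (kernel flipping is omitted), and `project` assumes the mean vector and the principal axes were already obtained by principal component analysis during table construction. Both function names are hypothetical.

```python
import math

def gabor_magnitude(patch, g):
    """P = sqrt(Rc^2 + Ic^2) for one image patch and one complex filter g.

    patch and g are equally sized 2-D lists; g holds complex values whose
    real and imaginary parts play the role of the real and imaginary filters.
    """
    rc = sum(p * c.real for prow, grow in zip(patch, g) for p, c in zip(prow, grow))
    ic = sum(p * c.imag for prow, grow in zip(patch, g) for p, c in zip(prow, grow))
    return math.hypot(rc, ic)

def project(diff, mean, components):
    """Reduce a difference vector with principal axes learned beforehand.

    The same mean and components must be used for the vectors seen during
    table construction and for the vectors computed at run time.
    """
    centered = [d - m for d, m in zip(diff, mean)]
    return [sum(a * x for a, x in zip(axis, centered)) for axis in components]

# Tiny illustrative inputs.
print(gabor_magnitude([[3.0, 4.0], [0.0, 0.0]], [[1 + 0j, 1j], [0j, 0j]]))
print(project([3.0, 4.0, 5.0], [1.0, 1.0, 1.0], [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]))
```

Because the projection is fixed once the axes are chosen, the reduced vector has the same number of dimensions for every input, which is the property the paragraph above requires.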

・Constraint method
For the synthesis parameter vector expressed in the subspace obtained by principal component analysis, the variance of each dimension may be computed from a large number of face images, the vector standardized so that every dimension has equal variance, and the Euclidean distance from the average shape then computed and used to judge whether the approximation succeeded. The Euclidean distance here is the standardized squared Euclidean distance: the squared error of each parameter element is divided by that element's variance and the results are summed. If the resulting distance is less than or equal to a predetermined value, the synthesized image is judged to approximate the input image successfully.
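The standardized squared Euclidean distance and the resulting success test can be written compactly. In this sketch the per-dimension variances are assumed to have been estimated beforehand from the training faces, and the threshold value is purely illustrative.

```python
def standardized_sq_distance(params, mean, variances):
    """Sum over dimensions of (parameter - mean)^2 / variance."""
    return sum((p - m) ** 2 / v for p, m, v in zip(params, mean, variances))

def approximation_succeeded(params, mean, variances, threshold):
    """Judge success when the distance does not exceed the predetermined value."""
    return standardized_sq_distance(params, mean, variances) <= threshold

# Illustrative values: two dimensions with variances 4 and 9.
d = standardized_sq_distance([2.0, 3.0], [0.0, 0.0], [4.0, 9.0])
print(d, approximation_succeeded([2.0, 3.0], [0.0, 0.0], [4.0, 9.0], threshold=3.0))
```

Dividing by the variance keeps a dimension with a naturally wide spread from dominating the distance.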

To convert the synthesis parameter so that a plausible face is obtained, the absolute value of each dimension of the standardized shape vector is kept within a predetermined range, as shown in FIG. 14. Specifically, this range is narrowed step by step, and the shape vector at the point where the Euclidean distance falls within the predetermined value is output.
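One possible reading of this narrowing procedure is sketched below. The input is assumed to be already standardized (each dimension divided by its standard deviation, with the average shape at the origin), so the standardized squared distance is simply the sum of squares. The starting range of 3.0, the shrink factor of 0.9, and the function name `constrain` are assumptions made for illustration.

```python
def constrain(z, max_distance, start_limit=3.0, shrink=0.9, min_limit=1e-6):
    """Clip a standardized shape vector, narrowing the range until it is plausible.

    z            : standardized shape vector (mean 0, unit variance per dimension)
    max_distance : predetermined bound on the standardized squared distance
    Returns the clipped vector and the range limit that was finally used.
    """
    limit = start_limit
    while True:
        # Keep the absolute value of every dimension within the current range.
        clipped = [max(-limit, min(limit, v)) for v in z]
        distance = sum(v * v for v in clipped)
        # Stop when the distance is small enough; min_limit guards against
        # looping forever if max_distance cannot be reached.
        if distance <= max_distance or limit < min_limit:
            return clipped, limit
        limit *= shrink

# Illustrative use: one dimension lies far outside the plausible range.
vector, used_limit = constrain([5.0, 0.0], max_distance=4.0)
print(vector, used_limit)
```

Clipping changes only the dimensions that lie outside the current range, so dimensions that are already plausible are left untouched.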

・Hybridization
The present invention can be used as preprocessing for conventional techniques. For example, a face deformed by the image processing apparatus of the present invention may be used as the input image for an MPEG-7 feature extraction method. In this case, the facial feature may be the output face feature descriptor combined with the synthesis parameter, or a matrix obtained by projecting these outputs with a subspace method.

・Application to images other than faces
The first embodiment described above showed a system that performs similar-image retrieval, but the present invention can also readily be implemented in a face identification system that determines whose face is shown, or in a system that determines gender or age according to the classification criteria used when constructing the discriminant space.

The present invention is also applicable to arbitrary objects other than faces, for example whole human bodies, living creatures, and automobiles. In the medical field it can be applied to locating bones and organs in X-ray photographs and CT scan images, and in the industrial and distribution fields to identifying and inspecting industrial products, parts, and distributed goods.

・Others
Although embodiments have been described in detail above, the present invention can be embodied as, for example, a system, an apparatus, a method, a program, or a storage medium (recording medium). Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. One such multi-device system is an image input device combined with or connected to an image storage device. Examples of the image input device include cameras that use CCDs, such as video cameras, digital cameras, and surveillance cameras; scanners; and devices that convert the output of an analog image input device into a digital image by A/D conversion. Examples of the image storage device include an external hard disk and a video recorder. The present invention also covers the case where a CPU or the like in all or any of the devices constituting such a system performs part or all of the actual processing and the functions of the above embodiments are realized by that processing.

The present invention can also be achieved by supplying a software program that realizes the functions of the above embodiments to a system or apparatus, directly or remotely, and having a computer of that system or apparatus read and execute the supplied program code. The program in this case corresponds to the flowcharts shown in the drawings of the embodiments.

Accordingly, the program code itself, installed on a computer in order to realize the functional processing of the present invention on that computer, also realizes the present invention. In other words, the present invention includes the computer program itself for realizing the functional processing of the present invention.

In that case, as long as it has the functions of a program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Recording media for supplying the program include, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, and DVD (DVD-ROM, DVD-R).

The program can also be supplied as follows. A browser on a client computer connects to a web page on the Internet, and the computer program of the present invention itself (or a compressed file that includes an automatic installation function) is downloaded from there to a recording medium such as a hard disk. The program can also be supplied by dividing the program code constituting the program of the present invention into a plurality of files and downloading each file from a different web page. In other words, the present invention also includes a WWW server that allows a plurality of users to download the program files for realizing the functional processing of the present invention on a computer.

The program of the present invention may also be encrypted, stored on a storage medium such as a CD-ROM, and distributed to users, with users who satisfy predetermined conditions allowed to download decryption key information from a web page via the Internet. Such a user can use the key information to execute the encrypted program and install it on a computer.

The functions of the above embodiments are realized when the computer executes the program it has read. In addition, an OS or the like running on the computer may perform part or all of the actual processing based on the instructions of the program, and the functions of the above embodiments may also be realized by that processing.

Furthermore, the functions of the above embodiments are also realized when the program read from the recording medium is written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, and then executed. That is, a CPU or the like provided on the function expansion board or in the function expansion unit can perform part or all of the actual processing based on the instructions of the program.

Block diagram showing the hardware configuration of an image processing apparatus according to one embodiment of the present invention.
Diagram showing an overview of the face model in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Diagram showing the method of setting facial feature points in the embodiment.
Flowchart showing the procedure for constructing shape information in the embodiment.
Flowchart showing the procedure for constructing texture information in the embodiment.
Flowchart showing the procedure for constructing the parameter update table in the embodiment.
Flowchart showing the face image synthesis method in the embodiment.
Flowchart showing the image registration processing in the embodiment.
Flowchart showing the feature extraction processing in the embodiment.
Flowchart showing the image search processing in the embodiment.
Diagram showing the method of setting facial feature points in the second embodiment.
Flowchart showing the procedure for constructing constraint condition information in the second embodiment.
Diagram showing the hardware configuration of the image processing apparatus in the third embodiment.
Conceptual diagram of converting the synthesis parameter into a face that may plausibly exist.

Claims

An image processing method for constructing a model image that is referred to when a two-dimensional image of an object existing in an image is synthesized, comprising:
a setting step of setting a synthesis parameter of the model image; and
a synthesis step of synthesizing the model image based on the synthesis parameter set in the setting step,
wherein the synthesis parameter is obtained by dimensionally compressing, by principal component analysis, information indicating feature points of the model image based on local coordinate systems in a plurality of regions into which the model image is divided.

2. The image processing method according to claim 1, wherein the synthesis parameter is obtained by dimensionally compressing, by principal component analysis, a shape feature of the model image quantified by the coordinates of the feature points in the local coordinate system.

The image processing method according to claim 1, wherein the local coordinate system includes an orthogonal coordinate system.

The image processing method according to claim 3, wherein the local coordinate system includes a polar coordinate system.

5. The image processing method according to claim 2, wherein the coordinates of the feature points are relative coordinates from other feature points in the local coordinate system.

The image processing method according to claim 1, further comprising:
an evaluation step (S803) of evaluating the possibility of existence of the model image synthesized in the synthesis step; and
a change step (S804) of changing the synthesis parameter so that the possibility of existence of the model image becomes high when the possibility of existence is determined to be low in the evaluation step.

The image processing method according to claim 6, wherein in the evaluation step, if the element value of the synthesis parameter is outside a predetermined range, it is determined that the possibility of existence is low.

The image processing method according to claim 7, wherein the predetermined range is set based on a statistical distribution in a target object set.

9. The image processing method according to claim 7, wherein, in the changing step, an element value of the synthesis parameter is changed to a boundary value within the predetermined range.

The image processing method according to claim 1, wherein the object is a human face.

An image processing apparatus for constructing a model image that is referred to when a two-dimensional image of an object existing in an image is synthesized, comprising:
setting means for setting a synthesis parameter of the model image; and
synthesizing means for synthesizing the model image based on the synthesis parameter set in the setting step,
wherein the synthesis parameter is obtained by dimensionally compressing, by principal component analysis, information indicating feature points of the model image based on local coordinate systems in a plurality of regions into which the model image is divided.

An image processing system for approximating a composite image for an object existing in an image, using a model image constructed by the image processing method according to claim 1.

11. A program that realizes the image processing method according to claim 1 on a computer by operating on the computer.

A recording medium on which the program according to claim 13 is recorded.