JP3995614B2

JP3995614B2 - Pattern recognition dictionary generation apparatus and method, pattern recognition apparatus and method

Info

Publication number: JP3995614B2
Application number: JP2003039192A
Authority: JP
Inventors: 和広福井; 修山口
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2002-02-25
Filing date: 2003-02-18
Publication date: 2007-10-24
Anticipated expiration: 2023-02-18
Also published as: JP2003317099A

Description

【０００１】
【発明の属する技術分野】
本発明はパターン認識を行う装置及びパターン認識方法に関する。
【０００２】
【従来の技術】
パターン認識の手法として、部分空間法（例えば、非特許文献１、非特許文献２）の拡張として開発された相互部分空間法と呼ばれる手法が提案されている（例えば、非特許文献３、特許文献１）。
【０００３】
相互部分空間法では、まず、予め辞書パターン分布を部分空間で表現して辞書部分空間を生成しておく。部分空間の生成は基底ベクトルを求めれば良い。
【０００４】
そして、認識対象の入力パターン分布を部分空間で表現して入力部分空間を生成する。それから、入力部分空間と予め生成しておいた各辞書部分空間との間のなす最小正準角を求め、認識対象は最小正準角が最小となる辞書部分空間に対応するカテゴリに属すると判定する。
【０００５】
カテゴリに属するとは、例えば画像を用いた人間の顔の認識であれば「現在認識を受けている人は当該辞書に登録された人である」ということである。
【０００６】
相互部分空間法では入力側、辞書側双方を部分空間で表現しているため、部分空間法に比べるとパターンの変形の吸収能力が優れているが、他のカテゴリとの関係を考慮していないために、例えば顔の認識の場合、照明条件などの影響を受け易いという問題があった。
【０００７】
そこで、識別に必要な本質的な特徴から構成される「制約部分空間」を予め用意し、比較すべき部分空間を制約部分空間に射影した上で相互部分空間法を適用する「制約相互部分空間法」と呼ばれる手法が提案されている（例えば、非特許文献４、特許文献２）。
【０００８】
制約相互部分空間法では図７に示すように、比較する部分空間Ｐ、Ｑを制約部分空間Ｌに射影して得られる部分空間Ｐ’、Ｑ’を相互部分空間法により比較する。
【０００９】
部分空間Ｐ’、Ｑ’は部分空間Ｐ、Ｑの本質的な特徴を受け継ぐので、部分空間Ｐ’、Ｑ’間の差異は部分空間Ｐ、Ｑ間の差異（図中ではベクトルｄで表現されているが一般には差異空間として表現される）のうち、本質的な部分を抽出したものになる。
【００１０】
従って、相互部分空間法よりもパターンの変形の吸収能力が優れているという利点がある。
【００１１】
【特許文献１】
特開平１１−２６５４５２号公報
【特許文献２】
特開２０００−３００６５公報
【００１２】
【非特許文献１】
飯島泰蔵、「パターン認識理論」、森北出版、（１９８９）
【非特許文献２】
エルッキオヤ著、「パターン認識と部分空間法」、産業図書、（１９８６）
【非特許文献３】
前田賢一、渡辺貞一、「局所的構造を導入したパターン・マッチング法」、信学論（Ｄ）、ｖｏｌ．Ｊ６８−Ｄ，Ｎｏ．３，ｐｐ３４５−３５２、１９８５．
【非特許文献４】
福井和広，山口修，鈴木薫，前田賢一、「制約相互部分空間法を用いた環境変化にロバストな顔画像認識−照明変動を抑える制約部分空間の学習−」、電子情報通信学会論文誌（Ｄ−ＩＩ）、ｖｏｌ．Ｊ８２−Ｄ−ＩＩ，ｎｏ．４ｐｐ．６１３−６２０、１９９９．
【００１３】
【発明が解決しようとする課題】
従来の制約相互部分空間法では、制約部分空間を２つの部分空間の差異を表す差分部分空間の集合から手続き的に生成している。
【００１４】
手続き的な生成手法では、まず、同一カテゴリ内に属する部分空間の全組み合わせについて差分部分空間を生成し、全ての差分部分空間の主成分部分空間から第１の変動部分空間を生成する。
【００１５】
「同一カテゴリに属する部分空間」とは、画像を用いた顔認識で例示すると「同一人物の顔画像から生成した部分空間」である。次に、異なるカテゴリに属する部分空間の全組み合わせについて差分部分空間を求め、求めた全ての差分部分空間の主成分部分空間から第２の変動部分空間を生成する。
【００１６】
ここで求めた第１の変動部分空間は、画像を用いた顔認識であれば「同一人物を異なる条件（表情、照明など）で撮影したときの変化」に相当するもので、識別の際にはできるだけ抑制したい成分を抽出した空間であると言える。一方、第２の変動部分空間は、画像を用いた顔認識であれば「他人と自分との違い」に相当するもので、識別の際には考慮したい成分を抽出した空間であると言える。
【００１７】
従って、第１の変動部分空間の補空間と第２の変動部分空間とは共に識別の際に考慮したい成分を抽出した空間であり、これらの共通部分を制約部分空間として算出する。
【００１８】
しかし、上述のような手続き的な制約部分空間の生成方法では、部分空間の全組み合わせについて差分部分空間を生成する必要がある。部分空間がｍ個存在すると_mＣ₂回の差分部分空間を計算する必要がある。そのため、カテゴリ数が多くなると求めるべき差分部分空間の数が膨大になって処理速度が遅くなるという問題があった。
【００１９】
本発明では、より簡単かつ計算量の少ない手法で制約部分空間を生成して用いるパターン認識装置及び方法を提供することを目的とする。
【００２０】
【課題を解決するための手段】
上記課題を解決するために本発明のパターン認識用辞書生成装置は、辞書パターンを入力するための辞書パターン入力部と、前記辞書パターンから辞書部分空間を生成する辞書部分空間生成部と、前記辞書部分空間の射影行列の和行列から制約部分空間を生成する制約部分空間生成部と、前記辞書部分空間を前記制約部分空間に射影して制約辞書部分空間を生成する辞書射影部と、前記制約部分空間と前記制約辞書部分空間を出力する出力部とを有する。
【００２１】
本発明のパターン認識用辞書生成装置では、前記制約部分空間生成部は、前記辞書部分空間の射影行列の和行列の固有値の小さい方から選んだｃ個の固有ベクトルから、前記制約部分空間を生成することを特徴としていても良い。
【００２２】
本発明のパターン認識用辞書生成装置では、前記制約部分空間生成部は、同一カテゴリに属する辞書パターンから表現された辞書部分空間を一つの部分空間に統合してから制約部分空間を生成することを特徴としていても良い。
【００２３】
本発明のパターン認識装置は、辞書パターンに対応する制約辞書部分空間を格納しておく制約辞書部分空間格納部と、認識対象の入力パターンを入力する入力部と、前記入力パターンから入力部分空間を生成する部分空間生成部と、予め、前記辞書パターンから生成した辞書部分空間への射影行列の総和を用いて生成した制約部分空間を格納しておく制約部分空間格納部と、前記入力部分空間を前記制約部分空間に射影して制約入力部分空間を求める部分空間射影部と、前記制約入力部分空間と前記制約辞書部分空間との間の正準角を求め、前記正準角を用いて前記対象を識別する識別部とを有することを特徴とする。
【００２４】
また、本発明のパターン認識装置は、上記パターン認識用辞書生成装置からの出力のうち、前記制約部分空間格納部には前記制約部分空間を格納し、前記制約辞書部分空間格納部には前記制約辞書部分空間を格納することを特徴としていても良い。
【００２５】
本発明のパターン認識用辞書生成方法は、辞書パターンを入力するための辞書パターン入力ステップと、前記辞書パターンから辞書部分空間を生成する辞書部分空間生成ステップと、前記辞書部分空間の射影行列の和行列から制約部分空間を生成する制約部分空間生成ステップと、前記辞書部分空間を前記制約部分空間に射影して制約辞書部分空間を生成する辞書射影ステップと、前記制約部分空間と前記制約辞書部分空間を出力する出力ステップとを有することを特徴とする。
【００２６】
また、本発明のパターン認識用辞書生成方法の前記制約部分空間生成ステップでは、前記辞書部分空間の射影行列の和行列の固有ベクトルのうち、固有値の小さい方から選んだｃ個を用いて前記制約部分空間を生成することを特徴としていても良い。
【００２７】
また、本発明のパターン認識用辞書生成方法の前記制約部分空間生成ステップでは、同一カテゴリに属する辞書パターンから生成された辞書部分空間を一つの部分空間に統合してから、制約部分空間を生成することを特徴としていても良い。
【００２８】
本発明のパターン認識方法は、前記制約部分空間として、同一カテゴリに属する辞書パターンから表現された辞書部分空間を一つの部分空間に統合してから射影行列を求めて、前記射影行列の全カテゴリ若しくは一部カテゴリ分の和行列を計算し、前記和行列について固有値の小さい方から選んだｃ個の固有ベクトルで張られる部分空間を用いることを特徴とする。
【００２９】
【発明の実施の形態】
（第１の実施形態）以下、本発明の一実施形態であるパターン認識装置を応用した顔画像認識装置について図面を参照して説明する。
【００３０】
図１は本発明の一実施形態に係る顔画像認識装置の概略構成図を示すブロック図である。本装置は、認識対象となる人物の画像を撮影して入力画像を得る画像入力部１１と、入力画像から顔面に相当する領域の顔領域画像を抽出する顔領域抽出部１２と、顔領域画像から目、鼻、口等の特徴点を抽出する顔特徴点抽出部１３とを有する。
【００３１】
さらに本装置は、抽出した特徴点を基準にして顔領域画像を正規化する変換を施す正規化画像生成部１４と、所定の枚数の正規化した顔領域画像から入力部分空間Ｐ_inを張る基底ベクトルを求めて入力部分空間Ｐ_inを生成する部分空間生成部１５と、入力部分空間Ｐ_inを制約部分空間Ｌに射影して制約入力部分空間Ｐ_in ^Lを張る基底ベクトルを求めて制約入力部分空間Ｐ_in ^Lを生成する部分空間射影部１６と、制約部分空間Ｌを張る基底ベクトルを格納しておく制約部分空間格納部１７とを有する。
【００３２】
また本装置は、制約入力部分空間Ｐ_in ^Lと制約辞書部分空間Ｈ_i ^Lを用いて類似度を計算する相互部分空間類似度計算部１８と、認識を行う前に予め生成しておいた制約辞書部分空間Ｈ_i ^Lを張る基底ベクトルを記憶しておく制約辞書部分空間格納部１９と、求めた類似度から、入力画像に写った人が辞書に登録された人であるか（辞書に複数人登録されている場合は、どの人か）を判定する判定部２０と、識別結果等の表示を行う表示部２１とを有する。
【００３３】
図３は本装置の外観を説明する図である。本装置の外部には、カメラ１１０１と、モニタ２１０１と、スピーカ２１０２と、本装置の操作用インターフェース５００１が露出している。
【００３４】
カメラ１１０１は画像入力部１１を構成する部品で、人物の画像を取得するために用いる。モニタ２１０１及びスピーカ２１０２は表示部２１を構成する部品で、識別結果を画像や音声で表示するために用いる。本実施形態ではカメラ１１０１は本装置の正面、モニタ２１０１下に取り付けるものとするが、使用状況等に応じて取り付け位置は適宜変えてよい。
【００３５】
画像入力部１１は、認識対象となる人物をカメラ１１０１で撮影して入力画像を得る。そして、入力画像をＡ／Ｄ変換器でデジタルデータに変換して出力する。入力画像はカメラ１１０１によって連続的に得られるので、順次Ａ／Ｄ変換して順次出力していく。
【００３６】
顔領域抽出部１２は、得られた入力画像から顔領域を抽出して顔領域画像を順次生成する。顔領域の抽出は、予め登録しておいた標準顔画像を用いたテンプレートマッチングにより行う。標準顔画像を入力画像中で移動させながら相関値を計算し、相関値が最も高い領域を顔領域とする。ただし、相関値が所定の閾値より低い場合は、顔が存在しないとする。例えば、１００％完全一致した場合の相関値を「１００」とした場合、閾値は「３０」に設定する。顔領域抽出にあたっては、顔向きの変化に対応するために部分空間法や複合類似度などにより複数のテンプレートを用いると、更に精度良く顔領域を抽出できる。
【００３７】
顔特徴点抽出部１３は抽出された顔領域画像から瞳、鼻、口端などの特徴点を抽出する。本実施形態では特徴点の抽出方法としては特開平９−２５１５３４号公報で提案されている位置精度の高い形状情報により求めた特徴点の候補をパターンマッチングで検証する方法を用いる。
【００３８】
ただし、特徴点を抽出することができればこの方法に限らず、例えば「坂本静生，宮尾陽子，田島譲二，”顔画像からの目の特徴点抽出”，信学論Ｄ−ＩＩ，Ｖｏｌ．Ｊ７６−Ｄ−ＩＩ，Ｎｏ．８，ｐｐ。１７９６−１８０４，Ａｕｇｕｓｔ，１９９３．」及び「Ａ．Ｌ．Ｙｕｉｌｌｅ，”Ｆｅａｔｕｒｅｅｘｔｒａｃｔｉｏｎｆｒｏｍｆａｃｅｓｕｓｉｎｇｄｅｆｏｒｍａｂｌｅｔｅｍｐｌａｔｅｓ”，ＵＪＣＶ，ｖｏｌ８：２，ｐｐ．９９−１１１，１９９２」で提案されているエッジ情報に基づく方法や、「ＡｌｅｘＰｅｎｔｌａｎｄ，ＢａｂａｃｋＭｏｇｈａｄｄａｍ，ＴｈａｄＳｔａｒｎｅｒ，”Ｖｉｅｗ−ｂａｓｅｄａｎｄｍｏｄｕｌａｒｅｉｇｅｎｓｐａｃｅｓｆｏｒｆａｃｅｒｅｃｏｇｎｉｔｉｏｎ”，ＣＶＰＲ’９４，ｐｐ．８４−９１，１９９４．」で提案されている固有空間法を適用したＥｉｇｅｎｆｅａｔｕｒｅ法や、「佐々木努，赤松茂，末永康仁，”顔画像認識のための色情報を用いた顔の位置合わせ法”，ＩＥ９１−２，ｐｐ．９−１５，１９９１．」で提案されているカラー情報による方法を用いても良い。
【００３９】
正規化画像生成部１４では、「山口修，福井和広，前田賢一，”動画像を用いた顔認識システム”，信学技報，ＰＲＭＵ９７−５０，ｐｐ．１７−２４，１９９７．」で提案されている、瞳、鼻穴を基準にした正規化処理を施して正規化画像を生成する。正規化処理の内容は、両方の瞳を結んだ第１のベクトルと、鼻穴の中点と瞳の中点を結んだ第２のベクトルについて、第１のベクトルが水平になり、第１のベクトルと第２のベクトルが直交し、そして第１、第２のベクトルが所定の長さになるようにアフィン変換を施すというものである。
【００４０】
部分空間生成部１５では、正規化画像生成部１４で逐次生成される正規化画像をヒストグラム平坦化、ベクトル長正規化を施した後で図示しないメモリに蓄える。そして、所定の枚数の正規化画像が蓄積されたら、入力部分空間Ｐ_inの生成、すなわちＰ_inを張るｍ個の基底ベクトルを生成する。
【００４１】
本装置では画像入力部１１のカメラ１１０１を介して入力画像が連続的に得られ、順次、顔領域抽出処理、特徴点抽出処理、正規化処理が行われるので正規化画像も連続的に得ることができる。部分空間も正規化画像が得られる度に逐次生成して更新できれば、リアルタイムにパターン認識処理を行うことができてパターン入力が容易になる。そこで、本装置では部分空間を逐次生成する手法として「エルッキ・オヤ著，小川英光，佐藤誠訳，”パターン認識と部分空間法”，産業図書，１９８６．」で述べられている同時反復法を適用する。
【００４２】
部分空間射影部１６では、部分空間生成部１５で生成された入力部分空間Ｐ_inを、制約部分空間格納部１７に格納されている制約部分空間Ｌ上へ射影して制約入力部分空間Ｐ_in ^Lを次の手順で生成する。
【００４３】
まず、入力部分空間Ｐ_inを張るｍ個の基底ベクトルを制約部分空間Ｌ上へ射影する。そして、射影した各基底ベクトルを、ベクトル長を正規化して正規化ベクトルにする。さらに、正規化ベクトルにグラムシュミットの直交化を施して正規直交化ベクトルにする。以上の手順で生成されたｍ個の正規直交化ベクトルは、入力部分空間Ｐ_inを制約部分空間Ｌに射影して生成した制約入力部分空間Ｐ_in ^Lの基底ベクトルである。
【００４４】
制約辞書部分空間格納部１９に格納されている人物ｉの制約辞書部分空間Ｈ_i ^Lは、各人物に対応する辞書部分空間Ｈ_iを制約部分空間Ｌに射影した部分空間で、人物ｉを辞書に登録した時点で予め部分空間射影部１６で生成しておく。制約辞書部分空間Ｈ_i ^Lの生成手順については後述する。
【００４５】
入力部分空間Ｐ_in、辞書部分空間Ｈ_i及び制約部分空間Ｌの次元数は、データの種類に応じて実験的に決める。顔パターンのデータが２２５次元の場合は、入力部分空間及び辞書部分空間は５〜１０次元に、制約部分空間は１５０〜１８０次元に設定すると良いことが実験的にわかっている。
【００４６】
相互部分空間類似度計算部１８では、制約入力部分空間Ｐ_in ^Lと制約辞書部分空間格納部１９に格納された人物ｉ（ｉ＝１・・・ｍ）の制約辞書部分空間Ｈ_i ^Lとのなす最小正準角θ₁に対するｃｏｓ²θ₁を、全ての辞書について計算して、類似度とする。
【００４７】
具体的にはｃｏｓ²θ₁は以下の式で定義される。ただし、制約入力部分空間Ｐ_in ^Lに対する射影行列を行列Ｐとし、制約辞書部分空間Ｈ_i ^Lに対する射影行列を行列Ｈとし、行列ＰＨＰの最大固有値をλ₁とする。
【００４８】
【数１】
$$\cos^2\theta_1 = \lambda_1$$

【００４９】
尚、類似度は、ｃｏｓ²θ₁に限らず、第ｔ正準角θ_rまでのｔ個のｃｏｓ²θ_j（ｊ＝１〜ｔ）の加重平均、積、乗和、またはこれらをベクトル化したものを用いても構わない。
【００５０】
判定部２０では、ｍ人の中で類似度が最大かつ所定の閾値以上である人物ｒを本人と同定する。この時、第２候補以降の類似度も考慮して決定しても良い。例えば、第２候補との類似度の差が閾値より小さい場合は不確定とする。
【００５１】
表示部２１は液晶画面とスピーカとを持ち、識別結果を画面に表示して、音声で知らせる。また、表示部２１は識別結果を他の機器に出力する出力部も持っている。以下は図示していないが、表示部２１は電子錠に識別結果を出力し、電子錠は識別結果に応じて開錠や施錠を行う。
【００５２】
（具体的な制約辞書部分空間の生成手順）以下、制約辞書部分空間Ｈ_i ^Lを生成して制約辞書部分空間格納部１９に格納する処理で使用する本装置各部と、処理の流れを図２を用いて説明する。
【００５３】
図２は制約辞書部分空間生成に関わる主要部分を抽出した図である。主要部分は、辞書部分空間Ｈ_iを制約辞書部分空間Ｈ_i ^Lに射影する部分空間射影部１６と、制約部分空間Ｌを格納しておく制約部分空間格納部１７と、生成した制約辞書部分空間Ｈ_i ^Lを格納する制約辞書部分空間格納部１９と、図１の画像入力部１１から部分空間部１５までを利用して生成された辞書部分空間Ｈ_iを格納しておく部分空間格納部２２である。部分空間格納部２２は図１では図示されていないが、あとは図１と同じ物である。
【００５４】
本装置の制約部分空間生成処理では、登録する人物を撮影してから辞書部分空間Ｈ_iを生成する、すなわち辞書パターンを入力して辞書部分空間を生成する処理までは、識別を行う時に入力パターンたる識別対象の人物を撮影してから入力部分空間Ｐ_iを生成する処理と同じ動作であるので説明を省略する。
【００５５】
部分空間格納部２２は、部分空間生成部１５で生成された辞書部分空間Ｈ_iを受け取って格納しておく。そして、部分空間射影部１６は制約部分空間格納部１７から制約部分空間Ｌを読み出して、部分空間格納部２２に格納されている辞書部分空間Ｈ_iを順次読み出して制約部分空間Ｌへの射影を行い、制約辞書部分空間Ｈ_i ^Lを順次生成する。制約辞書部分空間格納部１９は生成された制約辞書部分空間Ｈ_i ^Lを格納する。
【００５６】
（具体的な制約部分空間の生成手順）以下、図４を参照して人間の顔認識を行う場合を例に挙げて制約部分空間の生成手順を説明する。
【００５７】
まず、各人物毎に様々な顔向き、表情、照明条件などを含んだ正規化顔パターンを収集する。収集には「山口修、福井和広、前田賢一，”動画像を用いた顔認識システム”，信学技報，ＰＲＭＵ９７−５０，ｐｐ．１７−２４，１９９７．」に記載された方法を用い、収集時には照明条件を多様に変化させながら被登録者の顔面の画像を上下左右様々な方向から撮影する（Ｓ４０１）。
【００５８】
収集したｆ次元の正規化画像データに対して、ヒストグラム平坦化、ベクトル長正規化などの前処理を行った後に、ＫＬ展開を適用して部分空間を張る基底ベクトルを求める。この時、同一人物について異なる条件で取得した顔画像から部分空間が生成済みの場合は、一つの部分空間に統合しておく。すなわち、一人に対して部分空間は一つだけ割り当てる（Ｓ４０２）。そして、求めた部分空間の基底ベクトルから射影行列を計算する（Ｓ４０３）。
【００５９】
全ての人物（ｍ人とする）を登録するまでステップＳ４０１からＳ４０３までを繰り返す（Ｓ４０４）。
【００６０】
このようにして生成したｍ人分の部分空間に対する射影行列の総和Ｇを求める（Ｓ４０５）。そして、行列Ｇの固有値問題を解いて固有値を求める（Ｓ４０６）。求めた全固有値のうち値の小さいほうからｃ個を選び、これらの固有値に対応する固有ベクトルをｃ次元の制約部分空間Ｌの基底ベクトルとする（Ｓ４０７）。
【００６１】
この手順で制約部分空間Ｌを求める際に注意すべき点は「一人に対して部分空間は一つだけ割り当てる」ということである。すなわち、同一人物を異なる条件で撮影した画像から生成された部分空間は、全て同一の部分空間に統合しておくということである。統合を行わないと、本来主成分空間Ｐに含まれるべき同一カテゴリ（すなわち、同一人物）内での変動が制約部分空間Ｌに含まれることになり、識別の精度を落とす結果になる。
【００６２】
上述の制約部分空間及び制約辞書部分空間の生成を行うための装置の構成を図５に示す。この装置の画像入力部１１、顔領域抽出部１２、顔特徴点抽出部１３、正規化画像生成部１４、部分空間生成部１５、部分空間射影部１６は図１のパターン認識装置と同じものである。
【００６３】
まず、画像入力部１１、顔領域抽出部１２、顔特徴点抽出部１３、正規化画像生成部１４、部分空間生成部１５を用いて、顔画像認識の時と同様に入力された辞書パターンから辞書部分空間Ｈ_iを生成する。そして部分空間格納部２２は辞書部分空間Ｈ_iを格納する。部分空間格納部２２は、今までに登録された辞書部分空間Ｈ_iを蓄積していく。
【００６４】
制約部分空間部２３は部分空間格納部２２から辞書部分空間Ｈ_iを読み出して前述の手順で制約部分空間Ｌを生成して出力部２４と部分空間射影部１６に出力する。部分空間射影部１６は制約部分空間Ｌと部分空間Ｈ_iを用いて制約辞書部分空間Ｈ_i ^Lを出力部２４へ出力する。出力部２４は制約部分空間生成部２３で生成された制約部分空間Ｌ及び部分空間射影部１６で生成された制約辞書部分空間Ｈ_i ^Lを図示しない外部の装置に出力する。例えば外部の装置として図１に示す顔画像認識装置を接続した場合、制約部分空間Ｌは制約部分空間格納部１７に、制約辞書部分空間Ｈ_i ^Lは制約辞書部分空間格納部に格納することになる。
【００６５】
（制約部分空間の生成方法）以下、各部分空間から制約部分空間を生成する方法について説明する。
【００６６】
まず、２カテゴリ識別問題について説明し、次にこれを複数カテゴリの場合に拡張する。
【００６７】
（２カテゴリ識別問題）２カテゴリ識別問題においては、２つの部分空間の差分部分空間を制約部分空間として適用する。差分部分空間は２つの部分空間の差異を表す部分空間であり、２つの部分空間のなす正準角に基づく定義のほか、２つの部分空間に対する射影行列の和を用いても定義できる。本発明では射影行列の和を用いる。
【００６８】
ｎ次元部分空間Ａに対する射影行列Ｘとｍ次元部分空間Ｂへに対する射影行列Ｙは、部分空間Ａ、Ｂの基底ベクトルφ_i、ψ_iを用いて数２に示す関係で表され、部分空間と射影行列は一意に対応していることがわかる。
【００６９】
【数２】
$$X = \sum_{i=1}^{n} \phi_i \phi_i^{T}, \qquad Y = \sum_{i=1}^{m} \psi_i \psi_i^{T}$$

【００７０】
射影行列の和Ｇ₂＝Ｘ＋Ｙはｍ＞ｎとすると、射影行列の和Ｇ₂の持つｎ×２個の正の固有値に対応するｎ×２個の固有ベクトルを持つ。そして、差分部分空間Ｄ₂はＧ₂の持つｎ×２個の固有ベクトルのうち、固有値が１．０より小さいｎ個の固有ベクトルｄにより張られる空間となることが数学的に示される。数３はこれを表したものである。ここで、Ｓは部分空間Ａと部分空間Ｂの和空間で、Ｐは主成分部分空間である。
【００７１】
【数３】

【００７２】
また、射影行列の和Ｇ₂のｎ×２個の固有ベクトルのうち、固有値が１．０より大きいｎ個が張る主成分部分空間Ｐ₂は２つの部分空間Ａ、Ｂの共通的な空間とみなせ、部分空間Ａ、Ｂから等しく近い空間と解釈できる。
【００７３】
以上に示した差分部分空間Ｄ₂と主成分部分空間Ｐ₂との関係から、一般に２つの部分空間の和空間Ｓは、主成分部分空間Ｐと差分部分空間Ｄに直和分解できることが示された。このことから差分部分空間Ｄは和空間Ｓから主成分部分空間Ｐを取り除いた空間であるといえる。従って、差分部分空間は２つの部分空間の差異を表すと同時に、２つの部分空間の平均的な変動に対して直交、すなわちこれらの変動を受けにくいという識別を行う上で望ましい特性を持つ。
【００７４】
（複数カテゴリ識別の場合）「差分部分空間Ｄは２つの部分空間の和空間から両者の共通空間である主成分空間を取り除いた空間である」ということをさらに一般化して、複数カテゴリに対する差分部分空間Ｄ_nはｎ個のｍ次元部分空間の和空間からｎ個の部分空間の主成分部分空間Ｐ_nを取り除いた空間であると定義する。
【００７５】
具体的には、以下の式に示すように、ｎカテゴリに対する差分部分空間Ｄ_nを各カテゴリ部分空間から計算される射影行列Ｘ_iの総和行列Ｇ_nに対する固有ベクトルで、固有値が小さい方から順に選んだｃ個のベクトルｄ_n×_m、ｄ_n×_m-1、ｄ_n×_m-2、・・・、ｄ_n×_m-c+1により張られる空間とする。ただし、ｎ×ｍが正規化データ次元数ｆよりより大きい場合にはｄ_f、ｄ_f-1、ｄ_f-2、・・・ｄ_f-c+1、により張られる空間となる。
【００７６】
【数４】

【００７７】
この一般化された差分部分空間Ｄ_nも識別する上で好ましい空間となっている。なぜならば、図６に示したようにｎ個の部分空間に対する主成分部分空間Ｐ_nは、各カテゴリ内におけるパターン変動の全体平均を示しているからである。従って、この主成分部分空間Ｐ_nと直交する差分部分空間Ｄ_nにはこれらのパターン変動が含まれずに、これ以外の変動、つまりカテゴリ間の変動成分が主に含まれる。
【００７８】
ここで差分部分空間Ｄ_nを改めて複数カテゴリに対する制約部分空間Ｌとする。このようにして複数カテゴリ識別問題に対しても射影行列を用いることで容易な計算で解析的に制約部分空間Ｌが生成できる。これにより、従来の手続き的な手法に比べて演算量を減らせるので、処理速度が向上する。
【００７９】
尚、２２５次元のデータの顔パターンを識別する場合、上記ｃの値は１５０〜１８０が良いことが実験的にわかっている。
【００８０】
また、本実施形態ではカメラ１１０１で撮影した輝度画像を用いた顔認識を取り上げたが、これに限らず一般のパターン認識においても適用可能である。例えば、「赤松茂、佐々木努、深町映夫，”濃淡画像マッチングによるロバストな正面顔の識別法 − フーリエスペクトルによるＫＬ展開の応用 −”，信学論（Ｄ−ＩＩ），．Ｊ７６−ＤＩＩ，７，ｐｐ．１３６３−１３７３，１９９３．」で提案されているように、濃淡画像から生成したフーリエスペクトルパターンを入力として本発明を適用しても良い。
【００８１】
また、本実施形態では、正規化顔パターンをもとに入力部分空間および辞書部分空間を生成しているが、正規化顔パターンから多数の正規化顔パターン集合の平均顔パターンを引いたデータを用いて入力部分空間及び辞書部分空間を生成しても良い。この場合には、制約部分空間も平均顔パターンが引かれたデータから生成された部分空間から先に述べた手順により生成する。識別は平均顔を引かない場合と同様に生成された部分空間を制約部分空間へ射影したうえで行なう。このように平均顔を引くことで、元は原点に存在するベクトルの始点が、顔集合の中央の点に変更されるために、識別性能が向上することが期待できる。
【００８２】
（非線形識別への拡張）本実施形態では線形識別を仮定していた。カーネルトリックと呼ばれる計算技法を用いることにより、非線形識別への拡張が実現できる。
【００８３】
まず、カーネルトリック法について説明する。カーネルトリック法では、ｍ次元の原パターンｘを、非線形変換φにより、原空間Ａに比べて遥かに高い次元ｄφの空間、或いは無限次元の空間（以下、非線形空間Ｂ）に写像する。
【００８４】
【数５】

【００８５】
そして、非線形空間Ｂ上の写像に対して部分空間法に基づく識別を行う。従来は高次元空間における識別は高い性能が望めないとされていた。しかし、原パターンｘを高次元空間に写像することで、原空間では線形識別が不可能な場合でも線形分離可能な分布に変換することができる。
【００８６】
図８では、原空間Ａでは原パターンｘ１とｙ１とは線形分離不能であるが、高次元空間Ｂに写像した写像パターンφ（ｘ１）と写像パターンφ（ｙ１）とは線形分離可能となっている。
【００８７】
このことを顔パターンと顔に極めて類似した非顔パターンとの識別に応用すると、顔パターンと非顔パターンとにそれぞれ非線形変換を施すことで、識別することができるようになる。
【００８８】
しかし、非線形空間Ｂ上において、写像パターンφ（ｘ１）と写像パターンφ（ｙ１）との内積（φ（ｘ１）、φ（ｙ１））を直接計算するのは計算量が膨大なため実用的ではない。まして、非線形空間Ｂが無限次元の場合は計算不可能である。
【００８９】
ところが、非線形変換φをカーネル関数ｋ（ｘ、ｙ）を介して定義すると、内積（φ（ｘ１）、φ（ｙ１））は内積（ｘ１、ｙ１）から計算することができる。これがカーネルトリック法である。
【００９０】
非線形変換φが存在するためには、カーネル関数ｋ（ｘ、ｙ）がマーサーの条件を満たす必要がある。カーネル関数ｋ（ｘ、ｙ）には例えば以下のような関数を用いることができる（ｐは任意の数）。
【００９１】
【数６】

【００９２】
例えば、原パターンが２５６次元でｐ＝５とした場合、数６の（１）式を用いると１０¹⁰次元という極めて高次元の空間に写像される。また、数６の（３）式を用いると無限次元空間へ写像されることになる。数６の（３）式におけるσは識別対象に応じて決定する任意の実数である。
【００９３】
カーネルトリックはベクトルパターンの内積計算から構成される全ての識別法において導入可能であり、容易に非線形識別への拡張が実現できる。
【００９４】
代表的な方法であるカーネル非線形部分空間法は非線形空間上において部分空間法を適用する方法である。原空間における部分空間法に比べて大きく識別性能が向上し、現在最も高性能な識別法の１つとされているサポートベクタマシンと同等もしくは問題によってはそれ以上の良好な性能が得られることが確認されている。
【００９５】
この手法については、例えば「津田宏治，『ヒルベルト空間における部分空間法』，電子情報通信学会論文誌 (Ｄ−ＩＩ），ｖｏｌ．Ｊ８２−Ｄ−ＩＩ，ｐｐ．５５２−５９９，１９９９」や、「前田栄作、村瀬洋『カーネル非線形部分空間法によるパターン認識』，電子情報通信学会論文誌（Ｄ−ＩＩ），ｖｏｌ．Ｊ８２−Ｄ−ＩＩ，ｎｏ．４，ｐｐ．６００−６１２，１９９９」で述べられている。
【００９６】
この他にも、相互部分空間法にカーネルトリック法を導入した核非線形相互部分空間法や、投影距離法にカーネルトリック法を導入したカーネル投影距離法が提案されている。
【００９７】
核非線形相互部分空間法については、例えば「坂野鋭、武川直樹、中村太一，『核非線形相互部分空間法による物体認識』，信学論（Ｄ−ＩＩ），ｖｏｌ．Ｊ８４−Ｄ，Ｎｏ．８，ｐｐ．１５４９−１５５６，２００１．」で提案されている。
【００９８】
本実施形態における射影行列を用いた制約部分空間の生成、および部分空間の制約部分空間への射影も、全てベクトル同士の内積計算により構成される。したがってカーネル部分空間法などと同様にカーネルトリックを導入することで、非線形識別に拡張することができる。
【００９９】
【発明の効果】
以上、本発明によれば制約部分空間を各部分空間に対する射影行列の総和から容易に生成できるので、制約相互部分空間法による高精度な識別処理を従来より高速に実行可能となる。
【図面の簡単な説明】
【図１】本発明の一実施形態のパターン認識装置を応用した顔画像認識装置の構成を表すブロック図。
【図２】辞書部分空間の生成手順と辞書部分空間生成に関わる主要な部分の構成を説明する図。
【図３】本発明の一実施形態のパターン認識装置を応用した顔画像認識装置の外観を説明する図。
【図４】制約部分空間の生成処理を説明する図。
【図５】制約部分空間と制約辞書部分空間を作成するパターン認識用辞書生成装置の構成を説明するブロック図。
【図６】主成分部分空間と制約部分空間の関係を説明する図。
【図７】制約相互部分空間法の概念を説明する図。
【図８】非線形変換による高次元化の概念を説明する図。
【符号の説明】
１１画像入力部
１２顔領域抽出部
１３顔特徴点抽出部
１４正規化画像生成部
１５部分空間生成部
１６部分空間射影部
１７制約部分空間格納部
１８相互部分空間類似度計算部
１９制約辞書部分空間格納部
２０判定部
２１表示部
２２部分空間格納部
２３制約部分空間生成部
２４出力部
１１０１カメラ
２１０１モニタ
２１０２スピーカ
５００１操作用インターフェース
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a pattern recognition apparatus and a pattern recognition method.
[0002]
[Prior art]
As a pattern recognition technique, a method called the mutual subspace method, developed as an extension of the subspace method (for example, Non-Patent Document 1, Non-Patent Document 2), has been proposed (for example, Non-Patent Document 3, Patent Document 1).
[0003]
In the mutual subspace method, a dictionary subspace is first generated in advance by representing the distribution of dictionary patterns as a subspace. Generating a subspace amounts to obtaining its basis vectors.
[0004]
Next, an input subspace is generated by representing the distribution of the input patterns to be recognized as a subspace. The minimum canonical angle between the input subspace and each previously generated dictionary subspace is then obtained, and the recognition target is judged to belong to the category whose dictionary subspace gives the smallest minimum canonical angle.
[0005]
Belonging to a category means, in the case of recognizing a human face from images for example, that “the person currently being recognized is the person registered in that dictionary.”
[0006]
Because the mutual subspace method represents both the input side and the dictionary side as subspaces, it absorbs pattern deformation better than the subspace method. However, because it does not take the relationship with other categories into account, it is easily affected by, for example, illumination conditions in face recognition.
[0007]
Therefore, a technique called the “constrained mutual subspace method” has been proposed, in which a “constraint subspace” composed of the essential features needed for identification is prepared in advance, and the mutual subspace method is applied after the subspaces to be compared have been projected onto the constraint subspace (for example, Non-Patent Document 4 and Patent Document 2).
[0008]
In the constrained mutual subspace method, as shown in FIG. 7, the subspaces P and Q to be compared are projected onto the constraint subspace L, and the resulting subspaces P′ and Q′ are compared by the mutual subspace method.
[0009]
Since the subspaces P′ and Q′ inherit the essential features of the subspaces P and Q, the difference between P′ and Q′ is the essential part extracted from the difference between P and Q (represented by the vector d in the figure, although in general it is expressed as a difference space).
[0010]
Therefore, there is an advantage that the ability to absorb pattern deformation is superior to the mutual subspace method.
[0011]
[Patent Document 1]
JP-A-11-265452
[Patent Document 2]
JP 2000-30065
[0012]
[Non-Patent Document 1]
Taizo Iijima, “Pattern Recognition Theory”, Morikita Publishing, (1989)
[Non-Patent Document 2]
Erkki Oja, “Pattern Recognition and Subspace Method”, Sangyo Tosho, (1986)
[Non-Patent Document 3]
Kenichi Maeda, Sadakazu Watanabe, “Pattern Matching Method Introducing Local Structure”, IEICE Transactions (D), vol. J68-D, no. 3, pp. 345-352, 1985.
[Non-Patent Document 4]
Kazuhiro Fukui, Osamu Yamaguchi, Kaoru Suzuki, Kenichi Maeda, “Face image recognition robust to environmental changes using the constrained mutual subspace method: learning a constraint subspace that suppresses illumination variation”, IEICE Transactions (D-II), vol. J82-D-II, no. 4, pp. 613-620, 1999.
[0013]
[Problems to be solved by the invention]
In the conventional constrained mutual subspace method, a constrained subspace is generated procedurally from a set of difference subspaces representing the difference between two subspaces.
[0014]
In the procedural generation method, first, a difference subspace is generated for all combinations of subspaces belonging to the same category, and a first variation subspace is generated from the principal component subspaces of all the difference subspaces.
[0015]
A “subspace belonging to the same category” is, taking face recognition from images as an example, a “subspace generated from face images of the same person.” Next, a difference subspace is obtained for every combination of subspaces belonging to different categories, and a second variation subspace is generated from the principal component subspace of all the difference subspaces obtained.
[0016]
The first variation subspace obtained here corresponds, in face recognition from images, to “the change when the same person is photographed under different conditions (expression, lighting, etc.)”, so it can be regarded as a space that extracts the components one wants to suppress as far as possible during identification. The second variation subspace, on the other hand, corresponds to “the difference between oneself and others”, so it can be regarded as a space that extracts the components one wants to take into account during identification.
[0017]
Accordingly, the complement of the first variation subspace and the second variation subspace are both spaces that extract components to be taken into account during identification, and their intersection is calculated as the constraint subspace.
[0018]
However, the procedural method of generating the constraint subspace described above requires a difference subspace to be generated for every combination of subspaces. If there are m subspaces, the difference subspace must be calculated _mC_2 = m(m-1)/2 times. Consequently, as the number of categories grows, the number of difference subspaces to be obtained becomes enormous and processing slows down.
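To make the growth concrete, the following minimal sketch counts the pairwise difference subspaces that the procedural method has to compute. The function name is illustrative and does not come from the patent.

```python
from math import comb


def pairwise_difference_count(m: int) -> int:
    """Number of unordered pairs among m subspaces, i.e. mC2 = m(m-1)/2."""
    return comb(m, 2)


# The count grows quadratically with the number of subspaces.
for m in (10, 100, 1000):
    print(m, pairwise_difference_count(m))
```

The quadratic growth is what the invention aims to avoid by working with a single sum of projection matrices instead.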
[0019]
It is an object of the present invention to provide a pattern recognition apparatus and method for generating and using a constrained subspace by a simpler method with less calculation amount.
[0020]
[Means for Solving the Problems]
To solve the above problems, a pattern recognition dictionary generation apparatus of the present invention includes: a dictionary pattern input unit for inputting dictionary patterns; a dictionary subspace generation unit that generates dictionary subspaces from the dictionary patterns; a constraint subspace generation unit that generates a constraint subspace from the sum matrix of the projection matrices of the dictionary subspaces; a dictionary projection unit that projects the dictionary subspaces onto the constraint subspace to generate constraint dictionary subspaces; and an output unit that outputs the constraint subspace and the constraint dictionary subspaces.
[0021]
In the pattern recognition dictionary generation apparatus of the present invention, the constraint subspace generation unit may generate the constraint subspace from c eigenvectors of the sum matrix of the projection matrices of the dictionary subspaces, selected in ascending order of eigenvalue.
[0022]
In the pattern recognition dictionary generation apparatus of the present invention, the constraint subspace generation unit may generate the constraint subspace after integrating the dictionary subspaces expressed from dictionary patterns belonging to the same category into a single subspace.
[0023]
A pattern recognition apparatus of the present invention includes: a constraint dictionary subspace storage unit that stores constraint dictionary subspaces corresponding to dictionary patterns; an input unit that inputs an input pattern to be recognized; a subspace generation unit that generates an input subspace from the input pattern; a constraint subspace storage unit that stores a constraint subspace generated in advance using the sum of the projection matrices onto the dictionary subspaces generated from the dictionary patterns; a subspace projection unit that projects the input subspace onto the constraint subspace to obtain a constraint input subspace; and an identification unit that obtains the canonical angles between the constraint input subspace and the constraint dictionary subspaces and identifies the target using the canonical angles.
[0024]
The pattern recognition apparatus of the present invention may also store, of the outputs of the above pattern recognition dictionary generation apparatus, the constraint subspace in the constraint subspace storage unit and the constraint dictionary subspaces in the constraint dictionary subspace storage unit.
[0025]
A pattern recognition dictionary generation method of the present invention includes: a dictionary pattern input step of inputting dictionary patterns; a dictionary subspace generation step of generating dictionary subspaces from the dictionary patterns; a constraint subspace generation step of generating a constraint subspace from the sum matrix of the projection matrices of the dictionary subspaces; a dictionary projection step of projecting the dictionary subspaces onto the constraint subspace to generate constraint dictionary subspaces; and an output step of outputting the constraint subspace and the constraint dictionary subspaces.
[0026]
In the constraint subspace generation step of the pattern recognition dictionary generation method of the present invention, the constraint subspace may be generated using c eigenvectors of the sum matrix of the projection matrices of the dictionary subspaces, selected in ascending order of eigenvalue.
[0027]
Further, in the constraint subspace generation step of the pattern recognition dictionary generation method of the present invention, the constraint subspace may be generated after the dictionary subspaces generated from dictionary patterns belonging to the same category have been integrated into a single subspace.
[0028]
In a pattern recognition method of the present invention, the constraint subspace is obtained as follows: the dictionary subspaces expressed from dictionary patterns belonging to the same category are integrated into a single subspace, a projection matrix is obtained for it, the sum matrix of these projection matrices over all or some of the categories is calculated, and the subspace spanned by c eigenvectors of the sum matrix, selected in ascending order of eigenvalue, is used.
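The generation of the constraint subspace from the sum of projection matrices can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the patented implementation: each category is assumed to be already merged into one subspace with an orthonormal basis, the function and variable names are hypothetical, and the toy dimensions are chosen so that the sum matrix has no zero eigenvalues (the treatment of zero eigenvalues is not covered here).

```python
import numpy as np


def projection_matrix(basis: np.ndarray) -> np.ndarray:
    """Projection matrix onto the span of the orthonormal columns of `basis` (f x d)."""
    return basis @ basis.T


def constraint_subspace(bases, c: int) -> np.ndarray:
    """Return an f x c orthonormal basis spanned by the c eigenvectors of the
    sum of projection matrices that have the smallest eigenvalues.

    `bases` holds one orthonormal basis (f x d_i) per category.
    """
    f = bases[0].shape[0]
    G = np.zeros((f, f))
    for B in bases:
        G += projection_matrix(B)      # sum matrix of projection matrices
    # For a symmetric matrix, eigh returns eigenvalues in ascending order,
    # so the first c columns belong to the c smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(G)
    return eigvecs[:, :c]


# Toy data (assumed values): 4 categories, 2-dimensional subspaces in a 6-dimensional space.
rng = np.random.default_rng(0)
f, d, n_categories, c = 6, 2, 4, 3
bases = [np.linalg.qr(rng.standard_normal((f, d)))[0] for _ in range(n_categories)]
L = constraint_subspace(bases, c)
print(L.shape)
```

Only one eigendecomposition of an f x f matrix is needed, regardless of how many category pairs exist.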
[0029]
DETAILED DESCRIPTION OF THE INVENTION
(First Embodiment) A face image recognition apparatus to which a pattern recognition apparatus according to an embodiment of the present invention is applied will be described with reference to the drawings.
[0030]
FIG. 1 is a block diagram showing the schematic configuration of a face image recognition apparatus according to an embodiment of the present invention. The apparatus includes an image input unit 11 that captures an image of the person to be recognized and obtains an input image, a face region extraction unit 12 that extracts from the input image a face region image of the area corresponding to the face, and a facial feature point extraction unit 13 that extracts feature points such as the eyes, nose, and mouth from the face region image.
[0031]
The apparatus further includes a normalized image generation unit 14 that applies a transformation normalizing the face region image with reference to the extracted feature points; a subspace generation unit 15 that obtains, from a predetermined number of normalized face region images, the basis vectors spanning an input subspace P_in and thereby generates the input subspace P_in; a subspace projection unit 16 that projects the input subspace P_in onto the constraint subspace L, obtains the basis vectors spanning a constraint input subspace P_in^L, and thereby generates the constraint input subspace P_in^L; and a constraint subspace storage unit 17 that stores the basis vectors spanning the constraint subspace L.
[0032]
The apparatus also includes a mutual subspace similarity calculation unit 18 that calculates a similarity using the constraint input subspace P_in^L and a constraint dictionary subspace H_i^L; a constraint dictionary subspace storage unit 19 that stores the basis vectors spanning the constraint dictionary subspaces H_i^L generated in advance, before recognition is performed; a determination unit 20 that determines from the obtained similarity whether the person in the input image is a person registered in the dictionary (and, when several persons are registered, which one); and a display unit 21 that displays the identification result and the like.
[0033]
FIG. 3 is a diagram for explaining the external appearance of this apparatus. A camera 1101, a monitor 2101, a speaker 2102 and an operation interface 5001 of the apparatus are exposed outside the apparatus.
[0034]
The camera 1101 is a component constituting the image input unit 11 and is used for acquiring a human image. A monitor 2101 and a speaker 2102 are components constituting the display unit 21 and are used for displaying the identification result as an image or sound. In this embodiment, the camera 1101 is attached to the front of the apparatus and under the monitor 2101. However, the attachment position may be changed as appropriate according to the use situation or the like.
[0035]
The image input unit 11 captures a person to be recognized by the camera 1101 and obtains an input image. Then, the input image is converted into digital data by an A / D converter and output. Since input images are obtained continuously by the camera 1101, they are sequentially A / D converted and sequentially output.
[0036]
The face area extraction unit 12 extracts face areas from the obtained input image and sequentially generates face area images. The face area is extracted by template matching using a standard face image registered in advance. The correlation value is calculated while moving the standard face image in the input image, and the area having the highest correlation value is set as the face area. However, if the correlation value is lower than a predetermined threshold, it is assumed that no face exists. For example, when the correlation value when 100% completely matches is “100”, the threshold is set to “30”. In extracting the face area, the face area can be extracted with higher accuracy by using a plurality of templates by the subspace method or the composite similarity in order to cope with the change in the face direction.
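The template-matching step can be sketched as below. The patent does not specify which correlation measure is used; this sketch assumes a zero-mean normalized correlation scaled so that a perfect match scores 100, and uses the threshold of 30 mentioned above. The function names and the toy image are hypothetical.

```python
import numpy as np


def correlation_score(patch: np.ndarray, template: np.ndarray) -> float:
    """Zero-mean normalized correlation, scaled so that a perfect match gives 100."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(t)
    if denom == 0.0:
        return 0.0                      # flat patch or template: no correlation defined
    return 100.0 * float(np.sum(p * t) / denom)


def find_face(image: np.ndarray, template: np.ndarray, threshold: float = 30.0):
    """Slide the template over the image and return (row, col, score) of the
    best match, or None when the best score is below the threshold."""
    th, tw = template.shape
    best = None
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            s = correlation_score(image[r:r + th, c:c + tw], template)
            if best is None or s > best[2]:
                best = (r, c, s)
    if best is None or best[2] < threshold:
        return None                     # treated as "no face present"
    return best


# Toy example: a 2x2 template embedded in an otherwise empty 6x6 image.
template = np.array([[1.0, 2.0], [3.0, 4.0]])
image = np.zeros((6, 6))
image[2:4, 3:5] = template
result = find_face(image, template)
empty_result = find_face(np.zeros((6, 6)), template)
```

A real system would use a much larger standard face template and an optimized search; the exhaustive double loop here is only for clarity.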
[0037]
The face feature point extraction unit 13 extracts feature points such as the pupils, nose, and mouth corners from the extracted face region image. In this embodiment, the feature points are extracted by the method proposed in Japanese Patent Laid-Open No. 9-251534, in which feature point candidates obtained from shape information with high positional accuracy are verified by pattern matching.
[0038]
However, any method that can extract the feature points may be used instead. Examples include the edge-based methods proposed in “Shizuo Sakamoto, Yoko Miyao, Johji Tajima, ‘Extraction of eye feature points from face images’, IEICE Trans. D-II, Vol. J76-D-II, No. 8, pp. 1796-1804, August 1993.” and “A. L. Yuille, ‘Feature extraction from faces using deformable templates’, IJCV, vol. 8:2, pp. 99-111, 1992”; the Eigenfeature method, which applies the eigenspace method, proposed in “Alex Pentland, Baback Moghaddam, Thad Starner, ‘View-based and modular eigenspaces for face recognition’, CVPR '94, pp. 84-91, 1994.”; and the color-based method proposed in “Tsutomu Sasaki, Shigeru Akamatsu, Yasuhito Suenaga, ‘Face alignment method using color information for face image recognition’, IE91-2, pp. 9-15, 1991.”
[0039]
The normalized image generation unit 14 generates a normalized image by applying the normalization based on the pupils and nostrils proposed in “Osamu Yamaguchi, Kazuhiro Fukui, Kenichi Maeda, ‘Face recognition system using moving images’, IEICE Technical Report, PRMU97-50, pp. 17-24, 1997.” The normalization takes a first vector connecting the two pupils and a second vector connecting the midpoint of the nostrils and the midpoint of the pupils, and applies an affine transformation such that the first vector becomes horizontal, the first and second vectors become orthogonal, and both vectors have predetermined lengths.
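One way to realize such a normalization is to solve for the affine map that sends the two pupils and the nostril midpoint to fixed canonical positions. The sketch below does this with NumPy. The target lengths (`eye_dist`, `eye_nose_dist`), the coordinate convention (y pointing down), and all names are assumptions for illustration; the cited technical report may define the normalization differently.

```python
import numpy as np


def normalizing_affine(left_pupil, right_pupil, nostril_mid,
                       eye_dist=10.0, eye_nose_dist=8.0) -> np.ndarray:
    """Return a 2x3 affine matrix A such that A @ [x, y, 1] gives normalized
    coordinates in which the pupil-to-pupil vector is horizontal with length
    eye_dist, and the vector from the nostril midpoint to the pupil midpoint
    is orthogonal to it with length eye_nose_dist."""
    src = np.array([left_pupil, right_pupil, nostril_mid], dtype=float)
    dst = np.array([[0.0, 0.0],
                    [eye_dist, 0.0],
                    [eye_dist / 2.0, eye_nose_dist]])
    src_h = np.hstack([src, np.ones((3, 1))])   # homogeneous source points, 3x3
    A_T = np.linalg.solve(src_h, dst)           # solves src_h @ A.T = dst
    return A_T.T


def apply_affine(A: np.ndarray, point) -> np.ndarray:
    """Apply the 2x3 affine matrix to a 2-D point."""
    return A @ np.array([point[0], point[1], 1.0])


# Hypothetical feature point coordinates in the input image.
left, right, nose = (30.0, 40.0), (50.0, 44.0), (38.0, 60.0)
A = normalizing_affine(left, right, nose)
first_vec = apply_affine(A, right) - apply_affine(A, left)
pupil_mid = (apply_affine(A, left) + apply_affine(A, right)) / 2.0
second_vec = pupil_mid - apply_affine(A, nose)
```

The three point correspondences determine the affine map uniquely as long as the three source points are not collinear.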
[0040]
The subspace generation unit 15 applies histogram equalization and vector length normalization to the normalized images generated one after another by the normalized image generation unit 14, and then stores them in a memory (not shown). Once a predetermined number of normalized images has accumulated, it generates the input subspace P_in, that is, the m basis vectors spanning P_in.
[0041]
In this apparatus, input images are obtained continuously through the camera 1101 of the image input unit 11, and face region extraction, feature point extraction, and normalization are performed on them in turn, so normalized images are also obtained continuously. If the subspace can likewise be generated and updated each time a normalized image is obtained, pattern recognition can run in real time and pattern input becomes easy. This apparatus therefore applies, as the method of generating the subspace sequentially, the simultaneous iteration method described in Erkki Oja (translated by Hidemitsu Ogawa and Makoto Sato), “Pattern Recognition and Subspace Method”, Sangyo Tosho, 1986.
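The idea behind simultaneous iteration can be sketched in its batch form (often called orthogonal iteration): a block of m vectors is repeatedly multiplied by the autocorrelation matrix and re-orthonormalized, and it converges to the m leading eigenvectors. Note that this is not the sequential update used in the apparatus; the sketch works on a fixed matrix, and the names, iteration count, and toy data are assumptions.

```python
import numpy as np


def simultaneous_iteration(C: np.ndarray, m: int, n_iter: int = 200, seed: int = 0) -> np.ndarray:
    """Approximate an orthonormal basis of the span of the m leading
    eigenvectors of the symmetric positive semidefinite matrix C."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((C.shape[0], m)))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(C @ Q)     # multiply, then re-orthonormalize
    return Q


# Toy patterns: 50 samples in 6 dimensions with two dominant directions.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 6)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])
X /= np.linalg.norm(X, axis=1, keepdims=True)   # vector length normalization
C = X.T @ X / len(X)                            # autocorrelation matrix
basis = simultaneous_iteration(C, m=2)
```

The result spans the same subspace as the two leading eigenvectors obtained from a full eigendecomposition, up to a rotation within that subspace.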
[0042]
The subspace projection unit 16 projects the input subspace P_in generated by the subspace generation unit 15 onto the constraint subspace L stored in the constraint subspace storage unit 17, and generates the constraint input subspace P_in^L by the following procedure.
[0043]
First, the basis vectors spanning the input subspace P_in are projected onto the constraint subspace L. Each projected basis vector is then normalized to unit length. The normalized vectors are further subjected to Gram-Schmidt orthogonalization to obtain orthonormal vectors. The m orthonormal vectors generated by this procedure are the basis vectors of the constraint input subspace P_in^L, which is obtained by projecting the input subspace P_in onto the constraint subspace L.
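The three steps above (project, normalize, orthogonalize) can be sketched as follows. The function name and the column-matrix representation of the bases are assumptions made for illustration, and the sketch assumes the projected vectors are nonzero and linearly independent.

```python
import numpy as np

def project_onto_constraint(U, L):
    """Project a subspace onto the constraint subspace.

    U: (f, m) array whose columns are the basis vectors of the subspace.
    L: (f, c) array whose columns are an orthonormal basis of the constraint subspace.
    Returns an (f, m) array of orthonormal basis vectors of the projected subspace.
    """
    # Step 1: project each basis vector onto the constraint subspace.
    V = L @ (L.T @ U)
    # Step 2: normalize each projected vector to unit length.
    V = V / np.linalg.norm(V, axis=0)
    # Step 3: Gram-Schmidt orthogonalization, one vector at a time.
    Q = np.zeros_like(V)
    for j in range(V.shape[1]):
        v = V[:, j] - Q[:, :j] @ (Q[:, :j].T @ V[:, j])
        Q[:, j] = v / np.linalg.norm(v)
    return Q
```

The same routine applies unchanged to a dictionary subspace H_i when generating the constraint dictionary subspace H_i^L.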
[0044]
The constraint dictionary subspace H_i^L of person i, stored in the constraint dictionary subspace storage unit 19, is obtained by projecting the dictionary subspace H_i corresponding to that person onto the constraint subspace L. It is generated by the subspace projection unit 16 in advance, when person i is registered in the dictionary. The procedure for generating the constraint dictionary subspace H_i^L is described later.
[0045]
The numbers of dimensions of the input subspace P_in, the dictionary subspace H_i, and the constraint subspace L are determined experimentally according to the type of data. Experiments indicate that, for 225-dimensional face pattern data, the input subspace and the dictionary subspace should have 5 to 10 dimensions and the constraint subspace 150 to 180 dimensions.
[0046]
The mutual subspace similarity calculation unit 18 calculates cos²θ₁ for the canonical angle θ₁ between the constraint input subspace P_in^L and the constraint dictionary subspace H_i^L of each person i (i = 1, ..., m) stored in the constraint dictionary subspace storage unit 19. It does so for all dictionaries and uses the result as the similarity.
[0047]
Specifically, cos²θ₁ is defined by the following equation, where P is the projection matrix onto the constraint input subspace P_in^L, H is the projection matrix onto the constraint dictionary subspace H_i^L, and λ₁ is the maximum eigenvalue of the matrix PHP.
[0048]
[Expression 1]

[0049]
Instead of cos²θ₁ alone, the similarity may be the weighted average, the product, or the sum of products of the t values cos²θ_j (j = 1, ..., t) up to the t-th canonical angle θ_t, or a vector formed from these values.
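A minimal sketch of this similarity follows. It relies on the standard fact that, for orthonormal bases U and V, the nonzero eigenvalues of PHP (with P = UUᵀ and H = VVᵀ) equal the squared singular values of UᵀV, so cos²θ₁ = λ₁ can be read off an SVD of a small matrix. The function names and the weighted-average option are illustrative assumptions.

```python
import numpy as np

def canonical_cos2(U, V):
    """cos^2 of the canonical angles between span(U) and span(V), largest first.

    U and V are arrays whose columns are orthonormal basis vectors.
    """
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.clip(s, 0.0, 1.0) ** 2

def similarity(U, V, t=1, weights=None):
    """Weighted average of cos^2(theta_j) for j = 1..t (t = 1 gives cos^2(theta_1))."""
    c2 = canonical_cos2(U, V)[:t]
    return float(np.average(c2, weights=weights))
```

Working with the m-by-m matrix UᵀV avoids forming the f-by-f projection matrices for every dictionary entry.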
[0050]
The determination unit 20 identifies as the subject the person r, among the m persons, whose similarity is the highest and is not less than a predetermined threshold. The similarities of the second and lower candidates may also be taken into account. For example, if the difference in similarity between the first and second candidates is smaller than a threshold, the result is judged indeterminate.
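The decision rule can be sketched as below. The function name, the return convention, and both threshold parameters are hypothetical; the source does not give concrete threshold values.

```python
def decide(similarities, accept_threshold, margin_threshold):
    """Return (status, index) for a list of per-person similarities.

    status is "accepted", "rejected" (best similarity below the threshold),
    or "indeterminate" (first and second candidates too close).
    """
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    best = ranked[0]
    if similarities[best] < accept_threshold:
        return "rejected", None
    if len(ranked) > 1:
        second = ranked[1]
        if similarities[best] - similarities[second] < margin_threshold:
            return "indeterminate", None
    return "accepted", best
```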
[0051]
The display unit 21 has a liquid crystal screen and a speaker; it displays the identification result on the screen and announces it by voice. The display unit 21 also has an output unit that outputs the identification result to another device. Although not shown in the figures, the display unit 21 outputs the identification result to an electronic lock, and the electronic lock unlocks or locks according to the identification result.
[0052]
(Specific Constraint Dictionary Subspace Generation Procedure) The process of generating the constraint dictionary subspace H_i^L and storing it in the constraint dictionary subspace storage unit 19, together with the parts of the apparatus used in that process, is described below with reference to FIG. 2.
[0053]
FIG. 2 shows the main parts involved in constraint dictionary subspace generation. These are the subspace projection unit 16, which projects the dictionary subspace H_i onto the constraint subspace L to obtain the constraint dictionary subspace H_i^L; the constraint subspace storage unit 17, which stores the constraint subspace L; the constraint dictionary subspace storage unit 19, which stores the generated constraint dictionary subspace H_i^L; and the subspace storage unit 22, which stores the dictionary subspace H_i generated using the image input unit 11 through the subspace generation unit 15 of FIG. 1. The subspace storage unit 22 is not shown in FIG. 1, but the other parts are the same as in FIG. 1.
[0054]
In this generation process, the steps from photographing the person to be registered (that is, inputting the dictionary pattern) up to generating the dictionary subspace H_i are the same as the steps, performed at identification time, from photographing the person to be identified (the input pattern) up to generating the input subspace P_in. Their description is therefore omitted.
[0055]
The subspace storage unit 22 receives and stores the dictionary subspace H_i generated by the subspace generation unit 15. The subspace projection unit 16 then reads the constraint subspace L from the constraint subspace storage unit 17, sequentially reads the dictionary subspaces H_i stored in the subspace storage unit 22, projects each onto the constraint subspace L, and thereby sequentially generates the constraint dictionary subspaces H_i^L. The constraint dictionary subspace storage unit 19 stores the generated constraint dictionary subspaces H_i^L.
[0056]
(Specific Constraint Subspace Generation Procedure) The constraint subspace generation procedure is described below with reference to FIG. 4.
[0057]
First, normalized face patterns covering various face orientations, facial expressions, lighting conditions, and so on are collected for each person. For collection, the method described in Osamu Yamaguchi, Kazuhiro Fukui, Kenichi Maeda, "Face Recognition System Using Moving Images", IEICE Technical Report, PRMU 97-50, pp. 17-24, 1997, is used. During collection, the face of the person to be registered is photographed from various directions (up, down, left, and right) while the illumination conditions are varied (S401).
[0058]
After preprocessing such as histogram flattening and vector length normalization is applied to the collected f-dimensional normalized image data, KL expansion is applied to obtain the basis vectors that span the subspace. At this time, if subspaces have already been generated for the same person from face images acquired under different conditions, they are integrated into one subspace. That is, exactly one subspace is allocated to each person (S402). A projection matrix is then calculated from the basis vectors of the obtained subspace (S403).
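Steps S402 and S403 can be sketched as follows, assuming the KL expansion is taken on the autocorrelation matrix of the length-normalized patterns (a common choice in the subspace method, but an assumption here). Histogram flattening and the integration of previously generated subspaces are omitted, and the function names are illustrative.

```python
import numpy as np

def kl_basis(patterns, dim):
    """Basis of a dim-dimensional subspace from (N, f) pattern data via KL expansion."""
    X = np.asarray(patterns, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # vector length normalization
    C = X.T @ X / X.shape[0]                          # autocorrelation matrix (assumed)
    eigvals, eigvecs = np.linalg.eigh(C)              # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]                  # eigenvectors of the largest eigenvalues

def projection_matrix(U):
    """Projection matrix onto the subspace spanned by the orthonormal columns of U."""
    return U @ U.T
```

The resulting matrix is symmetric and idempotent, and its trace equals the subspace dimension, which gives a quick sanity check on the basis.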
[0059]
Steps S401 to S403 are repeated until all persons (m persons) are registered (S404).
[0060]
The sum G of the projection matrices onto the subspaces of the m persons generated in this way is obtained (S405). The eigenvalue problem of the matrix G is then solved to obtain its eigenvalues (S406). From the obtained eigenvalues, c are selected in ascending order starting from the smallest, and the eigenvectors corresponding to these eigenvalues are taken as the basis vectors of the c-dimensional constraint subspace L (S407).
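Steps S405 to S407 can be sketched as follows. The sketch keeps only eigenvectors with positive eigenvalues before selecting the c smallest, following this description's definition of the generalized difference subspace as part of the sum space; the tolerance used to decide "positive" and the function name are assumptions.

```python
import numpy as np

def constraint_subspace(bases, c, rel_tol=1e-10):
    """Basis (f, c) of the constraint subspace from per-person subspace bases.

    bases: list of (f, d) arrays with orthonormal columns, one per person.
    """
    f = bases[0].shape[0]
    G = np.zeros((f, f))
    for U in bases:
        G += U @ U.T                      # S405: sum of the projection matrices
    w, V = np.linalg.eigh(G)              # S406: eigenvalues in ascending order
    # Assumption: eigenvalues below this tolerance are treated as zero and skipped.
    positive = np.flatnonzero(w > rel_tol * max(w.max(), 1.0))
    if c > positive.size:
        raise ValueError("c exceeds the dimension of the sum space")
    return V[:, positive[:c]]             # S407: c smallest positive eigenvalues
```

A single symmetric eigendecomposition of an f-by-f matrix replaces the pairwise difference-subspace computations of the procedural method.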
[0061]
A point to note when obtaining the constraint subspace L by this procedure is that exactly one subspace is allocated to each person. That is, the subspaces generated from images of the same person photographed under different conditions are all integrated into a single subspace. Without this integration, variations within the same category (that is, the same person), which should be contained in the principal component subspace P, would be included in the constraint subspace L, and identification accuracy would decrease.
[0062]
FIG. 5 shows the configuration of a device for generating the above-described constraint subspace and constraint dictionary subspace. The image input unit 11, face area extraction unit 12, face feature point extraction unit 13, normalized image generation unit 14, subspace generation unit 15, and subspace projection unit 16 of this device are the same as those of the pattern recognition apparatus of FIG. 1.
[0063]
First, the dictionary subspace H_i is generated from the input dictionary pattern, in the same manner as in face image recognition, using the image input unit 11, the face region extraction unit 12, the face feature point extraction unit 13, the normalized image generation unit 14, and the subspace generation unit 15. The subspace storage unit 22 stores the dictionary subspace H_i, and thus accumulates the dictionary subspaces H_i registered so far.
[0064]
The constraint subspace generation unit 23 reads the dictionary subspaces H_i from the subspace storage unit 22, generates the constraint subspace L according to the procedure described above, and outputs it to the output unit 24 and the subspace projection unit 16. The subspace projection unit 16 generates the constraint dictionary subspace H_i^L from the constraint subspace L and the dictionary subspace H_i and outputs it to the output unit 24. The output unit 24 outputs the constraint subspace L generated by the constraint subspace generation unit 23 and the constraint dictionary subspace H_i^L generated by the subspace projection unit 16 to an external device (not shown). For example, when the face image recognition apparatus shown in FIG. 1 is connected as the external device, the constraint subspace L is stored in the constraint subspace storage unit 17 and the constraint dictionary subspace H_i^L is stored in the constraint dictionary subspace storage unit 19.
[0065]
(Constrained Subspace Generation Method) A method for generating a constrained subspace from each subspace will be described below.
[0066]
First, the two-category identification problem will be described and then extended to the case of multiple categories.
[0067]
(2-Category Identification Problem) In the 2-category identification problem, a difference subspace between two subspaces is applied as a constrained subspace. The difference subspace is a subspace representing a difference between two subspaces, and can be defined by using a sum of projection matrices for the two subspaces in addition to a definition based on a canonical angle formed by the two subspaces. In the present invention, the sum of projection matrices is used.
[0068]
The projection matrix X onto the n-dimensional subspace A and the projection matrix Y onto the m-dimensional subspace B are expressed by the following equation using the basis vectors φ_i and ψ_i of subspaces A and B, respectively. It follows that a subspace and its projection matrix correspond uniquely to each other.
[0069]
[Expression 2]

[0070]
Let G₂ = X + Y be the sum of the projection matrices. When m > n, G₂ has n × 2 eigenvectors corresponding to n × 2 positive eigenvalues. It can be shown mathematically that the difference subspace D₂ is the space spanned by the n eigenvectors d of G₂ whose eigenvalues are smaller than 1.0 among these n × 2 eigenvectors. Equation 3 expresses this. Here, S is the sum space of subspace A and subspace B, and P is the principal component subspace.
[0071]
[Equation 3]

[0072]
In addition, the principal component subspace P₂, spanned by the n eigenvectors of the sum of projection matrices G₂ whose eigenvalues are greater than 1.0 among its n × 2 eigenvectors, can be regarded as the common space of the two subspaces A and B, and can be interpreted as the space equally close to subspaces A and B.
[0073]
From the difference subspace D₂ and the principal component subspace P₂ shown above, it follows that, in general, the sum space S of two subspaces can be decomposed as a direct sum of a principal component subspace P and a difference subspace D. The difference subspace D can therefore be regarded as the space obtained by removing the principal component subspace P from the sum space S. Thus the difference subspace represents the difference between the two subspaces and, at the same time, is orthogonal to the average variation of the two subspaces; that is, it has the property, desirable for discrimination, of being hardly affected by that variation.
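The decomposition can be checked numerically. The sketch below takes the simplest case of two subspaces of equal dimension n in general position, where the nonzero eigenvalues of G₂ = X + Y are known to be 1 ± cos θ_i for the canonical angles θ_i; the function name and tolerance are illustrative assumptions.

```python
import numpy as np

def split_sum_space(A, B, tol=1e-10):
    """Split the sum space of span(A) and span(B) using G2 = A A^T + B B^T.

    A, B: (f, n) arrays with orthonormal columns.
    Returns (eigenvalues, D2 basis, P2 basis).
    """
    G2 = A @ A.T + B @ B.T
    w, V = np.linalg.eigh(G2)
    D2 = V[:, (w > tol) & (w < 1.0)]   # difference subspace: positive eigenvalues below 1
    P2 = V[:, w > 1.0]                 # principal component subspace: eigenvalues above 1
    return w, D2, P2
```

Since D₂ and P₂ are spanned by eigenvectors of the same symmetric matrix with distinct eigenvalues, they are mutually orthogonal, which is the property the text relies on.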
[0074]
(In the case of multi-category identification) Generalizing the definition that "the difference subspace D is the space obtained by removing the principal component subspace, which is the common space of the two subspaces, from the sum space of the two subspaces", the difference subspace D_n for n categories is defined as the space obtained by removing the principal component subspace P_n of the n subspaces from the sum space of n m-dimensional subspaces.
[0075]
Specifically, as shown in the following equation, the difference subspace D_n for n categories is the space spanned by the c eigenvectors d_{n×m}, d_{n×m-1}, d_{n×m-2}, ..., d_{n×m-c+1} of the sum matrix G_n of the projection matrices X_i calculated from the subspace of each category, selected in ascending order of eigenvalue starting from the smallest. However, when n × m is larger than the number of dimensions f of the normalized data, it is the space spanned by d_f, d_{f-1}, d_{f-2}, ..., d_{f-c+1}.
[0076]
[Expression 4]

[0077]
This generalized difference subspace D_n is also a desirable space for identification. This is because, as shown in FIG. 6, the principal component subspace P_n of the n subspaces represents the overall average of the pattern variations within each category. The subspace D_n, which is orthogonal to this principal component subspace P_n, therefore does not contain these pattern variations and mainly contains the other variations, that is, the variation components between categories.
[0078]
Here, the difference subspace D_n is defined as the constraint subspace L for multiple categories. In this way, using projection matrices allows the constraint subspace L to be generated analytically, by a simple calculation, even for the multi-category identification problem. As a result, the amount of calculation is reduced compared with the conventional procedural method, and the processing speed improves.
[0079]
It is experimentally known that when identifying a face pattern of 225-dimensional data, the value of c is preferably 150 to 180.
[0080]
In the present embodiment, face recognition using a luminance image captured by the camera 1101 has been described, but the present invention is not limited to this and can be applied to pattern recognition in general. For example, the present invention may be applied using as input a Fourier spectrum pattern generated from a grayscale image, as in Shigeru Akamatsu, Tsutomu Sasaki, Teruo Fukamachi, "Robust Front Face Identification by Gray Image Matching -Application of KL Expansion by Fourier Spectrum", IEICE Transactions (D-II), J76-DII, 7, pp. 1363-1373, 1993.
[0081]
In this embodiment, the input subspace and the dictionary subspace are generated from the normalized face patterns. Alternatively, they may be generated from data obtained by subtracting, from each normalized face pattern, the average face pattern of a large set of normalized face patterns. In this case, the constraint subspace is also generated, by the procedure described above, from subspaces generated from the average-subtracted data. Identification is performed after projecting the generated subspaces onto the constraint subspace, in the same manner as when the average face is not subtracted. Subtracting the average face in this way moves the starting point of the vectors from the origin to the center of the face set, so an improvement in identification performance can be expected.
[0082]
(Extension to non-linear discrimination) In this embodiment, linear discrimination is assumed. By using a calculation technique called kernel trick, an extension to nonlinear discrimination can be realized.
[0083]
First, the kernel trick method will be described. In the kernel trick method, an m-dimensional original pattern x is mapped to a space of a dimension dφ that is much higher than the original space A or an infinite-dimensional space (hereinafter referred to as a nonlinear space B) by nonlinear transformation φ.
[0084]
[Equation 5]

[0085]
Then, the mapping on the nonlinear space B is identified based on the subspace method. Conventionally, it has been said that high performance cannot be expected for identification in a high-dimensional space. However, by mapping the original pattern x into a high-dimensional space, it is possible to convert the distribution into a linearly separable distribution even when linear identification is impossible in the original space.
[0086]
In FIG. 8, the original patterns x1 and y1 cannot be linearly separated in the original space A, but the mapped patterns φ (x1) and φ (y1) in the high-dimensional space B can be.
[0087]
When this is applied to the discrimination between a face pattern and a non-face pattern that is very similar to the face, the face pattern and the non-face pattern can be discriminated by performing nonlinear transformation respectively.
[0088]
However, directly calculating the inner product (φ (x1), φ (y1)) of the mapped patterns φ (x1) and φ (y1) in the nonlinear space B is impractical because the amount of calculation is enormous. Moreover, the calculation is impossible when the nonlinear space B is infinite-dimensional.
[0089]
However, if the nonlinear transformation φ is defined via the kernel function k (x, y), the inner product (φ (x1), φ (y1)) can be calculated from the inner product (x1, y1). This is the kernel trick method.
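A small numerical check makes the idea concrete. The sketch assumes a degree-2 polynomial kernel k(x, y) = (1 + (x, y))², chosen only because its feature map can be written out explicitly; the kernel form and both function names are illustrative and are not taken from this description's kernel equations.

```python
import numpy as np
from itertools import combinations

def poly2_kernel(x, y):
    """Kernel value computed from the inner product in the original space."""
    return (1.0 + x @ y) ** 2

def poly2_feature_map(x):
    """Explicit nonlinear map phi for the degree-2 polynomial kernel above."""
    cross = [np.sqrt(2.0) * x[i] * x[j]
             for i, j in combinations(range(len(x)), 2)]
    return np.concatenate(([1.0], np.sqrt(2.0) * x, x ** 2, cross))
```

For any x and y, `poly2_feature_map(x) @ poly2_feature_map(y)` agrees with `poly2_kernel(x, y)`, so the inner product in the mapped space is obtained without ever constructing the mapped vectors.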
[0090]
For the nonlinear transformation φ to exist, the kernel function k (x, y) must satisfy Mercer's condition. For example, the following functions can be used as the kernel function k (x, y) (p is an arbitrary number).
[0091]
[Formula 6]

[0092]
For example, if the original pattern has 256 dimensions and p = 5, then expression (1) of Formula 6 maps it to a very high-dimensional space of about 10^10 dimensions. When expression (3) of Formula 6 is used, the pattern is mapped to an infinite-dimensional space. In expression (3), σ is an arbitrary real number determined according to the identification target.
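The order of magnitude quoted above can be reproduced under one common assumption: if expression (1) is an inhomogeneous polynomial kernel of degree p, its feature space is spanned by all monomials of degree at most p in d variables, of which there are C(d + p, p). The sketch below also includes a Gaussian kernel as a typical example of a kernel with an infinite-dimensional feature space; the exact forms of the expressions in Formula 6 are not recoverable from this text, so both functions are illustrative.

```python
import math
import numpy as np

def poly_feature_dim(d, p):
    """Number of monomials of degree <= p in d variables (assumed kernel form)."""
    return math.comb(d + p, p)

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel; the scaling by 2 * sigma**2 is one common convention."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-(diff @ diff) / (2.0 * sigma ** 2)))
```

With d = 256 and p = 5 the count is roughly 9.7 × 10^9, consistent with the figure of about 10^10 dimensions.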
[0093]
The kernel trick can be introduced into any discrimination method that consists of inner product calculations between vectors, so such methods are easily extended to nonlinear discrimination.
[0094]
The kernel nonlinear subspace method, a representative example, applies the subspace method in the nonlinear space. Its discrimination performance is greatly improved over the subspace method in the original space, and it has been confirmed to perform as well as, or depending on the problem better than, the support vector machine, which is currently regarded as one of the most powerful discrimination methods.
[0095]
This technique is described in detail in, for example, Koji Tsuda, "Subspace Method in Hilbert Space", IEICE Transactions (D-II), vol. J82-D-II, pp. 552-599, 1999, and Maeda Eisaku and Murase Hiroshi, "Pattern Recognition by Kernel Nonlinear Subspace Method", IEICE Transactions (D-II), vol. J82-D-II, no. 4, pp. 600-612, 1999.
[0096]
In addition, a kernel nonlinear mutual subspace method, in which the kernel trick is introduced into the mutual subspace method, and a kernel projection distance method, in which the kernel trick is introduced into the projection distance method, have been proposed.
[0097]
The kernel nonlinear mutual subspace method is described in detail in, for example, Akira Sakano, Naoki Takegawa, Taichi Nakamura, "Object recognition by kernel nonlinear mutual subspace method", IEICE Transactions (D-II), vol. J84-D, No. 8, pp. 1549-1556, 2001.
[0098]
The generation of the constrained subspace using the projection matrix and the projection of the subspace onto the constrained subspace in this embodiment are all configured by calculating the inner product of the vectors. Therefore, it can be extended to nonlinear discrimination by introducing a kernel trick as in the kernel subspace method.
[0099]
[Effect of the invention]
As described above, according to the present invention, a constrained subspace can be easily generated from the sum of projection matrices for each subspace, so that highly accurate identification processing by the constrained mutual subspace method can be executed at a higher speed than in the past.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration of a face image recognition apparatus to which a pattern recognition apparatus according to an embodiment of the invention is applied.
FIG. 2 is a diagram for explaining a dictionary subspace generation procedure and a configuration of main parts involved in dictionary subspace generation.
FIG. 3 is a diagram for explaining the appearance of a face image recognition apparatus to which the pattern recognition apparatus according to an embodiment of the present invention is applied.
FIG. 4 is a diagram for explaining a generation process of a restricted subspace.
FIG. 5 is a block diagram illustrating a configuration of a pattern recognition dictionary generating apparatus that creates a constraint subspace and a constraint dictionary subspace.
FIG. 6 is a diagram for explaining the relationship between a principal component subspace and a restricted subspace.
FIG. 7 is a diagram for explaining the concept of a constrained mutual subspace method.
FIG. 8 is a diagram for explaining a concept of higher dimension by nonlinear transformation.
[Explanation of symbols]
11 Image input unit
12 Face region extraction unit
13 Face feature point extraction unit
14 Normalized image generation unit
15 Subspace generation unit
16 Subspace projection unit
17 Constraint subspace storage unit
18 Mutual subspace similarity calculation unit
19 Constraint dictionary subspace storage unit
20 Determination unit
21 Display unit
22 Subspace storage unit
23 Constraint subspace generation unit
24 Output unit
1101 Camera
2101 Monitor
2102 Speaker
5001 Operation interface

Claims

1. A pattern recognition dictionary generation device comprising:
a dictionary pattern input unit for inputting a dictionary pattern;
a dictionary subspace generation unit that generates m dictionary subspaces corresponding to each of m categories from the dictionary pattern;
a constraint subspace generation unit that obtains a sum matrix of the projection matrices onto each of the dictionary subspaces, and generates one constraint subspace from c eigenvectors selected, in ascending order of eigenvalue, from among the eigenvectors of the sum matrix;
a dictionary projection unit that projects each of the dictionary subspaces onto the constraint subspace to generate m constraint dictionary subspaces corresponding to each of the categories; and
an output unit for outputting the constraint subspace and the constraint dictionary subspaces.

2. The pattern recognition dictionary generation device according to claim 1, wherein the constraint subspace generation unit obtains the projection matrix from the basis vectors of each of the dictionary subspaces.

3. The pattern recognition dictionary generation device, wherein the constraint subspace generation unit generates the constraint subspace after integrating the dictionary subspaces generated from dictionary patterns belonging to the same category into one subspace.

4. A pattern recognition apparatus comprising:
a constraint subspace storage unit that stores one constraint subspace generated from c eigenvectors selected, in ascending order of eigenvalue, from among the eigenvectors of a sum matrix obtained from the projection matrices onto the dictionary subspaces corresponding to each of m categories;
a constraint dictionary subspace storage unit that stores m constraint dictionary subspaces obtained by projecting each of the dictionary subspaces onto the constraint subspace;
an input unit for inputting an input pattern to be recognized;
a subspace generation unit for generating an input subspace from the input pattern;
a subspace projection unit for projecting the input subspace onto the constraint subspace to obtain a constraint input subspace; and
an identification unit for obtaining a canonical angle between the constraint input subspace and each constraint dictionary subspace, and identifying the recognition target using the canonical angle.

5. The pattern recognition apparatus according to claim 4, wherein, among the outputs from the pattern recognition dictionary generation device according to claim 1, the constraint subspace is stored in the constraint subspace storage unit and the constraint dictionary subspaces are stored in the constraint dictionary subspace storage unit.

6. A pattern recognition dictionary generation method comprising:
a dictionary pattern input step of inputting a dictionary pattern;
a dictionary subspace generation step of generating m dictionary subspaces corresponding to each of m categories from the dictionary pattern;
a constraint subspace generation step of obtaining a sum matrix of the projection matrices onto each of the dictionary subspaces, and generating one constraint subspace from c eigenvectors selected, in ascending order of eigenvalue, from among the eigenvectors of the sum matrix;
a dictionary projection step of projecting each of the dictionary subspaces onto the constraint subspace to generate m constraint dictionary subspaces corresponding to each of the categories; and
a step of outputting the constraint subspace and the constraint dictionary subspaces.

7. The pattern recognition dictionary generation method according to claim 6, wherein, in the constraint subspace generation step, the projection matrix is obtained from the basis vectors of each of the dictionary subspaces.

8. The pattern recognition dictionary generation method, wherein, in the constraint subspace generation step, the constraint subspace is generated after integrating the dictionary subspaces generated from dictionary patterns belonging to the same category into one subspace.

9. A pattern recognition method using a constrained mutual subspace method, in which an input subspace representing an input pattern as a subspace and dictionary subspaces representing dictionary patterns as subspaces are projected onto a constraint subspace that suppresses the components common to the dictionary subspaces and are then compared for identification, the method comprising:
integrating the dictionary subspaces of the dictionary patterns belonging to each of m categories into one subspace corresponding to each of the categories, and then obtaining a projection matrix for each of the categories;
calculating a sum matrix from the m projection matrices; and
determining, as the constraint subspace, the subspace spanned by c eigenvectors selected, in ascending order of eigenvalue, from among the eigenvectors of the sum matrix.