JP2004258907A - Facial part tracking device

Info

Publication number: JP2004258907A
Application number: JP2003047918A
Authority: JP
Inventors: Kinya Iwamoto; 欣也岩本; Masayuki Kaneda; 雅之金田; Haruo Matsuo; 治夫松尾
Original assignee: Nissan Motor Co Ltd
Current assignee: Nissan Motor Co Ltd
Priority date: 2003-02-25
Filing date: 2003-02-25
Publication date: 2004-09-16
Anticipated expiration: 2023-02-25
Also published as: JP4107104B2

Abstract

PROBLEM TO BE SOLVED: To provide a facial part tracking device capable of increasing accuracy and processing speed in identifying a facial part to be tracked.

SOLUTION: A facial part detecting means CL2 receives an image from a facial image pickup means CL1 and detects the facial part to be tracked from the entire image. Thereafter, a facial part search area setting means CL31 sets a search area for the facial part on the basis of the detected position of the tracking target on the image. A preferred facial part search area setting means CL32 sets a preferred search area within the search area of the facial part. A facial part candidate extracting means CL33 extracts candidates for the tracking target from within the search area of the facial part. When an extracted candidate is located within the preferred search area, a facial part determining means CL34 determines the candidate to be the tracking target; when the candidate is located outside the preferred search area but within the search area, the determining means processes an image of the candidate to determine whether or not it is the tracking target.

COPYRIGHT: (C)2004, JPO & NCIPI

Description

【０００１】
【Technical Field of the Invention】
The present invention relates to a facial part tracking device.
【０００２】
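【Prior Art】
Facial part tracking devices are known that detect the facial part to be tracked in a captured image of a subject's face and then track that part. Such a device first stores a standard template and uses it to extract the facial part to be tracked from the captured image. It then stores the extracted image as a tracking template and uses that template to track the facial part (see, for example, Patent Document 1).
【０００３】
Another facial part tracking device determines extraction points by selecting one pixel for each local rise in density along each vertical pixel column of the captured image. Extraction points that line up in the horizontal direction of the image are treated as a group of curves, and the device judges whether each group matches a predetermined shape of the facial part to be tracked (for an eye, for example, whether it is horizontally long) in order to detect the position of the tracking target. It then sets an existence area based on the detected target, binarizes the existence area to locate the target precisely, and uses that position to place the existence area in the next cycle. Repeating this processing tracks the desired facial part (see, for example, Patent Document 2).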
【０００４】
【Patent Document 1】
JP 2000-163564 A
【０００５】
【Patent Document 2】
JP H10-143669 A
【０００６】
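【Problem to Be Solved by the Invention】
However, the device of Patent Document 1 repeats pattern matching with the standard and tracking templates on each frame until the specific facial part is found. Tracking the facial part in real time therefore demands a very high computational load.
【０００７】
The device of Patent Document 2 does not judge whether the object inside the existence area is actually the facial part to be tracked, so it may mistakenly track something other than the desired facial part.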
【０００８】
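【Means for Solving the Problem】
According to the present invention, a facial part tracking device that tracks the movement of a facial part based on input images of a subject's face operates as follows. The facial part detecting means detects the facial part to be tracked from the whole of an input captured image. The facial part search area setting means sets, for images input after the detection, a facial part search area based on the on-image position of the tracking-target facial part detected by the facial part detecting means. The preferred facial part search area setting means sets a preferred facial part search area inside that facial part search area. The candidate extracting means extracts facial part candidates from within the facial part search area. The first facial part determining means determines an extracted candidate to be the tracking-target facial part when the candidate lies within the preferred facial part search area. The second facial part determining means, when an extracted candidate lies outside the preferred facial part search area but inside the facial part search area, processes the image of the candidate to determine whether it is the tracking-target facial part. The facial part search area setting means sets the facial part search area based on the amount the tracking target moves during the sampling time when the subject changes the direction of the face.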
【０００９】
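【Effects of the Invention】
According to the present invention, a facial part search area narrower than the whole image is set based on the on-image position of the detected tracking target. Candidates can therefore be extracted from the area where the facial part is likely to be, rather than from the whole image, which speeds up processing.
【００１０】
In addition, when an extracted candidate lies inside the facial part search area but outside the preferred facial part search area, the image of that candidate is processed to determine whether the candidate is the tracking target. This prevents the device from mistakenly tracking something that is not the tracking target.
【００１１】
Accordingly, both accuracy and processing speed are improved in determining the facial part to be tracked.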
【００１２】
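【Embodiments of the Invention】
A preferred embodiment of the present invention is described below with reference to the drawings.
【００１３】
FIG. 1 is a functional block diagram showing the configuration of a facial part tracking device according to an embodiment of the present invention. As shown in the figure, the facial part tracking device 1 tracks the movement of a facial part based on input images of a subject's face, and comprises a facial image pickup means CL1, a facial part detecting means CL2, and a facial part tracking means CL3.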
【００１４】
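The facial image pickup means CL1 images the subject's face to obtain a captured image containing the facial part to be tracked, and sends the image data to the facial part detecting means CL2 and the facial part tracking means CL3.
【００１５】
The facial part detecting means CL2 detects the facial part to be tracked from the whole of the input captured image. The facial part tracking means CL3 tracks the movement of that facial part based on signals from CL1 and CL2.
【００１６】
The facial part tracking means CL3 comprises a facial part search area setting means CL31 and a preferred facial part search area setting means CL32. It also comprises a facial part candidate extracting means (candidate extracting means) CL33 and a facial part determining means (first and second facial part determining means) CL34.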
【００１７】
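When CL2 has detected the tracking-target facial part, the facial part search area setting means CL31 processes the images input after the detection: based on the on-image position of the tracking target, it sets a facial part search area narrower than the whole image. The facial part search area is set, for example, based on the amount the tracking target moves during the sampling time when the subject changes the direction of the face.
【００１８】
The preferred facial part search area setting means CL32 sets a preferred facial part search area inside the facial part search area. The preferred area is set, for example, based on the amount the tracking-target facial part moves during the sampling time while the subject is looking in one direction.
【００１９】
The facial part candidate extracting means CL33 extracts candidates for the tracking-target facial part from within the facial part search area. Unlike CL2, CL33 does not search the whole captured image, so it runs faster than CL2.
【００２０】
The facial part determining means CL34 determines whether a candidate extracted by CL33 is the tracking target. Specifically, CL34 determines a candidate to be the tracking target when it lies within the preferred facial part search area. When the candidate lies outside the preferred area but inside the facial part search area, CL34 processes the image of the candidate to determine whether it is the tracking target.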
【００２１】
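In this facial part tracking device 1, CL1 first images the subject's face and sends the resulting image data to CL2, which detects the tracking-target facial part from the whole image.
【００２２】
When CL1 subsequently obtains a captured image, it sends the image data to CL3. CL3 then sets the facial part search area by means of CL31 and the preferred facial part search area by means of CL32.
【００２３】
CL33 then extracts candidates for the tracking-target facial part from within the facial part search area of the captured image. After extraction, CL34 judges which area each candidate belongs to and determines whether it is the tracking-target facial part. If the candidate is within the preferred facial part search area, CL34 determines it to be the tracking target. If the candidate is outside the preferred area but inside the facial part search area, CL34 processes the image of the candidate and decides from the result whether it is the tracking-target facial part. The device 1 then tracks the facial part based on this determination.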
【００２４】
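To judge accurately whether a candidate is inside the facial part search area and the preferred facial part search area, the device 1 assigns a candidate point to each extracted candidate. That is, CL33 extracts a candidate and defines a candidate point that specifies its position. CL34 determines a candidate to be the tracking target when its candidate point lies within the preferred facial part search area. When the candidate point lies outside the preferred area but inside the facial part search area, CL34 processes an image containing that candidate to determine whether the candidate is the tracking target.
【００２５】
Basing the judgment on a point eliminates the situation in which part of a candidate is inside the preferred facial part search area and part of it is outside, so the processing is more accurate.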
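The point-based decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Region` class, the `classify_candidate` function, and the returned labels are hypothetical names, and both areas are assumed to be axis-aligned rectangles given by a center and half sizes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Axis-aligned rectangle given by its center and half sizes (assumed layout)."""
    cx: int
    cy: int
    half_w: int
    half_h: int

    def contains(self, x: int, y: int) -> bool:
        return abs(x - self.cx) <= self.half_w and abs(y - self.cy) <= self.half_h


def classify_candidate(point: Tuple[int, int], search: Region,
                       preferred: Optional[Region]) -> str:
    """Decide how a candidate point is handled.

    "accept": inside the preferred area, taken as the tracking target.
    "verify": inside the search area only, image processing must confirm it.
    "reject": outside the search area.
    """
    x, y = point
    if preferred is not None and preferred.contains(x, y):
        return "accept"
    if search.contains(x, y):
        return "verify"
    return "reject"
```

Because only the single candidate point is tested, a candidate can never straddle the boundary of the preferred area.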
【００２６】
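The device 1 can be used to track facial parts of drivers of automobiles, railway vehicles, and ships, of plant operators, and the like. The following description covers the case in which it is applied to the driver of an automobile, and in particular to the left eye. The device 1 is not limited to tracking eyes; eyebrows, the nose, the mouth, ears, and so on can be tracked in the same way.
【００２７】
FIG. 2 is a hardware configuration diagram of the facial part tracking device according to the embodiment of the present invention. As shown in the figure, a TV camera 2 serving as the facial image pickup means CL1 is mounted on the instrument panel of the automobile. The TV camera 2 is placed where it can image the driver from approximately the front, and captures at least the driver's face. In this embodiment the input image of the TV camera 2 consists of, for example, 640 pixels horizontally (X) and 480 pixels vertically (Y). The input image captured by the TV camera 2 is fed as image data to a microcomputer 3 installed inside the vehicle body, for example behind the instrument panel.
【００２８】
The microcomputer 3 is programmed with the logic that constitutes the facial part detecting means CL2 and the facial part tracking means CL3. The logic of CL3 includes the logic of the facial part search area setting means CL31, the preferred facial part search area setting means CL32, the facial part candidate extracting means CL33, and the facial part determining means CL34.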
【００２９】
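Next, the operation of the facial part tracking device 1 according to this embodiment is described. FIG. 3 is a main flowchart outlining the operation of the device 1. As shown in the figure, when processing starts, the microcomputer 3 first executes initial value input processing (ST10), in which constants such as the sampling time are read.
【００３０】
The microcomputer 3 then sets the tracking target detection flag "GetFlag", which indicates whether the tracking-target facial part has been found, to "FALSE" (ST11), and initializes the processed-frame counter "i" to "0" (ST12).
【００３１】
After initialization, the microcomputer 3 executes end judgment processing (ST13), based for example on whether the engine is running.
【００３２】
The microcomputer 3 then judges whether the state is "STOP" (ST14). If, for example, it judges that the engine is not running, it judges the state to be "STOP" (ST14: YES) and processing ends.
【００３３】
If, on the other hand, it judges that the state is not "STOP" because, for example, the engine is running and the vehicle is traveling (ST14: NO), the microcomputer 3 executes face image capture processing (ST15), and the TV camera 2 images the driver's face.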
【００３４】
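The microcomputer 3 then judges whether the tracking target detection flag "GetFlag" is "FALSE" (ST16), that is, whether the tracking-target facial part has been found.
【００３５】
If "GetFlag" is "FALSE" and the tracking-target facial part has therefore not been found (ST16: YES), the microcomputer 3 executes tracking target detection processing (ST17). Step ST17 is the processing performed by the facial part detecting means CL2 described with FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL2. As described later, when the tracking-target facial part is found in this processing, "GetFlag" is set to "TRUE".
【００３６】
After the tracking target detection processing, the microcomputer 3 increments the processed-frame counter "i" (ST18), and processing returns to step ST13.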
【００３７】
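Processing then passes through steps ST13 to ST15 and reaches step ST16. If the tracking-target facial part was found in the preceding tracking target detection processing (ST17), "GetFlag" is now "TRUE". The flag is therefore judged not to be "FALSE" (ST16: NO), and the microcomputer 3 executes tracking processing (ST19). Step ST19 is the processing performed by the facial part tracking means CL3 described with FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL3, and the facial part is tracked.
【００３８】
Processing then moves to step ST18, the processed-frame counter is incremented, and processing returns to step ST13. This is repeated until step ST14 yields "YES".
【００３９】
As described with FIG. 1, CL2 processes the whole captured image to detect the tracking-target facial part, whereas CL3 sets areas in the captured image and determines and tracks the facial part within them. The device 1 therefore processes the whole image at least once but afterwards processes only part of the image, so it is faster than a device that always processes the whole image.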
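The control flow of FIG. 3 can be summarized in a short sketch. It is only an outline under simplifying assumptions: `run_main_loop`, `detect`, and `track` are hypothetical names, the end of the frame sequence stands in for the "STOP" judgment, and `track` is assumed to return the new value of GetFlag (False once re-detection is required).

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")


def run_main_loop(frames: Iterable[Frame],
                  detect: Callable[[Frame], bool],
                  track: Callable[[Frame], bool]) -> List[str]:
    """Alternate between whole-image detection and area-limited tracking."""
    get_flag = False          # ST11: tracking target not found yet
    i = 0                     # ST12: processed-frame counter
    log = []
    for frame in frames:      # ST13-ST15: capture a frame until "STOP"
        if not get_flag:      # ST16: YES, target not found
            get_flag = detect(frame)   # ST17: search the whole image
            log.append("detect")
        else:                 # ST16: NO, target already found
            get_flag = track(frame)    # ST19: search only the set areas
            log.append("track")
        i += 1                # ST18
    return log
```

For example, `run_main_loop([0, 1, 2, 3], lambda f: f >= 1, lambda f: f != 2)` records two detection passes, one tracking pass that loses the target, and a renewed detection pass.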
【００４０】
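Next, the tracking target detection processing (ST17) is described in detail. FIG. 4 is a flowchart showing the detailed operation of the tracking target detection processing (ST17) of FIG. 3.
【００４１】
As shown in the figure, when step ST16 yields "YES", the microcomputer 3 executes processing to specify the positions of tracking target candidates (ST20). This processing locates candidates over the whole image, and specifies one or more positions of candidates that may be the tracking-target facial part.
【００４２】
The microcomputer 3 then executes tracking target determination processing (ST21). This processing selects one of the candidates specified in ST20 and judges whether the selected candidate is the tracking target.
【００４３】
Based on the result of ST21, the microcomputer 3 then judges whether the selected candidate was determined to be the tracking target (ST22).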
【００４４】
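If the candidate was not determined to be the tracking target (ST22: NO), the microcomputer 3 judges whether all of the specified candidates have been examined (ST24).
【００４５】
If all have been examined (ST24: YES), processing moves to step ST18 of FIG. 3. If not (ST24: NO), processing returns to step ST21.
【００４６】
If the candidate was determined to be the tracking target in step ST22 (ST22: YES), the microcomputer 3 sets the tracking target detection flag "GetFlag" to "TRUE" (ST23), and processing moves to step ST18 of FIG. 3.
【００４７】
In this way, the device 1 specifies over the whole image one or more candidates that may be the desired facial part, and examines them one by one to detect the tracking target. The processing that specifies these candidates over the whole image (step ST20) is performed as follows.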
【００４８】
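FIG. 5 is a flowchart showing the details of the tracking target candidate position specifying processing (ST20) of FIG. 4. First, the microcomputer 3 stores the entire captured image data in the image memory as the whole image (ST30).
【００４９】
Next, the microcomputer 3 makes the judgment of step ST31, which is described later. If step ST31 yields "NO", the microcomputer 3 computes the arithmetic mean of density values along a single line of the vertical (Y-axis) pixel columns of the whole image (ST32).
【００５０】
This arithmetic mean operation takes a predetermined number of vertically adjacent pixels, computes their mean density, and assigns that mean as the density value of one of those pixels. For example, when the predetermined number is 5, the 1st to 5th pixels from the top of the screen are selected, their mean is computed, and the mean becomes the density value of the 5th pixel. Next, the 2nd to 6th pixels from the top are selected, and their mean becomes the density value of the 6th pixel. This is repeated to obtain the mean density for all pixels of the line.
【００５１】
The arithmetic mean operation removes small fluctuations in density that arise when the image is captured, so that the device 1 can capture the overall trend of the density values.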
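The worked example above (window of 5, mean written to the last pixel of the window) can be sketched as below. The function name `trailing_mean` is hypothetical, and leaving the first `window - 1` pixels unchanged is an assumption, since the text does not say how the top of the line is handled.

```python
from typing import List, Sequence


def trailing_mean(column: Sequence[float], window: int = 5) -> List[float]:
    """Smooth one vertical pixel line with a trailing arithmetic mean.

    The mean of pixels k-window+1 .. k becomes the density of pixel k.
    Pixels above the first full window are left unchanged (assumption).
    """
    out = [float(v) for v in column]
    for k in range(window - 1, len(column)):
        out[k] = sum(column[k - window + 1:k + 1]) / window
    return out
```

The means are computed from the original values rather than from already smoothed ones, which matches the description of selecting pixels 1 to 5, then 2 to 6, and so on.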
【００５２】
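After the arithmetic mean operation, the microcomputer 3 differentiates the mean values in the vertical direction (ST33) and extracts points based on the differential values (ST34). Point extraction selects one pixel for each local rise of the mean pixel density along the vertical pixel column, for example the pixel at which the differential of the mean value changes from negative to positive.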
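A minimal sketch of the point extraction in ST33 and ST34 follows. The name `extract_points` is hypothetical, the derivative is assumed to be a simple forward difference, and flat stretches (zero difference) are assumed not to produce points.

```python
from typing import List, Sequence


def extract_points(smoothed: Sequence[float]) -> List[int]:
    """Return indices where the vertical difference turns from negative to positive."""
    # Forward difference along the column: diff[k] = smoothed[k + 1] - smoothed[k]
    diff = [b - a for a, b in zip(smoothed, smoothed[1:])]
    points = []
    for k in range(1, len(diff)):
        if diff[k - 1] < 0 < diff[k]:
            points.append(k)  # pixel k sits at the turning point
    return points
```

Each returned index marks one extraction point on the line; running this over every vertical line yields the points that are later linked horizontally.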
【００５３】
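After selecting the point pixels, the microcomputer 3 switches from the line just processed to the next line (ST35).
【００５４】
The microcomputer 3 then judges whether point extraction has finished for all vertical lines (ST31). If not (ST31: NO), it performs steps ST32 to ST35 and returns to step ST31.
【００５５】
If point extraction has finished for all lines (ST31: YES), the Y coordinates of the extraction points on adjacent lines are compared. When the difference in Y coordinates is within a predetermined value, the points are treated as continuous data, and the following are stored: (i) the group number of the continuous data, (ii) the line number at which the continuity starts, and (iii) the number of continuous data points. Also stored are (iv) the mean vertical position of the extraction points forming the continuous data (the representative vertical position of the continuous data) and (v) the mean horizontal position of the start and end lines (the representative horizontal position of the continuous data) (ST36).
【００５６】
Because the tracking target in this embodiment is an eye, the continuous data extends relatively far in the horizontal direction. After forming the continuous data, the microcomputer 3 can therefore select continuous data on the condition that it continues horizontally for at least a predetermined length.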
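The grouping of ST36 can be illustrated with the sketch below. It is one possible reading, not the source's algorithm: `build_runs`, `y_tol`, and `min_len` are hypothetical names and values, each column is assumed to hold distinct Y values, and a run is extended greedily by the first unused point within the tolerance.

```python
from typing import Dict, List, Sequence


def build_runs(points_by_column: Sequence[Sequence[int]],
               y_tol: int = 2, min_len: int = 3) -> List[Dict[str, float]]:
    """Link extraction points of adjacent vertical lines into continuous data."""
    finished = []
    open_runs = []  # runs that reached the previous column
    for x, ys in enumerate(points_by_column):
        next_open, used = [], set()
        for run in open_runs:
            last_y = run["ys"][-1]
            match = next((y for y in ys
                          if y not in used and abs(y - last_y) <= y_tol), None)
            if match is None:
                finished.append(run)      # continuity ended at column x - 1
            else:
                used.add(match)
                run["ys"].append(match)
                next_open.append(run)
        for y in ys:
            if y not in used:             # start a new run at this column
                next_open.append({"start": x, "ys": [y]})
        open_runs = next_open
    finished.extend(open_runs)

    result = []
    for run in finished:
        n = len(run["ys"])
        if n >= min_len:                  # keep horizontally long data only
            end = run["start"] + n - 1
            result.append({
                "start": run["start"],             # (ii) start line number
                "count": n,                        # (iii) number of points
                "rep_y": sum(run["ys"]) / n,       # (iv) mean vertical position
                "rep_x": (run["start"] + end) / 2  # (v) mean of start and end lines
            })
    return result
```

The position of each entry in the returned list can serve as the group number (i), and the pair (`rep_x`, `rep_y`) corresponds to the representative position of the continuous data.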
【００５７】
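The microcomputer 3 then defines a representative coordinate value C for each continuous data and sets an existence area EA with C as the reference (ST37). The representative coordinate value C is determined by the mean X coordinate and the mean Y coordinate stored in step ST36 (the means of (iv) and (v) above). The existence area EA is described later with reference to FIGS. 6 to 11.
【００５８】
After the representative coordinate value C has been defined and the existence area EA set, processing moves to step ST21 of FIG. 4. This completes the tracking target candidate position specifying processing (ST20). The continuous data obtained in this way become the eye candidates, and the representative coordinate value C of each continuous data becomes the position of the eye candidate point.
【００５９】
Next, the continuous data formed when the extraction points determined for each vertical pixel column are adjacent in the horizontal direction of the image, together with its representative coordinate value C and existence area EA, are described.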
【００６０】
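FIG. 6 is an explanatory diagram showing the continuous data formed in step ST36 of FIG. 5 and the representative coordinate value C and existence area EA defined in step ST37. The tracking target candidate position specifying processing (ST20) specifies one or more candidates; FIG. 6 illustrates the case in which several candidates are specified.
【００６１】
As shown in the figure, the microcomputer 3 forms several continuous data G. Because the eye is the detection target, things with features similar to an eye (the mouth, the nose, eyebrows, and so on) are also detected.
【００６２】
As described above, continuous data G is formed when the extraction points determined for each vertical pixel column are adjacent in the horizontal direction of the image. The representative coordinate value C is determined by the mean X coordinate of the pixels at the two horizontal ends of the continuous data and the mean Y coordinate of the pixels forming the continuous data. The existence area EA is set with this representative coordinate value C as the reference.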
【００６３】
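Next, the method of setting the existence area EA is described. FIG. 7 is an explanatory diagram showing the size of the existence area EA of FIG. 6. FIGS. 8 and 9 are explanatory diagrams showing statistical data on the horizontal length Xa and the vertical length Ya obtained by measuring the eye sizes of several people. FIG. 10 is an explanatory diagram showing how the position of the existence area EA on the image is determined.
【００６４】
The existence area EA is set by first determining its size and then determining its position on the image.
【００６５】
The existence area EA should be as small as possible, both to reduce noise (such as wrinkles or light and shade on the face being extracted) and to avoid slowing the processing. In this embodiment, the sizes of the facial parts of several people were measured and a margin (for example, a factor of 1.5) was applied to determine the size of the existence area EA. That is, as in FIGS. 8 and 9, data on the vertical and horizontal dimensions of the facial part are collected, and the size is determined by applying a margin to the dimensions that cover, for example, 95% of the distribution.
【００６６】
The dimensions covering 95%, namely the horizontal dimension xa and the vertical dimension ya, are multiplied by the margin (×1.5) to determine the size (FIG. 7). Alternatively, the width and height of the facial part may be estimated by image processing, and the size of the existence area EA may be obtained by adding a margin to those dimensions.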
【００６７】
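After the size of the existence area EA has been determined in this way, a reference point P is determined with, for example, the eye coordinates (x1, y1) as the reference, as shown in FIG. 10. The reference point P is placed at a position separated from the eye coordinates (x1, y1) by the distances x2 and y2.
【００６８】
The microcomputer 3 then draws the existence area EA with dimensions x3 and y3 starting from the point P, which fixes the position of the existence area EA. An existence area EA is then set for every continuous data G found in the whole image.
【００６９】
It is desirable that x2 and y2 be 1/2 of x3 and y3, respectively, so that the existence area EA is centered on the eye.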
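The sizing and placement of the existence area EA can be sketched as follows. The function name `existence_area` is hypothetical, the margin of 1.5 and the half-size offsets follow the text, and placing P up and to the left of the eye coordinates (so that P is the top-left corner) is an assumption about the drawing convention of FIG. 10.

```python
from typing import Tuple


def existence_area(x1: int, y1: int, xa: float, ya: float,
                   margin: float = 1.5) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the existence area EA.

    xa, ya: dimensions covering e.g. 95% of measured eye sizes.
    x3, y3: area size after applying the margin.
    x2, y2: offsets from the eye coordinates to reference point P (half of x3, y3).
    """
    x3, y3 = int(xa * margin), int(ya * margin)
    x2, y2 = x3 // 2, y3 // 2
    # Reference point P, assumed to be the top-left corner of the area.
    px, py = x1 - x2, y1 - y2
    return px, py, x3, y3
```

With eye coordinates (100, 80) and 95% dimensions of 40 by 20 pixels, this yields a 60 by 30 pixel area whose top-left corner is (70, 65), centered on the eye.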
【００７０】
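The processing of FIGS. 5 to 10 above constitutes the tracking target candidate position specifying processing (ST20) of FIG. 4.
【００７１】
Next, the tracking target determination processing (ST21) of FIG. 4 is described. FIG. 11 is a flowchart showing the details of the tracking target determination processing (ST21) of FIG. 4.
【００７２】
First, the microcomputer 3 stores the image data of the existence area EA obtained by the processing of FIG. 5 in the image memory as a micro image IG (ST40). FIG. 12 is an explanatory diagram showing the micro image, that is, the relationship between the whole image and the micro image IG stored in the image memory. As shown in FIG. 12, the microcomputer 3 extracts the image inside the existence area EA from the whole image and uses it as the micro image IG.
【００７３】
Returning to FIG. 11, the microcomputer 3 takes the representative coordinate value C of the whole image as the representative coordinate value IC of the micro image IG. It then sets a range AR with reference to IC and sets a binarization threshold based on the density information of the range AR (ST41).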
【００７４】
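An example of how the binarization threshold is calculated in the range AR is described with reference to FIG. 13, which is an explanatory diagram of the calculation method. First, the microcomputer 3 reads the density values of several vertical lines in the range AR.
【００７５】
For each line, the microcomputer 3 stores the highest (brightest) density value and the lowest (darkest) density value. When all lines have been stored, it finds the lowest value among the per-line highest (brightest) values (the skin part) and the lowest value among the per-line lowest (darkest) values (the eye part). The midpoint of these two values is used as the binarization threshold.
【００７６】
To determine the binarization threshold properly, the range AR is set so that it contains both the dark part of the eye and the light skin around the eye. The range AR is also kept to the minimum necessary size to reduce the influence of variations in image brightness.
【００７７】
Furthermore, taking the midpoint between the lowest (darkest) density value of the eye and the lowest (darkest) density value of the skin part within the range AR gives a threshold suited to separating the eye from the skin.
【００７８】
The lowest (darkest) density value of the skin part is used for the following reason. When, for example, direct light falls on part of the range AR, the skin tends to reflect light more strongly than the dark part of the eyeball. The device 1 would then take in a large amount of light that amounts to noise.
【００７９】
In that case, even if the range AR from which density values are read is made as small as possible, the image is affected by the noise light and the device 1 cannot determine an accurate binarization threshold. This embodiment therefore avoids the high-density parts that may be reflecting strongly and uses the lowest (darkest) density value of the skin part, so that a more appropriate binarization threshold can be determined.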
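The threshold rule described above can be written compactly. This is a sketch only: `binarization_threshold` is a hypothetical name, and each element of `lines` is assumed to be the list of density values read along one vertical line of the range AR, with larger values meaning brighter pixels.

```python
from typing import Sequence


def binarization_threshold(lines: Sequence[Sequence[int]]) -> float:
    """Midpoint between the darkest 'skin' value and the darkest 'eye' value."""
    # Lowest of the per-line maxima: skin, chosen dark to avoid strong reflections.
    skin = min(max(line) for line in lines)
    # Lowest of the per-line minima: the dark part of the eye.
    eye = min(min(line) for line in lines)
    return (skin + eye) / 2
```

For three lines with maxima 200, 250, and 220 and minima 40, 30, and 50, the skin value is 200, the eye value is 30, and the threshold is 115. The bright outlier 250, which might be a reflection, does not influence the result.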
【００８０】
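Returning to FIG. 11, after determining the binarization threshold the microcomputer 3 binarizes the micro image IG with that threshold and stores the result in the image memory as a binary image bG (ST42).
【００８１】
Next, the microcomputer 3 takes the representative coordinate value C of the whole image as the position bC in the binary image bG and sets this position as the initial position (ST43). It then judges whether the set position is a black pixel (ST44); here, the initial position set in step ST43 is examined.
【００８２】
If the set position is not a black pixel (ST44: NO), the microcomputer 3 shifts the set position up, down, left, and right by one pixel at a time (ST45) and again judges whether the shifted position is a black pixel. This is repeated until a black pixel is found.
【００８３】
If the set position is a black pixel (ST44: YES), the microcomputer 3 sets the connected component containing that black pixel as the candidate object (ST46) and calculates the geometric shape of the candidate object (ST47).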
【００８４】
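After the calculation, the microcomputer 3 compares the geometric shape of the candidate object with that of a prestored template of the tracking target (ST48). An example of this comparison is described with reference to FIG. 14.
【００８５】
FIG. 14 is an explanatory diagram of the method of comparing the geometric shapes of the candidate object and the template of the eye to be tracked: (a) shows a candidate object captured under optimal conditions, (b) shows a state in which the right side of the eye is missing, and (c) shows a state in which the left side of the eye is missing.
【００８６】
With good, stable lighting, the binarized shape of an eye looks like FIG. 14(a). When the lighting deteriorates, for example because direct sunlight enters the cabin from one side, part of the shape may be missing, as in FIGS. 14(b) and (c).
【００８７】
To judge such candidate objects accurately, the microcomputer 3 compares them against three conditions. Condition (i) is that the width is at least 2/3 of the typical width of an eye and that the shape has an upward-convex curvature within a predetermined range. Condition (ii) is that the concavity on the left side of the iris is present. Condition (iii) is that the concavity on the right side of the iris is present.
【００８８】
Returning to FIG. 11, after comparing the geometric shapes the microcomputer 3 judges, based on the three conditions, whether the geometric shape of the candidate object matches that of the eye template (ST49). To allow for cases in which part of the eye shape is missing, as in FIGS. 14(b) and (c), the microcomputer 3 judges that shapes satisfying conditions (i) and (ii), and shapes satisfying conditions (ii) and (iii), match.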
【００８９】
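If the shapes do not match (ST49: NO), the microcomputer 3 determines that the candidate object is not the tracking-target facial part (ST50), and processing moves to step ST22 of FIG. 4.
【００９０】
If the shapes match (ST49: YES), the microcomputer 3 determines that the candidate object is the tracking-target facial part (ST51). It then stores the coordinates of the candidate object (corresponding to the representative coordinate value C in the whole image) as the coordinates of the eye on the image (ST52).
【００９１】
The microcomputer 3 then stores the micro image IG containing the matching candidate object in the image memory as the tracking target image MG_i (ST53), and processing moves to step ST22 of FIG. 4.
【００９２】
In the processing of FIG. 11, the candidate object is detected after binarization with the binarization threshold. This embodiment can therefore clearly distinguish the eye from other parts (the background and parts of the face other than the eye) and capture the eye accurately. The determination based on the geometric shape of the candidate object also becomes more accurate, which further improves the accuracy of eye position detection.
【００９３】
As described with reference to FIGS. 4 to 14, the microcomputer 3 (the facial part detecting means CL2) detects the tracking-target facial part from the whole input image. As described above, when the tracking-target facial part is detected, the tracking target detection flag "GetFlag" is set to "TRUE", and the tracking processing (ST19) is then executed as shown in FIG. 3.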
【００９４】
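FIG. 15 is a flowchart showing the details of the tracking processing (ST19) of FIG. 3. As shown in the figure, when step ST16 yields "NO", the microcomputer 3 executes the facial part search area setting processing (ST60). Step ST60 is the processing performed by the facial part search area setting means CL31 of FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL31. The setting processing is outlined with reference to FIG. 16.
【００９５】
FIG. 16 is an explanatory diagram of the facial part search area setting processing (ST60) of FIG. 15: (a) shows the image captured at time t0, (b) the image captured at time t1, (c) the image captured at time t2, (d) the image captured at time t3, and (e) the left-eye positions of these images plotted on a single image.
【００９６】
When the subject changes the direction of the face, the image of FIG. 16(a) is first captured at time t0. At this time the subject is looking almost straight ahead. The image of FIG. 16(b) is then captured at time t1. At this time the subject begins to turn the face to the right (the left side in FIG. 16), for example to check a side mirror. Because the face has begun to turn to the right, the position of the subject's left eye moves toward the right.
【００９７】
The image of FIG. 16(c) is captured at time t2. The subject has turned the face further to the right than at time t1, so the left-eye position moves further toward the right.
【００９８】
The image of FIG. 16(d) is then captured at time t3. The subject is now checking the side mirror or the like, with the face turned furthest to the right, so the left-eye position has moved furthest toward the right.
【００９９】
As FIG. 16(e) shows, the left-eye position on these images moves gradually from time t0 to time t3. In the facial part search area setting processing (ST60), the area is set so as to contain the left-eye position after the movement in each of the intervals between t0 and t3 (t0 to t1, t1 to t2, t2 to t3).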
【０１００】
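Returning to FIG. 15, after step ST60 the microcomputer 3 executes the preferred facial part search area setting processing (ST61). Step ST61 is the processing performed by the preferred facial part search area setting means CL32 of FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL32. The setting processing is outlined with reference to FIG. 17.
【０１０１】
FIG. 17 is an explanatory diagram of the preferred facial part search area setting processing (ST61) of FIG. 15: (a) shows the image captured at time t10, (b) the image captured at time t11, (c) the image captured at time t12, (d) the image captured at time t13, and (e) the left-eye positions of these images plotted on a single image.
【０１０２】
When the subject is looking in one direction, the image of FIG. 17(a) is first captured at time t10. The images of FIGS. 17(b), (c), and (d) are then captured at times t11, t12, and t13, respectively.
【０１０３】
Because the subject is looking in one direction, the left-eye position on these images is almost stationary, as is clear from FIG. 17(e).
【０１０４】
In the preferred facial part search area setting processing (ST61), the area is set so as to contain the left-eye position after the movement in each of the intervals between t10 and t13 (t10 to t11, t11 to t12, t12 to t13).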
【０１０５】
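The distribution of left-eye positions when the subject looks in one direction and when the subject changes the direction of the face is now described. FIG. 18 is an explanatory diagram showing these distributions. The vertical axis of FIG. 18 is the Y-axis coordinate in the image and the horizontal axis is the X-axis coordinate. The image size is 640 × 480, so the maximum of the vertical axis is 480 and the maximum of the horizontal axis is 640. FIG. 18 plots the coordinates sampled at a video rate of 30 frames per second.
【０１０６】
As shown in the figure, when the subject looks in one direction, the left-eye position stays almost at one point. As trajectory a shows, the coordinates at each time are nearly constant, at 200 to 230 on the X axis and 350 to 390 on the Y axis.
【０１０７】
When the subject changes the direction of the face, for example toward the operation panel of the air conditioner (lower left), the left-eye position moves considerably. As trajectory b shows, the coordinates at each time range over 390 to 520 on the X axis and 240 to 350 on the Y axis.
【０１０８】
FIG. 19 shows the result of analyzing this distribution: it is an explanatory diagram of the amount of left-eye movement derived from the distribution of FIG. 18. FIG. 19 shows the analysis for images captured at 30 ms/frame and 60 ms/frame while the subject makes the same movements as trajectories a and b of FIG. 18. The image size is again 640 × 480.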
【０１０９】
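When a movement like trajectory a is captured at 30 ms/frame, the mean movement per frame is 1.13 in the X direction and 0.52 in the Y direction. The standard deviation is 0.95 in the X direction and 0.52 in the Y direction, and the 3σ movement amount is 3.97 in the X direction and 2.08 in the Y direction. The maximum movement is 4 in the X direction and 2 in the Y direction.
【０１１０】
When a movement like trajectory b is captured at 30 ms/frame, the mean movement per frame is 3.38 in the X direction and 2.35 in the Y direction. The standard deviation is 2.63 in the X direction and 2.12 in the Y direction, and the 3σ movement amount is 11.27 in the X direction and 8.72 in the Y direction. The maximum movement is 14 in the X direction and 9 in the Y direction.
【０１１１】
When a movement like trajectory a is captured at 60 ms/frame, the mean movement per frame is 1.76 in the X direction and 0.91 in the Y direction. The standard deviation is 1.47 in the X direction and 0.68 in the Y direction, and the 3σ movement amount is 6.18 in the X direction and 2.94 in the Y direction. The maximum movement is 6 in the X direction and 3 in the Y direction.
【０１１２】
When a movement like trajectory b is captured at 60 ms/frame, the mean movement per frame is 5.77 in the X direction and 4.25 in the Y direction. The standard deviation is 4.10 in the X direction and 3.70 in the Y direction, and the 3σ movement amount is 18.06 in the X direction and 15.35 in the Y direction. The maximum movement is 15 in the X direction and 14 in the Y direction.
【０１１３】
As FIG. 19 makes clear, when the subject looks in one direction the left-eye position moves by at most a few pixels per frame, whereas when the subject changes the direction of the face it moves by up to roughly ten to twenty pixels per frame.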
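The reported 3σ movement amounts appear consistent, up to rounding, with the mean plus three standard deviations (for example 3.38 + 3 × 2.63 = 11.27). The sketch below computes such statistics from a sequence of eye coordinates. The names `movement_stats` and `three_sigma` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def three_sigma(mu: float, sigma: float) -> float:
    """Movement bound taken as mean plus three standard deviations."""
    return mu + 3 * sigma


def movement_stats(coords: Sequence[float]) -> Dict[str, float]:
    """Per-frame movement statistics along one axis."""
    steps = [abs(b - a) for a, b in zip(coords, coords[1:])]
    mu, sigma = mean(steps), pstdev(steps)
    return {
        "mean": mu,
        "std": sigma,
        "three_sigma": three_sigma(mu, sigma),
        "max": max(steps),
    }
```

Statistics of this kind, gathered separately for "looking in one direction" and "changing the face direction", give a basis for choosing the half sizes of the preferred area and the search area, respectively.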
【０１１４】
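Returning to FIG. 15, after step ST61 the microcomputer 3 performs processing to specify the positions of tracking target candidates (ST62). This processing is the same as that shown in FIG. 5, and is the processing performed by the facial part candidate extracting means CL33 of FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL33.
【０１１５】
In outline, the microcomputer 3 first detects the density values of the pixels along each vertical pixel column of the captured image, applying the arithmetic mean operation to obtain mean densities. It then selects one pixel for each local rise of the mean density, which determines the extraction points. When extraction points determined for the vertical pixel columns are adjacent in the horizontal direction of the image, the microcomputer 3 forms continuous data G, a group of extraction points extending horizontally. This continuous data G is the same as that described with reference to FIGS. 5 to 9. The microcomputer 3 then takes the representative coordinate value C of the continuous data G as the candidate point of the tracking target candidate.
【０１１６】
After step ST62, the microcomputer 3 judges whether the tracking target candidate is within the preferred facial part search area (ST63). More precisely, it judges whether the representative coordinate value C, which is the candidate point of the candidate, is within the preferred facial part search area. This is the processing performed by the facial part determining means CL34 of FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL34.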
【０１１７】
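If the candidate is within the preferred facial part search area (ST63: YES), the microcomputer 3 determines that the candidate is the tracking target (ST64). It then stores the existence area EA containing the facial part determined to be the tracking target in the image memory as a micro image IG (ST65).
【０１１８】
The microcomputer 3 then stores the representative coordinate value C of the candidate as the coordinates of the tracking target (ST66), and stores the micro image IG in the image memory as the tracking target image MG_i (ST67).
【０１１９】
The microcomputer 3 then initializes the non-detection counter (ST68), and processing moves to step ST18 of FIG. 3. The non-detection counter counts the number of consecutive processing cycles in which the tracking target could not be identified.
【０１２０】
If the candidate is not within the preferred facial part search area (ST63: NO), processing moves to step ST70 of FIG. 20.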
【０１２１】
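FIG. 20 is a flowchart showing the processing executed when the tracking target candidate is judged not to be within the preferred facial part search area.
【０１２２】
The microcomputer 3 first performs tracking target determination processing based on the density of the micro image IG (ST70). Step ST70 is processing performed by the facial part determining means CL34 described with FIG. 1; that is, the microcomputer 3 executes the program corresponding to CL34.
【０１２３】
In detail, the processing of FIG. 21 is executed. FIG. 21 is a flowchart showing the details of the density-based tracking target determination processing (ST70) of FIG. 20.
【０１２４】
As shown in the figure, the microcomputer 3 first stores the micro image IG in the image memory (ST90). It then obtains a similarity parameter between the density data of the micro image IG and the density data of the tracking target image MG_(i-1) (ST91).
【０１２５】
The tracking target image MG_(i-1) is the image of the tracking target stored in the image memory in the previous tracking processing. As step ST67 of FIG. 15 shows, MG_(i-1) is also the micro image IG that was last determined to contain the tracking-target facial part.
【０１２６】
In other words, the microcomputer 3 obtains the similarity parameter of the density data from both the micro image IG containing the candidate extracted from the current image frame and the micro image containing the tracking target identified in a past image frame.
【０１２７】
The similarity parameter of the density data is obtained by the following equation.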
【０１２８】
【Equation 1】
$$R = \sum_{n=1}^{N}\sum_{m=1}^{M}\left|\,I(m,n) - T(m,n)\,\right|$$
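Here I(m, n) is the density of a pixel of the micro image IG, T(m, n) is the density of a pixel of the tracking target image MG_(i-1), and M and N are the image dimensions in pixels. As the equation shows, the similarity parameter is expressed as a residual sum. The residual sum becomes small when the two images are highly similar and large when they are not, so a threshold can be set and the images judged to be highly similar when the residual sum is below the threshold.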
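A residual-sum similarity of this kind can be sketched as follows. The sketch assumes that the residual is the absolute density difference per pixel (a sum of absolute differences); the names `residual_sum` and `is_similar` and the threshold value are hypothetical, and images are represented as lists of rows.

```python
from typing import Sequence


def residual_sum(image: Sequence[Sequence[int]],
                 template: Sequence[Sequence[int]]) -> int:
    """Sum of absolute density differences between two equally sized images."""
    if len(image) != len(template) or any(
            len(a) != len(b) for a, b in zip(image, template)):
        raise ValueError("images must have the same size")
    return sum(abs(i - t)
               for row_i, row_t in zip(image, template)
               for i, t in zip(row_i, row_t))


def is_similar(image: Sequence[Sequence[int]],
               template: Sequence[Sequence[int]],
               threshold: int) -> bool:
    """Judge similarity as high when the residual sum is below the threshold."""
    return residual_sum(image, template) < threshold
```

Here `image` plays the role of the micro image IG and `template` that of the previous tracking target image MG_(i-1). Identical images give a residual sum of 0, and the value grows as the images diverge.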
【０１２９】
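After this processing, the microcomputer 3 determines, based on the similarity parameter, whether the extracted candidate is the tracking-target facial part (ST92). That is, it judges whether the similarity is high and thereby whether the micro image IG contains the tracking-target facial part.
【０１３０】
If the similarity is not high (ST92: NO), the microcomputer 3 determines that the candidate object in the micro image IG is not the tracking-target facial part (ST93), and processing moves to step ST71 of FIG. 20.
【０１３１】
If the similarity is high (ST92: YES), the microcomputer 3 determines that the candidate object in the micro image IG is the tracking-target facial part (ST94), and processing moves to step ST71 of FIG. 20.
【０１３２】
Returning to FIG. 20, after step ST70 the microcomputer 3 judges, based on the determination of step ST93 or ST94 of FIG. 21, whether the existence area EA contains the tracking-target facial part (ST71).
【０１３３】
If it does (ST71: YES), processing moves to step ST66 of FIG. 15. If it does not (ST71: NO), the microcomputer 3 performs tracking target determination processing based on a frequency image (ST72). Step ST72 is processing performed by the facial part determining means CL34 described with FIG. 1.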
【０１３４】
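In detail, the processing of FIG. 22 is executed. FIG. 22 is a flowchart showing the details of the frequency-image-based tracking target determination processing (ST72) of FIG. 20.
【０１３５】
As shown in the figure, the microcomputer 3 first stores the existence area EA in the image memory as a micro image IG (ST100). It then applies frequency processing to the micro image IG to generate a frequency image IFG and stores it in the image memory (ST101). That is, the microcomputer 3 generates the frequency image IFG from the micro image IG containing the candidate extracted from the current image frame.
【０１３６】
The frequency image is generated by a general method such as the Fourier transform or the wavelet transform. FIG. 23 is an explanatory diagram of the frequency image generation processing (step ST101) of FIG. 22: (a) shows the micro image IG and (b) shows the frequency image.
【０１３７】
Frequency processing of a micro image IG such as that of FIG. 23(a) yields, for example, the image of FIG. 23(b). The microcomputer 3 stores this frequency image in the image memory.
【０１３８】
Returning to FIG. 22, after step ST101 the microcomputer 3 applies frequency processing to the tracking target image MG_(i-1) stored in the image memory in the previous tracking processing to obtain a frequency image BIFG, and stores it in the image memory (ST102). That is, the microcomputer 3 obtains the frequency image BIFG from the tracking target image MG_(i-1), which contains the tracking-target facial part identified in a past image frame. The frequency processing here is the same as that described with reference to FIG. 23.
【０１３９】
Next, the microcomputer 3 calculates the similarity parameter of the frequency images IFG and BIFG (ST103). The calculation method is the same as in step ST91 of FIG. 21, namely obtaining the residual sum of the density data.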
【０１４０】
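After this processing, the microcomputer 3 determines, based on the calculated similarity parameter, whether the extracted candidate is the tracking-target facial part (ST104). That is, it judges whether the similarity is high and thereby whether the micro image IG contains the tracking-target facial part.
【０１４１】
If the similarity is not high (ST104: NO), the microcomputer 3 determines that the candidate object in the micro image IG is not the tracking-target facial part (ST105), and processing moves to step ST73 of FIG. 20.
【０１４２】
If the similarity is high (ST104: YES), the microcomputer 3 determines that the candidate object in the micro image IG is the tracking-target facial part (ST106), and processing moves to step ST73 of FIG. 20.
【０１４３】
Returning to FIG. 20, after step ST72 the microcomputer 3 judges, based on the determination of step ST105 or ST106 of FIG. 22, whether the existence area EA contains the tracking-target facial part (ST73).
【０１４４】
If it does (ST73: YES), processing moves to step ST66 of FIG. 15. If it does not (ST73: NO), the microcomputer 3 performs tracking target determination processing based on the geometric shape of the candidate object (ST74). Step ST74 is processing performed by the facial part determining means CL34 described with FIG. 1.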
【０１４５】
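In detail, the processing of FIG. 24 is executed. FIG. 24 is a flowchart showing the details of the tracking target determination processing based on the geometric shape of the candidate object (ST74) of FIG. 20. Steps ST110 to ST118 in the figure are the same as steps ST40 to ST48 of FIG. 11, so their description is omitted.
【０１４６】
After this processing, the microcomputer 3 determines, based on the calculated degree of geometric shape matching, whether the extracted candidate is the tracking-target facial part (ST119). That is, it judges whether the geometric shapes match and thereby whether the micro image IG contains the tracking-target facial part.
【０１４７】
If the shapes do not match (ST119: NO), the microcomputer 3 determines that the candidate object in the micro image IG is not the tracking-target facial part (ST120), and processing moves to step ST75 of FIG. 20.
【０１４８】
If the shapes match (ST119: YES), the microcomputer 3 determines that the candidate object in the micro image IG is the tracking-target facial part (ST121), and processing moves to step ST75 of FIG. 20.
【０１４９】
Returning to FIG. 20, after step ST74 the microcomputer 3 judges, based on the determination of step ST120 or ST121 of FIG. 24, whether the existence area EA contains the tracking-target facial part (ST75).
【０１５０】
If it does (ST75: YES), processing moves to step ST66 of FIG. 15. If it does not (ST75: NO), the microcomputer 3 performs the processing of step ST76.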
【０１５１】
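Several tracking target candidates may have been extracted in step ST62 of FIG. 15, for example when the subject wears glasses (described later). The microcomputer 3 therefore judges whether there is another candidate, that is, a candidate that has not yet been examined (ST76). If there is (ST76: YES), processing moves to step ST63 of FIG. 15.
【０１５２】
If there is no other candidate (ST76: NO), the microcomputer 3 increments the non-detection counter (ST77). It then judges whether the value of the non-detection counter has exceeded the facial part re-detection transition count (ST78). The re-detection transition count is the number of times the tracking processing of step ST19 is executed consecutively, without performing step ST17 of FIG. 3, even though the tracking-target facial part could not be identified. This number depends on the processing speed and accuracy of the system and should be set to suit the application of the device 1.
【０１５３】
If the re-detection transition count has not been exceeded (ST78: NO), processing moves to step ST18 of FIG. 3. Steps ST13 to ST15 are then performed, and the tracking processing (ST19) is performed again. If step ST19 is executed again and the candidate is again not determined to be the tracking target, the non-detection counter is incremented further. When step ST19 has been repeated and the value of the non-detection counter exceeds the re-detection transition count (ST78: YES), the microcomputer 3 sets the tracking target detection flag "GetFlag" to "FALSE" (ST79).
【０１５４】
The microcomputer 3 then initializes the non-detection counter (ST80), and processing moves to step ST18 of FIG. 3.
【０１５５】
When the value of the non-detection counter exceeds the re-detection transition count, "GetFlag" is set to "FALSE", so the tracking target detection processing (ST17) of FIG. 3 is executed again. That is, if the microcomputer 3 still cannot identify the tracking target after repeating step ST19 several times, it concludes that the target could not be identified and executes the tracking target detection processing (ST17) once more.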
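The interplay of the non-detection counter and GetFlag (steps ST68 and ST77 to ST80) can be sketched as a small state holder. The class name `NonDetectionCounter` and the parameter `redetect_after` (standing for the re-detection transition count) are hypothetical.

```python
class NonDetectionCounter:
    """Tracks consecutive tracking failures and decides when to re-detect."""

    def __init__(self, redetect_after: int) -> None:
        self.redetect_after = redetect_after  # re-detection transition count
        self.count = 0

    def update(self, target_found: bool) -> bool:
        """Process one tracking cycle and return the new GetFlag value."""
        if target_found:
            self.count = 0                    # ST68: reset on success
            return True
        self.count += 1                       # ST77: one more failure
        if self.count > self.redetect_after:  # ST78: limit exceeded?
            self.count = 0                    # ST80: reset for the next run
            return False                      # ST79: fall back to whole-image detection
        return True                           # keep tracking on the next frame
```

With `redetect_after=2`, two consecutive failures keep GetFlag at True, and the third failure returns False so that the next frame goes through the whole-image detection of ST17.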
【０１５６】
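Next, the facial part search area setting processing (ST60) and the preferred facial part search area setting processing (ST61) of FIG. 15 are described in more detail.
【０１５７】
FIG. 25 is a flowchart showing the details of the facial part search area setting processing (ST60), and FIG. 26 is a flowchart showing the details of the preferred facial part search area setting processing (ST61). As shown in FIG. 25, the microcomputer 3 sets the position of the facial part search area (ST130). Here the center position of the area is set based on, for example, the representative coordinate value C of the tracking-target facial part detected or determined in the previous processing.
【０１５８】
The size of the facial part search area is then set (ST131). The size is determined based on information such as how many times the tracking processing has been executed without identifying the tracking target, that is, the value of the non-detection counter. The microcomputer 3 then sets the facial part search area (ST132), and processing moves to step ST140 of FIG. 26.
【０１５９】
In step ST140, the microcomputer 3 judges whether the non-detection counter has exceeded the preferred-area non-setting count (ST140). The preferred-area non-setting count is the number needed to judge that the facial part is not being tracked. Like the re-detection transition count, its value depends on the processing speed and accuracy of the system. If processing runs at approximately the video rate and the facial part detection rate (the rate at which a facial part is determined to be a facial part) is about 90%, the preferred-area non-setting count can be set to 3 to 5.
【０１６０】
If the non-detection counter has exceeded the preferred-area non-setting count (ST140: YES), processing moves to step ST62 of FIG. 15. If it has not (ST140: NO), the preferred facial part search area is set (ST141), and processing moves to step ST62 of FIG. 15.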
【０１６１】
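Next, the processing of FIGS. 25 and 26 is described in further detail with reference to FIGS. 27 to 33. FIG. 27 is an explanatory diagram of the facial part search area and the preferred facial part search area. As shown in the figure, the facial part search area has a half width H1 and a half height V1 from its center, and the preferred facial part search area has a half width H2 and a half height V2 from its center. The center is, for example, the representative coordinate value C of the tracking target detected or determined in the previous processing, where the previous processing may be either the tracking target detection processing (ST17) or the tracking processing (ST19). Step ST130 of FIG. 25 determines the coordinates of this center.
【０１６２】
As mentioned above, the size of the areas depends on the detection target and other factors. It also depends on the processing speed and accuracy of the system; in the example above, H1 may be set to 30 to 50 pixels and V1 to 20 to 30 pixels, and H2 to 10 to 15 pixels and V2 to about 5 to 10 pixels.
【０１６３】
With such a facial part search area, however, the tracking-target facial part may move out of the area when, for example, the subject turns the face sharply, and the device may fail to identify it. Because the center of the facial part search area is the representative coordinate value C detected or determined in the previous processing, a moving tracking target may already be outside the area at the time of the current processing.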
【０１６４】
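In this embodiment, therefore, the size of the facial part search area is variable, as shown in FIG. 28. FIG. 28 is an explanatory diagram showing one example of varying the size of the facial part search area. As shown in the figure, the microcomputer 3 widens the facial part search area when the tracking-target facial part could not be identified.
【０１６５】
In this embodiment, for example, when the tracking target is not identified once and the non-detection counter becomes "1", the area in which the tracking target is likely to be is widened so that a candidate can be found. Step ST131 of FIG. 25 determines the size of the facial part search area in this way.
【０１６６】
The size of the facial part search area may also be determined as follows. FIG. 29 is an explanatory diagram showing another example of varying the size of the facial part search area. As shown in the figure, when widening the area the microcomputer 3 may enlarge it step by step according to the value of the non-detection counter; the larger the counter value, the wider the area. Determining the size from the counter value means that the size follows the number of consecutive times the tracking target could not be identified.
【０１６７】
Enlarging the facial part search area normally slows processing, so enlarging it abruptly relative to its previous size causes a sharp drop in processing speed. Determining the size according to the value of the non-detection counter, as in this example, gives the area an appropriate size while avoiding such a sharp drop.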
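The stepwise enlargement of FIG. 29 might look like the sketch below. The function name `search_half_size`, the linear growth rule, and the step and limit values are assumptions for illustration; only the base half sizes are taken from the ranges given in the text (H1 of 30 to 50 pixels, V1 of 20 to 30 pixels).

```python
from typing import Tuple


def search_half_size(misses: int,
                     base: Tuple[int, int] = (40, 25),
                     step: Tuple[int, int] = (10, 6),
                     limit: Tuple[int, int] = (320, 240)) -> Tuple[int, int]:
    """Half width H1 and half height V1 of the search area.

    misses: current value of the non-detection counter.
    The area grows linearly with the counter and is capped at `limit`.
    """
    h1 = min(base[0] + step[0] * misses, limit[0])
    v1 = min(base[1] + step[1] * misses, limit[1])
    return h1, v1
```

Growing the area by a fixed step per miss, rather than jumping to a large area at once, keeps the per-frame processing cost from rising abruptly.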
【０１６８】
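Furthermore, in this embodiment the microcomputer 3 sometimes does not set the preferred facial part search area. This corresponds to step ST140 of FIG. 26.
【０１６９】
Step ST140 judges whether the non-detection counter has exceeded the preferred-area non-setting count. The microcomputer 3 examines every candidate to judge whether it is the tracking target, and even when it cannot identify the target it keeps trying until the non-detection counter reaches the re-detection transition count. When the counter reaches that count, the microcomputer 3 concludes that the tracking target could not be identified and performs the tracking target detection processing of step ST17 of FIG. 3.
【０１７０】
In step ST140, if the non-detection counter exceeds the preferred-area non-setting count before that final conclusion is reached, the preferred facial part search area is not set.
【０１７１】
In this example the preferred facial part search area is not set at all, but it may instead be set with a reduced size.
【０１７２】
The center of the facial part search area described with FIG. 27 need not be the representative coordinate value C of the tracking-target facial part detected or determined in the previous processing. An example follows. FIG. 30 is an explanatory diagram showing one example of setting the center position of the facial part search area.
【０１７３】
The figure shows the eye positions and the center positions of the facial part search area for the processing before last and for the previous processing. In the example of FIG. 30, the microcomputer 3 first obtains the differences in the X-axis and Y-axis directions between the center positions of those two facial part search areas. It then adds these differences to the previous center position and uses the resulting coordinates as the center position of the current facial part search area.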
【０１７４】
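FIG. 31 is an explanatory diagram showing an example image containing the eye position and the center position of the facial part search area: (a) shows the whole image and (b) shows an enlarged image.
【０１７５】
When the processing described with FIG. 30 is executed, the eye position falls within the facial part search area, as shown in FIG. 31(a). As the enlarged image of FIG. 31(b) also shows, setting the current facial part search area from the center positions of the previous two cycles keeps the eye position inside the current area. By setting the facial part search area from the movement of the tracking target in past image frames, this example performs processing suited to the movement of the subject's face.
【０１７６】
In this example the center position of the facial part search area is determined from the movement obtained from the positions of the tracking target in the previous two cycles, but this is not a limitation. The movement may be obtained from positions identified earlier than the cycle before last, and the center position determined from that. Alternatively, the center position may first be set to the position of the tracking target identified in the previous cycle, and this example applied only when the target is not identified there and the non-detection counter becomes "1".
【０１７７】
Another example of setting the center position is described next. FIG. 32 is an explanatory diagram showing another example of setting the center position of the facial part search area. FIG. 33 is an explanatory diagram showing another example image containing the eye position and the center position of the facial part search area: (a) shows the whole image and (b) shows an enlarged image.
【０１７８】
The example described with FIGS. 30 and 31 is effective when the differences of the center position in the X-axis and Y-axis directions are large. The present example is effective when those differences are small.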
【０１７９】
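As FIGS. 32 and 33 show, when the differences in the X-axis and Y-axis directions are not large, the facial part search area need not follow the movement of the subject's face, because the tracking target stays inside the area even without such adjustment.
【０１８０】
In this example, therefore, when the differences in the X-axis and Y-axis directions are small, the representative coordinate value C of the tracking-target facial part detected or determined in the previous processing is used as the center position.
【０１８１】
In other words, the differences in the X-axis and Y-axis directions are taken into account, but when the movement does not exceed a predetermined threshold, the representative coordinate value C of the previous processing is used as the center position as usual. Compared with the example of FIGS. 30 and 31, this avoids detailed calculation and speeds up processing.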
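The two ways of choosing the center position (FIGS. 30 to 33) can be combined in one sketch. The function name `next_center` and the threshold of 5 pixels are hypothetical, and for simplicity the positions of the previous two cycles are treated as both the earlier centers and the earlier representative coordinate values.

```python
from typing import Tuple

Point = Tuple[int, int]


def next_center(before_last: Point, previous: Point, threshold: int = 5) -> Point:
    """Choose the center of the current facial part search area.

    Small movement: reuse the previous position as it is.
    Large movement: extrapolate by adding the last displacement.
    """
    dx = previous[0] - before_last[0]
    dy = previous[1] - before_last[1]
    if abs(dx) <= threshold and abs(dy) <= threshold:
        return previous
    return previous[0] + dx, previous[1] + dy
```

For positions (100, 100) and then (120, 90), the displacement (20, -10) exceeds the threshold, so the next center is extrapolated to (140, 80); for (100, 100) and then (102, 101) the previous position is reused.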
【０１８２】
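Next, the operation of the facial part tracking device 1 according to this embodiment is described again with reference to example images. For convenience, the representative coordinate value C is called the representative coordinate point C below. FIG. 34 shows an example image captured while the subject looks in one direction. In this image, the representative coordinate point C4 of the continuous data G4 falls within the preferred facial part search area, so C4 is determined to be the facial part. That is, step ST63 of FIG. 15 yields "YES".
【０１８３】
FIG. 35 shows an example image captured when the subject changes the direction of the face: (a) shows the whole image and (b) shows an enlarged image. As shown in FIG. 35(a), the representative coordinate point C4 of the continuous data G4 is outside the preferred facial part search area but inside the facial part search area, so step ST63 of FIG. 15 yields "NO". The existence area EA set around C4 is stored in the image memory as the micro image IG (FIG. 35(b)), and the tracking target determination processing from step ST70 onward is performed in sequence.
【０１８４】
Next, the operation of the device 1 when the subject wears glasses is described. FIG. 36 shows an example image of a subject wearing glasses, and FIG. 37 shows examples of the micro images obtained in that case.
【０１８５】
When the subject wears glasses, several candidate points may be extracted from the facial part search area, as shown in FIG. 36. In FIG. 36, the representative coordinate point C2 of continuous data G2, C3 of continuous data G3, and C4 of continuous data G4 are all inside the facial part search area but outside the preferred facial part search area.
【０１８６】
As shown in FIG. 37, the micro images IG1, IG2, and IG3, which are the existence areas EA1, EA2, and EA3 centered on the representative coordinate points C2, C3, and C4, are therefore each stored in the image memory, and the tracking target determination processing from step ST70 onward is performed on them in sequence.
【０１８７】
In this example, the first micro image IG1 is determined not to be the tracking-target facial part, and step ST76 of FIG. 20 judges that another candidate exists. The second micro image IG2 is then examined, and the tracking-target facial part is identified.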
【０１８８】
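In this embodiment, an area surrounding the facial part search area may also be set as a continuous data extraction area, and continuous data extracted only within that area. FIG. 38 shows an example in which an area surrounding the facial part search area is set as the continuous data extraction area, and FIG. 39 shows an example of the continuous data extracted when such an area is set. As FIGS. 38 and 39 show, processing is also possible when candidates are extracted from within a continuous data extraction area surrounding the facial part search area. In this example the representative coordinate point C1 of the continuous data G1 is within the preferred facial part search area, so C1 is determined to be the facial part.
【０１８９】
In this way, the facial part tracking device 1 of this embodiment sets a facial part search area. Because this area is based on the on-image position of the detected tracking target and on the amount the tracking target moves during the sampling time when the subject changes the direction of the face, it is an area in which the tracking-target facial part is likely to be. The device 1 extracts candidates from this area. For images captured after the tracking target has been detected, candidates can therefore be extracted from the area where the target is likely to be, rather than from the whole image, which makes the processing both accurate and fast.
【０１９０】
A preferred facial part search area is also set inside the facial part search area. Because it lies inside the facial part search area, the tracking-target facial part is even more likely to be there. A candidate inside the preferred area is thus even more likely to be the tracking-target facial part, and the facial part determining means CL34 determines such a candidate to be the tracking-target facial part.
【０１９１】
When an extracted candidate is inside the facial part search area but outside the preferred area, it is still likely to be the tracking-target facial part, but less likely than a candidate inside the preferred area. CL34 therefore processes the image of the candidate to determine whether it is the tracking-target facial part. That is, a candidate inside the facial part search area has a real chance of not being the desired facial part, and CL34 checks whether it is the tracking-target facial part so that such a candidate is not tracked by mistake. This prevents erroneous tracking.
【０１９２】
As a result, the present invention improves both accuracy and processing speed in determining the facial part to be tracked.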
【０１９３】
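The preferred facial part search area is set based on the amount the tracking-target facial part moves during the sampling time while the subject is looking in one direction. The preferred area can therefore be set over the region where the tracking-target facial part is likely to be located.
【０１９４】
After CL34 has identified the tracking-target facial part, the facial part search area and the preferred facial part search area are set based on the identified position. Once the facial part detecting means CL2 has detected the facial part, detection over the whole image is therefore seldom needed, and fast processing can continue.
【０１９５】
The facial part search area is set with its center at the position where the tracking-target facial part was determined in a past image frame. The area can therefore be placed, based on the past position of the tracking target, where the target is likely to be.
【０１９６】
The facial part search area is also set with its center at a position corrected by the movement of the tracking target in past image frames. When, for example, the position of the tracking target has moved considerably in the X-axis and Y-axis directions in past images, the area can be placed, based on that past data, where the target is likely to be at the time of the current processing.
【０１９７】
When CL34 cannot identify the tracking-target facial part, the facial part search area is widened, so that tracking can resume quickly even if the target is lost.
【０１９８】
When CL34 cannot identify the tracking-target facial part, the preferred facial part search area is narrowed or not set. Even if a candidate with features similar to the tracking target happens to lie in the preferred area, this prevents or reduces the chance of mistaking it for the tracking target, so that tracking resumes properly.
【０１９９】
Because a candidate point is defined to specify the position of each candidate, the situation in which part of a candidate is inside the preferred facial part search area and part of it is outside is eliminated, and processing is more accurate.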
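[0200]
Further, one pixel is determined as an extraction point for each local increase in density value in the vertical direction of the image; when extraction points are adjacent in the horizontal direction of the image, continuous data consisting of a group of extraction points extending in the horizontal direction is formed; and the representative coordinate value of the formed continuous data is used as the candidate point of the face part. This makes it possible to judge, for example, whether the obtained continuous data has the features of the face part to be tracked, and to select only the data having those features. The accuracy of the device 1 can therefore be improved.
[0201]
A minute image including the face part candidate is extracted, and the face part to be tracked is determined based on the minute image. That is, because only part of the image within the face part search area or the priority face part search area is extracted and processed, the processing load can be reduced.
[0202]
Because a minute image is extracted and whether or not the candidate is the face part to be tracked is determined based on any of density, spatial frequency, and geometric shape, the processing load can be reduced and, in addition, accurate determination can be performed.
[0203]
When the face part to be tracked cannot be identified by the face part determining means CL34, the detection processing of the tracking target is performed again by the face part detecting means CL2, so the tracking process can be resumed even if the tracking target is lost.
[0204]
The present embodiment is not limited to the above configuration, and modifications and the like are possible without departing from the spirit of the present invention. For example, the face part determining means CL34 may include a plurality of face part determining units with different determination accuracies. In general, a means that performs determination tends to run faster as its determination accuracy decreases. Taking advantage of this, when determining in the present embodiment whether or not a candidate is the face part to be tracked, the determination processes may be executed in order, starting from the one with the lowest determination accuracy and the highest processing speed. This increases the processing speed while preventing a decrease in determination accuracy.
[0205]
The face part candidate extracting means CL33 of the present embodiment is also not limited to the above configuration and may, for example, be configured as follows. When continuous data G, which is a group of extraction points extending in the horizontal direction of the image, cannot be formed, the face part candidate extracting means CL33 may use a candidate point from an image frame preceding the current image frame, on which candidate extraction is being performed, as the current candidate point. Alternatively, when continuous data G cannot be formed, the face part candidate extracting means CL33 may determine the current candidate point based on the movement amount of the tracking target in image frames preceding the current image frame.
[0206]
When the face part candidate extracting means CL33 is configured in this way, a candidate point is determined without re-executing the candidate point extraction processing when continuous data G cannot be formed, so the computational load can be reduced. In addition, the situation in which a candidate point is determined based on inappropriate continuous data G, because appropriate continuous data G extending in the horizontal direction of the image was not formed, can be prevented. The tracking accuracy can therefore be improved.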
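[Brief Description of the Drawings]
[FIG. 1] A functional block diagram showing the configuration of the face part tracking device according to the embodiment of the present invention.
[FIG. 2] A hardware configuration diagram showing the face part tracking device according to the embodiment of the present invention.
[FIG. 3] A main flowchart showing an outline of the operation of the face part tracking device 1 according to the present embodiment.
[FIG. 4] A flowchart showing the detailed operation of the tracking target detection process (ST17) shown in FIG. 3.
[FIG. 5] A flowchart showing details of the tracking target candidate position specifying process (ST20) shown in FIG. 4.
[FIG. 6] An explanatory diagram showing the continuous data formed in step ST36 shown in FIG. 5, and the representative coordinate value C and existence area EA determined in step ST37.
[FIG. 7] An explanatory diagram showing the size of the existence area EA shown in FIG. 6.
[FIG. 8] An explanatory diagram showing statistical data on the horizontal length Xa obtained by examining the eye sizes of several people.
[FIG. 9] An explanatory diagram showing statistical data on the vertical length Ya obtained by examining the eye sizes of several people.
[FIG. 10] An explanatory diagram showing a method of determining the position of the existence area EA on the image.
[FIG. 11] A flowchart showing details of the tracking target determination process (ST21) shown in FIG. 4.
[FIG. 12] An explanatory diagram showing a minute image.
[FIG. 13] An explanatory diagram of a method of calculating the binarization threshold in the range AR.
[FIG. 14] An explanatory diagram of a method of comparing the geometric shapes of a candidate object and a template of the eye to be tracked, in which (a) shows a case where the candidate object is captured in an optimal state, (b) shows a state in which the right side of the eye is missing, and (c) shows a state in which the left side of the eye is missing.
[FIG. 15] A flowchart showing details of the tracking process (ST19) shown in FIG. 3.
[FIG. 16] An explanatory diagram of the face part search area setting process (ST60) shown in FIG. 15, in which (a) shows an image captured at time t0, (b) shows an image captured at time t1, (c) shows an image captured at time t2, (d) shows an image captured at time t3, and (e) shows the left-eye positions in these images plotted on a single image.
[FIG. 17] An explanatory diagram of the priority face part search area setting process (ST61) shown in FIG. 15, in which (a) shows an image captured at time t10, (b) shows an image captured at time t11, (c) shows an image captured at time t12, (d) shows an image captured at time t13, and (e) shows the left-eye positions in these images plotted on a single image.
[FIG. 18] An explanatory diagram showing the distributions of left-eye positions when the subject is looking in one direction and when the subject changes the direction of the face.
[FIG. 19] An explanatory diagram showing the results of analyzing the movement amount of the left-eye position obtained from the distributions shown in FIG. 18.
[FIG. 20] A flowchart showing the processing executed when it is determined that the face part candidate is not within the priority face part search area.
[FIG. 21] A flowchart showing details of the density-based tracking target determination process (ST70) shown in FIG. 20.
[FIG. 22] A flowchart showing details of the frequency-image-based tracking target determination process (ST72) shown in FIG. 20.
[FIG. 23] An explanatory diagram of the frequency image generation process (step ST101) shown in FIG. 22, in which (a) shows the minute image IG and (b) shows the frequency image.
[FIG. 24] A flowchart showing details of the tracking target determination process based on the geometric shape of the candidate object (ST74) shown in FIG. 20.
[FIG. 25] A flowchart showing details of the face part search area setting process (ST60).
[FIG. 26] A flowchart showing details of the priority face part search area setting process (ST61).
[FIG. 27] An explanatory diagram of the face part search area and the priority face part search area.
[FIG. 28] An explanatory diagram showing an example in which the size of the face part search area is variable.
[FIG. 29] An explanatory diagram showing another example in which the size of the face part search area is variable.
[FIG. 30] An explanatory diagram showing an example of setting the center position of the face part search area.
[FIG. 31] An explanatory diagram showing an example image including the eye position and the center position of the face part search area, in which (a) shows the entire image and (b) shows an enlarged image.
[FIG. 32] An explanatory diagram showing another example of setting the center position of the face part search area.
[FIG. 33] An explanatory diagram showing another example image including the eye position and the center position of the face part search area, in which (a) shows the entire image and (b) shows an enlarged image.
[FIG. 34] A diagram showing an example image when the subject is looking in one direction.
[FIG. 35] A diagram showing an example image when the subject changes the direction of the face, in which (a) shows an example of the entire image and (b) shows an example of an enlarged image.
[FIG. 36] A diagram showing an example image when the subject is wearing eyeglasses.
[FIG. 37] A diagram showing examples of a plurality of minute images obtained when the subject is wearing eyeglasses.
[FIG. 38] A diagram showing an example in which a region surrounding the face part search area is set as the continuous data extraction area.
[FIG. 39] A diagram showing an example of the continuous data extracted when the continuous data extraction area is set.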
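[Explanation of Reference Signs]
1 … face part tracking device
CL1 … face image capturing means
CL2 … face part detecting means
CL31 … face part search area setting means
CL32 … priority face part search area setting means
CL33 … face part candidate extracting means (candidate extracting means)
CL34 … face part determining means (first face part determining means, second face part determining means)
G … continuous data
IG … minute image
IFG, BIFG … frequency image
[0001]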
[Technical Field of the Invention]
The present invention relates to a face part tracking device.
[0002]
[Prior art]
Conventionally, a face part tracking device has been known that detects a face part to be tracked from a captured image of a subject's face and then tracks that face part. This device first stores a standard template and uses it to extract the face part to be tracked from the captured image. The extracted image is then stored as a tracking template, and the face part is tracked using the tracking template (see, for example, Patent Document 1).
[0003]
In another face part tracking device, one pixel is determined for each local increase in density along each vertical pixel row of the captured image, and these pixels are taken as extraction points. The extraction points that line up in the horizontal direction of the image are grouped into a curve group, and the position of the tracking target is detected by determining whether or not the curve group matches a predetermined shape of the face part to be tracked (for example, for an eye, whether it is long in the horizontal direction). After that, an existence area is set based on the detected tracking target, the existence area is binarized to specify the position of the tracking target in detail, and the specified position is used as the position at which the existence area is set in the next processing. The above processing is repeated to track the desired face part (see, for example, Patent Document 2).
[0004]
[Patent Document 1]
JP 2000-163564 A
[0005]
[Patent Document 2]
JP-A-10-143669
[0006]
[Problems to be solved by the invention]
However, in the device described in Patent Document 1, pattern matching with the standard and tracking templates is repeated within a single frame until the specific face part is found. Tracking the face part in real time therefore imposes an extremely high computational load.
[0007]
Further, the device described in Patent Document 2 does not determine whether or not the object in the existence area is the face part to be tracked, so an object that is not the face part to be tracked may be tracked erroneously.
[0008]
[Means for Solving the Problems]
The present invention is a face part tracking device that tracks the movement of a face part based on an input image obtained by capturing the face of a subject. Face part detecting means detects the face part to be tracked from the entire input captured image. Face part search area setting means sets, for images input after the detection, a face part search area based on the on-image position of the tracking-target face part detected by the face part detecting means. Priority face part search area setting means sets a priority face part search area within the face part search area set by the face part search area setting means, and candidate extracting means extracts candidates for the face part from within the face part search area. First face part determining means determines a candidate extracted by the candidate extracting means to be the face part to be tracked when the candidate is within the priority face part search area. Second face part determining means performs image processing on a candidate that is not within the priority face part search area but is within the face part search area, and determines whether or not the candidate is the face part to be tracked. The face part search area setting means sets the face part search area based on the amount by which the tracking target moves during the sampling time when the subject changes the direction of his or her face.
[0009]
[Effects of the Invention]
According to the present invention, a face part search area narrower than the entire image is set based on the detected position of the tracking target on the image. For this reason, a candidate can be extracted from an area where a face part is likely to be present without extracting a candidate for a face part from the entire image, and quick processing can be performed.
[0010]
Further, when the extracted candidate is within the face part search area and outside the priority face part search area, image processing of the candidate is performed to determine whether the candidate is a tracking target. Thereby, it is possible to prevent a face part that is not a tracking target from being erroneously tracked.
[0011]
Therefore, it is possible to improve accuracy and processing speed when determining a face part to be tracked.
[0012]
[Best Mode for Carrying Out the Invention]
Hereinafter, a preferred embodiment of the present invention will be described with reference to the drawings.
[0013]
FIG. 1 is a functional block diagram showing the configuration of the face part tracking device according to the embodiment of the present invention. As shown in the figure, the face part tracking device 1 tracks the movement of a face part based on an input image obtained by capturing the face of a subject. It comprises face image capturing means CL1, face part detecting means CL2, and face part tracking means CL3.
[0014]
The face image capturing means CL1 obtains a captured image including a face part to be tracked by capturing the face of the detected person. Further, the face image capturing means CL1 is configured to transmit the input image data to the face part detecting means CL2 and the face part tracking means CL3.
[0015]
The face part detection means CL2 detects a face part to be tracked from the entire input captured image. The face part tracking means CL3 tracks the movement of the face part to be tracked based on signals from the face image capturing means CL1 and the face part detecting means CL2.
[0016]
The face part tracking means CL3 includes face part search area setting means CL31 and priority face part search area setting means CL32. It also includes face part candidate extracting means (candidate extracting means) CL33 and face part determining means (first face part determining means, second face part determining means) CL34.
[0017]
Once the face part to be tracked has been detected by the face part detecting means CL2, the face part search area setting means CL31 processes the images input after the detection. Specifically, it sets a face part search area, narrower than the entire image, based on the position of the tracking target on the image. The face part search area is set, for example, based on the amount by which the tracking target moves during the sampling time when the subject changes the direction of his or her face.
[0018]
Further, the priority face part search area setting means CL32 sets a priority face part search area within the face part search area. The priority face part search area is set, for example, based on the amount by which the face part to be tracked moves during the sampling time while the subject is looking in one direction.
[0019]
The face part candidate extracting means CL33 extracts candidates for the face part to be tracked from within the face part search area. That is, unlike the face part detecting means CL2, the face part candidate extracting means CL33 does not search the entire captured image for the face part to be tracked, and can therefore perform processing faster than the face part detecting means CL2.
[0020]
The face part determining means CL34 determines whether or not a tracking target candidate extracted by the face part candidate extracting means CL33 is the tracking target. Specifically, when the extracted candidate is within the priority face part search area, the face part determining means CL34 determines that the candidate is the tracking target. When the extracted candidate is not in the priority face part search area but is in the face part search area, the face part determining means CL34 performs image processing on the image of the candidate and determines whether or not the candidate is the tracking target.
[0021]
In such a face part tracking device 1, first, the face image capturing means CL1 captures the face of the person to be detected, and transmits the obtained image data to the face part detecting means CL2. Upon receiving this, the face part detecting means CL2 detects a face part to be tracked from the entire image.
[0022]
Thereafter, each time a captured image is obtained, the face image capturing means CL1 transmits the image data to the face part tracking means CL3. On receiving it, the face part tracking means CL3 sets the face part search area by the face part search area setting means CL31 and sets the priority face part search area by the priority face part search area setting means CL32.
[0023]
Then, the face part candidate extracting means CL33 extracts a candidate for the face part to be tracked from the face part search area in the captured image. After the extraction, the face part determining means CL34 determines which area the candidate belongs to, and determines whether the candidate is the face part to be tracked. That is, when the candidate is in the priority face part search area, the face part determining means CL34 determines that the candidate is the tracking target. On the other hand, when the candidate is not in the priority face part search area but is in the face part search area, the image of the candidate is subjected to image processing, and whether or not the candidate is the face part to be tracked is determined from the result. Thereafter, the device 1 tracks the face part to be tracked based on the determination result.
[0024]
Note that the device 1 determines a candidate point for each extracted candidate so that it can judge with high accuracy whether the candidate is in the face part search area or the priority face part search area. That is, the face part candidate extracting means CL33 extracts a candidate for the tracking target and determines a candidate point that specifies the position of the candidate. When the candidate point determined by the face part candidate extracting means CL33 is within the priority face part search area, the face part determining means CL34 determines that the candidate having that candidate point is the tracking target. When the candidate point is not in the priority face part search area but is in the face part search area, the face part determining means CL34 performs image processing on an image including the candidate having that candidate point, and determines whether or not the candidate is the tracking target.
[0025]
Making the determination on a point basis in this way eliminates the situation in which part of a candidate is inside the priority face part search area and part is outside it, so processing can be performed with high accuracy.
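As an illustrative sketch only, the point-based decision described above can be written as follows. The rectangular shape of the two areas, the `Rect` type, the `Decision` labels, and the function name `classify_candidate` are assumptions introduced for this example and do not appear in the embodiment.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"   # inside the priority area: treated as the tracking target
    VERIFY = "verify"   # inside the search area only: image processing is required
    REJECT = "reject"   # outside the search area: not a candidate


@dataclass(frozen=True)
class Rect:
    """Axis-aligned area given by its center and half-extents in pixels (assumed shape)."""
    cx: float
    cy: float
    half_w: float
    half_h: float

    def contains(self, x: float, y: float) -> bool:
        return abs(x - self.cx) <= self.half_w and abs(y - self.cy) <= self.half_h


def classify_candidate(point, search_area: Rect, priority_area: Rect) -> Decision:
    """Decide how a candidate is handled from the position of its candidate point."""
    x, y = point
    if priority_area.contains(x, y):
        return Decision.ACCEPT
    if search_area.contains(x, y):
        return Decision.VERIFY
    return Decision.REJECT
```

Because only a single point is tested, a candidate can never be partly inside and partly outside the priority area, which is the property the paragraph above relies on.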
[0026]
The device 1 can be used to track a face part of a driver of an automobile or railway vehicle, a ship operator, a plant operator, or the like. In the following, a case where the device is applied to tracking the eyes of an automobile driver is described. The device 1 can also track the eyebrows, nose, mouth, ears, and the like in the same manner as the eyes.
[0027]
FIG. 2 is a hardware configuration diagram illustrating the face part tracking device according to the embodiment of the present invention. As shown in the figure, a TV camera 2 is provided on the instrument panel of an automobile as the face image capturing means CL1. The TV camera 2 is installed at a position from which the driver can be imaged from substantially the front, and captures at least the driver's face. In the present embodiment, the input image of the TV camera 2 consists of, for example, 640 pixels in the horizontal direction (X) and 480 pixels in the vertical direction (Y). The input image captured by the TV camera 2 is input as image data to a microcomputer 3 installed inside the vehicle body, for example behind the instrument panel.
[0028]
The microcomputer 3 is programmed with program logic constituting the face part detecting means CL2 and the face part tracking means CL3. The program logic of the face part tracking means CL3 includes respective logics of the face part search area setting means CL31, the priority face part search area setting means CL32, the face part candidate extraction means CL33, and the face part determination means CL34.
[0029]
Next, the operation of the face part tracking device 1 according to the present embodiment will be described. FIG. 3 is a main flowchart showing an outline of the operation of the face part tracking device 1 according to the present embodiment. As shown in the figure, first, when the process is started, the microcomputer 3 executes an initial value input process (ST10). In this initial value input process, various constants such as the sampling time are read.
[0030]
Then, the microcomputer 3 sets a tracking target detection flag "GetFlag" indicating whether or not a tracking target face part has been found to "FALSE" (ST11). Thereafter, the microcomputer 3 initializes the processing frame counter “i” to “0” (ST12).
[0031]
After the initialization, the microcomputer 3 executes an end determination process (ST13). At this time, the microcomputer 3 makes a determination based on, for example, whether the engine is running.
[0032]
Then, the microcomputer 3 determines whether or not the result is "STOP" (ST14). For example, when it is determined that the engine is not running, the microcomputer 3 determines that the result is "STOP" (ST14: YES), and the process ends.
[0033]
On the other hand, when it is determined that the state is not “STOP” because the engine is started and running (ST14: NO), the microcomputer 3 executes a face image capturing process (ST15). Thereby, the TV camera 2 captures an image of the driver's face.
[0034]
Thereafter, the microcomputer 3 determines whether or not the tracking target detection flag “GetFlag” is “FALSE” (ST16). That is, it is determined whether or not a face part to be tracked has been found.
[0035]
When the tracking target detection flag “GetFlag” is “FALSE” and it is determined that the face part to be tracked has not been found (ST16: YES), the microcomputer 3 executes a tracking target detection process (ST17). The process of step ST17 is a process performed by the face part detecting means CL2 described with reference to FIG. That is, the microcomputer 3 executes a program corresponding to the face part detecting means CL2. In this process, when a face part to be tracked is found, the tracking target detection flag “GetFlag” is set to “TRUE” as described later.
[0036]
After executing the tracking target detection processing, the microcomputer 3 increments the processing frame counter “i” (ST18). Then, the process returns to step ST13.
[0037]
Thereafter, the process passes through steps ST13 to ST15 described above and reaches step ST16. If a face part to be tracked was found in the tracking target detection process (ST17), the tracking target detection flag “GetFlag” is “TRUE”. It is therefore determined that the flag is not “FALSE” (ST16: NO), and the microcomputer 3 executes the tracking process (ST19). The process of step ST19 is performed by the face part tracking means CL3 described with reference to FIG. That is, the microcomputer 3 executes a program corresponding to the face part tracking means CL3, and the face part is thereby tracked.
[0038]
Thereafter, the process proceeds to step ST18, and after the processing frame counter is incremented, the process returns to step ST13 again. The above processing is repeated until "YES" is determined in step ST14.
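The control flow of steps ST10 to ST19 can be summarized in a short sketch. This is only an illustration under stated assumptions: the callables `should_stop`, `capture`, `detect`, and `track` are hypothetical stand-ins for the end determination, image capture, whole-image detection, and tracking processes, and the sketch additionally assumes that the tracking step reports whether the target is still held so that detection is repeated after a loss.

```python
def run_main_loop(should_stop, capture, detect, track):
    """Run the detect-then-track loop and return the sequence of executed modes."""
    get_flag = False              # ST11: tracking target not yet found
    i = 0                         # ST12: processing frame counter
    log = []
    while not should_stop(i):     # ST13/ST14: end determination
        image = capture()         # ST15: face image capture
        if not get_flag:          # ST16: is GetFlag FALSE?
            get_flag = detect(image)   # ST17: search the entire image
            log.append("detect")
        else:
            # ST19: search only the face part search area.
            # Assumption: track() returns False when the target is lost.
            get_flag = track(image)
            log.append("track")
        i += 1                    # ST18: increment the frame counter
    return log
```

The whole-image search runs only while the flag is FALSE, which is why the per-frame cost drops once the target has been found.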
[0039]
As described with reference to FIG. 1, the face part detecting means CL2 processes the entire captured image to detect the face part to be tracked, whereas the face part tracking means CL3 sets an area within the captured image and determines and tracks the face part to be tracked from within that area. The device 1 therefore processes the entire image at least once but thereafter processes only part of the image, so it can run faster than a device that always processes the entire image.
[0040]
Next, the tracking target detection process (ST17) is described in detail. FIG. 4 is a flowchart showing the detailed operation of the tracking target detection process (ST17) shown in FIG. 3.
[0041]
As shown in the figure, if “YES” is determined in step ST16, the microcomputer 3 executes a process of specifying the position of the tracking target candidate (ST20). By this processing, the position of the tracking target candidate is specified from the entire image. In this process, one or a plurality of candidate positions that may be a face part to be tracked are specified.
[0042]
Then, the microcomputer 3 executes a tracking target determination process (ST21). In this process, one of the one or more tracking target candidates specified in the tracking target candidate position specifying process (ST20) is selected, and it is determined whether or not the selected candidate is the tracking target.
[0043]
Thereafter, the microcomputer 3 determines whether or not the selected candidate for the tracking target is determined to be the tracking target based on the result of the tracking target determination processing (ST21) (ST22).
[0044]
If it is not determined that the target is a tracking target (ST22: NO), the microcomputer 3 determines whether or not all of the specified one or a plurality of tracking target candidates have been determined (ST24).
[0045]
If the determination has been made for all (ST24: YES), the process proceeds to step ST18 in FIG. On the other hand, if the determination has not been made for all of them (ST24: NO), the process returns to step ST21.
[0046]
If it is determined in step ST22 that the target is a tracking target (ST22: YES), the microcomputer 3 sets the tracking target detection flag “GetFlag” to “TRUE” (ST23). Then, the process proceeds to step ST18 in FIG.
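The candidate loop of steps ST20 to ST24 amounts to testing candidates one at a time and stopping at the first one accepted. A minimal sketch follows; `find_candidates` and `is_target` are hypothetical callables standing in for the position specifying process (ST20) and the determination process (ST21), and are not names used in the embodiment.

```python
def detect_tracking_target(image, find_candidates, is_target):
    """Return the first candidate judged to be the tracking target, or None."""
    candidates = find_candidates(image)     # ST20: candidates from the entire image
    for candidate in candidates:            # ST21/ST24: examine candidates in turn
        if is_target(image, candidate):     # ST22: is this one the tracking target?
            return candidate                # ST23: caller sets GetFlag to TRUE
    return None                             # all candidates rejected; GetFlag stays FALSE
```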
[0047]
As described above, in the present apparatus 1, one or a plurality of tracking target candidates having a possibility of being a desired face part are specified from the entire image, and the specified one or a plurality of tracking target candidates are determined one by one. Then, the tracking target is detected. The process of identifying one or a plurality of tracking target candidates that may be tracking targets from the entire image (the process of step ST20) is performed as follows.
[0048]
FIG. 5 is a flowchart showing details of the tracking target candidate position specifying process (ST20) shown in FIG. 4. As shown in the figure, the microcomputer 3 first saves the entire data of the captured image in an image memory as the entire image (ST30).
[0049]
Next, the microcomputer 3 makes the determination of step ST31, which is described later. If the determination in step ST31 is "NO", the microcomputer 3 performs arithmetic averaging of the density values along a single line of pixels in the vertical direction (Y-axis direction) of the entire image (ST32).
[0050]
The arithmetic averaging operation is a process in which, for example, an average value of density is obtained for a predetermined number of pixels arranged in the vertical direction, and the density value of one of the predetermined number of pixels is set as the average value. For example, when the predetermined number is “5”, the first to fifth pixels from the top of the screen are selected, an average value is obtained, and this average value is set as the density value of the fifth pixel. Next, the second to sixth pixels from the top of the screen are selected to calculate an average value, and the average value is used as the density value of the sixth pixel. Then, this is sequentially repeated, and the average value of the density is obtained for all the pixels in one line.
[0051]
By performing the arithmetic averaging in this manner, the present apparatus 1 can eliminate a small variation in the change in the density value at the time of capturing the image data, and can capture a global change in the density value.
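The averaging described above is a trailing moving average along one vertical line. The sketch below follows the worked example with a window of five pixels; leaving the first `window - 1` pixels unchanged is an assumption, since the text does not say how they are handled, and the function name is hypothetical.

```python
def trailing_average(column, window=5):
    """Smooth one vertical line of density values with a trailing moving average.

    The average of pixels k-window+1 .. k is written to pixel k, as in the
    example where pixels 1 to 5 give the value of pixel 5.
    """
    out = [float(v) for v in column]   # assumption: leading pixels keep their values
    for k in range(window - 1, len(column)):
        out[k] = sum(column[k - window + 1:k + 1]) / window
    return out
```

For instance, a single bright pixel of 60 in a line of 10s is spread into a plateau of 20s, which shows how small fluctuations are flattened while the broad trend remains.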
[0052]
After the arithmetic averaging, the microcomputer 3 differentiates the arithmetic mean values in the vertical direction (ST33) and extracts points based on the differential values (ST34). Point extraction determines one pixel for each local increase in the arithmetic mean of the pixel density along the vertical pixel row; for example, it determines the pixel at which the differential value of the arithmetic mean changes from negative to positive.
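A minimal sketch of the point extraction, assuming the differential is approximated by a forward difference between neighboring pixels (the text does not specify the difference scheme). It reports each index at which the difference changes from negative to positive; flat stretches where the difference is exactly zero are not treated specially here.

```python
def extract_points(smoothed):
    """Return indices where the vertical difference changes from negative to positive."""
    diffs = [b - a for a, b in zip(smoothed, smoothed[1:])]
    points = []
    for k in range(1, len(diffs)):
        # diffs[k - 1] is the change arriving at pixel k, diffs[k] the change leaving it.
        if diffs[k - 1] < 0 and diffs[k] > 0:
            points.append(k)
    return points
```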
[0053]
After determining the pixel to be the point, the microcomputer 3 switches the line from which the point has been extracted to the next line (ST35).
[0054]
Then, the microcomputer 3 determines whether or not point extraction has been completed for all lines in the vertical direction (ST31). If it is determined that point extraction has not been completed for all lines (ST31: NO), the process returns to step ST31 again through the processing of steps ST32 to ST35 described above.
[0055]
On the other hand, when it is determined that point extraction has been completed for all lines (ST31: YES), the Y coordinate values of the extraction points in adjacent lines are compared. When the Y coordinate values are within a predetermined value of each other, the following are stored as continuous data (ST36): (i) the continuous data group number, (ii) the continuous start line number, (iii) the number of continuous data points, (iv) the average of the vertical positions of the extraction points constituting the continuous data (the representative vertical position of the continuous data), and (v) the average of the horizontal positions of the continuous start line and end line (the representative horizontal position of the continuous data).
[0056]
In the present embodiment, since the tracking target is the eye, the continuous data extends relatively long in the horizontal direction. Therefore, the microcomputer 3 can select the continuous data on the condition that the continuous data continues for a predetermined value or more in the horizontal direction after the continuous data is formed.
[0057]
Thereafter, the microcomputer 3 determines the representative coordinate value C for each continuous data, and sets the existence area EA based on the representative coordinate value C (ST37). The representative coordinate value C is determined by the average value of the X coordinate values and the average value of the Y coordinate values stored in the process of step ST36 (the average values indicated by iv and v). The existence area EA will be described later with reference to FIGS.
[0058]
After determining the representative coordinate value C and setting the existence area EA, the process proceeds to step ST21 in FIG. The above is the tracking target candidate position specifying process (ST20). As described above, the obtained continuous data is the eye candidate, and the representative coordinate value C of the continuous data is the position of the eye candidate point.
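The grouping of extraction points into continuous data, and the representative coordinate derived from it, might be sketched as follows. The tolerance `y_tol`, the minimum length `min_length`, the dictionary layout, and the rule of linking each run to the first unused point within tolerance are all assumptions made for illustration; the text only states that points in adjacent lines are linked when their Y coordinates are within a predetermined value.

```python
def build_continuous_data(points_by_column, y_tol=2, min_length=3):
    """Group extraction points of adjacent vertical lines into horizontal runs.

    points_by_column[x] is the list of Y coordinates of the extraction points
    found in vertical line x.
    """
    open_runs = []   # runs that reached the previous line
    finished = []
    for x, ys in enumerate(points_by_column):
        unused = list(ys)
        next_open = []
        for run in open_runs:
            last_y = run["ys"][-1]
            match = next((y for y in unused if abs(y - last_y) <= y_tol), None)
            if match is None:
                finished.append(run)          # run ends at the previous line
            else:
                unused.remove(match)
                run["ys"].append(match)
                next_open.append(run)
        for y in unused:                      # points that start a new run
            next_open.append({"start_x": x, "ys": [y]})
        open_runs = next_open
    finished.extend(open_runs)

    result = []
    for run in finished:
        count = len(run["ys"])
        if count < min_length:                # keep only horizontally long runs
            continue
        end_x = run["start_x"] + count - 1
        result.append({
            "start_x": run["start_x"],                 # (ii) continuous start line
            "count": count,                            # (iii) number of points
            "rep_y": sum(run["ys"]) / count,           # (iv) mean vertical position
            "rep_x": (run["start_x"] + end_x) / 2,     # (v) mean of start and end lines
        })
    return result
```

The pair (`rep_x`, `rep_y`) plays the role of the representative coordinate value C, and the length filter reflects the observation that an eye produces a run that is long in the horizontal direction.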
[0059]
Next, a description will be given of continuous data formed when the extraction points determined for each pixel row in the vertical direction are adjacent in the horizontal direction of the image, the representative coordinate value C of the continuous data, and the existence area EA.
[0060]
FIG. 6 is an explanatory diagram showing the continuous data formed in the process of step ST36 shown in FIG. 5, the representative coordinate value C and the existence area EA determined in the process of step ST37. Note that the tracking target candidate position specifying process (ST20) specifies one or a plurality of tracking target candidates, but FIG. 6 illustrates an example in which a plurality of tracking target candidates are specified.
[0061]
As shown in the figure, the microcomputer 3 forms a plurality of continuous data G. This is because, although the eye is the detection target, objects whose feature quantities are similar to those of the eye (the mouth, nose, eyebrows, and the like) are also detected.
[0062]
As described above, the continuous data G is formed when the extraction points determined for each pixel row in the vertical direction are adjacent in the horizontal direction of the image. Then, the representative coordinate value C is determined by the average value of the X coordinate values of the pixels at both ends in the horizontal direction forming the continuous data and the average value of the Y coordinate of each pixel forming the continuous data. Further, the existence area EA is set based on the representative coordinate value C.
[0063]
Next, a method of setting the existence area EA will be described. FIG. 7 is an explanatory diagram showing the size of the existence area EA shown in FIG. 6. FIGS. 8 and 9 show statistical data on the horizontal length Xa and the vertical length Ya obtained by examining the eye sizes of several people. FIG. 10 is an explanatory diagram showing a method of determining the position of the existence area EA on the image.
[0064]
The setting of the existence area EA is performed by determining the size of the existence area EA and then determining the position of the existence area EA on the image.
[0065]
The existence area EA should be as small as possible, both to reduce noise (extraction of wrinkles, light and dark areas of the face, and the like) and to avoid slowing the processing. In the present embodiment, the size of the existence area EA is determined by examining the sizes of the face parts of several people and adding a margin (for example, ×1.5) to the measured size. That is, as shown in FIGS. 8 and 9, data on the vertical and horizontal dimensions of the face part are collected, and dimensions that cover, for example, 95% of the distribution are determined with a margin taken into account.
[0066]
The dimension covering the above 95%, that is, the horizontal dimension xa and the vertical dimension ya is determined in consideration of a margin (× 1.5) (FIG. 7). The size of the existence area EA may be determined by estimating the width and height of the face part by image processing and adding a margin to the vertical and horizontal sizes.
[0067]
After the size of the existence area EA is determined in this way, as shown in FIG. 10, the reference point P is determined based on, for example, the coordinate values (x1, y1) of the eye. The reference point P is set at a position separated by distances x2 and y2 from the coordinate values (x1, y1) of the eye.
[0068]
Then, the microcomputer 3 draws the dimensions x3, y3 of the existence area EA based on the point P. Thus, the position of the existence area EA is determined. After that, the existence area EA is set for all the continuous data G found in the entire image.
[0069]
Note that x2 and y2 are 1/2 of x3 and y3, respectively, and are desirably set in advance to lengths such that the existence area EA is centered on the eye.
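The placement of the existence area EA can be illustrated with a small sketch. It rests on several assumptions that the text does not state outright: that x3 and y3 are the 95% covering dimensions xa and ya multiplied by the margin, that the offsets x2 and y2 are half of x3 and y3 so that the area is centered on the eye, and that the reference point P is the upper-left corner of the area. The function name is hypothetical.

```python
def existence_area(eye_xy, xa, ya, margin=1.5):
    """Return (left, top, right, bottom) of the existence area EA around an eye position."""
    x1, y1 = eye_xy
    x3, y3 = xa * margin, ya * margin   # assumed: area size is covering size times margin
    x2, y2 = x3 / 2, y3 / 2             # assumed: offsets are half the area size
    px, py = x1 - x2, y1 - y2           # reference point P (assumed upper-left corner)
    return (px, py, px + x3, py + y3)
```

With an eye at (320, 240) and covering dimensions of 40 by 20 pixels, this gives an area from (290, 225) to (350, 255).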
[0070]
The tracking target candidate position specifying process (ST20) in FIG. 4 is performed by the processes in FIGS.
[0071]
Next, the tracking target determination process (ST21) of FIG. 4 will be described. FIG. 11 is a flowchart showing details of the tracking target determination process (ST21) shown in FIG. 4.
[0072]
First, the microcomputer 3 stores the image data of the existence area EA obtained by the processing of FIG. 5 in the image memory as the minute image IG (ST40). FIG. 12 is an explanatory diagram showing the minute image, together with the state of the entire image and the minute image IG stored in the image memory. As shown in FIG. 12, the microcomputer 3 extracts the image within the existence area EA from the entire image and uses it as the minute image IG.
[0073]
Returning to FIG. 11, the microcomputer 3 sets the representative coordinate value C of the whole image as the representative coordinate value IC of the small image IG. Then, the microcomputer 3 sets a range AR based on the representative coordinate value IC of the small image IG, and sets a binarization threshold based on the density information of the range AR (ST41).
[0074]
An example of a method of calculating the binarization threshold in the range AR will be described with reference to FIG. 13. FIG. 13 is an explanatory diagram of a method of calculating a binarization threshold value in the range AR. First, the microcomputer 3 reads out the density values of several lines in the vertical direction in the range AR.
[0075]
Then, the microcomputer 3 stores the highest (brightest) density value and the lowest (darkest) density value in each line. When the values for all the lines have been stored, the microcomputer 3 determines, among the highest (bright) density values of the lines, the lowest (darkest) one (the skin portion), and, among the lowest (dark) density values of the lines, the lowest one (the eye portion). The midpoint between these two values is used as the binarization threshold.
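The threshold calculation can be sketched as below. The reading used here (darkest of the per-line bright values for the skin, darkest of the per-line dark values for the eye, then the midpoint) follows the description in paragraph [0077]; the function name and the list-of-lists input are assumptions made for illustration.

```python
def binarization_threshold(lines):
    """Compute a binarization threshold from vertical lines read in range AR.

    lines: one sequence of density values per vertical line (higher = brighter).
    """
    brightest_per_line = [max(line) for line in lines]
    darkest_per_line = [min(line) for line in lines]
    skin_level = min(brightest_per_line)  # darkest skin value, avoids glare
    eye_level = min(darkest_per_line)     # darkest value, taken as the eye
    return (skin_level + eye_level) / 2   # midpoint of the two levels
```

Taking the darkest of the bright values means that a single line hit by strong reflected light does not raise the threshold.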
[0076]
The above-mentioned range AR is set so that a black part of the eye and a white part of the skin around the eye are included in order to suitably determine the binarization threshold. In addition, the range AR is set to a minimum necessary size in order to reduce the influence of variations in brightness of the image.
[0077]
Further, the binarization threshold is set to the midpoint between the lowest (dark) density value of the eye within the range AR and the lowest (dark) density value of the skin portion, so that it is a value suitable for cutting out the eye portion.
[0078]
Here, the reason why the lowest (dark) density value in the skin portion is used to determine the binarization threshold is as follows. For example, when direct light is incident on a part of the range AR, the skin portion tends to reflect light more strongly than the black portion of the eyeball. For this reason, the device 1 receives a large amount of light that amounts to noise.
[0079]
In this case, even if the range AR from which the density values are read is made as small as possible, the image is affected by the noise light, and the device 1 cannot determine an accurate binarization threshold. For this reason, in the present embodiment, the lowest (dark) density value of the skin portion is used instead of the bright portion that may be reflecting light strongly, so that a more appropriate binarization threshold can be determined.
[0080]
Returning to FIG. 11, after determining the binarization threshold, the microcomputer 3 binarizes the small image IG using that threshold and stores the result in the image memory as the binary image bG (ST42).
[0081]
Next, the microcomputer 3 sets the representative coordinate value C of the whole image as the position bC of the binary image bG, and sets this position bC as the initial position (ST43). Thereafter, the microcomputer 3 determines whether or not the set position is a black pixel (ST44). Here, it is determined whether the initial position set in step ST43 is a black pixel.
[0082]
If it is determined that the set position is not a black pixel (ST44: NO), the microcomputer 3 shifts the set position up, down, left, and right by one pixel (ST45). The microcomputer 3 then determines again whether the set position, as shifted in step ST45, is a black pixel. This process is repeated until a black pixel is found.
[0083]
On the other hand, when it is determined that the set position is a black pixel (ST44: YES), the microcomputer 3 sets a connected component of the black pixel as a candidate object (ST46). Then, the microcomputer 3 calculates the geometric shape of the candidate object (ST47).
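Steps ST43 to ST46 can be sketched as follows. This is an assumption-laden illustration: the outward search is implemented as a breadth-first search over up, down, left and right neighbours, black pixels are represented by 1, and 4-connectivity is used for the connected component. None of these choices is specified in the text, and the function names are hypothetical.

```python
from collections import deque

STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # up, down, left, right

def _neighbours(binary, r, c):
    """Yield the in-bounds 4-neighbours of pixel (r, c)."""
    rows, cols = len(binary), len(binary[0])
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

def find_black_pixel(binary, start):
    """Search outward from start (row, col) until a black pixel (1) is found."""
    queue, seen = deque([start]), {start}
    while queue:
        r, c = queue.popleft()
        if binary[r][c] == 1:
            return (r, c)
        for pos in _neighbours(binary, r, c):
            if pos not in seen:
                seen.add(pos)
                queue.append(pos)
    return None  # no black pixel anywhere in the binary image

def connected_component(binary, seed):
    """Collect all black pixels 4-connected to the seed (the candidate object)."""
    component, queue = {seed}, deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in _neighbours(binary, r, c):
            if binary[nr][nc] == 1 and (nr, nc) not in component:
                component.add((nr, nc))
                queue.append((nr, nc))
    return component
```

The returned set of pixel coordinates is what a later step would measure (width, curvature, concavities) when the geometric shape of the candidate object is calculated.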
[0084]
After the calculation, the microcomputer 3 compares the geometric shape of the template to be tracked, stored in advance, with the geometric shape of the candidate object (ST48). An example of a method of comparing geometric shapes between a candidate object and a template to be tracked will be described with reference to FIG. 14.
[0085]
FIG. 14 is an explanatory diagram of the method of comparing geometric shapes between the candidate object and the template of the eye to be tracked: (a) shows a case where the candidate object is imaged in an optimal state, (b) shows a state where the right side of the eye is missing, and (c) shows a state where the left side of the eye is missing.
[0086]
The binarized shape of the eye image is as shown in FIG. 14 (a) if the image is stable and the light environment is good. However, when the light environment deteriorates, for example because direct sunlight enters the vehicle interior from one side, part of the shape may be missing, as shown in FIGS. 14 (b) and (c).
[0087]
To judge such candidate objects accurately, the microcomputer 3 makes the comparison under three conditions. Condition (i) is that the width is equal to or more than ２ of the typical value for an eye, and that the curvature is upwardly convex and within a predetermined range. Condition (ii) is that there is a concave shape on the left side of the black part of the eye. Condition (iii) is that there is a concave shape on the right side of the black part of the eye.
[0088]
Returning to FIG. 11, after comparing the geometric shapes, the microcomputer 3 makes a judgment based on the above three conditions and determines whether the geometric shapes of the candidate object and the eye template match (ST49). Here, to allow for cases where part of the eye shape is missing as shown in FIGS. 14 (b) and (c), the microcomputer 3 judges as matching those candidates that satisfy conditions (i) and (ii) and those that satisfy conditions (ii) and (iii).
[0089]
If it is determined that they do not match (ST49: NO), the microcomputer 3 determines that the candidate object is not a face part to be tracked (ST50), and then the process proceeds to step ST22 in FIG. 4.
[0090]
On the other hand, if it is determined that they match (ST49: YES), the microcomputer 3 determines that the candidate object is a face part to be tracked (ST51). Then, the coordinate value of the determined candidate object (corresponding to the representative coordinate value C in the whole image) is stored as the coordinate value of the eye on the image (ST52).
[0091]
Thereafter, the microcomputer 3 stores the small image IG that includes the candidate object determined to match in the image memory as the tracking target image MG_i (ST53). Thereafter, the process proceeds to step ST22 in FIG. 4.
[0092]
In the process of FIG. 11, the binarized candidate object is detected using the binarization threshold. For this reason, in the present embodiment, it is possible to clearly distinguish the eye portion from other portions (the background and the face portion other than the eye), and to accurately capture the eye. Furthermore, the determination using the geometric shape of the candidate object can be performed more accurately, and the eye position detection accuracy can be further improved.
[0093]
As described above with reference to FIGS. 4 to 14, the microcomputer 3 (the face part detection means CL2) detects a face part to be tracked from the entire input image. Then, as described above, when the face part to be tracked is detected, the tracking target detection flag “GetFlag” is set to “TRUE”. Then, as shown in FIG. 3, the tracking process (ST19) is executed.
[0094]
FIG. 15 is a flowchart showing details of the tracking process (ST19) shown in FIG. 3. As shown in the figure, when the determination in step ST16 is “NO”, the microcomputer 3 first executes a process of setting a face part search area (ST60). The process of step ST60 is performed by the face part search area setting means CL31 shown in FIG. That is, the microcomputer 3 executes a program corresponding to the face part search area setting means CL31. With reference to FIG. 16, the outline of the setting process of the face part search area will be described.
[0095]
FIG. 16 is an explanatory diagram of the setting process (ST60) of the face part search area shown in FIG. 15: (a) shows an image taken at time t0, (b) shows an image taken at time t1, (c) shows an image taken at time t2, (d) shows an image taken at time t3, and (e) shows the left eye positions from these images plotted on one image.
[0096]
When the subject changes the direction of the face, first, the image shown in FIG. 16 (a) is captured at time t0. At this time, the subject is looking almost straight ahead. Thereafter, at time t1, the image shown in FIG. 16 (b) is captured. At this time, the subject starts turning his or her face to the right (to the left in FIG. 16) in order to check the side mirror or the like. Because the face has started to turn to the right, the position of the left eye of the subject moves to the right.
[0097]
Then, at time t2, the image shown in FIG. 16 (c) is captured. At this time, the subject has turned the face further to the right than at time t1. Therefore, the position of the left eye moves further to the right.
[0098]
Thereafter, at time t3, the image shown in FIG. 16 (d) is captured. At this time, the subject is checking the side mirror or the like, and the face is turned furthest to the right. Therefore, the position of the left eye has moved furthest to the right.
[0099]
Then, as shown in FIG. 16 (e), it can be seen that the position of the left eye on these images is gradually moving from time t0 to t3. In the setting process of the face part search area (ST60), the setting is made so as to include the left eye position that moves during each of the periods (t0 to t1, t1 to t2, t2 to t3) from time t0 to t3.
[0100]
Returning to FIG. 15, after step ST60, the microcomputer 3 executes a setting process of a priority face part search area (ST61). The process of step ST61 is performed by the priority face part search area setting means CL32 shown in FIG. That is, the microcomputer 3 executes a program corresponding to the priority face part search area setting means CL32. With reference to FIG. 17, the outline of the setting process of the priority face part search area will be described.
[0101]
FIG. 17 is an explanatory diagram of the setting process (ST61) of the priority face part search area shown in FIG. 15: (a) shows an image taken at time t10, (b) shows an image taken at time t11, (c) shows an image taken at time t12, (d) shows an image taken at time t13, and (e) shows the left eye positions from these images plotted on one image.
[0102]
When the subject is looking in one direction, first, the image of FIG. 17 (a) is captured at time t10. Thereafter, at time t11, time t12, and time t13, the images of FIGS. 17 (b), (c), and (d) are captured, respectively.
[0103]
Because the subject is looking in one direction, the left eye position on these images is almost stationary, as is clear from FIG. 17 (e).
[0104]
In the priority face part search area setting process (ST61), the area is set so as to include the left eye position that moves during each of the periods (t10 to t11, t11 to t12, t12 to t13) from time t10 to t13.
[0105]
Here, the distribution of the left eye position in the case where the subject is looking in one direction and in the case where the direction of the face is changed will be described. FIG. 18 is an explanatory diagram illustrating the distribution of the left eye position in these two cases. Here, the vertical axis of FIG. 18 is the coordinate value of the image in the Y-axis direction, and the horizontal axis is the coordinate value of the image in the X-axis direction. The image size is 640 × 480, the maximum value on the vertical axis is 480, and the maximum value on the horizontal axis is 640. Further, FIG. 18 shows a plot of coordinates when sampling is performed at a video rate of 30 frames / second.
[0106]
As shown in the figure, when the subject is looking in one direction, the left eye position stays at almost one point. As indicated by the trajectory a, the coordinate values at each time are substantially constant, at 200 to 230 on the X axis and 350 to 390 on the Y axis.
[0107]
On the other hand, when the subject turns the face, for example toward the operation panel of the air conditioner (lower left direction), the left eye position moves greatly. As shown by the trajectory b, the coordinate values at each time range over 390 to 520 on the X axis and 240 to 350 on the Y axis.
[0108]
FIG. 19 shows the analysis result of this distribution. FIG. 19 is an explanatory diagram showing an analysis result of the movement amount of the left eye position obtained from the distribution shown in FIG. 18. FIG. 19 shows the analysis results obtained when images are captured at 30 ms / frame and at 60 ms / frame while the subject moves in the same manner as the trajectories a and b in FIG. 18. The image size here is 640 × 480.
[0109]
First, when the same movement as the trajectory a is imaged at 30 ms / frame, the average movement amount per frame is “1.13” in the X-axis direction and “0.52” in the Y-axis direction. The standard deviation at this time is “0.95” in the X-axis direction and “0.52” in the Y-axis direction, and the 3σ movement amount is “3.97” in the X-axis direction and “2.08” in the Y-axis direction. The maximum movement amount is “4” in the X-axis direction and “2” in the Y-axis direction.
[0110]
On the other hand, when the same movement as the trajectory b is captured at 30 ms / frame, the average movement amount per frame is “3.38” in the X-axis direction and “2.35” in the Y-axis direction. In this case, the standard deviation is “2.63” in the X-axis direction and “2.12” in the Y-axis direction, and the 3σ movement amount is “11.27” in the X-axis direction and “8.72” in the Y-axis direction. The maximum movement amount is “14” in the X-axis direction and “9” in the Y-axis direction.
[0111]
When the same movement as the trajectory a is captured at 60 ms / frame, the average movement amount per frame is “1.76” in the X-axis direction and “0.91” in the Y-axis direction. In this case, the standard deviation is “1.47” in the X-axis direction and “0.68” in the Y-axis direction, and the 3σ movement amount is “6.18” in the X-axis direction and “2.94” in the Y-axis direction. The maximum movement amount is “6” in the X-axis direction and “3” in the Y-axis direction.
[0112]
On the other hand, when the same movement as the trajectory b is captured at 60 ms / frame, the average movement amount per frame is “5.77” in the X-axis direction and “4.25” in the Y-axis direction. In this case, the standard deviation is “4.10” in the X-axis direction and “3.70” in the Y-axis direction, and the 3σ movement amount is “18.06” in the X-axis direction and “15.35” in the Y-axis direction. The maximum movement amount is “15” in the X-axis direction and “14” in the Y-axis direction.
[0113]
Thus, as is apparent from FIG. 19, when the subject is looking in one direction, the movement amount of the left eye position is at most several pixels, whereas when the face direction is changed, the movement amount of the left eye position reaches several tens of pixels at the maximum.
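The per-frame statistics quoted for FIG. 19 can be reproduced from a coordinate trace along one axis with a sketch like the one below. The reported three-standard-deviation movement amounts agree, to within rounding, with the mean plus three standard deviations (for example 1.13 + 3 × 0.95 = 3.98 against the reported 3.97), so that reading is used here. The function name is hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics

def movement_stats(coords):
    """Summarize frame-to-frame movement along one axis.

    coords: coordinate value of the eye in consecutive frames.
    """
    steps = [abs(b - a) for a, b in zip(coords, coords[1:])]
    mean = statistics.mean(steps)
    std = statistics.pstdev(steps)  # population std (assumed)
    return {
        "mean": mean,
        "std": std,
        "three_sigma": mean + 3 * std,  # movement bound at three std
        "max": max(steps),
    }
```

Statistics of this kind are one way to choose the half-widths of the two search areas: the stationary case bounds the priority area and the face-turning case bounds the wider search area.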
[0114]
Returning to FIG. 15, after step ST61, the microcomputer 3 performs a process of specifying a tracking target candidate position (ST62). This process is the same as the process shown in FIG. This process is performed by the face part candidate extracting means CL33 shown in FIG. That is, the microcomputer 3 executes a program corresponding to the face part candidate extracting means CL33.
[0115]
In outline, the microcomputer 3 first detects the density values of pixels along a vertical pixel column of the captured image. At this time, the microcomputer 3 computes an arithmetic mean to obtain an average density value. Then, for each local rise in the average density value, the microcomputer 3 selects one pixel as an extraction point. Thereafter, when the extraction points determined for the vertical pixel columns are adjacent to one another in the horizontal direction of the image, the microcomputer 3 groups them to form continuous data G extending in the horizontal direction. This continuous data G is the same as that described with reference to FIGS. Then, the microcomputer 3 sets the representative coordinate value C of the formed continuous data G as a candidate point of the tracking target candidate.
[0116]
After step ST62, the microcomputer 3 determines whether or not the tracking target candidate is within the priority face part search area (ST63). More specifically, it is determined whether or not the representative coordinate value C, which is the candidate point of the tracking target candidate, is within the priority face part search area. This process is performed by the face part determination means CL34 shown in FIG. That is, the microcomputer 3 executes a program corresponding to the face part determination means CL34.
[0117]
If it is determined that the candidate is in the priority face part search area (ST63: YES), the microcomputer 3 determines that the tracking target candidate is the tracking target (ST64). Then, the microcomputer 3 stores the existence area EA including the face part determined to be the tracking target in the image memory as the small image IG (ST65).
[0118]
Thereafter, the microcomputer 3 stores the representative coordinate value C of the tracking target candidate as the coordinate value of the tracking target (ST66), and further stores the small image IG in the image memory as the tracking target image MG_i (ST67).
[0119]
Then, the microcomputer 3 initializes the non-detection counter (ST68). Thereafter, the process proceeds to step ST18 shown in FIG. 3. The non-detection counter counts the number of consecutive processing passes in which the tracking target could not be identified.
[0120]
On the other hand, if it is determined that the candidate is not in the priority face part search area (ST63: NO), the process proceeds to step ST70 shown in FIG. 20.
[0121]
FIG. 20 is a flowchart illustrating a process executed when it is determined that the tracking target candidate is not in the priority face part search area.
[0122]
The microcomputer 3 first performs a tracking target determination process based on the density of the minute image IG (ST70). The process of step ST70 is a process performed by the face part determination unit CL34 described with reference to FIG. That is, the microcomputer 3 executes a program corresponding to the face part determination means CL34.
[0123]
Specifically, the processing shown in FIG. 21 is executed. FIG. 21 is a flowchart showing details of the tracking target determination process (ST70) based on the density shown in FIG. 20.
[0124]
As shown in the figure, first, the microcomputer 3 stores the small image IG in the image memory (ST90). Thereafter, the microcomputer 3 obtains a similarity parameter between the density data of the small image IG and that of the tracking target image MG_(i-1) (ST91).
[0125]
Here, the tracking target image MG_(i-1) is the image of the tracking target stored in the image memory in the previous tracking process. As shown in step ST67 of FIG. 15, the tracking target image MG_(i-1) is the small image IG that was previously determined to include the face part to be tracked.
[0126]
That is, the microcomputer 3 obtains the similarity parameter of the density data from both the small image IG including the tracking target candidate extracted from the current image frame and the small image including the tracking target identified in a past image frame.
[0127]
The similarity parameter of the density value data is obtained by the following equation.
[0128]
(Equation 1)

Here, I(m, n) indicates the density of a pixel of the small image IG, T(m, n) indicates the density of a pixel of the tracking target image MG_(i-1), and M and N indicate the pixel sizes. As shown in the above equation, the similarity parameter is expressed as a residual sum. The residual sum is small when the similarity between the two images is high and large when the similarity is low.
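A residual sum over two equally sized images can be sketched as follows. Since the equation itself is not reproduced in this text, the sum of absolute differences used here is an assumed form that matches the definitions of I(m, n), T(m, n), M and N; the function names and the acceptance threshold are hypothetical.

```python
def residual_sum(image, template):
    """Sum of absolute density differences over all M x N pixels (assumed form)."""
    if len(image) != len(template) or any(
        len(row_i) != len(row_t) for row_i, row_t in zip(image, template)
    ):
        raise ValueError("images must have the same pixel size")
    return sum(
        abs(i - t)
        for row_i, row_t in zip(image, template)
        for i, t in zip(row_i, row_t)
    )

def is_similar(image, template, threshold):
    """High similarity means a small residual sum (threshold is hypothetical)."""
    return residual_sum(image, template) <= threshold
```

Identical images give a residual sum of 0, and the value grows as the candidate departs from the stored tracking target image.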
[0129]
After this processing, the microcomputer 3 determines whether or not the extracted candidate is a face part to be tracked based on the similarity parameter (ST92). That is, it is determined whether or not the similarity is high, and it is determined whether or not the minute image IG includes a face part to be tracked.
[0130]
If it is determined that the similarity is not high (ST92: NO), the microcomputer 3 determines that the candidate object included in the small image IG is not a face part to be tracked (ST93). Thereafter, the process proceeds to step ST71 in FIG. 20.
[0131]
On the other hand, when determining that the similarity is high (ST92: YES), the microcomputer 3 determines that the candidate object included in the small image IG is a face part to be tracked (ST94). Thereafter, the process proceeds to step ST71 in FIG. 20.
[0132]
Returning to FIG. 20, after step ST70, the microcomputer 3 determines whether or not the existence area EA includes a face part to be tracked, based on the determinations in steps ST93 and ST94 shown in FIG. 21 (ST71).
[0133]
If it is determined that the area includes a face part to be tracked (ST71: YES), the process proceeds to step ST66 shown in FIG. 15. On the other hand, when it is determined that the area does not include the face part to be tracked (ST71: NO), the microcomputer 3 performs a tracking target determination process using a frequency image (ST72). The process of step ST72 is performed by the face part determination means CL34 described with reference to FIG.
[0134]
Specifically, the processing shown in FIG. 22 is executed. FIG. 22 is a flowchart showing details of the tracking target determination process (ST72) using the frequency image shown in FIG. 20.
[0135]
As shown in the figure, first, the microcomputer 3 stores the existence area EA in the image memory as a small image IG (ST100). Thereafter, the microcomputer 3 performs frequency processing on the minute image IG to generate a frequency image IFG, and stores the frequency image IFG in the image memory (ST101). That is, the microcomputer 3 generates a frequency image IFG by performing frequency processing on the small image IG including the tracking target candidate extracted from the current image frame.
[0136]
The generation of the frequency image here is performed by a general method such as Fourier transform or wavelet transform. FIG. 23 is an explanatory diagram of the frequency image generation process (step ST101) shown in FIG. 22. (a) shows a small image IG, and (b) shows a frequency image.
[0137]
When the small image IG as shown in FIG. 23A is subjected to frequency processing, for example, an image as shown in FIG. 23B is obtained. The microcomputer 3 stores the frequency image in the image memory.
[0138]
Returning to FIG. 22, after step ST101, the microcomputer 3 applies frequency processing to the tracking target image MG_(i-1) stored in the image memory in the previous tracking process to obtain a frequency image BIFG, and stores it in the image memory (ST102). That is, the microcomputer 3 applies frequency processing to the tracking target image MG_(i-1), which includes the face part to be tracked identified in a past image frame, to obtain the frequency image BIFG. Note that the frequency processing here is the same as that described with reference to FIG. 23.
[0139]
Next, the microcomputer 3 calculates a similarity parameter between the frequency images IFG and BIFG (ST103). The method of calculating the similarity parameter is the same as that in step ST91 shown in FIG. 21, and is performed by obtaining the residual sum of the density data.
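The frequency-image comparison can be sketched as below. Two choices here are assumptions: the frequency image is taken to be the magnitude of a two-dimensional discrete Fourier transform (the text allows Fourier or wavelet transforms and does not fix the representation), and the residual sum is again taken as a sum of absolute differences. The direct DFT is written out for clarity and is only practical for small images; all names are hypothetical.

```python
import cmath

def frequency_image(image):
    """Magnitude of the 2-D discrete Fourier transform (naive, for small images)."""
    rows, cols = len(image), len(image[0])
    result = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = -2 * cmath.pi * (u * m / rows + v * n / cols)
                    total += image[m][n] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        result.append(out_row)
    return result

def frequency_residual(image, template):
    """Residual sum between the frequency images IFG and BIFG (assumed form)."""
    ifg, bifg = frequency_image(image), frequency_image(template)
    return sum(
        abs(a - b)
        for row_a, row_b in zip(ifg, bifg)
        for a, b in zip(row_a, row_b)
    )
```

Because the magnitude spectrum does not change when the image content shifts cyclically, a comparison of this kind can tolerate small displacements of the face part inside the small image that a pixel-by-pixel density comparison would penalize.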
[0140]
After this process, the microcomputer 3 determines whether or not the extracted candidate is a face part to be tracked, based on the calculated similarity parameter (ST104). That is, it is determined whether or not the similarity is high, and it is determined whether or not the minute image IG includes a face part to be tracked.
[0141]
When it is determined that the similarity is not high (ST104: NO), the microcomputer 3 determines that the candidate object included in the small image IG is not a face part to be tracked (ST105). Thereafter, the process proceeds to step ST73 in FIG. 20.
[0142]
On the other hand, when it is determined that the similarity is high (ST104: YES), the microcomputer 3 determines that the candidate object included in the small image IG is a face part to be tracked (ST106). Thereafter, the process proceeds to step ST73 in FIG. 20.
[0143]
Returning to FIG. 20, after step ST72, the microcomputer 3 determines whether or not the existence area EA includes a face part to be tracked, based on the determinations in steps ST105 and ST106 shown in FIG. 22 (ST73).
[0144]
If it is determined that the area includes a face part to be tracked (ST73: YES), the process proceeds to step ST66 shown in FIG. 15. On the other hand, when it is determined that the area does not include the face part to be tracked (ST73: NO), the microcomputer 3 performs a tracking target determination process based on the geometric shape of the candidate object (ST74). The process of step ST74 is performed by the face part determination means CL34 described with reference to FIG.
[0145]
Specifically, the processing shown in FIG. 24 is executed. FIG. 24 is a flowchart showing details of the tracking target determination process (ST74) based on the geometric shape of the candidate object shown in FIG. 20. Steps ST110 to ST118 shown in FIG. 24 are the same as steps ST40 to ST48 shown in FIG. 11.
[0146]
After this processing, the microcomputer 3 determines whether or not the extracted candidate is a face part to be tracked, based on the calculated degree of matching of the geometric shapes (ST119). That is, it is determined whether or not the geometric shapes match, and whether or not the micro image IG includes a face part to be tracked is determined.
[0147]
If it is determined that they do not match (ST119: NO), the microcomputer 3 determines that the candidate object included in the small image IG is not a face part to be tracked (ST120). Thereafter, the process proceeds to step ST75 in FIG. 20.
[0148]
On the other hand, when it is determined that they match (ST119: YES), the microcomputer 3 determines that the candidate object included in the small image IG is a face part to be tracked (ST121). Thereafter, the process proceeds to step ST75 in FIG. 20.
[0149]
Returning to FIG. 20, after step ST74, the microcomputer 3 determines whether or not the existence area EA includes a face part to be tracked, based on the determinations in steps ST120 and ST121 shown in FIG. 24 (ST75).
[0150]
If it is determined that the area includes a face part to be tracked (ST75: YES), the process proceeds to step ST66 shown in FIG. 15. On the other hand, when it is determined that the area does not include the face part to be tracked (ST75: NO), the microcomputer 3 performs the process of step ST76.
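The order of the determinations in steps ST63 to ST76 can be sketched as a short cascade. The three checks (density, frequency image, geometric shape) are passed in as callables so that the sketch stays independent of how each one is implemented; all names are hypothetical, and this is an illustration of the control flow only, not the device's actual program.

```python
def select_tracking_target(candidates, in_priority_area, checks):
    """Return the first candidate accepted as the tracking target, or None.

    candidates: tracking target candidates extracted in the search area.
    in_priority_area: callable, True if the candidate lies in the priority area.
    checks: ordered callables (density, frequency image, geometric shape).
    """
    for candidate in candidates:
        if in_priority_area(candidate):
            return candidate  # accepted without any image processing
        # any() stops at the first check that succeeds, so the later and
        # costlier checks run only when the earlier ones have failed.
        if any(check(candidate) for check in checks):
            return candidate
    return None  # no candidate accepted: the non-detection counter is incremented
```

A candidate inside the priority area is accepted immediately, which is where the gain in processing speed comes from; the image-based checks are reserved for candidates that lie only in the wider search area.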
[0151]
In step ST62 shown in FIG. 15, a plurality of tracking target candidates may be extracted. For example, when the subject wears glasses, a plurality of tracking target candidates may be extracted (described later). Therefore, the microcomputer 3 determines whether there is another tracking target candidate, that is, whether there is a tracking target candidate that has not yet been judged (ST76). If it is determined that there is another tracking target candidate (ST76: YES), the process proceeds to step ST63 in FIG. 15.
[0152]
On the other hand, if it is determined that there is no other tracking target candidate (ST76: NO), the microcomputer 3 increments the non-detection counter (ST77). Thereafter, the microcomputer 3 determines whether or not the value of the non-detection counter has exceeded the number of transitions to the face part re-detection processing (ST78). The number of transitions to the face part re-detection processing indicates how many times the tracking process of step ST19 is carried out consecutively, without performing the processing of step ST17 in FIG. 3, even though the face part to be tracked cannot be identified. This number varies depending on the processing speed, processing accuracy, and the like of the system, and may be set appropriately according to the application of the device 1.
[0153]
If it is determined that the number of transitions to the face part re-detection processing has not been exceeded (ST78: NO), the process proceeds to step ST18 shown in FIG. 3. Then, the processing of steps ST13 to ST15 is performed, and the tracking process (ST19) is performed again. If the tracking target candidate is again not determined to be the tracking target when step ST19 is performed again, the non-detection counter is incremented further. When, after the processing of step ST19 has been repeated, the value of the non-detection counter exceeds the number of transitions to the face part re-detection processing (ST78: YES), the microcomputer 3 sets the tracking target detection flag “GetFlag” to “FALSE” (ST79).
[0154]
Thereafter, the microcomputer 3 initializes the non-detection counter (ST80), and the process proceeds to step ST18 shown in FIG. 3.
[0155]
If the value of the non-detection counter exceeds the number of transitions to the face part re-detection processing, the tracking target detection flag “GetFlag” is set to “FALSE”, so the tracking target detection process (ST17) shown in FIG. 3 is executed again. That is, if the tracking target still cannot be identified after the processing of step ST19 has been repeated several times, the microcomputer 3 finally concludes that the tracking target cannot be identified, and the tracking target detection process (ST17) is executed again.
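The handling of the non-detection counter and the flag "GetFlag" (steps ST68 and ST77 to ST80) can be sketched as a small state update. The function names and the limit value used in the example are hypothetical; the text leaves the limit to be chosen per system.

```python
def on_target_found():
    """ST68: tracking succeeded, so reset the counter and keep tracking."""
    return 0, True  # (non-detection counter, GetFlag)

def on_target_missed(counter, redetect_limit):
    """ST77 to ST80: count a miss and fall back to re-detection past the limit."""
    counter += 1                      # ST77
    if counter > redetect_limit:      # ST78: YES
        return 0, False               # ST79 and ST80: GetFlag FALSE, counter reset
    return counter, True              # ST78: NO, keep tracking on the next frame
```

With a limit of 2, two consecutive misses leave the flag TRUE with counter values 1 and 2, and the third miss resets the counter and sets the flag to FALSE, which sends the next frame back to detection over the entire image.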
[0156]
Next, the process of setting the face part search area (ST60) and the process of setting the priority face part search area (ST61) shown in FIG. 15 will be described in more detail.
[0157]
FIG. 25 is a flowchart showing the details of the setting process of the face part search area (ST60), and FIG. 26 is a flowchart showing the details of the setting processing of the priority face part search area (ST61). As shown in FIG. 25, the microcomputer 3 sets the position of the face part search area (ST130). Here, the center position of the face part search area is set based on the representative coordinate value C of the face part to be tracked detected or determined in the previous processing.
[0158]
Thereafter, the size of the face part search area is set (ST131). In this process, for example, the size is determined based on how many times the tracking process has been executed without being able to identify the tracking target, that is, based on information such as the numerical value of the non-detection counter. Then, the microcomputer 3 sets an area of the face part search area (ST132), and the process proceeds to step ST140 in FIG.
[0159]
In step ST140, the microcomputer 3 determines whether or not the non-detection counter has exceeded the non-set number of the priority face region. The non-set number of the priority face region is the count needed to determine that the face part is not being tracked. Like the number of transitions to the face part re-detection processing, the value to set depends on the processing speed and processing accuracy of the system. If the system can process at almost the video rate and the detection rate of the face part (the rate at which a face part is determined to be the face part) is about 90%, the non-set number can be set to 3 to 5.
[0160]
If it is determined that the non-detection counter has exceeded the non-set number of the priority face region (ST140: YES), the process proceeds to step ST62 in FIG. On the other hand, when it is determined that the non-detection counter does not exceed the non-set number of the priority face region (ST140: NO), the region of the priority face part search area is set (ST141), and the process proceeds to step ST62 of FIG.
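The branch at steps ST140 and ST141 can be sketched as below. The function name `set_priority_area`, the rectangle representation `(left, top, right, bottom)` and the default half-sizes are assumptions made for illustration only.

```python
def set_priority_area(center, non_detection_count, non_set_number,
                      half_w=12, half_h=8):
    """Return the priority search rectangle, or None when it is not set."""
    if non_detection_count > non_set_number:  # ST140: YES
        return None                           # priority area is not set
    cx, cy = center
    # ST141: rectangle centered on the previous representative coordinate C
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
```

For example, `set_priority_area((100, 80), 0, 3)` gives a rectangle around the point, while a counter of 4 with a non-set number of 3 gives `None`.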
[0161]
Next, the processing shown in FIGS. 25 and 26 will be described in more detail with reference to FIGS. FIG. 27 is an explanatory diagram of the face part search area and the priority face part search area. As shown in the drawing, the face region search area has a width H1 on one side and a height V1 on one side from the center. The priority face region search area has a width H2 on one side and a height V2 on one side from the center. The center here is, for example, the representative coordinate value C of the tracking target detected or determined in the previous processing. Further, the previous processing may be any of the tracking target detection processing (ST17) and the tracking processing (ST19). In step ST130 shown in FIG. 25, a process of determining the center coordinates is performed.
[0162]
As described above, the size of the region changes depending on the detection target and the like. The size of the region also depends on the processing speed and processing accuracy of the system. For example, in the above-described example, H1 may be set to 30 to 50 pixels, and V1 may be set to 20 to 30 pixels. Further, H2 may be set to about 10 to 15 pixels, and V2 may be set to about 5 to 10 pixels.
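As a rough illustration of how the two nested areas relate, the sketch below classifies a point against rectangles defined by the one-side sizes H1, V1, H2 and V2. The default values are picked from the ranges quoted above, and the function name and the inclusive boundary handling are assumptions.

```python
def classify_point(point, center, h1=40, v1=25, h2=12, v2=8):
    """Classify a candidate point relative to the two search areas.

    h1, v1: one-side width and height of the face part search area.
    h2, v2: one-side width and height of the priority search area.
    """
    dx = abs(point[0] - center[0])
    dy = abs(point[1] - center[1])
    if dx <= h2 and dy <= v2:
        return "priority"
    if dx <= h1 and dy <= v1:
        return "search"
    return "outside"
```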
[0163]
However, with the above-described face part search area, when the subject changes the face direction significantly, the face part to be tracked may move out of the area, so that it cannot be specified. That is, since the representative coordinate value C of the tracking target detected or determined in the previous processing is set as the center of the face part search area, the moving tracking target may already be located outside the area in the current processing.
[0164]
Thus, in the present embodiment, as shown in FIG. 28, the size of the face part search area is variable. FIG. 28 is an explanatory diagram showing an example in which the size of the face part search area is variable. As shown in the figure, when the face part to be tracked cannot be specified, the microcomputer 3 widens the face part search area.
[0165]
In the present embodiment, for example, when the tracking target fails to be specified once and the non-detection counter becomes “1”, the area in which the tracking target is likely to exist is expanded in order to find a tracking target candidate. In step ST131 shown in FIG. 25, the size of the face part search area is determined in this way.
[0166]
Further, the size of the face part search area may be determined as follows. FIG. 29 is an explanatory diagram showing another example in which the size of the face part search area is variable. As shown in the figure, the microcomputer 3 may sequentially increase the size of the face part search area based on the count value of the non-detection counter when expanding the face part search area. That is, the larger the value of the non-detection counter is, the wider the face part search area is. As described above, by determining the size of the area based on the value of the non-detection counter, the size of the area is determined according to the number of consecutive times that the tracking target could not be specified.
[0167]
Normally, increasing the size of the face part search area reduces the processing speed. Therefore, suddenly increasing the size of the face part search area compared to the size in the previous processing causes a sudden drop in processing speed. However, by determining the size in accordance with the value of the non-detection counter as in this example, it is possible to prevent a rapid decrease in the processing speed and make the face part search area an appropriate size.
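The stepwise enlargement driven by the non-detection counter might look like the following sketch. The linear growth rule, the step sizes and the upper limit are all hypothetical; the text only states that a larger counter value gives a wider area.

```python
def search_area_half_size(non_detection_count,
                          base=(40, 25), step=(10, 5), limit=(80, 50)):
    """Return (H1, V1) for the current value of the non-detection counter."""
    # Grow by one step per consecutive failure, capped at a maximum size
    # so that the processing cost stays bounded (assumed policy).
    h1 = min(base[0] + step[0] * non_detection_count, limit[0])
    v1 = min(base[1] + step[1] * non_detection_count, limit[1])
    return h1, v1
```

Growing one step at a time is what keeps the processing cost from jumping between consecutive frames.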
[0168]
Further, in the present embodiment, the microcomputer 3 may refrain from setting the priority face part search area. The processing in step ST140 in FIG. 26 corresponds to this.
[0169]
In step ST140, it is determined whether or not the non-detection counter has exceeded the non-set number of the priority face region. The microcomputer 3 judges every tracking target candidate as to whether it is the tracking target, and even when the tracking target cannot be identified, it keeps trying to identify the tracking target until the non-detection counter reaches the number of transitions to the face part re-detection processing. When the non-detection counter has reached that number, the microcomputer 3 determines that the tracking target ultimately could not be specified, and performs the tracking target detection processing of step ST17 in FIG. 3.
[0170]
Thus, in step ST140, if the non-detection counter exceeds the non-set number of the priority face region before it is finally determined that the tracking target cannot be specified, the priority face part search area is not set.
[0171]
In this example, the priority face part search area is not set, but the present invention is not limited to this. The priority face part search area may be set narrower.
[0172]
The center of the face part search area described in FIG. 27 may not be the representative coordinate value C of the face part of the tracking target detected or determined in the previous processing. An example of such a case is shown below. FIG. 30 is an explanatory diagram illustrating an example of setting the center position of the face part search area.
[0173]
The figure shows the positions of the eye and the center positions of the face part search area in the previous processing and in the processing before that. In the case of the example shown in FIG. 30, the microcomputer 3 first obtains the difference in the X-axis direction and the difference in the Y-axis direction between the center positions of the face part search area in the previous processing and in the processing before that. Then, these difference values are added to the previous center position, and the obtained coordinate value is set as the center position of the current face part search area.
[0174]
FIG. 31 is an explanatory diagram showing an example of an image including the position of the eyes and the center position of the face part search area, where (a) shows the entire image and (b) shows an enlarged image.
[0175]
When the processing described with reference to FIG. 30 is executed, as shown in FIG. 31A, the position of the eye is within the face part search area. Also, as is clear from the example of the enlarged image in FIG. 31B, as a result of setting the current face part search area based on the center positions in the previous processing and in the processing before that, the position of the eye falls within the current face part search area. As described above, in this example, by setting the face part search area based on the movement amount of the tracking target in the past image frames, appropriate processing can be performed according to the movement of the face of the subject.
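The center correction described for FIG. 30 amounts to a simple linear extrapolation from the two most recent center positions. A minimal sketch, with a hypothetical function name and coordinates given as `(x, y)` tuples:

```python
def predict_center(previous, before_previous):
    """Extrapolate the current search-area center from the last two centers."""
    dx = previous[0] - before_previous[0]  # difference in the X-axis direction
    dy = previous[1] - before_previous[1]  # difference in the Y-axis direction
    # Add the differences to the previous center position.
    return (previous[0] + dx, previous[1] + dy)
```

If the center moved from (110, 88) to (120, 90), the current center is placed at (130, 92).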
[0176]
In the present example, the center position of the face part search area is determined in accordance with the movement amount obtained from the positions of the tracking target in the previous processing and in the processing before that, but the invention is not limited to this. That is, the movement amount may be obtained from the positions of the tracking target specified two or more times before, and the center position may be determined based on that movement amount. Alternatively, the center position of the face part search area may first be set to the position of the tracking target specified last time, and the method of this example may be applied only when the tracking target is not specified at that position and the non-detection counter becomes “1”.
[0177]
Next, another example of setting the center position will be described. FIG. 32 is an explanatory diagram showing another example of setting the center position of the face part search area. FIG. 33 is an explanatory diagram showing another example of an image including the positions of the eyes and the center position of the face part search area, where (a) shows the entire image and (b) shows an enlarged image.
[0178]
The example described with reference to FIGS. 30 and 31 is effective when the difference value in the X-axis direction and the difference value in the Y-axis direction of the center position are large. The present example, by contrast, is effective when the difference value in the X-axis direction and the difference value in the Y-axis direction are small.
[0179]
As shown in FIGS. 32 and 33, when the difference value in the X-axis direction and the difference value in the Y-axis direction are not large, it is not necessary to set the face part search area according to the movement of the face of the subject. This is because the tracking target is included in the face part search area even if the area is not set according to the movement of the face of the subject.
[0180]
Therefore, in this example, when the difference value in the X-axis direction and the difference value in the Y-axis direction are small, the representative coordinate value C of the face part to be tracked detected or determined in the previous processing is set as the center position.
[0181]
As described above, the difference value in the X-axis direction and the difference value in the Y-axis direction are taken into account, and when the movement amount does not exceed a predetermined threshold value, the representative coordinate value C from the previous processing is used as the center position as usual. As a result, compared to the examples shown in FIGS. 30 and 31, detailed processing is not required, and quick processing can be performed.
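Combining the two center-setting examples gives a threshold test like the sketch below. The threshold value, the per-axis comparison and the function name are assumptions; the text only says that the previous representative coordinate value C is kept when the movement amount does not exceed a predetermined threshold.

```python
def choose_center(previous, before_previous, threshold=5):
    """Pick the search-area center from the last two target positions."""
    dx = previous[0] - before_previous[0]
    dy = previous[1] - before_previous[1]
    if abs(dx) <= threshold and abs(dy) <= threshold:
        # Small movement: use the previous position as is (FIGS. 32 and 33).
        return previous
    # Large movement: correct the center by the movement amount (FIG. 30).
    return (previous[0] + dx, previous[1] + dy)
```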
[0182]
Next, the operation of the face part tracking apparatus 1 according to the present embodiment will be described again with reference to image examples. In the following description, the representative coordinate value C is referred to as a representative coordinate point C for convenience. FIG. 34 is a diagram illustrating an example of an image when the subject is visually recognizing one direction. As shown in the figure, in this image example, the representative coordinate point C4 of the continuous data G4 falls within the priority face part search area. Therefore, the representative coordinate point C4 is determined as a face part. That is, "YES" is determined in step ST63 of FIG.
[0183]
FIG. 35 is a diagram illustrating an example of an image when the detected person changes the direction of the face. FIG. 35A illustrates an example of the entire image, and FIG. 35B illustrates an example of an enlarged image. As shown in FIG. 35A, the representative coordinate point C4 of the continuous data G4 is not in the priority face part search area but in the face part search area. Therefore, “NO” is determined in step ST63 of FIG. Then, the existence area EA set around the representative coordinate point C4 is stored in the image memory as a small image IG (FIG. 35B). After that, the tracking target determination processing of step ST70 and subsequent steps is sequentially performed.
[0184]
Next, the operation of the device 1 when the subject wears eyeglasses will be described. FIG. 36 is a diagram illustrating an example of an image when the subject wears eyeglasses, and FIG. 37 is a diagram illustrating an example of a plurality of micro images obtained when the subject wears eyeglasses.
[0185]
When the subject wears eyeglasses, a plurality of candidate points may be extracted from the face part search area as shown in FIG. 36. According to FIG. 36, the representative coordinate point C2 of the continuous data G2, the representative coordinate point C3 of the continuous data G3, and the representative coordinate point C4 of the continuous data G4 are all within the face part search area but outside the priority face part search area.
[0186]
For this reason, as shown in FIG. 37, the small images IG1, IG2, and IG3, which are the existence areas EA1, EA2, and EA3 centered on the representative coordinate points C2, C3, and C4, respectively, are stored in the image memory, and the tracking target determination processing is performed on them in sequence.
[0187]
In this example, the first micro image IGA1 is judged and determined not to be the face part to be tracked, and it is determined that there is another tracking target candidate in the process of step ST76 in FIG. Then, the second micro image IGA2 is judged, and the face part to be tracked is specified.
[0188]
In the present embodiment, an area surrounding the face part search area may be set as a continuous data extraction area, and continuous data may be extracted only within that area. FIG. 38 is a diagram illustrating an example when an area surrounding the face part search area is set as a continuous data extraction area. FIG. 39 is a diagram illustrating an example of continuous data extracted when a continuous data extraction area is set. As shown in FIGS. 38 and 39, the processing can be performed by setting a continuous data extraction area surrounding the face part search area and extracting candidates from within this area. In this case, since the representative coordinate point C1 of the continuous data G1 is in the priority face part search area, the representative coordinate point C1 is determined as the face part.
[0189]
Thus, the face part tracking device 1 according to the present embodiment sets the face part search area. The face part search area is set based on the detected position of the tracking target on the image and on the movement amount of the tracking target moving during the sampling time when the subject changes his or her face direction. Therefore, it can be said to be an area in which the face part to be tracked is likely to exist. The present apparatus 1 then extracts candidates for the tracking target from this area. For this reason, for captured images after the detection of the tracking target, candidates can be extracted from the region where the tracking target is likely to exist without extracting candidates from the entire image, so that accurate and quick processing can be performed.
[0190]
Also, a priority face part search area is set within the face part search area. Since it is set within the face part search area, the priority face part search area can be said to be an area in which the face part to be tracked is even more likely to exist. When a tracking target candidate is within the priority face part search area, the candidate is very likely to be the face part to be tracked, so the face part determination means CL34 determines this candidate to be the face part to be tracked.
[0191]
On the other hand, when the extracted candidate is within the face part search area but outside the priority face part search area, the candidate is likely to be the face part to be tracked, though less likely than a candidate within the priority face part search area. For this reason, the face part determination means CL34 performs image processing on the candidate image to determine whether the candidate is the face part to be tracked. That is, because a candidate in the face part search area may well not be the face part to be tracked, the face part determination means CL34 determines whether or not it is the face part to be tracked. This prevents erroneous tracking.
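The two-stage decision summarized above can be sketched as follows. This is an illustrative outline only: `determine_target`, `in_area` and the `verify` callback (standing in for the image processing on the candidate image) are hypothetical names, and the default half-sizes are taken from the example pixel ranges given earlier.

```python
def in_area(point, center, half_w, half_h):
    """True if the point lies within the rectangle around the center."""
    return (abs(point[0] - center[0]) <= half_w
            and abs(point[1] - center[1]) <= half_h)


def determine_target(candidates, center, verify,
                     search=(40, 25), priority=(12, 8)):
    """Return the candidate point judged to be the tracking target, or None."""
    for point in candidates:
        if not in_area(point, center, *search):
            continue          # outside the face part search area
        if in_area(point, center, *priority):
            return point      # accepted without image processing
        if verify(point):     # image processing on the candidate image
            return point
    return None
```

A candidate inside the priority area is accepted at once, which is where the speed gain comes from; only the remaining candidates pay for the image-based check.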
[0192]
As described above, according to the present invention, it is possible to improve accuracy and processing speed when determining a face part to be tracked.
[0193]
Further, the priority face area search area is set based on the movement amount of the face area to be tracked which moves during the sampling time when the subject is visually recognizing one direction. Therefore, a priority face part search area can be set for an area in which a face part to be tracked is likely to be located.
[0194]
Further, after the face part to be tracked is specified by the face part determination means CL34, a face part search area and a priority face part search area are set based on the specified position. For this reason, once the face part is detected by the face part detecting means CL2, face part detection over the entire image is performed less often, and rapid processing can be continued.
[0195]
Also, a face part search area is set with the position at the time when the face part to be tracked is determined in the past image frame as the center position. For this reason, for example, based on the position of the tracking target on the past image, the face part search area can be set at a location where the tracking target is likely to exist.
[0196]
Further, the face part search area is set around a position corrected based on the movement amount of the tracking target in the past image frames. Therefore, for example, when the position of the tracking target has moved largely in the X-axis direction and the Y-axis direction on the past images, the face part search area can be set, based on the past data, at a place where the tracking target is likely to exist at the time of the current processing.
[0197]
Further, when the face part to be tracked cannot be specified by the face part determination means CL34, the range of the face part search area is widened. Therefore, even if the tracking target is lost, the process can immediately return to the tracking processing.
[0198]
Further, when the face part to be tracked cannot be specified by the face part determining means CL34, the range of the priority face part search area is reduced or the priority face part search area is not set. For this reason, even if a tracking target candidate having a feature amount similar to the tracking target happens to be found in the priority face part search area, erroneous determination of that candidate as the tracking target can be prevented or reduced, and the process can suitably return to the tracking processing.
[0199]
In addition, since candidate points for specifying the positions of the candidates are determined, it is possible to eliminate a situation in which part of a candidate is inside the priority face part search area and part of it is outside, so the processing can be performed with high accuracy.
[0200]
In addition, one pixel is determined for each local increase in the density value in the vertical direction of the image and is set as an extraction point. When extraction points are adjacent to each other in the horizontal direction of the image, continuous data consisting of a group of extraction points extending in the horizontal direction is formed. The representative coordinate value of the formed continuous data is set as a candidate point of the face part. For this reason, it is possible to determine whether or not the obtained continuous data has the features of the face part to be tracked and, for example, to select only data having those features. Therefore, the accuracy of the device 1 can be improved.
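The grouping of extraction points into continuous data can be sketched as below, assuming the extraction points have already been found as `(x, y)` pixel coordinates. The adjacency rule (neighboring columns, at most `max_dy` pixels apart vertically), the minimum length and the use of the mean as the representative coordinate value are assumptions, since this passage does not specify them.

```python
def form_continuous_data(points, max_dy=1, min_length=3):
    """Group extraction points that are adjacent in the horizontal direction."""
    groups = []
    for x, y in sorted(points):
        for group in groups:
            last_x, last_y = group[-1]
            if x - last_x == 1 and abs(y - last_y) <= max_dy:
                group.append((x, y))  # extends an existing run
                break
        else:
            groups.append([(x, y)])   # starts a new run
    # Keep only runs long enough to be treated as continuous data.
    return [g for g in groups if len(g) >= min_length]


def representative_point(group):
    """Mean coordinate of a run, used here as its representative value."""
    n = len(group)
    return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)
```

Short or isolated runs are dropped, so only horizontally extended features such as an eye or eyebrow contribute candidate points.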
[0201]
Further, a small image including a candidate for the face part is extracted, and the face part to be tracked is determined based on the small image. That is, since only part of the image is extracted from within the face part search area or the priority face part search area and processed, the processing load can be reduced.
[0202]
In addition, since a small image is extracted and it is determined whether or not it is the face part to be tracked based on one of density, spatial frequency, and geometric shape, the processing load can be reduced and accurate determination can be made.
[0203]
If the face part of the tracking target cannot be identified by the face part determining means CL34, the tracking target is detected again by the face part detecting means CL2.
[0204]
The present embodiment is not limited to the above-described configuration, and can be modified without departing from the spirit of the present invention. For example, a configuration may be employed in which a plurality of face part determination units having different determination accuracy are provided in the face part determination unit CL34. That is, the processing speed of the means for making a determination or the like generally tends to increase as the determination accuracy decreases. By utilizing this, when determining whether or not the face part is a tracking target face part in the present embodiment, the determination processing may be performed in order from one having a low determination accuracy and a high processing speed. As a result, the processing speed can be increased, and a decrease in determination accuracy can be prevented.
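One way to read this modification is as a cascade in which cheap determiners run first and the more expensive ones run only if the candidate has not yet been rejected. The sketch below assumes that any single rejection ends the evaluation; the function name and the callable-based interface are hypothetical.

```python
def is_tracking_target(small_image, determiners):
    """Apply determiners in order, fastest and least accurate first.

    `determiners` is a list of callables that take the small image and
    return True when the image still looks like the tracking target.
    """
    for determine in determiners:
        if not determine(small_image):
            return False  # rejected early; slower determiners are skipped
    return True
```

With this ordering, most non-target candidates are discarded by the fast checks, and the accurate but slow checks decide only the remaining cases.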
[0205]
Further, the face part candidate extracting means CL33 in the present embodiment is not limited to the above configuration, and may have the following configuration, for example. That is, when the face part candidate extracting means CL33 cannot form the continuous data G, which is a group of extraction points extending in the horizontal direction of the image, it may set the candidate point from an image frame preceding the current image frame, from which candidates are being extracted, as the current candidate point. Alternatively, when the continuous data G cannot be formed, the face part candidate extracting means CL33 may determine the current candidate point based on the movement amount of the tracking target in an image frame preceding the current image frame.
[0206]
When the face part candidate extracting means CL33 is configured as described above, candidate points are determined without executing the candidate point extraction process again when the continuous data G cannot be formed, so that the calculation load can be reduced. Further, it is possible to prevent a situation in which a candidate point is determined based on inappropriate continuous data G because appropriate continuous data G extending in the horizontal direction of the image is not formed. Therefore, the tracking accuracy can be improved.
[Brief description of the drawings]
FIG. 1 is a functional block diagram showing a configuration of a face part tracking device according to an embodiment of the present invention.
FIG. 2 is a hardware configuration diagram showing the face part tracking device according to the embodiment of the present invention.
FIG. 3 is a main flowchart showing an outline of an operation of the face part tracking apparatus 1 according to the embodiment.
FIG. 4 is a flowchart showing a detailed operation of a tracking target detection process (ST17) shown in FIG. 3;
FIG. 5 is a flowchart showing details of a tracking target candidate position specifying process (ST20) shown in FIG. 4;
FIG. 6 is an explanatory diagram showing continuous data formed in the processing of step ST36 shown in FIG. 5, and representative coordinate values C and existence areas EA determined in the processing of step ST37.
FIG. 7 is an explanatory diagram showing the size of the existence area EA shown in FIG. 6;
FIG. 8 is an explanatory diagram showing statistical data of the length of the horizontal Xa obtained by examining the sizes of several eyes.
FIG. 9 is an explanatory diagram showing statistical data of the length of the vertical Ya obtained by examining the size of several eyes.
FIG. 10 is an explanatory diagram illustrating a method of determining a position of an existence area EA on an image.
FIG. 11 is a flowchart illustrating details of a tracking target determination process (ST21) illustrated in FIG. 4;
FIG. 12 is an explanatory diagram showing a minute image.
FIG. 13 is an explanatory diagram of a method of calculating a binarization threshold value in a range AR.
FIG. 14 is an explanatory diagram of a method of comparing geometric shapes between a candidate object and a template of the eye to be tracked, where (a) shows a case where the candidate object is imaged in an optimal state, (b) shows a state where the right side of the eye is missing, and (c) shows a state where the left side of the eye is missing.
FIG. 15 is a flowchart showing details of a tracking process (ST19) shown in FIG. 3;
FIG. 16 is an explanatory diagram of the face part search area setting process (ST60) shown in FIG. 15, where (a) shows an image captured at time t0, (b) shows an image captured at time t1, (c) shows an image captured at time t2, (d) shows an image captured at time t3, and (e) shows the left eye positions from these images plotted on a single image.
FIG. 17 is an explanatory diagram of the priority face part search area setting process (ST61) shown in FIG. 15, where (a) shows an image captured at time t10, (b) shows an image captured at time t11, (c) shows an image captured at time t12, (d) shows an image captured at time t13, and (e) shows the left eye positions from these images plotted on a single image.
FIG. 18 is an explanatory diagram showing the distribution of left eye positions when one direction is visually recognized and when the face direction is changed.
FIG. 19 is an explanatory diagram illustrating an analysis result of a movement amount of a left eye position obtained from the distribution illustrated in FIG. 18.
FIG. 20 is a flowchart illustrating a process executed when it is determined that a candidate for a face part is not in a priority face part search area.
FIG. 21 is a flowchart illustrating details of a tracking target determination process based on density (ST70) illustrated in FIG. 20;
FIG. 22 is a flowchart illustrating details of tracking target determination processing (ST72) using the frequency image illustrated in FIG. 20;
FIG. 23 is an explanatory diagram of the frequency image generation process (step ST101) shown in FIG. 22, where (a) shows a small image IG and (b) shows a frequency image.
FIG. 24 is a flowchart showing details of a tracking target determination process (ST74) based on the geometric shape of the candidate object shown in FIG. 20.
FIG. 25 is a flowchart showing details of a face part search area setting process (ST60).
FIG. 26 is a flowchart showing details of a priority face part search area setting process (ST61).
FIG. 27 is an explanatory diagram of a face part search area and a priority face part search area.
FIG. 28 is an explanatory diagram showing an example of a case where the size of the face part search area is variable.
FIG. 29 is an explanatory diagram showing another example in which the size of the face part search area is variable.
FIG. 30 is an explanatory diagram showing an example of setting a center position of a face part search area.
FIG. 31 is an explanatory diagram showing an example of an image including the position of the eye and the center position of the face part search area, where (a) shows the entire image and (b) shows an enlarged image.
FIG. 32 is an explanatory diagram showing another example of setting the center position of the face part search area.
FIG. 33 is an explanatory diagram showing another example of an image including the position of the eye and the center position of the face part search area, where (a) shows the entire image and (b) shows an enlarged image.
FIG. 34 is a diagram illustrating an example of an image when a detected person is viewing one direction.
FIG. 35 is a diagram illustrating an example of an image when a subject changes the direction of the face, where (a) shows an example of the entire image and (b) shows an example of an enlarged image.
FIG. 36 is a diagram illustrating an example of an image when a subject wears glasses.
FIG. 37 is a diagram illustrating an example of a plurality of minute images obtained when a subject wears eyeglasses.
FIG. 38 is a diagram showing an example when an area surrounding a face part search area is set as a continuous data extraction area.
FIG. 39 is a diagram showing an example of continuous data extracted when a continuous data extraction area is set.
[Explanation of symbols]
1 ... Face part tracking device
CL1 ... Face image pickup means
CL2 ... Face part detecting means
CL31 ... Face part search area setting means
CL32 ... Priority face part search area setting means
CL33 ... Face part candidate extracting means (candidate extraction means)
CL34 ... Face part determining means (first face part determination means, second face part determination means)
G ... Continuous data
IG ... Small image
IFG, BIFG ... Frequency image

Claims

A face part tracking device that tracks the movement of a face part based on an image obtained by capturing the face of a subject, comprising:
face part detection means that detects a face part to be tracked from the entire input captured image;
face part search area setting means that, for an image input after the detection, sets a face part search area narrower than the entire image based on the position on the image of the tracking target detected by the face part detection means;
priority face part search area setting means that sets a priority face part search area within the face part search area set by the face part search area setting means;
candidate extraction means that extracts a candidate for the face part to be tracked from within the face part search area;
first face part determination means that, when the candidate extracted by the candidate extraction means is within the priority face part search area, determines that the candidate is the tracking target; and
second face part determination means that, when the extracted candidate is not in the priority face part search area but is in the face part search area, performs image processing on an image of the candidate to determine whether or not the candidate is the tracking target,
wherein the face part search area setting means sets the face part search area based on a movement amount of the tracking target that moves during a sampling time when the subject changes the face direction.

The face part tracking device according to claim 1, wherein the priority face part search area setting means sets the priority face part search area based on a movement amount of the tracking target that moves during a sampling time when the subject is looking in one direction.

The face part tracking device according to claim 1, wherein the face part search area setting means, after the tracking target has been specified by the determination result of the first or second face part determination means, sets the face part search area based on the position on the image of the specified tracking target.

The face part tracking device according to claim 3, wherein the face part search area setting means sets the face part search area with, as its center position, the position at which the tracking target was specified in an image frame preceding the current image frame.

The face part tracking device according to claim 4, wherein the face part search area setting means sets the face part search area after correcting the center position by the amount the tracking target moved on the image in image frames preceding the current image frame.
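
As a rough illustration of the center correction described above, the following sketch assumes that the movement amount is the displacement between the two most recent frames in which the target was specified. The function names and the rectangle representation are hypothetical.

```python
from typing import Tuple

Point = Tuple[int, int]


def corrected_center(prev: Point, prev2: Point) -> Point:
    """Shift the last known position by the movement seen between earlier frames.

    prev  : position where the target was specified in the previous frame
    prev2 : position where it was specified in the frame before that
    """
    dx = prev[0] - prev2[0]
    dy = prev[1] - prev2[1]
    return (prev[0] + dx, prev[1] + dy)


def search_area(center: Point, half_w: int, half_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a search area around the center."""
    cx, cy = center
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


center = corrected_center(prev=(110, 95), prev2=(104, 97))
print(center)                       # (116, 93)
print(search_area(center, 30, 20))  # (86, 73, 146, 113)
```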

The face part tracking device according to any one of the claims up to claim 5, wherein the face part search area setting means widens the face part search area when the tracking target is not specified by the determination result of the first or second face part determining means.

The face part tracking device according to any one of claims 1 to 6, wherein the priority face part search area setting means narrows the priority face part search area, or does not set it, when the tracking target is not specified by the determination result of the first or second face part determining means.
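
The area adjustments recited in the two claims above might be combined as in the following sketch. The step size, the upper limit, and the reset to base sizes after a successful frame are assumptions made for illustration; the claims only state that the search area is widened and that the priority area is narrowed or not set.

```python
from typing import Optional, Tuple

Size = Tuple[int, int]  # (half width, half height) in pixels


def areas_for_next_frame(target_found: bool, search: Size,
                         base_search: Size, base_priority: Size,
                         widen_by: int = 10,
                         max_search: Size = (120, 80)) -> Tuple[Size, Optional[Size]]:
    """Return (search size, priority size) to use for the next frame.

    A priority size of None means the priority area is not set.
    """
    if target_found:
        # Assumed behavior: fall back to the base sizes once tracking succeeds.
        return base_search, base_priority
    widened = (min(search[0] + widen_by, max_search[0]),
               min(search[1] + widen_by, max_search[1]))
    return widened, None


print(areas_for_next_frame(False, (30, 20), (30, 20), (8, 5)))  # ((40, 30), None)
```

With no priority area set, every candidate found in the widened search area goes through the image-processing check, which trades speed for a lower risk of locking onto the wrong part.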

The face part tracking device according to any one of claims 1 to 7, wherein:
the candidate extracting means, when extracting a tracking target candidate, determines a candidate point that specifies the position of the candidate;
the first face part determining means determines that the candidate having the candidate point is the tracking target when the candidate point determined by the candidate extracting means is within the priority face part search area; and
the second face part determining means, when the candidate point determined by the candidate extracting means is not within the priority face part search area but is within the face part search area, determines whether or not the candidate is the tracking target by performing image processing on an image that contains the candidate having the candidate point.

The face part tracking device according to claim 1, wherein the candidate extracting means:
detects the density values of pixels along vertical pixel columns of the captured image;
determines one pixel for each local rise in the detected density values and uses that pixel as an extraction point;
forms continuous data from the extended group of extraction points; and
sets a representative coordinate value of the formed continuous data as the candidate point of a tracking target candidate.
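
The extraction steps above can be sketched as follows. This is a simplified illustration under several assumptions: a local rise is treated as a strict local maximum above a threshold, extraction points are linked when they lie in adjacent columns within one row of each other, and the representative coordinate is the mean of the linked points. All function names and parameters are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y)


def extraction_points(image: List[List[int]], threshold: int) -> List[Point]:
    """Scan each vertical pixel column and keep one pixel per local density peak."""
    height, width = len(image), len(image[0])
    points = []
    for x in range(width):
        for y in range(1, height - 1):
            v = image[y][x]
            if v >= threshold and image[y - 1][x] < v > image[y + 1][x]:
                points.append((x, y))
    return points


def continuous_data(points: List[Point], max_dy: int = 1,
                    min_len: int = 3) -> List[List[Point]]:
    """Link extraction points in adjacent columns into runs."""
    runs: List[List[Point]] = []
    for x, y in sorted(points):
        for run in runs:
            last_x, last_y = run[-1]
            if last_x == x - 1 and abs(last_y - y) <= max_dy:
                run.append((x, y))
                break
        else:
            runs.append([(x, y)])  # no run to extend, so start a new one
    return [run for run in runs if len(run) >= min_len]


def candidate_point(run: List[Point]) -> Tuple[float, float]:
    """Representative coordinate of a run (here, the mean of its points)."""
    n = len(run)
    return (sum(x for x, _ in run) / n, sum(y for _, y in run) / n)


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 9, 9, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
runs = continuous_data(extraction_points(image, threshold=5))
print([candidate_point(r) for r in runs])  # [(2.5, 2.0)]
```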

The face part tracking device according to any one of claims 1 to 9, wherein the second face part determining means:
extracts a micro image that contains the tracking target candidate extracted by the candidate extracting means; and
when the extracted candidate is not within the priority face part search area but is within the face part search area, performs image processing on the extracted micro image to determine whether or not the extracted candidate is the tracking target.

The face part tracking device according to claim 10, wherein the second face part determining means:
calculates the similarity between a first micro image, which contains the tracking target candidate extracted from the current image frame, and a second micro image, which contains the tracking target specified in an image frame preceding the current image frame, from the density data of both micro images; and
determines whether or not the extracted candidate is the tracking target based on the calculated similarity parameter.
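
One way to picture the similarity calculation above is the sketch below. The claim does not name a particular measure, so zero-mean normalized cross-correlation is used here purely as an assumption, as are the threshold value and the function names.

```python
import math
from typing import List

Image = List[List[float]]


def similarity(a: Image, b: Image) -> float:
    """Zero-mean normalized cross-correlation of two equally sized micro images."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    if len(fa) != len(fb):
        raise ValueError("micro images must have the same size")
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa) *
                    sum((y - mean_b) ** 2 for y in fb))
    return num / den if den else 0.0  # a flat image carries no pattern to compare


def is_same_part(candidate_img: Image, reference_img: Image,
                 threshold: float = 0.8) -> bool:
    """Accept the candidate when its similarity to the stored image is high enough."""
    return similarity(candidate_img, reference_img) >= threshold


reference = [[10, 200], [200, 10]]   # micro image from an earlier frame
inverted = [[200, 10], [10, 200]]
print(is_same_part(reference, reference))  # True
print(is_same_part(inverted, reference))   # False
```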

The face part tracking device according to claim 10, wherein the second face part determining means:
applies frequency processing to a first micro image, which contains the tracking target candidate extracted from the current image frame, and to a second micro image, which contains the tracking target specified in an image frame preceding the current image frame, to obtain first and second frequency images;
calculates the similarity between the two frequency images from the density data of the first and second frequency images; and
determines whether or not the extracted candidate is the tracking target based on the calculated similarity parameter.

The face part tracking device according to claim 10, wherein the second face part determining means:
extracts a micro image containing the tracking target candidate from the current image frame;
obtains the geometric shape of the tracking target candidate from this micro image; and
determines whether or not the candidate is the tracking target based on the degree of matching between the obtained geometric shape of the candidate and a previously stored geometric shape.

The face part tracking device according to any one of claims 1 to 13, wherein the second face part determining means:
has a plurality of face part determination units, each with a different determination accuracy; and
performs processing with these units in order, starting from the unit with the lowest determination accuracy, which has the highest processing speed.

The face part tracking device according to any one of claims 1 to 14, wherein, when the tracking target is ultimately not specified by the determination result of the first or second face part determining means, the face part detecting means detects the face part to be tracked again.

A face part tracking device that tracks the movement of a face part based on an image obtained by capturing and inputting the face of a subject, characterized in that:
the face part to be tracked is detected from the entire input image;
for an image input after the detection, a face part search area narrower than the entire image is set based on the position of the detected tracking target on the image, and a priority face part search area is set within the face part search area; and
candidates for the face part to be tracked are extracted from within the face part search area, a candidate is determined to be the tracking target when it is within the priority face part search area, and, when an extracted candidate is not within the priority face part search area but is within the face part search area, image processing is performed on the candidate to determine whether or not it is the tracking target.