JP2014063280A

JP2014063280A - Object tracking method and device and program

Info

Publication number: JP2014063280A
Application number: JP2012207087A
Authority: JP
Inventors: Gakuhin Ko; 学斌胡; Makoto Yonaha; 誠與那覇
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2012-09-20
Filing date: 2012-09-20
Publication date: 2014-04-10

Abstract

PROBLEM TO BE SOLVED: To provide an object tracking method, device, and program capable of improving tracking accuracy. SOLUTION: In an object tracking method that detects a specific object in captured images acquired in time series and tracks its position, a first template matching process using an initial model is performed on each captured image input after the initial model is generated. When the first score, which indicates similarity, is equal to or higher than a predetermined threshold, the position found by the first matching process is adopted as the tracking position. When the first score is below the threshold, a second template matching process is performed using the latest update model, which is generated from the tracking position in the captured image of the preceding frame, and the position found by the second matching process is adopted as the tracking position.

Description

The present invention relates to an image processing technique for detecting a specific target (object) in a captured image and tracking its position, and more particularly to an object tracking method, device, and program that use template matching.

Image processing techniques that track an object using template matching are known (Patent Documents 1 to 3). A tracking method based on template matching compares a single reference template against an unknown image and searches for the location with the highest similarity to the reference template (also called "degree of match", "degree of correlation", or "matching score"). If the similarity exceeds a certain determination condition (a predetermined threshold), that location is judged to be the position of the object.

Appearance information at the start of tracking (the initial model) may be used as the reference template. Alternatively, the latest model based on the most recent tracking result (the update model) may be used as the reference template.

Systems have been proposed that use such template-matching-based tracking to monitor the driver of a vehicle such as an automobile and detect driver states such as dozing (Patent Documents 4 to 6).

Patent Document 1: JP 2010-282537 A
Patent Document 2: JP 2001-60265 A
Patent Document 3: JP 2007-304852 A
Patent Document 4: JP 2010-97379 A
Patent Document 5: JP 2005-327072 A
Patent Document 6: JP 2001-101386 A

However, when a person's eyes are the tracking target, for example, the orientation of the face changes frequently. In addition, because the face moves toward and away from the camera, the size of the facial parts in the captured image changes. For this reason, a matching method that uses only the initial model (also called the "initial template") generated from the appearance information at the start of tracking cannot adequately handle changes in face orientation or in distance.

On the other hand, when the update model (also called the "update template") generated from the most recent tracking result is used as the reference template, changes in face angle and distance can be handled, but there is a theoretical drawback: positional deviations (errors) accumulate during tracking, so the position error gradually grows.

The present invention has been made in view of these circumstances, and its object is to solve the above problems and provide an object tracking method, device, and program capable of improving tracking accuracy.

To achieve this object, the following invention is provided.

(First aspect): An object tracking method for detecting a specific object in captured images acquired in time series and tracking its position, the method comprising: an initial model generation step of identifying the image portion of the object in an input captured image and generating an initial model representing the image features of the object; an object search step of performing template matching using at least the initial model on captured images input after the initial model is generated, and searching each captured image for the position of the object; and an update model generation step of generating an update model from the image portion of the object at the tracking position identified by the object search step, and storing the latest update model. The object search step includes: a first matching processing step of performing a first template matching process using the initial model and calculating a first score indicating similarity to the initial model; a comparison step of comparing the first score obtained by the first matching processing step with a first threshold; a second matching processing step of performing a second template matching process using the stored latest update model and calculating a second score indicating similarity to the update model; and a tracking position determination step of adopting, as the tracking position, the matching position found by the first matching processing step when the comparison shows that the first score indicates similarity equal to or higher than the first threshold, and adopting the matching position found by the second matching processing step when the first score indicates similarity lower than the first threshold.

According to this aspect, the object search combines the first matching process, which uses the initial model, with the second matching process, which uses the latest update model as it is updated over time. The result of the first matching process is given priority in principle; the result of the second matching process is adopted only when the initial model fails to find a position whose score is equal to or higher than the first threshold.

This configuration eliminates the problem of accumulating tracking position errors while still handling changes in distance and in angle (rotation).

(Second aspect): In the object tracking method according to the first aspect, after the initial model is generated, the first matching processing step may be performed first on each sequentially input captured image, and the second matching processing step may be omitted when the comparison step shows that the first score indicates similarity equal to or higher than the first threshold.

That is, the second matching processing step may be performed only when the first score indicates similarity lower than the first threshold.
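The decision rule of the first and second aspects can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `decide_tracking_position` and the two matcher callables are hypothetical, and each matcher is assumed to return a best position together with a similarity score where larger means more similar.

```python
from typing import Callable, Tuple

Position = Tuple[int, int]
MatchResult = Tuple[Position, float]  # (best position, similarity score)


def decide_tracking_position(
    match_initial: Callable[[], MatchResult],
    match_updated: Callable[[], MatchResult],
    first_threshold: float,
) -> Tuple[Position, float, str]:
    """Return (tracking position, score, name of the model that was used)."""
    # First matching process: always run, using the initial model.
    pos1, score1 = match_initial()
    if score1 >= first_threshold:
        # The initial model matched well enough, so its position is adopted
        # and the second matching process is skipped (second aspect).
        return pos1, score1, "initial"
    # Otherwise fall back to the latest update model.
    pos2, score2 = match_updated()
    return pos2, score2, "updated"
```

Passing the second matcher as a callable means it is only evaluated when the first score falls below the threshold, which mirrors the omission described in the second aspect.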

(Third aspect): The object tracking method according to the first or second aspect may include a tracking stop determination step of determining, based on the tracking result of the object search step, whether to stop or continue tracking.

(Fourth aspect): In the object tracking method according to the third aspect, the tracking stop determination step may decide to stop tracking based on at least one of the following: the tracking position found by the object search step, the first score, the second score, and the size of the object.

(Fifth aspect): In the object tracking method according to the third or fourth aspect, the initial model may be regenerated when tracking has been stopped based on the determination in the tracking stop determination step.

(Sixth aspect): The object tracking method according to any one of the first to fifth aspects may include a step of storing, as a history, tracking result information for captured images of a plurality of frames that are consecutive in time series.

(Seventh aspect): The object tracking method according to any one of the first to sixth aspects may include an angle estimation step of calculating, from the tracking position determined by the tracking position determination step, a horizontal angle indicating the rotation angle of the object in the horizontal direction, and an initial model update step of changing the initial model according to the angle calculated in the angle estimation step.

(Eighth aspect): The object tracking method according to the seventh aspect may include a step of creating in advance, from the initial model generated in the initial model generation step, a plurality of angle-specific models corresponding to a plurality of rotation angles, selecting one of these angle-specific models based on the angle calculated in the angle estimation step, and storing the selected angle-specific model as the new initial model in place of the previous one.
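The selection in the eighth aspect can be illustrated with a small sketch. The text does not state how one angle-specific model is chosen from the estimated angle; picking the model whose nominal angle is closest to the estimate is an assumption made here, and the name `select_angle_model` and the dictionary layout are hypothetical.

```python
from typing import Dict, TypeVar

Model = TypeVar("Model")


def select_angle_model(angle_models: Dict[float, Model],
                       estimated_angle_deg: float) -> Model:
    """Pick the pre-built model whose nominal angle is nearest the estimate.

    angle_models maps a rotation angle in degrees to the model prepared
    for that angle. Nearest-angle selection is an assumed rule.
    """
    if not angle_models:
        raise ValueError("at least one angle-specific model is required")
    nearest = min(angle_models, key=lambda a: abs(a - estimated_angle_deg))
    return angle_models[nearest]
```

The returned model would then be stored as the new initial model and used by the first matching process on subsequent frames.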

(Ninth aspect): In the object tracking method according to any one of the first to eighth aspects, the initial model generation step generates the initial model by detecting the image portion of the object in the captured image using an object discriminator built by machine learning from a large number of images.

(Tenth aspect): An object tracking device for detecting a specific object in captured images acquired in time series and tracking its position, the device comprising: an initial model generation unit that identifies the image portion of the object in an input captured image and generates an initial model representing the image features of the object; an object search unit that performs template matching using at least the initial model on captured images input after the initial model is generated, and searches each captured image for the position of the object; and an update model generation unit that generates an update model from the image portion of the object at the tracking position identified by the object search unit, and stores the latest update model. The object search unit includes: a first matching processing unit that performs a first template matching process using the initial model and calculates a first score indicating similarity to the initial model; a comparison unit that compares the first score obtained by the first matching processing unit with a first threshold; a second matching processing unit that performs a second template matching process using the stored latest update model and calculates a second score indicating similarity to the update model; and a tracking position determination unit that adopts, as the tracking position, the matching position found by the first matching processing unit when the comparison shows that the first score indicates similarity equal to or higher than the first threshold, and adopts the matching position found by the second matching processing unit when the first score indicates similarity lower than the first threshold.

In the object tracking device of the tenth aspect, configurations corresponding to the matters described in the second to ninth aspects can be combined as appropriate. In that case, each element described as a "step" is specified as the corresponding means, or as a functional unit that performs that function.

(Eleventh aspect): A program is provided for causing a computer to execute each step of the object tracking method according to any one of the first to ninth aspects.

According to the present invention, the problem of cumulative growth in tracking position error seen in conventional tracking methods is eliminated, and tracking accuracy is improved. In addition, the position of the object can be identified and tracking can continue even when changes in distance or in angle (rotation) occur.

FIG. 1: Block diagram showing the overall configuration of a monitoring video system according to an embodiment of the present invention
FIG. 2: Flowchart showing the overall flow of the dozing detection process
FIG. 3: Flowchart showing a first example of the flow of the lock-on determination process
FIG. 4: Flowchart showing a second example of the specific flow of the lock-on determination process
FIG. 5: Flowchart showing the flow of the eye tracking process according to the embodiment
FIG. 6: Flowchart showing an example of the determination process for stopping tracking and returning from a stop
FIG. 7: State transition diagram showing the transitions among the three states that can occur during tracking
FIG. 8: Flowchart showing the flow of the open/closed determination process
FIG. 9: Graph showing the relationship between face angle and detection score
FIG. 10: Flowchart showing the flow of the face angle estimation process and the looking-aside determination process
FIG. 11: Schematic diagram for explaining how the face angle (horizontal angle) is calculated
FIG. 12: Flowchart showing how the initial model is updated
FIG. 13: Conceptual diagram of the process that generates angle templates corresponding to face orientation angles
FIG. 14: Block diagram showing a configuration example of the image processing unit

Embodiments for carrying out the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the overall configuration of a monitoring video system according to an embodiment of the present invention. The monitoring video system 10 includes a camera 12 serving as an imaging device, an illumination device 14 that emits illumination light, and a control device 16. The detailed configuration of the camera 12 is not shown, but like a general electronic imaging device it includes a photoelectric conversion element (imaging device), such as a CCD (charge-coupled device) image sensor or a CMOS (Complementary Metal-Oxide Semiconductor) image sensor, together with the required electronic circuits. It converts the optical image of the subject captured through the lens unit 18 into an electrical signal and outputs an imaging signal corresponding to the optical image.

An infrared camera with sensitivity in the infrared wavelength range can be used as the camera 12. Instead of an infrared camera, a camera capable of color imaging, equipped with a color imaging device that has RGB color separation filters, or a monochrome camera that acquires black-and-white images may be used.

In addition to the drive circuit for the image sensor, the camera 12 contains signal processing circuits such as a gain control circuit, a sampling circuit, and an A/D converter. The analog imaging signal obtained from the image sensor is converted into digital imaging data and output from the camera 12. The imaging data output from the camera 12 is sent to the control device 16 by wired or wireless signal transmission means.

The control device 16 includes an image processing unit 20 that processes the captured image data obtained from the camera 12, a camera control unit 22 that controls the camera 12, and an illumination light control unit 24 that controls the illumination device 14.

The image processing unit 20 functions as an object tracking device that detects a specific object in an input captured image and tracks its position. Details are described later.

The camera control unit 22 controls starting (ON) and stopping (OFF) of the camera 12 and also sets the imaging cycle (frame rate) and similar parameters. The camera control unit 22 can also perform automatic exposure control (AE) and automatic focus control (AF) of the camera 12 in cooperation with the image processing unit 20.

The illumination light control unit 24 controls turning the illumination device 14 on (ON) and off (OFF) and also controls the amount of light emitted (illuminance adjustment, emission time, emission frequency, and so on). The illumination device 14 can be configured with a plurality of light emitting units so that illumination light can be directed at the driver, who is the subject, from a plurality of directions. For example, the light emitting units can be arranged symmetrically to the left and right of the camera 12. The illumination light control unit 24 may control the light emitting units simultaneously (for example, all together with the same command signal) or may control each light emitting unit individually (independently).

The illumination light control unit 24 cooperates with the camera control unit 22 to control the light output of the illumination device 14 so that an image with the required brightness is obtained. The functions of the camera control unit 22 and the illumination light control unit 24 can also be integrated into a single system control unit.

The light source of the illumination device 14 is not particularly limited; any suitable means such as a lamp, a light-emitting diode (LED), or a semiconductor laser can be used. LEDs are preferable in terms of light control responsiveness, power saving, and similar factors. The illumination device 14 of this example uses an LED group in which a plurality of LEDs are arranged two-dimensionally to cover the required illumination range.

When the camera 12 of the imaging unit is an infrared camera, an illumination device 14 that emits infrared light is used. In practicing the invention, however, the spectrum of the illumination light is not particularly limited. Instead of infrared light, it may be white light, light with a specific wavelength distribution, or a suitable combination of these. A light source (spectrum) appropriate to the camera 12 and the image processing unit 20 is selected.

The monitoring video system 10 of this example can be mounted in a vehicle such as an automobile and used as a driver monitoring system that photographs the driver's face with the camera 12 and detects the driver's state (dozing, looking aside, and so on).

The following description uses as an example a monitoring video system that tracks the position of the eyes in the driver's face image captured by the camera 12 and detects states such as dozing. In this case, the camera 12 is installed in the vehicle with its position and angle of view adjusted so that the driver's face falls within the field of view through the lens unit 18 and can be photographed from the front during normal driving. The illumination device 14 is likewise installed at a suitable location so that it can illuminate the driver's face (in particular, the region containing both eyes) with the required illuminance.

The camera 12 captures images continuously at a fixed frame rate (for example, 60 frames per second), and captured image data is acquired from the camera 12 in time series. The image processing unit 20 processes each frame of the captured images in the order acquired. In other words, the image processing unit 20 can process video in real time.

<Overall processing flow>
FIG. 2 is a flowchart showing the overall flow of the dozing detection process in the image processing unit 20. FIG. 2 shows the processing flow for one input frame. When one frame is input to the image processing unit 20 (step S12), it is first determined whether the input image is an initial frame (step S14). If the input image is an initial frame (for example, the first frame after capture of images begins) (Yes in step S14), the process proceeds to step S16, where the lock-on flag is set to OFF.

The lock-on flag is a state identifier indicating whether both eyes of the driver, who is the tracking target, are currently detected. The lock-on flag is "ON" when both eyes are detected and "OFF" when they are not. When the initial frame is input, the driver's eyes have not yet been detected (they are not being tracked), so the lock-on flag is set to "OFF".

Next, the lock-on determination process is performed (step S18). The lock-on determination process detects the positions of both of the driver's eyes in the input image so that the eyes can be monitored. Specifically, the left and right eyes are detected by using two discriminators together, both obtained by machine learning with the AdaBoost algorithm: one that detects a person's open pair of eyes (the "binocular discriminator") and one that detects a single open eye (the "monocular discriminator"). Details of the lock-on determination process are described later with reference to FIGS. 3 and 4. The binocular discriminator and the monocular discriminator each correspond to the "object discriminator".

When both of the driver's eyes are detected by the lock-on determination process (step S18 in FIG. 2), the lock-on flag is set to "ON".

In step S20, it is checked whether the lock-on flag is ON. If the lock-on flag is ON (Yes in step S20), the tracking flag is set to ON (step S22). The tracking flag is a state identifier indicating whether the driver's eyes, the tracking target, are currently being tracked. Once the tracking flag has been set to ON in step S22, processing of that frame ends.

If, on the other hand, both eyes could not be detected in the lock-on determination of step S18 (lock-on failed), the lock-on flag is found to be OFF in step S20 (No in step S20), and processing of that frame ends without further action.

If the input image is not an initial frame, the determination in step S14 is "No" and the process proceeds to step S30, where it is determined whether the tracking flag is ON. If the tracking flag is OFF (No in step S30), the process proceeds to step S16.

If the tracking flag is ON in the determination of step S30 (Yes in step S30), the process proceeds to the eye tracking process of step S32. The eye tracking process (step S32) tracks the left and right eyes individually in order to monitor the open/closed state of the eyes detected in the lock-on determination (step S18). Specifically, the left and right eyes detected in the lock-on determination (step S18) are used as models (initial models), and pattern matching is performed around the position where each eye was detected to find the position to which the eye has moved. If the matching score obtained with the initial model (hereinafter sometimes called the similarity) falls below a predetermined reference value (the first threshold), pattern matching is performed with the latest update model generated from the most recent frame (also called the "latest model" or "most recent model") to detect the eye position.
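The local search described here can be sketched as follows. The passage does not name the similarity measure, so zero-mean normalized cross-correlation is used purely as an assumed example; the function names `ncc` and `search_near`, the list-of-lists image representation, and the `radius` parameter are likewise illustrative.

```python
import math
from typing import Sequence, Tuple

Image = Sequence[Sequence[float]]


def ncc(image: Image, template: Image, top: int, left: int) -> float:
    """Zero-mean normalized cross-correlation between the template and the
    image patch whose top-left corner is (top, left). Range: -1.0 to 1.0."""
    h, w = len(template), len(template[0])
    patch = [image[top + r][left + c] for r in range(h) for c in range(w)]
    tmpl = [template[r][c] for r in range(h) for c in range(w)]
    mean_p = sum(patch) / len(patch)
    mean_t = sum(tmpl) / len(tmpl)
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(patch, tmpl))
    den = math.sqrt(sum((p - mean_p) ** 2 for p in patch)
                    * sum((t - mean_t) ** 2 for t in tmpl))
    return num / den if den > 0 else 0.0  # flat patch or template: no match


def search_near(image: Image, template: Image, prev_top: int, prev_left: int,
                radius: int) -> Tuple[Tuple[int, int], float]:
    """Search only within `radius` pixels of the previous position and
    return ((top, left), score) for the best-scoring placement."""
    h, w = len(template), len(template[0])
    rows, cols = len(image), len(image[0])
    best_pos, best_score = (prev_top, prev_left), float("-inf")
    for top in range(max(0, prev_top - radius),
                     min(rows - h, prev_top + radius) + 1):
        for left in range(max(0, prev_left - radius),
                          min(cols - w, prev_left + radius) + 1):
            score = ncc(image, template, top, left)
            if score > best_score:
                best_pos, best_score = (top, left), score
    return best_pos, best_score
```

The same `search_near` call could be run once with the initial model and, only if its score is below the first threshold, again with the latest update model. A production system would normally use an optimized image library rather than nested Python loops.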

If the eye position can no longer be confirmed during tracking, for example because the eye is hidden by an obstruction, tracking is temporarily stopped and is resumed once the eye can be confirmed again. Details of the eye tracking process are described later with reference to FIG. 5. The processing for stopping and resuming tracking is described with reference to FIGS. 6 and 7.

Following the eye tracking process (step S32 in FIG. 2), it is determined whether the eye being tracked is open or closed (step S34). Specifically, the open/closed state is determined by applying, around the tracking position, the same discriminator as the monocular discriminator used for eye detection in the lock-on determination (step S18), that is, the discriminator obtained by machine learning that detects a single open eye (the open-eye detector). Details are described with reference to FIG. 8.

The open/closed determination process (step S34 in FIG. 2) yields open/closed determination result information (D36). After this process (step S34), the process proceeds to step S40, where it is determined whether eye tracking has been stopped. If tracking of both eyes is stopped in the eye tracking process (step S32), the tracking flag is turned OFF (step S238 in FIG. 5, described later). In this case, step S40 in FIG. 2 yields Yes, and the lock-on flag is turned OFF (step S42).

Next, it is determined whether re-lock-on processing is necessary (step S44). If the lock-on flag is OFF, the re-lock-on process is performed. Note that when tracking of only one eye has been stopped while the other eye is still being tracked (that is, while waiting for the stopped eye to return), re-lock-on is judged unnecessary.

If step S44 determines that re-lock-on is necessary (Yes in step S44), the re-lock-on process is performed (step S46). The re-lock-on process (step S46) uses the same algorithm as the lock-on determination described for step S18.

When the re-lock-on process (step S46) finishes, processing of the current frame ends. Likewise, if step S44 determines that re-lock-on is unnecessary, processing of the current frame ends.

The process of FIG. 2 is repeated for each frame image acquired continuously (in time series) at a predetermined frame interval from the camera 12 described with reference to FIG. 1.

<Example 1 of the lock-on determination processing flow>
FIG. 3 is a flowchart showing a first example of the flow of the lock-on determination process. When a frame image is input (step S102), binocular detection is performed by the binocular discriminator (step S104). This process applies the binocular discriminator to the input image and detects binocular candidates. The binocular discriminator is a detector that detects both eyes at once, and it identifies the approximate position of both eyes in the input image.

Next, it is checked whether both eyes were detected (step S106). If both eyes were detected in step S104, step S106 yields Yes; otherwise it yields No, and the process ends.

If step S106 yields Yes, the process proceeds to step S108, where the monocular detection candidate positions are set. This setting process (step S108) sets, for each binocular candidate detected by the binocular discriminator, the positions at which the monocular discriminator is to be applied. In the present embodiment, the distance between the centers of the two eyes was normalized to 32 pixels in the training design of the binocular discriminator. Here, "pixel" refers to a pixel of the digital image data obtained from the camera 12, and 32 pixels denotes a distance of 32 pixels in one frame of the captured image.

Accordingly, the centers of the left and right eyes are assumed to lie 16 pixels to either side of the center of the detected binocular candidate, and the monocular discriminator is applied within a predetermined (relatively small) range in the horizontal and vertical directions of the image around each of these center positions. In the monocular detection process (step S110), the monocular discriminator is applied to the range set in step S108.

As a result of step S110, it is determined whether two single eyes were detected (step S112). If two single eyes were detected, that is, if the left and right eyes were each detected correctly, these two are set as the tracking targets and locked on (step S114). Once the left and right eyes have been identified as tracking targets, the lock-on flag is turned ON.

The locked-on eyes are then set as the initial model (initial template) (step S116, corresponding to the "initial model generation step"). Eye tracking is performed using this initial model (see FIG. 5).

If step S106 or step S112 in FIG. 3 yields No, processing for that frame ends.
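The flow of FIG. 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: `detect_both_eyes` and `detect_single_eye` are hypothetical callables standing in for the trained binocular and monocular discriminators, the binocular candidate is assumed to be reported at the normalized scale (eye centers 32 pixels apart), and "left"/"right" simply mean the image-left and image-right candidates.

```python
def lock_on(frame, detect_both_eyes, detect_single_eye, half_distance=16):
    """Return {"left": (x, y), "right": (x, y)} on lock-on, else None.

    detect_both_eyes(frame) -> (x, y) or None           (hypothetical)
    detect_single_eye(frame, (x, y)) -> (x, y) or None  (hypothetical)
    """
    center = detect_both_eyes(frame)              # S104: binocular detection
    if center is None:                            # S106: No -> end this frame
        return None
    cx, cy = center
    # S108: expected eye centers, 16 px to either side of the binocular center
    expected = {"left": (cx - half_distance, cy),
                "right": (cx + half_distance, cy)}
    eyes = {}
    for side, position in expected.items():
        found = detect_single_eye(frame, position)  # S110: local monocular search
        if found is not None:
            eyes[side] = found
    if len(eyes) != 2:                            # S112: No -> end this frame
        return None
    return eyes                                   # S114/S116: lock on; basis of the initial model
```

A caller would cut out the image patches at the two returned positions and store them as the initial model.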

In addition, during the tracking of both eyes described later, changes in face size are detected automatically from the distance between the centers of the left and right eyes, and a local lock-on evaluation is performed to determine whether the same initial model can continue to be used. If the face size has changed so much that the original initial model no longer applies (that is, when the matching correlation falls below a specified value), the lock-on process is performed again and an appropriate initial model is set anew.

<Example 2 of the lock-on determination processing flow>
FIG. 4 is a flowchart showing a second example of the specific flow of the lock-on determination process. The flowchart of FIG. 4 can be used in place of that of FIG. 3. In FIG. 4, after a frame image is input (step S102), monocular detection is performed on the input image by the monocular discriminator (step S124): the monocular discriminator searches the input image and detects monocular candidates. It is then determined whether the monocular detection process (step S124) found two or more monocular candidates (step S126). If not, step S126 yields No, this process ends, and processing moves to the next frame image.

If the monocular detection process (step S124) found two or more monocular candidates, step S126 yields Yes and the process proceeds to step S128.

In step S128, symmetry is evaluated for the two or more monocular candidates. Using the left-right symmetry of the eyes, it is determined whether two monocular candidates form a pair, that is, "both eyes" (step S130). If no left-right symmetric pair of monocular candidates can be found (No in step S130), this process ends and processing moves to the next frame image.

If a pair of eyes is found (Yes in step S130), the pair is set as the tracking target and locked on (step S134). Once the left and right eyes are locked on (identified as tracking targets), the lock-on flag is turned ON and the pair is registered as the initial model (step S136, corresponding to the "initial model generation step"). Tracking is then performed using the initial model set in this way.

<Eye tracking process>
FIG. 5 is a flowchart showing the flow of the eye tracking process according to the present embodiment. When a frame image is input (step S202), it is determined whether the eyes are currently being tracked (step S204). The tracking state can be identified from the tracking flag (lock-on flag). If tracking is not in progress (No in step S204), the lock-on determination (step S206) is performed and an initial model is generated (step S208, corresponding to the "initial model generation step"). That is, immediately after the lock-on determination (step S206), the eye image portion detected by the lock-on determination is cut out as a template and stored in memory as the initial model representing the image features of the eye (step S208). Steps S206 to S208 use the same processing as illustrated in FIG. 3 or FIG. 4.

The initial model, also called the "initial template", is used as the first reference template when tracking by template matching begins. It is obtained by the lock-on process.

On the other hand, if tracking was already being performed in a preceding frame (for example, the immediately preceding frame), the eye image portion tracked in that frame has been cut out as a model and is held as the update model (latest model) (see step S224, described later). In this example, the update model is updated every frame, so the latest (most recent) model is always held during tracking. The update model is also called the "latest model", the "most recent model", or the "update template".

In other words, the update model is a new model (template) formed from the image portion of the eye region cut out at the matching position in the immediately preceding frame. The initial model generated by the lock-on process is kept while the latest (most recent) update model is continually regenerated, so both the initial model and the latest update model are held.

If step S204 in FIG. 5 determines that tracking is in progress (Yes in step S204), the process proceeds to step S210, where the matching target region is cut out. This process cuts out the matching target region from the frame, centered on the eye position detected (tracked) in the previous frame.

Next, matching is performed with the initial template (initial model) (step S212, corresponding to the "first matching processing step"). In the present embodiment, tracking uses both the initial template and the update template, but template matching with the initial template (corresponding to the "first template matching process") is performed first, before any matching with the update template (step S212). The initial template is compared against the matching target region to calculate a similarity (corresponding to the "first score"), and the similarity is calculated at each position while the relative position of the matching target region and the initial template is shifted.

The calculated similarity is compared with the first threshold (a predetermined criterion value, for example a similarity of 50%) (step S214, corresponding to the "comparison step"). If matching finds a position whose similarity is at or above the first threshold (50% or more), step S214 yields Yes and that position is adopted as the tracking result (step S216, corresponding to the "tracking position determination step").

If matching with the initial template (step S212) finds no position whose similarity is at or above the first threshold (No in step S214), template matching with the update template (corresponding to the "second template matching process") is performed (step S220, corresponding to the "second matching processing step"), and the position with the highest similarity is adopted as the tracking result (step S222, corresponding to the "tracking position determination step").

Once the tracking position is determined (steps S216, S222), the eye region at that position is cut out and recorded as the latest model (update template) (step S224, corresponding to the "update model generation step").

Thus, in the present embodiment, matching with the initial model takes priority: if the similarity for the initial model is at or above the specified threshold (the first threshold), the result of matching with the initial model is adopted as is. In this case the matching computation with the update model is skipped, but the latest model is still updated based on the initial model's matching result.

If, on the other hand, the similarity for the initial model is below the first threshold, matching is performed with the update model and its result is adopted. Because the update model generally yields a higher degree of match than the initial model, the decision threshold used when searching for the matching position with the update model (the second threshold) is set higher than the threshold for the initial model (the first threshold).
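The two-stage matching of steps S212 to S224 can be sketched as below. This is a minimal pure-Python illustration, not the embodiment's implementation: the text does not specify the similarity measure, so zero-mean normalized cross-correlation clipped to [0, 1] is assumed here, images are plain nested lists of gray levels, and the names `best_match` and `track_step` are hypothetical. The second threshold is not applied inside the sketch; the returned score is left for the caller to check.

```python
import math

def crop(image, top, left, height, width):
    """Cut a height x width patch out of a 2D list."""
    return [row[left:left + width] for row in image[top:top + height]]

def similarity(patch, template):
    """Assumed score: zero-mean normalized cross-correlation, clipped to [0, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return max(0.0, num / den) if den > 0 else 0.0

def best_match(region, template):
    """Slide the template over the region; return (best score, (top, left))."""
    th, tw = len(template), len(template[0])
    best = (-1.0, (0, 0))
    for top in range(len(region) - th + 1):
        for left in range(len(region[0]) - tw + 1):
            s = similarity(crop(region, top, left, th, tw), template)
            if s > best[0]:
                best = (s, (top, left))
    return best

def track_step(region, initial_model, update_model, first_threshold=0.5):
    """One frame of tracking: initial model first, update model as fallback."""
    score, pos = best_match(region, initial_model)       # S212
    used = "initial"
    if score < first_threshold:                          # S214: No
        score, pos = best_match(region, update_model)    # S220, S222
        used = "update"
    th, tw = len(initial_model), len(initial_model[0])
    new_update_model = crop(region, pos[0], pos[1], th, tw)  # S224
    return pos, score, used, new_update_model
```

Note that the update model is regenerated from the adopted position in both branches, which matches the statement that the latest model is updated even when the update-model matching is skipped.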

The series of steps comprising steps S212, S214, S216, S220, and S222 corresponds to the "object search step".

Next, the tracking result is evaluated (step S226). Specifically, this includes a process for determining whether the initial model is still appropriate (whether the same initial model can continue to be used) and a process for detecting tracking failure.

First, to judge the appropriateness of the initial model, the distance between the two eyes is obtained from the eye positions identified by eye tracking, and it is determined whether the change in this binocular distance exceeds a certain condition (a predetermined criterion value, referred to as the "third threshold") (step S228). The binocular distance increases as the face approaches the camera 12 and decreases as the face moves away from it. The third threshold, which defines the permissible range of near-far movement, can include both a lower limit and an upper limit. Alternatively, the change in binocular distance can be expressed as an absolute value, and an upper limit on that absolute change can be set as the third threshold.

If the change in binocular distance exceeds the third threshold (Yes in step S228), the driver's face may have moved nearer to or farther from the camera 12, so the process proceeds to step S230 and re-lock-on is attempted.
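The check in step S228 might look like the following sketch, which uses the two-sided form of the third threshold. The text gives no values for the third threshold, so the ratio limits `lower_ratio` and `upper_ratio` and the function names are purely illustrative assumptions.

```python
import math

def binocular_distance(left_eye, right_eye):
    """Distance in pixels between the two eye centers, given as (x, y)."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])

def needs_relock(reference_distance, current_distance,
                 lower_ratio=0.8, upper_ratio=1.25):
    """True if the binocular distance has left the permitted near-far range.

    lower_ratio and upper_ratio are hypothetical stand-ins for the lower and
    upper limits of the "third threshold".
    """
    ratio = current_distance / reference_distance
    return ratio < lower_ratio or ratio > upper_ratio
```

Here `reference_distance` would be the binocular distance at the time the current initial model was generated.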

The purpose of this re-lock-on process (step S230) is to rebuild the initial model (initial template). The re-lock-on process (step S230) generates a new initial model, and the initial model is updated (step S232). Steps S230 to S232 are a countermeasure against near-far movement, in which the driver's face moves away from or toward the camera 12: near-far movement is inferred from the change in binocular distance, and the initial model is rebuilt. The specific processing of steps S230 to S232 is the same as the lock-on determination process illustrated in FIGS. 3 and 4.

If the change in binocular distance in step S228 is within the third threshold, the process proceeds to step S236, where the validity of the tracking result is judged. Specifically, the process shown in FIG. 6 is performed.

<Tracking failure determination process>
FIG. 6 is a flowchart showing an example of the determination process that, based on the tracking result, temporarily stops tracking or resumes tracking after a stop. FIG. 7 is an explanatory diagram showing the transitions among the three states that can occur during tracking, provided to make the flowchart of FIG. 6 easier to follow.

In the eye tracking process, three states are possible: both eyes are being tracked (labeled "ST-2" in FIG. 7); only one eye is being tracked and tracking of the other eye is stopped (labeled "ST-1" in FIG. 7); and tracking of both eyes is stopped (labeled "ST-0" in FIG. 7).
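One simple way to represent these three states in code is an enumeration whose value equals the number of eyes currently tracked. The class and function names below are hypothetical and only mirror the labels of FIG. 7.

```python
from enum import Enum

class TrackingState(Enum):
    ST_0 = 0  # tracking of both eyes is stopped
    ST_1 = 1  # only one eye is being tracked
    ST_2 = 2  # both eyes are being tracked

def tracking_state(left_tracking, right_tracking):
    """Map the two per-eye tracking flags to the state of FIG. 7."""
    return TrackingState(int(left_tracking) + int(right_tracking))
```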

The flow shown in FIG. 6 is the procedure for deciding when to stop and when to resume eye tracking. When data for the region of interest (ROI) subject to the tracking computation is input (step S302), it is first determined whether both eyes are being tracked (step S304). If so, step S304 yields Yes and the process proceeds to the "simultaneous binocular tracking stop determination" of step S306.

The simultaneous binocular tracking stop determination (step S306) decides, while both eyes are being tracked, whether to stop tracking both eyes at once. For example, tracking of both eyes is stopped simultaneously if at least one of the following conditions 1 to 3 is satisfied.

[Condition 1] The current binocular distance is at least 1.25 times the binocular distance immediately after lock-on (measured as the number of pixels between the center positions of the two eyes), and the correlation (similarity) with the initial model is below 35%.

[Condition 2] The correlation (similarity) with the initial model is below 20% for the eye with the higher correlation and 15% or less for the eye with the lower correlation.

[Condition 3] The current binocular distance is narrower than the width of a single eye, and the correlation with the initial model for the eye with the higher correlation is 35% or less.

The specific values "1.25 times", "35%", "20%", and "15%" used as thresholds in conditions 1 to 3 are only examples and can be set to other appropriate values as the purpose requires. The binocular distance serves as an indicator of near-far variation (size change). In other words, the determination conditions combine the viewpoint of size change with the viewpoint of similarity to the initial model.
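Conditions 1 to 3 can be written as a single predicate, sketched below with the example values from the text. Correlations are expressed as fractions in [0, 1]. Condition 1 does not say which eye's correlation is compared with 35%; the sketch assumes the eye with the higher correlation, which is an interpretation rather than something the text states. The function name is hypothetical.

```python
def stop_both_eyes(lock_on_distance, current_distance, eye_width,
                   corr_left, corr_right):
    """True if tracking of both eyes should stop at once (step S306)."""
    hi = max(corr_left, corr_right)   # correlation of the better-matching eye
    lo = min(corr_left, corr_right)   # correlation of the worse-matching eye
    # Condition 1 (assumption: the 35% test uses the higher correlation)
    cond1 = current_distance >= 1.25 * lock_on_distance and hi < 0.35
    # Condition 2
    cond2 = hi < 0.20 and lo <= 0.15
    # Condition 3
    cond3 = current_distance < eye_width and hi <= 0.35
    return cond1 or cond2 or cond3
```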

After step S306, the process proceeds to step S308, where it is determined whether tracking of both eyes has been stopped. If tracking of both eyes was stopped simultaneously (Yes in step S308), the tracking flag is turned OFF (step S310). In this case, the process moves on to re-lock-on (see steps S40 to S46 in FIG. 2 and ST-0 in FIG. 7).

If step S308 in FIG. 6 yields No, the process proceeds to the one-eye tracking stop determination (step S312). That is, if the simultaneous binocular tracking stop determination (step S306) did not stop both eyes, a stop determination is made for a single eye (step S312). For example, tracking of a single eye is stopped if at least one of the following conditions 4 to 7 is satisfied.

[Condition 4] If the correlation (similarity) with the initial model exceeds 50% for the eye with the higher correlation and is below 15% for the eye with the lower correlation, tracking of the eye with the lower correlation is stopped.

[Condition 5] If the current binocular distance is at least 1.25 times the binocular distance immediately after lock-on, tracking of the eye with the lower correlation is stopped.

[Condition 6] If the binocular distance becomes narrower than the width of a single eye, tracking of the eye with the lower correlation is stopped.

[Condition 7] If the correlation with the initial model exceeds 50% for the eye with the higher correlation and is below 50% for the eye with the lower correlation, and the current binocular distance is less than one quarter of the binocular distance immediately after lock-on, tracking of the eye with the lower correlation is stopped.

The specific values "50%", "15%", "1.25 times", and "one quarter" used as thresholds in conditions 4 to 7 are only examples and can be set to other appropriate values as the purpose requires.
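Conditions 4 to 7 all stop the eye with the lower correlation, so they can be sketched as one function that returns which eye to stop, if any. The example values from the text are used directly, correlations are fractions in [0, 1], and the function name and the dictionary layout are hypothetical.

```python
def eye_to_stop(lock_on_distance, current_distance, eye_width, corr):
    """Return "left" or "right" (the eye to stop tracking) or None (step S312).

    corr maps "left" and "right" to the correlation with the initial model.
    """
    low_eye = min(corr, key=corr.get)           # eye with the lower correlation
    hi, lo = max(corr.values()), min(corr.values())
    cond4 = hi > 0.50 and lo < 0.15
    cond5 = current_distance >= 1.25 * lock_on_distance
    cond6 = current_distance < eye_width
    cond7 = hi > 0.50 and lo < 0.50 and current_distance < lock_on_distance / 4
    return low_eye if (cond4 or cond5 or cond6 or cond7) else None
```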

If step S304 yields No, the process proceeds to step S320, where it is determined whether tracking of both eyes is stopped. If so (Yes in step S320), the process proceeds to the "simultaneous binocular tracking return determination" of step S322. In this determination (step S322), made while tracking of both eyes is stopped, the initial model is matched within the tracking range held in the history for each eye. If the correlation with the initial model exceeds 30% in this matching, tracking is resumed.

If step S320 yields No, that is, if tracking of one eye is stopped, the one-eye tracking return determination of step S324 is performed. In other words, when only one eye is being tracked (tracking of the other eye is stopped), a return determination is made for the eye whose tracking is stopped (step S324). For example, if the eye being tracked is the left eye, the initial model is matched within a range of 131 pixels vertically by 131 pixels horizontally centered on a position 80 pixels to the right of the center of the tracked left eye. Likewise, if the eye being tracked is the right eye, the initial model is matched within a 131 × 131 pixel range centered on a position 80 pixels to the left of the center of the tracked right eye.

If the correlation with the initial model exceeds 30% in this matching, tracking is resumed.

The specific values "80 pixels", "131 pixels", and "30%" given for these criteria are only examples and can be set to other appropriate values as the purpose requires.
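The search window for the one-eye return determination can be computed as in the sketch below, using the example values of 80 pixels and 131 pixels. The function names are hypothetical, the window is returned with inclusive bounds, and clipping to the image border (not discussed in the text) is left to the caller.

```python
def recovery_window(tracked_eye, tracked_center, offset=80, size=131):
    """Return (x_min, y_min, x_max, y_max), inclusive, for the lost eye.

    tracked_eye is "left" or "right": the eye that is still being tracked.
    tracked_center is its (x, y) center in image coordinates.
    """
    cx, cy = tracked_center
    # Lost eye is expected to the right of a tracked left eye, and vice versa
    search_x = cx + offset if tracked_eye == "left" else cx - offset
    half = size // 2   # 131 px -> 65 px on each side of the center pixel
    return (search_x - half, cy - half, search_x + half, cy + half)

def should_resume(initial_model_correlation, threshold=0.30):
    """Resume tracking when the initial-model correlation exceeds 30%."""
    return initial_model_correlation > threshold
```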

Following step S324, step S326 determines whether tracking has resumed. If step S326 yields Yes, the process ends; if No, the process proceeds to the "one-eye tracking stop determination" of step S328. Step S328 decides whether to stop tracking the remaining eye while tracking of the other eye is already stopped. That is, when only one eye is being tracked, this process decides whether to stop tracking that eye. Tracking of the eye is stopped if the following condition 8 is satisfied.

[Condition 8] The correlation with the initial model is below 25%, and the eye was judged "closed" in the open/closed determination of the previous frame.

The specific value "25%" used as the threshold in condition 8 is only an example and can be set to another appropriate value as the purpose requires.

The validity of the tracking result is judged using the process described with reference to FIG. 6 (step S236 in FIG. 5). If tracking of both eyes cannot be continued (No in step S236), the tracking state is turned OFF, that is, the tracking flag is turned OFF (step S238). If the tracking result is valid (Yes in step S236), tracking continues and processing moves to the next frame image.

As described above, tracking failure of both eyes or of one eye can be judged using the matching positions and scores (similarities) obtained by matching with the initial model and the update model.

Alternatively, tracking failure can be judged using the score of the open/closed determination described later.

Furthermore, the method can include a step of storing the tracking results of multiple frames as a history. Using trajectory information obtained from this history, the movement of the eye can be predicted (motion prediction), bad tracking positions can be identified, and tracking can be stopped. Here, "trajectory information" means the trajectory of the matching position (the path along which the position moves) across multiple frames that are consecutive in time. By storing the matching results of multiple frames as a history, a bad tracking position can be identified from the continuity between successive frames. For example, if the difference between the eye position at time t and the eye position at time t+1 is larger than a specified value, tracking failure can be determined.
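The last example, comparing the eye positions at times t and t+1, can be sketched as follows. The text does not say how the difference is measured or what the specified value is, so Euclidean distance in pixels and the `max_jump` parameter are assumptions, and the function name is hypothetical.

```python
import math

def has_position_jump(history, max_jump):
    """True if the two most recent tracked positions are too far apart.

    history is a list of (x, y) matching positions, oldest first.
    max_jump is a hypothetical limit in pixels (the "specified value").
    """
    if len(history) < 2:
        return False          # not enough frames to compare
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return math.hypot(x1 - x0, y1 - y0) > max_jump
```

A fuller motion-prediction scheme would extrapolate from more than two frames; this sketch covers only the frame-to-frame difference given as the example.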

追跡不良と判断された場合は、追跡を停止し、ロックオン処理をやり直して、追跡の復帰を試みる。なお、図６で説明した追跡不良の判定処理（追跡の停止／復帰判定処理）が「追跡停止判断工程」に相当している。 When it is determined that the tracking is poor, the tracking is stopped, the lock-on process is performed again, and the recovery of the tracking is attempted. The tracking failure determination process (tracking stop / return determination process) described with reference to FIG. 6 corresponds to a “tracking stop determination process”.

＜再ロックオンについて＞ <About re-lock-on>
再ロックオン判定（図１のステップＳ４６、図５のステップＳ２３０）は、左右どちらかの眼において、初期モデルの相関が80%を下回った場合、又は少なくとも片眼の追跡が停止している場合に行われる。これは、眼追跡を開始してからある程度時間が経つと、運転者とカメラ１２の距離変動（遠近移動）や照明変動などにより、初期モデルを用いてマッチング処理をしても高い相関が得られなくなるため、初期モデルを更新すること、或いはまた、追跡復帰処理（図６のステップＳ３２２、Ｓ３２４）を補い、追跡が停止している際に、できるだけ早く追跡を復帰させることを目的とする。 The re-lock-on determination (step S46 in FIG. 1, step S230 in FIG. 5) is performed when the correlation with the initial model falls below 80% for either the left or the right eye, or when tracking of at least one eye has stopped. Some time after eye tracking starts, changes in the distance between the driver and the camera 12 (movement toward or away from the camera) and changes in illumination mean that matching with the initial model no longer yields a high correlation. The purpose of re-lock-on is therefore to update the initial model, and also to supplement the tracking return process (steps S322 and S324 in FIG. 6) so that tracking is restored as quickly as possible when it has stopped.

再ロックオンのアルゴリズムは、図３及び図４で例示したロックオン判定のアルゴリズムと概ね同様であるが、少なくともどちらか一方の眼を追跡中の場合は、両眼判別器を照合する位置が限定される。すなわち、図７に示した３つの状態ＳＴ−０、ＳＴ−１、ＳＴ−２に応じて、次のように照合が行われる。 The re-lock-on algorithm is largely the same as the lock-on determination algorithm illustrated in FIGS. 3 and 4, except that when at least one eye is being tracked, the positions at which the binocular discriminator is applied are restricted. That is, matching is performed as follows according to the three states ST-0, ST-1, and ST-2 shown in FIG. 7.

［1］両眼とも追跡中の場合（図７の状態ＳＴ−２）、両眼それぞれの中心点の中点を基準に、比較的小さい範囲に両眼判別器を照合する。 [1] When both eyes are being tracked (state ST-2 in FIG. 7), the binocular discriminator is collated in a relatively small range with reference to the midpoint of the center point of each eye.

［2］左右どちらか一方（片眼）のみ追跡中の場合（図７の状態ＳＴ−１）、追跡中の単眼が左眼ならばその右側に、右眼ならばその左側に、追跡停止している単眼が存在すると考え、追跡中の単眼の中心から40pixelの位置を基準に、横15pixel、縦10pixelの範囲に両眼判別器を照合する。 [2] When only one eye (left or right) is being tracked (state ST-1 in FIG. 7), the eye whose tracking has stopped is assumed to lie to the right of the tracked eye if the tracked eye is the left eye, and to its left if the tracked eye is the right eye. The binocular discriminator is applied over a range 15 pixels wide and 10 pixels high, with a position 40 pixels from the center of the tracked eye as the reference.

［3］両眼とも追跡停止中の場合（図７のＳＴ−０）、最初に行ったロックオン判定（図１のステップＳ１８）と同様の両眼検出を行う。 [3] When tracking of both eyes has stopped (ST-0 in FIG. 7), binocular detection is performed in the same way as in the initial lock-on determination (step S18 in FIG. 1).
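The three cases [1] to [3] can be sketched as one function that returns the region in which the binocular discriminator is applied. The 40, 15, and 10 pixel values come from the text. Everything else is an assumption made for illustration: the function name, the (center x, center y, width, height) return format, reading 15 by 10 pixels as the full extent of the region, reusing that size as the "relatively small range" of case [1], and scanning the whole frame in case [3].

```python
def relock_search_region(left_eye, right_eye, frame_size, small_range=(15, 10)):
    """Return (cx, cy, width, height) of the binocular search region.

    left_eye / right_eye are (x, y) tracking positions in image
    coordinates, or None when tracking of that eye has stopped.
    """
    if left_eye is not None and right_eye is not None:
        # ST-2: both eyes tracked, search around the midpoint of the two centers.
        cx = (left_eye[0] + right_eye[0]) / 2
        cy = (left_eye[1] + right_eye[1]) / 2
        return (cx, cy, small_range[0], small_range[1])
    if left_eye is not None:
        # ST-1: left eye tracked, the stopped eye is assumed to be on its right.
        return (left_eye[0] + 40, left_eye[1], 15, 10)
    if right_eye is not None:
        # ST-1: right eye tracked, the stopped eye is assumed to be on its left.
        return (right_eye[0] - 40, right_eye[1], 15, 10)
    # ST-0: nothing tracked, fall back to detection over the whole frame.
    width, height = frame_size
    return (width / 2, height / 2, width, height)


print(relock_search_region((100, 60), None, (640, 480)))
```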

＜開閉判断＞ <Open / close determination>
次に開閉判断の処理（図２のステップＳ３４）について説明する。 Next, the open / close determination process (step S34 in FIG. 2) will be described.

図８は開閉判断処理の流れを示すフローチャートである。開閉判断に必要な初期設定（ステップＳ４０２）のための事前学習として、予め顔の向きを示す顔角度と、開き目（単眼）判別器を適用したときの検出スコア（類似度を表す評価値）との関係を求めておく。その具体的な方法として、まず、顔角度ごとの複数の分析画像を用意する。顔角度は、例えば、カメラ１２に正対する正面前方向きを角度の基準（０度）として、右向き方向の回転角（水平角度）をプラスの角度、左向き方向の回転角をマイナスの角度として表す。 FIG. 8 is a flowchart showing the flow of the open / close determination process. As prior learning for the initial setting required for open / close determination (step S402), the relationship between the face angle, which indicates the face direction, and the detection score (an evaluation value representing similarity) obtained when the open-eye (monocular) discriminator is applied is determined in advance. Specifically, a plurality of analysis images are first prepared for each face angle. The face angle is expressed, for example, with the front-facing direction directly opposite the camera 12 as the angle reference (0 degrees), the rotation angle (horizontal angle) toward the right as a positive angle, and the rotation angle toward the left as a negative angle.

様々な顔向き角度ごとに複数人分の顔画像を用意し、これら各画像に対して、単眼の開き目判別器をかけ、得られたスコアの分布から、正面前方向きの画像に対するスコアと、顔角度との関係を統計処理で算出する。 Face images of a plurality of people are prepared for each of various face orientation angles, and the monocular open-eye discriminator is applied to each image. From the distribution of the scores obtained, the relationship between the face angle and the score, relative to the score for the front-facing image, is calculated by statistical processing.

開き目（単眼）判別器は、正面前方向きの画像から生成されている。したがって、人物の違いによって、スコアの大小に相違はあるものの全体的には、正面前方向きの画像のスコアが最も値が大きく、顔向き角度が大きくなるにつれて、スコアの値は低下していく傾向となる。複数人分の分析画像を用いて統計処理を行うことにより、概ね平均的な関係を把握することができる。 The open-eye (monocular) discriminator is generated from front-facing images. Therefore, although the magnitude of the score varies from person to person, overall the score is highest for the front-facing image and tends to decrease as the face orientation angle increases. By performing statistical processing on analysis images of a plurality of people, a roughly average relationship can be obtained.

このような事前学習によるデータが予め得られていることを前提とし、図８の処理が進められる。眼追跡が開始されると、まず、開閉判断用の初期設定が行われる（ステップＳ４０２）。この初期設定は、被判断者であるドライバー自身の撮像画像を用いて、顔角度とスコアの関係性を設定する処理である。 The processing of FIG. 8 is advanced on the premise that data by such prior learning is obtained in advance. When eye tracking is started, initial setting for opening / closing determination is first performed (step S402). This initial setting is a process of setting the relationship between the face angle and the score using the captured image of the driver who is the person to be judged.

人によって開き目判別器による検出の精度は異なる。例えば、メガネをかけている人と、かけていない人とでは、検出スコアが大きくことなる。このため、事前学習で得られた統計結果をそのまま適用すると、必ずしも適切な開き目検出（判断）ができないことも起こりうる。したがって、被判断者の正面画像に対するスコアを基に、事前学習による統計関係から、当該被判断者の顔角度と検出スコアの関係を設定する。つまり、ステップＳ４０２の処理は、被判断者である人物（当該運転者）に対して、どういう閾値を設定するか、ということを決めるアルゴリズムである。 The accuracy of detection by the open-eye discriminator differs from person to person. For example, the detection score differs greatly between a person wearing glasses and a person not wearing them. For this reason, if the statistical result obtained by prior learning is applied as it is, appropriate open-eye detection (determination) may not always be possible. Therefore, the relationship between the face angle and the detection score for the person being judged is set from the statistical relationship obtained by prior learning, based on the score for that person's frontal image. In other words, the process of step S402 is an algorithm that decides what threshold to set for the person being judged (the driver in question).

具体的には、ロックオンされたらフレーム毎の開閉判断結果の履歴を１０フレーム分保持するものとし、ロックオン直後の１０フレームについて正面を向いた眼検出の結果から、スコアの平均値（「正面検出平均スコア」）を算出する。そして、事前学習で得られている統計関係から、被判断者の顔角度と検出スコアの関係を決める。 Specifically, once lock-on is achieved, a history of the per-frame open / close determination results is held for 10 frames, and the average score (the "frontal detection average score") is calculated from the front-facing eye detection results for the 10 frames immediately after lock-on. The relationship between the face angle and the detection score for the person being judged is then determined from the statistical relationship obtained by prior learning.

図９にその例を示す。図９の横軸は顔角度の大きさ（絶対値）、縦軸は検出スコアの値を示す。図９中g1、g2は、それぞれ異なる人物の顔角度と検出スコアの関係を示している。例えば、g1はメガネをかけていない人、g2はメガネをかけている人のグラフである。 An example is shown in FIG. 9. The horizontal axis in FIG. 9 indicates the magnitude (absolute value) of the face angle, and the vertical axis indicates the detection score. In FIG. 9, g1 and g2 show the relationship between face angle and detection score for two different persons. For example, g1 is the graph for a person not wearing glasses, and g2 is the graph for a person wearing glasses.
眼追跡（図８のステップＳ４０４）でロックオンされている被追跡中の両眼から顔角度を推定し（ステップＳ４０６）、初期設定（ステップＳ４０２）で得られた顔角度と検出スコアの関係から、角度ごとに開閉判断用の閾値を設定する（ステップＳ４０８）。つまり、図９のg1やg2で表されるような閾値が設定される。顔の角度が大きいほど、閾値は小さい値に設定されることになる。 The face angle is estimated from the two eyes that are locked on and being tracked by eye tracking (step S404 in FIG. 8) (step S406), and a threshold for open / close determination is set for each angle from the relationship between face angle and detection score obtained in the initial setting (step S402) (step S408). That is, a threshold such as that represented by g1 or g2 in FIG. 9 is set. The larger the face angle, the smaller the threshold.
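One way to picture steps S402 and S408 is sketched below. The text does not disclose how the learned statistics are stored or how the threshold is derived from them, so the following are all assumptions: the statistics are held as a hypothetical `ratio_curve` of (absolute angle, score relative to the frontal score) pairs, the curve is scaled by the person's frontal detection average score, and values between sample angles are linearly interpolated. The numbers are invented for illustration.

```python
def frontal_average(scores):
    """Average of the open-eye scores for the frames right after lock-on."""
    return sum(scores) / len(scores)


def threshold_for_angle(angle_deg, frontal_avg, ratio_curve):
    """Scale a learned (|angle|, ratio) curve by the person's frontal score.

    ratio_curve must be sorted by angle; values between samples are
    linearly interpolated (an assumption, not stated in the text).
    """
    a = abs(angle_deg)
    if a <= ratio_curve[0][0]:
        return frontal_avg * ratio_curve[0][1]
    for (a0, r0), (a1, r1) in zip(ratio_curve, ratio_curve[1:]):
        if a <= a1:
            t = (a - a0) / (a1 - a0)
            return frontal_avg * (r0 + t * (r1 - r0))
    return frontal_avg * ratio_curve[-1][1]


# Hypothetical learned curve: the score drops as the face turns away.
curve = [(0, 1.0), (30, 0.8), (60, 0.5)]
avg = frontal_average([80] * 10)           # 10 frames after lock-on
print(threshold_for_angle(15, avg, curve))
```

Scaling one shared curve by a per-person frontal score is what lets two people with different score levels, such as g1 and g2, receive different thresholds at the same angle.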

こうして設定された閾値（第４の閾値）を用いて、後のスコア判定（図８のステップＳ４１４）の処理にて眼の開閉状態を判断する。顔角度の推定処理については図１０で後述する。 Using the threshold value thus set (fourth threshold value), the open / closed state of the eye is determined in the subsequent score determination (step S414 in FIG. 8). The face angle estimation process will be described later with reference to FIG.

追跡中の眼に対し、開閉判断は片眼ずつ行われる（図８のステップＳ４１０）。すなわち、単眼の開閉判断として、開き目（単眼）判別器が適用され、開き目の検出が行われる（ステップＳ４１２）。眼追跡の処理で特定された追跡位置によって定まる所定の画像範囲に開き目（単眼）判別器を適用し、開き目検出によるスコアが計算され、ステップＳ４０８で決定された閾値（第４の閾値）と対比して判定が行われる（ステップＳ４１４）。 For the eyes being tracked, open / close determination is performed one eye at a time (step S410 in FIG. 8). That is, for the monocular open / close determination, the open-eye (monocular) discriminator is applied and open-eye detection is performed (step S412). The open-eye (monocular) discriminator is applied to a predetermined image range determined by the tracking position identified in the eye tracking process, the open-eye detection score is calculated, and the determination is made by comparing that score with the threshold (fourth threshold) determined in step S408 (step S414).

時間軸統合の処理（ステップＳ４１６）では、直近の数フレーム（例えば、２〜３フレーム）分の単眼開閉判断の結果を蓄積しておき、これら直近の数フレーム分の単眼開閉判断の結果を統合して単眼の開閉判断結果とする。このような時間軸統合の処理によって、例えば、瞬きなどの一瞬の開閉変化を捨象することが可能であり、居眠りとは違う「瞬き」を無視することができる。 In the time axis integration process (step S416), the results of monocular opening / closing judgments for the latest several frames (for example, 2 to 3 frames) are accumulated, and the results of monocular opening / closing judgments for these recent several frames are integrated. The result is a monocular opening / closing judgment result. By such a time axis integration process, for example, it is possible to discard an instantaneous opening / closing change such as a blink, and a “blink” that is different from a doze can be ignored.
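A minimal sketch of the time-axis integration in step S416 follows. The text says only that the results of the last few frames are integrated. The majority vote, the tie-break toward "open", and the class name are assumptions chosen for illustration.

```python
from collections import deque


class TemporalIntegrator:
    """Integrate the per-frame open/closed results of one eye over a short window."""

    def __init__(self, window=3):
        # Only the most recent `window` results are kept.
        self.history = deque(maxlen=window)

    def update(self, is_open):
        """Add the newest single-frame result and return the integrated result.

        Majority vote over the window; a tie counts as open (assumption).
        """
        self.history.append(bool(is_open))
        open_votes = sum(self.history)
        return open_votes * 2 >= len(self.history)


integrator = TemporalIntegrator(window=3)
frames = [True, True, False, True, True, False, False, False]
print([integrator.update(f) for f in frames])
```

In this sequence the single closed frame in third position behaves like a blink and does not change the integrated result, while the run of closed frames at the end does.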

左右の眼について、それぞれステップＳ４１０〜Ｓ４１６の処理を並列に行い、その後、左右それぞれの単眼開閉判断の結果を統合して（ステップＳ４１８）、総合判定としての両眼の開閉判断結果を出力する（ステップＳ４２０）。通常、人は両眼の瞼が左右同時に閉じたり、開いたりすることが多く、片方の眼を閉じて、他の片方を開くということは起こりにくい。したがって、本実施形態では、開閉判断に際しては、左右それぞれの単眼開閉判断を総合して、判断している。すなわち、最終的な開閉判断は、片眼ずつの開閉判断結果を組み合わせて両眼の開閉判断結果として統合（両眼統合）されたものを出力する（ステップＳ４１８〜Ｓ４２０）。この出力される判断結果に基づき、ドライバーの居眠りを判別することができる。また、上述した開閉判断に基づく居眠り判別と併せて、脇見運転の判別も可能である。 For the left and right eyes, the processing of steps S410 to S416 is performed in parallel. The left and right monocular open / close determination results are then integrated (step S418), and the binocular open / close determination result is output as the overall determination (step S420). Normally, a person's two eyelids close and open at the same time, and it is unlikely that one eye is closed while the other is open. Therefore, in the present embodiment, the open / close determination is made by combining the left and right monocular determinations. That is, the final open / close determination is output as the integration (binocular integration) of the per-eye determination results (steps S418 to S420). Based on this output, it can be determined whether the driver is dozing. In addition to the dozing determination based on the open / close determination described above, inattentive (looking-aside) driving can also be determined.

さらに、本実施形態では、開閉判断に際し、開いた眼の画像特徴を機械学習することによって構築した判別器（開き目判別器）を用いているため、この開閉判断の処理において眼の位置が正確に特定される。本実施形態では、こうして検出された眼の位置により、追跡位置を修正する機能を備えている。すなわち、追跡処理で特定された追跡位置に対して、開閉判断を行い、その開閉判断で検出した眼の位置を追跡位置に反映させる。このような追跡位置修正機能により、更に精度の高い追跡が可能である。 Furthermore, in this embodiment, since the discriminator (open eye discriminator) constructed by machine learning of the image feature of the open eye is used for the open / close determination, the eye position is accurately determined in the open / close determination process. Specified. In the present embodiment, the tracking position is corrected based on the eye position thus detected. That is, an open / close determination is performed on the tracking position specified by the tracking process, and the eye position detected by the open / close determination is reflected in the tracking position. Such tracking position correction function enables tracking with higher accuracy.

＜顔向き（水平角度）の推定処理と脇見判定について＞ <Face orientation (horizontal angle) estimation and looking-aside determination>
図１０は顔角度の推定処理と脇見判定の処理の流れを示したフローチャートである。既述のとおり、追跡対象の眼がロックオンされ（ステップＳ５０２）、被判断者の顔角度と検出スコアの関係の設定が更新された後（ステップＳ５０４）、顔角度の算出基準となる角度基準の設定が行われる（ステップＳ５０６）。 FIG. 10 is a flowchart showing the flow of the face angle estimation process and the looking-aside determination process. As already described, after the eyes to be tracked are locked on (step S502) and the setting of the relationship between the face angle and the detection score for the person being judged is updated (step S504), the angle reference used as the basis for calculating the face angle is set (step S506).

図１１（ａ）（ｂ）は、顔向き（顔角度）を示す水平角の算出方法を説明するための模式図である。図１１（ａ）（ｂ）は人物の頭を円柱近似して上から見た模式図であり、Ａは左眼、Ｃは右眼の位置を表す。矢印Ｂは顔の向きを示す。図１１（ａ）は、顔向き水平角度算出の基準となる正面の顔向き、図１１（ｂ）は、図１１（ａ）から水平面内で右方向に回転した顔向きが描かれている。 FIGS. 11A and 11B are schematic diagrams for explaining a method of calculating a horizontal angle indicating the face direction (face angle). FIGS. 11A and 11B are schematic views of a person's head viewed from above with a cylinder approximation, where A represents the position of the left eye and C represents the position of the right eye. Arrow B indicates the direction of the face. FIG. 11A shows the front face orientation as a reference for calculating the face orientation horizontal angle, and FIG. 11B shows the face orientation rotated to the right in the horizontal plane from FIG. 11A.

角度基準設定（図１０のステップＳ５０６）の処理では、ロックオンされた際に、正面の撮像画像から左右両眼の直線距離（図１１（ａ）におけるＡ−Ｃ間の直線距離Ｄ）を計算し、その値を記録する。具体的には、撮像画像内における左眼及び右眼のそれぞれの中心位置間距離（両眼の中心距離）をpixel単位で求め、その値をメモリに記憶する。 In the angle reference setting process (step S506 in FIG. 10), when lock-on is achieved, the straight-line distance between the left and right eyes (the straight-line distance D between A and C in FIG. 11(a)) is calculated from the frontal captured image and recorded. Specifically, the distance between the center positions of the left eye and the right eye in the captured image (the center distance between the eyes) is obtained in pixels and stored in memory.

次に、追跡（図１０のステップＳ５０８）によって特定される両眼の追跡位置から、顔角度の推定処理が行われる（ステップＳ５１０）。この角度推定処理（ステップＳ５１０）では、まず、両眼の追跡位置から両眼の中心距離（図１１（ｂ）におけるＡ−Ｃ間の直線距離Ｙ）を算出する。 Next, face angle estimation processing is performed from the tracking positions of both eyes specified by tracking (step S508 in FIG. 10) (step S510). In this angle estimation process (step S510), first, the center distance of both eyes (the linear distance Y between A and C in FIG. 11B) is calculated from the tracking position of both eyes.

そして、水平方向顔向き角度Ｂを次の（式１）により算出する。 Then, the horizontal face orientation angle B is calculated by the following (Equation 1).

水平方向顔向き角度Ｂ＝arcsin（Ｙ／Ｄ） …（式１） Horizontal face orientation angle B = arcsin(Y / D) ... (Equation 1)
なお、（式１）は次のようにして導出される。 (Equation 1) is derived as follows.

図１１において、円柱の中心をＥ、円柱の半径をｒ、両眼（弧ＡＣ）の中心角をｘ［ラジアン］とする。また、図１１（ａ）において、矢印Ｂ（ベクトルＥＢ）に直交する基準線ＥＦとすると、図１１（ａ）に示した関係からＤは次の（式２）で表される。 In FIG. 11, let E be the center of the cylinder, r the radius of the cylinder, and x [radians] the central angle subtended by the two eyes (arc AC). In FIG. 11(a), let EF be the reference line orthogonal to arrow B (vector EB). From the relationship shown in FIG. 11(a), D is then given by the following (Equation 2).

Ｄ＝２ｒ・sin（ｘ／２）…（式２） D = 2r · sin(x / 2) ... (Equation 2)
なお、式中の「・」は乗算の演算子を表す（式３についても同様）。 Note that "·" in the equation denotes multiplication (the same applies to Equation 3).

一方、図１１（ｂ）において、基準線ＥＦから反時計回り方向に角度を測り、角ＦＥＣを角度Ｃ、角ＦＥＢを角度Ｂ、角ＦＥＡを角度Ａと示すとき、図１１（ｂ）の幾何学的関係から、次の（式３）〜（式５）が成り立つ。 On the other hand, in FIG. 11(b), when angles are measured counterclockwise from the reference line EF and angle FEC is denoted as angle C, angle FEB as angle B, and angle FEA as angle A, the following (Equation 3) to (Equation 5) hold from the geometric relationship in FIG. 11(b).

Ｙ＝r・cosＣ−r・cosＡ＝2r・sinＢ・sin(ｘ／２) …（式３） Y = r · cos C − r · cos A = 2r · sin B · sin(x / 2) ... (Equation 3)
Ｃ＋ｘ＝Ａ …（式４） C + x = A ... (Equation 4)
Ｂ＝（Ａ＋Ｃ）／２ …（式５） B = (A + C) / 2 ... (Equation 5)
これらの関係から、次の（式６）が導かれる。 From these relationships, the following (Equation 6) is derived.

Ｙ＝ＤsinＢ …（式６） Y = D sin B ... (Equation 6)
この（式６）を角度Ｂについて解くと（式１）が得られる。 Solving (Equation 6) for angle B yields (Equation 1).
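The derivation can be checked numerically. The sketch below builds Y from the geometry of Equations 3 to 5 (A = B + x/2 and C = B − x/2 follow from Equations 4 and 5), builds D from Equation 2, and confirms that Equation 1 recovers B. The function names and the sample values of r and x are assumptions. Note that, because B is measured from the reference line EF, the frontal pose (Y = D) corresponds to B = π/2, and arcsin alone cannot tell B from π − B.

```python
import math


def interocular_distance(left_eye, right_eye):
    """Pixel distance between two eye centers (used for both D and Y)."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def face_angle_b(y_px, d_px):
    """Equation 1: B = arcsin(Y / D), in radians."""
    ratio = min(1.0, y_px / d_px)  # guard against Y slightly exceeding D through noise
    return math.asin(ratio)


def projected_distance(r, x, b):
    """Equation 3, left-hand form: Y = r*cos(C) - r*cos(A)."""
    a = b + x / 2  # from Equations 4 and 5
    c = b - x / 2
    return r * math.cos(c) - r * math.cos(a)


r, x = 80.0, 0.9                      # hypothetical cylinder radius and eye arc
d = 2 * r * math.sin(x / 2)           # Equation 2
for b_deg in (20, 45, 70):
    y = projected_distance(r, x, math.radians(b_deg))
    print(b_deg, round(math.degrees(face_angle_b(y, d)), 6))
```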

角度推定処理（図１０のステップＳ５１０）にて計算した角度Ｂを予め定めた脇見判定用の閾値と比較し、顔向きが所定の角度以上で、かつ、その状態がある時間以上継続したら、ドライバーの脇見運転と判別することができる（ステップＳ５１２）。 The angle B calculated in the angle estimation process (step S510 in FIG. 10) is compared with a predetermined threshold for looking-aside determination. If the face orientation is at or beyond the predetermined angle and that state continues for a certain time or longer, it can be determined that the driver is looking aside while driving (step S512).
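The "angle and duration" condition of step S512 can be sketched as a small per-frame state machine. Several points are assumptions: the input is taken to be the deviation of the face from the frontal direction in degrees, the duration is counted in consecutive frames, and the class name and the sample values of 30 degrees and 3 frames are hypothetical.

```python
class LookAsideDetector:
    """Flag looking aside when the face stays turned for enough consecutive frames."""

    def __init__(self, angle_threshold_deg, min_frames):
        self.angle_threshold_deg = angle_threshold_deg
        self.min_frames = min_frames
        self.run_length = 0  # consecutive frames at or beyond the threshold

    def update(self, turn_angle_deg):
        """Feed the turn angle for one frame; return True while looking aside."""
        if abs(turn_angle_deg) >= self.angle_threshold_deg:
            self.run_length += 1
        else:
            self.run_length = 0
        return self.run_length >= self.min_frames


detector = LookAsideDetector(angle_threshold_deg=30, min_frames=3)
print([detector.update(a) for a in [10, 35, 40, 5, 32, 33, 45, 50]])
```

Resetting the counter whenever the face returns toward the front keeps brief glances from accumulating into a false alarm.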

図１０のステップＳ５０８で示した追跡処理の結果に対しては、追跡結果のチェック（結果判断）が行われ（ステップＳ５１４）、追跡結果がＯＫであれば、次のフレームの処理に進む（ステップＳ５１６）。一方、追跡結果がＮＧであれば（ステップＳ５１４にてＮＧ）、ステップＳ５０２に戻ってロックオン処理をやり直す。カメラ１２と被写体の距離（遠近）が変化すると、正面前方を向いた顔画像における両眼の直線距離Ｄが変わるため、Ｄの誤差をできるだけ小さくするために、適切な再ロックオンをかける。Ｄの精度を高めることにより、顔向きの変化に対する検出精度が高まる。 The result of the tracking process shown in step S508 of FIG. 10 is checked (result determination) (step S514). If the tracking result is OK, processing proceeds to the next frame (step S516). If the tracking result is NG (NG in step S514), processing returns to step S502 and the lock-on process is performed again. When the distance between the camera 12 and the subject changes, the straight-line distance D between the eyes in the front-facing face image also changes, so re-lock-on is applied as appropriate to keep the error in D as small as possible. Improving the accuracy of D improves the accuracy with which changes in face orientation are detected.

＜顔角度に対応した初期モデルの変更について＞ <Changing the initial model according to the face angle>
テンプレートマッチングの更なる精度向上のために、顔向きが大きく変化した場合に初期モデルを更新する構成を採用する態様が好ましい。 To further improve the accuracy of template matching, it is preferable to adopt a configuration in which the initial model is updated when the face orientation changes greatly.

図１２は初期モデルの更新方法を示すフローチャートである。 FIG. 12 is a flowchart showing an initial model update method.

図１２に示すように、ロックオン判定の処理によって、両眼がロックオンされると（ステップＳ６０２）、このロックオンされた際の初期モデルが記録される（ステップＳ６０４）。この初期モデルは、運転者の顔が正面前方を向いているときの初期正面テンプレートである。 As shown in FIG. 12, when both eyes are locked on by the lock-on determination process (step S602), an initial model when the lock-on is performed is recorded (step S604). This initial model is an initial front template when the driver's face is facing forward.

追跡開始時は、この初期モデル（初期正面テンプレート）を用いて、追跡処理が行われる（ステップＳ６０８）。そして、図１１で説明した方法により、現在追跡中の顔の向き（顔角度）を算出する（ステップＳ６１０）。そして、この求めた顔向き角度を基に初期モデルに対して角度に応じた変形処理を施し、顔角度に対応したテンプレート（「角度テンプレート」という。図１２中「角度ＴＰ」と表記）を作成する（ステップＳ６１２）。 At the start of tracking, tracking processing is performed using the initial model (initial front template) (step S608). Then, the orientation (face angle) of the currently tracked face is calculated by the method described in FIG. 11 (step S610). Then, based on the obtained face orientation angle, the initial model is subjected to deformation processing according to the angle, and a template corresponding to the face angle (referred to as “angle template”, expressed as “angle TP” in FIG. 12) is created. (Step S612).

角度テンプレートは次のようにして作成される（ステップＳ６１２）。 The angle template is created as follows (step S612).

まず、３次元の被写体である人物（被判断者としての運転者）の眼を含む周辺区域を円柱形状と近似し、現在追跡中の眼のモデルサイズに応じて、円柱の半径を決定する。 First, a peripheral area including the eyes of a person who is a three-dimensional subject (driver as a person to be judged) is approximated to a cylindrical shape, and the radius of the cylinder is determined according to the model size of the eye currently being tracked.

次に、ステップＳ６１０で得られた顔向きの角度をパラメータとして、円柱回転変換を行う。 Next, cylindrical rotation conversion is performed using the face orientation angle obtained in step S610 as a parameter.

最後に、円柱回転後の画像を射影変換し（平面に正射影）、現在の顔向き角度に相当する角度テンプレートを生成する。角度テンプレートは「角度モデル」、「回転モデル」と言う場合がある。 Finally, projective transformation is performed on the image after the rotation of the cylinder (orthographic projection on a plane), and an angle template corresponding to the current face orientation angle is generated. An angle template may be referred to as an “angle model” or a “rotation model”.
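The three steps above (cylinder approximation, rotation by the face angle, orthographic projection onto a plane) can be sketched as an inverse mapping over template columns. This is an illustrative simplification, not the patented procedure: the template is a list of pixel rows, the cylinder axis is vertical through the template center, sampling is nearest-neighbor, uncovered pixels are left at 0, and the sign convention of the angle and the choice of `radius` are assumptions.

```python
import math


def make_angle_template(template, angle_deg, radius):
    """Rotate a frontal template on a cylinder and project it back to a plane."""
    height = len(template)
    width = len(template[0])
    center = (width - 1) / 2.0
    theta = math.radians(angle_deg)
    out = [[0] * width for _ in range(height)]
    for u_out in range(width):
        x_out = u_out - center
        if abs(x_out) > radius:
            continue  # outside the silhouette of the cylinder
        # Surface angle seen at this output column, moved back to the frontal pose.
        phi_src = math.asin(x_out / radius) - theta
        if abs(phi_src) > math.pi / 2:
            continue  # this part of the surface was not visible in the frontal view
        # Orthographic projection of the source surface point onto the image plane.
        u_src = int(round(radius * math.sin(phi_src) + center))
        if 0 <= u_src < width:
            for v in range(height):
                out[v][u_out] = template[v][u_src]
    return out


front = [[10, 20, 30, 40, 50],
         [11, 21, 31, 41, 51]]
print(make_angle_template(front, 30, radius=4))
```

With an angle of 0 the mapping returns the frontal template unchanged. A production version would interpolate between source columns rather than pick the nearest one.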

図１３に、初期正面テンプレートから、ある顔向き角度に対応した角度テンプレートを生成する方法の概念図を示した。図示のように、初期正面モデルから、円柱回転変換及び射影変換を行うことにより、顔向き角度に対応した初期モデルが生成される。 FIG. 13 shows a conceptual diagram of a method for generating an angle template corresponding to a certain face orientation angle from the initial front template. As shown in the figure, an initial model corresponding to the face orientation angle is generated by performing cylindrical rotation conversion and projective conversion from the initial front model.

こうして得られた初期モデルを新たな初期モデルとして記憶し、初期モデルを更新する。 The initial model thus obtained is stored as a new initial model, and the initial model is updated.

次のフレームに対する追跡の処理（ステップＳ６１８）に際しては、ステップＳ６１２により更新された初期モデル（この場合、角度テンプレート）が適用される。 In the tracking process for the next frame (step S618), the initial model (in this case, the angle template) updated in step S612 is applied.

このように、追跡中に顔向き角度を測定し、その測定した角度に対して適応的に初期モデルを回転変形させて、適切な初期モデルに修正（更新）していくことにより、テンプレートマッチングの精度向上を達成できる。 In this way, the face orientation angle is measured during tracking, and the initial model is adaptively rotated and deformed with respect to the measured angle, and corrected (updated) to an appropriate initial model. Improved accuracy can be achieved.

＜初期モデルの更新方法に関する他の例について＞ <Another example of how to update the initial model>
図１２で説明したフローに代えて、初期正面テンプレートを生成した際に、予め（事前に）複数の顔向き角度に対応した角度テンプレート（回転モデル）を複数作成しておき、これら複数の角度テンプレートを初期テンプレートの変更候補として記憶保存しておく態様も好ましい。 Instead of the flow described in FIG. 12, it is also preferable to create, at the time the initial front template is generated, a plurality of angle templates (rotation models) corresponding to a plurality of face orientation angles in advance, and to store them as candidates for replacing the initial template.

例えば、正面前方向きを基準（回転角＝0度）として、±30度、±50度の顔向き角度にそれぞれ対応した４種類の角度テンプレートを作成しておく。この場合、角度推定（ステップＳ６１０、「角度推定工程」に相当）で求めた角度に最も近いテンプレート候補を選択して、次フレームに適用する初期モデルとする（「新たな初期モデルとして記憶する工程」に相当）。 For example, with the front-facing direction as the reference (rotation angle = 0 degrees), four angle templates are created, corresponding to face orientation angles of ±30 degrees and ±50 degrees. In this case, the template candidate closest to the angle obtained by angle estimation (step S610, corresponding to the "angle estimation step") is selected and used as the initial model applied to the next frame (corresponding to the "step of storing as a new initial model").
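Selecting the nearest precomputed candidate is a one-line lookup, sketched below. The ±30 and ±50 degree candidates come from the text. Keeping the initial front template as a 0 degree candidate, the dictionary layout, and the function name are assumptions.

```python
def select_angle_template(angle_deg, templates):
    """Return (candidate_angle, template) for the candidate closest to angle_deg.

    templates maps a face orientation angle in degrees to a stored template.
    """
    best_angle = min(templates, key=lambda a: abs(a - angle_deg))
    return best_angle, templates[best_angle]


# Placeholder strings stand in for the stored template images.
candidates = {-50: "L50", -30: "L30", 0: "front", 30: "R30", 50: "R50"}
print(select_angle_template(41, candidates))
```

Compared with regenerating a template every frame, the per-frame cost here is only a comparison against a handful of stored angles.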

このような方法によれば、顔向き角度に対応した適切な初期モデルの更新が可能であり、テンプレートマッチングの精度向上を達成できる。また、図１２で説明した方法と比較して、演算処理時間を短縮することができる。 According to such a method, an appropriate initial model corresponding to the face orientation angle can be updated, and the accuracy of template matching can be improved. In addition, the calculation processing time can be shortened compared to the method described with reference to FIG.

なお、事前に作成しておく角度テンプレートについて、上記４種類に限らず、適宜の角度のものを作成しておくことができる。更に、上記例示の４種類の角度だけでなく、更に細かく多数の角度位置に対応する角度テンプレートを作成しておいてもよい。演算時間と検出精度を考慮して、適当な種類数と角度位置の回転モデルを作成すればよい。 Note that the angle templates that are created in advance are not limited to the above four types, and those with appropriate angles can be created. Furthermore, not only the four types of angles illustrated above but also angle templates corresponding to a large number of angle positions may be created. In consideration of calculation time and detection accuracy, a rotation model with an appropriate number of types and angular positions may be created.

＜画像処理部２０の詳細構成＞ <Detailed configuration of the image processing unit 20>
図２〜図１３で説明した処理機能を実現する画像処理部２０の構成例について説明する。図１４は、画像処理部２０の構成例を示したブロック図である。画像処理部２０は、画像入力部２０２と、ロックオン処理部２０４と、追跡処理部２０６と、開閉判断部２０８と、居眠り判断部２１０と、脇見判断部２１２と、初期モデル変更部２１４とを備えている。 A configuration example of the image processing unit 20 that realizes the processing functions described with reference to FIGS. 2 to 13 will now be described. FIG. 14 is a block diagram showing a configuration example of the image processing unit 20. The image processing unit 20 includes an image input unit 202, a lock-on processing unit 204, a tracking processing unit 206, an open / close determination unit 208, a dozing determination unit 210, a looking-aside determination unit 212, and an initial model change unit 214.

画像入力部２０２は、カメラ１２から撮像画像のデータを取得するための画像入力インターフェース部である。具体的には、通信インターフェース部、データ入力端子、メディアインターフェース部などの態様があり得る。 The image input unit 202 is an image input interface unit for acquiring captured image data from the camera 12. Specifically, there may be aspects such as a communication interface unit, a data input terminal, and a media interface unit.

ロックオン処理部２０４は、両眼判別器２２０、単眼判別器２２４、初期モデル生成部２２６、初期モデル記憶部２２８を備える。ロックオン処理部２０４は、図３及び図４で説明したロックオン処理を行う。また、初期モデル生成時に、その生成した初期モデル（初期正面テンプレート）から予め複数の角度テンプレートを作成しておく場合、ロックオン処理部２０４には、角度テンプレート生成部２３０と角度テンプレート記憶部２３２とを備える。角度テンプレート生成部２３０は、図１３で説明した変換処理と同様の処理を実現するための円柱回転変換部２３６と射影変換部２３８とを含む。 The lock-on processing unit 204 includes a binocular discriminator 220, a monocular discriminator 224, an initial model generation unit 226, and an initial model storage unit 228. The lock-on processing unit 204 performs the lock-on process described with reference to FIGS. 3 and 4. When a plurality of angle templates are to be created in advance from the generated initial model (initial front template) at the time the initial model is generated, the lock-on processing unit 204 further includes an angle template generation unit 230 and an angle template storage unit 232. The angle template generation unit 230 includes a cylindrical rotation conversion unit 236 and a projective conversion unit 238 for realizing processing similar to the conversion processing described with reference to FIG. 13.

追跡処理部２０６は、初期モデルを用いたテンプレートマッチングの処理を行う第１のマッチング処理部２５０と、第１の閾値を記憶しておく記憶部２５２と、初期モデルを用いたマッチングのスコアと第１の閾値を比較するスコア比較部２５４（「比較部」に相当）と、更新モデルを用いたテンプレートマッチングの処理を行う第２のマッチング処理部２５６と、二つのマッチング処理部（２５０、２５６）のいずれか一方のマッチング処理の結果を採用して追跡位置を決定する追跡位置決定部２５８と、マッチング処理の結果から追跡不良を判定する追跡不良判定部２６０と、決定された追跡位置から更新モデルを生成する更新モデル生成部２６２と、生成した最新の更新モデルを記憶する記憶部２６４と、開閉判断部２０８の開き目検出の結果を利用して追跡位置の修正を行う追跡位置修正部２６６と、を備える。追跡処理部２０６は図５で説明した眼追跡の処理を行う。また、追跡不良判定部２６０は図６及び図７で説明した眼追跡の停止／復帰の判定処理を行う。 The tracking processing unit 206 includes: a first matching processing unit 250 that performs template matching using the initial model; a storage unit 252 that stores a first threshold; a score comparison unit 254 (corresponding to a "comparison unit") that compares the score of the matching using the initial model with the first threshold; a second matching processing unit 256 that performs template matching using the update model; a tracking position determination unit 258 that determines the tracking position by adopting the result of one of the two matching processing units (250, 256); a tracking failure determination unit 260 that determines tracking failure from the result of the matching process; an update model generation unit 262 that generates an update model from the determined tracking position; a storage unit 264 that stores the latest generated update model; and a tracking position correction unit 266 that corrects the tracking position using the open-eye detection result of the open / close determination unit 208. The tracking processing unit 206 performs the eye tracking process described with reference to FIG. 5. The tracking failure determination unit 260 performs the eye tracking stop / return determination process described with reference to FIGS. 6 and 7.

なお、第１のマッチング処理部２５０、スコア比較部２５４、第２のマッチング処理部２５６、追跡位置決定部２５８の組み合わせが「オブジェクト探索部」に相当する。 A combination of the first matching processing unit 250, the score comparison unit 254, the second matching processing unit 256, and the tracking position determination unit 258 corresponds to an “object search unit”.
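The decision flow of the "object search unit" (units 250, 254, 256, and 258) can be sketched as follows, based on the two-stage matching summarized in the abstract. The matcher itself is passed in as a hypothetical `match(frame, model)` callable returning a position and a score, because the text does not tie the method to a particular matching routine. The returned label is added only for illustration.

```python
def track_position(frame, initial_model, update_model, first_threshold, match):
    """Two-stage template matching for one frame.

    Returns (position, score, which_model). `match` is a caller-supplied
    function (an assumption) that returns (position, score).
    """
    # First matching: initial model (first matching processing unit 250).
    position, score = match(frame, initial_model)
    if score >= first_threshold:
        # High similarity: adopt the initial-model result as the tracking position.
        return position, score, "initial"
    # Otherwise fall back to the latest update model (second matching unit 256).
    position, score = match(frame, update_model)
    return position, score, "update"


def fake_match(frame, model):
    """Stand-in matcher with fixed results, used only to exercise the flow."""
    return {"init": ((10, 10), 0.6), "upd": ((12, 11), 0.9)}[model]


print(track_position("frame", "init", "upd", 0.8, fake_match))
```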

開閉判断部２０８は、顔向きの角度を算出する角度推定部２８０と、算出（推定）された顔角度に応じて開閉判断のための閾値（第４の閾値）を決定する閾値決定部２８２と、決定された閾値を用いて開き目を検出する開き目（単眼）判別器２８４と、直近の数フレーム分の判別結果を統合して開閉を判断する時間軸統合部２８６と、左右の各単眼について行われた単眼開閉判断の結果を両眼合わせて統合して判断を行う両眼統合部２８８と、を備える。開閉判断部２０８は、図８で説明した開閉判断の処理を行う。 The opening / closing determination unit 208 includes an angle estimation unit 280 that calculates an angle of face orientation, a threshold determination unit 282 that determines a threshold (fourth threshold) for opening / closing determination according to the calculated (estimated) face angle, , An open eye (monocular) discriminator 284 that detects an open eye using the determined threshold, a time axis integration unit 286 that integrates discrimination results for the last several frames and determines opening and closing, and left and right monoculars A binocular integration unit 288 that performs determination by integrating the results of the single eye opening / closing determination performed for both eyes together. The opening / closing determination unit 208 performs the opening / closing determination process described with reference to FIG.

両眼統合部２８８から出力される開閉判断結果を基に、居眠り判断部２１０にてドライバーが居眠り状態にあるか否かの判断が行われる。 Based on the opening / closing determination result output from the binocular integration unit 288, the dozing determination unit 210 determines whether the driver is in the dozing state.

また、角度推定部２８０によって推定された角度の情報を基に、脇見判断部２１２にてドライバーが脇見運転の状態にあるか否かの判断が行われる。 Further, based on the angle information estimated by the angle estimation unit 280, the looking-aside determination unit 212 determines whether or not the driver is looking aside while driving.

さらに、角度推定部２８０で推定された顔角度の情報は初期モデル変更部２１４に送られる。初期モデル変更部２１４は、顔角度の情報を基に、角度テンプレート記憶部２３２の中から顔角度に最も適した角度テンプレートを自動的に選択し、初期テンプレートを変更する。 Further, the face angle information estimated by the angle estimation unit 280 is sent to the initial model change unit 214. The initial model changing unit 214 automatically selects an angle template most suitable for the face angle from the angle template storage unit 232 based on the face angle information, and changes the initial template.

なお、居眠り判断部２１０や脇見判断部２１２によって居眠り状態や脇見状態であることが検知された場合に、運転者に対して適宜の警告手段（不図示）によって警告を発する構成とすることができる。警告手段としては、音声による警告手段（音声出力手段）、表示による警告手段（警告表示手段）、ハンドルを振動させるなどの運転者に刺激を与える警告手段（刺激付与手段）など様々な形態が可能であり、これらの適宜の組み合わせであってもよい。また、警告手段に代えて、又はこれと組み合わせて、居眠り判断部２１０や脇見判断部２１２の判断結果を基に、危険回避のためのドライビングアシスト制御（例えば、自動減速など、自動的な運転補助制御）を行う態様も可能である。 When the dozing determination unit 210 or the looking-aside determination unit 212 detects a dozing state or a looking-aside state, the configuration may issue a warning to the driver through appropriate warning means (not shown). Various forms of warning means are possible, such as warning by sound (sound output means), warning by display (warning display means), and warning that gives the driver a stimulus, for example by vibrating the steering wheel (stimulus applying means). An appropriate combination of these may also be used. Instead of or in combination with the warning means, driving assist control for avoiding danger (for example, automatic driving assistance such as automatic deceleration) may be performed based on the determination result of the dozing determination unit 210 or the looking-aside determination unit 212.

本実施形態の画像処理部２０を構成する各部の機能は、集積回路などのハードウエア、又は、ＣＰＵ（中央演算処理装置）などを動作させるソフトウェア（プログラム）、或いはこれらの適宜の組み合わせによって実現することができる。 The function of each unit constituting the image processing unit 20 of the present embodiment is realized by hardware such as an integrated circuit, software (program) for operating a CPU (Central Processing Unit), or an appropriate combination thereof. be able to.

The functions of the units constituting the image processing unit 20 can also be realized by a computer. That is, each step of the image processing method described with reference to FIGS. 2 to 13 can be executed by a computer. A program for causing a computer to realize the image processing functions described in the present embodiment may be installed in the computer in advance, or it may be provided on a magnetic disk, an optical disk, a magneto-optical disk, a memory card, or another computer-readable medium (information storage medium) that stores the program. Instead of storing the program on such a tangible storage medium, the program signal can also be provided as a download service over a communication network such as the Internet.

In addition, part or all of the image processing functions described in the present embodiment can be realized by cloud computing.

<How to set / update the initial model>
In object tracking such as eye tracking, conventional template matching techniques have the following problems.

(1) How should the first template be set?

(2) It is difficult to handle changes in the size of the tracking target (for example, due to changes in distance) and changes in its angle (for example, due to horizontal rotation). As a result, tracking may not be able to continue, or a tracking position error may occur.

To address these problems, the present embodiment performs lock-on processing using object discriminators (in this example, the binocular discriminator 220 and the monocular discriminator 224) to construct a reference template (initial model). When the size (distance) or angle (rotation) of the object changes, lock-on can be performed again to recreate the reference template (initial model). For example, the initial model may be changed when, under the condition that the object is within the forward-facing range, the matching correlation (score) with the initial template (front-facing) is lower than a predetermined threshold.

Furthermore, according to the present embodiment, to handle changes in the angle of the object (variation in the rotation direction), a deformation process combining cylindrical rotation transformation and projective transformation is applied to the first reference template taken at the reference angular position (for example, front-facing), so that a reference template (rotation model) corresponding to the changed angle of the object can be constructed.
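As a rough illustration of the cylindrical rotation part of this deformation, the following Python sketch maps a horizontal coordinate on a front-facing template to its position after the object, approximated as a vertical cylinder, turns by a given angle. The orthographic viewing model, the cylinder radius parameter, and the function name `rotate_x_on_cylinder` are assumptions made for illustration only. The projective transformation that the embodiment combines with the rotation is not modeled here.

```python
import math


def rotate_x_on_cylinder(x, radius, angle_deg):
    """Map a horizontal offset x (pixels from the cylinder axis) on a
    front-facing template to its offset after a rotation of angle_deg.

    Assumes an orthographic view of a vertical cylinder (hypothetical model).
    Returns None when the point rotates out of view.
    """
    if abs(x) > radius:
        raise ValueError("x must lie on the visible cylinder surface")
    theta = math.asin(x / radius)            # angular position of the point
    rotated = theta + math.radians(angle_deg)
    if abs(rotated) > math.pi / 2:           # point is now on the hidden side
        return None
    return radius * math.sin(rotated)
```

Applying such a mapping to every column of the front-facing template would give one possible rotated template; points near the edge of the cylinder leave the visible range first, which is why the function can return `None`.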

This makes it possible to set an initial model that is robust to changes in the size (distance) and angle (rotation) of the object. The initial model can also be modified or updated adaptively in response to changes in the distance (size) or angle of the tracked object.

When the front-facing initial model is generated, rotation models (angle templates) corresponding to a plurality of rotation angles (for example, four angles of ±30 degrees and ±50 degrees relative to the front-facing direction) can be created in advance at the same time. By storing this group of angle templates, the angle template closest to the estimated angle of the object can be selected and used to change the initial template.
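The selection of the closest angle template can be sketched in Python as follows. The dictionary layout, the string placeholders standing in for template images, the inclusion of the 0-degree front-facing model in the same store, and the function name `select_angle_template` are all assumptions for illustration.

```python
def select_angle_template(estimated_angle_deg, templates):
    """Return (angle, template) for the stored angle closest to the estimate.

    `templates` maps a rotation angle in degrees to a template object.
    """
    if not templates:
        raise ValueError("no templates stored")
    best_angle = min(templates, key=lambda a: abs(a - estimated_angle_deg))
    return best_angle, templates[best_angle]


# Hypothetical store: the front-facing model plus the four example angles.
angle_templates = {
    0: "front",
    -30: "minus30",
    30: "plus30",
    -50: "minus50",
    50: "plus50",
}
```

With this store, an estimated angle of 42 degrees selects the +50 degree template, and an estimate of -10 degrees keeps the front-facing model.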

In other words, the system constantly checks whether the initial template currently in use is still adequate, and changes (updates) the initial template when the matching correlation (score) falls below a specified value. Specifically, the lock-on processing is performed again to recreate the initial template.
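A minimal Python sketch of this check is shown below. It assumes the condition mentioned earlier in this section, namely that the template is rebuilt only while the object is within the forward-facing range. The names `update_initial_template`, `relock_threshold`, and `relock` are hypothetical, and `relock` stands in for the lock-on processing performed with the object discriminators.

```python
def update_initial_template(template, score, relock_threshold,
                            facing_forward, relock):
    """Return (template, relocked).

    The initial template is rebuilt by calling `relock()` only when the
    object is within the forward-facing range and the matching score
    against the current initial template is below `relock_threshold`.
    """
    if facing_forward and score < relock_threshold:
        return relock(), True
    return template, False
```

Passing the lock-on routine in as a callable keeps the decision logic separate from the discriminator-based detection, which is outside the scope of this sketch.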

Besides the aspect in which the initial template is recreated adaptively in response to size and angle changes, another aspect is possible: when the initial template is created, a plurality of templates corresponding to size changes and angle changes are generated in advance from the front-facing reference initial template, and during tracking an appropriate template is selected from among them to update the initial template actually in use.

<Advantages of this embodiment>
(1) Conventional template matching techniques cannot handle movement of the face toward or away from the camera (size change) or changes in the face orientation angle, and position errors may accumulate. The present embodiment can handle both. In addition, because the result of the initial model is used preferentially, the update model is used alongside it, and the initial model is corrected (re-lock-on) when tracking fails, the problem of accumulating position errors is also resolved.

(2) When tracking both eyes of a driver, the basic driving posture (normal state) is to look ahead in the direction of travel. The face orientation when the driver looks straight ahead is therefore taken as the reference "front", and the initial model is created from this front-facing face image. When tracking with this initial model fails, the initial model is recreated, so position errors that arise during tracking can be corrected.

(3) Matching with the initial model is given priority, and when its matching score falls below the first threshold, matching with the update model (the most recent model) is used as well. Tracking can therefore cope with changes in face orientation and size (distance).

(4) The cumulative error that can conventionally arise from continuing to use an update model is eliminated in the present embodiment because the initial model is used again, for example when the face returns to the front. By using multiple templates (the initial model and the update model) with priority on the initial model, accumulated error is corrected at a certain frequency and tracking accuracy improves.

(5) The present embodiment includes an algorithm that accumulates the tracking results of a plurality of frames and decides from the trajectory information (motion prediction) whether to stop tracking. When tracking is stopped, processing switches to the lock-on determination mode and the initial model is recreated. The initial model is thus updated to an appropriate one, enabling highly accurate tracking.

(6) The tracking position is corrected using the eye position information detected by the open-eye (monocular) discriminator used for the open/closed determination, which further improves tracking accuracy.

(7) The present embodiment uses the binocular discriminator 220 and the monocular discriminator 224 and relies only on eye information among the facial parts for lock-on, tracking, and the open/closed determination. The determination is therefore possible even when the driver is wearing a mask.

By contrast, conventional face tracking methods use information on multiple facial parts such as the eyes, mouth, and nose, so the tracking target cannot be detected when some facial parts are hidden, for example because the person is wearing a mask. Processing therefore waits until the mask is removed, and the determination is suspended in the meantime. Moreover, conventional dozing detection techniques (for example, Japanese Patent Application Laid-Open No. 2010-97379) determine whether the eyes are open or closed from general image features extracted from the face image (for example, edges indicating contours), so their determination accuracy is low.

According to the present embodiment, an open-eye (monocular) discriminator constructed by machine learning is used, so the determination is possible even when a mask is worn, and determination accuracy is improved compared with conventional methods.
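The priority given to the initial model in advantages (3) and (4) above can be sketched in Python as follows. This is a toy illustration under stated assumptions: frames and templates are one-dimensional lists of intensities rather than images, the similarity score is defined here as 1 / (1 + sum of absolute differences), and the function names `match_template` and `track_frame` are hypothetical.

```python
def match_template(signal, template):
    """Slide `template` over a 1-D `signal`; return (best_position, score).

    The score is 1 / (1 + sum of absolute differences), so 1.0 means a
    perfect match. This toy matcher stands in for 2-D template matching.
    """
    best_pos, best_score = 0, -1.0
    for pos in range(len(signal) - len(template) + 1):
        sad = sum(abs(s - t) for s, t in zip(signal[pos:], template))
        score = 1.0 / (1.0 + sad)
        if score > best_score:
            best_pos, best_score = pos, score
    return best_pos, best_score


def track_frame(frame, initial_model, update_model, first_threshold):
    """Return (tracking_position, model_used) for one frame."""
    pos, first_score = match_template(frame, initial_model)
    if first_score >= first_threshold:
        return pos, "initial"      # second matching is skipped
    pos, _second_score = match_template(frame, update_model)
    return pos, "update"           # fall back to the latest update model
```

Because the initial model is tried first on every frame, its result replaces the update model's result whenever the appearance returns close to the front-facing state, which is how drift from repeated updates gets a chance to be corrected.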

<Modification 1>
In the above embodiment, an update model is created for every frame during tracking and is always replaced with the latest one. In implementing the invention, however, it is not required to create or update the update model for every frame. The update model may be updated at appropriate frame intervals, or a single update model may be created from the information of a plurality of frames.
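Both variations can be sketched in Python as follows. The text does not specify how information from several frames is combined, so the element-wise averaging used here is only one assumed possibility, and the names `should_update` and `average_patches` are hypothetical. Patches are one-dimensional lists for brevity.

```python
def should_update(frame_index, interval):
    """True on every `interval`-th frame (interval=1 updates every frame)."""
    return frame_index % interval == 0


def average_patches(patches):
    """Build one update model as the element-wise mean of equally sized
    1-D patches taken from several frames (an assumed combination rule)."""
    if not patches:
        raise ValueError("at least one patch is required")
    count = len(patches)
    return [sum(values) / count for values in zip(*patches)]
```

A longer interval lowers the processing load per frame, while averaging over frames smooths out noise in any single frame; the appropriate choice depends on how quickly the appearance of the object changes.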

<Modification 2>
Although the above embodiment illustrates a single camera 12, a system including a plurality of cameras is also possible.

<Modification 3>
The scope of application of the present invention is not limited to driver monitoring systems; it can be applied to various other uses. There is no particular restriction on the use or the tracking target, and the invention can be used as a general object tracking technique.

<Modification 4>
Scores representing the similarity (degree of matching, matching correlation) obtained by template matching, and scores produced by the object discriminators, are usually defined so that a larger value indicates higher similarity. Depending on how the index (evaluation value) is defined, however, a smaller value may indicate higher similarity. Therefore, when a score is compared with a threshold to judge similarity, what matters is the actual level of similarity that each value represents.
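One way to keep this distinction explicit in an implementation is to route every threshold comparison through a single helper, as in the following Python sketch. The function name `is_similar_enough` and the `higher_is_better` flag are assumptions for illustration.

```python
def is_similar_enough(score, threshold, higher_is_better=True):
    """Compare a score with a threshold by similarity, not raw magnitude.

    With a correlation-like score, larger means more similar.
    With a distance-like score, smaller means more similar.
    """
    if higher_is_better:
        return score >= threshold
    return score <= threshold
```

For example, a correlation score of 0.8 against a threshold of 0.6 and a distance score of 12.0 against a threshold of 20.0 (with `higher_is_better=False`) both count as high similarity.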

The constituent elements of the embodiments of the present invention described above can be changed, added, or deleted as appropriate without departing from the spirit of the invention. The present invention is not limited to the embodiments described above, and many modifications can be made by a person having ordinary knowledge in the field within the technical idea of the invention.

DESCRIPTION OF SYMBOLS: 10 ... surveillance video system, 12 ... camera, 16 ... control device, 20 ... image processing unit, 204 ... lock-on processing unit, 206 ... tracking processing unit, 214 ... initial model changing unit, 220 ... binocular discriminator, 222 ... monocular discriminator, 226 ... initial model generation unit, 228 ... initial model storage unit, 230 ... angle template generation unit, 232 ... angle template storage unit, 250 ... first matching processing unit, 254 ... score comparison unit, 256 ... second matching processing unit, 258 ... tracking position determination unit, 262 ... update model generation unit, 264 ... update model storage unit, 280 ... angle estimation unit, 282 ... threshold determination unit, 284 ... open-eye discriminator

Claims

An object tracking method for detecting a specific object from captured images acquired in time series and tracking its position, the method comprising:
an initial model generation step of identifying an image portion of the object from an input captured image and generating an initial model indicating an image feature of the object;
an object search step of performing template matching processing using at least the initial model on a captured image input after generation of the initial model, and searching for the position of the object in the captured image; and
an update model generation step of generating an update model from the image portion of the object at the tracking position specified by the object search step, and storing the latest update model,
wherein the object search step includes:
a first matching processing step of performing a first template matching process using the initial model and calculating a first score indicating a similarity to the initial model;
a comparison step of comparing the first score obtained by the first matching processing step with a first threshold;
a second matching processing step of performing a second template matching process using the stored latest update model and calculating a second score indicating a similarity to the update model; and
a tracking position determination step of adopting, as the tracking position, the matching position found by the first matching processing step when the comparison in the comparison step shows that the first score indicates a high similarity equal to or higher than the first threshold, and adopting, as the tracking position, the matching position found by the second matching processing step when the first score indicates a similarity lower than the first threshold.

The object tracking method according to claim 1, wherein, after the initial model is generated, the first matching processing step is performed preferentially on each sequentially input captured image, and the second matching processing step is omitted when the comparison in the comparison step shows that the first score indicates a high similarity equal to or higher than the first threshold.

The object tracking method according to claim 1, further comprising a tracking stop determination step of determining whether to stop or continue tracking based on a tracking result of the object search step.

The object tracking method according to claim 3, wherein the tracking stop determination step determines whether to stop tracking based on at least one of the tracking position found by the object search step, the first score, the second score, and the size of the object.

The object tracking method according to claim 3 or 4, wherein when the tracking is stopped based on the determination in the tracking stop determination step, the initial model is recreated.

The object tracking method according to claim 1, further comprising a step of storing, as a history, tracking result information about a plurality of frames of captured images that are continuous in time series.

The object tracking method according to claim 1, comprising:
an angle estimation step of calculating a horizontal angle indicating the horizontal rotation angle of the object from the tracking position determined by the tracking position determination step; and
an initial model update step of changing the initial model according to the angle calculated in the angle estimation step.

The object tracking method according to claim 7, further comprising:
a step of creating in advance, based on the initial model generated in the initial model generation step, a plurality of angle correspondence models corresponding to a plurality of rotation angles; and
a step of selecting one angle correspondence model from the plurality of angle correspondence models based on the angle calculated in the angle estimation step, and storing the selected angle correspondence model as a new initial model in place of the initial model.

The object tracking method according to claim 1, wherein the initial model generation step generates the initial model by detecting an image portion of the object from the captured image using an object discriminator constructed by machine learning from a large number of images.

An object tracking device that detects a specific object from captured images acquired in time series and tracks its position, the device comprising:
an initial model generation unit that identifies an image portion of the object from an input captured image and generates an initial model indicating an image feature of the object;
an object search unit that performs template matching processing using at least the initial model on a captured image input after generation of the initial model, and searches for the position of the object in the captured image; and
an update model generation unit that generates an update model from the image portion of the object at the tracking position specified by the object search unit, and stores the latest update model,
wherein the object search unit includes:
a first matching processing unit that performs a first template matching process using the initial model and calculates a first score indicating a similarity to the initial model;
a comparison unit that compares the first score obtained by the first matching processing unit with a first threshold;
a second matching processing unit that performs a second template matching process using the stored latest update model and calculates a second score indicating a similarity to the update model; and
a tracking position determination unit that adopts, as the tracking position, the matching position found by the first matching processing unit when the comparison by the comparison unit shows that the first score indicates a high similarity equal to or higher than the first threshold, and adopts, as the tracking position, the matching position found by the second matching processing unit when the first score indicates a similarity lower than the first threshold.

A program for causing a computer to execute each step of the object tracking method according to any one of claims 1 to 9.