JP2004334786A - State detection device and state detection system

Info

Publication number: JP2004334786A
Application number: JP2003133469A
Authority: JP
Inventors: Haruo Matsuo; 治夫松尾; Masayuki Kaneda; 雅之金田; Kinya Iwamoto; 欣也岩本
Original assignee: Nissan Motor Co Ltd
Current assignee: Nissan Motor Co Ltd
Priority date: 2003-05-12
Filing date: 2003-05-12
Publication date: 2004-11-25
Anticipated expiration: 2023-05-12
Also published as: JP4325271B2

Abstract

PROBLEM TO BE SOLVED: To provide a state detection device that improves cost and versatility.

SOLUTION: An image processing unit 22 calculates the optical flow between captured images, which are obtained by imaging, in time series, the position where the driver's body is located when the driver is seated. A state detection unit 24 then detects, from the calculated optical flow, at least one of three driver states as a detection target: the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, and the presence or absence of the driver. The state detection unit 24 detects the detection target without specifying the position of the driver's body in the captured images.

COPYRIGHT: (C)2005, JPO & NCIPI

Description

【０００１】
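【Technical Field of the Invention】
The present invention relates to a state detection device and a state detection system.
【０００２】
【Prior Art】
State detection devices are known that image a driver's body with imaging means and detect the state of the driver or the like from the resulting images.
【０００３】
One such device detects the coordinates of the eyes in the image from eye features, detects the opening and closing of the eyelids from the change in the vertical width of the eyes, and detects dozing by estimating the driver's alertness from the duration and frequency of eye closure (see, for example, Patent Document 1).
【０００４】
Another known device uses difference images of successively captured face images to estimate the direction of the driver's face and detect looking aside, or to detect a decline in the driver's consciousness from a state in which the face moves little (see, for example, Patent Document 2).
【０００５】
Another known device predicts the region the driver should look at from the vehicle's driving environment, detects the driver's gaze region, and determines whether the driver has looked at the predicted region (see, for example, Patent Document 3). This device also notifies the driver when the driver has not looked at that region.
【０００６】
Another known device uses difference images of successively captured images of the driver's body to detect the driver's riding posture (see, for example, Patent Document 4).
【０００７】
Another known device uses a facial expression transition map prepared in advance, reads expression transitions from successive face images of the driver, and detects the driver's alertness level (see, for example, Patent Document 5).
【０００８】
Each of these conventional devices can thus detect a state of the driver or the like.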
【０００９】
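【Patent Document 1】
Japanese Unexamined Patent Application Publication No. H10-40361
【００１０】
【Patent Document 2】
Japanese Unexamined Patent Application Publication No. H11-161798
【００１１】
【Patent Document 3】
Japanese Unexamined Patent Application Publication No. 2002-83400
【００１２】
【Patent Document 4】
Japanese Unexamined Patent Application Publication No. 2000-113164
【００１３】
【Patent Document 5】
Japanese Unexamined Patent Application Publication No. 2001-43345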
【００１４】
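【Problems to Be Solved by the Invention】
In the above state detection devices, the method best suited to the state to be detected is selected, and the equipment that carries out that method is built into the device. Specifically, conventional state detection devices use a different image processing method for each state to be detected. As a result, such a device can often detect only one specific state, and detecting several states requires a device built with equipment for several image processing methods.
【００１５】
However, running several kinds of image processing in this way raises costs, and depending on the processing involved, simultaneous execution may become impossible.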
【００１６】
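【Means for Solving the Problems】
According to the present invention, image processing means obtains the optical flow between captured images, which are obtained by imaging in time series the position where the driver's body is located when the driver is seated. State detection means detects, from the optical flow obtained by the image processing means and without specifying the position of the driver's body in the captured images, at least one of three driver states as a detection target: the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, and the presence or absence of the driver.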
【００１７】
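【Effects of the Invention】
According to the present invention, a single image processing method based on optical flow is used whichever of the following is the detection target: the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, or the presence or absence of the driver.
【００１８】
Therefore, when a device is built to detect one of the three driver states and is later to be upgraded to detect another driver state, only the processing parts that are not shared need to be added. Compared with adding a device that performs entirely different processing, the upgrade costs less and is less likely to run into image processing that cannot be executed simultaneously.
【００１９】
When two or more of the three driver states are detected, the image processing method is shared. Compared with a device that implements different image processing for each state, costs are lower and a situation in which simultaneous execution is impossible is less likely to arise.
【００２０】
Cost and versatility can therefore both be improved.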
【００２１】
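【Embodiments of the Invention】
Preferred embodiments of the present invention are described below with reference to the drawings. The embodiments are described for the case where the state detection system is mounted on a vehicle. In the following description, the movement amount includes both movement speed and movement direction, and this movement amount is referred to as the optical flow.
【００２２】
FIG. 1 is a block diagram showing the configuration of a state detection system including the state detection device according to the first embodiment of the present invention. The first embodiment is described using a state detection system that detects, as a detection target, at least one of three driver states: the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, and the presence or absence of the driver.
【００２３】
As shown in the figure, the state detection system 1 of this embodiment detects the direction of the driver's face and the like, and comprises an imaging device (imaging means) 10, a state detection device 20, and a control device (control means) 30.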
【００２４】
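The imaging device 10 has an imaging range that includes the position where the driver's body is located when the driver is seated, and images this range in time series. Specifically, the imaging device 10 consists of at least one of the following: a CCD or CMOS camera for visible light, a camera that images in near-infrared light, and a camera that images the heat emitted by a person or the like in the far infrared.
【００２５】
The imaging device 10 is installed, for example, below and in front of the driver, acquires images that include the driver's head, and sends the captured image data to the state detection device 20 as a video signal Sa. When only the presence or absence of the driver is the detection target, the imaging device 10 may instead image the driver's torso or the like; in the following, however, it is assumed to image the driver's head.
【００２６】
The state detection device 20 executes predetermined processing on the captured image data from the imaging device 10 and detects at least one of the three driver states.
【００２７】
FIG. 2 is a block diagram showing the detailed configuration of the state detection device 20 shown in FIG. 1.
【００２８】
The state detection device 20 includes an image acquisition unit (image acquisition means) 21 that receives the video signal Sa, which is the captured image data from the imaging device 10. It also has an image processing unit (image processing means) 22 that processes the captured image data received by the image acquisition unit 21 and obtains the optical flow between captured images. The device further includes a motion detection unit (motion detection means) 23 that detects the driver's motion from the obtained optical flow, and a state detection unit 24 that detects at least one of the three driver states. Finally, it includes a state signal output unit (signal output means) 25 that converts the detection result from the state detection unit 24 into an electric signal Sb and outputs it externally.
【００２９】
The control device 30 performs predetermined processing, such as seat belt control, airbag control, or warning processing, based on the electric signal Sb from the state signal output unit 25.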
【００３０】
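Of the units 21 to 25, the image acquisition unit 21 and the image processing unit 22 perform the same processing whichever of the three driver states is detected. The basic operation of these shared units, and an outline of the operation of the motion detection unit 23, the state detection unit 24, and the state signal output unit 25, are described next with reference to FIGS. 3 to 8. FIG. 3 is a data flow diagram outlining the operation of the state detection device 20 according to this embodiment, and FIG. 4 is an explanatory diagram outlining the same operation.
【００３１】
First, the imaging device 10 captures an image including the driver's face (the image shown in FIG. 4(a)), and the image is input to the image acquisition unit 21 as the video signal Sa.
【００３２】
On receiving the video signal Sa from the imaging device 10, the image acquisition unit 21 converts it into two-dimensional digital data representing grayscale values, 320 pixels wide by 240 pixels high with 8 bits (256 gray levels) per pixel. It then stores the converted data in a storage area and outputs the stored captured image data to the image processing unit 22.
【００３３】
The image processing unit 22 obtains the optical flow between captured images from the captured image data supplied by the image acquisition unit 21 (FIG. 4(b)). In doing so, it receives region data and obtains the optical flow for each region (each calculation region) defined by the region data. It then sends the optical flow data for each region to the motion detection unit 23.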
【００３４】
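The regions and the region data are explained here. Region data is data indicating the position and size used to define a region in the captured image. One or more regions are set, based on the region data, on images acquired at different times; specifically, the regions are the reference regions and search regions described below.
【００３５】
FIG. 5 is an explanatory diagram of the reference region and the search region. Although the reference region and the search region are set on captured images taken at different times, FIG. 5 shows them, for convenience, on a single image w pixels wide and h pixels high.
【００３６】
As shown in the figure, the reference region is a region tw pixels wide and th pixels high centered on a specific point O. The search region is a region sw pixels wide and sh pixels high, also centered on point O. A search region is set around each reference region, so there are as many search regions as reference regions.
【００３７】
The two regions thus share the same center and are set so that sw > tw and sh > th. The reference and search regions here do not depend on the position of the driver's face or the like; they are set at predetermined positions with predetermined sizes.
【００３８】
The reference regions are preferably arranged regularly. FIG. 6 is an explanatory diagram of reference regions arranged regularly on the captured image. For example, as shown in FIG. 6(a), several reference regions (for example, seven) are arranged horizontally on the captured image. As shown in FIG. 6(b), several reference regions may be arranged in a grid (for example, 5 rows by 7 columns). Furthermore, as shown in FIG. 6(c), reference regions may be arranged both in a grid and horizontally (for example, 3 rows by 5 columns plus two more in the horizontal direction, 17 in total).
【００３９】
In addition, the reference regions are preferably fixed at about the size of a facial part such as an eye, the nose, or the mouth, determined from the camera position, the camera's angle of view, the proportion of the captured image occupied by the face, and so on.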
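The grid layout of FIG. 6(b) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `make_reference_regions`, the evenly spaced cell centers, and the sizes `tw = th = 16` and `sw = sh = 32` are assumptions chosen only to satisfy sw > tw and sh > th on a 320 x 240 image.

```python
def make_reference_regions(img_w=320, img_h=240, rows=5, cols=7,
                           tw=16, th=16, sw=32, sh=32):
    """Return one dict per reference region, laid out on a regular grid.

    tw/th are the reference-region size and sw/sh the search-region size
    (hypothetical values); both regions share the same centre point O.
    """
    if not (sw > tw and sh > th):
        raise ValueError("search region must be larger than reference region")
    regions = []
    for r in range(rows):
        for c in range(cols):
            # Centre of grid cell (r, c) in integer pixel coordinates.
            cx = (2 * c + 1) * img_w // (2 * cols)
            cy = (2 * r + 1) * img_h // (2 * rows)
            regions.append({"row": r + 1, "col": c + 1, "center": (cx, cy),
                            "ref_size": (tw, th), "search_size": (sw, sh)})
    return regions


regions = make_reference_regions()
```

Because the positions are fixed in advance, the same list can be reused for every frame, which is what lets the later steps run without locating the face.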
【００４０】
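The description returns to FIGS. 3 and 4. After the optical flow is calculated, the motion detection unit 23 obtains the driver's movement, that is, the actual motion pattern, from the optical flow of each region obtained by the image processing unit 22 (FIG. 4(c)). The motion detection unit 23 obtains an actual motion pattern for each region group and sends the resulting actual motion pattern data to the state detection unit 24.
【００４１】
A region group consists of at least one of the reference regions. Examples are described with reference to FIGS. 7 and 8, which are explanatory diagrams of region groups. FIGS. 7 and 8 assume that the reference regions are arranged on the captured image in a grid (5 rows by 7 columns).
【００４２】
First, as shown in FIG. 7, each of region groups A1 to I1 contains nine reference regions, covering 3 rows by 3 columns. Region group A1 contains the reference regions in rows 1 to 3, columns 1 to 3. Region groups B1 to D1 contain the reference regions in rows 1 to 3, columns 5 to 7; rows 3 to 5, columns 1 to 3; and rows 3 to 5, columns 5 to 7, respectively. Region groups E1 and F1 contain the reference regions in rows 1 to 3, columns 3 to 5, and rows 3 to 5, columns 3 to 5, respectively. Region groups G1 to I1 contain the reference regions in rows 2 to 4, columns 2 to 4; rows 2 to 4, columns 4 to 6; and rows 2 to 4, columns 3 to 5, respectively.
【００４３】
Alternatively, as shown in FIG. 8, each of region groups A2 to H2 may contain three to five reference regions. In this example, region group A2 contains the reference regions in row 1, columns 1 and 2, and row 2, column 1. Region groups B2 to D2 contain the reference regions in row 1, columns 6 and 7, and row 2, column 7; row 4, column 1, and row 5, columns 1 and 2; and row 4, column 7, and row 5, columns 6 and 7, respectively. Region groups E2 and F2 contain the reference regions in row 1, columns 2 to 6, and row 5, columns 2 to 6, respectively. Region groups G2 and H2 contain the reference regions in rows 2 to 4, column 1, and rows 2 to 4, column 7, respectively.
【００４４】
A region group is thus set on the captured image with a size that includes at least one reference region, and the motion detection unit 23 obtains an actual motion pattern for each region group.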
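The group memberships listed for FIGS. 7 and 8 can be written down as sets of 1-based (row, column) cells on the 5 x 7 grid. The sketch below only encodes the memberships stated in the text; the helper `block` and the dictionary names are illustrative assumptions.

```python
def block(rows, cols):
    """All (row, col) cells in the inclusive 1-based ranges given."""
    (r0, r1), (c0, c1) = rows, cols
    return {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}


# FIG. 7: nine overlapping 3 x 3 groups covering the whole image.
FIG7_GROUPS = {
    "A1": block((1, 3), (1, 3)), "B1": block((1, 3), (5, 7)),
    "C1": block((3, 5), (1, 3)), "D1": block((3, 5), (5, 7)),
    "E1": block((1, 3), (3, 5)), "F1": block((3, 5), (3, 5)),
    "G1": block((2, 4), (2, 4)), "H1": block((2, 4), (4, 6)),
    "I1": block((2, 4), (3, 5)),
}

# FIG. 8: eight groups of three to five regions along the image border.
FIG8_GROUPS = {
    "A2": block((1, 1), (1, 2)) | {(2, 1)},
    "B2": block((1, 1), (6, 7)) | {(2, 7)},
    "C2": {(4, 1)} | block((5, 5), (1, 2)),
    "D2": {(4, 7)} | block((5, 5), (6, 7)),
    "E2": block((1, 1), (2, 6)), "F2": block((5, 5), (2, 6)),
    "G2": block((2, 4), (1, 1)), "H2": block((2, 4), (7, 7)),
}
```

Written this way it is easy to check that the FIG. 7 groups together cover every cell, while the FIG. 8 groups leave the center of the image uncovered.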
【００４５】
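After the pattern is calculated, the state detection unit 24 detects at least one of the three driver states based on the actual motion pattern and stored motion patterns. Specifically, it calculates the correlation between the actual motion pattern and each of several stored motion patterns held in advance, and takes the stored motion pattern with the highest correlation as the detection result (FIG. 4(d)).
【００４６】
The stored motion patterns consist of feature quantities obtained in advance from actual driver movements and are stored in a storage unit (pattern storage means) 24a provided inside the state detection unit 24. The state detection unit 24 reads the stored motion patterns from the storage unit 24a and compares them with the obtained actual motion pattern. It then outputs the detection result obtained by the comparison to the state signal output unit 25.
【００４７】
The state signal output unit 25 converts the detection result from the state detection unit 24 into the electric signal Sb and outputs it externally. The control device 30, on receiving the electric signal Sb, performs various operations based on it.
【００４８】
Next, the operation of the state detection device 20 according to the first embodiment is described in detail with reference to FIGS. 9 to 15.
【００４９】
FIG. 9 is a flowchart showing the operation of the image processing unit 22 shown in FIG. 2.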
【００５０】
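First, the image processing unit 22 receives the video signal Sa, which is the captured image data, from the image acquisition unit 21. It then applies a smoothing filter to the captured image and converts the pixel values with a predetermined formula (ST10). The smoothing filter is the following 5-by-5 filter.
【００５１】
【Equation 1】

The predetermined formula is as follows.
【００５２】
【Equation 2】

Here, d(x, y) is the pixel value at an arbitrary position in the captured image, and d'(x, y) is the pixel value after conversion.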
【００５３】
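The image processing unit 22 then finds, within the search region of the current captured image, the position most similar to the reference region of the previous captured image, and calculates the movement amount (xd, yd), that is, the optical flow (ST11). Specifically, it first finds the region within the search region that is most similar to the reference region and takes the center point of that region as the most similar position. It then calculates the movement amount (xd, yd) from the center point of the most similar region and the center point of the search region, and uses it as the optical flow.
【００５４】
Step ST11 is described in detail here. As stated above, several reference regions are set on the captured image in advance, and a search region is set around each reference region. The reference region and the search region are set at different times: as shown in FIG. 10, the reference region is set at time t and the search region at the later time (t+1).
【００５５】
FIG. 10 is an explanatory diagram of the method of calculating the movement amount (xd, yd) in step ST11 of FIG. 9. In step ST11, the image processing unit 22 first creates a candidate region, which has the same size as the reference region. It places the candidate region at a given location in the search region and obtains the similarity by comparing the candidate region with the reference region. It then moves the candidate region to another position and obtains the similarity between the candidate region at that position and the reference region.
【００５６】
The image processing unit 22 moves the candidate region step by step in this way and calculates the similarity to the reference region at each location in the search region. The similarity is judged, for example, on the basis of grayscale data. When the similarity is calculated from grayscale data and written as cos θ, it is expressed by the following formula.
【００５７】
【Equation 3】

In the above formula, T is the grayscale data of the reference region and S is the grayscale data of the candidate region. xd denotes the X coordinate within the search region, and yd denotes the Y coordinate within the search region.
【００５８】
From the above, the image processing unit 22 determines the position S at which the similarity is greatest, obtains the difference between the coordinates of point S and point O as the movement amount (xd, yd), and uses it as the optical flow.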
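Step ST11 amounts to block matching, which can be sketched as below. The formula image for the similarity is not reproduced in this text, so the sketch assumes the usual cosine form, cos θ = Σ(T·S) / (√ΣT² · √ΣS²), which is consistent with the similarity being written as cos θ over grayscale data T and S. The function names and the list-of-rows image representation are illustrative assumptions.

```python
import math


def cosine_similarity(t, s):
    """cos(theta) between two flattened grayscale patches (assumed form)."""
    dot = sum(a * b for a, b in zip(t, s))
    norm = math.sqrt(sum(a * a for a in t)) * math.sqrt(sum(b * b for b in s))
    return dot / norm if norm else 0.0


def patch(img, left, top, w, h):
    """Flatten the w x h patch whose top-left pixel is (left, top)."""
    return [img[y][x] for y in range(top, top + h) for x in range(left, left + w)]


def block_match(prev, curr, cx, cy, tw, th, sw, sh):
    """Return ((xd, yd), similarities) for one reference region centred at O = (cx, cy).

    The reference region comes from the previous image; the candidate region
    is scanned over the search region of the current image.
    """
    ref = patch(prev, cx - tw // 2, cy - th // 2, tw, th)
    max_dx, max_dy = (sw - tw) // 2, (sh - th) // 2
    best, sims = None, []
    for dy in range(-max_dy, max_dy + 1):
        for dx in range(-max_dx, max_dx + 1):
            cand = patch(curr, cx + dx - tw // 2, cy + dy - th // 2, tw, th)
            sim = cosine_similarity(ref, cand)
            sims.append(sim)
            if best is None or sim > best[0]:
                best = (sim, dx, dy)  # point S of maximum similarity so far
    return (best[1], best[2]), sims
```

The list of similarities is returned alongside the displacement because the later validity check needs the spread of those values, not just the maximum.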
【００５９】
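The description returns to FIG. 9. After calculating the movement amount (xd, yd), the image processing unit 22 determines whether the range of the similarity is equal to or greater than a threshold (ST12).
【００６０】
This determination is explained with reference to FIG. 11, which is an explanatory diagram of the processing in step ST12 of FIG. 9. The image processing unit 22 scans the search region with the candidate region, calculates the similarity at each location, and then obtains the variance of the resulting similarities.
【００６１】
For example, when the similarity at each location is plotted as a variation as in FIG. 11, variation C1 has a small variance and a narrow spread, whereas variation C2 has a larger variance and a wider spread than C1.
【００６２】
A narrow spread means that much the same similarity is detected at every location in the search region. For example, when the reference region has few features, such as a pure white image, comparison with any location in the search region yields much the same similarity. Because the differences between similarity values are then small, detection of the point S of maximum similarity tends to be inaccurate. For this reason, step ST12 of FIG. 9 compares the spread with a predetermined threshold to separate suitable reference regions from unsuitable ones.
【００６３】
The description returns to FIG. 9. If the range of the similarity is determined to be equal to or greater than the threshold (ST12: YES), the image processing unit 22 treats the reference region as a valid region and assigns "1" to fd (ST13). Processing then proceeds to step ST15.
【００６４】
If the range of the similarity is determined not to be equal to or greater than the threshold (ST12: NO), the image processing unit 22 treats the reference region as an invalid region and assigns "0" to fd (ST14). Processing then proceeds to step ST15. In this way, the image processing unit 22 decides whether to use a region for the optical flow calculation by comparing the variation of the similarity (one of the feature quantities) with a preset threshold.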
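Steps ST12 to ST14 can be sketched as a single check on the similarity values collected while scanning the search region. The text describes taking the variance of those values; the use of the population variance and the threshold value passed in are assumptions, since the patent gives no numeric threshold.

```python
from statistics import pvariance


def validity_flag(similarities, threshold):
    """Return fd: 1 if the similarities vary enough (ST13), otherwise 0 (ST14)."""
    return 1 if pvariance(similarities) >= threshold else 0
```

For instance, a featureless region that yields nearly identical similarities everywhere gets `fd = 0`, so its displacement is ignored in later averaging.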
【００６５】
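In step ST15, the image processing unit 22 determines whether steps ST11 to ST14 have been performed for every region, that is, whether the similar position within the search region has been identified for all reference regions.
【００６６】
If it determines that the similar position has not been identified for some reference region (ST15: NO), processing returns to step ST11, and steps ST11 to ST14 are repeated for the reference regions not yet processed.
【００６７】
If it determines that the similar position has been identified for all reference regions (ST15: YES), the image processing unit 22 sends the optical flow data for each reference region to the motion detection unit 23. Processing by the image processing unit 22 then ends.
【００６８】
The operation of the image processing unit 22 shown in FIG. 9 is the same whichever of the three driver states is detected.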
【００６９】
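Examples of the optical flow for each of the three driver states are described here. FIG. 12 is an explanatory diagram showing an example of the optical flow when the direction of the driver's face is detected, and FIG. 13 shows an example when the presence or absence of the driver is detected. FIGS. 14 to 16 show examples when the entry or exit of an object other than the driver's face into or from the imaging range is detected. FIG. 14 shows the optical flow when the driver moves a hand near the eyes, and FIG. 15 shows the optical flow when the driver lifts a book to look at a road map or the like. FIG. 16 shows the optical flow when a spoke of the steering wheel enters the imaging range.
【００７０】
FIG. 12 is described first. At time t the driver is looking ahead (FIG. 12(a)). At time (t+1), the driver turns the face to the left, for example to check an intersection, and optical flow is detected (FIG. 12(b)). The squares in the image are reference regions, and the line segment extending from each reference region shows the movement amount of that part, that is, the optical flow.
【００７１】
At time (t+2), the driver turns the face further to the left, and optical flow is again detected (FIG. 12(c)). At time (t+3), the driver turns the face to the upper left, and optical flow is likewise detected (FIG. 12(d)).
【００７２】
In FIG. 12, reference regions drawn with solid-line square frames are those judged "NO" in step ST12 of FIG. 9 and treated as invalid regions, and those drawn with broken-line square frames are those judged "YES" in step ST12 and treated as valid regions. The same applies to FIGS. 13 to 16.
【００７３】
FIG. 13 is described next. Before the driver gets in, nothing in the image moves, so no optical flow is detected, and most reference regions are invalid (FIG. 13(a)). When the driver starts to get in, the driver's movement is detected and optical flow is calculated, and some reference regions become valid (FIG. 13(b)). The driver then finishes getting in. The driver is momentarily still at this point, so less optical flow is detected, but because the driver cannot stay completely motionless and moves slightly, most reference regions become valid (FIG. 13(c)).
【００７４】
In FIG. 13, the line segments extending from the reference regions are omitted where only very slight optical flow was detected. The same applies to FIGS. 14 to 16.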
【００７５】
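FIG. 14 is described next. At time t the driver is looking ahead (FIG. 14(a)). At time (t+1), the driver moves a hand near the eyes, and optical flow is detected in part of the captured image (FIG. 14(b)). At time (t+2), the driver's hand hardly moves, and less optical flow is detected (FIG. 14(c)).
【００７６】
FIG. 15 is described next. At time t, the driver, who has been looking ahead, momentarily lowers the gaze to look at a road map or the like. The face itself also moves slightly downward, so slight optical flow is detected (FIG. 15(a)). At time (t+1), the driver lifts the road map or the like, and optical flow is detected slightly below the center of the captured image (FIG. 15(b)). At time (t+2), the driver gazes at the road map or the like and hardly moves, so less optical flow is detected (FIG. 15(c)).
【００７７】
FIG. 16 is described next. At time t the driver is driving on a straight road (FIG. 16(a)). At time (t+1), the driver begins a right turn. A spoke of the steering wheel enters the imaging range, and optical flow is detected (FIG. 16(b)). At time (t+2), the driver turns the steering wheel further to the right, and more optical flow is detected (FIG. 16(c)).
【００７８】
Besides the method of this embodiment, several methods of detecting motion from moving images are described in, for example, Nobuyuki Yagi (supervising editor), "ディジタル映像処理" (Digital Video Processing), edited by the Institute of Image Information and Television Engineers, pp. 129-139, 2000, Ohmsha, and these can also be used to calculate the optical flow.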
【００７９】
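Next, the processing of the motion detection unit 23 is described. FIG. 17 is a flowchart showing the operation of the motion detection unit 23 shown in FIG. 2. The processing described below is not executed when the presence or absence of the driver is detected.
【００８０】
In the processing shown in FIG. 17, the region groups are set differently depending on whether the detection target is the direction of the driver's face or the entry or exit of an object other than the driver's face into or from the imaging range.
【００８１】
The difference in region groups is explained first. When the detection target is the direction of the driver's face, the region groups are set as shown in FIG. 7: each of the nine region groups A1 to I1 contains nine reference regions in 3 rows by 3 columns.
【００８２】
When the detection target is the entry or exit of an object other than the driver's face, the region groups are set as shown in FIG. 8: each of the eight region groups A2 to H2 contains three to five reference regions.
【００８３】
The region groups are set differently for the following reason. To detect the direction of the driver's face, movement must be captured wherever the face moves within the image, so it is desirable to set region groups over the whole image. To detect the entry or exit of an object other than the driver's face, detection can focus on entry and exit, so no region group needs to be set in the center of the image.
【００８４】
In this embodiment, the region group settings are thus varied according to the detection target so that each target can be detected well.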
【００８５】
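The flowchart of FIG. 17 is described next on the basis of these differences in region groups.
【００８６】
First, the motion detection unit 23 selects the region group to be processed from among the region groups, and then selects one of the reference regions in that group.
【００８７】
For the selected region group, the motion detection unit 23 initializes to "0" the values xm, ym, and c, which relate to the movement amount of objects in the image (ST20). It then determines whether the selected reference region is a valid region, that is, whether fd is "1" (ST21).
【００８８】
If fd is "1" (ST21: YES), the motion detection unit 23 accumulates the optical flow, which is the movement amount (ST22). Specifically, it sets xm to xm + xd, ym to ym + yd, and c to c + 1. Processing then proceeds to step ST23.
【００８９】
If fd is not "1" (ST21: NO), processing proceeds to step ST23 without accumulating the optical flow.
【００９０】
In step ST23, the motion detection unit 23 determines whether all reference regions in the selected region group have been processed. If some reference region has not been processed (ST23: NO), processing returns to step ST21, and steps ST21 and ST22 are repeated. That is, the motion detection unit 23 checks every reference region for validity and accumulates the movement amount of each valid one.
【００９１】
Once the movement amounts have been accumulated and all reference regions have been processed (ST23: YES), the motion detection unit 23 determines whether c is "0" (ST24).
【００９２】
If c is "0" (ST24: YES), processing proceeds to step ST26. If c is not "0" (ST24: NO), the motion detection unit 23 averages the accumulated xm and ym (ST25). That is, it executes xm = xm / c and ym = ym / c to obtain the average movement amount.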
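Steps ST20 to ST25 for one region group can be sketched as follows, using the variable names xm, ym, c, xd, yd, and fd from the text. The function name and the tuple-based input are illustrative assumptions.

```python
def group_average_flow(flows):
    """Average the optical flow over the valid reference regions of one group.

    flows: iterable of (xd, yd, fd) tuples, one per reference region.
    Returns (xm, ym, c); xm and ym stay 0 when no region is valid.
    """
    xm = ym = 0.0
    c = 0                                  # ST20: initialise
    for xd, yd, fd in flows:
        if fd == 1:                        # ST21: valid region?
            xm += xd                       # ST22: accumulate
            ym += yd
            c += 1
    if c != 0:                             # ST24
        xm /= c                            # ST25: average
        ym /= c
    return xm, ym, c
```

Skipping the division when `c` is 0 mirrors the ST24 branch and avoids a division by zero for groups with no valid region.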
【００９３】
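The average movement amount is, for example, as shown in FIG. 12 above, where it is drawn as an arrow at the lower right of each image (except (a)). Although the average movement amount is obtained for each region group, FIG. 12 shows, for convenience, the average over the whole image. The average shown is the average movement amount of the face, that is, the average when the direction of the face is the detection target.
【００９４】
The description returns to FIG. 17. After calculating the average movement amount, the motion detection unit 23 obtains its moving average (ax, ay) (the motion amount) (ST26). The range over which the moving average is taken is set arbitrarily; for example, the motion detection unit 23 averages the average movement amounts (corresponding to the arrow sizes) shown in FIG. 12(b), (c), and (d).
【００９５】
The motion detection unit 23 then accumulates the moving average (ax, ay) of the average movement amount (ST27). Specifically, it sets sx to sx + ax and sy to sy + ay.
【００９６】
It then obtains the moving average (cx, cy) of the accumulated value (sx, sy) (ST28). The range of this moving average is also set arbitrarily.
【００９７】
The motion detection unit 23 then obtains the movement position (vx, vy) as the difference between the accumulated value (sx, sy) and its moving average (cx, cy) (ST29). Specifically, it sets vx to sx - cx and vy to sy - cy.
【００９８】
It then stores the movement position (vx, vy) in a buffer and takes the movement positions (vx, vy) obtained over a fixed preceding period, together with the current movement position, as the current actual motion pattern (ST30).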
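Steps ST26 to ST30 can be sketched as a small per-group tracker. The text leaves both moving-average ranges and the buffer length open, so the window sizes below are placeholder assumptions, as is the class name.

```python
from collections import deque
from statistics import mean


class MovementPositionTracker:
    """Per-group state for steps ST26 to ST30 (window sizes are assumptions)."""

    def __init__(self, avg_window=3, sum_window=30, pattern_len=50):
        self.avgs = deque(maxlen=avg_window)      # recent (xm, ym)
        self.sums = deque(maxlen=sum_window)      # recent (sx, sy)
        self.pattern = deque(maxlen=pattern_len)  # actual motion pattern buffer
        self.sx = self.sy = 0.0

    def update(self, xm, ym):
        # ST26: moving average (ax, ay) of the average movement amount.
        self.avgs.append((xm, ym))
        ax = mean(a[0] for a in self.avgs)
        ay = mean(a[1] for a in self.avgs)
        # ST27: accumulate.
        self.sx += ax
        self.sy += ay
        # ST28: moving average (cx, cy) of the accumulated value.
        self.sums.append((self.sx, self.sy))
        cx = mean(s[0] for s in self.sums)
        cy = mean(s[1] for s in self.sums)
        # ST29: movement position.
        vx, vy = self.sx - cx, self.sy - cy
        # ST30: keep a fixed-length history as the actual motion pattern.
        self.pattern.append((vx, vy))
        return vx, vy
```

One tracker instance would be kept per region group and fed that group's (xm, ym) once per frame.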
【００９９】
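The motion detection unit 23 then determines whether the accumulated value (sx, sy) is equal to or greater than a threshold (ST31). If it is not (ST31: NO), the motion detection unit 23 sends the movement position (vx, vy) data to the state detection unit 24, and processing proceeds to step ST35.
【０１００】
If the accumulated value (sx, sy) is equal to or greater than the threshold (ST31: YES), the motion detection unit 23 determines whether the standard deviation of the accumulated value (sx, sy) is equal to or less than a threshold (ST32). If it is not (ST32: NO), the motion detection unit 23 sends the movement position (vx, vy) data to the state detection unit 24, and processing proceeds to step ST35.
【０１０１】
If the standard deviation of the accumulated value (sx, sy) is equal to or less than the threshold (ST32: YES), the motion detection unit 23 determines whether the moving average of the average movement amount is equal to or less than a threshold (ST33). If it is not (ST33: NO), the motion detection unit 23 sends the movement position (vx, vy) data to the state detection unit 24, and processing proceeds to step ST35.
【０１０２】
If the moving average of the average movement amount is equal to or less than the threshold (ST33: YES), the motion detection unit 23 initializes the accumulated value (sx, sy) to "0" (ST34). It then sends the movement position (vx, vy) data to the state detection unit 24, and processing proceeds to step ST35.
【０１０３】
Steps ST31 to ST34 are performed for the following reason.
【０１０４】
When the driver sits in the seat, the driver's face is not necessarily at the center of the imaging range. If the ranges to the left and right of the face within the imaging range are unequal, an error arising from that difference occurs when the driver moves the face left and right, and the error accumulates in the accumulated value (sx, sy). Errors may also accumulate for various other reasons. As errors gradually build up in the accumulated value (sx, sy), they interfere with detecting the direction of the face or the entry or exit of an object other than the face into or from the imaging range.
【０１０５】
Step ST31 therefore determines whether the accumulated value (sx, sy) is equal to or greater than the threshold, and if it is, the accumulated value (sx, sy) is initialized to "0". Initializing the accumulated value under predetermined conditions in this way allows the detection target to be detected well.
【０１０６】
However, if the accumulated value (sx, sy) were initialized to "0" while the driver is actually turning the face or while an object other than the face is entering or leaving the imaging range, the initialization itself would interfere with detection. Steps ST32 and ST33 therefore detect that the face is not moving and that nothing other than the face is entering or leaving the imaging range. That is, the motion detection unit 23 initializes the accumulated value (sx, sy) to "0" only under the predetermined condition that the standard deviation of the accumulated value (sx, sy) is equal to or less than its threshold and the moving average of the average movement amount is equal to or less than its threshold.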
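The reset decision of steps ST31 to ST34 can be sketched as a pure function. The text does not say how the two-component values are compared with scalar thresholds, so this sketch assumes the Euclidean magnitude for ST31 and ST33 and the larger per-axis standard deviation for ST32; the function name and all thresholds are hypothetical.

```python
import math
from statistics import pstdev


def should_reset(sum_history, avg_motion, sum_thresh, std_thresh, motion_thresh):
    """Decide whether to zero the accumulated value (sx, sy).

    sum_history: recent (sx, sy) values, newest last.
    avg_motion:  current moving average (ax, ay) of the average movement.
    """
    sx, sy = sum_history[-1]
    if math.hypot(sx, sy) < sum_thresh:             # ST31: NO
        return False
    spread = max(pstdev([s[0] for s in sum_history]),
                 pstdev([s[1] for s in sum_history]))
    if spread > std_thresh:                         # ST32: NO (still changing)
        return False
    ax, ay = avg_motion
    if math.hypot(ax, ay) > motion_thresh:          # ST33: NO (still moving)
        return False
    return True                                     # ST34: reset to 0
```

The two extra conditions keep the reset from firing in the middle of a head turn, which is the concern raised in the paragraph above.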
【０１０７】
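Step ST35 determines whether all region groups have been processed. If some region group has not been processed (ST35: NO), processing returns to step ST20 and the same processing is repeated. If all region groups have been processed (ST35: YES), the motion detection unit 23 sends the actual motion pattern data for each region group to the state detection unit 24, and its processing ends.
【０１０８】
An example of the movement position (vx, vy) data obtained by the motion detection unit 23, that is, the actual motion pattern, is described with reference to FIG. 18. FIG. 18 is an explanatory diagram of an actual motion pattern obtained by the motion detection unit 23 shown in FIG. 2 when the detection target is the direction of the driver's face.
【０１０９】
In FIG. 18, the vertical axis shows the movement position and the horizontal axis shows time. FIG. 18 shows only the movement position in the horizontal image direction (X direction) and omits the movement position in the vertical image direction (Y direction). It shows an example of the actual motion pattern obtained in a given region group when the driver, who is looking ahead, turns the face to the left and then looks ahead again.
【０１１０】
As shown in the figure, while the driver is looking ahead of the vehicle (times 350 to 410), the movement position is near "0".
【０１１１】
When the driver turns the face to the left to check something (times 410 to 430), the movement position reaches about -45 to -48 pixels. While the driver remains facing left for some time (times 430 to 560), the movement position stays at about -45 to -48 pixels.
【０１１２】
When the driver turns the face back toward the front of the vehicle (times 560 to 580), the movement position returns to near "0". While the driver continues to look ahead (times 580 to 650), the movement position remains near "0".
【０１１３】
The movement position (vx, vy) obtained by the motion detection unit 23 thus represents the direction of the driver's face, and tracking this movement position over time yields the actual motion pattern P1.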
【０１１４】
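Another example of an actual motion pattern is described with reference to FIG. 19. FIG. 19 is an explanatory diagram of an actual motion pattern obtained by the motion detection unit 23 shown in FIG. 2 when the detection target is the entry or exit of an object other than the driver's face into or from the imaging range.
【０１１５】
In FIG. 19, the horizontal axis shows the movement position in the horizontal image direction (X direction) and the vertical axis shows the movement position in the vertical image direction (Y direction). The actual motion pattern in FIG. 19 is an example obtained in a given region group when the steering wheel is operated as shown in FIG. 16.
【０１１６】
As shown in FIG. 16, the spoke of the steering wheel moves in the negative direction along both the X and Y axes of the captured image. The movement position (vx, vy) of the steering wheel obtained over time, that is, the actual motion pattern P2, therefore moves in the negative direction along both axes, as shown in FIG. 19. When the steering wheel is turned to the left, the opposite occurs, and the resulting pattern is roughly point-symmetric to the actual motion pattern P2 of FIG. 19 about the origin (0, 0).
【０１１７】
Next, the operation of the motion detection unit 23 when the detection target is the presence or absence of the driver is described. In this case, the motion detection unit 23 does not perform the processing of FIG. 17. Instead, it obtains the actual motion pattern by tracking over time the number of reference regions, among all reference regions, that were judged "YES" in step ST12 of FIG. 9. That is, it counts the valid regions among all reference regions to obtain the actual motion pattern.
【０１１８】
As described with reference to FIG. 13, the number of valid regions tends to increase gradually from the state before the driver gets in, through the process of getting in, to the completion of getting in. The motion detection unit 23 obtains this tendency as the actual motion pattern.
【０１１９】
FIG. 20 is an explanatory diagram showing an example of the actual motion pattern obtained by the motion detection unit 23 when the detection target is the presence or absence of the driver. In FIG. 20, the vertical axis shows the number of valid regions and the horizontal axis shows time.
【０１２０】
First, before the driver gets in (times 35140 to 35164), the number of valid regions is stable at 5 or fewer. When the driver starts to get in, the number of valid regions begins to increase (times 35164 to 35204), to at least 6 and fewer than 15. Once the driver has finished getting in (times 35204 to 35250), the number of valid regions increases further, to 15 or more.
【０１２１】
When the detection target is the presence or absence of the driver, the motion detection unit 23 obtains this change in the number of valid regions as the actual motion pattern P3. As when the detection target is the direction of the driver's face (step ST30 of FIG. 17), the motion detection unit 23 stores the number of valid regions for a fixed period only. The actual motion pattern P3 actually obtained therefore need not run continuously from time 35140 to time 35250 as in FIG. 20; it may be just part of the increase shown in FIG. 20, such as times 35180 to 35200.
【０１２２】
Once the actual motion pattern is obtained, the motion detection unit 23 sends the data of the actual motion pattern P3 to the state detection unit 24, and its processing ends.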
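Counting valid regions per frame can be sketched as below. The patent detects presence by correlating the time series of counts with stored patterns; the direct classification shown here is a simplification added for illustration, using the count ranges read off the FIG. 20 example (5 or fewer, 6 to 14, 15 or more). The function names and labels are assumptions.

```python
def count_valid(flags):
    """Number of reference regions whose fd flag is 1 in one frame."""
    return sum(1 for fd in flags if fd == 1)


def classify_presence(valid_count, absent_max=5, seated_min=15):
    """Rough label for one frame, using the FIG. 20 example ranges."""
    if valid_count <= absent_max:
        return "absent"
    if valid_count < seated_min:
        return "boarding"
    return "seated"
```

Appending `count_valid(...)` to a fixed-length buffer each frame would give the series that the text calls the actual motion pattern P3.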
【０１２３】
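Next, the operation of the state detection unit 24 shown in FIG. 2 is described. FIG. 21 is a flowchart showing the operation of the state detection unit 24 shown in FIG. 2.
【０１２４】
As shown in the figure, the state detection unit 24 first selects one of the region groups. For the selected group, it obtains the correlation between the actual motion pattern P obtained in step ST30 of FIG. 17 and each of the stored motion patterns D held in advance in the storage unit 24a (ST40).
【０１２５】
The correlation is obtained, for example, in the same way as Equation 3, or by using information from frequency analysis by Fourier transform or wavelet transform.
【０１２６】
Specifically, the actual motion pattern P and the stored motion pattern D are as follows.
【Equation 4】

Here, "statecode" is a state code representing the driver's state. When the detection target is the entry or exit of an object other than the driver's face into or from the imaging range, or the direction of the face, "data" represents the movement position (vx, vy) obtained in step ST30 of FIG. 17. When the detection target is the presence or absence of the driver, "data" represents the number of valid regions.
【０１２７】
The state detection unit 24 then finds the stored motion pattern with the highest correlation among the stored motion patterns (ST41), and detects the state indicated by that stored motion pattern as the driver's state (ST42). That is, it takes the state, such as a face direction, indicated by the stored motion pattern D with the highest correlation as the detection result, and outputs this result to the state signal output unit 25.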
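Steps ST40 to ST42 can be sketched as a nearest-pattern lookup. The correlation below assumes the same cosine form as the similarity measure, which the text offers as one option. The dictionary layout with `statecode` and `data` keys follows the field names mentioned in the text, but the concrete state codes and sample values are invented for illustration.

```python
import math


def correlation(p, d):
    """Cosine-style correlation between two equal-length 1-D patterns."""
    dot = sum(a * b for a, b in zip(p, d))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0


def detect_state(actual, stored_patterns):
    """Return the statecode of the stored pattern that best matches `actual`."""
    best = max(stored_patterns, key=lambda d: correlation(actual, d["data"]))  # ST40, ST41
    return best["statecode"]                                                   # ST42


# Hypothetical stored patterns (X movement positions over time).
STORED = [
    {"statecode": "face_left", "data": [0, -20, -45, -45, -20, 0]},
    {"statecode": "face_right", "data": [0, 20, 45, 45, 20, 0]},
]
```

With these stored patterns, an observed series such as `[0, -18, -46, -47, -22, -1]` matches `face_left`.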
【０１２８】
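The state detection unit 24 then performs the same processing for the region groups other than the selected one and outputs the detection results to the state signal output unit 25.
【０１２９】
In the above, the actual motion pattern of each region group is compared with the stored motion patterns to obtain a detection result for each group, but the individual results may instead be judged together to obtain a single result. In that case, only the overall result is output, rather than outputting the result of each region group to the state signal output unit 25 in turn.
【０１３０】
When the detection target is the presence or absence of the driver, no region groups are set, so the above processing is not repeated. The state detection unit 24 performs steps ST40 to ST42 once and outputs the resulting detection result to the state signal output unit 25.
【０１３１】
The state signal output unit 25 then converts the detection result from the state detection unit 24 into the electric signal Sb and outputs it externally.
【０１３２】
As is clear from the above, in this embodiment the image processing shown in FIG. 9 is the same whichever of the three driver states is detected. Moreover, because the optical flow is obtained for preset reference regions, detection is performed without specifying the position of the face, unlike conventional devices.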
【０１３３】
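In the state detection device 20 according to this embodiment, the image processing unit 22 thus obtains the optical flow between captured images. With this method, whenever some object in the image moves, the movement can be detected. Whatever the detection target, as long as it can be determined from movement, there is no need for an image processing method set individually for each detection target.
【０１３４】
For example, the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, and the presence or absence of the driver can all be determined from movement, so this single optical-flow-based image processing method suffices for them.
【０１３５】
Therefore, when a device is built to detect one of the three driver states and is later to be upgraded to detect another driver state, only the processing parts that are not shared need to be added. Compared with adding a device that performs entirely different processing, the upgrade costs less and is less likely to run into image processing that cannot be executed simultaneously.
【０１３６】
When two or more of the three driver states are detected, the image processing method is shared, so several driver states can be detected with a single image processing method. Compared with implementing devices that perform different processing, costs are lower and a situation in which simultaneous execution is impossible is less likely to arise.
【０１３７】
Cost and versatility can therefore both be improved.
【０１３８】
Because the state signal output unit 25 converts the detection result from the state detection unit 24 into the electric signal Sb and outputs it externally, when the external control device 30 is a notification device, for example, notification according to the direction of the driver's face is possible. The detection results can therefore be used for vehicle control and the like.
【０１３９】
Furthermore, the optical flow is obtained for each of one or more calculation regions defined by a predetermined position and size on the captured image, and an actual motion pattern derived from the optical flow is obtained for each region group consisting of at least one calculation region. The direction of the face is then detected from the obtained actual motion pattern and the stored motion patterns held in advance. Thus, even when the face appears only in a corner of the captured image, an accurate actual motion pattern is obtained for the region group in that corner. This avoids a situation in which the actual motion pattern cannot be obtained accurately, such as when only part of the face is present in a corner of the image.
【０１４０】
Convenience can therefore be improved.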
【０１４１】
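The actual motion pattern is detected spatially and over time from the optical flow calculation results. That is, movement such as left-right movement is obtained spatially, and the driver's movement is obtained temporally by going back from the present into the past. Because the actual motion pattern is not derived from instantaneous optical flow, the influence of noise and the like can be reduced.
【０１４２】
Conventionally, the driver's state is detected using, as a reference, feature quantities obtained by imaging or the like. The feature quantities must therefore be acquired at the start of driving to establish the reference, so the state cannot be detected at the start of driving. In this embodiment, by contrast, feature quantities are not obtained by imaging at that time; feature quantities obtained in advance from actual driver movements are stored. The driver's state can therefore be detected even at the start of driving. The same effect is obtained when comparison processing is performed using the stored motion patterns, which are these feature quantities.
【０１４３】
At least one of the one or more reference regions is set to the size of a specific facial part, based on the proportion of the captured image occupied by the face. This prevents the amount of computation from growing because a reference region is too large, and reduces the chance that several distinctive parts fall within one reference region at the same time. It also prevents a region from being set so small that it contains no distinctive part.
【０１４４】
The moving average (motion amount) based on the movement of the face is accumulated, the movement of the face is obtained from the accumulated value, and the accumulated value is initialized under predetermined conditions. Errors that build up in the accumulated value, for example when the driver moves the face left and right, are thereby cleared, and the driver state can be detected well.
【０１４５】
For each of the one or more calculation regions, the variation (variance) of the feature quantity (similarity) calculated within the search region is compared with a preset threshold to decide whether the region is used for the optical flow calculation. This prevents inaccurate detection caused by a featureless reference region.
【０１４６】
The state detection system 1 according to this embodiment can also improve cost and versatility. In addition, when the external control device 30 is a notification device, for example, notification according to the direction of the driver's face is possible. The detection results can therefore be used for vehicle control and the like.
【０１４７】
In this embodiment, the processing by the image processing unit 22 is identical whichever detection target is detected, but it need not be exactly identical. Minor changes are acceptable as long as they do not affect the image processing that obtains the optical flow.
【０１４８】
In this embodiment, determining the presence or absence of the driver while the vehicle is moving also makes it possible to detect, for example, the driver bending down to pick up something that has fallen under the seat, or leaning toward the passenger seat to reach for something on it.
【０１４９】
FIGS. 16 and 19 illustrated the case where a spoke of the steering wheel enters the imaging range. Because the movement trajectory of the spoke is obtained in this case, the invention can also be applied to a device that estimates the steering angle.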
【０１５０】
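Next, the second embodiment of the present invention is described. The description focuses on the differences from the first embodiment.
【０１５１】
The state detection system 1a and the state detection device 20a according to the second embodiment take at least two of the three driver states as detection targets. The processing executed by the state detection unit 24 of the second embodiment also differs from that of the first embodiment.
【０１５２】
The differing processing is described below. First, the state detection device 20a of the second embodiment can detect two or more of the three driver states described in the first embodiment and is configured to send each detection result to the control device 30.
【０１５３】
Specifically, the motion detection unit 23 obtains the actual motion pattern for one of the two or more driver states to be detected and sends the data to the state detection unit 24. It then obtains the actual motion pattern for the remaining driver states and sends that data to the state detection unit 24. These operations may be performed in parallel.
【０１５４】
The state detection unit 24 performs detection from the received actual motion pattern data as described in the first embodiment, and sends the detection result to the state signal output unit 25.
【０１５５】
The state signal output unit 25 converts the detection result into the electric signal Sb and outputs it to the control device 30, as described in the first embodiment.
【０１５６】
The state detection unit 24 of the second embodiment also has a function of sending a suppression signal to the motion detection unit 23 based on the detection result.
【０１５７】
FIG. 22 is a flowchart showing the suppression control processing performed by the state detection unit 24. On obtaining a detection result, the state detection unit 24 first determines whether the detection result for that driver state is a predetermined result (ST50). If it is (ST50: YES), the state detection unit 24 sends a suppression signal to the motion detection unit 23 (ST51). The motion detection unit 23 then suppresses detection of the driver states other than the one detected.
【０１５８】
For example, when the driver's hand is near the eyes, the driver cannot be absent from the vehicle, so the state detection unit 24 sends a signal that suppresses detection of the presence or absence of the driver. Also, when the driver's hand is near the eyes, the driver tends not to change the direction of the face, so a signal that suppresses detection of the direction of the driver's face is sent. In this way, when the detection result for one driver state shows that detection of another driver state is unnecessary, that detection is suppressed. The device 20a thereby prevents false detection of the other driver states.
【０１５９】
Meanwhile, the motion detection unit 23 and the state detection unit 24 repeat detection of the driver state for which a result has already been obtained, and it is determined whether the result of this repeated detection is the predetermined result (ST52), that is, whether the predetermined result is continuing.
【０１６０】
If the predetermined result is continuing (ST52: YES), this processing is repeated until the result is determined to have stopped. If the predetermined result is no longer continuing (ST52: NO), the state detection unit 24 sends a release signal that cancels the suppression to the motion detection unit 23 (ST53). That is, the suppression applied in step ST51 is cancelled.
【０１６１】
Processing then ends. If step ST50 determines that the result is not the predetermined result (ST50: NO), processing likewise ends.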
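The suppression flow of FIG. 22 (ST50 to ST53) can be sketched as a small loop over successive detection results for one driver state. The function name, the signal strings, and the `is_trigger` predicate are illustrative assumptions; the patent only specifies that a suppression signal is sent when a predetermined result appears and a release signal when it stops continuing.

```python
def run_suppression(results, is_trigger):
    """Return the signals sent to the motion detection unit, in order.

    results:    successive detection results for one driver state.
    is_trigger: predicate that is True for the 'predetermined result'.
    """
    signals = []
    suppressing = False
    for result in results:
        if not suppressing and is_trigger(result):    # ST50: YES -> ST51
            signals.append("suppress")
            suppressing = True
        elif suppressing and not is_trigger(result):  # ST52: NO -> ST53
            signals.append("release")
            suppressing = False
    return signals
```

For example, a sequence in which a hand appears near the eyes for two frames and then leaves yields one `suppress` followed by one `release`.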
【０１６２】
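The state detection device 20a according to this embodiment can thus improve cost and versatility, as in the first embodiment. In addition, because detection of other driver states is suppressed based on the detection result for one driver state, false detection of the other driver states can be prevented.
【０１６３】
As in the first embodiment, convenience can be improved and the influence of noise and the like can be reduced.
【０１６４】
The driver's state can also be detected even at the start of driving.
【０１６５】
Growth in the amount of computation is prevented, and the chance that several distinctive parts fall within one reference region at the same time is reduced. A region is also prevented from being set so small that it contains no distinctive part.
【０１６６】
Furthermore, the driver state can be detected well, and inaccurate detection can be prevented.
【０１６７】
In this embodiment too, the other processing may be changed somewhat as long as the image processing that obtains the optical flow is performed. It is also possible to detect, for example, the driver bending down to pick up something that has fallen under the seat, or leaning toward the passenger seat to reach for something on it. Because the movement trajectory of the spoke is obtained, the invention can also be applied to a device that estimates the steering angle.
【０１６８】
In this embodiment, at least two of the three driver states need to be detection targets, so there may be two or three detection targets. Furthermore, because detection of other driver states is suppressed based on at least one detection result, detection of the one remaining driver state may be suppressed based on the detection results for two of the three. Alternatively, detection of the remaining two driver states may be suppressed based on one detection result.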
【０１６９】
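Next, the third embodiment of the present invention is described. The description focuses on the differences from the second embodiment.
【０１７０】
The state detection system 1b and the state detection device 20b according to the third embodiment take as detection targets at least two of the three driver states and three body states. The three body states are the opening and closing of the driver's eyelids, the opening and closing of the driver's mouth, and changes in the driver's facial expression.
【０１７１】
Detection of these body states involves largely the same processing as detection of the driver states described above. However, because subtle changes in the eyelids, mouth, and expression must be captured accurately, the positions of the eyes, mouth, and so on must be specified in the captured image.
【０１７２】
The operation of the state detection device 20b when detecting eyelid opening and closing, mouth opening and closing, and expression changes is described next.
【０１７３】
To detect eyelid opening and closing, the image processing unit 22 specifies the position of the eyes. Specifically, the coordinate position of the eyes in the captured image may be specified as described in Japanese Unexamined Patent Application Publication No. H5-60515 or No. 2000-142164.
【０１７４】
After specifying the eye position, the image processing unit 22 sets reference regions near the eye position in the captured image and sets region groups each containing several reference regions.
【０１７５】
FIG. 23 is an explanatory diagram showing the reference regions and region groups used when detecting eyelid opening and closing; (a) shows an example of the reference regions and (b) shows an example of the region groups. As shown in FIG. 23(a), the image processing unit 22 sets reference regions in 4 rows by 16 columns so as to cover both eyes. As shown in FIG. 23(b), it then sets two region groups A3 and B3, one for each of the left and right eyes, each containing reference regions in 4 rows by 8 columns.
【０１７６】
After the reference regions and region groups are set, the optical flow is obtained as in the second embodiment and the data is sent to the motion detection unit 23.
【０１７７】
Thus, whereas the reference regions are set at predetermined positions in the second embodiment, they are set at the specified eye position in the third embodiment. That is, the third embodiment differs from the second in the processing that specifies the eye position and the processing that sets the reference regions. This difference does not affect detection of states other than eyelid opening and closing to the extent that simultaneous execution with detection of other states becomes impossible.
【０１７８】
After the optical flow is calculated, the motion detection unit 23 obtains the actual motion pattern as in the second embodiment (in particular, as in detecting the direction of the face). The state detection unit 24 then obtains the correlation with the stored motion patterns and detects the body state.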
【０１７９】
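The optical flow and actual motion pattern obtained when detecting eyelid opening and closing are described here. FIG. 24 is an explanatory diagram showing an example of the optical flow obtained when detecting eyelid opening and closing.
【０１８０】
First, as shown in FIG. 24(a), the driver's eyes are open at time t. At time (t+1) the driver starts to close the eyes. As shown in FIG. 24(b), optical flow in the vertical image direction (Y direction) is then detected at the driver's eyelids.
【０１８１】
At time (t+2), the driver's eyes are completely closed. As shown in FIG. 24(c), optical flow in the vertical image direction is again detected near the driver's eyes. In the horizontal image direction (X direction), little optical flow is detected throughout times t to (t+2).
【０１８２】
FIG. 25 is an explanatory diagram showing an example of the actual motion pattern obtained when detecting eyelid opening and closing. It shows the pattern obtained from when the driver closes the eyes until the driver opens them again.
【０１８３】
When the driver closes the eyes, optical flow is detected in the vertical image direction and little is detected in the horizontal image direction, as shown in FIG. 24. The resulting actual motion patterns P4 and P5 (hereinafter, the actual motion patterns obtained when detecting eyelid opening and closing are called eyelid motion patterns P4 and P5) are as shown in FIG. 25.
【０１８４】
Specifically, the eyelid motion pattern P4 for the vertical image direction is as follows. While the driver's eyes are open (times 178 to 186), the movement position is near "0". When the driver starts to close the eyes, optical flow in the vertical image direction is obtained, so the movement position rises to 6 to 8 pixels (times 186 to 190).
【０１８５】
While the driver keeps the eyes closed (times 190 to 216), the movement position stays at 6 to 8 pixels. When the driver starts to open the eyes, the movement position gradually decreases (times 216 to 237).
【０１８６】
Meanwhile, little eyelid optical flow is detected in the horizontal image direction. The eyelid motion pattern P5 for the horizontal image direction therefore keeps almost the same value during times 178 to 186.
【０１８７】
After the eyelid motion patterns P4 and P5 are obtained, the state detection unit 24 reads the stored motion patterns from the storage unit 24a and compares the eyelid motion pattern P4 with them to detect the driver's blink. The storage unit 24a of the third embodiment stores, as a stored motion pattern, the pattern in which a predetermined movement in the vertical image direction is followed by a return of the predetermined amount. The state detection unit 24 therefore detects a blink when the correlation between the eyelid motion pattern P4 and this stored motion pattern is the highest.
【０１８８】
The state signal output unit 25 then outputs an electric signal Sb corresponding to the detection result to the control device 30. The storage unit 24a also stores stored motion patterns for the eye-opening and eye-closing motions, so states such as the eyes being closed for a long time can be detected from the time between the closing motion and the opening motion.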
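The idea of judging prolonged eye closure from the time between the closing and opening motions can be illustrated with a simple run-length measure on the vertical eyelid pattern. This is a simplification added for illustration, not the stored-pattern correlation the text describes: it assumes the eyes count as closed while the vertical movement position is at or above a level of 6 pixels, the lower end of the 6 to 8 pixel plateau in the FIG. 25 example. The function name and the level are assumptions.

```python
def longest_closed_run(vy_pattern, closed_level=6):
    """Longest number of consecutive samples with vy at or above closed_level."""
    best = run = 0
    for vy in vy_pattern:
        run = run + 1 if vy >= closed_level else 0
        best = max(best, run)
    return best
```

A short run would correspond to an ordinary blink, and a run longer than some chosen duration could be flagged as prolonged closure; the duration limit would have to be tuned to the frame rate.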
【０１８９】
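Conventionally, when some detection target is to be detected, devices are often configured to detect it by two or more methods (for example, a method based on grayscale data and a method based on difference images). If any one of these methods detects the target, the target is judged as detected even if the other methods do not detect it. Two or more methods are thus combined to compensate for detection misses that would occur with a single method.
【０１９０】
In such combinations, combining two or more entirely different methods tends to give higher detection accuracy than combining two or more similar methods. This is because if, for example, all of the methods rely on grayscale data, all of them may miss when the grayscale data itself is not captured well.
【０１９１】
In this embodiment, blinks are detected by a new method, optical flow. When it is combined with other methods to detect the target, for example, blink detection with high overall accuracy can be achieved. The eye-opening and eye-closing motions can likewise be detected accurately.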
【０１９２】
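Next, mouth opening and closing is described. To detect it, the image processing unit 22 specifies the position of the mouth. To do so, it first specifies the coordinates of the eyes as described above, and then specifies the coordinate position of the mouth in the captured image from its positional relationship relative to the eye coordinates. Once the mouth position is specified, the image processing unit 22 specifies the positions of the upper and lower lips. These are specified, for example, with reference to a region of low grayscale values extending in the horizontal image direction (that is, the boundary between the upper and lower lips formed when the mouth is closed).
【０１９３】
After specifying the positions of the upper and lower lips, the image processing unit 22 sets reference regions near the mouth position in the captured image and sets region groups each containing several reference regions.
【０１９４】
FIG. 26 is an explanatory diagram showing the reference regions and region groups used when detecting mouth opening and closing; (a) shows an example of the reference regions and (b) shows an example of the region groups. As shown in FIG. 26(a), the image processing unit 22 sets reference regions in 4 rows by 8 columns so as to cover both lips. As shown in FIG. 26(b), it then sets two region groups A4 and B4, one for each of the upper and lower lips, each containing reference regions in 2 rows by 8 columns.
【０１９５】
After the reference regions and region groups are set, the optical flow is obtained as in the second embodiment and the data is sent to the motion detection unit 23.
【０１９６】
The reference regions are thus set at the specified mouth position. The processing that specifies the mouth position and the processing that sets the reference regions, which differ from the second embodiment, do not affect detection of states other than mouth opening and closing to the extent that simultaneous execution with detection of other states becomes impossible.
【０１９７】
After the optical flow is calculated, the motion detection unit 23 obtains the actual motion pattern as in the second embodiment (in particular, as in detecting the direction of the face). The state detection unit 24 then obtains the correlation with the stored motion patterns and detects the body state.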
【０１９８】
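The optical flow and actual motion pattern obtained when detecting mouth opening and closing are described here. FIG. 27 is an explanatory diagram showing an example of the optical flow obtained when detecting mouth opening and closing.
【０１９９】
First, as shown in FIG. 27(a), the driver's mouth is closed at time t. At time (t+1) the driver starts to open the mouth. As shown in FIG. 27(b), optical flow in the vertical image direction (Y direction) is then detected at the driver's lower lip, while little optical flow is detected in the horizontal image direction (X direction). For the upper lip, no optical flow is detected in either the vertical or the horizontal image direction.
【０２００】
At time (t+2), the driver's mouth is fully open. As shown in FIG. 27(c), optical flow is again detected only in the vertical image direction at the driver's lower lip, and no optical flow is detected at the upper lip.
【０２０１】
FIG. 28 is an explanatory diagram showing an example of the actual motion pattern obtained when detecting mouth opening and closing. It shows the pattern obtained from when the driver opens the mouth until the driver closes it again.
【０２０２】
When the driver opens the mouth, as shown in FIG. 27, optical flow is detected at the lower lip in the vertical image direction and little is detected in the horizontal image direction. At the upper lip, little optical flow is detected in either direction.
【０２０３】
The resulting actual motion patterns P6 to P9 are as shown in FIG. 28. In the following, the actual motion patterns obtained for the lower lip when detecting mouth opening and closing are called lower-lip motion patterns P6 and P7, and those obtained for the upper lip are called upper-lip motion patterns P8 and P9.
【０２０４】
The patterns P6 to P9 shown in FIG. 28 are described specifically. First, for the lower-lip motion pattern P6 in the vertical image direction, the movement position is near "0" while the driver's mouth is closed (times 660 to 675). When the driver starts to open the mouth, optical flow in the vertical image direction is obtained, so the movement position rises to about 30 pixels (times 675 to 700).
【０２０５】
While the driver keeps the mouth open (times 700 to 710), the movement position stays at about 30 pixels. When the driver starts to close the mouth, the movement position gradually decreases (times 710 to 716). Once the driver has closed the mouth (times 716 to 734), the movement position stays at about 5 pixels. The movement position is about 5 pixels because an error component was detected.
【０２０６】
The lower-lip motion pattern P7 for the horizontal image direction stays near "0" throughout times 660 to 734, because little optical flow is detected in the horizontal image direction. The upper-lip motion patterns P8 and P9 likewise stay near "0" throughout times 660 to 734.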
【０２０７】
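After the upper-lip and lower-lip motion patterns P6 to P9 are obtained, the state detection unit 24 reads the stored motion patterns from the storage unit 24a and compares the patterns P6 to P9 with them to detect the opening and closing of the driver's mouth. The storage unit 24a of the third embodiment stores, as a stored motion pattern, the pattern in which the upper lip is almost still and the lower lip shows a predetermined movement in the vertical image direction. The state detection unit 24 therefore detects the opening or closing motion of the driver's mouth when the correlation between the patterns P6 to P9 and this stored motion pattern is the highest.
【０２０８】
The state signal output unit 25 then outputs an electric signal Sb corresponding to the detection result to the control device 30. When detecting mouth opening and closing, storing data for the pronunciation of "a", "i", "u", "e", and "o" in the storage unit 24a as stored motion patterns allows application to a pronunciation estimation device or the like, for example a voice-input navigation device.
【０２０９】
Storing data on the mouth movement during yawning in the storage unit 24a as a stored motion pattern allows application to a yawn detection device or the like. Yawn detection can in turn be applied to evaluating the driver's alertness, to a dozing detection device, and so on.
【０２１０】
The above detection of mouth opening and closing is relatively accurate, because this embodiment detects the movement of the upper lip and the lower lip separately. If, for example, mouth opening and closing were detected from the movement of the mouth as a whole, it would be hard to tell whether the mouth is opening or closing or the face is moving up and down when the driver moves the face slightly up and down.
【０２１１】
This embodiment, by contrast, focuses on the fact that when a person opens and closes the mouth, the upper lip hardly moves and mainly the lower lip moves, and judges mouth opening and closing from that movement. Relatively accurate detection of the mouth opening and closing motions is therefore possible.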
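The distinction drawn here, lower lip moving while the upper lip stays almost still, can be illustrated with a simple rule on the vertical movement positions of the two lip region groups. This is a simplification for illustration rather than the stored-pattern correlation described in the text, and the thresholds `still_max` and `move_min` and the labels are hypothetical.

```python
def classify_lip_motion(upper_vy, lower_vy, still_max=3, move_min=10):
    """Label one sample from the vertical movement of upper and lower lips."""
    upper_moves = abs(upper_vy) > still_max
    lower_moves = abs(lower_vy) >= move_min
    if lower_moves and not upper_moves:
        return "mouth"   # only the lower lip moves: mouth opening or closing
    if lower_moves and upper_moves:
        return "head"    # both lips move together: face moving up or down
    return "none"
```

With the FIG. 28 example, a lower-lip position near 30 pixels and an upper-lip position near 0 would be labeled as mouth motion.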
【０２１２】
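Next, changes in facial expression are described. To detect them, the image processing unit 22 specifies the position of the face and then the positions of the eyes, nose, and so on. It first specifies the coordinate position of the eyes and then, from positional relationships relative to the eye coordinates, specifies the coordinate positions in the captured image of facial parts such as the nose, mouth, cheeks, and eyebrows.
【０２１３】
After specifying the facial parts, the image processing unit 22 sets reference regions over the whole face in the captured image and sets, for each facial part, a region group containing several reference regions.
【０２１４】
FIG. 29 is an explanatory diagram showing the reference regions and region groups used when detecting expression changes; (a) shows an example of the reference regions and (b) shows an example of the region groups. As shown in FIG. 29(a), the image processing unit 22 sets reference regions in 14 rows by 16 columns so as to cover the whole face.
【０２１５】
As shown in FIG. 29(b), it then sets 11 region groups A5 to K5. Region groups A5 to D5 are set for the right eyebrow, left eyebrow, right eye, and left eye, each containing reference regions in 3 rows by 8 columns. Region groups E5, G5, I5, and J5 are set for the right cheek, left cheek, right jaw, and left jaw, each containing reference regions in 4 rows by 4 columns.
【０２１６】
Region groups F5, H5, and K5 are set for the nose, upper lip, and lower lip, containing reference regions in 3 rows by 8 columns, 1 row by 8 columns, and 4 rows by 4 columns, respectively.
【０２１７】
After setting the reference regions and region groups, the image processing unit 22 obtains the optical flow as in the second embodiment and sends the data to the motion detection unit 23.
【０２１８】
The reference regions are thus set at the positions of the facial parts. The differences from the second embodiment in expression detection do not affect detection of the driver states to the extent that the driver states can no longer be detected.
【０２１９】
After the optical flow is calculated, the motion detection unit 23 obtains the actual motion pattern as in the second embodiment (in particular, as in detecting the direction of the face). The state detection unit 24 then obtains the correlation with the stored motion patterns and detects the body state.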
【０２２０】
ここで、表情の変化を検出する場合に得られるオプティカルフロー及び実動作パターンを説明する。図３０は、表情の変化を検出する場合に得られるオプティカルフローの例を示す説明図である。また、図３１は、図３０に示したオプティカルフローを簡略化して示す説明図である。なお、図３０及び図３１においては、運転者が眉をひそめる動作をする場合のオプティカルフローを示している。
【０２２１】
まず、図３０（ａ）に示すように、時刻ｔにおいて運転者の表情は通常の状態となっている。その後、時刻（ｔ＋１）において運転者が眉をひそめ始める。このとき、図３０（ｂ）に示すように、眉及び目付近にオプティカルフローが検出される。そして、時刻（ｔ＋２）において運転者が眉をひそめると、図３０（ｃ）に示すように、オプティカルフローが検出されなくなる。
【０２２２】
時刻ｔ〜（ｔ＋２）までの様子を図３１に示す。同図に示すように、運転者が眉をひそめる動作を行うと、眉の位置はやや画像縦方向に移動する傾向がある。また、目にも僅かな動きが見られる。
【０２２３】
図３２は、表情の変化を検出する場合に得られる実動作パターンの例を示す説明図である。なお、図３２では、図３０及び図３１にて示した眉をひそめる動作をしたときに得られるパターンを示している。
【０２２４】
As is clear from FIGS. 30 and 31, when the driver frowns, optical flows are detected near the eyebrows and eyes. The resulting actual motion patterns are therefore as shown in FIG. 32.
【０２２５】
図３２に示すように、時刻ｔ〜（ｔ＋２）を通じて、眉及び目にオプティカルフローが得られることから、これらの移動位置にそれぞれ変化が見られる。これに対し、眉及び目以外の顔部位については殆ど変化が見られない。
【０２２６】
上記のような顔の特徴部位毎に実動作パターンが得られた後、状態検出部２４は、記憶部２４ａから複数の記憶動作パターンを読み出す。そして、状態検出部２４は、顔の特徴部位毎に実動作パターンと、記憶部２４ａに記憶される記憶動作パターンとに基づいて、運転者の表情を検出することとなる。
【０２２７】
そして、状態信号出力部２５は、検出結果に応じた電気信号Ｓｂを制御装置３０に出力する。
【０２２８】
ここで、表情の変化についての記憶動作パターンを各表情毎に記憶させておくことが望ましい。この場合、種々の表情を検出することが可能となる。このため、例えば、従来では区別することが困難である笑っている状態と目を細めている状態との判別が可能となる。
【０２２９】
なお、本実施形態では、表情の変化の検出を車両内にて行っており、且つ制御装置により検出結果に基づく制御を行うため、より有用なものとなっている。例えば、運転者が眉をひそめる動作を行った場合には、制御装置３０にて電動サンシェードを制御することができる。また、表情の変化から運転者の感情を検出して、イライラ状態にある運転者の感情を沈静させるなど、制御装置３０にてオーディオ装置を制御することができる。このように、表情の変化に基づいて運転者の感情等を考慮した車両内環境の制御を行うことができるため、非常に有用なものとなっている。
【０２３０】
ここで、表情を認識するのみの装置は、特開平４−３４２０７８号公報に開示されている。本実施形態では、この従来技術と同様の方法にて表情を検出することもできる。
【０２３１】
以上が身体状態の検出の説明である。また、第３実施形態においては、第２実施形態と同様に、検出した結果に基づいて、他の運転者状態又は／及び身体状態についての検出を抑止する機能を有している。このため、状態検出部２４は、まず、画像処理部２２により求められたオプティカルフローから、検出対象のうち少なくとも１つの状態を検出する。そして、その結果に基づいて、抑止信号の送出の可否を判定し、条件を満たしていれば抑止信号を動作検出部２３に送出することとなる。
【０２３２】
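In other words, at least two of the three driver states and the three body states are set as detection targets, at least one of these detection targets is detected, and, based on the detection result, detection of the states other than the detected one is suppressed.
【０２３３】
As in the second embodiment, this prevents erroneous detection of the other states.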
【０２３４】
このようにして、第３実施形態に係る状態検出装置２０ｂでは、第２実施形態と同様に、費用面及び汎用性の面での向上を図ることができ、誤検出してしまうことを防止することができる。
【０２３５】
また、第２実施形態と同様に、利便性を向上させることができ、ノイズ等による影響を軽減させることができる。
【０２３６】
また、運転開始時であっても運転者の状態等を検出することができる。
【０２３７】
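In addition, an increase in the amount of computation can be prevented, and the possibility that a plurality of characteristic parts fall within a single reference area at the same time can be reduced. Also, because the reference areas are not set too small, an area containing no characteristic part can be avoided.
【０２３８】
In addition, the driver state can be detected suitably, and inaccurate detection can be prevented.
【０２３９】
In the state detection device 20b according to the present embodiment, a blink of the driver is detected when the eyelid motion pattern shows a predetermined movement in the vertical direction of the image and then shows a return by that predetermined movement. Thus, the present embodiment detects blinks by a new technique based on optical flow. Consequently, when a detection target is detected by combining two or more techniques, for example, blink detection with high overall detection efficiency can be achieved. The opening and closing motions of the eyes can likewise be detected accurately.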
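The blink rule stated above (a predetermined vertical movement followed by a return of the same amount) can be sketched as follows. This is a hedged illustration only: the function name `detect_blink` and the thresholds `min_travel` and `return_tol` are hypothetical, and the input is assumed to be the accumulated vertical eyelid position per frame, in pixels.

```python
def detect_blink(eyelid_y, min_travel=3.0, return_tol=1.0):
    """Return True if the eyelid moved away vertically and then came back.

    eyelid_y: vertical eyelid position per frame (assumed input format).
    min_travel and return_tol are illustrative thresholds, not values
    taken from the specification.
    """
    if len(eyelid_y) < 3:
        return False
    start = eyelid_y[0]
    # Frame at which the eyelid is farthest from its starting position.
    peak_index = max(range(len(eyelid_y)), key=lambda i: abs(eyelid_y[i] - start))
    moved = abs(eyelid_y[peak_index] - start) >= min_travel
    returned = abs(eyelid_y[-1] - start) <= return_tol
    # The excursion must lie strictly inside the observation window.
    return moved and returned and 0 < peak_index < len(eyelid_y) - 1


print(detect_blink([0, 2, 4, 2, 0]))
```

A closing motion without a return (for example `[0, 2, 4, 4, 4]`) is rejected, which matches the distinction drawn between a blink and a plain eye-closing motion.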
【０２４０】
また、口の開閉の検出は、比較的精度の高いものとなっている。すなわち、本実施形態では、人が口を開閉させる際に、上唇が殆ど動かず、主に下唇が動くという動作に着目し、この動きを検出して口の開閉を判断している。このため、顔を上下させたときと口の開閉との区別が明確となっている。よって、比較的精度の高い口の開動作及び閉動作の検出が可能となっている。
【０２４１】
また、顔の特徴部位である目や鼻等の実動作パターンと、予め記憶される記憶動作パターンとに基づいて、運転者の表情を検出している。また、本実施形態では、この表情の変化の検出を車両内にて行っており、且つ制御装置により検出結果に基づく制御を行っている。このため、例えば運転者が眉をひそめる動作を行った場合には、制御装置にて電動サンシェードを制御することができる。また、表情の変化から運転者の感情を検出して、イライラ状態にある運転者の感情を沈静させるなど、制御装置にてオーディオ装置を制御することができる。このように、表情の変化に基づいて運転者の感情等を考慮した車両内環境の制御を行うことができる。
【０２４２】
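In the present embodiment, it suffices that at least two of the three driver states and the three body states are set as detection targets, so three or more may also be used. Furthermore, since detection of other states is suppressed based on at least one detection result, detection of two of the remaining states may, for example, be suppressed based on the detection results for three of the six states. The number of detection results and the number of states whose detection is suppressed are not limited to these and can be changed as appropriate.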
【０２４３】
次に本発明の第４実施形態を説明する。第４実施形態では、第３実施形態の構成に加えて、新たに車両状態検出手段と環境情報検出手段とを備えている。以下、第４実施形態について説明する。
【０２４４】
図３３は、第４実施形態に係る状態検出装置を含む状態検出システムの構成を示すブロック図である。同図に示すように、状態検出システム１ｃは、車両の状態を検出する車両状態検出手段４０と、車両の周囲環境を検出する環境情報検出手段５０とを備えている。
【０２４５】
具体的に車両状態検出手段４０は、車速や、ブレーキスイッチのオン／オフ情報、アクセルスイッチのオン／オフ情報、操舵角、シフトレンジ情報等の車両に関する状態を１つ以上検出するものである。
【０２４６】
環境情報検出手段５０は、ＧＰＳやジャイロを利用したナビゲーションシステムによる位置情報を取得し、例えば、走行中の道路の種別や交差点の有無等を検出するものである。
【０２４７】
また、環境情報検出手段５０は、可視光カメラ、遠赤外線検出素子、レーザーレーダー及び超音波センサの１つ以上から構成されて、車両周辺の情報を検出するものである。この構成により、環境情報検出手段５０は、例えば、先行車や障害物の有無・接近、歩行者の横断、後続車の接近、側後方からの接近車両等を検出する。
【０２４８】
さらに、環境情報検出手段５０は、気象情報や、天候、照度計による外の明るさや昼夜の区別等の情報を得るものでもある。
【０２４９】
また、状態検出装置２０ｃは、車両状態検出手段４０からの信号Ｓｃと、環境情報検出手段５０からの信号Ｓｄとの少なくとも一方に基づいて、検出すべき状態（運転者状態、身体状態）を変更する機能を有している。
【０２５０】
例えば、状態検出装置２０ｃは、ナビゲーションによる地図情報から、見通しの悪い交差点や信号のない交差点に差し掛かっているという環境信号Ｓｄに基づいて、運転者の顔の向きを検出対象とする。
【０２５１】
また、状態検出装置２０ｃは、車速が設定速度以下であるという車両の状態信号Ｓｃに基づいて、渋滞を判断し、運転者が眠気を感じているか等を検出すべく、表情の変化を検出対象とする。
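The two examples above (an intersection with poor visibility selects face-direction detection; a vehicle speed at or below a set speed selects facial-expression detection) can be sketched as a small selection function. The names `select_detection_targets`, `approaching_blind_intersection`, and the 10 km/h default are hypothetical placeholders for the signals Sc and Sd; the specification does not give concrete values.

```python
def select_detection_targets(vehicle_speed_kmh,
                             approaching_blind_intersection,
                             jam_speed_kmh=10.0):
    """Choose which states to detect from vehicle and environment signals.

    vehicle_speed_kmh stands in for the vehicle state signal Sc and
    approaching_blind_intersection for the environment signal Sd.
    jam_speed_kmh is an assumed threshold, not a value from the source.
    """
    targets = set()
    if approaching_blind_intersection:
        # Driver should look left and right: watch the face direction.
        targets.add("face_direction")
    if vehicle_speed_kmh <= jam_speed_kmh:
        # Slow traffic suggests congestion: watch for signs of drowsiness.
        targets.add("facial_expression")
    return targets


print(sorted(select_detection_targets(5.0, True)))
```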
【０２５２】
このようにして、本実施形態に係る状態検出装置２０ｃによれば、第３実施形態と同様に、費用面及び汎用性の面での向上を図ることができ、誤検出してしまうことを防止することができる。
【０２５３】
また、第３実施形態と同様に、利便性を向上させることができ、ノイズ等による影響を軽減させることができる。
【０２５４】
また、運転開始時であっても運転者の状態等を検出することができる。
【０２５５】
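In addition, an increase in the amount of computation can be prevented, and the possibility that a plurality of characteristic parts fall within a single reference area at the same time can be reduced. Also, because the reference areas are not set too small, an area containing no characteristic part can be avoided.
【０２５６】
In addition, the driver state can be detected suitably, and inaccurate detection can be prevented.
【０２５７】
In addition, when a detection target is detected by combining two or more techniques, for example, blink detection with high overall detection efficiency can be achieved. The opening and closing motions of the eyes can likewise be detected accurately.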
【０２５８】
また、比較的精度の高い口の開動作及び閉動作の検出が可能となっており、さらには、表情の変化に基づいて運転者の感情等を考慮した車両内環境の制御を行うことができる。
さらに、車両状態検出手段４０からの信号Ｓｃ、及び環境情報検出手段５０からの信号Ｓｄとの少なくとも一方に基づいて、検出すべき状態を変更する。このため、各状態・環境に応じて適切な運転者・身体状態を検出することができる。
【図面の簡単な説明】
【図１】本発明の第１実施形態に係る状態検出装置を含む状態検出システムの構成を示すブロック図である。
【図２】図１に示した状態検出装置２０の詳細構成を示すブロック図である。
【図３】本実施形態に係る状態検出装置２０の動作の概略を示すデータフローダイヤグラムである。
【図４】本実施形態に係る状態検出装置２０の動作の概略を示す説明図である。
【図５】参照領域及び探索領域の説明図である。
【図６】撮像画像に規則的に配置される参照領域の説明図であり、（ａ）は参照領域を画像横方向に配置したときの例を示し、（ｂ）は参照領域を格子状に配置したときの例を示し、（ｃ）は参照領域を画像横方向且つ格子状に配置したときの例を示している。
【図７】領域グループの説明図であり、顔の向きを検出する場合の例を示している。
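[FIG. 8] An explanatory diagram of area groups, showing an example for detecting entry and exit of something other than the driver's face into and out of the imaging range.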
【図９】図２に示した画像処理部２２の動作を示すフローチャートである。
【図１０】図９に示すステップＳＴ１１における移動量（ｘｄ，ｙｄ）の算出方法の説明図である。
【図１１】図９に示すステップＳＴ１２の処理の説明図である。
【図１２】運転者の顔の向きを検出する場合のオプティカルフローの例を示す説明図であり、（ａ）は時刻ｔにおけるオプティカルフローの例を示し、（ｂ）は時刻（ｔ＋１）におけるオプティカルフローの例を示し、（ｃ）は時刻（ｔ＋２）におけるオプティカルフローの例を示し、（ｄ）は時刻（ｔ＋３）におけるオプティカルフローの例を示している。
【図１３】運転者の有無を検出する場合のオプティカルフローの例を示す説明図であり、（ａ）は乗車前におけるオプティカルフローの例を示し、（ｂ）は乗車最中におけるオプティカルフローの例を示し、（ｃ）は乗車完了後におけるオプティカルフローの例を示している。
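[FIG. 14] An explanatory diagram showing an example of optical flow when detecting entry and exit of something other than the driver's face into and out of the imaging range, for the case where the driver moves a hand near the eyes; (a) shows an example of the optical flow at time t, (b) at time (t+1), and (c) at time (t+2).
[FIG. 15] An explanatory diagram showing an example of optical flow when detecting entry and exit of something other than the driver's face into and out of the imaging range, for the case where the driver lifts a book to look at a road map or the like; (a) shows an example of the optical flow at time t, (b) at time (t+1), and (c) at time (t+2).
[FIG. 16] An explanatory diagram showing an example of optical flow when detecting entry and exit of something other than the driver's face into and out of the imaging range, for the case where a spoke of the steering wheel enters the imaging range; (a) shows an example of the optical flow at time t, (b) at time (t+1), and (c) at time (t+2).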
【図１７】図２に示した動作検出部２３の動作を示すフローチャートである。
【図１８】図２に示した動作検出部２３により得られる実動作パターンの説明図であり、検出対象が運転者の顔の向きである場合を示している。
【図１９】図２に示した動作検出部２３により得られる実動作パターンの説明図であり、検出対象が運転者の顔以外のものの撮像範囲への出入である場合を示している。
【図２０】検出対象が運転者の有無である場合の動作検出部２３により得られる実動作パターンの例を示す説明図である。
【図２１】図２に示した状態検出部２４の動作を示すフローチャートである。
【図２２】状態検出部２４が行う抑止制御処理を示すフローチャートである。
【図２３】瞼の開閉を検出する場合の参照領域及び領域グループを示す説明図であり、（ａ）は参照領域の例を示し、（ｂ）は領域グループの例を示している。
【図２４】瞼の開閉を検出する場合に得られるオプティカルフローの例を示す説明図であり、（ａ）は時刻ｔにおけるオプティカルフローの例を示し、（ｂ）は時刻（ｔ＋１）におけるオプティカルフローの例を示し、（ｃ）は時刻（ｔ＋２）におけるオプティカルフローの例を示している。
【図２５】瞼の開閉を検出する場合に得られる実動作パターンの例を示す説明図である。
【図２６】口の開閉を検出する場合の参照領域及び領域グループを示す説明図であり、（ａ）は参照領域の例を示し、（ｂ）は領域グループの例を示している。
【図２７】口の開閉を検出する場合に得られるオプティカルフローの例を示す説明図であり、（ａ）は時刻ｔにおけるオプティカルフローの例を示し、（ｂ）は時刻（ｔ＋１）におけるオプティカルフローの例を示し、（ｃ）は時刻（ｔ＋２）におけるオプティカルフローの例を示している。
【図２８】口の開閉を検出する場合に得られる実動作パターンの例を示す説明図である。
【図２９】表情の変化を検出する場合の参照領域及び領域グループを示す説明図であり、（ａ）は参照領域の例を示し、（ｂ）は領域グループの例を示している。
【図３０】表情の変化を検出する場合に得られるオプティカルフローの例を示す説明図であり、（ａ）は時刻ｔにおけるオプティカルフローの例を示し、（ｂ）は時刻（ｔ＋１）におけるオプティカルフローの例を示し、（ｃ）は時刻（ｔ＋２）におけるオプティカルフローの例を示している。
【図３１】図３０に示したオプティカルフローを簡略化して示す説明図である。
【図３２】表情の変化を検出する場合に得られる実動作パターンの例を示す説明図である。
【図３３】第４実施形態に係る状態検出装置を含む状態検出システムの構成を示すブロック図である。
【符号の説明】
１〜１ｃ…状態検出システム
１０…撮像装置
２０〜２０ｃ…状態検出装置
２１…画像取得部（画像取得手段）
２２…画像処理部（画像処理手段）
２３…動作検出部（動作検出手段）
２４…状態検出部（状態検出手段）
２４ａ…記憶部（パターン記憶手段）
２５…状態信号出力部（信号出力手段）
３０…制御装置（制御手段）
４０…車両状態検出手段
５０…環境情報検出手段
Ａ〜Ｋ…領域グループ
Ｃ１，Ｃ２…変化量
Ｄ…記憶動作パターン
Ｐ…実動作パターン
Ｐ４，Ｐ５…瞼動作パターン
Ｐ６，Ｐ７…下唇動作パターン
Ｐ８，Ｐ９…上唇動作パターン
Ｓａ…ビデオ信号
Ｓｂ…電気信号
Ｓｃ…状態信号
Ｓｄ…環境信号
[0001]
[Technical Field of the Invention]
The present invention relates to a state detection device and a state detection system.
[0002]
[Prior art]
Conventionally, there has been known a state detection device that captures an image of a driver's body with an imaging unit and detects the state of the driver or the like based on the obtained image.
[0003]
One such device, for example, detects the coordinates of the eyes in the image based on features of the eyes, detects opening and closing of the eyelids from the amount of change in the vertical width of the eyes, and detects dozing by determining the driver's arousal level from the duration and frequency of eye closure (see, for example, Patent Document 1).
[0004]
There is also known a state detection device that uses difference images of successively obtained face images to estimate the direction of the driver's face and detect looking aside, or to detect a state in which face movement decreases and thereby detect a decline in the driver's consciousness (see, for example, Patent Document 2).
[0005]
Further, there is known a state detection device that predicts the region the driver should look at based on the traveling environment of the vehicle, detects the driver's line-of-sight region, and detects whether the driver has looked at that region (see, for example, Patent Document 3). This device also has a function of notifying the driver when the driver has not visually checked the region to be viewed.
[0006]
Further, a state detection device that detects a driver's riding posture by using a difference image of a driver's body image obtained continuously is known (for example, see Patent Document 4).
[0007]
Further, there is known a state detection device that reads a facial expression transition from a continuous face image of a driver using a facial expression transition map created in advance and detects a driver's arousal level (for example, see Patent Document 5).
[0008]
As described above, in the conventional devices, it is possible to detect the state of the driver or the like.
[0009]
[Patent Document 1]
JP-A-10-40361
[0010]
[Patent Document 2]
JP-A-11-161798
[0011]
[Patent Document 3]
JP-A-2002-83400
[0012]
[Patent Document 4]
JP-A-2000-113164
[0013]
[Patent Document 5]
JP-A-2001-43345
[0014]
[Problems to be solved by the invention]
In the above-described state detection devices, an optimum method is selected according to the state to be detected, and a device or the like that performs that method is mounted. Specifically, conventional state detection devices use different image processing methods depending on the state to be detected. For this reason, a given state detection device can often detect only a specific state among various states, and to detect various states the apparatus must be built with devices for performing a plurality of image processing methods.
[0015]
However, when a plurality of image processes are executed as described above, the cost may be high and simultaneous execution may not be possible depending on the contents of the image processing.
[0016]
[Means for Solving the Problems]
According to the present invention, the image processing means obtains an optical flow between captured images based on captured images obtained by imaging, in time series, a position where the driver's body is present when the driver is seated. The state detection means then detects, from the optical flow obtained by the image processing means and without specifying the position of the driver's body in the captured images, at least one of three driver states as a detection target: the direction of the driver's face, entry and exit of something other than the driver's face into and out of the imaging range, and the presence or absence of the driver.
[0017]
[Effects of the Invention]
According to the present invention, a single image processing method using optical flow is used to detect the direction of the driver's face, entry and exit of something other than the driver's face into and out of the imaging range, and the presence or absence of the driver.
[0018]
Therefore, when a device that detects one of the above three driver states has been built and it is later desired to upgrade the device to detect another driver state, only the processing part that is not common needs to be added. As a result, compared with incorporating a device that performs completely different processing at the time of the upgrade, the cost does not increase, and a situation in which simultaneous execution is impossible depending on the content of the image processing is unlikely to arise.
[0019]
In addition, when two or more of the three driver states are detected, the image processing method is common, so that, compared with mounting devices that perform different image processing, the cost does not increase, and a situation in which simultaneous execution becomes impossible depending on the content of the image processing is unlikely to arise.
[0020]
Therefore, cost and versatility can be improved.
[0021]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, a preferred embodiment of the present invention will be described with reference to the drawings. In the following embodiment, a case where the state detection system is mounted on a vehicle will be described as an example. In the following description, the moving amount includes the moving speed and the moving direction. Further, this movement amount is referred to as an optical flow.
[0022]
FIG. 1 is a block diagram showing a configuration of a state detection system including a state detection device according to the first embodiment of the present invention. In the first embodiment, at least one of the three driver states, ie, the orientation of the driver's face, the movement of objects other than the driver's face into and out of the imaging range, and the presence or absence of the driver, is set as the detection target. A state detection system for detection will be described as an example.
[0023]
As shown in FIG. 1, the state detection system 1 according to the present embodiment detects the direction of the driver's face and the like, and includes an imaging device (imaging means) 10, a state detection device 20, and a control device (control means) 30.
[0024]
The imaging device 10 has an imaging range that includes the position where the driver's body is present when the driver is seated, and images that range in time series. Specifically, the imaging device 10 consists of at least one of a CCD or CMOS camera that images visible light, a camera that images near-infrared light, and a camera that images, in the far infrared, heat emitted by a person or the like.
[0025]
The imaging device 10 is installed, for example, below and in front of the driver, acquires an image including the driver's head, and sends the data of the captured image to the state detection device 20 as a video signal Sa. When only the presence or absence of the driver is to be detected, the imaging device 10 may be configured to image the driver's torso; in the following, however, the imaging device 10 is assumed to image the driver's head.
[0026]
The state detection device 20 executes a predetermined process based on data of a captured image from the imaging device 10 and detects at least one of three driver states.
[0027]
FIG. 2 is a block diagram showing the detailed configuration of the state detection device 20 shown in FIG. 1.
[0028]
The state detection device 20 includes an image acquisition unit (image acquisition means) 21 that receives the video signal Sa, which is the captured-image data from the imaging device 10. The state detection device 20 also includes an image processing unit (image processing means) 22 that performs image processing on the captured-image data received by the image acquisition unit 21 and obtains an optical flow between the captured images. It further includes a motion detection unit (motion detection means) 23 that detects the driver's motion from the obtained optical flow, and a state detection unit (state detection means) 24 that detects at least one of the three driver states. In addition, the state detection device 20 includes a state signal output unit (signal output means) 25 that converts the detection result from the state detection unit 24 into an electric signal Sb and outputs it to the outside.
[0029]
Further, the control device 30 performs a predetermined process, for example, a seat belt control process, an airbag control process, an alarm process, and the like, based on the electric signal Sb from the state signal output unit 25.
[0030]
Here, of the units 21 to 25, the image acquisition unit 21 and the image processing unit 22 perform the same processing regardless of which of the three driver states is detected. The basic operations of the image acquisition unit 21 and the image processing unit 22, which are the common part of the processing, and an outline of the operations of the motion detection unit 23, the state detection unit 24, and the state signal output unit 25 are described next with reference to FIGS. 3 and 4. FIG. 3 is a data flow diagram illustrating an outline of the operation of the state detection device 20 according to the embodiment, and FIG. 4 is an explanatory diagram illustrating an outline of that operation.
[0031]
First, an image including the driver's face is captured by the imaging device 10 (the image illustrated in FIG. 4A), and the image is input to the image acquisition unit 21 as a video signal Sa.
[0032]
When the video signal Sa is input from the imaging device 10, the image acquisition unit 21 converts it into two-dimensional digital data 320 pixels wide and 240 pixels high, with 8-bit (256-gradation) grayscale data per pixel. After the conversion, the image acquisition unit 21 stores the converted data in a storage area and outputs the stored captured-image data to the image processing unit 22.
[0033]
The image processing unit 22 obtains an optical flow between the captured images based on the captured-image data from the image acquisition unit 21 (FIG. 4B). At this time, the image processing unit 22 receives area data and obtains an optical flow for each area defined by the area data (each computation area). The image processing unit 22 then sends the obtained optical flow data for each area to the motion detection unit 23.
[0034]
Here, the areas and the area data will be described. The area data is data indicating the position and size that define an area in a captured image. One or more areas are set, based on the area data, in images acquired at different times; specifically, the term refers to the reference areas and search areas described below.
[0035]
FIG. 5 is an explanatory diagram of the reference area and the search area. The reference area and the search area are set in captured images at different times, but in FIG. 5, for convenience, they are drawn and described on a single image w pixels wide and h pixels high.
[0036]
As shown in the drawing, the reference area is an area tw pixels wide and th pixels high set around a specific point O. The search area is an area sw pixels wide and sh pixels high, also centered on the point O. A search area is set so as to surround each reference area, and there are as many search areas as reference areas.
[0037]
In this way, the two areas share the same center and satisfy sw > tw and sh > th. The reference areas and search areas are set at predetermined positions and sizes, independent of the position of the driver's face or the like.
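The relationship between a reference area and its surrounding search area can be sketched with a small data structure. This is an illustrative sketch only: the names `Area` and `make_area_pair` are hypothetical, and the pixel sizes in the usage line are arbitrary examples, not values from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """Axis-aligned rectangle described by its center point and size."""
    cx: int
    cy: int
    width: int
    height: int

    def bounds(self):
        """Return (left, top, right, bottom) with right/bottom exclusive."""
        left = self.cx - self.width // 2
        top = self.cy - self.height // 2
        return left, top, left + self.width, top + self.height


def make_area_pair(cx, cy, tw, th, sw, sh):
    """Build a reference area and a search area sharing the center O."""
    # The search area must be larger than the reference area in both axes.
    if not (sw > tw and sh > th):
        raise ValueError("search area must be larger than reference area")
    return Area(cx, cy, tw, th), Area(cx, cy, sw, sh)


reference, search = make_area_pair(160, 120, tw=16, th=16, sw=32, sh=32)
print(reference.bounds(), search.bounds())
```

Because both rectangles are fixed in advance, they can be created once at start-up rather than recomputed per frame, which is consistent with the statement that they do not depend on the position of the driver's face.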
[0038]
Further, it is desirable that the reference areas are regularly arranged. FIG. 6 is an explanatory diagram of reference regions regularly arranged in a captured image. For example, as shown in FIG. 6A, a plurality of (for example, seven) reference regions are arranged in the horizontal direction on the captured image. Further, as shown in FIG. 6B, a plurality of reference regions (for example, 5 rows and 7 columns) are arranged in a grid pattern on the captured image. Furthermore, as shown in FIG. 6C, a plurality of reference areas may be arranged in a horizontal direction and in a lattice shape (for example, two rows 17 in addition to three rows and five columns).
[0039]
Further, it is desirable that the reference area be fixedly set to about the size of a face part such as an eye, the nose, or the mouth, based on the position of the camera, the angle of view of the camera, the proportion of the captured image occupied by the face, and the like.
[0040]
The description returns to FIGS. 3 and 4. After the optical flow is calculated, the motion detection unit 23 obtains the driver's motion, that is, the actual motion pattern, from the optical flow of each area obtained by the image processing unit 22 (FIG. 4C). At this time, the motion detection unit 23 obtains an actual motion pattern for each area group. The motion detection unit 23 then sends the data of the obtained actual motion patterns to the state detection unit 24.
[0041]
The area group will now be described. An area group includes at least one of the above-described reference areas. Examples of area groups are described with reference to FIGS. 7 and 8, which are explanatory diagrams of area groups. FIGS. 7 and 8 illustrate a case where the reference areas are arranged in a grid pattern (5 rows and 7 columns) on the captured image.
[0042]
First, as shown in FIG. 7, each of the area groups A1 to I1 includes nine reference areas arranged in three rows and three columns. The area group A1 includes the reference areas in the first to third rows, first to third columns. The area groups B1 to D1 include the reference areas in the first to third rows, fifth to seventh columns; the third to fifth rows, first to third columns; and the third to fifth rows, fifth to seventh columns, respectively. The area groups E1 and F1 include the reference areas in the first to third rows, third to fifth columns, and the third to fifth rows, third to fifth columns, respectively. Further, the area groups G1 to I1 include the reference areas in the second to fourth rows, second to fourth columns; the second to fourth rows, fourth to sixth columns; and the second to fourth rows, third to fifth columns, respectively.
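The FIG. 7 layout can be written down as a lookup table of grid cells. The sketch below is an assumption about representation only (1-based `(row, column)` tuples in a 5-row, 7-column grid); the helper names `block` and `groups_containing` are hypothetical, while the row and column ranges are the ones listed in the paragraph above.

```python
def block(rows, cols):
    """All (row, col) cells in the inclusive 1-based ranges rows x cols."""
    return {(r, c)
            for r in range(rows[0], rows[1] + 1)
            for c in range(cols[0], cols[1] + 1)}


# Area groups A1 to I1 of FIG. 7, each a 3x3 block of reference areas.
AREA_GROUPS = {
    "A1": block((1, 3), (1, 3)),
    "B1": block((1, 3), (5, 7)),
    "C1": block((3, 5), (1, 3)),
    "D1": block((3, 5), (5, 7)),
    "E1": block((1, 3), (3, 5)),
    "F1": block((3, 5), (3, 5)),
    "G1": block((2, 4), (2, 4)),
    "H1": block((2, 4), (4, 6)),
    "I1": block((2, 4), (3, 5)),
}


def groups_containing(row, col):
    """Names of all area groups that include the given reference area."""
    return sorted(name for name, cells in AREA_GROUPS.items()
                  if (row, col) in cells)


print(groups_containing(3, 4))
```

Note that the groups overlap, so a single reference area can contribute to the actual motion pattern of several groups.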
[0043]
Further, as shown in FIG. 8, each of the area groups A2 to H2 may include three to five reference areas. In this example, the area group A2 includes the reference areas in the first row, first and second columns, and the second row, first column. The area groups B2 to D2 cover the reference areas listed as the first row, sixth and seventh columns; the second row, seventh column; the fourth row, first column; the fifth row, first column; the fourth row, seventh column; and the fifth row, sixth column. The area groups E2 and F2 include the reference areas in the first row, second to sixth columns, and the fifth row, second to sixth columns, respectively. Further, the area groups G2 and H2 include the reference areas in the second to fourth rows, first column, and the second to fourth rows, seventh column, respectively.
[0044]
As described above, the area group is set to a captured image with a size including at least one reference area. Then, the motion detection unit 23 determines an actual motion pattern for each region group.
[0045]
After the pattern is calculated, the state detection unit 24 detects at least one of the three driver states based on the actual motion pattern and the stored motion patterns. Specifically, the state detection unit 24 calculates the correlation between the actual motion pattern and each of a plurality of stored motion patterns held in advance, and takes the stored motion pattern with the highest correlation as the detection result (FIG. 4D).
[0046]
Here, the plurality of stored motion patterns consist of feature amounts obtained in advance from actual driver movements, and are stored in a storage unit (pattern storage means) 24a provided inside the state detection unit 24. The state detection unit 24 reads the stored motion patterns from the storage unit 24a and compares them with the obtained actual motion pattern. The state detection unit 24 then outputs the detection result obtained by the comparison to the state signal output unit 25.
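The comparison step (correlate the actual pattern with each stored pattern and keep the best match) can be sketched as follows. The specification does not say which correlation measure is used, so the Pearson correlation coefficient here is an assumption, and the names `correlation`, `best_match`, and the example pattern labels are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A constant sequence has no variation; treat it as uncorrelated.
    return num / den if den else 0.0


def best_match(actual, stored_patterns):
    """Return the label of the stored pattern most correlated with actual."""
    return max(stored_patterns,
               key=lambda label: correlation(actual, stored_patterns[label]))


stored = {
    "face_left": [0.0, -1.0, -2.0, -3.0],
    "face_right": [0.0, 1.0, 2.0, 3.0],
    "no_motion": [0.0, 0.0, 0.0, 0.0],
}
print(best_match([0.0, -0.9, -2.1, -3.2], stored))
```

A practical system would probably also require the best correlation to exceed some minimum before reporting a state; that refinement is omitted here because the source does not describe it.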
[0047]
The state signal output unit 25 converts the detection result from the state detection unit 24 into an electric signal Sb and outputs the signal to the outside. Then, the control device 30 that has received the electric signal Sb performs various operations based on the signal.
[0048]
Next, an operation of the state detection device 20 according to the first embodiment will be described in detail with reference to FIGS.
[0049]
FIG. 9 is a flowchart showing the operation of the image processing unit 22 shown in FIG. 2.
[0050]
First, the image processing unit 22 inputs a video signal Sa that is data of a captured image from the image acquisition unit 21. Then, the image processing unit 22 applies a smoothing filter to the captured image from the image acquisition unit 21 and converts a pixel value by a predetermined formula (ST10). Here, the smoothing filter is a filter having the following five rows and five columns.
[0051]
(Equation 1)

The predetermined equation is shown below.
[0052]
(Equation 2)

Note that d (x, y) is a pixel value at an arbitrary position in the captured image, and d ′ (x, y) is a pixel value after conversion.
[0053]
Thereafter, the image processing unit 22 calculates the movement amount (xd, yd), that is, the optical flow, by finding, within the search area of the current captured image, the position most similar to the reference area of the previous captured image (ST11). Specifically, the image processing unit 22 first finds the area most similar to the reference area within the search area, and takes the center point of that area as the position most similar to the reference area. The image processing unit 22 then calculates the movement amount (xd, yd) from the center point of the most similar area and the center point of the search area, and treats this movement amount as the optical flow.
[0054]
Here, step ST11 will be described in detail. As described above, a plurality of reference areas are set in advance on the captured image. The search area is set so as to surround each reference area. The reference region and the search region are set at different times. Specifically, as shown in FIG. 10, the reference area is set at time t, and the search area is set at time (t + 1) after time t.
[0055]
FIG. 10 is an explanatory diagram of the method of calculating the movement amount (xd, yd) in step ST11 shown in FIG. 9. In step ST11, the image processing unit 22 first creates a candidate area, which has the same size as the reference area. The image processing unit 22 then places the candidate area at a predetermined position in the search area and compares it with the reference area to obtain a similarity. Next, the image processing unit 22 moves the candidate area to another position and calculates the similarity between the candidate area at the new position and the reference area.
[0056]
Thereafter, the image processing unit 22 sequentially moves the candidate area, and calculates the degree of similarity with the reference area at each location in the search area. The similarity is determined based on, for example, grayscale data. Here, when calculating the similarity based on the grayscale data, assuming that the similarity is cos θ, the similarity is expressed by the following equation.
[0057]
[Equation 3]

In the above equation, T is the grayscale data of the reference area, and S is the grayscale data of the candidate area. Further, xd indicates an X coordinate value in the search area, and yd indicates a Y coordinate value in the search area.
[0058]
From the above, the image processing unit 22 determines the position S where the similarity is maximum, acquires the difference between the coordinate values of the point S and the point O as the movement amount (xd, yd), and sets this as the optical flow.
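Step ST11 can be sketched as exhaustive block matching. The formula referred to as Equation 3 is not reproduced in this text, so the sketch assumes that "similarity cos θ" means the cosine of the angle between the reference data T and the candidate data S treated as vectors. The function names, the list-of-lists image format, and the requirement that the search area lie fully inside the image are all assumptions.

```python
import math


def cosine_similarity(t, s):
    """cos(theta) between two equal-size grayscale patches (assumed form)."""
    num = sum(a * b for row_t, row_s in zip(t, s) for a, b in zip(row_t, row_s))
    norm_t = math.sqrt(sum(a * a for row in t for a in row))
    norm_s = math.sqrt(sum(b * b for row in s for b in row))
    return num / (norm_t * norm_s) if norm_t and norm_s else 0.0


def crop(image, left, top, width, height):
    """Cut a width x height patch whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in image[top:top + height]]


def optical_flow(prev, curr, cx, cy, tw, th, sw, sh):
    """Return (xd, yd, scores) for one reference area centered at (cx, cy).

    The caller must keep the search area inside the image; negative
    indices would silently wrap around in Python slicing.
    """
    left, top = cx - tw // 2, cy - th // 2
    reference = crop(prev, left, top, tw, th)
    max_dx, max_dy = (sw - tw) // 2, (sh - th) // 2
    best_score, best_xd, best_yd = -1.0, 0, 0
    scores = []
    # Slide the candidate area over every position in the search area.
    for yd in range(-max_dy, max_dy + 1):
        for xd in range(-max_dx, max_dx + 1):
            candidate = crop(curr, left + xd, top + yd, tw, th)
            score = cosine_similarity(reference, candidate)
            scores.append(score)
            if score > best_score:
                best_score, best_xd, best_yd = score, xd, yd
    return best_xd, best_yd, scores
```

The returned `scores` list is kept so that a later step can judge how distinctive the best match is.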
[0059]
The description now returns to FIG. 9. After calculating the movement amount (xd, yd), the image processing unit 22 determines whether the range of the similarity is equal to or larger than a threshold (ST12).
[0060]
This determination is described with reference to FIG. 11, which is an explanatory diagram of the process of step ST12 shown in FIG. 9. The image processing unit 22 scans the inside of the search area using the candidate area and calculates the similarity at each point in the search area. The image processing unit 22 then obtains the variance of the obtained similarities.
[0061]
For example, as shown in FIG. 11, when the similarity at each location is plotted as a variation, the variation C1 has a small variance value and a narrow spread. The variation C2, in contrast, has a larger variance value than C1 and a wider spread.
[0062]
Here, a narrow spread means that similar similarity values are obtained at every point in the search area. For example, when the reference area has few features, such as a pure white image, nearly the same similarity is obtained no matter which part of the search area it is compared with. In such a case, since the differences between the similarities are small, detection of the point S with the maximum similarity tends to be inaccurate. For this reason, in step ST12 of FIG. 9, the spread is compared with a predetermined threshold to separate suitable reference areas from unsuitable ones.
[0063]
The description again returns to FIG. 9. When it is determined that the range of the similarity is equal to or larger than the threshold (ST12: YES), the image processing unit 22 sets the reference area as a valid area and substitutes “1” into fd (ST13). The process then proceeds to step ST15.
[0064]
On the other hand, when it is determined that the range of the similarity is less than the threshold (ST12: NO), the image processing unit 22 sets the reference area as an invalid area and substitutes “0” into fd (ST14). The process then proceeds to step ST15. In this way, the image processing unit 22 decides whether to use each optical flow in later calculations by comparing the spread of the similarity (one of the feature amounts) with a preset threshold.
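Steps ST12 to ST14 can be sketched as a single function that turns the similarity scores of one reference area into the flag fd. The text speaks of both the "variance" and the "range" of the similarity; this sketch assumes the population variance is the quantity compared with the threshold. The function name `validity_flag` and the default threshold are hypothetical.

```python
from statistics import pvariance


def validity_flag(scores, threshold=1e-3):
    """Return fd: 1 for a valid reference area, 0 for an invalid one.

    scores: similarity values measured at every candidate position in
    the search area. threshold is an illustrative value only.
    """
    # Nearly identical scores everywhere mean the area has few features,
    # so the position of the maximum cannot be trusted.
    return 1 if pvariance(scores) >= threshold else 0


print(validity_flag([0.90, 0.90, 0.90]), validity_flag([0.20, 0.90, 0.50]))
```

Only optical flows whose flag is 1 would then be passed on to the per-group motion pattern calculation.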
[0065]
In step ST15, the image processing unit 22 determines whether or not steps ST11 to ST14 have been performed by the number of regions (ST15). That is, the image processing unit 22 determines whether or not a similar position has been specified from within the search area for all the reference areas.
[0066]
If it is determined that a similar position has not yet been specified within the search area for some of the reference areas (ST15: NO), the process returns to step ST11, and steps ST11 to ST14 are repeated.
[0067]
On the other hand, when it is determined that similar positions have been specified within the search areas for all the reference areas (ST15: YES), the image processing unit 22 transmits the optical flow data for each reference area to the motion detection unit 23. The processing by the image processing unit 22 then ends.
[0068]
Note that the operation of the image processing unit 22 shown in FIG. 9 described above is common even when any one of the three driver states is detected.
[0069]
Here, an example of an optical flow in each of the three driver states will be described. FIG. 12 is an explanatory diagram illustrating an example of an optical flow when detecting the orientation of the driver's face, and FIG. 13 is an explanatory diagram illustrating an example of an optical flow when detecting the presence or absence of the driver. FIG. 14 to FIG. 16 are explanatory diagrams showing examples of optical flows when detecting entry and exit of things other than the driver's face into the imaging range. FIG. 14 shows an example of an optical flow when the driver moves his hand near the eyes, and FIG. 15 shows an optical flow when the driver lifts a book to see a road map or the like. Is shown. FIG. 16 shows an example of an optical flow when the spoke portion of the handle enters the imaging range.
[0070]
First, a description will be given with reference to FIG. 12. At time t, the driver is looking ahead (FIG. 12A). Thereafter, at time (t + 1), the driver turns his or her face to the left to check an intersection or the like. At this time, an optical flow is detected (FIG. 12B). Here, each square area in the image is a reference area, and the line segment extending from each reference area indicates the movement amount of that part, that is, the optical flow.
[0071]
Then, at time (t + 2), the driver turns his face further to the left. At this time, similarly, an optical flow is detected (FIG. 12C). Then, at time (t + 3), when the driver turns his face to the upper left, an optical flow is similarly detected (FIG. 12D).
[0072]
In FIG. 12, reference areas whose square frame is drawn with a solid line are reference areas determined as “YES” in step ST12 of FIG. 9 and set as valid (effective) areas, and the other reference areas are those determined as “NO” in step ST12 of FIG. 9 and set as invalid areas. This also applies to FIGS. 13 to 16 described below.
[0073]
Next, a description will be given with reference to FIG. 13. First, in the state before the driver gets into the vehicle, objects in the image naturally do not move, and no optical flow is detected. Many of the reference areas are invalid areas (FIG. 13A). Thereafter, when the driver starts to get into the vehicle, the movement of the driver is detected and the optical flow is calculated. At this time, some of the reference areas become effective areas (FIG. 13B). Thereafter, the driver finishes getting in. At this time, since the driver temporarily comes to rest, the detected amount of optical flow decreases. However, the driver cannot remain completely still and moves slightly, so that most of the reference areas become effective areas (FIG. 13C).
[0074]
Note that, in FIG. 13, when only a very small optical flow is detected, illustration of a line segment extending from the reference area is omitted. This applies to FIGS. 14 to 16 described below.
[0075]
Next, a description will be given with reference to FIG. 14. First, at time t, the driver is looking ahead (FIG. 14A). Thereafter, at time (t + 1), the driver moves his hand near the eyes. At this time, an optical flow is detected in a part of the captured image (FIG. 14B). Thereafter, at time (t + 2), the driver's hand hardly moves, and the detected amount of optical flow decreases (FIG. 14C).
[0076]
Next, a description will be given with reference to FIG. 15. First, at time t, the driver, who has been looking ahead, lowers his or her line of sight to look at a road map or the like. At this time, since the face itself moves slightly downward, a small optical flow is detected (FIG. 15A). Thereafter, at time (t + 1), the driver lifts the road map or the like. At this time, an optical flow is detected slightly below the center of the captured image (FIG. 15B). Thereafter, at time (t + 2), the driver gazes at the road map or the like, and almost no motion occurs. Therefore, the detected amount of optical flow decreases (FIG. 15C).
[0077]
Next, a description will be given with reference to FIG. 16. First, at time t, the driver is driving on a straight road (FIG. 16A). Thereafter, at time (t + 1), the driver performs a right-turn operation. At this time, the spoke portion of the steering wheel enters the imaging range, and an optical flow is detected (FIG. 16B). Thereafter, at time (t + 2), when the driver turns the steering wheel further in the right-turn direction, an optical flow is again detected (FIG. 16C).
[0078]
The method of calculating the optical flow is not limited to that of the present embodiment. A number of techniques for detecting motion from moving images are introduced in, for example, “Digital Video Processing”, supervised by Nobuyuki Yagi, edited by the Institute of Image Information and Television Engineers, pp. 129-139, 2000, Ohmsha, and these can also be used.
[0079]
Next, the processing of the motion detection unit 23 will be described. FIG. 17 is a flowchart showing the operation of the motion detection unit 23 shown in FIG. 2. Note that the processing by the motion detection unit 23 described below is not executed when the detection target is the presence or absence of the driver.
[0080]
In the processing illustrated in FIG. 17, the setting of the area groups differs between the case where the detection target is the direction of the driver's face and the case where the detection target is the entry or exit of something other than the driver's face into or from the imaging range.
[0081]
First, the difference between the area groups will be described. When the detection target is the direction of the driver's face, the area group is set as shown in FIG. That is, each of the nine area groups A1 to I1 is set to include nine reference areas in three rows and three columns.
[0082]
On the other hand, when the detection target is a thing other than the driver's face coming in and out of the imaging range, the area group is set as shown in FIG. That is, each of the eight area groups A2 to H2 is set to include 3 to 5 reference areas.
[0083]
Here, the reason why the setting method of the area group is different is as follows. That is, when detecting the direction of the driver's face, it is necessary to capture the movement regardless of the position of the driver's face moving on the image. For this reason, it is desirable to set a region group for the entire image. On the other hand, when detecting the entry or exit of something other than the driver's face into the imaging range, it is sufficient to detect the entry or exit specifically, and it is not necessary to set an area group at the center of the image.
[0084]
As described above, in the present embodiment, the setting of the area group is made different depending on the detection target, so that the detection can be performed appropriately.
[0085]
Next, the flowchart of FIG. 17 will be described on the premise of the above-described difference between the area groups.
[0086]
First, the motion detection unit 23 selects a processing target from a plurality of region groups, and further selects any one of the reference regions in the group.
[0087]
Then, the motion detection unit 23 initializes the numerical values xm, ym, and c relating to the movement amount of the object in the image to “0” for the selected area group (ST20). Thereafter, the motion detection unit 23 determines whether or not the selected reference area is an effective area, that is, whether or not fd is “1” (ST21).
[0088]
When it is determined that fd is “1” (ST21: YES), the motion detection unit 23 integrates the optical flows that are the movement amounts (ST22). Specifically, the motion detection unit 23 sets “xm” to “xm + xd”, “ym” to “ym + yd”, and “c” to “c + 1”. Then, the process proceeds to step ST23.
[0089]
On the other hand, if it is determined that fd is not “1” (ST21: NO), the motion detection unit 23 proceeds to step ST23 without integrating the optical flow as the movement amount.
[0090]
In step ST23, the motion detection unit 23 determines whether or not all the reference areas in the selected area group have been processed (ST23). If it is determined that some reference area has not yet been processed (ST23: NO), the processing returns to step ST21, and the above steps ST21 and ST22 are repeated. That is, the motion detection unit 23 determines, for every reference area, whether or not it is an effective area, and integrates the movement amount only when it is an effective area.
[0091]
Then, the movement amount is sequentially accumulated, and when all the reference areas have been processed (ST23: YES), the motion detection unit 23 determines whether or not c is “0” (ST24).
[0092]
If it is determined that “c” is “0” (ST24: YES), the process proceeds to step ST26. On the other hand, when it is determined that “c” is not “0” (ST24: NO), the motion detection unit 23 calculates the average of the integrated “xm” and “ym” (ST25). That is, the motion detection unit 23 executes “xm = xm / c” and “ym = ym / c” to obtain an average movement amount.
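Steps ST20 to ST25 for one area group can be sketched as follows. The function name `average_flow` and the dictionary keys are illustrative assumptions; the variables xm, ym, c, xd, yd and fd follow the description above.

```python
def average_flow(group):
    """Return the average movement amount (xm, ym) of one area group and
    the number c of effective areas that contributed to it."""
    xm, ym, c = 0.0, 0.0, 0          # ST20: initialize xm, ym and c to 0
    for area in group:               # loop closed by ST23
        if area["fd"] == 1:          # ST21: is this an effective area?
            xm += area["xd"]         # ST22: integrate the optical flow
            ym += area["yd"]
            c += 1
    if c != 0:                       # ST24: skip the division when c is 0
        xm /= c                      # ST25: average movement amount
        ym /= c
    return xm, ym, c


group = [
    {"xd": 2, "yd": 4, "fd": 1},
    {"xd": 4, "yd": 0, "fd": 1},
    {"xd": 100, "yd": 100, "fd": 0},  # invalid area, ignored
]
xm, ym, c = average_flow(group)
```

The check in ST24 avoids a division by zero when the group contains no effective area; in that case the average movement amount simply stays at 0.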
[0093]
Here, the average movement amount is, for example, as shown in FIG. 12. In FIG. 12, the average movement amount is indicated by an arrow at the lower right of each image (excluding (a)). Note that the average movement amount is obtained for each region group, but in FIG. 12, for convenience of description, the average movement amount of the entire image is shown. The average movement amount shown here is that of the face, that is, the average movement amount obtained when the direction of the face is the detection target.
[0094]
The description now returns to FIG. 17. After calculating the average movement amount as described above, the motion detection unit 23 calculates a moving average value (ax, ay) (movement amount) of the obtained average movement amount (ST26). The range over which the moving average is taken is determined arbitrarily. For example, the motion detection unit 23 obtains the average of the average movement amounts (corresponding to the sizes of the arrows) shown in FIG. 12.
[0095]
Thereafter, the motion detection unit 23 integrates the moving average value (ax, ay) of the average moving amount (ST27). Specifically, the motion detection unit 23 sets “sx” to “sx + ax” and sets “sy” to “sy + ay”.
[0096]
Thereafter, the motion detection unit 23 calculates a moving average value (cx, cy) of the integrated value (sx, sy) (ST28). The range for obtaining the moving average is also arbitrarily determined.
[0097]
Then, the motion detection unit 23 obtains the moving position (vx, vy) from the difference between the integrated value (sx, sy) and the moving average (cx, cy) of the integrated value (ST29). Specifically, the motion detection unit 23 sets “vx” to “sx-cx” and “vy” to “sy-cy”.
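Steps ST26 to ST29 can be sketched as a small stateful object. The class name `MovementTracker` and the two window lengths are assumptions, since the embodiment only states that the ranges of the moving averages are determined arbitrarily.

```python
from collections import deque


class MovementTracker:
    """Turns per-frame average movement amounts into a movement position."""

    def __init__(self, flow_window=3, integral_window=50):
        # Both window lengths are illustrative values, not from the source.
        self.flows = deque(maxlen=flow_window)
        self.integrals = deque(maxlen=integral_window)
        self.sx = 0.0
        self.sy = 0.0

    def update(self, xm, ym):
        # ST26: moving average (ax, ay) of the average movement amount
        self.flows.append((xm, ym))
        ax = sum(f[0] for f in self.flows) / len(self.flows)
        ay = sum(f[1] for f in self.flows) / len(self.flows)
        # ST27: integrate the moving average value
        self.sx += ax
        self.sy += ay
        # ST28: moving average (cx, cy) of the integrated value
        self.integrals.append((self.sx, self.sy))
        cx = sum(s[0] for s in self.integrals) / len(self.integrals)
        cy = sum(s[1] for s in self.integrals) / len(self.integrals)
        # ST29: movement position = integrated value - its moving average
        return self.sx - cx, self.sy - cy


tracker = MovementTracker(flow_window=1, integral_window=2)
first = tracker.update(2, 0)
second = tracker.update(4, 0)
```

Subtracting the moving average (cx, cy) removes the slowly varying part of the integrated value, so the movement position reflects recent motion rather than the absolute accumulated value.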
[0098]
After that, the motion detection unit 23 stores the movement position (vx, vy) in a buffer, and takes the movement positions (vx, vy) obtained over a predetermined past period together with the current movement position (vx, vy) as the current actual operation pattern (ST30).
[0099]
Thereafter, the motion detection unit 23 determines whether or not the integrated value (sx, sy) is equal to or greater than a threshold (ST31). When it is determined that the integrated value (sx, sy) is not equal to or greater than the threshold (ST31: NO), the motion detection unit 23 sends the data of the movement position (vx, vy) to the state detection unit 24, and the process proceeds to step ST35.
[0100]
On the other hand, when determining that the integrated value (sx, sy) is equal to or greater than the threshold (ST31: YES), the motion detection unit 23 determines whether or not the standard deviation of the integrated value (sx, sy) is equal to or less than a threshold (ST32). When it is determined that the standard deviation of the integrated value (sx, sy) is not equal to or less than the threshold (ST32: NO), the motion detection unit 23 sends the data of the movement position (vx, vy) to the state detection unit 24, and the process proceeds to step ST35.
[0101]
On the other hand, when it is determined that the standard deviation of the integrated value (sx, sy) is equal to or less than the threshold (ST32: YES), the motion detection unit 23 determines whether or not the moving average value of the average movement amount is equal to or less than a threshold (ST33). If it is determined that the moving average value of the average movement amount is not equal to or less than the threshold (ST33: NO), the motion detection unit 23 sends the data of the movement position (vx, vy) to the state detection unit 24, and the process proceeds to step ST35.
[0102]
On the other hand, when it is determined that the moving average value of the average movement amount is equal to or less than the threshold (ST33: YES), the motion detection unit 23 initializes the integrated value (sx, sy) to “0” (ST34). Then, the motion detection unit 23 sends the data of the movement position (vx, vy) to the state detection unit 24, and the process proceeds to step ST35.
[0103]
The processing in steps ST31 to ST34 is performed for the following reason.
[0104]
For example, when the driver sits on the seat, the driver's face is not always located at the center of the imaging range. If the ranges to the left and right of the driver's face position are not equal within the imaging range, an error arises from this difference when the driver moves the face to the left and right, and the error is accumulated in the integrated value (sx, sy). Errors may also be accumulated for various other reasons. If errors gradually accumulate in the integrated value (sx, sy), the detection of the direction of the face and the detection of the entry or exit of something other than the face into or from the imaging range will be hindered.
[0105]
Therefore, in step ST31, it is determined whether or not the integrated value (sx, sy) is equal to or greater than a threshold value. When the integrated value (sx, sy) is equal to or greater than the threshold value, the integrated value (sx, sy) is initialized to "0". As described above, by initializing the integrated value based on the predetermined condition, the detection target is suitably detected.
[0106]
However, if the integrated value (sx, sy) is initialized to “0” while the driver is changing the direction of the face, or while something other than the face is entering or leaving the imaging range, the initialization itself hinders the detection of the detection target. Therefore, in steps ST32 and ST33, it is detected that the face is not moving, or that nothing other than the face is entering or leaving the imaging range. That is, the motion detection unit 23 initializes the integrated value (sx, sy) to “0” on the predetermined condition that the standard deviation of the integrated value (sx, sy) is equal to or less than its threshold value and the moving average value of the average movement amount is equal to or less than its threshold value.
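The reset decision of steps ST31 to ST34 can be sketched as follows. The embodiment does not state how the two-component values are compared with a scalar threshold, so this sketch assumes that the larger absolute component is used; the function name `should_reset` and all threshold names are likewise hypothetical.

```python
from statistics import pstdev


def should_reset(s_history, a_current, s_thresh, sd_thresh, a_thresh):
    """Return True when the integrated value (sx, sy) should be set to 0.

    s_history: recent integrated values [(sx, sy), ...], newest last.
    a_current: current moving average (ax, ay) of the average movement amount.
    """
    sx, sy = s_history[-1]
    # ST31: the integrated value must have grown to the threshold or more.
    if max(abs(sx), abs(sy)) < s_thresh:
        return False
    # ST32: the integrated value must be stable (small standard deviation).
    sd_x = pstdev([s[0] for s in s_history])
    sd_y = pstdev([s[1] for s in s_history])
    if max(sd_x, sd_y) > sd_thresh:
        return False
    # ST33: the current motion must be small as well.
    ax, ay = a_current
    if max(abs(ax), abs(ay)) > a_thresh:
        return False
    return True  # ST34: initialize (sx, sy) to 0


stable = should_reset([(50, 0)] * 3, (0, 0), 40, 1.0, 0.5)
small = should_reset([(10, 0)] * 3, (0, 0), 40, 1.0, 0.5)
moving = should_reset([(30, 0), (40, 0), (50, 0)], (10, 0), 40, 1.0, 0.5)
```

Only the first case satisfies all three conditions: the integrated value is large, it is no longer changing, and the face (or object) is currently at rest.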
[0107]
In step ST35, it is determined whether or not all the area groups have been processed (ST35). If it is determined that some area group has not yet been processed (ST35: NO), the processing returns to step ST20, and the same processing is performed. On the other hand, when it is determined that all the area groups have been processed (ST35: YES), the motion detection unit 23 sends the actual operation pattern data for each area group to the state detection unit 24. Thereafter, the processing by the motion detection unit 23 ends.
[0108]
Here, an example of the data of the movement position (vx, vy) obtained by the motion detection unit 23, that is, an example of an actual operation pattern, will be described with reference to FIG. 18. FIG. 18 is an explanatory diagram of an actual operation pattern obtained by the motion detection unit 23 shown in FIG. 2, and shows a case where the detection target is the direction of the driver's face.
[0109]
In FIG. 18, the vertical axis indicates the movement position, and the horizontal axis indicates time. FIG. 18 shows only the movement position in the image horizontal direction (X direction), and omits the movement position in the image vertical direction (Y direction). Further, FIG. 18 shows an example of an actual operation pattern obtained in a predetermined area group when the driver, who has been looking forward, turns his or her face to the left and then looks forward again.
[0110]
As shown in the figure, first, when the driver gazes ahead of the vehicle (period 350 to 410), the moving position is near “0”.
[0111]
Next, when the driver performs a checking operation and turns his or her face to the left (period 410 to 430), the movement position reaches about “−45 to −48” pixels. After that, while the driver keeps facing left for a while (period 430 to 560), the movement position is maintained at about “−45 to −48” pixels.
[0112]
Then, when the driver turns his / her face in front of the vehicle again (period 560 to 580), the moving position returns to the vicinity of “0”. Thereafter, when the driver keeps gazing at the front of the vehicle (period 580 to 650), the movement position keeps maintaining near “0”.
[0113]
As described above, the movement position (vx, vy) obtained by the motion detection unit 23 indicates the direction of the driver's face, and the actual operation pattern P1 is obtained by capturing the movement position over time.
[0114]
Another example of the actual operation pattern will be described with reference to FIG. 19. FIG. 19 is an explanatory diagram of an actual operation pattern obtained by the motion detection unit 23 illustrated in FIG. 2, and illustrates a case where the detection target is the entry or exit of something other than the driver's face into or from the imaging range.
[0115]
In FIG. 19, the horizontal axis indicates the movement position in the image horizontal direction (X direction), and the vertical axis indicates the movement position in the image vertical direction (Y direction). Further, the actual operation pattern shown in FIG. 19 is an example obtained in a predetermined area group when the steering wheel is operated as shown in FIG. 16.
[0116]
As shown in FIG. 16, the spoke portion of the steering wheel moves in the negative direction on both the X axis and the Y axis of the captured image. Therefore, the movement position (vx, vy) of the steering wheel obtained over time, that is, the actual operation pattern P2, indicates movement in the negative direction on the X axis and the Y axis as shown in FIG. 19. When the steering wheel is turned to the left, the operation is reversed, and a pattern substantially point-symmetric to the actual operation pattern P2 shown in FIG. 19 with respect to the origin (0, 0) is obtained.
[0117]
Next, the operation of the motion detection unit 23 when the detection target is the presence or absence of the driver will be described. In this case, the motion detection unit 23 does not perform the processing of FIG. 17, but obtains the actual operation pattern by tracking over time the number of reference areas determined as “YES” in step ST12 of FIG. 9. That is, the actual operation pattern is obtained by counting the number of effective areas among all the reference areas.
[0118]
As described with reference to FIG. 13, the number of effective areas tends to gradually increase from a state before the driver gets on the vehicle to a state where the driver is in the middle of getting on the vehicle and getting on. The motion detection unit 23 obtains this tendency as an actual motion pattern.
[0119]
FIG. 20 is an explanatory diagram illustrating an example of an actual operation pattern obtained by the motion detection unit 23 when the detection target is the presence or absence of the driver. In FIG. 20, the vertical axis indicates the number of effective areas, and the horizontal axis indicates time.
[0120]
First, in the state before the driver gets on the vehicle (period between times 35140 to 35164), the number of effective areas is stable at 5 or less. Thereafter, when the driver starts to get on the vehicle, the number of effective areas starts to increase (period between times 35164 and 35204). At this time, the number of effective areas is 6 or more and less than 15. Then, in the state where the boarding is completed (the period from time 35204 to 35250), the number of effective areas further increases to 15 or more.
[0121]
When the detection target is the presence or absence of the driver, the motion detection unit 23 acquires the change in the number of effective areas described above as the actual operation pattern P3. Note that, as in the case where the detection target is the direction of the driver's face (step ST30 of FIG. 17), the motion detection unit 23 stores the number of effective areas for a certain period of time. For this reason, the actual operation pattern P3 actually obtained may not cover the whole span from time 35140 to time 35250 shown in FIG. 20. That is, the actual operation pattern P3 may be only a part of the increase in the number of effective areas shown in FIG. 20.
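The acquisition of the actual operation pattern P3 can be sketched as a bounded buffer of effective-area counts. The class name `PresencePattern`, the helper `count_effective` and the buffer length are assumptions made for illustration; only the idea of counting the areas with fd = 1 and keeping the counts for a certain period comes from the description.

```python
from collections import deque


def count_effective(areas):
    """Number of reference areas whose flag fd is 1 in one frame."""
    return sum(1 for area in areas if area["fd"] == 1)


class PresencePattern:
    """Keeps the most recent effective-area counts as the pattern P3."""

    def __init__(self, length=5):
        # 'length' stands for the certain period of time; value is assumed.
        self.buffer = deque(maxlen=length)

    def update(self, areas):
        self.buffer.append(count_effective(areas))
        return list(self.buffer)


def frame(n_effective, n_total=4):
    """Build a toy frame with n_effective effective areas."""
    return [{"fd": 1 if i < n_effective else 0} for i in range(n_total)]


pattern = PresencePattern(length=2)
pattern.update(frame(1))
pattern.update(frame(2))
p3 = pattern.update(frame(3))
```

Because the buffer is bounded, the pattern passed on to the state detection unit 24 covers only the most recent part of the curve, which matches the remark that P3 may be a part of the increase shown in FIG. 20.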
[0122]
When the actual operation pattern is obtained, the motion detection unit 23 sends the data of the actual operation pattern P3 to the state detection unit 24. Thereafter, the processing by the motion detection unit 23 ends.
[0123]
Next, the operation of the state detection unit 24 shown in FIG. 2 will be described. FIG. 21 is a flowchart showing the operation of the state detection unit 24 shown in FIG.
[0124]
As shown in the drawing, the state detection unit 24 first selects any one of the area groups. Then, for the selected area group, the state detection unit 24 calculates the correlation between the actual operation pattern P obtained in step ST30 of FIG. 17 and each of the plurality of stored operation patterns D stored in advance in the storage unit 24a (ST40).
[0125]
The correlation is obtained, for example, in the same manner as Expression 3, or by using information obtained through frequency analysis such as a Fourier transform or a wavelet transform.
[0126]
Here, specifically, the actual operation pattern P and the stored operation pattern D are expressed as follows.
(Equation 4)

Note that the “state code” is a code representing the state of the driver. In addition, “data” indicates the movement position (vx, vy) obtained in step ST30 of FIG. 17 when the detection target is the direction of the face or the entry or exit of something other than the driver's face into or from the imaging range. When the detection target is the presence or absence of the driver, “data” indicates the number of effective areas.
[0127]
Thereafter, the state detection unit 24 detects the stored operation pattern having the highest correlation among the plurality of stored operation patterns (ST41). After the detection, the state detection unit 24 takes the state indicated by the detected stored operation pattern as the driver's state (ST42). That is, the state detection unit 24 sets the state, such as the face direction, indicated by the stored operation pattern D having the highest correlation as the detection result. Then, the state detection unit 24 outputs the detection result to the state signal output unit 25.
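Steps ST40 to ST42 can be sketched as follows. Since Expression 3 is not reproduced in this passage, the sketch substitutes an ordinary Pearson correlation coefficient as a stand-in; the state codes, the sample patterns and the function names are hypothetical.

```python
from math import sqrt


def correlation(p, d):
    """Pearson correlation of two equal-length sequences (stand-in measure)."""
    n = len(p)
    mean_p = sum(p) / n
    mean_d = sum(d) / n
    cov = sum((a - mean_p) * (b - mean_d) for a, b in zip(p, d))
    norm = sqrt(sum((a - mean_p) ** 2 for a in p)
                * sum((b - mean_d) ** 2 for b in d))
    # A constant pattern has no defined correlation; treat it as 0.
    return cov / norm if norm else 0.0


def detect_state(actual, stored):
    """ST40-ST42: return the state code of the best-correlated stored pattern.

    stored: mapping from state code to its data sequence.
    """
    return max(stored, key=lambda code: correlation(actual, stored[code]))


# Hypothetical stored patterns (X-direction movement positions in pixels).
stored_patterns = {
    "face_left": [0, -20, -45, -45],
    "face_right": [0, 20, 45, 45],
}
state = detect_state([0, -18, -44, -47], stored_patterns)
```

With these sample values the actual pattern follows the leftward pattern closely and is anti-correlated with the rightward one, so the leftward state code is selected.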
[0128]
After that, the state detection unit 24 performs the same processing for one of the area groups other than the selected one, and outputs the detection result to the state signal output unit 25.
[0129]
In the above description, the actual operation pattern and the stored operation patterns are compared for each region group to obtain the respective detection results. However, the respective detection results may instead be judged comprehensively to obtain a single result. In this case, instead of sequentially outputting the detection result of each area group to the state signal output unit 25, only the comprehensively determined result is output.
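The embodiment does not specify how the detection results of the area groups are judged comprehensively. As one possible interpretation, offered here only as an assumption, a simple majority vote over the per-group results could be used; the function name `combine_results` is hypothetical.

```python
from collections import Counter


def combine_results(group_results):
    """Return the state code reported by the largest number of area groups.

    Majority voting is an assumed rule, not one stated in the embodiment.
    """
    return Counter(group_results).most_common(1)[0][0]


overall = combine_results(["face_left", "face_left", "face_front"])
```

Other rules, such as weighting each group by its correlation value, would fit the description equally well.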
[0130]
Further, when the detection target is the presence or absence of a driver, the above processing is not repeated because no area group is set. That is, the state detection unit 24 performs the processes of steps ST40 to ST42 once, and outputs the obtained detection result to the state signal output unit 25.
[0131]
After that, the state signal output unit 25 converts the detection result from the state detection unit 24 into an electric signal Sb and outputs it to the outside.
[0132]
As is clear from the above, in the present embodiment, the image processing illustrated in FIG. 9 is the same regardless of which of the three driver states is detected. Further, since the optical flow is obtained for preset reference areas, the detection is performed without specifying the position of the face, unlike the related art.
[0133]
Thus, in the state detection device 20 according to the present embodiment, the image processing unit 22 obtains the optical flow between the captured images. With the optical flow, the movement of any object in the image can be detected. For this reason, as long as the detection target can be obtained from motion, it is not necessary to use an image processing method set individually for each detection target.
[0134]
Therefore, for example, the direction of the driver's face, the entry or exit of an object other than the driver's face into or from the imaging range, and the presence or absence of the driver, all of which can be obtained from motion, can be detected with a single image processing method using this optical flow.
[0135]
Therefore, when a device that detects one of the above three driver states has been configured and it is later desired to upgrade the device so that it also detects another driver state, it is only necessary to incorporate the processing part that is not common. As a result, compared to the case where a device that performs completely different processing is incorporated at the time of upgrading or the like, the cost does not increase, and a situation in which simultaneous execution is impossible because of the content of the image processing hardly occurs.
[0136]
When two or more of the three driver states are detected, the image processing method is common, so that a plurality of driver states can be detected by one image processing method. As a result, compared to the case where devices that perform different processing are mounted, the cost does not increase, and a situation in which simultaneous execution is impossible because of the content of the image processing is less likely to occur.
[0137]
Therefore, cost and versatility can be improved.
[0138]
Further, since the state signal output unit 25 converts the detection result from the state detection unit 24 into an electric signal Sb and outputs the signal to the outside, for example, when the external control device 30 is a notification device, a notification according to the direction of the driver's face can be made. Therefore, vehicle control or the like can be performed using the detection result.
[0139]
Further, an optical flow is obtained for each of one or a plurality of calculation regions defined by a predetermined position and a predetermined size with respect to the captured image, and an actual operation pattern is obtained from the optical flow for each region group including at least one calculation region. Then, the direction of the face is detected based on the obtained actual operation pattern and the stored operation patterns stored in advance. For this reason, for example, even when the face exists only in a corner of the captured image, the actual operation pattern can be accurately obtained for the area group in that corner. Therefore, it is possible to avoid a situation where an actual operation pattern cannot be accurately obtained when only a part of the face is present at a corner of the image.
[0140]
Therefore, convenience can be improved.
[0141]
Further, an actual operation pattern is detected spatially and temporally from the calculation result of the optical flow. That is, for example, movement in the left-right direction or the like is obtained spatially, and the movement of the driver is traced back temporally from the present into the past. In other words, the actual operation pattern is not derived from an instantaneous optical flow alone, which reduces the influence of noise or the like.
[0142]
Conventionally, the state of a driver or the like is detected based on a feature amount obtained by imaging or the like. For this reason, conventionally, a feature amount has to be acquired at the start of driving in order to obtain a reference, so that the state or the like cannot be detected at the start of driving. In the present embodiment, however, instead of obtaining a feature amount by imaging or the like, a feature amount actually obtained in advance from the movement of a driver is stored. For this reason, the state of the driver and the like can be detected even at the start of driving. Further, a similar effect can be obtained when the comparison process is performed using the stored operation pattern, which is that feature amount.
[0143]
At least one of the one or more reference areas is set to the size of a specific part of the face based on the proportion of the face in the captured image. For this reason, it is possible to prevent the calculation amount from increasing because of a reference area that is too large, and to reduce the possibility that a plurality of characteristic parts are included in one reference area at the same time. Furthermore, it is possible to prevent a reference area from being set so small that it contains no characteristic part.
[0144]
Further, a moving average value (movement amount) based on the face motion is integrated, the face motion is obtained based on the integrated value, and the integrated value is initialized based on a predetermined condition. Therefore, for example, an error that is accumulated as an integrated value when the driver moves his / her face to the left or right can be initialized, and the driver state can be detected appropriately.
[0145]
In addition, for each of the one or more calculation regions, whether or not to use the optical flow of that region for calculation is determined by comparing the change amount (variance value) of the feature amount (similarity) calculated within its search region with a preset threshold value. For this reason, it is possible to prevent inaccurate detection caused by a reference region that has no distinctive feature.
[0146]
Further, in the state detection system 1 according to the present embodiment, it is possible to improve cost and versatility. Further, for example, when the external control device 30 is a notification device, the notification according to the direction of the driver's face can be performed. Therefore, vehicle control or the like can be performed using the detection result.
[0147]
Note that, in the present embodiment, the processing by the image processing unit 22 is the same in any case of detecting any of the detection targets, but need not be exactly the same. That is, a slight change may be made as long as the image processing for obtaining the optical flow is not affected.
[0148]
Further, in the present embodiment, by determining the presence or absence of the driver while the vehicle is running, it is also possible to detect, for example, a situation in which the driver bends down to pick up an object that has fallen under the seat, or leans toward the passenger seat to reach an object on it.
[0149]
In addition, FIGS. 16 and 19 illustrate an example in which the spoke portion of the steering wheel enters the imaging range. In this case, since the movement locus of the spoke portion is obtained, the embodiment can also be applied to a device that estimates the steering angle of the steering wheel.
[0150]
Next, a second embodiment of the present invention will be described. In the second embodiment, differences from the first embodiment will be mainly described.
[0151]
The state detection system 1a and the state detection device 20a according to the second embodiment detect at least two of the three driver states. Further, the state detection unit 24 according to the second embodiment is different from the first embodiment in the content of the processing to be executed.
[0152]
Hereinafter, the different processing contents will be described. First, the state detection device 20a of the second embodiment is capable of detecting two or more of the three driver states described in the first embodiment, and is configured to send the respective detection results to the control device 30.
[0153]
Specifically, the operation detection unit 23 obtains an actual operation pattern for any one of the two or more driver states to be detected, and sends this data to the state detection unit 24. After that, the operation detection unit 23 obtains the actual operation pattern again for the remaining driver state, and sends this data to the state detection unit 24. Note that this operation may be performed in parallel.
[0154]
Then, the state detection unit 24 performs detection based on the input data of the actual operation pattern as described in the first embodiment. After that, the state detection unit 24 sends the detection result to the state signal output unit 25.
[0155]
Then, as described in the first embodiment, the state signal output unit 25 converts the detection result into the electric signal Sb and outputs the electric signal Sb to the control device 30.
[0156]
Further, the state detection unit 24 according to the second embodiment has a function of transmitting a suppression signal to the operation detection unit 23 based on the detection result.
[0157]
FIG. 22 is a flowchart illustrating the suppression control processing performed by the state detection unit 24. First, upon obtaining the detection result, the state detection unit 24 determines whether or not the detection result of the driver state is a predetermined result (ST50). Then, when determining that the result is the predetermined result (ST50: YES), the state detection unit 24 transmits a suppression signal to the operation detection unit 23 (ST51). As a result, the operation detection unit 23 suppresses detection of a driver state other than the detected driver state.
[0158]
For example, when the driver's hand is near the eyes, this does not mean that the driver is absent from the vehicle, so in such a case the state detection unit 24 transmits a signal for suppressing detection of the presence or absence of the driver. In addition, when the driver's hand is near the eyes, the driver tends to hardly change the face direction, so in such a case a signal for suppressing detection of the driver's face direction is transmitted. As described above, when the detection result of one driver state makes it unnecessary to detect another driver state, that detection is suppressed. This prevents the device from erroneously detecting other driver states.
[0159]
In the meantime, the operation detection unit 23 and the state detection unit 24 perform the detection again for the driver state for which the detection result has already been obtained. Then, it is determined whether or not the result of the re-detection is a predetermined result (ST52). That is, it is determined whether or not the predetermined result is continued.
[0160]
When it is determined that the predetermined result is continued (ST52: YES), this processing is repeated until it is determined that the predetermined result is not continued. On the other hand, when it is determined that the predetermined result is not continued (ST52: NO), the state detection unit 24 transmits a release signal for releasing the suppression to the operation detection unit 23 (ST53). That is, the suppression executed in step ST51 is released.
[0161]
Then, the process ends. By the way, when it is determined in step ST50 that the result is not the predetermined result (ST50: NO), the process is similarly terminated.
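The flow of steps ST50 to ST53 can be sketched as follows. This is only an illustration under stated assumptions: the function name `suppression_control`, the signal names, and the representation of detection results as a sequence of booleans are hypothetical and do not appear in the embodiment.

```python
def suppression_control(results):
    """Trace ST50 to ST53 for a sequence of detection results.

    results: iterable of booleans for one monitored driver state; True means
    the detection result is the predetermined result (assumed encoding).
    Returns the signals sent to the operation detection unit, in order.
    """
    signals = []
    stream = iter(results)

    # ST50: is the first detection result the predetermined result?
    if not next(stream, False):
        return signals  # ST50: NO, the process ends without suppression

    signals.append("suppress")  # ST51: suppress detection of other states

    # ST52: keep re-detecting the same state while the result continues
    for still_predetermined in stream:
        if not still_predetermined:
            signals.append("release")  # ST53: release the suppression
            break
    return signals


print(suppression_control([True, True, False]))  # ['suppress', 'release']
print(suppression_control([False]))              # []
```

If the input ends while the predetermined result still continues, the sketch returns only the suppression signal, which corresponds to the loop at ST52 not yet having terminated.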
[0162]
In this way, in the state detection device 20a according to the present embodiment, it is possible to improve the cost and versatility in the same manner as in the first embodiment. Furthermore, since the detection of another driver's state is suppressed based on the detection result of a certain driver's state, erroneous detection of another driver's state can be prevented.
[0163]
Further, similarly to the first embodiment, the convenience can be improved, and the influence of noise or the like can be reduced.
[0164]
Further, even at the start of driving, the state of the driver and the like can be detected.
[0165]
Further, it is possible to prevent an increase in the amount of calculation and to reduce the possibility that a plurality of characteristic portions are simultaneously included in one reference region. Further, by not setting a region that is too small, it is possible to prevent a region from containing no characteristic portion.
[0166]
Further, it is possible to preferably detect the driver state, and prevent inaccurate detection.
[0167]
Note that in the present embodiment as well, if image processing for obtaining an optical flow is performed, other processing may be slightly changed. Further, for example, it is also possible to detect a case where the driver bows to pick up a fallen object under the seat or a case where the driver leans on the passenger seat side in an attempt to pick up an object in the passenger seat. Furthermore, since the movement locus of the spoke portion can be obtained, the present invention can be applied to an apparatus for estimating the steering angle of the steering wheel.
[0168]
Further, in the present embodiment, since at least two of the three driver states need only be the detection targets, the number of detection targets may be two or three. Furthermore, since the detection of the other driver states is suppressed based on at least one detection result, the detection of the remaining one driver state may be suppressed based on two of the three detection results. Further, the detection of the remaining two driver states may be suppressed based on one detection result.
[0169]
Next, a third embodiment of the present invention will be described. In the third embodiment, differences from the second embodiment will be mainly described.
[0170]
The state detection system 1b and the state detection device 20b according to the third embodiment detect at least two of three driver states and three physical states. Note that the three physical states indicate opening and closing of the driver's eyelids, opening and closing of the driver's mouth, and changes in the driver's facial expression.
[0171]
In the detection of the physical condition, the processing to be executed substantially coincides with the detection of the driver condition described above. However, in detecting the physical condition, it is necessary to accurately capture minute changes such as eyelids, mouth, and facial expressions, and thus it is necessary to specify the positions of the eyes, mouth, and the like of the face from the captured image.
[0172]
Next, the operation of the state detection device 20b when detecting opening / closing of the eyelid, opening / closing of the mouth, and a change in facial expression will be described.
[0173]
When detecting the opening and closing of the eyelid, the image processing unit 22 specifies the position of the eye. Specifically, the coordinate position of the eye in the captured image may be specified as described in JP-A-5-60515 or JP-A-2000-142164.
[0174]
Then, after specifying the position of the eye, the image processing unit 22 sets a reference region near the position of the eye in the captured image and sets an area group including a plurality of reference regions.
[0175]
FIGS. 23A and 23B are explanatory diagrams illustrating a reference region and a region group when opening / closing of the eyelids is detected. FIG. 23A illustrates an example of a reference region, and FIG. 23B illustrates an example of a region group. As shown in FIG. 23A, the image processing unit 22 sets a reference area of 4 rows and 16 columns so as to cover both eyes. Then, as shown in FIG. 23B, two area groups A3 and B3 are set. These region groups are set for each of the left and right eyes, and are specifically set to include a reference region of 4 rows and 8 columns.
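As a rough illustration of this layout, the sketch below splits a grid of reference regions of 4 rows and 16 columns into two side-by-side region groups of 4 rows and 8 columns. The function name `split_into_column_groups` and the (row, column) indexing are assumptions made for the example; which group corresponds to which eye is not specified here.

```python
def split_into_column_groups(rows, cols, n_groups):
    """Split a rows x cols grid of reference regions into side-by-side groups.

    Each reference region is identified by its (row, col) index. The grid is
    cut into n_groups blocks of equal width (assumes cols % n_groups == 0).
    """
    width = cols // n_groups
    groups = []
    for g in range(n_groups):
        cells = [
            (r, c)
            for r in range(rows)
            for c in range(g * width, (g + 1) * width)
        ]
        groups.append(cells)
    return groups


# Two region groups (named A3 and B3 in the text), one per eye.
a3, b3 = split_into_column_groups(rows=4, cols=16, n_groups=2)
print(len(a3), len(b3))  # 32 32
```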
[0176]
After the setting of the reference area and the area group, an optical flow is obtained in the same manner as in the second embodiment, and the data is transmitted to the motion detection unit 23.
[0177]
As described above, the reference region is set at the predetermined position in the second embodiment, whereas it is set at the specified eye position in the third embodiment. That is, the third embodiment differs from the second embodiment in the eye position specifying process and the reference region setting process. Note that this difference is not so large as to affect the detection of states other than the opening and closing of the eyelids or to make simultaneous detection with other states impossible.
[0178]
After calculating the optical flow, the motion detection unit 23 obtains an actual motion pattern in the same manner as in the second embodiment (especially, in the same manner as the detection of the face direction). Then, the state detection unit 24 obtains a correlation with a plurality of storage operation patterns and detects the state of the body.
[0179]
Here, an optical flow and an actual operation pattern obtained when the opening and closing of the eyelids are detected will be described. FIG. 24 is an explanatory diagram illustrating an example of an optical flow obtained when opening / closing of the eyelids is detected.
[0180]
First, as shown in FIG. 24A, the driver's eyes are open at time t. Thereafter, at time (t + 1), the driver starts closing his eyes. At this time, as shown in FIG. 24B, an optical flow is detected in the image vertical direction (Y direction) for the driver's eyelid portion.
[0181]
Then, at time (t + 2), the driver's eyes are completely closed. Also at this time, as shown in FIG. 24C, an optical flow is detected in the vertical direction of the image near the eyes of the driver. Note that, in the horizontal direction of the image (X direction), the optical flow is hardly detected from time t to (t + 2).
[0182]
FIG. 25 is an explanatory diagram illustrating an example of an actual operation pattern obtained when the opening and closing of the eyelids is detected. FIG. 25 shows a pattern obtained from when the driver closes his eyes to when he opens his eyes.
[0183]
When the driver performs the operation of closing the eyes, as shown in FIG. 24, the optical flow is detected in the vertical direction of the image, and the optical flow is hardly detected in the horizontal direction of the image. Therefore, the obtained actual operation patterns P4 and P5 (hereinafter, the actual operation patterns obtained when detecting eyelid opening / closing are referred to as eyelid operation patterns P4 and P5) are as shown in FIG.
[0184]
More specifically, the eyelid movement pattern P4 in the image vertical direction is as follows. First, when the driver has his eyes open (period 178 to 186), the moving position is near “0”. Thereafter, when the driver starts closing his eyes, an optical flow in the vertical direction of the image is obtained, so that the moving position rises to “6 to 8” pixels (period from time 186 to 190).
[0185]
Then, in a state where the driver keeps his eyes closed (period 190 to 216), the moving position keeps “6 to 8” pixels. Thereafter, when the driver starts to open his eyes, the moving position gradually decreases (period 216 to 237).
[0186]
On the other hand, the optical flow of the eyelid is hardly detected in the lateral direction of the image. For this reason, the eyelid movement pattern P5 in the horizontal direction of the image continues to maintain substantially the same value during the period from time 178 to time 186.
[0187]
After the eyelid movement patterns P4 and P5 as described above are obtained, the state detection unit 24 reads a plurality of storage operation patterns from the storage unit 24a. Then, the state detection unit 24 detects the driver's blink by comparing the eyelid movement pattern P4 with the storage operation patterns. The storage unit 24a according to the third embodiment stores, as a storage operation pattern, a pattern that shows a predetermined movement in the vertical direction of the image followed by a return of that predetermined movement. Therefore, the state detection unit 24 detects the driver's blink when the storage operation pattern having the highest correlation with the eyelid movement pattern P4 is the one that shows the predetermined movement in the image vertical direction followed by its return.
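A minimal sketch of choosing the stored pattern with the highest correlation is given here. It assumes a Pearson correlation coefficient and equal-length patterns; the embodiment does not specify the correlation measure, and the pattern names and sample values are hypothetical.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return cov / norm if norm else 0.0


def best_match(actual, stored):
    """Return the name of the stored pattern most correlated with `actual`."""
    return max(stored, key=lambda name: correlation(actual, stored[name]))


# Hypothetical stored patterns of vertical movement position in pixels.
stored_patterns = {
    "blink": [0, 4, 7, 7, 4, 0],        # predetermined movement, then return
    "eye_closing": [0, 2, 4, 6, 7, 7],  # movement without return
}
eyelid_p4 = [0, 3, 7, 8, 3, 0]  # sample vertical eyelid movement pattern
print(best_match(eyelid_p4, stored_patterns))  # blink
```

A real implementation would also need to align patterns of different lengths and to reject weak matches; both are left out of this sketch.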
[0188]
After that, the state signal output unit 25 outputs the electric signal Sb according to the detection result to the control device 30. In addition, the storage unit 24a stores a storage operation pattern of the eye opening operation and the eye closing operation, and can detect that the eye is closed for a long time based on the time from the closing operation to the opening operation.
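The idea of judging a long eye closure from the time between the closing operation and the opening operation can be sketched as follows. The event encoding, the function name `long_closures`, and the threshold are hypothetical; the time values 190 and 216 are taken from the eyelid movement pattern described above, and their unit is not specified in the text.

```python
def long_closures(events, min_duration):
    """Find eye closures that last at least `min_duration` time steps.

    events: list of (time, kind) tuples in time order, where kind is
    "close" (closing operation detected) or "open" (opening operation
    detected). Returns a list of (closed_at, opened_at) pairs.
    """
    closures = []
    closed_at = None
    for time, kind in events:
        if kind == "close":
            closed_at = time
        elif kind == "open" and closed_at is not None:
            if time - closed_at >= min_duration:
                closures.append((closed_at, time))
            closed_at = None
    return closures


# Eyes closed at time 190 and opening started at time 216.
events = [(190, "close"), (216, "open")]
print(long_closures(events, min_duration=20))  # [(190, 216)]
```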
[0189]
Conventionally, when detecting any detection target, an apparatus is often configured to detect the detection target using two or more methods (for example, a method based on gray value data or a method based on a difference image). Then, if the detection target is detected by any one of the two or more methods, it is determined that the detection target exists even if the detection target is not detected by another method. As described above, conventionally, a detection target is detected by combining two or more methods for the purpose of complementing a detection error when detection is performed by one method.
[0190]
In such a combination, the detection accuracy tends to be higher when two or more completely different methods are combined than when two or more similar methods are combined. That is, for example, if all of the two or more methods detect the detection target based on the gray value data, and the gray value data itself has not been acquired successfully, a detection error may occur in all of the methods.
[0191]
In the present embodiment, blinks are detected by a new method called optical flow. Therefore, for example, when a detection target is detected by combining two or more methods, blink detection with high detection accuracy as a whole can be performed. In addition, the detection of the eye opening operation and the eye closing operation can be similarly performed with high accuracy.
[0192]
Next, the opening and closing of the mouth will be described. When detecting opening / closing of the mouth, the image processing unit 22 specifies the position of the mouth. When specifying the position of the mouth, first, the coordinates of the eyes are specified as described above. Then, the coordinate position of the mouth in the captured image is specified from the relative positional relationship between the coordinate positions of the eyes. Further, after specifying the position of the mouth, the image processing unit 22 specifies the positions of the upper lip and the lower lip. The positions of the upper lip and the lower lip are specified with reference to, for example, a region having a low density value extending in the horizontal direction of the image (that is, a boundary between the upper lip and the lower lip when the mouth is closed).
[0193]
Then, after specifying the positions of the upper lip and the lower lip, the image processing unit 22 sets a reference area near the position of the mouth in the captured image and sets an area group including a plurality of reference areas.
[0194]
FIGS. 26A and 26B are explanatory diagrams illustrating a reference region and a region group when opening / closing of the mouth is detected. FIG. 26A illustrates an example of a reference region, and FIG. 26B illustrates an example of a region group. As shown in FIG. 26A, the image processing unit 22 sets a reference region of 4 rows and 8 columns so as to cover both lips. Then, as shown in FIG. 26B, two area groups A4 and B4 are set. These region groups are set for each of the upper lip and the lower lip, and are specifically set to include a reference region of 2 rows and 8 columns.
[0195]
After the setting of the reference area and the area group, an optical flow is obtained in the same manner as in the second embodiment, and the data is transmitted to the motion detection unit 23.
[0196]
Thus, the reference area is set at the position of the specified mouth. Note that the process of specifying the position of the mouth and the process of setting the reference area, which differ from the second embodiment, are not so different as to affect the detection of states other than the opening and closing of the mouth or to make such detection impossible.
[0197]
After calculating the optical flow, the motion detection unit 23 obtains an actual motion pattern in the same manner as in the second embodiment (particularly, in the same manner as the detection of the face direction). Then, the state detection unit 24 obtains a correlation with a plurality of storage operation patterns and detects the state of the body.
[0198]
Here, an optical flow and an actual operation pattern obtained when opening and closing of the mouth are detected will be described. FIG. 27 is an explanatory diagram illustrating an example of an optical flow obtained when opening / closing of a mouth is detected.
[0199]
First, as shown in FIG. 27A, the driver's mouth is closed at time t. Thereafter, at time (t + 1), the driver starts to open his mouth. At this time, as shown in FIG. 27B, an optical flow is detected in the vertical direction of the image (Y direction) for the lower lip of the driver. On the other hand, in the horizontal direction of the image (X direction), optical flows are hardly detected. For the upper lip, no optical flow is detected in the vertical direction or the horizontal direction of the image.
[0200]
Then, at time (t + 2), the driver's mouth is fully open. Also at this time, as shown in FIG. 27C, an optical flow is detected only in the vertical direction of the image in the lower lip portion of the driver. On the other hand, no optical flow is detected in the upper lip.
[0201]
FIG. 28 is an explanatory diagram illustrating an example of an actual operation pattern obtained when opening / closing of a mouth is detected. FIG. 28 shows a pattern obtained from when the driver opens the mouth to when the driver closes it again.
[0202]
When the driver performs an operation of opening the mouth, as shown in FIG. 27, an optical flow is detected in the vertical direction of the image for the lower lip, and little optical flow is detected in the horizontal direction of the image. Also, for the upper lip, little optical flow is detected in either the vertical or horizontal direction of the image.
[0203]
Therefore, the obtained actual operation patterns P6 to P9 are as shown in FIG. 28. In the following, the actual operation patterns obtained for the lower lip when opening and closing of the mouth are detected are referred to as lower lip operation patterns P6 and P7. The actual operation patterns obtained for the upper lip are referred to as upper lip operation patterns P8 and P9.
[0204]
The patterns P6 to P9 shown in FIG. 28 will be specifically described. First, as for the lower lip movement pattern P6 in the image vertical direction, the movement position is near “0” in a state where the driver's mouth is closed (period 660 to 675). Thereafter, when the driver starts to open his mouth, an optical flow in the vertical direction of the image is obtained, so that the moving position rises to around "30" pixels (period from time 675 to time 700).
[0205]
Then, in a state where the driver keeps the mouth open (period 700 to 710), the movement position stays around "30" pixels. Thereafter, when the driver starts closing the mouth, the moving position gradually decreases (period 710 to 716). Then, once the driver has closed the mouth (period 716 to 734), the moving position stays near “5” pixels. The moving position stays near “5” pixels rather than returning to “0” because of detection error.
[0206]
On the other hand, the lower lip movement pattern P7 in the horizontal direction of the image continues to maintain substantially “0” in the period from time 660 to 734 since optical flows are not detected much in the horizontal direction of the image. Similarly, the upper lip motion patterns P8 and P9 continue to maintain approximately “0” in the period from time 660 to time 734.
[0207]
After the upper lip and lower lip operation patterns P6 to P9 are obtained, the state detection unit 24 reads a plurality of storage operation patterns from the storage unit 24a. Then, the state detection unit 24 detects the opening and closing of the driver's mouth by comparing the upper lip and lower lip operation patterns P6 to P9 with the storage operation patterns. The storage unit 24a according to the third embodiment stores, as a storage operation pattern, a pattern in which the upper lip is substantially stationary and the lower lip shows a predetermined movement in the image vertical direction. For this reason, the state detection unit 24 detects the opening or closing operation of the driver's mouth when the storage operation pattern having the highest correlation with the upper lip and lower lip operation patterns P6 to P9 is the one in which the upper lip is nearly stationary and the lower lip shows a predetermined movement in the image vertical direction.
[0208]
Then, the state signal output unit 25 outputs the electric signal Sb according to the detection result to the control device 30. When the opening and closing of the mouth is detected, storing the data obtained when “a”, “i”, “u”, “e”, and “o” are pronounced in the storage unit 24a as storage operation patterns makes it possible to apply the present embodiment to a device that estimates utterances and the like. That is, it can be applied to a voice input navigation device or the like.
[0209]
Further, by storing the data of the mouth movement at the time of yawning as the storage operation pattern in the storage unit 24a, it can be applied to a yawn detection device or the like. Further, by detecting yawning, the present invention can be applied to evaluation of a driver's arousal level, a doze detection device, and the like.
[0210]
The detection of the opening and closing of the mouth is relatively accurate. This is because the present embodiment detects the respective movements of the upper lip and the lower lip. For example, when detecting the opening and closing of the mouth by capturing the entire movement of the mouth, when the driver slightly moves the face up and down, it is difficult to distinguish between opening and closing of the mouth and vertical movement of the face.
[0211]
However, in the present embodiment, attention is paid to the fact that when a person opens and closes the mouth, the upper lip hardly moves and mainly the lower lip moves, and this movement is detected to determine the opening and closing of the mouth. Therefore, it is possible to detect the opening and closing operations of the mouth with relatively high accuracy.
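The distinction drawn here between mouth movement and vertical face movement can be illustrated with a simplified sketch. It replaces the correlation against stored operation patterns with fixed thresholds purely for illustration; the function name `classify_vertical_motion` and both threshold values are hypothetical and not taken from the embodiment.

```python
def classify_vertical_motion(lower_lip, upper_lip,
                             move_threshold=10, still_tolerance=3):
    """Classify vertical movement patterns of the two lips.

    lower_lip, upper_lip: vertical movement positions in pixels over time
    (corresponding to patterns P6 and P8). Thresholds are assumed values.
    Returns "mouth" when only the lower lip moves, "face" when both lips
    move together, and "none" otherwise.
    """
    lower_moves = max(abs(v) for v in lower_lip) >= move_threshold
    upper_moves = max(abs(v) for v in upper_lip) > still_tolerance
    if lower_moves and not upper_moves:
        return "mouth"
    if lower_moves and upper_moves:
        return "face"
    return "none"


# Lower lip travels about 30 pixels while the upper lip stays near 0.
print(classify_vertical_motion([0, 10, 30, 30, 5], [0, 0, 1, 0, 0]))  # mouth
# Both lips travel together, as when the whole face moves up and down.
print(classify_vertical_motion([0, 10, 30, 30, 5], [0, 9, 28, 30, 4]))  # face
```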
[0212]
Next, changes in facial expressions will be described. When detecting a change in facial expression, the image processing unit 22 specifies the position of the face. Then, the positions of the eyes and nose of the face are specified. In this specification, first, the coordinate position of the eye is specified. Then, the coordinate position of each part of the face such as the nose, mouth, cheek, and eyebrows in the captured image is specified from the relative positional relationship between the coordinate positions of the eyes.
[0213]
After specifying each part of the face, the image processing unit 22 sets a reference area for the entire face in the captured image and sets an area group including a plurality of reference areas for each part of the face.
[0214]
FIGS. 29A and 29B are explanatory diagrams showing a reference region and a region group when detecting a change in a facial expression. FIG. 29A shows an example of a reference region, and FIG. 29B shows an example of a region group. As shown in FIG. 29A, the image processing unit 22 sets a reference area of 14 rows and 16 columns so as to cover the entire face.
[0215]
Then, as shown in FIG. 29B, eleven area groups A5 to K5 are set. More specifically, the area groups A5 to D5 are set for the right eyebrow, left eyebrow, right eye, and left eye positions, respectively, and each is set to include reference areas of three rows and eight columns. The area groups E5, G5, I5, and J5 are set for the right cheek, left cheek, right jaw, and left jaw positions, respectively, and each is set to include reference areas of 4 rows and 4 columns.
[0216]
Further, the region groups F5, H5, and K5 are set for the nose, upper lip, and lower lip positions, respectively. Specifically, they are set to include reference regions of three rows and eight columns, one row and eight columns, and four rows and four columns, respectively.
[0217]
After setting the reference region and the region group, the image processing unit 22 obtains an optical flow as in the second embodiment, and sends the data to the operation detection unit 23.
[0218]
Thus, the reference area is set at the position of each part of the face. Note that these differences from the second embodiment in the facial expression detection are not so large as to affect the detection of the driver state or to make that detection impossible.
[0219]
After calculating the optical flow, the motion detection unit 23 obtains an actual motion pattern in the same manner as in the second embodiment (particularly, in the same manner as the detection of the face direction). Then, the state detection unit 24 obtains a correlation with a plurality of storage operation patterns and detects the state of the body.
[0220]
Here, an optical flow and an actual operation pattern obtained when detecting a change in facial expression will be described. FIG. 30 is an explanatory diagram illustrating an example of an optical flow obtained when a change in a facial expression is detected. FIG. 31 is an explanatory diagram showing the optical flow shown in FIG. 30 in a simplified manner. FIGS. 30 and 31 show an optical flow in the case where the driver performs an operation of frowning.
[0221]
First, as shown in FIG. 30A, at time t, the expression of the driver is in a normal state. Thereafter, at time (t + 1), the driver starts frowning. At this time, as shown in FIG. 30B, optical flows are detected near the eyebrows and eyes. Then, when the driver frowns at time (t + 2), as shown in FIG. 30C, the optical flow is not detected.
[0222]
FIG. 31 shows a state from time t to (t + 2). As shown in the figure, when the driver performs an operation of frowning, the position of the eyebrow tends to slightly move in the vertical direction of the image. In addition, slight movement is seen in the eyes.
[0223]
FIG. 32 is an explanatory diagram illustrating an example of an actual operation pattern obtained when a change in a facial expression is detected. FIG. 32 shows a pattern obtained when the operation of frowning the eyebrows shown in FIGS. 30 and 31 is performed.
[0224]
As is clear from FIGS. 30 and 31, when the driver performs the operation of frowning, the optical flow is detected near the eyebrows and eyes. Thus, the obtained actual operation pattern is as shown in FIG.
[0225]
As shown in FIG. 32, the optical flow is obtained in the eyebrows and the eyes from the time t to (t + 2), and therefore, these movement positions change respectively. On the other hand, there is almost no change in the face part other than the eyebrows and eyes.
[0226]
After the actual operation pattern is obtained for each characteristic part of the face as described above, the state detection unit 24 reads a plurality of storage operation patterns from the storage unit 24a. Then, the state detection unit 24 detects the expression of the driver based on the actual operation pattern for each characteristic part of the face and the storage operation pattern stored in the storage unit 24a.
[0227]
Then, the state signal output unit 25 outputs the electric signal Sb according to the detection result to the control device 30.
[0228]
Here, it is desirable to store a storage operation pattern for a change in facial expression for each facial expression. In this case, various facial expressions can be detected. Therefore, for example, it is possible to distinguish between a laughing state and a squinting state, which are conventionally difficult to distinguish.
[0229]
In the present embodiment, the change of the facial expression is detected in the vehicle, and the control device performs control based on the detection result, which is more useful. For example, when the driver performs an operation of frowning, the control device 30 can control the electric sunshade. In addition, the control device 30 can control the audio device, for example, by detecting the driver's emotion from the change in facial expression and calming the emotion of the driver in an irritable state. As described above, it is possible to control the in-vehicle environment in consideration of the driver's feelings and the like based on the change in the facial expression, which is very useful.
[0230]
Here, an apparatus that only recognizes facial expressions is disclosed in Japanese Patent Application Laid-Open No. 4-32078. In the present embodiment, a facial expression can be detected by a method similar to the conventional technique.
[0231]
The above is the description of the detection of the physical condition. Further, in the third embodiment, similarly to the second embodiment, a function of suppressing detection of another driver state and / or body state based on a detection result is provided. Therefore, the state detection unit 24 first detects at least one state among the detection targets from the optical flow obtained by the image processing unit 22. Then, based on the result, it is determined whether or not the inhibition signal can be transmitted. If the condition is satisfied, the inhibition signal is transmitted to the operation detection unit 23.
[0232]
That is, at least two of the three driver states and the three physical states are the detection targets, at least one of these states is detected, and the detection of the other states is suppressed based on that detection result.
[0233]
This prevents erroneous detection of the other states, as in the second embodiment.
[0234]
In this manner, in the state detection device 20b according to the third embodiment, as in the second embodiment, it is possible to improve the cost and versatility and to prevent erroneous detection.
[0235]
Further, similarly to the second embodiment, the convenience can be improved, and the influence of noise or the like can be reduced.
[0236]
Further, even at the start of driving, the state of the driver and the like can be detected.
[0237]
In addition, it is possible to prevent an increase in the amount of calculation and to reduce the possibility that a plurality of characteristic parts are simultaneously included in one reference area. Further, by not setting an area that is too small, it is possible to prevent an area from containing no characteristic part.
[0238]
In addition, it is possible to preferably detect the driver state, and prevent inaccurate detection.
[0239]
In addition, in the state detection device 20b according to the present embodiment, a blink of the driver is detected when the eyelid movement pattern shows a predetermined movement in the vertical direction of the image followed by a return of that predetermined movement. As described above, in the present embodiment, blinks are detected by a novel method called optical flow. Therefore, for example, when a detection target is detected by combining two or more methods, blink detection with high detection accuracy as a whole can be performed. In addition, the detection of the eye opening operation and the eye closing operation can be similarly performed with high accuracy.
[0240]
In addition, the opening and closing of the mouth can be detected relatively accurately. That is, the present embodiment focuses on the fact that, when a person opens and closes the mouth, the upper lip hardly moves while the lower lip moves, and detects this movement to determine the opening and closing of the mouth. This makes the distinction between raising or lowering the face and opening or closing the mouth clear. Therefore, the opening and closing operations of the mouth can be detected with relatively high accuracy.
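The lip-based criterion above can be sketched as follows. This is an illustrative sketch only: the function name `detect_mouth_motion`, the use of the displacement range as the measure of movement, and both thresholds are assumptions and are not specified in the text.

```python
def detect_mouth_motion(upper_lip_y, lower_lip_y,
                        still_threshold=1.0, move_threshold=3.0):
    """Return True when the patterns indicate the mouth opening or closing.

    upper_lip_y, lower_lip_y: accumulated vertical displacement per frame
    for the upper-lip and lower-lip region groups (assumed unit: pixels).
    """
    upper_range = max(upper_lip_y) - min(upper_lip_y)
    lower_range = max(lower_lip_y) - min(lower_lip_y)
    # Mouth motion: the upper lip is substantially stationary while the
    # lower lip moves vertically. If both lips move together, the whole
    # face is being raised or lowered instead.
    return upper_range <= still_threshold and lower_range >= move_threshold
```

Because a nod moves both lips by a similar amount, the stationary upper-lip condition is what separates it from mouth motion in this sketch.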
[0241]
Further, the facial expression of the driver is detected based on the actual operation patterns of characteristic portions of the face, such as the eyes and nose, and on stored operation patterns stored in advance. Further, in the present embodiment, a change in facial expression is detected in the vehicle, and the control device performs control based on the detection result. For this reason, when the driver frowns, for example, the control device can control the electric sunshade. The control device can also control the audio device, for example, so as to calm a frustrated driver whose emotion has been detected from the change in facial expression. In this way, the in-vehicle environment can be controlled in consideration of the driver's emotion and the like based on the change in facial expression.
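Comparing an actual operation pattern with stored operation patterns can be sketched as follows. The claims describe calculating a correlation with each stored pattern and taking the one with the highest correlation; this sketch assumes a Pearson correlation over equal-length sequences, and the pattern names and values in the usage note are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                     * sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0


def best_match(actual, stored):
    """Return the name of the stored pattern most correlated with `actual`.

    stored: dict mapping a pattern name to its stored operation pattern.
    """
    return max(stored, key=lambda name: correlation(actual, stored[name]))
```

For example, with hypothetical stored patterns `{"frown": [0, 1, 2, 1, 0], "smile": [0, -1, -2, -1, 0]}`, an actual pattern that rises and then falls is matched to `"frown"`.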
[0242]
In the present embodiment, at least two of the three driver states and the three physical states may be detected, and the number may be three or more. Further, since the detection of the other states is suppressed based on at least one detection result, the detection of the remaining two states may be suppressed based on three of the six detection results. Further, the number of detection results and the number of states in which detection is suppressed are not limited thereto, and can be changed as appropriate.
[0243]
Next, a fourth embodiment of the present invention will be described. In the fourth embodiment, in addition to the configuration of the third embodiment, a vehicle state detecting unit and an environment information detecting unit are newly provided. Hereinafter, a fourth embodiment will be described.
[0244]
FIG. 33 is a block diagram illustrating a configuration of a state detection system including a state detection device according to the fourth embodiment. As shown in the figure, the state detection system 1c includes a vehicle state detection unit 40 that detects a state of the vehicle, and an environment information detection unit 50 that detects a surrounding environment of the vehicle.
[0245]
More specifically, the vehicle state detection means 40 detects one or more vehicle-related states such as vehicle speed, brake switch on / off information, accelerator switch on / off information, steering angle, shift range information, and the like.
[0246]
The environment information detecting means 50 acquires position information obtained by a navigation system using a GPS or a gyro, and detects, for example, the type of a traveling road, the presence or absence of an intersection, and the like.
[0247]
Further, the environmental information detecting means 50 includes at least one of a visible light camera, a far infrared ray detecting element, a laser radar, and an ultrasonic sensor, and detects information around the vehicle. With this configuration, the environment information detecting unit 50 detects, for example, the presence / absence / approach of a preceding vehicle or an obstacle, pedestrian crossing, approaching of a succeeding vehicle, approaching vehicle from the side behind, and the like.
[0248]
Further, the environmental information detecting means 50 also obtains information such as meteorological information, the weather, and, based on an illuminometer, the outside brightness and the distinction between day and night.
[0249]
Further, the state detection device 20c has a function of changing the state to be detected (driver state or body state) based on at least one of the signal Sc from the vehicle state detection means 40 and the signal Sd from the environment information detection means 50.
[0250]
For example, the state detection device 20c detects the orientation of the driver's face based on the environment signal Sd, derived from the map information of the navigation system, indicating that the vehicle is approaching an intersection with poor visibility or an intersection without traffic signals.
[0251]
Further, the state detection device 20c determines that the vehicle is in a traffic jam based on the vehicle state signal Sc indicating that the vehicle speed is equal to or lower than a set speed, and detects a change in facial expression in order to detect whether the driver feels drowsy.
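The two examples above of switching the detection target according to the signals Sc and Sd can be sketched as follows. This is only a sketch: the class and field names, the string labels for the detection targets, and the 10 km/h congestion threshold are all assumptions introduced for illustration; the text says only that a set speed is used.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    """Assumed contents of the vehicle state signal Sc."""
    speed_kmh: float


@dataclass
class Environment:
    """Assumed contents of the environment signal Sd."""
    approaching_poor_visibility_intersection: bool = False
    approaching_unsignalized_intersection: bool = False


def select_targets(sc, sd, congestion_speed_kmh=10.0):
    """Choose which states to detect from the Sc and Sd signals."""
    targets = set()
    if (sd.approaching_poor_visibility_intersection
            or sd.approaching_unsignalized_intersection):
        # Near such intersections, detect the orientation of the face.
        targets.add("face_direction")
    if sc.speed_kmh <= congestion_speed_kmh:
        # Low speed is treated as a traffic jam: watch the expression.
        targets.add("expression_change")
    return targets
```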
[0252]
In this way, as in the third embodiment, the state detection device 20c according to the present embodiment can be improved in cost and versatility and can prevent erroneous detection.
[0253]
Further, similarly to the third embodiment, the convenience can be improved, and the influence of noise or the like can be reduced.
[0254]
Further, even at the start of driving, the state of the driver and the like can be detected.
[0255]
In addition, it is possible to prevent an increase in the amount of calculation and to reduce the possibility that a plurality of characteristic parts are simultaneously included in one reference area. Further, because an area that is too small is not set, it is possible to prevent a reference area from containing no characteristic part.
[0256]
In addition, the driver state can be detected suitably, and inaccurate detection can be prevented.
[0257]
Further, for example, when detecting a detection target by combining two or more techniques, blink detection with high detection efficiency as a whole can be performed. Further, the detection of the eye opening operation and the eye closing operation can be similarly performed with high accuracy.
[0258]
Further, the opening and closing operations of the mouth can be detected with relatively high accuracy, and the in-vehicle environment can be controlled in consideration of the driver's emotions and the like based on the change in facial expression.
Further, the state to be detected is changed based on at least one of the signal Sc from the vehicle state detecting means 40 and the signal Sd from the environment information detecting means 50. Therefore, it is possible to detect an appropriate driver / body condition according to each condition / environment.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a state detection system including a state detection device according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing a detailed configuration of a state detection device 20 shown in FIG.
FIG. 3 is a data flow diagram schematically showing an operation of the state detection device 20 according to the embodiment.
FIG. 4 is an explanatory diagram illustrating an outline of an operation of the state detection device 20 according to the present embodiment.
FIG. 5 is an explanatory diagram of a reference area and a search area.
FIG. 6 is an explanatory diagram of reference regions regularly arranged in a captured image, where (a) shows an example in which the reference regions are arranged in the horizontal direction of the image, (b) shows a further example arrangement of the reference regions, and (c) shows an example in which the reference regions are arranged in the horizontal direction of the image and in a lattice shape.
FIG. 7 is an explanatory diagram of an area group, and shows an example in the case of detecting the direction of a face.
FIG. 8 is an explanatory diagram of an area group, and shows an example of a case in which an object other than the driver's face enters or leaves the imaging range.
FIG. 9 is a flowchart illustrating an operation of the image processing unit 22 illustrated in FIG. 2;
FIG. 10 is an explanatory diagram of a method of calculating a movement amount (xd, yd) in step ST11 shown in FIG.
FIG. 11 is an explanatory diagram of the process of step ST12 shown in FIG.
FIG. 12 is an explanatory diagram showing an example of an optical flow when detecting the orientation of the driver's face, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), (c) at time (t + 2), and (d) at time (t + 3).
FIG. 13 is an explanatory diagram showing an example of an optical flow when detecting the presence or absence of the driver, where (a) shows an example of the optical flow before boarding, (b) during boarding, and (c) after boarding is completed.
FIG. 14 is an explanatory diagram showing an example of an optical flow when detecting the entry and exit of an object other than the driver's face into and from the imaging range, for the case where the driver moves a hand near the eyes, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 15 is an explanatory diagram showing an example of an optical flow when detecting the entry and exit of an object other than the driver's face into and from the imaging range, for the case where the driver lifts a book to look at a road map or the like, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 16 is an explanatory diagram showing an example of an optical flow when detecting the entry and exit of an object other than the driver's face into and from the imaging range, for the case where a spoke portion of the steering wheel enters the imaging range, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 17 is a flowchart showing an operation of the operation detecting unit 23 shown in FIG.
FIG. 18 is an explanatory diagram of an actual operation pattern obtained by the operation detection unit 23 shown in FIG. 2, for the case where the detection target is the orientation of the driver's face.
FIG. 19 is an explanatory diagram of an actual operation pattern obtained by the operation detection unit 23 shown in FIG. 2, for the case where the detection target is the entry and exit of an object other than the driver's face into and from the imaging range.
FIG. 20 is an explanatory diagram showing an example of an actual operation pattern obtained by the operation detection unit 23 when the detection target is the presence or absence of a driver.
FIG. 21 is a flowchart showing an operation of the state detection unit 24 shown in FIG.
FIG. 22 is a flowchart illustrating a suppression control process performed by the state detection unit 24.
FIGS. 23A and 23B are explanatory diagrams illustrating a reference region and a region group when opening / closing an eyelid is detected, where FIG. 23A illustrates an example of a reference region and FIG. 23B illustrates an example of a region group.
FIG. 24 is an explanatory diagram showing an example of an optical flow obtained when opening and closing of an eyelid is detected, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 25 is an explanatory diagram showing an example of an actual operation pattern obtained when opening / closing an eyelid is detected.
FIG. 26 is an explanatory diagram showing a reference region and a region group when opening and closing of the mouth is detected, where (a) shows an example of a reference region and (b) shows an example of a region group.
FIG. 27 is an explanatory diagram showing an example of an optical flow obtained when opening and closing of the mouth is detected, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 28 is an explanatory diagram showing an example of an actual operation pattern obtained when opening / closing of a mouth is detected.
FIGS. 29A and 29B are explanatory diagrams showing a reference region and a region group when detecting a change in a facial expression. FIG. 29A shows an example of a reference region, and FIG. 29B shows an example of a region group.
FIG. 30 is an explanatory diagram showing an example of an optical flow obtained when detecting a change in facial expression, where (a) shows an example of the optical flow at time t, (b) at time (t + 1), and (c) at time (t + 2).
FIG. 31 is an explanatory diagram showing a simplified optical flow shown in FIG. 30;
FIG. 32 is an explanatory diagram showing an example of an actual operation pattern obtained when a change in a facial expression is detected.
FIG. 33 is a block diagram illustrating a configuration of a state detection system including a state detection device according to a fourth embodiment.
[Explanation of symbols]
1 to 1c ... State detection system
10 ... Imaging device
20 to 20c ... State detection device
21 ... Image acquisition unit (image acquisition means)
22 ... Image processing unit (image processing means)
23 ... Operation detection unit (operation detection means)
24 ... State detection unit (state detection means)
24a ... Storage unit (pattern storage means)
25 ... State signal output unit (signal output means)
30 ... Control device (control means)
40 ... Vehicle state detection means
50 ... Environment information detection means
AK ... Area group
C1, C2 ... Change amount
D ... Stored operation pattern
P ... Actual operation pattern
P4, P5 ... Eyelid movement pattern
P6, P7 ... Lower lip movement pattern
P8, P9 ... Upper lip movement pattern
Sa ... Video signal
Sb ... Electric signal
Sc ... State signal
Sd ... Environment signal

Claims

A state detection device comprising:
image processing means for obtaining an optical flow between captured images, based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated; and
state detection means for detecting, from the optical flow obtained by the image processing means and without specifying the position of the driver's body in the captured image, at least one of three driver states as a detection target, the three driver states being the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver.

A state detection device comprising:
image processing means for obtaining an optical flow between captured images, based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated; and
state detection means for which at least two of three driver states, namely the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver, are set as detection targets, and which detects at least one driver state among the detection targets from the optical flow obtained by the image processing means without specifying the position of the driver's body in the captured image,
wherein the state detection means, based on a detection result regarding at least one driver state, suppresses detection of a driver state other than the detected at least one driver state among the detection targets.

A state detection device comprising:
image processing means for obtaining an optical flow between captured images, based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated; and
state detection means for which at least two of three driver states, namely the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver, and three body states, namely the opening and closing of the driver's eyelids, the opening and closing of the driver's mouth, and the change in the driver's facial expression, are set as detection targets, and which detects at least one state among the detection targets from the optical flow obtained by the image processing means, without specifying the position of the driver's body in the captured image for the driver states, and by specifying the position of at least one of the driver's body and a specific part of the body in the captured image for the body states,
wherein the state detection means, based on a detection result regarding at least one state, suppresses detection of a state other than the detected at least one state among the detection targets.

The state detection device according to claim 3, further comprising:
image acquisition means for inputting captured images obtained by capturing, in time series, the position where the face, as the driver's body, is present when the driver is seated;
operation detection means for detecting the movement of the face from the optical flow obtained by the image processing means; and
signal output means for converting the detection result from the state detection means into an electric signal and outputting the electric signal to the outside,
wherein the image processing means obtains an optical flow for each of one or more calculation regions defined by a predetermined position and size in the captured image input by the image acquisition means,
the operation detection means obtains, for each region group consisting of at least one calculation region, an actual operation pattern from the optical flow obtained by the image processing means, and
the state detection means detects at least one of the detection targets based on the actual operation pattern obtained by the operation detection means and a stored operation pattern stored in advance.

The state detection device according to claim 4, wherein the state detection means obtains an actual operation pattern spatially and temporally from the optical flow obtained by the image processing means.

The state detection device according to claim 4, further comprising pattern storage means for storing the stored operation pattern in advance,
wherein the pattern storage means stores, as the stored operation pattern, a feature quantity obtained based on actual movements of a driver.

The state detection device according to any one of claims 4 to 6, wherein the image processing means sets at least one of the one or more calculation regions to a predetermined size of a facial part based on the proportion of the captured image occupied by the face.

The state detection device according to claim 7, wherein the operation detection means accumulates a movement amount based on the movement of the face, obtains an actual operation pattern based on the accumulated value, and initializes the accumulated value based on a predetermined condition.

The state detection device according to any one of claims 4 to 8, wherein, in detecting at least one of the detection targets, the state detection means calculates a correlation between the actual operation pattern and each of a plurality of stored operation patterns stored in advance, and obtains the stored operation pattern having the highest correlation as the detection result.

The state detection device according to claim 4, wherein the state detection means detects a blink when the eyelid movement pattern, as the actual operation pattern obtained by the operation detection means, shows a predetermined movement in the vertical direction of the image and then shows a return by a predetermined movement.

The state detection device according to any one of claims 4 to 10, wherein the state detection means detects that the driver's mouth has opened or closed when the upper lip operation pattern, as the actual operation pattern obtained by the operation detection means, indicates a substantially stationary state and the lower lip operation pattern indicates a predetermined movement in the vertical direction of the image.

The state detection device according to any one of claims 4 to 11, wherein the operation detection means obtains an actual operation pattern for each characteristic portion of the face, and
the state detection means detects the facial expression of the driver based on the actual operation pattern for each characteristic portion obtained by the operation detection means and a stored operation pattern stored in advance.

The state detection device according to claim 4, wherein the image processing means determines, for each of the one or more calculation regions, whether or not to use that region in the optical flow calculation by comparing the amount of change in the feature quantity calculated in the region with a preset threshold value.

The state detection device according to claim 1, wherein at least one of the detection targets is changed to another state based on at least one of a signal from vehicle state detection means that detects a state of the vehicle and a signal from environment information detection means that detects the surrounding environment of the vehicle.

A state detection device that obtains an optical flow between captured images based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated, and that detects, from the obtained optical flow and without specifying the position of the driver's body in the captured image, at least one of three driver states as a detection target, the three driver states being the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver.

A state detection device that obtains an optical flow between captured images based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated; sets at least two of three driver states, namely the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver, as detection targets; detects at least one driver state among the detection targets from the obtained optical flow without specifying the position of the driver's body in the captured image; and suppresses detection of a driver state other than the detected at least one driver state based on a detection result regarding the detected at least one driver state.

A state detection device that obtains an optical flow between captured images based on captured images obtained by capturing, in time series, the position where the driver's body is present when the driver is seated;
sets at least two of three driver states, namely the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver, and three body states, namely the opening and closing of the driver's eyelids, the opening and closing of the driver's mouth, and the change in the driver's facial expression, as detection targets, and detects at least one state among the detection targets from the obtained optical flow, without specifying the position of the driver's body in the captured image for the driver states, and by specifying the position of at least one of the driver's body and a specific part of the body in the captured image for the body states; and
suppresses detection of a state other than the detected state among the detection targets based on a detection result regarding the detected at least one state.

A state detection system comprising:
imaging means whose imaging range includes the position where the driver's body is present when the driver is seated;
image processing means for obtaining an optical flow between captured images based on the captured images taken in time series by the imaging means;
state detection means for detecting, from the optical flow obtained by the image processing means and without specifying the position of the driver's body in the captured image, at least one of three driver states as a detection target, the three driver states being the orientation of the driver's face, the entry and exit of an object other than the driver's face into and from the imaging range, and the presence or absence of the driver;
signal output means for converting the detection result from the state detection means into an electric signal and outputting the electric signal to the outside; and
control means for performing predetermined processing based on the electric signal from the signal output means.