JP3797686B2

JP3797686B2 - Shape recognition apparatus and method

Info

Publication number: JP3797686B2
Application number: JP23025795A
Authority: JP
Inventors: 修山口; 和広福井; 恭一岡本
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-09-07
Filing date: 1995-09-07
Publication date: 2006-07-19
Anticipated expiration: 2015-09-07
Also published as: JPH0973544A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理における形状認識装置及びその方法に関する。
【０００２】
【従来の技術】
モデル当てはめにより形状を抽出する方法としては次のようなものがある。
【０００３】
テンプレートマッチングは、一定の曲線や画像としてモデルを記述し、モデルと対象画像の間の類似度を判定することによって対象認識を行う。このテンプレートマッチングには、モデルの記述として画像の明度分布をそのまま用いるものと、二値画像を用いてモデルの形状をエッジ点の集合として表現するものがある。前者は明度分布の類似度を測ることによって形状認識を行う。後者は、まず画像の明度分布に基づいてエッジの強度画像を作り、そのエッジ強度画像を二値化することによって二値画像を作る。そして、二値画像の点集合とモデルの点集合の類似度を計算するというものである。
【０００４】
一般化ハフ変換(Ballard,D.H．“Generalizing the Hough transform to detect arbitary shapes" Pattern Recognition.13,111-122(1981)) は、モデルを輪郭線上の点のリストとして記述し、その平行移動、回転、拡大縮小の表すパラメータが使われる。この変換は、パラメータ空間への”投票”によって、パターンが表すもっとも適切なパラメータの位置に局所的なピークを形成するように変換される。このハフ変換は、部分的にデータが欠けているような場合でもうまく照合が行えるといった利点がある。
【０００５】
次に、モデルと画像データとの適合性がある目的関数（エネルギ）によって測られ、その関数の値の最小化などによってモデルの照合が行われるものがある。例えば最近では、スネーク(M．Kass,A.Witkin,D.Terzopouls，“Snakes:Active Contour Models",In Proc.1st Int.Conf.on Computer Vision,pp.259-268(1987)) などがある。
【０００６】
これらのモデル当てはめには、生の画像データをそのまま用いるもの、または対象のもつ他の特徴、例えばエッジ強調フィルタや線強調フィルタといった低レベルの画像処理を行った画像データを使うものもある。
【０００７】
本発明では、エッジデータを用いて、ノイズ、外乱に強い安定した形状認識について考える。そのためにまず、物体輪郭や物体内部の構造を反映したエッジ情報を抽出する手法について挙げる。
【０００８】
エッジの抽出方法には、画像の明度勾配に基づく方法とエッジ近傍領域の情報に基づく方法がある。画像の明度勾配に基づく方法は、明度勾配が急激に変化する位置をエッジとして抽出するもので、１次微分型のRobertｓ、Prowitｔ、Sobel や２次微分型のLaplacian などの方法がある。エッジ近傍領域の情報に基づく方法としては、エッジモデルの当てはめを局所領域に対して行うHueckel の方法などがある。
【０００９】
明度勾配に基づいたエッジ抽出は、弱いエッジを抽出できないといった問題や色相画像など他のエッジ強度に拡張し難いといった問題がある。明度勾配エッジ抽出を用いた、従来法の対象認識についての問題点をまとめる。
【００１０】
テンプレートマッチングはテンプレートに対して、画像データにわずかな変動があった場合に弱いという特徴を持つ。一般化ハフ変換については、エッジ点の数に投票数が影響を受ける。エッジ点の有無のみが問題となり、エッジ抽出の性能の影響を非常に受ける。スネークモデルは、制御点同士を結び付ける弾性エネルギ関数による関連づけによって、柔軟でノイズに強いという性質を持つ。しかし、なお局所的なエネルギーの最小値に落ち込むという欠点がある。
【００１１】
これらの問題を解決するために、エッジ近傍領域の情報に基づく方法を用いることが考えられる。エッジを画像の明度だけでなく、色、テクスチャの異なる領域間の境界としてとらえ、エッジ強度をこれら領域間の特徴量から計算する方法として、“領域間の分離度に基づくエッジ抽出、情報処理学会コンピュータビジョン研究会資料87-1(1994)”が提案されている。
【００１２】
「分離度」とは、データの集合を幾つかの部分集合に分割した場合に、それぞれの部分集合がどの程度分離されているかを表す量である。これは、集合全体のデータの特徴に対する部分集合のデータの特徴の割合で表され、その値は０．０〜１．０の値をとる。
【００１３】
今、図２のように、画像中の２つの領域を上記の２つの集合と考えた場合、分離度が大きい程、領域の境界に、強いエッジが存在することになる。つまり、ある画素を境とする２つの領域について考えると、その画素における分離度が大きければ、その画素には強いエッジが存在するということができる。
【００１４】
分離度Ｓは次の式で与えられる。
【００１５】
【数１】

この手法はエッジを安定に抽出できるという利点があるものの、逆に、分離度は画像抽出量の差に依存したものではないため、対象物のエッジを抽出するためには分離度をそのまま用いるには不十分であるとの観点から、重みつき分離度（特願平７−７４９９号）も提案されている。重みつき分離度は、対象物の存在位置、色情報、テクスチャ情報などを利用して選択的に対象物のエッジを抽出することを目的として、位置に関する重み付けと画像特徴の重み付けを分離度の式に組込んだ形をとる。
【００１６】
【発明が解決しようとする課題】
上述した分離度を用いて適当な大きさ、形状のマスクを設定し、画像に適用し、従来法に効果的なエッジ強度画像を生成することができる。しかし、このような単純な使い方では不十分である。
【００１７】
図３（ａ）で示したような画像データにおける、緩やかなエッジで構成されている対象の物体認識について考える。
【００１８】
この図では、ア、イ、ウ、エの４点に対してエッジ強度を図３（ｂ）に示す。エッジの強度について考えると、対象を構成しているエッジ強度が図３（ｃ）のように単調に変化している場合や撮影時のボケによる影響により弱くなっている場合、二値化によりエッジ画像を作ると、閾値の設定方法によってはエッジ点として抽出されないこともある。
【００１９】
しかし、図３（ａ）のように同じように連続した弱いエッジを持つ場合は物体として認識されなければならない。
【００２０】
このような実際の対象としては人間の目の瞳などがある。
【００２１】
顔画像を通常室内環境で撮影し、撮影された瞳の形状抽出を行おうとする場合、図８において、(1) 〜(3) で示すような問題が挙げられる。
【００２２】
(1) まぶた、まつげ等に瞳の一部を隠されてしまうため推定が不安定である（図８ア）。
【００２３】
(2) 蛍光灯の写り込みにより、非常に強いエッジが画像中にできてしまう（図８イ）。
【００２４】
(3) 虹彩と白目の間はなだらかに変化しており、弱いエッジで構成されている（図８ウ）。
【００２５】
従来の対象認識法は、エッジ強度、エッジ点の幾何学的配置についての考慮はなされているが、隣接するエッジ点同士の関係としてのエッジの方向性とエッジ強度の類似性には着目していない。
【００２６】
例えば、一般化ハフ変換におけるエッジ点同士の関係は、エッジ点の集合として表される幾何学的配置関係のみである。
【００２７】
また、スネークモデルは、制御点同士を結び付ける弾性エネルギ関数による関連づけや、最適化のための制御点同士の情報交換等があるが、積極的に関係情報を利用したものではない。また、楕円等の形状をパラメータ表現できる場合、楕円を構成するエッジ点を抽出し、最小自乗法等によって楕円のパラメータを求めることによって、検出を行っていた。しかし、最小自乗法による楕円の当てはめには、最小自乗法に偏差があることが知られていることや楕円の一部が隠されてしまうことによって、推定が不安定になるという問題がある。このように従来の形状抽出はノイズや外乱に弱かった。
【００２８】
そこで、本発明による装置では、ノイズ、外乱に強く、安定した形状認識が行えるものを提供する。
【００２９】
【課題を解決するための手段】
本発明の形状認識装置は、画像を入力する画像入力部と、認識対象物の形状を擬似的に表すモデルの輪郭に沿って２つの領域を有する特徴量抽出領域を設定する特徴量抽出領域設定部と、前記モデルを前記画像に当てはめるための、前記モデルの前記画像中における位置、姿勢または大きさを決定するアフィン変換用のパラメータを設定するパラメータ設定部と、前記特徴量抽出領域が設定された前記モデルを、前記パラメータに基づいて前記画像に当てはめるモデル適用部と、前記画像に当てはめる前記モデルの前記特徴量抽出領域における分離度を、前記画像から求める特徴量抽出部と、前記分離度が最も高いパラメータを決定するパラメータ決定部と、を有し、前記特徴量抽出領域設定部は、前記モデルの輪郭に沿った第１の側の第１領域と第２の側の第２領域との２つの領域を有する前記特徴量抽出領域を設定し、前記特徴量抽出部は、前記分離度として前記第１領域と前記第２領域との分離度を用い、かつ、前記分離度を、
【数１４】

に基づいて求めることを特徴とする。
【００３１】
本発明による装置では、ノイズ、外乱に強く、安定した形状認識が行える。
【００３２】
【発明の実施の形態】
以下に本発明の実施の形態における実施例を説明する。
【００３３】
なお、本実施例は、人間の目の瞳の部分の形状認識を行うという状況での一実施例について説明する。ＣＣＤカラーカメラを用いて取得した顔画像では、図７のように、瞳の形状が楕円として撮影される。この瞳の形状を楕円として近似しその楕円パラメータを求めることについて考える。
【００３４】
形状認識装置１００のブロック図を図１に示す。
【００３５】
画像を入力するために必要なＡ／Ｄ変換や入力変換手段と画像蓄積部からなる画像入力部１０１、取得した画像の前処理するための画像前処理部１０２、そして形状抽出を行う形状抽出部１０３から構成される。
【００３６】
(1) 画像入力部１０１
画像入力部１０１について説明する（図４参照）。
【００３７】
画像入力部１０１は、カラーＣＣＤカメラ１、ビデオボード２、画像を蓄積するメモリ３からなる。なお、画像入力部１０１は、ＣＣＤカメラ１からの入力に限らずビデオ等の映像出力機器やＸ線撮影装置等で得られた画像をビデオボードの端子から入力することもできる。
【００３８】
(2) 画像前処理部１０２
画像前処理部１０２について説明する（図５参照）。
【００３９】
画像入力部１０１に入力された画像を明度画像、色相画像、彩度画像といった画像に変換する演算装置４と、これら変換した画像を保存する画像蓄積部５からなる。
【００４０】
ＣＣＤカメラ１によって取得されたカラー画像はＲ、Ｇ、Ｂの３枚の画像からなる。演算装置４は、画像中のある一点（ｘ，ｙ）のそれぞれのＲ，Ｇ，Ｂ画像の値を、（Ｒ，Ｇ，Ｂ）として、このＲＧＢカラー空間における色表現をＨＳＶカラー空間(Computer Graphics 2nd edition(Addison Wesley)pages 592-593)やＧＨＬＳカラー空間(GLHS:A generalized Lightness,Hue,and Saturation Color Model,Haim Levkowitz and Gabor T.Herman, CVGIP Graphical Models and Image Processing(GMIP)Volume 55,Number 4,July 1993)での色表現に変換する。画像蓄積部５のメモリから、（ｘ，ｙ）の位置に対応するＲ，Ｇ，Ｂ値を読みとり、上記した変換アルゴリズムによって（Ｒ，Ｇ，Ｂ）から（Ｈ，Ｓ，Ｖ）に変換し、その（Ｈ，Ｓ，Ｖ）の値を画像蓄積部５のメモリに記憶させる。
【００４１】
なお、本実施例では、画像入力部１０１と画像前処理部１０２とを別の構成にしたが、これに限らず、両者を一体として画像入力部としてもよい。
【００４２】
(3) 形状抽出部１０３
形状抽出部１０３について説明する（図６参照）。
【００４３】
形状抽出部１０３は３つの構成からなり、マスクを構成するマスク構成部６、各マスク内の特徴量を抽出する特徴量抽出部７、各マスクで得られた特徴量を組み合わせて、形状抽出の評価値を計算する組合せ評価値計算部８からなる。
【００４４】
形状抽出部１０３については、「組合せ分離度」を用いて説明する。
【００４５】
なお、前記で説明した式（１）で求められる分離度を、以下で説明する「組合せ分離度」と区別するために「単純分離度」と呼ぶ。
【００４６】
「組合せ分離度」は先に説明した単純類似度のマスクを、エッジ方向と幾何学的位置関係を考慮して、複数個配置し、それらを関係づける評価関数によって求められる。
【００４７】
図９に示すように楕円状に分離度を求めるためのマスクを配置する。これを部分マスクと呼ぶ。楕円の接線方向に垂直な直線上（エッジ方向）に部分マスクを配置する。そして、各部分マスクについて、各分離度の計算を行う。組合せ分離度Ｃ（０．０≦Ｃ≦１．０）を次式のように定義する。
【００４８】
【数２】

または、
【数３】

但しＧ，Ｅは次のようにして求められる。
【００４９】
Ｇは隣接する部分マスク間の分離度の差をとり、その差が小さいときによい評価を与えるという、複数あるマスク間の幾何学的配置とエッジの類似度を評価する関数である。
【００５０】
部分マスクＭｉ（ｉ＝１〜ｎ）の単純分離度をＳ（Ｍｉ）で表す。各部分マスク間の差の絶対値を取り、１．０から引いた値を各マスク間の「類似度」として定義し、これより次式で表される。
【００５１】
【数４】

Ｅは各部分マスクで計算された分離度の総和を取り、全体のエッジ強度を表す。
【００５２】
【数５】

Ｇ，Ｅの値が、０≦Ｇ、Ｅ≦ｎとなるため、式（２）のように、正規化を行う。
【００５３】
α、βそれぞれの重みについて実験等によって決定する。一般には、α＝０．５、β＝０．５として評価する。例えば、α＝１、β＝０として、エッジ間の類似度のみを評価するようにすることも可能である。これは、あらかじめ物体の存在する位置等がわかっており、非常にエッジ間の類似度が高い場合に利用される。
【００５４】
式（３）は、Ｇ，Ｅのそれぞれを正規化した値を掛け合わせることによって評価しており、全体のエッジ強度とエッジ間の類似度のバランスに敏感な評価関数となる。
【００５５】
そして、瞳の形状を楕円として抽出する場合の形状抽出部１０３についての機能、動作の説明を行う。
【００５６】
(3-1) マスク構成部６
マスク構成部６は、形状記述記憶部９、アフィン変換計算部１０、組合せマスク構成部１１からなる（図１０参照）。
【００５７】
(3-1-1) 形状記述記憶部９
形状記述記憶部９は、抽出したい形状（モデル）が格納されている。すなわち、モデルはエッジのサンプル点（ｎ個）と、その点におけるエッジの方向の情報が図１１のモデル記述テーブルに格納されている。
【００５８】
楕円の場合では、楕円の中心座標（ｘ０，ｙ０）、半径（長軸の長さの１／２）ｒ、長軸と短軸の比ｂ（０．０＜ｂ≦１．０）、長軸と座標系のＸ軸とのなす角θによって決定される。これらの楕円パラメータを決定し、そのサンプル点（ｘｉ，ｙｉ）を与え、楕円上の点（ｘｉ，ｙｉ）におけるエッジ方向については、中心と点（ｘｉ，ｙｉ）を結ぶ直線がＸ軸となす角度をθｐとすると、エッジ方向は、
【数６】

で計算される。
【００５９】
例えば、図９のように、点（ｘｓ，ｙｓ）では、エッジの方向が矢印のように求められる。このエッジ方向をモデル記述テーブルに記述する。
【００６０】
(3-1-2) アフィン変換計算部１０
アフィン変換計算部１０について説明する。
【００６１】
モデルは図１２のように、画像データ中では平行移動、回転、拡大縮小されている場合がある。モデルにアフィン変換を施し、そのモデルの当てはめを行うことで変換に対応した検出を行う。
【００６２】
図１１のモデル記述テーブルの各点（ｘｉ，ｙｉ）に関して、アフィン変換マトリクスＡは、
【数７】

を
【数８】

に対して以下のようにｐ’を求める。但し、Ｔは平行移動、Ｒは回転変換、Ｚは相似変換の行列を表す。
【００６３】
【数９】

このｐ’を別に用意する一時保管テーブルに登録する。また、このアフィン変換の回転成分Ｒをそれぞれのエッジ方向に足し合わせて、同様に一時保管テーブルに登録する。
【００６４】
抽出したい対象が画像データ中のどの辺りにあるかという探索範囲やどのくらいの抽出精度が必要かに応じて、変換パラメータの変動範囲を決定する。アフィン変換の平行移動成分、回転成分、拡大縮小成分においてそれぞれ、可変範囲と刻みを決定し、その範囲での変換パラメータから変換マトリクスを生成し、一時保管テーブルに登録する。
【００６５】
(3-1-3) 組合せマスク構成部１１
組合わせマスク構成部１１について説明する。
【００６６】
アフィン変換計算部１０で変換された一時保管テーブルに基づいて、計算を行うマスクを設定する。例えば、分離度を用いた場合、分離度マスクは図２のようなマスクを構成する。このときマスクの方向が図９のように、楕円のパラメータ（ｘ，ｙ，ｒ，ｂ，θ）によって計算される楕円上にエッジ方向と平行となるようなマスクを構成する。マスクの大きさ、形状については取得する画像抽出量に応じて定める。なお、本実施例では、矩形のマスク領域を用いて説明する。
【００６７】
(3-2) 特徴量抽出部７
特徴量抽出部７について説明する（図１３参照）。
【００６８】
特徴量抽出部７は、画像メモリ１２、演算装置１３、特徴量蓄積部１４からなる。
【００６９】
(3-2-1) 画像メモリ１２
画像メモリ１２は、画像前処理部１０２で計算された画像を転送し保持する役割をもつ。
【００７０】
(3-2-2) 演算装置１３
演算装置１３は、組合せマスク構成部１１において構成されたマスクにしたがって、画像メモリ１２中に蓄えられている画像の輝度、色相といった値を読みとり、マスクで定義されている特徴量を算出する。
【００７１】
算出された結果は、特徴量蓄積部１４に保存される。この説明では特徴量として分離度を用いているので、各部分マスクについて明度画像を用いて、それぞれの式（１）の分離度を計算し、それぞれの分離度が格納される。式（１）における画像抽出量Ｐｉは、本実施例では分離度を算出するための明度となる。
【００７２】
なお、特徴量として分離度だけではなく、他の特徴量も利用でき、これについては変形例で述べる。
【００７３】
(3-3) 組合せ評価値計算部８
組合せ評価値計算部８は、組合せ特徴量計算部１５、評価値記憶部１６、結果記憶部１７とよりなる（図１４参照）。
【００７４】
組合せ特徴量計算部１５は、式（２）（３）で定義される組合せ分離度Ｃを求める。組合せ分離度Ｃは、式（２）（３）のどちらか一方から演算した値を使用する。
【００７５】
その組合せ分離度Ｃを評価値記憶部１６に登録する。この評価値記憶部１６では、各組合せ分離度Ｃと式（７）の変換のパラメータＡを対にして記憶する。そして、一番高い組合せ分離度Ｃ（最も１に近い組合せ分離度Ｃ）を持つときのパラメータＡが結果記憶部１７に送られる。
【００７６】
(3-4) 形状抽出部１０３の動作
図１５は、形状抽出部１０３の動作をフローチャートに示したものである。
【００７７】
ステップ５１では、マスク構成部６の形状記述記憶部９、アフィン変換計算部１０によって楕円のパラメータが計算される。
【００７８】
ステップ５２では、組合せマスク構成部１１によって、マスク生成が行われる。
【００７９】
ステップ５３では、特徴量抽出部１４により、部分マスクの計算が行われる。
ステップ５４では、組合せ分離度Ｃの計算が、組合せ評価値計算部１５でなされ、そして、評価値記憶部１６では、組合せ分離度Ｃの大きい順にデータをソートする。
【００８０】
ステップ５５では、これにより、組合せ分離度Ｃの更新が行われる。
【００８１】
ステップ５６では、楕円のパラメータを変化させながら、組合せ分離度Ｃの計算を行うが、探索範囲を全て終了したかどうかを判断し、終了の場合、一番高い組合せ分離度Ｃを持つとき（最も１に近い組合せ分離度Ｃ）の楕円パラメータＡを結果記憶部１７に送り、ステップ５７において、その時の楕円パラメータＡを持つ検出楕円とする。終了していない場合、ステップ５１に戻り、楕円パラメータの更新を行い、処理を繰り返す。
【００８２】
(4) 変更例
本発明は上で述べた実施例に限定されるものではない。
【００８３】
全体の構成についての変形例として、画像メモリが各部分に点在するが、これを一つにまとめた構成とし、各演算装置がそのメモリをアクセスできるように構成を変更してもよい。
【００８４】
また、ＣＣＤカメラのみの入力だけでなく、ＭＲＩ画像やＸ線ＣＴ画像、超音波画像、ＳＰＥＣＴ画像等の入力でも可能である。たとえば、人間の肝臓部分の輪郭抽出等に有効と考えられる。
【００８５】
画像前処理部１０２は、色空間の変換だけでなく、平滑化等のフィルタリング等の画像の変換を行うように変形できる。
【００８６】
特徴量抽出部７についての変形例をいくつか挙げる。
【００８７】
本実施例では楕円形状の抽出について述べたが、楕円形状だけに限らず、図１２のように、テーブルの構成内容により、様々な形状に対応可能である。
【００８８】
組合せ特徴量について次のように変形できる。
【００８９】
上記実施例で説明した複数のマスクで計算される特徴量から評価値を算出するだけでなく、予め決められた形状をした一つのマスクを用いて特徴量を算出し、それを評価値として用いる。この一つのマスクの形状はモデルを参考にして決定する。
【００９０】
例えば、図１８のような、楕円の輪郭に沿った、２つの領域の分離度を計算し、それを評価関数とする。
【００９１】
この場合の組合せマスク構成部の変形例は、図１８のように、一時保管テーブルに基づいて楕円の輪郭に沿って、２つの領域を設定する。各領域の面積などについては、画像抽出量に応じて定める。
【００９２】
また、予め決められた形状をもつマスクを一つだけ用いた方法の場合の組合せ評価値計算部の変形例として、評価値Ｃの計算方法を
（数２）についてα，β＝１，ｎ＝１，Ｇ＝０
（数３）についてｎ＝１，Ｇ＝１
として、計算する。
【００９３】
分離度はテクスチャ画像のエッジ強度等も計算できるため、複雑なテクスチャ中の形状認識等も可能となる。また、本実施例のように明度画像のみを用いるのではなく、他の画像前処理部で得られる色相画像等を用いた、様々な分離度を計算できる。
【００９４】
組合せ特徴量についての計算式を次のような形に変形できる。
【００９５】
【数１０】

これにより、Ｃ’の値は、０．０〜１．０の間に収まらなくなるが、値の大きなものを形状として抽出することができる。このように、様々に評価関数を設定でき、組合せた特徴量のもっとも高い値を持つときの形状パラメータを求めることができる。
【００９６】
特徴量として分離度だけではなく、他の特徴量も利用できる。例えば、Sobel フィルタによるエッジ抽出によってエッジ強度を計算し、特徴量として利用する。この場合もエッジの強度の類似度を計算する。しかし、Sobel フィルタでは、分離度とは異なりマスク形状の方向性はない。そこで、計算されたエッジの方向における隣接したエッジ点の同士の類似度を計算する式を次のように定義し、隣接した２つのエッジ点のエッジ方向の角度をθ１、θ２とする。θ１、θ２が直交するとき最小となる角度間の評価関数を
【数１１】

のように設定し、この評価関数を組み込んだ形式に変形する。
【００９７】
また前述のように、分離度以外の特徴量を用いる場合、特徴量を計算するマスクが必要のない場合も存在する。マスク構成部を除いた構成に変形しても良い。
【００９８】
特徴量抽出部の動作は、図１５に示したように、繰り返し計算による探索だけではない。別の例として、図１６のように、予め数種類のエッジ方向のマスクを画像に適用しておき、多数のエッジ強度画像を作成しておく。そして、パラメータ空間を用意しておき、すべてのエッジ強度画像から得られた特徴量をパラメータ空間に投票を行う。そしてパラメータ空間のピークの位置を検出するといった、ハフ変換に似たアルゴリズムを用いる変形例がある。
【００９９】
また、本実施例は瞳の形状抽出について説明したが、先に示したように、その他の形状抽出も可能であると述べた。この時並列に異なる形状テーブルのものを探索することによって、複数形状を同時に抽出できるように拡張が可能である。具体的な応用としては、口形状を抽出する場合に有効である。図１７（ａ）（ｂ）（ｃ）（ｄ）のように、口の開け方（大きさ）に応じたいくつかの口形状モデルを用意する。そしてそれぞれに対して、図１７（ｄ）のように、分離度マスクを設定し抽出を行う。
【０１００】
また、一つのモデルが、画像データ中に複数の存在し、同時に複数個選択したい場合、評価値記憶部１６に記録されている候補から複数結果を出力するように、評価値記憶部１７と結果記憶部１８を変形できる。
【０１０１】
以上、本発明はその趣旨を逸脱しない範囲で種々変形して実施することが可能である。
【０１０２】
【発明の効果】
本発明によれば、最も評価の高いと判断されたパラメータに基づいて、楕円形状のモデルを画像に当てはめると、その楕円形状のモデルの位置、姿勢、大きさで決定される箇所に、目的の形状が存在する。したがって、ロバストな形状認識が行え、さまざまな画像処理装置で応用が可能である。
【図面の簡単な説明】
【図１】形状認識装置のブロック図である。
【図２】分離度を示す図である。
【図３】弱いエッジで構成される物体の認識を示す図である。
（ａ）は、弱いエッジで構成される物体の図である。
（ｂ）は、エッジ強度を示すテーブルである。
（ｃ）は、明度分布のグラフである。
【図４】画像入力部のブロック図である。
【図５】画像前処理部のブロック図である。
【図６】形状抽出部のブロック図である。
【図７】楕円形状として撮影される瞳を示す図である。
【図８】目を取得した画像の不安定性を示す図である。
（ａ）は、目の正面図である。
（ｂ）は、明度分布のグラフである。
【図９】組合せ分離度マスクを示す図である。
【図１０】マスク構成部のブロック図である。
【図１１】モデル記述テーブルを示す図である。
【図１２】モデルと画像データのアフィン変換による関係を示す図である。
【図１３】特徴量抽出部のブロック図である。
【図１４】組合せ評価値計算部のブロック図である。
【図１５】形状抽出部のフローチャートである。
【図１６】形状抽出部の変形例の概念図である。
【図１７】口形状の検出を示す図である。
【図１８】変形例におけるマスク領域の説明図である。
【符号の説明】
１００形状認識装置
１０１画像入力部
１０２画像前処理部
１０３形状抽出部

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a shape recognition apparatus and method for image processing.
[0002]
[Prior art]
There are the following methods for extracting shapes by model fitting.
[0003]
In template matching, a model is described as a fixed curve or image, and object recognition is performed by determining the similarity between the model and the target image. There are two types of template matching: one uses the brightness distribution of an image directly as the model description, and the other uses a binary image and expresses the shape of the model as a set of edge points. The former performs shape recognition by measuring the similarity of brightness distributions. The latter first creates an edge strength image from the brightness distribution of the image, then binarizes the edge strength image to obtain a binary image, and finally calculates the similarity between the point set of the binary image and the point set of the model.
[0004]
The generalized Hough transform (Ballard, D. H., "Generalizing the Hough transform to detect arbitrary shapes", Pattern Recognition, 13, 111-122 (1981)) describes a model as a list of points on its contour and uses parameters representing its translation, rotation, and scaling. The transform "votes" into the parameter space so that a local peak forms at the position of the parameters that best fit the pattern. The Hough transform has the advantage that matching works well even when part of the data is missing.
[0005]
Next, there are methods in which the compatibility between the model and the image data is measured by an objective function (energy), and the model is matched by, for example, minimizing the value of that function. A recent example is the snake (M. Kass, A. Witkin, D. Terzopoulos, "Snakes: Active Contour Models", In Proc. 1st Int. Conf. on Computer Vision, pp. 259-268 (1987)).
[0006]
Some of these model-fitting methods use the raw image data as is, while others use image data that has undergone low-level image processing, such as edge enhancement or line enhancement filtering, to bring out other features of the target.
[0007]
In the present invention, stable shape recognition resistant to noise and disturbance is considered using edge data. To that end, first, a method for extracting edge information reflecting the object outline and the structure inside the object will be described.
[0008]
Edge extraction methods include those based on the brightness gradient of the image and those based on information from the region around the edge. Methods based on the brightness gradient extract positions where the brightness gradient changes sharply as edges; they include first-derivative operators such as Roberts, Prewitt, and Sobel, and second-derivative operators such as the Laplacian. Methods based on information from the region around the edge include Hueckel's method, which fits an edge model to a local region.
[0009]
Edge extraction based on the brightness gradient has the problems that weak edges cannot be extracted and that it is difficult to extend to other kinds of edge strength, such as those of a hue image. The problems of conventional object recognition methods that use brightness-gradient edge extraction are summarized below.
[0010]
Template matching is vulnerable to even slight variations of the image data relative to the template. In the generalized Hough transform, the number of votes is affected by the number of edge points; only the presence or absence of edge points matters, so the result depends heavily on the performance of edge extraction. The snake model is flexible and resistant to noise because its control points are linked by an elastic energy function. However, it still has the drawback of falling into local minima of the energy.
[0011]
To solve these problems, a method based on information from the region around the edge can be used. As a method that treats an edge not only as a boundary in image brightness but also as a boundary between regions of different color or texture, and that calculates edge strength from the feature quantities of those regions, "Edge extraction based on the degree of separation between regions", Information Processing Society of Japan, Computer Vision Study Group Material 87-1 (1994), has been proposed.
[0012]
The “separation degree” is an amount representing how separated each subset is when a data set is divided into several subsets. This is represented by the ratio of the data feature of the subset to the data feature of the entire set, and the value takes a value of 0.0 to 1.0.
[0013]
Now, as shown in FIG. 2, when two regions in an image are considered as the above-described two sets, a stronger edge exists at the boundary between regions as the degree of separation increases. In other words, considering two regions with a certain pixel as a boundary, it can be said that if the degree of separation at the pixel is large, the pixel has a strong edge.
[0014]
The degree of separation S is given by the following equation.
[0015]
[Expression 1]
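The formula image for Expression 1 did not survive in this text. As a hedged sketch only, the following assumes the form commonly used for the degree of separation between two regions: the between-class variance divided by the total variance of the image extraction amounts Pi. The function name `separability` and the handling of flat regions are assumptions made for illustration, not taken from the patent.

```python
def separability(region1, region2):
    """Degree of separation between two regions, in the range 0.0 to 1.0.

    Assumed form: between-class variance / total variance of the pixel
    values (image extraction amounts) in the two regions.
    """
    n1, n2 = len(region1), len(region2)
    if n1 == 0 or n2 == 0:
        return 0.0
    values = list(region1) + list(region2)
    mean_all = sum(values) / len(values)
    # Total (unnormalized) variance over both regions.
    total = sum((p - mean_all) ** 2 for p in values)
    if total == 0:
        # Completely flat data: treat as no separation (assumption).
        return 0.0
    mean1 = sum(region1) / n1
    mean2 = sum(region2) / n2
    # Between-class (unnormalized) variance.
    between = n1 * (mean1 - mean_all) ** 2 + n2 * (mean2 - mean_all) ** 2
    return between / total
```

Under this assumed form, two regions that are each uniform give a value of 1.0 whether their brightness differs by 1 or by 100, which matches the remark in the text that the degree of separation does not depend on the size of the difference.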

Although this method has the advantage that edges can be extracted stably, the degree of separation does not depend on the size of the difference in the image extraction amounts, so using the degree of separation as is is insufficient for extracting the edges of a target object. From this viewpoint, a weighted degree of separation (Japanese Patent Application No. 7-7499) has also been proposed. The weighted degree of separation incorporates position weighting and image-feature weighting into the separation formula, with the aim of selectively extracting the edges of the target object by using its location, color information, texture information, and so on.
[0016]
[Problems to be solved by the invention]
A mask having an appropriate size and shape can be set using the degree of separation described above and applied to the image, and an edge strength image effective for the conventional method can be generated. However, this simple usage is not enough.
[0017]
Consider recognizing a target object composed of gentle edges in image data such as that shown in FIG. 3A.
[0018]
FIG. 3B shows the edge strength at the four points labeled ア, イ, ウ, and エ in this figure. When the edge strength of the target changes monotonically as shown in FIG. 3C, or is weakened by blur at the time of shooting, creating an edge image by binarization may fail to extract these points as edge points, depending on how the threshold is set.
[0019]
However, when weak edges of similar strength are continuous, as in FIG. 3A, they must still be recognized as an object.
[0020]
A real example of such an object is the pupil of the human eye.
[0021]
When a face image is photographed in a normal indoor environment and the shape of the photographed pupil is to be extracted, the problems (1) to (3) shown in FIG. 8 arise.
[0022]
(1) The estimation is unstable because part of the pupil is hidden by the eyelids, eyelashes, and so on (FIG. 8, label ア).
[0023]
(2) The reflection of a fluorescent lamp produces a very strong edge in the image (FIG. 8, label イ).
[0024]
(3) The transition between the iris and the white of the eye is gradual and consists of weak edges (FIG. 8, label ウ).
[0025]
Conventional object recognition methods take into account edge strength and the geometric arrangement of edge points, but they pay no attention to the directionality of edges or the similarity of edge strength as relationships between adjacent edge points.
[0026]
For example, the relationship between edge points in the generalized Hough transform is only the geometrical arrangement relationship expressed as a set of edge points.
[0027]
The snake model does link control points through an elastic energy function and exchanges information between control points during optimization, but it does not actively exploit relational information. When a shape such as an ellipse can be expressed by parameters, detection has been performed by extracting the edge points that make up the ellipse and obtaining the ellipse parameters by the least squares method or the like. However, fitting an ellipse by least squares makes the estimate unstable, because the least squares method is known to be biased and because part of the ellipse may be hidden. Conventional shape extraction has thus been vulnerable to noise and disturbance.
[0028]
In view of this, the present invention provides an apparatus that is resistant to noise and disturbance and can perform stable shape recognition.
[0029]
[Means for Solving the Problems]
The shape recognition apparatus of the present invention comprises: an image input unit that inputs an image; a feature amount extraction region setting unit that sets a feature amount extraction region having two regions along the contour of a model that approximately represents the shape of a recognition target; a parameter setting unit that sets affine transformation parameters determining the position, orientation, or size of the model in the image, for fitting the model to the image; a model application unit that fits the model, for which the feature amount extraction region has been set, to the image based on the parameters; a feature amount extraction unit that obtains, from the image, the degree of separation in the feature amount extraction region of the model fitted to the image; and a parameter determination unit that determines the parameters giving the highest degree of separation. The feature amount extraction region setting unit sets the feature amount extraction region as two regions, a first region on a first side along the contour of the model and a second region on a second side. The feature amount extraction unit uses the degree of separation between the first region and the second region as the degree of separation, and obtains that degree of separation based on the following expression.
[Expression 14]

[0031]
The apparatus according to the present invention is resistant to noise and disturbance and can perform stable shape recognition.
[0032]
DETAILED DESCRIPTION OF THE INVENTION
Examples of the embodiment of the present invention will be described below.
[0033]
In the present embodiment, an embodiment will be described in a situation where the shape of the pupil part of the human eye is recognized. In the face image acquired using the CCD color camera, the shape of the pupil is photographed as an ellipse as shown in FIG. Consider approximating the shape of this pupil as an ellipse and obtaining the ellipse parameters.
[0034]
A block diagram of the shape recognition apparatus 100 is shown in FIG.
[0035]
The apparatus consists of an image input unit 101, which includes the A/D conversion and input conversion means needed to input an image together with an image storage unit; an image preprocessing unit 102, which preprocesses the acquired image; and a shape extraction unit 103, which extracts the shape.
[0036]
(1) Image input unit 101
The image input unit 101 will be described (see FIG. 4).
[0037]
The image input unit 101 consists of a color CCD camera 1, a video board 2, and a memory 3 for storing images. Input is not limited to the CCD camera 1: images obtained from video output equipment such as a video recorder, or from an X-ray imaging apparatus or the like, can also be input through the terminals of the video board.
[0038]
(2) Image preprocessing unit 102
The image preprocessing unit 102 will be described (see FIG. 5).
[0039]
It comprises an arithmetic unit 4 that converts an image input to the image input unit 101 into an image such as a brightness image, a hue image, and a saturation image, and an image storage unit 5 that stores these converted images.
[0040]
The color image acquired by the CCD camera 1 consists of three images: R, G, and B. Taking the values of the R, G, and B images at a point (x, y) in the image as (R, G, B), the arithmetic unit 4 converts this color representation in the RGB color space into a color representation in the HSV color space (Computer Graphics, 2nd edition (Addison Wesley), pages 592-593) or the GLHS color space (GLHS: A Generalized Lightness, Hue, and Saturation Color Model, Haim Levkowitz and Gabor T. Herman, CVGIP: Graphical Models and Image Processing (GMIP), Volume 55, Number 4, July 1993). The R, G, and B values corresponding to the position (x, y) are read from the memory of the image storage unit 5, converted from (R, G, B) to (H, S, V) by the conversion algorithm above, and the resulting (H, S, V) values are stored in the memory of the image storage unit 5.
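The per-pixel conversion described above can be sketched as follows. This is a minimal illustration that assumes 8-bit R, G, and B planes held as nested lists and uses Python's standard `colorsys` module for the RGB to HSV step; the function name `rgb_planes_to_hsv` is hypothetical, and the patent does not prescribe any particular implementation.

```python
import colorsys


def rgb_planes_to_hsv(r_plane, g_plane, b_plane, max_value=255):
    """Convert three same-sized R, G, B planes into H, S, V planes.

    Each output value is a float in the range 0.0 to 1.0.
    """
    h_plane, s_plane, v_plane = [], [], []
    for r_row, g_row, b_row in zip(r_plane, g_plane, b_plane):
        h_row, s_row, v_row = [], [], []
        for r, g, b in zip(r_row, g_row, b_row):
            # colorsys expects channel values scaled to [0, 1].
            h, s, v = colorsys.rgb_to_hsv(
                r / max_value, g / max_value, b / max_value
            )
            h_row.append(h)
            s_row.append(s)
            v_row.append(v)
        h_plane.append(h_row)
        s_plane.append(s_row)
        v_plane.append(v_row)
    return h_plane, s_plane, v_plane
```

In this sketch the V plane plays the role of the brightness image and the H and S planes those of the hue and saturation images.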
[0041]
In the present embodiment, the image input unit 101 and the image preprocessing unit 102 are separate components. However, the present invention is not limited to this, and the two may be integrated into a single image input unit.
[0042]
(3) Shape extraction unit 103
The shape extraction unit 103 will be described (see FIG. 6).
[0043]
The shape extraction unit 103 consists of three components: a mask construction unit 6 that constructs masks, a feature amount extraction unit 7 that extracts the feature amount within each mask, and a combination evaluation value calculation unit 8 that combines the feature amounts obtained from the masks and calculates an evaluation value for shape extraction.
[0044]
The shape extraction unit 103 will be described using “combination separation degree”.
[0045]
To distinguish the degree of separation obtained by Expression (1) above from the "combination separation degree" described below, it is called the "simple separation degree."
[0046]
The "combination separation degree" is obtained by arranging a plurality of the simple separation degree masks described above, taking into account the edge direction and the geometric positional relationship, and relating them through an evaluation function.
[0047]
As shown in FIG. 9, masks for obtaining the degree of separation are arranged along an ellipse. Each of these is called a partial mask. The partial masks are placed on straight lines perpendicular to the tangent of the ellipse (the edge direction). The degree of separation is then calculated for each partial mask. The combination separation degree C (0.0 ≦ C ≦ 1.0) is defined as follows:
[0048]
[Expression 2]

Or
[Equation 3]

However, G and E are obtained as follows.
[0049]
G is a function that evaluates the geometric arrangement of the masks and the similarity of their edges: it takes the difference in degree of separation between adjacent partial masks and gives a good evaluation when that difference is small.
[0050]
The simple separation degree of the partial mask Mi (i = 1 to n) is represented by S(Mi). The absolute value of the difference in separation degree between adjacent partial masks is subtracted from 1.0, and the result is defined as the "similarity" between those masks. G is then expressed by the following equation.
[0051]
[Expression 4]

E takes the sum of the degrees of separation calculated for each partial mask and represents the overall edge strength.
[0052]
[Equation 5]

Since G and E take values in the range 0 ≦ G, E ≦ n, normalization is performed as in Expression (2).
[0053]
The weights of α and β are determined by experiments or the like. In general, the evaluation is performed with α = 0.5 and β = 0.5. For example, it is also possible to evaluate only the similarity between edges with α = 1 and β = 0. This is used when the position where an object exists is known in advance and the similarity between edges is very high.
[0054]
Expression (3) evaluates by multiplying the normalized values of G and E, and becomes an evaluation function sensitive to the balance of the overall edge strength and the similarity between the edges.
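The formula images for Expressions (2) to (5) are missing from this text, so the following is only a sketch of one reading that is consistent with the surrounding description: G sums the similarities 1.0 - |S(Mi) - S(Mi+1)| of adjacent partial masks, E sums the simple separation degrees, Expression (2) is taken to be the weighted sum (αG + βE)/n, and Expression (3) is taken to be the product (G/n)(E/n). Treating the masks as a closed ring (the last mask is adjacent to the first) is a further assumption, chosen so that 0 ≦ G ≦ n holds. The function name and signature are hypothetical.

```python
def combination_separation(s, alpha=0.5, beta=0.5, use_product=False):
    """Combine the simple separation degrees s[0..n-1] of the partial masks.

    Assumed reading of Expressions (2) to (5); not the patent's exact formulas.
    """
    n = len(s)
    if n == 0:
        raise ValueError("at least one partial mask is required")
    # G: similarity of separation degree between adjacent masks (ring order).
    g = sum(1.0 - abs(s[i] - s[(i + 1) % n]) for i in range(n))
    # E: overall edge strength.
    e = sum(s)
    if use_product:
        # Assumed form of Expression (3): product of the normalized terms.
        return (g / n) * (e / n)
    # Assumed form of Expression (2): weighted, normalized sum.
    return (alpha * g + beta * e) / n
```

With α = β = 0.5, uniformly strong masks such as [1, 1, 1, 1] score 1.0, while alternating values such as [1, 0, 1, 0] score low because adjacent masks disagree, even though half of them are strong.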
[0055]
Then, the function and operation of the shape extraction unit 103 when extracting the pupil shape as an ellipse will be described.
[0056]
(3-1) Mask construction unit 6
The mask construction unit 6 includes a shape description storage unit 9, an affine transformation calculation unit 10, and a combination mask construction unit 11 (see FIG. 10).
[0057]
(3-1-1) Shape description storage unit 9
The shape description storage unit 9 stores the shape (model) to be extracted. Specifically, the model is stored in the model description table of FIG. 11 as n edge sample points together with the edge direction at each point.
[0058]
An ellipse is determined by its center coordinates (x0, y0), its radius r (half the length of the major axis), the ratio b between the major and minor axes (0.0 < b ≦ 1.0), and the angle θ between the major axis and the X axis of the coordinate system. Once these ellipse parameters are determined, sample points (xi, yi) on the ellipse are given. For the edge direction at a point (xi, yi) on the ellipse, let θp be the angle that the straight line connecting the center and the point (xi, yi) makes with the X axis; the edge direction is then calculated by the following expression.
[Formula 6]

[0059]
For example, as shown in FIG. 9, at the point (xs, ys), the edge direction is obtained as shown by an arrow. This edge direction is described in the model description table.
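As an illustration of how a model description table for an ellipse might be filled, the sketch below generates n sample points and an edge direction for each. Because the image of Formula 6 is missing here, the edge direction is computed as the outward normal from the gradient of the ellipse equation, which is a standard substitute and may differ in form from the patent's formula. It also assumes that b means minor axis divided by major axis, and the function name `ellipse_model` is hypothetical.

```python
import math


def ellipse_model(x0, y0, r, b, theta, n):
    """Return n rows (x, y, edge_direction) sampled on an ellipse.

    r is the semi-major axis, b the minor/major ratio (assumption),
    theta the angle of the major axis to the X axis, in radians.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rows = []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        # Point in the ellipse's own frame (major axis along local x).
        u = r * math.cos(t)
        v = b * r * math.sin(t)
        # Gradient of u^2 / r^2 + v^2 / (b r)^2 points along the normal.
        nu = u / (r * r)
        nv = v / (b * r * b * r)
        # Rotate the point by theta and shift it to the center.
        x = x0 + u * cos_t - v * sin_t
        y = y0 + u * sin_t + v * cos_t
        direction = math.atan2(nv, nu) + theta
        rows.append((x, y, direction))
    return rows
```

For a circle (b = 1.0) the direction reduces to the angle θp of the line from the center to the point, so the table simply stores radial directions.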
[0060]
(3-1-2) Affine transformation calculation unit 10
The affine transformation calculation unit 10 will be described.
[0061]
As shown in FIG. 12, the model may be translated, rotated, and scaled in the image data. The model is subjected to affine transformation, and detection corresponding to the transformation is performed by fitting the model.
[0062]
For each point (xi, yi) in the model description table of FIG. 11, the affine transformation matrix A,
[Expression 7]

is applied to
[Equation 8]

to obtain p′ as follows, where T represents the translation matrix, R the rotation matrix, and Z the similarity (scaling) matrix.
[0063]
[Equation 9]

This p′ is registered in a separately prepared temporary storage table. The rotation component R of the affine transformation is also added to each edge direction, and the result is registered in the temporary storage table in the same way.
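The transformation of the model table can be sketched as below. The images of Expressions 7 to 9 are missing, so the composition order A = T·R·Z (scale first, then rotate, then translate) is an assumption, as are the function name `transform_model` and the row layout (x, y, edge_direction).

```python
import math


def transform_model(rows, tx, ty, phi, scale):
    """Apply A = T * R * Z (assumed order) to each (x, y, direction) row.

    tx, ty: translation; phi: rotation angle in radians; scale: uniform scaling.
    Returns the rows for the temporary storage table.
    """
    cos_p, sin_p = math.cos(phi), math.sin(phi)
    transformed = []
    for x, y, direction in rows:
        # Z: uniform scaling about the origin.
        xs, ys = scale * x, scale * y
        # R: rotation by phi.
        xr = cos_p * xs - sin_p * ys
        yr = sin_p * xs + cos_p * ys
        # T: translation; the rotation angle is added to the edge direction.
        transformed.append((xr + tx, yr + ty, direction + phi))
    return transformed
```

Only the rotation affects the stored edge direction; translation and uniform scaling leave it unchanged.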
[0064]
The variation range of the conversion parameter is determined in accordance with the search range of the target in the image data and the degree of extraction accuracy required. A variable range and a step are determined for each of the translation component, rotation component, and enlargement / reduction component of the affine transformation, and a transformation matrix is generated from the transformation parameters in that range and registered in the temporary storage table.
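Enumerating the transformation parameters over their variable ranges and steps could look like the following sketch. The parameter names, the (start, stop, step) convention, and the helper names `frange` and `parameter_grid` are all assumptions for illustration.

```python
import itertools


def frange(start, stop, step):
    """Inclusive range of evenly spaced values from start to stop."""
    count = int(round((stop - start) / step))
    return [start + i * step for i in range(count + 1)]


def parameter_grid(ranges):
    """Yield one dict per combination of parameter values.

    ranges maps a parameter name to a (start, stop, step) tuple.
    """
    names = list(ranges)
    axes = [frange(*ranges[name]) for name in names]
    for combo in itertools.product(*axes):
        yield dict(zip(names, combo))
```

For example, `parameter_grid({"tx": (0, 2, 1), "scale": (1.0, 1.5, 0.5)})` yields six candidate parameter sets; a narrower range or coarser step reduces the number of candidates at the cost of extraction accuracy.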
[0065]
(3-1-3) Combination mask construction unit 11
The combination mask construction unit 11 will be described.
[0066]
Based on the temporary storage table produced by the affine transformation calculation unit 10, the masks used for calculation are set. For example, when the degree of separation is used, each separation mask is configured as shown in FIG. 2. As shown in FIG. 9, the masks are placed on the ellipse calculated from the ellipse parameters (x, y, r, b, θ) so that the direction of each mask is parallel to the edge direction. The size and shape of the masks are determined according to the image extraction amount to be acquired. In this embodiment, rectangular mask regions are used for the explanation.
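One way to realize a rectangular partial mask is to collect the pixels on either side of a sample point along its edge direction. The sketch below is an assumption-laden illustration: the mask dimensions `half_len` and `half_width`, the nearest-pixel rounding, the skipping of out-of-image pixels, and the function name `partial_mask_regions` are not taken from the patent.

```python
import math


def partial_mask_regions(image, x, y, direction, half_len=3, half_width=1):
    """Collect pixel values in the two halves of a rectangular mask.

    image is a list of rows; (x, y) is the mask center in (column, row)
    coordinates; direction is the edge direction in radians.
    Returns (inner_values, outer_values).
    """
    dx, dy = math.cos(direction), math.sin(direction)  # across the contour
    px, py = -dy, dx                                    # along the contour
    inner, outer = [], []
    for k in range(1, half_len + 1):
        for w in range(-half_width, half_width + 1):
            for sign, bucket in ((-1, inner), (1, outer)):
                col = int(round(x + sign * k * dx + w * px))
                row = int(round(y + sign * k * dy + w * py))
                # Pixels that fall outside the image are skipped.
                if 0 <= row < len(image) and 0 <= col < len(image[0]):
                    bucket.append(image[row][col])
    return inner, outer
```

The two returned lists correspond to the two regions of the mask in FIG. 2, and the degree of separation between them would then be evaluated for that partial mask.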
[0067]
(3-2) Feature quantity extraction unit 7
The feature quantity extraction unit 7 will be described (see FIG. 13).
[0068]
The feature quantity extraction unit 7 includes an image memory 12, a calculation device 13, and a feature quantity storage unit 14.
[0069]
(3-2-1) Image memory 12
The image memory 12 has a role of transferring and holding the image calculated by the image preprocessing unit 102.
[0070]
(3-2-2) Arithmetic unit 13
The arithmetic device 13 reads values such as the luminance and hue of the image stored in the image memory 12 according to the mask configured in the combination mask configuration unit 11 and calculates the feature amount defined by the mask.
[0071]
The calculated result is stored in the feature quantity storage unit 14. In this description, since the degree of separation is used as the feature amount, the degree of separation given by Equation (1) is calculated from the brightness image for each partial mask, and each value is stored. In the present embodiment, the image extraction amount Pi in Equation (1) is the lightness.
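The per-mask calculation can be sketched as follows. This is only an illustration: it assumes that the degree of separation has the common form of between-region variance divided by total variance over the two regions of a partial mask, which lies between 0.0 and 1.0. The function name and the handling of a completely flat patch are choices made for this sketch.

```python
import numpy as np


def separability(region1, region2):
    """Between-region variance divided by total variance (0.0 to 1.0).

    region1, region2: lightness values P_i sampled from the two halves
    of one partial mask.
    """
    p1 = np.asarray(region1, dtype=float).ravel()
    p2 = np.asarray(region2, dtype=float).ravel()
    all_values = np.concatenate([p1, p2])
    mean = all_values.mean()
    total = np.sum((all_values - mean) ** 2)
    if total == 0.0:
        # Completely flat patch: treated as "no separation" in this sketch.
        return 0.0
    between = (p1.size * (p1.mean() - mean) ** 2
               + p2.size * (p2.mean() - mean) ** 2)
    return float(between / total)
```

Because the value is normalised by the total variance, a weak but clean edge and a strong clean edge both give values close to 1, whereas a region pair with identical means gives 0.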
[0072]
It should be noted that not only the degree of separation but also other feature quantities can be used as the feature quantity, which will be described in a modified example.
[0073]
(3-3) Combination evaluation value calculation unit 8
The combination evaluation value calculation unit 8 includes a combination feature amount calculation unit 15, an evaluation value storage unit 16, and a result storage unit 17 (see FIG. 14).
[0074]
The combination feature amount calculation unit 15 obtains the combination separation degree C defined by the equations (2) and (3). As the combination separation degree C, a value calculated from either one of the expressions (2) and (3) is used.
[0075]
The combination separation degree C is registered in the evaluation value storage unit 16. The evaluation value storage unit 16 stores each combination separation degree C and the conversion parameter A in Expression (7) in pairs. Then, the parameter A having the highest combination separation degree C (the combination separation degree C closest to 1) is sent to the result storage unit 17.
[0076]
(3-4) Operation of Shape Extraction Unit 103
FIG. 15 is a flowchart showing the operation of the shape extraction unit 103.
[0077]
In step 51, ellipse parameters are calculated by the shape description storage unit 9 and the affine transformation calculation unit 10 of the mask construction unit 6.
[0078]
In step 52, the combination mask construction unit 11 performs mask generation.
[0079]
In step 53, the feature quantity extraction unit 7 calculates the feature quantity for each partial mask and stores it in the feature quantity storage unit 14.
In step 54, the combination separation degree C is calculated by the combination feature amount calculation unit 15, and the evaluation value storage unit 16 sorts the data in descending order of the combination separation degree C.
[0080]
In step 55, the combination separation degree C is thereby updated.
[0081]
In step 56, the combination separation degree C having been calculated while changing the ellipse parameters, it is determined whether the entire search range has been covered. If it has, the ellipse parameter A whose combination separation degree C is closest to 1 is sent to the result storage unit 17, and in step 57 a detection ellipse having that ellipse parameter A is set. If it has not, the process returns to step 51, the ellipse parameter is updated, and the process is repeated.
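The loop of steps 51 to 57 can be sketched in Python as below. The function name and the `evaluate` callback are hypothetical: `evaluate` stands in for mask generation, feature extraction, and calculation of the combination separation degree C for one candidate parameter, whose details are given elsewhere in this description.

```python
def search_best_parameter(parameters, evaluate):
    """Exhaustively search the parameter range and return the best candidate.

    parameters: iterable of candidate parameters A (the search range)
    evaluate:   callable returning the evaluation value C for one parameter
                (stands in for steps 51 to 54)
    Returns (best_parameter, best_score, ranked), where ranked is the list
    of (score, parameter) pairs in descending order of score.
    """
    ranked = []
    for param in parameters:              # repeat until the range is covered
        score = evaluate(param)           # evaluation value C for this A
        ranked.append((score, param))     # stored as a (C, A) pair
    if not ranked:
        raise ValueError("empty search range")
    # Sort in descending order of C only (parameters are never compared).
    ranked.sort(key=lambda item: item[0], reverse=True)
    best_score, best_param = ranked[0]
    return best_param, best_score, ranked


# Toy usage with a made-up evaluation function that peaks at x = 3.
candidates = [(x,) for x in range(6)]
best, score, ranked = search_best_parameter(
    candidates, lambda p: 1.0 - abs(p[0] - 3) / 10.0)
```

Keeping the full ranked list, rather than only the maximum, also makes it straightforward to report several candidates when more than one shape is present.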
[0082]
(4) Modification Examples
The present invention is not limited to the embodiments described above.
[0083]
In the overall configuration described above, the image memories are distributed among the individual units. As a modification, the image memories may be combined into one so that each arithmetic device can access that single memory.
[0084]
Further, not only input from a CCD camera but also input from an MRI image, X-ray CT image, ultrasonic image, SPECT image, or the like is possible. For example, it is considered effective for extracting the contour of the human liver.
[0085]
The image preprocessing unit 102 can be modified to perform not only color space conversion but also image conversion by filtering, such as smoothing.
[0086]
Several modified examples of the feature quantity extraction unit 7 will be given.
[0087]
In this embodiment, the extraction of the elliptical shape has been described. However, the present invention is not limited to the elliptical shape, and various shapes can be dealt with depending on the configuration of the table as shown in FIG.
[0088]
The combination feature amount can be modified as follows.
[0089]
In addition to calculating the evaluation value from the feature values obtained with the plurality of masks described in the above embodiments, the feature value may be calculated using a single mask having a predetermined shape and used as the evaluation value. The shape of this single mask is determined with reference to the model.
[0090]
For example, as shown in FIG. 18, the degree of separation between two regions along the outline of an ellipse is calculated and used as an evaluation function.
[0091]
In the modified example of the combination mask configuration unit in this case, two regions are set along the outline of the ellipse based on the temporary storage table as shown in FIG. The area of each region is determined according to the image extraction amount.
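A minimal sketch of this single-mask variant is shown below. It assumes that the two regions are bands just inside and just outside the outline, and that the degree of separation is the between-region variance divided by the total variance. The ellipse parametrisation used here (centre, semi-axes `a` and `b`, rotation `theta`), the band `width`, and the function names are assumptions of this sketch and may differ from the parameters (x, y, r, b, θ) of the embodiment.

```python
import numpy as np


def ellipse_band_masks(shape, cx, cy, a, b, theta, width=0.25):
    """Boolean masks for the bands just inside and just outside an ellipse."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    dx, dy = xx - cx, yy - cy
    u = dx * np.cos(theta) + dy * np.sin(theta)     # along the a-axis
    v = -dx * np.sin(theta) + dy * np.cos(theta)    # along the b-axis
    rho = np.sqrt((u / a) ** 2 + (v / b) ** 2)      # equals 1.0 on the outline
    inner = (rho >= 1.0 - width) & (rho < 1.0)
    outer = (rho >= 1.0) & (rho < 1.0 + width)
    return inner, outer


def band_separability(image, inner, outer):
    """Degree of separation between the inner and outer band pixels."""
    p1 = image[inner].astype(float)
    p2 = image[outer].astype(float)
    all_values = np.concatenate([p1, p2])
    mean = all_values.mean()
    total = np.sum((all_values - mean) ** 2)
    if total == 0.0:
        return 0.0
    between = (p1.size * (p1.mean() - mean) ** 2
               + p2.size * (p2.mean() - mean) ** 2)
    return float(between / total)


# Synthetic image: a dark disc of radius 10 on a bright background.
yy, xx = np.mgrid[0:64, 0:64]
image = np.where((xx - 32) ** 2 + (yy - 32) ** 2 < 10 ** 2, 20, 220)

inner, outer = ellipse_band_masks(image.shape, 32, 32, 10, 10, 0.0)
score_on_target = band_separability(image, inner, outer)

inner2, outer2 = ellipse_band_masks(image.shape, 40, 32, 10, 10, 0.0)
score_off_target = band_separability(image, inner2, outer2)
```

In this synthetic case the score is highest when the model outline coincides with the disc boundary and drops when the model is shifted, which is the behaviour the evaluation function relies on.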
[0092]
Further, as a modification of the combination evaluation value calculation unit when only one mask having a predetermined shape is used, the evaluation value C is calculated with α, β = 1, n = 1, G = 0 for Equation (2), and with n = 1, G = 1 for Equation (3).
[0093]
As the degree of separation, the edge strength of the texture image can be calculated, so that it is possible to recognize the shape in a complex texture. In addition, it is possible to calculate various degrees of separation using a hue image obtained by another image preprocessing unit instead of using only a brightness image as in the present embodiment.
[0094]
The calculation formula for the combination feature can be transformed into the following form.
[0095]
[Expression 10]

As a result, the value of C′ is no longer confined to the range 0.0 to 1.0, but a parameter that gives a large value can still be extracted as the shape. In this way, various evaluation functions can be set, and the shape parameter at which the combination feature amount takes its highest value can be obtained.
[0096]
In addition to the degree of separation, other feature quantities can be used. For example, the edge strength obtained by edge extraction with a Sobel filter can be used as the feature quantity, and in this case too the similarity of edge strength is calculated. However, unlike the degree of separation, the Sobel filter has no directionality in its mask shape. Therefore, an expression for the similarity of the calculated edge directions between adjacent edge points is defined as follows. Let θ1 and θ2 be the edge direction angles of two adjacent edge points. An evaluation function between the angles that takes its minimum when θ1 and θ2 are orthogonal to each other is defined,

and the calculation is transformed into a form incorporating this evaluation function.
[0097]
Further, when a feature amount other than the degree of separation is used as described above, a mask for calculating the feature amount may not be necessary. In that case, the configuration may be modified to omit the mask construction unit.
[0098]
The operation of the feature amount extraction unit is not limited to the search based on iterative calculation shown in FIG. 15. As another example, as shown in FIG. 16, several types of edge direction masks are applied to the image in advance to create a number of edge intensity images. A parameter space is prepared, and the feature values obtained from all the edge intensity images are voted into the parameter space. The shape can then be found with an algorithm similar to the Hough transform, for example by detecting the peak position in the parameter space.
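The voting idea can be sketched as follows for the simplest case of a circle of known radius, where the parameter space is just the centre position. Everything specific here is an assumption of the sketch: the function names, the representation of each directional edge image as a (strength, normal angle) pair, and the choice to weight each vote by the edge strength.

```python
import numpy as np


def vote_circle_centres(edge_images, radius):
    """Accumulate strength-weighted votes for circle centres.

    edge_images: list of (strength, normal_angle) pairs; strength is a 2-D
                 array from one directional mask, normal_angle is the edge
                 normal direction of that mask in radians.
    """
    height, width = edge_images[0][0].shape
    accumulator = np.zeros((height, width))
    for strength, angle in edge_images:
        ys, xs = np.nonzero(strength)
        for sign in (+1, -1):            # the centre may lie on either side
            cx = np.rint(xs + sign * radius * np.cos(angle)).astype(int)
            cy = np.rint(ys + sign * radius * np.sin(angle)).astype(int)
            ok = (cx >= 0) & (cx < width) & (cy >= 0) & (cy < height)
            # np.add.at accumulates correctly when indices repeat.
            np.add.at(accumulator, (cy[ok], cx[ok]), strength[ys[ok], xs[ok]])
    return accumulator


def peak(accumulator):
    """Return (cx, cy) of the strongest cell in the parameter space."""
    cy, cx = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return int(cx), int(cy)


# Synthetic input: eight directional edge images of a circle of radius 5
# centred at (20, 15), one edge pixel per direction.
edge_images = []
for k in range(8):
    angle = k * np.pi / 4
    strength = np.zeros((30, 40))
    x = int(np.rint(20 + 5 * np.cos(angle)))
    y = int(np.rint(15 + 5 * np.sin(angle)))
    strength[y, x] = 1.0
    edge_images.append((strength, angle))

accumulator = vote_circle_centres(edge_images, radius=5)
centre = peak(accumulator)
```

Unlike the iterative search, each image pixel is visited once per directional image, and the cost of covering the parameter range moves into the size of the accumulator.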
[0099]
Although this embodiment has described extraction of the shape of a pupil, other shapes can also be extracted, as stated above. In that case, the method can be extended to extract a plurality of shapes simultaneously by searching different shape tables in parallel. A specific application is the extraction of a mouth shape. As shown in FIGS. 17 (a), (b), (c), and (d), several mouth shape models are prepared according to how far the mouth is open (its size). For each of them, a separation degree mask is set as shown in FIG. 17 (d), and extraction is performed.
[0100]
When there are a plurality of models in the image data and it is desired to select them at the same time, the evaluation value storage unit 17 and the result storage unit 18 can be modified so that a plurality of results are output from the candidates recorded in the evaluation value storage unit 16.
[0101]
As described above, the present invention can be implemented with various modifications without departing from the spirit of the present invention.
[0102]
[Effects of the invention]
According to the present invention, when the elliptical model is applied to the image with the parameter judged to have the highest evaluation, the target shape exists at the position, posture, and size determined by that model. Robust shape recognition is therefore possible, and the invention can be applied to various image processing apparatuses.
[Brief description of the drawings]
FIG. 1 is a block diagram of a shape recognition device.
FIG. 2 is a diagram showing the degree of separation.
FIG. 3 is a diagram illustrating recognition of an object composed of weak edges.
(A) is a figure of the object comprised by a weak edge.
(B) is a table which shows edge strength.
(C) is a graph of brightness distribution.
FIG. 4 is a block diagram of an image input unit.
FIG. 5 is a block diagram of an image preprocessing unit.
FIG. 6 is a block diagram of a shape extraction unit.
FIG. 7 is a diagram showing a pupil imaged as an elliptical shape.
FIG. 8 is a diagram illustrating instability of an image obtained by acquiring an eye.
(A) is a front view of eyes.
(B) is a graph of brightness distribution.
FIG. 9 is a diagram illustrating a combination separation degree mask.
FIG. 10 is a block diagram of a mask configuration unit.
FIG. 11 is a diagram illustrating a model description table.
FIG. 12 is a diagram illustrating a relationship between a model and image data by affine transformation.
FIG. 13 is a block diagram of a feature amount extraction unit.
FIG. 14 is a block diagram of a combination evaluation value calculation unit.
FIG. 15 is a flowchart of a shape extraction unit.
FIG. 16 is a conceptual diagram of a modification of the shape extraction unit.
FIG. 17 is a diagram illustrating detection of a mouth shape.
FIG. 18 is an explanatory diagram of a mask area in a modified example.
[Explanation of symbols]
100 Shape recognition apparatus
101 Image input unit
102 Image preprocessing unit
103 Shape extraction unit

Claims

An image input unit for inputting an image;
A feature amount extraction region setting unit that sets a feature amount extraction region having two regions along the contour of a model that artificially represents the shape of the recognition object;
A parameter setting unit for setting a parameter for affine transformation that determines the position, orientation, or size of the model in the image for fitting the model to the image;
A model application unit that applies the model to the image while varying the set parameters;
A feature amount extraction unit that obtains, from the image, the degree of separation between the two regions of the feature amount extraction region for each of the models with different parameters fitted to the image; and
A parameter determining unit that determines the parameter corresponding to the highest degree of separation among the different degrees of separation,
the apparatus having the above units, wherein
The feature amount extraction region setting unit sets the feature amount extraction region having two regions of a first region on one side and a second region on the other side along the contour of the model,
The feature amount extraction unit uses the degree of separation between the first region and the second region as the degree of separation, and the degree of separation is

A shape recognition device characterized in that it is obtained based on the above.

The shape recognition apparatus according to claim 1, wherein the feature amount extraction region setting unit uses a circular or elliptical model.

Inputting an image,
Setting a feature amount extraction region having two regions along the contour of a model that artificially represents the shape of the recognition object,
Setting parameters for affine transformation that determine the position, orientation, or size of the model in the image for fitting the model to the image,
Applying the model to the image while varying the set parameters,
Obtaining, from the image, the degree of separation between the two regions of the feature amount extraction region for each of the models with different parameters fitted to the image,
Determining the parameter corresponding to the highest degree of separation among the different degrees of separation,
The set feature amount extraction region has two regions, a first region on one side and a second region on the other side along the contour of the model,
The separation degree between the first region and the second region is used as the separation degree, and the separation degree is

A shape recognition method characterized in that it is obtained based on the above.

The shape recognition method according to claim 3, wherein a circle or an ellipse model is used when setting the feature amount extraction region.