JP3716455B2

JP3716455B2 - Region extraction method and region extraction device

Info

Publication number: JP3716455B2
Application number: JP16479295A
Authority: JP
Inventors: 知生光永; 琢横山; 卓志戸塚
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1995-06-07
Filing date: 1995-06-07
Publication date: 2005-11-16
Anticipated expiration: 2020-11-16
Also published as: JPH08335268A

Description

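[0001]
[Industrial Field of Application]
The present invention relates to a region extraction method and apparatus for extracting, for image processing purposes, an object region in, for example, a moving image. In particular, it relates to a region extraction method and apparatus that can be used to create the mask image that designates the region of a target object required for image compositing.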
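[0002]
[Prior Art]
The following conventional techniques have been used to extract the region of a target object from a moving image.
[0003]
First, among techniques that use an active contour model, the model known as snakes, described in "Snakes: Active Contour Models", Kass M., Witkin A., Terzopoulos D., Proc. 1st ICCV, pp. 259-268, 1987, consists of control points that move so as to converge on a contour in the image and that are linked into a chain by constraint conditions.
[0004]
The following documents apply this active contour model to tracking a contour from frame to frame.
Document (a): "Automatic Extraction of Complex Objects Based on Region Segmentation", Minoru Etoh, Yoshiaki Shirai, Proceedings of NICOGRAPH '92, pp. 8-17, 1992; corresponding patent document: Japanese Unexamined Patent Publication No. H5-61977, "Region Extraction Apparatus"
Document (b): "Contour Tracking Method Using an Elastic Contour Model and the Energy Minimization Principle", Naonori Ueda, Kenji Mase, Yasuhito Suenaga, IEICE journal, Vol. J-75-D-II, No. 1, pp. 111-120, 1992; corresponding patent document: Japanese Unexamined Patent Publication No. H5-12433, "Method for Tracking the Contour of a Moving Object"
The techniques in these documents converge the active contour model in the current frame, using the contour position of the previous frame as the initial value, to obtain the object contour. In general, this approach has the following characteristics.
[0005]
An active contour model links multiple control points with spline curves or the like, so a smooth contour is always obtained.
Convergence of the active contour model is an iterative process in which the control points are moved little by little while the model's evaluation function is updated.
[0006]
Next, among techniques based on motion vector estimation, the following documents propose determining the contour position in the current frame by estimating the motion vectors of the contour between the previous and current frames.
[0007]
Document (c): "A Study on Motion Vector Detection Methods for Video", Nobuyuki Yagi, Katsuyuki Tanaka, Kazumasa Enami, Journal of the Institute of Television Engineers of Japan, Vol. 45, No. 10, pp. 1221-1229, 1991
Document (d): Japanese Unexamined Patent Publication No. H4-117079, "Image Processing System"
The techniques in these documents obtain motion vectors by block matching between the previous and current frames. Block matching searches the current image for the block most similar to a block containing the pixel of interest in the previous image. Because it examines the direction of object motion directly between the previous and current frames, block matching generally tracks well even when the motion differs from the earlier motion.
[0008]
The technique described in document (e), "Object Extraction and Insertion Method for Moving Image Compositing", Seiki Inoue, Hiroki Koyama, Journal of the Institute of Television Engineers of Japan, Vol. 47, No. 7, pp. 999-1005, 1993, does not obtain motion vectors directly between the previous and current frames. Instead it uses a prediction vector based on the motion of the contour up to the previous frame. Document (e) actually processes interlaced images field by field, but for the purpose of explaining the principle this specification describes it frame by frame. Because this prediction is coarser than block matching, the technique does not track the contour trajectory alone. It predicts the motion of a band of some thickness in which the contour may lie, and within that band it extracts the contour in detail by edge detection or the like.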
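[0009]
[Problems to Be Solved by the Invention]
The conventional object region extraction techniques described above have the following problems in frame-to-frame tracking and in generating the contour trajectory.
[0010]
First, regarding frame-to-frame tracking, the convergence of the active contour model (snakes) used in documents (a) and (b) depends strongly on the initial value. For example, when the object position changes greatly, convergence is poor and the model often falls into an incorrect local solution. Document (b) also proposes an improvement that adds an elastic constraint to the active contour model so that it can follow large changes in object position, but in exchange the model loses flexibility with respect to deformation of the contour shape.
[0011]
A tracking method that derives a prediction vector from the contour motion up to the previous frame, as in document (e), cannot cope when the direction of motion changes abruptly in the current frame.
[0012]
The block matching used in documents (c) and (d), on the other hand, examines the motion between the previous and current frames directly and is therefore robust to changes in motion. However, block matching itself is error-prone in contour regions, and conventional block matching used as is does not give sufficient accuracy.
[0013]
Next, regarding generation of the contour trajectory: in methods based on motion vector estimation, the processing that derives the contour on the current frame as a trajectory, once the motion vectors are available, determines the detail and accuracy of that trajectory. In documents (c) and (d), the object region is moved and deformed according to the motion vectors or to deformation parameters derived from them. This handles only global deformation and cannot handle fine deformation, particularly of the contour shape.
[0014]
In document (e), tracking by motion vectors is followed by edge detection in the contour region to determine the contour position in detail. Because the contour trajectory is obtained as a sequence of pixel points, it is difficult to correct after processing. In addition, this edge detection method cannot distinguish the contour of the target object from other image density gradients, so for some images the contour cannot be extracted accurately.
[0015]
With the active contour model (snakes), the tracking result is obtained directly as a contour curve. Because the contour model is described by a parametric curve, correction after processing is easy. Ease of correction is an important requirement in region extraction and image compositing: wrongly detected portions must be fixed locally, and when the image is enlarged or reduced the mask image must be enlarged or reduced with it.
[0016]
The present invention has been made in view of these circumstances. Its object is to provide a region extraction method and apparatus that solve the problems of the conventional region extraction methods, improve frame-to-frame contour tracking, make the contour easier to correct, improve the accuracy of the contour shape, and give good region extraction results.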
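[0017]
[Means for Solving the Problems]
To solve the above problems, the region extraction method and apparatus according to the present invention, given the trajectory of the object contour in the previous frame of an input moving image, estimate the motion vectors of the contour between the previous and current frames to obtain an estimated contour, that is, the estimated position of the contour on the current frame; determine on the current frame a contour candidate region, that is, a region containing the pixels around the estimated contour in which the contour may exist; obtain in this contour candidate region a gradient vector field from the inner product of a unit vector in image space and the unit vector in density space of each pixel; modulate the gradient strength according to the direction of the vectors in this gradient vector field and a vector in image space; generate a closed curve by connecting, for each region, the pixels at which the gradient strength is maximal; and take its trajectory as the object contour in the current frame.
[0018]
Here, the motion vectors may be estimated by extracting feature points on the given object contour trajectory and estimating the motion vector of each extracted feature point by hierarchical block matching. The error between the two blocks being compared is evaluated with an error evaluation function that weights the residuals of object-region pixels, and the motion vectors resulting from each layer are smoothed.
[0019]
The contour candidate region may be determined by computing the mean square error between a block on the previous-frame contour and the corresponding block on the estimated contour of the current frame, normalized by the difference between the object color and the background color, and using that value to adjust the thickness of the contour candidate region formed around the estimated contour of the current frame.
[0020]
The gradient may be computed by extracting feature points from the estimated contour of the current frame, dividing the contour candidate region into subregions according to the feature point positions on the estimated contour, estimating the direction of the contour in the current frame for each subregion, and performing edge detection that responds only to gradients in that direction, thereby obtaining the gradient vector field of the contour candidate region. Alternatively, after the same feature point extraction and subdivision, the color change across the contour in the current frame may be estimated for each subregion, and edge detection that responds only to gradients of that color change may be performed to obtain the gradient vector field of the contour candidate region.
[0021]
Further, the curve may be generated, for the gradient vector field of the contour region of the current frame, by extracting feature points from the estimated contour of the current frame, dividing the contour candidate region into subregions according to the feature point positions on the estimated contour, extracting for each subregion a point through which the contour passes, and connecting these passing points with a cubic spline curve generated so as to pass where the gradient vector field vectors are large.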
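[0022]
[Operation]
The motion vectors of the object contour in the previous frame are estimated, the region of the current frame in which the contour may exist is determined, and a closed cubic spline curve passing through the steep density gradients in that region is obtained and taken as the object contour of the current frame. Performing this processing frame after frame extracts the object in every frame of the moving image.
[0023]
Given the trajectory of the object contour in the previous frame, the region extraction method and apparatus according to the present invention estimate the motion vectors of the contour between the previous and current frames to obtain the estimated position of the contour on the current frame, that is, the estimated contour; determine on the current frame the region containing the pixels around the estimated contour in which the contour may exist, that is, the contour candidate region; obtain the gradient vector field of density values in the contour candidate region; generate a closed curve that passes where the vectors in that gradient vector field are large; and take its trajectory as the object contour in the current frame. The contour can therefore be followed even when it moves a long way between the previous and current frames or when the direction of its motion changes abruptly. In addition, because the contour in each frame is described by a parametric curve, it is easy to correct after processing.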
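[0024]
Preferred embodiments of the present invention are described below in the following order: overall configuration, details of each part, and effects.
[0025]
FIG. 1 shows the overall processing of a method for extracting the region of an object in a moving image as one embodiment of the present invention. This embodiment obtains the object contour B in each frame, as a parametrically represented closed curve, from a sequence of images I containing the object.
[0026]
FIG. 2 shows an example of the schematic configuration of an image processing apparatus that performs this object region extraction.
[0027]
In FIG. 2, the image processing apparatus has a CPU (central processing unit) 21 that performs all the computation required for object region extraction in this embodiment; external storage means 22 that holds the images I, the contours B, intermediate results, and so on; input means 23, such as a mouse or tablet pen, for creating images and entering trajectories; and display means 24, such as a display, for showing images. Data is exchanged among the CPU 21, the external storage means 22, the input means 23, and the display means 24 over a bus line 25.
[0028]
Returning to FIG. 1, the figure shows the processing for one frame of the object region extraction executed mainly by the CPU 21 of the image processing apparatus of FIG. 2. For each frame, this embodiment takes as input the previous frame image I_t-1, the current frame image I_t, and the object contour B_t-1 already obtained for the previous frame, and outputs the object contour B_t of the current frame. First, the motion vector estimation unit 11 obtains the position of the object contour in the current frame by estimating the motion vectors of the previous-frame object contour B_t-1 between the previous and current frames, and takes this as the estimated contour B_e. Second, the contour candidate region determination unit 12 determines the range around the estimated contour B_e in which the object contour may exist and takes it as the contour candidate region A. Third, the gradient calculation unit 13 obtains the gradient vector field G in the contour candidate region A in order to find the exact position of the object contour. Fourth, the curve generation unit 14 finds a closed curve that passes where the vectors of the gradient vector field G are large, so that it passes exactly through the object contour. This closed curve is the object contour B_t of the current frame. The contour B_t is passed through the delay unit 15 and used as the previous-frame object contour B_t-1 in the next iteration, and processing continues in the same way.
[0029]
For the first frame, the object contour must be supplied by a method other than this embodiment. A curve editor, in which the image is shown on a display and points on the contour are specified manually through a suitable graphical user interface (GUI) to generate a spline curve, is already known and can be used.
[0030]
The ultimate purpose of this embodiment includes obtaining, for each frame, a mask image that indicates the object region by luminance levels. Once the object contour is available as a closed curve, the mask image can be generated by known techniques such as inside/outside testing and pixel fill algorithms, so the following description assumes that a mask image I_m is obtained from this embodiment. Accordingly, in the description below, whether a pixel belongs to the object region is assumed to be decidable once the object contour or the estimated contour is available. References on inside/outside testing and fill algorithms include "Geometric Processing Engineering with Computer Displays" (コンピュータディスプレイによる図形処理工学), Fujio Yamaguchi, Nikkan Kogyo Shimbun, 1981, and "Computer Graphics: Principles and Practice, 2nd ed.", Foley, van Dam, Feiner, Hughes, Addison-Wesley Publishing, 1990.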
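[0031]
Data formats are described next.
In the following description, each kind of data has the format below.
[0032]
First, a "curve" such as the object contour or the estimated contour is a closed curve made of consecutive cubic spline segments. One cubic spline segment is described by four control points, and a curve is described by several such sets. Consecutive spline segments are a sequence of segments in which p₃ of one segment coincides with p₀ of the next. A closed curve is one in which p₃ of the last segment coincides with p₀ of the first. That is, with K segments:
(p₀₀,p₀₁,p₀₂,p₀₃),(p₁₀,p₁₁,p₁₂,p₁₃),(p₂₀,p₂₁,p₂₂,p₂₃),...
...,(p_K-1,0, p_K-1,1, p_K-1,2, p_K-1,3)
p_i+1,0 = p_i,3
p_K-1,3 = p₀₀
Next, a "feature point" on a contour is, in this embodiment, a data item holding two positions: a position on the contour and a position in the image. The position on the contour is the length along the trajectory from the contour's starting point p₀₀.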
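The closed-curve format above can be sketched as a small data check. This is only an illustration: the representation (a list of segments, each a tuple of four `(x, y)` control points) and the function name `is_closed_spline` are assumptions made here, not something the specification defines.

```python
def is_closed_spline(segments):
    """Return True if consecutive segments share endpoints and the curve closes.

    segments: list of K tuples (p0, p1, p2, p3), each point an (x, y) pair.
    Checks p_{i+1,0} == p_{i,3} for every i, including the wrap-around
    condition p_{K-1,3} == p_{0,0}.
    """
    k = len(segments)
    if k == 0:
        return False
    return all(segments[i][3] == segments[(i + 1) % k][0] for i in range(k))


# Hypothetical three-segment closed contour.
contour = [
    ((0, 0), (1, 0), (2, 0), (3, 0)),
    ((3, 0), (3, 1), (3, 2), (3, 3)),
    ((3, 3), (2, 3), (1, 3), (0, 0)),
]
print(is_closed_spline(contour))
```

Storing the shared endpoint in both neighbouring segments is redundant but keeps each segment self-describing; an implementation could equally store each shared point once.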
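[0033]
Next, the "contour candidate region" is described.
In this embodiment, the contour candidate region A is a list of the pixel positions A_ol that belong to it.
A = {A_o0, A_o1, ...}
Alternatively, the contour candidate region A is a list of the subregions A_k into which it is further divided.
A = {A₀, A₁, ...}
The subregions A_k are ordered along the estimated contour. Each subregion A_k is a list of the pixel positions A_kl that belong to it.
A_k = {A_k0, A_k1, ...}
Other notation used in the description is as follows.
Pixel value of image I at position (i, j): I[i, j]
Pixel value of image I at contour candidate region pixel position A_kl: I[A_kl]
Elements of blocks and of the gradient vector field are written in the same way.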
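[0034]
The details of each part are described next.
First, the motion vector estimation unit 11 of FIG. 1 is described.
[0035]
This unit obtains the position of the object contour in the current frame by estimating the motion vectors of the previous-frame object contour B_t-1 between the previous and current frames, and takes it as the estimated contour B_e. To improve the accuracy of motion vector estimation in contour regions, the embodiment uses the following motion vector estimation methods.
[0036]
In the first example of the motion vector estimation method, block matching, which is a means of estimating the motion vector of a small image region, conventionally evaluates the error as the mean per-pixel error between a block in the first image I₁ and a block in the second image I₂. Here, as shown in FIG. 3, a third image I₃ indicating the importance of each pixel of the first image I₁ is supplied, and the error is evaluated with weights taken from the third image.
[0037]
In FIG. 3, the matching unit 31 takes as input the first and second images I₁, I₂ containing the object whose motion vector is sought, the third image I₃ that indicates the object region of the first image I₁ by gray levels, a suitable number of templates placed on the contour of the object in the first image I₁, and search ranges placed in the second image I₂ in correspondence with the templates. It outputs the matching position on the second image I₂ that best matches each template. The motion vector calculation unit 32 takes the template position and the matching position from the matching unit 31 as input and outputs the motion vector of the template.
[0038]
This solves the problem in conventional block matching that, near object boundaries, a block contains more than one object and the motion vector cannot be determined uniquely.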
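The weighted error evaluation of the first example can be sketched as follows. This is a minimal illustration under stated assumptions: images are lists of rows of scalar gray values, the error measure is a weighted mean absolute difference (the specification does not fix the exact error function), and the names `weighted_error`, `crop`, and `match_block` are hypothetical.

```python
def crop(image, top, left, size):
    """Return the size x size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def weighted_error(template, candidate, weight):
    """Weighted mean absolute error; weight plays the role of the third image I3."""
    num = den = 0.0
    for t_row, c_row, w_row in zip(template, candidate, weight):
        for t, c, w in zip(t_row, c_row, w_row):
            num += w * abs(t - c)
            den += w
    return num / den if den > 0 else float("inf")


def match_block(img1, img2, mask, top, left, size, radius):
    """Return the motion vector (dy, dx) with the smallest weighted error."""
    template = crop(img1, top, left, size)
    weight = crop(mask, top, left, size)
    height, width = len(img2), len(img2[0])
    best, best_err = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would leave the image
            err = weighted_error(template, crop(img2, y, x, size), weight)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best
```

Because background pixels carry zero weight here, a change in the background between the two frames does not disturb the match, which is the boundary-block ambiguity the paragraph above describes.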
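[0039]
The second example of the motion vector estimation method is described next. Motion vectors on a contour should vary smoothly along the contour, but the image pattern tends to be uniform along a contour, which made conventional block matching prone to wrong motion vectors. As shown in FIG. 4, this technique estimates the motion vectors on the given contour by hierarchical search and corrects them between layers by smoothing the vector field, which yields smoothly varying motion vectors.
[0040]
In FIG. 4, the first and second images I₁, I₂ containing the object whose motion vector is sought are each converted to a hierarchy by the hierarchization units 41 and 42. With the original image as the bottom layer, the hierarchy can be built by taking, for example, the mean of each 2×2 block of pixels as one pixel of the next layer up. The block matching units 43, 44, 45, ..., 46 then perform block matching on the layered images in order from the top layer, each using the result of the layer above.
[0041]
The configuration of FIG. 4 further has a block list generation unit 47 and smoothing units 48, 49, .... The block list generation unit 47 places blocks on the contour to be tracked, based on the input object contour information. The smoothing units 48, 49, ... smooth the motion vector field.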
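The two building blocks of the hierarchical search, the 2×2 averaging pyramid and the smoothing of motion vectors along the contour, can be sketched as below. The smoothing kernel (a three-point moving average over a closed list of vectors) is an assumption chosen for illustration; the specification only states that the vector field is smoothed. All function names are hypothetical.

```python
def downsample(image):
    """One layer up: each output pixel is the mean of a 2x2 block of input pixels."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(image, levels):
    """pyramid[0] is the original image (bottom layer); pyramid[-1] is the top."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def smooth_vectors(vectors):
    """Average each motion vector with its two neighbours on a closed contour."""
    n = len(vectors)
    out = []
    for i in range(n):
        a, b, c = vectors[i - 1], vectors[i], vectors[(i + 1) % n]
        out.append(((a[0] + b[0] + c[0]) / 3.0, (a[1] + b[1] + c[1]) / 3.0))
    return out
```

In a coarse-to-fine search, a vector found at one layer would typically be doubled and used as the search centre at the next layer down; that step is omitted here.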
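[0042]
The third example of the motion vector estimation method is described next with reference to FIG. 5. When an object contour is tracked by block matching, estimating motion vectors at every point on the contour is computationally very expensive, and motion vectors cannot always be estimated accurately at contour points. As shown in FIG. 5, this technique provides means for extracting feature points on the contour and estimates motion vectors at those feature points, so that the contour can be tracked with high reliability from a small number of motion vector estimates.
[0043]
In FIG. 5, the motion vector estimation unit 51 receives the first and second images I₁, I₂ containing the object whose motion vector is sought, and the template list and search range list from the feature point extraction unit 52. The feature point extraction unit 52 extracts feature points from the given object contour information and the first image I₁ and generates the list of blocks to be searched. The motion vector estimation unit 51 estimates the motion vector field for the generated block list.
[0044]
FIG. 6 shows a fourth example of the motion vector estimation method, which combines the first to third examples. That is, it provides a way of combining the three techniques, each of which improves the accuracy of motion vector estimation at contours in a different way.
[0045]
In FIG. 6, the hierarchization units 61 and 62 correspond to the hierarchization units 41 and 42 of FIG. 4. From the previous frame image I_t-1 and the previous-frame contour B_t-1, the feature point extraction unit 64 extracts feature points on the contour B_t-1 using the technique of the third example (FIG. 5), and a block is placed at each feature point position in order to estimate its motion vector.
[0046]
To search by the hierarchical block matching of the second example (FIG. 4), hierarchical images are created for the previous frame image I_t-1, the current frame image I_t, and the previous-frame mask image I_mt-1 obtained from the previous-frame contour B_t-1. The block matching units 43, 44, 45, ..., 46 perform hierarchical block matching from the top layer to the bottom layer. At each layer, the block matching error is evaluated with the mask image I_mt-1 by the technique of the first example (FIG. 3). Between layers, the smoothing units 48, 49, ... apply the smoothing of the second example (FIG. 4).
[0047]
The estimated destinations of the feature points obtained by motion vector estimation (hereinafter, feature points on the estimated contour) are connected by curve interpolation in the interpolation unit 65 so that the curve passes through every feature point. Methods of generating an interpolating curve through given points include the Catmull-Rom spline described in the above-cited "Computer Graphics: Principles and Practice, 2nd ed.", Foley, van Dam, Feiner, Hughes, Addison-Wesley Publishing, 1990. The points may also simply be joined by straight lines to form the estimated contour.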
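As an illustration of the interpolation step, the sketch below samples a closed contour through the moved feature points with a uniform Catmull-Rom spline. The uniform parameterisation, the sampling density, and the function names are assumptions made for this example; the specification only names the Catmull-Rom spline as one possible interpolant.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom point between p1 (t = 0) and p2 (t = 1)."""
    def coord(a, b, c, d):
        return 0.5 * (
            2 * b
            + (-a + c) * t
            + (2 * a - 5 * b + 4 * c - d) * t * t
            + (-a + 3 * b - 3 * c + d) * t ** 3
        )
    return (coord(p0[0], p1[0], p2[0], p3[0]), coord(p0[1], p1[1], p2[1], p3[1]))


def closed_contour(points, samples_per_span=8):
    """Sample a closed curve passing through every point in order."""
    n = len(points)
    curve = []
    for i in range(n):
        p0, p1 = points[i - 1], points[i]
        p2, p3 = points[(i + 1) % n], points[(i + 2) % n]
        for s in range(samples_per_span):
            curve.append(catmull_rom(p0, p1, p2, p3, s / samples_per_span))
    return curve


# Hypothetical feature points on an estimated contour.
estimated = closed_contour([(0, 0), (4, 0), (4, 4), (0, 4)], samples_per_span=4)
```

Each span starts exactly at one feature point and ends at the next, so the sampled curve interpolates all feature points.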
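[0048]
By combining the three techniques shown in the first to third examples of the motion vector estimation method, this example obtains still better results from their respective effects.
[0049]
A specific example of the contour candidate region determination unit 12 is described next.
[0050]
This unit determines the range around the estimated contour B_e in which the object contour may exist and takes it as the contour candidate region A. To do so, it estimates how far the estimated contour B_e deviates from the true contour and sets the size of the region to match that deviation.
[0051]
FIG. 7 is a block diagram showing an example configuration for contour candidate region determination in this embodiment. The processing is outlined with reference to FIG. 7.
[0052]
First, the evaluation point extraction unit 72 extracts from the estimated contour B_e, as points at which to evaluate the deviation, points where the contour is relatively straight and the density on each side of the contour can be regarded as constant. Second, the corresponding point extraction unit 71 extracts the points on the previous-frame object contour B_t-1 that correspond to the evaluation points. Third, the deviation evaluation unit 73 evaluates the error between each evaluation point and its corresponding point and estimates the deviation. Fourth, the region determination unit 74 sets the size of the region around each evaluation point from the deviation and takes the pixel positions within that range as the contour candidate region.
[0053]
FIG. 8 is a flowchart illustrating an example of this contour candidate region determination. The processing takes as input the previous-frame object contour B_t-1, the estimated contour B_e, the feature points c_k on B_t-1, the feature points c_ek on B_e, and the number of feature points K, and outputs the contour candidate region A. The processing described below is repeated for each interval between feature points. FOR loop 81 executes this repetition K times, once per feature point.
[0054]
The processing for the k-th interval between feature points inside FOR loop 81 is as follows.
[0055]
First, in step S82, the midpoint along the estimated contour between the feature points c_ek and c_ek+1 on the estimated contour B_e is taken as the k-th evaluation point s_ek. Likewise, the midpoint along the contour between the feature points c_k and c_k+1 on the previous-frame object contour B_t-1 is taken as the k-th corresponding point s_k. Step S82 also places block b_ek at the evaluation point s_ek and block b_k at the corresponding point s_k. In the next step S83, the normalized root mean square error NRMSE is computed from the two blocks, and this value divided by the block size BS is taken as the deviation x. Processing then moves to FOR loop 84. The NRMSE calculation is described later.
[0056]
FOR loop 84 iterates the loop control variables i and j over the image size IS. Steps S85, S86, and S87 inside FOR loop 84 determine, for each pixel position (i, j) in the image, that the position belongs to the contour candidate region if its distance to the line segment c_ek c_ek+1 is less than x.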
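The membership test of steps S85 to S87 can be sketched as follows. The deviations x_k are taken as given inputs here, and the brute-force scan over all pixels mirrors the flowchart rather than an efficient implementation. The function names and the use of a Python `set` for the region are assumptions for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def candidate_region(width, height, feature_points, deviations):
    """Collect pixels closer than deviations[k] to segment c_ek c_ek+1."""
    region = set()
    k_count = len(feature_points)
    for k in range(k_count):
        a, b = feature_points[k], feature_points[(k + 1) % k_count]
        for j in range(height):
            for i in range(width):
                if point_segment_distance((i, j), a, b) < deviations[k]:
                    region.add((i, j))
    return region
```

A larger deviation x_k widens the band around that part of the estimated contour, so the region is thick only where the estimate is judged unreliable.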
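[0057]
FIG. 9 is a flowchart showing how the NRMSE is calculated. The calculation uses the blocks b_k and b_ek.
In FIG. 9, FOR loop 91 iterates the loop control variables i and j over the block size BS. Inside FOR loop 91, step S92 first determines whether each pixel value b_e[i,j] in block b_e is inside or outside the estimated contour B_e. Step S93 registers the pixel value b_e[i,j] in the set of inside pixel values fgdata[ ], and step S94 registers it in the set of outside pixel values bgdata[ ].
[0058]
After FOR loop 91 finishes, step S95 takes the centroid of the set of all inside pixels fgdata[ ] as fg and the centroid of the set of all outside pixels bgdata[ ] as bg. The root mean square error RMSE between block b_e and block b is then computed by the following equation.
[0059]
[Equation 1]
(The equation image is not reproduced in this text. From the definition of the root mean square error over the BS × BS block, it takes the form below.)
$$\mathrm{RMSE} = \sqrt{\frac{1}{BS^{2}} \sum_{i}\sum_{j} \left\lVert b_e[i,j] - b[i,j] \right\rVert^{2}}$$
Next, in step S96, the RMSE divided by the magnitude of the difference between the centroids is taken as the NRMSE.
[0060]
NRMSE = RMSE / |fg - bg|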
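The flow of FIG. 9 can be sketched in a few lines. The sketch assumes that pixel values are RGB triples, that the block is square, and that the RMSE is the usual root mean square of per-pixel colour differences; the inside/outside mask is passed in as a boolean array rather than derived from the contour. The function names are hypothetical.

```python
import math


def centroid(pixels):
    """Mean colour of a non-empty list of (R, G, B) triples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def nrmse(block_prev, block_est, inside):
    """RMSE between two blocks, normalised by |fg - bg| of the estimated block.

    inside[i][j] is True where block_est[i][j] lies inside the estimated contour.
    """
    fgdata, bgdata = [], []
    sq_sum, count = 0.0, 0
    for i, row in enumerate(block_est):
        for j, pe in enumerate(row):
            (fgdata if inside[i][j] else bgdata).append(pe)
            sq_sum += sum((a - b) ** 2 for a, b in zip(pe, block_prev[i][j]))
            count += 1
    if not fgdata or not bgdata:
        raise ValueError("block must straddle the contour")
    rmse = math.sqrt(sq_sum / count)
    return rmse / math.dist(centroid(fgdata), centroid(bgdata))
```

Dividing by |fg - bg| makes the measure comparable between high-contrast and low-contrast parts of the contour: the same positional error produces a larger raw RMSE where object and background colours differ more.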
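A specific example of the gradient calculation unit 13 of FIG. 1 is described next.
[0061]
The gradient calculation obtains the gradient vector field G in the contour candidate region A in order to find the exact position of the object contour.
[0062]
FIG. 10 is a processing block diagram of the gradient calculation of this embodiment.
The processing proceeds as follows. First, the feature point extraction unit 101 applies to the estimated contour B_e the feature point extraction of the third example of the motion vector estimation method (FIG. 5). Second, the region division unit 102 divides the contour candidate region A into subregions A_k with the feature points as boundaries, so that the edge characteristics are constant within each subregion. Third, the edge feature estimation unit 103 estimates the edge characteristics of each subregion, and on that basis the edge detection unit 104 obtains the gradient using the specific edge detection methods described below.
[0063]
Specific examples of the edge detection method are as follows.
[0064]
In the first example, conventional gradient calculation attends equally to all directions in the image and therefore also detects density gradients other than the contour of the object of interest. As shown in FIG. 11, this technique is given information on which direction of density gradient to detect and modulates the detected gradient strength according to that information, so that density gradients in other directions are not detected.
[0065]
In FIG. 11, the gradient calculation unit 111 computes the gradient of the grayscale image. The direction detection unit 113 detects the direction of the gradient vector, and the selectivity determination unit 114 determines the selectivity from that direction and the image-space vector to be detected. The strength modulation unit 112 modulates the gradient strength according to the selectivity obtained.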
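A minimal sketch of the direction-selective modulation of FIG. 11 follows. The specification does not give the selectivity function, so the choice made here, the absolute cosine of the angle between the gradient and the unit vector `v_o`, is purely an assumption, as are the central-difference gradient and the function names.

```python
import math


def gradient(gray, i, j):
    """Central-difference gradient (gx, gy) at interior pixel column i, row j."""
    gx = (gray[j][i + 1] - gray[j][i - 1]) / 2.0
    gy = (gray[j + 1][i] - gray[j - 1][i]) / 2.0
    return gx, gy


def modulated_strength(gx, gy, v_o):
    """Gradient magnitude scaled by a selectivity in [0, 1].

    v_o is the unit vector in image space along which gradients should be
    detected. The selectivity is |cos(angle)| (an assumed form).
    """
    magnitude = math.hypot(gx, gy)
    if magnitude == 0:
        return 0.0
    selectivity = abs(gx * v_o[0] + gy * v_o[1]) / magnitude
    return magnitude * selectivity


# A vertical edge: intensity changes along x only.
gray = [[0, 0, 10, 10] for _ in range(3)]
gx, gy = gradient(gray, 1, 1)
print(modulated_strength(gx, gy, (1.0, 0.0)))  # responds to the edge
print(modulated_strength(gx, gy, (0.0, 1.0)))  # suppressed
```

A sharper selectivity, for example a power of the cosine or a hard angular threshold, would suppress off-direction edges more strongly; which form is appropriate is left open here.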
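[0066]
The second example of the edge detection method is described next.
[0067]
In the second example, conventional gradient calculation attends equally to density changes in all directions in density space and therefore also detects density gradients other than the contour of the object of interest. As shown in FIG. 12, the inner product calculation unit 121 and the gradient calculation unit 122 are given information on which direction of density change to detect. The density gradient is computed on the grayscale image obtained as the inner product of the image-space vector derived from that information and the density-space vector of each pixel, so that density gradients arising from changes other than the one of interest are not detected.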
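The projection step of FIG. 12 can be sketched as below. The sketch assumes that the vector used in the inner product is a unit vector in density (colour) space, written `v_c` here, since that is the reading under which an inner product with a pixel's RGB vector yields a scalar grayscale value. The horizontal central difference and the function names are likewise illustrative assumptions.

```python
import math


def project_onto_color_direction(image_rgb, v_c):
    """Grayscale image whose pixels are the inner product of RGB with v_c."""
    return [[sum(p[c] * v_c[c] for c in range(3)) for p in row] for row in image_rgb]


def horizontal_gradient(gray):
    """Central difference along each row (zero at the left and right borders)."""
    out = []
    for row in gray:
        g = [0.0] * len(row)
        for i in range(1, len(row) - 1):
            g[i] = (row[i + 1] - row[i - 1]) / 2.0
        out.append(g)
    return out


s = 1.0 / math.sqrt(2.0)
v_c = (s, 0.0, -s)  # hypothetical direction: red-to-blue change

red_blue = [[(10, 0, 0), (10, 0, 0), (0, 0, 10), (0, 0, 10)]]
green_step = [[(0, 2, 0), (0, 2, 0), (0, 9, 0), (0, 9, 0)]]

g_rb = horizontal_gradient(project_onto_color_direction(red_blue, v_c))
g_gs = horizontal_gradient(project_onto_color_direction(green_step, v_c))
```

The red-to-blue edge survives the projection, while the step in green, which is orthogonal to `v_c`, produces no gradient at all.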
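[0068]
The region division that characterizes the gradient calculation of this embodiment is described below with reference to FIG. 13.
FIG. 13 is a flowchart of the region division. FOR loop 131 iterates L times, where L is the number of pixels in the contour candidate region A, and FOR loop 132 inside FOR loop 131 iterates K times, where K is the number of feature points, so that the following processing is performed for every pixel A_ol in the contour candidate region A.
[0069]
Inside FOR loop 132, step S133 computes the distance d_k between the pixel position A_ol and each of the K line segments c_k c_k+1 formed by adjacent feature points c_k and c_k+1. The next steps S134 and S135 keep the smallest d_k as d_min and the corresponding k as m. After FOR loop 132 finishes, step S136 assigns A_ol to region A_m.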
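The nearest-segment classification of FIG. 13 can be sketched as follows. The feature points are assumed to form a closed polygon, so the K-th segment joins the last feature point back to the first; the function names and list-based data layout are assumptions for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def partition(region_pixels, feature_points):
    """Assign each pixel A_ol to the subregion A_m of its nearest segment."""
    k_count = len(feature_points)
    subregions = [[] for _ in range(k_count)]
    for p in region_pixels:
        d_min, m = float("inf"), 0
        for k in range(k_count):
            d_k = point_segment_distance(
                p, feature_points[k], feature_points[(k + 1) % k_count]
            )
            if d_k < d_min:
                d_min, m = d_k, k
        subregions[m].append(p)
    return subregions
```

Because `subregions` is indexed by segment number, the resulting list is already ordered along the estimated contour, as the data format section requires.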
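[0070]
Next, the processing that obtains the information needed for edge detection in each contour candidate subregion A_k is described.
[0071]
FIG. 14 is a flowchart of the processing that obtains the edge detection information in this embodiment. In this embodiment, a unit vector v_ok in image space and a unit vector v_ck in density space are obtained as the edge detection information.
[0072]
In FIG. 14, FOR loop 141 iterates L_k times, where L_k is the number of pixels in the contour candidate subregion. Inside FOR loop 141, step S142 determines for each subregion pixel A_kl whether it is inside or outside the estimated contour B_e. Step S143 registers A_kl in the set of inside pixel values fgdata[ ], and step S144 registers it in the set of outside pixel values bgdata[ ].
[0073]
After FOR loop 141 finishes, step S145 takes the centroid of the set of all inside pixels fgdata[ ] as fg and the centroid of the set of all outside pixels bgdata[ ] as bg.
[0074]
The unit vector v_ok in image space specifies the direction of the gradient to be detected. In each subregion A_k, v_ok is set orthogonal to the line segment c_k c_k+1 formed by the two feature points. The unit vector v_ck in density space specifies the direction of the density change to be detected. In each subregion A_k, all pixels are classified as inside or outside the estimated contour B_e, the centroids fg and bg are computed for the two classes, and v_ck is taken parallel to the difference vector of the centroids (see step S146). Alternatively, the two centroids may first be projected onto a plane perpendicular to the luminance axis and v_ck taken parallel to the difference vector of the projections. This is effective in removing noise in the luminance direction.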
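The two unit vectors can be computed as in the sketch below. The sign convention of the normal (which of the two orthogonal directions is chosen) and the RGB representation of pixel values are assumptions; the specification fixes only that v_ok is orthogonal to the segment and v_ck is parallel to fg - bg. The function names are hypothetical.

```python
import math


def image_space_normal(c_k, c_k1):
    """Unit vector v_ok orthogonal to the segment from c_k to c_k1."""
    dx, dy = c_k1[0] - c_k[0], c_k1[1] - c_k[1]
    length = math.hypot(dx, dy)
    return (-dy / length, dx / length)


def centroid(pixels):
    """Mean colour of a non-empty list of (R, G, B) triples."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


def color_direction(fgdata, bgdata):
    """Unit vector v_ck parallel to the centroid difference fg - bg."""
    fg, bg = centroid(fgdata), centroid(bgdata)
    diff = [f - b for f, b in zip(fg, bg)]
    norm = math.sqrt(sum(d * d for d in diff))
    return tuple(d / norm for d in diff)


v_ok = image_space_normal((0, 0), (2, 0))
v_ck = color_direction([(10, 0, 0), (20, 0, 0)], [(5, 0, 0)])
```

Both functions would fail with a division by zero for a degenerate segment or for identical centroids; a production implementation would need to handle subregions with no colour contrast.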
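[0075]
FIG. 15 is a block diagram of processing that combines the first and second examples of the edge detection method described with FIGS. 11 and 12. The combined gradient calculation proceeds as follows.
[0076]
In FIG. 15, the inputs are the input image whose gradient is sought, the density-space vector to be detected, and the image-space vector to be detected.
[0077]
First, the inner product calculation unit 151 obtains a grayscale image from the inner product of the image-space vector and the density-space vector of each pixel. The gradient calculation unit 152 computes the gradient of this grayscale image. The direction detection unit 154 detects the direction of the gradient vector, and the selectivity determination unit 155 determines the selectivity from that direction and the image-space vector to be detected. The strength modulation unit 153 modulates the gradient strength according to the selectivity obtained. This processing is performed in every contour candidate subregion.
[0078]
As described with FIGS. 10 to 15, the embodiment provides means for automatically supplying the information needed by the first and second examples of the edge detection technique (FIGS. 11 and 12), which enable selective contour detection, and uses the two techniques in combination. That is, in obtaining the gradient vector field of density values in the contour candidate region, the present invention extracts feature points from the estimated contour of the current frame by the technique of the third example of the motion vector estimation method (FIG. 5) and divides the contour candidate region into subregions according to the feature point positions on the estimated contour. The feature point extraction of FIG. 5 extracts the bending points of the contour and the points where the contour color changes. The edge detection of FIGS. 11 and 12 attends to the gradient direction and the color change direction of the edge. The feature points detected by the technique of FIG. 5 are therefore the points at which the edge characteristics used in FIGS. 11 and 12 change. Dividing the region at the feature points thus makes the contour characteristics constant within each subregion. Examining the contour characteristics subregion by subregion enables accurate edge detection by the techniques of FIGS. 11 and 12 and suppresses unwanted gradient components other than the object contour.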
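[0079]
A specific example of the curve generation unit 14 of FIG. 1 is described next.
[0080]
Curve generation extracts the object contour position as a cubic spline curve from the gradient vector field obtained in the candidate region. To do so, it finds a closed curve that passes where the vectors of the gradient vector field G are large. This closed curve is the object contour B_t of the current frame.
[0081]
FIG. 16 is a processing block diagram of the curve generation of this embodiment. The processing proceeds as follows.
[0082]
In FIG. 16, first, the feature point extraction unit 161 applies to the estimated contour B_e the feature point extraction of the third example of the motion vector estimation method (FIG. 5). Second, the region division unit 162 divides the contour candidate region A into subregions A_k with the feature points as boundaries. The division method is the same as in the gradient calculation described above. Third, the passing point extraction unit 163 extracts one passing point from each subregion. The extracted passing points are ordered in the order of the subregions. Fourth, the p_i1, p_i2 search unit 164 finds the remaining control points of each segment by the spline control point search described below, which gives the object contour B_t.
[0083]
The spline control point search is described with reference to FIG. 17.
[0084]
FIG. 17 is a block diagram for implementing the curve generation method, a technique that searches for a cubic spline curve shape passing where the vector field vectors are as large as possible. By providing means for giving the approximate position of the object contour in the image, as shown in FIG. 17, the curve is searched for using the gradient vector field information of the image, and the object contour is extracted as a parametric curve.
[0085]
The curve generation method shown in FIG. 17 generates a curve passing over a smooth two-dimensional vector field, for example a density gradient vector field formed from the R, G, and B signals that represent image color. It has an evaluation value calculation step that computes, at a predetermined number of evaluation positions on the curve, an evaluation value expressed as a function of the inner product of the unit vector orthogonal to the curve's tangent and the vector field vector, and a curve determination step that determines the curve so that the evaluation value obtained in the evaluation value calculation step is maximized.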
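The evaluation value described above can be illustrated with the sketch below. Several choices here are assumptions rather than the specification's method: the cubic segment is taken to be in Bézier form, the function of the inner product is its absolute value, the evaluation positions are equally spaced in the curve parameter, and the vector field is supplied as a Python callable. The curve determination step, a search over the inner control points p_i1 and p_i2 that maximises this value, is not shown.

```python
import math


def bezier_point(p, t):
    """Point on the cubic Bezier segment with control points p[0..3]."""
    u = 1.0 - t
    w = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    return (sum(wi * pi[0] for wi, pi in zip(w, p)),
            sum(wi * pi[1] for wi, pi in zip(w, p)))


def bezier_tangent(p, t):
    """First derivative of the cubic Bezier segment at parameter t."""
    u = 1.0 - t
    w = (3 * u * u, 6 * u * t, 3 * t * t)
    return (sum(wi * (p[i + 1][0] - p[i][0]) for i, wi in enumerate(w)),
            sum(wi * (p[i + 1][1] - p[i][1]) for i, wi in enumerate(w)))


def evaluate(p, field, samples=10):
    """Sum of |n . V| over equally spaced parameters, n being the unit normal."""
    total = 0.0
    for s in range(samples + 1):
        t = s / samples
        x, y = bezier_point(p, t)
        tx, ty = bezier_tangent(p, t)
        length = math.hypot(tx, ty)
        if length == 0:
            continue  # tangent undefined at this sample
        nx, ny = -ty / length, tx / length
        vx, vy = field(x, y)
        total += abs(nx * vx + ny * vy)
    return total


def field(x, y):
    """Hypothetical field: a strong vertical gradient along the line y = 0."""
    return (0.0, 1.0) if abs(y) < 0.5 else (0.0, 0.0)


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
bent = [(0, 0), (1, 3), (2, 3), (3, 0)]
print(evaluate(straight, field) > evaluate(bent, field))
```

The segment that stays on the edge scores higher than the one that bulges away from it, which is the property a search over p_i1 and p_i2 would exploit.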
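[0086]
In FIG. 17, the vector field supplies the vector field vector V(i, j) at every point (i, j) of the space, for example the input image, to the passing point extraction unit 172 and the p_i1, p_i2 search unit 174. The trajectory input unit 171 has input means such as a pen-type input device (a tablet pen) or a mouse. When a trajectory T is entered as a sequence of pixel points in the rough area where the curve is wanted, the trajectory input unit 171 outputs T to the passing point extraction unit 172. The passing point extraction unit 172 examines the vector field within a fixed distance of the trajectory T, extracts as passing points p the positions where the vector magnitude meets a predetermined criterion, and outputs the position data of these passing points to the ordering unit 173. As described later, the passing points p are all points whose shortest distance to the trajectory T is less than a predetermined length and whose vector field vector magnitude is greater than a predetermined value. The ordering unit 173 sorts the passing points p along the direction of travel of the trajectory T and outputs the result as reference points p₀₀, p₁₀, p₂₀, ..., p_K0 to the p_i1, p_i2 search unit 174. The p_i1, p_i2 search unit 174 performs the evaluation value calculation step and the curve determination step. In the evaluation value calculation step, based on the reference points p_i0 and p_(i+1)0 and the vector field vectors V(i, j), the sample points p_i1 and p_i2 are extracted between the reference points p_i0 and p_i3, and a partial curve is generated from these reference and sample points and output. The partial curves form the curve when joined end to end.
[0087]
According to FIG. 17, the user enters a trajectory T in the space, for example an image, at the trajectory input unit 171. The passing point extraction unit 172 and the ordering unit 173 extract, from the trajectory T and the vector field vectors, the reference points p₀₀, p₁₀, ... used to generate the partial curves that make up the curve. The p_i1, p_i2 search unit 174 then takes the sample points p_i1 and p_i2 between the reference points p_i0 and p_i3, based on the reference points p₀₀, p₁₀, ... and the vector field vectors, forms partial curves from these points, and joins the partial curves to obtain the desired curve.
[0088]
The passing point extraction that characterizes the curve generation of this embodiment is described below. FIG. 18 is a flowchart of the passing point extraction.
[0089]
In FIG. 18, FOR loop 181 iterates K times, where K is the number of contour candidate subregions, and FOR loop 182 inside FOR loop 181 iterates L_k times, where L_k is the number of pixels in the k-th contour candidate subregion A_k. For every contour candidate subregion A_k, the gradient strength G[A_kl] at each pixel position A_kl in the subregion is examined, and the pixel position where it is largest is taken as the passing point p_k0.
[0090]
Inside FOR loop 182, step S183 sets g to the absolute value of the gradient strength G[A_kl]. The next step S184 tests whether g is greater than the maximum so far, g_max. Only if the answer is Yes does processing go to step S185, which sets g_max to the current g and the passing point p_k0 to the current pixel position A_kl.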
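The double loop of FIG. 18 amounts to an argmax per subregion, which can be sketched as follows. The gradient field is assumed to be stored as a dictionary from pixel position to a `(gx, gy)` vector, and the function name is hypothetical.

```python
import math


def extract_passing_points(subregions, grad):
    """Return one passing point p_k0 per subregion: the pixel of largest |G|.

    subregions: list of lists of pixel positions, ordered along the contour.
    grad: dict mapping pixel position -> (gx, gy) gradient vector.
    """
    points = []
    for pixels in subregions:
        g_max, best = -1.0, None
        for pos in pixels:
            gx, gy = grad[pos]
            g = math.hypot(gx, gy)
            if g > g_max:  # same test as step S184
                g_max, best = g, pos
        points.append(best)
    return points


subregions = [[(0, 0), (1, 0)], [(2, 0), (3, 0)]]
grad = {(0, 0): (1, 0), (1, 0): (0, 3), (2, 0): (2, 2), (3, 0): (0, 1)}
print(extract_passing_points(subregions, grad))
```

Because the subregions are already ordered along the estimated contour, the returned passing points need no separate ordering step.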
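[0091]
This curve generation adds, to the curve generation technique of FIG. 17, which can generate a curve passing where the gradient vector field is large, means for automatically extracting the sequence of passing points on the vector field that the curve search requires. That is, in generating a curve on the gradient vector field within the contour candidate region, the embodiment of the present invention extracts feature points from the estimated contour of the current frame by the technique described with FIG. 5, divides the contour candidate region into subregions according to the feature point positions on the estimated contour, and extracts for each subregion a point through which the contour passes. This gives a sequence of passing points ordered as the subregions are arranged, and the cubic spline generation of FIG. 17 can then generate a curve that passes where the gradient vector field vectors are large. The feature point extraction of FIG. 5 extracts the bending points of the contour, while the curve generation of FIG. 17 generates one cubic spline curve per passing point. Therefore, if a corresponding cubic spline segment can be generated for each feature point, the conspicuous shapes on the contour are guaranteed to be reproduced. Also, because bends and spline segments correspond one to one, the number of segments is never needlessly large or insufficient for the complexity of the shape.
[0092]
The effects of the region extraction method of this embodiment described above are explained with reference to FIG. 19.
[0093]
FIG. 19 shows, simplified for clarity, the result of applying this embodiment to a moving image.
[0094]
FIGS. 19(a) to 19(d) show four consecutive frames of the moving image, with the contours of the main objects drawn as line art. FIGS. 19(e) to 19(h) are the mask or silhouette images of the object extracted by applying this embodiment. In this example, the object obj at the center of the image is the target object.
[0095]
As FIGS. 19(a) to 19(d) show, the contour of the object obj touches several other objects, so the characteristics of the contour are not simple. Even for a contour with such arbitrary combinations of foreground and background, this embodiment can perform region extraction. As this result shows, this embodiment is a technique that automatically extracts the object region of a moving image.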
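[0096]
[Effects of the Invention]
According to the region extraction method and apparatus of the present invention, given the trajectory of the object contour in the previous frame of an input moving image, the motion vectors of the contour are estimated between the previous and current frames to obtain an estimated contour, that is, the estimated position of the contour on the current frame; a contour candidate region, that is, a region containing the pixels around the estimated contour in which the contour may exist, is determined on the current frame; the gradient vector field of density values is obtained in this contour candidate region; a closed curve is generated so as to pass where the vectors in this gradient vector field are large; and its trajectory is taken as the object contour in the current frame. This improves tracking of the object contour, makes the contour easier to correct, and improves the region extraction result.
[0097]
That is, given the trajectory of the object contour in the previous frame, the present invention estimates the motion vectors of the contour between the previous and current frames to obtain the estimated position of the contour on the current frame, that is, the estimated contour; determines on the current frame the region containing the pixels around the estimated contour in which the contour may exist, that is, the contour candidate region; obtains the gradient vector field of density values in the contour candidate region; generates a closed curve that passes where the vectors in that gradient vector field are large; and takes its trajectory as the object contour in the current frame. The contour can therefore be followed even when it moves a long way between the previous and current frames or when the direction of its motion changes abruptly. In addition, because the contour in each frame is described by a parametric curve, it is easy to correct after processing.
[0098]
The present invention also provides a way of combining three motion vector estimation techniques that are effective for estimating motion vectors at contours. Combining these techniques gives still better results from their respective effects.
[0099]
In determining the contour candidate region on the current frame, the present invention also computes the mean square error between a block on the previous-frame contour and the corresponding block on the estimated contour of the current frame, normalized by the difference between the object color and the background color. This value is used to estimate the size of the deviation between the true contour and the estimated contour on the current frame. The thickness of the contour candidate region formed around the estimated contour of the current frame is adjusted accordingly, which prevents the region from becoming larger than necessary and avoids wasted processing.
[0100]
That is, according to the region extraction method and apparatus of the present invention, in extracting an object region from a moving image, the problems of the prior art regarding frame-to-frame tracking performance, ease of correction, and accuracy of the contour shape are solved, and the region extraction result is improved.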
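[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the schematic configuration of an embodiment to which the region extraction method of the present invention is applied.
[FIG. 2] A block diagram showing the overall schematic configuration of an image processing apparatus to which the region extraction apparatus of the present invention is applied.
[FIG. 3] A block diagram showing a first example of the motion vector estimation unit of FIG. 1.
[FIG. 4] A block diagram showing a second example of the motion vector estimation unit of FIG. 1.
[FIG. 5] A block diagram showing a third example of the motion vector estimation unit of FIG. 1.
[FIG. 6] A block diagram showing a fourth example of the motion vector estimation unit of FIG. 1.
[FIG. 7] A block diagram showing a specific example of the contour candidate region determination unit of FIG. 1.
[FIG. 8] A flowchart illustrating an example of the contour candidate region determination of FIG. 7.
[FIG. 9] A flowchart showing an example of the NRMSE calculation in FIG. 8.
[FIG. 10] A block diagram showing a specific example of the gradient calculation unit of FIG. 1.
[FIG. 11] A block diagram showing a first example of the edge detection unit of FIG. 10.
[FIG. 12] A block diagram showing a second example of the edge detection unit of FIG. 10.
[FIG. 13] A flowchart illustrating the region division in the gradient calculation.
[FIG. 14] A flowchart illustrating the processing that obtains the edge detection information.
[FIG. 15] A block diagram showing processing that combines the edge detection examples of FIGS. 11 and 12.
[FIG. 16] A block diagram showing an example of the curve generation unit of FIG. 1.
[FIG. 17] A block diagram showing a specific example of a configuration to which the curve generation method is applied.
[FIG. 18] A flowchart illustrating the passing point processing.
[FIG. 19] A diagram illustrating the effects of this embodiment.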
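[Explanation of Reference Numerals]
11 Motion vector estimation unit
12 Contour candidate region determination unit
13 Gradient calculation unit
14 Curve generation unit
21 CPU (central processing unit)
22 External storage means
23 Input means
24 Display means
25 Bus line
31 Matching unit
32 Motion vector calculation unit
41, 42, 61, 62, 63 Hierarchization units
43 to 46 Block matching units
48, 49 Smoothing units
51 Motion vector estimation unit
52 Feature point extraction unit
[0001]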
[Industrial application fields]
The present invention relates to a region extraction method and apparatus for extracting, for image processing purposes, an object region in, for example, a moving image. In particular, it relates to a region extraction method and apparatus that can be used to create the mask image that designates the region of a target object required for image compositing.
[0002]
[Prior art]
As the object region extraction processing technique, the following techniques are known as techniques for extracting a region of an object from a moving image.
[0003]
First, among techniques that use an active contour model, the model known as snakes, described in "Snakes: Active Contour Models", Kass M., Witkin A., Terzopoulos D., Proc. 1st ICCV, pp. 259-268, 1987, consists of control points that move so as to converge on a contour in the image and that are linked into a chain by constraint conditions.
[0004]
The following documents are examples of applying this active contour model to contour tracking between frames.
Document (a): "Automatic Extraction of Complex Objects Based on Region Segmentation", Minoru Etoh, Yoshiaki Shirai, Proceedings of NICOGRAPH '92, pp. 8-17, 1992; corresponding patent document: Japanese Unexamined Patent Publication No. H5-61977, "Region Extraction Apparatus"
Document (b): "Contour Tracking Method Using an Elastic Contour Model and the Energy Minimization Principle", Naonori Ueda, Kenji Mase, Yasuhito Suenaga, IEICE journal, Vol. J-75-D-II, No. 1, pp. 111-120, 1992; corresponding patent document: Japanese Unexamined Patent Publication No. H5-12433, "Method for Tracking the Contour of a Moving Object"
The techniques described in these documents are intended to obtain an object contour by converging a dynamic contour model in the current frame with the contour position of the previous frame as an initial value. In general, this method has the following characteristics.
[0005]
The active contour model is obtained by connecting a plurality of control points using a spline curve or the like, and a smooth contour is always obtained.
The convergence of the active contour model is an iterative process of updating the evaluation function of the model while moving the control point little by little.
[0006]
Next, as a method for estimating the motion vector, a method for determining the contour position of the current frame by estimating the motion vector of the contour between the previous and current frames is proposed by the following document.
[0007]
Document (c): "A Study on Motion Vector Detection Methods for Video", Nobuyuki Yagi, Katsuyuki Tanaka, Kazumasa Enami, Journal of the Institute of Television Engineers of Japan, Vol. 45, No. 10, pp. 1221-1229, 1991
Document (d) Japanese Patent Laid-Open No. 4-117079 “Image Processing System”
The techniques described in these documents obtain a motion vector by performing block matching between the previous and current frames. Block matching is a method for searching the current image for a block most similar to the block including the pixel of interest in the previous image. In general, the method using block matching is characterized in that the direction of motion of an object is examined between the previous and current frames, so that the tracking is good even when different from the previous motion.
[0008]
The technique described in document (e), "Object Extraction and Insertion Method for Moving Image Compositing", Seiki Inoue, Hiroki Koyama, Journal of the Institute of Television Engineers of Japan, Vol. 47, No. 7, pp. 999-1005, 1993, does not obtain motion vectors directly between the previous and current frames. Instead it uses a prediction vector based on the motion of the contour up to the previous frame. Document (e) actually processes interlaced images field by field, but for the purpose of explaining the principle this specification describes it frame by frame. Because this prediction is coarser than block matching, the technique does not track the contour trajectory alone. It predicts the motion of a band of some thickness in which the contour may lie, and within that band it extracts the contour in detail by edge detection or the like.
[0009]
[Problems to be solved by the invention]
By the way, the above-described conventional object region extraction processing technique has a problem of tracking between frames and a problem of generation of a contour locus as described below.
[0010]
First, regarding frame-to-frame tracking, the convergence of the active contour model (snakes) used in documents (a) and (b) depends strongly on the initial value. For example, when the object position changes greatly, convergence is poor and the model often falls into an incorrect local solution. Document (b) also proposes an improvement that adds an elastic constraint to the active contour model so that it can follow large changes in object position, but in exchange the model loses flexibility with respect to deformation of the contour shape.
[0011]
The tracking method for obtaining a prediction vector from the motion of the contour up to the previous frame as in the prior art of the above-mentioned document (e) has a problem that it cannot cope with a sudden change in the direction of motion in the current frame.
[0012]
On the other hand, the block matching used in the prior art shown in the above documents (c) and (d) directly examines the motion of the previous-current frame, and thus is resistant to motion changes. However, the block matching method itself has a drawback that there are many errors in the contour region, and sufficient accuracy cannot be obtained simply by using the conventional block matching as it is.
[0013]
Next, the problem of contour locus generation will be described. In the method of estimating the motion vector, after the motion vector is obtained, the processing method for obtaining the contour on the current frame as the trajectory determines the details and accuracy of the trajectory. In the conventional techniques shown in the above documents (c) and (d), the object region is moved and deformed according to the motion vector or a deformation parameter obtained based on the motion vector. This method can deal only with global deformations, and cannot cope with fine deformations, particularly contour shapes.
[0014]
In the prior art of the above document (e), after tracking by a motion vector, edge detection is performed in the contour region, and the contour position is determined in detail. Since the contour locus is obtained as an array of pixel points, there is a drawback that correction after processing is difficult. In addition, since the edge detection method according to this conventional technique cannot select the contour of the target object and other image density gradients, there is a problem that the contour cannot be accurately extracted depending on the image.
[0015]
With the active contour model (snakes), the tracking result is obtained directly as a contour curve. Because the contour model is described by a parametric curve, correction after processing is easy. Ease of correction is an important requirement in region extraction and image compositing: wrongly detected portions must be fixed locally, and when the image is enlarged or reduced the mask image must be enlarged or reduced with it.
[0016]
The present invention has been made in view of these circumstances. Its object is to provide a region extraction method and apparatus that solve the problems of the conventional region extraction methods, improve frame-to-frame contour tracking, make the contour easier to correct, improve the accuracy of the contour shape, and give good region extraction results.
[0017]
[Means for Solving the Problems]
To solve the above problems, the region extraction method and apparatus according to the present invention, given the trajectory of the object contour in the previous frame of an input moving image, estimate the motion vectors of the contour between the previous and current frames to obtain an estimated contour, that is, the estimated position of the contour on the current frame; determine on the current frame a contour candidate region, that is, a region containing the pixels around the estimated contour in which the contour may exist; obtain in this contour candidate region a gradient vector field from the inner product of a unit vector in image space and the unit vector in density space of each pixel; modulate the gradient strength according to the direction of the vectors in this gradient vector field and a vector in image space; generate a closed curve by connecting, for each region, the pixels at which the gradient strength is maximal; and take its trajectory as the object contour in the current frame.
[0018]
Here, the motion vectors may be estimated by extracting feature points on the given object contour trajectory and estimating the motion vector of each extracted feature point by hierarchical block matching. The error between the two blocks being compared is evaluated with an error evaluation function that weights the residuals of object-region pixels, and the motion vectors resulting from each layer are smoothed.
[0019]
The contour candidate region may be determined by computing the mean square error between a block on the previous-frame contour and the corresponding block on the estimated contour of the current frame, normalized by the difference between the object color and the background color, and using that value to adjust the thickness of the contour candidate region formed around the estimated contour of the current frame.
[0020]
When calculating the above gradient, feature points are extracted from the estimated contour of the current frame, the candidate contour region is divided into small regions according to the feature point positions on the estimated contour of the current frame, and the current frame is divided into each small region. The gradient vector field of the candidate contour region can be obtained by estimating the direction of the contour and performing edge detection that reacts only to the gradient of the direction. Alternatively, feature point extraction is performed on the estimated contour of the current frame, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour of the current frame, and the color change of the contour in the current frame is estimated for each small region Then, by performing edge detection that reacts only to the gradient of the color change, it is possible to obtain a gradient vector field of the contour candidate region.
[0021]
Further, when generating the curve, feature points are extracted from the estimated contour of the current frame, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour, a point through which the contour passes is extracted from the gradient vector field in each small region, and the passing points are connected by a cubic spline curve generated so as to pass through places where the vectors of the gradient vector field have a large magnitude.
[0022]
[Action]
The motion vector of the object contour in the previous frame is estimated, the region in which the contour may exist in the current frame is determined, a closed cubic spline curve that passes through places of steep density gradient in that region is obtained, and this curve is taken as the object contour of the current frame. By performing this process sequentially, frame by frame, the object can be extracted in each frame of the moving image.
[0023]
In the region extraction method and apparatus according to the present invention, when the trajectory of the object contour in the previous frame is given, the motion vector of the contour between the previous frame and the current frame is estimated to obtain the estimated position of the contour in the current frame, that is, an estimated contour. A region including the pixels around the estimated contour in which the contour may exist, that is, a contour candidate region, is determined on the current frame, and a gradient vector field of density values is obtained within the contour candidate region. A closed curve is then generated so as to pass through places where the magnitude of the vectors in the gradient vector field is large, and its trajectory is taken as the object contour in the current frame. Accordingly, the contour can be followed even when it moves greatly between the previous and current frames or when the direction of movement changes abruptly. Further, since the contour of each frame is described by a parametric curve, correction after processing is easy.
[0024]
Hereinafter, a preferred embodiment according to the present invention will be described in the order of the overall configuration, details of each part, and effects.
[0025]
FIG. 1 is a diagram showing the entire processing of an object region extraction method in a moving image according to an embodiment of the present invention. In this embodiment, an object outline B is obtained as a closed curve expressed parametrically in each frame from a continuous image I including the object.
[0026]
An example of a schematic configuration of an image processing apparatus for performing such object region extraction processing is shown in FIG. 2.
[0027]
In FIG. 2, the image processing apparatus comprises a CPU (central processing unit) 21 that performs all calculations necessary for object region extraction according to the present embodiment; external storage means 22 that holds the images I, the contours B, and intermediate processing results; input means 23 such as a mouse or tablet pen for creating images or inputting trajectories; and display means 24 such as a display for displaying images. Data is exchanged among the CPU 21, external storage means 22, input means 23, and display means 24 via a bus line 25.
[0028]
Returning to FIG. 1, FIG. 1 shows the processing for one frame of the object region extraction process, executed mainly by the CPU 21 of the image processing apparatus of FIG. 2. In this embodiment, for each frame, the previous frame image I_{t-1}, the current frame image I_t, and the object contour B_{t-1} already obtained for the previous frame are taken as inputs, and the object contour B_t of the current frame is output. First, the motion vector estimation processing unit 11 obtains the estimated contour B_e, that is, the position of the object contour in the current frame, by estimating the motion vector of the previous-frame object contour B_{t-1} between the previous frame and the current frame. Second, the contour candidate region determination processing unit 12 determines the range around the estimated contour B_e in which the object contour may exist and sets it as the contour candidate region A. Third, the gradient calculation processing unit 13 obtains a gradient vector field G in order to find the exact position of the object contour within the contour candidate region A. Fourth, the curve generation processing unit 14 obtains a closed curve that passes through places where the magnitude of the vectors of the gradient vector field G is large, so that it passes accurately through the object contour position. The closed curve so obtained is taken as the object contour B_t of the current frame. The obtained object contour B_t is passed through the delay processing unit 15 to become the previous-frame object contour B_{t-1} for the next frame, and processing continues in the same manner.
[0029]
For the first frame, the object contour must be given by a method other than that of the present embodiment. For this purpose, known curve-editor techniques can be used, in which the image is shown on a display or the like and points on the contour are specified manually through a suitable graphical user interface (GUI) to generate a spline curve.
[0030]
The final objective of the embodiment of the present invention may be to obtain, for each frame, a mask image that indicates the object region by luminance gradation. Once the object contour is obtained as a closed curve, the mask image can be generated using known techniques such as inside/outside determination and pixel filling algorithms, so the mask image I_m is treated as obtainable in this embodiment. Accordingly, in the description of the subsequent processing, it is assumed that whether or not a pixel belongs to the object region can be determined once the object contour or the estimated contour has been obtained. For literature on inside/outside determination and filling algorithms, see “Graphic processing engineering by computer display”, Fujio Yamaguchi, Nikkan Kogyo Shimbun, 1981, and “Computer Graphics: Principles and Practice, 2nd edition”, Foley, van Dam, Feiner, Hughes, Addison-Wesley Publishing, 1990.
[0031]
Here, the data format and the like will be described.
In the following description, each data has the following format.
[0032]
First, the “curve”: the object contour, the estimated contour, and the like are closed curves made of continuous cubic spline segments. A cubic spline segment is described by four control points, and a continuous sequence of spline segments describing a curve is a series of such sets in which p_3 of one segment and p_0 of the next segment are shared. A closed curve is one in which p_3 of the last segment and p_0 of the first segment are also shared. That is, if the number of segments is K,
(p_00, p_01, p_02, p_03), (p_10, p_11, p_12, p_13), (p_20, p_21, p_22, p_23), ...
..., (p_{K-1,0}, p_{K-1,1}, p_{K-1,2}, p_{K-1,3})
p_{i+1,0} = p_{i,3}
p_{K-1,3} = p_00
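The data format above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `Segment`, `is_closed_curve`, and `eval_segment` are hypothetical, and the cubic Bezier basis is an assumption, since the text only says each segment has four control points and passes through the shared end points.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point, Point, Point]  # (p_i0, p_i1, p_i2, p_i3)


def is_closed_curve(segments: List[Segment]) -> bool:
    """Check p_{i+1,0} == p_{i,3} for every i, wrapping around at the end."""
    k = len(segments)
    return k > 0 and all(
        segments[(i + 1) % k][0] == segments[i][3] for i in range(k)
    )


def eval_segment(seg: Segment, t: float) -> Point:
    """Evaluate one segment at t in [0, 1] (assumed cubic Bezier basis)."""
    p0, p1, p2, p3 = seg
    s = 1.0 - t
    w = (s * s * s, 3 * s * s * t, 3 * s * t * t, t * t * t)
    x = w[0] * p0[0] + w[1] * p1[0] + w[2] * p2[0] + w[3] * p3[0]
    y = w[0] * p0[1] + w[1] * p1[1] + w[2] * p2[1] + w[3] * p3[1]
    return (x, y)
```

With this layout, only the inner control points p_i1 and p_i2 of each segment remain free once the shared end points are fixed.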
Next, the “feature point” will be described. In the present embodiment, a feature point on the contour is data having two pieces of position information: its position on the contour and its position on the image. The position on the contour is the length along the trajectory measured from the starting point p_00 of the contour.
[0033]
Next, the “contour candidate area” will be described.
In this embodiment, the contour candidate region A is a list of the corresponding pixel positions A_ol.
A = {A_o0, A_o1, ...}
Alternatively, the contour candidate region A is a list of the small regions A_k into which it is further divided.
A = {A_0, A_1, ...}
The small regions A_k are arranged in order along the estimated contour. Each small region A_k is a list of the corresponding pixel positions A_kl.
A_k = {A_k0, A_k1, ...}
Other notation used in the explanation is as follows.
Pixel value at position (i, j) of image I: I[i, j]
Pixel value at contour candidate region pixel position A_kl of image I: I[A_kl]
Elements of blocks and gradient vector fields are also expressed in the same way.
[0034]
Next, the details of each part will be described.
First, the motion vector estimation processing unit 11 in FIG. 1 will be described.
[0035]
The estimated contour B_e, that is, the position of the object contour in the current frame, is obtained by estimating the motion vector of the previous-frame object contour B_{t-1} between the previous frame and the current frame. In the embodiment of the present invention, the following motion vector estimation methods are adopted to improve the accuracy of motion vector estimation in the contour region.
[0036]
First, a first specific example of the motion vector estimation method will be described. In block matching, which is a means of estimating motion vectors for small image regions, the error evaluation between a block in the first image I_1 and a block in the second image I_2 being compared has conventionally been the average of the per-pixel errors within the block. In this technique, as shown in FIG. 3, a third image I_3 indicating the importance of each pixel in the first image I_1 is provided, and a weighted error evaluation is performed using the third image.
[0037]
That is, in FIG. 3, the matching calculation processing unit 31 receives the first and second images I_1 and I_2 containing the object whose motion vector is to be obtained, the third image I_3 indicating the object region in the first image I_1, and a template placed appropriately on the contour of the object in the first image I_1, and outputs the position on the second image I_2 that best matches the template. The motion vector calculation processing unit 32 receives the template position and the matching position from the matching calculation processing unit 31 and outputs the motion vector of the template.
[0038]
This resolves the problem in conventional block matching that, in an object boundary region, a block contains multiple objects and the motion vector cannot be determined uniquely.
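The weighted error evaluation of the first specific example can be sketched as below. This is an illustration under stated assumptions rather than the patent's definition: images are taken as grayscale 2-D lists, the error is a weighted mean of squared differences, the search is exhaustive within a hypothetical `radius`, and the function names are invented for the example. The weight image `w` plays the role of the third image I_3.

```python
from typing import List, Tuple

Image = List[List[float]]


def weighted_error(i1: Image, i2: Image, w: Image,
                   y1: int, x1: int, y2: int, x2: int, bs: int) -> float:
    """Weighted mean squared error between two bs x bs blocks."""
    num = 0.0
    den = 0.0
    for dy in range(bs):
        for dx in range(bs):
            wt = w[y1 + dy][x1 + dx]  # importance of this template pixel
            diff = i1[y1 + dy][x1 + dx] - i2[y2 + dy][x2 + dx]
            num += wt * diff * diff
            den += wt
    return num / den if den > 0 else float("inf")


def match_block(i1: Image, i2: Image, w: Image,
                y: int, x: int, bs: int, radius: int) -> Tuple[int, int]:
    """Return the motion vector (dy, dx) that minimizes the weighted error."""
    h, wd = len(i2), len(i2[0])
    best = (float("inf"), (0, 0))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y2, x2 = y + dy, x + dx
            # Skip candidate blocks that fall outside the second image.
            if 0 <= y2 and y2 + bs <= h and 0 <= x2 and x2 + bs <= wd:
                err = weighted_error(i1, i2, w, y, x, y2, x2, bs)
                best = min(best, (err, (dy, dx)))
    return best[1]
```

Because background pixels in the template receive little or no weight, a change in the background between the two images does not pull the match away from the object's true displacement.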
[0039]
Next, a second specific example of the motion vector estimation method will be described. In block matching, which is a means of estimating motion vectors for small image regions, motion vectors on a contour should normally be smoothly continuous along the contour, but conventional block matching is prone to erroneous motion vectors because the image pattern tends to be uniform along the contour. In this technique, as shown in FIG. 4, motion vectors are estimated by hierarchical search and corrected at each layer by smoothing the motion vector field, thereby achieving smoothly continuous motion vectors.
[0040]
That is, in FIG. 4, the first and second images I_1 and I_2 containing the object whose motion vector is to be obtained are hierarchized by the hierarchization processing units 41 and 42, respectively. In this hierarchization, a hierarchical image can be created by taking the original image as the lowermost layer and, for example, using the average of each 2 × 2 block of pixels as one pixel of the layer one level above. Block matching is then performed on the images of each layer in sequence, starting from the uppermost layer, by the block matching processing units 43, 44, 45, ..., with each layer using the result of the layer one level above.
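The hierarchization step can be illustrated with a short sketch. It assumes grayscale images stored as 2-D lists with even dimensions at every level; `downsample` and `build_pyramid` are hypothetical names chosen for the example.

```python
from typing import List

Image = List[List[float]]


def downsample(img: Image) -> Image:
    """Average each 2 x 2 block into one pixel of the next layer up."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [lowermost layer (original image), ..., uppermost layer]."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

In a coarse-to-fine search, a motion vector found at one layer would typically be doubled and used as the starting point for the search at the layer below; that scaling rule is a common convention and is not stated in the text.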
[0041]
The configuration of FIG. 4 further includes a block list generation processing unit 47 and smoothing processing units 48, 49, .... The block list generation processing unit 47 places blocks on the contour to be tracked based on the input object contour information. The smoothing processing units 48, 49, ... smooth the motion vector field.
[0042]
Next, a third specific example of the motion vector estimation method will be described with reference to FIG. 5. When an object contour is tracked by block matching, which is a means of estimating motion vectors for small image regions, estimating a motion vector at every point on the contour conventionally requires an enormous amount of processing. In this technique, as shown in FIG. 5, means for extracting feature points on the contour is provided, and motion vector estimation is performed only at the feature points. This allows the contour to be tracked with high reliability even with a small number of motion vector estimations.
[0043]
In FIG. 5, the motion vector estimation processing unit 51 is supplied with the first and second images I_1 and I_2 containing the object whose motion vector is to be obtained, together with the template list and the search range list from the feature point extraction processing unit 52. The feature point extraction processing unit 52 receives the given object contour information and the first image I_1, extracts feature points, and generates the list of blocks to be searched. The motion vector estimation processing unit 51 estimates the motion vector field for the generated block list.
[0044]
Next, FIG. 6 shows a fourth specific example of the motion vector estimation method, formed by combining the first to third specific examples. The first to third techniques each improve the accuracy of motion vector estimation at the contour in a different way, and this example provides a way of combining them that is effective for motion vector estimation at the contour.
[0045]
In FIG. 6, the hierarchization processing units 61 and 62 correspond to the hierarchization processing units 41 and 42 of FIG. 4. From the previous frame image I_{t-1} and the previous-frame contour B_{t-1}, feature points on B_{t-1} are extracted using the technique of the third specific example of the motion vector estimation method described with reference to FIG. 5, and a block is placed at each feature point position so that the motion vector of that feature point can be estimated.
[0046]
To perform the search by the hierarchical block matching of the second specific example of the motion vector estimation method described with reference to FIG. 4, hierarchical images are created for the previous frame image I_{t-1}, the current frame image I_t, and the previous-frame mask image I_{m,t-1} obtained from the previous-frame contour B_{t-1}. The block matching processing units 43, 44, 45, ..., 46 perform hierarchical block matching from the top layer to the bottom layer. Here, the error evaluation for block matching at each layer uses the mask image I_{m,t-1} as the weight image, following the technique of the first specific example of the motion vector estimation method described with reference to FIG. 3. In addition, the smoothing processing units 48, 49, ... smooth the motion vector field between layers, following the technique of the second specific example of the motion vector estimation method described with reference to FIG. 4.
[0047]
The estimated destinations of the feature points obtained by motion vector estimation (hereinafter referred to as feature points on the estimated contour) are connected by curve interpolation in the interpolation processing unit 65 so that the curve passes through each feature point. As a method of generating an interpolation curve that passes through each given point, the Catmull-Rom spline described in “Computer Graphics: Principles and Practice, 2nd ed.”, Foley, van Dam, Feiner, Hughes, Addison-Wesley Publishing, 1990, can be used. Alternatively, the points may be connected with straight lines to form the estimated contour.
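A minimal sketch of the interpolation step, using the uniform Catmull-Rom formulation, is given below. The function names and the `samples` parameter are hypothetical, and the uniform parameterization is an assumption: the text names the Catmull-Rom spline but does not specify a variant.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def catmull_rom(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Uniform Catmull-Rom point between p1 (t = 0) and p2 (t = 1)."""
    def coord(a: float, b: float, c: float, d: float) -> float:
        return 0.5 * (2 * b + (c - a) * t
                      + (2 * a - 5 * b + 4 * c - d) * t * t
                      + (-a + 3 * b - 3 * c + d) * t ** 3)
    return (coord(p0[0], p1[0], p2[0], p3[0]),
            coord(p0[1], p1[1], p2[1], p3[1]))


def closed_contour(points: List[Point], samples: int) -> List[Point]:
    """Sample a closed curve that passes through every feature point in order."""
    n = len(points)
    out = []
    for i in range(n):
        # Neighbours are taken cyclically so that the curve closes on itself.
        p0, p1 = points[(i - 1) % n], points[i]
        p2, p3 = points[(i + 1) % n], points[(i + 2) % n]
        for s in range(samples):
            out.append(catmull_rom(p0, p1, p2, p3, s / samples))
    return out
```

Each span starts exactly at its feature point (t = 0), so the sampled curve interpolates the estimated feature point positions rather than merely approximating them.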
[0048]
In this specific example, by combining the three techniques shown in the first to third specific examples of the motion vector estimation method, it is possible to obtain a better result by each effect.
[0049]
Next, a specific example of the contour candidate area determination processing unit 12 will be described.
[0050]
A range around the estimated contour B_e in which the object contour may exist is determined and set as the contour candidate region A. To this end, the amount by which the estimated contour B_e deviates from the true contour is estimated, and a region whose size corresponds to that deviation is determined.
[0051]
FIG. 7 is a block diagram showing an example of a configuration for the contour candidate area determination process in the present embodiment. The outline of this processing will be described with reference to FIG.
[0052]
First, the evaluation point extraction processing unit 72 extracts from the estimated contour B_e, as points for evaluating the deviation, points where the contour is relatively straight and the density of the regions on either side of the contour can be regarded as constant. Second, the corresponding point extraction processing unit 71 extracts the points on the previous-frame object contour B_{t-1} that correspond to the evaluation points. Third, the deviation amount evaluation processing unit 73 performs error evaluation between each evaluation point and its corresponding point and estimates the deviation. Fourth, the region determination processing unit 74 determines the size of the region in the vicinity of each evaluation point from the deviation and sets the pixel positions within that range as the contour candidate region.
[0053]
Next, FIG. 8 is a flowchart for explaining an example of the contour candidate region determination process described above. This process takes as inputs the previous-frame object contour B_{t-1}, the estimated contour B_e, the feature points c_k on B_{t-1}, the feature points c_ek on B_e, and the number K of feature points, and outputs the contour candidate region A. The processing described below is repeated for each interval between feature points; this repetition is executed K times by the FOR loop 81.
[0054]
Hereinafter, processing between the k-th feature points in the FOR loop 81 will be described.
[0055]
First, in step S82, the midpoint along the estimated contour B_e between the feature points c_ek and c_{e,k+1} is taken as the k-th evaluation point s_ek. Similarly, the midpoint along the previous-frame object contour B_{t-1} between the feature points c_k and c_{k+1} is taken as the k-th corresponding point s_k. Also in step S82, a block b_ek is taken at the evaluation point s_ek and a block b_k at the corresponding point s_k. In step S83, the normalized root mean square error NRMSE of the two blocks is calculated and divided by the block size BS. Processing then moves to the FOR loop 84. The method of calculating NRMSE is described later.
[0056]
The FOR loop 84 iterates the loop control variables i and j each over the image size IS of the region. In steps S85, S86, and S87 within the FOR loop 84, for each pixel position (i, j), if the distance to the line segment c_ek c_{e,k+1} is smaller than x, the value obtained in step S83, the pixel position (i, j) is set as part of the contour candidate region.
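The per-pixel test of the FOR loop 84 can be sketched as follows. The threshold `x` is passed in as a parameter, and points are taken in the same (i, j) convention as the pixel positions; both the function names and this coordinate convention are assumptions made for the illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dist_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the line segment ab."""
    ax, ay = a
    bx, by = b
    px, py = p
    vx, vy = bx - ax, by - ay
    len2 = vx * vx + vy * vy
    if len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point, clamped to stay on the segment.
    t = max(0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / len2))
    return math.hypot(px - (ax + t * vx), py - (ay + t * vy))


def candidate_pixels(a: Point, b: Point, x: float,
                     height: int, width: int) -> List[Tuple[int, int]]:
    """All pixel positions (i, j) closer than x to the segment ab."""
    return [
        (i, j)
        for i in range(height)
        for j in range(width)
        if dist_to_segment((float(i), float(j)), a, b) < x
    ]
```

A larger `x` produces a thicker band around the segment, which is how the estimated deviation controls the thickness of the contour candidate region.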
[0057]
Next, FIG. 9 is a flowchart showing the method of calculating the NRMSE. The calculation takes the blocks b_k and b_ek as inputs.
In FIG. 9, a FOR loop 91 iterates the loop control variables i and j each over the block size BS. In the FOR loop 91, first, in step S92, each pixel value be[i, j] in the block be is classified according to whether it lies inside or outside the estimated contour B_e. In step S93, pixel values be[i, j] that lie inside are registered in the inner pixel value set fgdata[], and in step S94, pixel values be[i, j] that lie outside are registered in the outer pixel value set bgdata[].
[0058]
Next, after the repetition process of the FOR loop 91 is completed, in step S95, the center of gravity of the set fgdata [] of all inner pixels is set to fg, and the center of gravity of the set bgdata [] of all outer pixels is set to bg. Next, the root mean square error RMSE of block be and block b is calculated by the following equation.
[0059]
[Expression 1]

Next, in step S96, a value obtained by dividing RMSE by the magnitude of the difference between the centers of gravity is defined as NRMSE.
[0060]
NRMSE = RMSE / |fg - bg|
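The calculation of FIG. 9 can be sketched as below. Expression 1 is not reproduced in this text, so the RMSE here uses the standard definition (square root of the mean squared color difference over the block) as an assumption. The `inside` mask, the RGB tuple representation, and the function names are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]
Block = List[List[Color]]


def centroid(values: Sequence[Color]) -> Color:
    """Mean color of a non-empty set of pixel values."""
    n = len(values)
    return tuple(sum(v[c] for v in values) / n for c in range(3))


def nrmse(b: Block, be: Block, inside: List[List[bool]]) -> float:
    """RMSE between blocks b and be, divided by |fg - bg| measured on be.

    Assumes both the inner and outer pixel sets are non-empty and that
    their centroids differ; otherwise the normalization is undefined.
    """
    fgdata, bgdata, sq_sum, count = [], [], 0.0, 0
    for i, row in enumerate(be):
        for j, pix in enumerate(row):
            # Steps S92 to S94: classify each pixel of be as inside or outside.
            (fgdata if inside[i][j] else bgdata).append(pix)
            sq_sum += sum((pix[c] - b[i][j][c]) ** 2 for c in range(3))
            count += 1
    fg, bg = centroid(fgdata), centroid(bgdata)  # step S95
    rmse = math.sqrt(sq_sum / count)             # assumed form of Expression 1
    return rmse / math.dist(fg, bg)              # step S96
```

Dividing by |fg - bg| makes the error comparable across contours of different contrast: the same positional misalignment gives a larger raw RMSE when the object and background colors differ strongly.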
Next, a specific example of the gradient calculation processing unit 13 in FIG. 1 will be described.
[0061]
In this gradient calculation, in order to know the exact position of the object contour in the contour candidate region A, a process for obtaining the gradient vector field G is performed.
[0062]
FIG. 10 is a processing block diagram of the gradient calculation of this embodiment.
This process is performed according to the following procedure. First, the feature point extraction processing unit 101 extracts feature points from the estimated contour B_e using the technique of the third specific example of the motion vector estimation method described with reference to FIG. 5. Second, the region division processing unit 102 divides the contour candidate region A into small regions A_k with the feature points as boundaries, so that the edge features are constant within each small region. Third, the edge feature estimation processing unit 103 estimates the edge features of each small region, and the edge detection processing unit 104 obtains the gradient using an edge detection method, described below, that is specific to those features.
[0063]
A specific example of this edge detection method will be described.
[0064]
First, a first specific example of the edge detection method will be described. Conventional methods of obtaining the density gradient on an image detect gradients in all directions on the image, so density gradients other than those of the contour of the object of interest are also detected. In this technique, as shown in FIG. 11, information on the direction of the density gradient to be detected is given, and the detected gradient intensity is modulated based on that information, so that density gradients in directions other than the direction of interest are not detected.
[0065]
That is, in FIG. 11, the gradient calculation processing unit 111 calculates the gradient of the grayscale image. The direction detection processing unit 113 detects the direction of each gradient vector, and the selectivity determination processing unit 114 determines the selectivity based on the vector in the image space to be detected. The intensity modulation processing unit 112 modulates the gradient intensity according to the selectivity so determined.
[0066]
Next, a second specific example of the edge detection method will be described.
[0067]
The second specific example of the edge detection method is as follows. Conventional methods of obtaining the density gradient on an image detect density changes in all directions in the density space, so density gradients other than those of the contour of the object of interest are also detected. In this technique, as shown in FIG. 12, information on the density change to be detected is given as a vector in the density space. The inner product calculation processing unit 121 computes, for each pixel on the image, the inner product of that vector and the pixel's density-space vector to produce a grayscale image, and the gradient calculation processing unit 122 calculates the density gradient of that grayscale image. In this way, density gradients arising from density changes other than the one of interest are not detected.
[0068]
Hereinafter, with reference to FIG. 13, the region division process which is a feature of the present embodiment in the gradient calculation will be described.
FIG. 13 is a flowchart of the region division process. In FIG. 13, the FOR loop 131 iterates over the L pixels of the contour candidate region A, and the FOR loop 132 inside it iterates over the K feature points. The following processing is performed for every pixel A_ol in the contour candidate region A.
[0069]
In the FOR loop 132, in step S133, the distance d_k between the pixel position A_ol and each of the K line segments c_k c_{k+1} formed by adjacent feature points c_k and c_{k+1} is obtained. In the following steps S134 and S135, the smallest of the distances d_k is set as d_min, and the corresponding k is set as m. After the FOR loop 132 is completed, in step S136, A_ol is classified into the small region A_m.
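The region division of FIG. 13 amounts to assigning each pixel to its nearest contour segment, which can be sketched as follows. The function names are hypothetical, and wrapping the last feature point back to the first (because the contour is closed) is an assumption about how c_{k+1} is defined for the final segment.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def dist_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the line segment ab."""
    vx, vy = b[0] - a[0], b[1] - a[1]
    len2 = vx * vx + vy * vy
    if len2 == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * vx + (p[1] - a[1]) * vy) / len2
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * vx), p[1] - (a[1] + t * vy))


def divide_region(pixels: List[Point],
                  feature_points: List[Point]) -> Dict[int, List[Point]]:
    """Assign each pixel A_ol to the small region A_m of its nearest segment."""
    k = len(feature_points)
    regions: Dict[int, List[Point]] = {m: [] for m in range(k)}
    for p in pixels:
        # Inner loop over the K segments: keep the smallest distance and its index.
        d_min, m = min(
            (dist_to_segment(p, feature_points[i], feature_points[(i + 1) % k]), i)
            for i in range(k)
        )
        regions[m].append(p)
    return regions
```

Since the regions are keyed by segment index, they come out already ordered along the estimated contour, matching the ordering of the small regions A_k described in the data format.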
[0070]
Next, the process of obtaining the information needed for edge detection in each of the divided contour candidate small regions A_k will be described.
[0071]
FIG. 14 is a flowchart of the process of obtaining the edge detection information according to this embodiment. In this embodiment, a unit vector v_ok in the image space and a unit vector v_ck in the density space are obtained as the edge detection information.
[0072]
In FIG. 14, the FOR loop 141 iterates over the L_k pixels of the contour candidate small region A_k. Within this FOR loop 141, in step S142, each small-region pixel A_kl is classified according to whether it lies inside or outside the estimated contour B_e. Pixel values at positions A_kl that lie inside are registered in the inner pixel value set fgdata[], and pixel values at positions A_kl that lie outside are registered in the outer pixel value set bgdata[].
[0073]
Next, after the repetition processing of the FOR loop 141 is completed, in step S145, the centroid value of the set fgdata [] of all inner pixels is set to fg, and the centroid value of the set bgdata [] of all outer pixels is set to bg.
[0074]
Here, the unit vector v_ok in the image space is information giving the direction of the gradient to be detected. For each small region A_k, v_ok is taken in the direction orthogonal to the line segment c_k c_{k+1} formed by its two feature points. The unit vector v_ck in the density space is information giving the direction of the density change to be detected. For each small region A_k, all pixels are classified into those inside and those outside the estimated contour B_e, the centroids fg and bg of the two groups are obtained, and v_ck is taken in the direction parallel to the difference vector of the centroids (see step S146). Alternatively, projecting the two centroids onto a plane perpendicular to the luminance axis and then taking v_ck parallel to their difference vector is also effective for removing noise in the luminance direction.
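The two unit vectors can be computed as in the sketch below. Several choices here are assumptions not fixed by the text: the sign of the normal (which side of the segment it points to), the RGB representation of the density space, and the use of the (1, 1, 1) direction as the luminance axis. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, ...]


def unit(v: Sequence[float]) -> Vec:
    """Scale a non-zero vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def image_space_vector(c_k: Sequence[float], c_k1: Sequence[float]) -> Vec:
    """v_ok: unit vector orthogonal to the segment from c_k to c_{k+1}."""
    dx, dy = c_k1[0] - c_k[0], c_k1[1] - c_k[1]
    return unit((-dy, dx))  # rotate the segment direction by 90 degrees


def density_space_vector(fg: Sequence[float], bg: Sequence[float],
                         drop_luminance: bool = False) -> Vec:
    """v_ck: unit vector parallel to fg - bg in the density (color) space."""
    diff = [f - b for f, b in zip(fg, bg)]
    if drop_luminance:
        # Project onto the plane perpendicular to the assumed luminance
        # axis (1, 1, 1). Projection is linear, so projecting the
        # difference equals differencing the projected centroids.
        mean = sum(diff) / len(diff)
        diff = [d - mean for d in diff]
    return unit(diff)
```

With `drop_luminance=True` the resulting vector is insensitive to uniform brightness changes, which corresponds to the noise-removal variant mentioned at the end of the paragraph.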
[0075]
Next, FIG. 15 shows a block diagram of processing that combines the techniques of the first and second specific examples of the edge detection method described with reference to FIGS. 11 and 12. The procedure for the combined gradient calculation is described below.
[0076]
In FIG. 15, the inputs are the image whose gradient is to be obtained, the vector in the density space to be detected, and the vector in the image space to be detected.
[0077]
First, the inner product calculation processing unit 151 obtains a grayscale image by computing, for each pixel on the image, the inner product of the vector in the density space to be detected and the pixel's density-space vector. The gradient calculation processing unit 152 calculates the gradient of the resulting grayscale image. The direction detection processing unit 154 detects the direction of each gradient vector, and the selectivity determination processing unit 155 determines the selectivity based on the vector in the image space to be detected. The intensity modulation processing unit 153 modulates the gradient intensity according to the selectivity so determined. The above processing is performed in each contour candidate small region.
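The combined procedure can be sketched as follows. The text does not define the gradient operator or the selectivity function, so central differences and a clamped cosine between the gradient direction and the image-space vector are used here purely as assumptions. The name `selective_gradient` and the (row, column) component order are also hypothetical.

```python
import math
from typing import List, Tuple

Color = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def selective_gradient(img: List[List[Color]], v_c: Color,
                       v_o: Vec2) -> List[List[Vec2]]:
    """Gradient field of the projected image, damped away from direction v_o.

    v_o is given as (row, column) components and should be a unit vector.
    Border pixels are left at (0.0, 0.0).
    """
    h, w = len(img), len(img[0])
    # Inner product of each pixel's color with v_c gives a grayscale image.
    gray = [[sum(p[c] * v_c[c] for c in range(3)) for p in row] for row in img]
    field: List[List[Vec2]] = [[(0.0, 0.0)] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Gradient by central differences (assumed operator).
            gi = (gray[i + 1][j] - gray[i - 1][j]) / 2.0
            gj = (gray[i][j + 1] - gray[i][j - 1]) / 2.0
            mag = math.hypot(gi, gj)
            if mag == 0.0:
                continue
            # Selectivity from the gradient direction (assumed: clamped cosine).
            sel = max(0.0, (gi * v_o[0] + gj * v_o[1]) / mag)
            # Modulate the gradient intensity by the selectivity.
            field[i][j] = (gi * sel, gj * sel)
    return field
```

In use, the function would be called once per small region A_k with that region's own density-space and image-space vectors, so that each region responds only to edges matching its estimated contour direction and color change.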
[0078]
As described above with reference to FIGS. 10 to 15, means is provided for automatically supplying the information required by the first and second specific examples of the edge detection technique (see FIGS. 11 and 12), which are capable of selective contour detection, and the two techniques are used in combination. That is, in the process of obtaining the gradient vector field of density values in the contour candidate region, feature points are extracted from the estimated contour of the current frame using the third specific example of the motion vector estimation method described with reference to FIG. 5, and the contour candidate region is divided into small regions according to the feature point positions on the estimated contour. The feature point extraction of FIG. 5 extracts inflection points of the contour and points where the contour color changes. The edge detection techniques of FIGS. 11 and 12, on the other hand, attend to the edge gradient direction and the direction of color change. In other words, the feature points detected by the technique of FIG. 5 are the points at which the edge features used by the techniques of FIGS. 11 and 12 change. Therefore, if the region is divided with the feature points as boundaries, the contour features can be made constant within each small region. By examining the contour features of each small region, the edge detection of FIGS. 11 and 12 can be performed accurately, and unnecessary gradient components other than those of the object contour can be suppressed.
[0079]
Next, a specific example of the curve generation processing unit 14 in FIG. 1 will be described.
[0080]
In this curve generation process, the object contour position is extracted as a cubic spline curve from the gradient vector field obtained in the contour candidate region. To this end, a closed curve that passes through places where the magnitude of the vectors of the gradient vector field G is large is obtained. The closed curve so obtained is taken as the object contour B_t of the current frame.
[0081]
FIG. 16 is a processing block diagram of curve generation according to this embodiment. This process is performed according to the following procedure.
[0082]
That is, in FIG. 16, first, the feature point extraction processing unit 161 extracts feature points from the estimated contour B_e using the technique of the third specific example of the motion vector estimation method described with reference to FIG. 5. Second, the region division processing unit 162 divides the contour candidate region A into small regions A_k with the feature points as boundaries. The specific method of region division is the same as that used in the gradient calculation process described above. Third, the passing point extraction processing unit 163 extracts one passing point for each small region. The extracted passing points are ordered according to the order of the small regions. Fourth, the p_i1, p_i2 search processing unit 164 obtains the remaining control points of each segment by the spline control point search method described later, yielding the object contour B_t.
[0083]
Here, a method for searching for the spline control point will be described with reference to FIG.
[0084]
FIG. 17 shows a block diagram for realizing the curve generation method. This curve generation method is a technique for searching for a cubic spline curve shape that passes through places where the magnitude of the vectors of the vector field is as large as possible. By providing means for giving an approximate position of the object contour on the image, as shown in FIG. 17, the curve is searched for using the information of the gradient vector field of the image, and the object contour is extracted as a parametric curve.
[0085]
The curve generation method shown in FIG. 17 generates a curve passing over a smooth two-dimensional vector field, for example a gray-level gradient vector field formed from the R, G, and B signals representing the colors of an image. It comprises an evaluation value calculation step of calculating, at a predetermined number of evaluation positions on the curve, an evaluation value expressed as a function of the inner product of the unit vector orthogonal to the tangent of the curve and the vector of the vector field, and a curve determination step of determining the curve so that the evaluation value obtained in the evaluation value calculation step is maximized.
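The two steps can be illustrated with the sketch below. It is not the patent's search procedure: the cubic Bezier basis, the use of the absolute inner product summed over evenly spaced parameter values as the "function of the inner product", and the exhaustive search over a finite list of candidate control points are all assumptions, and the function names are hypothetical.

```python
import math
from itertools import product
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
Field = Callable[[float, float], Point]


def bezier(p: Sequence[Point], t: float) -> Point:
    """Point on the cubic segment with control points p[0..3]."""
    s = 1.0 - t
    w = (s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3)
    return tuple(sum(w[k] * p[k][c] for k in range(4)) for c in range(2))


def bezier_tangent(p: Sequence[Point], t: float) -> Point:
    """Derivative of the cubic segment with respect to t."""
    s = 1.0 - t
    w = (3 * s * s, 6 * s * t, 3 * t * t)
    return tuple(sum(w[k] * (p[k + 1][c] - p[k][c]) for k in range(3))
                 for c in range(2))


def evaluation_value(p: Sequence[Point], field: Field, n: int = 16) -> float:
    """Sum of |normal . V| at n + 1 evaluation positions on the segment."""
    total = 0.0
    for m in range(n + 1):
        t = m / n
        x, y = bezier(p, t)
        tx, ty = bezier_tangent(p, t)
        norm = math.hypot(tx, ty)
        if norm == 0.0:
            continue
        nx, ny = -ty / norm, tx / norm  # unit vector orthogonal to the tangent
        vx, vy = field(x, y)
        total += abs(nx * vx + ny * vy)
    return total


def search_controls(p0: Point, p3: Point, candidates: Sequence[Point],
                    field: Field) -> Tuple[Point, Point]:
    """Pick (p1, p2) from the candidates that maximizes the evaluation value."""
    return max(product(candidates, repeat=2),
               key=lambda c: evaluation_value((p0, c[0], c[1], p3), field))
```

Using the normal rather than the tangent in the inner product rewards curves that run perpendicular to the gradient vectors, that is, along an edge rather than across it.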
[0086]
In FIG. 17, the data of the vector field vector V(i, j) at each point (i, j) in a space, for example at every point in the input image, is output to the p_i1, p_i2 search unit 174. The trajectory input unit 171 comprises input means such as a pen-type input device (a so-called tablet pen) or a mouse. When a trajectory T is input as an array of pixel points in the rough area where a curve is to be obtained, the trajectory T is output to the passing point extraction unit 172. The passing point extraction unit 172 examines the vector field in the region within a certain distance of the trajectory T, extracts positions where the magnitude of the vector is equal to or greater than a predetermined reference as passing points p, and outputs the position data of these passing points p to the ordering unit 173. As described later, the passing points p are all the points whose shortest distance to the trajectory T is smaller than a predetermined length and whose vector field vector is larger than a predetermined magnitude. The ordering unit 173, as described later, rearranges the passing points p in order along the traveling direction of the trajectory T and outputs them as reference points p_00, p_10, p_20, ..., p_K0 to the p_i1, p_i2 search unit 174. The p_i1, p_i2 search unit 174 performs the evaluation value calculation step and the curve determination step. Based on the reference points p_i0, p_(i+1)0 and the vector field vector V(i, j), it finds the sample points p_i1, p_i2 between the reference points p_i0 and p_i3, generates a partial curve from these reference points and sample points, and outputs it. The partial curves form a single curve when their ends are connected.
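The work of the passing point extraction unit 172 and the ordering unit 173 can be sketched as follows. This is only an illustration under assumptions: the vector field is a hypothetical dictionary from pixel position to vector, the thresholds `max_dist` and `min_mag` stand in for the "predetermined length" and "predetermined reference", and ordering along the trajectory is approximated by the index of the nearest trajectory point, which the source does not specify.

```python
import math

def extract_passing_points(trajectory, field, max_dist, min_mag):
    """Return passing points ordered along the trajectory T.

    trajectory: list of (x, y) pixel points in the order they were drawn.
    field: dict mapping (x, y) -> (vx, vy), the vector field vectors.
    A point is kept when its vector magnitude is at least min_mag and its
    shortest distance to the trajectory is smaller than max_dist.
    """
    candidates = []
    for (x, y), (vx, vy) in field.items():
        if math.hypot(vx, vy) < min_mag:
            continue
        # Shortest distance to the trajectory and the index where it occurs.
        dist, idx = min(
            (math.hypot(x - tx, y - ty), i)
            for i, (tx, ty) in enumerate(trajectory)
        )
        if dist < max_dist:
            candidates.append((idx, dist, (x, y)))
    # Ordering by nearest trajectory index is an assumed stand-in for
    # "order along the traveling direction of the trajectory".
    candidates.sort()
    return [point for _, _, point in candidates]
```

The returned list plays the role of the reference points p_00, p_10, ..., p_K0 that are handed to the search unit.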
[0087]
According to FIG. 17, a trajectory T in a space, for example on an image, is input by the user at the trajectory input unit 171, and the passing point extraction unit 172 and the ordering unit 173 extract the reference points p_00, p_10, ... of the partial curves that make up the curve to be generated, based on the trajectory T and the vector field vectors. The p_i1, p_i2 search unit 174 then extracts the sample points p_i1, p_i2 between the reference points p_i0 and p_i3, based on the reference points p_00, p_10, ... and the vector field vectors. A partial curve is formed from these points, and the desired curve is obtained by connecting the partial curves.
[0088]
The passing point extraction process, which is a feature of the present embodiment in curve generation, will be described below. FIG. 18 is a flowchart of the passing point extraction process.
[0089]
In FIG. 18, the FOR loop 181 repeats its processing K times, once for each contour candidate small region, and the FOR loop 182 inside the FOR loop 181 repeats L_k times, once for each pixel of the k-th contour candidate small region A_k. For every contour candidate small region A_k, the gradient strength G[A_kl] at each pixel position A_kl in that region is examined, and the pixel position at which it is maximized is taken as the passing point p_k0.
[0090]
That is, in the FOR loop 182, the gradient strength G[A_kl] is obtained as a value g in step S183. In the next step S184, it is determined whether this value g is larger than the maximum value g_max found so far. Only if the answer is Yes does the process proceed to step S185, where g_max is updated to the current g and the pixel position A_kl at this time is set as the passing point p_k0.
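The two nested loops of FIG. 18 amount to a per-region maximum search, which can be sketched as below. The data layout (a list of pixel lists and a dictionary of gradient strengths) and the initial value of g_max are assumptions made for illustration; the source does not state them.

```python
def extract_passing_points_per_region(regions, gradient_strength):
    """Pick, for each small region A_k, the pixel with the largest G[A_kl].

    regions: list of K lists; regions[k] holds the L_k pixel positions A_kl.
    gradient_strength: dict mapping pixel position -> gradient strength G.
    Returns the list of passing points p_k0, one per region, in region order.
    """
    passing_points = []
    for region in regions:                    # FOR loop 181 (K iterations)
        g_max = float("-inf")                 # assumed initial value
        p_k0 = None                           # stays None for an empty region
        for pixel in region:                  # FOR loop 182 (L_k iterations)
            g = gradient_strength[pixel]      # step S183
            if g > g_max:                     # step S184
                g_max = g                     # step S185
                p_k0 = pixel
        passing_points.append(p_k0)
    return passing_points
```

Because the regions are visited in order, the resulting passing points are already ordered in the order in which the small regions are arranged.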
[0091]
Compared with the curve generation technique of FIG. 17, which can generate a curve passing through places where the gradient vector field is large, this curve generation method additionally provides means for automatically extracting, on the vector field, the passing point sequence required for the curve search. That is, according to the embodiment of the present invention, in the process of generating a curve on the gradient vector field in the contour candidate region, feature points are extracted by the technique shown in FIG. 5, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour, and a point through which the contour passes is extracted for each small region. As a result, a sequence of passing points ordered in the order in which the small regions are arranged is obtained, and a curve passing through places where the gradient vector field vector has a large magnitude can be generated by the cubic spline curve generation process using the technique of FIG. 17. The feature point extraction of the technique shown in FIG. 5 extracts the inflection points of the contour, whereas the curve generation of the technique of FIG. 17 generates a cubic spline curve for each passing point. Therefore, if a cubic spline segment corresponding to each feature point can be generated, it is guaranteed that the conspicuous shapes on the contour can be reproduced. Further, since the bent portions and the spline segments correspond one-to-one, the number of segments is neither excessive nor insufficient for the complexity of the shape.
[0092]
The effect of the region extraction method of the present embodiment as described above will be described with reference to FIG.
[0093]
FIG. 19 is a simplified illustration for clarifying the result of applying this embodiment to a certain moving image.
[0094]
FIGS. 19A to 19D show four consecutive frames of a moving image, with the outlines of the main objects drawn as line drawings. FIGS. 19E to 19H are the masks, or silhouette images, of the object extracted by applying this embodiment. In this example, the object obj at the center of the image is the target.
[0095]
As shown in FIGS. 19A to 19D, since the contour of the object obj is in contact with a plurality of other objects, the features of the contour are not simple. Even in the case of an outline having such an arbitrary combination of foreground and background, region extraction can be performed in this embodiment. As this result shows, the present embodiment is a technology that realizes automatic extraction of a target area of a moving image.
[0096]
[Effects of the invention]
According to the region extraction method and device of the present invention, when the trajectory of the object contour in the previous frame of the input moving image is given, the motion vector of the contour between the previous and current frames is estimated to obtain an estimated contour, which is the estimated position of the contour on the current frame. A contour candidate region, which is a region including the pixels around the estimated contour in which the contour may exist, is determined on the current frame, and the gradient vector field of the density values is obtained in this contour candidate region. A closed curve is then generated so that it passes through places where the magnitude of the vectors in the gradient vector field is large, and its locus is taken as the object contour in the current frame. As a result, tracking between frames can be improved, the contour can be easily corrected, and the region extraction result can be improved.
[0097]
That is, according to the present invention, when the trajectory of the object contour in the previous frame is given, the estimated position of the contour on the current frame, that is, the estimated contour, is obtained by estimating the motion vector of the contour between the previous and current frames. The region including the pixels around the estimated contour in which the contour may exist, that is, the contour candidate region, is determined on the current frame, and the gradient vector field of the density values in the contour candidate region is obtained. A closed curve is generated so as to pass through places where the magnitude of the vectors is large, and its locus is taken as the object contour in the current frame. Accordingly, the contour can be followed even when it moves greatly between the previous and current frames or when the direction of its movement changes abruptly. Further, since the contour of each frame is described by a parametric curve, correction after processing is made easier.
[0098]
In addition, the present invention provides a method of combining three motion vector estimation techniques that are effective for estimating the motion vectors of the contour portion, and combining these techniques achieves a better result through the effect of each.
[0099]
Further, according to the present invention, in the process of obtaining the contour candidate region on the current frame, the mean square error between a block on the contour of the previous frame and the corresponding block on the estimated contour of the current frame is normalized by the difference between the object color and the background color. This value serves as an estimate of the magnitude of the deviation between the true contour and the estimated contour on the current frame. The thickness of the contour candidate region formed around the estimated contour of the current frame is adjusted accordingly, which prevents the region from becoming larger than necessary and avoids unnecessary processing.
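The normalized error and its use for choosing the region thickness can be sketched as follows. This is a hedged illustration only: taking the square root of the mean square error before normalizing, treating blocks as flat lists of gray values, and the linear mapping from error to width with the hypothetical bounds `min_width` and `max_width` are all assumptions, not details given in the text.

```python
import math

def nrmse(prev_block, curr_block, object_color, background_color):
    """Root mean square error between two blocks, normalized by contrast.

    prev_block, curr_block: equal-length lists of gray values.
    object_color, background_color: representative gray values whose
    difference is used as the normalizer.
    """
    n = len(prev_block)
    mse = sum((a - b) ** 2 for a, b in zip(prev_block, curr_block)) / n
    contrast = abs(object_color - background_color)
    if contrast == 0:
        return float("inf")  # no contrast: the error cannot be normalized
    return math.sqrt(mse) / contrast

def candidate_thickness(error, min_width=2, max_width=16):
    """Map the normalized error to a region thickness in pixels.

    The linear mapping and the clamping bounds are hypothetical.
    """
    width = min_width + error * (max_width - min_width)
    return int(round(max(min_width, min(max_width, width))))
```

With this mapping, a block pair that matches well yields a thin candidate region, while a poor match widens the region up to the upper bound.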
[0100]
That is, according to the region extraction method and device of the present invention, the problems of the prior art in extracting object regions from moving images, namely tracking performance between frames, ease of correction, and accuracy of the contour shape, can be solved, and the region extraction result can be improved.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of an embodiment to which the region extraction method of the present invention is applied.
FIG. 2 is a block diagram showing the overall schematic configuration of an image processing apparatus to which the region extraction device of the present invention is applied.
FIG. 3 is a block diagram illustrating a first specific example of the motion vector estimation processing unit in FIG. 1.
FIG. 4 is a block diagram illustrating a second specific example of the motion vector estimation processing unit in FIG. 1.
FIG. 5 is a block diagram illustrating a third specific example of the motion vector estimation processing unit in FIG. 1.
FIG. 6 is a block diagram illustrating a fourth specific example of the motion vector estimation processing unit in FIG. 1.
FIG. 7 is a block diagram illustrating a specific example of the contour candidate region determination processing unit in FIG. 1.
FIG. 8 is a flowchart for explaining an example of the contour candidate region determination process in FIG. 7.
FIG. 9 is a flowchart showing an example of a method for calculating the NRMSE in FIG. 8.
FIG. 10 is a block diagram illustrating a specific example of the gradient calculation processing unit in FIG. 1.
FIG. 11 is a block diagram illustrating a first specific example of the edge detection processing unit in FIG. 10.
FIG. 12 is a block diagram illustrating a second specific example of the edge detection processing unit in FIG. 10.
FIG. 13 is a flowchart for explaining the operation of region division processing in gradient calculation.
FIG. 14 is a flowchart for explaining the processing for obtaining edge detection information.
FIG. 15 is a block diagram showing processing performed by combining the specific examples of edge detection processing shown in FIGS. 11 and 12.
FIG. 16 is a block diagram illustrating an example of the curve generation processing unit in FIG. 1.
FIG. 17 is a block diagram illustrating a specific example of a configuration to which the curve generation method is applied.
FIG. 18 is a flowchart for explaining the passing point extraction processing.
FIG. 19 is a diagram for explaining the effect of the present embodiment.
[Explanation of symbols]
11 Motion vector estimation processing unit
12 Contour candidate region determination processing unit
13 Gradient calculation processing unit
14 Curve generation processing unit
21 CPU (central processing unit)
22 External storage means
23 Input means
24 Display means
25 Bus line
31 Matching calculation processing unit
32 Motion vector calculation processing unit
41, 42, 61, 62, 63 Hierarchization processing unit
43-46 Block matching processing unit
48, 49 Smoothing processing unit
51 Motion vector estimation processing unit
52 Feature point extraction processing unit

Claims

A region extraction method for extracting an object region in each frame of an input moving image, comprising:
a motion vector estimation step of, when a trajectory of the object contour in the previous frame is given, obtaining an estimated contour, which is the estimated position of the contour on the current frame, by estimating the motion vector of the contour between the previous and current frames;
a contour candidate region determination step of determining, on the current frame, a contour candidate region, which is a region including the pixels around the estimated contour in which the contour may exist;
a gradient calculation step of obtaining, in the contour candidate region, a gradient vector field from the inner product of a unit vector in the image space and a unit vector in the density space of each pixel; and
a curve generation step of modulating the gradient intensity by the direction of the vectors in the gradient vector field and the vector in the image space, generating a closed curve by connecting the pixels that maximize the gradient intensity in each region, and taking the closed curve as the object contour.

The region extraction method according to claim 1, wherein, in the motion vector estimation step,
feature points on the trajectory of the given object contour are extracted, the motion vectors of the extracted feature points are estimated by hierarchical block matching using a simple error evaluation function in which the error evaluation between the two blocks being compared is weighted toward the residuals of the object region pixels, and smoothing processing is performed on the motion vectors resulting from each layer.

The region extraction method according to claim 1, wherein, in the contour candidate region determination step,
a value is obtained by normalizing the mean square error between a block on the contour of the previous frame and the corresponding block on the estimated contour of the current frame by the difference between the object color and the background color, and the thickness of the contour candidate region formed around the estimated contour of the current frame is adjusted using that value.

The region extraction method according to claim 1, wherein, in the gradient calculation step,
feature points are extracted from the estimated contour of the current frame, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour of the current frame, the direction of the contour in the current frame is estimated for each small region, and the gradient vector field of the contour candidate region is obtained by performing edge detection that responds only to gradients in that direction.

The region extraction method according to claim 1, wherein, in the gradient calculation step,
feature points are extracted from the estimated contour of the current frame, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour of the current frame, the color change across the contour in the current frame is estimated for each small region, and the gradient vector field of the contour candidate region is obtained by performing edge detection that responds only to gradients of that color change.

The region extraction method according to claim 1, wherein, in the curve generation step,
for the gradient vector field of the contour candidate region of the current frame, feature points are extracted from the estimated contour of the current frame, the contour candidate region is divided into small regions according to the feature point positions on the estimated contour of the current frame, a point through which the contour passes is extracted for each small region, and the passing points are connected by a cubic spline curve generated so as to pass through places where the gradient vector field vector has a large magnitude.

A region extraction device for extracting an object region in each frame of an input moving image, comprising:
motion vector estimation means for, when a trajectory of the object contour in the previous frame is given, obtaining an estimated contour, which is the estimated position of the contour on the current frame, by estimating the motion vector of the contour between the previous and current frames;
contour candidate region determination means for determining, on the current frame, a contour candidate region, which is a region including the pixels around the estimated contour in which the contour may exist;
gradient calculation means for obtaining, in the contour candidate region, a gradient vector field from the inner product of a unit vector in the image space and a unit vector in the density space of each pixel; and
curve generation means for modulating the gradient intensity by the direction of the vectors in the gradient vector field and the vector in the image space, generating a closed curve by connecting the pixels that maximize the gradient intensity in each region, and taking the closed curve as the object contour.

The region extraction device according to claim 7, wherein the motion vector estimation means
extracts feature points on the trajectory of the given object contour, estimates the motion vectors of the extracted feature points by hierarchical block matching using a simple error evaluation function in which the error evaluation between the two blocks being compared is weighted toward the residuals of the object region pixels, and performs smoothing processing on the motion vectors resulting from each layer.

The region extraction device according to claim 7, wherein the contour candidate region determination means
obtains a value by normalizing the mean square error between a block on the contour of the previous frame and the corresponding block on the estimated contour of the current frame by the difference between the object color and the background color, and adjusts the thickness of the contour candidate region formed around the estimated contour of the current frame using that value.

The region extraction device according to claim 7, wherein the gradient calculation means
extracts feature points from the estimated contour of the current frame, divides the contour candidate region into small regions according to the feature point positions on the estimated contour of the current frame, estimates the direction of the contour in the current frame for each small region, and obtains the gradient vector field of the contour candidate region by performing edge detection that responds only to gradients in that direction.

The region extraction device according to claim 7, wherein the gradient calculation means
extracts feature points from the estimated contour of the current frame, divides the contour candidate region into small regions according to the feature point positions on the estimated contour of the current frame, estimates the color change across the contour in the current frame for each small region, and obtains the gradient vector field of the contour candidate region by performing edge detection that responds only to gradients of that color change.

The region extraction device according to claim 7, wherein the curve generation means,
for the gradient vector field of the contour candidate region of the current frame, extracts feature points from the estimated contour of the current frame, divides the contour candidate region into small regions according to the feature point positions on the estimated contour of the current frame, extracts a point through which the contour passes for each small region, and connects the passing points by a cubic spline curve generated so as to pass through places where the gradient vector field vector has a large magnitude.