JP2004280148A

JP2004280148A - Method, apparatus and program for color model construction

Info

Publication number: JP2004280148A
Application number: JP2003066680A
Authority: JP
Inventors: Hidenori Sato; 秀則佐藤; Hidekazu Hosoya; 英一細谷; Yoshinori Kitahashi; 美紀北端; Ikuo Harada; 育生原田; Akira Onozawa; 晃小野澤; Hisao Nojima; 久雄野島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-03-12
Filing date: 2003-03-12
Publication date: 2004-10-07

Abstract

PROBLEM TO BE SOLVED: To construct a color model from a simple and natural action in which a user waves his or her hand at any place toward a camera.

SOLUTION: An action area detecting part 11A extracts candidates for an arm area by generating a difference image between the frame image of a captured motion image at each time and the frame image at time t_0. An arm area determining part 12A further generates difference images between the generated inter-frame difference images, generates an image in which only the arm area is colored, forms closed areas from the colored pixel areas, and obtains an arm area image at each time. A flesh-colored area determining part 13A determines a hand area from each of the arm area images. A flesh color model determining part 14A calculates the average and the variance of the color in the hand areas at each time and uses them as the flesh color model.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、ユーザーが自分の手を動かす姿をビデオカメラに撮影しながら、撮影時の環境に合った、そのユーザーの手領域の肌色モデルを構築する方法および装置に関するものである。
【０００２】
【従来の技術】
上記の実現のための技術には、特許文献１では、あらかじめ指定した肌色モデル（特許文献内では肌色テンプレートと称している）を初期値として、あらかじめ決められた手の形状テンプレートをもとに手領域の動きを追跡し、追跡した手領域の肌色テンプレートを逐次更新していく手法を提案している。
【０００３】
また、例えば非特許文献１では、ステレオカメラからある距離範囲内に写っている物が手のみであると仮定し、その手領域の肌色モデルを決定している。
【０００４】
一方、非特許文献２では、顔や手の位置を簡易に推定可能なように、顔、両手が写るようなモデル姿勢をあらかじめ設定している。撮影時に、ユーザーがその姿勢をとった結果に対して、テンプレートマッチングから顔と手が写っている領域を決定し、その領域色からユーザーの肌色モデルを決定している。
【０００５】
【特許文献１】
特開平１１−１６７４５５号公報
【非特許文献１】
「距離画像を用いたウェアラブルな３次元デスクトップ環境の構築」（信学技法，ＰＲＭＵ２００１−２２２，ｐｐ．１−８，２００２−２）
【非特許文献２】
「複雑背景下における手指特徴抽出と手話認識」（画像の認識・理解シンポジウムＶｏｌ．ＩＩ，ｐｐ．１０５−１１０，２００２年７月）
【０００６】
【発明が解決しようとする課題】
上述した従来の方法では、下記の問題点がある。
【０００７】
特許文献１の方法は、初期値としての、肌色モデル（肌色テンプレート）や正確な手形状のテンプレートを、ユーザーが最初に指定する必要があり、また、動かす手もテンプレートに形状が一致していなければならない。
【０００８】
非特許文献１の方法は、ユーザーが、高価なステレオカメラを使用し、かつ手のみをあらかじめ決められたカメラからの距離の範囲内におかなければならない。
【０００９】
非特許文献２の方法は、ユーザーにあらかじめ決められた姿勢を強要する。
【００１０】
本発明の目的は、ユーザーの肌色モデルを、ユーザーがカメラに向かって自由な位置で手を振る（図９（１））、という簡易で自然な動作から構築する方法および装置を提供することにある。
【００１１】
【課題を解決するための手段】
上記目的を達成するために、本発明の色モデル構築方法は、
ユーザーを撮影する画像入力段階と、
撮影画像内の動きのある領域を抽出する動き領域検出段階と、
動き領域検出段階で抽出した領域の中から腕領域を決定する腕領域決定段階と、
腕領域決定段階で決定された腕領域の中から肌色を表わす領域を決定する肌色領域決定段階と、
肌色領域決定段階で決定された肌色領域の色情報から目的とする肌色モデルを決定する肌色モデル決定段階とを有している。
【００１２】
以下、各段階について説明する。
【００１３】
まず、動き領域検出段階では、画像入力段階の撮影動画像のフレーム画像間の色変化があった領域を、動きがあった領域として抽出する。
【００１４】
次の腕領域決定段階では、まず前段階で抽出した動き領域から、人の腕領域のみを抽出する。その結果を閉領域化することにより、最終的な腕領域を作成する。もし、閉領域が複数存在した場合には、１）画像に対し、ある程度以上の大きさで撮影される、２）長径と短径（図９（２）を参照）の比が個人差によらずほぼ一定である、という性質を利用し、複数の閉領域の中から、指定面積以上かつ短径と長径の比が指定範囲内にある領域を腕領域とする。決定された腕領域のイメージ図を図９（２）に示す。
【００１５】
次の肌色領域決定段階では、手領域は腕領域の先端にあり、人間が手を振る場合に、動き量の大きい方が手側、小さい方が体側となることを利用し、前段階で抽出した領域の長径方向の両端領域のうち、動き量の大きな方が手領域であると判定する。決定された手領域のイメージ図を図９（３）に示す。
【００１６】
最後の肌色モデル決定段階では、抽出された手領域内の色情報を使って肌色モデルを決定する。
【００１７】
本発明では、動画像処理により、ユーザーの、手を振るという自然で簡便な動作から手領域を抽出し、肌色モデルを決定するため、認識された色モデルを作成した手をポインタとするポインティングシステムを作成することができる。すなわち、ユーザーにとって利便性の高いポインティングシステムの構築に資することができる。
【００１８】
ここで、本明細書で使用する用語について説明する。
１．肌色モデル
撮影環境下における、ユーザーの、手または顔の肌色領域がとる色の範囲である。一般には、ＲＧＢやＨＳＶ表色系を用いて、各々の成分値の平均値と分散値で表したり、色のヒストグラムを用いて表す。
２．画素
画像を構成する最小要素である。画像は、画素がｘ，ｙ方向に規則正しく並ぶことにより構成される。画像の見えは、各画素の持つ色値（一般にはＲＧＢ表色系を用い、Ｒ，Ｇ，Ｇ値それぞれ２５６階調）によって決まる。
３．フレーム画像
撮影動画像から単位時間毎に切り出した静止画である。
４．差分画像
２枚の画像を入力として、図１０のアルゴリズムに従って生成した画像である。アルゴリズムでは、まず、入力画像対Ｉ１，Ｉ２に対して値Ｌ^２を求める（ステップ３１）。このＬ^２があらかじめ決められたＡ−しきい値以上となっている場合、色差分がＢ−しきい値以上の画素に、時間ｔ_ｉの画像の画素色をマッピングし、マッピングしなかった画素の色は白とし（ステップ３２，３３）、この生成画像を差分画像と呼ぶ（ステップ３４）。ここで、Ｌ^２はある２枚の画像間の、各画素毎の色の差分の自乗和、Ａ−しきい値は入力画像対から差分画像を生成するかどうかを判定する際の、Ｌ^２の最小値を示すしきい値、Ｂ−しきい値は入力画像対の注目画素色の差分の最小値を示すしきい値である。この値以下の場合は、注目画素同士が同色であるとみなす。
５．腕領域
ここでは、ユーザーの実際の手と腕の領域を合わせて、腕領域と呼ぶ（図９（２）で示されている領域）。
６．長径、短径
腕領域の径のうち長い方を長径、短い方を短径という。
７．閉領域
まわりを線分で繋いで囲むことのできる領域。着目した画素から閉領域を作成する場合は、外郭上の画素を繋いで作成する。
８．細線化
着色画素領域の線幅を縮めて、幅１画素の中心線を抽出することである。
９．線分化
細線化した隣接着色画素を連結させながら、その結果を線分として認識することである。
１０．オプティカルフロー
フレーム間の差分を用いて、フレーム間での、画像内の物体や小領域の見かけ上の移動距離を表したもの。
１１．ヒストグラム
指定領域内の画素のうち、ある範囲の値をとる画素の個数を表したものである。領域内画素の色の分布を表しているとも言える。
【００１９】
【発明の実施の形態】
次に、本発明の実施の形態について図面を参照して説明する。
【００２０】
［第１の実施形態］
図１は本発明の第１の実施形態の色モデル構築装置のブロック図、図２はその処理の流れを示すフローチャートである。
【００２１】
本実施形態の色モデル構築装置は画像入力部１０と動き領域検出部１１Ａと腕領域決定部１２Ａと肌色領域決定部１３Ａと肌色モデル決定部１４Ａと画像メモリ１５と情報記憶部１６で構成されている。情報記憶部１６には各部で求められた動き情報、腕領域情報、色表示領域情報等が記憶される。
【００２２】
次に、本実施形態において、ユーザーの手振り動作から手の肌色モデルを構築する動作を説明する。ここでは、ユーザーが手振り動作を開始した時刻をｔ_０、終了した時刻ｔ_ｎとする。
【００２３】
まず、画像入力部１０はユーザーを撮影し、撮像画像を画像メモリ１５に記憶する（ステップ２０）。動き領域検出部１１Ａは画像メモリ１５に記憶されている撮影動画像の各時刻のフレーム画像について、時刻ｔ_０のフレーム画像との差分画像を生成することにより、腕領域の候補を抽出する（ステップ２１Ａ）。抽出された腕領域の情報は情報記憶部１６に記憶される。時刻ｔ_ｉのフレーム画像に対するフレーム間差分画像生成アルゴリズムと入力、生成画像のイメージを図３に示す。こうして作られた差分画像は、理想的には、時刻ｔ_ｉにおける腕領域が実際の腕領域の色、時刻ｔ_０における腕領域が実際の背景色、それ以外が白となっている。
【００２４】
次の腕領域決定部１２Ａでは、まず、動き領域検出部１１Ａで生成されたフレーム間差分画像同士に対し、さらに差分画像を作成し、それを元のフレーム間差分画像同士と比較することにより、腕領域のみが着色された画像を生成する（ステップ２２Ａ）。時刻ｔ_ｉのフレーム間差分画像と時刻ｔ_ｊのフレーム間差分画像から差分画像を生成し、時刻ｔ_ｊの腕領域画像を生成するアルゴリズムと入出力画像のイメージを図４に示す。次に、着色された画素領域に対し閉領域を生成する。生成された閉領域のうち、しきい値以上の面積をとり、かつ指定された短径と長径の比の範囲内にある領域を選択し、情報記憶部１６に記憶する。この領域がそれぞれの時刻における仮の腕領域となる。
【００２５】
以上の処理を、動き領域検出部１１Ａで求まった各フレーム間画像に対して行い、各時刻において、抽出された腕領域面積が最も大きい画像を、その時刻における真の腕領域画像とする。
【００２６】
肌色領域決定部１３Ａでは、各画像の抽出領域に対し、長径方向の両先端近傍領域を指定された大きさの矩形に切り出し（ステップ２３Ａ、図５）、切り出した両領域に対し、時系列に沿って重心座標の移動距離を算出する。求まった移動距離が腕領域の動きのおおよその大きさを表しており、値が大きい方の領域を手領域とみなす。
【００２７】
最後の肌色モデル決定部１４Ａでは、肌色領域決定部１３Ａで決定された各時刻の手領域の色の平均値と分散値を計算し、それを肌色モデルとする（ステップ２４Ａ）。
【００２８】
［第２の実施形態］
図６は本発明の第２の実施形態の色モデル構築装置のブロック図、図７はその処理の流れを示すフローチャートである。
【００２９】
本実施形態の色モデル構築装置は画像入力部１０と動き領域検出部１１Ｂと腕領域決定部１２Ｂと肌色領域決定部１３Ｂと肌色モデル決定部１４Ｂと画像メモリ１５と情報記憶部１６で構成されている。
【００３０】
本実施形態においても、ユーザーが手振り動作を開始した時刻を時刻ｔ_０、終了した時刻を時刻ｔ_ｎとする。
【００３１】
本実施形態における動き領域検出部１１Ｂでは、時刻ｔ_ｉと時刻ｔ_ｉ＋１におけるフレーム画像から差分画像を生成することにより、隣接フレーム間で色の変化があった領域を検出する（ステップ２１Ｂ）。図８にアルゴリズムと入出力画像のイメージを示す。この場合、腕領域の体側が未連結となっている、腕領域を囲むような線分を表す画像が生成される。
【００３２】
腕領域決定部１２Ｂでは、細線化処理後に、体側の未連結の端点同士を結ぶことにより、閉領域を作成する（ステップ２２Ｂ）。この閉領域内の画素については時刻ｔ_ｉのフレーム画像の色をマッピングした画像を、時刻ｔ_ｉにおける腕領域を示す画像とする。
【００３３】
肌色領域決定部１３Ｂでは、まず、各時刻における腕領域画像内の小領域毎の動き量を表すオプティカルフローを求め、求めたオプティカルフローの中から動き量の大きな領域を求め、それを手領域とみなす（ステップ２３Ｂ）。
【００３４】
最後の肌色モデル決定部１４Ｂでは、上記で決定された各時刻の手領域の重心座標の近傍領域の色を抜き出し、各色成分のヒストグラムを求め、それを肌色モデルとする（ステップ２４Ｂ）。
【００３５】
なお、以上の実施形態において、それぞれ全ての適用範囲はこれに限定されるものではない。例えば、第１の実施形態における動き領域検出部１１Ａと腕領域決定部１２Ａをそれぞれ、第２の実施形態における動き領域検出部１１Ｂと腕領域決定部１２Ｂに適用することも考えられるし、その逆も有り得る。さらには、肌色領域決定部１３Ａ，１３Ｂや肌色モデル決定部１４Ａ，１４Ｂについても同じことが言える。
【００３６】
なお、本発明は専用のハードウェアにより実現されるもの以外に、その機能を実現するためのプログラムを、コンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行するものであってもよい。コンピュータ読み取り可能な記録媒体とは、フロッピーディスク、光磁気ディスク、ＣＤ−ＲＯＭ等の記録媒体、コンピュータシステムに内蔵されるハードディスク装置等の記憶装置を指す。さらに、コンピュータ読み取り可能な記録媒体は、インターネットを介してプログラムを送信する場合のように、短時間の間、動的にプログラムを保持するもの（伝送媒体もしくは伝送波）、その場合のサーバとなるコンピュータシステム内部の揮発性メモリのように、一定時間プログラムを保持しているものも含む。
【００３７】
【発明の効果】
以上説明したように、本発明によれば、ユーザーに姿勢を強要したり、初期値を設定することもなく、自由な位置での自然な手振りの動きのみから肌色モデルを構築することができるという効果が得られる。
【図面の簡単な説明】
【図１】
本発明の第１の実施形態の色モデル構築装置の構成図である。
【図２】
第１の実施形態の色モデル構築装置の処理の流れを示すフローチャートである
。
【図３】
第１の実施形態における、時刻ｔ_ｉのフレーム画像の時刻ｔ_０のフレーム画像に
対するフレーム間差分画像生成アルゴリズムを示すイメージ図である。
【図４】
第１の実施形態における、時刻ｔ_ｉのフレーム差分画像と時刻ｔ_ｊのフレーム差
分画像から時刻ｔ_ｊの腕領域画像を生成する腕領域決定アルゴリズムを示すイメージ図である。
【図５】第１の実施形態における、抽出された腕領域画像から、両先端近傍領域の矩形切り出しを行うイメージ図である。
【図６】本発明の第２の実施形態の色モデル構築装置の構成図である。
【図７】第２の実施形態の色モデル構築装置の処理の流れを示す図である。
【図８】第２の実施形態における、時刻ｔ_ｉ＋１の時刻ｔ_ｉに対するフレーム間差分画像生成アルゴリズムを示すイメージ図である。
【図９】ユーザーの手振り動作、および本発明によって決定された、腕領域と手領域のイメージを示す図である。
【図１０】差分画像生成のアルゴリズムを示すフローチャートである。
【符号の説明】
１０画像入力部
１１Ａ，１１Ｂ動き領域検出部
１２Ａ，１２Ｂ腕領域決定部
１３Ａ，１３Ｂ肌色領域決定部
１４Ａ，１４Ｂ肌色モデル決定部
１５画像メモリ
１６情報記憶部
２０，２１Ａ〜２４Ａ，２１Ｂ〜２４Ｂ，３１〜３４ステップ
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a method and an apparatus that, while a video camera captures a user moving his or her hand, construct a skin color model of the user's hand region suited to the environment at the time of capture.
[0002]
[Prior art]
As a technique for realizing the above, Patent Document 1 proposes a method in which a skin color model specified in advance (called a skin color template in that document) is used as an initial value, the motion of the hand region is tracked on the basis of a predetermined hand shape template, and the skin color template of the tracked hand region is updated successively.
[0003]
In Non-Patent Document 1, for example, it is assumed that only a hand is present within a certain distance range from a stereo camera, and a skin color model of the hand region is determined.
[0004]
On the other hand, in Non-Patent Document 2, a model posture in which the face and both hands are visible is set in advance so that the positions of the face and hands can be estimated easily. At capture time the user assumes that posture, template matching on the result determines the regions showing the face and hands, and the user's skin color model is determined from the colors of those regions.
[0005]
[Patent Document 1]
JP-A-11-167455
[Non-Patent Document 1]
"Construction of Wearable 3D Desktop Environment Using Range Images" (IEICE Technical Report, PRMU2001-222, pp. 1-8, 2002-2)
[Non-Patent Document 2]
"Finger Feature Extraction and Sign Language Recognition Under Complex Background" (Image Recognition and Understanding Symposium Vol.II, pp.105-110, July 2002)
[0006]
[Problems to be solved by the invention]
The conventional method described above has the following problems.
[0007]
In the method of Patent Document 1, the user must first specify, as initial values, a skin color model (skin color template) and an accurate hand shape template, and the shape of the moving hand must also match the template.
[0008]
The method of Non-Patent Document 1 requires the user to use an expensive stereo camera and to keep only the hand within a predetermined range of distances from the camera.
[0009]
The method of Non-Patent Document 2 forces the user to assume a predetermined posture.
[0010]
An object of the present invention is to provide a method and an apparatus for constructing a user's skin color model from the simple and natural motion of the user waving a hand, at any position, toward the camera (FIG. 9(1)).
[0011]
[Means for Solving the Problems]
In order to achieve the above object, the color model construction method of the present invention includes:
an image input stage of photographing the user;
a motion region detection stage of extracting regions with motion in the captured image;
an arm region determination stage of determining an arm region from among the regions extracted in the motion region detection stage;
a skin color region determination stage of determining a region representing skin color from within the arm region determined in the arm region determination stage; and
a skin color model determination stage of determining the target skin color model from the color information of the skin color region determined in the skin color region determination stage.
[0012]
Hereinafter, each stage will be described.
[0013]
First, in the motion region detection stage, regions where the color changed between frame images of the moving image captured in the image input stage are extracted as regions where motion occurred.
[0014]
In the next arm region determination stage, only the human arm region is first extracted from the motion regions extracted in the previous stage. The result is converted into closed regions to create the final arm region. If there are multiple closed regions, the method uses two properties of an arm: 1) it is captured at or above a certain size relative to the image, and 2) the ratio of its major axis to its minor axis (see FIG. 9(2)) is nearly constant regardless of individual differences. Among the closed regions, a region whose area is at or above a specified value and whose ratio of minor axis to major axis is within a specified range is taken as the arm region. FIG. 9(2) shows an image of the determined arm region.
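The selection rule in this paragraph can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say how the major and minor axes are measured, so the sketch estimates them from the second moments of the pixel coordinates (principal axes). The names `axis_ratio`, `select_arm_regions`, `min_area`, and `ratio_range` are hypothetical and do not appear in the source.

```python
import math


def axis_ratio(pixels):
    """Estimate the minor/major axis ratio of a region from its second moments.

    `pixels` is a list of (x, y) coordinates. Using principal axes is an
    assumption of this sketch, not something the source specifies.
    """
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    if lam_major == 0:
        return 1.0  # a single point has no preferred direction
    # Axis lengths scale with the square roots of the eigenvalues.
    return math.sqrt(lam_minor / lam_major)


def select_arm_regions(regions, min_area, ratio_range):
    """Keep regions that are large enough and elongated like an arm."""
    low, high = ratio_range
    return [r for r in regions
            if len(r) >= min_area and low <= axis_ratio(r) <= high]
```

With a 20 x 4 pixel bar, a 10 x 10 pixel square, and a three-pixel speck as candidates, `select_arm_regions(candidates, 50, (0.1, 0.4))` keeps only the bar: the square is too round and the speck is too small.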
[0015]
The next skin color region determination stage uses the facts that the hand region lies at the tip of the arm region and that, when a person waves a hand, the end with the larger amount of motion is the hand side and the end with the smaller amount is the body side. Of the two end regions along the major axis of the region extracted in the previous stage, the one with the larger amount of motion is judged to be the hand region. FIG. 9(3) shows an image of the determined hand region.
[0016]
In the final skin color model determination stage, the skin color model is determined using the color information in the extracted hand region.
[0017]
In the present invention, moving image processing extracts the hand region from the user's natural and simple hand-waving motion and determines the skin color model, so a pointing system can be built that uses the hand for which the color model was created as a pointer. That is, the invention can contribute to the construction of a pointing system that is highly convenient for the user.
[0018]
Here, the terms used in the present specification will be described.
1. Skin color model
The range of colors taken by the skin color regions of the user's hand or face under the capture environment. It is generally expressed as the mean and variance of each component in the RGB or HSV color system, or as a color histogram.
2. Pixel
The smallest element of an image. An image is formed by pixels arranged regularly in the x and y directions. The appearance of an image is determined by the color value of each pixel (generally in the RGB color system, with 256 levels for each of the R, G, and B values).
3. Frame image
A still image cut out of the captured moving image at each unit of time.
4. Difference image
An image generated from two input images according to the algorithm of FIG. 10. The algorithm first computes the value L^2 for the input image pair I1, I2 (step 31). If L^2 is at or above the predetermined A-threshold, each pixel whose color difference is at or above the B-threshold is given the pixel color of the image at time t_i, and every pixel not mapped is set to white (steps 32 and 33); the resulting image is called a difference image (step 34). Here, L^2 is the sum over all pixels of the squared color difference between the two images; the A-threshold is the minimum value of L^2 for which a difference image is generated from the input image pair; and the B-threshold is the minimum color difference between the corresponding pixels of interest in the input image pair. If the difference is at or below this value, the two pixels are regarded as having the same color.
5. Arm region
Here, the user's actual hand and arm regions are together called the arm region (the region shown in FIG. 9(2)).
6. Major axis, minor axis
Of the diameters of the arm region, the longer is called the major axis and the shorter the minor axis.
7. Closed region
A region that can be enclosed by connecting line segments around it. When a closed region is created from pixels of interest, it is created by connecting the pixels on the outline.
8. Thinning
Reducing the line width of a colored pixel region to extract a center line one pixel wide.
9. Line segmentation
Connecting adjacent thinned colored pixels and recognizing the result as line segments.
10. Optical flow
A representation, obtained from inter-frame differences, of the apparent distance moved between frames by objects or small regions in the image.
11. Histogram
The number of pixels, among those in a designated region, that take values in each given range. It can also be said to represent the color distribution of the pixels in the region.
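The difference image definition in item 4 can be sketched in Python as follows. This is a minimal illustration, not the algorithm of FIG. 10 itself. It assumes that images are lists of rows of (R, G, B) tuples, that the per-pixel color difference is the squared Euclidean distance in RGB, and that the B-threshold is compared against that squared distance. The names `color_diff_sq`, `difference_image`, `a_threshold`, and `b_threshold` are hypothetical.

```python
WHITE = (255, 255, 255)


def color_diff_sq(p, q):
    """Squared Euclidean distance between two RGB pixels (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def difference_image(img_ref, img_ti, a_threshold, b_threshold):
    """Return the difference image of two equally sized images, or None.

    Each image is a list of rows; each row is a list of (R, G, B) tuples.
    """
    # Step 31: L^2 is the sum over all pixels of the squared color difference.
    l2 = sum(color_diff_sq(p, q)
             for row_ref, row_ti in zip(img_ref, img_ti)
             for p, q in zip(row_ref, row_ti))
    if l2 < a_threshold:
        return None  # the pair is too similar, so no difference image is made
    # Steps 32-33: changed pixels keep the time-t_i color, the rest turn white.
    return [[q if color_diff_sq(p, q) >= b_threshold else WHITE
             for p, q in zip(row_ref, row_ti)]
            for row_ref, row_ti in zip(img_ref, img_ti)]
```

Returning `None` when L^2 falls below the A-threshold is a design choice of the sketch; it lets a caller skip frame pairs in which nothing moved.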
[0019]
BEST MODE FOR CARRYING OUT THE INVENTION
Next, embodiments of the present invention will be described with reference to the drawings.
[0020]
[First Embodiment]
FIG. 1 is a block diagram of a color model construction apparatus according to a first embodiment of the present invention, and FIG. 2 is a flowchart showing the flow of the processing.
[0021]
The color model construction apparatus according to the present embodiment comprises an image input unit 10, a motion region detection unit 11A, an arm region determination unit 12A, a skin color region determination unit 13A, a skin color model determination unit 14A, an image memory 15, and an information storage unit 16. The information storage unit 16 stores motion information, arm region information, color display region information, and the like obtained by each unit.
[0022]
Next, the operation of constructing a skin color model of the hand from the user's hand-waving motion in the present embodiment will be described. Here, the time at which the user starts the hand-waving motion is t_0, and the time at which the user ends it is t_n.
[0023]
First, the image input unit 10 captures the user and stores the captured images in the image memory 15 (step 20). For the frame image at each time of the captured moving image stored in the image memory 15, the motion region detection unit 11A generates a difference image against the frame image at time t_0, thereby extracting candidates for the arm region (step 21A). Information on the extracted arm region is stored in the information storage unit 16. FIG. 3 shows the inter-frame difference image generation algorithm for the frame image at time t_i, together with images of the input and the generated image. Ideally, in a difference image created this way, the arm region at time t_i has the actual arm colors, the arm region at time t_0 has the actual background colors, and everything else is white.
[0024]
Next, the arm region determination unit 12A first creates further difference images between pairs of the inter-frame difference images generated by the motion region detection unit 11A and compares them with the original inter-frame difference images, thereby generating an image in which only the arm region is colored (step 22A). FIG. 4 shows the algorithm that generates a difference image from the inter-frame difference images at times t_i and t_j and produces the arm region image at time t_j, together with images of the input and output. Next, closed regions are generated from the colored pixel regions. Among the generated closed regions, those whose area is at or above a threshold and whose ratio of minor axis to major axis is within the specified range are selected and stored in the information storage unit 16. These regions are the provisional arm regions at each time.
[0025]
The above processing is performed on each inter-frame image obtained by the motion region detection unit 11A, and, for each time, the image with the largest extracted arm region area is taken as the true arm region image at that time.
[0026]
For the extracted region of each image, the skin color region determination unit 13A cuts out the regions near both tips along the major axis as rectangles of a specified size (step 23A, FIG. 5), and calculates, for the two cut-out regions, the distance traveled by their centroid coordinates over the time series. The distance obtained represents the approximate magnitude of the motion of the arm region, and the region with the larger value is regarded as the hand region.
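The comparison of centroid travel described in this paragraph might look like the following Python sketch. It assumes that the two end rectangles have already been cut out for every frame and that end A in one frame corresponds to end A in the next; the source does not describe how that correspondence is kept. The names `centroid`, `path_length`, and `pick_hand_end` are hypothetical.

```python
import math


def centroid(pixels):
    """Centroid of a list of (x, y) pixel coordinates."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def path_length(points):
    """Total distance traveled along a time-ordered list of points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def pick_hand_end(end_a_frames, end_b_frames):
    """Return "a" or "b", whichever end region's centroid traveled farther.

    Each argument is a time-ordered list of pixel lists, one per frame.
    """
    travel_a = path_length([centroid(p) for p in end_a_frames])
    travel_b = path_length([centroid(p) for p in end_b_frames])
    return "a" if travel_a > travel_b else "b"
```

Summing the frame-to-frame distances, rather than taking the straight-line distance between the first and last centroids, keeps a hand that waves out and back from scoring close to zero.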
[0027]
The last skin color model determination unit 14A calculates the average value and the variance of the color of the hand region at each time determined by the skin color region determination unit 13A, and uses them as a skin color model (step 24A).
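A mean-and-variance skin color model of this kind can be sketched in Python as follows. The source does not say whether the statistics are kept per time or pooled; the sketch pools the hand pixels of all times and uses the population variance, both of which are assumptions. The function name `skin_color_model` and the dictionary layout are hypothetical.

```python
from statistics import fmean, pvariance


def skin_color_model(hand_pixels_per_frame):
    """Per-channel mean and variance of all hand pixels, pooled over time.

    `hand_pixels_per_frame` is a list (one entry per time) of lists of
    (R, G, B) tuples taken from the hand region.
    """
    pixels = [p for frame in hand_pixels_per_frame for p in frame]
    channels = list(zip(*pixels))  # three tuples: all R, all G, all B values
    return {
        "mean": tuple(fmean(c) for c in channels),
        "variance": tuple(pvariance(c) for c in channels),
    }
```

A later classifier could then accept a pixel as skin when each channel lies within a few standard deviations of the mean; that use is not described in the source and is mentioned only to show what the model is for.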
[0028]
[Second embodiment]
FIG. 6 is a block diagram of a color model construction apparatus according to the second embodiment of the present invention, and FIG. 7 is a flowchart showing the flow of the processing.
[0029]
The color model construction apparatus according to the present embodiment comprises an image input unit 10, a motion region detection unit 11B, an arm region determination unit 12B, a skin color region determination unit 13B, a skin color model determination unit 14B, an image memory 15, and an information storage unit 16.
[0030]
In the present embodiment as well, the time at which the user starts the hand-waving motion is t_0, and the time at which the user ends it is t_n.
[0031]
The motion region detection unit 11B of the present embodiment generates a difference image from the frame images at times t_i and t_{i+1}, thereby detecting regions where the color changed between adjacent frames (step 21B). FIG. 8 shows the algorithm together with images of the input and output. In this case, the generated image shows line segments that surround the arm region but are left unconnected on the body side.
[0032]
The arm region determination unit 12B creates a closed region by connecting, after thinning, the unconnected end points on the body side (step 22B). The image obtained by giving the pixels inside this closed region the colors of the frame image at time t_i is taken as the image showing the arm region at time t_i.
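The closing step can be illustrated with a small Python sketch. It assumes that thinning has already produced a one-pixel-wide curve given as a set of (x, y) coordinates, that an end point is a pixel with exactly one 8-connected neighbor, and that the two end points are joined by a straight run of pixels; none of these details are given in the source. The names `endpoints`, `bridge`, and `close_contour` are hypothetical.

```python
def endpoints(skeleton):
    """Pixels of a thinned curve that have exactly one 8-connected neighbor."""
    pts = set(skeleton)
    found = []
    for (x, y) in pts:
        neighbors = sum((x + dx, y + dy) in pts
                        for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                        if (dx, dy) != (0, 0))
        if neighbors == 1:
            found.append((x, y))
    return sorted(found)


def bridge(p, q):
    """Integer pixels on the straight segment from p to q, both included."""
    (x0, y0), (x1, y1) = p, q
    steps = max(abs(x1 - x0), abs(y1 - y0))
    if steps == 0:
        return [p]
    return [(round(x0 + (x1 - x0) * i / steps),
             round(y0 + (y1 - y0) * i / steps))
            for i in range(steps + 1)]


def close_contour(skeleton):
    """Join the two open end points of a thinned outline, if there are two."""
    ends = endpoints(skeleton)
    if len(ends) != 2:
        return set(skeleton)  # already closed, or ambiguous: leave untouched
    return set(skeleton) | set(bridge(*ends))
```

Filling the interior of the closed outline with the frame colors at time t_i would follow as a separate step and is not shown.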
[0033]
The skin color region determination unit 13B first obtains optical flow representing the amount of motion of each small region in the arm region image at each time, then finds the regions with a large amount of motion from the obtained optical flow and regards them as the hand region (step 23B).
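Selecting the fast-moving blocks from an optical flow field might be sketched as follows. Computing the flow itself is not shown: the sketch assumes it is already available as a mapping from small-region (block) coordinates to a displacement vector. The source says only that regions with a large amount of motion are chosen, so the rule used here, keeping blocks whose speed is at least a fraction of the peak speed, is an assumption, as are the names `hand_blocks` and `keep_fraction`.

```python
import math


def hand_blocks(flow, keep_fraction=0.8):
    """Return the blocks whose flow magnitude is close to the largest one.

    `flow` maps block coordinates (bx, by) to a displacement (dx, dy).
    """
    magnitudes = {block: math.hypot(dx, dy)
                  for block, (dx, dy) in flow.items()}
    peak = max(magnitudes.values(), default=0.0)
    if peak == 0:
        return []  # nothing moved, so no hand region can be chosen
    return sorted(block for block, m in magnitudes.items()
                  if m >= keep_fraction * peak)
```

A fixed absolute threshold would also fit the description; a relative one is used here so the sketch does not depend on image resolution or frame rate.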
[0034]
The last skin color model determining unit 14B extracts the color of the area near the barycenter coordinate of the hand area determined at each time as described above, obtains a histogram of each color component, and uses it as a skin color model (step 24B).
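A per-channel histogram model of the kind described here can be sketched in Python as follows. The sketch assumes 8-bit RGB pixels already extracted from the neighborhood of the hand centroid; the bin count is not given in the source, so `bins=16` is an arbitrary choice, and the name `channel_histograms` is hypothetical.

```python
def channel_histograms(pixels, bins=16):
    """Count, for each of R, G and B, how many pixels fall in each bin.

    `pixels` is a list of (R, G, B) tuples with values in 0..255.
    Returns three lists of length `bins`.
    """
    hists = [[0] * bins for _ in range(3)]
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            # Scale 0..255 onto 0..bins-1 without running past the last bin.
            hists[channel][value * bins // 256] += 1
    return hists
```

Computing the index as `value * bins // 256` keeps the value 255 inside the last bin for any bin count, which a plain division by a rounded bin width would not guarantee.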
[0035]
The scope of application of the above embodiments is not limited to what has been described. For example, the motion region detection unit 11A and the arm region determination unit 12A of the first embodiment may be used in place of the motion region detection unit 11B and the arm region determination unit 12B of the second embodiment, and vice versa. The same applies to the skin color region determination units 13A and 13B and the skin color model determination units 14A and 14B.
[0036]
Besides being realized by dedicated hardware, the present invention may be realized by recording a program that implements its functions on a computer-readable recording medium and having a computer system read and execute the program recorded on that medium. A computer-readable recording medium here means a recording medium such as a floppy disk, a magneto-optical disk, or a CD-ROM, or a storage device such as a hard disk drive built into a computer system. It further includes media that hold the program dynamically for a short time (a transmission medium or transmission wave), as when the program is sent over the Internet, and media that hold the program for a certain period, such as the volatile memory inside the computer system that acts as the server in that case.
[0037]
[Effects of the invention]
As described above, the present invention makes it possible to construct a skin color model solely from a natural hand-waving motion at any position, without forcing the user into a posture or requiring initial values to be set.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of the color model construction apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the flow of processing of the color model construction apparatus according to the first embodiment.
FIG. 3 is an image diagram showing the inter-frame difference image generation algorithm, in the first embodiment, for the frame image at time t_i relative to the frame image at time t_0.
FIG. 4 is an image diagram showing the arm region determination algorithm, in the first embodiment, that generates the arm region image at time t_j from the frame difference images at times t_i and t_j.
FIG. 5 is an image diagram showing the rectangular cut-out of the regions near both tips from the extracted arm region image in the first embodiment.
FIG. 6 is a configuration diagram of the color model construction apparatus according to the second embodiment of the present invention.
FIG. 7 is a diagram showing the flow of processing of the color model construction apparatus according to the second embodiment.
FIG. 8 is an image diagram showing the inter-frame difference image generation algorithm, in the second embodiment, for time t_{i+1} relative to time t_i.
FIG. 9 is a diagram showing the user's hand-waving motion and images of the arm region and hand region determined by the present invention.
FIG. 10 is a flowchart showing the algorithm for generating a difference image.
[Explanation of symbols]
10 Image input unit
11A, 11B Motion region detection unit
12A, 12B Arm region determination unit
13A, 13B Skin color region determination unit
14A, 14B Skin color model determination unit
15 Image memory
16 Information storage unit
20, 21A to 24A, 21B to 24B, 31 to 34 Steps

Claims

1. A color model construction method for photographing a user and using the captured moving image to construct a skin color model of a hand moved by the user, the method comprising:
an image input step of photographing the user;
a motion region detecting step of extracting a moving region in the captured image;
an arm region determining step of determining an arm region from among the regions extracted in the motion region detecting step;
a skin color region determining step of determining a region representing skin color from within the arm region determined in the arm region determining step; and
a skin color model determining step of determining the target skin color model from color information of the skin color region determined in the skin color region determining step.

2. The method according to claim 1, wherein the motion region detecting step extracts, as a moving region, a region where the color changed between frame images of the captured moving image.

3. The method according to claim 1 or 2, wherein the arm region determining step extracts only the human arm region from the extracted motion region, creates the final arm region by converting the result into closed regions, and, if there are multiple closed regions, determines as the arm region a region among them whose area is at or above a specified value and whose ratio of major axis (the longer diameter of the region) to minor axis (the shorter diameter) is within a specified range.

4. The method according to claim 1, wherein the skin color region determining step determines, of the end regions in the major axis direction of the arm region determined in the arm region determining step, the region with the larger amount of motion as the skin color region.

5. The method wherein the skin color model determining step calculates the mean and variance of the color of the skin color region at each time determined in the skin color region determining step, and uses the calculated values as the skin color model.

6. The method according to claim 1, wherein the arm region determining step, after thinning the extracted motion region, creates a closed region by connecting the unconnected end points on the body side, and determines an image in which the pixels inside the region are given the colors of the frame image at each time as the image representing the arm region at that time.

7. The method according to claim 6, wherein the skin color region determining step obtains optical flow representing the amount of motion of each small region in the arm region image at each time, finds a region with a large amount of motion from the optical flow, and regards that region as the hand region.

8. The method according to claim 7, wherein the skin color model determining step extracts the colors of a region near the centroid coordinates of the hand region at each time, obtains a histogram of each color component, and uses the histograms as the skin color model.

9. A color model construction device that photographs a user and uses the captured moving image to construct a skin color model of a hand moved by the user, the device comprising:
image input means for photographing the user;
motion region detecting means for extracting a moving region in the captured image;
arm region determining means for determining an arm region from among the regions extracted by the motion region detecting means;
skin color region determining means for determining a region representing skin color from within the arm region determined by the arm region determining means; and
skin color model determining means for determining the target skin color model from color information of the skin color region determined by the skin color region determining means.

10. A color model construction program for causing a computer to execute the color model construction method according to any one of claims 1 to 8.