JP2014035706A

JP2014035706A - Position detecting device, position detecting method and program

Info

Publication number: JP2014035706A
Application number: JP2012177567A
Authority: JP
Inventors: Hiroyuki Hoshino; 博之星野
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2012-08-09
Filing date: 2012-08-09
Publication date: 2014-02-24
Anticipated expiration: 2032-08-09
Also published as: JP6153051B2

Abstract

PROBLEM TO BE SOLVED: To make it possible to detect the position of a person's hand from a wider variety of images. SOLUTION: While a live view image is displayed, an operation for detecting a hand region corresponding to a person's hand from the image of each frame, based on a reference hand color, and an operation for detecting the person's face from the image of each frame are started in parallel (S3). While the hand region is being detected, position information indicating the position of the detected hand region in the image is acquired and accumulated as trajectory data indicating the trajectory of the hand (S4). When a person's face is detected (S5: YES), the skin color of the face is detected from the image (S7). The detection color used in the hand region detection operation is changed from the reference hand color to the skin color of the face (S10). Accumulation of the trajectory data then continues (S16). The position of the person's hand can be detected provided that at least the person's face is present in the image.

Description

The present invention relates to a technique for detecting the position of a person's hand from an image.

In recent years, technology that enables remote control of home appliances and the like, by photographing a person with a camera and recognizing gestures made by the person's hand movements, has been becoming widespread. To recognize a gesture, it is essential to detect the position of the hand from an image of the person. As one technique for detecting the hand position, Patent Document 1 below, for example, describes detecting a person's face from an image, estimating the person's posture from the position of the face in the image, and then detecting the position of the person's hand based on the estimated posture.

Patent Document 1: JP 2012-98988 A

However, the technique described in Patent Document 1 has the problem that, in order to detect the position of a person's hand from an image, the image must contain not merely the person's face but also enough body parts other than the face to allow the person's posture to be estimated.

The present invention has been made in view of this conventional problem, and its object is to provide a position detection device, a position detection method, and a program that make it possible to detect the position of a person's hand from images in a wider variety of states.

To solve the above problem, the present invention comprises: face detection means for detecting a person's face from each image constituting a moving image; skin color detection means for detecting color information of the skin of the face detected by the face detection means; hand region detection means for detecting, from each image constituting the moving image, a hand region corresponding to a person's hand based on hand color information; position acquisition means for acquiring position information indicating the position in the image of the hand region detected by the hand region detection means; and detection control means for changing the hand color information used in the hand region detection operation of the hand region detection means, based on the skin color information detected by the skin color detection means.

According to the present invention, the position of a person's hand can be detected from images in a wider variety of states.

FIG. 1 is a block diagram of a digital camera showing an embodiment of the present invention. FIG. 2 is a flowchart showing the processing performed in the automatic shooting mode of the digital camera.

An embodiment of the present invention is described below. FIG. 1 is a block diagram outlining the electrical configuration of a digital camera 1 that functions as the position detection device of the present invention.

As shown in FIG. 1, the digital camera 1 includes a control unit 2, a lens unit 3, an imaging unit 4, a display unit 5, an image storage unit 6, a program storage unit 7, an operation unit 8, a face detection unit 9, a hand region detection unit 10, and a hand detection unit 11.

The control unit 2 includes a CPU (Central Processing Unit), its peripheral circuits, and working internal memory such as RAM (Random Access Memory), and controls each unit of the digital camera 1.

The lens unit 3 comprises a lens group including focus adjustment and zoom lenses, a motor that drives the lens group, a diaphragm, an actuator that opens and closes the diaphragm to adjust its aperture, and the like.

The imaging unit 4 comprises a CCD (Charge Coupled Device) or CMOS (Complementary Metal Oxide Semiconductor) image sensor and an AFE (Analog Front End) that performs predetermined analog processing on the imaging signal output from the image sensor and then converts it into a digital signal. While a shooting mode is set on the digital camera 1, the imaging unit 4 sequentially captures the subject through the lens group of the lens unit 3 at a predetermined frame rate and supplies the image data of the subject (imaging data) to the control unit 2.

The image data supplied to the control unit 2 is converted there into RGB data for each pixel, then further converted into YUV data consisting of a luminance (Y) component and color difference (UV) components, and supplied to the display unit 5, where it is displayed as a live view image.

At the time of shooting, the image data supplied from the imaging unit 4 to the control unit 2 is converted into YUV data and then compressed using the JPEG (Joint Photographic Experts Group) method. Various attribute information is added to the compressed image data, which is then stored in the image storage unit 6 as a still image file conforming to the Exif (Exchangeable Image File Format) standard.

The image storage unit 6 comprises, for example, flash memory built into the digital camera 1, various memory cards that can be attached to and detached from the digital camera 1, and a card interface that enables data input to and output from the memory card.

Image data stored as a still image file in the image storage unit 6 is read out and decompressed by the control unit 2 while the playback mode is set on the digital camera 1, and is then supplied to the display unit 5 and displayed on screen.

The display unit 5 comprises a color liquid crystal display panel and a display drive circuit that drives the panel according to image data and the like supplied from the control unit 2. As described above, it shows the subject image as a live view in the shooting mode, and in the playback mode it displays existing captured images made up of the image data stored as still image files in the image storage unit 6.

The program storage unit 7 comprises, for example, a ROM (Read Only Memory) or a nonvolatile memory, such as flash memory, whose stored data can be rewritten at any time. The program storage unit 7 stores in advance a program for causing the control unit 2 to perform the processing described later.

The program storage unit 7 also stores programs for causing the control unit 2 to perform AE (Auto Exposure) control, AF (Auto Focus) control, AWB (Auto White Balance) control, and the like, as well as various data such as the data constituting a program chart that indicates the combinations of shutter speed, ISO sensitivity, and aperture value used in AE control.

The operation unit 8 comprises a plurality of operation keys, including a power key, a shutter key, and a mode switching key used to switch between the shooting mode and the playback mode, which are the basic operation modes of the digital camera 1. The control unit 2 continually monitors the state of the operation keys of the operation unit 8.

The face detection unit 9, the hand region detection unit 10, and the hand detection unit 11 each comprise an image processing circuit for performing specific image recognition processing on images captured by the imaging unit 4 at a predetermined frame rate in the shooting mode, a memory storing various parameters, a working memory, and the like.

The face detection unit 9 detects the face portion of any person in the image of each frame, acquires region information indicating the region corresponding to the detected face portion (hereinafter, the face region), and supplies it to the control unit 2. Face detection by the face detection unit 9 is performed by a known technique that uses image recognition techniques such as binarization, contour extraction, and pattern matching to search for a specific portion of the image in which eyes, a nose, and a mouth are present in a positional relationship within a certain range.

The hand region detection unit 10 detects, from the image of each frame, a region corresponding to a person's hand (hereinafter, the hand region) based on the color information of each pixel, acquires position information indicating the position of the detected hand region in the image, and supplies it to the control unit 2.

Specifically, the unit computes the value of the H (hue) component of the HSV color system as the color information of each pixel. For the image of each frame, it extracts a specific region consisting of the group of pixels whose H value lies within a fixed range centered on the value of the person's hand color, computes the difference in the specific region between consecutive frames, and detects the portion where that difference has the largest area as the hand region.
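
The hue thresholding and frame differencing described above can be sketched roughly as follows. This is only an illustrative sketch and not the camera's actual implementation: the function names `hue_mask` and `largest_changed_region`, the representation of a frame as a non-empty nested list of 8-bit RGB tuples, the wrap-around hue comparison, and the use of 4-connectivity to measure the area of the difference are all assumptions made for the example.

```python
import colorsys


def hue_mask(frame, target_hue, tolerance):
    """Mark pixels whose hue lies within `tolerance` of `target_hue`.

    `frame` is a nested list of (r, g, b) tuples with 8-bit channels.
    Hue is in [0, 1) and is compared on the colour circle (assumption).
    """
    mask = []
    for row in frame:
        mask_row = []
        for r, g, b in row:
            h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            d = abs(h - target_hue)
            mask_row.append(min(d, 1.0 - d) <= tolerance)
        mask.append(mask_row)
    return mask


def largest_changed_region(prev_mask, curr_mask):
    """Return the (row, col) pixels of the largest 4-connected region
    in which the two masks differ (hypothetical reading of
    "the portion where the difference has the largest area")."""
    height, width = len(curr_mask), len(curr_mask[0])
    changed = [[prev_mask[y][x] != curr_mask[y][x] for x in range(width)]
               for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    best = []
    for y in range(height):
        for x in range(width):
            if not changed[y][x] or seen[y][x]:
                continue
            # Flood-fill one connected region of changed pixels.
            stack = [(y, x)]
            seen[y][x] = True
            region = []
            while stack:
                cy, cx = stack.pop()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and changed[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best
```

Because the difference is taken between consecutive frames, a stationary skin-colored object produces no changed pixels and is ignored, which matches the description of the hand region as a moving object of the detection color. Note that hue is poorly defined for grey or black pixels; a practical version would probably also check saturation, which the text does not mention.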

In other words, the hand region detection unit 10 detects as the hand region a region in the image corresponding to any moving object whose color is the person's hand color or a color close to it. The person's hand color is set as the detection color by the control unit 2, as described later. As the position information of the detected hand region, the hand region detection unit 10 acquires, for example, coordinate information indicating the center position of the smallest circle or rectangle enclosing the hand region.
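
As a small illustration of the position information, the center of the smallest enclosing rectangle might be computed as in the sketch below. The function name `bounding_box_center` and the choice of an axis-aligned rectangle with (x, y) output are assumptions; the text equally allows a circle.

```python
def bounding_box_center(pixels):
    """Return the (x, y) center of the smallest axis-aligned rectangle
    that encloses `pixels`, given as (row, col) tuples."""
    if not pixels:
        raise ValueError("no hand region pixels were given")
    rows = [row for row, _col in pixels]
    cols = [col for _row, col in pixels]
    return ((min(cols) + max(cols)) / 2.0, (min(rows) + max(rows)) / 2.0)
```

One such coordinate pair per frame would then form the trajectory data that is matched against the gestures.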

The hand detection unit 11 detects the hand of any person in the image and supplies position information indicating the position of the detected hand to the control unit 2. Hand detection by the hand detection unit 11 is based on shape recognition: using a large amount of hand shape data prepared in advance, it searches the image for a person's hand by, for example, binarization, contour extraction, and pattern matching. As the position information indicating the position of the detected hand in the image, the hand detection unit 11 acquires, for example, coordinate information indicating the center position of the smallest circle or rectangle enclosing the hand.

The digital camera 1 of this embodiment is provided with an automatic shooting mode as a sub-mode of the shooting mode. In the automatic shooting mode, while live view display is performed in the shooting standby state, the camera detects the movement of a person's hand from the sequentially captured images and, when the hand movement is recognized as a predetermined motion (gesture), performs shooting with that recognition as the trigger. The gestures that the digital camera 1 recognizes in the automatic shooting mode are simple hand movements such as drawing a circle or an S shape.

Next, the operation of the digital camera 1 according to the present invention is described with reference to FIG. 2. The operation described below assumes that the digital camera 1 recognizes two types of gestures in the automatic shooting mode: one that causes gesture-triggered shooting to be performed only once, and one that causes gesture-triggered shooting to be performed repeatedly.

FIG. 2 is a flowchart showing the processing that the control unit 2 executes, in accordance with the program stored in the program storage unit 7, when the user sets the digital camera 1 to the automatic shooting mode.

When the automatic shooting mode is set, the control unit 2 starts capturing subject images with the imaging unit 4 and starts live view display on the display unit 5 (step S1).

The control unit 2 then sets a reference color, which is a predetermined average hand color, as the detection color, that is, the color of the hand region that the hand region detection unit 10 should detect (step S2), and starts hand region detection by the hand region detection unit 10 and face detection by the face detection unit 9 in parallel (step S3).

With the start of hand region detection by the hand region detection unit 10, the control unit 2 also starts accumulating (storing) in the working internal memory the position information acquired by the hand region detection unit 10, which indicates the position of the hand region in the image, as trajectory data indicating the movement trajectory of the person's hand (step S4).

Thereafter, until the face detection unit 9 detects the face of any person, the control unit 2 continues accumulating trajectory data indicating the movement trajectory of the moving object, up to a predetermined waiting time (step S5: NO, step S12: NO).

When the face detection unit 9 eventually detects the face of a person (step S5: YES), the control unit 2 pauses the hand region detection operation of the hand region detection unit 10 at that point (step S6).

Next, the control unit 2 detects the skin color of the person's face portion from the face region in the image detected by the face detection unit 9 (step S7). Specifically, it obtains the H component value of the HSV color system for each pixel of the face region indicated by the region information supplied from the face detection unit 9, computes a color histogram, and takes the H component with the highest frequency as the color information indicating the skin color of the face portion.
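
The histogram step in S7 might look like the following sketch. The function name `dominant_hue`, the bin count of 36, and returning the center of the most frequent bin are assumptions chosen for illustration; the text only states that the most frequent H component is taken.

```python
import colorsys
from collections import Counter


def dominant_hue(face_pixels, bins=36):
    """Return the center of the most frequent hue bin, in [0, 1).

    `face_pixels` is an iterable of (r, g, b) tuples with 8-bit channels
    taken from the detected face region.
    """
    counts = Counter()
    for r, g, b in face_pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a hue of exactly 1.0 could not overflow the last bin.
        counts[min(int(h * bins), bins - 1)] += 1
    if not counts:
        raise ValueError("the face region contains no pixels")
    best_bin, _frequency = counts.most_common(1)[0]
    return (best_bin + 0.5) / bins
```

Taking the mode rather than the mean keeps eyes, hair, and lips inside the face region from pulling the result away from the skin tone, as long as skin pixels form the largest group.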

The control unit 2 then checks whether the detection color set in step S2 is substantially the same as the skin color of the face portion detected in step S7 (step S8). Specifically, it checks whether the difference between the detection color and the actual skin color of the face portion (the difference in H component values) is within a predetermined allowable range.
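
A minimal sketch of the S8 comparison follows. The names `hue_distance` and `is_same_color`, the default tolerance of 0.05, and the decision to measure the difference around the hue circle (so that hues just below 1.0 and just above 0.0 count as close) are assumptions; the text does not give the allowable range or say how wrap-around is handled.

```python
def hue_distance(h1, h2):
    """Shortest distance between two hues on the colour circle, hues in [0, 1)."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)


def is_same_color(detection_hue, face_hue, tolerance=0.05):
    """True if the detection color is within the allowable range of the
    facial skin color (step S8: YES in this sketch)."""
    return hue_distance(detection_hue, face_hue) <= tolerance
```

The wrap-around matters in practice because skin tones sit near the red end of the hue axis, where a plain subtraction could report a large difference between two visually similar colors.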

If the detection color is not substantially the same as the skin color of the face portion, that is, if it can be judged highly likely that the hand region detected so far by the hand region detection unit 10 does not correspond to the person's hand (step S8: NO), the control unit 2 discards the hand trajectory data accumulated so far (step S9).

Next, the control unit 2 updates the detection color to be detected by the hand region detection unit 10 to the facial skin color acquired in step S7 (step S10), and resumes the hand region detection operation of the hand region detection unit 10 (step S11). The control unit 2 then sequentially accumulates the position information of hand regions newly detected by the hand region detection unit 10 from then on as trajectory data indicating the movement trajectory of the hand (step S16).

Consequently, from this point on, only the position information of hand regions newly detected by the hand region detection unit 10 after the face was detected by the face detection unit 9 is accumulated as trajectory data indicating the movement trajectory of the hand.

Conversely, if the detection color is substantially the same as the skin color of the face portion, that is, if it can be judged that the hand region detected so far by the hand region detection unit 10 corresponds to the person's hand (step S8: YES), the control unit 2 immediately resumes the hand region detection operation of the hand region detection unit 10 (step S11). The control unit 2 then sequentially accumulates the position information of hand regions newly detected by the hand region detection unit 10 as trajectory data indicating the movement trajectory of the hand (step S16).

Consequently, from this point on, the position information of hand regions newly detected after the face was detected by the face detection unit 9 is accumulated as trajectory data following the existing trajectory data accumulated since before the face was detected.
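
The two branches of step S8 can be summarized in a short sketch. The function name `on_face_detected`, the list-of-tuples trajectory, and the tolerance value are hypothetical and only illustrate the control flow of steps S8 to S11 as described above.

```python
def on_face_detected(trajectory, detection_hue, face_hue, tolerance=0.05):
    """Return the (trajectory, detection_hue) pair to use when hand region
    detection resumes after a face has been found.

    Hues are in [0, 1) and compared on the colour circle (assumption).
    """
    d = abs(detection_hue - face_hue) % 1.0
    if min(d, 1.0 - d) <= tolerance:
        # S8: YES. Keep the existing trajectory and the current detection color.
        return trajectory, detection_hue
    # S8: NO. Discard the trajectory (S9) and adopt the facial skin color (S10).
    return [], face_hue
```

For example, `on_face_detected([(10, 12), (14, 15)], 0.08, 0.30)` would return an empty trajectory together with the new detection hue 0.30, so that gesture matching starts afresh from data gathered with the corrected color.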

On the other hand, if no person's face has been detected even after the predetermined waiting time has elapsed since face detection by the face detection unit 9 was started in step S3 (step S12: YES), the control unit 2 first ends hand region detection by the hand region detection unit 10 and face detection by the face detection unit 9 (step S13).

The control unit 2 then discards the hand trajectory data accumulated so far (step S14) and starts hand detection by shape recognition in the hand detection unit 11 (step S15). Thereafter, the control unit 2 sequentially accumulates the position information acquired by the hand detection unit 11, which indicates the position of the hand in the image, as trajectory data indicating the movement trajectory of the hand (step S16).
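
The decision made on each pass through steps S5 and S12 can be sketched as below. The function name `next_action`, the returned labels, and the parameters `elapsed_s` and `wait_limit_s` are hypothetical; the text does not state the length of the waiting time.

```python
def next_action(face_found, elapsed_s, wait_limit_s):
    """Decide what the controller does next while waiting for a face."""
    if face_found:
        # S5: YES. Pause color-based detection and compare colors (S6 to S8).
        return "pause_and_compare_colors"
    if elapsed_s >= wait_limit_s:
        # S12: YES. Stop both detectors, discard the trajectory,
        # and fall back to shape-based hand detection (S13 to S15).
        return "switch_to_shape_detection"
    # S5: NO and S12: NO. Keep accumulating trajectory data.
    return "keep_accumulating"
```

Checking for a face before checking the time limit mirrors the order of steps S5 and S12 in the description.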

Thereafter, each time the control unit 2 newly accumulates position information acquired by the hand region detection unit 10 or the hand detection unit 11 as trajectory data indicating the movement trajectory of the hand, it determines whether the hand movement trajectory indicated by the trajectory data, that is, the hand movement, matches a predetermined movement (gesture) (step S17). In other words, the control unit 2 performs gesture recognition.

As stated earlier, two types of gestures are recognized here. In step S17, the control unit 2 determines whether the hand movement trajectory indicated by the trajectory data matches the hand movement of at least one of the two gestures.
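
The description does not say how a trajectory is matched against a gesture. Purely as an illustration of what a check for a circle gesture could look like, the sketch below tests whether the points stay at a roughly constant distance from their centroid and wind almost a full turn around it. The function name `looks_like_circle` and both thresholds are hypothetical and are not taken from the source.

```python
import math


def looks_like_circle(points, min_turn=0.9 * 2 * math.pi, max_radius_spread=0.3):
    """Heuristic test of whether (x, y) trajectory points trace a circle."""
    if len(points) < 8:
        return False
    cx = sum(x for x, _y in points) / len(points)
    cy = sum(y for _x, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    mean_radius = sum(radii) / len(radii)
    if mean_radius == 0:
        return False
    # Reject shapes whose distance from the centroid varies too much.
    if (max(radii) - min(radii)) / mean_radius > max_radius_spread:
        return False
    # Sum the signed angular steps around the centroid.
    angles = [math.atan2(y - cy, x - cx) for x, y in points]
    total_turn = 0.0
    for a0, a1 in zip(angles, angles[1:]):
        step = (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
        total_turn += step
    return abs(total_turn) >= min_turn
```

A second test of the same kind (for example, for an S shape) would be needed to distinguish the single-shot gesture from the repeat gesture.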

The control unit 2 continues accumulating trajectory data until it can determine that the hand movement indicated by the trajectory data matches a predetermined movement (step S17: NO).

When it is eventually determined that the hand movement indicated by the trajectory data matches a predetermined movement (step S17: YES), the control unit 2 performs shooting processing with that determination as the trigger (step S18). That is, the control unit 2 acquires an image for recording through the imaging unit 4 and stores the acquired image, that is, the captured image, in the image storage unit 6.

Next, the control unit 2 checks whether shooting should end (step S19). That is, the control unit 2 checks whether the gesture recognized in step S17 was the one that causes gesture-triggered shooting to be performed only once.

If the gesture recognized in step S17 was the one that causes gesture-triggered shooting to be performed only once, so that shooting should end (step S19: YES), the control unit 2 ends the automatic shooting mode processing at that point.

On the other hand, if the gesture recognized in step S17 was the one that causes gesture-triggered shooting to be performed repeatedly, so that shooting should not end (step S19: NO), the control unit 2 discards the hand trajectory data accumulated so far (step S20). The control unit 2 then returns to step S4, starts accumulating new trajectory data, and repeats the processing from step S5 onward.

As a result, the user can take several shots in succession by performing, within the angle of view, the gesture that causes gesture-triggered shooting to be repeated. Also, for example, after one user has taken a picture of themselves and moved out of the angle of view, another user can move into the angle of view and perform a predetermined gesture to take a picture of themselves in turn.

Thereafter, once the control unit 2 has performed shooting processing some number of times by gesture recognition (step S18), if the gesture recognized in step S17 was the one that causes gesture-triggered shooting to be performed only once, so that shooting should end (step S19: YES), it ends the automatic shooting mode processing at that point.

Although omitted from FIG. 2, the control unit 2 performs the shooting processing of step S18 after a fixed time (for example, 0.5 seconds) has elapsed since it determined that the hand movement matched a predetermined movement, so that the user is photographed in the state after the gesture has finished.

As described above, when the digital camera 1 of this embodiment recognizes a gesture made by hand movement during operation in the automatic shooting mode, it obtains the position of the person's hand by detecting the hand region corresponding to the person's hand in the image based on hand color information. Therefore, as long as the person's face is present in the image, the position of the person's hand can be reliably detected even if the image does not contain enough body parts other than the face to allow the person's posture to be estimated. The position of the person's hand can thus be detected from images in a wider variety of states.

Moreover, in this embodiment, the position of the person's hand is acquired from before the person's face is detected, and once the face has been detected, the detection color is updated. This changes the operation of detecting the hand region from the image of each frame from a detection operation based on the reference color, which is a predetermined average hand color (that is, reference hand color information), to a detection operation based on the actual skin color of the person's face. Therefore, after the person's face has been detected, a more accurate position can be acquired for the person's hand than before the face was detected.

As a result, in this embodiment, gestures made by the movement of a person's hand can be recognized both faster and more accurately.

Also, in this embodiment, when the hand region is detected from the image of each frame, the reference color is set as the detection color before the person's face is detected, and a region corresponding to a moving object portion having the detection color or a color close to it is detected as the hand region. After the person's face has been detected, a region corresponding to a moving object portion having the actual skin color of the person's face, or a color close to that skin color, is detected as the hand region.

Therefore, the hand region corresponding to the person's hand can be detected more accurately from the image of each frame, and as a result, gestures made by the movement of the person's hand can be recognized even more accurately.

In the second and subsequent shots, the detection color used before the person's face is detected is the facial skin color detected during the previous shot.

Also, in this embodiment, if the detection color that was used to detect the hand region from the image of each frame before the person's face was detected is neither the same as nor similar to the skin color of the actual person's face acquired when the face was detected, the position information of the hand region stored before the face was detected is discarded.

In other words, when the detection color used before face detection is neither the same as nor similar to the skin color of the actual person's face, whether the hand movement matches a predetermined movement (gesture) is judged based only on the hand-region position information acquired and stored after the face was detected.

This prevents the gesture misrecognition that would otherwise occur when the series of hand-region position information acquired before face detection contains errors caused by the set detection color differing greatly from the skin color of the actual person's face.
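The discard rule can be illustrated with a short sketch. The similarity test `is_similar`, its tolerance `tol`, and the list-based trajectory are assumptions made for illustration; the source does not state how "same or similar" is evaluated.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
Point = Tuple[float, float]


def is_similar(a: Color, b: Color, tol: int = 60) -> bool:
    """Assumed similarity test: summed channel difference within a tolerance."""
    return sum(abs(x - y) for x, y in zip(a, b)) <= tol


def on_face_detected(trajectory: List[Point], used_color: Color,
                     face_color: Color) -> Color:
    """Discard the positions stored so far if they were obtained with a color
    unlike the facial skin color, then return the new detection color."""
    if not is_similar(used_color, face_color):
        trajectory.clear()   # positions stored before face detection are untrusted
    return face_color
```

After the call, gesture matching would run only on whatever remains in `trajectory` plus the positions appended from then on.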

Also, in this embodiment, when no person's face can be detected from the image of each frame within a predetermined waiting time, hand detection by shape recognition is performed on the image of each frame to acquire trajectory data (hand position information) indicating the movement trajectory of the hand. Therefore, even in a situation where the person's face cannot be detected, a gesture made by the person's hand movement can be reliably recognized.
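The fallback can be summarized as a small selection function. This is a sketch only: the names `Detector` and `select_detector` are hypothetical, and the default `wait_s` is a placeholder, since the source gives no value for the predetermined waiting time.

```python
from enum import Enum


class Detector(Enum):
    COLOR = "color"   # hand-region detection based on the detection color
    SHAPE = "shape"   # hand detection by shape recognition


def select_detector(face_found: bool, elapsed_s: float,
                    wait_s: float = 5.0) -> Detector:
    """Choose the source of hand position information for the current frame."""
    if not face_found and elapsed_s >= wait_s:
        return Detector.SHAPE   # no face within the waiting time
    return Detector.COLOR
```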

In this embodiment, the reference color, which is a predetermined average hand color, is set as the detection color used to detect the hand region from the image of each frame before the person's face has been detected. However, the detection color may instead be set to a color obtained by correcting the reference color toward the user's skin color. To correct the reference color, any information that can be used to infer the user's skin color may be used, for example the language setting of the digital camera 1 or the user's personal information (name, e-mail address, and the like).

Furthermore, a plurality of skin colors may be prepared in advance as reference colors, and the user may be asked to select the skin color to be used as the reference color when the automatic shooting mode is set or at an earlier stage. In that case, sample face photographs in which the face (skin) color has been processed into each of the different reference colors may be prepared, and the user may select any one face, that is, a reference color.

Also, although this embodiment describes the digital camera 1 including the face detection unit 9, the hand region detection unit 10, and the hand detection unit 11, when the present invention is implemented the image recognition processes performed by each of these units may instead be performed by the control unit 2 in accordance with a program stored in advance in the program storage unit 7.

The present invention is not limited to digital cameras and can be applied to any device, such as other home appliances. Furthermore, the present invention can also be applied to a configuration in which position information indicating the position of a person's hand in the image of each frame is supplied to an external device in real time by wire or wirelessly, and gesture recognition is performed by the external device.

Several embodiments of the present invention and their modifications have been described above. They may be changed as appropriate within the range in which the effects of the present invention are obtained, and the embodiments after such changes are also included in the scope of the invention described in the claims and of inventions equivalent to it.
The inventions described in the claims as originally filed in the present application are appended below.
[Claim 1]
A position detection device comprising: face detection means for detecting a person's face from each of the images constituting a moving image; skin color detection means for detecting color information of the skin of the face detected by the face detection means; hand region detection means for detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information; position acquisition means for acquiring position information indicating the position, in the image, of the hand region detected by the hand region detection means; and detection control means for changing the hand color information used in the hand region detection operation of the hand region detection means based on the skin color information detected by the skin color detection means.
[Claim 2]
The position detection device according to claim 1, wherein the hand region detection means detects a hand region corresponding to a person's hand based on reference hand color information when no face has been detected by the face detection means.
[Claim 3]
The position detection device according to claim 1 or 2, wherein the detection control means changes the hand color information used in the hand region detection operation of the hand region detection means each time the skin color information detected by the skin color detection means changes.
[Claim 4]
The position detection device according to claim 1, 2, or 3, further comprising: storage means for sequentially storing the position information, acquired by the position acquisition means, indicating the position of the hand region in each of the images; and recognition means for recognizing a predetermined gesture made by movement of a person's hand based on the series of position information sequentially stored in the storage means.
[Claim 5]
The position detection device according to claim 4, further comprising judgment means for judging whether the color indicated by the hand color information used in the hand region detection operation of the hand region detection means is the same as or similar to the color indicated by the skin color information detected by the skin color detection means, wherein, when the judgment means judges that the color indicated by the hand color information used in the hand region detection operation is neither the same as nor similar to the color indicated by the skin color information detected by the skin color detection means, the recognition means recognizes the predetermined gesture made by movement of the person's hand based only on the position information acquired by the position acquisition means and stored in the storage means after the face was detected by the face detection means.
[Claim 6]
The position detection device according to any one of claims 1 to 5, wherein the hand region detection means detects, from each of the images, as the hand region, a region corresponding to a moving-body part having the color indicated by the hand color information or a color similar to that color.
[Claim 7]
The position detection device according to any one of claims 1 to 6, further comprising hand detection means for detecting a person's hand from each of the images using a shape recognition technique when the face detection means cannot detect a person's face within a predetermined time after the start of the detection operation, wherein, when the face detection means cannot detect a person's face within the predetermined time after the start of the detection operation, the position acquisition means acquires position information indicating the position, in each of the images, of the person's hand detected by the hand detection means, instead of the position information indicating the position in the image of the hand region detected by the hand region detection means.
[Claim 8]
A position detection method comprising: a step of detecting a person's face from each of the images constituting a moving image; a step of detecting the skin color of the detected face; a step of detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information; a step of acquiring position information indicating the position of the detected hand region in the image; and a step of changing the hand color information used for detecting the hand region based on the color information of the skin of the face.
[Claim 9]
A program causing a computer of a position detection device, which detects the position of a person's hand from each of the images constituting a moving image, to execute: a process of detecting a person's face from each of the images constituting the moving image; a process of detecting color information of the skin of the detected face; a process of detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information; a process of acquiring position information indicating the position of the detected hand region in the image; and a process of changing the hand color information used for detecting the hand region based on the color information of the skin of the face.

DESCRIPTION OF SYMBOLS
1 Digital camera
2 Control unit
3 Lens unit
4 Imaging unit
5 Display unit
6 Image storage unit
7 Program storage unit
8 Operation unit
9 Face detection unit
10 Hand region detection unit
11 Hand detection unit

Claims

1. A position detection device comprising:
face detection means for detecting a person's face from each of the images constituting a moving image;
skin color detection means for detecting color information of the skin of the face detected by the face detection means;
hand region detection means for detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information;
position acquisition means for acquiring position information indicating the position, in the image, of the hand region detected by the hand region detection means; and
detection control means for changing the hand color information used in the hand region detection operation of the hand region detection means based on the skin color information detected by the skin color detection means.

2. The position detection device according to claim 1, wherein the hand region detection means detects a hand region corresponding to a person's hand based on reference hand color information when no face has been detected by the face detection means.

3. The position detection device according to claim 1 or 2, wherein the detection control means changes the hand color information used in the hand region detection operation of the hand region detection means each time the skin color information detected by the skin color detection means changes.

4. The position detection device according to claim 1, 2, or 3, further comprising:
storage means for sequentially storing the position information, acquired by the position acquisition means, indicating the position of the hand region in each of the images; and
recognition means for recognizing a predetermined gesture made by movement of a person's hand based on the series of position information sequentially stored in the storage means.

5. The position detection device according to claim 4, further comprising judgment means for judging whether the color indicated by the hand color information used in the hand region detection operation of the hand region detection means is the same as or similar to the color indicated by the skin color information detected by the skin color detection means,
wherein, when the judgment means judges that the color indicated by the hand color information used in the hand region detection operation is neither the same as nor similar to the color indicated by the skin color information detected by the skin color detection means, the recognition means recognizes the predetermined gesture made by movement of the person's hand based only on the position information acquired by the position acquisition means and stored in the storage means after the face was detected by the face detection means.

6. The position detection device according to any one of claims 1 to 5, wherein the hand region detection means detects, from each of the images, as the hand region, a region corresponding to a moving-body part having the color indicated by the hand color information or a color similar to that color.

7. The position detection device according to any one of claims 1 to 6, further comprising hand detection means for detecting a person's hand from each of the images using a shape recognition technique when the face detection means cannot detect a person's face within a predetermined time after the start of the detection operation,
wherein, when the face detection means cannot detect a person's face within the predetermined time after the start of the detection operation, the position acquisition means acquires position information indicating the position, in each of the images, of the person's hand detected by the hand detection means, instead of the position information indicating the position in the image of the hand region detected by the hand region detection means.

8. A position detection method comprising:
a step of detecting a person's face from each of the images constituting a moving image;
a step of detecting the skin color of the detected face;
a step of detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information;
a step of acquiring position information indicating the position of the detected hand region in the image; and
a step of changing the hand color information used for detecting the hand region based on the color information of the skin of the face.

9. A program causing a computer of a position detection device, which detects the position of a person's hand from each of the images constituting a moving image, to execute:
a process of detecting a person's face from each of the images constituting the moving image;
a process of detecting color information of the skin of the detected face;
a process of detecting, from each of the images constituting the moving image, a hand region corresponding to a person's hand based on hand color information;
a process of acquiring position information indicating the position of the detected hand region in the image; and
a process of changing the hand color information used for detecting the hand region based on the color information of the skin of the face.