JP2014081788A - Hand gesture recognition device and control method thereof - Google Patents

Hand gesture recognition device and control method thereof

Info

Publication number
JP2014081788A
Authority
JP
Japan
Prior art keywords
hand gesture
hand
image
area
means
Prior art date
Legal status
Granted
Application number
JP2012229238A
Other languages
Japanese (ja)
Other versions
JP2014081788A5 (en)
JP6103875B2 (en)
Inventor
Tsuneichi Arai
常一 新井
Original Assignee
Canon Inc
キヤノン株式会社
Priority date
Filing date
Publication date
Application filed by Canon Inc (キヤノン株式会社)
Priority to JP2012229238A
Publication of JP2014081788A
Publication of JP2014081788A5
Application granted
Publication of JP6103875B2
Application status: Active
Anticipated expiration

Abstract

PROBLEM TO BE SOLVED: To provide a technology for recognizing the hand gestures of a plurality of users with high accuracy. SOLUTION: A hand gesture recognition device recognizes the faces of users photographed with a camera, determines a hand gesture detection area unique to each user, and recognizes the movement of hand images within that detection area as a hand gesture. When the device pays attention to one of the hand gesture detection areas, any hand image that has entered the attention detection area from outside is deleted from the hand images in that area, and the hand gesture is recognized from the remaining hand images.

Description

  The present invention relates to a technique for recognizing an operator's hand gesture and performing a corresponding process.

In recent years, a variety of three-dimensional sensing technologies have been developed, and apparatuses have appeared that recognize an operator's hand movement as an instruction and perform the corresponding information operation. When this technology is applied to home appliances such as a TV, operations can be performed from a distance without a remote controller.

However, many homes have several occupants who, in a limited space, often sit next to each other. In view of this point, the document cited below discloses a technique for handling spatial gestures made by a plurality of people.

In the gesture recognition method of Patent Document 1, a plurality of subjects are imaged by an omnidirectional visual sensor, and an image processing apparatus divides the range containing the people equally into a 3 × 3 grid. The image is thereby divided into sub-images each containing one subject, and these sub-images are handed over to a step that obtains time-difference images, from which recognition is performed.

Japanese Patent No. 3784474

However, with the gesture recognition method of Patent Document 1, recognition may be difficult in situations where the gesturing hands of neighboring users interfere with each other, such as when users sit side by side.

  The present invention has been made in view of such a problem, and an object of the present invention is to provide a technique for recognizing each user's hand gesture with high accuracy even when there are a plurality of users.

In order to solve this problem, the hand gesture recognition device of the present invention has, for example, the following configuration. That is,
a hand gesture recognition device for recognizing, as a hand gesture, the movement of a user's hand in images captured in time series by an imaging means, comprising:
face detection means for detecting a face area of at least one user in the images captured by the imaging means;
determining means for determining, for each face area detected by the face detection means, an area at a preset relative position as a detection area for detecting a hand gesture;
deleting means for, when attention is paid to one of the detection areas determined by the determining means, deleting from the attention detection area the image of a hand that is detected, from the images captured in time series, to have entered the attention detection area from outside it; and
hand gesture recognition means for recognizing a hand gesture in the attention detection area from the time-series hand images remaining in the attention detection area after deletion by the deleting means.

  According to the present invention, even if there are a plurality of users, it is possible to recognize each user's hand gesture with high accuracy.

The figure showing the usage pattern in embodiment. The block diagram of the information equipment in an embodiment. The processing block diagram of the example of an embodiment. The figure showing the detected face area | region and the hand gesture start area | region of hand gesture. The conceptual diagram showing the relationship between a hand gesture and a hand gesture start area | region. The flowchart of a hand gesture start area | region determination process. The flowchart of a hand gesture start area | region adjustment process. The flowchart of the process which classifies gesture data. The conceptual diagram showing before the adjustment at the time of hand gesture start area | region adjustment processing, and after adjustment. The processing block diagram of 2nd Embodiment. The figure which shows the example of a display of 2nd Embodiment. The flowchart of a hand gesture start area | region display process. The processing block diagram of 3rd Embodiment. The figure which shows the processing content of 3rd Embodiment. The flowchart of a hand gesture image detection correction process. It is a flowchart of an interpolation process. The figure for demonstrating image deletion and an interpolation process. The figure for demonstrating image deletion and an interpolation process. The figure for demonstrating image deletion and an interpolation process.

  Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

[First Embodiment]
FIG. 1 shows the usage pattern of the apparatus of the embodiment. In the figure, reference numeral 1 denotes a camera unit serving as an imaging means; it is installed on the TV and captures the user's movements in images taken in time series (at a preset frame rate). Reference numeral 2 denotes a TV in which the device of this embodiment is incorporated. A hand gesture performed by the user is photographed by the camera unit, the movement of the hand is recognized as a command, and channel changes, two-screen operations, and the like are performed in place of a TV remote control. Reference numerals 3 and 4 represent users sitting on a sofa. The present embodiment shows an example in which two seated users operate the TV 2 with hand gestures.

FIG. 2 is a block diagram of the information processing apparatus according to the embodiment of the present invention. The apparatus includes a display unit 5, a camera unit 6, a hard disk 7, a CPU 8, a RAM 9, a ROM 10, and a bus 11 connecting them. Since the feature of the apparatus in the embodiment lies in the detection of the user's hand gestures and the processing applied to them, the circuits for functioning as the TV 2 (for example, a tuner circuit) are not shown in the drawing.

The display unit 5 is a general display unit that includes a liquid crystal display element, a liquid crystal control circuit, and a display memory, and is connected to the CPU 8 via the system bus 11. It executes image display and character display in accordance with instructions from the CPU 8. The display unit 5 may be any general display device that can be digitally controlled.

The camera unit 6 may likewise be a general one; it includes a sensor such as a CCD, a lens, and a captured-image memory, and is connected to the CPU 8 via the system bus 11. In response to instructions from the CPU 8, it zooms the lens, captures images, and transfers the captured images to the RAM 9.

The hard disk 7 records data such as recorded TV programs. The CPU 8 is connected to the RAM 9, the ROM 10, the hard disk 7, and the like via the system bus 11, and operates according to programs stored in the ROM 10. The RAM 9 is used as a work area for the CPU 8.

FIG. 3 is a functional configuration diagram of the apparatus according to the present embodiment. Each function is realized by the hardware resources described above and programs executed by the CPU 8.

The hand gesture image recording unit 12 is composed of the camera unit 6 and a processing program that uses it. It detects the hand gesture images of a plurality of users and sends them to the hand gesture image control unit 14.

The hand gesture start area determination unit 13 detects face images in the image captured by the camera unit 6 and determines a hand gesture start area (detection area) for each user based on the position of the face image. This process is repeated for the number of detected users. When there are a plurality of hand gesture start areas and they are adjacent to each other, the size of each hand gesture start area is adjusted based on the position of the adjacent area. The determined hand gesture start area information is then sent to the hand gesture image control unit 14. Hereinafter, the user whose gesture is to be recognized in the hand gesture start area of interest is referred to as the target.

The hand gesture image control unit 14 removes, from the hand movements recorded within the detection area of interest, the image of any hand that has entered the area from outside. Each user performs hand gesture operations within his or her own hand gesture start area, and the hand gesture image control unit 14 removes noise images caused by the hand movements of users who are not the target of the area, by disturbance light, and the like. The hand gesture image control unit 14 is composed of a deleted image determination unit 14a, which determines which images have entered the target detection area from outside it (regardless of whether they later move out of the area again), and a designated image deletion unit 14b, which deletes the images so determined. Details of the processing will be described later along the flowcharts.

The hand gesture recognition unit 15 extracts feature amounts from the photographed hand movement, matches them against the feature amounts of a hand gesture dictionary, and outputs the closest hand gesture command as the recognition result. For example, a hand rotation is registered in the gesture dictionary as a gesture command for changing the TV volume: the volume is increased when a clockwise rotation is recognized and decreased when a counterclockwise rotation is recognized. Likewise, upward and downward hand movements are registered as gesture commands, the TV channel number being incremented by an upward motion and decremented by a downward motion. By registering gesture commands in the gesture dictionary in this way, a device for operating the TV by hand is realized. The hand gesture recognition unit 15 performs control processing (channel change, volume change, and so on) according to the recognized hand gesture; this processing itself depends on the device to be controlled. In the present embodiment the description assumes that the controlled device is a TV, but the present invention is not limited to this.
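As an illustrative, non-limiting sketch, the dispatch from a recognized gesture code to a TV control action described above could look like the following Python fragment; the command table and the tv_control interface (change_volume, change_channel) are assumptions introduced here, not part of the embodiment.

```python
# Hypothetical mapping from recognized gesture codes to TV control actions,
# following the examples in the text (rotation -> volume, up/down -> channel).
GESTURE_COMMANDS = {
    "ROTATE_RIGHT": ("volume", +1),
    "ROTATE_LEFT":  ("volume", -1),
    "HAND_UP":      ("channel", +1),
    "HAND_DOWN":    ("channel", -1),
}

def dispatch_gesture(code, tv_control):
    """Look up the recognized gesture code and apply the registered command."""
    action = GESTURE_COMMANDS.get(code)
    if action is None:
        return                              # unregistered gesture: ignore
    target, delta = action
    if target == "volume":
        tv_control.change_volume(delta)     # assumed controller interface
    else:
        tv_control.change_channel(delta)
```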

FIG. 4 is a conceptual diagram showing detected face areas and the hand gesture start areas corresponding to them. Reference numeral 16 in the figure is the rectangular face area detected for user A, and 17 is the rectangular face area detected for user B. Reference numeral 18 denotes the hand gesture start area (detection area for user A's gestures) set based on the face area 16 of user A, and 19 denotes the hand gesture start area (detection area for user B's gestures) set based on the face area 17 of user B.

FIG. 5A is a conceptual diagram showing the relationship between a hand gesture and the hand gesture start areas. The same reference numerals as in FIG. 4 have the same meaning: 18 denotes the hand gesture start area of user A's hand gesture, and 19 denotes that of user B. Reference numeral 20 denotes the entire image area of the camera. Reference numeral 21 represents the start position of user A's hand gesture and 22 its end position. Reference numeral 23 denotes the start point of an unconscious hand movement by user B, and 24 its end point. When a plurality of people sit next to each other in this way, the hand movement of the neighboring person may interfere.

FIG. 5B is a conceptual diagram showing the relationship between the hand gesture and the hand gesture start area when attention is focused only on the hand gesture start area of user A in FIG. 5A.

FIG. 5C is a conceptual diagram after the hand gesture image data coming from outside the hand gesture start area has been deleted from the state of FIG. 5B. Since the hand movement of user B from 23 to 24 shown in FIG. 5B enters from outside the hand gesture start area, its data is deleted by the hand gesture image control unit 14 of FIG. 3, and only the image data of user A's hand gesture movement from 21 to 22 is used.

  FIG. 6 is a flowchart of the hand gesture start area determination process.

In step S6-1, in response to the activation of the apparatus, the work area and the like are secured and initialized, and the hand gesture start area determination process is started.

In step S6-2, the hand gesture start area determination unit 13 detects the face of at least one user from the camera image. First, an image such as that shown in FIG. 4 is captured by the camera unit 6. Face candidate areas are detected by extracting oval areas of a certain size or larger from the captured image, and more probable face areas are then obtained by extracting image feature values from the candidate areas and matching them against a face dictionary. Face areas such as 16 and 17 in FIG. 4 are detected in this way, and the rectangle information of each detected face area is passed to step S6-3.
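A minimal sketch of this face detection step, assuming OpenCV is available: the embodiment describes oval-region extraction followed by face-dictionary matching, and the Haar cascade below is only a stand-in detector for that combination.

```python
# Sketch of step S6-2 (face detection) using OpenCV as a stand-in detector.
import cv2

_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_rects(frame_bgr):
    """Return a list of (x, y, w, h) face rectangles found in one camera frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = _cascade.detectMultiScale(gray, scaleFactor=1.1,
                                      minNeighbors=5, minSize=(60, 60))
    return [tuple(map(int, f)) for f in faces]
```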

In step S6-3, the hand gesture start area determination unit 13 determines each hand gesture start area at a preset relative position calculated from the position and size of the corresponding rectangular face area. In the example of FIG. 4, the hand gesture start area 18 is determined from the position of the face area 16: for example, it is placed two face heights below the face area, its vertical length is the same as the face height, and its horizontal width is 1.5 times the face width. These values are only an example; the actual values are determined by sampling the hand gesture movements of a plurality of people. A learning process may also be introduced so that the values are corrected automatically if misrecognitions are frequent.
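The relative placement described above can be sketched as follows (Python); centring the area horizontally under the face and measuring the vertical offset from the top of the face are assumptions made here, since the text leaves those details open.

```python
# Sketch of step S6-3: derive a hand gesture start area from a face rectangle,
# using the example values in the text (two face heights below, same height,
# 1.5 times the face width). Image coordinates: y grows downward.
def start_area_from_face(face_rect):
    """face_rect = (x, y, w, h); returns the hand gesture start area (x, y, w, h)."""
    fx, fy, fw, fh = face_rect
    area_w = int(1.5 * fw)                 # 1.5 times the face width
    area_h = fh                            # same height as the face
    area_x = fx + fw // 2 - area_w // 2    # assumed: centred under the face
    area_y = fy + 2 * fh                   # two face heights below (one reading)
    return (area_x, area_y, area_w, area_h)
```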

In step S6-4, the hand gesture start area determination unit 13 sorts the hand gesture start areas along the X axis. In the example of FIG. 4, even if the face area 16 is detected after the face area 17, that is, even if user A arrives after user B, sorting the hand gesture start area rectangles along the X axis allows the subsequent processing to be performed in a consistent order.

In step S6-5, the hand gesture start area determination unit 13 reads the data of one hand gesture start area. In the example of FIG. 4, the data of the face area 16 is read first; the next time this step is reached, the data of the face area 17 is read.

In step S6-6, the hand gesture start area determination unit 13 performs the hand gesture start area adjustment process, described in detail with reference to the flowchart of FIG. 7. For example, as shown in FIG. 9, the hand gesture start areas 23 and 24 set at the face positions partially overlap; by adjusting them, they are set to separated positions as in the areas 25 and 26. If the areas are separated from the start, no adjustment is needed and this process is skipped.

In step S6-7, the hand gesture start area determination unit 13 checks whether all hand gesture start areas have been processed. If the faces of two users were detected, two hand gesture start areas are checked; if there are three users, three areas are checked.

In step S6-8, the hand gesture start area determination process ends, and the hand gesture start area determination unit 13 sends the information on the determined hand gesture start areas to the hand gesture image control unit 14.

  FIG. 7 is a flowchart of the hand gesture start area adjustment process. This process represents the detailed process of step S6-6 described above.

  In step S7-1, the hand gesture start area determination unit 13 starts a hand gesture start area adjustment process.

In step S7-2, the hand gesture start area determination unit 13 sets the hand gesture start area of the target hand gesture image. In the example of FIG. 9, the coordinate values Xmin, Xmax, Ymin, and Ymax of the rectangular area 23 are set.
In step S7-3, the hand gesture start area determination unit 13 sets the adjacent hand gesture start area. In the example of FIG. 9, the coordinates of the rectangular area 24 are set as RXmin, RXmax, RYmin, and RYmax. This embodiment is described for two hand gesture start areas; when there are three, the middle area is adjusted with the areas on its left and right as the adjacent data.

In step S7-4, it is checked whether Xmax > RXmin and Xmax < RXmax. If the condition is met, the process proceeds to step S7-5; otherwise, it proceeds to step S7-7. In the example of FIG. 9, the area 23 is adjacent to its neighbor on the right side, so the process proceeds to step S7-5 to correct the right frame.
When adjacent hand gesture start areas overlap in the X-axis direction, the amount of overlap is represented by DX.

In step S7-5, the hand gesture start area determination unit 13 sets the width DX, where DX = RXmin − Xmax. In step S7-6, a new value of Xmax is set: Xmax = Xmax − DX.

In step S7-7, it is checked whether Xmin < RXmax and Xmin > RXmin. If the condition is met, the process proceeds to step S7-8; otherwise, it proceeds to step S7-10 and the process ends. In the example of FIG. 9, the area 24 is adjacent to its neighbor on the left side, so the process proceeds to step S7-8 and the left frame is corrected.

In step S7-8, the hand gesture start area determination unit 13 sets the width DX, where DX = RXmin − Xmin. In step S7-9, a new value of Xmin is set: Xmin = Xmin + DX. Although only the horizontal adjustment has been described in this flowchart, a vertical adjustment may also be needed in some cases; it is the same adjustment process with X replaced by Y.

  In step S7-10, the hand gesture start area adjustment process ends.
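A hedged sketch of this adjustment flow is given below (Python). The sign conventions of the translated formulas are ambiguous, so the sketch simply clips the overlapping edge of the focused area to the neighbor's boundary, which is consistent with FIG. 9, where overlapping areas end up separated; treat it as one possible reading rather than the patent's exact arithmetic.

```python
# Sketch of the adjustment in FIG. 7 (steps S7-2 to S7-9), one area at a time.
def adjust_start_area(area, neighbor):
    """area, neighbor = (xmin, ymin, xmax, ymax); returns the adjusted area."""
    xmin, ymin, xmax, ymax = area
    rxmin, rymin, rxmax, rymax = neighbor
    if rxmin < xmax < rxmax:        # right edge overlaps the neighbor (S7-4..S7-6)
        xmax = rxmin
    if rxmin < xmin < rxmax:        # left edge overlaps the neighbor (S7-7..S7-9)
        xmin = rxmax
    return (xmin, ymin, xmax, ymax)
```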

  FIG. 8 is a flowchart of processing for selecting gesture data based on the hand gesture start area.

  In step S8-1, the hand gesture image control unit 14 starts a process of selecting gesture data based on the hand gesture start area.

In the flowchart, this processing is called in a state where the hand gesture start areas have been set and the image data of the gesturing hands has been acquired by the camera unit 6.

In step S8-2, the deleted image determination unit 14a reads the image data within the hand gesture start area into a buffer area in time series. In the example of FIG. 5B, the hand image data within the region 18, namely the hand images 21 to 22 and the hand images 23 to 24, is read.

In step S8-3, the deleted image determination unit 14a detects the image data of any hand that has crossed into the hand gesture start area from the outside. In the example of FIG. 5B, the hand images 21 to 22 are retained because their motion is completed within the region 18, whereas the hand images 23 to 24 cross the boundary of the hand gesture start region from the outside to the inside, so the data of the hand images 23 to 24 is detected.

In step S8-4, the designated image deletion unit 14b deletes from the buffer the hand image data detected in the previous step. In the example of FIG. 5B, the detected images are deleted, resulting in the state shown in FIG. 5C.
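Steps S8-2 to S8-4 amount to keeping only those hand trajectories that start inside the focused area; a minimal sketch, assuming each trajectory is a time-ordered list of hand-centre points, is:

```python
# Sketch of steps S8-2 to S8-4: drop any trajectory whose earliest point lies
# outside the focused start area (it entered from outside the area).
def point_in_area(pt, area):
    x, y = pt
    xmin, ymin, xmax, ymax = area
    return xmin <= x <= xmax and ymin <= y <= ymax

def filter_trajectories(trajectories, area):
    kept = []
    for traj in trajectories:       # traj: [(x0, y0), (x1, y1), ...] in time order
        if traj and point_in_area(traj[0], area):
            kept.append(traj)       # started inside the area: candidate gesture
        # otherwise the hand crossed in from outside and its data is deleted
    return kept
```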

In step S8-5, the hand gesture recognition means 15 detects, from the time-series image data stored in the buffer, a hand image that does not move even after a predetermined time has elapsed. A series of hand movements, from one stop through a period of motion to the next stop, is then recognized as one action. In the example of FIG. 5C, the hand image 21 stops at its position for a certain period of time, then moves to the right, rises slightly, and stops at the position of the hand image 22.

In step S8-6, the hand gesture recognition means 15 extracts the movement trajectory of the time-series hand center coordinates between the detected stop images. In step S8-7, the hand gesture recognition unit 15 extracts feature amounts from the trajectory data; for example, the trajectory of the hand images 21 to 22 is divided into 10 equal parts and converted into direction vectors. In the present embodiment, the direction vectorization is performed by dividing the 360-degree range of direction angles into 8 or 16 equal parts.
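The trajectory-to-feature conversion of steps S8-6 and S8-7 can be sketched as follows; resampling by cumulative arc length is an implementation choice made here, not something the text prescribes.

```python
# Sketch of steps S8-6/S8-7: resample the hand-centre trajectory into 10 equal
# segments and quantize each segment's direction into 8 (or 16) codes.
import math

def direction_codes(trajectory, segments=10, bins=8):
    dists = [0.0]                                   # cumulative arc length
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    if total == 0:
        return []
    samples = []                                    # points at equal arc-length steps
    for i in range(segments + 1):
        target = total * i / segments
        j = next((k for k, d in enumerate(dists) if d >= target), len(dists) - 1)
        samples.append(trajectory[j])
    codes = []                                      # quantized direction per segment
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(angle / (2 * math.pi / bins)) % bins)
    return codes
```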

In step S8-8, the hand gesture recognition unit 15 determines a gesture code by matching the extracted feature amounts against those of the dictionary. For example, if the input data traces the shape of an "L" and the direction vector data for "L" is stored in the dictionary, the gesture code for "L" is determined.
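A toy illustration of this dictionary matching, assuming the direction codes produced above; the distance measure (sum of circular code differences) and the "L" template are illustrative choices, not the embodiment's dictionary.

```python
# Sketch of step S8-8: nearest-template matching of direction-code sequences.
GESTURE_DICTIONARY = {
    # "L": roughly five segments downward then five to the right
    # (assumes 8 bins, bin 0 = +x, y growing downward so "down" = bin 2)
    "L": [2, 2, 2, 2, 2, 0, 0, 0, 0, 0],
}

def code_distance(a, b, bins=8):
    d = abs(a - b) % bins
    return min(d, bins - d)          # circular difference between direction codes

def match_gesture(codes, dictionary=GESTURE_DICTIONARY, bins=8):
    best, best_cost = None, float("inf")
    for name, template in dictionary.items():
        if len(template) != len(codes):
            continue
        cost = sum(code_distance(a, b, bins) for a, b in zip(codes, template))
        if cost < best_cost:
            best, best_cost = name, cost
    return best
```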

  In step S8-9, this process ends.

FIG. 9 is a conceptual diagram showing the hand gesture start areas before and after the adjustment processing. By performing the processing of the present embodiment in this way, a hand gesture start area can be determined for each user. In each hand gesture start area, gestures are recognized while excluding the images of hands entering from outside the area, so that even if two adjacent people perform hand gestures, comfortable TV viewing can be realized.

[Second Embodiment]
In the second embodiment, a mechanism is described that notifies the user of the extent of the hand gesture start area and clearly indicates to the user that hand gesture recognition can start once a hand is detected in that area. The first embodiment described processing that sets a hand gesture region by internal processing and prevents erroneous recognition. In the second embodiment, by clearly presenting the set area to the user, the position where the user performs the gesture can be adjusted, providing a user interface that is easier to understand and more accurate.

In the second embodiment, a hand gesture start area state display unit 27 is added to the configuration of the first embodiment so that the user can be notified of the hand gesture start area.

FIG. 10 is a functional configuration diagram of the apparatus according to the second embodiment. Components with the same reference numerals as in the first embodiment are identical, and their description is omitted. The hand gesture start area state display unit 27 displays area information representing each hand gesture start area on the screen of the TV 2 in FIG. 1, in order to clearly indicate to the user the hand gesture start area determined by the hand gesture start area determination unit 13.

FIG. 11 is a display example of the second embodiment. The TV screen is split into two: the program of channel A being viewed by user A is displayed in area 28, and the program of channel B being viewed by user B is displayed in area 29. Area 30 indicates user A's hand gesture start area, and area 31 indicates user B's. In this embodiment as well, when a hand is detected to have remained stationary in the detection area for a predetermined time or longer, the process proceeds to hand gesture recognition. At that time, a predetermined mark (an image of a hand in the figure) is displayed in the frame, as in area 30, to notify the user that the gesture can be recognized at the current hand position. In this example the program images and the hand gesture start areas are shown at different positions, but part of the area may of course be displayed as an overlay on the program image. Alternatively, an LED may be provided in the frame of the TV screen so that the start of hand gesture recognition is signalled by its blinking; a multicolor LED may be used to indicate recognition start, recognition in progress, and recognition end.

  FIG. 12 is a flowchart of the area display process of the hand gesture start area.

  In step S12-1, the hand gesture start area state display unit 27 starts an area display process of the hand gesture start area.

In step S12-2, the hand gesture start area information determined by the hand gesture start area determination unit 13 in the hand gesture start area adjustment processing is read. In step S12-3, the hand gesture start area state display unit 27 estimates the user position from the information of the camera unit 6; the current position of the user can be estimated from the shooting distance and the face size obtained from the camera information. In step S12-4, the hand gesture start area state display unit 27 determines the frame size according to the user position information and displays the corresponding frame on the screen. The frame serves as a guide indicating the hand gesture start area to the user; by looking at the guide frame, the user can tell where to hold the hand so that it will be recognized.
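Steps S12-3 and S12-4 can be sketched as follows; the reference face width, reference distance, and base frame size are arbitrary assumptions used only to show how the guide frame could scale with the estimated distance.

```python
# Sketch of steps S12-3/S12-4: estimate the user's distance from the detected
# face width and scale the on-screen guide frame accordingly.
REF_FACE_W_PX, REF_DISTANCE_M = 120.0, 2.0   # assumed calibration point
BASE_FRAME_W, BASE_FRAME_H = 200, 150        # assumed frame size at 2 m

def estimate_distance_m(face_w_px):
    return REF_DISTANCE_M * REF_FACE_W_PX / max(face_w_px, 1)

def guide_frame_size(face_w_px):
    """A nearer user (larger face) gets a larger guide frame on the TV screen."""
    scale = REF_DISTANCE_M / estimate_distance_m(face_w_px)
    return int(BASE_FRAME_W * scale), int(BASE_FRAME_H * scale)
```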

In step S12-5, the hand gesture start area state display unit 27 checks whether the guide display process has been completed for all hand gesture start areas. In the example of FIG. 11, the loop ends when two guide frames have been displayed; if three people are watching, the above process is repeated until three guide frames are displayed.

  In step S12-6, the camera unit 6 performs processing for reading the current image of the hand gesture start area.

In step S12-7, the hand gesture start area state display unit 27 checks whether there is a hand in the area. If a skin-colored, hand-sized area is detected, the process proceeds to step S12-8; otherwise, it proceeds to step S12-9. In this embodiment, when a skin-colored area of roughly hand size is detected, matching is additionally performed against an image dictionary in which hand shapes are registered, so that objects other than a hand are prevented from being erroneously recognized.
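A minimal sketch of this hand-presence check, assuming OpenCV and an HSV skin-color threshold; the threshold values and minimum pixel count are rough assumptions, and the hand-shape dictionary matching described above is omitted.

```python
# Sketch of step S12-7: is a roughly hand-sized skin-coloured region present
# inside the hand gesture start area?
import cv2
import numpy as np

SKIN_LO = np.array([0, 40, 60], dtype=np.uint8)      # assumed HSV skin range
SKIN_HI = np.array([25, 180, 255], dtype=np.uint8)

def hand_present(frame_bgr, area, min_area_px=1500):
    x, y, w, h = area
    roi = frame_bgr[y:y + h, x:x + w]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LO, SKIN_HI)
    return cv2.countNonZero(mask) >= min_area_px
```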

In step S12-8, the hand gesture start area state display unit 27 shows a hand display within the hand gesture start area guide display. It is also possible to cut out the skin-colored pixels detected in the camera image and display them.

This is the processing of the part that clearly indicates to the user that the user's hand is present in the hand gesture start area.

In step S12-9, the hand gesture start area state display unit 27 checks whether the hand check has been completed for all hand gesture start areas. If all checks have been completed, the process proceeds to step S12-10; otherwise, it returns to step S12-6. In the example of FIG. 11, this loop ends when the hand check for the two hand gesture start areas is finished. In step S12-10, the area display process of the hand gesture start area is terminated.

  By performing the processing in this way, the user can be informed of the hand gesture start area and an easy-to-use machine can be realized.

As described above, an example of an embodiment of the present invention has been explained in detail, but the present invention can take the form of, for example, a system, an apparatus, a method, a program, or a storage medium. Specifically, the present invention may be applied to a system composed of a plurality of devices or to an apparatus composed of a single device.

[Third Embodiment]
In the third embodiment, the overlap of the hand making the target hand gesture with another hand is detected, and the portion of the image that disappears due to the overlap with the other hand is corrected.

In the first embodiment, processing was described that sets a hand gesture area for each user and prevents misrecognition caused by confusion with another user's gesture. The target hand gesture and the hand gesture to be deleted may nevertheless overlap: viewed from the camera, the hand gesture to be deleted may pass above or below the target hand gesture. When the target gesture is hidden by a gesture made above it, it is not captured by the camera, so if the upper hand gesture image is deleted, a blank portion remains and the flow of the lower hand gesture image is divided. If the divided image is recognized as it is, it differs from the normal image, so recognition fails or a different recognition result is output. For example, even a movement shaped like an "L" becomes the two gestures "|" and "_" if it is divided in the middle. By detecting and correcting the blank portion resulting from the deletion, the moving image of the divided hand gesture is reproduced and recognized: "|" and "_" are corrected back to "L".

By adding an overlap detection unit 32 and a hand gesture image correction unit 33 to the configuration of the first embodiment, a device capable of correctly recognizing an image divided by such an overlap is realized.

  FIG. 13 is a processing configuration diagram of the third embodiment. The same reference numerals as those in the first embodiment denote the same components, and the description thereof is omitted.

The overlap detection unit 32 detects that part of the target hand gesture has been lost because the target hand gesture intersects a hand gesture made by a different user's hand that entered from outside the region. The position information of the gesture in the area at the time of detection (the boundary contact points with the image area to be deleted) is sent to the hand gesture image correction unit 33. The hand gesture image correction unit 33 corrects the hand gesture image that disappeared behind the neighboring person's hand, based on the four boundary contact points with the deleted image region. By correcting the lost portion, the two divided hand gesture images are connected into a single series of images.

  FIG. 14 is a process display image diagram of the third embodiment. FIG. 14A shows an example of a screen in a state where two gestures overlap. A hand gesture start area 18 in FIG. 14A is an area for detecting a target hand gesture, and a hand gesture start area 19 is an area for detecting a hand gesture of an adjacent user.

FIG. 14B is a screen example after the processing of the first embodiment has been executed and the hand gesture image coming from outside the area has been deleted. Here, the target hand is hidden by the adjacent user's hand, so there is a portion that was not imaged, and the hand gesture image of the user to be detected is divided into two.

  FIG. 14C shows an example of a screen after correcting the hand gesture image that has been divided by the processing of the third embodiment.

FIG. 17 is an enlarged display example of the screen of FIG. 14A. The region through which a hand moves is represented as a band of the hand's width. The illustrated areas 35 and 36 are the areas through which the hand of the target hand gesture has moved, and the area 34 is the area through which the neighboring person's hand has moved. Regions 35 and 36 are images of the same hand's movement, but they are divided by the hand of region 34.

FIG. 18 is a screen example after the image of the neighboring person's hand in the area 34 has been deleted from FIG. 17. The four points 37, 38, 39, and 40 are the boundary contact points with the deleted image area. Region 35 is the region on the start-position side, and region 36 is the region on the end-position side. Point 41 is the position of the hand immediately before it was at the position of point 37; for example, if the camera captures 30 frames per second, it is the position 1/30 second earlier. The next hand image is estimated on the extension of the line connecting points 41 and 37.

Point 42 is the position of the hand immediately before it was at the position of point 38. Point 43 is the position of the hand immediately after it was at the position of point 39. Point 44 is the position of the hand immediately after it was at the position of point 40.

FIG. 15 is a flowchart of the process for detecting a hand gesture image hidden under a deleted hand gesture image. This process is called after the hand gesture image from outside the region has been deleted by the process described in the first embodiment. FIG. 16 is a flowchart of the process for interpolating the blank portion.

In step S15-1, the overlap detection unit 32 starts the process of detecting a hand gesture image below the deleted hand gesture image. The work area is secured and initialized. The information on the deletion area obtained after the deletion of the hand gesture image from outside the area, as described in the first embodiment, is provided.

In step S15-2, the overlap detection unit 32 reads the area information of the deleted hand gesture image from the designated image deletion unit 14b. Since this is the area of a moving hand image, it can be regarded as a trajectory with a certain width. In the example of FIG. 5, the information on the area swept by the hand moving from the start point 23 to the end point 24 is passed as a sequence of XY coordinate points.

In step S15-3, the overlap detection unit 32 obtains the contact points between the movement trajectory from the start point to the end point of the target hand gesture and the area of the deleted hand gesture image. When the target hand gesture and the deleted hand gesture image intersect, such contact points exist and are obtained; if the two are separated, this processing is unnecessary and is not called. In the example of FIG. 18, the four boundary contact points with the deleted image area and their coordinates are defined as follows: the point 37 (x1, y1) on the left and the point 38 (x2, y2) on the right where the trajectory from the start point meets the deleted area, and the point 39 (x3, y3) on the left and the point 40 (x4, y4) on the right where the trajectory from the end point meets the deleted area.

In step S15-4, the overlap detection unit 32 checks whether the area surrounded by the contact points from the start point and the contact points from the end point is blank. In the example of FIG. 18, it is checked whether the area surrounded by the points 37, 38, 39, and 40 is blank. If the neighboring person's hand passed behind the target image, the target hand gesture image still exists as it is within the four boundary contact points of the deleted image area.

If the area is blank, it needs to be corrected and the process proceeds to step S15-5; if it is not blank, the process proceeds to step S15-8 and ends. If the image of the target hand gesture is above (nearer to) the camera unit 6 and the deleted hand gesture image moved below (behind) it, this step determines that the area is not blank. If the image of the target hand gesture is below (behind) as viewed from the camera and the deleted hand gesture image moved above (in front of) it, the movement of the hand that was underneath was not photographed, leaving a blank.

In step S15-5, the hand gesture image correction unit 33 sets the four boundary contact points of the blank area. In the example of FIG. 18, the XY coordinates (x1, y1), (x2, y2), (x3, y3), and (x4, y4) of the points 37, 38, 39, and 40 are stored.

In step S15-6, the hand gesture image correction unit 33 interpolates the blank portion; the detailed processing is described with reference to the flowchart of FIG. 16. In step S15-7, the interpolated hand gesture image is output. The hand gesture image divided into the regions 35 and 36 shown in FIG. 18 is corrected: a hand image is added to the region surrounded by the points 37, 38, 39, and 40 as shown in FIG. 19, and the hand images of the regions 35 and 36 are corrected into one hand gesture image and output.

If the corrected gesture is recognized, it can be recognized as an "L" gesture; if it remains divided, it cannot be recognized correctly because it is split into the two gestures "_" and "|". In step S15-8, the process ends and the work area is released.

  Next, the processing content of step S15-6 will be described with reference to the flowchart of FIG.

In step S16-1, the hand gesture image correction unit 33 starts the process of interpolating the blank portion. The work area is secured and initialized.

  In step S16-2, the hand gesture image correcting unit 33 stores the coordinates of the four points as the blank area information. In the case of the example of FIG. 18, the coordinates (x1, y1), (x2, y2), (x3, y3), and (x4, y4) of the points 37 to 40 are stored.

In step S16-3, the hand gesture image correction unit 33 obtains the direction angles at (x1, y1) and (x2, y2) from the start-point side of the hand trajectory, and extends lines from (x1, y1) and (x2, y2) in those direction angles. Each direction angle is obtained from the several sample points, taken every 1/100 second from the center point of the hand image, that immediately precede (x1, y1) or (x2, y2); it can be assumed that the hand continued to move in the same direction from there. In the example of FIG. 18, the position of the hand image 1/100 second before the hand image at the position of point 37 is the position of point 41, and the extension of the straight line from point 41 through point 37 becomes the arrow shown in the figure. Likewise, the position of the hand image 1/100 second before the hand image at the position of point 38 is the position of point 42, and the extension of the straight line from point 42 through point 38 becomes the arrow shown in the figure.

In step S16-4, the hand gesture image correction unit 33 obtains the direction angles at (x3, y3) and (x4, y4) from the end-point side of the hand trajectory, and extends lines from (x3, y3) and (x4, y4) in those direction angles. For (x3, y3) and (x4, y4), the direction angle is obtained from the several 1/100-second sample points of the hand-image center on the end-point side, and the opposite direction is taken, since it can be assumed that the hand moved in the same direction before reaching those points. In the example of FIG. 18, the position of the hand image 1/100 second after the hand image at point 39 (x3, y3) is the position of point 43, and the extension of the straight line from point 43 through point 39 becomes the arrow shown in the figure. The position of the hand image 1/100 second after the hand image at point 40 (x4, y4) is the position of point 44, and the extension of the straight line from point 44 through point 40 becomes the arrow shown in the figure.

In step S16-5, the hand gesture image correction unit 33 connects the line from the start-point side and the line from the end-point side at an intermediate point. Images of the hand at the positions (x1, y1) and (x2, y2) are then copied onto the connected line at the same time interval as the shooting timing, thereby interpolating the hand gesture image. In the example of FIG. 19, the extension line from point 41 through point 37 and the extension line from point 43 through point 39 are connected at the position of point 45, and the extension line from point 42 through point 38 and the extension line from point 44 through point 40 are connected at point 46. In the area surrounded by the points 37, 38, 46, 40, 39, and 45, the hand image at point 37 is drawn at positions spaced at the same 1/100-second interval as the shooting timing, interpolating the hand gesture image. In step S16-6, the process ends; the work area is released and the process returns to step S15-6.
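The interpolation of FIG. 16 can be illustrated with a simplified sketch: instead of extending the two boundary edges and joining them at intermediate points, the version below linearly interpolates the hand-centre position across the blank interval and places a copy of the last visible hand image at each interpolated position, using the 1/100-second interval mentioned in the text as an assumed frame spacing.

```python
# Simplified sketch of the blank-portion interpolation (FIG. 16): linearly fill
# the gap between the last visible start-side position and the first visible
# end-side position at the assumed frame interval dt.
def interpolate_blank(start_pt, start_t, end_pt, end_t, dt=0.01):
    """Return [(t, (x, y)), ...] positions filling the blank interval."""
    if end_t <= start_t:
        return []
    (x0, y0), (x1, y1) = start_pt, end_pt
    points, t = [], start_t + dt
    while t < end_t:
        a = (t - start_t) / (end_t - start_t)
        points.append((t, (x0 + a * (x1 - x0), y0 + a * (y1 - y0))))
        t += dt
    return points
```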

By performing the processing in this way, even if the target hand gesture and a hand gesture to be deleted overlap, a device is realized that correctly recognizes the hand gesture image reproduced by interpolating the hidden portion.

As described above, according to the present embodiment, each user's hand gesture can be recognized with high accuracy even if there are a plurality of users. Furthermore, recognition accuracy can be improved by detecting and adjusting overlaps between adjacent hand gesture start areas determined from the position information of each user. In addition, by clearly indicating that the user's hand has been detected in the hand gesture start area, the user can be notified that gesture recognition can start, which makes operation easier. Since the added parts can be realized by software, no additional detection hardware is necessary, which keeps the cost down. Even a hand gesture image divided by another user's hand can be recognized after being corrected by the correction unit.

(Other examples)
The present invention can also be realized by the following processing: software (a program) that realizes the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (a CPU, an MPU, or the like) of the system or apparatus reads and executes the program.

Claims (8)

  1. A hand gesture recognition device for recognizing, as a hand gesture, the movement of a user's hand in images captured in time series by an imaging means, comprising:
    face detection means for detecting a face area of at least one user in the images captured by the imaging means;
    determining means for determining, for each face area detected by the face detection means, an area at a preset relative position as a detection area for detecting a hand gesture;
    deleting means for, when attention is paid to one of the detection areas determined by the determining means, deleting from the attention detection area the image of a hand that is detected, from the images captured in time series, to have entered the attention detection area from outside it; and
    hand gesture recognition means for recognizing a hand gesture in the attention detection area from the time-series hand images remaining in the attention detection area after deletion by the deleting means.
  2. The hand gesture recognition device according to claim 1, wherein the determining means includes means for adjusting the size of each detection area so that there is no overlap when the face detection means detects face areas of a plurality of users and the detection areas for the respective face areas overlap each other.
  3.   The hand gesture recognition device according to claim 1, further comprising explicit means for clearly indicating, when an image of a hand that does not move even after a preset time has elapsed is detected in a detection area determined by the determining means, that hand gesture recognition is possible in the detection area containing the motionless hand image.
  4.   The hand gesture recognition device according to claim 3, wherein the explicit means includes means for displaying a predetermined mark in a frame corresponding to the detection area in the image captured by the imaging means.
  5.   The hand gesture recognition device according to any one of claims 1 to 4, wherein the hand gesture recognition means includes means for interpolating, when the hand image of the target whose hand gesture is to be recognized is divided as a result of deletion by the deleting means, the lost portion of the divided image based on the coordinates of the division boundary.
  6. A control method of a hand gesture recognition device for recognizing, as a hand gesture, the movement of a user's hand in images captured in time series by an imaging means, comprising:
    a face detection step of detecting a face area of at least one user in the images captured by the imaging means;
    a determining step in which a determining means determines, for each face area detected in the face detection step, an area at a preset relative position as a detection area for detecting a hand gesture;
    a deletion step in which, when a deleting means pays attention to one of the detection areas determined in the determining step, the image of a hand that is detected, from the images captured in time series, to have entered the attention detection area from outside it is deleted from the attention detection area; and
    a hand gesture recognition step of recognizing a hand gesture in the attention detection area from the time-series hand images remaining in the attention detection area after the deletion in the deletion step.
  7.   A program for causing a computer having an imaging means to function as the hand gesture recognition device by executing each step of the control method according to claim 6.
  8.   A computer-readable storage medium storing the program according to claim 7.
JP2012229238A 2012-10-16 2012-10-16 Hand gesture recognition device and control method thereof Active JP6103875B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2012229238A JP6103875B2 (en) 2012-10-16 2012-10-16 Hand gesture recognition device and control method thereof

Publications (3)

Publication Number Publication Date
JP2014081788A true JP2014081788A (en) 2014-05-08
JP2014081788A5 JP2014081788A5 (en) 2015-12-03
JP6103875B2 JP6103875B2 (en) 2017-03-29

Family

ID=50785929

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2012229238A Active JP6103875B2 (en) 2012-10-16 2012-10-16 Hand gesture recognition device and control method thereof

Country Status (1)

Country Link
JP (1) JP6103875B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016015024A (en) * 2014-07-02 2016-01-28 株式会社東芝 Operation support apparatus and operation support system
US9965859B2 (en) 2014-12-05 2018-05-08 Samsung Electronics Co., Ltd Method and apparatus for determining region of interest of image

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003230133A (en) * 2002-02-01 2003-08-15 Hitachi Ltd Trajectory editor and method thereof
JP2010534895A (en) * 2007-07-27 2010-11-11 ジェスチャー テック,インコーポレイテッド Advanced camera-based input
JP2011513847A (en) * 2008-02-27 2011-04-28 ジェスチャー テック,インコーポレイテッド High performance input using the recognized gesture
US20090296991A1 (en) * 2008-05-29 2009-12-03 Anzola Carlos A Human interface electronic device
US20090315740A1 (en) * 2008-06-23 2009-12-24 Gesturetek, Inc. Enhanced Character Input Using Recognized Gestures
JP2011154546A (en) * 2010-01-27 2011-08-11 Canon Inc Device and method for inputting information, and program
JP2011192092A (en) * 2010-03-15 2011-09-29 Omron Corp Object tracking apparatus, object tracking method, and control program
JP2011243031A (en) * 2010-05-19 2011-12-01 Canon Inc Apparatus and method for recognizing gesture
JP2012048463A (en) * 2010-08-26 2012-03-08 Canon Inc Information processor and information processing method
JP2012123608A (en) * 2010-12-08 2012-06-28 Nippon Syst Wear Kk Gesture recognition device, method, program and computer readable medium storing program

Also Published As

Publication number Publication date
JP6103875B2 (en) 2017-03-29

Legal Events

Date Code Title Description
20151013 A521 Written amendment (JAPANESE INTERMEDIATE CODE: A523)
20151013 A621 Written request for application examination (JAPANESE INTERMEDIATE CODE: A621)
20160829 A977 Report on retrieval (JAPANESE INTERMEDIATE CODE: A971007)
20160905 A131 Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
20161019 A521 Written amendment (JAPANESE INTERMEDIATE CODE: A523)
TRDD Decision of grant or rejection written
20170130 A01 Written decision to grant a patent or to grant a registration (utility model) (JAPANESE INTERMEDIATE CODE: A01)
20170228 A61 First payment of annual fees (during grant procedure) (JAPANESE INTERMEDIATE CODE: A61)
R151 Written notification of patent or utility model registration (Ref document number: 6103875; Country of ref document: JP; JAPANESE INTERMEDIATE CODE: R151)