JP5264844B2

JP5264844B2 - Gesture recognition apparatus and method

Info

Publication number: JP5264844B2
Application number: JP2010199306A
Authority: JP
Inventors: 良輔青木; 篤彦前田; 智樹渡部; 稔小林
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-09-06
Filing date: 2010-09-06
Publication date: 2013-08-14
Anticipated expiration: 2030-09-06
Also published as: JP2012058854A

Abstract

PROBLEM TO BE SOLVED: To accurately recognize a gesture whether it is made by moving a finger or by moving an arm.

SOLUTION: A gesture recognition device detects the bright spot coordinates of a figure drawn by a gesture from an image frame captured by a camera 4, and determines whether the bright spot coordinates fall within a first area Ea or a second area Ec of the image frame. The device sets a finger gesture mode when the bright spot coordinates fall within the first area Ea, and sets a camera tracking mode when they fall within the second area Ec. In the finger gesture mode, the figure drawn by the finger gesture is recognized from the image data. In the camera tracking mode, a pan/tilt drive unit 5 controls the pan and tilt angles of the camera 4 so that the imaging direction of the camera 4 tracks the movement of the user's arm, and the figure is recognized from the tracked trajectory.

COPYRIGHT: (C)2012, JPO&INPIT

Description

The present invention relates to a gesture recognition device that recognizes gestures made by finger or arm movement. Such a device is used, for example, with a television receiver or a recording/playback apparatus to input channel information, control information, and the like from a distance.

A typical technique for pointing at information displayed on the screen of a display device uses a pointing device such as a mouse or a tablet pen. Other known pointing techniques include remote pointing with a remote controller (remote control terminal) and pointing based on recognition of a user's gesture.

In the technique using a remote control terminal, the user operates, for example, cursor keys such as a cross key on the remote control terminal with a finger. The operation data is transmitted to the display device via infrared or radio, and the display device performs pointing upon receiving the operation data.

In the technique that recognizes a user's gesture, on the other hand, the user's movement is captured with a camera, the motion trajectory of a specific body part of the user is recognized from the captured image data by pattern recognition processing, and pointing is performed based on the recognition result (see, for example, Patent Document 1).

Patent Document 1: JP 2004-272598 A

When a figure is drawn in space by a gesture, two cases are conceivable: the user moves the wrist with the arm held still, that is, uses finger movement, or the user uses arm movement. If both are to be supported, however, it is difficult for the camera to recognize both kinds of movement under the same conditions. If the camera is aimed near the user's wrist and zoomed in to capture the finger movement, the imaging field of view becomes narrow, so a gesture made by arm movement leaves the field of view and cannot be recognized. Conversely, if the magnification is lowered and the camera is set to a wide field of view in order to recognize arm gestures, finger gestures become hard to recognize, and it becomes difficult to recognize the trajectory of the drawn figure with high accuracy.
The present invention has been made in view of these circumstances, and its object is to provide a gesture recognition device that can accurately recognize a gesture whether it is made with a finger or with an arm.

To achieve the above object, one aspect of the present invention is a gesture recognition device used in a system comprising an imaging device, which captures the motion of a user drawing a figure in space by a gesture and outputs the image data, and the gesture recognition device, which recognizes the figure drawn by the gesture based on the image data output from the imaging device.
The device detects a drawing point of the figure from the image data captured from the imaging device, and determines whether the detected drawing point falls within a preset first area of the image data or within a second area set around the first area. When the drawing point is determined to fall within the first area, the device sets a first recognition mode for recognizing gestures that use finger movement; when the drawing point is determined to fall within the second area, it sets a second recognition mode for recognizing gestures that use arm movement. While the first recognition mode is set, the figure drawn by the finger gesture is recognized from the captured image data. While the second recognition mode is set, the pan and tilt angles of the imaging device are controlled according to the position of the detected drawing point so that the imaging direction follows the movement of the arm. The tracking trajectory of the imaging direction is detected, and the figure drawn by the arm gesture is recognized from that trajectory.

In other words, the device automatically determines whether the user made the gesture with finger movement or with arm movement. When finger movement is used, the drawn figure is recognized from the image data. When arm movement is used, the pan and tilt angles of the imaging device are controlled according to the position of the drawing point detected from the image data, so that the imaging direction follows the movement of the arm, and the figure drawn by the arm is recognized from the tracking trajectory of the imaging direction.
Therefore, whether the user uses finger movement or arm movement, the resulting gesture can be recognized appropriately.

This aspect of the present invention also has the following embodiments.
In the first embodiment, when the first recognition process is performed, the drawing trajectory of an optical marker worn on the finger is detected from the captured image data. The pattern of the detected trajectory is compared with a plurality of basic figure patterns prepared in advance, and the figure drawn by the finger gesture is recognized from the comparison result.
In this way, the position of the optical marker can be detected in the image data as a bright spot, so the figure drawn by the finger movement can be recognized accurately.
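The bright spot detection described above can be sketched as follows. This is only an illustration: the patent does not specify a detection algorithm, so the thresholding approach, the function name `find_bright_spot`, and the default threshold of 200 are all assumptions. The frame is taken to be a grayscale image given as a list of rows of 0-255 values.

```python
from typing import List, Optional, Tuple


def find_bright_spot(frame: List[List[int]],
                     threshold: int = 200) -> Optional[Tuple[float, float]]:
    """Return the centroid (x, y) of all pixels at or above `threshold`.

    Hypothetical helper: the marker is assumed to be the only region
    in the frame that is this bright. Returns None if no pixel qualifies.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)
```

Using the centroid rather than the single brightest pixel is a design choice made here so that a marker covering several pixels yields one stable coordinate.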

In the second embodiment, when the second recognition process is performed, coordinate values representing the imaging direction are derived from the pan and tilt angles of the imaging device at fixed time intervals, and the set of these coordinate values is stored as the tracking trajectory of the imaging direction. The pattern of the stored tracking trajectory is then compared with a plurality of basic figure patterns prepared in advance, and the figure drawn by the arm gesture is recognized from the comparison result.
In this way, the set of coordinate values representing the imaging direction of the camera is stored as information representing the tracking trajectory. Using this stored information, a gesture made by arm movement can be recognized easily, without any image processing to detect the tracking trajectory.
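A minimal sketch of storing the tracking trajectory from sampled pan and tilt angles is shown below. The patent does not state how angles are converted to coordinate values; the tangent projection onto a plane at a fixed distance, and the names `angles_to_point` and `TrackingTrajectory`, are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def angles_to_point(pan_deg: float, tilt_deg: float,
                    distance: float = 1.0) -> Tuple[float, float]:
    """Project the imaging direction onto a plane `distance` away.

    Assumed, simplified conversion: x depends only on pan and y only
    on tilt, which is a reasonable approximation for small angles.
    """
    x = distance * math.tan(math.radians(pan_deg))
    y = distance * math.tan(math.radians(tilt_deg))
    return (x, y)


class TrackingTrajectory:
    """Accumulates imaging-direction coordinates sampled at fixed intervals."""

    def __init__(self) -> None:
        self.points: List[Tuple[float, float]] = []

    def record(self, pan_deg: float, tilt_deg: float) -> None:
        # Called once per sampling interval with the detected angles.
        self.points.append(angles_to_point(pan_deg, tilt_deg))
```

The stored `points` list can then be handed to whatever pattern comparison is used, with no image processing involved.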

In the third embodiment, when the recognition mode is determined, a third area is set between the first area and the second area, and the device determines which of the first, second, and third areas contains the detected drawing point. When the drawing point is determined to fall within the first area, the recognition mode is changed to the first recognition mode; when it falls within the second area, the recognition mode is changed to the second recognition mode; and when it falls within the third area, the currently set recognition mode is maintained.
Without the third area, a drawing point near the boundary between the first and second areas would cause the recognition mode to switch frequently between the first and second recognition modes, making the recognition process unstable. The third area prevents this problem; that is, it prevents chattering in the mode switching operation.
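The three-area decision acts as a hysteresis band, which can be sketched as follows. The area geometry is an assumption: the patent only says the first area contains the center, the second lies around it, and the third lies between them. The margin values and the names `classify_area` and `next_mode` are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    FINGER_GESTURE = 1    # first recognition mode
    CAMERA_TRACKING = 2   # second recognition mode


def classify_area(x: float, y: float, width: float, height: float,
                  inner_margin: float = 0.25,
                  outer_margin: float = 0.10) -> str:
    """Return 'first', 'second', or 'third' for a drawing point.

    Assumed layout: the area is chosen by the distance to the nearest
    frame edge, expressed as a fraction of the frame size.
    """
    edge = min(x / width, (width - x) / width,
               y / height, (height - y) / height)
    if edge >= inner_margin:
        return "first"    # central region
    if edge < outer_margin:
        return "second"   # outermost band
    return "third"        # band between the two


def next_mode(current: Mode, area: str) -> Mode:
    """Switch mode only in the first or second area; hold it in the third."""
    if area == "first":
        return Mode.FINGER_GESTURE
    if area == "second":
        return Mode.CAMERA_TRACKING
    return current
```

Because the mode is left unchanged inside the third area, a point jittering around either boundary of that band cannot toggle the mode on every frame.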

In short, the present invention provides a gesture recognition device that can accurately recognize a gesture whether it is made with a finger or with an arm.

FIG. 1 is a schematic configuration diagram showing the system configuration of a gesture recognition device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the functional configuration of the television receiver shown as the gesture recognition device in FIG. 1.
FIG. 3 is a flowchart showing the overall processing procedure and processing contents of the television receiver shown in FIG. 2.
FIG. 4 is a flowchart showing the procedure and contents of the camera imaging operation in the overall procedure shown in FIG. 3.
FIG. 5 is a flowchart showing the procedure and contents of the pan/tilt operation in the overall procedure shown in FIG. 3.
FIG. 6 is a flowchart showing the procedure and contents of the mode switching process in the overall procedure shown in FIG. 3.
FIG. 7 is a flowchart showing the procedure and contents of the gesture recognition process in the overall procedure shown in FIG. 3.
FIG. 8 is a flowchart showing the procedure and contents of the display image control process in the overall procedure shown in FIG. 3.
FIG. 9 is a diagram for explaining the mode switching process shown in FIG. 6.
FIG. 10 is a diagram for explaining the mode switching process shown in FIG. 6.
FIG. 11 is a diagram for explaining the pan/tilt operation shown in FIG. 5.
FIG. 12 is a diagram for explaining a first method for detecting coincidence of the start point and end point of a figure in the gesture recognition process shown in FIG. 7.
FIG. 13 is a diagram for explaining a second method for detecting coincidence of the start point and end point of a figure in the gesture recognition process shown in FIG. 7.
FIG. 14 is a diagram for explaining the direction key detection process in the gesture recognition process shown in FIG. 7.

Embodiments of the present invention will be described below with reference to the drawings.
[Configuration]
FIG. 1 is a schematic configuration diagram of an information input system using gesture recognition according to an embodiment of the present invention. In this system, an imaging device is attached to a television receiver 2. The imaging device consists of a camera 4 and a pan/tilt drive unit 5. The camera 4 captures gestures made by the finger or arm movement of a user 1 and outputs the captured image data to the television receiver 2. The pan/tilt drive unit 5 changes the pan and tilt angles of the camera 4 in accordance with a pan/tilt control signal output from the television receiver 2. A light-emitting marker 6 using, for example, an LED (Light Emitting Diode) is worn on the fingertip of the user 1.

The television receiver 2 functions as the gesture recognition device and is configured as follows. FIG. 2 is a block diagram showing the configuration of the television receiver 2 together with the configurations of the camera 4 and the pan/tilt drive unit 5.

The camera 4 includes a camera imaging processing unit 41 and an image transmission unit 42. The camera imaging processing unit 41 captures the user's gesture each time a trigger signal is generated by a real-time event issuing unit 50, described later. The image transmission unit 42 stores the image data obtained by the imaging process in an image information storage unit 21 in a database 20, described later.

The pan/tilt drive unit 5 includes a pan/tilt drive unit 51, a pan/tilt angle detection unit 52, and a pan/tilt angle transmission unit 53. The pan/tilt drive unit 51 has a two-axis drive system and changes the pan and tilt angles of the camera 4 in accordance with a pan/tilt control signal output from a camera tracking control unit 33 of a gesture recognition unit 30, described later. The pan/tilt angle detection unit 52 detects the pan and tilt angles of the camera 4 using, for example, a sensor attached to the pan/tilt drive unit 51. The pan/tilt angle transmission unit 53 stores the pan/tilt angle detection data obtained by the pan/tilt angle detection unit 52 in a pan/tilt information storage unit 25 in the database 20.

As the functions needed for gesture recognition, the television receiver 2 includes the database 20, the gesture recognition unit 30, a display image control unit 40, and the real-time event issuing unit 50.

The database 20 includes the image information storage unit 21, a gesture command conversion table unit 22, a display image data storage unit 23, a figure pattern storage unit 24, and the pan/tilt information storage unit 25.

The image information storage unit 21 stores the image data output from the image transmission unit 42 of the camera 4. The gesture command conversion table unit 22 stores in advance, for each of the input commands to be recognized, the type of figure pattern that represents the command and information indicating its start/end point. The display image data storage unit 23 stores the various display image data needed to display images corresponding to the display processing that each input command represents. The figure pattern storage unit 24 stores the basic patterns of the figure shapes to be recognized. The pan/tilt information storage unit 25 stores the pan/tilt angle detection data transmitted from the pan/tilt angle transmission unit 53 of the pan/tilt drive unit 5.

The gesture recognition unit 30 includes a mode switching unit 31, a finger gesture recognition unit 32, the camera tracking control unit 33, and a camera tracking gesture recognition unit 34.

The mode switching unit 31 has the following processing functions.
(1) Each time new image data (an image frame) is stored in the image information storage unit 21, read that image frame and detect the drawing point of the figure from it. Then determine whether the detected drawing point falls within a first area set in a range including the center of the image frame, within a second area set around the periphery of the first area, or within a third area set between the first and second areas.

(2) When the drawing point is determined to fall within the first area, set the first recognition mode, which recognizes gestures that use finger movement. When the drawing point is determined to fall within the second area, set the second recognition mode, which recognizes gestures that use arm movement. When the drawing point is determined to fall within the third area, maintain the currently set recognition mode.

The finger gesture recognition unit 32 has the following processing functions.
(1) When the first recognition mode has been set by the mode switching unit 31, detect the trajectory of the figure drawn by the finger gesture and its start/end point, based on the latest image frame stored in the image information storage unit 21.
(2) Refer to the basic patterns of the figure shapes to be recognized, stored in the figure pattern storage unit 24 of the database 20, and determine by pattern matching which basic pattern the detected figure corresponds to.
(3) Read the corresponding input command from the gesture command conversion table unit 22, based on the type of the determined figure pattern and the information indicating the position of its start/end point.
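Functions (2) and (3) above can be sketched as a template match followed by a table lookup. The patent does not specify the matching algorithm or the table contents, so everything here is an assumption: the resample-and-compare matcher, the two basic patterns, the command names, and the helper names. The matcher also assumes that each template starts at the same relative point and runs in the same direction as the drawn figure.

```python
import math


def resample(points, n=32):
    """Resample a polyline to n points evenly spaced along its length."""
    dists = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(dists)
    if total == 0:
        return [points[0]] * n
    step, target, acc = total / (n - 1), total / (n - 1), 0.0
    out = [points[0]]
    for p, q, d in zip(points, points[1:], dists):
        if d == 0:
            continue
        while acc + d >= target - 1e-12 and len(out) < n:
            t = (target - acc) / d
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            target += step
        acc += d
    while len(out) < n:            # guard against rounding at the last point
        out.append(points[-1])
    return out


def normalize(points):
    """Center on the mean point and scale the larger extent to 1."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in points]


def match_pattern(trajectory, templates, n=32):
    """Return the name of the template with the smallest mean point distance."""
    probe = normalize(resample(trajectory, n))
    scores = {}
    for name, shape in templates.items():
        ref = normalize(resample(shape, n))
        scores[name] = sum(math.dist(a, b) for a, b in zip(probe, ref)) / n
    return min(scores, key=scores.get)


def start_side(points):
    """Classify where the start/end point sits relative to the figure
    (image coordinates, y grows downward)."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    dx = points[0][0] - (min(xs) + max(xs)) / 2
    dy = points[0][1] - (min(ys) + max(ys)) / 2
    if abs(dx) > abs(dy):
        return "right" if dx > 0 else "left"
    return "bottom" if dy > 0 else "top"


# Hypothetical basic patterns and command table, for illustration only.
BASIC_PATTERNS = {
    "square": [(0.5, 0), (1, 0), (1, 1), (0, 1), (0, 0), (0.5, 0)],
    "triangle": [(0.5, 0), (1, 1), (0, 1), (0.5, 0)],
}
COMMAND_TABLE = {
    ("square", "top"): "open_epg",
    ("triangle", "top"): "select_program",
}


def recognize(trajectory):
    """Map a drawn trajectory to an input command, or None if unmapped."""
    name = match_pattern(trajectory, BASIC_PATTERNS)
    return COMMAND_TABLE.get((name, start_side(trajectory)))
```

Keying the table on both the pattern type and the start/end point position mirrors the description of the gesture command conversion table unit 22, which lets one figure shape stand for several commands.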

The camera tracking control unit 33 controls the pan/tilt drive unit 5 according to the position of the drawing point in the image frame, thereby changing the pan and tilt angles of the camera 4 so that the imaging direction of the camera 4 follows the movement of the user's arm.
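One way to picture this tracking control is a proportional correction that steers the camera so the drawing point moves back toward the image center. The patent does not give a control law; the proportional rule, the field-of-view values, the gain, the sign convention (positive pan turns right, positive tilt turns up), and the name `tracking_step` are all assumptions.

```python
from typing import Tuple


def tracking_step(spot: Tuple[float, float],
                  frame_size: Tuple[float, float],
                  pan_deg: float, tilt_deg: float,
                  fov_deg: Tuple[float, float] = (40.0, 30.0),
                  gain: float = 0.5) -> Tuple[float, float]:
    """Return updated (pan, tilt) angles that move the view toward `spot`.

    The spot offset from the image center is expressed as a fraction of
    the frame (-0.5 to 0.5) and converted to degrees via the assumed
    horizontal and vertical field of view.
    """
    (x, y), (width, height) = spot, frame_size
    off_x = x / width - 0.5
    off_y = y / height - 0.5
    new_pan = pan_deg + gain * off_x * fov_deg[0]
    # Image y grows downward, so a spot below center lowers the tilt.
    new_tilt = tilt_deg - gain * off_y * fov_deg[1]
    return (new_pan, new_tilt)
```

A gain below 1 corrects only part of the offset per frame, which trades tracking speed for smoother camera motion.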

The camera tracking gesture recognition unit 34 has the following processing functions.
(1) Read the set of pan/tilt angle detection data for the camera 4 from the pan/tilt information storage unit 25 of the database 20, and convert the detection data into coordinate values to detect the tracking trajectory of the imaging direction of the camera 4 and its start/end point.
(2) Treat the detected tracking trajectory of the imaging direction as the figure drawn by the user's arm gesture, and determine by pattern matching which of the basic figure patterns stored in the figure pattern storage unit 24 of the database 20 the figure corresponds to.
(3) Read the corresponding input command from the gesture command conversion table unit 22, based on the type of the determined figure pattern and the information indicating the position of the detected start/end point.

The display image control unit 40 updates the display image data based on the input commands generated by the finger gesture recognition unit 32 and the camera tracking gesture recognition unit 34, and outputs the updated display image data to a display (not shown).

The real-time event issuing unit 50 uses, for example, a timer to generate a trigger signal that operates the camera 4 and other components at a predetermined cycle.

The functions of the gesture recognition unit 30, the display image control unit 40, and the real-time event issuing unit 50 are implemented by having a central processing unit (CPU) execute an application program stored in a program memory (not shown) in the database 20.

[Operation]
Next, the operation by which the television receiver 2 configured as above recognizes input information given by gestures will be described.
FIG. 3 is a flowchart showing the overall processing procedure and processing contents. The description here takes as an example the case where electronic program guide (EPG) information is displayed on the display of the television receiver 2 and a program selection operation is performed on this EPG information from the remote control device 1.

(1) Imaging of the gesture by the camera
The real-time event issuing unit 50 generates a trigger signal periodically in step S1. Specifically, the timer is reset and started in step S11, and step S12 determines whether the timer value Timer has reached 1 msec. Each time the timer value Timer reaches 1 msec, step S13 resets the timer value Timer, restarts the timer, and generates a trigger signal.
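Steps S11 to S13 amount to a reset-and-compare timer loop, sketched below. The class name `PeriodicTrigger` and the polling interface are assumptions; the current time is passed in explicitly so that the logic can be exercised without a real clock.

```python
class PeriodicTrigger:
    """Fires once each time `period_ms` has elapsed since the last reset."""

    def __init__(self, period_ms: float = 1.0, start_ms: float = 0.0) -> None:
        self.period_ms = period_ms
        self.started_at = start_ms        # S11: reset and start the timer

    def poll(self, now_ms: float) -> bool:
        """Return True when a trigger signal should be generated."""
        if now_ms - self.started_at >= self.period_ms:   # S12: period reached?
            self.started_at = now_ms      # S13: reset, restart, and trigger
            return True
        return False
```

In a running system `poll` might be called with a monotonic clock reading converted to milliseconds, and each `True` result would start one capture by the camera 4.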

When the real-time event issuing unit 50 generates a trigger signal, the camera 4 is activated in step S2 and imaging is performed as follows. FIG. 4 is a flowchart showing the procedure and contents. The camera imaging processing unit 41 performs imaging in step S21, and the image transmission unit 42 outputs the resulting image frame to the television receiver 2 in step S22. The television receiver 2 receives the image frame output from the camera 4 through a camera interface (not shown) and, in step S3, stores it in the image information storage unit 21 in the database 20.

When the trigger signal is generated, the pan/tilt drive unit 5 also operates in step S4 and sets the pan and tilt angles of the camera 4 to a preset initial position. The initial position is chosen so that the camera is aimed at the position of the user's wrist when the user gestures with finger movement. The zoom magnification is set so that a gesture made by finger movement is captured at a sufficient size. Accordingly, the image information storage unit 21 stores a first image frame that captures the user's gesture centered on the position of the user's wrist.

(2) Mode switching process
When the first image frame has been stored in the image information storage unit 21, the mode switching unit 31 of the gesture recognition unit 30 performs the gesture recognition mode switching process in step S6 as follows. FIG. 6 is a flowchart showing the procedure and contents.

The mode switching unit 31 first reads the latest image frame from the image information storage unit 21 in step S61, and detects the position coordinates of the finger from that frame in step S62. Because the user wears the light-emitting marker 6 on the finger as shown in FIG. 1, the finger position is detected as a bright spot in the image frame. Next, in step S63, the mode switching unit 31 determines which area of the image frame contains the bright spot coordinates. Specifically, as shown in FIG. 9, a first area Ea is set at the center of the image frame, a second area Ec is set at the outermost periphery of the image frame, and a third area Eb is set between the first and second areas Ea and Ec. The unit then determines which of the first, second, and third areas Ea, Ec, and Eb contains the bright spot coordinates.

Suppose the determination shows that the bright spot coordinates are in the first area Ea. In this case the mode switching unit 31 proceeds to step S64 and sets the recognition mode to the finger gesture mode (first recognition mode). If instead the bright spot coordinates are in the second area Ec, the mode switching unit 31 proceeds to step S66 and sets the recognition mode to the camera tracking mode (second recognition mode). If the bright spot coordinates are in the third area Eb, the currently set first or second recognition mode is maintained in step S68.

This mode switching process is executed each time the camera 4 obtains a new image frame at the 1 msec cycle. Therefore, while the user is gesturing with finger movement, for example, the bright spot coordinates representing the position of the user's finger remain in the first area Ea, so the recognition mode is held in the finger gesture mode.

(3) Gesture recognition processing in the finger gesture mode
While the finger gesture mode is set, each time the camera 4 images a gesture made with the user's finger movement, at a cycle of 1 msec, the resulting image frame is sequentially stored in the image information storage unit 21. Each time a new image frame is stored in the image information storage unit 21, the finger gesture recognition unit 32 performs the gesture recognition process as follows. FIG. 7 is a flowchart showing the processing procedure and its contents.

That is, the finger gesture recognition unit 32 first executes the start/end point coincidence detection process in step S71. Specifically, the start/end point coincidence detection unit 31 reads image frames from the image information storage unit 21 and detects from them the start/end point of the figure that the user has drawn in space by a finger gesture, that is, two points whose coordinate values coincide.

Two methods, for example, are conceivable for detecting the start/end point. In the first method, as shown in FIGS. 12(a) to 12(c), the user moves the finger from a point A, at which the light emitting marker 6 on the finger was turned on, to draw a figure B in space, and when the finger position returns to the point A, this point A is detected as the start/end point. In the second method, as shown in FIGS. 13(a) to 13(c), when the user moves the finger in space to draw a figure, the movement trajectory of the bright spot of the light emitting marker 6 is tracked, and the point at which the trajectory crosses itself is detected as the start/end point A.
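As a non-limiting illustration, the two detection methods could be sketched in Python as follows. The function names, the distance tolerance `tol`, and the minimum travel distance `min_travel` are assumptions made for this sketch. The second method is implemented here with a standard segment intersection test on the sampled trajectory, which is one possible technique and is not stated in the embodiment.

```python
import math


def returns_to_start(points, tol=5.0, min_travel=20.0):
    """Method 1: index at which the trajectory comes back to points[0].

    The spot must first move at least `min_travel` away from the start and
    then come within `tol` of it. Returns None if that never happens.
    """
    if not points:
        return None
    x0, y0 = points[0]
    has_left = False
    for i, (x, y) in enumerate(points):
        dist = math.hypot(x - x0, y - y0)
        if dist >= min_travel:
            has_left = True
        elif has_left and dist <= tol:
            return i
    return None


def _cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _segments_cross(p1, p2, p3, p4):
    """True if segments p1-p2 and p3-p4 strictly cross each other."""
    d1 = _cross(p3, p4, p1)
    d2 = _cross(p3, p4, p2)
    d3 = _cross(p1, p2, p3)
    d4 = _cross(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def first_self_intersection(points):
    """Method 2: (i, j) indices of the first pair of crossing segments.

    Segment k runs from points[k] to points[k + 1]. Adjacent segments are
    skipped because they always share an endpoint. Returns None if the
    trajectory never crosses itself.
    """
    segment_count = len(points) - 1
    for j in range(2, segment_count):
        for i in range(0, j - 1):
            if _segments_cross(points[i], points[i + 1],
                               points[j], points[j + 1]):
                return (i, j)
    return None
```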

When the start/end point A is detected, the gesture recognition unit 30 judges that the figure the user has drawn in space satisfies the condition for a figure representing a program selection operation, and proceeds to the figure tracking process in step S72. If, on the other hand, the start/end point A is not detected even after a fixed time has elapsed, for example, the unit judges that the drawn figure does not satisfy the condition for a selection operation and ends the gesture recognition process at that point. When the gesture recognition process ends, the gesture recognition unit 30 erases the set of image frames under evaluation stored in the image information storage unit 21.

Next, in step S72, the gesture recognition unit 30 executes the figure tracking process as follows. It reads the bright spot position coordinates from the end point to the start point that constitute the start/end point A detected by the start/end point coincidence detection process (step S71), and traces the drawing trajectory of the figure based on these coordinates.

Once the drawing trajectory of the figure has been detected, the gesture recognition unit 30 executes the figure determination process in step S73. It sequentially reads a plurality of basic figure patterns from the figure pattern storage unit 24 of the database 20, compares each basic figure pattern with the pattern of the drawing trajectory detected by the figure tracking process (step S72) by pattern matching, and obtains their similarity. It then selects the basic figure pattern whose similarity is equal to or greater than a threshold and is the largest, and recognizes the selected pattern as the shape type of the figure drawn in space by the finger movement.
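As a non-limiting illustration, the selection rule of step S73 (similarity at or above a threshold, largest wins) could be sketched in Python as follows. The similarity measure used here, a mean point-to-point distance after normalizing position and scale, is an assumption made for this sketch; the embodiment says only that pattern matching is used. The sketch also assumes that the trajectory and every pattern have already been resampled to the same number of points in the same drawing order.

```python
import math


def normalize(points):
    """Move the centroid to the origin and scale the largest radius to 1."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    shifted = [(x - cx, y - cy) for x, y in points]
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    return [(x / scale, y / scale) for x, y in shifted]


def similarity(trajectory, pattern):
    """Similarity in (0, 1]; 1 means identical after normalization."""
    a = normalize(trajectory)
    b = normalize(pattern)
    mean_dist = sum(math.hypot(ax - bx, ay - by)
                    for (ax, ay), (bx, by) in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mean_dist)


def recognize(trajectory, patterns, threshold=0.8):
    """Return the name of the best pattern at or above `threshold`.

    `patterns` maps a pattern name to its point list. Returns None when
    no pattern reaches the threshold (an assumed value).
    """
    best_name = None
    best_score = threshold
    for name, pattern in patterns.items():
        score = similarity(trajectory, pattern)
        if score >= best_score:
            best_name = name
            best_score = score
    return best_name
```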

Next, the gesture recognition unit 30 proceeds to step S74 and executes the figure position determination process as follows. First, four direction areas (up, down, left, and right) are set in the drawing space, centered on the start/end point A detected by the start/end point coincidence detection process (step S71). The unit then determines which of the four direction areas contains the position of the drawing trajectory detected by the figure tracking process (step S72). This determination can be made, for example, by obtaining the center of gravity of the drawn figure, regarding the figure as lying in the direction of the straight line connecting this center of gravity and the start/end point A, and determining which direction area's angular range contains the direction of that line.

Once the drawing position of the figure has been determined, the gesture recognition unit 30 executes the direction key detection process in step S75, associating the drawing position determined by the figure position determination process (step S74) with one of the four directions: up, down, left, or right. It then generates an input command corresponding to the associated direction key.

For example, as shown in FIG. 14(a), when the figure is drawn to the left of the start/end point A, an input command indicating the left direction key is generated; when it is drawn to the right, an input command indicating the right direction key is generated. Similarly, when the figure is drawn above the start/end point A, an input command indicating the up direction key is generated; when it is drawn below, an input command indicating the down direction key is generated. Note that a figure drawn in space by a finger gesture appears left-right reversed when imaged by the camera 4. The direction key result obtained from the image data therefore needs to be reversed in the left-right direction.

The figure position determination process (step S74) described above takes as its example the case of determining which of four directions contains the drawing position of the figure. The process is not limited to this: eight direction areas (up, down, left, right, and the four diagonals) may be set, and the unit may determine which of the eight direction areas contains the drawing position of the detected figure.
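As a non-limiting illustration, the center-of-gravity approach of step S74, the direction key association of step S75, and the left-right reversal could be sketched in Python as follows. The function name `direction_key`, the equal-width angular sectors, and the assumption that the image y coordinate grows downward are all choices made for this sketch and are not stated in the embodiment.

```python
import math

NAMES_4 = ["right", "up", "left", "down"]
NAMES_8 = ["right", "up-right", "up", "up-left",
           "left", "down-left", "down", "down-right"]


def direction_key(trajectory, start_end_point, sectors=4, mirror=True):
    """Map the figure's position relative to point A to a direction key.

    The direction is that of the line from point A to the centroid of the
    trajectory. With `mirror=True` the left-right axis is reversed to
    compensate for the camera image being a mirror of the user's view.
    """
    if sectors not in (4, 8):
        raise ValueError("sectors must be 4 or 8")
    n = len(trajectory)
    gx = sum(p[0] for p in trajectory) / n
    gy = sum(p[1] for p in trajectory) / n
    ax, ay = start_end_point
    dx = gx - ax
    dy = ay - gy  # assumed: image y grows downward, so flip for "up"
    if mirror:
        dx = -dx
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    names = NAMES_4 if sectors == 4 else NAMES_8
    width = 360.0 / len(names)
    # Shift by half a sector so each sector is centered on its direction.
    index = int(((angle + width / 2) % 360.0) // width)
    return names[index]
```

For instance, a figure whose centroid lies to the left of point A in the image yields the right direction key when mirroring is enabled.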

The finger gesture mode described above is maintained even if the bright spot position coordinates representing the user's finger position, detected from the image frame, move from the first area Ea into the third area Eb shown in FIG. 9. Therefore, even if the bright spot position coordinates, that is, the position of the user's finger, temporarily shift from the first area Ea toward the second area Ec, the recognition mode does not immediately change from the finger gesture mode to the camera tracking mode described later, and recognition processing in the finger gesture mode is performed stably.

(4) Gesture recognition processing in the camera tracking mode
Suppose, on the other hand, that the user stops gesturing with finger movement and gestures with arm movement instead. Then, as shown in FIG. 10 for example, at the moment the bright spot position coordinates Mc representing the user's finger position are detected to have entered the second area Ec, the recognition mode is set to the camera tracking mode in step S66. When the camera tracking mode is set, the camera tracking control unit 33 is activated in step S67, and from then on, under the control of the camera tracking control unit 33, so-called camera tracking control is executed, in which the imaging direction of the camera 4 follows the movement of the user's arm.

That is, the camera tracking control unit 33 calculates, as the target position coordinates, a point Mb at which the line segment connecting the detected bright spot position coordinates Mc and the center coordinates O of the first area Ea passes through the third area Eb. It then calculates the pan/tilt control amount of the camera 4 needed to bring the center (focus) of the imaging direction of the camera 4 onto the calculated target position coordinates Mb, generates a pan/tilt control signal corresponding to this control amount, and supplies it to the pan/tilt drive unit 5.
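As a non-limiting illustration, the calculation of the target Mb and of a pan/tilt control amount could be sketched in Python as follows. The embodiment does not state which point of the segment inside area Eb is chosen, nor how the control amount is derived. This sketch assumes the midpoint of the portion of the segment lying in Eb, rectangular areas with ratios 0.5 and 0.8, area center O at the frame center, and a pinhole camera model with a focal length `focal_px` expressed in pixels. The function names and sign conventions are likewise hypothetical.

```python
import math


def target_in_eb(mc, frame_size, inner_ratio=0.5, outer_ratio=0.8):
    """Return Mb: the midpoint of the part of segment O-Mc inside area Eb.

    O is taken as the frame center. Mc must lie in area Ec.
    """
    w, h = frame_size
    ox, oy = w / 2, h / 2
    vx, vy = mc[0] - ox, mc[1] - oy
    # Normalized distance of Mc from the center (1.0 at the frame edge).
    d_mc = max(abs(vx) / (w / 2), abs(vy) / (h / 2))
    if d_mc <= outer_ratio:
        raise ValueError("Mc is expected to lie in area Ec")
    # Along the segment, the normalized distance grows linearly with t, so
    # Eb spans t in [inner/d_mc, outer/d_mc]; take the middle of that span.
    t = (inner_ratio + outer_ratio) / (2 * d_mc)
    return (ox + t * vx, oy + t * vy)


def pan_tilt_delta(target, frame_size, focal_px):
    """Angles in degrees that would bring `target` to the frame center."""
    w, h = frame_size
    d_pan = math.degrees(math.atan2(target[0] - w / 2, focal_px))
    d_tilt = math.degrees(math.atan2(target[1] - h / 2, focal_px))
    return d_pan, d_tilt
```

Aiming at Mb rather than at Mc itself moves the camera by less than the full offset on each frame, so the imaging direction approaches the marker gradually.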

As a result, in the pan/tilt drive unit 5, the pan/tilt drive section 51 first operates in step S41, as shown in FIG. 5, and varies the pan/tilt angle of the camera 4 in accordance with the pan/tilt control signal. The imaging direction of the camera 4 is thereby controlled so as to approach the position of the light emitting marker 6 on the user's finger.

FIG. 11 illustrates this pan/tilt control operation. The pan/tilt drive unit 5 controls the pan angle θ of the camera 4 by rotating the camera 4 in the direction of arrow a, and controls the tilt angle φ of the camera 4 by rotating the camera 4 in the direction of arrow b.

Thereafter, for as long as the camera tracking mode remains set, the target position coordinates Mb are calculated under the control of the camera tracking control unit 33 from the bright spot position coordinates Ma of the light emitting marker 6 in each image frame, and the pan/tilt angles θ and φ of the camera 4 are controlled based on these target position coordinates Mb. The imaging direction of the camera 4 thus follows the movement of the user's arm.

With the camera tracking mode set, the camera tracking gesture recognition unit 34 executes, in step S8, the process of recognizing a gesture made by the user's arm movement, as follows.
First, at the moment the camera tracking mode is initially set, the pan/tilt angle detection data stored in the pan/tilt information storage unit 25 in the database 20 is erased. Thereafter, the pan/tilt information storage unit 25 sequentially stores the pan/tilt angle detection data that is detected by the pan/tilt angle acquisition unit 52 of the pan/tilt drive unit 5 and transmitted by the pan/tilt angle transmission unit 53 in steps S42 and S43 of FIG. 5.

Next, the camera tracking gesture recognition unit 34 reads the pan/tilt angle detection data of the camera 4 from the pan/tilt information storage unit 25, converts the data into coordinate values, and, based on the resulting pan/tilt coordinates, detects the type and the start/end point of the figure drawn in space by the gesture using the user's arm movement. This detection follows the processing procedure shown in FIG. 7 in the same way as the finger gesture recognition process described above; the only difference is that the data being processed is the pan/tilt coordinates instead of the bright spot position coordinates.
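As a non-limiting illustration, the conversion of pan/tilt angles into coordinate values could be sketched in Python as follows. The embodiment does not give the conversion formula. This sketch assumes that the imaging direction is projected onto a virtual plane at distance `distance` in front of the camera, with the pan rotation applied about the vertical axis before the tilt rotation. The function names are hypothetical.

```python
import math


def pan_tilt_to_xy(pan_deg, tilt_deg, distance=1.0):
    """Project the imaging direction onto a plane `distance` away.

    The direction vector for pan theta and tilt phi is
    (cos(phi) sin(theta), sin(phi), cos(phi) cos(theta)); dividing by its
    depth component gives x = tan(theta) and y = tan(phi) / cos(theta).
    """
    if abs(pan_deg) >= 90 or abs(tilt_deg) >= 90:
        raise ValueError("direction does not intersect the plane")
    theta = math.radians(pan_deg)
    phi = math.radians(tilt_deg)
    x = distance * math.tan(theta)
    y = distance * math.tan(phi) / math.cos(theta)
    return (x, y)


def angles_to_trajectory(samples, distance=1.0):
    """Convert stored (pan, tilt) samples into a list of (x, y) points."""
    return [pan_tilt_to_xy(pan, tilt, distance) for pan, tilt in samples]
```

The resulting point list can then be fed to the same start/end point detection and pattern matching used for bright spot coordinates.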

The camera tracking mode described above is maintained even if the bright spot position coordinates representing the user's finger position, detected from the image frame, move from the second area Ec into the third area Eb shown in FIG. 9. Therefore, even if the bright spot position coordinates, that is, the position of the user's finger, temporarily shift from the second area Ec toward the first area Ea, the recognition mode does not immediately change to the finger gesture mode, and recognition processing in the camera tracking mode is performed stably.

(5) Display image control processing
When the gesture recognition process in the finger gesture mode or the camera tracking mode ends, the display image control unit 40 operates in step S9, and the display image is updated under its control as follows. FIG. 8 is a flowchart showing the control procedure and its contents.

First, in step S92, the display image data is updated to reflect the display processing indicated by the input command generated by the figure position determination process (step S74) described above. For example, if the input command indicates the left direction key, the image is updated to one in which the cursor in the EPG information is shifted one column to the left. Conversely, if the input command indicates the right direction key, the image is updated to one in which the cursor in the EPG information is shifted one column to the right. Similarly, if the input command indicates the up or down key, the image is updated to one in which the cursor in the EPG information is shifted one row up or down, respectively. The EPG display image is updated in the same way in the case of eight direction keys.
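As a non-limiting illustration, the cursor update of step S92 could be sketched in Python as follows. The grid representation of the EPG as (column, row) cells, the names `MOVES` and `move_cursor`, and the clamping of the cursor at the grid edges are assumptions made for this sketch; the embodiment does not describe the behavior at the edges.

```python
MOVES = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, -1),
    "down": (0, 1),
}


def move_cursor(cursor, command, columns, rows, step=1):
    """Return the new (column, row) after applying a direction command.

    The cursor is clamped so that it never leaves the grid.
    """
    dx, dy = MOVES[command]
    col = min(max(cursor[0] + dx * step, 0), columns - 1)
    row = min(max(cursor[1] + dy * step, 0), rows - 1)
    return (col, row)
```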

Therefore, by selectively drawing in space one of the four circles shown in FIG. 1 with a gesture using finger or arm movement, the user can move the cursor in the EPG information step by step to the position of a desired program.

As described in detail above, in this embodiment the bright spot position coordinates of the figure drawn by a gesture are detected from the image frame captured from the camera 4, and it is determined whether the detected coordinates are included in the first area Ea in the image frame or in the second area Ec set around the first area Ea. The finger gesture mode is set when the drawing point is determined to be in the first area Ea, and the camera tracking mode is set when it is determined to be in the second area Ec. While the finger gesture mode is set, the figure drawn by the finger gesture is recognized from the image data. While the camera tracking mode is set, the pan/tilt drive unit 5 is operated according to the bright spot position coordinates so that the pan/tilt angle of the camera 4 is controlled and the imaging direction of the camera 4 follows the movement of the user's arm, and the drawn figure is recognized from the trajectory traced by the imaging direction.

Whether the user has gestured with finger movement or with arm movement is therefore determined automatically. When finger movement is used, the drawn figure is recognized from the image data. When arm movement is used, the imaging direction of the camera 4 is controlled to follow the movement of the user's arm, and the figure drawn by the arm movement is recognized from the changes in pan/tilt angle detected in the process. Gestures can thus be recognized appropriately whether the user uses finger movement or arm movement.

In this embodiment, moreover, the third area Eb is set between the first area Ea and the second area Ec, and the recognition mode currently set is maintained when the detected bright spot position coordinates are determined to be in the third area Eb. This prevents the problem in which, when the bright spot position coordinates are near the boundary between the first area Ea and the second area Ec, the recognition mode switches frequently between the finger gesture mode and the camera tracking mode and the recognition processing becomes unstable as a result.

The present invention is not limited to the above embodiment. For example, although the areas for mode determination are rectangular in the above embodiment as shown in FIG. 9, circular or elliptical areas may instead be set concentrically about the center O of the image frame. When the figure the user draws by gesture is a circle, making the determination areas circular or elliptical in this way allows the mode to be determined under uniform conditions regardless of the drawing position.
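As a non-limiting illustration, the concentric elliptical variant of the area determination could be sketched in Python as follows. The function name `classify_area_elliptical` and the ratios 0.5 and 0.8 are assumptions made for this sketch. The ellipses are taken to share the aspect ratio of the frame; equal width and height would give circles relative to the frame.

```python
import math


def classify_area_elliptical(x, y, width, height,
                             inner_ratio=0.5, outer_ratio=0.8):
    """Return 'Ea', 'Eb', or 'Ec' using concentric ellipses about center O.

    The offsets are normalized by the half-width and half-height, so a
    constant normalized radius traces an ellipse in pixel coordinates.
    """
    nx = (x - width / 2) / (width / 2)
    ny = (y - height / 2) / (height / 2)
    r = math.hypot(nx, ny)
    if r <= inner_ratio:
        return "Ea"
    if r > outer_ratio:
        return "Ec"
    return "Eb"
```

With these ratios, a point displaced diagonally by 45% of the half-width and half-height falls in Eb, whereas a rectangular area of the same ratio would still place it in Ea.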

In the above embodiment, the drawing position of the figure relative to its start/end point A is detected in order to control the direction in which the cursor moves. The invention is not limited to this: the drawing speed, drawing time, or drawing size of the figure may be determined, and the amount of cursor movement may be varied according to the result.

For example, when the user draws the figure more slowly than the normal speed, takes longer than the normal drawing time, or draws it larger than the normal size, the amount by which the cursor moves per step is increased. The drawing speed or drawing time of the figure can be calculated from detected timings, namely either the timings at which reception of the figure data starts and ends, or the timings at which the start point and the end point constituting the start/end point A are detected. In this case too, the direction of cursor movement is determined, as described in the above embodiment, by the direction of the figure's drawing position relative to its start/end point A.
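As a non-limiting illustration, the choice of a larger cursor step for slow, long, or large drawings could be sketched in Python as follows. The timestamped sample format, the function names, the "normal" reference values, and the enlarged step of 3 are assumptions made for this sketch; the embodiment gives no concrete values.

```python
import math


def drawing_metrics(samples):
    """Return (duration, speed, size) for samples of (t_seconds, x, y).

    Duration is the time from the first to the last sample, speed is the
    path length divided by the duration, and size is the larger side of
    the bounding box.
    """
    duration = samples[-1][0] - samples[0][0]
    length = sum(math.hypot(x2 - x1, y2 - y1)
                 for (_, x1, y1), (_, x2, y2) in zip(samples, samples[1:]))
    xs = [s[1] for s in samples]
    ys = [s[2] for s in samples]
    size = max(max(xs) - min(xs), max(ys) - min(ys))
    speed = length / duration if duration > 0 else float("inf")
    return duration, speed, size


def cursor_step(samples, normal_duration=1.0, normal_speed=200.0,
                normal_size=100.0, big_step=3):
    """Return `big_step` for a slow, long, or large drawing, otherwise 1."""
    duration, speed, size = drawing_metrics(samples)
    if (speed < normal_speed or duration > normal_duration
            or size > normal_size):
        return big_step
    return 1
```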

The above embodiment has been described taking as an example the case in which the EPG information displayed on the television receiver 2 is operated by gesture, but the display screen of a personal computer or the screen projected by a video projector may be operated by gesture in the same way.
In addition, the type and configuration of the gesture recognition device, the processing procedures and their contents, the shapes of the figures, and the like can be modified in various ways without departing from the gist of the present invention.

In short, the present invention is not limited to the above embodiment as it stands; at the implementation stage, its constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the above embodiment. For example, some constituent elements may be removed from the full set shown in the embodiment, and constituent elements from different embodiments may be combined as appropriate.

DESCRIPTION OF SYMBOLS: 1 ... user, 2 ... television receiver, 3 ... communication line, 4 ... camera, 5 ... pan/tilt drive unit, 6 ... light emitting marker, 20 ... database, 21 ... signal storage unit, 22 ... gesture command conversion table unit, 23 ... display image data storage unit, 24 ... figure pattern storage unit, 25 ... pan/tilt information storage unit, 30 ... gesture recognition unit, 31 ... mode switching unit, 32 ... finger gesture recognition unit, 33 ... camera tracking control unit, 34 ... camera tracking gesture recognition unit, 40 ... display image control unit, 41 ... camera imaging processing unit, 42 ... image transmission unit, 50 ... real-time event issuing unit, 51 ... pan/tilt drive section, 52 ... pan/tilt angle detection unit, 53 ... pan/tilt angle transmission unit.

Claims

A gesture recognition device used in a system comprising an imaging device that images a motion of drawing a figure in space by a gesture and outputs image data of the motion, and the gesture recognition device, which recognizes the figure drawn by the gesture based on the image data output from the imaging device, the gesture recognition device comprising:
means for capturing the image data output from the imaging device;
determination means for detecting a drawing point of the figure from the captured image data and determining whether the detected drawing point is included in a first area preset in the image data or in a second area set around the first area;
recognition mode setting means for setting a first recognition mode, for recognizing a figure drawn by a gesture using finger movement, when the drawing point of the figure is determined to be included in the first area, and for setting a second recognition mode, for recognizing a figure drawn by a gesture using arm movement, when the drawing point of the figure is determined to be included in the second area;
first recognition processing means for recognizing, while the first recognition mode is set, the figure drawn by the gesture using the finger movement based on the captured image data; and
second recognition processing means for, while the second recognition mode is set, causing the imaging direction to follow the movement of the arm by controlling the pan/tilt angle of the imaging device according to the position of the detected drawing point of the figure, detecting the tracking trajectory of the imaging direction at that time, and recognizing the figure drawn by the gesture using the arm movement based on the detected tracking trajectory.

The gesture recognition device according to claim 1, wherein the first recognition processing means comprises:
means for detecting, from the captured image data, the drawing trajectory of an optical marker attached to a finger; and
means for comparing the pattern of the detected drawing trajectory of the optical marker with a plurality of basic figure patterns prepared in advance, and recognizing the figure drawn by the gesture using finger movement based on the comparison result.

The gesture recognition device according to claim 1, wherein the second recognition processing means comprises:
means for detecting, at fixed time intervals, a coordinate value representing the imaging direction based on the pan/tilt angle of the imaging device, and storing the set of detected coordinate values as the tracking trajectory of the imaging direction; and
means for comparing the pattern of the stored tracking trajectory of the imaging direction with a plurality of basic figure patterns prepared in advance, and recognizing the figure drawn by the gesture using arm movement based on the comparison result.

The gesture recognition device according to claim 1, wherein the determination means sets a third area between the first area and the second area and determines which of the first, second, and third areas includes the detected drawing point of the figure, and
the recognition mode setting means changes the recognition mode to the first recognition mode when the detected drawing point of the figure is determined to be included in the first area, changes the recognition mode to the second recognition mode when it is determined to be included in the second area, and maintains the recognition mode currently set when it is determined to be included in the third area.

A gesture recognition method comprising:
a step of capturing, from an imaging device, image data obtained by imaging a motion of drawing a figure in space by a gesture;
a step of detecting a drawing point of the figure from the captured image data and determining whether the detected drawing point is included in a first area preset in the image data or in a second area set around the first area;
a step of setting a first recognition mode, for recognizing a figure drawn by a gesture using finger movement, when the drawing point of the figure is determined to be included in the first area;
a first recognition step of recognizing, while the first recognition mode is set, the figure drawn by the gesture using the finger movement based on the captured image data;
a step of setting a second recognition mode, for recognizing a figure drawn by a gesture using arm movement, when the drawing point of the figure is determined to be included in the second area; and
a second recognition step of, while the second recognition mode is set, causing the imaging direction to follow the movement of the arm by controlling the pan/tilt angle of the imaging device according to the position coordinates of the detected drawing point of the figure, detecting the tracking trajectory of the imaging direction at that time, and recognizing the figure drawn by the gesture using the arm movement based on the detected tracking trajectory.

The gesture recognition method according to claim 5, wherein the first recognition step comprises:
a step of detecting, from the captured image data, the drawing trajectory of an optical marker attached to a finger; and
a step of comparing the pattern of the detected drawing trajectory of the optical marker with a plurality of basic figure patterns prepared in advance, and recognizing the figure drawn by the gesture using finger movement based on the comparison result.

The gesture recognition method according to claim 5, wherein the second recognition step comprises:
a step of detecting, at fixed time intervals, a coordinate value representing the imaging direction based on the pan/tilt angle of the imaging device, and storing the set of detected coordinate values as the tracking trajectory of the imaging direction; and
a step of comparing the pattern of the stored tracking trajectory of the imaging direction with a plurality of basic figure patterns prepared in advance, and recognizing the figure drawn by the gesture using arm movement based on the comparison result.

The gesture recognition method according to claim 5, wherein the determining step sets a third area between the first area and the second area and determines which of the first, second, and third areas includes the detected drawing point of the figure, and
the steps of setting the recognition mode change the recognition mode to the first recognition mode when the detected drawing point of the figure is determined to be included in the first area, change the recognition mode to the second recognition mode when it is determined to be included in the second area, and maintain the recognition mode currently set when it is determined to be included in the third area.

A program for causing a computer to execute the steps included in the gesture recognition method according to claim 5.