JP6460862B2

JP6460862B2 - Gesture recognition device, system and program thereof

Info

Publication number: JP6460862B2
Application number: JP2015054334A
Authority: JP
Inventors: 依田　育士
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2014-03-20
Filing date: 2015-03-18
Publication date: 2019-01-30
Anticipated expiration: 2035-03-18
Also published as: JP2015195020A

Description

The present invention relates to a gesture recognition device, system, and program that allow a person, in particular a person with a physical disability, to operate an interface such as that of a personal computer by performing gestures within that person's own range of motion.

A variety of interfaces have been developed so that people with physical disabilities can use a computer to access the Internet and e-mail. Examples include switches for people who can move a fingertip, breath switches operated by exhalation, and voice input software. However, the degree of disability and the range of motion vary from person to person, and physical condition changes from day to day, so using an ill-suited interface has in some cases made the user's condition worse. It is therefore often necessary to develop a custom-made interface tailored to each individual's condition, and interfaces that many people with disabilities can use comfortably have not been sufficiently available.

In image-based gesture recognition, gesture detection by gesture recognition systems that use distance image data has been proposed. Non-Patent Document 1 notes that the size and motion of gestures differ between individuals, examines gestures that minimize these individual differences, and discloses a technique for reliably controlling lighting through gesture detection by adopting gestures that are natural and intuitive for everyone.

Non-Patent Document 1: Mitsunori Miki and three others, "Control of Illumination by Gesture Detection Using Kinect", 26th Annual Conference of the Japanese Society for Artificial Intelligence (2012), June 12, 2012, pp. 1-3

However, the technique disclosed in Non-Patent Document 1 relies on gestures that are easy for the system to recognize, so the user must perform gestures that conform to the system's specification. People with physical disabilities often cannot do this. For a finger-bending gesture, for example, they may be unable to straighten the finger and then bend it 90 degrees, and they may not be able to hold the hand in front of the body. Moreover, symptoms differ from person to person: the body parts capable of reproducible movement, the manner of movement, and the range of movement all vary, and some people have frequent involuntary movements, that is, unintended movements that are not reproducible. Because such users cannot perform the same gestures as able-bodied users, as the system requires, the technique cannot be applied to people with disabilities as an interface.

The present invention has been made in view of the above problems of the prior art, and its object is to provide a gesture recognition device that can recognize the gestures of an unspecified large number of people with physical disabilities and perform interface control.

In view of the above object, the present invention provides a gesture recognition device that recognizes a user's gesture based on distance image data captured by an imaging device and performs, on an interface device, the interface control associated with the recognized gesture. The device comprises: an image capturing unit that captures the distance image data output from the imaging device; a part region extraction unit that, each time distance image data is captured, detects a predetermined part of the user and extracts from the distance image the region in which the predetermined part exists; a part detection unit that detects the predetermined part from the extracted part region; a part change detection unit that detects a change in the coordinates of the detected predetermined part based on the predetermined part detected in each item of distance image data; a gesture recognition unit that recognizes that a gesture has been performed when the detected change in the predetermined part is equal to or greater than a predetermined value; and an interface control unit that, when a gesture is recognized, performs the interface control associated with that gesture.

The gesture recognition device may further include a gesture determination unit that determines, from one or more candidate parts of the user, the part to be associated with interface control and the gesture at that part. In this configuration, the part region extraction unit extracts from the distance image data the regions in which the one or more candidate parts exist; the part detection unit detects the one or more candidate parts from the extracted part regions; the part change detection unit detects the amount of change of each detected candidate part; the gesture determination unit determines, based on the detected amounts of change, the part to be associated with interface control and the amount of change to be recognized as a gesture at that part; and the gesture recognition unit recognizes that a gesture has been performed when a change equal to or greater than the determined amount is detected at the determined part. Providing the gesture determination unit makes it possible to set the amount of change used for gesture recognition to suit each individual's symptoms, so gestures can be recognized with higher accuracy.

Furthermore, the device may be configured so that the part region extraction unit extracts candidate regions; the part detection unit detects a rectangle enclosing each extracted region; the part change detection unit detects changes within the rectangle; the gesture determination unit, based on the detected changes, determines a rectangle enclosing the region in which change occurs and also determines the part to be associated with interface control and the amount of change to be recognized as a gesture; and the gesture recognition unit recognizes that a gesture has been performed when a change equal to or greater than the determined amount is detected at the determined part. Because gestures can then be recognized without determining which body part is moving, reliable gesture recognition is possible without a gesture model for each body part.
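As a rough illustration of the rectangle-based variant, the sketch below finds the smallest rectangle enclosing every pixel whose depth value changed between two frames. The function name `changed_bounding_rect`, the list-of-rows depth representation, and the `min_diff` tolerance are assumptions made for this example; the patent does not specify how the rectangle is computed.

```python
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def changed_bounding_rect(
    prev_depth: List[List[float]],
    curr_depth: List[List[float]],
    min_diff: float,
) -> Optional[Rect]:
    """Return the rectangle enclosing all pixels whose depth changed by
    at least `min_diff`, or None if no pixel changed that much."""
    rows: List[int] = []
    cols: List[int] = []
    for r, (prev_row, curr_row) in enumerate(zip(prev_depth, curr_depth)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            if abs(q - p) >= min_diff:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))
```

Because only the location of change is used, no knowledge of which body part produced the movement is required, which matches the motivation given in the text.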

The device may further include a parameter database that stores threshold values for detecting changes. In this case, the gesture determination unit stores in the parameter database, as a threshold, the amount of change it has determined should be recognized as a gesture, and the gesture recognition unit reads the threshold from the parameter database and recognizes a gesture when the change in the predetermined part is equal to or greater than the threshold.

The gesture determination unit may also determine the threshold for recognizing a gesture based on the amount of change of the candidate part at a predetermined timing. This prevents involuntary movements, which the user does not intend, from being mistakenly recognized as gestures.

The gesture determination unit may determine a new threshold for recognizing a gesture based on the threshold previously determined for the candidate part and the amount of change newly acquired at a predetermined timing. Gestures can thus be recognized reliably even for users whose symptoms change from day to day.
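The text does not say how the past threshold and the newly measured change amounts are combined. One plausible reading, shown here purely as an assumption, is a weighted blend of the old threshold with a fraction of the mean of the new measurements. The names `update_threshold`, `weight`, and `margin` are hypothetical.

```python
from typing import Sequence


def update_threshold(
    previous_threshold: float,
    new_changes: Sequence[float],
    weight: float = 0.5,
    margin: float = 0.8,
) -> float:
    """Blend the stored threshold with today's measured change amounts.

    weight: how strongly the new measurements influence the result (0..1).
    margin: fraction of the mean measured change used as the candidate
            threshold, so that slightly smaller movements still count.
    Both parameters are illustrative assumptions, not values from the patent.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if not new_changes:
        return previous_threshold  # nothing measured: keep the old value
    candidate = margin * (sum(new_changes) / len(new_changes))
    return (1.0 - weight) * previous_threshold + weight * candidate
```

A blend of this kind lets the threshold follow day-to-day changes in the user's condition without discarding what was learned in earlier sessions.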

The present invention also provides a gesture recognition program for recognizing a user's gesture based on distance image data and performing the interface control associated with the recognized gesture. The program causes a computer to function as a gesture recognition device comprising: an image capturing unit that captures the distance image data output from an imaging device; a part region extraction unit that, each time distance image data is captured, extracts from the distance image the region in which a predetermined part of the user exists; a part detection unit that detects the predetermined part from the extracted part region; a part change detection unit that detects the amount of change of the detected predetermined part based on the predetermined part detected in each item of distance image data; a gesture recognition unit that recognizes that a gesture has been performed when the detected change in the predetermined part is equal to or greater than a predetermined value; and an interface control unit that, when a gesture is recognized, performs the interface control associated with that gesture.

The present invention further provides a gesture recognition system comprising an imaging device that captures distance image data and a gesture recognition device that recognizes a user's gesture based on the distance image data captured by the imaging device and performs, on an interface device, the interface control associated with the recognized gesture. The imaging device images the user's body, and the gesture recognition device comprises: an image capturing unit that captures the distance image data output from the imaging device; a part region extraction unit that, each time distance image data is captured, extracts from the distance image the region in which a predetermined part of the user exists; a part detection unit that detects the predetermined part from the extracted part region; a part change detection unit that detects the amount of change of the detected predetermined part based on the predetermined part detected in each item of distance image data; a gesture recognition unit that recognizes that a gesture has been performed when the detected change in the predetermined part is equal to or greater than a predetermined value; and an interface control unit that, when a gesture is recognized, performs the interface control associated with that gesture.

According to the present invention, the region in which the part performing the gesture may exist is extracted first, the gesturing part is then detected, and the gesture is recognized based on the amount of change in its movement. Gestures can therefore be recognized reliably, and interface control performed, even when gesture motions differ greatly between individuals.

In addition, a gesture determination mode is provided in which the user is instructed to move at predetermined timings. This makes it possible to reliably acquire reproducible gestures that the user can use for interface control, to store the movement of each gesture as an amount of change, and to recognize gestures on that basis, thereby providing a gesture recognition device suited to each user's symptoms.

Furthermore, in the gesture determination mode, the moving region is extracted and the amount of change in that region is determined, so gestures can be recognized without restricting which body part is moving. This eliminates the need to store detection parameters for each body part and makes it easy to provide a gesture recognition device tailored to each user.

FIG. 1 is a block diagram showing an example of a gesture recognition system including the gesture recognition device according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing an example of the processing performed in the gesture recognition device 20 according to the first embodiment, in which the user's gesture is recognized and the interface control corresponding to the recognized gesture is performed.
FIG. 3 is a diagram showing an example of the data stored in the parameter database 284 and the corresponding gesture database 286 held in the storage unit 280.
FIG. 4 is a flowchart showing an example of the processing performed in the gesture recognition device 20 to recognize a finger-bending gesture and perform the corresponding interface control.
FIG. 5 is a diagram showing an example of distance image data captured when recognizing a finger-bending gesture.
FIG. 6 is a flowchart showing an example of the processing performed in the gesture recognition device 20 to recognize an arm-swing gesture and perform the corresponding interface control.
FIG. 7 is an example of a screen on which the arm is being tracked in the distance image data with a particle filter when recognizing an arm swing.
FIG. 8 is a flowchart showing an example of the processing performed in the gesture recognition device 20 to recognize a head-movement gesture and perform the corresponding interface control.
FIG. 9 is an example of a screen on which the nose has been extracted from the distance image data and a normal vector calculated when recognizing a head-shaking gesture.
FIG. 10 is a flowchart showing an example of the processing performed in the gesture recognition device 20 to recognize a tongue-out gesture and perform the corresponding interface control.
FIG. 11 is an example of distance image data acquired when recognizing a tongue-out gesture.
FIG. 12 is a flowchart showing an example of the processing performed in the gesture recognition device 20 to recognize a knee-closing gesture and perform the corresponding interface control.
FIG. 13 is an example of distance image data captured to detect a knee-closing gesture.
FIG. 14 is a block diagram showing an example of a gesture recognition system including the gesture recognition device according to the second embodiment of the present invention.
FIG. 15 is a flowchart showing an example of the processing for determining a user's gesture, performed in the gesture recognition device 20 in the gesture determination mode of the second embodiment.
FIG. 16 is an example of the instruction screen of the gesture instruction program in the second embodiment.
FIG. 17 is a flowchart showing an example of the processing for determining a user's gesture, performed in the gesture recognition device 20 in the gesture determination mode of the third embodiment.
FIG. 18 is an example of the data stored in the parameter database 284 and the corresponding gesture database 286 in the third embodiment.
FIG. 19 is a flowchart showing an example of the processing performed in the gesture recognition device 20 according to the third embodiment, in which the user's gesture is recognized and the interface control corresponding to the recognized gesture is performed.
FIG. 20 is an example of a hardware configuration diagram of the gesture recognition device.

Embodiments of the present invention are described below with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram showing an example of a gesture recognition system including the gesture recognition device of the present invention. In FIG. 1, the gesture recognition device 20 is connected to the imaging device 10 and the interface device 30. The imaging device 10 is, for example, a so-called stereo camera having two cameras; it may instead be a camera that can acquire three-dimensional distance images. The imaging device 10 images a user performing a motion used for interface control, such as moving a finger. It acquires image data in time series and sends the data to the gesture recognition device 20. The interface device 30 is the interface controlled by the gesture recognition device 20, for example a button 310, a switch 320, or an alarm 330, but it is not limited to these and may be a mouse, a keyboard, a touch panel, or the like.

The gesture recognition device 20 includes an image capturing unit 210, a part region extraction unit 220, a part detection unit 230, a cache unit 240, a part change detection unit 250, a gesture recognition unit 260, an interface control unit 270, and a storage unit 280.

The image capturing unit 210 captures, for example, distance images of the user's motion that are input in real time from the imaging device 10. The captured moving image data is distance image data containing parallax data or distance data. Alternatively, two sets of image data may be taken in from the imaging device 10, which is a stereo camera, and the parallax calculated by the image capturing unit 210. Although parallax data is described here, when the imaging device is not a stereo camera but a camera with a rangefinder, image data and distance data may be captured instead. "Distance image data" here means the image data at each point in time (one frame image) within the moving image data, which is a time series of images. The image capturing unit 210 outputs the input image data and parallax data to the part region extraction unit 220.
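The text does not state how parallax is converted to distance. For a rectified stereo pair, the standard pinhole relation is Z = f * B / d, where f is the focal length in pixels, B the baseline between the two cameras, and d the disparity in pixels. The sketch below applies that relation; the function name and the example camera values are assumptions, not figures from the patent.

```python
from typing import Optional


def disparity_to_depth(
    disparity_px: float,
    focal_length_px: float,
    baseline_m: float,
) -> Optional[float]:
    """Convert a stereo disparity (pixels) to depth (metres) using
    Z = f * B / d. Returns None where the disparity is not positive,
    i.e. where no valid stereo match exists."""
    if disparity_px <= 0.0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 500 px focal length, 10 cm baseline.
print(disparity_to_depth(50.0, focal_length_px=500.0, baseline_m=0.1))
```

A camera with a built-in rangefinder would supply the depth value directly, in which case this conversion is unnecessary.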

The part region extraction unit 220 extracts, from each frame of the distance image data captured by the image capturing unit 210, the region in which a predetermined part of the user exists. The predetermined part is a body part needed to recognize the gesture associated with interface control, for example a finger, arm, head, knee, or shoulder. The three-dimensional space (x, y, z) in which the part can exist is given to the part region extraction unit 220 in advance as a parameter, and the unit searches that space for the predetermined part and extracts it.
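A minimal sketch of this search-space idea follows, assuming the parameter is an axis-aligned box and the frame is available as a list of (x, y, z) points. `SearchBox` and `extract_part_region` are hypothetical names, and the box limits in the usage note are invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class SearchBox:
    """Axis-aligned 3D space in which the body part is expected to appear."""
    x_range: Tuple[float, float]
    y_range: Tuple[float, float]
    z_range: Tuple[float, float]

    def contains(self, point: Point) -> bool:
        ranges = (self.x_range, self.y_range, self.z_range)
        return all(lo <= value <= hi for value, (lo, hi) in zip(point, ranges))


def extract_part_region(points: List[Point], box: SearchBox) -> List[Point]:
    """Keep only the points of one frame that fall inside the search box."""
    return [p for p in points if box.contains(p)]
```

For example, with `SearchBox((-0.2, 0.2), (0.0, 0.4), (0.5, 1.0))`, a point at depth 0.8 m inside the x and y limits is kept, while a point at depth 1.5 m is discarded as background.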

The part detection unit 230 detects, within the three-dimensional space of the distance image data extracted by the part region extraction unit 220, the region in which the target part is imaged. For example, the part detection unit 230 detects the target part from the texture image of the extracted part region based on color and shape. The data of the detected part region is output as three-dimensional coordinate data.

The cache unit 240 is a memory that temporarily stores the part region data (for example, three-dimensional coordinate data) that the part detection unit 230 detected from the distance image data. The temporarily stored part region data is read and used by the part change detection unit 250 to detect movement of the predetermined part.

The part change detection unit 250 detects changes in the predetermined part region detected by the part detection unit 230 in each frame image. Specifically, it detects movement of the predetermined part by calculating the amount of change in the detected part region between frame images. The amount of change is not limited to changes in three-dimensional coordinate values and may also include, for example, changes in a color region. The change need not be calculated between adjacent frames; it may be calculated at a predetermined frame interval, such as every 5 or every 30 frames. The part change detection unit 250 performs the calculation by reading the values of the earlier frame from the cache unit 240.
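The interplay between the cache and the frame interval can be sketched as below, assuming the detected part is summarized as a single 3D position per frame. The class name `PartChangeDetector` and the use of Euclidean distance as the change amount are assumptions for illustration; the patent also allows other change measures such as color-region changes.

```python
import math
from collections import deque
from typing import Deque, Optional, Tuple

Point = Tuple[float, float, float]


class PartChangeDetector:
    """Compare the current part position with the one seen
    `frame_interval` frames earlier, using a small cache of positions."""

    def __init__(self, frame_interval: int = 5) -> None:
        if frame_interval < 1:
            raise ValueError("frame_interval must be at least 1")
        self._interval = frame_interval
        # The deque plays the role of the cache unit: it holds only the
        # most recent `frame_interval` positions.
        self._cache: Deque[Point] = deque(maxlen=frame_interval)

    def update(self, position: Point) -> Optional[float]:
        """Return the displacement relative to the cached earlier frame,
        or None while the cache is still filling."""
        change = None
        if len(self._cache) == self._interval:
            change = math.dist(position, self._cache[0])
        self._cache.append(position)
        return change
```

With `frame_interval=1` the detector compares adjacent frames; larger values make slow movements easier to measure at the cost of a short start-up delay.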

The gesture recognition unit 260 recognizes that a predetermined gesture has been performed when the amount of change of the predetermined part detected by the part change detection unit 250 is equal to or greater than a predetermined value; that is, when the position of the predetermined region is judged to have changed by at least a preset amount, it determines that a gesture has been made. Gestures include, for example, bending a finger, swinging an arm, turning the head, sticking out the tongue, and closing the knees, but are not limited to these. The amount of change is determined by comparing the steady state with the state at maximum movement; in other words, it is judged by the length of the three-dimensional trajectory of the position change. When the recognition target is the head, the amount of change is the rotation angle; for a finger, the angle between the back of the hand and the finger; and for the tongue, the size of the tongue region. On determining that a gesture has been made, the gesture recognition unit 260 sends the gesture content to the interface control unit 270.

The interface control unit 270 performs the interface control associated with the gesture recognized by the gesture recognition unit 260. Specifically, it reads the corresponding gesture database 286 in the storage unit 280, retrieves the interface control that corresponds to the recognized gesture, and controls the interface device 30. For example, when a finger-bending gesture is recognized, the interface control unit 270 reads the corresponding gesture database 286 and performs the interface control corresponding to finger bending, for example turning on a switch.

The storage unit 280 holds a distance image database 282, a parameter database 284, and a corresponding gesture database 286. The distance image database 282 stores the distance image data that the image capturing unit 210 captured from the imaging device 10. The distance images stored in the distance image database 282 are read by the part region extraction unit 220 as needed.

The parameter database 284 stores items such as the coordinate range of the distance image area in which a predetermined part region is imaged, the color thresholds or coordinate values used to detect the predetermined part, and the thresholds used to detect a part change and recognize it as a gesture. The parameters stored in the parameter database 284 may be made freely changeable and settable for each user.

The corresponding gesture database 286 stores each gesture motion in association with the content of its interface control. When the gesture recognition unit 260 recognizes that a predetermined gesture has been made, the interface control unit 270 reads the corresponding gesture database 286, retrieves the interface control corresponding to the recognized gesture, and performs that control.
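The lookup described here amounts to a table from gesture names to control actions. The sketch below models that table as an in-memory dictionary; the class name, the gesture keys, and the pairing of the tongue-out gesture with the alarm are hypothetical, and only the finger-bend-to-switch pairing is taken from the text.

```python
from typing import Callable, Dict, Optional


class InterfaceControlUnit:
    """Look up the control associated with a recognized gesture and run it."""

    def __init__(self, gesture_table: Dict[str, Callable[[], str]]) -> None:
        # Stand-in for the corresponding gesture database.
        self._table = dict(gesture_table)

    def handle(self, gesture: str) -> Optional[str]:
        action = self._table.get(gesture)
        return action() if action is not None else None


def switch_on() -> str:
    return "switch 320: ON"


def sound_alarm() -> str:
    return "alarm 330: sounding"


unit = InterfaceControlUnit({"finger_bend": switch_on, "tongue_out": sound_alarm})
print(unit.handle("finger_bend"))
```

Keeping the mapping in data rather than in code is what allows the same recognizer to be reassigned to different controls for different users.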

FIG. 2 is a flowchart showing an example of the processing performed in the gesture recognition device 20 of the present invention, in which the user's gesture is recognized and the interface control corresponding to the recognized gesture is performed.

The part region extraction unit 220 detects the region of the predetermined part (step S201). It extracts the predetermined part region from the distance image data that the image capturing unit 210 acquired from the imaging device 10. The predetermined part region is the region, within the distance image data in which the user is imaged, where the body part used for gesture recognition (for example the head, a hand, an arm, a knee, or a shoulder) can exist. Because the positional relationship between the imaging device 10 and the user determines the range within the distance image data in which the predetermined part can exist, that range is stored in the parameter database 284 as three-dimensional coordinate data (x, y, z). The part region extraction unit 220 reads this three-dimensional coordinate data from the parameter database 284 and extracts the region of the predetermined part from the distance image data.

Next, the part detection unit 230 detects the predetermined part from the part region (step S202). The part detection unit 230 performs this detection based on, for example, parameters such as texture information, color information (for example hue, saturation, and brightness), and shape in the distance image data of the part region extracted by the part region extraction unit 220. Which parameters are used depends on the part to be detected. The part detection unit 230 reads the parameter information for the relevant body part from the parameter database 284 and uses it for detection.

The part change detection unit 250 detects a change in the coordinates of the part (step S203). Based on the predetermined part that the part detection unit 230 detected in each item of distance image data, it detects the change in the coordinates of that part between items of distance image data. For example, the part change detection unit 250 calculates the coordinate change between the part region in the distance image data of the previous frame, which is temporarily stored in the cache unit 240 as the comparison target, and the part region in the distance image data of the current frame. The part change detection unit 250 also calculates the amount of change and the time taken to reach that amount. This allows movements that are too slow or too fast to be excluded, which avoids misrecognition caused by the involuntary movements of a person with a disability.
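The speed-based exclusion described above can be sketched as a simple band check on displacement divided by elapsed time. This is a minimal sketch under stated assumptions: the function name `is_plausible_motion` and the speed limits are hypothetical, and the description does not say how the limits are chosen.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def is_plausible_motion(prev: Point, curr: Point, dt_seconds: float,
                        min_speed: float, max_speed: float) -> bool:
    """Return True when the movement speed lies inside [min_speed, max_speed].

    Movements outside the band are treated as unintended and ignored.
    """
    if dt_seconds <= 0:
        return False
    speed = math.dist(prev, curr) / dt_seconds  # change amount over elapsed time
    return min_speed <= speed <= max_speed
```

For example, with a band of 0.1 to 2.0 units per second, a 0.05-unit move over 0.1 s passes, while a 5-unit jump over the same interval is rejected as too fast.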

Next, the gesture recognition unit 260 determines, from the amount of part change detected by the part change detection unit 250, whether the change in the part is equal to or greater than a predetermined value (step S204). To make this determination, the gesture recognition unit 260 reads the part change detection parameter for the relevant part from the parameter database 284 and judges against it. For example, when the predetermined part is a finger and the gesture is finger bending, the determination is based on whether the change in the angle between the finger and the hand is equal to or greater than a predetermined amount.

If the change in the part is equal to or greater than the predetermined value (yes), the gesture recognition unit 260 recognizes that a gesture motion has been made (step S205). If it is not (no), the part change detection unit 250 continues to detect the coordinate change of the part (step S203), and the gesture recognition unit 260 again determines whether the change is equal to or greater than the predetermined value. When the gesture recognition unit 260 recognizes that a gesture motion has been made, it outputs the recognized gesture motion to the interface control unit 270.

The interface control unit 270 performs the interface control associated with the gesture motion it has received (step S206). The interface control unit 270 refers to the corresponding gesture database 286 and controls the interface 30 according to the interface control content that corresponds to the recognized gesture motion.

Although a method for detecting a single body part has been described here, the device may be configured to detect a plurality of body parts from one item of distance image data. For example, when the distance image data captures the upper body, a plurality of gestures such as a finger bending gesture, a head swing gesture, and a tongue-out gesture may be detected.

FIG. 3 shows an example of the data stored in the parameter database 284 and the corresponding gesture database 286, both held in the storage unit 280.

For each body part, the parameter database 284 stores parameters for extracting the part region from the distance image data, parameters for detecting the predetermined part, and parameters for detecting the part change used to recognize a gesture. These parameters differ for each body part to be detected. Here the part region extraction parameter specifies a three-dimensional extraction range by means of three-dimensional coordinate data, but this is not limiting: the range may equally be specified by two-dimensional coordinates in the image data together with a range of disparity data.

Here texture information and color information are specified as the part detection parameters for the finger, but the parameters specified differ depending on the body part. The details of the parameters that differ by body part are described later. These parameters may be configured so that they can be freely changed to arbitrary values according to the range of motion of the user's body.

The corresponding gesture database 286 stores each recognized gesture in association with its interface control content. For example, it stores that a mouse click is performed when a gesture of swinging the head to the right is recognized. As further examples, generating an alarm sound is stored for an arm swing and pressing a switch is stored for finger bending. The associations are not limited to these; a key press on a keyboard may also be associated with a given gesture. Based on the gesture content output by the gesture recognition unit 260, the interface control unit 270 refers to the corresponding gesture database 286 and performs interface control according to the interface control content stored in association with that gesture.
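The lookup performed by the interface control unit 270 can be pictured as a table from gesture names to actions. The following is a minimal sketch using the three example associations given above; the dictionary name, the gesture keys, and the `issued_actions` list that stands in for the interface 30 are all assumptions for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical record of issued actions; a real system would drive the
# interface 30 instead of appending to a list.
issued_actions: List[str] = []

# Gesture name -> interface control, mirroring the examples in the text.
CORRESPONDING_GESTURES: Dict[str, Callable[[], None]] = {
    "head_swing_right": lambda: issued_actions.append("mouse_click"),
    "arm_swing": lambda: issued_actions.append("alarm_sound"),
    "finger_bend": lambda: issued_actions.append("switch_press"),
}


def control_interface(gesture: str) -> bool:
    """Run the control associated with a recognized gesture, if one is registered."""
    action = CORRESPONDING_GESTURES.get(gesture)
    if action is None:
        return False  # no association stored for this gesture
    action()
    return True
```

Keeping the associations in data rather than in code makes it straightforward to remap a gesture, for example to a keyboard key press, without touching the recognition logic.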

The gesture recognition method and the interface control for each body part are described in detail below, following the flowcharts.

FIG. 4 is a flowchart showing an example of the flow of processing performed in the gesture recognition device 20 of the present invention: gesture recognition of a finger bending motion, followed by the interface control that corresponds to the recognized gesture.

First, the part region extraction unit 220 detects the region where the hand exists in the distance image data captured by the image capturing unit 210; that is, it extracts the part region of three-dimensional space that constitutes the hand region (step S401). Specifically, the part region extraction unit 220 reads from the parameter database 282 the parameter that defines the three-dimensional space in which the hand can exist within the distance image data, and extracts the region defined by that parameter from the distance image data. The parameter database stores, for example, the maximum and minimum values of the three-dimensional coordinates as the three-dimensional space in which the hand can exist.

Next, the part detection unit 230 detects the finger and hand to be detected from the distance image data of the region where the hand exists (step S402). From the distance image data of the part region output by the part region extraction unit 220, the part detection unit 230 detects the hand and the finger that is the gesture target, based on texture information. Specifically, the finger that is the gesture target carries a marker such as a colored finger cot, so a three-dimensional object having that colored region is extracted based on the texture information. For the hand, the three-dimensional objects in the region nearest the finger are labeled, and the hand is extracted by recognizing the largest label as the hand. A skin-color region corresponding to the hand can also be used at the same time when extracting the hand. For the colored region and the skin-color region to be extracted for the finger and hand, the maximum and minimum values of hue and brightness are stored in advance in the parameter database 284, and the part detection unit 230 reads these values to detect the finger and the hand.

Based on the finger and hand that the part detection unit 230 detected from the distance image data, the part change detection unit 250 detects the change in the angle between the finger and the hand across items of distance image data (step S403). Specifically, the part change detection unit 250 detects the change in finger bending from the positions of the finger and hand in the distance image data of one or several frames earlier, stored in the cache unit 240, and their positions in the current distance image data. To detect the change in finger bending, the moments of the finger and the wrist are calculated in each item of distance image data, and two-axis angles are calculated from the moments. The difference between the finger angle and the hand angle is then calculated.
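One common way to obtain an orientation angle from moments is the principal-axis formula based on second-order central moments. The sketch below applies it in two dimensions to pixel coordinates of the finger and hand regions. It is an assumption-laden simplification: the description mentions two-axis angles computed from moments but does not give the formula, and the function names here are hypothetical.

```python
import math
from typing import List, Tuple

Pixel = Tuple[float, float]  # (x, y) image coordinates


def principal_axis_angle(pixels: List[Pixel]) -> float:
    """Orientation (radians) of a region's principal axis from central moments."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


def finger_hand_angle(finger: List[Pixel], hand: List[Pixel]) -> float:
    """Angle between the finger axis and the hand axis, in [0, pi/2]."""
    diff = abs(principal_axis_angle(finger) - principal_axis_angle(hand))
    return min(diff, math.pi - diff)  # axis directions are defined modulo pi


def angle_change(prev_angle: float, curr_angle: float) -> float:
    """Change in the finger-hand angle between two frames."""
    return abs(curr_angle - prev_angle)
```

Because only the difference between the two axes is used, the result does not depend on where the hand lies in the image or on its overall rotation.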

Based on the angle calculated by the part change detection unit 250, the gesture recognition unit 260 determines whether the amount of change in the finger-hand angle is equal to or greater than a predetermined value (step S404). The threshold that serves as the predetermined value is stored in the parameter database 284, and the gesture recognition unit 260 reads and refers to that threshold to make the determination. If the change is equal to or greater than the predetermined value (yes), finger bending is determined (step S405). If the change in angle is below the predetermined value (no), it is determined that no finger bending gesture has been made, the process returns to step S403, and the part change detection unit 250 continues to detect the change in the finger-hand angle. When the gesture recognition unit 260 determines finger bending (step S405), it outputs to the interface control unit 270 that a finger bending gesture has been made. Because the hand region is detected first and the finger is then detected within it, the finger movement can be detected reliably and the gesture recognized even for a person whose hand moves through involuntary movement.

Next, the interface control unit 270 performs the interface control associated with finger bending according to the output of the gesture recognition unit 260 (step S406). Specifically, the interface control unit 270 refers to the corresponding gesture database 286 and applies the interface control corresponding to finger bending to the interface 30. Because finger bending is judged by the amount of change, regardless of where the hand and finger are located and from what angle the finger is bent, gestures can be recognized for people with a wide variety of characteristics.

FIG. 5 shows an example of distance image data captured when recognizing the finger bending gesture. Because the captured image data includes distance data, boundaries across which the distance differs greatly appear as white gaps. The finger and hand are detected from such distance image data.

FIG. 6 is a flowchart showing an example of the flow of processing performed in the gesture recognition device 20 of the present invention: gesture recognition of an arm swing motion, followed by the interface control that corresponds to the recognized gesture.

First, the part region extraction unit 220 detects the region where the arm can exist in the distance image data captured by the image capturing unit 210; that is, it extracts the part region of three-dimensional space that constitutes the forearm region from the elbow onward (step S601). Specifically, the part region extraction unit 220 reads from the parameter database 282 the parameter that defines the three-dimensional space in which the forearm from the elbow onward and the hand can exist within the distance image data, and extracts the region defined by that parameter from the distance image data. The parameter database stores, for example, the maximum and minimum values of the three-dimensional coordinates as the three-dimensional space in which the forearm and hand can exist.

Next, the part detection unit 230 detects the arm to be detected from the distance image data of the region where the arm exists (step S602). The part detection unit 230 applies a particle filter to the distance image data of the part region output by the part region extraction unit 220.

Based on the arm that the part detection unit 230 detected from the distance image data, the part change detection unit 250 detects the change in arm movement across items of distance image data (step S603). Specifically, the part change detection unit 250 detects the change in arm position by tracking local features with the particle filter. The likelihood of each particle is determined from the inter-frame difference of the distance image data, which allows a widely moving arm swing to be tracked. The part change detection unit 250 then detects the change in arm position by estimating the arm state from the distance moved by the centroid of the particle group.

The gesture recognition unit 260 determines whether the arm swing detected by the part change detection unit 250 is equal to or greater than a predetermined value (step S604). That is, it determines whether the distance moved by the centroid of the arm's particle group within a predetermined time is equal to or greater than a predetermined value. The threshold that serves as the predetermined value is stored in the parameter database 284, and the gesture recognition unit 260 reads and refers to that threshold to make the determination. If the value is equal to or greater than the predetermined value (yes), it is determined that the arm has been swung (step S605). If the change in arm position is below the predetermined value (no), it is determined that no arm swing gesture has been made, the process returns to step S603, and the part change detection unit 250 continues to detect the change in arm position. When the gesture recognition unit 260 determines that the arm has been swung (step S605), it outputs to the interface control unit 270 that an arm swing gesture has been made.
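The decision in step S604 can be sketched as comparing the current particle centroid with the oldest centroid still inside a time window. The sketch assumes the particle positions are supplied by a separate tracker; it does not implement the particle filter itself, and the class name `ArmSwingDetector` and its parameters are hypothetical.

```python
import math
from collections import deque
from typing import Deque, List, Tuple

Point = Tuple[float, float, float]


def centroid(particles: List[Point]) -> Point:
    """Mean position of the particle group."""
    n = len(particles)
    return tuple(sum(p[i] for p in particles) / n for i in range(3))


class ArmSwingDetector:
    def __init__(self, window_seconds: float, distance_threshold: float) -> None:
        self.window_seconds = window_seconds
        self.distance_threshold = distance_threshold
        self._history: Deque[Tuple[float, Point]] = deque()  # (timestamp, centroid)

    def update(self, timestamp: float, particles: List[Point]) -> bool:
        """Return True when the centroid moved far enough within the time window."""
        current = centroid(particles)
        self._history.append((timestamp, current))
        # Drop centroids that are older than the window.
        while timestamp - self._history[0][0] > self.window_seconds:
            self._history.popleft()
        oldest = self._history[0][1]
        return math.dist(oldest, current) >= self.distance_threshold
```

Bounding the comparison to a time window means a slow drift of the arm never accumulates into a swing, which is in line with excluding movements that are too slow.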

Next, the interface control unit 270 performs the interface control associated with the arm swing according to the output of the gesture recognition unit 260 (step S606). Specifically, the interface control unit 270 refers to the corresponding gesture database 286 and applies the interface control corresponding to the arm swing to the interface 30.

FIG. 7 is an example of a screen during arm swing recognition, in which the arm is being tracked from the distance image data with the particle filter. The center image is the distance image data, the upper-right image is the distance image data with the particle filter applied, and the lower-right image shows, at the fingertip, the centroid detected from the particle group. By applying the particle filter to the distance image data in this way, the local features of the arm are tracked, the centroid of the arm is detected, and the arm swing is detected by calculating the distance the centroid has moved.

FIG. 8 is a flowchart showing an example of the flow of processing performed in the gesture recognition device 20 of the present invention: gesture recognition of a head movement, followed by the interface control that corresponds to the recognized gesture.

First, the part region extraction unit 220 detects the region where the head can exist in the distance image data captured by the image capturing unit 210; that is, it extracts the part region of three-dimensional space that constitutes the head region above the neck (step S801). Specifically, the part region extraction unit 220 reads from the parameter database 282 the parameter that defines the three-dimensional space in which the head can exist within the distance image data, and extracts the region defined by that parameter from the distance image data. The parameter database stores, for example, the maximum and minimum values of the three-dimensional coordinates as the three-dimensional space in which the head can exist.

Next, the part detection unit 230 detects the head to be detected from the distance image data of the region where the head exists, and then detects the nose (step S802). From the distance image data of the part region output by the part region extraction unit 220, the part detection unit 230 detects the head that is the gesture target, based on the texture image. Specifically, an ellipsoidal shape that could be the head is extracted based on the texture image. The extracted range is then labeled, and the object closest to a face is recognized as the face. For the ellipsoidal shape and the face object to be extracted, the maximum and minimum coordinate values are stored in advance in the parameter database 284, and the part detection unit 230 reads these values to extract the face. Next, the distance image data of the extracted face image is normalized in position by zooming, rotation, and the like, and the point in the face image data nearest to the camera is extracted as the nose.

The part detection unit 230 then calculates the normal vector of the face from the nose (step S803). Specifically, starting from the detected nose position, it calculates the normal vector of the face (the face orientation) from the distance information of the region centered on the nose.

The part change detection unit 250 detects a change in the direction of the face normal (step S804). It calculates the amount of change between the face normal vector in the distance image data of one or several frames earlier, stored in the cache unit 240, and the normal vector in the current distance image data, and thereby detects the change in face orientation.

The gesture recognition unit 260 determines whether the change in the direction of the normal vector detected by the part change detection unit 250 is equal to or greater than a predetermined value (step S805). The determination may also be based on the amount of change in direction within a predetermined time. For example, taking the state in which the face is not moving as the initial position, the determination may use the amount of change that arises when the user deliberately turns the neck to the right, to the left, or downward. A threshold for each direction is stored in the parameter database 284 as the predetermined value, and the gesture recognition unit 260 reads and refers to that threshold to make the determination. If the change is equal to or greater than the predetermined value (yes), it is determined that the head has been moved (step S806). If the change in the direction of the normal is below the predetermined value (no), it is determined that no head swing gesture has been made, the process returns to step S803, and the part change detection unit 250 continues to detect the change in the direction of the normal vector. When the gesture recognition unit 260 determines that the head has been swung (step S806), it outputs to the interface control unit 270 that a head swing gesture has been made.
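The comparison of two face normals can be sketched as the angle between two vectors, checked against a threshold. This is a minimal sketch: the function names and the use of degrees are assumptions, and the per-direction thresholds mentioned in the text are reduced here to a single threshold for brevity.

```python
import math
from typing import Tuple

Vector = Tuple[float, float, float]


def angle_between(n_prev: Vector, n_curr: Vector) -> float:
    """Angle in degrees between two non-zero normal vectors."""
    dot = sum(a * b for a, b in zip(n_prev, n_curr))
    norm = math.hypot(*n_prev) * math.hypot(*n_curr)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


def head_moved(n_prev: Vector, n_curr: Vector, threshold_deg: float) -> bool:
    """True when the face orientation changed by at least threshold_deg."""
    return angle_between(n_prev, n_curr) >= threshold_deg
```

A per-direction variant could look up `threshold_deg` by the sign of the change in the horizontal or vertical component of the normal, which would allow a smaller threshold for a direction in which the user has less range of motion.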

Next, the interface control unit 270 performs the interface control associated with the head movement according to the output of the gesture recognition unit 260 (step S807). Specifically, the interface control unit 270 refers to the corresponding gesture database 286 and applies the interface control corresponding to the head movement to the interface 30.

FIG. 9 is an example of a screen during recognition of a head swing gesture, in which the nose has been extracted from the distance image data and the normal vector calculated. As shown, the face image is normalized, the nose is detected, and the normal vector is then calculated.

FIG. 10 is a flowchart showing an example of the flow of processing performed in the gesture recognition device 20 of the present invention: gesture recognition of a tongue-out motion, followed by the interface control that corresponds to the recognized gesture.

First, the part region extraction unit 220 detects the region where the head can exist in the distance image data captured by the image capturing unit 210; that is, it extracts the part region of three-dimensional space that constitutes the head region above the neck (step S1001). This step is the same as step S801 in the recognition of the head swing gesture, so the details are omitted.

Next, the part detection unit 230 detects the head to be detected from the distance image data of the region where the head exists, and then detects the nose (step S1002). This processing is also the same as step S802 in the recognition of the head swing motion, so its description is omitted.

Based on the nose detected from the distance image data, the part detection unit 230 detects the tongue region within the head region below the nose (step S1003). Specifically, starting from the detected nose position, it obtains the HSV color information (hue, brightness, and saturation) of the head region below the nose, reads from the parameter database 284 the hue threshold set as the tongue color, and performs filtering. Labeling is then applied to the region extracted by the filtering, and a region whose label is of a predetermined size or larger is detected as the tongue.
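The hue filtering and labeling in step S1003 can be sketched with the standard-library `colorsys` module and a breadth-first connected-component search. The sketch makes several assumptions not stated in the description: pixels are RGB triples in the range 0 to 1, regions use 4-connectivity, the hue range does not wrap around zero, and the function names are hypothetical.

```python
import colorsys
from collections import deque
from typing import List, Optional, Tuple

RGB = Tuple[float, float, float]
Cell = Tuple[int, int]  # (row, column)


def hue_mask(image: List[List[RGB]], hue_min: float, hue_max: float) -> List[List[bool]]:
    """Mark pixels whose hue (0..1) lies inside [hue_min, hue_max]."""
    return [[hue_min <= colorsys.rgb_to_hsv(*px)[0] <= hue_max for px in row]
            for row in image]


def label_regions(mask: List[List[bool]]) -> List[List[Cell]]:
    """Group marked pixels into 4-connected regions."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def detect_tongue(image: List[List[RGB]], hue_min: float, hue_max: float,
                  min_pixels: int) -> Optional[List[Cell]]:
    """Return the largest hue-matching region of at least min_pixels, else None."""
    regions = [r for r in label_regions(hue_mask(image, hue_min, hue_max))
               if len(r) >= min_pixels]
    return max(regions, key=len) if regions else None
```

The size check after labeling discards isolated pixels that happen to match the tongue hue, such as lip highlights or sensor noise.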

The part change detection unit 250 detects a change in the tongue region (step S1004). It calculates the amount of change between the tongue region in the distance image data of one or several frames earlier, stored in the cache unit 240, and the tongue region in the current distance image data, and thereby detects the change in the tongue region.

The gesture recognition unit 260 determines whether the change in the tongue region detected by the part change detection unit 250 is equal to or greater than a predetermined value (step S1005). Alternatively, the determination may be based on whether the tongue region has reached a predetermined area or more and remained so for a fixed time. The threshold that serves as the predetermined value is stored in the parameter database 284, and the gesture recognition unit 260 reads and refers to that threshold to make the determination. If the change is equal to or greater than the predetermined value (yes), it is determined that the tongue has been stuck out (step S1006). If the change in the tongue region is below the predetermined value (no), it is determined that no tongue-out gesture has been made, the process returns to step S1004, and the part change detection unit 250 continues to detect the change in the tongue region. When the gesture recognition unit 260 determines that the tongue has been stuck out (step S1006), it outputs to the interface control unit 270 that a tongue-out gesture has been made. Because the tongue region is extracted starting from detection of the head region, the head movement can be tracked reliably, the tongue region detected, and the gesture recognized even for a person whose face moves involuntarily.
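The alternative criterion above, an area at or above a threshold that persists for a fixed time, can be sketched as a small state machine. The class name `TongueOutDetector` and its parameters are hypothetical, and the area is assumed to be a pixel count supplied by the tongue detection step.

```python
from typing import Optional


class TongueOutDetector:
    def __init__(self, min_area: float, hold_seconds: float) -> None:
        self.min_area = min_area
        self.hold_seconds = hold_seconds
        self._since: Optional[float] = None  # time the area first reached min_area

    def update(self, timestamp: float, area: float) -> bool:
        """Return True once the area has stayed at or above min_area long enough."""
        if area < self.min_area:
            self._since = None  # the condition was interrupted, so start over
            return False
        if self._since is None:
            self._since = timestamp
        return timestamp - self._since >= self.hold_seconds
```

Requiring the condition to hold for a short time filters out single-frame color matches, at the cost of a small delay before the gesture is reported.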

Next, the interface control unit 270 performs the interface control associated with the tongue-out gesture according to the output of the gesture recognition unit 260 (step S1007). Specifically, the interface control unit 270 refers to the corresponding gesture database 286 and applies the interface control corresponding to the tongue-out gesture to the interface 30.

FIG. 11 shows an example of distance image data acquired when recognizing the tongue-out gesture. The center of (a) is distance image data with the tongue stuck out, and the processed images are displayed on the right: the upper right is the image data in which the head has been detected, the middle right is the image data normalized after head detection, and the lower right is the image data in which the tongue region has been detected. The center of (b) is distance image data with the tongue not stuck out, with images processed in the same way as in (a) displayed on the right. In this way, the color region corresponding to the color of the tongue is extracted, and the tongue-out gesture is recognized on the basis of changes in that color region.

FIG. 12 is a flowchart showing an example of the flow of gesture recognition for a knee-closing motion, and of the interface control processing corresponding to the recognized gesture, performed in the gesture recognition device 20 of the present invention.

First, the part region extraction unit 220 extracts, from the distance image data captured by the image capture unit 210, the region in which both knees may exist (step S1201). Specifically, the part region extraction unit 220 reads from the parameter database 284 the parameters that define the three-dimensional space in which the knees may exist within the distance image data, and extracts from the distance image data the region specified by those parameters. The parameter database stores, for example, the maximum and minimum values of the three-dimensional coordinates as the three-dimensional space in which the knees may exist.

Next, the part detection unit 230 detects the kneecaps of both knees, which are the detection targets, from the distance image data of the region in which the knees exist (step S1202). Using the distance image data of the knee region output by the part region extraction unit 220, the part detection unit 230 estimates the kneecap position of each knee by a hill-climbing method starting from the edge of each knee region, and thereby detects the kneecaps.
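
The hill-climbing estimate described above could be sketched as follows. This is only an illustrative sketch, not the implementation of the part detection unit 230: it assumes that the knee region is available as a small 2D grid of depth values, that the kneecap is the point of the region nearest to the camera (the smallest depth), and that an 8-neighbor search is used. The function name `hill_climb_nearest` and the grid layout are hypothetical.

```python
def hill_climb_nearest(depth, start):
    """Walk from `start` (row, col) to a local depth minimum.

    Assumption for this sketch: a smaller depth value means the point is
    nearer to the camera, and the kneecap is the nearest point of the knee.
    """
    rows, cols = len(depth), len(depth[0])
    r, c = start
    while True:
        best = (r, c)
        # Look at the 8 neighbors and remember the one with the smallest depth.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    if depth[nr][nc] < depth[best[0]][best[1]]:
                        best = (nr, nc)
        if best == (r, c):
            # No neighbor is nearer to the camera: local extremum reached.
            return best
        r, c = best


# Hypothetical knee region: two "hills" separated by a far column.
depth = [
    [9, 8, 7, 9, 7, 8, 9],
    [8, 6, 4, 9, 3, 6, 8],
    [9, 8, 7, 9, 7, 8, 9],
]
left_kneecap = hill_climb_nearest(depth, (1, 0))   # start at the left edge
right_kneecap = hill_climb_nearest(depth, (1, 6))  # start at the right edge
```

Starting one search from each outer edge, as the description suggests, keeps the two searches from converging on the same knee in this toy data.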

The part change detection unit 250 detects the change in the positions of both knees between distance image data, based on the kneecap positions of both knees detected by the part detection unit 230 from the distance image data (step S1203). Specifically, the part change detection unit 250 detects the change in knee positions from the kneecap positions in the distance image data of one or several frames earlier, stored in the cache unit 240, and the positions of both knees in the current distance image data. To detect the change, the amount of change is calculated from the position coordinates of both knees in the respective distance image data. Because the gesture here is knee closing, the amount of change calculated is the amount by which the position coordinates of the two knees approach each other.

Based on the change in knee positions calculated by the part change detection unit 250, the gesture recognition unit 260 determines whether the amount of change remains at or above a predetermined value for a predetermined time (step S1204). Alternatively, the determination may be based on whether the two knees, measured by the distance between their coordinate values, have come closer together than a predetermined value. The threshold that serves as the predetermined value is stored in the parameter database 284, and the gesture recognition unit 260 reads out and refers to this threshold to make the determination. If the amount of change is equal to or greater than the predetermined value (yes), it is determined that the knees have been closed (step S1205). If the amount of change in the knee positions is less than the predetermined value (no), it is determined that no knee-closing gesture has been made, the process returns to step S1203, and the part change detection unit 250 continues to detect changes in the positions of both knees. When it determines that the knees have been closed (step S1205), the gesture recognition unit 260 notifies the interface control unit 270 that the knee-closing gesture has been made.
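
The determination in step S1204 could look roughly like the sketch below. It is a sketch under stated assumptions rather than the actual processing of the gesture recognition unit 260: the amount of change is taken to be the reduction of the knee-to-knee distance relative to a resting distance (`baseline_gap`), and "a predetermined time" is expressed as a number of consecutive frames (`min_frames`). All names and the sample values are hypothetical.

```python
import math


def knee_gap(left_knee, right_knee):
    """Euclidean distance between two (x, y, z) kneecap positions."""
    return math.dist(left_knee, right_knee)


def knees_closed(gap_history, baseline_gap, min_change, min_frames):
    """Return True if the knees stayed closed for the last `min_frames` frames.

    gap_history  : knee-to-knee distances, oldest first (one per frame)
    baseline_gap : assumed resting distance between the knees
    min_change   : threshold on how much the gap must have shrunk
    min_frames   : number of consecutive frames (must be >= 1)
    """
    if len(gap_history) < min_frames:
        return False
    recent = gap_history[-min_frames:]
    return all(baseline_gap - gap >= min_change for gap in recent)


# Hypothetical distances in centimeters.
sustained = knees_closed([30, 20, 18, 17], baseline_gap=30, min_change=10, min_frames=3)
interrupted = knees_closed([30, 20, 29, 17], baseline_gap=30, min_change=10, min_frames=3)
```

Requiring the condition over several consecutive frames is one simple way to keep a brief involuntary movement from being recognized as a gesture.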

Next, in response to the output of the gesture recognition unit 260, the interface control unit 270 performs the interface control associated with knee closing (step S1206). Specifically, the interface control unit 270 refers to the corresponding gesture database 286 and performs, on the interface 30, the interface control that corresponds to knee closing. Note that the knees need not approach until they touch; the knee-closing gesture may be recognized once the knees are closer together than in their normal state. The position at which knee closing is recognized can be set freely through the threshold. Although the gesture described here is knee closing, gesture recognition may instead be based on opening the knees.

FIG. 13 shows an example of distance image data captured in order to detect the knee-closing gesture. As shown, for knee recognition the camera is set up so that the knees appear at the center of the image.

(Second embodiment)
The first embodiment described a gesture recognition device that sets parameters for recognizing gestures for each body part usable for interface control and lets the part to be used be selected, so that gestures can be recognized even though the part capable of reproducible movement differs from one physically disabled person to another. The second embodiment describes a gesture recognition device that provides a gesture determination mode before interface control by gesture recognition (the gesture recognition mode), and thereby recognizes gestures accurately in accordance with the symptoms of each physically disabled person, which change from day to day. Description of processing that is the same as in the first embodiment is omitted.

FIG. 14 is a block diagram showing an example of a gesture recognition system that includes the gesture recognition device of the second embodiment of the present invention. Components shared with FIG. 1 are given the same reference numerals, and their description is omitted. In FIG. 14, the gesture recognition device 200 is connected to the imaging device 10 and the interface 30. The gesture recognition device 200 has an image capture unit 210, a part region extraction unit 220, a part detection unit 230, a cache unit 240, a part change detection unit 250, a gesture recognition unit 260, an interface control unit 270, a storage unit 280, and a gesture determination unit 290.

The storage unit 280 holds a distance image database 282, a parameter database 284, a corresponding gesture database 286, and a gesture instruction program 288. The distance image database 282 stores the distance image data that the image capture unit 210 has captured from the imaging device 10. The parameter database 284 stores part region extraction parameters, which are coordinate ranges in the distance image data within which a given part region is imaged; part detection parameters, which are the color thresholds or coordinate values used to detect a given part; and thresholds for detecting a part change and recognizing it as a gesture. In the second embodiment, at least the part change detection parameter, which is the threshold for detecting a part change and recognizing it as a gesture, is determined in the gesture determination mode and is set to a different value for each user. This threshold, the amount of change to be recognized as a gesture, is a value determined by the gesture determination unit 290 and is updated each time the gesture determination mode is executed. The corresponding gesture database 286 stores gesture motions in association with the content of the interface control assigned to them. Based on the determination made by the gesture determination unit 290, it stores, for each user, the body part used for interface control in association with the content of the interface control.

The gesture instruction program 288 is a program that instructs the user to move a movable body part at predetermined timings, and it is read out by the gesture determination unit 290. The gesture instruction program 288 issues the instruction multiple times at predetermined timings to prompt the user to move. This makes it possible to capture motions that are made by the user's own intention and are reproducible (voluntary movements), and having the user perform the motion in response to repeated instructions makes it possible to determine more reliably which gestures can be used for interface control.

The gesture determination unit 290 determines, from one or more candidate parts of the user, the part and gesture to be associated with interface control. The body parts that can be moved differ from user to user, as do the extent and manner of movement. For this reason, in the gesture determination mode, the gesture determination unit 290 reads the gesture instruction program 288 from the storage unit 280, instructs the user to perform a gesture at predetermined timings, and determines, based on the amount of change detected by the part change detection unit 250, whether a body part that is a candidate for association with interface control moved at the predetermined timing. A change that occurs at the instructed timing can be judged to be a deliberate movement by the user, so reproducible movements, that is, gestures usable for interface control, can be acquired reliably. In addition, by issuing the instruction multiple times and having the user move the body part each time, the body part usable for gestures can be determined, and the threshold at which a movement is recognized as a gesture can be determined more reliably.

For example, the gesture determination unit 290 determines the amount of change to be recognized as a gesture from the amounts of change at the predetermined timings that were acquired over multiple repetitions in the gesture determination mode. The parameter treated as the amount of change differs for each body part: for a finger-bending gesture it is the change in the angle between the finger and the hand, and for an arm swing it is the change in the position of the arm. From the changes acquired over multiple repetitions, the amount of change that will finally be recognized as a gesture, that is, the threshold, is determined. The average of the acquired values may be used as the threshold. For finger bending, for example, the average of the hand-finger angles acquired at the timings instructed by the gesture instruction program is calculated and adopted as the amount of change. The gesture determination unit 290 may also store one or more previously determined values of the amount of change, calculate the average of the newly calculated average and the past values, and adopt the result as the new threshold. In this case, the stored past values may be managed so that the oldest value is deleted each time a new value is stored (FIFO).
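
The averaging with a FIFO history described above could be sketched as follows. This is an illustrative sketch, not the implementation of the gesture determination unit 290. It assumes that the new threshold is the plain average of the current session's average and all stored past values, and that the stored history holds at most `max_past` values; the class name `ThresholdHistory` and its parameters are hypothetical.

```python
from collections import deque
from statistics import mean


class ThresholdHistory:
    """Keeps the most recent thresholds of one user and body part."""

    def __init__(self, max_past=3):
        # deque with maxlen drops the oldest entry automatically (FIFO).
        self.past = deque(maxlen=max_past)

    def update(self, samples):
        """Derive a new threshold from the change amounts of this session."""
        session_average = mean(samples)
        # Average the session value together with the stored past values.
        threshold = mean([session_average, *self.past])
        self.past.append(threshold)
        return threshold


# Hypothetical finger-bending angles (degrees) from three daily sessions.
history = ThresholdHistory(max_past=2)
day1 = history.update([10, 20])   # no history yet: the session average
day2 = history.update([25, 35])   # average of 30 and the stored 15
day3 = history.update([10, 10])   # average of 10, 15 and 22.5
```

Blending in past values smooths out day-to-day fluctuation, while the bounded history lets the threshold still follow a lasting change in the user's condition.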

The body part usable for gestures and the amount of change of the gesture motion determined by the gesture determination unit 290 are stored in the parameter database 284, in association with the user, as the threshold for gesture recognition. The body part and amount of change determined by the gesture determination unit 290 are also stored in the corresponding gesture database 286 as a gesture associated with interface control.

In the gesture recognition mode, the gesture recognition unit 260 reads out the amount of change for the given body part determined by the gesture determination unit 290 (the part change detection parameter) and, when the amount of change of that part detected by the part change detection unit 250 is equal to or greater than the threshold given by the part change detection parameter, recognizes that the corresponding gesture has been performed. On determining that a gesture has been made, the gesture recognition unit 260 sends the content of the gesture to the interface control unit 270. The interface control unit 270 performs the interface control associated with the gesture recognized by the gesture recognition unit 260.

FIG. 15 is a flowchart showing an example of the flow of processing for determining a user's gesture, performed in the gesture recognition device 20 in the gesture determination mode of the second embodiment of the present invention. The gesture determination mode is always executed when a user uses the gesture recognition device for the first time. In this mode, the body part used for the gesture and the part change used as the gesture are determined, and the determined body part and amount of part change are then used for gesture recognition in the subsequent gesture recognition mode. The gesture determination mode may also be executed every day. Doing so allows the amount of change recognized as a gesture to follow the user's physical symptoms as they change from day to day, yielding a more accurate and user-friendly gesture recognition device.

Before the gesture determination mode starts, the user is positioned so that the imaging device 10 can capture the front of the whole body, the upper body, or the lower body. When the gesture determination mode starts, the gesture instruction program 288 is read out by the gesture determination unit 290 and started (step S1601). The gesture instruction program 288 is a program that cues and instructs the user to move a body part at predetermined timings. In the instruction, a circle (○) marking the timing of each motion flows from left to right in time with music, and a sound is played when the user makes the gesture as the circle reaches a predetermined position on the left side. After the first run, the program can issue instructions for two or more candidate parts, either at different timings or simultaneously.

The part region extraction unit 220 detects the regions of the candidate parts (step S1602). From the distance image data that the image capture unit 210 acquired from the imaging device 10, the part region extraction unit 220 extracts the regions of body parts that are candidates for use in gestures. Candidate body part regions include, for example, the head, hands, arms, knees, toes, fingers, and mouth. Because the positional relationship between the imaging device 10 and the user determines the range within the distance image data in which a given part can exist, the part region extraction unit 220 reads out the three-dimensional coordinate data (x, y, z) stored in the parameter database 284 as part region extraction parameters and uses them to detect the candidate part regions.

Next, the part detection unit 230 detects the given part within the part region (step S1603). The part detection unit 230 detects the part based on part detection parameters such as texture information, color information (for example, hue, saturation, and lightness), and shape within the distance image data of the part region extracted by the part region extraction unit 220. The part detection unit 230 reads the part detection parameters for the relevant body part from the parameter database 284 and uses them for detection.

The part change detection unit 250 detects the coordinate change of the part (step S1604). Based on the candidate parts that the part detection unit 230 detected in each set of distance image data, it detects the change in the coordinates of each candidate part between distance image data. For example, the part change detection unit 250 calculates the coordinate change between the candidate part region in the distance image data of the preceding frame, which is temporarily stored in the cache unit 240 for comparison, and the candidate part region in the distance image data of the current frame. When the candidate part is the tongue, for instance, it calculates the amount of change in the tongue region; when it is the knee, it calculates the amount of change in knee position.

Next, when the part change detection unit 250 detects a part change, the gesture determination unit 290 determines whether the coordinate change occurred at a predetermined timing (step S1605). The predetermined timing is the timing at which the gesture instruction program instructed the user to move the body part. A physically disabled person's body may move regardless of his or her intention. A coordinate change at the predetermined timing, however, is clearly due to a deliberate movement by the user and is therefore a coordinate change usable as a gesture. Here, for example, a coordinate change within 0.5 seconds of the instructed timing is determined to be a coordinate change at the predetermined timing.
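
The timing check of step S1605 could be sketched as below. This is only a sketch: the description gives 0.5 seconds as an example window, and the sketch additionally assumes that the window opens at the cue and extends forward in time, and that times are given in seconds on a common clock. The function name `is_cued_movement` is hypothetical.

```python
def is_cued_movement(change_time, cue_times, window=0.5):
    """Return True if a detected change falls within `window` seconds after a cue.

    change_time : time at which the part change was detected (seconds)
    cue_times   : times at which the instruction program cued the user (seconds)
    window      : accepted delay after a cue; 0.5 s is the example value
    """
    return any(0.0 <= change_time - cue <= window for cue in cue_times)


# Hypothetical cue schedule: cues at 2.0 s and 4.0 s.
cues = [2.0, 4.0]
voluntary = is_cued_movement(2.3, cues)    # shortly after the first cue
involuntary = is_cued_movement(3.0, cues)  # between cues
too_early = is_cued_movement(1.9, cues)    # before the first cue
```

Only changes for which the check returns true would be stored in step S1606; the others are treated as possibly involuntary and ignored.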

If the coordinate change occurred at the predetermined timing (yes), the gesture determination unit 290 stores the amount of coordinate change (step S1606). Specifically, it stores the coordinate values of the start and end points of the movement within the candidate part region calculated by the part change detection unit 250, and the time taken to travel from the start point to the end point. These data are stored successively until the gesture instruction program ends.

If the coordinate change did not occur at the predetermined timing (no), the part change detection unit 250 continues to detect coordinate changes of the candidate part (step S1604).

The gesture determination unit 290 then determines whether the gesture instruction program has ended (step S1607). If it has ended (yes), the gesture determination unit 290 calculates an average value from the stored amounts of change (step S1608). Specifically, it calculates, from the multiple acquired amounts of change, the threshold to be used in gesture recognition. Although the average is used here, the calculation is not limited to this. For example, the value midway between the maximum and minimum of the acquired amounts of change may be adopted as the threshold. Alternatively, the minimum among those acquired amounts of change that are at or above a certain value may be adopted as the threshold. Because the parameter treated as the amount of change differs for each body part, the method of deriving the threshold from the multiple amounts of change may also differ by body part.
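
The three ways of deriving a threshold mentioned above (the average, the value midway between maximum and minimum, and the smallest value at or above a floor) could be sketched as follows. The function names and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def threshold_mean(changes):
    """Average of all acquired change amounts."""
    return mean(changes)


def threshold_midrange(changes):
    """Value midway between the largest and smallest change amount."""
    return (max(changes) + min(changes)) / 2


def threshold_min_above(changes, floor):
    """Smallest change amount that is at or above `floor`, or None if none is."""
    kept = [c for c in changes if c >= floor]
    return min(kept) if kept else None


# Hypothetical change amounts collected during one run of the program.
changes = [3, 12, 15, 40]
by_mean = threshold_mean(changes)
by_midrange = threshold_midrange(changes)
by_min_above = threshold_min_above(changes, floor=10)
```

With these sample values the three methods give noticeably different thresholds, which illustrates why the choice of method may reasonably depend on the body part.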

When calculating the threshold, if the gesture instruction program has been run for the same user in the past and a threshold was determined, one or more such thresholds may be stored, and the average of the past thresholds and the value calculated from the amounts of change in the current run may be calculated and used as the threshold. In this case, the stored past thresholds may be managed so that the oldest value is deleted each time a new value is stored (FIFO). If, among the candidate parts, there is a part for which the acquired values vary widely, the gesture determination unit 290 does not calculate an average for that part and decides not to use it for gestures.
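
The description does not state how "varies widely" is measured. One possible criterion, shown purely as an assumption, is the coefficient of variation (standard deviation divided by mean) compared against a limit. The function name `is_reproducible` and the limit `max_cv` are hypothetical.

```python
from statistics import mean, pstdev


def is_reproducible(changes, max_cv=0.2):
    """Return True if the change amounts are consistent enough to use.

    Assumption of this sketch: consistency is judged by the coefficient of
    variation, and a part is rejected when it exceeds `max_cv`.
    """
    average = mean(changes)
    if average == 0:
        return False  # no measurable movement at all
    return pstdev(changes) / abs(average) <= max_cv


# Hypothetical change amounts for two candidate parts.
steady_part = is_reproducible([10, 11, 9, 10])
erratic_part = is_reproducible([2, 30, 5, 40])
```

A part for which the check fails would simply be left out when the threshold is calculated and the gesture is determined.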

If the gesture instruction program has not ended (no), the part change detection unit 250 continues to detect coordinate changes of the candidate part (step S1604).

Next, the gesture determination unit 290 updates the values stored in the parameter database 284 and determines the gesture (step S1609). Specifically, the gesture determination unit 290 stores the calculated average value as the threshold, in association with the body part it has decided to use for gesture recognition. The updated part change detection parameter is used for gesture recognition in the gesture recognition mode. Having determined the part to be used for the gesture, the gesture determination unit 290 also updates the corresponding gesture database 286 to associate that body part with interface control. The gesture determination mode then ends. Providing the gesture determination mode in this way makes it possible to determine a gesture recognition threshold suited to each person's symptoms, so that gestures can be recognized with high accuracy.

Once the body part used for the gesture and the threshold for gesture recognition have been determined in the gesture determination mode, the gesture recognition mode reads the threshold determined in the gesture determination mode from the parameter database 284, performs gesture recognition, and carries out interface control. The gesture recognition processing is the same as in the first embodiment, and its description is omitted.

FIG. 16 shows an example of the instruction screen of the gesture instruction program 288 used in the present invention. In the gesture determination mode, when the gesture instruction program 288 is executed, the screen instructs the user to move the body. Here, four parts can be assigned, and the user moves the assigned body part at the moment the corresponding circle reaches the left end. This is only an example; any configuration, such as giving the instructions by voice, may be used as long as the program instructs the user to move a body part at predetermined timings.

When the gesture determination mode ends, the device enters the gesture recognition mode, in which the gesture recognition unit performs gesture recognition, basically following the flowchart of FIG. 2, based on the body part determined by the gesture determination unit 290 for use in gesture recognition and on its threshold. The gesture recognition processing for each body part is the same as in the first embodiment, and its description is omitted.

(Third embodiment)
Next, a third embodiment is described. The third embodiment also provides a gesture determination mode and determines a gesture recognition threshold tailored to each person but, unlike the second embodiment, it determines the gesture without regard to where on the body the detected part is. The gesture recognition system of the third embodiment is the same as that of the second embodiment. Description of content that is the same as in the second embodiment is omitted.

FIG. 17 is a flowchart showing an example of the flow of processing for determining a user's gesture, performed in the gesture recognition device 20 in the gesture determination mode of the third embodiment of the present invention.

Before the gesture determination mode starts, the user is positioned so that the body part used for the gesture is the part closest to the imaging device 10. Body parts used for gestures include the hands, arms, knees, toes, fingers, mouth, and ears; of these, the fingers, mouth, and ears are parts whose movements are small. For example, when the body part used for the gesture is the toes, the positions of the imaging device and the user are adjusted in advance so that the toes are closest to the imaging device 10. For a bedridden user, the plane of the surface on which the user is lying (the floor plane) is estimated, and the positions of the imaging device and the user are adjusted so that, above that plane, the body part used for the gesture is closest to the imaging device 10. Placing the part closest to the imaging device makes its movement easy to detect. When the gesture determination mode starts, the gesture instruction program 288 is read out by the gesture determination unit 290 and started (step S1701). The gesture instruction program 288 cues and instructs the user to move the body part at predetermined timings.

The part region extraction unit 220 detects candidate regions (step S1702). It extracts candidate regions from the distance image data that the image capturing unit 210 acquired from the imaging device 10. Because the body part used for the gesture has been placed close to the imaging device 10 in advance, a threshold is set on the parallax value so that regions whose parallax data indicate proximity to the imaging device 10 are extracted, and candidate regions are extracted using that threshold.
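The parallax thresholding described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation disclosed in the patent: the function name `extract_candidate_mask`, the list-of-rows representation of the parallax map, and the threshold value 25 are all hypothetical. It relies on the general property of stereo imaging that a larger parallax (disparity) value corresponds to a point nearer the camera.

```python
def extract_candidate_mask(disparity, min_disparity):
    """Mark pixels whose parallax value meets the threshold.

    A larger parallax value means the point is nearer the camera, so
    "near" pixels are those with disparity >= min_disparity.
    """
    return [[d >= min_disparity for d in row] for row in disparity]


# Hypothetical 3x4 parallax map: the body part placed nearest the
# camera produces the largest values (top-right corner here).
disparity = [
    [5, 6, 30, 31],
    [4, 7, 32, 29],
    [5, 5, 6, 7],
]
mask = extract_candidate_mask(disparity, min_disparity=25)
```

In a per-user setup, `min_disparity` would correspond to the parallax threshold that the text says may be stored as a part region extraction parameter.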

Next, the part detection unit 230 detects, as a part, a rectangle containing the extracted region (step S1703). For example, the part detection unit 230 sets, in each frame image, a rectangle containing the candidate region extracted by the part region extraction unit 220. That is, for each of a predetermined number of frame images, it determines a rectangle enclosing the motion region. The rectangle is specified by three-dimensional coordinates (x, y, z) and screen coordinates (u, v).
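A minimal sketch of setting a rectangle around a candidate region is shown below. It covers only the screen coordinates (u, v); the three-dimensional coordinates mentioned in the text are omitted for brevity. The function name `bounding_rectangle` and the `(u_min, v_min, u_max, v_max)` tuple layout are assumptions made for illustration.

```python
def bounding_rectangle(mask):
    """Return (u_min, v_min, u_max, v_max) enclosing all True pixels.

    `mask` is a list of rows; u is the column index and v the row index.
    Returns None when the mask contains no candidate pixels.
    """
    points = [
        (u, v)
        for v, row in enumerate(mask)
        for u, on in enumerate(row)
        if on
    ]
    if not points:
        return None
    us = [u for u, _ in points]
    vs = [v for _, v in points]
    return (min(us), min(vs), max(us), max(vs))


mask = [
    [False, False, True, True],
    [False, False, True, True],
    [False, False, False, False],
]
rect = bounding_rectangle(mask)  # (2, 0, 3, 1)
```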

The part change detection unit 250 detects changes within the rectangle (step S1704). The part detection unit 230 sets a rectangle containing the candidate region in each set of distance image data. The part change detection unit 250 then calculates the change between the rectangular region in the distance image data of the previous frame, which is the comparison target among the predetermined number of past frame images temporarily stored in the cache unit 240, and the rectangular region in the distance image data of the current frame. The change is either a change in the coordinates of feature points or an inter-frame difference of a gray image or of an edge image derived from a gray image. For parts with small movements, such as the mouth or ears, it is desirable to calculate the inter-frame difference of the gray image or of the edge image derived from it. Accordingly, when it is known that, because of the user's condition, the user can move only the mouth, ears, or similar parts slightly, change detection may be preset to use the inter-frame difference. For a user who can move the body enough that the movement is immediately apparent to an observer, the change in feature point coordinates is detected.
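The inter-frame difference within a rectangle might be computed along the following lines. The text does not specify how the difference is aggregated, so the sum of absolute gray-level differences used here is an assumption, as are the function name `frame_difference` and the inclusive rectangle bounds.

```python
def frame_difference(prev_gray, curr_gray, rect):
    """Sum of absolute gray-level differences inside rect.

    rect is (u_min, v_min, u_max, v_max) with inclusive bounds;
    images are lists of rows indexed as image[v][u].
    """
    u_min, v_min, u_max, v_max = rect
    total = 0
    for v in range(v_min, v_max + 1):
        for u in range(u_min, u_max + 1):
            total += abs(curr_gray[v][u] - prev_gray[v][u])
    return total


# Hypothetical 2x3 gray images for the previous and current frames.
prev_gray = [[10, 10, 10], [10, 10, 10]]
curr_gray = [[10, 14, 10], [10, 10, 7]]
change = frame_difference(prev_gray, curr_gray, rect=(1, 0, 2, 1))  # 4 + 3 = 7
```

Restricting the loop to the rectangle reflects the point made in the text that limiting the tracked region reduces processing.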

Next, when the part change detection unit 250 detects a change, the gesture determination unit 290 determines whether the movement occurred at a predetermined timing (step S1705). The predetermined timing is the timing at which the gesture instruction program instructed the user to move the body part. For example, the gesture determination unit 290 determines that a movement occurred at the predetermined timing if it falls within a 0.5-second window from the instructed timing. A person with a physical disability may experience body movements unrelated to his or her intention. A coordinate change at the predetermined timing, however, is clearly the result of the user's intentional movement, and is therefore a coordinate change usable as a gesture.
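The timing check can be illustrated with a short sketch. The text gives a 0.5-second window as an example but does not say whether it starts at the cue or is centered on it; the version below assumes the window starts at the cue. The function name `is_cued_movement` and the cue times are hypothetical.

```python
def is_cued_movement(change_time, cue_times, window=0.5):
    """True if the change occurred within `window` seconds after any cue.

    Times are in seconds on a common clock. Changes outside every
    window are treated as unintended movement and ignored.
    """
    return any(0.0 <= change_time - cue <= window for cue in cue_times)


# Hypothetical cue schedule issued by the instruction program.
cue_times = [2.0, 6.0, 10.0]
intended = is_cued_movement(2.3, cue_times)    # inside the first window
unintended = is_cued_movement(4.1, cue_times)  # between cues
```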

If the movement occurred at the predetermined timing (yes), the gesture determination unit 290 determines the rectangle and stores the amount of change (step S1706). For a change at the predetermined timing, the gesture determination unit 290 selects the largest rectangle among the rectangles that the part detection unit 230 set in each of the predetermined number of frame images from the start to the end of the change. Identifying in advance the image region that changes when the gesture is made limits the region to be tracked and speeds up processing. The determined rectangle is then stored in the parameter database 284 as a part detection parameter, in three-dimensional coordinates (x, y, z) and screen coordinates (u, v).
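Selecting the largest rectangle over the frames of one movement could look like the sketch below. "Largest" is interpreted here as largest pixel area, which is an assumption; the helper names `rect_area` and `largest_rectangle` are likewise hypothetical.

```python
def rect_area(rect):
    """Pixel area of an inclusive (u_min, v_min, u_max, v_max) rectangle."""
    u_min, v_min, u_max, v_max = rect
    return (u_max - u_min + 1) * (v_max - v_min + 1)


def largest_rectangle(rects):
    """Return the rectangle with the greatest area among per-frame rectangles."""
    return max(rects, key=rect_area)


# Hypothetical rectangles set in three consecutive frames of one movement.
per_frame_rects = [(2, 0, 3, 1), (1, 0, 4, 2), (2, 1, 3, 1)]
tracking_rect = largest_rectangle(per_frame_rects)  # (1, 0, 4, 2)
```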

The gesture determination unit 290 then calculates the three-dimensional trajectory given by the feature point coordinate changes that the part change detection unit 250 calculated from the first changed image to the last. For the three-dimensional trajectory, the coordinates in the steady state at the start point of the movement and the coordinates of the end point at maximum displacement are each stored in three-dimensional coordinates and screen coordinates. The distance between the start point and the end point may be stored along with the coordinate values. When the inter-frame difference is detected as the change, the inter-frame difference of the gray image or of the edge image derived from the gray image is stored. These data are stored successively until the gesture instruction program ends.

If the movement did not occur at the predetermined timing (no), the gesture determination unit 290 ignores the change detected by the part change detection unit 250, and the part change detection unit 250 continues to detect movement at the predetermined position (step S1704).

The gesture determination unit 290 then determines whether the gesture instruction program has ended (step S1707). If it has ended (yes), the gesture determination unit 290 calculates an average value from the stored change amounts (step S1708). The calculated average serves as the threshold for gesture recognition. For example, when a three-dimensional trajectory based on coordinate changes has been acquired, an average is calculated for the coordinate values of the start point and of the end point of the movement. Although the average is used as the threshold here, the threshold need not be the average. For example, the midpoint between the start point and end point coordinates may be used as the threshold, or the trajectory positions may be made into a histogram and the threshold determined by a binary threshold selection method. When the inter-frame difference value is used as the change amount, the local maxima of the change amount may be collected and the smallest of them used as the threshold for recognizing a gesture. In that case, low local maxima that deviate by more than a certain amount from the normal distribution of the maxima may be excluded.
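Two of the threshold choices described above, the average of the stored change amounts and the smallest local maximum of the inter-frame difference, can be sketched as follows. The function names and sample values are hypothetical, a local maximum is assumed to mean a value strictly greater than both neighbours, and the optional exclusion of outlying low maxima is not shown.

```python
from statistics import mean


def mean_threshold(change_amounts):
    """Threshold as the average of the change amounts stored per cue."""
    return mean(change_amounts)


def local_maxima(series):
    """Values strictly greater than both of their neighbours."""
    return [
        series[i]
        for i in range(1, len(series) - 1)
        if series[i - 1] < series[i] > series[i + 1]
    ]


def min_peak_threshold(diff_series):
    """Threshold as the smallest local maximum, or None if there is no peak."""
    peaks = local_maxima(diff_series)
    return min(peaks) if peaks else None


# Hypothetical inter-frame difference values over one instruction session.
diffs = [0, 3, 9, 2, 1, 7, 2, 0, 12, 4]
peak_threshold = min_peak_threshold(diffs)  # peaks are 9, 7, 12
avg_threshold = mean_threshold([4, 6, 8])
```

Using the smallest peak makes the weakest intentional movement observed during the session still count as a gesture, which suits users whose movement amplitude varies.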

When calculating the threshold, if the gesture instruction program has previously been run for the same user and a threshold calculated, one or more previously calculated thresholds may be stored, and the average of those past thresholds and the threshold calculated from the change amounts of the current run may be used as the threshold. In this case, the stored past thresholds may be managed first-in, first-out (FIFO), so that the oldest value is deleted each time a new value is stored. If the acquired values for any candidate part vary widely, the gesture determination unit 290 does not calculate an average for that candidate part and decides not to use it for gestures.
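The FIFO handling of past thresholds can be sketched with a bounded queue. The class name `ThresholdHistory`, the history length, and the choice to store each session's own threshold (rather than the combined average) are assumptions for illustration; the text leaves these details open.

```python
from collections import deque
from statistics import mean


class ThresholdHistory:
    """Keeps the most recent per-session thresholds for one user."""

    def __init__(self, max_entries=3):
        # A deque with maxlen discards the oldest entry automatically (FIFO).
        self._past = deque(maxlen=max_entries)

    def update(self, session_threshold):
        """Average the new session threshold with stored ones, then store it."""
        combined = mean([*self._past, session_threshold])
        self._past.append(session_threshold)
        return combined


history = ThresholdHistory(max_entries=2)
results = [history.update(t) for t in (10, 20, 30, 40)]
```

With a history length of 2, the fourth call averages 20, 30, and 40 because the value 10 has already been dropped.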

If the gesture instruction program has not ended (no), the part change detection unit 250 continues to detect movement at the predetermined position (step S1704).

Next, the gesture determination unit 290 updates the values stored in the parameter database 284 and determines the gesture (step S1709). Specifically, the gesture determination unit 290 stores in the parameter database 284, as a part on which gesture recognition is performed, the coordinate values of the rectangle used to extract the gesture from the image and the threshold used to detect the change for gesture recognition. The threshold for gesture recognition is the part change detection parameter, and is either the coordinates of the start and end points or an inter-frame difference value. The updated part change detection parameter is used for gesture recognition in the gesture recognition mode. On determining a gesture, the gesture determination unit 290 also updates the corresponding gesture database 286 to associate the gesture with an interface control. The gesture determination mode then ends. In this way, the third embodiment makes it possible to perform gesture recognition suited to each individual's condition without regard to which part of the body the moving part is.

FIG. 18 shows an example of the data stored in the parameter database 284 and the corresponding gesture database 286 according to the third embodiment of the present invention. The parameter database 284 stores, in association with each user, the part used for gesture recognition, parameters specifying its rectangular region, and the threshold for detecting the change for gesture recognition. The part information, however, does not identify which part of the body is used; it is stored as a label such as "part 1" or "part 2" to distinguish regions when more than one region is used for gestures. In the third embodiment, gesture recognition is thus made possible by storing only the rectangle coordinates for detecting the motion region and the gesture recognition threshold, which is a coordinate value or an inter-frame difference value, without regard to where on the body the recognized part actually is. Here the rectangular region is stored as the coordinates of four points in three-dimensional coordinates and screen coordinates, but any other information that specifies the rectangular region may be used. When candidate regions are detected using a parallax threshold and a different threshold is set for each individual, the parallax threshold may also be stored as a part region extraction parameter.

The corresponding gesture database 286 stores the change amount recognized as a gesture in association with the content of the interface control. In the third embodiment, a gesture is recognized when a given location moves by a predetermined amount or more, rather than by detecting what kind of movement is made by which body part; the change amount recognized as a gesture and the interface control content are therefore stored for each user.
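The two databases might be modeled as in the sketch below. This is not the layout of FIG. 18: every field name, the in-memory dictionaries, the user identifier, and the control name `"switch_on"` are hypothetical, and the rectangle is reduced to screen coordinates only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartParameter:
    """One parameter database entry for a user's gesture region."""
    user_id: str
    part_label: str   # e.g. "part 1"; a label, not an anatomical name
    rect: tuple       # (u_min, v_min, u_max, v_max) in screen coordinates
    change_kind: str  # "coordinate" or "frame_difference"
    threshold: float  # change amount recognized as a gesture


# Stand-ins for the parameter database and the corresponding gesture database.
parameter_db = {
    ("user_a", "part 1"): PartParameter(
        "user_a", "part 1", (120, 80, 200, 160), "frame_difference", 350.0
    ),
}
gesture_db = {
    ("user_a", "part 1"): "switch_on",
}


def lookup_control(user_id, part_label):
    """Return the interface control bound to a user's region, or None."""
    return gesture_db.get((user_id, part_label))
```

Keying both tables by user and region label mirrors the point that no anatomical identity is needed, only which stored region moved.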

FIG. 19 is a flowchart showing an example of the flow of processing, performed in the gesture recognition device 20 according to the third embodiment of the present invention, for recognizing a user's gesture and performing the interface control corresponding to the recognized gesture.

In the gesture recognition mode, the part region extraction unit 220 extracts regions close to the imaging device 10 (step S1901). Specifically, from the distance image data captured by the image capturing unit 210, the part region extraction unit 220 extracts the regions that satisfy the parallax threshold.

The part detection unit 230 detects a rectangle as the predetermined part (step S1902). It reads the coordinate values of the rectangular region stored in the parameter database 284 and sets a rectangle enclosing one of the regions extracted by the part region extraction unit 220. When multiple parts are used for gesture recognition, multiple rectangles are set.

Next, the part change detection unit 250 detects changes within the set rectangle (step S1903). If coordinate values are stored as the part change detection parameter, it calculates the change in coordinate values; if an inter-frame difference value is stored, it calculates the inter-frame difference.

The gesture recognition unit 260 determines whether the change within the rectangle detected by the part change detection unit 250 is equal to or greater than a predetermined value (step S1904). This determination is based on the part change detection parameter stored in the parameter database 284. If the change is equal to or greater than the predetermined value (yes), it is recognized as a gesture (step S1905). If not (no), the part change detection unit 250 continues to detect changes within the rectangle (step S1903). On recognizing that a gesture has been made, the gesture recognition unit 260 notifies the interface control unit 270.
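The recognition decision reduces to comparing each frame's change amount with the stored threshold, as in this sketch. The function name `recognized_frames` and the sample values are hypothetical; a real system would process frames as they arrive rather than as a finished list.

```python
def recognized_frames(changes, threshold):
    """Indices of frames whose change amount is at or above the threshold."""
    return [i for i, change in enumerate(changes) if change >= threshold]


# Hypothetical per-frame change amounts inside the stored rectangle.
changes = [12, 40, 355, 90, 350]
hits = recognized_frames(changes, threshold=350)  # frames 2 and 4
```

Frames below the threshold produce no output, which corresponds to the "no" branch that simply continues detecting changes.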

The interface control unit 270 refers to the corresponding gesture database 286 in response to the output from the gesture recognition unit 260 and performs the interface control on the interface 30 (step S1906). In this way, gesture recognition can be performed without a gesture recognition model for each body part.

FIG. 20 is an example of a hardware configuration diagram of the gesture recognition device. The gesture recognition device 20 may be configured as a personal computer owned by the system user and a program executed on that personal computer. The personal computer includes a CPU (central processing unit) 1001 and, connected to the CPU 1001 via a bus, a RAM (Random Access Memory) 1003, a ROM (Read Only Memory) 1005, an external storage device 1007 such as a hard disk drive, an I/O interface 1009, and a communication interface 1011 for connecting to a communication network line. A camera 1013, an alarm 1015, a button 1017, and a switch 1019 are connected to the interface 1009. In this case, for example, the functions of the image capturing unit 210, part region extraction unit 220, part detection unit 230, part change detection unit 250, gesture recognition unit 260, interface control unit 270, and gesture determination unit 290 of the gesture recognition device 20 are realized by the program executed on the personal computer; the function of the storage unit 280 is realized by the external storage device 1007; and the functions of the imaging device 10 and the interface 30 are realized by the camera and by the alarm, button, and switch, respectively. The programs that realize these functions are stored in the external storage device 1007, read into the RAM 1003, and then executed by the CPU 1001.

DESCRIPTION OF SYMBOLS

10 Imaging device
20 Gesture recognition device
30 Interface
210 Image capturing unit
220 Part region extraction unit
230 Part detection unit
240 Cache unit
250 Part change detection unit
260 Gesture recognition unit
270 Interface control unit
280 Storage unit

Claims

A gesture recognition device that recognizes a user's gesture based on distance image data captured by an imaging device and performs interface control associated with the recognized gesture on the interface device,
An image capturing unit that captures distance image data output from the imaging device;
A region extraction unit for extracting from the distance image data a region where one or a plurality of candidate regions of the user exists each time the distance image data is captured;
A site detector that detects one or more candidate sites from the extracted site region;
A part change detection unit that detects a change amount of the detected candidate part based on the detected predetermined part in each distance image data;
A gesture recognition unit for recognizing that a gesture has been performed when a change in the detected predetermined part is equal to or greater than a predetermined value;
When it is recognized that a gesture has been performed, an interface control unit that performs interface control associated with the gesture;
A gesture determining unit that determines, from one or a plurality of candidate parts of the user, a part associated with interface control and a gesture at that part;
The gesture determination unit instructs the user to perform a gesture at a predetermined timing, and determines a part associated with interface control and a threshold for recognizing a gesture at that part based on a change amount of the candidate part at the predetermined timing, and
The gesture recognition device recognizes that a gesture has been performed when a change equal to or greater than the determined change amount is detected at the part determined by the gesture determining unit.

A gesture recognition device that recognizes a user's gesture based on distance image data captured by an imaging device and performs interface control associated with the recognized gesture on the interface device,
  An image capturing unit that captures distance image data output from the imaging device;
  A region extraction unit for extracting from the distance image data a region where one or more predetermined candidate portions of the user exist each time the distance image data is captured;
  A part detection unit for detecting a rectangle including one or more candidate predetermined parts from the extracted part region;
  A part change detecting unit for detecting a change in the rectangle;
A gesture determining unit that determines a rectangle that includes a region with a change based on a change detected by the part change detection unit, and that determines the rectangle associated with interface control and a threshold value for recognizing a gesture;
A gesture recognition unit that recognizes that a gesture has been performed when a change in the detected rectangle is equal to or greater than a predetermined value;
  An interface control unit that performs interface control associated with the gesture when the gesture is recognized, and
  The gesture determination unit instructs to perform a gesture at a predetermined timing, determines a threshold based on a change amount in a rectangle at the predetermined timing,
The gesture recognition device recognizes that a gesture has been performed when a change equal to or greater than the determined threshold value is detected in the rectangle determined by the gesture determination unit.

The gesture recognition device according to claim 1 or 2,
further comprising a parameter database that stores thresholds for detecting changes, wherein
The gesture determination unit stores, as a threshold value, the change amount determined to be recognized as a gesture in the parameter database in association with the user, and
The gesture recognition unit reads the threshold value stored in the parameter database, and recognizes a gesture when the change is equal to or greater than the threshold value.

The gesture recognition apparatus according to claim 2 or 3, wherein the part region extraction unit sets a threshold value for the parallax value and extracts a region where a candidate part exists based on the threshold value.

The gesture recognition apparatus according to claim 2, wherein the part change detection unit detects a coordinate change of a feature point or an inter-frame difference of an edge image as a change.

The gesture recognition apparatus according to claim 5, wherein
The gesture determination unit determines, for the same user, a new threshold for recognizing a gesture based on thresholds determined for the gesture in the past and a newly acquired change amount at the predetermined timing.

A gesture recognition program for recognizing a user's gesture based on distance image data and performing interface control associated with the recognized gesture,
the program causing a computer to function as the gesture recognition device according to claim 1.

A gesture recognition system comprising: an imaging device that captures distance image data; and
a gesture recognition device that recognizes a user's gesture based on the distance image data captured by the imaging device and performs, on an interface device, interface control associated with the recognized gesture, wherein
The imaging device images the user's body, and
The gesture recognition device is the gesture recognition device according to claim 1.