JP2019039864A

JP2019039864A - Hand recognition method, hand recognition program and information processing device

Info

Publication number: JP2019039864A
Application number: JP2017163433A
Authority: JP
Inventors: 隆登大橋; Takato Ohashi; 康洲鎌; Yasushi Sukama
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-08-28
Filing date: 2017-08-28
Publication date: 2019-03-14

Abstract

To provide a hand recognition method that improves recognition accuracy when recognizing the region of a hand from depth data.

SOLUTION: The hand recognition method causes a computer 120 to execute processing of: extracting a candidate region of a hand to be processed from acquired depth data; specifying the finger model that most resembles the candidate region of the hand to be processed; when the degree of similarity of the specified finger model to the candidate region does not satisfy a prescribed criterion, setting a new candidate region of the hand to be processed within the current candidate region by excluding an arm region that is specified on the basis of a vector starting at the position of the wrist and ending at the position of the base of a finger in the specified finger model; and recursively performing the specifying and the setting using the new candidate region of the hand to be processed.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a hand recognition method, a hand recognition program, and an information processing device.

Hand recognition technology has long been known in which, for applications such as human interfaces and interactive systems, the distance to a target person's hand in three-dimensional space is measured with a sensor (for example, a depth sensor) and the hand region (the region beyond the wrist) is recognized from the resulting depth data.

For example, Non-Patent Document 1 below discloses a hand recognition technique that identifies the arm region (the region on the torso side of the wrist) from the wrist position, then fits and optimizes a finger model against the region that excludes the identified arm region, thereby recognizing the hand region.

However, the technique of Non-Patent Document 1 requires the measurement subject to wear a wristband so that the wrist position can be identified from the depth data, which is impractical.

On the other hand, Non-Patent Document 2 below discloses a method that identifies the wrist position from depth data by using the difference between the shape of the arm and the shape from the wrist to the palm. With the method of Non-Patent Document 2, the wrist position can be identified without having the measurement subject wear a wristband.

Non-Patent Document 1: Chen Qian, Xiao Sun, Yichen Wei, Xiaoou Tang, Jian Sun, "Realtime and Robust Hand Tracking from Depth", CVPR2013
Non-Patent Document 2: 陳維英, 藤木隆司, 有田大作, 谷口倫一郎 (Chen Weihide, Fujiki Takashi, Arita Daisaku, Taniguchi Rinichiro), "Real-time 3D hand shape estimation using multiple cameras", Image Recognition and Understanding Symposium (MIRU2006), July 2006, pp. 328-333

However, with the method of Non-Patent Document 2, depending on the angle of the measurement subject's hand relative to the distance-measuring sensor and on the clothing the subject wears, the difference between the shape of the arm and the shape from the wrist to the palm may not be captured in the measured depth data, and the wrist position may not be identifiable. Consequently, even if this identification method is applied to Non-Patent Document 1, the finger model cannot always be fitted and optimized against a region that excludes the arm region, and the hand region may not be recognized properly.

In one aspect, an object is to improve recognition accuracy when recognizing a hand region from depth data.

According to one aspect, a hand recognition method is characterized in that a computer executes processing of:
extracting a candidate region of a hand to be processed from acquired depth data;
specifying the finger model most similar to the candidate region of the hand to be processed;
when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, setting a new candidate region of the hand to be processed within the current candidate region by excluding an arm region that is specified on the basis of a vector whose start point is the position of the wrist and whose end point is the position of the base of a finger in the specified finger model; and
recursively performing the specifying and the setting using the new candidate region of the hand to be processed.

Recognition accuracy when recognizing a hand region from depth data can be improved.

FIG. 1 is a diagram showing an example of the system configuration of the hand recognition system.
FIG. 2 is a diagram showing an example of finger model information.
FIG. 3 is a diagram showing an example of the hardware configuration of the information processing device.
FIG. 4 is a diagram showing an example of the functional configuration of the hand recognition processing unit and a specific example of the processing of each functional unit.
FIG. 5 is a diagram showing details of the optimization processing by the model optimization unit.
FIG. 6 is a first diagram showing details of the mask region calculation processing by the mask region calculation unit and the mask processing by the mask unit.
FIG. 7 is a second diagram showing details of the mask region calculation processing by the mask region calculation unit and the mask processing by the mask unit.
FIG. 8 is a first flowchart showing the flow of the end determination processing by the end determination unit.
FIG. 9 is a second flowchart showing the flow of the end determination processing by the end determination unit.
FIG. 10 is a third flowchart showing the flow of the end determination processing by the end determination unit.
FIG. 11 is a third diagram showing details of the mask region calculation processing by the mask region calculation unit and the mask processing by the mask unit.

Each embodiment is described below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are given the same reference numerals, and redundant description is omitted.

[First Embodiment]
<System configuration of the hand recognition system>
First, the system configuration of the hand recognition system is described. FIG. 1 is a diagram showing an example of the system configuration of the hand recognition system. As shown in FIG. 1, the hand recognition system 100 includes a depth sensor 110 and an information processing device 120. The depth sensor 110 and the information processing device 120 are communicably connected.

The depth sensor 110 measures depth data over a measurement range that is a predetermined region including the measurement subject 130. Depth data is data in which each pixel of each frame is assigned distance information indicating the distance to the measured object, including the measurement subject 130; it thus indicates the position of the measured object in three-dimensional space.
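As a rough illustration of the depth data described above, a frame can be held as a 2D array of per-pixel distances and back-projected to 3D positions. This is only a sketch: the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function name `depth_to_points`, the reading of each value as distance along the optical axis, and the convention that 0 means "no measurement" are all assumptions; the specification does not state how the depth sensor 110 maps pixels to 3D coordinates.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth frame (H x W, one distance per pixel) to an (N, 3) array.

    Assumptions: pinhole camera model, depth measured along the optical axis,
    and pixels with depth 0 carry no measurement and are dropped.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]            # pixel row (v) and column (u) indices
    valid = depth > 0                    # keep only pixels with a measurement
    z = depth[valid].astype(float)
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Each row of the returned array is one measured point, which is the form in which a fitting step would typically compare the data with a 3D model.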

A hand recognition program is installed on the information processing device 120; by executing the program, the information processing device 120 functions as the hand recognition processing unit 121.

The hand recognition processing unit 121 acquires the depth data measured by the depth sensor 110 and recognizes the hand region of the measurement subject 130 by fitting and optimizing a finger model stored in the finger model information storage unit 122 against that depth data (that is, by performing optimization processing). Specifically, the hand recognition processing unit 121 compares the depth data with the finger model, calculates the position of the finger model in the depth data, the posture of the finger model, the angle of each joint of the finger model, and so on, and outputs them as the hand recognition result. The hand recognition result output from the hand recognition processing unit 121 is used, for example, in a human interface or an interactive system.

<Specific example of finger model information>
Next, the finger model information stored in the finger model information storage unit 122 is described. FIG. 2 is a diagram showing an example of finger model information. As shown in FIG. 2, the finger model information 200 includes “type” and “finger model” as information items.

“Type” stores information indicating the type of the finger model. In the first embodiment, the finger model information 200 stores three types of finger models, distinguished by the thickness, length, and shape of the fingers: an adult male finger model, an adult female finger model, and a child finger model.

“Finger model” further includes “right hand” and “left hand” as information items. “Right hand” stores the right-hand finger model of the corresponding type, and “left hand” stores the left-hand finger model of the corresponding type.
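The table layout described above (three types, each with a right-hand and a left-hand model) could be represented as follows. This is a minimal sketch: the class name `FingerModelEntry`, the placeholder strings standing in for actual models, and the helper `candidate_models` are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass

@dataclass
class FingerModelEntry:
    kind: str            # the "type" item, e.g. "adult male"
    right_hand: object   # right-hand finger model for this type
    left_hand: object    # left-hand finger model for this type

# Placeholder strings stand in for real model data (hypothetical).
FINGER_MODEL_INFO = [
    FingerModelEntry("adult male", "male_right", "male_left"),
    FingerModelEntry("adult female", "female_right", "female_left"),
    FingerModelEntry("child", "child_right", "child_left"),
]

def candidate_models(info):
    """Yield (type, side, model) for every stored finger model."""
    for entry in info:
        yield entry.kind, "right", entry.right_hand
        yield entry.kind, "left", entry.left_hand
```

Iterating over `candidate_models(FINGER_MODEL_INFO)` visits all six stored models, which is one way a fitting step could try each model in turn.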

<Hardware configuration of the information processing device>
Next, the hardware configuration of the information processing device 120 is described. FIG. 3 is a diagram showing an example of the hardware configuration of the information processing device.

As shown in FIG. 3, the information processing device 120 includes a CPU (Central Processing Unit) 301, a ROM (Read Only Memory) 302, and a RAM (Random Access Memory) 303. The CPU 301, the ROM 302, and the RAM 303 form a so-called computer. The information processing device 120 also includes an auxiliary storage device 304, an operation device 305, a display device 306, an I/F (Interface) device 307, and a drive device 308. The units of the information processing device 120 are connected to one another via a bus 309.

The CPU 301 executes various programs (for example, the hand recognition program) installed in the auxiliary storage device 304.

The ROM 302 is a nonvolatile memory and functions as a main storage device. The ROM 302 stores the programs, data, and the like that the CPU 301 needs in order to execute the various programs stored in the auxiliary storage device 304. Specifically, the ROM 302 stores boot programs such as a BIOS (Basic Input/Output System) or EFI (Extensible Firmware Interface).

The RAM 303 is a volatile memory such as a DRAM (Dynamic Random Access Memory) or an SRAM (Static Random Access Memory), and functions as a main storage device. The RAM 303 provides a work area into which the various programs stored in the auxiliary storage device 304 are loaded when they are executed by the CPU 301.

The auxiliary storage device 304 stores the various programs and the information used when they are executed. The finger model information storage unit 122 is implemented in the auxiliary storage device 304.

The operation device 305 is an input device that an administrator of the information processing device 120 uses to enter various instructions into the information processing device 120. The display device 306 is a display device that displays internal information of the information processing device 120.

The I/F device 307 is a connection device for communicably connecting the depth sensor 110 and the information processing device 120.

The drive device 308 is a device into which a recording medium 310 is set. The recording medium 310 here includes media that record information optically, electrically, or magnetically, such as a CD-ROM, a flexible disk, or a magneto-optical disk. The recording medium 310 may also include a semiconductor memory that records information electrically, such as a ROM or a flash memory.

The various programs stored in the auxiliary storage device 304 are installed, for example, by setting a distributed recording medium 310 in the drive device 308 and having the drive device 308 read the programs recorded on that recording medium 310.

<Functional configuration of the hand recognition processing unit>
Next, the functional configuration of the hand recognition processing unit 121 of the information processing device 120 is described. FIG. 4 is a diagram showing an example of the functional configuration of the hand recognition processing unit and a specific example of the processing of each functional unit. As shown in FIG. 4, the hand recognition processing unit 121 includes a depth data acquisition unit 401, a part extraction unit 402, a mask unit 403, a model optimization unit 404, a mask region calculation unit 405, and an end determination unit 406.

The depth data acquisition unit 401 acquires depth data from the depth sensor 110. Data 411 shown in FIG. 4 is an example of depth data acquired from the depth sensor 110. To make the explanation easier to follow, depth data in FIG. 4 and the other figures is drawn with dotted lines outlining each region that can be identified as a single object from the distance information assigned to the pixels. The depth data acquisition unit 401 notifies the part extraction unit 402 of the acquired depth data.

The part extraction unit 402 is an example of an extraction unit. From the depth data notified by the depth data acquisition unit 401, the part extraction unit 402 extracts a specific part including the hand region and the arm region as the candidate region of the hand to be processed, thereby obtaining specific-part depth data. Data 412 shown in FIG. 4 is an example of specific-part depth data that the part extraction unit 402 obtained by extracting, from data 411, a specific part including the hand region and the arm region. The part extraction unit 402 notifies the mask unit 403 of the specific-part depth data.

The mask unit 403 is an example of a setting unit. The mask unit 403 acquires mask region information from the end determination unit 406 and, based on it, masks the arm region in the specific-part depth data to obtain post-mask depth data. In other words, by excluding the arm region based on the mask region information, it sets a new candidate region of the hand to be processed within the current candidate region and obtains post-mask depth data. The mask unit 403 then notifies the model optimization unit 404 of the post-mask depth data.

When the mask unit 403 has not acquired mask region information from the end determination unit 406, it notifies the model optimization unit 404 of the specific-part depth data as is, without mask processing. In the first pass, the end determination unit 406 has not yet issued mask region information, so the mask unit 403 notifies the model optimization unit 404 of the specific-part depth data without mask processing. Data 413 shown in FIG. 4 is an example of specific-part depth data notified to the model optimization unit 404 without mask processing.
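The two cases handled by the mask unit (mask region information available, or not yet available on the first pass) can be sketched as below. The array representation, the boolean-mask form of the mask region information, the use of 0 for removed pixels, and the name `apply_mask` are assumptions made for illustration only.

```python
import numpy as np

def apply_mask(part_depth, mask_region=None):
    """Return post-mask depth data, or the input unchanged when no mask is given.

    part_depth:  H x W array of distances (0 = no data, an assumption).
    mask_region: H x W boolean array, True where the arm region is to be
                 excluded; None on the first pass.
    """
    if mask_region is None:
        return part_depth                # first pass: nothing to mask yet
    masked = part_depth.copy()           # leave the specific-part data intact
    masked[mask_region] = 0              # drop arm pixels from the candidate region
    return masked
```

Copying before masking keeps the original specific-part depth data available, so a later pass can apply a different mask region to the same input.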

The model optimization unit 404 is an example of a specifying unit. The model optimization unit 404 performs optimization processing on the specific-part depth data or post-mask depth data notified by the mask unit 403 and recognizes the hand region by specifying the most similar finger model.

Specifically, the model optimization unit 404 reads a finger model from the finger model information storage unit 122 and compares it with the specific-part depth data or the post-mask depth data.

The model optimization unit 404 then calculates the optimal position, posture, and joint angles of the finger model in the specific-part depth data or the post-mask depth data, and notifies the mask region calculation unit 405 of the calculated values.

The model optimization unit 404 calculates the optimal position, posture, and joint angles of the finger model so as to minimize the cost computed with a predefined cost function. The cost is a parameter, calculated from the specific-part depth data or the post-mask depth data, that indicates how plausible the position, posture, and joint angles of the finger model are (that is, the degree of similarity of the finger model to the candidate region of the hand). When calculating the optimal position, posture, and joint angles so as to minimize the cost, the movable range of each joint may be added as a constraint. The model optimization unit 404 also notifies the mask region calculation unit 405 of the calculated cost.

Data 414 shown in FIG. 4 shows the positions, postures, and joint angles of finger models 420 and 430 after optimization processing has been performed on the specific-part depth data. Because specific-part depth data has not been masked, it contains the arm region as well as the hand region. Consequently, when optimization processing is performed on data 413, the arm region pulls finger models 420 and 430 toward the arm, as shown in data 414, and the finger models end up placed at positions offset from the actual hand region.

The mask region calculation unit 405 is an example of a calculation unit. The mask region calculation unit 405 notifies the end determination unit 406 of the positions, postures, joint angles, and costs of finger models 420 and 430 acquired from the model optimization unit 404.

Based on the positions, postures, and joint angles of finger models 420 and 430 after the model optimization unit 404 has performed optimization processing on the specific-part depth data, the mask region calculation unit 405 also calculates a region of the specific-part depth data that contains the arm region (the mask region). It then notifies the end determination unit 406 of mask region information indicating the calculated mask region.

The end determination unit 406 acquires the positions, postures, joint angles, costs, and mask region information for finger models 420 and 430 from the mask region calculation unit 405. Based on the acquired cost or mask region information, the end determination unit 406 determines whether to continue or end the optimization processing by the model optimization unit 404.

When it determines to end, the end determination unit 406 outputs the acquired positions, postures, and joint angles of finger models 420 and 430 as the hand recognition result. When it determines to continue, the end determination unit 406 notifies the mask unit 403 of the acquired mask region information.

The mask unit 403 then performs mask processing on the specific-part depth data and obtains post-mask depth data. Data 413' shown in FIG. 4 is an example of post-mask depth data obtained by masking data 412 based on the mask region information.

Data 413' is notified to the model optimization unit 404, which performs optimization processing on it. Data 414' shown in FIG. 4 shows the positions, postures, and joint angles of finger models 420 and 430 after optimization processing has been performed on data 413'.

In this way, the hand recognition processing unit 121 in the first embodiment performs mask region calculation processing after optimizing finger models 420 and 430, and calculates mask region information. It then optimizes finger models 420 and 430 again on the post-mask depth data obtained by masking based on the calculated mask region information.

That is, the hand recognition processing unit 121 in the first embodiment recursively executes optimization processing, mask region calculation processing, and mask processing. Each time finger models 420 and 430 are optimized, more of the arm region is excluded from the specific-part depth data and the hand region is recognized anew. As a result, the hand recognition processing unit 121 in the first embodiment can improve recognition accuracy when recognizing the hand region.
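The repeated cycle of optimization, mask region calculation, end determination, and masking can be sketched as a loop. This is an illustrative outline under stated assumptions, not the patented implementation: the four callables stand in for units 404, 405, 406, and 403, their signatures are hypothetical, the recursion is written iteratively, and `max_rounds` is an added safety bound that the specification does not mention.

```python
def recognize_hand(part_depth, optimize, compute_mask, apply_mask,
                   should_stop, max_rounds=10):
    """Alternate model fitting and arm masking until the end condition holds.

    optimize(depth)           -> (pose, cost)   # model optimization unit
    compute_mask(pose, depth) -> mask           # mask region calculation unit
    should_stop(cost, mask)   -> bool           # end determination unit
    apply_mask(depth, mask)   -> masked depth   # mask unit
    """
    depth = part_depth                 # first pass uses unmasked data
    pose = None
    for _ in range(max_rounds):
        pose, cost = optimize(depth)
        mask = compute_mask(pose, part_depth)
        if should_stop(cost, mask):
            break                      # output the current fit as the result
        depth = apply_mask(part_depth, mask)
    return pose
```

The mask is always applied to the original specific-part depth data, matching the description that post-mask depth data is obtained by masking data 412.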

<Description of optimization processing by the model optimization unit>
Next, the optimization processing by the model optimization unit 404 is described in detail. FIG. 5 is a diagram showing details of the optimization processing by the model optimization unit.

In FIG. 5, data 500 is an enlarged view of part of data 413, which is an example of specific-part depth data. Each point in data 500 represents a pixel to which distance information indicating the distance to the measured object is assigned.

Also in FIG. 5, spheres 420_1 to 420_4 represent part of a finger (for example, the index finger) of finger model 420.

To compare data 500 shown in FIG. 5 with spheres 420_1 to 420_4 and calculate the positions, postures, joint angles, and so on of spheres 420_1 to 420_4 in data 500, the model optimization unit 404 calculates the cost based on the cost function shown in the following equation.
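The equation itself is not reproduced in this text. Judging only from the term-by-term description that follows, its structure is plausibly a sum of three squared-distance terms, as sketched below. The summation ranges and the absence of weighting coefficients are assumptions; the original equation may weight the terms differently.

```latex
\mathrm{Cost} \;=\; \sum_{p} D\bigl(p,\, s_{x(p)}\bigr)^{2}
\;+\; \sum_{i} B(c_i,\, D)^{2}
\;+\; \sum_{i<j} L(s_i,\, s_j)^{2}
```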

In the above equation, the first term is the sum, over the points in data 500, of the squared distance between each point and finger model 420. Specifically, $p$ denotes a point in data 500 (for example, point 501), and $s_{x(p)}$ denotes the sphere closest to $p$. For $p$ = point 501, $s_{x(p)}$ = sphere 420_1. $D(p, s_{x(p)})^2$ is the square of the distance between $p$ and the sphere closest to $p$. For $p$ = point 501, $D(p, s_{x(p)})^2 = L_1^2$. $D(p, s_{x(p)})^2$ is calculated once for each point $p$ contained in data 500.

The second term is the sum of squared distances between each sphere and a point. Specifically, $c_i$ denotes the center coordinates of each sphere (for example, the in-plane position of the black dot inside sphere 420_1), and $D$ denotes the in-plane position of a point in data 500 (for example, point 502). When $c_1$ is the in-plane position of the black dot inside sphere 420_1 and $D$ is the in-plane position of point 502, $B(c_1, D)^2 = l_1^2$. For one $D$ contained in data 500, $B(c_i, D)^2$ is calculated once per sphere.

The third term is the sum of squared overlap distances between spheres. Specifically, $L(s_i, s_j)^2$ is the square of the difference between the center-to-center distance of two spheres and the sum of their radii. In the example of FIG. 5, the center-to-center distance between sphere 420_1 and sphere 420_2 is $d_{12}$, so with $r_1$ the radius of sphere 420_1 and $r_2$ the radius of sphere 420_2, $L(s_1, s_2)^2 = (d_{12} - (r_1 + r_2))^2$. When the sum of the radii of the two spheres is smaller than their center-to-center distance (that is, the spheres do not overlap), $L(s_i, s_j)^2$ is treated as 0.
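The three terms described above can be sketched in code as follows. Several details are assumptions made for illustration: the distance from a point to a sphere is taken as the distance to the sphere surface, the "closest" sphere is chosen by that surface distance, the second term is evaluated for a single point D using only in-plane (x, y) coordinates, and the function names are hypothetical. How the terms are weighted or aggregated over multiple D is not stated in the text.

```python
import numpy as np

def point_term(points, centers, radii):
    """First term: sum over data points of squared distance to the nearest sphere."""
    d_center = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    d_surface = np.abs(d_center - radii[None, :])     # assumed: surface distance
    return float(np.sum(d_surface.min(axis=1) ** 2))

def center_term(centers, d_xy):
    """Second term for one point D: squared in-plane distance from each sphere center."""
    return float(np.sum(np.linalg.norm(centers[:, :2] - d_xy, axis=1) ** 2))

def overlap_term(centers, radii):
    """Third term: squared overlap for each overlapping pair of spheres."""
    total = 0.0
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = np.linalg.norm(centers[i] - centers[j])
            gap = d_ij - (radii[i] + radii[j])
            if gap < 0:                # overlapping; non-overlapping pairs add 0
                total += float(gap ** 2)
    return total
```

For two spheres of radius 1 whose centers are 1.5 apart, the overlap is 0.5, so `overlap_term` contributes 0.25; spheres whose centers are 3 apart contribute nothing.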

モデル最適化部４０４が最適化処理を行うことで、モデル最適化部４０４は、マスク領域算出部４０５に対して、手認識結果５１０を出力する。図５に示すように、手認識結果５１０には、情報の項目として、“手指モデル”、“位置”、“姿勢”、“関節角度”が含まれる。 By performing the optimization process, the model optimization unit 404 outputs the hand recognition result 510 to the mask area calculation unit 405. As shown in FIG. 5, the hand recognition result 510 includes “finger model”, “position”, “posture”, and “joint angle” as information items.

“手指モデル”には、最適化処理に用いられた手指モデル４２０を識別するための識別子が格納される。“位置”には、最適化処理を行うことでコストが最小となった際の手指モデル４２０の３次元空間における位置を示す座標が格納される。 The “finger model” item stores an identifier for identifying the finger model 420 used in the optimization process. The “position” item stores coordinates indicating the position of the finger model 420 in the three-dimensional space at the time the optimization process minimized the cost.

“姿勢”には、最適化処理を行うことでコストが最小となった際の手指モデル４２０の３次元空間における姿勢が格納される。“関節角度”には、更に、“関節１”〜“関節ｍ”が含まれ、“関節１”〜“関節ｍ”には、最適化処理を行うことでコストが最小となった際の、手指モデル４２０を形成する各関節の角度が格納される。 The “posture” item stores the posture of the finger model 420 in the three-dimensional space at the time the optimization process minimized the cost. The “joint angle” item further includes “joint 1” to “joint m”, each of which stores the angle of the corresponding joint forming the finger model 420 at the time the optimization process minimized the cost.
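As a rough illustration of the items listed for the hand recognition result 510, the record could be modeled as below. The class name, field names, and the choice of three Euler angles for the posture are assumptions made for this sketch; the text does not specify how the posture or the identifier is encoded.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class HandRecognitionResult:
    """Hypothetical container mirroring the items of hand recognition result 510."""
    finger_model_id: str                      # identifier of the finger model used
    position: Tuple[float, float, float]      # 3D position at minimum cost
    posture: Tuple[float, float, float]       # assumed encoding: Euler angles
    joint_angles: List[float] = field(default_factory=list)  # joint 1 .. joint m

result = HandRecognitionResult(
    finger_model_id="model_420",              # illustrative value only
    position=(0.10, 0.20, 0.50),
    posture=(0.0, 0.0, 0.0),
    joint_angles=[10.0, 20.0, 15.0],
)
```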

＜マスク領域算出部によるマスク領域算出処理及びマスク部によるマスク処理の説明＞ <Description of Mask Area Calculation Process by Mask Area Calculation Unit and Mask Process by Mask Unit>
次に、マスク領域算出部４０５によるマスク領域算出処理及びマスク部４０３によるマスク処理の詳細について説明する。図６、図７は、マスク領域算出部によるマスク領域算出処理及びマスク部によるマスク処理の詳細を示す第１及び第２の図である。 Next, details of the mask area calculation process by the mask area calculation unit 405 and the mask process by the mask unit 403 will be described. FIGS. 6 and 7 are first and second diagrams showing details of the mask area calculation process by the mask area calculation unit and the mask process by the mask unit.

このうち、図６（ａ）のデータ６００は、特定部位深度データに対して最適化処理が行われた場合のデータ４１４（図４）の一部を拡大して示したものである。マスク領域算出部４０５は、データ６００に基づいてマスク領域を算出する。 Among these, the data 600 in FIG. 6A is an enlarged view of a part of the data 414 (FIG. 4) when the optimization process is performed on the specific part depth data. The mask area calculation unit 405 calculates a mask area based on the data 600.

具体的には、マスク領域算出部４０５は、まず、図６（ｂ）に示すように、データ６００内において、手指モデル４２０の手首を示す点６０１の位置を特定する。続いて、マスク領域算出部４０５は、手指モデル４２０の中指の付け根を示す点６０２の位置を特定する。更に、マスク領域算出部４０５は、点６０１の位置を始点とし点６０２の位置を終点とするベクトルｖ_０を算出する。 Specifically, as shown in FIG. 6B, the mask area calculation unit 405 first specifies, in the data 600, the position of a point 601 indicating the wrist of the finger model 420. Subsequently, the mask area calculation unit 405 specifies the position of a point 602 indicating the root of the middle finger of the finger model 420. Further, the mask area calculation unit 405 calculates a vector v₀ having the position of the point 601 as the start point and the position of the point 602 as the end point.

続いて、マスク領域算出部４０５は、図６（ｃ）に示すように、点６０１の位置を始点とし、データ６００上の任意の複数の点の位置を終点とする複数方向のベクトルを算出する。例えば、マスク領域算出部４０５は、データ６００上の点６０１の位置を始点とし点６１０＿１の位置を終点とするベクトルｖ_１を算出する。また、マスク領域算出部４０５は、データ６００上の点６０１の位置を始点とし点６１０＿２の位置を終点とするベクトルｖ_２を算出する。また、マスク領域算出部４０５は、データ６００上の点６０１の位置を始点とし点６１０＿３の位置を終点とするベクトルｖ_３を算出する。以下、同様に、マスク領域算出部４０５は、データ６００上の点６０１の位置を始点としｎ個の点それぞれの位置を終点とするｎ個のベクトルを算出する。 Subsequently, as shown in FIG. 6C, the mask area calculation unit 405 calculates vectors in a plurality of directions, each starting at the position of the point 601 and ending at the position of one of a plurality of arbitrary points on the data 600. For example, the mask area calculation unit 405 calculates a vector v₁ having the position of the point 601 on the data 600 as the start point and the position of the point 610_1 as the end point. The mask area calculation unit 405 also calculates a vector v₂ having the position of the point 601 as the start point and the position of the point 610_2 as the end point, and a vector v₃ having the position of the point 601 as the start point and the position of the point 610_3 as the end point. In the same manner, the mask area calculation unit 405 calculates n vectors, each starting at the position of the point 601 on the data 600 and ending at the position of one of n points.

続いて、マスク領域算出部４０５は、算出したｎ個のベクトルそれぞれと、ベクトルｖ_０との内積を算出する。ベクトルｖ_０との内積が０以上となるベクトルによって特定される点は、点６０１よりも指先側に位置している点である。一方、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点は、点６０１よりも腕側に位置している点である。 Subsequently, the mask area calculation unit 405 calculates the inner product of each of the calculated n vectors and the vector v ₀ . A point specified by a vector whose inner product with the vector v ₀ is 0 or more is a point located on the fingertip side from the point 601. On the other hand, a point specified by a vector whose inner product with the vector v ₀ is smaller than 0 is a point located on the arm side from the point 601.

そこで、マスク領域算出部４０５は、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点が位置する領域を、マスク領域に決定する。図６（ｃ）のデータ６００において、直線６２０より上側の領域は、ベクトルｖ_０との内積が０以上となるベクトルによって特定される点が位置する領域である。一方、直線６２０より下側の領域は、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点が位置する領域（つまり、マスク領域）である。マスク領域を示すマスク領域情報は、終了判定部４０６を介してマスク部４０３に通知される。 Therefore, the mask area calculation unit 405 determines, as the mask area, the area in which the points specified by vectors whose inner product with the vector v₀ is smaller than 0 are located. In the data 600 of FIG. 6C, the region above the straight line 620 is the region in which the points specified by vectors whose inner product with the vector v₀ is 0 or more are located. On the other hand, the region below the straight line 620 is the region in which the points specified by vectors whose inner product with the vector v₀ is smaller than 0 are located (that is, the mask area). Mask area information indicating the mask area is notified to the mask unit 403 via the end determination unit 406.
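The inner-product test described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the mask area calculation unit 405: points are assumed to be coordinate tuples of equal dimension, and the helper names `dot`, `sub`, and `arm_side_mask` are hypothetical.

```python
def dot(a, b):
    """Inner product of two coordinate tuples."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Vector from point b to point a."""
    return tuple(x - y for x, y in zip(a, b))

def arm_side_mask(points, wrist, finger_root):
    """Return True for each point on the arm side of the wrist (to be masked)."""
    v0 = sub(finger_root, wrist)              # wrist -> root of the middle finger
    # v_k = point - wrist; a negative inner product with v0 means arm side.
    return [dot(sub(p, wrist), v0) < 0 for p in points]

wrist = (0.0, 0.0)
finger_root = (0.0, 2.0)
points = [(0.0, 1.0), (1.0, 0.0), (0.0, -1.0), (3.0, -0.5)]
mask = arm_side_mask(points, wrist, finger_root)
```

A point lying exactly on the boundary line through the wrist has an inner product of 0 and is therefore kept, consistent with the "0 or more" condition in the text.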

図６（ｄ）は、通知されたマスク領域６３０に基づいて、マスク部４０３が、データ６００に対してマスク処理を行った様子を示している。 FIG. 6D shows a state in which the mask unit 403 performs a mask process on the data 600 based on the notified mask area 630.

なお、上述したとおり、マスク領域６３０がマスクされた後のマスク後深度データに対しては、モデル最適化部４０４において再び最適化処理が行われる。 As described above, the model optimization unit 404 performs the optimization process again on the post-mask depth data after the mask region 630 is masked.

図７（ａ）のデータ７００は、マスク後深度データに対して最適化処理が行われた場合のデータ４１４’（図４）の一部を拡大して示したものである。マスク領域算出部４０５は、データ７００に基づいて、再び、マスク領域を算出する。 The data 700 in FIG. 7A is an enlarged view of a part of the data 414 ′ (FIG. 4) when the optimization process is performed on the post-mask depth data. The mask area calculation unit 405 calculates the mask area again based on the data 700.

具体的には、マスク領域算出部４０５は、まず、図７（ｂ）に示すように、データ７００内において、手指モデル４２０の手首を示す点７０１の位置を特定する。続いて、マスク領域算出部４０５は、手指モデル４２０の中指の付け根を示す点７０２の位置を特定する。更に、マスク領域算出部４０５は、点７０１の位置を始点とし点７０２の位置を終点とするベクトルｖ_０を算出する。 Specifically, as shown in FIG. 7B, the mask area calculation unit 405 first specifies, in the data 700, the position of a point 701 indicating the wrist of the finger model 420. Subsequently, the mask area calculation unit 405 specifies the position of a point 702 indicating the root of the middle finger of the finger model 420. Further, the mask area calculation unit 405 calculates a vector v₀ having the position of the point 701 as the start point and the position of the point 702 as the end point.

続いて、マスク領域算出部４０５は、図７（ｃ）に示すように、点７０１の位置を始点とし、データ７００上の任意の点の位置を終点とするベクトルを算出する。例えば、マスク領域算出部４０５は、データ７００上の点７０１の位置を始点とし点７１０＿１の位置を終点とするベクトルｖ_１を算出する。また、マスク領域算出部４０５は、データ７００上の点７０１の位置を始点とし点７１０＿２の位置を終点とするベクトルｖ_２を算出する。また、マスク領域算出部４０５は、データ７００上の点７０１の位置を始点とし点７１０＿３の位置を終点とするベクトルｖ_３を算出する。以下、同様に、マスク領域算出部４０５は、データ７００上の点７０１の位置を始点としｎ個の点それぞれの位置を終点とするｎ個のベクトルを算出する。 Subsequently, as shown in FIG. 7C, the mask area calculation unit 405 calculates a vector having the position of the point 701 as the start point and the position of an arbitrary point on the data 700 as the end point. For example, the mask area calculation unit 405 calculates a vector v ₁ having the position of the point 701 on the data 700 as the start point and the position of the point 710_1 as the end point. Further, the mask area calculation unit 405 calculates a vector v ₂ having the position of the point 701 on the data 700 as the start point and the position of the point 710_2 as the end point. Further, the mask area calculation unit 405 calculates a vector v ₃ having the position of the point 701 on the data 700 as the start point and the position of the point 710_3 as the end point. Similarly, the mask area calculation unit 405 calculates n vectors starting from the position of the point 701 on the data 700 and ending at the position of each of the n points.

続いて、マスク領域算出部４０５は、算出したｎ個のベクトルそれぞれと、ベクトルｖ_０との内積を算出する。ベクトルｖ_０との内積が０以上となるベクトルによって特定される点は、点７０１よりも指先側に位置している点である。一方、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点は、点７０１よりも腕側に位置している点である。 Subsequently, the mask area calculation unit 405 calculates the inner product of each of the calculated n vectors and the vector v ₀ . A point specified by a vector whose inner product with the vector v ₀ is 0 or more is a point located closer to the fingertip than the point 701. On the other hand, a point specified by a vector whose inner product with the vector v ₀ is smaller than 0 is a point located on the arm side from the point 701.

そこで、マスク領域算出部４０５は、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点が位置する領域を、マスク領域に決定する。図７（ｃ）のデータ７００において、直線７２０より上側の領域は、ベクトルｖ_０との内積が０以上となるベクトルによって特定される点が位置する領域である。一方、直線７２０より下側の領域は、ベクトルｖ_０との内積が０より小さいベクトルによって特定される点が位置する領域（マスク領域）である。マスク領域を示すマスク領域情報は、終了判定部４０６を介してマスク部４０３に通知される。 Therefore, the mask area calculation unit 405 determines an area where a point specified by a vector whose inner product with the vector v ₀ is smaller than 0 is located as a mask area. In the data 700 of FIG. 7C, the region above the straight line 720 is a region where a point specified by a vector whose inner product with the vector v ₀ is 0 or more is located. On the other hand, the area below the straight line 720 is an area (mask area) where a point specified by a vector whose inner product with the vector v ₀ is smaller than 0 is located. Mask area information indicating the mask area is notified to the mask unit 403 via the end determination unit 406.

図７（ｄ）は、通知されたマスク領域７３０に基づいて、マスク部４０３が、データ７００に対してマスク処理を行った様子を示している。 FIG. 7D shows a state in which the mask unit 403 performs a mask process on the data 700 based on the notified mask area 730.

なお、上述したとおり、マスク領域７３０がマスクされた後のマスク後深度データに対しては、モデル最適化部４０４において再び最適化処理が行われる。 As described above, the model optimization unit 404 performs the optimization process again on the post-mask depth data after the mask region 730 is masked.

＜終了判定部における終了判定処理の説明＞ <Description of End Determination Process in End Determination Unit>
次に、終了判定部４０６における終了判定処理の詳細について説明する。図８〜図１０は、終了判定部による終了判定処理の流れを示す第１〜第３のフローチャートである。図８〜図１０に示すように、第１の実施形態において、終了判定部４０６が実行可能な終了判定処理は５種類ある。以下、５種類の終了判定処理について説明する。 Next, details of the end determination process in the end determination unit 406 will be described. FIGS. 8 to 10 are first to third flowcharts showing the flow of the end determination process by the end determination unit. As shown in FIGS. 8 to 10, in the first embodiment, the end determination unit 406 can execute five types of end determination process. The five types are described below.

（１）終了判定処理のフローチャート（その１） (1) Flowchart of end determination process (part 1)
図８（ａ）は、終了判定処理の流れを示すフローチャート（その１）である。手認識処理部１２１が起動し、図８（ａ）に示す終了判定処理が選択されることで、図８（ａ）に示す終了判定処理が実行される。 FIG. 8A is a flowchart (part 1) showing the flow of the end determination process. When the hand recognition processing unit 121 is activated and the end determination process illustrated in FIG. 8A is selected, the end determination process illustrated in FIG. 8A is executed.

ステップＳ８０１において、終了判定部４０６は、マスク領域算出部４０５より、手指モデルの位置、姿勢、各関節の角度、コスト、マスク領域情報を取得する。 In step S801, the end determination unit 406 acquires the position and posture of the finger model, the angle of each joint, the cost, and the mask area information from the mask area calculation unit 405.

ステップＳ８０２において、終了判定部４０６は、取得したコストが所定の基準を満たすか否か（所定の閾値以下であるか否か）を判定する。ステップＳ８０２において、所定の閾値以下でないと判定した場合には（ステップＳ８０２においてＮｏの場合には）、ステップＳ８０４に進む。ステップＳ８０４において、終了判定部４０６は、取得したマスク領域情報を、マスク部４０３に通知する。 In step S802, the end determination unit 406 determines whether the acquired cost satisfies a predetermined criterion (whether it is equal to or less than a predetermined threshold). If it is determined in step S802 that the cost is not equal to or less than the predetermined threshold (No in step S802), the process proceeds to step S804. In step S804, the end determination unit 406 notifies the mask unit 403 of the acquired mask area information.

一方、ステップＳ８０２において、所定の閾値以下であると判定した場合には（ステップＳ８０２においてＹｅｓの場合には）、十分なコストが得られたと判断し、ステップＳ８０３に進む。ステップＳ８０３において、終了判定部４０６は、取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 On the other hand, if it is determined in step S802 that the cost is equal to or less than the predetermined threshold (Yes in step S802), the end determination unit 406 determines that a sufficiently low cost has been obtained, and the process proceeds to step S803. In step S803, the end determination unit 406 outputs the acquired position, posture, and angle of each joint of the finger model as the hand recognition result.

（２）終了判定処理のフローチャート（その２） (2) Flowchart of end determination process (part 2)
図８（ｂ）は、終了判定処理の流れを示すフローチャート（その２）である。手認識処理部１２１が起動し、図８（ｂ）に示す終了判定処理が選択されることで、図８（ｂ）に示す終了判定処理が実行される。 FIG. 8B is a flowchart (part 2) showing the flow of the end determination process. When the hand recognition processing unit 121 is activated and the end determination process illustrated in FIG. 8B is selected, the end determination process illustrated in FIG. 8B is executed.

ステップＳ８１１において、終了判定部４０６は、今回取得したコストと前回取得したコストとに基づいてコストの変化を算出し、算出したコストの変化が所定の基準を満たすか否かを判定する。具体的には、終了判定部４０６は、今回取得したコストが前回取得したコストよりも大きくなっていないと判定した場合には（ステップＳ８１１においてＮｏの場合には）、ステップＳ８０４に進む。ステップＳ８０４において、終了判定部４０６は、取得したマスク領域情報を、マスク部４０３に通知する。 In step S811, the end determination unit 406 calculates a change in cost based on the currently acquired cost and the previously acquired cost, and determines whether the calculated cost change satisfies a predetermined criterion. Specifically, when it is determined that the cost acquired this time is not greater than the cost acquired last time (in the case of No in step S811), end determination unit 406 proceeds to step S804. In step S804, the end determination unit 406 notifies the mask unit 403 of the acquired mask area information.

一方、ステップＳ８１１において、今回取得したコストが前回取得したコストよりも大きくなっていると判定した場合には（ステップＳ８１１においてＹｅｓの場合には）、これ以上処理を繰り返してもコストが小さくならないと判断し、ステップＳ８１２に進む。ステップＳ８１２において、終了判定部４０６は、前回取得した取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 On the other hand, if it is determined in step S811 that the cost acquired this time is larger than the cost acquired last time (Yes in step S811), the end determination unit 406 determines that the cost will not decrease even if the process is repeated further, and the process proceeds to step S812. In step S812, the end determination unit 406 outputs the position, posture, and angle of each joint of the finger model acquired last time as the hand recognition result.

（３）終了判定処理のフローチャート（その３） (3) Flowchart of end determination process (part 3)
図８（ｃ）は、終了判定処理の流れを示すフローチャート（その３）である。手認識処理部１２１が起動し、図８（ｃ）に示す終了判定処理が選択されることで、図８（ｃ）に示す終了判定処理が実行される。 FIG. 8C is a flowchart (part 3) showing the flow of the end determination process. When the hand recognition processing unit 121 is activated and the end determination process illustrated in FIG. 8C is selected, the end determination process illustrated in FIG. 8C is executed.

ステップＳ８２１において、終了判定部４０６は、今回取得したマスク領域情報と前回取得したマスク領域情報とに基づいて、マスク領域の変化量を算出し、変化量が所定の基準を満たすか否かを判定する。具体的には、終了判定部４０６は、算出したマスク領域の変化量が所定の閾値未満になったか否かを判定する。 In step S821, the end determination unit 406 calculates the amount of change in the mask area based on the mask area information acquired this time and the mask area information acquired last time, and determines whether the amount of change satisfies a predetermined criterion. Specifically, the end determination unit 406 determines whether the calculated amount of change in the mask area has fallen below a predetermined threshold.

ステップＳ８２１において、所定の閾値未満になっていないと判定した場合には（ステップＳ８２１においてＮｏの場合には）、ステップＳ８０４に進む。ステップＳ８０４において、終了判定部４０６は、取得したマスク領域情報を、マスク部４０３に通知する。 If it is determined in step S821 that it is not less than the predetermined threshold value (in the case of No in step S821), the process proceeds to step S804. In step S804, the end determination unit 406 notifies the mask unit 403 of the acquired mask area information.

一方、ステップＳ８２１において、所定の閾値未満になっていると判定した場合には（ステップＳ８２１においてＹｅｓの場合には）、コストが収束したと判断し、ステップＳ８０３に進む。ステップＳ８０３において、終了判定部４０６は、取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 On the other hand, if it is determined in step S821 that the value is less than the predetermined threshold (in the case of Yes in step S821), it is determined that the cost has converged, and the process proceeds to step S803. In step S803, the end determination unit 406 outputs the acquired finger model position, posture, and angle of each joint as a hand recognition result.
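The three single-criterion checks of FIG. 8A to FIG. 8C can be sketched as small Python predicates. The text does not say how the amount of change in the mask area is measured, so this sketch assumes masks are equal-length lists of booleans and counts the entries that differ; that measure and all function names are assumptions.

```python
def cost_is_low_enough(cost, cost_threshold):
    """S802: Yes when the cost is equal to or less than the threshold."""
    return cost <= cost_threshold

def cost_got_worse(cost, prev_cost):
    """S811: Yes when the current cost exceeds the previous one."""
    return prev_cost is not None and cost > prev_cost

def mask_change_amount(mask, prev_mask):
    """Assumed measure: number of entries whose masked state changed."""
    return sum(1 for a, b in zip(mask, prev_mask) if a != b)

def mask_has_settled(mask, prev_mask, change_threshold):
    """S821: Yes when the mask changed by less than the threshold."""
    if prev_mask is None:                 # first round: nothing to compare with
        return False
    return mask_change_amount(mask, prev_mask) < change_threshold

changed = mask_change_amount([True, False, True], [True, True, False])
```

A Yes from the first or third predicate leads to outputting the current fit (S803), a Yes from the second leads to outputting the previous fit (S812), and a No leads to notifying the mask unit 403 of the mask area information (S804).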

（４）終了判定処理のフローチャート（その４） (4) Flowchart of end determination process (part 4)
図９は、終了判定処理の流れを示すフローチャート（その４）である。手認識処理部１２１が起動し、図９に示す終了判定処理が選択されることで、図９に示す終了判定処理が実行される。なお、図９に示す終了判定処理は、図８（ａ）〜（ｃ）に示す終了判定処理を組み合わせたものであるため、ここでは、組み合わせ内容について説明する。 FIG. 9 is a flowchart (part 4) showing the flow of the end determination process. When the hand recognition processing unit 121 is activated and the end determination process illustrated in FIG. 9 is selected, the end determination process illustrated in FIG. 9 is executed. Because the end determination process shown in FIG. 9 is a combination of the end determination processes shown in FIGS. 8A to 8C, only the way they are combined is described here.

図９に示す終了判定処理の場合、今回取得したコストが所定の閾値以下でなかったとしても（ステップＳ８０２においてＮｏの場合でも）、前回取得したコストよりも大きくなっている場合には、これ以上処理を繰り返してもコストが小さくならないと判断する。この場合（ステップＳ８１１においてＹｅｓの場合）、終了判定部４０６は、ステップＳ８０３に進む。これにより、終了判定部４０６は、前回取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 In the case of the end determination process shown in FIG. 9, even if the cost acquired this time is not equal to or less than the predetermined threshold (No in step S802), when it is larger than the cost acquired last time, the end determination unit 406 determines that the cost will not decrease even if the process is repeated further. In this case (Yes in step S811), the end determination unit 406 proceeds to step S803 and outputs the position, posture, and angle of each joint of the finger model acquired last time as the hand recognition result.

また、図９に示す終了判定処理の場合、今回取得したコストが前回取得したコストより大きくなっていない場合でも（ステップＳ８１１においてＮｏの場合でも）、終了判定部４０６は、ステップＳ８２１において、マスク領域の変化量を判定する。そして、終了判定部４０６は、マスク領域の変化量が所定の閾値未満の場合には、コストが収束したと判断し、ステップＳ８０３に進む。これにより、終了判定部４０６は、今回取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 Also, in the case of the end determination process shown in FIG. 9, even if the cost acquired this time is not larger than the cost acquired last time (No in step S811), the end determination unit 406 evaluates the amount of change in the mask area in step S821. If the amount of change in the mask area is less than the predetermined threshold, the end determination unit 406 determines that the cost has converged and proceeds to step S803, where it outputs the position, posture, and angle of each joint of the finger model acquired this time as the hand recognition result.
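The combined flow of FIG. 9 can be sketched as a single decision function. This is an illustrative reading of the text, not the flowchart itself: the string return values, the parameter names, and the use of `None` for "no previous round" are assumptions.

```python
def end_determination_fig9(cost, prev_cost, mask_change, cost_thr, change_thr):
    """Decide the next action following the order S802 -> S811 -> S821."""
    if cost <= cost_thr:                                  # S802: Yes
        return "output_current"                           # S803
    if prev_cost is not None and cost > prev_cost:        # S811: Yes
        return "output_previous"                          # give up, keep last fit
    if mask_change is not None and mask_change < change_thr:  # S821: Yes
        return "output_current"                           # treated as converged
    return "continue_masking"                             # S804

decisions = [
    end_determination_fig9(0.4, 0.9, 50, cost_thr=0.5, change_thr=5),
    end_determination_fig9(1.2, 0.9, 50, cost_thr=0.5, change_thr=5),
    end_determination_fig9(0.8, 0.9, 3, cost_thr=0.5, change_thr=5),
    end_determination_fig9(0.8, 0.9, 50, cost_thr=0.5, change_thr=5),
]
```

The four calls exercise, in order, the low-cost exit, the worsening-cost exit, the settled-mask exit, and the case where masking continues.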

（５）終了判定処理のフローチャート（その５） (5) Flowchart of end determination process (part 5)
図１０は、終了判定処理の流れを示すフローチャート（その５）である。手認識処理部１２１が起動し、図１０に示す終了判定処理が選択されることで、図１０に示す終了判定処理が実行される。なお、図１０に示す終了判定処理も、図８（ａ）〜（ｃ）に示す終了判定処理を組み合わせたものであるため、ここでは、組み合わせ内容について説明する。 FIG. 10 is a flowchart (part 5) showing the flow of the end determination process. When the hand recognition processing unit 121 is activated and the end determination process illustrated in FIG. 10 is selected, the end determination process illustrated in FIG. 10 is executed. Because the end determination process shown in FIG. 10 is also a combination of the end determination processes shown in FIGS. 8A to 8C, only the way they are combined is described here.

図１０に示す終了判定処理の場合、今回取得したコストが所定の閾値以下でない場合（ステップＳ８０２においてＮｏの場合）でも、前回取得したコストよりも大きくなっている場合（ステップＳ８１１においてＹｅｓの場合）には、ステップＳ８２１に進む。 In the case of the end determination process shown in FIG. 10, even when the cost acquired this time is not equal to or less than the predetermined threshold (No in step S802), the process proceeds to step S821 if the cost is larger than the cost acquired last time (Yes in step S811).

そして、マスク領域の変化量が、所定の閾値未満の場合には（ステップＳ８２１においてＹｅｓの場合には）、ステップＳ８０３に進む。このように、終了判定部４０６は、これ以上処理を繰り返してもコストが小さくならないと判断し、かつ、コストが収束したと判断した場合には、今回取得した手指モデルの位置、姿勢、各関節の角度を、手認識結果として出力する。 If the amount of change in the mask area is less than the predetermined threshold (Yes in step S821), the process proceeds to step S803. In this way, when the end determination unit 406 determines both that the cost will not decrease even if the process is repeated further and that the cost has converged, it outputs the position, posture, and angle of each joint of the finger model acquired this time as the hand recognition result.

反対に、今回取得したコストが所定の閾値以下であったとしても（ステップＳ８０２においてＹｅｓの場合でも）、マスク領域の変化量が所定の閾値以上の場合には、更に、コストを小さくできると判断し、ステップＳ８０４に進む。 On the contrary, even if the cost acquired this time is equal to or less than the predetermined threshold (even in the case of Yes in step S802), if the amount of change in the mask area is equal to or greater than the predetermined threshold, it is determined that the cost can be further reduced. Then, the process proceeds to step S804.
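One possible reading of the FIG. 10 flow is sketched below. The text only describes some branches, so two points are assumptions labeled in the comments: that a No in both step S802 and step S811 leads to step S804, and that a Yes in step S802 also leads to the mask-change check of step S821. Return values and parameter names are hypothetical.

```python
def end_determination_fig10(cost, prev_cost, mask_change, cost_thr, change_thr):
    """Finish only when the mask has settled AND the cost is low or got worse."""
    low_enough = cost <= cost_thr                             # S802
    got_worse = prev_cost is not None and cost > prev_cost    # S811
    if not (low_enough or got_worse):
        return "continue_masking"        # assumption: S802 No and S811 No -> S804
    # Assumption: both a Yes in S802 and a Yes in S811 lead to S821.
    if mask_change is not None and mask_change < change_thr:  # S821: Yes
        return "output_current"                               # S803
    return "continue_masking"                                 # S804

decisions = [
    end_determination_fig10(1.2, 0.9, 3, cost_thr=0.5, change_thr=5),
    end_determination_fig10(0.4, 0.9, 50, cost_thr=0.5, change_thr=5),
    end_determination_fig10(0.4, 0.9, 3, cost_thr=0.5, change_thr=5),
    end_determination_fig10(0.8, 0.9, 3, cost_thr=0.5, change_thr=5),
]
```

Compared with the FIG. 9 reading, this variant is stricter: a low cost alone does not end the loop while the mask area is still changing by the threshold amount or more.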

以上の説明から明らかなように、第１の実施形態における手認識システムは、深度データと手指モデルとを対比し、手指モデルの最適な位置、姿勢、各関節の角度を算出する最適化処理を行うことで手の領域を認識する。また、第１の実施形態における手認識システムは、認識した手の領域に基づいて腕の領域を含むマスク領域を算出し、算出したマスク領域に基づいて、深度データをマスクするマスク処理を行う。そして、第１の実施形態における手認識システムは、マスク処理後の深度データに対して、再び、最適化処理を行うことで手の領域を認識しなおす。 As is clear from the above description, the hand recognition system according to the first embodiment performs an optimization process for comparing the depth data with the finger model and calculating the optimum position, posture, and angle of each joint of the finger model. Recognize hand area by doing. Further, the hand recognition system according to the first embodiment calculates a mask region including an arm region based on the recognized hand region, and performs mask processing for masking depth data based on the calculated mask region. Then, the hand recognition system in the first embodiment recognizes the hand region again by performing optimization processing again on the depth data after the mask processing.

このように、第１の実施形態における手認識システムでは、最適化処理、マスク領域算出処理及びマスク処理を再帰的に実行する。これにより、第１の実施形態における手認識システムによれば、測定対象者の手首にリストバンドを装着させることなく、あるいは、センサの角度や測定対象者が着用する衣類の種類によらずに、深度データから腕の領域を徐々に除外していくことが可能になる。 Thus, in the hand recognition system in the first embodiment, the optimization process, the mask area calculation process, and the mask process are recursively executed. Thereby, according to the hand recognition system in the first embodiment, without attaching the wristband to the wrist of the measurement subject, or regardless of the angle of the sensor and the type of clothing worn by the measurement subject, It is possible to gradually exclude the arm region from the depth data.
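The repeated cycle of optimization, mask area calculation, and masking can be sketched as below. The sketch is written as a loop rather than as literal recursion, uses only the FIG. 8A cost criterion, and adds a `max_rounds` guard that is not in the source. The `optimize` callback and `toy_optimize` are hypothetical stand-ins for the model optimization unit 404; the toy version only exists so the example runs.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def recognize_hand(points, optimize, cost_threshold, max_rounds=10):
    """Alternate fitting and arm masking until the cost is low enough.

    optimize(points) must return (wrist, finger_root, cost) for the best fit.
    """
    wrist, root, cost = optimize(points)
    for _ in range(max_rounds):               # added safety guard (assumption)
        if cost <= cost_threshold:            # S802: Yes -> stop and output
            break
        v0 = sub(root, wrist)
        kept = [p for p in points if dot(sub(p, wrist), v0) >= 0]
        if len(kept) == len(points):          # nothing was masked: stop
            break
        points = kept                         # post-mask depth data
        wrist, root, cost = optimize(points)  # optimize again
    return wrist, root, cost, points

def toy_optimize(points):
    """Stand-in fit: pretends a hand spans 4 units along y."""
    ys = [p[1] for p in points]
    cost = max(0.0, (max(ys) - min(ys)) - 4.0)
    wrist_y = min(ys) if cost == 0.0 else (min(ys) + max(ys)) / 2
    return (0.0, wrist_y), (0.0, wrist_y + 1.0), cost

column = [(0.0, float(y)) for y in range(10)]   # arm (low y) plus hand (high y)
wrist, root, cost, hand = recognize_hand(column, toy_optimize, cost_threshold=0.0)
```

In this toy run the first fit places the wrist too far down the arm, the arm-side points are masked, and the second fit on the remaining points reaches zero cost.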

この結果、第１の実施形態における手認識システムによれば、深度データから手の領域を認識する際の、認識精度を向上させることができる。 As a result, according to the hand recognition system in the first embodiment, it is possible to improve recognition accuracy when recognizing a hand region from depth data.

［第２の実施形態］ [Second Embodiment]
上記第１の実施形態では、深度データに対して最適化処理、マスク領域算出処理及びマスク処理を再帰的に実行し、深度データから腕の領域を徐々に除外していくことで、高い認識精度を実現するものとして説明した。これに対して、第２の実施形態では、１回目の最適化処理の結果に基づいて算出するマスク領域を広くとることで、最適化処理、マスク領域算出処理及びマスク処理を繰り返す際の繰り返し回数を減らす。以下、第２の実施形態について、第１の実施形態との相違点を中心に説明する。 In the first embodiment, high recognition accuracy is achieved by recursively executing the optimization process, the mask area calculation process, and the mask process on the depth data and gradually excluding the arm region from the depth data. In contrast, the second embodiment reduces the number of repetitions of the optimization process, the mask area calculation process, and the mask process by widening the mask area calculated from the result of the first optimization process. Hereinafter, the second embodiment will be described focusing on the differences from the first embodiment.

図１１は、マスク領域算出部によるマスク領域算出処理及びマスク部によるマスク処理の詳細を示す第３の図である。このうち、図１１（ａ）のデータ６００は、特定部位深度データに対して最適化処理が行われた場合のデータ４１４（図４）の一部を拡大して示したものである。 FIG. 11 is a third diagram illustrating details of the mask area calculation process by the mask area calculation unit and the mask process by the mask unit. Among these, the data 600 in FIG. 11A is an enlarged view of a part of the data 414 (FIG. 4) when the optimization process is performed on the specific part depth data.

第２の実施形態において、マスク領域算出部４０５は、まず、データ６００上において、手指モデル４２０の中指の先端を示す点１１０１の位置を特定する。続いて、マスク領域算出部４０５は、図１１（ｂ）に示すように、データ６００上において、手の領域の先端を示す点１１０２を抽出する。なお、マスク領域算出部４０５は、例えば、データ６００上の各点のうち、手指モデル４２０と近接する各点と同一物体の領域を識別し、識別した領域の端部の点を、点１１０２として抽出する。 In the second embodiment, the mask area calculation unit 405 first specifies, on the data 600, the position of a point 1101 indicating the tip of the middle finger of the finger model 420. Subsequently, as shown in FIG. 11B, the mask area calculation unit 405 extracts a point 1102 indicating the tip of the hand region on the data 600. For example, the mask area calculation unit 405 identifies, among the points on the data 600, the area belonging to the same object as the points close to the finger model 420, and extracts a point at the end of the identified area as the point 1102.

更に、マスク領域算出部４０５は、図１１（ｃ）に示すように、手指モデル４２０の中指の先端を示す点１１０１が、手の領域の先端を示す点１１０２に一致するように、データ６００内において、手指モデル４２０を平行移動させる。 Further, as shown in FIG. 11C, the mask area calculation unit 405 translates the finger model 420 within the data 600 so that the point 1101 indicating the tip of the middle finger of the finger model 420 coincides with the point 1102 indicating the tip of the hand region.

続いて、マスク領域算出部４０５は、図１１（ｄ）に示すように、平行移動後の手指モデル４２０の手首を示す点１１１１の位置を特定する。また、マスク領域算出部４０５は、手指モデル４２０の中指の付け根を示す点１１１２の位置を特定する。更に、マスク領域算出部４０５は、点１１１１の位置を始点とし点１１１２の位置を終点とするベクトルｖ_０を算出する。 Subsequently, as shown in FIG. 11D, the mask area calculation unit 405 specifies the position of a point 1111 indicating the wrist of the finger model 420 after the parallel movement. In addition, the mask area calculation unit 405 specifies the position of a point 1112 that indicates the root of the middle finger of the finger model 420. Further, the mask area calculation unit 405 calculates a vector v ₀ having the position of the point 1111 as the start point and the position of the point 1112 as the end point.

続いて、マスク領域算出部４０５は、図１１（ｅ）に示すように、点１１１１の位置を始点としデータ６００上の任意の点の位置を終点とするベクトルを算出する。例えば、マスク領域算出部４０５は、データ６００上の点１１１１の位置を始点とし点１１２０＿１の位置を終点とするベクトルｖ_１を算出する。また、マスク領域算出部４０５は、データ６００上の点１１１１の位置を始点とし点１１２０＿２の位置を終点とするベクトルｖ_２を算出する。また、マスク領域算出部４０５は、データ６００上の点１１１１の位置を始点とし点１１２０＿３の位置を終点とするベクトルｖ_３を算出する。以下、同様に、マスク領域算出部４０５は、データ６００上の点１１１１の位置を始点としｎ個の点それぞれの位置を終点とするｎ個のベクトルを算出する。 Subsequently, as shown in FIG. 11E, the mask area calculation unit 405 calculates a vector having the position of the point 1111 as the start point and the position of an arbitrary point on the data 600 as the end point. For example, the mask area calculation unit 405 calculates a vector v ₁ having the position of the point 1111 on the data 600 as the start point and the position of the point 1120_1 as the end point. Further, the mask area calculation unit 405 calculates a vector v ₂ having the position of the point 1111 on the data 600 as the start point and the position of the point 1120_2 as the end point. Further, the mask area calculation unit 405 calculates a vector v ₃ having the position of the point 1111 on the data 600 as the start point and the position of the point 1120_3 as the end point. Similarly, the mask area calculation unit 405 calculates n vectors starting from the position of the point 1111 on the data 600 and ending at the position of each of the n points.

続いて、マスク領域算出部４０５は、算出したｎ個のベクトルそれぞれと、ベクトルｖ_０との内積を算出する。図１１（ｅ）のデータ６００において、直線１１３０より上側の領域はベクトルｖ_０との内積が０以上となるベクトルによって特定される点が位置する領域である。一方、直線１１３０より下側の領域はベクトルｖ_０との内積が０より小さいベクトルによって特定される点が位置する領域（つまり、マスク領域）である。 Subsequently, the mask area calculation unit 405 calculates the inner product of each of the calculated n vectors and the vector v₀. In the data 600 of FIG. 11E, the area above the straight line 1130 is the area in which the points specified by vectors whose inner product with the vector v₀ is 0 or more are located. On the other hand, the area below the straight line 1130 is the area in which the points specified by vectors whose inner product with the vector v₀ is smaller than 0 are located (that is, the mask area).

図１１（ｆ）は、マスク領域算出部４０５によって算出されたマスク領域１１４０に基づいて、マスク部４０３が、データ６００をマスク処理した様子を示している。 FIG. 11F shows a state in which the mask unit 403 masks the data 600 based on the mask region 1140 calculated by the mask region calculation unit 405.

このように、第２の実施形態における手認識システムは、１回目の最適化処理の結果に対して、手指モデル４２０を平行移動させたうえでマスク領域を算出することで、第１の実施形態と比較して、マスク領域を広くとることができる。 As described above, the hand recognition system according to the second embodiment calculates the mask region after moving the finger model 420 in parallel with respect to the result of the first optimization process, thereby enabling the first embodiment. Compared to the above, the mask area can be widened.
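The translation step of the second embodiment and its effect on the mask can be sketched as follows. This is an illustration under assumptions: the finger model is reduced to three hypothetical named key points, coordinates are 2D tuples, and the extraction of the hand-region tip (point 1102) is taken as given.

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shift_model_to_region_tip(keypoints, region_tip):
    """Translate every key point so the middle fingertip lands on region_tip."""
    shift = sub(region_tip, keypoints["middle_tip"])
    return {name: add(p, shift) for name, p in keypoints.items()}

def arm_side_mask(points, wrist, finger_root):
    """True for points whose vector from the wrist has a negative inner product with v0."""
    v0 = sub(finger_root, wrist)
    return [dot(sub(p, wrist), v0) < 0 for p in points]

model = {"wrist": (0.0, 0.0), "middle_root": (0.0, 2.0), "middle_tip": (0.0, 4.0)}
moved = shift_model_to_region_tip(model, region_tip=(0.0, 6.0))

points = [(0.0, -1.0), (0.0, 1.0), (0.0, 3.0)]
mask_before = arm_side_mask(points, model["wrist"], model["middle_root"])
mask_after = arm_side_mask(points, moved["wrist"], moved["middle_root"])
```

Because the translation moves the wrist point toward the fingertip side while leaving the direction of v₀ unchanged, more points fall on the arm side, which is how the mask area becomes wider than in the first embodiment.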

この結果、第２の実施形態における手認識システムによれば、最適化処理、マスク領域算出処理及びマスク処理を繰り返す際の繰り返し回数を減らすことが可能になる。 As a result, according to the hand recognition system in the second embodiment, it is possible to reduce the number of repetitions when the optimization process, the mask area calculation process, and the mask process are repeated.

［その他の実施形態］ [Other Embodiments]
上記第１及び第２の実施形態では、手指モデル４２０の手首を示す点の位置を始点とし、中指の付け根を示す点の位置を終点とするベクトルを算出することで、ベクトルｖ_０を算出した。しかしながら、ベクトルｖ_０の算出方法はこれに限定されず、例えば、手指モデル４２０の手首を示す点の位置を始点とし、他の指（例えば、人差し指）の付け根を示す点の位置を終点とするベクトルを算出することで、ベクトルｖ_０を算出するようにしてもよい。 In the first and second embodiments, the vector v₀ is obtained by calculating a vector that starts at the position of the point indicating the wrist of the finger model 420 and ends at the position of the point indicating the root of the middle finger. However, the method of calculating the vector v₀ is not limited to this. For example, the vector v₀ may be obtained by calculating a vector that starts at the position of the point indicating the wrist of the finger model 420 and ends at the position of the point indicating the root of another finger (for example, the index finger).

また、上記第１及び第２の実施形態では、主に、測定対象者の右手を例に最適化処理、マスク領域算出処理及びマスク処理を説明したが、測定対象者の左手についての最適化処理、マスク領域算出処理及びマスク処理も同様である。 In the first and second embodiments, the optimization process, the mask area calculation process, and the mask process are mainly described using the measurement subject's right hand as an example. However, the optimization process for the measurement subject's left hand is described. The same applies to the mask area calculation process and the mask process.

また、上記第１及び第２の実施形態では、複数種類の手指モデルのうち、所定の１種類の手指モデルについて最適化処理を行うものとして説明したが、複数種類の手指モデルそれぞれについて最適化処理を行うようにしてもよい。この場合、モデル最適化部４０４では、複数種類の手指モデルのうち、最終的に、コストが最小の手指モデルを選択するものとする。 In the first and second embodiments described above, the optimization process is performed on a predetermined one kind of finger model among the plurality of kinds of finger models. However, the optimization process is performed on each of the plurality of kinds of finger models. May be performed. In this case, it is assumed that the model optimizing unit 404 finally selects a finger model with the lowest cost among a plurality of types of finger models.
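Selecting the lowest-cost model among several optimized finger models can be sketched in one function. The `(model_id, cost)` pair layout and the model names are assumptions for illustration only.

```python
def select_best_model(fits):
    """Return the (model_id, cost) pair with the smallest cost."""
    return min(fits, key=lambda fit: fit[1])

# Hypothetical costs after optimizing each finger model on the same data.
best = select_best_model([("open_hand", 3.2), ("fist", 1.1), ("pointing", 2.0)])
```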

また、上記第１及び第２の実施形態では、最適化処理を行うにあたり、手指モデルの大きさを固定としたが、手指モデルの大きさを拡縮しながら、最適化処理を行うようにしてもよい。これにより、大きさの異なる手指モデルを複数種類用意する必要がなくなるといった利点がある。 In the first and second embodiments, the size of the finger model is fixed when performing the optimization process. However, the optimization process may be performed while scaling the size of the finger model. Good. This has the advantage that it is not necessary to prepare a plurality of types of finger models having different sizes.

また、上記第１及び第２の実施形態では、最適化処理を行うにあたり、手指モデルの位置、姿勢、各関節の角度を変えながらコストを算出するものとして説明した。しかしながら、予め、位置、姿勢、各関節の角度を変えた複数の手指モデルを手指モデル情報記憶部１２２に記憶しておき、最適化処理の際、それぞれの手指モデルを読み出して、コストを算出するようにしてもよい。 In the first and second embodiments described above, the cost is calculated while changing the position and posture of the finger model and the angle of each joint in performing the optimization process. However, a plurality of finger models with different positions, postures, and angles of each joint are stored in advance in the finger model information storage unit 122, and each finger model is read during the optimization process to calculate the cost. You may do it.

なお、開示の技術では、以下に記載する付記のような形態が考えられる。
（付記１）
取得した深度データから処理対象とする手の候補領域を抽出し、
前記処理対象とする手の候補領域に対して最も類似する手指モデルを特定し、
前記処理対象とする手の候補領域に対して特定した前記手指モデルの類似の度合いが、所定の基準を満たさない場合に、特定した前記手指モデルにおける手首の位置を始点とし指の付け根の位置を終点とするベクトルに基づいて特定される腕の領域を除くことで、前記処理対象とする手の候補領域内において、新たな処理対象とする手の候補領域を設定し、
前記新たな処理対象とする手の候補領域を用いて、前記特定する処理と前記設定する処理とを再帰的に行う
処理をコンピュータが実行することを特徴とする手認識方法。
（付記２）
特定した前記手指モデルにおける手首の位置を始点とする任意の複数方向を指す複数のベクトルと、特定した前記手指モデルにおける手首の位置を始点とし指の付け根の位置を終点とするベクトルとの間で、それぞれ内積を算出し、算出した内積の値に基づいて前記腕の領域を特定する、ことを特徴とする付記１に記載の手認識方法。
（付記３）
前記処理対象とする手の候補領域に対して、前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度を算出することで、前記処理対象とする手の候補領域に最も類似する手指モデルを特定することを特徴とする付記１または２に記載の手認識方法。
（付記４）
前記任意の複数方向を指す複数のベクトルのうち、算出した内積の値が０より小さいベクトルによって特定される点が位置する領域を、前記処理対象とする手の候補領域内における前記腕の領域として特定することを特徴とする付記２に記載の手認識方法。
（付記５）
前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度についての確からしさを示すパラメータを算出し、
前記特定する処理と前記設定する処理とを再帰的に行い、算出した該パラメータが所定の閾値以下となった際の、前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度を、手の領域の認識結果として出力することを特徴とする付記３に記載の手認識方法。
（付記６）
前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度についての確からしさを示すパラメータを算出し、
前記特定する処理と前記設定する処理とを再帰的に行い、前記パラメータの変化が所定の基準を満たす場合に、変化する前のパラメータが算出された際の、前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度を、手の領域の認識結果として出力することを特徴とする付記３に記載の手認識方法。
（付記７）
前記特定する処理と前記設定する処理とを再帰的に行い、前記新たな処理対象とする手の候補領域の変化量が、所定の基準を満たした際の、前記手指モデルの位置、姿勢及び該手指モデルの各関節の角度を、手の領域の認識結果として出力することを特徴とする付記３に記載の手認識方法。
（付記８）
取得した深度データから処理対象とする手の候補領域を抽出し、
前記処理対象とする手の候補領域に対して最も類似する手指モデルを特定し、
前記処理対象とする手の候補領域に対して特定した前記手指モデルの類似の度合いが、所定の基準を満たさない場合に、特定した前記手指モデルにおける手首の位置を始点とし指の付け根の位置を終点とするベクトルに基づいて特定される腕の領域を除くことで、前記処理対象とする手の候補領域内において、新たな処理対象とする手の候補領域を設定し、
前記新たな処理対象とする手の候補領域を用いて、前記特定する処理と前記設定する処理とを再帰的に行う
処理をコンピュータに実行させるための手認識プログラム。
（付記９）
取得した深度データから処理対象とする手の候補領域を抽出する抽出部と、
前記処理対象とする手の候補領域に対して最も類似する手指モデルを特定する特定部と、
前記処理対象とする手の候補領域に対して特定した前記手指モデルの類似の度合いが、所定の基準を満たさない場合に、特定した前記手指モデルにおける手首の位置を始点とし指の付け根の位置を終点とするベクトルに基づいて特定される腕の領域を除くことで、前記処理対象とする手の候補領域内において、新たな処理対象とする手の候補領域を設定する設定部と、を有し、
前記新たな処理対象とする手の候補領域を用いて、前記特定する処理と前記設定する処理とを再帰的に行うことを特徴とする情報処理装置。 In addition, in the disclosed technology, forms such as the following supplementary notes are conceivable.
(Appendix 1)
A hand recognition method in which a computer executes a process of:
extracting a candidate region of a hand to be processed from acquired depth data;
specifying a finger model that is most similar to the candidate region of the hand to be processed;
when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, setting a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model; and
recursively performing the specifying process and the setting process using the new candidate region of the hand to be processed.
(Appendix 2)
The hand recognition method according to appendix 1, wherein inner products are calculated between each of a plurality of vectors that start at the wrist position in the specified finger model and point in a plurality of arbitrary directions, and the vector that starts at the wrist position and ends at the finger base position in the specified finger model, and the arm region is specified based on the calculated inner product values.
(Appendix 3)
The hand recognition method according to appendix 1 or 2, wherein the finger model that is most similar to the candidate region of the hand to be processed is specified by calculating the position and posture of the finger model and the angle of each joint of the finger model with respect to the candidate region of the hand to be processed.
(Appendix 4)
The hand recognition method according to appendix 2, wherein, among the plurality of vectors pointing in the plurality of arbitrary directions, the region containing the points specified by vectors whose calculated inner product value is smaller than 0 is specified as the arm region within the candidate region of the hand to be processed.
(Appendix 5)
The hand recognition method according to appendix 3, wherein a parameter indicating the likelihood of the position and posture of the finger model and the angle of each joint of the finger model is calculated, and
the specifying process and the setting process are performed recursively, and the position and posture of the finger model and the angle of each joint of the finger model at the time the calculated parameter becomes equal to or less than a predetermined threshold are output as a recognition result of the hand region.
(Appendix 6)
The hand recognition method according to appendix 3, wherein a parameter indicating the likelihood of the position and posture of the finger model and the angle of each joint of the finger model is calculated, and
the specifying process and the setting process are performed recursively, and when a change in the parameter satisfies a predetermined criterion, the position and posture of the finger model and the angle of each joint of the finger model at the time the parameter before the change was calculated are output as a recognition result of the hand region.
(Appendix 7)
The hand recognition method according to appendix 3, wherein the specifying process and the setting process are performed recursively, and the position and posture of the finger model and the angle of each joint of the finger model at the time the amount of change in the new candidate region of the hand to be processed satisfies a predetermined criterion are output as a recognition result of the hand region.
(Appendix 8)
A hand recognition program for causing a computer to execute a process of:
extracting a candidate region of a hand to be processed from acquired depth data;
specifying a finger model that is most similar to the candidate region of the hand to be processed;
when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, setting a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model; and
recursively performing the specifying process and the setting process using the new candidate region of the hand to be processed.
(Appendix 9)
An information processing apparatus comprising:
an extraction unit that extracts a candidate region of a hand to be processed from acquired depth data;
a specifying unit that specifies a finger model that is most similar to the candidate region of the hand to be processed; and
a setting unit that, when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, sets a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model,
wherein the specifying process and the setting process are performed recursively using the new candidate region of the hand to be processed.
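The arm exclusion of appendices 2 and 4 and the repeated processing of appendices 1, 5, and 7 can be sketched as follows. This is only an illustration under stated assumptions: the candidate region is a list of 3D points, `fit` is a hypothetical stand-in for the specifying process that returns the wrist position, the finger base position, and a cost, the loop stops when no arm points remain (one possible reading of the change criterion in appendix 7), and `max_iters` is an added safety limit that the text does not mention.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Fit = Tuple[Point, Point, float]  # (wrist, finger base, cost)


def dot(a: Point, b: Point) -> float:
    return sum(x * y for x, y in zip(a, b))


def sub(a: Point, b: Point) -> Point:
    return tuple(x - y for x, y in zip(a, b))


def split_arm(points: Sequence[Point], wrist: Point, finger_base: Point):
    """Treat a point as arm when the inner product of (point - wrist)
    with (finger_base - wrist) is smaller than 0."""
    axis = sub(finger_base, wrist)
    hand: List[Point] = []
    arm: List[Point] = []
    for p in points:
        (arm if dot(sub(p, wrist), axis) < 0 else hand).append(p)
    return hand, arm


def recognize(points: Sequence[Point], fit: Callable[[List[Point]], Fit],
              cost_threshold: float, max_iters: int = 10):
    """Alternate fitting and arm exclusion until the cost is low enough."""
    region = list(points)
    wrist, finger_base, cost = fit(region)
    for _ in range(max_iters):  # assumed safety limit
        if cost <= cost_threshold:  # stop: fit is good enough
            break
        hand, arm = split_arm(region, wrist, finger_base)
        if not arm:  # stop: candidate region no longer changes
            break
        region = hand
        wrist, finger_base, cost = fit(region)
    return region, (wrist, finger_base, cost)


def toy_fit(region: List[Point]) -> Fit:
    """Toy stand-in: fixed wrist and finger base, cost counts leftover arm points."""
    cost = float(sum(1 for p in region if p[0] < 0))
    return (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), cost


cloud = [(float(x), 0.0, 0.0) for x in range(-3, 3)]
hand_region, (_, _, final_cost) = recognize(cloud, toy_fit, cost_threshold=0.0)
print(hand_region, final_cost)
```

In this toy setup the three points with negative x lie behind the wrist relative to the wrist-to-finger-base vector, so a single exclusion pass leaves only the hand points and the second fit meets the threshold.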

なお、上記実施形態に挙げた構成等に、その他の要素との組み合わせ等、ここで示した構成に本発明が限定されるものではない。これらの点に関しては、本発明の趣旨を逸脱しない範囲で変更することが可能であり、その応用形態に応じて適切に定めることができる。 Note that the present invention is not limited to the configurations shown here, including combinations of the configurations described in the above embodiments with other elements. These points can be changed without departing from the spirit of the present invention, and can be determined appropriately according to the form of application.

１００：手認識システム
１１０：深度センサ
１２０：情報処理装置
１２１：手認識処理部
２００：手指モデル情報
４０１：深度データ取得部
４０２：部位抽出部
４０３：マスク部
４０４：モデル最適化部
４０５：マスク領域算出部
４０６：終了判定部
４２０、４３０：手指モデル
６３０、７３０：マスク領域
１１４０：マスク領域
100: Hand recognition system
110: Depth sensor
120: Information processing device
121: Hand recognition processing unit
200: Finger model information
401: Depth data acquisition unit
402: Part extraction unit
403: Mask unit
404: Model optimization unit
405: Mask region calculation unit
406: End determination unit
420, 430: Finger model
630, 730: Mask region
1140: Mask region

Claims

A hand recognition method in which a computer executes a process of:
extracting a candidate region of a hand to be processed from acquired depth data;
specifying a finger model that is most similar to the candidate region of the hand to be processed;
when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, setting a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model; and
recursively performing the specifying process and the setting process using the new candidate region of the hand to be processed.

The hand recognition method according to claim 1, wherein inner products are calculated between each of a plurality of vectors that start at the wrist position in the specified finger model and point in a plurality of arbitrary directions, and the vector that starts at the wrist position and ends at the finger base position in the specified finger model, and the arm region is specified based on the calculated inner product values.

The hand recognition method according to claim 1, wherein the finger model that is most similar to the candidate region of the hand to be processed is specified by calculating the position and posture of the finger model and the angle of each joint of the finger model with respect to the candidate region of the hand to be processed.

The hand recognition method according to claim 2, wherein, among the plurality of vectors pointing in the plurality of arbitrary directions, the region containing the points specified by vectors whose calculated inner product value is smaller than 0 is specified as the arm region within the candidate region of the hand to be processed.

The hand recognition method according to claim 3, wherein a parameter indicating the likelihood of the position and posture of the finger model and the angle of each joint of the finger model is calculated, and
the specifying process and the setting process are performed recursively, and the position and posture of the finger model and the angle of each joint of the finger model at the time the calculated parameter becomes equal to or less than a predetermined threshold are output as a recognition result of the hand region.

The hand recognition method according to claim 3, wherein a parameter indicating the likelihood of the position and posture of the finger model and the angle of each joint of the finger model is calculated, and
the specifying process and the setting process are performed recursively, and when a change in the parameter satisfies a predetermined criterion, the position and posture of the finger model and the angle of each joint of the finger model at the time the parameter before the change was calculated are output as a recognition result of the hand region.

The hand recognition method according to claim 3, wherein the specifying process and the setting process are performed recursively, and the position and posture of the finger model and the angle of each joint of the finger model at the time the amount of change in the new candidate region of the hand to be processed satisfies a predetermined criterion are output as a recognition result of the hand region.

A hand recognition program for causing a computer to execute a process of:
extracting a candidate region of a hand to be processed from acquired depth data;
specifying a finger model that is most similar to the candidate region of the hand to be processed;
when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, setting a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model; and
recursively performing the specifying process and the setting process using the new candidate region of the hand to be processed.

An information processing apparatus comprising:
an extraction unit that extracts a candidate region of a hand to be processed from acquired depth data;
a specifying unit that specifies a finger model that is most similar to the candidate region of the hand to be processed; and
a setting unit that, when the degree of similarity of the specified finger model to the candidate region of the hand to be processed does not satisfy a predetermined criterion, sets a new candidate region of the hand to be processed within the candidate region of the hand to be processed by excluding an arm region specified based on a vector that starts at the wrist position and ends at the finger base position in the specified finger model,
wherein the specifying process and the setting process are performed recursively using the new candidate region of the hand to be processed.