JP6810442B2

JP6810442B2 - A camera assembly, a finger shape detection system using the camera assembly, a finger shape detection method using the camera assembly, a program for implementing the detection method, and a storage medium for the program.

Info

Publication number: JP6810442B2
Application number: JP2016122170A
Authority: JP
Inventors: 聖星野; 聡太杉村
Original assignee: 聖星野
Priority date: 2016-06-20
Filing date: 2016-06-20
Publication date: 2021-01-06
Anticipated expiration: 2036-06-20
Also published as: JP2017227687A

Description

The present invention relates to a method of detecting the shape of a human hand and fingers (hereinafter, the finger shape) from an image captured by a camera, and in particular to a system and method that detect the finger shape from a planar (2D) image taken by a single-lens (monocular) camera by estimating the parts that are not visible in the image.

Gesture input, which detects the finger shape from the movement of the user's hand, and "motion capture of fine finger movements" (hand mocap: three-dimensional finger shape estimation technology) are conventionally known as examples of methods for driving a multi-fingered robot hand or manipulator, or for moving the fingers of a game or animation character shown on a display unit. There is demand for hand mocap data to be acquired together with facial expressions, simultaneously with and linked to full-body motion capture. For facial expressions, systems that render computer animation in real time in combination with body motion capture have already come into practical use, so hand mocap is said to be the last technology that has not yet reached a practical level.

Hand mocap can be roughly classified into two methods: (X) a device-wearing method, which obtains the finger shape from the output of sensor devices worn on the user's arm or fingers, or from analysis of image data of markers captured by a camera; and (Y) an image processing method, which detects the hand mocap motion of the finger shape from captured images of the arm and fingers. With (X) the device-wearing method, accurate finger detection is possible, but the setup is large-scale, preparation takes time, and detection is not easy; when sensors are worn, the user is also restrained and cannot move freely. The (Y) image processing method is therefore preferable for detecting fingers easily. The following mainly describes the (Y) image processing method related to the present invention and the technology related to it.

The (Y) image processing method can be roughly classified into two approaches: (Y1) the 3D-model-based approach (hereinafter, the 3D approach) and (Y2) the 2D-appearance-based approach (hereinafter, the 2D approach).

The (Y1) 3D approach requires multiple lens mechanisms that can photograph the fingers simultaneously from several different directions, and a computer with enough computing power to estimate smooth moving images of the finger shape in real time. In the (Y2) 2D approach, for example, (Y2a) higher-order local autocorrelation features (hereinafter, HLAC) convert the contour (silhouette outline) information of a hand image into features for matching, which allows highly accurate estimation. However, when different hand shapes produce identical or similar contours, they are difficult to tell apart. The problem of individual differences, in which the same hand shape produces different contours and becomes hard to distinguish from other hand shapes, also remains unsolved. (Y2b) The method based on the Histogram of Gradients (hereinafter, HoG) converts the luminance gradient information of the image into features, so it can discriminate the region inside the contour and can address individual differences, but it requires a large amount of physical memory and cannot handle individual differences at the feature level. The following mainly describes the (Y2) 2D approach related to the present invention, together with the (Y2a) HLAC image processing method and the technology related to it.
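As a rough illustration of what turning luminance gradient information into features means for (Y2b), the sketch below builds a single magnitude-weighted histogram of gradient orientations over a whole grayscale image. It is a simplified stand-in and not the HoG implementation referred to in the prior art: the function name `gradient_histogram`, the 9-bin default, and the use of one global cell without block normalization are all assumptions made for illustration.

```python
import math

def gradient_histogram(image, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees).

    `image` is a list of rows of luminance values. Border pixels are skipped
    because central differences need both neighbours.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c - 1]  # horizontal central difference
            gy = image[r + 1][c] - image[r - 1][c]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), bins - 1)  # clamp rounding at 180
            hist[index] += magnitude
    return hist

# A 4x4 image with a single vertical edge: all gradient energy is horizontal.
edge_image = [[0, 0, 1, 1] for _ in range(4)]
histogram = gradient_histogram(edge_image)
```

Because the histogram uses interior luminance and not only the outline, two hands with the same silhouette but different interior shading produce different vectors, which is the property the text attributes to HoG.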

The camera used for the (Y2) 2D approach is a monochrome or color monocular camera, but detecting the finger shape from a monocular (2D) captured image is difficult.

In the (Y2) 2D approach, it is also known, for example, to build an image database for matching by combining finger joint angle and rotation angle data obtained with the (X) device-wearing method with image features derived from the contour in each divided region of a grayscale finger image captured by a monocular camera, and to use the result of matching a new finger image against the finger images in the database as the finger detection result (see, for example, Patent Document 3). In that case, the image data whose image features are most similar to the features (such as the contour) obtained from the new finger image is retrieved from the finger images in the database, and the finger shape in the new image is estimated from the joint angle and rotation angle data paired with that most similar image data. This improves the accuracy of finger shape detection.
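The database lookup described above can be sketched as a nearest-neighbour search. This is a minimal illustration and not the method of Patent Document 3: the record layout, the joint name `index_pip`, the three-element feature vectors, and the choice of Euclidean distance as the similarity measure are all assumptions.

```python
import math

def nearest_pose(query_features, database):
    """Return the joint-angle record paired with the most similar feature vector."""
    best = min(database, key=lambda entry: math.dist(query_features, entry["features"]))
    return best["joint_angles"]

# Hypothetical database: each entry pairs image features with measured angles (degrees).
database = [
    {"features": [0.9, 0.1, 0.3], "joint_angles": {"index_pip": 10.0}},
    {"features": [0.2, 0.8, 0.5], "joint_angles": {"index_pip": 85.0}},
]

estimated = nearest_pose([0.25, 0.75, 0.5], database)
```

A linear scan like this costs time proportional to the number of stored images, which is why database size matters so much for this family of methods.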

Also in the (Y2) 2D approach, to reduce the amount of image data in the matching database and to make matching easier, the orientation of the finger images in the database can be made uniform by, for example, determining the extension direction of the forearm and the position of the wrist from the forearm contour in each finger image and matching with everything beyond the wrist oriented the same way. The sizes can also be made uniform by using the contour of each finger image to normalize the image to a predetermined number of pixels vertically and horizontally (see, for example, Patent Documents 1 and 2). Conventional finger shape estimation methods using a monocular camera therefore obtained contour information that was as precise as possible from the raw hand image data, and estimated the finger shape from that contour information and the image data in the matching database.
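The size normalization step can be sketched as cropping a binary silhouette to its bounding box and resampling it to a fixed square grid. This is an illustrative sketch only: the function name `normalize_silhouette`, nearest-neighbour sampling, and the square output that does not preserve aspect ratio are assumptions, not details taken from Patent Documents 1 and 2.

```python
def normalize_silhouette(mask, size):
    """Crop a binary mask to the hand's bounding box and resample to size x size."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no hand pixels")
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    # Nearest-neighbour sampling from the bounding box into the output grid.
    return [
        [mask[top + r * height // size][left + c * width // size] for c in range(size)]
        for r in range(size)
    ]

# A 2x2 blob inside a 4x6 frame fills the whole 4x4 output after normalization.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
normalized = normalize_silhouette(mask, 4)
```

After this step, images taken at different distances occupy the same pixel grid, so the database does not need separate entries for each scale.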

When the conventional (Y2a) HLAC method detects "finger shape, positional relationship, and movement" from an image captured by a monocular camera using only the contour shape, it is known that, for the following three reasons (a), (b), and (c), finger shapes that differ but have identical or similar contours are difficult to distinguish, and the finger shape is therefore difficult to estimate accurately. In addition, because of individual differences, the contour of the same finger shape differs from person to person, so one person's finger shape can become hard to distinguish from a different finger shape of another person.
(a) The fingers have a multi-jointed structure, so their shape changes are complex.
(b) When the joints are bent or the hand is clenched, self-occlusion is frequent: in the contour shape, fingers are hidden behind the back of the hand or the palm.
(c) The hand occupies a small proportion of the whole body but has a large range of movement.

Furthermore, when an image database is built by associating each image of the monocular (Y2) 2D approach with joint angle data, individual differences are extremely varied, for example in the thickness and length of each finger, the way the thumb extends from the back of the hand, and the size of the back of the hand. It is therefore difficult to create finger images that broadly cover individual differences, such as general-purpose or representative sample finger images or reference finger images. A large amount of image data had to be prepared to account for individual differences, which increased the amount of image data stored in the finger image database. To reduce the amount of stored image data, estimation was performed with image data that covered individual differences within a certain range, but when the individual differences exceeded the assumed range, the estimate could be wrong.

Preparing every possible image to cover the variety of individual differences, in order to avoid such estimation errors, greatly increases the amount of data, which in turn increases the required memory and the man-hours needed to build the database. The amount of data to be matched against each new image also increases, so searching the image database for the most similar image takes time and the computing speed of the data processing device may be insufficient. Conversely, the limited processing capacity of the data processing device may restrict the amount of data that can be matched, which makes it difficult to prepare all the image data.

As described in Patent Document 2, it is also known that using the shape of ridge lines running through the centers of the fingers, instead of the contour used for the image features, reduces the amount of image data and resolves the shortage of computing speed. The finger ridge lines are obtained, for example, by applying pseudo-skeletonization to a grayscale finger image of the user captured by a monocular camera, using the thinning process employed in edge processing and the like, and taking the resulting thin lines (ridge lines). Line ends produced by noise during thinning, as opposed to fingertips, are discarded as invalid when their distance from the centroid coordinates of the hand is within a given value. In the 2D approach, the direction and amount of finger movement (hand tracking) can be detected from the contour shape of the finger images described above, using three-dimensional finger shape estimation (hand pose estimation) or the like.
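The removal of noise line ends can be sketched as a distance filter around the hand centroid. The sketch assumes that the thinning step has already produced a list of line-end coordinates, and it reads the rule as "discard ends no farther than a threshold from the centroid"; that reading, the threshold value, and the function names are assumptions, not details confirmed by Patent Document 2.

```python
import math

def centroid(mask):
    """Centroid (row, column) of the non-zero pixels of a binary hand mask."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    count = len(points)
    return (sum(r for r, _ in points) / count, sum(c for _, c in points) / count)

def keep_fingertip_candidates(endpoints, center, min_distance):
    """Keep only ridge-line ends that lie farther than min_distance from the centroid."""
    return [p for p in endpoints if math.dist(p, center) > min_distance]

hand_mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
center = centroid(hand_mask)
# One short spur next to the centroid and one distant end that may be a fingertip.
tips = keep_fingertip_candidates([(1, 2), (1, 9)], center, min_distance=3.0)
```

The idea is that spurs created by thinning tend to be short branches near the palm, whereas real fingertips lie far from the centroid.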

As an aid to solving the problems of Patent Documents 1 and 2, the following is also known (see, for example, Patent Document 3). A matching database of data sets, each containing a finger shape and image feature data for matching that includes joint angles and rotation angles, is created in advance using device-wearing hand mocap. An information processing device normalizes and smooths a captured image for detection to generate detection image feature data, compares it with the matching image feature data in the data sets, and selects a data set containing similar matching image feature data. The finger shape data in the selected data set is then included in the finger shape detection result and output.

Also as an aid to solving the problems of Patent Documents 1 and 2, the following is known (see, for example, Patent Document 4). In advance, one hand (for example, the right hand) is photographed by a camera assembly mounted on the back of the hand, a data glove is worn on the other hand (for example, the left hand), both hands perform the same motion, and the finger image data from the back-of-hand camera assembly and the finger angle data from the data glove are associated with each other and stored in a database. At detection time, one hand (for example, the right hand) is photographed by the back-of-hand camera assembly, the finger angle data in the database is read out on the basis of the finger image data from that camera assembly, and the finger angle data is output to operate a robot hand, a CG image of a hand, or the like.

As a camera assembly for the back of the hand, the following is known (see, for example, Patent Document 4): (a) a planar support base is formed on the dorsal surface of the back of the hand; (b) a support column, made of a material flexible enough for its angle to be adjusted by hand and having a camera mount at its tip, projects upward from the support base; (c) a flexible fixing band wraps around the hand, crossing the back of the hand from the little-finger side to the index-finger side, so as to hold the support base and the back of the hand together; and (d) a camera is mounted at the tip of the support column, and the angle of the support column is adjusted so that the camera's imaging lens can image the fingers from diagonally in front, between the thumb and the index finger.

Patent Document 1: International Publication WO 2009/147904 pamphlet
Patent Document 2: International Publication WO 2013/051681 pamphlet
Patent Document 3: JP 2016-014954 A
Patent Document 4: JP 2015-100697 A

However, the conventional shape detection methods of Patent Documents 1 to 3, which use a monocular camera fixed in a room or to a rack on a desk, have the following problems.
(A) The angle of the camera lens relative to the subject's fingers varies greatly and spans a wide range. As a result, the poses in the captured images also vary, which increased the amount of data in the database.
(B) The distance between the subject's fingers and the camera lens also varies greatly, and when the fingers fall outside the depth of field, the focal length and angle of view must be adjusted. As a result, the scale of the captured images also varies and the images must be enlarged or reduced, which increased the amount of data in the database.
(C) Related to the variety of poses in (A), some captured images show only the back of the hand with none of the finger bases visible, and some do not show even the back of the hand, which makes finger estimation difficult. Moreover, the improvements in matching accuracy and in handling individual differences were only partial, and in shape estimation using the matching results, the improvement in distinguishing different shapes that share the same contour when the finger joints are bent was also only partial.
(D) Matching that uses contours, ridge lines, and normalized and smoothed captured images, as in the conventional art, also took a long time because of the large amount of data in the database, as described above.

In addition, in the performing arts, various kinds of dance, and sports in which artistic impression is scored (ice skating, synchronized swimming, rhythmic gymnastics, and so on), hand movements are evaluated along with facial expressions and the movement of the whole body. In games and animation, too, hand movements are an important element of the drawing, alongside whole-body movement and facial expressions. With the conventional fixed cameras of Patent Documents 1 to 3, however, even if the number of cameras is increased, the fingers, or everything beyond the arm, may fall out of view, so it was impossible to acquire hand mocap data simultaneously with full-body motion capture.

The conventional camera assembly of Patent Document 4, although its specification does not describe the details, has the potential to alleviate problems (A) to (D) above and to make it possible to acquire hand mocap data simultaneously with full-body motion capture. However, in the shape detection method using that camera assembly, the finger angle data and other data stored in the matching database are input with a data glove, as in Patent Documents 1 to 3, so creating data sets that match finger images with the corresponding finger angle data took a great deal of labor and time. Also as in Patent Documents 1 to 3, fine-grained feature extraction increased the amount of data in the database, because human hands differ between individuals in many ways, such as the thickness, depth, and length of the fingers.

An object of the present invention is therefore to provide a device that can estimate all the finger joint angles for each captured frame from the image data of a single small camera attached to the back of the hand. A further object is to provide a finger shape detection method and system that reduce the number of finger image poses in the matching database, reduce the range of reduction/enlargement ratios, reduce the amount of data, allow hand mocap data to be acquired simultaneously with full-body motion capture, make creation of the matching database more efficient so that the time and man-hours required are reduced, and further reduce the effect of individual differences.

First, to solve the problems described above, the camera assembly of the present invention has one or more small cameras and a support unit with which the camera can be placed in the space above and near the back of one hand of the wearer. The support unit has the following parts.
(a) A support column formed of a material that has at least both the strength to stand on its own under the stress imposed by the weight of the camera and the flexibility to allow its angle to be changed. The column is rod-shaped, with a length equal to or greater than the focal distance at which substantially the whole of the wearer's hand fits within the camera's angle of view.
(b) A support base to which the support column can be connected and mounted so that the column projects upward from the surface of the back of the hand and is supported. The support base is made of a hard material and has a flat portion formed in a planar shape along the surface of the back of the hand, with at least a predetermined area.
(c) A camera mount provided at the tip of the support column, on which the camera can be mounted.
(d) A back-of-hand fixing part formed of a material that has at least both the strength not to break when pressed and fixed and the flexibility to bend freely along the surface of the back of the hand. It is shaped either as a band wound around the hand in the direction crossing the bases of the fingers, so that it can press and fix at least part of the support base placed on the surface of the back of the hand from above, or as the back of a glove, so that it can press and fix the entire upper surface of the support base.
After the support column is fitted to the back of the wearer's hand, the flexibility of its material is used to change at least the angle at which it projects from the support base and the angle at which the camera is pointed in the imaging direction, so that the camera can image substantially the whole hand from a position on the fingertip side of the line crossing the bases of the fingers on the back of the hand, and from between the thumb and index finger of that hand.

Preferably, in the camera assembly according to the present invention, the camera mount may allow the mounting position of the camera to be adjusted by sliding it back and forth along the camera's optical axis, so that the wearer's hand is within the focal distance of the camera, including its depth of field, and the total vertical extent of the fingers and the back of the hand in the captured image frame substantially matches the vertical dimension of the image frame. The camera mount may also allow the mounting angle of the camera to be adjusted by rotation about the camera's optical axis, so that the tilt angle of the captured image frame becomes a predetermined angle that matches the image frames stored in advance for matching.

Preferably, in the camera assembly according to the present invention, the flat portion of the support base may be arranged so that it extends from the connection with the support column, along the surface of the back of the hand, toward the left, right, front, and rear edges of the back of the hand.

Preferably, the finger shape detection system according to the present invention is a system that has the camera assembly described above and at least one information processing device, and that detects finger shapes. The information processing device has at least the following units. A captured finger image storage unit stores the image data of each finger shape input from the camera. A virtual finger image forming unit uses a computer graphics (CG) program to form many kinds of CG finger images that differ in the dimensional ratios of the parts of the fingers and of the back of the hand, and also in finger pose. A virtual finger image database takes the dimensional ratios of the finger parts and of the back of the hand as the reference, classifies into the same group those virtual finger images that have the same dimensional ratios but different finger poses, and stores them as data sets that contain, for each virtual finger image, at least the dimensional ratio data and the angle data for each joint of each finger in that pose. A group discrimination unit determines the dimensional ratios of the finger parts and of the back of the hand in the captured finger image, and uses the dimensional ratio data to decide which group's virtual finger images to read out from the virtual finger image database. A silhouette image forming unit forms silhouette images with contour lines for both the captured finger images stored in the captured finger image storage unit and the virtual finger images stored in the virtual finger image database. A ridge line image forming unit forms a ridge line image from the captured finger image, the virtual finger image, or the silhouette image. A superimposed image forming unit superimposes the silhouette image and the ridge line image to form a superimposed image. An out-of-range ridge line determination unit determines whether all the ridge lines of the ridge line image lie inside the silhouette portion of the superimposed image, or whether any ridge line extends outside it. A candidate image narrowing unit, when a ridge line extends outside the silhouette portion, excludes the virtual finger image among the source images of that superimposed image from the candidate images used in the subsequent estimation processing. A finger angle estimation unit performs the estimation processing, including the similarity calculation: for each virtual finger image among the candidates narrowed down by the candidate image narrowing unit, it calculates the similarity to the input captured finger image, determines the virtual finger image with the smallest similarity value, and selects, from the data set containing that virtual finger image, data that includes at least the angle data for each joint of each finger. A data output unit outputs the result of the estimation processing to the device to be controlled.
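The narrowing and estimation stages can be sketched with images represented as sets of pixel coordinates. This is a sketch under stated assumptions, not the patented implementation: it assumes the superimposed image combines the captured silhouette with each candidate's ridge lines, it treats the similarity value as a distance (the number of differing silhouette pixels) so that the smallest value is the best match, and the record fields `silhouette`, `ridge`, and `joint_angles` are hypothetical.

```python
def estimate_joint_angles(captured_silhouette, candidates):
    """Drop candidates whose ridge lines leave the silhouette, then pick the closest.

    Images are sets of (row, column) pixels. Returns None if every candidate
    is excluded by the ridge-line test.
    """
    # Narrowing: a ridge pixel outside the silhouette disqualifies the candidate.
    kept = [c for c in candidates if c["ridge"] <= captured_silhouette]
    if not kept:
        return None
    # Estimation: smallest distance = fewest pixels that differ between silhouettes.
    best = min(kept, key=lambda c: len(c["silhouette"] ^ captured_silhouette))
    return best["joint_angles"]

captured = {(0, 0), (0, 1), (1, 0), (1, 1)}
candidates = [
    # Identical silhouette, but one ridge pixel lies outside: excluded.
    {"silhouette": set(captured), "ridge": {(0, 0), (2, 2)}, "joint_angles": "pose A"},
    {"silhouette": {(0, 0), (0, 1), (1, 0)}, "ridge": {(0, 0), (1, 0)}, "joint_angles": "pose B"},
    {"silhouette": {(0, 0)}, "ridge": {(0, 0)}, "joint_angles": "pose C"},
]
selected = estimate_joint_angles(captured, candidates)
```

The first candidate shows why the ridge-line test matters: its outline matches the captured hand exactly, so a contour-only comparison would choose it, yet its skeleton is inconsistent with the captured hand and it is removed before any similarity is computed.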

Preferably, the finger shape detection system according to the present invention may further have: an image dimension adjustment unit that enlarges or reduces the image data of each finger shape stored in the captured finger image storage unit so that the total vertical extent of the fingers and the back of the hand in the captured image frame substantially matches the vertical dimension of the image frame; and an image angle adjustment unit that rotates the captured image about the optical axis of the camera so that the tilt angle of the captured image frame becomes a predetermined angle matching the reference image frame stored in advance in the storage device.
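The two adjustments described above can be illustrated with a minimal sketch. It assumes the hand is available as a binary mask (a list of rows of 0/1 values) and that the optical axis projects to a known pixel; the function names `vertical_extent`, `scale_factor`, and `rotate_point` are hypothetical and do not come from the patent.

```python
import math


def vertical_extent(mask):
    """Number of rows spanned by the foreground (nonzero) pixels of a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    return rows[-1] - rows[0] + 1


def scale_factor(mask):
    """Zoom factor that makes the hand span the full height of the frame."""
    return len(mask) / vertical_extent(mask)


def rotate_point(x, y, cx, cy, angle_deg):
    """Rotate (x, y) about (cx, cy), the assumed optical-axis pixel, by angle_deg."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(a) - dy * math.sin(a),
            cy + dx * math.sin(a) + dy * math.cos(a))
```

A real implementation would resample the whole image rather than individual points; the sketch only shows how the zoom factor and the rotation about the optical axis are defined.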

Preferably, the finger shape detection method according to the present invention is a detection method that detects the finger shape by using the camera assembly described above together with at least one information processing device, in which the information processing device performs at least:
a step of storing image data of each finger shape input from the camera;
a step of using a computer graphics (CG) program to form a variety of CG finger images that differ in the dimensional ratios of each part of the fingers, in the dimensional ratio of the back of the hand, and in finger pose;
a step of classifying into the same group those of the formed finger images that share the same reference dimensional ratios (of each part of the fingers and of the back of the hand) but differ in finger pose, and storing, for each virtual finger image, a data set containing at least the dimensional ratio data and the angle data of each finger joint in that pose;
a step of determining the dimensional ratios of each part of the fingers and of the back of the hand in the captured finger image and deciding, from the dimensional ratio data, which group's virtual finger images are to be read out from the virtual finger images stored group by group together with the dimensional ratio data;
a step of forming a silhouette image having a contour line for both the stored captured finger image and the stored virtual finger image;
a step of forming a ridge line image from any of the captured finger image, the virtual finger image, or the silhouette image;
a step of superimposing the silhouette image and the ridge line image to form a superimposed image;
a step of determining whether all ridge lines of the ridge line image lie inside the silhouette image portion of the superimposed image, or whether any ridge line protrudes outside it;
a step of excluding, when a ridge line protrudes outside the silhouette image portion, the virtual finger image that is one of the source images of that superimposed image from the candidate images of the subsequent estimation processing;
a step of performing estimation processing including a similarity calculation, in which, for each virtual finger image among the narrowed-down candidate images, the similarity to the input captured finger image is calculated, the virtual finger image with the minimum similarity value is determined, and data including at least the angle data of each finger joint is selected from the data set containing that image; and
a step of outputting the result of the estimation processing to a device to be controlled.

Preferably, in the finger shape detection method according to the present invention, the information processing device may further perform: a step of enlarging or reducing the image data of each finger shape stored in the information processing device so that the total vertical extent of the fingers and the back of the hand in the captured image frame substantially matches the vertical dimension of the image frame; and a step of rotating the captured image about the optical axis of the camera so that the tilt angle of the captured image frame becomes a predetermined angle matching the reference image frame stored in advance in the storage device.

To solve the above problems, the program for the finger shape detection method according to the present invention implements the detection method described above. The storage medium according to the present invention stores that program.

The finger shape detection method of the present invention provides a device that can estimate all finger joint angles for every captured frame from the image data of a single small camera attached to the back of the hand. It also reduces the number of finger image poses in the collation database and reduces the reduction/enlargement ratios, which cuts the amount of data and allows hand mocap data to be acquired at the same time as whole-body motion capture. In addition, it can provide a finger shape detection method and system in which the collation database is created more efficiently, saving time and labor; individual differences are further suppressed; and noise is suppressed to improve collation accuracy, which further improves the discrimination of different shapes that share the same contour line.

A block diagram showing the schematic configuration of a system for detecting finger shape according to an embodiment of the present invention.
An operation flowchart for the advance preparation in the embodiment of the present invention.
An operation flowchart for detection in the embodiment of the present invention.
A perspective view of the camera assembly of the present invention.
A side view of the camera assembly of the present invention.
A rear view of the camera assembly of the present invention.
A diagram outlining the general relationship between the input image at detection time and the database images.
A diagram showing an example of the input image at detection time and a CG image of the database.
A diagram showing an example of database grouping.
A diagram showing an example of database group determination from the finger lengths in the input image at detection time.
A diagram showing an example of database group determination from the finger angles in the input image at detection time.
A diagram showing an example of a data set in the database.
A diagram showing an example of narrowing down by out-of-range ridge lines, using the ridge lines of the input image and the silhouette of the database.
A diagram showing an example of the selection after narrowing down by out-of-range ridge lines, using the ridge lines of the input image and the silhouette of the database.
A diagram showing an example of narrowing down by out-of-range ridge lines, using the silhouette of the input image and the ridge lines of the database.
A diagram showing an example of the selection after narrowing down by out-of-range ridge lines, using the silhouette of the input image and the ridge lines of the database.
A diagram showing an example of the correspondence between each image derived from the input image and a data set of the database.
A diagram showing an example of the relationship between the input image and the CG angle data output.

<Embodiment>
Motion capture is now widely used for computer animation of the human body in movies and for reproducing human-like character movement in games. Systems that render facial expressions as computer animation in real time, in combination with motion capture of body movement, have also come into practical use. The last technology that is indispensable for editing footage of human movement but has not yet reached a practical level is "motion capture of fine finger movements", that is, three-dimensional finger shape estimation (hand mocap).

For a finger shape estimation system to be used together with a motion capture system, it must not interfere with the motion capture. For example: the shape of the fingers, which are relatively small, must be estimable even when the subject moves freely around a large optical motion capture studio; it must be estimable even when the subject grips something or touches the palm to another object such as a floor, wall, desk, or another person; and the form and weight of the device must not readily fatigue the subject.

One option is a wireless data glove (for example, the CyberGlove II made by CyberGlove Systems Inc.). However, data gloves are expensive and restrictive, and their strain gauge sensors and wiring break easily, so they cannot be used for motion capture of vigorous movement. They are also unsuited to manipulating objects with the palm or to clenching the hand into a fist.

Another example, of wireless contact-type finger shape estimation, is Digits. An infrared sensor and an infrared camera are worn on the wrist and image the fingers from the palm side; the distance from the sensor to the fingers is measured and the shape is estimated. However, because the sensor is worn on the wrist, the system copes poorly with flexion and extension of the wrist. Moreover, because it is mounted on the palm side, it gets in the wearer's way during everyday actions such as walking and when grasping, manipulating, or touching other objects, and in those situations measurement itself becomes impossible.

The inventors therefore proposed a compact system that estimates finger shape with an ultra-small wireless RGB camera worn on the back of the hand rather than on the palm side, and built a working device combining an ultra-small RGB camera with a single-board computer. Placing the camera on the back of the hand can be expected to minimize the restriction on the subject's movement during motion capture. Conventional methods such as Reference [1] placed the camera on the palm side mainly because the fingertips had to be imaged at all times and because occlusion is less likely there even when the finger joints bend. The inventors' method, by contrast, proposed an algorithm that can estimate finger shape even when the fingertips or parts of the fingers are occluded.

However, the method proposed above could not estimate finger shape with high accuracy in the face of individual differences between users, in particular "finger length and thickness, palm width, position of the base of the thumb, and how far the four fingers spread" or "the range of motion of each finger joint and the user's habitual way of moving within that range". In general, enlarging the collation database raises the estimation resolution but also increases the processing time. Making the calibration before estimation much longer is also undesirable. The present invention therefore devises a wearable finger shape estimation device characterized by the following:

1. Multiple collation database (CG hand) groups are prepared in advance, each with different values of the parameters that characterize hand shape, such as finger length and thickness and the position of the base of the thumb. Before estimation, the database group most similar to the user's hand is selected, and similarity matching is performed between the collation data sets in that group and the input image. This allows highly accurate shape estimation even for hands with individual differences (for example, thick fingers, short fingers, a bent thumb, a thumb base close to the wrist, a wide palm, or a wide gap between the index and middle fingers when making a V sign).

1-2. When the finger database for collation is created, collation data sets are created that differ in particular in the ratios of finger thickness, finger length, palm width, and thumb base position, so that a collation database capable of highly accurate estimation is created efficiently.

1-3. When the collation database group most similar to the user's hand is selected, the decision is made by comparing the user's finger image with the finger images of the collation data sets, paying particular attention to the ratios of finger thickness, finger length, palm width, and thumb base position.

2-1. So that finger shape can be estimated with still higher accuracy after the collation database has been selected, the camera position is fine-tuned after the device is put on. To fine-tune the distance between the camera and the user's fingers, the camera position is adjusted so that the silhouettes of the collation database and of the input image agree as closely as possible. Specifically, the camera is moved back and forth until the base-to-tip lengths of each of the five fingers agree most closely.

2-2. To fine-tune the tilt of the camera when its position is adjusted after the device is put on, the camera is rotated so that the tilt of the fingers agrees as closely as possible between the collation database and the input image, for example so that the tilt of the middle finger is the same in both.

3-1. When the input image is compared with a collation database image, a similarity matching method that imposes a penalty when a finger is absent from a region where it should be, or present in a region where it should not be, prevents a collation data set that differs markedly from the input image from being output as the estimation result.

3-2. A ridge line image, obtained by thinning either the input image or the collation database image, is matched against the original, unthinned finger image, and the protruding portion or its size is used as the penalty. This reduces the influence of image-processing noise and of individual differences in finger shape.
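The group selection described in items 1 and 1-3 can be sketched as a nearest-neighbour lookup over ratio vectors. This is only an illustration under stated assumptions: the text says the ratios are compared but does not give a metric, so the Euclidean distance used here, the function name `select_group`, and the ratio keys are all hypothetical.

```python
import math


def select_group(user_ratios, groups):
    """Return the name of the database group whose ratio vector is closest
    to the user's (Euclidean distance is an assumption of this sketch)."""
    def distance(name):
        ref = groups[name]
        return math.sqrt(sum((user_ratios[k] - ref[k]) ** 2 for k in ref))
    return min(groups, key=distance)


# Hypothetical groups: each maps a ratio name to a reference value.
GROUPS = {
    "slender": {"finger_length": 1.10, "finger_thickness": 0.18,
                "palm_width": 0.85, "thumb_base": 0.50},
    "broad": {"finger_length": 0.95, "finger_thickness": 0.25,
              "palm_width": 1.00, "thumb_base": 0.40},
}

user = {"finger_length": 0.97, "finger_thickness": 0.24,
        "palm_width": 0.98, "thumb_base": 0.42}
print(select_group(user, GROUPS))
```

Only the data sets of the selected group then take part in the similarity matching, which keeps the search small even when many groups are prepared.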

The hardware of the camera assembly consists of a support column 31 made of flexible wire, a wrist supporter, and a mobile battery. Finger images taken by the wireless camera are transmitted to a computer, which estimates the finger shape. FIGS. 4 to 6 show the appearance of the device. The wireless camera is a small RGB camera, the 5.8G Wireless Mini Camera -TE60A made by 3rd Eye Electronics. Its angle of view is 90° and the resolution of the acquired image is 720×480 [pixel]. As shown in FIGS. 4 to 6, the camera 35 is located on the back-of-hand 51 side, between the thumb 54 and the index finger 55, and is installed slightly tilted, pointing from the thumb 54 side toward the little finger 58 side and from the fingertip 52 end toward the back of the hand 51. The distance between the camera 35 and the back of the hand 51 is about 120 [mm]. The power cable 36 connecting the camera 35 and the battery 38 is at least as long as the arm, and the battery 38 is worn near the trunk, where its moment is not felt. The weight of the camera assembly excluding the battery 38 is about 75 [g].

<Construction of the finger database for collation>
The collation database is built from data sets, each combining finger joint angle information, the silhouette information and image features (for example, higher-order local autocorrelation, HLAC) of the finger CG created from those angles with CG editing software, and ridge line information. The collection of such data sets, covering various views of various finger shapes, forms the collation database. Its structure is shown in FIG. 10.
During finger shape estimation, the input finger image is matched against the collation finger database using three kinds of information: silhouette information, image features, and ridge line information. Once the finger data set most similar to the input image has been selected, the finger joint angle information held by that data set is output as the estimation result. (This joint angle information is the information that was used to create the finger CG.)
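As an illustration of the data set described above, the following sketch models one entry and a grouped database. The class and field names (`HandDataSet`, `CollationDatabase`, and so on) are hypothetical, and pixel regions are represented as sets of (row, column) coordinates purely for simplicity.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HandDataSet:
    """One collation entry: the joint angles that generated the CG hand,
    plus the three kinds of information used for matching."""
    joint_angles: tuple    # degrees, one value per finger joint
    silhouette: frozenset  # (row, col) pixels of the CG hand region
    ridge_line: frozenset  # (row, col) pixels of the thinned hand
    features: tuple        # image feature vector, for example HLAC


@dataclass
class CollationDatabase:
    """Data sets stored per group (grouping by hand dimension ratios)."""
    groups: dict = field(default_factory=dict)

    def add(self, group, data_set):
        self.groups.setdefault(group, []).append(data_set)

    def candidates(self, group):
        """All data sets of one group, or an empty list if the group is unknown."""
        return list(self.groups.get(group, []))
```

Because the joint angles are stored alongside the images they produced, selecting the best-matching data set directly yields the estimation output.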

<Fine adjustment of the camera position>
So that the finger data set selected is as similar as possible to the finger image captured by the camera, the position of the camera attached to the back of the hand is fine-tuned before the actual estimation. The fine adjustment is performed in the following two steps.

[Step 1] To fine-tune the "distance" between the camera and the user's fingers, the camera position is adjusted so that the silhouettes of the collation database and of the input image agree as closely as possible. Specifically, as illustrated in FIG. 11, the camera is moved back and forth until the base-to-tip lengths of each of the five fingers agree most closely.

[Step 2] To fine-tune the "tilt" of the camera, the camera is rotated so that the tilt of the fingers agrees as closely as possible between the collation database and the input image. For example, as illustrated in FIG. 12, the camera is rotated until the tilt of the middle finger is the same in both.
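The two steps above can be expressed as two simple measurements. The sketch below assumes each finger is given as a (base, tip) pair of (x, y) image coordinates with y growing downward; the function names and the choice of the summed absolute length difference as the mismatch measure are assumptions made for illustration.

```python
import math


def length_mismatch(input_fingers, reference_fingers):
    """Step 1: sum over the fingers of |input length - reference length|,
    where each finger is a (base, tip) pair of (x, y) points."""
    return sum(
        abs(math.dist(*input_fingers[name]) - math.dist(*reference_fingers[name]))
        for name in reference_fingers
    )


def finger_tilt_deg(base, tip):
    """Angle of the base-to-tip direction measured from the image's upward
    vertical, in degrees (image y grows downward)."""
    dx = tip[0] - base[0]
    dy = base[1] - tip[1]
    return math.degrees(math.atan2(dx, dy))


def rotation_needed_deg(input_middle, reference_middle):
    """Step 2: tilt difference that would align the input middle finger
    with the reference middle finger."""
    return finger_tilt_deg(*reference_middle) - finger_tilt_deg(*input_middle)
```

In use, the wearer would move the camera back and forth until `length_mismatch` is as small as possible, then rotate it until `rotation_needed_deg` is close to zero.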

<Estimation flow>
The user wears the proposed system, and the HLAC features, ridge line information, and silhouette information are calculated from the input image. The database is searched using these features, and the data set most similar to the input image is taken as the estimation result. Finally, the finger joint angles held by that data set are output. The details are described below.

<Narrowing down the search targets>
To remove from the search those data sets whose pose differs markedly from the input image, the search targets are narrowed down in two stages. The penalties described below are calculated from the acquired silhouette information and ridge line information.

The idea here, as illustrated in FIGS. 13 and 14, is to exclude in two directions: collation data sets that lack a finger that should be present in the input image are excluded, and conversely, collation data sets that have a finger that should be absent from the input image are also excluded.

First, collation data sets that lack a finger that should be present in the input image are excluded from the search. As shown in FIGS. 13 and 14, the area by which the ridge line region of the input image protrudes from the silhouette region of the database image is calculated from the ridge line information of the input image and the silhouette information of the data set, and this area [pixel] is taken as penalty P_1. In the first stage of narrowing, data sets whose P_1 exceeds the threshold given by the following formula are removed from the search.

After the first stage of narrowing, collation data sets that have a finger that should be absent from the input image are excluded from those that remain. As shown in FIGS. 15 and 16, the area by which the ridge line region of the data set protrudes from the silhouette region of the input image is calculated from the ridge line information of the data set and the silhouette information of the input image, and this area [pixel] is taken as the second penalty P_2. In the second stage of narrowing, data sets whose P_2 exceeds the threshold th_2 given by the following formula are removed from the search.
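The two-stage narrowing can be sketched as follows, with ridge lines and silhouettes represented as sets of (row, column) pixels. The threshold formulas are not reproduced in this text, so the thresholds are passed in as plain parameters `th1` and `th2`; these, the dictionary keys, and the function names are assumptions of the sketch.

```python
def overflow_area(ridge_pixels, silhouette_pixels):
    """Number of ridge-line pixels that lie outside the silhouette region."""
    return len(set(ridge_pixels) - set(silhouette_pixels))


def narrow_candidates(input_ridge, input_silhouette, data_sets, th1, th2):
    """Two-stage narrowing. Returns (data_set, P1, P2) for each survivor.

    th1 and th2 are caller-supplied thresholds (hypothetical parameters).
    """
    survivors = []
    for ds in data_sets:
        # Stage 1: a finger present in the input is missing from the data set.
        p1 = overflow_area(input_ridge, ds["silhouette"])
        if p1 > th1:
            continue
        # Stage 2: the data set has a finger that is absent from the input.
        p2 = overflow_area(ds["ridge"], input_silhouette)
        if p2 > th2:
            continue
        survivors.append((ds, p1, p2))
    return survivors
```

Matching a thin ridge line against a full silhouette, rather than silhouette against silhouette, tolerates small differences in finger thickness, which is consistent with the stated aim of reducing the effect of noise and individual differences.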

<Similarity calculation>
After the search targets have been narrowed down as described above, the remaining collation data sets are searched by similarity calculation. The similarity is calculated by the following formula from the HLAC features and the penalties P_1 and P_2 defined above.

After the entire database has been searched, the data set with the minimum similarity value obtained by the similarity calculation is taken as the estimation result, and the finger joint angles held by that data set are output.
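The selection of the minimum-similarity data set can be sketched as below. The similarity formula itself is not reproduced in this text, so the score used here (Euclidean distance between feature vectors plus weighted penalties) is purely an assumed stand-in, and the weights `w1`, `w2` and the function names are hypothetical.

```python
import math


def similarity(input_features, ds_features, p1, p2, w1=1.0, w2=1.0):
    """Assumed score, smaller meaning more similar: feature distance plus
    weighted penalties. This is NOT the patent's formula, only a stand-in."""
    return math.dist(input_features, ds_features) + w1 * p1 + w2 * p2


def estimate_joint_angles(input_features, candidates, w1=1.0, w2=1.0):
    """candidates: iterable of (features, joint_angles, p1, p2) tuples that
    survived the narrowing. Returns the joint angles of the best match."""
    best = min(
        candidates,
        key=lambda c: similarity(input_features, c[0], c[2], c[3], w1, w2),
    )
    return best[1]
```

Whatever the exact formula, the output step is the same: the joint angles stored in the winning data set are returned unchanged.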

<System configuration>
The system for detecting finger shape according to the present embodiment, shown in FIG. 1, includes an information processing device 1, a camera 35, and a display device 100; the information processing device 1 is communicatively connected to the camera 35 and the display device 100.

The finger / back-of-hand dimension data storage unit 21 stores the various finger and back-of-hand dimensions, and the angle data of each joint, corresponding to different finger shapes, and can read them out. For each such data item the CG finger image forming unit 23 forms CG images in various poses, and the group control unit 24 classifies them into groups by the finger and back-of-hand dimensions. They are stored group by group in the collation CG finger image database 25 in the information processing device 1 in data set form, with the data and processed images corresponding to the same CG image associated as a set.

An ordinary flat display such as an LCD can be used as the display device 100 for purposes such as checking the input image and/or the finger shape detected from it, or checking contour lines. The display device 100 can also show a finger shape reproduced or synthesized from the finger shape detected from the input image.

The information processing device 1 calculates image shape ratio data and image feature data, including luminance gradient direction vectors, from the image data of each finger shape input from the camera 35, and stores both in the collation database in association with the data sets of a plurality of finger shapes whose shapes were detected by device-worn hand mocap.

The information processing device 1 contains an image dimension adjustment unit 11, an image angle adjustment unit 12, a captured finger image storage unit 13, a silhouette image forming unit 14, a ridge line image forming unit 15, a contour line extraction unit 16, an image feature detection unit 17, a superimposed image forming unit 18, an out-of-range ridge line determination unit 19, a candidate image narrowing unit 20, a finger / back-of-hand dimension data storage unit 21, a joint angle storage unit 22, a CG finger image forming unit 23, a group control unit 24, a CG finger image database 25, an image feature similarity calculation unit 26, a group determination unit 27, a finger angle estimation unit 28, a data output unit 29, a setting value storage unit 71, a program storage unit 81, a control unit 91, and the display device 100, which are communicably connected from the camera 35 side toward the display device 100 side.

The CG finger image database 25 stores the image data of each frame captured by the camera 35 together with its processed data (silhouette image, contour line image, ridge line image, image features). For the images formed by the CG finger image forming unit 23, it likewise stores the CG image data, its processed data (silhouette image, contour line image, ridge line image, image features), the joint angle data, and so on.

画像寸法調整部１１は、カメラ３５から入力する画像の寸法を検出処理に適した寸法に調整する。これはカメラ３５のカメラ取付部３３側で調整することもできるが、それだけで対応できない場合、あるいは、カメラ３５側で対応しない場合に画像寸法調整部１１を使用する。 The image size adjusting unit 11 adjusts the size of the image input from the camera 35 to a size suitable for the detection process. This can be adjusted on the camera mounting portion 33 side of the camera 35, but the image dimension adjusting portion 11 is used when it cannot be handled by itself or when the camera 35 side does not support it.

画像角度調整部１２は、カメラ３５から入力する画像の角度を検出処理に適した角度に調整する。これはカメラ３５のカメラ取付部３３側で調整することもできるが、それだけで対応できない場合、あるいは、カメラ３５側で対応しない場合に画像角度調整部１２を使用する。 The image angle adjusting unit 12 adjusts the angle of the image input from the camera 35 to an angle suitable for the detection process. This can be adjusted on the camera mounting portion 33 side of the camera 35, but the image angle adjusting portion 12 is used when it cannot be handled by itself or when the camera 35 side does not support it.
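
The dimension adjustment can be illustrated with a minimal sketch. It assumes the goal stated in the claims (the combined vertical extent of the fingers and the back of the hand should substantially fill the image frame) and that a binary hand mask is already available. The function names `hand_row_span` and `fit_height` and the nearest-neighbour resampling are illustrative assumptions, not details given in this description.

```python
def hand_row_span(mask):
    """Return (top, bottom) row indices that contain foreground, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    return (rows[0], rows[-1]) if rows else None


def fit_height(mask, frame_height):
    """Crop to the rows occupied by the hand and rescale to frame_height rows.

    Nearest-neighbour sampling is used only to keep the sketch dependency-free.
    """
    span = hand_row_span(mask)
    if span is None:
        return [row[:] for row in mask]  # nothing to adjust
    top, bottom = span
    src_h = bottom - top + 1
    src_w = len(mask[0])
    scale = frame_height / src_h          # same factor in both directions
    dst_w = max(1, round(src_w * scale))
    out = []
    for y in range(frame_height):
        sy = top + min(src_h - 1, int(y / scale))
        out.append([mask[sy][min(src_w - 1, int(x / scale))] for x in range(dst_w)])
    return out
```

After `fit_height`, the hand occupies every row of the output, so images of hands at different distances from the camera become comparable in scale.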

撮像手指画像記憶部１３は、カメラ３５から入力する画像を記憶し、後段のシルエット画像形成部１４及びグループ判定部２７に出力する。 The captured finger image storage unit 13 stores the image input from the camera 35 and outputs it to the silhouette image forming unit 14 and the group determination unit 27 in the subsequent stage.

シルエット画像形成部１４では、撮像画像データからシルエット画像が形成され、尾根線画像形成部１５では、シルエット画像から尾根線画像が形成され、輪郭線抽出部１６では、シルエット画像から輪郭線画像が形成され、画像特徴量検出部１７では、輪郭線画像から分割領域により画像特徴量が検出される。フレーム毎の撮像画像データに対応したそれぞれの画像と特徴量はＣＧ手指画像データベース２５に格納される。 The silhouette image forming unit 14 forms a silhouette image from the captured image data, the ridge line image forming unit 15 forms a ridge line image from the silhouette image, and the contour line extracting unit 16 forms a contour line image from the silhouette image. Then, the image feature amount detection unit 17 detects the image feature amount from the contour line image by the divided region. Each image and feature amount corresponding to the captured image data for each frame is stored in the CG finger image database 25.
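
The chain from captured image to silhouette, contour line, and per-region features can be sketched as follows. This is a simplified illustration under stated assumptions: the silhouette is obtained by a fixed brightness threshold (assuming the hand is brighter than the background), the contour is the set of silhouette pixels with a 4-connected background neighbour, and the feature of each divided region is simply its count of contour pixels, which stands in for the luminance gradient direction vectors mentioned earlier. Ridge line extraction is omitted, and all function names are hypothetical.

```python
def silhouette(gray, threshold=128):
    """Binary silhouette: 1 where the pixel is at least as bright as threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]


def contour(sil):
    """Contour image: silhouette pixels touching background (4-connectivity)."""
    h, w = len(sil), len(sil[0])

    def is_background(y, x):
        return not (0 <= y < h and 0 <= x < w) or sil[y][x] == 0

    return [
        [
            1 if sil[y][x] and (
                is_background(y - 1, x) or is_background(y + 1, x)
                or is_background(y, x - 1) or is_background(y, x + 1)
            ) else 0
            for x in range(w)
        ]
        for y in range(h)
    ]


def region_features(img, rows, cols):
    """Split img into rows x cols regions and count set pixels in each region."""
    h, w = len(img), len(img[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            feats.append(sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return feats
```

The same three functions would be applied both to captured frames and to CG finger images, so that the resulting feature vectors are directly comparable.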

一方ＣＧ手指画像形成部２３で形成されたＣＧ手指画像についても、シルエット画像形成部１４では、ＣＧ手指画像データからシルエット画像が形成され、尾根線画像形成部１５では、シルエット画像から尾根線画像が形成され、輪郭線抽出部１６では、シルエット画像から輪郭線画像が形成され、画像特徴量検出部１７では、輪郭線画像から分割領域により画像特徴量が検出される。ＣＧ手指画像データに対応したそれぞれの画像と特徴量と関節角度データはセットデータとしてＣＧ手指画像データベース２５に格納される。 The CG finger images formed by the CG finger image forming unit 23 are processed in the same way: the silhouette image forming unit 14 forms a silhouette image from the CG finger image data, the ridge line image forming unit 15 forms a ridge line image from the silhouette image, the contour line extraction unit 16 forms a contour line image from the silhouette image, and the image feature amount detection unit 17 detects the image feature amount from the contour line image by divided region. The images, the feature amount, and the joint angle data corresponding to each CG finger image are stored in the CG finger image database 25 as set data.

重畳画像形成部１８では、フレーム毎の撮像画像データに対応したシルエット画像と尾根線画像がＣＧ手指画像データベース２５から読み出され、ＣＧ手指画像データベース２５内のセットデータの尾根線画像とシルエット画像と各々対応させて重畳される。範囲外尾根線判定部１９では、各重畳画像について、シルエット画像からはみ出している尾根線が検出及び判定される。候補画像絞り込み部２０では、重畳画像から尾根線がはみ出ているＣＧ手指画像は推定候補から除外して絞り込みを行う。 In the superimposed image forming unit 18, the silhouette image and the ridge line image corresponding to the captured image data for each frame are read from the CG finger image database 25, and the ridge line image and the silhouette image of the set data in the CG finger image database 25 are combined. They are superimposed in correspondence with each other. The out-of-range ridge line determination unit 19 detects and determines the ridge line protruding from the silhouette image for each superimposed image. In the candidate image narrowing unit 20, the CG finger image whose ridge line protrudes from the superimposed image is excluded from the estimation candidates and narrowed down.
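
The superimposition and narrowing can be sketched as a pixel-wise comparison of binary images. The sketch follows the two directions described in the claims: the area by which the input ridge line protrudes from a candidate's silhouette (first penalty) and the area by which the candidate's ridge line protrudes from the input silhouette (second penalty). Measuring area as a pixel count, the default thresholds of 0, and the names `protruding_area` and `narrow_candidates` are assumptions made for illustration.

```python
def protruding_area(ridge, sil):
    """Number of ridge pixels that fall outside the silhouette."""
    return sum(
        1
        for ridge_row, sil_row in zip(ridge, sil)
        for r, s in zip(ridge_row, sil_row)
        if r and not s
    )


def narrow_candidates(input_ridge, input_sil, candidates, t1=0, t2=0):
    """Keep the ids of candidates whose penalties stay within both thresholds.

    candidates: iterable of (candidate_id, candidate_ridge, candidate_sil).
    """
    kept = []
    for cid, cand_ridge, cand_sil in candidates:
        p1 = protruding_area(input_ridge, cand_sil)  # finger missing in candidate
        p2 = protruding_area(cand_ridge, input_sil)  # extra finger in candidate
        if p1 <= t1 and p2 <= t2:
            kept.append(cid)
    return kept
```

With both thresholds at 0, any protruding ridge pixel excludes the candidate, which matches the strict determination described above; larger thresholds would tolerate small misalignments.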

手指／手甲各種寸法データ記憶部２１は、ＣＧで手指画像を形成するための手指／手甲の各種データを記憶する。その中の関節角度記憶部２２には各関節毎の角度データが記憶される。各データに基づいてＣＧ手指画像形成部２３で、ＣＧ手指画像が形成される。各ＣＧ手指画像は、グループ制御部２４で、同じ手指／手甲の寸法ごとにグループ分類されてＣＧ手指画像データベース２５に格納される。 The finger / back-of-hand various dimension data storage unit 21 stores the various finger and back-of-hand data used to form finger images by CG. Within it, the joint angle storage unit 22 stores angle data for each joint. The CG finger image forming unit 23 forms CG finger images based on these data. The group control unit 24 classifies the CG finger images into groups having the same finger / back-of-hand dimensions and stores them in the CG finger image database 25.
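
As a rough illustration of the grouping step, the sketch below classifies CG finger image records by their dimension values. The record fields `id`, `finger_ratio`, and `back_ratio` are hypothetical; the description does not specify how dimensions are encoded or compared.

```python
from collections import defaultdict


def group_by_dimensions(cg_images):
    """Group CG image ids by identical (finger_ratio, back_ratio) pairs."""
    groups = defaultdict(list)
    for img in cg_images:
        key = (img["finger_ratio"], img["back_ratio"])
        groups[key].append(img["id"])
    return dict(groups)
```

Grouping in advance means that, at detection time, only the data sets of the one group matching the wearer's hand dimensions need to be searched.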

画像特徴量類似度計算部２６は、入力した撮像画像の画像特徴量と、各ＣＧ手指画像の画像特徴量とから類似度を計算して絞り込み部２０に送信する。 The image feature amount similarity calculation unit 26 calculates the similarity from the image feature amount of the input captured image and the image feature amount of each CG finger image and transmits the similarity to the narrowing down unit 20.
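
The similarity calculation is not spelled out in this passage. The sketch below assumes a Euclidean distance between feature vectors, where a smaller value means a closer match; this reading is consistent with the claims, which take the data set with the minimum value as the estimation result. The names `feature_distance` and `most_similar` are illustrative.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, candidates):
    """Return the id of the candidate whose features are closest to query.

    candidates: non-empty iterable of (candidate_id, feature_vector).
    """
    return min(candidates, key=lambda c: feature_distance(query, c[1]))[0]
```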

手指角度推定部２８は、絞り込みされたグループの各ＣＧ手指画像から最も類似している画像を推定してその各関節角度データをデータ出力部２９に送出し、そこから外部装置（表示装置１００）に送出する。 The finger angle estimation unit 28 estimates the most similar image among the CG finger images of the narrowed-down group and sends its joint angle data to the data output unit 29, which in turn sends it to an external device (the display device 100).
各種設定値記憶部７１は情報処理装置１の各種設定値を記憶し、プログラム記憶部８１は上記各部を動作させるためのプログラムを記憶し、制御部９１は、上記各部を制御する。 The various set value storage unit 71 stores the various set values of the information processing device 1, the program storage unit 81 stores the program for operating each of the above units, and the control unit 91 controls each of the above units.

本実施形態に係る手指の形状を検出するシステムの動作について図２のフローチャートを用いて説明する。まず、実際の手指形状の検出を実施する前に照合用データベースを構築する。情報処理装置１では、記憶されている角度／手指／手甲データ等からＣＧ手指画像を形成する（Ｓ５１）。形成した手指画像を手指の寸法グループごとに集合分類する（Ｓ５２）。グループごとになった手指画像をデータベース２５に記憶させる（Ｓ５３）。 The operation of the finger shape detection system according to the present embodiment will be described with reference to the flowchart of FIG. 2. First, before actual finger shape detection is performed, the collation database is constructed. The information processing device 1 forms CG finger images from the stored angle / finger / back-of-hand data and the like (S51). The formed finger images are classified into groups by finger dimension (S52). The grouped finger images are stored in the database 25 (S53).

ＣＧ手指画像からシルエット画像を形成してデータベース２５に記憶させ（Ｓ５４）、シルエット画像から尾根線画像を形成してデータベース２５に記憶させ（Ｓ５５）、シルエット画像から輪郭線画像を形成してデータベース２５に記憶させ（Ｓ５６）、輪郭線画像から画像特徴量を検出してデータベース２５に記憶させ（Ｓ５７）、元のＣＧ画像が同一の各画像と画像特徴量と角度データをセットデータとしてデータベース２５に記憶させる（Ｓ５８）。 A silhouette image is formed from the CG finger image and stored in the database 25 (S54), a ridge line image is formed from the silhouette image and stored in the database 25 (S55), and a contour line image is formed from the silhouette image and stored in the database 25. (S56), the image feature amount is detected from the contour line image and stored in the database 25 (S57), and each image having the same original CG image, the image feature amount, and the angle data are stored in the database 25 as set data. Remember (S58).

ステップＳ１では撮像された手指画像の寸法を調整し（Ｓ１）、次に手指画像の角度を調整し（Ｓ２）、撮像された手指画像を記憶する（Ｓ３）。次いで、撮像された手指画像からシルエット画像を形成し（Ｓ４）、撮像画像とシルエット画像からグループを判定する（Ｓ５）。判定したグループのデータセットをデータベースから抽出する（Ｓ６）。 At detection time, the dimensions of the captured finger image are first adjusted (S1), then the angle of the finger image is adjusted (S2), and the captured finger image is stored (S3). Next, a silhouette image is formed from the captured finger image (S4), and the group is determined from the captured image and the silhouette image (S5). The data sets of the determined group are extracted from the database (S6).

撮像画像のシルエット画像から尾根線画像を形成する（Ｓ７）。データセットのシルエット画像と形成した尾根線画像を重畳する（Ｓ８）。尾根線がシルエット範囲外となる画像を判定する（Ｓ９）。重畳画像の尾根線がシルエット範囲外となるＣＧ手指画像を推定処理の候補から除外する（Ｓ１０）。手指画像のシルエット画像から画像特徴量を検出する（Ｓ１１）。照合用データベースの画像特徴量と類似度照合する（Ｓ１２）。グループ内に次の画像データが無いか判定する（Ｓ１３）。次データがある場合（Ｓ１３：ＮＯ）にはステップＳ７に戻って次の画像についての処理を行う。 A ridge line image is formed from the silhouette image of the captured image (S7). The silhouette image of the data set and the formed ridge line image are superimposed (S8). An image in which the ridge line is outside the silhouette range is determined (S9). A CG finger image whose ridge line of the superimposed image is outside the silhouette range is excluded from the candidates for estimation processing (S10). The image feature amount is detected from the silhouette image of the finger image (S11). The degree of similarity is collated with the image feature amount of the collation database (S12). It is determined whether or not there is the next image data in the group (S13). If there is next data (S13: NO), the process returns to step S7 to perform processing on the next image.

次データが無い場合（Ｓ１３：ＹＥＳ）には、最類似のＣＧ手指画像の手指角度をデータベースのデータセットから読み出し（Ｓ１４）、最類似の手指角度を出力する（Ｓ１５）。 When there is no next data (S13: YES), the finger angle of the most similar CG finger image is read from the data set of the database (S14), and the most similar finger angle is output (S15).
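
The loop over the group in steps S8 to S15 can be sketched as below. The sketch assumes the input ridge line image and input feature vector have already been computed once, that each data set is a dictionary with the hypothetical keys `silhouette`, `features`, and `joint_angles`, and that similarity is a squared Euclidean distance where smaller is better. None of these representation choices come from the description itself.

```python
def estimate_joint_angles(input_ridge, input_features, group):
    """Return the joint angles of the best remaining data set, or None."""
    best_distance, best_angles = None, None
    for entry in group:  # S13: iterate until no data set remains
        # S8-S9: superimpose the input ridge on the candidate silhouette
        protrudes = any(
            r and not s
            for ridge_row, sil_row in zip(input_ridge, entry["silhouette"])
            for r, s in zip(ridge_row, sil_row)
        )
        if protrudes:
            continue  # S10: exclude this CG finger image
        # S12: compare image feature amounts
        distance = sum((a - b) ** 2 for a, b in zip(input_features, entry["features"]))
        if best_distance is None or distance < best_distance:
            best_distance, best_angles = distance, entry["joint_angles"]
    return best_angles  # S14-S15: angles of the most similar data set
```

Excluding candidates before the feature comparison avoids computing similarity for poses that are already ruled out by the ridge line test.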

以上のように本実施形態の手指形状の検出方法によれば、手の甲側に取り付けた小さなカメラ１個からの撮像画像データ入力で、撮像フレームごとの手指のすべての関節角度を推定できる装置を提供し、さらに照合用データベースにおける手指画像のポーズ数を減らし、縮小／拡大比率を減少させてデータ量を減らし、ハンドモーキャップで体全体のモーションキャプチャーと同時にデータを取得できる効果を有すると共に、さらに照合用データベースの作成を効率化して作成のための時間と工数を抑制でき、個人差もより抑制し、ノイズを抑制することで照合精度を向上させて同一輪郭線で異形状の識別をさらに改善した手指形状の検出方法とシステムを提供することができる。 As described above, the finger shape detection method of the present embodiment provides a device that can estimate all finger joint angles for each captured frame from the image data of a single small camera attached to the back of the hand. It further reduces the number of finger image poses in the collation database and reduces the reduction / enlargement ratios, thereby reducing the amount of data, and allows hand mocap data to be acquired simultaneously with motion capture of the whole body. In addition, creation of the collation database is made more efficient, reducing the time and man-hours required; individual differences and noise are further suppressed, improving collation accuracy. The result is a finger shape detection method and system with further improved discrimination of different shapes that share the same contour line.

１情報処理装置、
１１画像寸法調整部、
１２画像角度調整部、
１３撮像手指画像記憶部、
１４シルエット画像形成部、
１５尾根線画像形成部、
１６輪郭線抽出部、
１７画像特徴量検出部、
１８重畳画像形成部、
１９範囲外尾根線判定部、
２０候補画像絞り込み部、
２１手指／手甲各種寸法データ記憶部、
２２関節角度記憶部、
２３仮想（ＣＧ）手指画像形成部、
２４グループ制御部、
２５仮想（ＣＧ）手指画像データベース、
２６画像特徴量類似度計算部、
２７グループ判別部、
２８手指角度推定部、
２９データ出力部、
３０支持部、
３１支持柱、
３２基礎部、
３３カメラ取付部、
３４手甲固定部、
３５カメラ、
３６電源ケーブル、
３７無線アンテナ、
３８バッテリー、
５１手甲部、
５２手指部、
５３手指付け根部横断線、
５４親指、
５５人差し指、
５６中指、
５７薬指、
５８小指、
７１各種設定値記憶部、
８１プログラム記憶部、
９１制御部、
１００表示装置、
θ１前方傾斜角、
θ２側方親指側傾斜角。
1 Information processing device,
11 Image dimension adjustment unit,
12 Image angle adjustment unit,
13 Captured finger image storage unit,
14 Silhouette image forming unit,
15 Ridge line image forming unit,
16 Contour line extraction unit,
17 Image feature amount detection unit,
18 Superimposed image forming unit,
19 Out-of-range ridge line determination unit,
20 Candidate image narrowing unit,
21 Finger / back-of-hand various dimension data storage unit,
22 Joint angle storage unit,
23 Virtual (CG) finger image forming unit,
24 Group control unit,
25 Virtual (CG) finger image database,
26 Image feature amount similarity calculation unit,
27 Group determination unit,
28 Finger angle estimation unit,
29 Data output unit,
30 Support part,
31 Support pillar,
32 Base part,
33 Camera mounting part,
34 Back-of-hand fixing part,
35 Camera,
36 Power cable,
37 Wireless antenna,
38 Battery,
51 Back of the hand,
52 Fingers,
53 Transverse line at the base of the fingers,
54 Thumb,
55 Index finger,
56 Middle finger,
57 Ring finger,
58 Little finger,
71 Various set value storage unit,
81 Program storage unit,
91 Control unit,
100 Display device,
θ1 Forward tilt angle,
θ2 Lateral thumb-side tilt angle.

Claims

A finger shape detection system that detects the shape of fingers, comprising:
a camera assembly in which one or more small cameras are supported above and near the back of one hand of a wearer so that substantially the entire hand is contained within the angle of view of the camera, and which outputs an input image captured by the camera; and
at least one information processing device,
wherein the information processing device has at least:
a captured finger image storage unit that stores image data of the input image, input from the camera, capturing each finger shape;
a virtual finger image forming unit that uses a computer graphics (CG) program to form virtual finger images as various CG images that differ in the dimensional ratios of the parts of the fingers, in the dimensional ratio of the back of the hand, and in the pose of the fingers;
a virtual finger image database that stores, for each virtual finger image, a data set including at least angle data for each finger joint and image feature amount data calculated from the image data;
a silhouette image forming unit that forms a silhouette image having a contour for both the captured finger image stored in the captured finger image storage unit and the virtual finger image stored in the virtual finger image database;
a ridge line image forming unit that forms a ridge line image from any of the captured finger image, the virtual finger image, or the silhouette image;
a superimposed image forming unit that superimposes the silhouette image and the ridge line image to form a superimposed image;
an out-of-range ridge line determination unit that determines whether all ridge lines of the ridge line image are contained within the internal range of the silhouette image portion of the superimposed image or whether any ridge line extends into the external range of the silhouette image portion; and
a candidate image narrowing unit that, when a ridge line extends into the external range of the silhouette image portion, excludes the virtual finger image that is the original of the image included in the superimposed image from the candidate images used in the subsequent estimation process,
and wherein the exclusion by the candidate image narrowing unit:
uses the ridge line information of the input image and the silhouette information of the data set to calculate a first area by which the ridge line region of the input image protrudes from the internal range of the silhouette image portion of the virtual finger image stored in the virtual finger image database; and
takes the value of the protruding first area as the value of a first penalty, and excludes a virtual finger image of a data set in which the value of the first penalty exceeds a first predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that lacks a finger that "should be" in the input image.

The finger shape detection system according to claim 1, wherein the exclusion by the candidate image narrowing unit further:
uses the ridge line information of the input image and the silhouette information of the data set to calculate a second area by which the ridge line region of the data set of the virtual finger image stored in the virtual finger image database protrudes from the internal range of the silhouette image portion of the input image; and
takes the value of the protruding second area as the value of a second penalty, and excludes a virtual finger image of a data set in which the value of the second penalty exceeds a second predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that has a finger that "should not be" in the input image.

The finger shape detection system according to claim 2, wherein the exclusion by the candidate image narrowing unit
excludes a virtual finger image of a data set in which the value of the first penalty exceeds the first predetermined threshold and the value of the second penalty exceeds the second predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that lacks a finger that "should be" in the input image and has a finger that "should not be" in it.

The finger shape detection system according to claim 2 or 3, further comprising:
a finger angle estimation unit that calculates a similarity from the value of the first penalty, the value of the second penalty, and the image feature amount data, and takes the data set with the minimum obtained similarity as the estimation result; and
a data output unit that outputs the finger joint angles of the data set of the estimation result.

The finger shape detection system according to any one of claims 1 to 4, further comprising:
an image dimension adjustment unit that enlarges or reduces the image data of each finger shape stored in the captured finger image storage unit so that the total vertical dimension of the fingers and the back of the hand in the captured image frame substantially matches the vertical dimension of the image frame; and
an image angle adjustment unit that rotates the captured image about the optical axis of the camera so that the tilt angle of the captured image frame becomes a predetermined angle matching the image frames to be collated that are stored in advance in the storage device.

A finger shape detection method that detects the shape of fingers using:
a camera assembly in which one or more small cameras are supported above and near the back of one hand of a wearer so that substantially the entire hand is contained within the angle of view of the camera, and which outputs an input image captured by the camera; and
at least one information processing device,
the method carrying out at least:
a step of storing image data of the input image, input from the camera, capturing each finger shape; and,
in the information processing device,
a step of forming, with a computer graphics (CG) program, virtual finger images as various CG images that differ in the dimensional ratios of the parts of the fingers, in the dimensional ratio of the back of the hand, and in the pose of the fingers;
a step of storing, for each virtual finger image, a data set including at least angle data for each finger joint and image feature amount data calculated from the image data;
a step of forming a silhouette image having a contour for both the stored captured finger image and the stored virtual finger image;
a step of forming a ridge line image from any of the captured finger image, the virtual finger image, or the silhouette image;
a step of superimposing the silhouette image and the ridge line image to form a superimposed image;
a step of determining whether all ridge lines of the ridge line image are contained within the internal range of the silhouette image portion of the superimposed image or whether any ridge line extends into the external range of the silhouette image portion; and
a step of excluding, when a ridge line extends into the external range of the silhouette image portion, the virtual finger image that is the original of the image included in the superimposed image from the candidate images of the subsequent estimation process,
wherein the excluding step:
uses the ridge line information of the input image and the silhouette information of the data set to calculate a first area by which the ridge line region of the input image protrudes from the internal range of the silhouette image portion of the virtual finger image stored in the virtual finger image database; and
takes the value of the protruding first area as the value of a first penalty, and excludes a virtual finger image of a data set in which the value of the first penalty exceeds a first predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that lacks a finger that "should be" in the input image.

The finger shape detection method according to claim 6, wherein the excluding step further:
uses the ridge line information of the input image and the silhouette information of the data set to calculate a second area by which the ridge line region of the data set of the virtual finger image stored in the virtual finger image database protrudes from the internal range of the silhouette image portion of the input image; and
takes the value of the protruding second area as the value of a second penalty, and excludes a virtual finger image of a data set in which the value of the second penalty exceeds a second predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that has a finger that "should not be" in the input image.

The finger shape detection method according to claim 7, wherein the excluding step
excludes a virtual finger image of a data set in which the value of the first penalty exceeds the first predetermined threshold and the value of the second penalty exceeds the second predetermined threshold from the candidate search targets of the virtual finger image database, as a virtual finger image that lacks a finger that "should be" in the input image and has a finger that "should not be" in it.

The finger shape detection method according to claim 7 or 8, further comprising:
a step of calculating a similarity from the value of the first penalty, the value of the second penalty, and the image feature amount data, and taking the data set with the minimum obtained similarity as the estimation result; and
a step of outputting the finger joint angles of the data set of the estimation result.

The finger shape detection method according to any one of claims 6 to 9, further comprising:
a step in which the information processing device enlarges or reduces the image data of each finger shape stored in the information processing device so that the total vertical dimension of the fingers and the back of the hand in the captured image frame substantially matches the vertical dimension of the image frame; and
a step in which the information processing device rotates the captured image about the optical axis of the camera so that the tilt angle of the captured image frame becomes a predetermined angle matching the image frames to be collated that are stored in advance in the storage device.

A program for implementing the finger shape detection method according to any one of claims 6 to 10.

A storage medium storing the program according to claim 11.