JP6353660B2

JP6353660B2 - Sign language word classification information generation device and program thereof

Info

Publication number: JP6353660B2
Application number: JP2014021253A
Authority: JP
Inventors: 井上 誠喜 (Seiki Inoue)
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2014-02-06
Filing date: 2014-02-06
Publication date: 2018-07-04
Anticipated expiration: 2034-02-06
Also published as: JP2015148706A

Description

The present invention relates to a sign language word classification information generation device that generates classification information for searching for sign language words, and to a program therefor.

In recent years, sign language interpreting and sign language broadcasting for hearing-impaired people have rapidly spread through society. Conventionally, sign language is generally learned with the aid of a sign language dictionary.
Such dictionaries include paper-based dictionaries that associate each word with a series of images of the signing motion, and recording media such as DVDs that associate each word with a video of the signing motion.
More recently, a technique has been disclosed in which signing motions are captured with a motion capture system as movements of the human body including the fingertips (motion data), and words are associated with the motion data so that the sign corresponding to a word can be retrieved (see Patent Document 1 and Non-Patent Document 1). With this system, specifying a word causes the motion data of the sign to be presented as a CG animation.
These approaches (dictionaries and systems) retrieve a signing motion from a word; they do not retrieve a word from an actual signing motion.

On the other hand, to retrieve a word from an actual signing motion, a dictionary that looks up signs by hand shape can be used (see Non-Patent Document 2). This dictionary associates words with coarse categories based mainly on arm movement, such as one-handed signs, two-handed signs with the same hand shape, and two-handed signs with different hand shapes, and with fine categories based mainly on hand shape, such as the "te" shape and the "ho" shape.
In addition, as a technique for retrieving a word (character) from a signing motion, there is a technique that extracts hand feature information by image processing from captured images of the signing motion and identifies the character corresponding to a group by consulting a database in which hand feature information of signing motions has been grouped in advance (see Patent Document 2).

Patent Document 1: JP 2011-112675 A
Patent Document 2: JP 2012-252581 A

Non-Patent Document 1: Naoto Kato, Hiroyuki Kaneko, Seiki Inoue, Toshihiro Shimizu, Yuji Nagashima: "Construction of a Japanese-Sign Language Bilingual Dictionary: Expansion of Japanese Vocabulary", IEICE HCG (Human Communication Group) Symposium, HCG2009-I-3 (2009).
Non-Patent Document 2: Shigeru Takemura: "Sign Language / Japanese Dictionary" (手話・日本語大辞典), Kosaido Publishing, April 1, 1994.

The approaches (dictionaries and systems) described above for retrieving a signing motion from a word associate a signing motion with each word and require only the word as a search condition, so the signing motion can be retrieved from the word reliably.
In contrast, approaches for retrieving a word from an actual signing motion have many problems.
For example, to create a dictionary for retrieving words from signing motions, the dictionary creator currently has to watch each sign and classify it manually. The amount of work therefore becomes enormous as the vocabulary of the dictionary grows.
Furthermore, with the image-processing technique described above for identifying a sign language word from captured images, the hand and fingers may not be detectable in the captured images depending on conditions such as lighting. Moreover, the meaning of a sign differs depending on, among other things, the shape of the hand. It is therefore very difficult in practice to identify, from a planar captured image, the many three-dimensional hand shapes that distinguish sign language words.

The present invention has been made in view of these problems, and its object is to provide a sign language word classification information generation device, and a program therefor, that easily generate classification information for retrieving a word from a signing motion without manual work.

To solve the above problem, a sign language word classification information generation device according to the present invention generates classification information for searching for a sign language word from features of the shape, position, and orientation of the hand in the motion of the sign language word, and comprises representative frame detection means, hand shape classification means, hand position/direction calculation means, hand position classification means, and hand direction classification means.

In this configuration, the sign language word classification information generation device uses the representative frame detection means to detect the representative frame, that is, the frame with the smallest motion amount, in motion data composed of the relative positions of the joints of the human body and the movement amount and rotation amount of each joint for each frame, a frame being the unit of screen display. This is because the motion of a sign language word often comes to rest momentarily in the pose typical of that word.

The sign language word classification information generation device then uses the hand shape classification means to cluster the sets of finger joint rotation amounts in the representative frame of each sign language word. Clustering according to the finger joint rotation amounts in this way makes it possible to classify sign language words by rough hand shape.
The hand shape classification means then associates the classification result, as one item of classification information, with the motion data of each sign language word.
The sign language word classification information generation device can thereby generate classification information that allows a sign language word to be retrieved from the hand shape.

The sign language word classification information generation device also uses the hand position/direction calculation means to calculate, for each representative frame, the position and orientation of the wrist joint from the movement amount and rotation amount of each joint from the joint serving as a predetermined reference position of the human body to the wrist joint.
The hand position classification means then classifies the position of the wrist joint into predefined position categories and associates the classification result, as one item of classification information, with the motion data of the sign language word corresponding to the representative frame.
The sign language word classification information generation device can thereby generate classification information that allows a sign language word to be retrieved from the position of the hand.

The hand direction classification means then classifies the orientation of the wrist joint into predefined direction categories and associates the classification result, as one item of classification information, with the motion data of the sign language word corresponding to the representative frame.
The sign language word classification information generation device can thereby generate classification information that allows a sign language word to be retrieved from the orientation of the hand (hand direction).

The present invention provides the following advantageous effects.
According to the present invention, sign language word motions defined by motion data can be classified, without manual work, by the representative hand shape, position, and orientation of each motion.
The present invention also makes it possible to retrieve, from a signing motion, the sign language word it represents by using the representative hand shape, position, and orientation of the sign language word motion.
The present invention can thereby improve the efficiency of learning sign language word motions.

FIG. 1 is a block diagram showing the configuration of a sign language word classification information generation device according to an embodiment of the present invention.
FIG. 2 is a data structure diagram in BVH format showing an example of motion data.
FIG. 3 shows joint nodes defined by motion data: (a) is an example of the joint nodes of the whole body, and (b) is an example of the joint nodes of the right hand.
FIG. 4 is a graph showing the relationship between frame and motion amount, for explaining how the minimum motion frame detection means of FIG. 1 detects the representative frame.
FIG. 5 shows the dendrogram generated when the clustering means of FIG. 1 performs clustering using the statistical software "R".
FIG. 6 is an explanatory diagram of the regions into which the hand position classification means of FIG. 1 classifies hand positions.
FIG. 7 shows examples of hand shapes with different palm orientations: (a) shows the back of the hand facing front, and (b) shows the palm facing front.
FIG. 8 shows examples with different index finger directions: (a) shows the index finger pointing inward, and (b) shows the index finger pointing downward.
FIG. 9 is a flowchart showing the operation of the sign language word classification information generation device according to the embodiment of the present invention.
FIG. 10 is a block diagram showing the configuration of a sign language word search device according to the embodiment of the present invention.
FIG. 11 shows an example of a sign language word search screen.
FIG. 12 shows an example of a screen for specifying the hand shape.
FIG. 13 shows an example of a screen for specifying the orientation of the hand.
FIG. 14 shows an example of a screen for specifying the direction of the finger.
FIG. 15 shows an example of a screen for specifying the position of the hand.
FIG. 16 shows an example of a screen that displays candidate sign language words as search results.
FIG. 17 shows an example of a screen that displays the sign language word motion when a sign language word in the search results is selected.
FIG. 18 is a flowchart showing the operation of the sign language word search device according to the embodiment of the present invention.

Embodiments of the present invention will be described with reference to the drawings.
[Outline of sign language word motion data]
First, the structure of the motion data representing the motion of a sign language word (sign language word motion data) used in the present invention will be described. This motion data is recorded for each sign language word using a general motion capture system. Here, data in the file format developed by Biovision (BVH [BioVision Hierarchy] format) is used as an example of motion data.

Motion data consists of the relative positions of the joints of the human body and the movement amount and rotation amount of each joint for each frame, a frame being the unit of screen display when the signing motion is played back.
FIG. 2 shows an example of motion data written in BVH format.
As shown in FIG. 2, motion data in BVH format consists of a HIERARCHY section (lines 1 to 28) and a MOTION section (line 29 onward).
The HIERARCHY section defines the joint nodes (JOINT) representing the joints of the human body in a hierarchical structure and, for each joint node, defines its coordinates relative to the parent joint node in the initial pose (OFFSET) and the variables for its movement amount (Xposition, Yposition, Zposition) and rotation amount (Zrotation, Xrotation, Yrotation).
The MOTION section gives the number of frames (line 30), the frame time interval (line 31), and the actual values of the movement and rotation variables defined in the HIERARCHY section (line 32 onward), one line per frame.
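The two-section layout described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name parse_bvh and the two-joint sample SAMPLE_BVH are hypothetical, and the sample is far smaller than the data of FIG. 2. The sketch collects the movement and rotation variables declared in the HIERARCHY section and reads one list of values per frame from the MOTION section.

```python
# Hypothetical two-joint sample; real motion data (FIG. 2) has many more joints.
SAMPLE_BVH = """
HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  JOINT Chest1
  {
    OFFSET 0.0 10.0 0.0
    CHANNELS 3 Zrotation Xrotation Yrotation
    End Site
    {
      OFFSET 0.0 5.0 0.0
    }
  }
}
MOTION
Frames: 2
Frame Time: 0.033333
0.0 90.0 0.0 0.0 0.0 0.0 5.0 0.0 0.0
0.0 90.0 0.0 0.0 0.0 0.0 6.0 1.0 0.0
"""


def parse_bvh(text):
    """Return (channels, frame_time, frames) parsed from a BVH string.

    channels: list of (joint_name, variable_name) in file order.
    frames:   one list of floats per frame, in the same order as channels.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    channels, joint, i = [], None, 0
    # HIERARCHY section: remember the current joint and collect its variables.
    while lines[i] != "MOTION":
        tokens = lines[i].split()
        if tokens[0] in ("ROOT", "JOINT"):
            joint = tokens[1]
        elif tokens[0] == "CHANNELS":
            count = int(tokens[1])
            channels += [(joint, name) for name in tokens[2:2 + count]]
        i += 1
    # MOTION section: frame count, frame interval, then one line per frame.
    num_frames = int(lines[i + 1].split(":")[1])
    frame_time = float(lines[i + 2].split(":")[1])
    frames = [[float(v) for v in ln.split()]
              for ln in lines[i + 3:i + 3 + num_frames]]
    return channels, frame_time, frames


channels, frame_time, frames = parse_bvh(SAMPLE_BVH)
print(len(frames), len(channels))  # prints "2 9"
```

Keeping the variables as (joint, variable) pairs in file order means that the k-th value of every frame line can be attributed to a specific joint, which is what the later processing of finger and wrist joints relies on.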

As shown in FIG. 3, for example, the joint nodes (JOINT) form a hierarchy in the order in which they are connected, with a predetermined reference position (here, Hips) as the root (ROOT). FIG. 3(a) shows an example of the joint nodes of the human body. FIG. 3(b) shows the finger joint nodes further connected to the right wrist (RightWrist) of FIG. 3(a). In FIG. 3(b), the labels of the joint nodes other than those of the thumb and index finger are omitted.
For example, the position and orientation of a joint node of the right hand (RightIndex3) can be obtained by following the joint node hierarchy of the motion data shown in FIG. 2 in the order Hips, Chest1, Chest2, ..., RightWrist, RightIndex0, ..., RightIndex3.
Here, a right-handed XYZ coordinate system is used, as shown in FIG. 3(a).
In the following, the configuration and operation of the sign language word classification information generation device according to the embodiment of the present invention are described first, followed by the configuration and operation of the sign language word search device.

[Configuration of sign language word classification information generation device]
First, the configuration of the sign language word classification information generation device 1 will be described with reference to FIG. 1.
The sign language word classification information generation device 1 generates classification information that serves as search conditions for retrieving a sign language word from the shape features of the hand (hand shape, position, orientation, and so on) in the motion of the sign language word.

From motion data representing the motion of each sign language word, the sign language word classification information generation device 1 generates the following items of classification information for the representative frame of that data: the shape of the hand (hand shape), the orientation of the palm (hand direction), the direction of a finger (here, the index finger) (finger direction), and the position of the wrist (hand position).
Here, to generate these items of classification information, the sign language word classification information generation device 1 includes representative frame detection means 10, shape classification means 11, and position/direction classification means 12.

The motion data D is the data described with reference to FIG. 2 and is assumed to be stored in the storage means 2 in advance. The motion data D of each sign language word motion is assumed to be associated, as management information, with at least identification information ID, such as a file name or a unique number, and a label L indicating the meaning of the sign language word (a Japanese word).

The representative frame detection means 10 detects, in the motion data D of each sign language word, a characteristic frame among the frames that change over time as the representative frame. The motion of a sign language word often comes to rest momentarily in the pose typical of that word. The representative frame detection means 10 therefore detects, for each sign language word, the frame with the smallest motion amount between the start and end of the motion in the motion data D as the representative frame. Here, the representative frame detection means 10 includes motion amount calculation means 100 and minimum motion frame detection means 101.

The motion amount calculation means 100 calculates the motion amount of each frame in the motion data D of each sign language word.
Here, the motion amount calculation means 100 obtains the motion amount of a given frame of the motion data D by computing, between that frame and the preceding frame, either the sum of the absolute values of the per-component differences or the sum of the squares of the per-component differences of the movement amounts (X, Y, and Z movement amounts) and rotation amounts (X, Y, and Z rotation amounts). The motion amount calculation means 100 obtains the motion amount for all frames in turn in the same way.
That is, the motion amount calculation means 100 obtains the inter-frame motion amounts from the movement and rotation amounts of the frames written one per line from line 32 onward in the MOTION section of the motion data D described with reference to FIG. 2.
The motion amount calculation means 100 then outputs the per-frame motion amounts of each sign language word motion to the minimum motion frame detection means 101.
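The motion amount calculation described above can be sketched in Python as follows. The function name motion_amounts is hypothetical. The first frame has no preceding frame, so its motion amount is set to 0.0 here; that choice is an assumption, since the text does not say how the first frame is handled.

```python
def motion_amounts(frames, squared=False):
    """Return the motion amount of every frame relative to its predecessor.

    frames:  list of equal-length lists, each holding the movement and
             rotation values of one frame (one MOTION line of the BVH data).
    squared: False -> sum of absolute per-component differences,
             True  -> sum of squared per-component differences.
    """
    amounts = [0.0]  # assumption: the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        diffs = [c - p for p, c in zip(prev, cur)]
        if squared:
            amounts.append(sum(d * d for d in diffs))
        else:
            amounts.append(sum(abs(d) for d in diffs))
    return amounts


example = [[0.0, 0.0], [1.0, -2.0], [1.0, -2.0]]
print(motion_amounts(example))                # [0.0, 3.0, 0.0]
print(motion_amounts(example, squared=True))  # [0.0, 5.0, 0.0]
```

Both variants yield a motion amount of zero when two consecutive frames are identical, which is the property the representative frame detection depends on.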

The minimum motion frame detection means 101 detects the frame with the smallest motion amount in a sign language word motion on the basis of the motion amounts calculated by the motion amount calculation means 100.
Here, the minimum motion frame detection means 101 detects the frame with the smallest motion amount between the frame in which the motion of the sign language word starts and the frame in which it ends.

The frame detection process of the minimum motion frame detection means 101 will now be described with reference to FIG. 4.
FIG. 4 is a graph of the motion amount of each frame of a sign language word motion. The horizontal axis represents the frame number and the vertical axis represents the motion amount (dimensionless).
As shown in FIG. 4, the minimum motion frame detection means 101 scans from the first frame of the signing motion toward the last frame and takes the first frame whose motion amount exceeds a predetermined threshold as the start frame Fs, at which the motion begins. Likewise, it scans from the last frame toward the first frame and takes the first frame whose motion amount exceeds the predetermined threshold as the end frame Fe, at which the motion ends.
The minimum motion frame detection means 101 then takes the frame with the smallest motion amount between the start frame Fs and the end frame Fe as the representative frame Ft.
The minimum motion frame detection means 101 outputs information identifying the representative frame (the identification information ID and the frame number) to the shape classification means 11 and the position/direction classification means 12.
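The detection of the start frame Fs, the end frame Fe, and the representative frame Ft can be sketched in Python as follows. The function name, the return convention (None when no frame exceeds the threshold), the inclusive search range, and the rule of taking the earliest frame on a tie are assumptions made for illustration; the threshold value itself is not specified in the text.

```python
def detect_representative_frame(amounts, threshold):
    """Return (start, end, representative) frame indices, or None.

    start: first frame, scanning forward, whose motion amount exceeds threshold.
    end:   first frame, scanning backward, whose motion amount exceeds threshold.
    representative: frame with the smallest motion amount from start to end
                    (inclusive; the earliest such frame if there is a tie).
    """
    over = [i for i, amount in enumerate(amounts) if amount > threshold]
    if not over:
        return None  # assumption: no motion detected at all
    start, end = over[0], over[-1]
    representative = min(range(start, end + 1), key=lambda i: amounts[i])
    return start, end, representative


amounts = [0.0, 0.1, 5.0, 8.0, 3.0, 0.5, 4.0, 7.0, 0.2, 0.0]
print(detect_representative_frame(amounts, threshold=1.0))  # (2, 7, 5)
```

In the example, the near-zero motion amounts before frame 2 and after frame 7 are ignored, so the momentary rest at frame 5 is selected rather than the stillness before or after the sign.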
Returning to FIG. 1, the description of the configuration of the sign language word classification information generation device 1 will be continued.

The shape classification means 11 classifies, by shape, the hand shapes in the representative frames detected by the representative frame detection means 10 for the plurality of sign language word motions.
Here, the shape classification means 11 includes clustering means 110 and hand shape image generation means 111.

The clustering means 110 clusters the hand shapes in the representative frames of the plurality of items of motion data D.
As the information specifying a hand shape, the clustering means 110 uses the set of rotation amounts of the finger joints in the representative frame of the motion data D.
Specifically, the clustering means 110 treats as one data unit the set of rotation amounts of a total of 19 joint nodes of the hand shown in FIG. 3(b) (here, the right hand as an example), excluding the tip of each finger: the three joint nodes of the thumb (RightThumb1 to RightThumb3), the four joint nodes of the index finger (RightIndex0 to RightIndex3), the four joint nodes of the middle finger, the four joint nodes of the ring finger, and the four joint nodes of the little finger. It performs clustering over all detected representative frames.
That is, the clustering means 110 operates once the representative frame detection means 10 has detected the representative frames of all the motion data D stored in the storage means 2.

Any general method may be used for this clustering. For example, the clustering means 110 can use the k-means method.
When the k-means method is used, the clustering means 110 first randomly assigns the sets of finger joint rotation amounts, one set per representative frame, to K clusters. The number of clusters K is, for example, the number of hand shapes predefined for sign language. The clustering means 110 then obtains the centroid of each cluster (the mean X rotation amount, the mean Y rotation amount, and the mean Z rotation amount). Next, the clustering means 110 reassigns each set to the cluster whose centroid is nearest. The clustering means 110 repeats this process until the cluster assignments no longer change.
The clustering means 110 can thereby cluster the hand shapes in the representative frames into K clusters.
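The k-means procedure described above can be sketched in plain Python as follows. This is a minimal sketch, not the implementation of the clustering means 110: the function names, the seed and max_iter parameters, the use of squared Euclidean distance, and the handling of a cluster that becomes empty (its previous centroid is kept) are all assumptions. Rotation amounts are averaged as plain numbers, which ignores angle wrap-around.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=0, max_iter=100):
    """Cluster points into k groups; return one cluster label per point."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Random initial assignment; every cluster receives at least one point.
    labels = list(range(k)) + [rng.randrange(k) for _ in points[k:]]
    rng.shuffle(labels)
    centers = [None] * k
    for _ in range(max_iter):
        # Centroid of each cluster = component-wise mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an emptied cluster keeps its old centroid
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Reassign every point to the cluster with the nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: finished
            break
        labels = new_labels
    return labels


# Hypothetical rotation-amount vectors (three values each instead of 19 x 3).
hand_shapes = [[0.0, 5.0, 2.0], [1.0, 4.0, 3.0], [80.0, 85.0, 90.0], [82.0, 88.0, 91.0]]
print(k_means(hand_shapes, k=2))
```

The returned labels play the role of the cluster-specific values; which numeric label a given hand shape receives depends on the random initial assignment, so only the grouping is meaningful.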

Alternatively, the clustering means 110 may perform clustering using the general-purpose statistical software "R". In this case, the clustering means 110 displays the result of R's clustering of the hand shapes as a dendrogram, such as the one shown in FIG. 5, on a display device (not shown). The operator then chooses an appropriate number of clusters on the basis of the height (Height) in the dendrogram and inputs it to the clustering means 110. The clustering means 110 then classifies the hand shapes partitioned by the dendrogram according to that number of clusters.
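The effect of cutting a dendrogram at a chosen number of clusters can be illustrated with the following Python sketch of agglomerative clustering. The text does not state which R function or linkage method is used, so the complete-linkage rule and Euclidean distance here are assumptions, as is the function name. The sketch repeatedly merges the two closest clusters and stops when the requested number of clusters remains, which corresponds to cutting the dendrogram at that number of branches.

```python
def agglomerative_clusters(points, num_clusters):
    """Merge the two closest clusters until num_clusters remain.

    Returns one cluster label per point. Complete linkage is assumed:
    the distance between two clusters is that of their farthest members.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(points))]  # start: one point per cluster
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


print(agglomerative_clusters([[0.0], [1.0], [10.0], [11.0], [50.0]], 3))
# [0, 0, 1, 1, 2]
```

Unlike the k-means sketch, this procedure is deterministic, and the operator can compare several cluster counts on the same dendrogram before settling on one.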

The clustering means 110 then sets the value unique to each resulting cluster, as the hand shape information of the classification information C, in the management information of the motion data D stored in the storage means 2 that corresponds to the representative frame.
The clustering means 110 also outputs the information identifying the motion data D (the identification information ID), together with the cluster-specific value, to the hand shape image generation means 111.

The hand shape image generation means 111 generates an image (hand shape image) corresponding to each hand shape classified by the clustering means 110.
The hand shape image generation means 111 generates a hand shape image G by CG (computer graphics) from the motion data D corresponding to the identification information ID input from the clustering means 110.
The hand shape image generation means 111 then writes the generated hand shape image G to the storage means 2 in association with the hand shape information, that is, the cluster-specific value input from the clustering means 110. If a hand shape image G has already been generated for hand shape information with the same value, the hand shape image generation means 111 does not generate another.
The hand shape image G generated in this way is used as a thumbnail image of the hand shape when searching for sign language words.

The position/direction classification means 12 classifies the position and orientation of the hand and fingers in the representative frames detected by the representative frame detection means 10 for the plurality of sign language word motions.
Here, the position/direction classification means 12 includes hand position/direction calculation means 120, hand position classification means 121, hand direction classification means 122, finger direction calculation means 123, and finger direction classification means 124.

手位置方向算出手段１２０は、代表フレームで表されている手（手首）の空間位置と手（手のひら）の方向とを算出するものである。
この手位置方向算出手段１２０は、モーションデータＤに記述されている基準位置（Ｈｉｐｓ）の関節ノードから、手首の関節ノードまでの階層関係（親子関係）により、代表フレームにおける移動量および回転量を、親関節ノードから子関節ノードに向かって座標変換することで、手首の位置および手のひらの方向を求める。
ここでは、手位置方向算出手段１２０は、以下の（手順１）から（手順３）の順で、目的とする手首の位置および方向を算出する。 The hand position / direction calculating unit 120 calculates the spatial position of the hand (wrist) and the direction of the hand (palm) represented by the representative frame.
The hand position / direction calculation unit 120 obtains the wrist position and the palm direction by applying the movement amounts and rotation amounts of the representative frame as coordinate transformations from parent joint node to child joint node, following the hierarchical (parent-child) relationship from the joint node at the reference position (Hips) described in the motion data D down to the wrist joint node.
Here, the hand position / direction calculation unit 120 calculates the target wrist position and direction by following (Procedure 1) to (Procedure 3) in order.

（手順１）
手位置方向算出手段１２０は、ルート（Ｈｉｐｓ）から手首（Ｗｒｉｓｔ〔ここでは、ＲｉｇｈｔＷｒｉｓｔとする〕）までの関節ノードごとに、モーションデータＤに記述されている代表フレームの移動量および回転量から、当該関節ノードにおける親関節ノードからの位置および向きを求める変換行列（同次変換行列）を生成する。
具体的には、手位置方向算出手段１２０は、関節ノードごとに、以下の式（１）に示す行列演算を行うことで、変換行列Ａを生成する。
ここで、Ｐ_ｘはＸｐｏｓｉｔｉｏｎに対応したＸ方向の移動量、Ｐ_ｙはＹｐｏｓｉｔｉｏｎに対応したＹ方向の移動量、Ｐ_ｚはＺｐｏｓｉｔｉｏｎに対応したＺ方向の移動量である。また、θ_ｘはＸｒｏｔａｔｉｏｎに対応したＸ軸周りの回転量、θ_ｙはＹｒｏｔａｔｉｏｎに対応したＹ軸周りの回転量、θ_ｚはＺｒｏｔａｔｉｏｎに対応したＺ軸周りの回転量である。 (Procedure 1)
For each joint node from the root (Hips) to the wrist (Wrist; here, RightWrist), the hand position / direction calculation unit 120 generates a transformation matrix (homogeneous transformation matrix) that gives the position and orientation of that joint node relative to its parent joint node, using the movement amount and rotation amount of the representative frame described in the motion data D.
Specifically, the hand position / direction calculation unit 120 generates a transformation matrix A for each joint node by performing the matrix operation of equation (1).
Here, P_x is the movement amount in the X direction corresponding to Xposition, P_y is the movement amount in the Y direction corresponding to Yposition, and P_z is the movement amount in the Z direction corresponding to Zposition. Further, θ_x is the rotation amount around the X axis corresponding to Xrotation, θ_y is the rotation amount around the Y axis corresponding to Yrotation, and θ_z is the rotation amount around the Z axis corresponding to Zrotation.
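Procedure 1 can be illustrated with a minimal Python sketch. Equation (1) is not reproduced in this text, so the composition order used below (translation, then rotations about X, Y, and Z) and the use of degrees are assumptions made only for illustration, and the function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(px, py, pz):
    """Homogeneous translation by (P_x, P_y, P_z)."""
    return [[1.0, 0.0, 0.0, px],
            [0.0, 1.0, 0.0, py],
            [0.0, 0.0, 1.0, pz],
            [0.0, 0.0, 0.0, 1.0]]


def rotation(axis, degrees):
    """Homogeneous rotation about one axis (angle in degrees: an assumption)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    if axis == "x":
        return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0],
                [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]
    if axis == "y":
        return [[c, 0.0, s, 0.0], [0.0, 1.0, 0.0, 0.0],
                [-s, 0.0, c, 0.0], [0.0, 0.0, 0.0, 1.0]]
    if axis == "z":
        return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    raise ValueError("axis must be 'x', 'y', or 'z'")


def joint_matrix(position, rotation_deg):
    """Build a joint's matrix A from (P_x, P_y, P_z) and (theta_x, theta_y, theta_z).

    ASSUMED order: A = T(P) * Rx * Ry * Rz. The order in the actual
    equation (1) may differ.
    """
    a = translation(*position)
    for axis, angle in zip("xyz", rotation_deg):
        a = mat_mul(a, rotation(axis, angle))
    return a
```

If the motion data specifies a different channel order, only the loop in `joint_matrix` needs to change.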

（手順２）
そして、手位置方向算出手段１２０は、（手順１）で算出したルート（Ｈｉｐｓ）に対応する変換行列から手首（Ｗｒｉｓｔ）に対応する変換行列までを順次乗算することで、ルートから手首までの全体の変換行列を生成する。
すなわち、関節ノードのＨｉｐｓの変換行列をＡ_Ｈｉｐｓ、Ｃｈｅｓｔ１の変換行列をＡ_{Ｃｈｅｓｔ１}、…、ＲｉｇｈｔＷｒｉｓｔの変換行列をＡ_{ＲｉｇｈｔＷｒｉｓｔ}としたとき、手位置方向算出手段１２０は、以下の式（２）に示すように、それぞれの行列を順に乗算することで、ルートから手首までの変換行列Ｂを生成する。 (Procedure 2)
Then, the hand position / direction calculation unit 120 generates the overall transformation matrix from the root to the wrist by sequentially multiplying the transformation matrices calculated in (Procedure 1), from the one for the root (Hips) through the one for the wrist (Wrist).
That is, when the transformation matrix of the joint node Hips is A_Hips, that of Chest1 is A_Chest1, ..., and that of RightWrist is A_RightWrist, the hand position / direction calculation unit 120 generates the transformation matrix B from the root to the wrist by multiplying these matrices in order, as shown in equation (2).

（手順３）
そして、手位置方向算出手段１２０は、（手順２）で算出した変換行列Ｂの成分から、手首の位置および手のひらの方向を算出する。
すなわち、手位置方向算出手段１２０は、変換行列Ｂの平行移動の成分となる（ＬＸ，ＬＹ，ＬＺ）を計算することで、代表フレームにおける手首位置の空間座標（Ｎ_ｐｘ，Ｎ_ｐｙ，Ｎ_ｐｚ）を求める。
なお、初期姿勢における手首位置は、図２で説明したモーションデータ（手話単語モーションデータ）Ｄにおいて、ルート（ＲＯＯＴ）から、手首の関節ノードまでの相対座標（ＯＦＦＳＥＴ）を加算した位置である。 (Procedure 3)
Then, the hand position / direction calculation unit 120 calculates the wrist position and the palm direction from the components of the transformation matrix B calculated in (Procedure 2).
That is, the hand position / direction calculation unit 120 obtains the spatial coordinates (N_px, N_py, N_pz) of the wrist position in the representative frame by calculating (LX, LY, LZ), the translation components of the transformation matrix B.
Note that the wrist position in the initial posture is the position obtained by adding up the relative coordinates (OFFSET) from the root (ROOT) to the wrist joint node in the motion data (sign language word motion data) D described with reference to FIG. 2.

また、手位置方向算出手段１２０は、手話単語動作の初期姿勢における手首の方向ベクトルに、変換行列Ｂの成分（ＸＸ，ＸＹ，ＸＺ，ＹＸ，ＹＹ，ＹＺ，ＺＸ，ＺＹ，ＺＺ）で構成される回転行列を乗算することで、代表フレームにおける手のひらの方向となる手首の方向ベクトル（Ｎ_ｖｘ，Ｎ_ｖｙ，Ｎ_ｖｚ）を求める。
なお、初期姿勢における手首の方向ベクトルは、予め手話動作の初期ポーズとして定めておくこととする。ここでは、手話動作の初期ポーズを、両手を横に広げ（Ｔポーズ）とし、手のひらを下向きにした姿勢とする。すなわち、初期姿勢では、手のひらの方向は、下向きであり、ＸＹＺ座標系における方向ベクトルは（０，−１，０）である。
このように、手位置方向算出手段１２０は、代表フレームにおいて、空間座標（ＸＹＺ座標）における手首の位置とその方向とを算出することができる。
そして、手位置方向算出手段１２０は、算出した手首の位置（空間座標）を手位置分類手段１２１に出力し、手首の方向（方向ベクトル）を手方向分類手段１２２に出力する。 Further, the hand position / direction calculation unit 120 obtains the wrist direction vector (N_vx, N_vy, N_vz), which is the palm direction in the representative frame, by multiplying the wrist direction vector in the initial posture of the sign language word motion by the rotation matrix composed of the components (XX, XY, XZ, YX, YY, YZ, ZX, ZY, ZZ) of the transformation matrix B.
Note that the wrist direction vector in the initial posture is determined in advance as the initial pose of the sign language action. Here, the initial pose of the sign language operation is a posture in which both hands are spread sideways (T pose) and the palm is directed downward. That is, in the initial posture, the palm direction is downward, and the direction vector in the XYZ coordinate system is (0, -1, 0).
As described above, the hand position / direction calculation unit 120 can calculate the position and direction of the wrist in the space coordinates (XYZ coordinates) in the representative frame.
The hand position / direction calculating unit 120 outputs the calculated wrist position (spatial coordinates) to the hand position classifying unit 121 and outputs the wrist direction (direction vector) to the hand direction classifying unit 122.
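Procedures 2 and 3 can be sketched as follows, under stated assumptions. Equation (2) is not reproduced in this text; the product B = A_Hips · A_Chest1 · ... · A_RightWrist follows the description above. The sketch assumes a column-vector convention in which the translation components (LX, LY, LZ) occupy the fourth column of B and the rotation components occupy its upper-left 3x3 block. The function names and the example matrices are hypothetical.

```python
from functools import reduce


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def wrist_pose(chain, initial_direction=(0.0, -1.0, 0.0)):
    """Return (wrist position, palm direction) from root-to-wrist matrices.

    chain: per-joint 4x4 matrices ordered from the root (Hips) to the wrist.
    initial_direction: palm direction in the initial T pose, (0, -1, 0)
    as stated in the description.
    """
    b = reduce(mat_mul, chain)                     # Procedure 2: B = A_1 * ... * A_n
    position = (b[0][3], b[1][3], b[2][3])         # translation part of B
    direction = tuple(                             # rotation part of B times the
        sum(b[i][k] * initial_direction[k] for k in range(3))  # initial vector
        for i in range(3)
    )
    return position, direction


# Hypothetical two-joint chain: a pure translation, then a 90-degree
# rotation about Z combined with a translation along X.
a_root = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 10.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]
a_wrist = [[0.0, -1.0, 0.0, 5.0],
           [1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
position, direction = wrist_pose([a_root, a_wrist])
```

In this example the downward-facing palm of the initial pose ends up facing along the X axis after the 90-degree rotation about Z.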

手位置分類手段１２１は、手位置方向算出手段１２０で算出された手（手首）の位置を分類するものである。
この手位置分類手段１２１は、代表フレームにおいて、手首の位置がどこに存在するのかを大まかな区分で分類する。
ここでは、手位置分類手段１２１は、図６に示すように、代表フレームの画像をＸ方向（水平方向）に３分割、Ｙ方向（垂直方向）に３分割した計９個の分割領域（ＵＲ，ＵＣ，ＵＬ，ＭＲ，ＭＣ，ＭＬ，ＤＲ，ＤＣ，ＤＬ）のどこに手首が位置しているのかを判定する。 The hand position classification unit 121 classifies the hand (wrist) position calculated by the hand position direction calculation unit 120.
The hand position classifying unit 121 classifies the position of the wrist in the representative frame according to a rough classification.
Here, as shown in FIG. 6, the hand position classification unit 121 determines in which of a total of nine divided regions (UR, UC, UL, MR, MC, ML, DR, DC, DL) the wrist is located, the regions being obtained by dividing the representative frame image into three in the X direction (horizontal direction) and three in the Y direction (vertical direction).

すなわち、手位置分類手段１２１は、手位置方向算出手段１２０で算出された手首位置の空間座標（Ｎ_ｐｘ，Ｎ_ｐｙ，Ｎ_ｐｚ）のうちのＸ座標Ｎ_ｐｘ，Ｙ座標Ｎ_ｐｙによって、手（手首）の位置を分類する。
なお、ここで、手位置分類手段１２１は、ＸＹ平面上で位置の分類を行ったが、手首位置のＺ座標を加えて、３次元空間上で位置の分類を行ってもよい。
そして、手位置分類手段１２１は、手首が位置する分割領域を特定する固有の値を、分類情報Ｃの手位置情報として、記憶手段２において、識別情報ＩＤで特定される手話単語（モーションデータＤ）に対応する管理情報に設定する。 That is, the hand position classification unit 121 classifies the position of the hand (wrist) by the X coordinate N_px and the Y coordinate N_py of the spatial coordinates (N_px, N_py, N_pz) of the wrist position calculated by the hand position / direction calculation unit 120.
Here, the hand position classification unit 121 classifies the position on the XY plane; however, it may instead classify the position in three-dimensional space by also using the Z coordinate of the wrist position.
Then, the hand position classification unit 121 sets the unique value identifying the divided region in which the wrist is located, as the hand position information of the classification information C, in the management information in the storage unit 2 corresponding to the sign language word (motion data D) identified by the identification information ID.
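The nine-region classification can be sketched in Python as below. The region boundaries are not given in this text, so `x_bounds` and `y_bounds` are hypothetical parameters. The mapping of smaller X values to the signer's right (R) is also an assumption, as is the function name.

```python
def classify_hand_position(n_px, n_py, x_bounds, y_bounds):
    """Map a wrist position (N_px, N_py) to one of the nine region labels.

    x_bounds, y_bounds: pairs (lower, upper) of boundary values that split
    each axis into three bands. Both are hypothetical inputs.
    """
    columns = "RCL"  # ASSUMED: increasing X runs from the signer's right to left
    rows = "DMU"     # increasing Y runs from down (D) through middle (M) to up (U)
    # Each comparison is True (1) or False (0); their sum is the band index 0..2.
    column = columns[(n_px >= x_bounds[0]) + (n_px >= x_bounds[1])]
    row = rows[(n_py >= y_bounds[0]) + (n_py >= y_bounds[1])]
    return row + column


# Hypothetical boundaries, in the same units as the motion data.
label = classify_hand_position(0.0, 100.0, x_bounds=(-10.0, 10.0), y_bounds=(90.0, 130.0))
```

Extending this to the three-dimensional variant mentioned above would add a third pair of boundaries for the Z coordinate.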

手方向分類手段１２２は、手位置方向算出手段１２０で算出された手（手首）の方向を分類するものである。
この手方向分類手段１２２は、代表フレームにおいて、手のひらの方向がどの方向を向いているのかを大まかな区分で分類する。
ここでは、手位置方向算出手段１２０で算出された方向ベクトル（Ｎ_ｖｘ，Ｎ_ｖｙ，Ｎ_ｖｚ）のうちで、絶対値が最も大きい値となる軸の成分をその符号とともに、手のひらの方向として分類する。 The hand direction classifying unit 122 classifies the direction of the hand (wrist) calculated by the hand position / direction calculating unit 120.
The hand direction classification unit 122 roughly classifies which direction the palm faces in the representative frame.
Here, among the components of the direction vector (N_vx, N_vy, N_vz) calculated by the hand position / direction calculation unit 120, the axis whose component has the largest absolute value, together with the sign of that component, is classified as the palm direction.

例えば、図７（ａ）では、手話者が手の甲を正面に向けており、図７（ｂ）では手のひらを正面に向けている。この場合、手首の方向ベクトル（Ｎ_ｖｘ，Ｎ_ｖｙ，Ｎ_ｖｚ）のうちで、Ｚ軸成分Ｎｖｚの絶対値が最も大きくなり、手型が同じであっても、図７（ａ）、（ｂ）をＺ軸成分の符号によって、それぞれに分類することができる。
なお、手方向分類手段１２２は、必ずしもＸＹＺ方向のすべてについて方向を分類する必要はなく、例えば、Ｚ軸方向のみに着目して、その正負によって分類を行うこととしてもよい。
そして、手方向分類手段１２２は、手のひらの方向を特定する固有の値を、分類情報Ｃの手方向情報として、記憶手段２において、識別情報ＩＤで特定される手話単語（モーションデータＤ）に対応する管理情報に設定する。 For example, in FIG. 7A the signer has the back of the hand facing forward, and in FIG. 7B the palm facing forward. In this case, the Z-axis component N_vz of the wrist direction vector (N_vx, N_vy, N_vz) has the largest absolute value, so even though the hand type is the same, FIGS. 7A and 7B can be classified separately according to the sign of the Z-axis component.
Note that the hand direction classification unit 122 does not necessarily need to classify directions along all of the X, Y, and Z axes; for example, it may focus on the Z-axis direction only and classify according to its sign.
Then, the hand direction classification unit 122 sets the unique value identifying the palm direction, as the hand direction information of the classification information C, in the management information in the storage unit 2 corresponding to the sign language word (motion data D) identified by the identification information ID.
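The dominant-axis rule for the palm direction can be sketched as follows. The label format ("+Z", "-Y", and so on) and the function name are hypothetical, and the tie-breaking behavior is an assumption, since the text does not say what happens when two components have equal magnitude.

```python
def classify_direction(vector):
    """Return the signed axis label of the component with the largest magnitude.

    vector: a direction vector such as (N_vx, N_vy, N_vz).
    On a tie, the earlier axis (X before Y before Z) wins; this is an
    assumption, not something the description specifies.
    """
    axes = ("X", "Y", "Z")
    index = max(range(3), key=lambda k: abs(vector[k]))
    sign = "+" if vector[index] >= 0 else "-"
    return sign + axes[index]


# The initial T pose has the palm facing down, i.e. (0, -1, 0).
initial_label = classify_direction((0.0, -1.0, 0.0))
```

Two frames with the same hand type but opposite Z signs, as in the FIG. 7 example, would receive the labels "+Z" and "-Z" under this scheme.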

指方向算出手段１２３は、代表フレームで表されている手の指の方向を算出するものである。
この指方向算出手段１２３は、代表フレームのモーションデータＤから、指の予め定めた２つの関節位置を求め、それぞれの関節の相対位置から、当該指が指し示す方向を算出する。 The finger direction calculation means 123 calculates the direction of the finger of the hand represented by the representative frame.
The finger direction calculation means 123 calculates two predetermined joint positions of the finger from the motion data D of the representative frame, and calculates the direction indicated by the finger from the relative position of each joint.

例えば、右手の人差し指の方向を求める場合、指方向算出手段１２３は、図３に示した関節ノードにおいて、人差し指の先端の関節ノード（ＲｉｇｈｔＩｎｄｅｘＥｆｆ）の位置と、その指の根元の関節ノード（ＲｉｇｈｔＩｎｄｅｘ１）の位置とを求める。
なお、これらの関節ノードの位置を求める手法は、手位置方向算出手段１２０で手首の関節ノード（ＲｉｇｈｔＷｒｉｓｔ）を求める手法と同じである。
ただし、すでに、手位置方向算出手段１２０で、基準位置（Ｈｉｐｓ）の関節ノードから、手首の関節ノード（ＲｉｇｈｔＷｒｉｓｔ）までの移動量および回転量は算出されているため、指方向算出手段１２３は、その移動量および回転量を用い、それ以降の指の関節ノードについて演算を行い、それぞれの空間座標を求めればよい。 For example, when obtaining the direction of the index finger of the right hand, the finger direction calculation unit 123 obtains, among the joint nodes shown in FIG. 3, the position of the joint node at the tip of the index finger (RightIndexEff) and the position of the joint node at the base of that finger (RightIndex1).
The method for obtaining the positions of these joint nodes is the same as the method by which the hand position / direction calculation unit 120 obtains the wrist joint node (RightWrist).
However, since the hand position / direction calculation unit 120 has already calculated the movement amount and rotation amount from the joint node at the reference position (Hips) to the wrist joint node (RightWrist), the finger direction calculation unit 123 need only use those amounts, perform the calculation for the finger joint nodes beyond the wrist, and obtain their spatial coordinates.

そして、指方向算出手段１２３は、人差し指の先端の関節ノード（ＲｉｇｈｔＩｎｄｅｘＥｆｆ）の位置を示す空間座標（Ｉｅｘ，Ｉｅｙ，Ｉｅｚ）から、根元の関節ノード（ＲｉｇｈｔＩｎｄｅｘ１）の位置を示す空間座標（Ｉ１ｘ，Ｉ１ｙ，Ｉ１ｚ）の各成分の差を計算する。この各成分の差は、各軸における指の方向を表す指標（方向ベクトル）となる。
そして、指方向算出手段１２３は、この指の方向ベクトルを指方向分類手段１２４に出力する。 Then, the finger direction calculation unit 123 calculates the difference, component by component, between the spatial coordinates (Iex, Iey, Iez) indicating the position of the joint node at the tip of the index finger (RightIndexEff) and the spatial coordinates (I1x, I1y, I1z) indicating the position of the base joint node (RightIndex1). These component differences serve as an index (direction vector) representing the direction of the finger along each axis.
Then, the finger direction calculation unit 123 outputs the finger direction vector to the finger direction classification unit 124.
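The finger direction calculation can be sketched as below, reusing an already computed root-to-wrist matrix as described above. The sketch assumes the difference is taken as tip minus base, so that the vector points from the base of the finger toward its tip. The function names and the example matrices are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(px, py, pz):
    """Homogeneous translation, used here only to build example data."""
    return [[1.0, 0.0, 0.0, px],
            [0.0, 1.0, 0.0, py],
            [0.0, 0.0, 1.0, pz],
            [0.0, 0.0, 0.0, 1.0]]


def joint_position(wrist_matrix, finger_chain):
    """Extend the root-to-wrist matrix by the finger joints after the wrist."""
    m = wrist_matrix
    for a in finger_chain:
        m = mat_mul(m, a)
    return (m[0][3], m[1][3], m[2][3])


def finger_direction(tip, base):
    """Component-wise difference (ASSUMED: tip minus base)."""
    return tuple(t - b for t, b in zip(tip, base))


# Hypothetical data: wrist at (5, 10, 0); finger joints offset along X.
wrist_matrix = translation(5.0, 10.0, 0.0)
to_base = [translation(1.0, 0.0, 0.0)]
to_tip = to_base + [translation(2.0, 0.0, 0.0), translation(1.0, 0.0, 0.0)]
base = joint_position(wrist_matrix, to_base)
tip = joint_position(wrist_matrix, to_tip)
direction = finger_direction(tip, base)
```

Because the wrist matrix is passed in, the joints between the root and the wrist are not recomputed for each finger.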

指方向分類手段１２４は、指方向算出手段１２３で算出された指の方向を分類するものである。
この指方向分類手段１２４は、指方向算出手段１２３で算出された指の方向ベクトルの成分のうちで、その値の最も絶対値が大きい軸方向において、符号が正であればその軸の正方向、符号が負であればその軸の負方向に指が向いていると分類する。 The finger direction classifying unit 124 classifies the finger direction calculated by the finger direction calculating unit 123.
Among the components of the finger direction vector calculated by the finger direction calculation unit 123, the finger direction classification unit 124 takes the axis whose component has the largest absolute value and classifies the finger as pointing in the positive direction of that axis if the sign is positive, or in the negative direction of that axis if the sign is negative.

例えば、図８（ａ）では、手話者が人差し指を内側（Ｘ軸正方向）に向けており、図８（ｂ）では人差し指を下（Ｙ軸負方向）に向けている。このように、手型が同じであっても、図８（ａ）、（ｂ）を人差し指の方向によって、それぞれ異なる手話単語に分類することができる。
なお、指方向分類手段１２４は、すべての指について方向を分類する必要はなく、例えば、人差し指のみ、親指、人差し指および中指の３本の指のみについて、分類を行うこととしてもよい。
そして、指方向分類手段１２４は、指の方向を特定する固有の値を、分類情報Ｃの指方向情報として、記憶手段２において、識別情報ＩＤで特定される手話単語（モーションデータＤ）に対応する管理情報に設定する。 For example, in FIG. 8A, the signer points the index finger inward (X-axis positive direction), and in FIG. 8B, the index finger is directed downward (Y-axis negative direction). Thus, even if the hand type is the same, FIGS. 8A and 8B can be classified into different sign language words depending on the direction of the index finger.
Note that the finger direction classification unit 124 does not need to classify directions for all fingers; for example, it may classify only the index finger, or only the three fingers consisting of the thumb, index finger, and middle finger.
Then, the finger direction classification unit 124 sets the unique value identifying the finger direction, as the finger direction information of the classification information C, in the management information in the storage unit 2 corresponding to the sign language word (motion data D) identified by the identification information ID.

以上説明したように手話単語分類情報生成装置１を構成することで、手話単語分類情報生成装置１は、手話単語動作を示すモーションデータから、手型、手指の位置方向による分類情報を生成することができる。
この分類情報を検索条件として用いることで、手型、手指の位置および向きから、手話動作が示す手話単語を検索することが可能になる。
なお、手話単語分類情報生成装置１は、コンピュータを、前記した各手段として機能させるための手話単語分類情報生成プログラムで動作させることができる。 By configuring the sign language word classification information generation device 1 as described above, the device can generate classification information based on the hand type and on the position and direction of the hand and fingers from motion data representing sign language word motions.
By using this classification information as a search condition, it becomes possible to search for the sign language word expressed by a sign language motion from the hand type and the position and orientation of the hand and fingers.
Note that the sign language word classification information generation device 1 can be realized by running a computer with a sign language word classification information generation program that causes the computer to function as each of the units described above.

〔手話単語分類情報生成装置の動作〕
次に、図９を参照（構成については適宜図１参照）して、手話単語分類情報生成装置１の動作について説明する。
まず、手話単語分類情報生成装置１は、代表フレーム検出手段１０において、記憶手段２に記憶されている手話単語ごとのモーションデータＤから、代表フレームを検出する。
すなわち、手話単語分類情報生成装置１は、代表フレーム検出手段１０の動き量算出手段１００によって、モーションデータＤのあるフレームとその前のフレームと間で、移動量および回転量の成分ごとの差の絶対値の総和、または、成分ごとの差の２乗和を求めることで、フレーム単位の動き量を算出する（ステップＳ１）。 [Operation of sign language word classification information generator]
Next, the operation of the sign language word classification information generation device 1 will be described with reference to FIG. 9 (see FIG. 1 as appropriate for the configuration).
First, the sign language word classification information generation device 1 uses the representative frame detection unit 10 to detect a representative frame from the motion data D of each sign language word stored in the storage unit 2.
That is, the sign language word classification information generation device 1 uses the motion amount calculation unit 100 of the representative frame detection unit 10 to calculate the per-frame motion amount by obtaining, between a given frame of the motion data D and the preceding frame, either the sum of the absolute values of the differences of the movement amount and rotation amount components, or the sum of the squares of those differences (step S1).

そして、手話単語分類情報生成装置１は、代表フレーム検出手段１０の動き量最少フレーム検出手段１０１によって、手話単語の動作が開始されたフレームと、手話単語の動作が終了したフレームとの間で、ステップＳ１で算出された動き量が最少となるフレームを、代表フレームとして検出する（ステップＳ２）。 Then, the sign language word classification information generation device 1 uses the minimum motion amount frame detection unit 101 of the representative frame detection unit 10 to detect, as the representative frame, the frame with the smallest motion amount calculated in step S1 among the frames between the frame in which the sign language word motion starts and the frame in which it ends (step S2).
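Steps S1 and S2 can be sketched in Python as follows. The sketch assumes each frame is a flat list of movement and rotation channel values, and that the start and end frames of the sign language word motion are already known and passed in as indices. How those frames are determined is outside this sketch, and all names are hypothetical.

```python
def motion_amounts(frames, squared=False):
    """Step S1: motion amount of each frame relative to the preceding frame.

    frames: list of equal-length lists of channel values (positions, rotations).
    Returns a list where element i - 1 is the motion amount of frame i.
    """
    amounts = []
    for previous, current in zip(frames, frames[1:]):
        differences = [c - p for p, c in zip(previous, current)]
        if squared:
            amounts.append(sum(d * d for d in differences))   # sum of squares
        else:
            amounts.append(sum(abs(d) for d in differences))  # sum of absolute values
    return amounts


def representative_frame(frames, start, end, squared=False):
    """Step S2: index of the frame in [start, end] with the smallest motion amount."""
    amounts = motion_amounts(frames, squared)
    # Frame 0 has no preceding frame, so the search begins at frame 1 at the earliest.
    candidates = range(max(start, 1), end + 1)
    return min(candidates, key=lambda i: amounts[i - 1])


# Hypothetical five-frame clip with two channels per frame.
frames = [[0.0, 0.0], [5.0, 5.0], [6.0, 5.0], [6.0, 5.5], [10.0, 9.0]]
index = representative_frame(frames, start=1, end=4)
```

In this clip the hand nearly stops at frame 3, so that frame is selected with either the absolute-value or the squared measure.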

ここで、代表フレーム検出手段１０は、記憶手段２に記憶されているすべてのモーションデータＤについて代表フレームを検出していない場合（ステップＳ３でＮｏ）、ステップＳ１に戻って、代表フレームの検出処理を繰り返す。一方、すべてのモーションデータＤについて代表フレームの検出が完了した場合（ステップＳ３でＹｅｓ）、代表フレーム検出手段１０は、モーションデータＤの識別情報ＩＤと代表フレームのフレーム番号とを型分類手段１１と、位置方向分類手段１２とに通知して、それぞれの分類手段１１，１２に制御を移す（ステップとして図示せず）。 Here, if the representative frame detection unit 10 has not yet detected representative frames for all the motion data D stored in the storage unit 2 (No in step S3), the process returns to step S1 and repeats the representative frame detection. On the other hand, when representative frame detection has been completed for all the motion data D (Yes in step S3), the representative frame detection unit 10 notifies the type classification unit 11 and the position / direction classification unit 12 of the identification information ID of the motion data D and the frame number of the representative frame, and transfers control to the classification units 11 and 12 (not shown as a step).

その後、手話単語分類情報生成装置１は、型分類手段１１のクラスタリング手段１１０によって、代表フレームにおける指部分の関節の回転量の組を１つのデータ単位とし、すべての代表フレームについて手型のクラスタリングを行う（ステップＳ４）。
そして、手話単語分類情報生成装置１は、クラスタリング手段１１０によって、ステップＳ４でクラスタリングしたクラスタ固有の値を、分類情報Ｃの手型情報として、代表フレームに対応するモーションデータＤに対応付けて記憶する（ステップＳ５）。 After that, the sign language word classification information generation device 1 uses the clustering unit 110 of the type classification unit 11 to cluster the hand types over all the representative frames, treating the set of rotation amounts of the finger joints in each representative frame as one data unit (step S4).
Then, the sign language word classification information generation device 1 uses the clustering unit 110 to store the cluster-specific value obtained by the clustering in step S4, as the hand type information of the classification information C, in association with the motion data D corresponding to the representative frame (step S5).

また、手話単語分類情報生成装置１は、型分類手段１１の手型画像生成手段１１１によって、ステップＳ４でクラスタリングされた手型に対応する画像（手型画像）を生成する（ステップＳ６）。
そして、手話単語分類情報生成装置１は、手型画像生成手段１１１によって、ステップＳ６で生成された手型画像を、手型情報に対応付けて記憶する（ステップＳ７）。 Further, the sign language word classification information generation device 1 generates an image (hand image) corresponding to the hand clustered in step S4 by the hand image generation unit 111 of the type classification unit 11 (step S6).
Then, the sign language word classification information generation device 1 uses the hand image generation unit 111 to store the hand image generated in step S6 in association with the hand type information (step S7).

また、手話単語分類情報生成装置１は、位置方向分類手段１２によって、個々の代表フレームにおいて、手の位置および手指の方向の分類を行う。
すなわち、手話単語分類情報生成装置１は、位置方向分類手段１２の手位置方向算出手段１２０によって、モーションデータＤに記述されている関節ノードの階層関係に基づく座標変換を行うことで、手（手首）の空間位置と手（手のひら）の方向とを算出する（ステップＳ８）。
そして、手話単語分類情報生成装置１は、位置方向分類手段１２の手位置分類手段１２１によって、ステップＳ８で算出された手（手首）の位置を予め区分した位置で分類し、その分類した位置を特定する値を、分類情報Ｃの手位置情報として、代表フレームに対応するモーションデータＤに対応付けて記憶する（ステップＳ９）。 In addition, the sign language word classification information generation device 1 classifies the position of the hand and the direction of the finger in each representative frame by the position / direction classification unit 12.
That is, the sign language word classification information generation device 1 uses the hand position / direction calculation unit 120 of the position / direction classification unit 12 to calculate the spatial position of the hand (wrist) and the direction of the hand (palm) by performing coordinate transformation based on the hierarchical relationship of the joint nodes described in the motion data D (step S8).
Then, the sign language word classification information generation device 1 uses the hand position classification unit 121 of the position / direction classification unit 12 to classify the hand (wrist) position calculated in step S8 into one of the predefined position categories, and stores the value identifying the classified position, as the hand position information of the classification information C, in association with the motion data D corresponding to the representative frame (step S9).

また、手話単語分類情報生成装置１は、位置方向分類手段１２の手方向分類手段１２２によって、ステップＳ８で算出された手（手のひら）の方向を予め区分した方向で分類し、その分類した方向を特定する値を、分類情報Ｃの手方向情報として、代表フレームに対応するモーションデータＤに対応付けて記憶する（ステップＳ１０）。 Further, the sign language word classification information generation device 1 uses the hand direction classification unit 122 of the position / direction classification unit 12 to classify the hand (palm) direction calculated in step S8 into one of the predefined direction categories, and stores the value identifying the classified direction, as the hand direction information of the classification information C, in association with the motion data D corresponding to the representative frame (step S10).

さらに、手話単語分類情報生成装置１は、位置方向分類手段１２の指方向算出手段１２３によって、モーションデータＤに記述されている関節ノードの階層関係に基づく座標変換を行うことで、指（人差し指）の方向を算出する（ステップＳ１１）。
そして、手話単語分類情報生成装置１は、位置方向分類手段１２の指方向分類手段１２４によって、ステップＳ１１で算出された指の方向を予め区分した方向で分類し、その分類した方向を特定する値を、分類情報Ｃの指方向情報として、代表フレームに対応するモーションデータＤに対応付けて記憶する（ステップＳ１２）。 Furthermore, the sign language word classification information generation device 1 uses the finger direction calculation unit 123 of the position / direction classification unit 12 to calculate the direction of the finger (index finger) by performing coordinate transformation based on the hierarchical relationship of the joint nodes described in the motion data D (step S11).
Then, the sign language word classification information generation device 1 uses the finger direction classification unit 124 of the position / direction classification unit 12 to classify the finger direction calculated in step S11 into one of the predefined direction categories, and stores the value identifying the classified direction, as the finger direction information of the classification information C, in association with the motion data D corresponding to the representative frame (step S12).

ここで、位置方向分類手段１２は、すべての代表フレームに対して、手の位置および手指の方向の分類が完了していない場合（ステップＳ１３でＮｏ）、ステップＳ８に戻って、分類処理を繰り返す。
そして、すべての代表フレームに対して、手の位置および手指の方向の分類が完了した段階（ステップＳ１３でＹｅｓ）、手話単語分類情報生成装置１は、動作を終了する。 Here, if the classification of the hand position and of the hand and finger directions has not been completed for all the representative frames (No in step S13), the position / direction classification unit 12 returns to step S8 and repeats the classification process.
When the classification of the hand position and the finger direction is completed for all the representative frames (Yes in step S13), the sign language word classification information generation device 1 ends the operation.

以上の動作によって、手話単語分類情報生成装置１は、記憶手段２に記憶されているモーションデータＤに対して、検索用の分類情報Ｃ（手型情報、手位置情報、手方向情報、指方向情報）を対応付けることができる。
また、手話単語分類情報生成装置１は、手型情報に対応付けて、検索用のサムネイル画像として手型画像を生成することができる。 Through the above operation, the sign language word classification information generation device 1 can associate classification information C for search (hand type information, hand position information, hand direction information, and finger direction information) with the motion data D stored in the storage unit 2.
Further, the sign language word classification information generation device 1 can generate a hand image as a thumbnail image for search in association with the hand information.

〔手話単語検索装置の構成〕
次に、図１０を参照して、手話単語検索装置３の構成について説明する。
手話単語検索装置３は、手話単語分類情報生成装置１（図１参照）で生成された分類情報を検索条件として、手話単語を検索するものである。なお、手話単語検索装置３は、モーションデータＤに対して、予め手話単語分類情報生成装置１（図１参照）によって分類情報等が設定された記憶手段２を接続している。
ここでは、手話単語検索装置３は、検索画面表示制御手段３０と、分類情報指定手段３１と、検索実行手段３２と、を備える。 [Configuration of sign language word search device]
Next, the configuration of the sign language word search device 3 will be described with reference to FIG. 10.
The sign language word search device 3 searches for sign language words using the classification information generated by the sign language word classification information generation device 1 (see FIG. 1) as search conditions. The sign language word search device 3 is connected to the storage unit 2, in which classification information and the like have been set in advance for the motion data D by the sign language word classification information generation device 1 (see FIG. 1).
Here, the sign language word search device 3 includes a search screen display control means 30, a classification information designation means 31, and a search execution means 32.

検索画面表示制御手段３０は、手話単語の検索条件を入力するための画面を表示するとともに、ユーザから検索条件を受け付けるユーザインタフェースとなる検索画面を制御するものである。
この検索画面表示制御手段３０は、記憶手段２に記憶されている分類情報Ｃ（手型情報、手位置情報、手方向情報、指方向情報）を、検索条件としてユーザが指定するための手話単語検索画面を生成し、表示装置Ｍの画面に表示する。
例えば、検索画面表示制御手段３０は、図１１に示すように、手型を検索条件として指定するためのプルダウンメニューのタイトル部分５１と、手の方向（手のひらの向き）を検索条件として指定するためのプルダウンメニューのタイトル部分５２と、指の方向を検索条件として指定するためのプルダウンメニューのタイトル部分５３と、を検索条件の項目を選択するボタンとして含んだ検索画面５０を表示する。 The search screen display control unit 30 displays a screen for inputting a search condition for sign language words and controls a search screen serving as a user interface for receiving the search condition from the user.
The search screen display control unit 30 generates a sign language word search screen on which the user specifies the classification information C (hand type information, hand position information, hand direction information, finger direction information) stored in the storage unit 2 as search conditions, and displays it on the screen of the display device M.
For example, as shown in FIG. 11, the search screen display control unit 30 displays a search screen 50 that includes, as buttons for selecting search condition items, the title portion 51 of a pull-down menu for specifying the hand type, the title portion 52 of a pull-down menu for specifying the hand direction (palm orientation), and the title portion 53 of a pull-down menu for specifying the finger direction.

また、検索画面表示制御手段３０は、図１１に示すように、手の位置を指定するためのチェック領域（ラジオボタン５４ａ）を含んだ領域指定画像５４を画面上に表示する。
さらに、検索画面表示制御手段３０は、図１１に示すように、検索条件指定後に検索の実行を受け付ける検索ボタン５５を画面上に表示する。 Further, as shown in FIG. 11, the search screen display control means 30 displays an area designation image 54 including a check area (radio button 54a) for designating the hand position on the screen.
Further, as shown in FIG. 11, the search screen display control means 30 displays a search button 55 on the screen for accepting execution of the search after specifying the search conditions.

この検索画面表示制御手段３０は、タイトル部分５１をマウス等の選択手段で選択されることで、後記する分類情報指定手段３１の手型指定手段３１０に制御を移す。
また、検索画面表示制御手段３０は、タイトル部分５２をマウス等の選択手段で選択されることで、後記する分類情報指定手段３１の手方向指定手段３１１に制御を移す。
また、検索画面表示制御手段３０は、タイトル部分５３をマウス等の選択手段で選択されることで、後記する分類情報指定手段３１の指方向指定手段３１２に制御を移す。
また、検索画面表示制御手段３０は、領域指定画像５４をマウス等の選択手段で選択されることで、後記する分類情報指定手段３１の手位置指定手段３１３に制御を移す。 This search screen display control means 30 transfers control to the hand type designation means 310 of the classification information designation means 31 described later by selecting the title portion 51 with a selection means such as a mouse.
Further, the search screen display control means 30 transfers the control to the hand direction designation means 311 of the classification information designation means 31 to be described later by selecting the title portion 52 with the selection means such as a mouse.
Further, the search screen display control means 30 transfers the control to the finger direction designation means 312 of the classification information designation means 31 described later by selecting the title portion 53 with a selection means such as a mouse.
Further, the search screen display control means 30 transfers the control to the hand position designation means 313 of the classification information designation means 31 to be described later by selecting the region designation image 54 with a selection means such as a mouse.

分類情報指定手段３１は、検索画面を介して、ユーザからの検索条件の指定を受け付けるものである。
ここでは、分類情報指定手段３１は、手型指定手段３１０と、手方向指定手段３１１と、指方向指定手段３１２と、手位置指定手段３１３と、を備える。 The classification information designating unit 31 receives designation of search conditions from the user via the search screen.
Here, the classification information specifying unit 31 includes a hand type specifying unit 310, a hand direction specifying unit 311, a finger direction specifying unit 312, and a hand position specifying unit 313.

手型指定手段３１０は、手型を検索条件として受け付けるものである。
この手型指定手段３１０は、検索画面表示制御手段３０から、検索画面において、手型を検索条件として指定された際に、記憶手段２に記憶されている手型画像Ｇをサムネイル画像として表示する。そして、手型指定手段３１０は、マウス等の選択手段でサムネイル画像を選択されることで、それに対応する手型情報を検索条件として特定する。 The hand type designation unit 310 receives a hand type as a search condition.
When the search screen display control unit 30 reports that the hand type has been chosen as a search condition on the search screen, the hand type designation unit 310 displays the hand images G stored in the storage unit 2 as thumbnail images. Then, when a thumbnail image is selected with a selection means such as a mouse, the hand type designation unit 310 specifies the corresponding hand type information as a search condition.

例えば、図１２に示すように、検索画面５０において、タイトル部分５１を選択されることで、手型指定手段３１０は、複数の手型を示すサムネイル画像５１ａを表示して、選択を受け付ける。
この手型指定手段３１０は、手型を選択され、検索条件の１つとして手型情報が特定された後、検索画面表示制御手段３０に制御を移す。
なお、検索条件として指定された手型情報は、図示を省略したメモリ上に保持され、検索実行時に検索実行手段３２によって参照される。 For example, as shown in FIG. 12, when the title portion 51 is selected on the search screen 50, the hand type designation unit 310 displays thumbnail images 51a indicating a plurality of hand types and accepts the selection.
This hand type designation means 310 moves the control to the search screen display control means 30 after the hand type is selected and the hand type information is specified as one of the search conditions.
Note that the hand type information specified as the search condition is held in a memory (not shown) and is referred to by the search execution means 32 when executing the search.

手方向指定手段３１１は、手（手のひら）の方向を検索条件として受け付けるものである。
この手方向指定手段３１１は、検索画面表示制御手段３０から、検索画面において、手の方向（手のひらの向き）を検索条件として指定された際に、記憶手段２に記憶されている手方向情報の分類項目を表示する。そして、手方向指定手段３１１は、マウス等の選択手段で項目を選択されることで、それに対応する手方向情報を検索条件として特定する。 The hand direction designation means 311 accepts the direction of the hand (palm) as a search condition.
When the search screen display control unit 30 reports that the hand direction (palm orientation) has been chosen as a search condition on the search screen, the hand direction designation unit 311 displays the classification items of the hand direction information stored in the storage unit 2. Then, when an item is selected with a selection means such as a mouse, the hand direction designation unit 311 specifies the corresponding hand direction information as a search condition.

For example, as shown in FIG. 13, when the title portion 52 is selected on the search screen 50, the hand direction designation means 311 displays the items into which hand directions have been classified in advance and accepts a selection.
After a hand direction has been selected and the hand direction information has been specified as one of the search conditions, the hand direction designation means 311 transfers control to the search screen display control means 30.
Note that the hand direction information designated as a search condition is held in a memory (not shown) and is referred to by the search execution means 32 when the search is executed.

The finger direction designation means 312 accepts the direction of the finger as a search condition.
When the search screen display control means 30 indicates that the finger direction (the direction in which the index finger points) has been designated as a search condition on the search screen, the finger direction designation means 312 displays the classification items of the finger direction information stored in the storage means 2. When an item is selected with a selection means such as a mouse, the finger direction designation means 312 specifies the corresponding finger direction information as a search condition.

For example, as shown in FIG. 14, when the title portion 53 is selected on the search screen 50, the finger direction designation means 312 displays the items into which finger directions have been classified in advance and accepts a selection.
After a finger direction has been selected and the finger direction information has been specified as one of the search conditions, the finger direction designation means 312 transfers control to the search screen display control means 30.
Note that the finger direction information designated as a search condition is held in a memory (not shown) and is referred to by the search execution means 32 when the search is executed.

The hand position designation means 313 accepts the position of the hand as a search condition.
When the search screen display control means 30 indicates that a hand position has been designated on the search screen, the hand position designation means 313 specifies the corresponding hand position information as a search condition.
For example, as shown in FIG. 15, when a region of the region designation image 54 is designated on the search screen 50, the hand position designation means 313 indicates that the region has been selected by, for example, changing the color of the radio button 54a.
After a hand position has been selected and the hand position information has been specified as one of the search conditions, the hand position designation means 313 transfers control to the search screen display control means 30.
Note that the hand position information designated as a search condition is held in a memory (not shown) and is referred to by the search execution means 32 when the search is executed.

The search execution means 32 accepts a search execution instruction from the user via the search screen.
Here, the search execution means 32 includes search means 320, search result display means 321, and sign language action display means 322.

The search means 320 searches for sign language words that match the search conditions designated by the classification information designation means 31.
When the search screen display control means 30 indicates that execution of a search has been instructed on the search screen (for example, when the search button 55 is pressed on the search screen 50 in FIG. 11), the search means 320 searches the storage means 2 for the labels L of sign language words whose classification information C matches the search conditions (hand type information, hand position information, hand direction information, and finger direction information) held by the classification information designation means 31.
Not all of the search conditions need to be designated through the classification information designation means 31; the search means 320 searches for the labels L of sign language words whose classification information C matches just the conditions that have been designated.
The search means 320 then outputs the labels L obtained as the search result to the search result display means 321.
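The matching rule described above, in which only the designated conditions take part in the comparison, can be sketched as follows. This is a minimal illustration rather than the implementation described in the specification: the field names, the category values such as "type-1" and "chest", and the in-memory dictionary standing in for the storage means 2 are all assumptions made for the example. It also assumes that a search with no designated condition returns every label, which the source does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    """Classification information C of one sign language word (hypothetical fields)."""
    hand_type: str
    hand_position: str
    hand_direction: str
    finger_direction: str


@dataclass
class SearchConditions:
    """Search conditions held in memory; None means 'not designated'."""
    hand_type: Optional[str] = None
    hand_position: Optional[str] = None
    hand_direction: Optional[str] = None
    finger_direction: Optional[str] = None


def search_labels(store: dict, cond: SearchConditions) -> list:
    """Return the labels L whose classification matches every designated condition."""
    # Keep only the conditions the user actually designated.
    designated = {k: v for k, v in vars(cond).items() if v is not None}
    return [
        label
        for label, c in store.items()
        if all(getattr(c, k) == v for k, v in designated.items())
    ]


# Hypothetical stand-in for the storage means 2: label L -> classification C.
STORE = {
    "word A": Classification("type-1", "chest", "down", "forward"),
    "word B": Classification("type-1", "face", "down", "up"),
    "word C": Classification("type-2", "chest", "left", "forward"),
}

by_type = search_labels(STORE, SearchConditions(hand_type="type-1"))
by_type_and_position = search_labels(
    STORE, SearchConditions(hand_type="type-1", hand_position="chest")
)
```

Designating only the hand type leaves two candidates here, and adding the hand position narrows them to one, which is the situation in which the candidate list and CG playback described below become useful.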

The search result display means 321 displays the sign language words (labels) found by the search means 320.
For example, as shown in FIG. 16, the search result display means 321 displays the sign language words of the search result in the search result word area 60. When the search result contains a plurality of sign language words, they are listed and displayed together.

The sign language action display means 322 displays, as CG, the action of a sign language word displayed by the search result display means 321.
When a sign language word displayed by the search result display means 321 is selected, the sign language action display means 322 reads the motion data D corresponding to the selected sign language word and plays it back as a CG moving image. Any general method may be used to generate the CG from the motion data D.

For example, as shown in FIG. 17, when a plurality of sign language word candidates (words A to C) are displayed in the search result word area 60 as the search result and one of them (word A) is selected, the sign language action display means 322 reads the motion data D whose label L is word A from the storage means 2 and plays back the sign language word action as CG in the CG drawing area 62.
As a result, even when, for example, several sign language words match the search conditions, the user can check the sign language actions and find the sign language word that corresponds to the desired action.

By configuring the sign language word search device 3 as described above, the sign language word search device 3 can search for the sign language word indicated by a sign language action from the hand type and the position and orientation of the hand and fingers during that action.
The sign language word search device 3 can be operated by a sign language word search program that causes a computer to function as each of the means described above.

[Operation of sign language word classification information generation device]
Next, the operation of the sign language word search device 3 will be described with reference to FIG. 18 (and, for the configuration, to FIG. 10 as appropriate).
First, the sign language word search device 3 uses the search screen display control means 30 to display a sign language word search screen 50 such as that shown in FIG. 11 on the display device M (step S20).
The sign language word search device 3 then uses the search screen display control means 30 to branch the processing according to the position on the search screen 50 that the user selects with a selection means such as a mouse (step S21).

When "hand type" is selected as a search condition item on the search screen 50, the sign language word search device 3 uses the hand type designation means 310 of the classification information designation means 31 to display the hand type images G stored in the storage means 2 as thumbnail images (see FIG. 12; step S22). The hand type designation means 310 then transfers control to the search screen display control means 30.

When one of the hand types (thumbnail images) displayed in step S22 is selected on the search screen 50, the sign language word search device 3 uses the hand type designation means 310 of the classification information designation means 31 to hold the hand type information corresponding to the selected hand type as a search condition (step S23). The hand type designation means 310 then transfers control to the search screen display control means 30.

When "hand direction" is selected as a search condition item on the search screen 50, the sign language word search device 3 uses the hand direction designation means 311 of the classification information designation means 31 to display the candidate palm orientations that serve as the classification items of the hand direction information (see FIG. 13; step S24). The hand direction designation means 311 then transfers control to the search screen display control means 30.

When one of the hand direction candidates displayed in step S24 is selected on the search screen 50, the sign language word search device 3 uses the hand direction designation means 311 of the classification information designation means 31 to hold the hand direction information corresponding to the selected hand direction as a search condition (step S25). The hand direction designation means 311 then transfers control to the search screen display control means 30.

When "finger direction" is selected as a search condition item on the search screen 50, the sign language word search device 3 uses the finger direction designation means 312 of the classification information designation means 31 to display the candidate finger orientations that serve as the classification items of the finger direction information (see FIG. 14; step S26). The finger direction designation means 312 then transfers control to the search screen display control means 30.

When one of the finger direction candidates displayed in step S26 is selected on the search screen 50, the sign language word search device 3 uses the finger direction designation means 312 of the classification information designation means 31 to hold the finger direction information corresponding to the selected finger direction as a search condition (step S27). The finger direction designation means 312 then transfers control to the search screen display control means 30.

When a "hand position" region is selected as a search condition on the search screen 50, the sign language word search device 3 uses the hand position designation means 313 of the classification information designation means 31 to display the selected region (see FIG. 15) and to hold the hand position information corresponding to the selected hand position as a search condition (step S28). The hand position designation means 313 then transfers control to the search screen display control means 30.

When the search button instructing execution of the search is pressed on the search screen 50, the sign language word search device 3 uses the search means 320 of the search execution means 32 to search for sign language words that match whichever search conditions were designated in steps S23, S25, S27, and S28, and uses the search result display means 321 to display the sign language words found (see FIG. 16; step S29).

Further, when one of the sign language word candidates displayed in step S29 is selected, the sign language word search device 3 uses the sign language action display means 322 of the search execution means 32 to read the motion data D corresponding to the selected sign language word and play back the sign language word action as a CG moving image (step S30).
Through the above operation, the sign language word search device 3 can search for the sign language word indicated by a sign language action from the hand type and the position and orientation of the hand and fingers, as selected by the user, during the sign language word action.
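The branching in steps S21 to S30 can be pictured as a small event dispatcher that returns to a common loop after each branch. The sketch below is only an illustration under stated assumptions: the event names ("open_item", "choose", "search", "pick_word"), the state dictionary, and the sample data are hypothetical and do not come from the specification, and playback of the motion data D is reduced to recording which word was picked.

```python
# Hypothetical stand-in for the storage means 2: label L -> classification items.
STORE = {
    "word A": {"hand_type": "type-1", "hand_position": "chest"},
    "word B": {"hand_type": "type-1", "hand_position": "face"},
}


def new_state() -> dict:
    """State that the search screen would keep between user selections."""
    return {"conditions": {}, "candidates": [], "results": [], "playing": None}


def dispatch(state: dict, event: str, arg=None) -> dict:
    """Branch on the selected screen element (step S21) and update the state."""
    if event == "open_item":
        # S22 / S24 / S26: list the classification items for the chosen condition.
        state["candidates"] = sorted({c[arg] for c in STORE.values()})
    elif event == "choose":
        # S23 / S25 / S27 / S28: hold the chosen value as a search condition.
        item, value = arg
        state["conditions"][item] = value
    elif event == "search":
        # S29: match only the conditions designated so far.
        state["results"] = [
            label
            for label, c in STORE.items()
            if all(c.get(k) == v for k, v in state["conditions"].items())
        ]
    elif event == "pick_word":
        # S30: record the word whose motion data D would be played back as CG.
        if arg in state["results"]:
            state["playing"] = arg
    else:
        raise ValueError(f"unknown event: {event}")
    return state
```

Each branch hands control back to the caller, mirroring how every designation means transfers control back to the search screen display control means 30 after it has done its work.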

[Modification]
The configuration and operation of the sign language word classification information generation device and the sign language word search device according to the embodiment of the present invention have been described above, but the present invention is not limited to this embodiment.
For example, to keep the explanation simple, the sign language word classification information generation device described here generates classification information from the type, position, and orientation of the right hand and the direction of the right-hand finger.
However, the sign language word classification information generation device may additionally generate classification information from the type, position, and orientation of the left hand and the direction of the left-hand finger, in the same way as for the right hand.
In that case, the sign language word search device may likewise add the left-hand classification information to the search conditions when searching for sign language words.

Also, the sign language word classification information generation device described here generates a hand type image when it classifies hand types, but generating a hand type image is not essential.
In that case, a known name (for example, "te" type or "ho" type) may be associated with each hand type classified in advance.

Also, the sign language word classification information generation device described here generates classification information that classifies the hand type, position, and orientation and the finger direction.
However, the sign language word classification information generation device does not necessarily need to generate classification information for all of these items. For example, the classification information for the finger direction may be omitted to obtain a simpler configuration.

Also, the sign language word classification information generation device described here identifies the position of the hand by the position of the wrist, but the position may instead be identified by, for example, the joint node at the tip of the index finger.
Also, the sign language word search device described here handles the hand type, position, orientation, and finger direction independently of one another.
However, the sign language word search device may also search for sign language words using search conditions such as the left and right hands having the same hand type, or the positions or directions of the left and right hands being symmetrical.
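Relational conditions of this kind compare the two hands with each other instead of comparing one item with a fixed value. A rough sketch is given below, assuming that classification information is held separately for the left and right hands; the category names, the mirroring table, and the sample entries are hypothetical, and the source does not define how symmetry would actually be judged.

```python
# Hypothetical direction categories; the source does not list the actual items.
MIRROR = {"left": "right", "right": "left"}


def mirrored(value: str) -> str:
    """Return the left-right mirror image of a category (others map to themselves)."""
    return MIRROR.get(value, value)


def same_hand_type(entry: dict) -> bool:
    """True when both hands were classified into the same hand type."""
    return entry["left"]["hand_type"] == entry["right"]["hand_type"]


def symmetric(entry: dict, item: str) -> bool:
    """True when the left-hand item is the mirror image of the right-hand item."""
    return entry["left"][item] == mirrored(entry["right"][item])


def search(store: dict, predicates) -> list:
    """Return the labels whose entries satisfy every relational condition."""
    return [label for label, entry in store.items() if all(p(entry) for p in predicates)]


# Hypothetical two-handed classification information, keyed by label L.
STORE = {
    "word A": {
        "left": {"hand_type": "type-1", "hand_direction": "left"},
        "right": {"hand_type": "type-1", "hand_direction": "right"},
    },
    "word B": {
        "left": {"hand_type": "type-2", "hand_direction": "down"},
        "right": {"hand_type": "type-1", "hand_direction": "down"},
    },
}

same_type = search(STORE, [same_hand_type])
mirrored_direction = search(STORE, [lambda e: symmetric(e, "hand_direction")])
```

Expressing each condition as a predicate over the whole entry lets relational conditions be combined freely with the per-item conditions used in the embodiment.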

DESCRIPTION OF SYMBOLS
1 Sign language word classification information generation device
10 Representative frame detection means
100 Motion amount calculation means
101 Minimum motion amount frame detection means
11 Type classification means
110 Clustering means
111 Hand type image generation means
12 Position/direction classification means
120 Hand position/direction calculation means
121 Hand position classification means
122 Hand direction classification means
123 Finger direction calculation means
124 Finger direction classification means
2 Storage means
3 Sign language word search device
30 Search screen display control means
31 Classification information designation means
310 Hand type designation means
311 Hand direction designation means
312 Finger direction designation means
313 Hand position designation means
32 Search execution means
320 Search means
321 Search result display means
322 Sign language action display means
C Classification information
D Motion data
G Hand type image

Claims

A sign language word classification information generation device that generates, from features of the shape, position, and orientation of the hand in the action of a sign language word, classification information used to search for the sign language word, the device comprising:
representative frame detection means for detecting, for each sign language word, a representative frame in which the amount of motion is smallest between the start and the end of the action, in motion data composed of the relative positions of the joints of the human body and the movement amount and rotation amount of each joint for each frame, a frame being a unit of the screen;
hand type classification means for classifying the sign language words by hand shape by clustering the sets of rotation amounts of the finger joints in the representative frames detected for the respective sign language words by the representative frame detection means, and for associating the classification result with the motion data of each sign language word as one item of the classification information;
hand position/direction calculation means for calculating, for each representative frame, the position and orientation of the wrist joint based on the movement amount and rotation amount of each joint from a joint serving as a predetermined reference position of the human body to the wrist joint;
hand position classification means for classifying the position of the wrist joint calculated by the hand position/direction calculation means into one of positions divided in advance, and for associating the classification result with the motion data of the sign language word corresponding to the representative frame as one item of the classification information; and
hand direction classification means for classifying the orientation of the wrist joint calculated by the hand position/direction calculation means into one of directions divided in advance, and for associating the classification result with the motion data of the sign language word corresponding to the representative frame as one item of the classification information.

The sign language word classification information generation device according to claim 1, further comprising:
finger direction calculation means for obtaining, for each representative frame, the positions of two predetermined joints of a finger based on the movement amounts and rotation amounts, and for calculating the direction in which the finger points from the relative positions of those joints; and
finger direction classification means for classifying the direction in which the finger points, as calculated by the finger direction calculation means, into one of directions divided in advance, and for associating the classification result with the motion data of the sign language word corresponding to the representative frame as one item of the classification information.

The sign language word classification information generation device according to claim 1, further comprising hand type image generation means for generating by CG, for each hand type classified by the hand type classification means, a hand type image that serves as a thumbnail image for search from one of the representative frames of the sign language words classified into that hand type.

A sign language word classification information generation program for causing a computer to function as each means of the sign language word classification information generation device according to any one of claims 1 to 3.