JPH08115408A

JPH08115408A - Finger language recognition device

Info

Publication number: JPH08115408A
Application number: JP6253457A
Authority: JP
Inventors: Hirohiko Sagawa; 浩彦佐川; Masaru Oki; 優大木
Original assignee: GIJUTSU KENKYU KUMIAI SHINJOHO; GIJUTSU KENKYU KUMIAI SHINJOHO SHIYORI KAIHATSU KIKO; Hitachi Ltd
Current assignee: GIJUTSU KENKYU KUMIAI SHINJOHO; GIJUTSU KENKYU KUMIAI SHINJOHO SHIYORI KAIHATSU KIKO; Hitachi Ltd
Priority date: 1994-10-19
Filing date: 1994-10-19
Publication date: 1996-05-07

Abstract

PURPOSE: To recognize sign language flexibly, efficiently, and accurately by first recognizing the hand motion pattern in units of motion elements and then integrating the motion element recognition results to recognize words, so that sign language words containing motions that change with the context or situation in which they are expressed can be recognized with emphasis on the constituent elements that do not change. CONSTITUTION: A sign language input unit 101 converts sign language motions into electric signals. A motion element recognition unit 102 recognizes, from the motion data, the constituent elements of the motions that make up sign language words. The recognition results output by the motion element recognition unit 102 are stored in a motion element recognition result storage unit 110. A sign language word recognition unit 111 collates the motion element information stored in the storage unit 110 with that in a sign language word dictionary 112 and recognizes a sign language word according to whether they match. The sign language word dictionary 112 stores the information on each sign language word to be recognized as a simultaneous and sequential combination of symbols representing the motion elements that make up that word.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a sign language recognition device that recognizes sign language, converts it into spoken language, and outputs it. In particular, it relates to a sign language recognition device that can support communication between hearing-impaired and hearing people by taking sign language as input and outputting the result in the form of spoken language.

【０００２】[0002]

[Prior Art] Conventional sign language recognition devices input sign language hand motion data through a glove-based device (data glove) and recognize sign language by applying to that data either general pattern matching techniques, such as those used in speech recognition and image recognition, or neural network techniques. In speech recognition and image recognition, which are fields close to sign language recognition, the features needed to recognize a pattern are computed and recognition is performed on the result. One approach recognizes the target directly from a number of feature parameters; another first recognizes the elements that make up a larger structure, such as phonemes or straight lines, and then recognizes the whole target from their combination. Japanese Patent Application No. 6-101097 (specification and drawings, "Sign Language Interpretation Device"), filed earlier by the present applicant, makes the boundaries of sign language words explicit by inserting a specific gesture or facial expression at each word boundary, or by inputting a signal from a separate input device at each word boundary, and recognizes sign language sentences word by word. Even with this method, however, parts of a sign language word change with the context and surrounding situation, so word boundaries cannot always be recognized clearly. Furthermore, in Japanese Patent Application No. 5-125698 (specification and drawings), also filed earlier by the present applicant, recognizing a hand motion was the same as recognizing a word. This has the same problem as above: sign language words change with the context and surrounding situation.

【０００３】[0003]

[Problems to Be Solved by the Invention] In general, a sign language sentence is expressed by continuously joining units called sign language words, and conversation takes place by exchanging such sentences. Some sign language words are always performed with the same motion, while others change in part depending on the context and surrounding situation. To recognize general sign language, it is therefore necessary to separate clearly the part of a word's motion that does not change with context or situation from the part that does. In most conventional sign language recognition techniques, however, the entire input sign language data is treated as a single pattern, so general sign language that changes with context and situation could not be recognized. For artificially created sign language words, on the other hand, there are techniques that separate the motion of a word into several elements and assign a specific meaning to a change in a specific element. For example, for signs indicating the suffix attached to the stem of an English word, past and past perfect can be expressed by fixing in advance that the same hand shape means past when the hand points down and past perfect when the hand points sideways. In general sign language, however, the part that changes with context or situation is not common to all words but differs from word to word, so such techniques cannot handle it. Moreover, because these techniques are based on recognizing static states, they cannot recognize signs in which the way the hand shape changes, or the way the hand moves, itself carries meaning.

【０００４】[0004]

In the end, a good way to recognize such sign language is to decompose a motion into its constituent elements, recognize each element, and then examine the combination of the recognition results with emphasis on the elements that change little. In sign language, however, the constituent elements of a motion are combined not only sequentially but also simultaneously. Speech recognition and image recognition both include methods that recognize the constituent elements of a target and then recognize the whole target from their combination, but speech recognition deals only with sequential combinations of elements, and image recognition basically deals only with simultaneous combinations. The processing methods of image recognition and speech recognition therefore cannot cope with a case such as sign language, in which multiple elements are combined both simultaneously and sequentially. An object of the present invention is to solve these problems of the prior art and to provide a sign language recognition device that can flexibly, efficiently, and accurately recognize general sign language, parts of which change with the context or situation in which it is expressed.

【０００５】[0005]

[Means for Solving the Problems] To achieve the above object, the sign language recognition device of the present invention decomposes the motion of a sign language word into motion elements such as hand shape, direction, position, manner of movement, and relation, and represents the word as a combination of symbols representing those elements. The combination includes motion elements connected sequentially (in time series) and motion elements connected simultaneously (in parallel). In recognition, the required motion elements (both sequential and simultaneous) are first recognized independently of one another and stored in a storage means. The motion elements required for the sign language word to be recognized are then retrieved from the stored recognition results, and the word is recognized by examining the simultaneous and sequential combinations of the retrieved elements. Each motion element is recognized with the recognition method best suited to it.

【０００６】[0006]

[Operation] In the present invention, a sign language word is represented as a combination of motion elements, and the word is recognized by evaluating the simultaneous and sequential combinations of the recognition results of independently recognized motion elements. A conventional sign language recognition device compares the hand motion pattern as a whole with standard hand motion patterns stored in a word dictionary and recognizes a word according to whether they match. In the present invention, by contrast, the hand motion pattern is first recognized in units of motion elements (see, for example, the partial patterns appearing in sign language motions in FIGS. 1 and 8). At this stage, recognition is independent of any particular word. The motion element results are then integrated to recognize the word. Because recognition is performed in smaller units (motion elements), the processing is simple. Recognizing a word, in turn, only requires integrating information on the presence or absence of motion elements, so that processing is also simple and fast. As a result, even sign language words containing motions that change with the context or situation can be recognized with emphasis on the constituent elements that do not change. Furthermore, each motion element can be recognized with a method suited to its nature, so sign language recognition is flexible, efficient, and accurate.

【０００７】[0007]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the drawings. FIG. 1 is a conceptual block diagram of a sign language recognition device according to one embodiment of the present invention. In FIG. 1, a sign language input unit 101 converts sign language motions into electric signals. A motion element recognition unit 102 recognizes, from the motion data, the constituent elements of the motions that make up sign language words (hereinafter, motion elements). The motion element recognition unit 102 consists of independent recognition units 103, 105, and 107, one for each motion element. Recognition parameters 104, 106, and 108, required for the respective recognition processes, are provided for recognition units 103, 105, and 107. In the motion element recognition unit 102, each motion element is recognized independently on the same input data. When one of the recognition units completes recognition, the input is known to have contained the motion element corresponding to that unit. The recognition results output by the motion element recognition unit 102 are stored in a motion element recognition result storage unit 110, which holds the information on each recognized motion element. A sign language word recognition unit 111 then collates the motion element information stored in the storage unit 110 with that in a sign language word dictionary 112 and recognizes a sign language word according to whether they match. The sign language word dictionary 112 stores the information on each sign language word to be recognized as a simultaneous and sequential combination of symbols representing the motion elements that make up the word. Finally, the sign language word recognized by the sign language word recognition unit 111 is output externally through an output unit 113.
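The collation performed by the sign language word recognition unit 111 can be illustrated with a minimal Python sketch. It assumes that the stored recognition results have already been grouped into time-ordered sets of simultaneously recognized motion elements, and that a dictionary entry matches when its sequential steps occur in order. The function name `matches`, the symbols such as `shape:1`, and the word names are hypothetical, and the weighting of elements is left out.

```python
from typing import Dict, List, Set


def matches(entry_steps: List[Set[str]], observed: List[Set[str]]) -> bool:
    """Return True if every step of the entry occurs, in order, in the observations.

    Each step is a set of motion elements that must be present simultaneously
    in one observed time slot (assumed matching rule, not taken from the source).
    """
    pos = 0
    for required in entry_steps:
        # Advance to the first remaining slot that contains all required elements.
        while pos < len(observed) and not required <= observed[pos]:
            pos += 1
        if pos == len(observed):
            return False
        pos += 1  # the next step must occur in a later slot
    return True


# Hypothetical stored recognition results (storage unit 110), in time order.
observed = [
    {"shape:1", "dir:up"},
    {"shape:1", "move:line"},
    {"shape:5"},
]

# Hypothetical dictionary (112): word name -> sequential steps of simultaneous elements.
dictionary: Dict[str, List[Set[str]]] = {
    "word_a": [{"shape:1"}, {"shape:5"}],
    "word_b": [{"shape:5"}, {"shape:1"}],
}

recognized = [word for word, steps in dictionary.items() if matches(steps, observed)]
```

Because each recognizer writes its results independently, the matcher only needs set containment and ordering, which is what keeps this stage simple.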

【０００８】[0008]

FIG. 2 shows an example hardware configuration for implementing the sign language recognition device of FIG. 1. In FIG. 2, a hand motion input device 201 converts the hand motions of sign language into electric signals. A well-known device that has sensors mounted on a glove and converts hand shape and movement into electric signals (for example, a data glove) can be used. The hand motion input device 201 converts sign language hand motions into multidimensional time-series data consisting of finger bend angles, hand position, and so on. That is, the motion is converted into data whose dimensions are classified by hand position, direction, and finger bending as shown in FIG. 3, described later; in the case of FIG. 3, this is 16-dimensional data. An arithmetic unit 202 recognizes motion elements and sign language words: it reads programs from memories 204 and 205 and performs recognition according to them. An output device 203 outputs the recognition result for a sign language word, either as text or as speech produced by speech synthesis. Memory 204 stores the program for recognizing motion elements, and memory 205 stores the program for recognizing sign language words. Memory 206 stores the parameters needed to recognize each motion element, memory 207 stores the sign language word dictionary, which describes the combination of motion elements needed to recognize each sign language word, and memory 208 stores the motion element recognition results for reference during sign language word recognition.

【０００９】[0009]

FIG. 3 shows the format of the hand motion data produced by the sign language input device of the present invention. In FIG. 3, 301 is the data on hand position, which consists of x-axis data 302, y-axis data 303, and z-axis data 304. 305 is the data on hand direction, which consists of the angle 306 about the x axis, the angle 307 about the y axis, and the angle 308 about the z axis. 309 is the data on finger bending, which consists of the bend angle 310 of the second joint of the thumb, the bend angle 311 of the third joint of the thumb, the bend angle 312 of the first joint of the index finger, the bend angle 313 of the second joint of the index finger, the bend angle 314 of the first joint of the middle finger, the bend angle 315 of the second joint of the middle finger, the bend angle 316 of the first joint of the ring finger, the bend angle 317 of the second joint of the ring finger, the bend angle 318 of the first joint of the little finger, and the bend angle 319 of the second joint of the little finger. 320, 321, and 322 represent the hand position, direction, and finger bending information at times t1, t2, and tn, respectively. Sign language motion is thus represented as time-series data consisting of hand position 301, hand direction 305, and finger bending 309.
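As an illustration of this format, the following Python sketch models one time sample as 3 position values, 3 direction angles, and 10 finger joint angles, giving the 16 dimensions mentioned for FIG. 3. The class name `HandFrame`, its field names, and the sample values are hypothetical; the source does not specify units or field order.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandFrame:
    """One time sample of hand motion data (hypothetical layout after FIG. 3)."""

    position: Tuple[float, float, float]   # x, y, z data (302 to 304)
    direction: Tuple[float, float, float]  # angles about x, y, z axes (306 to 308)
    finger_bend: Tuple[float, ...]         # ten joint bend angles (310 to 319)

    def __post_init__(self) -> None:
        if len(self.finger_bend) != 10:
            raise ValueError("expected 10 finger joint angles")

    def as_vector(self) -> List[float]:
        """Flatten the sample into a single 16-dimensional vector."""
        return [*self.position, *self.direction, *self.finger_bend]


# A sign language motion is then a time-ordered list of such samples (t1 ... tn).
frame = HandFrame(
    position=(0.1, 0.2, 0.3),
    direction=(0.0, 90.0, 0.0),
    finger_bend=tuple([10.0] * 10),
)
motion = [frame, frame]
```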

【００１０】[0010]

FIG. 4 shows the storage format of the parameters for recognizing motion elements, which are held in the motion element parameter memory of the present invention. In FIG. 4, the motion element name 401 is the name of the motion element whose recognition uses these parameters, the parameter count 402 is the number of parameters used to recognize that motion element, and 403 to 405 are the individual parameters. The parameter name 406 is a name describing the meaning of a parameter, and the parameter 407 is the value actually used in recognition. Concretely, the parameters are values such as the speed of hand movement, a measure of straightness, and the range indicating a hand position. FIG. 5 shows the format of the sign language word dictionary stored in the sign language word dictionary memory of the present invention. In FIG. 5, the sign language word name 501 is the name of the word represented by the combination of motion elements. The motion type 502 indicates whether the overall motion expressed by the combination is a single motion performed once or a repeated motion performed several times; for a repeated motion, the number of repetitions is also specified. The sequential combination count 503 is the number of motions combined sequentially. Sequential motions 504 to 506 are the individual motions combined sequentially. Each sequential motion in turn stores a simultaneous combination count 507, motion elements 508, 510, and 512, and weight values 509, 511, and 513 representing the importance of each motion element. The simultaneous combination count 507 is the number of motion elements combined simultaneously to express that sequential motion.
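The dictionary format of FIG. 5 maps naturally onto nested records. The Python sketch below is one possible in-memory representation, not the storage layout of the source; the class names, the encoding of the motion type as a repetition count, and the example entry with its symbols and weights are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WeightedElement:
    name: str      # symbol of a motion element (508, 510, 512)
    weight: float  # importance of that element (509, 511, 513)


@dataclass
class SequentialMotion:
    elements: List[WeightedElement]  # elements combined simultaneously

    @property
    def simultaneous_count(self) -> int:  # corresponds to 507
        return len(self.elements)


@dataclass
class SignWordEntry:
    word: str                      # sign language word name (501)
    repetitions: int               # motion type (502): 1 = single, >1 = repeated
    steps: List[SequentialMotion]  # sequential motions (504 to 506)

    @property
    def sequential_count(self) -> int:  # corresponds to 503
        return len(self.steps)


# Hypothetical entry: two sequential motions, the first with two simultaneous elements.
entry = SignWordEntry(
    word="example_word",
    repetitions=1,
    steps=[
        SequentialMotion([WeightedElement("shape:1", 1.0), WeightedElement("dir:up", 0.5)]),
        SequentialMotion([WeightedElement("move:line", 1.0)]),
    ],
)
```

Deriving the two counts from the lists, rather than storing them, avoids the counts and the contents drifting apart.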

【００１１】[0011]

Next, the sign language recognition method of the present invention is described in detail with reference to FIGS. 6 to 17. FIG. 6 is a flowchart of the processing performed by the arithmetic unit to recognize motion data input in the format of FIG. 3. In the recognition process, step 601 first determines whether the motion data being read from the sign language input device 201 has reached its end. If it has not, the process proceeds to step 602; if it has, recognition ends. Step 602 reads one time sample of motion data from the sign language input device 201. Step 603 then performs recognition of each motion element on the input motion data. The operation of step 603 is described in detail with reference to FIG. 7. Because there are many motion elements to recognize, step 701 first determines whether any motion element remains to be processed. If one remains, the process proceeds to step 702; if none remains, the process ends. Step 702 selects one motion element from those not yet processed. Step 703 then performs recognition for the selected motion element. As described above, recognition uses the parameters of each motion element to determine whether that element can be recognized.
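The two nested loops of FIGS. 6 and 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each recognizer is modeled as a function that receives the samples read so far and returns an evaluation value, or `None` when its element is not recognized. The names `recognize_elements` and `hand_high`, the element symbol, and the 0.5 threshold are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Sequence, Tuple

Frame = Sequence[float]
Recognizer = Callable[[List[Frame]], Optional[float]]


def recognize_elements(
    frames: Iterable[Frame],
    recognizers: Dict[str, Recognizer],
) -> List[Tuple[int, str, float]]:
    """Run every motion element recognizer on each incoming sample."""
    history: List[Frame] = []
    results: List[Tuple[int, str, float]] = []
    for t, frame in enumerate(frames):             # steps 601 and 602: read until the data ends
        history.append(frame)
        for name, recognizer in recognizers.items():  # steps 701 and 702: next unprocessed element
            score = recognizer(history)            # step 703: recognition for that element
            if score is not None:
                results.append((t, name, score))   # kept for later word recognition
    return results


def hand_high(history: List[Frame]) -> Optional[float]:
    """Hypothetical recognizer: fires when the latest y value exceeds 0.5."""
    return 1.0 if history[-1][1] > 0.5 else None


frames = [(0.0, 0.2, 0.0), (0.0, 0.8, 0.0)]
store = recognize_elements(frames, {"position:high": hand_high})
```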

【００１２】[0012]

FIG. 8 lists the motion elements of sign language, FIG. 9 illustrates the ranges of the motion element "direction", FIG. 10 illustrates the ranges of the motion element "position", and FIG. 11 illustrates the method of recognizing the motion element "direction". For example, motion elements denoted by numerals such as 1 to 5, katakana characters such as ア, イ, ウ, エ, オ, symbols such as 2B and 3A, and kanji such as 薬 and 佐 are recognized from the hand shape. Directions such as up, down, left, and right are recognized from the hand direction; body parts such as the mouth and head are recognized from the hand position; trajectories such as stationary, straight line, and circle are recognized from the path of movement; actions such as pinching and hooking are recognized from the relation between fingers; and sizes such as large, medium, and small are recognized from the magnitude of the movement. As FIG. 8 shows, the motion elements to be recognized are of many kinds, relating to hand shape, direction, position, manner of movement, and so on. Recognizing these motion elements is a form of pattern recognition, like speech recognition and image recognition. Three kinds of pattern recognition method are widely known: pattern matching, statistical methods, and procedural methods. For each motion element, the most suitable of these can be used according to the nature of the element. For hand shape, for example, many parameters determine the shape and subtle relations between parameters are often needed, so a method that collects a large amount of data and recognizes by statistical techniques is suitable. For hand direction, hand position, and the relation between the two hands, the number of parameters is small and the parameter ranges can be determined clearly, so a procedural method is suitable. For example, the ranges can be determined as in FIG. 9 for direction motion elements and as in FIG. 10 for position motion elements.

【００１３】[0013]

In this case, a direction motion element, for example, can be recognized from the input direction data and its range by using an evaluation function such as (Equation 1) below.

[Equation 1]

For example, when the direction sought is "up", the reference vector (x0, y0, z0) in (Equation 1) is vector 1101 in FIG. 11. If the input vector is 1102, Ad in (Equation 1) is the angle 1103 between the reference vector and the input vector. If the value of TH2 corresponds to the hatched range 1104, the evaluation value D1 given by (Equation 1) is greater than 0 within the hatched range 1104. A procedural method can also be used to recognize motion elements concerning hand movement, and motion elements such as straight lines and circular motion that can be turned into rules relatively easily. FIG. 12 is a flowchart of the method of recognizing the motion element "straight line", and FIG. 13 illustrates how the angle is evaluated in that recognition. A straight line can be recognized, for example, by the flow shown in FIG. 12. First, step 1201 determines the direction of the straight line to be found. Next, step 1202 finds the start point of the line and step 1203 finds its end point. In a linear hand motion, the motion usually stops or slows at its start and end points. The start and end points can therefore be found by computing the speed of the hand and finding its local minima. Step 1204 finds the angle between the direction of the line being sought and the direction of the line connecting the start and end points, that is, the angle 1303 between line 1301 and line 1302 in FIG. 13.
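The direction evaluation and the search for start and end points can be sketched in Python. Since the body of (Equation 1) is not reproduced in this text, the linear falloff used in `direction_score` is an assumption chosen only to satisfy the stated property that the evaluation value is greater than 0 inside the threshold range. The function names and the tie-breaking rule in `speed_minima` are likewise hypothetical.

```python
import math
from typing import List, Sequence


def direction_score(input_vec: Sequence[float], ref_vec: Sequence[float],
                    threshold_deg: float) -> float:
    """Score how closely input_vec points along ref_vec (assumed form, not Equation 1)."""
    dot = sum(a * b for a, b in zip(input_vec, ref_vec))
    norms = math.hypot(*input_vec) * math.hypot(*ref_vec)
    if norms == 0.0:
        return 0.0
    cos_a = max(-1.0, min(1.0, dot / norms))   # guard against rounding outside [-1, 1]
    angle = math.degrees(math.acos(cos_a))     # angle between reference and input vectors
    # Positive inside the threshold range, zero outside it.
    return 1.0 - angle / threshold_deg if angle < threshold_deg else 0.0


def speed_minima(points: Sequence[Sequence[float]]) -> List[int]:
    """Return indices i where the speed between points i and i+1 is a local minimum."""
    speeds = [math.dist(a, b) for a, b in zip(points, points[1:])]
    return [
        i for i in range(1, len(speeds) - 1)
        if speeds[i] < speeds[i - 1] and speeds[i] <= speeds[i + 1]
    ]


up = (0.0, 1.0, 0.0)
path = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (3.1, 0, 0), (4, 0, 0), (6, 0, 0)]
candidates = speed_minima(path)  # candidate start or end points of a stroke
```

Sampling is assumed to be uniform in time, so the distance between consecutive samples stands in for speed.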

[0014] In step 1205, it is determined whether the angle is less than or equal to a threshold value. If it is, the process proceeds to step 1206; if the angle is larger than the threshold value, the process ends. In step 1206, how far the trajectory from the start point to the end point deviates from the straight line is examined, and the maximum deviation is obtained. FIG. 14 is an explanatory diagram of the deviation evaluation method used in recognizing the motion element “straight line”. That is, in FIG. 14, perpendiculars 1405 are drawn from the trajectory 1403 between the start point 1401 and the end point 1402 to the straight line 1404 connecting the start point and the end point, and the maximum value 1406 of their lengths is obtained. In step 1207, it is determined whether the maximum deviation is less than or equal to a threshold value. If it is, the process proceeds to step 1208; if the deviation is larger than the threshold value, the process ends. In step 1208, the evaluation value of the detected straight line is calculated from the angle obtained in step 1204 and the deviation obtained in step 1206. For example, a formula such as (Equation 2) can be used to calculate the evaluation value.
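The angle test and the deviation test of steps 1204 to 1207 can be sketched as follows. This is only an illustration: it assumes 3-D hand positions, angles in degrees, and threshold values passed in by the caller, and the names `angle_between`, `max_deviation`, and `check_line` are hypothetical. The evaluation value of step 1208 is deliberately left out.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def max_deviation(trajectory):
    """Largest perpendicular distance from the trajectory to the start-end line."""
    start, end = trajectory[0], trajectory[-1]
    chord = [e - s for s, e in zip(start, end)]
    chord_sq = sum(c * c for c in chord)
    worst = 0.0
    for p in trajectory:
        rel = [a - s for s, a in zip(start, p)]
        t = sum(r * c for r, c in zip(rel, chord)) / chord_sq
        foot = [t * c for c in chord]  # foot of the perpendicular
        worst = max(worst, math.dist(rel, foot))
    return worst

def check_line(trajectory, direction, th_angle, th_dev):
    """Return (angle, deviation), or None if either threshold test fails."""
    chord = [e - s for s, e in zip(trajectory[0], trajectory[-1])]
    ang = angle_between(direction, chord)
    if ang > th_angle:          # step 1205
        return None
    dev = max_deviation(trajectory)
    if dev > th_dev:            # step 1207
        return None
    return ang, dev
```

For example, `check_line([(0, 0, 0), (1, 0.1, 0), (2, 0, 0)], (1, 0, 0), 20.0, 0.5)` accepts the trajectory, while tightening `th_dev` to 0.05 rejects it.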

[Equation 2]

Here, Ang is the angle between the straight line to be detected and the straight line connecting the start point and the end point, THa2 is the angle threshold, THd2 is the deviation threshold, d21 is the evaluation value for the angle, d22 is the evaluation value for the deviation, and D2 is the evaluation value for the detected straight line. In the same way, the detection processing for arcs, wavy lines, circles, spirals, and vibrations can also be made procedural and used for recognition. When it is difficult to formulate rules, as with a free curve, a method based on pattern matching can be used. Methods well known in speech recognition and image recognition can be used for the pattern matching; for example, DP matching (Yoshiaki Shirai, ed., "Pattern Understanding", Ohmsha, 1987) can be used. The parameters used when recognizing a motion element are read from the memory 202, where the parameters of the corresponding motion element are stored. Instead of recognizing each motion element with a single recognition method, recognition processing may also be performed with a plurality of recognition methods, and the best of the resulting evaluation values may be used as the recognition result for that motion element.
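DP matching is commonly realized as a dynamic-programming alignment of two sequences (dynamic time warping). The following Python sketch shows that general technique only; the one-dimensional feature sequences, the absolute-difference local cost, and the function name `dp_match` are assumptions made for illustration and are not taken from the specification or the cited reference.

```python
import math

def dp_match(a, b):
    """Minimum accumulated cost of aligning sequence a with sequence b."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

A small cost means the input trajectory resembles the stored template even when the two are performed at different speeds, which is why this family of methods suits free curves that are hard to describe by rules.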

[0015] After the recognition processing of the selected motion element is performed in step 703 of the motion element recognition flow shown in FIG. 7, it is determined in step 704 whether that motion element was recognized. When an evaluation formula such as (Equation 1) or (Equation 2) is used, this determination can be made according to whether the evaluation value is greater than 0. If the evaluation value is 0, it can be judged that the motion element was not detected, and the process returns to step 701 to perform recognition processing for another motion element. If the evaluation value is greater than 0, it is judged that the motion element was detected, and in step 705 the motion element information, consisting of the start time, end time, and evaluation value of the detected motion element, is stored in the memory 208 for motion element recognition results in FIG. 2. After step 705 is completed, the process returns to step 701 and recognition processing is performed for another motion element. After each motion element is recognized in step 603 of the processing flow of the arithmetic unit shown in FIG. 6, it is determined in step 604 whether a new motion element was detected. This determination can be made by referring to the contents of the memory 208 in FIG. 2 and checking whether a new motion element has been stored. If a new motion element has been stored, the process proceeds to step 605 and sign language word recognition is performed. If no new motion element has been stored, the process returns to step 601 and the next data is processed.
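The loop over motion elements (steps 701 to 705) and the "new element" check of step 604 can be sketched as below. The record type `ElementResult`, the recognizer interface returning `(start, end, score)`, and the use of a plain list in place of the memory 208 are all assumptions made for this illustration.

```python
from dataclasses import dataclass

@dataclass
class ElementResult:
    name: str      # symbol of the motion element
    start: float   # start time
    end: float     # end time
    score: float   # evaluation value

def recognize_elements(recognizers, data, store):
    """Try every motion element and store those whose evaluation value is > 0.

    Returns how many results were added, so the caller can decide whether
    word recognition needs to run (the step 604 check).
    """
    added = 0
    for name, recognize in recognizers.items():
        start, end, score = recognize(data)
        if score > 0:  # a score of 0 means "not detected"
            store.append(ElementResult(name, start, end, score))
            added += 1
    return added
```

A caller would invoke `recognize_elements` once per time step and proceed to word recognition only when the returned count is non-zero.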

[0016] FIG. 15 is an operation flowchart of a sign language word recognition method according to an embodiment of the present invention, and FIG. 16 is a diagram explaining the overlap between motion elements. The method of recognizing sign language words in step 605 of FIG. 6 will be described with reference to FIGS. 15 and 16. A sign language word is expressed as a simultaneous and sequential combination of motion elements, and this information is stored in the sign language word dictionary in the memory 207. First, in step 1501, sign language words that include the motion element newly detected in step 603 of FIG. 6 are retrieved from the sign language word dictionary stored in the memory 207. Next, in step 1502, it is determined whether any sign language word remains to be processed; if so, the process proceeds to step 1503, and if not, the process ends. In step 1503, one sign language word to be recognized is selected from those retrieved in step 1501. Next, in step 1504, the motion elements constituting that sign language word are searched for in the memory 208, in which the motion element recognition results are stored. In step 1505, it is determined whether all the motion elements required for the sign language word being recognized have been found; if so, the process proceeds to step 1506. If not all of the motion elements are present, the process returns to step 1502 and another sign language word is processed. In step 1506, the motion elements that are expressed simultaneously, that is, the motion elements 1 to n described in each of the sequential operations 507 to 513 in FIG. 5, are checked. The simultaneous motion elements are checked by obtaining the overlap of the time ranges of the required motion elements.
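The dictionary search of step 1501 and the completeness test of step 1505 can be sketched as follows. The representation is an assumption: each word maps to a list of sequential operations, and each sequential operation is a set of motion element symbols that must occur simultaneously. The function names and the word and symbol names in the example are hypothetical.

```python
def candidate_words(dictionary, new_element):
    """Words whose definition contains the newly detected motion element."""
    return [word for word, sequence in dictionary.items()
            if any(new_element in simultaneous for simultaneous in sequence)]

def has_all_elements(sequence, detected_names):
    """True when every element the word needs has a stored recognition result."""
    return all(simultaneous <= detected_names for simultaneous in sequence)

# Hypothetical dictionary: word -> sequential operations -> simultaneous elements.
dictionary = {
    "thank_you": [{"flat_hand", "pos_chest"}, {"line_forward"}],
    "round": [{"fist"}, {"circle"}],
}
```

Restricting the search to words that contain the new element keeps the per-step work small, since most dictionary entries cannot have been completed by that element.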

[0017] When, for example, three motion elements exist as shown in FIG. 16, this overlap corresponds to the range 1601. When there are n motion elements, the overlap can be obtained by (Equation 3).

[Equation 3]

In FIG. 16, the time axis runs from smaller values on the left to larger values on the right. As expressed in (Equation 3), the maximum of the start points of the motion elements i is therefore the start point of the overlap, and the minimum of the end points of the motion elements i is the end point of the overlap. In step 1507, it is determined whether the detected positions of the simultaneously expressed motion elements are within the allowable range. This determination can be made by checking whether the overlap range obtained by (Equation 3) is greater than or equal to a certain threshold. If the detection range of the simultaneous motion elements is within the allowable range, the process proceeds to step 1508. If it is outside the allowable range, the process returns to step 1502 and recognition processing is performed for another sign language word. In step 1508, the relationship between the sequentially expressed motions is examined, that is, whether the sequential operations 504, 505, and 506 in FIG. 5 have been detected in that order within the allowable range. For this purpose, the overlap ranges of the simultaneously expressed motion elements obtained in step 1506 are used. Each overlap range is taken as the detection range of the corresponding sequential operation, and the temporal relationship between these ranges is examined. The relationship between ranges can be evaluated, for example, by the ratio between the size of the range of one operation and of the operation that follows it and the size of the overlap or gap between them. Under this definition, a formula such as (Equation 4) can be used as the evaluation value of the relationship between the i-th and (i+1)-th sequential operations.
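The overlap described for (Equation 3) and the threshold test of step 1507 can be sketched in Python as follows. Time ranges are assumed to be `(start, end)` pairs, and the names `overlap` and `simultaneous_ok` are hypothetical.

```python
def overlap(ranges):
    """Common time range: latest start point and earliest end point."""
    start = max(s for s, _ in ranges)
    end = min(e for _, e in ranges)
    return start, end

def simultaneous_ok(ranges, min_overlap):
    """True when the common range is at least min_overlap long."""
    start, end = overlap(ranges)
    # A negative length means the elements never occur at the same time.
    return (end - start) >= min_overlap
```

For three elements detected over `(0, 10)`, `(2, 8)`, and `(4, 12)`, the common range is `(4, 8)`, so the check passes for a threshold of 3 and fails for a threshold of 5.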

[Equation 4]

(Equation 4) gives the ratio of the size of the gap or overlap between the i-th and (i+1)-th sequential operations to the size of the range of whichever sequential operation has the smaller detection range. This is the ratio of 1701 to 1702 in FIG. 17. FIG. 17 is a diagram explaining the relationship between sequential operations.
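The ratio described for (Equation 4) can be sketched as below. The sketch follows the verbal description only, since the equation itself is not reproduced in the text; it assumes `(start, end)` ranges with the second range belonging to the later operation, and the name `sequence_relation` is hypothetical.

```python
import math

def sequence_relation(first, second):
    """Gap or overlap between two consecutive ranges, relative to the shorter one."""
    (s1, e1), (s2, e2) = first, second
    gap_or_overlap = abs(s2 - e1)       # gap if s2 > e1, overlap if s2 < e1
    shorter = min(e1 - s1, e2 - s2)
    if shorter <= 0:
        return math.inf                 # degenerate range: treat as unrelated
    return gap_or_overlap / shorter
```

A small value means the two operations follow each other closely. The sketch does not verify the order of the operations, so a caller would still need to confirm that the second range does not precede the first.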

[0018] In step 1509, it is determined whether the relationships between the sequential operations obtained in step 1508 are less than or equal to a threshold value. If the relationships between all of the sequential operations are less than or equal to the threshold value, the process proceeds to step 1510. If any is larger than the threshold value, the process returns to step 1502 and recognition processing is performed for another sign language word. In step 1510, to notify the user that a sign language word has been recognized, the spoken-language word name corresponding to that sign language word, that is, the sign language word name 501 in FIG. 5, is output. FIG. 18 is a hardware configuration diagram of a sign language recognition device showing a second embodiment of the present invention, in which two arithmetic units are used. Besides the configuration with a single arithmetic unit shown in FIG. 2, the hardware may be configured with two arithmetic units as shown in FIG. 18. In this case, the arithmetic unit a 1801 recognizes motion elements and the arithmetic unit b 1802 recognizes sign language words, and the two arithmetic units are coupled via the memory 1803. The recognition algorithm of each arithmetic unit is one part of the recognition processing of FIG. 6, which is divided between them. FIG. 19 is a flowchart of the recognition processing of the arithmetic unit a 1801 in FIG. 18. In FIG. 19, it is determined in step 1901 whether the motion data to be read from the hand movement input device 201 has reached its end; if so, the process ends. If not, the process proceeds to step 1902, where the data for one time instant is read from the hand movement input device 201. The process then proceeds to step 1903, where recognition processing is performed for each motion element. The recognition processing of each motion element can be performed according to the flowchart of FIG. 7.

[0019] FIG. 20 is a flowchart of the recognition processing of the arithmetic unit b 1802 in FIG. 18. In FIG. 20, first, in step 2001, the motion element recognition results in the memory 1803 of FIG. 18 are searched for newly detected motion elements. Next, in step 2002, it is determined whether the search found a new motion element; if so, the process proceeds to step 2003. If no new motion element has been detected, the process proceeds to step 2004, where it is checked whether the search for motion elements has covered the input data up to its last time. If it has, the process ends; otherwise, the process returns to step 2001 and searches for new motion elements again. In step 2003, sign language word recognition processing is performed for the newly detected motion element. This processing can be performed according to the flowchart shown in FIG. 15. FIG. 21 is a hardware configuration diagram of a sign language recognition device showing a third embodiment of the present invention, in which an image input device is added as a sign language input device. As shown in FIG. 21, in addition to the hand movement input device 2101, an image input device 2102 may be used as a sign language input device in order to recognize facial expressions, mouth movements, face movements, and the like. In this case, methods well known in image recognition can be used to recognize motion elements such as facial expressions and face movements detected from images. Furthermore, by treating the motion elements detected from images in the same way as the motion elements shown in FIG. 8, the format shown in FIG. 5 can be used for the sign language word dictionary as is. The method shown in FIG. 15 can likewise be used as is to recognize sign language words.
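The polling loop of the word-recognition unit (steps 2001 to 2004) can be sketched as below. This is a single-process illustration only: a Python list stands in for the shared memory 1803, a cursor marks how far the results have already been consumed, and `fetch_new`, `word_recognizer_loop`, `input_done`, and `on_element` are hypothetical names.

```python
def fetch_new(results, cursor):
    """Return the results stored since the last poll and the advanced cursor."""
    new = results[cursor:]
    return new, cursor + len(new)

def word_recognizer_loop(results, input_done, on_element):
    """Consume motion element results until the input is exhausted."""
    cursor = 0
    while True:
        new, cursor = fetch_new(results, cursor)   # steps 2001 and 2002
        if new:
            for element in new:
                on_element(element)                # step 2003
        elif input_done():                         # step 2004
            break

seen = []
word_recognizer_loop(["flat_hand", "line_forward"], lambda: True, seen.append)
```

With two real processors, the writer and the reader would run concurrently, so access to the shared results would need whatever synchronization the hardware or runtime provides; that aspect is not covered by the specification and is omitted here.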

[0020]

[Effects of the Invention] As described above, according to the present invention, hand movement patterns are first recognized in units of motion elements, and the recognition results of the motion elements are then integrated to recognize words. Consequently, even sign language words that include motions that change with the context or situation in which they are expressed can be recognized with emphasis on the constituent elements that do not change. Moreover, each motion element can be recognized with a method suited to its properties, so sign language recognition can be performed flexibly, efficiently, and accurately.

[Brief description of drawings]

FIG. 1 is a functional conceptual block diagram of a sign language recognition device showing an embodiment of the present invention.

FIG. 2 is a hardware configuration diagram of a sign language recognition device showing a first embodiment of the present invention.

FIG. 3 is a format diagram of the motion data input from the sign language input device of the present invention.

FIG. 4 is a format diagram of the parameters used for recognizing motion elements in the present invention.

FIG. 5 is a format diagram of the sign language word dictionary in the present invention.

FIG. 6 is a flowchart for explaining the sign language recognition method of the arithmetic unit in FIG. 2.

FIG. 7 is a flowchart for explaining the method of recognizing motion elements in FIG. 6.

FIG. 8 is a diagram showing a list of the motion elements in sign language in the present invention.

FIG. 9 is a diagram for explaining the range of the motion element "direction" in the present invention.

FIG. 10 is a diagram for explaining the range of the motion element "position" in the present invention.

FIG. 11 is a diagram for explaining the method of recognizing the motion element "direction" in the present invention.

FIG. 12 is a flowchart for explaining the method of recognizing the motion element "straight line" in the present invention.

FIG. 13 is a diagram for explaining the angle evaluation method used in recognizing the motion element "straight line" shown in the flow of FIG. 12.

FIG. 14 is a diagram for explaining the deviation evaluation method used in recognizing the motion element "straight line" of FIG. 12.

FIG. 15 is a flowchart for explaining the method of recognizing sign language words in the present invention.

FIG. 16 is a diagram for explaining the overlap between motion elements in the present invention.

FIG. 17 is a diagram for explaining the relationship between the sequential operations in FIG. 15.

FIG. 18 is a hardware configuration diagram of a sign language recognition device using two arithmetic units, showing a second embodiment of the present invention.

FIG. 19 is a flowchart for explaining the operation of the arithmetic unit that recognizes motion elements in FIG. 18.

FIG. 20 is a flowchart for explaining the operation of the arithmetic unit that recognizes sign language words in FIG. 18.

FIG. 21 is a hardware configuration diagram of a sign language recognition device to which an image input device is added as a sign language input device, showing a third embodiment of the present invention.

Claims

[Claims]

1. A sign language recognition device comprising: sign language input means for converting the shape or movement pattern of a hand into an electric signal and inputting it to recognition means; motion element recognition means for recognizing each motion element in a sign language word from the input hand information, the hand movement pattern having been divided in advance into motion elements, which are the constituent elements of the movement; motion element recognition result storage means for storing the recognition results of the motion elements; a sign language word dictionary that stores sign language data representing sign language words expressed as combinations of symbols representing motion elements; sign language word recognition means for recognizing a sign language word composed of the motion elements by collating the recognized motion elements with the contents of the sign language data; and output means for outputting, by voice or characters, the spoken-language word corresponding to the recognized sign language word.

2. The sign language recognition device according to claim 1, wherein the motion element recognition means recognizes each motion element independently and stores the recognized motion elements in the motion element recognition result storage means.

3. The sign language recognition device according to claim 2, wherein the motion element recognition means performs recognition processing using a recognition method selected according to the property of each motion element.

4. The sign language recognition device according to claim 3, wherein the motion element recognition means uses a statistical method for recognizing hand shape, a procedural method for recognizing hand direction and position and for recognizing simple hand movements, and a pattern matching method for recognizing complicated hand movements.

5. The sign language recognition device according to claim 3, wherein the motion element recognition means applies a plurality of recognition methods available for recognizing a constituent element and adopts the result of the method that gives the best result as the recognition result of that constituent element.

6. The sign language recognition device according to claim 2, wherein the motion element recognition means and the sign language word recognition means operate independently and are connected via the storage means for the motion element recognition results.

7. The sign language recognition device according to claim 1, wherein the sign language word dictionary stores, as the sign language data, symbols representing the motion elements necessary for expressing the basic motion of each sign language word, together with a weight value for each motion element based on the degree to which that motion element changes depending on the context or situation in which the sign language word is expressed.

8. The sign language recognition device according to claim 1, wherein the sign language word dictionary stores, for each sign language word, symbols for only those motion elements that do not change depending on the context or situation in which the sign language word is expressed.

9. The sign language recognition device according to claim 1, wherein, when collating the recognition results of the motion elements with the sign language data, the sign language word recognition means searches for the recognition results of the motion elements necessary for the sign language word to be collated, and performs the collation by checking whether the temporal relationship between the retrieved motion elements is the same as the temporal relationship between the motion elements stored in the sign language data.

10. The sign language recognition device according to claim 9, wherein, when collating the recognition results of the motion elements with the sign language data, the sign language word recognition means, upon recognition of a new motion element, collates only the sign language data that includes that motion element.

11. A sign language recognition device comprising: sign language input means for converting the shape or movement pattern of a hand into an electric signal and inputting it to recognition means; image input means for converting an image into an electric signal and inputting it; motion element recognition means for recognizing each motion element in a sign language word from the input hand information, the hand movement pattern having been divided in advance into motion elements, which are the constituent elements of the movement; facial expression motion recognition means for recognizing, from the input face image, the motion elements of facial expression, mouth movement, and face movement in sign language; a sign language word dictionary that stores sign language data representing sign language expressed by motion elements of hand movement, facial expression, mouth movement, and face movement; sign language word recognition means for recognizing sign language by collating the recognized motion elements with the contents of the sign language data; and output means for outputting, by voice or characters, the spoken-language word corresponding to the recognized sign language word.