JP3615008B2

JP3615008B2 - Sign language recognition device

Info

Publication number: JP3615008B2
Application number: JP01110397A
Authority: JP
Inventors: 浩彦佐川; 勝竹内; 優大木
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1997-01-24
Filing date: 1997-01-24
Publication date: 2005-01-26
Anticipated expiration: 2017-01-24
Also published as: JPH10208023A

Description

【０００１】
【発明の属する技術分野】
本発明は、手話を入力して、その結果を音声または文字の形で正確に出力することにより、聴覚障害者と健聴者とのコミュニケーションを支援する手話認識装置に関するものである。
【０００２】
【従来の技術】
従来より、手話を入力して、その入力結果を解析しその手話を認識する装置が種々提案されている。これら従来の認識方法としては、手話の動作データをパターン照合の技術（例えば、特願平５−１２５６９８号明細書および図面参照）や、ニューラルネットワークの技術を用いて認識を行う技術と、手話動作を構成する動作の基本単位に基づいて手話の認識を行う技術（例えば、特願平６−２５３４５７号明細書および図面参照）がある。
前者（特願平５−１２５６９８号）の手法では、全体としての手動作パターンを単語辞書に格納された標準手動作パターンと比較して、一致するか否かにより認識する方法であり、後者（特願平６−２５３４５７号）の手法では、手動作パターン自体を比較するのではなく、手動作パターンを先ず動作の基本単位毎（部分パターン）に認識し、次にその動作の基本単位の結果を統合して単語を認識するものである。このように、後者の技術では、先ず認識を行う手話を構成する動作の基本単位を全て認識する。その結果と、あらかじめ手話を動作の基本単位の組み合わせとして記憶してある手話テンプレート中の動作の基本単位の時間的な関係とを比較することにより認識を行うのである。この場合、手話テンプレート中の動作の基本単位の種類や属性は全て記号によって記述されていた。
【０００３】
【発明が解決しようとする課題】
従来における動作の基本単位に基づく手話認識技術（前述した後者の技術）では、静的な動作の基本単位も動的な動作の基本単位も同時に認識を行い、認識された動作の基本単位を統合することにより手話を認識していた。このうち、静的な動作の基本単位とは、形状（手の形等）や方向（指の方向等）など、ある時間範囲においてパラメータが安定した状態を示す動作の特徴である。一方、動的な動作の基本単位は、直線運動や円運動などある時間範囲におけるパラメータの変化（移動等）を表す動作の特徴である。
静的な動作の基本単位を検出するためには、基準となるパラメータ（サンプルをとって、そこから抽出したパラメータ）と入力されたパラメータの差に対して閾値を設定し、その閾値以内の区間を検出することによって行えばよい。
しかしながら、静的な動作の基本単位のパラメータは変化しやすいため、認識したい時間範囲より大きい範囲で認識されたり、小さい範囲で検出されたりすることが多い。例えば、毎回同じような形で、かつ入力されたパラメータが閾値の境界のところで変動すると、閾値を往復してしまう結果、小間切れの区間として検出されてしまうことがある。逆に大きい区間として検出する場合も生じる。また、パラメータの変動の状態によっては、一つの動作の区間が二つ以上に分割されて認識される場合もある。
このため、静的な動作の基本単位と動的な基本単位を対等に扱っていた従来技術では、静的な動作の基本単位については正しい評価が行えないという問題があった。
また、従来の技術では、手話テンプレートを全て動作の基本単位を表す記号によって記述していたため、動作の基本単位の認識ではあらかじめ設定してある基準値に基づき認識を行い、その評価値も基準に基づいて計算を行っていた。
さらに、手話の認識結果も認識された動作の基本単位の評価値（基準値にどの程度近いかを表わす値で、近い程高い値）に基づいて計算を行っていた。このため、実際の動作におけるパラメータの範囲とあらかじめ決定していた基準値によって与えられるパラメータの範囲のずれにより、正しい評価値が得られないことが多く、認識精度が低いという問題があった。すなわち、パラメータの集中する場所は、基準値に近いとは限らず、基準値の中間位置に集中することがある。このような場合には、入力されたパラメータと基準値との差は大きくなり、評価値としては小さくなってしまう。
そこで、本発明の目的は、このような従来の課題を解決し、全ての動作の基本単位を正しく評価し、精度良く手話を認識する手話認識装置を提供することである。
【０００４】
【課題を解決するための手段】
上記目的を達成するため、本発明の手話認識装置では、動的な動作の基本単位を検出し、それらにより構成される逐次的な動作の単位を認識した後、認識した逐次的な動作の単位の時間範囲内において静的な動作の基本単位の評価結果を統合する。静的な動作の基本単位のみによって構成される逐次的な動作の単位は、各時刻における動作全体の評価値を求め、評価値が極大になる時刻を求めることにより認識を行う。また、動作の基本単位を正しく評価するために、手話テンプレートに記述される動作の基本単位は、実際の手話データを用いて決定した属性値を連続量を用いて記述する。認識された動作の基本単位の評価値は、その連続量を用いて求める。
このように、動的な動作の基本単位の認識結果から決定される時間範囲に基づいて静的な動作の基本単位を評価するため、静的な動作の基本単位の認識範囲に関する問題がなくなる。
また、静的な動作の基本単位のみによって構成される逐次的な動作の単位についても、各時刻毎に逐次的な動作の単位全体の評価値を求め、評価値の極大値となる時刻を検出することにより認識を行うため、それぞれの動作の基本単位の検出時間に影響を受けることがなくなる。
さらに、動作の基本単位の特徴を表す属性値を実際の動作データから求めた連続量によって表現し、それに基づいて認識された動作の基本単位を評価するため、動作の基本単位の適切な評価値を求めることが可能となり、手話の認識精度を向上することができる。
【０００５】
【発明の実施の形態】
以下、本発明の実施例を、図面により詳細に説明する。
図２は、本発明における動作要素に基づく手話動作モデルを示す図である。
本発明を説明するために、まず手話動作のモデルを説明する。手話動作は、動作の基本単位の組み合わせにより構成される。この手話における動作の基本単位を、以後動作要素と呼ぶことにする。手話の動作を構成する動作要素間には時間的な逐次性および同時性があるため、手話の認識を行うためには、それら動作要素間の時間的な関係も記述しておく必要がある。このために、図２に示すモデルを使用する。図２において、２０１は手話形態素の動作全体を表す。手話形態素は、手話における意味の単位である。手話形態素は、まず逐次的な動作単位である逐次要素２０２，２０３，２０４に分解される。横軸は左端を基準として時間の経過を表わしているので、逐次要素２０２，２０３，２０４の順序で動作が発生したことになる。逐次要素は必ず連続的に表現され、同時に表現されることはない動作の単位である。逐次要素は、さらに複数の同じ時間範囲に表現される単位である同時要素２０５，２０６，２０７に分解される。同時要素には、動作要素２０８，２０９が含まれる。同時要素には二種類あり、一つの動作要素のみから構成される同時要素と、二つ以上の動作要素のみから構成される同時要素がある。
二つ以上の動作要素から構成される同時要素は、その中に含まれる動作要素が逐次的に表現された場合に存在するとみなされる。ただし、一つの同時要素に含まれる動作要素は、必ず同じ種類の動作要素であるとする。なお動作要素とは、方向が同じで手の形態だけ異なる等の動作の要素である。
手話動作は、このように動作要素の逐次的構造と同時的構造の組み合わせによって構成される。
【０００６】
図１は、本発明の一実施例を示す手話認識装置の概念ブロック図である。
図１において、手話入力部１０１（データグローブ）は手話における動作を電気信号に変換し、時系列データとして動的動作要素認識部１０２および静的動作要素認識部１０３に入力する。動的動作要素認識部１０２では、動作データから動的な動作要素を認識する。静的動作要素認識部１０３では、動作データの各時刻のデータに対して静的な動作要素の評価値を求める。動的逐次要素認識部１０４では、認識された動的な動作要素から構成される逐次要素を認識する。静的逐次要素認識部１０５では、静的な動作要素のみから構成される逐次要素の認識を行う。静的動作要素統合部１０６では、動的な動作要素から構成される逐次要素に静的な動作要素の認識結果を統合することにより、手話形態素の認識を行う。本発明は、この静的動作要素統合部１０６を設けたことに特徴がある。先にモデルの説明で述べたように、静的動作要素の認識範囲は基本単位のパラメータが変化し易いため、認識したい時間範囲より小さい範囲で検出されたり、大きい範囲で認識されたりするとともに、通常、静的動作要素は動的動作要素と一緒に現われるので、動的動作要素の逐次要素と静的動作要素の認識結果を統合し、評価するのである。つまり、動的動作要素は、直線運動や円運動で時間が明確に決められるので、認識した時間範囲内において静的動作要素を評価してやればよい。なお、図２のモデルと図１の関係では、図２の動作要素２０８，２０９の中に動的動作要素と静的動作要素とが含まれると考えればよい。
手話形態素の認識結果は、出力部１０８によりモニタ１０９およびスピーカ１１０へ出力される。手話形態素辞書１１１には、手話における意味の単位である手話形態素毎に、動作要素の組み合わせによって記述した手話テンプレートが格納されている。
【０００７】
図３は、動的動作要素認識部の構造を示すブロック図であり、図４は、静的動作要素認識部の構造を示すブロック図である。
動的動作要素認識部１０２は、図３に示すように、独立したそれぞれの動作要素毎の認識部３０１，３０２，３０３から構成されている。各動作要素認識部には、それぞれの認識処理に必要な認識用パラメータ３０４，３０５，３０６が用意される。図３に示す動的動作要素認識部では、考えられる全ての動作要素を認識することになるが、手話形態素辞書中の動的動作要素のみを認識するようにしても良い。
静的動作要素認識部１０３も、図４に示すように、各動作要素毎の認識部４０１，４０２，４０３から構成されている。静的動作要素認識部１０３では、認識に必要なパラメータは全て手話形態素辞書に格納されているデータを用いる。
【０００８】
図５は、図１における手話認識装置を実現するためのハードウェアの一構成例を示す図である。
図５において、手話入力装置５０１は手話における手動作を電気信号に変換する装置であり、手袋にセンサを設置し、手の形状や動きを電気信号に変換する装置として良く知られている装置（データグローブ）を利用することができる。手話入力装置５０１によって、手話の手動作は指の曲げ角度や手の位置などからなる多次元の時系列データに変換される。演算装置５０２は、動作要素の認識や手話形態素の認識を行う装置であり、メモリ５０４，５０６，５０７，５０８，５０９，５１１からプログラムを読み込み、それらのプログラムに従って認識処理を行う。出力装置５０３は、手話形態素の認識結果を出力する装置であり、文字による出力や音声合成を用いた音声による出力装置を利用することができる。メモリ５０４は、動的動作要素を認識するためのプログラムを記憶するための記憶装置、メモリ５０５は動的動作要素を認識するために必要なパラメータを記憶するための記憶装置、メモリ５０６は動的な動作要素から構成される逐次要素を認識するためのプログラムを記憶するための記憶装置、メモリ５０７は静的動作要素を認識するために必要なプログラムを記憶するための記憶装置、メモリ５０８は動的な動作要素から構成される逐次要素と静的な動作要素の認識結果を統合するためのプログラムを記憶するための記憶装置、５０９は静的な動作要素のみで構成される逐次要素を認識するためのプログラムを記憶するための記憶装置、メモリ５１０は手話形態素の動作データである手話形態素辞書を記憶するための記憶装置、メモリ５１１は手話形態素を認識するためのプログラムを記憶するための記憶装置である。図５のハードウェア構成では、全ての認識プログラムの実行を一つの演算装置だけで行う構成であるが、この他に、複数の演算装置を用いて認識プログラムの実行をそれぞれの演算装置に分散させる構成も可能である。
【０００９】
図６は、図１の手話入力部により入力される動作データのフォーマット図である。
図６において、６０１は手の位置に関するデータであり、手の位置はさらにｘ軸のデータ６０２，ｙ軸のデータ６０３，ｚ軸のデータ６０４から構成されている。６０５は手の方向に関するデータであり、手の方向はさらにｘ軸回りの角度６０６，ｙ軸回りの角度６０７，ｚ軸回りの角度６０８から構成されている。
６０９は指の曲げに関するデータであり、指の曲げはさらに、親指の第２関節の曲げ角度６１０，親指の第３関節の曲げ角度６１１，人差し指の第１関節の曲げ角度６１２，人差し指の第２関節の曲げ角度６１３，中指の第１関節の曲げ角度６１４，中指の第２関節の曲げ角度６１５，薬指の第１関節の曲げ角度６１６，薬指の第２関節の曲げ角度６１７，小指の第１関節の曲げ角度６１８，小指の第２関節の曲げ角度６１９から構成されている。また、６２０，６２１，６２２は、それぞれ時刻ｔ１，ｔ２，ｔｎにおける手の位置，方向，指の曲げのデータを表す。このように、手話における動作は、手の位置６０１，手の方向６０５，指の曲げ６０９からなる時系列データとして表される。
図７は、図５の動的動作要素認識用パラメータを格納するメモリ（５０５）に格納されるパラメータのフォーマット図である。
図７において、動作要素名７０１はそのパラメータを認識処理に使用する動作要素の名称，パラメータ数７０２はその動作要素の認識に使用するパラメータの数，７０３，７０４は各パラメータを表す。また、パラメータ種類７０５，７０７はそのパラメータの意味を表す名称，パラメータ７０６，７０８は実際に認識処理に利用されるパラメータの値を表す。
【００１０】
図８は、図５の手話形態素辞書メモリに格納されるフォーマット図である。
図８において、手話形態素名８０１は、それ以下に記述される動作要素の組み合わせが表す手話形態素の名称を表す。繰り返し回数８０２は、それ以下に記述される動作が繰り返される回数を表す。逐次要素数８０３は、手話動作を構成する逐次要素の数を表す。逐次要素間重なり度８０４は、それぞれの逐次要素が実際の手話動作中に表現された場合に生じる重なり、あるいはギャップに対する許容範囲を表す。すなわち、実際に認識される場合には、要素相互が重なってしまったり、あるいは要素と要素の間が空いてしまう場合があるので、その度合を登録しておく。この場合には、離れていたとき＋、重なっていたとき−となる。逐次要素間重なり度は、逐次要素数が２以上の場合に有効である。逐次要素８０５，８０６，８０７は、それぞれの逐次要素の記述を表す。同時要素数８０８は、逐次要素を構成する同時要素の数を表す。同時要素間重なり度８０９は、逐次要素を構成するそれぞれの同時要素が実際の手話動作中で表現された場合に生じる重なりに対する許容範囲を表す。同時要素間重なり度は、同時要素数が２以上の場合に有効である。繰り返し回数８１０，８１５は、それ以下に記述される動作要素の列が繰り返される回数を表す。動作要素状態数８１１，８１６は、それぞれの同時要素を構成する動作要素の数を表す。動作要素間重なり度８１２，８１７は、逐次的に表現される動作要素が実際の手話動作中で表現された場合に生じる重なりあるいはギャップに対する許容値を表す。動作要素間重なり度は、動作要素状態数が２以上の時に有効である。動作要素８１３，８１４，８１８，８１９は、それぞれの同時要素を構成する動作要素を表す。
【００１１】
図９は、動作要素の記述フォーマット図であり、図１０は、動作要素の種類およびそれぞれの属性値の種類を示す図であり、図１１は、動作要素の属性値のフォーマット図である。
図９において、９０１は動作要素の種類を、９０２はその動作要素を表現するために使用される手の部位を、９０３，９０４，９０５はその動作要素を表すために必要な属性値を表す。動作要素の種類は、図１０に示すように、１４種類の動作要素から選択する。また、図１０に示すように、動作要素の種類に応じて属性値の種類もあらかじめ決定されている。
図１１の属性値フォーマットにおいて、１１０１，１１０２，１１０３は複数の動作データから学習した属性値の平均値，１１０４，１１０５，１１０６は複数の動作データから学習した属性値の分散である。ここで属性値の平均値とは、サンプルをとって、それらの平均値ｐ１〜ｐｎをとったものであり、また属性値の分散とは、平均値に対するばらつきであって、何回かとったパラメータがどのくらいばらついているかを同じデータから計算して、ｓ１〜ｓｎとして表わしたものである。なお属性値の次元は、図１０に示した属性値の種類に応じてあらかじめ決定されている。
【００１２】
次に、本発明における認識処理について説明する。
図１２は、動作要素状態数が２以上の同時要素に含まれる静的動作要素の認識処理を示すフローチャートである。
図１における動的動作要素認識部１０２では、振動や直線等の動的な動作要素および手話形態素辞書において、動作要素状態数が２以上の同時要素に含まれる形状や方法などの静的な動作要素の二種類の認識（動的動作要素および静的動作要素の認識）を行う。動的な動作要素認識の技術としては、既にある技術（例えば、特願平６―２５３４５７号明細書および図面『手話認識装置』参照）を使用することができる。動作要素状態数が２以上の同時要素に含まれる静的な動作要素の認識は、図１２に示すフローチャートに従って行うことができる。
図１２において、ステップ１２０１では、まず手話形態素辞書から動作要素状態数が２以上の同時要素に含まれる静的動作要素を抽出し、そのリストを作成する。リスト中の動作要素のフォーマットは、図９に示す動作要素のフォーマットと同じで良い。次に、ステップ１２０２において、手話入力部から１時刻分のデータを読み込む。ステップ１２０３において、動作データが最後であれば処理を終了する。最後でなければ、ステップ１２０４に移る。ステップ１２０４において、静的動作要素リストの全ての動作要素について、動作要素の属性値と読み込んだデータとからその時刻における評価値を求める。
【００１３】
評価値は、静的動作要素の属性値の種類をｎ，ｉ番目の属性値の次元数をｍ（ｉ），静的動作要素の手話形態素辞書に記述されているｉ番目の属性値の平均を（Ｐ（ｉ，１），Ｐ（ｉ，２），・・・，Ｐ（ｉ，ｍ（ｉ））），分散を（Ｓ（ｉ，１），Ｓ（ｉ，２），・・・，Ｓ（ｉ，ｍ（ｉ）），入力された時刻をｔ、入力されたデータのｉ番目の属性値を（Ｘ（ｔ，ｉ，１），Ｘ（ｔ，ｉ，２），・・・Ｘ（ｔ，ｉ，ｍ（ｉ））として、各時刻の評価値Ｅ１（ｔ）は下記（数１）の式によって求められる。
【数１】

次に、ステップ１２０５において、各動作要素毎にそれまでに求めた評価値と新しく求めた評価値からなる評価値の時系列が極大になる時間範囲を求める。評価値の時系列が、例えば図１３に示す曲線１３０１のように求められた場合には、極大となる時刻１３０２から時刻１３０３の範囲が動作要素として検出される。
ステップ１２０６において、求めた時間範囲とそれに対応する動作要素，評価値を認識結果として出力する。ステップ１２０７において、求めた各動作要素毎の評価値を次の時刻での認識に使用するためにバッファに格納し、ステップ１２０２に戻る。
図１４は、動的動作要素および図１２のフローで検出された静的動作要素のフォーマット図である。
１４０１は動作要素の検出された時間範囲の開始時刻，１４０２は動作要素の検出された時間範囲の終了時刻，１４０３は検出された動作要素に対する評価値，１４０４は検出された動作要素の種類，１４０５はその動作要素を表現するために使用される手の部位，１４０６，１４０７は各動作要素に付属する属性値である。属性値は、動作要素が検出された範囲における動作データから求めた値である。
【００１４】
次に、図１に示す動的逐次要素認識部１０４において、動的な動作要素を含む逐次要素を認識する方法について説明する。
図１５は、動的逐次要素の認識処理のフローチャートである。
この処理では、大きく分けて同時要素の認識と逐次要素の認識の二段階の処理が行われる。ステップ１５０１では、手話形態素辞書から動的な動作要素によって構成される同時要素と、動作要素状態が２以上の静的な動作要素によって構成される同時要素を抽出し、同時要素リストを作成する。この場合の同時要素のフォーマットを図１６に示す。図１６において、１６０１は手話形態素辞書中の同時要素の通し番号，１６０２は動作要素間の重なり度，１６０３は動作要素状態数，１６０４，１６０５は同時要素に含まれる各動作要素である。動作要素のフォーマットは図９に示すフォーマットと同じであり、また動作要素のフォーマット中の属性値は図１１のフォーマットと同じである。
ステップ１５０２では、手話形態素辞書から動的な動作要素によって構成される同時要素と、動作要素状態数が２以上の静的動作要素を含む同時要素を含む逐次要素を抽出し、動的な動作要素を含む同時要素および動作要素状態数が２以上の静的な動作要素を含む同時要素と逐次要素の対応リストを作成する。この同時要素と逐次要素の対応リスト中の逐次要素のフォーマットを図１７に示す。
図１７において、１７０１は手話形態素辞書中における逐次要素の通し番号，１７０２は同時要素間の重なり度，１７０３はその逐次要素に含まれる動的な動作要素含む同時要素および動作要素状態数が２以上の同時要素の数，１７０４，１７０５は動的な動作要素を含む同時要素および動作要素状態数が２以上の静的な動作要素を含む同時要素の手話形態素辞書中における通し番号である。
【００１５】
ステップ１５０３では、動的動作要素認識部１０２からの認識結果を一つ読み込む。次にステップ１５０４において、認識結果が最後であれば処理を終了する。そうでなければ、ステップ１５０５に進む。ステップ１５０５では、読み込んだ動作要素によって構成される同時要素の認識を行う。
この処理は、図１８に示すフローチャートに従って行われる。図１８において、ステップ１８０１では、ステップ１５０３で読み込んだ動作要素を含む同時要素を同時要素リストから検索する。ステップ１８０２において、検索された同時要素の数をカウンタｉに代入する。ステップ１８０３において、検索された同時要素のうちｉ番目の同時要素の動作要素状態数が１であればステップ１８０４に、そうでなければステップ１８０５に進む。ステップ１８０４では、ｉ番目の同時要素中の動作要素の属性値と読み込んだ動作要素の属性値とから評価値を求める。
この場合の評価値Ｅ２は、ｉ番目の同時要素の属性値の種類をｎ，ｉ番目の同時要素のｊ番目の属性値の次元数をｍ（ｊ），ｉ番目の同時要素のｊ番目の属性値を（Ｐ（ｊ，１），Ｐ（ｊ，２），…，Ｐ（ｊ，ｍ（ｊ））），分散を（Ｓ（ｊ，１），Ｓ（ｊ，２），…，Ｓ（ｊ，ｍ（ｊ））），読み込んだ動作要素のｊ番目の属性値を（Ｘ（ｊ，１），Ｘ（ｊ，２），…，Ｘ（ｊ，ｍ（ｊ）））として、下記（数２）の式によって求めることができる。
【数２】

【００１６】
次にステップ１８０５では、読み込んだ動作要素とバッファ中の動作要素からｉ番目の同時要素を構成する動作要素列と同じ動作要素列を検索する。ステップ１８０６では、検索した動作要素列中の動作要素の属性値とｉ番目の同時要素中の動作要素の属性値から同時要素の評価値を求める。この場合の評価値Ｅ３は、ｉ番目の同時要素の動作要素の数をｎ，ｉ番目の同時要素のｊ番目の動作要素の属性値の種類をｍ（ｊ），ｉ番目の同時要素のｊ番目の動作要素のｋ番目の属性値の次元数をｑ（ｊ，ｋ），ｉ番目の同時要素のｊ番目の動作要素のｋ番目の属性値を（Ｐ（ｊ，ｋ，１），Ｐ（ｊ，ｋ，２），…，Ｐ（ｊ，ｋ，ｑ（ｊ，ｋ））），分散を（Ｓ（ｊ，ｋ，１），Ｓ（ｊ，ｋ，２），…，Ｓ（ｊ，ｋ，ｑ（ｊ，ｋ））），ｉ番目の同時要素のｊ番目の動作要素に対応する読み込んだ動作要素あるいはバッファ中の動作要素のｋ番目の属性値を（Ｘ（ｊ，ｋ，１），Ｘ（ｊ，ｋ，２），…，Ｘ（ｊ，ｋ，ｑ（ｊ，ｋ））），ｊ番目とｊ＋１番目の動作要素間の重なりあるいはギャップをＧ（ｊ）（重なりの場合正，ギャップの場合負），動作要素間の重なりあるいはギャップの平均値をＡ，分散をσとして，下記（数３）の式によって求めることができる。
【数３】

【００１７】
ステップ１８０７では、求めた評価値とｉ番目の同時要素を同時要素の認識結果として、認識された同時要素のバッファに格納する。バッファに格納される同時要素のフォーマットを図１９に示す。図１９において、１９０１は認識された同時要素の時間範囲の開始時刻，１９０２は認識された同時要素の時間範囲の終了時刻，１９０３は手話形態素辞書中における同時要素の通し番号，１９０４は同時要素に対する評価値である。開始時刻１９０１および終了時刻１９０２は、その同時要素が認識されたと判定されるもととなった動作要素の時間範囲に基づいて計算される。
元に戻って、図１５のステップ１５０６では、ステップ１５０５で認識された同時要素によって構成される逐次要素を認識する。
図２０は、図１５に示す逐次要素の認識処理のフローチャートである。
図２０では、まずステップ２００１において、認識された同時要素のバッファ中で新しく認識された同時要素を含む逐次要素を同時要素と逐次要素の対応リストから検索する。ステップ２００２において、検索された逐次要素の数をカウンタｉに代入する。ステップ２００３において、検索された逐次要素のｉ番目の逐次要素を構成する同時要素を、同時要素のバッファから検索する。ステップ２００４において、ｉ番目の逐次要素を構成する同時要素が全て検索された場合には、ステップ２００５に移る。必要な同時要素が全て見つからなかった場合には、ステップ２００７に移る。
【００１８】
ステップ２００５では、検索された同時要素の評価値に基づき、ｉ番目の逐次要素の評価値を求める。評価値は、ｉ番目の逐次要素を構成する同時要素の数をｎ，ｊ番目の同時要素の評価値をＥ４（ｊ），同時要素間の重なり度をＯ，同時要素間の重なり度の平均をＡ，分散をσとして下記（数４）の式によって求めることができる。なお、式中のＥ４は、上記（数１）の式から容易に求められる。
【数４】

また、式中の同時要素間の重なり度Ｏは、ｊ番目の同時要素の開始時刻をｓ（ｊ），終了時刻をｅ（ｊ）として下記（数５）の式によって求めることができる。
【数５】

ステップ２００６において、求めた評価値とその逐次要素を認識結果として出力する。出力される逐次要素のフォーマットを、図２１に示す。図２１において、２１０１は逐次要素の時間範囲の開始時刻，２１０２は逐次要素の時間範囲の終了時刻，２１０３は手話形態素辞書中における逐次要素の通し番号，２１０４はその逐次要素の評価値である。逐次要素の時間範囲は、その逐次要素の認識に用いられた同時要素の時間範囲に基づいて求める。例えば、全ての同時要素の重なり部分の時間範囲を用いることができる。あるいは、同時要素の時間範囲の開始時刻，終了時刻それぞれの平均を用いても良い。
【００１９】
図２２は、図１に示す静的動作要素認識部の処理を表すフローチャートである。
図１における静的動作要素認識部１０３の認識処理では、静的な動作要素の評価値を各時刻毎に求める。図２２のステップ２２０１において、手話形態素辞書から静的な動作要素を抽出し、静的動作要素のリストを作成する。この場合、動作要素状態が２以上の同時要素を構成する静的な動作要素は削除する。これは、前述の処理で終了しているためである。
動作要素リスト中の動作要素のフォーマットを図２３に示す。
図２３において、２３０１は、手話形態素辞書中における動作要素の通し番号、２３０２は動作要素の種類，２３０３はその動作要素を表現するために使用される手の部位，２３０４，２３０５はその動作要素に付属する属性値である。
図２２のステップ２２０２において、手話入力部から１時刻分のデータを読み込む。次に、ステップ２２０３において、データが最後であれば、処理を終了する。そうでなければ、ステップ２２０４に移る。ステップ２２０４では、静的動作要素リスト中の全ての動作要素について、動作要素の属性値と読み込んだデータからその時刻における評価値を求める。評価値は、前記（数１）の式によって求めることができる。ステップ２２０５では、求めた評価値と動作要素を認識結果として出力する。また、静的動作要素統合部１０６で過去の静的動作要素の認識結果を使用するため、静的動作要素のバッファを設けて、そこに認識結果を格納する。
出力される動作要素のフォーマットを図２４に示す。
図２４において、２４０１は時刻，２４０２は手話形態素辞書中における動作要素の通し番号，２４０３は動作要素の評価値である。
【００２０】
図２５は、図１に示す静的逐次要素認識部における認識処理のフローチャートである。
図１の静的逐次要素認識部１０５の認識処理では、静的動作要素認識部１０３で認識された静的動作要素の評価値から、静的動作要素のみによって構成される逐次要素の認識を行う。図２５のステップ２５０１において、手話形態素辞書から静的な動作要素のみによって構成される逐次要素を抽出し、逐次要素と静的動作要素の対応リストを作成する。
対応リスト中の逐次要素のフォーマットを、図２６に示す。
図２６において、２６０１は手話形態素辞書中における逐次要素の通し番号，２６０２は逐次要素を構成する静的動作要素の数，２６０３，２６０４は逐次要素を構成する静的動作要素の手話形態素辞書中における通し番号である。
図２５のステップ２５０２では、静的動作要素認識部１０３から静的動作要素の認識結果を１時刻分読み込む。次にステップ２５０３では、認識結果が最後であれば処理を終了する。そうでなければ、ステップ２５０４に移る。ステップ２５０４では、各逐次要素について、必要な動作要素を読み込んだ静的動作要素の認識結果から選択し、その時刻における逐次要素の評価値を求める。時刻ｔにおける評価値Ｅ７（ｔ）は、逐次要素を構成する静的動作要素の数をｎ，ｉ番目の静的動作要素の時刻ｔにおける評価値をＥ６（ｔ，ｉ）として、下記（数６）の式によって求めることができる。なお、Ｅ６（ｔ，ｉ）は、前記（数１）の式により容易に求めることができる。
【数６】

ステップ２５０５では、各逐次要素について、求めた逐次要素の評価値と過去の逐次要素の評価値の履歴を記憶してあるバッファの内容から、評価値が極大になる時間範囲を検索し、その時間範囲と評価値を逐次要素の認識結果として出力する。なお、逐次要素の認識結果のフォーマットは、図２１に示すフォーマットと同じである。
ステップ２５０６では、次の時刻での処理で使用するために、求めた逐次要素の評価値をバッファに格納する。
【００２１】
図２７は、図１に示す静的動作要素統合部における統合処理のフローチャートである。
静的動作要素統合部１０６では、動的な動作要素のみによって構成された逐次要素の評価結果に静的な動作要素の評価値を統合して、逐次要素全体の評価値を求める。図２７では、まずステップ２７０１において、手話形態素辞書中から動的な動作要素を含む逐次要素および動作要素状態数が２以上の静的動作要素から構成される同時要素を含む逐次要素を抽出し、逐次要素と静的な動作要素の対応リストを作成する。対応リスト中の各逐次要素のフォーマットは、図２６に示すフォーマットと同じである。
ステップ２７０２では、動的逐次要素認識部１０４から動的逐次要素の認識結果を一つ読み込む。ステップ２７０３において、動的逐次要素の認識結果が最後であれば処理を終了する。そうでなければ、ステップ２７０４に移る。ステップ２７０４では、読み込んだ動的逐次要素の時間範囲において、動的逐次要素に対応する静的動作要素の認識結果（評価値）を静的動作要素認識部１０３から読み込む。ステップ２７０５では、読み込んだ静的動作要素の評価値の平均を動的逐次要素に対応する静的動作要素の評価値として求める。ステップ２７０６では、動的逐次要素の評価値と静的動作要素の評価値とから、逐次要素全体の評価値を求める。この評価値の計算は、二種類の評価値の平均や相乗平均等を計算することにより求めることができる。また逐次要素を構成する動作要素の種類毎に評価値を保存しておき、それらの平均や相乗平均等を計算することにより求めることもできる。ステップ２７０７では、動的逐次要素の時間範囲と求めた評価値を統合結果として出力する。統合結果のフォーマットは、図２１に示すフォーマットと同じである。
図２７に示すフローチャートでは、静的動作要素の評価値は全ての時刻について求め、その中から必要な時間範囲の評価値のみを受け取ることになるが、動的な逐次要素認識部の認識結果から得られる時間範囲についてのみ、必要な静的動作要素の評価値を求めるように静的動作要素認識部１０３に指示を送り、その評価結果を受け取って静的動作要素の統合処理を行うようにすることもできる。
【００２２】
図２８は、図１に示す手話形態素認識部における認識処理のフローチャートである。
手話形態素認識部１０７は、逐次要素の認識結果から手話形態素を認識する。図２８では、まずステップ２８０１において、手話形態素辞書から手話形態素とそれを構成する逐次要素の対応リストを作成する。対応リスト中の各手話形態素のフォーマットを図２９に示す。図２９において、２９０１は手話形態素名，２９０２は手話形態素を構成する逐次要素間の重なり度，２９０３は手話形態素を構成する逐次要素の数，２９０４，２９０５は手話形態素を構成する逐次要素の手話形態素辞書中における通し番号である。
ステップ２８０２において、逐次要素の認識結果を一つ読み込む。ステップ２８０３において、逐次要素の認識結果が最後であれば処理を終了する。そうでなければ、ステップ２８０４に移る。ステップ２８０４では、読み込んだ逐次要素と過去の逐次要素の履歴を格納してあるバッファ中の逐次要素から、構成される逐次要素列に対応する手話形態素を対応リストから検索する。ステップ２８０５では、検索された手話形態素について、逐次要素の評価値および逐次要素間の重なりあるいはギャップに基づいて評価値を求める。評価値Ｅ９は、手話形態素を構成する逐次要素の数をｎ，ｉ番目の逐次要素の評価値をＥ８（ｉ），ｉ番目とｉ＋１番目の逐次要素間の重なりあるいはギャップをＧ（ｉ）（重なりの場合正，ギャップの場合負），逐次要素間の重なりあるいはギャップの平均をＡ，分散をσとして、下記（数７）の式によって求めることができる。なお、Ｅ８（ｉ）は、前式（数１）により容易に求められる。
【数７】

【００２３】
ステップ２８０６において、求めた評価値，手話形態素名および時間範囲を認識結果として出力する。手話形態素の時間範囲は、認識に使用された逐次要素の時間範囲に基づいて求める。例えば、最初の逐次要素の開始時刻から最後の逐次要素の終了時刻を、手話形態素の時間範囲とすることができる。
図３０に、手話形態素の認識結果のフォーマットを示す。図３０において、３００１は手話形態素の時間範囲の開始時刻，３００２は手話形態素の時間範囲の終了時刻，３００３は手話形態素名，３００４は手話形態素の評価値である。
なお、前式（数１），（数２），（数３），（数４），（数６），（数７）では、評価値の計算はそれぞれを構成要素の評価値の相乗平均を用いているが、単純な平均（構成要素の評価値を全て加算し、その数で割算する）を用いても良い。また、構成要素の評価値とギャップや重なりに対する評価値に対して重み付けをしても良い。
【００２４】
【発明の効果】
以上説明したように、本発明によれば、動的な動作要素の認識結果から決定される時間範囲に基づいて静的な動作要素を評価するため、静的な動作要素の認識範囲による認識精度の低下がなくなる。また静的な動作要素のみによって構成される逐次要素についても、各時刻毎に逐次要素全体の評価値を求め、評価値の極大値となる時刻を検出することにより認識を行うため、それぞれの動作要素の検出時間のずれによる認識低下がなくなる。さらに動作要素の特徴を表す属性値を、実際の動作データから求めた連続量によって表現し、それに基づいて認識された動作要素を評価するため、動作要素の適切な評価値を求めることが可能となり、手話の認識精度を向上することができる。
【図面の簡単な説明】
【図１】本発明の一実施例を示す手話認識装置の概念ブロック図である。
【図２】本発明において、動作要素に基づく手話動作モデルを示す図である。
【図３】図１に示す動的動作要素認識部の構造を示すブロック図である。
【図４】図１に示す静的動作要素認識部の構造を示すブロック図である。
【図５】本発明の一実施例を実現するための手話認識装置のハードウェア構成図である。
【図６】図１の手話入力装置から入力されるデータのフォーマット図である。
【図７】動的動作要素認識のためのパラメータのフォーマット図である。
【図８】手話形態素辞書の記述フォーマットである。
【図９】動作要素の記述フォーマットである。
【図１０】動作要素の種類およびそれぞれの属性値の種類を示す図である。
【図１１】動作要素の属性値のフォーマットである。
【図１２】動作要素状態数が２以上の同時要素に含まれる静的動作要素の認識処理を説明するためのフローチャートである。
【図１３】静的動作要素の検出処理を説明するための図である。
【図１４】図１３において認識された動的動作要素のフォーマット図である。
【図１５】図１に示す動的逐次要素認識部の認識処理を説明するためのフローチャートである。
【図１６】同時要素と動作要素の対応リストにおける同時要素のフォーマット図である。
【図１７】逐次要素と同時要素の対応リストにおける逐次要素のフォーマット図である。
【図１８】図１に示す動的逐次要素認識部における同時要素の認識処理を説明するためのフローチャートである。
【図１９】認識された同時要素のフォーマット図である。
【図２０】図１に示す動的逐次要素認識部における逐次要素の認識処理を説明するためのフローチャートである。
【図２１】図２０で認識された逐次要素のフォーマット図である。
【図２２】図１に示す静的動作要素認識部の認識処理を説明するためのフローチャートである。
【図２３】静的動作要素リスト中の動作要素のフォーマット図である。
【図２４】図２２で認識された静的動作要素のフォーマット図である。
【図２５】図１に示す静的逐次要素認識部の認識処理を説明するためのフローチャートである。
【図２６】静的逐次要素リスト中の逐次要素のフォーマット図である。
【図２７】図１に示す静的動作要素統合部の統合処理を説明するためのフローチャートである。
【図２８】図１に示す手話形態素認識部の認識処理を説明するためのフローチャートである。
【図２９】手話形態素と逐次要素の対応リスト中の手話形態素のフォーマット図である。
【図３０】図２８で認識された手話形態素のフォーマット図である。
【符号の説明】
１０１…手話入力部、１０２…動的動作要素認識部、１０８…出力部、
１０３…静的動作要素認識部、１０４…動的逐次要素認識部、
１０５…静的逐次要素認識部、１０６…静的動作要素統合部、
１０７…手話形態素認識部、１０９…モニタ、１１０…スピーカ、
１１１…手話形態素辞書、２０１…手話形態素、２０８，２０９…動作要素、
２０２，２０３，２０４…逐次要素、２０５，２０６，２０７…同時要素、
３０１，３０２，３０３…動的動作要素認識部、３０４〜３０６…パラメータ、
４０１，４０２，４０３…静的動作要素認識部、５０４〜５１１…メモリ、
６０１…手の位置、６０５…手の方向、６０９…指の曲げ、
７０１…動作要素名、７０５，７０７…パラメータ種類、
７０６，７０８…パラメータ、９０１…動作要素の種類、９０２…手の部位、
９０３〜９０５…属性値、１１０１〜１１０３…属性値の平均値、
１１０４〜１１０６…属性値の分散、１３０１…評価値の時系列、
１３０２…時間範囲の開始時刻、１３０３…時間範囲の終了時刻。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a sign language recognition device that supports communication between hearing-impaired and hearing people by taking sign language as input and accurately outputting the recognition result as speech or text.
[0002]
[Prior art]
Conventionally, various devices have been proposed that take sign language as input, analyze it, and recognize it. The conventional recognition methods fall into two groups: techniques that recognize sign language motion data using pattern matching (see, for example, the specification and drawings of Japanese Patent Application No. 5-125698) or neural networks, and techniques that recognize sign language based on the basic units of motion that make up a sign language motion (see, for example, the specification and drawings of Japanese Patent Application No. 6-253457).
The former method (Japanese Patent Application No. 5-125698) compares the hand motion pattern as a whole with the standard hand motion patterns stored in a word dictionary and recognizes a word according to whether they match. The latter method (Japanese Patent Application No. 6-253457) does not compare hand motion patterns directly; it first recognizes the hand motion pattern in terms of basic units of motion (partial patterns) and then integrates the results for those basic units to recognize a word. In other words, the latter technique first recognizes all the basic units of motion that make up the sign to be recognized, and then performs recognition by comparing that result with the temporal relationships among the basic units of motion in a sign language template, in which each sign is stored in advance as a combination of basic units of motion. In this technique, the types and attributes of the basic units of motion in the sign language template were all described by symbols.
[0003]
[Problems to be solved by the invention]
Conventional sign language recognition based on basic units of motion (the latter technique above) recognized static and dynamic basic units of motion at the same time and recognized sign language by integrating the recognized basic units. Here, a static basic unit of motion is a feature of the motion in which parameters such as shape (hand shape, etc.) or direction (finger direction, etc.) remain stable over some time range. A dynamic basic unit of motion, by contrast, is a feature of the motion that represents a change in parameters (movement, etc.) over some time range, such as linear or circular motion.
A static basic unit of motion can be detected by setting a threshold on the difference between a reference parameter (a parameter extracted from collected samples) and the input parameter, and detecting the intervals in which the difference stays within that threshold.
However, because the parameters of a static basic unit of motion fluctuate easily, the unit is often recognized over a range larger than the intended time range, or detected over a smaller one. For example, if the input parameter takes a similar form each time but fluctuates around the threshold boundary, it crosses the threshold back and forth, and the unit may be detected as a series of short fragments. Conversely, the unit may be detected as an overly long interval. Depending on how the parameter fluctuates, a single motion interval may also be split into two or more intervals.
For this reason, the conventional technique, which treated static and dynamic basic units of motion on an equal footing, could not evaluate static basic units of motion correctly.
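The fragmentation effect described above can be illustrated with a short sketch. The following Python code is not taken from the cited prior art; the function name `detect_intervals`, the sample distances, and the threshold of 1.0 are assumptions chosen only to show how a value hovering around the threshold splits one static hold into several short intervals.

```python
from typing import List, Tuple


def detect_intervals(distances: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of maximal runs where distance <= threshold."""
    intervals: List[Tuple[int, int]] = []
    start = None
    for t, d in enumerate(distances):
        if d <= threshold:
            if start is None:
                start = t  # a new run begins at this sample
        elif start is not None:
            intervals.append((start, t - 1))  # the run ended at the previous sample
            start = None
    if start is not None:
        intervals.append((start, len(distances) - 1))  # run reaches the end of the data
    return intervals


# Hypothetical distances between the input parameter and the reference parameter
# for a single hold (samples 1 to 5) that hovers around the threshold.
distances = [2.0, 0.9, 1.1, 0.8, 1.05, 0.9, 2.0]
print(detect_intervals(distances, threshold=1.0))  # prints [(1, 1), (3, 3), (5, 5)]
```

The single hold is reported as three one-sample fragments, which is the kind of result that makes the time range of a static basic unit unreliable.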
Moreover, because the conventional technique described sign language templates entirely with symbols representing basic units of motion, basic units of motion were recognized against preset reference values, and their evaluation values were also calculated against those reference values.
The recognition result for a sign was in turn calculated from the evaluation values of the recognized basic units of motion (a value expressing how close the unit is to the reference value; the closer, the higher). As a result, a mismatch between the parameter range of actual motions and the parameter range implied by the predetermined reference values often prevented correct evaluation values from being obtained, and recognition accuracy was low. That is, the parameters do not necessarily cluster near a reference value; they may cluster midway between reference values. In that case, the difference between the input parameter and the reference value is large and the evaluation value is small.
An object of the present invention is therefore to solve these conventional problems and to provide a sign language recognition device that evaluates all basic units of motion correctly and recognizes sign language accurately.
[0004]
[Means for Solving the Problems]
To achieve the above object, the sign language recognition device of the present invention detects dynamic basic units of motion, recognizes the sequential units of motion composed of them, and then integrates the evaluation results of static basic units of motion within the time range of each recognized sequential unit of motion. A sequential unit of motion composed only of static basic units of motion is recognized by obtaining an evaluation value for the motion as a whole at each time and finding the time at which that evaluation value reaches a local maximum. In addition, so that basic units of motion are evaluated correctly, the basic units of motion in a sign language template are described with attribute values that are determined from actual sign language data and expressed as continuous quantities. The evaluation value of a recognized basic unit of motion is obtained using those continuous quantities.
Because static basic units of motion are thus evaluated over a time range determined from the recognition results of dynamic basic units of motion, the problem with the recognition range of static basic units of motion is eliminated.
Likewise, a sequential unit of motion composed only of static basic units of motion is recognized by obtaining an evaluation value for the whole sequential unit at each time and detecting the time at which that value reaches a local maximum, so recognition is not affected by the detection times of the individual basic units of motion.
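As a rough illustration of detecting the times at which an evaluation value reaches a local maximum, the sketch below scans a per-time series of evaluation values. It is a minimal sketch under assumptions: the function name `local_maxima`, the tie-breaking rule for plateaus, and the sample scores are hypothetical, and the specification itself reports a time range around the maximum (FIG. 13) rather than a single index.

```python
from typing import List


def local_maxima(scores: List[float]) -> List[int]:
    """Indices t where scores[t] rises above the previous sample and does not rise further."""
    peaks: List[int] = []
    for t in range(1, len(scores) - 1):
        # Strict rise on the left, no rise on the right: a plateau is reported once,
        # at its first sample (an assumed convention).
        if scores[t - 1] < scores[t] >= scores[t + 1]:
            peaks.append(t)
    return peaks


# Hypothetical evaluation values of one sequential unit, one value per time step.
scores = [0.1, 0.3, 0.8, 0.6, 0.2, 0.5, 0.9, 0.9, 0.4]
print(local_maxima(scores))  # prints [2, 6]
```

Because the peak is taken over the evaluation value of the whole unit, the result does not depend on where each constituent static unit happens to cross a threshold.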
Furthermore, because the attribute values characterizing a basic unit of motion are expressed as continuous quantities obtained from actual motion data, and recognized basic units of motion are evaluated on that basis, appropriate evaluation values can be obtained for the basic units of motion, and sign language recognition accuracy can be improved.
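One way to picture an evaluation based on continuous attribute values is sketched below. The equations of the specification are not reproduced in this text, so the Gaussian-shaped score is purely an assumption; only the use of a learned mean and variance per attribute dimension, and the combination of component scores by a geometric mean, follow the description. The names `dimension_score` and `element_score` are hypothetical.

```python
import math
from typing import Sequence


def dimension_score(x: float, mean: float, variance: float) -> float:
    """Assumed Gaussian-shaped score in (0, 1]; equals 1.0 when x matches the learned mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance))


def element_score(xs: Sequence[float], means: Sequence[float], variances: Sequence[float]) -> float:
    """Geometric mean of the per-dimension scores of one basic unit of motion."""
    scores = [dimension_score(x, p, s) for x, p, s in zip(xs, means, variances)]
    return math.prod(scores) ** (1.0 / len(scores))


# Hypothetical attribute with two dimensions, learned from sample data.
means = [1.0, 2.0]
variances = [0.5, 0.5]
print(element_score([1.0, 2.0], means, variances))  # input at the learned mean scores 1.0
```

Because the mean and variance come from recorded motions, an input that falls where real data clusters scores highly even if that location lies between the symbolic reference values of the conventional approach.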
[0005]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention will be described below in detail with reference to the drawings.
FIG. 2 is a diagram showing a sign language motion model based on motion elements in the present invention.
To explain the present invention, a model of sign language motion is described first. A sign language motion is composed of a combination of basic units of motion. These basic units of motion in sign language are hereinafter called motion elements. Because the motion elements that make up a sign language motion are related both sequentially and simultaneously in time, the temporal relationships among them must also be described in order to recognize sign language. The model shown in FIG. 2 is used for this purpose. In FIG. 2, 201 represents the entire motion of a sign language morpheme. A sign language morpheme is a unit of meaning in sign language. A sign language morpheme is first decomposed into sequential elements 202, 203, and 204, which are sequential units of motion. Because the horizontal axis represents the passage of time from the left end, the motions occur in the order of sequential elements 202, 203, and 204. Sequential elements are units of motion that are always expressed one after another and never at the same time. Each sequential element is further decomposed into simultaneous elements 205, 206, and 207, which are units expressed over the same time range. Simultaneous elements contain motion elements 208 and 209. There are two kinds of simultaneous element: those composed of a single motion element and those composed of two or more motion elements.
A simultaneous element composed of two or more motion elements is considered to be present when the motion elements it contains are expressed sequentially. The motion elements contained in one simultaneous element are, however, always of the same type. Such motion elements are, for example, elements of motion that have the same direction and differ only in the form of the hand.
A sign language motion is thus composed of a combination of sequential and simultaneous structures of motion elements.
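The hierarchy of FIG. 2 can be pictured as nested records. The following is a minimal sketch, not the dictionary format of the specification: the class and field names, and the string labels used for motion element types, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionElement:
    kind: str          # illustrative type label, e.g. "shape" or "linear"
    is_dynamic: bool   # True for motions such as linear or circular movement


@dataclass
class SimultaneousElement:
    motion_elements: List[MotionElement]

    def __post_init__(self) -> None:
        # The model requires all motion elements in one simultaneous element
        # to be of the same type.
        if len({m.kind for m in self.motion_elements}) > 1:
            raise ValueError("motion elements in a simultaneous element must share one type")


@dataclass
class SequentialElement:
    simultaneous_elements: List[SimultaneousElement]

    def is_static_only(self) -> bool:
        """True when every contained motion element is static."""
        return all(
            not m.is_dynamic
            for s in self.simultaneous_elements
            for m in s.motion_elements
        )


@dataclass
class SignMorpheme:
    name: str
    sequential_elements: List[SequentialElement]  # expressed in order, never overlapping in meaning


hold = SimultaneousElement([MotionElement("shape", is_dynamic=False)])
move = SimultaneousElement([MotionElement("linear", is_dynamic=True)])
morpheme = SignMorpheme("example", [SequentialElement([hold, move]), SequentialElement([hold])])
print([s.is_static_only() for s in morpheme.sequential_elements])  # prints [False, True]
```

The `is_static_only` check mirrors the distinction the device draws between sequential elements that contain dynamic motion elements and those composed only of static ones.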
[0006]
FIG. 1 is a conceptual block diagram of a sign language recognition apparatus showing an embodiment of the present invention.
In FIG. 1, the sign language input unit 101 (a data glove) converts sign language motions into electrical signals and inputs them as time-series data to the dynamic motion element recognition unit 102 and the static motion element recognition unit 103. The dynamic motion element recognition unit 102 recognizes dynamic motion elements from the motion data. The static motion element recognition unit 103 obtains evaluation values of static motion elements for the data at each time in the motion data. The dynamic sequential element recognition unit 104 recognizes sequential elements composed of the recognized dynamic motion elements. The static sequential element recognition unit 105 recognizes sequential elements composed only of static motion elements. The static motion element integration unit 106 integrates the recognition results of static motion elements into the sequential elements composed of dynamic motion elements, thereby recognizing sign language morphemes. The present invention is characterized by the provision of this static motion element integration unit 106. As noted earlier, the parameters of static motion elements fluctuate easily, so a static motion element may be detected over a range smaller than the intended time range or recognized over a larger one; and since static motion elements normally appear together with dynamic motion elements, the sequential elements of dynamic motion elements and the recognition results of static motion elements are integrated and evaluated together. That is, because the timing of a dynamic motion element such as a linear or circular motion is clearly determined, the static motion elements need only be evaluated within the recognized time range. As for the relationship between the model of FIG. 2 and FIG. 1, the motion elements 208 and 209 in FIG. 2 can be regarded as including both dynamic and static motion elements.
The sign language morpheme recognition result is output by the output unit 108 to the monitor 109 and the speaker 110. The sign language morpheme dictionary 111 stores, for each sign language morpheme (a unit of meaning in sign language), a sign language template described as a combination of motion elements.
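The role of the static motion element integration unit 106 can be sketched as follows. The specification describes averaging the static evaluation values inside the time range of a recognized dynamic sequential element and then combining the two evaluation values by an average or a geometric mean; the function name `integrate_static`, the dictionary keyed by time index, and the sample values are assumptions for illustration.

```python
import math
from typing import Dict


def integrate_static(
    dynamic_score: float,
    start: int,
    end: int,
    static_scores: Dict[int, float],
    use_geometric_mean: bool = True,
) -> float:
    """Combine a dynamic sequential element's score with static scores inside [start, end]."""
    # Only the static evaluation values inside the dynamic element's time range are used.
    window = [static_scores[t] for t in range(start, end + 1) if t in static_scores]
    if not window:
        raise ValueError("no static evaluation values inside the time range")
    static_score = sum(window) / len(window)
    if use_geometric_mean:
        return math.sqrt(dynamic_score * static_score)
    return (dynamic_score + static_score) / 2.0


# Hypothetical per-time static evaluation values; the dynamic element spans times 1 to 2.
static_scores = {0: 0.1, 1: 0.5, 2: 1.0, 3: 0.2}
print(integrate_static(0.75, 1, 2, static_scores))
```

The static values at times 0 and 3 never enter the result, so the boundaries of the static hold do not have to be detected at all.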
[0007]
FIG. 3 is a block diagram showing the structure of the dynamic motion element recognition unit, and FIG. 4 is a block diagram showing the structure of the static motion element recognition unit.
As shown in FIG. 3, the dynamic motion element recognition unit 102 consists of independent recognition units 301, 302, and 303, one for each motion element. Each motion element recognition unit is provided with the recognition parameters 304, 305, and 306 that its recognition process requires. The dynamic motion element recognition unit shown in FIG. 3 recognizes every conceivable motion element, but it may instead recognize only the dynamic motion elements that appear in the sign language morpheme dictionary.
As shown in FIG. 4, the static motion element recognition unit 103 likewise consists of recognition units 401, 402, and 403, one for each motion element. The static motion element recognition unit 103 takes all the parameters it needs for recognition from the data stored in the sign language morpheme dictionary.
[0008]
FIG. 5 is a diagram illustrating a configuration example of hardware for realizing the sign language recognition apparatus in FIG. 1.
In FIG. 5, the sign language input device 501 converts sign language hand motions into electrical signals; a well-known device in which sensors are mounted on a glove to convert hand shape and movement into electrical signals (a data glove) can be used. The sign language input device 501 converts the hand motions of sign language into multidimensional time-series data consisting of finger bending angles, hand positions, and the like. The computing device 502 recognizes motion elements and sign language morphemes; it reads programs from the memories 504, 506, 507, 508, 509, and 511 and performs recognition according to those programs. The output device 503 outputs the sign language morpheme recognition results; a device that outputs text, or one that outputs speech using speech synthesis, can be used. The memory 504 stores the program for recognizing dynamic motion elements; the memory 505 stores the parameters needed to recognize dynamic motion elements; the memory 506 stores the program for recognizing sequential elements composed of dynamic motion elements; the memory 507 stores the program needed to recognize static motion elements; the memory 508 stores the program for integrating sequential elements composed of dynamic motion elements with the recognition results of static motion elements; the memory 509 stores the program for recognizing sequential elements composed only of static motion elements; the memory 510 stores the sign language morpheme dictionary, which holds the motion data of the sign language morphemes; and the memory 511 stores the program for recognizing sign language morphemes. In the hardware configuration of FIG. 5 a single computing device executes all the recognition programs, but a configuration in which several computing devices share the execution of the recognition programs is also possible.
[0009]
FIG. 6 is a format diagram of the motion data input by the sign language input unit of FIG. 1.
In FIG. 6, reference numeral 601 denotes data on the position of the hand, which consists of x-axis data 602, y-axis data 603, and z-axis data 604. Reference numeral 605 denotes data on the direction of the hand, which consists of an angle 606 around the x axis, an angle 607 around the y axis, and an angle 608 around the z axis.
Reference numeral 609 denotes data on the bending of the fingers, which consists of the bending angle 610 of the second joint of the thumb, the bending angle 611 of the third joint of the thumb, the bending angle 612 of the first joint of the index finger, the bending angle 613 of the second joint of the index finger, the bending angle 614 of the first joint of the middle finger, the bending angle 615 of the second joint of the middle finger, the bending angle 616 of the first joint of the ring finger, the bending angle 617 of the second joint of the ring finger, the bending angle 618 of the first joint of the little finger, and the bending angle 619 of the second joint of the little finger.
Reference numerals 620, 621, and 622 denote the hand position, direction, and finger bending data at times t1, t2, and tn, respectively. As described above, a motion in sign language is expressed as time-series data consisting of the hand position 601, the hand direction 605, and the finger bending 609.
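As an illustration of the FIG. 6 layout, one time sample could be modeled as shown below. This is only a minimal sketch: the class name `GloveFrame`, the field names, and the `as_vector` helper are hypothetical and do not appear in the specification. Only the grouping (three position values, three direction angles, ten finger bending angles) follows the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GloveFrame:
    """One time sample of sign language input (all names are hypothetical)."""
    position: Tuple[float, float, float]   # x, y, z data (602 to 604)
    direction: Tuple[float, float, float]  # angles around x, y, z (606 to 608)
    bending: Tuple[float, ...]             # ten finger joint angles (610 to 619)

    def __post_init__(self) -> None:
        if len(self.bending) != 10:
            raise ValueError("expected 10 finger bending angles")

    def as_vector(self) -> List[float]:
        """Flatten the frame into a 16-dimensional feature vector."""
        return [*self.position, *self.direction, *self.bending]


# A motion is a time series of frames (t1, t2, ..., tn).
motion = [
    GloveFrame((0.0, 0.1, 0.2), (0.0, 0.0, 90.0), (10.0,) * 10),
    GloveFrame((0.0, 0.2, 0.2), (0.0, 0.0, 85.0), (12.0,) * 10),
]
```

Keeping the three groups separate, rather than storing one flat vector, makes it easier to evaluate position, direction, and shape attributes independently.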
FIG. 7 is a format diagram of the parameters stored in the memory 505 for dynamic motion element recognition parameters in FIG. 5.
In FIG. 7, the motion element name 701 is the name of the motion element whose recognition processing uses the parameters, the parameter count 702 is the number of parameters used to recognize that motion element, and 703 and 704 are the individual parameters. The parameter types 705 and 707 are names indicating the meaning of the parameters, and the parameters 706 and 708 are the parameter values actually used in recognition processing.
[0010]
FIG. 8 is a format diagram of the data stored in the sign language morpheme dictionary memory of FIG. 5.
In FIG. 8, the sign language morpheme name 801 is the name of the sign language morpheme expressed by the combination of motion elements described below. The repetition count 802 is the number of times the motions described below are repeated. The sequential element count 803 is the number of sequential elements constituting the sign language motion. The degree of overlap 804 between sequential elements is the allowable range for the overlap or gap that occurs when the sequential elements are expressed in an actual sign language motion. That is, when the elements are actually recognized, adjacent elements may overlap or be separated by a gap, and the permitted extent is registered here. The value is positive for a gap and negative for an overlap. The degree of overlap between sequential elements is valid when the number of sequential elements is 2 or more.
The sequential elements 805, 806, and 807 are the descriptions of the individual sequential elements. The simultaneous element count 808 is the number of simultaneous elements constituting the sequential element. The degree of overlap 809 between simultaneous elements is the allowable range for the overlap that occurs when the simultaneous elements constituting the sequential element are expressed in an actual sign language motion. It is valid when the number of simultaneous elements is 2 or more. The repetition counts 810 and 815 are the number of times the motion element sequence described after them is repeated. The motion element state counts 811 and 816 are the number of motion elements constituting each simultaneous element. The degrees of overlap 812 and 817 between motion elements are the allowable values for the overlap or gap that occurs when motion elements expressed sequentially appear in an actual sign language motion. They are valid when the number of motion element states is 2 or more. The motion elements 813, 814, 818, and 819 are the motion elements constituting the respective simultaneous elements.
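The nesting described for FIG. 8 (morpheme, sequential elements, simultaneous elements, motion elements) could be represented as in the following sketch. All class and field names, the element kinds, and the example entry are hypothetical. The counts 803, 808, 811, and 816 are derived from list lengths here instead of being stored, which is a simplification and not something the specification states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MotionElement:
    kind: str        # type of motion element (hypothetical label)
    hand_part: str   # part of the hand that expresses it


@dataclass
class SimultaneousElement:
    repetitions: int               # repetition count (810, 815)
    element_overlap: float         # degree of overlap between motion elements (812, 817)
    elements: List[MotionElement]  # motion elements (813, 814, 818, 819)

    @property
    def state_count(self) -> int:  # motion element state count (811, 816)
        return len(self.elements)


@dataclass
class SequentialElement:
    simultaneous_overlap: float              # degree of overlap (809)
    simultaneous: List[SimultaneousElement]

    @property
    def simultaneous_count(self) -> int:     # simultaneous element count (808)
        return len(self.simultaneous)


@dataclass
class Morpheme:
    name: str                            # sign language morpheme name (801)
    repetitions: int                     # repetition count (802)
    sequential_overlap: float            # degree of overlap (804)
    sequential: List[SequentialElement]  # sequential elements (805 to 807)

    @property
    def sequential_count(self) -> int:   # sequential element count (803)
        return len(self.sequential)


# Hypothetical dictionary entry with one sequential element.
entry = Morpheme(
    name="EXAMPLE",
    repetitions=1,
    sequential_overlap=0.0,
    sequential=[
        SequentialElement(
            simultaneous_overlap=0.0,
            simultaneous=[
                SimultaneousElement(1, 0.0, [MotionElement("shape", "right hand")]),
                SimultaneousElement(1, 0.0, [
                    MotionElement("line", "right hand"),
                    MotionElement("direction", "right hand"),
                ]),
            ],
        )
    ],
)
```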
[0011]
FIG. 9 is a description format diagram of an operation element, FIG. 10 is a diagram illustrating the types of operation elements and the types of attribute values, and FIG. 11 is a format diagram of attribute values of the operation elements.
In FIG. 9, 901 is the type of the motion element, 902 is the part of the hand used to express the motion element, and 903, 904, and 905 are the attribute values required to express the motion element. The type of motion element is selected from the 14 types shown in FIG. 10. As FIG. 10 also shows, the types of attribute values are determined in advance by the type of motion element.
In the attribute value format of FIG. 11, 1101, 1102, and 1103 are the mean values of the attribute values learned from multiple sets of motion data, and 1104, 1105, and 1106 are the variances of the attribute values learned from the same data. Here, the mean values p1 to pn are obtained by collecting samples and averaging them. The variances s1 to sn express how far the parameters, measured over several trials, scatter around the mean, and are calculated from the same data. The number of dimensions of an attribute value is determined in advance according to the type of attribute value shown in FIG. 10.
[0012]
Next, the recognition process in the present invention will be described.
FIG. 12 is a flowchart showing a recognition process of a static motion element included in simultaneous elements having two or more motion element states.
The dynamic motion element recognition unit 102 in FIG. 1 performs two kinds of recognition: recognition of dynamic motion elements such as vibration and straight lines, and recognition of static motion elements such as shape and direction that belong, in the sign language morpheme dictionary, to simultaneous elements whose number of motion element states is two or more. For dynamic motion element recognition, an existing technique can be used (see, for example, the specification and drawings of Japanese Patent Application No. 6-253457, "Sign Language Recognition Device"). Static motion elements included in simultaneous elements with two or more motion element states can be recognized according to the flowchart shown in FIG. 12.
In FIG. 12, in step 1201, the static motion elements included in simultaneous elements with two or more motion element states are first extracted from the sign language morpheme dictionary, and a list of them is created. The format of a motion element in the list may be the same as the motion element format shown in FIG. 9. Next, in step 1202, data for one time is read from the sign language input unit. In step 1203, if the motion data has ended, the process terminates; otherwise the process proceeds to step 1204. In step 1204, for every motion element in the static motion element list, the evaluation value at that time is obtained from the attribute values of the motion element and the data that was read.
[0013]
The evaluation value is computed as follows. Let n be the number of attribute value types of the static motion element, m(i) the number of dimensions of the i-th attribute value, (P(i,1), P(i,2), ..., P(i,m(i))) the mean of the i-th attribute value described in the sign language morpheme dictionary for the static motion element, (S(i,1), S(i,2), ..., S(i,m(i))) its variance, t the input time, and (X(t,i,1), X(t,i,2), ..., X(t,i,m(i))) the i-th attribute value of the input data. The evaluation value E1(t) at each time is then obtained by the following equation (Equation 1).
[Expression 1]

Next, in step 1205, for each motion element, the time range in which the evaluation value is maximized is obtained from the time series made up of the evaluation values obtained so far and the newly obtained evaluation value. When the time series of evaluation values looks, for example, like the curve 1301 shown in FIG. 13, the range from time 1302 to time 1303 in which the maximum value is obtained is detected as the motion element.
In step 1206, the obtained time range, the corresponding operation element, and the evaluation value are output as a recognition result. In step 1207, the obtained evaluation value for each motion element is stored in a buffer for use in recognition at the next time, and the process returns to step 1202.
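Steps 1204 to 1206 can be sketched as follows. Two points in this sketch are assumptions and are not taken from the specification: each dimension is scored with a Gaussian-shaped function of the dictionary mean and variance, and the "range where the maximum is obtained" is taken to be the contiguous run around the peak whose values stay above a fraction `ratio` of the peak. The function names and the `ratio` parameter are hypothetical. Combining the per-dimension scores by a geometric mean follows the remark made later in the text about the evaluation formulas.

```python
import math
from typing import List, Sequence, Tuple


def evaluation_value(
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
    inputs: Sequence[Sequence[float]],
) -> float:
    """Score one time sample against one static motion element.

    ASSUMPTION: each dimension scores exp(-(x - p)^2 / (2 s)).
    The geometric mean of such scores equals exp(-mean of the exponents).
    """
    exponents: List[float] = []
    for p_i, s_i, x_i in zip(means, variances, inputs):
        for p, s, x in zip(p_i, s_i, x_i):
            exponents.append((x - p) ** 2 / (2.0 * s))
    return math.exp(-sum(exponents) / len(exponents))


def peak_range(values: Sequence[float], ratio: float = 0.9) -> Tuple[int, int]:
    """Return (start, end) indices of the run around the maximum.

    ASSUMPTION: the run extends while values stay >= ratio * peak value.
    """
    peak = max(range(len(values)), key=values.__getitem__)
    floor = values[peak] * ratio
    start = end = peak
    while start > 0 and values[start - 1] >= floor:
        start -= 1
    while end < len(values) - 1 and values[end + 1] >= floor:
        end += 1
    return start, end


# Hypothetical evaluation value time series for one motion element.
series = [0.1, 0.5, 0.95, 1.0, 0.92, 0.4]
detected = peak_range(series)
```

With the series above, the detected range covers the three samples around the peak. A different threshold rule would give a different range, so `ratio` should be read as a tuning knob of this sketch only.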
FIG. 14 is a format diagram of the dynamic motion elements and of the static motion elements detected in the flow of FIG. 12.
In FIG. 14, 1401 is the start time of the time range in which the motion element was detected, 1402 is the end time of that time range, 1403 is the evaluation value of the detected motion element, 1404 is the type of the detected motion element, 1405 is the part of the hand used to express the motion element, and 1406 and 1407 are the attribute values attached to the motion element. The attribute values are obtained from the motion data in the range where the motion element was detected.
[0014]
Next, a method for recognizing sequential elements including dynamic motion elements in the dynamic sequential element recognition unit 104 shown in FIG. 1 will be described.
FIG. 15 is a flowchart of the dynamic sequential element recognition process.
This process is roughly divided into two stages: simultaneous element recognition and sequential element recognition. In step 1501, simultaneous elements composed of dynamic motion elements and simultaneous elements composed of static motion elements with two or more motion element states are extracted from the sign language morpheme dictionary, and a simultaneous element list is created. The format of a simultaneous element in this list is shown in FIG. 16. In FIG. 16, 1601 is the serial number of the simultaneous element in the sign language morpheme dictionary, 1602 is the degree of overlap between motion elements, 1603 is the number of motion element states, and 1604 and 1605 are the motion elements included in the simultaneous element. The format of a motion element is the same as that shown in FIG. 9, and the format of its attribute values is the same as that of FIG. 11.
In step 1502, sequential elements that include simultaneous elements composed of dynamic motion elements, or simultaneous elements that include static motion elements with two or more motion element states, are extracted from the sign language morpheme dictionary, and a correspondence list of simultaneous elements and sequential elements is created. FIG. 17 shows the format of a sequential element in this correspondence list.
In FIG. 17, 1701 is the serial number of the sequential element in the sign language morpheme dictionary, 1702 is the degree of overlap between simultaneous elements, 1703 is the number of simultaneous elements in the sequential element that include dynamic motion elements or static motion elements with two or more motion element states, and 1704 and 1705 are the serial numbers of those simultaneous elements in the sign language morpheme dictionary.
[0015]
In step 1503, one recognition result from the dynamic motion element recognition unit 102 is read. In step 1504, if the recognition result is the last, the process ends. Otherwise, go to step 1505. In step 1505, the simultaneous element constituted by the read operation element is recognized.
This process is performed according to the flowchart shown in FIG. 18. In FIG. 18, in step 1801, the simultaneous elements that include the motion element read in step 1503 are searched for in the simultaneous element list. In step 1802, the number of retrieved simultaneous elements is assigned to the counter i. In step 1803, if the number of motion element states of the i-th retrieved simultaneous element is 1, the process proceeds to step 1804; otherwise it proceeds to step 1805. In step 1804, an evaluation value is obtained from the attribute values of the motion element in the i-th simultaneous element and the attribute values of the read motion element.
Let n be the number of attribute value types of the i-th simultaneous element, m(j) the number of dimensions of its j-th attribute value, (P(j,1), P(j,2), ..., P(j,m(j))) the mean of the j-th attribute value, (S(j,1), S(j,2), ..., S(j,m(j))) its variance, and (X(j,1), X(j,2), ..., X(j,m(j))) the j-th attribute value of the read motion element. The evaluation value E2 can then be obtained by the following equation (Equation 2).
[Expression 2]

[0016]
In step 1805, a motion element sequence identical to the sequence constituting the i-th simultaneous element is searched for among the read motion element and the motion elements in the buffer. In step 1806, the evaluation value of the simultaneous element is obtained from the attribute values of the motion elements in the retrieved sequence and the attribute values of the motion elements in the i-th simultaneous element. Let n be the number of motion elements in the i-th simultaneous element, m(j) the number of attribute value types of its j-th motion element, q(j,k) the number of dimensions of the k-th attribute value of the j-th motion element, (P(j,k,1), P(j,k,2), ..., P(j,k,q(j,k))) the mean of that attribute value, (S(j,k,1), S(j,k,2), ..., S(j,k,q(j,k))) its variance, (X(j,k,1), X(j,k,2), ..., X(j,k,q(j,k))) the k-th attribute value of the read or buffered motion element corresponding to the j-th motion element, G(j) the overlap or gap between the j-th and (j+1)-th motion elements (positive for an overlap, negative for a gap), A the mean of the overlap or gap between motion elements, and σ its variance. The evaluation value E3 can then be obtained by the following equation (Equation 3).
[Equation 3]

[0017]
In step 1807, the obtained evaluation value and the i-th simultaneous element are stored in the buffer of recognized simultaneous elements as a simultaneous element recognition result. The format of a simultaneous element stored in the buffer is shown in FIG. 19. In FIG. 19, 1901 is the start time of the time range of the recognized simultaneous element, 1902 is the end time of that time range, 1903 is the serial number of the simultaneous element in the sign language morpheme dictionary, and 1904 is the evaluation value of the simultaneous element. The start time 1901 and the end time 1902 are calculated from the time ranges of the motion elements on which the recognition of the simultaneous element was based.
Returning to FIG. 15, in step 1506, the sequential elements constituted by the simultaneous elements recognized in step 1505 are recognized.
FIG. 20 is a flowchart of the sequential element recognition process shown in FIG. 15.
In FIG. 20, first, in step 2001, a sequential element including a newly recognized simultaneous element in the buffer of the recognized simultaneous element is searched from the correspondence list of the simultaneous element and the sequential element. In step 2002, the number of retrieved sequential elements is substituted into the counter i. In step 2003, the simultaneous elements constituting the i-th sequential element of the retrieved sequential elements are searched from the buffer of the simultaneous elements. If all the simultaneous elements constituting the i-th sequential element are searched in step 2004, the process proceeds to step 2005. If all the necessary simultaneous elements are not found, the process proceeds to step 2007.
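The check in steps 2003 and 2004, whether every simultaneous element required by a sequential element is present in the buffer, could look like the sketch below. The record type mirrors the fields of FIG. 19, but the names `RecognizedSimultaneous` and `find_components` are hypothetical, and keeping only the most recent result per serial number is an assumption of this sketch.

```python
from typing import Dict, List, NamedTuple, Optional


class RecognizedSimultaneous(NamedTuple):
    start: int      # start time (1901)
    end: int        # end time (1902)
    serial: int     # serial number in the dictionary (1903)
    score: float    # evaluation value (1904)


def find_components(
    required: List[int],
    buffer: List[RecognizedSimultaneous],
) -> Optional[List[RecognizedSimultaneous]]:
    """Return one buffered result per required serial number.

    Returns None when any required simultaneous element is missing,
    which corresponds to branching to step 2007.
    """
    latest: Dict[int, RecognizedSimultaneous] = {}
    for item in buffer:
        latest[item.serial] = item  # ASSUMPTION: the newest result wins
    if all(serial in latest for serial in required):
        return [latest[serial] for serial in required]
    return None


# Hypothetical buffer contents.
buffer = [
    RecognizedSimultaneous(0, 10, 3, 0.8),
    RecognizedSimultaneous(2, 12, 5, 0.7),
]
complete = find_components([3, 5], buffer)
incomplete = find_components([3, 7], buffer)
```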
[0018]
In step 2005, the evaluation value of the i-th sequential element is obtained from the evaluation values of the retrieved simultaneous elements. Let n be the number of simultaneous elements constituting the i-th sequential element, E4(j) the evaluation value of the j-th simultaneous element, O the overlap between the simultaneous elements, A the mean of that overlap, and σ its variance. The evaluation value can then be calculated by the following equation (Equation 4). Note that E4 in the equation can easily be obtained from Equation 1 above.
[Expression 4]

The degree of overlap O between the simultaneous elements in the equation can be obtained by the following equation (5), where the start time of the j-th simultaneous element is s (j) and the end time is e (j).
[Equation 5]

In step 2006, the obtained evaluation value and its sequential element are output as a recognition result. The format of the output sequential element is shown in FIG. 21. In FIG. 21, 2101 is the start time of the time range of the sequential element, 2102 is the end time of that time range, 2103 is the serial number of the sequential element in the sign language morpheme dictionary, and 2104 is the evaluation value of the sequential element. The time range of the sequential element is obtained from the time ranges of the simultaneous elements used to recognize it. For example, the time range of the portion where all the simultaneous elements overlap can be used. Alternatively, the averages of the start times and of the end times of the simultaneous elements may be used.
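The two ways of deriving the time range of a sequential element mentioned above (the common overlapping portion, or the averaged start and end times) can be sketched as follows. The function names are hypothetical, and returning `None` when the ranges share no common portion is a choice made for this sketch only.

```python
from typing import List, Optional, Tuple

Range = Tuple[float, float]  # (start time, end time)


def overlap_range(ranges: List[Range]) -> Optional[Range]:
    """Time range common to all simultaneous elements, if any."""
    start = max(s for s, _ in ranges)
    end = min(e for _, e in ranges)
    return (start, end) if start <= end else None


def average_range(ranges: List[Range]) -> Range:
    """Average of the start times and average of the end times."""
    n = len(ranges)
    return (sum(s for s, _ in ranges) / n, sum(e for _, e in ranges) / n)


# Hypothetical time ranges of three simultaneous elements.
ranges = [(10, 20), (12, 24), (8, 19)]
common = overlap_range(ranges)
averaged = average_range(ranges)
```

The overlap variant yields a narrower range than the averaged one whenever the simultaneous elements are not perfectly aligned.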
[0019]
FIG. 22 is a flowchart showing the process of the static motion element recognition unit shown in FIG. 1.
In the recognition process of the static motion element recognition unit 103 in FIG. 1, the evaluation value of each static motion element is obtained at each time. In step 2201 of FIG. 22, static motion elements are extracted from the sign language morpheme dictionary, and a list of static motion elements is created. Static motion elements that belong to simultaneous elements with two or more motion element states are excluded, because they have already been processed.
The format of a motion element in the motion element list is shown in FIG. 23. In FIG. 23, 2301 is the serial number of the motion element in the sign language morpheme dictionary, 2302 is the type of the motion element, 2303 is the part of the hand used to express the motion element, and 2304 and 2305 are the attribute values attached to the motion element.
In step 2202 of FIG. 22, data for one time is read from the sign language input unit. Next, in step 2203, if the data has ended, the process terminates; otherwise the process proceeds to step 2204. In step 2204, for every motion element in the static motion element list, the evaluation value at that time is obtained from the attribute values of the motion element and the data that was read. The evaluation value can be obtained by Equation 1. In step 2205, the obtained evaluation value and motion element are output as a recognition result. Because the static motion element integration unit 106 uses past recognition results of static motion elements, a buffer for static motion elements is provided and the recognition results are stored in it.
The format of the output motion element is shown in FIG. 24. In FIG. 24, 2401 is the time, 2402 is the serial number of the motion element in the sign language morpheme dictionary, and 2403 is the evaluation value of the motion element.
[0020]
FIG. 25 is a flowchart of the recognition process in the static sequential element recognition unit shown in FIG. 1.
In the recognition process of the static sequential element recognition unit 105 in FIG. 1, sequential elements composed only of static motion elements are recognized from the evaluation values of the static motion elements recognized by the static motion element recognition unit 103. In step 2501 of FIG. 25, sequential elements composed only of static motion elements are extracted from the sign language morpheme dictionary, and a correspondence list of sequential elements and static motion elements is created.
The format of a sequential element in the correspondence list is shown in FIG. 26. In FIG. 26, 2601 is the serial number of the sequential element in the sign language morpheme dictionary, 2602 is the number of static motion elements constituting the sequential element, and 2603 and 2604 are the serial numbers, in the sign language morpheme dictionary, of the static motion elements constituting the sequential element.
In step 2502 of FIG. 25, the static motion element recognition results for one time are read from the static motion element recognition unit 103. In step 2503, if the recognition results have ended, the process terminates; otherwise the process proceeds to step 2504. In step 2504, for each sequential element, the necessary motion elements are selected from the static motion element recognition results that were read, and the evaluation value of the sequential element at that time is obtained. The evaluation value E7(t) at time t can be obtained by the following equation (Equation 6), where n is the number of static motion elements constituting the sequential element and E6(t,i) is the evaluation value of the i-th static motion element at time t. Note that E6(t,i) can easily be obtained by Equation 1.
[Formula 6]

In step 2505, for each sequential element, the time range in which the evaluation value is maximized is searched for using the newly obtained evaluation value and the contents of the buffer that holds the history of past evaluation values of that sequential element. The range and its evaluation value are output as the recognition result of the sequential element. The format of this recognition result is the same as that shown in FIG. 21.
In step 2506, the obtained evaluation value of the sequential element is stored in a buffer for use in processing at the next time.
[0021]
FIG. 27 is a flowchart of the integration process in the static motion element integration unit shown in FIG.
The static motion element integration unit 106 integrates the evaluation values of static motion elements into the evaluation results of sequential elements composed only of dynamic motion elements, and obtains the evaluation value of the whole sequential element. In FIG. 27, first, in step 2701, sequential elements that include dynamic motion elements, or simultaneous elements with two or more motion element states, are extracted from the sign language morpheme dictionary, and a correspondence list of sequential elements and static motion elements is created. The format of each sequential element in the correspondence list is the same as that shown in FIG. 26.
In step 2702, one dynamic sequential element recognition result is read from the dynamic sequential element recognition unit 104. In step 2703, if the dynamic sequential element recognition results have ended, the process terminates; otherwise the process proceeds to step 2704. In step 2704, the recognition results (evaluation values) of the static motion elements corresponding to the dynamic sequential element are read from the static motion element recognition unit 103 for the time range of the dynamic sequential element that was read. In step 2705, the average of the evaluation values that were read is taken as the evaluation value of the static motion elements corresponding to the dynamic sequential element. In step 2706, the evaluation value of the whole sequential element is obtained from the evaluation value of the dynamic sequential element and the evaluation value of the static motion elements. This value can be calculated as the arithmetic mean, the geometric mean, or a similar combination of the two evaluation values. It is also possible to obtain an evaluation value for each type of motion element constituting the sequential element and take their arithmetic or geometric mean. In step 2707, the time range of the dynamic sequential element and the obtained evaluation value are output as the integration result. The format of the integration result is the same as that shown in FIG. 21.
In the flowchart shown in FIG. 27, the evaluation values of the static motion elements are computed for all times, and only those in the required time range are picked from them. Alternatively, an instruction can be sent to the static motion element recognition unit 103 to compute the evaluation values of the required static motion elements only in the time range obtained from the recognition result of the dynamic sequential element recognition unit, and the static motion elements can be integrated after those evaluation results are received.
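Steps 2704 to 2706 amount to averaging the static evaluation values over the time range of the dynamic sequential element and then combining the two scores. The following is a minimal sketch under stated assumptions: the function name `integrate`, the dictionary keyed by time, and the `geometric` flag are hypothetical, and the sketch handles a single static motion element.

```python
import math
from typing import Dict


def integrate(
    dynamic_score: float,
    static_scores: Dict[int, float],  # time -> static evaluation value
    start: int,
    end: int,
    geometric: bool = False,
) -> float:
    """Combine a dynamic sequential element score with static scores."""
    # Step 2704: keep only the static values inside the time range.
    window = [v for t, v in static_scores.items() if start <= t <= end]
    if not window:
        raise ValueError("no static evaluation values in the time range")
    # Step 2705: average them.
    static_score = sum(window) / len(window)
    # Step 2706: arithmetic or geometric mean of the two scores.
    if geometric:
        return math.sqrt(dynamic_score * static_score)
    return (dynamic_score + static_score) / 2.0


# Hypothetical static evaluation values at times 0 to 3.
static_scores = {0: 0.9, 1: 0.125, 2: 0.375, 3: 0.9}
arithmetic = integrate(1.0, static_scores, start=1, end=2)
geometric = integrate(1.0, static_scores, start=1, end=2, geometric=True)
```

The geometric mean penalizes a low static score more strongly than the arithmetic mean does, which may matter when one of the two evaluation values is unreliable.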
[0022]
FIG. 28 is a flowchart of the recognition process in the sign language morpheme recognition unit shown in FIG. 1.
The sign language morpheme recognition unit 107 recognizes sign language morphemes from the recognition results of sequential elements. In FIG. 28, first, in step 2801, a correspondence list of sign language morphemes and the sequential elements constituting them is created from the sign language morpheme dictionary. The format of each sign language morpheme in the correspondence list is shown in FIG. 29. In FIG. 29, 2901 is the sign language morpheme name, 2902 is the degree of overlap between the sequential elements constituting the sign language morpheme, 2903 is the number of sequential elements constituting the sign language morpheme, and 2904 and 2905 are the serial numbers, in the sign language morpheme dictionary, of those sequential elements.
In step 2802, one sequential element recognition result is read. In step 2803, if the sequential element recognition results have ended, the process terminates; otherwise the process proceeds to step 2804. In step 2804, sign language morphemes corresponding to a sequential element string formed from the read sequential element and the sequential elements in the buffer that holds the history of past sequential elements are searched for in the correspondence list. In step 2805, an evaluation value is obtained for each retrieved sign language morpheme from the evaluation values of the sequential elements and the overlap or gap between them. Let n be the number of sequential elements constituting the sign language morpheme, E8(i) the evaluation value of the i-th sequential element, G(i) the overlap or gap between the i-th and (i+1)-th sequential elements, A the mean of the overlap or gap between sequential elements, and σ its variance. The evaluation value E9 can then be obtained by the following equation (Equation 7). Note that E8(i) can easily be obtained from Equation 1 above.
[Expression 7]

[0023]
In step 2806, the obtained evaluation value, sign language morpheme name, and time range are output as a recognition result. The time range of the sign language morpheme is obtained based on the time range of the sequential elements used for recognition. For example, the time range of the sign language morpheme can be from the start time of the first sequential element to the end time of the last sequential element.
FIG. 30 shows the format of the sign language morpheme recognition result. In FIG. 30, 3001 is the start time of the time range of the sign language morpheme, 3002 is the end time of the time range of the sign language morpheme, 3003 is the name of the sign language morpheme, and 3004 is the evaluation value of the sign language morpheme.
In the previous formulas (Equation 1), (Equation 2), (Equation 3), (Equation 4), (Equation 6), and (Equation 7), the evaluation values are calculated by the geometric mean of the evaluation values of the constituent elements. However, a simple average (add all evaluation values of components and divide by the number) may be used. Also, the evaluation value of the component and the evaluation value for the gap or overlap may be weighted.
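The alternatives mentioned above (geometric mean, simple average, and weighting of component scores against gap or overlap scores) can be compared with a small helper. This is a sketch only: the function name `combine`, the `weights` argument, and the use of a weighted geometric mean are assumptions for illustration and are not formulas from the specification. Scores must be positive when the geometric mean is used.

```python
import math
from typing import Optional, Sequence


def combine(
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
    geometric: bool = True,
) -> float:
    """Weighted geometric or arithmetic mean of evaluation values."""
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    if geometric:
        # exp(sum(w * log(s)) / sum(w)) is the weighted geometric mean.
        log_sum = sum(w * math.log(s) for w, s in zip(weights, scores))
        return math.exp(log_sum / total)
    return sum(w * s for w, s in zip(weights, scores)) / total


# Hypothetical component score 0.25 and gap score 1.0.
geo = combine([0.25, 1.0])
avg = combine([0.25, 1.0], geometric=False)
weighted = combine([0.25, 1.0], weights=[3.0, 1.0], geometric=False)
```

Giving the component score three times the weight of the gap score pulls the result toward the component score, which is the kind of weighting the text allows for.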
[0024]
[Effects of the invention]
As described above, according to the present invention, static motion elements are evaluated within the time range determined from the recognition results of dynamic motion elements, so the loss of recognition accuracy caused by errors in the recognized range of static motion elements is eliminated. In addition, for sequential elements composed only of static motion elements, the evaluation value of the whole sequential element is obtained at each time and recognition is performed by detecting the time at which the evaluation value reaches its maximum, so the degradation in recognition caused by shifts in the detection time of each motion element is eliminated. Furthermore, the attribute values that represent the characteristics of motion elements are expressed as continuous quantities obtained from actual motion data, and recognized motion elements are evaluated against them, so appropriate evaluation values can be obtained for motion elements and the accuracy of sign language recognition can be improved.
[Brief description of the drawings]
FIG. 1 is a conceptual block diagram of a sign language recognition device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a sign language motion model based on motion elements in the present invention.
FIG. 3 is a block diagram showing a structure of a dynamic motion element recognition unit shown in FIG.
FIG. 4 is a block diagram showing a structure of a static motion element recognition unit shown in FIG. 1.
FIG. 5 is a hardware configuration diagram of a sign language recognition apparatus for realizing an embodiment of the present invention.
FIG. 6 is a format diagram of data input from the sign language input device of FIG. 1.
FIG. 7 is a format diagram of parameters for dynamic motion element recognition.
FIG. 8 is a description format of a sign language morpheme dictionary.
FIG. 9 is a description format of a motion element.
FIG. 10 is a diagram illustrating types of motion elements and types of respective attribute values.
FIG. 11 is a format of an attribute value of a motion element.
FIG. 12 is a flowchart for explaining recognition processing of static motion elements included in simultaneous elements having two or more motion element states.
FIG. 13 is a diagram for explaining a detection process of a static motion element.
FIG. 14 is a format diagram of the dynamic motion element recognized in FIG. 13.
FIG. 15 is a flowchart for explaining recognition processing of the dynamic sequential element recognition unit shown in FIG. 1.
FIG. 16 is a format diagram of simultaneous elements in a correspondence list of simultaneous elements and motion elements.
FIG. 17 is a format diagram of sequential elements in a correspondence list of sequential elements and simultaneous elements.
FIG. 18 is a flowchart for explaining simultaneous element recognition processing in the dynamic sequential element recognition unit shown in FIG. 1.
FIG. 19 is a format diagram of recognized simultaneous elements.
FIG. 20 is a flowchart for explaining a sequential element recognition process in the dynamic sequential element recognition unit shown in FIG. 1.
FIG. 21 is a format diagram of sequential elements recognized in FIG. 20.
FIG. 22 is a flowchart for explaining recognition processing of the static motion element recognition unit shown in FIG. 1.
FIG. 23 is a format diagram of motion elements in a static motion element list.
FIG. 24 is a format diagram of a static motion element recognized in FIG. 22.
FIG. 25 is a flowchart for explaining recognition processing of the static sequential element recognition unit shown in FIG. 1.
FIG. 26 is a format diagram of sequential elements in a static sequential element list.
FIG. 27 is a flowchart for explaining integration processing of the static motion element integration unit shown in FIG. 1.
FIG. 28 is a flowchart for explaining recognition processing of a sign language morpheme recognition unit shown in FIG. 1.
FIG. 29 is a format diagram of sign language morphemes in a correspondence list of sign language morphemes and sequential elements.
FIG. 30 is a format diagram of sign language morphemes recognized in FIG. 28.
[Explanation of symbols]
101 ... sign language input unit, 102 ... dynamic motion element recognition unit,
103 ... static motion element recognition unit, 104 ... dynamic sequential element recognition unit,
105 ... static sequential element recognition unit, 106 ... static motion element integration unit,
107 ... sign language morpheme recognition unit, 108 ... output unit, 109 ... monitor, 110 ... speaker,
111 ... sign language morpheme dictionary, 201 ... sign language morpheme, 208, 209 ... motion elements,
202, 203, 204 ... sequential elements, 205, 206, 207 ... simultaneous elements,
301, 302, 303 ... dynamic motion element recognition units, 304 to 306 ... parameters,
401, 402, 403 ... static motion element recognition units, 504 to 511 ... memory,
601 ... hand position, 605 ... hand direction, 609 ... bending of fingers,
701 ... motion element name, 705, 707 ... parameter types,
706, 708 ... parameters, 901 ... type of motion element, 902 ... hand part,
903 to 905 ... attribute values, 1101 to 1103 ... average values of attribute values,
1104 to 1106 ... variances of attribute values, 1301 ... time series of evaluation values,
1302 ... time range start time, 1303 ... time range end time.

Claims

[Claim 1]
A sign language recognition device comprising:
sign language input means for converting the shape and movement of the hand into an electrical signal and inputting it as time-series sign language data;
dynamic motion element recognition means for recognizing, from the time-series sign language data input from the sign language input means, basic units of dynamic motion among the basic units of motion, and basic units of static motion contained in a time series of a plurality of basic units of static motion;
static motion element recognition means for recognizing basic units of static motion among the basic units of motion from the time-series sign language data input from the sign language input means;
dynamic sequential element recognition means for recognizing a sequential motion unit composed of one or more basic units of dynamic motion, or of a time series of a plurality of basic units of static motion, received from the dynamic motion element recognition means;
static sequential element recognition means for recognizing a sequential motion unit composed of one or more basic units of static motion received from the static motion element recognition means;
static motion element integration means for integrating the sequential motion unit composed of basic units of dynamic motion, received from the dynamic sequential element recognition means, with the basic units of static motion, received from the static motion element recognition means, based on the time range of the sequential motion unit composed of basic units of dynamic motion;
sign language morpheme recognition means for recognizing a motion as sign language from the recognition results of sequential elements received from the static motion element integration means and the static motion element recognition means;
sign language morpheme dictionary means for storing sign language templates expressed as combinations of basic units of motion, the dictionary means being referred to by each of the above means; and
means for outputting the sign language recognized by the sign language morpheme recognition means in the form of speech or characters.

[Claim 2]
The sign language recognition device according to claim 1,
wherein each basic unit of motion stored in the sign language morpheme dictionary means, which stores the sign language templates, is expressed by a combination of a symbol representing the type of motion and attribute values of the motion expressed as continuous quantities.

[Claim 3]
The sign language recognition device according to claim 2,
wherein the static motion element integration means, which integrates the sequential motion unit composed of basic units of dynamic motion with the basic units of static motion, selects only the recognition results of basic units of static motion that fall within the time range determined by the sequential motion unit composed of basic units of dynamic motion.

[Claim 4]
The sign language recognition device according to claim 3,
wherein the static motion element integration means, which integrates the sequential motion unit composed of basic units of dynamic motion with the basic units of static motion, sends the time range determined by the sequential motion unit composed of basic units of dynamic motion to the static motion element recognition means, which recognizes basic units of static motion, and the static motion element recognition means performs recognition processing only for the received time range and sends the result to the static motion element integration means.

[Claim 5]
The sign language recognition device according to claim 1,
wherein the dynamic sequential element recognition means, which recognizes sequential motion units composed of basic units of dynamic motion, performs recognition based on the degree of overlap between the time ranges of the recognized basic units of dynamic motion.
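As an illustration of the overlap-based recognition in the claim above, the following Python sketch computes a degree of overlap between the time ranges of recognized dynamic motion elements. The claim only states that recognition is based on the degree of overlap; the specific ratio (shared duration divided by combined span), the 0.5 threshold, and the function names `overlap_degree` and `is_simultaneous` are assumptions for illustration.

```python
def overlap_degree(a, b):
    """Shared duration divided by the combined span of two (start, end) ranges.

    Returns 0.0 when the ranges do not overlap. The ratio is an assumed
    definition; the patent text does not fix a formula here.
    """
    shared = min(a[1], b[1]) - max(a[0], b[0])
    span = max(a[1], b[1]) - min(a[0], b[0])
    if shared <= 0 or span <= 0:
        return 0.0
    return shared / span


def is_simultaneous(ranges, threshold=0.5):
    """True when every pair of time ranges overlaps by at least the threshold."""
    return all(
        overlap_degree(ranges[i], ranges[j]) >= threshold
        for i in range(len(ranges))
        for j in range(i + 1, len(ranges))
    )
```

For example, `is_simultaneous([(0, 10), (1, 10)])` accepts two motion elements that nearly coincide in time, while `is_simultaneous([(0, 10), (5, 15)])` rejects a pair that shares only a third of its combined span.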

[Claim 6]
The sign language recognition device according to claim 1,
wherein the static motion element recognition means, which recognizes basic units of static motion, evaluates the data at each time of the time-series sign language data based on the attribute values of the basic units of static motion stored in the sign language morpheme dictionary means, which stores the sign language templates.

[Claim 7]
The sign language recognition device according to claim 1,
wherein the static sequential element recognition means, which recognizes sequential motion units composed of basic units of static motion, obtains an evaluation value for each sequential motion unit by combining, at each time, the evaluation results of the basic units of static motion, and takes the time range in which the calculated evaluation value is at its maximum as the recognition result of that sequential motion unit.
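The peak-based recognition described in the claim above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: combining the per-element evaluation values by a simple average, the `tolerance` parameter, the function names `combine_scores` and `peak_range`, and the sample score series are all hypothetical and do not come from the patent text.

```python
def combine_scores(per_element_scores):
    """Average the evaluation values of all static elements at each time step."""
    return [sum(column) / len(column) for column in zip(*per_element_scores)]


def peak_range(scores, tolerance=0.0):
    """Return (start, end, peak) for the contiguous range around the first
    maximum whose scores stay within `tolerance` of that maximum."""
    peak = max(scores)
    start = end = scores.index(peak)
    while start > 0 and peak - scores[start - 1] <= tolerance:
        start -= 1
    while end < len(scores) - 1 and peak - scores[end + 1] <= tolerance:
        end += 1
    return start, end, peak


# Hypothetical per-frame evaluation values for two static motion elements.
hand_shape = [0.25, 0.5, 1.0, 1.0, 0.5]
palm_direction = [0.25, 0.5, 1.0, 1.0, 0.25]

combined = combine_scores([hand_shape, palm_direction])
result = peak_range(combined)
```

Because the combined evaluation value is searched for its maximum rather than thresholded element by element, small shifts in when each individual static element is detected do not split or misplace the recognized range.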