JP2014038543A

JP2014038543A - Character recognition system and program for recognizing finger character

Info

Publication number: JP2014038543A
Application number: JP2012181493A
Authority: JP
Inventors: Hiroshi Tanaka (田中博); Takaya Shoji (庄司貴哉); Masaki Kato (加藤正樹); Takahiro Sugaya (菅谷隆浩); Hiromitsu Nishimura (西村広光); Takayuki Suzuki (鈴木孝幸)
Original assignee: Ikutoku Gakuen School Corp
Current assignee: Ikutoku Gakuen School Corp
Priority date: 2012-08-20
Filing date: 2012-08-20
Publication date: 2014-02-27

Abstract

PROBLEM TO BE SOLVED: To provide a new method for real-time image recognition of finger characters. SOLUTION: A user wears on both hands color gloves in which the five fingertips and the back of the hand are colored in six different colors; the user is photographed in color with the backs of both hands directed toward the photographing means; and the obtained frame image is divided into two in the X-axis direction. For each of the first and second images obtained by the division, the inter-centroid distances between the centroid (center of gravity) of the colored region of the back of the hand and the centroids of the colored regions of the fingertips are calculated and normalized; data having the normalized inter-centroid distances for the five fingers of the right and left hands as elements is taken as collation data; and the collation data is collated with a template prepared in advance.

Description

The present invention relates to a technique for recognizing finger characters, and more particularly to a technique for recognizing finger characters from images in real time.

Finger characters are a conventionally known method of expressing characters through the shape of the hand and fingers. In Japan, finger characters that express one character with one hand are used: 46 finger characters are provided for the 46 kana characters, and 26 finger characters are provided for the 26 letters of the alphabet.

Meanwhile, various studies have been conducted on techniques for image recognition of such finger characters. In this regard, Non-Patent Document 1 discloses a method of analyzing finger character images of a user wearing a color glove based on a plurality of feature quantities.

However, many of the finger characters currently in use have hand shapes that resemble one another, so it is difficult to distinguish them accurately and in real time by image analysis, and real-time image recognition of finger characters has not yet reached a practical level.

Watanabe et al., "Recognition of Finger Characters Using Color Gloves", IEICE Transactions D-II, Vol. J80-D-II, No. 10, pp. 2713-2722, October 1997

The present invention has been made in view of the above problems in the prior art, and its object is to provide a novel technique for real-time image recognition of finger characters.

With the finger characters currently in use, a user must memorize as many hand shapes as there are characters, and learning them is itself very difficult. In the course of studying how to achieve real-time image recognition of finger characters, the present inventors devised a new kind of finger character that is far easier to learn than conventional finger characters, conceived the configuration of a new character recognition system corresponding to it, and thereby arrived at the present invention.

As described above, the present invention provides a novel technique for real-time image recognition of finger characters.

A diagram showing the color glove in the present embodiment.
A diagram showing the kana character correspondence table (part 1) for the two-handed finger characters in the present embodiment.
A configuration diagram of the character recognition system in the present embodiment.
A functional block diagram of the character recognition device in the present embodiment.
A flowchart of the processing executed by the character recognition device in the present embodiment.
A diagram conceptually showing the image division processing executed by the image division unit in the present embodiment.
A flowchart showing the colored-region extraction processing in the present embodiment.
A conceptual diagram for explaining the collation data generation processing in the present embodiment.
A diagram showing the setting table in the present embodiment.
A diagram conceptually showing the finger stillness determination processing in the present embodiment.
A conceptual diagram for explaining the collation data generation processing in the present embodiment.
A diagram showing invalid collation data in the present embodiment.
A conceptual diagram for explaining rejection of a determination result in the present embodiment.
A conceptual diagram for explaining an alternative method of colored-region extraction in the present embodiment.
A flowchart showing an alternative method of colored-region extraction in the present embodiment.
A diagram showing the kana character correspondence table (part 2) for the two-handed finger characters in the present embodiment.
A diagram showing the kana character correspondence table (part 3) for the two-handed finger characters in the present embodiment.

Hereinafter, the present invention will be described with reference to the embodiments shown in the drawings, but the present invention is not limited to those embodiments. In the drawings referred to below, the same reference numerals are used for common elements, and their description is omitted as appropriate.

Before describing the embodiment of the present invention, the three preconditions adopted by the present invention will be described.

(Precondition 1: Color gloves)
To use the character recognition system of the present invention, the user is required to wear on both hands gloves in which the five fingers and the back of the hand are colored in distinct colors (hereinafter referred to as color gloves). FIG. 1 shows a color glove 500, which is an embodiment of the color glove of the present invention. As shown in FIG. 1, in this embodiment, the tip regions of the five fingers (thumb, index finger, middle finger, ring finger, little finger) of the color glove 500 and a partial region of the back of the hand are colored in six different colors, and the right-hand and left-hand gloves use the same color scheme. In FIG. 1, the back of the hand is colored as a rectangle, but the shape of the colored region is not limited to this.

In this embodiment, from the viewpoint of improving recognition accuracy, the six colors used for the color glove 500 are preferably chosen so that their hue values are as far apart from one another as possible. The color glove 500 can be produced, for example, by coloring the relevant portions of a plain glove afterward with dye or paint, or by building those portions from material of the desired color from the start.

(Precondition 2: New finger characters using both hands)
The present invention adopts a new kind of finger character characterized in that the character to be expressed (hereinafter, the target character) is decomposed into two character components, and each component is represented by the finger shape of the left or right hand. For example, Japanese kana characters (the fifty-sound table) can conceptually be decomposed into a consonant and a vowel, so in the finger characters adopted by the present invention, the finger shape of one hand represents the consonant (that is, the row of the fifty-sound table) and the finger shape of the other hand represents the vowel (that is, the column of the fifty-sound table). Hereinafter, this new kind of finger character is called a "two-handed finger character" to distinguish it from conventional finger characters.

FIG. 2 illustrates a kana character correspondence table for the two-handed finger characters adopted by the present invention. In the example shown in FIG. 2, 15 finger shapes are defined, representing 10 consonants (rows of the fifty-sound table) and 5 vowels (columns of the fifty-sound table); the 10 consonants are assigned to the right hand and the 5 vowels are assigned to the left hand.

In this case, for example, when the user forms the finger shape for the "ka" row with the right hand and the finger shape for the "i" column with the left hand, the user has expressed "ka" row + "i" column = "ki" (き). Simple vowels (a, i, u, e, o) are likewise represented by a combination of a row and a column (for example, "a" row + "a" column = "a" (あ)).
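The row-plus-column composition described above can be sketched as a simple table lookup. This is only an illustration: the romanized keys, the `GOJUON` table, and the `compose_kana` function are hypothetical names, and only three rows are listed rather than the full table of FIG. 2.

```python
# Rows (consonants) are signed with the right hand, columns (vowels) with the left.
# Only three rows are shown; the keys and names here are illustrative assumptions.
GOJUON = {
    "a":  {"a": "あ", "i": "い", "u": "う", "e": "え", "o": "お"},
    "ka": {"a": "か", "i": "き", "u": "く", "e": "け", "o": "こ"},
    "sa": {"a": "さ", "i": "し", "u": "す", "e": "せ", "o": "そ"},
}


def compose_kana(row: str, column: str) -> str:
    """Combine a row (right hand) and a column (left hand) into one kana."""
    return GOJUON[row][column]


print(compose_kana("ka", "i"))  # "ka" row + "i" column
print(compose_kana("a", "a"))   # simple vowel: "a" row + "a" column
```

With a lookup of this kind, the user only needs to learn 15 hand shapes rather than one shape per character.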

(Precondition 3: Shooting conditions)
The character recognition system of the present invention aims to recognize in real time the two-handed finger characters formed by a user, based on a moving image of the user. Therefore, in the present invention, the user is required during shooting to face the camera squarely with the backs of both hands turned toward it. At this time, the user must not cross the hands.

With the three preconditions described above in mind, the embodiment of the present invention will now be described. The following description takes as an example a system implemented on the basis of the kana character correspondence table of FIG. 2.

FIG. 3 shows a configuration diagram of a character recognition system 1000 according to an embodiment of the present invention. As shown in FIG. 3, the character recognition system 1000 of this embodiment includes a photographing means 200, referred to here as a digital video camera; a computer device 100 that analyzes the color images acquired by the photographing means 200 to generate character information corresponding to the target character represented by a two-handed finger character, and generates output data based on that character information; and an output device 300 that outputs the output data generated by the computer device 100. The computer device 100 is communicably connected to the photographing means 200 and the output device 300 via appropriate communication means (wired or wireless).

FIG. 4 shows a functional block diagram of the computer device 100 (hereinafter referred to as the character recognition device 100) that constitutes the character recognition system 1000 of this embodiment.

The character recognition device 100 includes a collation data generation unit 10, a character component determination unit 20, a character information generation unit 30, an output data generation unit 40, a template storage unit 50, and a parameter setting unit 60 that stores the various parameters needed for processing. The collation data generation unit 10 further includes an image reading unit 12, an image division unit 13, a colored region extraction unit 14, an inter-centroid distance calculation unit 15, a normalization unit 16, a finger stillness determination unit 17, and a data validity determination unit 18.

The collation data generation unit 10 generates collation data from the color images acquired by the photographing means 200 according to a predetermined algorithm. The character component determination unit 20 collates the generated collation data with the template data prepared in the template storage unit 50, and determines the character component represented by the user's right hand (consonant: row of the fifty-sound table) and the character component represented by the user's left hand (vowel: column of the fifty-sound table).

Based on the consonant (row) and the vowel (column) output as the determination result, the character information generation unit 30 generates character information (text data) corresponding to the kana character formed by combining the two. The output data generation unit 40 generates output data suited to the output device 300 based on the character information (text data) generated by the character information generation unit 30.

When the output device 300 is a display device of any kind (including a head-mounted display) or a projector, the output data generation unit 40 outputs the text data generated by the character information generation unit 30 as is, and the output device 300 displays the text. When the output device 300 is an audio output device, the output data generation unit 40 further converts the text data generated by the character information generation unit 30 into audio data and outputs it. When the output device 300 has both a display function and an audio output function, text display and audio output can be performed simultaneously.

The configuration of the character recognition device 100 of this embodiment has been outlined above. Next, the processing executed by each functional unit of the character recognition device 100 will be described step by step with reference to FIG. 5. FIG. 4 will also be referred to as appropriate in the following description.

FIG. 5 is a flowchart of the processing executed by the character recognition device 100. When the user instructs the device to start, the data is initialized (step 101), and then the image reading unit 12 reads the latest frame of the color image (RGB image) captured by the photographing means 200 (step 102).

In the following step 103, the image division unit 13 divides the read color image into two in the X-axis direction. FIG. 6 conceptually shows the image division processing executed by the image division unit 13. In this embodiment, it is preferable to reduce the color image in advance by an appropriate scale factor in order to lighten the computational load.

In the example shown in FIG. 6, the read color image is reduced to 320×240, and then, in the reduced color image, the pixel region with X coordinates 0 to 159 is defined as the first image and the pixel region with X coordinates 160 to 319 is defined as the second image. As noted earlier, this embodiment assumes that the user faces the camera lens of the photographing means 200 squarely with the backs of both hands turned toward it, so the right hand always appears in the first image and the left hand always appears in the second image. Accordingly, in this embodiment, the analysis result for the first image is attributed to the right hand, and the analysis result for the second image is attributed to the left hand.
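The division at the horizontal midpoint can be sketched as follows. This is a minimal illustration, assuming a frame is held as a list of pixel rows; the function name `split_frame` is hypothetical and the reduction to 320×240 is taken as already done.

```python
def split_frame(frame):
    """Split a frame (a list of pixel rows) into two halves along the X axis.

    The first half covers X = 0 .. width/2 - 1 and is treated as the right hand;
    the second half covers X = width/2 .. width - 1 and is treated as the left hand.
    """
    width = len(frame[0])
    mid = width // 2
    first_image = [row[:mid] for row in frame]
    second_image = [row[mid:] for row in frame]
    return first_image, second_image


# A dummy 320x240 frame whose "pixels" are just their own (x, y) coordinates.
frame = [[(x, y) for x in range(320)] for y in range(240)]
right_hand_image, left_hand_image = split_frame(frame)
print(right_hand_image[0][-1], left_hand_image[0][0])
```

Because the hand is identified by image position alone, no per-hand color information is needed at this stage.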

If the right and left hands had to be distinguished within the whole image, some contrivance would be needed, such as using different color schemes for the right-hand and left-hand gloves. In this embodiment, by contrast, the right-hand image and the left-hand image are distinguished automatically from the image coordinates, so the right-hand and left-hand gloves can share the same color scheme and the number of colors to be extracted can be kept small (limited to six). With fewer colors to extract, it becomes easier to choose a combination whose hue values are as far apart as possible, which in turn improves recognition accuracy.

Next, the colored region extraction unit 14 executes colored-region extraction processing on each of the first image (hereinafter, the right-hand image) and the second image (hereinafter, the left-hand image) obtained by the division (step 104).

FIG. 7 is a flowchart showing the colored-region extraction processing. First, in step 201, a foreground image is extracted by background subtraction. In this embodiment, a phase for acquiring a background image can be provided before step 102; in that phase, for example, an image of the user with the hands hidden so that the color gloves 500 do not appear is acquired as the background image, and the foreground image is then extracted as the difference from that background image. As a result, as shown in FIG. 8(a), the image region corresponding to the color glove 500 is extracted from each of the right-hand image and the left-hand image. In this embodiment, the foreground extraction by background subtraction may instead be performed before the image division processing described above (step 103).
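A minimal sketch of background subtraction is given below. The text does not specify how the difference is evaluated, so the per-pixel sum of absolute RGB differences, the `threshold` value, and the function name `extract_foreground` are all assumptions made for illustration.

```python
def extract_foreground(frame, background, threshold=30):
    """Keep pixels that differ from the background image; blank out the rest.

    `frame` and `background` are lists of rows of (R, G, B) tuples of equal size.
    `threshold` is an assumed tuning parameter, not a value from the text.
    Background pixels are returned as None.
    """
    foreground = []
    for frame_row, background_row in zip(frame, background):
        out_row = []
        for pixel, background_pixel in zip(frame_row, background_row):
            difference = sum(abs(a - b) for a, b in zip(pixel, background_pixel))
            out_row.append(pixel if difference > threshold else None)
        foreground.append(out_row)
    return foreground


background = [[(10, 10, 10)] * 3 for _ in range(2)]
frame = [row[:] for row in background]
frame[0][1] = (255, 0, 0)  # one pixel of a red fingertip enters the scene
print(extract_foreground(frame, background))
```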

Next, the extracted foreground image (RGB image) is converted into an HSV image (step 202), and the six colored regions of the color glove 500 are then extracted from the converted HSV image by the following procedure.

First, the converted HSV image is binarized based on the H value (hue) (step 203). FIG. 9 illustrates a table 600 that sets the H-value thresholds used for the binarization. For each of the six colored regions of the color glove (thumb, index finger, middle finger, ring finger, little finger, back of the hand), the table 600 stores the color applied (red, yellow, purple, green, pink, blue), the threshold range of the H value for that color, and a label number. In this embodiment, the table 600 and the various threshold parameters described later are managed in the parameter setting unit 60.

In step 203, the two H-value thresholds associated with the color of the region to be extracted (that is, the minimum H value H_min and the maximum H value H_max) are applied: pixels whose H value is at least H_min and at most H_max are set to "1" and all other pixels are set to "0", converting the HSV image into a binarized image.
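The hue-based binarization can be sketched with the standard-library `colorsys` module. The hue range used for blue below is a hypothetical stand-in, since the actual values of table 600 are not given in the text, and expressing hue in degrees is likewise an assumption. Background pixels are represented as `None`.

```python
import colorsys


def hue_degrees(rgb):
    """Return the hue of an (R, G, B) pixel (0-255 per channel) in degrees."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360.0


def binarize_by_hue(image, h_min, h_max):
    """Set pixels with h_min <= H <= h_max to 1 and all other pixels to 0."""
    return [
        [1 if pixel is not None and h_min <= hue_degrees(pixel) <= h_max else 0
         for pixel in row]
        for row in image
    ]


# Hypothetical hue range for the blue back-of-hand patch (not from table 600).
BLUE_RANGE = (200.0, 260.0)
image = [[(0, 0, 255), (0, 255, 0), None]]
print(binarize_by_hue(image, *BLUE_RANGE))
```

A simple closed interval like this cannot express a range that wraps around 0 degrees, which may matter for reds; how the original system handles that case is not stated.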

Next, noise removal is applied to the binarized image (step 204), and labeling is then applied to the denoised binarized image according to an appropriate algorithm such as 4-connectivity or 8-connectivity (step 205). As a result, for example, all pixels making up the colored region of the thumb are labeled with the label number [1] set in the table 600.
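A minimal sketch of 4-connected labeling follows. As a simplification, the noise removal of step 204 is approximated here by discarding connected components smaller than `min_area`; that rule, the parameter value, and the function name `label_region` are assumptions, since the text does not say which noise removal method is used.

```python
from collections import deque


def label_region(mask, label, min_area=2):
    """Assign `label` to every sufficiently large 4-connected component of 1-pixels."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over the 4-connected neighbours.
            component, queue = [], deque([(x, y)])
            seen[y][x] = True
            while queue:
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Components below min_area are treated as noise and left unlabeled.
            if len(component) >= min_area:
                for cx, cy in component:
                    labels[cy][cx] = label
    return labels


thumb_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],  # the isolated pixel on the right is treated as noise
    [0, 0, 0, 0],
]
print(label_region(thumb_mask, label=1))
```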

In the following step 206, it is determined whether labeling has been completed for all colors, and steps 203 to 205 described above are repeated until labeling has been completed for all six colors (step 206, No). As a result, the pixels making up the colored regions (thumb, index finger, middle finger, ring finger, little finger, back of the hand) are labeled with the label numbers [1], [2], [3], [4], [5], and [6] set in the table 600, respectively.

When labeling has been completed for all six colors (step 206, Yes), the labeling results for the six colors are finally merged. FIG. 8(b) shows the result of the merge: three colored regions corresponding to the back of the hand, the thumb, and the index finger have been extracted from the right-hand image, and two colored regions corresponding to the back of the hand and the index finger have been extracted from the left-hand image. When step 207 ends, the process proceeds to step 105 shown in FIG. 5.

In the following step 105, the inter-centroid distance calculation unit 15 calculates the centroid (center of gravity) of each extracted colored region, that is, the XY coordinates of its centroid pixel. FIG. 8(c) shows the centroids calculated for the colored regions shown in FIG. 8(b).
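The centroid calculation can be sketched as the mean of the pixel coordinates carrying each label. This is an illustration under the assumption that the merged labeling result is a 2D grid of label numbers (0 for unlabeled); the function name `region_centroids` is hypothetical, and the mean may be fractional whereas the text speaks of a centroid pixel.

```python
def region_centroids(labels):
    """Return {label: (x, y)} with the mean coordinates of each labeled region."""
    sums = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label:
                sum_x, sum_y, count = sums.get(label, (0, 0, 0))
                sums[label] = (sum_x + x, sum_y + y, count + 1)
    return {label: (sum_x / count, sum_y / count)
            for label, (sum_x, sum_y, count) in sums.items()}


merged = [
    [1, 1, 0, 0],
    [1, 1, 0, 6],
    [0, 0, 0, 6],
]
print(region_centroids(merged))
```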

In the following step 106, it is determined whether data (centroids of the colored regions) for two frames has been acquired. When steps 103 to 105 have finished for the very first frame, data for two frames has naturally not yet been acquired (step 106, No), so the inter-centroid distance calculation unit 15 notifies the image reading unit 12 to that effect. In response, the process returns to step 102 and the image reading unit 12 reads the next frame. The image reading unit 12 may be configured to read adjacent frames in sequence, or to thin out frames and read them at a predetermined time interval.

When the next frame has been read, steps 103 to 105 described above are repeated and the process returns to step 106. At this point, data for two frames has been acquired (step 106, Yes), so the process proceeds to step 107. In step 107, the finger stillness determination unit 17 executes the finger stillness determination processing.

The finger stillness determination processing in this embodiment is described below. When recognizing two-handed finger characters from a moving image of the user, the question arises of which frame image should be analyzed to recognize the target character. One conceivable approach is to signal a predetermined input period for one character (for example, a 1-second period) to the user with light or sound, and have the user form each two-handed finger character in time with that signal.

The present invention does not exclude such a fixed input period. With that approach, however, if the input period is too long, users accustomed to two-handed finger characters will find waiting for the next period frustrating; conversely, if the input period is too short, users unaccustomed to two-handed finger characters cannot keep up with the timing the device demands, recognition is performed on images of half-formed hand shapes, and recognition efficiency may deteriorate.

In this regard, this embodiment adopts a configuration that detects the moment at which a two-handed finger character is completed by determining whether the user's fingers are stationary, and executes the recognition processing in synchronization with that moment.

Specifically, after data (centroids of the colored regions) has been acquired for two temporally successive frames, the amount of movement of the colored-region centroids between the two frame images is evaluated using an appropriate evaluation function, and whether the user's fingers are stationary is determined based on the result.

In this embodiment, for example, the evaluation function shown in Formula (1) can be used: when the evaluation value L is greater than a predetermined threshold, the fingers are determined not to be stationary, and when the evaluation value L is less than the threshold, the fingers are determined to be stationary.

In Formula (1), x_i(t) and y_i(t) denote the x and y coordinates of the centroid of the colored region of finger i extracted in the frame at time t (the latest frame); x_i(t−Δt) and y_i(t−Δt) denote the x and y coordinates of the centroid of the colored region of finger i extracted in the frame at time t−Δt (the previous frame); and n denotes the number of fingers i whose colored regions were extracted.
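Formula (1) itself is not reproduced in this text. One form that is consistent with the symbol definitions above is the mean Euclidean displacement of the fingertip centroids between the two frames; this is offered only as an assumed reconstruction, and the original formula may differ (for example, by omitting the 1/n factor or by using squared distances).

```latex
% Assumed form of the evaluation function; not taken verbatim from the source.
L = \frac{1}{n} \sum_{i=1}^{n}
    \sqrt{\bigl(x_i(t) - x_i(t-\Delta t)\bigr)^2 + \bigl(y_i(t) - y_i(t-\Delta t)\bigr)^2}
```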

FIG. 10 is a conceptual diagram for explaining the finger stillness determination processing executed by the finger stillness determination unit 17. In the example shown in FIG. 10, the evaluation value L based on the centroid coordinates of the index finger colored region (2) in the left-hand image of the first frame and in the left-hand image of the second frame is less than the threshold, whereas the evaluation value L based on the centroid coordinates of the thumb colored region (1) and the index finger colored region (2) in the right-hand image of the first frame and in the right-hand image of the second frame is equal to or greater than the threshold.

In this case, the fingers of the left hand are presumed to be stationary but the fingers of the right hand are not, so the finger stillness determination unit 17 determines that the fingers of both hands are not stationary (step 108, No) and notifies the inter-centroid distance calculation unit 15 to that effect. In response, the inter-centroid distance calculation unit 15 discards the data of the first frame (step 109) and notifies the image reading unit 12 accordingly. The process then returns to step 102, and the image reading unit 12 reads the next frame (the third frame). After that, the process passes through steps 103 to 106 and returns to step 107, where the finger stillness determination unit 17 again performs the finger stillness determination.

In the second stillness determination, the evaluation value L based on the centroid coordinates of the index-finger colored region (2) obtained for the left-hand images of the second frame and the third frame is less than the threshold, and the evaluation values L based on the centroid coordinates of the thumb colored region (1) and the index-finger colored region (2) obtained for the right-hand images of the second frame and the third frame are also both less than the threshold. In this case, the finger stillness determination unit 17 determines that the fingers of both hands are stationary (step 108, Yes) and notifies the inter-centroid distance calculation unit 15 accordingly.
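The frame-to-frame check described above can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: it assumes that the evaluation value L is the Euclidean displacement of a fingertip centroid between two consecutive frames, and the function names, the dictionary layout (finger number mapped to an (x, y) centroid), and any threshold value are hypothetical.

```python
import math


def movement(prev_xy, curr_xy):
    """Assumed evaluation value L: displacement of one centroid between frames."""
    return math.hypot(curr_xy[0] - prev_xy[0], curr_xy[1] - prev_xy[1])


def hand_is_still(prev_centroids, curr_centroids, threshold):
    """True when every finger extracted in both frames moved less than threshold.

    Fingers extracted in only one of the two frames are ignored here; how the
    embodiment treats that case is not stated, so this is an assumption.
    """
    shared = prev_centroids.keys() & curr_centroids.keys()
    return all(
        movement(prev_centroids[f], curr_centroids[f]) < threshold for f in shared
    )


def both_hands_still(prev_frame, curr_frame, threshold):
    """Both hands must be still before collation data is generated."""
    return all(
        hand_is_still(prev_frame[hand], curr_frame[hand], threshold)
        for hand in ("left", "right")
    )
```

With this layout, a frame pair in which only the right-hand fingers move is rejected, and the older frame would be discarded before the next frame is read.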

In response, the inter-centroid distance calculation unit 15 calculates the separation distance between the centroid of the colored region of finger i and the centroid of the colored region of the back of the hand (the inter-centroid distance d_i) based on the following formula (2) (step 110).

$$d_i = \sqrt{(fx_i - px)^2 + (fy_i - py)^2} \tag{2}$$

In formula (2), (px) and (py) denote the x and y coordinates of the centroid of the colored region of the back of the hand, and (fx_i) and (fy_i) denote the x and y coordinates of the centroid of the colored region of finger i.

As a result, as shown in FIG. 11(a), for the right-hand image, the inter-centroid distance d_1 between the centroid pixel (6) of the back-of-hand colored region and the centroid pixel (1) of the thumb colored region, and the inter-centroid distance d_2 between the centroid pixel (6) of the back-of-hand colored region and the centroid pixel (2) of the index-finger colored region, are calculated. For the left-hand image, the inter-centroid distance d_2 between the centroid pixel (6) of the back-of-hand colored region and the centroid pixel (2) of the index-finger colored region is calculated.

The inter-centroid distance d may be calculated from either one of the two frames used for the stillness determination (the second frame and the third frame), or it may be calculated for each of the two frames and then averaged, for example. In this calculation, the inter-centroid distance d of any finger i whose colored region was not extracted is set to 0.
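As an illustration of step 110, the sketch below builds the five-element set data for one hand from the centroid coordinates. It assumes a Euclidean separation distance and the finger numbering 1 (thumb) to 5 (little finger) used in the figures; the function name and the dictionary layout are hypothetical.

```python
import math

FINGERS = (1, 2, 3, 4, 5)  # 1 = thumb, 2 = index finger, ..., 5 = little finger


def centroid_distances(back_centroid, finger_centroids):
    """Return [d_1, ..., d_5] for one hand.

    back_centroid: (px, py) of the back-of-hand colored region.
    finger_centroids: {finger number: (fx, fy)} for the fingers whose colored
    regions were extracted. A finger that was not extracted gets 0.
    """
    px, py = back_centroid
    data = []
    for i in FINGERS:
        if i in finger_centroids:
            fx, fy = finger_centroids[i]
            data.append(math.hypot(fx - px, fy - py))
        else:
            data.append(0.0)
    return data
```

For a hand in which only the thumb and index finger are visible, the last three elements of the returned list are 0.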

As a result of step 110, as shown in FIG. 11(b), set data whose elements are the inter-centroid distances d_1 to d_5 (hereinafter referred to as data D) is obtained for each of the right-hand image and the left-hand image. In the subsequent step 111, the normalization unit 16 normalizes the data D.

The magnitudes of the five elements (inter-centroid distances d) of the data D vary with the distance between the photographing means 200 and the user. To eliminate this distance dependence, the normalization unit 16 normalizes the inter-centroid distances d based on the area S of the colored region of the back of the hand. In this embodiment, for example, the inter-centroid distance d_i of finger i can be normalized by the following formula (3).

In formula (3), S denotes the area (number of pixels) of the colored region of the back of the hand, and d_i′ denotes the normalized inter-centroid distance of finger i.
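One way to remove the distance dependence is sketched below. Formula (3) is not reproduced in this text, so the sketch assumes, purely for illustration, that each distance is divided by the square root of the area S: a length scales linearly with apparent size and an area scales quadratically, so this ratio is unchanged when the hand appears larger or smaller. The function name is hypothetical.

```python
import math


def normalize(distances, back_area):
    """Scale distances by sqrt(S), an assumed stand-in for formula (3).

    distances: [d_1, ..., d_5] in pixels.
    back_area: S, the pixel count of the back-of-hand colored region.
    """
    if back_area <= 0:
        raise ValueError("back-of-hand colored region was not extracted")
    scale = math.sqrt(back_area)
    return [d / scale for d in distances]
```

Under this assumption, a hand photographed at half the distance (all lengths doubled, area quadrupled) yields the same normalized values.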

As a result of step 111, the data D shown in FIG. 11(b) is normalized as shown in FIG. 11(c). Hereinafter, the normalized set data is referred to as data D′. Once the data D′ has been generated in step 111, the data validity determination unit 18 determines the validity of the data D′ in the subsequent step 112.

When a person passes behind the user, the user moves, or the lighting conditions change, a color identical to one of the colors of the color glove may be extracted at a position in the acquired image other than the glove. Data D′ generated under such circumstances lacks validity and is preferably rejected. In this embodiment, therefore, the validity of the data D′ is determined in light of anatomical constraints on the human hand and fingers.

Anatomical constraints on the human hand and fingers include quantitative conditions, such as the length of a human finger being roughly fixed, and qualitative conditions, such as human fingers not crossing one another. Here, a process that determines the validity of the data D′ in light of a constraint on finger length is described as an example.

In this case, a numerical range for the normalized inter-centroid distance d′ that is anatomically consistent with the length of a human finger is set in advance as the constraint. The validity of the data D′ for the left and right hands (data D′_R for the right hand and data D′_L for the left hand) is then determined according to whether all five elements (inter-centroid distances d′) of the data satisfy the constraint.

Suppose the constraint on the inter-centroid distance d′ is set to 2 ≤ d′ ≤ 4. In the data D′ illustrated in FIG. 12, the inter-centroid distance d_5′ of the little finger in the right-hand data D′_R does not satisfy the constraint. In this case, the data validity determination unit 18 determines that the data D′ shown in FIG. 12 is not valid (step 113, No).

Following this determination, the process returns to step 101 and all data is initialized. The image reading unit 12 then reads the next frame and the procedure described above is repeated. In this embodiment, it is preferable to issue an alert to the user when the number of consecutive frames determined to be invalid exceeds a predetermined count.

On the other hand, when all five elements (inter-centroid distances d′) of the data D′ satisfy the constraint, the data validity determination unit 18 determines that the data D′ is valid (step 113, Yes) and notifies the normalization unit 16 accordingly. In response, the normalization unit 16 generates collation data containing the data D′ and passes it to the character component determination unit 20.
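The range check of steps 112 and 113 can be sketched as below, using the example constraint 2 ≤ d′ ≤ 4 from the text. The function names are hypothetical, and the sketch applies the range to all five elements exactly as stated; whether elements set to 0 for unextracted fingers are exempted is not specified in this passage, so no exemption is assumed.

```python
def satisfies_constraint(d_prime, lower=2.0, upper=4.0):
    """True when every normalized distance lies within [lower, upper]."""
    return all(lower <= value <= upper for value in d_prime)


def data_is_valid(d_prime_right, d_prime_left, lower=2.0, upper=4.0):
    """Data D' is valid only when both hands pass the range check."""
    return satisfies_constraint(d_prime_right, lower, upper) and satisfies_constraint(
        d_prime_left, lower, upper
    )
```

A single out-of-range element in either hand, such as an implausibly long little-finger distance, is enough to reject the frame.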

Upon receiving the collation data for the right hand and the left hand (data D′_R and data D′_L) from the normalization unit 16, the character component determination unit 20 collates the two sets of collation data D′_R and D′_L against the template data held in the template storage unit 50.

The template storage unit 50 stores, as template data, 15 sets of data D′ generated in advance, by the same procedure as described above, from images of the 15 color-glove finger shapes that correspond to the character components shown in FIG. 2 (consonants: rows of the Japanese syllabary; vowels: columns of the Japanese syllabary).

The character component determination unit 20 calculates the inter-vector distances between the two sets of collation data D′_R and D′_L and the 15 sets of template data held in the template storage unit 50, identifies the template data that gives the minimum inter-vector distance, and outputs the character component associated with that template data (consonant: row of the syllabary; vowel: column of the syllabary) to the character information generation unit 30 as the determination result.

Specifically, in step 114 following step 113, the character component determination unit 20 calculates inter-vector distances for each of the two sets of collation data (data D′_R and data D′_L) received from the normalization unit 16. In this embodiment, the right-hand collation data D′_R is collated against the 10 sets of template data for consonants (rows of the syllabary), and the left-hand collation data D′_L is collated against the 5 sets of template data for vowels (columns of the syllabary).

In this embodiment, the template data that gives the minimum inter-vector distance calculated in step 114 may be used directly as the determination result. Preferably, however, the following processing (steps 115 to 116) is executed to improve recognition accuracy.

That is, in step 115 following step 114, the character component determination unit 20 obtains, for each of the two sets of collation data, the difference between the minimum inter-vector distance and the next smallest inter-vector distance, and then determines whether that difference is greater than a predetermined threshold α (step 116).

If the difference is not greater than the threshold α for at least one of the two sets of collation data (step 116, No), the likelihood of misrecognition is high, so the process returns to step 101, all data is initialized, and the image reading unit 12 reads the next frame and repeats the procedure described above. If the difference is greater than the threshold α for both sets of collation data (step 116, Yes), the process proceeds to step 117, and the character components corresponding to the template data that gave the minimum inter-vector distances (consonant: row of the syllabary; vowel: column of the syllabary) are output to the character information generation unit 30 as the determination result.

FIG. 13 is a conceptual diagram illustrating the collation data rejection process. FIG. 13 shows the calculated inter-vector distances between the left-hand collation data D′_L and the five sets of template data for vowels (columns of the syllabary). Suppose the threshold α = 2.0. In the example shown in FIG. 13(a), the difference [1.53] between the minimum inter-vector distance [1.47] and the next smallest value [3.00] is not greater than the threshold [2.0], so the character component determination unit 20 aborts the determination. The process then returns to step 101, all data is initialized, and the image reading unit 12 reads the next frame and repeats the procedure described above.

In the example shown in FIG. 13(b), by contrast, the difference [2.91] between the minimum inter-vector distance [0.09] and the next smallest value [3.00] is greater than the threshold [2.0], so the character component determination unit 20 outputs the character component "a-dan" (the a-column), which corresponds to the template data that gave the minimum value [0.09], to the character information generation unit 30 as the determination result.
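Steps 114 to 116 amount to nearest-template matching with a margin test, which can be sketched as follows. The sketch assumes the inter-vector distance is the Euclidean distance between five-element vectors; the function names and template labels are hypothetical, and at least two templates are required.

```python
import math


def inter_vector_distance(a, b):
    """Assumed Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def margin_select(distances, alpha):
    """Return the label with the smallest distance, or None when the gap to the
    runner-up is not greater than alpha (likely misrecognition)."""
    ranked = sorted(distances.items(), key=lambda item: item[1])
    (best_label, best), (_, second) = ranked[0], ranked[1]
    return best_label if (second - best) > alpha else None


def classify(collation, templates, alpha):
    """Collate one hand's data D' against {label: template vector}."""
    distances = {
        label: inter_vector_distance(collation, vector)
        for label, vector in templates.items()
    }
    return margin_select(distances, alpha)
```

With α = 2.0, a minimum of 1.47 against a runner-up of 3.00 is rejected, while a minimum of 0.09 against 3.00 is accepted, matching the two cases described for FIG. 13.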

The character information generation unit 30 generates text data for the kana character formed by combining the two received character components (consonant: row of the syllabary; vowel: column of the syllabary) and outputs it to the output data generation unit 40. The output data generation unit 40 generates output data suited to the output device 300 based on the received text data and outputs it to the output device 300.
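Combining a row and a column into a kana character is a table lookup, as the following sketch shows. The romanized keys and the function name are hypothetical, the table is the standard Japanese syllabary grid with its gaps left as None, and the handling of characters outside this grid (for example ん) is not covered because the corresponding figure is not reproduced in this passage.

```python
VOWELS = ("a", "i", "u", "e", "o")  # columns (dan), assumed to come from the left hand

# Rows (gyo), assumed to come from the right hand. None marks an empty cell.
GOJUON = {
    "a": ("あ", "い", "う", "え", "お"),
    "ka": ("か", "き", "く", "け", "こ"),
    "sa": ("さ", "し", "す", "せ", "そ"),
    "ta": ("た", "ち", "つ", "て", "と"),
    "na": ("な", "に", "ぬ", "ね", "の"),
    "ha": ("は", "ひ", "ふ", "へ", "ほ"),
    "ma": ("ま", "み", "む", "め", "も"),
    "ya": ("や", None, "ゆ", None, "よ"),
    "ra": ("ら", "り", "る", "れ", "ろ"),
    "wa": ("わ", None, None, None, "を"),
}


def compose_kana(row, vowel):
    """Return the kana at the given row and column, or None for an empty cell."""
    return GOJUON[row][VOWELS.index(vowel)]
```

For example, the row "ka" combined with the column "i" yields き.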

Meanwhile, in step 118, the image reading unit 12 determines whether the user has instructed termination. If termination has not been instructed (step 118, No), the process returns to step 101 and all data is initialized; steps 102 to 117 are then repeated to recognize the next target character. If the user has instructed termination (step 118, Yes), the process ends.

The processing executed by the character recognition device 100 of this embodiment has been described above. Next, a further preferred alternative embodiment of the colored-region extraction process described with reference to FIG. 7 is described.

In the present invention, as already described, the area of the colored region of the back of the hand serves as the basis for normalizing the inter-centroid distances. However, part of that region may be lost after background subtraction, in which case the reference area changes and proper normalization is not achieved. Conversely, if the analysis were performed on the original image without background subtraction, every pixel, including the background, would have to be analyzed, at the cost of processing speed. The alternative method described below achieves proper normalization without sacrificing processing speed.

FIG. 15 is a flowchart showing the alternative method for the colored-region extraction process. In the alternative method, first, in step 301, a foreground image is extracted by the background subtraction method from the original image shown in FIG. 14(a).

Next, as shown in FIG. 14(b), the pixel region enclosed by the four sides of the rectangle circumscribing the extracted foreground image (the image region corresponding to the color glove) is defined as the hand region T (step 302).

Next, as shown in FIG. 14(c), only the colored region of the back of the hand is extracted from the hand region T of the original image before background subtraction is applied (step 303), while, as shown in FIG. 14(d), the colored region of each finger is extracted from the foreground image after background subtraction is applied (step 304).

With this alternative method, the colored region of the back of the hand, which is the basis for normalization, is extracted from the original image before background subtraction is applied and can therefore be extracted without defects. Because the region analyzed at that time is limited to the minimum necessary range (the hand region), proper normalization is achieved without sacrificing processing speed.
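The idea behind steps 302 and 303 can be sketched with plain nested lists. The sketch assumes a binary foreground mask and a per-pixel hue image of the same size; the function names and the hue range used for the back of the hand are hypothetical.

```python
def hand_region(mask):
    """Bounding box (x0, y0, x1, y1) of the foreground pixels, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return min(cols), min(rows), max(cols), max(rows)


def back_area_in_region(hue, box, hue_low, hue_high):
    """Count back-of-hand pixels inside the hand region of the ORIGINAL image.

    Pixels lost from the foreground mask are still counted, because the hue
    test is applied to the original image rather than to the masked one.
    """
    x0, y0, x1, y1 = box
    return sum(
        1
        for y in range(y0, y1 + 1)
        for x in range(x0, x1 + 1)
        if hue_low <= hue[y][x] <= hue_high
    )
```

Restricting the scan to the bounding box keeps the number of inspected pixels small, and a background pixel of the same hue outside the box does not inflate the area.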

The character recognition system of the present invention has been described above by way of embodiments. As noted, according to the present invention the user needs to learn at most only 15 finger shapes to express the Japanese kana characters (the Japanese syllabary), which makes learning far easier. Moreover, with only about 15 finger shapes to define, similar finger shapes need not be adopted, so recognition accuracy can be maximized.

Further, according to the present invention, the moment at which a two-handed finger character is completed is detected dynamically by determining whether the user's fingers are stationary, and recognition of the two-handed finger character is executed in synchronization with that moment, so users with differing levels of proficiency can be accommodated flexibly.

In addition, because the present invention uses for collation set data whose elements are the inter-centroid distances of the five fingers, a common template can be used for the left and right hands. Because collation data of a simple form (a five-dimensional vector) is adopted, the computational load is greatly reduced, so real-time performance is readily achieved.

Further, because the present invention verifies both the validity of the generated collation data and the validity of the collation result, misrecognition is avoided and recognition accuracy is improved.

The present invention is not limited to the embodiments described above. Any embodiment that a person skilled in the art could conceive falls within the scope of the present invention as long as it provides the functions and effects of the present invention. Matters included in the scope of the present invention are exemplified below.

The embodiment described above prepares 15 templates for the Japanese kana characters (the Japanese syllabary). In another embodiment, as shown in FIG. 16, some of the finger shapes for vowels (columns of the syllabary) and for consonants (rows of the syllabary) can be shared. In that case, the user needs to learn only 10 finger shapes, which makes learning even easier, and the number of finger shapes to be distinguished falls to two thirds, which further improves recognition accuracy.

The embodiment described above collates the right-hand and left-hand collation data against dedicated template data (that is, the 10 sets of template data for consonants (rows of the syllabary) and the 5 sets of template data for vowels (columns of the syllabary)). In another embodiment, the right-hand and left-hand collation data may each be collated against all 15 sets of template data. In that case, 15 × 15 = 225 two-handed finger characters can be defined in theory. FIG. 17 shows an example in which 15 sets of template data are assigned to the right hand. In this case, for example, when the left hand forms the "a-dan" finger shape and the right hand forms the same finger shape as "a-dan", "i-dan", "u-dan", "e-dan", or "o-dan", the combinations can be defined as the dakuten (voiced sound mark), the handakuten (semi-voiced sound mark), the period (kuten), the comma (touten), and the long vowel mark, respectively.

Furthermore, although the embodiment described above uses Japanese kana characters (the Japanese syllabary) as the characters (target characters) represented by two-handed finger characters, the present invention does not limit the target characters to Japanese kana and can be applied to any character that can be decomposed into two character components. For example, a kanji can be decomposed graphically into two character components, a radical (bushu) and a tsukuri (the right-hand component), and a Hangul character can be decomposed into two character components, a vowel letter and a consonant letter, so it suffices to define appropriate finger shapes representing these two character components.

In addition, although the embodiment described above shows a system configuration in which the photographing means 200, the character recognition device 100 (computer device), and the output device 300 are separate, in another embodiment the functions of these devices can be integrated into a single device (for example, a smartphone or a tablet PC). Conversely, the functional units constituting the character recognition device 100 shown in FIG. 4 can be distributed over a network in appropriate units to form a network system. Any other embodiment that a person skilled in the art could conceive falls within the scope of the present invention as long as it provides the functions and effects of the present invention.

Each function of the embodiments described above can be realized by a device-executable program written in C or in an object-oriented programming language such as C++, C#, or Java (registered trademark). The program of this embodiment can be stored and distributed on a device-readable recording medium such as a hard disk device, CD-ROM, MO, DVD, flexible disk, EEPROM, or EPROM, and can be transmitted over a network in a format usable by other devices.

The character recognition system of the present invention described above was built using a commercially available Web camera (effective pixels: 5 megapixels, frame rate: 30 fps) and a personal computer, and an experiment was conducted to verify its recognition accuracy. The color gloves were made by coloring commercially available white cotton work gloves with oil-based markers in six colors. Table 1 below shows the marker colors used, the locations colored, and the hue value (H value) parameters used to extract the colored regions.

In advance, each of four subjects (A, B, C, D) wearing the color gloves was asked to form the 15 finger shapes shown in FIG. 2, and 15 sets of template data were generated per subject. Each subject, wearing the color gloves at a position 50 cm from the Web camera, was then asked to form the same 15 finger shapes shown in FIG. 2, and 15 sets of collation data were generated per subject.

For the template data and the collation data, inter-vector distances were calculated for the following combinations (1) to (5), and the template that gave the minimum inter-vector distance was taken as the determination result.
(1) Template data of subject A / collation data of subject A
(2) Template data of subject A / collation data of subject B
(3) Template data of subject A / collation data of subject C
(4) Mixed template data of subjects A and D / collation data of subject B
(5) Mixed template data of subjects A and D / collation data of subject C

As a result, the accuracy rate was 100% for all of the combinations (1) to (5). The time required for recognition was approximately 46.8 milliseconds, demonstrating that the character recognition system of the present invention can recognize finger shapes in real time.

DESCRIPTION OF SYMBOLS
10 ... Collation data generation unit
12 ... Image reading unit
13 ... Image division unit
14 ... Colored region extraction unit
15 ... Inter-centroid distance calculation unit
16 ... Normalization unit
17 ... Finger stillness determination unit
18 ... Data validity determination unit
20 ... Character component determination unit
30 ... Character information generation unit
40 ... Output data generation unit
50 ... Template storage unit
60 ... Parameter setting unit
100 ... Character recognition device (computer device)
200 ... Photographing means
300 ... Output device
500 ... Color glove
600 ... Setting table
1000 ... Character recognition system

Claims

A character recognition system for recognizing, by image recognition, a finger shape of a user that represents a target character, the system comprising:
photographing means for photographing in color a user who wears, on both hands, color gloves in which the tip region of each of the five fingers and a partial region of the back of the hand are colored in six mutually different colors, with the backs of both hands facing the photographing means;
template storage means for storing first template data and second template data in association with a first character component and a second character component, respectively, that constitute the target character;
collation data generation means for generating first collation data and second collation data based on a frame image read from the photographing means;
character component determination means for calculating a first inter-vector distance between the first collation data and the first template data and a second inter-vector distance between the second collation data and the second template data, and for outputting, as a determination result, the first character component associated with the first template data having the smallest first inter-vector distance and the second character component associated with the second template data having the smallest second inter-vector distance; and
character information generation means for generating character information corresponding to the target character composed of the first character component and the second character component output as the determination result.

The collation data generating means includes
Image dividing means for dividing the frame image in two in the X-axis direction, into a first image showing the color glove worn on one hand and a second image showing the color glove worn on the other hand,
Colored region extracting means for extracting, for each of the first image and the second image, the colored regions of the color glove,
Inter-centroid distance calculating means for calculating, for each of the first image and the second image, the distance between the centroid of the colored region corresponding to the partial region of the back of the hand and the centroid of the colored region corresponding to the tip region of each finger, and
Normalizing means for normalizing the calculated inter-centroid distances based on the area of the colored region corresponding to the partial region of the back of the hand,
And generates, as the first collation data, set data whose elements are the normalized inter-centroid distances of the five fingers of the one hand and, as the second collation data, set data whose elements are the normalized inter-centroid distances of the five fingers of the other hand,
The character recognition system according to claim 1.
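
As a rough sketch of the collation data generation recited above, the following code splits a frame along the X axis and builds one five-element vector per hand. The claim only says that normalization is based on the area of the back-of-hand region; dividing by the square root of that area is an assumption made here so that the result does not change with the hand's distance from the camera. Regions are represented as lists of (x, y) pixel coordinates, and all function names are hypothetical.

```python
import math

def split_frame(frame):
    """Split a frame (list of pixel rows) into left and right halves along X."""
    mid = len(frame[0]) // 2
    return [row[:mid] for row in frame], [row[mid:] for row in frame]

def centroid(pixels):
    """Centroid of a colored region given as a list of (x, y) pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def collation_vector(back_pixels, fingertip_pixels):
    """Normalized distances from the back-of-hand centroid to each fingertip centroid.

    back_pixels: pixels of the back-of-hand colored region.
    fingertip_pixels: five pixel lists, one per fingertip colored region.
    """
    bx, by = centroid(back_pixels)
    # Assumption: normalize by sqrt(area), where area is the pixel count.
    scale = math.sqrt(len(back_pixels))
    vector = []
    for tip in fingertip_pixels:
        tx, ty = centroid(tip)
        vector.append(math.hypot(tx - bx, ty - by) / scale)
    return vector
```

With this choice of normalization, a hand photographed at twice the scale yields the same vector, because both the centroid distances and the square root of the area double.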

The collation data generating means includes
Finger stillness determination means that evaluates the amount of movement of the centroid of the colored region corresponding to the tip region of a finger between two temporally successive frame images and determines, based on the evaluation result, whether the finger is stationary,
And generates the collation data only when the fingers of both hands of the user are determined to be stationary,
The character recognition system according to claim 2.
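
The stillness determination can be sketched as a per-fingertip comparison of centroid positions between two frames. The threshold value and the name `fingers_still` are hypothetical; the claim does not specify how the amount of movement is evaluated, so a simple per-finger displacement limit is assumed.

```python
import math

def fingers_still(prev_centroids, curr_centroids, max_shift=2.0):
    """Return True when every fingertip centroid moved at most max_shift pixels.

    prev_centroids, curr_centroids: lists of (x, y) fingertip centroids taken
    from two temporally successive frames, in the same finger order.
    max_shift: hypothetical threshold in pixels.
    """
    return all(
        math.hypot(cx - px, cy - py) <= max_shift
        for (px, py), (cx, cy) in zip(prev_centroids, curr_centroids)
    )
```

Collation data would then be generated only for frames where this check passes for both hands.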

The colored region extracting means,
For each of the first image and the second image, defines as a hand region the pixel region enclosed by the four sides of a rectangle circumscribing the foreground image extracted by the background difference method, extracts the colored region corresponding to the partial region of the back of the hand from the hand region of the image before the background difference is applied, and extracts the colored region corresponding to the tip region of the finger from the foreground image after the background difference is applied,
The character recognition system according to claim 2.
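
The hand-region step can be sketched as a background difference followed by a circumscribing rectangle. For brevity this sketch assumes single-channel intensity images, whereas the claimed system works on color frames; the threshold and the function names are hypothetical.

```python
def foreground_mask(frame, background, threshold=30):
    """Background difference: mark pixels that differ from the background image.

    frame, background: lists of rows of intensities (single channel is an
    assumption). threshold is a hypothetical value.
    """
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

def hand_region(mask):
    """Rectangle circumscribing the foreground as (x_min, y_min, x_max, y_max).

    Returns None when the mask contains no foreground pixel.
    """
    xs = [x for row in mask for x, on in enumerate(row) if on]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))
```

The back-of-hand color would then be searched inside this rectangle in the original frame, while the fingertip colors would be searched only among the foreground pixels.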

The collation data generating means includes
Data validity judging means for judging the validity of the first and second set data in view of anatomical constraints on human fingers,
And generates the collation data including the set data only when both the first and second set data are judged to be valid,
Character recognition system.

The character component determination means,
For each of the calculated first and second inter-vector distances, obtains the minimum value and the next-smallest value and outputs the determination result only when the difference between the two is determined to be greater than a predetermined threshold,
The character recognition system as described in any one of Claims 1-5.
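
The rejection rule above amounts to a margin test between the best and second-best template. A minimal sketch, in which the name `accept_nearest` and the margin value are hypothetical and at least two templates are assumed to exist:

```python
def accept_nearest(distances, min_margin):
    """Return the nearest label only if it beats the runner-up by min_margin.

    distances: dict mapping a character component label to its inter-vector
    distance (assumed to contain at least two entries).
    Returns None when the two smallest distances are too close to call.
    """
    ranked = sorted(distances.items(), key=lambda item: item[1])
    best_label, best = ranked[0]
    _, second = ranked[1]
    return best_label if second - best > min_margin else None
```

Applied to each hand separately, this suppresses output for ambiguous finger shapes instead of reporting a near-tie.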

The target character is a kana character (Japanese syllabary),
The first character component and the second character component are a consonant (a gyo, or row, of the gojuon table of 50 sounds) and a vowel (a dan, or column, of the gojuon table of 50 sounds), respectively,
The character recognition system as described in any one of Claims 1-6.
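
Composing a kana character from the two determined components is a table lookup. The sketch below uses the standard gojuon table with romanized labels; the labels, the placeholder for empty cells, and the name `compose_kana` are assumptions for illustration, and the syllabic "ん" is left out because it has no consonant and vowel decomposition.

```python
# Gojuon table: one string per consonant row, ordered by the vowels a, i, u, e, o.
# "_" marks cells that have no kana in the modern table.
GOJUON = {
    "": "あいうえお",
    "k": "かきくけこ",
    "s": "さしすせそ",
    "t": "たちつてと",
    "n": "なにぬねの",
    "h": "はひふへほ",
    "m": "まみむめも",
    "y": "や_ゆ_よ",
    "r": "らりるれろ",
    "w": "わ___を",
}
VOWELS = "aiueo"

def compose_kana(consonant, vowel):
    """Return the kana for a (consonant row, vowel column) pair, or None if empty."""
    kana = GOJUON[consonant][VOWELS.index(vowel)]
    return None if kana == "_" else kana
```

Because each hand contributes one index, ten consonant shapes and five vowel shapes are enough to address the whole table.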

A computer-executable program for causing a computer to perform image recognition of a user's finger shape representing a target character,
The program causing the computer to function as:
Template storage means for storing first template data and second template data in association with the first character component and the second character component, respectively, that constitute the target character;
Collation data generation means for generating first collation data and second collation data based on a frame image read from photographing means that photographs in color a user who wears, on both hands, color gloves in which the tip region of each of the five fingers and a partial region of the back of the hand are color-coded in six different colors, with the backs of both hands facing the photographing means;
Character component determination means for calculating a first inter-vector distance between the first collation data and the first template data associated with the first character component constituting the target character, calculating a second inter-vector distance between the second collation data and the second template data associated with the second character component constituting the target character, and outputting, as a determination result, the first character component associated with the first template data having the smallest first inter-vector distance and the second character component associated with the second template data having the smallest second inter-vector distance; and
Character information generation means for generating character information corresponding to the target character composed of the first character component and the second character component output as the determination result.

The collation data generating means includes
Image dividing means for dividing the frame image in two in the X-axis direction, into a first image showing the color glove worn on one hand and a second image showing the color glove worn on the other hand,
Colored region extracting means for extracting, for each of the first image and the second image, the colored regions of the color glove,
Inter-centroid distance calculating means for calculating, for each of the first image and the second image, the distance between the centroid of the colored region corresponding to the partial region of the back of the hand and the centroid of the colored region corresponding to the tip region of each finger, and
Normalizing means for normalizing the calculated inter-centroid distances based on the area of the colored region corresponding to the partial region of the back of the hand,
And generates, as the first collation data, set data whose elements are the normalized inter-centroid distances of the five fingers of the one hand and, as the second collation data, set data whose elements are the normalized inter-centroid distances of the five fingers of the other hand,
The program according to claim 8.

The collation data generating means includes
Finger stillness determination means that evaluates the amount of movement of the centroid of the colored region corresponding to the tip region of a finger between two temporally successive frame images and determines, based on the evaluation result, whether the finger is stationary,
And generates the collation data only when the fingers of both hands of the user are determined to be stationary,
The program according to claim 9.

The colored region extracting means,
For each of the first image and the second image, defines as a hand region the pixel region enclosed by the four sides of a rectangle circumscribing the foreground image extracted by the background difference method, extracts the colored region corresponding to the partial region of the back of the hand from the hand region of the image before the background difference is applied, and extracts the colored region corresponding to the tip region of the finger from the foreground image after the background difference is applied,
The program according to claim 9 or 10.

The collation data generating means includes
Data validity judging means for judging the validity of the first and second set data in view of anatomical constraints on human fingers,
And generates the collation data including the set data only when both the first and second set data are judged to be valid,
Program.

The character component determination means,
For each of the calculated first and second inter-vector distances, obtains the minimum value and the next-smallest value and outputs the determination result only when the difference between the two is determined to be greater than a predetermined threshold,
The program as described in any one of Claims 8-12.

The target character is a kana character (Japanese syllabary),
The first character component and the second character component are a consonant (a gyo, or row, of the gojuon table of 50 sounds) and a vowel (a dan, or column, of the gojuon table of 50 sounds), respectively,
The program as described in any one of Claims 8-13.