JP2003308317A

JP2003308317A - Character string conversion method, character recognition method, character recognition device, and character recognition program

Info

Publication number: JP2003308317A
Application number: JP2002115212A
Authority: JP
Inventors: Yojiro Touchi (登内 洋次郎); Hiroshi Matsuura (松浦 博)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2002-04-17
Filing date: 2002-04-17
Publication date: 2003-10-31

Abstract

(57) [Abstract]
[Problem] To provide a character recognition device that can perform kana-kanji conversion correctly even when a similar-looking character is misread during recognition of handwritten characters.
[Solution] A second character string convertible from a first character string obtained as a character recognition result, and a fourth character string convertible from a third character string obtained by replacing a first character in the first character string with a second character whose shape is similar to that of the first character, are output as conversion candidates for the first character string. The first character string is then converted into the one character string selected from the output conversion candidates.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition device that recognizes handwritten characters.

[0002]

[Prior Art] Character recognition devices have long existed that take handwriting information about a character or character string entered by hand through a coordinate input device, such as a coordinate input tablet, and display the recognition result as a character code or a sequence of character codes.

[0003] Japanese Patent Application No. 2000-333919 describes a character recognition device and method that recognize an input character string while segmenting it into characters as needed, in an environment where the boundaries between characters are not given explicitly.

[0004] Japanese Unexamined Patent Publication No. H9-269974 describes a character recognition device that performs recognition when the writing information is available character by character.

[0005]

[Problem to Be Solved by the Invention] A character recognition device sometimes cannot uniquely decide between similar characters that have the same shape. For example, for pairs such as "つ" (tsu) and "っ" (small tsu), or "1" (digit one) and "l" (lowercase L), the shape of the input alone may not reveal which character was written. "つ" and "っ" are normally distinguished by size. However, the size at which a character is written varies with the writer and with the size of the neighboring characters, so the two cannot always be told apart reliably.

[0006] Furthermore, applying kana-kanji conversion to a recognition result that is only partly wrong yields a completely different character string. For example, even if the writer intended to enter "とうきょう" (東京, Tokyo), judging "ょ" to be "よ" turns the input into "とうきよう" (登記用, "for registration").

[0007] Such a problem is unlikely with the combination of a keyboard and kana-kanji conversion, but becomes much more likely when character recognition is combined with kana-kanji conversion.

[0008] An object of the present invention is to provide a character string conversion method and a character recognition method that allow the subsequent kana-kanji conversion to be performed correctly even when a similar character, of the kind peculiar to character recognition, has been misrecognized, as well as a character recognition device and a character recognition program using them.

[0009]

[Means for Solving the Problem] The present invention converts a character string, obtained by character recognition based on a sequence of strokes, into another character string, where each stroke is represented by the series of pen-tip coordinates detected by a coordinate input device from the moment the pen touches the device until it leaves it. The invention is characterized in that a second character string convertible from a first character string obtained as the character recognition result, and a fourth character string convertible from a third character string obtained by replacing a first character in the first character string with a second character similar in shape to the first character, are output as conversion candidates for the first character string, and the first character string is converted into the one character string selected from the output conversion candidates.

[0010] According to the present invention, kana-kanji conversion can be performed correctly even when a similar character is misread during recognition of handwritten characters.

[0011] The present invention also performs character recognition based on a sequence of strokes, each represented by the series of pen-tip coordinates detected by a coordinate input device from the moment the pen touches the device until it leaves it. When a first character obtained as the recognition result has a second character similar to it in shape, the second character is associated with the first character. A second character string convertible from a first character string that contains the first character obtained as the recognition result, and a fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character, are output as conversion candidates for the first character string, and the first character string is converted into the one character string selected from the output conversion candidates.

[0012] According to the present invention, kana-kanji conversion can be performed correctly even when a similar character is misread during recognition of handwritten characters.

[0013] Preferably, the second character is associated with the first character when the combination of a third character, which is the recognition result immediately preceding the first character, and the second character is a combination that can occur in a character string.

[0014] Also preferably, when the combination of the third character and the first character is a combination that cannot occur in a character string, the second character is taken as the recognition result in place of the first character, and conversion candidates for the first character string containing this second character are output.

[0015] Also preferably, when the first character is one that has distinct large and small forms, and the second character and the first character are related such that one is the large form and the other the small form, whether to associate the second character with the first character is decided based on the size of the handwritten character from which the first character was recognized.

[0016] In the case of a so-called overwriting character recognition device, the second character may be a combination of two characters written in succession whose inter-character structural relationship resembles the shape of the first character.

[0017] The character recognition device of the present invention comprises: character recognition means for performing character recognition based on a sequence of strokes, each represented by the series of pen-tip coordinates detected by a coordinate input device from the moment the pen touches the device until it leaves it; association means for associating, when a first character obtained as the recognition result of the character recognition means has a second character similar to it in shape, the second character with the first character; output means for outputting, as conversion candidates for a first character string that contains the first character obtained as the recognition result of the character recognition means, a second character string convertible from the first character string and a fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character; and conversion means for converting the first character string into the one character string selected from the conversion candidates output by the output means.

[0018] According to the present invention, kana-kanji conversion can be performed correctly even when a similar character is misread during recognition of handwritten characters.

[0019]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0020] FIG. 1 is a block diagram showing the functional configuration of the character recognition device according to this embodiment.

[0021] This character recognition device comprises a tablet 100, a character recognition unit 101, a character recognition dictionary 102, a character buffer 103, a similar character storage unit 104, a similar character adding unit 105, a kana-kanji conversion unit 107, a kana-kanji conversion dictionary 106, and a display 108.

[0022] The tablet 100 is formed of, for example, a transparent plate and is laid over the display screen of the display 108. A character writing area 301 (see FIG. 3) is provided on the input surface of the tablet 100. When the user writes a character in the character writing area 301 with the dedicated pen P, the tablet 100 detects the handwriting data (time-series coordinate values) and transfers it to the character recognition unit 101. Specifically, two-dimensional coordinate data representing the pen-tip position is sampled at fixed time intervals while the pen P is touching the tablet 100, and the resulting coordinate data is sent to the character recognition unit 101.

[0023] The character recognition unit 101 treats the coordinate data sequence from the moment the pen P touches the tablet until it leaves it, that is, the coordinate data sequence of one handwriting trace, as a single unit of data called a stroke, and acquires it as stroke data. Each time stroke data is input, the character recognition unit 101 uses the character recognition dictionary 102 to recognize the optimal character string from the stroke data input so far.
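The stroke segmentation described in [0022] and [0023] can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's actual implementation: the `PenSample` layout, its `pen_down` flag, and the function name `split_into_strokes` are all hypothetical and do not appear in the source.

```python
from typing import List, NamedTuple, Tuple


class PenSample(NamedTuple):
    """One tablet sample taken at a fixed time interval (hypothetical layout)."""
    x: float
    y: float
    pen_down: bool  # True while the pen tip touches the tablet


Stroke = List[Tuple[float, float]]


def split_into_strokes(samples: List[PenSample]) -> List[Stroke]:
    """Group consecutive pen-down samples into strokes.

    A stroke starts when the pen touches the tablet and ends when it lifts.
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for s in samples:
        if s.pen_down:
            current.append((s.x, s.y))
        elif current:
            # Pen lifted: close the stroke collected so far.
            strokes.append(current)
            current = []
    if current:  # pen still down at the end of the sample list
        strokes.append(current)
    return strokes


samples = [
    PenSample(0, 0, True), PenSample(1, 1, True), PenSample(1, 1, False),
    PenSample(5, 0, True), PenSample(5, 3, True),
]
strokes = split_into_strokes(samples)
```

Each resulting stroke is the unit that would be handed to the recognizer; the recognition step itself depends on the dictionaries described next and is not modeled here.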

[0024] The character recognition dictionary 102 consists of, for example, a character structure dictionary and an inter-character structure dictionary.

[0025] The character structure dictionary registers data expressing the structure of each character to be recognized (character structure dictionary information), namely, for each character, feature information such as the shapes of the strokes that make up the character and the positional relationships (structure) between those strokes.

[0026] The inter-character structure dictionary registers, for each pair of characters among those in the character structure dictionary that may be written in succession, data expressing the structural relationship between the two characters (inter-character dictionary information).

[0027] On the tablet 100, two consecutive characters may, for example, be written side by side, or they may be written one on top of the other (for example, when the character writing area for writing with the pen holds only one character). In the former case, the inter-character dictionary information is feature information on the positional relationship (structure) between the strokes of one character and the strokes of the other when the two characters are written side by side. In the latter case, it is feature information on the positional relationship (structure) between the strokes of one character and the strokes of the other when the two characters are written one on top of the other.

[0028] Each time stroke data is input, the character recognition unit 101 uses the character structure dictionary and the inter-character structure dictionary to find the most probable, optimal character (or character string) based on the shapes of the strokes input so far and the positional relationships between them. As the recognition result it outputs, for example, a character code for each character.

[0029] The character codes output from the character recognition unit 101 are stored in the character buffer 103 in the order in which they are output.

[0030] The similar character storage unit 104 stores a plurality of similar character groups, each of which gathers several characters (their character codes) whose shapes are similar to one another.

[0031] Characters whose shapes are similar to one another are, as shown in FIG. 2, characters such as "よ" and "ょ", "1" (digit one) and "l" (lowercase L), or "つ" and "っ", for which the shape of the input alone does not reveal which one was written. Each of these pairs forms one similar character group.
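One way to picture the similar character storage unit 104 is as a list of groups plus a lookup index. The sketch below is only an illustration: the names `SIMILAR_GROUPS` and `build_similar_index` are hypothetical, and the groups contain just the three example pairs given in the text.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical contents of the similar character storage unit 104,
# limited to the three example pairs named in the text.
SIMILAR_GROUPS = [
    frozenset({"よ", "ょ"}),
    frozenset({"1", "l"}),
    frozenset({"つ", "っ"}),
]


def build_similar_index(groups: Iterable[FrozenSet[str]]) -> Dict[str, FrozenSet[str]]:
    """Map each character to the other members of its similar character group."""
    index: Dict[str, FrozenSet[str]] = {}
    for group in groups:
        for ch in group:
            index[ch] = group - {ch}
    return index


SIMILAR = build_similar_index(SIMILAR_GROUPS)
```

A character with no entry in `SIMILAR` (for example "し") belongs to no group, which corresponds to the case where nothing is associated with it.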

[0032] For each character stored in the character buffer 103, the similar character adding unit 105 checks whether the character belongs to one of the similar character groups stored in the similar character storage unit 104. If it does, the unit associates every other character in that group with the character in the character buffer 103.

[0033] For example, suppose the character buffer 103 holds the character string "しよ". The character "よ" belongs to one of the similar character groups, and that group also contains "ょ", so the similar character adding unit 105 stores the similar character "ょ" in association with the character "よ" held in the character buffer 103. This is written here as "{よ, ょ}", and the recognition result string in the character buffer 103 at this point is written as "し{よ, ょ}". The notation "{X, Y}" means that either character (string) X or character (string) Y may apply.
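The association performed by the similar character adding unit 105 in [0032] and [0033] can be sketched as below. The representation is an assumption made for illustration: each buffer position becomes a list whose first element is the recognized character and whose remaining elements are its similar characters, mirroring the "{X, Y}" notation. The names `SIMILAR_GROUPS` and `add_similar_characters` are hypothetical.

```python
from typing import List

# Hypothetical group contents, taken from the example pairs in the text.
SIMILAR_GROUPS = [{"よ", "ょ"}, {"1", "l"}, {"つ", "っ"}]


def add_similar_characters(recognized: str) -> List[List[str]]:
    """Return one slot per character: the recognized character first,
    followed by the other members of its similar character group, if any."""
    buffer: List[List[str]] = []
    for ch in recognized:
        slot = [ch]
        for group in SIMILAR_GROUPS:
            if ch in group:
                slot.extend(sorted(group - {ch}))
        buffer.append(slot)
    return buffer


buffer = add_similar_characters("しよ")
```

Here `buffer` corresponds to the string written as "し{よ, ょ}" in the text. As [0044] notes, any representation that records which similar character exists would serve equally well.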

[0034] The kana-kanji conversion unit 107 performs Japanese-language analysis on the character string in the character buffer 103, checks it against the kana-kanji conversion dictionary 106 held in advance, and converts it into the most suitable other character string.

[0035] Ordinary kana-kanji conversion processes a single character string. Here, by contrast, multiple character strings are built by taking the combinations of similar characters into account, conversion is performed on each of them, and the results are merged to form the conversion candidates.
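The idea of converting every variant string and merging the results can be sketched as follows. This is a toy illustration, not the conversion unit's actual algorithm: `TOY_DICTIONARY` is a hypothetical stand-in for the kana-kanji conversion dictionary 106 that holds only a handful of entries for the readings "とうきょう" and "とうきよう", and expanding all combinations with `itertools.product` is one assumed way of "taking the combinations of similar characters into account".

```python
from itertools import product
from typing import Dict, List

# Toy stand-in for the kana-kanji conversion dictionary 106 (assumption).
TOY_DICTIONARY: Dict[str, List[str]] = {
    "とうきょう": ["東京"],
    "とうきよう": ["登記用", "冬季用", "冬期用"],
}


def expand_readings(buffer: List[List[str]]) -> List[str]:
    """Build every string obtainable by picking one character per slot."""
    return ["".join(chars) for chars in product(*buffer)]


def merged_candidates(buffer: List[List[str]]) -> List[str]:
    """Convert each variant and merge the results, dropping duplicates."""
    merged: List[str] = []
    for reading in expand_readings(buffer):
        # Dictionary hits first, then the unconverted reading itself.
        for candidate in TOY_DICTIONARY.get(reading, []) + [reading]:
            if candidate not in merged:
                merged.append(candidate)
    return merged


# Each slot lists the recognized character followed by its similar characters.
buffer = [["と"], ["う"], ["き"], ["よ", "ょ"], ["う"]]
readings = expand_readings(buffer)
candidates = merged_candidates(buffer)
```

The order of the merged list here simply follows the order in which variants are generated; how a real device would rank the candidates is not specified in this passage.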

[0036] For example, when the character buffer 103 holds the character string "とうき{よ, ょ}う", to which a similar character has been associated, and kana-kanji conversion is applied to it, conversion is performed on both "とうきょう" and "とうきよう".

[0037] The conversion candidates for "とうきょう" are "東京", "とうきょう", and so on, and those for "とうきよう" are "登記用", "冬季用", "冬期用", "とうきよう", and so on. The conversion candidates for "とうき{よ, ょ}う" are therefore the union of the two: "東京", "とうきょう", ..., "登記用", "冬季用", "冬期用", "とうきよう", ....

[0038] FIG. 3 shows the external appearance of an information device serving as the character recognition device shown in FIG. 1.

[0039] As shown in FIG. 3, the main surface of the information device, that is, the surface on which the transparent plate-shaped tablet 100 is laid over the display screen of the display 108, provides: a character writing area 301 in which the user writes characters on the tablet 100 with the pen P; a recognition result display area 302 that displays the result of recognizing the handwriting written in the character writing area 301 as a character string, together with a cursor C indicating the character insertion position; a button 303 for instructing, for example, kana-kanji conversion; a decision button 304 for selecting and confirming the desired candidate from among the conversion candidates displayed by the kana-kanji conversion process; and a one-character backspace button 305 for deleting the character immediately before the cursor position.

[0040] Next, the processing operation of the character recognition device of FIG. 1 is described with reference to the flowcharts shown in FIGS. 4 and 5.

[0041] In step S1, when one stroke is written in the character writing area, the coordinate data sequence of that stroke, that is, the stroke data, is sent to the character recognition unit 101.

[0042] In step S2, the character recognition unit 101 refers to the character recognition dictionary 102 and recognizes the optimal character from the stroke data, corresponding to one character, that has been input so far.

[0043] In step S3, the optimal character (its character code) recognized by the character recognition unit 101 is stored in the character buffer 103 and the character is shown on the display 108.

[0044] Each time a character (code) output as a recognition result is stored in the character buffer 103, the similar character adding unit 105 checks whether the character belongs to one of the similar character groups stored in the similar character storage unit 104 (step S4). If it does, the unit stores the characters (codes) in that group that resemble the character, that is, its similar characters, in the character buffer 103 in association with the character (step S5). This is written here as "{よ, ょ}". The association is not limited to this technique: any technique may be used as long as it records that another character similar in shape to the character stored in the character buffer 103 as the recognition result (for example, "よ") exists, and which character that is (for example, "ょ").

[0045] If, on the other hand, step S4 finds that no other character is similar in shape to the character stored in the character buffer 103 (that is, the character belongs to none of the similar character groups stored in the similar character storage unit 104), the process proceeds to step S6.

[0046] When the user selects the conversion button 303 of FIG. 3, either with the recognition result string displayed in, for example, the recognition result display area 302 of FIG. 3 as the string to be converted, or after specifying the range of the string to be converted, kana-kanji conversion is performed on that string (step S6).

[0047] If the string to be converted (the first character string) contains a character (the first character) to which a similar character was associated in step S5 (step S7 of FIG. 5), the process proceeds to step S9 of FIG. 5; otherwise it proceeds to step S8 of FIG. 5.

[0048] In step S8 of FIG. 5, the kana-kanji conversion unit 107 generates conversion candidates from the first character string to be converted. In step S9, the kana-kanji conversion unit 107 generates the conversion candidates for the first character string from both the first character string and a second character string in which the character of the first character string that has an associated similar character (the first character) is replaced with that similar character.

[0049] The conversion candidates for the first character string generated in step S8 or step S9 are then shown on the display 108 (step S10).

[0050] For example, suppose "ょ" is recognized after the character "し", so that the character buffer 103 holds "しょ". Because "よ" and "ょ" are similar in shape, step S5 stores the similar character "よ" in association with the recognized character (code) "ょ" held in the character buffer 103. This is written here as "{よ, ょ}". When kana-kanji conversion is then instructed in step S6 for the string "しょ" shown on the display 108, the process proceeds from step S7 to step S9. The conversion candidates consist of the strings convertible from "しよ" ("しよ", ...) and the strings convertible from "しょ" ("所", "書", ...), giving {所, 書, しよ, ...}, which is generated and shown on the display 108.

[0051] Suppose instead that "か" is recognized after the character "し", so that the character buffer 103 holds "しか". Neither "し" nor "か" has a character similar to it in shape (the similar character storage unit 104 holds no similar characters for "し" or "か"). When kana-kanji conversion is instructed in step S6 for the string "しか", the process therefore proceeds from step S7 to step S8, and the strings convertible from "しか", {鹿, しか, ...}, are generated as conversion candidates and shown on the display 108.
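The branch between steps S8 and S9 for the two examples, "しょ" and "しか", can be sketched as below. This is an illustration under assumptions rather than the flowchart's actual implementation: `SIMILAR` and `TOY_DICTIONARY` are hypothetical stand-ins for the similar character storage unit 104 and the kana-kanji conversion dictionary 106, and the function replaces one character at a time with its similar character.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for storage unit 104 and conversion dictionary 106.
SIMILAR: Dict[str, List[str]] = {"よ": ["ょ"], "ょ": ["よ"]}
TOY_DICTIONARY: Dict[str, List[str]] = {"しょ": ["所", "書"], "しか": ["鹿"]}


def lookup(reading: str) -> List[str]:
    """Dictionary hits followed by the unconverted reading itself."""
    return TOY_DICTIONARY.get(reading, []) + [reading]


def generate_candidates(first_string: str) -> Tuple[str, List[str]]:
    """Return the step taken ("S8" or "S9") and the conversion candidates."""
    # Step S7: look for characters that have an associated similar character,
    # and build the strings obtained by substituting one such character.
    substituted = [
        first_string[:i] + alt + first_string[i + 1:]
        for i, ch in enumerate(first_string)
        for alt in SIMILAR.get(ch, [])
    ]
    if not substituted:
        # Step S8: convert the first character string only.
        return "S8", lookup(first_string)
    # Step S9: merge candidates of the original and the substituted strings.
    candidates = lookup(first_string)
    for variant in substituted:
        for candidate in lookup(variant):
            if candidate not in candidates:
                candidates.append(candidate)
    return "S9", candidates


step_sho, candidates_sho = generate_candidates("しょ")
step_shika, candidates_shika = generate_candidates("しか")
```

With these toy entries, "しょ" takes the S9 branch and yields candidates from both "しょ" and "しよ", while "しか" takes the S8 branch and yields only its own candidates.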

[0052] When the user selects the desired candidate from among those displayed and presses, for example, the "decision" button 304 of FIG. 3, the first character string is converted into the selected candidate string (step S11).

[0053] Steps S1 to S11 above are repeated until stroke input ends (step S12).

[0054] FIG. 6 shows another configuration example of the character recognition device according to this embodiment. In FIG. 6, parts identical to those in FIG. 1 carry the same reference numerals, and only the differences are described. The configuration of FIG. 6 additionally includes a character combination storage unit 109.

[0055] The character combination storage unit 109 stores the combinations that can occur in a character string when a character belonging to any of the similar character groups stored in the similar character storage unit 104 is combined with another character. Japanese, for example, has characters written small, such as "ゃ", "ゅ", "ょ", and "っ", which express contracted sounds (yōon) and geminate consonants (sokuon), as in "きゃ", "きゅ", "きょ", and "もっぱら". Such small-written characters are called "small characters" here, and characters of normal size are called "large characters". Consider a character that has distinct large and small forms, so that there are characters of similar shape but different size, such as "よ" and "ょ"; that is, "よ" and "ょ" form one similar character group. As shown in FIG. 7, when "か" is placed before each of them, the combination "かよ" can occur but the combination "かょ" cannot. When "し" is placed before each of "よ" and "ょ", on the other hand, both "しよ" and "しょ" are combinations that can occur in a character string.

[0056] In this way, for each character belonging to any of the similar character groups stored in the similar character storage unit 104, the character combination storage unit 109 stores at least one of the following, judged from the context of the characters when that character is combined with another character (for example, one placed before it): the character combinations that are possible as character strings, and the character combinations that are not possible as character strings.

[0057] In the following, it is assumed that the character combination storage unit 109 stores, for each character belonging to any of the similar character groups stored in the similar character storage unit 104, the combinations that are possible as character strings when another character is placed immediately before it, as shown in the upper part of FIG. 7.
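As a rough illustration of the stored combinations described above, the following Python sketch models the character combination storage unit 109 as a set of (preceding character, character) pairs that are possible as character strings. The names `SIMILAR_GROUPS`, `POSSIBLE_PAIRS`, and `is_possible`, and the few entries listed, are assumptions made for this example; the source does not specify a storage format.

```python
# Hypothetical contents of the similar character storage unit 104:
# each set is one similar character group.
SIMILAR_GROUPS = [{"よ", "ょ"}, {"や", "ゃ"}, {"つ", "っ"}]

# Hypothetical contents of the character combination storage unit 109:
# (preceding character, character) pairs that are possible as strings.
# Only the pairs from the FIG. 7 example are listed here.
POSSIBLE_PAIRS = {("か", "よ"), ("し", "よ"), ("し", "ょ")}


def is_possible(previous: str, current: str) -> bool:
    """Return True if `current` placed right after `previous` is stored
    as a possible combination."""
    return (previous, current) in POSSIBLE_PAIRS


print(is_possible("か", "よ"))  # かよ is stored as possible
print(is_possible("か", "ょ"))  # かょ is not stored
print(is_possible("し", "ょ"))  # しょ is stored as possible
```

Storing only the possible pairs matches the assumption made in paragraph [0057]; a table of impossible pairs would serve equally well under paragraph [0056].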

[0058] When the character newly stored in the character buffer 103 as a recognition result (the first character) belongs to one of the similar character groups stored in the similar character storage unit 104, the similar character adding unit 105 associates with the first character those similar characters in that group whose combination with the immediately preceding recognized character stored in the character buffer 103 (and, if similar characters are associated with that preceding character, with those similar characters) is possible as a character string, that is, is stored in the character combination storage unit 109 as a possible character string.

[0059] When the combination of the first character with the immediately preceding recognized character (and with its associated similar characters, if any) is not possible as a character string, that is, is not stored in the character combination storage unit 109 as a possible character string, the similar character adding unit 105 rewrites the first character in the character buffer 103 with a similar character associated with the first character (any one of them if there are several). If several similar characters are associated with the first character, those not used for the rewriting are stored in the character buffer 103 in association with the similar character that was used.

[0060] FIGS. 8 and 9 are flowcharts for explaining the processing operation of the character recognition device shown in FIG. 6. In FIGS. 8 and 9, parts identical to those in FIGS. 4 and 5 are given the same reference numerals, and only the differences are described. Specifically, in FIG. 8, step S5 of FIG. 4 is replaced by steps S5a to S5c.

[0061] In step S4 of FIG. 8, once the character obtained as the recognition result of the character recognition unit 101 (the first character) has been stored in the character buffer 103 in step S3, the similar character adding unit 105 checks whether the first character belongs to one of the similar character groups stored in the similar character storage unit 104. If it does, the process proceeds to step S5a. There, for each character (similar character) in the similar character group to which the first character belongs, the unit checks whether its combination with the character stored in the character buffer 103 that was recognized immediately before the first character (and with that character's associated similar characters, if any) is possible as a character string, that is, whether the combination is stored in the character combination storage unit 109 as a possible character string.

[0062] The similar character adding unit 105 then associates with the first character those of its similar characters whose combination with the character recognized immediately before the first character, and with each similar character associated with that character, if any, is possible as a character string (that is, is stored in the character combination storage unit 109 as a possible character string).

[0063] In steps S5b and S5c, when the combination of the first character with the immediately preceding recognized character (and with its associated similar characters, if any) is not possible as a character string (that is, is not stored in the character combination storage unit 109 as a possible character string), the similar character adding unit 105 rewrites the first character in the character buffer 103 with a similar character associated with the first character (any one of them if there are several). If several similar characters are associated with the first character, those not used for the rewriting are stored in the character buffer 103 in association with the similar character that was used.
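Steps S4 to S5c can be sketched in Python as follows. This is only an illustration under several assumptions that the source does not state: each buffer entry is modelled as a list with the recognized character first and its associated similar characters after it; a combination counts as possible if it is stored for the preceding character or for any one of that character's associated similar characters; and when there is no preceding character, no similar character is ruled out. The names `SIMILAR_GROUPS`, `POSSIBLE_PAIRS`, and `add_similar` are hypothetical.

```python
# Hypothetical contents of the similar character storage unit 104.
SIMILAR_GROUPS = [{"よ", "ょ"}, {"や", "ゃ"}]

# Hypothetical contents of the character combination storage unit 109:
# (preceding character, character) pairs that are possible as strings.
POSSIBLE_PAIRS = {("し", "よ"), ("し", "ょ"), ("か", "よ"), ("し", "か")}


def add_similar(buffer):
    """Apply steps S4 to S5c to the newest entry of the character buffer.

    Each buffer entry is a list: the recognized character first, followed
    by any similar characters associated with it.
    """
    entry = buffer[-1]
    first = entry[0]

    # S4: does the first character belong to a similar character group?
    group = next((g for g in SIMILAR_GROUPS if first in g), None)
    if group is None:
        return buffer

    previous = buffer[-2] if len(buffer) > 1 else None

    def possible(ch):
        # Assumption: with no preceding character, nothing is ruled out.
        if previous is None:
            return True
        # Assumption: one stored pairing with the preceding character or
        # any of its associated similar characters is enough.
        return any((p, ch) in POSSIBLE_PAIRS for p in previous)

    # S5a: associate only similar characters that can follow the previous one.
    entry.extend(c for c in sorted(group) if c != first and possible(c))

    # S5b: can the recognized character itself follow the previous one?
    if not possible(first) and len(entry) > 1:
        # S5c: rewrite it with an associated similar character; the remaining
        # similar characters stay associated with the one used.
        del entry[0]
    return buffer


print(add_similar([["し"], ["よ"]]))  # よ keeps ょ as an associated character
print(add_similar([["か"], ["ょ"]]))  # ょ is rewritten with よ
```

Sorting the group only makes the choice of "any one" similar character deterministic in this sketch; the source leaves that choice open.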

[0064] The processing from step S6 onward is the same as described for FIG. 4.

[0065] For example, suppose that after the character 「し」 the user intended to write 「ょ」 but it was recognized as 「よ」, so that the character buffer 103 holds 「しよ」. Referring to the similar character storage unit 104, 「よ」 and 「ょ」 are in a similar relationship; moreover, referring to the character combination storage unit 109, the combination of 「ょ」 with the immediately preceding recognized character 「し」 is possible as a character string. In step S5a, therefore, the similar character 「ょ」 is stored in association with the recognized character (code) 「よ」 in the character buffer 103. This is written here as {よ, ょ}. Since the combination of the currently recognized character 「よ」 with the previously recognized character 「し」 is also possible as a character string, the process proceeds from step S5b to step S6. When kana-kanji conversion is instructed in step S6 with the character string 「しよ」 as the conversion target, the process proceeds from step S7 to step S9 as in the case of FIG. 4. The conversion candidates {所, 書, しよ, …}, consisting of the character strings convertible from 「しよ」 (「しよ」, …) and the character strings convertible from 「しょ」 (「所」, 「書」, …), are generated and displayed on the display 108.

[0066] On the other hand, suppose that after the character 「か」 the user intended to write 「よ」 but it was recognized as 「ょ」, so that the character buffer 103 holds 「かょ」. Referring to the similar character storage unit 104, 「ょ」 and 「よ」 are in a similar relationship; moreover, referring to the character combination storage unit 109, the combination of 「よ」 with the immediately preceding recognized character 「か」 is possible as a character string. In step S5a, therefore, the similar character 「よ」 is stored in association with the recognized character (code) 「ょ」 in the character buffer 103. This is written here as {ょ, よ}. However, according to the character combination storage unit 109, the combination of the currently recognized character 「ょ」 with the previously recognized character 「か」 is not possible as a character string, so the process proceeds from step S5b to step S5c. In step S5c, the currently recognized character 「ょ」 in the character buffer 103 is rewritten with the similar character associated with it, namely 「よ」. The character buffer 103 thus holds the character string 「かよ」. At this point, the corrected character 「よ」 may be redisplayed on the display 108 as the current recognition result. When kana-kanji conversion is then instructed in step S6 with, for example, the currently displayed recognized character string 「かょ」 as the conversion target, the process proceeds from step S7 to step S8, and the character strings convertible from the string 「かよ」 currently stored in the character buffer 103, {かよ, 佳代, …}, are generated as conversion candidates and displayed on the display 108.

[0067] Suppose that after the character 「し」 the user writes 「か」 and it is recognized as 「か」, so that the character buffer 103 holds 「しか」. Referring to the similar character storage unit 104, the currently recognized character 「か」 has no character in a similar relationship with it. The process therefore proceeds directly from step S4 to step S6. When the character string 「しか」 is designated as the conversion target in step S6, the process proceeds from step S7 to step S8, and the character strings convertible from 「しか」, {鹿, しか, …}, are generated as conversion candidates and displayed on the display 108.
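The three worked examples above can be reproduced with a small Python sketch of candidate generation (steps S7 to S9). It assumes that each buffer entry is a list holding the recognized character followed by its associated similar characters, and it uses a toy dictionary, `KANA_KANJI`, as a hypothetical stand-in for the kana-kanji conversion dictionary 106; the function name `conversion_candidates` is likewise an assumption.

```python
from itertools import product

# Toy stand-in for the kana-kanji conversion dictionary 106 (hypothetical
# entries, limited to the readings used in the worked examples).
KANA_KANJI = {
    "しよ": ["しよ"],
    "しょ": ["所", "書"],
    "かよ": ["かよ", "佳代"],
    "しか": ["鹿", "しか"],
}


def conversion_candidates(buffer):
    """Collect conversion candidates for every string that can be formed
    by choosing one character from each buffer entry."""
    candidates = []
    for chars in product(*buffer):
        reading = "".join(chars)
        for word in KANA_KANJI.get(reading, []):
            if word not in candidates:  # keep each candidate once
                candidates.append(word)
    return candidates


# {よ, ょ} after し: candidates come from both しよ and しょ.
print(conversion_candidates([["し"], ["よ", "ょ"]]))
# ょ already rewritten with よ after か: candidates come from かよ only.
print(conversion_candidates([["か"], ["よ"]]))
# か has no similar character: candidates come from しか only.
print(conversion_candidates([["し"], ["か"]]))
```

The order in which candidates are presented is a choice of this sketch; the source does not prescribe an ordering.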

[0068] As described above, according to this embodiment, when a character in the first character string obtained as a character recognition result has a similar character with which it may have been confused, another second character string convertible from the first character string, and another fourth character string convertible from a third character string in which that character of the first character string is replaced with its similar character, are output as conversion candidates for the first character string, and the first character string is converted into a selected one of the output conversion candidates. Kana-kanji conversion can thus be performed correctly even when a similar character is misread in recognizing handwritten characters.

[0069] Although the above embodiment has been described using kana-kanji conversion as an example, the invention is not limited to this case and is also applicable, for example, to conversion into katakana.

[0070] As the number of similar characters associated with recognized characters grows, the amount of processing needed to generate conversion candidates grows with it, and so does the number of conversion candidates. The response time from the user's conversion instruction to the display of the candidates becomes longer, and the user has to spend more effort finding the desired candidate among many. When associating similar characters, it is therefore desirable to exclude as many unnecessary similar characters as possible at that point.

[0071] Characters that have distinct uppercase and lowercase forms (for example, 「や」 and 「ゃ」, or 「よ」 and 「ょ」) are especially prone to misrecognition, and it is therefore for such characters that the similar character adding unit 105 most often associates similar characters.

[0072] The similar character adding unit 105 of the character recognition device shown in FIGS. 1 and 6 may therefore decide whether to associate a similar character with the recognized character stored in the character buffer 103 on the basis of the size of the character written on the tablet 100.

[0073] For example, in step S5 of FIG. 4 or step S5a of FIG. 8, when the similar character adding unit 105 associates a similar character stored in the similar character storage unit 104 with the recognized character stored in the character buffer 103, and that similar character and the recognized character are in an uppercase/lowercase (or, conversely, lowercase/uppercase) relationship, the unit may decide whether to associate the similar character on the basis of the size of the handwritten character written on the tablet 100 that was the target of recognition.

[0074] Whether the recognized character stored in the character buffer 103 and its similar character are in an uppercase/lowercase or lowercase/uppercase relationship can be determined from their character codes. The similar character storage unit 104 therefore needs to store in advance the correspondence between uppercase character codes and lowercase character codes. In addition, the similar character storage unit 104 acquires information on the size of the written character from the tablet 100.

[0075] When the recognized character stored in the character buffer 103 is an uppercase character (for example, 「よ」), the similar character adding unit 105 associates the lowercase character (here 「ょ」) with it as a similar character and stores it in the character buffer 103 if the size of the written character, according to the size information obtained from the tablet 100, is less than or equal to (or less than) a predetermined first reference value. When the recognized character stored in the character buffer 103 is a lowercase character (for example, 「ゃ」), the unit associates the uppercase character (here 「や」) with it as a similar character and stores it in the character buffer 103 if the size of the written character, according to the size information obtained from the tablet 100, is greater than or equal to (or greater than) a predetermined second reference value.

[0076] The first and second reference values may be determined in advance from the size of the character writing area 301.

[0077] Consequently, even if the recognized character stored in the character buffer 103 is an uppercase character, when the size of the written character that was recognized is greater than (or greater than or equal to) the first reference value, the character is very likely to be uppercase, and its lowercase form is not associated as a similar character. Likewise, even if the recognized character stored in the character buffer 103 is a lowercase character, when the size of the written character that was recognized is less than (or less than or equal to) the second reference value, the character is very likely to be lowercase, and its uppercase form is not associated as a similar character.
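The size test can be sketched in Python as follows. The sketch assumes inclusive comparisons (the source allows either inclusive or strict ones), treats the written size as a single number, and leaves the two reference values to the caller, since the source gives no concrete values. The names `SMALL_TO_LARGE`, `LARGE_TO_SMALL`, and `size_based_similar` are hypothetical.

```python
# Hypothetical correspondence between lowercase (small-written) and
# uppercase (normal-size) characters.
SMALL_TO_LARGE = {"ゃ": "や", "ゅ": "ゆ", "ょ": "よ", "っ": "つ"}
LARGE_TO_SMALL = {large: small for small, large in SMALL_TO_LARGE.items()}


def size_based_similar(recognized, written_size, first_ref, second_ref):
    """Return the similar character to associate with `recognized`,
    or None if the written size makes a confusion unlikely."""
    if recognized in LARGE_TO_SMALL:
        # Recognized as uppercase: associate the lowercase form only if
        # the handwriting was small (at or below the first reference value).
        return LARGE_TO_SMALL[recognized] if written_size <= first_ref else None
    if recognized in SMALL_TO_LARGE:
        # Recognized as lowercase: associate the uppercase form only if
        # the handwriting was large (at or above the second reference value).
        return SMALL_TO_LARGE[recognized] if written_size >= second_ref else None
    return None  # no uppercase/lowercase counterpart


# Illustrative reference values only (arbitrary units).
print(size_based_similar("よ", 20, first_ref=30, second_ref=40))  # small よ: ょ is associated
print(size_based_similar("よ", 50, first_ref=30, second_ref=40))  # large よ: nothing is associated
```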

[0078] In this way, when similar characters are to be associated with recognized characters that have distinct uppercase and lowercase forms, making the decision on the basis of the size of the written character keeps the number of recognized characters to which similar characters are attached to a minimum.

[0079] Consider also the case in which the character writing area holds only one character and the character recognition unit 101 recognizes characters from two consecutive characters written one on top of the other (that is, a so-called overwriting character recognition device, in which the user simply keeps writing characters in the character writing area without regard to character boundaries and the character recognition unit 101 automatically recognizes the handwriting as a character string). In this case the boundary between characters is not always determined accurately. For example, even if 「も」 is obtained as a recognition result, it may actually be 「し」 followed by 「こ」, or 「こ」 followed by 「し」. In such an overwriting character recognition device, 「も」 can therefore be regarded as similar to 「し」「こ」 and to 「こ」「し」. That is, when a combination of two characters written in succession, such as 「し」「こ」 or 「こ」「し」, has a structural relationship between its two characters that resembles the shape of a certain character (here 「も」), the two characters need to be stored together with that character as a similar character group in the similar character storage unit 104. The rest is the same as described above.

[0080] That is, when, for a first character obtained as a character recognition result (for example, 「も」), there exists a second character that is a combination of two characters written in succession whose structural relationship resembles the shape of the first character (for example, 「し」「こ」), the second character is associated with the first character. Another second character string convertible from the first character string, which is obtained as a character recognition result and includes the first character, and another fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character, are then output as conversion candidates for the first character string.
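For the overwriting case, a similar "character" may itself be a two-character sequence. The following Python sketch shows how third character strings could be formed by replacing a single recognized character with such a sequence. The table `OVERWRITE_GROUPS`, the function `variant_strings`, and the sample input 「もと」 are assumptions made for this example, and only one substitution at a time is considered.

```python
# Hypothetical similar character groups for an overwriting recognizer:
# a single recognized character mapped to two-character sequences whose
# overlaid shape resembles it.
OVERWRITE_GROUPS = {"も": ["しこ", "こし"]}


def variant_strings(first_string):
    """Return the strings obtained by replacing one character of
    `first_string` with each two-character sequence similar to it."""
    variants = []
    for i, ch in enumerate(first_string):
        for alt in OVERWRITE_GROUPS.get(ch, []):
            variants.append(first_string[:i] + alt + first_string[i + 1:])
    return variants


print(variant_strings("もと"))  # two variants, one per similar sequence
```

Each returned string would then be looked up for conversion candidates alongside the original recognized string.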

[0081] In doing so, the second character is associated with the first character when the combination of the second character with a third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that is possible as a character string.

[0082] When the combination of the first character with the third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that is not possible as a character string, the second character, a combination of two characters, is taken as the character recognition result in place of the first character, and conversion candidates for the first character string including this second character are output.

[0083]

[Effects of the Invention] As described above, according to the present invention, kana-kanji conversion can be performed correctly even when a similar character is misread in recognizing handwritten characters.

[Brief description of drawings]

FIG. 1 is a diagram showing a configuration example of a character recognition device according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining the similar character groups stored in the similar character storage unit.

FIG. 3 is a diagram showing the appearance of an information device serving as the character recognition device shown in FIG. 1.

FIG. 4 is a flowchart for explaining the processing operation of the character recognition device shown in FIG. 1.

FIG. 5 is a flowchart for explaining the processing operation of the character recognition device shown in FIG. 1.

FIG. 6 is a diagram showing another configuration example of the character recognition device.

FIG. 7 is a diagram for explaining character combinations that are possible as character strings and character combinations that are not.

FIG. 8 is a flowchart for explaining the processing operation of the character recognition device shown in FIG. 6.

FIG. 9 is a flowchart for explaining the processing operation of the character recognition device shown in FIG. 6.

[Explanation of symbols]

100: tablet
101: character recognition unit
102: character recognition dictionary
103: character buffer
104: similar character storage unit
105: similar character adding unit
106: kana-kanji conversion dictionary
107: kana-kanji conversion unit
108: display
109: character combination storage unit

Continued from front page: F-term (reference) 5B009 KA01 LA02 5B064 AB04 BA06 DD08 DD11 EA10 EA18 FA04 FA06 FA13

Claims

[Claims]

1. A character string conversion method for converting a character string, obtained by performing character recognition on the basis of a stroke sequence represented by the series of pen-tip coordinates detected by a coordinate input device from when a pen touches the coordinate input device until it leaves the coordinate input device, into another character string, the method comprising: outputting, as conversion candidates for a first character string obtained as a character recognition result, another second character string convertible from the first character string and another fourth character string convertible from a third character string obtained by replacing a first character in the first character string with a second character similar in shape to the first character; and converting the first character string into a selected one of the output conversion candidates.

2. A character recognition method for performing character recognition on the basis of a stroke sequence represented by the series of pen-tip coordinates detected by a coordinate input device from when a pen touches the coordinate input device until it leaves the coordinate input device, the method comprising: associating, when a second character similar in shape to a first character obtained as a character recognition result exists, the second character with the first character; outputting, as conversion candidates for a first character string that is obtained as a character recognition result and includes the first character, another second character string convertible from the first character string and another fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character; and converting the first character string into a selected one of the output conversion candidates.

3. The character recognition method according to claim 2, wherein the second character is associated with the first character when the combination of the second character with a third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that is possible as a character string.

4. The character recognition method according to claim 2, wherein, when the combination of the first character with a third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that is not possible as a character string, the second character is taken as the character recognition result in place of the first character, and conversion candidates for the first character string including the second character are output.

5. The character recognition method according to claim 3, wherein, when the first character is a character having distinct uppercase and lowercase forms and the second character and the first character are related such that one is uppercase and the other is lowercase, whether to associate the second character with the first character is determined on the basis of the size of the written character that was the recognition target of the first character.

6. The character recognition method according to claim 2, wherein the second character is a combination of two characters written in succession, the structural relationship between the two characters being similar to the shape of the first character.

7. A character recognition device comprising: character recognition means for performing character recognition on the basis of a stroke sequence represented by the series of pen-tip coordinates detected by a coordinate input device from when a pen touches the coordinate input device until it leaves the coordinate input device; associating means for associating, when a second character similar in shape to a first character obtained as a character recognition result by the character recognition means exists, the second character with the first character; output means for outputting, as conversion candidates for a first character string that is obtained as a character recognition result by the character recognition means and includes the first character, another second character string convertible from the first character string and another fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character; and conversion means for converting the first character string into a selected one of the conversion candidates output by the output means.

8. A character recognition device comprising: character recognition means for performing character recognition based on a stroke sequence, each stroke being represented by the coordinate series of a pen tip detected by a coordinate input device from when the pen touches the coordinate input device until it leaves the coordinate input device; storage means for storing a plurality of similar character groups, each being a group of characters similar in shape to one another; associating means for associating with a first character obtained as a character recognition result, when the first character belongs to a first similar character group that is one of the plurality of similar character groups, a second character that belongs to the first similar character group and is other than the first character; output means for outputting, as conversion candidates for a first character string that includes the first character and was obtained as a character recognition result by the character recognition means, another, second character string convertible from the first character string and another, fourth character string convertible from a third character string obtained by replacing the first character in the first character string with the second character; and conversion means for converting the first character string into a selected one of the conversion candidates output by the output means.

9. The character recognition device according to claim 7, wherein the associating means associates the second character with the first character when the combination of the second character and a third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that can form a character string.

10. The character recognition device as described, wherein, when the combination of the first character and a third character, which is the character recognition result obtained immediately before the first character was recognized, is a combination of characters that cannot form a character string, the second character is adopted as the character recognition result in place of the first character, and conversion candidates for the first character string containing the second character are output.

11. The character recognition device as described, wherein, when the first character is a character that has distinct uppercase and lowercase forms and the second character and the first character are related such that one is the uppercase form and the other is the lowercase form, whether to associate the second character with the first character is determined based on the size of the handwritten character from which the first character was recognized.

12. The character recognition device according to claim 7, wherein the second character is a combination of two consecutively written characters, and the structural arrangement of the two characters resembles the shape of the first character.

13. A character recognition program for performing character recognition based on a stroke sequence representing characters written with a pen on a coordinate input device, the program causing a computer to execute the steps of: outputting, as conversion candidates for a first character string obtained as a character recognition result, another, second character string convertible from the first character string and another, fourth character string convertible from a third character string obtained by replacing a first character in the first character string with a second character similar in shape to the first character; and converting the first character string into a selected one of the output conversion candidates.
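
The candidate-generation steps recited in the device and program claims can be illustrated with a minimal Python sketch. It is not the patented implementation. The similar-character groups, the toy conversion dictionary, and all function names (`similar_characters`, `substituted_strings`, `conversion_candidates`, `convert`) are hypothetical and chosen only for illustration. The sketch swaps one character at a time and omits the adjacent-character and written-size checks of the dependent claims.

```python
# Hypothetical similar-character groups (assumption, not taken from the patent).
SIMILAR_GROUPS = [
    {"ね", "れ", "わ"},
    {"ぬ", "め"},
    {"る", "ろ"},
]

# Toy kana-to-kanji dictionary standing in for a real conversion engine (assumption).
CONVERSIONS = {
    "ねこ": ["猫"],
    "ぬの": ["布"],
    "めの": ["目の"],
}


def similar_characters(ch):
    """Return the characters that share a similar-character group with ch, sorted."""
    found = set()
    for group in SIMILAR_GROUPS:
        if ch in group:
            found |= group - {ch}
    return sorted(found)


def substituted_strings(first_string):
    """Yield each 'third string': first_string with one character swapped for a similar one."""
    for i, ch in enumerate(first_string):
        for alt in similar_characters(ch):
            yield first_string[:i] + alt + first_string[i + 1:]


def conversion_candidates(first_string):
    """List candidates from the recognized string, then from each substituted string."""
    # 'Second strings': conversions of the string exactly as recognized.
    candidates = list(CONVERSIONS.get(first_string, []))
    # 'Fourth strings': conversions of each similar-character substitution.
    for third_string in substituted_strings(first_string):
        for fourth_string in CONVERSIONS.get(third_string, []):
            if fourth_string not in candidates:
                candidates.append(fourth_string)
    return candidates


def convert(first_string, choice):
    """Return the candidate at index `choice`, which replaces the recognized string."""
    return conversion_candidates(first_string)[choice]
```

With this toy data, `conversion_candidates("れこ")` returns `["猫"]`: the recognized string has no conversion of its own, but replacing "れ" with the similar "ね" gives "ねこ", which does. Likewise `conversion_candidates("めの")` returns `["目の", "布"]`, listing the direct conversion first and the substituted one after it.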