JP2001243425A

JP2001243425A - On-line character recognition device and method

Info

Publication number: JP2001243425A
Application number: JP2000054067A
Authority: JP
Inventors: Takenori Kawamata; 武典川又
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2000-02-29
Filing date: 2000-02-29
Publication date: 2001-09-07

Abstract

PROBLEM TO BE SOLVED: A conventional on-line character recognition device is designed only to raise its recognition rate, so it cannot accurately determine, from the distance values of a recognition result alone, whether the recognized character is correct. SOLUTION: Character quality information for the input pattern is extracted from time-series information, and this character quality information is used together with the distance values of the recognition result for the determination, yielding a character recognition device and method that can determine the likelihood of the recognition result with high precision.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition device, and more particularly to an online character recognition device and an online character recognition method that can accurately determine the likelihood of a recognition result by taking the character quality of handwritten characters into account.

[0002]

[Prior Art] Conventional online character recognition devices that determine the likelihood of a recognition result include: a device that judges the result to be likely when the minimum distance value of the recognition result (the distance value of the candidate that best matches a standard pattern in the recognition dictionary) is smaller than a threshold; a device that, to improve determination performance for similar characters, judges the result to be likely when the difference between the minimum distance value and the distance value of the second-smallest candidate is larger than a threshold (that is, when the distance values of the first and second candidates are far apart); and a device that uses both criteria together.
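The two conventional tests described above can be sketched as follows. This is a minimal illustration, not code from the cited devices: the function name, parameter names, and threshold values are hypothetical, and treating a missing second candidate as passing the gap test is an assumption.

```python
def is_result_likely(distances, best_threshold=None, gap_threshold=None):
    """Apply the conventional likelihood tests to candidate distance values.

    distances: distance values sorted in ascending order (best candidate first).
    best_threshold: if given, the smallest distance must be below this value.
    gap_threshold: if given, (second smallest - smallest) must exceed this value.
    A test whose threshold is None is skipped, so either test can be used alone
    or both can be combined.
    """
    if not distances:
        return False
    # Test 1: the best candidate must be close enough to its standard pattern.
    if best_threshold is not None and distances[0] >= best_threshold:
        return False
    # Test 2: the best candidate must be clearly separated from the runner-up.
    # Assumption: with only one candidate there is nothing to compare against.
    if gap_threshold is not None and len(distances) >= 2:
        if distances[1] - distances[0] <= gap_threshold:
            return False
    return True
```

Passing both thresholds corresponds to the device that uses the two criteria together.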

[0003] To improve determination performance further, other devices have been proposed: a device that prepares determination parameters for each similar category (character) in advance and, at determination time, checks which similar-category set the candidate character of the recognition result falls into (Japanese Unexamined Patent Application Publication No. 06-259603); a device that, in the above device, makes the determination using information on the similar categories to which a plurality of candidate characters belong (Japanese Unexamined Patent Application Publication No. 06-325213); and devices that recalculate the confidence of each candidate character from the distance values of a plurality of candidate characters in the recognition result (Japanese Unexamined Patent Application Publication Nos. 08-212300 and 08-329194).

[0004] FIG. 28 is a block diagram showing the configuration of a conventional device that determines the likelihood of a recognition result, as disclosed in, for example, Japanese Unexamined Patent Application Publication No. 06-259603. In the figure, reference numeral 1 denotes a feature extraction unit that receives an input pattern binarized character by character by reading means such as a scanner or a character segmentation device, analyzes the features of the input pattern, and outputs the resulting feature vector to an identification unit 3. Reference numeral 2 denotes a dictionary memory. Reference numeral 3 denotes the identification unit, which collates the feature vector with standard vectors, retrieved from the dictionary memory 2, that represent the standard features of the recognition target categories; calculates the similarity between them, for example as a distance value; generates a group of top-candidate categories in descending order of similarity (for distance values, in ascending order of distance); and outputs, as the recognition result, a signal representing the category information of the top candidates and a signal representing the corresponding distance value information.

[0005] Reference numeral 14 denotes a coefficient index memory (first memory), a storage device that holds the group number of the similar category (similar-category group identification information) to which each recognition target category belongs, for all categories of the character patterns expected as input. Reference numeral 15 denotes a coefficient memory (second memory) that stores, for each similar-category group, discriminant coefficient values learned by discriminant analysis so as to distinguish the tendency toward correct reading from the tendency toward misreading.

[0006] Reference numeral 16 denotes a coefficient selection unit that determines the similar-category group to which the character pattern being recognized belongs, based on the similar-category group numbers in the coefficient index memory 14 and the category information sent from the identification unit 3 as the recognition result, and outputs the corresponding discriminant coefficient values from the coefficient memory 15. Reference numeral 17 denotes a difference value creation unit that derives, from the distance value sequence in the distance value information from the identification unit 3, a plurality of difference value data expressing the distribution shape of the whole sequence. Reference numeral 18 denotes a candidate accuracy calculation unit that uses a discriminant function to take the inner product of the distance value information of the recognition result obtained by the identification unit 3 and the difference value data obtained from the difference value creation unit 17 with the discriminant coefficient values from the coefficient memory 15 selected by the coefficient selection unit 16, and outputs the discriminant function value as the candidate accuracy H.

[0007] FIG. 29 shows the contents of the coefficient index memory 14, where 20 denotes a category number and 21 denotes the group number of the corresponding similar category.

[0008] FIG. 30 shows the contents of the coefficient memory 15, where 31 denotes the discriminant coefficient values for each group corresponding to the similar-category group number 21.

[0009] The operation is described next. First, the feature extraction unit 1 receives an input pattern and outputs a signal representing its feature vector to the identification unit 3. The identification unit 3 then collates the feature vector of the input pattern obtained from the feature extraction unit 1 with the standard vectors, retrieved from the dictionary memory 2, that represent the standard features of the recognition target categories, and outputs the distance value information and category information. That is, when the maximum number of candidate categories considered is K, the category information C of this recognition result is given by equation (1).

[0010] $C = \{C_k \mid k = 1, 2, \ldots, K\}$ (1), where $C_k$ is the k-th candidate category.

[0011] The distance value information, that is, the distance value sequence d, is given by equation (2).

[0012] $d = \{d_k \mid k = 1, 2, \ldots, K\}$ (2), where $d_k$ is the distance value of the k-th candidate category.

[0013] The coefficient selection unit 16 first looks at the first candidate category $C_1$ and selects the discriminant coefficient values w used to calculate the candidate accuracy. For example, if the category number of the first candidate category $C_1$ is 3, then, referring to FIGS. 29 and 30, the discriminant coefficient values w are given by equation (3).

[0014] $w = (w_{21}, w_{22}, \ldots, w_{2K}, \ldots, w_{2(2K-1)})$ (3)

[0015] Next, the difference value creation unit 17 receives the distance value sequence d and derives difference value data that capture the distribution shape of the whole sequence, that is, a difference value data sequence d'. An example of d' is given by equation (4).

[0016] $d' = \{d'_k = d_{k+1} - d_k \mid k = 1, 2, \ldots, K-1\}$ (4), where $d'_k$ is the difference between the distance values of the (k+1)-th and k-th candidate categories.
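As a small illustration of equation (4), the difference sequence can be computed from the sorted distance values as below. The function name is hypothetical. Note that K distance values yield only K − 1 adjacent differences, which is consistent with the 2K − 1 coefficients listed in equation (3).

```python
def difference_sequence(d):
    """Return the adjacent differences d[k+1] - d[k] of a distance sequence.

    d: distance values of the candidates, best candidate first.
    The result has len(d) - 1 entries (empty for fewer than two candidates).
    """
    return [d[k + 1] - d[k] for k in range(len(d) - 1)]
```

For example, the distance sequence `[100, 130, 190]` gives the differences `[30, 60]`.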

[0017] The candidate accuracy calculation unit 18 then calculates the candidate accuracy H of the recognition result as in equation (5), based on a discriminant function f(w, d, d') whose parameters are the distance value sequence d of the recognition result, the difference value data sequence d' obtained by the difference value creation unit 17, and the discriminant coefficient values w from the coefficient memory 15 selected by the coefficient selection unit 16.

[0018] $H = f(w, d, d') = w_{21} d_1 + w_{22} d_2 + \cdots + w_{2K} d_K + w_{2(K+1)} d'_1 + w_{2(K+2)} d'_2 + \cdots + w_{2(2K-1)} d'_{K-1}$ (5)

[0019] The resulting candidate accuracy H is then used, as a quantified measure of the likelihood of the recognition result, in subsequent reject determination and post-processing determination (not shown).
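The conventional flow of coefficient selection, difference creation, and inner product can be sketched as follows. This is only an illustration under stated assumptions: the two lookup tables stand in for FIGS. 29 and 30, whose actual contents are not reproduced in this text, and every number in them is hypothetical. The only detail taken from the description is that category number 3 maps to group 2.

```python
# Hypothetical stand-in for the coefficient index memory 14 (FIG. 29):
# category number -> similar-category group number.
CATEGORY_TO_GROUP = {3: 2}

# Hypothetical stand-in for the coefficient memory 15 (FIG. 30):
# group number -> 2K - 1 discriminant coefficients (here K = 3).
GROUP_COEFFICIENTS = {2: [0.5, -0.25, 0.0, 1.0, 0.5]}


def select_coefficients(top_category):
    """Pick the coefficients of the group that the first candidate belongs to."""
    return GROUP_COEFFICIENTS[CATEGORY_TO_GROUP[top_category]]


def candidate_accuracy(w, d):
    """Inner product of w with the distances d followed by their differences d'."""
    d_prime = [d[k + 1] - d[k] for k in range(len(d) - 1)]
    features = list(d) + d_prime  # K distances, then K - 1 differences
    if len(w) != len(features):
        raise ValueError("expected %d coefficients, got %d" % (len(features), len(w)))
    return sum(wi * xi for wi, xi in zip(w, features))
```

With the illustrative values above, `candidate_accuracy(select_coefficients(3), [10, 20, 40])` evaluates the five-term sum over the distances (10, 20, 40) and the differences (10, 20).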

[0020]

[Problems to Be Solved by the Invention] As described above, a conventional online character recognition device that determines the likelihood of a recognition result does so using a candidate accuracy obtained from the distance values of the recognition result (including difference values) and the discriminant coefficient values of the similar-category set to which the recognized category belongs. Discrimination performance therefore improves when the candidate characters of the recognition result are distributed within a predefined similar-character category, but discrimination accuracy falls for unknown distributions.

[0021] Furthermore, because the distributions of the categories overlap one another, when every category is classified into one or another of the similar categories, discrimination cannot be performed against the similar categories that actually need to be considered, and discrimination performance falls.

[0022] Furthermore, when the number of similar-category groups is increased to improve discrimination accuracy, (2K−1) discriminant coefficient values, one per dimension, must be stored for each similar-category group, so the required memory capacity becomes very large.

[0023] Furthermore, obtaining accurate discriminant coefficients requires ten times as many training samples as there are dimensions, so 10 × (2K−1) correctly read samples and error samples are needed for each similar-category group. In general, correctly read samples are relatively easy to prepare, but error samples are very hard to prepare on a device with a relatively high recognition rate. The number of similar-category groups therefore has to be restricted, which lowers discrimination accuracy.
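To make the scaling concrete, the storage and training-sample requirements of the conventional approach can be tallied as below. This is a back-of-the-envelope sketch: the function name and the example values of K and the group count are hypothetical, while the 2K − 1 dimensions and the factor of 10 come from the description.

```python
def discriminant_costs(k, groups, samples_per_dimension=10):
    """Tally the costs of per-group discriminant coefficients.

    k: maximum number of candidate categories (K).
    groups: number of similar-category groups.
    samples_per_dimension: training samples required per dimension (10 in the text).
    """
    dimensions = 2 * k - 1  # K distances plus K - 1 differences
    return {
        "dimensions": dimensions,
        "coefficients_total": dimensions * groups,
        "samples_per_group": samples_per_dimension * dimensions,
    }
```

For instance, with K = 10 and 50 groups (illustrative numbers only), each group needs 19 coefficients and 190 samples, and the coefficient memory holds 950 values in total.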

[0024] Furthermore, because the distance value of a recognition result is measured against a standard pattern prepared for each category, it does not necessarily reflect character quality. Even when a character is written relatively carefully, the candidate accuracy may not be high, so discrimination accuracy does not improve even if the user improves the quality of the handwriting.

[0025] The present invention has been made to solve the above problems. Its object is to realize an online character recognition device and an online character recognition method that improve the accuracy of discriminating the recognition result by using, to determine the likelihood of the recognition result, not only the distance values of the recognition result but also character quality extracted from the time-series information obtained at input.

[0026]

[Means for Solving the Problems] A character recognition device according to the present invention comprises: a character quality extraction unit that receives an input pattern and category information and extracts character quality information for character quality determination; a character quality determination information storage memory that stores, in advance, information for determining character quality; an identification unit that collates the feature vector of the input pattern with the feature vectors of the standard patterns of the categories in a dictionary memory and outputs category information and distance value information as the recognition result; and a determination unit that determines the likelihood of the recognition result using the distance value information of the recognition result obtained by the identification unit, the character quality information obtained by the character quality extraction unit, and the determination information from the character quality determination information storage memory, and outputs the determination result.

[0027] In the character recognition device according to the present invention, the character quality extraction unit extracts the character quality information based on time-series information obtained from the input pattern.

[0028] In the character recognition device according to the present invention, the character quality extraction unit extracts the character quality information using, as the time-series information, stroke count information obtained from the input pattern or regular stroke count information for the candidate characters of the recognition result.

[0029] In the character recognition device according to the present invention, the character quality extraction unit extracts the character quality information based on a stroke count variation rate calculated, as the time-series information, from the stroke count information obtained from the input pattern and the regular stroke count information for the candidate characters of the recognition result.

[0030] In the character recognition device according to the present invention, the character quality extraction unit extracts the character quality information based on stroke information obtained from the input pattern as the time-series information.

[0031] In the character recognition device according to the present invention, the character quality extraction unit extracts, as the character quality information, the distance values in width and height between the stroke rectangles of the input pattern and the stroke rectangles prepared in advance for the candidate characters of the recognition result, these rectangles serving as the stroke information.

[0032] In the character recognition device according to the present invention, the character quality extraction unit extracts, as the character quality information, the distance value between the stroke directions of the input pattern and the stroke directions prepared in advance for the candidate characters of the recognition result, these directions serving as the stroke information.

[0033] In the character recognition device according to the present invention, the character quality extraction unit extracts, as the character quality information, the distance value between the directions between consecutive strokes of the input pattern and the directions between consecutive strokes prepared in advance for the candidate characters of the recognition result, these directions serving as the stroke information.

[0034] In the character recognition device according to the present invention, the character quality extraction unit extracts, as the character quality information, the distance value between the sum of the stroke lengths of the input pattern and the sum of the stroke lengths prepared in advance for the candidate characters of the recognition result, these sums serving as the stroke information.

[0035] In the character recognition device according to the present invention, the character quality extraction unit compares, as the stroke information, the stroke type information of the input pattern with the stroke type information prepared in advance for the candidate characters of the recognition result, and extracts the result as the character quality information.

[0036] In the character recognition device according to the present invention, the character quality extraction unit extracts the character quality information using information on the bending points obtained from the input pattern.

[0037] A character recognition method according to the present invention comprises: a character quality extraction step of extracting, from an input pattern, information for character quality determination; an identification step of collating the feature vector of the input pattern with the feature vectors of the standard patterns of the recognition target characters stored in advance in a dictionary memory, and outputting category information and distance value information as the recognition result; and a determination step of determining the likelihood of the recognition result using the recognition result obtained in the identification step, the character quality information obtained in the character quality extraction step, and determination information for determining character quality stored in advance in a character quality determination information storage memory, and outputting the determination result.

[0038]

[Embodiments of the Invention] An embodiment of the present invention is described below. Embodiment 1. FIG. 1 is a block diagram showing the configuration of an online character recognition device according to Embodiment 1 of the present invention. In the figure, reference numeral 1 denotes a feature extraction unit that receives a pattern binarized character by character by reading means such as a scanner or a character segmentation device, analyzes the features of the input pattern, and outputs its feature vector to an identification unit 3. Reference numeral 2 denotes a dictionary memory. Reference numeral 3 denotes the identification unit, which collates the feature vector with standard vectors, retrieved from the dictionary memory 2, that represent the standard features of the recognition target categories; calculates the similarity between them, for example as a distance value; generates a group of top-candidate categories in descending order of similarity (for distance values, in ascending order of distance); and outputs, as the recognition result, a signal representing the category information of the top candidates and a signal representing the corresponding distance value information. The feature extraction unit 1, the dictionary memory 2, and the identification unit 3 are the same as the corresponding components of the conventional device shown in FIG. 28.

[0039] Reference numeral 50 denotes a character quality extraction unit that receives the input pattern and category information and extracts character quality information for character quality determination. Reference numeral 51 denotes a character quality determination information storage memory that stores, in advance, information for determining character quality. Reference numeral 52 denotes a determination unit that determines the likelihood of the recognition result from the character quality information obtained from the character quality extraction unit 50, the distance value information of the recognition result obtained from the identification unit 3, and the character quality determination information obtained from the character quality determination information storage memory 51, and outputs the determination result.

[0040] FIG. 2 shows an input pattern. In the figure, 60 denotes the input pattern "亜" written in seven strokes.

[0041] FIG. 3 shows the recognition result output from the identification unit 3 for the input pattern 60. In the figure, 61 denotes the rank of each candidate after sorting in ascending order of distance value (closest to the standard pattern first), 62 denotes the category of the candidate character, and 63 denotes the distance value.

[0042] FIG. 4 shows a stroke count information table, stored in the character quality determination information storage memory 51, that holds the regular stroke count (the correct number of strokes) of each recognition target character. In the figure, 64 denotes a category and 65 denotes the regular stroke count for that category.

[0043] FIG. 5 shows the stroke count variation rates obtained by the character quality extraction unit 50 for the input pattern 60. In the figure, 70 denotes the stroke count variation rate for each candidate character of the recognition result.

[0044] FIG. 6 shows how far the stroke counts in a large collection of handwritten characters deviate from the regular stroke counts. The horizontal axis is the stroke count variation and the vertical axis is the share of the data (%).

[0045] FIG. 7 shows an input pattern. In the figure, 80 denotes the input pattern "亜" written in seven strokes with a rising slant to the right.

[0046] FIG. 8 shows the recognition result output from the identification unit 3 for the input pattern 80. In the figure, 81 denotes the rank of each candidate after sorting in ascending order of distance value (closest to the standard pattern first), 82 denotes the category of the candidate character, and 83 denotes the distance value.

[0047] FIG. 9 shows the stroke count variation rates obtained by the character quality extraction unit 50 for the input pattern 80. In the figure, 84 denotes the stroke count variation rate for each candidate character of the recognition result.

[0048] FIG. 10 shows, with the distance value of the recognition result as the measure, the distribution for cases where the first-ranked candidate character is the correct character and the distribution for cases where it is an error character. In the figure, 85 denotes the distribution of correct characters and 86 denotes the distribution of error characters.

[0049] FIG. 11 shows, with a discrimination score that combines the distance value of the recognition result with character quality as the measure, the distribution for cases where the first-ranked candidate character is the correct character and the distribution for cases where it is an error character. In the figure, 87 denotes the distribution of correct characters, 88 denotes the distribution of error characters, and 89 denotes an example of a threshold for discriminating correct characters from error characters.

[0050] The operation is described next. Suppose that the identification unit 3 outputs the recognition result shown in FIG. 3 for the input pattern 60 shown in FIG. 2. The character quality extraction unit 50 extracts character quality information from the input pattern. Here, as a measure of the character quality of the input pattern, the stroke count variation rate $d_i$, which expresses the deviation from the regular stroke count of the i-th candidate character of the recognition result, is defined by equation (6).

[0051]

[Equation 1] Equation (6), defining the stroke count variation rate $d_i$ (formula not reproduced in this text)

[0052] Specifically, to obtain the stroke count variation rate for the first candidate character in the recognition result shown in FIG. 3, the regular stroke count of the first candidate category "亜" is looked up in the stroke count information table stored in the character quality determination information storage memory 51, and the rate is calculated with equation (6). The rates for the second and third candidate characters are obtained in the same way, giving a stroke count variation rate 70 for each candidate character as shown in FIG. 5. As shown in FIG. 4, the regular stroke count of the second candidate character "壷" is 11, so the stroke count variation rate 70 for this character is relatively large.
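The table lookup and per-candidate calculation can be sketched as follows. Equation (6) itself is not reproduced in this text, so the formula used here, the absolute difference between the written and regular stroke counts divided by the regular stroke count, is an assumption chosen only to illustrate the flow. The table is a hypothetical stand-in for FIG. 4: the value 11 for "壷" comes from the description, while the value 7 for "亜" is assumed.

```python
# Hypothetical stand-in for the stroke count information table of FIG. 4.
REGULAR_STROKES = {"亜": 7, "壷": 11}  # 11 is from the text; 7 is an assumption


def stroke_variation_rate(input_strokes, regular_strokes):
    """Assumed form of equation (6): relative deviation from the regular count."""
    return abs(input_strokes - regular_strokes) / regular_strokes


def variation_rates(input_strokes, candidates, table=REGULAR_STROKES):
    """Compute the variation rate for each candidate character, in rank order."""
    return [stroke_variation_rate(input_strokes, table[c]) for c in candidates]
```

Under this assumed formula, a seven-stroke input gives a rate of 0 for "亜" and 4/11 for "壷", which matches the qualitative statement that the second candidate's rate is relatively large.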

[0053] Next, the character quality extraction unit 50 sends the character quality information obtained for the recognition result (the stroke count variation rates 70 in FIG. 5) to the determination unit 52. The determination unit 52 determines the likelihood of the recognition result from this character quality information and the distance values of the recognition result (63 in FIG. 3) sent from the identification unit 3. To do so, it takes as parameters the distance values of the recognition result (the first-ranked distance value and the difference between the second- and first-ranked distance values) and the character quality information, and obtains the Mahalanobis generalized distance (a distance measure that compensates for statistical correlation between features) to the correct-character distribution and to the error-character distribution. As shown in equation (7), if the value obtained by subtracting the distance Dc to the correct-character distribution from the distance De to the error-character distribution is greater than the determination threshold TH, the result is judged to be a correct character; otherwise, it is judged to be an error character.

【００５４】[0054]

【数２】 (Equation 2)

【００５５】Here, in Embodiment 1, the input parameters used for the determination are the distance value of the first-ranked candidate character in the recognition result, the second-ranked candidate distance value minus the first-ranked candidate distance value, and the stroke count variation rate of the first-ranked candidate character in the character quality information, so n = 3.
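The decision rule described above (De minus Dc compared with TH, on a three-element input vector) can be sketched as follows. Expression (7) is not reproduced in this text, so whether the patent uses the distance or its square is an assumption; the sketch uses the unsquared distance. All function names and the example parameter values are hypothetical.

```python
def mahalanobis(x, mean, inv_cov):
    """Mahalanobis distance of x from a distribution (mean, inverse covariance)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    n = len(diff)
    quad = sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(n) for j in range(n))
    return max(quad, 0.0) ** 0.5  # guard against tiny negative rounding error


def is_correct(x, correct_params, error_params, th):
    """Judge 'correct' when De - Dc exceeds the threshold TH."""
    dc = mahalanobis(x, *correct_params)  # distance to correct-character distribution
    de = mahalanobis(x, *error_params)    # distance to error-character distribution
    return de - dc > th


# x = (first-ranked distance, second minus first distance, stroke variation rate)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
correct_params = ([0.0, 0.0, 0.0], IDENTITY)  # illustrative values only
error_params = ([1.0, 1.0, 1.0], IDENTITY)    # illustrative values only

near_correct = is_correct([0.1, 0.1, 0.0], correct_params, error_params, th=0.0)
near_error = is_correct([0.9, 0.9, 1.0], correct_params, error_params, th=0.0)
```

With TH = 0 the rule reduces to picking the nearer distribution; raising TH makes the "correct" judgment stricter.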

【００５６】The mean vector μc of the correct-character distribution and the inverse Σc^-1 of its variance-covariance matrix are calculated in advance from a large set of prepared learning samples, using those samples for which the first-ranked character of the recognition result is the correct character.

【００５７】Likewise, the mean vector and inverse matrix for error characters are calculated from those learning samples for which the first-ranked character of the recognition result is an error. These parameters are stored in the character quality determination information storage memory 51.
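A minimal sketch of this parameter estimation, in plain Python, is given here. The split of learning samples into correct and error groups follows the description; the use of the unbiased (n - 1) covariance estimate, the Gauss-Jordan inversion, and all function names are assumptions made for illustration.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def covariance_matrix(samples):
    """Sample variance-covariance matrix (assumption: n - 1 normalisation)."""
    n = len(samples)
    mu = mean_vector(samples)
    dim = len(mu)
    return [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_both(labelled_samples):
    """labelled_samples: list of (feature_vector, first_rank_is_correct)."""
    correct = [x for x, ok in labelled_samples if ok]
    error = [x for x, ok in labelled_samples if not ok]
    fit = lambda xs: (mean_vector(xs), invert(covariance_matrix(xs)))
    return fit(correct), fit(error)
```

Each group needs enough linearly independent samples for its covariance matrix to be invertible; otherwise `invert` raises an error.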

【００５８】With the ordinary Mahalanobis generalized distance, the determination is made only from the relative magnitude of Dc and De. In Embodiment 1, however, an appropriate threshold (TH) is set so that the correct-answer rate and the error rate can be adjusted. This threshold (TH) is also calculated in advance from the learning samples and stored in the character quality determination information storage memory 51.
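The description does not say how TH is derived from the learning samples. One possible procedure, shown purely as an assumption, is to take the smallest threshold at which the share of error samples still judged correct stays within a target rate. The function name and the target-rate parameter are hypothetical.

```python
def choose_threshold(correct_scores, error_scores, max_false_accept_rate):
    """Pick TH from learning-sample scores, where score = De - Dc.

    Returns the smallest observed score value such that the fraction of
    error samples with score > TH does not exceed the target rate.
    """
    if not error_scores:
        raise ValueError("at least one error sample is required")
    candidates = sorted(set(correct_scores) | set(error_scores))
    for th in candidates:
        false_accepts = sum(1 for s in error_scores if s > th)
        if false_accepts / len(error_scores) <= max_false_accept_rate:
            return th
    # Unreachable: at the largest candidate no score exceeds TH.
    return candidates[-1]


def accept_rate(scores, th):
    """Fraction of samples that would be judged 'correct' at threshold TH."""
    return sum(1 for s in scores if s > th) / len(scores)


# Illustrative scores only.
correct_scores = [2.0, 1.5, 0.5]
error_scores = [-1.0, 0.2, 0.8]
strict_th = choose_threshold(correct_scores, error_scores, 0.0)
loose_th = choose_threshold(correct_scores, error_scores, 0.34)
```

A stricter target raises TH and lowers both the error rate and the share of correct characters that are accepted, which is the trade-off the threshold is meant to expose.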

【００５９】A large number of handwritten character data samples were collected, and the variation of the stroke count of the written characters from the regular stroke count was analyzed. As shown in FIG. 6, block-style (kaisho) samples account for more than 60% of the total. In addition, in an ordinary character recognition device, the recognition rate for block-style character samples tends to be higher than for character samples in which stroke count variation has occurred. The stroke count variation rate is therefore effective as a measure for discriminating correct characters from error characters: it tends to be small when the first-ranked character of the recognition result is the correct character and large when it is an error character. Hence the stroke count variation rate is an effective measure for determining the likelihood of a correct character.

【００６０】In Embodiment 1, the stroke count variation rate 70 shown in FIG. 5 is obtained for the input pattern 60, and the stroke count variation rate 70 of the first-ranked recognition result is 0, a small value (the input pattern 60 was written with 7 strokes, the regular stroke count of the first-ranked character "亜").

【００６１】Next, suppose that the input pattern 80 shown in FIG. 7, written sloping upward to the right, is input. As in the case of the input pattern 60, assume that the identification unit 3 yields the rank 81, category 82 and distance value 83 shown in FIG. 8 as the recognition result, and that the character quality extraction unit 50 yields the stroke count variation rate 84 shown in FIG. 9. As FIG. 8 shows, because the character was written at a slant, the first-ranked character of the recognition result is "丑", which is an error.

【００６２】However, the distance values are identical to those for the input pattern 60, whose first-ranked character was correct, so with the distance values of the recognition result alone, the Mahalanobis distances for the input pattern 60 and the input pattern 80 would be the same. Even in this case, as shown in FIG. 9, the stroke count variation rate of the first-ranked character "丑" is 0.43, a relatively large value. Consequently, in Embodiment 1, the Mahalanobis generalized distance Dc to the correct-character distribution is smaller for the correct character (input pattern 60) than for the error character (input pattern 80), and conversely the Mahalanobis generalized distance De to the error-character distribution is larger for the correct character than for the error character. Thus, even when the likelihood of the recognition result cannot be determined from the distance values alone, it can be determined correctly by applying the stroke count variation rate.

【００６３】Embodiment 1 has been described above. Although the Mahalanobis generalized distance is used as the discrimination method in Embodiment 1, another discrimination method (for example, discrimination by a linear discriminant function) may be used.

【００６４】Also, in Embodiment 1 the Mahalanobis generalized distance is obtained for both the correct-character distribution and the error-character distribution. Alternatively, only the Mahalanobis generalized distance to the correct-character distribution may be obtained, with the result judged a correct character if that distance is smaller than a preset threshold and an error character if it is larger. In particular, with a high-accuracy recognition method it is difficult to prepare a large number of error-character samples, so discrimination using only the correct-character distribution is preferable.

【００６５】Also, in Embodiment 1 the Mahalanobis generalized distance is obtained using the inverse of the variance-covariance matrix, but the inverse of the correlation matrix may be used instead.

【００６６】Also, in Embodiment 1 the discrimination uses both the character quality information and the distance values of the recognition result, but it may be performed using the character quality information alone.

【００６７】Also, in Embodiment 1 the first-ranked distance value of the recognition result and the difference between the second- and first-ranked distance values are both used together with the character quality information, but only one of them may be used.

【００６８】Also, in Embodiment 1 the determination uses only the stroke count variation rate of the first-ranked character of the recognition result as the character quality information, but the mean of the stroke count variation rates of several candidate characters, or the difference between the first- and second-ranked stroke count variation rates, may be used.
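The two alternative quality features named here can be sketched as follows. The number of candidates averaged (`top_k`) and the sign convention of the difference (second minus first, mirroring the distance-value difference) are assumptions, and both function names are hypothetical.

```python
def mean_variation_rate(rates, top_k):
    """Mean stroke count variation rate over the top_k ranked candidates."""
    top = rates[:top_k]
    if not top:
        raise ValueError("at least one candidate is required")
    return sum(top) / len(top)


def first_second_difference(rates):
    """Second-ranked rate minus first-ranked rate (assumed sign convention)."""
    if len(rates) < 2:
        raise ValueError("two candidates are required")
    return rates[1] - rates[0]


# Illustrative rates for the first-, second- and third-ranked candidates.
rates = [0.0, 0.5, 0.25]
mean_feature = mean_variation_rate(rates, top_k=3)
diff_feature = first_second_difference(rates)
```

Either value would simply replace the first-ranked variation rate as the third element of the input vector.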

【００６９】Also, in Embodiment 1 the stroke count variation rate is used for all data. Alternatively, the learning data may be divided into patterns written with the regular stroke count and patterns written with other stroke counts, and the stroke count variation rate used only for the latter.

【００７０】As described above in connection with Embodiment 1, and as shown in FIG. 10, when only the distance values of the recognition result are used, the aim is ultimately to raise the recognition rate. The total number of patterns recognized correctly (corresponding to the area of the correct-character distribution 85) therefore exceeds the total number of patterns recognized in error (corresponding to the area of the error-character distribution 86). However, the correct-character distribution 85 and the error-character distribution 86 overlap, so the distance value is often not a proper measure for correctly discriminating correct characters from error characters.

【００７１】Therefore, the present invention introduces a new measure of character quality, typified by the degree of stroke count variation from the regular stroke count as in Embodiment 1, which raises the proportion of input patterns whose likelihood can be judged accurately. In a character recognition device, the lower the character quality of the input pattern, the harder it is to recognize correctly and the less reliable the distance values of the recognition result become; when the character quality is relatively good, the distance values of the recognition result are more reliable.

【００７２】By using the character quality measure together with the distance values, a result can be judged reliable even when the distance value is relatively large, provided the character quality is judged good, so that more input patterns are accepted as correct. Conversely, a result can be judged unreliable even when the distance value is relatively small, if the character quality is judged poor. This improves the accuracy of the likelihood determination.

【００７３】As a result, as shown in FIG. 11, a character recognition device is obtained in which, although the recognition rate is low, the overlap between correct and error characters is small (the correct-character and error-character distributions are well separated) and misjudgments are few. When the determination threshold 89 is set, a discrimination score above the threshold 89 is judged correct and a score below it is judged an error. Because the error-character distribution in the region where the discrimination score exceeds the threshold 89 is very small, misjudgments can be kept few.

【００７４】Also, in Embodiment 1, in addition to the distance value information of the recognition result, the degree of stroke count variation of the input pattern is extracted as character quality information and used in determining the likelihood of the recognition result. This raises the likelihood, as a correct character, of a pattern that was written with the correct stroke count (block style) but matches the standard pattern used for recognition poorly, so correct characters can be discriminated with high accuracy.

【００７５】Also, in Embodiment 1 the degree of stroke count variation is expressed as a ratio of variation from the regular stroke count. For characters with many strokes, where the variation tends to be relatively large, a certain amount of variation is tolerated; for characters with few strokes, even a small variation is reliably reflected. This suppresses variation in determination accuracy across input patterns of different stroke counts and allows correct characters to be discriminated with high accuracy.

【００７６】Also, in Embodiment 1 only the parameters reflecting two distributions, the correct-character distribution and the error-character distribution, need to be prepared, and they are shared by all categories. The parameters can therefore be learned from a large number of learning samples, enabling highly accurate discrimination.

【００７７】Also, because stroke count information is used, discrimination accuracy improves for similar characters that differ in stroke count. This is achieved without preparing discrimination parameter information for each similar category as in the conventional example, so the device can be realized with a small memory capacity.

【００７８】Also, because the accuracy of the likelihood determination improves, reject determination in a hierarchical character recognition device (a device in which multiple character recognition devices are connected in series) can be performed with high accuracy, and a highly accurate character recognition device can be realized.

【００７９】Also, when the first-ranked character of the recognition result is judged to be the correct character, the candidate characters of the recognition result can be narrowed to one. This removes superfluous candidate characters in a character recognition device integrated with language processing and improves reading accuracy including language processing.

【００８０】Embodiment 2. Embodiment 2 of the present invention is described below with reference to FIG. 1 and FIGS. 12 to 17. In the figures, parts identical or equivalent to those of Embodiment 1 bear the same reference numerals, and duplicate description is omitted. FIG. 12 is a stroke rectangle information table stored in the character quality determination information storage memory 51. For carefully written recognition target characters, it holds the width and height of the circumscribed rectangles of the first stroke (the stroke written first in time) and the second stroke (the stroke written second in time), normalized to values from 0 to 1. In the figure, 90 is the rectangle information of the first stroke and 91 is the rectangle information of the second stroke.

【００８１】FIG. 13 shows an example in which the width (W) and height (H) of the circumscribed rectangle of the first stroke are extracted for the input pattern 60 of FIG. 2; in the figure, 92 is the first stroke.

【００８２】FIG. 14 shows an example in which the width (W) and height (H) of the circumscribed rectangle of the first stroke are extracted for the input pattern 80 of FIG. 7; in the figure, 93 is the first stroke.

【００８３】FIG. 15 shows the rectangle evaluation values of the first stroke extracted by the character quality extraction unit 50 for the input pattern 60; in the figure, 94 is the stroke rectangle evaluation value.

【００８４】FIG. 16 shows the rectangle evaluation values of the first stroke extracted by the character quality extraction unit 50 for the input pattern 80; in the figure, 95 is the stroke rectangle evaluation value.

【００８５】FIG. 17 shows the direction vector from the first stroke to the second stroke extracted by the character quality extraction unit 50 for the input pattern 60; in the figure, 100 is the direction vector between strokes.

【００８６】First, the operation is described for the input pattern 60 shown in FIG. 2. As in Embodiment 1, assume that the identification unit 3 obtains the recognition result shown in FIG. 3. Next, as shown in FIG. 13, the character quality extraction unit 50 obtains the width (W) and height (H) of the circumscribed rectangle of the first stroke 92 from the input pattern 60.
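Extracting W and H from the time-series points of a stroke can be sketched as follows. A stroke is assumed to be a list of (x, y) points; scaling by the circumscribed rectangle of the whole character, so that the values fall in the 0 to 1 range used by the table of FIG. 12, is an assumption, as are the function names.

```python
def bounding_box(points):
    """Circumscribed rectangle of a point list as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def normalized_stroke_size(stroke, all_strokes):
    """Width and height of one stroke's rectangle, scaled to 0..1.

    Assumption: each dimension is divided by the matching dimension of the
    rectangle that encloses every stroke of the character.
    """
    sx0, sy0, sx1, sy1 = bounding_box(stroke)
    every_point = [p for s in all_strokes for p in s]
    cx0, cy0, cx1, cy1 = bounding_box(every_point)
    char_w, char_h = cx1 - cx0, cy1 - cy0
    w = (sx1 - sx0) / char_w if char_w else 0.0
    h = (sy1 - sy0) / char_h if char_h else 0.0
    return w, h


# Hypothetical two-stroke input: a nearly flat horizontal stroke, then a vertical one.
strokes = [
    [(0, 0), (50, 2), (100, 5)],
    [(50, 0), (50, 100)],
]
w, h = normalized_stroke_size(strokes[0], strokes)
```

In this made-up input the first stroke spans the full character width and a twentieth of its height.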

【００８７】Next, for each candidate character of the recognition result output from the identification unit 3, the character quality extraction unit 50 obtains the reference width and height of the circumscribed rectangle of the first stroke by referring to the stroke rectangle information table in the character quality determination information storage memory 51. For example, the width and height of the circumscribed rectangle for the first-ranked character "亜" of the recognition result are 1.00 and 0.05, respectively.

【００８８】Next, from the circumscribed rectangle information of the first stroke obtained from the input pattern 60 and the circumscribed rectangle information of the first stroke obtained from the recognition result, the character quality extraction unit 50 calculates the first-stroke rectangle evaluation value di, which indicates the degree of agreement between the circumscribed rectangles of the first stroke. Specifically, the rectangle evaluation value di is obtained by Expression (8).

【００８９】[0089]

【数３】 (Equation 3)

【００９０】The first-stroke rectangle evaluation values 94 obtained by the character quality extraction unit 50 using Expression (8) are shown in FIG. 15. Because the first-ranked character is the correct character, its stroke rectangle evaluation value is small.

【００９１】Next, as in Embodiment 1, the determination unit 52 calculates the Mahalanobis generalized distance using as input parameters the character quality information output by the character quality extraction unit 50 (the first-stroke rectangle evaluation value of the first-ranked recognition result) and the distance value information 63 of the recognition result output by the identification unit 3, and determines whether the recognition result is a correct character or an error character.

【００９２】Next, the case of the input pattern 80 of FIG. 7 is described. As with the input pattern 60, assume that the first-stroke rectangle evaluation values 95 shown in FIG. 16 have been obtained. As FIG. 16 shows, the first-stroke rectangle evaluation value for the first-ranked candidate character "丑" is a large value, 0.59, because the shape of the first stroke 93 of the input pattern differs from that of the first stroke of the character "丑".

【００９３】Next, the determination unit 52 calculates the Mahalanobis generalized distance as in the case of the input pattern 60 and determines whether the recognition result is a correct character or an error character. In this case, the result is judged to be an error character.

【００９４】Thus the first-stroke rectangle evaluation value is small when the recognition result is a correct character, and large when the recognition result is an error character whose first stroke has a different shape. Even when correct and error characters cannot be distinguished from the recognition result alone, a correct determination can be made as shown in Embodiment 2.

【００９５】In Embodiment 2 described above, the rectangle evaluation value of the first stroke is used as the character quality information, but the rectangle evaluation values of the second and subsequent strokes may be used. The rectangle evaluation values of several strokes may also be used together.

【００９６】Also, in Embodiment 2 the width and height of the circumscribed rectangle are used for the stroke evaluation value, but the direction vector from the start point to the end point of the stroke may be used.

【００９７】Also, in Embodiment 2 the width and height of the circumscribed rectangle are used for the stroke evaluation value, but the direction vector 100 between consecutive strokes (the vector between the midpoints of the strokes), as shown in FIG. 17, may be used.
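Both direction-vector alternatives can be sketched as follows. The description does not define the "midpoint" of a stroke; taking it as the point halfway between the stroke's first and last sampled points is an assumption, and the function names are hypothetical.

```python
def start_to_end_vector(stroke):
    """Direction vector from the first to the last sampled point of a stroke."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return (x1 - x0, y1 - y0)


def midpoint(stroke):
    """Assumed midpoint: halfway between the stroke's start and end points."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def inter_stroke_vector(first, second):
    """Vector from the midpoint of one stroke to the midpoint of the next."""
    (ax, ay), (bx, by) = midpoint(first), midpoint(second)
    return (bx - ax, by - ay)


# Hypothetical strokes: two horizontal lines, the second drawn 10 units further along y.
s1 = [(0, 0), (10, 0)]
s2 = [(0, 10), (10, 10)]
v_stroke = start_to_end_vector(s1)
v_between = inter_stroke_vector(s1, s2)
```

Either vector would be compared with a stored reference vector for the candidate character in the same way as the rectangle width and height.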

【００９８】As described above, in Embodiment 2, in addition to the distance value information of the recognition result, character quality information is extracted from partial stroke information of the input pattern and used in determining the likelihood of the recognition result. Character quality can therefore be judged more accurately, and discrimination accuracy improves.

【００９９】Embodiment 3. Embodiment 3 of the present invention is described below with reference to FIG. 1 and FIGS. 18 to 21. Parts identical or equivalent to those of Embodiments 1 and 2 bear the same reference numerals, and duplicate description is omitted. FIG. 18 is a stroke type table, stored in the character quality determination information storage memory 51, that represents the strokes making up characters by seven types; in the figure, 100 to 106 are the corresponding strokes.

【０１００】FIG. 19 shows the constituent stroke table 110, stored in the character quality determination information storage memory 51, which gives the number of constituent strokes in each recognition target character.

【０１０１】FIG. 20 shows, for the input pattern 60 of FIG. 2, the number of constituent strokes obtained using the stroke type table, together with the absolute value of the difference between the constituent stroke counts of the input pattern 60 and those of the candidate characters of the recognition result.

【０１０２】FIG. 21 shows, for the input pattern 80 of FIG. 7, the number of constituent strokes obtained using the stroke type table, together with the absolute value of the difference between the constituent stroke counts of the input pattern 80 and those of the candidate characters of the recognition result.

【０１０３】First, the operation is described for the input pattern 60 shown in FIG. 2. As in Embodiment 1, assume that the identification unit 3 obtains the recognition result shown in FIG. 3. Next, using the stroke type table of FIG. 18, the character quality extraction unit 50 extracts the number of strokes of each constituent type from the input pattern 60, as shown in FIG. 20.

【０１０４】Next, for each candidate character of the recognition result output from the identification unit 3, the character quality extraction unit 50 obtains the reference number of strokes of each constituent type by referring to the constituent stroke table 110 in the character quality determination information storage memory 51. For example, the reference for the first-ranked character "亜" of the recognition result is three horizontal strokes, three vertical strokes and one hook (kagi) stroke.

【０１０５】Next, the character quality extraction unit 50 obtains the absolute value of the difference between the constituent stroke counts of the input pattern 60 and those of each candidate character, and takes the sum of these differences as the distance di based on stroke types. Specifically, the distance di is obtained by Expression (9).

【０１０６】[0106]

【数４】 (Equation 4)

【０１０７】The differences for each stroke type are shown at 111 in FIG. 20. The input pattern 60 is a carefully written character "亜", so the constituent strokes of the input pattern match the constituent strokes of the character "亜" in the stroke type table. Therefore, for the recognition result "亜", the difference is 0 for every stroke type, and the final distance di is also 0.
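The distance described above, the sum over stroke types of the absolute count differences, can be sketched as follows. The stroke type labels are hypothetical names; the reference counts for "亜" (three horizontal, three vertical, one hook) are those quoted in the description. The slanted input is an invented example and is not meant to reproduce the values of FIG. 21.

```python
def stroke_type_distance(input_counts, reference_counts):
    """Sum over stroke types of |input count - reference count|."""
    types = set(input_counts) | set(reference_counts)
    return sum(
        abs(input_counts.get(t, 0) - reference_counts.get(t, 0))
        for t in types
    )


# Reference counts for "亜" as quoted in the description.
REFERENCE_A = {"horizontal": 3, "vertical": 3, "hook": 1}

# A carefully written input whose stroke types match the reference.
careful_input = {"horizontal": 3, "vertical": 3, "hook": 1}

# Invented slanted input: only the horizontal strokes are reclassified as rising.
slanted_input = {"rising": 3, "vertical": 3, "hook": 1}

d_careful = stroke_type_distance(careful_input, REFERENCE_A)
d_slanted = stroke_type_distance(slanted_input, REFERENCE_A)
```

Each reclassified stroke contributes twice to the distance (once for the type it left, once for the type it joined), so slanted writing inflates the value quickly.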

【０１０８】Next, as in Embodiment 1, the determination unit 52 calculates the Mahalanobis generalized distance using as input parameters the character quality information output by the character quality extraction unit 50 (the stroke type evaluation value of the first-ranked recognition result) and the distance value 63 of the recognition result output by the identification unit 3, and determines whether the recognition result is a correct character or an error character.

【０１０９】Next, the case of the input pattern 80 of FIG. 7 is described. As with the input pattern 60, assume that the number of strokes of each constituent type in the input pattern 80 and the differences, for each stroke type, between the input pattern and the recognition result characters have been obtained, as shown in FIG. 21. As FIG. 21 shows, the input pattern 80 is the character "亜" written slanting upward to the right, so all its horizontal strokes are classified as right-rising strokes.

【０１１０】Next, when the stroke type evaluation value is obtained for the first-ranked candidate character "丑", the distance is a large value, 9, because it is a different character from the input pattern. In this example the distance is also a large value, 8, for the correct character "亜" ranked second in the recognition result. That is, even if the first-ranked character were the correct character, the distance would be large because the input pattern was written at a slant.

【０１１１】Next, the determination unit 52 calculates the Mahalanobis generalized distance as in the case of the input pattern 60 and determines whether the recognition result is a correct character or an error character. In this case, the result is judged to be an error character.

【０１１２】このように、ストローク種類評価値は、入
力パターンが丁寧に筆記され、かつ認識結果が正解文字
の場合は小さい値、認識結果がエラー文字あるいは入力
パターンが丁寧に筆記されなかった場合は、大きい値と
なり、認識結果のみで正解文字、エラー文字の判定が行
えない場合でも、実施の形態３に示すように正しい判定
を行うことができる。As described above, the stroke type evaluation value is small when the input pattern is carefully written and the recognition result is a correct character, and large when the recognition result is an error character or the input pattern is not carefully written. Therefore, even when correct and error characters cannot be distinguished from the recognition result alone, a correct determination can be made as shown in the third embodiment.

【０１１３】以上、この実施の形態３では、ストローク
種類を７種類に分類したが、これとは異なる数の分類に
しても良い。As described above, in the third embodiment, the stroke types are classified into seven types. However, a different number of types may be used.

【０１１４】また、この実施の形態３では、認識対象文
字に対して１つのストローク種類情報のみ持つようにし
ているが、１つの文字に対して複数のストローク種類テ
ーブルを持っても良い。Further, in the third embodiment, only one piece of stroke type information is held for each recognition target character, but a plurality of stroke type tables may be held for one character.

【０１１５】また、この実施の形態３では、文字を構成
するすべてのストロークを用いて評価を行ったが、部分
的なストロークに対してのみ、ストローク種類を評価し
ても良い。Further, in the third embodiment, the evaluation uses all the strokes constituting the character, but the stroke types may be evaluated for only some of the strokes.

【０１１６】また、この実施の形態３では、ストローク
本数の差の絶対値を使用したが、式（１０）に示すよう
に、入力パターンの画数及び正規文字の画数で除算した
値の差を距離ｄｉとしても良い。Further, in the third embodiment, the absolute value of the difference in the number of strokes is used. However, as shown in equation (10), the difference between the values obtained by dividing by the stroke count of the input pattern and by the stroke count of the regular character may instead be used as the distance d_i.

【０１１７】[0117]

【数５】 (Equation 5)

【０１１８】以上のように、この実施の形態３では、認
識結果の距離値情報に加えて、入力パターンのストロー
ク種類により文字品質情報を抽出し、認識結果の確から
しさの判定に用いるようにしたので、より正確に文字品
質が判定できるようになり、判別精度が向上する。As described above, in the third embodiment, in addition to the distance value information of the recognition result, the character quality information is extracted according to the stroke type of the input pattern, and is used to determine the certainty of the recognition result. Therefore, the character quality can be more accurately determined, and the determination accuracy is improved.

【０１１９】実施の形態４．以下、この発明の実施の形
態４について、図１及び図２２〜図２７を用いて説明す
る。実施の形態１、２、３と同一または相当部分は同一
符号を付し、重複説明を省略する。図２２は、入力パタ
ーンから公知の技術を用いて、屈曲点を抽出する様子を
示した図であり、この例では、∠ＱＰＲが適当な閾値θ
より小さい場合に、点Ｐを屈曲点として抽出する。Embodiment 4. Hereinafter, a fourth embodiment of the present invention will be described with reference to FIG. 1 and FIGS. 22 to 27. Parts identical or corresponding to those in Embodiments 1, 2, and 3 are denoted by the same reference numerals, and redundant description is omitted. FIG. 22 illustrates how bending points are extracted from an input pattern using a known technique. In this example, when ∠QPR is smaller than a suitable threshold θ, the point P is extracted as a bending point.

【０１２０】図２３は、入力パターン６０に対して、屈
曲点を抽出した例を示す図であり、図中１２０が抽出さ
れた屈曲点である。FIG. 23 is a diagram showing an example in which a bending point is extracted from the input pattern 60. In FIG. 23, 120 is the extracted bending point.

【０１２１】図２４は、文字品質判定情報格納メモリ５
１に格納された丁寧に筆記された認識対象文字の屈曲点
数を格納した屈曲点数テーブル１３１である。FIG. 24 shows the bending point number table 131, which is stored in the character quality determination information storage memory 51 and holds the number of bending points of each recognition target character when carefully written.

【０１２２】図２５は、入力パターン６０における屈曲
点数と認識結果文字における屈曲点数から、屈曲評価値
１３２を算出した例を示す図である。FIG. 25 is a diagram showing an example of calculating the bending evaluation value 132 from the number of bending points in the input pattern 60 and the number of bending points in the recognition result character.

【０１２３】図２６は、文字筆記時に始点に飾りが発生
した入力パターン１２１に対して、屈曲点を抽出した例
を示す図であり、図中１２２〜１２５は抽出された屈曲
点を示している。FIG. 26 is a diagram showing an example in which bending points are extracted from the input pattern 121, in which a decoration occurred at the start point during writing. In the figure, 122 to 125 indicate the extracted bending points.

【０１２４】図２７は、入力パターン１２１における屈
曲点数と認識結果文字における屈曲点数から、屈曲評価
値１３５を算出した例を示す図である。図中１３６は、
入力パターン１２１に対する認識結果の距離値である。FIG. 27 is a diagram showing an example in which the bending evaluation value 135 is calculated from the number of bending points in the input pattern 121 and the number of bending points in each recognition result character. In the figure, 136 is the distance value of the recognition result for the input pattern 121.

【０１２５】まず、図２に示す入力パターン６０に対し
て、動作を説明する。実施の形態１と同様に、識別部３
で図３のような認識結果を得たとする。次に、文字品質
抽出部５０は、図２２に示すように、入力パターンの各
ストロークに対して、等間隔なサンプル点を抽出し、注
目サンプル点の前後との角度を計算し、規定の屈曲検出
角度θと比較を行い、θより小さい場合に注目点を屈曲
点として抽出する。入力パターン６０は丁寧に筆記され
たパターンであるので、図２３に示すように１つの屈曲
点１２０のみ抽出される。First, the operation for the input pattern 60 shown in FIG. 2 will be described. As in the first embodiment, assume that the identification unit 3 has obtained the recognition result shown in FIG. 3. Next, as shown in FIG. 22, the character quality extraction unit 50 extracts equally spaced sample points from each stroke of the input pattern, calculates the angle formed at each sample point of interest by its preceding and following sample points, compares that angle with the prescribed bend detection angle θ, and extracts the point of interest as a bending point when the angle is smaller than θ. Since the input pattern 60 is a carefully written pattern, only one bending point 120 is extracted, as shown in FIG. 23.
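The bending point extraction described above can be sketched as follows. This is a minimal illustration, assuming strokes are given as lists of (x, y) points, that resampling is by equal arc length, and that Q and R are the immediately preceding and following sample points. The sampling step, the threshold of 120 degrees, and all function names are hypothetical choices, not values from this document.

```python
import math


def resample(points, step):
    """Resample a polyline at equal arc-length intervals of size step."""
    out = [points[0]]
    carried = 0.0  # arc length travelled since the last emitted sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        pos = step - carried  # offset of the next sample within this segment
        while pos <= seg:
            t = pos / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += step
        carried = seg - (pos - step)
    return out


def angle_deg(q, p, r):
    """Angle QPR in degrees; 180 for a straight line through P."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 180.0
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def bending_points(stroke, step=1.0, theta_deg=120.0):
    """Sample points P whose angle QPR is smaller than theta_deg."""
    pts = resample(stroke, step)
    return [
        p for q, p, r in zip(pts, pts[1:], pts[2:])
        if angle_deg(q, p, r) < theta_deg
    ]
```

For an L-shaped stroke such as `[(0, 0), (10, 0), (10, 10)]` the only sample with an angle below the threshold is the corner, while a straight stroke yields no bending points. A practical implementation would probably also merge detections that fall close together around one corner.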

【０１２６】次に、文字品質抽出部５０は、識別部３か
ら出力された認識結果の各候補文字に対して、基準とな
る屈曲点数を文字品質判定情報格納メモリ５１中の屈曲
点数テーブル１３１を参照して求める。例えば、認識結
果第１位文字「亜」における基準屈曲点数は、１であ
る。Next, for each candidate character of the recognition result output from the identification unit 3, the character quality extraction unit 50 obtains the reference number of bending points by referring to the bending point number table 131 in the character quality determination information storage memory 51. For example, the reference number of bending points for the first-ranked character "亜" in the recognition result is 1.

【０１２７】次に、文字品質抽出部５０は、入力パター
ン６０における屈曲点数と各候補文字における屈曲点数
との差異の絶対値を屈曲評価値ｄｉとして式（１１）で
算出する。Next, the character quality extraction unit 50 calculates, by equation (11), the absolute value of the difference between the number of bending points in the input pattern 60 and the number of bending points of each candidate character as the bending evaluation value d_i.

【０１２８】[0128]

$$d_i = \lvert N - N_i \rvert \tag{11}$$

Ｎ：入力パターンの屈曲点数
Ｎｉ：認識結果の第ｉ位候補における屈曲点数

$N$: number of bending points in the input pattern
$N_i$: number of bending points of the $i$-th ranked candidate in the recognition result

【０１２９】文字品質抽出部５０で計算された入力パタ
ーン６０に対する屈曲評価値１３２を図２５に示す。入
力パターン６０は丁寧に筆記された文字「亜」なので、
入力パターンの屈曲点数と屈曲点数テーブル１３１の文
字「亜」における屈曲点数は一致する。したがって、認
識結果「亜」における屈曲評価値ｄｉも０となる。FIG. 25 shows the bending evaluation value 132 calculated by the character quality extraction unit 50 for the input pattern 60. Since the input pattern 60 is a carefully written character "亜", the number of bending points in the input pattern matches the number of bending points for the character "亜" in the bending point number table 131. Therefore, the bending evaluation value d_i for the recognition result "亜" is 0.
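Equation (11) amounts to a table lookup followed by an absolute difference, which can be sketched as below. The reference count of 1 for "亜" and the input counts of 1 and 4 come from the description. The entry for "丑" and the function and variable names are hypothetical.

```python
def bending_evaluation(input_bend_count, candidates, bend_table):
    """Bending evaluation value d_i = |N - N_i| for each ranked candidate."""
    return [abs(input_bend_count - bend_table[c]) for c in candidates]


# Reference bending point counts per character. The value for "亜" follows
# the description; the value for "丑" is a made-up placeholder.
BEND_TABLE = {"亜": 1, "丑": 2}

# Carefully written input with 1 bending point.
careful = bending_evaluation(1, ["亜", "丑"], BEND_TABLE)
# Input with a decoration at a stroke start point, giving 4 bending points.
decorated = bending_evaluation(4, ["亜", "丑"], BEND_TABLE)
```

The first element of each list is the value for the first-ranked candidate, which is the one passed on to the determination step.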

【０１３０】次に、判定部５２は、実施の形態１の場合
と同様に、文字品質抽出部５０で出力された文字品質情
報（認識結果第1位の屈曲評価値）と、識別部３で出力
された認識結果の距離値６３を入力パラメータとするマ
ハラノビスの汎距離を算出し、認識結果が正解文字であ
るか、エラー文字であるかを判定する。Next, as in the first embodiment, the determination unit 52 calculates the Mahalanobis generalized distance, taking as input parameters the character quality information output by the character quality extraction unit 50 (the bending evaluation value of the first-ranked recognition result) and the distance value 63 of the recognition result output by the identification unit 3, and determines whether the recognition result is a correct character or an error character.

【０１３１】次に、図２６の入力パターン１２１の場合
について説明する。入力パターン６０の場合と同様に、
図２６に示す入力パターン１２１における屈曲点数、及
び認識結果文字に対する屈曲点数との差異が求まったと
する。図２６に示すように、入力パターン１２１はスト
ロークの始点箇所に飾りが発生したパターンのため、屈
曲点数は４となっている。Next, the case of the input pattern 121 shown in FIG. 26 will be described. As with the input pattern 60, assume that the number of bending points in the input pattern 121 shown in FIG. 26 and its difference from the number of bending points of each recognition result character have been obtained. As shown in FIG. 26, the input pattern 121 has a decoration at the start point of a stroke, so its number of bending points is 4.

【０１３２】次に、第１位の候補文字「亜」における屈
曲評価値を求めると、屈曲評価値は３となる。つまり、
第１位が正解文字であっても、入力パターンに飾りが発
生しているため、評価値は大きな値となる。Next, when the bending evaluation value is computed for the first-ranked candidate character "亜", it is 3. That is, even though the first-ranked candidate is the correct character, the evaluation value is large because the input pattern contains a decoration.

【０１３３】次に、判定部５２は、入力パターン６０の
場合と同様にマハラノビスの汎距離を算出し、認識結果
が正解文字であるか、エラー文字であるかを判定する。
この場合は、エラー文字と判別される。Next, as with the input pattern 60, the determination unit 52 calculates the Mahalanobis generalized distance and determines whether the recognition result is a correct character or an error character. In this case, the recognition result is determined to be an error character.

【０１３４】このように、屈曲評価値は、入力パターン
が丁寧に筆記され、かつ認識結果が正解文字の場合は小
さい値、認識結果がエラー文字あるいは入力パターンが
丁寧に筆記されなかった場合は、大きい値となり、認識
結果のみで正解文字、エラー文字の判定が行えない場合
でも、実施の形態４に示すように正しい判定を行うこと
ができる。As described above, the bending evaluation value is small when the input pattern is carefully written and the recognition result is a correct character, and large when the recognition result is an error character or the input pattern is not carefully written. Therefore, even when correct and error characters cannot be distinguished from the recognition result alone, a correct determination can be made as shown in the fourth embodiment.

【０１３５】以上、この実施の形態４では、屈曲点数を
等間隔にサンプリングした点から抽出したが、他の方法
で屈曲点を求めても良い。As described above, in the fourth embodiment, bending points are extracted from points sampled at equal intervals, but the bending points may be obtained by other methods.

【０１３６】また、この実施の形態４では、認識対象文
字に対して１つの屈曲点数テーブルのみ持つようにして
いるが、１つの文字に対して複数の屈曲点数テーブルを
持っても良い。In the fourth embodiment, only one bending point number table is provided for a character to be recognized. However, a plurality of bending point number tables may be provided for one character.

【０１３７】また、この実施の形態４では、文字を構成
するすべてのストロークに対して屈曲点数による評価を
行ったが、部分的なストロークあるいは、部分的な領域
に対してのみ、屈曲点数を評価しても良い。Further, in the fourth embodiment, the evaluation based on the number of bending points is performed for all strokes constituting a character. However, the number of bending points is evaluated only for a partial stroke or a partial area. You may.

【０１３８】また、この実施の形態４では、屈曲点数の
差の絶対値を使用したが、式（１２）に示すように、入
力パターンの画数及び正規文字の画数で除算した値の差
を距離ｄｉとしても良い。Further, in the fourth embodiment, the absolute value of the difference in the number of bending points is used. However, as shown in equation (12), the difference between the values obtained by dividing by the stroke count of the input pattern and by the stroke count of the regular character may instead be used as the distance d_i.
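The normalized variant described in this paragraph can be written out as a worked formula. The following is only one reading consistent with the wording above and with the symbols of equation (11). The symbols $M$ (stroke count of the input pattern) and $M_i$ (regular stroke count of the $i$-th ranked candidate) are assumptions introduced here and are not defined in the text.

$$d_i = \left\lvert \frac{N}{M} - \frac{N_i}{M_i} \right\rvert$$

Under this reading, the number of bending points is expressed per stroke before the comparison, so a character with many strokes and a character with few strokes are compared on the same scale.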

【０１３９】[0139]

【数６】 (Equation 6)

【０１４０】以上のように、この実施の形態４では、認
識結果の距離値情報に加えて、入力パターンの屈曲評価
値により文字品質情報を抽出し、認識結果の確からしさ
の判定に用いるようにしたので、正規の画数で筆記した
が、崩れて筆記された文字や、はね、抑えが発生した癖
字における誤判定が低減でき、より正確に文字品質が判
定できるようになり、判別精度が向上する効果が得られ
る。As described above, in the fourth embodiment, in addition to the distance value information of the recognition result, character quality information is extracted from the bending evaluation value of the input pattern and used to determine the likelihood of the recognition result. This reduces erroneous determinations for characters written with the regular number of strokes but in a deformed shape, and for idiosyncratic handwriting in which hooks (はね) or stops (抑え) appear, so that character quality can be determined more accurately and the discrimination accuracy improves.

【０１４１】[0141]

【発明の効果】以上のように、この発明によれば、入力
パターンとカテゴリ情報を入力して文字品質判定のため
の文字品質情報を抽出する文字品質抽出部と、予め文字
品質を判定するための情報を格納した文字品質判定情報
格納メモリと、識別部で得られた認識結果の距離値情
報、文字品質抽出部で得られた文字品質情報及び文字品
質判定情報格納メモリからの情報を用いて認識結果の確
からしさを判定し、判定結果を出力する判定部を備え、
文字品質を判定するように構成したので、文字品質の低
下したパターンと、文字品質の良いパターンを分離でき
る。一般に文字品質の低下した文字と品質の良い文字の
距離分布は異なった分布をしており、全て同一の分布で
代用すると、正解・エラー判別の性能が低下する。従っ
て、文字品質に応じて判別のための学習をすることによ
り、性能向上が図られる。また、文字品質を表す尺度を
判別パラメータとして導入することにより性能向上が図
れるという効果がある。As described above, according to the present invention, the device comprises a character quality extraction unit that receives the input pattern and the category information and extracts character quality information for character quality determination; a character quality determination information storage memory in which information for determining character quality is stored in advance; and a determination unit that determines the likelihood of the recognition result using the distance value information of the recognition result obtained by the identification unit, the character quality information obtained by the character quality extraction unit, and the information from the character quality determination information storage memory, and outputs the determination result. Because the device is thus configured to determine character quality, patterns with degraded character quality can be separated from patterns with good character quality. In general, characters with degraded quality and characters with good quality have different distance distributions, and substituting a single common distribution for all of them lowers correct/error discrimination performance. Therefore, performance is improved by training the discrimination according to character quality. Performance is also improved by introducing a measure of character quality as a discrimination parameter.

【０１４２】また、この発明によれば、時系列の情報を
用いて、文字品質の正解・エラーの判別を行うように構
成したので、文字の形の崩れたパターンにおいても、時
系列的な情報が正しく筆記されていれば、字形的な評価
値は小さい場合でも、正解・エラーの判別性能を向上で
きる。時系列的な情報は字形に比べてユーザが正しく筆
記しやすい情報であり、綺麗な形で文字を筆記するよ
り、ユーザにとって負担が少ない。また、字形的な情報
のみで認識した場合は、時系列的な情報を加味して判別
することにより、時系列情報が相補的に働き、判別性能
が向上するという効果がある。Further, according to the present invention, correct/error discrimination based on character quality is performed using time-series information. Therefore, even for a pattern whose character shape is deformed, correct/error discrimination performance can be improved as long as the time-series information is written correctly, even when the shape-based evaluation value is small. Time-series information is easier for the user to write correctly than character shape, and places less burden on the user than writing characters in a neat shape. Further, when recognition is performed using shape information alone, taking time-series information into account in the discrimination lets the time-series information work complementarily, which improves discrimination performance.

【０１４３】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、時系列情報として入力
パターンから得られる画数情報あるいは認識結果の候補
文字における正規画数情報により文字品質情報を抽出す
るように構成したので、時系列情報は画数情報を容易に
抽出でき、文字ごとの基準画数と入力パターンの画数の
差異が大きくなるほど、文字品質が低下していると判断
してもよく、最も容易に抽出できる文字品質を顕わす尺
度である。しかも、上記時系列情報を用いて、文字品質
の正解・エラーの判別を行う場合と同様の効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it does so from stroke count information obtained from the input pattern as time-series information, or from the regular stroke count information of the candidate characters of the recognition result. Stroke count information is easily extracted from time-series information, and the larger the difference between the reference stroke count of a character and the stroke count of the input pattern, the more the character quality can be judged to have degraded. It is thus the most easily extracted measure of character quality. In addition, the same effect is obtained as in the case, described above, of performing correct/error discrimination using time-series information.

【０１４４】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、時系列情報として入力
パターンから得られる画数情報及び認識結果の候補文字
における正規画数情報により算出した画数変動率を基に
文字品質情報を抽出するように構成したので、画数変動
率の適用により、画数の多い文字での画数変化と画数の
少ない文字での画数変化を正規化できる。つまり、３画
の文字が２画で筆記された場合と、２０画の文字が１９
画で筆記された場合では、文字品質に対する影響度が異
なる。そこで、画数差を正規画数で正規化することによ
り、影響度を均一化する効果があるとともに、上記した
各項目と同様の効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it does so based on a stroke count variation rate calculated from the stroke count information obtained from the input pattern as time-series information and the regular stroke count information of the candidate characters of the recognition result. Applying the stroke count variation rate normalizes a change in stroke count in a character with many strokes against a change in a character with few strokes. That is, the effect on character quality differs between a 3-stroke character written in 2 strokes and a 20-stroke character written in 19 strokes. Normalizing the stroke count difference by the regular stroke count therefore equalizes the degree of influence, and the same effects as the items above are also obtained.
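The normalization in this paragraph can be checked with a short calculation. The sketch assumes the variation rate is the absolute stroke count difference divided by the regular stroke count. Whether the rate used in the embodiments is signed or absolute is not stated in this passage, and the function name is hypothetical.

```python
def stroke_count_variation_rate(input_strokes, regular_strokes):
    """Absolute stroke count difference normalized by the regular count."""
    if regular_strokes <= 0:
        raise ValueError("regular stroke count must be positive")
    return abs(input_strokes - regular_strokes) / regular_strokes


# A 3-stroke character written in 2 strokes.
few = stroke_count_variation_rate(2, 3)
# A 20-stroke character written in 19 strokes.
many = stroke_count_variation_rate(19, 20)
```

Both cases differ from the regular count by one stroke, but the normalized rate is about 0.33 for the 3-stroke character and 0.05 for the 20-stroke character, which reflects the larger effect of one merged stroke on a simple character.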

【０１４５】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、時系列情報として入力
パターンから得られるストローク情報を基に文字品質情
報を抽出するように構成したので、低品質文字では、文
字を構成するストローク（字画）の情報も崩れた場合が
多く、その情報を基に文字品質を判定する。ストローク
情報はオンライン文字認識では容易に抽出できる情報で
あり、画数情報以上に詳細な判定が可能なので、精度良
く判定できるとともに、上記した各項目と同様の効果が
ある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it does so based on stroke information obtained from the input pattern as time-series information. In low-quality characters, the information of the strokes constituting the character is often degraded as well, and character quality is determined from that information. Stroke information is easily extracted in online character recognition and permits a more detailed determination than stroke count information, so the determination is accurate, and the same effects as the items above are also obtained.

【０１４６】また、この発明によれば、前文字品質抽出
部で文字品質情報を抽出する際に、ストローク情報とし
て入力パターンにおけるストローク矩形と認識結果の候
補文字に対して予め用意されたストローク矩形との幅、
高さの距離値を文字品質情報として抽出するように構成
したので、ストローク情報として、容易に抽出できるス
トロークの外接矩形情報を使用する。文字が傾いた場合
などは、外接矩形も変化するので、容易に抽出でき、か
つ精度良く文字品質を抽出できる効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it extracts, as stroke information, the distance values in width and height between the stroke rectangle in the input pattern and the stroke rectangle prepared in advance for the candidate character of the recognition result. The circumscribed rectangle of a stroke, which is easy to extract, is thus used as the stroke information. When a character is slanted, for example, the circumscribed rectangle also changes, so character quality can be extracted easily and accurately.

【０１４７】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、ストローク情報として
入力パターンのストロークの方向と認識結果の候補文字
に対して予め用意されたストロークの方向との距離値を
文字品質情報として抽出するように構成したので、漢字
では、直線ストロークで構成される割合が高く、ストロ
ーク方向も、ストロークの始点からストロークの終点へ
の方向で代用できるので、容易に抽出できる。また、字
形が崩れた場合は、この情報も変化しやすいため、精度
良く文字品質を抽出できる効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it extracts, as stroke information, the distance value between the stroke direction in the input pattern and the stroke direction prepared in advance for the candidate character of the recognition result. Kanji consist largely of straight strokes, so the stroke direction can be approximated by the direction from the start point of the stroke to its end point and is therefore easy to extract. Moreover, this information tends to change when the character shape is deformed, so character quality can be extracted accurately.
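The start-to-end approximation of stroke direction can be sketched as follows. This is a minimal illustration, assuming a stroke is a list of (x, y) points, that the direction is expressed as an angle in degrees in the coordinate system of the input device, and that the distance between two directions is the smaller angular difference. The function names and the choice of an angular distance are assumptions.

```python
import math


def stroke_direction_deg(stroke):
    """Direction from the stroke start point to its end point, in [0, 360)."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0


def direction_distance_deg(a, b):
    """Smaller angular difference between two directions, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


# A horizontal reference stroke and the same stroke written at a slant.
reference = stroke_direction_deg([(0, 0), (10, 0)])
slanted = stroke_direction_deg([(0, 0), (10, 3)])
distance = direction_distance_deg(slanted, reference)
```

Intermediate sample points are ignored, which matches the observation that most kanji strokes are close to straight lines. A curved stroke would need a different direction measure.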

【０１４８】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、ストローク情報として
入力パターンの連続するストローク間の方向と認識結果
の候補文字に対して予め用意された連続するストローク
間の方向との距離値を文字品質情報として抽出するよう
に構成したので、ストローク間の方向は、ストローク間
の相対的な位置関係を示し、字形が崩れた場合に、相対
位置関係は変化しやすい。また、連続するストローク間
とすることで、筆順情報を反映できる。また、容易に抽
出できるという効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it extracts, as stroke information, the distance value between the direction between consecutive strokes in the input pattern and the direction between consecutive strokes prepared in advance for the candidate character of the recognition result. The direction between strokes indicates the relative positional relationship of the strokes, and this relationship tends to change when the character shape is deformed. Using consecutive strokes also reflects stroke order information. This information is also easy to extract.

【０１４９】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、ストローク情報として
入力パターンの各ストロークの長さの総和と認識結果の
候補文字に対して予め用意されたストロークの長さの総
和との距離値を文字品質情報として抽出するように構成
したので、ストロークの長さは、同一ストローク内のサ
ンプル点間の距離の総和を求めればよく、容易に抽出可
能である。字形が崩れた文字や、速く筆記した文字は、
ストロークの長さが変化しやすいため、精度良く文字品
質を抽出できる効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it extracts, as stroke information, the distance value between the sum of the lengths of the strokes of the input pattern and the sum of stroke lengths prepared in advance for the candidate character of the recognition result. The length of a stroke is obtained simply by summing the distances between sample points within that stroke, so it is easy to extract. Stroke length tends to change in characters with deformed shapes or characters written quickly, so character quality can be extracted accurately.
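The stroke length computation described above can be sketched as follows. This is a minimal illustration, assuming a character is a list of strokes, each a list of (x, y) sample points, and that the distance value is the absolute difference between the input total and a stored reference total. Any size normalization applied before the comparison is not covered here, and the function names are hypothetical.

```python
import math


def stroke_length(stroke):
    """Sum of distances between consecutive sample points of one stroke."""
    return sum(math.dist(a, b) for a, b in zip(stroke, stroke[1:]))


def total_stroke_length(strokes):
    """Sum of the lengths of all strokes in a character."""
    return sum(stroke_length(s) for s in strokes)


def length_evaluation(strokes, reference_total):
    """Absolute difference between input and reference total lengths."""
    return abs(total_stroke_length(strokes) - reference_total)


# Two strokes: a 3-4-5 diagonal and a horizontal line of length 2.
example = [[(0, 0), (3, 4)], [(0, 0), (1, 0), (2, 0)]]
```

Only pen-down segments are summed, so the movement between the end of one stroke and the start of the next does not contribute to the total.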

【０１５０】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、ストローク情報として
入力パターンのストロークの種類情報と認識結果の候補
文字に対して予め用意されたストロークの種類情報とを
比較し、文字品質情報として抽出するように構成したの
で、文字を構成するストロークの形状は、ある程度の決
められた形状に分類できる。その分類情報を基に、文字
品質を判別することにより、精度良く判別が可能である
という効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it compares, as stroke information, the stroke type information of the input pattern with the stroke type information prepared in advance for the candidate character of the recognition result. The shapes of the strokes constituting a character can be classified into a limited set of predetermined shapes, and determining character quality from this classification allows accurate discrimination.

【０１５１】また、この発明によれば、文字品質抽出部
で文字品質情報を抽出する際に、入力パターンから得ら
れる屈曲点（方向が大きく変化する点）の情報を用いて
文字品質情報を抽出するように構成したので、屈曲点を
容易に抽出できる。また、続け字、崩し字などでは、ス
トロークとストロークが結合し、新たな屈曲点が発生す
るため、屈曲点情報が変化しやすいため、精度良く文字
品質を抽出できる効果がある。Further, according to the present invention, when the character quality extraction unit extracts character quality information, it uses information on the bending points (points where the direction changes greatly) obtained from the input pattern. Bending points are easy to extract. In addition, in cursive or simplified handwriting, strokes are joined together and new bending points arise, so the bending point information changes readily and character quality can be extracted accurately.

[Brief description of the drawings]

【図１】この発明の実施の形態１によるオンライン文
字認識装置の構成を示す回路ブロックである。FIG. 1 is a circuit block diagram showing a configuration of an online character recognition device according to Embodiment 1 of the present invention.

【図２】入力パターンを示す図である。FIG. 2 is a diagram showing an input pattern.

【図３】入力パターンに対する認識結果を示す図であ
る。FIG. 3 is a diagram showing a recognition result for an input pattern.

【図４】認識対象文字における正規の画数を格納した
画数情報テーブルを示す図である。FIG. 4 is a diagram illustrating a stroke number information table storing a regular stroke number in a recognition target character;

【図５】入力パターンに対する画数変動率を示す図で
ある。FIG. 5 is a diagram showing the stroke count variation rate for an input pattern.

【図６】大量に収集した手書き文字における正規の画
数からの変動割合を示す図である。FIG. 6 is a diagram showing a variation ratio from a regular number of strokes in handwritten characters collected in large quantities.

【図７】入力パターンを示す図である。FIG. 7 is a diagram showing an input pattern.

【図８】入力パターンに対する認識結果を示す図であ
る。FIG. 8 is a diagram showing a recognition result for an input pattern.

【図９】入力パターンに対する画数変動率を示す図で
ある。FIG. 9 is a diagram showing the stroke count variation rate for an input pattern.

【図１０】認識結果の距離値を尺度とした場合の第１
位の認識結果候補文字が正解文字の場合における分布及
びエラー文字の場合における分布を示す図である。FIG. 10 is a diagram showing, with the distance value of the recognition result as the measure, the distribution when the first-ranked recognition result candidate character is a correct character and the distribution when it is an error character.

【図１１】認識結果の距離値に文字品質を併用した場
合の判別得点を尺度とした場合の第1位の認識結果候補
文字が、正解文字の場合における分布及びエラー文字の
場合における分布を示す図である。FIG. 11 is a diagram showing, with the discrimination score obtained by combining character quality with the distance value of the recognition result as the measure, the distribution when the first-ranked recognition result candidate character is a correct character and the distribution when it is an error character.

【図１２】筆記された認識対象文字における第1スト
ロークと第2ストロークの外接矩形における幅、高さを
正規化して格納したストローク矩形情報テーブルであ
る。FIG. 12 is a stroke rectangle information table in which the width and height of a circumscribed rectangle of a first stroke and a second stroke of a written recognition target character are normalized and stored.

【図１３】図２の入力パターンに対して第1ストロー
クの外接矩形における幅（Ｗ）及び高さ（Ｈ）を抽出し
ている説明図である。FIG. 13 is an explanatory diagram showing extraction of the width (W) and height (H) of the circumscribed rectangle of the first stroke for the input pattern of FIG. 2.

【図１４】図７の入力パターンに対して第1ストロー
クの外接矩形における幅（Ｗ）及び高さ（Ｈ）を抽出し
ている説明図である。FIG. 14 is an explanatory diagram showing extraction of the width (W) and height (H) of the circumscribed rectangle of the first stroke for the input pattern of FIG. 7.

【図１５】図２の入力パターンに対して文字品質抽出
部で抽出された第１ストロークの矩形評価値を示す図で
ある。FIG. 15 is a diagram showing a rectangle evaluation value of a first stroke extracted by the character quality extraction unit for the input pattern of FIG. 2;

【図１６】図７の入力パターンに対して文字品質抽出
部で抽出された第１ストロークの矩形評価値を示す図で
ある。FIG. 16 is a diagram showing the rectangle evaluation value of the first stroke extracted by the character quality extraction unit for the input pattern of FIG. 7.

【図１７】図２の入力パターンに対して文字品質抽出
部で抽出された第１ストロークから第２のストロークへ
の方向ベクトルを示す図である。FIG. 17 is a diagram showing the direction vector from the first stroke to the second stroke extracted by the character quality extraction unit for the input pattern of FIG. 2.

【図１８】文字品質判定情報格納メモリ中に格納され
た文字を構成するストロークを7種類で表現したストロ
ーク種類テーブルである。FIG. 18 is a stroke type table expressing strokes constituting characters stored in a character quality determination information storage memory in seven types.

【図１９】認識対象文字における構成ストロークの数
を示す構成ストロークテーブル図である。FIG. 19 is a configuration stroke table showing the number of configuration strokes in a recognition target character.

【図２０】図２の入力パターンの構成ストローク数と
認識結果の候補文字における構成ストロークとの差異の
絶体値を示した図である。FIG. 20 is a diagram showing the absolute values of the differences between the numbers of constituent strokes of the input pattern of FIG. 2 and those of the candidate characters of the recognition result.

【図２１】図７の入力パターンの構成ストローク数と
認識結果の候補文字における構成ストロークとの差異の
絶体値を示した図である。FIG. 21 is a diagram showing the absolute values of the differences between the numbers of constituent strokes of the input pattern of FIG. 7 and those of the candidate characters of the recognition result.

【図２２】入力パターンから屈曲点を抽出する様子を
示した図である。FIG. 22 is a diagram showing a state of extracting a bending point from an input pattern.

【図２３】入力パターンに対して屈曲点を抽出した例
を示す図である。FIG. 23 is a diagram illustrating an example in which a bending point is extracted from an input pattern.

【図２４】筆記された認識対象文字の屈曲点数を格納
した屈曲点数テーブルである。FIG. 24 is a table showing the number of inflection points of a written recognition target character.

【図２５】図２の入力パターンにおける屈曲点数と認
識結果文字における屈曲点数から、屈曲評価値を算出し
た例を示す図である。FIG. 25 is a diagram showing an example in which a bending evaluation value is calculated from the number of bending points in the input pattern of FIG. 2 and the number of bending points in each recognition result character.

【図２６】文字筆記時に始点に飾りが発生した入力パ
ターンに対して、屈曲点を抽出した例を示す図である。FIG. 26 is a diagram illustrating an example in which a bending point is extracted from an input pattern in which a decoration has occurred at a starting point during character writing.

【図２７】文字筆記時に始点に飾りが発生した入力パ
ターンにおける屈曲点数と認識結果文字における屈曲点
数から、屈曲評価値を算出した例を示す図である。FIG. 27 is a diagram illustrating an example in which a bending evaluation value is calculated from the number of bending points in an input pattern in which a decoration has occurred at the start point during character writing and the number of bending points in a recognition result character.

【図２８】従来のオンライン文字認識装置の構成を示
すブロック図である。FIG. 28 is a block diagram showing a configuration of a conventional online character recognition device.

【図２９】係数インデックス用メモリの内容を示す図
である。FIG. 29 is a diagram illustrating the contents of a coefficient index memory.

【図３０】係数メモリの内容を表す図である。FIG. 30 is a diagram showing the contents of a coefficient memory.

[Explanation of symbols]

１特徴抽出部、２辞書メモリ、３識別部、１４
係数インデックス用メモリ、１５係数メモリ、１６
係数選択部、１７差分値作成部、１８候補確度算出
部、２０カテゴリ番号、２１類似カレゴリのグルー
プ番号、３１グループ毎の判別系数値、５０文字品質
抽出部、５１文字品質判定情報格納メモリ、５２判
定部、６０入力パターン、６１順位、６２カテゴ
リ、６３距離値、６４カテゴリ、６５正規の画
数、７０画数変動率、８０入力パターン、８１順
位、８２カテゴリ、８３距離値、８４画数変動
率、８５正解文字の分布、８６エラー文字の分布、
８７正解文字の分布、８８エラー文字の分布、８９
閾値、９０第１ストロークの矩形情報、９１第２ス
トロークの矩形情報、９２第１ストローク、９３第
１ストローク、９４ストローク矩形評価値、９５スト
ローク矩形評価値、１００〜１０６ストローク間の方
向ベクトル、１１０ストローク種類テーブル、１１１
各ストローク種類毎の差異、１１２各ストローク種
類毎の差異、１２０屈曲点、１２１入力パターン、
１２２〜１２５屈曲点、１３１屈曲点数テーブル、
１３２屈曲評価値、１３３順位、１３４カテゴ
リ、１３５屈曲評価値、１３６距離値。1: feature extraction unit; 2: dictionary memory; 3: identification unit; 14: coefficient index memory; 15: coefficient memory; 16: coefficient selection unit; 17: difference value creation unit; 18: candidate accuracy calculation unit; 20: category number; 21: group number of similar categories; 31: discriminant coefficient values for each group; 50: character quality extraction unit; 51: character quality determination information storage memory; 52: determination unit; 60: input pattern; 61: rank; 62: category; 63: distance value; 64: category; 65: regular stroke count; 70: stroke count variation rate; 80: input pattern; 81: rank; 82: category; 83: distance value; 84: stroke count variation rate; 85: distribution of correct characters; 86: distribution of error characters; 87: distribution of correct characters; 88: distribution of error characters; 89: threshold; 90: rectangle information of the first stroke; 91: rectangle information of the second stroke; 92: first stroke; 93: first stroke; 94: stroke rectangle evaluation value; 95: stroke rectangle evaluation value; 100 to 106: direction vectors between strokes; 110: stroke type table; 111: difference for each stroke type; 112: difference for each stroke type; 120: bending point; 121: input pattern; 122 to 125: bending points; 131: bending point number table; 132: bending evaluation value; 133: rank; 134: category; 135: bending evaluation value; 136: distance value.

Claims

[Claims]

1. A character recognition device comprising: a dictionary memory in which feature vectors of standard patterns of characters to be recognized are stored in advance; a feature extraction unit that extracts, from an input pattern, a feature vector indicating features of the pattern; and an identification unit that compares the feature vector of the input pattern obtained from the feature extraction unit with the feature vector of the standard pattern of each category in the dictionary memory and outputs category information and distance value information as a recognition result, the character recognition device being characterized by further comprising: a character quality extraction unit that extracts character quality information for character quality determination; a character quality determination information storage memory in which determination information for character quality determination is stored in advance; and a determination unit that determines the likelihood of the recognition result using the distance value information of the recognition result obtained by the identification unit, the character quality information obtained by the character quality extraction unit, and the determination information from the character quality determination information storage memory, and outputs a determination result.
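As a purely illustrative sketch of the decision described in claim 1, the Python fragment below combines the distance value of the top candidate with a character quality value. The names `Candidate` and `judge_likelihood`, the two thresholds, and the rule that both values must fall below their thresholds are assumptions made for this example; the claim itself does not fix how the two kinds of information are combined.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    category: str    # recognized character category
    distance: float  # distance to the standard pattern (smaller means closer)


def judge_likelihood(best: Candidate, quality: float,
                     distance_threshold: float,
                     quality_threshold: float) -> bool:
    """Return True when the top candidate is judged likely to be correct.

    Assumption: both the distance value and the character quality value
    are "smaller is better", and each is compared with its own threshold
    taken from the stored determination information.
    """
    return best.distance < distance_threshold and quality < quality_threshold
```

With thresholds of 50.0 and 0.5, for example, a candidate with distance 10.0 and quality value 0.1 would be accepted, while the same candidate with quality value 0.9 would be rejected even though its distance is small.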

2. The character recognition device according to claim 1, wherein the character quality extraction unit extracts the character quality information based on time-series information obtained from the input pattern.

3. The character recognition device according to claim 2, wherein the character quality extraction unit extracts the character quality information based on stroke count information obtained from the input pattern as the time-series information, or on regular stroke count information for a candidate character of the recognition result.

4. The character recognition device according to claim 3, wherein the character quality extraction unit extracts the character quality information based on a stroke count variation rate calculated from the stroke count information obtained from the input pattern as the time-series information and the regular stroke count information for the candidate character of the recognition result.
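One possible way to compute a stroke count variation rate is sketched below in Python. The formula (absolute difference between the written and regular stroke counts, divided by the regular stroke count) and the function name `stroke_count_variation_rate` are assumptions for illustration only; the claim states only that the rate is calculated from these two counts.

```python
def stroke_count_variation_rate(input_strokes: int, regular_strokes: int) -> float:
    """Relative deviation of the written stroke count from the regular count.

    Assumption: rate = |input - regular| / regular, so 0.0 means the
    character was written with exactly the regular number of strokes.
    """
    if regular_strokes <= 0:
        raise ValueError("regular_strokes must be positive")
    return abs(input_strokes - regular_strokes) / regular_strokes
```

Under this assumed definition, a character with a regular stroke count of 8 that was written in 6 strokes (for example, with connected cursive strokes) has a variation rate of 0.25.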

5. The character recognition device according to claim 1, wherein the character quality extraction unit extracts the character quality information based on stroke information obtained from the input pattern as time-series information.

6. The character recognition device according to claim 5, wherein the character quality extraction unit extracts, as the character quality information, a distance value between the width and height of a stroke rectangle in the input pattern, taken as the stroke information, and the width and height of a stroke rectangle prepared in advance for a candidate character of the recognition result.
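The stroke rectangle comparison of claim 6 can be sketched as follows in Python. The city-block distance over widths and heights, the pairing of strokes by writing order, and the names `stroke_rectangle` and `rectangle_distance` are assumptions chosen for this example; the claim does not specify the distance measure.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # pen coordinates in writing order (at least one point)


def stroke_rectangle(stroke: Stroke) -> Tuple[float, float]:
    """Width and height of the bounding rectangle of one stroke."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return (max(xs) - min(xs), max(ys) - min(ys))


def rectangle_distance(strokes: List[Stroke],
                       reference_rects: List[Tuple[float, float]]) -> float:
    """Sum of |width difference| + |height difference| over paired strokes.

    Assumption: strokes are paired with the reference rectangles in
    writing order; unpaired strokes or rectangles are ignored.
    """
    total = 0.0
    for stroke, (ref_w, ref_h) in zip(strokes, reference_rects):
        w, h = stroke_rectangle(stroke)
        total += abs(w - ref_w) + abs(h - ref_h)
    return total
```

A small distance indicates that each written stroke occupies roughly the same extent as the corresponding stroke of the candidate character.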

7. The character recognition device according to claim 5, wherein the character quality extraction unit extracts, as the character quality information, a distance value between a stroke direction of the input pattern, taken as the stroke information, and a stroke direction prepared in advance for a candidate character of the recognition result.

8. The character recognition device according to claim 1, wherein the character quality extraction unit extracts, as the character quality information, a distance value between the direction between successive strokes of the input pattern, taken as stroke information, and the direction between successive strokes prepared in advance for a candidate character of the recognition result.

9. The character recognition device according to claim 5, wherein the character quality extraction unit extracts, as the character quality information, a distance value between the sum of the lengths of the strokes of the input pattern, taken as the stroke information, and the sum of the lengths of the strokes prepared in advance for a candidate character of the recognition result.
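A minimal Python sketch of the stroke length comparison in claim 9 is given below. Measuring stroke length as the polyline length through the sampled pen points, and taking the absolute difference of the two sums as the distance value, are assumptions for illustration; `stroke_length` and `total_length_distance` are hypothetical names.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # pen coordinates in writing order


def stroke_length(stroke: Stroke) -> float:
    """Polyline length of one stroke through its sampled points."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(stroke, stroke[1:]))


def total_length_distance(strokes: List[Stroke], reference_total: float) -> float:
    """Absolute difference between the written total stroke length and the
    total length prepared in advance for the candidate character (assumed
    to be expressed in the same normalized coordinate system)."""
    return abs(sum(stroke_length(s) for s in strokes) - reference_total)
```

In practice the input pattern would need to be size-normalized before the comparison so that the two totals are on the same scale; that step is outside this sketch.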

10. The character recognition device according to claim 5, wherein the character quality extraction unit extracts the character quality information by comparing stroke type information of the input pattern, taken as the stroke information, with stroke type information prepared in advance for a candidate character of the recognition result.

11. The character recognition device according to claim 2, wherein the character quality extraction unit extracts the character quality information using information on bending points obtained from the input pattern.

12. A character recognition method comprising: a feature extraction step of extracting, from an input pattern, a feature vector indicating features of the pattern; an identification step of comparing the feature vector of the input pattern obtained in the feature extraction step with feature vectors of standard patterns of characters to be recognized, stored in advance in a dictionary memory, and outputting a recognition result including distance value information; a character quality extraction step of extracting, from the input pattern, character quality information for character quality determination; and a determination step of determining the likelihood of the recognition result using the distance value information of the recognition result obtained in the identification step, the character quality information obtained in the character quality extraction step, and determination information for character quality determination stored in advance in a character quality determination information storage memory, and outputting a determination result.