JP2008077621A

JP2008077621A - Method for making up for learning character sample of character recognition system

Info

Publication number: JP2008077621A
Application number: JP2006286951A
Authority: JP
Inventors: Koji Miyake (三宅 康二)
Original assignee: Individual
Current assignee: Individual
Priority date: 2006-09-23
Filing date: 2006-09-23
Publication date: 2008-04-03
Anticipated expiration: 2026-09-23
Also published as: JP5224156B2

Abstract

PROBLEM TO BE SOLVED: To provide a method that, when the collected learning character sample group is insufficient for learning in character recognition by statistical pattern recognition, automatically generates, for each level of character quality, a character group that is deformed far more strongly and in far more varied ways than the original sample group yet contains no unreadable characters.

SOLUTION: Principal component analysis is applied to feature vectors consisting of the coordinates of feature points on the centerlines of collected handwritten characters or on the contour lines of printed characters in various fonts. Based on the result, the range within which a person can recognize a character as belonging to a character type, or the range permitted for that character type, is determined by visual inspection or by a mathematical condition, giving a readable or allowable range represented by a piecewise hyperellipsoid. Using the psychological distance, which expresses the distance from the center of the piecewise hyperellipsoid, a sufficient number of characters of varied deformation strength and quality are generated automatically and added to the initially collected learning character sample group.

COPYRIGHT: (C)2008, JPO&INPIT

Description

Detailed Description of the Invention

The present invention relates to the enrichment of the learning character sample group of a character recognition system.

In the field of handwritten OCR, recognition rates above 99.5% on unlearned characters have been obtained on ETL9(B), the database of commonly used handwritten hiragana and kanji created by the Electrotechnical Laboratory of the Agency of Industrial Science and Technology (former Ministry of International Trade and Industry) and Fujitsu Limited. Although much effort has gone into improving features and classifiers, the improvement in recognition rate has tended to saturate in recent years.

ETL9(B) was written on prescribed entry forms by writers concentrating on the act of writing. In daily work, by contrast, each of us devotes most of our attention to the task at hand, so the characters we write are of lower quality. For handwritten OCR to become truly practical, everyday handwriting must be readable with practical accuracy; at the current level of technology the conditions for widespread adoption of handwritten OCR have not been met.

Handwritten postal code reading has also progressed considerably in its read-and-sort rate, but to avoid missorting it is still necessary to apply rejection to the first three digits, which lowers the correct read-and-sort rate; reading accuracy closer to that of humans would allow further reductions in labor cost. The reading accuracy for handwritten postal addresses and codes that the United States Postal Service specified when soliciting the best recognition technology worldwide was a recognition rate of 50% with a misreading rate of 2% or less, which is still far from human reading accuracy.

Considerable technical progress is therefore still needed before handwritten OCR can be regarded as truly practical.

Recognition of printed characters is generally considered easier than recognition of handwritten characters, but high-accuracy recognition across the many typefaces and the countless glyph designs is not easy for printed characters either. In Europe and the United States it is considered important to recognize not only the typeface but also the specific glyph design, and this capability has been realized as an important function of English OCR. If it became available for documents of any country, the recognition rate over a whole document printed in a single glyph design could be improved substantially.

In the mathematical formulas that appear in many science and engineering publications, differences of typeface (roman, italic, Gothic, bold, and so on) and of upper and lower case carry differences in mathematical meaning, so accurate discrimination among them is indispensable. If formulas in books and papers of all periods and regions are to be read, a nearly unlimited variety of glyph designs exists for each typeface; some upper-case and lower-case letters have nearly identical shapes (C and c, S and s, and so on); and, compared with ordinary text, character recognition errors in formulas are hard to correct from context. Because these factors coexist, no technique yet achieves a recognition rate that is high enough for practical use when measured per formula.

For both handwritten and printed characters, the ideal learning character group for a recognition system would be, for handwritten OCR, a large and unbiased collection of the handwritten characters written to date and, for printed-character OCR, an unbiased collection of as many of the fonts used to date as possible. Achieving this is physically or economically difficult. As shown in FIG. 1, the learning character sample group that is actually collected usually has an insufficiently wide distribution and is unevenly distributed.

Statistics presupposes a data population large enough to yield the correct probability distribution. As experts have pointed out, in the high-dimensional feature spaces needed for high recognition accuracy, a sample group gathered with ordinary effort cannot come close to meeting this condition, and increasing the number of samples further (within the limits of human capability) has very little effect.

Moreover, even if the problem of collecting the learning character group were solved, far too much would remain unclear about how to handle characters written in the future and fonts designed in the future.
There appear to be two main approaches to solving this problem.

The first approach tries to compensate for the information missing from the learning character group by building into the system, through the designer's ingenuity and insight, the general knowledge that people are thought to use in character recognition (for example, devising various distance and discriminant functions, or introducing flexible matching methods such as relaxation). Many such attempts have been made over the years, but their effect appears to have saturated recently.

The second approach tries to generate many deformed characters by applying image-processing deformations to character patterns or by using a dynamic model of human writing motion.
Recently, [Non-Patent Document 2] reported a method that compensates for the shortage of learning characters by applying affine-transformation deformations to each stroke of on-line input, together with the resulting improvement in recognition accuracy.
With this kind of method, however, stronger deformation begins to produce characters that cannot be accepted as members of the character type, and unless there is a means of preventing this, the recognition rate can be expected to saturate.
H. Miyano, M. Maruyama, Y. Nakano, and T. Hananoi, "Off-Line Handwritten Character Recognition by SVM based on Virtual Examples Synthesized from On-line Characters," Proc. of Eighth International Conference on Document Analysis and Recognition (Seoul), pp. 494-498, Aug. 29-Sept. 1, 2005.

Humans, on the other hand, have not seen most of the world's handwriting, yet show sufficient recognition accuracy. This probably owes much to the human capacity for abstraction and generalization. As long as statistical recognition methods are used, it is essential to develop a method that substitutes for this capacity and achieves sufficient recognition accuracy from the character samples that can realistically be collected.

In [Non-Patent Document 2], the applicant took important points of the thinned pattern, such as end points, bend points, and branch points, as feature points, and defined for each character type a model that adds sub-feature points (also called auxiliary points) dividing the strokes between feature points at equal intervals; sequence numbers are assigned to the feature points and sub-feature points. On each thinned learning character of a character type, the points corresponding to the feature points and sub-feature points are found, and their x and y coordinates are arranged in sequence-number order to obtain a feature vector.

Principal component analysis was performed for each character type in the feature space so defined. Thickness was added to the character centerline patterns corresponding to points on each principal component axis (eigenvector), and subjects were asked up to which point the character remained readable. In this way, the range in feature space within which subjects accept a pattern as belonging to the character type was captured as a piecewise hyperellipsoidal surface. Points on this surface were defined to lie at distance 1 from the origin, with the distance of interior points decreasing proportionally, and this measure was named the psychological distance.

In the recognition process, the unknown character pattern (the character to be recognized) is thinned. In matching against each character type, the points corresponding to the feature points defined in that character type's standard pattern are found, and a feature vector is then formed. From this vector the psychological distance to the standard pattern of each character type is computed, and the character type giving the smallest psychological distance is taken as the recognition result. It was notable that this gave recognition results quite close to those of the subjects.

This method provided a way to compensate, to some extent, for deficiencies in the learning character sample group. As is well known, however, conventional thinning techniques distort the centerline near branch points, crossing points, and bend points, so the true character centerline cannot be handled, and this was found to hinder any closer approach to human recognition results.
Solving this requires a thinning algorithm with almost no centerline distortion, but no fully satisfactory one has yet been realized.

In character recognition by statistical pattern recognition, when the collected learning character sample group alone does not allow sufficient learning, the invention automatically generates, for each level of character quality, a character group whose deformation is far more varied and wide-ranging in strength and manner than that of the collected group, yet which contains no characters unreadable to humans.

Means for Solving the Problem

The present invention comprises a method aimed mainly at generating learning character sample patterns for handwritten characters (the method using character centerlines) and a method aimed mainly at generating learning character sample patterns for printed characters (the method using character contour lines).

(1) Method using character centerlines
For handwritten characters, the psychological distance known from [Non-Patent Document 1] is regarded as the strength of deformation, or quality, of a character, and the psychological distance interval [0, 1) is divided into equal subintervals. For example, with 10 subintervals, many characters falling in each of the subintervals 0 to 0.1, 0.1 to 0.2, ..., 0.9 to 1.0 are generated at random, and a character database organized by quality is created automatically.

The problem of thinning distortion can be avoided if the source learning characters are entered with a pen tablet, or if characters from an off-line collection such as a handwritten character database are shown on a display and traced with a pen. When an enormous number of character samples must be entered, as with handwritten kanji, it is probably more practical to apply a low-distortion thinning process, such as the thinning algorithm for which a separate patent application is pending, and to correct the result on a display.
FIG. 2 conceptually illustrates how the feature points defined on the standard pattern are extracted from a character entered with a pen tablet.

Next, as shown in FIG. 3, sub-feature points that divide each character line between feature points into equal parts are extracted.
According to [Non-Patent Document 1], when the standard pattern of a character type (defined by one or more curves) has $N$ feature points and sub-feature points, the original feature vector $F$ of a character belonging to that character type is
$$F = (x_1, y_1, x_2, y_2, \ldots, x_N, y_N) \tag{1}$$
where $x_i$ and $y_i$ are the x and y coordinates of feature point $i$, and $N$ is the total number of feature points and sub-feature points. To simplify handling, $F$ is normalized by Equation (2), and the result is denoted $x$.
$$x = (x_1, x_2, \ldots, x_n) = F / |F| \quad (n = 2N) \tag{2}$$
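As a minimal sketch of Equations (1) and (2), the following Python code flattens a list of points into the original feature vector and normalizes it by its Euclidean length. The function name `feature_vector` and the sample coordinates are illustrative assumptions, not part of the original description.

```python
import numpy as np

def feature_vector(points):
    """Build F (Eq. 1) from N ordered (x, y) points and normalize it to x = F/|F| (Eq. 2)."""
    pts = np.asarray(points, dtype=float)
    F = pts.reshape(-1)              # (x1, y1, x2, y2, ..., xN, yN), so n = 2N
    norm = np.linalg.norm(F)         # |F|
    if norm == 0.0:
        raise ValueError("all coordinates are zero; cannot normalize")
    return F, F / norm

# Hypothetical character with N = 2 feature points.
F, x = feature_vector([(3.0, 0.0), (0.0, 4.0)])
```

Here `x` has unit length and 2N = 4 components, in feature-point order.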

For each character type, a feature vector $x$ for the standard glyph is determined in advance. On the input learning character (a line figure), the corresponding feature points are extracted by dynamic programming (DP) or a similar method, and their x and y coordinates are arranged in feature-point order to obtain $F$ and $x$.
Next, the distance (dissimilarity) to each character type is obtained with that character type's psychological distance formula, which has the form of Equation (3).

[Equation (3) appears only as an image in the original publication and is not reproduced here.]

Here $\mu$ is the mean of the character type, and $\varphi_i$ and $U_i$ are the i-th principal component axis (eigenvector) and the human readability limit on that axis; these differ for each character type.

$(x - \mu, \varphi_i)$ denotes the inner product of the two vectors, and $U_i$ is given by Equation (4).

[Equation (4) appears only as an image in the original publication and is not reproduced here.]

Here $U_i^{+}$ and $U_i^{-}$ are the readability limits on the positive and negative sides of the i-th principal component axis.
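Equations (3) and (4) did not survive in this text. The block below is a reconstruction, not a quotation. The case definition of $U_i$ follows the wording given later in the description ($U_i^{+}$ on the positive side of the axis, $U_i^{-}$ on the negative side). The sum-of-squares form of $d(x)$ is an assumption, inferred from the description of the readable range as a piecewise hyperellipsoid and from the statement that dividing $x - \mu$ by $D = d(x)$ yields psychological distance 1. $K$ is the number of principal component axes retained, as discussed later in the description.

```latex
d(x) = \sqrt{\sum_{i=1}^{K} \left( \frac{(x - \mu,\ \varphi_i)}{U_i} \right)^{2}} \tag{3}

U_i =
\begin{cases}
U_i^{+}, & (x - \mu,\ \varphi_i) > 0 \\
U_i^{-}, & (x - \mu,\ \varphi_i) < 0
\end{cases} \tag{4}
```

Under this form, the surface $d(x) = 1$ is a hyperellipsoid within each orthant of the principal-axis coordinates, with semi-axes $|U_i^{\pm}|$, and $d$ grows linearly along any ray leaving $\mu$.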

The foregoing is described in [Non-Patent Document 1]. The present invention, however, does not apply Equations (3) and (4) to feature vectors extracted from thinned characters in order to identify the character type. Instead, it selects many points at random inside the piecewise hyperellipsoid ($0 \le d(x) \le 1$) measured in feature space as the human-readable range, obtains a group of character patterns covering that range, and adds it to the original learning character sample group.

To do this, an n-dimensional vector whose components are uniform random numbers is created as $x - \mu$, so that its direction is chosen at random. If $D$ is the value of $d(x)$ obtained from Equations (3) and (4), then $(x - \mu)/D$ has psychological distance 1. Introducing a parameter $k$ that takes values in the interval [0, 1], $k(x - \mu)/D$ has psychological distance $k$.

Adding the mean vector $\mu$ of the character type gives $k(x - \mu)/D + \mu$, a feature vector with psychological distance $k$ in the coordinates of the original feature space. Connecting the feature points and sub-feature points determined by its components with line segments, according to the standard pattern's definition of the feature points, reconstructs a character centerline with psychological distance $k$.

For example, as described above, if the interval [0, 1] is divided into 10 subintervals of width 0.1 and $k$ is drawn at random within each subinterval, character groups at 10 quality levels are obtained.
Adding these to the original learning character group yields a character group whose distribution in feature space is far more varied and less biased than that of an original group collected by ordinary means.
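The generation procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the psychological distance is assumed to take a sum-of-squares form over the principal axes, the readability limits are passed as positive magnitudes, and, when fewer axes are retained than the space has dimensions, the random vector is projected onto the retained axes (a choice made here so that the sample stays inside the measured region). All names and numeric values are hypothetical.

```python
import numpy as np

def psychological_distance(v, phi, u_pos, u_neg):
    """Distance of the offset v = x - mu; phi holds unit eigenvectors as rows."""
    h = phi @ v                              # coordinates on the principal axes
    u = np.where(h >= 0, u_pos, u_neg)       # side-dependent readability limit
    return float(np.sqrt(np.sum((h / u) ** 2)))

def sample_by_quality(mu, phi, u_pos, u_neg, n_bins=10, per_bin=5, rng=None):
    """Return one array of feature vectors per psychological-distance bin."""
    rng = np.random.default_rng() if rng is None else rng
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    groups = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        samples = []
        while len(samples) < per_bin:
            v = rng.uniform(-1.0, 1.0, size=mu.shape)   # random direction for x - mu
            v = phi.T @ (phi @ v)                       # assumption: keep retained axes only
            D = psychological_distance(v, phi, u_pos, u_neg)
            if D == 0.0:
                continue                                # degenerate draw, try again
            k = rng.uniform(lo, hi)                     # target psychological distance
            samples.append(mu + k * v / D)              # k (x - mu) / D + mu
        groups.append(np.array(samples))
    return groups

# Hypothetical 4-dimensional character type.
mu = np.zeros(4)
phi = np.eye(4)
u_pos = np.array([2.0, 1.0, 1.0, 0.5])
u_neg = np.array([1.0, 1.0, 0.5, 0.5])
groups = sample_by_quality(mu, phi, u_pos, u_neg, rng=np.random.default_rng(0))
```

Each entry of `groups` holds feature vectors for one quality level; converting them back to line figures requires the standard pattern's point-connection rules, which are outside this sketch.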

A method in which a distance value $z_i$ ($i$: feature point number) at each feature point, representing the thickness of the character line at that location, is appended to the original feature vector of Equation (1), so that the original feature vector becomes
$$F = (x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_N, y_N, z_N) \tag{5}$$
is also included in the scope of this patent application. This method can also capture local variation in the thickness of the character line and is effective for handling brush-written characters and the like.

(2) Method using character contour lines
Even with character thickness information added, the centerline method of (1) is often inefficient for characters with elaborate ornamentation, such as printed characters.

As shown in FIG. 4, an existing font is chosen as the reference (in general several may be needed, but with a suitable choice one often appears to suffice), and feature points (bend points, points of curvature discontinuity, and so on) are set on the contour pixel sequence of the character pattern of each character type.

Characters of the various collected fonts are input. As in the centerline method of (1), the points corresponding to the feature points of the standard pattern are located in the contour pixel sequence; the character line between feature points is then divided equally into the same number of segments as in the standard pattern to obtain the points corresponding to the sub-feature points, and the original feature vector $F$ is formed with these coordinate values as its elements.
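The equal division of a character line between two feature points can be sketched as below. The sketch assumes that "equal division" means equal arc length along the traced polyline, which the description does not state explicitly; `subdivide_stroke` and the sample polyline are hypothetical.

```python
import numpy as np

def subdivide_stroke(polyline, n_segments):
    """Return the n_segments - 1 interior points that split a polyline into equal arc lengths."""
    pts = np.asarray(polyline, dtype=float)
    seg_len = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # length of each piece
    cum = np.concatenate(([0.0], np.cumsum(seg_len)))        # arc length at each vertex
    targets = np.linspace(0.0, cum[-1], n_segments + 1)[1:-1]
    xs = np.interp(targets, cum, pts[:, 0])
    ys = np.interp(targets, cum, pts[:, 1])
    return np.column_stack((xs, ys))

# Hypothetical L-shaped stroke of total length 8, divided into 4 equal parts.
sub_points = subdivide_stroke([(0, 0), (4, 0), (4, 4)], n_segments=4)
```

The returned points play the role of sub-feature points; the two end points of the polyline are the feature points themselves and are not repeated.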

Next, principal component analysis is performed for each character type using the normalized feature vector $x$, points are placed on each principal component axis at equal intervals measured in units of the standard deviation, and each point is converted into character data.

For the characters created in this way, the allowable range of variation is fixed by finding the limit of the region in which the condition that no contour line intersects another contour line (see FIG. 5) is satisfied. Feature vectors are then generated at random inside the piecewise hyperellipsoid so determined. Each feature vector is converted into character contour data, and the region enclosed by the contour is filled to generate character image data carrying shape information.
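One way to test the non-intersection condition on generated contours is a pairwise segment-crossing check, sketched below. It treats each contour as a closed polygon through its feature and sub-feature points and reports only proper crossings (touching or collinear overlap is not detected), which is a simplification assumed here; the function names and sample shapes are hypothetical.

```python
def _orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def _properly_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)

def contours_cross(contours):
    """True if any contour edge crosses another edge of the same or a different contour."""
    edges = []
    for contour in contours:
        n = len(contour)
        edges.extend((contour[i], contour[(i + 1) % n]) for i in range(n))
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            if _properly_cross(*edges[i], *edges[j]):
                return True
    return False

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
hole = [(1, 1), (3, 1), (3, 3), (1, 3)]        # nested inner contour, no crossing
bowtie = [(0, 0), (2, 2), (2, 0), (0, 2)]      # self-intersecting contour
valid = not contours_cross([square, hole])
invalid = contours_cross([bowtie])
```

Stepping outward along a principal component axis until `contours_cross` first returns True would give one allowable limit on that axis.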

FIG. 6 shows examples of pseudo-font characters generated by selecting points at random within the allowable region of the character type "F" in feature space.
The method of determining the region limit described above differs from that of the centerline method of (1), but depending on the purpose it may be more reasonable to use a visual method like that of (1).
FIG. 7 illustrates pseudo-font characters corresponding to points on one principal component axis within the interval that falls in the allowable region of the character type "F".

Effect of the Invention

Because character recognition uses a high-dimensional feature space to achieve practical accuracy, a character sample group of the size that can ordinarily be collected can hardly be said to satisfy the conditions that statistical theory requires when a statistical pattern recognition method is applied. This can be regarded as having been a barrier to improving the correct reading rate of recognition systems of this kind.

In the present invention, the character sample group that can ordinarily be collected is not used as is for training (designing) the recognition system. Instead, using the readable or allowable region of each character type (described in feature space by a piecewise hyperellipsoidal surface), obtained by statistical analysis of the samples and by psychological measurement or the like based on it, together with the psychological distance formula, a sufficient number of characters of varied quality are generated automatically over the whole readable or allowable region and added to the learning character group. The result contains far more of the information needed for character recognition (in particular, the human criteria for judging character type) than the initially collected sample group, and using it as the learning character sample group yields recognition results close to those of humans, or a high recognition rate.

Because the learning character sample group obtained here can also be expressed as binary images, it can be used to train any recognition method, and it is expected to contribute greatly to solving the learning problem, which may be the greatest difficulty in building a character recognition system for a new character set.
It can also help clarify the problems that arise when applying the statistical discriminant analysis methods widely used to date.

Even with the method of this application, better results can be expected the more varied the collected character samples are and the less their distribution is biased.
(1) Method using character centerlines
In the centerline method, the collected characters must be represented by centerlines, so on-line input with a pen tablet has many advantages.

In numeral recognition in particular, variation in stroke order is a minor problem, and using the order of writing (the order in which feature points appear) makes it easy to extract the points corresponding to the feature points defined in the standard pattern. Using the well-known DP matching method to detect bend points finds nearly natural positions; where greater precision is desired, the positions can be fine-tuned through a man-machine interface on a pen tablet system with a display.

When on-line input alone does not provide enough character samples, the input data can be increased by displaying character patterns from a database collected off-line (for example, the handwritten numeral database IPTP CD-ROM1 created by the Institute for Posts and Telecommunications Policy of the former Ministry of Posts and Telecommunications) and either moving the pen in imitation of them or, on a pen tablet with a display, tracing the centerline of the displayed character with the pen.

With a low-distortion thinning algorithm (patent pending), data can also be collected by thinning character images gathered off-line and checking and correcting them on a pen tablet with a display.

This is likely to be the most effective method when the number of character types is large, as in kanji recognition. In that case information such as stroke order cannot be used for feature point extraction, so, as in [Non-Patent Document 3], matching is performed between the line segments (two-dimensional vectors) connecting feature points, which are defined to express the relative positions of the feature points.
Fumitaka Kimura, Mitsu Yoshimura, Koji Miyake, Masato Ichikawa, "Free Handwritten Katakana Character Recognition by Stroke Structure Analysis," IEICE Transactions, Vol. 62-D, No. 1, pp. 16-23, Jan. 1979.

Once the feature vector $F$ of Equation (1) has been obtained by the above method, the normalized feature vector $x$ follows from Equation (2). It corresponds to one point in the n-dimensional feature space, and principal component analysis is performed on the set of points for each character type.
The rest follows the method described in [Non-Patent Document 1].
That is, principal component analysis is carried out by eigenvalue analysis of the variance-covariance matrix of each character type: the principal component axes are obtained as the eigenvectors, and the variance along each axis as the corresponding eigenvalue.
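The eigenvalue analysis described here can be sketched with NumPy as follows. The function name `principal_axes` and the toy data are assumptions for illustration; the covariance uses NumPy's default sample normalization (division by the number of samples minus one), which the description does not specify.

```python
import numpy as np

def principal_axes(X):
    """PCA for one character type.

    X has one normalized feature vector per row. Returns the mean, the
    eigenvalues in descending order, and the eigenvectors as rows in the same order.
    """
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)              # variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # symmetric matrix, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # renumber from largest to smallest
    eigvals = np.clip(eigvals[order], 0.0, None)
    phi = eigvecs[:, order].T
    return mu, eigvals, phi

# Toy sample set in a 2-dimensional feature space.
X = [[0, 0], [2, 0], [4, 0], [2, 1], [2, -1]]
mu, eigvals, phi = principal_axes(X)
sigma = np.sqrt(eigvals)                       # standard deviation along each axis
```

`phi[0]` is the first principal component axis and `sigma[0]` its standard deviation, matching the numbering by descending eigenvalue used in the text.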

The principal component axes (eigenvectors) and eigenvalues are numbered $i$ ($i = 1, 2, 3, \ldots, n$) in descending order of eigenvalue. For the i-th principal component axis, line figures are generated from the vector $x$ (in the normalized form of Equation (2)) given by Equation (5) below; for the numeral "9" with $i = 1$, this produces line figures such as those in FIG. 8.
$$x = \mu + m \sigma_i \varphi_i \tag{5}$$
Here $\mu$ is the mean of the character type, $\varphi_i$ is the i-th eigenvector, and $\sigma_i$ is the square root of the corresponding eigenvalue (the standard deviation).

m is increased in the positive and negative directions, and the limits at which a person can still recognize the figure as a character of that type are determined visually. If the positive and negative limit values are $m^+$ and $m^-$, the coordinates $U_i^+$ and $U_i^-$ on this principal component axis are given by Equation (6).
$$U_i^+ = m^+ \sigma_i, \qquad U_i^- = m^- \sigma_i \qquad (6)$$

The psychological distance d(x) of the vector x from the mean of the character type is given by Equation (7), and d(x) = 1 gives the surface of the readable region (a piecewise hyperellipsoid), that is, the readable limit surface.

Here, $H_i$ is the coordinate of x on the i-th principal component axis, $H_i = (x - \mu, \phi_i)$ (the inner product of $x - \mu$ and $\phi_i$), and $U_i$ is $U_i^+$ when $H_i > 0$ and $U_i^-$ when $H_i < 0$. K determines how many of the higher-order principal component axes are used; the emphasis should be on capturing meaningful variation, because taking axes of too high an order only picks up noise.
FIG. 9 shows the measured values of $U_i^+$ and $U_i^-$ for each principal component axis and examples of character shapes at the readable limit.
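A minimal sketch of the distance computation follows, assuming that Equation (7) has the hyperellipsoid form $d(x) = \sqrt{\sum_{i=1}^{K} (H_i / U_i)^2}$. That form is an assumption, chosen because it gives d = 1 at each axis limit and scales linearly with $x - \mu$, as the surrounding description requires. The function and argument names are illustrative.

```python
import numpy as np


def psychological_distance(x, mu, phi, u_plus, u_minus):
    """Assumed form of Equation (7): d(x) = sqrt(sum_i (H_i / U_i)^2), i = 1..K.

    phi:     (n, K) matrix whose columns are the first K principal axes.
    u_plus:  (K,) positive-side limits U_i^+ (positive values).
    u_minus: (K,) negative-side limits U_i^- (negative values).
    """
    h = phi.T @ (np.asarray(x, dtype=float) - mu)  # H_i = (x - mu, phi_i)
    u = np.where(h >= 0, u_plus, u_minus)          # limit on the side where H_i lies
    return float(np.sqrt(np.sum((h / u) ** 2)))


# Hypothetical two-axis example with asymmetric limits on the first axis.
mu = np.zeros(2)
phi = np.eye(2)
u_plus = np.array([2.0, 1.0])
u_minus = np.array([-4.0, -1.0])
d_pos = psychological_distance([2.0, 0.0], mu, phi, u_plus, u_minus)
d_neg = psychological_distance([-2.0, 0.0], mu, phi, u_plus, u_minus)
```

Because the positive and negative limits differ, the same displacement of 2 along the first axis lies on the readable limit surface on the positive side but only halfway to it on the negative side, which is what makes the region piecewise rather than a single hyperellipsoid.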

Next, a vector with a random direction is made by setting the n components of the n-dimensional vector x − μ to uniform random numbers. If the value obtained by substituting it into Equation (7) is D, then (x − μ)/D has psychological distance 1, so a character pattern with psychological distance k is given by (x − μ)k/D.
Since the start of this vector coincides with the end of μ, the vector starting from the origin O of the feature space is μ + (x − μ)k/D. This vector is restored to a character pattern expressed by curves, and a stroke thickness is then applied to produce the final character pattern.
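The scaling step above can be sketched as follows. The distance function is passed in as a parameter so that any implementation of Equation (7) can be used; the symmetric hyperellipsoid in the demonstration is only a stand-in, and all names and numeric values are hypothetical.

```python
import numpy as np


def generate_at_distance(mu, distance, k_range, rng):
    """Generate one feature vector whose psychological distance lies in k_range.

    distance: callable implementing Equation (7) for this character type.
    k_range:  (low, high) interval of psychological distance, e.g. (0.9, 1.0).
    """
    v = rng.uniform(-1.0, 1.0, size=mu.shape)  # random x - mu (uniform components)
    d = distance(mu + v)                       # D in the text
    k = rng.uniform(*k_range)                  # target distance inside the interval
    return mu + v * k / d                      # mu + (x - mu) k / D


# Stand-in distance for the demonstration only: a symmetric hyperellipsoid
# with hypothetical semi-axes along the coordinate axes.
mu = np.array([1.0, -2.0, 0.5])
semi_axes = np.array([3.0, 1.0, 0.5])


def distance(x):
    return float(np.sqrt(np.sum(((x - mu) / semi_axes) ** 2)))


rng = np.random.default_rng(1)
samples = [generate_at_distance(mu, distance, (0.9, 1.0), rng) for _ in range(100)]
```

Calling the function with different intervals, for example (0.5, 0.6) and (0.9, 1.0), yields groups of feature vectors graded by deformation strength.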

If, as described earlier, the range of k is divided into several intervals and k is drawn as a uniform random number within each interval, character groups graded by deformation strength or by quality are obtained. FIG. 10 shows examples of characters generated in this way.
It is desirable to use multiple prototypes (standard patterns) for each character type as necessary; FIG. 11 shows the prototypes for each character type (each prototype is shown by one example shape).

A recognition experiment was carried out by applying the weighted direction index histogram method developed by the applicants ([Non-Patent Document 4]) to the handwritten New Year's postcard postal code database IPTP CD-ROM1, created by the Institute for Posts and Telecommunications Policy (IPTP) of the former Ministry of Posts and Telecommunications. As shown in Table 1, the conventional method (the system that won the top prize in the character recognition contest sponsored by the institute) achieved 99.17% on unlearned characters (the rate is not identical to that in [Non-Patent Document 4] because the experimental data and other conditions differ slightly), whereas merely adding 2,560 characters (256 per character type) generated by the method of the present invention raised the rate to 99.42%.

However, few actually written characters have a large rotation, and the character group generated by this method alone therefore contains relatively few rotated characters, so characters obtained by applying a slight rotation (a linear transformation) to the generated line-segment characters are also included.
[Non-Patent Document 4]
Tetsushi Wakabayashi, Shinji Tsuruoka, Fumitaka Kimura, Koji Miyake, "Improving the Accuracy of Handwritten Numeral Recognition by Increasing the Dimensionality of Features," IEICE Transactions (D-II), Vol. J77-D-II, No. 10, pp. 2046-2053, Oct. 1994.
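The slight rotation applied to the generated line-segment characters could be sketched as below. The source does not state the rotation angles or the center of rotation, so the 5-degree angle and the center used here are purely hypothetical.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points about center by angle_deg.

    The rotation is counterclockwise when the y axis points upward.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    cx, cy = center
    return [
        (cx + c * (x - cx) - s * (y - cy), cy + s * (x - cx) + c * (y - cy))
        for x, y in points
    ]


# Hypothetical stroke given as a short polyline, rotated by 5 degrees
# about an assumed character center.
stroke = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
rotated = rotate_points(stroke, 5.0, center=(0.5, 1.0))
```

Applying the same rotation to every feature point and sub-feature point of a generated character keeps its shape unchanged while varying its orientation.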

In addition, adding 8,000 actual handwritten postal code characters (800 per character type on average) to the learning character group under the conventional method raises the rate only to 99.18%, an improvement of 0.01 percentage points. This suggests that it is difficult to obtain a high recognition rate simply by extending the conventional approach.
Table 2 shows the misreading rate, the rejection rate, and the recognition performance measure S used in the above contest; it shows that misreadings can be greatly reduced without lowering the correct reading rate.
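As a rough check on the correct reading rates quoted above, the share of characters not read correctly (100% minus the correct reading rate) can be compared for the three cases. This is simple arithmetic on the stated rates, not a result reported in the source:

```latex
\begin{align*}
\text{conventional method:} \quad & 100 - 99.17 = 0.83\% \\
\text{8{,}000 collected characters added:} \quad & 100 - 99.18 = 0.82\%,
  \qquad \frac{0.83 - 0.82}{0.83} \approx 1.2\% \text{ relative reduction} \\
\text{2{,}560 generated characters added:} \quad & 100 - 99.42 = 0.58\%,
  \qquad \frac{0.83 - 0.58}{0.83} \approx 30\% \text{ relative reduction}
\end{align*}
```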

FIG. 12 shows characters that were newly read correctly as a result of applying this method. These are characters that are heavily deformed but readable by humans. FIG. 13 shows representative examples of characters that are misread even with this method; many of these are difficult even for humans to read, so this result should be easy for users of the recognition system to accept.

FIG. 14 shows other examples of misreading, whose main causes are noise and severe touching between character strokes (for example, filled-in loops). If these problems are solved, the correct reading rate reaches 99.66%. Because noise occupies a smaller area than the character itself, it has been confirmed that misreading due to noise is resolved by a method that removes the noise non-deterministically; in this case the recognition rate reaches 99.47%.

The proportions in which the deformed characters created by this method should be added to the original learning character data, by quality or by stroke thickness, are determined by experiment. Experiments so far have confirmed that adding the character group with psychological distance 0.9 to 1.0 (which occurs very rarely in nature) is the most effective, and that adding thick characters raises the recognition rate when reading characters with many touching strokes. A small number of recognition experiments have confirmed that the correct reading rate exceeds 99.50%.

FIG. 15 divides handwritten numerals into characters that should be read, characters that are difficult to recognize, and characters that cannot be read (correctly), and relates them to the recognition rates obtained in the experiments. At the 99.17% correct reading rate of the conventional method, many of the characters that were not read correctly appeared to be ones a person could read, whereas figures such as 99.42%, 99.47%, and 99.66% show that the system comes quite close to the human recognition rate.

(2) Method using character contour lines
The contour line information of the collected printed character patterns is statistically analyzed to extract the factors that govern the differences between typefaces and fonts. By controlling those factors, a large number of pseudo typefaces and fonts are generated and used as the learning character group of the recognition system. By applying the writer-adaptive OCR technique published in [Non-Patent Document 5], it is expected that typefaces and fonts can be recognized and that the character recognition rate for documents printed in a single font can be improved.
[Non-Patent Document 5]
Shinji Tsuruoka, Hiroyuki Morita, Fumitaka Kimura, Koji Miyake, "Free Handwritten Character Recognition with an Adaptive Function for Writers," IEICE Transactions D, Vol. J70-D, No. 10, pp. 1953-1960, Oct. 1987.

Pseudo fonts are generated by the following procedure.

(i) An existing font is chosen as the reference and, by the method already described, feature points (corner points, curvature discontinuity points, etc.) and sub-feature points are set on the contour pixel sequence of the character pattern of each character type (FIG. 4).

(ii) The characters of the collected existing fonts are input, and the points corresponding to each feature point are found on their contour lines. An original feature vector F whose elements are the coordinate values of the feature points and sub-feature points is then created.

(iii) The original feature vector F is normalized to give the feature vector x.

(iv) Principal component analysis is performed for each character type using the feature vectors x, points are placed on each principal component axis at equal intervals in units of the standard deviation, and those points are converted into character data.

(v) For the characters created in this way, the allowable range of variation is fixed by finding the limit on each principal component axis within which the condition that contour lines do not cross one another (FIG. 5) is satisfied.

As in the case of character centerlines, feature vectors are generated at random within the allowable region for each distance value according to the distance formula of Equation (7). Each is converted into character contour data, and the region enclosed by the contour lines is filled to generate character image data that carries shape information.

(vi) The character image data obtained in (v) is added to the initially collected character data to form the learning character group of the recognition system, thereby realizing multi-font or omnifont OCR.
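Steps (iv) and (v) can be sketched as a search along one principal component axis for the largest multiple of the standard deviation at which the contour still does not cross itself. This is a simplified illustration under stated assumptions: it handles a single closed contour given as a polygon, detects only proper crossings, and the callback `make_contour`, the step size, and the toy quadrilateral are all hypothetical.

```python
def _orient(p, q, r):
    """Signed area test: > 0 if r is to the left of the directed line p -> q."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _segments_cross(a, b, c, d):
    """True if segments ab and cd cross at a single interior point."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def contour_self_intersects(contour):
    """contour: list of (x, y) vertices of one closed polygon."""
    n = len(contour)
    edges = [(contour[i], contour[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share a vertex
            if _segments_cross(*edges[i], *edges[j]):
                return True
    return False


def axis_limit(make_contour, step=0.25, max_m=10.0, sign=+1):
    """Largest |m| (in units of sigma_i) before the contour starts to cross itself.

    make_contour(m) must return the contour for x = mu + m * sigma_i * phi_i.
    """
    m = 0.0
    while m + step <= max_m and not contour_self_intersects(make_contour(sign * (m + step))):
        m += step
    return sign * m


def toy_contour(m):
    """Hypothetical contour: the top two corners move toward each other as m grows."""
    return [(0.0, 0.0), (2.0, 0.0), (2.0 - 1.2 * m, 1.0), (1.2 * m, 1.0)]


upper = axis_limit(toy_contour, sign=+1)
lower = axis_limit(toy_contour, max_m=3.0, sign=-1)
```

With the toy contour the positive-side search stops at 0.75, because the contour becomes a bow tie at the next step, while the negative side never crosses and runs to the search bound. A real implementation would also need to test crossings between different contours of the same character, such as the outer and inner outlines.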

If the method using character centerlines is applied, through handwritten postal code recognition, to the reading and sorting of mail, even greater reductions in labor costs could be achieved. In addition, considering that handwritten numerals in general documents are difficult to correct from context, the method should also contribute to the spread of handwritten OCR in office work.

In the field of handwritten OCR, an era has arrived in which recognition rates exceeding 99.5% are obtained for unknown characters in the handwritten hiragana and kanji database ETL9(B). The present invention opens the way to OCR that can recognize, with practical accuracy, the lower-quality characters written in everyday work, and should contribute to further broadening the range of OCR applications.

The method using character contour lines not only opens the way to omnifont OCR for printed characters but should also play a major role in fields such as mathematical expression recognition and understanding, where accurate typeface recognition is required even though a practically unlimited number of fonts exist.

FIG. 1 explains, in the feature space, the lack of diversity and the bias in character shapes found in a learning character group that can be collected by ordinary methods.
FIG. 2 explains the method of finding, on an input pattern (a collected original learning character pattern), the points corresponding to the feature points set on the standard pattern.
FIG. 3 explains the method of setting sub-feature points that divide the character line between feature points into equal parts and creating the feature vector F whose elements are the coordinates of the feature points and sub-feature points.
FIG. 4 shows examples of feature points defined on the contour in the method using character contour lines.
FIG. 5 illustrates the phenomenon in which contour lines cross one another in a character contour corresponding to a point sufficiently far from the origin on a principal component axis.
FIG. 6 shows examples of pseudo-font characters generated by selecting points at random within the region occupied by the character type "F" in the feature space.
FIG. 7 shows examples of pseudo-font characters corresponding to points (within the allowable region) on one principal component axis of the character type "F".
FIG. 8 shows an example in which characters are generated on the first principal component axis at the origin and at points moved to the positive and negative sides by integer multiples (movement coefficients) of the standard deviation.
FIG. 9 shows the range that humans can read correctly on each principal component axis and the character shapes at the readable limit points.
FIG. 10 shows examples of characters (numerals) generated at random in the psychological distance ranges 0.5 to 0.6 and 0.9 to 1.0.
FIG. 11 shows the number of sub-character types (subcategories) set for each character type and their representative shapes.
FIG. 12 shows characters that were newly read correctly by the method of the present invention.
FIG. 13 shows representative examples of characters that were not read correctly even with this method.
FIG. 14 shows the correct reading rate when there is no noise or stroke filling.
FIG. 15 explains the recognition performance achieved by the method of the present invention and the recognition rate attainable by a postal code recognition system.

Table 1

Comparison of the correct reading rate obtained by enlarging the character group collected by ordinary methods with that obtained by this method.

Table 2

Comparison of the recognition rate of the conventional method, which uses the database IPTP CD-ROM1 created by the Institute for Posts and Telecommunications Policy of the former Ministry of Posts and Telecommunications directly as the learning character group, with that of this method.

Claims

To address the practical problem that handwritten character samples with sufficient content for training a handwritten character recognition system cannot be collected, feature points and sub-feature points (each with a sequence number) are defined for each character type on the centerline of a character of typical shape, and a feature vector is formed by arranging their x and y coordinate values in order of sequence number. Points corresponding to the feature points and sub-feature points are extracted from the pixel sequence representing the centerline of each learning character and, based on the result of principal component analysis of the feature vectors obtained from their x and y coordinates, the human-readable region in the feature space (the range that a person can recognize as the character type) is determined visually as a piecewise hyperellipsoid. This method, and a character recognition method that uses the psychological distance defined on that region, are disclosed in [Non-Patent Document 1]. The present invention extends that method so that a distance value z can be added as an attribute of each feature point as necessary, divides the piecewise hyperellipsoid into several shells according to psychological distance, and generates character patterns in reverse from points generated at random within the shells. The invention thus relates to a method of automatically generating a sufficient number of character patterns for each deformation strength or quality grade and adding them to the initially collected learning character samples to enrich the content of the learning characters.
[Non-Patent Document 1]
Shinji Tsuruoka, Akihiko Murase, Fumitaka Kimura, Shigeki Yokoi, Koji Miyake, "Free Handwritten Katakana Character Recognition Using Human Character Type Identification Criteria," IEICE Transactions, Vol. J68-D, No. 4, pp. 781-788, 1985.

To enrich the content of the learning character group, mainly for multi-font or omnifont printed character recognition systems, important feature points (corner points, curvature discontinuity points, etc.) are defined and principal component analysis is performed in the same way as in [Non-Patent Document 1]. The allowable range of characters of the character type is determined on each principal component axis by mathematical constraints or by visual observation, and, using a distance value of the same form as the psychological distance for the piecewise hyperellipsoid whose axis lengths are given by these ranges, a variety of fonts is generated in the same way as in claim 1. The invention relates to a method of enriching the content of the learning samples in this way.