JPH0443316B2

Info

Publication number: JPH0443316B2
Application number: JP58219427A
Authority: JP
Inventors: Akihiro Asada
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1983-11-24
Filing date: 1983-11-24
Publication date: 1992-07-16
Also published as: JPS60112187A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of application of the invention]

The present invention relates to an online handwritten character recognition method that performs recognition sequentially, using information obtained while a character is being written.

[Background of the invention]

Conventional online handwritten character recognition methods fall broadly into the following four types.

The first method treats the motion of the pen point during writing as a set of one-dimensional waveforms, one per orthogonal coordinate component, approximates each waveform by an orthogonal function expansion, and recognizes the character from the expansion coefficients.

The second method approximates each stroke of a character as a chain of vectors quantized into eight directions, classifies each approximated stroke into one of several basic strokes, and recognizes the character from the combination of basic strokes.

The third method classifies each stroke of a character into one of several basic strokes, builds a feature table that describes each character using stroke end points, intersection points, and the like, and recognizes the input character by comparing it with this feature table.

The fourth method is the one proposed in Japanese Examined Patent Publication No. 57-6151, "Online Recognition Processing Method for Handwritten Characters." For each stroke of the input character it extracts the start-point coordinates, the end-point coordinates, and the coordinates of the midpoint lying halfway between the start and end points, and treats these points as feature points. It then determines the sum of the distances to the corresponding feature points of each standard character prepared in advance, and recognizes the input character as the standard character for which this sum is smallest.

Each of these conventional methods has the problems described below. In the first method, the orthogonal-function approximation is not necessarily accurate for characters composed mainly of straight lines, such as kanji and katakana, and the topological shape of the character cannot be captured, so the recognition rate suffers.

In the second method, errors made when classifying the strokes of the input character into basic strokes lower the recognition rate. In addition, every character to be recognized must be described in detail, and preparing these descriptions takes a great deal of effort.

In the third method, as in the second, basic-stroke classification errors lower the recognition rate. Moreover, building the feature table that describes every target character in detail takes a great deal of effort and a large amount of memory.

The fourth method was proposed to remedy the problems of the first three, but it has the following problems of its own.

(1) Each stroke is approximated by three points (start point, midpoint, and end point), but, as shown in Fig. 1, the approximation is not necessarily accurate for a stroke that has a bending point. A stroke like (a) in Fig. 1 becomes (b) under three-point approximation, because the midpoint of a stroke does not necessarily coincide with its bending point.

(2) If a stroke contains a hook (hane, the flick at the end of a stroke), the hook shifts not only the end-point coordinates, which vary with the way the stroke is written, but also the midpoint coordinates. This is because the midpoint is extracted as the point at the center of the stroke line.

Such variation of the midpoint and end-point feature points with the writer and the writing style lowers the recognition rate.

(3) Because every stroke is approximated by the same three points (start point, midpoint, and end point) regardless of its shape, no information about the stroke's shape is obtained. The candidate characters are therefore the whole group of standard characters selected by the stroke count of the input character alone; the number of candidates is large and matching takes a long time.

(4) Even a stroke that a single straight line approximates well, that is, a stroke adequately described by its start and end points alone, is approximated by three points as described above. The feature points are therefore redundant, which increases the memory needed to store the standard characters.
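Problems (1) and (2) can be illustrated with a small sketch. It assumes that the "midpoint" of the fourth method is the point halfway along the stroke's path length, which is one reading of the text; the function name and the sample strokes are hypothetical and not taken from the cited publication.

```python
import math

def arc_length_midpoint(points):
    """Return the point halfway along the polyline's total path length."""
    lengths = [math.dist(p, q) for p, q in zip(points, points[1:])]
    half = sum(lengths) / 2.0
    walked = 0.0
    for p, q, d in zip(points, points[1:], lengths):
        if d > 0 and walked + d >= half:
            t = (half - walked) / d  # fraction of this segment still to walk
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
        walked += d
    return points[-1]

# Hypothetical L-shaped stroke: 1 unit to the right, then 3 units down.
plain = [(0.0, 0.0), (1.0, 0.0), (1.0, -3.0)]
# The same stroke with a hypothetical hook of length 2 added at the end.
hooked = plain + [(-1.0, -3.0)]

print(arc_length_midpoint(plain))   # lies on the long leg, not at the corner (1, 0)
print(arc_length_midpoint(hooked))  # moved along the stroke by half the hook length
```

Under this assumption the midpoint misses the corner whenever the two legs differ in length, and a hook of length 2 moves the midpoint by 1, in line with the statement that the hook also disturbs the midpoint.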

[Purpose of the invention]

The object of the present invention is to eliminate the problems described above and to provide an online handwritten character recognition method suited to recognizing characters composed mainly of straight-line components, such as katakana and kanji.

[Summary of the invention]

To achieve this object, the present invention is characterized as follows. For each stroke of the input character, the start-point coordinates, the end-point coordinates and, if the stroke has a bending point, the bending-point coordinates are extracted and used as feature points. From the standard characters prepared in advance, those that have the same number of strokes as the input character and the same number of feature points in every stroke are selected as candidate characters. The sum of the distances between the feature points of the input character and those of each candidate standard character is determined, and the standard character with the smallest sum is recognized as the input character.

[Embodiments of the invention]

Fig. 2 shows a functional block diagram of one embodiment of the invention. In the figure, 1 is a character information input device (a so-called tablet), 2 a preprocessing unit, 3 a feature point extraction unit, 4 an inter-pattern distance calculation unit, 5 a minimum distance detection unit, 6 an output terminal, and 7 a standard pattern (standard character) memory unit.

The principle of the invention is as follows. First, in the preprocessing unit 2, the coordinates of every pen point of the character entered from the character information input device 1 are transformed so that the centroid of the input character becomes the origin (hereinafter, position normalization). The size is also normalized so that the mean distance between the pen points and the centroid equals a fixed value. The preprocessing unit 2 further detects the number of strokes of the input character. The character information input device 1 supplies, for each pen point, X-axis and Y-axis coordinate values and Z-axis information indicating whether the input pen is pressed against the input surface. The stroke count is obtained from the Z-axis information, for example by counting the transitions from pen-down (Z = 1) to pen-up (Z = 0) over one character. This stroke count information 201 serves as candidate character selection information for the input character.
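The preprocessing described above might be sketched as follows. This is only an illustration under stated assumptions: the function names and the target mean radius of 1.0 are hypothetical, and the Z samples are assumed to end with the pen lifted so that the last stroke is counted.

```python
import math

def count_strokes(z_samples):
    """Count pen-down (1) to pen-up (0) transitions over one character."""
    count = 0
    for prev, cur in zip(z_samples, z_samples[1:]):
        if prev == 1 and cur == 0:
            count += 1
    return count

def normalize(points, target_mean_radius=1.0):
    """Move the centroid to the origin and scale so that the mean distance
    from the centroid equals target_mean_radius (an assumed constant)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    shifted = [(x - cx, y - cy) for x, y in points]
    mean_r = sum(math.hypot(x, y) for x, y in shifted) / n
    if mean_r == 0:
        return shifted  # degenerate input: all points coincide
    k = target_mean_radius / mean_r
    return [(x * k, y * k) for x, y in shifted]

# Hypothetical samples: two pen-down runs, each ended by a pen-up.
print(count_strokes([0, 1, 1, 0, 0, 1, 1, 1, 0]))
print(normalize([(0, 0), (2, 0), (2, 2), (0, 2)]))
```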

After preprocessing, the feature point extraction unit 3 extracts, for each stroke of the input character, the start point, the end point and, if the stroke has bending points, one of them as feature points. The start point is the point where writing of the stroke begins, and the end point is the point where it ends. The bending point is defined as follows: let H be the maximum distance between the pen points of the stroke and the straight line joining the start and end points, and let L be the distance between the start and end points; if H/L is at or above a fixed value, the pen point that gives the maximum distance H is the bending point.

Thus two feature points (start and end) are extracted from a nearly straight stroke, and three (start, bending point, and end) from a stroke with a bend.

The feature point extraction unit outputs the feature point count information 301 of each stroke as a second piece of candidate character selection information for the input character.

In the inter-pattern distance calculation unit 4, the inter-pattern distance is computed between the input character and patterns that have been preprocessed and had their feature points extracted in the same way as the input character and stored in advance in the standard pattern memory unit. The standard patterns considered are selected using the stroke count information 201 and the per-stroke feature point count information 301: only those with the same stroke count as the input character, and whose feature point count for each stroke equals that of the stroke of the input character corresponding in writing order, are used.

Accordingly, if a recognition target category is denoted by θ and the inter-pattern distance by $d_\theta$, then $d_\theta$ is obtained as the sum of the distances between feature points that correspond in writing order. The distance between feature points may be any quantity that expresses how far apart feature point A of the input character and the corresponding feature point B of the standard pattern are; the Euclidean norm between A and B, the city-block distance, or the like can be used.
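As a rough sketch of this distance, assuming a character is held as a list of strokes and each stroke as a list of (x, y) feature points (a representation chosen here for illustration, not specified in the text):

```python
import math

def euclidean(a, b):
    """Euclidean norm between two feature points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def city_block(a, b):
    """City-block (L1) distance between two feature points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def pattern_distance(input_strokes, reference_strokes, metric=euclidean):
    """Sum the metric over feature points paired in writing order.
    Assumes both patterns have the same stroke and feature point counts."""
    total = 0.0
    for s_in, s_ref in zip(input_strokes, reference_strokes):
        for p, q in zip(s_in, s_ref):
            total += metric(p, q)
    return total

# Hypothetical one-stroke patterns with two feature points each.
a = [[(0, 0), (3, 4)]]
b = [[(0, 0), (0, 0)]]
print(pattern_distance(a, b), pattern_distance(a, b, metric=city_block))
```

Passing the metric as a parameter reflects the statement that any measure of nearness between corresponding feature points may be used.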

From the inter-pattern distances $d_\theta$ computed in this way, the minimum distance detection unit 5 detects the minimum value, recognizes the input character as the category θ that gives the minimum, and outputs the code corresponding to category θ to the output terminal 6.

Fig. 3 shows a specific embodiment of the invention. 11 is a tablet and 12 is an input pen; together they correspond to the character information input device 1 of Fig. 2. 13 is a tablet interface unit, 81 a microprocessor, 82 a random access memory (hereinafter RAM), 83 a read-only memory (hereinafter ROM), 61 an output interface unit, and 6 an output terminal.

The microprocessor 81 carries out the principle described above; the program it executes is stored in the ROM 83.

The pen-point information of a character written on the tablet 11 with the input pen 12 is read into the microprocessor 81 through the tablet interface unit 13. The pen-point information for one character is then stored in a predetermined area of the RAM 82.

Preprocessing is first applied to the pen-point information for this one character. Since preprocessing is not directly related to the present invention, its details are omitted here; they are given in Japanese Examined Patent Publication No. 57-6151, and an outline appears in the explanation of the principle above.

The pen-point information of the preprocessed input character is stored again in a predetermined area of the RAM 82.

After preprocessing, the feature points of each stroke are extracted. As explained under the principle above, these are two points (start and end) for a nearly straight stroke and three points (start, bending point, and end) for a stroke with a bend.

Fig. 4 shows the principle of bending-point extraction. If the pen points of a stroke are $P_1$ to $P_{10}$ in writing order, $P_1$ is the start point and $P_{10}$ is the end point.

The bending point is extracted as follows.

(1) Compute the distance $h'_i$ between the straight line SE joining the start point $S = P_1$ and the end point $E = P_{10}$ and each pen point $P_i$ (i = 2 to 9) in the stroke.

(2) Detect the maximum value $H'$ of the distances $h'_i$.

(3) Compute the distance $L'$ between the start point S and the end point E.

(4) If $H'/L'$ is greater than the threshold TH, take the pen point that gives $H'$ as the bending point.

Here, with the coordinates of each pen point $P_i$ written as $(x_i, y_i)$, the distance $h'_i$ between the straight line SE and $P_i$ is obtained as follows.

Let the straight line SE be

$$A x + B y + C = 0 \qquad (1)$$

where

$$A = y_1 - y_{10}, \quad B = -(x_1 - x_{10}), \quad C = (x_1 - x_{10})\,y_1 - (y_1 - y_{10})\,x_1 .$$

Then $h'_i$ is obtained as

$$h'_i = \frac{|A x_i + B y_i + C|}{\sqrt{A^2 + B^2}} \qquad (2)$$

The distance $L'$ between the start point S and the end point E is obtained as

$$L' = \sqrt{(x_1 - x_{10})^2 + (y_1 - y_{10})^2} = \sqrt{A^2 + B^2} \qquad (3)$$

Hence, if $H' = h'_5$, then

$$H'/L' = \frac{|A x_5 + B y_5 + C|}{A^2 + B^2} \qquad (4)$$

If $H'/L'$ is greater than the threshold TH, the pen point $P_5$ is extracted as the bending point and used as a feature point. Otherwise, the stroke is judged to have no bending point, and only the two points, start and end, are extracted as feature points.
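The extraction procedure and equations (1) to (3) can be sketched as below. The text does not give a value for TH, so the 0.2 used in the example is purely an assumption, as are the function name and the choice to return None for a stroke whose start and end points coincide (a case the text does not address).

```python
import math

def find_bending_point(stroke, th):
    """Return the pen point farthest from the line SE if H'/L' > th, else None."""
    (x1, y1), (xk, yk) = stroke[0], stroke[-1]
    a = y1 - yk                            # A of equation (1)
    b = -(x1 - xk)                         # B of equation (1)
    c = (x1 - xk) * y1 - (y1 - yk) * x1    # C of equation (1)
    l_prime = math.hypot(a, b)             # L', equation (3)
    if l_prime == 0 or len(stroke) < 3:
        return None                        # assumed handling of degenerate strokes
    # h'_i for the interior pen points, equation (2)
    dists = [abs(a * x + b * y + c) / l_prime for x, y in stroke[1:-1]]
    h_prime = max(dists)
    if h_prime / l_prime > th:
        return stroke[1 + dists.index(h_prime)]
    return None

# Hypothetical strokes: one with a right-angle corner, one nearly straight.
bent = [(0, 0), (1, 0), (2, 0), (2, -1), (2, -2)]
straight = [(0, 0), (1, 0.01), (2, 0)]
print(find_bending_point(bent, th=0.2))
print(find_bending_point(straight, th=0.2))
```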

Fig. 5 shows the specific processing flow.

To reduce the computation needed for $h'_i$ and $L'$, only the numerator of the right-hand side of equation (2) is computed in place of $h'_i$, as $h_i = |A x_i + B y_i + C|$ (5), and its maximum value $H$ is detected. Likewise, $L = A^2 + B^2$ (6) is computed in place of $L'$, and $H'/L'$ is obtained as $H/L$. As equation (4) makes clear, $H/L$ is equal to $H'/L'$.
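A short numerical check of this equivalence, as a sketch with hypothetical function names and an arbitrary sample stroke (it assumes at least three pen points and distinct start and end points):

```python
import math

def line_coefficients(stroke):
    """Coefficients A, B, C of equation (1) for the line through the end points."""
    (x1, y1), (xk, yk) = stroke[0], stroke[-1]
    a = y1 - yk
    b = -(x1 - xk)
    c = (x1 - xk) * y1 - (y1 - yk) * x1
    return a, b, c

def ratio_with_sqrt(stroke):
    """H'/L' computed from true distances, equations (2) and (3)."""
    a, b, c = line_coefficients(stroke)
    l_prime = math.sqrt(a * a + b * b)
    h_prime = max(abs(a * x + b * y + c) / l_prime for x, y in stroke[1:-1])
    return h_prime / l_prime

def ratio_without_sqrt(stroke):
    """H/L computed from equations (5) and (6), with no square root."""
    a, b, c = line_coefficients(stroke)
    h = max(abs(a * x + b * y + c) for x, y in stroke[1:-1])
    l = a * a + b * b
    return h / l

stroke = [(0, 0), (1, 0), (2, 0), (2, -1), (2, -2)]
print(ratio_with_sqrt(stroke), ratio_without_sqrt(stroke))
```

The two values agree up to floating-point rounding. A further variation, not stated in the text, would be to test H > TH × L and avoid the division as well.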

In step (1) of Fig. 5, the coefficients A, B, and C of equation (1), the equation of the straight line SE joining the start and end points, are computed.

Next, in step (2), the quantity $h_i$ of equation (5), which stands in for the distance between the line SE and pen point $P_i$, is computed. If K is the number of pen points in the stroke, this is done for i = 1 to K−1. In step (3), the maximum value H among the $h_i$ is found. In step (4), the quantity L of equation (6), which stands in for the distance between the start and end points, is computed.

Whether the ratio H/L of the H and L obtained in this way is greater than the threshold TH is tested in step (5). If the result is negative, the stroke is judged to have no bending point, and in step (6) the feature point count P(s) is set to 2. Here s is the stroke number, assigned in writing order. The feature points $P_{s,1}$ and $P_{s,2}$ of the s-th stroke are then set to the coordinates of the start point (pen point $P_1$) and the end point (pen point $P_K$) and stored in a predetermined area of the RAM 82. $P_{a,b}$ denotes the b-th feature point of the a-th stroke. If the result of step (5) is affirmative, the stroke is judged to contain a bending point, and in step (7) the feature point count P(s) is set to 3. Then, in step (9), the feature points $P_{s,1}$, $P_{s,2}$, and $P_{s,3}$ of the s-th stroke are set to the coordinates of the start point, the pen point $P_i$ that gives H, and the end point, and stored in a predetermined area of the RAM 82.

The above procedure is repeated for every stroke of the character.

The feature points of the input character extracted in this way are used by the inter-pattern distance calculation unit 4 to compute the inter-pattern distance to each standard pattern.

The standard patterns considered at this point are those whose stroke count equals the stroke count N of the input character and whose feature point count P(s) for each stroke (s = 1 to N) equals that of the input character.

If the recognition targets are the basic (unvoiced) katakana characters, they can be classified as in Fig. 6 by stroke count and by the pattern of feature point counts P(s). For example, "ア" has two strokes; its first stroke (in writing order) has a bending point and so has 3 feature points, and its second stroke has none and so has 2. In Fig. 6, the pattern of feature point counts P(s) is written in a form such as (2, 2); in this notation "ア" is (3, 2).

Similarly, "セ" has two strokes, and its feature point count pattern P(s) is (3, 3).

Therefore, if the standard patterns are stored in the standard pattern memory unit 7 already classified as in Fig. 6, the candidate characters can be narrowed down by the stroke count N of the input character and the feature point count pattern P(s) of its strokes.

Fig. 7 shows the processing flow for inter-pattern distance calculation and minimum distance detection. In this figure, the standard patterns are assumed to be stored classified by stroke count, each with the feature point count pattern P(s) of its strokes attached alongside the coordinates of its feature points.

In Fig. 7, (1) corresponds to the selection of candidate characters for the input character, (2) to the inter-pattern distance calculation, and (3) to the minimum distance detection.

First, the minimum inter-pattern distance $d_{\min}$ is initialized to a value β; β need only be larger than the inter-pattern distances normally obtained. Then, in (1), among the standard patterns whose stroke count equals the stroke count N of the input character, the feature point count P(s) of each stroke of the input character is compared with the feature point count $P^{\theta_m}(s)$ (s = 1 to N) of each stroke of the standard pattern, and a standard pattern for which $P(s) = P^{\theta_m}(s)$ holds for all s = 1 to N is selected. This is done by repeating the test (11) once per stroke. If the result of test (11) is negative, processing moves on to the next standard pattern. Next, the inter-pattern distance to the selected standard pattern is computed in (2). Let $d_{\theta_m}$ be the inter-pattern distance for the standard pattern of category $\theta_m$. In (21), $d_{\theta_m}$ is initialized to 0, and in (22) the distance $d(P_{si}, P^{\theta_m}_{si})$ between corresponding feature points is computed and accumulated into $d_{\theta_m}$.

Here $P_{si}$ denotes the coordinates $(x_{si}, y_{si})$ of the i-th feature point of the s-th stroke, where s = 1 to N, i = 1 to P(s), and P(s) = 2 or 3.

Thus, in (2), the inter-pattern distance $d_{\theta_m}$ is computed as

$$d_{\theta_m} = \sum_{s=1}^{N} \sum_{i=1}^{P(s)} d\left(P_{si}, P^{\theta_m}_{si}\right) \qquad (7)$$

$$d_{\theta_m} = \sum_{s=1}^{N} \sum_{i=1}^{P(s)} \sqrt{\left(x_{si} - x^{\theta_m}_{si}\right)^2 + \left(y_{si} - y^{\theta_m}_{si}\right)^2} \qquad (8)$$

Equation (8) expresses the distance between feature points as the Euclidean norm, but it goes without saying that the square of the Euclidean norm (equation (9)) or the city-block distance (equation (10)) may be used instead, as shown below.

$$d_{\theta_m} = \sum_{s=1}^{N} \sum_{i=1}^{P(s)} \left\{ \left(x_{si} - x^{\theta_m}_{si}\right)^2 + \left(y_{si} - y^{\theta_m}_{si}\right)^2 \right\} \qquad (9)$$

$$d_{\theta_m} = \sum_{s=1}^{N} \sum_{i=1}^{P(s)} \left\{ \left|x_{si} - x^{\theta_m}_{si}\right| + \left|y_{si} - y^{\theta_m}_{si}\right| \right\} \qquad (10)$$

The inter-pattern distance $d_{\theta_m}$ obtained in this way is tested in (31) to see whether it is smaller than the current minimum inter-pattern distance $d_{\min}$. If the result is negative, the processing of (1) and (2) described above is carried out for the next standard pattern.

If the result is affirmative, then in (32) $d_{\min}$ is replaced by $d_{\theta_m}$ and the identification number m of the standard pattern is stored in a predetermined area of the RAM 82; this stored value is called PC. PC may be anything that identifies the standard pattern. In Fig. 7 it is a serial number among the standard patterns with the same stroke count, but it goes without saying that it may instead be, for example, a JIS code indicating the category of the standard pattern.

The processing of (1), (2), and (3) above is repeated in turn. If M is the number of standard patterns whose stroke count equals the stroke count N of the input character, (1), (2), and (3) are repeated M times. As a result, the standard pattern corresponding to the minimum distance $d_{\min}$ is recognized as the input character, and the corresponding code is output.
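The loop of Fig. 7 can be sketched as follows, using the Euclidean form of equation (8). The function name, the list-of-strokes representation, and the use of infinity for the initial value β are assumptions made for illustration; the text only requires β to exceed the distances normally obtained.

```python
import math

def recognize(input_strokes, standard_patterns, beta=float("inf")):
    """Return (label, distance) of the nearest candidate, or (None, beta)."""
    n = len(input_strokes)
    counts = [len(s) for s in input_strokes]   # P(s) of the input character
    d_min, best = beta, None
    for label, ref in standard_patterns:
        # (1) candidate selection: same stroke count, same P(s) for every stroke
        if len(ref) != n:
            continue
        if any(len(r) != c for r, c in zip(ref, counts)):
            continue
        # (2) inter-pattern distance, equation (8)
        d = 0.0
        for s_in, s_ref in zip(input_strokes, ref):
            for p, q in zip(s_in, s_ref):
                d += math.dist(p, q)
        # (3) minimum distance detection
        if d < d_min:
            d_min, best = d, label
    return best, d_min

# Hypothetical standard patterns and input.
standard_patterns = [
    ("A", [[(0, 0), (1, 0), (1, -1)], [(0, 1), (1, 1)]]),
    ("B", [[(0, 0), (1, 0), (1, -1)], [(0, 1), (0, 2)]]),
    ("C", [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]),  # skipped: P(1) differs
]
input_strokes = [[(0, 0), (1, 0.1), (1, -1)], [(0, 1), (1, 1.1)]]
print(recognize(input_strokes, standard_patterns))
```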

[Effect of the invention]

As described above, the present invention extracts as feature points the start point and end point of each stroke of the input character and, where one exists, the bending point. A nearly straight stroke is therefore approximated by two points (start and end), and a stroke with a bend by three points (start, bending point, and end), so the input character is approximated more faithfully with fewer feature points. This reduces the memory needed for the standard patterns. For example, for the 46 katakana characters shown in Fig. 6, the invention needs 240 feature points in total, whereas the conventional uniform three-point approximation of every stroke needs 321. The invention thus reduces the standard-pattern memory by about 25% compared with the conventional method. Furthermore, in a stroke with a hook, the coordinates of the end point itself vary with the hook, but the bending point is unaffected by it, so the feature points are more stable. (In the conventional three-point approximation, half the length of the hook appears as variation in the midpoint coordinates.)

In addition, the number of feature points extracted from each stroke indicates whether or not the stroke is nearly straight. Using this information, the number of candidate characters for an input character can be reduced to half or less of the conventional number, and the total computation for the inter-pattern distances can likewise be reduced to half or less. This speeds up recognition.
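The memory figures quoted above can be checked with a little arithmetic. The totals 240 and 321 come from the text; the stroke counts derived from them below are an inference that assumes every stroke contributes exactly 3 points in the conventional scheme and 2 or 3 in the invention.

```python
total_new = 240       # feature points with the 2-or-3-point scheme (from the text)
total_uniform = 321   # feature points with uniform 3-point approximation (from the text)

strokes = total_uniform // 3            # strokes over the 46 katakana
bent = total_new - 2 * strokes          # strokes that need a third point
straight = strokes - bent               # strokes described by start and end only
reduction = (total_uniform - total_new) / total_uniform

print(strokes, bent, straight)
print(f"reduction = {reduction:.1%}")
```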

[Brief explanation of drawings]

Fig. 1 illustrates the conventional feature point extraction method; Fig. 2 shows a functional block diagram of one embodiment of the invention; Fig. 3 shows a specific embodiment of the invention; Fig. 4 illustrates the feature point extraction method of the invention; Figs. 5 and 7 show processing flows of the invention; and Fig. 6 illustrates the classification of candidate characters in the invention.

1: character information input device, 2: preprocessing unit, 3: feature point extraction unit, 4: inter-pattern distance calculation unit, 5: minimum distance detection unit, 7: standard pattern memory unit, 81: microprocessor, 82: RAM, 83: ROM.

Claims

[Claims]

1. An online handwritten character recognition method in which a character is input while being written with a character writing device and the input character is recognized while following its strokes, characterized in that: from the start-point coordinates, the end-point coordinates, and the pen-point coordinate sequence of each stroke of the input character, the maximum value H of the distance between each pen point and the straight line SE joining the start point and the end point is extracted; when the ratio (H/L) of the maximum value H to the length L of the straight line SE is larger than a preset threshold, the pen point that gives the maximum value H is extracted as a bending point and its coordinates are used as a feature point of the stroke; a group of standard characters having the same number of feature points as the input character is selected as candidate characters; the sum of the distances between the feature points of the input character and the feature points of each candidate standard character is determined; and the standard character for which this sum is smallest is recognized as the input character.