JPS58186882A

JPS58186882A - Input device of handwritten character

Info

Publication number: JPS58186882A
Application number: JP57070764A
Authority: JP
Inventors: Kunio Sakai; 坂井　邦夫; Yoshiaki Kurosawa; 由明黒沢
Original assignee: Toshiba Corp; Tokyo Shibaura Electric Co Ltd
Current assignee: Toshiba Corp
Priority date: 1982-04-27
Filing date: 1982-04-27
Publication date: 1983-10-31

Abstract

PURPOSE: To perform accurate approximation with less information by approximating each stroke of an input character pattern with a circular arc and extracting features. CONSTITUTION: When a handwritten character pattern is input on a coordinate input device 11, the sequence of position coordinates of its strokes is stored successively in a buffer. A preprocessing part 12 performs smoothing and similar processing and sends the result to a feature extraction part 13, which extracts features by circular arc approximation. The features of each stroke of the input character pattern are obtained by this feature extraction and stored, stroke by stroke, in a feature buffer 14. A collation arithmetic device 15 collates them with the features of the strokes of standard characters registered in a dictionary 16, and a recognition decision part 17 outputs, as the answer for the input character pattern, the category name that has the highest similarity and a sufficient difference from the next-highest similarity.

Description

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a highly accurate handwritten character input device that identifies and inputs handwritten characters from their stroke information.

[Technical background of the invention and its problems]

In Japan, where text mixing kanji and kana is the everyday means of writing, the spread and growing sophistication of computer use has made the development of devices for inputting the many kinds of Japanese characters, including kanji, an important issue. Recently, with the development and practical use of Japanese word processors, inputting and editing Japanese characters has become relatively easy.

Japanese society, however, has valued handwritten characters for many years, and this tendency is not expected to change.

For this reason, there is strong demand for OCR devices that read handwritten characters as they are, and for simple devices that can identify and input characters from the information obtained while they are being handwritten.

Various studies have long been conducted on devices that accept such handwritten character input online, and the range of input character types has expanded from alphabetic characters, numerals, and symbols to katakana and hiragana. A representative example is the stroke point approximation method described in Odaka, Buyou, and Masuda, "Online recognition of handwritten characters using point approximation of strokes," Journal of the Institute of Electronics and Communication Engineers 82/2, Vol. J63-D, A 2.

This point approximation method approximates each stroke of the input character with 3 to 6 representative points and matches the result against a character dictionary registered in advance, making it possible to recognize about 1,000 character types. Because the method approximates the strokes that make up a character with a straight line or straight line segments, it can effectively recognize and input kanji and katakana, which consist basically of straight lines. Even in this case, however, problems remain concerning the stroke order and stroke count of the input characters. Moreover, the method was poorly suited to recognizing hiragana, which appear most frequently in Japanese text and are often composed of curves. Even katakana and kanji could not be handled adequately when handwriting deformed their strokes into curves.

Highly accurate recognition of input characters therefore could not be expected.

[Purpose of the invention]

The present invention was made in view of these circumstances, and its object is to provide a highly practical handwritten character input device that can easily and accurately identify and input handwritten characters containing many curves.

[Summary of the invention]

In the present invention, the stroke information obtained during the input of a handwritten character through a coordinate input device is approximated by a circular arc for each stroke, and the handwritten character is input by identifying the resulting series of arc-approximated information.

[Effect of the invention]

According to the present invention, therefore, the curved strokes common in handwritten characters can be effectively approximated by circular arcs so that their features are well represented, and the handwritten characters can be input accurately through highly accurate identification with a small amount of information. Moreover, because each stroke is approximated by a circular arc, the information in straight-line components of kanji that have become curved through handwriting can also be found accurately. This improves identification ability and makes it possible to input many character types effectively, among other considerable practical benefits.

[Embodiments of the invention]

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 shows the configuration of a typical conventional online handwritten character recognition device. A handwritten input character pattern is supplied from a two-dimensional coordinate input device (not shown), such as a tablet, to a preprocessing unit 1 as a time-series signal Pi(x, y). The preprocessing unit 1 smooths the signal, for example by taking a moving average over three neighboring points of the time-series signal Pi(x, y). From the smoothed signal, a feature extraction unit 2 extracts the features that best approximate the original stroke information of the input character pattern, that is, the sequence of the time-series signal Pi(x, y).
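The smoothing step can be sketched as a three-point moving average over the sampled coordinates. This is a minimal illustration only: the function name `smooth_stroke` and the choice to leave the two end points unchanged are assumptions, since the text does not say how the ends of the series are treated.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth_stroke(points: List[Point]) -> List[Point]:
    """Smooth a stroke with a 3-point moving average (previous, current, next)."""
    if len(points) < 3:
        return list(points)
    smoothed = [points[0]]  # assumption: the first point is kept as sampled
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        smoothed.append(((prev[0] + cur[0] + nxt[0]) / 3.0,
                         (prev[1] + cur[1] + nxt[1]) / 3.0))
    smoothed.append(points[-1])  # assumption: the last point is kept as sampled
    return smoothed
```

Keeping the end points fixed preserves the start and end coordinates of the stroke, which the later feature extraction relies on.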

According to the method of the above-mentioned literature, for example, these features are obtained by approximating the strokes of the input character pattern with straight lines. When a stroke is approximated by a single straight line, the stroke feature is given by the position coordinates of its start point and end point. When it is approximated by two straight lines, the stroke feature is given by the position coordinates of the start point, midpoint, and end point of the resulting two-segment broken line.

The stroke feature series of the input character pattern extracted in this way is stored sequentially in a stroke feature series buffer 3 and, once all the information constituting the input character pattern has been obtained, is transferred to a matching unit 4. The matching unit 4 sequentially matches the stroke features of the input character pattern against the standard stroke features of each character registered in advance in a stroke feature series dictionary 5, for example by calculating their similarity. From the matching results, the category of the standard pattern with the highest similarity is taken as the recognition result for the input character pattern.

In this basic recognition process, the stroke feature extraction and the dictionary matching described above are the most important techniques, and they determine the recognition accuracy.

The present invention is characterized in that, in this kind of character recognition input processing, circular arc approximation is used in place of the conventional straight-line approximation to extract stroke features. The details of this circular arc approximation are described later; an overview is given first.

FIGS. 2(a) to 2(c) and FIGS. 3(a) to 3(c) illustrate the difference between straight-line approximation and circular arc approximation of an input character pattern, for the handwritten kanji "事" and the handwritten hiragana "あ", respectively. In each figure, (a) shows the input character pattern, (b) the straight-line approximation pattern, and (c) the circular arc approximation pattern.

As these patterns show, kanji have more strokes per character than hiragana, but their individual strokes consist almost entirely of straight lines. Hiragana, by contrast, have fewer strokes, but most of those strokes are curves.

For such curved strokes, straight-line approximation therefore requires each stroke to be approximated by a combination of several partial straight strokes. Circular arc approximation, by comparison, can approximate each stroke accurately with fewer divisions than straight-line approximation.

High stroke approximation accuracy in feature extraction means a high likelihood that highly accurate recognition can be achieved. In addition, as noted above, strokes can be approximated effectively with fewer divisions. In other words, less information is needed for the approximation, which means that high-speed recognition can be performed with simpler hardware.

Thus, even at the conceptual level, feature extraction by circular arc approximation in the device of the present invention can be said to be highly effective.

The circular arc approximation of a stroke of an input character pattern according to the present invention is performed as follows.

FIG. 4 shows an example, in which the pattern drawn as thick solid line A is the stroke to be approximated. Let the two ends of stroke A be R and Q, and let B be the straight line connecting them; the approximate midpoint P of stroke A is then defined as the midpoint of straight line B (RQ). Stroke A is approximated by a circular arc C (shown as a broken line in the figure) whose chord is straight line B. The length of arc C is defined as S, and the slope of straight line B as θ (or θ′). There are various ways to uniquely determine the arc C that approximates stroke A in this way; here, for example, the following four pieces of information are used.

(i) Coordinates of midpoint P (Px, Py)
(ii) Slope of straight line B (θ)
(iii) Length of arc C (S)
(iv) Difference between the length of arc C and the length of straight line B (D)

The linearity U of stroke A is defined as U = (length of straight line B) / (length of arc C). The slope θ of straight line B in item (ii) would ordinarily be defined, as shown in FIG. 4, as the slope θ′ of straight line B itself relative to a reference line. Here, however, it is defined as the angle θ obtained by adding π/2 to θ′ and measured toward the side of straight line B on which arc C lies, that is, as the direction of the normal to straight line B. This uniquely determines the orientation of arc C. In this way, in the circular arc approximation according to the present invention, stroke A is represented as an arc C described by the four pieces of information (Px, Py), θ, S, and D.

Incidentally, under the straight-line approximation of the above-mentioned literature, stroke A is approximated by a broken line described by the information (Qx, Qy, Tx, Ty, Rx, Ry). Compared with this, circular arc approximation reduces the amount of information by about 17%, since five values are stored instead of six.

These quantities are obtained as follows from the position coordinate series (x_i, y_i), i = 1, 2, ..., t, that represents stroke A while a character pattern is being input through a coordinate input device such as a tablet. From the coordinates (Qx, Qy) at which stroke A begins and the coordinates (Rx, Ry) at which it ends, the position of the midpoint P is obtained as Px = (Qx + Rx) / 2 and Py = (Qy + Ry) / 2. The slope θ of straight line B is then obtained from the same end-point coordinates. [Equation not reproduced in this text.] Alternatively, angle values calculated in advance for Rx, Ry, Qx, and Qy may be stored as a table in a ROM or the like, and θ may be found by looking up this table with (Ry - Qy) and (Rx - Qx) as the address.
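The table lookup mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the patent's circuit: the grid size `GRID`, the names `ANGLE_TABLE` and `lookup_chord_angle`, and the use of the arctangent of the coordinate differences for the chord direction θ′ are all hypothetical, because the slope expression itself is not given in this text.

```python
import math

GRID = 16          # hypothetical tablet resolution: integer coordinates 0..15
OFFSET = GRID - 1  # shifts differences in -15..15 to table indices 0..30

# Precomputed once, playing the role of the ROM table addressed by (dy, dx).
ANGLE_TABLE = [
    [math.atan2(dy, dx) for dx in range(-OFFSET, OFFSET + 1)]
    for dy in range(-OFFSET, OFFSET + 1)
]


def lookup_chord_angle(q, r):
    """Return the direction of chord QR (radians) by table lookup.

    q and r are (x, y) integer grid coordinates of the stroke's start and end.
    """
    dx = r[0] - q[0]
    dy = r[1] - q[1]
    return ANGLE_TABLE[dy + OFFSET][dx + OFFSET]
```

The trade-off is memory for arithmetic: the table here has 31 x 31 entries, and no trigonometric function is evaluated at recognition time.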

As for the length of arc C, if arc C approximates stroke A well, its length can be regarded as nearly equal to the length of stroke A, so S is obtained from the position coordinate series (x_i, y_i). [Equation not reproduced in this text.] The difference D between the lengths of arc C and straight line B is then obtained by subtracting the length of straight line B from S.
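The feature computation described above can be sketched in a few lines. The names `ArcFeatures`, `arc_features`, and `linearity` are hypothetical. Two points are assumptions rather than statements from the text: S is taken as the sum of distances between consecutive sample points, and the side of the chord on which the arc lies is chosen from the summed signed offsets of the interior points.

```python
import math
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class ArcFeatures(NamedTuple):
    px: float     # x of midpoint P of chord B
    py: float     # y of midpoint P of chord B
    theta: float  # normal direction of chord B, toward the side of the arc
    s: float      # arc length S, taken here as the stroke's path length
    d: float      # D = S - (length of chord B)


def arc_features(points: List[Point]) -> ArcFeatures:
    (qx, qy), (rx, ry) = points[0], points[-1]
    px, py = (qx + rx) / 2.0, (qy + ry) / 2.0
    chord = math.hypot(rx - qx, ry - qy)
    # Assumption: S is the polyline length of the sampled stroke.
    s = sum(math.hypot(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(points, points[1:]))
    # Assumption: a positive sum means the stroke bulges to the left of Q->R.
    side = sum((rx - qx) * (y - qy) - (ry - qy) * (x - qx)
               for x, y in points[1:-1])
    theta_chord = math.atan2(ry - qy, rx - qx)  # theta' in the text
    theta = theta_chord + (math.pi / 2 if side >= 0 else -math.pi / 2)
    theta = math.atan2(math.sin(theta), math.cos(theta))  # wrap to (-pi, pi]
    return ArcFeatures(px, py, theta, s, s - chord)


def linearity(f: ArcFeatures) -> float:
    """U = (length of chord B) / (length of arc C); 1.0 for a straight stroke."""
    return (f.s - f.d) / f.s if f.s > 0 else 1.0
```

For a stroke sampled as (0, 0), (1, 1), (2, 0), this gives P = (1, 0), θ = π/2, and U of about 0.71, so the stroke is recorded as clearly curved and bulging toward positive y.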

These calculations are executed by microprocessor firmware or by dedicated digital hardware, and the resulting feature quantities can be extracted relatively easily.

The linearity U of stroke A, calculated from the feature quantities obtained in this way, indicates whether stroke A is straight or curved. Using U, a single stroke can therefore be divided into several partial strokes, as described later, with the circular arc approximation applied to each partial stroke so that the stroke is represented by N consecutive arcs.
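One way such a division might be driven by U is sketched below. The text does not specify where a stroke is cut, so the rule used here (halve the stroke at its middle sample until each part is straight enough) and the names `split_by_linearity`, `min_u`, and `max_depth` are purely illustrative assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def linearity(points: List[Point]) -> float:
    """U = chord length / path length of the sampled stroke (1.0 if degenerate)."""
    path = sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
    if path == 0:
        return 1.0
    chord = math.hypot(points[-1][0] - points[0][0],
                       points[-1][1] - points[0][1])
    return chord / path


def split_by_linearity(points: List[Point], min_u: float = 0.8,
                       max_depth: int = 3) -> List[List[Point]]:
    """Recursively halve a stroke until every part has U >= min_u.

    min_u and max_depth are hypothetical tuning values, not from the text.
    """
    if max_depth == 0 or len(points) < 3 or linearity(points) >= min_u:
        return [points]
    mid = len(points) // 2
    # Adjacent parts share the cut point so the partial strokes stay connected.
    return (split_by_linearity(points[:mid + 1], min_u, max_depth - 1)
            + split_by_linearity(points[mid:], min_u, max_depth - 1))
```

Each returned part can then be passed to the single-arc feature extraction, giving the N consecutive arcs that the text describes.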

This further increases the accuracy of the circular arc approximation.

FIGS. 5(a) to 5(f) show, as radial diagrams with the angle θ′ and the linearity U as parameters, the features of each stroke of the input character patterns "合", "全", and "会", which have the same number of strokes and are approximated by circular arcs as described above. In these diagrams the radial direction is the axis of linearity U: when a stroke is a straight line (U = 1), its information is plotted on the outermost circle, and the more strongly curved the stroke is (U < 1), the further inside the circle it is plotted. Even from the features for angle θ′ and linearity U alone, it can be seen that the features of these three highly similar input character patterns with the same stroke count differ considerably. This means that such similar input character patterns can be clearly distinguished. In the circular arc approximation of strokes according to the present invention, the length S of stroke A and its position (Px, Py) are used as identification information in addition to these features, so each input character pattern can yield even greater differences than those that appear in the radial diagrams. Consequently, highly similar character patterns such as "目" and "且", "未" and "末", and "ト" and "ハ" can also be distinguished effectively.

FIG. 6 is a schematic block diagram of an embodiment of a device that recognizes and inputs handwritten characters by performing the stroke circular arc approximation described above. Reference numeral 11 denotes a position coordinate input device such as a tablet, 12 a preprocessing unit, 13 a feature extraction unit that extracts the features of the input character pattern by circular arc approximation of its strokes, 14 a feature buffer, and 15 a matching calculation device. Reference numeral 16 denotes a dictionary in which the features of each category used in the matching calculation are registered, 17 a recognition determination unit, and 18 a control unit responsible for controlling this series of operations.

When a character pattern is handwritten using the coordinate input device 11, the position coordinate series (x_i, y_i), i = 1, 2, ..., t, representing each stroke is stored sequentially in a buffer. The preprocessing unit 12 applies smoothing and similar processing to this series, and the feature extraction unit 13 performs the feature extraction by circular arc approximation described above. This yields the features (Px, Py, θ, S, D) for each stroke of the input character pattern, and these features are stored stroke by stroke in the feature buffer 14. This processing continues until the features of all strokes constituting the input character pattern have been extracted and stored in the buffer 14; when it completes, the number of strokes in the input character pattern is also detected and stored.

The matching calculation device 15 compares the features of each stroke of the input character pattern registered in the buffer 14 with the features of each stroke of the standard character patterns registered in advance in the dictionary 16. As shown in FIG. 7, for example, the dictionary 16 stores, for each category of standard pattern, the category name and stroke count as a header 21, followed by stroke information 22 for each of the strokes counted in the header 21. The stroke information 22 consists of the feature information (Px, Py, θ, S, D) obtained by the circular arc approximation described above and is registered for each stroke in a format such as that shown in FIG. 8. The stroke information 22 is preferably registered in the stroke order of the character pattern. In addition to the above information, the stroke number N, the stroke division number n, and coefficients k1, k2, k3, and so on indicating the importance of the stroke are also registered.
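The dictionary layout described above can be modeled roughly as follows. The class and field names are hypothetical and the default values are placeholders; the actual record format is the one shown in FIG. 8, which is not reproduced in this text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StrokeRecord:
    """One entry of stroke information 22 (field names are illustrative)."""
    number: int         # stroke number N, in writing order
    px: float           # midpoint P, x
    py: float           # midpoint P, y
    theta: float        # slope of chord B
    s: float            # arc length S
    d: float            # D = S - chord length
    divisions: int = 1  # division number n (1 = a single arc)
    k1: float = 1.0     # importance coefficients; placeholder defaults
    k2: float = 1.0
    k3: float = 1.0


@dataclass
class DictionaryEntry:
    """Header 21 (category name, stroke count) plus its stroke records."""
    category: str
    strokes: List[StrokeRecord] = field(default_factory=list)

    @property
    def stroke_count(self) -> int:
        return len(self.strokes)


def candidates_with_stroke_count(entries: List[DictionaryEntry],
                                 count: int) -> List[DictionaryEntry]:
    """Select the categories whose header stroke count equals the input's."""
    return [e for e in entries if e.stroke_count == count]
```

Deriving the stroke count from the stored records, rather than storing it separately, is a simplification made here so the header and the records cannot disagree.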

The matching calculation device 15 compares the stroke count of the input character pattern in turn with the stroke count of each standard character registered in the dictionary 16 and extracts the categories whose stroke counts match. For each stroke of a category with a matching stroke count, matching is performed against the corresponding stroke of the input character pattern, for example by calculating a similarity. This matching is carried out for all strokes that make up the character, and the result is stored.

The same matching is performed against the standard character patterns of the other categories that have the same stroke count as the input character pattern, and each matching result is stored together with its category name. The recognition determination unit 17 then compares the matching results obtained for these categories with one another and outputs, as the answer for the input character pattern, the category name that, for example, has the highest similarity and a sufficient difference from the next-highest similarity. Entering this category name completes the input of the handwritten character pattern.
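The decision rule of the recognition determination unit can be sketched as below. The function name `decide`, the threshold `min_margin`, and the behavior of returning `None` when the margin is too small are assumptions; the text only states that the winning category must have the highest similarity and a sufficient difference from the runner-up.

```python
from typing import Dict, Optional


def decide(scores: Dict[str, float], min_margin: float = 0.1) -> Optional[str]:
    """Pick the category with the highest similarity if it leads by min_margin.

    scores maps category name -> similarity (higher means more similar).
    Returns None when no category wins clearly (an assumed rejection policy).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (best, top), (_, second) = ranked[0], ranked[1]
    return best if top - second >= min_margin else None
```

Requiring a margin trades a few rejected inputs for fewer confident misreadings between near-identical categories.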

In the description above, every stroke of the character pattern was approximated by a single circular arc to obtain its features. As mentioned earlier, however, it is also useful to divide a strongly curved stroke into several partial strokes and to extract features by approximating each partial stroke with its own circular arc. In this case, the stroke division number n registered in the dictionary 16 is read out and, when n > 1, the corresponding stroke is divided into n parts according to the value of n and the circular arc approximation is performed anew for each part.

Matching is then performed on the features of each of these partial strokes.

The similarity calculation between strokes described above is performed, for example, as follows.

[Equation not reproduced in this text.]

Here, symbols marked with a hat (^) denote the stroke information of the standard character patterns registered in advance in the dictionary 16. The coefficients k1, k2, and k3 are given for each feature category and are set according to the importance of each feature.

In the above expression, the first term represents the difference in stroke position, the second term the difference in stroke slope, and the third term the difference in stroke shape. For example, in distinguishing "ン" from "す", the difference in slope of the first stroke is the decisive factor, so in this case the coefficient of the slope term, k2, is given a heavier weight. Weighting the features in this way makes it possible to reliably distinguish characters that have similar strokes.
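Because the expression itself is not available here, the following is only a plausible stand-in with the same three-term structure: a position term, a slope term, and a shape term, weighted by k1, k2, and k3. It is written as a distance (smaller means more similar), the shape term is assumed to combine the differences in S and D, and the names `angle_diff` and `stroke_distance` are hypothetical.

```python
import math


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in radians."""
    return abs(math.atan2(math.sin(a - b), math.cos(a - b)))


def stroke_distance(inp, ref, k1=1.0, k2=1.0, k3=1.0):
    """Weighted difference between two strokes given as (px, py, theta, s, d).

    inp is the input stroke, ref the dictionary (hatted) stroke.
    """
    position = math.hypot(inp[0] - ref[0], inp[1] - ref[1])   # first term
    slope = angle_diff(inp[2], ref[2])                        # second term
    shape = abs(inp[3] - ref[3]) + abs(inp[4] - ref[4])       # third term
    return k1 * position + k2 * slope + k3 * shape
```

Raising k2 for a given dictionary stroke makes a slope mismatch cost more, which is the effect the text describes for pairs of characters that differ mainly in the slope of one stroke.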

As explained above, according to the present invention, features are extracted by approximating each stroke of the input character pattern with a circular arc, so each stroke can be approximated accurately with a small amount of information. Because the features of hiragana, which contain many curved strokes and occur frequently, can be extracted effectively, characters of this kind, which have conventionally been considered difficult to recognize, can be recognized and input with high accuracy. Furthermore, by adaptively varying the number of stroke divisions and the degree of priority given to each stroke feature according to the complexity of the stroke shape and the importance of the stroke, similar characters can be distinguished with high accuracy. In addition, by adjusting the parameters for the number of stroke divisions and the importance through repeated learning and training, character recognition input performance can readily be improved step by step.

Furthermore, the features of a stroke can be obtained immediately, either during its input or at the moment writing ends, and can be processed at high speed with simple hardware. The practical advantages of this device in recognition accuracy and other respects are therefore considerable, and it provides various notable effects that cannot be expected of conventional devices.

The present invention is not limited to the embodiment described above. For example, it is also useful to normalize the size of the input characters against the size of the characters stored in the dictionary. In this case, the region occupied by the input character pattern is examined to find the vertical and horizontal dimensions of the character pattern, and the position (Px, Py) is used as a relative position within that region. For the arc length S and the length difference D in this normalization, S and D of each stroke may each be divided by the total length of all strokes that make up the character.
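The normalization just described can be sketched as follows. The function name `normalize_features` and the handling of degenerate cases (a zero-width or zero-height region, or zero total length) are assumptions added so the sketch does not divide by zero.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Features = Tuple[float, float, float, float, float]  # (px, py, theta, s, d)


def normalize_features(strokes: List[List[Point]],
                       features: List[Features]) -> List[Features]:
    """Make positions relative to the character's bounding box and
    divide S and D by the total length of all strokes."""
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    x0, y0 = min(xs), min(ys)
    width = (max(xs) - x0) or 1.0    # assumption: avoid dividing by zero
    height = (max(ys) - y0) or 1.0
    total = sum(f[3] for f in features) or 1.0
    return [((px - x0) / width, (py - y0) / height, theta, s / total, d / total)
            for px, py, theta, s, d in features]
```

After this step, a character written large and one written small produce comparable (Px, Py), S, and D values, while θ is left unchanged because it does not depend on size.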

The constraint on stroke order can also be removed as follows. The difference between the position (Px, Py) of a given input stroke and the position of each stroke registered in the dictionary is calculated, and strokes are taken as matching targets in order, starting from the one with the smallest difference. It is also useful to define a virtual line segment connecting one stroke to the next, using the straight line from the end point of the one stroke to the start point of the next, and to add it as one item of identification information. This clarifies the relative positional relationship between two strokes that occur in succession and so makes recognition easier.
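A simple way to realize this order-independent pairing is a greedy match on midpoint distance, sketched below. The function name `pair_strokes_by_position` and the greedy strategy are assumptions; the text only says that strokes are matched in order of smallest position difference.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def pair_strokes_by_position(inputs: List[Point],
                             references: List[Point]) -> List[Tuple[int, int]]:
    """Pair input strokes with dictionary strokes by closest midpoint (Px, Py).

    Returns (input_index, reference_index) pairs, sorted by input index.
    """
    candidates = sorted(
        (math.hypot(ix - rx, iy - ry), i, j)
        for i, (ix, iy) in enumerate(inputs)
        for j, (rx, ry) in enumerate(references)
    )
    used_inputs, used_refs, pairs = set(), set(), []
    for _, i, j in candidates:  # smallest position difference first
        if i not in used_inputs and j not in used_refs:
            pairs.append((i, j))
            used_inputs.add(i)
            used_refs.add(j)
    return sorted(pairs)
```

With this pairing, a character whose two strokes are written in the reverse of the registered order is still compared stroke against the correct counterpart.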

It is also useful to add the geometric relationships between strokes, such as protrusion, intersection, parallelism, and branching, as one item of identification information. In this way, the present invention can be modified and applied in various ways without departing from its gist.

[Brief explanation of drawings]

The figures illustrate an embodiment of the present invention. FIG. 1 is a schematic block diagram of an online handwritten character recognition device; FIGS. 2(a) to 2(c) and 3(a) to 3(c) compare approximate representations of character patterns; FIG. 4 illustrates the concept of circular arc approximation; FIGS. 5(a) to 5(f) show character patterns and their circular-arc-approximated features; FIG. 6 is a schematic block diagram of the embodiment device; FIG. 7 shows the memory layout of the dictionary; and FIG. 8 shows the data format of the stroke features.

11: coordinate input device, 12: preprocessing unit, 13: feature extraction unit, 14: feature buffer, 15: matching calculation device, 16: dictionary, 17: recognition determination unit, 18: control unit.

Applicant's agent: Patent attorney Takehiko Suzue

Claims

(1) A handwritten character input device characterized in that stroke information obtained during the input of a handwritten character through a coordinate input device is approximated by a circular arc for each stroke, and the handwritten character is input by identifying the resulting series of arc-approximated information.

(2) The handwritten character input device according to claim 1, wherein the circular arc approximation of a stroke is performed by obtaining the coordinates of the midpoint of the straight line connecting both ends of the stroke, the slope of that straight line, the total length of the stroke, and the difference between the total length of the stroke and the length of the straight line.

(3) The handwritten character input device according to claim 1, wherein the means for identifying a stroke by circular arc approximation divides the stroke into a plurality of partial strokes and performs the identification processing by approximating each partial stroke with a circular arc.

(4) The handwritten character input device according to claim 3, wherein the number of divisions of a stroke into a plurality of partial strokes is adaptively determined according to the degree of complexity of the stroke.