JPS63307593A

JPS63307593A - Continuous character recognizing device

Info

Publication number: JPS63307593A
Application number: JP62142788A
Authority: JP
Inventors: Hiroshi Shimizu; 洋清水
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-06-08
Filing date: 1987-06-08
Publication date: 1988-12-15
Anticipated expiration: 2010-10-18
Also published as: JPH0797396B2

Abstract

PURPOSE: To reduce the amount of computation involved in recognition by dividing a character string into a plurality of partial character strings. CONSTITUTION: The device is provided with a partial character-string dividing part 300 that divides an input pattern from an input means 100 into a plurality of partial character strings, based on character pitch information of the input pattern, and gives them to a recognizing means 500. The input means 100 converts character-string data into an input pattern; the partial character-string dividing part 300 divides this input pattern into a plurality of partial character strings using the character pitch information and sends them to the recognizing means 500. The recognizing means 500 synthesizes the patterns held in an isolated character pattern storage part 600 and an inter-character stroke pattern storage part 700 and recognizes the input pattern. In this way, the amount of computation can be reduced remarkably.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a means for recognizing one or more consecutively written characters based on the handwriting of the input characters.

[Overview]

The present invention is a means for recognizing a continuously written character string based on a composite pattern of isolated character patterns and inter-character stroke patterns, in which the character string is divided into a plurality of partial character strings so that the amount of computation involved in recognition can be reduced.

[Prior Art]

One conventional device that recognizes a continuously written character string without the user indicating the end of each character is the continuous character recognition device described in Japanese Patent Application No. 61-053397 (hereinafter, Document (1)). Conventional continuous character recognition is explained below using this device as an example.

This device takes continuously written characters, including characters written without a break, as an input pattern and recognizes the characters by pattern matching between this pattern and standard patterns held in the recognition device in advance. One method for recognizing, by such pattern matching, isolated characters written independently one at a time is the method shown in the paper "Online handwritten character recognition using stacked DP matching," published on pages 1 to 8 of the Institute of Electronics and Communication Engineers Technical Research Report PRL83-29 (August 27, 1983) (hereinafter, Document (2)).

In this method, an input character is first converted into a time-series pattern of the direction angles of the line segments that make up the character, $A = (a_i \mid 1 \le i \le I)$, where $a_i$ is a direction angle and $I$ is the number of line segments. This time-series pattern is the input pattern A. The standard patterns are held in the recognition device in advance as time-series patterns of direction angles of the same form as the input pattern.

Each standard pattern is expressed as $B_k = (b_j \mid 1 \le j \le J_k)$.

Here, $k$ ($1 \le k \le K$) is the category of the standard pattern, $b_j$ is a direction angle, and $J_k$ is the number of line segments in the standard pattern of category $k$. The distance between the $i$-th data of input pattern A and the $j$-th data of standard pattern $B_k$ is defined as the angle formed between the direction angles $a_i$ and $b_j$.
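The direction-angle representation and the angular distance described above can be sketched as follows. This is a minimal illustration, not the method of Document (2): the function names `direction_angles` and `angle_distance` are hypothetical, angles are assumed to be in radians, and each line segment is assumed to be given by two consecutive sample points.

```python
import math

def direction_angles(points):
    """Return the direction angle (radians) of each segment of a polyline.

    points: list of (x, y) tuples; consecutive points form one line segment
    (an assumption of this sketch).
    """
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def angle_distance(a, b):
    """Angle formed between two directions a and b, folded into [0, pi]."""
    diff = abs(a - b) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)
```

Folding the difference into [0, π] keeps two nearly identical directions on either side of ±π close to each other, which a plain subtraction would not.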

Let this distance be $d(i, j)$. The inter-pattern distance $D_k$ between input pattern A and standard pattern $B_k$ is the cumulative value of $d(i, j)$ obtained when A and $B_k$ are aligned along the time axis so that the value of $d(i, j)$ accumulated in the time-axis direction is minimized. This time-axis alignment is performed by the DP matching method described in Document (2), as follows.

By solving the recurrence formula of equation (1) over $1 \le j \le J_k$ for each time $i$ of input pattern A, the minimum cumulative distance up to time $i$ can be obtained. Here, $g(i, j)$ denotes the value of the recurrence formula for category $k$ at times $i$ and $j$. The initial values are $g(0, 0) = 0$ and $g(0, j) = \infty$ ($1 \le j \le J_k$).

By carrying out the recurrence calculation of equation (1) up to time $i = I$ of the input pattern ($1 \le i \le I$), the inter-pattern distance $D_k$ is obtained as

$$D_k = g(I, J_k) \tag{2}$$

The category $k$ for which the inter-pattern distance $D_k$ obtained in this way is smallest is taken as the recognition result.
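The DP matching procedure can be sketched as below. Equation (1) itself does not appear in this text, so the recurrence used here (predecessors $(i-1, j)$, $(i-1, j-1)$, $(i-1, j-2)$) is an assumption, not the formula of Document (2). Only the initial values $g(0,0) = 0$, $g(0,j) = \infty$ and the result $D_k = g(I, J_k)$ come from the description. The names `dp_distance` and `recognize` are hypothetical.

```python
import math

def dp_distance(a, b, dist):
    """Cumulative distance g(I, J) between sequences a (length I) and b (length J)."""
    I, J = len(a), len(b)
    inf = math.inf
    # g[i][j] with g(0, 0) = 0 and g(0, j) = infinity for j >= 1, as in the text.
    # g(i, 0) for i >= 1 is also left at infinity (an assumption of this sketch).
    g = [[inf] * (J + 1) for _ in range(I + 1)]
    g[0][0] = 0.0
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            # ASSUMED recurrence: the source's equation (1) may differ.
            best = min(g[i - 1][j],
                       g[i - 1][j - 1],
                       g[i - 1][j - 2] if j >= 2 else inf)
            g[i][j] = dist(a[i - 1], b[j - 1]) + best
    return g[I][J]

def recognize(a, templates, dist):
    """Return the category k whose standard pattern gives the smallest D_k."""
    return min(templates, key=lambda k: dp_distance(a, templates[k], dist))
```

For example, `recognize([0, 0, 1], {"a": [0, 1], "b": [1, 0]}, lambda x, y: abs(x - y))` selects `"a"`, because the input aligns with that template at zero cumulative distance.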

To extend the isolated-character recognition described above to continuous characters, it suffices to use standard patterns of continuous characters formed by concatenating isolated characters and to pattern-match them against the input pattern of continuous characters. Two kinds of standard pattern are used: isolated character patterns, which represent the handwriting from the start to the end of writing one character, and inter-character stroke patterns, which connect the end point of one isolated character pattern to the start point of the next. Hereinafter, the point where writing of a character pattern starts is called the start point and the point where it ends is called the end point. Isolated character patterns and inter-character stroke patterns are concatenated alternately to synthesize a standard pattern of continuous characters. To concatenate multiple standard patterns in a predetermined order and recognize them efficiently, the finite-state automaton described in Japanese Patent Application No. 55-83199 is used.

In this way, continuous characters can be recognized by pattern matching against standard patterns synthesized by alternately concatenating isolated character patterns and inter-character stroke patterns.
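The alternating concatenation can be illustrated with a short sketch. It builds one synthesized standard pattern directly and does not attempt the finite-state automaton that the description cites for efficiency. The name `synthesize`, the dictionaries `isolated` and `links`, and the choice to key inter-character strokes by a (previous, next) character pair are all assumptions made for illustration.

```python
def synthesize(chars, isolated, links):
    """Concatenate isolated character patterns and inter-character stroke patterns.

    chars:    sequence of category labels, e.g. "AB"
    isolated: dict mapping a label to its isolated character pattern (a list)
    links:    dict mapping (previous_label, next_label) to the inter-character
              stroke pattern joining them (hypothetical keying)
    """
    pattern = []
    for idx, ch in enumerate(chars):
        if idx > 0:
            # An inter-character stroke goes between two consecutive characters.
            pattern.extend(links[(chars[idx - 1], ch)])
        pattern.extend(isolated[ch])
    return pattern
```

A single character yields one isolated pattern and zero inter-character strokes, which matches the "one or more isolated character patterns and zero or more inter-character stroke patterns" wording used for the recognition means.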

[Problem that the invention seeks to solve]

In such a conventional continuous character recognition device, the recurrence calculation of equation (1) is carried out for every standard pattern at each time $i$ of the input pattern. Consequently, as the number of input characters grows and the number of line segments $I$ of the input pattern grows, the amount of recurrence computation increases. Moreover, because the standard pattern of a character string is synthesized by combining isolated character patterns and inter-character stroke patterns, the number of combinations also grows, which increases the computation further. Thus the amount of computation increases with the number of input characters.

In addition, within an input character string, a portion where the pen leaves the writing surface and moves (hereinafter, a pen-up stroke) is likely to correspond to a gap between characters. When the input pattern is pattern-matched against the synthesized standard pattern, however, pen-up strokes and inter-character strokes do not necessarily correspond. The conventional continuous character recognition device therefore does not make full use of this inter-character information.

Thus the conventional continuous character recognition device has the drawbacks that the amount of computation increases with the number of input characters and that inter-character information is not fully used in recognition.

The present invention solves these problems, and its object is to provide a continuous character recognition device with a small amount of computation and high character segmentation accuracy.

[Means for solving problems]

The present invention is a continuous character recognition device comprising: input means that reads the handwriting of a written character string as a time-series input pattern; an isolated character pattern storage unit that holds isolated character patterns representing the handwriting from the start to the end of writing one character; an inter-character stroke pattern storage unit that holds inter-character stroke patterns representing the handwriting from the end of writing one character to the start of writing the next; and recognition means that recognizes a pattern given to it based on patterns formed by alternately and consistently concatenating one or more isolated character patterns and zero or more inter-character stroke patterns. The device is characterized by further comprising a partial character string dividing unit that divides the input pattern from the input means into a plurality of partial character strings based on character pitch information of the input pattern and gives them to the recognition means.

[Principle of the invention]

In the present invention, the input character string is divided into a plurality of partial character strings using inter-character information, and each partial character string is then recognized by the same pattern matching as in the conventional method. The inter-character information consists of two assumptions: the spacing between characters within one character string (hereinafter, the character pitch) is approximately constant, and a pen-up stroke is likely to exist between characters.

The process of dividing the input character string into partial character strings is explained below. The input character string is converted into an input pattern A, which is the following time series of xy coordinates.

$$A = \{(x_i, y_i, p_i) \mid 1 \le i \le I\} \tag{3}$$

Here $x_i$ is the X coordinate, $y_i$ is the Y coordinate, and $p_i$ represents the pen up/down state at time $i$: $p_i = 1$ when the pen changes from up to down, $p_i = 2$ when it changes from down to up, and $p_i = 0$ otherwise. First, pen-up stroke portions, which are likely to be gaps between characters, are searched for in input pattern A. That is, if $p_i = 2$ at time $i = m$ and $p_i = 1$ at time $i = m + 1$, the interval from time $m$ to $m + 1$ is a pen-up stroke.
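The pen-up stroke search can be sketched as follows. This is an illustration under stated assumptions: the function name `find_pen_up_strokes` and the constant names are hypothetical, samples are `(x, y, p)` tuples, and indices are 0-based, whereas the description counts time from 1.

```python
PEN_DOWN_EVENT = 1  # p_i = 1: the pen changed from up to down
PEN_UP_EVENT = 2    # p_i = 2: the pen changed from down to up

def find_pen_up_strokes(samples):
    """Return every index m where the pen lifts at m and lands again at m + 1.

    samples: list of (x, y, p) tuples in writing order (0-based indices).
    """
    return [m for m in range(len(samples) - 1)
            if samples[m][2] == PEN_UP_EVENT
            and samples[m + 1][2] == PEN_DOWN_EVENT]
```

A final pen lift at the end of the string has no following pen-down sample, so it is not reported as a pen-up stroke.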

The amount of change in the X direction over this pen-up stroke is assumed to be the character pitch. That is, the character pitch $dx(u)$ for the $u$-th pen-up stroke and its time $s(u)$ are expressed as follows.

$$dx(u) = x_{m+1} - x_m \tag{4}$$

$$s(u) = m \quad (1 \le u \le U) \tag{5}$$

Next, among these $U$ inter-character candidates, only those with a positive character pitch are kept and those with a negative pitch are removed. That is, only pen-up strokes satisfying

$$dx(u) > 0 \tag{6}$$

remain as inter-character candidates. The character pitches $dx(u)$ and times $s(u)$ that satisfy equation (6) are renumbered as

$$dx(v) = dx(u) \tag{7}$$

$$s(v) = s(u) \quad (1 \le u \le U,\ 1 \le v \le V,\ V \le U) \tag{8}$$

The average value $P_{av}$ of these $V$ character pitches is

$$P_{av} = \frac{1}{V} \sum_{v=1}^{V} dx(v) \tag{9}$$

Character pitches $dx(v)$ smaller than this average $P_{av}$ are again removed from the candidates. That is, only pen-up strokes satisfying

$$dx(v) > P_{av} \tag{10}$$

are kept as inter-character candidates. The character pitches $dx(v)$ and times $s(v)$ that satisfy equation (10) are renumbered as

$$dx(w) = dx(v) \tag{11}$$

$$s(w) = s(v) \quad (1 \le v \le V,\ 1 \le w \le W,\ W \le V) \tag{12}$$
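The two-stage candidate selection (positive pitch, then above-average pitch) can be sketched as below. The function name `select_gap_candidates` and its arguments are hypothetical; `xs` is assumed to be the list of X coordinates and `pen_up_times` the 0-based indices $m$ of the detected pen-up strokes.

```python
def select_gap_candidates(xs, pen_up_times):
    """Return the pen-up times kept as inter-character candidates.

    xs:           X coordinate of every sample, in writing order
    pen_up_times: 0-based indices m such that a pen-up stroke runs from m to m + 1
    """
    # Character pitch of each pen-up stroke: change in X over the stroke.
    pitches = [(m, xs[m + 1] - xs[m]) for m in pen_up_times]
    # First pass: keep only strokes that move in the positive X direction.
    positive = [(m, d) for m, d in pitches if d > 0]
    if not positive:
        return []
    # Average pitch over the strokes that survived the first pass.
    p_av = sum(d for _, d in positive) / len(positive)
    # Second pass: keep only strokes strictly longer than the average.
    return [m for m, d in positive if d > p_av]
```

Because the second comparison is strict, a string in which every surviving pen-up stroke has exactly the same pitch yields no candidates in this sketch.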

The input pattern A is now divided at these inter-character candidates. That is, cutting input pattern A at the times $s(w)$ divides it into $W$ partial character strings

$$A_w = \{(x_i, y_i, p_i) \mid s(w) \le i \le s(w+1)\} \tag{13}$$

where $1 \le w \le W$ and $s(W+1) = I$. The input character string can then be recognized by applying the same pattern matching as in the conventional method to each partial character string.
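The division step can be sketched as follows. Two choices here are assumptions of the sketch rather than statements of the description: each segment ends at the pen-lift sample $m$ and the next begins at $m + 1$, and the samples before the first cut are kept as their own segment. The name `split_at` is hypothetical and indices are 0-based.

```python
def split_at(samples, cut_times):
    """Split a sample sequence into partial character strings.

    samples:   list of samples in writing order
    cut_times: 0-based indices m of the selected pen-up strokes; a segment
               ends at m and the next one starts at m + 1 (assumed convention)
    """
    segments = []
    start = 0
    for m in sorted(cut_times):
        segments.append(samples[start:m + 1])
        start = m + 1
    # Whatever follows the last cut forms the final segment.
    segments.append(samples[start:])
    return segments
```

With no cut times the whole input is returned as a single segment, so the downstream matching falls back to the conventional whole-string case.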

For example, suppose a character string of $n$ characters is input and the standard patterns have $K$ categories. With the conventional method, the number of combinations of standard patterns to be synthesized is

$$K^n \tag{14}$$

If, however, the string is divided evenly into $W$ partial character strings of $n/W$ characters each, the number of combinations for each partial character string is

$$K^{n/W} \tag{15}$$

and the number for the whole input character string is

$$W \cdot K^{n/W} \tag{16}$$

When $K^n \ge 2$ (this normally holds, since $K > 2$ and $n > 1$),

$$K^n \ge W \cdot K^{n/W} \tag{17}$$

holds, so dividing the input character string reduces the number of combinations of standard patterns. The number of recurrence calculations of equation (1), which are carried out for every standard pattern, therefore also decreases, and the amount of computation for pattern matching decreases.
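The size of the reduction is easy to check numerically. The sketch below simply evaluates the two combination counts from the description; the function names are hypothetical, and an even split ($W$ dividing $n$) is assumed, as in the text.

```python
def combos_unsplit(K, n):
    """Number of synthesized standard patterns without splitting: K**n."""
    return K ** n

def combos_split(K, n, W):
    """Number after an even split into W partial strings: W * K**(n/W)."""
    if n % W != 0:
        raise ValueError("this sketch assumes W divides n evenly")
    return W * K ** (n // W)

print(combos_unsplit(10, 4))   # 10000
print(combos_split(10, 4, 2))  # 200
```

With ten categories and a four-character string, splitting into two halves cuts the count from 10000 to 200.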

Furthermore, every pen-up stroke in the input character string is always tested as a possible gap between characters, so the inter-character information is used.

The principle of the present invention has been explained above. In this explanation the character pitch was defined as the amount of change in the X direction over a pen-up stroke, but other definitions are possible, such as the X-direction gap between the partial character strings separated by the pen-up stroke.

Also, although the average value $P_{av}$ of the character pitch was used to find inter-character candidates, values derived from other statistics such as the variance, or from the height of the character string in the X direction, can be used as well as the average.
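One way a spread statistic could enter the threshold is sketched below. The description only says that statistics such as the variance may be used. The specific rule here (mean plus a multiple of the standard deviation), the function name `threshold_with_spread`, and the parameter `k` are all assumptions for illustration.

```python
import statistics

def threshold_with_spread(pitches, k=0.5):
    """Pitch threshold = mean + k * population standard deviation (assumed rule).

    pitches: character pitches that survived the positive-pitch test
    k:       hypothetical tuning parameter; k = 0 reproduces the plain average
    """
    return statistics.fmean(pitches) + k * statistics.pstdev(pitches)
```

Setting `k` to 0 recovers the average-only threshold $P_{av}$, so the sketch reduces to the described behaviour in that case.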

Furthermore, although the basic isolated-character recognition method was described for the case of direction-angle data, various other methods can be used, for example those described in the article "Online handwritten kanji recognition moves toward relaxing writing restrictions such as cursive characters," published on pages 115 to 133 of the December 5, 1983 issue of Nikkei Electronics.

Likewise, various other continuous character recognition methods can be used, for example the method of the paper "Recognition of online handwritten character strings by the candidate character lattice method," published on pages 67 to 76 of the Institute of Electronics and Communication Engineers Technical Research Report PRL84-13.

[Operation]

The input means converts character string data into an input pattern. The partial character string dividing unit divides this input pattern into a plurality of partial character strings using character pitch information and sends them to the recognition means. The recognition means synthesizes the patterns held in the isolated character pattern storage unit and the inter-character stroke pattern storage unit and recognizes the input pattern by pattern matching against the partial character strings. This markedly reduces the amount of computation.

[Embodiment]

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of this embodiment.

FIG. 2 is a block diagram showing the configuration of the partial character string dividing unit shown in FIG. 1.

This embodiment comprises: a tablet 100 and a preprocessing unit 200, which are the input means that reads the handwriting of a written character string as a time-series input pattern; an isolated character pattern storage unit 600 that holds isolated character patterns representing the handwriting from the start to the end of writing one character; an inter-character stroke pattern storage unit 700 that holds inter-character stroke patterns representing the handwriting from the end of writing one character to the start of writing the next; a data conversion unit 400 and a recognition unit 500, which are the recognition means that recognizes a pattern given to it based on patterns formed by alternately and consistently concatenating one or more isolated character patterns and zero or more inter-character stroke patterns; and a partial character string dividing unit 300 that divides the input pattern from the input means into a plurality of partial character strings based on character pitch information of the input pattern and gives them to the recognition means. The partial character string dividing unit 300 is the characteristic feature of the present invention.

In FIG. 1, character string data input from the tablet 100 is converted by the preprocessing unit 200 into input pattern A, the time series of X coordinates, Y coordinates, and pen data shown in equation (3), and sent to the partial character string dividing unit 300. The partial character string dividing unit 300 uses character pitch information to divide input pattern A into a plurality of partial character strings $A_w$ and transfers them in sequence to the data conversion unit 400. The data conversion unit 400 converts each partial character string, which consists of a time series of X coordinates, Y coordinates, and pen data, into the time-series direction-angle data described in Document (2) and sends it to the recognition unit 500. As described in Document (1), the recognition unit 500 synthesizes the standard patterns held in the isolated character pattern storage unit 600 and the inter-character stroke pattern storage unit 700, recognizes the input pattern by pattern matching against the partial character strings, and outputs the result R. This result R is the combination of the recognition results for the partial character strings $A_w$.

The operation of the partial character string dividing unit 300 is explained in detail with reference to FIG. 2. Input pattern A sent from the preprocessing unit 200 is first stored in the data buffer 301. The control unit 311 then sends a time signal $i$ to the data buffer 301 and the pen-up detection unit 302. In synchronization with this time signal $i$, the data buffer 301 sends input pattern $A = (x_i, y_i, p_i)$ in sequence to the pen-up detection unit 302. The pen-up detection unit 302 detects pen-up strokes in input pattern A, where $p_i = 2$ and $p_{i+1} = 1$, sends the corresponding time signal $i = m$ to the switch 304, and sends the X coordinates $x_m$ and $x_{m+1}$ to the subtractor 303. The subtractor 303 computes the difference $d = x_{m+1} - x_m$ and, when $d \ge 0$ holds, sends the result $d$ to the D memory 305. At the same time, when $d \ge 0$ holds, the subtractor 303 sends a switching signal sw to the switch 304 to close the line. The time signal $m$ sent from the pen-up detection unit 302 then passes through the switch 304 to the V memory 306 and is held there.

The D memory 305 and the V memory 306 are both one-dimensional array memories; each time $m$ at which input pattern A has a pen-up and the character pitch $d$ at that time are stored in correspondence with each other. When the time signal $i$ has changed from 1 to $I$, $dx(v)$ of equation (7) is stored in the D memory and $s(v)$ of equation (8) in the V memory.

When the time signal $i$ has finished changing from 1 to $I$, the control unit 311 sends a signal $t_1$ to the D memory 305 and the averaging unit 307. In response to this signal $t_1$, the character pitches $d$ are read out of the D memory 305 in sequence and sent to the averaging unit 307.

The averaging unit 307 performs the calculation of equation (9) and sends the average value $P_{av}$ of the character pitches $d$ to the Pav memory 309, where it is stored.

Once the average value $P_{av}$ has been determined, the control unit 311 sends a signal $t_2$ to the D memory 305, the V memory 306, and the comparison unit 308. In response to this signal $t_2$, the character pitches $d$ are read out of the D memory 305 and the times $m$ out of the V memory 306 in sequence and sent to the comparison unit 308.

The comparison unit 308 compares the value of $P_{av}$ stored in the Pav memory 309 with the value of each character pitch $d$ read out of the D memory 305 and, when $d > P_{av}$ holds, that is, when equation (10) is satisfied, sends the corresponding time $m$ to the W memory 310.

The W memory 310 is a one-dimensional array memory like the D memory 305 and the V memory 306, and stores the times that are inter-character candidates, that is, $s(w)$ of equation (12). When all the inter-character candidates have been collected in the W memory 310, the control unit 311 sends a signal $t_3$ to the W memory 310. The data buffer 301 cuts input pattern A at these times and sends the result to the data conversion unit 400 as the partial character strings $A_w$ of equation (13).

As explained above, input pattern A input to the partial character string dividing unit 300 is divided into partial character strings $A_w$. These partial character strings $A_w$ are input through the data conversion unit 400 to the recognition unit 500, which outputs the recognition result R. The configuration of the recognition unit 500 is shown in Document (1).

〔Effect of the invention〕

As explained above, the present invention requires no special operation by the user to separate characters, reduces the amount of computation, and improves character segmentation accuracy, and therefore has the effect of providing a fast continuous character recognition device with high recognition accuracy.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the continuous character recognition device of the present invention. FIG. 2 is a detailed block diagram showing the configuration of the partial character string dividing unit of FIG. 1.

100: tablet, 200: preprocessing unit, 300: partial character string dividing unit, 301: data buffer, 302: pen-up detection unit, 303: subtractor, 304: switch, 305: D memory, 306: V memory, 307: averaging unit, 308: comparison unit, 309: Pav memory, 310: W memory, 311: control unit, 400: data conversion unit, 500: recognition unit, 600: isolated character pattern storage unit, 700: inter-character stroke pattern storage unit.

Claims

[Claims]

(1) A continuous character recognition device comprising: input means (100, 200) that reads the handwriting of a written character string as a time-series input pattern; an isolated character pattern storage unit (600) that holds isolated character patterns representing the handwriting from the start to the end of writing one character; an inter-character stroke pattern storage unit (700) that holds inter-character stroke patterns representing the handwriting from the end of writing one character to the start of writing the next; and recognition means (500) that recognizes a pattern given to it based on patterns formed by alternately and consistently concatenating one or more isolated character patterns and zero or more inter-character stroke patterns; the device being characterized by comprising a partial character string dividing unit (300) that divides the input pattern from the input means into a plurality of partial character strings based on character pitch information of the input pattern and gives them to the recognition means.