JPS63308690A

JPS63308690A - Holograph recognition

Info

Publication number: JPS63308690A
Application number: JP62228227A
Authority: JP
Inventors: モーリー・グザヴィエ
Original assignee: ANATEKUSU
Current assignee: ANATEKUSU
Priority date: 1986-09-11
Filing date: 1987-09-11
Publication date: 1988-12-16
Also published as: CA1293807C

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a method for recognizing handwriting.

[Prior Art] Many years of research on the machine recognition of handwriting have raised numerous problems. These problems, and the ways in which they have been addressed, are briefly described below.

The basic elements of writing (apart, of course, from special symbols) are letters and digits, hereinafter called "writing elements." These elements are assembled and arranged to form words, and their recognition begins with their individualization. In printed (block) writing the characters and other symbols are separated, so their boundaries pose no problem. In cursive (connected) writing, by contrast, the handwriting elements that represent the writing elements are not individualized (this is generally the case for letters, since digits are usually written separately), so recognizing the elements requires that the boundaries be marked beforehand or that the characters be separated beforehand.

Once the handwriting has been separated into elements, each element must be identified by determining which writing element it corresponds to.

This identification must draw on various criteria such as size, curvature, slant, direction and so on.

The first attempts sought criteria allowing "absolute" recognition of writing elements, that is, criteria valid for any handwriting. This approach succeeds only if the writer agrees to follow a very strict style of writing, namely well-formed characters of fixed size and fixed orientation.

Efforts were therefore made to avoid these constraints by using a training (learning) phase. In this case, in a stage preceding recognition, the writer copies a known text, and the features of each character (obtained by applying the various criteria one after another) are retained for later comparison with other features. This process does make handwriting recognition possible. However, it requires a very large number of elementary operations, because every criterion is applied to every handwriting element before the comparison is made.

[Means for solving problems]

Consequently, these known methods do not allow real-time recognition of almost any handwriting using equipment of modest computing power, such as a microcomputer.

The present invention relates to a handwriting recognition method that is performed in real time using equipment whose computing power does not exceed that of a microcomputer.

The invention relates to a method that allows all handwriting to be recognized regardless of its size and form (non-alphabetic characters, ideograms, special symbols, etc.).

More specifically, the invention relates to a recognition method that uses the results of a learning phase, the results of the progressive recognition of the handwriting, or both, not only to select the criteria best suited to the handwriting in the particular case considered, but also to set up the sequence of operating steps (decision tree) best suited to processing that handwriting.

The handwriting recognition method according to the invention comprises the following steps.

A step of applying predetermined criteria, in a predetermined sequence of operating steps, to the handwriting or to elements of the handwriting in order to determine certain features of the handwriting or of those elements; a step of comparing the features thus determined with features representing known writing elements; and a step of identifying an element of the handwriting as a known writing element when the comparison of features gives a predetermined result.

The recognition method is characterized in that it includes a step of setting up the predetermined sequence of operating steps according to the features determined by applying the criteria to handwriting elements.

Advantageously, the step of setting up the sequence of operating steps includes determining the order in which the criteria are to be applied to the handwriting or handwriting elements. It is also advantageous for this step to include selecting only some criteria from among the many available.

Advantageously, the features representing known writing elements, the order of the sequence of operating steps, and/or the selection of criteria are updated progressively during the various handwriting recognition operations.

The recognition method of the invention includes a preliminary training stage in which a number of predetermined criteria are applied to handwriting elements corresponding to known writing elements such as letters and digits, and the features representing the known writing elements are determined and stored. This preliminary training stage includes setting up the sequence of operating steps used for recognition.

Preferably, the method includes encoding the salient points of the handwriting or handwriting elements before the predetermined criteria are applied. It is also advantageous for the method to include a step of removing some salient points according to a proximity criterion.

In an advantageous embodiment, the recognition method also includes a step of cutting the handwriting into its elements. This cutting step includes determining the particular points at which a cut is possible and evaluating, for each such point, the probability that it actually constitutes a cut between two handwriting elements. Advantageously, the sequence of operating steps used to determine the probability that a point constitutes a cut between two handwriting elements is set up according to the features determined during the training stage.

Preferably, the method further includes comparing groups of previously identified elements with a dictionary containing groups of writing elements.

The invention and its objects, features, details and advantages will be understood more clearly with reference to the accompanying drawings, which are given solely as non-limiting examples of preferred specific embodiments of the invention.

[Example]

The following description essentially covers a training stage, which requires a handwritten copy of a text known to the machine, and a recognition stage, which uses the results of the training stage. In addition, recognition is used to update the results provided by the training stage.

However, this updating is not essential if the operator's handwriting changes only slightly over a long period. Moreover, the prior training may be omitted. In that case recognition is not very effective at first; that is, the machine must receive a great deal of assistance from the operator (locations of the cuts between characters, identity of the characters, and so on). Within the scope of the invention, this interactivity is very useful during both the training stage and the recognition stage.

In the embodiment considered, the recognition method according to the invention has two distinct stages: a training stage relating to known writing, and a stage of recognizing unknown writing, preferably accompanied by updating of the training results in a kind of "continuous training." The recognition stage is considered first, in order to understand the processes and criteria used and the features that must be known; the training stage is considered next, in order to understand how the initial features are obtained.

The following description relates to one particular set of criteria, but other sets of criteria may be used. The description is therefore merely an example and in no way limits the scope of the invention.

Recognition begins with the application of predetermined criteria.

First, however, the handwriting to be identified must be delimited and divided into individual handwriting elements. This delimitation is carried out by first encoding the handwriting into salient points by applying a first criterion.

This criterion is advantageously the determination of the extreme points of the handwriting, that is, high points, low points, right extremes, left extremes and lifts of the writing instrument (the latter are relevant when the handwriting is acquired dynamically).

FIG. 1 is an enlarged view of part of a handwriting sample containing four letters belonging to two words (two of the letters are joined and the other two are not).

The lower part of the figure shows how the salient points of the handwriting are interpreted. These salient points comprise high points (H), low points (B), left extremes (G), right extremes (D) and lifts of the writing instrument (L). In the first stage, this handwriting is therefore encoded as HB HB HB HB D HL -BL-L-BDHL-GBDHL.
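The encoding of a stroke into salient points can be illustrated with a minimal Python sketch. It assumes that each pen-down stroke is available as a list of (x, y) samples with y increasing upward, and it marks local extremes with the letters H (high), B (low), G (left), D (right) and L (pen lift). The function name `encode_stroke` and the neighbor-comparison rule are illustrative assumptions, not the procedure specified in the patent.

```python
def encode_stroke(points):
    """Encode one pen-down stroke as a string of salient-point letters.

    points: list of (x, y) samples, y increasing upward (assumption).
    A sample is an extreme when it is beyond its previous neighbor and
    not exceeded by its next neighbor in the relevant direction.
    """
    code = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        if y1 > y0 and y1 >= y2:
            code.append("H")  # local high point
        elif y1 < y0 and y1 <= y2:
            code.append("B")  # local low point
        if x1 < x0 and x1 <= x2:
            code.append("G")  # local left extreme
        elif x1 > x0 and x1 >= x2:
            code.append("D")  # local right extreme
    code.append("L")  # the stroke ends with a lift of the writing instrument
    return "".join(code)


# A zigzag stroke: up, down, up, down, then the pen is lifted.
zigzag = [(0, 0), (1, 2), (2, 0), (3, 2), (4, 0)]
zigzag_code = encode_stroke(zigzag)
```

Only comparisons between neighboring samples are used, which is consistent with the very short computing time the description attributes to this criterion.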

This code is first passed through a filter designed to remove, by applying a threshold, points that are too close together in the main reference directions X and Y. In the particular case of FIG. 1, the points removed are point 12 (too close to point 16), points 15 and 19 (too close to the places where the writing instrument is put down) and point 22 (too close to lift 26). Note that point 7 is not removed even though it is close to points 6 and 8.

Point 7 corresponds to a very important feature, namely a reversal or cusp, and therefore must not be removed. The new code is thus HB HB HB HB D HL -L-L-DHL-BDL.
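A proximity filter of this kind can be sketched as follows. This is a simplified illustration under stated assumptions: each salient point carries hypothetical coordinates and a `protected` flag for features such as cusps, a single threshold is used for both X and Y, and each point is compared only with the last point kept. The patent does not spell out the exact comparison rule, so the names and the rule here are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalientPoint:
    tag: str                 # one of "H", "B", "G", "D", "L"
    x: float
    y: float
    protected: bool = False  # e.g. a cusp that must survive filtering


def filter_close_points(points, threshold):
    """Drop points closer than `threshold` in both X and Y to the last kept point."""
    kept = []
    for p in points:
        if kept and not p.protected:
            q = kept[-1]
            if abs(p.x - q.x) < threshold and abs(p.y - q.y) < threshold:
                continue  # too close in both directions: remove
        kept.append(p)
    return kept


points = [
    SalientPoint("H", 0.0, 10.0),
    SalientPoint("B", 0.0, 0.0),
    SalientPoint("H", 0.2, 0.3, protected=True),  # close to its neighbor but kept
    SalientPoint("B", 5.0, 0.0),
    SalientPoint("H", 5.1, 0.2),                  # close and unprotected: removed
]
filtered_code = "".join(p.tag for p in filter_close_points(points, threshold=1.0))
```

The `protected` flag plays the role of the exception made for point 7 in the text: a point that marks a cusp is kept even when it lies within the threshold of its neighbors.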
-L -DHL-BDL.

It should be noted that, unlike other criteria such as the determination of curvature, this process requires very little computing time. Since it is discriminating for all forms of handwriting, it is desirable to use it almost systematically.

As already mentioned, recognition must be preceded by cutting, that is, by dividing the handwriting into individual handwriting elements, each corresponding to one character.

Advantageously, this cutting has two stages: a first stage in which all possible cut points are determined, and a second stage in which the probability that each point actually constitutes a cut between characters is evaluated.

FIG. 2 shows a word in which all possible cut points are marked by small black squares. These points define "segments" between them. Evaluating the probability involves calculating the distance or gap between the segment in question and the preceding segment and, depending on this distance, calculating the relative position (is there a back stroke?). In 95% of cases, these calculations make it possible to decide whether there is a cut between characters, a cut between words, an accent or a return to the line. The remaining 5% of cases are handled by extending the notion of a character to a string of characters during training. Other criteria can of course be used to determine the cuts.

The criteria for recognizing the characters thus delimited can vary widely; a number of them are discussed below.

The first group of criteria relates to the size of the delimited elements. Some of these criteria are illustrated in FIG. 3. In that figure, "n" is an HBH form of equal heights (note that the salient points that actually represent cut points have been removed), "h" is a decreasing HBH form, and "[illegible]" is an increasing HBH form. A loop may be wide open, open or closed, and may be open to the right, open to the left, closed to the right or closed to the left.

The second group of criteria relates to size. A handwriting element may be very large (a handwritten "f"), large ("l"), small ("a") or very small ("1"). These criteria can be applied to a whole element or to just one of its segments (between two salient points).

Similarly, a handwriting element may be very wide, wide, narrow or very narrow.

The third group of criteria relates to curvature and includes clockwise curvature, counterclockwise curvature and reversal points, that is, cusps. This third group also includes the absence of curvature (straight segments) and the presence of a cut.

The fourth group of criteria relates to the enclosing rectangle. The rectangle may be tall (letter "l"), low (letter "m") or square (letter "a"). These rectangle criteria are applied to all or only part of a handwriting element.

The fifth group of criteria is the relative position of the salient points. This group makes it possible to determine features such as backward, stationary, forward, fast forward, decreasing, equal, increasing, near, far and so on.

The sixth group of criteria relates to direction and gives one of the features up, upper right, right, lower right, down, lower left, left and upper left. It also gives the general direction between two pen-downs, between two pen lifts, between a pen-down and a pen lift, and between a pen lift and a pen-down.
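A direction criterion with eight outcomes can be sketched as a simple quantization of the displacement between two points. The sketch below assumes eight equal 45-degree sectors and a y axis pointing upward; the sector boundaries and the function name `direction` are assumptions, since the description only names the eight directions.

```python
import math

# Sector order starts at "right" (0 degrees) and proceeds counterclockwise.
DIRECTIONS = [
    "right", "upper right", "up", "upper left",
    "left", "lower left", "down", "lower right",
]


def direction(dx, dy):
    """Classify the displacement (dx, dy) into one of eight directions."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Shift by half a sector so that each direction is centered on its axis.
    return DIRECTIONS[int((angle + 22.5) // 45) % 8]
```

The same function could be applied to a single segment or to the displacement between two pen events, depending on which direction feature is being measured.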

Furthermore, the direction may be determined for a single segment, or for the whole of the first or last stroke or line.

The seventh group of criteria is positioning, which concerns determining the position of the handwriting relative to reference lines (the baseline, the spacing between two lines). These criteria make it possible to identify the middle zone, accents and commas, and to detect the presence of ascending and descending strokes.

An eighth group of criteria that may be used is the processing of the points removed during the initial filtering of the code.

The ninth group of criteria is the end-point code, which indicates how the two ends of a handwriting element are determined (pen-down, pen lift, cut).

Other criteria and other groups of criteria can also be used, provided they are discriminating.

After the features of a handwriting element have been determined, they are compared with the features of the handwriting elements determined during training.

However, all these criteria are too numerous to be used in real time during recognition. Attempts have therefore been made to rank the criteria by organizing them in a decision tree whose first level is the most discriminating criterion. Nevertheless, because this organization was meant to suit any handwriting to be identified, the decision tree had to include all the criteria and had to have a very large number of levels. Real-time recognition could not be obtained in this way without very powerful equipment.

However, a small number of criteria is generally sufficient for recognition. Consider, for this purpose, the example of the French word "adresse" shown in FIG. 4. In that figure, the black squares mark the possible cut points. A "segment" is the handwriting element extending between two consecutive possible cut points, or between the start of the handwriting and the first possible cut point. The first operating step is to determine the salient points by a purely automatic process.

Next come the operating steps of partial recognition, cutting into characters according to this partial recognition (that is, grouping certain segments into a single character), and final recognition according to the complete cutting into characters; the latter is sometimes carried out with the aid of lexicographic rules or an actual dictionary.

Thus, in the particular case of the French word "adresse" shown in FIG. 4, the first criterion is "HGBD" (the encoding already described with reference to FIG. 1). The first segment, with code HGB, is either a C or part of a letter D, A or Q. The second segment has code BHB and represents either an I or an L, or part of a letter H or F. Joining the first two segments cannot give a Q; if they are joined, the result is simply a D or an A. The third segment has the same code HGB as the first and is therefore either a C or part of a letter D, A or Q. The fourth segment has code BDBGB and represents either an L alone, part of a D or A (joined with the preceding segment), or part of an H (joined with the following segment). The segment after that has code BHB. This code can represent various letters, either alone or together with following segments, but in combination with the following segment it forms an H. Similar reasoning applies to the subsequent segments.

It should therefore be noted that (considering only the beginning of the word) applying a single criterion leaves the combinations CICL, CICH, CID, CIA, ACL, ACH, AD, AA, DCL, DCH, DD and DA available.

The "increase / equal / decrease" criterion (from the first group of criteria described above) is then used. Applied to the group formed by the first two segments, this criterion gives the result "equal or decrease"; applied to the group formed by the third and fourth segments, it gives the result "increase." The fourth and the eighth to twelfth possible solutions given by the first criterion are therefore eliminated. Only six possible combinations remain.
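The elimination described above can be reproduced with a short Python sketch. The candidate list follows the twelve combinations named in the text, with the first entry read as CICL. The `TREND` table, which assigns an "increase / equal / decrease" outcome to each letter group, is an assumption chosen to be consistent with the eliminations stated in the text; it is not data taken from the patent.

```python
# Assumed outcome of the "increase / equal / decrease" criterion per letter group.
TREND = {
    "CI": "equal_or_decrease",
    "A": "equal_or_decrease",
    "D": "increase",
    "CL": "increase",
    "CH": "increase",
}

# Each candidate: (letters read from segments 1-2, letters read from segments 3-4).
CANDIDATES = [
    ("CI", "CL"), ("CI", "CH"), ("CI", "D"), ("CI", "A"),
    ("A", "CL"), ("A", "CH"), ("A", "D"), ("A", "A"),
    ("D", "CL"), ("D", "CH"), ("D", "D"), ("D", "A"),
]


def prune(candidates, observed_first, observed_second):
    """Keep candidates whose expected trends match the two observed trends."""
    survivors, eliminated = [], []
    for rank, (first, second) in enumerate(candidates, start=1):
        if TREND[first] == observed_first and TREND[second] == observed_second:
            survivors.append(first + second)
        else:
            eliminated.append(rank)  # 1-based position in the candidate list
    return survivors, eliminated


survivors, eliminated = prune(CANDIDATES, "equal_or_decrease", "increase")
```

Under these assumptions a single inexpensive criterion halves the candidate set, and the correct beginning of the word, AD, is among the survivors.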

Further criteria then allow additional eliminations, and comparison with a dictionary removes the last uncertainty.

It should thus be noted that a very small number of criteria provides very high selectivity. Moreover, these criteria are very simple to obtain and do not require lengthy processing (computation) by the machine. Real-time recognition of the handwriting is possible because these criteria are selected, as permitted by building a sequence of operating steps specific to the handwriting in question, and because they are few in number.

The determination of the criteria to be used, the construction of the sequence of operating steps and the determination of the features of the handwriting elements are provided by training. In other words, according to the invention, the training stage is used to select the criteria that are most discriminating for the particular handwriting being processed, and to set up the decision tree (sequence of elementary operating steps) that gives the fastest and most effective recognition in that particular case.

The training stage is now considered. Its purpose is to select and organize the criteria. A very large number of criteria, possibly the maximum number, is therefore used.

The effectiveness of each criterion is then determined and the criteria are selected. Criteria that give very unstable results are avoided.

Training comprises acquiring a known text, cutting this text into characters, and determining features by applying the criteria (measurement), followed by processing of the feature values, that is, calculating recognition probabilities and setting up the sequence of operating steps that constitutes the decision tree.

The first stage (acquisition) involves the user copying a text fragment by means such as a graphics tablet connected to a microcomputer. Preferably, the text contains every symbol of the set of symbols used (for example letters, digits, punctuation marks and correction marks). Naturally, the more representative the text sample is, the more effective the training will be.

The next stage is cutting. Cutting into segments is performed entirely by the machine, which knows the text to some extent and therefore does not need the operator's assistance.

The choice of cut points (the delimitation of characters), however, involves the operator's responsibility for the results obtained, because it is the operator who ultimately decides whether a piece of handwriting is to be identified as one character or as several. After cutting, each handwriting element is associated with one symbol (letter, digit, etc.).

Training is faster if the acquisition equipment (a microcomputer connected to a graphics tablet) itself indicates where it would make the cuts. The operator then merely has to accept the proposed cuts between characters or, conversely, indicate where a cut should be made.

The next stage is the measurement of features by applying the criteria described above. Measurement is based on the principle of performing a complete analysis of the handwriting with all the criteria. Every character associated with a piece of handwriting after cutting is associated with a list of values (the list of criterion values). This list of values, which constitutes the set of features, is called the measurement element.

The table shown by way of example in FIG. 5 presents, as an overall structure, a set of letters together with the different variants of each letter, and associates a measurement element with each piece of handwriting.

This table is a simple example intended to make the organization of the structure easier to understand. According to the table, analysis of a known text has identified five different variants of each of the letters a and l, four variants of each of the letters e and c, and three variants of the letter d. Each character variant is identified by the number forming its code in the first column. Columns I, II and III each show the values obtained by applying one criterion. Column I shows the values resulting from the HBGD criterion (as defined with reference to FIG. 1). Column II shows the result of applying the size criterion to each character. This criterion allows four distinctions, represented by four codes 1 to 4 meaning "very small," "small," "large" and "very large" respectively. Column III relates to relative size, that is, the "decrease / equal / increase" criterion, which has four distinguishing codes 0 to 3 meaning "not computable," "decrease," "equal" and "increase" respectively.

Although this table shows only three criterion columns, it must be understood that there are as many columns as criteria. Once the overall structure of the characters and their measurement elements has been set up, they undergo automatic processing. This processing, explained with reference to FIGS. 6 to 10, makes it possible to design the decision tree used when recognizing a text that is new to the device and written by the person whose handwriting was used to set up the table of FIG. 5.

The automatic processing begins by applying to all the characters the HBGD criterion, which has proved to be the most discriminating. Column I contains five coded distinguishing values, and applying this criterion sets up five subsets SE1 to SE5 (one subset for each distinguishing value; see FIG. 6).

In this table, the distinguishing value BH appears only once and designates character 4, which is a variant of a. The distinguishing value DHQ is contained in the measurement elements of the characters with code numbers 10, 11, 12, 13, 17, 18, 19, 20 and 21. The characters representing the letters e and l are therefore contained in subset SE2.

Similarly, the following are obtained: subset SE3, containing characters 6 and 8, which are the letter d; subset SE4, containing the character with code number 7, which is the letter c; and subset SE5, containing the characters with code numbers 1, 2, 3, 5, 14, 15 and 16, which represent the letters a and d.

Subsets SE1, SE3 and SE4 each contain only one letter, so the recognition process for the corresponding letters is complete. Subsets SE2 and SE5, however, contain different letters. Recognizing them requires a more thorough analysis.

This analysis consists in applying to these subsets all the criteria already mentioned, with the exception of criterion HB GD.

Figure 7 shows how criterion II is applied to subset SE2. Since this criterion has four distinguishing possibilities, the structure representing the processing of subset SE2 has four branches, each ending in a subset of SE2, denoted SSE1 to SSE4. Subset SSE1, given by the distinguishing possibility coded 1, contains the handwriting of the characters numbered 10 and 11, which are letters e.

Subset SSE2, given by the distinguishing element or possibility coded 2, contains the characters numbered 12 and 13, which are also letters e. Subsets SSE3 and SSE4 contain, respectively, characters 17 and 19, which are letters l, and characters 18, 20 and 21, which are also letters l. The effectiveness of a criterion is determined by a probability calculation. For this purpose, only the majority letter is retained in each subset. The number of characters representing that letter is counted and divided by the total number of characters representing that letter in the table. The values obtained for subsets SSE1 to SSE4 are then added. In the case of Figure 7, each subset contains characters representing a single letter, and that letter is the majority letter.
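The calculation just described can be written compactly. The notation is introduced here for illustration and does not appear in the source: $S_1, \dots, S_K$ are the subsets produced by a criterion $c$, $n_k(\ell)$ is the number of characters in $S_k$ that represent the letter $\ell$, $N(\ell)$ is the total number of characters representing $\ell$ in the table, and $\ell_k$ is the majority letter of $S_k$ (taken here, as an assumption, to be the letter with the largest count in the subset).

```latex
E(c) \;=\; \sum_{k=1}^{K} \frac{n_k(\ell_k)}{N(\ell_k)},
\qquad
\ell_k \;=\; \arg\max_{\ell}\, n_k(\ell)
```

A larger value of $E(c)$ means that the criterion spreads the letters over subsets that are each dominated by one letter, so the criterion with the largest value is the one retained.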

Thus subset SSE1 contains two letters e and, since the total number of letters e in the table is 4, it yields a probability coefficient of 2/4. Subsets SSE2 to SSE4 yield probability coefficients of 2/4, 2/5 and 3/5 respectively. The sum of these four coefficients is 2, which is the value that determines the effectiveness of criterion II.
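The effectiveness value can be checked with a few lines of code. The sketch below is an illustration rather than the patent's implementation: the function name `effectiveness` is introduced here, each subset is given as a list of the letters its characters represent, and a tie between two letters with equal counts is resolved in favour of the first one encountered, which is an assumption.

```python
from collections import Counter
from fractions import Fraction


def effectiveness(subsets, totals):
    """Sum over the subsets of (majority-letter count / total count of that letter)."""
    factor = Fraction(0)
    for letters in subsets:
        if not letters:
            continue  # a possibility with no characters creates no subset
        letter, count = Counter(letters).most_common(1)[0]
        factor += Fraction(count, totals[letter])
    return factor


# Totals in the table: four letters e and five letters l.
totals = {"e": 4, "l": 5}

# Criterion II on SE2: {10, 11}, {12, 13}, {17, 19}, {18, 20, 21}.
criterion_ii = [["e", "e"], ["e", "e"], ["l", "l"], ["l", "l", "l"]]
# A criterion that leaves all nine characters of SE2 in a single subset.
single_subset = [["e"] * 4 + ["l"] * 5]

print(effectiveness(criterion_ii, totals))   # 2
print(effectiveness(single_subset, totals))  # 1
```

Exact fractions are used so that the sums match the hand calculation without rounding.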

Figure 8 shows the application to subset SE2 of criterion III, which has four distinguishing possibilities 0, 1, 2 and 3 (and therefore gives a structure with four branches). Since, in the table, possibilities 0, 1 and 3 are not associated with any handwriting or character of subset SE2, they give rise to no subset. Possibility 2, by contrast, gives rise to subset SSE3, which contains all the characters of SE2 (that is, four letters e and five letters l). The distinguishing coefficients for the letters e and l are 4/4 and 5/5 respectively. Because only the majority letter of a subset is retained, the sum of the distinguishing coefficients is equal to 1.

This factor, which indicates the effectiveness of the criterion, is smaller than the factor obtained with criterion II. Criterion II is therefore regarded as the more discriminating criterion and is retained for setting up the decision tree shown in Figure 6.

Figures 9 and 10 show the processing of subset SE5, which involves applying all the criteria other than criterion HB GD. Since this example is limited to criteria II and III, only these two criteria need be considered, as in the processing of subset SE2. Figure 9 gives the result of applying criterion II to subset SE5. Four subsets, SSE1 to SSE4, are obtained. Subset SSE1 contains characters 1 and 2, both of which are letters a; its distinguishing coefficient is 2/5. Subset SSE2 contains characters 3 and 5, which are letters a, and character 14, which is a letter d; the calculation gives a coefficient of 2/5 for the letter a and 1/3 for the letter d.

The letter a is therefore the majority letter, and the coefficient 2/5 must be retained for calculating the effectiveness of criterion II. Subsets SSE3 and SSE4 each contain only a single letter d, so their distinguishing coefficients are 1/3 and 1/3.

Adding all the coefficients of the majority letters gives a discrimination factor of 22/15.

Figure 10 shows the result of using criterion III to identify the characters contained in subset SE5. Insofar as criterion III has four distinguishing possibilities, four subsets should be obtained.

Since the characters of subset SE5 show, in their measurement elements (that is, in the whole of their feature sets), no code representing distinguishing possibilities 0 and 1, only possibilities 2 and 3 each give a subset. Subset SSE3 contains the four characters 1, 2, 3 and 5, which are letters a, so a distinguishing coefficient of 4/5 is obtained. Subset SSE4 contains characters 14, 15 and 16, which prove to be letters d. Since the total number of letters d is 3, the distinguishing coefficient is 3/3. Adding the two coefficients gives the value 9/5 (= 27/15).

Comparing the factors that indicate the discriminating effectiveness of criteria II and III as applied to subset SE5 shows that criterion III, with a factor of 27/15, is more effective than criterion II, with a factor of 22/15.
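For reference, the two factors compared here can be worked out term by term. The symbol $E$ is notation introduced for this illustration; the coefficients are those of the majority letters in each subset of SE5, with five letters a and three letters d in the table.

```latex
\begin{aligned}
E_{\mathrm{II}}(\mathrm{SE5})  &= \tfrac{2}{5} + \tfrac{2}{5} + \tfrac{1}{3} + \tfrac{1}{3}
                                = \tfrac{12}{15} + \tfrac{10}{15} = \tfrac{22}{15},\\[4pt]
E_{\mathrm{III}}(\mathrm{SE5}) &= \tfrac{4}{5} + \tfrac{3}{3}
                                = \tfrac{9}{5} = \tfrac{27}{15} \;>\; \tfrac{22}{15}.
\end{aligned}
```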

Criterion III is therefore retained in the decision tree of Figure 6.

The detailed description given so far illustrates the principle of the self-help processing that leads to the construction of a decision tree: only the most discriminating criteria are retained, and the criterion to be used at each node or junction of the decision tree is thereby determined. With criterion HB GD as the first criterion, ordinary handwriting gives approximately 200 branches. Some branches end in subsets that form leaves of the tree, in which some characters are classified and separated from the others. The training process ends with the construction of the decision tree, and the setting up of the feature file for a given handwriting is represented in the form of a flowchart in Figure 11. In the first operating step, the person whose handwriting is to be identified by the device writes a known text on a support (for example, a digitizing tablet). The figure shows a tablet on which some letters have been written.

The tablet, which is known per se, generates from the handwriting a series of electrical signals, each representing one point of the handwriting, and sends these signals to handwriting-processing equipment such as a microcomputer. The microcomputer is programmed to carry out the acquisition, that is, to cut the text into elements designated by letter names as already described, to set up the measurement elements for the various letters, and to build a structure organized in the manner shown in the table of Figure 5. The microcomputer then carries out the self-help processing that generates the decision tree by applying the criteria.
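The generation of the decision tree by repeated application of the criteria can be sketched as a short recursive procedure. This is a minimal illustration under stated assumptions, not the patent's program: the names `split`, `effectiveness`, `build` and `sample` are introduced here, the codes assigned to characters 15 and 16 under criterion II are assumed (the description only says that each lies in its own subset), and ties between criteria are resolved in favour of the first one listed.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def split(samples, criterion):
    """Group samples by the code they received for one criterion."""
    groups = defaultdict(list)
    for s in samples:
        groups[s["codes"][criterion]].append(s)
    return dict(groups)


def effectiveness(groups, totals):
    """Sum of (majority-letter count / table total of that letter) over the groups."""
    factor = Fraction(0)
    for members in groups.values():
        letter, count = Counter(s["letter"] for s in members).most_common(1)[0]
        factor += Fraction(count, totals[letter])
    return factor


def build(samples, criteria, totals):
    """Pick the most effective remaining criterion at each node, then recurse."""
    letters = sorted({s["letter"] for s in samples})
    if len(letters) == 1 or not criteria:
        return {"leaf": letters}
    best = max(criteria, key=lambda c: effectiveness(split(samples, c), totals))
    remaining = [c for c in criteria if c != best]
    return {
        "criterion": best,
        "branches": {code: build(group, remaining, totals)
                     for code, group in split(samples, best).items()},
    }


def sample(number, letter, ii, iii):
    return {"number": number, "letter": letter, "codes": {"II": ii, "III": iii}}


# Subset SE5 as described for Figures 9 and 10 (codes for 15 and 16 under II assumed).
se5 = [
    sample(1, "a", 1, 2), sample(2, "a", 1, 2),
    sample(3, "a", 2, 2), sample(5, "a", 2, 2),
    sample(14, "d", 2, 3), sample(15, "d", 3, 3), sample(16, "d", 4, 3),
]
tree = build(se5, ["II", "III"], totals={"a": 5, "d": 3})
print(tree)
```

On this data the procedure selects criterion III, whose two branches each contain a single letter, so both become leaves.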

Figure 12 is a flowchart of the process of recognizing a text that is unknown to the device and written by a person for whom the device has already set up a feature file containing a decision tree. In this example, the letter d is written on the digitizing tablet. The microcomputer is programmed to acquire this character in the same way as before. It is then programmed to carry out the operational steps suited to recognizing the letter d by referring to the decision tree created for the writer. Assuming that criterion HB GD, applied to the handwriting of the letter d to be identified, gives the element coded aDr-x, the decision tree of Figure 6 indicates that the character to be identified must lie in subset SE5. According to the decision tree, criterion III must then be applied. If applying criterion III to this same handwriting gives code 3, the microcomputer knows that the character must lie in subset SSE4. This subset contains the characters numbered 14, 15 and 16, which, according to the table of Figure 5, do indeed represent the letter d. Thus, according to the invention, the microcomputer can identify the letter d by applying only two of the very many criteria retained for setting up the measurement elements of each character. It should be appreciated that the invention makes real-time recognition possible with a standard microcomputer (for example, of the type commonly used in text-processing equipment).
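The recognition walk described above amounts to following one branch per criterion until a leaf is reached. The sketch below illustrates this on a fragment of the tree; the nested-dictionary layout, the function name `classify` and the placeholder code `k5` (standing for whichever HB GD code leads to subset SE5) are assumptions made for the example.

```python
def classify(tree, codes):
    """Follow the branch matching each criterion's code; return the leaf letter or None."""
    node = tree
    while "leaf" not in node:
        code = codes.get(node["criterion"])
        node = node["branches"].get(code)
        if node is None:
            return None  # no branch for this code: the character is not recognized
    return node["leaf"]


# Fragment of the decision tree: root criterion HB GD, then criterion III on SE5.
tree = {
    "criterion": "HBGD",
    "branches": {
        "k5": {  # placeholder for the code that leads to subset SE5
            "criterion": "III",
            "branches": {2: {"leaf": "a"}, 3: {"leaf": "d"}},
        },
    },
}

print(classify(tree, {"HBGD": "k5", "III": 3}))  # d
```

Only the criteria found along the path are evaluated, which is why two criteria suffice in the example even though many more were used to build the table.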

If the handwriting from which the decision tree was built changes, the device of the invention, concurrently with recognition, updates the measurement elements of the modified characters and, if necessary, the decision tree.

Another useful feature of the invention is the possibility of identifying words without necessarily identifying each letter, by comparing the groups of handwriting elements that form words with a dictionary in the form of a vocabulary, and the possibility of removing the uncertainty that arises from the fact that handwriting can properly be cut in several ways. According to the invention, these problems are solved by a structure called a distribution grid or distribution pattern, and by a hierarchically ordered dictionary with a particular structure. Figure 13 shows the French word "admise" together with the cuts made by the microcomputer and the grid or pattern set up by the microcomputer (below the letters). This grid or pattern shows the various possibilities of recognizing the handwriting segments of the French word "admise". In this example, however, only the most probable letter of each distribution is shown.

Since the letters "d" and "h" have one segment in common, these two letters are mutually exclusive.
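One way to picture the mutual exclusion is to treat each candidate letter as covering a run of consecutive segments and to accept only readings in which every segment is used exactly once. The sketch below enumerates such readings; it is an illustration only. The function name `readings`, the half-open `(letter, first, end)` spans and the segment numbers are all hypothetical and are not taken from Figure 13.

```python
def readings(candidates, n_segments, start=0):
    """Return every letter string whose candidates tile segments start..n_segments-1.

    Each candidate is (letter, first, end) and covers segments first..end-1,
    with end > first. Two candidates that share a segment never appear together.
    """
    if start == n_segments:
        return [""]
    results = []
    for letter, first, end in candidates:
        if first == start:
            for tail in readings(candidates, n_segments, end):
                results.append(letter + tail)
    return results


# Hypothetical grid over five segments: "d" covers segments 1-2 and "h" covers
# segments 2-3, so they share segment 2 and exclude each other.
candidates = [
    ("a", 0, 1), ("d", 1, 3), ("c", 1, 2),
    ("h", 2, 4), ("m", 3, 5), ("n", 4, 5),
]
print(readings(candidates, 5))  # ['adm', 'achn']
```

Each surviving reading can then be checked against the dictionary, which is how the remaining uncertainty is removed.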

The uncertainty is removed by referring to a particular dictionary, the structural principle of which is shown in Figure 14.

This dictionary has a tree-like structure in which, for example, one branch is provided for each letter of the alphabet. All these branches start from a common point.

Each branch has a node or junction from which further branches arise, each of them associated with one letter that, together with the first branch, forms a letter combination likely to have a meaning. Each of these secondary branches is in turn the starting point of new branches, each associated with one letter and forming a new meaningful letter combination. The structure also allows the various grammatical forms of words to be taken into account. For example, a single tree structure containing 80,000 words and taking their different grammatical forms into account, that is, 200,000 valid letter combinations, is entirely satisfactory for real-time recognition of handwriting. Such a structure nevertheless requires a storage space of, for example, 180 kilo-octets (kilobytes).
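A tree-structured dictionary of this kind corresponds to what is now usually called a trie. The following is a small sketch of the idea, assuming a plain in-memory representation; the class and method names are introduced here, and the three French words are sample entries rather than the contents of the dictionary in Figure 14.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # True if the path from the root spells a word


class Lexicon:
    """Tree-structured dictionary: one branch per letter, shared prefixes stored once."""

    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for letter in word:
            node = node.children.setdefault(letter, TrieNode())
        node.is_word = True

    def _walk(self, text):
        node = self.root
        for letter in text:
            node = node.children.get(letter)
            if node is None:
                return None
        return node

    def is_prefix(self, text):
        """True if at least one word starts with this letter combination."""
        return self._walk(text) is not None

    def contains(self, text):
        node = self._walk(text)
        return node is not None and node.is_word


lexicon = Lexicon(["admis", "admise", "admire"])
print(lexicon.is_prefix("adm"), lexicon.contains("admise"), lexicon.is_prefix("ah"))
```

Because a candidate reading can be abandoned as soon as its prefix has no branch, ambiguous cuts are rejected early without scanning the whole word list.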

By setting up a decision tree during the training process and, where necessary, by using a dictionary organized as a tree structure, the method of the invention makes real-time recognition of handwriting possible with standard equipment that requires only a handwriting support, such as a graphics tablet, and a microcomputer. For example, a microcomputer with a main memory of 512 kilo-octets, enlarged by the storage space needed for the dictionary, is entirely satisfactory.

The method of the invention has many applications. It provides an acquisition device for microcomputers that replaces the keyboard, as well as the rolling control member now called a "mouse", by direct writing of text, commands and so on onto the tablet. It also enables direct typewriting of text handwritten by a person whose handwriting has previously been analyzed, scanned and learned during the training process in order to build the decision tree.

[Brief description of the drawings]

Figure 1 shows some letters and how the coding according to the invention is carried out. Figure 2 shows a word and how it is cut into letters. Figure 3 shows some letters and how the coding is interpreted. Figure 4 shows a handwritten word used to explain the operational steps carried out during recognition. Figure 5 is a table showing the results of analyzing a handwritten text according to the invention. Figure 6 shows a decision tree set up from the table of Figure 5. Figures 7, 8, 9 and 10 show the operational steps for setting up a decision tree according to the invention. Figure 11 shows a flowchart for setting up a decision tree according to the invention. Figure 12 shows a flowchart of the process of recognizing a single letter according to the invention. Figure 13 shows the principle of the distribution grid or pattern according to the invention. Figure 14 shows the organization of the hierarchically ordered dictionary according to the invention.

Claims

[Claims]

(1) A method for recognizing handwriting, comprising: applying predetermined criteria, according to a predetermined sequence of operation steps, to handwriting or handwriting elements in order to determine features of the handwriting or handwriting elements; comparing the features thus determined with features representing known typeface elements; and identifying the handwriting element as a known typeface element when the comparison between the features yields a predetermined result, the recognition method being characterized by comprising the step of setting the predetermined sequence of operation steps according to the features determined by applying the criteria.

(2) The recognition method according to claim 1, characterized in that the step of setting the predetermined sequence of operation steps includes determining the order in which the criteria are to be applied to the handwriting or handwriting elements.

(3) The recognition method according to claim 2, characterized in that the step of setting includes selecting only a small number of criteria from among a large number of criteria.

(4) The recognition method according to claim 1, further comprising the step of gradually updating the features representing known typeface elements, the order of the sequence of operation steps, or the selected criteria.

(5) The recognition method according to claim 1, comprising a preliminary training step in which known typeface elements, such as letters and numerals, are subjected to the application of predetermined criteria in order to determine and retain the features representing these known typeface elements, and in which the sequence of operation steps, including the discriminating criteria used for recognition, is set up.

(6) The recognition method according to claim 1, characterized by encoding salient points of the handwriting and handwriting elements.

(7) The recognition method according to claim 6, comprising the step of removing some salient points according to a criterion of proximity.

(8) The recognition method according to claim 1, comprising the step of cutting the handwriting into handwriting elements.

(9) The recognition method according to claim 8, characterized in that the step of cutting the handwriting includes identifying particular points at which a cut is possible and, for each such point, evaluating the likelihood that the point actually constitutes a cut between two elements of the handwriting.

(10) The recognition method according to claim 1, characterized by comparing groups of previously recognized elements with a dictionary containing such groups.