JP2002063548A

JP2002063548A - Handwritten character recognizing method

Info

Publication number: JP2002063548A
Application number: JP2001183718A
Authority: JP
Inventors: Harunobu Oyama; 晴信大山; Masaki Nakagawa; 正樹中川
Original assignee: Hitachi Software Engineering Co Ltd
Current assignee: Hitachi Software Engineering Co Ltd
Priority date: 2001-06-18
Filing date: 2001-06-18
Publication date: 2002-02-28

Abstract

PROBLEM TO BE SOLVED: To provide a handwritten character recognition method that segments characters accurately even when they are written at a slant or with narrow spacing, and that recognizes the handwritten characters of any number of lines in a single pass based on the segmentation result. SOLUTION: The distances between the strokes that make up the input stroke groups are evaluated according to a predetermined relational expression, and strokes whose distance is smaller than a provisional merging threshold are merged, dividing the stroke groups into character elements. The circumscribed rectangle of each character element is then obtained, and the maximum or average height and the maximum or average width of the circumscribed rectangles are estimated as the standard character size of the handwriting. Within the space of the estimated standard character size, character elements whose inter-element relation parameter is smaller than the provisional merging threshold are merged, dividing the character elements into character element sets, and each character element set is recognized using a dictionary.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a handwritten character string recognition method for recognizing handwritten characters input from a handwritten character input device such as a tablet or an electronic blackboard.

[0002]

[Prior Art] The following handwritten character recognition methods and devices of this kind have been proposed:
(1) JP-A-61-29982 (title: Online handwritten character string recognition system)
(2) JP-A-5-174185 (title: Japanese character recognition device)
(3) JP-A-6-162269 (title: Handwritten character recognition device)
(4) JP-A-8-50632 (title: Handwritten character segmentation method and device)

[0003] The online handwritten character string recognition method disclosed in JP-A-61-29982 aims to remove the constraints on recognizing character strings written in free form on a data tablet and to segment the characters correctly. The stroke sequence input from the data tablet is divided into a plurality of basic segment sequences; the basic segments are then combined to generate candidate characters; each generated candidate character is recognized in turn by matching against standard characters; and the character name and dissimilarity of each recognition result are accumulated. This process is repeated for all candidate characters, and the sequence of character names that minimizes the total dissimilarity for the input stroke sequence is assigned using a shortest-path search algorithm.

[0004] The Japanese character recognition device disclosed in JP-A-5-174185 aims to minimize segmentation errors and recognition errors for Japanese character strings input online or offline from a scanner or the like. It detects the range of a character string in which separated characters or half-width characters may be lined up, obtains all segmentation candidates within that range, recognizes them, and outputs the most probable recognized character code by jointly judging segmentation priority and recognition similarity. To do so, it extracts the circumscribed figures of the connected components of the character portions and merges adjacent circumscribed figures that overlap in the vertical direction (for horizontally written documents) or in the horizontal direction (for vertically written documents) to form basic rectangles. It then determines whether each basic rectangle can be settled as a single character on its own. If not, it detects the range of such basic rectangles, obtains the combinations of merges of adjacent basic rectangles in that range as segmentation candidates, assigns a priority to each, recognizes all segmentation candidates, and outputs the most probable recognized character code based on the segmentation priority and the recognition similarity.

[0005] The handwritten character recognition device disclosed in JP-A-6-162269 aims to allow handwritten characters to be input smoothly at any position and at any speed. It detects the distance and direction between the strokes of the input handwritten characters and the positions of their start points, groups the coordinate data into character units, and recognizes the character represented by the strokes from the coordinate data of each character unit.

[0006] The handwritten character segmentation method and device disclosed in JP-A-8-50632 aim to enable character segmentation without providing an input frame. The height H of the input handwritten character string is obtained, and a width L is determined from this height H. The range of width L extending horizontally from a base point O is taken as a preliminary search range, within which the number of strokes S, the maximum height h, and a shape feature x (the longest blank length) are obtained. A search range is determined according to the variables S, h, and x; the intervals in which the histogram takes its minimum value are searched within that range; and the longest of those intervals is taken as the break before the following character, so that one character is segmented.

[0007]

[Problems to Be Solved by the Invention] However, the handwritten character recognition methods described in the above publications all assume that the writing direction is specified in advance as horizontal or vertical, or is fixed, and further assume that the line break positions are also specified. Consequently, they cannot take in online a handwritten document whose writing direction and line break positions are not specified, for example a multi-line handwritten document written on an electronic blackboard, and recognize it in a single pass.

[0008] In the online handwritten character string recognition system disclosed in JP-A-61-29982, the input stroke sequence is divided into basic segment sequences as follows: for a horizontally written input character pattern, the ratio between the degree of overlap of the strokes' projections onto the horizontal axis and the height of the circumscribed figure of the input character pattern is compared with a threshold to split the strokes, and each resulting group of strokes is treated as a basic segment. When the handwriting slants diagonally, the height of the circumscribed figure becomes abnormally larger than the character height, and as a result a single basic segment sequence ends up including segments that belong to the adjacent character. Handwritten input characters written at a slant therefore cannot be recognized correctly.

[0009] In the Japanese character recognition device disclosed in JP-A-5-174185, strokes that overlap in the vertical direction (for horizontal writing) or in the horizontal direction (for vertical writing) are merged into a basic segment that can constitute one character. In other words, the division into basic segments relies on a deterministic test of whether or not an overlap exists. When the character spacing is narrow and the circumscribed figures of adjacent characters overlap, strokes of several characters may be merged into the basic segment of a single character, so handwritten input with narrow character spacing may not be recognized correctly.

[0010] The handwritten character recognition device disclosed in JP-A-6-162269 segments handwritten characters one at a time by focusing on the start point of the first stroke. When it detects that the start point of the last stroke of the immediately preceding character lies below a predetermined threshold and the start point of the first stroke of the current character lies above that threshold, it treats this location as a candidate character boundary. It then examines the distance and direction between the start points of the first stroke of the preceding character and the first stroke of the current character; if the distance exceeds a threshold and the direction matches the character input direction, the location is settled as a segmentation candidate for one character. A circumscribing box is created for the segmentation candidate and its overlap with the previously created circumscribing box is checked: if they overlap, the two boxes are merged as the stroke group of a single character; if they do not, the stroke group of the previous segmentation candidate is cut out as one character. This approach cannot be applied to vertically written handwriting, in which the start point of the first stroke always lies below the start point of the last stroke of the preceding character. For the same reason, even in horizontal writing, when an entire line slants down to the right and the start point of the first stroke of the current character lies below the start point of the last stroke of the preceding character, that first stroke is classified as belonging to the preceding character and segmentation fails.

[0011] In the handwritten character segmentation method disclosed in JP-A-8-50632, the height H of the input handwritten character string is obtained, a width L is determined from H, the range of width L extending horizontally from a base point O is taken as a preliminary search range, and the number of strokes S, the maximum height h, and the shape feature x (the longest blank length) are obtained within that range. A search range is determined according to S, h, and x, the intervals in which the histogram takes its minimum value are searched within it, and the longest such interval is taken as the break before the following character. Consequently, if the three-digit number "111" is written in a tall, narrow hand, for example, its digits may be segmented as the stroke sequence of a single character and misrecognized as the kanji "川" (river). In addition, although multi-line handwriting is divided into lines at the line break positions, no consideration is given to how the line break positions are detected. Handwritten characters written over several lines therefore cannot be recognized together line by line.

[0012] The present invention has been made to solve the above problems of the prior art. An object of the present invention is to provide a handwritten character recognition method that segments each character accurately even when the handwriting is slanted or narrowly spaced, and that recognizes the handwritten characters of any number of lines in a single pass based on the segmentation result.

[0013] Another object of the present invention is to provide a handwritten character recognition method that takes in online handwritten characters written on an electronic blackboard or the like without a specified writing direction, accurately determines their writing direction, and recognizes the handwritten characters in a single pass based on the determination result.

[0014] Another object of the present invention is to provide a handwritten character recognition method that takes in online handwritten characters written on an electronic blackboard or the like without specified line break positions, accurately determines their line break positions, and recognizes handwritten characters spanning several lines in a single pass based on the determination result.

[0015] A further object of the present invention is to provide a handwritten character recognition method that takes in online handwritten characters written on an electronic blackboard or the like and recognizes them in a single pass, regardless of whether the writing is vertical or horizontal, how many lines it has, and whether a writing frame is present. The above and other objects and novel features of the present invention will become apparent from the description of this specification and the accompanying drawings.

[0016]

[Means for Solving the Problems] A representative one of the inventions disclosed in this application is outlined briefly as follows. The present invention is a handwritten character recognition method for recognizing a plurality of handwritten character strings composed of a plurality of stroke groups input in stroke order from a handwritten character input device. The distance between the strokes constituting the stroke groups is evaluated according to a predetermined relational expression, and strokes whose evaluated distance is smaller than a provisional merging threshold are merged; this is repeated until no mergeable strokes remain, dividing the stroke groups into a plurality of character elements. The circumscribed rectangle of each character element is then obtained, and the maximum or average height and the maximum or average width of the circumscribed rectangles are estimated as the standard character size of the handwriting. In the space of this estimated standard character size, a parameter representing the relation between adjacent character elements is calculated according to a predetermined relational expression, and character elements for which the calculated parameter is smaller than the provisional merging threshold are merged; this is repeated until no mergeable character elements remain, dividing the character elements into a plurality of character element sets. The dictionary is searched with each character element set, and the character whose evaluation value against the handwritten character patterns registered in the dictionary is highest is output as the recognition result.
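The first two steps of this outline, merging strokes until no mergeable pair remains and then estimating the standard character size from the circumscribed rectangles, can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: the relational expression for stroke distance is not given in this passage, so `stroke_distance` uses the smallest gap between sampled pen points as a hypothetical stand-in, and all function names are invented for the example.

```python
from itertools import combinations
from math import hypot
from statistics import mean

Point = tuple[float, float]
Stroke = list[Point]


def stroke_distance(a: Stroke, b: Stroke) -> float:
    """Assumed distance measure: smallest gap between any two sampled pen points."""
    return min(hypot(px - qx, py - qy) for px, py in a for qx, qy in b)


def merge_strokes(strokes: list[Stroke], threshold: float) -> list[list[Stroke]]:
    """Merge groups closer than the threshold until no mergeable pair remains."""
    groups = [[s] for s in strokes]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(groups)), 2):
            gap = min(stroke_distance(a, b) for a in groups[i] for b in groups[j])
            if gap < threshold:
                groups[i].extend(groups.pop(j))  # j > i, so index i stays valid
                merged = True
                break  # the group list changed, so restart the scan
    return groups


def bounding_box(group: list[Stroke]) -> tuple[float, float, float, float]:
    """Circumscribed rectangle (x_min, y_min, x_max, y_max) of a character element."""
    xs = [x for stroke in group for x, _ in stroke]
    ys = [y for stroke in group for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def standard_char_size(groups: list[list[Stroke]], use_max: bool = True):
    """Estimate (width, height) as the maximum or the average over all rectangles."""
    boxes = [bounding_box(g) for g in groups]
    widths = [x1 - x0 for x0, _, x1, _ in boxes]
    heights = [y1 - y0 for _, y0, _, y1 in boxes]
    pick = max if use_max else mean
    return pick(widths), pick(heights)
```

The `use_max` switch mirrors the choice between the maximum and the average value mentioned in the text; which of the two works better in practice is not stated here.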

[0017] In a preferred embodiment of the present invention, an attribute flag indicating a character boundary is set on each character element whose inter-element relation parameter is larger than a provisional splitting threshold, the character elements are partitioned by these attribute flags so that the character boundaries are explicit, and the dictionary is searched with reference to this partition.
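One way to picture the attribute flag is the sketch below. It assumes, purely for illustration, that each character element carries a single relation parameter toward the element that follows it (`gap_to_next`) and that a flagged element closes a character; the class, field, and function names are hypothetical and the actual relational expression is not given in this passage.

```python
from dataclasses import dataclass


@dataclass
class CharElement:
    label: str
    gap_to_next: float         # hypothetical relation parameter to the next element
    is_boundary: bool = False  # attribute flag: a character break follows this element


def flag_boundaries(elements: list[CharElement], split_threshold: float) -> None:
    """Set the flag on every element whose parameter exceeds the splitting threshold."""
    for element in elements:
        element.is_boundary = element.gap_to_next > split_threshold


def partition(elements: list[CharElement]) -> list[list[CharElement]]:
    """Cut the element sequence after each flagged element."""
    groups, current = [], []
    for element in elements:
        current.append(element)
        if element.is_boundary:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups
```

A dictionary search would then only need to consider combinations of elements inside each partition, since the flagged breaks are treated as settled.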

[0018]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings. In all the drawings used to describe the embodiments, elements having the same function are given the same reference numerals, and repeated descriptions are omitted. FIG. 1 is a block diagram showing an embodiment of a handwritten character recognition device to which the present invention is applied. The device comprises: a handwritten character input device 2, consisting of a tablet, an electronic blackboard, or the like, which outputs in stroke order the pen-point coordinates of handwritten characters written on its input surface with a pen 1; a display device 3, which displays the recognition results; a central processing unit (CPU) 4, which merges and splits the stroke groups of the handwritten characters input from the handwritten character input device 2 into character element candidates and recognizes them by matching against a dictionary; a keyboard 5, for entering the parameters and commands needed for recognition; and a storage device 6, which stores a handwritten character recognition program 61, a dictionary 62, and the like.

[0019] The handwritten character input device 2 is not limited to an electronic blackboard or a tablet; any device that outputs the pen-point coordinates of handwritten characters in stroke order can be used. An input device in which a display screen is mounted beneath a transparent tablet can also be used.

[0020] In the handwritten character recognition device of this embodiment, as shown in FIG. 2, no input frame for handwritten characters is provided on the input surface 21 of the handwritten character input device 2. When the user writes arbitrary handwritten characters with the pen 1 at any position on the input surface 21 over several lines, for example the text "枠無し手書き文字の認識について" ("On the recognition of frameless handwritten characters") shown in FIG. 2, and then selects the "Recognize" command button 22, the handwritten characters written on the input surface 21 are recognized in a single pass and the result is displayed as text on the screen of the display device 3. If the recognition result contains an error, selecting the "Re-recognize" command button 23 reruns the whole series of processes, starting from the determination of the writing direction, and displays the new result. If the user has written a wrong character, selecting the "Cancel" command button 24 cancels the input one character at a time.

[0021] The terms used in this specification are defined as follows.
(1) Stroke: A stroke is a single handwritten line drawn from the moment the pen 1 touches the input surface 21 of the input device 2 until it leaves the surface; it corresponds to what Japanese calls "一画" (one stroke of a character). Apart from punctuation marks and the like, a handwritten character consists of several strokes.
(2) Pen point: A pen point is the smallest unit point making up a stroke. It is expressed by the coordinates at which the pen 1 presses the input surface 21, or by logical coordinates derived from them, and carries attributes such as being the start point or end point of a stroke.

[0022] (3) Character element: A character element is a set of strokes that clearly belong to a single character. It is obtained from an arbitrary set of strokes through processing such as merging strokes that intersect and merging strokes that are close together. FIG. 3 illustrates the distinction between strokes and character elements.
(4) Handwritten pattern: As illustrated in FIG. 3, a handwritten pattern is the entire group of strokes making up the handwritten characters to be recognized that were written on the input surface of the input device 2. The extent of the recognition target can be indicated either explicitly by the user, with a button, menu, or the like, or implicitly, by treating as the delimiter the moment at which the pen 1 has left the input surface 21 and no contact has occurred for a certain time or longer.

[0023] (5) Back stroke: A back stroke is the vector from the end point of one stroke to the start point of the next stroke. In the present invention, back strokes are subdivided into intra-character back strokes, inter-character back strokes, and line-break back strokes.
(6) Intra-character back stroke: A back stroke that occurs between two consecutive strokes within a single character.
(7) Inter-character back stroke: A back stroke that occurs between the end point of the last stroke of one character and the start point of the next character.
(8) Line-break back stroke: A back stroke that occurs between the end point of the last stroke of the last character on one line and the start point of the first stroke of the first character on the next line.
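The back-stroke definitions above translate directly into a small amount of code. The sketch below is only illustrative: it represents each stroke as a list of (x, y) pen points and, for the classification, assumes that the line and character to which each stroke belongs are already known, which in the actual method is the outcome of recognition rather than an input. The function names are hypothetical.

```python
Point = tuple[float, float]
Stroke = list[Point]


def back_strokes(strokes: list[Stroke]) -> list[Point]:
    """Vector from the end point of each stroke to the start point of the next."""
    return [
        (nxt[0][0] - cur[-1][0], nxt[0][1] - cur[-1][1])
        for cur, nxt in zip(strokes, strokes[1:])
    ]


def classify_back_strokes(labels: list[tuple[int, int]]) -> list[str]:
    """labels[i] is the (line index, character index) of stroke i, assumed known."""
    kinds = []
    for (line_a, char_a), (line_b, char_b) in zip(labels, labels[1:]):
        if line_a != line_b:
            kinds.append("line-break")
        elif char_a != char_b:
            kinds.append("inter-character")
        else:
            kinds.append("intra-character")
    return kinds
```

For n strokes both functions return n - 1 entries, one per pen-up movement.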

[0024] FIG. 4 is a functional block diagram of the handwritten character input device of this embodiment. When handwritten characters are written on the input surface 21 of the input device 2, the coordinate data sequences of the pen points making up each stroke are output from the input device 2 in stroke order and stored sequentially in the storage device 6. When the user finishes writing and selects the "Recognize" command button 22, the handwritten character recognition program 61 is started. It reads the pen-point coordinate data sequences stored in the storage device 6 and performs writing direction determination, line break position determination, character size determination, splitting and merging of stroke groups, splitting and merging of character elements, and recognition using the dictionary 62. The handwritten character recognition program 61 comprises a writing direction acquisition unit 611, a line break position acquisition unit 612, a standard character size acquisition unit 613, and a frameless handwritten character string recognition unit 614. As shown in FIG. 5, the frameless handwritten character string recognition unit 614 in turn comprises a provisional merging unit 615, a provisional splitting unit 616, and an evaluation and search unit 617.

[0025] The structure and processing of each part of the handwritten character recognition program 61 are described in detail below.
(1) Structure of the pen-point coordinate data sequences stored in the storage device 6: As shown in FIG. 6, the pen-point coordinate data sequence that the input device 2 outputs for each stroke basically consists of a stroke number 631 and the x, y coordinate values 632 of each pen point. During recognition, an inter-stroke relation attribute 633, which records for example which character each stroke belongs to, and a line break position flag 634, which marks a stroke that corresponds to a line break position, are added.
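As a rough picture of this record layout, the four items 631 to 634 could be held in a structure like the one below. The field names and types are assumptions made for the example; FIG. 6 is not reproduced here, and the inter-stroke relation attribute is reduced to a single character index for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrokeRecord:
    stroke_number: int                 # 631: position of the stroke in input order
    points: list[tuple[float, float]]  # 632: x, y coordinates of each pen point
    char_index: Optional[int] = None   # 633: which character the stroke belongs to
    line_break: bool = False           # 634: stroke corresponds to a line break position


# As input arrives, only the stroke number and coordinates are known.
record = StrokeRecord(stroke_number=1, points=[(12.0, 30.0), (12.5, 44.0)])

# The remaining fields are filled in during recognition.
record.char_index = 0
```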

[0026] (2) Writing direction acquisition unit 611: The writing direction acquisition unit 611 determines whether the handwritten pattern is written vertically or horizontally, following the procedure shown in FIGS. 7 and 8. FIG. 7 illustrates back strokes and the vertical/horizontal discrimination vector. As stated above, a back stroke is the vector from the end point of one stroke to the start point of the next. Intuitively, a back stroke is the movement of the pen 1 while it is lifted from the tablet during input of the handwritten pattern. Back strokes can be further classified into intra-character and inter-character back strokes: an intra-character back stroke arises between strokes of the same character, whereas an inter-character back stroke runs from the end point of the last stroke of one character to the start point of the first stroke of the next. In the handwritten pattern of FIG. 7, BS1, BS2, BS4, and BS6 are intra-character back strokes, and BS3 and BS5 are inter-character back strokes.

【００２７】The writing direction acquisition unit 611 takes all stroke groups of the handwritten pattern to be recognized and sums only the rightward components (R3, R4, and R5 in FIG. 7) and the downward components (D6) contained in the back strokes to obtain the vertical/horizontal writing discrimination vector. In FIG. 7, Vtotal is this discrimination vector. In Japanese, inter-character back strokes in a horizontally written character string contain mostly rightward components, whereas those in a vertically written character string contain mostly downward components. Using this property, the writing direction acquisition unit 611 determines vertical or horizontal writing by the procedure shown in FIG. 8.
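As a rough illustration of how the discrimination vector might be accumulated, the sketch below sums only the positive rightward and downward parts of each back stroke. It assumes tablet coordinates in which x grows to the right and y grows downward, and it represents each stroke as a list of (x, y) pen points; the function name is hypothetical.

```python
def discrimination_vector(strokes):
    """Return (rightward_sum, downward_sum) over all back strokes.

    A back stroke runs from the last pen point of one stroke to the
    first pen point of the next stroke in input order. Leftward and
    upward components are ignored, as in the description.
    """
    right = 0.0
    down = 0.0
    for prev, nxt in zip(strokes, strokes[1:]):
        dx = nxt[0][0] - prev[-1][0]
        dy = nxt[0][1] - prev[-1][1]
        if dx > 0:
            right += dx
        if dy > 0:
            down += dy
    return right, down
```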

【００２８】First, the vertical/horizontal writing discrimination vector shown in FIG. 7 is obtained (step 801). Next, the value A obtained by dividing the rightward component of the discrimination vector by its downward component (the ratio of the rightward component to the downward component) is compared with a threshold Th for horizontal writing determination and a threshold Tv for vertical writing determination. If A is equal to or greater than Th, the pattern is judged to be horizontal writing; if A is equal to or less than Tv, it is judged to be vertical writing (step 802). If neither condition holds, the number of written characters is assumed to be small, and the aspect ratio (the ratio of width to height) of the circumscribed rectangle of the entire handwritten pattern is examined: a ratio of 1 or more is judged to be horizontal writing, and a ratio below 1 is judged to be vertical writing (step 803). Accordingly, a pattern such as the one in FIG. 7, for which the ratio of the rightward component to the downward component of the discrimination vector exceeds the threshold Th, is correctly judged to be horizontal writing. Because the writing direction is determined in this way, the user no longer needs to specify it in advance and is spared that inconvenience when writing by hand.
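Steps 802 and 803 can be sketched as follows. The threshold values used as defaults for Th and Tv are placeholders chosen for illustration, not values given in the description, and the handling of a zero downward component or zero height is an assumption of this sketch.

```python
import math


def decide_direction(right, down, bbox_width, bbox_height, th=2.0, tv=0.5):
    """Return "horizontal" or "vertical" for a handwritten pattern.

    right, down: components of the discrimination vector (step 801).
    bbox_width, bbox_height: circumscribed rectangle of the whole pattern.
    th, tv: hypothetical thresholds for horizontal / vertical writing.
    """
    # Step 802: compare A = right / down with the two thresholds.
    if right > 0 or down > 0:
        a = right / down if down > 0 else math.inf
        if a >= th:
            return "horizontal"
        if a <= tv:
            return "vertical"
    # Step 803: too few characters to decide; fall back to the aspect ratio.
    aspect = bbox_width / bbox_height if bbox_height > 0 else math.inf
    return "horizontal" if aspect >= 1 else "vertical"
```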

【００２９】(3) Line Feed Position Acquisition Unit 612
The line feed position acquisition unit 612 takes the multiple stroke groups of the handwritten characters input from the input device 2, obtains a histogram along the writing direction, and selects portions with few pen points as line feed position candidates. It further obtains the vectors from the end point of each stroke to the start point of the stroke adjacent to it in input time, together with the average length of those vectors, compares the length of each vector within the line feed position candidates with that average, and judges the position of any vector exceeding the threshold for line feed determination to be a line feed position. Specifically, as shown in the flowchart of FIG. 11, the line feed position acquisition unit 612 obtains a histogram of the stroke groups along the writing direction, based on the writing direction determined by the writing direction acquisition unit 611 (step 1101). For horizontal writing, as shown in FIG. 9, the positions corresponding to the "valleys" of the histogram 901 are presumed to be line feed positions. Back strokes that straddle a portion of the histogram 901 where the pen point frequency is low (a valley) are therefore selected as line feed position candidates (step 1102).
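Steps 1101 and 1102 might be sketched as below. FIG. 9 is not available in this text, so the sketch assumes the histogram counts pen points projected onto the axis along which lines stack (y for horizontal writing, x for vertical writing); the `axis` argument leaves that choice to the caller. The bin size, the valley criterion, and all function names are assumptions.

```python
def projection_histogram(points, axis, bin_size):
    """Count pen points per bin along one axis (0 = x, 1 = y)."""
    counts = {}
    for p in points:
        b = int(p[axis] // bin_size)
        counts[b] = counts.get(b, 0) + 1
    return counts


def valley_bins(counts, max_count=0):
    """Bins between the first and last occupied bin holding at most max_count points."""
    lo, hi = min(counts), max(counts)
    return [b for b in range(lo, hi + 1) if counts.get(b, 0) <= max_count]


def straddles_valley(start, end, axis, bin_size, valleys):
    """True if the back stroke from start to end crosses over a valley bin."""
    a = int(start[axis] // bin_size)
    b = int(end[axis] // bin_size)
    lo, hi = min(a, b), max(a, b)
    return any(lo < v < hi for v in valleys)
```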

【００３０】A line feed back stroke is a kind of inter-character back stroke: as shown in FIG. 10, it is the back stroke from the end point of the last stroke of the last character on one line to the start point of the first stroke of the first character on the next line. In Japanese, a line feed back stroke in horizontally written text points to the lower left, and one in vertically written text points to the upper left. Therefore, for vertical writing, back strokes pointing to the upper left that straddle a portion of the histogram 901 where the pen point frequency is low (a valley) are selected as line feed back stroke candidates, and for horizontal writing, back strokes pointing to the lower left are selected. Next, for horizontal writing, a selected back stroke whose leftward horizontal component Wcr exceeds the threshold for line feed determination is judged to be a line feed back stroke; for vertical writing, a selected back stroke whose upward vertical component Hcr exceeds the threshold for line feed determination is judged to be a line feed back stroke (step 1103).

【００３１】The magnitudes of the horizontal component Wcr and the vertical component Hcr of a line feed back stroke vary with the number of characters per line. Therefore, when the standard size of one handwritten character is known or can be estimated, as shown in FIG. 10, the determination accuracy improves further if, for horizontal writing, a back stroke is selected as a line feed back stroke when Wcr divided by the width Ws of the standard character size exceeds the threshold, and, for vertical writing, when Hcr divided by the height Hs of the standard character size exceeds the threshold.
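The normalized test for a line feed back stroke can be sketched as follows. The sketch assumes y grows downward, so "lower left" means dx < 0 and dy > 0, and it assumes the back stroke has already been selected as a candidate that straddles a histogram valley. The default threshold of 3.0 character widths is a placeholder, not a value from the description.

```python
def is_line_feed_back_stroke(back, direction, ws, hs, threshold=3.0):
    """Judge one candidate back stroke (dx, dy).

    direction: "horizontal" or "vertical" writing.
    ws, hs: width and height of the standard character size.
    threshold: hypothetical line feed threshold, in character units.
    """
    dx, dy = back
    if direction == "horizontal":
        if not (dx < 0 and dy > 0):   # must point to the lower left
            return False
        wcr = -dx                     # leftward horizontal component
        return wcr / ws > threshold
    if not (dx < 0 and dy < 0):       # vertical writing: must point to the upper left
        return False
    hcr = -dy                         # upward vertical component
    return hcr / hs > threshold
```

Dividing by Ws or Hs makes the threshold independent of how large the user writes, which is the accuracy gain the paragraph describes.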

【００３２】When handwritten characters are written at a slant, the horizontal component Wcr and the vertical component Hcr may not be computable as they stand. In that case, they can be computed normally by applying a correction that transforms the handwritten character pattern into an orthonormal coordinate system. Whether the writing is slanted can be judged, for example, by obtaining the line connecting the centers of the circumscribed rectangles of the handwritten characters and examining its inclination. Because the line feed position is determined in this way, the user no longer needs to specify line feed positions while writing and is spared that inconvenience.

【００３３】(4) Standard Character Size Acquisition Unit 613
The standard character size acquisition unit 613 evaluates the distance between the strokes constituting the multiple stroke groups of the handwritten characters input from the input device 2 according to a predetermined relational expression, and combines strokes whose evaluated distance is smaller than the threshold for temporary combination. By repeating this temporary combination process until no combinable strokes remain, it divides the stroke groups into a plurality of character elements. It then obtains the circumscribed rectangle of each character element and estimates the standard character size of the handwritten characters from the maximum or average height and the maximum or average width of those rectangles. The distance between strokes in the temporary combination process is evaluated as the sum of the parameters shown in FIGS. 12 and 13, each multiplied by a coefficient.

【００３４】Here, as shown in FIG. 12(a), L is the standard size (side length) of one stroke and S is the standard area of one stroke. To estimate them, the circumscribed rectangle of each stroke is obtained, as shown by the broken lines in FIG. 12(b); for each rectangle only the longer of its height and width is selected; and the largest of these values over all strokes is then chosen. The standard size L and the standard area S of one stroke are estimated from this value. In the combination of character elements described later, L is the standard size of one character element and S is the standard area of one character element.

【００３５】
(a) Evaluation parameter = d/L: as shown in FIG. 12(b), the ratio of the displacement d in the writing direction between the circumscribed figures (shown by broken lines) of adjacent strokes to the standard size L of one character.
(b) Evaluation parameter = c/S: as shown in FIG. 12(c), the ratio of the area c of the overlap between the circumscribed figures (shown by broken lines) of adjacent strokes to the standard area S of one character.
(c) Evaluation parameter = d/L: as shown in FIG. 12(d), the ratio of the Euclidean distance d between the centroids of adjacent strokes to the standard size L of one character.
(d) Evaluation parameter = d/L: as shown in FIG. 13(a), the ratio of the displacement d in the writing direction between the centroids of adjacent strokes to the standard size L of one character.
(e) Evaluation parameter = d/L: as shown in FIG. 13(b), the ratio of the Euclidean distance d between the last pen point of the preceding stroke and the first pen point of the following stroke to the standard size L of one character.
(f) Evaluation parameter = d/L: as shown in FIG. 13(c), the ratio of the displacement d in the writing direction between the last pen point of the preceding stroke and the first pen point of the following stroke to the standard size L of one character.

【００３６】At least two of these evaluation parameters are selected in advance. Once the evaluation values for the selected parameters have been obtained, each value is multiplied by a predetermined coefficient, the products are summed, and the sum is compared with the threshold for temporary combination. A pair of strokes whose sum is small is judged to belong to one character; the pair is merged into the same set and selected as one character element candidate. This temporary combination process is repeated recursively until every stroke at or below the threshold has been merged into some character element.
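A minimal sketch of the temporary combination process follows. It picks two of the six parameters, (c) centroid distance over L and (e) end-to-start pen point distance over L, purely for illustration; the weights, the threshold, and the choice to compare only elements adjacent in input order are assumptions of this sketch rather than details given in the description.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def pair_score(a, b, L, w_centroid=1.0, w_gap=1.0):
    """Weighted sum of parameters (c) and (e) for two adjacent elements."""
    p_c = math.dist(centroid(a), centroid(b)) / L   # parameter (c)
    p_e = math.dist(a[-1], b[0]) / L                # parameter (e)
    return w_centroid * p_c + w_gap * p_e


def temporary_combination(strokes, L, threshold):
    """Merge adjacent elements until no pair scores below the threshold.

    Each element is kept as one flat list of pen points in input order,
    so stroke boundaries inside a merged element are not preserved here.
    """
    elements = [list(s) for s in strokes]
    merged = True
    while merged:
        merged = False
        for i in range(len(elements) - 1):
            if pair_score(elements[i], elements[i + 1], L) < threshold:
                elements[i:i + 2] = [elements[i] + elements[i + 1]]
                merged = True
                break
    return elements
```

Re-scoring after every merge reflects the instruction to repeat the process until nothing more can be combined, since a merged element has a different centroid from its parts.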

【００３７】For example, when the kana string "ソフト" is input as shown in FIG. 14(a), the evaluation parameters shown in FIGS. 12(b) to 13(c) are computed for each pair of adjacent strokes among the strokes ST1 to ST5 that make up the string, and all of these parameters are used in an overall evaluation to decide which strokes are combined into one character element. FIG. 14(b) shows example values of the evaluation parameters. The evaluation parameters (a) to (c) in FIG. 14(b) correspond to the evaluation parameters of FIG. 12(a) to (c), and the evaluation parameters (d) to (f) correspond to those of FIG. 13(a) to (c). The smaller a computed evaluation parameter, the stronger the tendency to combine.

【００３８】Suppose that, for the evaluation parameters of FIG. 14(b), the threshold for temporary combination is set to -4.0 and the threshold for temporary division is set to -5.0. The overall evaluation is -3.2 between strokes ST1 and ST2, -5.45 between strokes ST2 and ST3, -7.4 between strokes ST3 and ST4, and -1.41 between strokes ST4 and ST5. Consequently, ST1 and ST2 are combined, ST2 and ST3 are divided, ST3 and ST4 are divided, and ST4 and ST5 are combined.
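The worked example can be replayed in a few lines. The comparison directions below (combine when the overall evaluation is at or above the combination threshold, divide when it is at or below the division threshold) are inferred from the four outcomes stated in the example, not stated as a rule in the text; the "ambiguous" label for values in between follows the three states mentioned later in the description. The function name is hypothetical.

```python
def classify_gap(score, combine_threshold=-4.0, divide_threshold=-5.0):
    """Classify the gap between two adjacent strokes from its overall evaluation."""
    if score >= combine_threshold:
        return "combine"
    if score <= divide_threshold:
        return "divide"
    return "ambiguous"


# Overall evaluations from the example of FIG. 14(b).
scores = {
    ("ST1", "ST2"): -3.2,
    ("ST2", "ST3"): -5.45,
    ("ST3", "ST4"): -7.4,
    ("ST4", "ST5"): -1.41,
}
decisions = {pair: classify_gap(s) for pair, s in scores.items()}
```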

【００３９】By contrast, suppose that "combine" or "divide" were decided by a conventional deterministic method based solely on the degree of overlap in the X-axis (horizontal writing) direction. If, for example, a distance smaller than the distance d2 between strokes ST2 and ST3 were set as the threshold for temporary combination, the distance d3 between strokes ST4 and ST5 satisfies d2 > d3, so strokes ST4 and ST5 would be combined. However, the distance d1 between strokes ST1 and ST2 satisfies d1 > d2, so strokes ST1 and ST2 would be divided while strokes ST2 and ST3 would be combined, and the strokes would not be combined and divided correctly.

【００４０】In the present invention, on the other hand, the combination and division of strokes are decided by an overall evaluation of multiple evaluation parameters, so strokes can be combined and divided accurately. The standard character size acquisition unit 613 combines and divides strokes as described above and determines the candidate character elements. As a result, the multiple stroke groups of the handwritten characters input from the input device 2 are divided into a plurality of character elements, as shown enclosed by broken lines in FIG. 15.

【００４１】Next, the circumscribed rectangle of each character element, as shown by the broken lines in FIG. 15, is obtained, and the size of one character is estimated from the sizes of these rectangles. The height and width of the character are computed separately, and for each circumscribed rectangle only the longer of its height and width is used. Given a handwritten pattern such as that of FIG. 14, H1, H3, H4, H5, and H6 are used to compute the height, and W2 and W7 are used to compute the width. After the data for the computation have been selected, the mean and standard deviation of each data set are obtained; any value whose difference from the mean, divided by the standard deviation, is at or above a threshold is regarded as containing noise and is removed from the data. The maximum or the mean of the remaining data is taken as the estimate of the standard character height or width.

【００４２】If, in the end, either the height H or the width W cannot be computed for lack of data, the value that could be computed is also used for the one that could not. For example, if only the height H can be computed and the width W cannot, the width W is set equal to the height H. In the example of FIG. 14, the character height is computed as H6 and the width as W7. In this way, the character size can be estimated even when the writing direction and the number of lines are not specified. Moreover, if the writing direction and the number of lines are known from the writing direction determination and the line feed position determination, the accuracy of the temporary combination process improves further, and as a result the standard size of the handwritten characters is estimated more accurately.
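The size estimation just described (noise rejection by deviation from the mean, then a maximum or mean, with one dimension standing in for the other when data is missing) could be sketched as below. The z-score threshold of 2.0, the use of the absolute deviation, and the population standard deviation are assumptions of this sketch; the function names are hypothetical.

```python
import statistics


def estimate_dimension(values, z_threshold=2.0, use_max=True):
    """Estimate one dimension (height or width); return None if no data."""
    if not values:
        return None
    if len(values) >= 2:
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        if sd > 0:
            # Drop values whose deviation from the mean, in units of the
            # standard deviation, reaches the threshold (treated as noise).
            values = [v for v in values if abs(v - mean) / sd < z_threshold]
    if not values:
        return None
    return max(values) if use_max else statistics.fmean(values)


def standard_character_size(heights, widths, **kwargs):
    """Return (H, W); a missing dimension borrows the other one's value."""
    h = estimate_dimension(heights, **kwargs)
    w = estimate_dimension(widths, **kwargs)
    if h is None:
        h = w
    if w is None:
        w = h
    return h, w
```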

【００４３】In particular, the standard character size needed to segment the character elements can be estimated accurately even for slanted writing or handwriting with narrow character spacing. For example, when slanted handwritten characters are input as shown in FIG. 16(a), the circumscribed rectangle is obtained for each character element combined or divided by the temporary combination process as shown in FIG. 16(b), and the size of one character is estimated from the sizes of these rectangles. The standard character size can therefore be estimated accurately even for slanted writing.

【００４４】(5) Frameless Handwritten Character String Recognition Unit 614
As shown in detail in FIG. 5, the frameless handwritten character string recognition unit 614 consists of a temporary combination processing unit 615, a temporary division processing unit 616, and an evaluation/search processing unit 617. The processing in the temporary combination processing unit 615 is the same as the temporary combination process in the standard character size acquisition unit 613, but its purpose differs: the temporary combination process in the standard character size acquisition unit 613 combines individual strokes to create "character elements that clearly belong to one character", whereas the process in the temporary combination processing unit 615 refers to the estimated standard character size and further combines those character elements. The evaluation parameters and procedure used to combine character elements can be exactly the same as those of the temporary combination process in the standard character size acquisition unit 613, except that the standard size L is the larger dimension of the circumscribed rectangle of one character element and the standard area S is the area of a square whose side is the standard size L. Evaluation parameters set specifically for combining character elements may also be used.

【００４５】Through this recursive temporary combination of character elements, for the kanji "問" shown in FIG. 17, for example, the character element "口" inside the "門" (gate) enclosure is combined into the enclosure even though it is the last character element written, yielding the character element set for the single kanji "問". Once character elements have been further combined and new character element sets created, the temporary division processing unit 616 performs temporary division. Temporary division evaluates the distance between character elements and, between character elements whose distance exceeds the threshold for temporary division, sets an attribute flag indicating that a character boundary lies there. The distance between character elements is evaluated in the same way as in the temporary combination process described above.

【００４６】As a result of this process, for two character elements between which the character boundary attribute flag has been set, the gap between the last stroke of the earlier-written character element and the first stroke of the later-written character element is in the state "clearly a character boundary". In FIG. 6, this attribute flag is illustrated by the sequence number of the character. Any other representation of the attribute flag may be used. The temporary combination and temporary division in the frameless character string recognition unit 614 serve to reduce the search space for the subsequent evaluation/search processing unit 617, so they can be omitted when processing time is not a concern (when fast processing is not required).

【００４７】Next, the evaluation/search processing unit 617 searches the dictionary 62 with each character element set, determines the character whose evaluation value against the handwritten character patterns registered in the dictionary 62 is highest, and outputs the code of that character to the display device 3 as the recognition result; the display device 3 displays the character corresponding to the character code. When the temporary division processing unit 616 has finished, the state between every pair of adjacent strokes in the handwritten pattern input from the input device 2 is one of "clearly within one character", "clearly a character boundary", or "ambiguous". One "cutout pattern" is defined by deciding, for each "ambiguous" state remaining at this stage, whether it is treated as within one character or as a character boundary. If the number of "ambiguous" states is n, the search space contains 2^n cutout patterns.
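To make the size of the search space concrete, the following sketch enumerates every cutout pattern for a list of gap states. The state labels "join", "boundary", and "ambiguous" and the function name are illustrative choices, not terms fixed by the description.

```python
from itertools import product


def cutout_patterns(gap_states):
    """Yield every resolution of the ambiguous gaps.

    gap_states: one label per gap between adjacent strokes, each
    "join" (clearly within one character), "boundary" (clearly a
    character boundary), or "ambiguous".
    """
    ambiguous = [i for i, s in enumerate(gap_states) if s == "ambiguous"]
    for choice in product(("join", "boundary"), repeat=len(ambiguous)):
        resolved = list(gap_states)
        for idx, decision in zip(ambiguous, choice):
            resolved[idx] = decision
        yield resolved
```

With n ambiguous gaps the generator yields 2^n patterns, which is why the earlier temporary combination and division steps matter: every gap they settle halves the search space.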

【００４８】The evaluation/search process searches all the cutout patterns in the search space for the one that maximizes the evaluation value described below. Existing search methods such as dynamic programming, exhaustive search, and beam search can be used. In this embodiment, the search space is represented as a binary tree, as shown in FIG. 18, and a beam search is performed over that tree. The evaluation value of a cutout pattern is the sum of the following evaluation parameters, each multiplied by a coefficient.
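A beam search over the binary tree of decisions might look like the sketch below. It is a generic beam search, not the embodiment's exact procedure: the scoring callback `step_score`, which would in practice combine dictionary distance, transition probability, and the other evaluation parameters, is left as a hypothetical function supplied by the caller, and the beam width is an arbitrary default.

```python
def beam_search(n_ambiguous, step_score, beam_width=3):
    """Search the binary tree of boundary/join decisions.

    step_score(k, decision, path) returns the score contribution of
    taking `decision` at the k-th ambiguous gap after the earlier
    decisions in `path`. Higher totals are better.
    Returns (best_total_score, best_path).
    """
    beam = [(0.0, ())]
    for k in range(n_ambiguous):
        candidates = []
        for score, path in beam:
            for decision in ("boundary", "join"):  # left branch, right branch
                candidates.append(
                    (score + step_score(k, decision, path), path + (decision,))
                )
        # Keep only the best few partial paths at each depth.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0]
```

Because only `beam_width` partial paths survive at each depth, the cost grows linearly with the number of ambiguous gaps instead of exponentially, at the price of possibly missing the global optimum.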

【００４９】
(a) An evaluation parameter obtained from the distance between each cut-out handwritten pattern and the handwritten patterns registered in the dictionary.
(b) An evaluation parameter obtained from the transition probabilities between the recognized characters.
(c) An evaluation parameter obtained from the ratio of the size of each cut-out handwritten pattern to the standard character size.
(d) An evaluation parameter obtained, for adjacent strokes judged to be within one character, from the evaluation value of the distance between character elements and the threshold for temporary combination.
(e) An evaluation parameter obtained, for adjacent strokes judged to be a character boundary, from the evaluation value of the distance between character elements and the threshold for temporary division.

【００５０】In FIG. 18, the dash-dotted lines indicate portions where it is ambiguous whether a character boundary exists; the broken-line arrows indicate that character elements are divided by the division process, and the solid-line arrows indicate that they are combined by the combination process. For example, if the handwritten characters "晴れ" are combined at one ambiguous portion and then divided at another, they are recognized as the characters "晴れ". If that second ambiguous portion is also combined, however, recognition is not possible. The evaluation/search processing unit 617 takes the portions where the combination relationship between character elements is ambiguous in order from the left, proceeding to the left branch when a portion is judged to be a character boundary and to the right branch when it is judged to be within one character. While evaluating, by the method described below, how plausible each node of the binary tree of FIG. 18 is as a Japanese character string, it searches the leaves of the tree for the most plausible one and takes the character string corresponding to that leaf as the recognition result. This corresponds to evaluation method (c) above. By Bayes' theorem, the probability that a handwritten pattern X is the character string C can be expressed by Equation 1 below.

【００５１】[0051]

【数１】 (Equation 1)
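Equation 1 is an image in the published document and does not survive in this text. Since the description invokes Bayes' theorem and defines P(X|C), P(C), and P(X), the equation presumably takes the standard form below; this is a reconstruction from those definitions, not a copy of the original figure.

```latex
P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)}
```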

【００５２】Here, P(X) is the probability that event X occurs, and P(X|Y) is the conditional probability that event X occurs given event Y. That is, P(X|C) is the probability that the character string C is written as the pattern X; P(C) is the probability that the character string C is written; and P(X) is the probability that the pattern X is written (it is independent of C and is therefore treated as a constant). P(C) can be approximated by Equation 2.

【００５３】[0053]

【数２】 (Equation 2)

【００５４】Here, P(C_{i+1} | C_i) is the probability that the i-th character and the (i+1)-th character are written consecutively, and it is obtained from a table prepared in advance from statistics. N is the number of characters. P(X|C) can be approximated by Equation 3.

【００５５】[0055]

【数３】 (Equation 3)

【００５６】Here, P(X_i | C_i) is the probability that the i-th character C_i in the character string C is written as the i-th handwritten pattern X_i obtained by dividing the handwritten pattern X into individual characters; it is obtained by comparing the dictionary pattern corresponding to the character C_i with the handwritten pattern X_i using an online framed character recognition device. P(boundary or combination | d_k) is the probability that, when the distance between the k-th character element and the (k+1)-th character element is d_k, the gap between the two character elements is a character boundary, or the probability that both belong to one character. Which of the two probabilities is used depends on how the handwritten pattern X is divided.

【００５７】評価中の手書きパターンの分割法で、ｋ番
目の文字要素と（ｋ＋１）番目の文字要素が、１文字に
含まれていなければ文字になる確率を、１文字に含まれ
ていれば１文字に含まれる確率を求める。Ｐ（ＳＩＺＥ
_ｉ│標準サイズ）は、１文字の標準の大きさが標準サイ
ズである時の、ｉ番目の文字の大きさＳＩＺＥｉの確か
らしさである。次に、コンピュータで計算することを考
慮した場合、「数３」では乗算が多く、（２ｉ＋ｋ）回
の乗算が必要になる。そこで、「数３」を「数４」に示
すような対数項を持つ計算式に置き換え、この「数４」
の計算結果を統計的評価値として採用する。Under the division of the handwritten pattern being evaluated, if the k-th character element and the (k+1)-th character element are not contained in the same character, the probability that the gap between them is a character boundary is used; if they are contained in the same character, the probability that they belong to one character is used. P(SIZE_i | standard size) is the likelihood of the size SIZE_i of the i-th character given the standard size of one character. When the calculation is performed on a computer, "Equation 3" involves many multiplications: (2i + k) multiplications are required. "Equation 3" is therefore replaced by a formula with logarithmic terms, shown as "Equation 4", and the result of "Equation 4" is adopted as the statistical evaluation value.

【００５８】[0058]

【数４】 (Equation 4)
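The equation image is not reproduced in this text. As a hedged reconstruction only, assuming that "Equation 4" is simply the logarithm of the product of the shape, distance, and size probabilities described for "Equation 3", the products become sums. The index ranges, and whether the language term P(C) is also included, are assumptions that cannot be confirmed from the text.

```latex
\log P(X \mid C) \approx
  \sum_{i} \log P(X_i \mid C_i)
  + \sum_{k} \log P(\text{delimiter or join} \mid d_k)
  + \sum_{i} \log P(\mathrm{SIZE}_i \mid \text{standard size})
```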

【００５９】このように日本語としての確からしさを評
価し、その評価値が最大となる文字を認識結果として出
力することにより、文字間隔が不揃いな手書き文字、斜
めに傾いて筆記された手書き文字が存在したとしても、
複数行にわたる文字列の文脈に適合する認識結果が得ら
れ、文字単位の認識では得られない高精度の認識結果を
一括して得ることができる。例えば、図１６（ａ）の手
書き文字は同図（ｃ）に示すような文字要素の結合によ
って正しく認識される。By evaluating the likelihood as Japanese in this way and outputting the characters with the maximum evaluation value as the recognition result, a recognition result that fits the context of a character string spanning multiple lines is obtained even when the handwriting contains irregularly spaced characters or characters written at a slant, and high-precision recognition results that cannot be obtained by character-by-character recognition can be obtained in a single pass. For example, the handwritten characters in FIG. 16(a) are correctly recognized by combining the character elements as shown in FIG. 16(c).
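The evaluation described above, summing logarithms of probabilities and choosing the candidate with the maximum value, can be sketched as follows. This is a minimal illustration rather than the method of the specification: the function names, the dictionary keys, and the choice to add the character-pair (bigram) probabilities to the same sum are assumptions made for the example, and the probability values would in practice come from the recognizer and from statistics tables.

```python
import math


def log_score(shape_probs, gap_probs, size_probs, bigram_probs):
    """Return the sum of log-probabilities for one segmentation hypothesis.

    shape_probs  : P(X_i | C_i) for each character (hypothetical input)
    gap_probs    : P(delimiter or join | d_k) for each adjacent element pair
    size_probs   : P(SIZE_i | standard size) for each character
    bigram_probs : P(C_{i+1} | C_i) for each adjacent character pair
    """
    terms = list(shape_probs) + list(gap_probs) + list(size_probs) + list(bigram_probs)
    # A zero probability makes the hypothesis impossible; avoid log(0).
    if any(p <= 0.0 for p in terms):
        return float("-inf")
    # Summing logarithms replaces the chain of multiplications.
    return sum(math.log(p) for p in terms)


def best_hypothesis(hypotheses):
    """Pick the hypothesis with the maximum log-score."""
    return max(
        hypotheses,
        key=lambda h: log_score(h["shape"], h["gap"], h["size"], h["bigram"]),
    )


if __name__ == "__main__":
    candidates = [
        {"text": "AB", "shape": [0.9, 0.8], "gap": [0.7], "size": [0.9, 0.9], "bigram": [0.5]},
        {"text": "CD", "shape": [0.5, 0.4], "gap": [0.3], "size": [0.6, 0.6], "bigram": [0.1]},
    ]
    print(best_hypothesis(candidates)["text"])
```

Working in the log domain also avoids numerical underflow when many small probabilities are combined, which is a common reason for this substitution in addition to the reduction in multiplications.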

【００６０】なお、本発明は、上記実施形態に限定され
るものではなく、筆記方向取得部６１１、改行位置取得
部６１２、標準文字サイズ取得部６１２、枠無し手書き
文字認識部６１４における処理を新規の要素技術とし
て、既存の文字認識処理の中に組み込んで構成すること
ができる。また、手書き文字認識プログラムは、ＣＤ・
ＲＯＭ等の記録媒体に格納されてユーザに提供される。
または、インタネット等の通信媒体を通じて有償で提供
される。The present invention is not limited to the above embodiment; the processing in the writing direction acquisition unit 611, the line feed position acquisition unit 612, the standard character size acquisition unit 613, and the frameless handwritten character recognition unit 614 can be incorporated into existing character recognition processing as new elemental technologies. The handwritten character recognition program is provided to the user stored on a recording medium such as a CD-ROM, or is provided for a fee through a communication medium such as the Internet.

【００６１】以上説明したように、本実施の形態によれ
ば、電子黒板等に筆記方向が指定されずに筆記された手
書き文字の筆記方向を正確に判定し、その判定結果に従
って手書き文字を認識することができる。また、電子黒
板等に改行位置が指定されずに筆記された手書き文字の
改行位置を正確に判定し、その判定結果に従って複数行
に渡る手書き文字を認識することができる。さらに、斜
め書きや文字間隔が狭い手書き文字であっても、各文字
要素の切り出しを正確に行い、その切り出し結果に従っ
て任意行の手書き文字を認識することができる。また、
縦書き横書きの種別、行数、筆記枠の有無に関係なく、
電子黒板等に筆記された手書き文字を高精度で認識する
ことができる。以上、本発明者によってなされた発明
を、前記実施の形態に基づき具体的に説明したが、本発
明は、前記実施の形態に限定されるものではなく、その
要旨を逸脱しない範囲において種々変更可能であること
は勿論である。As described above, according to the present embodiment, the writing direction of handwritten characters written on an electronic blackboard or the like without a specified writing direction can be accurately determined, and the handwritten characters can be recognized according to the determination result. Likewise, the line feed positions of handwritten characters written without specified line feed positions can be accurately determined, and handwritten characters spanning multiple lines can be recognized according to the determination result. Furthermore, even for slanted writing or handwritten characters with narrow character spacing, each character element can be segmented accurately and the handwritten characters on an arbitrary line can be recognized according to the segmentation result. In addition, handwritten characters written on an electronic blackboard or the like can be recognized with high accuracy regardless of vertical or horizontal writing, the number of lines, or the presence or absence of a writing frame. The invention made by the present inventors has been described specifically on the basis of the above embodiment; however, the present invention is not limited to that embodiment and can of course be modified in various ways without departing from its gist.

【００６２】[0062]

【発明の効果】本願において開示される発明のうち代表
的なものによって得られる効果を簡単に説明すれば、下
記の通りである。本発明によれば、斜め書きや文字間隔
が狭い手書き文字であっても、各文字要素の切り出しを
正確に行い、その切り出し結果に従って任意行の手書き
文字を認識することが可能となる。The effects obtained by representative aspects of the invention disclosed in the present application are briefly described as follows. According to the present invention, even for slanted writing or handwritten characters with narrow character spacing, each character element can be segmented accurately, and the handwritten characters on an arbitrary line can be recognized according to the segmentation result.

[Brief description of the drawings]

【図１】本発明を適用した手書き文字認識装置の実施形
態を示すブロック構成図である。FIG. 1 is a block diagram showing an embodiment of a handwritten character recognition apparatus to which the present invention is applied.

【図２】手書き文字入力装置の入力面に筆記された手書
き文字の一例を示す説明図である。FIG. 2 is an explanatory diagram illustrating an example of a handwritten character written on an input surface of a handwritten character input device.

【図３】手書き文字の中のデータの単位を示す説明図で
ある。FIG. 3 is an explanatory diagram showing a unit of data in a handwritten character.

【図４】図１の手書き文字認識装置の機能構成図であ
る。FIG. 4 is a functional configuration diagram of the handwritten character recognition device of FIG. 1;

【図５】枠無し文字列認識部の詳細構成図である。FIG. 5 is a detailed configuration diagram of a frameless character string recognition unit.

【図６】記憶装置に格納される手書き文字のデータ構成
の一例を示す図である。FIG. 6 is a diagram illustrating an example of a data configuration of a handwritten character stored in a storage device.

【図７】縦書き横書き判別ベクトルの説明図である。FIG. 7 is an explanatory diagram of a vertical / horizontal writing discrimination vector.

【図８】縦書き横書きの判別処理を示すフローチャート
である。FIG. 8 is a flowchart illustrating a process of determining vertical writing and horizontal writing.

【図９】改行位置の判別に使用するヒストグラムの例を
示す説明図である。FIG. 9 is an explanatory diagram showing an example of a histogram used for determining a line feed position.

【図１０】改行裏ストロークの説明図である。FIG. 10 is an explanatory diagram of a line feed back stroke.

【図１１】改行位置の判定処理を示すフローチャートで
ある。FIG. 11 is a flowchart illustrating a line feed position determination process.

【図１２】ストローク間の仮結合処理に用いる評価パラ
メータの説明図である。FIG. 12 is an explanatory diagram of evaluation parameters used for a temporary connection process between strokes.

【図１３】ストローク間の仮結合処理に用いる評価パラ
メータの説明図である。FIG. 13 is an explanatory diagram of evaluation parameters used for a temporary connection process between strokes.

【図１４】ストロークの仮結合処理の対象となる入力ス
トロークの例と評価パラメータの算出例を示す説明図で
ある。FIG. 14 is an explanatory diagram showing an example of an input stroke to be subjected to a temporary connection process of strokes and an example of calculation of an evaluation parameter.

【図１５】文字要素の外接矩形から標準文字サイズを推
定する処理の説明図である。FIG. 15 is an explanatory diagram of a process of estimating a standard character size from a circumscribed rectangle of a character element.

【図１６】斜め書きの手書き文字の文字要素への仮結合
処理の一例を示す図である。FIG. 16 is a diagram illustrating an example of a process of temporarily combining obliquely written handwritten characters into character elements.

【図１７】文字要素の再帰的な処理によって結合可能な
手書き文字の一例を示す説明図である。FIG. 17 is an explanatory diagram showing an example of handwritten characters that can be combined by recursive processing of character elements.

【図１８】手書き文字を辞書内で探索する際に用いる２
分木の一例を示す説明図である。FIG. 18 is an explanatory diagram showing an example of a binary tree used when searching for handwritten characters in a dictionary.

[Explanation of symbols]

１…ペン、２…手書き文字入力装置、３…表示装置、４
…ＣＰＵ、６…記憶装置、２１…手書き文字の入力面、
６１…手書き文字認識プログラム、６２…辞書、６１１
…筆記方向取得部、６１２…改行位置取得部、６１３…
標準文字サイズ取得部、６１４…枠無し手書き文字列認
識部、６１５…仮結合処理部、６１６…仮分割処理部、
６１７…評価・探索処理部。1: pen, 2: handwritten character input device, 3: display device, 4: CPU, 6: storage device, 21: input surface for handwritten characters, 61: handwritten character recognition program, 62: dictionary, 611: writing direction acquisition unit, 612: line feed position acquisition unit, 613: standard character size acquisition unit, 614: frameless handwritten character string recognition unit, 615: temporary combination processing unit, 616: temporary division processing unit, 617: evaluation/search processing unit.

フロントページの続き Ｆターム(参考） 5B029 AA01 CC21 EE08 5B064 AB04 BA05 DD06 EA34
Continued from the front page. F-terms (reference): 5B029 AA01 CC21 EE08; 5B064 AB04 BA05 DD06 EA34

Claims

[Claims]

1. A handwritten character recognition method for recognizing a plurality of handwritten character strings composed of a plurality of stroke groups input in stroke order from a handwritten character input device, the method comprising: evaluating the distance between strokes constituting the plurality of stroke groups according to a predetermined relational expression; dividing the plurality of stroke groups into a plurality of character elements by repeating a process of combining strokes whose evaluated distance is smaller than a threshold for provisional combination until no combinable strokes remain; obtaining the circumscribed rectangle of each character element and estimating the maximum or average value of the heights of the circumscribed rectangles and the maximum or average value of the widths of the circumscribed rectangles as the standard character size of the handwritten characters; calculating, according to a predetermined relational expression, a parameter representing the relationship between adjacent character elements in a space of the estimated standard character size; dividing the plurality of character elements into a plurality of character element sets by repeating a process of combining character elements whose parameter is smaller than the threshold for provisional combination until no combinable character elements remain; and searching a dictionary for the character element sets and outputting, as a recognition result, the character having the maximum evaluation value with respect to the handwritten character patterns registered in the dictionary.
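For illustration only, the repeated provisional combination and the size estimate recited in claim 1 can be sketched as follows. This is not the claimed relational expression: the gap between bounding boxes is used here as a stand-in distance, all pairs are compared rather than only neighbours in stroke order, and the names `Box`, `gap`, `merge_until_stable`, and `standard_size` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Circumscribed rectangle of a stroke or character element."""
    left: float
    top: float
    right: float
    bottom: float


def gap(a, b):
    """Distance between two boxes (0 if they overlap); an assumed stand-in metric."""
    dx = max(a.left - b.right, b.left - a.right, 0.0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0.0)
    return math.hypot(dx, dy)


def union(a, b):
    """Smallest box enclosing both inputs."""
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def merge_until_stable(boxes, threshold):
    """Repeat merging of boxes closer than the threshold until none can be merged."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if gap(boxes[i], boxes[j]) < threshold:
                    boxes[i] = union(boxes[i], boxes[j])
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan because the list changed
    return boxes


def standard_size(boxes):
    """Estimate (width, height) as the maximum over all element boxes."""
    return (max(b.right - b.left for b in boxes),
            max(b.bottom - b.top for b in boxes))


if __name__ == "__main__":
    strokes = [Box(0, 0, 10, 10), Box(11, 0, 20, 10), Box(50, 0, 60, 12)]
    elements = merge_until_stable(strokes, threshold=5)
    print(elements, standard_size(elements))
```

The claim allows either the maximum or the average of the rectangle dimensions; the sketch uses the maximum only to keep the example short.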

2. The handwritten character recognition method according to claim 1, wherein an attribute flag indicating a character delimiter is set for each character element whose parameter representing the relationship between character elements is larger than a threshold for provisional division, the plurality of character elements are divided, by means of the attribute flag, into portions whose character boundaries are definite, and the dictionary is searched with reference to this division.