JPS6084683A - Character recognizing system

Info

Publication number: JPS6084683A
Application number: JP58193134A
Authority: JP
Inventors: Mitsumasa Sugiyama; 杉山　光正
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1983-10-14
Filing date: 1983-10-14
Publication date: 1985-05-14

Abstract

PURPOSE: To keep the classification error rate of basic strokes low with a small amount of processing, and to obtain a high recognition rate, by limiting the group of basic strokes to be collated according to a recognition mode such as kana (Japanese syllabary), kanji (Chinese characters), or alphabetic characters. CONSTITUTION: Recognition mode setting information is sent to a recognition mode setting part 5, where the recognition mode is set. A preprocessing part 4 applies processing such as noise elimination and smoothing to the input character information, then segments strokes using the pen-touch on/off information of an input pen 2 and sends the stroke information for each stroke to a stroke recognizing part 6. The stroke recognizing part 6 collates the stroke information obtained from the preprocessing part 4 against the basic stroke patterns registered in a basic stroke dictionary part 7, in accordance with the recognition mode set by the recognition mode setting part 5. In this way, the input stroke is classified as one of the basic strokes.

Description

DETAILED DESCRIPTION OF THE INVENTION

(Technical Field) The present invention relates to a character recognition system that recognizes characters based on information about the strokes that constitute a character.

(Prior Art) Conventionally, characters have been recognized by classifying each input stroke into one of a set of basic strokes prepared in advance, based on information about the strokes that constitute a character, and then recognizing the character from the resulting set of basic strokes. Japanese, however, uses many kinds of characters, such as kanji, hiragana, katakana, alphabetic characters, and numerals, so many basic strokes are needed to classify their strokes. Yet many of the strokes that make up hiragana are not used in kanji, katakana, alphabetic characters, or numerals. Likewise, some of the strokes that make up alphabetic characters and numerals are not used in other characters. When there are this many kinds of characters to be recognized, the number of basic strokes that must be provided inevitably grows, and the basic-stroke classification errors that occur when input strokes are classified into basic strokes have lowered the recognition rate.

(Purpose) An object of the present invention is to provide a character recognition system that limits the group of basic strokes to be matched according to a recognition mode such as kana, kanji, or alphabetic characters, thereby keeping the basic-stroke classification error rate low with a small amount of processing and achieving a high recognition rate.

(Embodiment) An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a character recognition device that is an embodiment of the present invention. In the figure, 3 is a character information input device for inputting the character information to be recognized. It consists of a tablet 1 and an input pen 2, and input is performed by drawing the character information to be recognized on the tablet 1 with the input pen 2. 4 is a preprocessing unit that applies noise removal, normalization, and the like to the character information input from the character information input device 3. 5 is a recognition mode setting unit for setting which mode, such as hiragana mode, katakana mode, alphabet mode, or numeral mode, the input characters to be recognized belong to. 6 is a stroke recognition unit that recognizes input strokes from stroke information. 7 is a basic stroke dictionary unit in which the representative stroke patterns of the basic strokes used for stroke recognition are registered. 8 is a character information processing unit that processes the length, positional relationships, and the like of each input stroke. 9 is a character recognition unit that recognizes the input character from the results obtained from the stroke recognition unit 6 and the character information obtained from the character information processing unit 8. 10 is a character dictionary unit in which multiple kinds of character patterns are stored. 11 is an output unit that outputs the result recognized by the character recognition unit 9.

FIG. 2 shows an example of the basic strokes. For each basic stroke it lists the stroke id number, the representative stroke, and whether the stroke can be one of the strokes that constitute a character of each mode. The arrow on a representative stroke indicates the direction of pen movement. A basic stroke marked with "○" in a mode column can be a stroke that constitutes a character of that mode.
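
As a minimal sketch of how a table like FIG. 2 could be held in memory, the following Python models each basic stroke as an id number, a representative pattern, and the set of modes marked in its row. The three entries, the direction-code encoding, and the names `BasicStroke`, `BASIC_STROKES`, and `strokes_for_mode` are illustrative assumptions and are not values taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicStroke:
    stroke_id: int      # stroke id number
    directions: tuple   # representative stroke as pen-direction codes (assumed encoding)
    modes: frozenset    # modes whose column is marked for this stroke


# Hypothetical entries; the actual table contents are those of FIG. 2.
ALL_MODES = frozenset({"hiragana", "katakana", "kanji", "numeral", "alphabet"})
BASIC_STROKES = [
    BasicStroke(1, ("E",), ALL_MODES),
    BasicStroke(2, ("S",), ALL_MODES),
    BasicStroke(3, ("S", "E"), frozenset({"kanji", "alphabet"})),
]


def strokes_for_mode(mode):
    """Return the basic strokes that can occur in characters of `mode`."""
    return [s for s in BASIC_STROKES if mode in s.modes]


print([s.stroke_id for s in strokes_for_mode("hiragana")])  # [1, 2]
```

Storing the marked modes as a set per stroke keeps the per-mode lookup a simple membership test, which mirrors reading down one mode column of the table.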

Next, this embodiment is described with reference to FIGS. 1 and 2.

When the operator writes a character on the tablet 1 with the input pen 2, the coordinates of the pen tip of the input pen 2 on the tablet 1, together with information on whether or not the pen tip is touching the tablet, are sent to the preprocessing unit 4 at fixed time intervals. The recognition mode is set by pressing a key (not shown) provided on the character information input device 3, by touching a predetermined area on the tablet 1 with the input pen 2, or by similar means. The recognition mode setting information is sent to the recognition mode setting unit 5, and the recognition mode is set. The preprocessing unit 4 applies processing such as noise removal and smoothing to the input character information, then segments the strokes using the pen-touch on/off information of the input pen 2 and sends the stroke information for each stroke to the stroke recognition unit 6. It also sends the length of each stroke, the start and end points of each stroke, the coordinates of the points at which the movement direction of the input pen 2 changes, the presence or absence of intersections between strokes, and the like to the character information processing unit 8. The stroke recognition unit 6 collates the stroke information obtained from the preprocessing unit 4 against the basic stroke patterns registered in the basic stroke dictionary unit 7, in accordance with the recognition mode set by the recognition mode setting unit 5, and classifies the input stroke as one of the basic strokes.
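
The stroke segmentation step described above can be sketched as follows. The sketch assumes each sample is an `(x, y, pen_down)` tuple taken at a fixed interval; that sample format and the function name `segment_strokes` are assumptions for illustration, and noise removal and smoothing are omitted.

```python
def segment_strokes(samples):
    """Split (x, y, pen_down) samples into strokes at pen-up boundaries."""
    strokes = []
    current = []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: the points collected so far form one stroke.
            strokes.append(current)
            current = []
    if current:
        # Input ended while the pen was still down.
        strokes.append(current)
    return strokes


samples = [
    (0, 0, True), (1, 0, True), (2, 0, True),  # first stroke
    (2, 1, False),                             # pen up
    (1, 2, True), (1, 3, True),                # second stroke
]
strokes = segment_strokes(samples)
print(len(strokes))  # 2
```

Each resulting stroke is a list of points, from which quantities such as stroke length, start point, and end point could then be derived for the character information processing unit.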

Suppose that the recognition mode set in the recognition mode setting unit is hiragana. The input stroke is then collated against, and classified into, the basic stroke patterns with stroke id numbers 1, 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 26, 27, 28, 29, and 30 in FIG. 2. When the recognition mode is numeral, the input stroke is collated against, and classified into, the basic stroke patterns with stroke id numbers 1, 2, 5, 8, 12, 13, 14, 19, 20, 21, 22, 27, 29, 31, 65, and 54. The same applies to the other recognition modes.

When the stroke recognition unit 6 has finished processing all the strokes of one character in this way, the character recognition unit 9 obtains the id number of each input stroke from the stroke recognition unit 6; character information such as stroke position information, stroke intersection information, and stroke length information from the character information processing unit 8; and the recognition mode from the recognition mode setting unit 5. It collates these against the character patterns registered in the character dictionary unit 10 and outputs the recognition result from the output unit 11.

The embodiment above described separate recognition modes such as hiragana mode, kanji mode, katakana mode, numeral mode, and alphabet mode. When the user sets the recognition mode, however, the fewer the modes, the smaller the burden on the user. Several recognition modes may therefore be combined into one, so that, for example, a hiragana/kanji mode and a katakana/alphanumeric mode are provided. In this case, the basic strokes to be matched may be limited not only by the recognition mode but also by the number of strokes in the input character and the number of input strokes; for example, if the input character has 7 or more strokes, the group of basic strokes to be matched may be limited to the basic strokes that can form strokes of kanji.
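
The following is a minimal sketch of limiting the candidate basic strokes by recognition mode and by stroke count before matching. The table contents, the direction-code encoding, the toy distance function, and all names are assumptions; only the rule that 7 or more strokes narrows the candidates to kanji-capable basic strokes comes from the text, and applying it only when the combined mode includes kanji is an interpretation.

```python
# Hypothetical basic stroke table: id -> (direction codes, modes where usable).
BASIC_STROKES = {
    1: (("E",), {"hiragana", "kanji"}),
    2: (("S",), {"hiragana", "kanji"}),
    3: (("S", "E"), {"kanji"}),
    4: (("E", "S", "W"), {"hiragana"}),
}

KANJI_STROKE_THRESHOLD = 7  # from the text: 7 or more strokes is treated as kanji


def candidate_ids(combined_mode, stroke_count):
    """Select the basic stroke ids to match against for a combined mode."""
    modes = set(combined_mode)
    # Assumed interpretation: the stroke-count rule applies only if kanji is allowed.
    if "kanji" in modes and stroke_count >= KANJI_STROKE_THRESHOLD:
        modes = {"kanji"}
    return sorted(i for i, (_, usable) in BASIC_STROKES.items() if usable & modes)


def distance(a, b):
    """Toy dissimilarity: mismatched positions plus length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def classify(directions, combined_mode, stroke_count):
    """Classify one input stroke as the nearest basic stroke among the candidates."""
    ids = candidate_ids(combined_mode, stroke_count)
    return min(ids, key=lambda i: distance(directions, BASIC_STROKES[i][0]))


mode = {"hiragana", "kanji"}
print(candidate_ids(mode, 3))  # [1, 2, 3, 4]
print(candidate_ids(mode, 8))  # [1, 2, 3]
```

Because the candidate set is computed before any matching, a narrower mode both reduces the number of comparisons and removes basic strokes that could otherwise be chosen by mistake.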

(Effects) As is clear from the above description, according to the present invention the group of basic strokes to be matched is limited by the recognition mode, so a high stroke recognition rate is obtained with a small amount of processing, and the character recognition rate can be improved.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a character recognition device that is an embodiment of the present invention, and FIG. 2 is a diagram showing the basic strokes stored in the basic stroke dictionary unit shown in FIG. 1. 3 is a character information input device, 5 is a recognition mode setting unit, 6 is a stroke recognition unit, 7 is a basic stroke dictionary unit, 9 is a character recognition unit, and 10 is a character dictionary unit.

Applicant: Canon Inc.

Claims

[Claims]

A character recognition device that performs character recognition using information about the strokes that make up the characters.