JPH06348899A

JPH06348899A - Character recognizing device

Info

Publication number: JPH06348899A
Application number: JP5140747A
Authority: JP
Inventors: Toru Miyamae; 徹宮前; Koichi Higuchi; 浩一樋口
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1993-06-11
Filing date: 1993-06-11
Publication date: 1994-12-22
Anticipated expiration: 2014-06-07
Also published as: JP2902904B2

Abstract

PURPOSE: To provide a character recognizing device that achieves highly accurate and stable recognition even for low-quality character patterns whose local line width varies widely. CONSTITUTION: Four kinds of sub-patterns, one per scanning direction, representing the distribution of character strokes are extracted from the original character pattern on the basis of its average line width, and a faint pattern consisting of the pixels that belong to none of the extracted sub-patterns is then extracted. Minute segments are removed from the faint pattern, it is judged whether sub-patterns need to be extracted from what remains, and if so, faint sub-patterns based on the average line width of the faint pattern are extracted. Either the sub-patterns or the combined sub-patterns obtained by merging each sub-pattern with the corresponding faint sub-pattern are then output to a feature extraction unit 135, features of the character pattern are extracted, and the character is recognized by collating those features with the features in a dictionary.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a high-accuracy character recognition device that copes with character patterns whose line width differs locally, for example where part of a stroke has become faint.

[0002]

[Prior Art] A conventional character recognition device that extracts features from an input character pattern and outputs a recognition result by collating them with a dictionary prepared in advance is disclosed in, for example, Japanese Examined Patent Publication (Kokoku) No. 60-38756. The processing performed by that device is outlined below.

[0003] First, the brightness of each cell of the input character pattern is converted by photoelectric conversion into a binary image, that is, a quantized electric signal, and stored in a pattern register. The circumscribing frame of the character pattern in the pattern register is then detected, and the line width of the character pattern inside the frame is calculated. Next, the character pattern inside the frame is scanned in the horizontal, vertical, right diagonal, and left diagonal directions, and runs of continuous black pixels are detected using the line width as a threshold, which yields four sub-patterns for the input character pattern. The character pattern inside the frame is also divided non-linearly into N × M grid-shaped partial regions so that each region contains the same number of black pixels. For each of the four sub-patterns, the number of black pixels of that sub-pattern in each partial region is counted and normalized by the size of the character pattern, giving an N × M × 4 dimensional feature matrix that reflects the distribution of character lines in each direction. This feature matrix is collated with a dictionary consisting of the feature matrices of a number of standard characters prepared in advance, and the recognition result for the input character pattern is output on the basis of the collation.

[0004]

[Problems to Be Solved by the Invention] The character recognition device described above, however, has the following problems. In the conventional technique, the binary image inside the circumscribing frame of the input character pattern is scanned in each of the four directions (horizontal, vertical, right diagonal, and left diagonal), strokes consisting of continuous black pixels are extracted using twice the average line width of the character pattern as the threshold, and four sub-patterns representing the distribution of those strokes are extracted. With this method, a stroke component whose run of continuous black pixels is shorter than twice the line width is naturally not extracted. Consequently, in a partly faint character pattern, that is, one whose local line width is much smaller than elsewhere, the faint part is reflected neither in the sub-patterns nor in the feature matrix derived from them, which contributes to degraded recognition performance.

[0005] Examples of such cases are shown in FIG. 4 and FIG. 6(a). FIG. 4 shows a case in which the tail of the letter "Q", that is, the segment in the region enclosed by the wavy line 401, has become faint and split into three pieces. The tail would normally be detected as part of a stroke by the left diagonal scan, but in this example it is not detected as part of a sub-pattern in any scanning direction. The tail is therefore not reflected in the features, and the probability of misreading the character as a similar character without a tail, such as "O", increases.

[0006] FIG. 6(a) deals with the kanji "因". Of the elements that make up "因", the inner part "大" is written smaller than usual relative to the outer part "口" and is no larger than twice the average line width of the whole character. As shown in FIG. 6(b), FIG. 6(c), FIG. 6(d), and FIG. 6(e), scanning with the average line width therefore leaves "大" out of every sub-pattern, which is a serious problem.

[0007] Even when the character pattern is not partly faint or written small as described above, part of it may be filled in, which makes the average line width very large. Ordinary strokes are then not detected as sub-patterns by the scans, are not reflected in the feature matrix, and recognition performance degrades. FIG. 5 shows an example: the lower part of the numeral "5" has closed into a loop and been filled in. The average line width of the whole character becomes large, so a normally written stroke such as the one indicated by the wavy line 501 is no longer than twice the line width and is not extracted as a sub-pattern. Features are thus extracted as if the stroke indicated by the wavy line 501 were absent, the pattern comes to resemble "6" very closely, and the probability of misreading it as "6" increases.

[0008] The conventional sub-pattern extraction method extracts features effectively when the line widths of the stroke components of a character are distributed close to the average line width, but it cannot extract appropriate features, and recognition performance degrades, when the line width of a local stroke differs greatly from that of other parts, that is, when part of the character pattern is filled in or faint. The object of the present invention is to eliminate this problem and to provide a character recognition device that obtains highly accurate and stable recognition performance even for poor-quality character patterns with large variation in local line width, by scanning each stroke component of differing local line width from each direction with a threshold suited to it to extract sub-patterns, and performing feature extraction and recognition on the basis of those sub-patterns.

[0009]

[Means for Solving the Problems] To solve the above problems, the present invention is characterized by comprising: a photoelectric conversion unit that optically scans a character pattern written on a form or the like and converts it into a binary image, that is, a quantized electric signal; a pattern register that stores the character pattern converted into the binary image; a circumscribing frame detection unit that detects the circumscribing frame of the character pattern in the pattern register; a character pattern line width calculation unit that calculates the line width of the character pattern inside the circumscribing frame; a sub-pattern extraction unit that scans the character pattern inside the circumscribing frame in the horizontal, vertical, right diagonal, and left diagonal directions, detects a run of black pixels as a stroke when its length exceeds a threshold determined from the line width, and extracts four sub-patterns, one per direction, representing the distribution of those strokes; a faint pattern extraction unit that, from the binary image inside the circumscribing frame and the four sub-patterns, extracts as a faint pattern those black pixels of the character pattern that belong to none of the four sub-patterns; a minute segment removal unit that removes minute segments from among the mutually independent segments making up the faint pattern; a faint pattern line width calculation unit that calculates the line width of the faint pattern; a faint sub-pattern extraction unit that, when it is determined that sub-patterns need to be extracted from the faint pattern after minute segment removal, extracts four faint sub-patterns, one per scanning direction as above, representing the distribution of the strokes of the faint pattern on the basis of its line width; a control unit that outputs to a feature extraction unit either the sub-patterns or the combined sub-patterns obtained by combining each sub-pattern with the faint sub-pattern for the same scanning direction; the feature extraction unit, which extracts a feature matrix from the sub-patterns or the combined sub-patterns; and an identification unit that outputs a recognition result based on collating the feature matrix with dictionary matrices prepared in advance.

[0010]

[Operation] According to the present invention, four sub-patterns, one per scanning direction, representing the distribution of character strokes are extracted on the basis of the average line width of the original character pattern, and a faint pattern consisting of the pixels that belong to none of the extracted sub-patterns is then extracted. For what remains after minute segments are removed from the faint pattern, it is determined whether sub-patterns need to be extracted, and if so, faint sub-patterns based on the average line width of the faint pattern are extracted. Under the control of the control unit, either the sub-patterns or, when faint sub-patterns have been extracted, the combined sub-patterns of the sub-patterns and faint sub-patterns are then output to the feature extraction unit, features of the character pattern are extracted, and the character is recognized by collating those features with the features in the dictionary. Consequently, even when part of a stroke component that plays an essential role in recognition is faint or small, it can be recovered and extracted as a faint sub-pattern, so highly accurate and stable recognition performance can be obtained even for low-quality character patterns with variation in local line width.

[0011]

[Embodiments] Embodiment 1 and Embodiment 2 of the character recognition device according to the present invention are described below. For convenience, items such as 401 in FIG. 4, 501 in FIG. 5, and the element "大" of "因" in FIG. 6(a) are referred to collectively as faint patterns. The description of Embodiment 1 also walks through an example in which the embodiment is applied to the binary image of the kanji "因" in FIG. 6(a).

[0012] FIG. 1 is a block diagram showing Embodiment 1 of the character recognition device according to the present invention. In the figure, 101 is the optical signal input obtained by scanning a character pattern with a scanner, 102 is a photoelectric conversion unit, 103 is a pattern register, 104 is a circumscribing frame detection unit, 105 is a character pattern line width calculation unit, 106 is a horizontal scanning unit, 107 is a horizontal sub-pattern 1 memory, 108 is a vertical scanning unit, 109 is a vertical sub-pattern 1 memory, 110 is a right diagonal scanning unit, 111 is a right diagonal sub-pattern 1 memory, 112 is a left diagonal scanning unit, 113 is a left diagonal sub-pattern 1 memory, 114 is a faint pattern extraction unit, 115 is a faint pattern memory, 116 is a faint pattern line width calculation unit, 117 is a horizontal scanning unit, 118 is a horizontal faint sub-pattern memory, 119 is a vertical scanning unit, 120 is a vertical faint sub-pattern memory, 121 is a right diagonal scanning unit, 122 is a right diagonal faint sub-pattern memory, 123 is a left diagonal scanning unit, 124 is a left diagonal faint sub-pattern memory, 125 is a horizontal sub-pattern combining unit, 126 is a horizontal sub-pattern 2 memory, 127 is a vertical sub-pattern combining unit, 128 is a vertical sub-pattern 2 memory, 129 is a right diagonal sub-pattern combining unit, 130 is a right diagonal sub-pattern 2 memory, 131 is a left diagonal sub-pattern combining unit, 132 is a left diagonal sub-pattern 2 memory, 133 is an output control unit, 134 is a line width determination unit, 135 is a feature extraction unit, 136 is an identification unit, 137 is a dictionary memory, 138 is the recognition result, and 139 is a minute segment removal unit.

[0013] The optical signal 101, obtained by scanning with a scanner a character pattern handwritten or printed on a form or the like, is converted into an electric signal in the photoelectric conversion unit 102, quantized into a binary image made up of two-valued signals, and stored in the pattern register 103.

[0014] The circumscribing frame detection unit 104 detects the top and bottom edges of the binary image stored in the pattern register 103 by horizontal scanning and its left and right edges by vertical scanning, and thereby obtains the circumscribing frame, the rectangle that circumscribes the input character pattern. It outputs the coordinates of the frame to the character pattern line width calculation unit 105, the horizontal scanning unit 106, the vertical scanning unit 108, the right diagonal scanning unit 110, the left diagonal scanning unit 112, and the faint pattern extraction unit 114 to designate the region of the character pattern. In all of the processing that follows, any use of the binary image in the pattern register 103 refers to the binary image inside the circumscribing frame.
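The frame detection described in this paragraph can be sketched in software as follows. This is only an illustration under stated assumptions, not the circuit of the embodiment: the binary image is assumed to be a list of rows of 0/1 values (1 = black), and the function names `circumscribing_frame` and `crop` are hypothetical.

```python
def circumscribing_frame(image):
    """Return (top, bottom, left, right), inclusive, of the black pixels.

    `image` is a list of rows holding 0 (white) or 1 (black).
    Returns None when the image contains no black pixel.
    """
    # Horizontal scan: rows that contain at least one black pixel.
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        return None
    # Vertical scan: columns that contain at least one black pixel.
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(image, frame):
    """Cut out the part of `image` that lies inside `frame`."""
    top, bottom, left, right = frame
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Cropping once and passing the cropped image to the later stages mirrors the statement that all subsequent processing works only on the image inside the frame.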

[0015] The character pattern line width calculation unit 105 calculates the average line width of the character pattern. As one way of obtaining the average line width, this embodiment adopts the following method. Let A be the number of black pixels in the binary image of the character pattern inside the circumscribing frame of the pattern register 103, and let Q be the number of 4-black-pixel points. The average line width Wr is then calculated as

$$W_r = \frac{A}{A - Q} \tag{1}$$

Here, a 4-black-pixel point is a position at which, when the binary image is scanned with a 2 × 2 window, all four pixels in the window are black, and Q is the count of such positions.
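Equation (1) can be tried out with a few lines of code. The sketch below assumes the same list-of-rows image representation (1 = black); the function name `average_line_width` is hypothetical, and the error raised for an empty image is an addition not discussed in the text.

```python
def average_line_width(image):
    """Average line width Wr = A / (A - Q) from equation (1).

    A: number of black pixels.
    Q: number of 2x2 window positions whose four pixels are all black.
    """
    height, width = len(image), len(image[0])
    a = sum(sum(row) for row in image)
    if a == 0:
        raise ValueError("image contains no black pixels")
    q = sum(
        1
        for y in range(height - 1)
        for x in range(width - 1)
        if image[y][x] and image[y][x + 1]
        and image[y + 1][x] and image[y + 1][x + 1]
    )
    return a / (a - q)
```

For a line one pixel thick no 2 × 2 window is fully black, so Q = 0 and Wr = 1. For a bar 3 pixels thick and 10 pixels long, A = 30 and Q = 18, giving Wr = 2.5.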

[0016] Next, the character pattern inside the circumscribing frame of the pattern register 103 is scanned horizontally by the horizontal scanning unit 106, vertically by the vertical scanning unit 108, in the right diagonal direction by the right diagonal scanning unit 110, and in the left diagonal direction by the left diagonal scanning unit 112. In each scan, strokes, meaning runs of continuous black pixels, are detected using a threshold based on the line width, and a sub-pattern reflecting their distribution is generated. A run of L continuous black pixels qualifies as a stroke component of a sub-pattern when

$$L > 2 \times W_r \tag{2}$$

where Wr is the average line width defined above. In other words, in the scan for each direction, a stroke longer than twice the average line width is extracted as a stroke making up the sub-pattern for that direction. The distributions of strokes detected in this way inside the circumscribing frame are stored, per scanning direction, as horizontal sub-pattern 1, vertical sub-pattern 1, right diagonal sub-pattern 1, and left diagonal sub-pattern 1 in the horizontal sub-pattern 1 memory 107, the vertical sub-pattern 1 memory 109, the right diagonal sub-pattern 1 memory 111, and the left diagonal sub-pattern 1 memory 113, respectively.
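The directional scan can be illustrated with a short sketch. It assumes a list-of-rows binary image (1 = black); the name `extract_subpattern` and the direction table are hypothetical. The text does not define the pixel geometry of the two diagonal scans, so the step vectors below are an assumption: "left diagonal" is taken to run from upper left to lower right, which matches the remark that the tail of "Q" belongs to the left diagonal scan.

```python
# (dy, dx) step per scanning direction; the diagonal orientations are assumed.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (1, -1),  # upper right to lower left
    "left_diagonal": (1, 1),    # upper left to lower right
}


def extract_subpattern(image, direction, threshold):
    """Keep only black pixels lying on a run longer than `threshold`.

    A run is a maximal sequence of black pixels along `direction`.
    Passing threshold = 2 * Wr reproduces condition (2).
    """
    dy, dx = DIRECTIONS[direction]
    height, width = len(image), len(image[0])

    def is_black(y, x):
        return 0 <= y < height and 0 <= x < width and bool(image[y][x])

    sub = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Only start walking at the first pixel of a run.
            if not is_black(y, x) or is_black(y - dy, x - dx):
                continue
            run = []
            cy, cx = y, x
            while is_black(cy, cx):
                run.append((cy, cx))
                cy, cx = cy + dy, cx + dx
            if len(run) > threshold:
                for ry, rx in run:
                    sub[ry][rx] = 1
    return sub
```

Calling the function once per entry of `DIRECTIONS` with `threshold = 2 * Wr` gives the four sub-patterns 1 described above.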

[0017] Taking FIG. 6 as an example, FIG. 6(a) shows the original binary image before scanning, FIG. 6(b) horizontal sub-pattern 1, FIG. 6(c) vertical sub-pattern 1, FIG. 6(d) right diagonal sub-pattern 1, and FIG. 6(e) left diagonal sub-pattern 1. As noted earlier, the element "大" of "因" is no larger than twice the average line width, so it is not reflected in any of the sub-patterns.

[0018] The faint pattern extraction unit 114 then uses the binary image inside the circumscribing frame of the pattern register 103 together with horizontal sub-pattern 1, vertical sub-pattern 1, right diagonal sub-pattern 1, and left diagonal sub-pattern 1 to extract, as a faint pattern, the distribution of the stroke components that were not extracted into any sub-pattern. FIG. 3 is a block diagram showing a configuration example of the faint pattern extraction unit; the frame drawn with a dotted line represents the inside of the faint pattern extraction unit 114. 301 is an OR circuit unit, 302 is a memory, 303 is a NOT circuit unit, 304 is a character pattern memory, and 305 is an AND circuit unit.

[0019] The function of each block of the faint pattern extraction unit 114 shown in FIG. 3 and the flow of processing are described below. First, horizontal sub-pattern 1, vertical sub-pattern 1, right diagonal sub-pattern 1, and left diagonal sub-pattern 1, stored in the sub-pattern memories 107, 109, 111, and 113, are input to the OR circuit unit 301. Treating black pixels as 1 and white pixels as 0, the OR circuit unit 301 performs, for every pixel in the sub-pattern region enclosed by the circumscribing frame, a logical OR of the pixel values of the four sub-patterns 1, and writes the result to the corresponding pixel of a rectangular region of the same size prepared in advance in the memory 302. The pattern finally generated in the memory 302 is the union of the four sub-patterns 1. A pixel of this pattern is black when at least one of the four sub-patterns has the value 1 (black) at that pixel, and white when all four have the value 0 (white). A white pixel of the union pattern was therefore either white in the original binary image of the character pattern, or black in the binary image but not extracted into any sub-pattern.

[0020] Next, the NOT circuit unit 303 performs a NOT operation on the pattern generated in the memory 302. For each pixel of the pattern in the memory 302 in turn, it converts the value 0 to 1 and the value 1 to 0, that is, white to black and black to white, and writes the result back to the same pixel in the memory 302. The memory 302 then holds the black-and-white inverse of the union of the sub-patterns generated by the OR circuit unit 301.

[0021] Independently of the above, only the part of the binary image in the pattern register 103 that lies inside the circumscribing frame detected by the circumscribing frame detection unit 104 is transferred to the character pattern memory 304.

[0022] Next, the AND circuit unit 305 performs an AND operation on the pattern in the memory 302 and the character pattern in the character pattern memory 304. For each pixel in the pattern region, it ANDs the pixel value of the pattern in the memory 302 with the value of the corresponding pixel of the character pattern in the character pattern memory 304, outputting 1 only when both values are 1 and 0 when at least one is 0, and writes the result to the faint pattern memory 115 as the faint pattern.

[0023] As the description above makes clear, the faint pattern consists of those black pixels of the character pattern that belong to none of the four sub-patterns 1. It is made up of, for example, strokes that have partly faded and split into several segments, as indicated by 401 in FIG. 4, and strokes that were isolated from the start, as indicated by 501 in FIG. 5, and that do not reach the threshold of twice the average line width given by equation (2). Applying the processing of the faint pattern extraction unit to the original binary image of FIG. 6(a) removes from FIG. 6(a) all the black pixels of the sub-patterns in FIG. 6(b), (c), (d), and (e), leaving a faint pattern that consists only of the element "大", which was not extracted into any sub-pattern, as shown in FIG. 6(f).
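The OR, NOT, and AND stages of FIG. 3 amount to one per-pixel Boolean expression. The following sketch assumes list-of-rows binary images of equal size (1 = black); the function name `faint_pattern` is hypothetical.

```python
def faint_pattern(image, subpatterns):
    """Black pixels of `image` that belong to none of `subpatterns`.

    Computes image AND NOT (sub1 OR sub2 OR ...) for every pixel.
    """
    height, width = len(image), len(image[0])
    faint = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            union = any(sub[y][x] for sub in subpatterns)      # OR circuit unit 301
            inverted = not union                               # NOT circuit unit 303
            faint[y][x] = int(bool(image[y][x]) and inverted)  # AND circuit unit 305
    return faint
```

In the example of FIG. 6, `image` would be the pattern of FIG. 6(a), `subpatterns` the four patterns of FIG. 6(b) to (e), and the result the pattern of FIG. 6(f).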

[0024] The faint pattern extracted by the faint pattern extraction unit 114 is stored in the faint pattern memory 115. If necessary, a minute segment removal unit 139 can be provided to remove minute segments from the faint pattern. One possible removal rule is as follows: a segment of the faint pattern is regarded as minute when the number of black pixels on its contour, or its total number of black pixels, is at most a predetermined threshold, for example β times (β > 0) the line width Wr of the input character pattern. Segments judged to be minute are either erased from the faint pattern memory or excluded from further processing. Removing minute segments in this way prevents the loss of recognition performance they would otherwise cause.
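One way to realize the removal rule is connected-component labelling. The sketch below uses the total-black-pixel variant of the rule and treats a segment as a set of 8-connected black pixels; the connectivity, the function name `remove_minute_segments`, and the default `beta` are assumptions, since the text leaves them open.

```python
from collections import deque


def remove_minute_segments(pattern, line_width, beta=1.0):
    """Erase segments whose black-pixel count is at most beta * line_width.

    A segment is a set of 8-connected black pixels (assumed connectivity).
    Returns a new image; `pattern` itself is left unchanged.
    """
    height, width = len(pattern), len(pattern[0])
    limit = beta * line_width
    result = [row[:] for row in pattern]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not pattern[y][x] or seen[y][x]:
                continue
            # Breadth-first search collects one whole segment.
            segment, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                segment.append((cy, cx))
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < height and 0 <= nx < width
                                and pattern[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(segment) <= limit:
                for sy, sx in segment:
                    result[sy][sx] = 0
    return result
```

The contour-pixel variant of the rule would differ only in what is counted for each segment.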

[0025] The faint pattern line width calculation unit 116 calculates the line width of the faint pattern. The calculation can use, for example, the method based on equation (1) used in the character pattern line width calculation unit 105.

[0026] Next, the blur pattern in the blur pattern memory 115 is scanned in the horizontal, vertical, right diagonal, and left diagonal directions by the horizontal scanning unit 117, the vertical scanning unit 119, the right diagonal scanning unit 121, and the left diagonal scanning unit 123, respectively, and runs of consecutive black pixels that exceed a predetermined threshold are detected as strokes. As a result, the sub-patterns of the blur pattern, that is, the blur sub-patterns, are extracted and stored in the horizontal blur sub-pattern memory 118, the vertical blur sub-pattern memory 120, the right diagonal blur sub-pattern memory 122, and the left diagonal blur sub-pattern memory 124, respectively. Here, the condition for a run to be a stroke component of a sub-pattern is not given by equation (2); instead, with L denoting the number of consecutive black pixels, it is given by the following equation.

[0027]

[Equation 1]

[0028] Here, Ws is the line width of the blur pattern calculated by the blur pattern line width calculation unit 116. The reason γ is not fixed at 2 in equation (4) is to allow for errors that arise from, for example, calculating the line width over a smaller region than usual; γ can be regarded as having been multiplied by a correction factor. This correction factor is determined empirically, but the threshold may of course be set with γ = 2 as usual.
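The directional scan of the blur pattern can be sketched as follows. The equation image is not reproduced in this text, so the sketch assumes the stroke condition L > γ × Ws, following the statement that runs exceeding the threshold are detected as strokes. The function names, the `DIRECTIONS` table, and the pixel step chosen for the right and left diagonals are also assumptions made for illustration.

```python
# Step (dy, dx) taken along each scanning direction.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (-1, 1),  # assumption: runs rising toward the upper right
    "left_diagonal": (1, 1),    # assumption: runs falling toward the lower right
}

def extract_subpattern(pattern, direction, threshold):
    """Keep the black pixels that lie on a run longer than threshold."""
    dy, dx = DIRECTIONS[direction]
    h, w = len(pattern), len(pattern[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if pattern[y][x] != 1:
                continue
            # Start only at the first pixel of a run in this direction.
            py, px = y - dy, x - dx
            if 0 <= py < h and 0 <= px < w and pattern[py][px] == 1:
                continue
            run = []
            cy, cx = y, x
            while 0 <= cy < h and 0 <= cx < w and pattern[cy][cx] == 1:
                run.append((cy, cx))
                cy, cx = cy + dy, cx + dx
            if len(run) > threshold:  # run length L exceeds the threshold
                for ry, rx in run:
                    out[ry][rx] = 1
    return out

def extract_blur_subpatterns(blur, ws, gamma):
    """Scan the blur pattern in four directions with threshold gamma * ws."""
    threshold = gamma * ws
    return {name: extract_subpattern(blur, name, threshold) for name in DIRECTIONS}
```

Because the threshold is derived from Ws rather than from the line width of the whole character, thin strokes that fail the first scan can still pass this one.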

[0029] The extraction of blur sub-patterns is explained using FIG. 6 as an example. First, the partial pattern "大" extracted as the blur pattern from the character "因" is shown in FIG. 6(f), and the blur sub-patterns obtained by scanning this blur pattern in the horizontal, vertical, right diagonal, and left diagonal directions are shown in FIGS. 6(g), (h), (i), and (j), respectively. As described above, the threshold for this scan is determined from the line width of the blur pattern "大" in FIG. 6(f), not from the line width of the original binary image "因" in FIG. 6(a). It can therefore be seen that the sub-patterns of "大", which were not extracted when the original binary image of FIG. 6(a) was scanned because the line width value was too large, are properly extracted by scanning with a suitable line width value.

[0030] Next, the horizontal sub-pattern 1, vertical sub-pattern 1, right diagonal sub-pattern 1, and left diagonal sub-pattern 1 extracted by scanning the original binary image are combined with the horizontal, vertical, right diagonal, and left diagonal blur sub-patterns extracted by scanning the blur pattern, respectively. This combining is performed independently for each direction by the horizontal sub-pattern combining unit 125, the vertical sub-pattern combining unit 127, the right diagonal sub-pattern combining unit 129, and the left diagonal sub-pattern combining unit 131. The combined patterns are stored as horizontal sub-pattern 2, vertical sub-pattern 2, right diagonal sub-pattern 2, and left diagonal sub-pattern 2 in the horizontal sub-pattern 2 memory 126, the vertical sub-pattern 2 memory 128, the right diagonal sub-pattern 2 memory 130, and the left diagonal sub-pattern 2 memory 132, respectively.

[0031] The combining units combine patterns by, for example, applying an OR operation to the corresponding pixels of the two sub-patterns. That is, when two binary patterns are combined, the combined pattern has pixel value 1 wherever at least one of the two has pixel value 1 (a black pixel), and pixel value 0 wherever both have pixel value 0 (a white pixel). In the example of FIG. 6, when FIGS. 6(b), (c), (d), and (e) are given as the sub-patterns of the original binary image and FIGS. 6(g), (h), (i), and (j) as the sub-patterns of the blur pattern, the sub-patterns 2 produced by the combining units are FIGS. 6(k), (l), (m), and (n), respectively. Compared with the sub-patterns of the original binary image, these combined sub-patterns 2 accurately reflect the locally small-scale portions, which are then naturally reflected in the feature matrix calculated from them. Misreading caused by the information loss of conventional sub-pattern extraction can therefore be prevented.
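The pixel-wise OR combination described above is small enough to show directly. The function name `combine_subpatterns` is hypothetical, and both patterns are assumed to be equally sized lists of rows holding 0 (white) or 1 (black).

```python
def combine_subpatterns(sub, blur_sub):
    """Pixel-wise OR of two equally sized binary patterns (1 = black)."""
    return [[a | b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(sub, blur_sub)]
```

The same function would be applied once per scanning direction, pairing each sub-pattern 1 with the blur sub-pattern of the same direction.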

[0032] The sub-patterns created by the above method are converted into further compressed features in the feature extraction unit 135. In this embodiment, the sub-patterns input to the feature extraction unit 135 can be selected under the control of the output control unit 133. When the minute segment removing unit 139 judges that every segment of the blur pattern is minute, the output control unit 133 stops all subsequent processing of the blur pattern, reads the sub-patterns 1 of each direction obtained from the original binary image from the memories 107, 109, 111, and 113, and outputs them to the feature extraction unit 135. In addition, the line width of the blur pattern calculated by the blur pattern line width calculation unit 116 is constantly checked by the line width determination unit 134, and when that line width is judged to be at or below a predetermined threshold, the result is passed to the output control unit 133. The threshold on the line width is given, for example, by the following equations:

    Ws < δ × Wr    (5)
    0 < δ ≪ 1      (6)

where Ws is the average line width of the blur pattern and Wr is the average line width of the original binary image. When the conditions of equations (5) and (6) are satisfied, Ws is extremely small compared with Wr.

[0033] When the line width determination unit 134 reports that Ws is extremely small compared with Wr, the output control unit 133 stops the scan of the blur pattern, reads the sub-patterns 1 of each direction obtained by scanning the original binary image from the memories 107, 109, 111, and 113, and outputs them to the feature extraction unit 135. This processing by the output control unit 133 was devised in view of the problem described below.

[0034] Combining the blur sub-patterns with the sub-patterns 1 obtained by scanning the original binary image restores important information that had been removed, but may also add noise-like stroke components. To remove such non-essential stroke components, this embodiment first removes minute segments from the blur pattern in the minute segment removing unit 139, as described above, and, naturally, when all segments are judged to be minute, outputs only the sub-patterns 1 obtained by scanning the original binary image to the feature extraction unit 135. The line width determination unit 134 is further provided so that, when the line width of the blur pattern does not reach the predetermined threshold, the blur pattern is likewise judged to be non-essential for recognition; in that case the blur pattern is not scanned and only the sub-patterns 1 are output to the feature extraction unit 135. In this way, non-essential stroke components are kept out of the sub-patterns, and the misreading they would cause is prevented.
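The two checks that gate the blur pattern scan (no segment survives minute segment removal, or the condition Ws < δ × Wr of equation (5) holds) can be sketched as a single predicate. The function name and argument names are hypothetical, and the value of δ used in the test is purely illustrative; the text only requires 0 < δ ≪ 1.

```python
def should_scan_blur_pattern(remaining_segments, ws, wr, delta):
    """Return True when the blur pattern is worth scanning.

    remaining_segments: segments left after minute segment removal.
    ws, wr: average line widths of the blur pattern and the original image.
    delta: small positive constant (0 < delta << 1), chosen by the designer.
    """
    if remaining_segments == 0:
        return False  # every segment was minute: use sub-patterns 1 only
    if ws < delta * wr:
        return False  # equation (5): blur pattern is extremely thin
    return True
```

When the predicate is false, only the sub-patterns 1 would be passed on to feature extraction.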

[0035] The feature extraction unit 135 extracts features from the four input sub-patterns, which are either the sub-patterns of the original binary image or the combined sub-patterns. Before this feature extraction, the character pattern within the circumscribing frame of the pattern register 103 is divided nonlinearly, in a grid along the vertical and horizontal directions, into N × M partial regions such that, for example, each divided region contains the same number of black pixels. Next, for each of the four sub-patterns, the number of black pixels of that sub-pattern within each partial region is counted and normalized by the size of the character pattern. This yields an N × M × 4 dimensional feature matrix reflecting the distribution of character lines in each direction, which is output to the identification unit 136.
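The feature extraction step can be sketched as follows. Several choices here are assumptions not stated in the text: the grid boundaries are placed so that the row and column projections of the character pattern are split into roughly equal black pixel counts, the "size of the character pattern" is taken to be the area of the circumscribing frame, and the names `equal_mass_bounds` and `feature_matrix` are hypothetical.

```python
def equal_mass_bounds(profile, parts):
    """Split indices 0..len(profile) into `parts` ranges of roughly equal sum."""
    total = sum(profile)
    bounds, acc, k = [0], 0, 1
    for i, value in enumerate(profile):
        acc += value
        # Place the k-th boundary once k/parts of the black pixels are passed.
        while k < parts and acc * parts >= total * k:
            bounds.append(i + 1)
            k += 1
    while len(bounds) < parts:
        bounds.append(len(profile))
    bounds.append(len(profile))
    return bounds

def feature_matrix(char_pattern, subpatterns, n, m):
    """Return a list of four N x M grids of normalized black pixel counts."""
    h, w = len(char_pattern), len(char_pattern[0])
    row_profile = [sum(row) for row in char_pattern]
    col_profile = [sum(char_pattern[y][x] for y in range(h)) for x in range(w)]
    row_bounds = equal_mass_bounds(row_profile, n)
    col_bounds = equal_mass_bounds(col_profile, m)
    size = h * w  # assumption: normalize by the area of the circumscribing frame
    features = []
    for sub in subpatterns:  # one sub-pattern per scanning direction
        grid = []
        for i in range(n):
            cells = []
            for j in range(m):
                count = sum(sub[y][x]
                            for y in range(row_bounds[i], row_bounds[i + 1])
                            for x in range(col_bounds[j], col_bounds[j + 1]))
                cells.append(count / size)
            grid.append(cells)
        features.append(grid)
    return features
```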

[0036] The identification unit 136 compares the input feature matrix with the feature matrices of a plurality of standard characters stored in advance in the dictionary memory 137 and, judging from the comparison result, finally outputs one candidate category as the recognition result 138 for the input character pattern. This concludes Embodiment 1 of the character recognition device according to the present invention.

[0037] Embodiment 1 is a highly effective method when the local line widths of the strokes forming a character or figure fall into two classes. Ordinary simple characters can be regarded as largely captured in the sub-patterns by scanning with two line widths, but for complex figures, kanji, and the like composed of strokes at three or more scales, there may be stroke components that even two-stage scanning cannot capture. Embodiment 2 addresses this problem: whereas Embodiment 1 scans with two stages of line width, Embodiment 2 generalizes this to allow N-stage scanning (N ≥ 2). Embodiment 2 is described below.

[0038] FIG. 2 is a block diagram showing Embodiment 2 of the present invention. Here, 201 is an optical signal input, 202 a photoelectric conversion unit, 203 a pattern register, 204 a circumscribing frame detecting unit, 205 a register, 206 a line width calculation unit, 207 a horizontal scanning unit, 208 a horizontal pattern memory, 209 a vertical scanning unit, 210 a vertical pattern memory, 211 a right diagonal scanning unit, 212 a right diagonal pattern memory, 213 a left diagonal scanning unit, 214 a left diagonal pattern memory, 215 a blur pattern extracting unit, 216 a minute segment removing unit, 217 a line width determination unit, 218 a horizontal pattern combining unit, 219 a horizontal combined pattern memory, 220 a vertical pattern combining unit, 221 a vertical combined pattern memory, 222 a right diagonal pattern combining unit, 223 a right diagonal combined pattern memory, 224 a left diagonal pattern combining unit, 225 a left diagonal combined pattern memory, 226 a loop counter, 227 an output control unit, 228 a feature extraction unit, 229 an identification unit, 230 a dictionary memory, and 231 a recognition result.

[0039] The description here focuses on the differences from Embodiment 1. First, 201, 202, 203, and 204 are the same as in Embodiment 1; of the binary image in the pattern register 203, only the data within the circumscribing frame is transferred to the register 205. As described later, not only the binary data of the character pattern but also the blur patterns are successively overwritten into this register 205. The line width calculation unit 206 calculates the line width of the data in the register 205. At this point the register holds the binary data of the character pattern, so the average line width of the character pattern is calculated. The method of Embodiment 1 is also applied to this line width calculation.

[0040] Next, as in Embodiment 1, the binary data in the register 205 is scanned in the horizontal, vertical, right diagonal, and left diagonal directions by the horizontal scanning unit 207, the vertical scanning unit 209, the right diagonal scanning unit 211, and the left diagonal scanning unit 213, respectively. Sub-patterns are extracted using the line width as the threshold and stored in the horizontal sub-pattern memory 208, the vertical sub-pattern memory 210, the right diagonal sub-pattern memory 212, and the left diagonal sub-pattern memory 214, respectively.

[0041] Next, the blur pattern extracting unit 215 extracts the blur pattern from the binary data of the character pattern in the register 205 and the sub-patterns stored in the memories 208, 210, 212, and 214, and transfers it to the register 205. The blur pattern is extracted by the method shown in FIG. 3 for Embodiment 1, and for convenience this blur pattern is called blur pattern 1. The minute segment removing unit 216 then removes the minute segments of blur pattern 1 and checks, among other things, the number of remaining segments; the line width calculation unit 206 calculates the line width of blur pattern 1; and the line width determination unit 217 determines from that line width value whether blur pattern 1 should be scanned. The determinations of the minute segment removing unit 216 and the line width determination unit 217 follow Embodiment 1. If it is determined that blur pattern 1 need not be scanned, the sub-patterns stored in the memories 208, 210, 212, and 214 are output to the feature extraction unit 228 through the output control unit 227. If it is determined that scanning is needed, they are transferred to the horizontal combined sub-pattern memory 219, the vertical combined sub-pattern memory 221, the right diagonal combined sub-pattern memory 223, and the left diagonal combined sub-pattern memory 225, respectively. Next, as in Embodiment 1, blur pattern 1 in the register 205 is scanned again by the scanning units 207, 209, 211, and 213 of each direction. Based on the line width of blur pattern 1, the sub-patterns of blur pattern 1, that is, blur sub-patterns 1, are extracted and stored in the memories 208, 210, 212, and 214, respectively.

[0042] Next, in the horizontal sub-pattern combining unit 218, the vertical sub-pattern combining unit 220, the right diagonal sub-pattern combining unit 222, and the left diagonal sub-pattern combining unit 224, blur sub-patterns 1 stored in the memories 208, 210, 212, and 214 are combined with the sub-patterns stored in the memories 219, 221, 223, and 225, and the results are output back to the memories 219, 221, 223, and 225 as combined sub-patterns 1.

[0043] Combined sub-patterns 1 are identical to the patterns combined in Embodiment 1 after two rounds of sub-pattern extraction. In Embodiment 2, however, the blur pattern extracting unit 215 further uses blur pattern 1 currently stored in the register 205 and blur sub-patterns 1 stored in the memories 208, 210, 212, and 214 to extract the stroke components that were not detected even by the second scan, and stores them in the register 205 as blur pattern 2. In terms of FIG. 3, the memories 107, 109, 111, and 113 correspond to the memories 208, 210, 212, and 214 of FIG. 2, and the character pattern memory 304 corresponds to the register 205.

[0044] Next, blur sub-patterns 2 are obtained for blur pattern 2 by scanning with the line width of blur pattern 2 as the threshold; the combining units 218, 220, 222, and 224 combine them with combined sub-patterns 1 stored in the memories 219, 221, 223, and 225, and output the results back to the memories 219, 221, 223, and 225 as combined sub-patterns 2. In exactly the same way, for blur pattern K, blur sub-patterns K are obtained by scanning with the line width of blur pattern K as the threshold; the combining units 218, 220, 222, and 224 combine them with combined sub-patterns K-1 stored in the memories 219, 221, 223, and 225, and output the results back to those memories as combined sub-patterns K.

[0045] The loop counter 226 counts the number K of sub-pattern combining operations and notifies the output control unit 227 when K reaches a predetermined threshold M. The output control unit 227 then transfers combined sub-patterns M stored in the memories 219, 221, 223, and 225 to the feature extraction unit 228. Even when K has not reached M, if the minute segment removing unit 216 or the line width determination unit 217 determines that blur pattern K need not be scanned, the combined sub-patterns K held at that point are transferred to the feature extraction unit 228.
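The iterative procedure of Embodiment 2 can be summarized by the following sketch. It is written against caller-supplied functions because the line width formula of equation (1) and the exact scanning rule are not reproduced here; `scan`, `line_width`, and `needs_scan` are hypothetical parameters, as are all function names. `scan(pattern, width)` is assumed to return the four directional sub-patterns as a list, and `max_merges` plays the role of the threshold M of the loop counter.

```python
def extract_blur(pattern, subs):
    """Black pixels of pattern that belong to none of the sub-patterns."""
    return [[1 if p == 1 and not any(s[y][x] for s in subs) else 0
             for x, p in enumerate(row)]
            for y, row in enumerate(pattern)]

def or_patterns(a, b):
    """Pixel-wise OR of two equally sized binary patterns."""
    return [[p | q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def multi_stage_subpatterns(pattern, scan, line_width, needs_scan, max_merges):
    """Repeat blur extraction, scanning, and combining up to max_merges times."""
    wr = line_width(pattern)
    combined = scan(pattern, wr)             # sub-patterns of the original image
    current, current_subs = pattern, combined
    for _ in range(max_merges):              # loop counter K = 1 .. M
        blur = extract_blur(current, current_subs)
        ws = line_width(blur)
        if not needs_scan(blur, ws, wr):     # minute segments or thin line width
            break
        blur_subs = scan(blur, ws)           # blur sub-patterns K
        combined = [or_patterns(c, b) for c, b in zip(combined, blur_subs)]
        current, current_subs = blur, blur_subs
    return combined
```

With `max_merges=1` the loop performs a single blur stage, which matches the statement that Embodiment 1 is the special case M = 1.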

[0046] The feature extraction unit 228, the identification unit 229, the dictionary memory 230, and the recognition result 231 are all the same as in Embodiment 1, so their description is omitted here.

[0047] As described above, Embodiment 2 can, through M scans, create sub-patterns that reflect M kinds of stroke components with different line widths, and can therefore stably maintain high-accuracy recognition even for complex kanji and figures composed of strokes of M different line widths. Embodiment 1 is equivalent to Embodiment 2 with M = 1 and corresponds to a special case of Embodiment 2.

[0048] Embodiments 1 and 2 are not limited to the examples described above. For example, the blur pattern extraction means of the blur pattern extracting unit 114 or 215 is not limited to the method shown in FIG. 3. Several methods producing the same result are conceivable, such as combining OR, NOR, AND, NAND, and NOT circuits, or counting, pixel by pixel, the black pixels across the original binary image and the four sub-patterns and extracting those pixels whose count is 1. Whatever the method, any that can extract the blur pattern as defined in these embodiments belongs to the present invention.
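The counting alternative mentioned above can be sketched as follows. The function name is hypothetical, and the sketch assumes that every sub-pattern is a subset of the original binary image, so that a count of exactly 1 means the pixel is black in the original and in none of the sub-patterns.

```python
def blur_pattern_by_count(original, subpatterns):
    """Extract pixels counted as black exactly once across all five images."""
    h, w = len(original), len(original[0])
    return [[1 if original[y][x] + sum(s[y][x] for s in subpatterns) == 1 else 0
             for x in range(w)]
            for y in range(h)]
```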

[0049] Removal of minute segments and determination by line width were provided as ways to remove non-essential stroke components, but their conditional expressions, threshold settings, and so on may be changed freely within the scope of the present invention. The line width determination unit 134 and the minute segment removing unit 139 have a similar function, in that both determine whether the portion remaining as the blur pattern represents an essential feature of the character using features of the blur pattern (line width and segment size, respectively), so only one of them may be implemented. Furthermore, any means that achieves the same effect as these determinations using other features, such as the distribution of segments or the black pixel density, belongs to the present invention. Embodiments that operate in a top-down manner also belong to the present invention: for example, when the input characters are known in advance to be of high quality with constant local line width, blur pattern extraction is not performed; conversely, when the character quality is known in advance to be low with many blurred characters, blur patterns are extracted and combined either always or while performing the above determinations.

[0050] The configuration of the pattern register and the memories, the line width calculation method, the feature matrix extraction method, the circumscribing frame division method, and so on may also be changed as appropriate within the scope of the present invention. Furthermore, in the block diagram of FIG. 1, the processing and operations assigned to each component, the flow of input and output signals, and the number, position, and other conditions of the components may be changed as suitable.

[0051]

[Effects of the Invention]

As described in detail above, according to the present invention, an input character pattern is converted into a binary image; the line width of the binary image within its circumscribing frame is calculated; the binary image within the circumscribing frame is scanned in the horizontal, vertical, right diagonal, and left diagonal directions to extract four kinds of sub-patterns reflecting the distribution of strokes based on that line width; and the binary image within the circumscribing frame and the four sub-patterns are used to detect a blur pattern consisting of the pixels not extracted as any sub-pattern, whose line width is then calculated. The blur pattern is further scanned in the horizontal, vertical, right diagonal, and left diagonal directions using a threshold set from its line width, and four kinds of blur sub-patterns reflecting the distribution of the detected strokes are extracted. The sub-patterns and the blur sub-patterns are combined for each kind to create combined sub-patterns, a feature matrix is extracted from the combined sub-patterns, and the recognition result is output from the result of comparing the feature matrix with a dictionary. Consequently, even when part of a stroke that forms the character pattern and plays an essential role in recognition has a smaller local line width than the other parts, it is still extracted as part of a sub-pattern. A character recognition device can therefore be realized that stably maintains high-accuracy recognition even for poor-quality character patterns with large differences in local line width and for complex kanji characters and figures composed of strokes at various scales.

[Brief description of drawings]

FIG. 1 is a block diagram showing Embodiment 1 of the character recognition device according to the present invention.

FIG. 2 is a block diagram showing Embodiment 2 of the character recognition device according to the present invention.

FIG. 3 is a block diagram showing the configuration of the blur pattern extracting unit.

FIG. 4 is an example of a pattern containing a blurred portion.

FIG. 5 is an example of a pattern having a portion that is not extracted as a sub-pattern because of crushing.

FIG. 6 is a diagram showing an example of application of the present invention.

[Explanation of symbols]

101 optical signal
102 photoelectric conversion unit
103 pattern register
104 circumscribing frame detecting unit
105 character pattern line width calculation unit
106 horizontal scanning unit
107 horizontal sub-pattern 1 memory
108 vertical scanning unit
109 vertical sub-pattern 1 memory
110 right diagonal scanning unit
111 right diagonal sub-pattern 1 memory
112 left diagonal scanning unit
113 left diagonal sub-pattern 1 memory
114 blur pattern extracting unit
115 blur pattern memory
116 blur pattern line width calculation unit
117 horizontal scanning unit
118 horizontal blur sub-pattern memory
119 vertical scanning unit
120 vertical blur sub-pattern memory
121 right diagonal scanning unit
122 right diagonal blur sub-pattern memory
123 left diagonal scanning unit
124 left diagonal blur sub-pattern memory
125 horizontal sub-pattern combining unit
126 horizontal sub-pattern 2 memory
127 vertical sub-pattern combining unit
128 vertical sub-pattern 2 memory
129 right diagonal sub-pattern combining unit
130 right diagonal sub-pattern 2 memory
131 left diagonal sub-pattern combining unit
132 left diagonal sub-pattern 2 memory
133 output control unit
134 line width determination unit
135 feature extraction unit
136 identification unit
137 dictionary memory
138 recognition result
139 minute segment removing unit

Claims

[Claims]

1. A character recognition device comprising:
a photoelectric conversion unit that optically scans a character pattern written on a form or the like and converts it into a binary image, which is a quantized electric signal;
a pattern register that stores the character pattern converted into the binary image;
a circumscribing frame detection unit that detects a circumscribing frame of the character pattern in the pattern register;
a character pattern line width calculation unit that calculates a line width of the character pattern within the circumscribing frame of the pattern register;
a sub-pattern extraction unit that scans the character pattern within the circumscribing frame of the pattern register in the horizontal, vertical, right diagonal, and left diagonal directions, detects as a stroke any run of consecutive black pixels whose length exceeds a threshold value determined based on the line width, and extracts, for each scanning direction, four types of sub-patterns representing the distribution of these strokes;
a blur pattern extraction unit that compares the binary image within the circumscribing frame of the pattern register with the four types of sub-patterns and extracts, as a blur pattern, those black pixels of the character pattern that do not belong to any of the four types of sub-patterns;
a minute segment removal unit that removes minute segments from among the independent segments that form the blur pattern;
a blur pattern line width calculation unit that calculates a line width of the blur pattern;
a unit that determines whether sub-pattern extraction is necessary for the blur pattern from which the minute segments have been removed and, when it is determined to be necessary, extracts in the same manner as above, for each scanning direction, four types of blur sub-patterns representing the stroke distribution of the blur pattern based on the line width of the blur pattern;
a control unit that outputs to a feature extraction unit, for each scanning direction, either the sub-pattern or a combined sub-pattern obtained by combining the sub-pattern with the blur sub-pattern;
the feature extraction unit, which extracts a feature matrix from the sub-pattern or the combined sub-pattern; and
an identification unit that outputs a recognition result based on a result obtained by comparing the feature matrix with a dictionary matrix prepared in advance.
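
As a rough illustration of the stroke detection and blur pattern extraction recited in claim 1, the following Python sketch keeps, for each scanning direction, only the black pixels lying on a run longer than a threshold, and then collects the black pixels that belong to none of the four sub-patterns. The function names, the list-of-lists 0/1 image representation, the naming of the two diagonals, and the passing of the threshold as a plain integer are assumptions made for this sketch and are not taken from the specification. Minute segment removal, the second extraction pass on the blur pattern, sub-pattern combining, and feature extraction are omitted.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel

# Scan directions as (row step, column step). Which diagonal is called
# "right" or "left" is a naming assumption of this sketch.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (1, -1),
    "left_diagonal": (1, 1),
}


def extract_sub_pattern(img: Image, step: Tuple[int, int], threshold: int) -> Image:
    """Keep only black pixels on a run longer than `threshold` along `step`."""
    rows, cols = len(img), len(img[0])
    dr, dc = step
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not img[r][c]:
                continue
            # Start a run only where the previous pixel along the scan
            # direction is white or outside the image.
            pr, pc = r - dr, c - dc
            if 0 <= pr < rows and 0 <= pc < cols and img[pr][pc]:
                continue
            run = []
            rr, cc = r, c
            while 0 <= rr < rows and 0 <= cc < cols and img[rr][cc]:
                run.append((rr, cc))
                rr, cc = rr + dr, cc + dc
            # A run counts as a stroke only if it exceeds the threshold.
            if len(run) > threshold:
                for rr, cc in run:
                    out[rr][cc] = 1
    return out


def extract_all_sub_patterns(img: Image, threshold: int) -> Dict[str, Image]:
    """One sub-pattern per scanning direction."""
    return {name: extract_sub_pattern(img, step, threshold)
            for name, step in DIRECTIONS.items()}


def extract_blur_pattern(img: Image, subs: Dict[str, Image]) -> Image:
    """Black pixels of `img` that belong to none of the sub-patterns."""
    return [
        [1 if px and not any(s[r][c] for s in subs.values()) else 0
         for c, px in enumerate(row)]
        for r, row in enumerate(img)
    ]
```

In this sketch a short, thin fragment that fails the run-length test in all four directions ends up in the blur pattern, which is the input the claim then passes to minute segment removal and to a second extraction pass using the blur pattern's own line width.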