JPH0728948A

JPH0728948A - Character recognition device

Info

Publication number: JPH0728948A
Application number: JP5167908A
Authority: JP
Inventors: Toru Miyamae (宮前 徹); Koichi Higuchi (樋口 浩一)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1993-07-07
Filing date: 1993-07-07
Publication date: 1995-01-31

Abstract

PURPOSE: To provide a character recognition device that achieves highly accurate and stable recognition performance even for low-quality character patterns with local variations in line width. CONSTITUTION: Four types of sub-patterns, one per scanning direction, are extracted based on the average line width of the original character pattern, and a blur pattern is then extracted. If it is determined that a restoration pattern must be generated for the pattern that remains after minute segments are removed from the blur pattern, a blur restoration pattern is generated by thickening processing. Sub-patterns of the blur restoration pattern are likewise extracted for each direction. Next, the area inside the circumscribing frame of either the original character pattern or a composite pattern of the original character pattern and the blur restoration pattern is divided into lattice-shaped partial areas. A feature value is then calculated in each divided area for either the sub-patterns or the composite sub-patterns formed from the sub-patterns and the sub-patterns of the blur restoration pattern. A feature matrix is thereby generated and collated with the feature matrices of a dictionary 137 to recognize the character.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that achieves highly accurate and stable recognition performance even for character patterns whose line width varies locally, such as characters that are partly faint.

[0002]

[Prior Art] A conventional character recognition device that extracts features of an input character pattern and outputs a recognition result by collating them with a dictionary prepared in advance is disclosed, for example, in Japanese Examined Patent Publication No. Sho 60-38756. An outline of the processing performed by that device follows.

[0003] First, the brightness of each cell of the input character pattern is converted by photoelectric conversion into a binary image, which is a quantized electric signal, and the binary image is stored in a pattern register. The circumscribing frame of the character pattern in the pattern register is then detected, and the line width of the character pattern within that frame is calculated. Next, the character pattern within the circumscribing frame is scanned in the horizontal, vertical, right-diagonal, and left-diagonal directions, and runs of consecutive black pixels are detected using the line width as a threshold, which yields four kinds of sub-patterns for the input character pattern. In addition, the character pattern within the circumscribing frame is divided non-linearly, in the vertical and horizontal directions, into a grid of N × M partial areas so that each divided area contains the same number of black pixels. For each of the four sub-patterns, the black pixels of the sub-pattern within each partial area are counted and the count is normalized by the size of the character pattern, which yields an N × M × 4-dimensional feature matrix reflecting the distribution of character lines in each direction. Finally, the feature matrix is collated with a dictionary consisting of feature matrices of a plurality of standard characters prepared in advance, and the recognition result for the input character pattern is output from the collation result.
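The non-linear division can be illustrated with a minimal Python sketch. The source says only that the divided areas should hold equal numbers of black pixels; splitting each axis independently on its projection (the row and column sums of black pixels) is an assumption made here, and it equalizes the counts per horizontal or vertical band rather than per grid cell. The function names `equal_mass_ranges` and `divide_frame` are hypothetical.

```python
def equal_mass_ranges(projection, parts):
    """Split indices into `parts` half-open (start, end) ranges of roughly equal mass."""
    total = sum(projection)
    bounds = [0]
    running = 0
    k = 1  # index of the next boundary to place
    for i, count in enumerate(projection):
        running += count
        # Place boundary k once the running sum reaches k/parts of the total.
        while k < parts and running * parts >= total * k:
            bounds.append(i + 1)
            k += 1
    bounds.append(len(projection))
    return list(zip(bounds[:-1], bounds[1:]))


def divide_frame(image, n_rows, n_cols):
    """Return (row_ranges, col_ranges) for a binary image (1 = black, 0 = white)."""
    row_projection = [sum(row) for row in image]
    col_projection = [sum(col) for col in zip(*image)]
    return (equal_mass_ranges(row_projection, n_rows),
            equal_mass_ranges(col_projection, n_cols))
```

Counting the black pixels of one sub-pattern inside each (row range, column range) cell would then give one N × M slice of the feature matrix; the normalization step is not specified in enough detail here to sketch.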

[0004]

[Problems to Be Solved by the Invention] However, the character recognition device described above has the following problems. The binary image within the circumscribing frame of the input character pattern is scanned in each of four directions (horizontal, vertical, right diagonal, and left diagonal); strokes consisting of consecutive black pixels are extracted using twice the average line width of the character pattern as a threshold; and four kinds of sub-patterns representing the distribution of those strokes are extracted. With this conventional extraction method, stroke components whose run of consecutive black pixels is shorter than twice the line width are, naturally, not extracted. Consequently, in a character pattern that is partly faint, that is, one in which the local line width is far smaller than elsewhere, the affected part is reflected neither in the sub-patterns nor in the feature matrix derived from them, which contributes to degraded recognition performance.

[0005] Examples of such cases are shown in FIG. 4 and FIG. 6(a). FIG. 4 shows a case in which the tail of the letter "Q", that is, the segment within the area enclosed by the wavy line 401, is faint and has broken into three pieces. This faint tail would ordinarily be detected as part of a stroke by scanning in the left-diagonal direction, but in this example it is not detected as part of a sub-pattern by scanning in any direction. The tail is therefore not reflected in the features, and the probability of misreading the character as a similar character without a tail, such as "O", increases. FIG. 6(a) deals with the kanji "因". Of the elements making up "因", the inner part "大" is written smaller than usual relative to the outer part "口" and is no larger than twice the average line width of the whole character. As shown in FIGS. 6(b), (c), (d), and (e), scanning based on the average line width therefore leaves "大" unreflected in every sub-pattern, which is a serious problem.

[0006] Furthermore, even when no part of the character pattern is faint or written small as described above, part of the character may be filled in, which makes the average line width very large. As a result, ordinary strokes are not detected as sub-patterns by scanning and are not reflected in the feature matrix, which degrades recognition performance. An example is shown in FIG. 5, in which the lower part of the digit "5" has formed a loop and been filled in. In this case the average line width of the whole character becomes large, so that a normally written stroke such as the one indicated by the broken line 501 is no longer than twice the line width and is not extracted as a sub-pattern. Features are thus extracted as if the stroke indicated by the broken line 501 were absent, so the pattern closely resembles, for example, "6", and the probability of misreading it as "6" increases.

[0007] The conventional sub-pattern extraction method extracts features effectively when the line widths of the stroke components making up a character are distributed near the average line width. When the line width of a local stroke differs greatly from that of other parts, as in character patterns that are partly filled in or faint, it cannot extract features appropriately, and recognition performance degrades. The present invention eliminates this problem. It extracts a blur pattern consisting only of stroke components whose local line width is small relative to the average line width and which ordinary scanning does not extract as sub-patterns, and it creates a blur restoration pattern by thickening those stroke components to the average line width of the input character. It then extracts sub-patterns of the blur restoration pattern and combines them with the sub-patterns of the input character pattern. It also combines the original binary image with the blur restoration pattern, divides the circumscribing frame according to the peripheral distribution of the resulting composite character pattern, and performs feature extraction and recognition based on the composite sub-patterns and the divided areas. The object of the invention is thereby to provide a character recognition device that achieves highly accurate and stable recognition performance even for poor-quality character patterns with large variations in local line width.

[0008]

[Means for Solving the Problems] To solve the above problems, the present invention is characterized by comprising: a photoelectric conversion unit that optically scans a character pattern written on a form or the like and converts it into a binary image, which is a quantized electric signal; a pattern register that stores the character pattern converted into the binary image; a circumscribing frame detection unit that detects the circumscribing frame of the character pattern in the pattern register; a line width calculation unit that calculates the line width of the character pattern within the circumscribing frame of the pattern register; a sub-pattern extraction unit that scans the character pattern within the circumscribing frame of the pattern register in the horizontal, vertical, right-diagonal, and left-diagonal directions, detects a run as a stroke when the number of consecutive black pixels on a scanning line exceeds a threshold determined from the line width, and extracts four kinds of sub-patterns, one per direction, representing the distribution of those strokes; a blur pattern extraction unit that, from the binary image within the circumscribing frame of the pattern register and the four kinds of sub-patterns, extracts as a blur pattern the set of black pixels of the character pattern that belong to none of the four sub-patterns; a minute segment removal unit that removes minute segments from among the independent segments making up the blur pattern; a blur pattern line width calculation unit that calculates the line width of the blur pattern; a blur pattern restoration unit that, when it is determined that a restoration pattern needs to be created for the blur pattern from which minute segments have been removed, creates a blur restoration pattern by thickening the line width of each segment of that blur pattern to the average line width; a blur restoration sub-pattern extraction unit that scans the blur restoration pattern in the horizontal, vertical, right-diagonal, and left-diagonal directions, detects a run as a stroke when the number of consecutive black pixels on a scanning line exceeds a threshold determined from the average line width, and extracts four kinds of blur restoration sub-patterns, one per direction, representing the distribution of those strokes; a sub-pattern combining unit that combines the sub-patterns and the blur restoration sub-patterns kind by kind; a control unit that outputs either the sub-patterns or the composite sub-patterns to a feature extraction unit; a pattern combining unit that combines the binary image within the circumscribing frame of the pattern register with the blur restoration pattern to create a composite character pattern; a circumscribing frame division unit that divides the circumscribing frame horizontally and vertically into a grid of partial areas based on the peripheral distribution of either the binary image within the circumscribing frame of the pattern register or the composite character pattern; the feature extraction unit, which calculates feature values of the divided partial areas for the sub-patterns or the composite sub-patterns and creates a feature matrix; and an identification unit that outputs the final recognition result by collating the feature matrix with a dictionary prepared in advance.

[0009]

[Operation] According to the present invention, four kinds of sub-patterns, one per scanning direction, representing the distribution of character strokes are extracted based on the average line width of the original character pattern, and a blur pattern consisting of the parts that belong to none of the extracted sub-patterns is then extracted. Minute segments are removed from the blur pattern, and it is determined whether a restoration pattern needs to be created for the remaining pattern. If so, a blur restoration pattern is created based on the line width of the blur pattern, and sub-patterns of the blur restoration pattern are extracted for each direction in the same way as above. In addition, the area within the circumscribing frame of the original character pattern, or of the composite pattern of the original character pattern and the blur restoration pattern, is divided into a grid of partial areas. Either the sub-patterns or, when sub-patterns of the blur restoration pattern have been extracted, the composite sub-patterns are output to the feature extraction unit, feature values are calculated within the divided areas, and a feature matrix is created. Character recognition is performed by collating this feature matrix with the feature matrices of the dictionary. Consequently, even when part of a stroke component that plays an essential role in recognition is faint or small, it can be rescued and extracted as a blur restoration sub-pattern, so that highly accurate and stable recognition performance is obtained even for low-quality character patterns with local variations in line width.

[0010]

[Embodiments] Embodiments 1 and 2 of the character recognition device according to the present invention are described below. For convenience, elements such as 401 in FIG. 4, 501 in FIG. 5, and the element "大" of "因" in FIG. 6(a) are collectively referred to as blur patterns. The description of Embodiment 1 also covers an example in which the present invention is applied to the binary image of the kanji "因" in FIG. 6(a).

[0011] FIG. 1 is a block diagram showing Embodiment 1 of the character recognition device according to the present invention. In FIG. 1, 101 is an optical signal input obtained by scanning a character pattern with a scanner; 102 is a photoelectric conversion unit; 103 is a pattern register; 104 is a circumscribing frame detection unit; 105 is a character pattern line width calculation unit; 106 is a horizontal scanning unit; 107 is a horizontal sub-pattern 1 memory; 108 is a vertical scanning unit; 109 is a vertical sub-pattern 1 memory; 110 is a right-diagonal scanning unit; 111 is a right-diagonal sub-pattern 1 memory; 112 is a left-diagonal scanning unit; 113 is a left-diagonal sub-pattern 1 memory; 114 is a blur pattern extraction unit; 115 is a blur pattern memory; 116 is a blur pattern line width calculation unit; 117 is a horizontal scanning unit; 118 is a horizontal blur sub-pattern memory; 119 is a vertical scanning unit; 120 is a vertical blur sub-pattern memory; 121 is a right-diagonal scanning unit; 122 is a right-diagonal blur sub-pattern memory; 123 is a left-diagonal scanning unit; 124 is a left-diagonal blur sub-pattern memory; 125 is a horizontal sub-pattern combining unit; 126 is a horizontal sub-pattern 2 memory; 127 is a vertical sub-pattern combining unit; 128 is a vertical sub-pattern 2 memory; 129 is a right-diagonal sub-pattern combining unit; 130 is a right-diagonal sub-pattern 2 memory; 131 is a left-diagonal sub-pattern combining unit; 132 is a left-diagonal sub-pattern 2 memory; 133 is an output control unit; 134 is a line width determination unit; 135 is a feature extraction unit; 136 is an identification unit; 137 is a dictionary memory; 138 is a recognition result; 139 is a minute segment removal unit; 140 is a blur pattern restoration unit; 141 is a blur restoration pattern memory; 142 is a pattern combining unit; 143 is a composite character pattern memory; and 144 is a circumscribing frame division unit.

[0012] First, an optical signal 101, obtained by scanning with a scanner a character pattern handwritten or printed on a form or the like, is converted into an electric signal by the photoelectric conversion unit 102, quantized into a binary image consisting of two-level signals, and stored in the pattern register 103.

[0013] The circumscribing frame detection unit 104 detects the top and bottom edges of the binary image stored in the pattern register 103 by horizontal scanning and its left and right edges by vertical scanning, thereby obtaining the circumscribing frame, which is the rectangle that circumscribes the input character pattern. It then outputs the coordinates of the circumscribing frame to the line width calculation unit 105, the horizontal scanning unit 106, the vertical scanning unit 108, the right-diagonal scanning unit 110, the left-diagonal scanning unit 112, and the blur pattern extraction unit 114, thereby specifying the cut-out region of the character pattern. In the processing below, whenever the binary image in the pattern register 103 is used, only the binary image within the circumscribing frame is processed.
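As a minimal sketch of circumscribing frame detection, the following Python function finds the first and last rows and columns that contain a black pixel. The representation of the binary image as a list of rows with 1 for black and 0 for white, and the function name `circumscribing_frame`, are assumptions made for illustration; the patent describes hardware units, not this code.

```python
def circumscribing_frame(image):
    """Return (top, bottom, left, right), inclusive, or None for a blank image."""
    # Horizontal scanning: rows that contain at least one black pixel.
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        return None
    # Vertical scanning: columns that contain at least one black pixel.
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_to_frame(image):
    """Cut out the part of the image that lies inside the circumscribing frame."""
    frame = circumscribing_frame(image)
    if frame is None:
        return []
    top, bottom, left, right = frame
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```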

[0014] The character pattern line width calculation unit 105 calculates the average line width of the input character pattern. As one example of how to obtain it, this embodiment calculates the average line width Wr of the input character pattern by the following equation, where A is the number of black pixels in the binary image of the character pattern within the circumscribing frame of the pattern register 103 and Q is the number of 4-black-pixel points:

Wr = A / (A − Q)    (1)

Here, a 4-black-pixel point is a position at which, when the binary image is scanned with a 2 × 2 window, all four pixels of the window are black, and Q is the count of such positions.
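Equation (1) can be sketched in Python as follows. The list-of-rows image representation (1 for black, 0 for white) and the function name `average_line_width` are assumptions for illustration. For a straight bar of width w and length n, A = wn and Q = (w − 1)(n − 1), so Wr = wn / (w + n − 1), which approaches w as the bar gets longer.

```python
def average_line_width(image):
    """Estimate the average line width Wr = A / (A - Q) of a binary image."""
    height = len(image)
    width = len(image[0]) if height else 0
    # A: total number of black pixels.
    a = sum(sum(row) for row in image)
    # Q: number of 2x2 window positions whose four pixels are all black.
    q = sum(
        1
        for y in range(height - 1)
        for x in range(width - 1)
        if image[y][x] and image[y][x + 1] and image[y + 1][x] and image[y + 1][x + 1]
    )
    if a == 0:
        return 0.0  # blank image: no strokes, so report zero width
    return a / (a - q)
```

For a bar 3 pixels wide and 10 pixels long, A = 30 and Q = 18, so the estimate is 30 / 12 = 2.5, slightly below the true width because the bar is short.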

[0015] Next, the character pattern within the circumscribing frame of the pattern register 103 is scanned horizontally by the horizontal scanning unit 106, vertically by the vertical scanning unit 108, in the right-diagonal direction by the right-diagonal scanning unit 110, and in the left-diagonal direction by the left-diagonal scanning unit 112. Strokes, which are runs of consecutive black pixels, are detected using a value based on the line width as a threshold, and sub-patterns reflecting their distribution are generated. The condition for a run of consecutive black pixels to be a stroke component of a sub-pattern is given by the following expression, where L is the number of consecutive black pixels:

L > 2 × Wr    (2)

Here, Wr is the average line width of the input character pattern calculated by the character pattern line width calculation unit 105. In other words, in the scan in each direction, strokes longer than twice the line width are extracted as the strokes making up the sub-pattern for that direction. The distributions of strokes detected in this way as runs of consecutive black pixels within the circumscribing frame are stored, for each scanning direction, as horizontal sub-pattern 1, vertical sub-pattern 1, right-diagonal sub-pattern 1, and left-diagonal sub-pattern 1 in the horizontal sub-pattern 1 memory 107, the vertical sub-pattern 1 memory 109, the right-diagonal sub-pattern 1 memory 111, and the left-diagonal sub-pattern 1 memory 113, respectively.
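The directional scan and the stroke condition of expression (2) can be sketched as a run-length filter. This is a simplified illustration, not the device's circuitry: the image is assumed to be a list of rows with 1 for black, the names `extract_subpattern` and `extract_subpatterns` are hypothetical, and the orientation assigned to "right diagonal" and "left diagonal" is an assumption (the description of the tail of "Q" suggests that the left diagonal runs from upper left to lower right).

```python
# (dy, dx) step for each scanning direction; the diagonal orientations are assumed.
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (-1, 1),  # assumed: rising toward the right
    "left_diagonal": (1, 1),    # assumed: falling toward the right
}


def extract_subpattern(image, step, wr):
    """Keep only runs of black pixels along `step` whose length L satisfies L > 2 * wr."""
    height, width = len(image), len(image[0])
    dy, dx = step
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            # Start a run only at its first pixel (previous pixel is white or outside).
            py, px = y - dy, x - dx
            if 0 <= py < height and 0 <= px < width and image[py][px]:
                continue
            run = []
            cy, cx = y, x
            while 0 <= cy < height and 0 <= cx < width and image[cy][cx]:
                run.append((cy, cx))
                cy, cx = cy + dy, cx + dx
            if len(run) > 2 * wr:
                for ry, rx in run:
                    out[ry][rx] = 1
    return out


def extract_subpatterns(image, wr):
    """Return the four directional sub-patterns as a dict keyed by direction name."""
    return {name: extract_subpattern(image, step, wr) for name, step in DIRECTIONS.items()}
```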

[0016] Taking FIG. 6 as an example, the original binary image before scanning is shown in FIG. 6(a), horizontal sub-pattern 1 in FIG. 6(b), vertical sub-pattern 1 in FIG. 6(c), right-diagonal sub-pattern 1 in FIG. 6(d), and left-diagonal sub-pattern 1 in FIG. 6(e). As noted above, the element "大" of "因" is on a scale of no more than twice the average line width, so it is not reflected in any of the sub-patterns.

[0017] The blur pattern extraction unit 114 uses the binary image within the circumscribing frame of the pattern register 103 together with horizontal sub-pattern 1, vertical sub-pattern 1, right-diagonal sub-pattern 1, and left-diagonal sub-pattern 1 to extract, as a blur pattern, the distribution of the stroke components that were not extracted as sub-patterns. FIG. 3 is a block diagram showing the configuration of the blur pattern extraction unit 114. The area inside the dotted frame represents the interior of the blur pattern extraction unit 114, where 301 is an OR circuit unit, 302 is a memory, 303 is a NOT circuit unit, 304 is a character pattern memory, and 305 is an AND circuit unit.

[0018] Next, the function of each block of the blur pattern extraction unit 114 shown in FIG. 3 and the flow of processing are described. First, horizontal sub-pattern 1, vertical sub-pattern 1, right-diagonal sub-pattern 1, and left-diagonal sub-pattern 1, stored in the sub-pattern memories 107, 109, 111, and 113 for the respective directions, are input to the OR circuit unit 301. Taking black pixels as 1 and white pixels as 0, the OR circuit unit 301 performs a logical OR of the pixel values of the four sub-patterns 1 for each pixel of the sub-pattern region enclosed by the circumscribing frame, and writes each result to the corresponding pixel of a rectangular region of the same size as the sub-pattern region, prepared in advance in the memory 302. A pattern that is the union of the four sub-patterns 1 is thus generated in the memory 302. At each pixel of the region, this pattern is black when at least one of the four sub-patterns has pixel value 1 (black) and white when all four sub-patterns 1 have pixel value 0 (white). A white pixel in this union pattern therefore either was white in the binary image of the character pattern to begin with, or is black in the binary image but was not extracted into any sub-pattern.

[0019] Next, the NOT circuit unit 303 performs a NOT operation on the pattern generated in the memory 302. For each pixel of the pattern in the memory 302 in turn, the NOT circuit unit 303 converts pixel value 0 to 1 and pixel value 1 to 0, that is, white to black and black to white, and writes the result back to the same pixel in the memory 302. The memory 302 thus holds the black-and-white inversion of the union of the sub-patterns generated by the OR circuit unit 301. Independently of this processing, only the part of the binary image in the pattern register 103 that lies within the circumscribing frame detected by the circumscribing frame detection unit 104 is transferred to the character pattern memory 304.

【００２０】次にメモリ３０２上のパタンと文字パタン
メモリ３０４５の文字パタンに対して、ＡＮＤ回路部３
０５によって、ＡＮＤ演算が実行される。ＡＮＤ回路部
３０５では、パタン領域内の個々の画素について、メモ
リ３０２上のパタンの画素値と該画素に対応する文字パ
タンメモリ３０４上の文字パタンの画素値とのＡＮＤ演
算、即ち、両者の画素値が１であったときのみに、画素
値１を出力し、少なくともどちらかが０であったとき
は、画素値０を出力する演算を実行していき、当該演算
結果をかすれパタンとして、かすれパタンメモリ１１５
に出力する。このかすれパタンは、上述の説明で理解で
きるように、文字パタンを構成する黒画素の中で、４つ
のサブパタン１の黒画素のいずれにも所属しないものを
抽出してできたものである。即ち、かすれパタンは、例
えば、図４の４０１が示すようにストロ−クの一部がか
すれ、いくつかのセグメントに分裂してできたストロ−
クやまた図５の５０１が示すように元々孤立したストロ
−クであって、式（２）で示された平均線幅の２倍とい
う閾値に達しないもの等から構成されている。Next, the AND circuit unit 305 performs an AND operation between the pattern in the memory 302 and the character pattern in the character pattern memory 304. For each pixel in the pattern area, the AND circuit unit 305 ANDs the pixel value of the pattern in the memory 302 with the pixel value of the corresponding pixel of the character pattern in the character pattern memory 304: it outputs 1 only when both values are 1, and 0 when at least one of them is 0. The result is output to the blurred pattern memory 115 as the blurred pattern. As the description above shows, the blurred pattern consists of those black pixels of the character pattern that belong to none of the four sub-patterns 1. It is made up of, for example, strokes that have faded in places and broken into several segments, as shown by 401 in FIG. 4, and originally isolated strokes that do not reach the threshold of twice the average line width given by equation (2), as shown by 501 in FIG. 5.
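
The OR, NOT, and AND steps of paragraphs [0019] and [0020] can be sketched in a few lines of Python. This is only an illustration, assuming each pattern is a list of rows of 0/1 integers of identical size; the function name `extract_blurred_pattern` is hypothetical and does not appear in the patent.

```python
def extract_blurred_pattern(char_pattern, sub_patterns):
    """Return the black pixels of char_pattern that belong to no sub-pattern.

    char_pattern and every entry of sub_patterns are equally sized lists of
    rows holding 0 (white) or 1 (black). Assumed representation, not the
    memory layout of the device.
    """
    rows, cols = len(char_pattern), len(char_pattern[0])
    blurred = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # OR step: union of the directional sub-patterns
            union = 0
            for sub in sub_patterns:
                union |= sub[r][c]
            # NOT step inverts the union, AND step masks it with the character
            blurred[r][c] = char_pattern[r][c] & (1 - union)
    return blurred


char = [[1, 1, 1],
        [0, 1, 0],
        [1, 0, 0]]
horizontal = [[1, 1, 1],
              [0, 0, 0],
              [0, 0, 0]]
vertical = [[0, 1, 0],
            [0, 1, 0],
            [0, 0, 0]]
blurred = extract_blurred_pattern(char, [horizontal, vertical])
```

In this toy input only the isolated pixel in the bottom-left corner survives, because it is black in the character pattern but absent from both sub-patterns.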

【００２１】このかすれパタン抽出部の処理を図６
（ａ）の原２値画像に適用すると、図６（ａ）から図６
（ｂ），（ｃ），（ｄ），（ｅ）の各サブパタンの黒画
素を全て除去することになり、従って、図６（ｆ）のよ
うに、サブパタンとして抽出されなかった要素「大」だ
けからなるかすれパタンが得られる。Applying the processing of the blurred pattern extraction unit to the original binary image of FIG. 6(a) removes from FIG. 6(a) all black pixels of the sub-patterns in FIGS. 6(b), (c), (d), and (e). The result, shown in FIG. 6(f), is a blurred pattern consisting only of the element "大", which was not extracted as a sub-pattern.

【００２２】以上説明したように、図１のかすれパタン
抽出部１１４で抽出されたかすれパタンは、かすれパタ
ンメモリ１１５に格納されているが、必要に応じて、こ
のかすれパタンにおける微小セグメントを除去するため
の微小セグメント除去部１３９を設けることも可能であ
る。例えば、この微小セグメント除去部１３９による微
小セグメントの除去ル−ルとして、次のものが考えられ
る。即ち、かすれパタンを構成する各セグメントの輪郭
を構成する輪郭黒画素数または、各セグメントの全黒画
素数が、所定の閾値、例えば、当該入力文字パタンの線
幅Ｗｒのβ倍（β＞０）以下であったとき、微小セグメ
ントとみなすというル−ルである。ここで微小と判定さ
れたセグメントは、かすれパタンメモリ上で消去される
か、あるいは後続する処理の対象外とされる。以上のよ
うに微小セグメントが消去されることによって、それに
起因する認識性能の低下を未然に防止することが出来
る。As described above, the blurred pattern extracted by the blurred pattern extraction unit 114 of FIG. 1 is stored in the blurred pattern memory 115. If necessary, a minute segment removal unit 139 may be provided to remove minute segments from this blurred pattern. One possible removal rule is as follows: a segment of the blurred pattern is regarded as minute when the number of black pixels on its contour, or its total number of black pixels, is at most a predetermined threshold, for example β times (β > 0) the line width Wr of the input character pattern. A segment judged to be minute is either erased from the blurred pattern memory or excluded from subsequent processing. Removing minute segments in this way prevents the loss of recognition performance they would otherwise cause.
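
One way to read the removal rule is sketched below in Python. The sketch assumes that a segment is an 8-connected group of black pixels (the patent only says that segments do not touch one another) and uses the total black pixel count variant of the rule; `find_segments` and `remove_minute_segments` are hypothetical names.

```python
def find_segments(pattern):
    """Group black pixels into 8-connected segments (lists of (row, col))."""
    rows, cols = len(pattern), len(pattern[0])
    seen = [[False] * cols for _ in range(rows)]
    segments = []
    for r in range(rows):
        for c in range(cols):
            if pattern[r][c] and not seen[r][c]:
                seen[r][c] = True
                stack, segment = [(r, c)], []
                while stack:
                    y, x = stack.pop()
                    segment.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and pattern[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                segments.append(segment)
    return segments


def remove_minute_segments(pattern, line_width, beta):
    """Erase segments whose black pixel count is at most beta * line_width."""
    cleaned = [row[:] for row in pattern]
    for segment in find_segments(pattern):
        if len(segment) <= beta * line_width:
            for y, x in segment:
                cleaned[y][x] = 0
    return cleaned
```

The same `find_segments` helper also corresponds loosely to what the segment detection unit 701 does later, although the device stores address strings rather than coordinate lists.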

【００２３】次にかすれパタン線幅計算部１１６におい
て、かすれパタンの線幅が計算される。この線幅の計算
方法としては、例えば、文字パタン線幅計算部１０５で
使用した式（２）が用いられる。Next, the blurred pattern line width calculation unit 116 calculates the line width of the blurred pattern, for example with equation (2), the same equation used by the character pattern line width calculation unit 105.

【００２４】次に、かすれパタン復元部１４０におい
て、かすれパタンを構成する各ストロ−ク成分に対し
て、入力文字パタンの平均線幅までの太め処理を行い、
かすれた部分を復元する。このかすれパタン復元部１４
０における処理の一実施例を表わすブロック図を図７に
示した。図７において、点線で囲まれた部分が、かすれ
パタン復元部１４０を示しており、７０１は、セグメン
ト検出部、７０２はセグメント１メモリ、７０３はセグ
メント２メモリ、７０４はセグメントＮメモリ、７０５
は輪郭点抽出部、７０６は輪郭黒画素追加部、７０７は
線幅判定部、７０８は合成部である。Next, the blurred pattern restoration unit 140 thickens each stroke component of the blurred pattern up to the average line width of the input character pattern, restoring the blurred portions. FIG. 7 is a block diagram of one embodiment of the processing in the blurred pattern restoration unit 140. In FIG. 7, the portion enclosed by the dotted line is the blurred pattern restoration unit 140; 701 is a segment detection unit, 702 is a segment 1 memory, 703 is a segment 2 memory, 704 is a segment N memory, 705 is a contour point extraction unit, 706 is a contour black pixel addition unit, 707 is a line width determination unit, and 708 is a synthesis unit.

【００２５】かすれパタンメモリ１１５に格納されたか
すれパタン（微小セグメントを除く）を構成するストロ
−ク成分は、先ず、セグメント検出部７０１において、
一つのセグメントまたは互いに接していない複数のセグ
メント１，２，．．，Ｎとして識別される。そして、各
々のセグメント１，２，．．，Ｎを構成する黒画素は、
かすれパタンメモリ１１５上のアドレス列に変換され、
それぞれセグメント１メモリ７０２，セグメント２メモ
リ７０３，．．，セグメントＮメモリ７０４に格納され
る。但し、Ｎはかすれパタンのセグメント数を示してい
る（Ｎ≧１）。The stroke components of the blurred pattern stored in the blurred pattern memory 115 (excluding minute segments) are first identified by the segment detection unit 701 as a single segment or as a plurality of segments 1, 2, ..., N that do not touch one another. The black pixels of each segment 1, 2, ..., N are converted into address strings on the blurred pattern memory 115 and stored in the segment 1 memory 702, the segment 2 memory 703, ..., and the segment N memory 704, respectively. Here N is the number of segments of the blurred pattern (N ≥ 1).

【００２６】次に輪郭点抽出部７０５において、各セグ
メント１，２，．．，Ｎの輪郭部を構成する黒画素全て
が検出され、各々かすれパタンメモリ１１５上のアドレ
ス列に変換された後、輪郭点テ−ブル１，２，．．，Ｎ
として次の輪郭黒画素追加部７０６に出力される。Next, the contour point extraction unit 705 detects all black pixels on the contour of each segment 1, 2, ..., N, converts them into address strings on the blurred pattern memory 115, and outputs them to the contour black pixel addition unit 706 as contour point tables 1, 2, ..., N.

【００２７】輪郭黒画素追加部７０６では、セグメント
１，２，．．，Ｎの各々に対し、輪郭点テ−ブル及びか
すれパタンメモリ１１５の内容を参照しながら、輪郭点
系列の外側に隣接する黒画素を追加する。この時、追加
された黒画素も各セグメントを構成する要素となったた
め、それに応じて、セグメント１，２，．．，Ｎメモリ
７０２、７０３、７０４の内容は更新される。輪郭点系
列を一周してその外側に隣接する画素全てが黒画素に変
換されると、再び輪郭黒画素追加部７０６は、新たに追
加された黒画素を輪郭点系列として、さらにその外側に
黒画素を追加し、セグメント１，２，．．，Ｎメモリ７
０２、７０３、７０４の内容を更新していく。この処理
は、いわば各セグメントの外周に一皮ずつ追加していく
方法である。同様の処理は、線幅判定部７０７の判定結
果により中止の指示が出力されるまで繰り返される。For each of the segments 1, 2, ..., N, the contour black pixel addition unit 706 refers to the contour point table and the contents of the blurred pattern memory 115 and adds black pixels adjacent to the outside of the contour point sequence. Because the added black pixels become part of their segment, the contents of the segment 1, 2, ..., N memories 702, 703, and 704 are updated accordingly. Once the unit has gone around the contour point sequence and converted every pixel adjacent to its outside into a black pixel, it takes the newly added black pixels as the new contour point sequence, adds black pixels outside them in turn, and again updates the segment 1, 2, ..., N memories 702, 703, and 704. In effect, this adds one layer at a time around each segment. The process repeats until the line width determination unit 707 issues a stop instruction based on its determination result.

【００２８】この隣接した画素を黒画素に変換する方法
は、例えば次のように行われる。即ち、図８のような３
×３のマスクを用意し、着目する輪郭点を中心の升目８
０１に置く時、８０１以外の８個の升目８０２、８０
３、８０４、８０５、８０６、８０７、８０８、８０９
の中で、白画素があった場合、それらを全て黒画素に変
換するという方法である。A pixel adjacent to the contour can be converted into a black pixel as follows, for example. A 3 × 3 mask like that in FIG. 8 is prepared and the contour point of interest is placed on its center cell 801; any white pixels among the eight surrounding cells 802, 803, 804, 805, 806, 807, 808, and 809 are all converted into black pixels.

【００２９】線幅判定部７０７では、常時、輪郭黒画素
追加部７０６で追加された黒画素を含めたかすれパタン
の増大する線幅値を計算しており、その線幅値が、文字
パタン線幅計算部１０５で計算された入力文字パタンの
平均線幅に達した時、輪郭黒画素追加部７０６に黒画素
追加の中止の指示を、さらに合成部７０８に合成開始の
指示を出力する。The line width determination unit 707 continuously calculates the growing line width of the blurred pattern, including the black pixels added by the contour black pixel addition unit 706. When that line width reaches the average line width of the input character pattern calculated by the character pattern line width calculation unit 105, it instructs the contour black pixel addition unit 706 to stop adding black pixels and instructs the synthesis unit 708 to start synthesis.

【００３０】合成開始の指示を受けた合成部７０８は、
セグメント１，２，．．，Ｎメモリ７０２、７０３、７
０４の更新された黒画素のアドレス情報及びかすれパタ
ンメモリ１１５の内容に基づいて、かすれ復元パタンメ
モリ１４１上に線幅の太め処理を施したかすれ復元パタ
ンを合成出力する。尚、この時、線幅を太くすることに
よって、外接枠領域をはみ出した黒画素は除去される。On receiving the instruction to start synthesis, the synthesis unit 708 synthesizes the blur restoration pattern, that is, the blurred pattern with its lines thickened, on the blur restoration pattern memory 141, based on the updated black pixel address information in the segment 1, 2, ..., N memories 702, 703, and 704 and the contents of the blurred pattern memory 115. Any black pixels that fall outside the circumscribing frame region as a result of the thickening are removed.
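
The layer-by-layer thickening of paragraphs [0027] to [0030] might be sketched as follows. Several things here are assumptions rather than the patent's method: the 3 × 3 mask is applied to every black pixel instead of only to contour points (the result is the same, since interior pixels have no white neighbours), the whole pattern is grown at once instead of segment by segment, and the line width is estimated by a stand-in function `mean_run_width` because equation (2) is not reproduced in this section. All function names and the `max_layers` safety limit are hypothetical.

```python
def dilate_once(pattern):
    """Blacken every pixel in the 3x3 neighbourhood of each black pixel."""
    rows, cols = len(pattern), len(pattern[0])
    grown = [row[:] for row in pattern]
    for r in range(rows):
        for c in range(cols):
            if pattern[r][c]:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        # pixels that would leave the frame are dropped
                        if 0 <= nr < rows and 0 <= nc < cols:
                            grown[nr][nc] = 1
    return grown


def mean_run_width(pattern):
    """Stand-in estimate (NOT equation (2)): mean horizontal black run length."""
    runs, length = [], 0
    for row in pattern:
        for value in list(row) + [0]:  # trailing 0 closes a run at the row end
            if value:
                length += 1
            elif length:
                runs.append(length)
                length = 0
    return sum(runs) / len(runs) if runs else 0.0


def restore_blurred_pattern(pattern, target_width,
                            estimate=mean_run_width, max_layers=50):
    """Add one layer at a time until the estimated width reaches the target."""
    restored = [row[:] for row in pattern]
    layers = 0
    while estimate(restored) < target_width and layers < max_layers:
        restored = dilate_once(restored)
        layers += 1
    return restored
```

Passing a different `estimate` callable lets the loop be driven by whatever line width formula the device actually uses.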

【００３１】次に図１のかすれ復元パタンメモリ１４１
内のかすれ復元パタンに対して、水平方向走査部１１
７、垂直方向走査部１１９、右斜め方向走査部１２１及
び左斜め方向走査部１２３によって、それぞれ水平、垂
直、右斜め、左斜め方向に走査され、所定の閾値を超え
て連続した黒画素がストロ−クとして検出されていく。
その結果、今度はかすれ復元パタンのサブパタン、即
ち、かすれ復元サブパタンが抽出され、それぞれ水平か
すれサブパタンメモリ１１８、垂直かすれサブパタンメ
モリ１２０、右斜めかすれサブパタンメモリ１２２及び
左斜めかすれサブパタンメモリ１２４に格納される。
尚、ここで、サブパタンを構成するストロ−ク成分であ
るための条件は、式（２）で与えられる。Next, the blur restoration pattern in the blur restoration pattern memory 141 of FIG. 1 is scanned in the horizontal, vertical, right-diagonal, and left-diagonal directions by the horizontal scanning unit 117, the vertical scanning unit 119, the right-diagonal scanning unit 121, and the left-diagonal scanning unit 123, respectively, and runs of consecutive black pixels longer than a predetermined threshold are detected as strokes. As a result, the sub-patterns of the blur restoration pattern, that is, the blur restoration sub-patterns, are extracted and stored in the horizontal blur sub-pattern memory 118, the vertical blur sub-pattern memory 120, the right-diagonal blur sub-pattern memory 122, and the left-diagonal blur sub-pattern memory 124, respectively. The condition for a run to count as a stroke component of a sub-pattern is given by equation (2).
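
The directional scan can be illustrated with a short Python sketch that keeps only runs of black pixels longer than a threshold. The threshold is left as a parameter because equation (2) is not shown in this section, and the step vectors chosen for the two diagonal directions are assumptions; `extract_subpattern` and `STEPS` are hypothetical names.

```python
# (row step, column step) for each scanning direction; the assignment of
# "right" and "left" to the two diagonals is an assumption.
STEPS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (1, -1),
    "left_diagonal": (1, 1),
}


def extract_subpattern(pattern, step, threshold):
    """Keep black runs along `step` whose length exceeds `threshold`."""
    rows, cols = len(pattern), len(pattern[0])
    dr, dc = step
    sub = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not pattern[r][c]:
                continue
            pr, pc = r - dr, c - dc
            # only start from the first pixel of a run
            if 0 <= pr < rows and 0 <= pc < cols and pattern[pr][pc]:
                continue
            run, rr, cc = [], r, c
            while 0 <= rr < rows and 0 <= cc < cols and pattern[rr][cc]:
                run.append((rr, cc))
                rr, cc = rr + dr, cc + dc
            if len(run) > threshold:
                for y, x in run:
                    sub[y][x] = 1
    return sub


cross = [[0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0]]
horizontal_sub = extract_subpattern(cross, STEPS["horizontal"], 2)
vertical_sub = extract_subpattern(cross, STEPS["vertical"], 2)
```

For the cross-shaped input, the horizontal scan keeps only the middle row and the vertical scan keeps only the middle column.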

【００３２】ここでの処理を図６を例にとって説明する
と、先ず、「因」の字からかすれパタンとして抽出され
た部分パタン「大」は図６（ｆ）に、このかすれパタン
を平均線幅値まで復元したかすれ復元パタンは図６
（ｏ）に示されている。このかすれ復元パタンに対し
て、水平、垂直、右斜め、左斜め方向に走査して得られ
たかすれ復元サブパタンが、それぞれ、図６（ｇ）、
（ｈ）、（ｉ）、（ｊ）に示されている。前述したよう
に、当該サブパタン抽出処理における閾値は、図６
（ａ）の原２値画像「因」の平均線幅である。図６
（ａ）の原２値画像の走査時では、平均線幅値が大きか
ったため抽出されなかった「大」の字のサブパタンが、
平均線幅値まで太らせる処理によって、適切に抽出され
ていることがわかる。This processing is explained using FIG. 6 as an example. The partial pattern "大" extracted from the character "因" as the blurred pattern is shown in FIG. 6(f), and the blur restoration pattern obtained by restoring this blurred pattern to the average line width is shown in FIG. 6(o). The blur restoration sub-patterns obtained by scanning the blur restoration pattern in the horizontal, vertical, right-diagonal, and left-diagonal directions are shown in FIGS. 6(g), (h), (i), and (j), respectively. As stated above, the threshold in this sub-pattern extraction is the average line width of the original binary image "因" of FIG. 6(a). The sub-patterns of "大", which were not extracted when the original binary image of FIG. 6(a) was scanned because the average line width was large, are now properly extracted thanks to the thickening up to the average line width.

【００３３】次に、原２値画像に対する走査によって抽
出された水平サブパタン１、垂直サブパタン１、右斜め
サブパタン１、左斜めサブパタン１と、かすれ復元パタ
ンに対する走査によって抽出された水平かすれ復元サブ
パタン、垂直かすれ復元サブパタン、右斜めかすれ復元
サブパタン、左斜めかすれ復元サブパタンとをそれぞれ
合成する処理を行う。この合成処理は、各方向のサブパ
タンに対して、それぞれ独立に水平サブパタン合成部１
２５、垂直サブパタン合成部１２７、右斜めサブパタン
合成部１２９、左斜めサブパタン合成部１３１によって
実行される。合成されたパタンは、それぞれ水平サブパ
タン２、垂直サブパタン２、右斜めサブパタン２、左斜
めサブパタン２として、各々、水平サブパタン２メモリ
１２６、垂直サブパタン２メモリ１２８、右斜めサブパ
タン２メモリ１３０、左斜めサブパタンメモリ１３２に
格納される。Next, the horizontal sub-pattern 1, vertical sub-pattern 1, right-diagonal sub-pattern 1, and left-diagonal sub-pattern 1 extracted by scanning the original binary image are combined, direction by direction, with the horizontal, vertical, right-diagonal, and left-diagonal blur restoration sub-patterns extracted by scanning the blur restoration pattern. The synthesis is carried out independently for each direction by the horizontal sub-pattern synthesis unit 125, the vertical sub-pattern synthesis unit 127, the right-diagonal sub-pattern synthesis unit 129, and the left-diagonal sub-pattern synthesis unit 131. The synthesized patterns are stored as horizontal sub-pattern 2, vertical sub-pattern 2, right-diagonal sub-pattern 2, and left-diagonal sub-pattern 2 in the horizontal sub-pattern 2 memory 126, the vertical sub-pattern 2 memory 128, the right-diagonal sub-pattern 2 memory 130, and the left-diagonal sub-pattern 2 memory 132, respectively.

【００３４】ここで前記合成部におけるパタンの合成
は、例えば、２つのサブパタンの個々の画素についてＯ
Ｒ演算を行う方法等が用いられる。つまり、２つの２値
パタンを合成する場合、各々を構成する個々の画素にお
いて、少なくともどちらかが、画素値１、即ち、黒画素
であれば画素値１を出力し、両者ともに画素値０、即
ち、白画素であったときに画素値０を出力するという方
法で合成パタンを作成する。図６の場合では、原２値画
像のサブパタンとして、それぞれ、図６（ｂ）、
（ｃ）、（ｄ）、（ｅ）が与えられ、かすれ復元パタン
のサブパタンとしてはそれぞれ図６（ｇ）、（ｈ）、
（ｉ）、（ｊ）が与えられているときに前記合成部によ
って、合成されたサブパタン２は、各々図６（ｋ）、
（ｌ）、（ｍ）、（ｎ）となる。これらの合成されたサ
ブパタン２は、原２値画像のサブパタンと比較して、局
所的なスケ−ルの小さい部分が正確に反映され、しかも
平均線幅値まで復元されているので、そのサブパタンに
基づいて計算される特徴マトリクスにも当然それが反映
され、従って、従来のサブパタン抽出にともなう情報損
失による誤読等が防止できる。The synthesis units combine patterns by, for example, ORing the two sub-patterns pixel by pixel. That is, when two binary patterns are combined, the output pixel value is 1 if at least one of the two corresponding pixels has value 1 (black), and 0 if both have value 0 (white). In the case of FIG. 6, given the sub-patterns of the original binary image in FIGS. 6(b), (c), (d), and (e) and the sub-patterns of the blur restoration pattern in FIGS. 6(g), (h), (i), and (j), the synthesis units produce the sub-patterns 2 shown in FIGS. 6(k), (l), (m), and (n), respectively. Compared with the sub-patterns of the original binary image, these synthesized sub-patterns 2 accurately reflect the locally small-scale portions, restored to the average line width. The feature matrix calculated from them reflects this as well, which prevents misreading caused by the information loss of conventional sub-pattern extraction.
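
The pixel-wise OR described above is simple enough to show directly. The sketch assumes both patterns are equally sized lists of rows of 0/1 integers; `synthesize` is a hypothetical name.

```python
def synthesize(pattern_a, pattern_b):
    """Pixel-wise OR of two binary patterns of identical size."""
    return [[a | b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pattern_a, pattern_b)]


sub_pattern_1 = [[1, 0], [0, 0]]          # from the original binary image
restored_sub_pattern = [[0, 0], [0, 1]]   # from the blur restoration pattern
sub_pattern_2 = synthesize(sub_pattern_1, restored_sub_pattern)
```

The same operation would serve for combining the original binary image with the blur restoration pattern, since the patent states that the pattern synthesis unit works in the same way as the sub-pattern synthesis units.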

【００３５】上述の方法により作成されたサブパタン
は、特徴抽出部１３５においてさらに圧縮された特徴に
変換されるわけであるが、本実施例では、出力制御部１
３３を設けて、特徴抽出部１３５に入力させるサブパタ
ンを選択できるようにしている。この出力制御部１３３
は、前記微小セグメント除去部１３９においてかすれパ
タンを構成する各セグメントが全て微小であると判定さ
れた場合に、かすれパタンの復元及びかすれ復元パタン
に対する走査を中止させ、原２値画像に対する走査によ
って得られた各方向のサブパタン１をそれぞれのメモリ
１０７、１０９、１１１、１１３から読取り、特徴抽出
部１３５に出力する。また、前記かすれパタン線幅計算
部１１６で計算されたかすれパタンの線幅は、常時、線
幅判定部１３４で判定されており、前記線幅が所定の閾
値以下であると判定された場合、その判定結果は出力制
御部１３３に伝達される。この時、前記線幅に対する閾
値としては、例えば、次式が与えられる。
Ｗｓ ＜ δ × Ｗｒ　（５）
０ ＜ δ ≪ １　（６）
但し、Ｗｓはかすれパタンの線幅、Ｗｒは原２値画像の
線幅であって、式（５）及び式（６）の条件が満たされ
る時は、ＷｓがＷｒに比べて極端に小さいことを意味し
ている。The sub-patterns created by the above method are converted into more compressed features by the feature extraction unit 135. In this embodiment, an output control unit 133 is provided so that the sub-patterns fed to the feature extraction unit 135 can be selected. When the minute segment removal unit 139 judges every segment of the blurred pattern to be minute, the output control unit 133 cancels the restoration of the blurred pattern and the scanning of the blur restoration pattern, reads the sub-patterns 1 of each direction obtained by scanning the original binary image from their memories 107, 109, 111, and 113, and outputs them to the feature extraction unit 135. In addition, the line width of the blurred pattern calculated by the blurred pattern line width calculation unit 116 is always checked by the line width determination unit 134, and when it is judged to be at or below a predetermined threshold, the result is passed to the output control unit 133. The threshold condition on the line width is given, for example, by
Ws < δ × Wr (5)
0 < δ ≪ 1 (6)
where Ws is the line width of the blurred pattern and Wr is the line width of the original binary image. When conditions (5) and (6) hold, Ws is extremely small compared with Wr.

【００３６】さて、前記線幅判定部１３４からＷｓがＷ
ｒに比べて極端に小さいという判定結果を受けた出力制
御部１３３は上述した場合と同様にかすれパタンに対す
る走査を中止させ、原２値画像に対する走査によって得
られた各方向のサブパタン１をそれぞれのメモリ１０
７、１０９、１１１、１１３から読取り、特徴抽出部１
３５に出力する。以上の出力制御部１３３の処理は、以
下に述べる問題点に鑑みてなされたものである。On receiving from the line width determination unit 134 the determination that Ws is extremely small compared with Wr, the output control unit 133, as in the case described above, cancels the scanning for the blurred pattern, reads the sub-patterns 1 of each direction obtained by scanning the original binary image from their memories 107, 109, 111, and 113, and outputs them to the feature extraction unit 135. This processing by the output control unit 133 addresses the problem described below.

【００３７】かすれ復元サブパタンを原２値画像の走査
によって得られたサブパタン１に合成することは、除去
された重要な情報を回復させる一方で、その文字パタン
の非本質的なストロ−ク成分をもつけ加えてしまうおそ
れがある。従って本実施例では、非本質的なストロ−ク
成分の除去を目指すために、前述したように先ず、微小
セグメント除去部１３９において復元される前のかすれ
パタンの微小セグメントを除去し、また当然のことなが
ら全てのセグメントが微小と判定された場合には、原２
値画像に対する走査によって得られたサブパタン１だけ
を特徴抽出部１３５に出力するようにしたのである。さ
らに線幅判定部１３４を設け、かすれパタンの線幅が所
定の閾値に達しない場合にも当該かすれパタンは、認識
上、非本質的であると判定することにして、かかる場合
にかすれパタンの復元及び走査を実行せず、サブパタン
１のみを特徴抽出部１３５に出力するようにしたもので
ある。このようにすることで、非本質的なストロ−ク成
分はサブパタンから除去され、それによる誤読等を未然
に防止することが可能となる。Combining the blur restoration sub-patterns with the sub-patterns 1 obtained by scanning the original binary image recovers important information that had been removed, but it may also add stroke components that are not essential to the character pattern. To remove such non-essential stroke components, this embodiment first removes minute segments from the blurred pattern before restoration in the minute segment removal unit 139, as described above, and, naturally, when every segment is judged to be minute, outputs only the sub-patterns 1 obtained by scanning the original binary image to the feature extraction unit 135. In addition, the line width determination unit 134 is provided so that a blurred pattern whose line width does not reach a predetermined threshold is also judged non-essential for recognition; in that case the blurred pattern is neither restored nor scanned, and only the sub-patterns 1 are output to the feature extraction unit 135. Non-essential stroke components are thereby kept out of the sub-patterns, and the misreading they would cause is prevented.
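
The two conditions under which the output control unit 133 falls back to the sub-patterns 1 can be summarised as a small predicate. This is a sketch only: `use_restoration` is a hypothetical name, and the value of δ used in the example is an arbitrary illustration, since the patent requires only 0 < δ ≪ 1.

```python
def use_restoration(remaining_segments, ws, wr, delta):
    """Return True when the blurred pattern should be restored and rescanned.

    remaining_segments: number of segments left after minute segment removal
    ws: line width of the blurred pattern
    wr: line width of the original binary image
    delta: small positive constant of condition (6)
    """
    if remaining_segments == 0:   # every segment was judged minute
        return False
    if ws < delta * wr:           # condition (5): Ws is extremely small
        return False
    return True


# delta = 0.2 is an illustrative value, not one given in the patent.
decision = use_restoration(remaining_segments=2, ws=3.0, wr=4.0, delta=0.2)
```

When the predicate is false, only the sub-patterns 1 would be passed on to feature extraction; otherwise the synthesized sub-patterns 2 would be used.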

【００３８】特徴抽出部１３５では、入力された原２値
画像のサブパタンあるいは合成されたサブパタン４種に
基づいた特徴抽出を行うが、この特徴抽出を行う前に、
外接枠分割部１４４において、予め前記パタンレジスタ
１０３の外接枠内の文字パタンに対して、各分割領域内
の黒画素数が同数になるように垂直方向、水平方向に格
子状となるＮ×Ｍ個の部分領域に非線形分割するステッ
プがある。例えば、図９（ａ）に示された「土」という
文字は、その外接枠９０１を垂直、水平方向にそれぞれ
４分割ずつ計１６個の部分領域に分割された例である。
先ず、水平方向の分割線を決める際、図９（ｂ）に示さ
れたようにＹ軸に投影された周辺分布ヒストグラムを求
める。この周辺分布ヒストグラムとは、Ｘ軸に平行な走
査線上に存在する黒画素数をＹ＝０からＹ＝Ｙｅ（外接
枠Ｙ座標の最大値）の各々についてカウントして得られ
たヒストグラムのことであり、横軸は走査線のＹ座標、
縦軸は黒画素数として表されている。The feature extraction unit 135 extracts features from the four input sub-patterns, either those of the original binary image or the synthesized ones. Before this feature extraction, the circumscribing frame dividing unit 144 non-linearly divides the character pattern inside the circumscribing frame in the pattern register 103 into a lattice of N × M partial regions, placing the vertical and horizontal dividing lines so that each division contains the same number of black pixels. For example, FIG. 9(a) shows the character "土" with its circumscribing frame 901 divided into four in each of the vertical and horizontal directions, 16 partial regions in all. To determine the horizontal dividing lines, the marginal distribution histogram projected onto the Y axis, shown in FIG. 9(b), is first obtained. This histogram is obtained by counting the black pixels on each scanning line parallel to the X axis, for Y = 0 through Y = Ye (the maximum Y coordinate of the circumscribing frame); its horizontal axis is the Y coordinate of the scanning line and its vertical axis is the number of black pixels.

【００３９】ここで、Ｘ軸に平行な分割線によって区分
けされた各領域の黒画素数が同数になるように分割線の
位置を決めるために、次のような処理を行う。即ち周辺
分布ヒストグラムの探索をＹ＝０から開始してＹ＝１、
Ｙ＝２と順次探索し、各々の度数である黒画素数を足し
合わせていく。そしてそのヒストグラムの累積値が、２
値画像の総黒画素数を分割数で割った値に達した時、そ
の時点における走査線をもって分割線とする。図９
（ｂ）では、最初に分割線９０８が見いだされる。同様
に分割線９０８の次の走査線から始まり、新たに累積値
が求められていき、当該累積値が前述した所定の閾値に
達した走査線９０９が第２の分割線として検出される。
この例では４分割が採用されているので第３の分割線９
１０が検出された段階で水平方向の分割は終了する。図
９（ａ）では、これらの分割線９０８、９０９、９１０
はそれぞれ９０２、９０３、９０４に対応している。以
上のようにして、分割線９０２、９０３、９０４によっ
て、２値画像は水平方向に各々等しい黒画素数をもった
領域に４分割される。The positions of the dividing lines are determined as follows so that the regions separated by dividing lines parallel to the X axis contain equal numbers of black pixels. The marginal distribution histogram is searched in order from Y = 0 through Y = 1, Y = 2, and so on, accumulating the black pixel count of each scanning line. When the accumulated value reaches the total number of black pixels in the binary image divided by the number of divisions, the scanning line at that point becomes a dividing line. In FIG. 9(b), dividing line 908 is found first. Accumulation then restarts from the scanning line after dividing line 908, and the scanning line 909 at which the new accumulated value reaches the same threshold is detected as the second dividing line. Because this example uses four divisions, the horizontal division ends when the third dividing line 910 is detected. Dividing lines 908, 909, and 910 correspond to 902, 903, and 904 in FIG. 9(a), respectively. In this way, the horizontal dividing lines 902, 903, and 904 divide the binary image into four regions, each with an equal number of black pixels.
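
The cumulative search over the marginal distribution histogram can be sketched in Python as follows. The function names are hypothetical, and the pattern is assumed to be a list of rows of 0/1 integers covering the circumscribing frame.

```python
def row_histogram(pattern):
    """Black pixel count on each scanning line parallel to the X axis."""
    return [sum(row) for row in pattern]


def column_histogram(pattern):
    """Black pixel count on each scanning line parallel to the Y axis."""
    return [sum(column) for column in zip(*pattern)]


def dividing_lines(histogram, parts):
    """Positions at which the running count reaches total / parts.

    The running count restarts after each dividing line, and the search
    stops once parts - 1 lines have been found.
    """
    quota = sum(histogram) / parts
    lines, running = [], 0
    for position, count in enumerate(histogram):
        running += count
        if running >= quota:
            lines.append(position)
            running = 0
            if len(lines) == parts - 1:
                break
    return lines


lines = dividing_lines([0, 4, 0, 0, 4, 0, 4, 0, 4, 0], 4)
```

Here the histogram holds 16 black pixels, so each of the four regions should hold 4, and the dividing lines fall on positions 1, 4, and 6. Because the count restarts after each line, a scanning line that overshoots the quota does not carry its surplus into the next region, which follows the description but means the regions are only approximately equal on real images.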

【００４０】この分割例を見てわかるように、黒画素が
密集した領域、即ち、周辺分布ヒストグラムの極大付近
では、分割領域の幅が狭く、黒画素があまり多くない領
域では、分割領域の幅が広い。つまり、分割領域は２値
画像の黒画素の分布状態に敏感に依存している。このよ
うに分割することによって、文字パタンの様々な黒画素
分布の片寄りが緩和され、単なる外接枠の等分割よりも
遥かに正確に文字パタンの特徴を反映したマトリクスが
算出される。As this example shows, the divided regions are narrow where black pixels are dense, that is, near the maxima of the marginal distribution histogram, and wide where black pixels are sparse. In other words, the divided regions depend sensitively on the distribution of black pixels in the binary image. Dividing in this way evens out the various biases in the black pixel distribution of a character pattern and yields a matrix that reflects the features of the character pattern far more accurately than simple equal division of the circumscribing frame.

【００４１】全く同様の方法により、図９（ｃ）に示さ
れたＸ軸に投影された周辺分布ヒストグラムに基づい
て、垂直方向の分割線９１１、９１２、９１３が検出さ
れ、各々等しい黒画素数をもつ領域に４分割される。但
し、この垂直方向の分割線９１１、９１２、９１３は、
図９（ａ）における９０５、９０６、９０７に対応して
いる。In exactly the same way, the vertical dividing lines 911, 912, and 913 are detected from the marginal distribution histogram projected onto the X axis, shown in FIG. 9(c), dividing the image into four regions with equal numbers of black pixels. These vertical dividing lines 911, 912, and 913 correspond to 905, 906, and 907 in FIG. 9(a).

【００４２】以上のようにして、図９（ａ）の２値画像
は、水平方向の分割線９０２、９０３、９０４及び垂直
方向の分割線９０５、９０６、９０７によって１６個の
部分領域に分割されることになる。As described above, the binary image of FIG. 9(a) is divided into 16 partial regions by the horizontal dividing lines 902, 903, and 904 and the vertical dividing lines 905, 906, and 907.

【００４３】図９（ａ）に例として示した文字パタン
は、かすれた部分がない通常のパタンであるため、かす
れパタンとして抽出される部分が存在しないかまたは、
微小セグメント除去部１３９において、微小セグメント
として全て除去され、かすれパタンなしと判定され、第
１の走査によるサブパタンのみに基づく特徴抽出が実行
されることになる。従って、かすれパタン復元部１４０
による復元処理をうけることもないため、従来の分割方
法を踏襲した原２値画像の周辺分布から求められた分割
線は、対象とする２値画像の全ての黒画素の分布を正確
に反映するものとなっている。The character pattern shown as an example in FIG. 9(a) is an ordinary pattern with no blurred portion. Either no portion is extracted as a blurred pattern, or everything extracted is removed as minute segments by the minute segment removal unit 139, so that no blurred pattern is judged to exist and feature extraction is based only on the sub-patterns from the first scan. Since no restoration by the blurred pattern restoration unit 140 takes place, the dividing lines obtained from the marginal distribution of the original binary image, following the conventional division method, accurately reflect the distribution of all black pixels in the binary image being processed.

【００４４】しかし、一部がかすれパタンとして抽出さ
れ、さらにそこが復元された文字パタンに対して、従来
の分割方法をそのまま原２値画像に適用すると問題点が
発生する。例えば、図９（ａ）の「土」という文字にお
いて、垂直方向のストロ−ク成分がかすれてしまった図
１０（ａ）のような文字パタンを考える。図１０（ａ）
の２値画像におけるＹ軸、Ｘ軸に投影された周辺分布ヒ
ストグラムは、それぞれ図１０（ｂ）及び図１０（ｃ）
に示されており、前述した方法によって得られた水平方
向の分割線は、１００８、１００９、１０１０、垂直方
向の分割線は、１０１１、１０１２、１０１３で与えら
れている。但しこれらの分割線は図１０（ａ）におい
て、それぞれ１００２、１００３、１００４、１００
５、１００６、１００７に対応している。この時、これ
らの分割線は、垂直方向のストロ−クがかすれて黒画素
数が少なくなったことにより、それがかすれていない場
合とは微妙に異なる位置に設定される。However, a problem arises when the conventional division method is applied as-is to the original binary image of a character pattern part of which has been extracted as a blurred pattern and then restored. Consider, for example, the character pattern of FIG. 10(a), which is the character "土" of FIG. 9(a) with its vertical stroke component blurred. The marginal distribution histograms of the binary image of FIG. 10(a) projected onto the Y axis and the X axis are shown in FIGS. 10(b) and 10(c), respectively. The horizontal dividing lines obtained by the method described above are 1008, 1009, and 1010, and the vertical dividing lines are 1011, 1012, and 1013; these correspond to 1002, 1003, 1004, 1005, 1006, and 1007 in FIG. 10(a), respectively. Because the blurred vertical stroke contributes fewer black pixels, these dividing lines fall at positions slightly different from those obtained when the stroke is not blurred.

【００４５】例えば、図１０（ｃ）において１０１２と
１０１３の分割線で仕切られた領域の横幅は、垂直方向
のストロ−ク成分がかすれていない時よりも黒画素数を
稼ぐ必要から幅が広くなっている。従来の方法では、以
上のように原２値画像に対して求められた分割線１００
２、１００３、１００４及び１００５、１００６、１０
０７によって、外接枠を分割し、それぞれの部分領域毎
に後述する特徴を求めていた。しかし、原２値画像に基
づいて決定された分割領域を用いて、復元された合成サ
ブパタンの特徴を計算すると、上述のように分割領域と
サブパタンとの対応関係にはずれがあるので、認識性能
が低下するという問題点が発生する。For example, in FIG. 10(c) the region between dividing lines 1012 and 1013 is wider than when the vertical stroke component is not blurred, because it must extend further to gather the required number of black pixels. In the conventional method, the circumscribing frame was divided by the dividing lines 1002, 1003, 1004 and 1005, 1006, 1007 obtained in this way from the original binary image, and the features described later were computed for each partial region. But if the features of the restored, synthesized sub-patterns are computed using divided regions determined from the original binary image, the divided regions and the sub-patterns no longer correspond, as described above, and recognition performance degrades.

【００４６】本発明では、上記の問題点を解決するため
に、原２値画像に対して外接枠を分割するのではなく、
原２値画像とかすれ復元パタンとを合成することによ
り、合成文字パタンを作成し、当該合成文字パタンの周
辺分布に基づいて外接枠を分割するようにしたので、分
割領域にも復元した効果が反映され、合成サブパタンと
分割領域との不一致がなくなり、認識性能の低下が防止
可能となっている。この点が本発明の一つの大きな特徴
である。To solve this problem, the present invention does not divide the circumscribing frame based on the original binary image. Instead, it creates a synthesized character pattern by combining the original binary image with the blur restoration pattern and divides the circumscribing frame based on the marginal distribution of that synthesized character pattern. The effect of the restoration is thereby reflected in the divided regions as well, the mismatch between the synthesized sub-patterns and the divided regions disappears, and the loss of recognition performance is prevented. This is one of the major features of the present invention.

【００４７】上記方法を実現するために、実施例１では
図１のパタン合成部１４２、合成文字パタンメモリ１４
３を設けている。先ずパタン合成部１４２において、パ
タンレジスタ１０３の外接枠内の原２値画像とかすれ復
元パタンメモリ１４１内のかすれ復元パタンとの合成を
行い、合成文字パタンとして合成文字パタンメモリ１４
３に出力する。そして、この合成文字パタンに対して、
外接枠分割部１４４において前記分割処理を施し、分割
領域を得る。ここで、パタン合成部１４２における合成
とは、サブパタン合成部等で行われる処理と同様であ
る。To implement this method, Embodiment 1 provides the pattern synthesis unit 142 and the synthesized character pattern memory 143 of FIG. 1. First, the pattern synthesis unit 142 combines the original binary image inside the circumscribing frame in the pattern register 103 with the blur restoration pattern in the blur restoration pattern memory 141 and outputs the result to the synthesized character pattern memory 143 as the synthesized character pattern. The circumscribing frame dividing unit 144 then applies the division processing described above to this synthesized character pattern to obtain the divided regions. The synthesis in the pattern synthesis unit 142 is the same as the processing performed in the sub-pattern synthesis units.

【００４８】例えば、図１０（ａ）の垂直方向のストロ
−ク成分がかすれた「土」という文字に本発明を適用す
る例を考える。このかすれたストロ−ク成分を復元する
と、これらのストロ−クは、線幅が太くなるとともに連
結するので、その合成文字パタンは図９（ａ）に類似し
たパタンになる。ここで、仮に図９（ａ）を図１０
（ａ）の合成文字パタンとすると、外接枠分割部１４４
によって得られる分割線は、従来は１００２、１００
３、１００４、１００５、１００６、１００７であった
ものが、９０２、９０３、９０４、９０５、９０６、９
０７となり、これらの分割線の座標値が特徴抽出部１３
５に出力される。尚、かすれパタンの復元処理及びサブ
パタン抽出処理がなされなかった時は、当然ながら合成
文字パタンは作成されず、従来通りにパタンレジスタ１
０３の外接枠内の２値画像に対する分割線が検出され、
それらの座標値が特徴抽出部１３５に出力される。For example, consider applying the present invention to the character "土" of FIG. 10(a), whose vertical stroke component is blurred. When the blurred stroke components are restored, the strokes thicken and join together, so the synthesized character pattern becomes a pattern similar to FIG. 9(a). If FIG. 9(a) is taken to be the synthesized character pattern of FIG. 10(a), the dividing lines obtained by the circumscribing frame dividing unit 144 are 902, 903, 904, 905, 906, and 907 rather than the conventional 1002, 1003, 1004, 1005, 1006, and 1007, and the coordinate values of these dividing lines are output to the feature extraction unit 135. When the blurred pattern restoration and the sub-pattern extraction are not performed, no synthesized character pattern is created; the dividing lines are detected, as before, for the binary image inside the circumscribing frame in the pattern register 103, and their coordinate values are output to the feature extraction unit 135.

【００４９】次に前記４種のサブパタン１または合成さ
れたサブパタン２のそれぞれについて、前記分割された
部分領域内における該サブパタンの黒画素数を計数し、
これを文字パタンの大きさで正規化することによって、
各方向における文字線の分布状態を反映するＮ×Ｍ×４
次元の特徴マトリクスを抽出し、識別部１３６に出力す
る。Next, for each of the four sub-patterns 1 or the four synthesized sub-patterns 2, the number of black pixels of the sub-pattern in each of the divided partial regions is counted and normalized by the size of the character pattern. This yields an N × M × 4-dimensional feature matrix reflecting the distribution of character lines in each direction, which is output to the identification unit 136.
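
A minimal sketch of the feature matrix computation is given below. Two points are assumptions: a dividing line is taken to belong to the region that ends on it, and the normalising quantity is passed in as a plain number `norm` because the patent says only that the counts are normalized by the size of the character pattern. The names `region_ranges` and `feature_matrix` are hypothetical.

```python
def region_ranges(cuts, length):
    """Turn dividing-line positions into half-open (start, end) index ranges."""
    edges = [0] + [cut + 1 for cut in cuts] + [length]
    return list(zip(edges[:-1], edges[1:]))


def feature_matrix(sub_patterns, row_cuts, col_cuts, norm):
    """Normalized black pixel count per sub-pattern and per partial region.

    Returns a nested list indexed as [sub-pattern][region row][region column].
    """
    rows, cols = len(sub_patterns[0]), len(sub_patterns[0][0])
    row_ranges = region_ranges(row_cuts, rows)
    col_ranges = region_ranges(col_cuts, cols)
    features = []
    for sub in sub_patterns:
        grid = []
        for r0, r1 in row_ranges:
            grid.append([
                sum(sub[r][c] for r in range(r0, r1) for c in range(c0, c1)) / norm
                for c0, c1 in col_ranges
            ])
        features.append(grid)
    return features


top_bar = [[1, 1, 1, 1],
           [0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
# One dividing line per axis gives a 2 x 2 lattice; norm = 16 is the frame area.
features = feature_matrix([top_bar], row_cuts=[1], col_cuts=[1], norm=16)
```

With four directional sub-patterns and three dividing lines per axis, the same function would return the 4 × 4 × 4 values that make up the feature matrix of the example in FIG. 9.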

【００５０】識別部１３６では、前記特徴マトリクスと
辞書メモリ１３７に予め格納しておいた複数の標準文字
の特徴マトリクスとを照合し、最終的に一つに絞られた
候補カテゴリを該入力文字パタンの認識結果１３８とし
て出力する。The identification unit 136 collates this feature matrix with the feature matrices of a plurality of standard characters stored in advance in the dictionary memory 137 and outputs the candidate category finally narrowed down to one as the recognition result 138 for the input character pattern.

【００５１】実施例１は、文字や図形を構成するストロ
−クの局所線幅が２つに分類できるときに極めて有効な
方法であった。しかし、通常の簡単な文字は、２種類の
線幅による走査でもサブパタンにほぼ反映できるとみな
せる一方、３種類以上のスケ−ルのストロ−クからなる
複雑な図形や漢字等では、２段階の走査でもとらえきれ
ないストロ−ク成分が存在し得る。実施例２は、このよ
うな問題点に鑑みて発明されたものであり、実施例１が
２段階の線幅による走査であったのに対し、実施例２
は、これをさらに一般化し、Ｍ段階（Ｍ≧２）の走査が
可能となっている。この実施例２について以下に説明す
る。Embodiment 1 is an extremely effective method when the local line widths of the strokes making up a character or figure can be classified into two classes. For ordinary simple characters, scanning with two line widths can be regarded as reflecting almost all strokes in the sub-patterns; in complicated figures, kanji, and the like made up of strokes of three or more scales, however, there may be stroke components that even two-stage scanning cannot capture. Embodiment 2 was devised in view of this problem: whereas Embodiment 1 scans with two stages of line width, Embodiment 2 generalizes this to allow M stages (M ≧ 2) of scanning. Embodiment 2 is described below.

【００５２】図２は本発明による実施例２を示すブロッ
ク図である。ここで、２０１は光信号入力、２０２は光
電変換部、２０３はパタンレジスタ、２０４は外接枠検
出部、２０５はレジスタ、２０６は線幅計算部、２０７
は水平方向走査部、２０８は水平パタンメモリ、２０９
は垂直方向走査部、２１０は垂直パタン走査部、２１１
は右斜め方向走査部、２１２は右斜めパタンメモリ、２
１３は左斜め方向走査部、２１４は左斜めパタンメモ
リ、２１５はかすれパタン抽出部、２１６は微小セグメ
ント除去部、２１７は線幅判定部、２１８は水平パタン
合成部、２１９は水平合成パタンメモリ、２２０は垂直
パタン合成部、２２１は垂直合成パタンメモリ、２２２
は右斜めパタン合成部、２２３は右斜め合成パタンメモ
リ、２２４は左斜めパタン合成部、２２５は左斜め合成
パタンメモリ、２２６はル−プカウンタ、２２７は出力
制御部、２２８は特徴抽出部、２２９は識別部、２３０
は辞書メモリ、２３１は認識結果、２３２はかすれパタ
ン復元部、２３３はパタン合成部、２３４は合成文字パ
タンメモリ、２３５は外接枠分割部である。FIG. 2 is a block diagram showing Embodiment 2 of the present invention. Here, 201 is an optical signal input, 202 is a photoelectric conversion unit, 203 is a pattern register, 204 is a circumscribing frame detection unit, 205 is a register, 206 is a line width calculation unit, 207 is a horizontal scanning unit, 208 is a horizontal pattern memory, 209 is a vertical scanning unit, 210 is a vertical pattern memory, 211 is a right diagonal scanning unit, 212 is a right diagonal pattern memory, 213 is a left diagonal scanning unit, 214 is a left diagonal pattern memory, 215 is a blur pattern extraction unit, 216 is a minute segment removal unit, 217 is a line width determination unit, 218 is a horizontal pattern synthesis unit, 219 is a horizontal composite pattern memory, 220 is a vertical pattern synthesis unit, 221 is a vertical composite pattern memory, 222 is a right diagonal pattern synthesis unit, 223 is a right diagonal composite pattern memory, 224 is a left diagonal pattern synthesis unit, 225 is a left diagonal composite pattern memory, 226 is a loop counter, 227 is an output control unit, 228 is a feature extraction unit, 229 is an identification unit, 230 is a dictionary memory, 231 is a recognition result, 232 is a blur pattern restoration unit, 233 is a pattern synthesis unit, 234 is a composite character pattern memory, and 235 is a circumscribing frame dividing unit.

【００５３】ここでは、主として実施例１との相違点に
ついて説明する。先ず、２０１、２０２、２０３、２０
４は実施例１に準じ、パタンレジスタ２０３の２値画像
のうち、外接枠内のデ−タだけが、レジスタ２０５に転
送される。後述するようにこのレジスタ２０５には、文
字パタンの２値デ−タだけでなく、かすれパタンも順
次、上書きされる。線幅計算部２０６はこのレジスタ２
０５内のデ−タに対し、線幅の計算を行う。今は、文字
パタンの２値デ−タが格納されているので、文字パタン
の平均線幅が計算される。この線幅の算出も実施例１の
方法を準用する。The description here focuses mainly on the differences from Embodiment 1. First, 201, 202, 203, and 204 operate as in Embodiment 1, and of the binary image in the pattern register 203, only the data within the circumscribing frame is transferred to the register 205. As described later, not only the binary data of the character pattern but also the blur patterns are successively overwritten into this register 205. The line width calculation unit 206 calculates the line width of the data in the register 205. At this point the register holds the binary data of the character pattern, so the average line width of the character pattern is calculated. The line width is calculated by the same method as in Embodiment 1.

【００５４】次に、実施例１と同様に、このレジスタ２
０５内の２値デ−タに対して、水平方向走査部２０７、
垂直方向走査部２０９、右斜め方向走査部２１１、左斜
め方向走査部２１３により、それぞれ水平、垂直、右斜
め、左斜め方向に走査し、前記線幅を閾値として、サブ
パタンを抽出し、各々、水平パタンメモリ２０８、垂直
パタンメモリ２１０、右斜めパタンメモリ２１２、左斜
めパタンメモリ２１４に格納する。Next, as in Embodiment 1, the binary data in the register 205 is scanned in the horizontal, vertical, right diagonal, and left diagonal directions by the horizontal scanning unit 207, the vertical scanning unit 209, the right diagonal scanning unit 211, and the left diagonal scanning unit 213, respectively. Sub-patterns are extracted using the line width as a threshold and stored in the horizontal pattern memory 208, the vertical pattern memory 210, the right diagonal pattern memory 212, and the left diagonal pattern memory 214, respectively.
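
The directional scan can be illustrated with a short sketch. It assumes, following the wording of the claim, that a run of consecutive black pixels along the scanning direction counts as a stroke when its length exceeds a threshold derived from the line width. The function name `extract_subpattern`, the `DIRECTIONS` table, and the orientation chosen for the two diagonals are illustrative assumptions, not details taken from the specification.

```python
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "right_diagonal": (1, -1),  # orientation is an assumption
    "left_diagonal": (1, 1),    # orientation is an assumption
}


def extract_subpattern(img, direction, threshold):
    """Keep only black runs along `direction` that are longer than `threshold`."""
    h, w = len(img), len(img[0])
    dy, dx = direction
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            # Only start from the first pixel of a run along this direction.
            py, px = y - dy, x - dx
            if 0 <= py < h and 0 <= px < w and img[py][px]:
                continue
            run = []
            cy, cx = y, x
            while 0 <= cy < h and 0 <= cx < w and img[cy][cx]:
                run.append((cy, cx))
                cy, cx = cy + dy, cx + dx
            if len(run) > threshold:
                for ry, rx in run:
                    out[ry][rx] = 1
    return out


# A plus-shaped toy character: one horizontal and one vertical stroke.
plus = [[0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 1, 0, 0]]
subs = {name: extract_subpattern(plus, d, 2) for name, d in DIRECTIONS.items()}
```

In this toy case the horizontal sub-pattern keeps only the middle row, the vertical sub-pattern keeps only the middle column, and both diagonal sub-patterns are empty because no diagonal run is longer than the threshold.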

【００５５】次に、かすれパタン抽出部２１５におい
て、レジスタ２０５の文字パタンの２値デ−タとメモリ
２０８、２１０、２１２、２１４に格納されたサブパタ
ンより、かすれパタンを抽出し、レジスタ２０５に転送
する。この時、かすれパタンの抽出は、実施例１の図３
に示したかすれパタン抽出の処理によって行い、このか
すれパタンを便宜上、かすれパタン１としておく。そし
て、微小セグメント除去部２１６でかすれパタン１の微
小セグメントの除去を行い、残ったセグメント数等をチ
ェックした後、線幅計算部２０６においてかすれパタン
１の線幅の計算を行い、さらに線幅判定部２１７で、前
記線幅値に基づいてかすれパタン１の走査を行うか否か
を判定する。但し、微小セグメント除去部２１６または
線幅判定部２１７の判定は、実施例１に準用する。Next, the blur pattern extraction unit 215 extracts the blur pattern from the binary data of the character pattern in the register 205 and the sub-patterns stored in the memories 208, 210, 212, and 214, and transfers it to the register 205. The blur pattern is extracted by the blur pattern extraction processing of Embodiment 1 shown in FIG. 3, and for convenience this blur pattern is called blur pattern 1. The minute segment removal unit 216 then removes the minute segments of blur pattern 1 and checks, among other things, the number of remaining segments. After that, the line width calculation unit 206 calculates the line width of blur pattern 1, and the line width determination unit 217 determines, based on that line width value, whether blur pattern 1 should be scanned. The determinations made by the minute segment removal unit 216 and the line width determination unit 217 follow those of Embodiment 1.
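
A minimal sketch of these two steps follows. The blur pattern is taken, as in the claim, to be the set of black pixels of the character that belong to none of the four sub-patterns. The segment criterion is not given in this section, so 8-connectivity, the pixel-count threshold `min_pixels`, and both function names are assumptions for illustration.

```python
def extract_blur_pattern(char_img, subpatterns):
    """Black pixels of the character that appear in none of the sub-patterns."""
    h, w = len(char_img), len(char_img[0])
    return [[1 if char_img[y][x] and not any(sp[y][x] for sp in subpatterns) else 0
             for x in range(w)] for y in range(h)]


def remove_minute_segments(img, min_pixels):
    """Drop 8-connected segments with fewer than `min_pixels` black pixels.

    Both 8-connectivity and the pixel-count criterion are assumptions.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            stack, segment = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                segment.append((cy, cx))
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(segment) >= min_pixels:
                for sy, sx in segment:
                    out[sy][sx] = 1
    return out


# Toy example: row 0 was captured as a horizontal sub-pattern, row 2 was not.
char_img = [[1, 1, 1, 1, 1, 0],
            [0, 0, 0, 0, 0, 0],
            [1, 1, 0, 0, 0, 1],
            [0, 0, 0, 0, 0, 0]]
horizontal = [[1, 1, 1, 1, 1, 0]] + [[0] * 6 for _ in range(3)]
empty = [[0] * 6 for _ in range(4)]
blur = extract_blur_pattern(char_img, [horizontal, empty, empty, empty])
cleaned = remove_minute_segments(blur, 2)
```

Here the two-pixel fragment in row 2 survives as a candidate for restoration, while the isolated single pixel is discarded as a minute segment.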

【００５６】ここでかすれパタン１について、走査をす
る必要はないと判定されると、メモリ２０８、２１０、
２１２、２１４に格納されたサブパタンは、出力制御部
２２７を通じて特徴抽出部２２８に出力され、また走査
する必要ありと判定された場合には、各々、水平合成パ
タンメモリ２１９、垂直合成パタンメモリ２２１、右斜
め合成パタンメモリ２２３、左斜め合成パタンメモリ２
２５に転送される。尚、図３において、メモリ１０７、
１０９、１１１、１１３は、図２におけるメモリ２０
８、２１０、２１２、２１４に相当し、文字パタンメモ
リ３０４は、レジスタ２０５に置き換えるものとする。If it is determined that blur pattern 1 does not need to be scanned, the sub-patterns stored in the memories 208, 210, 212, and 214 are output to the feature extraction unit 228 through the output control unit 227. If it is determined that scanning is necessary, they are transferred to the horizontal composite pattern memory 219, the vertical composite pattern memory 221, the right diagonal composite pattern memory 223, and the left diagonal composite pattern memory 225, respectively. In FIG. 3, the memories 107, 109, 111, and 113 correspond to the memories 208, 210, 212, and 214 in FIG. 2, and the character pattern memory 304 is to be read as the register 205.

【００５７】次に実施例１と同様に、レジスタ２０５内
のかすれパタン１に対して、かすれパタン復元部２３２
においてかすれ復元パタン１が生成される。そして、こ
のかすれ復元パタン１は各方向の走査部２０７、２０
９、２１１、２１３により再度走査され、平均線幅に基
づいて、かすれ復元パタン１のサブパタン、即ち、かす
れ復元サブパタン１が抽出され、各々、メモリ２０８、
２１０、２１２、２１４に格納される。Next, as in Embodiment 1, the blur pattern restoration unit 232 generates blur restoration pattern 1 from blur pattern 1 in the register 205. Blur restoration pattern 1 is then scanned again by the scanning units 207, 209, 211, and 213 for the respective directions, and the sub-patterns of blur restoration pattern 1, that is, blur restoration sub-patterns 1, are extracted based on the average line width and stored in the memories 208, 210, 212, and 214, respectively.

【００５８】次に水平パタン合成部２１８、垂直パタン
合成部２２０、右斜めパタン合成部２２２、左斜めパタ
ン合成部２２４において、メモリ２０８、２１０、２１
２、２１４に格納されたかすれ復元サブパタン１とメモ
リ２１９、２２１、２２３、２２５に格納されたサブパ
タンとが合成され、合成サブパタン１として、再び、メ
モリ２１９、２２１、２２３、２２５に出力される。前
記合成サブパタン１は、実施例１において、２度のサブ
パタン抽出の結果合成されたものと同一のものである。
しかし実施例２では、再度パタンレジスタ２０３の外接
枠内文字パタンをレジスタ２０５に転送し、かすれパタ
ン抽出部２１５において、この文字パタンの２値デ−タ
と合成されたサブパタン１とを用いて、２度目の走査に
よっても検出されなかったストロ−ク成分を抽出し、こ
れをかすれパタン２としてレジスタ２０５に格納するこ
とが可能となっている。ここで、図３におけるメモリ１
０７、１０９、１１１、１１３は、図２における合成パ
タンメモリ２１９、２２１、２２３、２２５に相当し、
文字パタンメモリ３０４は、レジスタ２０５に置き換え
るものとする。Next, in the horizontal pattern synthesis unit 218, the vertical pattern synthesis unit 220, the right diagonal pattern synthesis unit 222, and the left diagonal pattern synthesis unit 224, the blur restoration sub-patterns 1 stored in the memories 208, 210, 212, and 214 are combined with the sub-patterns stored in the memories 219, 221, 223, and 225, and the results are output again to the memories 219, 221, 223, and 225 as composite sub-patterns 1. Composite sub-pattern 1 is identical to the pattern synthesized in Embodiment 1 as the result of two sub-pattern extractions. In Embodiment 2, however, the character pattern within the circumscribing frame of the pattern register 203 can be transferred to the register 205 again, and the blur pattern extraction unit 215 can use the binary data of this character pattern and composite sub-pattern 1 to extract the stroke components that were not detected even by the second scan and store them in the register 205 as blur pattern 2. Here, the memories 107, 109, 111, and 113 in FIG. 3 correspond to the composite pattern memories 219, 221, 223, and 225 in FIG. 2, and the character pattern memory 304 is to be read as the register 205.

【００５９】次にかすれパタン２に対しても、かすれパ
タン復元部２３２で平均線幅までの復元処理を行い、か
すれ復元パタンを作成し、平均線幅を閾値とした走査に
よってかすれ復元サブパタン２を求め、合成部２１８、
２２０、２２２、２２４において、メモリ２１９、２２
１、２２３、２２５に格納された合成サブパタン１との
合成を行い、再びメモリ２１９、２２１、２２３、２２
５に合成サブパタン２として出力する。全く同様にし
て、かすれパタンＫに対して、かすれ復元パタンＫを作
成し、平均線幅を閾値とした走査によってかすれ復元サ
ブパタンＫを求め、合成部２１８、２２０、２２２、２
２４において、メモリ２１９、２２１、２２３、２２５
に格納された合成サブパタンＫ−１との合成を行い、再
びメモリ２１９、２２１、２２３、２２５に合成サブパ
タンＫとして出力する。Next, blur pattern 2 is likewise restored up to the average line width by the blur pattern restoration unit 232 to create a blur restoration pattern, and blur restoration sub-pattern 2 is obtained by scanning with the average line width as the threshold. The synthesis units 218, 220, 222, and 224 combine it with composite sub-pattern 1 stored in the memories 219, 221, 223, and 225, and the result is output again to the memories 219, 221, 223, and 225 as composite sub-pattern 2. In exactly the same way, blur restoration pattern K is created for blur pattern K, blur restoration sub-pattern K is obtained by scanning with the average line width as the threshold, the synthesis units 218, 220, 222, and 224 combine it with composite sub-pattern K-1 stored in the memories 219, 221, 223, and 225, and the result is output again to the memories 219, 221, 223, and 225 as composite sub-pattern K.

【００６０】ル−プカウンタ２２６は、サブパタンの合
成回数Ｋをカウントし、Ｋが所定の閾値Ｍに達した場
合、出力制御部２２７にそのことを通知する。その時、
出力制御部２２７では、メモリ２１９、２２１、２２
３、２２５に格納されていた合成サブパタンＭを特徴抽
出部２２８に転送する。尚、合成回数ＫがＭに達しない
場合でも、微小セグメント除去部２１６または線幅判定
部２１７において、かすれパタンＫの復元及び走査の必
要がないと判定された場合は、その時点の合成サブパタ
ンＫが特徴抽出部２２８に転送される。The loop counter 226 counts the number K of sub-pattern synthesis operations and, when K reaches a predetermined threshold M, notifies the output control unit 227. The output control unit 227 then transfers composite sub-pattern M stored in the memories 219, 221, 223, and 225 to the feature extraction unit 228. Even if the synthesis count K has not reached M, when the minute segment removal unit 216 or the line width determination unit 217 determines that blur pattern K needs no restoration and scanning, composite sub-pattern K at that point is transferred to the feature extraction unit 228.
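
The control flow of Embodiment 2 can be summarized as a loop with two exit conditions: the synthesis count reaching M, or a blur pattern judged not worth restoring. The sketch below shows only that control flow. Every processing step is passed in as a callable, and the function name, parameter names, and the set-based stand-ins in the example are hypothetical placeholders rather than the units of FIG. 2.

```python
def build_composite_subpatterns(char_img, extract_sub, extract_blur,
                                needs_restoration, restore, combine, max_k):
    """Repeat blur extraction, restoration, and synthesis at most `max_k` times.

    Returns the final composite sub-patterns and the number of syntheses done.
    """
    composite = extract_sub(char_img)           # first scan of the character
    k = 0
    while k < max_k:                            # loop counter against threshold M
        blur = extract_blur(char_img, composite)
        if not needs_restoration(blur):         # early exit before reaching M
            break
        restored_subs = extract_sub(restore(blur))
        composite = [combine(a, b) for a, b in zip(composite, restored_subs)]
        k += 1
    return composite, k


# Toy stand-ins on plain sets, used only to exercise the loop control:
# each "scan" captures two more elements of whatever it is given.
toy = dict(
    extract_sub=lambda p: [set(sorted(p)[:2])],
    extract_blur=lambda c, comp: c - set().union(*comp),
    needs_restoration=bool,
    restore=lambda b: b,
    combine=lambda a, b: a | b,
)
full, k_full = build_composite_subpatterns(set(range(6)), max_k=5, **toy)
capped, k_capped = build_composite_subpatterns(set(range(6)), max_k=1, **toy)
```

The first call stops early because nothing is left to restore, and the second stops because the counter reaches its limit, which mirrors the two transfer conditions described in the text.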

【００６１】また、パタン合成部２３３では、パタンレ
ジスタ２０３の外接枠内の２値画像を基にしてレジスタ
２０５に出力されたかすれ復元パタン１，２，．．，Ｋ
を順次合成し、合成文字パタンとしてメモリ２３４に出
力する。出力制御部２２７が、合成サブパタンＭを特徴
抽出部２２８に出力する時、外接枠分割部２３５は、メ
モリ２３４に格納されている原２値画像とかすれ復元パ
タン１，２，．．，Ｍの合成である合成文字パタンの周
辺分布に基づく外接枠分割を行い、分割座標を特徴抽出
部２２８に通知する。In the pattern synthesis unit 233, the blur restoration patterns 1, 2, ..., K output to the register 205 are successively combined onto the binary image within the circumscribing frame of the pattern register 203, and the result is output to the memory 234 as the composite character pattern. When the output control unit 227 outputs composite sub-pattern M to the feature extraction unit 228, the circumscribing frame dividing unit 235 divides the circumscribing frame based on the peripheral distribution of the composite character pattern stored in the memory 234, which is the combination of the original binary image and blur restoration patterns 1, 2, ..., M, and notifies the feature extraction unit 228 of the division coordinates.
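
The following sketch illustrates the two operations in this paragraph under stated assumptions. Combining patterns is taken to be a pixel-wise OR. The division rule itself is not restated in this section, so placing the boundaries where the cumulative projection (peripheral distribution) reaches equal shares of the black pixels is an assumption, as are the function names.

```python
def combine_patterns(base, restored_patterns):
    """Pixel-wise OR of the original binary image with each restoration pattern."""
    out = [row[:] for row in base]
    for pat in restored_patterns:
        for y, row in enumerate(pat):
            for x, v in enumerate(row):
                if v:
                    out[y][x] = 1
    return out


def divide_by_projection(profile, parts):
    """Split positions 0..len(profile) into `parts` bands of roughly equal mass.

    Returns parts + 1 boundary positions. Equal-mass division is an assumption,
    and bands may be empty when the mass is concentrated in a few positions.
    """
    total = sum(profile)
    bounds, running, target = [0], 0, 1
    for pos, value in enumerate(profile):
        running += value
        while target < parts and running * parts >= total * target:
            bounds.append(pos + 1)
            target += 1
    bounds.append(len(profile))
    return bounds


original = [[1, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
restored = [[0, 0, 0, 0],
            [0, 0, 1, 1],
            [0, 0, 0, 0]]
composite = combine_patterns(original, [restored])
col_bounds = divide_by_projection([sum(c) for c in zip(*composite)], 2)
row_bounds = divide_by_projection([sum(r) for r in composite], 2)
```

Because the projections are taken from the composite pattern, the restored pixels shift the boundaries, which is the effect the text attributes to dividing on the composite character pattern rather than on the original image.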

【００６２】特徴抽出部２２８、識別部２２９、辞書メ
モリ２３０、認識結果２３１は、全て実施例１と同様で
あるので説明を省略する。The feature extraction unit 228, the identification unit 229, the dictionary memory 230, and the recognition result 231 are all the same as those in the first embodiment, and therefore their explanations are omitted.

【００６３】以上、実施例２によれば、Ｍ回の走査によ
って、それぞれ線幅の異なるＭ種のストロ−ク成分を反
映したサブパタンが作成でき、従って、Ｍ種の線幅のス
トロ−クからなる複雑な漢字や図形等に対しても高精度
な認識性能を安定に維持できる。また、実施例１は、実
施例２においてＭ＝１としたものと同等であり、実施例
２の特殊な場合に相当している。As described above, according to Embodiment 2, M scans produce sub-patterns reflecting M kinds of stroke components with different line widths, so highly accurate recognition performance can be stably maintained even for complicated kanji, figures, and the like made up of strokes of M different line widths. Embodiment 1 is equivalent to Embodiment 2 with M = 1 and corresponds to a special case of Embodiment 2.

【００６４】尚、実施例１及び実施例２は、上述した例
のみに限定されるものではない。例えば、かすれパタン
抽出部１１４または２１５におけるかすれパタン抽出手
段は図３に示された方法だけでなく、ＯＲ、ＮＯＲ、Ａ
ＮＤ，ＮＡＮＤ、ＮＯＴ回路等を組み合わせることによ
って、同一の結果を出力する方法がいくつか考えられる
が、如何なる方法であっても本実施例で定義されたかす
れパタンを抽出できれば、それらは全て本発明に属す
る。Embodiments 1 and 2 are not limited to the examples described above. For example, the blur pattern extraction means in the blur pattern extraction unit 114 or 215 is not limited to the method shown in FIG. 3. Several methods that output the same result are conceivable by combining OR, NOR, AND, NAND, NOT circuits, and the like, and any method that can extract the blur pattern as defined in these embodiments falls within the present invention.

【００６５】また図７において、かすれパタンのストロ
−クを平均線幅まで太らせる処理として、輪郭点系列の
外側に順次、黒画素を追加する手段を講じたが、線幅を
増大させる処理であれば、特に本方法に限定する必要は
なく、任意に設定可能である。In FIG. 7, the strokes of the blur pattern are thickened to the average line width by successively adding black pixels outside the contour point sequence. However, any process that increases the line width may be used, and the method can be chosen freely without being limited to this one.
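
As one example of such an alternative, the sketch below thickens a pattern by repeated 3 × 3 dilation instead of tracing the contour point sequence. It is a swapped-in technique, not the method of FIG. 7. The names `dilate_once` and `thicken`, and the simplification that each pass increases the nominal line width by 2, are assumptions; note that plain dilation also lengthens strokes at their ends.

```python
def dilate_once(img):
    """Add one layer of black pixels around every segment (3x3 neighbourhood)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        out[ny][nx] = 1
    return out


def thicken(img, current_width, target_width):
    """Dilate until the nominal line width reaches `target_width`.

    Each pass is assumed to add one pixel on each side (width grows by 2).
    """
    width = current_width
    while width < target_width:
        img = dilate_once(img)
        width += 2
    return img


# A faint one-pixel-wide vertical stroke, thickened to a width of 3.
faint = [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
thick = thicken(faint, 1, 3)
```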

【００６６】また非本質的なストロ−ク成分を除去する
方法として、微小セグメントの除去や線幅による判定等
を用いたが、これらの条件式及び閾値の設定等は、本発
明の範囲内で任意に変更できる。Further, minute segment removal, determination by line width, and the like were used as methods of removing non-essential stroke components, but the conditional expressions, threshold settings, and so on can be changed freely within the scope of the present invention.

【００６７】また線幅の計算方法、特徴マトリクスの抽
出方法、外接枠分割方法等も本発明の範囲内で適宜変更
可能である。さらに図１、図２、図７のブロック図にお
いて、パタンレジスタや各メモリの構成、各構成部分に
分担された処理や動作、入出力信号の流れ、設置個数、
位置その他の条件も任意好適に変更可能である。The line width calculation method, the feature matrix extraction method, the circumscribing frame division method, and so on can also be changed as appropriate within the scope of the present invention. Furthermore, in the block diagrams of FIG. 1, FIG. 2, and FIG. 7, the configuration of the pattern register and of each memory, the processing and operations assigned to each component, the flow of input and output signals, the number and placement of units, and other conditions can be changed as appropriate.

【００６８】[0068]

【発明の効果】以上、詳細に説明したように、本発明に
よれば、入力文字パタンを量子化された２値画像に変換
し、２値画像の外接枠を検出し、外接枠内の２値画像の
線幅を計算し、外接枠内の２値画像に対して、水平、垂
直、右斜め、左斜め方向に走査して、前記線幅の２倍を
超えるストロ−クの分布状態を反映する４種類のサブパ
タンを抽出し、前記外接枠内の２値画像及びサブパタン
４種とを用いて、サブパタンとして抽出されなかったか
すれパタンを検出し、かすれパタンの線幅を計算し、か
すれパタンを構成する一つまたは複数の互いに接触して
いないセグメントの外側に順次黒画素を追加することに
よって、かすれパタンの線幅を前記入力文字パタンの平
均線幅まで太らせることによって、かすれ復元パタンを
作成し、さらにかすれ復元パタンに対して、前記平均線
幅に基づいて設定された閾値を用いて、水平、垂直、右
斜め、左斜め方向に走査し、検出されたストロ−クの分
布状態を反映する４種類のかすれ復元サブパタンを抽出
し、前記サブパタン及び前記かすれ復元サブパタンとを
それぞれの種類毎に合成することによって、合成サブパ
タンを作成し、さらに前記外接枠内の２値画像とかすれ
復元パタンとを合成して合成文字パタンとし、当該合成
文字パタンに基づいて外接枠を分割し、前記合成サブパ
タン及び前記外接枠の分割情報に基づいて特徴マトリク
スを抽出し、前記特徴マトリクスと辞書とを照合した結
果より、認識結果を出力するようにしたので、文字パタ
ンを構成するストロ−クであって、認識に本質的な役割
を果たすものの一部が、他の部分との局所線幅と比較し
て小さくなった場合でも、サブパタンの一部として抽出
され、しかも平均線幅をもつまで復元され、さらに分割
領域にも復元の効果が反映されるので、ストロ−ク成分
の損失に伴う認識性能の低下が防止できる。従って、局
所線幅に大きな相違のある品質の悪い文字パタンや様々
なスケ−ルのストロ−クから構成される複雑な漢字文字
や図形等に対しても高精度な認識性能を安定に維持でき
る文字認識装置が実現可能となる。As described in detail above, according to the present invention, the input character pattern is converted into a quantized binary image, the circumscribing frame of the binary image is detected, and the line width of the binary image within the circumscribing frame is calculated. The binary image within the circumscribing frame is scanned in the horizontal, vertical, right diagonal, and left diagonal directions to extract four types of sub-patterns reflecting the distribution of strokes exceeding twice the line width. Using the binary image within the circumscribing frame and the four sub-patterns, the blur pattern that was not extracted as a sub-pattern is detected and its line width is calculated. Black pixels are successively added to the outside of the one or more mutually non-touching segments that make up the blur pattern, thickening its line width to the average line width of the input character pattern and thereby creating a blur restoration pattern. The blur restoration pattern is then scanned in the horizontal, vertical, right diagonal, and left diagonal directions using a threshold set based on the average line width, and four types of blur restoration sub-patterns reflecting the distribution of the detected strokes are extracted. The sub-patterns and the blur restoration sub-patterns are combined type by type to create composite sub-patterns. The binary image within the circumscribing frame and the blur restoration pattern are combined into a composite character pattern, the circumscribing frame is divided based on that composite character pattern, a feature matrix is extracted based on the composite sub-patterns and the division information of the circumscribing frame, and the recognition result is output from the result of collating the feature matrix with the dictionary.
Consequently, even when part of a stroke that makes up the character pattern and plays an essential role in recognition has a smaller local line width than the other parts, it is extracted as part of a sub-pattern and restored until it has the average line width, and the effect of the restoration is also reflected in the divided areas, so the drop in recognition performance that accompanies the loss of stroke components can be prevented. A character recognition device can therefore be realized that stably maintains highly accurate recognition performance even for poor-quality character patterns with large differences in local line width and for complicated kanji characters, figures, and the like composed of strokes of various scales.

[Brief description of drawings]

【図１】文字認識装置の実施例１を示すブロック図であ
る。FIG. 1 is a block diagram showing a first embodiment of a character recognition device.

【図２】文字認識装置の実施例２を示すブロック図であ
る。FIG. 2 is a block diagram showing a second embodiment of the character recognition device.

【図３】かすれパタン抽出部の構成を示すブロック図で
ある。FIG. 3 is a block diagram showing a configuration of a blur pattern extracting unit.

【図４】かすれ部分の存在するパタンの一例を示す図で
ある。FIG. 4 is a diagram showing an example of a pattern in which a blurred portion exists.

【図５】つぶれによりサブパタンとして抽出されない部
分があるパタンの一例を示す図である。FIG. 5 is a diagram showing an example of a pattern that has a portion that is not extracted as a sub-pattern due to crushing.

【図６】本発明の適用例を示す図である。FIG. 6 is a diagram showing an application example of the present invention.

【図７】かすれパタン復元部の一構成例を示すブロック
図である。FIG. 7 is a block diagram showing a configuration example of a blur pattern restoration unit.

【図８】３×３マスクを示す図である。FIG. 8 is a diagram showing a 3 × 3 mask.

【図９】文字枠分割の例を示す図である。FIG. 9 is a diagram showing an example of character frame division.

【図１０】文字枠分割の例を示す図である。FIG. 10 is a diagram showing an example of character frame division.

[Explanation of symbols]

１０１光信号１０２光電変換部１０３パタンレジスタ１０４外接枠検出部１０５文字パタン線幅計算部１０６水平方向走査部１０７水平サブパタン１メモリ１０８垂直方向走査部１０９垂直サブパタン１メモリ１１０右斜め方向走査部１１１右斜めサブパタン１メモリ１１２左斜め方向走査部１１３左斜めサブパタン１メモリ１１４かすれパタン抽出部１１５かすれパタンメモリ１１６かすれパタン線幅計算部１１７水平方向走査部１１８水平かすれサブパタンメモリ１１９垂直方向走査部１２０垂直かすれサブパタンメモリ１２１右斜め方向走査部１２２右斜めかすれサブパタンメモリ１２３左斜め方向走査部１２４左斜めかすれサブパタンメモリ１２５水平サブパタン合成部１２６水平サブパタン２メモリ１２７垂直サブパタン合成部１２８垂直サブパタン２メモリ１２９右斜めサブパタン合成部１３０右斜めサブパタン２メモリ１３１左斜めサブパタン合成部１３２左斜めサブパタン２メモリ１３３出力制御部１３４線幅判定部１３５特徴抽出部１３６識別部１３７辞書メモリ１３８認識結果１３９微小セグメント除去部１４０かすれパタン復元部１４１かすれ復元パタンメモリ１４２パタン合成部１４３合成文字パタンメモリ１４４外接枠分割部
101 optical signal
102 photoelectric conversion unit
103 pattern register
104 circumscribing frame detection unit
105 character pattern line width calculation unit
106 horizontal scanning unit
107 horizontal sub-pattern 1 memory
108 vertical scanning unit
109 vertical sub-pattern 1 memory
110 right diagonal scanning unit
111 right diagonal sub-pattern 1 memory
112 left diagonal scanning unit
113 left diagonal sub-pattern 1 memory
114 blur pattern extraction unit
115 blur pattern memory
116 blur pattern line width calculation unit
117 horizontal scanning unit
118 horizontal blur sub-pattern memory
119 vertical scanning unit
120 vertical blur sub-pattern memory
121 right diagonal scanning unit
122 right diagonal blur sub-pattern memory
123 left diagonal scanning unit
124 left diagonal blur sub-pattern memory
125 horizontal sub-pattern synthesis unit
126 horizontal sub-pattern 2 memory
127 vertical sub-pattern synthesis unit
128 vertical sub-pattern 2 memory
129 right diagonal sub-pattern synthesis unit
130 right diagonal sub-pattern 2 memory
131 left diagonal sub-pattern synthesis unit
132 left diagonal sub-pattern 2 memory
133 output control unit
134 line width determination unit
135 feature extraction unit
136 identification unit
137 dictionary memory
138 recognition result
139 minute segment removal unit
140 blur pattern restoration unit
141 blur restoration pattern memory
142 pattern synthesis unit
143 composite character pattern memory
144 circumscribing frame dividing unit

Claims

[Claims]

1. A character recognition device comprising:
a photoelectric conversion unit that optically scans a character pattern written on a form or the like and converts it into a binary image, which is a quantized electric signal;
a pattern register that stores the character pattern converted into the binary image;
a circumscribing frame detection unit that detects a circumscribing frame of the character pattern in the pattern register;
a line width calculation unit that calculates a line width of the character pattern within the circumscribing frame of the pattern register;
a sub-pattern extraction unit that scans the character pattern within the circumscribing frame of the pattern register in each of the horizontal, vertical, right diagonal, and left diagonal directions, detects a stroke when the number of consecutive black pixels on a scanning line exceeds a threshold determined based on the line width, and extracts four types of sub-patterns, one per direction, representing the distribution of these strokes;
a blur pattern extraction unit that, from the binary image within the circumscribing frame of the pattern register and the four types of sub-patterns, extracts as a blur pattern the set of black pixels forming the character pattern that belong to none of the four types of sub-patterns;
a minute segment removal unit that removes minute segments from among the independent segments forming the blur pattern;
a blur pattern line width calculation unit that calculates the line width of the blur pattern;
a blur pattern restoration unit that, when it is determined that a restoration pattern needs to be created for the blur pattern from which the minute segments have been removed, creates a blur restoration pattern by thickening each segment forming the blur pattern to the average line width;
a blur restoration sub-pattern extraction unit that scans the blur restoration pattern in each of the horizontal, vertical, right diagonal, and left diagonal directions, detects a stroke when the number of consecutive black pixels on a scanning line exceeds a threshold determined based on the average line width, and extracts four types of blur restoration sub-patterns, one per direction, representing the distribution of these strokes;
a sub-pattern synthesis unit that combines the sub-patterns and the blur restoration sub-patterns type by type;
a control unit that outputs either the sub-patterns or the composite sub-patterns to a feature extraction unit;
a pattern synthesis unit that combines the binary image within the circumscribing frame of the pattern register and the blur restoration pattern into a composite character pattern;
a circumscribing frame dividing unit that divides the circumscribing frame horizontally and vertically into grid-like partial regions based on the peripheral distribution of the binary image within the circumscribing frame of the pattern register or of the composite character pattern;
a feature extraction unit that calculates feature values in the divided partial regions for the sub-patterns or the composite sub-patterns and creates a feature matrix; and
an identification unit that collates the feature matrix with a dictionary prepared in advance and outputs a final recognition result.