JPH0991378A - Character recognition system

Info

Publication number: JPH0991378A
Application number: JP7244129A
Authority: JP
Inventors: Shinji Sase; 慎治佐瀬
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1995-09-22
Filing date: 1995-09-22
Publication date: 1997-04-04
Anticipated expiration: 2015-09-22
Also published as: JP2973892B2

Abstract

PROBLEM TO BE SOLVED: To provide a character recognition system that can stably recognize the correct character, making good use of the effect of character shaping, even for a character whose character lines are deformed. SOLUTION: The system has first image converting means 2, which scans an image of a read character with a mask of predetermined size and replaces the pixel at the center of the mask with background when the pixels in the mask are all character-line pixels; second image converting means 3, which measures, in the output image of the first image converting means 2, the length of each run of background pixels in two orthogonal directions, accumulates values proportional to the reciprocal of that length in the direction perpendicular to the measurement direction, and equalizes the accumulated values in each direction; a dictionary part 4, which stores features for each character; and a collation part 5, which extracts character features from the output image of the second image converting means 3, compares them with the features stored in the dictionary part 4 to find similar character categories, and outputs the recognition result or candidates for the read character.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition system that reads characters optically, and in particular to a character recognition system for reading handwritten characters written with various writing instruments.

[0002]

[Prior Art] In handwritten character recognition, processing that shapes a character according to the distribution of its character-line portions or its background portions has been confirmed to be effective. This processing is generally called nonlinear normalization. It can correct the distortion that handwriting introduces and is regarded as particularly effective for reading kanji, which have complex structures. A representative reference on nonlinear normalization is "Handwritten Kanji Recognition Based on Hierarchical Displacement Correction Processing" (Institute of Electronics, Information and Communication Engineers, technical group material PRU87-104).

[0003] In the nonlinear normalization described above, portions of the input character image where the background is narrow are enlarged, and portions where the background is wide are reduced. Shaping the character in this way increases its overlap with the feature data held in advance for each character category, which improves the character recognition rate.

[0004]

[Problems to Be Solved by the Invention] However, conventional character recognition systems of this kind assume that all the character lines making up a character are available. Consequently, when the character lines of a character written with a thick marker or a brush overlap or touch one another, recognition performance can actually deteriorate.

[0005] For example, in the character image "束" written with an ordinary pen as shown in FIG. 10(a), all the character lines are present, so the correct character "束" is recognized even after nonlinear normalization is applied as shown in FIG. 10(b). However, when nonlinear normalization is applied to a character image "束" whose central portion is filled in, as shown in FIG. 10(c), the central portion contains no background and is therefore reduced by the shaping, producing an image like that shown in FIG. 10(d). The shape then differs greatly from the correct character "束", and collation with the feature data for each character category sometimes yielded the wrong character.

[0006] Japanese Patent Application No. Sho 60-111444 discloses a method for recognizing blotted printed characters in which the portion within a certain range of the edge of a character line is treated as the collation region and the portion further inside is excluded from collation. However, because this method concerns the computation performed at collation time, it does not shape the character as nonlinear normalization does, and it was not sufficiently effective for reading handwritten characters. The publication also mentions extracting the width of a character line, but it does not describe a concrete method for doing so.

[0007] The present invention was made to solve the problems of the prior art described above. Its object is to provide a character recognition system that can stably recognize the correct character by making good use of the effect of character shaping, even for characters written in various line thicknesses or characters whose character lines overlap.

[0008]

[Means for Solving the Problems] To achieve the above object, the character recognition system of the present invention is a character recognition system that reads a character as an image and outputs a recognition result or candidates for the character, characterized by comprising: first image conversion means that scans the image of the read character with a mask of predetermined size and, when all the pixels in the mask are character-line pixels, replaces the pixel at the center of the mask with background; second image conversion means that, in the output image of the first image conversion means, measures the length of each run of background pixels in each of two perpendicular directions, accumulates values proportional to the reciprocal of that length in the direction perpendicular to the measured direction, and makes the accumulated values uniform in each direction; a dictionary unit in which features for each character category are stored; and a collation unit that extracts character features from the output image of the second image conversion means, compares them with the features stored in the dictionary unit to find similar character categories, and outputs the recognition result or candidates for the read character.

[0009] Here, the first image conversion means may scan the image of the read character with a first mask of predetermined size and a second mask that is similar in shape to the first mask and formed around it, and may replace the pixel at the center of the first mask with background when all the pixels in the first mask are character-line pixels and the pixels in the second mask include background.

[0010] Alternatively, a character recognition system that reads a character as an image and outputs a recognition result or candidates for the character is characterized by comprising: first image conversion means including a distance calculation unit that calculates, for each character-line pixel, the distance to the nearest background pixel, a threshold setting unit that obtains from the results of the distance calculation unit the ratio of the number of pixels at the minimum distance to the number of character-line pixels and calculates from that ratio a threshold for replacing character-line pixels with background, and an image conversion unit that replaces pixels whose distance exceeds the threshold with background; second image conversion means that, in the output image of the first image conversion means, measures the length of each run of background pixels in each of two perpendicular directions, accumulates values proportional to the reciprocal of that length in the direction perpendicular to the measured direction, and makes the accumulated values uniform in each direction; a dictionary unit in which features for each character category are stored; and a collation unit that extracts character features from the output image of the second image conversion means, compares them with the features stored in the dictionary unit to find similar character categories, and outputs the recognition result or candidates for the read character.

[0011] Here, the threshold setting unit may obtain, from the results of the distance calculation unit, the ratio of the number of pixels at the minimum distance to the number of character-line pixels, and may calculate from that ratio a lower threshold for replacing character-line pixels with background and an upper threshold obtained by adding a predetermined value to the lower threshold. The image conversion unit may then replace with background the pixels whose distance lies between the lower threshold and the upper threshold.

[0012] Furthermore, any of the character recognition systems described above may include normalization means that converts the read character to a uniform size.

[0013] In the character recognition system of the present invention configured as described above, the first image conversion means scans the image of the read character with a mask of predetermined size and replaces the pixel at the center of the mask with background when all the pixels in the mask are character-line pixels. When two character lines overlap or touch, they are thereby separated, and an image close to the correct character is obtained.

[0014] In addition, the first image conversion means may scan the image of the read character with a first mask of predetermined size and a second mask formed around the first mask, and replace the pixel at the center of the first mask with background when all the pixels in the first mask are character-line pixels and the pixels in the second mask include background. In this way the character lines are separated and an image close to the correct character is obtained not only when two character lines overlap or touch but also when three character lines overlap or touch.

[0015] Furthermore, the distance calculation unit of the first image conversion means calculates, for each character-line pixel, the distance to the nearest background pixel; the threshold setting unit obtains the ratio of the number of pixels at the minimum distance to the number of character-line pixels and calculates a threshold for replacing character-line pixels with background; and the image conversion unit replaces pixels whose distance exceeds the threshold with background. As a result, even when characters of various thicknesses are read, two character lines that overlap or touch are separated and an image close to the correct character is obtained. In particular, when the threshold setting unit adds a predetermined value to this threshold to set an upper threshold and the image conversion unit converts the pixels lying between the two thresholds to background, the system can also handle the case where three character lines overlap or touch.

[0016]

[Embodiments of the Invention] The present invention will now be described with reference to the drawings.

[0017] (First Embodiment) FIG. 1 is a block diagram showing the device configuration of a first embodiment of the character recognition system of the present invention. In FIG. 1, the character recognition device comprises: an input unit 1 that reads a character by optical means; first image conversion means 2 that continuously scans the image of the input character with a mask of predetermined size and replaces the pixel at the center of the mask with background when the mask contains only character-line pixels; second image conversion means 3 that, based on the background distribution of the first shaped image output by the first image conversion means 2, shapes the character image so that the background distribution becomes uniform in each of the two perpendicular directions, vertical and horizontal; a dictionary unit 4 in which features for each character category are stored; a collation unit 5 that extracts character features from the second shaped image output by the second image conversion means 3 and compares them with the character features stored in the dictionary unit 4 to recognize the input character; and an output unit 6 that outputs the character recognition result of the collation unit 5.

[0018] When performing character shaping of the kind described for the prior art, it is important to obtain the character lines, which the shaping presupposes, as accurately as possible. This embodiment focuses on the thickness of the character lines: for a thick line that is presumed to consist of overlapping character lines, the first image conversion means 2 forcibly creates background in its central portion. After the processing by the first image conversion means 2, the second image conversion means 3 shapes the character by nonlinear normalization, and the collation unit 5 recognizes the character.

[0019] In this configuration, a character captured by the input unit 1 is output to the first image conversion means 2 as an input character image. The input character image is a binary image of one character in which each pixel of a grid is set to either white (background) or black (character line). On receiving the input character image, the first image conversion means 2 scans its entire area with a mask of predetermined size. The mask is the unit of processing for determining whether each pixel of the input character image that falls within it is character line or background.

[0020] When all the pixels in the mask are character-line pixels, the first image conversion means 2 converts the pixel corresponding to the center of the mask to a background pixel; all other pixels are output unchanged from the input character image. The character image processed by the first image conversion means 2 is called the first shaped image.

[0021] The second image conversion means 3 creates a second shaped image based on the distribution of background pixels in the first shaped image. For the background pixels of the first shaped image, it detects the length of each run in the vertical direction and in the horizontal direction. For horizontal background runs, it accumulates values proportional to the reciprocal of the run length in the vertical direction; for vertical background runs, it accumulates values proportional to the reciprocal of the run length in the horizontal direction. It then sets an image conversion criterion such that the accumulated distribution becomes uniform and, based on that criterion, compresses portions with a high accumulated value and enlarges portions with a low accumulated value. This is the nonlinear normalization processing described for the prior art, and it allows characters to be recognized accurately.
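The accumulation step of paragraph [0021] can be sketched roughly as follows for the horizontal direction. This is a minimal illustration, not the method of the cited reference: the function names, the list-of-lists image format (1 = character line, 0 = background), and the way the accumulated values are turned into a coordinate mapping are all assumptions. In particular, the mapping assumes that positions with a larger accumulated value receive proportionally more of the output width, following the statement in [0003] that narrow-background portions are enlarged and wide-background portions are reduced.

```python
def background_run_lengths(row):
    """For each pixel of a row, the length of the background run containing it
    (0 for character-line pixels). 0 = background, 1 = character line."""
    lengths = [0] * len(row)
    x = 0
    while x < len(row):
        if row[x] == 0:
            start = x
            while x < len(row) and row[x] == 0:
                x += 1
            for i in range(start, x):
                lengths[i] = x - start
        else:
            x += 1
    return lengths


def column_density(image):
    """Accumulate 1 / (horizontal background run length) down each column."""
    density = [0.0] * len(image[0])
    for row in image:
        for x, length in enumerate(background_run_lengths(row)):
            if length > 0:
                density[x] += 1.0 / length
    return density


def equalizing_map(density, out_size):
    """Hypothetical mapping from input column index to output column index
    that spreads the accumulated density evenly over out_size columns."""
    total = sum(density)
    if total == 0:
        # No background at all: fall back to a plain linear mapping.
        return [min(out_size - 1, i * out_size // len(density))
                for i in range(len(density))]
    mapping, running = [], 0.0
    for d in density:
        mapping.append(min(out_size - 1, int(running / total * out_size)))
        running += d
    return mapping
```

The vertical direction could be handled the same way by transposing the image first, for example with `[list(col) for col in zip(*image)]`.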

[0022] FIGS. 2 and 3 illustrate this processing. FIG. 2 shows the conversion from the first shaped image to the second shaped image, using the character "田" as an example. As FIG. 2 shows, the character "田", which is distorted in the first shaped image, has its background made uniform in the second shaped image and is shaped more accurately into the form of "田". FIG. 3 shows the case of the character "束" with a filled-in central portion, as in FIG. 10(c). FIG. 3(a) shows the first shaped image and FIG. 3(b) shows the second shaped image. Here too, the second shaped image is closer to the character "束" than the image obtained by the conventional method.

[0023] The collation unit 5 extracts character features from the second shaped image and compares them with the feature data for each character category registered in advance in the dictionary unit 4 to find similar characters. It then outputs the characters judged to be similar as the character recognition result, which completes the recognition processing for one character. The character recognition result signal output from the collation unit 5 consists of the comparison result and one or more character codes derived from it.

[0024] Methods by which the collation unit 5 can extract character features include: extracting features of the character edges (attending only to the edge portions); extracting directional features (the direction of the character lines or of their edges); extracting features by thinning; extracting features by image compression (compressing the image to absorb variation); extracting features by convolution (assigning suitable coefficients to each mask position and computing a local sum of products with the input image); and extracting features by dimensional transformation using a matrix product (a combination of sum-of-products operations between the whole image and corresponding masks). Any of these methods may be adopted according to the required processing speed, the reading target, and the required recognition accuracy, and they may be combined freely.

[0025] The comparison performed by the collation unit 5 may likewise use any of various distance calculations (city-block distance, Euclidean distance, Mahalanobis distance, Bhattacharyya distance, and so on), similarity calculations such as the inner product, DP matching, and the like.
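As a small illustration of the comparison step, the sketch below ranks dictionary categories by distance to an extracted feature vector. The feature vectors, the dictionary layout (a mapping from a category label to one reference vector), and the function names are hypothetical; the text does not specify how features or dictionary entries are represented.

```python
import math


def city_block_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner-product similarity (larger means more similar)."""
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(feature, dictionary, distance=city_block_distance, top_n=3):
    """Return up to top_n category labels, nearest first.

    dictionary: hypothetical mapping {category label: reference feature vector}.
    """
    scored = sorted((distance(feature, ref), label)
                    for label, ref in dictionary.items())
    return [label for _, label in scored[:top_n]]
```

Returning several labels rather than one mirrors the statement that the collation unit outputs either a recognition result or candidates.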

[0026] The feature extraction and comparison operations mentioned above are well-known techniques, so a detailed description is omitted. The processing of the second image conversion means 3 is likewise described in detail in the reference cited for the prior art, "Handwritten Kanji Recognition Based on Hierarchical Displacement Correction Processing", and in the works it cites, so a detailed description is omitted here.

[0027] Next, the mask used by the first image conversion means 2 of this embodiment is described with reference to the drawings.

[0028] FIG. 4 shows an example of the mask used in this embodiment. In FIG. 4(a), the pixel marked with a circle at the center of the mask is the pixel for which it is decided whether to replace it with background; this pixel is referred to below as the evaluation pixel. The size of the mask 7 is determined experimentally so that it exceeds the character-line thickness by a margin, depending on conditions such as the writing instrument used to write the character, the size of the written character, and the resolution of the input unit 1 (for printed characters, the font and the resolution). In this embodiment its height and width are each assumed to be nine pixels. FIG. 4(b) shows the relationship between the size of the mask 7 and the size of the input character image in this case.

[0029] The first image conversion means 2 scans with the mask 7 so that every pixel in the input character image becomes the evaluation pixel in turn. When the mask 7 contains only character-line pixels, the evaluation pixel is set to the background value; otherwise the evaluation pixel is left the same as in the input character image. The scanning order is arbitrary.

[0030] The first image conversion means 2 assumes that the periphery of the input character image is background and performs the processing only where the mask fits inside the input character image (the hatched area of FIG. 4(b) corresponds to the evaluation pixels in that case). Pixels that never become the evaluation pixel (the area of FIG. 4(b) outside the hatching) are output with the same values as the corresponding pixels of the input character image.
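The mask scan of paragraphs [0028] to [0030] can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a list of rows with 1 for character line and 0 for background, the mask is square with an odd side length (nine pixels by default, as in the example above), and the function name is hypothetical. The decision for every pixel is read from the input image rather than from the partially converted output, which is one way to make the result independent of the scanning order.

```python
def hollow_thick_strokes(image, mask_size=9):
    """Return the first shaped image.

    image: list of rows, 1 = character line (black), 0 = background (white).
    mask_size: odd side length of the square mask (assumed square here).
    """
    height, width = len(image), len(image[0])
    half = mask_size // 2
    # Pixels that never become the evaluation pixel keep their input value.
    output = [row[:] for row in image]
    # Only positions where the whole mask fits inside the image are evaluated.
    for y in range(half, height - half):
        for x in range(half, width - half):
            all_character_line = all(
                image[yy][xx] == 1
                for yy in range(y - half, y + half + 1)
                for xx in range(x - half, x + half + 1)
            )
            if all_character_line:
                output[y][x] = 0  # replace the evaluation pixel with background
    return output
```

A line thinner than the mask is left untouched, while a filled-in region wider than the mask has its interior replaced with background and keeps a border of about half the mask width.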

[0031] By having the first image conversion means 2 perform the above processing before the character is shaped, an image closer to the correct character can be obtained with very simple processing even for characters in which two character lines overlap or touch, and misrecognition is reduced.

[0032] As shown in FIG. 4, the mask 7 of this embodiment is square, but a rectangular mask 7 may be used for characters whose vertical and horizontal line widths clearly differ, such as printed type. A mask shape in which the distance from the center to the mask periphery is constant may also be adopted, such as a diamond (based on city-block distance) or a pseudo-circle (based on Euclidean distance).
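The alternative mask shapes can be expressed as sets of pixel offsets around the evaluation pixel. The sketch below is illustrative only; the function name, the `radius` parameter, and the shape labels are assumptions, not terms from the text.

```python
def mask_offsets(radius, shape="square"):
    """List the (dy, dx) offsets covered by a mask centered on the evaluation pixel.

    radius: distance in pixels from the center to the mask periphery.
    shape:  "square", "diamond" (city-block distance), or
            "circle" (pseudo-circle, Euclidean distance).
    """
    offsets = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if shape == "square":
                inside = True
            elif shape == "diamond":
                inside = abs(dy) + abs(dx) <= radius
            elif shape == "circle":
                inside = dy * dy + dx * dx <= radius * radius
            else:
                raise ValueError("unknown mask shape: " + shape)
            if inside:
                offsets.append((dy, dx))
    return offsets
```

With `radius=4` the square case covers the 9 x 9 mask of this embodiment; a rectangular mask would need separate vertical and horizontal radii.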

[0033] (Second Embodiment) Next, a second embodiment of the present invention is described. This embodiment differs from the first embodiment only in the processing of the first image conversion means; the rest of the configuration and processing is the same as in the first embodiment, so its description is omitted.

[0034] First, the mask used by the first image conversion means of this embodiment is described.

[0035] FIG. 5 shows the configuration of a second embodiment of the mask used by the first image conversion means of the character recognition system of the present invention. To handle the case where three character lines overlap, the mask used by the first image conversion means of this embodiment adds a further mask around the mask shown in the first embodiment. The hatched portion in FIG. 5 (the inner mask) corresponds to the mask of the first embodiment, and the unhatched portion (the peripheral mask) is the portion added in this embodiment. The added portion serves to confirm that the corresponding part of the input character image does not consist entirely of character-line pixels, and its size is set on the same principle as the mask of the first embodiment. Here, as an example, a mask five pixels wide in each of the vertical and horizontal directions is added around the mask of the first embodiment.

[0036] As explained for the first embodiment, the mask of this embodiment may be rectangular, and the width of the mask added around it need not be the same vertically and horizontally.

[0037] Next, the processing procedure of the first image conversion means of this embodiment is described with reference to FIG. 6. FIG. 6 is a flowchart showing the processing procedure of the second embodiment of the first image conversion means of the character recognition system of the present invention.

[0038] In FIG. 6, the first image conversion means first checks whether the current mask position is within the collation range (processing range) of the input character image (step S1). If it is, the process moves to step S2 and the same processing as in the first embodiment is performed using the inner mask. In the processing with the inner mask, it is checked whether all the pixels in the inner mask are character-line pixels (step S3). If they are, the process moves to step S4 and it is checked whether the position of the peripheral mask is within the collation range. If the peripheral mask is within the collation range, the process moves to step S5 and the processing with the peripheral mask begins. In the processing with the peripheral mask, it is checked whether all the pixels in the peripheral mask are character-line pixels (step S6); if the peripheral mask does not consist entirely of character-line pixels, the evaluation pixel is converted to background (step S7).

[0039] Here, if the mask position is not within the collation range in step S1, if the inner mask contains background in step S3, or if all pixels in the peripheral mask are character-line pixels in step S6, the evaluation pixel is set to the same value as in the input character image (step S8). If the peripheral mask position is not within the collation range in step S4, the process moves to step S7 and converts the evaluation pixel to background. When the above processing is finished, it is checked whether processing has been completed for the entire collation range (step S9). If it has not, the mask is set at the next collation position (step S10), and the process returns to step S1 and repeats the above processing.
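The flow of steps S1 to S10 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes a binary image stored as nested lists (1 = character line, 0 = background), square masks centered on the evaluation pixel, and a collation range equal to the image bounds. The function name `two_mask_conversion` and the parameters `inner` and `ring` are hypothetical.

```python
def two_mask_conversion(img, inner=1, ring=1):
    """Sketch of the second-embodiment flow (steps S1 to S10).

    img   : list of rows, 1 = character line, 0 = background (assumed format)
    inner : half-size of the inner mask (assumption: square, centered mask)
    ring  : width of the peripheral mask added around the inner mask
    """
    h, w = len(img), len(img[0])
    outer = inner + ring
    out = [row[:] for row in img]  # default: same value as the input (S8)

    def in_range(y, x, r):
        # True when the (2r+1) x (2r+1) window lies inside the image.
        return y - r >= 0 and y + r < h and x - r >= 0 and x + r < w

    def all_ink(y, x, r):
        return all(img[j][i] == 1
                   for j in range(y - r, y + r + 1)
                   for i in range(x - r, x + r + 1))

    for y in range(h):              # S9/S10: visit every collation position
        for x in range(w):
            if not in_range(y, x, inner):   # S1 fails -> keep input (S8)
                continue
            if not all_ink(y, x, inner):    # S3 fails -> keep input (S8)
                continue
            if not in_range(y, x, outer):   # S4 fails -> background (S7)
                out[y][x] = 0
            # The inner mask is already all ink, so testing the whole outer
            # window is equivalent to testing only the peripheral ring.
            elif not all_ink(y, x, outer):  # S6: ring has background -> S7
                out[y][x] = 0
            # otherwise the ring is all ink: keep the input value (S8)
    return out
```

With these assumptions, the center of a 3x3 blob is hollowed out because the surrounding ring contains background, while the interior of a larger solid region is left untouched, which is the behavior that keeps a crushed three-stroke area from being over-thinned.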

[0040] FIG. 7 shows the resulting shaped images. FIG. 7(a) shows the character "東" (east) with its center crushed, and FIGS. 7(b) and 7(c) show the first shaped image and the second shaped image obtained when this crushed character is processed according to the first embodiment. FIGS. 7(d) and 7(e) show the first shaped image and the second shaped image, respectively, created from the same FIG. 7(a) according to the second embodiment.

[0041] As shown in FIGS. 7(b) and 7(c), the processing of the first embodiment could cause the character to be recognized as "束" (bundle) instead of "東" (east). As shown in FIGS. 7(d) and 7(e), the processing of this embodiment yields a character image closer to the correct "東". Therefore, even for characters in which three character lines overlap or touch, an image closer to the correct character can be obtained with simple processing, and misrecognition as a wrong character is reduced.

[0042] Note that in the processing of the first image conversion means of this embodiment as well, the area surrounding the input character image is assumed to be background, as in the first embodiment. Therefore, when the mask added around the periphery extends outside the input character image, the latter half of the processing is unnecessary, and the first shaped image can be created with the first half of the processing alone.

[0043] (Third Embodiment) Next, a third embodiment of the present invention is described. In this embodiment as well, only the first image conversion means differs from the first embodiment. The other configurations and processing are the same as in the first embodiment, so their description is omitted.

[0044] To handle characters written with writing instruments of various thicknesses, the first image conversion means of this embodiment determines the thickness of the character lines, sets a threshold from it, and converts pixels corresponding to character lines thicker than the threshold to background.

[0045] FIG. 8 is a block diagram showing the configuration of the third embodiment of the first image conversion means of the character recognition system of the present invention. In FIG. 8, the first image conversion means 20 of this embodiment comprises a distance calculation unit 21 that calculates, for each character-line pixel of the input character image, the distance to the nearest background pixel; a threshold setting unit 22 that sets, from the result calculated by the distance calculation unit 21 and a predetermined value, a threshold for converting character-line pixels to background; and an image conversion unit 23 that outputs the first shaped image from the processing results of the distance calculation unit 21 and the threshold setting unit 22.

[0046] In this configuration, the distance calculation unit 21 calculates the distance from each character-line pixel (hatched area) of the input character image in FIG. 9(a) to the nearest background pixel (non-hatched area). Writing each value on its pixel gives an image called a distance image, as shown in FIG. 9(b). Here, the distance of a pixel corresponding to background is always 0, and the distance values are assumed to be calculated as city-block distances.
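One common way to build such a distance image is a two-pass scan. The sketch below is an illustration under stated assumptions, not the method fixed by the patent: the image is a nested list with 1 for character line and 0 for background, everything outside the image is treated as background, and the name `city_block_distance_image` is hypothetical.

```python
def city_block_distance_image(img):
    """Two-pass city-block distance transform (illustrative sketch).

    img: list of rows, 1 = character line, 0 = background (assumed format).
    Pixels outside the image are treated as background (assumption).
    Returns a same-sized grid: 0 for background, otherwise the city-block
    distance to the nearest background pixel.
    """
    h, w = len(img), len(img[0])
    big = h + w + 2  # larger than any reachable distance
    d = [[0 if img[y][x] == 0 else big for x in range(w)] for y in range(h)]

    # Forward pass: propagate distances from the top and left neighbors.
    for y in range(h):
        for x in range(w):
            if d[y][x]:
                up = d[y - 1][x] if y > 0 else 0
                left = d[y][x - 1] if x > 0 else 0
                d[y][x] = min(d[y][x], up + 1, left + 1)

    # Backward pass: propagate distances from the bottom and right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if d[y][x]:
                down = d[y + 1][x] if y < h - 1 else 0
                right = d[y][x + 1] if x < w - 1 else 0
                d[y][x] = min(d[y][x], down + 1, right + 1)
    return d
```

For a 3x3 blob of character-line pixels surrounded by background, the eight outer pixels receive distance 1 and the center receives distance 2.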

[0047] Note that the distance need not be the city-block distance. The Euclidean distance or the chessboard distance may be used instead.
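For reference, the three metrics differ only in how a pixel offset is turned into a distance. The helper names below are hypothetical and the offsets `dy`, `dx` are measured in pixels.

```python
import math

def city_block(dy, dx):
    # Sum of the vertical and horizontal steps.
    return abs(dy) + abs(dx)

def chessboard(dy, dx):
    # Diagonal steps count as one, so the larger offset dominates.
    return max(abs(dy), abs(dx))

def euclidean(dy, dx):
    # Straight-line distance.
    return math.hypot(dy, dx)
```

For an offset of 3 rows and 4 columns these give 7, 4, and 5.0 respectively, so a threshold tuned for one metric would generally need adjusting for another.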

[0048] In the distance image output from the distance calculation unit 21, the number of pixels whose distance is the minimum value of 1 represents the length of the character's edges, and the number of pixels whose distance is not 0 represents the number of character-line pixels. Therefore, dividing the number of pixels with a nonzero distance by the number of pixels with a distance of 1 gives a value close to half the average character line width of that character. The threshold setting unit 22 adds, to the average line-width value obtained in this way, a value preset to allow for variation in line width, and sets the result as the estimated maximum line width.
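The ratio described above can be computed directly from a distance image. The following is a small sketch under assumptions: the distance image is a nested list of numbers, the function name `estimated_max_line_width` is hypothetical, and the default `margin` of 1.0 is an arbitrary placeholder for the preset value, which the text does not specify.

```python
def estimated_max_line_width(dist, margin=1.0):
    """Estimate the threshold from a distance image (illustrative sketch).

    dist   : list of rows, 0 = background, otherwise distance to background
    margin : preset allowance for line-width variation (placeholder value)
    Returns None when the image has no character-line pixels.
    """
    ink = sum(1 for row in dist for v in row if v != 0)   # character-line pixels
    edge = sum(1 for row in dist for v in row if v == 1)  # edge pixels
    if edge == 0:
        return None
    return ink / edge + margin

# Idealized section of a horizontal stroke that is 4 pixels thick.
stroke = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
```

For `stroke`, there are 16 character-line pixels and 8 edge pixels, so the ratio is 2.0, which is half the 4-pixel line width, and the estimate with the placeholder margin is 3.0.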

[0049] Using this estimated maximum line width as the threshold, the image conversion unit 23 converts pixels whose distance is equal to or greater than the estimated maximum line width to background, and treats the remaining pixels (excluding pixels with a distance of 0) as character lines, thereby creating the first shaped image.
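Applied to a distance image, this conversion reduces to a per-pixel comparison. The sketch below assumes the distance image is a nested list and uses a hypothetical function name, `convert_by_threshold`.

```python
def convert_by_threshold(dist, threshold):
    """Create a first shaped image from a distance image (illustrative sketch).

    A pixel stays a character-line pixel (1) only if its distance is nonzero
    and below the threshold; background pixels and pixels at or beyond the
    threshold become background (0).
    """
    return [[1 if 0 < v < threshold else 0 for v in row] for row in dist]

# One row across a thick, merged stroke: the deep interior is removed.
row = [[0, 1, 2, 3, 2, 1, 0]]
shaped = convert_by_threshold(row, 3)  # [[0, 1, 1, 0, 1, 1, 0]]
```

The merged stroke is split into two thinner strokes, which is the separation effect the paragraph describes.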

[0050] With this processing, even when reading character lines of various thicknesses written with an unspecified writing instrument, a character image close to the correct character can be obtained from a character in which two character lines overlap or touch, and misrecognition as a wrong character is reduced.

[0051] Alternatively, the estimated maximum line width described above may be set as a lower threshold, and a value obtained in advance by experiment or the like may be added to it to set an upper threshold. The first shaped image may then be created by converting the pixels whose distance lies between these two thresholds to background and treating the other pixels (excluding pixels with a distance of 0) as character lines.
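The two-threshold variant can be sketched the same way. The text does not say whether the band limits are inclusive, so this sketch assumes the lower limit is inclusive and the upper limit is exclusive. The function name `convert_by_band` and the parameter `extra` (the experimentally obtained value added to the lower threshold) are hypothetical.

```python
def convert_by_band(dist, lower, extra):
    """Two-threshold conversion of a distance image (illustrative sketch).

    Pixels whose distance falls in [lower, lower + extra) become background.
    Pixels below the band or at and above its upper limit stay character-line
    pixels, except background pixels (distance 0), which stay background.
    The inclusive/exclusive choice of the limits is an assumption.
    """
    upper = lower + extra
    return [[0 if v == 0 or lower <= v < upper else 1 for v in row]
            for row in dist]

# One row across three merged strokes: a band is removed on each side,
# while the deepest pixels survive as the middle stroke.
row = [[0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]]
shaped = convert_by_band(row, lower=3, extra=2)
# shaped == [[0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]]
```

Keeping the pixels above the upper threshold is what lets a third stroke in the middle of a crushed region survive.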

[0052] This processing handles not only characters in which two character lines overlap or touch but also characters in which three character lines overlap or touch.

[0053] Note that a normalization means may be provided before the processing of the first image conversion means of each embodiment described above, to convert the character size so that the circumscribed character frame size of the input character image, or the first or second moment about the center of gravity, becomes uniform. Then, even when the size of written characters varies (for example, when reading both the sender and the recipient on a mail item) or when character size changes greatly, the variation in character line thickness can be made relatively small, and characters written with various writing instruments can be recognized reliably with simple processing.
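As one possible reading of the circumscribed-frame option, the sketch below crops the character to its bounding box and resamples it to a fixed square size by nearest-neighbor sampling. The resampling method, the output size, and the name `normalize_to_frame` are assumptions. The moment-based alternative is not shown.

```python
def normalize_to_frame(img, size=8):
    """Scale a character to a fixed frame size (illustrative sketch).

    img  : list of rows, 1 = character line, 0 = background (assumed format)
    size : side length of the square output frame (placeholder value)
    Uses nearest-neighbor sampling of the circumscribed bounding box.
    """
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        return [[0] * size for _ in range(size)]  # blank input stays blank
    top, left = rows[0], cols[0]
    bh = rows[-1] - top + 1    # bounding-box height
    bw = cols[-1] - left + 1   # bounding-box width
    return [[img[top + (r * bh) // size][left + (c * bw) // size]
             for c in range(size)]
            for r in range(size)]
```

Because small and large characters are brought to the same frame size before the first image conversion means runs, a fixed mask size or threshold margin then corresponds to a more consistent fraction of the character.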

[0054] The first image conversion means, the second image conversion means, the dictionary unit, and the collation unit can be implemented by combining a central processing unit (CPU) and memory and executing processing according to a program. Where speed is a concern, dedicated hardware (a DSP or the like) may be used instead.

【００５５】[0055]

[Effects of the Invention] Because the present invention is configured as described above, it provides the following effects.

[0056] In the invention according to claim 1, when two character lines overlap or touch, the first image conversion means separates them so that an image close to the correct character is obtained, and misrecognition as a wrong character is reduced. In addition, because the second image conversion means shapes the character by nonlinear normalization, characters can be recognized more accurately. In particular, the invention according to claim 2 provides the same effect for characters in which three character lines overlap or touch.

[0057] In the invention according to claim 3, even when reading character lines of various thicknesses written with an unspecified writing instrument, the first image conversion means yields a character image close to the correct character from a character in which two character lines overlap or touch, and misrecognition as a wrong character is reduced. In addition, because the second image conversion means shapes the character by nonlinear normalization, characters can be recognized more accurately. In particular, the invention according to claim 4 provides the same effect for characters in which three character lines overlap or touch.

[0058] In the invention according to claim 5, even when the size of written characters varies (for example, when reading both the sender and the recipient on a mail item) or when character size changes greatly, the variation in character line thickness can be made relatively small, and characters written with various writing instruments can be recognized reliably with simple processing.

[Brief description of drawings]

FIG. 1 is a block diagram showing the device configuration of the first embodiment of the character recognition system of the present invention.

FIG. 2 is a diagram showing the conversion process from the first shaped image, which is the output of the first image conversion means shown in FIG. 1, to the second shaped image, which is the output of the second image conversion means.

FIG. 3 is a diagram showing another example of the first shaped image output by the first image conversion means shown in FIG. 1 and the second shaped image output by the second image conversion means.

FIG. 4 is a diagram showing the configuration of the first embodiment of the mask used by the first image conversion means shown in FIG. 1.

FIG. 5 is a diagram showing the configuration of the second embodiment of the mask used by the first image conversion means of the character recognition system of the present invention.

FIG. 6 is a flowchart showing the processing procedure of the second embodiment of the first image conversion means of the character recognition system of the present invention.

FIG. 7 is a diagram showing examples of the first shaped image and the second shaped image.

FIG. 8 is a block diagram showing the configuration of the third embodiment of the first image conversion means of the character recognition system of the present invention.

FIG. 9 is a diagram showing an example of the processing result of the first image conversion means shown in FIG. 8.

FIG. 10 is a diagram showing an example of recognition when a character is recognized by a conventional image recognition system.

[Explanation of symbols]

1 input unit
2, 20 first image conversion means
3 second image conversion means
4 dictionary unit
5 collation unit
6 output unit
7, 17 mask
21 distance calculation unit
22 threshold setting unit
23 image conversion unit

Claims

[Claims]

1. A character recognition system for reading a character as an image and outputting a recognition result or candidates for the character, comprising: first image conversion means for scanning the image of the read character with a mask of predetermined size and replacing the pixel located at the center of the mask with background when all the pixels in the mask are character lines; second image conversion means for measuring, in the output image of the first image conversion means, the lengths of runs of background pixels in two orthogonal directions, accumulating values proportional to the reciprocal of each length in the direction perpendicular to the measured direction, and making the accumulated values uniform; a dictionary unit in which features of each character category are stored; and a collation unit that extracts features of the character from the output image of the second image conversion means, compares them with the features stored in the dictionary unit to find similar character categories, and outputs a recognition result or candidates for the read character.

2. The character recognition system according to claim 1, wherein the first image conversion means scans the image of the read character with a first mask of predetermined size and a second mask formed around the first mask in a shape similar to the first mask, and, when all the pixels in the first mask are character lines and the second mask contains background pixels, replaces the pixel located at the center of the first mask with background.

3. A character recognition system for reading a character as an image and outputting a recognition result or candidates for the character, comprising: first image conversion means including a distance calculation unit that calculates, for each character-line pixel, the distance to the nearest background pixel, a threshold setting unit that obtains, from the calculation result of the distance calculation unit, the ratio of the number of pixels with the minimum distance to the number of character-line pixels and sets from that ratio a threshold for replacing character-line pixels with background, and an image conversion unit that replaces pixels whose distance is greater than the threshold with background; second image conversion means for measuring, in the output image of the first image conversion means, the lengths of runs of background pixels in two orthogonal directions, accumulating values proportional to the reciprocal of each length in the direction perpendicular to the measured direction, and making the accumulated values uniform; a dictionary unit in which features of each character category are stored; and a collation unit that extracts features of the character from the output image of the second image conversion means, compares them with the features stored in the dictionary unit to find similar character categories, and outputs a recognition result or candidates for the read character.

4. The character recognition system according to claim 3, wherein the threshold setting unit obtains, from the calculation result of the distance calculation unit, the ratio of the number of pixels with the minimum distance to the number of character-line pixels, and sets, from that ratio, a lower threshold for replacing character-line pixels with background and an upper threshold obtained by adding a predetermined value to the lower threshold, and the image conversion unit replaces pixels whose distance lies between the lower threshold and the upper threshold with background.

5. The character recognition system according to claim 1, further comprising a normalizing means for converting a read character into a uniform size.