JP2000181994A

JP2000181994A - Character recognition processing method, device therefor and recording medium recording the method

Info

Publication number: JP2000181994A
Application number: JP10357072A
Authority: JP
Inventors: Minoru Mori; 稔森; Masaharu Kurakake; 正治倉掛; Toshiaki Sugimura; 利明杉村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1998-12-16
Filing date: 1998-12-16
Publication date: 2000-06-30
Anticipated expiration: 2018-12-16
Also published as: JP3375292B2

Abstract

PROBLEM TO BE SOLVED: To provide a processing method and device for correctly recognizing a character even when background noise is present. SOLUTION: A character pattern is divided into local areas in a preprocessing part 1-2, and the number of black pixels inside each local area is measured in a black pixel number extraction part 1-4. In an identification part 1-5, the difference from the standard black pixel number of each category in a prepared black pixel number standard dictionary 1-7 is obtained. When a distance value or similarity is calculated between the feature vector obtained from a local area of the input pattern in a feature extraction part 1-3 and the standard feature vector of the corresponding category in a feature standard dictionary 1-6, the distance value or similarity of each local area is reduced by a value proportional to the black pixel number difference. This reduces the fluctuation of the distance value or similarity in local areas containing a lot of noise, and hence over the entire character pattern, so that characters can be recognized correctly.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method and an apparatus for recognizing a character pattern, and more particularly to a method and an apparatus for recognizing a binary character pattern.

[0002]

[Prior Art] As one example of a conventional character pattern recognition method and apparatus, the features needed for recognition are extracted in vector form from a character pattern that has been binarized and normalized in position and size, a discriminant function such as a distance value or similarity is computed against the standard pattern vector of each category in a standard dictionary prepared in advance, and the most similar character category is taken as the recognition result (hereinafter, the "first method"). One example of such feature extraction divides the character pattern into coarse local regions, observes the character portion in each local region along coordinate axes in several directions, and obtains the direction contribution of the character line (Japanese Patent Application No. Sho 56-46659) for each black pixel of the character portion that is crossed when scanning along those axes; see Norihiro Hagita, Seiichiro Naito, and Isao Masuda, "Handwritten Kanji Recognition Method Using Global and Local Direction Contribution Density Features," IEICE Transactions (D), J66-D, No. 6, pp. 722-729.

[0003] Alternatively, the binary pattern of the input pattern itself is used as the elements of the feature vector, and the most similar character category is taken as the recognition result using, for example, the simple similarity, which is a discriminant function based on the number of pixels that are black in both the input pattern and the standard pattern vector of each category in a standard dictionary prepared in advance (the standard pattern vectors are likewise expressed as binary patterns) (hereinafter, the "second method").

[0004]

[Problems to be Solved by the Invention] In the first method of character pattern recognition described above, feature quantities that reflect the character line structure, such as the direction, connection relationships, and positional relationships of character lines, have been widely used as the elements of the feature vector. For images with severe background noise, however, these feature quantities fluctuate greatly, so the distance value or similarity also fluctuates greatly and correct recognition becomes difficult.

[0005] In the second method of character pattern recognition described above, an image with severe background noise tends to be misrecognized as a category whose standard pattern vector has a large number of black pixels. In addition, because the decision is based on the number of overlapping pixels, even a slight misalignment between the superimposed input pattern and standard pattern greatly degrades recognition performance.

[0006] The present invention has been made in view of these drawbacks. It concerns methods that extract, in vector form, the features needed for recognition from a character pattern that has been binarized and normalized in position and size, compute a discriminant function such as a distance value or similarity against the standard pattern vector of each category in a standard dictionary prepared in advance, and take the most similar character category as the recognition result. When noise is present in the background of a character pattern, the feature values extracted from the noise are superimposed on those extracted from the character lines in the local regions where the noise is present, so the feature vector obtained from those local regions fluctuates greatly. As a result, the distance value or similarity computed by a conventional discriminant function fluctuates greatly, and it becomes difficult to recognize the character type correctly. The object of the invention is to solve this problem and to provide a method and an apparatus that recognize characters correctly even when such background noise is present.

[0007]

[Means for Solving the Problems] The character recognition processing method of the present invention for solving the above problem is characterized by: normalizing the position and size of a character pattern that has been cut out character by character and binarized; extracting features from the normalized character pattern; separately from the feature extraction, dividing the normalized character pattern into local regions; counting, for each local region of the divided character pattern, the number of black pixels in the local region; obtaining a black pixel number difference value between the counted number of black pixels and the standard number of black pixels of each category in a black pixel number standard dictionary prepared in advance; adjusting, according to the magnitude of the black pixel number difference value, the distance value or similarity obtained between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance; and using the adjusted distance value or similarity to take the most similar character category as the recognition result for the character pattern.

[0008] Alternatively, the method is characterized in that, in the step of adjusting the distance value or similarity obtained between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in the feature standard dictionary according to the magnitude of the black pixel number difference value: when the black pixel number difference value is 0, the distance values or similarities calculated for all dimensions of the feature vector are accumulated; when the black pixel number difference value is greater than 0, the distance value or similarity of each dimension of the feature vector is divided by the black pixel number difference value and then accumulated; and the accumulated value is taken as the adjusted distance value or similarity of the local region.

[0009] Alternatively, the method is characterized in that, in the same adjusting step, the black pixel number difference value is subtracted from the distance value or similarity of each dimension of the feature vector, the values after subtraction that are 0 or greater are accumulated, and the accumulated value is taken as the adjusted distance value or similarity of the local region.
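
The subtraction variant of the adjustment can be sketched roughly as follows. This is only an illustration under assumptions: the function name `adjust_by_subtraction` and the list-of-floats representation of the per-dimension values are hypothetical, and the difference value is assumed to be already scaled to the range of the distance values.

```python
def adjust_by_subtraction(per_dim_values, diff):
    """Adjusted local value: subtract `diff` per dimension, keep results >= 0.

    per_dim_values: per-dimension distance (or similarity) values of one
                    local region (hypothetical representation).
    diff:           black pixel number difference value for that region,
                    assumed to be scaled to the same range as the values.
    """
    total = 0.0
    for value in per_dim_values:
        reduced = value - diff
        # Only values that remain 0 or greater after subtraction are accumulated.
        if reduced >= 0:
            total += reduced
    return total
```

With a difference value of 0 the function simply returns the plain sum, so a region whose black pixel count matches the dictionary is left unchanged.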

[0010] Furthermore, the invention is characterized in that a program for causing a computer to execute the processing steps of the above character recognition processing method is recorded on a recording medium readable by the computer.

[0011] Likewise, the character recognition processing apparatus of the present invention for solving the above problem is characterized by comprising: preprocessing means for normalizing the position and size of a character pattern that has been cut out character by character and binarized; feature extraction means for extracting features from the normalized character pattern; black pixel number extraction means for dividing the normalized character pattern into local regions and counting, for each local region of the divided character pattern, the number of black pixels in the local region; black pixel number difference value calculation means for obtaining a black pixel number difference value between the counted number of black pixels and the standard number of black pixels of each category in a black pixel number standard dictionary prepared in advance; distance value or similarity calculation means for calculating a distance value or similarity between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance; adjustment means for adjusting the calculated distance value or similarity according to the magnitude of the black pixel number difference value; and identification means for using the adjusted distance value or similarity to take the most similar character category as the recognition result for the character pattern.

[0012] In the present invention, the character pattern is divided into local regions, the number of black pixels in each local region is counted, and the difference from the standard number of black pixels of each category in a black pixel number standard dictionary prepared in advance is obtained. When the distance value or similarity is calculated between the feature vector obtained from a local region of the input pattern and the standard feature vector of the corresponding category in the feature standard dictionary, the distance value or similarity obtained for each local region is reduced by a value proportional to the black pixel number difference value. This reduces the fluctuation of the distance value or similarity obtained from local regions that contain much noise, and thereby reduces the fluctuation of the distance value or similarity obtained from the entire character pattern. This is the means for solving the problem.

[0013] The present invention is characterized in that a conventional feature-based character recognition technique is combined with black pixel number difference value calculation processing/means, a black pixel number standard dictionary, black pixel number difference value normalization processing/means, and distance value/similarity adjustment processing/means. Combining these suppresses the adverse effect on the feature vector of noise (such as stains) that is locally superimposed on a divided area, an effect that arose in conventional recognition when the features of the character were obtained for each divided area, and improves the character recognition rate even in such cases.

[0014]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0015] [First Embodiment] FIG. 1 is a block diagram illustrating a first embodiment of the character recognition method and apparatus of the present invention.

[0016] The preprocessing unit 1-2 uses, for example, a conventionally known position normalization method: it calculates the width and height of the input character pattern 1-1 to find the center of the input character pattern, and translates the entire input character pattern so that this center coincides with the center of the character frame. It also uses, for example, a conventionally known size normalization method to enlarge or reduce the input character pattern so that the width and height of the character equal the width and height of the character frame.
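
As an illustration only, position and size normalization of this kind can be sketched as below. The function name `normalize_pattern`, the list-of-rows representation with 1 for a black pixel, and the nearest-neighbor resampling of the bounding box are assumptions made for this sketch; the text itself only refers to conventionally known normalization methods.

```python
def normalize_pattern(pattern, n):
    """Fit the bounding box of the black pixels to an n x n character frame.

    pattern: list of equal-length rows of 0/1 values (1 = black); hypothetical
             representation chosen for this sketch.
    Cropping to the bounding box centers the character, and nearest-neighbor
    resampling makes its width and height equal to the frame size.
    """
    rows = [y for y, row in enumerate(pattern) if any(row)]
    cols = [x for x in range(len(pattern[0])) if any(row[x] for row in pattern)]
    if not rows:
        # No black pixels at all: return an empty frame.
        return [[0] * n for _ in range(n)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1
    width = cols[-1] - left + 1
    return [
        [pattern[top + (y * height) // n][left + (x * width) // n] for x in range(n)]
        for y in range(n)
    ]
```

Other resampling schemes (for example, area averaging followed by re-binarization) would fit the description equally well; nearest-neighbor is used here only because it is short.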

[0017] FIG. 2 shows an example in which an input character pattern of the character "圧" is normalized by the normalization processing of the preprocessing unit. FIG. 2(a) is an example of the input character pattern 1-1. FIG. 2(b) is the character pattern after the preprocessing unit 1-2 has normalized the position and size of the input character pattern 1-1.

[0018] The feature extraction unit 1-3 extracts, in vector form, the features needed for recognition, such as the direction, connection relationships, and positional relationships of character lines, from the character pattern normalized by the preprocessing unit 1-2.

[0019] FIG. 3 is a block diagram showing an example configuration of an apparatus implementing the feature extraction unit 1-3. Here, 1-3-1 is a feature value extraction unit that extracts the feature values needed for recognition from the character pattern, and 1-3-2 is a feature value calculation unit that calculates the feature values in vector form.

[0020] The black pixel number extraction unit 1-4 receives the character pattern normalized by the preprocessing unit 1-2, divides it into coarse local regions, and counts the number of black pixels contained in each local region.

[0021] FIG. 4 is a block diagram showing an example configuration of an apparatus implementing the black pixel number extraction unit 1-4. Here, 1-4-1 is a character pattern division unit that divides the character pattern into a plurality of coarse local regions; 1-4-2 is a pixel detection unit that examines each pixel in a local region and determines whether it is white or black; 1-4-3 is a black pixel number measurement unit that increments a count when the pixel detection unit determines that a pixel is black; and 1-4-4 is a black pixel number calculation unit that outputs the number of black pixels in the local region.

[0022] The identification unit 1-5 is the main part of the present invention. It takes the feature vector obtained by the feature extraction unit 1-3 and the number of black pixels obtained by the black pixel number extraction unit 1-4, and uses the previously prepared feature standard dictionary 1-6 and black pixel number standard dictionary 1-7. In the per-local-region calculation of a conventionally known distance or similarity discriminant function between the feature vector and the standard feature vector of each category in the feature standard dictionary 1-6, it obtains the difference between the number of black pixels in each local region of the input pattern and the standard number of black pixels of each category in the black pixel number standard dictionary 1-7, and reduces the distance value or similarity obtained for each local region according to the magnitude of that black pixel number difference value. By performing this processing for all local regions, it obtains the distance value or similarity between the input character pattern and each category of the standard dictionary, and thereby identifies the character pattern.

[0023] FIG. 5 is a block diagram showing an example configuration of an apparatus implementing the identification unit 1-5. Here, 1-5-1 is a distance value/similarity calculation unit that calculates, for each local region, a conventionally known distance value or similarity between the feature vector obtained from the input pattern and the standard feature vector of each category in the stored feature standard dictionary 1-6; 1-5-2 is a black pixel number difference value calculation unit that obtains, for each local region, the difference between the number of black pixels obtained from the input pattern and the standard number of black pixels of each category in the stored black pixel number standard dictionary 1-7; 1-5-3 is a black pixel number difference value normalization unit that normalizes the black pixel number difference value to a numerical range matched to the distance value or similarity used by the distance value/similarity calculation unit 1-5-1; and 1-5-4 is a distance value/similarity adjustment unit that adjusts the distance value or similarity of each local region using the normalized black pixel number difference value.

[0024] The distance value/similarity adjustment unit 1-5-4 receives the normalized black pixel number difference value obtained by the black pixel number difference value normalization unit 1-5-3 and the distance value or similarity obtained by the distance value/similarity calculation unit 1-5-1, and outputs the distance value or similarity reduced by the normalized black pixel number difference value using, for example, division or subtraction. Its purpose is to obtain, in each local region, a distance value or similarity that reflects the difference between the number of black pixels of the input pattern and the standard number of black pixels in the black pixel number standard dictionary.

[0025] FIG. 6 shows a flowchart of the processing in the distance value/similarity adjustment unit 1-5-4, for the case where division is used to adjust the distance value. In this example, when the normalized black pixel number difference value is 0, the distance values or similarities obtained by the conventional method for each dimension of the feature vector are accumulated over all dimensions, and the accumulated value is output unchanged as the distance value of the local region. When the normalized black pixel number difference value is greater than 0, the distance value or similarity obtained by the conventional method for each dimension of the feature vector is divided by the normalized black pixel number difference value, the divided values are accumulated over all dimensions, and the accumulated value is output as the distance value or similarity of the local region.
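
The division-based adjustment just described might look like the following minimal sketch. The function name `adjust_by_division` and the list representation are hypothetical. The sketch also assumes that a nonzero normalized difference value is at least 1, since dividing by a value between 0 and 1 would increase rather than reduce the result; the text does not state how the normalization guarantees this.

```python
def adjust_by_division(per_dim_values, norm_diff):
    """Adjusted local distance (or similarity) using division.

    per_dim_values: per-dimension values of one local region computed by
                    the conventional discriminant function.
    norm_diff:      normalized black pixel number difference value
                    (assumed to be 0 or >= 1 in this sketch).
    """
    if norm_diff == 0:
        # Black pixel counts agree: accumulate the values unchanged.
        return sum(per_dim_values)
    # Counts differ: divide each dimension's value before accumulating.
    return sum(value / norm_diff for value in per_dim_values)
```

A region whose black pixel count deviates strongly from the dictionary therefore contributes less to the overall distance, which is the intended noise suppression.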

[0026] Returning to FIG. 5, 1-5-5 is a distance value/similarity calculation unit that adds up the distance values or similarities of the local regions to calculate the distance value or similarity for each character in the standard dictionary; 1-5-6 is a sorting unit that sorts the results, using the distance values or similarities obtained for all character types, in order from the most similar character type; and 1-5-7 is an identification result output unit that outputs the identification result.

[0027] Next, a concrete embodiment of the character recognition processing method and apparatus of the present invention is described. The feature used for recognition is the direction contribution (Japanese Patent Application No. Sho 56-46659): the character pattern is divided into coarse local regions; for each black pixel in each local region, tentacles are extended in a plurality of predetermined directions, for example the eight directions 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° (numbered 1, 2, 3, 4, 5, 6, 7, and 8, respectively); the number of black pixels connected in each direction is counted; and the direction contribution expresses how the connected black pixels are distributed over the direction components. The Euclidean distance is used as the discriminant function, and the character pattern is identified by calculating distance values in which the distance is divided by a value corresponding to the difference in the number of black pixels.

[0028] FIG. 7 is a flowchart illustrating this case.

[0029] First, an input character pattern cut out as a single-character region (step S1) is sent to the preprocessing unit 1-2. The preprocessing unit 1-2 normalizes the position and size of the input character pattern (step S2). The normalized character pattern, an N×N mesh obtained by the preprocessing (step S3), is sent to the feature value extraction unit 1-3-1 of the feature extraction unit 1-3 and to the character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4.

[0030] The character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4 divides the normalized character pattern equally into K coarse local regions, for example square local regions (step S4). To calculate the number of black pixels in each local region, the divided character pattern is sent to the pixel detection unit 1-4-2 (step S5).

[0031] FIG. 8 shows the specific processing flow for calculating the number of black pixels in each local region. The pixel detection unit 1-4-2 examines the pixels in each local region of the divided character pattern (step S5-1). If the examined pixel is black (step S5-2), the black pixel counter is incremented by 1 (step S5-3). All pixels in the local region are examined and counted in this way to obtain the number of black pixels in the local region (step S5-4). The number of black pixels thus obtained is sent to the black pixel number difference value calculation unit 1-5-2 of the identification unit 1-5.

[0032] The feature value extraction unit 1-3-1 of the feature extraction unit 1-3 divides the character pattern in the same way as the character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4 (step S6). It then calculates a feature value in each local region of the divided character pattern (step S7).

[0033] FIG. 9 shows the specific processing flow for calculating the feature value in each local region. The feature value extraction unit 1-3-1 examines the pixels in each local region of the divided character pattern (step S7-1). If the examined pixel is black (step S7-2), the black pixel counter is incremented by 1 (step S7-3). The direction contribution is then obtained for the detected black pixel (step S7-4).

[0034] FIG. 10 shows the specific processing flow for calculating the direction contribution at each pixel. The feature value extraction unit 1-3-1 takes the detected pixel as the reference point (step S7-4-1), extends a tentacle in each direction, and examines the adjacent pixel (step S7-4-2). If the pixel adjacent in the scanning direction is black, the connection length counter is incremented by 1 (step S7-4-3), the adjacent pixel becomes the new reference point (step S7-4-4), and the scan is repeated. The scan is not limited to pixels in the block containing the detected pixel; it is performed over the entire normalized character pattern.

[0035] If the adjacent pixel is white or there is no adjacent pixel, the scan ends (step S7-4-5). The above processing is performed for all eight directions (step S7-4-6). From the black pixel connection lengths in the eight directions obtained for each black pixel, a cumulative black pixel connection length is obtained using, for example, the simple sum or the square root of the sum of squares (step S7-4-7). The direction contribution is obtained by dividing the black pixel connection length in each direction by the cumulative black pixel connection length (step S7-4-8).

[0036] The direction contribution f of each black pixel is expressed as the eight-dimensional vector

$f = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \alpha_7, \alpha_8)$

Here $\alpha_1, \alpha_2, \ldots, \alpha_8$ are the direction contribution components in the eight directions. Let $l_i$ ($i = 1, 2, \ldots, 8$) be the black pixel connection length obtained in direction i by extending a tentacle from the black pixel. When, for example, the square root of the sum of squares is used as the cumulative black pixel connection length, each component is

$\alpha_i = l_i / \sqrt{\sum_{j=1}^{8} l_j^2}$
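The scan and normalization of steps S7-4-1 to S7-4-8 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the ordering of the eight directions, the list-of-rows image representation, and the choice to start the counter at zero (so that the reference pixel itself is not counted) are all assumptions of the sketch.

```python
import math

# Eight scan directions as (dx, dy) steps; this ordering is an assumption.
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]


def run_lengths(image, x, y):
    """Count connected black pixels from (x, y) along each of the 8 directions.

    `image` is a list of rows holding 1 (black) or 0 (white). The scan covers
    the whole pattern and stops at a white pixel or at the border. The
    starting pixel itself is not counted (assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    lengths = []
    for dx, dy in DIRECTIONS:
        count, cx, cy = 0, x + dx, y + dy
        while 0 <= cx < width and 0 <= cy < height and image[cy][cx] == 1:
            count += 1
            cx, cy = cx + dx, cy + dy
        lengths.append(count)
    return lengths


def direction_contribution(lengths):
    """Divide each run length by the square root of the sum of squares."""
    norm = math.sqrt(sum(l * l for l in lengths))
    if norm == 0:  # isolated black pixel: no black neighbour in any direction
        return [0.0] * len(lengths)
    return [l / norm for l in lengths]
```

For a black pixel in the middle of a horizontal stroke, only the left and right components are non-zero, so the vector points along the stroke direction regardless of the stroke length.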

[0037] The direction contribution f obtained in this way is computed for every black pixel in each local region and accumulated per direction (step S7-5). The accumulated direction contributions and the black pixel count are sent to the feature value calculation unit 1-3-2. The feature value calculation unit 1-3-2 averages the accumulated direction contributions by the number of black pixels in each local region to obtain the feature value of that region (step S7-6). The feature value $f_k$ obtained in the k-th local region ($k = 1, 2, \ldots, K$) is

$f_k = (\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{k8})$

Here $\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{k8}$ are obtained by accumulating, per direction component, the direction contribution vectors of all black pixels in the k-th local region and dividing each accumulated component by the number of black pixels.
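The accumulation and averaging of steps S7-5 and S7-6 can be sketched as below. The sketch assumes the per-pixel direction contribution vectors have already been computed and that the pattern is divided into a square grid of equally sized square regions; `region_features`, `region_size`, and `grid` are hypothetical names introduced only for illustration.

```python
def region_features(contributions, region_size, grid):
    """Average per-pixel direction contributions inside each local region.

    `contributions` maps the (x, y) position of each black pixel to its
    8-element direction contribution vector. `region_size` is the side of a
    square region in pixels and `grid` the number of regions per row.
    Returns (features, counts), both indexed by k = row * grid + col.
    """
    num_regions = grid * grid
    sums = [[0.0] * 8 for _ in range(num_regions)]
    counts = [0] * num_regions
    for (x, y), vec in contributions.items():
        k = (y // region_size) * grid + (x // region_size)
        counts[k] += 1
        for j, value in enumerate(vec):
            sums[k][j] += value
    # A region without black pixels gets an all-zero feature (assumption).
    features = [
        [s / counts[k] for s in sums[k]] if counts[k] else [0.0] * 8
        for k in range(num_regions)
    ]
    return features, counts
```

The returned `counts` list is the per-region black pixel count that the later distance adjustment compares against the standard dictionary.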

[0038] The feature vector $f_k$ of the character pattern in each local region obtained in this way (step S8) is sent to the distance value / similarity calculation unit 1-5-1 of the identification unit 1-5.

[0039] FIG. 11 shows the specific processing flow for calculating the distance value in each local region.

[0040] The black pixel number difference value calculation unit 1-5-2 computes the black pixel number difference value between the black pixel count sent from the black pixel number extraction unit 1-4 and the standard black pixel count of each category in the black pixel number standard dictionary 1-7 (step S9-1). The difference value is sent to the black pixel number difference value normalization unit 1-5-3, which normalizes it to a value suited to the Euclidean distance (step S9-2). Let $a_k$ be the black pixel count of the input pattern in the k-th local region and $b_{ik}$ the standard black pixel count of category i ($1 \le i \le M$) in the black pixel number standard dictionary 1-7. The black pixel number difference value $C1_k$ is

$C1_k = a_k - b_{ik}$

[0041] As one example of normalizing the black pixel number difference value $C1_k$ to suit the Euclidean distance, the normalized black pixel number difference value $C2_k$ is

$C2_k = 1$ when $C1_k \le 0$

$C2_k = (C1_k / W) + 1$ when $C1_k > 0$

where $1 \le W \le$ (number of pixels in the local region). The normalized black pixel number difference value is sent to the distance value / similarity adjustment unit 1-5-4.
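The normalization above amounts to a small piecewise function. The following is a sketch under the stated definitions; the function name and argument names are hypothetical, and the value of W is left to the caller because the text gives only its permitted range.

```python
def normalized_difference_euclid(input_count, standard_count, w):
    """Return C2_k for one local region and one category.

    `input_count` is a_k, `standard_count` is b_ik, and `w` is the scaling
    constant W (1 <= W <= number of pixels in the local region).
    """
    if w < 1:
        raise ValueError("W must be at least 1")
    c1 = input_count - standard_count  # C1_k = a_k - b_ik
    if c1 <= 0:
        return 1.0  # no excess black pixels: the divisor stays at 1
    return c1 / w + 1.0  # excess black pixels enlarge the divisor
```

Because the result is never below 1, dividing a distance term by it can only leave that term unchanged or shrink it.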

[0042] The distance value / similarity calculation unit 1-5-1 computes the squared difference between the feature vector sent from the feature extraction unit 1-3 and the standard feature vector of each category in the previously created feature standard dictionary 1-6 (step S9-3). This is done for all eight directions (step S9-4). Let the feature vector of the input character pattern in the k-th local region be $f_k = (\alpha_{k1}, \alpha_{k2}, \alpha_{k3}, \alpha_{k4}, \alpha_{k5}, \alpha_{k6}, \alpha_{k7}, \alpha_{k8})$ and the standard feature vector of category i ($1 \le i \le M$) in the feature standard dictionary 1-6 be $s_i = (\beta_{k1}, \beta_{k2}, \beta_{k3}, \beta_{k4}, \beta_{k5}, \beta_{k6}, \beta_{k7}, \beta_{k8})$. The squared differences of the feature values are then, per direction,

$(\alpha_{k1} - \beta_{k1})^2,\ (\alpha_{k2} - \beta_{k2})^2,\ \ldots,\ (\alpha_{k8} - \beta_{k8})^2$

The result is sent to the distance value / similarity adjustment unit 1-5-4.

[0043] The distance value / similarity adjustment unit 1-5-4 divides the squared difference sent from the distance value / similarity calculation unit 1-5-1, for each of the eight directions, by the normalized black pixel number difference value sent from the black pixel number difference value normalization unit 1-5-3, and accumulates the results (step S9-5). The value accumulated over all eight directions is the distance value (step S9-6). The squared differences divided by the normalized black pixel number difference value are, per direction,

$(\alpha_{k1} - \beta_{k1})^2 / C2_k,\ (\alpha_{k2} - \beta_{k2})^2 / C2_k,\ \ldots,\ (\alpha_{k8} - \beta_{k8})^2 / C2_k$ ... Equation (1)

and their accumulated value $D_{ik}$ is

$D_{ik} = (\alpha_{k1} - \beta_{k1})^2 / C2_k + (\alpha_{k2} - \beta_{k2})^2 / C2_k + \cdots + (\alpha_{k8} - \beta_{k8})^2 / C2_k$ ... Equation (2)
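Equation (2) can be written as a short function. This is a sketch only; the function name is hypothetical, and the guard on `c2` reflects the fact that the normalization in paragraph [0041] never yields a value below 1.

```python
def adjusted_squared_distance(feature, standard, c2):
    """D_ik: sum over the directions of squared differences divided by C2_k.

    `feature` is the input vector f_k, `standard` the dictionary vector of
    category i for the same region, and `c2` the normalized black pixel
    number difference value C2_k.
    """
    if len(feature) != len(standard):
        raise ValueError("feature vectors must have the same length")
    if c2 < 1:
        raise ValueError("C2_k is at least 1 by construction")
    return sum((a - b) ** 2 / c2 for a, b in zip(feature, standard))
```

With `c2 == 1` the function returns the ordinary squared Euclidean distance for the region, which is the behaviour described for regions without excess black pixels.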

[0044] The distance value calculated for each local region is sent to the distance value / similarity calculation unit 1-5-5. The distance value / similarity calculation unit 1-5-5 accumulates the distance values calculated from all local regions and sends the total to the sorting unit 1-5-6 as the distance value for each category (step S10).

[0045] The distance value $D_i$ between the input character pattern and each category i ($1 \le i \le M$) of the feature standard dictionary 1-6 is the sum of the distance values obtained from all local regions, from the first to the K-th:

$D_i = D_{i1} + D_{i2} + \cdots + D_{ik} + \cdots + D_{iK}$

[0046] The sorting unit 1-5-6 sorts the distance values of all categories, obtained by running the above series of processing against every category in the standard dictionary, in ascending order (or in descending order for other distance or similarity measures) (step S11). The sorted result is sent to the identification result output unit 1-5-7.

[0047] The identification result output unit 1-5-7 outputs the character with the smallest distance value (or the largest value, for other distance or similarity measures) as the identification result (step S12).
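Steps S10 to S12 (summing over regions, sorting, and picking the best category) can be sketched as follows. The dictionary-of-lists input and the function names are assumptions made for illustration; the sketch covers only the distance case, where smaller is better.

```python
def rank_categories(region_distances):
    """Sum per-region distances for each category and sort ascending.

    `region_distances` maps a category label to its list of per-region
    distance values [D_i1, ..., D_iK].
    Returns (label, D_i) pairs with the smallest total distance first.
    """
    totals = {label: sum(values) for label, values in region_distances.items()}
    return sorted(totals.items(), key=lambda item: item[1])


def recognize(region_distances):
    """Return the label of the category with the smallest total distance."""
    return rank_categories(region_distances)[0][0]
```

For a similarity measure, where larger values indicate a better match, the same structure would apply with the sort order reversed.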

[0048] In the above description, the distance value / similarity adjustment unit 1-5-4 adjusts the distance value by dividing it by the normalized black pixel number difference value. Any other technique that reduces the distance value or similarity according to the black pixel number difference value, such as subtraction, is of course also applicable.

[0049] The effect of this embodiment on a concrete character pattern is explained with reference to FIG. 17. FIG. 17(a) shows a normal character pattern, and FIG. 17(b) shows an example of a character pattern with noise in the background. For the k-th local regions 17-(a)-k and 17-(b)-k, with the direction contribution feature described in the embodiment used as the feature, Table 1 shows the feature value Fa obtained from 17-(a)-k and the feature value Fb obtained from 17-(b)-k, Table 2 shows the black pixel counts Ba and Bb of the respective local regions, and Table 3 shows the conventional Euclidean distance D1 and the distance value D2 obtained with the method of this embodiment as discriminant functions.

[0050]

[Table 1]

[0051]

[Table 2]

[0052]

[Table 3]

[0053] Table 1 shows that the feature values obtained when noise is present deviate from those obtained when it is absent. Table 3 shows that, in this situation, the conventional Euclidean distance yields a value as large as about 120/1000 as the distance value between the local regions, whereas the method of the present invention holds the variation of the distance value to 8/1000. When the distance values obtained from all local regions by the same processing are used, a region without noise yields the original Euclidean distance, and a local region with noise yields a distance value reduced according to the amount of noise. As a result, the variation of the distance value obtained from the whole character pattern is reduced even when noise is present in the background, and the character can be recognized correctly.

[0054] [Embodiment 2] As a second embodiment of the present invention, FIG. 12 shows the specific processing flow for calculating the distance value of each local region when a character pattern is identified using the same features as in the first embodiment, the city block distance as the discriminant function, and a distance value obtained by subtracting a value that depends on the difference in the number of black pixels.

[0055] The black pixel number difference value calculation unit 1-5-2 computes the black pixel number difference value between the black pixel count sent from the black pixel number extraction unit 1-4 and the standard black pixel count of each category in the black pixel number standard dictionary 1-7 (step S9-11). The difference value is sent to the black pixel number difference value normalization unit 1-5-3, which normalizes it to a value suited to the city block distance (step S9-12). For example, let $d_k$ be the black pixel count of the input pattern in the k-th local region and $e_{ik}$ the black pixel count of each character i ($1 \le i \le M$) in the black pixel number standard dictionary 1-7. The black pixel number difference value $G1_k$ is

$G1_k = d_k - e_{ik}$

[0056] As one example of normalizing the black pixel number difference value $G1_k$ to suit the city block distance, the normalized black pixel number difference value $G2_k$ is

$G2_k = 0$ when $G1_k \le 0$

$G2_k = G1_k / V$ when $G1_k > 1$

where $1 \le V \le$ (number of pixels in the local region). The normalized black pixel number difference value is sent to the distance value / similarity adjustment unit 1-5-4.

[0057] The distance value / similarity calculation unit 1-5-1 computes the absolute difference between the feature vector sent from the feature extraction unit 1-3 and the standard feature vector of each category in the previously created feature standard dictionary 1-6 (step S9-13). This is done for all eight directions (step S9-14). Let the feature vector of the input character pattern in the k-th local region be $h_k = (\gamma_{k1}, \gamma_{k2}, \gamma_{k3}, \gamma_{k4}, \gamma_{k5}, \gamma_{k6}, \gamma_{k7}, \gamma_{k8})$ and the feature vector of each character i ($1 \le i \le M$) in the feature standard dictionary 1-6 be $t_i = (\delta_{k1}, \delta_{k2}, \delta_{k3}, \delta_{k4}, \delta_{k5}, \delta_{k6}, \delta_{k7}, \delta_{k8})$. The absolute differences of the feature values are then, per direction,

$|\gamma_{k1} - \delta_{k1}|,\ |\gamma_{k2} - \delta_{k2}|,\ \ldots,\ |\gamma_{k8} - \delta_{k8}|$

The result is sent to the distance value / similarity adjustment unit 1-5-4.

[0058] The distance value / similarity adjustment unit 1-5-4 subtracts, for each of the eight directions, the normalized black pixel number difference value sent from the black pixel number difference value normalization unit 1-5-3 from the absolute difference sent from the distance value / similarity calculation unit 1-5-1, and accumulates the absolute value of the result (step S9-15). The value accumulated over all eight directions is the distance value (step S9-16).

[0059] The absolute differences reduced by the normalized black pixel number difference value are, per direction,

$||\gamma_{k1} - \delta_{k1}| - G2_k|,\ ||\gamma_{k2} - \delta_{k2}| - G2_k|,\ \ldots,\ ||\gamma_{k8} - \delta_{k8}| - G2_k|$ ... Equation (3)

and their accumulated value $J_{ik}$ is

$J_{ik} = ||\gamma_{k1} - \delta_{k1}| - G2_k| + ||\gamma_{k2} - \delta_{k2}| - G2_k| + \cdots + ||\gamma_{k8} - \delta_{k8}| - G2_k|$ ... Equation (4)

which gives the distance value in each local region.

[0060] The distance value / similarity adjustment unit 1-5-4 in this embodiment uses subtraction as the adjustment technique. The flowchart of FIG. 13 shows a specific processing example for this case. In this example, the normalized black pixel number difference value is subtracted from the distance value or similarity obtained by the conventional technique for each dimension of the feature vector. If the result is 0 or greater, the reduced distance value or similarity is accumulated. If the result is less than 0, nothing is accumulated and processing moves on to the next dimension. This is repeated for all dimensions, and the final accumulated value is output as the distance value or similarity of the local region.
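The second embodiment can be sketched as follows. Two accumulation rules appear in the text: Equation (4) accumulates the absolute value of each reduced term, while the FIG. 13 flow skips a dimension whose reduced value is negative; the sketch shows both side by side. All function names are hypothetical. The text gives the second normalization case for G1_k > 1, and the sketch applies it to every positive difference, which is an assumption.

```python
def normalized_difference_cityblock(input_count, standard_count, v):
    """G2_k: 0 when there are no excess black pixels, otherwise G1_k / V.

    Assumption: every positive G1_k uses the G1_k / V branch.
    """
    if v < 1:
        raise ValueError("V must be at least 1")
    g1 = input_count - standard_count  # G1_k = d_k - e_ik
    return g1 / v if g1 > 0 else 0.0


def distance_abs_variant(feature, standard, g2):
    """Equation (4): accumulate | |gamma - delta| - G2_k | over all directions."""
    return sum(abs(abs(a - b) - g2) for a, b in zip(feature, standard))


def distance_skip_variant(feature, standard, g2):
    """FIG. 13 flow as described: skip dimensions whose reduced value is negative."""
    total = 0.0
    for a, b in zip(feature, standard):
        reduced = abs(a - b) - g2
        if reduced >= 0:
            total += reduced
    return total
```

With `g2 == 0` both variants reduce to the plain city block distance; they differ only in how a dimension is treated when the subtraction overshoots zero.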

[0061] In the above description, the distance value / similarity adjustment unit 1-5-4 adjusts the distance value by subtracting the normalized black pixel number difference value from it. Any other technique that reduces the distance value or similarity according to the black pixel number difference value, such as division, is of course also applicable.

[0062] FIG. 14 shows a character pattern divided into K coarse square local regions 14-1, 14-2, ..., 14-k, ..., 14-K.

[0063] FIG. 15 shows the case where eight directions (15-1, 15-2, 15-3, ..., 15-7, 15-8) are used as the directions in which tentacles are extended to obtain the black pixel connection lengths of the character pattern.

[0064] FIG. 16 shows how tentacles are extended from a black pixel in the k-th local region 14-k of FIG. 14 to obtain the black pixel connection lengths used for the direction contribution.

[0065] Table 4 shows the effect of the second embodiment of the present invention. For Table 4, the conventional methods used the same features as this embodiment, with the city block distance (conventional method 1) and the weighted city block distance (conventional method 2) as discriminant functions, and the feature dictionary held the mean and standard deviation of each category. The cumulative classification rates up to ranks 1, 2, 5, and 10 were measured on 3449 input patterns containing heavy background noise.

[0066]

[Table 4]

[0067] For patterns containing heavy background noise, conventional method 2, which uses the weighted city block distance, cannot cope with the noise-induced variation of the feature values, so its recognition rate falls below that of conventional method 1, which uses the city block distance. The second embodiment of the present invention adjusts the distance value according to the amount of noise and therefore suppresses the noise-induced variation effectively, improving the classification rate considerably over the methods that use conventional discriminant functions.

[0068] The feature of the present invention described above is that the conventional character recognition technique based on the direction contribution density feature (a technique composed of the distance value / similarity calculation unit 1-5-1, the feature standard dictionary 1-6, and the distance value / similarity calculation unit 1-5-5) is combined with the black pixel number difference value calculation unit 1-5-2, the black pixel number standard dictionary 1-7, the black pixel number difference value normalization unit 1-5-3, and the distance value / similarity adjustment unit 1-5-4. When the conventional direction contribution density feature is used to recognize character features in each area into which the target character is divided, noise (such as stains) locally superimposed on a divided area degrades the feature vector. The combination suppresses this adverse effect and improves the character recognition rate in such cases as well.

[0069] To supplement the above, whether a divided area contains much noise is determined by comparing the area with the black pixel number standard dictionary 1-7 in the black pixel number difference value calculation unit 1-5-2. If the area is judged to contain much noise, the black pixel number difference value normalization unit 1-5-3 performs the adjustment processing and outputs the difference value C2k, which is input to the distance value / similarity adjustment unit 1-5-4. The central processing of the present invention is that of the distance value / similarity adjustment unit 1-5-4. Specific methods are the use of division as the distance value adjustment technique, shown in the first embodiment (FIG. 6), and the use of subtraction, shown in the second embodiment (FIG. 13). The details of "divide the distance value (or similarity) of each dimension by the normalized black pixel number difference value" in FIG. 6 are given by Equation (1). The details of "accumulate the divided distance values" in FIG. 6 are given by Equation (2). The processing "accumulate the distance value of each dimension" on the right side of FIG. 6 uses the conventional technique. The details of "subtract the normalized black pixel number difference value from the distance value (or similarity) of each dimension" in FIG. 13 are given by Equation (3). The details of accumulating the subtracted distance values in FIG. 13 are given by Equation (4).

[0070] It goes without saying that some or all of the means shown in FIGS. 1, 3, 4, and 5 can be implemented with a computer, and that the processing steps shown in FIGS. 6 to 13 can be executed by a computer. A program that makes a computer function as those means, or that makes a computer execute those processing steps, can be recorded on a computer-readable recording medium such as an FD (floppy disk), MO, ROM, memory card, CD, DVD, or removable disk, and provided and distributed in that form.

[0071]

[Effects of the Invention] As described above, according to the present invention, the number of black pixels in each local region is measured, its difference from the black pixel count in a previously created standard dictionary is obtained, and the distance value or similarity is adjusted using that difference. When noise is present in the background, the feature values vary more as the amount of noise grows, and the distance value varies widely with them. The present invention reduces the distance value according to the amount of noise, which reduces the variation of the distance value obtained from the whole character pattern, so that patterns that a conventional discriminant function would misrecognize can be recognized correctly.

[0072] When no noise is present, an appropriate normalization of the black pixel number difference value yields the same distance value or similarity as the conventional discriminant function. The present invention therefore has no adverse effect on character patterns without noise and retains the same discrimination ability as before.

[0073] The present invention is not tied to a particular feature extraction technique. It is applicable to any technique that divides a character pattern into coarse local regions and extracts the features used for recognition from each region. Applying it to the various feature extraction techniques proposed so far for character recognition makes character patterns containing noise recognizable and thereby improves the performance of those techniques.

[0074] In the conventional technique that recognizes characters by the degree of overlap of black pixels, recognition performance degrades sharply when the pixel positions are slightly shifted during superposition or when the character font differs. Because the present invention extracts features from the character pattern, it can recognize patterns containing noise without much loss of recognition performance, even with pixel displacements within a coarse local region or differences in character font.

[Brief description of the drawings]

【図１】本発明の文字認識方法および装置における第１
の実施形態例を示す構成図である。FIG. 1 shows a first example of a character recognition method and apparatus according to the present invention.
FIG. 2 is a configuration diagram showing an example of the embodiment.

【図２】（ａ），（ｂ）は、上記第１の実施形態例の前
処理部における正規化処理の様子を示す図である。FIGS. 2A and 2B are diagrams showing a normalization process in a preprocessing unit according to the first embodiment. FIG.

【図３】上記第１の実施形態例の特徴抽出部を実施する
装置の構成の一例を表すブロック図である。FIG. 3 is a block diagram illustrating an example of a configuration of an apparatus that implements a feature extracting unit according to the first embodiment.

【図４】上記第１の実施形態例の黒画素数抽出部を実施
する装置の構成の一例を表すブロック図である。FIG. 4 is a block diagram illustrating an example of a configuration of an apparatus that implements a black pixel number extracting unit according to the first embodiment.

【図５】上記第１の実施形態例の識別部を実施する装置
の構成の一例を表すブロック図である。FIG. 5 is a block diagram illustrating an example of a configuration of a device that implements an identification unit according to the first embodiment.

【図６】上記第１の実施形態例の距離値・類似度調整部
における距離値の調整手法として除算を用いる場合の処
理例を示す図である。FIG. 6 is a diagram illustrating a processing example in a case where division is used as a distance value adjustment method in the distance value / similarity adjustment unit according to the first embodiment.

【図７】上記第１の実施形態例での処理を説明するため
のフローチャートである。FIG. 7 is a flowchart for explaining processing in the first embodiment.

【図８】上記第１の実施形態例での局所領域における黒
画素数算出の具体的な処理例を説明するためのフローチ
ャートである。FIG. 8 is a flowchart for explaining a specific processing example of calculating the number of black pixels in a local region in the first embodiment.

【図９】上記第１の実施形態例での各局所領域における
特徴値算出の具体的な処理例を説明するためのフローチ
ャートである。FIG. 9 is a flowchart for explaining a specific processing example of feature value calculation in each local region in the first embodiment.

【図１０】上記第１の実施形態例での各画素における方
向寄与度を算出する具体的な処理例を説明するためのフ
ローチャートである。FIG. 10 is a flowchart for explaining a specific example of processing for calculating a directional contribution at each pixel according to the first embodiment.

【図１１】上記第１の実施形態例での各局所領域におけ
る距離値算出の具体的な処理を説明するためのフローチ
ャートである。FIG. 11 is a flowchart illustrating a specific process of calculating a distance value in each local region in the first embodiment.

【図１２】本発明の第２の実施形態例での各局所領域の
距離値算出の具体的な処理例を説明するためのフローチ
ャートである。FIG. 12 is a flowchart for explaining a specific processing example of calculating a distance value of each local region in the second embodiment of the present invention.

【図１３】上記第２の実施形態例の距離値・類似度調整
部における距離値の調整手法として減算を用いる場合の
処理例を示す図である。FIG. 13 is a diagram illustrating a processing example in a case where subtraction is used as a distance value adjustment method in the distance value / similarity adjustment unit of the second embodiment.

【図１４】本発明の実施形態例での特徴抽出部における
文字パターンを粗い局所領域に分割する様子を示す図で
ある。FIG. 14 is a diagram illustrating a state in which a character pattern is divided into coarse local regions in a feature extraction unit according to the embodiment of the present invention.

【図１５】本発明の実施形態例での特徴抽出部において
黒画素連結長を求めるために触手を伸ばす方向として、
８方向にした場合を示す図である。FIG. 15 illustrates a direction in which a tentacle is extended to obtain a black pixel connection length in the feature extraction unit according to the embodiment of the present invention.
It is a figure showing the case where it makes it eight directions.

【図１６】本発明の実施形態例での特徴抽出部において
黒画素連結長を求める様子を示す図である。FIG. 16 is a diagram illustrating a state where a black pixel connection length is obtained in a feature extraction unit according to the embodiment of the present invention.

FIGS. 17(a) and 17(b) are diagrams illustrating character patterns with noise present in the background, used to explain the effect of the present invention.

[Description of Reference Signs]
1-1: Input character pattern
1-2: Preprocessing unit
1-3: Feature extraction unit
1-4: Black pixel number extraction unit
1-5: Identification processing unit
1-6: Feature standard dictionary
1-7: Black pixel number standard dictionary
1-8: Identification result
14-1, 14-2 to 14-k to 14-K: Local regions obtained when the character pattern is divided into K coarse square local regions
15-1, 15-2, 15-3, 15-4, 15-5, 15-6, 15-7, 15-15: Directions in which tentacles are extended

Continued from the front page
(72) Inventor: Toshiaki Sugimura, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
F-term (reference): 5B064 AA01 BA01 CA11 DC19 DC27 DC28 DC39 DC42

Claims

1. A character recognition processing method comprising: normalizing the position and size of a character pattern that has been cut out character by character and binarized; extracting features from the normalized character pattern; separately from the feature extraction, dividing the normalized character pattern into local regions; measuring, for each local region of the divided character pattern, the number of black pixels present in that local region; obtaining a black pixel number difference value between the measured number of black pixels and the standard black pixel number of each category in a black pixel number standard dictionary prepared in advance; adjusting, according to the magnitude of the black pixel number difference value, the distance value or similarity value obtained between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance; and, using the adjusted distance value or similarity, taking the most similar character category as the recognition result for the character pattern.
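The black pixel measurement and difference steps of claim 1 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: it assumes the normalized pattern is a square list of lists with 1 for black pixels, that the local regions form a regular grid, and that the difference value is the absolute difference. The function names and the sample `standard_counts` are hypothetical.

```python
def black_pixel_counts(pattern, grid):
    """Count black pixels (value 1) in each cell of a grid x grid partition."""
    size = len(pattern)
    cell = size // grid  # assumption: size is an exact multiple of grid
    counts = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0
            for y in range(gy * cell, (gy + 1) * cell):
                for x in range(gx * cell, (gx + 1) * cell):
                    total += pattern[y][x]
            counts.append(total)
    return counts


def black_pixel_differences(counts, standard_counts):
    """Per-region absolute difference (an assumption) from one category's standard counts."""
    return [abs(c - s) for c, s in zip(counts, standard_counts)]


# Toy 4x4 binary pattern split into 2x2 local regions (illustrative data only).
pattern = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
counts = black_pixel_counts(pattern, grid=2)
# Hypothetical dictionary entry for one category.
diffs = black_pixel_differences(counts, standard_counts=[3, 0, 0, 2])
```

In this toy case only the last region differs from the hypothetical standard counts, so only that region's distance or similarity would be adjusted in the later steps.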

2. The character recognition processing method according to claim 1, wherein, in the process of adjusting, according to the magnitude of the black pixel number difference value, the distance value or similarity value obtained between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in the feature standard dictionary prepared in advance: if the black pixel number difference value is 0, the distance values or similarities calculated for each dimension of the feature vector are accumulated over all dimensions; if the black pixel number difference value is greater than 0, the distance value or similarity of each dimension of the feature vector is divided by the black pixel number difference value and then accumulated; and the accumulated value is used as the adjusted distance value or similarity of the local region.
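A minimal sketch of the division-based adjustment in claim 2, for a single local region and a single category. The per-dimension distance is taken here as the absolute difference (city-block distance), which is an assumption, since the claim does not fix the distance measure. The function name `divided_region_distance` is hypothetical.

```python
def divided_region_distance(feature, standard, diff):
    """Accumulate per-dimension distances, dividing each by diff when diff > 0."""
    total = 0.0
    for f, s in zip(feature, standard):
        d = abs(f - s)  # assumption: city-block distance per dimension
        if diff == 0:
            total += d         # no black pixel discrepancy: accumulate as-is
        else:
            total += d / diff  # discrepancy present: shrink the contribution
    return total


unadjusted = divided_region_distance([4, 2], [1, 0], diff=0)
adjusted = divided_region_distance([4, 2], [1, 0], diff=2)
```

A region whose black pixel count deviates strongly from the standard therefore contributes less to the overall distance for that category.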

3. The character recognition processing method according to claim 1, wherein, in the process of adjusting, according to the magnitude of the black pixel number difference value, the distance value or similarity value obtained between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in the feature standard dictionary prepared in advance: the black pixel number difference value is subtracted from the distance value or similarity of each dimension of the feature vector; the subtracted distance values or similarities that are 0 or more are accumulated; and the accumulated value is used as the adjusted distance value or similarity of the local region.
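The subtraction-based adjustment in claim 3 can be sketched in the same way. Again, the absolute difference per dimension is an assumed distance measure, and the function name `subtracted_region_distance` is hypothetical. Results that fall below 0 after subtraction are simply not accumulated.

```python
def subtracted_region_distance(feature, standard, diff):
    """Subtract diff from each per-dimension distance; accumulate results >= 0."""
    total = 0.0
    for f, s in zip(feature, standard):
        d = abs(f - s) - diff  # assumption: city-block distance per dimension
        if d >= 0:
            total += d  # negative results are dropped, not accumulated
    return total


no_noise = subtracted_region_distance([4, 2], [1, 0], diff=0)
with_noise = subtracted_region_distance([4, 2], [1, 0], diff=2.5)
```

With `diff=2.5`, the first dimension contributes 3 - 2.5 = 0.5 and the second (2 - 2.5, which is negative) contributes nothing.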

4. A character recognition processing device comprising: preprocessing means for normalizing the position and size of a character pattern that has been cut out character by character and binarized; feature extraction means for extracting features from the normalized character pattern; black pixel number extraction means for dividing the normalized character pattern into local regions and measuring, for each local region of the divided character pattern, the number of black pixels present in that local region; black pixel number difference value calculating means for obtaining a black pixel number difference value between the measured number of black pixels and the standard black pixel number of each category in a black pixel number standard dictionary prepared in advance; distance value or similarity calculating means for calculating a distance value or similarity between the feature vector obtained from the character pattern in the local region and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance; adjusting means for adjusting the calculated distance value or similarity according to the magnitude of the black pixel number difference value; and recognition means for determining, using the adjusted distance value or similarity, the most similar character category as the recognition result for the character pattern.

5. A recording medium recording a character recognition processing method, wherein a program for causing a computer to execute the processing steps of the character recognition processing method according to claim 1 is recorded on a recording medium readable by the computer.