JP3375292B2

JP3375292B2 - Character recognition processing method and apparatus and recording medium recording the method

Info

Publication number: JP3375292B2
Application number: JP35707298A
Authority: JP
Inventors: 稔森; 正治倉掛; 利明杉村
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1998-12-16
Filing date: 1998-12-16
Publication date: 2003-02-10
Anticipated expiration: 2018-12-16
Also published as: JP2000181994A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a method and apparatus for recognizing character patterns, and more particularly to a method and apparatus for recognizing binary character patterns.

[0002]

[Prior Art] In one example of a conventional character pattern recognition method and apparatus, the features needed for recognition are extracted as a vector from a character pattern that has been binarized and normalized in position and size. For example, the character pattern is divided into coarse local regions, the character strokes in each local region are observed along coordinate axes in several directions, and the direction contribution of the character line (Japanese Patent Application No. 56-46659) is obtained for each black pixel of a stroke crossed while scanning along those axes; see Norihiro Hagita, Seiichiro Naito, and Isao Masuda, "Handwritten Kanji Recognition by Global and Local Direction Contribution Density Features," IEICE Transactions (D), J66-D, No. 6, pp. 722-729. A discriminant function such as a distance value or similarity is then computed between the extracted vector and the standard pattern vector of each category in a standard dictionary prepared in advance, and the most similar character category is taken as the recognition result (hereinafter, the "first method").

[0003] In another method, the binary pattern of the input is itself used as the elements of the feature vector. The standard pattern vector of each category in a standard dictionary prepared in advance is, like the feature vector, expressed as a binary pattern. A discriminant function such as the simple similarity, which is based on the number of pixels that are black in both patterns, is computed between the two, and the most similar character category is taken as the recognition result (hereinafter, the "second method").
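The second method can be illustrated with a minimal Python sketch. The text does not give the exact formula for the simple similarity, so the normalized overlap used here is an assumption, and the function names `black_overlap` and `simple_similarity` are hypothetical. Patterns are lists of rows containing 0 (white) and 1 (black).

```python
import math


def black_overlap(a, b):
    """Count the positions where both binary patterns have a black pixel (1)."""
    return sum(pa & pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


def simple_similarity(a, b):
    """Overlap of black pixels, normalized by the black pixel counts (assumed form)."""
    n_a = black_overlap(a, a)  # number of black pixels in a
    n_b = black_overlap(b, b)  # number of black pixels in b
    if n_a == 0 or n_b == 0:
        return 0.0
    return black_overlap(a, b) / math.sqrt(n_a * n_b)
```

With this form, a one-pixel shift of a thin stroke can drop the similarity from 1.0 to 0.0, which is the kind of sensitivity to misalignment that the description attributes to the second method.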

[0004]

[Problems to Be Solved by the Invention] In the first method, feature quantities that reflect character line structure, such as the direction, connectivity, and positional relationships of character lines, have been widely used as elements of the feature vector. For images with severe background noise, however, these feature quantities fluctuate greatly, the distance value or similarity fluctuates accordingly, and correct recognition becomes difficult.

[0005] In the second method, an image with severe background noise tends to be misrecognized as a category whose standard pattern vector has many black pixels. In addition, because the decision depends on the number of overlapping pixels, even a slight misalignment between the input pattern and the standard pattern when they are superimposed greatly degrades recognition performance.

[0006] The present invention was made in view of these drawbacks. It addresses methods that extract the features needed for recognition as a vector from a character pattern that has been binarized and normalized in position and size, compute a discriminant function such as a distance value or similarity against the standard pattern vector of each category in a standard dictionary prepared in advance, and take the most similar character category as the recognition result. In such methods, when noise is present in the background of the character pattern, the feature values extracted from the noise are superimposed on those extracted from the character lines in the local regions that contain noise. The feature vector obtained from such a local region therefore fluctuates greatly, the distance value or similarity computed by a conventional discriminant function fluctuates greatly as well, and it becomes difficult to recognize the character type correctly. The object of the invention is to solve this problem and to provide a method and apparatus that recognize characters correctly even when such background noise is present.

[0007]

[Means for Solving the Problems] To solve the above problem, the character recognition processing method of the present invention is characterized by: normalizing the position and size of a binarized character pattern that has been cut out character by character; dividing the normalized character pattern into local regions and extracting the features of each region as a feature vector; separately from this feature extraction, dividing the normalized character pattern in the same way as those local regions and, for each local region of the divided pattern, counting the black pixels in the region; computing a black pixel number difference value between that count and the standard black pixel count of each category in a black pixel number standard dictionary prepared in advance; for each local region, adjusting the distance value or similarity obtained between the extracted feature vector and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance, according to the magnitude of the black pixel number difference value; and accumulating the adjusted distance values or similarities of all local regions, so that the most similar character category is taken as the recognition result for the character pattern.

[0008] Alternatively, the method is characterized in that, in the step of adjusting, for each local region, the distance value or similarity obtained between the extracted feature vector and the standard feature vector of the corresponding category in the feature standard dictionary prepared in advance according to the magnitude of the black pixel number difference value: when the black pixel number difference value is 0, the distance values or similarities computed for all dimensions of the feature vector are accumulated; when the black pixel number difference value is greater than 0, the distance value or similarity of each dimension of the feature vector is divided by the black pixel number difference value and then accumulated; and the accumulated value is taken as the adjusted distance value or similarity of the local region.

[0009] Alternatively, the method is characterized in that, in the step of adjusting, for each local region, the distance value or similarity obtained between the extracted feature vector and the standard feature vector of the corresponding category in the feature standard dictionary prepared in advance according to the magnitude of the black pixel number difference value: the black pixel number difference value is subtracted from the distance value or similarity of each dimension of the feature vector; only those subtracted values that are 0 or greater are accumulated; and the accumulated value is taken as the adjusted distance value or similarity of the local region.
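The subtraction variant can be sketched in Python as follows. This is only an illustration under stated assumptions: `per_dim_values` holds the per-dimension distance values (or similarities) of one local region, `diff` is the black pixel number difference value already expressed on the same scale, and the function name is hypothetical.

```python
def adjust_by_subtraction(per_dim_values, diff):
    """Subtract diff from each per-dimension value and accumulate the
    results that are still 0 or greater; negative results are skipped."""
    total = 0.0
    for value in per_dim_values:
        reduced = value - diff
        if reduced >= 0:
            total += reduced
    return total
```

For example, with per-dimension values 0.5, 0.2, and 0.9 and a difference value of 0.3, the second dimension falls below 0 and is dropped, so only 0.2 and 0.6 are accumulated.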

[0010] Furthermore, the invention is characterized in that a program for causing a computer to execute the processing steps of the above character recognition processing method is recorded on a recording medium readable by the computer.

[0011] Likewise, to solve the above problem, the character recognition processing apparatus of the present invention is characterized by comprising: preprocessing means for normalizing the position and size of a binarized character pattern that has been cut out character by character; feature extraction means for dividing the normalized character pattern into local regions and extracting the features of each region as a feature vector; black pixel number extraction means for dividing the normalized character pattern in the same way as those local regions and, for each local region of the divided pattern, counting the black pixels in the region; black pixel number difference value calculation means for computing a black pixel number difference value between that count and the standard black pixel count of each category in a black pixel number standard dictionary prepared in advance; distance value or similarity calculation means for computing, for each local region, a distance value or similarity between the extracted feature vector and the standard feature vector of the corresponding category in a feature standard dictionary prepared in advance; adjustment means for adjusting, for each local region, the computed distance value or similarity according to the magnitude of the black pixel number difference value; and identification means for accumulating the adjusted distance values or similarities of all local regions and taking the most similar character category as the recognition result for the character pattern.

[0012] In the present invention, the character pattern is divided into local regions. For each local region, the number of black pixels in the region is counted and its difference from the standard black pixel count of each category in a black pixel number standard dictionary prepared in advance is obtained. When the distance value or similarity is computed between the feature vector obtained from that local region of the input pattern and the standard feature vector of the corresponding category in the feature standard dictionary, the distance value or similarity obtained for each local region is reduced by a value proportional to the black pixel number difference value. This reduces the fluctuation of the distance value or similarity obtained from local regions that contain much noise, and thereby the fluctuation of the distance value or similarity obtained from the character pattern as a whole. This is the means by which the problem is solved.

[0013] The invention is characterized in that a character recognition technique using conventional features is combined with a black pixel number difference value calculation process/means, a black pixel number standard dictionary, a black pixel number difference value normalization process/means, and a distance value/similarity adjustment process/means. Conventional approaches that divide the character to be recognized into areas and recognize its features area by area suffer when noise (such as stains) is superimposed locally on a divided area. The combination suppresses the adverse effect of such noise on the feature vector and improves the character recognition rate even in such cases.

[0014]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0015] [Embodiment 1] FIG. 1 is a configuration diagram illustrating a first embodiment of the character recognition method and apparatus of the present invention.

[0016] The preprocessing unit 1-2 uses, for example, a known position normalization method: it computes the width and height of the input character pattern 1-1 to find the center of the input character pattern, and translates the entire input character pattern so that this center coincides with the center of the character frame. It also uses, for example, a known size normalization method to enlarge or reduce the input character pattern so that the width and height of the character equal the width and height of the character frame.
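A minimal Python sketch of this preprocessing follows. The text only refers to known normalization methods, so the specific choice here is an assumption: the bounding box of the black pixels is cropped and resampled with nearest-neighbor sampling onto an n × n frame, which handles position and size in one step. The function name `normalize_pattern` is hypothetical.

```python
def normalize_pattern(pattern, n):
    """Crop the bounding box of the black pixels (1) and stretch it to an
    n x n frame using nearest-neighbor sampling."""
    rows = [r for r, row in enumerate(pattern) if any(row)]
    cols = [c for c in range(len(pattern[0]))
            if any(row[c] for row in pattern)]
    if not rows:
        # No black pixel at all: return an empty frame.
        return [[0] * n for _ in range(n)]
    top, left = rows[0], cols[0]
    height = rows[-1] - top + 1   # character height
    width = cols[-1] - left + 1   # character width
    return [[pattern[top + (y * height) // n][left + (x * width) // n]
             for x in range(n)]
            for y in range(n)]
```

Because the character's bounding box is mapped onto the whole frame, the character's center lands on the frame's center and its width and height match those of the frame.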

[0017] FIG. 2 shows an example, for the character "圧", of an input character pattern being normalized by the preprocessing unit. FIG. 2(a) is an example of the input character pattern 1-1. FIG. 2(b) is the character pattern after the preprocessing unit 1-2 has normalized the position and size of the input character pattern 1-1.

[0018] The feature extraction unit 1-3 extracts, in vector form, the features needed for recognition, such as the direction, connectivity, and positional relationships of character lines, from the character pattern normalized by the preprocessing unit 1-2.

[0019] FIG. 3 is a block diagram showing an example configuration of an apparatus implementing the feature extraction unit 1-3. Here, 1-3-1 is a feature value extraction unit that extracts the feature values needed for recognition from the character pattern, and 1-3-2 is a feature value calculation unit that computes the feature values in vector form.

[0020] The black pixel number extraction unit 1-4 receives the character pattern normalized by the preprocessing unit 1-2, divides it into coarse local regions, and counts the black pixels contained in each local region.

[0021] FIG. 4 is a block diagram showing an example configuration of an apparatus implementing the black pixel number extraction unit 1-4. Here, 1-4-1 is a character pattern division unit that divides the character pattern into a plurality of coarse local regions; 1-4-2 is a pixel detection unit that detects each pixel in a local region and determines whether it is white or black; 1-4-3 is a black pixel number measuring unit that increments a count when the pixel detection unit determines that a pixel is black; and 1-4-4 is a black pixel number calculation unit that outputs the number of black pixels in the local region.

[0022] The identification unit 1-5 is the main part of the present invention. It takes the feature vector obtained by the feature extraction unit 1-3 and the black pixel counts obtained by the black pixel number extraction unit 1-4, and uses the previously prepared feature standard dictionary 1-6 for each character and the black pixel number standard dictionary 1-7. A known distance or similarity discriminant function is computed, local region by local region, between the feature vector and the standard feature vector of each category in the feature standard dictionary 1-6. In that computation, the unit obtains the difference between the black pixel count of each local region of the input pattern and the standard black pixel count of each category in the black pixel number standard dictionary 1-7, and reduces the distance value or similarity obtained for each local region according to the magnitude of that difference. By doing this for all local regions, it obtains the distance value or similarity between the input character pattern and each category of the standard dictionary, and thereby identifies the character pattern.

[0023] FIG. 5 is a block diagram showing an example configuration of an apparatus implementing the identification unit 1-5. Here, 1-5-1 is a distance value/similarity calculation unit that computes, for each local region, a known distance value or similarity between the feature vector obtained from the input pattern and the standard feature vector of each category in the previously stored feature standard dictionary 1-6; 1-5-2 is a black pixel number difference value calculation unit that obtains, for each local region, the difference between the black pixel count obtained from the input pattern and the standard black pixel count of each category in the previously stored black pixel number standard dictionary 1-7; 1-5-3 is a black pixel number difference value normalization unit that normalizes the black pixel number difference value to a number matched to the distance value or similarity used by the distance value/similarity calculation unit 1-5-1; and 1-5-4 is a distance value/similarity adjustment unit that adjusts the distance value or similarity of each local region using the normalized black pixel number difference value.
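A minimal Python sketch of the black pixel number difference value calculation (unit 1-5-2) is given below. Taking the absolute difference is an assumption: the text only says "difference value", although the later distinction between a value of 0 and a value greater than 0 suggests a non-negative quantity. The normalization performed by unit 1-5-3 is not specified in the text and is left out. The function name, the dictionary layout, and the category labels in the usage note are hypothetical.

```python
def black_pixel_differences(input_counts, standard_counts):
    """For every category, return the per-region absolute difference between
    the input's black pixel counts and that category's standard counts.

    input_counts: list of black pixel counts, one per local region.
    standard_counts: dict mapping a category label to such a list.
    """
    return {
        category: [abs(a - b) for a, b in zip(input_counts, counts)]
        for category, counts in standard_counts.items()
    }
```

For example, `black_pixel_differences([3, 0, 0, 4], {"A": [3, 1, 0, 2]})` yields `{"A": [0, 1, 0, 2]}`: the first and third regions match the standard exactly, while the fourth differs by two pixels.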

[0024] The distance value/similarity adjustment unit 1-5-4 takes as input the normalized black pixel number difference value obtained by the black pixel number difference value normalization unit 1-5-3 and the distance value or similarity obtained by the distance value/similarity calculation unit 1-5-1. It outputs the distance value or similarity reduced by the normalized black pixel number difference value, using division, subtraction, or the like. The aim is to obtain, for each local region, a distance value or similarity that reflects the difference between the black pixel count of the input pattern and the standard black pixel count in the black pixel number standard dictionary.

[0025] FIG. 6 shows a flowchart of the processing in the distance value/similarity adjustment unit 1-5-4, for the case in which division is used to adjust the distance value. In this example, when the normalized black pixel number difference value is 0, the distance value or similarity obtained by the conventional method for each dimension of the feature vector is accumulated over all dimensions, and the accumulated value is output unchanged as the distance value of the local region. When the normalized black pixel number difference value is greater than 0, the conventional distance value or similarity obtained for each dimension of the feature vector is divided by the normalized black pixel number difference value, the divided values are accumulated over all dimensions, and the accumulated value is output as the distance value or similarity of the local region.
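The division-based adjustment of FIG. 6 can be sketched in Python as follows. The function name is hypothetical, and `per_dim_values` stands for the per-dimension distance values (or similarities) of one local region. The text does not state the range of the normalized difference value; this sketch assumes that a nonzero value is at least 1, since dividing by a value between 0 and 1 would enlarge the result rather than reduce it.

```python
def adjust_by_division(per_dim_values, normalized_diff):
    """Accumulate per-dimension values for one local region, dividing each
    by the normalized black pixel number difference value when it is > 0."""
    if normalized_diff == 0:
        # No difference in black pixel count: keep the conventional value.
        return sum(per_dim_values)
    return sum(value / normalized_diff for value in per_dim_values)
```

For example, per-dimension values 1.0, 2.0, and 3.0 accumulate to 6.0 when the difference is 0, and to 3.0 when the normalized difference is 2.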

[0026] Returning to FIG. 5, 1-5-5 is a distance value/similarity calculation unit that adds up the distance values or similarities of the local regions to compute the distance value or similarity for each character in the standard dictionary; 1-5-6 is a sorting unit that reorders the results, based on the distance values or similarities obtained for all character types, so that the most similar character types come first; and 1-5-7 is an identification result output unit that outputs the resulting identification.
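The accumulation and sorting stages (units 1-5-5 and 1-5-6) can be sketched in Python for the distance case, where a smaller total means a closer match. The function name and the dictionary layout are hypothetical; for a similarity measure the sort order would be reversed.

```python
def rank_categories(region_distances):
    """Sum the adjusted per-region distances of each category and return
    (category, total) pairs ordered from most to least similar.

    region_distances: dict mapping a category label to the list of adjusted
    distance values, one per local region.
    """
    totals = {category: sum(values)
              for category, values in region_distances.items()}
    return sorted(totals.items(), key=lambda item: item[1])
```

The first pair in the returned list corresponds to the recognition result; the remaining pairs give the runner-up candidates in order.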

[0027] Next, a concrete embodiment of the character recognition processing method and apparatus of the present invention is described. The feature used for recognition is the direction contribution (Japanese Patent Application No. 56-46659). The character pattern is divided into coarse local regions and, for each black pixel in a local region, tentacles are extended in several predetermined directions, for example the eight directions 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° (numbered 1 through 8, respectively). The number of connected black pixels in each direction is counted, and the direction contribution expresses how the black pixel is distributed over the direction components. The Euclidean distance is used as the discriminant function, and the character pattern is identified by dividing the distance value by a value corresponding to the difference in black pixel count.

[0028] FIG. 7 is a flowchart illustrating this case.

[0029] First, the input character pattern cut out as a single-character region (step S1) is sent to the preprocessing unit 1-2. The preprocessing unit 1-2 normalizes the position and size of the input character pattern (step S2). The resulting N×N mesh normalized character pattern (step S3) is sent to the feature value extraction unit 1-3-1 of the feature extraction unit 1-3 and to the character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4.

[0030] The character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4 divides the normalized character pattern equally into K coarse local regions, for example square local regions (step S4). The character pattern divided into local regions is then sent to the pixel detection unit 1-4-2 so that the number of black pixels in each local region can be computed (step S5).

[0031] FIG. 8 shows the concrete processing flow for computing the number of black pixels in each local region. The pixel detection unit 1-4-2 detects the pixels in each local region of the divided character pattern (step S5-1). When a detected pixel is black (step S5-2), the black pixel counter is incremented by 1 (step S5-3). All pixels in the local region are detected and the black pixels counted, giving the number of black pixels in the local region (step S5-4). The black pixel counts obtained in this way are sent to the black pixel number difference value calculation unit 1-5-1 of the identification unit 1-5.
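Steps S4 and S5 can be sketched in Python as follows. The sketch assumes a square N × N pattern given as lists of 0 (white) and 1 (black), square local regions, and an N that is evenly divisible by the number of regions per side, so that K equals `regions_per_side` squared. The function name and the row-major ordering of regions are hypothetical.

```python
def count_black_per_region(pattern, regions_per_side):
    """Split a square binary pattern into equal square regions and return
    the black pixel count of each region in row-major order."""
    n = len(pattern)
    size = n // regions_per_side  # side length of one local region
    counts = []
    for block_y in range(regions_per_side):
        for block_x in range(regions_per_side):
            count = 0
            # Visit every pixel of the region and count the black ones.
            for y in range(block_y * size, (block_y + 1) * size):
                for x in range(block_x * size, (block_x + 1) * size):
                    if pattern[y][x] == 1:
                        count += 1
            counts.append(count)
    return counts
```

The returned list has one entry per local region and is the kind of input that a black pixel number standard dictionary entry would be compared against.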

[0032] The feature value extraction unit 1-3-1 of the feature extraction unit 1-3 divides the character pattern in the same way as the character pattern division unit 1-4-1 of the black pixel number extraction unit 1-4 (step S6). It then computes a feature value in each local region of the divided character pattern (step S7).

[0033] FIG. 9 shows the concrete processing flow for computing the feature value in each local region. The feature value extraction unit 1-3-1 detects the pixels in each local region of the divided character pattern (step S7-1). When a detected pixel is black (step S7-2), the black pixel counter is incremented by 1 (step S7-3). The direction contribution is then computed for the detected black pixel (step S7-4).

[0034] FIG. 10 shows the concrete processing flow for computing the direction contribution at each pixel. The feature value extraction unit 1-3-1 takes the detected pixel as the reference point (step S7-4-1), extends a tentacle in each direction, and detects the adjacent pixel (step S7-4-2). When the adjacent pixel in the scanning direction is black, the connection length counter is incremented by 1 (step S7-4-3), the adjacent pixel becomes the new reference point (step S7-4-4), and the scan is repeated. The scan is not limited to the pixels within the block that contains the detected pixel; it is carried out over the entire normalized character pattern.

[0035] When the adjacent pixel is white or there is no adjacent pixel, the scan ends (step S7-4-5). This processing is carried out for all eight directions (step S7-4-6). From the black pixel connection lengths obtained in the eight directions for each black pixel, an accumulated black pixel connection length is computed, for example as the simple sum or as the square root of the sum of squares (step S7-4-7). The direction contribution is obtained by dividing the black pixel connection length in each direction by the accumulated black pixel connection length (step S7-4-8).

[0036] The direction contribution f of each black pixel is represented by the eight-dimensional vector $f = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6, \alpha_7, \alpha_8)$. Here $\alpha_1, \alpha_2, \ldots, \alpha_8$ are the direction contribution components in the eight directions. Using the black pixel connection lengths $l_i$ ($i = 1, 2, \ldots, 8$) obtained by extending tentacles from the black pixel in the eight directions, and taking as an example the square root of the sum of squares as the cumulative black pixel connection length, each component is

$$\alpha_i = \frac{l_i}{\sqrt{\sum_{j=1}^{8} l_j^{2}}}$$
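The per-pixel computation of steps S7-4-1 to S7-4-8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the nested-list image of 0/1 values, the ordering of the eight directions, starting the connection length counter at 0, and returning zeros for an isolated black pixel are all assumptions that do not come from the text.

```python
import math

# Eight scan directions as (row step, column step). The ordering is an assumption.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def run_lengths(image, row, col):
    """Count the connected black pixels (value 1) reached from (row, col) in each direction."""
    height, width = len(image), len(image[0])
    lengths = []
    for d_row, d_col in DIRECTIONS:
        count = 0  # assumed to start at 0; the reference pixel itself is not counted
        r, c = row + d_row, col + d_col
        # Stop at a white pixel or at the edge of the whole pattern,
        # not at the edge of the local region.
        while 0 <= r < height and 0 <= c < width and image[r][c] == 1:
            count += 1
            r, c = r + d_row, c + d_col
        lengths.append(count)
    return lengths


def direction_contribution(image, row, col):
    """Return f = (alpha_1, ..., alpha_8) for the black pixel at (row, col)."""
    lengths = run_lengths(image, row, col)
    norm = math.sqrt(sum(length * length for length in lengths))
    if norm == 0:
        # Isolated black pixel: not covered by the text; zeros are an assumption.
        return [0.0] * 8
    return [length / norm for length in lengths]
```

With the square root of the sum of squares as the cumulative value, each returned vector has unit Euclidean length unless the pixel is isolated.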

[0037] The direction contribution f obtained in this way is computed for every black pixel in each local region and accumulated per direction (step S7-5). The accumulated direction contribution values and the number of black pixels are sent to the feature value calculation unit 1-3-2. The feature value calculation unit 1-3-2 averages the accumulated direction contribution values by the number of black pixels in each local region to calculate the feature value of that local region (step S7-6). The feature value $f_k$ obtained in the k-th local region (k = 1, 2, ..., K) is $f_k = (\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{k8})$. Here $\alpha_{k1}, \alpha_{k2}, \ldots, \alpha_{k8}$ are the elements obtained by accumulating, per direction component, the direction contribution vectors of all black pixels in the k-th local region and then dividing each element by the number of black pixels.
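The accumulation and averaging of steps S7-5 and S7-6 can be sketched as below. The function name and the list-of-vectors input are assumptions for illustration, and the all-zero result for a region without black pixels is an assumption, since the text does not cover that case.

```python
def region_feature(contributions):
    """Average per-pixel 8-dimensional direction contributions over one local region.

    `contributions` holds one 8-element vector per black pixel in the region.
    """
    black_pixels = len(contributions)
    if black_pixels == 0:
        # Region without black pixels: an all-zero feature is an assumption.
        return [0.0] * 8
    sums = [0.0] * 8
    for vector in contributions:  # step S7-5: accumulate per direction
        for direction, value in enumerate(vector):
            sums[direction] += value
    # Step S7-6: divide each accumulated component by the black pixel count.
    return [total / black_pixels for total in sums]
```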

[0038] The feature vector f of the character pattern in each local region, expressed in this way (step S8), is sent to the distance value / similarity calculation unit 1-5-1 of the identification unit 1-5.

[0039] FIG. 11 shows a concrete processing flow for calculating the distance value in each local region.

[0040] The black pixel number difference value calculation unit 1-5-2 calculates the black pixel number difference value between the number of black pixels sent from the black pixel number extraction unit 1-4 and the standard number of black pixels of each category in the black pixel number standard dictionary 1-7 (step S9-1). The difference value is sent to the black pixel number difference value normalization unit 1-5-3, which normalizes it to a value scaled to the Euclidean distance (step S9-2). Let $a_k$ be the number of black pixels of the input pattern in the k-th local region, and let $b_{ik}$ be the standard number of black pixels of category i (1 ≤ i ≤ M) in the black pixel number standard dictionary 1-7. The black pixel number difference value $C_{1k}$ is then

$$C_{1k} = a_k - b_{ik}$$

[0041] As one example of normalizing the black pixel number difference value $C_{1k}$ to the Euclidean distance, the normalized black pixel number difference value $C_{2k}$ is

$$C_{2k} = \begin{cases} 1 & (C_{1k} \le 0) \\ \dfrac{C_{1k}}{W} + 1 & (C_{1k} > 0) \end{cases} \qquad (1 \le W \le \text{number of pixels in the local region})$$

The normalized black pixel number difference value is sent to the distance value / similarity adjustment unit 1-5-4.
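The two-case normalization of steps S9-1 and S9-2 can be written directly as a small function. This is a sketch only: the function and parameter names are hypothetical, and `w` stands for the constant W, whose value the text leaves open within 1 ≤ W ≤ number of pixels in the local region.

```python
def normalized_black_pixel_difference(input_count, standard_count, w):
    """Return C2k for one local region, given a_k, b_ik and the constant W."""
    c1 = input_count - standard_count  # C1k = a_k - b_ik (step S9-1)
    if c1 <= 0:
        # No excess black pixels: a divisor of 1 leaves the distance unchanged.
        return 1.0
    # Excess black pixels: the divisor grows with the excess (step S9-2).
    return c1 / w + 1.0
```

A larger W makes the adjustment gentler, since the same excess of black pixels then produces a divisor closer to 1.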

[0042] The distance value / similarity calculation unit 1-5-1 calculates the squared difference between the feature vector sent from the feature extraction unit 1-3 and the standard feature vector of each category in the feature standard dictionary 1-6 created in advance (step S9-3). This processing is performed for all eight directions (step S9-4). Let the feature vector of the input character pattern in the k-th local region be $f_k = (\alpha_{k1}, \alpha_{k2}, \alpha_{k3}, \alpha_{k4}, \alpha_{k5}, \alpha_{k6}, \alpha_{k7}, \alpha_{k8})$, and let the standard feature vector of category i (1 ≤ i ≤ M) in the feature standard dictionary 1-6 be $s_i = (\beta_{k1}, \beta_{k2}, \beta_{k3}, \beta_{k4}, \beta_{k5}, \beta_{k6}, \beta_{k7}, \beta_{k8})$. The squared differences of the feature values are then, per direction,

$$(\alpha_{k1} - \beta_{k1})^{2},\ (\alpha_{k2} - \beta_{k2})^{2},\ \ldots,\ (\alpha_{k8} - \beta_{k8})^{2}$$

The results are sent to the distance value / similarity adjustment unit 1-5-4.

[0043] The distance value / similarity adjustment unit 1-5-4 divides the squared difference sent from the distance value / similarity calculation unit 1-5-1, for each of the eight directions, by the normalized black pixel number difference value sent from the black pixel number difference value normalization unit 1-5-3, and accumulates the results (step S9-5). The value accumulated over all eight directions is output as the distance value (step S9-6). The values obtained by dividing the squared difference in each direction by the normalized black pixel number difference value are

$$\frac{(\alpha_{k1} - \beta_{k1})^{2}}{C_{2k}},\ \frac{(\alpha_{k2} - \beta_{k2})^{2}}{C_{2k}},\ \ldots,\ \frac{(\alpha_{k8} - \beta_{k8})^{2}}{C_{2k}} \tag{1}$$

and the value $D_{ik}$ obtained by accumulating them is

$$D_{ik} = \frac{(\alpha_{k1} - \beta_{k1})^{2}}{C_{2k}} + \frac{(\alpha_{k2} - \beta_{k2})^{2}}{C_{2k}} + \cdots + \frac{(\alpha_{k8} - \beta_{k8})^{2}}{C_{2k}} \tag{2}$$
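Equations (1) and (2) amount to a squared Euclidean distance whose terms are each divided by C2k. A minimal sketch follows; the function name and the plain-list vectors are assumptions made for illustration.

```python
def adjusted_local_distance(feature, standard, c2):
    """Return D_ik: the sum over the eight directions of (alpha - beta)^2 / C2k."""
    return sum((alpha - beta) ** 2 / c2 for alpha, beta in zip(feature, standard))
```

With `c2 == 1.0` the result is the ordinary squared Euclidean distance for the region; a larger `c2` shrinks the region's contribution in proportion.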

[0044] The distance value calculated for each local region is sent to the distance value / similarity calculator 1-5-5. The distance value / similarity calculator 1-5-5 accumulates the distance values calculated from all local regions and sends the result to the sorting unit 1-5-6 as the distance value for each category (step S10).

[0045] The distance value $D_i$ between the input character pattern and each category i (1 ≤ i ≤ M) of the feature standard dictionary 1-6 is obtained by accumulating the distance values from all local regions, the 1st through the K-th:

$$D_i = D_{i1} + D_{i2} + \cdots + D_{ik} + \cdots + D_{iK}$$

[0046] The sorting unit 1-5-6 sorts the distance values of all categories, obtained by performing the above series of processing for every category in the standard dictionary, in ascending order (or in descending order for some other distance or similarity measures) (step S11). The sorted result is sent to the identification result output unit 1-5-7.

[0047] The identification result output unit 1-5-7 outputs, as the identification result, the character with the smallest distance value (or the largest value for some other distance or similarity measures) (step S12).
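Steps S10 to S12 (accumulate over the K local regions, sort the categories, output the best one) can be sketched as follows. The dictionary input and the function name are hypothetical, and the sketch covers only the distance case, where smaller is better.

```python
def classify(local_distances):
    """Rank categories by total distance.

    `local_distances` maps each category to its list of per-region
    distance values D_i1, ..., D_iK. Returns (category, D_i) pairs in
    ascending order of D_i, so the first entry is the recognition result.
    """
    totals = {category: sum(values) for category, values in local_distances.items()}
    return sorted(totals.items(), key=lambda item: item[1])
```

For a similarity measure, where larger is better, the same sketch would sort with `reverse=True`.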

[0048] In the above description, the distance value / similarity adjustment unit 1-5-4 adjusts the distance value or similarity by dividing the distance value by the normalized black pixel number difference value. Any other technique, such as subtraction, is of course also applicable as long as it reduces the distance value or similarity according to the black pixel number difference value.

[0049] The effect of this embodiment on a concrete character pattern is explained with reference to FIG. 17. FIG. 17(a) shows a normal character pattern, and FIG. 17(b) shows an example of a character pattern with noise in the background. For the k-th local regions 17-(a)-k and 17-(b)-k, with the direction contribution feature described in the embodiment used as the feature, Table 1 shows the feature value Fa obtained from 17-(a)-k and the feature value Fb obtained from 17-(b)-k, Table 2 shows the black pixel counts Ba and Bb of the respective local regions, and Table 3 shows the distance value D1 given by the conventional Euclidean distance as the discriminant function and the distance value D2 given by the method of this embodiment.

[0050]

[Table 1]

[0051]

[Table 2]

[0052]

[Table 3]

[0053] Table 1 shows that the feature values obtained when noise is present deviate from those obtained when it is absent. Table 3 shows that, in this situation, the conventional discriminant function, the Euclidean distance, yields a value as large as about 120/1000 as the distance value between the local regions, whereas the method of the present invention yields 8/1000, so the variation in the distance value is suppressed. When the distance values obtained from all local regions by the same processing are used, regions without noise yield the original Euclidean distance, and regions with noise yield a distance value reduced according to the amount of noise. As a result, the variation in the distance value obtained from the whole character pattern is reduced even when noise is present in the background, and the pattern can be recognized correctly.

[0054] [Embodiment 2] As a second embodiment of the present invention, FIG. 12 shows a concrete processing flow for calculating the distance value of each local region in the case where the same features as in the first embodiment are used, the city block distance is used as the discriminant function, and the character pattern is identified by calculating the distance value after subtracting from it a value that depends on the difference in the number of black pixels.

[0055] The black pixel number difference value calculation unit 1-5-2 calculates the black pixel number difference value between the number of black pixels sent from the black pixel number extraction unit 1-4 and the standard number of black pixels of each category in the black pixel number standard dictionary 1-7 (step S9-11). The difference value is sent to the black pixel number difference value normalization unit 1-5-3, which normalizes it to a value scaled to the city block distance (step S9-12). For example, let $d_k$ be the number of black pixels of the input pattern in the k-th local region, and let $e_{ik}$ be the number of black pixels of each character i (1 ≤ i ≤ M) in the black pixel number standard dictionary 1-7. The black pixel number difference value $G_{1k}$ is then

$$G_{1k} = d_k - e_{ik}$$

[0056] As one example of normalizing the black pixel number difference value $G_{1k}$ to the city block distance, the normalized black pixel number difference value $G_{2k}$ is

$$G_{2k} = \begin{cases} 0 & (G_{1k} \le 0) \\ \dfrac{G_{1k}}{V} & (G_{1k} > 1) \end{cases} \qquad (1 \le V \le \text{number of pixels in the local region})$$

The normalized black pixel number difference value is sent to the distance value / similarity adjustment unit 1-5-4.

[0057] The distance value / similarity calculation unit 1-5-1 calculates the absolute value of the difference between the feature vector sent from the feature extraction unit 1-3 and the standard feature vector of each category in the feature standard dictionary 1-6 created in advance (step S9-13). This processing is performed for all eight directions (step S9-14). Let the feature vector of the input character pattern in the k-th local region be $h_k = (\gamma_{k1}, \gamma_{k2}, \gamma_{k3}, \gamma_{k4}, \gamma_{k5}, \gamma_{k6}, \gamma_{k7}, \gamma_{k8})$, and let the feature vector of each character i (1 ≤ i ≤ M) in the feature standard dictionary 1-6 be $t_i = (\delta_{k1}, \delta_{k2}, \delta_{k3}, \delta_{k4}, \delta_{k5}, \delta_{k6}, \delta_{k7}, \delta_{k8})$. The absolute differences of the feature values are then, per direction,

$$|\gamma_{k1} - \delta_{k1}|,\ |\gamma_{k2} - \delta_{k2}|,\ \ldots,\ |\gamma_{k8} - \delta_{k8}|$$

The results are sent to the distance value / similarity adjustment unit 1-5-4.

[0058] For each of the eight directions, the distance value / similarity adjustment unit 1-5-4 subtracts the normalized black pixel number difference value sent from the black pixel number difference value normalization unit 1-5-3 from the absolute difference sent from the distance value / similarity calculation unit 1-5-1, and accumulates the absolute value of the result (step S9-15). The value accumulated over all eight directions is output as the distance value (step S9-16).

[0059] The values obtained by subtracting the normalized black pixel number difference value from the absolute difference in each direction are

$$\bigl|\,|\gamma_{k1} - \delta_{k1}| - G_{2k}\bigr|,\ \bigl|\,|\gamma_{k2} - \delta_{k2}| - G_{2k}\bigr|,\ \ldots,\ \bigl|\,|\gamma_{k8} - \delta_{k8}| - G_{2k}\bigr| \tag{3}$$

and the value $J_{ik}$ obtained by accumulating them is

$$J_{ik} = \bigl|\,|\gamma_{k1} - \delta_{k1}| - G_{2k}\bigr| + \bigl|\,|\gamma_{k2} - \delta_{k2}| - G_{2k}\bigr| + \cdots + \bigl|\,|\gamma_{k8} - \delta_{k8}| - G_{2k}\bigr| \tag{4}$$

which gives the distance value in each local region.
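The city block variant of the second embodiment can be sketched in the same style. The function names are hypothetical and `v` stands for the constant V. One point is an assumption rather than something the text states: the source gives the second case only for G1k > 1, and this sketch applies G1k / V to every positive difference.

```python
def normalized_difference_city_block(input_count, standard_count, v):
    """Return G2k from G1k = d_k - e_ik (steps S9-11 and S9-12)."""
    g1 = input_count - standard_count
    if g1 <= 0:
        return 0.0
    # Assumption: all positive differences use G1k / V.
    return g1 / v


def adjusted_city_block_distance(feature, standard, g2):
    """Return J_ik as in Equation (4): the sum of | |gamma - delta| - G2k |."""
    return sum(abs(abs(gamma - delta) - g2) for gamma, delta in zip(feature, standard))
```

With `g2 == 0.0` the second function reduces to the ordinary city block distance.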

[0060] In this embodiment, the distance value / similarity adjustment unit 1-5-4 uses subtraction to adjust the distance value or similarity. A concrete processing example for this case is shown in the flowchart of FIG. 13. In this example, for each feature vector dimension, the normalized black pixel number difference value is subtracted from the distance value or similarity obtained by the conventional method. If the result is 0 or greater, the subtracted distance value or similarity is accumulated. If the result is less than 0, it is not accumulated and processing moves on to the next dimension. This processing is performed for all dimensions, and the final accumulated value is output as the distance value or similarity of each local region.
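The flowchart logic described for FIG. 13 (subtract, accumulate only non-negative results, skip the rest) can be sketched as below. The function name and the list input are assumptions. Note that this skips negative results, whereas Equation (4) accumulates their absolute values, so the two descriptions give different totals whenever a per-dimension distance is smaller than the normalized difference value.

```python
def clipped_subtraction_distance(per_dimension_distances, g2):
    """Accumulate (distance - G2k) over the dimensions, skipping negative results."""
    total = 0.0
    for distance in per_dimension_distances:
        reduced = distance - g2
        if reduced >= 0:
            total += reduced  # accumulate the subtracted value
        # A negative result is not accumulated; move on to the next dimension.
    return total
```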

[0061] In the above description, the distance value / similarity adjustment unit 1-5-4 adjusts the distance value or similarity by subtracting the normalized black pixel number difference value from the distance value. Any other technique, such as division, is of course also applicable as long as it reduces the distance value or similarity according to the black pixel number difference value.

[0062] FIG. 14 shows a character pattern divided into K coarse square local regions 14-1, 14-2, ..., 14-k, ..., 14-K.

[0063] FIG. 15 shows the case where eight directions (15-1, 15-2, 15-3, ..., 15-7, 15-8) are used as the directions in which tentacles are extended to obtain the black pixel connection lengths of the character pattern.

[0064] FIG. 16 shows how tentacles are extended from a black pixel in the k-th local region 14-k of FIG. 14 to obtain the black pixel connection lengths used for the direction contribution.

[0065] Table 4 shows the effect of the second embodiment of the present invention. The conventional methods use the same features as this embodiment. Their discriminant functions are the city block distance (conventional method 1) and the weighted city block distance (conventional method 2), and the feature dictionary holds the mean and standard deviation of each category. Table 4 gives the cumulative classification rates within the top 1, 2, 5, and 10 candidates for 3449 input patterns containing a large amount of background noise.

[0066]

[Table 4]

[0067] For patterns containing a large amount of background noise, conventional method 2, which uses the weighted city block distance, cannot cope with the noise-induced variation in the feature values, so its recognition rate falls below that of conventional method 1, which uses the city block distance. The second embodiment of the present invention, which adjusts the distance value according to the amount of noise, effectively suppresses the noise-induced variation and therefore improves the classification rate considerably over the methods using conventional discriminant functions.

[0068] The characteristic feature of the present invention described above is that the black pixel number difference value calculation unit 1-5-2, the black pixel number standard dictionary 1-7, the black pixel number difference value normalization unit 1-5-3, and the distance value / similarity adjustment unit 1-5-4 are used together with a conventional character recognition method based on the direction contribution density feature (a method whose components are the distance value / similarity calculation unit 1-5-1, the feature standard dictionary 1-6, and the distance value / similarity calculator 1-5-5). When the conventional direction contribution density feature is used to recognize character features area by area over the areas into which the character is divided, noise (such as dirt) superimposed locally on a divided area degrades the feature vector. The combined use suppresses this degradation and improves the character recognition rate in such cases as well.

[0069] As a further supplement to the above: whether a divided area contains much noise is determined by comparing the area with the black pixel number standard dictionary 1-7 in the black pixel number difference value calculation unit 1-5-2. When the area is judged to contain much noise, the black pixel number difference value normalization unit 1-5-3 performs the adjustment and outputs the difference value C2k, which becomes the input of the distance value / similarity adjustment unit 1-5-4. The central processing of the present invention is that of the distance value / similarity adjustment unit 1-5-4. Concretely, this processing can use division as the distance value adjustment technique, as shown in the first embodiment (FIG. 6), or subtraction, as shown in the second embodiment (FIG. 13). In FIG. 6, the details of "divide the distance value (or similarity) of each dimension by the normalized black pixel number difference value" are given by Equation (1), and the details of "accumulate the divided distance values" are given by Equation (2). The processing "accumulate the distance value of each dimension" on the right side of FIG. 6 uses the conventional method. In FIG. 13, the details of "subtract the normalized black pixel number difference value from the distance value (or similarity) of each dimension" are given by Equation (3), and the details of "accumulate the divided distance values" are given by Equation (4).

[0070] It goes without saying that some or all of the means shown in FIGS. 1, 3, 4, and 5 can be implemented with a computer, and that the processing steps shown in FIGS. 6 to 13 can be executed by a computer. A program that causes a computer to function as those means, or to execute those processing steps, can be recorded on a computer-readable recording medium, for example an FD (floppy disk), MO, ROM, memory card, CD, DVD, or removable disk, and provided and distributed in that form.

[0071]

[Effects of the Invention] As described above, according to the present invention, the number of black pixels in each local region is measured, its difference from the number of black pixels in a standard dictionary created in advance is obtained, and the distance value or similarity value is adjusted using that difference. When noise is present in the background, the feature values vary more as the amount of noise grows, and the distance value would vary greatly with them; the invention instead reduces the distance value according to the amount of noise. This keeps the variation in the distance value obtained from the whole character pattern small, so that even patterns that conventional discriminant functions would misrecognize can be recognized correctly.

[0072] Furthermore, when no noise is present, normalizing the black pixel number difference value appropriately gives the same distance value or similarity as the conventional discriminant function. The present invention therefore has no adverse effect on character patterns without noise and provides the same discrimination capability as before.
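This property can be checked against Equations (2) and (4). The short derivation below is a sketch that reads "no noise" as the input black pixel count not exceeding the standard count (C1k ≤ 0 or G1k ≤ 0); that reading is an assumption drawn from the normalization rules, not a statement in the text.

```latex
% First embodiment: C_{1k} \le 0 gives C_{2k} = 1, so Equation (2) becomes
% the squared Euclidean distance for the region.
D_{ik} = \sum_{j=1}^{8} \frac{(\alpha_{kj} - \beta_{kj})^{2}}{1}
       = \sum_{j=1}^{8} (\alpha_{kj} - \beta_{kj})^{2}

% Second embodiment: G_{1k} \le 0 gives G_{2k} = 0, so Equation (4) becomes
% the city block distance for the region.
J_{ik} = \sum_{j=1}^{8} \bigl|\, |\gamma_{kj} - \delta_{kj}| - 0 \,\bigr|
       = \sum_{j=1}^{8} |\gamma_{kj} - \delta_{kj}|
```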

[0073] The present invention is not restricted to any particular feature extraction technique. It applies to any technique that divides a character pattern into coarse local regions and extracts the features used for recognition from each local region. It can therefore be applied to the various feature extraction techniques proposed so far for character recognition, and by making character patterns containing noise recognizable it can improve the performance of those techniques.

[0074] Conventional techniques that recognize characters by how well the black pixels of character patterns overlap suffer a large drop in recognition performance when the pixel positions shift slightly during superposition or when the character font differs. Because the present invention extracts features from the character pattern, it can recognize patterns containing noise without much loss of recognition performance, even under pixel position shifts within the range of a coarse local region or under differences in character font.

[Brief description of drawings]

FIG. 1 is a block diagram showing a first embodiment of the character recognition method and apparatus of the present invention.

FIGS. 2(a) and 2(b) are diagrams showing the normalization processing in the preprocessing unit of the first embodiment.

FIG. 3 is a block diagram showing an example of the configuration of an apparatus that implements the feature extraction unit of the first embodiment.

FIG. 4 is a block diagram showing an example of the configuration of an apparatus that implements the black pixel number extraction unit of the first embodiment.

FIG. 5 is a block diagram showing an example of the configuration of an apparatus that implements the identification unit of the first embodiment.

FIG. 6 is a diagram showing a processing example in which division is used as the distance value adjustment technique in the distance value / similarity adjustment unit of the first embodiment.

FIG. 7 is a flowchart for explaining the processing in the first embodiment.

FIG. 8 is a flowchart for explaining a concrete processing example of calculating the number of black pixels in a local region in the first embodiment.

FIG. 9 is a flowchart for explaining a concrete processing example of calculating the feature value in each local region in the first embodiment.

FIG. 10 is a flowchart for explaining a concrete processing example of calculating the direction contribution at each pixel in the first embodiment.

FIG. 11 is a flowchart for explaining the concrete processing of calculating the distance value in each local region in the first embodiment.

FIG. 12 is a flowchart for explaining a concrete processing example of calculating the distance value of each local region in the second embodiment of the present invention.

FIG. 13 is a diagram showing a processing example in which subtraction is used as the distance value adjustment technique in the distance value / similarity adjustment unit of the second embodiment.

FIG. 14 is a diagram showing how the feature extraction unit divides a character pattern into coarse local regions in the embodiments of the present invention.

FIG. 15 is a diagram showing the case where eight directions are used as the directions in which the feature extraction unit extends tentacles to obtain the black pixel connection lengths in the embodiments of the present invention.

FIG. 16 is a diagram showing how the feature extraction unit obtains the black pixel connection lengths in the embodiments of the present invention.

FIGS. 17(a) and 17(b) are diagrams for explaining, as an effect of the present invention, character patterns in the case where noise is present in the background.

[Explanation of symbols]

1-1: Input character pattern
1-2: Preprocessing unit
1-3: Feature extraction unit
1-4: Black pixel number extraction unit
1-5: Identification processing unit
1-6: Feature standard dictionary
1-7: Black pixel number standard dictionary
1-8: Identification result
14-1, 14-2 to 14-k to 14-K: Local regions obtained when the character pattern is divided into K coarse square local regions
15-1, 15-2, 15-3, 15-4, 15-5, 15-6, 15-7, 15-8: Directions in which tentacles are extended

Continuation of the front page

(56) References
Japanese Unexamined Patent Publication No. 10-27215 (JP, A)
Japanese Unexamined Patent Publication No. 5-166008 (JP, A)
Japanese Unexamined Patent Publication No. 11-96302 (JP, A)
"Telop character recognition in video robust to image quality degradation", IEICE Technical Report PRMU98-154, Japan, December 18, 1998, Vol. 98, No. 490, pp. 33-40

(58) Fields investigated (Int.Cl.7, DB name)
G06K 9/00 - 9/82

Claims

(57) [Claims]

1. A character recognition processing method characterized in that: a character pattern cut out for each character and binarized is subjected to normalization processing for position and size; the normalized character pattern is divided into local regions and a feature is extracted for each local region in the form of a feature vector; separately from the feature extraction processing, the normalized character pattern is divided in the same way as the local regions, and for each local region of the divided character pattern the number of black pixels existing in that local region is measured; a black pixel number difference value is obtained between the measured number of black pixels and the standard number of black pixels of each category in a black pixel number standard dictionary created in advance; for each local region, the distance value or similarity value obtained between the extracted feature vector and the standard feature vector of the corresponding category in a feature standard dictionary created in advance is adjusted according to the magnitude of the black pixel number difference value; and the adjusted distance values or similarities of all local regions are accumulated and used so that the most similar character category is taken as the recognition result of the character pattern.

2. The character recognition processing method according to claim 1, wherein, in the process of adjusting, for each local region, the distance value or similarity value obtained between the extracted feature vector and the standard feature vector of the corresponding category in the feature standard dictionary created in advance according to the magnitude of the black pixel number difference value: when the black pixel number difference value is 0, the distance values or similarities obtained for all dimensions of the feature vector are accumulated; when the black pixel number difference value is greater than 0, the distance value or similarity of each dimension of the feature vector is divided by the black pixel number difference value and then accumulated; and the accumulated value is used as the adjusted distance value or similarity of the local region.

3. The character recognition processing method according to claim 1, wherein, in the process of adjusting, for each local region, the distance value or similarity value obtained between the extracted feature vector and the standard feature vector of the corresponding category in the feature standard dictionary created in advance according to the magnitude of the black pixel number difference value: the black pixel number difference value is subtracted from the distance value or similarity of each dimension of the feature vector; the subtracted distance value or similarity is accumulated when it is 0 or more; and the accumulated value is used as the adjusted distance value or similarity of the local region.

4. A character recognition processing device comprising: preprocessing means for normalizing the position and size of a character pattern cut out for each character and binarized; feature extracting means for dividing the normalized character pattern into local regions and extracting a feature for each local region in the form of a feature vector; black pixel number extraction means for dividing the normalized character pattern in the same way as the local regions and measuring, for each local region of the divided character pattern, the number of black pixels existing in that local region; black pixel number difference value calculating means for obtaining a black pixel number difference value between the measured number of black pixels and the standard number of black pixels of each category in a black pixel number standard dictionary created in advance; distance value or similarity calculating means for obtaining, for each local region, a distance value or similarity between the extracted feature vector and the standard feature vector of the corresponding category in a feature standard dictionary created in advance; adjusting means for adjusting the obtained distance value or similarity value for each local region according to the magnitude of the black pixel number difference value; and identification means for accumulating the adjusted distance values or similarities of all local regions and using them so that the most similar character category is taken as the recognition result of the character pattern.

5. A computer-readable recording medium on which a character recognition processing method is recorded, characterized in that a program for causing a computer to execute the processing steps of the character recognition processing method according to claim 1 is recorded thereon.
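
The processing recited in claims 1 to 3 can be illustrated with a minimal Python sketch. It is not the implementation of the patent, and several points are assumptions made only for illustration: all function names are hypothetical, the pattern is a square bitmap whose side is a multiple of the grid size, the black pixel number difference value is taken as an absolute difference, the per-dimension distance is an absolute difference between feature components, and the scores are distances, so the smallest accumulated value is treated as the most similar category.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Bitmap = List[List[int]]          # 1 = black pixel, 0 = white pixel
Features = List[List[float]]      # one feature vector per local region
Entry = Tuple[Features, List[int]]  # (standard features, standard black pixel counts)


def count_black_pixels(pattern: Bitmap, grid: int) -> List[int]:
    """Divide a square pattern into grid x grid local regions and count black pixels in each."""
    cell = len(pattern) // grid  # assumption: side length is a multiple of grid
    counts = []
    for gy in range(grid):
        for gx in range(grid):
            counts.append(sum(
                pattern[y][x]
                for y in range(gy * cell, (gy + 1) * cell)
                for x in range(gx * cell, (gx + 1) * cell)
            ))
    return counts


def adjust_by_division(per_dim: Sequence[float], diff: int) -> float:
    """Claim 2 style: accumulate as-is when diff is 0, otherwise divide each dimension by diff."""
    if diff == 0:
        return sum(per_dim)
    return sum(d / diff for d in per_dim)


def adjust_by_subtraction(per_dim: Sequence[float], diff: int) -> float:
    """Claim 3 style: subtract diff from each dimension and accumulate only results >= 0."""
    return sum(d - diff for d in per_dim if d - diff >= 0)


def category_score(features: Features, counts: Sequence[int],
                   std_features: Features, std_counts: Sequence[int],
                   adjust: Callable[[Sequence[float], int], float]) -> float:
    """Accumulate the adjusted distance values of all local regions for one category."""
    total = 0.0
    for f, n, sf, sn in zip(features, counts, std_features, std_counts):
        per_dim = [abs(a - b) for a, b in zip(f, sf)]  # assumed per-dimension distance
        total += adjust(per_dim, abs(n - sn))          # assumed absolute difference value
    return total


def recognize(features: Features, counts: Sequence[int],
              dictionary: Dict[str, Entry],
              adjust: Callable[[Sequence[float], int], float]) -> str:
    """Return the category with the smallest accumulated adjusted distance."""
    return min(
        dictionary,
        key=lambda c: category_score(features, counts, *dictionary[c], adjust),
    )
```

In this sketch a local region whose black pixel count differs strongly from the standard count contributes less to the accumulated distance, which is the effect the adjustment is described as providing when noise is present in the background. For similarity values instead of distances, `recognize` would need to select the largest accumulated value rather than the smallest.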