JPH03100786A

JPH03100786A - Method and device for character recognition

Info

Publication number: JPH03100786A
Application number: JP1238244A
Authority: JP
Inventors: Makoto Senoo; 誠妹尾; Yoshiaki Ichikawa; 芳明市川
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1989-09-13
Filing date: 1989-09-13
Publication date: 1991-04-25
Anticipated expiration: 2010-11-01
Also published as: JPH07101440B2

Abstract

PURPOSE: To shorten recognition time by accumulating the "0" or "1" values at each two-dimensional address over a plurality of normalized dictionary characters to obtain a frequency distribution, and by using specific addresses as the information extraction positions for character recognition.

CONSTITUTION: Binarized, normalized images 220 of a prescribed size are generated for all dictionary characters 200 to be recognized, and the "0" or "1" information of all dictionary character patterns at each address of the normalized images is accumulated by a process 300 to obtain a two-dimensional frequency distribution 310. Based on the frequency distribution 310, a prescribed number of two-dimensional addresses are extracted, starting from the address carrying the most information, and the "0" or "1" information of the normalized image of an input recognition target character at these addresses is used as the input data of a character recognition means. The processing time required for character recognition is thus shorter than in a method that uses all M × M pixel values as the information for character recognition.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a method for automatically recognizing fixed-form characters, and in particular to a character recognition method and device for numbers on vehicle license plates, characters printed on cardboard boxes, and the like.

[Conventional technology]

Conventional methods for recognizing fixed-form characters include pattern matching, stroke analysis, and contour feature extraction, as described in the section on printed character recognition in "Introduction to Character Recognition" (edited by Shinichiro Hashimoto, published by the Telecommunications Association, 1982).

Template matching is the representative pattern matching method. A dictionary character pattern (template) is prepared for each character to be recognized, and the degree of agreement is obtained by superimposing the template on the recognition target character pattern.
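As an illustration of the conventional approach described above, the following is a minimal sketch of template matching on binary images. The function names `match_score` and `best_template`, the use of the fraction of agreeing pixels as the degree of agreement, and the 3 × 3 patterns are assumptions made for this example and are not taken from the patent.

```python
def match_score(template, pattern):
    """Fraction of pixels on which two equally sized binary images agree."""
    if len(template) != len(pattern) or any(
        len(t_row) != len(p_row) for t_row, p_row in zip(template, pattern)
    ):
        raise ValueError("template and pattern must have the same size")
    total = sum(len(row) for row in template)
    same = sum(
        t == p
        for t_row, p_row in zip(template, pattern)
        for t, p in zip(t_row, p_row)
    )
    return same / total


def best_template(templates, pattern):
    """Label of the dictionary template that agrees best with the pattern."""
    return max(templates, key=lambda label: match_score(templates[label], pattern))


# Hypothetical 3 x 3 dictionary patterns (1 = character pixel, 0 = background).
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_l = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]  # an "L" with one pixel missing
result = best_template(templates, noisy_l)
```

Every template is compared against every pixel of the input, which is the cost that grows with the number of dictionary characters.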

Stroke analysis examines a binary pattern of black points to determine whether a cluster of black points exists within a designated region. Contour feature extraction includes methods that extract features while tracing the outline of a character and methods that extract features from the shape of the character's outer contour.

[Problem to be solved by the invention]

With all of these conventional character recognition methods, the preprocessing for recognition and the recognition processing itself, such as matching, take a long time when the number of characters to be recognized is large.

An object of the present invention is to provide a character recognition method that shortens the recognition time, including the matching against dictionary character patterns performed at each recognition and preprocessing such as calculating feature quantities (for example, the center of gravity and area) of the recognition target character.

[Means to solve the problem]

To achieve the above object, the character recognition method according to the present invention is a method for recognizing fixed-form characters in which a binarized, normalized image of a predetermined size is created for every dictionary character to be recognized; a two-dimensional frequency distribution is obtained by accumulating, at each address on the normalized image plane, the "0" or "1" information of all dictionary character patterns; a predetermined number of two-dimensional addresses are extracted on the basis of this frequency distribution, starting from the address carrying the most information; and the "0" or "1" information of the normalized image of an input recognition target character at those two-dimensional addresses is used as the input data of a character recognition means.

The predetermined number of two-dimensional addresses are extracted starting from the address whose frequency is closest to one half of the total number of dictionary characters.

The character recognition means is formed by a neural network consisting of three layers (an input layer, an intermediate layer, and an output layer), and the weighting coefficients of the intermediate layer used to recognize the recognition target characters are determined in advance by learning the dictionary character patterns.

Furthermore, when recognition of input recognition target characters fails a predetermined number of times, the number of inputs to the input layer of the neural network is increased by at least one on the basis of the two-dimensional frequency distribution data, and the weighting coefficients of the intermediate layer are relearned.

The recognition target character may be part of a known recognition target character string; when recognition of that character fails, the known character that matches the first or second candidate character is taken as the recognition result.

Alternatively, when automatic recognition of a recognition target character fails, the character may be displayed so that a person can recognize it, and the result may be used to relearn the weighting coefficients of the intermediate layer of the neural network.

A character recognition device to which the above character recognition method is applied comprises an imaging unit that images the recognition target character patterns and the dictionary character patterns, an A/D conversion unit that converts the imaging signal from analogue to digital, an image memory unit for the A/D-converted images, an image processing/recognition unit, a dictionary character memory unit, an information memory unit for information about the recognition target characters, and a display device that displays the character recognition results.

[Operation]

In the character recognition method of the present invention, binarizing the dictionary character patterns and the recognition target character patterns both standardizes brightness and compresses information. In a character image obtained by an imaging means such as a television camera, the brightness of the character portion and of the background portion is never constant but changes continually, and binarization standardizes the brightness of the two portions. Character normalization allows the method to cope flexibly with changes in the size of the characters obtained by the imaging device. Furthermore, the two-dimensional frequency distribution obtained by accumulating, over all dictionary characters, the "0" or "1" information of each dictionary character pattern serves as an index for determining which addresses of the normalized two-dimensional image carry information that is significant for character recognition. In this way, only the information at specific pixel addresses determined on the character pattern image is extracted.

Furthermore, when a neural network is used as the character recognition means and learning on the dictionary character patterns is completed in advance, a character is recognized at the recognition stage simply by inputting the "0"/"1" data at the predetermined two-dimensional image addresses to the input layer of the neural network and executing the product-sum operations with the weighting coefficients of the intermediate layer.
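The recognition-stage computation described above can be sketched as a plain forward pass. This is only an illustration under stated assumptions: the patent does not name the unit function, so the sigmoid used here is an assumption, as are the function names `sigmoid`, `layer`, and `forward`, the absence of bias terms, and the layout in which `weights[i][j]` connects sending unit i to receiving unit j.

```python
import math


def sigmoid(x):
    """Assumed unit function; the patent does not specify one."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights):
    """Product-sum of the inputs with the weights, followed by the unit function.

    weights[i][j] is the weight from sending unit i to receiving unit j.
    """
    n_units = len(weights[0])
    return [
        sigmoid(sum(inputs[i] * weights[i][j] for i in range(len(inputs))))
        for j in range(n_units)
    ]


def forward(bits, w_hidden, w_output):
    """Map the m extracted 0/1 values to the n output responses."""
    return layer(layer(bits, w_hidden), w_output)


# m = 3 inputs, 2 intermediate units, n = 3 outputs, all weights zero.
responses = forward(
    [1, 0, 1],
    [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
)
```

With all weights at zero every unit outputs 0.5, which shows that the result depends entirely on the learned weights; no matching against stored templates takes place at recognition time.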

[Example]

An embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 shows a grayscale image 200 of a dictionary character, the dictionary character 210 after binarization, the dictionary character 220 after normalization, a process 300 that obtains the frequency distribution over the normalized two-dimensional image addresses for the dictionary character patterns of all recognition target characters, and the resulting frequency distribution 310.

The binarized and normalized dictionary character pattern 220 could be created pixel by pixel in computer memory using an external data input means such as a keyboard, but entering the data takes a long time. The character is therefore first captured into image memory as a grayscale image (with 8 bits, gray levels 0 to 255) by an imaging means such as a television camera and then converted to "0"/"1" information by a suitable binarization process. This binarization can be realized by the methods shown in FIGS. 4 and 5. In the method of FIG. 4, the frequency distribution of gray values is computed for a predetermined region of the grayscale image containing the character to be binarized, which gives a distribution like the one shown on the right of the figure. In this distribution, peak A at the lower gray level g corresponds to the character portion, and peak B at the higher gray level g to its right corresponds to the bright background portion. The binarization process can therefore use the algorithm of Equation (1), taking as the threshold the gray level g0 at the valley between the two peaks.

Here, g_i is the gray level of each pixel in the binarization region. However, the two peaks A and B do not always appear as clearly as in the gray-level frequency distribution of FIG. 4. In such cases, as shown in FIG. 5, the maximum value g_max and minimum value g_min of the gray level along a predetermined line (address) of the character region to be binarized may be obtained, and the value midway between them used as the threshold.
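The two ways of choosing a threshold described above can be sketched as follows. Equation (1) is not reproduced in this text, so the mapping used in `binarize` (dark character pixels at or below the threshold become 1, bright background pixels become 0) is an assumption, as are the function names and the simple heuristic used to locate the second histogram peak.

```python
def valley_threshold(pixels, levels=256):
    """Gray level at the valley between the two main histogram peaks (FIG. 4 style)."""
    counts = [0] * levels
    for g in pixels:
        counts[g] += 1
    first = max(range(levels), key=lambda g: counts[g])
    # Assumed heuristic: the second peak is frequent AND far from the first peak.
    second = max(range(levels), key=lambda g: counts[g] * (g - first) ** 2)
    lo, hi = sorted((first, second))
    mid = (lo + hi) / 2
    # Least frequent level between the peaks; ties go to the level nearest the middle.
    return min(range(lo, hi + 1), key=lambda g: (counts[g], abs(g - mid)))


def midpoint_threshold(line):
    """Value midway between the largest and smallest gray level on one line (FIG. 5 style)."""
    return (max(line) + min(line)) / 2


def binarize(image, threshold):
    """Assumed mapping: dark (character) pixels become 1, bright pixels become 0."""
    return [[1 if g <= threshold else 0 for g in row] for row in image]


# Dark character pixels near level 30, bright background pixels near level 210.
pixels = [20] * 5 + [30] * 8 + [40] * 4 + [200] * 6 + [210] * 10 + [220] * 5
t_valley = valley_threshold(pixels)
t_mid = midpoint_threshold([200, 60, 50, 210])
bits = binarize([[200, 60], [50, 210]], t_mid)
```

The midpoint rule needs only one scan line and no peak search, which is why it remains usable when the histogram has no clear valley.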

Character normalization is an indispensable process for fixing the positions in the character image from which the input information for character recognition is extracted. If the size of a character before normalization is Nx pixels in the X direction and Ny pixels in the Y direction, the character is scaled by a factor of M/Nx in the X direction and M/Ny in the Y direction so that its size after normalization is M × M. M is normally chosen as the smallest value at which the character patterns can still be represented as images, so the scaling is usually a reduction. For the n dictionary characters normalized in this way, the "0" or "1" values are accumulated at each two-dimensional address to obtain the frequency distribution 310. This frequency distribution 310 shows at which positions on the normalized two-dimensional image the character patterns carry the most information. If m specific addresses among the M × M pixel addresses are used as the positions from which information for character recognition is extracted, the processing time required for character recognition is shorter than when all M × M pixel values are used. When there are n recognition target characters, it is appropriate to select these m addresses in order, starting from the address whose value in the frequency distribution of the dictionary character patterns is closest to n/2.
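The normalization, accumulation, and address selection steps described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: nearest-neighbour sampling in `normalize`, the row-major tie-breaking in `select_addresses`, the function names, and the four 3 × 3 dictionary patterns are all assumptions made for the example.

```python
def normalize(image, m):
    """Rescale a binary image of Ny rows by Nx columns to m x m (nearest neighbour)."""
    ny, nx = len(image), len(image[0])
    return [[image[(i * ny) // m][(j * nx) // m] for j in range(m)] for i in range(m)]


def frequency_distribution(patterns):
    """Sum the 0/1 values of all normalized dictionary patterns at each address."""
    size = len(patterns[0])
    freq = [[0] * size for _ in range(size)]
    for pattern in patterns:
        for i, row in enumerate(pattern):
            for j, bit in enumerate(row):
                freq[i][j] += bit
    return freq


def select_addresses(freq, n, m):
    """The m addresses whose frequency is closest to n / 2 (ties in row-major order)."""
    addresses = [(i, j) for i in range(len(freq)) for j in range(len(freq[i]))]
    addresses.sort(key=lambda a: abs(freq[a[0]][a[1]] - n / 2))
    return addresses[:m]


def extract(pattern, addresses):
    """Read the 0/1 values of one normalized pattern at the selected addresses."""
    return [pattern[i][j] for i, j in addresses]


small = normalize([[1, 1, 0, 0], [1, 1, 0, 0]], 2)

# Four hypothetical dictionary patterns, already normalized to 3 x 3.
patterns = [
    [[1, 0, 1], [1, 0, 0], [1, 1, 0]],
    [[1, 1, 1], [0, 0, 0], [1, 0, 0]],
    [[1, 0, 0], [0, 0, 1], [1, 1, 0]],
    [[1, 1, 0], [1, 0, 1], [1, 0, 0]],
]
freq = frequency_distribution(patterns)
addresses = select_addresses(freq, n=len(patterns), m=3)
features = [extract(p, addresses) for p in patterns]
```

In this example three of the nine addresses are enough to give each of the four patterns a distinct input vector. Addresses where every pattern agrees (frequency 0 or 4) are never selected.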

The reason is that an address that is always "0" or always "1" for every character pattern can be regarded as a position that carries no information about the character pattern. Seen this way, an address at which "0" and "1" each occur with a probability of 50% carries the most information, and such an address corresponds to a point with a frequency of n/2.

FIG. 2 shows the embodiment of FIG. 1 described above as a concrete processing flow.
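The claim that a 50% address carries the most information can be checked numerically. The sketch below uses the Shannon entropy of a binary variable as the measure of information; the patent does not name a measure, so this choice, along with the function names `binary_entropy` and `address_information`, is an assumption made for illustration.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a 0/1 variable that equals 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a value that never changes tells the recognizer nothing
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def address_information(frequency, n):
    """Information at an address where `frequency` of the n dictionary patterns are 1."""
    return binary_entropy(frequency / n)


n = 10
always_zero = address_information(0, n)
always_one = address_information(n, n)
half = address_information(n // 2, n)
uneven = address_information(3, n)
```

Under this measure the information is zero at frequencies 0 and n, rises symmetrically toward the middle, and peaks at exactly one bit when the frequency is n/2. This matches the selection rule of taking addresses in order of closeness to n/2.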

In FIG. 2, Pk(i, j) is the binarized character pattern after normalization, and h(i, j) is the frequency distribution obtained by accumulating, at each two-dimensional address (i, j), the "0"/"1" information of the n dictionary character patterns from k = 1 to n. Processing block 50 initializes the variable h(i, j) in which the frequency distribution over all dictionary character patterns Pk(i, j) (k = 1 to n) is accumulated.

FIG. 3 shows an embodiment in which the present invention is applied to character recognition by a neural network. The figure shows an imaging device 400 for imaging the dictionary character patterns and the recognition target character patterns; a capture processing unit 410 for dictionary character images; a binarization processing unit 420 for character images captured by the imaging device 400; a normalization processing unit 430 for binarized character patterns; a frequency distribution calculation unit 440 that accumulates the "0" or "1" values over all dictionary character patterns; a processing unit 450 that, based on the frequency distribution from the frequency distribution calculation unit 440, extracts m two-dimensional addresses in order starting from the frequency closest to n/2; a capture processing unit 460 for recognition target character images; a processing unit 470 that extracts the "0" or "1" information at the m addresses from a normalized dictionary character pattern or recognition target character pattern; a learning processing unit 480 for the weighting coefficients Wij of the intermediate layer that links the input layer and output layer of the neural network (Wij denotes the weight of the connection between neurons i and j); and a neural network 490 for character recognition.

Next, the processing of this embodiment is described in detail.

The method of determining the m information extraction addresses on the dictionary character patterns and the recognition target character patterns has already been explained with FIG. 1 and the flow of FIG. 2, so the explanation is not repeated here.

The neural network used in this embodiment imitates the workings of the human brain and basically consists of an input layer, an intermediate layer, and an output layer. The intermediate layer generally has a multilayer structure, and the layers are connected by weighting coefficients that express the strength of each connection. In humans, the results of learning over a long time from the various patterns of information presented to the input layer are held as the weighting coefficients of the intermediate layer, which tie a given input pattern to an output pattern. The neural network of this embodiment realizes the function of character recognition by associating a character pattern entering the input layer with a dictionary character pattern at the output layer.

First, the method of learning the weighting coefficients Wij of the intermediate layer of the neural network is described, followed by the character recognition method that uses the Wij determined by that learning.

The weighting coefficients Wij of the intermediate layer are learned as follows. For each of the n dictionary character patterns, the "0"/"1" data extracted from the m specific addresses of the normalized binarized image are applied to the m inputs of the neural network, and the weighting coefficients Wij are learned step by step according to Equation (2) from the resulting responses Oi of the output layer.

$$\Delta W_{ij}(k+1) = -\eta\,\delta_j\,O_i + \alpha\,\Delta W_{ij}(k) \qquad (2)$$

where δ_j is the error of the output layer (the difference between the teaching pattern and the response pattern), η and α are learning parameters, and k is the learning iteration step.

For the "0"/"1" pattern of the m inputs determined by a given dictionary character, learning continues until the output corresponding to that dictionary character is obtained. This learning process is carried out for the n dictionary characters to learn the weighting coefficients Wij of the intermediate layer. The values of the weighting coefficients Wij normally have to be set suitably before learning; one way is to set them from uniform random numbers. Once the neural network has been constructed in this way, a recognition target character is captured as an image by the imaging means 400 and passed through binarization 420 and normalization 430. The m-element "0"/"1" pattern extracted from this image is then input to the input layer of the neural network, and the character is recognized from the resulting values of the output layer.
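One application of Equation (2) to the weights between two layers can be sketched as follows. The patent defines δ_j only as the difference between the teaching pattern and the response pattern, so the specific form used in `output_delta` (response minus target, scaled by the derivative of a sigmoid unit) is an assumption chosen so that the minus sign in Equation (2) moves the response toward the target. The function names and the values of η and α are also illustrative.

```python
def output_delta(response, target):
    """Assumed error term for a sigmoid output unit: (O - T) * O * (1 - O)."""
    return (response - target) * response * (1.0 - response)


def momentum_update(weights, prev_change, sender_outputs, deltas, eta=0.5, alpha=0.9):
    """Apply Equation (2) once: dW_ij(k+1) = -eta * delta_j * O_i + alpha * dW_ij(k).

    weights[i][j] connects sending unit i to receiving unit j. Returns the
    updated weights and the changes to pass in as prev_change at the next step.
    """
    new_weights, new_change = [], []
    for i, o_i in enumerate(sender_outputs):
        w_row, c_row = [], []
        for j, delta_j in enumerate(deltas):
            change = -eta * delta_j * o_i + alpha * prev_change[i][j]
            c_row.append(change)
            w_row.append(weights[i][j] + change)
        new_weights.append(w_row)
        new_change.append(c_row)
    return new_weights, new_change


# Two sending units, two output units; the teaching pattern is (1, 0).
weights = [[0.0, 0.0], [0.0, 0.0]]
change = [[0.0, 0.0], [0.0, 0.0]]
deltas = [output_delta(0.5, 1.0), output_delta(0.5, 0.0)]
weights, change = momentum_update(weights, change, [1, 0], deltas)
first_step = [row[:] for row in weights]
weights, change = momentum_update(weights, change, [1, 0], deltas)
```

The α term reuses the previous change, so repeated steps in the same direction grow larger: the second step here moves the active weights by 0.11875 instead of 0.0625. Weights leaving a sending unit whose output is 0 are not changed.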

In this case, the recognition result is the character corresponding to the response Oi that has the largest value among the n responses of the output layer.

The recognition method using a neural network is now explained with a concrete example.

For character recognition, the number of responses Oi of the output layer is made equal to the number n of dictionary characters, and each response Oi is associated with one dictionary character.

To learn the weighting coefficients Wij of the intermediate layer, when, for example, the information of dictionary character pattern A is input to the input layer (learning for recognition target character A), learning is repeated until, within a predetermined tolerance given in advance, O1 is "1" and all the other responses O2 to On are "0". Likewise, for recognition target character B, the dictionary character pattern of B is input and learning continues until O2 is "1" and all the others are "0".

Suppose a recognition target character pattern is input to the input layer and the responses of the output layer are O1 = 0.02, O2 = 0.95, O3 = 0.0, ..., On = 0.0. Because O2 shows the maximum response, the recognition result is B. Whether recognition has succeeded or failed is judged against a decision value: if the decision value is 0.8, for example, recognition fails when all of the responses O1 to On are 0.8 or less.
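The decision rule in the example above can be sketched as follows. The function name `recognize` and the use of `None` to signal failure are assumptions; the decision value of 0.8 is the example value given in the text.

```python
def recognize(responses, labels, decision_value=0.8):
    """Return the label of the largest response, or None if recognition fails.

    Recognition fails when every response is at or below the decision value.
    """
    best = max(range(len(responses)), key=lambda k: responses[k])
    if responses[best] <= decision_value:
        return None
    return labels[best]


labels = ["A", "B", "C"]
accepted = recognize([0.02, 0.95, 0.0], labels)
rejected = recognize([0.30, 0.60, 0.10], labels)
```

Returning a distinct failure value lets the caller choose a fallback, such as consulting a known character string or asking an operator.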

FIG. 6 shows an embodiment of a character recognition device to which the character recognition method of the present invention is applied.

FIG. 6 shows an imaging unit 400 that images the recognition target character patterns and the dictionary character patterns, an A/D (analogue-to-digital) conversion unit 500, an image memory 510, an image processing/recognition unit 520, a dictionary character pattern memory 530, an information memory 540 for information about the recognition target characters, and a display device 550.

The A/D conversion unit 500 converts the video signal from the television camera 400 from analogue to digital. The A/D conversion period is determined by the number of pixels that make up one image; when one image consists of 512 × 512 pixels, the period is about 0.13 μs. The converted video signal is stored in the image memory 510 as an image of 512 × 512 pixels. This image is read by the image processing/recognition unit 520, which performs the preprocessing for character recognition, such as binarization and normalization.
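The quoted conversion period can be reproduced with a short calculation. The text does not state the frame rate, so the 30 frames per second used here is an assumption (typical of television cameras of the period), and blanking intervals are ignored; the function name is illustrative.

```python
def sampling_period_us(width, height, frames_per_second):
    """Time available per pixel, in microseconds, if the whole frame time is used."""
    return 1e6 / (width * height * frames_per_second)


period = sampling_period_us(512, 512, 30)  # 30 frames per second is an assumption
```

Under these assumptions the result is roughly 0.127 μs per pixel, consistent with the "about 0.13 μs" stated above.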

The dictionary character pattern memory 530 stores the binarized and normalized dictionary character patterns used for character recognition. As shown in FIG. 3, the dictionary characters can be imaged by the television camera 400, binarized and normalized by the image processing/recognition unit 520, and then stored in the dictionary character pattern memory 530. Alternatively, the "0"/"1" data for each dictionary character pattern can be keyed in from a keyboard (not shown) connected to the image processing/recognition unit 520. The neural network for character recognition shown in FIG. 3 can be stored as a computer program in memory within the image processing/recognition unit 520, so the image processing/recognition unit 520 can be built around a microcomputer. Learning of the neural network for character recognition is carried out by the image processing/recognition unit 520 using the dictionary character patterns stored in the dictionary character pattern memory 530. The m pixel addresses on the character pattern image that feed the input layer of the neural network are likewise determined by the image processing/recognition unit 520 once the dictionary character patterns have been stored in the dictionary character pattern memory 530.

The information memory 540 for recognition target characters stores rules about the character strings to be recognized (for example, that the first character of a multi-character string is always an alphabetic character). When the recognition target characters are known in advance, so that automatic recognition serves as a means of confirming that there is no error, it also stores the character string data itself. In character recognition with a neural network such as that shown in FIG. 3, the character corresponding to the response Oi (max) with the largest value is the recognition result, but recognition is regarded as having failed if this maximum response does not reach a predetermined value or if its difference from the second-largest value does not reach a predetermined value. Suppose such a failure occurs for only one character of a multi-character string. If nothing is known in advance about the recognition target character string, the string as a whole fails to be recognized and the recognition rate falls. Conversely, if the recognition target character string is known in advance and the first or second candidate character matches the stored recognition target character, the probability of error is very small even if that character is taken as the recognition result. In a system that does not require high-speed continuous recognition, the character pattern that failed to be recognized can also be displayed on the display device 550 and its recognition left to an operator. In that case, by entering the operator's recognition result into the image processing/recognition unit through a keyboard or similar means, the neural network can be retrained on the character pattern that failed.
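The failure test and the known-string fallback described above can be sketched as follows. The function name `recognize_with_fallback`, the threshold values `min_response` and `min_margin`, and the use of `None` for a failure that would go to an operator are assumptions made for illustration; the text speaks only of "predetermined" values.

```python
def recognize_with_fallback(responses, labels, expected=None,
                            min_response=0.8, min_margin=0.3):
    """Accept the top response if it is confident; otherwise try the known character.

    `expected` is the character stored in advance for this position, if any.
    Returns None when recognition fails and no stored character can settle it.
    """
    order = sorted(range(len(responses)), key=lambda k: responses[k], reverse=True)
    first, second = order[0], order[1]
    confident = (
        responses[first] >= min_response
        and responses[first] - responses[second] >= min_margin
    )
    if confident:
        return labels[first]
    # Fallback: trust the stored character if it is the first or second candidate.
    if expected is not None and expected in (labels[first], labels[second]):
        return expected
    return None  # e.g. display the pattern and leave recognition to an operator


labels = ["A", "B", "C"]
clear = recognize_with_fallback([0.1, 0.9, 0.2], labels)
rescued = recognize_with_fallback([0.6, 0.55, 0.1], labels, expected="B")
unresolved = recognize_with_fallback([0.6, 0.55, 0.1], labels)
```

The fallback only confirms a character that the network already ranks first or second, so the stored string cannot override a clearly different reading.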

[Effect of the invention]

According to the present invention, after a recognition target character image has been binarized and normalized to M × M pixels, only the points at the m (m < M × M) image addresses determined from the n dictionary character patterns need to be extracted, which greatly compresses the amount of information. In addition, when the m extracted values are combined with a neural network whose intermediate-layer weighting coefficients have been learned in advance, character recognition is achieved solely by product-sum operations between the m input values of the input layer and the weighting coefficients of the intermediate layer, which also greatly shortens the character recognition time.

[Brief explanation of drawings]

FIG. 1 is a diagram showing one embodiment of the present invention; FIG. 2 is a flowchart showing the concrete processing of the embodiment of FIG. 1; FIG. 3 is a diagram showing another embodiment of the present invention; FIGS. 4 and 5 are diagrams showing methods of binarizing the grayscale character images in FIGS. 1, 2, and 3; and FIG. 6 is a diagram showing one embodiment of the character recognition device.

200: grayscale image of a dictionary character; 210: binarized image of the dictionary character; 220: normalized image of the dictionary character; 300: process for calculating the frequency distribution of the "0" or "1" patterns of the dictionary characters; 310: frequency distribution.

Claims

[Claims]

1. A character recognition method for recognizing fixed-form characters, characterized in that a binarized, normalized image of a predetermined size is created for every dictionary character to be recognized; a two-dimensional frequency distribution is obtained by accumulating the "0" or "1" information of all dictionary character image data at each address of the normalized image; a predetermined number of two-dimensional addresses are extracted on the basis of this frequency distribution, starting from the address carrying the most information; and the information of the normalized image obtained at the extracted two-dimensional addresses for an input recognition target character is used as the input data of a character recognition means.

2. The character recognition method according to claim 1, characterized in that the predetermined number of two-dimensional addresses are extracted starting from the address whose frequency is closest to one half of the total number of dictionary characters.

3. The character recognition method according to claim 1, characterized in that the character recognition means is formed by a neural network consisting of three layers (an input layer, an intermediate layer, and an output layer), and the weighting coefficients of the intermediate layer used to recognize the recognition target character are determined in advance by learning dictionary character patterns.

4. The character recognition method according to claim 3, characterized in that, when recognition of input recognition target characters fails a predetermined number of times, the number of inputs to the input layer of the neural network is increased by at least one and the weighting coefficients of the intermediate layer are relearned.

5. The character recognition method according to claim 3, characterized in that the recognition target character is part of a known recognition target character string and, when recognition of the recognition target character fails, the known character that matches the first or second candidate character of the recognition target character is taken as the recognition result.

6. The character recognition method according to claim 3, characterized in that, when automatic character recognition of the recognition target character fails, the recognition target character is displayed and recognized by a person, and the recognition result is used to relearn the weighting coefficients of the intermediate layer of the neural network.

7. A character recognition device to which the character recognition method according to claim 1 is applied, comprising: an imaging unit that images the recognition target character patterns and the dictionary character patterns; an A/D conversion unit that converts the imaging signal from analogue to digital; an image memory unit for the A/D-converted images; an image processing/recognition unit for the images; a dictionary character memory unit and an information memory unit for information about the recognition target characters; and a display device that displays the character recognition results.