JPH0830728A - Binarization device for image

Info

Publication number: JPH0830728A
Application number: JP6182951A
Authority: JP
Inventors: Hitoshi Kubota; 整久保田; Hisashi Chiba; 久千葉; Katsuichi Ono; 勝一小野
Original assignee: Suzuki Motor Corp
Current assignee: Suzuki Motor Corp
Priority date: 1994-07-12
Filing date: 1994-07-12
Publication date: 1996-02-02

Abstract

PURPOSE: To perform highly accurate binarization without being influenced by noise, contrast, and the like. CONSTITUTION: The device comprises an original image input unit 10 that captures an image containing a recognition processing target a1 and converts the captured analog image data into original image data (b) having gradation; a threshold selection unit 12 that calculates a correction threshold (c) for binarizing the original image data (b); an input data creation unit 14 that creates input data X1 to Xn by correcting the density level of the original image data (b) based on the correction threshold (c) and normalizing the original image data (b); and a neural network processing unit 16 that binarizes the input data X1 to Xn created by the input data creation unit 14 through neural network processing and converts them into output data x1 to xn.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an image binarization device used in preprocessing for recognizing characters, figures, symbols, colors, and the like in an input image, and more particularly to an image binarization device for converting an image having gradation into a two-level image whose density is "0" or "1".

[0002] This device is applied, for example, to a binarization device that binarizes a grayscale image as preprocessing for recognizing characters in an input image, in a character recognition device that reads characters optically.

[0003]

[Prior Art] A conventional image binarization device was an A/D converter that performs binarization with reference to a threshold. Image processing that binarizes an image having gradation with reference to a threshold has also been performed. Various methods for determining these thresholds are known. For example, a density histogram of an analog image or of a digital image having gradation is created, and the density value at the valley of the histogram is used as the threshold.

[0004]

[Problems to Be Solved by the Invention] In such prior art, however, it is difficult in the first place to select a threshold that properly separates the background portion from the recognition target portion in an analog image, and methods that select the threshold automatically did not always reach the accuracy required by the recognition means. In addition, when the original image contains noise whose density value is at or above the threshold, that noise is binarized as a valid portion (part of a recognition target such as a character). Thus, with the prior art it was difficult to remove noise and binarize with high accuracy.

[0005] To address this problem, Japanese Patent Application No. 6-85519 by the same applicant proposes a method of binarizing a grayscale image using a recurrent neural network. In this method, as preprocessing for image recognition, the difference between the average density value of the outermost periphery of the original image and each pixel is taken, and the resulting difference image is input to the recurrent neural network. Because this makes the density level of the background portion constant, the neural network processing could binarize the image while removing noise.
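The preprocessing of the earlier application can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the function name `border_difference_image`, the representation of the image as a list of rows, and the sign convention (pixel value minus border mean) are all choices made here.

```python
def border_difference_image(image):
    """Subtract the mean density of the outermost pixels from every pixel.

    `image` is a list of rows of density values (hypothetical representation).
    The sign convention (pixel minus border mean) is an assumption.
    """
    n_rows, n_cols = len(image), len(image[0])
    # Collect every pixel lying on the first/last row or first/last column.
    border = [
        image[r][c]
        for r in range(n_rows)
        for c in range(n_cols)
        if r in (0, n_rows - 1) or c in (0, n_cols - 1)
    ]
    border_mean = sum(border) / len(border)
    return [[value - border_mean for value in row] for row in image]
```

With a uniform border, the background of the difference image becomes zero whatever its original density, which is the effect the paragraph above describes.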

[0006] However, this method had the disadvantage that binarization accuracy did not improve beyond a certain level. In particular, highly accurate binarization was not possible for high-contrast images or for images with extremely low contrast. That is, for a high-contrast image, the reference value for the neural network processing is not set to an appropriate value under this method, so the image cannot be binarized accurately.

[0007] An image with extremely low contrast is, for example, an original image obtained by imaging stamped characters struck into a metal surface such as an engine block. As preprocessing for the automatic recognition of such stamped characters, the extremely low-contrast image of the stamped characters could not be binarized with the accuracy required by the recognition unit.

[0008]

[Object of the Invention] An object of the present invention is to remedy the disadvantages of the prior art and, in particular, to provide an image binarization device capable of highly accurate binarization without being affected by noise, contrast, and the like.

[0009]

[Means for Solving the Problems] The present invention therefore adopts a configuration comprising: an original image input unit that captures an image containing a recognition processing target and converts the captured analog image data into original image data having gradation; a threshold selection unit that calculates a correction threshold for binarizing the original image; an input data creation unit that creates input data by correcting the density level of the original image based on the correction threshold and normalizing the original image; and a neural network processing unit that binarizes the input data created by the input data creation unit through neural network processing and outputs the result. This configuration is intended to achieve the object described above.

[0010]

[Operation] During operation of the device, the original image input unit first captures an image containing the recognition processing target. It does so by generating an analog image, which is a voltage waveform obtained by photoelectrically converting the light reflected from the imaging target. The original image input unit then converts the analog image into original image data having gradation. The threshold selection unit calculates a correction threshold for binarizing the original image data. Next, the input data creation unit corrects the density level of the original image data based on the correction threshold. The density level of the original image is therefore corrected with reference to the correction threshold regardless of how dark the analog image captured by the original image input unit is, and regardless of its density distribution.

[0011] The input data creation unit further creates input data X1 to Xn by normalizing the original image data b whose density level has been corrected. This input data is then binarized by the neural network processing unit. The binarized image data is output to a character recognition device or the like.

[0012] The neural network in the neural network processing unit is trained in advance so that it can accurately recognize and binarize the input data. For example, it is trained to judge that even a pixel exceeding the threshold is noise, based on the relationship between that pixel and its surroundings. The neural network processing unit therefore recognizes noise contained in the input data and treats it the same as the background.

[0013]

[Embodiment] An embodiment of the present invention will now be described with reference to the drawings. FIG. 1 is a block diagram showing the configuration of an image binarization device according to the present invention. The image binarization device comprises: an original image input unit 10 that captures an image containing a recognition processing target a1 and converts the captured analog image data into original image data b having gradation; a threshold selection unit 12 that calculates a correction threshold c for binarizing the original image data b; an input data creation unit 14 that creates input data X1 to Xn by correcting the density level of the original image data b based on the correction threshold c and normalizing the original image data b; and a neural network processing unit 16 that binarizes the input data X1 to Xn created by the input data creation unit 14 through neural network processing and converts them into output data x1 to xn. In addition, a binarized image synthesis unit 18, which generates and outputs a binarized image by combining the output data x1 to xn, is provided alongside the neural network processing unit 16.

[0014] This will be described in detail. The original image input unit 10 consists of imaging means 10A, such as an image scanner or a CCD camera, and an A/D converter 10B that A/D-converts the analog image captured by the imaging means 10A into digital image data having gradation (original image) b. It photoelectrically converts an image containing a recognition processing target such as a character a1 and outputs it as digital values. Here, the CCD camera 10A first photoelectrically converts the light reflected from the image containing the recognition processing target and outputs an analog image, which is a continuous variation in voltage. The A/D converter 10B then converts the analog image into a digital image (original image) b with, for example, 256 gradations.

[0015] The threshold selection unit 12 selects a threshold c for binarizing the original image b. This correction threshold c is used to make the density level of the original image b constant. Any method that can automatically obtain a rough threshold c regardless of the type of original image b is therefore sufficient, and this embodiment adopts conventionally practiced methods. There are various methods for selecting the correction threshold c, such as the P-tile method and the discriminant threshold selection method; the following methods, each of which works well depending on the nature of the recognition target, are adopted.

[0016] P-tile method: This method is suitable when the share of the analog image occupied by the recognition target is known in advance. Using the density histogram, it calculates the density value at which the cumulative distribution, counted from the lowest density value, reaches P percent, and takes that value as the threshold c.
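The P-tile rule can be sketched in a few lines. This is a minimal illustration assuming 256 integer density levels and a flat list of pixel values; the function name `p_tile_threshold` and its parameters are hypothetical and do not come from the patent.

```python
def p_tile_threshold(pixels, p_percent, levels=256):
    """Return the lowest density value whose cumulative share reaches p_percent.

    `pixels` is a flat list of integer densities in [0, levels - 1].
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    target = len(pixels) * p_percent / 100.0
    cumulative = 0
    for value, count in enumerate(hist):
        cumulative += count
        if cumulative >= target:
            return value
    return levels - 1
```

For an image in which 30 percent of the pixels have density 10 and the rest have density 200, `p_tile_threshold(pixels, 30)` returns 10, the value at which the cumulative count first reaches 30 percent.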

[0017] Mode method: When the target is a high-contrast image, the density histogram is bimodal, so this method calculates the valley between the peak of the background portion and the peak of the target portion and takes it as the threshold c. However, when the histogram is not smooth or the valley is not pronounced, and especially when the contrast is low, it is difficult to determine the threshold c automatically. This embodiment therefore adopts the discriminant threshold selection method described next. One way of determining the threshold c automatically on the basis of the mode method is to approximate the bimodal histogram by a sum of normal distribution functions in the least-squares sense and to calculate the threshold c from estimates such as the means and variances.
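As a rough illustration of the valley search, the sketch below picks the tallest histogram bin as the first peak, picks the second peak with a common distance-weighted heuristic, and returns the lowest bin between them. The heuristic and the name `mode_method_threshold` are assumptions made for this example; the patent does not specify how the peaks are located, and this is not the least-squares normal-distribution fit mentioned above.

```python
def mode_method_threshold(hist):
    """Return the index of the valley between the two main histogram peaks.

    Returns None when the two peaks are adjacent and no valley exists.
    """
    indices = range(len(hist))
    # First peak: the most populated density value.
    peak1 = max(indices, key=lambda k: hist[k])
    # Second peak: tall and far from the first (heuristic, an assumption).
    peak2 = max(indices, key=lambda k: hist[k] * (k - peak1) ** 2)
    low, high = sorted((peak1, peak2))
    if high - low < 2:
        return None
    # Valley: the least populated value strictly between the peaks.
    return min(range(low + 1, high), key=lambda k: hist[k])
```

On a noisy or nearly unimodal histogram this search is unreliable, which matches the limitation the paragraph above notes for low-contrast images.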

[0018] Discriminant threshold selection method: This method is based on discriminant analysis in multivariate analysis. It divides the original image b for which the threshold c is to be selected into classes, and selects the value that best discriminates the target from the relationship between the between-class variance and the total variance. That is, it calculates the threshold c(t) that maximizes the class separability η(t). To measure how good a threshold position is when the image is binarized at threshold c(t), the class separability used in discriminant analysis, given by equation (1) below, is introduced.

[0019]

[Equation 1]

$$\eta(t) = \frac{\sigma_B^2(t)}{\sigma_T^2(t)} \tag{1}$$

[0020] Here, σB²(t) and σT²(t) denote the between-class variance and the total variance, respectively. The threshold c(t) that maximizes η(t) is then obtained. The class separability η(t) for a threshold c(t) can be interpreted as expressing the degree to which the original image b is binary. This method behaves like the mode method described above when the histogram is bimodal, and has the advantage that the threshold c is determined automatically even when there are no modes and the valley is therefore hard to identify.

[0021] These methods are publicly known and are disclosed, for example, in Image Processing Handbook Editorial Committee (ed.), "Image Processing Handbook", first edition, fifth printing, May 20, 1992, pp. 278-280. For details of threshold calculation by discriminant analysis, see Nobuyuki Otsu, "An Automatic Threshold Selection Method Based on Discriminant and Least Squares Criteria", Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. J63-D, No. 4, pp. 349-356 (April 1980). That paper discloses a method that calculates the threshold automatically by a simple procedure based only on the zeroth- and first-order cumulative moments of the density histogram.
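A minimal sketch of the discriminant threshold selection method follows. It assumes 256 integer density levels and a flat list of pixels; the name `otsu_threshold` is hypothetical. The running count and running sum play the role of the zeroth- and first-order cumulative moments. Because the total variance does not depend on the threshold, the sketch maximizes the between-class variance, which selects the same threshold as maximizing the class separability η(t).

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with density <= t form one class and the remaining pixels the other.
    """
    n = len(pixels)
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_between = 0, -1.0
    count0 = 0  # running pixel count (zeroth-order cumulative moment)
    sum0 = 0    # running density sum (first-order cumulative moment)
    for t in range(levels - 1):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = n - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so no separation is defined
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        between = (count0 / n) * (count1 / n) * (mean0 - mean1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t
```

When several thresholds give the same between-class variance, as happens across an empty gap between two clusters, this sketch keeps the lowest one; other tie-breaking rules are equally valid.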

[0022] The threshold selection unit 12 calculates the correction threshold c of the original image b based on these methods. In this embodiment, because of constraints on binarization processing time and in order to provide a general-purpose binarization method, methods that calculate the threshold using edge information and dynamic threshold selection methods are not used to select the correction threshold c. However, as the performance of processing hardware improves, a method such as the second-derivative (Laplacian) histogram method may be adopted as preprocessing for the discriminant threshold selection method. The main technical idea of this embodiment for solving the problem is to correct the density level of the original image b with reference to a correction threshold c obtained by some method and then to input the result to the neural network. It is therefore sufficient that a correction threshold c be obtained, and the present invention is not limited by the method used to select the threshold c.

[0023] In pattern recognition for images, speech, and the like, it is taken as self-evident that neural network processing yields effects not available with earlier techniques. In neural network processing, the technical issues are therefore the method of generating input data and the method of training the neural network. That is, generating the input data and training the neural network so as to draw out as much as possible of the potential benefit of neural network processing is the main technical issue of this embodiment.

[0024] The input data creation unit 14 therefore corrects the density level of the original image b based on the correction threshold c selected by the threshold selection unit 12. Various methods exist for this correction, and any method that corrects the density level with reference to the threshold c may be used. Here, the correction threshold c is treated as the center value for binarization, and the density of the original image b is corrected with reference to it. That is, in this embodiment, when image density values have 256 gradations (0 to 255), the density level of the original image is corrected so that the threshold c becomes 128.

[0025] Here, the difference between the correction threshold c and the center value of the gradation range (for example, 128 for 256 gradations) is taken, and all density values are shifted uniformly by that difference. Values that would leave the gradation range once the difference is added are set to the maximum or minimum value, respectively, in order to preserve binarization processing speed. Of course, the original image data b may instead be processed beforehand so that it does not leave the gradation range, and then corrected.
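The uniform shift with clipping described here can be sketched as follows, assuming 256 integer levels and a flat list of pixel densities. The function name `shift_density` is hypothetical.

```python
def shift_density(pixels, threshold, levels=256):
    """Shift all densities so that `threshold` lands on the center level.

    Values pushed outside [0, levels - 1] are clipped to the nearest bound.
    """
    center = levels // 2          # 128 for 256 gradations
    offset = center - threshold   # same offset for every pixel
    return [min(levels - 1, max(0, value + offset)) for value in pixels]
```

With a threshold of 100, the offset is +28, so a pixel at 100 moves to 128 and a pixel at 255 is clipped at 255. With a threshold of 200, the offset is -72, so a pixel at 50 is clipped at 0.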

[0026] FIG. 2 is an explanatory diagram of the image processing in the input data creation unit 14 and the neural network processing unit 16. The input data creation unit 14 generates the input data X1 to Xn by normalizing the image data whose density level has been shifted. Because the density level has been shifted, when, for example, 256 gradations are normalized to the range 0.0 to 1.0, the density of the correction threshold c is always 0.5 whatever the original image b. The input data creation unit 14 further divides the original image b into m parts horizontally and n parts vertically and outputs them, for example, m at a time for each vertical line, as the input data X1 to Xn to the neural network processing unit 16.
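The normalization and line-by-line division can be sketched as below. The representation (a list of rows of integer densities), the division by the maximum level 255, and the name `make_input_data` are assumptions for illustration; with this scaling the shifted threshold 128 maps to roughly 0.502, which the text rounds to 0.5.

```python
def make_input_data(image, levels=256):
    """Normalize densities to [0.0, 1.0] and return one vector per column.

    `image` is a list of rows; each returned vector holds one vertical line.
    """
    max_value = levels - 1
    n_rows = len(image)
    n_cols = len(image[0]) if image else 0
    return [
        [image[r][c] / max_value for r in range(n_rows)]
        for c in range(n_cols)
    ]
```

Each returned vector corresponds to one of the input data X1 to Xn and is fed to the input layer in turn.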

[0027] The neural network processing unit 16 comprises an input layer 20 with m units, equal to the number of pixels in the input data X1 to Xn created by the input data creation unit 14; an intermediate (hidden) layer 22 with an arbitrary number of units; and an output layer 24 with m units, equal to the number of units in the input layer 20.

[0028] In this embodiment, the neural network processing unit 16 is implemented as a recurrent neural network. That is, the units of the input layer 20, the intermediate layer 22, and the output layer 24 have asymmetric, recurrent connections that are not constrained to be symmetric or unidirectional. A recurrent neural network with these characteristics is capable of accurate recognition even for input data contaminated with noise. Recurrent neural networks themselves are known art and are described in detail in, for example, "Keisoku to Seigyo" (Measurement and Control), Vol. 30, No. 4 (April 1991), pp. 296-301.
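To make the layer structure concrete, the sketch below runs a small fully connected recurrent network in which any unit may feed any other through an asymmetric weight matrix. It is only an illustration: the sigmoid activation, the synchronous update, the clamping of the input units, the fixed number of update steps, and the names `sigmoid` and `run_recurrent` are all assumptions, since the patent does not specify the network dynamics.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def run_recurrent(weights, inputs, n_hidden, n_steps=5):
    """Iterate a recurrent network and return the output-unit activations.

    Units are ordered as m input units, n_hidden hidden units, m output units.
    weights[i][j] is the connection from unit j to unit i and need not equal
    weights[j][i], so the coupling may be asymmetric.
    """
    m = len(inputs)
    n_units = m + n_hidden + m
    state = list(inputs) + [0.0] * (n_hidden + m)
    for _ in range(n_steps):
        new_state = [
            sigmoid(sum(weights[i][j] * state[j] for j in range(n_units)))
            for i in range(n_units)
        ]
        new_state[:m] = inputs  # input units stay clamped to the input data
        state = new_state
    return state[m + n_hidden:]
```

In use, the weight matrix would come from the prior training described in the text; here it is simply passed in.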

[0029] In this configuration, the connection strengths between units are obtained in advance by training, in order to establish the relationship between the input data X1 to Xn fed to the input layer 20 and the output data x1 to xn produced by the output layer 24. That is, the network is trained so that when input data containing noise is fed to the input layer 20, output data with the noise removed is produced by the output layer 24. Alternatively, the network is trained so that when input data is fed to the input layer 20, the output layer 24 outputs "1" if the input data forms a large cluster (a set of "1"s) and "0" if it forms a small cluster.

[0030] With a neural network processing unit 16 trained in this way, when unknown input data containing, for example, a character a1 and noise a2 and a3 is given to the input layer 20, accurate output data for the character a1 with the noise a2 and a3 removed is obtained from the output layer 24. Because the neural network processing unit 16 learns the shapes of patterns, correct pattern recognition is possible even when noise is mixed into the input pattern (the "input data" of this embodiment).

[0031] The output data produced by the neural network processing unit 16 in this way is combined by the binarized image synthesis unit 18 and output as a binarized image to a character recognition device or the like.

[0032] Next, the operation of this binarization device for analog images will be described with reference to the flowchart of FIG. 3 and to FIGS. 4 to 9.

[0033] First, the original image b is input through the original image input unit 10 (step S1). FIG. 4 shows examples of the original image b. FIG. 4 is an explanatory diagram that emphasizes the features of the original images b to be processed; an actual original image b is an image expressed with many gradations (densities). Here, original images A, B, and C consist of the digits "2", "6", and "4", respectively, and a background whose density is indicated by hatching. Below original images A, B, and C are graphs of the density values (0 to 255) along lines A1-A1, B1-B1, and C1-C1 of analog images A, B, and C, respectively. The background density of the original images increases in the order B, A, C. If the neural network processing unit 16 tried to learn original images A, B, and C as they are, the amount of information would be enormous and training would be difficult to converge.

[0034] In the above-mentioned Japanese Patent Application No. 6-85519, a difference image creation unit therefore took, for original images A, B, and C, the difference between the average density value of the outermost periphery and the density value of each pixel, producing images with the density values shown in FIG. 5(B). In this way the background density values of original images A, B, and C were made constant. Because this yielded suitable training data, the neural network processing unit 16 was able to binarize correctly.

[0035] However, it turned out that with Japanese Patent Application No. 6-85519 the improvement in binarization accuracy was limited depending on whether the contrast was high or low. That is, low-contrast images were binarized well, but noise was not separated well in high-contrast images. As the density graph for the digit "2" in FIG. 5(B) shows, one cause is that in a high-contrast image the reference value of the neural network ends up set low relative to the density of the original image, so noise and the recognition target could not be separated well. In addition, because the background level was equalized, the background could be made constant successfully for original image data whose density histogram is bimodal, but images with little density difference between the recognition target a1 and the background, that is, images with extremely low contrast, also could not be binarized accurately.

[0036] In this embodiment, therefore, the threshold selection unit 12 first selects the correction threshold c as shown in FIG. 5(C) (step S2), and the input data creation unit 14 then shifts the density level of the original image data b with reference to this correction threshold c as shown in FIG. 5(D) (step S3). Next, input data is generated by normalizing this image (step S4). The neural network processing unit 16 therefore always learns with the correction threshold c as its reference value, and can learn the features of the original image data b well. That is, because the neural network processing unit 16 self-organizes on the basis of the correction threshold c, it can itself take on a configuration suited to the features of the recognition target and the purpose of the processing.

[0037] Further, as shown in FIG. 6, the input data creation unit 14 creates the input data X1 to Xn by dividing the normalized original image b into pixels one line at a time (step S5).

[0038] Next, the input data X1 to Xn are input sequentially to the input layer 20 of the neural network processing unit 16 (step S6). The output data x1 to xn corresponding to the input data X1 to Xn are then output sequentially from the output layer 24 (step S7). For example, output data x1 is output for input data X1 as shown in FIG. 7(A), output data x2 for input data X2 as shown in FIG. 7(B), and output data x3 for input data X3 as shown in FIG. 7(C).

【００３９】最後に、二値化画像合成部１８は、ニュー
ラルネットワーク処理部１６で変換された出力データｘ
1 〜ｘn を合成して二値化画像ｄとして出力する（ステ
ップＳ８）。Finally, the binarized image synthesizing unit 18 outputs the output data x converted by the neural network processing unit 16.
1 to xn are combined and output as a binarized image d (step S8).

[0040] FIGS. 8 and 9 are plan views showing examples of binarized images. FIG. 8 shows the case where the contrast of the original image b is high, and FIG. 9 the case where it is low. FIGS. 8(A) and 9(A) are the original images, FIGS. 8(B) and 9(B) are images binarized simply with a threshold value obtained by the discriminant threshold selection method, and FIGS. 8(C) and 9(C) are binarized images according to the present invention. As is clear from these figures, and as described above, the present invention yields highly accurate binarized images that could not be obtained with the conventional technique.

[0041] It is most desirable for the neural network processing unit 16 to use a recurrent neural network as in this embodiment, but the invention is not limited to this; for example, a feedforward network may be used instead.

[0042] The input data creation unit 14 creates the input data by dividing the difference image c into n pieces in the vertical direction, but the input data may instead be created by dividing the difference image c into m pieces in the horizontal direction. The division is not limited to the vertical and horizontal directions and may be diagonal, and it is not limited to one line at a time: two or more lines may be grouped together as one piece.

[0043] Next, an engraved character recognition device to which the present invention is applied will be described with reference to FIGS. 10 to 14.

[0044] In automobile production processes, engraved characters are used as instruction symbols for assembled parts and as vehicle body numbers. Unlike printed characters, characters stamped into a metal surface have excellent environmental resistance and change little over time. However, their color difference from the surroundings is small, and depending on the surface treatment of the stamped area, oil stains, scratches, and the like, they are not sharp as line drawings.

[0045] Recognizing such engraved characters by image processing requires preprocessing such as image binarization, noise removal, and character segmentation. However, because an unclear character image has low contrast and a lot of noise, it is difficult for conventional binarization methods to achieve the accurate binarization needed to separate the character portion from the background portion and extract only the character portion.

[0046] Methods proposed so far for binarizing printed characters and the like include methods using a density histogram and methods using information such as image differential values, edges, boundary points, and contour lines. With the density-histogram method, accurate binarization is difficult for images in which the contrast between the character portion and the background portion is extremely low, and for images in which the area ratio between the character portion and the background portion is large. The method of calculating the density histogram with the image differential values taken into account, the method of using the degree of coincidence between image boundary points and edges to determine the binarization threshold, and the method of using the ratio of the character contour length to the character area suffer, respectively, from increased calculation time, complicated parameter setting, and difficult edge detection.

[0047] In recent years, methods that apply the pattern learning and recognition capability of hierarchical neural networks to edge extraction and binarization of grayscale images have also been studied. However, because the data the network can handle is limited to local regions of 3x3 or 5x5, sufficient binarization accuracy has not been obtained.

[0048] Meanwhile, in contrast to hierarchical neural networks, which perform static pattern conversion, attempts have been made to learn and recognize time-series patterns with a recurrent neural network (hereinafter, RNN), which has its own internal state.

[0049] In this embodiment, the RNN is applied to the binarization of engraved character images by treating an engraved character image as a time-series pattern that varies in the horizontal or vertical direction. Here, the RNN is trained on time-series data to binarize low-contrast images and to discriminate noise. An engraved character image is then binarized by inputting it to the trained RNN as horizontal or vertical time-series data.

[0050] The engine block is stamped with a three-digit character string using the two numerals "1" and "2" and a four-digit character string using the three letters "A", "B", and "C". In this embodiment, images of the numerals "1" and "2" were used.

[0051] As shown in FIG. 10, the engraved character recognition device comprises the image binarization device described above and a character recognition device that recognizes characters from the binarized image output by the binarization device. The binarization device 42 comprises an original image input unit 30 that generates an analog image by photoelectric conversion, an A/D conversion unit 32 that A/D-converts the analog image to generate an original image with 256 gradations, and a binarization unit 34 that binarizes the original image. Here, the original image input unit 30 comprises an incandescent lamp that illuminates the engraved characters from a direction of 45 degrees, and a monochrome CCD camera that is installed on the side opposite the incandescent lamp and captures the engraved characters.

[0052] The character recognition device 44 comprises a character segmentation unit 36 that cuts the character portions out of the binarized image, and a character recognition unit 38 that performs character recognition by comparing the data output by the character segmentation unit 36 with character sample data. The operation timing and data exchange of each unit of the binarization device 42 and the character recognition device 44 are controlled by the control means 40.

[0053] The original image captured by the image input unit is a monochrome image of 640 dots horizontally by 400 dots vertically with 256 gradations, so the density of each dot ranges from 0 to 255. In this embodiment, the characters in this captured image are framed one at a time with a fixed frame and handled as character images of 100x100 dots.

[0054] FIG. 11 shows a sequence of engraved character images and the density distribution along one line in each image. FIGS. 11(a) and 11(b) are images with high contrast and little noise: the average density value of the background portion is about 190, the average density value of the character portion is about 240, and the density difference between the background and character portions is 50 or more. FIGS. 11(c) and 11(d) are images with low contrast: the average density value of the background portion is about 230, the average density value of the character portion is about 250, and the density difference between the background and character portions is about 20.

[0055] In this embodiment, binarization is performed by treating the image as time-series data. For the image sequence shown in FIG. 11, the density distribution of one line shown in each graph is time-series data obtained by dividing the image into 100 parts in the horizontal direction.

[0056] FIG. 12 shows a flowchart of the engraved character recognition processing.

[0057] To recognize engraved characters, preprocessing such as binarization, segmentation, and noise removal is first applied to the captured image. The preprocessing extracts only the data of the character portions from the image, and the feature values of the characters are then extracted and used for recognition.

[0058] As described above, the binarization of the original image proceeds as follows. First, a threshold value is calculated from the input image by the discriminant analysis method (step S21). Next, the density values of the entire image are shifted so that the obtained threshold value falls on 128, the midpoint of the density range (here, 0 to 255) (step S22). That is, the difference between the density value 128 and the obtained threshold value is taken, and this difference is added to the density value of every pixel so that all density values are shifted uniformly. This is done to keep the density level constant regardless of the state of the imaged object, the imaging environment, and so on, and it serves as preprocessing that lets the subsequent neural network processing operate well. Next, the density value of each pixel, from 0 to 255, is normalized to the range 0.0 to 1.0 and used as input data for the RNN. The created input data is input to the RNN, and binarization is performed using the resulting output (step S23).
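
The preprocessing in steps S21 and S22 and the normalization that follows can be sketched as below. This is a minimal illustration, not the implementation used in the embodiment: the function names `otsu_threshold` and `shift_and_normalize` are hypothetical, the discriminant analysis method is written here as the usual between-class-variance maximization over a 256-level histogram, and clipping shifted values to the range 0 to 255 is an assumption, since the text does not say how values that leave the range are handled.

```python
def otsu_threshold(pixels, levels=256):
    """Discriminant-analysis threshold: the level t that maximizes the
    between-class variance of the classes {p <= t} and {p > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    n_low, sum_low = 0, 0
    for t in range(levels):
        n_low += hist[t]
        sum_low += t * hist[t]
        n_high = total - n_low
        if n_low == 0:
            continue            # lower class still empty
        if n_high == 0:
            break               # upper class empty from here on
        mean_low = sum_low / n_low
        mean_high = (sum_all - sum_low) / n_high
        score = n_low * n_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def shift_and_normalize(image, threshold, mid=128, max_level=255):
    """Shift every pixel by (mid - threshold) so the threshold lands on mid,
    clip to [0, max_level] (assumption), then scale to 0.0 .. 1.0."""
    offset = mid - threshold
    return [[min(max(p + offset, 0), max_level) / max_level for p in row]
            for row in image]


# Two-level test image resembling the low-contrast case (about 230 and 250).
image = [[230, 230, 250, 250],
         [230, 250, 250, 230]]
flat = [p for row in image for p in row]
threshold = otsu_threshold(flat)
normalized = shift_and_normalize(image, threshold)
```

For this two-level test image the threshold comes out as 230, so the shift is 128 - 230 = -102 and the two levels move to 128 and 148 before being scaled to the range 0.0 to 1.0.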

[0059] Next, character recognition processing is performed. In this processing, the character portion is first cut out of the binarized image (step S25), and the feature values of the character are then extracted from the cut-out image (step S26). Character recognition is then performed by pattern recognition processing based on these feature values (step S27).

[0060] The RNN consists of an input layer, an intermediate layer (hidden layer), and an output layer. The connections within this network are of the following six types, and the intermediate layer and the output layer are each fully connected within the layer: input layer to intermediate layer; input layer to output layer; intermediate layer to output layer; intermediate layer to intermediate layer; output layer to intermediate layer; output layer to output layer.

[0061] Because of the memory capacity constraints of the computer, in this embodiment the input layer, the intermediate layer, and the output layer each have five units, and the number of time steps in the learning data is 30. The number of time steps is the number of data items that vary along the time direction of the time-series data.
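
The six connection types and the layer sizes above can be made concrete with a small forward-pass sketch. It is only an illustration under stated assumptions: the embodiment's own update equations are not reproduced here, so the discrete-time update, the logistic response function, the zero initial states, the use of previous-step values on every recurrent connection, and the names `rnn_step` and `run_rnn` are all hypothetical choices.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(mat, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def rnn_step(x_in, h_prev, o_prev, W):
    """One time step. W maps 'source->destination' labels to weight matrices
    for the six connection types; recurrent terms use previous-step values."""
    h_net = [a + b + c for a, b, c in zip(
        matvec(W["I->H"], x_in),      # input layer to intermediate layer
        matvec(W["H->H"], h_prev),    # intermediate layer to intermediate layer
        matvec(W["O->H"], o_prev))]   # output layer to intermediate layer
    o_net = [a + b + c for a, b, c in zip(
        matvec(W["I->O"], x_in),      # input layer to output layer
        matvec(W["H->O"], h_prev),    # intermediate layer to output layer
        matvec(W["O->O"], o_prev))]   # output layer to output layer
    return [sigmoid(v) for v in h_net], [sigmoid(v) for v in o_net]


def run_rnn(sequence, W, n_hidden=5, n_out=5):
    """Feed a sequence of 5-element input vectors and collect the outputs."""
    h, o = [0.0] * n_hidden, [0.0] * n_out
    outputs = []
    for x_in in sequence:
        h, o = rnn_step(x_in, h, o, W)
        outputs.append(o)
    return outputs


# Five units per layer and 30 time steps, as in the embodiment.
N = 5
zero = [[0.0] * N for _ in range(N)]
labels = ("I->H", "I->O", "H->O", "H->H", "O->H", "O->O")
W_zero = {label: zero for label in labels}
outputs = run_rnn([[0.5] * N for _ in range(30)], W_zero)
```

With all weights at zero every output unit sits at 0.5, which is a convenient sanity check before any training is attempted.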

[0062] The operation of each unit of the RNN used in this experiment is expressed by the equations below. Here, the input layer is denoted I, the intermediate layer H, and the output layer O. The symbols are defined as follows: X_i(t) is the external input of unit i; x_i(t) is the internal state value of unit i; y_i(t) is the output value of unit i; w_ij is the connection weight from unit j to unit i; f_i is the response function of unit i; and Y_i(t) is the teacher output of unit i. When i ∈ I, y_i(t) = x_i(t) = X_i(t) ... Equation (2)

[0063] When i ∈ H or i ∈ O, the operation is expressed by

[0064]

[Equation 2]

[0065]

[Equation 3]

[0066] and the evaluation function used during learning is

[0067]

[Equation 4]

[0068] as expressed above.

[0069] As the supervised learning rule for the RNN, BPTT (Back Propagation Through Time), which obtains the gradient of E(w) by the variational method, is used. Letting λ_i be the Lagrange undetermined multipliers,

[0070]

[Equation 5]

[0071] together with the boundary conditions δx_i(0) = 0 and λ_i(T) = 0, gives the gradient of E with respect to w_ij as the following equation.

[0072]

[Equation 6]

[0073] The connection weights between the units are therefore updated by the value given by the following equation, where the differential equations are discretized with respect to time.

[0074]

[Equation 7]

[0075] Conventional binarization methods obtain a threshold value and classify each pixel as 0 or 1 (white or black). Because such methods do not use information about the relationship between the pixel being binarized and its surrounding pixels, the pixel is binarized without knowing whether it belongs to a character or is noise.

[0076] In this embodiment, so that binarization reflects whether the pixel being binarized is part of a character or is noise, the RNN is trained to output 1 for a character portion, 0 for a background portion, and 0 for noise. Here, a cluster of 3 dots or fewer is regarded as noise, and a cluster of 4 dots or more as a character.

[0077] FIG. 13 shows examples of the learning data. The upper part is the input data and the lower part is the teacher data. The vertical direction represents the units: the upper part corresponds to units 1 to 5 of the input layer, and the lower part to units 1 to 5 of the output layer. The horizontal direction represents the time steps and corresponds to steps 1 to 30. The data given to each unit is displayed as an area, with a maximum value of 1.0 and a minimum value of 0.0. That is, the input data obtained by shifting the density of the analog image data and then normalizing it is represented here as areas.

[0078] FIG. 13(A) is learning data for making unit 1 of the output layer output 1.0 when the input data to unit 1 of the input layer is 0.6 or more. The network is trained so that, when data of 0.6 or more is input to unit 1 continuously, it treats that data as part of a character and outputs 1.0.

[0079] FIG. 13(B) is, like FIG. 13(A), learning data for making unit 1 of the output layer output 1.0 when the input data to units 1 and 2 of the input layer is 0.6 or more.

[0080] FIG. 13(C) is learning data for making units 1 and 2 of the output layer output 0.0 when the input data to units 1 and 2 of the input layer is 0.4 or less. The network is trained so that, when data of 0.4 or less is input continuously, it treats that input data as not being a character and outputs 0.0.

[0081] FIG. 13(D) is learning data for training the network so that, even when the input data to the input layer is 0.6 or more, it treats that input data as noise and outputs 0.0 if the data is not continuous (3 steps or fewer). Fifteen sets of data that output 1.0 for an input of 0.6 were created, and likewise fifteen sets that output 1.0 for an input of 1.0. In addition, twelve sets of data that output 0.0 for noise were created.
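
The rule that the learning data in FIG. 13 encodes can be written down as a simple run-length labeling, sketched below for a single unit. This is a hedged illustration of the target behaviour only, not the trained RNN and not the procedure actually used to build the learning data: the function name `teacher_signal` is hypothetical, and giving 0.0 to inputs between 0.4 and 0.6 is an assumption, since the text does not specify a target for that range.

```python
def teacher_signal(inputs, high=0.6, min_run=4):
    """Return the teacher output for one unit's input sequence: 1.0 wherever
    the input stays at or above `high` for at least `min_run` consecutive
    steps (character), and 0.0 everywhere else (background or noise)."""
    target = [0.0] * len(inputs)
    start = None
    # A trailing sentinel below `high` closes a run that reaches the end.
    for t, value in enumerate(list(inputs) + [0.0]):
        if value >= high:
            if start is None:
                start = t
        else:
            if start is not None and t - start >= min_run:
                for k in range(start, t):
                    target[k] = 1.0
            start = None
    return target


# A 3-step burst (noise) followed by a 4-step run (character).
sequence = [0.1, 0.7, 0.8, 0.9, 0.2, 0.6, 0.6, 0.6, 0.6, 0.3]
labels = teacher_signal(sequence)
```

Here the first burst of three high values is labeled 0.0 as noise, while the later run of four is labeled 1.0 as part of a character, matching the 3-dot and 4-dot boundary stated in the text.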

[0082] Next, the results of the binarization processing according to this embodiment will be described. In this embodiment, the RNN was trained with a total of 57 data items as learning data. Training was performed on a workstation using a RISC chip, and the termination condition was a convergence error of 0.01 or less. As a result, training finished after 32,085 iterations and about 3 hours 14 minutes.

[0083] Because the engraved character image is 100x100 dots, the image was divided in the vertical direction into 20 pieces of 5x100 dots each. Twenty trained RNNs were prepared, the divided images were input to the respective RNNs in parallel, and the resulting outputs were taken as the binarized image.
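
The division into 20 pieces and the recombination of the outputs can be sketched as follows. The sketch assumes that each 5x100 piece is 5 dots wide and 100 dots tall, which the text does not state explicitly, and it uses a fixed-level stand-in named `binarize_strip_placeholder` where each trained RNN would go; that function and the names `split_into_strips` and `join_strips` are hypothetical.

```python
def split_into_strips(image, strip_width=5):
    """Split an image (list of rows) into strips of `strip_width` columns."""
    width = len(image[0])
    if width % strip_width != 0:
        raise ValueError("image width must be a multiple of strip_width")
    return [[row[c:c + strip_width] for row in image]
            for c in range(0, width, strip_width)]


def join_strips(strips):
    """Inverse of split_into_strips: place the strips side by side again."""
    height = len(strips[0])
    return [[value for strip in strips for value in strip[r]]
            for r in range(height)]


def binarize_strip_placeholder(strip, level=0.5):
    """HYPOTHETICAL stand-in for one trained RNN: a plain fixed-level cut.
    In the embodiment each strip would instead be fed to its own RNN,
    one 5-element row per time step."""
    return [[1 if value >= level else 0 for value in row] for row in strip]


# A synthetic 100x100 normalized image with values in 0.0 .. 1.0.
image = [[(c % 7) / 6 for c in range(100)] for r in range(100)]
strips = split_into_strips(image)
binary = join_strips([binarize_strip_placeholder(s) for s in strips])
```

Because the strips are independent, the 20 per-strip calls could run in parallel, as the text describes for the 20 RNNs.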

[0084] FIGS. 8 and 9 are examples of binarization results under the conditions of this embodiment.

[0085] For the high-contrast image, with both the discriminant analysis method and RNN-based binarization the character contour matches the contour of the character to be extracted, and the binarization is accurate. Small noise of about 1 to 5 dots in area remains in the image produced by the discriminant analysis method, but this noise is removed by RNN-based binarization.

[0086] For the low-contrast image, so much noise remains in the character and background portions of the image produced by the discriminant analysis method that the character contour cannot be made out, and the image differs from the contour of the character to be extracted. In the image binarized by the RNN, the number and area of the noise regions are smaller than in the image produced by the discriminant analysis method, and the character contour is closer to that of the character to be extracted.

[0087] To compare the accuracy of the binarization, the binarized images were recognized with a hierarchical neural network. Noise removal and character segmentation were applied to the obtained binary images, and the data to be input to the hierarchical network was created from the segmented characters. In this report, the data to be recognized by the network was a 25-dimensional vector obtained by dividing each segmented character into a 5x5 mesh and normalizing the proportion of the area occupied by the character portion within each mesh cell. The network had 25 units in the input layer, 10 in the intermediate layer, and 2 in the output layer. The network was trained on eight characters in total: four each of the characters "1" and "2", all free of noise and with clear contours.
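
The 25-dimensional feature vector can be sketched as below. This is a minimal illustration under assumptions: the function name `mesh_features` is hypothetical, the cell boundaries are chosen by integer division, and the "normalized" area proportion is taken to mean the fraction of character pixels in each cell, which is one plausible reading of the text rather than a confirmed detail.

```python
def mesh_features(char_image, grid=5):
    """Divide a binary character image (list of rows of 0/1) into a
    grid x grid mesh and return, in row-major order, the fraction of
    character pixels (value 1) inside each mesh cell."""
    height, width = len(char_image), len(char_image[0])
    features = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [char_image[y][x]
                    for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell) if cell else 0.0)
    return features


# A 10x10 test image with a vertical bar in columns 4 and 5, a crude "1".
bar = [[1 if x in (4, 5) else 0 for x in range(10)] for y in range(10)]
vector = mesh_features(bar)
```

For this test image the middle column of the mesh is fully occupied and every other cell is empty, so the vector has five entries of 1.0 and twenty entries of 0.0.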

[0088] A total of 150 characters were used for recognition: 69 of the character "1" and 81 of the character "2". A result was counted as correct when the output layer gave an output of 0.9 or more.

[0089] FIG. 14 shows the recognition results. With the images binarized by the RNN, 2 of the 150 characters could not be recognized: one character "1" and one character "2". These two characters could not be recognized in the images binarized by the discriminant analysis method either. In other words, of the 11 characters that could not be recognized in the images binarized by the discriminant analysis method, 9 became recognizable when binarized by the RNN.

[0090] FIG. 15 shows the two images that were recognized incorrectly among the images binarized by the RNN. In the character "1" in the upper part of FIG. 15, large noise remains below the character, and this noise is thought to have prevented recognition. Even so, although noise remains in the image binarized by the RNN, the contour of the character "1" itself is close to that of the character to be extracted. In the character "2" in the lower part of FIG. 15, much noise remains on the contour of the character, and the contour also differs from that of the character to be extracted.

[0091] When 150 characters, including images with extremely low contrast, were recognized, the number of correct answers for the images binarized by the RNN was 148, a recognition rate of 98.7%. The binarization method of this embodiment can therefore be said to be a binarization algorithm that can cope with a wide range of input conditions.
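
The recognition rates follow directly from the counts given in the text, as the short calculation below shows. The function name `recognition_rate` is hypothetical, and the rate for the discriminant analysis method is derived here from the 11 unrecognized characters mentioned above; it is not a figure quoted from FIG. 14.

```python
def recognition_rate(total, missed):
    """Percentage of characters recognized correctly."""
    return 100.0 * (total - missed) / total


TOTAL = 150                                   # 69 of "1" plus 81 of "2"
rnn_rate = recognition_rate(TOTAL, 2)         # 148 of 150 correct
discriminant_rate = recognition_rate(TOTAL, 11)   # 139 of 150 correct
print(f"RNN: {rnn_rate:.1f}%  discriminant analysis: {discriminant_rate:.1f}%")
```

Rounded to one decimal place, this gives 98.7% for the RNN-based binarization and 92.7% for the discriminant analysis method.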

[0092]

[Effects of the Invention] The present invention is configured and functions as described above. Accordingly, the threshold selection unit calculates a correction threshold value for binarizing the original image data, and the input data creation unit corrects the density level of the original image data on the basis of this correction threshold value, so that the density level of the original image is corrected with reference to the correction threshold value regardless of how dark the analog image captured by the original image input unit is and regardless of its density distribution. The neural network processing unit therefore learns with the correction threshold value as its reference value, and can self-organize in a way that matches the features of the recognition target well. As a result, binarization can be performed more accurately than before. In this way, the invention provides an unprecedented and excellent image binarization device that can binarize with high accuracy without being affected by noise, contrast, and the like.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention.

FIG. 2 is an explanatory diagram showing the configuration and operation of the neural network processing unit shown in FIG. 1.

FIG. 3 is a flowchart showing the operation of the embodiment shown in FIG. 1.

FIG. 4 is an explanatory diagram showing an example of the relationship between the original image and the density distribution along one cross section of it in the embodiment shown in FIG. 1.

FIG. 5 shows the relationship between the original image and the density distribution in the embodiment shown in FIG. 1: FIG. 5(A) is a plan view showing the original image, FIG. 5(B) is a density distribution diagram for the case where a difference image is used, FIG. 5(C) is a density distribution diagram showing the correction threshold value c, and FIG. 5(D) is a density distribution diagram in which the density level has been corrected.

FIG. 6 is an explanatory diagram showing the operation of the input data creation unit in the embodiment shown in FIG. 1.

FIG. 7 is an explanatory diagram showing the operation of the neural network processing unit in an embodiment of the present invention: FIG. 7(A) shows the output data x1 for the input data X1, FIG. 7(B) the output data x2 for the input data X2, and FIG. 7(C) the output data x3 for the input data X3.

FIG. 8 is a plan view showing an example of a high-contrast original image and its binarized images: FIG. 8(A) is the original image, FIG. 8(B) is the binarized image obtained with the conventional technique, and FIG. 8(C) is the binarized image obtained with the present invention.

FIG. 9 is a plan view showing an example of a low-contrast original image and its binarized images: FIG. 9(A) is the original image, FIG. 9(B) is the binarized image obtained with the conventional technique, and FIG. 9(C) is the binarized image obtained with the present invention.

FIG. 10 is a block diagram showing the configuration of a character recognition device to which the present invention is applied.

FIG. 11 is a diagram showing the relationship between engraved characters and density distributions in the embodiment shown in FIG. 10: FIG. 11(A) is a plan view of a high-contrast original image "1" and its density distribution, FIG. 11(B) is a plan view of a high-contrast original image "2" and its density distribution, FIG. 11(C) is a plan view of a low-contrast original image "1" and its density distribution, and FIG. 11(D) is a plan view of a low-contrast original image "2" and its density distribution.

FIG. 12 is a flowchart showing the operation of the embodiment shown in FIG. 10.

FIG. 13 is an explanatory diagram showing learning examples for the neural network processing unit in the embodiment shown in FIG. 10: FIG. 13(A) shows learning data for making unit 1 of the output layer output 1.0 when the input data to unit 1 of the input layer is 0.6 or more, FIG. 13(B) shows learning data for making units 1 and 2 of the output layer output 1.0 when the input data to units 1 and 2 of the input layer is 0.6 or more, FIG. 13(C) shows learning data for making units 1 and 2 of the output layer output 0.0 when the input data to units 1 and 2 of the input layer is 0.4 or less, and FIG. 13(D) shows learning data for outputting 0.0 when the input data to the input layer is 0.6 or more but not continuous.

FIG. 14 is a table showing the results of character recognition with the embodiment shown in FIG. 10.

FIG. 15 is a plan view showing the incorrectly recognized images in the embodiment shown in FIG. 10.

[Explanation of symbols]

10: original image input unit
12: threshold selection unit
14: input data creation unit
16: neural network processing unit
18: binarized image synthesis unit
20: input layer
22: intermediate layer
24: output layer
b, A, B, C: original image
c: correction threshold value
X1 to Xn: input data
x1 to xn: output data

Claims

[Claims]

1. An image binarization device comprising: an original image input unit that captures an image containing a recognition processing target and converts the captured analog image into an original image having gradation; a threshold selection unit that calculates a correction threshold value for binarizing the original image; an input data creation unit that creates input data by correcting the density level of the original image on the basis of the correction threshold value and normalizing the original image; and a neural network processing unit that binarizes the input data created by the input data creation unit by neural network processing and outputs the result.