JP2010026805A - Character recognition device and character recognition method

Info

Publication number: JP2010026805A
Application number: JP2008187575A
Authority: JP
Inventors: Takashi Murozaki; 隆室崎
Original assignee: Denso Corp
Current assignee: Denso Corp
Priority date: 2008-07-18
Filing date: 2008-07-18
Publication date: 2010-02-04

Abstract

PROBLEM TO BE SOLVED: To provide a character recognition device and a character recognition method that accurately recognize characters to be recognized even when those characters are unclear.

SOLUTION: The character recognition device includes: an image acquisition means (2) for acquiring an inspection image obtained by photographing an object to be inspected; a character area detection means (34) for detecting, from the inspection image, a character area in which a character appears; a stroke detection means (38) for detecting, in at least a partial area of the character area, pixels that match any of a plurality of strokes representing the partial structure of each of a plurality of characters to be recognized, and for detecting the number of each stroke included in that partial area by counting the matched pixels; and a pattern identification means (39) for calculating the certainty of each character to be recognized from the numbers of strokes, and for deciding that the character appearing in the character area is the character to be recognized whose certainty is the highest.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a character recognition device and a character recognition method, and more particularly to a device and a method that recognize a character from an image obtained by photographing the character marked on an inspection object.

In recent years, methods and devices that recognize characters by acquiring, as an image, character information printed or stamped on an inspection object and analyzing that image have been actively developed. In such devices and methods, feature quantities considered to represent features that differ for each character type are extracted from the image, and characters are recognized based on those feature quantities. Alternatively, when the font and size of the characters to be recognized are known in advance, character templates corresponding to that font and size are prepared, pattern matching is performed between the image and the templates, and the character is recognized by identifying the best-matching template.

However, when noise information is superimposed on the character string information, for example because the surface of the inspection object is dirty, characters may not be recognized accurately because feature quantities cannot be extracted or an incorrect pattern matching result is obtained. Likewise, when each character is printed as a set of dots, as with a dot matrix printer, the discontinuity between dots acts in the same way as noise, so characters may not be recognized accurately. Methods have therefore been developed that recognize characters after a preprocessing step that attenuates the noise information.

For example, in the image data processing method for optical character recognition disclosed in Patent Document 1, a histogram of the pixel density values of the image data is created, the dynamic range of the image data is first reduced by 2-D spatial averaging, the contrast range is then widened for regions considered to contain character information, edge enhancement is applied, and finally binarization is performed, yielding image data in which noise is attenuated.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H5-314315 (JP-A-5-314315)

However, when scratches or dirt on the surface of the inspection object overlap a character to be recognized so that part of the character is unclear, image information about the character (for example, density or edge strength) may not be obtained accurately. In such cases, even the preprocessing described above cannot emphasize only the information related to the character, so no improvement in recognition accuracy could be expected.

In view of the above problems, an object of the present invention is to provide a character recognition device and a character recognition method that can accurately recognize a character even when the character to be recognized is unclear.

According to the aspect described in claim 1 of the present invention, a character recognition device that recognizes characters marked on an inspection object is provided. The character recognition device has: image acquisition means (2) that acquires an inspection image obtained by photographing the inspection object; character region detection means (34) that detects, from the inspection image, a character region in which a character appears; line element detection means (38) that, for at least a partial region of the character region, detects pixels matching any of a plurality of line elements representing the partial structures of the respective recognition target characters and counts the matching pixels, thereby detecting the number of each line element contained in that partial region; and pattern identification means (39) that calculates a certainty factor for each recognition target character from the number of each line element and determines that the character appearing in the character region is the recognition target character with the highest certainty factor.
With this configuration, the character recognition device according to the present invention can accurately recognize a character even when scratches, dirt, or the like are superimposed on part of the character to be recognized and the character is unclear.

As described in claim 2, the pattern identification means (39) is preferably constituted by a support vector machine that takes the number of each line element as input and outputs the certainty factor of each recognition target character. By using a support vector machine as the classifier, the character recognition device according to the present invention can accurately identify the characters on the inspection object based on the detected line element counts, even if the line element counts corresponding to the respective recognition target characters are separable only non-linearly.

As described in claim 3, the character recognition device according to the present invention preferably further has pattern matching means (36) that performs template matching between the character region and a plurality of templates corresponding to the respective recognition target characters, calculates the degree of matching with each recognition target character, and selects a plurality of recognition target characters in descending order of the degree of matching; and the pattern identification means (39) preferably determines that the character appearing in the character region is, among the selected recognition target characters, the one with the highest corresponding certainty factor.
With this configuration, the character recognition device according to the present invention recognizes the characters on the inspection object using two different types of recognition means, so recognition accuracy can be further improved.

Furthermore, as described in claim 4, the pattern identification means (39) preferably determines the position or size of the partial region of the character region so that the partial region contains the portions in which the selected recognition target characters differ from one another.
With this configuration, the character recognition device according to the present invention performs recognition by focusing on the portions in which the candidate characters differ most, which makes those differences explicit and further improves recognition accuracy.

According to claim 5, a character recognition method for recognizing characters marked on an inspection object is provided. The character recognition method has: a step of acquiring an inspection image obtained by photographing the inspection object; a step of detecting, from the inspection image, a character region in which a character appears; a step of detecting, for at least a partial region of the character region, pixels matching any of a plurality of line elements representing the partial structures of the respective recognition target characters and counting the matching pixels, thereby detecting the number of each line element contained in that partial region; and a step of calculating a certainty factor for each recognition target character from the number of each line element and determining that the character appearing in the character region is the recognition target character with the highest certainty factor.

The reference numerals in parentheses attached to the above means are examples showing the correspondence with the specific means described in the embodiment below.

A character recognition device according to the present invention is described in detail below with reference to the drawings.
As one example, a character recognition device to which the present invention is applied is installed on the production line of a pump unit that supplies fuel to a vehicle engine, and recognizes character strings such as the model number and date of manufacture stamped on the surface of a part (workpiece) of the pump unit, which is the inspection object. The device first performs character recognition by pattern matching. Characters for which the recognition accuracy of that result is considered low are then recognized using a support vector machine whose input features are the numbers of the respective line elements detected in each character region, each line element representing a partial feature of a character, and whose output is a certainty factor representing the likelihood that the character is a particular character.

FIG. 1 shows a block diagram of the configuration of a character recognition device 1 according to an embodiment of the present invention.
The character recognition device 1 according to this embodiment has an imaging unit 2 that photographs a workpiece 5, which is the inspection object, and acquires an inspection image, and a processing unit 3 that recognizes the character string stamped on the surface of the workpiece 5 based on the inspection image and controls the character recognition device.
Each part of the character recognition device 1 is described in detail below.

The imaging unit 2 photographs the character string stamped on the surface of the workpiece 5 and acquires an inspection image. It captures the image so that the entire character string is contained in the inspection image and each character in the string can be distinguished on the inspection image. To this end, the imaging unit 2 has a two-dimensional detector composed of photoelectric conversion elements such as a CCD or CMOS sensor, and an imaging optical system that forms an image of the surface of the workpiece 5 on the two-dimensional detector. In this embodiment, a 2/3-inch CCD with 640 × 480 pixels is used as the two-dimensional detector, and the luminance value of each pixel is represented by 0 to 255. However, the imaging unit 2 may be configured with a two-dimensional detector having a different pixel count and screen size. The imaging magnification of the imaging optical system is adjusted so that the features of each character can be discerned on the inspection image. The imaging unit 2 may also have an illumination light source that illuminates the workpiece 5.
The imaging unit 2 transmits the acquired inspection image to the processing unit 3.

FIG. 2 shows a schematic diagram of an inspection image acquired by the imaging unit 2. In FIG. 2, model number information 201 and manufacturing date information 202 of the workpiece 5 are stamped as character string information on the surface of the workpiece 5 photographed in the inspection image 200. In this embodiment, the model number information 201 is represented by an eight-digit number. The manufacturing date information 202 consists, from left to right, of a four-digit number indicating the year of manufacture followed by a single letter of the alphabet. The pixels corresponding to each character have relatively high luminance values compared with the pixels corresponding to the background. However, where the surface of the workpiece 5 is dirty or scratched, as in the vicinity of the numeral '2' at the far left of the manufacturing date information 202, the pixels corresponding to the dirt or scratches also have relatively high luminance. Such dirt and scratches can therefore make a character unclear and cause it to be misrecognized.

The processing unit 3 consists of a personal computer (PC) and its peripheral devices. It analyzes the inspection image acquired from the imaging unit 2 and recognizes the character string information stamped on the surface of the workpiece 5.
FIG. 3 shows a functional block diagram of the processing unit 3. As shown in FIG. 3, the processing unit 3 has control means 31, communication means 32, storage means 33, character region detection means 34, normalization means 35, pattern matching means 36, contour extraction means 37, line element detection means 38, and pattern identification means 39.
Each part of the processing unit 3 is described below.

The control means 31 consists of the central processing unit (CPU) of the PC and its peripheral circuits. It operates according to a program loaded into the CPU and controls the imaging unit 2 and each means of the processing unit 3. The communication means 32 is a communication interface that sends and receives control signals, image data, and data signals between the processing unit 3 and the imaging unit 2 or other devices, and consists of I/O ports such as USB, SCSI, RS232C, or Ethernet (registered trademark) and their drivers. The processing unit 3 receives the inspection image from the imaging unit 2 through the communication means 32. Control signals generated by the control means 31 are transmitted to the imaging unit 2 through the communication means 32.
In addition, the processing unit 3 outputs information on the recognized characters to external devices through the communication means 32.

The storage means 33 consists of semiconductor memory such as random access memory (RAM) or read-only memory (ROM), or of a recording medium such as a magnetic disk or optical disk together with its access device, and temporarily stores the inspection image received from the imaging unit 2. The storage means 33 also stores, among other things, the program that controls the processing unit 3.
The storage means 33 further stores character pattern templates corresponding to each character that may be stamped on the workpiece 5 (hereinafter, a recognition target character). In addition to templates corresponding to the ideal shape of each stamped character, the character pattern templates include patterns in which part of the character is missing, created on the assumption that part of a character may not be distinguishable on the inspection image because of scratches or dirt on the surface of the workpiece 5.
The storage means 33 also stores line element templates corresponding to the line elements that represent the partial structure of each character. The line element templates are described in detail later.

The character region detection means 34, normalization means 35, pattern matching means 36, contour extraction means 37, line element detection means 38, and pattern identification means 39 are functional modules implemented, for example, by a program executed on the CPU. Alternatively, these means may be implemented as a dedicated processing board equipped with an image processing processor or the like provided separately from the CPU.

For each character stamped on the workpiece 5, the character region detection means 34 detects, from the inspection image, the character region in which that character appears. Various known methods can be used to detect the character regions. For example, when the position and size of each character on the inspection image are known in advance, the character region detection means 34 sets a character region for each character based on that position and size. When the positional relationship between each character and a reference point on the workpiece 5 having a predetermined shape (for example, an edge or the center of the workpiece 5, or a predetermined marker) is known on the inspection image, the character region detection means 34 detects the position of the reference point, for example by pattern matching with a template corresponding to the shape of the reference point. It then sets the character region for each character according to the position of the reference point and its positional relationship to each character. In this case, the reference point may be any one of the characters to be recognized.
The character region detection means 34 stores the position and size of each character region in the storage means 33 so that the other means can use them.

The normalization means 35 enlarges or reduces each character region so that the size of the character within it is constant. To do so, the normalization means 35 calculates, for example, the average luminance of the pixels in the character region. Among the pixels whose luminance is higher than that average, it detects the uppermost pixel in the character region. In the same way, among the pixels whose luminance is higher than the average luminance of the character region, it detects the lowermost pixel, the leftmost pixel, and the rightmost pixel. The normalization means 35 then enlarges or reduces the vertical length of the character region so that the distance between the detected top pixel and bottom pixel becomes a predetermined height (for example, 60 pixels). Similarly, it enlarges or reduces the horizontal length of the character region so that the distance between the detected leftmost pixel and rightmost pixel becomes a predetermined width (for example, 30 pixels). The normalization means 35 creates the image corresponding to the character region normalized in height and width as a normalized character region image.
The normalization means 35 then stores the normalized character region image in the storage means 33 so that the other means can use it.
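The normalization step can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the function names `bounding_box` and `normalize_region` are hypothetical, the image is assumed to be a list of rows of luminance values, and nearest-neighbor resampling of the cropped character extent is an assumed simplification of "enlarging or reducing the character region".

```python
def bounding_box(img):
    """Return (top, bottom, left, right) of the pixels brighter than the mean."""
    h, w = len(img), len(img[0])
    mean = sum(sum(row) for row in img) / (h * w)
    ys = [y for y in range(h) for x in range(w) if img[y][x] > mean]
    xs = [x for y in range(h) for x in range(w) if img[y][x] > mean]
    if not ys:
        raise ValueError("no pixel is brighter than the mean luminance")
    return min(ys), max(ys), min(xs), max(xs)


def normalize_region(img, out_h=60, out_w=30):
    """Resample the character extent to out_h x out_w (nearest neighbor)."""
    top, bottom, left, right = bounding_box(img)
    src_h = bottom - top + 1
    src_w = right - left + 1
    return [
        [img[top + (y * src_h) // out_h][left + (x * src_w) // out_w]
         for x in range(out_w)]
        for y in range(out_h)
    ]
```

The default sizes of 60 and 30 pixels follow the example values given in the text; interpolation quality is ignored in this sketch.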

The pattern matching means 36 performs pattern matching between the character pattern templates read from the storage means 33 and the normalized character region image to recognize the character contained in the character region.
By pattern matching, the pattern matching means 36 obtains the degree of matching while changing the relative position and angle between the normalized character region image and each character pattern template within a search region, and determines the character pattern template for which the degree of matching is greatest. The degree of matching (correlation coefficient) is obtained by, for example, the following equation.
Here, r is the degree of matching, and I and M represent the luminance values of the normalized character region image and of the character pattern template, respectively. N represents the number of pixels contained in the character pattern template. In this equation, r = 1 when the normalized character region image and the character pattern template match completely, and r = 0 when there is no correlation at all between them.
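The equation referred to above is not reproduced in this text. As an assumption only, a standard normalized correlation coefficient that is consistent with the symbols defined here (I, M, N) and with the stated values r = 1 and r = 0 would take the following form, where the pixel index i is notation introduced for this sketch and the sums run over the N pixels of the template:

```latex
r = \frac{N \sum_{i} I_i M_i - \sum_{i} I_i \sum_{i} M_i}
         {\sqrt{\left( N \sum_{i} I_i^{2} - \Bigl( \sum_{i} I_i \Bigr)^{2} \right)
                \left( N \sum_{i} M_i^{2} - \Bigl( \sum_{i} M_i \Bigr)^{2} \right)}}
```

The exact expression used in the publication may differ, for example by squaring r or by clipping negative values.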

When the maximum degree of matching r_max is higher than a predetermined threshold Th_pm, the pattern matching means 36 concludes that the character represented by the character pattern template corresponding to r_max is the character stamped in that character region. The threshold Th_pm is set, based on experimental results or the like, so that the recognition result is very unlikely to be wrong when r_max exceeds it; it is set to 0.95, for example.

On the other hand, when the maximum degree of matching r_max is equal to or less than the threshold Th_pm, the pattern matching means 36 selects the characters represented by a predetermined number of character pattern templates, in descending order of the degree of matching, as candidate characters for the character contained in that character region. The predetermined number may be a preset fixed value (for example, 2, 3, or 5) or may be variable. When it is variable, the pattern matching means 36 selects as candidate characters the characters represented by all character pattern templates whose degree of matching r is higher than a candidate extraction threshold Th_c (a value lower than Th_pm, set to 0.7, for example). Furthermore, when r_max is lower than the candidate extraction threshold Th_c, the pattern matching means 36 may treat all recognition target characters as candidate characters.
The pattern matching means 36 stores the candidate characters in the storage means 33 in association with the corresponding normalized character region image.
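The two-threshold decision described above can be sketched as follows. This is an illustration under stated assumptions: the function name `classify_by_matching` is hypothetical, and `scores` is assumed to map each recognition target character to the best degree of matching found among its templates.

```python
def classify_by_matching(scores, th_pm=0.95, th_c=0.7, fixed_count=None):
    """Return (recognized_char, candidates) from per-character matching scores.

    recognized_char is None when the best score does not exceed th_pm;
    in that case candidates lists the characters passed to the next stage.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    if scores[best] > th_pm:
        return best, []                      # confident match, no candidates
    if fixed_count is not None:
        return None, ranked[:fixed_count]    # fixed number of candidates
    candidates = [c for c in ranked if scores[c] > th_c]
    if not candidates:
        candidates = ranked                  # fall back to every target character
    return None, candidates
```

For example, `classify_by_matching({"2": 0.9, "7": 0.8, "1": 0.5})` yields no recognized character and the candidates `["2", "7"]`, which would then be handed to the line element stage.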

The contour extraction means 37, line element detection means 38, and pattern identification means 39 process those normalized character region images for which the pattern matching means 36 did not recognize a specific character and instead selected candidate characters.

The contour extraction means 37 extracts, from the normalized character region image, the pixels corresponding to the contour of the character appearing in that image. To do so, it first performs noise removal as preprocessing. Specifically, the contour extraction means 37 applies a median filter (for example, a 3 × 3 pixel or 5 × 5 pixel median filter) to each pixel of the normalized character region image. Alternatively, it applies a morphological opening to the pixels whose luminance is equal to or higher than the average luminance of the normalized character region image. The contour extraction means 37 may also apply various other noise removal processes to the normalized character region image, either instead of or in combination with the above, although processes that do not blur contours are preferable.
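As an illustration of the median filter option, the following minimal sketch removes isolated bright pixels while leaving straight edges in place. The function name `median_filter` is hypothetical, and handling the image border by replicating edge pixels is an assumption that the text does not specify.

```python
from statistics import median


def median_filter(img, size=3):
    """Apply a size x size median filter; borders use edge replication."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Collect the neighborhood, clamping coordinates at the border.
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            row.append(median(window))
        out.append(row)
    return out
```

Because the median picks an existing neighborhood value rather than averaging, a step edge between background and character stays sharp, which matches the stated preference for noise removal that does not blur contours.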

Next, the contour extraction means 37 binarizes the noise-reduced normalized character region image to separate its pixels into pixels corresponding to the character and all other pixels. For this purpose, it calculates the average, maximum, and minimum luminance values of the noise-reduced normalized character region image. To correct the luminance of the image, it takes the difference obtained by subtracting the average luminance value from a predetermined reference value (for example, 128) as an offset value and adds that offset to the luminance value of each pixel. Then, to make the contrast of every normalized character region image uniform, it linearly transforms the gradation of the image so that the offset-adjusted maximum and minimum luminance values become a predetermined maximum value (for example, 255) and a predetermined minimum value (for example, 0), respectively.

After the luminance correction described above, the contour extracting means 37 binarizes each pixel of the normalized character region image: a pixel whose luminance is higher than a predetermined reference value (for example, 128) is given luminance value 255, and a pixel whose luminance is equal to or lower than that reference value is given luminance value 0. As shown in FIG. 2, in the present embodiment the pixels corresponding to characters have relatively high luminance compared with the other pixels. The binarization therefore divides the pixels of the normalized character region image into pixels corresponding to the character (luminance value 255) and other pixels (luminance value 0).
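The thresholding rule above is small enough to state directly in code. The function name `binarize` is hypothetical; the threshold 128 and the output values 255 and 0 are the example values given in the text.

```python
def binarize(img, threshold=128):
    """Map pixels above `threshold` to 255 and all others to 0."""
    return [[255 if p > threshold else 0 for p in row] for row in img]
```

A pixel exactly equal to the reference value falls on the background side, matching the "equal to or lower" wording.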

The contour extracting means 37 then performs edge detection on the binarized normalized character region image to extract the pixels corresponding to the contour of the character. For example, it filters each pixel of the binarized image with a first-order or second-order difference filter such as a Sobel filter, a Prewitt filter, or a Laplacian filter. It then creates a contour image in which each pixel whose filter response has an absolute value equal to or greater than a predetermined value (for example, 255) is treated as an edge pixel on the contour and assigned a contour luminance value (for example, 255) indicating that it belongs to the contour, and every other pixel is assigned luminance value 0.
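As one possible reading of the edge detection step, the sketch below applies the two standard 3 × 3 Sobel kernels to a binarized image and marks a pixel as contour when the larger absolute response reaches the threshold. Combining the horizontal and vertical responses with `max` and leaving border pixels at 0 are assumptions; the text only says that the absolute value of the filter result is compared with a predetermined value.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter_response(img, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered on (y, x)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def contour_image(binary, threshold=255, contour_value=255):
    """Return an image with `contour_value` on edge pixels and 0 elsewhere."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = filter_response(binary, SOBEL_X, y, x)
            gy = filter_response(binary, SOBEL_Y, y, x)
            if max(abs(gx), abs(gy)) >= threshold:
                out[y][x] = contour_value
    return out
```

On a vertical step between a 0 region and a 255 region this marks the pixels on both sides of the step, so the detected edge is two pixels wide, which is consistent with the thinning step that follows in the description.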

Finally, the contour extracting means 37 thins the contour image so that the contour line formed by the connected edge pixels is one pixel wide. In addition, it may apply a morphological closing to the pixels having the contour luminance value so that contour lines broken by noise or the like are reconnected.
The contour extracting means 37 passes the resulting contour image to the line element detecting means 38.

The line element detecting means 38 obtains, as the input feature quantities used for identification by the pattern identification means 39, the number of line elements contained in the contour image; each line element represents a partial structure of the characters to be recognized.
FIGS. 4(a) to 4(h) show examples of the line element templates used in the present embodiment. As shown there, each of the line element templates 401 to 408b is represented by 3 × 3 pixels, and the set of pixels drawn in white forms the line element. For example, the line element template 401 corresponds to a line element represented by a set of pixels arranged in a horizontal row, and the line element template 402 corresponds to a line element represented by a set of pixels arranged in a vertical column. Each pixel belonging to the line element has the contour luminance value described above, and the other pixels have luminance value 0. The line elements are not limited to these; they may be represented at a size different from the example of FIG. 4, such as 5 × 5 pixels. A line element may also have different vertical and horizontal sizes, such as 3 pixels wide × 4 pixels high or 5 pixels wide × 2 pixels high.

The line element detecting means 38 calculates the number of pixels that match each line element, either for the entire contour image or for partial regions that each contain only part of it. In the present embodiment, it divides the contour image in two both vertically and horizontally and, for each of the resulting four partial regions, calculates the number of pixels that match any of the line element templates above. The way of setting partial regions is not limited to this: for example, the contour image may be divided into three equal parts in the vertical direction, or the partial regions may be set to have different sizes.

Whether a particular pixel in the contour image matches a line element template can be determined by pattern matching. For example, when a particular line element template is aligned with a given position in the contour image and the degree of coincidence calculated by an expression similar to expression (1) above equals 1, the line element detecting means 38 determines that the line element corresponding to that template has been detected. Alternatively, it may check whether the luminance values of a pixel of interest in the contour image and of its eight neighboring pixels each match the luminance values of the corresponding pixels of a particular line element template, and determine that the corresponding line element has been detected when all of them match. Each time a particular line element is detected, the line element detecting means 38 adds 1 to the detection count for that line element. The line element templates 405a and 405b shown in FIG. 4(e) are point-symmetric to each other about the center, so the line element detecting means 38 treats the pair (405a, 405b) as corresponding to a single line element. That is, the total number of pixels detected as matching either template 405a or template 405b is taken as the detection count for that one line element. Likewise, for each of the pairs (406a, 406b), (407a, 407b), and (408a, 408b) shown in FIGS. 4(f) to 4(h), the total number of pixels matching either template of the pair is taken as the detection count for one line element. The two point-symmetric templates of a pair may, however, instead be treated as separate line elements.

The method of obtaining the line element detection counts is described in more detail with reference to FIGS. 5(a) to 5(c). FIG. 5(a) shows an example 501 of a binarized normalized character region image, and FIG. 5(b) shows an example 502 of a contour image representing the contour of the character extracted from the normalized character region image 501. FIG. 5(c) is an enlarged view of the partial region 503 in the upper right part of the contour image 502.
The calculation of the detection count of each line element is explained using the framed region 504 in FIG. 5(c) as an example. For pixel 511, the luminance values of the pixel and its eight neighboring pixels match those of the pixels of line element template 408b, so the line element detecting means 38 adds 1 to the detection count of the line element corresponding to the template pair (408a, 408b). For pixel 512, the luminance values of the pixel and its eight neighboring pixels match those of the pixels of line element template 402, so the line element detecting means 38 adds 1 to the detection count of the line element corresponding to template 402. The same holds for pixels 513 to 516: their luminance values and those of their eight neighboring pixels also match template 402. Consequently, the detection count of the line element corresponding to template 402 is 5. Nothing else in region 504 matches any other line element template, so the line element detecting means 38 sets the detection counts of the other line elements to 0.
The line element detecting means 38 passes the detection count of each line element in each of the four partial regions, obtained by halving the contour image vertically and horizontally, to the pattern identification means 39 as feature quantities.
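The eight-neighbor comparison and the per-region counting described above can be sketched as follows. Only templates 401 (horizontal) and 402 (vertical) are defined here, because they are the only ones whose pixel layout the text spells out; the diagonal pairs of FIG. 4 are omitted rather than guessed. The names `TEMPLATES`, `matches`, and `count_line_elements`, and the rule that a match is credited to the quadrant containing its center pixel, are assumptions of this example.

```python
TEMPLATES = {
    "401_horizontal": [[0, 0, 0], [255, 255, 255], [0, 0, 0]],
    "402_vertical": [[0, 255, 0], [0, 255, 0], [0, 255, 0]],
}


def matches(img, y, x, template):
    """True if the 3x3 neighborhood centered on (y, x) equals the template."""
    return all(img[y + j - 1][x + i - 1] == template[j][i]
               for j in range(3) for i in range(3))


def count_line_elements(contour):
    """Count template matches in each quadrant of the contour image.

    Returns {(row_half, col_half): {template_name: count}} where
    row_half 0 is the top half and col_half 0 is the left half.
    """
    h, w = len(contour), len(contour[0])
    counts = {(r, c): {name: 0 for name in TEMPLATES}
              for r in (0, 1) for c in (0, 1)}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            region = (0 if y < h // 2 else 1, 0 if x < w // 2 else 1)
            for name, template in TEMPLATES.items():
                if matches(contour, y, x, template):
                    counts[region][name] += 1
    return counts
```

For an 8 × 8 contour image containing a single vertical line in the right half, the sketch reports three vertical matches in the upper right quadrant and three in the lower right, and no horizontal matches anywhere.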

The pattern identification means 39 identifies the character shown in the character region image from among the candidate characters selected by the pattern matching means 36, based on the detection count of each line element in each partial region. In the present embodiment, the pattern identification means 39 is implemented as a support vector machine.
FIG. 6 shows a conceptual diagram of a support vector machine. A support vector machine is a classifier that, when an object to be identified belongs to one of a plurality of categories, determines which category it belongs to based on one or more feature quantities obtained from the object (hereinafter, a feature quantity set). The boundary between categories is represented by those feature quantity sets of the training data in each category that lie closest to the feature quantity sets of the training data in the adjacent category. The feature quantity sets that represent the boundary between categories are called support vectors. In FIG. 6, each point drawn as a circle is a feature quantity set belonging to category C1, of which feature quantity sets 601 to 603 are the support vectors of category C1. Each point drawn as a diamond is a feature quantity set belonging to category C2, of which feature quantity sets 604 to 606 are the support vectors of category C2. To improve identification accuracy, the support vectors are determined so that the distance (margin) between the support vectors of category C1 and those of category C2 is maximized. Moreover, even when the boundary between categories is nonlinear, a support vector machine can achieve good identification performance by using a kernel function to map the feature quantity sets of the training data into a higher-dimensional space and determining the support vectors there, so that the feature quantity sets of each category become linearly separable.
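To make the role of the support vectors and the kernel function concrete, the sketch below evaluates the usual two-category decision function of a kernel support vector machine, a weighted sum of kernel values between the input and each support vector plus a bias. The RBF kernel, the value of `gamma`, and the toy support vectors in the usage note are assumptions for illustration; the description does not state which kernel the embodiment uses.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel between two feature quantity sets."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def decision_value(x, support_vectors, coeffs, bias=0.0, kernel=rbf_kernel):
    """Signed score: positive for one category, negative for the other.

    coeffs[i] is the signed weight of support_vectors[i], i.e. its
    learned multiplier times its category label (+1 or -1).
    """
    return sum(c * kernel(sv, x)
               for sv, c in zip(support_vectors, coeffs)) + bias
```

With one support vector at (0, 0) weighted +1 and one at (4, 4) weighted -1, an input near (0, 0) receives a positive score and an input near (4, 4) a negative one. Training, which chooses the support vectors and weights so as to maximize the margin, is not shown.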

In the present embodiment, the support vector machine was trained in advance using, as feature quantity sets, the detection counts of each line element in each partial region of binarized normalized character region images corresponding to each recognition target character. The feature quantity sets used as training data were extracted not only from samples in which the whole recognition target character is clearly visible, but also from samples in which part of the character (for example, about 1/10 to 1/4 of the pixels corresponding to the character) is unclear because of scratches or dirt. Feature quantity sets extracted from normalized character region images containing pixels that correspond to scratches or dirt were also used as training data.
Using feature quantity sets extracted from samples whose images contain scratches or dirt as training data in this way improves the robustness of the support vector machine as a classifier.
The trained support vector machine is represented by, among other things, the feature quantity sets corresponding to the support vectors for each recognition target character. These feature quantity sets are stored in advance in the storage means 33 in association with the respective recognition target characters.

On receiving the detection count of each line element in each partial region from the line element detecting means 38, the pattern identification means 39 feeds the counts to the support vector machine as input feature quantities and outputs, for each candidate character, a certainty factor representing the likelihood that the character is that particular recognition target character. When the maximum certainty factor P_max is higher than a predetermined threshold Th_pd, the pattern identification means 39 determines that the candidate character corresponding to P_max is the character shown in the character region. When P_max is equal to or lower than Th_pd, it determines that the character shown in the character region cannot be identified. The threshold Th_pd corresponds to a certainty factor at which the character recognition result is considered sufficiently reliable, and is set as appropriate according to the recognition accuracy required of the character recognition device 1.
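The decision rule above reduces to taking the candidate with the largest certainty factor and rejecting it unless it exceeds the threshold. In this sketch the function name `identify` and the use of `None` to signal "cannot be identified" are assumptions; the certainty factors themselves would come from the support vector machine.

```python
def identify(certainty, th_pd):
    """Return the candidate with the highest certainty factor, or None.

    certainty maps each candidate character to its certainty factor.
    The best candidate is accepted only if its value is strictly
    greater than th_pd.
    """
    if not certainty:
        return None
    best = max(certainty, key=certainty.get)
    return best if certainty[best] > th_pd else None
```

For example, `identify({"6": 0.4, "8": 0.9}, th_pd=0.7)` yields `"8"`, while raising the threshold to 0.9 makes the same input unidentifiable because the comparison is strict.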

Finally, the processing unit 3 arranges the characters recognized by the pattern matching means 36 or the pattern identification means 39 according to the positional relationship of the character regions, and thereby obtains the character string information stamped on the work 5. The processing unit 3 then shows the recognized character string information on a display to notify the user, or outputs it to other equipment communicably connected via the communication means 32.

The operation of the character recognition device 1 to which the present invention is applied, when it recognizes the character string information stamped on one work 5, is described with reference to FIG. 7. The operation of the character recognition device 1 is controlled by the control means 31 of the processing unit 3.

As shown in FIG. 7, when inspection starts, the imaging unit 2 photographs the surface of the work 5 on which the characters are stamped and acquires an inspection image (step S101). The acquired inspection image is sent to the processing unit 3. On receiving the inspection image, the processing unit 3 uses the character region detecting means 34 to detect, for each character stamped on the work 5, the character region containing that character in the inspection image (step S102). The details of the character region detection performed by the character region detecting means 34 are as described above.

Once the character regions corresponding to the characters have been detected, the processing unit 3 sets one of them as the character region of interest (step S103). Using the normalizing means 35, the processing unit 3 then enlarges or reduces the character region of interest so that the character in it has a predetermined size, and creates the normalized character region image corresponding to the character region of interest (step S104).

Next, using the pattern matching means 36, the processing unit 3 performs pattern matching between the normalized character region image corresponding to the character region of interest and the character pattern template of each recognition target character, and calculates a degree of coincidence for each character pattern template (step S105). The pattern matching means 36 then compares the maximum degree of coincidence r_max with a predetermined threshold Th_pm (step S106). When r_max is higher than Th_pm, the pattern matching means 36 determines that the character represented by the character pattern template corresponding to r_max is the character stamped in the character region of interest (step S107).

When r_max is equal to or lower than Th_pm, on the other hand, the pattern matching means 36 selects as candidate characters the characters represented by a predetermined number of character pattern templates, taken in descending order of the degree of coincidence. Using the pattern identification means 39, the processing unit 3 then determines which of the candidate characters is the character stamped in the character region of interest (step S108).

After step S107 or step S108, the processing unit 3 determines whether all the characters stamped on the work 5 have been recognized (step S109). If any character has not yet been recognized, control returns to step S103, and a new character region of interest is set from among the character regions that have not yet been set as the character region of interest. The processing unit 3 then repeats steps S104 to S109. When it is determined in step S109 that all the characters have been recognized, the processing unit 3 arranges the recognized characters according to the positional relationship of the character regions to obtain the character string information, and outputs that information (step S110). The processing unit 3 then ends the recognition processing.
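The control flow of steps S105 to S110 can be summarized in a short sketch. The two callables `match_scores` and `svm_certainty` stand in for the pattern matching means 36 and the pattern identification means 39; their names and signatures, the parameter `n_candidates`, and the use of `None` for an unidentifiable character are hypothetical.

```python
def recognize_character(region, match_scores, svm_certainty,
                        th_pm, th_pd, n_candidates=3):
    """Two-stage recognition of one character region."""
    scores = match_scores(region)                       # S105
    ranked = sorted(scores, key=scores.get, reverse=True)
    if scores[ranked[0]] > th_pm:                       # S106
        return ranked[0]                                # S107
    candidates = ranked[:n_candidates]                  # best-matching candidates
    certainty = svm_certainty(region, candidates)       # S108 (S201 to S205)
    best = max(certainty, key=certainty.get)
    return best if certainty[best] > th_pd else None    # S206 to S208


def recognize_string(regions, **kwargs):
    """Recognize every region, in the order given (S103 to S110).

    The regions are assumed to be already ordered by position.
    """
    return [recognize_character(r, **kwargs) for r in regions]
```

The second stage runs only for regions where template matching is not confident enough, so a clearly stamped character never reaches the support vector machine.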

The detailed procedure of step S108 above is described with reference to FIG. 8.
First, the contour extracting means 37 of the processing unit 3 performs noise removal on the normalized character region image corresponding to the character region of interest (step S201). As described above, the noise removal consists of median filtering, morphological opening, or any of various other noise removal processes, performed alone or in combination. Next, the contour extracting means 37 performs binarization to separate the pixels corresponding to the character from the other pixels (step S202). It then performs edge detection to create a contour image in which the contour of the character is extracted (step S203). Further, it thins the contour image so that the contour line is one pixel wide.

Next, for each of the four partial regions obtained by halving the contour image vertically and horizontally, the line element detecting means 38 of the processing unit 3 detects the number of each line element contained in that partial region (step S204). The details of this counting are as described above for the line element detecting means 38.
Once the detection count of each of the line elements in each partial region has been obtained, the pattern identification means 39 of the processing unit 3 calculates the certainty factor of each candidate character using those counts as input feature quantities (step S205). It then compares the maximum certainty factor P_max with the predetermined threshold Th_pd (step S206). When P_max is higher than Th_pd, the pattern identification means 39 determines that the candidate character corresponding to P_max is the character stamped in the character region of interest (step S207). When P_max is equal to or lower than Th_pd, it determines that the character in the character region of interest cannot be identified (step S208).

As described above, the character recognition device according to one embodiment of the present invention first recognizes the characters stamped on the work surface by pattern matching, and then uses a separately prepared support vector machine to recognize those characters for which the pattern matching result is considered unreliable. By recognizing the characters on the work surface with several different methods in this way, the device achieves good recognition accuracy. In addition, because the input feature quantities of the support vector machine are the detection counts of various line elements representing partial structures of characters, extracted from each partial region of the character region, the influence of scratches, dirt, and the like is suppressed even when they make part of a character unclear. The character recognition device can therefore recognize a character on the inspection object accurately even when part of it is unclear because of scratches or dirt.

The embodiments described above are intended to explain the present invention, and the present invention is not limited to them.
For example, in the above embodiment the pattern matching means 36 may be omitted, and character recognition may be performed directly by the pattern identification means 39, that is, by the support vector machine. In that case no candidate characters are selected, so the pattern identification means 39 calculates the certainty factor for every recognition target character and determines that the character corresponding to the maximum certainty factor is the one shown in the character region. Also, when the size of the characters written on the inspection object is known in advance to be constant, the normalizing means 35 and the enlargement or reduction of the character region that it performs may be omitted.

The pattern identification means 39 may also use, as input feature quantities, only the detection counts of the line elements in a partial region in which all of the selected candidate characters differ from one another. In that case, the position and size of the partial region in which the line elements are counted, such as the upper half, the lower half, the upper left, or the lower right, can be set according to the combination of candidate characters. For example, when the candidate characters are the two characters '6' and '8', the pattern identification means 39 can use the region located in the upper right part of the character region as the partial region. The correspondence between combinations of candidate characters and the partial regions to be used is defined in advance in a lookup table and stored in the storage means 33.
In this case, moreover, it is preferable that a separate support vector machine be prepared for the pattern identification means 39 for each combination of candidate characters, and that each support vector machine be trained in advance to take as input feature quantities the detection counts of the line elements in only the partial region to be used.
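The lookup table that maps a combination of candidate characters to the partial regions to be used could be organized as in the sketch below. The single entry for '6' and '8' follows the example in the text; the region names, the dictionary layout, and the fallback to all four regions when a combination is not listed are assumptions.

```python
ALL_REGIONS = ("upper_left", "upper_right", "lower_left", "lower_right")

# Keyed by the unordered set of candidate characters.
REGION_LOOKUP = {
    frozenset({"6", "8"}): ("upper_right",),
}


def regions_to_use(candidates):
    """Partial regions whose line element counts are fed to the classifier."""
    return REGION_LOOKUP.get(frozenset(candidates), ALL_REGIONS)


def select_features(counts_by_region, candidates):
    """Concatenate the counts of only the regions chosen for these candidates."""
    return [n for region in regions_to_use(candidates)
            for n in counts_by_region[region]]
```

Keying the table by a `frozenset` makes the lookup independent of the order in which the candidates were ranked. A matching classifier per table entry, trained only on the selected regions, would be needed as the text notes; that part is not shown.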

The line element detecting means 38 may also skip contour extraction and its preprocessing, directly thin the pixels of the character region image whose luminance is equal to or higher than a predetermined threshold (for example, the average luminance value of the character region image), and then obtain the number of line elements. In that case, the support vector machine of the pattern identification means 39 is trained using, as training data, the number of each line element detected in each partial region after thinning is applied directly to sample character region images.
Further, the pattern identification means 39 may be implemented with another nonlinear classifier instead of a support vector machine. For example, it may be implemented as a perceptron-type neural network having three or more layers.

As described above, various modifications can be made within the scope of the present invention to suit the form in which it is implemented.

FIG. 1 is a schematic block diagram of a character recognition device to which the present invention is applied.
FIG. 2 is a schematic view of an inspection image obtained by photographing the work to be inspected.
FIG. 3 is a functional block diagram of the processing unit.
FIGS. 4(a) to 4(h) each show an example of a line element.
FIG. 5(a) shows an example of a binarized normalized character region image, FIG. 5(b) shows an example of a contour image representing the contour of the character extracted from the normalized character region image of FIG. 5(a), and FIG. 5(c) is an enlarged view of the partial region in the upper right part of the contour image of FIG. 5(b).
FIG. 6 is a conceptual diagram of a support vector machine.
FIG. 7 is a flowchart showing the overall operation of the character recognition device to which the present invention is applied.
FIG. 8 is a flowchart showing the operation of the pattern identification processing.

Explanation of symbols

1 Character recognition device
2 Imaging unit (image acquisition means)
3 Processing unit
31 Control means
32 Communication means
33 Storage means
34 Character region detection means
35 Normalization means
36 Pattern matching means
37 Contour extraction means
38 Line element detection means
39 Pattern identification means
5 Workpiece

Claims

A character recognition device comprising:
image acquisition means (2) for acquiring an inspection image obtained by photographing an inspection object on which characters are written;
character region detection means (34) for detecting, from the inspection image, a character region in which a character appears;
line element detection means (38) for detecting, in at least a partial region of the character region, pixels that match any of a plurality of line elements representing partial structures of a plurality of recognition target characters, and for counting the matching pixels to detect the number of each line element included in the at least partial region; and
pattern identification means (39) for calculating a certainty factor for each recognition target character from the number of each line element, and for determining that the character shown in the character region is the recognition target character having the highest certainty factor.

The character recognition device according to claim 1, wherein the pattern identification unit (39) is configured by a support vector machine that receives the number of each line element and outputs a certainty factor of each recognition target character.

The character recognition device according to claim 1 or 2, further comprising pattern matching means (36) for performing template matching between the character region and a plurality of templates, each corresponding to one of the recognition target characters, to calculate a degree of coincidence with each recognition target character, and for selecting a plurality of recognition target characters in descending order of the degree of coincidence,
wherein the pattern identification means (39) determines that the character shown in the character region is, among the selected recognition target characters, the character having the highest corresponding certainty factor.
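The two-stage decision in this claim can be sketched as follows, assuming binary images of equal size, a simple pixel-agreement score as the degree of coincidence, and a hypothetical `certainty_of` callable standing in for the pattern identification means (39). None of these specifics come from the source; they only illustrate narrowing the candidates by template matching before choosing by certainty factor.

```python
def match_score(region, template):
    """Fraction of pixels that agree between two equally sized binary images."""
    total = sum(len(row) for row in region)
    agree = sum(
        1
        for r_row, t_row in zip(region, template)
        for r, t in zip(r_row, t_row)
        if r == t
    )
    return agree / total


def select_candidates(region, templates, k=2):
    """Return the k characters whose templates match best, best first."""
    scores = {ch: match_score(region, tpl) for ch, tpl in templates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def recognize(region, templates, certainty_of, k=2):
    """Pick, among the template-matching candidates only, the character
    with the highest certainty factor."""
    candidates = select_candidates(region, templates, k)
    return max(candidates, key=certainty_of)
```

Restricting the final choice to the selected candidates means that a character with a high certainty factor but a poor template match is never returned.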

The character recognition device, wherein the pattern identification means (39) determines the position or size of the partial region so that the at least partial region includes a part in which the selected recognition target characters differ from one another.

A character recognition method comprising:
acquiring an inspection image obtained by photographing an inspection object on which characters are written;
detecting, from the inspection image, a character region in which a character appears;
detecting, in at least a partial region of the character region, pixels that match any of a plurality of line elements representing partial structures of a plurality of recognition target characters, and counting the matching pixels to detect the number of each line element included in the at least partial region; and
calculating a certainty factor for each recognition target character from the number of each line element, and determining that the character shown in the character region is the recognition target character having the highest certainty factor.