JP4231375B2

JP4231375B2 - A pattern recognition apparatus, a pattern recognition method, a pattern recognition program, and a recording medium on which the pattern recognition program is recorded.

Info

Publication number: JP4231375B2
Application number: JP2003345256A
Authority: JP
Inventors: 慎吾安藤; 良規草地; 章鈴木; 賢一荒川
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-10-03
Filing date: 2003-10-03
Publication date: 2009-02-25
Anticipated expiration: 2023-10-03
Also published as: JP2005115432A

Description

The present invention relates to a pattern recognition technique for accurately identifying two-dimensional patterns contained in natural images.

In recent years, mobile phones have been equipped with cameras, making it easy to capture and store photo data at any time. One way to make further use of this function is a service that extracts character information from photo data and provides data related to that information. The difficulty here is character recognition, one of the representative problems of pattern recognition. Character recognition has long been studied for applications such as printed-type OCR and handwritten character input interfaces. In those applications it is generally safe to assume that the input data consists of characters. In a service of the kind described above, however, all sorts of textures and patterns other than characters may be input. The recognition method therefore needs to be improved so as to eliminate the error of recognizing non-characters as characters.

The character recognition procedure is broadly divided into three steps: preprocessing, feature extraction, and identification. Of these, feature extraction is a key process that governs recognition performance. One conventional feature extraction technique for character recognition is the "local direction histogram feature" (see, for example, Non-Patent Document 1). This method divides the image data into several local blocks, calculates the edge direction component within each block, and computes the feature quantity by building a frequency distribution of edges quantized into four directions. It has been confirmed to give accurate identification, mainly in handwritten character recognition.
Non-Patent Document 1: Tetsushi Wakabayashi, Shinji Tsuruoka, Fumitaka Kimura, Yasuji Miyake, "Improving the accuracy of handwritten numeral recognition by increasing the number of feature dimensions", IEICE Transactions D-II, Vol. J77-D-II, No. 10, pp. 2046-2053, 1994

However, when data other than characters, such as a complex texture, is input, it can be misrecognized as a complex character. An example of such misrecognition is shown in FIG. 10. This is thought to occur because taking a histogram of direction components in each block discards the relative positions of the edges within the block. In other words, recognizing characters in natural images requires a feature extraction method that makes the difference between characters and non-characters stand out clearly. The discussion so far has been limited to character recognition, but the same problem is common to the recognition of all patterns and objects in natural images.

The present invention has been made in view of the above problems. Its object is to solve them and to provide a pattern recognition apparatus, a pattern recognition method, a pattern recognition program, and a recording medium on which the pattern recognition program is recorded, all of which enable more accurate recognition.

To achieve the above object, the present invention introduces a new feature extraction method. In most patterns that humans treat as recognition targets, characters being a typical example, the edge components are connected over long runs. FIG. 11 shows the result of applying edge extraction to the example of FIG. 10. With this edge extraction, misrecognition of the kind shown in FIG. 10 is not expected to occur. Accordingly, by reflecting edge connectivity in the feature quantity, it should become possible to separate input data that ought to be rejected from the data of the patterns to be recognized.

Accordingly, the pattern recognition apparatus according to claim 1 is a pattern recognition apparatus for identifying a two-dimensional pattern included in image data, comprising: input means for inputting image data; preprocessing means for preprocessing the image data; feature extraction means for extracting, from the preprocessed image data, a feature quantity that takes edge connectivity into account; learning pattern storage means for storing feature quantities of previously learned patterns; identification means for comparing the feature quantities of the previously learned patterns with the extracted feature quantity and determining an identification result from among the previously learned patterns; and output means for outputting the identification result. The feature extraction means comprises: density value gradient calculation means for extracting an edge strength and an edge direction component from each pixel of the preprocessed image data; direction component quantization means for quantizing the direction component; edge search means for calculating an edge addition value for each pixel; and local direction frequency distribution creating means for dividing the preprocessed image data into a plurality of regions, creating in each region a histogram of edge addition values against edge direction components, and calculating a feature vector from this histogram as the feature quantity. The edge search means takes one pixel as a target pixel; determines connected edge candidate pixels from the pixels adjacent to the target pixel on the basis of the quantized edge direction component of the target pixel; determines, as the connected edge pixel, the connected edge candidate pixel whose density gradient value is closest to that of the target pixel; connects to the determined connected edge pixel; takes this connected edge pixel as a connected target pixel and repeats the determination of connected edge candidate pixels and the connection to a connected edge pixel in the same manner until a prescribed number of connections is reached; and calculates, for each pixel, the sum of the edge strengths of all connected edge pixels connected to the target pixel as the edge addition value.

The pattern recognition apparatus according to claim 2 is the pattern recognition apparatus according to claim 1, wherein the local direction frequency distribution creating means divides the image data into a plurality of regions, obtains in each region the sum of the edge addition values for each quantized direction, and obtains a feature vector whose number of dimensions is the number of quantized directions multiplied by the number of regions.

The pattern recognition method according to claim 3 is a pattern recognition method for identifying a two-dimensional pattern included in image data, comprising: an input step of inputting image data; a preprocessing step of preprocessing the image data; a feature extraction step of extracting, from the preprocessed image data, a feature quantity that takes edge connectivity into account; a learning pattern storage step of storing feature quantities of previously learned patterns; an identification step of comparing the feature quantities of the previously learned patterns with the extracted feature quantity and determining an identification result from among the previously learned patterns; and an output step of outputting the identification result. The feature extraction step comprises: a density value gradient calculation step of extracting an edge strength and an edge direction component from each pixel of the preprocessed image data; a direction component quantization step of quantizing the direction component; an edge search step of calculating an edge addition value for each pixel; and a local direction frequency distribution creating step of dividing the preprocessed image data into a plurality of regions, creating in each region a histogram of edge addition values against edge direction components, and calculating a feature vector from this histogram as the feature quantity. The edge search step takes one pixel as a target pixel; determines connected edge candidate pixels from the pixels adjacent to the target pixel on the basis of the quantized edge direction component; determines, as the connected edge pixel, the connected edge candidate pixel whose density gradient value is closest to that of the target pixel; connects to the determined connected edge pixel; takes this connected edge pixel as a connected target pixel and repeats the determination of connected edge candidate pixels and the connection to a connected edge pixel in the same manner until a prescribed number of connections is reached; and calculates, for each pixel, the sum of the edge strengths of all connected edge pixels connected to the target pixel as the edge addition value.

The pattern recognition method according to claim 4 is the pattern recognition method according to claim 3, wherein the local direction frequency distribution creating step divides the image data into a plurality of regions, obtains in each region the sum of the edge addition values for each quantized direction, and obtains a feature vector whose number of dimensions is the number of quantized directions multiplied by the number of regions.

The pattern recognition program according to claim 5 is characterized in that the pattern recognition apparatus or pattern recognition method according to any one of claims 1 to 4 is written as a computer program so that it can be executed.

The recording medium according to claim 6 is characterized in that it records a pattern recognition program in which the pattern recognition apparatus or pattern recognition method according to any one of claims 1 to 4 is written so as to be executable by a computer.

By implementing the above means, an evaluation value representing edge connectivity is reflected in the feature quantity, so that input data that should be rejected comes to have a feature quantity very different from that of the recognition target patterns. As a result, pattern recognition on natural images can be performed accurately.

According to the present invention, using a feature quantity in which an edge connectivity evaluation value is added to the edge strength widens the difference between recognition target pattern data and other input data. It is therefore possible to provide a pattern recognition apparatus, a pattern recognition method, a pattern recognition program, and a recording medium on which the pattern recognition program is recorded, for accurately identifying two-dimensional patterns contained in natural images.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a pattern recognition apparatus according to an embodiment of the present invention, and FIG. 2 is a flowchart of a pattern recognition method according to the embodiment.

Referring to FIGS. 1 and 2, first, in S1, the image data input unit 11 receives image data, such as a natural image captured by a digital camera, and passes it to the preprocessing unit 12.

In S2, the preprocessing unit 12 performs preprocessing. For example, it cuts out from the input digital image data several partial regions, of suitable size and position, in which characters are thought to be present; applies processing such as density value normalization (for example, normalizing the density values to mean 0 and variance 1) or noise removal; and passes the processed data to the feature extraction unit 13.
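
The density value normalization mentioned for S2 can be sketched as follows. This is only an illustration in plain Python, not the implementation of the embodiment: the function name `normalize_density` is hypothetical, the patch is assumed to be a list of rows of numeric density values, and returning all zeros for a flat patch is an assumption made here to avoid dividing by zero.

```python
import math

def normalize_density(patch):
    """Return a copy of a 2-D patch scaled to mean 0 and variance 1."""
    values = [v for row in patch for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0.0:
        # Assumption: a patch with no contrast is mapped to all zeros.
        return [[0.0 for _ in row] for row in patch]
    return [[(v - mean) / std for v in row] for row in patch]
```

Normalizing each cut-out region this way makes the gradient magnitudes computed later comparable between bright and dark regions of the photograph.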

In S3, the feature extraction unit 13 extracts, from the input digital image data, a feature quantity that takes edge connectivity into account, and passes this feature quantity to the identification unit 14.

In S4, the identification unit 14 compares the received feature quantity with the feature quantities of the patterns stored in the learning pattern storage unit 15, determines the identification result from among the stored patterns on the basis of that comparison, and passes the identification result to the output unit 16.

In S5, the output unit 16 outputs the identification result computed by the identification unit 14, and the operation ends.

Next, the configuration of the arithmetic processing units in the feature extraction unit 13 and the way the processing is executed are described in detail with reference to FIGS. 3 and 4. FIG. 3 is a block diagram showing the configuration of the arithmetic processing units, and FIG. 4 is a flowchart of the processing.

Referring to FIGS. 3 and 4, first, in S6, the density value gradient calculation unit 21 calculates the density value gradient (edge) at each pixel of the input image data. This yields the magnitude of the density value gradient (edge strength) and its direction component. In practice the calculation can be done with, for example, a Sobel filter or a Roberts filter. FIG. 5 shows an example of the calculated gradient magnitude; in FIG. 5, darker (blacker) pixels indicate a larger density value gradient (higher edge strength).
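
As an illustration of S6, the sketch below computes the gradient magnitude and direction with the standard 3x3 Sobel kernels, one of the filters named above. The function name `sobel_gradient` is hypothetical, the image is assumed to be a list of rows of numbers with the y axis pointing down, and leaving border pixels at zero is an assumption of this sketch rather than something the text specifies.

```python
import math

# Standard 3x3 Sobel kernels, indexed as [dy + 1][dx + 1].
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_gradient(img):
    """Return (magnitude, angle) maps; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    v = img[y + dy][x + dx]
                    gx += SOBEL_X[dy + 1][dx + 1] * v
                    gy += SOBEL_Y[dy + 1][dx + 1] * v
            mag[y][x] = math.hypot(gx, gy)   # edge strength
            ang[y][x] = math.atan2(gy, gx)   # gradient direction in radians
    return mag, ang
```

For a vertical step edge the horizontal response dominates, so the angle is close to 0 and the magnitude peaks along the step.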

In S7, the direction component quantization unit 22 quantizes the direction component of the density value gradient into four directions: up-down, lower-left to upper-right, left-right, and upper-left to lower-right.
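
One possible way to carry out the four-direction quantization of S7 is sketched below. The bin numbering and the name `quantize_direction` are assumptions for illustration; the text only states which four directions are used. The angle is assumed to be in radians, measured in image coordinates with the y axis pointing down, and opposite gradient directions are folded onto the same axis.

```python
import math

# Assumed bin order (y axis pointing down):
# 0 = left-right, 1 = upper-left to lower-right,
# 2 = up-down,    3 = lower-left to upper-right
DIRECTION_NAMES = (
    "left-right",
    "upper-left to lower-right",
    "up-down",
    "lower-left to upper-right",
)

def quantize_direction(angle):
    """Map a gradient angle in radians to one of four axis bins."""
    folded = angle % math.pi                      # opposite directions share an axis
    return int(round(folded / (math.pi / 4))) % 4

# Example: a gradient pointing straight up falls in the up-down bin.
print(DIRECTION_NAMES[quantize_direction(-math.pi / 2)])
```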

In S8, the edge search unit 23 sequentially searches, for each pixel, for the pixels whose gradient magnitude is closest to that of the pixel itself. FIG. 6 illustrates the concept. In FIG. 6, the gradient direction of the target pixel has been quantized to the lower-left to upper-right direction, and the search for pixels with the closest gradient magnitude proceeds in two directions.

The search method is now described in detail with reference to the flowchart of FIG. 7. First, in S10, the variables i (position along the image width) and j (position along the image height), which represent the pixel position, are initialized to 1.

In S11, the pixel at position (i, j) is taken as the target pixel.

In S12, the magnitude P of the density value gradient of the target pixel is recorded.

In S13, the number of connections n, a variable that counts the connections made so far, is initialized to 1.

In S14, a "connected target pixel" is defined so that the connection process can be carried out sequentially. The connected target pixel records the position of the pixel reached by connection, starting from the target pixel; two of them are prepared so that the search can proceed in two directions.

In S15, the connected edge candidate pixels to be considered for the next connection are chosen from the eight neighbors of the connected target pixel. They are chosen, for example, on the basis of the density value gradient direction of the target pixel, in directions other than the gradient direction, as shown in FIG. 8. In FIG. 8, white circles denote connected target pixels and black circles denote connected edge candidate pixels. Two kinds of each appear because the search advances in two directions.

In S16, among the connected edge candidate pixels, the one whose density value gradient magnitude is closest to P is determined to be the connected edge pixel, and a connection is made to it.

In S17, the connected edge pixel becomes the new connected target pixel.

In S18, it is determined whether the number of connections n has reached the predetermined maximum number of connections N. If it has not, n is set to n + 1 in S23 and the process returns to S15, where connected edge candidate pixels are determined again and the operations of S15 to S18 are repeated. The maximum number of connections N is set to a suitable value in advance. If it is determined in S18 that n has reached N, the process proceeds to S19.

In S19, it is determined whether i has reached its specified value. If it has not, i is set to i + 1 in S22 and the process returns to S11, and the operations of S11 to S19 are repeated. If it is determined in S19 that i has reached the specified value, the process proceeds to S20.

In S20, it is determined whether j has reached its specified value. If it has not, j is set to j + 1 in S21 and the process returns to S11, and the operations of S11 to S20 are repeated. If it is determined in S20 that j has reached the specified value, the operation ends.

The edge search process above yields the set of pixels connected to the target pixel, and an edge addition value is calculated by summing the density value gradient magnitudes of all pixels connected to the target pixel. The edge addition value is calculated for every pixel.
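
The search of S10 to S23 can be sketched in plain Python as below. Several details are assumptions made for this illustration, because they depend on FIG. 8, which is not reproduced in the text: the candidate offsets in `CANDIDATES` take, for each of the two search directions, the three neighbors on that side of the gradient axis; direction bins are numbered 0 = left-right, 1 = upper-left to lower-right, 2 = up-down, 3 = lower-left to upper-right; the sum counts only the connected pixels, not the target pixel itself; indices are 0-based; and a search direction stops early when it runs off the image. The name `edge_addition_values` is hypothetical.

```python
# (dx, dy) candidate offsets per quantized gradient direction of the target
# pixel, one tuple per search direction (assumed reading of FIG. 8).
CANDIDATES = {
    0: (((-1, -1), (0, -1), (1, -1)), ((-1, 1), (0, 1), (1, 1))),
    1: (((0, -1), (1, -1), (1, 0)), ((-1, 0), (-1, 1), (0, 1))),
    2: (((-1, -1), (-1, 0), (-1, 1)), ((1, -1), (1, 0), (1, 1))),
    3: (((-1, 0), (-1, -1), (0, -1)), ((0, 1), (1, 1), (1, 0))),
}

def edge_addition_values(mag, dirq, max_links):
    """Return the edge addition value of every pixel.

    mag:  2-D list of gradient magnitudes (edge strengths)
    dirq: 2-D list of quantized gradient directions (0..3)
    max_links: maximum number of connections N per search direction
    """
    h, w = len(mag), len(mag[0])
    out = [[0.0] * w for _ in range(h)]
    for j in range(h):                           # S20/S21: loop over rows
        for i in range(w):                       # S19/S22: loop over columns
            p = mag[j][i]                        # S12: magnitude P of the target pixel
            total = 0.0
            for side in CANDIDATES[dirq[j][i]]:  # S14: two search directions
                cx, cy = i, j                    # connected target pixel
                for _ in range(max_links):       # S18/S23: up to N connections
                    cands = [(cx + dx, cy + dy) for dx, dy in side
                             if 0 <= cx + dx < w and 0 <= cy + dy < h]  # S15
                    if not cands:
                        break                    # ran off the image
                    # S16/S17: connect to the candidate closest in magnitude to P
                    cx, cy = min(cands, key=lambda q: abs(mag[q[1]][q[0]] - p))
                    total += mag[cy][cx]
            out[j][i] = total
    return out
```

Under these assumptions, a pixel on a long straight edge accumulates the strengths of its neighbors along the edge, while an isolated strong pixel surrounded by weak ones accumulates almost nothing, which is the connectivity effect the feature is meant to capture.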

Referring to FIGS. 3 and 4, in S9, the local direction frequency distribution creation unit 24 creates histograms of the local gradient direction components from the edge addition values. Specifically, as shown in FIG. 9, the image is divided into several partial regions, and in each region block the sum of the edge addition values is obtained separately for each of the four directions. This processing yields a feature vector of 4 × (number of region blocks) dimensions.
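
The histogram step of S9 can be sketched as follows. The grid of equal, non-overlapping blocks and the ordering of the output vector are assumptions of this sketch (FIG. 9 is not reproduced here), and the name `local_direction_histogram` is hypothetical. The output length is the number of directions multiplied by the number of blocks, as stated above.

```python
def local_direction_histogram(edge_sum, dirq, blocks_y, blocks_x, n_dirs=4):
    """Sum edge addition values per block and per quantized direction.

    edge_sum: 2-D list of edge addition values
    dirq:     2-D list of quantized directions (0 .. n_dirs - 1)
    Returns a flat feature vector of length blocks_y * blocks_x * n_dirs.
    """
    h, w = len(edge_sum), len(edge_sum[0])
    feature = [0.0] * (blocks_y * blocks_x * n_dirs)
    for y in range(h):
        by = y * blocks_y // h                   # block row containing this pixel
        for x in range(w):
            bx = x * blocks_x // w               # block column containing this pixel
            index = (by * blocks_x + bx) * n_dirs + dirq[y][x]
            feature[index] += edge_sum[y][x]
    return feature
```

With a 2 × 2 grid and four directions, for example, the feature vector has 4 × 4 = 16 dimensions.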

The description above uses four quantized directions, but the method can be extended to 8 or 16 directions by changing the way the edge search unit 23 determines the connected edge candidate pixels.

Next, the processing in the identification unit 14 is described. The identification unit 14 performs its processing using the subspace method.

The subspace method is a statistical technique that decides the category to which a pattern belongs through projection onto subspaces formed from the distribution of feature vector components. For computing the eigenvectors of the vector components to be transformed, KL analysis based on the Karhunen-Loève transform, a quantization algorithm, is employed, for example. Well-known representative subspace methods include the CLAFIC method and the averaged learning subspace method (ALSM). ALSM belongs to the adaptive learning approaches that also take competing categories into account: it learns category boundaries by iteratively re-spanning the subspaces so that misrecognition of the given training patterns is minimized.

First, sample data of each pattern to be identified is input in advance, and a subspace representing that pattern is calculated and stored in the learning pattern storage unit 15. The subspace can be obtained by calculating the eigenvalues and eigenvectors of the covariance matrix of the feature vectors obtained from the patterns to be identified.
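
The learning step can be sketched as below: build the covariance matrix of the sample feature vectors for one pattern and take its leading eigenvectors as the basis of that pattern's subspace. The text does not say how the eigenvectors are computed; power iteration with deflation is swapped in here only because it needs nothing beyond the standard library. The names `covariance_matrix` and `top_eigenvectors`, the iteration count, and the fixed random seed are all assumptions of this sketch.

```python
import math
import random

def covariance_matrix(samples):
    """Return (mean, covariance) of a list of equal-length feature vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / n
            for b in range(d)] for a in range(d)]
    return mean, cov

def top_eigenvectors(cov, k, iters=500, seed=0):
    """Return up to k (eigenvalue, unit eigenvector) pairs, largest first."""
    rng = random.Random(seed)
    d = len(cov)
    m = [row[:] for row in cov]                  # working copy, deflated below
    pairs = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):                   # power iteration
            mv = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in mv))
            if norm < 1e-12:                     # no variance left to extract
                break
            v = [x / norm for x in mv]
        mv = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
        lam = sum(v[a] * mv[a] for a in range(d))  # Rayleigh quotient
        if lam < 1e-12:
            break
        pairs.append((lam, v))
        for a in range(d):                       # deflate: remove the found component
            for b in range(d):
                m[a][b] -= lam * v[a] * v[b]
    return pairs
```

The mean and the returned eigenvectors together play the role of the subspace data stored for one pattern; how many eigenvectors to keep is a design choice the text leaves open.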

At identification time, the subspace data of all patterns is read from the learning pattern storage unit 15, and the pattern is identified on the basis of the distances to those subspaces. Distance is measured as the simple Euclidean distance in the feature space, and the pattern whose subspace is nearest is taken as the identification result.
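
The identification rule can be illustrated with the sketch below, which measures the Euclidean distance from a feature vector to each stored subspace and returns the label of the nearest one. It assumes that each subspace is stored as a mean vector plus a list of orthonormal basis vectors; that storage layout and the names `distance_to_subspace` and `classify` are hypothetical.

```python
import math

def distance_to_subspace(x, mean, basis):
    """Euclidean distance from x to the subspace through mean spanned by basis.

    basis must be a list of orthonormal vectors.
    """
    centered = [a - b for a, b in zip(x, mean)]
    total = sum(c * c for c in centered)
    projected = sum(sum(c * u for c, u in zip(centered, vec)) ** 2
                    for vec in basis)
    return math.sqrt(max(total - projected, 0.0))  # guard against rounding

def classify(x, subspaces):
    """Return the label whose subspace is nearest to x.

    subspaces maps label -> (mean, basis).
    """
    return min(subspaces,
               key=lambda label: distance_to_subspace(x, *subspaces[label]))
```

The squared distance is the squared length of the centered vector minus the squared length of its projection, so no explicit projection point needs to be built.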

The basic principles of the subspace method are described in detail in, for example, Kenichiro Ishii, Naonori Ueda, Eisaku Maeda, and Hiroshi Murase, "Wakariyasui Pattern Ninshiki" (Easy-to-Understand Pattern Recognition), Ohmsha (1998). Besides the subspace method, other methods may also be used, for example the nearest neighbor method, Fisher's linear discriminant, a support vector machine, a neural network, or the kernel nonlinear subspace method.

The present invention can also be realized by supplying a system or apparatus with a storage medium on which the program code of software implementing the functions of the embodiment described above is recorded, and having the CPU (MPU) of that system or apparatus read and execute the program code stored on the storage medium. In that case, the program code read from the storage medium itself implements the functions of the embodiment described above, and the storage medium storing the program code, for example a CD-ROM, DVD-ROM, CD-R, CD-RW, MO, or HDD, constitutes the present invention.

FIG. 1 is a block diagram showing the configuration of the pattern recognition apparatus.
FIG. 2 is a flowchart of the pattern recognition method.
FIG. 3 is a block diagram showing the configuration of the arithmetic processing units of the feature extraction unit 13.
FIG. 4 is a flowchart of the processing executed by the feature extraction unit 13.
FIG. 5 is a diagram showing an example of the calculated magnitude of the density value gradient.
FIG. 6 is a conceptual diagram of the sequential pixel search.
FIG. 7 is a flowchart of the sequential pixel search.
FIG. 8 is a diagram showing an example of how the next connected edge candidate pixels are determined.
FIG. 9 is a diagram showing an example of creating the histograms of local gradient direction components.
FIG. 10 is a diagram showing an example in which an input complex texture is misrecognized as a complex character.
FIG. 11 is a diagram showing an example of applying edge extraction to FIG. 10.

Explanation of symbols

11 ... Image data input unit
12 ... Preprocessing unit
13 ... Feature extraction unit
14 ... Identification unit
15 ... Learning pattern storage unit
16 ... Output unit
21 ... Density value gradient calculation unit
22 ... Direction component quantization unit
23 ... Edge search unit
24 ... Local direction frequency distribution creation unit

Claims

A pattern recognition apparatus for identifying a two-dimensional pattern included in image data,
Input means for inputting image data;
Preprocessing means for performing preprocessing of the image data;
Feature extraction means for extracting feature quantities considering the connectivity of edges from the preprocessed image data;
Learning pattern storage means for storing feature quantities of previously learned patterns;
Identification means for comparing the feature quantity of the previously learned pattern with the extracted feature quantity, and determining an identification result from the previously learned pattern;
Output means for outputting the identification result,
The feature extraction means includes
Density value gradient calculating means for extracting an edge strength and an edge direction component from each pixel of the preprocessed image data;
Direction component quantization means for quantizing the direction component;
Edge search means for calculating an edge addition value for each pixel;
Local direction frequency distribution creating means for dividing the preprocessed image data into a plurality of regions, creating a histogram of edge addition values for edge direction components in each region, and calculating a feature vector from this histogram as the feature quantity,
The edge search means sets one pixel as a target pixel, determines connected edge candidate pixels from pixels adjacent to the target pixel based on the quantized edge direction component of the target pixel,
Among the connected edge candidate pixels, the one whose density gradient value is closest to that of the target pixel is determined as the connected edge pixel,
Connection is performed to the determined connected edge pixel, and, with this connected edge pixel as a connected target pixel, the determination of connected edge candidate pixels and the connection to a connected edge pixel are repeated in the same manner until a prescribed number of connections is reached,
The sum of the edge strengths of all connected edge pixels connected to the target pixel is calculated for each pixel as the edge addition value.
A pattern recognition apparatus.

The local direction frequency distribution creating means divides image data into a plurality of regions,
  In each region, find the sum of the edge addition values for each quantized direction,
  A feature vector whose number of dimensions is the number of quantized directions multiplied by the number of regions is obtained.
  The pattern recognition apparatus according to claim 1.

A pattern recognition method for identifying a two-dimensional pattern included in image data,
  An input step for inputting image data;
  A preprocessing step of performing preprocessing of the image data;
  A feature extraction step of extracting feature quantities considering the connectivity of edges from the preprocessed image data;
  A learning pattern storage step for storing a feature amount of a previously learned pattern;
  An identification step of comparing the feature quantity of the previously learned pattern with the extracted feature quantity, and determining an identification result from the previously learned pattern;
  An output step of outputting the identification result,
  The feature extraction step includes
  A density value gradient calculating step for extracting an edge strength and an edge direction component from each pixel of the preprocessed image data;
  A direction component quantization step for quantizing the direction component;
  An edge search step for calculating an edge addition value for each pixel;
  A local direction frequency distribution creating step of dividing the preprocessed image data into a plurality of regions, creating a histogram of edge addition values for edge direction components in each region, and calculating a feature vector from this histogram as the feature quantity,
  The edge search step sets one pixel as a target pixel, determines connected edge candidate pixels from pixels adjacent to the target pixel based on the quantized edge direction component,
  Among the connected edge candidate pixels, the one whose density gradient value is closest to that of the target pixel is determined as the connected edge pixel,
  Connection is performed to the determined connected edge pixel, and, with this connected edge pixel as a connected target pixel, the determination of connected edge candidate pixels and the connection to a connected edge pixel are repeated in the same manner until a prescribed number of connections is reached,
  The sum of the edge strengths of all connected edge pixels connected to the target pixel is calculated for each pixel as the edge addition value.
  A pattern recognition method characterized by the above.

The local direction frequency distribution creating step divides the image data into a plurality of regions,
  In each region, find the sum of the edge addition values for each quantized direction,
  A feature vector whose number of dimensions is the number of quantized directions multiplied by the number of regions is obtained.
  The pattern recognition method according to claim 3.

A pattern recognition program characterized in that the pattern recognition apparatus or pattern recognition method according to any one of claims 1 to 4 is described as a computer program and can be executed.

A recording medium on which is recorded a pattern recognition program in which the pattern recognition apparatus or pattern recognition method according to any one of claims 1 to 4 is executable by a computer.