JP6665488B2 - Image processing method and image processing apparatus

Info

Publication number: JP6665488B2
Application number: JP2015216223A
Authority: JP
Inventors: 雄介関川; 育郎佐藤; 鈴木　幸一郎; 幸一郎鈴木
Original assignee: Denso IT Laboratory Inc
Current assignee: Denso IT Laboratory Inc
Priority date: 2015-11-03
Filing date: 2015-11-03
Publication date: 2020-03-13
Anticipated expiration: 2035-11-03
Also published as: JP2017090983A

Description

The present invention relates to an image processing method and an image processing apparatus that detect, in an input image, an object corresponding to a predetermined template image by matching the input image against that template image.

Template matching, which detects a pre-registered image of an object (template image) within an image captured by a camera (query image), is a basic image processing technique that is widely used in industry. For example, rotation matching, which uses template matching to estimate an angle parameter describing in-plane rotation, is used in many applications such as bin picking by factory robots and visual inspection. Making rotation matching fast and robust is therefore an important problem in practical applications.

Rotation matching can be performed quickly with techniques such as SIFT (Scale-Invariant Feature Transform), but such techniques are difficult to apply to textureless objects that have few surface patterns. Several methods based on the correlation between the template image and the query image, which remain effective for textureless objects, have therefore been proposed.

For example, NCC (Normalized Cross-Correlation) is widely used because it is robust against a certain degree of brightness variation and noise. In NCC, the similarity is the inner product normalized by the norm of the query image within the convolution region, which improves robustness compared with using the raw inner product as the similarity. When NCC is applied to rotation matching, however, the correlation between the query image and the template image rotated at a fixed rotation pitch must be computed repeatedly to obtain the similarity for each position and angle pose parameter. Because raising the detection rate and the estimation accuracy of the angle parameter requires a smaller rotation pitch, the number of required convolution operations grows and the processing time increases.

RIPOC (Rotation Invariant Phase Only Correlation), which uses the log-polar transform, is known as a technique for speeding up rotation matching. RIPOC exploits the fact that the power spectra of the template image and the query image do not depend on translation: it applies POC to the log-polar images of the power spectra to find the rotational offset, and then applies POC to the query image corrected by that rotation to estimate the translation. This technique is very efficient because rotation matching requires only two matching passes, one to estimate rotation and one to estimate translation. However, because RIPOC discards the phase component, which characterizes the image, when estimating the rotation angle, detection tends to fail when the background of the query image contains many frequency components similar to those of the template image.

The eigenvalue template method has been proposed as a technique that can solve the problems of the methods described above (see, for example, Patent Document 1). Instead of matching a set of template images against the query image one image at a time, the eigenvalue template method compresses the template image group by singular value decomposition (SVD) and uses the resulting new template images (eigenvalue template images) to calculate the similarity efficiently. Unlike RIPOC, the eigenvalue template method calculates the similarity without discarding important information, so it is also robust to the influence of the background.

Patent Document 1: Japanese Patent No. 5317250

The eigenvalue template method described above consists broadly of two processes: obtaining the correlation value (response) between the query image and each eigenvalue template image, and calculating the similarity for each pose parameter by a product-sum operation between the correlation values and the eigenfunctions (basis functions). However, the product-sum operation that calculates the similarity from the responses of the query image to the eigenvalue template images carries a relatively high computational load and tends to take a long time, so there is room for improvement.

The present invention has been made in view of the above points, and its object is to provide an image processing method and an image processing apparatus that improve on the eigenvalue template method and can calculate the similarity at higher speed.

To achieve the above object, an image processing method according to a first invention detects, in an input image, an object corresponding to a predetermined template image by matching the input image against the template image, and comprises:
a first step (S100, S110) of frequency-decomposing the template image in the rotation direction, using as a basis a predetermined number of frequency components ranging from low to high frequency within a predetermined frequency range, to calculate as many eigenvalue template images as there are frequency components;
a second step (S210) of convolving the input image with the eigenvalue template images to calculate, for each eigenvalue template image, a correlation value with the input image;
a third step (S240) of applying FFT processing, referenced to a predetermined rotation direction, to the calculated correlation values to calculate a similarity for each rotation angle; and
a fourth step (S250, S260) of calculating the rotation angle of the object in the input image from the similarity for each rotation angle.

An image processing apparatus according to a second invention detects, in an input image, an object corresponding to a predetermined template image by matching the input image against the template image, and comprises:
a storage unit (21) that stores eigenvalue template images, equal in number to a predetermined number of frequency components, calculated by frequency-decomposing the template image in the rotation direction using as a basis the predetermined number of frequency components ranging from low to high frequency within a predetermined frequency range;
a correlation value calculation unit (S210) that convolves the input image with the eigenvalue template images stored in the storage unit to calculate, for each eigenvalue template image, a correlation value with the input image;
a similarity calculation unit (S240) that applies FFT processing, referenced to a predetermined rotation direction, to the calculated correlation values to calculate a similarity for each rotation angle; and
a rotation angle calculation unit (S250, S260) that calculates the rotation angle of the object in the input image from the similarity for each rotation angle.

According to the image processing method and image processing apparatus described above, the template image is frequency-decomposed in the rotation direction using as a basis a predetermined number of frequency components, ranging from low to high frequency within a predetermined frequency range, so that as many eigenvalue template images as there are frequency components are calculated. As a result, when calculating the similarity from the correlation values obtained by convolving the eigenvalue template images with the input image, a product-sum operation with the frequency basis functions whose elements are the predetermined number of frequency components and an FFT operation give equivalent results. The FFT operation takes far less time than the product-sum operation between the correlation values and the frequency basis functions. The similarity can therefore be calculated faster than with the conventional eigenvalue template method while maintaining an equivalent detection rate.

The reference numbers in parentheses above merely show one example of the correspondence with specific configurations in the embodiments described later, in order to make the present invention easier to understand, and are not intended to limit the scope of the present invention in any way.

Technical features recited in the claims other than those described above will become apparent from the description of the embodiments below and the accompanying drawings.

FIG. 1 is a configuration diagram showing the overall configuration of a system for bin picking by a factory robot, including the image processing apparatus according to the first embodiment.
FIG. 2 is a graph showing the eigenvalue for each frequency component when an L-shaped workpiece is used as the template image.
FIG. 3 is an explanatory diagram showing that the similarity obtained from an N-dimensional FFT can be approximated using an M-dimensional (M < N) FFT.
FIG. 4 is a flowchart showing the learning process of the first embodiment, including the method of creating the eigenvalue template images.
FIG. 5 is a flowchart showing the matching process applied to an input image in order to detect an object.
FIG. 6 is a flowchart showing the learning process of the second embodiment, including the method of creating the eigenvalue template images.
FIG. 7 is a diagram comparing the methods of the image processing apparatuses according to the first and second embodiments with the conventional eigenvalue template method in terms of matching time, learning time, and memory usage.
FIG. 8 shows sample images used in an experiment to verify the advantage of the methods according to the embodiments.
FIG. 9 is a diagram showing an example of the difference in processing time between the method according to the embodiments and various conventional methods.
FIG. 10 is a diagram showing an example of the difference in detection rate between the method according to the embodiments and various conventional methods.

Embodiments of the present invention are described in detail below with reference to the drawings.

(First Embodiment)
FIG. 1 shows the overall configuration of a system for bin picking by a factory robot, including an image processing apparatus 20 according to the first embodiment.

As shown in FIG. 1, the system includes a camera 10. The camera 10 photographs, for example, the inside of a parts box that holds objects and moves along a line. The image captured by the camera 10 is supplied to the image processing apparatus 20 as the input image.

The image processing apparatus 20 detects, in the input image, an object corresponding to the template image by matching the input image against eigenvalue template images that are prepared in advance and stored in an internal memory 21. The learning process, including the method of creating the eigenvalue template images, is described in detail later with reference to the flowchart of FIG. 4.

The input image usually covers a larger image area than the template image. The image processing apparatus 20 therefore defines, within the input image, a matching region to be compared with the template image and treats it as the query image. Here, the query image means the image used by the image processing apparatus 20 to calculate the similarity.

In doing so, the image processing apparatus 20 sets the matching region multiple times, shifting it by a predetermined number of pixels (for example, one pixel) each time, so that the matching regions cover the entire input image. For each matching region set at a different position in the input image, that is, for each query image, the image processing apparatus 20 applies an FFT operation to the correlation values with the eigenvalue template images to calculate the similarity for each rotation angle, and finds the rotation angle at which the similarity is maximal. When the input image and the template image cover equivalent image areas, the input image may be used directly as the query image.

Based on the similarity calculated for each query image, the image processing apparatus 20 then detects the pose parameters (position and rotation angle) at which the similarity is maximal, or the pose parameters at which the similarity is at or above a threshold, and thereby obtains the pose information ultimately required for the object. For example, the image processing apparatus 20 compares the similarities calculated for the query images with one another and detects the pose of the object from the position and rotation angle of the query image with the highest similarity. The matching process in the image processing apparatus 20 is described in detail later with reference to the flowchart of FIG. 5.

The robot 30 acquires pose information on the position and rotation angle of the object from the image processing apparatus 20 and, based on that pose information, controls its arm, for example, to grasp the corresponding object and assemble it into a product.

Besides bin-picking systems, the image processing apparatus 20 according to the present embodiment can also be applied to, for example, systems that perform visual inspection of parts and products. When applied to a visual inspection system, the image processing apparatus 20 may, for example, output information on the similarity to a display device, and the display device may show a normal indication when the similarity is at or above a threshold and a warning when it is below the threshold.

Next, the technique used in the image processing apparatus 20 according to the present embodiment to speed up correlation-based rotation matching is described.

Conventional correlation-based rotation matching can be formulated as in Equation 1 below. That is, the query image f is convolved N times with the template image T_θ rotated in-plane at a constant rotation pitch (Δθ = 2π/N), which yields the similarity for each of the pose parameters x, y, and θ. Here, R denotes the region in which the template image T_θ overlaps the query image.
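The equation image for Equation 1 is not reproduced in this text. A plausible reconstruction from the definitions above is given below; the pixel offsets (u, v) and the index n are notation assumed here for illustration and may differ from the published equation.

```latex
g(x, y, \theta_n) = \sum_{(u, v) \in R} f(x + u,\; y + v)\, T_{\theta_n}(u, v),
\qquad \theta_n = n\,\Delta\theta, \quad \Delta\theta = \frac{2\pi}{N}, \quad n = 0, \dots, N - 1
```

Evaluating this for every n is what makes the cost grow linearly with the number of rotation steps N.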

If the angle pitch Δθ is reduced to raise the detection rate and the estimation accuracy of the angle parameter θ, the number of required convolution operations increases and the processing time grows. The eigenvalue template method has been proposed as a technique that can solve this problem.

The eigenvalue template method calculates the similarity efficiently by using, for matching, new template images (eigenvalue template images) obtained by compressing the template image group with singular value decomposition. An example of similarity calculation by the eigenvalue template method is described below.

For example, the template image group T is defined as in Equation 2 below.

Then, as in Equation 3 below, the template image group T is approximated by the product of M eigenvalue template images E = [E_0, ..., E_{M-1}], where M is smaller than the number N of images in the template image group T, and M eigenfunctions φ_k(θ) = [φ_0^T(θ), ..., φ_{M-1}^T(θ)]^T.

Substituting the approximation of Equation 3 into Equation 1 gives the approximate expression for the similarity shown in Equation 4 below.

Calculating the similarity with Equation 4 reduces the N convolution operations required by the direct evaluation of Equation 1 (the source text refers to Equation 2 here) to M, which makes the computation more efficient.

To speed up the similarity calculation further, the image processing apparatus 20 according to the present embodiment treats Equation 4 as two steps. The first step, shown in Equation 5 below, calculates the correlation value vector (response vector) r(x, y) by convolving the query image f with the M eigenvalue template images E. The second step, shown in Equation 6 below, calculates the similarity g from the sum of the element-wise products of the correlation value vector r(x, y) and the eigenfunctions φ_k(θ).
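The equation images for Equations 5 and 6 are not reproduced in this text. Under the two-step description above, they can plausibly be written as follows; the offsets (u, v) and the exact index ranges are assumptions made for this sketch.

```latex
% Step 1 (cf. Equation 5): M correlations with the eigenvalue template images
r_k(x, y) = \sum_{(u, v) \in R} f(x + u,\; y + v)\, E_k(u, v), \qquad k = 0, \dots, M - 1

% Step 2 (cf. Equation 6): product-sum with the eigenfunctions
g(x, y, \theta) \approx \sum_{k = 0}^{M - 1} r_k(x, y)\, \varphi_k(\theta)
```

Written this way, step 1 costs M convolutions regardless of the angular resolution, while step 2 must be evaluated for every candidate angle θ at every position (x, y).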

The correlation value vector r of Equation 5 can be calculated efficiently in a fixed processing time once the number M of eigenvalue template images E is fixed. The sum of products of the correlation value vector r and the eigenfunctions φ_k(θ) in Equation 6, however, has a processing load that grows in proportion to the number of elements in each. In other words, if this sum of products can be computed more efficiently, the calculation of the similarity g(x, y, θ) can be made faster still.

The image processing apparatus 20 according to the present embodiment therefore takes note of two facts: the template image group T can be singular-value-decomposed using a DFT (Discrete Fourier Transform) matrix, and the large singular values are concentrated in the low-frequency components. It creates the eigenvalue template images E using, as the eigenfunctions φ, frequency basis functions taken from the low-frequency part of the DFT matrix. With this choice, when calculating the similarity from the correlation value vector r obtained by convolving the eigenvalue template images E with the query image f, a product-sum operation with the low-frequency part of the DFT matrix (the frequency basis functions) as eigenfunctions and an FFT operation give equivalent results. The FFT operation takes far less time than the product-sum operation between the correlation value vector r and the DFT matrix. Rotation matching, that is, the similarity calculation, can therefore be performed faster than with the conventional eigenvalue template method while maintaining an equivalent detection rate.

The theoretical background of the technique described above is explained below.

First, it is shown that the template image group T can be singular-value-decomposed using a DFT matrix.

The variance-covariance matrix of a matrix created by in-plane rotation at a constant rotation pitch, such as the template image group T, is a circulant matrix, and a circulant matrix can be eigendecomposed into the DFT matrix F and an eigenvalue matrix Σ. Equation 7 below therefore holds. The superscript "T" denotes the transpose and "†" denotes the conjugate transpose.
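The circulant property can be checked numerically with a small sketch. It assumes, purely for illustration, that the template is sampled on polar rings so that an in-plane rotation by one pitch is an exact cyclic shift along the angle axis; the helper names (`rotate`, `gram_matrix`, `is_circulant`, `dft_eigen_residual`) are hypothetical and not taken from the patent.

```python
import cmath
import random


def rotate(rings, n):
    """Rotate a polar-sampled image by n angular steps (cyclic shift per ring)."""
    size = len(rings[0])
    return [[ring[(a - n) % size] for a in range(size)] for ring in rings]


def gram_matrix(templates):
    """G[m][n] = inner product of rotated template m with rotated template n."""
    return [[sum(p * q for p, q in zip(s, t)) for t in templates] for s in templates]


def is_circulant(G, tol=1e-9):
    """True if G[m][n] depends only on (n - m) mod N."""
    N = len(G)
    return all(abs(G[m][n] - G[0][(n - m) % N]) < tol
               for m in range(N) for n in range(N))


def dft_eigen_residual(G):
    """Largest |G v_k - lambda_k v_k| over all DFT columns v_k."""
    N = len(G)
    worst = 0.0
    for k in range(N):
        v = [cmath.exp(2j * cmath.pi * k * n / N) for n in range(N)]
        lam = sum(G[0][d] * v[d] for d in range(N))  # eigenvalue from the first row
        for m in range(N):
            gv = sum(G[m][n] * v[n] for n in range(N))
            worst = max(worst, abs(gv - lam * v[m]))
    return worst


random.seed(0)
N = 8                                                  # number of rotation steps
base = [[random.random() for _ in range(N)] for _ in range(3)]  # 3 polar rings
T = [[v for ring in rotate(base, n) for v in ring] for n in range(N)]
G = gram_matrix(T)
print(is_circulant(G), dft_eigen_residual(G) < 1e-9)
```

With real images rotated by interpolation on a Cartesian grid, the property holds only approximately, so the residual would be small rather than at rounding level.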

From Equation 7, the template image group T can be singular-value-decomposed using the DFT matrix F, as in Equation 8 below.

If the eigenvalue template images E are defined as in Equation 9, they can be obtained from the template image group T and the DFT matrix F as in Equation 10.

Next, the method of calculating the similarity using the eigenvalue template images E, created by frequency-decomposing the template image group T as described above, is explained.

By Equation 8, the N eigenvalue template images E are mutually orthogonal, so the similarity of Equation 1 can be expressed as a Fourier series, as in Equation 11 below.

The correlation value vector r_k is defined in the same way as in Equation 5.

In many images the luminance changes only gradually in the in-plane rotation direction, so when the template image group T is expanded as a Fourier series, the large eigenvalues tend to be concentrated in the low-frequency region. For example, FIG. 2 shows the eigenvalue for each frequency component when an L-shaped workpiece is used as the template image and decomposed into eigenvalues and frequency components. FIG. 2 also shows that the eigenvalues corresponding to the low-frequency components are relatively large.

For this reason, the template image group T can be approximated as in Equation 12 below using a truncated DFT matrix F^(N×M) consisting of the M low-frequency components of the discrete Fourier basis. As the M low-frequency components, one can select, for example, M frequency components with the DC component as the lower limit and a predetermined reference frequency as the upper limit. Each eigenvalue template image E is assumed to consist of N elements.

Applying Equation 12 to Equation 11 gives the approximate expression for the similarity based on the M eigenvalue template images E, as in Equation 13 below.

Next, it is shown that the similarity of Equation 13 can be calculated by an N-dimensional FFT of the correlation value vector r obtained by convolving the eigenvalue template images E with the query image f. As preparation, the similarity of Equation 13 is rewritten using the truncated DFT matrix, as in Equation 14 below.

The correlation value vector r in Equation 14 is defined as shown in Equation 15.

The similarity obtained by the product-sum operation between the correlation value vector r of Equation 14 and the truncated DFT matrix F^(N×M) is equivalent to the result of the product-sum operation between the correlation value vector zero-padded to length N, r = [r^T(x, y), 0, ..., 0], and the N-dimensional DFT matrix.

The similarity of Equation 14 can therefore be rewritten as in Equation 16 below and calculated by an N-dimensional FFT operation.
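The equivalence between the truncated-matrix product-sum and an FFT of the zero-padded response vector can be illustrated with a short sketch. The sign of the exponent, the absence of normalization, and the function names are assumptions made here, since the equation images are not available. A small radix-2 FFT is included only to keep the sketch dependency-free; in practice a library FFT would take its place.

```python
import cmath
import random


def fft(x, sign=-1):
    """Recursive radix-2 FFT; sign=+1 gives the unnormalized inverse transform."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def similarity_by_matrix(r, N):
    """Product-sum with the N x M truncated DFT matrix: O(N * M)."""
    return [sum(r[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(len(r)))
            for n in range(N)]


def similarity_by_fft(r, N):
    """Zero-pad r to length N (a power of two here) and transform: O(N log N)."""
    padded = list(r) + [0j] * (N - len(r))
    return fft(padded, sign=+1)


random.seed(1)
N, M = 16, 4                       # N rotation steps, M low-frequency components
r = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(M)]
g_matrix = similarity_by_matrix(r, N)
g_fft = similarity_by_fft(r, N)
max_diff = max(abs(a - b) for a, b in zip(g_matrix, g_fft))
print(max_diff < 1e-9)
```

The two results agree to rounding error because the padded zeros contribute nothing to the sum; only the cost of evaluating it changes.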

To make the similarity calculation faster still, it is next shown that the similarity obtained from the N-dimensional FFT of Equation 16 can be approximated using a faster M-dimensional (M < N) FFT.

As shown in FIG. 3, the projection of the DFT matrix F^(N×M) onto the M-dimensional DFT matrix looks like a diagonal matrix that has been squashed vertically. It follows that the similarity of Equation 17 is well approximated by taking the similarity obtained from an M-dimensional FFT of the correlation value vector r, without zero padding, and stretching it along the rotation-angle direction.

A peak search on the result of the M-dimensional FFT therefore gives, for each matching region, an estimate of the peak similarity in the rotation direction, g(x, y, θmax), and an initial value of the corresponding angle parameter θmax(x, y). To improve the accuracy of θmax(x, y) further, the rotation angle is then estimated at sub-pixel precision by Newton's method, using θmax(x, y) as the initial value. This is possible because the similarity is described by an analytic function (a linear combination of Fourier basis functions), so its second derivative can also be calculated analytically. Alternatively, the rotation angle may be estimated by parabola fitting using three points around the peak.
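The parabola-fitting alternative mentioned above can be sketched as follows. This is a minimal illustration under assumed conventions (similarity samples `g` at N equally spaced angles, wrap-around at the ends); `refine_peak` is a hypothetical name, and the Newton-based refinement is not shown.

```python
import math


def refine_peak(g):
    """Return (coarse, refined) peak angles in radians for similarity samples g."""
    N = len(g)
    pitch = 2.0 * math.pi / N
    i = max(range(N), key=lambda n: g[n])              # coarse peak search
    left, mid, right = g[(i - 1) % N], g[i], g[(i + 1) % N]
    curvature = left - 2.0 * mid + right
    # Vertex of the parabola through the three samples around the peak.
    offset = 0.0 if curvature == 0 else 0.5 * (left - right) / curvature
    return i * pitch, ((i + offset) * pitch) % (2.0 * math.pi)


true_angle = 1.0                                       # radians, illustrative value
N = 32
g = [math.cos(2.0 * math.pi * n / N - true_angle) for n in range(N)]
coarse, refined = refine_peak(g)
print(abs(coarse - true_angle), abs(refined - true_angle))
```

For this synthetic similarity curve the refined estimate lands much closer to the true angle than the sample pitch of 2π/32 would allow on its own.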

The learning process in the image processing apparatus 20 is described below with reference to the flowchart of FIG. 4, focusing on the method of creating the eigenvalue template images.

In this learning process, first, in step S100, a template image group T consisting of a plurality of template images having different rotation angles is generated. In the following step S110, the template image group T is approximately decomposed, as shown by Expression 12 described above, into a truncated DFT matrix F whose elements are M frequency components and M eigenvalue template images E. The M eigenvalue template images E are created in this way and stored in the memory 21.
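As a rough sketch of steps S100 and S110, the decomposition can be pictured as a DFT taken along the rotation index of the template stack, keeping only the M lowest frequencies. The code below assumes the rotated templates are already available as `stack[n][p]` (pixel p of the template rotated by 2πn/N); the normalization by 1/N and the names `learn_eigen_templates` and `reconstruct` are assumptions for illustration, since Expression 12 is not reproduced in this text.

```python
import cmath
import math


def learn_eigen_templates(stack, m):
    """Return m complex eigenvalue template images from N rotated templates.

    stack[n][p] is pixel p of the template rotated by 2*pi*n/N (assumed layout).
    """
    n_rot = len(stack)
    n_pix = len(stack[0])
    eigen = []
    for k in range(m):
        # One row of the truncated DFT matrix: frequency k over the N rotations.
        w = [cmath.exp(-2j * math.pi * k * n / n_rot) for n in range(n_rot)]
        eigen.append([sum(stack[n][p] * w[n] for n in range(n_rot)) / n_rot
                      for p in range(n_pix)])
    return eigen


def reconstruct(eigen, n, n_rot):
    """Approximate rotated template n from the m lowest frequencies (real input)."""
    out = []
    for p in range(len(eigen[0])):
        value = eigen[0][p].real
        for k in range(1, len(eigen)):
            # Factor 2 accounts for the conjugate (negative) frequency.
            value += 2.0 * (eigen[k][p] * cmath.exp(2j * math.pi * k * n / n_rot)).real
        out.append(value)
    return out
```

When the intensity of each pixel varies slowly with rotation angle, a small M reproduces the stack closely, which is the premise of the approximation.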

The learning process described above may instead be performed on another computer, with the resulting M eigenvalue template images E stored in the memory 21 of the image processing device 20.

次に、図５のフローチャートを参照して、画像処理装置２０において実行される照合処理について説明する。 Next, with reference to the flowchart of FIG. 5, the collation processing executed in the image processing apparatus 20 will be described.

まず、ステップＳ２００において、入力画像において、テンプレート画像と照合される照合領域を設定する。この設定された照合領域がクエリ画像として、以下に説明する各処理の処理対象となる。 First, in step S200, a matching area to be matched with the template image is set in the input image. The set collation area is a query image and is a processing target of each processing described below.

続くステップＳ２１０では、上述した数式５に従い、クエリ画像ｆと、Ｍ枚の固有値テンプレート画像との畳み込み演算を行い、数式１５に示す、Ｍ個の要素からなる相関値ベクトルｒを算出する。 In the following step S210, a convolution operation of the query image f and the M eigenvalue template images is performed in accordance with the above-described Expression 5 to calculate a correlation value vector r including M elements shown in Expression 15.

In step S220, the sum of the absolute values of the elements of the calculated correlation value vector r is computed. The purpose is to obtain a cheap upper bound on the similarity by exploiting the structure of the DFT matrix F; when that upper bound falls below a threshold, the detailed similarity calculation is skipped, which speeds up processing further. In the DFT matrix F, the l_2 norm of each row vector is constant regardless of the corresponding θ. The upper bound of the similarity can therefore be calculated from the sum of the absolute values of the elements of the correlation value vector r.

In step S230, the calculated sum of absolute values is compared with a predetermined threshold. If the sum is determined to be less than the threshold, the processing for calculating the similarity is skipped and the process proceeds to step S270. If the sum is determined to be equal to or greater than the threshold, the process proceeds to step S240.
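The early rejection in steps S220 and S230 can be illustrated as below. The sketch assumes the similarity has the form g(θ) = Re Σ_k r_k·exp(ikθ), so that the unit-modulus Fourier factors give |g(θ)| ≤ Σ_k |r_k|; the function names are hypothetical.

```python
import cmath
import math


def similarity_upper_bound(r):
    """Sum of absolute values of the correlation vector: an upper bound on |g(theta)|."""
    return sum(abs(v) for v in r)


def should_skip(r, threshold):
    """True when even the upper bound cannot reach the threshold (step S230)."""
    return similarity_upper_bound(r) < threshold


def similarity(r, theta):
    """Assumed similarity form, used here only to check the bound."""
    return sum((v * cmath.exp(1j * k * theta)).real for k, v in enumerate(r))
```

The bound costs M absolute values per matching region, so regions that cannot reach the threshold are discarded before any FFT is run.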

ステップＳ２４０では、相関値ベクトルｒに対して、Ｍ次元ＦＦＴ演算を実行する。この際、Ｍ次元ＦＦＴは、数式１３に示されるように、所定の回転角度θを基準として実行される。これにより、Ｍ次元ＦＦＴ演算により得られる各周波数成分と、回転角度との対応関係が一義的に定まる。そのため、ステップＳ２５０において、Ｍ次元ＦＦＴ演算の結果から、類似度がピークとなる角度パラメータθmax（x,y）の初期値を求めることができる。そして、ステップＳ２６０において、ステップＳ２５０で算出したθmax（x, y）を初期値として、Newton法を使って回転角の最終的な推定をサブピクセル単位で行う。 In step S240, an M-dimensional FFT operation is performed on the correlation value vector r. At this time, the M-dimensional FFT is performed based on a predetermined rotation angle θ as shown in Expression 13. Thereby, the correspondence between each frequency component obtained by the M-dimensional FFT operation and the rotation angle is uniquely determined. Therefore, in step S250, the initial value of the angle parameter θmax (x, y) at which the similarity peaks can be obtained from the result of the M-dimensional FFT operation. Then, in step S260, the rotation angle is finally estimated in subpixel units using the Newton method, with θmax (x, y) calculated in step S250 as an initial value.
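Steps S240 and S250 can be sketched as an FFT over the M-element correlation vector followed by an argmax. The code assumes M is a power of two, a reference rotation angle of 0, and the sign convention g(θ_j) = Re Σ_k r_k·exp(+2πi·jk/M); these conventions and the names `fft_radix2` and `coarse_peak` are illustrative assumptions rather than the patent's Expression 13.

```python
import cmath


def fft_radix2(x, sign=1):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2], sign)
    odd = fft_radix2(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def coarse_peak(r):
    """Return (index, angle in degrees, similarity) of the coarse similarity peak."""
    g = [v.real for v in fft_radix2(r, sign=1)]
    j = max(range(len(g)), key=g.__getitem__)
    return j, 360.0 * j / len(g), g[j]
```

The returned angle has a resolution of only 360/M degrees, which is why the sub-pixel refinement of step S260 follows.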

ステップＳ２７０では、入力画像の全域に渡って照合が行われたか否かを判定する。まだ、全域での照合が完了していないと判定された場合には、ステップＳ２００に戻り、照合領域をずらして、ステップＳ２１０以降の処理を実行する。一方、ステップＳ２７０において、全域での照合が完了したと判定された場合には、ステップＳ２８０に進み、全ての照合領域での照合結果に基づいて、対象物の位置及び回転角度を決定する。 In step S270, it is determined whether or not the matching has been performed over the entire area of the input image. If it is determined that the matching in the entire area has not been completed yet, the process returns to step S200, and the processing from step S210 is performed by shifting the matching area. On the other hand, if it is determined in step S270 that the matching in the entire area has been completed, the process proceeds to step S280, and the position and rotation angle of the target object are determined based on the matching results in all the matching areas.

(Second Embodiment)
Next, an image processing device 20 according to a second embodiment of the present invention will be described.

上述した第１実施形態に係る画像処理装置２０では、固有値テンプレート画像Ｅを作成するために、回転角度が異なる複数のテンプレート画像からなるテンプレート画像群Ｔを生成した。このようなテンプレート画像群Ｔを用いる場合、テンプレート画像を回転する処理を複数回行わなければならず、学習処理が煩雑となる。また、回転したテンプレート画像を保存しておくために大量のメモリも必要となる。 In the image processing device 20 according to the first embodiment described above, in order to create the eigenvalue template image E, the template image group T including a plurality of template images having different rotation angles is generated. When such a template image group T is used, the process of rotating the template image must be performed a plurality of times, which complicates the learning process. Also, a large amount of memory is required to store the rotated template image.

Therefore, the image processing device 20 according to the present embodiment expresses a single template image in polar coordinates, applies a Fourier transform in the rotation axis direction to that polar representation to create eigenvalue template images, and uses those eigenvalue template images for the approximate calculation of the similarity.

以下、本実施形態による手法の理論的背景について説明する。 Hereinafter, the theoretical background of the method according to the present embodiment will be described.

First, the similarity between the template image T(r, τ−θ) rotated by the angle θ and the query image f(r, τ) is defined, using the polar coordinate parameters (r, θ), as in Expression 17 below.

Further, the polar-transformed template image T(r, τ) can be written, by Fourier series expansion in the rotation axis direction, as in Expression 18 below.

When the template image is converted into polar coordinates, the luminance variation in the rotation direction concentrates in the low-frequency components, just as it does for the template image group T consisting of a plurality of template images having different rotation angles. Expression 18 can therefore be approximated, as in Expression 19 below, by a linear sum over a basis of M low-frequency components.

Substituting Expression 19 into Expression 17 and rearranging in terms of kθ, the similarity g(θ) can be expressed, as in Expression 20 below, as a linear sum of Fourier bases whose elements are the M low-frequency components described above.

なお、数式２０における、固有値テンプレート画像Ｅ_ｋ（ｒ、τ）は、以下の数式２１のように定義される。
The eigenvalue template image E _k (r, τ) in Expression 20 is defined as in Expression 21 below.

In this way, the eigenvalue template image E_k(r, τ) can be calculated using the M low-frequency components obtained when the template image converted into polar coordinates is expanded in a Fourier series, together with the coefficient φ_k(r) of each low-frequency component.

If Expression 21 is used as is, the query image f must be converted into polar coordinates when it is matched, which increases the computational load, and the inner product operation cannot be accelerated using the convolution theorem.

Therefore, φ_k(r) and ψ_k(τ) in Expression 21 are subjected to the inverse polar coordinate transformation so that, as shown in Expression 22, the eigenvalue template image E_k(r, τ) is calculated in Euclidean coordinates.

As in the first embodiment, the correlation value vector r can then be calculated by a convolution operation between the query image f and the eigenvalue template images E_k. Accordingly, the image processing device 20 according to the present embodiment, like that of the first embodiment, can calculate the similarity and obtain the pose parameters of the object through the matching process shown in the flowchart of FIG. 5.

The learning process executed in the image processing device 20 according to the present embodiment will now be described with reference to the flowchart of FIG. 6.

まず、ステップＳ３００において、テンプレート画像の中心を基準として、テンプレート画像を極座標に変換する。続くステップＳ３１０では、極座標に変換したテンプレート画像を回転軸方向にフーリエ級数展開することにより、周波数分解する。このフーリエ級数展開では、数式１９に示されるように、Ｍ個の低周波数成分を基底とし、そのＭ個の低周波成分の線形和で近似する。これにより、Ｍ個の低周波成分と、各低周波成分の係数φ_ｋ（ｒ）とが定まる。 First, in step S300, the template image is converted into polar coordinates based on the center of the template image. In the subsequent step S310, the template image converted into the polar coordinates is subjected to Fourier series expansion in the direction of the rotation axis, thereby performing frequency decomposition. In this Fourier series expansion, as shown in Expression 19, M low-frequency components are used as a basis, and the low-frequency components are approximated by a linear sum of the M low-frequency components. As a result, M low frequency components and the coefficient φ _k (r) of each low frequency component are determined.

In the following step S320, the function ψ_k(τ), whose elements are the M frequency components, and the coefficient φ_k(r) of each frequency component are converted into Euclidean coordinates. Then, in step S330, the eigenvalue template image E_k(r, τ) is calculated according to Expression 22, using the function ψ_k(τ) whose elements are the M low-frequency components converted into Euclidean coordinates and the coefficient φ_k(r) of each low-frequency component.

In the above description, the template image is first converted into polar coordinates and then Fourier transformed, but the Fourier transform in polar coordinates can also be computed directly from the template image in Euclidean coordinates. As in the first embodiment, the learning process described above may also be performed on another computer, with the resulting M eigenvalue template images E stored in the memory 21 of the image processing device 20.
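The learning flow of steps S300 to S330 might be sketched as follows. Because Expressions 21 and 22 are not reproduced in this text, the sketch assumes the separable form E_k(r, τ) = φ_k(r)·exp(ikτ), uses nearest-neighbor sampling for both coordinate conversions, and rounds the radius to an integer ring; all of these, and the name `polar_eigen_templates`, are illustrative assumptions.

```python
import cmath
import math


def polar_eigen_templates(img, m, n_angles=64):
    """Build m complex eigenvalue templates on the Cartesian grid of a square image."""
    size = len(img)
    c = (size - 1) / 2.0          # rotation center (image center)
    n_radii = int(c) + 1
    # S300/S310: sample each ring and take its angular Fourier coefficients phi_k(r).
    phi = [[0j] * n_radii for _ in range(m)]
    for r in range(n_radii):
        ring = []
        for a in range(n_angles):
            tau = 2.0 * math.pi * a / n_angles
            x = int(round(c + r * math.cos(tau)))
            y = int(round(c + r * math.sin(tau)))
            ring.append(img[y][x])
        for k in range(m):
            phi[k][r] = sum(v * cmath.exp(-2j * math.pi * k * a / n_angles)
                            for a, v in enumerate(ring)) / n_angles
    # S320/S330: evaluate phi_k(r) * exp(i*k*tau) at every Cartesian pixel.
    eigen = [[[0j] * size for _ in range(size)] for _ in range(m)]
    for y in range(size):
        for x in range(size):
            r = int(round(math.hypot(x - c, y - c)))
            if r >= n_radii:
                continue          # outside the largest sampled ring
            tau = math.atan2(y - c, x - c)
            for k in range(m):
                eigen[k][y][x] = phi[k][r] * cmath.exp(1j * k * tau)
    return eigen
```

Only the single template image and the M output images are held in memory here, with no stack of rotated templates, which is the memory advantage this embodiment describes.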

FIG. 7 compares the methods of the image processing devices according to the first and second embodiments described above with the conventional eigenvalue template method in terms of matching time, learning time, and memory usage. In FIGS. 7(a) to 7(c), the label "実１" denotes the result obtained with the method described in the first embodiment, and "実２" denotes the result obtained with the method described in the second embodiment. The values shown in FIGS. 7(a) to 7(c) are examples of theoretical values.

FIG. 7(a) compares the matching time required for the matching process against a query image. In the methods according to the first and second embodiments, the product-sum operation between the correlation value vector and the frequency basis functions is accelerated by an FFT operation, so the matching time is reduced to about 1/5 of that of the conventional method.

FIG. 7(b) compares the learning time, including the creation of the eigenvalue template images. In the method according to the first embodiment, the singular value decomposition (SVD), which has a high processing load within the learning algorithm, is replaced by an FFT, so the learning time is also reduced to about 1/4 of that of the conventional method.

さらに、図７（ｃ）は、学習処理におけるメモリ使用量に関する対比結果を示している。第２実施形態に係る手法では、学習時に、テンプレート画像の極座標変換を使うことで、従来法で必要であった面内回転画像のメモリへの格納を不要としている。これにより、第２実施形態の手法によれば、学習処理時のメモリ使用量を従来法の１／１２０程度まで削減できている。 FIG. 7C shows a comparison result regarding the memory usage in the learning process. In the method according to the second embodiment, the use of the polar coordinate transformation of the template image at the time of learning eliminates the need to store the in-plane rotated image in the memory, which is required in the conventional method. Thus, according to the method of the second embodiment, the memory usage during the learning process can be reduced to about 1/120 of the conventional method.

(Third Embodiment)
Next, an image processing device 20 according to a third embodiment of the present invention will be described.

To strengthen robustness against variations in illumination and background, the image processing device 20 according to the present embodiment applies matching based on phase information, which is robust against background noise, and matching based on the luminance gradient direction, which is robust against illumination variation, to the matching based on information compression of the template image according to the first and second embodiments described above.

輝度勾配の方向は環境の明るさ変動による画像の輝度変化に対して頑強である。また、クエリ画像の背景にテンプレート画像と似たような模様がある場合でも、位相成分を使ったＰＯＣはロバストに類似度を計算できる。これらの考え方を適用した画像を用いて、第１実施形態及び第２実施形態において説明した、学習処理や照合処理を実行することで、照明や背景の変動に対するロバスト化を強化することができる。 The direction of the brightness gradient is robust against a change in the brightness of the image due to a change in the brightness of the environment. Further, even when there is a pattern similar to the template image in the background of the query image, the POC using the phase component can calculate the similarity robustly. By executing the learning process and the matching process described in the first embodiment and the second embodiment using an image to which these ideas are applied, it is possible to enhance robustness against variations in lighting and background.

そこで、本実施形態では、クエリ画像及びテンプレート画像それぞれに関して、以下の数式２３に示すように、輝度勾配の強度と方向からなる複素勾配画像を計算する。
Therefore, in the present embodiment, a complex gradient image composed of the intensity and direction of the luminance gradient is calculated for each of the query image and the template image as shown in Expression 23 below.

数式２３において、Ｔ_ｘ、Ｔ_ｙは、元の画像Ｔに対してガウシアンフィルタを適用した、水平、垂直方向の微分画像である。 In Expression 23, T _x and T _y are differential images in the horizontal and vertical directions obtained by applying a Gaussian filter to the original image T.

Further, an image is generated that retains only the phase component, in the frequency space, of the two-dimensional Fourier transform of this complex gradient image (that is, an image whose amplitude is set to 1).

The variance-covariance matrix of the template group built from images generated by the above method from in-plane rotated template images is a circulant matrix. The techniques described in the first and second embodiments can therefore be applied.

Also, when the background luminance differs greatly between the template image and the query image, the direction of the luminance gradient is reversed (rotated by π). In Expression 23, the rotation component is therefore multiplied by 2 so that the inner product takes the same value even when the gradient direction is reversed, which makes the method robust against such variation.
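The effect of doubling the gradient angle can be checked with a small sketch. Expression 23 is not reproduced in this text, so the form used here, magnitude·exp(2i·angle), is an assumption; central differences stand in for the Gaussian derivative filters, and the subsequent phase-only step in the frequency domain is omitted. The name `complex_gradient` is hypothetical.

```python
import cmath
import math


def complex_gradient(img):
    """Complex gradient image with doubled angle: |grad| * exp(2i * angle).

    Border pixels are left at zero; central differences replace the
    Gaussian derivative filters of the description (assumption).
    """
    h, w = len(img), len(img[0])
    out = [[0j] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx)
            # Doubling the angle maps ang and ang + pi to the same value.
            out[y][x] = mag * cmath.exp(2j * ang)
    return out
```

Inverting the contrast of an image flips every gradient by π, and exp(2i(ang + π)) equals exp(2i·ang), so the complex gradient image is unchanged.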

(Fourth Embodiment)
Next, an image processing device 20 according to a fourth embodiment of the present invention will be described.

例えば、第３実施形態として説明した、複素勾配画像の位相成分のみを使って、クエリ画像と固有値テンプレート画像との相関を計算する場合など、クエリ画像によっては偽ピークが発生し易くなる場合がある。クエリ画像の背景などによって発生する偽ピークは、広い面積で高い値を示す傾向がある。 For example, a false peak may easily occur depending on a query image, for example, when a correlation between a query image and an eigenvalue template image is calculated using only the phase component of the complex gradient image described in the third embodiment. . False peaks generated by the background of the query image and the like tend to show high values over a wide area.

In the present embodiment, false peaks are therefore suppressed by evaluating the FFT operation result, namely the relationship between the peak value peak of the similarity and the mean value mean and variance value σ of its surroundings. Specifically, PSR (Peak Signal Ratio) filter processing is applied to the maximum similarity in the rotation direction to obtain the final similarity. The PSR filter processing is defined by Expression 24 below.

なお、各クエリ画像における、ピーク値peak、平均値mean、分散値σは、インテグラルイメージを用いることで高速に計算することができる。 Note that the peak value peak, average value mean, and variance σ in each query image can be calculated at high speed by using an integral image.
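Expression 24 is not reproduced in this text, so the following is only a sketch under an assumed form, (peak − mean) / σ, with mean and σ taken over a square window around the peak and σ treated as a standard deviation. The window shape, the exclusion of the peak cell, and the name `psr` are assumptions; a brute-force window sum is used where the description suggests integral images.

```python
import math


def psr(score_map, y, x, half_window):
    """Assumed PSR: (peak - mean) / sigma over the window around (y, x), peak excluded."""
    h, w = len(score_map), len(score_map[0])
    side = [score_map[j][i]
            for j in range(max(0, y - half_window), min(h, y + half_window + 1))
            for i in range(max(0, x - half_window), min(w, x + half_window + 1))
            if (j, i) != (y, x)]
    mean = sum(side) / len(side)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in side) / len(side))
    if sigma == 0.0:
        return float("inf")       # perfectly flat surroundings
    return (score_map[y][x] - mean) / sigma
```

Under this form, an isolated sharp peak scores high, while a broad plateau of high values, typical of the false peaks described above, raises the mean and lowers the score.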

(Evaluation experiment)
A simulation experiment was performed to evaluate the usefulness of the speed-up and background/brightness robustness techniques of the embodiments described above. Proposed Method 1 (Proposed-Phase) below corresponds to the method of the first embodiment applied to images generated from the complex gradient image of the third embodiment, and Proposed Method 2 (Proposed-Naive) corresponds to the method of the first embodiment used as is.

(1) Evaluation items
The detection success rate and the processing speed were evaluated. A detection was counted as successful when the XY position was within ±3 [pix] of the correct position and the rotation angle θ was within ±3 [deg]. The processing speed was measured as the time each method took to find the parameters that maximize the similarity g.
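The success criterion above can be written down compactly. The sketch assumes the ±3 [pix] tolerance is applied to X and Y separately and that angle errors are wrapped so that 359° and 1° count as 2° apart; neither detail is stated in the text, and the function names are hypothetical.

```python
def angle_error_deg(estimate, truth):
    """Smallest absolute difference between two angles in degrees (wrap-around assumed)."""
    return abs((estimate - truth + 180.0) % 360.0 - 180.0)


def is_detection_success(est, truth, pos_tol=3.0, ang_tol=3.0):
    """est and truth are (x, y, theta_deg) tuples; tolerances follow the stated criterion."""
    (ex, ey, et), (tx, ty, tt) = est, truth
    return (abs(ex - tx) <= pos_tol
            and abs(ey - ty) <= pos_tol
            and angle_error_deg(et, tt) <= ang_tol)
```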

なお、閾値を使った判定は行っていない。その理由は、各手法で相関値の指標が異るため、各手法で同じ閾値を設定するのができないためである。従って、提案法１，２としては、相関値ベクトルｒの各要素の絶対値の和の大きさに基づく足切りは行っていない。 Note that the determination using the threshold is not performed. The reason is that the same threshold value cannot be set in each method because the index of the correlation value differs in each method. Therefore, in Proposed Methods 1 and 2, no truncation based on the sum of the absolute values of the elements of the correlation value vector r is performed.

(2) Methods compared
Proposed Method 1 (Proposed-Phase) and Proposed Method 2 (Proposed-Naive) were compared with the following four methods.
・Integral normalized edge eigenvalue template method (Eigen-Edge)
・Eigenvalue template method (Eigen-Naive)
・Rotation-invariant phase-only correlation (RIPOC)
・Rotation matching using normalized cross-correlation (NCC)
Since accuracy is not evaluated in this experiment, the sub-pixel estimation processing was disabled.

また、提案法１（Proposed-Phase）及び提案法２（Proposed-Naive）を含め、各手法に関して、ＧＰＵによる処理の並列化を行った実装についても処理速度の評価を行った。 In addition, with respect to each method, including the Proposed Method 1 (Proposed-Phase) and the Proposed Method 2 (Proposed-Naive), the processing speed was also evaluated for implementations in which processing by the GPU was parallelized.

In Proposed Method 1 (Proposed-Phase), Proposed Method 2 (Proposed-Naive), the integral normalized edge eigenvalue template method (Eigen-Edge), and the eigenvalue template method (Eigen-Naive), the group of 512 template images generated by rotating the template image at an angular pitch of Δθ = 360/512 [deg] was approximated by M = 32 eigenvalue template images.

回転不変位相限定相関法（RIPOC）については、ノイズへのロバスト性を向上するため、ＰＯＣ処理においてローパスフィルタを適用した。 For the rotation-invariant phase-only correlation method (RIPOC), a low-pass filter was applied in POC processing to improve robustness to noise.

正規化相互相関を使った回転マッチング（NCC）についても、角度刻み数Ｎ＝５１２とした。 Rotational matching (NCC) using normalized cross-correlation was also performed with the number of angular steps N = 512.

(3) Target work and query images
The target work was assumed to be a textureless industrial part, and an L-shaped part was used, as shown in FIGS. 8(a) to 8(d). For the query images, four background images with different characteristics, such as a separately photographed mounting surface of a circuit board, were combined with the image of the target work to create the four types of query images shown in FIGS. 8(a) to 8(d). To increase the number of query images, the target work was rotated at random relative to the background image and then combined with it, producing 1000 query images from each of the query images in FIGS. 8(a) to 8(d). To reproduce the noise of a real environment, Gaussian noise with a variance of 15 was added to the query images, and a shading transformation of the form I(x,y)←I(x,y)×(0.5×x/512) was applied.

(4) Processing time
FIG. 9 shows the processing time of each method. The processing times shown in FIG. 9 are averages over the four types of query images × 1000 images.

正規化相互相関を使った回転マッチング（NCC）は、面内回転した多数のテンプレート画像とそれぞれ畳み込み演算を行わなければならず、処理時間は非常に長くなる。また、ＧＰＵ利用による改善効果もさほど見られない。ＧＰＵへの転送時間がボトルネックとなっていることが一因と考えられる。 Rotation matching (NCC) using normalized cross-correlation requires a convolution operation with each of a large number of in-plane rotated template images, and the processing time is extremely long. Also, the improvement effect by using the GPU is not so much seen. One possible reason is that the transfer time to the GPU is a bottleneck.

提案法１（Proposed-Phase）、提案法２（Proposed-Naive）、積分形正規化エッジ固有値テンプレート法（Eigen-Edge）及び固有値テンプレート法（Eigen-Naive）は、畳み込み演算に関しては、３２枚の固有値テンプレート画像との畳込み演算を行うだけで良いので、正規化相互相関を使った回転マッチング（NCC）に対して大幅に処理時間を短縮できている。 The proposed method 1 (Proposed-Phase), the proposed method 2 (Proposed-Naive), the integral-type normalized edge eigenvalue template method (Eigen-Edge) and the eigenvalue template method (Eigen-Naive) have 32 sheets of convolution. Since it is only necessary to perform convolution operation with the eigenvalue template image, the processing time can be greatly reduced for rotation matching (NCC) using normalized cross-correlation.

Furthermore, because Proposed Method 1 (Proposed-Phase) and Proposed Method 2 (Proposed-Naive) replace the product-sum operation between the correlation values and the eigenfunctions with an FFT operation, processing was confirmed to be about 5.5 times faster for Proposed Method 1 (Proposed-Phase) than for the integral normalized edge eigenvalue template method (Eigen-Edge), and about 3.5 times faster for Proposed Method 2 (Proposed-Naive) than for the eigenvalue template method (Eigen-Naive).

In addition, when a GPU is used, Proposed Method 1 (Proposed-Phase) and Proposed Method 2 (Proposed-Naive) run about three times faster than RIPOC, which is known for fast rotation matching.

(5) Detection success rate
FIG. 10 shows the detection success rate of each method. In FIG. 10, the detection success rate is shown separately for each of the query images shown in FIGS. 8(a) to 8(d).

提案法２（Proposed-Naive）と固有値テンプレート法（Eigen-Naive）では、図８（ａ）、（ｂ）に示すクエリ画像に関して同等の検出成功率が得られており、今回実験に用いたテクスチャが少ないワークでは、第１実施形態で説明した高速化のための手法が検出成功率に悪影響を与えないことが確認できた。 In the proposed method 2 (Proposed-Naive) and the eigenvalue template method (Eigen-Naive), the same detection success rates are obtained for the query images shown in FIGS. 8A and 8B, and the textures used in this experiment were used. It was confirmed that, for a work with a small number, the technique for increasing the speed described in the first embodiment did not adversely affect the detection success rate.

ただし、提案法２（Proposed-Naive）と固有値テンプレート法（Eigen-Naive）とも、図８（ｃ）、（ｄ）のような複雑な背景の画像ではロバスト性の改善を行った他の手法と比べて低い検出成功率となった。 However, both the proposed method 2 (Proposed-Naive) and the eigenvalue template method (Eigen-Naive) use other methods that have improved robustness for images with complex backgrounds as shown in FIGS. 8C and 8D. The detection success rate was lower than that.

In contrast, Proposed Method 1 (Proposed-Phase), which applies images generated from the complex gradient image described in the third embodiment, achieved a higher detection success rate than even conventional methods with improved robustness, such as rotation matching using normalized cross-correlation (NCC) and the integral normalized edge eigenvalue template method (Eigen-Edge). In particular, with Eigen-Edge the detection success rate drops sharply for the query image of FIG. 8(d), whose background luminance differs greatly from that at learning time, whereas Proposed Method 1 (Proposed-Phase) estimated the pose of the target work robustly even for such query images.

上述した各実施形態は、本発明の画像処理方法及び画像処理装置の好ましい実施形態ではあるが、本発明の画像処理方法及び画像処理装置は、上記実施形態になんら制限されることなく、本発明の主旨を逸脱しない範囲において、種々変形することが可能である。 Each of the above-described embodiments is a preferred embodiment of the image processing method and the image processing apparatus of the present invention. However, the image processing method and the image processing apparatus of the present invention are not limited to the above-described embodiments, and the present invention is not limited thereto. Various modifications can be made without departing from the spirit of the invention.

例えば、上述した第１実施形態及び第２実施形態では、直流成分を下限とし、所定の基準周波数を上限とするＭ個の低周波成分が、全固有値の中で相対的に大きい固有値に対応するものを包含していることを前提としている。しかし、大きな固有値に対応する周波数成分が、高周波領域にあるようなテンプレートの場合には、その大きな固有値に対応する周波数成分を含むようにＤＦＴ行列を選択する必要がある。例えば、テンプレート毎に、各周波数成分に対応する固有値の大きさを調べ、その結果に基づいて、ＤＦＴ行列の周波数成分を定めても良い。 For example, in the first and second embodiments described above, M low-frequency components having a DC component as a lower limit and a predetermined reference frequency as an upper limit correspond to relatively large eigenvalues among all eigenvalues. It is assumed that things are included. However, in the case of a template in which a frequency component corresponding to a large eigenvalue is in a high frequency region, it is necessary to select a DFT matrix so as to include a frequency component corresponding to the large eigenvalue. For example, the magnitude of the eigenvalue corresponding to each frequency component may be checked for each template, and the frequency component of the DFT matrix may be determined based on the result.

また、画像の回転方向の高周波成分をカットするような前処理を適用することで、大きな固有値に対応する周波数成分を低周波領域に集中させるようにしても良い。 Further, by applying a pre-processing that cuts a high-frequency component in the rotation direction of the image, a frequency component corresponding to a large eigenvalue may be concentrated in a low-frequency region.

Reference Signs List
10 camera
20 image processing device
21 memory
30 robot

Claims

An image processing method for detecting, in an input image, an object corresponding to a predetermined template image by collating the input image with the template image, the method comprising:
a first step (S100, S110) of performing frequency decomposition of the template image in a rotation direction, using as a basis a predetermined number of frequency components, ranging from a low frequency to a high frequency, that belong to a predetermined frequency range, and calculating the same number of eigenvalue template images as there are frequency components;
a second step (S210) of calculating, for each of the eigenvalue template images, a correlation value with the input image by performing a convolution operation on the input image and the eigenvalue template image;
a third step (S240) of performing FFT processing on the calculated correlation values with respect to a predetermined rotation direction to calculate a similarity for each rotation angle; and
a fourth step (S250, S260) of calculating a rotation angle of the object in the input image from the similarity for each rotation angle.
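The four steps can be illustrated with the following Python sketch. It is not the claimed implementation: it assumes images already resampled onto a polar grid (so rotation is a circular shift along the angle axis), it treats a single collation region (so the convolution reduces to one inner product per eigenvalue template image), and it uses an inverse real FFT for the third step, since the sign convention is not fixed by the claim. All function names are hypothetical.

```python
import numpy as np

def eigen_templates(polar_template, m):
    """Step 1 (sketch): DFT of the rotated-template stack along the rotation axis."""
    n_angles = polar_template.shape[1]
    stack = np.stack([np.roll(polar_template, n, axis=1) for n in range(n_angles)])
    return np.fft.fft(stack, axis=0)[:m]          # keep the m lowest frequencies

def similarity_per_angle(region, eigen, n_angles):
    """Steps 2 and 3 (sketch): m correlation values, then one FFT over rotation."""
    assert eigen.shape[0] <= n_angles // 2 + 1
    corr = np.tensordot(eigen, region, axes=([1, 2], [0, 1]))  # shape (m,)
    half = np.zeros(n_angles // 2 + 1, dtype=complex)
    half[:corr.size] = corr                       # unused frequencies stay zero
    return np.fft.irfft(half, n=n_angles)         # similarity for every angle

def estimate_angle(region, eigen, n_angles):
    """Step 4 (sketch): the rotation angle, in degrees, with the highest similarity."""
    sim = similarity_per_angle(region, eigen, n_angles)
    return 360.0 * int(np.argmax(sim)) / n_angles, sim

# Example: a low-frequency template rotated by 17 of 64 angular steps.
n_angles = 64
theta = 2 * np.pi * np.arange(n_angles) / n_angles
radii = np.linspace(0.5, 1.0, 8)[:, None]
template = radii * (np.cos(theta) + 0.5 * np.cos(2 * theta + 0.3)
                    + 0.25 * np.cos(3 * theta + 1.0))
eigen = eigen_templates(template, m=8)
query = np.roll(template, 17, axis=1)
angle, sim = estimate_angle(query, eigen, n_angles)
```

In this sketch only 8 inner products and one length-64 inverse FFT are needed to obtain the similarity at all 64 angles, instead of 64 separate template correlations.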

The image processing method according to claim 1, wherein the predetermined number of frequency components, ranging from a low frequency to a high frequency, that belong to the predetermined frequency range have a DC component as a lower limit and a predetermined reference frequency as an upper limit.

The image processing method according to claim 1, wherein the first step includes:
a generation step (S100) of generating an image group consisting of a plurality of the template images having different rotation angles; and
a decomposition step of frequency-decomposing the image group in the rotation direction using frequency basis functions composed of the predetermined number of frequency components at or below a predetermined reference frequency, and calculating the same number of eigenvalue template images as there are frequency components.

The image processing method according to claim 1, wherein the first step includes:
a conversion step (S300) of converting the template image into polar coordinates; and
a calculation step (S310 to S340) of frequency-decomposing the template image converted into polar coordinates using, as a basis, the predetermined number of frequency components at or below a predetermined reference frequency, and calculating the eigenvalue template images using the coefficients of each frequency component obtained in the frequency decomposition together with each frequency component.

The image processing method according to any one of claims 1 to 4, further comprising a fifth step of calculating, for each of the input image and the template image, a complex gradient image composed of the intensity and direction of the luminance gradient, and converting the complex gradient image into an image in which its phase component is extracted,
wherein the processing from the first step to the fourth step is performed using the converted images.
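The conversion described in this claim can be sketched in Python as follows. The sketch does not reproduce the expression referenced in the next claim; it assumes one natural form, a complex image Tx + i·Ty built from finite-difference derivatives, normalized to unit magnitude so that only the gradient direction remains. The function name `phase_gradient_image` and the `eps` guard for flat areas are assumptions of the sketch.

```python
import numpy as np

def phase_gradient_image(image, eps=1e-12):
    """Unit-magnitude complex image carrying only the gradient direction.

    Assumed form: gradient = Tx + 1j * Ty, with Tx and Ty taken from
    np.gradient; pixels with (near-)zero gradient are set to 0.
    """
    ty, tx = np.gradient(image.astype(float))   # derivatives along rows (y) and columns (x)
    gradient = tx + 1j * ty                     # magnitude = intensity, angle = direction
    magnitude = np.abs(gradient)
    phase = np.zeros_like(gradient)
    np.divide(gradient, magnitude, out=phase, where=magnitude > eps)
    return phase

# Example: a horizontal brightness ramp, and the same ramp with changed gain and offset.
ramp = np.tile(np.arange(8.0), (6, 1))
phase_a = phase_gradient_image(ramp)
phase_b = phase_gradient_image(3.0 * ramp + 5.0)
```

Because the magnitude is discarded, the two ramps above yield the same phase image, which is the kind of insensitivity to brightness changes that motivates matching on the phase component.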

The image processing method according to claim 5, wherein in the fifth step, a complex gradient image is created by using the following Expression 1.
Here, Tx and Ty are horizontal and vertical differential images of the original image T.

The image processing method according to claim 1, wherein the input image has a larger image area than the template image, the method further comprising a sixth step (S200, S270) of defining, in the input image, a collation region to be collated with the template image, wherein
in the sixth step, the collation region is defined a plurality of times while being shifted in units of a predetermined number of pixels, so that the collation regions cover the entire input image,
in the second step, the correlation values are calculated by performing the convolution operation with the eigenvalue template images, using the collation region defined in the sixth step as the input image, and
the method further comprises a seventh step (S230) of comparing the sum of the absolute values of the correlation values calculated for the respective eigenvalue template images with a predetermined threshold and, when the sum is less than the predetermined threshold, excluding those correlation values from the processing of the third step onward.
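The region scan and the early rejection test can be sketched as below. This Python example is illustrative only: it evaluates each collation region with a direct inner product per eigenvalue template image, and the function name `candidate_regions`, the `stride` parameter, and the toy data are assumptions, not values from the embodiments.

```python
import numpy as np

def candidate_regions(image, eigen, threshold, stride=1):
    """Return (row, col, correlations) for regions that pass the threshold test.

    image: real array (H, W); eigen: complex array (m, h, w) holding the
    eigenvalue template images. Regions whose summed absolute correlation
    is below `threshold` are dropped before any per-angle similarity is computed.
    """
    _, h, w = eigen.shape
    kept = []
    for row in range(0, image.shape[0] - h + 1, stride):
        for col in range(0, image.shape[1] - w + 1, stride):
            region = image[row:row + h, col:col + w]
            corr = np.tensordot(eigen, region, axes=([1, 2], [0, 1]))  # shape (m,)
            if np.sum(np.abs(corr)) >= threshold:
                kept.append((row, col, corr))
    return kept

# Toy example: an empty image containing a single bright 4x4 patch.
image = np.zeros((12, 12))
image[4:8, 4:8] = 1.0
eigen = np.ones((2, 4, 4), dtype=complex)       # placeholder eigenvalue template images
kept = candidate_regions(image, eigen, threshold=10.0, stride=4)
```

Only the region at row 4, column 4 survives here, so the FFT of the third step would run once instead of nine times.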

The image processing method according to claim 1, wherein the third step includes a filter step of applying the filtering process shown in Expression 2 below to the maximum similarity among the similarities for the respective rotation angles calculated by the FFT processing, and the filtered similarity is used as the final maximum similarity.
Here, peak is the maximum similarity, mean is the average value of the similarities, and σ is the variance of the similarities.

An image processing apparatus that detects, in an input image, an object corresponding to a predetermined template image by collating the input image with the template image, the apparatus comprising:
a storage unit (21) that stores eigenvalue template images calculated by frequency-decomposing the template image in a rotation direction, using as a basis a predetermined number of frequency components, ranging from a low frequency to a high frequency, that belong to a predetermined frequency range, the number of eigenvalue template images being equal to the number of frequency components;
a correlation value calculation unit (S210) that calculates, for each of the eigenvalue template images, a correlation value with the input image by performing a convolution operation on the input image and the eigenvalue template images stored in the storage unit;
a similarity calculation unit (S240) that performs FFT processing on the calculated correlation values with respect to a predetermined rotation direction to calculate a similarity for each rotation angle; and
a rotation angle calculation unit (S250, S260) that calculates a rotation angle of the object in the input image from the similarity for each rotation angle.

The image processing apparatus according to claim 9, wherein the predetermined number of frequency components, ranging from a low frequency to a high frequency, that belong to the predetermined frequency range have a DC component as a lower limit and a predetermined reference frequency as an upper limit.

The image processing apparatus according to claim 9, further comprising a conversion unit that calculates, for each of the input image and the template image, a complex gradient image composed of the intensity and direction of the luminance gradient, and converts the complex gradient image into an image in which its phase component is extracted.

The image processing apparatus according to claim 11, wherein the conversion unit creates the complex gradient image using Expression 3 below.
Here, Tx and Ty are horizontal and vertical differential images of the original image T.

The image processing apparatus according to any one of claims 9 to 12, wherein the input image has a larger image area than the template image, the apparatus further comprising a collation region setting unit (S200, S270) that defines, in the input image, a collation region to be collated with the template image, wherein
the collation region setting unit defines the collation region a plurality of times while shifting it in units of a predetermined number of pixels, so that the collation regions cover the entire input image,
the correlation value calculation unit calculates the correlation values by performing the convolution operation with the eigenvalue template images, using the collation region defined by the collation region setting unit as the input image, and
the apparatus further comprises a correlation value determination unit (S230) that compares the sum of the absolute values of the correlation values calculated for the respective eigenvalue template images with a predetermined threshold and, when the sum is less than the predetermined threshold, excludes those correlation values from the similarity calculation by the similarity calculation unit.

The image processing apparatus according to claim 9, wherein the similarity calculation unit includes a filter unit that applies the filtering process shown in Expression 4 below to the maximum similarity among the similarities for the respective rotation angles calculated by the FFT processing, and the filtered similarity is used as the final maximum similarity.
Here, peak is the maximum similarity, mean is the average value of the similarities, and σ is the variance of the similarities.