JP2013058174A

JP2013058174A - Image processing program, image processing method, and image processing device

Info

Publication number: JP2013058174A
Application number: JP2011197622A
Authority: JP
Inventors: Masayoshi Shimizu; 雅芳清水; Hideo Saito; 英雄斎藤; Takumi Yoshida; 拓洋吉田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-09-09
Filing date: 2011-09-09
Publication date: 2013-03-28

Abstract

PROBLEM TO BE SOLVED: To reduce a calculation amount in associating coordinates of respective images with each other. SOLUTION: An image processing device 100 extracts feature points from learning target image data 130a and a viewpoint variation image data group 130b with different camera viewpoints, and classifies the respective feature points to generate a registration table 130c. The image processing device 100 compares a centroid coordinate of each feature point group of the registration table 130c with the coordinate of a feature point extracted from recognition target image data 130d to associate the feature point of the learning target image data 130a with the feature point of the recognition target image data 130d.

Description

The present invention relates to an image processing program and the like.

There is a technique in which an image is learned in advance and, when the same image is captured from a certain camera viewpoint, the posture of the captured image is recognized based on the learned image. To recognize the posture of the captured image, the coordinates of the learned image are associated with the coordinates of the captured image. Techniques for associating the coordinates of the images include ASIFT and a method based on Randomized Trees.

For example, the Randomized Trees method randomly selects coordinates in the learned image and in the captured image and sequentially compares the pixels at the selected coordinates, thereby associating the coordinates of the two images.

JP 2010-204826 A
JP 2005-215988 A

J. M. Morel and G. Yu, “ASIFT: A new framework for fully affine invariant image comparison”, SIAM Journal on Imaging Sciences, 2, 2, pp. 438-469, 2009
V. Lepetit and P. Fua, “Keypoint recognition using randomized trees”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28, 9, pp. 1465-1479, 2006

However, the conventional techniques described above require a large amount of calculation to associate the coordinates of the learned image with the coordinates of the image whose posture is to be recognized.

For example, in the Randomized Trees method, there are many combinations of pixels to be compared, so comparing the combinations one by one takes a long time.

The disclosed technique has been made in view of the above, and an object thereof is to provide an image processing program, an image processing method, and an image processing apparatus that can reduce the amount of calculation required to associate the coordinates of the images.

The disclosed image processing program causes a computer to execute the following processing. The program causes the computer to generate a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by applying projective transformation to the learning target image, which is captured from a frontal camera viewpoint. The program causes the computer to extract feature points from the learning target image and from the plurality of viewpoint variation images. The program causes the computer to classify the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating each of them with a feature point extracted from the learning target image. The program causes the computer to register a region including each feature point group in a registration table. The program causes the computer to acquire a recognition target image and extract feature points from the recognition target image. The program causes the computer to associate the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions in the registration table.

According to the disclosed image processing program, the amount of calculation required to associate the coordinates of the images can be reduced.

FIG. 1 is a functional block diagram illustrating the configuration of the image processing apparatus according to the first embodiment.
FIG. 2 is a diagram for explaining the relationship between learning target image data and viewpoint variation image data.
FIG. 3 is a diagram illustrating an example of feature points extracted from learning target image data.
FIG. 4 is a diagram illustrating an example of a feature point extraction result from learning target image data.
FIG. 5 is a diagram illustrating an example of feature points extracted from the viewpoint variation image data group.
FIG. 6 is a diagram illustrating an example of a feature point extraction result from the viewpoint variation image data.
FIG. 7 is a diagram illustrating an example of the conversion result.
FIG. 8 is a diagram for explaining the centroid coordinates in the feature space.
FIG. 9 is a diagram illustrating an example of the data structure of the registration table according to the first embodiment.
FIG. 10 is a flowchart illustrating the processing procedure of the learning phase according to the first embodiment.
FIG. 11 is a flowchart illustrating the processing procedure of the recognition phase according to the first embodiment.
FIG. 12 is a functional block diagram illustrating the configuration of the image processing apparatus according to the second embodiment.
FIG. 13 is a diagram for explaining the processing of the classification unit according to the second embodiment.
FIG. 14 is a diagram illustrating an example of the data structure of the registration table according to the second embodiment.
FIG. 15 is a functional block diagram illustrating the configuration of the image processing apparatus according to the third embodiment.
FIG. 16 is a diagram (1) for explaining the processing of the classification unit according to the third embodiment.
FIG. 17 is a diagram (2) for explaining the processing of the classification unit according to the third embodiment.
FIG. 18 is a diagram illustrating an example of the data structure of the registration table according to the third embodiment.
FIG. 19 is a diagram illustrating an example of a computer that executes an image processing program.

Embodiments of the image processing program, the image processing method, and the image processing apparatus disclosed in the present application will be described below in detail with reference to the drawings. Note that the present invention is not limited to these embodiments.

An image processing apparatus according to the first embodiment will be described. FIG. 1 is a functional block diagram illustrating the configuration of the image processing apparatus according to the first embodiment. As illustrated in FIG. 1, the image processing apparatus 100 includes a camera 110a, an input unit 110b, an output unit 120, a storage unit 130, and a control unit 140.

The camera 110a captures an image of an object. In the learning phase, the camera 110a captures an image of the object to be learned and outputs the data of the captured image to the control unit 140. In the recognition phase, the camera 110a captures an image of the object whose posture is to be recognized and outputs the data of the captured image to the control unit 140. In the following description, the data of the image to be learned is referred to as learning target image data, and the data of the image to be recognized is referred to as recognition target image data.

The learning target image data is data of an image obtained by capturing the object from the front. The recognition target image data is data of an image obtained by capturing the object from an arbitrary direction. The object captured in the learning target image data and the object captured in the recognition target image data are the same object. The object may have some unevenness, but is preferably planar.

The input unit 110b is an input device that inputs various data to the image processing apparatus 100. For example, the input unit 110b corresponds to input keys, a touch panel, or the like. The output unit 120 is an output device that outputs the processing result of the control unit 140.

The storage unit 130 stores learning target image data 130a, a viewpoint variation image data group 130b, a registration table 130c, and recognition target image data 130d. The storage unit 130 corresponds to, for example, a semiconductor memory device such as a RAM (Random Access Memory), a ROM (Read Only Memory), or a flash memory, or a storage device such as a hard disk or an optical disk.

As described above, the learning target image data 130a is data of the image to be learned, captured by the camera 110a.

The viewpoint variation image data group 130b is a group of image data corresponding to images of the object captured from various camera viewpoints.

The registration table 130c is a table that holds various types of information related to the feature points extracted from the learning target image data 130a and the viewpoint variation image data group 130b. The data structure of the registration table 130c will be described later.

As described above, the recognition target image data 130d is data of the image to be recognized, captured by the camera 110a.

The control unit 140 includes a data management unit 140a, a viewpoint variation image generation unit 140b, a feature point extraction unit 140c, a classification unit 140d, and an association unit 140e. The control unit 140 corresponds to, for example, an integrated device such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). Alternatively, the control unit 140 corresponds to, for example, an electronic circuit such as a CPU or an MPU (Micro Processing Unit).

The data management unit 140a is a processing unit that manages the storage unit 130. For example, in the learning phase, the data management unit 140a acquires learning target image data from the camera 110a and stores the acquired learning target image data in the storage unit 130. In the recognition phase, the data management unit 140a acquires recognition target image data from the camera 110a and stores the acquired recognition target image data in the storage unit 130.

The viewpoint variation image generation unit 140b is a processing unit that generates the viewpoint variation image data group 130b based on the learning target image data 130a. For each camera viewpoint, the viewpoint variation image generation unit 140b generates viewpoint variation image data whose camera viewpoint differs from that of the learning target image data 130a.

FIG. 2 is a diagram for explaining the relationship between learning target image data and viewpoint variation image data. The learning target image data 130a is data of an image obtained by capturing the object 10 from the camera viewpoint 20 in front of the object 10. The viewpoint variation image data group 130b corresponds to the data of images obtained by capturing the object 10 from camera viewpoints 30a to 30d, which differ in θ, ψ, and φ. For example, θ, ψ, and φ are the roll angle, the pitch angle, and the yaw angle, respectively.

The viewpoint variation image generation unit 140b generates viewpoint variation image data by applying, to the learning target image data 130a, a projective transformation having θ, ψ, and φ as parameters. The viewpoint variation image generation unit 140b sets the values of ψ and φ at 10° intervals in the range of -60° to 60°, and sets the value of θ at 45° intervals in the range of 0° to 360°.
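The parameter sweep described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix a particular projective transformation, so the pure-rotation model H = K R K^-1, the axis order of the rotation, the intrinsics `f`, `cx`, `cy`, and the choice to treat θ = 360° as a repeat of θ = 0° are all assumptions, and every function name is hypothetical.

```python
import math
from itertools import product


def viewpoint_parameters():
    """Enumerate (theta, psi, phi) triples in degrees."""
    thetas = range(0, 360, 45)   # roll: 0..315; 360 is treated as a repeat of 0 (assumption)
    tilts = range(-60, 61, 10)   # pitch and yaw: -60..60 in 10-degree steps
    return list(product(thetas, tilts, tilts))


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(theta, psi, phi):
    """R = Rz(theta) Rx(psi) Ry(phi); the axis order is an assumption."""
    t, p, y = (math.radians(v) for v in (theta, psi, phi))
    rz = [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, math.cos(p), -math.sin(p)], [0.0, math.sin(p), math.cos(p)]]
    ry = [[math.cos(y), 0.0, math.sin(y)], [0.0, 1.0, 0.0], [-math.sin(y), 0.0, math.cos(y)]]
    return matmul(rz, matmul(rx, ry))


def homography(theta, psi, phi, f=800.0, cx=320.0, cy=240.0):
    """H = K R K^-1 for hypothetical pinhole intrinsics (f, cx, cy)."""
    k = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    return matmul(k, matmul(rotation(theta, psi, phi), k_inv))
```

Under these assumptions the sweep yields 8 × 13 × 13 = 1352 parameter sets, one viewpoint variation image per set.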

The viewpoint variation image generation unit 140b executes the projective transformation for each parameter set and generates viewpoint variation image data for each parameter set. The viewpoint variation image generation unit 140b stores the generated plurality of viewpoint variation image data in the storage unit 130 as the viewpoint variation image data group 130b.

Any conventional projective transformation may be used as the projective transformation performed by the viewpoint variation image generation unit 140b. For example, the viewpoint variation image generation unit 140b performs the projective transformation based on the literature (Yasushi Kanazawa and Kenichi Kanatani, 'Automatic search for feature point correspondence between two images', Image Lab, pp. 20-23, 2004).

The feature point extraction unit 140c extracts feature points from the learning target image data 130a. The feature point extraction unit 140c also extracts feature points from each viewpoint variation image data in the viewpoint variation image data group 130b. The feature point extraction unit 140c outputs the data on the feature points extracted from the learning target image data 130a and the data on the feature points extracted from the viewpoint variation image data group 130b to the classification unit 140d.

The feature point extraction unit 140c extracts feature points using, for example, SIFT (Scale Invariant Feature Transform). SIFT is a technique for scanning image data and detecting points suitable for feature extraction. For example, the feature point extraction unit 140c scans the image data and, in each portion, quantifies a plurality of elements such as the sharpness of image edges, the gradient of the gray values, and the direction of the gradient change. It then extracts, as feature points, the portions where the numerical value of each element takes an extreme value.

The feature point extraction unit 140c extracts feature points from the learning target image data 130a using SIFT. FIG. 3 is a diagram illustrating an example of feature points extracted from learning target image data. In the example illustrated in FIG. 3, the feature point extraction unit 140c extracts feature points 40A, 40B, and 40C from the learning target image data 130a. The feature point extraction unit 140c outputs the extraction result from the learning target image data 130a to the classification unit 140d.

FIG. 4 is a diagram illustrating an example of a feature point extraction result from learning target image data. As illustrated in FIG. 4, the extraction result 1A associates the identification information of each feature point with its coordinates on the learning target image data.

The feature point extraction unit 140c uses SIFT to extract feature points from each viewpoint variation image data included in the viewpoint variation image data group 130b. FIG. 5 is a diagram illustrating an example of feature points extracted from the viewpoint variation image data group. In the example illustrated in FIG. 5, the viewpoint variation image data group 130b includes viewpoint variation image data 13a, 13b, 13c, and 13d.

The feature point extraction unit 140c extracts feature points 1a and 1b from the viewpoint variation image data 13a. The feature point extraction unit 140c extracts feature points 1c and 1d from the viewpoint variation image data 13b. The feature point extraction unit 140c extracts feature points 1e, 1f, and 1g from the viewpoint variation image data 13c. The feature point extraction unit 140c extracts feature points 1h and 1i from the viewpoint variation image data 13d. The feature point extraction unit 140c outputs the extraction result from the viewpoint variation image data group 130b to the classification unit 140d.

FIG. 6 is a diagram illustrating an example of a feature point extraction result from the viewpoint variation image data. As illustrated in FIG. 6, the extraction result 1B associates the identification information of each feature point with its coordinates on the viewpoint variation image data. The feature point extraction unit 140c also associates the numerical value of each element of the feature point with the extraction result 1B. Here, the numerical values of the elements are obtained by quantifying, for example, the sharpness of image edges, the gradient of the gray values, and the direction of the gradient change.

The classification unit 140d is a processing unit that classifies the plurality of feature points extracted from the viewpoint variation image data group 130b. For example, the classification unit 140d classifies the feature points based on the extraction result 1A and the extraction result 1B.

The classification unit 140d converts the coordinates on the viewpoint variation image data in the extraction result 1B into coordinates on the learning target image data. The classification unit 140d performs this conversion by applying the inverse matrix of the projective transformation to the coordinates on the viewpoint variation image data. FIG. 7 is a diagram illustrating an example of the conversion result.
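As a minimal sketch of this coordinate conversion, assuming the projective transformation is available as a 3x3 homography matrix `h` in nested-list form (the function names are hypothetical and not taken from the source), the inverse mapping can be written as:

```python
def invert3(m):
    """Invert a 3x3 matrix (nested lists) using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def apply_homography(h, point):
    """Map (x, y) through H in homogeneous coordinates, then divide by w."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def to_learning_coordinates(h, point):
    """Convert a point on a viewpoint variation image to learning-image coordinates."""
    return apply_homography(invert3(h), point)
```

A point mapped forward with `apply_homography(h, p)` and back with `to_learning_coordinates(h, q)` returns to `p` up to floating-point error, which is the property the classification step relies on.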

As illustrated in FIG. 7, the conversion result 1C associates the identification information, the coordinates, the converted coordinates, and the elements with one another. Of these, the converted coordinates are the coordinates of the feature point after conversion by the inverse matrix of the projective transformation.

The classification unit 140d classifies each feature point extracted from the viewpoint variation image data according to the distance between the coordinates in the extraction result 1A of FIG. 4 and the converted coordinates in the conversion result 1C of FIG. 7. For example, when the converted coordinates of a certain feature point extracted from the viewpoint variation image data are closest to the coordinates of the feature point 40A among the feature points 40A to 40C, that feature point is classified as the feature point 40A. The classification unit 140d executes this processing on each feature point extracted from the viewpoint variation image data and classifies it as one of the feature points 40A, 40B, and 40C.

Note that when the converted coordinates of a feature point are a predetermined threshold or more away from the coordinates of the feature points 40A to 40C, the classification unit 140d excludes that feature point from classification. For example, the predetermined threshold is 2 pixels.
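The nearest-point classification with the exclusion threshold can be sketched as below. The function name `classify` and the dictionary-based data layout are assumptions made for illustration; the 2-pixel default follows the example threshold in the text.

```python
import math


def classify(reference_points, converted_points, threshold=2.0):
    """Group converted points by their nearest reference feature point.

    reference_points: {label: (x, y)} from the learning target image.
    converted_points: [(x, y), ...] already mapped to learning-image coordinates.
    Returns {label: [indices into converted_points]}.
    """
    groups = {label: [] for label in reference_points}
    for index, (x, y) in enumerate(converted_points):
        label, distance = min(
            ((name, math.hypot(x - rx, y - ry))
             for name, (rx, ry) in reference_points.items()),
            key=lambda pair: pair[1],
        )
        if distance < threshold:  # points a threshold or more away are excluded
            groups[label].append(index)
    return groups
```

For instance, with reference points 40A at (10, 10) and 40B at (50, 50), a converted point at (10.5, 10.2) falls into the 40A group, while a point at (70, 70) is dropped because no reference point lies within 2 pixels.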

After classifying the feature points extracted from the viewpoint variation image data, the classification unit 140d obtains, for each classified feature point group, the centroid coordinates in the feature space. FIG. 8 is a diagram for explaining the centroid coordinates in the feature space. For example, the x-axis in FIG. 8 corresponds to element 1 of a feature point, the y-axis corresponds to element 2, and the z-axis corresponds to element 3. Here, the feature space is shown in three dimensions as an example, but when a feature point has n elements, the feature space is n-dimensional. In FIG. 8, the circles correspond to the feature points classified as the feature point 40A, the triangles correspond to the feature points classified as the feature point 40B, and the squares correspond to the feature points classified as the feature point 40C.

The classification unit 140d obtains the centroid 2A in the feature space based on the feature points classified as the feature point 40A. The classification unit 140d obtains the centroid 2B in the feature space based on the feature points classified as the feature point 40B. The classification unit 140d obtains the centroid 2C in the feature space based on the feature points classified as the feature point 40C.

The classification unit 140d obtains the covariance based on the feature points classified as the feature point 40A. Likewise, the classification unit 140d obtains the covariance based on the feature points classified as the feature point 40B, and the covariance based on the feature points classified as the feature point 40C. Note that the classification unit 140d may obtain the standard deviation instead of the covariance.
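The per-group statistics can be illustrated with the following sketch. The text does not say how the covariance is normalized or how it is reduced to the single value stored per cluster, so the population form (division by N) and the full matrix output are assumptions, as are the function names.

```python
def centroid(vectors):
    """Mean of equally sized feature vectors, one value per dimension."""
    n = len(vectors)
    dims = len(vectors[0])
    return tuple(sum(v[d] for v in vectors) / n for d in range(dims))


def covariance_matrix(vectors):
    """Population covariance matrix (divides by N; that choice is an assumption)."""
    n = len(vectors)
    mean = centroid(vectors)
    dims = len(mean)
    return [
        [sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
         for j in range(dims)]
        for i in range(dims)
    ]
```

Each feature vector here plays the role of one marker in FIG. 8, and `centroid` corresponds to the centroids 2A to 2C computed for the three groups.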

The classification unit 140d registers the results of the above processing in the registration table 130c. FIG. 9 is a diagram illustrating an example of the data structure of the registration table according to the first embodiment. As illustrated in FIG. 9, the registration table 130c has a cluster number, coordinates, centroid coordinates, and a covariance.

The cluster number in FIG. 9 is a number that uniquely identifies a classified feature point group. For example, the cluster number of the feature point group classified as the feature point 40A is 1, the cluster number of the feature point group classified as the feature point 40B is 2, and the cluster number of the feature point group classified as the feature point 40C is 3.

The coordinates in FIG. 9 are the coordinates of the feature points extracted from the learning target image data 130a. For example, the coordinates (x1, y1) in FIG. 9 are the coordinates of the feature point 40A. The centroid coordinates in FIG. 9 correspond to the centroid coordinates shown in FIG. 8. For example, the centroid coordinates (xg1, yg1, zg1) are the coordinates, in the feature space, of the centroid of the feature points corresponding to cluster number 1. The covariance is the covariance of each classified feature point group. For example, the covariance "v1" is the covariance of the feature points corresponding to cluster number 1.
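One way to picture a row of this table is the record type below. The type and field names are hypothetical, and storing the covariance as a single float mirrors the scalar "v1" shown in FIG. 9 rather than anything the text specifies about its computation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class RegistrationEntry:
    cluster_number: int               # uniquely identifies a feature point group
    coordinates: Tuple[float, float]  # (x, y) on the learning target image
    centroid: Tuple[float, ...]       # centroid in the n-dimensional feature space
    covariance: float                 # per-group spread, e.g. "v1" (scalar form assumed)


def build_registration_table(
    entries: Iterable[RegistrationEntry],
) -> Dict[int, RegistrationEntry]:
    """Index entries by cluster number, rejecting duplicate numbers."""
    table: Dict[int, RegistrationEntry] = {}
    for entry in entries:
        if entry.cluster_number in table:
            raise ValueError(f"duplicate cluster number {entry.cluster_number}")
        table[entry.cluster_number] = entry
    return table
```

Keying the table by cluster number keeps the lookup from a matched cluster to its learning-image coordinates a single dictionary access.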

Note that when the number of feature points in a classified feature point group is smaller than a predetermined number, the classification unit 140d may exclude that feature point group from registration. For example, when the number of feature points classified as the feature point 40C is smaller than the predetermined number, the classification unit 140d excludes the feature points classified as the feature point 40C from the registration table 130c.

The association unit 140e is a processing unit that associates the feature points of the recognition target image data 130d with the feature points of the learning target image data 130a based on the registration table 130c.

The association unit 140e extracts feature points from the recognition target image data 130d using SIFT. The association unit 140e compares the feature-space coordinates of each feature point extracted from the recognition target image data 130d with the centroid coordinates in the registration table 130c, and determines the cluster number whose centroid coordinates are at the shortest distance. The association unit 140e associates the coordinates of the determined cluster number with the feature point of the recognition target image data 130d.

For example, when, among the centroid coordinates of cluster numbers 1 to 3, the centroid coordinates of cluster number 1 are closest to the coordinates of the feature point, the association unit 140e determines cluster number 1. In this case, the association unit 140e associates the feature point of the recognition target image data with the coordinates (x1, y1) of cluster number 1. The coordinates (x1, y1) of cluster number 1 are the coordinates of the feature point 40A in FIG. 3.
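The recognition-side lookup amounts to a nearest-centroid search, sketched below. The function name and the `{cluster_number: (coordinates, centroid)}` layout are assumptions; Euclidean distance is used because the text speaks only of "distance" at this point.

```python
import math


def match_feature(feature, table):
    """Return (cluster_number, learning-image coordinates) of the nearest centroid.

    feature: feature-space vector of a point from the recognition target image.
    table: {cluster_number: (coordinates, centroid)}.
    """
    best = min(table, key=lambda number: math.dist(feature, table[number][1]))
    return best, table[best][0]
```

Each recognition-side feature point therefore costs one distance evaluation per registered cluster.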

対応付け部１４０ｅは、認識対象画像データ１３０ｄから抽出した各特徴点に対して、上記処理を実行し、実行結果を出力部１２０に出力する。対応付け部１４０ｅがかかる処理を実行することで、認識対象画像データ１３０ｄの特徴点と、学習対象画像データ１３０ａの特徴点とが対応付けられる。 The associating unit 140e performs the above process on each feature point extracted from the recognition target image data 130d, and outputs the execution result to the output unit 120. When the association unit 140e executes such processing, the feature point of the recognition target image data 130d and the feature point of the learning target image data 130a are associated with each other.

なお、対応付け部１４０ｅは、マハラノビス距離を用いて、認識対象画像データ１３０ｄの特徴点と、学習対象画像データ１３０ａの特徴点とを対応付けてもよい。例えば、対応付け部１４０ｅは、共分散に対応する重みを設定する。対応付け部１４０ｅは、共分散の値が大きいほど、小さい値を重みに設定する。例えば、共分散ｖ１、ｖ２、ｖ３の重みを、ｇ１、ｇ２、ｇ３とする。各共分散の大小関係がｖ１＞ｖ２＞ｖ３の場合には、重みはｇ１＜ｇ２＜ｇ３となる。 The association unit 140e may associate the feature points of the recognition target image data 130d and the feature points of the learning target image data 130a using the Mahalanobis distance. For example, the associating unit 140e sets a weight corresponding to covariance. The association unit 140e sets a smaller value as the weight as the covariance value is larger. For example, the weights of the covariances v1, v2, and v3 are g1, g2, and g3. When the magnitude relationship of each covariance is v1> v2> v3, the weight is g1 <g2 <g3.

対応付け部１４０ｅは、クラスタ番号１の座標と特徴点の座標との距離にｇ１を乗算した値Ｇ１と、クラスタ番号２の座標と特徴点の座標との距離にｇ２を乗算した値Ｇ２と、クラスタ番号３の座標と特徴点の座標との距離にｇ３を乗算した値Ｇ３とを算出する。そして、対応付け部１４０ｅは、値Ｇ１〜Ｇ３のうち、最小となる値に対応するクラスタ番号を判定する。対応付け部１４０ｅは、判定したクラスタ番号の座標と、認識対象画像データ１３０ｄの特徴点とを対応付ける。マハラノビス距離を利用すると、共分散の値が大きいクラスタ番号ほど、対応付けされやすくなる。 The associating unit 140e calculates a value G1 by multiplying the distance between the coordinates of cluster number 1 and the coordinates of the feature point by g1, a value G2 by multiplying the distance between the coordinates of cluster number 2 and the coordinates of the feature point by g2, and a value G3 by multiplying the distance between the coordinates of cluster number 3 and the coordinates of the feature point by g3. The associating unit 140e then determines the cluster number corresponding to the smallest of the values G1 to G3, and associates the coordinates of the determined cluster number with the feature point of the recognition target image data 130d. When the Mahalanobis distance is used, a cluster with a larger covariance value is more readily associated.
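The weighted comparison above can be sketched as follows. The text only requires that a larger covariance yields a smaller weight; the concrete choice g = 1/sqrt(v), which matches a one-dimensional Mahalanobis distance, is an assumption of this sketch, as are the table layout and the name `weighted_associate`.

```python
import math


def weighted_associate(descriptor, table):
    """Pick the cluster that minimises G_i = g_i * distance_i."""
    def score(entry):
        # Assumption: g = 1 / sqrt(v), so a larger (positive) covariance
        # value v gives a smaller weight g.
        g = 1.0 / math.sqrt(entry["variance"])
        return g * math.dist(descriptor, entry["centroid"])

    best = min(table, key=score)
    return best["cluster"], best["coords"]


# Hypothetical table: cluster 2 is farther away but much more spread out.
table = [
    {"cluster": 1, "coords": (12, 34), "centroid": (0.0, 0.0), "variance": 1.0},
    {"cluster": 2, "coords": (56, 78), "centroid": (10.0, 0.0), "variance": 16.0},
]
match = weighted_associate((4.0, 0.0), table)
```

In this example the plain distances are 4 and 6, so an unweighted comparison would choose cluster 1. The weighted values are 4 x 1 = 4 and 6 x 0.25 = 1.5, so cluster 2 is chosen, which illustrates why a cluster with a larger covariance is more readily associated.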

次に、本実施例１にかかる画像処理装置１００の学習フェーズの処理手順および認識フェーズの処理手順について順に説明する。図１０は、本実施例１の学習フェーズの処理手順を示すフローチャートである。図１０に示す処理は、例えば、学習対象画像データ１３０ａが記憶部１３０に記憶されたことを契機に実行される。 Next, the learning phase processing procedure and the recognition phase processing procedure of the image processing apparatus 100 according to the first embodiment will be described in order. FIG. 10 is a flowchart illustrating the processing procedure of the learning phase according to the first embodiment. The process illustrated in FIG. 10 is executed when the learning target image data 130a is stored in the storage unit 130, for example.

図１０に示すように、画像処理装置１００は、学習対象画像データ１３０ａを取得し（ステップＳ１０１）、視点変動画像データ群１３０ｂを生成する（ステップＳ１０２）。画像処理装置１００は、学習対象画像データ１３０ａから特徴点を抽出する（ステップＳ１０３）。 As illustrated in FIG. 10, the image processing apparatus 100 acquires learning target image data 130a (step S101), and generates a viewpoint variation image data group 130b (step S102). The image processing apparatus 100 extracts feature points from the learning target image data 130a (step S103).

画像処理装置１００は、各視点変動画像データから特徴点を抽出する（ステップＳ１０４）。画像処理装置１００は、各視点変動画像データから抽出した特徴点の座標を、学習対象画像データの座標系に変換する（ステップＳ１０５）。 The image processing apparatus 100 extracts feature points from each viewpoint variation image data (step S104). The image processing apparatus 100 converts the coordinates of the feature points extracted from each viewpoint variation image data into the coordinate system of the learning target image data (step S105).

画像処理装置１００は、各特徴点を分類し（ステップＳ１０６）、登録テーブル１３０ｃに各種情報を登録する（ステップＳ１０７）。 The image processing apparatus 100 classifies each feature point (step S106) and registers various information in the registration table 130c (step S107).

図１１は、本実施例１の認識フェーズの処理手順を示すフローチャートである。図１１に示す処理は、例えば、認識対象画像データ１３０ｄが記憶部１３０に記憶されたことを契機に実行される。 FIG. 11 is a flowchart illustrating the processing procedure of the recognition phase according to the first embodiment. The process illustrated in FIG. 11 is executed, for example, when the recognition target image data 130d is stored in the storage unit 130.

画像処理装置１００は、認識対象画像データ１３０ｄを取得し（ステップＳ２０１）、認識対象画像データ１３０ｄから特徴点を抽出する（ステップＳ２０２）。画像処理装置１００は、認識対象画像データ１３０ｄの特徴点と、登録テーブル１３０ｃとを基にして、特徴点の対応付けを行う（ステップＳ２０３）。 The image processing apparatus 100 acquires the recognition target image data 130d (step S201), and extracts feature points from the recognition target image data 130d (step S202). The image processing apparatus 100 associates the feature points based on the feature points of the recognition target image data 130d and the registration table 130c (step S203).

次に、本実施例１にかかる画像処理装置１００の効果について説明する。画像処理装置１００は、学習対象画像データ１３０ａおよびカメラ視点の異なる視点変動画像データ群１３０ｂから特徴点を抽出し、各特徴点を分類して、登録テーブル１３０ｃを生成する。画像処理装置１００は、登録テーブル１３０ｃの各特徴点群の重心座標と、認識対象画像データ１３０ｄから抽出した特徴点の座標とを比較して、学習対象画像データ１３０ａの特徴点と認識対象画像データ１３０ｄの特徴点とを対応付ける。このため、画像処理装置１００によれば、各特徴点群の重心座標と、認識対象画像データ１３０ｄから抽出した特徴点の座標との比較より、各特徴点を対応付けられるので、各画像の座標を対応付ける場合の計算量を削減することができる。 Next, effects of the image processing apparatus 100 according to the first embodiment will be described. The image processing apparatus 100 extracts feature points from the learning target image data 130a and from the viewpoint variation image data group 130b, whose camera viewpoints differ, classifies the feature points, and generates the registration table 130c. The image processing apparatus 100 compares the centroid coordinates of each feature point group in the registration table 130c with the coordinates of the feature points extracted from the recognition target image data 130d, and thereby associates the feature points of the learning target image data 130a with the feature points of the recognition target image data 130d. Because the feature points can be associated merely by comparing the centroid coordinates of each feature point group with the coordinates of the feature points extracted from the recognition target image data 130d, the image processing apparatus 100 can reduce the amount of calculation required to associate the coordinates of the images.

また、画像処理装置１００は、マハラノビス距離を用いて、各特徴点を対応付けるので、各特徴点の分散具合を考慮して、正確に各画像の座標を対応付けることができる。 Further, since the image processing apparatus 100 associates each feature point using the Mahalanobis distance, it is possible to associate the coordinates of each image accurately in consideration of the degree of dispersion of each feature point.

本実施例２にかかる画像処理装置について説明する。図１２は、本実施例２にかかる画像処理装置の構成を示す機能ブロック図である。図１２に示すように、画像処理装置２００は、カメラ２１０ａ、入力部２１０ｂ、出力部２２０、記憶部２３０、制御部２４０を有する。 An image processing apparatus according to the second embodiment will be described. FIG. 12 is a functional block diagram of the configuration of the image processing apparatus according to the second embodiment. As shown in FIG. 12, the image processing apparatus 200 includes a camera 210a, an input unit 210b, an output unit 220, a storage unit 230, and a control unit 240.

カメラ２１０ａ、入力部２１０ｂ、出力部２２０に関する説明は、図１に示したカメラ１１０ａ、入力部１１０ｂ、出力部１２０に関する説明と同様である。 The description of the camera 210a, the input unit 210b, and the output unit 220 is the same as the description of the camera 110a, the input unit 110b, and the output unit 120 illustrated in FIG. 1.

記憶部２３０は、学習対象画像データ２３０ａ、視点変動画像データ群２３０ｂ、登録テーブル２３０ｃ、認識対象画像データ２３０ｄを記憶する。記憶部２３０は、例えば、ＲＡＭ、ＲＯＭ、フラッシュメモリなどの半導体メモリ素子、またはハードディスク、光ディスクなどの記憶装置に対応する。 The storage unit 230 stores learning target image data 230a, a viewpoint variation image data group 230b, a registration table 230c, and recognition target image data 230d. The storage unit 230 corresponds to, for example, a semiconductor memory device such as a RAM, a ROM, or a flash memory, or a storage device such as a hard disk or an optical disk.

学習対象画像データ２３０ａは、実施例１の学習対象画像データ１３０ａに対応する。視点変動画像データ群２３０ｂは、実施例１の視点変動画像データ群１３０ｂに対応する。認識対象画像データ２３０ｄは、実施例１の認識対象画像データ１３０ｄに対応する。 The learning target image data 230a corresponds to the learning target image data 130a of the first embodiment. The viewpoint variation image data group 230b corresponds to the viewpoint variation image data group 130b of the first embodiment. The recognition target image data 230d corresponds to the recognition target image data 130d of the first embodiment.

登録テーブル２３０ｃは、学習対象画像データ２３０ａおよび視点変動画像データ群２３０ｂから抽出される特徴点に関する各種の情報を保持するテーブルである。登録テーブル２３０ｃのデータ構造は、後述する。 The registration table 230c is a table that holds various types of information regarding feature points extracted from the learning target image data 230a and the viewpoint variation image data group 230b. The data structure of the registration table 230c will be described later.

制御部２４０は、データ管理部２４０ａ、視点変動画像生成部２４０ｂ、特徴点抽出部２４０ｃ、分類部２４０ｄ、対応付け部２４０ｅを有する。制御部２４０は、例えば、ＡＳＩＣや、ＦＰＧＡなどの集積装置に対応する。また、制御部２４０は、例えば、ＣＰＵやＭＰＵ等の電子回路に対応する。 The control unit 240 includes a data management unit 240a, a viewpoint variation image generation unit 240b, a feature point extraction unit 240c, a classification unit 240d, and an association unit 240e. The control unit 240 corresponds to, for example, an integrated device such as an ASIC or FPGA. The control unit 240 also corresponds to, for example, an electronic circuit such as a CPU or MPU.

データ管理部２４０ａは、記憶部２３０を管理する処理部である。例えば、学習フェーズにおいて、データ管理部２４０ａは、カメラ２１０ａから学習対象画像データを取得し、取得した学習対象画像データを、記憶部２３０に記憶する。認識フェーズにおいて、データ管理部２４０ａは、カメラ２１０ａから認識対象画像データを取得し、取得した認識対象画像データを、記憶部２３０に記憶する。 The data management unit 240a is a processing unit that manages the storage unit 230. For example, in the learning phase, the data management unit 240a acquires learning target image data from the camera 210a, and stores the acquired learning target image data in the storage unit 230. In the recognition phase, the data management unit 240a acquires recognition target image data from the camera 210a, and stores the acquired recognition target image data in the storage unit 230.

視点変動画像生成部２４０ｂは、学習対象画像データ２３０ａを基にして、視点変動画像データ群２３０ｂを生成する処理部である。視点変動画像生成部２４０ｂの具体的な処理は、実施例１の視点変動画像生成部１４０ｂの処理と同様である。 The viewpoint variation image generation unit 240b is a processing unit that generates a viewpoint variation image data group 230b based on the learning target image data 230a. Specific processing of the viewpoint variation image generation unit 240b is the same as the processing of the viewpoint variation image generation unit 140b of the first embodiment.

特徴点抽出部２４０ｃは、学習対象画像データ２３０ａから特徴点を抽出する。また、特徴点抽出部２４０ｃは、視点変動画像データ群２３０ｂの各視点変動画像データから、特徴点を抽出する。特徴点抽出部２４０ｃは、学習対象画像データ２３０ａから抽出した特徴点に関するデータおよび視点変動画像データ群２３０ｂから抽出した特徴点に関するデータを、分類部２４０ｄに出力する。 The feature point extraction unit 240c extracts feature points from the learning target image data 230a. In addition, the feature point extraction unit 240c extracts feature points from each viewpoint variation image data of the viewpoint variation image data group 230b. The feature point extraction unit 240c outputs data relating to feature points extracted from the learning target image data 230a and data relating to feature points extracted from the viewpoint variation image data group 230b to the classification unit 240d.

特徴点抽出部２４０ｃの具体的な処理は、実施例１の特徴点抽出部１４０ｃの処理と同様である。特徴点抽出部２４０ｃは、図４に示した抽出結果１Ａおよび図６に示した抽出結果１Ｂを分類部２４０ｄに出力する。 The specific processing of the feature point extraction unit 240c is the same as the processing of the feature point extraction unit 140c of the first embodiment. The feature point extraction unit 240c outputs the extraction result 1A shown in FIG. 4 and the extraction result 1B shown in FIG. 6 to the classification unit 240d.

分類部２４０ｄは、視点変動画像データ群２３０ｂから抽出された複数の特徴点を分類する処理部である。分類部２４０ｄは、抽出結果１Ａと、抽出結果１Ｂとを基にして、特徴点を分類する。 The classification unit 240d is a processing unit that classifies a plurality of feature points extracted from the viewpoint variation image data group 230b. The classification unit 240d classifies the feature points based on the extraction result 1A and the extraction result 1B.

特に、実施例２の分類部２４０ｄは、実施例１の分類部１４０ｄが分類した特徴点を、更に分類する。分類部２４０ｄが、視点変動画像データから抽出した各特徴点を、特徴点４０Ａ、特徴点４０Ｂ、特徴点４０Ｃの何れかに分類するまでの処理は、実施例１の分類部１４０ｄと同様である。 In particular, the classification unit 240d according to the second embodiment further classifies the feature points classified by the classification unit 140d according to the first embodiment. The processing up to the point where the classification unit 240d classifies each feature point extracted from the viewpoint variation image data as one of the feature point 40A, the feature point 40B, and the feature point 40C is the same as that of the classification unit 140d of the first embodiment.

図１３は、実施例２にかかる分類部の処理を説明するための図である。ここでは、分類部２４０ｄが、特徴点４０Ａに分類した各特徴点を更に分類する場合を例にして説明する。分類部２４０ｄは、各特徴点に対してk-meansを適用することで、特徴点をグループ３Ａ〜３Ｃに分類する。分類部２４０ｄは、グループ毎に、各特徴点の重心ｋ１−１〜ｋ３−１を求める。各重心をキーポイントと表記する。 FIG. 13 is a diagram for explaining the processing of the classification unit according to the second embodiment. Here, the case where the classification unit 240d further classifies each feature point classified into the feature points 40A will be described as an example. The classification unit 240d classifies the feature points into groups 3A to 3C by applying k-means to each feature point. The classification unit 240d obtains centroids k1-1 to k3-1 of each feature point for each group. Each center of gravity is expressed as a key point.

ここで、分類部２４０ｄが利用するk-meansによる処理の一例を説明する。分類部２４０ｄは、各特徴点にランダムにクラスタを割り当て、クラスタ毎の重心を算出する。分類部２４０ｄは、各特徴点のクラスタを、一番近い重心のクラスタに変更する。分類部２４０ｄは、各特徴点のクラスタが変化しなくなるまで、上記処理を繰り返すことで、特徴点を複数のグループに分類する。 Here, an example of the k-means processing used by the classification unit 240d will be described. The classification unit 240d randomly assigns a cluster to each feature point and calculates the centroid of each cluster. The classification unit 240d then reassigns each feature point to the cluster with the nearest centroid. The classification unit 240d repeats this processing until the cluster of every feature point stops changing, thereby classifying the feature points into a plurality of groups.
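The k-means steps described above can be sketched as follows. The optional `initial_labels` parameter, the `seed` and `max_iter` arguments, and the handling of an empty cluster (re-seeding it at a randomly chosen point) are assumptions added to make the sketch reproducible and safe to run; the text itself specifies only the random assignment, centroid calculation, reassignment, and the stopping condition.

```python
import math
import random


def kmeans(points, k, initial_labels=None, seed=0, max_iter=100):
    """k-means following the steps in the text; returns (labels, centroids).

    points: list of coordinate tuples of equal length.
    initial_labels: optional fixed starting assignment (hypothetical
    parameter; when omitted, clusters are assigned at random).
    """
    rng = random.Random(seed)
    # Step 1: assign a cluster to every feature point at random.
    if initial_labels is not None:
        labels = list(initial_labels)
    else:
        labels = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iter):
        # Step 2: compute the centroid of every cluster.
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Assumption: an empty cluster is re-seeded at a random point.
                centroids.append(tuple(float(v) for v in rng.choice(points)))
        # Step 3: move every point to the cluster with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 4: stop once no point changes its cluster.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


labels, centroids = kmeans(
    [(0, 0), (0, 1), (10, 10), (10, 11)], k=2, initial_labels=[0, 1, 0, 1])
```

With a random start, k-means can settle into different groupings on different runs, so results should be expected to depend on the initial assignment.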

分類部２４０ｄは、特徴点４０Ａに分類した各特徴点と同様にして、特徴点４０Ｂに分類した各特徴点および特徴点４０Ｃに分類した各特徴点を更に分類する。分類部２４０ｄは、分類した結果を、登録テーブル２３０ｃに登録する。 The classification unit 240d further classifies the feature points classified into the feature points 40B and the feature points classified into the feature points 40C in the same manner as the feature points classified into the feature points 40A. The classification unit 240d registers the classification result in the registration table 230c.

なお、分類部２４０ｄは、k-meansにより更に分類したグループに含まれる特徴点に基づいて、グループ毎に共分散値を算出し、登録テーブル２３０ｃに登録する。 The classification unit 240d calculates a covariance value for each group based on the feature points included in the group further classified by k-means, and registers the covariance value in the registration table 230c.

図１４は、本実施例２の登録テーブルのデータ構造の一例を示す図である。図１４に示すように、この登録テーブル２３０ｃは、クラスタ番号、座標、キー番号、重心座標、共分散を有する。 FIG. 14 is a diagram illustrating an example of a data structure of a registration table according to the second embodiment. As shown in FIG. 14, this registration table 230c has cluster numbers, coordinates, key numbers, barycentric coordinates, and covariance.

図１４のクラスタ番号は、分類した特徴点群を一意に識別する番号である。例えば、特徴点４０Ａに分類された特徴点群のクラスタ番号を１とする。特徴点４０Ｂに分類された特徴点群をクラスタ番号２とする。特徴点４０Ｃに分類された特徴点群をクラスタ番号３とする。 The cluster number in FIG. 14 is a number that uniquely identifies a classified feature point group. For example, the feature point group classified as the feature point 40A is given cluster number 1, the feature point group classified as the feature point 40B is given cluster number 2, and the feature point group classified as the feature point 40C is given cluster number 3.

図１４の座標は、学習対象画像データ２３０ａから抽出した特徴点の座標である。例えば、図１４の座標（ｘ１、ｙ１）は、特徴点４０Ａの座標である。キー番号は、キーポイントを一意に識別する番号である。重心座標は、キーポイントの重心座標に対応する。共分散は、k-meansにより更に分類したグループに含まれる特徴点の共分散値である。 The coordinates in FIG. 14 are the coordinates of feature points extracted from the learning target image data 230a. For example, the coordinates (x1, y1) in FIG. 14 are the coordinates of the feature point 40A. The key number is a number that uniquely identifies a key point. The barycentric coordinates correspond to the barycentric coordinates of the key points. The covariance is a covariance value of feature points included in a group further classified by k-means.

対応付け部２４０ｅは、登録テーブル２３０ｃを基にして、認識対象画像データ２３０ｄの特徴点と、学習対象画像データ２３０ａの特徴点とを対応付ける処理部である。 The association unit 240e is a processing unit that associates the feature points of the recognition target image data 230d with the feature points of the learning target image data 230a based on the registration table 230c.

対応付け部２４０ｅは、ＳＩＦＴを利用して、認識対象画像データ２３０ｄから、特徴点を抽出する。対応付け部２４０ｅは、認識対象画像データ２３０ｄから抽出した特徴点の特徴空間上の座標と、登録テーブル２３０ｃの重心座標とを比較し、最も座標間の距離が短い重心座標に対応するクラスタ番号を判定する。対応付け部２４０ｅは、判定したクラスタ番号の座標と、認識対象画像データ２３０ｄの特徴点とを対応付ける。 The associating unit 240e extracts feature points from the recognition target image data 230d using SIFT. The associating unit 240e compares the feature-space coordinates of each feature point extracted from the recognition target image data 230d with the centroid coordinates in the registration table 230c, and determines the cluster number whose centroid coordinates are closest. The associating unit 240e then associates the coordinates of the determined cluster number with the feature point of the recognition target image data 230d.

対応付け部２４０ｅは、キー番号ｋ１−１〜ｋ１−３の重心座標の何れかと、特徴点との距離が最も短い場合には、該特徴点は、クラスタ番号１に対応すると判定する。対応付け部２４０ｅは、キー番号ｋ２−１〜ｋ２−３の重心座標の何れかと、特徴点との距離が最も短い場合には、該特徴点は、クラスタ番号２に対応すると判定する。対応付け部２４０ｅは、キー番号ｋ３−１〜ｋ３−３の重心座標の何れかと、特徴点との距離が最も短い場合には、該特徴点は、クラスタ番号３に対応すると判定する。 The associating unit 240e determines that the feature point corresponds to the cluster number 1 when the distance between any one of the barycentric coordinates of the key numbers k1-1 to k1-3 and the feature point is the shortest. The associating unit 240e determines that the feature point corresponds to the cluster number 2 when the distance between any one of the barycentric coordinates of the key numbers k2-1 to k2-3 and the feature point is the shortest. The associating unit 240e determines that the feature point corresponds to cluster number 3 when the distance between any one of the barycentric coordinates of the key numbers k3-1 to k3-3 and the feature point is the shortest.
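The lookup through key points can be sketched as follows. The `KeyPoint` dataclass mirrors the columns listed for the registration table 230c (cluster number, coordinates, key number, centroid coordinates, covariance), but its field names and types, the function name `associate_by_key`, and the 2-D centroids are assumptions of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class KeyPoint:
    key: str              # key number, e.g. "k1-1"
    cluster: int          # cluster number the key point belongs to
    coords: tuple         # feature point coordinates in the learning image
    centroid: tuple       # centroid of the k-means group in feature space
    covariance: float = 0.0


def associate_by_key(descriptor, key_points):
    """Find the nearest key point and report the cluster that owns it."""
    nearest = min(key_points, key=lambda kp: math.dist(descriptor, kp.centroid))
    return nearest.cluster, nearest.coords


# Hypothetical rows: two key points for cluster 1, one for cluster 2.
rows = [
    KeyPoint("k1-1", 1, (12, 34), (0.0, 0.0)),
    KeyPoint("k1-2", 1, (12, 34), (1.0, 0.0)),
    KeyPoint("k2-1", 2, (56, 78), (5.0, 5.0)),
]
found = associate_by_key((0.9, 0.1), rows)
```

Keeping several key points per cluster lets a cluster whose descriptors form an elongated or split distribution still be reached through whichever of its key points lies closest.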

対応付け部２４０ｅは、認識対象画像データ２３０ｄから抽出した各特徴点に対して、上記処理を実行し、実行結果を出力部２２０に出力する。対応付け部２４０ｅがかかる処理を実行することで、認識対象画像データ２３０ｄの特徴点と、学習対象画像データ２３０ａの特徴点とが対応付けられる。 The associating unit 240e performs the above processing on each feature point extracted from the recognition target image data 230d, and outputs the execution result to the output unit 220. As the association unit 240e executes such processing, the feature points of the recognition target image data 230d are associated with the feature points of the learning target image data 230a.

なお、対応付け部２４０ｅは、実施例１と同様にして、マハラノビス距離を用いて、認識対象画像データ２３０ｄの特徴点と、学習対象画像データ２３０ａの特徴点とを対応付けてもよい。 Note that the association unit 240e may associate the feature points of the recognition target image data 230d and the feature points of the learning target image data 230a using the Mahalanobis distance in the same manner as in the first embodiment.

次に、本実施例２にかかる画像処理装置２００の効果について説明する。画像処理装置２００は、k-meansを利用して、実施例１で分類した特徴点群を更に細かく分類する。このため、特徴空間上で特徴点が歪んで分布している場合や、分離した分布をしている場合であっても、正確に認識対象画像データ２３０ｄの特徴点と、学習対象画像データ２３０ａの特徴点とを対応付けることができる。 Next, effects of the image processing apparatus 200 according to the second embodiment will be described. The image processing apparatus 200 further classifies the feature point group classified in the first embodiment using k-means. For this reason, even if the feature points are distorted and distributed in the feature space or separated, the feature points of the recognition target image data 230d and the learning target image data 230a are accurately detected. Feature points can be associated with each other.

本実施例３にかかる画像処理装置について説明する。図１５は、本実施例３にかかる画像処理装置の構成を示す機能ブロック図である。図１５に示すように、画像処理装置３００は、カメラ３１０ａ、入力部３１０ｂ、出力部３２０、記憶部３３０、制御部３４０を有する。 An image processing apparatus according to the third embodiment will be described. FIG. 15 is a functional block diagram of the configuration of the image processing apparatus according to the third embodiment. As illustrated in FIG. 15, the image processing apparatus 300 includes a camera 310a, an input unit 310b, an output unit 320, a storage unit 330, and a control unit 340.

カメラ３１０ａ、入力部３１０ｂ、出力部３２０に関する説明は、図１に示したカメラ１１０ａ、入力部１１０ｂ、出力部１２０に関する説明と同様である。 The description of the camera 310a, the input unit 310b, and the output unit 320 is the same as the description of the camera 110a, the input unit 110b, and the output unit 120 illustrated in FIG. 1.

記憶部３３０は、学習対象画像データ３３０ａ、視点変動画像データ群３３０ｂ、登録テーブル３３０ｃ、認識対象画像データ３３０ｄを記憶する。記憶部３３０は、例えば、ＲＡＭ、ＲＯＭ、フラッシュメモリなどの半導体メモリ素子、またはハードディスク、光ディスクなどの記憶装置に対応する。 The storage unit 330 stores learning target image data 330a, a viewpoint variation image data group 330b, a registration table 330c, and recognition target image data 330d. The storage unit 330 corresponds to, for example, a semiconductor memory device such as a RAM, a ROM, or a flash memory, or a storage device such as a hard disk or an optical disk.

学習対象画像データ３３０ａは、実施例１の学習対象画像データ１３０ａに対応する。視点変動画像データ群３３０ｂは、実施例１の視点変動画像データ群１３０ｂに対応する。認識対象画像データ３３０ｄは、実施例１の認識対象画像データ１３０ｄに対応する。 The learning target image data 330a corresponds to the learning target image data 130a of the first embodiment. The viewpoint variation image data group 330b corresponds to the viewpoint variation image data group 130b of the first embodiment. The recognition target image data 330d corresponds to the recognition target image data 130d of the first embodiment.

登録テーブル３３０ｃは、学習対象画像データ３３０ａおよび視点変動画像データ群３３０ｂから抽出される特徴点に関する各種の情報を保持するテーブルである。登録テーブル３３０ｃのデータ構造は、後述する。 The registration table 330c is a table that holds various types of information regarding feature points extracted from the learning target image data 330a and the viewpoint variation image data group 330b. The data structure of the registration table 330c will be described later.

制御部３４０は、データ管理部３４０ａ、視点変動画像生成部３４０ｂ、特徴点抽出部３４０ｃ、分類部３４０ｄ、対応付け部３４０ｅを有する。制御部３４０は、例えば、ＡＳＩＣや、ＦＰＧＡなどの集積装置に対応する。また、制御部３４０は、例えば、ＣＰＵやＭＰＵ等の電子回路に対応する。 The control unit 340 includes a data management unit 340a, a viewpoint variation image generation unit 340b, a feature point extraction unit 340c, a classification unit 340d, and an association unit 340e. The control unit 340 corresponds to an integrated device such as an ASIC or FPGA, for example. The control unit 340 corresponds to, for example, an electronic circuit such as a CPU or MPU.

データ管理部３４０ａは、記憶部３３０を管理する処理部である。例えば、学習フェーズにおいて、データ管理部３４０ａは、カメラ３１０ａから学習対象画像データを取得し、取得した学習対象画像データを、記憶部３３０に記憶する。認識フェーズにおいて、データ管理部３４０ａは、カメラ３１０ａから認識対象画像データを取得し、取得した認識対象画像データを、記憶部３３０に記憶する。 The data management unit 340a is a processing unit that manages the storage unit 330. For example, in the learning phase, the data management unit 340a acquires learning target image data from the camera 310a, and stores the acquired learning target image data in the storage unit 330. In the recognition phase, the data management unit 340a acquires recognition target image data from the camera 310a, and stores the acquired recognition target image data in the storage unit 330.

視点変動画像生成部３４０ｂは、学習対象画像データ３３０ａを基にして、視点変動画像データ群３３０ｂを生成する処理部である。視点変動画像生成部３４０ｂの具体的な処理は、実施例１の視点変動画像生成部１４０ｂの処理と同様である。 The viewpoint variation image generation unit 340b is a processing unit that generates a viewpoint variation image data group 330b based on the learning target image data 330a. Specific processing of the viewpoint variation image generation unit 340b is the same as the processing of the viewpoint variation image generation unit 140b of the first embodiment.

特徴点抽出部３４０ｃは、学習対象画像データ３３０ａから特徴点を抽出する。また、特徴点抽出部３４０ｃは、視点変動画像データ群３３０ｂの各視点変動画像データから、特徴点を抽出する。特徴点抽出部３４０ｃは、学習対象画像データ３３０ａから抽出した特徴点に関するデータおよび視点変動画像データ群３３０ｂから抽出した特徴点に関するデータを、分類部３４０ｄに出力する。 The feature point extraction unit 340c extracts feature points from the learning target image data 330a. In addition, the feature point extraction unit 340c extracts feature points from each viewpoint variation image data of the viewpoint variation image data group 330b. The feature point extraction unit 340c outputs the data on the feature points extracted from the learning target image data 330a and the data on the feature points extracted from the viewpoint variation image data group 330b to the classification unit 340d.

特徴点抽出部３４０ｃの具体的な処理は、実施例１の特徴点抽出部１４０ｃの処理と同様である。特徴点抽出部３４０ｃは、図４に示した抽出結果１Ａおよび図６に示した抽出結果１Ｂを分類部３４０ｄに出力する。 The specific process of the feature point extraction unit 340c is the same as the process of the feature point extraction unit 140c of the first embodiment. The feature point extraction unit 340c outputs the extraction result 1A shown in FIG. 4 and the extraction result 1B shown in FIG. 6 to the classification unit 340d.

分類部３４０ｄは、視点変動画像データ群３３０ｂから抽出された複数の特徴点を分類する処理部である。分類部３４０ｄは、抽出結果１Ａと、抽出結果１Ｂとを基にして、特徴点を分類する。 The classification unit 340d is a processing unit that classifies a plurality of feature points extracted from the viewpoint variation image data group 330b. The classification unit 340d classifies the feature points based on the extraction result 1A and the extraction result 1B.

特に、実施例３の分類部３４０ｄは、実施例１の分類部１４０ｄが分類した各特徴点群の重心座標を比較し、重心座標の近い特徴点群をまとめ、まとめた特徴点群に対して、k-meansを適用することで、重心座標が類似する特徴点群を分類し直す。分類部３４０ｄが、視点変動画像データから抽出した各特徴点を、特徴点４０Ａ、特徴点４０Ｂ、特徴点４０Ｃの何れかに分類するまでの処理は、実施例１の分類部１４０ｄと同様である。 In particular, the classification unit 340d according to the third embodiment compares the centroid coordinates of the feature point groups classified by the classification unit 140d according to the first embodiment, merges the feature point groups whose centroid coordinates are close to each other, and applies k-means to the merged feature points, thereby reclassifying the feature point groups with similar centroid coordinates. The processing up to the point where the classification unit 340d classifies each feature point extracted from the viewpoint variation image data as one of the feature point 40A, the feature point 40B, and the feature point 40C is the same as that of the classification unit 140d of the first embodiment.

図１６、図１７は、実施例３にかかる分類部の処理を説明するための図である。図１６において、丸印は、特徴点４０Ａに分類された特徴点に対応する。三角印は、特徴点４０Ｂに分類された特徴点に対応する。四角印は、特徴点４０Ｃに分類された特徴点に対応する。 FIGS. 16 and 17 are diagrams for explaining the processing of the classification unit according to the third embodiment. In FIG. 16, the circles correspond to the feature points classified as the feature points 40A. The triangle mark corresponds to the feature point classified as the feature point 40B. The square marks correspond to the feature points classified as the feature points 40C.

分類部３４０ｄは、各特徴点群の重心２Ａ〜２Ｃの距離を比較し、重心間の距離が閾値未満となる重心の組を判定する。本実施例３では、一例として、重心２Ａと、重心２Ｂとの距離が閾値未満となる場合について説明する。 The classification unit 340d compares the distances of the centroids 2A to 2C of the feature point groups, and determines a set of centroids in which the distance between the centroids is less than a threshold. In the third embodiment, as an example, a case where the distance between the center of gravity 2A and the center of gravity 2B is less than the threshold will be described.
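The selection of centroid pairs closer than a threshold can be sketched as follows. The function name `close_centroid_pairs`, the dictionary input, the 2-D coordinates, and the threshold value are assumptions; the text does not state how the threshold is chosen.

```python
import math
from itertools import combinations


def close_centroid_pairs(centroids, threshold):
    """Return every pair of group names whose centroids are closer than threshold."""
    return [
        (name_a, name_b)
        for (name_a, c_a), (name_b, c_b) in combinations(centroids.items(), 2)
        if math.dist(c_a, c_b) < threshold
    ]


# Hypothetical centroids for the groups of feature points 40A, 40B and 40C.
centroids = {"2A": (0.0, 0.0), "2B": (1.0, 0.0), "2C": (10.0, 10.0)}
pairs = close_centroid_pairs(centroids, threshold=2.0)
```

Only the feature point groups named in a returned pair are merged and passed to k-means; a group such as 2C, which is far from every other centroid, keeps its original classification.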

分類部３４０ｄは、判定した重心２Ａに対応する特徴点群と、重心２Ｂに対応する特徴点群に対して、k-meansを実行する。ここで、重心２Ａに対応する特徴点群は、特徴点４０Ａに分類された丸印の各特徴点である。重心２Ｂに対応する特徴点群は、特徴点４０Ｂに分類された三角印の各特徴点である。 The classification unit 340d performs k-means on the feature point group corresponding to the determined centroid 2A and the feature point group corresponding to the centroid 2B. Here, the feature point group corresponding to the center of gravity 2A is each of the feature points indicated by circles classified as the feature point 40A. The feature point group corresponding to the center of gravity 2B is each feature point of the triangle mark classified into the feature points 40B.

分類部３４０ｄが、k-meansを実行すると、図１７に示すように各特徴点がグループ４Ａ〜４Ｅに分類される。分類部３４０ｄは、グループ毎に、各特徴点の重心ｋ１〜ｋ５を求める。各重心をキーポイントと表記する。また、分類部３４０ｄは、グループ毎に、各特徴点の共分散を算出する。 When the classification unit 340d executes k-means, the feature points are classified into groups 4A to 4E as shown in FIG. 17. The classification unit 340d obtains the centroids k1 to k5 of the feature points for each group. Each centroid is referred to as a key point. The classification unit 340d also calculates the covariance of the feature points for each group.

図１７を参照すると、グループ４Ａ、４Ｂは、丸印の特徴点のみを含む。このため、分類部３４０ｄは、グループ４Ａ、４Ｂを、特徴点４０Ａに対応付ける。グループ４Ａのクラスタ番号を１、グループ４Ｂのクラスタ番号を２とする。 Referring to FIG. 17, the groups 4A and 4B include only the circle feature points. For this reason, the classification unit 340d associates the groups 4A and 4B with the feature point 40A. The cluster number of group 4A is 1, and the cluster number of group 4B is 2.

図１７を参照すると、グループ４Ｃは、丸印の特徴点および三角印の特徴点を含む。この場合には、分類部３４０ｄは、グループ４Ｃを、特徴点４０Ａ、特徴点４０Ｂの双方と対応付ける。グループ４Ｃのクラスタ番号を３とする。 Referring to FIG. 17, group 4C includes feature points with circles and feature points with triangles. In this case, the classification unit 340d associates the group 4C with both the feature point 40A and the feature point 40B. The cluster number of group 4C is set to 3.

図１７を参照すると、グループ４Ｄ、４Ｅは、三角印の特徴点のみを含む。このため、分類部３４０ｄは、グループ４Ｄ、４Ｅを、特徴点４０Ｂに対応付ける。グループ４Ｄのクラスタ番号を４、グループ４Ｅのクラスタ番号を５とする。 Referring to FIG. 17, the groups 4D and 4E include only triangular feature points. For this reason, the classification unit 340d associates the groups 4D and 4E with the feature point 40B. The cluster number of group 4D is 4, and the cluster number of group 4E is 5.

なお、分類部３４０ｄは、四角印の各特徴点のクラスタ番号を６とし、重心をｋ６とする。ｋ６は、図１６の２Ｃに対応するものとする。 Note that the classification unit 340d sets the cluster number of the square-mark feature points to 6 and their centroid to k6. Here, k6 corresponds to the centroid 2C in FIG. 16.
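Deciding which learning-image feature points a regrouped cluster maps to can be sketched as follows. The function name `group_composition` and the label lists are assumptions; the idea is simply to count, per k-means group, how many members originally belonged to each feature point, which also yields the per-type counts held in the registration table.

```python
from collections import Counter


def group_composition(group_labels, origin_labels):
    """Count, for each k-means group, the original feature point of its members."""
    composition = {}
    for group, origin in zip(group_labels, origin_labels):
        composition.setdefault(group, Counter())[origin] += 1
    # Plain dicts keep the result easy to store and compare.
    return {group: dict(counts) for group, counts in composition.items()}


# Hypothetical labels: group 1 mixes circle (40A) and triangle (40B) points.
groups = [0, 0, 1, 1, 1, 2]
origins = ["40A", "40A", "40A", "40B", "40B", "40B"]
composition = group_composition(groups, origins)
```

A group whose dictionary has a single key maps to one learning-image feature point, while a group with several keys, like group 4C in the text, is associated with all of them and must be disambiguated at recognition time.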

分類部３４０ｄは、分類した結果を、登録テーブル３３０ｃに登録する。図１８は、本実施例３の登録テーブルのデータ構造の一例を示す図である。この登録テーブル３３０ｃは、クラスタ番号、キー番号、構成データ数、座標、重心座標、共分散を有する。 The classification unit 340d registers the classification result in the registration table 330c. FIG. 18 is a diagram illustrating an example of the data structure of the registration table according to the third embodiment. The registration table 330c has a cluster number, a key number, the number of configuration data, coordinates, barycentric coordinates, and covariance.

図１８のクラスタ番号は、分類した特徴点を一意に識別する番号である。クラスタ番号１、２は、特徴点４０Ａに分類された特徴点群のクラスタ番号である。クラスタ番号３は、上記のように、特徴点４０Ａまたは特徴点４０Ｂに分類された特徴点群のクラスタである。クラスタ番号４、５は、特徴点４０Ｂに分類された特徴点群のクラスタ番号である。クラスタ番号６は、特徴点４０Ｃに分類された特徴点群のクラスタ番号である。 The cluster numbers in FIG. 18 are numbers that uniquely identify the classified feature points. Cluster numbers 1 and 2 are cluster numbers of the feature point group classified into the feature points 40A. Cluster number 3 is a cluster of feature points grouped as feature point 40A or feature point 40B as described above. Cluster numbers 4 and 5 are cluster numbers of the feature point group classified into the feature points 40B. The cluster number 6 is the cluster number of the feature point group classified into the feature points 40C.

図１８のキー番号は、各特徴点群のキーポイントを一意に識別する番号である。構成データ数は、各グループに含まれる特徴点の数に対応する。なお、同一のグループに異なる種類の特徴点が含まれる場合には、種類毎の特徴点の数が登録される。例えば、クラスタ番号３に対応する特徴点群は、丸印の特徴点と、三角印の特徴点を含むため、特徴点の種類毎に数が登録される。 The key number in FIG. 18 is a number that uniquely identifies the key point of each feature point group. The number of configuration data corresponds to the number of feature points included in each group. When the same group contains different types of feature points, the number of feature points of each type is registered. For example, the feature point group corresponding to cluster number 3 includes both circle-mark and triangle-mark feature points, so a count is registered for each type.

図１８の座標は、学習対象画像データ３３０ａから抽出した特徴点の座標である。例えば、座標（ｘ１、ｙ１）は、特徴点４０Ａの座標である。座標（ｘ２、ｙ２）は、特徴点４０Ｂの座標である。座標（ｘ３、ｙ３）は、特徴点４０Ｃの座標である。重心座標は、キーポイントの重心座標に対応する。共分散は、k-meansにより更に分類したグループに含まれる特徴点の共分散値である。 The coordinates in FIG. 18 are the coordinates of the feature points extracted from the learning target image data 330a. For example, the coordinates (x1, y1) are the coordinates of the feature point 40A. The coordinates (x2, y2) are the coordinates of the feature point 40B. The coordinates (x3, y3) are the coordinates of the feature point 40C. The barycentric coordinates correspond to the barycentric coordinates of the key points. The covariance is a covariance value of feature points included in a group further classified by k-means.

対応付け部３４０ｅは、登録テーブル３３０ｃを基にして、認識対象画像データ３３０ｄの特徴点と、学習対象画像データ３３０ａの特徴点とを対応付ける処理部である。 The association unit 340e is a processing unit that associates the feature points of the recognition target image data 330d with the feature points of the learning target image data 330a based on the registration table 330c.

対応付け部３４０ｅは、ＳＩＦＴを利用して、認識対象画像データ３３０ｄから、特徴点を抽出する。対応付け部３４０ｅは、認識対象画像データ３３０ｄから抽出した特徴点の特徴空間上の座標と、登録テーブル３３０ｃの重心座標とを比較し、最も座標間の距離が短い重心座標に対応するクラスタ番号を判定する。対応付け部３４０ｅは、判定したクラスタ番号の座標と、認識対象画像データ３３０ｄの特徴点とを対応付ける。 The associating unit 340e extracts feature points from the recognition target image data 330d using SIFT. The associating unit 340e compares the feature-space coordinates of each feature point extracted from the recognition target image data 330d with the centroid coordinates in the registration table 330c, and determines the cluster number whose centroid coordinates are closest. The associating unit 340e then associates the coordinates of the determined cluster number with the feature point of the recognition target image data 330d.

対応付け部３４０ｅは、キー番号ｋ１、ｋ２の重心座標の何れかと、特徴点との距離が最も近い場合には、該特徴点は、学習対象画像データ３３０ａ上の特徴点４０Ａに対応すると判定する。対応付け部３４０ｅは、キー番号ｋ４、ｋ５の重心座標の何れかと、特徴点との距離が最も近い場合には、該特徴点は、学習対象画像データ３３０ａ上の特徴点４０Ｂに対応すると判定する。対応付け部３４０ｅは、キー番号ｋ６の重心座標の何れかと、特徴点との距離が最も近い場合には、該特徴点は、学習対象画像データ３３０ａ上の特徴点４０Ｃに対応すると判定する。 The association unit 340e determines that a feature point corresponds to the feature point 40A on the learning target image data 330a when the centroid coordinates closest to that feature point are those of key number k1 or k2. The association unit 340e determines that the feature point corresponds to the feature point 40B on the learning target image data 330a when the closest centroid coordinates are those of key number k4 or k5. The association unit 340e determines that the feature point corresponds to the feature point 40C on the learning target image data 330a when the closest centroid coordinates are those of key number k6.

なお、対応付け部３４０ｅは、キー番号ｋ３の重心座標と、特徴点との距離が最も近い場合には、該特徴点は、学習対象画像データ３３０ａ上の特徴点４０Ａまたは特徴点４０Ｂの何れかに対応すると判定する。認識対象画像データ３３０ｄの特徴点が、学習対象画像データ３３０ａ上の複数の特徴点に対応付けられた場合には、対応付け部３４０ｅは、特徴点全体として、誤差が最も少ない特徴点を最終的に対応付ける。 Note that when the centroid coordinates of key number k3 are closest to a feature point, the association unit 340e determines that the feature point corresponds to either the feature point 40A or the feature point 40B on the learning target image data 330a. When a feature point of the recognition target image data 330d has been associated with a plurality of feature points on the learning target image data 330a, the association unit 340e finally associates it with the feature point that gives the smallest error over the feature points as a whole.

例えば、対応付け部３４０ｅは、最小二乗法等を基にして、特徴点４０Ａに対応付けた場合の誤差と、特徴点４０Ｂに対応付けた場合の誤差とを比較し、誤差の少ない方の特徴点に、認識対象画像データ３３０ｄの特徴点を対応付ける。 For example, based on the least squares method or the like, the association unit 340e compares the error obtained when the feature point is associated with the feature point 40A against the error obtained when it is associated with the feature point 40B, and associates the feature point of the recognition target image data 330d with whichever of the two gives the smaller error.
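One way to picture this comparison is sketched below. The specification only says "the least squares method or the like" and does not name the model being fitted, so this sketch makes a loud simplifying assumption: the mapping between the learning target image and the recognition target image is treated as a pure 2-D translation, whose least-squares estimate is the mean offset. A real pose estimate would more plausibly fit a projective transformation. The function names and sample coordinates are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Pair = Tuple[Point, Point]  # (learning-image point, recognition-image point)


def fit_translation(pairs: List[Pair]) -> Point:
    """Least-squares translation for the pairs, i.e. the mean offset."""
    n = len(pairs)
    tx = sum(q[0] - p[0] for p, q in pairs) / n
    ty = sum(q[1] - p[1] for p, q in pairs) / n
    return tx, ty


def total_squared_error(pairs: List[Pair]) -> float:
    """Sum of squared residuals over all pairs after the least-squares fit."""
    tx, ty = fit_translation(pairs)
    return sum((q[0] - p[0] - tx) ** 2 + (q[1] - p[1] - ty) ** 2 for p, q in pairs)


def resolve(ambiguous_point: Point,
            candidates: Dict[str, Point],
            fixed_pairs: List[Pair]) -> str:
    """Pick the candidate whose inclusion gives the smallest overall error."""
    return min(
        candidates,
        key=lambda label: total_squared_error(
            fixed_pairs + [(candidates[label], ambiguous_point)]
        ),
    )


# Unambiguous matches (assumed), then one point that matched both 40A and 40B.
fixed = [((0.0, 0.0), (5.0, 5.0)), ((10.0, 0.0), (15.0, 5.0))]
print(resolve((15.2, 14.9), {"40A": (0.0, 10.0), "40B": (10.0, 10.0)}, fixed))
```

The error is evaluated over all pairs rather than for the ambiguous point alone, which mirrors the statement that the final association is the one with the smallest error for the feature points as a whole.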

なお、対応付け部３４０ｅは、実施例１と同様にして、マハラノビス距離を用いて、認識対象画像データ３３０ｄの特徴点と、学習対象画像データ３３０ａの特徴点とを対応付けてもよい。 Note that the association unit 340e may associate the feature points of the recognition target image data 330d and the feature points of the learning target image data 330a using the Mahalanobis distance in the same manner as in the first embodiment.
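For reference, the Mahalanobis distance weights the offset from a group's centroid by the inverse of the group's covariance, so a tight group is harder to match at the same Euclidean distance than a widely spread one. The sketch below is restricted to two dimensions so that the matrix inverse can be written out by hand; real SIFT descriptors are 128-dimensional and would need a general inverse or a diagonal approximation. Function names and sample values are assumptions for illustration.

```python
import math
from typing import Dict, Sequence, Tuple

Group = Tuple[Sequence[float], Sequence[Sequence[float]]]  # (centroid, 2x2 covariance)


def mahalanobis_2d(x: Sequence[float],
                   mean: Sequence[float],
                   cov: Sequence[Sequence[float]]) -> float:
    """Mahalanobis distance of x from mean under a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # (dx, dy) * inverse(cov) * (dx, dy)^T, with inverse = [[d, -b], [-c, a]] / det
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


def nearest_by_mahalanobis(x: Sequence[float], groups: Dict[str, Group]) -> str:
    """Return the key of the group with the smallest Mahalanobis distance to x."""
    return min(groups, key=lambda k: mahalanobis_2d(x, groups[k][0], groups[k][1]))


groups = {
    "k1": ((0.0, 0.0), [[4.0, 0.0], [0.0, 4.0]]),    # widely spread group
    "k2": ((3.0, 0.0), [[0.25, 0.0], [0.0, 0.25]]),  # tight group
}
# (2, 0) is closer to k2 in Euclidean terms but closer to k1 in Mahalanobis terms.
print(nearest_by_mahalanobis((2.0, 0.0), groups))
```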

次に、本実施例３にかかる画像処理装置３００の効果について説明する。画像処理装置３００は、複数の特徴点に対応する可能性がある場合には、認識対象画像データ３３０ｄの特徴点を、学習対象画像データ３３０ａの複数の特徴点に対応付ける。そして、画像処理装置３００は、全体の特徴点との関係から、複数の特徴点に対応付けた特徴点を、単一の特徴点に絞り込む。このため、認識対象画像データ３３０ｄの特徴点を、無理矢理、単一の特徴点に対応付けることが無くなり、正確に、認識対象画像データ３３０ｄの特徴点を、学習対象画像データ３３０ａの複数の特徴点に対応付けることができる。 Next, the effects of the image processing apparatus 300 according to the third embodiment will be described. When a feature point of the recognition target image data 330d may correspond to a plurality of feature points, the image processing apparatus 300 associates it with the plurality of feature points of the learning target image data 330a. The image processing apparatus 300 then narrows the feature point associated with a plurality of feature points down to a single feature point based on its relationship with the feature points as a whole. Consequently, a feature point of the recognition target image data 330d is no longer forced onto a single feature point, and the feature points of the recognition target image data 330d can be accurately associated with the plurality of feature points of the learning target image data 330a.

次に、実施例に示した情報処理装置１００、２００、３００と同様の機能を実現する情報処理プログラムを実行するコンピュータの一例を説明する。図１９は、画像処理プログラムを実行するコンピュータの一例を示す図である。 Next, an example of a computer that executes an information processing program that implements the same functions as those of the information processing apparatuses 100, 200, and 300 described in the embodiments will be described. FIG. 19 is a diagram illustrating an example of a computer that executes an image processing program.

図１９に示すように、コンピュータ４００は、各種演算処理を実行するＣＰＵ４０１と、ユーザからのデータの入力を受け付ける入力装置４０２と、ディスプレイ４０３を有する。また、コンピュータ４００は、記憶媒体からプログラム等を読取る読み取り装置４０４と、ネットワークを介して他のコンピュータとの間でデータの授受を行うインターフェース装置４０５とを有する。また、コンピュータ４００は、各種情報を一時記憶するＲＡＭ４０６と、ハードディスク装置４０７を有する。そして、各装置４０１〜４０７は、バス４０８に接続される。 As shown in FIG. 19, the computer 400 includes a CPU 401 that executes various arithmetic processes, an input device 402 that receives data input from a user, and a display 403. The computer 400 includes a reading device 404 that reads a program and the like from a storage medium, and an interface device 405 that exchanges data with another computer via a network. The computer 400 also includes a RAM 406 that temporarily stores various types of information and a hard disk device 407. The devices 401 to 407 are connected to the bus 408.

ハードディスク装置４０７は、例えば、視点変動画像生成プログラム４０７ａ、特徴点抽出プログラム４０７ｂ、分類プログラム４０７ｃ、対応付けプログラム４０７ｄを有する。ＣＰＵ４０１は、各プログラム４０７ａ〜４０７ｄを読み出して、ＲＡＭ４０６に展開する。 The hard disk device 407 stores, for example, a viewpoint variation image generation program 407a, a feature point extraction program 407b, a classification program 407c, and an association program 407d. The CPU 401 reads the programs 407a to 407d and loads them into the RAM 406.

視点変動画像生成プログラム４０７ａは、視点変動画像生成プロセス４０６ａとして機能する。特徴点抽出プログラム４０７ｂは、特徴点抽出プロセス４０６ｂとして機能する。分類プログラム４０７ｃは、分類プロセス４０６ｃとして機能する。対応付けプログラム４０７ｄは、対応付けプロセス４０６ｄとして機能する。 The viewpoint variation image generation program 407a functions as a viewpoint variation image generation process 406a. The feature point extraction program 407b functions as a feature point extraction process 406b. The classification program 407c functions as a classification process 406c. The association program 407d functions as the association process 406d.

例えば、視点変動画像生成プロセス４０６ａは、視点変動画像生成部１４０ｂ、２４０ｂ、３４０ｂに対応する。特徴点抽出プロセス４０６ｂは、特徴点抽出部１４０ｃ、２４０ｃ、３４０ｃに対応する。分類プロセス４０６ｃは、分類部１４０ｄ、２４０ｄ、３４０ｄに対応する。対応付けプロセス４０６ｄは、対応付け部１４０ｅ、２４０ｅ、３４０ｅに対応する。 For example, the viewpoint variation image generation process 406a corresponds to the viewpoint variation image generation units 140b, 240b, and 340b. The feature point extraction process 406b corresponds to the feature point extraction units 140c, 240c, and 340c. The classification process 406c corresponds to the classification units 140d, 240d, and 340d. The association process 406d corresponds to the association units 140e, 240e, and 340e.

なお、各プログラム４０７ａ〜４０７ｄについては、必ずしも最初からハードディスク装置４０７に記憶させておかなくてもよい。例えば、コンピュータ４００に挿入されるフレキシブルディスク（ＦＤ）、ＣＤ−ＲＯＭ、ＤＶＤディスク、光磁気ディスク、ＩＣカードなどの「可搬用の物理媒体」に各プログラムを記憶させておく。そして、コンピュータ４００がこれらから各プログラム４０７ａ〜４０７ｄを読み出して実行するようにしてもよい。 Note that the programs 407a to 407d are not necessarily stored in the hard disk device 407 from the beginning. For example, each program is stored in a “portable physical medium” such as a flexible disk (FD), a CD-ROM, a DVD disk, a magneto-optical disk, and an IC card inserted into the computer 400. Then, the computer 400 may read and execute each of the programs 407a to 407d from these.

以上の各実施例を含む実施形態に関し、さらに以下の付記を開示する。 The following supplementary notes are further disclosed with respect to the embodiments including the above examples.

（付記１）コンピュータに、
画像を正面のカメラ視点から撮影した学習対象画像を射影変換することで、前記学習対象画像に対するカメラ視点の異なる視点変動画像を複数生成し、
前記学習対象画像および複数の前記視点変動画像から特徴点を抽出し、
前記学習対象画像から抽出した各特徴点に対して、複数の前記視点変動画像から抽出した各特徴点を対応付けることで、複数の前記視点変動画像から抽出した各特徴点を複数の特徴点群に分類し、
分類した特徴点群を含む領域を登録テーブルに登録し、
認識対象画像を取得し、該認識対象画像から特徴点を抽出し、
前記認識対象画像から抽出した特徴点の位置と、前記登録テーブルの前記特徴点群の領域との関係に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける
各処理を実行させることを特徴とする画像処理プログラム。 (Supplementary note 1)
An image processing program that causes a computer to execute processes comprising:
generating a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
extracting feature points from the learning target image and the plurality of viewpoint variation images;
classifying the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image;
registering regions containing the classified feature point groups in a registration table;
acquiring a recognition target image and extracting feature points from the recognition target image; and
associating the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.
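As a small illustration of the projective transformation named in the first step, the sketch below maps a single 2-D coordinate through a 3x3 homography matrix. It transforms coordinates only, whereas the described apparatus generates whole viewpoint variation images; the function name and the sample matrix are assumptions chosen for the example, not values from the specification.

```python
from typing import Sequence, Tuple


def apply_homography(H: Sequence[Sequence[float]],
                     point: Tuple[float, float]) -> Tuple[float, float]:
    """Map (x, y) through the 3x3 homography H using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


# Hypothetical homography with a mild perspective term in the bottom row.
H_tilt = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]
print(apply_homography(H_tilt, (100.0, 50.0)))
```

Applying a set of such matrices, one per simulated camera viewpoint, to the frontal learning target image is what yields the group of viewpoint variation images from which the feature point groups are built.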

（付記２）前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける処理は、前記領域に含まれる特徴点群の重心と、前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記１に記載の画像処理プログラム。 (Supplementary Note 2) The process of associating the feature point of the recognition target image with the feature point of the learning target image is based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image. The image processing program according to appendix 1, wherein a feature point of the recognition target image and a feature point of the learning target image are associated with each other.

（付記３）前記登録テーブルに登録する処理は、特徴点群に含まれる各特徴点の分布広がりを更に登録し、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける処理は、前記領域に含まれる特徴点群の重心と、前記分布広がりの重み付けをした前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記１または２に記載の画像処理プログラム。 (Supplementary note 3) The image processing program according to supplementary note 1 or 2, wherein the process of registering in the registration table further registers the distribution spread of the feature points included in each feature point group, and the process of associating the feature points of the recognition target image with the feature points of the learning target image performs the association based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image weighted by the distribution spread.

（付記４）前記分布広がりは、標準偏差または分散であることを特徴とする付記３に記載の画像処理プログラム。 (Supplementary note 4) The image processing program according to supplementary note 3, wherein the distribution spread is a standard deviation or a variance.

（付記５）前記特徴点群に分類する処理は、前記学習対象画像の一つの特徴点に対して、複数の特徴点群に分類することを特徴とする付記１〜４のいずれか一つに記載の画像処理プログラム。 (Supplementary note 5) The image processing program according to any one of supplementary notes 1 to 4, wherein the process of classifying into feature point groups classifies into a plurality of feature point groups for one feature point of the learning target image.

（付記６）前記特徴点群に分類する処理は、前記学習対象画像の複数の特徴点に対して、一つの特徴点群に分類することを特徴とする付記１〜４のいずれか一つに記載の画像処理プログラム。 (Supplementary note 6) The image processing program according to any one of supplementary notes 1 to 4, wherein the process of classifying into feature point groups classifies into one feature point group for a plurality of feature points of the learning target image.

（付記７）コンピュータが実行する画像処理方法であって、
画像を正面のカメラ視点から撮影した学習対象画像を射影変換することで、前記学習対象画像に対するカメラ視点の異なる視点変動画像を複数生成し、
前記学習対象画像および複数の前記視点変動画像から特徴点を抽出し、
前記学習対象画像から抽出した各特徴点に対して、複数の前記視点変動画像から抽出した各特徴点を対応付けることで、複数の前記視点変動画像から抽出した各特徴点を複数の特徴点群に分類し、
分類した特徴点群を含む領域を登録テーブルに登録し、
認識対象画像を取得し、該認識対象画像から特徴点を抽出し、
前記認識対象画像から抽出した特徴点の位置と、前記登録テーブルの前記特徴点群の領域との関係に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける
各処理を実行することを特徴とする画像処理方法。 (Appendix 7) An image processing method executed by a computer,
the method comprising:
generating a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
extracting feature points from the learning target image and the plurality of viewpoint variation images;
classifying the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image;
registering regions containing the classified feature point groups in a registration table;
acquiring a recognition target image and extracting feature points from the recognition target image; and
associating the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.

（付記８）前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける処理は、前記領域に含まれる特徴点群の重心と、前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記７に記載の画像処理方法。 (Supplementary Note 8) The process of associating the feature point of the recognition target image with the feature point of the learning target image is based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image. The image processing method according to appendix 7, wherein a feature point of the recognition target image is associated with a feature point of the learning target image.

（付記９）前記登録テーブルに登録する処理は、特徴点群に含まれる各特徴点の分布広がりを更に登録し、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける処理は、前記領域に含まれる特徴点群の重心と、前記分布広がりの重み付けをした前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記７または８に記載の画像処理方法。 (Supplementary note 9) The image processing method according to supplementary note 7 or 8, wherein the process of registering in the registration table further registers the distribution spread of the feature points included in each feature point group, and the process of associating the feature points of the recognition target image with the feature points of the learning target image performs the association based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image weighted by the distribution spread.

（付記１０）前記分布広がりは、標準偏差または分散であることを特徴とする付記９に記載の画像処理方法。 (Supplementary note 10) The image processing method according to supplementary note 9, wherein the distribution spread is a standard deviation or a variance.

（付記１１）前記特徴点群に分類する処理は、前記学習対象画像の一つの特徴点に対して、複数の特徴点群に分類することを特徴とする付記７〜１０のいずれか一つに記載の画像処理方法。 (Supplementary note 11) The image processing method according to any one of supplementary notes 7 to 10, wherein the process of classifying into feature point groups classifies into a plurality of feature point groups for one feature point of the learning target image.

（付記１２）前記特徴点群に分類する処理は、前記学習対象画像の複数の特徴点に対して、一つの特徴点群に分類することを特徴とする付記７〜１０のいずれか一つに記載の画像処理方法。 (Supplementary note 12) The image processing method according to any one of supplementary notes 7 to 10, wherein the process of classifying into feature point groups classifies into one feature point group for a plurality of feature points of the learning target image.

（付記１３）画像を正面のカメラ視点から撮影した学習対象画像を射影変換することで、前記学習対象画像に対するカメラ視点の異なる視点変動画像を複数生成する視点変動画像生成部と、
前記学習対象画像および複数の前記視点変動画像から特徴点を抽出する特徴点抽出部と、
前記学習対象画像から抽出した各特徴点に対して、複数の前記視点変動画像から抽出した各特徴点を対応付けることで、複数の前記視点変動画像から抽出した各特徴点を複数の特徴点群に分類し、分類した特徴点群を含む領域を登録テーブルに登録する分類部と、
認識対象画像を取得し、該認識対象画像から特徴点を抽出し、前記認識対象画像から抽出した特徴点の位置と、前記登録テーブルの前記特徴点群の領域との関係に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付ける対応付け部と
を有することを特徴とする画像処理装置。 (Supplementary note 13) An image processing apparatus comprising:
a viewpoint variation image generation unit that generates a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
a feature point extraction unit that extracts feature points from the learning target image and the plurality of viewpoint variation images;
a classification unit that classifies the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image, and registers regions containing the classified feature point groups in a registration table; and
an association unit that acquires a recognition target image, extracts feature points from the recognition target image, and associates the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.

（付記１４）前記対応付け部は、前記領域に含まれる特徴点群の重心と、前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記１３に記載の画像処理装置。 (Supplementary note 14) The image processing apparatus according to supplementary note 13, wherein the association unit associates the feature points of the recognition target image with the feature points of the learning target image based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image.

（付記１５）前記分類部は、前記登録テーブルの特徴点群に含まれる各特徴点の分布広がりを更に登録し、前記対応付け部は、前記領域に含まれる特徴点群の重心と、前記分布広がりの重み付けをした前記認識対象画像の特徴点との距離に基づいて、前記認識対象画像の特徴点と前記学習対象画像の特徴点とを対応付けることを特徴とする付記１３または１４に記載の画像処理装置。 (Supplementary note 15) The image processing apparatus according to supplementary note 13 or 14, wherein the classification unit further registers the distribution spread of the feature points included in each feature point group of the registration table, and the association unit associates the feature points of the recognition target image with the feature points of the learning target image based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image weighted by the distribution spread.

（付記１６）前記分布広がりは、標準偏差または分散であることを特徴とする付記１５に記載の画像処理装置。 (Supplementary note 16) The image processing apparatus according to supplementary note 15, wherein the distribution spread is a standard deviation or a variance.

（付記１７）前記分類部は、前記学習対象画像の一つの特徴点に対して、複数の特徴点群に分類することを特徴とする付記１３〜１６のいずれか一つに記載の画像処理装置。 (Supplementary note 17) The image processing apparatus according to any one of supplementary notes 13 to 16, wherein the classification unit classifies into a plurality of feature point groups for one feature point of the learning target image.

（付記１８）前記分類部は、前記学習対象画像の複数の特徴点に対して、一つの特徴点群に分類することを特徴とする付記１３〜１６のいずれか一つに記載の画像処理装置。 (Supplementary note 18) The image processing apparatus according to any one of supplementary notes 13 to 16, wherein the classification unit classifies into one feature point group for a plurality of feature points of the learning target image.

DESCRIPTION OF SYMBOLS
１００画像処理装置 100 Image processing apparatus
１１０ａカメラ 110a Camera
１１０ｂ入力部 110b Input unit
１２０出力部 120 Output unit
１３０記憶部 130 Storage unit
１４０制御部 140 Control unit

Claims

1. An image processing program that causes a computer to execute processes comprising:
generating a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
extracting feature points from the learning target image and the plurality of viewpoint variation images;
classifying the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image;
registering regions containing the classified feature point groups in a registration table;
acquiring a recognition target image and extracting feature points from the recognition target image; and
associating the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.

2. The image processing program according to claim 1, wherein the process of associating the feature points of the recognition target image with the feature points of the learning target image performs the association based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image.

3. The image processing program according to claim 1 or 2, wherein the process of registering in the registration table further registers the distribution spread of the feature points included in each feature point group, and the process of associating the feature points of the recognition target image with the feature points of the learning target image performs the association based on the distance between the centroid of the feature point group included in the region and the feature point of the recognition target image weighted by the distribution spread.

4. The image processing program according to claim 3, wherein the distribution spread is a standard deviation or a variance.

5. The image processing program according to claim 1, wherein the process of classifying into feature point groups classifies into a plurality of feature point groups for one feature point of the learning target image.

6. The image processing program according to any one of claims 1 to 4, wherein the process of classifying into feature point groups classifies into one feature point group for a plurality of feature points of the learning target image.

7. An image processing method executed by a computer, the method comprising:
generating a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
extracting feature points from the learning target image and the plurality of viewpoint variation images;
classifying the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image;
registering regions containing the classified feature point groups in a registration table;
acquiring a recognition target image and extracting feature points from the recognition target image; and
associating the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.

8. An image processing apparatus comprising:
a viewpoint variation image generation unit that generates a plurality of viewpoint variation images, each having a camera viewpoint different from that of a learning target image, by projectively transforming the learning target image, which is obtained by capturing an image from a frontal camera viewpoint;
a feature point extraction unit that extracts feature points from the learning target image and the plurality of viewpoint variation images;
a classification unit that classifies the feature points extracted from the plurality of viewpoint variation images into a plurality of feature point groups by associating them with the respective feature points extracted from the learning target image, and registers regions containing the classified feature point groups in a registration table; and
an association unit that acquires a recognition target image, extracts feature points from the recognition target image, and associates the feature points of the recognition target image with the feature points of the learning target image based on the relationship between the positions of the feature points extracted from the recognition target image and the regions of the feature point groups in the registration table.