JP2007304900A - Object recognition device and object recognition program

Info

Publication number: JP2007304900A
Application number: JP2006133177A
Authority: JP
Inventors: Kaori Kataoka; 香織片岡; Shingo Ando; 慎吾安藤; Yoshinori Kusachi; 良規草地; Akira Suzuki; 章鈴木; Kenichi Ichikawa; 研一市河; Takayuki Yasuno; 貴之安野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2006-05-12
Filing date: 2006-05-12
Publication date: 2007-11-22

Abstract

PROBLEM TO BE SOLVED: To set an index for how far the resolution of a learning image may be reduced, and to determine a resolution that is appropriate for the learning images.

SOLUTION: The object recognition device recognizes an object by using: a device 201 that acquires a learning standard pattern from an image acquisition means, deforms it, reduces the resolution of the deformed pattern to generate learning patterns, and extracts feature quantities from the learning patterns; a subspace generation device 202 that generates a subspace for each category from the extracted feature quantities; a device 203 that determines, from the correlation between the generated category subspaces and a specific threshold, whether the resolution of the learning patterns is appropriate; and a device 204 that recognizes the category to which a learning pattern belongs, based on the subspaces created from learning patterns whose resolution was judged appropriate.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a device or a program for obtaining an appropriate resolution for learning images in image recognition.

A current object identification (also called object recognition) device based on the subspace method consists of, for example, the feature extraction device 101, the subspace generation device 102, and the identification device 103 shown in FIG. 7. In the known technique, the feature extraction device 101 extracts feature quantities from a large number of learning images, the subspace generation device 102 generates a subspace for each category from those feature quantities, and the identification device 103 calculates the similarity between an unknown input image and the subspace of each category and identifies the category to which the image belongs from the calculated similarities (see, for example, Patent Document 1).

In general, building an identification device that is robust to variations in the size and tilt of objects in an image requires a large number of learning images for feature extraction, which strains memory capacity and processing speed. To save memory and improve processing speed, methods that reduce the resolution of the learning images are known (see, for example, Non-Patent Document 1).

Patent Document 1: JP-A-11-175724 (paragraphs [0009] to [0010], etc.)
Non-Patent Document 1: Takashi Matsuyama, Yoshinori Kuno, Atsushi Imiya, "Computer Vision: Technical Review and Future Prospects", New Technology Communications, 1999, p. 127.
Non-Patent Document 2: Image Processing Standard Textbook Editorial Committee, "Image Processing Standard Textbook: Image Processing", Image Information Education Promotion Association, May 1997, p. 162.
Non-Patent Document 3: Takashi Matsuyama, Yoshinori Kuno, Atsushi Imiya, "Computer Vision: Technical Review and Future Prospects", New Technology Communications, 1999, p. 208.
Non-Patent Document 4: Kazuhiro Fukui, Osamu Yamaguchi, "Constrained Mutual Subspace Method Based on Generalized Difference Subspace", IEICE Transactions D-II, The Institute of Electronics, Information and Communication Engineers, Information and Systems Society, August 2004, Vol. J87-D-II, No. 8, p. 1623.

In such methods, if the resolution of the learning images is reduced too far, the subspaces of the categories overlap and the identification rate falls.

In coarse-to-fine search, moreover, when subspaces created from low-resolution learning images are used to narrow down the candidate categories, an error in the coarse search carries over into the subsequent recognition using high-resolution images.

Thus, although the recognition rate must not be lowered even when low-resolution learning images are used for identification, current technology offers no index for how far the resolution of the learning images may be reduced.

The present invention has been made in view of the above problem. Its object is to provide an object recognition device and an object recognition program that set an index for the degree of resolution reduction of learning images (that is, learning patterns) and determine a resolution that is appropriate for use as learning images.

To solve the above problem, the invention according to claim 1 is an object recognition device that extracts feature quantities from learning patterns and recognizes an object based on those feature quantities, comprising: a low-resolution feature extraction device that acquires a learning standard pattern from an image acquisition means, deforms the learning standard pattern, reduces the resolution of the deformed pattern to generate learning patterns, and extracts feature quantities from the learning patterns; a subspace generation device that generates a subspace for each category based on the extracted feature quantities; a resolution determination device that determines, based on a specific threshold, whether the resolution of the learning patterns is appropriate from the correlation between the generated category subspaces; and a recognition device that recognizes the category to which a learning pattern belongs, based on the subspaces created from learning patterns having a resolution determined to be appropriate.

In the invention according to claim 2, which depends on claim 1, the resolution determination device sets the specific threshold based on the recognition rate for the learning patterns and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 3, which depends on claim 1, the resolution determination device regards the minimum angle formed by the category subspaces as the specific threshold and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 4, which depends on claim 1, the resolution determination device sets the specific threshold based on the structural similarity of the category subspaces and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 5, which depends on any one of claims 1 to 4, the recognition device first narrows the category to which the learning pattern belongs down to a plurality of candidates, based on the subspaces created from learning patterns having the resolution determined to be appropriate, and then selects, from those candidates, the category to which a high-resolution learning pattern belongs, based on subspaces generated from high-resolution learning patterns.

The invention according to claim 6 is an object recognition program that causes a computer to extract feature quantities from learning patterns and recognize an object based on those feature quantities, comprising: a low-resolution feature extraction step of acquiring a learning standard pattern from an image acquisition means, deforming the learning standard pattern, reducing the resolution of the deformed pattern to generate learning patterns, and extracting feature quantities from the learning patterns; a subspace generation step of generating a subspace for each category based on the extracted feature quantities; a resolution determination step of determining, based on a specific threshold, whether the resolution of the learning patterns is appropriate from the correlation between the generated category subspaces; and a recognition step of recognizing the category to which a learning pattern belongs, based on the subspaces created from learning patterns having a resolution determined to be appropriate.

In the invention according to claim 7, which depends on claim 6, the resolution determination step sets the specific threshold based on the recognition rate for the learning patterns and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 8, which depends on claim 6, the resolution determination step regards the minimum angle formed by the category subspaces as the specific threshold and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 9, which depends on claim 6, the resolution determination step sets the specific threshold based on the structural similarity of the category subspaces and determines, based on that threshold, whether the resolution of the learning patterns is appropriate.

In the invention according to claim 10, which depends on any one of claims 6 to 9, the recognition step first narrows the category to which the learning pattern belongs down to a plurality of candidates, based on the subspaces created from learning patterns having the resolution determined to be appropriate, and then selects, from those candidates, the category to which a high-resolution learning pattern belongs, based on subspaces generated from high-resolution learning patterns.

According to the inventions of claims 1 and 6, the resolution of the learning patterns can be lowered while recognition accuracy is maintained.

According to the inventions of claims 2 and 7, the specific threshold can be determined when the recognition rate for the learning patterns is known.

According to the inventions of claims 3 and 8, the specific threshold can be determined when the minimum angle formed by the category subspaces is known.

According to the inventions of claims 4 and 9, the specific threshold can be determined when the structural similarity of the category subspaces is known.

According to the inventions of claims 5 and 10, categories narrowed down on the basis of low-resolution learning patterns can be obtained.

As described above, the inventions of claims 1 and 6 reduce the memory consumption of the object recognition device and improve its processing speed. Stable recognition is possible even when the size or tilt of the object in the image varies widely.

According to the inventions of claims 2 and 7, an appropriate resolution for the learning patterns can be obtained based on the recognition rate for the learning patterns.

According to the inventions of claims 3 and 8, an appropriate resolution for the learning patterns can be obtained based on the minimum angle formed by the category subspaces.

According to the inventions of claims 4 and 9, an appropriate resolution for the learning patterns can be obtained based on the structural similarity of the category subspaces.

According to the inventions of claims 5 and 10, because the categories are first narrowed down using low-resolution learning patterns, recognition can be performed faster and more accurately than when the categories are narrowed down using high-resolution learning patterns from the start.

These effects contribute to the field of image recognition technology.

Embodiments of the present invention are described in detail below with reference to the drawings.

The configuration of the object recognition device of this embodiment is described with reference to FIG. 1. The device includes a resolution determination device that sets an index (for example, a threshold) for resolution reduction based on the distribution of the category subspaces and determines an appropriate resolution for the learning patterns.

The object recognition device consists of: a low-resolution feature extraction device 201 that extracts feature quantities from low-resolution learning patterns; a subspace generation device 202 that creates a subspace for each category based on the extracted feature quantities; a resolution determination device 203 that determines, from the correlation between the generated subspaces, whether the resolution of the learning patterns is appropriate (that is, whether the generated subspaces are separated); and a recognition device 204 that recognizes the category to which a learning pattern belongs, using the subspaces generated from learning patterns having a resolution determined to be appropriate.

In the following description, a low-resolution image is an image whose resolution is lower than that of the original image (that is, the high-resolution image). For example, a low-resolution learning pattern is a learning pattern (learning pattern image) whose resolution is lower than that of the original image.

The low-resolution feature extraction device 201 in turn consists of the learning pattern generation unit 301, the resolution reduction unit 302, and the feature extraction unit 303 shown in FIG. 2.

The learning pattern generation unit 301 acquires a learning standard pattern and deforms it to generate a large number of learning patterns.

FIG. 3 illustrates how these learning patterns are generated. In FIG. 3, the objects to be recognized (picture postcards) are photographed from the front with the image acquisition means 304 (for example, a digital camera built into a mobile phone), and the resulting image data are treated as the learning standard patterns 401 to 403. The image acquisition means 304 may be an imaging device (for example, a scanner or a digital camera) or an image data management means (for example, a database).

The learning pattern generation unit 301 then deforms the learning standard patterns to generate learning patterns. In the following description, the unit deforms the images by perspective projection transformation.

Because the shooting direction of input image data (that is, the learning standard pattern) cannot be specified, the learning standard pattern is deformed by perspective projection transformation (see, for example, Non-Patent Document 1) to create learning patterns that simulate capture from various directions.

For example, the learning standard pattern is rotated in each of the x-, y-, and z-axis directions up to a specific angle (for example, from -12° to 12°), and is further deformed to simulate cases in which the distance from the digital camera to the object is shorter or longer than when the standard pattern was captured. Instead of deforming the pattern artificially in this way, the object may be photographed from various directions and the captured images treated as learning patterns.
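
The deformation step can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: it assumes a pinhole camera and a planar pattern with pixel coordinates centered on the optical axis, and the names `pattern_homography`, `warp_points`, `focal`, `dist`, and `dist_scale` are hypothetical. It builds the 3x3 homography for a rotation about the x, y, and z axes plus a change in camera distance; resampling the actual pixels with that homography is left out.

```python
import itertools
import numpy as np

def rotation_matrix(ax_deg, ay_deg, az_deg):
    """Rotation about the x, then y, then z axis (angles in degrees)."""
    ax, ay, az = np.deg2rad([ax_deg, ay_deg, az_deg])
    rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax), np.cos(ax)]])
    ry = np.array([[np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    rz = np.array([[np.cos(az), -np.sin(az), 0],
                   [np.sin(az), np.cos(az), 0],
                   [0, 0, 1]])
    return rz @ ry @ rx

def pattern_homography(ax_deg, ay_deg, az_deg,
                       focal=500.0, dist=1000.0, dist_scale=1.0):
    """Homography mapping frontal-view pixel coordinates to the deformed view.

    focal and dist are assumed camera parameters (hypothetical values);
    dist_scale < 1 simulates a nearer camera, dist_scale > 1 a farther one.
    """
    r = rotation_matrix(ax_deg, ay_deg, az_deg)
    k = np.diag([focal, focal, 1.0])
    t = np.array([0.0, 0.0, dist * dist_scale])
    plane_to_image = k @ np.column_stack([r[:, 0], r[:, 1], t])
    # Undo the frontal projection (u = focal * X / dist) to get plane coordinates.
    image_to_plane = np.diag([dist / focal, dist / focal, 1.0])
    h = plane_to_image @ image_to_plane
    return h / h[2, 2]

def warp_points(h, points):
    """Apply homography h to an (n, 2) array of centered pixel coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])
    out = homog @ h.T
    return out[:, :2] / out[:, 2:3]

# One homography per combination of -12, 0, and +12 degrees on each axis.
angles = (-12.0, 0.0, 12.0)
homographies = [pattern_homography(ax, ay, az)
                for ax, ay, az in itertools.product(angles, repeat=3)]
```

With zero rotation and `dist_scale=1.0` the homography reduces to the identity, which is a convenient sanity check on the camera model.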

The resolution reduction unit 302 lowers the resolution of, and thereby shrinks, the learning patterns generated by the learning pattern generation unit 301. In the following description, a Gaussian filter (see, for example, Non-Patent Document 2) is applied to the learning patterns.

For example, FIG. 4 shows learning patterns generated by perspective projection transformation and reduced to 1/5 size. In FIG. 4, a Gaussian filter, one kind of weighted-average filter, is applied in reducing to 1/5. Any smoothing filter may be used for the reduction.
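
A minimal sketch of the reduction step, assuming a grayscale image held in a NumPy array: the pattern is smoothed with a separable Gaussian filter and then subsampled by the reduction factor. The function names and the choice `sigma = factor / 2` are assumptions made for illustration; the patent does not specify the filter parameters.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def reduce_resolution(image, factor=5, sigma=None):
    """Smooth a 2-D grayscale image, then keep every factor-th pixel."""
    image = np.asarray(image, dtype=float)
    if sigma is None:
        sigma = factor / 2.0  # assumed rule of thumb, not from the patent
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    # Separable filtering: along rows first, then along columns.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    smooth = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)
    return smooth[::factor, ::factor]
```

Smoothing before subsampling suppresses aliasing; replacing `gaussian_kernel` with another normalized smoothing kernel corresponds to the remark that any smoothing filter may be used.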

The learning standard pattern 401 is deformed as follows.

The learning pattern 501 is the learning standard pattern 401 rotated by -12° in the x-axis direction and reduced to 1/5.

The learning pattern 502 is the learning standard pattern 401 rotated by 12° in the x-axis direction and reduced to 1/5.

The learning pattern 503 is the learning standard pattern 401 rotated by -12° in the y-axis direction and reduced to 1/5.

The learning pattern 504 is the learning standard pattern 401 rotated by 12° in the y-axis direction and reduced to 1/5.

The learning pattern 505 is the learning standard pattern 401 rotated by -12° in the z-axis direction and reduced to 1/5.

The learning pattern 506 is the learning standard pattern 401 rotated by 12° in the z-axis direction and reduced to 1/5.

As described above, the learning pattern 501 was generated (that is, rotated) before its resolution was reduced. Alternatively, the standard pattern may be reduced in resolution first and the reduced standard pattern then deformed to generate the learning pattern.

The feature extraction unit 303 extracts feature quantities from the generated learning patterns. In the following description, the differential values of the pixel values in the x and y directions are treated as the feature quantity 601. The feature quantity is not limited to these differential values; any feature quantity may be used.

For example, in FIG. 5, feature quantities are extracted by differentiating the gray values (that is, pixel values) of a learning pattern (with N pixels in the x direction and N pixels in the y direction) in the x and y directions. Reference numeral 6011 denotes the differential value for coordinates (0, 1) - (0, 0) in the learning pattern. Likewise, 6012 denotes the differential value for (0, 2) - (0, 1); 6013, for (N, N) - (N, N-1); 6014, for (1, 0) - (0, 0); and 6015, for (N, N) - (N-1, N).

The feature extraction unit 303 then extracts a feature vector from every learning pattern generated by the resolution reduction unit 302.
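
As an illustration of this feature, the following sketch takes the differences between neighboring pixel values along each image axis and concatenates them into one feature vector. The function name `gradient_features` is hypothetical, and which array axis corresponds to the patent's x and y coordinates is an assumption.

```python
import numpy as np

def gradient_features(pattern):
    """Concatenate neighboring-pixel differences along both image axes."""
    p = np.asarray(pattern, dtype=float)
    dx = np.diff(p, axis=1)  # p[r, c + 1] - p[r, c]
    dy = np.diff(p, axis=0)  # p[r + 1, c] - p[r, c]
    return np.concatenate([dx.ravel(), dy.ravel()])
```

For an N x N pattern this yields a vector of length 2 N (N - 1).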

The subspace generation device 202 generates subspaces from the feature vectors extracted by the feature extraction unit 303. That is, a subspace is obtained for each category from the feature vectors of the learning patterns by, for example, KL (Karhunen-Loève) expansion (see Non-Patent Document 3).
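
A minimal sketch of per-category subspace generation, assuming the common subspace-method convention of normalizing each feature vector and taking the leading eigenvectors of the autocorrelation matrix without subtracting the mean. That convention, the function name `kl_subspace`, and the `dim` parameter are assumptions; the patent only names KL expansion.

```python
import numpy as np

def kl_subspace(features, dim):
    """Return an orthonormal basis (n_features x dim) for one category.

    features: (n_samples, n_features) array of that category's feature vectors.
    dim must not exceed min(n_samples, n_features).
    """
    x = np.asarray(features, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.where(norms == 0.0, 1.0, norms)  # leave all-zero vectors as they are
    # The right singular vectors of x are the eigenvectors of x.T @ x,
    # ordered by decreasing eigenvalue.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:dim].T
```

Calling `kl_subspace` once per category gives the set of bases that the later separation checks and the recognition step operate on.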

The resolution determination device 203 examines the relationships among the subspaces. For example, it may determine whether the resolution of the learning patterns is appropriate based on the recognition rate obtained when the learning patterns are used as input data.

FIG. 6 shows an example of the relationship between the recognition rate for the learning patterns and their resolution (the graph indicated by reference numeral 701). In FIG. 6, the recognition rate drops sharply when the learning pattern resolution reaches the resolution indicated by reference numeral 702. This drop can be taken to mean that the subspaces overlap and the space is no longer suited to discrimination (that is, recognition). Accordingly, learning patterns whose resolution is at least the resolution 702 are used for the determination.
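
One way to turn the curve of FIG. 6 into a decision rule is sketched below. It assumes that recognition rates have already been measured at several resolutions and treats a fall of more than `max_drop` below the rate at the highest resolution as the sharp drop. Both the function name and this drop criterion are assumptions, since the patent does not state how the drop is detected.

```python
def lowest_usable_resolution(rates, max_drop=0.05):
    """Pick the lowest resolution reached before the recognition rate collapses.

    rates: dict mapping resolution (e.g. pixels per side) to recognition rate.
    """
    ordered = sorted(rates, reverse=True)  # from highest to lowest resolution
    reference = rates[ordered[0]]
    chosen = ordered[0]
    for resolution in ordered[1:]:
        if reference - rates[resolution] > max_drop:
            break  # sharp drop: the subspaces are presumably overlapping
        chosen = resolution
    return chosen
```

For example, with rates of 0.98, 0.97, 0.96, 0.70, and 0.40 at resolutions 64, 32, 16, 8, and 4, the rule selects resolution 16.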

Alternatively, the canonical angle, which is the minimum angle formed by the subspaces (see Non-Patent Document 4), may be used as the threshold criterion for determining whether the resolution of the learning patterns is appropriate. Note that a canonical-angle value closer to 0 indicates a larger angle between the subspaces.

For example, with the threshold set to 0.3, if the minimum angle formed by the subspaces is at or below the threshold, the subspaces are regarded as well separated and recognition is judged possible at that resolution.
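
The check can be sketched as follows. For two subspaces with orthonormal bases A and B, the singular values of A^T B are the cosines of the canonical angles, so the largest singular value corresponds to the smallest angle. The sketch assumes that the value compared with the threshold is the squared cosine of the smallest canonical angle, which is one reading consistent with the note that values near 0 mean a large angle; that reading and the function names are assumptions.

```python
import numpy as np

def canonical_cosines(basis_a, basis_b):
    """Cosines of the canonical angles between two orthonormal bases (columns)."""
    s = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # sorted in descending order

def max_pairwise_similarity(bases):
    """Largest cos^2 of the smallest canonical angle over all category pairs."""
    worst = 0.0
    for i in range(len(bases)):
        for j in range(i + 1, len(bases)):
            worst = max(worst, canonical_cosines(bases[i], bases[j])[0] ** 2)
    return float(worst)

def is_separated(bases, threshold=0.3):
    """True if every pair of category subspaces stays at or below the threshold."""
    return max_pairwise_similarity(bases) <= threshold
```

Taking the maximum over all pairs means that a single pair of overlapping categories is enough to reject a resolution.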

The structural similarity of the subspaces (see Non-Patent Document 4) may also be used as the threshold criterion for determining whether the resolution of the learning patterns is appropriate. The closer this value is to 0, the lower the correlation between the subspaces.

For example, with the threshold set to 0.3, if the structural similarity of the subspaces is at or below the threshold, the subspaces are regarded as well separated and recognition is judged possible at that resolution.
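
A sketch of one possible structural similarity, assuming it is defined as the mean squared cosine of all canonical angles between two subspaces, a definition commonly used with the mutual subspace method. The patent does not spell out the formula, so both the definition and the function name are assumptions.

```python
import numpy as np

def structural_similarity(basis_a, basis_b):
    """Mean cos^2 of the canonical angles between two orthonormal bases (columns)."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(np.clip(cosines, 0.0, 1.0) ** 2))
```

The value is 1 for identical subspaces and 0 for mutually orthogonal ones, so a threshold such as 0.3 accepts only weakly correlated category subspaces.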

In this way, the subspaces generated from learning patterns whose resolution satisfies the threshold condition are selected.

The recognition device 204 projects the feature vector extracted by the feature extraction unit 303 onto the subspaces created from learning patterns of the resolution determined by the resolution determination device 203, and determines the category to which the learning pattern belongs based on the projection distance.
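
The projection-based decision can be sketched as follows, assuming each category is represented by an orthonormal basis and that the score is the squared length of the normalized feature vector's projection onto the subspace (equivalently, one minus the squared distance to the subspace). The function names are hypothetical.

```python
import numpy as np

def projection_scores(feature, bases):
    """Squared projection length of a unit-normalized feature onto each subspace.

    bases: dict mapping category label to an (n_features x dim) orthonormal basis.
    """
    f = np.asarray(feature, dtype=float)
    f = f / np.linalg.norm(f)
    return {label: float(np.sum((basis.T @ f) ** 2))
            for label, basis in bases.items()}

def classify(feature, bases):
    """Return the category whose subspace captures the most of the feature."""
    scores = projection_scores(feature, bases)
    return max(scores, key=scores.get)
```

Because the feature is normalized, each score lies between 0 and 1, and the highest score corresponds to the smallest projection distance.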

For example, recognition with low-resolution learning patterns need not determine a single category. It may instead determine several candidate categories, after which recognition with high-resolution learning patterns determines the single category to which the pattern belongs.
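
This two-stage recognition can be sketched as below, assuming the same projection score is used at both stages and that the number of candidates kept after the coarse stage is a free parameter. The names `coarse_to_fine` and `n_candidates` are hypothetical.

```python
import numpy as np

def projection_score(feature, basis):
    """Squared projection length of a unit-normalized feature onto a subspace."""
    f = np.asarray(feature, dtype=float)
    f = f / np.linalg.norm(f)
    return float(np.sum((np.asarray(basis, dtype=float).T @ f) ** 2))

def coarse_to_fine(low_feature, high_feature, low_bases, high_bases,
                   n_candidates=2):
    """Shortlist categories at low resolution, then decide at high resolution."""
    coarse = {label: projection_score(low_feature, basis)
              for label, basis in low_bases.items()}
    candidates = sorted(coarse, key=coarse.get, reverse=True)[:n_candidates]
    # Only the shortlisted categories are evaluated with the costlier
    # high-resolution subspaces.
    return max(candidates,
               key=lambda label: projection_score(high_feature, high_bases[label]))
```

The sketch also shows why the coarse stage must be reliable: a category dropped from the shortlist cannot be recovered by the high-resolution stage.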

The object recognition device can be realized by describing the above method as a computer program and having a computer execute it.

Furthermore, the computer program describing the method may be implemented so as to access a memory, an external storage device, or the like that stores the input and output data the method requires.

As described above, introducing the resolution determination device of this embodiment makes it possible to determine the resolution of the learning patterns efficiently without lowering accuracy.

Although the present invention has been described in detail only with respect to the specific examples given, it will be apparent to those skilled in the art that various changes and modifications are possible within the scope of the technical idea of the invention, and such changes and modifications naturally fall within the scope of the claims.

For example, as a modification of this embodiment, the object recognition device may include threshold setting means (for example, a keyboard, a pointing device, or a display device) for setting the threshold (for example, the threshold based on the minimum angle formed by the subspaces). With this threshold setting means, the threshold can be set or changed as needed even while the device is running.

FIG. 1 is a block diagram of the object recognition device in this embodiment.
FIG. 2 is a block diagram of the low-resolution feature quantity extraction device in this embodiment.
FIG. 3 is a diagram showing an example of the learning standard patterns in this embodiment.
FIG. 4 is a diagram showing an example of the learning patterns in this embodiment.
FIG. 5 is a diagram showing an example of the feature quantities in this embodiment.
FIG. 6 is a graph showing the relationship between the resolution of the learning patterns and the recognition rate in this embodiment.
FIG. 7 is a block diagram of a general object recognition device.

Explanation of symbols

101 … Feature quantity extraction device
102 … Subspace generation device
103 … Identification device
201 … Low-resolution feature quantity extraction device
202 … Subspace generation device
203 … Resolution determination device
204 … Recognition device
301 … Learning pattern generation unit
302 … Resolution reduction unit
303 … Feature quantity extraction unit
304 … Image acquisition means
401 to 403 … Learning standard patterns
501 to 506 … Learning patterns
601, 6011 to 6015 … Feature quantities
701 … Graph showing the relationship between the recognition rate for the learning patterns and the resolution of the learning patterns
702 … Learning pattern resolution at which the recognition rate drops sharply

Claims

1. An object recognition device that extracts feature quantities from learning patterns and recognizes an object based on the feature quantities, the device comprising:
a low-resolution feature quantity extraction device that acquires a learning standard pattern from an image acquisition means, deforms the learning standard pattern, generates a learning pattern by reducing the resolution of the deformed learning standard pattern, and extracts a feature quantity from the learning pattern;
a subspace generation device that generates a subspace for each category based on the extracted feature quantities;
a resolution determination device that determines whether the resolution of the learning pattern is appropriate, based on the correlation between the generated subspaces of the categories; and
a recognition device that recognizes the category to which the learning pattern belongs, based on the subspaces created from learning patterns having a resolution determined to be appropriate.
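The chain of elements recited in this claim (resolution reduction, feature extraction, per-category subspace generation, and recognition by similarity to each subspace) can be illustrated with a minimal sketch. It is not the claimed implementation. Block averaging for resolution reduction, flattened unit-length pixel vectors as features, cyclic shifts as the deformation, SVD for the subspace basis, and every function name are assumptions made for this illustration.

```python
import numpy as np

def reduce_resolution(image, factor):
    """Lower the resolution by averaging non-overlapping factor x factor blocks."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor  # crop so the blocks tile exactly
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def extract_feature(image):
    """Assumed feature: flattened pixel values scaled to unit length."""
    v = np.asarray(image, dtype=float).ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def deform(standard, shifts):
    """Assumed deformation: cyclic shifts of the standard pattern by (dy, dx)."""
    return [np.roll(np.roll(standard, dy, axis=0), dx, axis=1) for dy, dx in shifts]

def build_subspace(features, dim):
    """Orthonormal basis (as columns) for the dominant directions of one category."""
    X = np.stack(features, axis=1)  # shape: (feature_length, n_samples)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :dim]

def similarity(feature, basis):
    """Squared length of the projection of a unit feature onto the subspace."""
    return float(np.sum((basis.T @ feature) ** 2))

def recognize(feature, subspaces):
    """Return the category whose subspace gives the highest similarity."""
    return max(subspaces, key=lambda c: similarity(feature, subspaces[c]))

# Toy standard patterns (hypothetical): one vertical and one horizontal bar.
vertical = np.zeros((8, 8))
vertical[:, 2:4] = 1.0
horizontal = np.zeros((8, 8))
horizontal[2:4, :] = 1.0
shifts = [(0, 0), (0, 2), (2, 0), (2, 2)]

subspaces = {}
for name, standard in {"vertical": vertical, "horizontal": horizontal}.items():
    features = [extract_feature(reduce_resolution(p, 2)) for p in deform(standard, shifts)]
    subspaces[name] = build_subspace(features, dim=2)

query = extract_feature(reduce_resolution(np.roll(vertical, 2, axis=1), 2))
label = recognize(query, subspaces)
```

In this toy setup the shifted vertical bar lies inside the subspace built for the vertical category, so `label` comes out as `"vertical"`. The resolution determination element is omitted here because the claim leaves the concrete correlation measure to the dependent claims.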

2. The object recognition device according to claim 1, wherein the resolution determination device sets a specific threshold based on the recognition rate for the learning patterns and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.
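One possible reading of a threshold set from the recognition rate is sketched below: scan the measurements from high to low resolution, stop where the recognition rate falls by more than a tolerance, and take the subspace correlation value at the last acceptable resolution as the threshold. The function name, the `max_drop` tolerance, the use of a minimum angle as the correlation value, and all numbers are hypothetical. None of them come from the patent.

```python
def threshold_from_recognition_rate(measurements, max_drop=0.02):
    """Return (resolution, min_angle_deg) at the lowest resolution before accuracy drops.

    measurements: list of (resolution, recognition_rate, min_angle_deg) tuples.
    max_drop: assumed tolerance on the loss of recognition rate relative to
              the rate measured at the highest resolution.
    """
    ordered = sorted(measurements, key=lambda m: m[0], reverse=True)
    reference_rate = ordered[0][1]  # rate at the highest resolution
    chosen = ordered[0]
    for m in ordered[1:]:
        if reference_rate - m[1] > max_drop:
            break  # recognition rate has dropped sharply; stop lowering
        chosen = m
    return chosen[0], chosen[2]

# Made-up measurements: (pixels per side, recognition rate, minimum angle in degrees).
measurements = [
    (32, 0.98, 41.0),
    (16, 0.98, 38.5),
    (8, 0.97, 30.0),
    (4, 0.80, 12.0),
    (2, 0.55, 4.0),
]
resolution, angle_threshold = threshold_from_recognition_rate(measurements)
```

With these invented numbers the scan stops between 8 and 4 pixels per side, so the sketch returns a resolution of 8 and the angle recorded there as the threshold.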

3. The object recognition device according to claim 1, wherein the resolution determination device treats the minimum angle formed by the subspaces of the categories as a specific threshold and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.
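The angle between two subspaces is commonly measured by principal angles: for orthonormal bases A and B, the cosine of the smallest principal angle is the largest singular value of AᵀB. The sketch below uses that standard definition to find the minimum angle over all category pairs. The decision rule (a resolution is acceptable while the minimum angle stays at or above the threshold) and all names are assumptions made for illustration, not details taken from the patent.

```python
import numpy as np
from itertools import combinations

def smallest_principal_angle(A, B):
    """Smallest principal angle in radians between span(A) and span(B).

    A and B must have orthonormal columns.
    """
    singular_values = np.linalg.svd(A.T @ B, compute_uv=False)
    # Singular values are sorted in descending order; clip guards against round-off.
    return float(np.arccos(np.clip(singular_values[0], -1.0, 1.0)))

def min_angle_over_categories(subspaces):
    """Minimum of the smallest principal angles over all pairs of categories."""
    return min(
        smallest_principal_angle(subspaces[a], subspaces[b])
        for a, b in combinations(sorted(subspaces), 2)
    )

def resolution_is_appropriate(subspaces, threshold_rad):
    """Assumed rule: categories stay separable while the minimum angle >= threshold."""
    return min_angle_over_categories(subspaces) >= threshold_rad

# Toy one-dimensional subspaces in R^3 (hypothetical categories).
r = 1.0 / np.sqrt(2.0)
subspaces = {
    "a": np.array([[1.0], [0.0], [0.0]]),
    "b": np.array([[r], [r], [0.0]]),
    "c": np.array([[0.0], [0.0], [1.0]]),
}
min_angle = min_angle_over_categories(subspaces)
```

Here categories "a" and "b" are 45 degrees apart and both are orthogonal to "c", so the minimum angle is π/4. A threshold of π/6 would accept this resolution and a threshold of π/3 would reject it.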

4. The object recognition device according to claim 1, wherein the resolution determination device sets a specific threshold based on the structural similarity of the subspaces of the categories and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.

5. The object recognition device according to claim 1, wherein the recognition device
narrows the candidates down to a plurality of categories to which the learning pattern may belong, based on the subspaces created from learning patterns having the resolution determined to be appropriate, and then
selects, from among the narrowed-down categories, the category to which the high-resolution learning pattern belongs, based on subspaces generated from learning patterns having a high resolution.
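The two-stage recognition in this claim can be sketched as a coarse-to-fine search: rank the categories with the low-resolution subspaces, keep the best few, and make the final choice among them with the high-resolution subspaces. The similarity measure (squared projection length), the `keep` parameter, and the toy vectors are assumptions for illustration only.

```python
import numpy as np

def similarity(feature, basis):
    """Squared projection length of the normalized feature onto the subspace."""
    f = feature / np.linalg.norm(feature)
    return float(np.sum((basis.T @ f) ** 2))

def narrow_candidates(low_feature, low_subspaces, keep):
    """Coarse stage: keep the `keep` most similar categories at low resolution."""
    ranked = sorted(
        low_subspaces,
        key=lambda c: similarity(low_feature, low_subspaces[c]),
        reverse=True,
    )
    return ranked[:keep]

def recognize_two_stage(low_feature, high_feature, low_subspaces, high_subspaces, keep=2):
    """Fine stage: choose among the coarse candidates using high-resolution subspaces."""
    candidates = narrow_candidates(low_feature, low_subspaces, keep)
    return max(candidates, key=lambda c: similarity(high_feature, high_subspaces[c]))

def unit(*values):
    """Helper: one-dimensional subspace basis as a unit-length column vector."""
    column = np.array(values, dtype=float).reshape(-1, 1)
    return column / np.linalg.norm(column)

# Toy subspaces (hypothetical): "A" and "B" are nearly identical at low resolution
# but clearly different at high resolution.
low_subspaces = {"A": unit(1, 0), "B": unit(1, 0.1), "C": unit(0, 1)}
high_subspaces = {"A": unit(1, 0, 0), "B": unit(0, 1, 0), "C": unit(0, 0, 1)}

low_query = np.array([1.0, 0.05])
high_query = np.array([0.2, 1.0, 0.0])

candidates = narrow_candidates(low_query, low_subspaces, keep=2)
label = recognize_two_stage(low_query, high_query, low_subspaces, high_subspaces, keep=2)
```

In this example the coarse stage cannot reliably separate "A" from "B" but does rule out "C"; the fine stage then selects "B". The high-resolution comparison is only evaluated for the surviving candidates, which is the cost saving this kind of narrowing aims for.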

6. An object recognition program that extracts feature quantities from learning patterns and recognizes an object based on the feature quantities, the program causing a computer to execute:
a low-resolution feature quantity extraction step of acquiring a learning standard pattern from an image acquisition means, deforming the learning standard pattern, generating a learning pattern by reducing the resolution of the deformed learning standard pattern, and extracting a feature quantity from the learning pattern;
a subspace generation step of generating a subspace for each category based on the extracted feature quantities;
a resolution determination step of determining whether the resolution of the learning pattern is appropriate, based on the correlation between the generated subspaces of the categories; and
a recognition step of recognizing the category to which the learning pattern belongs, based on the subspaces created from learning patterns having a resolution determined to be appropriate.

7. The object recognition program according to claim 6, wherein the resolution determination step sets a specific threshold based on the recognition rate for the learning patterns and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.

8. The object recognition program according to claim 6, wherein the resolution determination step treats the minimum angle formed by the subspaces of the categories as a specific threshold and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.

9. The object recognition program according to claim 6, wherein the resolution determination step sets a specific threshold based on the structural similarity of the subspaces of the categories and determines, based on the threshold, whether the resolution of the learning pattern is appropriate.

10. The object recognition program according to any one of claims 6 to 9, wherein the recognition step
narrows the candidates down to a plurality of categories to which the learning pattern may belong, based on the subspaces created from learning patterns having the resolution determined to be appropriate, and then
selects, from among the narrowed-down categories, the category to which the high-resolution learning pattern belongs, based on subspaces generated from learning patterns having a high resolution.