JP2014199533A - Image identification device, image identification method, and program

Info

Publication number: JP2014199533A
Application number: JP2013074280A
Authority: JP
Inventors: 一郎梅田; Ichiro Umeda
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2013-03-29
Filing date: 2013-03-29
Publication date: 2014-10-23

Abstract

PROBLEM TO BE SOLVED: To identify an image at high speed while keeping the required storage capacity small.
SOLUTION: A learning unit 201 reduces the dictionary size by storing in the dictionary a lookup table for each selected feature category and, for each non-selected feature category, its identification contribution as a correction value. A selected category feature vector acquisition unit 222 does not acquire feature vectors for feature categories that the learning unit 201 did not select, which reduces the memory reads and writes that scale with the resolution of the input image.

Description

The present invention relates in particular to an image identification device, an image identification method, and a program suitable for identifying an object based on a dictionary.

A common approach to identifying whether an image belongs to a given object class is to acquire feature vectors from images, create a dictionary by machine learning, and then identify, based on the dictionary, whether a given image shows that object.

Non-Patent Document 1 discloses a technique for identifying whether an image belongs to a given object class that is accelerated by using the Histogram Intersection Kernel and creating a lookup table. Non-Patent Document 2 discloses kernel functions for which a lookup table can be created, as in the method of Non-Patent Document 1. Further, the Support Vector Data Description described in Non-Patent Document 3 is a learning machine to which the Histogram Intersection Kernel and the lookup table creation of Non-Patent Document 1 can be applied.

Further, to identify which of a plurality of object classes an image belongs to, a plurality of feature categories are prepared to describe the characteristics of the individual object classes. Non-Patent Documents 4 to 6 disclose the HOG feature, the LBP feature, and the Bag of Visual Words feature, respectively, which are representative feature categories.

Further, Patent Document 1 discloses a method in which feature vectors are acquired from a plurality of feature categories and, in obtaining a classifier by machine learning, a plurality of kernel functions are prepared and an optimum weight is determined for each kernel function. According to Patent Document 1, when object classes differ in their characteristics and therefore in which feature categories are effective for identification, the identification rate is improved by learning appropriate weights according to the effectiveness of each kernel.

Patent Document 1: US Patent Application Publication No. 2009/0099986

Non-Patent Document 1: Subhransu Maji, Alexander C. Berg, Jitendra Malik, "Classification using Intersection Kernel Support Vector Machines is Efficient", Computer Vision and Pattern Recognition, 2008
Non-Patent Document 2: Andrea Vedaldi, Andrew Zisserman, "Efficient Additive Kernels via Explicit Feature Maps", IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011
Non-Patent Document 3: D. M. J. Tax and R. P. W. Duin, "Support vector data description", Machine Learning, 2002
Non-Patent Document 4: Dalal, N., Triggs, B., "Histograms of oriented gradients for human detection", Computer Vision and Pattern Recognition, 2005
Non-Patent Document 5: Timo Ojala, Matti Pietikainen, "Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns", IEEE Transactions on Pattern Analysis and Machine Intelligence, July 2002
Non-Patent Document 6: L. Fei-Fei and P. Perona (2005), "A Bayesian Hierarchical Model for Learning Natural Scene Categories", Proc. of IEEE Computer Vision and Pattern Recognition, pp. 524-531

In general, image identification should be fast and should need little storage. When a dictionary for identification using a plurality of different feature categories is created, known methods require storage capacity for the dictionary and computation time in proportion to the number of feature categories, regardless of how effective each category is. For example, the method of Patent Document 1 computes feature vectors for all feature categories used, so both the feature vector computation time and the storage capacity for the dictionary grow with the number of feature categories.

In view of the above problems, an object of the present invention is to enable images to be identified at high speed and with a small storage capacity.

The image identification device of the present invention comprises: first acquisition means for acquiring feature vectors indicating features of learning sample images; calculation means for obtaining an identification contribution for each feature category from the feature vectors of the learning sample images acquired by the first acquisition means; selection means for selecting feature categories whose identification contribution obtained by the calculation means is equal to or greater than a predetermined value; dictionary creation means for creating a dictionary containing information on the feature categories selected by the selection means; second acquisition means for acquiring a feature vector indicating features of an input image; and determination means for determining, based on the feature vector of the input image acquired by the second acquisition means and the dictionary created by the dictionary creation means, whether the object in the input image is similar to the object in the learning sample images, wherein the second acquisition means does not acquire from the input image feature vectors corresponding to feature categories not selected by the selection means.

According to the present invention, the process of acquiring feature vectors and identifying an input image is made faster, and the image can be identified with a reduced dictionary storage capacity.

FIG. 1 is a block diagram showing a hardware configuration example of the image identification device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a functional configuration example of the image identification device according to the embodiment of the present invention.
FIG. 3 is a diagram for explaining a method of acquiring and concatenating feature vectors.
FIG. 4 is a flowchart showing an example of the processing procedure in the learning unit in the embodiment.
FIG. 5 is a diagram showing the information group referred to in the process of creating dictionary A from object A.
FIG. 6 is a diagram showing the information group referred to in the process of creating dictionary B from object B, which differs from object A.
FIG. 7 is a diagram for explaining the outline of the lookup table that speeds up the computation of the classifier.
FIG. 8 is a flowchart showing an example of the processing procedure in the identification unit in the embodiment.
FIG. 9 is a diagram showing an example of the feature category selection result, the correction values of the non-selected feature categories, and the lookup table for the selected feature categories, for dictionary A corresponding to object A.
FIG. 10 is a flowchart showing an example of the detailed processing procedure for creating feature vectors for the selected feature categories in step S801 of FIG. 8.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(First embodiment)
FIG. 1 is a block diagram illustrating a hardware configuration example of an image identification device 100 according to the present embodiment.
In FIG. 1, an image sensor 101 is composed of a CCD, a MOS sensor, or the like, and is an imaging means that converts a subject image from light into an electrical signal. A signal processing circuit 102 processes the time-series signal of the subject image obtained from the image sensor 101 and converts it into a digital signal.

The CPU 103 controls the entire image identification device 100 by executing a control program stored in the ROM 104. The ROM 104 stores the control program executed by the CPU 103 and various parameter data. When executed by the CPU 103, the control program causes the device to function as the various means that perform the processes shown in the flowcharts described later. The RAM 105 stores images and various information, and also serves as a work area for the CPU 103 and as a temporary save area for data. The display 106 is, for example, an LCD or a CRT.

In the present embodiment, the processing corresponding to each step of the flowcharts described later is implemented in software using the CPU 103, but part or all of it may be implemented in hardware such as electronic circuits. The image identification device of the present embodiment may be implemented on a general-purpose personal computer (PC) without the image sensor 101 and the signal processing circuit 102, or as a dedicated image identification device. Software (a program) acquired via a network or various storage media may also be executed by a processing device (CPU, processor) such as a PC.

FIG. 2 is a block diagram illustrating a functional configuration example of the image identification device 100 according to the present embodiment.
The image identification device 100 of the present embodiment is given in advance a plurality of learning sample images showing the same object class. The image identification device 100 includes a learning unit 201 that creates a classifier dictionary from the learning sample images, and an identification unit 202 that identifies, based on the dictionary, whether an image belongs to that object class.

In the image identification device 100 of the present embodiment, the learning unit 201 reduces the size of the dictionary, and the identification unit 202 speeds up the identification of the object class. The learning unit 201 and the identification unit 202 are assumed to run on the hardware configuration shown in FIG. 1, but they may run in different environments. For example, the learning unit 201 may run on a PC reached over a network, and the device shown in FIG. 1 may acquire the resulting dictionary and run the identification unit 202.

The learning unit 201 creates, from the learning sample images, a dictionary that stores the information needed to identify the object class. The learning unit 201 includes a feature vector acquisition unit 211, a first feature vector normalization unit 212, a classifier learning unit 213, a lookup table creation unit 214, and an identification contribution degree calculation unit 215. It further includes a feature category selection unit 216, a feature category correction value calculation unit 217, and a dictionary creation unit 218. In the present embodiment, the identification contribution of each feature category is obtained from the result of learning on the acquired feature vectors, and only the feature categories that contribute to identification are stored in the dictionary, which reduces the dictionary size.

The feature vector acquisition unit (first acquisition unit) 211 acquires a feature vector from each learning sample image. A feature vector is a histogram feature that represents image characteristics, such as patterns of light and dark or of edges, as a vector. It is generated by concatenating features obtained from different feature categories, so each dimension belongs to exactly one feature category.

FIG. 3 is a diagram for explaining a method of acquiring and concatenating feature vectors. First, feature vectors corresponding to all feature categories are obtained for a learning sample image. Next, these feature vectors are concatenated. In the present embodiment, the feature categories are the HOG feature, the LBP feature, and the Bag of Visual Words (BoVW) feature described in Non-Patent Documents 4 to 6, respectively, but the feature categories are not limited to these. For example, feature vectors may be acquired from the individual cells obtained by dividing an image into a grid, with the X and Y coordinates of each cell included in the feature category, or different features may be introduced and assigned new feature categories.
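The concatenation step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `concatenate_features`, the `owner` array, and the toy histograms are assumptions, and real HOG, LBP, or BoVW extractors would replace the hard-coded lists.

```python
import numpy as np

def concatenate_features(per_category):
    """Concatenate per-category histograms and record which category owns each dimension."""
    names = list(per_category)
    vector = np.concatenate(
        [np.asarray(per_category[n], dtype=float) for n in names])
    # owner[d] is the index (into names) of the category that dimension d belongs to.
    owner = np.concatenate(
        [np.full(len(per_category[n]), i) for i, n in enumerate(names)])
    return vector, owner, names

# Toy histograms standing in for real feature extractors (hypothetical values).
features = {"HOG": [3, 6, 1], "LBP": [2, 0], "BoVW": [5, 4, 4, 1]}
vector, owner, names = concatenate_features(features)
```

Keeping an explicit dimension-to-category map makes it straightforward to later drop or keep whole categories at once.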

The first feature vector normalization unit 212 normalizes the feature vector acquired by the feature vector acquisition unit 211 so that its values fall within a predetermined range. Any normalization method may be used as long as the values fall within this range.

The classifier learning unit 213 trains a classifier from the feature vectors normalized by the first feature vector normalization unit 212. The classifier is created by a learning machine based on a kernel function and support vectors, as described in Non-Patent Documents 1 and 3. That is, during learning the kernel function is used in place of the inner product, the support vectors that are effective for identification are retained from the given feature vectors, and a weight is estimated for each retained support vector. The learning machine must also allow the lookup table described later to be generated. As shown in Non-Patent Document 2, this means that the classifier performs identification by a weighted sum of kernel inner products between the feature vector and the support vectors, and that the kernel function can be expressed as a linear sum of per-dimension operations.

To speed up computation in the identification unit 202, the lookup table creation unit 214 creates a lookup table that stores approximate values of the computation involving the support vectors and weights learned by the classifier learning unit 213. In FIG. 5(B), an example of the lookup table is shown in the rows LUT(r = 0) to LUT(r = R_max). In FIG. 5(B), D denotes the dimension and r denotes the interval. For example, the identification value for interval r = 1 of dimension d = 1 is stored in entry 516.

Based on the lookup table created by the lookup table creation unit 214, the identification contribution degree calculation unit 215 obtains the degree to which each feature category contributes to image identification. The feature category selection unit 216 determines, for each feature category, whether the identification contribution obtained by the identification contribution degree calculation unit 215 is equal to or greater than a predetermined value, and selects the feature categories to be used for identification.

For each feature category not selected by the feature category selection unit 216, the feature category correction value calculation unit 217 obtains the value that would be expected to have been added to the identification value in the identification value calculation unit 224 had the category been selected, or a parameter that approximates that value. The dictionary creation unit 218 collects the information that the identification unit 202 needs for image identification into a dictionary and stores it in the RAM 105. For the feature categories not selected by the feature category selection unit 216, the correction value obtained by the feature category correction value calculation unit 217 is stored in place of the lookup table, which makes the dictionary smaller.

The identification unit 202 identifies whether a given image belongs to the object class, based on the dictionary that the learning unit 201 created from the learning sample images. The identification unit 202 includes a dictionary reading unit 221, a selected category feature vector acquisition unit 222, a second feature vector normalization unit 223, an identification value calculation unit 224, and an identification value correction unit 225. In the present embodiment, the amount of computation is reduced by acquiring feature vectors only for the feature categories stored in the dictionary.

The dictionary reading unit 221 reads the dictionary created by the dictionary creation unit 218 from the RAM 105. For an input image that is to be identified as belonging to the object class or not, the selected category feature vector acquisition unit (second acquisition unit) 222 acquires feature vectors for the feature categories selected by the feature category selection unit 216, in the same manner as the feature vector acquisition unit 211. Acquiring a feature vector often requires preprocessing; for feature categories that were not selected, neither the feature vector acquisition nor the preprocessing is performed.

The second feature vector normalization unit 223 normalizes the feature vector acquired by the selected category feature vector acquisition unit 222 in the same way as the first feature vector normalization unit 212. The identification value calculation unit 224 computes, from the feature vector normalized by the second feature vector normalization unit 223, an identification value for identifying whether the image belongs to the object class. To do so, it uses the lookup table that was created by the lookup table creation unit 214 and stored in the dictionary by the dictionary creation unit 218. According to Non-Patent Document 1, the computation time for identification is constant with respect to the number of support vectors and proportional only to the number of dimensions. However, a storage capacity proportional to the number of dimensions and to the number of divisions of the domain is required.

The identification value correction unit 225 corrects the identification value obtained by the identification value calculation unit 224 and identifies whether the image belongs to the object class. Specifically, it adds to the identification value an approximation of the weighted sum for the feature categories not selected by the feature category selection unit 216, and then determines whether the corrected identification value exceeds a predetermined threshold.
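The identification path described above (lookup, correction, threshold) can be sketched as follows. This is a minimal illustration under stated assumptions: the function `identify`, its parameter names, the equal-width mapping from value to interval index, and all numeric values are hypothetical, and the lookup table is assumed to hold one column per selected dimension only.

```python
import numpy as np

def identify(x_selected, lut_selected, correction, threshold, v_max):
    """Sum LUT entries over the selected dimensions, add the correction, apply the threshold."""
    r_count = lut_selected.shape[0]  # number of intervals (rows of the table)
    x = np.asarray(x_selected, dtype=float)
    # Map each normalized value in [0, v_max] to an interval index in [0, r_count - 1].
    r = np.minimum((x * r_count / v_max).astype(int), r_count - 1)
    # One table read per selected dimension, independent of the number of support vectors.
    value = lut_selected[r, np.arange(lut_selected.shape[1])].sum()
    # The correction stands in for the categories that were not selected.
    value += correction
    return float(value), bool(value > threshold)

# Hypothetical dictionary contents: 4 intervals, 2 selected dimensions.
lut_selected = np.array([[0.0, 0.1],
                         [0.5, 0.4],
                         [1.0, 0.9],
                         [1.5, 1.2]])
value, is_object = identify([3, 8], lut_selected,
                            correction=0.3, threshold=1.8, v_max=8)
```

Because the non-selected categories contribute only a stored constant, neither their feature vectors nor their table columns are needed at identification time.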

The operation of the learning unit 201 is described below for the case where a learning sample image group A showing an object A and a learning sample image group B showing an object B different from object A are given. FIG. 4 is a flowchart illustrating an example of the processing procedure in the learning unit 201. The flowchart takes as input the learning sample image group of the object for which a classifier is to be trained, and outputs the dictionary of that classifier. FIG. 5 shows the information group referred to in the process of creating dictionary A from object A, and FIG. 6 shows the information group referred to in the process of creating dictionary B from object B.

First, in step S401, the feature vector acquisition unit 211 acquires feature vectors for all features included in the feature categories, for every image in the learning sample image group. Next, in step S402, the first feature vector normalization unit 212 normalizes the feature vectors.

FIG. 5(A) shows an example of feature vectors acquired from learning sample image group A. The feature categories are the histogram of oriented gradients (HOG) feature, the local binary pattern (LBP) feature, and the Bag of Visual Words (BoVW) feature described in Non-Patent Documents 4 to 6. The individual feature categories consist of vectors with D0, D1, ... dimensions, the total number of dimensions across all feature categories is D_max, and after normalization the value of each dimension of the feature vector falls within the range 0, 1, ..., V_max.

Each row of FIG. 5(A) shows the normalized feature vector obtained from one image of learning sample image group A. For example, value 501 is a feature obtained from image 2 for the HOG feature, and its value is 6. Different feature vectors are assumed to have been acquired for learning sample image group B in the same way.

Next, in step S403, the classifier learning unit 213 trains the classifier based on the feature vectors, and the lookup table creation unit 214 creates its lookup table. For learning sample image group A, the lookup table generated from the feature vectors shown in FIG. 5(A) is shown in the rows LUT(r = 0) to LUT(r = R_max) of FIG. 5(B).

The classifier is obtained by applying the Histogram Intersection Kernel of Non-Patent Document 1 to the Support Vector Data Description of Non-Patent Document 3. That is, first, assuming a space in which the histogram intersection kernel serves as the inner product of vectors, the hypersphere of minimum volume that contains as many of the feature vectors of the target class as possible is obtained. Next, only the feature vectors needed to define the hypersphere are retained as support vectors, and a weight is obtained for each retained support vector. Finally, a threshold is obtained with which the classifier can judge that a given feature vector belongs to the target class.

This classifier computes a weighted sum of the histogram intersection kernel inner products between the feature vector to be identified and the support vectors. That is, the identification value is obtained by equation (1). The classifier is expressed by whether or not this value exceeds the threshold obtained above.

In equation (1), d denotes a dimension and SV denotes the index of a support vector. x_d^SV is the value of dimension d of support vector SV, and x_d is the value of dimension d of the feature vector to be identified. α_SV is the weight of support vector SV.
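As a point of reference for the faster variants, the identification value can be evaluated directly. The sketch below assumes that equation (1) is the weighted sum over support vectors of the per-dimension minima min(x_d, x_d^SV), as in the intersection-kernel formulation of Non-Patent Document 1; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def hik_decision_value(x, support_vectors, alphas):
    """Weighted sum over support vectors of the histogram intersection with x."""
    x = np.asarray(x, dtype=float)
    sv = np.asarray(support_vectors, dtype=float)   # shape (n_sv, n_dims)
    # Histogram intersection: sum over d of min(x_d, x_d^SV), one value per support vector.
    intersections = np.minimum(sv, x).sum(axis=1)
    return float(np.dot(np.asarray(alphas, dtype=float), intersections))

support_vectors = [[2, 5, 1], [4, 0, 3]]   # toy support vectors
alphas = [0.5, 0.25]                       # toy weights
value = hik_decision_value([3, 2, 2], support_vectors, alphas)
```

This direct form costs time proportional to the number of support vectors times the number of dimensions, which is what the lookup table is meant to avoid.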

FIG. 7 is a diagram for explaining the outline of the lookup table that speeds up the computation of the classifier expressed by equation (1). Details of this lookup table and how to create it are described in Non-Patent Document 1. Equation (1) corresponds to the shaded area in FIG. 7(A), and the key observation is that equation (1) can be expressed as a linear sum of per-dimension operations. The identification value of dimension d is calculated by equation (2) and corresponds to the area under the curve shown in FIG. 7(B).

Splitting equation (2) into cases with x_d as the boundary yields equation (3).

Plotting the value of equation (3) with x_d on the horizontal axis gives FIG. 7(C). The set {SV | x_d^SV < x_d} can be found in logarithmic time if the values x_d^SV are sorted in advance, so equation (3) can also be evaluated in logarithmic time.
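The logarithmic-time evaluation can be sketched for a single dimension as follows. This is an illustration under assumptions: the class name `DimensionEvaluator` is hypothetical, the per-dimension value is assumed to be the weighted sum of min(x_d, x_d^SV), and prefix sums are one possible way to make the two partial sums available after a binary search.

```python
import bisect
from itertools import accumulate

class DimensionEvaluator:
    """Evaluates sum over SV of alpha_SV * min(x_d, x_d^SV) for one dimension in O(log n_sv)."""

    def __init__(self, sv_values, alphas):
        pairs = sorted(zip(sv_values, alphas))   # sort once by support value
        self.values = [v for v, _ in pairs]
        # prefix_av[k]: sum of alpha * value over the k smallest support values
        self.prefix_av = [0.0] + list(accumulate(a * v for v, a in pairs))
        # prefix_a[k]: sum of alpha over the k smallest support values
        self.prefix_a = [0.0] + list(accumulate(a for _, a in pairs))

    def __call__(self, x_d):
        # k = number of support values strictly below x_d (binary search)
        k = bisect.bisect_left(self.values, x_d)
        below = self.prefix_av[k]                          # these contribute alpha * x_d^SV
        at_or_above = self.prefix_a[-1] - self.prefix_a[k] # these contribute alpha * x_d
        return below + x_d * at_or_above

evaluator = DimensionEvaluator([2, 4], [0.5, 0.25])
result = evaluator(3)
```

Sorting and the prefix sums are paid once at construction; each query afterwards is a single binary search plus constant work.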

Further, the evaluation of equation (3) is reduced to constant time. The domain of x_d^SV and x_d is divided into intervals, the value of equation (3) is computed in advance for the representative value r_d of each interval r, and the result is stored in the lookup table. Evaluating equation (3) at the representative value r_d gives the value LUT(r, d) for dimension d and interval r of the lookup table, as expressed by equation (4).
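Equations (1) to (4) are not reproduced in this text. A form consistent with the symbol definitions above and with the intersection-kernel formulation of Non-Patent Document 1 would be the following; the function names f and h_d are assumptions introduced here for readability, and the placement of the equality case in the split is likewise assumed.

```latex
\begin{align}
f(\mathbf{x}) &= \sum_{SV} \alpha_{SV} \sum_{d} \min\left(x_d,\, x_d^{SV}\right) \tag{1} \\
h_d(x_d) &= \sum_{SV} \alpha_{SV} \min\left(x_d,\, x_d^{SV}\right) \tag{2} \\
h_d(x_d) &= \sum_{\{SV \mid x_d^{SV} < x_d\}} \alpha_{SV}\, x_d^{SV}
  \;+\; x_d \sum_{\{SV \mid x_d^{SV} \ge x_d\}} \alpha_{SV} \tag{3} \\
\mathrm{LUT}(r, d) &= \sum_{\{SV \mid x_d^{SV} < r_d\}} \alpha_{SV}\, x_d^{SV}
  \;+\; r_d \sum_{\{SV \mid x_d^{SV} \ge r_d\}} \alpha_{SV} \tag{4}
\end{align}
```

Under this reading, f(x) is the sum of h_d(x_d) over all dimensions d, and equation (4) is equation (3) evaluated at x_d = r_d.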

Here, r is the index of the interval into which the feature vector to be identified falls, and r_d is the representative value of that interval r. FIG. 7(D) shows the relationship between the identification value and the representative value r_d. The way the domain is divided and the way the representative value is derived are arbitrary; for example, the domain may be divided at equal intervals. In the present embodiment, the range from 0 to V_max is divided into R_max intervals, the index of a given interval is r, and its representative value r_d is the midpoint of the interval, that is,
r_d = V_max * (r + 0.5) / R_max.
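The construction of the lookup table can be sketched as follows. This is a minimal illustration that assumes the Histogram Intersection Kernel form, in which each table entry is the weighted sum of min(x_d^SV, r_d) over the support vectors; the function names, the lut[r][d] layout, and the toy support vectors are hypothetical and not taken from the publication.

```python
from typing import List, Sequence


def representative_value(r: int, v_max: float, r_max: int) -> float:
    """Midpoint of interval r when [0, v_max] is split into r_max equal intervals."""
    return v_max * (r + 0.5) / r_max


def build_lut(support_vectors: Sequence[Sequence[float]],
              alphas: Sequence[float],
              v_max: float,
              r_max: int) -> List[List[float]]:
    """Return lut[r][d] = sum over SV of alpha_SV * min(x_d^SV, r_d)."""
    n_dims = len(support_vectors[0])
    lut = []
    for r in range(r_max):
        r_d = representative_value(r, v_max, r_max)
        row = []
        for d in range(n_dims):
            # Weighted histogram-intersection term evaluated at the representative value.
            row.append(sum(a * min(sv[d], r_d)
                           for sv, a in zip(support_vectors, alphas)))
        lut.append(row)
    return lut


# Toy data for illustration only: two support vectors, two dimensions.
svs = [[10.0, 80.0], [60.0, 20.0]]
alphas = [0.5, 1.0]
lut = build_lut(svs, alphas, v_max=100.0, r_max=10)
```

With non-negative weights, every column of the table is non-decreasing in r, so the per-dimension maximum sits in the last row.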

As expression (4) shows, the value LUT(r, d) in a given dimension d increases monotonically in the r direction. Therefore, the maximum value in each dimension of the lookup table, which is obtained in step S404, equals the value when the interval r takes its maximum value R_max.

Next, in step S404, the identification contribution calculation unit 215 obtains an identification contribution for each feature category from the lookup table created in step S403. The identification contribution indicates how much each feature category contributes to identifying the image; it takes a value from 0 to V_max, and a larger value indicates greater closeness to the object. In the present embodiment, for each feature category, the maximum value of the lookup table is obtained for every dimension the category contains, and the average of these maxima is used as the identification contribution. The maximum value of the lookup table is the largest amount by which the feature category can contribute when an unknown sample is judged to belong to the object class, and the average expresses the contribution per dimension, that is, per unit of the dictionary size described later.
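The contribution measure of step S404 can be sketched as follows. The lut[r][d] layout and the mapping from category names to dimension indices are assumptions made for this illustration, and the numbers are toy values.

```python
from typing import Dict, List, Sequence


def identification_contribution(lut: Sequence[Sequence[float]],
                                dims: Sequence[int]) -> float:
    """Average, over the category's dimensions, of the per-dimension LUT maximum."""
    maxima = [max(row[d] for row in lut) for d in dims]
    return sum(maxima) / len(maxima)


# lut[r][d]: three intervals, three dimensions (toy values).
lut = [[1.0, 2.0, 0.5],
       [3.0, 4.0, 0.5],
       [5.0, 6.0, 1.0]]
# Hypothetical layout: each category owns a set of dimension indices.
categories: Dict[str, List[int]] = {"BoVW": [0, 1], "HOG": [2]}

contributions = {name: identification_contribution(lut, dims)
                 for name, dims in categories.items()}
```

The sketch takes the maximum over all intervals rather than reading the last row, so it does not depend on the table being monotonic in r.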

For example, FIG. 5(B) shows LUT maximum values 512, the per-dimension maxima of lookup table 511, which corresponds to the feature category BoVW and was generated from learning sample image group A. FIG. 5(C) shows identification contribution 522, the average of LUT maximum values 512. Likewise, LUT maximum values 513 are shown for the feature category HOG, together with their average, identification contribution 523. FIG. 6(A) shows the lookup table generated from learning sample image group B and the maxima obtained from it, and FIG. 6(B) shows the identification contribution of each feature category.

Next, in step S405, the feature category selection unit 216 selects the feature categories to be used for recognition, based on the identification contributions obtained in step S404. In the present embodiment, only feature categories whose identification contribution is equal to or greater than a predetermined threshold are selected. For example, with a threshold of 30, the BoVW and LBP features are selected for object A, as shown in the selection column of FIG. 5(C). Likewise, the HOG and LBP features are selected for object B, as shown in the selection column of FIG. 6(B).

Next, in step S406, the feature category correction value calculation unit 217 obtains a correction value for each feature category that was not selected. For such a category, the correction value approximates the weighted sum of the kernel inner products between the unknown feature vector and the support vectors. In the present embodiment, the maximum value of the lookup table, obtained when computing the identification contribution, is used as the correction value of the feature category. Using the lookup table maximum as the correction value means using the value, within the omitted feature category, that is most likely to be judged as belonging to the object class; the intent is to minimize false negatives in identification. Correction value 521 for the HOG feature of object A is obtained as shown in the correction value column of FIG. 5(C), and correction value 621 for the BoVW feature of object B is obtained as shown in the correction value column of FIG. 6(C).

Next, in step S407, the dictionary creation unit 218 stores in the dictionary the per-category selection results obtained in step S405, the lookup tables corresponding to the selected feature categories, and the correction values corresponding to the feature categories that were not selected. In the present embodiment, the identification-value threshold obtained in step S403 is also stored in the dictionary.

For example, as shown in FIGS. 5(B) and 5(C), the dictionary generated from feature A stores selection result 524, lookup tables 511 and 514, and correction value 525. Lookup tables 511 and 514 correspond to the selected BoVW and LBP features. Lookup table 515, which corresponds to the unselected feature category, namely the HOG feature, is not stored in the dictionary; correction value 525 is stored in its place.

Similarly, as shown in FIGS. 6(A) and 6(B), the dictionary generated from feature B contains selection result 622 for each feature category and lookup tables 611 and 612, which correspond to the selected HOG and LBP features respectively. For the unselected BoVW feature, only the corresponding correction value 623 is stored in the dictionary; lookup table 613 is not stored.
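Steps S405 to S407 can be sketched together as follows. The dictionary layout and all names are hypothetical. The description says only that the lookup table maximum is used as the correction value; this sketch assumes the sum of the per-dimension maxima, so that the correction is on the same scale as a sum of table entries over the category's dimensions.

```python
from typing import Dict, List, Sequence


def build_dictionary(lut: Sequence[Sequence[float]],
                     categories: Dict[str, List[int]],
                     contribution_threshold: float,
                     identification_threshold: float) -> dict:
    """Keep LUT columns for selected categories, a single correction value otherwise."""
    entry = {"selected": {}, "lut": {}, "correction": {},
             "threshold": identification_threshold}
    for name, dims in categories.items():
        maxima = [max(row[d] for row in lut) for d in dims]
        contribution = sum(maxima) / len(maxima)           # step S404
        selected = contribution >= contribution_threshold  # step S405
        entry["selected"][name] = selected
        if selected:
            # Step S407: store only the columns that belong to this category.
            entry["lut"][name] = [[row[d] for d in dims] for row in lut]
        else:
            # Step S406 (assumption: sum of per-dimension maxima).
            entry["correction"][name] = sum(maxima)
    return entry


# Toy table with two intervals and three dimensions.
lut = [[10.0, 20.0, 5.0],
       [40.0, 50.0, 10.0]]
categories = {"BoVW": [0, 1], "HOG": [2]}
dictionary = build_dictionary(lut, categories,
                              contribution_threshold=30.0,
                              identification_threshold=100.0)
```

Here the unselected category costs one stored number instead of one table column per dimension, which is the source of the dictionary size reduction.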

As described above, in the present embodiment the learning unit 201 reduces the dictionary size by not storing in the dictionary the lookup tables of feature categories that were not selected in step S405. Also, as the difference between selection result 524 for feature A and selection result 622 for feature B shows, step S405 selects different feature categories suited to the characteristics of each object. Furthermore, once the feature categories have been selected, the identification contribution is used as the correction value for each unselected category, so no learning that refers to all the feature vectors again is performed. This makes the approach faster than methods that learn again using the selected feature categories.

FIG. 8 is a flowchart showing an example of the operation procedure of the identification unit 202 of the present embodiment. The processing in FIG. 8 takes as input the dictionary created by the learning unit 201 and an input image, and outputs whether the input image is similar to the learning sample image group that was input to the learning unit 201. As described above, the dictionary stores, for each feature category, whether it was selected, together with the lookup table for each selected category and the correction value for each unselected category. FIG. 9(A) shows the feature category selection results for dictionary A, which corresponds to object A, and the correction values of the unselected feature categories. FIG. 9(B) shows the lookup tables for the selected feature categories.

First, in step S801, the selected category feature vector acquisition unit 222 creates a feature vector for the selected feature categories. The details of this processing are described later.

Next, in step S802, the second feature vector normalization unit 223 normalizes the feature vector obtained in step S801. This processing is equivalent to that of step S402 described above. Steps S801 and S802 yield, for the input image, a feature vector such as the one shown in the input-image feature vector row of FIG. 9(C).

Next, in step S803, the identification value calculation unit 224 calculates an identification value from the feature vector normalized in step S802. The calculation method corresponds to the method used in step S403 to create the classifier and the lookup table. To identify a feature vector x using the lookup table, the following expression (5) is used, in which the part of expression (1) that corresponds to expression (2) is replaced with expression (4).

Here, r_xd denotes the interval into which x_d falls. That is, for each dimension d of the feature vector, the interval r_xd is obtained from the value x_d, the lookup table is referenced using dimension d and interval r_xd, and the values LUT(r_xd, d) obtained from the lookup table are summed over all dimensions. The interval r_xd is the integer part of the quotient of x_d divided by V_max / R_max. V_max is the maximum value used when normalizing the feature vector values, as defined in steps S402 and S802. R_max is the number of intervals defined in step S403.

For the feature vector shown in the input-image feature vector row of FIG. 9(C), with V_max = 100 and R_max = 10, the interval r_xd for value 931 at d = 1 is the integer part of 12 ÷ (100 ÷ 10) = 1.2, that is, 1. This corresponds to entry 921 in the lookup table shown in FIG. 9(B). Likewise, for value 932 at d = 2, the integer part of 55 ÷ (100 ÷ 10) = 5.5 is 5, which corresponds to entry 922 in the lookup table. The result of referencing the lookup table for each dimension of the input image's feature vector is shown in the LUT reference value row of FIG. 9(C). The sum of that row is the identification value obtained in this step; it is shown in FIG. 9(D) under the identification value of the selected feature categories.
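The lookup in step S803 can be sketched as follows, using the values 12 and 55 from the worked example. The function names and the toy table are hypothetical, and clamping a value equal to V_max into the last interval is an assumption of this sketch; the description does not say how that boundary is handled.

```python
from typing import List, Sequence


def interval_index(x_d: float, v_max: float, r_max: int) -> int:
    """Integer part of x_d / (v_max / r_max), clamped to the last interval."""
    r = int(x_d // (v_max / r_max))
    return min(r, r_max - 1)  # assumption: x_d == v_max falls in the last interval


def identification_value(x: Sequence[float],
                         lut: Sequence[Sequence[float]],
                         v_max: float,
                         r_max: int) -> float:
    """Sum of LUT(r_xd, d) over all dimensions d: one table read per dimension."""
    return sum(lut[interval_index(x_d, v_max, r_max)][d]
               for d, x_d in enumerate(x))


# Toy table for illustration: lut[r][d] = r * (d + 1), ten intervals, two dimensions.
toy_lut: List[List[float]] = [[float(r * (d + 1)) for d in range(2)]
                              for r in range(10)]
value = identification_value([12.0, 55.0], toy_lut, v_max=100.0, r_max=10)
```

The cost of the loop grows with the number of dimensions only; the number of support vectors no longer appears at identification time.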

Regarding the amount of computation in step S803, the calculation time conventionally depended on both the number of dimensions and the number of support vectors, whereas with expression (5) the use of the lookup table reduces it to an amount proportional to the number of dimensions. However, if conventional feature selection or dimension reduction is simply applied without further measures, the only computational saving is a reduction in the number of lookup table references.

Next, in step S804, the identification value correction unit 225 adds the correction values of the unselected categories to the identification value. As shown in FIG. 9(A), the unselected category for dictionary A is HOG, so the corresponding correction value 911 is added to the identification value. When there are several unselected categories, the correction values of all of them are added. The result of the addition is value 941 shown in FIG. 9(D).

Next, in step S805, the identification value correction unit 225 determines whether the obtained identification value is equal to or greater than a threshold, and according to the result determines whether the image is similar to the object. In the present embodiment, in which the technique described in Non-Patent Document 3 is applied as the classifier of step S403, this threshold is determined during the learning in step S403 and is included in the dictionary. For example, if the threshold obtained in step S403 is 100, value 941 is larger than the threshold, so the input image is determined to be similar to the object.
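Steps S804 and S805 amount to an addition followed by a comparison, as in the minimal sketch below. The function name, the dictionary of corrections, and the numbers are illustrative assumptions; they are not the values shown in FIG. 9.

```python
from typing import Dict, Tuple


def is_similar(selected_value: float,
               corrections: Dict[str, float],
               threshold: float) -> Tuple[bool, float]:
    """Add the correction of every unselected category, then compare with the threshold."""
    corrected = selected_value + sum(corrections.values())
    return corrected >= threshold, corrected


# Illustrative numbers only.
decision, corrected_value = is_similar(80.0, {"HOG": 25.0}, threshold=100.0)
```

When no category was left unselected, the corrections dictionary is empty and the identification value is compared with the threshold unchanged.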

FIG. 10 is a flowchart showing an example of the detailed procedure by which the selected category feature vector acquisition unit 222 creates a feature vector for the selected feature categories in step S801. The processing takes as input the dictionary, which records whether each feature category was selected, and outputs a feature vector. The selection result column of FIG. 9(A) indicates whether each feature category was selected.

In step S1001, one feature category is taken as the category of interest, and steps S1002 to S1008 are repeated for each selected feature category.

First, in step S1002, the category to which the feature category of interest from step S1001 belongs is determined. If the feature category is BoVW, the process proceeds to step S1003; if it is HOG, to step S1005; and if it is LBP, to step S1007.

In steps S1003 and S1004, the BoVW feature is extracted. First, local feature points are obtained as preprocessing in step S1003, and the BoVW feature is then obtained from those local feature points in step S1004. After the BoVW feature has been obtained, if any feature category has not yet been processed, the process returns to step S1001.

In steps S1005 and S1006, the HOG feature is extracted. First, a differential image is obtained in step S1005; then, in step S1006, the input image is divided into a grid and a HOG feature is extracted from each cell. If any feature category has not yet been processed, the process returns to step S1001.

In steps S1007 and S1008, the LBP feature is extracted. First, an integral image of the brightness is obtained in step S1007; then, in step S1008, the input image is divided into a grid and an LBP feature is extracted from each cell. If any feature category has not yet been processed, the process returns to step S1001.

When the BoVW and LBP features are selected as shown in FIG. 9(A), the corresponding steps S1003 to S1004 and S1007 to S1008 are executed. In this case, steps S1005 to S1006, which correspond to the unselected HOG feature, are not executed.

After steps S1002 to S1008 have been repeated and features have been obtained for all the selected feature categories, the obtained features are concatenated in step S1009 and output as the feature vector. For example, when the dictionary shown in FIG. 9(B) is followed, the BoVW and LBP features are obtained, and their concatenation is what appears in the input-image feature vector row of FIG. 9(C).
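The flow of FIG. 10 can be sketched as a dispatch over the selected categories. The three extractor functions below are trivial stand-ins so that the sketch runs; they are hypothetical placeholders and do not implement BoVW, HOG, or LBP.

```python
from typing import Callable, Dict, List

Image = List[List[float]]


# Stand-in extractors (hypothetical): real ones would run their own
# preprocessing (local feature points, differential image, integral image).
def extract_bovw(image: Image) -> List[float]:
    return [float(len(image))]


def extract_hog(image: Image) -> List[float]:
    return [float(len(image[0]))]


def extract_lbp(image: Image) -> List[float]:
    return [image[0][0]]


EXTRACTORS: Dict[str, Callable[[Image], List[float]]] = {
    "BoVW": extract_bovw,
    "HOG": extract_hog,
    "LBP": extract_lbp,
}


def selected_feature_vector(image: Image, selected: Dict[str, bool]) -> List[float]:
    """Run only the extractors of selected categories and concatenate the results."""
    features: List[float] = []
    for name, extractor in EXTRACTORS.items():
        if not selected.get(name, False):
            continue  # unselected: neither preprocessing nor extraction is run
        features.extend(extractor(image))
    return features


image = [[7.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
vector = selected_feature_vector(image, {"BoVW": True, "HOG": False, "LBP": True})
```

Skipping an extractor also skips its per-pixel preprocessing, which is where the resolution-dependent cost lies.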

As described above, in the present embodiment the selected category feature vector acquisition unit 222 does not acquire feature vectors for feature categories that the learning unit 201 did not select. The amount of computation in the selected category feature vector acquisition unit 222 is therefore smaller than when feature vectors are acquired for all feature categories. Moreover, many image features, including at least BoVW, HOG, and LBP, require preprocessing before they can be acquired, and the cost of that preprocessing depends on the resolution at which the input image is processed. Not selecting a feature category therefore removes memory reads and writes in proportion to the resolution of the input image. Because this saving depends on the resolution of the input image, it is considerably larger than the saving obtained merely by reducing the number of lookup table references through feature selection, as described in this embodiment.

(Second Embodiment)
In the processing of step S403 described in the first embodiment, several methods are applicable to the processing performed by the classifier learning unit 213, provided that the learning machine is based on support vectors and the kernel function is an additive kernel.

For example, when a feature vector that does not belong to the object class can be prepared in addition to the feature vectors that belong to the object class, a one-class SVM may be applied as the learning machine. When many feature vectors that do not belong to the object class can be prepared in addition to those that do, a method based on the Support Vector Machine as described in Non-Patent Document 1 may be used. In these cases, higher generalization performance than with SVDD can generally be expected in subspaces for which no similar feature vectors were supplied during learning.

As additive kernels, Non-Patent Document 2 presents the RBF kernel and the GRBF kernel, and these can also be applied. With these kernels, the inner product between feature vectors is more nonlinear than with Histogram Intersection, so higher discrimination performance can be expected in subspaces of the feature vector space where positive and negative examples are intertwined.

Whichever combination is chosen, the subsequent procedure for creating the lookup table is the same as in the first embodiment. However, when the Histogram Intersection Kernel is not used, the lookup table must be searched in order to find the maximum value in a given dimension.

(Third Embodiment)
Several methods are applicable to the processing performed by the identification contribution calculation unit 215 and the feature category selection unit 216 described in the first embodiment. For example, in step S404, the maximum value of the lookup table may be obtained for each dimension contained in a feature category, and the variance of these maxima may be used as the contribution. In this case, step S405 selects feature categories whose variance is small. A small variance means that the feature category is stable, so the error is small when the feature category correction value calculation unit 217 and the identification value correction unit 225 approximate the lookup table by its correction value.

As another example, suppose that many images that are not the object to be identified are collected as learning sample images, so that the distribution of feature vectors for each feature category, that is, the background distribution, is available. In this case, the information amount relative to the background distribution may be obtained for each feature category, and either this information amount or this information amount divided by the number of dimensions of the feature category may be used as the identification contribution. Step S405 then regards feature categories with a large identification contribution as effective and selects them. The information amount can be obtained, for example, by computing both the background distribution and the distribution of the object to be identified for each feature category as a pair of mean and variance and then computing the KL divergence between the two. A large information amount means that the result of the identification value calculation unit 224 is highly reliable. A large information amount per dimension means high reliability relative to the size of the dictionary. These measures and selection methods may also be combined.
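One way to compute such an information amount is the closed-form KL divergence between Gaussians, sketched below. The sketch assumes each dimension is modeled as an independent univariate Gaussian and measures the object distribution relative to the background distribution; the description fixes neither the distribution model nor the direction of the divergence, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Stats = Tuple[float, float]  # (mean, variance) of one dimension


def gaussian_kl(mean_p: float, var_p: float, mean_q: float, var_q: float) -> float:
    """KL(N(mean_p, var_p) || N(mean_q, var_q)) for univariate Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)


def category_information(object_stats: Sequence[Stats],
                         background_stats: Sequence[Stats]) -> Tuple[float, float]:
    """Return (total information, information per dimension) for one feature category."""
    total = sum(gaussian_kl(mp, vp, mq, vq)
                for (mp, vp), (mq, vq) in zip(object_stats, background_stats))
    return total, total / len(object_stats)


# Toy statistics: the object differs from the background in the first dimension only.
total_info, per_dim_info = category_information([(1.0, 1.0), (0.0, 1.0)],
                                                [(0.0, 1.0), (0.0, 1.0)])
```

The per-dimension figure corresponds to the variant that normalizes by the number of dimensions, which favors categories that are informative relative to the table space they occupy.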

(Fourth Embodiment)
Several methods are applicable to the processing of the feature category correction value calculation unit 217 and the identification value correction unit 225 described in the first embodiment. For example, it may be assumed that the prior distribution of the values in each dimension of the acquired feature vector is uniform, and that the identification value of the selected feature categories and the identification value of the unselected feature categories are correlated. In this case, the mean and variance of the maxima over all dimensions of the selected feature categories, and the mean and variance of the maxima over all dimensions of the unselected feature categories, are stored in the dictionary as parameters for correcting the identification value. The identification value correction unit 225 first obtains the average of the feature vector values for the selected categories. Next, it obtains the Z value of this average within the distribution of the maxima over all dimensions of the selected feature categories stored in the dictionary. From this Z value, it estimates the correction value based on the mean and variance of the unselected feature categories stored in the dictionary.
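The Z-value correction can be sketched as follows. The description does not state how the correction is derived from the Z value; this sketch assumes the simplest reading, in which the Z value is mapped linearly onto the distribution of the unselected categories. All names and numbers are illustrative.

```python
import math
from typing import Sequence


def estimate_correction(selected_values: Sequence[float],
                        selected_mean: float, selected_var: float,
                        unselected_mean: float, unselected_var: float) -> float:
    """Map the Z value of the selected-category average onto the unselected distribution."""
    average = sum(selected_values) / len(selected_values)
    z = (average - selected_mean) / math.sqrt(selected_var)
    # Assumption: the unselected categories deviate from their mean by the same Z value.
    return unselected_mean + z * math.sqrt(unselected_var)


# Illustrative numbers: the input sits one standard deviation below the selected mean.
correction = estimate_correction([30.0, 50.0],
                                 selected_mean=50.0, selected_var=100.0,
                                 unselected_mean=20.0, unselected_var=4.0)
```

Unlike the fixed maximum of the first embodiment, this correction varies with the input, at the cost of four extra parameters in the dictionary.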

As another example, when the kernel function in the classifier learning unit 213 is Histogram Intersection and the identification value correction unit 225 is intended to minimize false positives, no correction value need be stored in the dictionary and the identification value correction unit 225 need perform no processing.

(Fifth Embodiment)
The processing of the classifier learning unit 213 described in the first embodiment does not necessarily have to be a method that corresponds to the identification value calculation unit 224. Using only the features that correspond to the feature categories selected by the feature category selection unit 216, a detector may be trained with a learning method that corresponds to the identification method of the identification value calculation unit 224, and the result may be stored in the dictionary. In this case, the amount of computation in the learning unit 201 increases, but the identification value correction unit 225 and related components become unnecessary, and identification adapted to the selected feature categories can be expected.

(Other Embodiments)
Preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

The present invention can also be realized by executing the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads out and executes the program.

DESCRIPTION OF SYMBOLS
201 Learning unit
202 Identification unit
211 Feature vector acquisition unit
212 First feature vector normalization unit
213 Classifier learning unit
214 Lookup table creation unit
215 Identification contribution calculation unit
216 Feature category selection unit
217 Feature category correction value calculation unit
218 Dictionary creation unit
221 Dictionary reading unit
222 Selected category feature vector acquisition unit
223 Second feature vector normalization unit
224 Identification value calculation unit
225 Identification value correction unit

Claims

First acquisition means for acquiring a feature vector indicating the characteristics of the learning sample image;
A computing means for obtaining an identification contribution for each feature category from the feature vector of the learning sample image obtained by the first obtaining means;
A selection means for selecting a feature category whose identification contribution obtained by the calculation means is a predetermined value or more;
A dictionary creation means for creating a dictionary including information on the feature category selected by the selection means;
Second acquisition means for acquiring a feature vector indicating the feature of the input image;
Determining means for determining whether the object of the input image is similar to the object of the learning sample image, based on the feature vector of the input image acquired by the second acquisition means and the dictionary created by the dictionary creation means,
The image identification apparatus being characterized in that the second acquisition means does not acquire, from the input image, a feature vector corresponding to a feature category not selected by the selection means.

The image identification apparatus according to claim 1, wherein the calculation means includes means for creating a lookup table by extracting support vector values for each dimension from the feature vectors of the learning sample images, obtaining a weighted sum over the support vectors, and storing, for each dimension, a representative value of the weighted sum for each predetermined interval, and
wherein the calculation means calculates the identification contribution for each feature category based on the lookup table.

The image identification apparatus according to claim 2, wherein the calculation means obtains, as the identification contribution of a feature category, the average of the maximum representative values stored in the lookup table for the respective dimensions of that feature category.
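
The lookup table and the contribution derived from it can be sketched as follows, under stated assumptions. The claims do not name a kernel; this sketch assumes the histogram intersection kernel mentioned in the background, so the per-dimension weighted sum is the sum of `w * min(x, s)` over the support vectors. It also assumes feature values normalized to [0, 1], equal-width intervals, and the value at each interval's center as the representative value. The function names and the mapping from categories to dimension indices are hypothetical.

```python
from typing import Dict, List, Sequence


def build_lookup_table(
    support_vectors: Sequence[Sequence[float]], weights: Sequence[float], n_bins: int
) -> List[List[float]]:
    """Return table[d][b]: weighted sum for dimension d evaluated at the center of bin b."""
    n_dims = len(support_vectors[0])
    table = []
    for d in range(n_dims):
        row = []
        for b in range(n_bins):
            x = (b + 0.5) / n_bins  # bin center as representative input (assumption)
            # Histogram-intersection form of the weighted sum (assumed kernel).
            row.append(sum(w * min(x, sv[d]) for sv, w in zip(support_vectors, weights)))
        table.append(row)
    return table


def category_contributions(
    table: List[List[float]], category_dims: Dict[str, List[int]]
) -> Dict[str, float]:
    """Average, over a category's dimensions, of each dimension's maximum table entry."""
    return {
        name: sum(max(table[d]) for d in dims) / len(dims)
        for name, dims in category_dims.items()
    }


if __name__ == "__main__":
    # Two hypothetical support vectors with two dimensions each.
    table = build_lookup_table([[0.5, 0.0], [1.0, 0.0]], [1.0, 1.0], n_bins=2)
    print(category_contributions(table, {"a": [0], "b": [1]}))
```

A category whose table rows stay near zero can add little to the identification value, which is why its contribution comes out low under this measure.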

The image identification apparatus according to claim 1 or 2, wherein the learning sample image is an image that does not belong to the class of the object in the input image, and
wherein the calculation means obtains, as the identification contribution, an information amount of a background distribution indicating the distribution of feature vectors for each feature category of the learning sample image acquired by the first acquisition means.
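
One possible reading of the "information amount of a background distribution" is the Shannon entropy of the quantized feature values observed for a category in the background (non-object) learning samples. That reading is an assumption, as are the equal-width quantization over [0, 1] and the name `shannon_entropy`; the claim itself does not fix how the information amount is computed.

```python
import math
from collections import Counter
from typing import Sequence


def shannon_entropy(values: Sequence[float], n_bins: int) -> float:
    """Entropy in bits of values in [0, 1] after quantizing into n_bins equal bins."""
    # Clamp the top edge so that a value of exactly 1.0 falls in the last bin.
    counts = Counter(min(int(v * n_bins), n_bins - 1) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical background feature values for two categories.
    print(shannon_entropy([0.1, 0.1, 0.9, 0.9], n_bins=2))
    print(shannon_entropy([0.3, 0.3, 0.3, 0.3], n_bins=2))
```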

The image identification apparatus according to claim 2 or 3, wherein the dictionary creation means includes in the dictionary, among the lookup tables created by the calculation means, the lookup table corresponding to the feature category selected by the selection means.

An image identification method comprising:
a first acquisition step of acquiring a feature vector indicating features of a learning sample image;
a calculation step of obtaining an identification contribution for each feature category from the feature vector of the learning sample image acquired in the first acquisition step;
a selection step of selecting a feature category whose identification contribution obtained in the calculation step is equal to or greater than a predetermined value;
a dictionary creation step of creating a dictionary including information on the feature category selected in the selection step;
a second acquisition step of acquiring a feature vector indicating features of an input image; and
a determination step of determining, based on the feature vector of the input image acquired in the second acquisition step and the dictionary created in the dictionary creation step, whether or not an object in the input image is similar to an object in the learning sample image,
wherein, in the second acquisition step, a feature vector corresponding to a feature category not selected in the selection step is not acquired from the input image.

A program for causing a computer to execute:
a first acquisition step of acquiring a feature vector indicating features of a learning sample image;
a calculation step of obtaining an identification contribution for each feature category from the feature vector of the learning sample image acquired in the first acquisition step;
a selection step of selecting a feature category whose identification contribution obtained in the calculation step is equal to or greater than a predetermined value;
a dictionary creation step of creating a dictionary including information on the feature category selected in the selection step;
a second acquisition step of acquiring a feature vector indicating features of an input image; and
a determination step of determining, based on the feature vector of the input image acquired in the second acquisition step and the dictionary created in the dictionary creation step, whether or not an object in the input image is similar to an object in the learning sample image,
wherein, in the second acquisition step, a feature vector corresponding to a feature category not selected in the selection step is not acquired from the input image.