JP2015230715A

JP2015230715A - Feature quantity arithmetic device, feature quantity arithmetic method, and feature quantity arithmetic program

Info

Publication number: JP2015230715A
Application number: JP2014118175A
Authority: JP
Inventors: 満安倍; Mitsuru Abe; 幹郎清水; Mikiro Shimizu
Original assignee: Denso Corp; Denso IT Laboratory Inc
Current assignee: Denso Corp; Denso IT Laboratory Inc
Priority date: 2014-06-06
Filing date: 2014-06-06
Publication date: 2015-12-21
Anticipated expiration: 2034-06-06
Also published as: JP6235414B2

Abstract

PROBLEM TO BE SOLVED: To provide a feature quantity arithmetic device appropriate for computing binary feature quantities. SOLUTION: A feature quantity arithmetic device comprises: a feature-quantity binarization unit that binarizes feature quantities extracted from each image of a pyramid image constituted by an input image and a plurality of resized images obtained by scaling the input image up or down at a plurality of magnifications; and a feature quantity arithmetic unit that applies a dictionary set constituted by a plurality of dictionaries of different sizes to the binarized feature quantities and determines the association between the input image and the dictionaries, the feature quantity arithmetic unit applying the identical dictionary set or similar dictionary sets to the images in the pyramid image.

Description

The present invention relates to a feature amount calculation device, a feature amount calculation method, and a feature amount calculation program for calculating feature amounts extracted from an image, and in particular to a device, method, and program that calculate binarized feature amounts.

Feature amounts have long been used in many fields, such as image search, speech recognition, text search, and pattern recognition. A feature amount is information such as an image, speech, or text converted into a form that a computer can handle easily. A feature amount is represented by a D-dimensional vector (feature vector).

Calculations on feature vectors make it possible, for example, to determine the similarity of content. That is, if the distance between the feature vector of image α and the feature vector of image β is small, α and β can be regarded as similar. Likewise, if the distance between the feature vector of speech waveform α and that of speech waveform β is small, α and β can be regarded as similar. In information processing such as speech recognition, text search, and pattern recognition, information is thus converted into feature vectors, the feature vectors are compared with each other, and the similarity of the information is judged from the distance between them.

The L1 norm, the L2 norm, the angle between vectors, and the like are used as measures of the distance between feature vectors. For feature vectors x, y ∈ R^D, these can be calculated as follows.
L1 norm: $\|x - y\|_1 = \sum_{i=1}^{D} |x_i - y_i|$
L2 norm: $\|x - y\|_2 = \sqrt{\sum_{i=1}^{D} (x_i - y_i)^2}$
Angle between vectors: $\theta = \cos^{-1}\left( \dfrac{x^{T} y}{\|x\|_2 \, \|y\|_2} \right)$
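As an illustration of the three distance measures named above, the following is a minimal sketch in plain Python. The function names are hypothetical and chosen only for this example; the patent does not prescribe any implementation.

```python
import math


def l1_distance(x, y):
    """L1 norm of x - y: sum of absolute element-wise differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def l2_distance(x, y):
    """L2 norm of x - y: square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def angle_between(x, y):
    """Angle (radians) between x and y; both must be non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)
```

Each function walks over all D elements with real-number arithmetic, which is the cost that the binary-code approach described later is intended to avoid.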

When the extracted feature vectors are real-valued, the following problems arise. First, calculating the distance between two feature vectors x, y ∈ R^D is slow. For example, when the squared L2 norm is used as the distance measure,
$\|x - y\|_2^2 = \sum_{i=1}^{D} (x_i - y_i)^2$
so D subtractions, D multiplications, and D−1 additions on real numbers are required. This computational load is especially high when the feature vectors are represented as floating-point numbers, and it grows further as the dimensionality of the feature vectors increases.

A second problem is that a large amount of memory is consumed. When each element of a feature vector is represented as a 4-byte single-precision real number, a D-dimensional feature vector consumes 4D bytes of memory. The memory consumption grows as the dimensionality of the feature vectors increases, and when a large number of feature vectors is handled, memory is consumed in proportion to that number.

In recent years, methods have therefore been proposed that solve these two problems by converting feature vectors into binary codes consisting of sequences of 0s and 1s. Representative methods include random projection (see Non-Patent Document 1), very sparse random projection (see Non-Patent Document 2), and spectral hashing (see Non-Patent Document 3).

In these methods, a D-dimensional feature vector is converted into a d-bit binary code. The conversion is performed so that distance in the original space correlates strongly with Hamming distance in the converted space (for the basis of this correlation, see Lemma 3.2 on page 1121 of Non-Patent Document 1). This allows the calculation of the distance between feature vectors to be replaced by the calculation of the Hamming distance between binary codes.

The Hamming distance is the number of bit positions in which two binary codes differ. It can be computed very quickly, because it only requires XORing the two codes and counting the bits set to 1 in the result. In many cases, conversion to binary codes yields a speedup of several tens to several hundreds of times. In addition, replacing the distance calculation between feature vectors with the Hamming distance calculation between binary codes reduces the memory required per feature vector from the original 4D bytes to d/8 bytes. Memory consumption can thereby be reduced to between one several-tenth and one several-hundredth of the original.
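The XOR-and-count computation described above can be sketched as follows. This is a minimal illustration in Python; the function names and the choice of `bytes` as the container for a binary code are assumptions made for this example.

```python
def hamming_distance(code_a, code_b):
    """Number of differing bits between two equal-length byte strings."""
    if len(code_a) != len(code_b):
        raise ValueError("binary codes must have the same length")
    xor = int.from_bytes(code_a, "big") ^ int.from_bytes(code_b, "big")
    return bin(xor).count("1")  # count the bits set to 1 after XOR


def memory_bytes(real_dims, code_bits):
    """Bytes per vector: 4-byte reals (4D) versus a packed binary code (d/8)."""
    return 4 * real_dims, code_bits // 8
```

For instance, under these assumptions a 128-dimensional single-precision vector occupies 512 bytes, whereas a 128-bit binary code occupies 16 bytes.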

Converting the extracted feature amounts into binary codes and applying various algorithms to them makes it possible to search for and recognize content. For example, to search for similar content, the feature amounts of all content registered in a database are converted into binary codes in advance. The feature amount of the content given as an input query is likewise converted into a binary code. Then, by calculating the Hamming distance between the binary code of the input query and every binary code registered in the database, content similar to the input query can be retrieved and output.
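The similar-content search described above amounts to a linear scan over the registered binary codes. A minimal sketch, assuming that each code is held as a Python integer and that the database is a simple name-to-code mapping (both are assumptions for illustration only):

```python
def search_similar(query_code, database, top_k=3):
    """Return the top_k (name, distance) pairs closest to query_code.

    database maps a content name to its binary code (an int); the names and
    the dict layout are hypothetical, not taken from the patent.
    """
    scored = sorted(
        (bin(query_code ^ code).count("1"), name)  # Hamming distance, name
        for name, code in database.items()
    )
    return [(name, dist) for dist, name in scored[:top_k]]
```

Ties in distance are broken by name here only because tuples are sorted lexicographically; a real system would choose its own tie-breaking rule.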

A binary code consists of a d-bit sequence of 0s and 1s. It can also be regarded as a d-dimensional vector whose elements each take only the two values −1 and 1. To avoid confusion in the following description, the terms "binary code" and "binary vector" are distinguished as follows. A "binary code" is a data representation consisting of a sequence of 0s and 1s. For example, to store a 128-bit binary code in memory in the C language, an array of 16 elements of the unsigned 8-bit integer type (unsigned char) suffices (8 bits × 16 = 128 bits).

A "binary vector", on the other hand, is a vector whose elements each take only two values. For example, if a binary vector is defined as a vector whose elements take only −1 and 1, the binary vector corresponding to the binary code "01101110" is (−1, 1, 1, −1, 1, 1, 1, −1)^T. Of course, a vector whose elements take only the two values 0 and 1 is also a binary vector, and so is a vector whose elements take only two arbitrary values α and β (where α ≠ β). The difference between a "binary code" and a "binary vector" concerns only the representation of the information; there is no essential difference between the two.
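The correspondence between a binary code and a binary vector can be made concrete with a short sketch. The helper names are hypothetical. The last function also checks a standard identity for vectors with elements in {−1, 1}: the inner product equals d minus twice the Hamming distance of the corresponding codes.

```python
def code_to_vector(code, low=-1, high=1):
    """Map a string of '0'/'1' characters to a vector over {low, high}."""
    return [high if bit == "1" else low for bit in code]


def vector_to_code(vector, high=1):
    """Inverse mapping: elements equal to `high` become '1', others '0'."""
    return "".join("1" if v == high else "0" for v in vector)


def inner_product_from_codes(code_a, code_b):
    """Inner product of the {-1, 1} vectors, computed from the codes alone."""
    d = len(code_a)
    hamming = sum(a != b for a, b in zip(code_a, code_b))
    return d - 2 * hamming  # agreeing bits add +1, differing bits add -1
```

For the code "01101110" from the text, `code_to_vector` yields [-1, 1, 1, -1, 1, 1, 1, -1].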

Non-Patent Document 1: Michel X. Goemans, David P. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming", Journal of the ACM, Volume 42, Issue 6 (November 1995), Pages 1115-1145
Non-Patent Document 2: Ping Li, Trevor J. Hastie, Kenneth W. Church, "Very sparse random projections", KDD '06: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2006)
Non-Patent Document 3: Y. Weiss, A. Torralba, R. Fergus, "Spectral Hashing", Advances in Neural Information Processing Systems, 2008

To perform calculations using feature amounts, the feature amounts must first be extracted from the input content. In the following, the problem addressed by the present invention is described using, as an example of feature amount calculation, the identification of an identification target contained in an input image serving as the input content.

In object recognition, HOG (Histograms of Oriented Gradients) feature amounts are commonly used. Identification using HOG feature amounts is therefore outlined first. FIG. 49 is a diagram for explaining a method of extracting HOG feature amounts from an input image. To extract HOG feature amounts, the identification device first divides the input image into regions of M pixels × M pixels (M is a natural number) and obtains from each region a gradient direction histogram over D directions (D is a natural number). Each such small region of M pixels × M pixels is called a "cell". Each cell is thus given a D-dimensional feature vector. Furthermore, a group of N cells × N cells treated as one unit is called a "block". Since each cell is given a D-dimensional feature vector, each block is given an (N × N × D)-dimensional feature vector.

The (N × N × D)-dimensional vector given to a block is usually normalized to length 1. This makes the feature robust to changes in lighting conditions. Adjacent blocks are arranged so as to overlap; that is, horizontally adjacent blocks share some cells.

From these blocks, the identification device cuts out an (N × N × D × H × V)-dimensional feature amount using a window of H blocks wide × V blocks high. The identification device treats this as the feature amount of an object and applies identification processing to it to determine whether the object appearing in the window is a specific target (for example, a pedestrian).

For pedestrian recognition, M = 8, N = 2, D = 32, H = 8, and V = 16 are known to be appropriate parameters. For example, when HOG feature amounts are extracted with these standard parameters from an input image 640 pixels wide × 480 pixels high, HOG feature amounts for 79 blocks horizontally × 59 blocks vertically are obtained.
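The block counts and dimensionalities quoted above can be checked with a little arithmetic. The sketch below assumes that adjacent blocks are offset by exactly one cell, which is consistent with the 79 × 59 figure but is not stated explicitly in the text; the function name is hypothetical.

```python
def hog_layout(width, height, M, N, D, H, V):
    """Cell/block grid sizes and feature dimensions for the given parameters."""
    cells_x, cells_y = width // M, height // M
    # Assumption: overlapping blocks are placed with a stride of one cell.
    blocks_x, blocks_y = cells_x - N + 1, cells_y - N + 1
    block_dim = N * N * D            # dimension of one block's feature vector
    window_dim = block_dim * H * V   # dimension of one window's feature amount
    return {
        "cells": (cells_x, cells_y),
        "blocks": (blocks_x, blocks_y),
        "block_dim": block_dim,
        "window_dim": window_dim,
    }
```

With M = 8, N = 2, D = 32, H = 8, V = 16 and a 640 × 480 image, this gives 80 × 60 cells, 79 × 59 blocks, 128 dimensions per block, and 16384 dimensions per window.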

When the position of the specific target within the input image is unknown, the identification device slides the window of H blocks × V blocks over the input image, cuts out an (N × N × D × H × V)-dimensional feature amount at each position, and applies identification processing each time, thereby determining whether the input image contains the specific target. In addition, the size of the identification target in the input image may also be unknown. The following methods are available for identifying a target whose size is unknown.
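The sliding-window scan described above can be sketched as a generator of window positions over the block grid. The stride of one block is an assumption for illustration, as is the function name.

```python
def window_positions(blocks_x, blocks_y, H, V, stride=1):
    """Yield (left, top) block coordinates of every H x V window position."""
    for top in range(0, blocks_y - V + 1, stride):
        for left in range(0, blocks_x - H + 1, stride):
            yield left, top
```

On the 79 × 59 block grid of the earlier example, an 8 × 16 window with a stride of one block visits 72 × 44 = 3168 positions, and identification processing is applied at each of them.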

(First method: feature pyramid method)
FIG. 1 is a diagram for explaining the feature pyramid method. In this method, the identification device resizes the input image to L different sizes, generating L images of different sizes, and extracts feature amounts from each image. The identification device performs the feature amount calculation for identification on each image using a window W of the same size.

FIG. 2 is a diagram for explaining identification processing in the feature pyramid method. When the input image 10 is obtained, the identification device reduces it at a plurality of reduction ratios to generate a plurality of resized (reduced) images 11. The identification device extracts feature amounts from each of the input image and the resized images (L images in total) to generate a feature pyramid. That is, the identification device performs feature amount extraction L times. The identification device then performs the feature amount calculation for identification using the multiple levels of feature amounts extracted from the images of each size. Because the window size is fixed, it suffices to prepare a single identification dictionary corresponding to that window size.
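To make the L-level structure concrete, the sketch below lists the image sizes of a pyramid built by repeated reduction. The constant reduction ratio between levels and the function name are assumptions for illustration; the text only says that a plurality of reduction ratios is used.

```python
def pyramid_sizes(width, height, ratio, levels):
    """Sizes of the input image (level 0) and its reduced copies."""
    sizes = []
    for level in range(levels):
        factor = ratio ** level  # level 0 is the unscaled input image
        sizes.append((round(width * factor), round(height * factor)))
    return sizes
```

In the feature pyramid method, feature extraction runs once per entry of this list, while the window and the dictionary stay the same for every level.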

FIG. 3 is a diagram for explaining identification processing when the above-described speedup technique of binarizing feature amounts is applied to the feature pyramid method. As in FIG. 2, when the input image 10 is obtained, the identification device reduces it at a plurality of reduction ratios to generate a plurality of resized (reduced) images 11, and extracts feature amounts from each of the input image and the resized images (L images in total) to generate a feature pyramid. The identification device binarizes each of the multiple levels of feature amounts extracted from the images of each size. That is, the identification device performs feature amount binarization L times. The identification device performs the feature amount calculation for identification using the binary feature amounts of each level. Because binary feature amounts are used, this feature amount calculation is accelerated.

However, the feature pyramid method requires feature amount extraction to be performed on a plurality of images of different sizes, so feature amount extraction is slow. Moreover, even when the feature amount calculation is accelerated by binarizing the feature amounts, binarization must be performed once for each level of resizing, so the speedup offered by binarization cannot be fully exploited.

(Second method: classifier pyramid method)
FIG. 4 is a diagram for explaining the classifier pyramid method. In this method, the identification device varies the cell size used when extracting feature amounts from the input image over L different sizes (2 × 2 pixels, 3 × 3 pixels, and so on) and extracts L levels of feature amounts. The block size N and the numbers of blocks H and V along the width and height of the object model (window) can be set, for example, to N = 2, H = 8, and V = 16 as above. The identification device performs the feature amount calculation for identification on the feature amounts of each level using L windows W of different sizes. The pixel dimensions of the window W change with the cell size, but the dimensionality of the feature amount does not.

FIG. 5 is a diagram for explaining identification processing in the classifier pyramid method. When the input image 10 is obtained, the identification device extracts feature amounts from it with L different cell sizes (for example, 2 × 2 pixels, 3 × 3 pixels, and so on). The operation of obtaining the gradient histogram within a cell is redundant across cell sizes, so the load of feature amount extraction can be reduced by techniques such as using an integral image of the feature amounts; in principle, however, the feature amount extraction processing (forming the blocks and normalizing the feature amounts given to the blocks) must still be performed L times. The identification device performs the feature amount calculation for identification using the multiple levels of feature amounts extracted with each cell size.

FIG. 6 is a diagram for explaining identification processing when the above-described speedup technique of binarizing feature amounts is applied to the classifier pyramid method. As in FIG. 5, when the input image 10 is obtained, the identification device extracts feature amounts from it with L different cell sizes (for example, 2 × 2 pixels, 3 × 3 pixels, and so on). The identification device binarizes the feature amounts of each cell size. That is, the identification device performs feature amount binarization L times. The identification device performs the feature amount calculation for identification using the binary feature amounts of each level. Because binary feature amounts are used, this feature amount calculation is accelerated.

However, in the classifier pyramid method, if the feature amount is not scale-invariant, identification performance deteriorates at scales for which the feature amount is poorly suited. For example, the HOG feature amount described above is not scale-invariant, so the classifier pyramid method is not suitable for it. More specifically, a cell size of 8 pixels × 8 pixels is known to be appropriate for the HOG feature amount, but in the classifier pyramid method the cell size must be made very large to detect an object that appears large and very small to detect an object that appears small, and object recognition accuracy can deteriorate markedly in those cases. A further problem is that a dictionary must be trained for each of the block sizes.

In addition, the classifier pyramid method allows feature amount extraction to be accelerated by exploiting the redundancy among feature amounts of different block sizes, and allows identification to be accelerated by binarizing the feature amounts, but this is merely a juxtaposition of two techniques and yields no synergistic effect between them.

The present invention has been made in view of the above problems, and an object thereof is to provide a feature amount calculation device suited to calculating binary feature amounts.

A feature amount calculation device according to one aspect of the present invention comprises: a feature amount binarization unit that binarizes the feature amounts extracted from each of the pyramid images, which consist of an input image and a plurality of resized images obtained by enlarging or reducing the input image at a plurality of magnifications; and a feature amount calculation unit that applies a dictionary set consisting of a plurality of dictionaries of different sizes to the binarized feature amounts to determine the relevance between the input image and the dictionaries. For each of the pyramid images, the feature amount calculation unit uses the binarized feature amount in common for the plurality of dictionaries to determine the relevance to the plurality of dictionaries.

With this configuration, the feature amounts extracted from each of the pyramid images are binarized, and a plurality of dictionaries of different sizes is then applied to each binary feature amount. Compared with the feature pyramid method, in which feature amounts are extracted from each of the pyramid images and the same dictionary is applied to each feature amount, the number of feature amount extractions can be reduced, lightening the load of feature amount extraction and accelerating it. Compared with the classifier pyramid method, in which feature amounts of a plurality of cell sizes are extracted from the input image without generating pyramid images and a different dictionary is applied to each feature amount, the number of binarization operations can be reduced, lightening the load of binarization and accelerating it. That is, in the above configuration the binarized feature amounts are used in common for the plurality of dictionaries: the dictionaries, which differ in their number of cells in order to handle differences in the size, relative to the input image, of the target whose relevance is to be determined (for example, a pedestrian), share the same binarized feature amount for each cell. The number of feature amounts that must be computed is thereby reduced, and the relevance determination is accelerated.
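The saving described above can be illustrated by counting how many extraction and binarization passes each scheme needs to cover a given number of object scales. This is only a rough sketch under a simplifying assumption that is not stated in the source: each combination of pyramid level and dictionary size in the proposed configuration covers one distinct scale. The function and key names are hypothetical.

```python
def pass_counts(num_scales, dictionaries_per_level):
    """Per-image pass counts for covering num_scales object sizes."""
    # Assumption: every (pyramid level, dictionary) pair covers one distinct
    # scale, so the proposed scheme needs ceil(num_scales / dictionaries) levels.
    proposed_levels = -(-num_scales // dictionaries_per_level)  # ceiling division
    return {
        "feature_pyramid": {
            "extract": num_scales, "binarize": num_scales, "dictionaries": 1},
        "classifier_pyramid": {
            "extract": num_scales, "binarize": num_scales,
            "dictionaries": num_scales},
        "proposed": {
            "extract": proposed_levels, "binarize": proposed_levels,
            "dictionaries": dictionaries_per_level},
    }
```

Under this assumption, covering 12 scales with a set of 3 dictionaries requires 4 extraction and 4 binarization passes instead of 12, at the cost of holding 3 dictionaries instead of 1.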

The above feature amount calculation device may further comprise a feature amount extraction unit that extracts the feature amounts from each of the pyramid images, and the feature amount binarization unit may binarize the feature amounts extracted by the feature amount extraction unit.

With this configuration, the real-valued feature amounts extracted from each of the pyramid images can be binarized.

In the above feature amount calculation device, the feature amount calculation unit may perform identification on the input image using the dictionaries.

With this configuration, identification can be accelerated through binarization of the feature amounts without increasing the processing load, so that, for example, continuously input images (moving images) can be recognized in real time.

In the above feature amount calculation device, the feature amount calculation unit may apply, to each of the pyramid images, a dictionary set in which all or some of the plurality of dictionaries are identical.

With this configuration, the storage size of the dictionary set consisting of the plurality of dictionaries can be reduced.

The above feature amount calculation device may further comprise a feature amount conversion unit that converts the feature amounts, using co-occurrence elements of the binarized feature amounts, so as to strengthen their discriminative ability.

With this configuration, the accuracy with which the feature amount calculation unit identifies the input image can be improved.

The above feature amount calculation device may further comprise a basis vector acquisition unit that acquires a plurality of basis vectors obtained by decomposing a real vector, whose elements are real numbers, into a linear sum of basis vectors whose elements consist only of binary or ternary discrete values. The dictionaries may be generated using the plurality of basis vectors, and the feature amount calculation unit may determine the relevance between the real vector and a feature vector representing the feature amount by sequentially calculating the inner product of the feature vector with each of the basis vectors.

With this configuration, vector operations are accelerated by discretizing the dictionary, so the relevance between the feature amount and the real vector can be determined quickly.
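The idea of replacing an inner product with a real vector by a sequence of inner products with discrete basis vectors can be sketched as follows. The greedy residual decomposition used here (sign of the residual as the basis vector, mean absolute residual as the coefficient) is a simple stand-in chosen for this example; it is an assumption and not the decomposition procedure of the patent. All names are hypothetical.

```python
def decompose(w, k):
    """Greedily approximate w as sum_i coeffs[i] * bases[i], bases in {-1, 1}."""
    residual = list(w)
    coeffs, bases = [], []
    for _ in range(k):
        basis = [1 if r >= 0 else -1 for r in residual]
        # For basis = sign(residual), the least-squares coefficient is the
        # mean absolute value of the residual.
        coeff = sum(abs(r) for r in residual) / len(residual)
        coeffs.append(coeff)
        bases.append(basis)
        residual = [r - coeff * b for r, b in zip(residual, basis)]
    return coeffs, bases


def approx_inner_product(coeffs, bases, x):
    """Approximate w . x by visiting the basis vectors one after another."""
    total = 0.0
    for coeff, basis in zip(coeffs, bases):
        total += coeff * sum(b * v for b, v in zip(basis, x))
    return total
```

When x is itself a {−1, 1} binary feature vector, each inner product with a basis vector involves only ±1 values and could be computed with XOR and bit counting on the corresponding binary codes, leaving one real multiplication per basis vector.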

The above feature amount calculation device may further comprise: a feature amount conversion unit that converts a feature vector determined to be relevant by the feature amount calculation unit, using co-occurrence elements of that feature vector, so as to strengthen its discriminative ability; and a second feature amount calculation unit that determines the relevance between the real vector and the feature vector by sequentially calculating the inner product of the feature vector converted by the feature amount conversion unit with each of a plurality of basis vectors.

With this configuration, a coarse relevance determination that does not use co-occurrence is performed first, and a relevance determination that uses co-occurrence is then performed only for the feature vectors judged to be relevant. This cascade processing accelerates the relevance determination further.
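The cascade described above can be sketched as a two-stage filter. In this illustration the co-occurrence elements are taken to be pairwise products of the {−1, 1} elements of the feature vector, and relevance is judged by thresholding a linear score; both choices are assumptions made for the example and are not specified in this passage. All names are hypothetical.

```python
def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def cooccurrence_expand(x):
    """Append pairwise products x[i] * x[j] (i < j) as co-occurrence elements."""
    pairs = [x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x))]
    return list(x) + pairs


def cascade(candidates, coarse_w, fine_w, coarse_threshold, fine_threshold):
    """Keep candidates that pass the coarse stage and then the fine stage."""
    accepted = []
    for x in candidates:
        if dot(coarse_w, x) < coarse_threshold:
            continue  # rejected cheaply, no co-occurrence expansion needed
        if dot(fine_w, cooccurrence_expand(x)) >= fine_threshold:
            accepted.append(x)
    return accepted
```

The more expensive expansion runs only for candidates that survive the first stage, which is where the additional speedup of a cascade comes from.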

In the above feature amount calculation device, the feature amount calculation unit may cut out feature amounts from each of the pyramid images while sliding a window, and may determine relevance by applying the dictionary set to the feature amounts cut out from the window.

With this configuration, the feature amounts need to be cut out only once even though a plurality of dictionaries is applied, so the processing is simplified.
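A minimal sketch of cutting the window once and reusing it for every dictionary in the set is given below. Several details are assumptions made for illustration: the binarized feature map is a 2-D list of per-cell integer codes, each dictionary is a smaller 2-D template of codes, templates are aligned to the top-left corner of the window, and relevance is measured by Hamming distance (smaller is more relevant). None of these specifics come from this passage.

```python
def popcount(value):
    return bin(value).count("1")


def cut_window(feature_map, top, left, rows, cols):
    """Cut a rows x cols sub-grid out of the binarized feature map."""
    return [row[left:left + cols] for row in feature_map[top:top + rows]]


def template_distance(window, template):
    """Hamming distance over the template footprint (top-left aligned)."""
    return sum(popcount(window[r][c] ^ template[r][c])
               for r in range(len(template))
               for c in range(len(template[0])))


def scan(feature_map, dictionary_set):
    """Return the best (distance, dictionary index, top, left) over all windows."""
    win_rows = max(len(t) for t in dictionary_set)
    win_cols = max(len(t[0]) for t in dictionary_set)
    best = None
    for top in range(len(feature_map) - win_rows + 1):
        for left in range(len(feature_map[0]) - win_cols + 1):
            window = cut_window(feature_map, top, left, win_rows, win_cols)
            for index, template in enumerate(dictionary_set):
                # The same cut-out window is reused for every dictionary.
                candidate = (template_distance(window, template), index, top, left)
                if best is None or candidate < best:
                    best = candidate
    return best
```

The inner loop over the dictionary set touches only data that has already been cut out, so adding a dictionary adds matching work but no additional cutting or binarization.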

上記の特徴量演算装置は、実数を要素として持つ複数の実数ベクトルからなる実数行列を、係数行列と、要素として二値または三値の離散値のみを持つ複数の基底ベクトルからなる基底行列との積に分解する実数行列分解部をさらに備えていてよく、前記辞書は、前記複数の基底行列を用いて生成されていてよく、前記特徴量演算部は、前記特徴量を示す特徴ベクトルと前記複数の実数ベクトルの各々との内積の計算として、前記特徴ベクトルと前記基底行列との積を計算し、さらに当該積と前記係数行列との積を計算して、その結果を用いて、前記複数の実数ベクトルの各々と前記特徴ベクトルとの関連性を判定してよい。 The above-described feature quantity computing device includes a real matrix composed of a plurality of real vectors having real numbers as elements, a coefficient matrix, and a base matrix composed of a plurality of base vectors having only binary or ternary discrete values as elements. A real matrix decomposition unit that decomposes the product into a product, the dictionary may be generated using the plurality of base matrices, and the feature value calculation unit includes the feature vector indicating the feature value and the plurality of feature vectors. Calculating a product of the feature vector and the basis matrix, calculating a product of the product and the coefficient matrix, and using the result to calculate the inner product with each of the real vectors of The relationship between each real vector and the feature vector may be determined.

With this configuration, discretizing the dictionary speeds up the vector operations, so the relevance between the feature quantity and each of the plurality of real vectors can be determined at high speed.
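The two-stage product described above can be sketched in Python as follows. The matrices M and C, the vector x, and the helper matmul are hypothetical values chosen for illustration; here the real matrix Q is constructed to equal M C exactly, whereas an actual decomposition would in general only approximate the real matrix.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hypothetical basis matrix M: 4 rows, 2 basis vectors, elements in {-1, +1} only.
M = [[+1, -1],
     [-1, -1],
     [+1, +1],
     [-1, +1]]

# Hypothetical coefficient matrix C: one column per real vector.
C = [[2, 0, 1],
     [1, 3, -1]]

# Real matrix whose columns play the role of the real vectors (exactly M C here).
Q = matmul(M, C)

# Binary feature vector written as a 1 x 4 row.
x = [[+1, +1, -1, +1]]

direct = matmul(x, Q)             # inner product with every real vector at once
staged = matmul(matmul(x, M), C)  # product with the basis matrix first, then with C

print(direct, staged)
```

Both routes give the same inner products; the staged route touches only discrete values in its first product, which is where the speed-up stated above would come from.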

A feature quantity arithmetic method according to one aspect of the present invention includes: a feature-quantity binarization step of binarizing feature quantities extracted from each image of a pyramid image composed of an input image and a plurality of resized images obtained by enlarging or reducing the input image at a plurality of magnifications; and a feature quantity arithmetic step of applying a dictionary set composed of a plurality of dictionaries of different sizes to the binarized feature quantities and determining the relevance between the input image and the dictionaries. In the feature quantity arithmetic step, a dictionary set in which all or some of the plurality of dictionaries are identical is applied to each image of the pyramid image.

With this configuration as well, the feature quantities extracted from each image of the pyramid image are binarized, and a plurality of dictionaries of different sizes are then applied to each binary feature quantity. Compared with the feature pyramid method, in which a feature quantity is extracted from each image of the pyramid image and a different dictionary is used for each feature quantity, the number of feature extractions is reduced, which lightens the load of the extraction process and speeds it up. Compared with the classifier pyramid method, in which no pyramid image is generated, feature quantities with a plurality of different cell sizes are extracted from the input image, and a different dictionary is used for each feature quantity, the number of binarization operations is reduced, which lightens the load of the binarization process and speeds it up. In other words, in the above configuration the binarized feature quantities are shared among a plurality of dictionaries. The dictionaries differ in their number of cells in order to handle differences in the size of the target whose relevance is to be determined (for example, a pedestrian) relative to the input image, but they all use the same per-cell binarized feature quantities. This reduces the number of feature quantities that must be computed and thereby speeds up the relevance determination.

A feature quantity arithmetic program according to one aspect of the present invention causes a computer to execute: a feature-quantity binarization step of binarizing feature quantities extracted from each image of a pyramid image composed of an input image and a plurality of resized images obtained by enlarging or reducing the input image at a plurality of magnifications; and a feature quantity arithmetic step of applying a dictionary set composed of a plurality of dictionaries of different sizes to the binarized feature quantities and determining the relevance between the input image and the dictionaries. In the feature quantity arithmetic step, a dictionary set in which all or some of the plurality of dictionaries are identical is applied to each image of the pyramid image.

According to the present invention, the number of feature extractions and the number of binarization operations can be reduced, so the relevance determination can be speeded up.

Diagram for explaining the feature pyramid method
Diagram explaining the identification processing of the feature pyramid method
Diagram explaining the identification processing when the above-described speed-up technique based on binarization of feature quantities is applied to the feature pyramid method
Diagram for explaining the classifier pyramid method
Diagram explaining the identification processing of the classifier pyramid method
Diagram explaining the identification processing when the above-described speed-up technique based on binarization of feature quantities is applied to the classifier pyramid method
Diagram for explaining the hybrid pyramid method
Diagram explaining the identification processing of the hybrid pyramid method
Diagram explaining the identification processing of the embodiment of the present invention, in which the speed-up technique based on binarization of feature quantities is applied to the hybrid pyramid method
Diagram showing an example of the elements of a binary feature vector in the first example of the second embodiment of the present invention
Table showing the relationship between XOR and the harmonic mean in the first example of the second embodiment of the present invention
Table showing the XOR of all combinations of elements of a binary feature vector in the first example of the second embodiment of the present invention
Diagram showing the calculation of co-occurrence elements by carry-less rotate shift in the first example of the second embodiment of the present invention
Table showing the XOR of all combinations of elements of a binary feature vector in the first example of the second embodiment of the present invention
Diagram showing the calculation of co-occurrence elements by carry-less rotate shift in the first example of the second embodiment of the present invention
Table showing the XOR of all combinations of elements of a binary feature vector in the first example of the second embodiment of the present invention
Diagram showing the calculation of co-occurrence elements by carry-less rotate shift in the first example of the second embodiment of the present invention
Table showing the XOR of all combinations of elements of a binary feature vector in the first example of the second embodiment of the present invention
Diagram showing the calculation of co-occurrence elements by carry-less rotate shift in the first example of the second embodiment of the present invention
Table showing the XOR of all combinations of elements of a binary feature vector in the first example of the second embodiment of the present invention
Block diagram showing the configuration of the feature quantity conversion device in the first example of the second embodiment of the present invention
Diagram showing the HOG feature quantity for one block of an image and the result of binarizing it in the second example of the second embodiment of the present invention
Diagram explaining the enhancement of feature description capability by multiple thresholds in the second example of the second embodiment of the present invention
Diagram explaining the feature quantity conversion in the second example of the second embodiment of the present invention
Block diagram showing the configuration of the feature quantity conversion device in the second example of the second embodiment of the present invention
Program code of a comparative example
Program code of a working example
Graph showing the relationship between false detections and detection rate when recognition is performed by a recognition device after a recognition model has been generated by learning
Block diagram showing the configuration of the feature quantity arithmetic device in the first example of the third embodiment of the present invention
Diagram showing the decomposition of a real vector in the first example of the third embodiment of the present invention
Diagram showing a calculation example in the second example of the third embodiment of the present invention
Flow diagram of the speed-up of threshold processing by a cascade in the vector operation unit in the third example of the third embodiment of the present invention
Block diagram showing the configuration of the object recognition device in the first application example of the third embodiment of the present invention
Block diagram showing the configuration of the k-means clustering device in the second application example of the third embodiment of the present invention
Diagram showing an example of linear SVMs for identifying a person in an image with a plurality of identification criteria in the fourth embodiment of the present invention
Diagram showing an example of linear SVMs for identifying a person in an image with a plurality of identification criteria in the fourth embodiment of the present invention
Block diagram showing the configuration of the feature quantity arithmetic device in the first example of the fourth embodiment of the present invention
Diagram showing the decomposition of a real matrix in the first example of the fourth embodiment of the present invention
Diagram for explaining the relationship between the real matrix and the basis matrix in the first example of the fourth embodiment of the present invention
Diagram showing a calculation example in the second example of the fourth embodiment of the present invention
Block diagram showing the configuration of the object recognition device in the first application example of the fourth embodiment of the present invention
Diagram showing a rotating road sign and the dictionary and bias for each rotation angle in the first application example of the fourth embodiment of the present invention
Diagram showing the properties of the coefficient matrix in the first application example of the fourth embodiment of the present invention
Graph showing an example of the discriminant function in the first application example of the fourth embodiment of the present invention
Block diagram showing the configuration of the k-means clustering device in the second application example of the fourth embodiment of the present invention
Block diagram showing the processing in the identification device of the first example of the fifth embodiment of the present invention
Block diagram showing the processing in the identification device of the second example of the fifth embodiment of the present invention
Block diagram showing the processing in the identification device of the third example of the fifth embodiment of the present invention
Diagram for explaining a method of extracting HOG feature quantities from an input image

Hereinafter, feature quantity arithmetic devices according to embodiments of the present invention will be described with reference to the drawings. The embodiments described below are merely examples of implementing the present invention and do not limit the invention to the specific configurations described. In implementing the invention, a specific configuration suited to the embodiment may be adopted as appropriate.

1. First Embodiment
Before the feature quantity arithmetic device of this embodiment is described, an overview of its feature extraction and feature computation is given in the same manner as FIGS. 1 to 6. The following description takes as an example a case in which the feature quantity arithmetic device is an identification device, the extracted feature quantities are HOG feature quantities, and the feature computation is identification processing. The identification device serving as the feature quantity arithmetic device of this embodiment employs a hybrid pyramid method that combines the feature pyramid method and the classifier pyramid method described above.

FIG. 7 is a diagram for explaining the hybrid pyramid method. The identification device resizes the input image to L/K different sizes, generating L/K images of different sizes, and extracts HOG feature quantities from each image. For the feature quantities of each level, the identification device performs the feature computation for identification using windows W of K different sizes.

FIG. 8 is a diagram explaining the identification processing of the hybrid pyramid method. When the identification device obtains an input image 10, it reduces the image at a plurality of reduction ratios to generate a plurality of resized (reduced) images 11. Specifically, the identification device generates the pyramid image from resized images successively reduced by powers of two: a 1/2 image obtained by reducing the input image 10 to 1/2, a 1/4 image reduced to 1/4, a 1/8 image reduced to 1/8, and so on.
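As an illustration of the power-of-two reduction described above, the following minimal Python sketch lists the image sizes of such a pyramid. The function name pyramid_sizes, the 640 x 480 input size, and the use of integer halving are assumptions made for this example and are not taken from the specification.

```python
def pyramid_sizes(width, height, levels):
    """Return (width, height) of the input image and its 1/2, 1/4, ... reductions."""
    sizes = []
    for level in range(levels):
        # Level 0 is the input image; each further level halves both dimensions.
        sizes.append((width >> level, height >> level))
    return sizes

sizes = pyramid_sizes(640, 480, 4)
for level, (w, h) in enumerate(sizes):
    print(f"level {level}: {w} x {h}")
```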

Next, the identification device extracts HOG feature quantities from each image of the pyramid image, which consists of the input image and the plurality of resized images (L/K images in total). That is, the identification device performs feature extraction L/K times. Here, K is the octave interval and indicates how many templates are prepared for each image of the pyramid image (the number of frames superimposed on each image in FIG. 7). The identification device stores a dictionary set composed of a plurality of (K) dictionaries of different model sizes and, for each HOG feature quantity, performs the feature computation for identification using the dictionary corresponding to each size at which feature quantities are cut out. In doing so, the identification device uses the same dictionary set for every image of the pyramid image.

FIG. 9 is a diagram explaining the identification processing of the embodiment of the present invention, in which the speed-up technique based on binarization of feature quantities is applied to the hybrid pyramid method. When the identification device obtains an input image 10, it reduces the image at a plurality of reduction ratios to generate a plurality of resized (reduced) images 11. Specifically, the identification device generates the pyramid image from resized images successively reduced by powers of two: a 1/2 image obtained by reducing the input image 10 to 1/2, a 1/4 image reduced to 1/4, a 1/8 image reduced to 1/8, and so on. The identification device may generate resized images not only by reducing the input image but also by enlarging it.

Next, the identification device extracts HOG feature quantities from each image of the pyramid image, which consists of the input image and the plurality of resized images (L/K images in total). That is, the identification device performs feature extraction L/K times. The identification device then binarizes each of the HOG feature quantities of the respective levels extracted from the images of each size. That is, the identification device performs feature binarization L/K times. The identification device stores a dictionary set composed of a plurality of (K) dictionaries of different model sizes and, for each binary HOG feature quantity, performs the feature computation for identification using the dictionary corresponding to each size at which feature quantities are cut out. In doing so, the identification device uses the same dictionary set for every image of the pyramid image.

Thus, even with binarization added, the identification device of this embodiment needs to binarize only as many times as there are pyramid levels (L/K), which speeds up the extraction of binary feature quantities. Although few binarization operations are performed, the full benefit of the faster identification processing that binarization provides is still obtained. Furthermore, the hybrid pyramid reduces the size of the feature quantities, and binarization reduces it further, so the feature quantities consume less memory. Combining the hybrid pyramid method with the binarization-based speed-up of identification processing therefore yields a synergistic effect: the binarization load is kept low while the speed-up from binarization is exploited to the fullest.
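The operation counts stated above can be checked with a short Python sketch of the loop structure. The function name hybrid_pyramid_schedule, the tuple labels, and the values L = 12 and K = 3 are hypothetical; the sketch only enumerates the work items and does not perform any image processing.

```python
def hybrid_pyramid_schedule(num_scales, octave_interval):
    """List the work items for L scales handled with K dictionaries per pyramid image."""
    if num_scales % octave_interval != 0:
        raise ValueError("this sketch assumes L is a multiple of K")
    levels = num_scales // octave_interval  # L/K images in the pyramid
    schedule = []
    for level in range(levels):
        schedule.append(("extract", level))   # HOG extraction, once per image
        schedule.append(("binarize", level))  # binarization, once per image
        for dictionary in range(octave_interval):
            # The same K dictionaries are reused on every pyramid image.
            schedule.append(("apply", level, dictionary))
    return schedule

schedule = hybrid_pyramid_schedule(num_scales=12, octave_interval=3)
counts = {}
for item in schedule:
    counts[item[0]] = counts.get(item[0], 0) + 1
print(counts)
```

With these values the sketch lists 4 extractions and 4 binarizations while still covering 12 dictionary applications, one per scale.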

Take as an example an identification device that detects pedestrians in an input image. In the feature pyramid method, the input image is resized so that the pedestrian to be detected has a specific number of pixels; the number of pixels per cell is then M×M and the number of cells in the dictionary is also fixed. In the classifier pyramid method, the number of pixels per cell is resized to match the pedestrian to be detected, and the number of cells in the dictionary is fixed. In contrast, in the hybrid pyramid method of this embodiment, the input image is resized to some extent to match the pedestrian to be detected so as to generate the pyramid image, and dictionaries with different numbers of cells are prepared and applied to each image of the pyramid image. The feature quantities used for identification with the plurality of dictionaries can thus be shared, which shortens the feature computation time.

In the identification device described above, the same dictionary set is used for every image in the pyramid image. However, the dictionary sets need not be completely identical; the dictionary sets used for the individual images of the pyramid image may be similar to one another. Dictionary sets being similar means that only some of the dictionaries they contain are common to them.

2. Second Embodiment
2-1. Background
Recognition devices that recognize a target by machine learning have been put into practical use in many fields, such as image search, speech recognition, and text search. For this recognition, feature quantities are extracted from information such as images, speech, and text. When a specific target is to be recognized in an image, HOG feature quantities, for example, can be used as the feature quantities of the image (see, for example, Navneet Dalal and Bill Triggs, "Histograms of Oriented Gradients for Human Detection", Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), Volume 1, pages 886-893). Feature quantities are handled in the form of feature vectors so that a computer can process them easily. That is, information such as images, speech, and text is converted into feature vectors for target recognition.

The recognition device recognizes a target by applying the feature vector to a recognition model. For example, the recognition model of a linear classifier is given by equation (1).
$$f(x) = w^{T}x + b \qquad (1)$$
Here, x is the feature vector, w is the weight vector, and b is the bias. Given a feature vector x, the linear classifier performs binary classification according to whether f(x) is greater than or less than zero.
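A linear classifier of this kind can be sketched in a few lines of Python, assuming the usual form in which f(x) is the inner product of w and x plus the bias b. The function name linear_decision, the example values, and the choice to treat a score of exactly zero as the negative class are assumptions made for illustration.

```python
def linear_decision(w, x, b):
    """Return the score f(x) = w . x + b and the resulting class label (+1 or -1)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: a score of exactly zero is assigned to the negative class.
    label = 1 if score > 0 else -1
    return score, label

w = [0.5, -1.0, 2.0]  # hypothetical weight vector
x = [1.0, 1.0, 1.0]   # hypothetical feature vector
b = -1.0              # hypothetical bias

score, label = linear_decision(w, x, b)
print(score, label)
```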

Such a recognition model is determined by training on a large number of feature vectors prepared for learning. In the linear classifier example above, the weight vector w and the bias b are determined using a large number of positive and negative examples as training data. As a concrete method, learning with an SVM (support vector machine), for example, can be used.

A linear classifier is particularly useful because the computation required for learning and classification is fast. However, a linear classifier can perform only linear discrimination (binary classification) and therefore has limited discriminative power. Attempts have thus been made to improve the descriptive power of feature quantities by applying a nonlinear transformation to them in advance. For example, attempts have been made to strengthen discriminative power by using the co-occurrence of feature quantities. The FIND (Feature Interaction Descriptor) feature quantity is one such approach (see, for example, Hui CAO, Koichiro YAMAGUCHI, Mitsuhiko OHTA, Takashi NAITO, and Yoshiki NINOMIYA, "Feature Interaction Descriptor for Pedestrian Detection", IEICE TRANSACTIONS on Information and Systems, Vol. E93-D, No. 9, pp. 2656-2659).

The FIND feature quantity takes the harmonic mean over all combinations of the elements of a feature vector and uses the results as co-occurrence elements, thereby increasing the discriminative power of the feature quantity. Specifically, given a D-dimensional feature vector $x = (x_1, x_2, \ldots, x_D)^T$, the nonlinear computation of equation (2) is performed for every combination of elements.
The FIND feature quantity is then given by $y = (y_{11}, y_{12}, \ldots, y_{DD})^T$.

For example, when the feature vector x is 32-dimensional, the FIND feature quantity with duplicate combinations removed is 528-dimensional. If necessary, y may be normalized to unit length.
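The figure of 528 is consistent with counting each unordered pair of elements once, including the pairing of an element with itself. This reading is an assumption, since the text does not state which combinations are treated as duplicates. Under that assumption, for D = 32:

$$\frac{D(D+1)}{2} = \frac{32 \times 33}{2} = 528$$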

2-2. Overview
However, obtaining the FIND feature quantity requires computing all combinations of the elements of the feature vector, and this computation grows as the square of the number of dimensions. In addition, each element requires a division, which makes the computation extremely slow. Furthermore, the large number of dimensions of the feature quantity increases memory consumption.

In view of the above problems, an object of this embodiment is to provide a feature quantity conversion device that performs nonlinear transformation of a feature quantity at high speed when the feature quantity is binary.

Another object of this embodiment is to provide a feature quantity conversion device that converts a feature vector to binary even when the feature vector is not binary.

A feature quantity conversion device according to a first aspect of this embodiment comprises: a bit rearrangement unit that generates a plurality of rearranged bit strings by rearranging the elements of an input binary feature vector into mutually different orders; a logical operation unit that performs a logical operation between each of the plurality of rearranged bit strings and the input feature vector to generate a plurality of logical-operation bit strings; and a feature integration unit that integrates the generated logical-operation bit strings to generate a nonlinearly transformed feature vector. With this configuration, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and performing logical operations, so the co-occurrence elements can be computed at high speed.

The feature integration unit may further integrate the elements of the input feature vector together with the generated logical-operation bit strings. With this configuration, the elements of the original feature vector are also used, so a nonlinearly transformed feature vector with higher descriptive power can be obtained without increasing the amount of computation.

The logical operation unit may compute the exclusive OR of the rearranged bit string and the input feature vector. The exclusive OR is equivalent to the harmonic mean, and the probabilities of "+1" and "−1" occurring are equal, so this configuration yields co-occurrence elements with a high feature description capability comparable to that of FIND.

The bit rearrangement unit may generate the rearranged bit strings by applying a carry-less rotate shift to the elements of the input feature vector. With this configuration, co-occurrence elements with high feature description capability can be computed efficiently.

When the input feature vector is d-dimensional, the feature quantity conversion device may comprise d/2 bit rearrangement units. With this configuration, each bit rearrangement unit performs a carry-less rotate shift offset by one more bit than the previous one, so the plurality of bit rearrangement units together generate all combinations of the elements of the input feature vector.
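The rotate-and-XOR scheme described above can be sketched in Python as follows. Packing the d-dimensional binary feature vector into a d-bit integer, letting bits 0 and 1 stand for the two binary values, and the function names rotate_left and cooccurrence_bits are all assumptions made for this illustration.

```python
def rotate_left(bits, shift, width):
    """Rotate a width-bit integer left by shift positions without a carry bit."""
    shift %= width
    mask = (1 << width) - 1
    return ((bits << shift) | (bits >> (width - shift))) & mask

def cooccurrence_bits(bits, width):
    """XOR the input with each of its width/2 rotations (shifts 1 .. width/2)."""
    return [bits ^ rotate_left(bits, shift, width)
            for shift in range(1, width // 2 + 1)]

WIDTH = 4       # d, the dimension of the hypothetical feature vector
x = 0b1011      # hypothetical binary feature vector packed into an integer

cooc = cooccurrence_bits(x, WIDTH)
# Integration: keep the original elements together with the co-occurrence words.
feature = [x] + cooc
print([format(word, "04b") for word in feature])
```

Bit p of the word produced with shift s is the XOR of elements p and (p - s) mod d, so shifts 1 through d/2 reach every pair of distinct elements; with the largest shift each pair appears twice.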

The bit rearrangement unit may rearrange the elements of the input feature vector randomly. With this configuration as well, co-occurrence elements with high feature description capability can be computed.

The feature quantity conversion device may comprise a plurality of binarization units that binarize an input real-valued feature vector to generate the binary feature vectors, and a plurality of co-occurrence element generation units corresponding to the respective binarization units. Each of the co-occurrence element generation units comprises the plurality of bit rearrangement units and the plurality of logical operation units and receives the binary feature vector from the corresponding binarization unit. The feature integration unit may integrate all of the logical-operation bit strings generated by the logical operation units of the co-occurrence element generation units to generate the nonlinearly transformed feature vector. With this configuration, a binary feature vector with high feature description capability can be obtained at high speed even when the elements of the feature vector are real numbers.
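The arrangement of several binarization units feeding separate co-occurrence element generation units can be sketched in Python as below. The text does not state how the binarization units differ from one another; this sketch assumes each uses a different threshold. The function names, the thresholds 0.2 and 0.5, and the input values are hypothetical.

```python
def binarize(values, threshold):
    """One binarization unit: 1 where the value exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

def rotate(bits, shift):
    """Carry-less rotation of a list of bits."""
    shift %= len(bits)
    return bits[shift:] + bits[:shift]

def cooccurrence(bits):
    """One co-occurrence element generation unit: XOR with each of d/2 rotations."""
    out = []
    for shift in range(1, len(bits) // 2 + 1):
        out.extend(a ^ b for a, b in zip(bits, rotate(bits, shift)))
    return out

def convert(values, thresholds):
    """Feature integration: concatenate the bit strings from every unit."""
    feature = []
    for threshold in thresholds:
        feature.extend(cooccurrence(binarize(values, threshold)))
    return feature

feature = convert([0.1, 0.6, 0.3, 0.9], thresholds=[0.2, 0.5])
print(feature)
```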

前記二値の特徴ベクトルはＨＯＧ特徴量を二値化して得られた特徴ベクトルであってよい。 The binary feature vector may be a feature vector obtained by binarizing the HOG feature value.

The feature quantity conversion device according to the second aspect of the present embodiment includes: a bit rearrangement unit that rearranges the elements of an input binary feature vector to generate a rearranged bit string; a logical operation unit that performs a logical operation between the rearranged bit string and the input feature vector to generate a logical operation bit string; and a feature integration unit that integrates the elements of the feature vector with the generated logical operation bit string to generate a nonlinear transformation feature vector. With this configuration as well, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and applying logical operations, so they can be computed at high speed.

The feature quantity conversion device according to the third aspect of the present embodiment includes: a plurality of bit rearrangement units that generate rearranged bit strings by rearranging the elements of an input binary feature vector into mutually different orders; a logical operation unit that performs a logical operation between the rearranged bit strings generated by the plurality of bit rearrangement units to generate a logical operation bit string; and a feature integration unit that integrates the elements of the feature vector with the generated logical operation bit strings to generate a nonlinear transformation feature vector. With this configuration as well, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and applying logical operations, so they can be computed at high speed.

本実施の形態の第四の態様の特徴量変換装置は、入力された二値の特徴ベクトルの要素をそれぞれ異なる配列に再配列した再配列ビット列を生成する複数のビット再配列部と、前記複数のビット再配列部にて生成されたそれぞれの前記再配列ビット列どうしの論理演算を行って、それぞれ論理演算ビット列を生成する複数の論理演算部と、生成された複数の前記論理演算ビット列を統合して、非線形変換特徴ベクトルを生成する特徴統合部とを備えた構成を有している。この構成によっても、入力された特徴ベクトルの共起要素を、入力された特徴ベクトルの再配列と論理演算によって算出するので、共起要素の演算を高速にできる。 The feature amount conversion apparatus according to the fourth aspect of the present embodiment includes a plurality of bit rearrangement units that generate rearranged bit strings in which elements of input binary feature vectors are rearranged in different arrays, A plurality of logical operation units that perform logical operations between the respective rearranged bit sequences generated by the bit rearrangement unit to generate logical operation bit sequences, and integrate the generated plurality of logical operation bit sequences. And a feature integration unit for generating a nonlinear transformation feature vector. Also with this configuration, since the co-occurrence elements of the input feature vectors are calculated by rearranging the input feature vectors and logical operations, the calculation of the co-occurrence elements can be performed at high speed.

The learning device of the present embodiment includes the feature quantity conversion device described above and a learning unit that performs learning using the nonlinear transformation feature vector generated by the feature quantity conversion device. With this configuration as well, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and applying logical operations, so they can be computed at high speed.

The recognition device of the present embodiment includes the feature quantity conversion device described above and a recognition unit that performs recognition using the nonlinear transformation feature vector generated by the feature quantity conversion device. With this configuration as well, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and applying logical operations, so they can be computed at high speed.

In the recognition device above, when computing the inner product of the weight vector used for recognition and the nonlinear transformation feature vector, the recognition unit may process the elements in descending order of distribution width or in descending order of entropy, and may terminate the inner product computation once it can be determined that the inner product will be larger than, or smaller than, a predetermined recognition threshold. This configuration speeds up the recognition process.
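The early termination described above can be sketched as follows. This is only an illustration under stated assumptions: the features are taken to be +1/−1, the visiting order `order` is supplied by the caller (for example, sorted by entropy as the text suggests), and the stopping test uses the sum of the absolute values of the unvisited weights as a bound. The text does not specify how the "can be determined" test is implemented, so that bound and the function name `early_terminated_decision` are hypothetical.

```python
def early_terminated_decision(weights, features, order, threshold):
    """Decide whether w.x exceeds `threshold`, stopping as soon as the sign is certain.

    Assumes every feature is +1 or -1, so each unvisited term can change the
    partial sum by at most abs(weight). Returns (decision, terms_used).
    """
    # Largest amount the still-unvisited terms could add or subtract.
    remaining = sum(abs(weights[i]) for i in order)
    partial = 0
    for used, i in enumerate(order, start=1):
        partial += weights[i] * features[i]
        remaining -= abs(weights[i])
        if partial - remaining > threshold:   # cannot fall back below the threshold
            return True, used
        if partial + remaining < threshold:   # cannot climb back above the threshold
            return False, used
    return partial > threshold, len(order)


decision, used = early_terminated_decision([5, 1, 1, 1], [1, -1, 1, -1], [0, 1, 2, 3], 0)
print(decision, used)
```

In this toy call the first weight dominates the other three combined, so the loop stops after one term. How much work is saved in practice depends on the weights and on the chosen order.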

本実施の形態の特徴量変換プログラムは、コンピュータを、入力された二値の特徴ベクトルの要素をそれぞれ異なる配列に再配列してそれぞれ再配列ビット列を生成する複数のビット再配列部、前記複数の再配列ビット列の各々と入力された前記特徴ベクトルとの論理演算をそれぞれ行って、それぞれ論理演算ビット列を生成する複数の論理演算部、及び生成された複数の前記論理演算ビット列を統合して、非線形変換特徴ベクトルを生成する特徴統合部として機能させる。この構成によっても、入力された特徴ベクトルの共起要素を、入力された特徴ベクトルの再配列と論理演算によって算出するので、共起要素の演算を高速にできる。 The feature amount conversion program according to the present embodiment includes a plurality of bit rearrangement units that generate a rearranged bit string by rearranging the elements of input binary feature vectors into different arrays, respectively, A logical operation of each of the rearranged bit strings and the input feature vector is performed, and a plurality of logical operation units each generating a logical operation bit string and a plurality of the generated logical operation bit strings are integrated to be nonlinear It is made to function as a feature integration unit that generates a transformed feature vector. Also with this configuration, since the co-occurrence elements of the input feature vectors are calculated by rearranging the input feature vectors and logical operations, the calculation of the co-occurrence elements can be performed at high speed.

2-3. Effects
According to the present embodiment, the co-occurrence elements of the input feature vector are computed by rearranging the input feature vector and applying logical operations, so the co-occurrence elements can be computed at high speed.

2-4. First Example of the Second Embodiment
As described in the first embodiment, the feature quantity conversion device of the first example extracts HOG features by the hybrid pyramid method and binarizes the extracted HOG features. Given a feature vector that is a binary HOG feature, the device applies a nonlinear transformation to it to obtain a feature vector with improved discriminative power (hereinafter, the "nonlinear transformation feature vector"). For example, when a region of 8 × 8 pixels is defined as a cell, the HOG feature is obtained as a 32-dimensional vector for each block consisting of 2 × 2 cells. In this example, this HOG feature is assumed to be available as a binarized vector. Before the configuration of the device is described, the principle of applying a nonlinear transformation to a binary feature vector to obtain a nonlinear transformation feature vector with co-occurrence elements equivalent to those of FIND is explained.

図１０は、二値の特徴ベクトルの要素の例を示す図である。特徴ベクトルの各要素は、「＋１」か「−１」の値をとる。図１０において、縦軸は各要素の値を示しており、横軸は要素数（次元数）を示している。図１０の例では、要素数は３２である。 FIG. 10 is a diagram illustrating an example of elements of a binary feature vector. Each element of the feature vector takes a value of “+1” or “−1”. In FIG. 10, the vertical axis indicates the value of each element, and the horizontal axis indicates the number of elements (number of dimensions). In the example of FIG. 10, the number of elements is 32.

To obtain FIND features, the harmonic mean given by Equation (3) is computed from these elements.
[Equation (3): harmonic mean of a and b]
Here, a and b are the values of the elements ("+1" or "−1"). Because each of a and b is either "+1" or "−1", only four combinations are possible. Therefore, when the elements of the feature vector take the two values "+1" and "−1", this harmonic mean is equivalent to XOR.

図１１は、ＸＯＲと調和平均との関係を示す表である。図１１に示すように、ＸＯＲと調和平均との関係は、（−１／２）×ＸＯＲ＝調和平均という関係にある。よって、「＋１」及び「−１」に二値化された特徴量については、それらのすべての組み合わせの調和平均を求める代わりに、それらのすべての組み合わせのＸＯＲを求めても、ＦＩＮＤ特徴量と同等に識別力が向上した特徴量に変換できる。そこで、本例の特徴量変換装置は、「＋１」及び「−１」の値をとる二値の特徴ベクトルに対して、それらの組み合わせのＸＯＲをとることで、識別力を向上させる。 FIG. 11 is a table showing the relationship between XOR and harmonic average. As shown in FIG. 11, the relationship between XOR and the harmonic average is (−½) × XOR = harmonic average. Therefore, for the feature values binarized to “+1” and “−1”, instead of obtaining the harmonic average of all the combinations thereof, the FIND feature amount and the XOR of all the combinations are obtained. It can be converted into a feature quantity with improved discrimination power. Therefore, the feature quantity conversion apparatus of this example improves the discriminating power by taking the XOR of the combination of binary feature vectors having values of “+1” and “−1”.
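The relationship stated for FIG. 11 can be tabulated with a few lines of code. The sketch below assumes the +1/−1 encoding used in the text, in which the XOR of two equal elements is −1 and the XOR of two different elements is +1. Equation (3) itself is not reproduced here, so the code only evaluates the right-hand side (−1/2) × XOR that the text equates with the harmonic mean; `xor_pm1` is a hypothetical helper name.

```python
def xor_pm1(a, b):
    """XOR on the +1/-1 encoding: -1 when the inputs agree, +1 when they differ."""
    return -1 if a == b else +1


# Value that the text equates with the harmonic mean, for all four input combinations.
table = {(a, b): -0.5 * xor_pm1(a, b) for a in (+1, -1) for b in (+1, -1)}

for (a, b), value in table.items():
    print(f"a={a:+d} b={b:+d} XOR={xor_pm1(a, b):+d} (-1/2)*XOR={value:+.1f}")
```

In this encoding XOR equals −a·b, so the tabulated value depends only on whether the two elements agree. A single bitwise operation therefore carries the same information as the arithmetic expression.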

FIG. 12 is a table showing the XOR of every pair of elements of a binary feature vector whose elements take the values "+1" and "−1". For simplicity, FIG. 12 shows the case where the binary feature vector has 8 dimensions. The sequences in the first row and the first column are the feature vector. In the example of FIG. 12, the feature vector is (+1, +1, −1, −1, +1, +1, −1, −1).

式（３）から明らかなように、ａとｂとはこれを入れ替えても調和平均は変わらないため、図１２の表の太線で囲った部分が、この特徴ベクトルの要素のすべての組み合わせのＸＯＲのうちの重複部分を除いた部分となる。よって、本例では、この部分を共起要素として採用する。なお、同じ要素同士によるＸＯＲは必ず「−１」となるので、本例ではこれらを共起要素として採用しない。 As is clear from the equation (3), since the harmonic mean does not change even if a and b are interchanged, the portion surrounded by the thick line in the table of FIG. 12 is the XOR of all combinations of the elements of this feature vector. It becomes a part except the duplication part. Therefore, in this example, this part is adopted as a co-occurrence element. In addition, since XOR by the same elements is always “−1”, these are not adopted as co-occurrence elements in this example.

本例のもとの特徴ベクトルの要素と、図１２の太線で囲った部分の要素（共起要素）とを並べるとＦＩＮＤ相当の特徴量が得られる。このとき、もとの特徴ベクトルにキャリーなしローテートシフトを行って各要素同士のＸＯＲを計算することで、高速に共起要素を計算できる。 When the elements of the original feature vector in this example and the elements (co-occurrence elements) of the portion surrounded by the thick line in FIG. 12 are arranged, a feature amount equivalent to FIND is obtained. At this time, a co-occurrence element can be calculated at high speed by performing a rotation shift without carry on the original feature vector and calculating the XOR of each element.

図１３は、キャリーなしローテートシフトによる共起要素の計算を示す図である。もとの特徴ベクトルのビット列１００を右に１ビットシフトして、最右のビットは１ビット目（最左）に持ってくることでキャリーなしローテートシフトを行って、再配列ビット列１０１を用意する。ビット列１００と再配列ビット列１０１のＸＯＲをとると、論理演算ビット列１０２が得られる。この論理演算ビット列１０２が共起要素となる。 FIG. 13 is a diagram illustrating calculation of co-occurrence elements by a rotate shift without carry. The bit string 100 of the original feature vector is shifted to the right by 1 bit, and the rightmost bit is brought to the first bit (leftmost) to perform a rotation shift without carry to prepare the rearranged bit string 101. . When the bit string 100 and the rearranged bit string 101 are XORed, a logical operation bit string 102 is obtained. This logical operation bit string 102 becomes a co-occurrence element.
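A single rotate-and-XOR step of this kind can be sketched as below. The sketch uses the example vector given for FIG. 12; whether FIG. 13 uses the same values is not stated in the text, so that choice is an assumption, and the helper names `rotate_right` and `xor_pm1` are hypothetical.

```python
def rotate_right(bits, k):
    """Carry-less rotate: shift right by k; elements pushed off the right re-enter on the left."""
    k %= len(bits)
    return bits[-k:] + bits[:-k] if k else list(bits)


def xor_pm1(a, b):
    """XOR on the +1/-1 encoding: -1 when the inputs agree, +1 when they differ."""
    return -1 if a == b else +1


original = [+1, +1, -1, -1, +1, +1, -1, -1]   # plays the role of bit string 100
rearranged = rotate_right(original, 1)        # plays the role of rearranged bit string 101
co_occurrence = [xor_pm1(a, b) for a, b in zip(original, rearranged)]  # bit string 102

print(rearranged)
print(co_occurrence)
```

Each output position pairs an element with its left neighbour, and the first position pairs the first element with the last one because of the wrap-around.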

図１４に再び二値の特徴ベクトルのすべて要素の組み合わせのＸＯＲを示す。図１３の論理演算ビット列１０２は、図１４において太枠で囲った部分に相当する。要素Ｅ８１は、要素Ｅ１８と同じである。 FIG. 14 again shows XOR of combinations of all elements of the binary feature vector. The logical operation bit string 102 in FIG. 13 corresponds to a portion surrounded by a thick frame in FIG. Element E81 is the same as element E18.

図１５は、キャリーなしローテートシフトによる共起要素の計算を示す図である。もとの特徴ベクトルのビット列１００を右に２ビットシフトして、最右の２ビットは１ビット目及び２ビット目にシフトすることでキャリーなしローテートシフトを行って、再配列ビット列２０１を用意する。ビット列１００と再配列ビット列２０１のＸＯＲをとると、論理演算ビット列２０２が得られる。この論理演算ビット列２０２が共起要素となる。 FIG. 15 is a diagram illustrating calculation of co-occurrence elements by a rotate shift without carry. The original feature vector bit string 100 is shifted to the right by 2 bits, and the rightmost 2 bits are shifted to the first and second bits to perform a carry-less rotate shift to prepare a rearranged bit string 201. . When the bit string 100 and the rearranged bit string 201 are XORed, a logical operation bit string 202 is obtained. This logical operation bit string 202 becomes a co-occurrence element.

図１６に二値の特徴ベクトルのすべて要素の組み合わせのＸＯＲを示す。図１５の論理演算ビット列２０２は、図１６において太枠で囲った部分に相当する。要素Ｅ７１、Ｅ８２は、要素Ｅ１７、Ｅ２８とそれぞれ同じである。 FIG. 16 shows XOR of combinations of all elements of the binary feature vector. The logical operation bit string 202 in FIG. 15 corresponds to a portion surrounded by a thick frame in FIG. Elements E71 and E82 are the same as elements E17 and E28, respectively.

図１７は、キャリーなしローテートシフトによる共起要素の計算を示す図である。もとの特徴ベクトルのビット列１００を右に３ビットシフトして、最右の３ビットは１ビット目２ビット目、及び３ビット目にシフトすることでキャリーなしローテートシフトを行って、再配列ビット列３０１を用意する。ビット列１００と再配列ビット列３０１のＸＯＲをとると、論理演算ビット列３０２が得られる。この論理演算ビット列３０２が共起要素となる。 FIG. 17 is a diagram illustrating calculation of co-occurrence elements by a rotate shift without carry. The bit string 100 of the original feature vector is shifted to the right by 3 bits, and the rightmost 3 bits are shifted to the first bit, the second bit, and the third bit to perform a rotation shift without carry, and the rearranged bit string 301 is prepared. When the bit string 100 and the rearranged bit string 301 are XORed, a logical operation bit string 302 is obtained. This logical operation bit string 302 becomes a co-occurrence element.

図１８に二値の特徴ベクトルのすべて要素の組み合わせのＸＯＲを示す。図１７の論理演算ビット列３０２は、図１８において太枠で囲った部分に相当する。要素Ｅ６１、Ｅ７２、Ｅ８３は、要素Ｅ１６、Ｅ２７、Ｅ３８とそれぞれ同じである。 FIG. 18 shows XOR of combinations of all elements of the binary feature vector. The logical operation bit string 302 in FIG. 17 corresponds to a portion surrounded by a thick frame in FIG. Elements E61, E72, and E83 are the same as elements E16, E27, and E38, respectively.

図１９は、キャリーなしローテートシフトによる共起要素の計算を示す図である。もとの特徴ベクトルのビット列１００を右に４ビットシフトして、右側の４ビットは１ビット目、２ビット目、３ビット目、４ビット目にシフトすることでキャリーなしローテートシフトを行って、再配列ビット列４０１を用意する。ビット列１００と再配列ビット列４０１のＸＯＲをとると、論理演算ビット列４０２が得られる。この論理演算ビット列４０２が共起要素となる。 FIG. 19 is a diagram illustrating calculation of co-occurrence elements by a rotate shift without carry. The original feature vector bit string 100 is shifted 4 bits to the right, and the right 4 bits are shifted to the 1st bit, 2nd bit, 3rd bit, 4th bit to perform a rotation without carry, A rearranged bit string 401 is prepared. When the bit string 100 and the rearranged bit string 401 are XORed, a logical operation bit string 402 is obtained. This logical operation bit string 402 becomes a co-occurrence element.

FIG. 20 shows the XOR of every pair of elements of the binary feature vector. The logical operation bit string 402 of FIG. 19 corresponds to the portion enclosed by the thick frame in FIG. 20. Elements E51, E62, E73, and E84 are identical to elements E15, E26, E37, and E48, respectively, so one of each pair is unnecessary; for computational convenience, however, they are used as they are.

Performing the computations of FIGS. 13, 15, 17, and 19 yields all the elements in the region enclosed by the thick line in FIG. 12. That is, the co-occurrence elements of an 8-bit feature vector are obtained with four carry-less rotate shifts and XOR operations. Similarly, when the binary feature vector has 32 bits (dimensions), they are obtained with 16 carry-less rotate shifts and XOR operations; in general, when the binary feature vector has d bits (dimensions), d/2 carry-less rotate shifts and XOR operations suffice.
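The claim that d/2 rotate shifts reach every pair of elements can be checked by enumerating index pairs. The sketch below assumes an even d and 0-based positions; `pairs_for_shift` and `covered_pairs` are hypothetical helper names.

```python
from itertools import combinations


def pairs_for_shift(d, k):
    """Index pairs XORed together when a d-element vector is rotated right by k.

    After a right rotation by k, position i holds the element that was at (i - k) mod d.
    """
    return [frozenset((i, (i - k) % d)) for i in range(d)]


def covered_pairs(d):
    """All distinct index pairs reached by the shifts 1, 2, ..., d/2."""
    covered = set()
    for k in range(1, d // 2 + 1):
        covered.update(pairs_for_shift(d, k))
    return covered


all_pairs_8 = {frozenset(p) for p in combinations(range(8), 2)}
print(covered_pairs(8) == all_pairs_8)
print(len(set(pairs_for_shift(8, 4))))
```

For d = 8, shifts 1 to 3 each contribute 8 distinct pairs and shift 4 contributes only 4, because every pair appears twice in that last shift. This matches the duplication noted for FIG. 20 and totals the 28 unordered pairs.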

The feature quantity conversion device appends the elements of the original feature vector to the co-occurrence elements obtained as described above to obtain the nonlinear transformation feature vector. Thus, transforming a 32-dimensional binary feature vector yields a nonlinear transformation feature vector of 32 × 16 + 32 = 544 dimensions. The configuration of a feature quantity conversion device that implements this transformation is described below.

図２１は、本例の特徴量変換装置の構成を示すブロック図である。特徴量変換装置１０１は、Ｎ個のビット再配列器１１１〜１１Ｎと、ビット再配列器と同数（Ｎ個）の論理演算器１２１〜１２Ｎと、特徴量統合器１３０を備えている。これらのビット再配列器１１１〜１１Ｎ、論理演算器１２１〜１２Ｎ、及び特徴量統合器１３０の一部又は全部は、コンピュータが特徴量変換プログラムを実行することによって実現されてよく、又はハードウェアによって実現されてもよい。 FIG. 21 is a block diagram showing the configuration of the feature quantity conversion apparatus of this example. The feature amount conversion apparatus 101 includes N bit rearrangers 111 to 11N, the same number (N) of logical operation units 121 to 12N as the bit rearrangers, and a feature amount integrator 130. A part or all of these bit rearranging units 111 to 11N, logical operation units 121 to 12N, and feature amount integrator 130 may be realized by a computer executing a feature amount conversion program, or by hardware. It may be realized.

In this example, a binarized feature vector is input to the feature quantity conversion device 101 as the feature to be converted. The feature vector is input to each of the N bit rearrangers 111 to 11N and each of the N logical operators 121 to 12N. Each of the N logical operators 121 to 12N additionally receives the output of its corresponding bit rearranger 111 to 11N.

ビット再配列器１１１〜１１Ｎは、入力された二値の特徴ベクトルに対して、キャリーなしローテートシフトによる再配列を行って、再配列ビット列を生成する。具体的には、ビット再配列器１１１は、特徴ベクトルを右に１ビットのキャリーなしローテートシフトを行い、ビット再配列器１１２は、特徴ベクトルを右に２ビットのキャリーなしローテートシフトを行い、ビット再配列器１１３は特徴ベクトルを右に３ビットのキャリーなしローテートシフトを行い、ビット再配列器１１Ｎは特徴ベクトルを右にＮビットのキャリーなしローテートシフトを行う。 The bit reordering units 111 to 11N reorder the input binary feature vectors by a carryless rotation shift to generate a rearranged bit string. Specifically, the bit reorderer 111 performs a 1-bit carryless rotate shift to the right of the feature vector, and the bit reorderer 112 performs a 2-bit carryless rotate shift to the right of the feature vector. The rearranger 113 performs a 3-bit carry-less rotate shift to the right of the feature vector, and the bit rearranger 11N performs an N-bit carry-less rotate shift to the right.

本例では、入力される二値の特徴ベクトルをｄ次元とすると、Ｎ＝ｄ／２とする。これにより、特徴ベクトルのすべての要素のすべての組み合わせについてＸＯＲを計算することができる。 In this example, if the input binary feature vector is d-dimensional, N = d / 2. Thereby, XOR can be calculated for all combinations of all elements of the feature vector.

論理演算器１２１〜１２Ｎは、それぞれ対応するビット再配列器１１１〜１１Ｎから出力された再配列ビット列ともとの特徴ベクトルのビット列とのＸＯＲを計算する。具体的には、論理演算器１２１は、ビット再配列器１１１から出力された再配列ビット列ともとの特徴ベクトルのビット列とのＸＯＲを計算し（図１３参照）、論理演算器１２２は、ビット再配列器１１２から出力された再配列ビット列ともとの特徴ベクトルのビット列とのＸＯＲを計算し（図１５参照）、論理演算器１２３は、ビット再配列器１１３から出力された再配列ビット列ともとの特徴ベクトルのビット列とのＸＯＲを計算し（図１７参照）、論理演算器１２Ｎは、ビット再配列器１１Ｎから出力された再配列ビット列ともとの特徴ベクトルのビット列とのＸＯＲを計算する。 The logical operation units 121 to 12N calculate XOR between the rearranged bit sequence output from the corresponding bit rearrangement unit 111 to 11N and the bit sequence of the original feature vector. Specifically, the logical operator 121 calculates the XOR between the rearranged bit string output from the bit rearranger 111 and the bit string of the original feature vector (see FIG. 13), and the logical operator 122 The XOR of the rearranged bit string output from the arrayer 112 and the bit string of the original feature vector is calculated (see FIG. 15), and the logical operator 123 calculates the original value of the rearranged bit string output from the bit rearranger 113. The XOR with the bit string of the feature vector is calculated (see FIG. 17), and the logical operator 12N calculates the XOR with the bit string of the original feature vector and the rearranged bit string output from the bit rearranger 11N.

The feature integrator 130 concatenates the original feature vector with the outputs (logical operation bit strings) of the logical operators 121 to 12N to generate a nonlinear transformation feature vector having them as elements. As noted above, when the input feature vector is 32-dimensional, the nonlinear transformation feature vector generated by the feature integrator 130 is 544-dimensional.
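The three stages of the device (bit rearrangers, logical operators, feature integrator) can be sketched on bit-packed integers. This is a minimal illustration, not the implementation of the device: it assumes that "+1" is stored as bit 1 and "−1" as bit 0, and the function names are hypothetical.

```python
def rotate_right(x, k, d):
    """Rotate the low d bits of x right by k positions without carry."""
    mask = (1 << d) - 1
    k %= d
    return ((x >> k) | (x << (d - k))) & mask


def nonlinear_transform(x, d):
    """Return [x, x ^ rot(x, 1), ..., x ^ rot(x, d/2)] as a list of d-bit words."""
    words = [x]                                   # elements of the original feature vector
    for k in range(1, d // 2 + 1):                # one rearranger/operator pair per shift
        words.append(x ^ rotate_right(x, k, d))   # co-occurrence elements for shift k
    return words


def integrate(words, d):
    """Concatenate the d-bit words into one integer (the role of the feature integrator)."""
    out = 0
    for w in words:
        out = (out << d) | w
    return out


words = nonlinear_transform(0xDEADBEEF, 32)
print(len(words) * 32)   # total number of bits in the transformed vector
```

Each XOR here processes all d element pairs of one shift in a single integer operation, which is where the speed-up over element-by-element arithmetic comes from.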

以上のように、本例の特徴量変換装置１０１によれば、二値化された特徴ベクトルの要素にそれらの共起要素（論理演算ビット列の要素）を付け足して特徴ベクトルの次元を増加させるので、特徴ベクトルの識別力を向上できる。 As described above, according to the feature value conversion apparatus 101 of this example, the dimension of the feature vector is increased by adding those co-occurrence elements (elements of logical operation bit strings) to the binarized feature vector elements. Thus, the discriminating power of feature vectors can be improved.

In addition, because the elements of the original feature vector are "+1" and "−1", using their harmonic mean as the co-occurrence element, as in the FIND feature, is equivalent to using their XOR. Exploiting this equivalence, the feature quantity conversion device 101 of this example computes the XOR of all combinations of elements and uses the results as co-occurrence elements, so the co-occurrence elements can be computed at high speed.

Furthermore, to compute the XOR of the elements, the feature quantity conversion device 101 of this example XORs the bit string of the original feature vector with a bit string obtained by applying a carry-less rotate shift to it. When the number of bits in the original feature vector (the number of XOR operations) does not exceed the register width of the computer, these XOR operations can be performed simultaneously, so the co-occurrence elements can be computed at high speed.

2-5. Second Example of the Second Embodiment
Next, as a second example, a feature quantity conversion device is described that, when the HOG feature is obtained as a real-valued vector rather than a binary vector, converts it into a binary vector with high discriminative power.

図２２は、画像の１ブロック分のＨＯＧ特徴量とそれを二値化した結果を示す図である。本例のＨＯＧ特徴量は、３２次元の特徴ベクトルとして得られる。図２２の上段は、この特徴ベクトルの各要素を示しており、縦軸は各要素の大きさ、横軸は要素数を示している。 FIG. 22 is a diagram showing the HOG feature amount for one block of an image and the result of binarizing it. The HOG feature amount of this example is obtained as a 32-dimensional feature vector. The upper part of FIG. 22 shows each element of the feature vector, the vertical axis indicates the size of each element, and the horizontal axis indicates the number of elements.

各要素は、二値化されて、下段の二値化された特徴ベクトルが得られる。具体的には、各要素のレンジの所定の位置に二値化のための閾値を設け、要素の値が設定された閾値以上である場合は、その要素を「＋１」とし、要素の値が設定された閾値より小さい場合は、その要素を「−１」とする。なお、各要素のレンジはそれぞれ異なるため、要素ごとに異なる閾値（３２種類）が設定される。特徴ベクトルの３２個の実数の要素をそれぞれ二値化することで、３２個の要素を持つ二値化された特徴ベクトル（３２ビット）に変換できる。 Each element is binarized to obtain the lower binarized feature vector. Specifically, a threshold for binarization is set at a predetermined position in the range of each element, and when the value of the element is equal to or greater than the set threshold, the element is set to “+1”, and the value of the element is If it is smaller than the set threshold, the element is set to “−1”. Since the range of each element is different, different threshold values (32 types) are set for each element. By binarizing each of the 32 real elements of the feature vector, it can be converted into a binarized feature vector (32 bits) having 32 elements.

ここで、多重閾値を用いることによって、特徴ベクトルの特徴記述能力を強化（情報量を増大）させることができる。すなわち、ｋ種類の異なる閾値を設定して、各閾値について、図２２に示した二値化を行うことで二値化された特徴ベクトルの次元数を増やすことが可能である。 Here, the feature description capability of the feature vector can be enhanced (the amount of information increased) by using the multiple threshold. That is, by setting k different thresholds and performing binarization shown in FIG. 22 for each threshold, the number of dimensions of the binarized feature vector can be increased.

FIG. 23 illustrates how multiple thresholds enhance feature description capability. In this example, binarization is performed with four thresholds. Each element of the 32-dimensional real-valued vector is binarized using the 20% position of its range as the threshold, producing 32 bits of elements. Likewise, each element is binarized using the 40%, 60%, and 80% positions of its range as thresholds, producing 32 bits of elements each time. Integrating these elements yields a binarized 128-dimensional feature vector (128 bits).
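The multi-threshold binarization of FIG. 23 can be sketched as follows. The per-element ranges are assumed to be known in advance (the text does not say how they are obtained), and an element at or above the threshold maps to +1 as described earlier; `binarize_multi` is a hypothetical name.

```python
def binarize_multi(values, ranges, fractions=(0.2, 0.4, 0.6, 0.8)):
    """Binarize each real element once per threshold and concatenate the results.

    values: real-valued elements; ranges: one (low, high) pair per element.
    Returns a list of +1/-1 of length len(fractions) * len(values).
    """
    bits = []
    for f in fractions:
        for v, (low, high) in zip(values, ranges):
            threshold = low + f * (high - low)   # position within this element's range
            bits.append(+1 if v >= threshold else -1)
    return bits


values = [0.5] * 32
ranges = [(0.0, 1.0)] * 32
bits = binarize_multi(values, ranges)
print(len(bits))
```

With every element at the middle of its range, the 20% and 40% passes give +1 and the 60% and 80% passes give −1, which also illustrates how the four 32-bit groups are correlated with one another.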

When the feature vector is given as a real-valued vector, multi-threshold binarization as shown in FIG. 23 can first be applied to improve its feature description capability, and the feature quantity conversion device 101 described in the first example can then apply the nonlinear transformation to further increase the amount of information.

ここで、ＨＯＧ特徴量の二値化を高速化する工夫について説明する。一般に、ＨＯＧ特徴量はブロック単位で長さを１に正規化しなければならない。この正規化によって、明るさに対して頑健（ロバスト）になるからである。 Here, a device for speeding up the binarization of the HOG feature value will be described. In general, the length of the HOG feature must be normalized to 1 in block units. This is because the normalization makes it robust against the brightness.

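Let the 32-dimensional real-valued HOG feature before normalization be
[expression not reproduced in this text]
and let the 32-dimensional real-valued HOG feature after normalization be
[expression not reproduced in this text]
Then the following holds:
[equation not reproduced in this text]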

Let the 32-dimensional HOG feature after binarization be
[expression not reproduced in this text]
Then the following holds:
[equation not reproduced in this text]

This binarization is very slow because it requires one square-root operation and one division. Noting that HOG features are non-negative, both sides of the above inequality
[inequality not reproduced in this text]
are squared and the denominator of the left-hand side is moved to the right-hand side, which yields the following expression.
[equation not reproduced in this text]

With this transformation, the real-valued HOG feature can be binarized by the following expression without any square-root operation or division.
[equation not reproduced in this text]
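The square-root-free binarization can be sketched in code. The symbols are assumptions, since the original expressions are not reproduced here: write u_i for the unnormalized elements, v_i = u_i / sqrt(sum_j u_j^2) for the elements normalized to unit length, and t_i for the per-element thresholds. With u_i and t_i non-negative, the test v_i >= t_i is equivalent to u_i^2 >= t_i^2 * sum_j u_j^2. The function names are hypothetical.

```python
import math


def binarize_normalized(u, thresholds):
    """Reference version: normalize to unit length, then compare with each threshold."""
    norm = math.sqrt(sum(x * x for x in u))   # one square root
    return [+1 if x / norm >= t else -1 for x, t in zip(u, thresholds)]


def binarize_fast(u, thresholds):
    """Same decision without square root or division.

    Valid when all elements of u and all thresholds are non-negative, so that
    squaring both sides preserves the inequality.
    """
    energy = sum(x * x for x in u)
    return [+1 if x * x >= t * t * energy else -1 for x, t in zip(u, thresholds)]


u = [3.0, 4.0, 0.0, 1.0]
thresholds = [0.5, 0.9, 0.1, 0.2]
print(binarize_normalized(u, thresholds))
print(binarize_fast(u, thresholds))
```

The squared thresholds can be precomputed once, leaving one multiplication and one comparison per element at run time. Values that sit exactly on a threshold may differ between the two versions because of floating-point rounding.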

Here, for example, an element judged to be "−1" (below the threshold) when binarized with the 20% position of its range as the threshold will necessarily also be "−1" when binarized with the 40%, 60%, or 80% position as the threshold. In this sense, the 128-bit binarized vector obtained by multi-threshold binarization contains redundant elements. It is therefore inefficient to feed this 128-bit binarized vector directly to the feature quantity conversion device 101 of the first example to obtain co-occurrence elements. This example accordingly provides a feature quantity conversion device that reduces this redundancy and obtains co-occurrence elements more efficiently.

図２４は、本例の特徴量変換を説明する図である。本例の特徴量変換装置は、実数ベクトルとして得られている特徴ベクトルを、ｋ種類の異なる閾値で二値化する。図２４の例では、レンジの２０％位置、４０％位置、６０％位置、８０％位置の４種類の閾値でもって、３２次元の実数ベクトルをそれぞれ二値化することで、それぞれ３２個の要素を持つビット列を得る。ここまでは、図２３の例と同様である。 FIG. 24 is a diagram for explaining the feature amount conversion in this example. The feature amount conversion apparatus of this example binarizes a feature vector obtained as a real vector with k different thresholds. In the example of FIG. 24, each of 32 elements is binarized by binarizing a 32-dimensional real vector with four types of threshold values of 20% position, 40% position, 60% position, and 80% position of the range. Get a bit string with. The steps so far are the same as in the example of FIG.

本例の特徴量変換装置では、各閾値によって得られたビット列を統合する前に、それらのビット列を用いて、それぞれ共起要素を求める。これによって、図２４に示すように、各３２ビットのビット列から５４４ビットのビット列を得ることができる。最終的には、これらの４つのビット列を統合して、２１７６ビットの二値化された非線形変換特徴ベクトルが得られる。 In the feature quantity conversion apparatus of this example, before the bit strings obtained by the respective threshold values are integrated, the co-occurrence elements are obtained using these bit strings. As a result, as shown in FIG. 24, a 544-bit bit string can be obtained from each 32-bit bit string. Eventually, these four bit sequences are integrated to obtain a 2176-bit binarized nonlinear transformation feature vector.

FIG. 25 is a block diagram showing the configuration of the feature quantity conversion device of this example. The feature quantity conversion device 102 includes N binarizers 211 to 21N, the same number (N) of co-occurrence element generators 221 to 22N, and a feature integrator 23. Some or all of the binarizers 211 to 21N, the co-occurrence element generators 221 to 22N, and the feature integrator 23 may be realized by a computer executing a feature quantity conversion program, or may be realized in hardware.

本例では、特徴量変換装置１０２に実数の特徴ベクトルが入力される。特徴ベクトルは、Ｎ個の二値化器２１１〜２１Ｎにそれぞれ入力される。二値化器２１１〜２１Ｎは、それぞれ異なる閾値で実数の特徴ベクトルを二値化する。二値化された特徴ベクトルは、それぞれ対応する共起要素生成器２２１〜２２Ｎに入力される。 In this example, a real number feature vector is input to the feature quantity conversion apparatus 102. The feature vectors are input to N binarizers 211 to 21N, respectively. The binarizers 211 to 21N binarize real feature vectors with different threshold values. The binarized feature vectors are input to the corresponding co-occurrence element generators 221 to 22N, respectively.

Each of the co-occurrence element generators 221 to 22N has the same configuration as the feature quantity conversion apparatus 101 described in the first example. That is, each of the co-occurrence element generators 221 to 22N includes a plurality of bit rearrangers 111 to 11N, a plurality of logical operation units 121 to 12N, and a feature integrator 13, computes co-occurrence elements by carry-less rotate shifts and XOR operations, and integrates them with the input bit string.

When a 32-bit string is input to each of the co-occurrence element generators 221 to 22N, each of them outputs a 544-bit string. The feature integrator 23 concatenates the outputs of the co-occurrence element generators 221 to 22N and generates a nonlinear-transformation feature vector having them as its elements. As described above, when the input feature vector is 32-dimensional and four thresholds are used, the feature vector generated by the feature integrator 23 is 2176-dimensional (2176 bits).
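The bit counts above can be checked with a short sketch. The Python below is an illustration only, not the program of the embodiment: it assumes NumPy, assumes that the d/2 rotations are by 1 through d/2 positions, and assumes that the thresholds sit at fixed fractions of a known value range [lo, hi]. The names `rotl`, `cooccurrence_bits`, and `transform` are hypothetical.

```python
import numpy as np

def rotl(bits, s):
    """Rotate a 1-D 0/1 array left by s positions (no carry bit)."""
    return np.roll(bits, -s)

def cooccurrence_bits(bits):
    """Concatenate the input bits with XORs against d/2 rotated copies."""
    d = bits.size
    parts = [bits]
    for s in range(1, d // 2 + 1):          # assumed rotation amounts 1..d/2
        parts.append(np.bitwise_xor(bits, rotl(bits, s)))
    return np.concatenate(parts)

def transform(x, lo, hi, fractions=(0.2, 0.4, 0.6, 0.8)):
    """Binarize x at each threshold, expand each bit string, concatenate."""
    out = []
    for f in fractions:
        t = lo + f * (hi - lo)              # threshold at fraction f of the range
        bits = (x > t).astype(np.uint8)
        out.append(cooccurrence_bits(bits))
    return np.concatenate(out)

rng = np.random.default_rng(0)
x = rng.random(32)                          # pseudo 32-dimensional real feature
v = transform(x, 0.0, 1.0)
print(cooccurrence_bits((x > 0.5).astype(np.uint8)).size, v.size)
```

With d = 32, each expanded string has 32 + 16 x 32 = 544 bits, and four thresholds give 4 x 544 = 2176 bits, matching the figures in the text.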

As described above, according to the feature quantity conversion apparatus 20 of this example, even when a feature quantity is obtained as a real vector, it can be binarized and the amount of information in the resulting binary vector can be increased.

2-6. Modifications of the Second Embodiment
When a recognition model is determined from a large amount of learning data, the feature quantity conversion apparatus 101 of the first example and the feature quantity conversion apparatus 102 of the second example apply the nonlinear transformation described above to the feature vectors input as learning data and obtain nonlinear-transformation feature vectors. These vectors are used by a learning apparatus in learning processing such as SVM training, and the recognition model is thereby determined. That is, the feature quantity conversion apparatuses 101 and 102 can be used in a learning apparatus. After the recognition model has been determined, when data to be recognized is input as a feature vector in the same format as the learning data, the feature quantity conversion apparatuses 101 and 102 likewise apply the nonlinear transformation to that feature vector and obtain a nonlinear-transformation feature vector. This vector is used by a recognition apparatus for linear classification or the like, and a recognition result is obtained. That is, the feature quantity conversion apparatuses 101 and 102 can also be used in a recognition apparatus.

なお、論理演算器１２１〜１２Ｎでは、必ずしも論理演算としてＸＯＲを計算しなくてもよく、例えばＡＮＤやＯＲを計算してもよい。但し、上述のように、ＸＯＲはＦＩＮＤ特徴量を求める際の調和平均と等価であり、かつ、図１１の表から明らかなように、特徴ベクトルが任意である場合には、ＸＯＲの値として「＋１」と「−１」とが等確率で出現するため、共起要素のエントロピーが高くなり（情報量が多くなり）、非線形変換特徴ベクトルの記述能力が向上するので、論理演算器１２１〜１２ＮがＸＯＲを計算することは有利である。 Note that the logical operation units 121 to 12N do not necessarily calculate XOR as a logical operation, and may calculate AND or OR, for example. However, as described above, XOR is equivalent to the harmonic mean for obtaining the FIND feature value, and as is clear from the table of FIG. 11, when the feature vector is arbitrary, the value of XOR is “ Since “+1” and “−1” appear with equal probability, the entropy of the co-occurrence element is increased (the amount of information is increased), and the description capability of the nonlinear transformation feature vector is improved. It is advantageous to calculate XOR.
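The entropy argument for XOR can be illustrated with a small sketch. It assumes that the two input bits are independent and uniformly random, which is one reading of "the feature vector is arbitrary", and it maps +1 and -1 to the bits 1 and 0. The names `prob_one`, `binary_entropy`, and `OPS` are hypothetical.

```python
from itertools import product
from math import log2

def prob_one(op):
    """Probability that op(a, b) == 1 over the four equally likely inputs."""
    outs = [op(a, b) for a, b in product((0, 1), repeat=2)]
    return sum(outs) / len(outs)

def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

OPS = {
    "XOR": lambda a, b: a ^ b,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
}

for name, op in OPS.items():
    p = prob_one(op)
    print(f"{name}: P(1) = {p:.2f}, entropy = {binary_entropy(p):.3f} bits")
```

Under this assumption XOR outputs 1 with probability 0.5 (1 bit of entropy), whereas AND and OR are skewed toward one value and carry less information per co-occurrence element.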

また、特徴量変換装置１０１及び共起要素生成器２２１〜２２Ｎは、特徴ベクトルの次元数ｄに対して、ｄ／２個のビット再配列器１１１〜１１Ｎを備えていたが、ビット再配列器の個数は、これより少なくてもよく（Ｎ＝１でもよく）、これより多くてもよい。また、論理演算器１２１〜１２Ｎの個数も、ｄ／２より少なくてもよく（Ｎ＝１でもよく）、ｄ／２より多くてもよい。 Further, the feature quantity conversion device 101 and the co-occurrence element generators 221 to 22N include the d / 2 bit rearrangers 111 to 11N with respect to the dimension d of the feature vector. The number may be smaller than this (N = 1 may be sufficient) or larger. In addition, the number of logical operation units 121 to 12N may be smaller than d / 2 (N = 1 may be sufficient) or larger than d / 2.

The bit rearrangers 111 to 11N each generate a new bit string by applying a carry-less rotate shift to the bit string of the original feature vector, but each of the rearrangers 111 to 11N may instead generate a new bit string by, for example, randomly permuting the bit string of the original feature vector. The carry-less rotate shift is nevertheless advantageous in that it covers all combinations with the minimum number of bits, its logic is simple, and it is fast.

また、論理演算器１２１〜１２Ｎは、もとの特徴ベクトルのビット列とビット再配列器で再配列されたビット列との論理演算を行ったが、一部又はすべての論理演算器が、ビット再配列器で再配列されたビット列どうしの論理演算を行ってもよい。このとき、ビット再配列器で得られるビット列の次元数ともとの特徴ベクトルの次元数とが異なっていてもよい。また、二値化器２１１〜２１Ｎの入力と出力とで次元が異なっていてもよい。さらに、特徴統合器１３は、もとの特徴ベクトルの要素も用いて非線形変換特徴ベクトルを生成したが、もとの特徴ベクトルは用いなくてもよい。 In addition, the logical operation units 121 to 12N perform logical operations on the original feature vector bit sequence and the bit sequence rearranged by the bit rearrangement unit. However, some or all of the logical operation units perform bit rearrangement. A logical operation may be performed between the bit sequences rearranged by the unit. At this time, the dimension number of the bit vector obtained by the bit rearranger may differ from the dimension number of the original feature vector. Further, the dimensions may be different between the input and output of the binarizers 211 to 21N. Further, the feature integrator 13 generates the nonlinear transformation feature vector using the elements of the original feature vector, but the original feature vector may not be used.

In the second example above, each of the co-occurrence element generators 221 to 22N has the same configuration as the feature quantity conversion apparatus 101 of the first example, that is, it includes a plurality of bit rearrangers 111 to 11N, a plurality of logical operation units 121 to 12N, and a feature integrator 13. Alternatively, each of the co-occurrence element generators 221 to 22N may omit the feature integrator 13 and output the bit strings produced by the logical operation units 121 to 12N directly to the feature integrator 23, which then integrates them to generate the nonlinear-transformation feature vector.

また、上記の第１及び第２の例では、画像の識別を行う例を説明したが、識別の対象は音声、文章等の他のデータであってもよい。また、認識処理は線形識別ではない他の認識処理であってもよい。 In the first and second examples, the example in which the image is identified has been described. However, the identification target may be other data such as voice and text. Further, the recognition process may be another recognition process that is not linear identification.

In the first and second examples above, the bit rearrangers 111 to 11N each generate one rearranged bit string, so that a plurality of rearranged bit strings are produced, and the logical operation units 121 to 12N each perform one logical operation, so that the XOR of each rearranged bit string with the bit string of the original feature vector is computed. The bit rearrangers 111 to 11N and the logical operation units 121 to 12N correspond to the bit rearrangement unit and the logical operation unit of this embodiment, respectively. The bit rearrangement unit and the logical operation unit of this embodiment are not limited to the above examples; for example, the generation of the rearranged bit strings and the logical operations may be performed by software processing.

2-7. Example
Next, an example using the feature quantity conversion apparatus of this embodiment will be described. FIG. 26 shows the program code of a comparative example, and FIG. 27 shows the program code of the example. The comparative example is a program that converts a feature quantity with 32 real-valued elements into a FIND feature quantity. The example is a program that applies the nonlinear transformation of the feature quantity conversion apparatus 10 of the first example to a feature quantity with 32 binarized elements. In the following, k denotes the number of binarization threshold levels.

比較例及び実施例のプログラムによって、同一の擬似データを変換した。その結果、比較例では、１ブロックあたりの計算時間は、７２１２．７１ナノ秒となった。これに対して、実施例で、同一の擬似データを変換した場合の１ブロックあたりの計算時間は、ｋ＝１のときに２２．０４ナノ秒（比較例の３２７．３２倍の速度）、ｋ＝２のときに３３．２０ナノ秒（比較例の２１７．２２倍の速度）、ｋ＝３のときに４２．１４ナノ秒（比較例の１７１．１７倍の速度）、ｋ＝４のときに５３．７６ナノ秒（比較例の１３４．１６倍の速度）となった。このように、実施例の非線形変換は、比較例と比較して十分に高速であった。 The same pseudo data was converted by the programs of the comparative example and the example. As a result, in the comparative example, the calculation time per block was 7212.71 nanoseconds. On the other hand, in the embodiment, the calculation time per block when the same pseudo data is converted is 22.04 nanoseconds (327.32 times the speed of the comparative example) when k = 1, k = 2 when 33.20 nanoseconds (217.22 times the speed of the comparative example), k = 3 when 42.14 nanoseconds (171.17 times the speed of the comparative example), when k = 4 To 53.76 nanoseconds (134.16 times the speed of the comparative example). Thus, the nonlinear transformation of the example was sufficiently fast compared with the comparative example.

図２８は、学習によって認識モデルを生成した後に認識装置にて認識を行ったときの誤検出と検出率との関係を示すグラフである。横軸は誤検出を示し、縦軸は検出率を示している。認識装置においては、誤検出が小さく、かつ検出率が高いことが望ましい。すなわち、図２８のグラフでは、左上の角に近いグラフほど認識性能が高い。 FIG. 28 is a graph showing the relationship between false detection and detection rate when recognition is performed by the recognition device after generating a recognition model by learning. The horizontal axis indicates erroneous detection, and the vertical axis indicates the detection rate. In the recognition device, it is desirable that the false detection is small and the detection rate is high. That is, in the graph of FIG. 28, the closer to the upper left corner, the higher the recognition performance.

In FIG. 28, the broken line is the graph obtained when learning and recognition are performed using HOG feature quantities from Dalal's original implementation as they are. The dash-dot line is the graph obtained when learning and recognition are performed using FIND feature quantities with the C parameter optimally tuned. The solid line represents the example, specifically the graph obtained when learning and recognition are performed using nonlinear-transformation feature vectors obtained by the second example of this embodiment with k = 4.

図２８から明らかなように、ＦＩＮＤ特徴量及び実施例は、ＨＯＧ特徴量をそのまま用いた場合と比較して、認識性能が高い。実施例は、二値化をしているのでＦＩＮＤ特徴量よりも認識性能が劣るが、その劣化は僅かである。以上の結果から、本実施の形態によれば、ＦＩＮＤ特徴量と比較して、処理速度は格段に向上する一方で、認識性能はほとんど劣らないことが確認された。 As is clear from FIG. 28, the FIND feature value and the example have higher recognition performance than the case where the HOG feature value is used as it is. In the embodiment, since the binarization is performed, the recognition performance is inferior to the FIND feature amount, but the deterioration is slight. From the above results, according to the present embodiment, it was confirmed that the processing speed is remarkably improved while the recognition performance is hardly inferior as compared with the FIND feature amount.

A further example of this embodiment will be described. In this example, when a real-valued feature quantity is binarized with k thresholds, recognition by the classifier is accelerated by cascade processing. Let the vector obtained by binarizing a real-valued feature quantity X with k thresholds be
$$ b = [b_1^T, b_2^T, \dots, b_k^T]^T . $$
For purposes such as classification, w^T b in the following expression is computed and compared with a threshold Th, where w is the weight vector for classification:
$$ w^T b = \sum_{i=1}^{k} w_i^T b_i . $$

For example, suppose that k = 4 and that b₁, b₂, b₃, and b₄ are binarized at the 20%, 40%, 60%, and 80% positions, respectively. Then b₂ and b₃ clearly have higher entropy than b₁ and b₄. Consequently, w₂^T b₂ and w₃^T b₃ have wider value distributions than w₁^T b₁ and w₄^T b₄.

With this in mind, this example computes the terms in the order w₂^T b₂, w₃^T b₃, w₁^T b₁, w₄^T b₄, and terminates processing as soon as it can be determined that w^T b will certainly be larger, or certainly be smaller, than the predetermined threshold Th. This speeds up processing. In other words, the cascade is ordered by decreasing width of the distribution of w_i^T b_i, or by decreasing entropy.
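The cascade described above can be sketched as follows. The text does not say how "certainly larger or smaller" is decided, so this sketch makes an assumption: the elements of each b_i are -1 or +1, so that the magnitude of each unevaluated term w_i^T b_i is bounded by the L1 norm of w_i. The function name `cascaded_decision` and its arguments are hypothetical.

```python
import numpy as np

def cascaded_decision(ws, bs, th, order=(1, 2, 0, 3)):
    """Decide whether sum_i ws[i] @ bs[i] > th, stopping as early as possible.

    ws, bs: lists of per-threshold weight blocks and +/-1 bit blocks.
    order:  evaluation order (0-based); the default mirrors b2, b3, b1, b4.
    Returns (decision, number_of_blocks_evaluated).
    """
    # Assumed bound: |w_i^T b_i| <= ||w_i||_1 when b_i has +/-1 elements.
    bound = {i: float(np.abs(ws[i]).sum()) for i in order}
    total = 0.0
    for n, i in enumerate(order, start=1):
        total += float(ws[i] @ bs[i])
        remaining = sum(bound[j] for j in order[n:])  # slack of unevaluated blocks
        if total - remaining > th:      # cannot fall back below th any more
            return True, n
        if total + remaining <= th:     # cannot climb above th any more
            return False, n
    return total > th, len(order)

rng = np.random.default_rng(0)
ws = [rng.normal(size=32) for _ in range(4)]
bs = [rng.choice([-1.0, 1.0], size=32) for _ in range(4)]
print(cascaded_decision(ws, bs, th=0.0))
```

The L1 bound is conservative; tighter bounds estimated from data would let the cascade stop earlier at the cost of occasional errors.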

3. Third Embodiment
3-1. Background
If a feature vector is converted into a d-dimensional binary vector whose elements take only the two values -1 and 1, binary codes can be applied to various kinds of processing, such as classification by an SVM (support vector machine) and k-means clustering. In these cases, however, the benefit of fast distance computation based on the Hamming distance may not be available. That is, depending on the algorithm, the fast distance computation that binary code conversion normally provides cannot always be exploited.

As examples in which the benefit of fast distance computation by binary code conversion cannot be obtained, recognition (classification) processing by a classifier and k-means clustering are described below. First, for recognition by a classifier, consider applying a linear SVM (linear support vector machine) to the problem of classifying a binary vector x ∈ {-1, 1}^d into two classes. A linear SVM evaluates the following expression (4):
$$ f(x) = w^T x + b \qquad (4) $$
The classifier assigns the input vector x to class A if the evaluation function f(x) is positive and to class B if f(x) is negative. w is a weight parameter with w ∈ R^d, and b is a bias parameter with b ∈ R^1. The parameters w and b form a dictionary that is determined automatically by a learning process using feature quantities prepared for learning.

Here, even if the feature quantities prepared for learning are binary vectors, w ∈ R^d is real-valued rather than binary. The computation of f(x) includes w^T x, and because x is binary while w is a real-valued vector, computing w^T x requires floating-point arithmetic. Thus, recognition processing by a classifier based on an SVM cannot benefit from the speed-up that would otherwise come from using binary feature vectors.

Next, consider applying k-means clustering to binary vectors, that is, the problem of finding k clusters that group together mutually close binary vectors when N d-dimensional binary vectors are given. k-means is an algorithm that computes k clusters and their representative vectors by the following procedure.

Step 1: Select k of the N feature quantities at random and use them as the representative vectors of the clusters.
Step 2: For each of the N feature quantities given as input, find the nearest representative vector.
Step 3: Compute the mean of the feature quantities assigned to each representative vector and use it as the new representative vector.
Step 4: Repeat Step 2 and Step 3 until convergence.
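The four steps above can be sketched as follows, mainly to show that the representative vectors stop being binary after the first averaging step. This is a plain floating-point k-means written for illustration, assuming NumPy; the function name `kmeans`, the iteration cap, and the convergence test via `np.allclose` are choices made here, not taken from the text.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means on the rows of X; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k input vectors at random as the initial representatives.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Step 2: squared Euclidean distance to every representative.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 3: the mean of the members becomes the new representative.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Step 4: stop once the representatives no longer move.
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(200, 16))      # binary input vectors
centers, labels = kmeans(X, k=3)
print(np.unique(X), centers[0][:4])          # inputs are +/-1, centers are not
```

Because the centers are real-valued, the distance computation in Step 2 mixes binary and real vectors, which is exactly the floating-point cost discussed in the surrounding text.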

ｋ−ｍｅａｎｓクラスタリングにおいて問題となるのは、ステップ３において、新しい代表ベクトルが二値ベクトルの平均で定義される点である。入力として与えられたデータが二値ベクトルであっても、平均の演算により、代表ベクトルは実数のベクトルになる。そのため、ステップ２における距離計算では、二値ベクトルと実数ベクトルとの間の距離を求めなければならなくなる。つまり、浮動小数点演算が必要になってしまう。このように、ｋ−ｍｅａｎｓクラスタリングにおいても、特徴ベクトルを二値ベクトルとすることによる計算高速化の恩恵を受けることができない。 The problem in k-means clustering is that in step 3 a new representative vector is defined by the average of the binary vectors. Even if the data given as input is a binary vector, the representative vector becomes a real vector by the average calculation. Therefore, in the distance calculation in step 2, it is necessary to obtain the distance between the binary vector and the real vector. In other words, floating point arithmetic is required. As described above, even in k-means clustering, it is not possible to receive the benefit of speeding up the calculation by making the feature vector a binary vector.

As described above, recognition processing by a classifier and k-means clustering cannot benefit from the speed-up that would otherwise come from using binary feature vectors. In both cases the reason is that an inner product between a d-dimensional binary vector p ∈ {-1, 1}^d and a d-dimensional real vector q ∈ R^d must be computed. What k-means clustering needs is the distance between the d-bit binary vector p ∈ {-1, 1}^d and the d-dimensional real vector q ∈ R^d, but this too ultimately reduces to computing the inner product p^T q, because the squared Euclidean distance between p and q is expressed by the following equation:
$$ \| p - q \|^2 = \| p \|^2 - 2\, p^T q + \| q \|^2 $$

よって、識別装置による認識処理においてもｋ−ｍｅａｎｓクラスタリングにおいても、二値ベクトルとｄ次元の実数ベクトルとの内積の演算を高速化することこそが、問題の解決につながる。 Therefore, speeding up the calculation of the inner product of a binary vector and a d-dimensional real vector in both recognition processing by the identification device and k-means clustering leads to the solution of the problem.

3-2. Overview
Accordingly, for the case where the feature vector is a d-dimensional binary vector p ∈ {-1, 1}^d, the relevance determination apparatus of this embodiment has the following configuration in order to compute the inner product (p^T q or q^T p) between such a feature vector and a d-dimensional real vector q ∈ R^d at high speed.

The relevance determination apparatus according to a first aspect of this embodiment includes: a feature vector acquisition unit that acquires a binarized feature vector; a basis vector acquisition unit that acquires a plurality of basis vectors obtained by decomposing a real vector into a linear sum of a plurality of basis vectors whose elements consist only of binary or ternary discrete values; and a vector operation unit that determines the relevance between the real vector and the feature vector by sequentially computing the inner product of the feature vector with each of the plurality of basis vectors.

前記特徴ベクトルと前記基底ベクトルとの内積計算は、−１及び１のみを要素としてもつ第１の二値ベクトルと−１及び１のみを要素としてもつ複数の第２の二値ベクトルとの内積計算を含んでいてよい。 The inner product calculation of the feature vector and the basis vector is an inner product calculation of a first binary vector having only -1 and 1 as elements and a plurality of second binary vectors having only -1 and 1 as elements. May be included.

The first binary vector may be the feature vector, may be a vector obtained by dividing each element of the feature vector by a predetermined coefficient, or may be a vector from which the feature vector is obtained by linearly transforming each of its elements.

The second binary vector may be the basis vector, may be a vector obtained by dividing each element of the basis vector by a predetermined coefficient, or may be a vector from which the basis vector is obtained by linearly transforming each of its elements.

前記特徴ベクトルと前記基底ベクトルとの内積計算は、−１及び１のみを要素としてもつ二値ベクトルと−１、０及び１のみを要素としてもつ複数の三値ベクトルとの内積計算を含んでいてよい。 The inner product calculation of the feature vector and the base vector includes an inner product calculation of a binary vector having only -1 and 1 as elements and a plurality of ternary vectors having only -1, 0 and 1 as elements. Good.

The binary vector may be the feature vector, may be a vector obtained by dividing each element of the feature vector by a predetermined coefficient, or may be a vector from which the feature vector is obtained by linearly transforming each of its elements.

前記複数の三値ベクトルは、前記複数の基底ベクトルであってよく、前記複数の基底ベクトルの各要素を所定の係数で除したベクトルであってもよく、その各要素を線形変換することで前記複数の基底ベクトルが得られるベクトルであってもよい。 The plurality of ternary vectors may be the plurality of basis vectors, or may be a vector obtained by dividing each element of the plurality of basis vectors by a predetermined coefficient. It may be a vector from which a plurality of basis vectors are obtained.

The vector operation unit may compute the inner product of the first binary vector and the second binary vector by taking the exclusive OR of the first binary vector and the second binary vector.
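A minimal sketch of how an exclusive OR yields the inner product of two vectors with elements in {-1, 1}: if +1 is stored as bit 1 and -1 as bit 0, the XOR marks the positions where the vectors differ, and with h differing positions out of d the inner product is (d - h) - h = d - 2h. The bit mapping and the helper names `pack` and `dot_binary` are assumptions made for this illustration.

```python
def pack(v):
    """Pack a sequence of -1/+1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    n = 0
    for i, e in enumerate(v):
        if e == 1:
            n |= 1 << i
    return n

def dot_binary(p_bits, m_bits, d):
    """Inner product of two packed +/-1 vectors of length d via XOR and popcount."""
    hamming = bin(p_bits ^ m_bits).count("1")   # number of differing positions
    return d - 2 * hamming

p = [1, -1, -1, 1, 1, 1, -1, 1]
m = [1, 1, -1, -1, 1, -1, -1, 1]
print(dot_binary(pack(p), pack(m), len(p)), sum(a * b for a, b in zip(p, m)))
```

Both printed values agree; on real hardware the XOR and population count operate on whole machine words, which is where the speed-up over floating-point multiplication comes from.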

In computing the inner product of the binary vector and the ternary vector, the vector operation unit may: generate a zero-replaced vector by replacing each 0 element of the ternary vector with either -1 or 1, chosen arbitrarily; generate a filter vector by replacing each 0 element of the ternary vector with -1 and each non-zero element with 1; obtain the number D_{filterd_hamming} of elements that are non-zero and differ between the binary vector and the ternary vector by taking the logical AND of the filter vector with the exclusive OR of the binary vector and the zero-replaced vector; obtain the number of elements that are non-zero and identical between the binary vector and the ternary vector by subtracting D_{filterd_hamming} and the number of zero elements from the number of elements of the binary vector; and obtain the inner product of the binary vector and the ternary vector by subtracting the number of non-zero differing elements from the number of non-zero identical elements.
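The ternary case can be sketched in the same style. The sketch assumes the bit mapping +1 -> 1 and -1 -> 0, so that the filter vector has bit 1 exactly at the non-zero positions, and it replaces zeros by -1 in the zero-replaced vector (the choice does not affect the result). With z zero elements and D differing non-zero elements, the number of identical non-zero elements is d - D - z and the inner product is (d - D - z) - D. The helper names are hypothetical.

```python
def popcount(n):
    """Number of set bits in a non-negative int."""
    return bin(n).count("1")

def pack_pm1(v):
    """Pack -1/+1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    return sum(1 << i for i, e in enumerate(v) if e == 1)

def pack_ternary(m):
    """Return (zero_replaced_bits, filter_bits) for a vector over {-1, 0, 1}."""
    zero_replaced = sum(1 << i for i, e in enumerate(m) if e == 1)  # 0 -> -1
    filt = sum(1 << i for i, e in enumerate(m) if e != 0)           # non-zero mask
    return zero_replaced, filt

def dot_ternary(p_bits, m_bits, filt, d):
    """Inner product of a packed +/-1 vector and a packed ternary vector."""
    z = d - popcount(filt)                        # number of zero elements
    diff = popcount((p_bits ^ m_bits) & filt)     # non-zero and different
    same = d - diff - z                           # non-zero and identical
    return same - diff

p = [1, -1, -1, 1, 1, 1, -1, 1]
m = [1, 0, -1, -1, 0, -1, 1, 1]
m_bits, filt = pack_ternary(m)
print(dot_ternary(pack_pm1(p), m_bits, filt, len(p)),
      sum(a * b for a, b in zip(p, m)))
```

The two printed values agree, and the masked XOR ignores whatever value was substituted at the zero positions.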

前記複数の基底ベクトルは、前記実数ベクトルと、前記複数の基底ベクトルの線形和との差分を分解誤差として、前記分解誤差が最小になるように、求められてよい。 The plurality of basis vectors may be obtained such that a difference between the real vector and a linear sum of the plurality of basis vectors is a decomposition error, and the decomposition error is minimized.

The plurality of basis vectors may be obtained so as to minimize a decomposition error defined as the difference between the inner product of the real vector and the feature vector and the inner product of the linear sum of the plurality of basis vectors and the feature vector.

The plurality of basis vectors may be obtained, together with a plurality of coefficients associated with them, by repeating a first update, in which the elements of the basis vectors are fixed and the coefficients are updated so as to minimize the decomposition error, and a second update, in which the coefficients are fixed and the elements of the basis vectors are updated so as to minimize the decomposition error.

前記複数の基底ベクトルは、前記分解誤差の減少量が所定の値以下になるまで前記第１の更新と前記第２の更新を繰り返すことで求められてよい。 The plurality of basis vectors may be obtained by repeating the first update and the second update until a reduction amount of the decomposition error becomes a predetermined value or less.
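The alternating updates described above can be sketched as follows for binary basis vectors. This is one possible realization under stated assumptions, not the procedure of the embodiment: the decomposition error is taken as the squared difference between q and the linear sum M c, the first update is an ordinary least-squares solve, and the second update tries all 2^k sign patterns for each row of M (practical only for small k). The name `decompose` and its parameters are hypothetical.

```python
import numpy as np
from itertools import product

def decompose(q, k, iters=50, tol=1e-9, seed=0):
    """Approximate q by M @ c with M in {-1, +1}^(d x k) and real c."""
    rng = np.random.default_rng(seed)
    M = rng.choice([-1.0, 1.0], size=(q.size, k))            # random initial basis
    combos = np.array(list(product([-1.0, 1.0], repeat=k)))  # all 2^k sign rows
    prev = np.inf
    for _ in range(iters):
        # First update: basis fixed, coefficients by least squares.
        c, *_ = np.linalg.lstsq(M, q, rcond=None)
        # Second update: coefficients fixed, each row takes its best sign pattern.
        cand = combos @ c
        M = combos[np.abs(q[:, None] - cand[None, :]).argmin(axis=1)]
        err = float(np.sum((q - M @ c) ** 2))
        if prev - err <= tol:                                 # decrease small enough
            break
        prev = err
    return M, c, err

rng = np.random.default_rng(0)
q = rng.normal(size=64)
M, c, err = decompose(q, k=4)
p = rng.choice([-1.0, 1.0], size=64)
approx = sum(c[i] * (p @ M[:, i]) for i in range(4))  # k binary inner products
print(err, float(p @ q), float(approx))
```

Once M and c are available, p^T q is approximated by a weighted sum of k binary inner products. Running the sketch from several seeds and keeping the result with the smallest error corresponds to varying the initial values.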

前記複数の基底ベクトルは、前記複数の基底ベクトル及び前記複数の係数の初期値を変えて、複数とおりの前記複数の基底ベクトル及び前記複数の係数を求め、前記分解誤差が最小となる前記複数の基底ベクトル及び前記複数の係数を採用することで求められてよい。 The plurality of basis vectors are obtained by changing the initial values of the plurality of basis vectors and the plurality of coefficients to obtain a plurality of the plurality of basis vectors and the plurality of coefficients, and the plurality of the plurality of basis vectors that minimize the decomposition error. It may be obtained by employing a basis vector and the plurality of coefficients.

前記複数の基底ベクトルに係る複数の係数は離散値であってよい。 The plurality of coefficients related to the plurality of basis vectors may be discrete values.

前記複数の基底ベクトルは、前記実数ベクトルの要素の平均値を前記実数ベクトルの各要素から引いたオフセット実数ベクトルを前記基底ベクトルの線形和に分解することで求められてよい。 The plurality of basis vectors may be obtained by decomposing an offset real vector obtained by subtracting an average value of elements of the real vector from each element of the real vector into a linear sum of the base vectors.

Each time the inner product of the feature vector and a basis vector is computed, the vector operation unit may obtain the running total of the inner product results and the range that this running total can take when the real vector and the feature vector are related; if the running total is outside that range, the vector operation unit may terminate the inner product computation and determine the magnitude relationship between the inner product of the feature vector and the basis vectors and a predetermined threshold.

前記ベクトル演算部は、前記特徴ベクトルと前記複数の基底ベクトルの各々との内積計算ごとに、当該基底ベクトルまでの前記内積計算の結果の合計が、最大側早期判定用閾値より大きい場合に、前記内積計算を打ち切って、前記実数ベクトルと前記特徴ベクトルとの内積が前記閾値より大きいと判定してよい。 The vector calculation unit, for each inner product calculation of the feature vector and each of the plurality of basis vectors, when the total result of the inner product calculation up to the basis vector is larger than a threshold value for maximum-side early determination, The inner product calculation may be terminated, and it may be determined that the inner product of the real vector and the feature vector is larger than the threshold.

The vector operation unit may obtain, by learning, the minimum value that the inner product of the feature vector and each of the plurality of basis vectors can take, and may obtain the maximum-side early determination threshold by subtracting from the threshold the sum of the minimum values that can be taken by the inner products of the feature vector with the basis vectors for which the inner product has not yet been computed.

The minimum value that the inner product of the feature vector and each of the plurality of basis vectors can take may be a value located at a predetermined fraction, counted from the minimum side, of the values that the inner product can take.

前記ベクトル演算部は、前記実数ベクトルと前記特徴ベクトルとの内積が前記閾値より大きいと判定したときに、前記特徴ベクトルと前記基底ベクトルとは関連していないと判定してよい。 The vector calculation unit may determine that the feature vector and the base vector are not related when it determines that the inner product of the real vector and the feature vector is larger than the threshold.

For each inner product calculation between the feature vector and each of the plurality of basis vectors, when the total of the inner product results up to that basis vector is smaller than a minimum-side early determination threshold, the vector operation unit may terminate the inner product calculation and determine that the inner product of the real vector and the feature vector is smaller than the threshold.

The vector operation unit may obtain, by learning, the maximum value that the inner product of the feature vector and each of the plurality of basis vectors can take, and may obtain the minimum-side early determination threshold by subtracting from the threshold the sum of the maximum values that can be taken by the inner products of the feature vector and the basis vectors for which the inner product calculation has not yet been performed.

The maximum value that the inner product of the feature vector and each of the plurality of basis vectors can take may be the value located at a predetermined top percentage, counted from the maximum side, of the values that the inner product can take.

The minimum-side early determination threshold may be the minimum value that the total of the inner product results can take when the real vector and the feature vector are related.

When the vector operation unit determines that the inner product of the real vector and the feature vector is smaller than the threshold, it may determine that the feature vector and the basis vector are not related.

The vector operation unit may perform the inner product calculation in order, starting from the basis vector whose coefficient has the largest absolute value.
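The early termination described above can be sketched as follows. This is only an illustration under stated assumptions: the text obtains the per-basis minimum and maximum values by learning, whereas this sketch substitutes the worst-case bounds -|c_i|·d and +|c_i|·d, which hold for any basis vector with elements in {-1, 1}. The function name `early_exit_compare` and its arguments are hypothetical.

```python
def early_exit_compare(p, basis, c, threshold):
    """Decide whether sum_i c_i * (p^T m_i) exceeds `threshold`, stopping early if possible.

    p      : binarized feature vector with elements in {-1, 1}
    basis  : list of k basis vectors with elements in {-1, 1}
    c      : list of k real coefficients
    Returns (is_larger, number_of_basis_vectors_used).
    """
    d = len(p)
    # Process basis vectors in descending order of |c_i|.
    order = sorted(range(len(c)), key=lambda i: abs(c[i]), reverse=True)
    total = 0.0
    for used, i in enumerate(order, start=1):
        total += c[i] * sum(pj * mj for pj, mj in zip(p, basis[i]))
        # Assumption: worst-case bound on the terms not yet computed
        # (the source text learns tighter bounds from data instead).
        remaining = sum(abs(c[j]) * d for j in order[used:])
        max_side = threshold + remaining  # threshold minus the sum of minima
        min_side = threshold - remaining  # threshold minus the sum of maxima
        if total > max_side:
            return True, used
        if total < min_side:
            return False, used
    return total > threshold, len(order)
```

With worst-case bounds the early decision always agrees with the full calculation; learned percentile bounds, as in the text, would terminate earlier at the cost of occasional disagreement.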

The vector operation unit may decompose the feature vector and the decomposed real vector each into a plurality of partial vectors, and may determine whether the inner product of a partial vector of the feature vector and a partial vector of the decomposed real vector becomes larger than the threshold and/or smaller than the threshold.

The feature vector may be a HOG feature quantity, the real vector may be a weight vector of a linear SVM, and the vector operation unit may, as the relevance determination, classify the feature vector with the linear SVM.

The feature vector may be a vector to be clustered by k-means clustering, the real vector may be a representative vector in the k-means clustering, and the vector operation unit may, as the relevance determination, perform a clustering process that includes calculating the distance between the feature vector and the representative vector.

The feature vector may be a vector subjected to approximate nearest neighbor search by a k-means tree, the real vector may be a representative vector registered in a node of a k-ary tree, and the vector operation unit may, as the relevance determination, perform a clustering process that includes calculating the distance between the feature vector and the representative vector.

The feature vector may be a vector representing a feature quantity of an image.

The relevance determination program of the present embodiment is configured to cause a computer to function as the relevance determination device described above.

The relevance determination method of the present embodiment includes: a feature vector acquisition step of acquiring a binarized feature vector; a basis vector acquisition step of acquiring a plurality of basis vectors obtained by decomposing a real vector into a linear sum of basis vectors whose elements consist only of binary or ternary discrete values; and a vector operation step of determining the relevance between the real vector and the feature vector by sequentially calculating the inner product of the feature vector and each of the plurality of basis vectors.

3-3. Effect
According to the present embodiment, the real vector is decomposed into a linear sum of binary basis vectors before its inner product with the binarized feature vector is calculated, so the inner product calculation between the feature vector and the real vector can be accelerated.

3-4. First example of the third embodiment
FIG. 29 is a block diagram showing the configuration of the feature quantity arithmetic device 103 of the first example of the present embodiment. The feature quantity arithmetic device 103 includes a content acquisition unit 131, a feature vector generation unit 132, a feature vector binarization unit 133, a real vector acquisition unit 134, a real vector decomposition unit 135, a vector operation unit 136, and a database 137.

As described later, the feature quantity arithmetic device 103 of this example functions as a relevance determination device that determines the relevance between a feature vector and a real vector stored in the database as dictionary data, by a vector operation that involves the inner product of the two. That is, the feature quantity arithmetic device 103 corresponds to the relevance determination device of the present embodiment.

The feature quantity arithmetic device 103 as the relevance determination device is realized by a computer executing the relevance determination program of the present embodiment. The relevance determination program may be recorded on a recording medium and read from it by the computer, or may be downloaded to the computer through a network.

The content acquisition unit 131 acquires content data such as image data, audio data, and text data. The content data may be supplied from an external device or may be generated by the content acquisition unit 131. For example, the content acquisition unit 131 may be a camera that generates image data as the content data.

The feature vector generation unit 132 generates a D-dimensional feature vector from the content data acquired by the content acquisition unit 131. For example, when the content is an image, the feature vector generation unit 132 extracts a feature quantity of the image. The feature vector binarization unit 133 binarizes the D-dimensional feature vector generated by the feature vector generation unit 132 and generates a d-dimensional binary vector p ∈ {-1, 1}^d whose elements take only the two values -1 and 1.

The configuration made up of the content acquisition unit 131, the feature vector generation unit 132, and the feature vector binarization unit 133 need only be able to obtain a binarized feature vector in the end. For example, the content acquisition unit 131 and the feature vector generation unit 132 may be omitted, with the feature vector binarization unit 133 acquiring a feature vector from an external device and binarizing it, or a binarized feature vector may be acquired directly from an external device.

The real vector acquisition unit 134 acquires a d-dimensional real vector q ∈ R^d. The real vector may be supplied from an external device, may be read from a storage device (not shown) of the feature quantity arithmetic device 103, or may be generated by the real vector acquisition unit 134. The elements of the real vector are real numbers, including floating-point values.

The real vector decomposition unit 135 decomposes the d-dimensional real vector q ∈ R^d into a linear sum of binary basis vectors m_i ∈ {-1, 1}^d. Specifically, the real vector decomposition unit 135 decomposes q into a basis matrix M with binary elements and a coefficient vector c with real elements according to equation (5) below.

$$q \approx Mc \tag{5}$$

Here, M = (m_1, m_2, ..., m_k) ∈ {-1, 1}^{d×k} and c = (c_1, c_2, ..., c_k)^T ∈ R^k. That is, the basis matrix M consists of k basis vectors m_i, each of which is a d-dimensional binary vector whose elements take only -1 and 1; M is therefore a binary matrix of d rows and k columns whose elements take only -1 and 1. The coefficient vector c is a k-dimensional real vector whose elements are the real coefficients of the k basis vectors. The decomposition should make q and Mc agree as closely as possible, but it may contain an error.

3-4-1. First decomposition method
The real vector decomposition unit 135 decomposes the real vector by error minimization. The procedure of the first decomposition method is as follows.
(1) Initialize the basis matrix M and the coefficient vector c at random.
(2) With the basis matrix M fixed, update the coefficient vector c so that the decomposition error $\|q - Mc\|_2^2$ is minimized. This can be solved by the least squares method.
(3) With the coefficient vector c fixed, update the basis matrix M so that the decomposition error $\|q - Mc\|_2^2$ is minimized. This minimization algorithm is described in detail below.
(4) Repeat (2) and (3) until convergence. For example, convergence is declared when the decrease in $\|q - Mc\|_2^2$ falls to or below a fixed value.
(5) Hold the solution obtained through steps (1) to (4) as a candidate.
(6) Repeat steps (1) to (5), and adopt as the final result the candidate basis matrix M and coefficient vector c that give the smallest $\|q - Mc\|_2^2$. The repetition of steps (1) to (5) may be omitted, but repeating them several times avoids the problem of dependence on the initial values.

Next, the update of the basis matrix M in step (3) is described. FIG. 30 is a diagram of equation (5). As indicated by the dashed frame in FIG. 30, the elements of the i-th row vector of the basis matrix M depend only on the i-th element of the real vector q. In the binary decomposition of this example, only 2^k candidates exist for the i-th row vector of M (likewise, only 3^k exist in the ternary decomposition of the second example described later). The real vector decomposition unit 135 therefore checks all of them exhaustively and adopts the row vector that minimizes the decomposition error $\|q - Mc\|_2^2$. This is applied to every row vector of M to update the elements of M.
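The alternating procedure and the exhaustive row update can be sketched as below. This is a minimal illustration, not the patented implementation: the least-squares update of step (2) is replaced here by one sweep of coordinate descent (an assumption made to avoid a linear-algebra dependency), and all function names and default parameters are hypothetical.

```python
import itertools
import random

def decomposition_error(q, M, c):
    """Squared error ||q - M c||^2, with M given as a list of d rows of length k."""
    return sum((qi - sum(mij * cj for mij, cj in zip(row, c))) ** 2
               for qi, row in zip(q, M))

def update_c(q, M, c):
    """Step (2), approximated: one coordinate-descent sweep over c with M fixed."""
    d, k = len(q), len(c)
    c = list(c)
    for j in range(k):
        # Residual of q with the j-th term left out.
        r = [q[i] - sum(M[i][l] * c[l] for l in range(k) if l != j) for i in range(d)]
        # Exact 1-D least squares; m_j^T m_j = d because the entries are +/-1.
        c[j] = sum(M[i][j] * r[i] for i in range(d)) / d
    return c

def update_M(q, c):
    """Step (3): for each row, try all 2^k sign patterns and keep the best one."""
    patterns = list(itertools.product((-1, 1), repeat=len(c)))
    return [list(min(patterns,
                     key=lambda row: (qi - sum(m * cj for m, cj in zip(row, c))) ** 2))
            for qi in q]

def decompose(q, k, restarts=5, tol=1e-9, max_iter=100, seed=0):
    """Return (error, M, c) for the best of several random restarts."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):                                      # step (6)
        M = [[rng.choice((-1, 1)) for _ in range(k)] for _ in q]   # step (1)
        c = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        err = decomposition_error(q, M, c)
        for _ in range(max_iter):                                  # step (4)
            c = update_c(q, M, c)
            M = update_M(q, c)
            new_err = decomposition_error(q, M, c)
            converged = err - new_err <= tol
            err = new_err
            if converged:
                break
        if best is None or err < best[0]:                          # step (5)
            best = (err, M, c)
    return best
```

Because each row is searched independently, the exhaustive update costs d·2^k evaluations, which is practical only for small k.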

3-4-2. Second decomposition method
Next, the second decomposition method is described. The first decomposition method defines the decomposition error as $\|q - Mc\|_2^2$ and minimizes it. However, what actually needs to be approximated after the real vector has been approximated by a linear sum of basis vectors is the inner product p^T q of the feature vector and the real vector.

The second decomposition method therefore collects N feature vectors p in advance and gathers them into a matrix P ∈ R^{d×N}. The decomposition error is then defined as $\|P^T q - P^T Mc\|_2^2$ and minimized. Because the real vector is then decomposed according to the actual distribution of the data, the approximation accuracy of the inner product improves.

This approximate decomposition can be performed by obtaining the m_i sequentially. The procedure of the second decomposition method is as follows.
(1) Assign q to r (r ← q).
(2) Assign 1 to i (i ← 1).
(3) Using the first decomposition method, minimize $\|r - m_i c_i\|_2^2$ to obtain m_i and c_i.
(4) Taking the m_i and c_i obtained in step (3) as initial values, minimize $\|P^T r - P^T m_i c_i\|_2^2$ by the following procedure.
(4-1) With m_i fixed, update c_i so that $\|P^T r - P^T m_i c_i\|_2^2$ is minimized. This can be solved by the least squares method.
(4-2) With c_i fixed, update m_i so that $\|P^T r - P^T m_i c_i\|_2^2$ is minimized. Because m_i takes discrete values, this is a combinatorial optimization problem; it can be minimized with algorithms such as a greedy algorithm, tabu search, or simulated annealing. Since step (3) provides a good initial value, these algorithms can minimize the decomposition error well.
(4-3) Repeat (4-1) and (4-2) until convergence. For example, convergence is declared when the decrease in $\|P^T r - P^T m_i c_i\|_2^2$ falls to or below a fixed value.
(5) Assign r - m_i c_i to r (r ← r - m_i c_i) and i + 1 to i (i ← i + 1). If i ≤ k, return to step (3); if i > k, proceed to step (6).
(6) Hold the solution M, c obtained through steps (1) to (5) as a candidate.
(7) Repeat steps (1) to (6), and adopt as the final result the candidate M, c that gives the smallest $\|P^T q - P^T Mc\|_2^2$. The repetition in step (7) may be omitted, but repeating several times mitigates the problem of dependence on the initial values.
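A rough sketch of this sequential, data-aware decomposition follows. Several points are assumptions rather than the source's method: step (3) uses the closed-form rank-one fit m = sign(r), c = mean|r| in place of a full run of the first decomposition method, step (4-2) uses a simple greedy single-sign-flip search, and the outer restarts of step (7) are omitted. All names are hypothetical, and P is passed as a list of N sample vectors (the columns of P).

```python
def project(P, v):
    """Return P^T v, where P is a list of N sample vectors of length d."""
    return [sum(pi * vi for pi, vi in zip(p, v)) for p in P]

def data_error(P, r, m, c):
    """||P^T r - c * P^T m||^2 for one basis vector m and one coefficient c."""
    return sum((a - c * b) ** 2 for a, b in zip(project(P, r), project(P, m)))

def refine(P, r, m, c, tol=1e-12, max_iter=20):
    """Step (4): alternate a least-squares update of c and greedy sign flips of m."""
    m = list(m)
    err = data_error(P, r, m, c)
    for _ in range(max_iter):
        # (4-1) closed-form least squares for the scalar c with m fixed.
        a, b = project(P, r), project(P, m)
        denom = sum(x * x for x in b)
        if denom > 0:
            c = sum(x * y for x, y in zip(a, b)) / denom
        new_err = data_error(P, r, m, c)
        # (4-2) greedy search: keep a single sign flip only if it lowers the error.
        for j in range(len(m)):
            m[j] = -m[j]
            trial = data_error(P, r, m, c)
            if trial < new_err:
                new_err = trial
            else:
                m[j] = -m[j]  # revert the flip
        converged = err - new_err <= tol   # (4-3)
        err = new_err
        if converged:
            break
    return m, c, err

def decompose_with_data(q, P, k):
    """Return (M, c), where M is a list of k basis vectors (the columns of M)."""
    r, M, cs = list(q), [], []
    for _ in range(k):
        # Step (3), simplified: closed-form rank-one fit of the residual r.
        m = [1 if x >= 0 else -1 for x in r]
        c = sum(abs(x) for x in r) / len(r)
        m, c, _ = refine(P, r, m, c)
        M.append(m)
        cs.append(c)
        r = [x - c * y for x, y in zip(r, m)]  # step (5): r <- r - m_i c_i
    return M, cs
```

Tabu search or simulated annealing, which the text also names, could replace the greedy flip loop without changing the surrounding structure.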

The first and second decomposition methods do not require the basis matrix M to be binary (or ternary, as in the second example); they are applicable whenever the elements of M can take only a finite number of distinct values. Like the basis matrix M, the coefficient vector c may also be restricted to predetermined discrete values. For example, the coefficients may be constrained to powers of two, which speeds up processing. Furthermore, when the mean of the elements of the real vector q to be decomposed is markedly large (or small), that is, when the mean is far from 0, the mean may be subtracted from each element of q in advance to generate an offset real vector. Decomposing this offset real vector into the basis matrix M and the coefficient vector c allows the approximate decomposition of equation (5) to be achieved with fewer basis vectors.

The vector operation unit 136 performs operations using the feature vector. The specific content of these operations is described later together with application examples of the feature quantity arithmetic device 103 of this example. The operations include calculating the inner product p^T q of the binarized feature vector p ∈ {-1, 1}^d and the real vector q that the real vector decomposition unit 135 has decomposed into a linear sum of binary vectors. The calculation of this inner product p^T q is described first.

The inner product p^T q can be rewritten as equation (6) below.

$$p^T q \approx p^T M c = \sum_{i=1}^{k} c_i \, p^T m_i \tag{6}$$

Here, p^T m_i is an inner product between two binary vectors, and it can be computed extremely fast for the following reason.

An inner product between binary vectors can be reduced to a Hamming distance calculation. The Hamming distance between two binary codes is the number of bit positions whose values differ; the Hamming distance between two binary vectors is therefore the number of elements whose values differ. Writing the Hamming distance between p and m_i as D_hamming(p, m_i), the inner product p^T m_i is related to it by equation (7) below.

$$p^T m_i = d - 2 D_{hamming}(p, m_i) \tag{7}$$

Here, as stated above, d is the number of bits of the binary code.

The Hamming distance is extremely fast to compute because it is obtained by applying XOR to the two binary codes and then counting the bits that are set to 1. If the binary vectors are represented as binary codes (bit strings of 0 and 1), the Hamming distance can be calculated by equation (8) below.

$$D_{hamming}(p, m_i) = \mathrm{BITCOUNT}(\mathrm{XOR}(p, m_i)) \tag{8}$$

Here, the XOR function takes the exclusive OR of p and m_i in their binary code representation, and the BITCOUNT function counts the number of bits set to 1 in a binary code.

Summarizing the above, the inner product p^T q can be rewritten as equation (9) below.

$$p^T q \approx \sum_{i=1}^{k} c_i \left( d - 2 D_{hamming}(p, m_i) \right) = -2 \sum_{i=1}^{k} c_i \, D_{hamming}(p, m_i) + d \sum_{i=1}^{k} c_i \tag{9}$$

That is, p^T q is obtained by performing a d-bit Hamming distance calculation k times, taking the sum of the k Hamming distances weighted by the coefficient vector c, and adding a constant term. Hence, if k is sufficiently small, p^T q can be computed far faster than with floating-point arithmetic.
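The relationship between the inner product and the Hamming distance can be illustrated with a short sketch. It assumes that +1 is stored as bit 1 and -1 as bit 0 (the text does not fix the bit assignment), and the function names are hypothetical. Python integers stand in for machine words; a real implementation would typically use a hardware population-count instruction.

```python
def pack(v):
    """Pack a +/-1 vector into an int: element j goes to bit j, +1 -> 1, -1 -> 0."""
    bits = 0
    for j, x in enumerate(v):
        if x == 1:
            bits |= 1 << j
    return bits

def hamming(a, b):
    """BITCOUNT(XOR(a, b)): number of bit positions where a and b differ."""
    return bin(a ^ b).count("1")

def binary_dot(p_bits, m_bits, d):
    """Inner product of two +/-1 vectors of length d: d - 2 * Hamming distance."""
    return d - 2 * hamming(p_bits, m_bits)

def approx_dot(p_bits, basis_bits, c, d):
    """Approximate p^T q as the c-weighted sum of the binary inner products."""
    return sum(ci * binary_dot(p_bits, mb, d) for ci, mb in zip(c, basis_bits))
```

When q is exactly equal to Mc, `approx_dot` reproduces p^T q; otherwise its error equals the inner product of p with the decomposition residual q - Mc.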

In the inner product calculation above, the binarized feature vector p corresponds to the "first binary vector", and the basis vector m_i corresponds to the "second binary vector".

The database 137 stores, as dictionary data, a plurality of real vectors decomposed by the real vector decomposition unit 135, that is, linear sums of a plurality of basis vectors. The vector operation unit 136 reads the linear sums of basis vectors from the database 137 and performs the operations described above. The database 137 corresponds to the "basis vector acquisition unit".

As described above, even when the processing that uses the feature vector includes an inner product between the feature vector and another real vector, the feature quantity arithmetic device 103 of this example binarizes the feature vector and also decomposes the real vector into a linear sum of binary vectors, so the inner product operation can be accelerated.

3-5. Extension of the first example of the third embodiment
In the first example above, the binary vectors p and m_i were defined as p ∈ {-1, 1}^d and m_i ∈ {-1, 1}^d, and it was explained that decomposing the real vector into a linear sum of binary vectors makes the inner product p^T m_i fast. However, a fast inner product is also possible when p and m_i are replaced by the more general binary vectors p' ∈ {-a, a}^d and m_i' ∈ {-a, a}^d. In this case, p'^T m_i' = a^2 (p^T m_i), so it suffices to multiply the inner product of the binary vectors defined by -1 and 1 by a^2. Here, the binary vector p obtained by dividing the feature vector p' by the coefficient a corresponds to the "first binary vector", and the binary vector m_i obtained by dividing the basis vector m_i' by the coefficient a corresponds to the "second binary vector".

Furthermore, a fast inner product is possible even when the feature vector and the basis vectors are arbitrary binary vectors p'' ∈ {α, β}^d and m_i'' ∈ {γ, δ}^d. Here, the coefficients α, β, γ, and δ are real numbers with α ≠ β and γ ≠ δ. In this case, p'' and m_i'' are obtained by applying a linear transformation to each element of the binary vectors p and m_i defined by -1 and 1, and are expanded as in equations (10) and (11) below.

$$p'' = A p + B \mathbf{1} \tag{10}$$

$$m_i'' = C m_i + D \mathbf{1} \tag{11}$$

The bold "1" in equations (10) and (11) is a vector of length d whose elements are all 1. A, B, C, and D in equations (10) and (11) are real numbers and may be calculated in advance so that equations (10) and (11) hold.

The inner product p''^T m_i'' can be expanded as in equation (12) below.

$$p''^T m_i'' = AC\,(p^T m_i) + AD\,(p^T \mathbf{1}) + BC\,(\mathbf{1}^T m_i) + BD\,(\mathbf{1}^T \mathbf{1}) \tag{12}$$

Each calculation in parentheses in equation (12) is an inner product between binary vectors consisting of -1 and 1. Therefore, fast computation is possible even when the feature vector is a binary vector with arbitrary binary elements and the real vector is expanded into a linear sum of binary vectors with arbitrary binary elements. In this case, the binary vector p from which the feature vector p'' is obtained by linearly transforming each element corresponds to the "first binary vector", and the binary vector m_i from which the basis vector m_i'' is obtained by linearly transforming each element corresponds to the "second binary vector".
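The expansion for arbitrary binary values can be checked numerically with the sketch below. It assumes the mapping -1 → α, +1 → β for the feature vector and -1 → γ, +1 → δ for the basis vector (the text only states that A, B, C, and D are computed in advance), and the helper names are hypothetical. The plain sums here stand in for the bit-count operations that would be used on packed vectors.

```python
def affine_coeffs(lo, hi):
    """Return (scale, shift) with scale * (-1) + shift == lo and scale * 1 + shift == hi."""
    return (hi - lo) / 2.0, (hi + lo) / 2.0

def general_binary_dot(p, m, alpha, beta, gamma, delta):
    """Inner product of p'' in {alpha, beta}^d and m'' in {gamma, delta}^d,
    computed only from inner products of the underlying +/-1 vectors p and m."""
    d = len(p)
    A, B = affine_coeffs(alpha, beta)    # p'' = A p + B 1
    C, D = affine_coeffs(gamma, delta)   # m'' = C m + D 1
    pm = sum(x * y for x, y in zip(p, m))  # p^T m
    p1 = sum(p)                            # p^T 1
    m1 = sum(m)                            # 1^T m
    return A * C * pm + A * D * p1 + B * C * m1 + B * D * d
```

The terms p^T 1 and 1^T m are themselves inner products with the all-ones binary vector, so they reduce to a bit count on the packed code.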

3-6. Second example of the third embodiment
Next, the feature quantity arithmetic device of the second example is described. Its configuration is the same as that of the first example shown in FIG. 29. In the first example, the real vector decomposition unit 135 decomposed the real vector into a linear sum of binary vectors according to equation (5); in this example, the real vector decomposition unit 135 decomposes the real vector into a linear sum of ternary vectors.

The real vector decomposition unit 135 decomposes the d-dimensional real vector q ∈ R^d into a linear sum of ternary vectors. Specifically, the real vector decomposition unit 135 decomposes q into a basis matrix M with ternary elements and a coefficient vector c with real elements according to equation (13) below.

$$q \approx Mc \tag{13}$$

Here, M = (m_1, m_2, ..., m_k) ∈ {-1, 0, 1}^{d×k} and c = (c_1, c_2, ..., c_k)^T ∈ R^k. That is, the basis matrix M consists of k basis vectors m_i, each of which is a d-dimensional ternary vector whose elements take only -1, 0, and 1; M is therefore a ternary matrix of d rows and k columns whose elements take only -1, 0, and 1. The coefficient vector c is a k-dimensional real vector whose elements are the real coefficients of the k basis vectors. The decomposition should make q and Mc agree as closely as possible, but it may contain an error. As in the first example, the real vector decomposition unit 135 decomposes the real vector by error minimization.

The vector operation unit 136 calculates the inner product p^T q. In the following, the vector operation unit 136 that calculates the inner product p^T q is also referred to specifically as the inner product operation unit 136. The inner product p^T q can be rewritten as equation (14) below.

$$p^T q \approx p^T M c = \sum_{i=1}^{k} c_i \, p^T m_i \tag{14}$$

Here, p^T m_i is the inner product of the binary vector p and the ternary vector m_i. In place of the ternary vector m_i, the inner product operation unit 136 uses the 0-replaced vector m_i^bin, the filter vector m_i^filter, and the number of zero elements z_i, defined as follows.

First, the inner product operation unit 136 replaces each 0 element of m_i with -1 or 1. For each such element, either -1 or 1 may be chosen. This replacement produces the 0-replaced vector m_i^bin ∈ {-1, 1}^d, which is a binary vector.

The inner product operation unit 136 also replaces each 0 element of m_i with -1 and each nonzero element with 1. This replacement produces the filter vector m_i^filter ∈ {-1, 1}^d, which is also a binary vector.

The inner product operation unit 136 further obtains the number z_i of 0 elements in m_i; z_i is an integer. Using the binary vector m_i^bin, the filter vector m_i^filter, and the number of zero elements z_i, the inner product operation unit 136 calculates p^T m_i in equation (14) by equations (15) and (16) below.

$$D_{filtered\_hamming}(p, m_i^{bin}, m_i^{filter}) = \mathrm{BITCOUNT}(\mathrm{AND}(\mathrm{XOR}(p, m_i^{bin}), m_i^{filter})) \tag{15}$$

$$p^T m_i = d - z_i - 2 D_{filtered\_hamming}(p, m_i^{bin}, m_i^{filter}) \tag{16}$$

Here, the AND function in equation (15) takes the logical product of the binary vectors in their binary code representation.

以下、図３１を参照して、具体例を用いて、式（１５）及び（１６）の導出を説明する。図３１は、本例の計算例を示す図である。図３１の例では、ｐ＝｛−１，１，−１，１，−１，１｝であり、ｍ_i＝｛−１，０，１，０，１，１｝である。この例では、ｍ_i ^bin＝｛−１，＊，１，＊，１，１｝となる。ここで、「＊」は−１又は１の任意のいずれかを示す。また、ｍ_i ^filter＝｛１，−１，１，−１，１，１｝となり、ｚ_i＝２となる。 Hereinafter, with reference to FIG. 31, the derivation of the equations (15) and (16) will be described using a specific example. FIG. 31 is a diagram illustrating a calculation example of this example. In the example of FIG. 31, p = {− 1,1, −1,1, −1,1} and m _i = {− 1,0,1,0,1,1}. In this _{^{example, m i bin = {- 1}} , *, 1, *, 1,1} a. Here, “*” represents any one of −1 or 1. Further, m _i ^filter = {1, -1,1, -1,1,1}, and z _i = 2.

式（１５）におけるｐとｍ_i ^binとの排他的論理和は、ＸＯＲ（ｐ，ｍ_i ^bin）＝｛−１，＊，１，＊，１，−１｝となり、すなわち、ｐとｍ_iの要素のうち、非０で異なっている要素すなわち−１と１又は１と−１の組となる要素が１となり、−１と−１又は１と１の組となる要素が−１となる。 The exclusive OR of p and m _i ^bin in equation (15) is XOR (p, m _i ^bin ) = {− 1, *, 1, *, 1, −1}, that is, p and m _i Among the elements of, elements that are non-zero and different, that is, elements that are a pair of -1 and 1 or 1 and -1, are 1, and elements that are a pair of -1 and -1 or 1 and 1 are -1. .

Next, the logical product of this exclusive OR and m_i^filter is AND(XOR(p, m_i^bin), m_i^filter) = {−1, −1, 1, −1, 1, −1}: a 1 is set at each element where p and m_i are non-zero and differ, and every other element is −1. Taking the bit count of this result counts the elements equal to 1, that is, the number of non-zero differing elements, which gives D_{filterd_hamming}(p, m_i^bin, m_i^filter) = 2.

Here, the number of elements of p and m_i that form a pair of 1 and 1 or of −1 and −1 is obtained by subtracting, from the total number of elements d = 6, the number of non-zero differing elements D_{filterd_hamming} = 2 and the number of zero elements z_i = 2. That is, (number of elements forming a pair of 1 and 1 or of −1 and −1) = d − D_{filterd_hamming} − z_i = 6 − 2 − 2 = 2.

The inner product p^T m_i equals the number of elements forming a pair of 1 and 1 or of −1 and −1 (pairs whose product is 1) minus the number of elements forming a pair of −1 and 1 or of 1 and −1 (pairs whose product is −1). Hence p^T m_i = (d − D_{filterd_hamming} − z_i) − D_{filterd_hamming} = d − z_i − 2D_{filterd_hamming}, which is equation (16); in this example its value is 6 − 2 − 2 × 2 = 0. As expected, this agrees with the direct computation p^T m_i = {−1, 1, −1, 1, −1, 1} × {−1, 0, 1, 0, 1, 1} = 1 + 0 + (−1) + 0 + (−1) + 1 = 0.

Combining equations (15) and (16), the inner product p^T q can be rewritten as equation (17):
\[ p^T q \approx \sum_{i=1}^{k} c_i \left( d - z_i - 2 D_{filterd\_hamming}(p, m_i^{bin}, m_i^{filter}) \right) \quad (17) \]
The inner product calculation unit 136 calculates the inner product p^T q by equation (17).

The function D_{filterd_hamming}(p, m_i^bin, m_i^filter) closely resembles a Hamming distance computation, with only an AND operation added. Therefore, even when q ∈ R^d is decomposed into a linear sum of ternary vectors, p^T q can be computed far faster than with floating-point arithmetic.
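The worked example of FIG. 31 can be reproduced with ordinary integer bit operations. The following is a minimal sketch, not the patented implementation: it assumes that the value 1 is encoded as a set bit and −1 as a clear bit, that the "*" positions of m_i^bin are encoded as clear bits, and all function names are hypothetical.

```python
def pack(bits):
    """Pack a list of 0/1 flags into an int, first element in the lowest bit."""
    value = 0
    for pos, bit in enumerate(bits):
        value |= (bit & 1) << pos
    return value


def encode_binary(p):
    """Encode a {-1, 1} vector as a bit mask (1 -> set bit, -1 -> clear bit)."""
    return pack([1 if x == 1 else 0 for x in p])


def encode_ternary(m):
    """Return (m_bin, m_filter, z) for a {-1, 0, 1} vector.

    m_bin holds the sign bits (zero elements are arbitrarily left clear),
    m_filter has a set bit wherever m is non-zero, and z counts the zeros.
    """
    m_bin = pack([1 if x == 1 else 0 for x in m])
    m_filter = pack([1 if x != 0 else 0 for x in m])
    z = sum(1 for x in m if x == 0)
    return m_bin, m_filter, z


def filtered_hamming(p_bits, m_bin, m_filter):
    """Equation (15): BITCOUNT(AND(XOR(p, m_bin), m_filter))."""
    return bin((p_bits ^ m_bin) & m_filter).count("1")


def ternary_inner_product(p, m):
    """Equation (16): p^T m = d - z - 2 * D_filterd_hamming."""
    d = len(p)
    m_bin, m_filter, z = encode_ternary(m)
    return d - z - 2 * filtered_hamming(encode_binary(p), m_bin, m_filter)


# Values taken from the FIG. 31 example in the text.
p = [-1, 1, -1, 1, -1, 1]
m = [-1, 0, 1, 0, 1, 1]
m_bin, m_filter, z = encode_ternary(m)
distance = filtered_hamming(encode_binary(p), m_bin, m_filter)  # 2
result = ternary_inner_product(p, m)                            # 0
```

In a real implementation the bit count would typically map to a hardware population-count instruction; the string-based count here is only for portability of the sketch.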

As described above, the advantage of decomposing the d-dimensional real vector q ∈ R^d into a linear sum of ternary rather than binary vectors is that the approximation of equation (13) holds with a linear sum of fewer vectors. That is, the value of k can be kept small, which leads to a further speedup.

3-7. Extension of the second example of the third embodiment
In the second example above, the binary vector p and the ternary vector m_i were defined as p ∈ {−1, 1}^d and m_i ∈ {−1, 0, 1}^d, and it was explained that decomposing the real vector into a linear sum of ternary vectors makes the inner product p^T m_i fast. However, a fast inner product is also possible for the more general binary vector p′ ∈ {−a, a}^d and ternary vector m_i′ ∈ {−a, 0, a}^d. In this case, since p′^T m_i′ = a²(p^T m_i), it suffices to multiply the inner product of the vectors defined by −1 and 1 by a².

Furthermore, a fast inner product remains possible even when the binary vector p and the ternary vector m_i are generalized to p ∈ {α, β}^d and m_i ∈ {γ − δ, γ, γ + δ}^d. Here α, β, γ, and δ are real numbers with α ≠ β and δ ≠ 0. In this case, applying the linear transformations of equations (18) and (19) to the elements of p and m_i yields p″ and m_i″, respectively.
The bold "1" in equations (18) and (19) denotes a vector of length d whose elements are all 1. A, B, C, and D in equations (18) and (19) are real numbers, calculated in advance so that equations (18) and (19) hold.

The inner product p″^T m_i″ can be expanded as in equation (20).
Each calculation in parentheses in equation (20) is an inner product between binary vectors consisting of −1 and 1, or between a binary vector consisting of −1 and 1 and a ternary vector consisting of −1, 0, and 1. Accordingly, fast computation is possible even when the feature vector is an arbitrary binary vector and the real vector is expanded into a linear sum of ternary vectors generalized as above.
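Equations (18) to (20) are not reproduced in this text, so the following sketch shows only one reading that is consistent with the description, and it should be treated as an assumption. It takes p″ = A·p + B·1 and m_i″ = C·m_i + D·1, with p ∈ {−1, 1}^d and m_i ∈ {−1, 0, 1}^d, which gives A = (β − α)/2, B = (α + β)/2, C = δ, D = γ. The expansion is then p″^T m_i″ = AC(p^T m_i) + AD(p^T 1) + BC(1^T m_i) + BD·d, in which every inner product involves only −1/1 and −1/0/1 vectors. All names are hypothetical.

```python
def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def generalized_inner_product(p, m, alpha, beta, gamma, delta):
    """Inner product of the generalized vectors from canonical p and m.

    p is in {-1, 1}^d and m is in {-1, 0, 1}^d. The generalized vectors are
    assumed to be p'' = A*p + B*1 (values alpha, beta) and
    m'' = C*m + D*1 (values gamma - delta, gamma, gamma + delta).
    """
    d = len(p)
    A = (beta - alpha) / 2.0
    B = (beta + alpha) / 2.0
    C = float(delta)
    D = float(gamma)
    ones = [1] * d
    # Only the three dot products below depend on the data, and each one is
    # a binary/binary or binary/ternary inner product.
    return (A * C * dot(p, m)
            + A * D * dot(p, ones)
            + B * C * dot(ones, m)
            + B * D * d)


p = [-1, 1, -1, 1, -1, 1]
m = [-1, 0, 1, 0, 1, 1]
alpha, beta, gamma, delta = 0.0, 1.0, 0.5, 2.0

fast = generalized_inner_product(p, m, alpha, beta, gamma, delta)

# Reference: build the generalized vectors explicitly and multiply directly.
p_general = [beta if x == 1 else alpha for x in p]
m_general = [gamma + delta * x for x in m]
direct = dot(p_general, m_general)
```

The terms p^T 1 and 1^T m_i depend on only one of the two vectors, so in practice they could be computed once per vector and reused.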

3-8. Third example of the third embodiment
The first and second examples described the inner product of the feature vector p and the real vector q that is performed in the arithmetic processing of the vector calculation unit 136. Arithmetic processing that involves this inner product is described later as application examples; in such processing, the inner product p^T q is sometimes compared with a threshold T. For example, when a feature vector is classified, the inner product p^T q is compared with a threshold T.

In the first and second examples, the inner product p^T q is expressed by equation (21), as shown in equation (6) (when the real vector q is decomposed into a linear sum of binary vectors) and equation (14) (when the real vector q is decomposed into a linear sum of ternary vectors):
\[ p^T q \approx \sum_{i=1}^{k} c_i (p^T m_i) \quad (21) \]
Focusing on this point, when the computation using the feature vector includes a comparison of the inner product p^T q with a threshold (threshold processing), the vector calculation unit 136 can speed up the threshold processing by dividing it into k stages (a cascade).

3-8-1. First cascade
The speedup of threshold processing by the first cascade is described below. In the following example, the threshold is T and the condition p^T q > T is tested. FIG. 32 is a flowchart of the cascaded threshold processing in the vector calculation unit 136. The vector calculation unit 136 first sets i = 1 and y = 0 (step S11). Next, it updates y to y + c_i(p^T m_i) (step S12). It then determines whether y is smaller than T_i^min (step S13). T_i^min is the minimum-side early determination threshold for deciding p^T q < T early; how it is determined is described later. If y is smaller than T_i^min (YES in step S13), the unit determines that p^T q < T (step S14), aborts the inner product calculation of the feature vector p and the real vector q, and ends the processing.

If y is equal to or greater than T_i^min (NO in step S13), the unit next determines whether y is greater than T_i^max (step S15). T_i^max is the maximum-side early determination threshold for deciding p^T q > T early; how it is determined is described later. If y is greater than T_i^max (YES in step S15), the unit determines that p^T q > T (step S16), aborts the inner product calculation of the feature vector p and the real vector q, and ends the processing.

If y is equal to or less than T_i^max (NO in step S15), the unit determines whether i = k, that is, whether any basis m_i remains to be calculated (step S17). If i ≠ k (NO in step S17), it increments i (step S18) and returns to step S12. If i = k, that is, if all bases m_i have been calculated (YES in step S17), the processing moves to step S14, where the unit determines that p^T q < T and ends the processing.

As described above, when the inner product of the binary vector p and the real vector q is calculated, decomposing the real vector q into a linear sum of binary or ternary vectors allows the threshold processing to be divided into k stages. As a result, when p^T q < T clearly holds or p^T q > T clearly holds, the result of the threshold processing is obtained early, and the number of basis inner products evaluated for the binary vector p and the real vector q can be kept below k.
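The flow of steps S11 to S18 can be sketched as a short loop. This is an illustrative sketch under stated assumptions: the per-stage thresholds are supplied by the caller, a plain integer dot product stands in for the Hamming-based inner product, and the function names and the returned stage count are hypothetical.

```python
def dot(u, v):
    """Plain inner product; stands in for the Hamming-based computation."""
    return sum(a * b for a, b in zip(u, v))


def cascade_threshold(p, coeffs, bases, t_min, t_max):
    """Cascaded test of p^T q > T with q approximated by sum_i c_i * m_i.

    Returns (decision, stages_used); decision is True when p^T q > T.
    """
    y = 0.0                                  # S11
    k = len(coeffs)
    for i in range(k):
        y += coeffs[i] * dot(p, bases[i])    # S12
        if y < t_min[i]:                     # S13: clearly below T
            return False, i + 1              # S14
        if y > t_max[i]:                     # S15: clearly above T
            return True, i + 1               # S16
    # S17 reached with i = k: every basis was used and y <= T_k^max.
    return False, k


# Hypothetical decomposition with k = 2 bases in d = 4 dimensions.
bases = [[1, 1, -1, -1], [1, -1, 1, -1]]
coeffs = [2.0, 0.5]
# Thresholds for T = 1.0; the last stage uses T itself on both sides.
t_min = [-1.0, 1.0]
t_max = [3.0, 1.0]

early_accept = cascade_threshold([1, 1, -1, -1], coeffs, bases, t_min, t_max)
early_reject = cascade_threshold([-1, -1, 1, 1], coeffs, bases, t_min, t_max)
full_run = cascade_threshold([1, -1, 1, -1], coeffs, bases, t_min, t_max)
```

The first two inputs are decided after a single basis, while the third needs both stages, which is the saving the cascade is intended to provide.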

The determination result does not coincide completely with the result obtained without the cascade (non-cascade). However, by choosing T_i^min and T_i^max appropriately, it can be made to agree with the non-cascade result as closely as desired.

The effect of early determination can be enhanced by sorting the coefficients c_i by magnitude. It can also be enhanced by performing the decomposition so that the magnitudes of the c_i differ widely.

Next, the method of determining the minimum-side early determination threshold T_i^min and the maximum-side early determination threshold T_i^max is described. Consider determining T_1^min and T_1^max for the first cascade stage: a threshold that permits a safe decision must be chosen while c_i(p^T m_i) (i = 2, 3, …, k) are still unknown. Therefore, the maximum value P_i^max and the minimum value P_i^min that c_i(p^T m_i) can take are obtained in advance. These are obtained by substituting binary feature vectors p, extracted from a plurality of data prepared beforehand for training, into c_i(p^T m_i).

The minimum-side early determination threshold T_i^min can be determined as follows. If
\[ c_1 (p^T m_1) + \sum_{i=2}^{k} P_i^{max} < T \]
holds, then p^T q < T holds no matter how large the values arriving from the second cascade stage onward, so p^T q < T is guaranteed at this point. Accordingly, T_1^min can be obtained by equation (22):
\[ T_1^{min} = T - \sum_{i=2}^{k} P_i^{max} \quad (22) \]
T_i^min for the second and later stages can be determined in the same manner.

The maximum-side early determination threshold T_i^max can be determined as follows. If
\[ c_1 (p^T m_1) + \sum_{i=2}^{k} P_i^{min} > T \]
holds, then p^T q > T holds no matter how small the values arriving from the second cascade stage onward, so p^T q > T is guaranteed at this point. Accordingly, T_1^max can be obtained by equation (23):
\[ T_1^{max} = T - \sum_{i=2}^{k} P_i^{min} \quad (23) \]
T_i^max for the second and later stages can be determined in the same manner. As is clear from the above, T_k^min and T_k^max are both equal to T.

In equation (23), P_i^min need not be the exact minimum; a sufficiently small value, for example one within the lowest few percent, may be chosen instead. Likewise, in equation (22), P_i^max need not be the exact maximum; a sufficiently large value within the top few percent may be chosen.
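The threshold construction can be sketched as follows. The text gives the first-stage formulas and states that later stages are handled in the same manner; the general form used here, T_j^min = T − Σ_{i>j} P_i^max and T_j^max = T − Σ_{i>j} P_i^min, is an assumption drawn from that remark, and the function names are hypothetical. The exact extremes are used rather than the percentile variant.

```python
from itertools import product


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def stage_bounds(samples, coeffs, bases):
    """Return (P_min, P_max): extremes of c_i * (p^T m_i) over the samples."""
    p_min, p_max = [], []
    for c, m in zip(coeffs, bases):
        values = [c * dot(p, m) for p in samples]
        p_min.append(min(values))
        p_max.append(max(values))
    return p_min, p_max


def early_thresholds(T, p_min, p_max):
    """Per-stage thresholds; stage j only has to allow for stages after j."""
    k = len(p_min)
    t_min = [T - sum(p_max[j + 1:]) for j in range(k)]
    t_max = [T - sum(p_min[j + 1:]) for j in range(k)]
    return t_min, t_max


# Hypothetical training set: every vector of {-1, 1}^4.
samples = [list(v) for v in product([-1, 1], repeat=4)]
bases = [[1, 1, -1, -1], [1, -1, 1, -1]]
coeffs = [2.0, 0.5]

p_min, p_max = stage_bounds(samples, coeffs, bases)
t_min, t_max = early_thresholds(1.0, p_min, p_max)
# t_min == [-1.0, 1.0] and t_max == [3.0, 1.0]; the last stage equals T.
```

Replacing `min` and `max` in `stage_bounds` with low and high percentiles would give tighter thresholds, at the cost of occasional disagreement with the non-cascade result.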

3-8-2. Second cascade
In the second cascade, a deeper cascade is realized by decomposing the real vector into a plurality of sub-vectors, which further speeds up the comparison with the threshold. The first cascade divided the threshold processing into k stages by focusing on the relation
\[ p^T q \approx \sum_{i=1}^{k} c_i (p^T m_i) \]
The second cascade extends this as follows.

First, q ∈ R^d is decomposed into m sub-vectors as in equation (24):
\[ q = (q_1^T, q_2^T, \ldots, q_m^T)^T \quad (24) \]
Similarly, p ∈ {−1, 1}^d is decomposed into m sub-vectors as in equation (25):
\[ p = (p_1^T, p_2^T, \ldots, p_m^T)^T \quad (25) \]
Here, q_i and p_i are assumed to have the same number of dimensions.

The inner product p^T q can then be written as equation (26):
\[ p^T q = \sum_{i=1}^{m} p_i^T q_i \quad (26) \]
Since each of the m inner products p_i^T q_i is an inner product of a binary vector and a real vector, the binary or ternary decomposition can be applied to each of them as well. For example, if each inner product p_i^T q_i is decomposed into k terms, p^T q can be processed as a cascade of mk stages. This increases the number of cascade stages and improves the effect of early determination. In the second cascade as well, the effect of early determination can be improved further by ordering the cascade stages by the absolute value of the coefficients, or in an order that reduces misclassification.
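The sub-vector split of equations (24) to (26) and the ordering of the resulting mk stages can be sketched as below. The sketch assumes contiguous, equal-length sub-vectors and ordering by descending absolute coefficient; the per-block coefficients and bases are made-up illustrative values rather than an actual decomposition of q, and all names are hypothetical.

```python
def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def split(v, parts):
    """Split v into `parts` contiguous sub-vectors of equal length."""
    if len(v) % parts != 0:
        raise ValueError("length must be divisible by the number of parts")
    size = len(v) // parts
    return [v[i * size:(i + 1) * size] for i in range(parts)]


def blockwise_inner_product(p, q, parts):
    """Equation (26): p^T q as the sum of the sub-vector inner products."""
    return sum(dot(p_i, q_i)
               for p_i, q_i in zip(split(p, parts), split(q, parts)))


def order_stages(block_decompositions):
    """Flatten per-block (coefficient, basis) pairs into cascade stages.

    Each stage is (block_index, coefficient, basis); stages are sorted by
    descending absolute coefficient so large contributions come first.
    """
    stages = [(block, c, m)
              for block, pairs in enumerate(block_decompositions)
              for c, m in pairs]
    return sorted(stages, key=lambda s: abs(s[1]), reverse=True)


p = [1, -1, 1, 1, -1, -1]
q = [0.5, -1.0, 2.0, 0.25, 1.5, -0.5]
whole = dot(p, q)
blocked = blockwise_inner_product(p, q, 3)

# m = 3 blocks with k = 2 illustrative terms each gives mk = 6 stages.
decompositions = [
    [(0.5, [1, 1]), (0.1, [1, -1])],
    [(2.0, [1, -1]), (0.3, [-1, -1])],
    [(-1.0, [1, 1]), (0.2, [1, -1])],
]
stages = order_stages(decompositions)
```

Each stage then contributes c · (p_block^T basis) to the running sum y, exactly as in the first cascade but with shorter vectors per stage.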

In the third example above, to test p^T q > T, each basis was checked both for whether p^T q < T clearly holds (that is, p^T q > T clearly does not hold) and for whether p^T q > T clearly holds. However, only one of these two checks may be performed.

Next, the arithmetic processing in the vector calculation unit 136 is described. The vector calculation unit 136 of the first and second examples performs processing that involves the inner product of the binarized feature vector p and the real vector q, and there are many kinds of such processing. That is, the above examples of the present embodiment can be applied to various devices that perform arithmetic processing using feature vectors. As noted above, the third example applies in particular to devices whose processing involves comparing a feature vector with a threshold. Application examples of the present embodiment are described below.

3-9. First application example of the third embodiment
In this application example, the present embodiment is applied to object recognition using HOG. FIG. 33 is a block diagram showing the configuration of an object recognition device. The object recognition device 104 performs object recognition using HOG. It includes a pyramid image generation unit 141, a HOG feature extraction unit 142, a binary code conversion unit 143, a parameter determination unit 144, a parameter matrix decomposition unit 145, and a linear SVM classification unit 146.

The pyramid image generation unit 141 acquires an image as an input query and generates a pyramid image by reducing that image at a plurality of scale factors. This makes it possible to handle objects of different sizes. The pyramid image generation unit 141 corresponds to the content acquisition unit 131 shown in FIG. 29. The HOG feature extraction unit 142 divides the image at each level of the pyramid image into cells of 8 × 8 pixels and extracts a D-dimensional HOG feature from each cell. The HOG feature extraction unit 142 corresponds to the feature vector generation unit 132 shown in FIG. 29. The binary code conversion unit 143 converts the D-dimensional feature given to each cell into a d-dimensional binary vector. The binary code conversion unit 143 corresponds to the feature vector binarization unit 133 shown in FIG. 29.

The parameter determination unit 144 determines the weight vector w and the real-valued bias b used by the linear SVM in the linear SVM classification unit 146. It determines w and b by a learning process using features prepared for training. The parameter matrix decomposition unit 145 decomposes the weight vector w into a linear sum of discrete-valued vectors by equation (5) or equation (13), as described in the first or second example.

The linear SVM classification unit 146 classifies feature vectors with a linear SVM. It first forms a window by grouping W × H cells. The feature vector extracted from one window is a (W × H × d)-dimensional vector. The linear SVM classification unit 146 applies the linear SVM of equation (27) to this feature vector x:
\[ f(x) = w^T x + b \quad (27) \]
Here, the inner product w^T x in the linear SVM can be realized by the fast inner product of a real vector and a binary vector described in the first or second example.
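As an illustration, the linear SVM score for one window can be evaluated from a decomposed weight vector without ever forming w. The sketch below assumes the decision function f(x) = w^T x + b with w replaced by Σ c_i m_i; the cell codes, bases, coefficients, and bias are made-up values and the function names are hypothetical.

```python
def dot(u, v):
    """Plain inner product; stands in for the Hamming-based computation."""
    return sum(a * b for a, b in zip(u, v))


def window_vector(cell_codes):
    """Concatenate the d-dimensional binary codes of the W x H cells."""
    return [bit for code in cell_codes for bit in code]


def svm_score(x, coeffs, bases, bias):
    """f(x) = sum_i c_i * (x^T m_i) + b, using the decomposed weights."""
    return sum(c * dot(x, m) for c, m in zip(coeffs, bases)) + bias


# A hypothetical 2 x 1 window with d = 3 bits per cell.
cells = [[1, -1, 1], [-1, -1, 1]]
x = window_vector(cells)

# Hypothetical ternary decomposition of the weight vector, k = 2.
bases = [[1, 0, -1, 1, 0, -1], [0, 1, 1, 0, -1, -1]]
coeffs = [0.75, 0.25]
bias = -0.5

score = svm_score(x, coeffs, bases, bias)

# Reference: reconstruct the weight vector and evaluate w^T x + b directly.
w = [sum(c * m[j] for c, m in zip(coeffs, bases)) for j in range(len(x))]
reference = dot(x, w) + bias
```

Because the classification only needs the sign of the score relative to a threshold, this evaluation combines naturally with the cascaded threshold processing of the third example.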

3-10. Second application example of the third embodiment
In this application example, the present embodiment is applied to k-means clustering. FIG. 34 is a block diagram showing the configuration of a k-means clustering device. The k-means clustering device 105 includes a content acquisition unit 151, a feature vector generation unit 152, a feature vector binarization unit 153, a representative vector update unit 154, a convergence determination unit 155, a representative vector decomposition unit 156, and a nearest representative vector search unit 157.

The content acquisition unit 151 acquires the N contents to be clustered. The feature vector generation unit 152 extracts a feature vector from each content acquired by the content acquisition unit 151. The feature vector binarization unit 153 binarizes each feature vector extracted by the feature vector generation unit 152.

The representative vector update unit 154 first selects k vectors at random from the N feature vectors binarized by the feature vector binarization unit 153 and uses them as the representative vectors. The convergence determination unit 155 performs a convergence check each time the representative vector update unit 154 updates the representative vectors. When the convergence determination unit 155 determines that convergence has been reached, the k-means clustering device 105 ends the clustering process. The representative vector decomposition unit 156 decomposes the representative vectors updated by the representative vector update unit 154 into discrete-valued (binary or ternary) vectors.

The nearest representative vector search unit 157 assigns each of the N binary vectors input from the feature vector binarization unit 153 to its nearest representative vector and outputs the result to the representative vector update unit 154. For each representative vector, the representative vector update unit 154 calculates the mean of the (binarized) feature vectors assigned to it and uses this mean as the new representative vector. Because a representative vector updated in this way is the mean of binary vectors, it is a real vector.

Therefore, without the representative vector decomposition unit 156, the nearest representative vector search unit 157 would have to compute the inner product of each updated representative vector (a real vector) and each feature vector (a binary vector) to obtain the distance between them. In this application example, the representative vector decomposition unit 156 instead decomposes each representative vector (real vector) into discrete-valued (binary or ternary) vectors, as described in the first or second example. This speeds up the distance calculation between each feature vector and each representative vector in the nearest representative vector search unit 157, so the representative vector nearest to each feature vector (that is, the one to which it should be assigned) can be found quickly.
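The assignment step can be sketched with the identity ‖p − q‖² = d − 2 p^T q + ‖q‖², which holds for any p ∈ {−1, 1}^d. The sketch assumes that squared Euclidean distance is the distance in use and that ‖q‖² is precomputed when a representative vector is updated; the representative vectors below are made-up values that their decompositions reproduce exactly, and all names are hypothetical.

```python
def dot(u, v):
    """Plain inner product; stands in for the Hamming-based computation."""
    return sum(a * b for a, b in zip(u, v))


def nearest_representative(p, representatives):
    """Return (index, squared_distance) of the closest representative.

    Each representative is (coeffs, bases, sq_norm), where the real vector
    is approximated by sum_i c_i * m_i and sq_norm is its squared norm.
    """
    d = len(p)
    best_index, best_dist = None, None
    for index, (coeffs, bases, sq_norm) in enumerate(representatives):
        inner = sum(c * dot(p, m) for c, m in zip(coeffs, bases))
        dist = d - 2.0 * inner + sq_norm   # ||p||^2 = d for a +-1 vector
        if best_dist is None or dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


# Representative 0 is q = [0.5, 0.5, 0.5, 0.5].
rep0 = ([0.5], [[1, 1, 1, 1]], 1.0)
# Representative 1 is q = [-0.25, -0.75, 0.5, 0.5].
rep1 = ([0.5, 0.25], [[-1, -1, 1, 1], [1, -1, 0, 0]], 1.125)

p = [-1, -1, 1, 1]
index, dist = nearest_representative(p, [rep0, rep1])
```

With an approximate decomposition the distances are approximate as well, so near-ties between representatives may resolve differently than with exact arithmetic.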

3-11. Third application example of the third embodiment
In this application example, the present embodiment is applied to approximate nearest neighbor search using a k-means tree. As its approximate nearest neighbor search method based on a k-ary tree built with k-means, the approximate nearest neighbor search device of this example adopts the method proposed in Marius Muja and David G. Lowe, "Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration", in International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009 (http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN, http://people.cs.ubc.ca/~mariusm/uploads/FLANN/flann_visapp09.pdf).

Specifically, the approximate nearest neighbor search device of this example builds a k-ary tree by applying k-means recursively to N data items and searches approximately for the nearest neighbor using the tree search principle of the above proposal. That method is designed on the premise that the data are real vectors and the representative vectors registered in the nodes are binary vectors. However, even when the data are binary vectors and the representative vectors registered in the nodes are real vectors, the tree search can be sped up by adopting the first or second example.

3-12. Modifications of the third embodiment
In the feature amount calculation device 103, some of the content acquisition unit 131, the feature vector generation unit 132, the feature vector binarization unit 133, the real vector acquisition unit 134, the real vector decomposition unit 135, and the vector calculation unit 136 may be configured as a device separate from the others. In particular, the content acquisition unit 131, the feature vector generation unit 132, the feature vector binarization unit 133, and the vector calculation unit 136 may be mounted on the feature amount calculation device 103, while the real vector acquisition unit 134 and the real vector decomposition unit 135 are mounted on another device. In this case, the plurality of real vectors decomposed by the real vector decomposition unit 135 (sets of a plurality of coefficient vectors and basis vectors) are stored in a database of the feature amount calculation device 103, and the vector calculation unit 136 acquires the decomposed real vectors from the database. The vector calculation unit 136 then functions as a basis vector acquisition unit (first and second examples), or as a binary vector acquisition unit (first example) or a ternary vector acquisition unit (second example).

The content data acquired by the content acquisition unit 131 may be measurement data obtained from a vehicle. Such measurement data may be, for example, image data captured by a camera installed in the vehicle or sensing data measured by a sensor installed in the vehicle. In this case, the vector calculation unit 136 of the feature amount calculation device 103, acting as a relevance determination device, determines the relevance between the measurement data and dictionary data. For example, when image data captured by a camera installed in the vehicle is acquired as the measurement data, data of a plurality of person images are stored in the database as the dictionary data, and the vector calculation unit 136 may determine, by any of the fourth to sixth examples, whether the image of the image data contains a person.

4. Fourth embodiment
4-1. Background
The third embodiment started from the recognition that, both in recognition by a classifier and in k-means clustering, speeding up the inner product of a binary vector and a d-dimensional real vector is what solves the problem. On that basis, it described a relevance determination device that, when the feature vector is a d-dimensional binary vector p ∈ {−1, 1}^d, computes the inner product (p^T q or q^T p) between such a feature vector and a d-dimensional real vector q ∈ R^d at high speed.

That is, the relevance determination device of the third embodiment includes: a feature vector acquisition unit that acquires a binarized feature vector; a basis vector acquisition unit that acquires a plurality of basis vectors obtained by decomposing a real vector into a linear sum of basis vectors whose elements consist only of binary or ternary discrete values; and a vector calculation unit that determines the relevance between the real vector and the feature vector by sequentially calculating the inner product of the feature vector with each of the basis vectors. With this configuration, the real vector is decomposed into a linear sum of discrete-valued basis vectors before its inner product with the binarized feature vector is calculated, so the inner product of the feature vector and the real vector can be computed faster.

In some cases, however, the relevance between a binarized feature vector and each of a plurality of real vectors must be determined by computing the inner product of the feature vector with each of those real vectors. For example, as described above, a linear SVM only determines whether a feature vector belongs to class A or class B, that is, whether the feature vector satisfies a single identification criterion, but one may want to perform such identification for a plurality of criteria. As a concrete example, one may want to determine, for a captured image, whether it shows an adult, whether it shows a child, whether it shows a car, and whether it shows a road sign.

Also, in the k-means clustering described above, for each of the N feature vectors given as input, distances to the k representative vectors are computed, and each distance computation involves an inner product. As described above, each of the k representative vectors is defined as the mean of binary vectors and is therefore a real vector. Thus k-means clustering also requires inner products between a binarized feature vector and a plurality of real vectors.

4-2. Overview
The present embodiment therefore aims to speed up the inner product calculation between a binarized feature vector and a plurality of real vectors, and thereby to determine the relevance between such a feature vector and the plurality of real vectors at high speed.

The relevance determination device of the present embodiment includes: a feature vector acquisition unit that acquires a binarized feature vector; a real matrix decomposition unit that decomposes a real matrix composed of a plurality of real vectors into the product of a coefficient matrix and a basis matrix composed of a plurality of basis vectors whose elements take only binary or ternary discrete values; and a vector calculation unit that, as the calculation of the inner product of the feature vector with each of the plurality of real vectors, computes the product of the feature vector and the basis matrix, then computes the product of that result and the coefficient matrix, and uses the outcome to determine the relevance between each of the plurality of real vectors and the feature vector. With this configuration, to compute the inner product of the feature vector with each of the real vectors, the real matrix is first decomposed into a discrete-valued basis matrix and a coefficient matrix, and the feature vector is then multiplied by the basis matrix and the result by the coefficient matrix. The inner products between the feature vector and the plurality of real vectors are thus obtained at high speed, and the relevance between the feature vector and the plurality of real vectors can be determined at high speed.
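The two-stage product described above can be sketched in NumPy as follows. The function name and the array layout (basis matrix M of shape d×k, coefficient matrix C of shape k×L) are assumptions chosen for illustration; the sketch only shows the order of operations, not an optimized bitwise implementation.

```python
import numpy as np


def scores_via_decomposition(p, M, C):
    """Approximate Q^T p as C^T (M^T p), where Q is approximated by M @ C.

    p : (d,)   binarized feature vector with entries in {-1, +1}
    M : (d, k) basis matrix with discrete entries
    C : (k, L) real coefficient matrix
    """
    t = M.T @ p      # k inner products with the discrete basis vectors
    return C.T @ t   # L real-valued results, one per real vector q_n
```

Evaluating Q^T p directly takes on the order of d·L real multiply-adds, whereas the decomposed form takes d·k operations on discrete values followed by k·L real multiply-adds.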

The above relevance determination device may further include a real matrix generation unit that generates the real matrix by arranging the plurality of real vectors. With this configuration, the real matrix can easily be generated from the plurality of real vectors.

In the above relevance determination device, when the plurality of real vectors have a predetermined parameter, the real matrix generation unit may generate the real matrix by arranging the plurality of real vectors in the order of that parameter. With this configuration, real vectors that resemble each other become adjacent in the real matrix, so adjacent coefficient vectors in the coefficient matrix also become similar.

In the above relevance determination device, the real matrix decomposition unit may decompose the real matrix by solving the following cost function.
[Equation not reproduced in this text]
Here, Q is the real matrix, M is the basis matrix, and C is the coefficient matrix. With this configuration, the error incurred when the real matrix is decomposed into the product of the basis matrix and the coefficient matrix is evaluated as the cost, so the real matrix can be decomposed easily and accurately. Specifically, the real matrix can be decomposed using the basis matrix and coefficient matrix that minimize this cost function (that is, that satisfy a predetermined convergence condition).

In the above relevance determination device, the real matrix decomposition unit may obtain the basis matrix and the coefficient matrix by repeating a first update, in which the elements of the basis matrix are fixed and the elements of the coefficient matrix are optimized by the least squares method, and a second update, in which the elements of the coefficient matrix are fixed and the elements of the basis matrix are optimized by exhaustive search. With this configuration, the real matrix can be decomposed easily. Note that, once the elements of the coefficient matrix are fixed, the number of combinations to search when determining each row of the basis matrix is only 2^k for binary decomposition and 3^k for ternary decomposition, so exhaustive search does not become computationally prohibitive.
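The alternating updates described above can be sketched as follows for the binary case. This is a minimal illustration, assuming the cost is the squared Frobenius norm of Q − MC; the function name, the random initialization, and the fixed iteration count are assumptions of the example and are not taken from the text.

```python
import itertools

import numpy as np


def decompose(Q, k, n_iter=20, seed=0):
    """Alternate least squares for C and exhaustive row search for a binary M."""
    rng = np.random.default_rng(seed)
    d, L = Q.shape
    M = rng.choice([-1.0, 1.0], size=(d, k))
    # All 2^k candidate rows of the binary basis matrix.
    candidates = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    for _ in range(n_iter):
        # First update: M fixed, C optimized by least squares.
        C, *_ = np.linalg.lstsq(M, Q, rcond=None)
        # Second update: C fixed, every row of M chosen by exhaustive search.
        recon = candidates @ C                                   # (2^k, L)
        errs = ((Q[:, None, :] - recon[None, :, :]) ** 2).sum(axis=2)
        M = candidates[errs.argmin(axis=1)]                      # (d, k)
    # Refit C so that it is optimal for the final M.
    C, *_ = np.linalg.lstsq(M, Q, rcond=None)
    return M, C
```

Because each row of Q depends only on the corresponding row of M once C is fixed, the rows can be searched independently, which is why only 2^k candidates per row need to be examined. The sketch tabulates all candidates at once, so it is practical only for small k.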

In the above relevance determination device, the real matrix decomposition unit may decompose the real matrix by solving the following cost function.
[Equation not reproduced in this text]
Here, Q is the real matrix, M is the basis matrix, C is the coefficient matrix, and λ is a coefficient. With this configuration as well, the error incurred when the real matrix is decomposed into the product of the basis matrix and the coefficient matrix is evaluated as the cost, so the real matrix can be decomposed easily and accurately; in addition, the coefficient matrix can be made sparse, so the product of the feature vector and the real matrix can be computed at high speed. Specifically, the real matrix can be decomposed using the basis matrix and coefficient matrix that minimize this cost function (that is, that satisfy a predetermined convergence condition).

In the above relevance determination device, the real matrix decomposition unit may obtain the basis matrix and the coefficient matrix by repeating a first update, in which the elements of the basis matrix are fixed and the elements of the coefficient matrix are optimized by the proximal gradient method, and a second update, in which the elements of the coefficient matrix are fixed and the elements of the basis matrix are optimized by exhaustive search. With this configuration, the real matrix can be decomposed easily. Note that, once the elements of the coefficient matrix are fixed, the number of combinations to search when determining each row of the basis matrix is only 2^k for binary decomposition and 3^k for ternary decomposition, so exhaustive search does not become computationally prohibitive.
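The first update with the proximal gradient method might look like the sketch below. It assumes the cost has the form ||Q − MC||_F^2 + λ·Σ|C_ij|, which is a guess consistent with the statement that the coefficient matrix becomes sparse; the actual expression is not reproduced in this text. The function names, the step size rule, and the number of steps are likewise assumptions.

```python
import numpy as np


def soft_threshold(X, t):
    """Proximal operator of t * (sum of absolute values): shrink entries toward zero."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)


def update_C_proximal(Q, M, C, lam, n_steps=100):
    """Proximal gradient (ISTA) steps on ||Q - M C||_F^2 + lam * sum|C|, with M fixed."""
    # 1 / Lipschitz constant of the gradient of the quadratic term.
    step = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2)
    for _ in range(n_steps):
        grad = 2.0 * M.T @ (M @ C - Q)                   # gradient of the quadratic term
        C = soft_threshold(C - step * grad, step * lam)  # gradient step, then shrinkage
    return C
```

The shrinkage step sets small coefficients exactly to zero, which is what makes the resulting coefficient matrix sparse; a larger `lam` gives more zeros at the price of a larger decomposition error.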

In the above relevance determination device, the real matrix decomposition unit may decompose the real matrix by solving the following cost function.
[Equation not reproduced in this text]
Here, Q is the real matrix, M is the basis matrix, C is the coefficient matrix, and P is a set of a plurality of the feature vectors. With this configuration, the cost evaluated is not the error of the decomposition of the real matrix itself but, using a plurality of feature vectors, the error that the decomposition introduces into the product of the feature vector and the real matrix (data-dependent decomposition), so the product of the feature vector and the real matrix can be approximated more accurately. Specifically, the real matrix can be decomposed using the basis matrix and coefficient matrix that minimize this cost function (that is, that satisfy a predetermined convergence condition).

In the above relevance determination device, the real matrix decomposition unit may obtain the basis matrix and the coefficient matrix by repeating a first update, in which the elements of the basis matrix are fixed and the elements of the coefficient matrix are optimized by the least squares method, and a second update, in which the elements of the coefficient matrix are fixed and the elements of the basis matrix are optimized by solving a combinatorial optimization problem. With this configuration, the real matrix can be decomposed easily. The combinatorial optimization problem can be solved using, for example, a greedy algorithm, tabu search, or simulated annealing.
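As one possible instance of the second update, the sketch below applies a simple greedy bit-flip local search to the basis matrix with the coefficient matrix fixed. It assumes that the data-dependent cost is ||Q^T P − (MC)^T P||_F^2 with the feature vectors stored as the columns of P; that form, the function names, and the sweep limit are assumptions of this example, and tabu search or simulated annealing could be substituted.

```python
import numpy as np


def data_cost(Q, M, C, P):
    """Assumed data-dependent cost: || Q^T P - (M C)^T P ||_F^2."""
    return float(np.linalg.norm(Q.T @ P - (M @ C).T @ P) ** 2)


def greedy_update_M(Q, M, C, P, max_sweeps=50):
    """With C fixed, flip single entries of M whenever the flip lowers the cost."""
    M = M.copy()
    best = data_cost(Q, M, C, P)
    for _ in range(max_sweeps):
        improved = False
        for i in range(M.shape[0]):
            for j in range(M.shape[1]):
                M[i, j] = -M[i, j]          # tentative flip
                cost = data_cost(Q, M, C, P)
                if cost < best:
                    best, improved = cost, True
                else:
                    M[i, j] = -M[i, j]      # revert the flip
        if not improved:
            break                            # local optimum reached
    return M
```

Unlike the data-independent case, the rows of M are coupled through P here, so a row-by-row exhaustive search no longer gives the exact optimum; this is why a heuristic search is used instead.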

In the above relevance determination device, the real matrix decomposition unit may decompose the real matrix by solving the following cost function.
[Equation not reproduced in this text]
Here, Q is the real matrix, M is the basis matrix, C is the coefficient matrix, P is a set of a plurality of the feature vectors, and λ is a coefficient. With this configuration, the cost evaluated is not the error of the decomposition of the real matrix itself but, using a plurality of feature vectors, the error that the decomposition introduces into the product of the feature vector and the real matrix (data-dependent decomposition), so the product of the feature vector and the real matrix can be approximated more accurately; in addition, making the coefficient matrix sparse allows that product to be computed at high speed. Specifically, the real matrix can be decomposed using the basis matrix and coefficient matrix that minimize this cost function (that is, that satisfy a predetermined convergence condition).
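The four cost functions referred to in this overview do not survive in this text. Going only by the surrounding descriptions (a decomposition error between Q and MC, a coefficient λ associated with a sparse coefficient matrix, and an error measured on the products with the feature vectors in P), one plausible reading is the following. The squared Frobenius norm, the L1 penalty, the treatment of P as a matrix whose columns are feature vectors, and the labels g_2 to g_4 are assumptions, not statements of the source.

```latex
\begin{aligned}
g_1(C, M) &= \lVert Q - MC \rVert_F^2 \\
g_2(C, M) &= \lVert Q - MC \rVert_F^2 + \lambda \lVert C \rVert_1 \\
g_3(C, M) &= \lVert Q^{\top} P - C^{\top} M^{\top} P \rVert_F^2 \\
g_4(C, M) &= \lVert Q^{\top} P - C^{\top} M^{\top} P \rVert_F^2 + \lambda \lVert C \rVert_1
\end{aligned}
```

Under this reading, g_1 and g_2 are the data-independent costs and g_3 and g_4 are their data-dependent counterparts, with M constrained to binary or ternary entries in every case.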

In the above relevance determination device, the real matrix decomposition unit may obtain the basis matrix and the coefficient matrix by repeating a first update, in which the elements of the basis matrix are fixed and the elements of the coefficient matrix are optimized by the proximal gradient method, and a second update, in which the elements of the coefficient matrix are fixed and the elements of the basis matrix are optimized by solving a combinatorial optimization problem. With this configuration, the real matrix can be decomposed easily. The combinatorial optimization problem can be solved using, for example, a greedy algorithm, tabu search, or simulated annealing.

In the above relevance determination device, the real matrix decomposition unit may obtain initial values of the elements of the basis matrix and the coefficient matrix by decomposing the real matrix through solving the following cost function,
[Equation not reproduced in this text]
or may obtain those initial values by decomposing the real matrix through solving the following cost function.
[Equation not reproduced in this text]
With this configuration, the basis matrix and coefficient matrix obtained by data-independent decomposition are used as initial values, so the repeated updates for data-dependent decomposition can start from a sufficiently good initial solution, and the cost can therefore be reduced effectively.

In the above relevance determination device, the real matrix decomposition unit may decompose the real matrix by obtaining a plurality of candidate basis matrices and coefficient matrices from different initial values of their elements, and adopting the basis matrix and coefficient matrix for which the cost function is smallest. With this configuration, the variation caused by the choice of initial values is reduced and the decomposition error can be made smaller.
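The restart strategy above can be sketched as a small wrapper. The wrapper assumes that some decomposition routine `decompose_fn(Q, k, seed)` returning a pair (M, C) is available and that the cost is the squared Frobenius norm of Q − MC; both the callable's signature and the cost are assumptions of this example.

```python
import numpy as np


def best_of_restarts(decompose_fn, Q, k, seeds):
    """Run decompose_fn(Q, k, seed) for each seed and keep the lowest-cost result.

    Returns a tuple (cost, M, C) for the best run.
    """
    best = None
    for seed in seeds:
        M, C = decompose_fn(Q, k, seed)
        cost = float(np.linalg.norm(Q - M @ C) ** 2)
        if best is None or cost < best[0]:
            best = (cost, M, C)
    return best
```

Because the alternating updates only reach a local optimum that depends on the starting point, comparing several runs is a cheap way to reduce that dependence; the decomposition is done offline, so the extra runs do not affect the speed of the inner product calculation itself.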

In the above relevance determination device, the feature vector may be a HOG feature, the plurality of real vectors may be a plurality of weight vectors corresponding to the parameters of a plurality of linear classifiers, and the vector calculation unit may, as the determination of relevance, classify the feature vector against each of a plurality of criteria using the discriminant functions of the plurality of linear classifiers. With this configuration, classification of a feature vector by a plurality of linear classifiers can be sped up.

In the above relevance determination device, when the feature vector and the plurality of real vectors have one or more parameters, the real matrix generation unit may generate the real matrix by arranging the plurality of real vectors in the order of the parameter, and the vector calculation unit may express, as a continuous function of the parameter, each of the vectors that constitute the coefficient matrix and run in the same direction as that in which the real vectors are arranged, and may obtain the parameter that maximizes the discriminant function as the parameter value of the feature vector. With this configuration, because the real vectors are arranged in the order of a parameter along which they change smoothly when they are combined into the real matrix, the discriminant function can be expressed as a continuous function of that parameter, and the parameter value of the feature vector can therefore be obtained with high resolution.

In the above relevance determination device, the feature vector may be a vector to be clustered by k-means clustering, the real vectors may be the representative vectors in k-means clustering, and the vector calculation unit may, as the determination of relevance, perform clustering processing that includes calculating the distance between the feature vector and each representative vector. With this configuration, the calculation of the distance between a feature vector and the representative vectors in k-means clustering can be sped up.
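The connection between the distance calculation and the inner product can be made concrete with the identity ||p − q_n||^2 = ||p||^2 − 2 p^T q_n + ||q_n||^2, where ||p||^2 = d for any p in {−1, 1}^d. The sketch below uses it to assign a feature vector to its nearest representative vector through the decomposed product; the function name and the precomputed array of squared norms are assumptions of the example.

```python
import numpy as np


def nearest_representative(p, M, C, rep_sq_norms):
    """Squared distances from p to the L representative vectors approximated by M @ C.

    p            : (d,)   feature vector with entries in {-1, +1}
    M, C         : (d, k) basis matrix and (k, L) coefficient matrix
    rep_sq_norms : (L,)   precomputed squared norms of the representative vectors
    """
    dots = C.T @ (M.T @ p)                  # approximates p^T q_n for every n
    d = p.shape[0]                          # ||p||^2 equals d for a {-1, +1} vector
    dists = d - 2.0 * dots + rep_sq_norms
    return int(np.argmin(dists)), dists
```

Only the inner products change from one feature vector to the next; the squared norms of the representative vectors can be computed once per k-means iteration.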

In the above relevance determination device, the feature vector may be a vector subject to approximate nearest neighbor search using a k-means tree, the real vectors may be the representative vectors registered in the nodes of a k-ary tree, and the vector calculation unit may, as the determination of relevance, perform clustering processing that includes calculating the distance between the feature vector and each representative vector. With this configuration, the calculation of the distance between a feature vector and the representative vectors registered in the nodes of the k-ary tree can be sped up in approximate nearest neighbor search using a k-means tree.

In the above relevance determination device, the feature vector may be a vector representing a feature amount of an image. With this configuration, the inner product calculation between the feature vector and the plurality of real vectors can be sped up in the calculation of image feature amounts.

The relevance determination program of the present embodiment is a program for causing a computer to function as the above relevance determination device. With this configuration as well, to compute the inner product of the feature vector with each of the plurality of real vectors, the real matrix composed of the real vectors is first decomposed into a discrete-valued basis matrix and a coefficient matrix, and the feature vector is then multiplied by the basis matrix and the result by the coefficient matrix. The inner products between the feature vector and the plurality of real vectors are thus obtained at high speed, and the relevance between the feature vector and the plurality of real vectors can be determined at high speed.

The relevance determination method of the present embodiment includes: a feature vector acquisition step of acquiring a binarized feature vector; a real matrix decomposition step of decomposing a real matrix composed of a plurality of real vectors into the product of a coefficient matrix and a basis matrix composed of a plurality of basis vectors whose elements take only binary or ternary discrete values; and a vector calculation step of, as the calculation of the inner product of the feature vector with each of the plurality of real vectors, computing the product of the feature vector and the basis matrix, then computing the product of that result and the coefficient matrix, and using the outcome to determine the relevance between each of the plurality of real vectors and the feature vector. With this configuration as well, to compute the inner product of the feature vector with each of the plurality of real vectors, the real matrix composed of the real vectors is first decomposed into a discrete-valued basis matrix and a coefficient matrix, and the feature vector is then multiplied by the basis matrix and the result by the coefficient matrix. The inner products between the feature vector and the plurality of real vectors are thus obtained at high speed, and the relevance between the feature vector and the plurality of real vectors can be determined at high speed.
4-3. Effect

According to the present embodiment, the inner product calculation between a binarized feature vector and each of a plurality of real vectors can be sped up, and the relevance between such a feature vector and each of the plurality of real vectors can be determined at high speed.

The feature amount calculation device of the present embodiment is described below with reference to the drawings.

4-4. Situations with a plurality of real vectors
First, situations are described in which there are a plurality of real vectors whose inner product with the feature vector is to be calculated. FIG. 35 shows an example of linear SVMs used to identify a person in an image according to a plurality of identification criteria. In this example, as shown in FIG. 35, an input feature vector is not simply tested for whether a person is present in the image corresponding to that feature vector; instead, it is tested separately for whether it is an "adult (front)", whether it is an "adult (side)", and whether it is a "child (front)". In other words, there are a plurality of criteria for identifying the feature vector. In this case, as shown in FIG. 35, a weight parameter w (hereinafter also called a "dictionary") of the linear SVM evaluation function f(x) must be prepared for each identification criterion (w₁, w₂, w₃, …, w_L), and a bias b must likewise be prepared for each identification criterion (b₁, b₂, b₃, …, b_L).

FIG. 36 shows an example of linear SVMs used to identify a person in an image according to a plurality of identification criteria that depend on the distance to the subject. In this example, so that identification of a person is robust to the distance to the subject, that is, to changes in the scale of the subject in the image, an input feature vector is, as shown in FIG. 36, not simply tested for whether an adult is present in the image corresponding to that feature vector; it is tested separately for whether it is an "adult (far)", whether it is an "adult (medium distance)", and whether it is an "adult (near)". In this case too, there are a plurality of criteria for identifying the feature vector. Therefore, as shown in FIG. 36, a dictionary w of the linear SVM must be prepared for each identification criterion (w₁, w₂, w₃, …, w_L), and a bias b must likewise be prepared for each identification criterion (b₁, b₂, b₃, …, b_L).

When a feature vector is identified according to a plurality of criteria in this way, those criteria often resemble one another. FIGS. 35 and 36 are such cases. In the example of FIG. 35, "adult (front)" and "adult (side)" share the property of being adults, "adult (front)" and "child (front)" share the property of being frontal views of a person, and "adult (front)", "adult (side)", and "child (front)" all share the property of being people. In the example of FIG. 36, "adult (far)", "adult (medium distance)", and "adult (near)" share the property of being adults. Consequently, the dictionaries (w₁, w₂, w₃, …, w_L) in FIGS. 35 and 36, which are a plurality of real vectors, resemble one another. In k-means clustering as well, the k representative vectors, which are real vectors, often resemble one another. The relevance determination device of the present embodiment exploits this property, namely that the plurality of real vectors resemble one another, to speed up processing.

4-5. First example of the fourth embodiment
FIG. 37 is a block diagram showing the configuration of a feature amount calculation device 106 according to the first example of the fourth embodiment. The feature amount calculation device 106 includes a content acquisition unit 161, a feature vector generation unit 162, a feature vector binarization unit 163, a real matrix acquisition unit 164, a real matrix decomposition unit 165, a vector calculation unit 166, and a database 167.

As described later, the feature amount calculation device 106 of this example functions as a relevance determination device that determines the relevance between a feature vector and a plurality of real vectors stored in the database as dictionary data, by vector operations involving the inner product of the feature vector with those real vectors. That is, the feature amount calculation device 106 corresponds to the relevance determination device of the present embodiment.

The feature amount calculation device 106 as a relevance determination device is realized by a computer executing the relevance determination program of the present embodiment. The relevance determination program may be recorded on a recording medium and read from it by the computer, or may be downloaded to the computer over a network.

The content acquisition unit 161 acquires content data such as image data, audio data, and text data. The content data may be supplied from an external device or may be generated by the content acquisition unit 161 itself. For example, the content acquisition unit 161 may be a camera that generates image data as the content data.

The feature vector generation unit 162 generates a D-dimensional feature vector from the content data acquired by the content acquisition unit 161. For example, when the content is an image, the feature vector generation unit 162 extracts the feature amount of the image. The feature vector binarization unit 163 binarizes the D-dimensional feature vector generated by the feature vector generation unit 162 to produce a d-dimensional binary vector p ∈ {−1, 1}^d, each element of which takes only one of the two values −1 and 1. The feature vector binarization unit 163 corresponds to the "feature vector acquisition unit" of the present embodiment.

The configuration made up of the content acquisition unit 161, the feature vector generation unit 162, and the feature vector binarization unit 163 may be any configuration that ultimately yields a binarized feature vector. For example, the content acquisition unit 161 and the feature vector generation unit 162 may be omitted, with the feature vector binarization unit 163 acquiring a feature vector from an external device and binarizing it, or the feature vector binarization unit 163 may acquire an already binarized feature vector directly from an external device.

The real matrix acquisition unit 164 acquires a plurality of d-dimensional real vectors q_n ∈ R^d (n = 1, 2, …, L). The real vectors q_n may be supplied from an external device, may be read from a storage device (not shown) of the feature amount calculation device 106, or may be generated by the real matrix acquisition unit 164. The elements of each real vector q_n are real numbers, including floating-point values. The real vectors q_n arranged together are written as the real matrix Q = (q₁, q₂, …, q_L) ∈ R^{d×L}.

Using the real matrix Q that collects the plurality of real vectors q_n in this way, the plurality of linear SVMs of FIGS. 35 and 36 can be expressed together as Equation (28) below.
[Equation (28) not reproduced in this text]
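Equation (28) itself is missing from this text. As a hedged reconstruction, if each real vector q_n is taken to be the dictionary w_n of the n-th linear SVM and the biases are collected into a vector b, stacking the L evaluation functions f_n(p) = w_n^T p + b_n gives the following; the symbol f for the stacked output and the vector form of b are assumptions of this sketch.

```latex
f(p) = Q^{\top} p + b,
\qquad
Q = (q_1, q_2, \dots, q_L) \in \mathbb{R}^{d \times L},
\quad
b = (b_1, b_2, \dots, b_L)^{\top} \in \mathbb{R}^{L}
```

The n-th component of f(p) is then the score of the feature vector p under the n-th identification criterion, so evaluating all L classifiers reduces to a single matrix-vector product with Q.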

As shown in FIG. 38, the real matrix decomposition unit 165 decomposes the d-row, L-column real matrix Q into the product of a binary basis matrix M ∈ {−1, 1}^{d×k} and a coefficient matrix. Specifically, the real matrix decomposition unit 165 decomposes the d-row, L-column real matrix Q into a basis matrix M with binary elements and a coefficient matrix C with real elements according to Equation (29) below.
[Equation (29) not reproduced in this text]
Here, as shown in FIG. 38, M = (m₁, m₂, …, m_k) ∈ {−1, 1}^{d×k} and C = (c₁, c₂, …, c_L)^T ∈ R^{k×L}.

That is, the basis matrix M consists of k basis vectors $m_i$, where each basis vector $m_i$ is a d-dimensional binary vector whose elements take only the values −1 and 1. The basis matrix M is therefore a binary matrix of d rows and k columns whose elements take only the values −1 and 1.

The coefficient matrix C consists of L coefficient vectors $c_n$ (L is the number of classes), where each coefficient vector $c_n$ is a k-dimensional real vector whose elements are the real coefficients applied to the k basis vectors (k is the number of bases). Q and MC should of course agree as closely as possible, but the decomposition may contain an error. The methods by which the real matrix decomposition unit 105 decomposes the real matrix Q as in equation (29) are described below.

4-5-1. First decomposition method
As the first decomposition method, a data-independent decomposition method is described. In the first decomposition method, the real matrix decomposition unit 105 performs the decomposition by minimizing the cost function $g_1$ of the following equation (30), which represents the decomposition error.
$$g_1(C, M) = \|Q - MC\|_F^2 \tag{30}$$
Here, the basis matrix M is binary, $M \in \{-1, 1\}^{d \times k}$.

The real matrix decomposition unit 105 minimizes the above cost function $g_1$ by the following procedure.
(1) Initialize the basis matrix M and the coefficient matrix C at random.
(2) With the elements of the basis matrix M fixed, optimize the elements of the coefficient matrix C by the least squares method, thereby updating C so that the cost function $g_1$ is minimized.
(3) With the elements of the coefficient matrix C fixed, update the elements of the basis matrix M by exhaustive search so that the cost function $g_1$ is minimized. This exhaustive search is described in detail below.
(4) Repeat (2) and (3) until convergence. For example, convergence is declared when the cost function $g_1$ satisfies a predetermined condition (for example, its decrease falls to or below a fixed value).
(5) Hold the solution obtained in steps (1) to (4) as a candidate.
(6) Repeat steps (1) to (5), and adopt as the final result the candidate basis matrix M and candidate coefficient matrix C that give the smallest cost function $g_1$. This repetition of steps (1) to (5) may be omitted, but repeating several times avoids the problem of dependence on the initial values.

Next, the update of the basis matrix M in step (3) is described. As indicated by the dashed frame in FIG. 39, the elements of the j-th row vector of the basis matrix M depend only on the elements of the j-th row of the real matrix. Each row vector of M can therefore be optimized independently of the other rows, so an exhaustive search can be performed row by row. For a binary decomposition as in this example, the j-th row vector of M has only $2^k$ possible values (and only $3^k$ for the ternary decomposition of the second example described later). The real matrix decomposition unit 105 therefore checks all of them and adopts the row vector that minimizes the cost function $g_1$. Applying this to every row vector of M updates the elements of the basis matrix M.
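The row-wise exhaustive search of step (3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `update_basis_rows`, the list-of-lists matrix layout, and the toy values of `Q` and `C` are hypothetical, and the cost is assumed to be the squared error between each row of Q and the corresponding row of MC.

```python
from itertools import product

def update_basis_rows(Q, C, values=(-1, 1)):
    """For each row of Q (d x L), pick the row of M (length k) that minimizes
    the squared error against M_row * C, with C (k x L) held fixed."""
    k = len(C)
    L = len(C[0])
    candidates = list(product(values, repeat=k))  # 2^k rows (3^k if ternary)
    M = []
    for q_row in Q:
        best_row, best_err = None, float("inf")
        for cand in candidates:
            err = 0.0
            for n in range(L):
                approx = sum(cand[i] * C[i][n] for i in range(k))
                err += (q_row[n] - approx) ** 2
            if err < best_err:
                best_row, best_err = cand, err
        M.append(list(best_row))
    return M

# Toy data (assumed): Q was built as M_true * C with
# M_true = [[1, -1], [-1, -1], [1, 1]].
C = [[1.0, 0.5], [0.25, -0.5]]
Q = [[0.75, 1.0], [-1.25, 0.0], [1.25, 0.0]]
M_new = update_basis_rows(Q, C)
print(M_new)
```

Passing `values=(-1, 0, 1)` would enumerate the $3^k$ candidates of the ternary case; the cost of the search grows exponentially with k either way.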

4-5-2. Second decomposition method
As the second decomposition method, a data-independent decomposition method that makes the coefficient matrix C sparse is described. In the second decomposition method, the real matrix decomposition unit 105 performs the decomposition by minimizing the cost function $g_2$ of the following equation (31), which represents the decomposition error.
$$g_2(C, M) = \|Q - MC\|_F^2 + \lambda |C|_1 \tag{31}$$
Here, the basis matrix M is binary, $M \in \{-1, 1\}^{d \times k}$. Further, $|C|_1$ is the L1 norm of the elements of the coefficient matrix C, and $\lambda$ is its coefficient.

The real matrix decomposition unit 105 minimizes the above cost function $g_2$ by the following procedure.
(1) Initialize the basis matrix M and the coefficient matrix C at random.
(2) With the elements of the basis matrix M fixed, optimize the elements of the coefficient matrix C by the proximal gradient method.
(3) With the elements of the coefficient matrix C fixed, update the elements of the basis matrix M by exhaustive search so that the cost function $g_2$ is minimized.
(4) Repeat (2) and (3) until convergence. For example, convergence is declared when the cost function $g_2$ satisfies a predetermined condition (for example, its decrease falls to or below a fixed value).
(5) Hold the solution obtained in steps (1) to (4) as a candidate.
(6) Repeat steps (1) to (5), and adopt as the final result the candidate basis matrix M and candidate coefficient matrix C that give the smallest cost function $g_2$. This repetition of steps (1) to (5) may be omitted, but repeating several times avoids the problem of dependence on the initial values.

According to the second decomposition method, the coefficient matrix C can be made sparse. With a sparse C, the parts of the calculation of the product MC that involve zero elements of C can be skipped, so the inner product calculation becomes even faster.
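How an L1 penalty produces zero elements in C can be illustrated with the soft-thresholding operator, which is the standard proximal operator of the L1 norm. The sketch below is only an illustration: the text does not specify the step size or the exact proximal gradient variant, so the names `soft_threshold` and `proximal_step`, the parameters `step` and `lam`, and the sample values are assumptions. The gradient of the smooth error term is supplied by the caller.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def proximal_step(C, grad, step, lam):
    """One proximal gradient update of C (k x L, list of lists).
    grad is the gradient of the smooth error term at C (same shape)."""
    return [
        [soft_threshold(c - step * g, step * lam) for c, g in zip(c_row, g_row)]
        for c_row, g_row in zip(C, grad)
    ]

# Toy values (assumed). A zero gradient isolates the shrinkage effect.
C = [[0.5, -0.25], [0.125, 1.0]]
grad = [[0.0, 0.0], [0.0, 0.0]]
C_new = proximal_step(C, grad, step=1.0, lam=0.25)
print(C_new)
```

Elements whose magnitude does not exceed `step * lam` are set exactly to zero, which is what allows the corresponding terms to be skipped later.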

4-5-3. Third decomposition method
Next, the third decomposition method is described. In the first decomposition method, the decomposition error
$$\|Q - MC\|_F^2$$
was defined as the cost function $g_1$, and this decomposition error was minimized. However, once the real matrix has been approximated by the product of the basis matrix and the coefficient matrix, what actually needs to be approximated is the product $Q^T p$ of the feature vector and the real matrix.

Therefore, in the third decomposition method, S feature vectors p are collected in advance and gathered into a matrix $P \in \mathbb{R}^{d \times S}$. The decomposition error is then defined as
$$\|Q^T P - (MC)^T P\|_F^2$$
and minimized. That is, in the third decomposition method, the real matrix decomposition unit 105 performs the decomposition by minimizing the cost function $g_3$ of the following equation (32).
$$g_3(C, M) = \|Q^T P - (MC)^T P\|_F^2 \tag{32}$$
With this cost function $g_3$, the real matrix Q is decomposed according to the actual data distribution, so the approximation accuracy of the decomposition improves.

This approximate decomposition can be carried out by obtaining the basis vectors $m_i$ sequentially. The procedure of the third decomposition method is as follows.
(1) Obtain the basis matrix M and the coefficient matrix C by the first or second decomposition method, and use them as the initial values.
(2) With the elements of the basis matrix M fixed, optimize the elements of the coefficient matrix C by the least squares method.
(3) With the elements of the coefficient matrix C fixed, update the elements of the basis matrix M by optimizing them. This update of the basis matrix M is described below.
(4) Repeat (2) and (3) until convergence, and hold the basis matrix M and coefficient matrix C that minimize the cost function $g_3$ as a candidate.
(5) Repeat steps (1) to (4), and adopt as the final result the basis matrix M and coefficient matrix C that minimize the cost function $g_3$. Because step (1) again optimizes M and C by the first or second decomposition method, the initial values change on each repetition. The repetition in step (5) may be omitted, but repeating several times reduces the problem of dependence on the initial values.

Next, the update of the basis matrix M in step (3) is described. In the data-dependent decomposition, the value of a row vector of the basis matrix M is no longer independent of the other rows. Because the elements of M are binary or ternary, that is, discrete values, optimizing M is a combinatorial optimization problem. Algorithms such as a greedy algorithm, tabu search, or simulated annealing can therefore be used to optimize M. Because good initial values are obtained in step (1), these algorithms can also minimize the decomposition error satisfactorily.

For example, when the greedy algorithm is used, the basis matrix M is optimized by the following procedure.
(3-1) Select T elements of the basis matrix M at random.
(3-2) Try all $2^T$ combinations of values for those elements ($3^T$ for the ternary decomposition described later), and adopt the combination that gives the smallest cost function $g_3$.
(3-3) Repeat step (3-1) and step (3-2) until convergence.
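One pass of steps (3-1) and (3-2) can be sketched as follows. This is an illustrative sketch only: the function names `g3` and `greedy_pass`, the list-of-lists layout, the seeded random generator, and the toy matrices are assumptions, and the cost is assumed to be the squared Frobenius norm of $Q^T P - (MC)^T P$.

```python
import random
from itertools import product

def g3(Q, M, C, P):
    """Squared Frobenius norm of (Q - M C)^T P for list-of-lists matrices."""
    d, L, k, S = len(Q), len(Q[0]), len(C), len(P[0])
    total = 0.0
    for n in range(L):
        # Residual of the n-th column: q_n - M c_n (length d).
        r = [Q[j][n] - sum(M[j][i] * C[i][n] for i in range(k)) for j in range(d)]
        for s in range(S):
            total += sum(r[j] * P[j][s] for j in range(d)) ** 2
    return total

def greedy_pass(Q, M, C, P, T, rng, values=(-1, 1)):
    """Pick T elements of M at random, try every value combination,
    and keep the best one. M is updated in place; the cost is returned."""
    d, k = len(M), len(M[0])
    positions = rng.sample([(j, i) for j in range(d) for i in range(k)], T)
    best = [M[j][i] for j, i in positions]
    best_cost = g3(Q, M, C, P)
    for combo in product(values, repeat=T):  # 2^T combinations (3^T if ternary)
        for (j, i), v in zip(positions, combo):
            M[j][i] = v
        cost = g3(Q, M, C, P)
        if cost < best_cost:
            best, best_cost = list(combo), cost
    for (j, i), v in zip(positions, best):
        M[j][i] = v
    return best_cost

# Toy data (assumed): d = 3, k = 2, L = 2, S = 2.
Q = [[0.5, -1.0], [1.5, 0.25], [-0.75, 1.0]]
M = [[1, 1], [1, -1], [-1, 1]]
C = [[0.5, 0.25], [0.25, -0.5]]
P = [[1, -1], [1, 1], [-1, 1]]
before = g3(Q, M, C, P)
after = greedy_pass(Q, M, C, P, T=3, rng=random.Random(0))
print(before, after)
```

Because the current values are always among the combinations tried, a pass never increases the cost; repeating passes until the cost stops decreasing corresponds to step (3-3).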

4-5-4. Fourth decomposition method
The fourth decomposition method combines the second decomposition method and the third decomposition method. Specifically, the real matrix decomposition unit 105 performs the decomposition by minimizing the cost function $g_4$ of the following equation (33).
$$g_4(C, M) = \|Q^T P - (MC)^T P\|_F^2 + \lambda |C|_1 \tag{33}$$
With this cost function $g_4$, the real matrix Q is decomposed according to the actual data distribution, so the approximation accuracy of the decomposition improves, and the coefficient matrix C can also be made sparse. That is, the advantages of both the second and the third decomposition methods are obtained. The specific decomposition procedure is the same as in the third decomposition method.

4-5-5. Modification of the first and second decomposition methods
The first and second (data-independent) decomposition methods described above require a search over $2^k$ candidates ($3^k$ candidates for the ternary decomposition), where k is the decomposition number, and are therefore difficult to apply when k is large. In such a case, the mutual similarity of the real vectors $q_n$ belonging to the real matrix Q is examined in advance, similar real vectors are clustered together, and the first or second decomposition method is applied to each cluster.

The vector calculation unit 106 performs calculations using feature vectors. The specific contents of these calculations are described later together with application examples of the feature quantity arithmetic device 100 of this example. The calculations include computing the product $Q^T p$ of the binarized feature vector $p \in \{-1, 1\}^d$ and the real matrix Q decomposed by the real matrix decomposition unit 105. The calculation of this product $Q^T p$ is described first.

The product $Q^T p$ can be rewritten as in the following equation (34).
$$Q^T p \approx (MC)^T p = C^T M^T p, \qquad q_n^T p \approx \sum_{i=1}^{k} c_{n,i}\,(m_i^T p) \quad (n = 1, 2, \ldots, L) \tag{34}$$
Here, $m_i^T p$ is an inner product between binary vectors. Further, $c_{n,i}$ is the i-th element of the coefficient vector $c_n$ of the n-th class, that is, the element in row i and column n of the coefficient matrix C. The inner product $m_i^T p$ between binary vectors can be computed extremely fast, for the following reason.

An inner product between binary vectors reduces to a Hamming distance calculation. The Hamming distance is the number of bits whose values differ between two binary codes; the Hamming distance between two binary vectors is thus the number of elements whose values differ. Writing the Hamming distance between $m_i$ and $p$ as $D_{hamming}(m_i, p)$, the inner product $m_i^T p$ is related to $D_{hamming}(m_i, p)$ by the following equation (35).
$$m_i^T p = d - 2 D_{hamming}(m_i, p) \tag{35}$$
Here, as described above, d is the number of bits of the binary code.

The Hamming distance is extremely fast to compute, because it is obtained by applying XOR to the two binary codes and then counting the bits that are set to 1. If the binary vectors are represented as binary codes (bit strings of 0 and 1), the Hamming distance can be calculated by the following equation (36).
$$D_{hamming}(m_i, p) = \mathrm{BITCOUNT}(\mathrm{XOR}(m_i, p)) \tag{36}$$
Here, the XOR function takes the exclusive OR of $m_i$ and $p$ regarded as binary codes, and the BITCOUNT function counts the number of bits of a binary code that are set to 1.
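The relation between the inner product and the Hamming distance can be checked with a short Python sketch. The helper names `pack_bits`, `hamming_distance`, and `binary_inner_product`, the bit convention (+1 maps to bit 1, −1 maps to bit 0), and the sample vectors are assumptions made for this illustration.

```python
def pack_bits(v):
    """Pack a vector over {-1, +1} into an int: +1 -> bit 1, -1 -> bit 0."""
    code = 0
    for j, x in enumerate(v):
        if x == 1:
            code |= 1 << j
    return code

def hamming_distance(a_code, b_code):
    """BITCOUNT(XOR(a, b)): number of positions where the codes differ."""
    return bin(a_code ^ b_code).count("1")

def binary_inner_product(m_code, p_code, d):
    """Inner product of two {-1, +1} vectors of length d via d - 2 * D_hamming."""
    return d - 2 * hamming_distance(m_code, p_code)

# Sample vectors (assumed), d = 8.
m = [1, -1, -1, 1, 1, -1, 1, 1]
p = [1, 1, -1, -1, 1, -1, -1, 1]
d = len(m)
direct = sum(a * b for a, b in zip(m, p))
fast = binary_inner_product(pack_bits(m), pack_bits(p), d)
print(direct, fast)  # both are 2
```

On hardware with a population-count instruction the XOR and bit count each take a few machine instructions per word, which is where the speed advantage over floating-point multiply-accumulate comes from.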

In summary, the product $Q^T p$ can be rewritten as in the following equation (37).
$$q_n^T p \approx -2 \sum_{i=1}^{k} c_{n,i}\, D_{hamming}(m_i, p) + d \sum_{i=1}^{k} c_{n,i} \quad (n = 1, 2, \ldots, L) \tag{37}$$
That is, $Q^T p$ is obtained by performing the d-bit Hamming distance calculation k times, computing the sum of the k Hamming distances weighted by the coefficient matrix C, and adding a constant term. Therefore, if k is sufficiently small, $Q^T p$ can be computed far faster than by a floating-point calculation.
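The computation summarized above can be sketched as follows: k Hamming distances are computed once and then combined with the coefficients for each of the L classes. The names `pack_bits` and `approx_scores`, the layout of `C` as k rows by L columns, and the toy values are assumptions for this illustration.

```python
def pack_bits(v):
    """Pack a vector over {-1, +1} into an int: +1 -> bit 1, -1 -> bit 0."""
    code = 0
    for j, x in enumerate(v):
        if x == 1:
            code |= 1 << j
    return code

def approx_scores(basis_codes, C, p_code, d):
    """Approximate Q^T p as -2 * sum_i c[n][i] * D_i + d * sum_i c[n][i],
    where D_i is the Hamming distance between basis vector i and p."""
    k, L = len(basis_codes), len(C[0])
    dists = [bin(code ^ p_code).count("1") for code in basis_codes]  # k distances
    scores = []
    for n in range(L):
        weighted = sum(C[i][n] * dists[i] for i in range(k))
        constant = d * sum(C[i][n] for i in range(k))  # can be precomputed per class
        scores.append(-2.0 * weighted + constant)
    return scores

# Toy decomposition (assumed): d = 4, k = 2, L = 3.
basis = [[1, -1, 1, 1], [-1, -1, 1, -1]]
C = [[0.5, -1.0, 2.0], [0.25, 0.0, -0.5]]
p = [1, 1, -1, 1]
d = len(p)
scores = approx_scores([pack_bits(m) for m in basis], C, pack_bits(p), d)
# Reference: the same values computed with ordinary inner products.
direct = [sum(C[i][n] * sum(a * b for a, b in zip(basis[i], p)) for i in range(2))
          for n in range(3)]
print(scores, direct)
```

The constant term depends only on C and d, so in practice it would be computed once per class rather than per feature vector.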

The database 107 stores, as dictionary data, the product of the basis matrix M and the coefficient matrix C for each of the plurality of real matrices Q decomposed by the real matrix decomposition unit 105. The vector calculation unit 106 reads the product of the basis matrix M and the coefficient matrix C from the database 107 and performs the above calculation.

As described above, according to the feature quantity arithmetic device 100 of this example, even when the calculation using the feature vector includes the product of the feature vector and a real matrix, the feature vector is binarized and the real matrix is decomposed into the product of a binary basis matrix and a coefficient matrix. The product of the feature vector and the real matrix can then be computed by first taking the product of the feature vector and the basis matrix and then taking the product with the coefficient matrix, which speeds up the operation.

In addition, because a plurality of real vectors are gathered into one real matrix and that real matrix is decomposed into a binary basis matrix and a coefficient matrix, the number of basis vectors constituting the basis matrix, that is, the number of bases, can be made smaller than when each real vector is decomposed separately as in the technique of the prior application. In principle, the number of bases can even be one or fewer per class (that is, number of bases k ≤ number of classes L).

4-6. Extension of the first example of the fourth embodiment
In the first example above, the binary vectors $m_i$ and $p$ were defined as $m_i \in \{-1, 1\}^d$ and $p \in \{-1, 1\}^d$, and it was explained that the product $Q^T p$ becomes fast when the real matrix is decomposed into the product of a binary basis matrix and a real coefficient matrix. However, the product can also be computed fast when $m_i$ and $p$ are the more general binary vectors $m_i' \in \{-a, a\}^d$ and $p' \in \{-a, a\}^d$. In this case, $m_i'^T p' = a^2 (m_i^T p)$, so the inner product of the binary vectors defined with −1 and 1 need only be multiplied by $a^2$.

Furthermore, a fast inner product is possible even when the feature vector and the basis vector are arbitrary binary vectors $m_i'' \in \{\alpha, \beta\}^d$ and $p'' \in \{\gamma, \delta\}^d$. Here, the coefficients $\alpha$, $\beta$, $\gamma$, and $\delta$ are real numbers with $\alpha \neq \beta$ and $\gamma \neq \delta$. In this case, $m_i''$ and $p''$ are obtained by applying a linear transformation to each element of the binary vectors $m_i$ and $p$ defined with −1 and 1, and are expanded as in the following equations (38) and (39).
$$m_i'' = A\, m_i + B\, \mathbf{1} \tag{38}$$
$$p'' = C\, p + D\, \mathbf{1} \tag{39}$$
The bold "1" in equations (38) and (39) is a vector of length d whose elements are all 1. Further, A, B, C, and D in equations (38) and (39) are real numbers, calculated in advance so that equations (38) and (39) hold.

The inner product $m_i''^T p''$ can be expanded as in the following equation (40).
$$m_i''^T p'' = (A\, m_i + B\, \mathbf{1})^T (C\, p + D\, \mathbf{1}) = AC\,(m_i^T p) + AD\,(m_i^T \mathbf{1}) + BC\,(\mathbf{1}^T p) + BD\, d \tag{40}$$
The calculations in parentheses in equation (40) are inner products between binary vectors consisting of −1 and 1. Therefore, fast calculation is also possible when the feature vector is a binary vector with arbitrary binary elements and the real matrix is expanded into the product of a binary basis matrix and a real coefficient matrix.
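The expansion can be verified numerically with a small sketch. It assumes the affine form "scale times the ±1 vector plus offset times the all-ones vector" for both vectors; the function names, the variable names `A`, `B`, `Cc`, `D`, and the sample values of alpha, beta, gamma, and delta are chosen for this illustration only.

```python
def affine_coeffs(lo, hi):
    """Return (scale, offset) such that scale * (-1) + offset == lo
    and scale * (+1) + offset == hi."""
    return (hi - lo) / 2.0, (hi + lo) / 2.0

def general_inner_product(m, p, alpha, beta, gamma, delta):
    """Inner product of m'' in {alpha, beta}^d and p'' in {gamma, delta}^d,
    computed only from quantities on the underlying {-1, +1} vectors m and p."""
    d = len(m)
    A, B = affine_coeffs(alpha, beta)    # m'' = A * m + B * 1
    Cc, D = affine_coeffs(gamma, delta)  # p'' = Cc * p + D * 1
    mp = sum(a * b for a, b in zip(m, p))  # in practice: d - 2 * Hamming distance
    return A * Cc * mp + A * D * sum(m) + B * Cc * sum(p) + B * D * d

# Sample values (assumed).
m = [1, -1, 1, 1]
p = [-1, -1, 1, 1]
alpha, beta, gamma, delta = 0.0, 1.0, 2.0, 4.0
m2 = [beta if x == 1 else alpha for x in m]   # [1, 0, 1, 1]
p2 = [delta if x == 1 else gamma for x in p]  # [2, 2, 4, 4]
direct = sum(a * b for a, b in zip(m2, p2))
fast = general_inner_product(m, p, alpha, beta, gamma, delta)
print(direct, fast)  # both are 10.0
```

The sums `sum(m)` and `sum(p)` are themselves inner products with the all-ones vector, so they can also be obtained by a bit count.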

4-7. Second example of the fourth embodiment
Next, the feature quantity arithmetic device of the second example is described. Its configuration is the same as that of the first example shown in FIG. 35. In the first example, the real matrix decomposition unit 105 decomposed the real matrix Q into a binary basis matrix and a real coefficient matrix according to equation (29); in this example, the real matrix decomposition unit 105 of the feature quantity arithmetic device 100 decomposes the real matrix into a ternary basis matrix and a real coefficient matrix.

The real matrix decomposition unit 105 decomposes the real matrix $Q \in \mathbb{R}^{d \times L}$ of d rows and L columns into the product of a ternary basis matrix and a real coefficient matrix. Specifically, the real matrix decomposition unit 105 decomposes Q into a basis matrix M having ternary elements and a coefficient matrix C having real elements according to the following equation (41).
$$Q \approx MC \tag{41}$$
Here, $M = (m_1, m_2, \ldots, m_k) \in \{-1, 0, 1\}^{d \times k}$ and $C = (c_1, c_2, \ldots, c_L)^T \in \mathbb{R}^{k \times L}$. That is, the basis matrix M consists of k basis vectors $m_i$, where each basis vector $m_i$ is a d-dimensional ternary vector whose elements take only the values −1, 0, and 1. The basis matrix M is therefore a ternary matrix of d rows and k columns whose elements take only the values −1, 0, and 1.

The coefficient matrix C consists of L coefficient vectors $c_n$ (L is the number of classes), where each coefficient vector $c_n$ is a k-dimensional real vector whose elements are the real coefficients applied to the k basis vectors. Q and MC should of course agree as closely as possible, but the decomposition may contain an error. As in the first example, the real matrix decomposition unit 105 can decompose the real matrix Q by the first to third decomposition methods.

The vector calculation unit 106 calculates the product $Q^T p$. In the following, the vector calculation unit 106 that calculates the product $Q^T p$ is also referred to specifically as the product calculation unit 106. The product $Q^T p$ can be rewritten as in the following equation (42).
$$Q^T p \approx (MC)^T p = C^T M^T p, \qquad q_n^T p \approx \sum_{i=1}^{k} c_{n,i}\,(m_i^T p) \quad (n = 1, 2, \ldots, L) \tag{42}$$
Here, $m_i^T p$ is the inner product of the ternary vector $m_i$ and the binary vector $p$. Instead of the ternary vector $m_i$, the product calculation unit 106 uses the zero-replaced vector $m_i^{bin}$, the filter vector $m_i^{filter}$, and the zero-element count $z_i$ defined below.

First, the product calculation unit 106 replaces each 0 element of $m_i$ with −1 or 1. For each such element, either −1 or 1 may be chosen. This replacement generates the zero-replaced vector $m_i^{bin} \in \{-1, 1\}^d$, which is a binary vector.

The product calculation unit 106 also replaces each 0 element of $m_i$ with −1 and each nonzero element with 1. This replacement generates the filter vector $m_i^{filter} \in \{-1, 1\}^d$, which is also a binary vector.

Furthermore, the product calculation unit 106 obtains the number $z_i$ of 0 elements of $m_i$; $z_i$ is an integer. Using the binary vector $m_i^{bin}$, the filter vector $m_i^{filter}$, and the zero-element count $z_i$, the product calculation unit 106 calculates $m_i^T p$ in equation (42) by the following equations (43) and (44).
$$m_i^T p = d - z_i - 2 D_{filterd\_hamming}(p, m_i^{bin}, m_i^{filter}) \tag{43}$$
$$D_{filterd\_hamming}(p, m_i^{bin}, m_i^{filter}) = \mathrm{BITCOUNT}(\mathrm{AND}(\mathrm{XOR}(p, m_i^{bin}), m_i^{filter})) \tag{44}$$
Here, the AND function in equation (44) takes the logical product when the binary vectors are regarded as binary codes.

The derivation of equations (43) and (44) is described below using the specific example of FIG. 40, which shows a calculation example for this example. In FIG. 40, $p = \{-1, 1, -1, 1, -1, 1\}$ and $m_i = \{-1, 0, 1, 0, 1, 1\}$. In this case, $m_i^{bin} = \{-1, *, 1, *, 1, 1\}$, where "*" denotes either −1 or 1. Further, $m_i^{filter} = \{1, -1, 1, -1, 1, 1\}$ and $z_i = 2$.

The exclusive OR of $p$ and $m_i^{bin}$ in equation (44) is $\mathrm{XOR}(p, m_i^{bin}) = \{-1, *, 1, *, 1, -1\}$. That is, among the elements of $p$ and $m_i$, those that are nonzero and differ (pairs of −1 and 1, or 1 and −1) become 1, and those that form pairs of −1 and −1 or 1 and 1 become −1. Taking the AND with $m_i^{filter}$ then masks out the positions where $m_i$ is 0, so $D_{filterd\_hamming}$ counts only the nonzero positions at which $p$ and $m_i$ differ; in this example it is 2.

$m_i^T p$ equals the number of element pairs whose product is 1 (pairs of 1 and 1, or −1 and −1) minus the number of element pairs whose product is −1 (pairs of −1 and 1, or 1 and −1). Hence $m_i^T p = (d - D_{filterd\_hamming} - z_i) - D_{filterd\_hamming} = d - z_i - 2 D_{filterd\_hamming}$, which gives equation (43); in this example the value is 6 − 2 − 2 × 2 = 0. This naturally agrees with the direct calculation $p^T m_i = \{-1, 1, -1, 1, -1, 1\} \times \{-1, 0, 1, 0, 1, 1\} = 1 + 0 + (-1) + 0 + (-1) + 1 = 0$.

Combining equations (42) to (44), the product $Q^T p$ can be rewritten as in the following equation (45).
$$q_n^T p \approx \sum_{i=1}^{k} c_{n,i} \left( d - z_i - 2 D_{filterd\_hamming}(p, m_i^{bin}, m_i^{filter}) \right) \quad (n = 1, 2, \ldots, L) \tag{45}$$
The product calculation unit 106 calculates the product $Q^T p$ by this equation (45).

The function $D_{filterd\_hamming}(p, m_i^{bin}, m_i^{filter})$ is very similar to the Hamming distance calculation, with only an AND operation added. Therefore, even when $Q \in \mathbb{R}^{d \times L}$ is decomposed into the product of a ternary matrix and a coefficient matrix, $Q^T p$ can be computed far faster than by a floating-point calculation.
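The filtered Hamming distance can be reproduced on the values of FIG. 40 with the sketch below. The function name `ternary_inner_product` and the bit convention (+1 maps to bit 1, −1 maps to bit 0, and each 0 element of the ternary vector is replaced by −1) are assumptions for this illustration; the text allows either replacement value.

```python
def ternary_inner_product(m, p):
    """Inner product of a ternary vector m in {-1, 0, 1}^d and a binary vector
    p in {-1, 1}^d, computed as d - z - 2 * BITCOUNT(AND(XOR(p, m_bin), m_filter)).
    Returns (inner_product, filtered_distance, zero_count)."""
    d = len(m)
    m_bin = 0     # zero-replaced vector: 0 elements are treated as -1 (bit 0)
    m_filter = 0  # bit 1 where m is nonzero, bit 0 where m is 0
    p_code = 0
    for j, (mj, pj) in enumerate(zip(m, p)):
        if mj == 1:
            m_bin |= 1 << j
        if mj != 0:
            m_filter |= 1 << j
        if pj == 1:
            p_code |= 1 << j
    z = d - bin(m_filter).count("1")
    dist = bin((p_code ^ m_bin) & m_filter).count("1")
    return d - z - 2 * dist, dist, z

# Values from the FIG. 40 example.
p = [-1, 1, -1, 1, -1, 1]
m = [-1, 0, 1, 0, 1, 1]
value, dist, z = ternary_inner_product(m, p)
direct = sum(a * b for a, b in zip(m, p))
print(value, dist, z, direct)  # 0 2 2 0
```

The mask makes the result independent of which value replaces the 0 elements, which is why that choice is free.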

As described above, the advantage of decomposing the real matrix $Q \in \mathbb{R}^{d \times L}$ into the product of a ternary, rather than binary, basis matrix and a coefficient matrix is that the approximation of equation (37) holds even with a basis matrix having a smaller number of bases. Because the number of bases can be kept small, this leads to a further speed-up.

4-8. Extension of the second example of the fourth embodiment
In the second example above, the binary vector $p$ and the ternary vector $m_i$ were defined as $p \in \{-1, 1\}^d$ and $m_i \in \{-1, 0, 1\}^d$, and it was explained that the inner product $p^T m_i$ becomes fast when a real matrix consisting of a plurality of real vectors is decomposed into the product of a ternary basis matrix and a coefficient matrix. However, the inner product can also be computed fast when $p$ and $m_i$ are the more general binary vector $p' \in \{-a, a\}^d$ and ternary vector $m_i' \in \{-a, 0, a\}^d$. In this case, $p'^T m_i' = a^2 (p^T m_i)$, so the inner product $p^T m_i$ of the vectors defined with −1, 0, and 1 need only be multiplied by $a^2$.

Furthermore, a fast inner product is possible even when the binary vector and the ternary vector are generalized to $p'' \in \{\alpha, \beta\}^d$ and $m_i'' \in \{\gamma - \delta, \gamma, \gamma + \delta\}^d$. Here, $\alpha$, $\beta$, $\gamma$, and $\delta$ are real numbers with $\alpha \neq \beta$ and $\delta \neq 0$. In this case, $m_i''$ and $p''$ are obtained by applying the linear transformations of the following equations (46) and (47) to each element of $m_i$ and $p$, respectively.
$$m_i'' = A\, m_i + B\, \mathbf{1} \tag{46}$$
$$p'' = C\, p + D\, \mathbf{1} \tag{47}$$
The bold "1" in equations (46) and (47) is a vector of length d whose elements are all 1. Further, A, B, C, and D in equations (46) and (47) are real numbers, calculated in advance so that equations (46) and (47) hold.

The inner product m_i″^T p″ can be expanded as in equation (48).
The calculations in parentheses in equation (48) are inner products between binary vectors consisting of −1 and 1, or inner products between a binary vector consisting of −1 and 1 and a ternary vector consisting of −1, 0, and 1. Therefore, even when the feature vector is an arbitrary binary vector and the real matrix is decomposed using a ternary matrix generalized as described above, the product of such a feature vector and the real matrix can be calculated at high speed.
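Equations (46) to (48) do not appear in this text. The following is one reading consistent with the surrounding description, offered as an assumption rather than the patent's exact notation: m_i and p on the right-hand sides are the {−1,0,1} and {−1,1} vectors, and the constants are chosen as A = δ, B = γ, C = (β−α)/2, D = (α+β)/2 so that the transformed elements take the values γ−δ, γ, γ+δ and α, β.

```latex
\begin{align}
\mathbf{m}_i'' &= A\,\mathbf{m}_i + B\,\mathbf{1},
  & \mathbf{m}_i &\in \{-1,0,1\}^d, \\
\mathbf{p}'' &= C\,\mathbf{p} + D\,\mathbf{1},
  & \mathbf{p} &\in \{-1,1\}^d, \\
\mathbf{m}_i''^{\top}\mathbf{p}''
  &= AC\,(\mathbf{m}_i^{\top}\mathbf{p})
   + AD\,(\mathbf{m}_i^{\top}\mathbf{1})
   + BC\,(\mathbf{1}^{\top}\mathbf{p})
   + BD\,d .
\end{align}
```

Under this reading, each parenthesized term is either a binary-binary or a binary-ternary inner product, because the all-ones vector is itself a binary vector of −1 and 1, and the last term is a constant.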

4-9. Application examples
Next, the arithmetic processing in the vector calculation unit 106 will be described. The vector calculation unit 106 of the first and second examples above computes the product of a binarized feature vector p and a real matrix Q that collects a plurality of real vectors q, and many kinds of arithmetic processing involve such a product. That is, the above examples of the present embodiment can be applied to various devices that perform arithmetic processing using feature vectors.

4-9-1. First application example of the fourth embodiment
In this application example, the present embodiment is applied to an object recognition device that recognizes a plurality of types of objects with an SVM using HOG feature amounts. FIG. 41 is a block diagram showing the configuration of the object recognition device. The object recognition device 107 includes a pyramid image generation unit 171, a HOG feature amount extraction unit 172, a binary code conversion unit 173, a parameter determination unit 174, a parameter matrix decomposition unit 175, a linear SVM identification unit 176, and a peak detection unit 177.

The pyramid image generation unit 171 acquires an image as an input query and generates a G-level pyramid image by reducing the image at a plurality of magnifications. This makes it possible to handle objects of different sizes. The pyramid image generation unit 171 corresponds to the content acquisition unit 161 shown in FIG. 37. The HOG feature amount extraction unit 172 divides the image at each level of the pyramid image into blocks of 16×16 pixels and extracts a D-dimensional HOG feature amount from each block. The HOG feature amount extraction unit 172 corresponds to the feature vector extraction unit 162 shown in FIG. 37. The binary code conversion unit 173 converts the D-dimensional feature amount given to each cell into a d-dimensional binary vector. The binary code conversion unit 173 corresponds to the feature vector binarization unit 163 shown in FIG. 37.

The parameter determination unit 174 determines, for each type of target to be recognized (types such as adult, child, car, and motorcycle, each defined by a parameter), the weight vector w_n (n = 1, 2, ..., L) and the real-valued bias b_n (n = 1, 2, ..., L) used by the linear SVM in the linear SVM identification unit 176. Using feature amounts prepared for learning, the parameter determination unit 174 determines the L weight vectors w_n and biases b_n through a learning process and generates a weight matrix W that collects the weight vectors w_n. The parameter determination unit 174 corresponds to the real matrix acquisition unit 164 shown in FIG. 37. The parameter matrix decomposition unit 175 decomposes the weight matrix W into the product of a discrete-valued basis matrix and a coefficient matrix according to equation (29) or equation (41) described in the first or second example. The parameter matrix decomposition unit 175 corresponds to the real matrix decomposition unit 165 shown in FIG. 37.

The linear SVM identification unit 176 identifies feature vectors with a linear SVM. The linear SVM identification unit 176 first forms a window by grouping s_x×s_y blocks. The feature vector extracted from one window is an s_x×s_y×d-dimensional vector. The linear SVM identification unit 176 applies the linear SVM of equation (49) to this feature vector.
Here, the product W^T x in the linear SVM can be realized by the fast product of a real matrix and a binary vector described in the first or second example.
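A minimal sketch of how the product W^T x could be evaluated once W has been decomposed, assuming equation (49) has the usual linear SVM form f(x) = W^T x + b and that W ≈ MC with a ternary basis matrix M and a real coefficient matrix C. The function name, the sizes, and the random test data are illustrative assumptions.

```python
import numpy as np

def svm_scores_decomposed(x, M, C, b):
    """Approximate f(x) = W^T x + b using W ~ M @ C.

    x: (D,) binary feature vector in {-1, +1}
    M: (D, k) basis matrix with entries in {-1, 0, +1}
    C: (k, L) real coefficient matrix
    b: (L,) real biases, one per class
    """
    t = M.T @ x          # k binary-by-ternary inner products (integers)
    return C.T @ t + b   # k * L real multiply-adds instead of D * L

# Illustrative sizes: a window of s_x * s_y blocks with d bits per block.
s_x, s_y, d, k, L = 4, 8, 16, 6, 3
rng = np.random.default_rng(0)
M = rng.integers(-1, 2, size=(s_x * s_y * d, k))
C = rng.normal(size=(k, L))
b = rng.normal(size=L)
x = rng.choice([-1, 1], size=s_x * s_y * d)
scores = svm_scores_decomposed(x, M, C, b)
```

The integer products in `t` are where bit-level operations would replace floating-point arithmetic; NumPy matrix products are used here only to keep the sketch short.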

Detection results may cluster near a detection position. The peak detection unit 177 therefore takes the position where the value of f(x) is the largest in its neighborhood as the representative detection position. The linear SVM identification unit 176 and the peak detection unit 177 perform processing using feature vectors and correspond to the vector calculation unit 166 in FIG. 37.

Next, an example in which the object recognition device 107 uses HOG feature amounts to detect an object that may be rotated will be described. FIG. 42 shows a case in which a dictionary w_n and a bias b_n are created for each rotation angle of a rotating road sign. In FIG. 42, the horizontal direction indicates the rotation angle θ of the road sign.

In the conventional approach, a learning process is performed for each rotation angle to obtain the dictionary w_n and the bias b_n. HOG feature amounts are then extracted from the input image, and the road sign is detected by applying the window (sliding window) L times. However, this conventional method requires L inner product calculations per window, so the amount of calculation is large. In addition, the angular resolution of detection is 2π/L, which is coarse.

In this application example, therefore, the parameter determination unit 174 collects the dictionaries w_n into a matrix Q, and the linear SVM identification unit 176 performs the inner product calculations between the plurality of dictionaries w_n and the feature vector p collectively according to equation (50).
Decomposing into k integer bases in this way allows each window to be processed with k binary-binary or binary-ternary inner products. Because adjacent dictionaries resemble each other, the number k of integer bases can be made small; in principle, it can be one or fewer per class (k≦L).

In this application example, the peak detection unit 177 further refines the detection resolution by exploiting a property of the coefficient matrix C. FIG. 43 illustrates this property. Suppose the real vectors q_n vary according to a parameter, here the rotation angle θ. If, when the real vectors q_n are collected into the real matrix Q, they are arranged in order of the parameter θ as shown in FIG. 42, then, as shown in FIG. 43, the elements of each vector of the coefficient matrix C lying in the same direction as the one in which the real vectors q_n are arranged, that is, each row vector of C, change smoothly along the row.

The peak detection unit 177 therefore fits each row vector of the coefficient matrix C with a polynomial and expresses it as a continuous function, as in equation (51).
Here, α_i is a fitting coefficient.

Rearranging the discriminant function using this expression, the discriminant function at the rotation angle θ can be written as a continuous function of the parameter θ, as in equation (52).
The peak detection unit 177 detects the peak using this discriminant function. Since c_i(θ) is a polynomial, as shown in equation (51), fθ(p) is also a continuous function (a polynomial in θ). FIG. 44 is a graph showing an example of fθ(p); the horizontal axis is the rotation angle θ and the vertical axis is fθ(p). The peak detection unit 177 detects the θ at which fθ(p) takes its positive maximum as the rotation angle of the target, that is, as the parameter value of the feature vector p.
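The fitting and peak search can be sketched as follows. The sketch assumes that the discriminant function has the form fθ(p) = Σ_i (m_i^T p) c_i(θ) + b(θ), with the bias also represented as a polynomial in θ; the exact forms of equations (51) and (52) are not given in this text, and the names `fit_rows` and `peak_parameter` are hypothetical.

```python
import numpy as np

def fit_rows(thetas, C, degree):
    """Fit each row of C (k x L) with a polynomial in theta.

    Returns an array of shape (k, degree + 1), highest power first.
    """
    return np.stack([np.polyfit(thetas, row, degree) for row in C])

def peak_parameter(t, alphas, bias_poly, lo, hi):
    """Maximise f_theta = sum_i t_i * c_i(theta) + b(theta) over [lo, hi].

    t: (k,) products m_i^T p; alphas: (k, r + 1); bias_poly: (r + 1,).
    """
    f_poly = t @ alphas + bias_poly          # a single polynomial in theta
    crit = np.roots(np.polyder(f_poly))      # stationary points
    crit = crit[np.isreal(crit)].real
    crit = crit[(crit >= lo) & (crit <= hi)]
    cand = np.concatenate([crit, [lo, hi]])  # also consider the interval ends
    vals = np.polyval(f_poly, cand)
    j = int(np.argmax(vals))
    return float(cand[j]), float(vals[j])

# Toy data: one basis (k = 1) whose coefficient row peaks between the
# sampled angles 0.25 and 0.5.
thetas = np.linspace(0.0, 1.0, 5)
C = np.array([1.0 - (thetas - 0.3) ** 2])
alphas = fit_rows(thetas, C, degree=2)
theta_hat, f_max = peak_parameter(np.array([2.0]), alphas, np.zeros(3), 0.0, 1.0)
```

In this toy case the estimated peak falls between two sampled angles, which is the sense in which the continuous form improves on the 2π/L grid. A detection would be reported only when `f_max` is positive.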

As described above, when the plurality of dictionaries w_n are collected into the matrix Q, arranging them in order of the parameter (θ in the example of FIG. 42) so that they change smoothly allows the discriminant function to be expressed as a polynomial in that parameter, and the parameter can then be detected with high resolution.

Although the parameter has been described as the rotation angle, the parameter may instead be, for example, the scale. That is, with the window size fixed as in FIG. 36, a classifier is learned separately for each size (scale) of the person in the window, polynomial fitting is performed with respect to the scale σ, and the peak of the classifier output with respect to σ is found, so that the scale can be estimated with high accuracy. This arrangement also makes it unnecessary to generate the pyramid image itself. There may also be a plurality of parameters. For example, the polynomial fitting described above may be performed with respect to both the rotation angle θ and the scale σ. In this case, the coefficient becomes a two-dimensional polynomial, c_i(θ,σ).

The coefficients α_i can be obtained by first computing the coefficient matrix C and then fitting each row, but they may also be obtained directly without computing the individual elements c_{n,i} of the coefficient matrix C. Furthermore, the fitted function need not be a polynomial; for example, a trigonometric function (sine, cosine) may be fitted.

4-9-2. Second application example of the fourth embodiment
In this application example, the present embodiment is applied to k-means clustering. FIG. 45 is a block diagram showing the configuration of the k-means clustering device. The k-means clustering device 108 includes a content acquisition unit 181, a feature vector generation unit 182, a feature vector binarization unit 183, a representative matrix update unit 184, a convergence determination unit 185, a representative matrix decomposition unit 186, and a nearest representative vector search unit 187.

The content acquisition unit 181 acquires the N contents to be clustered. The feature vector generation unit 182 extracts a feature amount from each content acquired by the content acquisition unit 181 as a feature vector p. The feature vector binarization unit 183 binarizes each feature vector extracted by the feature vector generation unit 182.

The representative matrix update unit 184 first selects k (= L) vectors at random from the N feature vectors binarized by the feature vector binarization unit 183, takes them as the representative vectors q_n (n = 1, 2, ..., L), and takes the matrix collecting these representative vectors q_n as the representative matrix Q. The convergence determination unit 185 performs a convergence determination each time the representative matrix update unit 184 updates the representative matrix. When the convergence determination unit 185 determines that the process has converged, the k-means clustering device 108 ends the clustering process. The representative matrix decomposition unit 186 decomposes the representative matrix updated by the representative matrix update unit 184 into a discrete-valued (binary or ternary) matrix.

The nearest representative vector search unit 187 assigns each of the N binary vectors input from the feature vector binarization unit 183 to its nearest representative vector q_n and outputs the result to the representative matrix update unit 184. For each representative vector q_n, the representative matrix update unit 184 calculates the mean of the (binarized) feature vectors assigned to it and takes this mean as the new representative vector q_n. Because the representative vector q_n updated in this way is the mean of binary vectors, it is a real vector.

Consequently, without the representative matrix decomposition unit 186, the nearest representative vector search unit 187 would have to compute inner products between the updated representative vectors (real vectors) and the feature vectors (binary vectors) to obtain the distances between them. In this application example, as described above, the representative matrix decomposition unit 186 decomposes the representative matrix Q, the set of representative vectors q_n (real vectors), into the product of a discrete-valued (binary or ternary) matrix and a real coefficient matrix, as described in the first or second example. This speeds up the calculation of the distance between each feature vector and each representative vector in the nearest representative vector search unit 187, so that the representative vector nearest to each feature vector (that is, the representative vector to which it should belong) can be found quickly.
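The assignment step can be sketched under the assumption that distances are squared Euclidean distances. For p∈{−1,1}^d, ‖p − q‖² = d − 2 p^T q + ‖q‖², so only the inner products p^T q need the decomposition Q ≈ MC. The function name `assign_nearest` and the argument layout are illustrative assumptions.

```python
import numpy as np

def assign_nearest(P, M, C, q_sqnorm):
    """Index of the nearest representative vector for every row of P.

    P: (N, d) binary feature vectors in {-1, +1}
    M: (d, k) discrete basis matrix, C: (k, L) real coefficients, Q ~ M @ C
    q_sqnorm: (L,) squared norms of the representative vectors
    """
    inner = (P @ M) @ C                    # approximates P @ Q
    # ||p - q||^2 = d - 2 p^T q + ||q||^2, and d is the same for every q.
    return np.argmin(q_sqnorm - 2.0 * inner, axis=1)
```

The squared norms are computed once per update of the representative matrix, so the per-vector cost is dominated by the discrete products `P @ M`.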

4-9-3. Third application example of the fourth embodiment
In this application example, the present embodiment is applied to approximate nearest neighbor search with a k-means tree. As the approximate nearest neighbor search method based on a k-ary tree built with k-means, the approximate nearest neighbor search device of this application example adopts the method proposed in Marius Muja and David G. Lowe, "Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration", in International Conference on Computer Vision Theory and Applications (VISAPP'09), 2009 (http://www.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN, http://people.cs.ubc.ca/~mariusm/uploads/FLANN/flann_visapp09.pdf).

Specifically, the approximate nearest neighbor search device of this application example builds a k-ary tree by recursively applying k-means to N data items and searches approximately for the nearest neighbor using the tree search principle of the above proposal. This method is designed on the assumption that the data are real vectors and the representative vectors registered in the nodes are binary vectors. However, even when the data are binary vectors and the representative vectors registered in the nodes are real vectors, the tree search can be accelerated by adopting the first or second example.

4-10. Modifications of the fourth embodiment
In the feature amount calculation device 106, some of the content acquisition unit 161, the feature vector generation unit 162, the feature vector binarization unit 163, the real matrix acquisition unit 164, the real matrix decomposition unit 165, and the vector calculation unit 166 may be configured as a device separate from the others. In particular, the content acquisition unit 161, the feature vector generation unit 162, the feature vector binarization unit 163, and the vector calculation unit 166 may be mounted on the feature amount calculation device 106, while the real matrix acquisition unit 164 and the real matrix decomposition unit 165 are mounted on another device. In this case, the plurality of real matrices decomposed by the real matrix decomposition unit 165 are stored in the database 167 of the feature amount calculation device 106, and the vector calculation unit 166 acquires them from the database 167.

In the examples of the above embodiment, the basis matrix M is binary or ternary, but it need not be. As long as the elements of the basis matrix M can take only a finite number of values, the above decomposition method can be applied to decompose the real matrix. The coefficient matrix C may also be restricted to predetermined discrete values, like the basis matrix M. For example, the elements of the coefficient matrix C may be constrained to powers of 2, which speeds up processing. When the mean of the elements of the real matrix Q to be decomposed is markedly large (or small), that is, when the mean is far from 0, subtracting this mean from each element of Q in advance to generate an offset real matrix Q′ and decomposing Q′ into the basis matrix M and the coefficient matrix C allows the approximate decomposition of equation (29) or equation (41) to be achieved with fewer bases.
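The text does not say how the subtracted mean is restored when the product is computed. One possibility, shown here purely as an assumption, follows from Q = Q′ + μ·(all-ones matrix): Q^T p = Q′^T p + μ·Σ_j p_j, so the offset costs one extra sum over the elements of p. The function name `product_with_offset` is hypothetical.

```python
import numpy as np

def product_with_offset(p, M, C, mu):
    """Approximate Q^T p when the offset matrix Q - mu was decomposed as M @ C.

    p: (d,) binary feature vector, M: (d, k) discrete basis matrix,
    C: (k, L) real coefficient matrix, mu: scalar subtracted from Q.
    """
    # Q = (Q - mu) + mu * ones, hence Q^T p = (Q - mu)^T p + mu * sum(p).
    return C.T @ (M.T @ p) + mu * np.sum(p)
```

For a binary p in {−1,1}^d, the sum of its elements is again obtainable from a population count, so the correction does not reintroduce real-valued inner products.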

In the first and second examples, the content data acquired by the content acquisition unit 161 may be measurement data obtained from a vehicle. The measurement data may be, for example, image data captured by a camera installed in the vehicle or sensing data measured by a sensor installed in the vehicle. In this case, the vector calculation unit 166 of the feature amount calculation device 106, acting as a relevance determination device, determines the relevance between the measurement data and dictionary data. For example, when image data captured by a camera installed in the vehicle are acquired as the measurement data, data of a plurality of person images are stored in the database as the dictionary data, and the vector calculation unit 166 may determine, by any of the application examples above, whether the image contains a person.

5. Fifth embodiment
The first to fourth embodiments described above can be implemented in combination. In particular, the first embodiment can be combined with the second to fourth embodiments. For example, the pyramid image generation unit 141, the HOG feature amount extraction unit 142, the binary conversion unit 143, and the linear SVM identification unit 146 of the object recognition device 104 (FIG. 33), described as the first application example of the third embodiment, may each perform their processing according to the hybrid pyramid method (FIG. 9) described as the first embodiment. Furthermore, the parameter decomposition unit 145 of the object recognition device 104 may, like the real matrix decomposition unit 155 of the feature amount calculation device 105 (FIG. 37) of the fourth embodiment, decompose a real matrix composed of a plurality of real vectors into the product of a coefficient matrix and a basis matrix composed of a plurality of basis vectors whose elements take only binary or ternary discrete values. The linear SVM identification unit 146 may then compute the inner product of the feature vector with each of the real vectors by calculating the product of the feature vector and the basis matrix, then the product of that result and the coefficient matrix, and use the result to determine the relevance between each of the real vectors and the feature vector.

5-1. First example of the fifth embodiment
FIG. 46 is a block diagram showing the processing in the identification device of the first example of the fifth embodiment. The processing in the identification device 109 corresponds to the binary high-speed identification processing shown in FIG. 9. In this example, HOG feature amounts are extracted from the input content and binarized according to the first embodiment. Once the binary feature amounts of the input content are obtained, the identification device 109 slides a plurality of types of windows of different sizes over them and cuts out the feature amounts inside each window (step S21). The cutting may be done while changing the window size (the number of blocks horizontally and vertically), for example 4 blocks wide × 8 blocks high, 5 blocks wide × 10 blocks high, and so on. This realizes the cutting out of a plurality of types of windows of different sizes. In this case, the number of dimensions of each cut-out feature amount differs according to the number of horizontal and vertical blocks used.
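The cutting in step S21 can be sketched as a loop over window sizes and positions on a grid of binarized block features. The grid layout (rows, cols, d), the stride of one block, and the name `cut_windows` are assumptions made for illustration.

```python
import numpy as np

def cut_windows(blocks, window_sizes):
    """Yield (size, top, left, vector) for every window position.

    blocks: (rows, cols, d) grid of d-bit binary block features
    window_sizes: iterable of (width, height) measured in blocks
    """
    rows, cols, _ = blocks.shape
    for width, height in window_sizes:
        for top in range(rows - height + 1):
            for left in range(cols - width + 1):
                patch = blocks[top:top + height, left:left + width]
                yield (width, height), top, left, patch.reshape(-1)

# A 10 x 6 grid of blocks with 8 bits each, cut with the two sizes named above.
blocks = np.random.default_rng(0).choice([-1, 1], size=(10, 6, 8))
windows = list(cut_windows(blocks, [(4, 8), (5, 10)]))
```

With 8 bits per block, the 4×8 window yields 256-dimensional vectors and the 5×10 window yields 400-dimensional vectors, which is why a separate dictionary is needed for each window size in this example.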

The ternary-decomposed dictionary 119 stores dictionaries (identification models) corresponding to the plurality of types of windows of different sizes. The ternary-decomposed dictionary 119 is obtained by decomposing a real vector into the product of a ternary basis matrix and coefficients according to the third embodiment.

Once the feature amounts are cut out, recognition by a cascade of ternary bases is performed according to the third embodiment, using the cut-out binary feature amounts and the ternary-decomposed dictionary 119 (step S22). Performing cascade identification in this way speeds up the identification process.

5-2. Second example of the fifth embodiment
FIG. 47 is a block diagram showing the processing in the identification device of the second example of the fifth embodiment. The processing in the identification device 110 corresponds to the binary high-speed identification processing shown in FIG. 9. In this example as well, HOG feature amounts are extracted from the input content and binarized according to the first embodiment. Once the binary feature amounts of the input content are obtained, the feature amounts inside the window are cut out (step S31). In this example, the cutting (step S31) is performed only once for the plurality of dictionaries, which simplifies the processing.

The ternary-decomposed dictionaries 1101 and 1102 may store dictionaries (identification models) learned separately, for the recognition target, for each size to be recognized. Because the cutting in step S31 is performed only once, the detection window size is the same for every dictionary; however, by varying the size (magnification) of the recognition target in the learning samples as in FIG. 36 and learning a dictionary independently for each magnification, targets of different sizes can be recognized. The ternary-decomposed dictionaries 1101 and 1102 are obtained by decomposing a real vector into the product of a ternary basis matrix and coefficients according to the third embodiment.

Once the feature amounts are cut out (step S31), the identification device 110 performs the identification process S30 for each of the plurality of (K types of) windows of different sizes. The identification device 110 performs two-stage cascade identification. In each identification process S30, identification by a cascade of ternary bases is first performed according to the third embodiment, using the cut-out binary feature amounts and the ternary-decomposed dictionary 1101 (step S32). If nothing is detected in this cascade identification (step S32), the identification device 110 immediately outputs a non-detection result for the window of that size. If a detection occurs in this first cascade identification (step S32), FIND feature amounts are generated from the binary feature amounts by taking co-occurrences with XOR and bit shifts according to the second embodiment (step S33).

Once the FIND feature amounts are generated (step S33), identification by a cascade of ternary bases is performed according to the third embodiment, using the FIND feature amounts and the ternary-decomposed dictionary 1102 (step S34). This two-stage cascade identification, in which a coarse identification is first made by a cascade that does not use co-occurrence and only the detected candidates are then subjected to a cascade that uses co-occurrence, enables a further speedup.

5-2. Third Example of the Fifth Embodiment
FIG. 48 is a block diagram showing the processing in the identification device of the third example of the fifth embodiment. The processing in this identification device 120 corresponds to the binary high-speed identification processing shown in FIG. 9. In this example as well, a HOG feature amount is extracted from the input content in accordance with the first embodiment and then binarized. Once the binary feature amount of the input content has been obtained, the identification device 120 cuts out the feature amount inside a window while sliding the window (step S41). In this example, the cut-out process (step S41) is performed only once for the plurality of dictionaries, which simplifies the processing.

The ternary-decomposed dictionary 1201 stores dictionaries (identification models) for the target to be recognized, learned separately for each size to be recognized. Because the cut-out process in step S41 is performed only once, the detection window has the same size for every dictionary. However, by varying the size (magnification) of the recognition target in the training samples as shown in FIG. 36 and learning a dictionary independently for each magnification, targets of different sizes can be recognized. Since the feature amount is cut out only once (step S41) and every dictionary has the same number of dimensions, the fourth embodiment can be applied. The ternary-decomposed dictionary 1201 is therefore obtained, in accordance with the fourth embodiment, by decomposing a real matrix that collects a plurality of real vectors into the product of a ternary basis matrix and a coefficient matrix.

Once the feature amount has been cut out (step S41), the identification device 120 performs identification processing for each of a plurality of (K) windows of different sizes, using the cut-out binary feature amount and the ternary-decomposed dictionary 1201 (step S40). In this identification processing, identification by a cascade of ternary bases is performed in accordance with the fourth embodiment (step S42), and the linear discriminant functions corresponding to all of the dictionaries are calculated together. Among the linear discriminant functions calculated together, the windows corresponding to the dictionaries whose function value is positive are output as the detection result (step S43). Because the linear discriminant functions for all of the dictionaries can be calculated together in this way, a further speed-up is possible.
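
The batch calculation of steps S42 and S43 can be sketched as follows, under the assumption that the K dictionaries form the columns of a real matrix approximated by the product of a D x k ternary basis matrix and a k x K coefficient matrix. The names `batched_scores` and `detected_dictionaries` and the numeric values are hypothetical, and the cascade (early termination) part of step S42 is omitted for brevity.

```python
def batched_scores(feature, basis, coeff, biases):
    """Scores of all K dictionaries for one binary feature vector.

    feature: list of D values in {-1, +1}.
    basis:   D x k ternary matrix (list of rows, elements in {-1, 0, +1}).
    coeff:   k x K real coefficient matrix (list of rows).
    biases:  list of K real offsets.
    """
    k = len(basis[0])
    # Project once onto the ternary bases (additions and subtractions only).
    proj = [sum(feature[d] * basis[d][j] for d in range(len(feature)))
            for j in range(k)]
    # A small k x K real product then yields every dictionary's score.
    n_dict = len(coeff[0])
    return [biases[n] + sum(proj[j] * coeff[j][n] for j in range(k))
            for n in range(n_dict)]


def detected_dictionaries(scores):
    """Indices of the dictionaries (window sizes) with a positive score."""
    return [n for n, s in enumerate(scores) if s > 0]


feature = [1, -1, 1, 1]
basis = [[1, 0], [1, -1], [0, 1], [-1, 1]]
coeff = [[1.0, -2.0, 0.5], [0.5, 0.0, -1.0]]
scores = batched_scores(feature, basis, coeff, [0.0, 0.0, 0.0])
print(scores, detected_dictionaries(scores))
```

The projection onto the ternary bases is shared by all dictionaries, so adding a dictionary only adds one column to the small coefficient product.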

The present invention can reduce the number of feature amount extractions and the number of binarization operations, and therefore has the effect of speeding up the relevance determination process. It is useful as, for example, a feature amount calculation device that calculates feature amounts extracted from an image.

DESCRIPTION OF SYMBOLS
10 Input image
11 Resized image
101 Feature amount conversion device
111 to 11N Bit rearrangement units
121 to 12N Logical operation units
130 Feature integrator
102 Feature amount conversion device
211 to 21N Binarizers
221 to 22N Co-occurrence element generators
230 Feature integrator
103 Feature amount calculation device
131 Content acquisition unit
132 Feature vector generation unit
133 Feature vector binarization unit
134 Real vector acquisition unit
135 Real vector decomposition unit
136 Vector calculation unit (inner product calculation unit)
105 k-means clustering device
151 Content acquisition unit
152 Feature vector generation unit
153 Feature vector binarization unit
154 Representative vector update unit
155 Convergence determination unit
156 Representative vector decomposition unit
157 Nearest representative vector calculation unit
106 Feature amount calculation device
161 Content acquisition unit
162 Feature vector generation unit
163 Feature vector binarization unit
164 Real matrix acquisition unit
165 Real matrix decomposition unit
166 Vector calculation unit (product calculation unit)
107 Object recognition device
171 Pyramid image generation unit
172 HOG feature amount extraction unit
173 Binary code conversion unit
174 Parameter determination unit
175 Parameter matrix decomposition unit
176 Linear SVM identification unit
177 Peak detection unit
108 k-means clustering device
181 Content acquisition unit
182 Feature vector generation unit
183 Feature vector binarization unit
184 Representative matrix update unit
185 Convergence determination unit
186 Representative matrix decomposition unit
187 Nearest representative vector calculation unit
109 Identification device
119 Ternary-decomposed dictionary
109, 110, 120 Identification devices
1101, 1102, 1201 Ternary-decomposed dictionaries

Claims

A feature amount calculation device comprising:
a feature amount binarization unit that binarizes a feature amount extracted from each image of a pyramid image made up of an input image and a plurality of resized images, each enlarged or reduced at one of a plurality of magnifications; and
a feature amount calculation unit that determines the relevance between the input image and a plurality of dictionaries by applying a dictionary set made up of the plurality of dictionaries, which differ in size, to the binarized feature amount,
wherein the feature amount calculation unit determines, for each image of the pyramid image, the relevance to the plurality of dictionaries by using the binarized feature amount in common for the plurality of dictionaries.

The feature amount calculation device according to claim 1, further comprising a feature amount extraction unit that extracts the feature amount from each image of the pyramid image,
wherein the feature amount binarization unit binarizes the feature amount extracted by the feature amount extraction unit.

The feature amount calculation device according to claim 1, wherein the feature amount calculation unit performs identification on the input image using the dictionaries.

The feature amount calculation device according to any one of the preceding claims, wherein the feature amount calculation unit applies, to each image of the pyramid image, a dictionary set in which all or some of the plurality of dictionaries are the same.

The feature amount calculation device according to claim 3, further comprising a feature amount conversion unit that converts the feature amount so as to enhance discrimination ability, using co-occurrence elements of the binarized feature amount.

The feature amount calculation device according to any one of claims 1 to 5, further comprising a basis vector acquisition unit that acquires a plurality of basis vectors obtained by decomposing a real vector having real numbers as elements into a linear sum of the plurality of basis vectors, each having only binary or ternary discrete values as elements,
wherein the dictionaries are generated using the plurality of basis vectors, and
the feature amount calculation unit determines the relevance between the real vector and a feature vector indicating the feature amount by sequentially calculating the inner product of the feature vector with each of the plurality of basis vectors.

The feature amount calculation device according to claim 6, further comprising:
a feature amount conversion unit that converts the feature vector so as to enhance discrimination ability, using co-occurrence elements of the feature vector determined to be relevant by the feature amount calculation unit; and
a feature amount calculation unit that determines the relevance between the real vector and the feature vector by further calculating the inner product of the feature vector converted by the feature amount conversion unit with each of a plurality of basis vectors.

The feature amount calculation device according to any one of claims 1 to 7, wherein the feature amount calculation unit cuts out a feature amount while sliding a window over each image of the pyramid image, and determines the relevance of the feature amount cut out from the window by applying the dictionary set.

The feature amount calculation device according to claim 8, further comprising a real matrix decomposition unit that decomposes a real matrix made up of a plurality of real vectors having real numbers as elements into the product of a coefficient matrix and a basis matrix made up of a plurality of basis vectors having only binary or ternary discrete values as elements,
wherein the dictionaries are generated using the basis matrix, and
the feature amount calculation unit, as the calculation of the inner product of a feature vector indicating the feature amount with each of the plurality of real vectors, calculates the product of the feature vector and the basis matrix, further calculates the product of that result and the coefficient matrix, and uses the result to determine the relevance between each of the plurality of real vectors and the feature vector.

A feature amount calculation method comprising:
a feature amount binarization step of binarizing a feature amount extracted from each image of a pyramid image made up of an input image and a plurality of resized images, each enlarged or reduced at one of a plurality of magnifications; and
a feature amount calculation step of determining the relevance between the input image and a plurality of dictionaries by applying a dictionary set made up of the plurality of dictionaries, which differ in size, to the binarized feature amount,
wherein, in the feature amount calculation step, the relevance to the plurality of dictionaries is determined for each image of the pyramid image by using the binarized feature amount in common for the plurality of dictionaries.

A feature amount calculation program that causes a computer to execute:
a feature amount binarization step of binarizing a feature amount extracted from each image of a pyramid image made up of an input image and a plurality of resized images, each enlarged or reduced at one of a plurality of magnifications; and
a feature amount calculation step of determining the relevance between the input image and a plurality of dictionaries by applying a dictionary set made up of the plurality of dictionaries, which differ in size, to the binarized feature amount,
wherein, in the feature amount calculation step, the relevance to the plurality of dictionaries is determined for each image of the pyramid image by using the binarized feature amount in common for the plurality of dictionaries.