JP6104209B2 - Hash function generation method, hash value generation method, apparatus, and program

Info

Publication number: JP6104209B2
Application number: JP2014079577A
Authority: JP
Inventors: Go Irie (入江 豪); Hiroyuki Arai (新井 啓之); Yukinobu Taniguchi (谷口 行信)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-04-08
Filing date: 2014-04-08
Publication date: 2017-03-29
Anticipated expiration: 2034-04-08
Also published as: JP2015201042A

Description

The present invention relates to a hash function generation method, a hash value generation method, an apparatus, and a program, and more particularly to a method, apparatus, and program for generating a hash function.

Advances in communication environments, computers, and distributed processing infrastructure have made the amount of content (images, video, audio, documents, and so on) circulating on networks enormous. For example, one search engine is said to index trillions of web pages. One site reportedly receives 350 million image uploads per day, and another reportedly has 64 hours of new video published every minute.

Such an enormous amount of content is a rich source of information for users, but it also makes it increasingly difficult to quickly reach the content a user wants to view. Against this background, demand is growing for media analysis technology that efficiently finds the content a user wants to browse or watch.

In content analysis, finding similar content plays an important role. For example, when classifying content, similar content is placed in the same category. In search, when a content item is given as a query, the basic requirement is to retrieve content similar to it. Content recommendation likewise finds and recommends content similar to what the user has viewed or is viewing. In content summarization, presenting similar content is redundant, so similar content must be found and omitted.

A typical procedure for finding similar content is as follows. First, each content item is represented by a feature amount. Next, a similarity is computed by measuring how close the feature amounts are, and similar content is found based on that similarity. As a simple example, if the content is an image or video, similarity can be measured using the color histogram of the image (video frame) as the feature amount. If the content is a document, similarity can be measured using a histogram of word occurrence frequencies (called a Bag-of-Words histogram) as the feature amount. Needless to say, if there are 1000 content items, the similarity must be computed against each of the 1000 items, and the items with high similarity are then picked up as similar content.
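The procedure above can be sketched as a brute-force search. This is only an illustrative sketch: the choice of cosine similarity, the function names, and the toy 4-bin histograms are assumptions made here, not values from the patent.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, features, top_k=3):
    """Compare the query against every stored feature and return the
    indices of the top_k most similar items, best first."""
    scored = [(cosine_similarity(query, f), i) for i, f in enumerate(features)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:top_k]]

# Toy 4-bin color histograms (made-up values for illustration).
histograms = [
    [8.0, 1.0, 1.0, 0.0],
    [7.0, 2.0, 1.0, 0.0],
    [0.0, 1.0, 2.0, 7.0],
    [1.0, 8.0, 1.0, 0.0],
]
query = [9.0, 1.0, 0.0, 0.0]
nearest = most_similar(query, histograms, top_k=2)
```

Every query touches every stored feature, which is the cost that motivates hashing in the rest of the description.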

However, as noted above, when the target is an enormous amount of content, this procedure consumes large amounts of computation time and memory.

The feature amount (vector) of a content item is usually high-dimensional, and computing similarities takes a great deal of time. In general, the dimensionality of a document's Bag-of-Words histogram equals the number of word types (the vocabulary size). Even a simple feature such as an image color histogram is generally a real-valued vector of hundreds to thousands of dimensions, and recently used feature representations based on sparse coding or Fisher kernels can reach hundreds of thousands to millions of dimensions. Furthermore, because the similarity must be computed for all pairs of content, whatever similarity measure is used, a computational cost of O(DN) is required when the feature dimension is D and there are N content items. To execute searches instantly, it is preferable to keep the feature amounts or their similarities in memory, but doing so requires O(N^2) memory. Now that content on the order of hundreds of millions of items or more must be handled, this demands unrealistic amounts of time and memory.

Several inventions have been made to solve this problem.

For example, the technique disclosed in Non-Patent Document 1 generates a group of hash functions such that, for any two nearby content items (feature amounts), the collision probability equals the similarity of the original feature amounts. Cosine similarity is considered as a typical similarity, and in that case the basic procedure for generating hash functions is to generate multiple random hyperplanes in the feature space (called random projection). Each feature amount is hashed according to which side of each hyperplane it lies on, so similar content can be found approximately without computing similarities between all content items.
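The random-hyperplane idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Non-Patent Document 1: the function names, the 16-bit code length, and the fixed seed are hypothetical choices.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian normal vector per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def hash_vector(vector, hyperplanes):
    """Bit k is 1 if the vector lies on the non-negative side of hyperplane k."""
    bits = []
    for normal in hyperplanes:
        dot = sum(n * v for n, v in zip(normal, vector))
        bits.append(1 if dot >= 0.0 else 0)
    return tuple(bits)

def hamming_distance(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

planes = make_hyperplanes(num_bits=16, dim=4, seed=0)
code_a = hash_vector([8.0, 1.0, 1.0, 0.0], planes)
code_b = hash_vector([16.0, 2.0, 2.0, 0.0], planes)    # same direction as a
code_c = hash_vector([-8.0, -1.0, -1.0, 0.0], planes)  # opposite direction
```

Vectors pointing in the same direction (cosine similarity 1) receive identical codes, while vectors pointing in opposite directions differ in every bit; intermediate angles fall in between in expectation.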

The technique disclosed in Non-Patent Document 2 captures the distribution of the feature amounts and constructs hash values that are optimal for that distribution. Specifically, it captures the manifold structure in the feature space and finds a non-linear embedding into a binary space (hash space) that optimally preserves that manifold structure, thereby converting the original high-dimensional feature amounts into hash values with a small number of bits. By evaluating the similarity of the hash values, similar content can be found at high speed.

The technique disclosed in Patent Document 1 obtains a hash function based on the feature amounts of content and association information (ground-truth data) indicating whether two different content items should be associated, and uses this hash function to convert feature amounts into hash values with a small number of bits.

Patent Document 1: JP 2013-68884 A

Non-Patent Document 1: M. Datar, N. Immorlica, P. Indyk, V.S. Mirrokni, "Locality-Sensitive Hashing Scheme based on p-Stable Distributions", In Proceedings of the Twentieth Annual Symposium on Computational Geometry, 2004, p.253-262.

Non-Patent Document 2: Go Irie, Zhenguo Li, Shih-Fu Chang, "Hashing to Preserve Structure" (in Japanese), Meeting on Image Recognition and Understanding, 2013.

The techniques disclosed in Non-Patent Documents 1 and 2 convert the original content into compact hash values, which makes it possible to find similar content quickly and with high accuracy. In both techniques, however, the hash values are derived solely from the feature amounts of the content and do not reflect the semantic viewpoints of the content, so sufficient accuracy cannot be obtained.

The technique disclosed in Patent Document 1, on the other hand, obtains a hash function based on the feature amounts of content and association information (ground-truth data) indicating whether two different content items should be associated, and can therefore generate hash values that follow the association information.

However, association information indicating whether content items should be associated and the semantic viewpoints of content do not necessarily convey the same information; the latter are far more diverse. As an easy-to-understand example, suppose there are four images in total: "red apple", "green apple", "mandarin orange", and "tuna". In the technique of Patent Document 1, the association information specifies which images are related (or unrelated) to which. For these four images, one option is to treat "red apple" and "green apple", which are the same kind of fruit, as related and everything else as unrelated. Another option is to treat "red apple", "green apple", and "mandarin orange" as related because they are all fruits, and only "tuna" as unrelated. At the same time, from the viewpoint that both are specialty products of Aomori Prefecture, "red apple" and "tuna" can be regarded as weakly related. In other words, seen from various semantic viewpoints, "red apple" is strongly related to "green apple" and weakly related to "mandarin orange" from the viewpoint of fruit, and weakly related to "tuna" from the viewpoint of Aomori specialties. The technique of Patent Document 1 cannot express such diverse viewpoints, nor the strength of each relation.

Consequently, no technique to date has been able to find similar content with high accuracy by constructing hash values based on diverse semantic viewpoints while remaining fast and memory-efficient.

The present invention has been made in view of the above circumstances, and its object is to provide a hash function generation method, a hash value generation method, an apparatus, and a program capable of generating a hash function for obtaining hash values that take the meaning of content into account.

To achieve the above object, a hash function generation method according to the present invention is a hash function generation method in a hash function generation apparatus including feature extraction means and hash function generation means. The method includes: a step in which the feature extraction means extracts, for each of a plurality of contents, a feature amount from the content; and a step in which the hash function generation means generates, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and a semantic vector, given for each of the plurality of contents, that represents the meaning of the content, a hash function for obtaining a hash value corresponding to the feature amount and a hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

A hash function generation apparatus according to the present invention includes: feature extraction means for extracting, for each of a plurality of contents, a feature amount from the content; and hash function generation means for generating, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and a semantic vector, given for each of the plurality of contents, that represents the meaning of the content, a hash function for obtaining a hash value corresponding to the feature amount and a hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

In the above hash function generation method and hash function generation apparatus, the hash function generation means may generate the hash function for obtaining a hash value corresponding to the feature amount and the hash function for obtaining a hash value corresponding to the semantic vector, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and the semantic vector given for each of the plurality of contents, such that both of the following hold for each of the plurality of contents. First, the distance becomes small between (a) the hash value corresponding to the feature amount of the content that is obtained from a manifold structure in the feature space (the space in which the feature amounts exist), in which the feature amount of the content is expressed as a linear combination of the hash values corresponding to the feature amounts of other contents lying in its neighborhood, and (b) the hash value corresponding to the feature amount extracted from the content. Second, the distance becomes small between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content.

Also, in the above hash function generation method and hash function generation apparatus, the hash function generation means may generate the hash function for obtaining a hash value corresponding to the feature amount and the hash function for obtaining a hash value corresponding to the semantic vector, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and the semantic vector given for each of the plurality of contents, such that both of the following hold for each of the plurality of contents. First, the distance becomes small between (a) the hash value corresponding to the semantic vector of the content that is obtained from a manifold structure in the semantic vector space (the space in which the semantic vectors exist), in which the semantic vector of the content is expressed as a linear combination of the hash values corresponding to the semantic vectors of other contents lying in its neighborhood, and (b) the hash value corresponding to the semantic vector of the content. Second, the distance becomes small between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content.

The hash value in the present invention may be a binary value.
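When the hash values are binary, one natural way to measure the distance between the hash value of a feature amount and the hash value of a semantic vector is the Hamming distance (the number of differing bits). The following is a minimal sketch under that assumption; the 8-bit codes and the function names are hypothetical and are not taken from the patent.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single Python int."""
    value = 0
    for bit in bits:
        value = (value << 1) | (1 if bit else 0)
    return value

def hamming(code_a, code_b):
    """Hamming distance between two packed codes: XOR, then count set bits."""
    return bin(code_a ^ code_b).count("1")

# Hypothetical 8-bit hash values for one content item:
# one derived from its feature amount, one from its semantic vector.
feature_code = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
semantic_code = pack_bits([1, 0, 1, 0, 0, 0, 1, 1])
distance = hamming(feature_code, semantic_code)  # the codes differ in 2 bits
```

Packing the bits into integers keeps each code compact and reduces the comparison to one XOR and a bit count, which is what makes binary hash values attractive for fast, memory-efficient search.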

A hash value generation method according to the present invention is a hash value generation method in a hash value generation apparatus including feature extraction means and hash value generation means. The method includes: a step in which the feature extraction means extracts a feature amount from content; and a step in which the hash value generation means generates a hash value corresponding to the feature amount of the content, based on the feature amount extracted by the feature extraction means and the hash function, generated by the hash function generation method of the present invention, for obtaining a hash value corresponding to the feature amount.

A hash value generation apparatus according to the present invention includes: feature extraction means for extracting a feature amount from content; and hash value generation means for generating a hash value corresponding to the feature amount of the content, based on the feature amount extracted by the feature extraction means and the hash function, generated by the hash function generation apparatus of the present invention, for obtaining a hash value corresponding to the feature amount.

The program of the present invention is a program for causing a computer to execute each step of the hash function generation method of the present invention or the hash value generation method of the present invention.

As described above, the hash function generation method, apparatus, and program of the present invention generate, based on the feature amount of each of a plurality of contents and the semantic vector given for each content to represent its meaning, a hash function for obtaining a hash value corresponding to the feature amount and a hash function for obtaining a hash value corresponding to the semantic vector, such that for each content the distance between the hash value corresponding to its extracted feature amount and the hash value corresponding to its semantic vector becomes small. This yields the effect that a hash function for obtaining hash values that take the meaning of content into account can be generated.

In addition, the hash value generation method, apparatus, and program of the present invention generate a hash value corresponding to the feature amount of content based on the feature amount extracted from the content and the generated hash function. This yields the effect that data similar to the content can be found accurately using hash values that take the meaning of the content into account.

A schematic diagram showing the configuration of the information processing apparatus according to the first embodiment of the present invention.
An explanatory diagram for explaining a hash function.
An explanatory diagram for explaining the manifold structure in the feature space.
An explanatory diagram for explaining a hash function that takes the manifold structure in the feature space into account.
An explanatory diagram for explaining the manifold structure in the feature space and the manifold structure in the semantic vector space.
A flowchart showing the hash function generation processing routine in the information processing apparatus according to the first embodiment of the present invention.
A flowchart showing the hash value generation processing routine in the information processing apparatus according to the first embodiment of the present invention.
A schematic diagram showing the configuration of the information processing system according to the third embodiment of the present invention.
A diagram showing an example of content associated by hash values.

The embodiments of the present invention are described using, as an example, a case in which the present invention is applied to an information processing apparatus that generates hash functions and hash values in order to find similar content with high accuracy while remaining fast and memory-efficient. Embodiments of the present invention are described in detail below with reference to the drawings.

<First Embodiment>
<System configuration>
In the embodiments of the present invention, a hash function and hash values are generated for content in a way that takes into account multiple semantic viewpoints and the strength of each relation. The information processing apparatus 1 according to the first embodiment of the present invention generates a hash function and generates hash values using the generated hash function. The information processing apparatus 1 is implemented as a computer including a CPU, a RAM, and a ROM that stores programs for executing the hash function generation processing routine and the hash value generation processing routine described later, and is functionally configured as follows. As shown in FIG. 1, the information processing apparatus 1 includes an input unit 2, a calculation unit 3, and an output unit 4.

A plurality of contents are registered in the content database 5 shown in FIG. 1. The content database 5 stores at least the content itself or an address that uniquely indicates where the content data is located. The content is, for example, a document file for a document, an image file for an image, a sound file for audio, or a video file for video. Preferably, the content database 5 also stores the media type of each content item and an identifier that uniquely identifies the item.

The semantic vector database 6 shown in FIG. 1 stores a semantic vector for each content item registered in the content database 5.

A semantic vector represents the meaning of a content item. It is a vector given for each content item and expresses how strongly the item is related to each of one or more semantic concepts. In the example above, the semantic vector can be defined as a three-dimensional vector over the three semantic concepts "apple", "fruit", and "Aomori", and can be set, for example, as follows.

"Red apple": {apple: 1}, {fruit: 1}, {Aomori: 0.9} -> (1, 1, 0.9)

"Green apple": {apple: 1}, {fruit: 1}, {Aomori: 0.1} -> (1, 1, 0.1)

"Mandarin orange": {apple: 0}, {fruit: 1}, {Aomori: 0} -> (0, 1, 0)

"Tuna": {apple: 0}, {fruit: 0}, {Aomori: 0.7} -> (0, 0, 0.7)

With the three-dimensional semantic vectors above, the distance (here, the squared Euclidean distance) between "red apple" and "green apple" is (1 - 1)^2 + (1 - 1)^2 + (0.9 - 0.1)^2 = 0.64. Similarly, the distance between "red apple" and "mandarin orange" is 1.81, and the distance between "red apple" and "tuna" is 2.04. Semantic differences based on diverse viewpoints can thus be expressed flexibly.
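The distances quoted above can be reproduced with a few lines of code. The vectors are the ones given in the example; the dictionary keys and the function name are illustrative choices made here.

```python
# Semantic vectors over the concepts (apple, fruit, Aomori), as in the example.
semantic_vectors = {
    "red apple":       (1.0, 1.0, 0.9),
    "green apple":     (1.0, 1.0, 0.1),
    "mandarin orange": (0.0, 1.0, 0.0),
    "tuna":            (0.0, 0.0, 0.7),
}

def squared_distance(a, b):
    """Squared Euclidean distance between two semantic vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

reference = semantic_vectors["red apple"]
for name in ("green apple", "mandarin orange", "tuna"):
    d = squared_distance(reference, semantic_vectors[name])
    print(f"red apple vs {name}: {d:.2f}")
# red apple vs green apple: 0.64
# red apple vs mandarin orange: 1.81
# red apple vs tuna: 2.04
```

The ordering matches the intuition in the text: "green apple" is closest to "red apple", "mandarin orange" is further away, and "tuna" is furthest but still closer than it would be without the shared "Aomori" concept.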

The most important point of the embodiments of the present invention is that a hash function and hash values can be obtained that incorporate semantic vectors, which can flexibly express semantic viewpoints. The technique of Patent Document 1 can only consider fixed, uniform association information stating which content is related to which, and cannot obtain a hash function and hash values that incorporate such flexible semantic vectors.

If the semantic vector database 6 is not given in advance, a separate semantic vector database input unit may be provided. The semantic vector database input unit is an input device for supplying, manually or mechanically, the semantic concept corresponding to each dimension (element) of the semantic vector and the value of each element. For manual input, it may be a general-purpose computer with a user interface (keyboard, pointing device, and so on); for mechanical input, it may be, for example, a general-purpose computer that can connect to the Internet.

For manual input, the user lists, through the user interface, the semantic concepts expected to be closely related to the content, as in the example shown earlier, and specifies how strongly each content item is related to each semantic concept, for example as a real value between 0 and 1.

An example of mechanical input is collecting images from Web pages. Most simply, on the assumption that an image and a document on the same Web page are related, characteristic words contained in the document may be extracted and used as semantic concepts. Any known method may be used to extract characteristic words. A simple approach is to remove stop words (words carrying little information, such as particles) with rules, and then remove words whose document frequency (DF), that is, the number of documents containing the word, is too high or too low.
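The simple extraction approach described above (rule-based stop-word removal followed by document-frequency filtering) might look like the sketch below. The stop-word list, the thresholds `min_df` and `max_df_ratio`, the whitespace tokenizer, and the sample pages are all assumptions for illustration; the patent leaves the method open.

```python
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "from"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def characteristic_words(documents, min_df=2, max_df_ratio=0.7):
    """Keep, per document, the words whose document frequency is neither
    too low (< min_df) nor too high (> max_df_ratio * number of documents)."""
    tokenized = [set(tokenize(doc)) for doc in documents]
    df = Counter(word for words in tokenized for word in words)
    max_df = max_df_ratio * len(documents)
    keep = {w for w, count in df.items() if min_df <= count <= max_df}
    return [sorted(words & keep) for words in tokenized]

pages = [
    "the red apple is a fruit from aomori",
    "a green apple is a fruit",
    "the mandarin is a fruit",
    "tuna from aomori",
]
concepts = characteristic_words(pages)
# concepts == [['aomori', 'apple'], ['apple'], [], ['aomori']]
```

With these toy thresholds, "fruit" is dropped because it appears in three of the four pages, and words appearing in only one page are dropped as too rare; the thresholds would need tuning on real data.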

Alternatively, certain Web pages such as blogs, news sites, microblogs, and SNS pages may carry classification keywords, often called tags. In that case, these may be used directly as semantic concepts.

In addition, metadata such as items describing the content (title, summary text, keywords) or items concerning the content format (data size, thumbnail size, and so on) may be stored in the semantic vector database 6 and used as semantic vectors.

When semantic vectors are supplied mechanically, they have the advantage of being obtained without manual effort.

The information processing apparatus 1 is connected to the content database 5 and the semantic vector database 6 via communication means and exchanges information with them through the input unit 2 and the output unit 4. It performs hash function generation processing, which generates a hash function based on the content registered in the content database 5 and the semantic vector database 6, and hash value generation processing, which converts content into a plurality of binary values using the generated hash function.

The content database 5 and the semantic vector database 6 may be located inside or outside the information processing apparatus 1, and any known communication means may be used. In this embodiment, the databases are assumed to be external and connected so as to communicate over the Internet using TCP/IP. The content database 5 and the semantic vector database 6 may be implemented with, for example, an RDBMS (Relational Database Management System).

Each unit of the information processing apparatus 1, the content database 5, and the semantic vector database 6 may be implemented by a computer, server, or the like that has an arithmetic processing unit, a storage device, and so on, with the processing of each unit executed by a program. The program is stored in a storage device of the information processing apparatus 1. It can also be recorded on a recording medium such as a magnetic disk, an optical disk, or a semiconductor memory, or provided over a network. Of course, no component has to be implemented on a single computer or server; components may be distributed across multiple computers connected by a network.

Next, each unit of the information processing apparatus 1 shown in FIG. 1 will be described.

The input unit 2 acquires a plurality (N) of pieces of content from the content database 5. The input unit 2 also accepts content input as a search query, and acquires from the semantic vector database 6 the semantic vector given to each of the N pieces of content.

The calculation unit 3 includes a feature extraction unit 30, a hash function generation unit 32, a hash function storage unit 34, and a hash value generation unit 36.

The feature extraction unit 30 extracts a feature quantity from each of the N pieces of content received by the input unit 2.
The feature extraction unit 30 also extracts a feature quantity from the content received by the input unit 2 as a search query.

The feature extraction process in the feature extraction unit 30 depends on the media type of the content. For example, the feature quantities that are extracted, or that can be extracted, differ depending on whether the content is a document, an image, sound, or video. Which feature quantities are extracted for each media type is not essential to this embodiment, and any generally known feature extraction process may be used. Specifically, the embodiment works with any feature quantity that is numerical data (a scalar or a vector) with a dimension, extracted from a piece of content. The following therefore describes examples of feature extraction processes, suitable for this embodiment, for various kinds of content.

When the content is a document, the frequency of occurrence of the words appearing in the document can be used. For example, known morphological analysis may be used to count the occurrences of each word corresponding to a noun, adjective, or the like. In this case, the feature quantity of each document is expressed as a vector with as many dimensions as there are distinct words.

When the content is an image, features such as brightness features, color features, texture features, concept features, landscape features, and shape features are extracted. The brightness feature can be extracted as a histogram by counting the V values in the HSV color space. In this case, the feature quantity of each image is expressed as a vector with as many dimensions as there are quantization levels of the V value (for example, 256 levels for 8-bit quantization).
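As a rough sketch of the brightness feature, the following counts V values into a histogram using only the Python standard library. The function name, the pixel format (8-bit RGB tuples), and the default of 256 bins are assumptions made for illustration.

```python
import colorsys

def brightness_histogram(pixels, bins=256):
    """pixels: iterable of (r, g, b) with 8-bit channels; returns `bins` counts."""
    hist = [0] * bins
    for r, g, b in pixels:
        # V in HSV is the largest of the normalized R, G, B channels.
        _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        idx = min(int(v * bins), bins - 1)  # v == 1.0 falls into the last bin
        hist[idx] += 1
    return hist

pixels = [(0, 0, 0), (255, 0, 0), (128, 64, 32), (10, 128, 20)]
hist = brightness_histogram(pixels)
```

The resulting list has one entry per quantization level, so its length is the dimension of the brightness feature vector.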

The color feature can be extracted as a histogram by counting the values along each axis (L*, a*, b*) of the L*a*b* color space. The number of histogram bins per axis may be, for example, 4 for L*, 14 for a*, and 14 for b*. In this case, the total number of bins over the three axes is 4 × 14 × 14 = 784, giving a 784-dimensional vector.

As texture features, statistics of the gray-level histogram (such as contrast), the power spectrum, or the like may be computed. Alternatively, local features are preferable because, like color and motion, they can be extracted in the form of a histogram. As local features, for example, SIFT (Scale Invariant Feature Transform) described in Reference 1 below or SURF (Speeded Up Robust Features) described in Reference 2 below can be used.

[Reference 1] D.G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", International Journal of Computer Vision, pp. 91-110, 2004
[Reference 2] H. Bay, T. Tuytelaars, and L.V. Gool, "SURF: Speeded Up Robust Features", Lecture Notes in Computer Science, vol. 3951, pp. 404-417, 2006

The local features extracted by these methods are, for example, 128-dimensional real-valued vectors. Each vector is converted into a code by referring to a codebook learned in advance, and a histogram is generated by counting the occurrences of each code. In this case, the number of histogram bins equals the number of codes in the codebook. Alternatively, the sparse representation described in Reference 3 or the Fisher-kernel-based feature representations described in References 4 and 5 may be used.

[Reference 3] Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong, "Locality-constrained Linear Coding for Image Classification", IEEE Conference on Computer Vision and Pattern Recognition, pp. 3360-3367, 2010
[Reference 4] Florent Perronnin, Jorge Sanchez, and Thomas Mensink, "Improving the Fisher Kernel for Large-Scale Image Classification", European Conference on Computer Vision, pp. 143-156, 2010
[Reference 5] Herve Jegou, Florent Perronnin, Matthijs Douze, Jorge Sanchez, Patrick Perez, and Cordelia Schmid, "Aggregating Local Image Descriptors into Compact Codes", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34, No. 9, pp. 1704-1716, 2012

In every case, the resulting feature quantity is a real-valued vector whose length depends on the number of codes in the codebook.
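The conversion of local features into a histogram can be illustrated as follows, assuming that the learned code table is a codebook of representative descriptors and that each local feature is assigned to its nearest code in Euclidean distance. The function names and the 2-dimensional toy data are hypothetical; real SIFT or SURF descriptors would be, for example, 128-dimensional.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(descriptor, codebook[k]))

def code_histogram(descriptors, codebook):
    """Count how many local descriptors fall on each codeword."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1
    return hist

# Toy codebook with 3 codes, so the histogram has 3 bins.
codebook = [[0, 0], [10, 0], [0, 10]]
descriptors = [[1, 1], [9, 1], [0, 8], [1, 9], [-1, 0]]
feature = code_histogram(descriptors, codebook)
```

The length of `feature` equals the number of codes, which matches the statement that the histogram has as many bins as the codebook has codes.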

Concept features are the objects contained in an image and the events the image captures. Any concepts may be used; examples include "sea", "mountain", and "ball". If the sea appears in an image, the image is said to belong to the "sea" concept. Whether an image belongs to each concept can be determined with concept classifiers. Normally one concept classifier is prepared per concept; it takes the feature quantities of the image as input and outputs, as a membership level, whether the image belongs to that concept. Concept classifiers are obtained in advance by learning the relationship between predetermined image features, for example the local features described above, and ground-truth labels assigned manually in advance to indicate which concepts each image belongs to. A support vector machine, for example, may be used as the learner. The concept feature is obtained by collecting the membership levels for all concepts into a vector. In this case, the generated feature quantity is a vector with as many dimensions as there are concepts.

The landscape feature is a feature quantity that represents the scenery or scene of an image. For example, the GIST descriptor described in Reference 6 can be used. The GIST descriptor divides the image into regions and is expressed by the coefficients obtained when filters with fixed orientations are applied to each region. In this case, the generated feature quantity is a vector whose length depends on the filter configuration (the number of regions and the number of orientations).

[Reference 6] A. Oliva and A. Torralba, "Building the gist of a scene: the role of global image features in recognition", Progress in Brain Research, 155, pp. 23-36, 2006

The shape feature is a feature quantity that represents the shape of an object appearing in the image. For example, the HOG feature described in Reference 7 or an edge histogram can be used.

[Reference 7] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", IEEE Conference on Computer Vision and Pattern Recognition, pp. 886-893, 2005

When the content is sound, pitch features, sound pressure features, spectral features, rhythm features, speech features, music features, sound event features, and the like are extracted. The pitch feature may be, for example, the pitch itself, which can be extracted using the method described in Reference 8 below. In this case, the pitch may be expressed as a one-dimensional vector (scalar), or quantized into several dimensions.

[Reference 8] Sadaoki Furui, "Digital Speech Processing, 4.9 Pitch Extraction", pp. 57-59, 1985

As the sound pressure feature, the amplitude values of the audio waveform data may be used, or the short-time power spectrum may be computed and the average power of arbitrary bands used. In either case, the result is a vector whose length depends on the number of bands for which the sound pressure is computed.

As the spectral feature, for example, Mel-Frequency Cepstral Coefficients (MFCC) can be used.

As the rhythm feature, for example, the tempo may be extracted, using for example the method described in Reference 9 below.

[Reference 9] E.D. Scheirer, "Tempo and Beat Analysis of Acoustic Musical Signals", Journal of the Acoustical Society of America, Vol. 103, Issue 1, pp. 588-601, 1998

The speech feature and the music feature represent the presence or absence of speech and of music, respectively. Sections in which speech or music is present can be found using, for example, the method described in Reference 10 below.

[Reference 10] K. Minami, A. Akutsu, H. Hamada, and Y. Tonomura, "Video Handling with Music and Speech Detection", IEEE Multimedia, vol. 5, no. 3, pp. 17-25, 1998

As the sound event feature, for example, the occurrence of emotional vocal sounds such as laughter or shouting, or of environmental sounds such as gunshots or explosions, may be used. Such sound events can be detected using, for example, the method described in Reference 11 below.

[Reference 11] International Publication No. 2008/032787

When the content is video, the video is generally a stream of images and sound, so the image features and sound features described above can be used. As for which images and sound in the video to analyze, for example, the video is divided into sections in advance, and features are extracted from one image and the sound of each section.

The video may be divided into sections at a predetermined fixed interval, or at cut points, that is, points where the video breaks discontinuously, using for example the method described in Reference 12 below.

[Reference 12] Y. Tonomura, A. Akutsu, Y. Taniguchi, and G. Suzuki, "Structured Video Computing", IEEE Multimedia, pp. 34-43, 1994

When dividing video into sections, the latter method is preferable. The section division process yields the start point (start time) and end point (end time) of each section, and the features may be handled as separate feature quantities for each of these times.

One or more of the feature quantities described above may be used, and other known feature quantities may also be used.

The hash function generation unit 32 generates hash functions based on the feature quantity of each of the N pieces of content extracted by the feature extraction unit 30 and the semantic vector given to each of the N pieces of content and acquired by the input unit 2. A hash function here is a function for obtaining the hash value corresponding to a feature quantity.

Specifically, let the feature quantity extracted from a piece of content i be x_i ∈ R^D, where D is the dimension of the feature quantity. For the feature quantity x_i, the hash function generation unit 32 obtains a set of hash functions h_k: R^D → {-1, 1}. Noting that {-1, 1} and {0, 1} do not differ in essence from the standpoint of information content, each h_k maps the feature quantity x_i ∈ R^D to a binary value of 0 or 1. The hash function set H = {h_1, h_2, ..., h_B} therefore converts the feature quantity x_i into B binary values, that is, a B-bit hash value. This embodiment is described for the case where the hash value consists of a plurality of binary values.

The aim of this embodiment is to use these hash values to make similarity measurable even across different media types and, furthermore, to omit time-consuming similarity computation. The hash functions generated here, and the hash values they produce, therefore have the following two properties.

(A) Feature quantities are converted into hash values that reflect similarity in the original space R^D. That is, the more similar two pieces of content are, the smaller the distance (Hamming distance) between their hash values.

(B) Pieces of content whose semantic vectors are close have hash values that are close.
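Both properties are stated in terms of the distance between hash values. As a small illustration, assuming each B-bit hash value is stored as a Python integer (a representation chosen here, not one given in the text), the Hamming distance is the number of set bits in the XOR of the two values.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hash values differ."""
    return bin(a ^ b).count("1")

# Two 8-bit hash values that differ in two positions.
d = hamming_distance(0b10110010, 0b10010011)
```

Because this is a bitwise operation followed by a bit count, it is much cheaper than computing a similarity between the original D-dimensional feature quantities.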

In one example of this embodiment, a hash function based on the linear function shown in equation (1) is used.
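The image of equation (1) is not reproduced in this text. A form consistent with the definitions that follow (a sign function applied to a linear function with parameters w_k ∈ R^D and b_k ∈ R) would be the following; it is a reconstruction from context, not a copy of the original equation.

```latex
h_k(x) = \mathrm{sign}\left( w_k^{\top} x + b_k \right), \qquad k = 1, 2, \dots, B \tag{1}
```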

Here, sign(x) is the sign function, which takes the value 1 when x ≥ 0 and -1 when x < 0, and w_k ∈ R^D and b_k ∈ R are parameters. The only unknown parameters of this hash function are w_k and b_k.

If the x_i (i = 1, 2, ..., N) are normalized to have mean 0, no generality is lost by setting b_k = 0. To normalize the mean of the x_i to 0, it suffices to subtract the mean of the x_i from each x_i, which is always possible for x_i ∈ R^D; hence b_k = 0 can be fixed. In what follows, the x_i are therefore assumed to be normalized to mean 0, and equation (1) is redefined as equation (2).
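The image of equation (2) is likewise not reproduced in this text. Under the stated assumptions (zero-mean features, b_k = 0, and a function φ_k that contains the parameter w_k), a consistent reconstruction would be the following, where the first line is the mean normalization described above.

```latex
x_i \leftarrow x_i - \frac{1}{N} \sum_{j=1}^{N} x_j , \qquad
h_k(x) = \mathrm{sign}\left( \phi_k(x) \right), \quad \phi_k(x) = w_k^{\top} x \tag{2}
```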

With this definition, the hash function is uniquely determined once the parameter w_k in the function φ_k that constitutes it is determined. The purpose of the hash function generation process is therefore to obtain w_k (k = 1, 2, ..., B).

The meaning of the hash function defined by equation (2) can be explained geometrically with reference to FIG. 2. In FIG. 2, the feature quantities x_i (i = 1, 2, ..., N) extracted from the pieces of content are distributed in the feature space R^D. The figure is drawn in two dimensions for convenience, but the space is actually D-dimensional. The function φ_k(x) that constitutes the hash function represents a straight line through the origin of this feature space (in practice, a (D-1)-dimensional hyperplane). Since h_k(x) is essentially a sign function, its value is 1 or 0 depending on which side of the line of φ_k(x) the feature point lies (as noted earlier, {-1, 1} and {0, 1} are interchangeable). That is, the hash function 41 defined by equation (2) divides the feature space into two regions, 1 and 0, by a straight line. Here, w_k corresponds to the slope of this line, and changing w_k changes the angle at which the space is divided.
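The geometric picture can be made concrete with a short sketch: the features are centered so that their mean is 0, and each bit records on which side of a hyperplane through the origin a feature lies. The function names and the weight vectors in `W` are placeholders chosen for illustration; in the embodiment the w_k are learned by the procedures described later.

```python
def center(features):
    """Subtract the mean vector so that the features average to zero."""
    n, d = len(features), len(features[0])
    mean = [sum(x[j] for x in features) / n for j in range(d)]
    centered = [[x[j] - mean[j] for j in range(d)] for x in features]
    return centered, mean

def hash_bits(x, W):
    """One bit per weight vector w_k: 1 if w_k . x >= 0, otherwise 0."""
    return [1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else 0 for w in W]

features = [[1, 2], [3, 4], [5, 0]]
centered, mean = center(features)

# Placeholder hyperplanes (B = 3); not learned values.
W = [[1, 0], [0, 1], [1, 1]]
codes = [hash_bits(x, W) for x in centered]
```

A query feature would be shifted by the same `mean` before hashing, so that it is compared against the same hyperplanes as the database features.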

To obtain w_k such that the hash functions have the two properties (A) and (B) described above, it suffices to:

(A') draw the line (that is, determine w_k) so that groups of similar content gather on one side of the line in the example of FIG. 2; and

(B') in addition, make pieces of related content, as indicated by their semantic vectors, take close values in the hash value space.

First, a method for satisfying property (A') is described. For media content, it is well known that, regardless of the type of feature quantity described above, the distributions of the feature quantities and semantic vectors of similar content form a smooth manifold structure. Put simply, a manifold structure is one that varies smoothly. To explain with reference to FIG. 3, the feature quantities are distributed roughly along two smoothly varying curves, curve 51 and curve 52, and points on the same curve are often similar to each other. In FIG. 3, feature quantities drawn as white circles (○) and black circles (●) belong to mutually similar content when they have the same color.

The same holds for the semantic vector space formed by the semantic vectors.

Based on this observation, the line should be drawn so that groups of content that are similar in the feature space or the semantic vector space gather on one side of it. From this viewpoint, among the lines shown in FIG. 4, a line such as line 61 is undesirable; what is needed is the hash function parameter w_k that defines a line such as line 62, which passes between the two groups.

Next, a method for satisfying property (B') is described with reference to FIG. 5. The example in FIG. 5 shows, in correspondence with each other, the feature space formed by the feature quantities and the semantic vector space formed by the semantic vectors that exist for each piece of content; feature quantities are drawn as circles and semantic vectors as triangles. Suppose that lines 71 and 72 satisfying property (A'), that is, lines separating the manifold structures, have been obtained in the feature space and the semantic vector space respectively, and that the correspondence between the feature quantity and the semantic vector of each piece of content is available as indicated by broken lines 73 to 76. (The figure shows the correspondence for only three pieces of content, but the same correspondence can be assumed to exist for the others.)

In this case, if the hash function parameter w_k is obtained so that, among the feature quantities and semantic vectors separated by lines 71 and 72, each corresponding pair of feature quantity and semantic vector has the same hash value, then hash values that are similar in terms of both the feature quantities and the semantic vectors can be generated, and property (B') is satisfied. In the example of FIG. 5, the white circles and white triangles (△) should share one hash value, and the black circles and black triangles (▲) another.

Based on the two methods above, this example of the embodiment obtains a parameter w_k that satisfies both properties (A') and (B'). w_k is obtained by the following two procedures. The first procedure captures the manifold structure in each of the feature space and the semantic vector space. The second procedure obtains w_k based on the manifold structures of the feature space and the semantic vector space and on the correspondence between pieces of content.

Each procedure is described in detail below. The first procedure is the same for feature quantities and semantic vectors, and the same processing is simply applied to each, so only the case of feature quantities is described here. For example, the known method described in Non-Patent Document 2 can be used. That method is described below.

Roughly speaking, a manifold is a smooth shape; in other words, it can be regarded as a Euclidean space when viewed locally. For example, like the curves shown in FIG. 3, it may be interpreted as something that is approximated by a collection of straight segments. This means that a manifold has a structure that is approximated linearly when viewed locally; in other words, any point on a manifold can be expressed through its relative geometric relationship to several neighboring points on the same manifold.

Non-Patent Document 2 finds the manifold by solving the following problem.

Here, the first term is the error when the feature quantity x_i is expressed as a linear combination of the set of feature quantities {x_j | j ∈ ε(x_i)} whose indices are contained in its neighborhood set ε(x_i) in Euclidean space, and s_ij is the corresponding combination weight. The second term is a sparsity term that requires the weight vector s_i = {s_i1, ..., s_i|ε(x_i)|} to be sparse, that is, it regularizes s_i so that only a limited number of its elements are nonzero. v_i is a vector of constants whose elements are smaller for neighbors closer to x_i. The element v_ij of the vector v_i is given, for example, by equation (4). The weight of a vector for itself, s_ij with i = j, is 0.

In short, solving this problem yields the combination weights s_i obtained when a feature quantity x_i is expressed as a linear combination of as few neighboring points as possible. These weights are nothing other than the neighboring points that represent the manifold together with their relative geometric relationship (the combination weights). The problem can be solved with a known sparse-problem solver; for example, open-source software such as SPAMS may be used.
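To give a feel for this kind of problem, the following is a minimal sketch of a weighted-L1 sparse reconstruction solved by iterative soft thresholding (ISTA). It assumes the objective 0.5 * ||x - Σ_j s_j x_j||^2 + lam * Σ_j v_j |s_j|, which matches the two terms described above but may omit constraints or scaling present in the original equation; `lam`, the function names, and the toy data are assumptions, and a dedicated solver such as SPAMS would be used in practice.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    return z - t if z > t else (z + t if z < -t else 0.0)

def sparse_weights(x, neighbors, v, lam=0.1, iters=500):
    """ISTA for 0.5*||x - sum_j s_j*neighbors[j]||^2 + lam*sum_j v[j]*|s_j|."""
    m, d = len(neighbors), len(x)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz constant.
    lip = sum(c * c for nb in neighbors for c in nb) or 1.0
    step = 1.0 / lip
    s = [0.0] * m
    for _ in range(iters):
        recon = [sum(s[j] * neighbors[j][k] for j in range(m)) for k in range(d)]
        resid = [recon[k] - x[k] for k in range(d)]
        grad = [sum(resid[k] * neighbors[j][k] for k in range(d)) for j in range(m)]
        # Gradient step on the error term, then shrinkage for the sparsity term.
        s = [soft_threshold(s[j] - step * grad[j], step * lam * v[j])
             for j in range(m)]
    return s

x = [2.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 2.0]  # placeholder penalties, larger for the farther neighbor
s = sparse_weights(x, neighbors, v)
```

In this toy case only the first neighbor receives a nonzero weight, which is the behavior the sparsity term is meant to produce.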

The neighborhood set ε(x_i) may be obtained by any method. The simplest is to compute, for each feature quantity x_i, the Euclidean distance to every other point x_j (j ≠ i) and take the t closest points as the neighborhood set. t may be any positive integer, for example t = 10.
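The simplest method can be written directly as a brute-force search. The function name and the toy data are illustrative; `math.dist` requires Python 3.8 or later.

```python
import math

def nearest_neighbors(features, i, t=10):
    """Indices of the t features closest to features[i], excluding i itself."""
    dists = [(math.dist(features[i], x), j)
             for j, x in enumerate(features) if j != i]
    return [j for _, j in sorted(dists)[:t]]

features = [[0, 0], [1, 0], [0, 3], [5, 5], [0.5, 0]]
neighborhood = nearest_neighbors(features, 0, t=2)
```

Each call scans all N features, which is the cost that the faster methods below try to avoid.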

With this method, however, the distance from one feature quantity to all other feature quantities must be computed, so obtaining the neighborhood set for an unknown feature quantity x_i takes O(N) time. It is therefore preferable to use a faster method, for example one based on clustering or hashing.

When clustering is used, all N feature quantities are clustered by, for example, the k-means method to obtain L clusters (L << N) and L representative feature quantities (cluster centers), one for each cluster. L may be any positive integer, for example L = 128. As a result, the cluster to which each feature quantity belongs and the representative feature quantity of that cluster are known. On this basis, the neighborhood set for an unknown feature quantity x_i is obtained as follows. First, the distances between x_i and the L representative feature quantities are computed and the nearest cluster is identified. Next, all feature quantities belonging to that cluster are taken as the neighborhood set ε(x_i). The computation time required for this processing is O(L), and since L << N, the neighborhood set is obtained faster than with the simple method.
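The lookup step can be sketched as follows. The cluster centers are assumed to have been computed beforehand (the k-means step itself is omitted), and the function names and toy data are hypothetical.

```python
import math

def build_clusters(features, centers):
    """Group feature indices by their nearest cluster center."""
    members = {c: [] for c in range(len(centers))}
    for idx, x in enumerate(features):
        c = min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
        members[c].append(idx)
    return members

def neighborhood_by_cluster(query, centers, members):
    """Compare the query with the L centers only, then return that cluster."""
    c = min(range(len(centers)), key=lambda k: math.dist(query, centers[k]))
    return members[c]

features = [[0, 0], [1, 1], [10, 10], [11, 9], [0, 1]]
centers = [[0.5, 0.5], [10.5, 9.5]]  # assumed output of a prior k-means run
members = build_clusters(features, centers)
near = neighborhood_by_cluster([9, 9], centers, members)
```

The query is compared with L centers rather than N features, at the price of a neighborhood whose size is fixed by the cluster rather than by t.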

When hashing is used, hash values for all N features are computed in advance, for example by the method of Non-Patent Document 1. Given this, the hash value of the unknown feature x_i is computed, and all features whose hash values are identical to it or close to it in Hamming distance (that is, features in the same bucket or in nearby buckets) are taken as the neighborhood set ε(x_i). The time required depends on the number of buckets examined, but since that number is generally smaller than N, this approach is also fast. Note that the hash function of Non-Patent Document 1 preserves cosine similarity in Euclidean space: the smaller the angle between two points, the higher the probability that their hash values collide. The hash function generated in the present embodiment, by contrast, preserves proximity on the manifold (proximity based on geodesic distance) rather than in Euclidean space, and can therefore produce hash values that capture the distribution of the features more accurately.
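The bucket lookup can be sketched as below, assuming the hash codes are already available as integers (how they are produced is outside this sketch). The names `build_buckets`, `bucket_neighborhood`, and the `radius` parameter are hypothetical.

```python
from collections import defaultdict

def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")

def build_buckets(codes):
    """Group feature indices by identical hash code."""
    buckets = defaultdict(list)
    for i, code in enumerate(codes):
        buckets[code].append(i)
    return buckets

def bucket_neighborhood(query_code, buckets, radius=1):
    """Indices of all features whose bucket lies within `radius` bits of the query code."""
    result = []
    for code, members in buckets.items():
        if hamming(query_code, code) <= radius:
            result.extend(members)
    return sorted(result)

# Toy codes (hypothetical): 4-bit hash values for four features.
codes = [0b1010, 0b1011, 0b0101, 0b1010]
buckets = build_buckets(codes)
near = bucket_neighborhood(0b1010, buckets, radius=1)
```

This version scans every occupied bucket for simplicity; a production index would instead enumerate the codes within the radius and look each one up directly.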

The above procedure is applied to both the features and the semantic vectors.

Next, the second procedure is described. Here w_k is determined by finding a hash space (an embedding) that has the same relative geometric relationships among neighbors as those expressed by the s_i (i = 1, 2, ..., N) obtained for the features in the first procedure.

In what follows, the feature of the i-th content is denoted x_i as before, and its semantic vector is denoted y_i.

Specifically, the following problem is solved. For convenience, define the matrices X = {x_1, ..., x_N} and Y = {y_1, ..., y_N}, formed by arranging the features x_i (i = 1, 2, ..., N) and the semantic vectors y_i (i = 1, 2, ..., N). In addition to the hash function parameters w_k (k = 1, 2, ..., B), introduce the parameters u_k (k = 1, 2, ..., B) of a pseudo hash function for the semantic vectors, and define the matrices W = {w_1, ..., w_B} and U = {u_1, ..., u_B} formed by arranging them.

The problem to be solved is Equation (5).

Here, S_X and S_Y are the matrices whose elements are the s_ij for the features and for the semantic vectors, respectively (the s_ij having been obtained by the first procedure).

J_X and J_Y are functions for preserving the manifold structure in the feature space and in the semantic vector space, respectively, and can be defined, for example, as in Equation (6).

The manifold structure in Equation (6) expresses, in the feature space (the space in which the features lie), the feature of a content as a linear combination of the hash values corresponding to the features of other contents that lie in its neighborhood.

Equation (6) requires that the manifold structure of the original feature space, that is, s_ij and the linear combinations it defines, be reconstructed unchanged after transformation by the function that constitutes the hash function,

given as Equation (7); in this respect it parallels Equation (3). That is, when Equation (6) is substituted into Equation (5), W can be determined so that the data, once converted to hash values, has the same manifold structure as in the original space. The above describes J_X; J_Y can be defined in exactly the same way. The expression for J_Y is given as Equation (8).

The manifold structure in Equation (8) expresses, in the semantic vector space (the space in which the semantic vectors lie), the semantic vector of a content as a linear combination of the hash values corresponding to the semantic vectors of other contents that lie in its neighborhood.

J_XY in Equation (5) is a function for preserving the relationship between the feature and the semantic vector of the same content, and can be defined, for example, as in Equation (9).

This expression requires that the feature and the semantic vector of the same content be mapped to the same value. For each pair of a feature and a semantic vector, it expresses the distance between the values obtained by transforming each with the function of Equation (7) that constitutes the hash function. Substituting it into Equation (5) therefore allows W (and U) to be determined so that this distance is as small as possible.
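The numbered equations referenced in this passage do not appear in the text. As a reading aid only, the following is one form consistent with the verbal descriptions of Equations (5), (6), (8), and (9). It is an assumption, not the patent's own formulas: the weights λ and μ, the omission of any bias term, the superscripts on s_ij, and the use of squared Euclidean norms are all choices made here for illustration.

```latex
% Assumed reconstruction for illustration only; not taken from the source.
\begin{align*}
\min_{W,\,U}\quad & J_X(W) + \lambda\, J_Y(U) + \mu\, J_{XY}(W, U) \\
J_X(W) &= \sum_{i=1}^{N} \Bigl\| W^\top x_i - \sum_{j \in \varepsilon(x_i)} s^{X}_{ij}\, W^\top x_j \Bigr\|^2 \\
J_Y(U) &= \sum_{i=1}^{N} \Bigl\| U^\top y_i - \sum_{j \in \varepsilon(y_i)} s^{Y}_{ij}\, U^\top y_j \Bigr\|^2 \\
J_{XY}(W, U) &= \sum_{i=1}^{N} \bigl\| W^\top x_i - U^\top y_i \bigr\|^2
\end{align*}
```

Under this reading, J_X and J_Y are small when each projected point is reconstructed by the same neighbor weights s_ij as in the original space, and J_XY is small when the two projections of the same content coincide.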

Accordingly, the hash function generation unit 32 generates, in accordance with Equations (5), (6), (8), and (9), a hash function for obtaining hash values corresponding to features and a hash function for obtaining hash values corresponding to semantic vectors such that, for each of the N contents: (i) the distance between the hash value corresponding to the feature extracted from the content and the hash value corresponding to the content's semantic vector is small; (ii) the distance between the hash value for the content's feature obtained from the manifold structure and the hash value corresponding to the feature extracted from the content is small; and (iii) the distance between the hash value for the content's semantic vector obtained from the manifold structure and the hash value corresponding to the content's semantic vector is small.

Substituting Equations (6), (8), and (9), defined as above, into Equation (5) and applying algebraic manipulation yields the problem of Equation (10).

Here, the quantities appearing in Equation (10) are given by Equation (11).

Equation (10) is convex in W (and U), that is, in P, so differentiating with respect to W and taking the extremum reduces it to the generalized eigenvalue problem of Equation (12).

In Equation (12), η denotes an eigenvalue. The solution of such a generalized eigenvalue problem can be found by known methods such as iterative methods or the power method.
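As an illustration of the power method mentioned above, the sketch below finds the dominant eigenpair of a small matrix A. It assumes a standard problem A v = η v; a generalized problem of the generic form C p = η D p (symbols chosen here, since Equation (12) is not shown) reduces to it with A = D⁻¹C when D is invertible. Whether the largest or the smallest eigenvalues are the ones needed depends on the actual form of Equation (12), which this sketch does not assume. The function names are hypothetical.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def power_iteration(A, iters=200):
    """Approximate the dominant eigenvalue and a unit eigenvector of A.

    Assumes the dominant eigenvalue is unique in magnitude and that
    the all-ones starting vector is not orthogonal to its eigenvector.
    """
    v = [1.0] * len(A)
    for _ in range(iters):
        w = mat_vec(A, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Av = mat_vec(A, v)
    eigenvalue = sum(a * b for a, b in zip(Av, v))  # Rayleigh quotient, v has unit norm
    return eigenvalue, v

# Toy symmetric matrix (hypothetical) with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eta, v = power_iteration(A)
```

Further eigenpairs, as needed for B hash bits, would require deflation or a library eigensolver; that step is omitted here.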

The W (and U) obtained in this way optimally preserves the manifold structure of the original space and maps the feature and the semantic vector of the same content to hash values that are as nearly identical as possible. A W that optimally satisfies the two target properties (A') and (B') is thus obtained.

To generate a new hash value, it suffices to evaluate Equation (7) and check its sign. The only memory this computation needs is that for storing w_k and x_i. If the features are stored as floating-point values, this is about 800 B when the dimension D is 100, and at most about 800 KB even when D is around 100,000, an amount that any current general-purpose computer can hold with ease. This method therefore generates hash values quickly and with little memory while retaining the high accuracy that comes from capturing the manifold structure.
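The sign test and the memory figures can be checked with a short sketch. It assumes that the function of Equation (7) is a linear projection w_k·x plus an offset b_k (the equation itself is not shown), that a value of exactly zero maps to bit 1, and that each floating-point value occupies 8 bytes, which is what the 800 B figure for D = 100 implies. The names `hash_bits` and `vector_bytes` are hypothetical.

```python
def hash_bits(W, b, x):
    """One bit per projection: 1 if w_k . x + b_k >= 0, else 0 (assumed form)."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w_k, x)) + b_k >= 0 else 0
        for w_k, b_k in zip(W, b)
    ]

def vector_bytes(dim, bytes_per_value=8):
    """Memory needed for one dim-dimensional vector of 8-byte floats."""
    return dim * bytes_per_value

# Toy parameters (hypothetical): B = 3 projections of a 2-D feature.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
b = [0.0, -0.5, 0.0]
code = hash_bits(W, b, [0.3, 0.2])

small = vector_bytes(100)       # 800 bytes
large = vector_bytes(100_000)   # 800,000 bytes, i.e. about 800 KB
```

Generating one bit costs a single D-dimensional inner product, so a B-bit code costs B inner products per content.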

The hash function generation unit 32 generates, in accordance with Equations (10), (11), and (12), a hash function for obtaining hash values corresponding to features and a hash function for obtaining hash values corresponding to semantic vectors, based on the features of the N contents extracted by the feature extraction unit 30 and the semantic vectors given for the N contents received by the input unit 2.

The hash function storage unit 34 stores the hash function, generated by the hash function generation unit 32, for obtaining hash values corresponding to features. Specifically, the W for every media type, generated by the processing detailed above, is stored in the hash function storage unit 34.

For each of the N contents stored in the content database 5, the hash value generation unit 36 generates a hash value corresponding to the content's feature, based on the feature extracted by the feature extraction unit 30 and the hash function stored in the hash function storage unit 34.
For the content received by the input unit 2 as a search query, the hash value generation unit 36 likewise generates a hash value corresponding to the content's feature, based on the feature extracted by the feature extraction unit 30 and the hash function stored in the hash function storage unit 34.

The output unit 4 outputs the hash values generated by the hash value generation unit 36 to the content database 5.

The content database 5 stores each hash value output by the output unit 4 together with the content corresponding to it.

<Operation of the information processing apparatus>
Next, the operation of the information processing apparatus 1 according to the present embodiment is described. The information processing apparatus 1 executes a hash function generation process, which generates the hash functions, and a hash value generation process, which hashes features. These two processes are described below.

<Hash function generation processing routine>
First, when the information processing apparatus 1 acquires the N contents stored in the content database 5 and the N semantic vectors stored in the semantic vector database 6, it executes the hash function generation processing routine shown in FIG. 6. This routine is executed at least once before any content data is actually hashed.

First, in step S100, the input unit 2 acquires the plurality of (N) contents stored in the content database 5 and the N semantic vectors stored in the semantic vector database 6.

In step S102, the feature extraction unit 30 extracts a feature from each of the N contents acquired in step S100.

In step S104, the hash function generation unit 32 generates, in accordance with Equations (10) to (12), a hash function for obtaining hash values corresponding to features and a hash function for obtaining hash values corresponding to semantic vectors, based on the features of the N contents extracted in step S102 and the semantic vectors given for the N contents acquired in step S100. The hash function generation unit 32 then stores both hash functions in the hash function storage unit 34, and the hash function generation processing routine ends.

Through the above processing, hash functions can be generated from the N contents stored in the content database 5 and the N semantic vectors stored in the semantic vector database 6.

<Hash value generation processing routine>
Next, when content is input to the information processing apparatus 1 as a search query, the apparatus executes the hash value generation processing routine shown in FIG. 7. This routine hashes the feature of a content using the hash function, stored in the hash function storage unit 34, for obtaining hash values corresponding to features.

First, in step S200, the input unit 2 receives the content input as a search query.

In step S202, the feature extraction unit 30 extracts a feature from the content received in step S200.

In step S204, the hash value generation unit 36 generates a hash value for the content received in step S200, based on the feature extracted in step S202 and the hash function stored in the hash function storage unit 34 in step S104. The hash value generation unit 36 also generates a hash value for each of the N contents stored in the content database 5, based on the feature of that content extracted in step S102 and the hash function stored in the hash function storage unit 34 in step S104.

In step S206, the output unit 4 outputs to the content database 5 the combination of the hash value generated in step S204 and the content received in step S200, together with the combinations of the hash values generated in step S204 and the N contents. These are stored in the content database 5, and the hash value generation processing routine ends.

Through the above processing, a hash value can be obtained for the input content regardless of its media type. In addition, contents whose hash values are similar to that of the search-query content can be found among the N contents.
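Finding contents with similar hash values can be sketched as a ranking by Hamming distance. This is an illustrative assumption about how "similar" is measured and how results are ordered; the function `search`, the `top_k` parameter, and the content identifiers are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")

def search(query_code, database, top_k=3):
    """Rank (content_id, code) pairs by Hamming distance to the query; return the top_k ids."""
    ranked = sorted(database, key=lambda item: hamming(query_code, item[1]))
    return [content_id for content_id, _ in ranked[:top_k]]

# Toy database (hypothetical): contents of different media types share one code space.
database = [
    ("img_a", 0b110010),
    ("doc_b", 0b110011),
    ("vid_c", 0b001101),
    ("snd_d", 0b110110),
]
hits = search(0b110010, database, top_k=2)
```

Because every media type is mapped into the same binary code space, the same comparison applies regardless of what kind of content the query is.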

As described above, the information processing apparatus according to the embodiment of the present invention generates, based on the feature of each of a plurality of contents and a semantic vector representing the meaning of each content, a hash function for obtaining hash values corresponding to features and a hash function for obtaining hash values corresponding to semantic vectors, such that, for each content, the distance between the hash value corresponding to the feature extracted from the content and the hash value corresponding to the content's semantic vector is small. Hash functions that yield hash values reflecting the meaning of the content can thereby be generated.

Furthermore, the information processing apparatus according to the embodiment of the present invention generates a hash value corresponding to the feature of a content based on the feature extracted from the content and the generated hash function, and can thereby produce hash values that reflect the meaning of the content. By generating in the same way a hash value corresponding to the feature of a search-query content, content similar to the query can be found accurately using hash values that reflect the meaning of the content.

In addition, as described above, the present embodiment generates a parametric hash function that captures the manifold structure of the feature space and preserves the proximity of the semantic vectors, so that semantically similar contents can be found quickly and with little memory, yet with high accuracy.

Similar contents can also be found within a large collection of contents.

A further effect is that contents can be found quickly, with little memory, and with high accuracy while taking diverse semantic viewpoints into account.

<Second Embodiment>
<System configuration>
Next, a second embodiment of the present invention is described. Parts configured in the same way as in the first embodiment are given the same reference numerals, and their description is omitted.

The second embodiment differs from the first in the type of hash function used.

The second procedure described in the first embodiment explained how to obtain the parameters w_k (k = 1, 2, ..., B) for a hash function of the form of Equation (2). The hash functions that the embodiments of the present invention can handle are not limited to this form, however; the parameters of a hash function of a different form can be determined in the same way.

For example, the following hash function can also be handled.

Here, α_{k,t} are parameters and κ(x_t, x) is a kernel function. A kernel function is any function of the form

that, for N features {x_1, ..., x_N},

and for arbitrary real numbers α_i and α_j, satisfies

Countless such functions exist; examples include

where β and γ are positive real-valued parameters and p is an integer parameter, each of which may be chosen as appropriate.
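The example kernels themselves are missing from the text. The sketch below shows three standard kernels that fit the parameters named (β, γ, p); the specific forms and the assignment of β to the Gaussian kernel and of γ and p to the polynomial kernel are assumptions, not the patent's definitions. The helper `quadratic_form` evaluates the double sum over α_i α_j κ(x_i, x_j), which is non-negative for any positive semidefinite kernel; reading the condition in the text as this standard property is likewise an assumption.

```python
import math

def linear_kernel(x, y):
    """Plain inner product."""
    return sum(a * b for a, b in zip(x, y))

def gaussian_kernel(x, y, beta=1.0):
    """exp(-beta * ||x - y||^2), with beta > 0 (assumed form)."""
    return math.exp(-beta * sum((a - b) ** 2 for a, b in zip(x, y)))

def polynomial_kernel(x, y, gamma=1.0, p=2):
    """(x . y + gamma)^p, with gamma > 0 and integer p (assumed form)."""
    return (linear_kernel(x, y) + gamma) ** p

def quadratic_form(kernel, points, alphas):
    """sum_i sum_j alpha_i * alpha_j * kernel(x_i, x_j)."""
    return sum(
        a_i * a_j * kernel(x_i, x_j)
        for a_i, x_i in zip(alphas, points)
        for a_j, x_j in zip(alphas, points)
    )

# Toy check (hypothetical data): the quadratic form should not be negative.
points = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
alphas = [1.0, -2.0, 0.5]
q = quadratic_form(gaussian_kernel, points, alphas)
```

A single numerical check like this does not prove a kernel is valid; it only illustrates the condition for one choice of points and coefficients.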

In Equation (13), b_k is

that is, a constant determined by a mean value, so Equation (13) can be rewritten in inner-product form as

where

Here, T is a constant that determines the hash function. The kernel function constituting the hash function, specifically the kernel vector map κ(x), is determined by T features, and T may be set to any value in the range T < N. For example, with T = 300, the T features may be chosen at random from all features {x_1, ..., x_N}, or they may be representative vectors selected by a clustering method such as k-means. The present embodiment is described for the case where the T features have been selected in advance.
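One way to read the inner-product form is that each content is first mapped to a T-dimensional vector of kernel values against the T selected features, with the training mean subtracted, and each hash bit is then the sign of an inner product with α_k. The sketch below follows that reading. It is an assumption, since the equations are not shown; the Gaussian kernel, the names `make_kernel_map` and `kernel_hash_bit`, and the toy data are all hypothetical.

```python
import math

def gaussian_kernel(x, y, beta=1.0):
    return math.exp(-beta * sum((a - b) ** 2 for a, b in zip(x, y)))

def make_kernel_map(anchors, training, kernel=gaussian_kernel):
    """Return kappa(x): kernel values against the T anchors, minus their mean over the training set."""
    n = len(training)
    means = [sum(kernel(a, x) for x in training) / n for a in anchors]

    def kappa(x):
        return [kernel(a, x) - m for a, m in zip(anchors, means)]

    return kappa

def kernel_hash_bit(alpha_k, kappa_x):
    """Bit k = 1 if alpha_k . kappa(x) >= 0, else 0 (assumed sign convention)."""
    return 1 if sum(a * k for a, k in zip(alpha_k, kappa_x)) >= 0 else 0

# Toy data (hypothetical): N = 4 training features, T = 2 anchors picked arbitrarily.
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (4.0, 4.0)]
anchors = training[:2]
kappa = make_kernel_map(anchors, training)

# After centering, each coordinate of kappa sums to roughly zero over the training set.
centered_sums = [sum(kappa(x)[t] for x in training) for t in range(len(anchors))]
bit = kernel_hash_bit([1.0, 0.0], kappa((0.0, 0.0)))
```

With the mean folded into κ(x), the kernelized hash function has the same inner-product shape as the linear one, so the same solver can be reused with κ(x) in place of x.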

A hash function defined in this way can handle nonlinear mappings defined through the kernel function. Because it can represent nonlinear functions, that is, curves as well as straight lines, it has the advantage of being more flexible than the hash function based on Equation (6).

The hash function generation unit 32 generates the kernel function based on the features of T contents selected in advance.

The following describes how to determine the parameters α_k of a hash function of the form of Equation (13). For convenience, define the matrices Κ_X = {κ_X(x_1), ..., κ_X(x_N)} and Κ_Y = {κ_Y(y_1), ..., κ_Y(y_N)}, formed by arranging κ_X(x_i) (i = 1, 2, ..., N) for the features and κ_Y(y_i) (i = 1, 2, ..., N) for the semantic vectors. Also define the matrices Α = {α_1, ..., α_B} and Θ = {θ_1, ..., θ_B}, formed by arranging the hash function parameters α_k (k = 1, 2, ..., B) for the features and the pseudo hash function parameters θ_k (k = 1, 2, ..., B) for the semantic vectors.

Specifically, the following problem is solved; it corresponds to Equation (5) for the hash function defined by Equation (2).

J_X, J_Y, and J_XY can be defined, for the same reasons as Equations (6), (8), and (9), for example as follows.

Substituting Equations (20), (21), and (22) into Equation (19) and applying algebraic manipulation yields the problem of Equation (23).

Here, the quantities appearing in Equation (23) are given by Equation (24).

This problem is equivalent to that of Equation (10) and can therefore be solved by exactly the same procedure.

The hash function generation unit 32 generates, in accordance with Equations (23) and (24), a hash function for obtaining hash values corresponding to features and a hash function for obtaining hash values corresponding to semantic vectors, based on the features of the N contents extracted by the feature extraction unit 30 and the semantic vectors given for the N contents received by the input unit 2, such that, for each of the N contents: (i) the distance between the hash value corresponding to the feature extracted from the content and the hash value corresponding to the content's semantic vector is small; (ii) the distance between the hash value for the content's feature obtained from the manifold structure and the hash value corresponding to the feature extracted from the content is small; and (iii) the distance between the hash value for the content's semantic vector obtained from the manifold structure and the hash value corresponding to the content's semantic vector is small.
The hash function for obtaining hash values corresponding to features, generated by the hash function generation unit 32, and the kernel function are stored in the hash function storage unit 34. Specifically, the parameters Α and the kernel function κ(x) are stored in the hash function storage unit 34.

For each of the N contents stored in the content database 5, the hash value generation unit 36 generates a hash value corresponding to the content's feature, based on the feature extracted by the feature extraction unit 30 and on the hash function and kernel function stored in the hash function storage unit 34.
For the content received by the input unit 2 as a search query, the hash value generation unit 36 likewise generates a hash value corresponding to the content's feature, based on the feature extracted by the feature extraction unit 30 and on the hash function and kernel function stored in the hash function storage unit 34.

The other configurations and operations of the information processing apparatus according to the second embodiment are the same as in the first embodiment, and their description is omitted.

<Third Embodiment>
<System configuration>
Next, a third embodiment of the present invention is described with reference to FIG. 8. Parts configured in the same way as in the first embodiment are given the same reference numerals, and their description is omitted.

The third embodiment differs from the first and second embodiments in that the information processing system is made up of a server device and a client device. It is described using the example of applying the present invention to an information processing system that performs similar-content search.

As shown in FIG. 8, the information processing system 200 according to the third embodiment of the present invention includes a server device 7 and a client device 13.

The server device 7 shown in FIG. 8 is a computer comprising a CPU, a RAM, and a ROM storing programs for executing the processing routines, and is functionally configured as follows. As shown in FIG. 8, the server device 7 includes an input unit 8, a calculation unit 9, and an output unit 10. The calculation unit 9 includes a feature extraction unit 90, a hash function generation unit 92, a hash function storage unit 94, and a hash value generation unit 96. As with the content database 5 of the first embodiment, a plurality of contents are registered in the content database 11. As with the semantic vector database 6 of the first embodiment, the semantic vector database 12 stores the semantic vector corresponding to each content registered in the content database 11.

The client device 13 shown in FIG. 8 is likewise a computer comprising a CPU, a RAM, and a ROM storing programs for executing the processing routines, and is functionally configured as follows. As shown in FIG. 8, the client device 13 includes an input unit 14, a calculation unit 15, and an output unit 16. The calculation unit 15 includes a feature extraction unit 150, a hash function storage unit 154, and a hash value generation unit 156.

Here, the components common to the server device 7 and the client device 13 (input unit, feature extraction unit, hash function storage unit, and hash value generation unit) are configured to have the same respective functions, and components bearing the same names as those shown in FIG. 1 may have the same functions as in FIG. 1. Furthermore, the contents of the hash value generation units are assumed to be synchronized as appropriate by some communication means.

The processing operations in the device configuration shown in FIG. 8 are as follows. First, the server device 7 performs the same processing as described above to generate hash functions as appropriate, stores them in the hash function storage unit 94, and synchronizes them with the hash function storage unit 154 of the client device 13. The server device 7 also applies the same processing as described above to the contents in the content database 11 to generate hash values, and stores those hash values in the content database 11.

Meanwhile, when the client device 13 receives a search request from a user through the input unit 14, that is, the input of content serving as a search query, it generates a hash value for that content and outputs the hash value from the output unit 16 to the input unit 8 of the server device 7.

On receiving a hash value from the client device 13, the server device 7 uses that hash value to search the content database 11, finds similar content based on the hash value, and outputs the result to the client device 13.

Finally, the client device 13 outputs the search result received from the server device 7 to the user.

With this configuration, the hash function generation processing can be performed on the server device 7, while the client device 13 performs only the hash value generation processing.
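The division of work between the server device 7 and the client device 13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class names `Server` and `Client`, the function `toy_hash`, and the sample feature values are all hypothetical, and `toy_hash` is a simple sign-threshold stand-in for the learned hash function, which is not reproduced here.

```python
from collections import defaultdict


def toy_hash(features):
    """Hypothetical stand-in hash: one bit per feature, '1' when the value is positive.

    The embodiment learns its hash function on the server; this placeholder
    only keeps the example self-contained.
    """
    return "".join("1" if v > 0 else "0" for v in features)


class Server:
    """Holds the hash function and an index from hash value to content IDs."""

    def __init__(self, contents):
        # In the embodiment the hash function is generated here, on the server.
        self.hash_function = toy_hash
        self.index = defaultdict(list)
        for content_id, features in contents.items():
            self.index[self.hash_function(features)].append(content_id)

    def search(self, code):
        # Only the compact hash value arrives from the client.
        return list(self.index.get(code, []))


class Client:
    """Generates hash values only, using a copy of the server's hash function."""

    def __init__(self, server):
        self.server = server
        self.hash_function = server.hash_function  # "synchronized" copy

    def query(self, features):
        code = self.hash_function(features)  # hash value generation on the client
        return self.server.search(code)      # transmit the hash value, not the content


server = Server({"a": [0.5, -1.0], "b": [-0.2, 0.3], "c": [2.0, -0.1]})
client = Client(server)
results = client.query([1.0, -3.0])
```

In this sketch the client never sends the query content or its feature amount, only the bit string, which mirrors the transmission saving described for this configuration.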

The other configurations and operations of the information processing system 200 according to the third embodiment are the same as those of the first embodiment, and their description is therefore omitted.

The advantages of this configuration are as follows. In general, a client device (a PC, a portable terminal, or the like) has less computing power than a server device, and may therefore be unsuited to relatively compute-intensive processing such as hash function generation. With this configuration, the hash function generation processing can be performed as needed on the server device, which has high computing power, while the client device performs only the hash value generation processing, which requires little computation. In addition, transmitting a large volume of data over a network normally takes a long time; with this configuration, only the hash values of the feature amounts, which are small in data size, need to be transmitted, so search responsiveness can be improved.

As described above, using the hash function and hash values described above makes it possible to find, at high speed and with low memory use yet with high accuracy, content that is similar with respect to diverse semantic viewpoints, including their strength. Because memory use is low, the technique can be used, for example, on mobile terminals with limited memory (smartphones and tablets). Because it is fast, it can also serve applications that require real-time response. As a concrete usage scenario exploiting these effects, a user walking around town can photograph a place or product of interest with a mobile terminal and search for similar places or products.

[Example]
Next, an example is described in which similar content is searched for at high speed and with low memory use by means of the hash functions generated by the processing described in the first embodiment. Once the hash function generation processing described above has been completed, the hash function storage unit 34 holds B sets of hash functions for each media type. Using these, any content represented by a feature amount can be expressed as a hash value of B bits or less in accordance with equation (2) above. For example, suppose that N image contents are stored in the content database 5, that the semantic vectors Y = {y_1, ..., y_N} corresponding to the feature amounts X = {x_1, ..., x_N} are stored in the semantic vector database 6, and that all feature amounts have been converted into hash values Z = {z_1, ..., z_N} based on equation (2) above. The goal is then to find, from within X, content similar to a feature amount x_q that is not included in X.

First, the feature amount x_q is converted into a hash value z_q based on equation (2) above. The simplest approach is to use the hash table shown in FIG. 9. A hash table such as that shown in FIG. 9 is first built from the hash values Z registered in the content database 5. This table stores each hash value in association with the feature amounts (content identifiers) that were converted to that hash value, so that when a hash value is given, the contents taking the same hash value can be found immediately. A characteristic of the hash values generated by the embodiment of the present invention is that related or similar items can be converted to the same hash value regardless of media type. For example, when the hash value "0000" is specified, the images associated with it (image 1, image 3, ...) can be found at once, regardless of media type. In the same way, this hash table makes it possible to find the content corresponding to the hash value z_q immediately.

This method has the advantage that similar content can be found quickly, in roughly constant time regardless of the number N of image contents registered in the content database 5, and with low memory use, because the original feature amounts need not be held in memory.
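The hash table lookup described above can be sketched as follows. This is a minimal illustration, not the layout of FIG. 9: the helper names `build_hash_table` and `lookup` are hypothetical, and the codes assigned to image2 and image4 are made up for the example (only the association of "0000" with image 1 and image 3 comes from the text).

```python
from collections import defaultdict


def build_hash_table(hash_values, content_ids):
    """Map each hash value to the list of content IDs that were converted to it."""
    table = defaultdict(list)
    for z, content_id in zip(hash_values, content_ids):
        table[z].append(content_id)
    return table


def lookup(table, z_q):
    """Return the contents sharing the query hash value; empty list if none."""
    # dict.get avoids creating an empty bucket for unseen hash values.
    return table.get(z_q, [])


# Hash values Z for four registered images (4-bit codes written as strings).
codes = ["0000", "0101", "0000", "1110"]
ids = ["image1", "image2", "image3", "image4"]

table = build_hash_table(codes, ids)
matches = lookup(table, "0000")
```

The lookup cost depends on the size of the matching bucket rather than on N, and the table holds only hash values and identifiers, not the original feature amounts.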

As another method, distance calculation based on the Hamming distance can be used. That is, the distances between the hash value z_q and the N hash values contained in Z are calculated, and those at a small distance are taken as similar content. Because the hash values are binary, the distance can be computed as, for example, the Hamming distance. The Hamming distance can be computed with only an XOR (exclusive OR) and a popcnt operation (that is, an operation that counts the bits set to 1 in a binary string), and a hash value can usually be represented with a small number of binary values, so the computation is far faster than calculating distances on the original feature amounts.
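The XOR-plus-popcount computation described above can be sketched as follows. This is a minimal illustration assuming each hash value is held as a Python integer; the function names `hamming_distance` and `nearest_by_hamming` and the sample codes are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as integers."""
    # XOR leaves a 1 in every bit position where the codes differ;
    # counting those 1 bits is the popcount step.
    return bin(a ^ b).count("1")


def nearest_by_hamming(z_q: int, codes: list, k: int = 1) -> list:
    """Return the indices of the k codes closest to z_q in Hamming distance."""
    ranked = sorted(range(len(codes)), key=lambda i: hamming_distance(z_q, codes[i]))
    return ranked[:k]


# Registered hash values Z and a query hash value z_q (4-bit codes).
Z = [0b0000, 0b0101, 0b0001, 0b1110]
z_q = 0b0001

top2 = nearest_by_hamming(z_q, Z, k=2)
```

This linear scan still visits all N codes, but each comparison is a couple of integer operations on a short code rather than a floating-point distance over the full feature amount.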

The present invention is not limited to the embodiments described above, and various modifications and applications are possible without departing from the gist of the invention. It goes without saying that any use and configuration may be adopted as long as the main features of the present embodiments are satisfied.

The hash function generation unit 32 and the hash value generation unit 36 in the first embodiment may also be configured as separate devices. In that case, a hash function generation device is configured to include the hash function generation unit 32, and a hash value generation device is configured to include the hash value generation unit 36.

The information processing apparatus and the information processing system of the present embodiments have been described as including the hash function storage unit 34. However, the hash function storage unit 34 may instead be provided, for example, in a device external to the information processing apparatus and the information processing system, which then refer to the hash function storage unit 34 by communicating with the external device through communication means.

Although this specification has described embodiments in which the program is installed in advance, the program may also be provided stored on a computer-readable recording medium.

For example, the hash function generation unit and the hash value generation unit in the embodiments described above may be realized by a computer. In that case, a program for realizing these functions may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on that medium. The term "computer system" here includes an OS and hardware such as peripheral devices. The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" may further include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, may realize those functions in combination with a program already recorded in the computer system, or may be realized using hardware such as a PLD (Programmable Logic Device) or an FPGA (Field Programmable Gate Array).

The information processing apparatus and the information processing system described above each contain a computer system internally. Where a WWW system is used, the term "computer system" also includes the environment for providing (or displaying) web pages.

Embodiments of the present invention have been described above with reference to the drawings, but these embodiments are merely illustrations of the present invention, and the present invention is clearly not limited to them. Accordingly, components may be added, omitted, replaced, or otherwise changed without departing from the technical idea and scope of the present invention.

DESCRIPTION OF SYMBOLS
1 Information processing apparatus
2, 8, 14 Input unit
3, 9, 15 Calculation unit
4, 10, 16 Output unit
5, 11 Content database
6, 12 Semantic vector database
7 Server device
13 Client device
30, 90, 150 Feature extraction unit
32, 92 Hash function generation unit
34, 94, 154 Hash function storage unit
36, 96, 156 Hash value generation unit
200 Information processing system

Claims

A hash function generation method in a hash function generation device including feature extraction means and hash function generation means, the method including:
a step in which the feature extraction means extracts, for each of a plurality of contents, a feature amount from the content; and
a step in which the hash function generation means generates, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and a semantic vector that is given to each of the plurality of contents and represents the meaning of the content, a hash function for obtaining a hash value corresponding to the feature amount and a hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

The hash function generation method according to claim 1, wherein, in the step of generating the hash function, the hash function generation means generates, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and the semantic vector that is given to each of the plurality of contents and represents the meaning of the content, the hash function for obtaining a hash value corresponding to the feature amount and the hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between a hash value corresponding to the feature amount of the content, obtained based on a manifold structure expressed by a linear combination of hash values corresponding to feature amounts located in the neighborhood of the feature amount of the content in the feature amount space in which the feature amounts exist, and the hash value corresponding to the feature amount extracted from the content becomes small, and such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

The hash function generation method according to claim 1 or claim 2, wherein, in the step of generating the hash function, the hash function generation means generates, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and the semantic vector that is given to each of the plurality of contents and represents the meaning of the content, the hash function for obtaining a hash value corresponding to the feature amount and the hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between a hash value corresponding to the semantic vector of the content, obtained based on a manifold structure expressed by a linear combination of hash values corresponding to semantic vectors located in the neighborhood of the semantic vector of the content in the semantic vector space in which the semantic vectors exist, and the hash value corresponding to the semantic vector of the content becomes small, and such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

The hash function generation method according to any one of claims 1 to 3, wherein the hash value is a binary value.

A hash value generation method in a hash value generation device including feature extraction means and hash value generation means, the method including:
a step in which the feature extraction means extracts a feature amount from content; and
a step in which the hash value generation means generates a hash value corresponding to the feature amount of the content, based on the feature amount extracted by the feature extraction means and on the hash function for obtaining a hash value corresponding to the feature amount, the hash function having been generated by the hash function generation method according to any one of claims 1 to 3.

A hash function generation device including:
feature extraction means for extracting, for each of a plurality of contents, a feature amount from the content; and
hash function generation means for generating, based on the feature amount of each of the plurality of contents extracted by the feature extraction means and a semantic vector that is given to each of the plurality of contents and represents the meaning of the content, a hash function for obtaining a hash value corresponding to the feature amount and a hash function for obtaining a hash value corresponding to the semantic vector, such that, for each of the plurality of contents, the distance between the hash value corresponding to the feature amount extracted from the content and the hash value corresponding to the semantic vector of the content becomes small.

A hash value generation device including:
feature extraction means for extracting a feature amount from content; and
hash value generation means for generating a hash value corresponding to the feature amount of the content, based on the feature amount extracted by the feature extraction means and on the hash function for obtaining a hash value corresponding to the feature amount, the hash function having been generated by the hash function generation device according to claim 6.

A program for causing a computer to execute each step of the hash function generation method according to any one of claims 1 to 4 or of the hash value generation method according to claim 5.