JP2018156333A - Generation device, generation method, and generation program

Info

Publication number: JP2018156333A
Application number: JP2017051953A
Authority: JP
Inventors: Riku Togashi (富樫 陸)
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-03-16
Filing date: 2017-03-16
Publication date: 2018-10-04
Anticipated expiration: 2037-03-16
Also published as: JP6756648B2

Abstract

PROBLEM TO BE SOLVED: To reflect the similarity of information in distances in a distributed representation space. SOLUTION: A generation device of the present disclosure comprises: a computation unit configured to compute a first similarity, representing the similarity between first information and second information, and a second similarity, representing the similarity between the first information and third information; and a generation unit configured to generate each distributed representation, based on the relationship between the first similarity and the second similarity, such that one of the distributed representations of the second and third information is similar to the distributed representation of the first information while the other is not. SELECTED DRAWING: Figure 1

Description

The present invention relates to a generation device, a generation method, and a generation program.

Conventionally, techniques for classifying information such as images and text according to their feature amounts are known. As one example, a method called triplet loss has been proposed. In triplet loss, the degree of coincidence between tags indicating the content of each piece of information is calculated; information whose tag coincidence exceeds a first threshold is treated as similar information, and information whose coincidence falls below a second threshold is treated as dissimilar information. The distributed representation of each piece of information is then learned so that, for example, the difference between the distributed representation of first information and that of second information similar to the first information becomes small, while the difference between the distributed representation of the first information and that of third information not similar to the first information becomes large. As a result, in the distributed representation space, the first information and the second information are placed close together, and the first information and the third information are placed apart.

JP 2010-250849 A

"Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction", Edgar Simo-Serra and Hiroshi Ishikawa, Department of Computer Science and Engineering, Waseda University, Tokyo, Japan

However, the prior art described above may fail to reflect the similarity of information in distances in the distributed representation space.

For example, in the prior art, all information whose degree of tag coincidence with the first information exceeds the first threshold is treated as similar to the first information, regardless of how high or low that coincidence is. Differences in tag coincidence among similar pieces of information are therefore not reflected in the similarity of their distributed representations. As a result, a search using the distributed representation space can retrieve information similar to the first information, but cannot guarantee the accuracy of a ranking ordered by similarity to the first information.

The present application has been made in view of the above, and its object is to reflect the similarity of information in distances in the distributed representation space.

The generation device according to the present application includes: a calculation unit that calculates a first similarity, which is the similarity between first information and second information, and a second similarity, which is the similarity between the first information and third information; and a generation unit that generates each distributed representation, based on the relationship between the first similarity and the second similarity, such that one of the distributed representations of the second information and the third information is similar to the distributed representation of the first information and the other is not.

According to one aspect of the embodiment, the similarity of information can be reflected in distances in the distributed representation space.

FIG. 1 is a diagram illustrating an example of processing executed by the information providing apparatus according to the embodiment.
FIG. 2 is a diagram illustrating a configuration example of the information providing apparatus according to the embodiment.
FIG. 3 is a diagram illustrating an example of information registered in the image database according to the embodiment.
FIG. 4 is a diagram illustrating an example of information registered in the distributed representation database according to the embodiment.
FIG. 5 is a diagram illustrating an example of the similarity calculated by the information providing apparatus according to the embodiment.
FIG. 6 is a flowchart illustrating an example of the flow of the generation process executed by the information providing apparatus according to the embodiment.
FIG. 7 is a diagram illustrating an example of a hardware configuration.

Hereinafter, modes for carrying out the generation device, generation method, and generation program according to the present application (hereinafter referred to as "embodiments") will be described in detail with reference to the drawings. The generation device, generation method, and generation program according to the present application are not limited by these embodiments. In the following embodiments, the same parts are denoted by the same reference numerals, and redundant description is omitted.

[Embodiment]
[1. Processing Provided by the Information Providing Apparatus]
First, an example of the generation process executed by an information providing apparatus, which is an example of the generation device, will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of processing executed by the information providing apparatus according to the embodiment. The following description covers two processes executed by the information providing apparatus 10: a generation process that generates distributed representations reflecting the similarity of images, in order to facilitate image search using distributed representations, and a search process that searches for images using the distributed representations generated by the generation process. The following description refers to the information providing apparatus 10 "generating" the distributed representation of each piece of information; this means "generating" a distributed representation by appropriately "learning" the values of a suitable distributed representation for each piece of data.

The generation process and search process described below are applicable not only to various kinds of images, such as still images and moving images, but also to generating distributed representations of arbitrary content, such as audio, movies, novels, and news articles, and to searching for content using the generated distributed representations.

[1-1. Overview of the Information Providing Apparatus]
The information providing apparatus 10 is an information processing apparatus that can communicate with the user terminal 100 via a predetermined network N such as the Internet (see, for example, FIG. 2), and is realized by, for example, a server apparatus or a cloud system. The information providing apparatus 10 may be able to communicate with an arbitrary number of user terminals 100 via the network N.

The user terminal 100 is an information processing apparatus used by a user who requests an information search, and is realized by an information processing apparatus such as a PC (Personal Computer), a server apparatus, or a smart device. For example, the user terminal 100 transmits an image, or information indicating the content of an image, to the information providing apparatus 10 as a search query. In such a case, the information providing apparatus 10 uses the distributed representations generated by the generation process described later to search for images similar to the image received as the search query, and provides the retrieved images to the user terminal 100 in a ranking ordered by similarity to the search query.

[1-2. Generation Process]
To facilitate image search based not only on the visual features of an image but also on the content of the captured subject, one conceivable technique is to generate a feature amount indicating the content of the captured subject and to classify images according to that feature amount. For example, tag information indicating the content of the subject captured in an image may be held as meta information of the image, and a distributed representation based on the feature amount of the subject may be generated according to the degree of coincidence of the meta information.

Triplet loss is known as a technique for generating distributed representations according to the degree of coincidence of meta information. In the triplet loss technique, a set consisting of a reference image, an image serving as correct-answer data, and an image serving as incorrect-answer data, that is, information called a triplet, is generated according to whether the degree of coincidence of the meta information is higher than a predetermined threshold. The distributed representation of each image is then learned so that the distributed representations of the reference image and the correct-answer image (hereinafter, the "correct pair") are similar, and the distributed representations of the reference image and the incorrect-answer image (hereinafter, the "incorrect pair") are dissimilar.
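For reference, the conventional triplet loss described above is commonly written in the form below. This is the widely used textbook formulation rather than a formula given in this publication; the embedding function $f$, the margin $\alpha$, and the use of squared Euclidean distance are assumptions introduced here for illustration. Here $a$ is the reference image, $p$ the correct-answer image, and $n$ the incorrect-answer image.

```latex
\mathcal{L}(a, p, n) = \max\bigl(0,\; \lVert f(a) - f(p) \rVert_2^2 - \lVert f(a) - f(n) \rVert_2^2 + \alpha \bigr)
```

Minimizing this quantity pulls the correct pair together and pushes the incorrect pair apart until the two distances differ by at least the margin.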

However, in such a triplet loss technique, any image whose meta-information coincidence exceeds the predetermined threshold is treated as correct-answer data. Consequently, although the distributed representations of images similar to the reference image are placed near the distributed representation of the reference image in the distributed representation space, the distributed representation of the image most similar to the reference image may not be placed nearest to it. In other words, the conventional triplet loss technique cannot reflect the degree of similarity among similar images in the similarity of their distributed representations.

Therefore, the information providing apparatus 10 executes the following generation process. First, the information providing apparatus 10 selects three images to serve as the first information, the second information, and the third information from the group of images for which distributed representations are to be generated. In the following description, the three images selected as the first to third information are referred to as the first to third images, respectively.

Next, the information providing apparatus 10 calculates a first similarity, which is the similarity between the first information and the second information, and a second similarity, which is the similarity between the first information and the third information. Specifically, the information providing apparatus 10 calculates the similarity between the first image and the second image as the first similarity, and the similarity between the first image and the third image as the second similarity. That is, the information providing apparatus 10 uses the first image as the reference image and calculates the first similarity between the reference image and the second image and the second similarity between the reference image and the third image.

Then, based on the relationship between the first similarity and the second similarity, the information providing apparatus 10 generates the distributed representation of each image so that one of the distributed representations of the second image and the third image is similar to the distributed representation of the first image and the other is not. For example, when the first similarity is greater than the second similarity, the information providing apparatus 10 generates the distributed representations so that the distributed representation of the first image is similar to that of the second image and dissimilar to that of the third image. Conversely, when the second similarity is greater than the first similarity, the information providing apparatus 10 generates the distributed representations so that the distributed representation of the first image is similar to that of the third image and dissimilar to that of the second image.
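The relative assignment described above can be sketched as follows. This is a minimal illustration, not the implementation disclosed in this publication: the function names, the margin value, the hinge-style loss on squared Euclidean distances, and the handling of ties (skipping the triplet) are all assumptions.

```python
import numpy as np


def assign_roles(first_similarity, second_similarity):
    """Pick the relative correct/incorrect image from the two similarities.

    Returns ("second", "third") when the second image is the relative
    correct answer, ("third", "second") in the opposite case, and None
    on a tie (an assumed policy: a tie gives no relative ordering).
    """
    if first_similarity > second_similarity:
        return "second", "third"
    if second_similarity > first_similarity:
        return "third", "second"
    return None


def relative_triplet_loss(emb_first, emb_second, emb_third,
                          first_similarity, second_similarity, margin=0.2):
    """Hinge-style triplet loss with roles chosen by relative similarity.

    `margin` is a hypothetical hyperparameter, not taken from the source.
    """
    roles = assign_roles(first_similarity, second_similarity)
    if roles is None:
        return 0.0
    embeddings = {"second": np.asarray(emb_second, dtype=float),
                  "third": np.asarray(emb_third, dtype=float)}
    anchor = np.asarray(emb_first, dtype=float)
    positive, negative = embeddings[roles[0]], embeddings[roles[1]]
    d_pos = np.sum((anchor - positive) ** 2)  # should become small
    d_neg = np.sum((anchor - negative) ** 2)  # should become large
    return float(max(0.0, d_pos - d_neg + margin))
```

In a real training loop the loss would be minimized with respect to the parameters of the model that produces the embeddings; the sketch only shows how the roles of the second and third images swap depending on which similarity is larger.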

That is, rather than selecting correct-answer and incorrect-answer images according to whether the similarity between the reference image and another image is higher than a predetermined threshold, the information providing apparatus 10 sets the images that serve as relative correct-answer data and relative incorrect-answer data based on the similarity between the reference image and the second image and the similarity between the reference image and the third image. The information providing apparatus 10 then learns the distributed representations so that the distributed representations of the correct pair are similar and those of the incorrect pair are dissimilar.

Thus, even when images similar to the reference image are selected as both the second image and the third image, the information providing apparatus 10 sets relative correct-answer data and relative incorrect-answer data according to their similarity to the reference image, and generates the distributed representation of each image. As a result, the information providing apparatus 10 can reflect, in the distributed representations, the relative similarity among images that are all similar to the reference image, and can therefore generate a ranking of images ordered by similarity when performing an image search using the distributed representations.

In addition, because the information providing apparatus 10 can reflect the relative similarity of the images in the distributed representations, it can map each image into a distributed representation space in which a distance function satisfying the distance axioms is guaranteed to be usable. For example, in techniques that adopt as a feature amount the intermediate representation generated from an image by a multi-layer neural network, such as those used in deep learning, it is unclear which distance function is applicable, which makes a simple comparison of feature amounts difficult. In contrast, the information providing apparatus 10 generates the distributed representations so as to reflect the relative degree of similarity of the images, so the distributed representations are learned such that the distributed representation space becomes a Euclidean space. The information providing apparatus 10 can therefore easily realize similar-image search by, for example, calculating the Euclidean distance between the distributed representation of the image serving as the search query and the distributed representations of other images.
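The Euclidean-distance search mentioned above might look like the following sketch. The function name, the dictionary-based storage of distributed representations keyed by image ID, and the brute-force scan are assumptions for illustration; the publication does not specify a data layout or search algorithm.

```python
import numpy as np


def rank_by_euclidean_distance(query_embedding, embeddings):
    """Rank stored images by Euclidean distance to the query embedding.

    `embeddings` is a hypothetical mapping from image ID to its
    distributed representation. Returns (image_id, distance) pairs,
    nearest first.
    """
    query = np.asarray(query_embedding, dtype=float)
    ids = list(embeddings)
    matrix = np.asarray([embeddings[i] for i in ids], dtype=float)
    distances = np.linalg.norm(matrix - query, axis=1)
    order = np.argsort(distances, kind="stable")
    return [(ids[k], float(distances[k])) for k in order]
```

Because the ranking is simply ascending distance, the quality of the ranking depends entirely on how well the learned space reflects relative similarity, which is the property the generation process aims for.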

[1-3. Similarity]
When calculating the similarity between images, the information providing apparatus 10 may use any criterion suited to the purpose for which the distributed representations are generated. For example, the information providing apparatus 10 may calculate the first similarity and the second similarity based on various structural (visual) similarities, such as color, pixels, and edges. The information providing apparatus 10 may also calculate the semantic similarity of the images.

For example, the information providing apparatus 10 may use a model trained to extract the features of a captured subject from an image, using techniques such as those from ILSVRC (ImageNet Large Scale Visual Recognition Challenge), to extract the features of the subject in each image, and may calculate a first similarity and a second similarity indicating the similarity of the extracted features. For example, the information providing apparatus 10 may identify features such as the type and color of the subject captured in each image, and calculate the first similarity and the second similarity based on the commonality or resemblance of the identified features (hereinafter referred to as "similarity"). Such a model may be realized by a neural network such as a convolutional neural network (CNN). The information providing apparatus 10 may also classify the subject of each image using a model that classifies subjects based on the Fisher vector of each image, and calculate a first similarity and a second similarity indicating the similarity of the classification results.

The information providing apparatus 10 may also calculate the first similarity and the second similarity based on the semantic similarity of the images, such as the features of the captured subject or the intended use of the image (for example, which item for sale the image is used to describe in electronic commerce). For example, each image may be associated with tag information indicating the meaning of the image, such as the features of the subject or the purpose of the image. In addition, web content in which an image is posted may contain various kinds of information that can be associated with the image, such as its caption, the name of the subject (for example, the name of the item for sale), and the price of the subject. The information providing apparatus 10 therefore collects the various kinds of information that can be associated with each image as meta information indicating the meaning of the image. The information providing apparatus 10 may then calculate the first similarity based on the similarity between the meta information of the first image and that of the second image, and the second similarity based on the similarity between the meta information of the first image and that of the third image. That is, the information providing apparatus 10 may calculate the first similarity and the second similarity according to not only the similarity of the images themselves but also the similarity of the information associated with the images.

The information providing apparatus 10 may also calculate the first similarity based on the similarity in meaning or in notation between the meta information of the first image and that of the second image, and the second similarity based on the similarity in meaning or in notation between the meta information of the first image and that of the third image. For example, the information providing apparatus 10 may calculate the degree of coincidence of the text included in the meta information (that is, the similarity in notation) and calculate the first similarity and the second similarity based on that coincidence. The information providing apparatus 10 may also use, for example, w2v (word2vec) to calculate the similarity in meaning of the text included in the meta information and calculate the first similarity and the second similarity based on that similarity.
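One way to turn word vectors into a meaning-based similarity between two pieces of meta information is sketched below. The text only says that w2v may be used; averaging the word vectors and taking the cosine similarity is an assumed, commonly used recipe, and the `word_vectors` mapping stands in for a trained model that is not shown here.

```python
import numpy as np


def mean_vector(tokens, word_vectors):
    """Average the vectors of the tokens that have an entry.

    `word_vectors` is a hypothetical mapping from token to vector,
    for example one exported from a trained word2vec model.
    """
    vectors = [np.asarray(word_vectors[t], dtype=float)
               for t in tokens if t in word_vectors]
    if not vectors:
        return None
    return np.mean(vectors, axis=0)


def semantic_similarity(tokens_a, tokens_b, word_vectors):
    """Cosine similarity between the mean vectors of two token lists."""
    a = mean_vector(tokens_a, word_vectors)
    b = mean_vector(tokens_b, word_vectors)
    if a is None or b is None:
        return 0.0  # assumed fallback when no token is known
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)
```

The first similarity would then be `semantic_similarity` applied to the meta information of the first and second images, and the second similarity the same function applied to the first and third images.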

Besides the information included in the tag information attached to an image and the various kinds of information posted in the same web content as the image, the information providing apparatus 10 may adopt as meta information any information of any type that is associated with the image, such as words included in text that a user posted to a microblog together with the image. That is, the information providing apparatus 10 may adopt any information as meta information as long as it is associated with the image and can indicate the semantic content of the image. The information providing apparatus 10 may also adopt any type of information as the meta information of an image depending on the criterion on which similar-image search is to be based, such as the color of the image, the shape of the subject, or the semantic meaning of the image.

[1-4. Search Queries]
The information providing apparatus 10 may also adopt the content of the search query corresponding to each image as meta information. For example, the information providing apparatus 10 acquires, from a search server or the like (not shown) that performs web searches, a search log indicating a search query input by a user U and the image that the user U selected from among the images displayed as search results for that query. Then, for each image for which a distributed representation is to be generated, the information providing apparatus 10 identifies, as meta information, the search query that the user U had input when the user U selected that image.

The information providing apparatus 10 then calculates the first similarity based on the similarity between the search query that the user U had input when selecting the first image and the search query that the user U had input when selecting the second image. Likewise, the information providing apparatus 10 calculates the second similarity based on the similarity between the search query that the user U had input when selecting the first image and the search query that the user U had input when selecting the third image. Using the calculated first similarity and second similarity, the information providing apparatus 10 generates distributed representations that reflect the relative similarity of the images.

The search query that the user U input when selecting a given image can be said to reflect the user's search intent when searching for that image. Therefore, by calculating the similarity between the search queries that users U input when selecting each image, the information providing apparatus 10 can calculate the similarity of the users' search intent when searching for each image. When the distributed representation of each image is generated so as to reflect this similarity, the search intent of the user U when searching for each image can be embedded in distances in the distributed representation space. As a result, the information providing apparatus 10 can realize an image search that reflects the search intent of the user U using the distributed representations.

For example, suppose that the information providing apparatus 10 acquires "shoes, blue, brandA, sizeA" as the search query input by the user U when the first image was selected, "shoes, blue, brandA, sizeB" as the search query input when the second image was selected, and "shoes, red, brandB, sizeB" as the search query input when the third image was selected. In such a case, the information providing apparatus 10 calculates the IoU (Intersection over Union) between the search query corresponding to the first image and the search query corresponding to the second image as the first similarity.

More specifically, the information providing apparatus 10 calculates, as the first similarity, the number of distinct tokens common to the search query corresponding to the first image and the search query corresponding to the second image, divided by the number of distinct tokens appearing in either search query. In the example above, the two queries share the three tokens "shoes", "blue", and "brandA", and a total of five distinct tokens, "shoes", "blue", "brandA", "sizeA", and "sizeB", appear across the two queries. The information providing apparatus 10 therefore calculates "0.6" as the first similarity, obtained by dividing the number of common tokens, "3", by the number of distinct tokens across the queries, "5". Similarly, the information providing apparatus 10 calculates the IoU between the search query corresponding to the first image and the search query corresponding to the third image (for example, "0.33") as the second similarity.
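The IoU calculation described above can be reproduced with a short sketch. It assumes each search query has already been split into tokens and treats the tokens as a plain set; the function and variable names are illustrative. Under this plain set-based reading, the first and second queries give 3/5 = 0.6 as in the text, while the first and third queries share only "shoes" out of seven distinct tokens, giving 1/7 (about 0.14), so the "0.33" quoted in the text is best read as an illustrative value.

```python
def query_iou(query_a, query_b):
    """Intersection over Union of the distinct tokens of two queries."""
    tokens_a, tokens_b = set(query_a), set(query_b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0  # assumed convention for two empty queries
    return len(tokens_a & tokens_b) / len(union)


# Queries from the example in the text, already tokenized.
first_query = ["shoes", "blue", "brandA", "sizeA"]
second_query = ["shoes", "blue", "brandA", "sizeB"]
third_query = ["shoes", "red", "brandB", "sizeB"]

first_similarity = query_iou(first_query, second_query)   # 3 shared / 5 distinct
second_similarity = query_iou(first_query, third_query)   # 1 shared / 7 distinct

# The second image is the relative correct answer for the first image.
second_is_closer = first_similarity > second_similarity
```

Either way the ordering is the same: the first similarity exceeds the second, so the second image is treated as the relative correct answer.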

In this case, the value of the first similarity is larger than the value of the second similarity. It can therefore be said that the first image and the second image are more similar to each other than the first image and the third image are. Accordingly, the information providing apparatus 10 learns the distributed representation of each image so that the distributed representation of the first image and that of the second image become similar, while the distributed representation of the first image and that of the third image become dissimilar.

Here, the information providing apparatus 10 may take the number of tokens in each search query into account when calculating the first similarity and the second similarity. For example, a search query with many tokens is presumed to express the search intent of the user U more clearly than other search queries. The information providing apparatus 10 may therefore extract, from the search queries entered by the user U when each image was selected, those whose number of tokens exceeds a predetermined threshold, and calculate the first similarity and the second similarity from the similarity between the extracted queries. For example, the information providing apparatus 10 may extract search queries containing four or more tokens, or may extract only the search query with the largest number of tokens among the search queries acquired from an external server or the like. The information providing apparatus 10 may also set the token-count threshold so that the search queries entered by a majority of the users who selected the image are included.
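
The two extraction strategies mentioned above (a fixed token-count threshold, or keeping only the longest query) might look like the following sketch. The function names and the comma-separated query format are assumptions made for illustration.

```python
def token_count(query: str) -> int:
    """Number of non-empty comma-separated tokens in a query."""
    return len([t for t in query.split(",") if t.strip()])


def select_queries(queries: list[str], min_tokens: int = 4) -> list[str]:
    """Keep only queries containing at least `min_tokens` tokens."""
    return [q for q in queries if token_count(q) >= min_tokens]


def longest_queries(queries: list[str]) -> list[str]:
    """Keep only the queries that have the largest token count."""
    if not queries:
        return []
    best = max(token_count(q) for q in queries)
    return [q for q in queries if token_count(q) == best]
```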

[1-5. Similarity Based on Multiple Types of Information]
Here, the information providing apparatus 10 may calculate the first similarity based on the per-type similarity between the multiple types of information associated with the first image and the multiple types of information associated with the second image, and may calculate the second similarity based on the per-type similarity between the multiple types of information associated with the first image and the multiple types of information associated with the third image. That is, when meta information containing multiple types of information is associated with an image, the information providing apparatus 10 may calculate the first similarity and the second similarity from the similarity of the meta information as a whole, or from the similarity of each type.

For example, the information providing apparatus 10 acquires meta information that includes information indicating structural features of an image (hereinafter, "structural information") and information indicating semantic features of the image, such as the imaged subject (hereinafter, "semantic information"). In this case, the information providing apparatus 10 may calculate the first similarity based on the similarity between the structural information of the first image and that of the second image (hereinafter, "structural similarity") and the similarity between the semantic information of the first image and that of the second image (hereinafter, "semantic similarity").

The information providing apparatus 10 may also calculate a first similarity that applies a weight to each type. For example, the information providing apparatus 10 may calculate the first similarity based on the structural similarity multiplied by a first priority and the semantic similarity multiplied by a second priority. By setting such priorities, the information providing apparatus 10 can flexibly choose whether to generate distributed representations that emphasize the structural similarity of images or ones that emphasize their semantic similarity.
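
As a minimal sketch of the weighting described above: the text says only that the first similarity is based on the two weighted values, so combining them by addition is an assumption, as are the function name and the default weights.

```python
def weighted_similarity(
    structural: float,
    semantic: float,
    first_priority: float = 0.3,   # weight on structural similarity (assumed value)
    second_priority: float = 0.7,  # weight on semantic similarity (assumed value)
) -> float:
    """Combine per-type similarities; summing the weighted terms is an assumption."""
    return first_priority * structural + second_priority * semantic
```

Raising `second_priority` relative to `first_priority` makes semantic agreement dominate the resulting similarity, and vice versa.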

The information providing apparatus 10 may also calculate a first similarity and a second similarity with a predetermined number of digits by concatenating the per-type similarities in descending order of priority. For example, to calculate a 128-bit value as the first similarity, the information providing apparatus 10 calculates a 64-bit structural similarity and a 64-bit semantic similarity. The information providing apparatus 10 may then concatenate them, for example with the semantic similarity as the upper digits and the structural similarity as the lower digits, to obtain a 128-bit first similarity to which the semantic similarity contributes with higher priority.

When the structural similarity and the semantic similarity are calculated as decimal numbers, the information providing apparatus 10 may, for example, multiply the semantic similarity by 10 to the power n (where n is the number of digits of the structural similarity) and add the structural similarity to the result, thereby obtaining a first similarity to which the semantic similarity contributes with higher priority. Even when the meta information contains three or more types of information, the information providing apparatus 10 may calculate the similarity for each type and concatenate the similarities, placing the similarity of a higher-priority type in higher-order digits, to calculate the first similarity and the second similarity.
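
The digit concatenation described above can be sketched with integer bit operations. The quantization of a similarity in [0, 1] to a 64-bit unsigned integer is an assumption made for this illustration; the text does not say how the 64-bit values are obtained.

```python
def quantize(similarity: float, bits: int = 64) -> int:
    """Map a similarity in [0, 1] to an unsigned integer of `bits` bits."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    max_value = (1 << bits) - 1
    # Clamp so that similarity == 1.0 does not overflow the bit width.
    return min(int(similarity * (1 << bits)), max_value)


def combine(similarities_by_priority: list[float], bits: int = 64) -> int:
    """Concatenate quantized similarities, highest priority in the upper bits."""
    combined = 0
    for similarity in similarities_by_priority:
        combined = (combined << bits) | quantize(similarity, bits)
    return combined


# Semantic similarity first (upper 64 bits), structural second (lower 64 bits).
first_similarity = combine([0.6, 0.1])
second_similarity = combine([0.5, 0.9])
```

Because the higher-priority value occupies the upper bits, comparing the combined integers compares semantic similarity first and uses structural similarity only to break ties, which is the ordering the passage describes.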

[1-6. Selection of Images]
Here, when selecting the first to third images, the information providing apparatus 10 may select them in every possible combination from the images for which distributed representations are to be generated. The information providing apparatus 10 may also deliberately select information whose similarity to the first image exceeds a predetermined threshold, such as images belonging to the same field (class) as the first image, so that the relative similarity among images in that field is reflected in the distributed representation space.

To make learning of the distributed representations more efficient, the information providing apparatus 10 may also narrow, in stages, the field from which the first to third images are selected. For example, in the initial stage of learning, the information providing apparatus 10 randomly selects the first to third images from images belonging to all categories. Once learning has progressed (for example, once the accuracy of the distributed representations exceeds a predetermined threshold), it randomly selects the first to third images from images belonging to a predetermined category, and once learning has progressed further, it may randomly select them from images belonging to a subcategory of that category. In other words, the information providing apparatus 10 may increase the similarity among the selected images each time learning progresses.
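
The staged narrowing described above might be sketched as follows. The record layout (dictionaries with `category` and `subcategory` keys), the integer `stage` argument, and the function name are all hypothetical.

```python
import random


def sample_triple(images, stage, category=None, subcategory=None, rng=random):
    """Randomly pick three distinct images from a pool that narrows with `stage`.

    stage 0: all images; stage 1: one category; stage 2: one subcategory.
    """
    if stage == 0:
        pool = list(images)
    elif stage == 1:
        pool = [im for im in images if im["category"] == category]
    else:
        pool = [
            im for im in images
            if im["category"] == category and im["subcategory"] == subcategory
        ]
    if len(pool) < 3:
        raise ValueError("need at least three images in the selection pool")
    # random.sample draws without replacement, so the three images are distinct.
    return tuple(rng.sample(pool, 3))
```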

[1-7. Multi-Stage Learning]
The information providing apparatus 10 may also combine the conventional triplet loss learning method with the generation process described above. For example, the information providing apparatus 10 selects a fourth image whose similarity to the first image is equal to or greater than a first threshold and a fifth image whose similarity to the first image is equal to or less than a second threshold. In this case, the information providing apparatus 10 generates the distributed representation of each image so that the distributed representation of the first image and that of the fourth image are similar, and the distributed representation of the first image and that of the fifth image are dissimilar.

The information providing apparatus 10 then calculates the first similarity and the second similarity for every set of three images drawn from the first to fifth images. That is, the information providing apparatus 10 generates every combination of a reference image and two other images, and calculates the first similarity and the second similarity for each generated combination. Using the first similarity and the second similarity, the information providing apparatus 10 sets the correct answer data and the incorrect answer data in each combination, and learns the distributed representations so that those of each correct pair become similar and those of each incorrect pair become dissimilar.

For example, in the initial stage of generating the distributed representations, the information providing apparatus 10 selects a reference image, correct answer data whose similarity to the reference image exceeds a predetermined threshold, and incorrect answer data whose similarity to the reference image falls below a predetermined threshold, and learns the distributed representations so that those of the correct pair become similar and those of the incorrect pair become dissimilar.

The information providing apparatus 10 then calculates the accuracy of the distributed representations at a predetermined timing. For example, the information providing apparatus 10 generates a ranking of images similar to a given image using the distributed representations, and also generates a ranking of images similar to that image based on the similarity of the meta information of each image. The information providing apparatus 10 then calculates the accuracy of the distributed representations from the degree of agreement between the ranking based on the distributed representations and the ranking based on the meta information.
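
The text does not specify how the degree of agreement between the two rankings is measured. One simple possibility, shown here purely as an assumption, is the overlap between the top-k entries of each ranking; the function name `topk_agreement` is hypothetical.

```python
def topk_agreement(ranking_a: list[str], ranking_b: list[str], k: int) -> float:
    """Fraction of the top-k items of one ranking that also appear in the other's top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k


embedding_ranking = ["P3", "P2", "P4"]  # ranking from distributed representations
meta_ranking = ["P2", "P3", "P4"]       # ranking from meta-information similarity
accuracy = topk_agreement(embedding_ranking, meta_ranking, k=2)
```

A rank correlation coefficient would be an alternative that also rewards matching order within the top-k set.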

Here, the information providing apparatus 10 executes the generation process described above when the accuracy of the distributed representations exceeds a predetermined threshold, or when the accuracy has stopped improving for a predetermined period. That is, the information providing apparatus 10 selects the first to third images, calculates the first similarity and the second similarity from the similarities among the selected images, and, based on the comparison of the first similarity and the second similarity, sets the second image and the third image as the correct answer data and the incorrect answer data. The information providing apparatus 10 then learns the distributed representations so that those of the correct pair become similar and those of the incorrect pair become dissimilar.
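
The switching condition described above (accuracy above a threshold, or no improvement for some period) could be checked as in the sketch below. Measuring the "predetermined period" as a number of recent evaluations (`patience`) is an assumption, as are the function and parameter names.

```python
def should_switch(history: list[float], threshold: float, patience: int) -> bool:
    """Decide whether to move from threshold-based triplets to relative triplets.

    `history` holds the accuracy measured at each evaluation, oldest first.
    """
    if not history:
        return False
    if history[-1] > threshold:
        return True  # accuracy exceeded the predetermined threshold
    if len(history) <= patience:
        return False  # not enough evaluations to judge a plateau
    best_before = max(history[:-patience])
    # Plateau: none of the last `patience` evaluations beat the earlier best.
    return max(history[-patience:]) <= best_before
```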

[1-8. Generation]
Here, the information providing apparatus 10 may generate the distributed representations by any method, as long as they are generated so that the distributed representations of a correct pair are similar and those of an incorrect pair are dissimilar. For example, the information providing apparatus 10 may generate the distributed representations so that the difference between the distributed representation of the first image and that of the second image is smaller than the difference between the distributed representation of the first image and that of the third image.

An example of the mathematical expressions that the information providing apparatus 10 uses to generate the distributed representations is described below. For example, the information providing apparatus 10 uses Expression (1) to generate the distributed representation of each image in accordance with the triplet loss method.

Here, x_i^a in Expression (1) denotes the reference image, x_i^p denotes the image set as correct answer data based on the first similarity and the second similarity, and x_i^n denotes the image set as incorrect answer data based on the first similarity and the second similarity. f(x) in Expression (1) is the distributed representation of image x and has a predetermined number of dimensions. α in Expression (1) is a predetermined coefficient. The information providing apparatus 10 sets the value of the distributed representation f(x) of each image so that the value of L in Expression (1) is maximized.
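
Expression (1) itself is not reproduced in this text. As a point of reference only, the sketch below implements the widely used hinge form of triplet loss with squared Euclidean distance, using the same roles for anchor (x_i^a), positive (x_i^p), negative (x_i^n), and margin α. It should not be read as the specification's exact expression: under this common form the loss is minimized during training, whereas the text speaks of maximizing L, which would correspond to a different sign convention.

```python
def squared_distance(u: list[float], v: list[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, alpha: float = 0.2) -> float:
    """Hinge-form triplet loss for one triple (assumed form, not Expression (1)).

    Zero once the negative is farther from the anchor than the positive
    by at least the margin `alpha`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + alpha)


def total_loss(triples, alpha: float = 0.2) -> float:
    """Sum of the per-triple losses over (anchor, positive, negative) vectors."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triples)
```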

In conventional triplet loss, absolute correct answer data and incorrect answer data were set according to the similarity between the reference image and each other image. Accordingly, writing the meta information of image x as w(x), the relationship among the reference image, the correct answer data, and the incorrect answer data in conventional triplet loss is expressed by Expressions (2) and (3) below.

In contrast, the information providing apparatus 10 does not select the images to be learned as absolute correct answer data and incorrect answer data. Instead, it sets the correct answer data and the incorrect answer data among the selected images based on their relative similarity, and generates the distributed representation of each image. The relationship among the reference image, the correct answer data, and the incorrect answer data selected by the information providing apparatus 10 is therefore expressed by Expressions (4) and (5) below.

As a result of this processing, the information providing apparatus 10 reflects in the distributed representations not only the relationship between images that are similar to the reference image and images that are not, but also the relationships among multiple images that are similar to the reference image and among multiple images that are dissimilar to it. The information providing apparatus 10 can thus embed the relative similarity of all images to be processed in the distributed representation space, which improves the accuracy of the distributed representations.

[1-9. Example of the Generation Process]
Next, an example of the generation process executed by the information providing apparatus 10 is described with reference to FIG. 1. First, the information providing apparatus 10 randomly selects a first image P1, a second image P2, and a third image P3 from the images to be processed, and acquires the meta information M1 to M3 of each image (step S1). The information providing apparatus 10 then calculates the similarities between the pieces of meta information and compares the calculated similarities (step S2). For example, the information providing apparatus 10 calculates the similarity S1 between the meta information M1 and the meta information M2 (that is, the first similarity) and the similarity S2 between the meta information M1 and the meta information M3 (that is, the second similarity), and compares the calculated similarity S1 with the similarity S2.

The information providing apparatus 10 then generates the distributed representation of each image according to the result of the similarity comparison (step S3). For example, when the value of the similarity S1 is larger than the value of the similarity S2, the information providing apparatus 10 sets the second image as the correct answer data and the third image as the incorrect answer data. The information providing apparatus 10 then generates the distributed representations P1 to P3 so that the distributed representation P1 of the first image and the distributed representation P2 of the second image are similar, and the distributed representation P1 of the first image and the distributed representation P3 of the third image are dissimilar. Conversely, when the value of the similarity S2 is larger than the value of the similarity S1, the information providing apparatus 10 sets the third image as the correct answer data and the second image as the incorrect answer data. The information providing apparatus 10 then generates the distributed representations P1 to P3 so that the distributed representation P1 of the first image and the distributed representation P3 of the third image are similar, and the distributed representation P1 of the first image and the distributed representation P2 of the second image are dissimilar.
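
The comparison in step S3 amounts to assigning the roles of correct and incorrect answer data by relative similarity. A minimal sketch follows; the function name is hypothetical, and returning `None` when the two similarities are equal is an assumption, since the text does not say how ties are handled.

```python
def assign_roles(s1: float, s2: float, second_image, third_image):
    """Return (correct, incorrect) based on which similarity to the reference is larger.

    s1: similarity between the first and second images (first similarity).
    s2: similarity between the first and third images (second similarity).
    """
    if s1 > s2:
        return second_image, third_image
    if s2 > s1:
        return third_image, second_image
    return None  # tie: skipping the triple is an assumption
```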

The information providing apparatus 10 also performs the same processing for the other images (step S4). More specifically, the information providing apparatus 10 generates triples for every combination of an image serving as the reference image and two images other than the reference image serving as the second image and the third image. The information providing apparatus 10 then executes steps S1 to S3 for every generated triple, thereby generating the distributed representations of all images.
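
Enumerating the triples of step S4 can be sketched with `itertools.combinations`. Treating the second and third images as an unordered pair is an assumption; it is sufficient here because the correct and incorrect roles are assigned afterwards by comparing similarities.

```python
from itertools import combinations


def all_triples(image_ids):
    """Yield (reference, second, third) for every reference image and every
    unordered pair of the remaining images."""
    ids = list(image_ids)
    for reference in ids:
        others = [i for i in ids if i != reference]
        for second, third in combinations(others, 2):
            yield reference, second, third


triples = list(all_triples(["P1", "P2", "P3", "P4"]))
```

For n images this yields n × C(n−1, 2) triples, so exhaustive enumeration grows cubically with the number of images.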

Next, an example of the search process executed by the information providing apparatus 10 is described. First, the information providing apparatus 10 receives a search query from the user terminal 100 (step S5). For example, the information providing apparatus 10 receives a search query q1 from the user terminal 100.

In this case, the information providing apparatus 10 identifies the image corresponding to the search query q1 and generates a ranking of images for the search query based on their distance from the distributed representation of the identified image (step S6). For example, the information providing apparatus 10 selects the first image P1 as the image most relevant to the search query q1. The information providing apparatus 10 then calculates the Euclidean distance between the distributed representation P1 of the image P1 and each of the other distributed representations P2 to P4.

Then, for example, when the distributed representations P3, P2, and P4 are, in that order, closest to the distributed representation P1 in Euclidean distance, the information providing apparatus 10 generates a ranked search result in which the images P2 to P4 are arranged in the order of the image P3 corresponding to the distributed representation P3, the image P2 corresponding to the distributed representation P2, and the image P4 corresponding to the distributed representation P4. The information providing apparatus 10 then provides the generated ranked search result to the user terminal 100 (step S7).
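
The ranking in steps S6 and S7 can be sketched with the standard library's `math.dist`, which computes Euclidean distance. The dictionary of embeddings and its two-dimensional values are made up for illustration; actual distributed representations would have many more dimensions.

```python
import math


def rank_by_distance(query_id: str, embeddings: dict) -> list:
    """Rank all other images by Euclidean distance to the query image's embedding."""
    query_vector = embeddings[query_id]
    scored = [
        (math.dist(query_vector, vector), image_id)
        for image_id, vector in embeddings.items()
        if image_id != query_id
    ]
    return [image_id for _, image_id in sorted(scored)]


# Hypothetical embeddings chosen so that P3 is nearest to P1, then P2, then P4.
embeddings = {
    "P1": (0.0, 0.0),
    "P2": (2.0, 0.0),
    "P3": (1.0, 0.0),
    "P4": (5.0, 5.0),
}
ranking = rank_by_distance("P1", embeddings)
```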

[2. Configuration of the Information Providing Apparatus]
Next, an example of the functional configuration of the information providing apparatus 10 described above is explained. FIG. 2 is a diagram illustrating a configuration example of the information providing apparatus according to the embodiment. As illustrated in FIG. 2, the information providing apparatus 10 includes a communication unit 20, a storage unit 30, and a control unit 40.

The communication unit 20 is realized by, for example, a NIC (Network Interface Card). The communication unit 20 is connected to the network N by wire or wirelessly, and transmits and receives information to and from the user terminal 100.

The storage unit 30 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 30 stores an image database 31 and a distributed representation database 32.

The images for which distributed representations are to be generated are registered in the image database 31. FIG. 3 is a diagram illustrating an example of the information registered in the image database according to the embodiment. As shown in FIG. 3, information having items such as "image ID (Identifier)", "image data", and "meta information" is registered in the image database 31. Under "meta information", information such as "search query", "tag information", and "feature information" is registered, each with a different priority.

Here, "image ID" is the identifier of an image. "Image data" is the image data of the image indicated by the associated "image ID". "Meta information" is the meta information attached to the image indicated by the associated "image ID". For example, "search query" is the "priority 1" meta information, which is considered first when calculating the similarity between images, and is the search query that the user U had entered when the user U selected the image indicated by the associated "image ID". "Tag information" is the "priority 2" meta information, which is considered second when calculating the similarity between images, and is tag information attached to the image in advance, such as the characteristics of the subject captured in the image indicated by the associated "image ID". "Feature information" is the "priority 3" meta information, which is considered third when calculating the similarity between images, and indicates the structural (visual) features of the image indicated by the associated "image ID".

For example, in the example shown in FIG. 3, the image ID "image ID#1", the image data "image data#1", the search query "search query#1", the tag information "tag information#1", and the feature information "feature information#1" are registered in association with one another. This indicates that the image data of the image indicated by the image ID "image ID#1" is "image data#1", that the search query entered by the user U when that image was selected is "search query#1", that the tag information "tag information#1" is attached to it, and that the visual features of the image are "feature information#1". Although the example in FIG. 3 uses conceptual values such as "image ID#1", "image data#1", "search query#1", "tag information#1", and "feature information#1", what is actually registered is a character string identifying the image, image data in various formats, the character string entered as the search query, the character strings contained in the tag information, a multidimensional quantity representing the features, and so on.

Returning to FIG. 2, the description continues. The distributed representations of the images are registered in the distributed representation database 32. FIG. 4 is a diagram illustrating an example of the information registered in the distributed representation database according to the embodiment. In the example shown in FIG. 4, information having the items "image ID" and "distributed representation" is registered in the distributed representation database 32. Here, "distributed representation" is the distributed representation generated from the image indicated by the associated "image ID".

For example, in the example shown in FIG. 4, the image ID "image ID#1" and the distributed representation "distributed representation#1" are registered in association with each other. This indicates that the distributed representation of the image indicated by the image ID "image ID#1" is "distributed representation#1". Although the example in FIG. 4 uses a conceptual value such as "distributed representation#1", what is actually registered is the multidimensional quantity generated as the distributed representation.

Returning to FIG. 2, the description continues. The control unit 40 is a controller and is realized, for example, by a processor such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs stored in a storage device inside the information providing apparatus 10, using a RAM or the like as a work area. The control unit 40 may instead be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

As illustrated in FIG. 2, the control unit 40 includes a selection unit 41, a calculation unit 42, a comparison unit 43, a generation unit 44, a reception unit 45, and a search unit 46. The selection unit 41 selects the images to be processed. For example, the selection unit 41 refers to the image database 31 and selects one image as the reference image. The selection unit 41 also selects every combination of two other images, each of which will serve as correct data or incorrect data for the reference image. The selection unit 41 then notifies the calculation unit 42 of the three selected images as a triple. In learning the distributed representations, the selection unit 41 selects every image as a reference image in turn and generates all triples that have the selected image as the reference image.
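The exhaustive selection described above can be sketched as follows. This is only an illustration: the function name `enumerate_triples` and the image IDs are hypothetical, and the two non-reference images are treated here as an unordered pair, which is an assumption rather than something the description specifies.

```python
from itertools import combinations

def enumerate_triples(image_ids):
    """Yield (reference, second, third) for every reference image and
    every unordered pair of the remaining images."""
    for ref in image_ids:
        others = [i for i in image_ids if i != ref]
        for second, third in combinations(others, 2):
            yield ref, second, third

# Hypothetical image IDs, for illustration only.
triples = list(enumerate_triples(["img1", "img2", "img3", "img4"]))
```

With n images this produces n * C(n-1, 2) triples, so a practical system would likely sample triples rather than enumerate them all.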

When the distributed representations are generated in stages, the selection unit 41 may, for example, use an arbitrary technique to select as correct data a fourth image whose similarity to the reference image is equal to or higher than a first threshold, and to select as incorrect data a fifth image whose similarity to the reference image is equal to or lower than a second threshold. The selection unit 41 may also select, as the second image and the third image, images whose similarity to the reference image exceeds a predetermined threshold. The selection unit 41 may further narrow, in stages, the field from which the correct data and the incorrect data are selected. For example, the selection unit 41 may select the images of each triple so that the range of meta-information similarity among the images gradually narrows.

The calculation unit 42 calculates a first similarity, which is the similarity between the reference image and the second image, and a second similarity, which is the similarity between the reference image and the third image. For example, the calculation unit 42 calculates the first similarity based on the similarity between the meta information of the reference image and the meta information of the second image, and calculates the second similarity based on the similarity between the meta information of the reference image and the meta information of the third image.

For example, the calculation unit 42 reads from the image database 31 the meta information (for example, tag information) of the reference image and of the second image included in the triple. The calculation unit 42 then calculates a similarity between the read meta information, such as a degree of coincidence or IoU (Intersection over Union), and calculates the first similarity based on that value. Similarly, the calculation unit 42 calculates the second similarity from the similarity between the meta information of the reference image and that of the third image.
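As an illustration of the IoU mentioned above, the following minimal sketch computes the intersection over union of two tag sets. The function name `tag_iou`, the example tags, and the choice to return 0.0 when both sets are empty are assumptions made for this sketch.

```python
def tag_iou(tags_a, tags_b):
    """Intersection over union of two tag collections (0.0 to 1.0)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when neither image has tags
    return len(a & b) / len(union)

# Hypothetical tags: two shared tags out of four distinct tags.
first_similarity = tag_iou(["cat", "animal", "pet"], ["cat", "animal", "dog"])
```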

The calculation unit 42 may also use multiple types of meta information associated with the reference image (for example, search queries, tag information, and feature information) and multiple types of meta information associated with the second image or the third image to calculate a similarity for each type of meta information, and may calculate the first similarity and the second similarity based on the resulting similarities. The calculation unit 42 may further calculate a first similarity and a second similarity having a predetermined number of digits by concatenating the per-type similarities in descending order of priority.

For example, FIG. 5 is a diagram illustrating an example of the similarity calculated by the information providing apparatus according to the embodiment. The example in FIG. 5 calculates a decimal similarity that reflects the priorities. The calculation unit 42 calculates the similarity "AAAAA" between the search queries of the reference image and the second image, the similarity "BBB" between their tag information, and the similarity "CCC" between their feature information. In this case, the calculation unit 42 multiplies the similarity "AAAAA" of the search queries, which have the highest priority, by a coefficient α of "1000000", corresponding to the combined number of digits of the tag-information similarity and the feature-information similarity. The calculation unit 42 multiplies the similarity "BBB" of the tag information, which has the second-highest priority, by a coefficient β of "1000", corresponding to the number of digits of the feature-information similarity, and multiplies the similarity "CCC" of the feature information, which has the third-highest priority, by a coefficient γ of "1". The calculation unit 42 then takes the sum of these products, "AAAAABBBCCC", as the first similarity.

Likewise, the calculation unit 42 calculates the similarity "aaaaa" between the search queries of the reference image and the third image, the similarity "bbb" between their tag information, and the similarity "ccc" between their feature information. In this case, the calculation unit 42 multiplies the similarity "aaaaa" of the search queries, which have the highest priority, by the coefficient α, multiplies the similarity "bbb" of the tag information, which has the second-highest priority, by the coefficient β of "1000", and multiplies the similarity "ccc" of the feature information, which has the third-highest priority, by the coefficient γ. The calculation unit 42 then takes the sum of these products, "aaaaabbbccc", as the second similarity.

When the first similarity and the second similarity calculated in this way are compared, meta information of a higher-priority type is reflected preferentially in the comparison result. The calculation unit 42 can therefore easily compare multiple types of meta information having different priorities.
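The digit concatenation described for FIG. 5 can be sketched as below, assuming each per-type similarity has already been quantized to a non-negative integer with a fixed number of digits (5, 3, and 3 here, matching "AAAAA", "BBB", and "CCC"). The function name `combine_similarities` and the numeric values are hypothetical.

```python
def combine_similarities(scores, digits):
    """Concatenate integer similarities, listed from highest to lowest
    priority, into one decimal number. Equivalent to multiplying each
    score by a power of ten (alpha, beta, gamma) and summing."""
    total = 0
    for score, width in zip(scores, digits):
        if not 0 <= score < 10 ** width:
            raise ValueError("score does not fit in its digit width")
        total = total * 10 ** width + score
    return total

DIGITS = [5, 3, 3]  # search query, tag information, feature information

# Equal query similarity, so the tag similarity decides the comparison
# even though the first feature similarity is much larger.
first = combine_similarities([12345, 10, 999], DIGITS)
second = combine_similarities([12345, 11, 0], DIGITS)
```

Because a lower-priority field can never overflow into a higher-priority one, a single integer comparison behaves like a lexicographic comparison of the per-type similarities.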

By determining the similarity of the images using the various kinds of meta information registered in the image database 31, the calculation unit 42 calculates the first similarity based on the similarity in meaning or notation between the meta information of the reference image and that of the second image, and calculates the second similarity based on the similarity in meaning or notation between the meta information of the reference image and that of the third image. For example, the calculation unit 42 calculates the first similarity based on the similarity between the search query that the user U entered when selecting the reference image and the search query that the user U entered when selecting the second image, and calculates the second similarity based on the similarity between the search query that the user U entered when selecting the reference image and the search query that the user U entered when selecting the third image.

The calculation unit 42 may calculate the first similarity and the second similarity based on the similarity of the search queries entered when the images were selected, regardless of whether the same user U selected each image. The calculation unit 42 also calculates the first similarity based on the structural similarity and the semantic similarity between the reference image and the second image, and calculates the second similarity based on the structural similarity and the semantic similarity between the reference image and the third image.

The calculation unit 42 may calculate the first similarity and the second similarity based only on the similarity of those search queries whose number of tokens exceeds a predetermined threshold. When learning is performed in stages, the calculation unit 42 also calculates the first similarity and the second similarity with respect to the reference image for the fourth image and the fifth image selected by the selection unit 41.

Returning to FIG. 2, the description continues. The comparison unit 43 compares the first similarity and the second similarity that the calculation unit 42 calculated for each triple, and notifies the generation unit 44 of the comparison result. For example, the comparison unit 43 determines which of the first similarity and the second similarity is larger, and notifies the generation unit 44 of the determination result.

When the first similarity is greater than the second similarity, the generation unit 44 generates the distributed representations so that the distributed representation of the reference image is similar to that of the second image and is not similar to that of the third image. When the second similarity is greater than the first similarity, the generation unit 44 generates the distributed representations so that the distributed representation of the reference image is similar to that of the third image and is not similar to that of the second image. The generation unit 44 then registers the image ID of each image and the generated distributed representation in the distributed representation database 32.

For example, when the first similarity is greater than the second similarity, the generation unit 44 treats the second image in the triple as correct data and the third image as incorrect data. When the second similarity is greater than the first similarity, the generation unit 44 treats the third image in the triple as correct data and the second image as incorrect data. In other words, the generation unit 44 assigns the correct data and the incorrect data based on the result of comparing the first similarity with the second similarity. The generation unit 44 then generates a correct pair and an incorrect pair from each triple, and generates the distributed representation of each image so that the value of L in Expression (1) is maximized.

When generating the distributed representations in stages, the generation unit 44 first generates the distributed representations of the reference image, the fourth image, and the fifth image so that the distributed representation of the reference image is similar to that of the fourth image and is not similar to that of the fifth image. The generation unit 44 then calculates the accuracy of the generated distributed representations. When the calculated accuracy exceeds a predetermined threshold, or when it does not improve for a predetermined period, the generation unit 44 may assign the correct data and the incorrect data based on the result of comparing the first similarity with the second similarity, and may correct the distributed representations based on that assignment. The generation unit 44 may also generate the distributed representations so that the difference between the distributed representation of the reference image and that of the second image is smaller than the difference between the distributed representation of the reference image and that of the third image.

The reception unit 45 receives a search query from the user terminal 100 and identifies the image corresponding to the search query. For example, when the reception unit 45 receives text as the search query, it refers to the image database 31 and selects the image associated with the meta information that best matches the text of the search query. When the reception unit 45 receives an image as the search query, it refers to the image database 31 and searches for the image most similar to the query image. The reception unit 45 may also search for the image corresponding to the search query by using, for example, a predetermined learning model that has learned the semantic and structural similarity between images or between images and text.

The search unit 46 identifies images similar to the search query in ranked form. For example, the search unit 46 acquires from the distributed representation database 32 the distributed representation of the image that the reception unit 45 identified as the search result (hereinafter referred to as the "query image"). The search unit 46 then calculates the distance between the acquired distributed representation and each distributed representation registered in the distributed representation database 32, and identifies a predetermined number of distributed representations in ascending order of distance. The search unit 46 also identifies, from the distributed representation database 32, the images associated with the identified distributed representations, and reads the image data of those images from the image database 31. Finally, the search unit 46 generates content in which the read image data is arranged as a ranking in ascending order of distance from the distributed representation of the query image, and provides the generated content to the user terminal 100.
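The ranking step can be sketched as a nearest-neighbor search over the registered distributed representations. The description does not name a distance metric, so Euclidean distance is assumed here; `rank_by_distance`, the image IDs, and the two-dimensional vectors are hypothetical.

```python
import math

def rank_by_distance(query_vec, embeddings, top_k):
    """Return the top_k image IDs ordered by ascending Euclidean
    distance between query_vec and each registered representation."""
    scored = [(math.dist(query_vec, vec), image_id)
              for image_id, vec in embeddings.items()]
    scored.sort()
    return [image_id for _, image_id in scored[:top_k]]

# Hypothetical contents of the distributed representation database.
embeddings = {"img1": (0.0, 0.0), "img2": (1.0, 0.0), "img3": (5.0, 5.0)}
ranking = rank_by_distance((0.9, 0.0), embeddings, top_k=2)
```

A linear scan like this is adequate for a small database; a larger one would typically rely on an approximate nearest-neighbor index instead.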

[3. Example of the flow of processing executed by the information providing apparatus]
Next, the flow of the generation processing executed by the information providing apparatus 10 is described with reference to FIG. 6. FIG. 6 is a flowchart illustrating an example of the flow of the generation processing executed by the information providing apparatus according to the embodiment. The information providing apparatus 10 can execute the processing illustrated in FIG. 6 in arbitrary units and at arbitrary timing.

First, the information providing apparatus 10 selects a set consisting of a first image, a second image, and a third image (step S101). The information providing apparatus 10 then calculates the first similarity between the meta information of the first image and the meta information of the second image, and the second similarity between the meta information of the first image and the meta information of the third image (step S102).

Next, the information providing apparatus 10 determines whether the first similarity is greater than the second similarity (step S103). If it is (step S103: Yes), the information providing apparatus 10 treats the second image as correct data and the third image as incorrect data (step S104). If the first similarity is not greater than the second similarity (step S103: No), the information providing apparatus 10 treats the second image as incorrect data and the third image as correct data (step S105). The information providing apparatus 10 then generates the distributed representations so that the distributed representation of the first image is similar to that of the correct data and is not similar to that of the incorrect data (step S106), and ends the processing.
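Steps S103 to S106 can be sketched as follows. The label assignment follows the flowchart; the update rule, however, is an assumption: it is one gradient step on a common hinge-style triplet loss, max(0, margin + d(a, p) - d(a, n)) with squared Euclidean distance, used here only as a stand-in and not as the document's Expression (1). All function names, the margin, and the learning rate are hypothetical.

```python
def assign_labels(first_sim, second_sim):
    """Steps S103 to S105: return (correct, incorrect) as image roles."""
    if first_sim > second_sim:
        return "second", "third"
    return "third", "second"

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def triplet_loss(anchor, pos, neg, margin=1.0):
    """Assumed hinge-style triplet loss (not the document's Expression (1))."""
    return max(0.0, margin + sq_dist(anchor, pos) - sq_dist(anchor, neg))

def triplet_step(anchor, pos, neg, margin=1.0, lr=0.1):
    """Step S106 (sketch): one gradient-descent step that pulls the
    correct example toward the anchor and pushes the incorrect one away."""
    if triplet_loss(anchor, pos, neg, margin) == 0.0:
        return anchor, pos, neg  # margin already satisfied
    new_anchor = [a - lr * 2 * (n - p) for a, p, n in zip(anchor, pos, neg)]
    new_pos = [p + lr * 2 * (a - p) for a, p in zip(anchor, pos)]
    new_neg = [n - lr * 2 * (a - n) for a, n in zip(anchor, neg)]
    return new_anchor, new_pos, new_neg

# Hypothetical two-dimensional representations.
a0, p0, n0 = [0.0, 0.0], [1.0, 0.0], [0.5, 0.0]
a1, p1, n1 = triplet_step(a0, p0, n0)
```

Repeating such steps over many triples is one conventional way to obtain representations in which the correct example ends up closer to the reference than the incorrect one.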

[4. Modifications]
The above describes an example of the generation processing and the search processing performed by the information providing apparatus 10. However, the embodiment is not limited to this example. Variations of the processing executed by the information providing apparatus 10 are described below.

[4-1. Target information]
In the example described above, the information providing apparatus 10 generated distributed representations that relatively reflect the semantic and structural similarity between images. However, the embodiment is not limited to this. For example, besides images such as still images and moving images, the information providing apparatus 10 may generate distributed representations that relatively reflect the semantic and structural similarity between arbitrary pieces of information, such as music, movies, news articles, various posts, and web content. In such cases, the information providing apparatus 10 may use as meta information, for example, the composer, lyrics, playing time, and performance date of a piece of music, the director, cast, and running time of a movie, or a summary of a news article or post.

The information providing apparatus 10 may also generate the distributed representation of each piece of information using triples that contain different types of information. For example, for a triple containing an image, a text, and a piece of music, the information providing apparatus 10 may calculate the first similarity between the image and the text and the second similarity between the image and the music, assign the correct data and the incorrect data according to the result of comparing the first similarity with the second similarity, and generate the distributed representation of each piece of information.

In other words, the information providing apparatus 10 calculates the first similarity between first information of an arbitrary type and second information of an arbitrary type, and the second similarity between the first information and third information of an arbitrary type. When the first similarity is greater than the second similarity, the information providing apparatus 10 generates the distributed representations so that the distributed representation of the first information is similar to that of the second information and is not similar to that of the third information. When the second similarity is greater than the first similarity, the information providing apparatus 10 generates the distributed representations so that the distributed representation of the first information is similar to that of the third information and is not similar to that of the second information.

[4-2. Apparatus configuration]
The databases 31 and 32 registered in the storage unit 30 may be held in an external storage server. The information providing apparatus 10 may also be realized by a front-end server that performs the search processing and a back-end server that performs the generation processing. In that case, the reception unit 45 and the search unit 46 illustrated in FIG. 2 are placed in the front-end server, and the selection unit 41, the calculation unit 42, the comparison unit 43, and the generation unit 44 are placed in the back-end server.

[4-3. Result of comparing the first similarity with the second similarity]
In the example described above, the information providing apparatus 10 treated the second information as correct data when the first similarity was greater than the second similarity, and treated the third information as correct data otherwise. However, the embodiment is not limited to this. For example, when the first similarity is equal to the second similarity, the information providing apparatus 10 may reselect the second information and the third information.

For example, the information providing apparatus 10 treats the second information as correct data when the first similarity is greater than the second similarity, and treats the third information as correct data when the second similarity is greater than the first similarity. When the first similarity is equal to the second similarity, the information providing apparatus 10 randomly reselects new second information and third information. In doing so, the information providing apparatus 10 may, for example, randomly select second information and third information from the same field as the first information or a similar field, or may reselect only one of the second information and the third information. Alternatively, when the first similarity is equal to the second similarity, the information providing apparatus 10 may decide probabilistically whether to restrict the field used for selection and, if it decides to restrict it, may reselect at least one of the second information and the third information from information belonging to the same field as the first information or a similar field, that is, from information whose similarity is higher than a predetermined threshold.
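The tie handling above can be sketched as a simple resampling loop. This sketch covers only the plain random reselection of both items; the field restriction and the probabilistic decision are omitted. The function `choose_labels`, the retry limit `max_tries`, and the toy similarity table are hypothetical.

```python
import random

def choose_labels(ref, second, third, similarity, candidates, rng, max_tries=100):
    """Return (correct, incorrect); on a tie, randomly reselect the
    second and third items from candidates (which should exclude ref).
    Returns None if every attempt ends in a tie."""
    for _ in range(max_tries):
        s1 = similarity(ref, second)
        s2 = similarity(ref, third)
        if s1 > s2:
            return second, third
        if s2 > s1:
            return third, second
        second, third = rng.sample(candidates, 2)
    return None

# Toy similarity: "x" and "y" tie, while "z" is more similar to the reference.
sim = lambda ref, other: {"x": 1, "y": 1, "z": 2}[other]
rng = random.Random(0)
labels = choose_labels("r", "x", "y", sim, ["x", "y", "z"], rng)
```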

[4-4. Others]
Of the processes described in the above embodiment, all or part of the processes described as being performed automatically can be performed manually, and conversely, all or part of the processes described as being performed manually can be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and in the drawings can be changed as desired unless otherwise specified. For example, the various kinds of information shown in each drawing are not limited to the illustrated information.

The components of each illustrated apparatus are functional and conceptual, and need not be physically configured as illustrated. That is, the specific form in which the apparatuses are distributed or integrated is not limited to the illustrated form, and all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The embodiments described above can be combined as appropriate to the extent that their processing contents do not contradict each other.

[4-5. Program]
The information providing apparatus 10 according to the embodiment described above is realized, for example, by a computer 1000 configured as illustrated in FIG. 7. FIG. 7 is a diagram illustrating an example of a hardware configuration. The computer 1000 is connected to an output device 1010 and an input device 1020, and has an arithmetic device 1030, a primary storage device 1040, a secondary storage device 1050, an output IF (Interface) 1060, an input IF 1070, and a network IF 1080 connected to one another by a bus 1090.

The arithmetic device 1030 operates based on programs stored in the primary storage device 1040 or the secondary storage device 1050, programs read from the input device 1020, and the like, and executes various kinds of processing. The primary storage device 1040 is a memory device, such as a RAM, that temporarily stores data used by the arithmetic device 1030 for various calculations. The secondary storage device 1050 is a storage device in which data used by the arithmetic device 1030 for various calculations and various databases are registered, and is realized by a ROM (Read Only Memory), an HDD (Hard Disk Drive), a flash memory, or the like.

The output IF 1060 is an interface for transmitting information to be output to the output device 1010, such as a monitor or a printer, that outputs various kinds of information. It is realized, for example, by a connector conforming to a standard such as USB (Universal Serial Bus), DVI (Digital Visual Interface), or HDMI (registered trademark) (High Definition Multimedia Interface). The input IF 1070 is an interface for receiving information from various input devices 1020 such as a mouse, a keyboard, and a scanner, and is realized, for example, by USB.

The input device 1020 may be, for example, a device that reads information from an optical recording medium such as a CD (Compact Disc), a DVD (Digital Versatile Disc), or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, a semiconductor memory, or the like. The input device 1020 may also be an external storage medium such as a USB memory.

The network IF 1080 receives data from other devices via the network N and sends it to the arithmetic device 1030, and transmits data generated by the arithmetic device 1030 to other devices via the network N.

The arithmetic device 1030 controls the output device 1010 and the input device 1020 via the output IF 1060 and the input IF 1070. For example, the arithmetic device 1030 loads a program from the input device 1020 or the secondary storage device 1050 onto the primary storage device 1040 and executes the loaded program.

For example, when the computer 1000 functions as the information providing apparatus 10, the arithmetic device 1030 of the computer 1000 realizes the functions of the control unit 40 by executing a program loaded onto the primary storage device 1040.

[5. Effects]
As described above, the information providing apparatus 10 calculates a first similarity, which is the similarity between first information and second information, and a second similarity, which is the similarity between the first information and third information. Based on the relationship between the first similarity and the second similarity, the information providing apparatus 10 then generates a distributed representation of each image such that one of the distributed representations of the second image and the third image is similar to the distributed representation of the first image and the other is not. For example, when the first similarity is greater than the second similarity, the information providing apparatus 10 generates each distributed representation such that the distributed representation of the first information and that of the second information are similar, and the distributed representation of the first information and that of the third information are not similar. Conversely, when the second similarity is greater than the first similarity, the information providing apparatus 10 generates each distributed representation such that the distributed representation of the first information and that of the third information are similar, and the distributed representation of the first information and that of the second information are not similar.
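The selection rule described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names `order_triplet` and `triplet_loss`, the squared Euclidean distance, the margin value, and the handling of ties are all hypothetical choices introduced here.

```python
from typing import Optional, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two distributed representations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def order_triplet(
    first_similarity: float,
    second_similarity: float,
    second: Vector,
    third: Vector,
) -> Optional[Tuple[Vector, Vector]]:
    """Pick (positive, negative) by comparing the two similarities.

    If the first similarity is larger, the second information is the
    positive example and the third is the negative; if the second
    similarity is larger, the roles are swapped. Ties are skipped,
    which is an assumption (the text only covers the two strict cases).
    """
    if first_similarity > second_similarity:
        return second, third
    if second_similarity > first_similarity:
        return third, second
    return None


def triplet_loss(
    anchor: Vector, positive: Vector, negative: Vector, margin: float = 1.0
) -> float:
    """Hinge-style triplet loss; zero once the negative is farther from the
    anchor than the positive by at least `margin` (assumed value)."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)
```

A training loop would call `order_triplet` for each triple and minimize `triplet_loss` over the resulting (anchor, positive, negative) assignments, so that the relative order of the similarities, rather than a fixed threshold, decides which representation is pulled toward the anchor.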

As a result of such processing, the information providing apparatus 10 can map the relative similarity of each piece of information into the distributed representation space, and can therefore generate a distributed representation space in which the use of a distance function satisfying the distance axioms is guaranteed. That is, the information providing apparatus 10 can reflect the similarity of information in distances in the distributed representation space.

The information providing apparatus 10 also calculates the first similarity based on the similarity between meta information associated with the first information and meta information associated with the second information, and calculates the second similarity based on the similarity between the meta information associated with the first information and meta information associated with the third information. The information providing apparatus 10 can therefore reflect the semantic similarity of each piece of information in the distributed representations.

The information providing apparatus 10 also calculates the first similarity based on the per-type similarities between a plurality of types of information associated with the first information and a plurality of types of information associated with the second information, and calculates the second similarity based on the per-type similarities between the plurality of types of information associated with the first information and a plurality of types of information associated with the third information. The information providing apparatus 10 can therefore reflect similarity from various viewpoints in the distributed representations in an integrated manner.

The information providing apparatus 10 also calculates a first similarity and a second similarity having a predetermined number of digits by concatenating the per-type similarities in descending order of priority. The information providing apparatus 10 can therefore determine the relative similarity of each piece of information while taking the hierarchical priority of each type into account.
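One way to picture the concatenation of per-type similarities is the sketch below. It assumes, purely for illustration, that each per-type similarity lies in [0, 1], that it is quantized to a fixed number of decimal digits, and that the types are named `"query"`, `"title"`, and `"tag"`; none of these details come from the text. Because higher-priority types occupy the more significant digits, comparing two combined values compares the types in priority order.

```python
from typing import Mapping, Sequence


def combine_similarities(
    similarities: Mapping[str, float],
    priority: Sequence[str],
    digits_per_type: int = 2,
) -> int:
    """Concatenate per-type similarities into one fixed-width integer.

    `priority` lists the type names from highest to lowest priority.
    Each similarity is clamped to [0, 1] and truncated to
    `digits_per_type` decimal digits (an assumed quantization).
    """
    scale = 10 ** digits_per_type - 1  # e.g. 99 for two digits
    combined = 0
    for kind in priority:
        value = min(max(similarities.get(kind, 0.0), 0.0), 1.0)
        quantized = int(value * scale)
        # Shift earlier (higher-priority) digits left, then append.
        combined = combined * 10 ** digits_per_type + quantized
    return combined


priority = ["query", "title", "tag"]  # hypothetical type names
first_similarity = combine_similarities(
    {"query": 0.5, "title": 0.1, "tag": 0.9}, priority
)
second_similarity = combine_similarities(
    {"query": 0.4, "title": 1.0, "tag": 1.0}, priority
)
# The highest-priority type dominates the comparison.
print(first_similarity > second_similarity)
```

In this example the first combined value is larger even though the second has higher title and tag similarities, because the query similarity has the highest priority.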

The information providing apparatus 10 also calculates the first similarity based on the similarity in meaning or notation between information associated with the first information and information associated with the second information, and calculates the second similarity based on the similarity in meaning or notation between the information associated with the first information and information associated with the third information. The information providing apparatus 10 can therefore reflect the semantic and structural similarity of each piece of information in the distributed representations.

The information providing apparatus 10 also calculates the first similarity based on the similarity between the search query that the user U had input when the user U selected the first information and the search query that the user U had input when the user U selected the second information. Likewise, the information providing apparatus 10 calculates the second similarity based on the similarity between the search query that the user U had input when the user U selected the first information and the search query that the user U had input when the user U selected the third information. The information providing apparatus 10 can therefore reflect the similarity of the search intent of the user U in the distributed representations.

The information providing apparatus 10 also calculates the first similarity and the second similarity based on the similarity of those search queries whose number of tokens exceeds a predetermined threshold. The information providing apparatus 10 can therefore reflect the search intent of the user U in the distributed representations more accurately.
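The token-count filter can be illustrated with the sketch below. The whitespace tokenizer, the Jaccard measure, and the default threshold are assumptions made for the example; the text does not specify how queries are tokenized or how query similarity is measured.

```python
from typing import List, Optional


def tokenize(query: str) -> List[str]:
    """Split a search query into lowercase tokens (assumed tokenizer)."""
    return query.lower().split()


def query_similarity(
    query_a: str, query_b: str, min_tokens: int = 2
) -> Optional[float]:
    """Jaccard similarity of two queries, or None if either is too short.

    Only queries whose token count exceeds `min_tokens` are used, on the
    assumption that longer queries express the search intent more
    specifically.
    """
    tokens_a, tokens_b = tokenize(query_a), tokenize(query_b)
    if len(tokens_a) <= min_tokens or len(tokens_b) <= min_tokens:
        return None
    set_a, set_b = set(tokens_a), set(tokens_b)
    return len(set_a & set_b) / len(set_a | set_b)
```

For instance, `query_similarity("red running shoes", "red running socks")` shares two of four distinct tokens, while a one-word query such as `"shoes"` is excluded and yields `None`.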

The information providing apparatus 10 also calculates the first similarity based on the structural similarity and the semantic similarity between the first information and the second information, and calculates the second similarity based on the structural similarity and the semantic similarity between the first information and the third information. The information providing apparatus 10 can therefore reflect both the semantic similarity and the structural similarity of each piece of information in the distributed representations.

The information providing apparatus 10 also selects, as the second information and the third information, information whose similarity to the first information exceeds a predetermined threshold, and calculates the first similarity and the second similarity from the first information and the selected second and third information. The information providing apparatus 10 can therefore reflect, in the distributed representations, the relative similarity among pieces of information that are already similar to one another.

The information providing apparatus 10 also selects fourth information whose similarity to the first information is equal to or higher than a first threshold and fifth information whose similarity to the first information is equal to or lower than a second threshold. The information providing apparatus 10 further calculates the first similarity and the second similarity for every set of three pieces of information taken from the first to fifth information. The information providing apparatus 10 then generates distributed representations of the first information, the fourth information, and the fifth information such that the distributed representation of the first information and that of the fourth information are similar and the distributed representation of the first information and that of the fifth information are not similar, and thereafter generates distributed representations of the first to fifth information included in each set based on the first similarity and the second similarity calculated for that set. The information providing apparatus 10 can therefore learn distributed representations efficiently.
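The two-stage procedure in this paragraph, a coarse threshold-based pair followed by relative comparisons over every set of three items, might be sketched as follows. The function names, the choice of the first candidate that passes each threshold, the use of the first element of each combination as the anchor, and the skipping of ties are assumptions for illustration only.

```python
from itertools import combinations
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")
Similarity = Callable[[T, T], float]


def select_coarse_pair(
    anchor: T,
    candidates: Sequence[T],
    similarity: Similarity,
    upper: float,
    lower: float,
) -> Optional[Tuple[T, T]]:
    """Stage 1: pick a clearly similar item (similarity >= upper) and a
    clearly dissimilar item (similarity <= lower) for the anchor."""
    positives = [c for c in candidates if similarity(anchor, c) >= upper]
    negatives = [c for c in candidates if similarity(anchor, c) <= lower]
    if not positives or not negatives:
        return None
    return positives[0], negatives[0]


def fine_triplets(
    items: Sequence[T], similarity: Similarity
) -> List[Tuple[T, T, T]]:
    """Stage 2: for every set of three items, order the two non-anchor
    items by their similarity to the anchor.

    Returns (anchor, positive, negative) tuples; ties are skipped.
    """
    triplets = []
    for anchor, b, c in combinations(items, 3):
        s_b, s_c = similarity(anchor, b), similarity(anchor, c)
        if s_b > s_c:
            triplets.append((anchor, b, c))
        elif s_c > s_b:
            triplets.append((anchor, c, b))
    return triplets
```

Training first on the coarse pair and then on the fine triplets gives the model an easy separation to learn before it has to resolve the smaller differences among similar items.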

The information providing apparatus 10 also generates each distributed representation such that the difference between the distributed representation of the first information and that of the second information is smaller than the difference between the distributed representation of the first information and that of the third information. The information providing apparatus 10 can therefore generate the distributed representations appropriately.

The information providing apparatus 10 also calculates a first similarity between first information that is an image and second information that is an image, and a second similarity between the first information and third information that is an image. The information providing apparatus 10 can therefore reflect the relative similarity between images in the distributed representation of each image.

Some embodiments of the present application have been described above in detail with reference to the drawings, but they are examples. The present invention can be carried out in other forms with various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention section.

The term "unit" (section, module, unit) used above can be read as "means", "circuit", or the like. For example, the generation unit can be read as generation means or a generation circuit.

DESCRIPTION OF SYMBOLS
10 Information providing apparatus
20 Communication unit
30 Storage unit
31 Image database
32 Distributed representation database
40 Control unit
41 Selection unit
42 Calculation unit
43 Comparison unit
44 Generation unit
45 Reception unit
46 Search unit
100 User terminal

Claims

A generation device comprising: a calculation unit that calculates a first similarity, which is a similarity between first information and second information, and a second similarity, which is a similarity between the first information and third information; and
a generation unit that generates each distributed representation, based on a relationship between the first similarity and the second similarity, such that one of the distributed representations of the second information and the third information is similar to the distributed representation of the first information and the other is not similar.

The generation device according to claim 1, wherein, when the first similarity is greater than the second similarity, the generation unit generates each distributed representation such that the distributed representation of the first information and the distributed representation of the second information are similar and the distributed representation of the first information and the distributed representation of the third information are not similar.

The generation device according to claim 1, wherein, when the second similarity is greater than the first similarity, the generation unit generates each distributed representation such that the distributed representation of the first information and the distributed representation of the third information are similar and the distributed representation of the first information and the distributed representation of the second information are not similar.

The generation device according to any one of claims 1 to 3, wherein the calculation unit calculates the first similarity based on a similarity between information associated with the first information and information associated with the second information, and calculates the second similarity based on a similarity between the information associated with the first information and information associated with the third information.

The generation device according to claim 4, wherein the calculation unit calculates the first similarity based on per-type similarities between a plurality of types of information associated with the first information and a plurality of types of information associated with the second information, and calculates the second similarity based on per-type similarities between the plurality of types of information associated with the first information and a plurality of types of information associated with the third information.

The generation device according to claim 5, wherein the calculation unit calculates the first similarity and the second similarity having a predetermined number of digits by concatenating the per-type similarities in descending order of priority.

The calculation unit calculates the first similarity based on a similarity in meaning or notation between the information associated with the first information and the information associated with the second information, and calculates the second similarity based on a similarity in meaning or notation between the information associated with the first information and the information associated with the third information. The generating device described in 1.

The generation device according to any one of claims 4 to 7, wherein the calculation unit calculates the first similarity based on a similarity between a search query input by a user when the first information was selected by the user and a search query input by the user when the second information was selected by the user, and calculates the second similarity based on a similarity between the search query input by the user when the first information was selected by the user and a search query input by the user when the third information was selected by the user.

The generation device according to claim 8, wherein the calculation unit calculates the first similarity and the second similarity based on the similarity of those search queries whose number of tokens exceeds a predetermined threshold.

The generation device according to claim 1, wherein the calculation unit calculates the first similarity based on a structural similarity between the first information and the second information and a semantic similarity between the first information and the second information, and calculates the second similarity based on a structural similarity between the first information and the third information and a semantic similarity between the first information and the third information.

The generation device according to any one of claims 1 to 10, further comprising a first selection unit that selects, as the second information and the third information, information whose similarity to the first information exceeds a predetermined threshold,
wherein the calculation unit calculates the first similarity and the second similarity from the first information and from the second information and the third information selected by the first selection unit.

The generation device according to claim 1, further comprising a second selection unit that selects fourth information whose similarity to the first information is equal to or higher than a first threshold and fifth information whose similarity to the first information is equal to or lower than a second threshold,
wherein the calculation unit calculates the first similarity and the second similarity for every set of three pieces of information among the first information to the fifth information, and
the generation unit generates distributed representations of the first information, the fourth information, and the fifth information such that the distributed representation of the first information is similar to the distributed representation of the fourth information and the distributed representation of the first information is not similar to the distributed representation of the fifth information, and then generates, based on the first similarity and the second similarity calculated by the calculation unit for each set, distributed representations of the first information to the fifth information included in the set.

The generation device according to any one of claims 1 to 12, wherein the generation unit generates each distributed representation such that the difference between the distributed representation of the first information and the distributed representation of the second information is smaller than the difference between the distributed representation of the first information and the distributed representation of the third information.

The generation device according to any one of claims 1 to 13, wherein the calculation unit calculates the first similarity between the first information, which is an image, and the second information, which is an image, and the second similarity between the first information and the third information, which is an image.

A generation method executed by a generation device, the method comprising:
a calculation step of calculating a first similarity, which is a similarity between first information and second information, and a second similarity, which is a similarity between the first information and third information; and
a generation step of generating each distributed representation, based on a relationship between the first similarity and the second similarity, such that one of the distributed representations of the second information and the third information is similar to the distributed representation of the first information and the other is not similar.

A generation program causing a computer to execute: a calculation procedure of calculating a first similarity, which is a similarity between first information and second information, and a second similarity, which is a similarity between the first information and third information; and
a generation procedure of generating each distributed representation, based on a relationship between the first similarity and the second similarity, such that one of the distributed representations of the second information and the third information is similar to the distributed representation of the first information and the other is not similar.