JP2021009574A - Information processing device, information processing method, and information processing program

Info

Publication number: JP2021009574A
Application number: JP2019123238A
Authority: JP
Inventors: Taisuke Mori; Takamasa Shibukawa; Tomohiro Ogawa; Yukihiro Terada; Tomomi Tabata
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2019-07-01
Filing date: 2019-07-01
Publication date: 2021-01-28
Anticipated expiration: 2039-07-01
Also published as: JP7177013B2

Abstract

To obtain useful information. SOLUTION: An information processing device comprises a reception unit, a generation unit, and an output unit. The reception unit accepts a plurality of pieces of target information indicating classification targets, and a designated number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified. The generation unit generates designated cluster information on the clusters generated by classifying the plurality of pieces of target information into the designated number of clusters. The output unit outputs the designated cluster information generated by the generation unit. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing device, an information processing method, and an information processing program.

In recent years, with the rapid spread of the Internet, techniques for analysis using various kinds of information on the Internet have been provided. For example, a technique has been proposed that extracts information on needs for a target provided by a given business operator, based on search queries entered by users.

Japanese Unexamined Patent Application Publication No. 2019-32776

However, the above prior art does not always yield useful information. For example, the above prior art merely extracts information on needs for a target provided by a given business operator, so it cannot be said to extract information on latent needs for that target.

The present application has been made in view of the above, and its object is to provide an information processing device, an information processing method, and an information processing program capable of providing useful information.

The information processing device according to the present application includes: a reception unit that accepts a plurality of pieces of target information indicating classification targets and a designated number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified; a generation unit that generates designated cluster information on the clusters generated by classifying the plurality of pieces of target information into the designated number of clusters; and an output unit that outputs the designated cluster information generated by the generation unit.

According to one aspect of the embodiment, useful information can be obtained.

FIG. 1 is a diagram showing an example of information processing according to the embodiment.
FIG. 2 is a diagram showing a configuration example of the information processing system according to the embodiment.
FIG. 3 is a diagram showing a configuration example of the information processing device according to the embodiment.
FIG. 4 is a diagram showing an example of the query information storage unit according to the embodiment.
FIG. 5 is a diagram showing an example of the vector information storage unit according to the embodiment.
FIG. 6 is a diagram showing an example of the cluster information storage unit according to the embodiment.
FIG. 7 is a diagram showing an example of the model information storage unit according to the embodiment.
FIG. 8 is a diagram showing a configuration example of the terminal device according to the embodiment.
FIG. 9 is a flowchart showing an information processing procedure according to the embodiment.
FIG. 10 is a diagram showing an example of the generation process of the first model according to the embodiment.
FIG. 11 is a diagram showing an example of the generation process of the first model according to the embodiment.
FIG. 12 is a diagram showing a configuration example of the generation device according to the embodiment.
FIG. 13 is a diagram showing an example of the query information storage unit according to the embodiment.
FIG. 14 is a diagram showing an example of the vector information storage unit according to the embodiment.
FIG. 15 is a diagram showing an example of the model information storage unit according to the embodiment.
FIG. 16 is a diagram showing an example of the first model according to the embodiment.
FIG. 17 is a flowchart showing a generation processing procedure of the first model according to the embodiment.
FIG. 18 is a diagram showing an example of the hardware configuration of a computer that executes the program.

Hereinafter, modes for carrying out the information processing device, the information processing method, and the information processing program according to the present application (hereinafter referred to as "embodiments") will be described in detail with reference to the drawings. These embodiments do not limit the information processing device, the information processing method, or the information processing program according to the present application. In the following embodiments, the same parts are given the same reference numerals, and duplicate descriptions are omitted.

[1. Example of information processing]
First, an example of information processing according to the embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of information processing according to the embodiment. The information processing according to the embodiment is performed by the information processing device 100 shown in FIG. 1. The information processing device 100 shown in FIG. 1 is a server device that provides a query analysis service to clients (companies and the like). Here, a "query" refers to a word or phrase (character information) used by a user to make an inquiry or request to a database. For example, "queries" include search queries, which are words or phrases that users use for searching. Note that a "query" in the present invention may also include mere keywords or phrases, regardless of how they were used by a user.

Before describing FIG. 1, the configuration of the information processing system according to the embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram showing a configuration example of the information processing system according to the embodiment. As shown in FIG. 2, the information processing system 1 includes a terminal device 10, a search server 20, a generation device 50, and an information processing device 100. The terminal device 10, the search server 20, the generation device 50, and the information processing device 100 are communicably connected, by wire or wirelessly, via a predetermined network N. The information processing system 1 shown in FIG. 2 may include any number of terminal devices 10, any number of search servers 20, any number of generation devices 50, and any number of information processing devices 100.

The terminal device 10 is an information processing device used by a user who is a representative of a client (a company or the like). The terminal device 10 is realized by, for example, a smartphone, a tablet terminal, a notebook PC (Personal Computer), a mobile phone, or a PDA (Personal Digital Assistant). In the example shown in FIG. 1, the terminal device 10 is a notebook PC. In the following, the terminal device 10 may be equated with the user. That is, in the following, "user" can also be read as "terminal device 10".

In accordance with operations by the user U11, who is a representative of a client (a company or the like), the terminal device 10 transmits to the information processing device 100 a plurality of queries indicating classification targets and a designated number of clusters, which is the number of clusters into which the plurality of queries are to be classified. The terminal device 10 also receives from the information processing device 100 designated cluster information on the clusters generated by classifying the plurality of queries into the designated number of clusters. The terminal device 10 displays the received designated cluster information on its screen. In the example shown in FIG. 1, the terminal device 10 displays on its screen the partial content C12, which presents the received designated cluster information.

The search server 20 is a server device that provides a search service. For example, the search service provided by the search server 20 is a general search service that can search for any kind of information. The search server 20 stores information on search queries entered by users. Specifically, the search server 20 stores information on users' search histories. In response to a request from the generation device 50, the search server 20 also transmits information on search queries entered by users to the generation device 50.

The generation device 50 is a server device that generates the first model M1. The generation device 50 generates the first model M1 by executing processing described later. The first model M1 is a model that, when character information (for example, a search query) is input, outputs a distributed representation of that character information. The distributed representation may be a vector. Here, the distributed representation of character information output from the first model M1 includes feature information indicating the user's search intent when that character information is entered as a search query. Moreover, similarity between the distributed representation of given character information output from the first model M1 and the distributed representation of other character information means that the user's search intent when the given character information is entered as a search query is similar to the user's search intent when the other character information is entered as a search query. Details of the generation of the first model by the generation device 50 will be described later.

The information processing device 100 is a server device that provides clients, such as companies, with a query analysis service that analyzes general users' search trends regarding a keyword (character information) indicating an analysis target for which the client desires market analysis. In the example shown in FIG. 1, the information processing device 100 accepts a plurality of queries indicating classification targets and a designated number of clusters, which is the number of clusters into which the plurality of queries are to be classified. The information processing device 100 also generates designated cluster information on the clusters generated by classifying the plurality of queries into the designated number of clusters. The information processing device 100 then outputs the generated designated cluster information.

In general, a query analysis service accepts, from a client such as a company, a keyword (character information) indicating an analysis target for which the client desires market analysis. The query analysis service then extracts, from a huge database of queries, information on queries similar to the specified keyword, and provides the extracted information to the client. For example, in order to grasp the similarity between the specified keyword and a query numerically, the query analysis service converts the character string corresponding to the query and the keyword specified by the client into distributed representations. The query analysis service then calculates the similarity between the distributed representation of the specified keyword and the distributed representation of the character string corresponding to the query, and thereby extracts information on queries similar to the specified keyword.
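The similarity calculation described above can be illustrated with a minimal sketch. The text does not say which similarity measure is used; cosine similarity is assumed here only because it is a common choice for comparing vectors. The function names `cosine_similarity` and `most_similar` and the toy three-dimensional vectors are hypothetical and stand in for the distributed representations a real service would compute.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(keyword_vec, query_vecs, top_n=3):
    """Return the top_n query strings ranked by similarity to the keyword."""
    scored = [
        (cosine_similarity(keyword_vec, vec), text)
        for text, vec in query_vecs.items()
    ]
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_n]]


# Hypothetical toy data: real distributed representations would have
# hundreds or thousands of dimensions.
keyword_vec = [1.0, 0.0, 0.0]
query_vecs = {
    "query A": [0.9, 0.1, 0.0],
    "query B": [0.0, 1.0, 0.0],
    "query C": [0.7, 0.7, 0.0],
}
ranked = most_similar(keyword_vec, query_vecs, top_n=2)
print(ranked)
```

With these toy vectors, "query A" points almost in the same direction as the keyword and "query B" is orthogonal to it, so the ranking reflects how closely each query's vector aligns with the keyword's.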

Here, because the database held by a query analysis service contains a huge number of queries, the number of distributed representations obtained by converting the corresponding character strings is also huge. In addition, a distributed representation obtained by converting a character string is generally a high-dimensional vector (for example, a vector with hundreds or thousands of dimensions). That is, if the device providing the query analysis service simply extracts and provides information on the analysis results for queries similar to the specified keyword, it is difficult for the client receiving that information to obtain anything useful from it. For example, when the information is provided as a large number of high-dimensional distributed representations mapped into the distributed representation space, or as an enumeration of a huge amount of data showing the similarities between distributed representations, the amount of information is so large that the client cannot grasp the analysis results at a glance. It is therefore difficult for the client to obtain useful information from what is provided.

Therefore, the information processing device 100 according to the present invention accepts a plurality of pieces of target information indicating classification targets and a designated number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified. The information processing device 100 also generates designated cluster information on the clusters generated by classifying the plurality of pieces of target information into the designated number of clusters. The information processing device 100 then outputs the generated designated cluster information. Suppose, for example, that the information processing device 100 according to the present invention accepts 100 pieces of target information indicating classification targets and a designated number of clusters of "5". In this case, the information processing device 100 can divide the 100 pieces of target information into five clusters, each consisting of target information with similar features. That is, the information processing device 100 can extract, from the 100 pieces of target information, five pieces of cluster information that can be regarded as a summary of the features of the 100 pieces of target information, and can provide the features of the 100 pieces of target information summarized as those five pieces of cluster information. In this way, the information processing device 100 according to the present invention can provide the features of a large number of pieces of target information as a summary with a small amount of information. The information processing device 100 according to the present invention therefore makes it possible to obtain useful information.

Next, the flow of information processing will be described with reference to FIG. 1. In FIG. 1, the information processing device 100 transmits the partial content C11 of the content C1 to the terminal device 10 in response to a request from the terminal device 10. Specifically, the information processing device 100 transmits to the terminal device 10 the partial content C11, which includes an input field F11 in which the number of clusters can be entered, an input field F12 in which a plurality of queries can be entered, and a send button B11 for transmitting the information entered in the input fields to the information processing device 100.

The terminal device 10 receives the partial content C11 of the content C1 from the information processing device 100. Upon receiving the partial content C11, the terminal device 10 displays it on its screen.

The user U11 of the terminal device 10 performs an operation of entering the number of clusters in the input field F11 included in the partial content C11 displayed on the screen of the terminal device 10. In the example shown in FIG. 1, the user U11 enters the number of clusters "3" (number of clusters CN) in the input field F11.

The user U11 of the terminal device 10 also performs an operation of entering a plurality of queries in the input field F12 included in the partial content C11 displayed on the screen of the terminal device 10. In the example shown in FIG. 1, the user U11 enters the 12 queries Q1-1 to Q1-12 in the input field F12. Specifically, the user U11 enters character strings separated by a delimiter in the input field F12, each character string being one query. For example, the user U11, who is a representative of automobile manufacturer #1, enters in the input field F12 six queries indicating the names of six automobiles that are products of the user's own company (automobile manufacturer #1): "vehicle type T11" (query Q1-1), "vehicle type T12" (query Q1-2), "vehicle type T13" (query Q1-3), "vehicle type T14" (query Q1-4), "vehicle type T15" (query Q1-5), and "vehicle type T16" (query Q1-6). The user U11 also enters in the input field F12 six queries indicating the names of six automobiles that are products of a competitor, automobile manufacturer #2: "vehicle type T21" (query Q1-7), "vehicle type T22" (query Q1-8), "vehicle type T23" (query Q1-9), "vehicle type T24" (query Q1-10), "vehicle type T25" (query Q1-11), and "vehicle type T26" (query Q1-12).
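As an illustration of how delimiter-separated input like this might be turned into a request, the following is a minimal sketch. The description does not specify the delimiter or any data format, so the delimiters (comma and newline), the function names `parse_queries` and `build_request`, and the dictionary keys `cluster_count` and `queries` are all assumptions made for this example.

```python
def parse_queries(raw, delimiters=(",", "\n")):
    """Split the raw text of the query input field into individual queries.

    The delimiters are an assumption; the description only says that the
    character strings are separated by a delimiter.
    """
    normalized = raw
    for delimiter in delimiters[1:]:
        normalized = normalized.replace(delimiter, delimiters[0])
    parts = normalized.split(delimiters[0])
    # Drop surrounding whitespace and empty entries (e.g. a trailing comma).
    return [part.strip() for part in parts if part.strip()]


def build_request(cluster_count, raw_queries):
    """Bundle the designated number of clusters and the parsed queries."""
    queries = parse_queries(raw_queries)
    if not 1 <= cluster_count <= len(queries):
        raise ValueError(
            "the number of clusters must be between 1 and the number of queries"
        )
    return {"cluster_count": cluster_count, "queries": queries}


# Hypothetical field contents: three queries, two delimiter styles.
request = build_request(2, "vehicle type T11, vehicle type T12\nvehicle type T13")
print(request)
```

Rejecting a cluster count larger than the number of queries is a design choice of this sketch: a clustering step cannot produce more non-empty clusters than there are items to classify.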

Subsequently, the user U11 of the terminal device 10 performs an operation of selecting the send button B11 included in the partial content C11 displayed on the screen of the terminal device 10. When the send button B11 is selected in accordance with the operation of the user U11, the terminal device 10 transmits to the information processing device 100 the number of clusters "3" entered in the input field F11 and the 12 queries Q1-1 to Q1-12 entered in the input field F12.

The information processing device 100 accepts the number of clusters "3" and the 12 queries Q1-1 to Q1-12 from the user U11. Specifically, the information processing device 100 receives the number of clusters "3" and the 12 queries Q1-1 to Q1-12 from the terminal device 10.

Upon accepting the number of clusters "3" and the 12 queries Q1-1 to Q1-12, the information processing device 100 acquires the distributed representations QV1-1 to QV1-12 (see FIG. 5) of the queries Q1-1 to Q1-12, generated using the first model M1. Having acquired the distributed representations QV1-1 to QV1-12, the information processing device 100 then classifies them into three clusters (the designated number of clusters, "3") using the k-means method. Note that the information processing device 100 is not limited to the k-means method and may use any clustering method, as long as the acquired distributed representations QV1-1 to QV1-12 can be classified into three clusters.
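The classification step can be sketched with a small, dependency-free k-means (Lloyd's algorithm). This is an illustrative sketch, not the implementation used in the embodiment: the description does not specify how centroids are initialized, so a deterministic farthest-first initialization is assumed here, and the six two-dimensional vectors are hypothetical stand-ins for the high-dimensional distributed representations.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def farthest_first_centroids(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Assign each point to one of k clusters; returns one label per point."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy vectors forming three well-separated groups.
vectors = [
    [0.0, 0.1], [0.1, 0.0],
    [5.0, 5.1], [5.1, 5.0],
    [0.0, 9.0], [0.1, 9.1],
]
labels = kmeans(vectors, k=3)
print(labels)
```

The returned list holds one cluster label per input vector, in input order, which makes it straightforward to carry each label back to the query that the vector represents.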

By classifying the distributed representations QV1-1 to QV1-12 corresponding to the queries Q1-1 to Q1-12 into three clusters, the information processing device 100 also classifies the query corresponding to each distributed representation into those three clusters. In this way, the information processing device 100 generates cluster information on the cluster into which each query is classified.

For example, the information processing device 100 classifies the distributed representations QV1-1, QV1-2, QV1-3, QV1-7, and QV1-8 into one cluster (cluster CL1). Because the distributed representation QV1-1 has been classified into cluster CL1, the information processing device 100 classifies the corresponding query Q1-1 into cluster CL1. Likewise, because the distributed representation QV1-2 has been classified into cluster CL1, the information processing device 100 classifies the corresponding query Q1-2 into cluster CL1. Similarly, because the distributed representations QV1-3, QV1-7, and QV1-8 have been classified into cluster CL1, the information processing device 100 classifies the corresponding queries Q1-3, Q1-7, and Q1-8 into cluster CL1. In this way, the information processing device 100 generates cluster information on cluster CL1, into which the queries Q1-1, Q1-2, Q1-3, Q1-7, and Q1-8 are classified.
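Carrying cluster labels from the distributed representations back to the queries can be sketched as a simple grouping step. The function name `build_cluster_info`, the output layout, and the label values are hypothetical. Only the membership of cluster CL1 follows the example in the text; the assignment of the remaining seven queries is made up for illustration.

```python
from collections import defaultdict


def build_cluster_info(queries, labels):
    """Group queries by the cluster label of their vector.

    queries[i] and labels[i] must refer to the same item. Clusters are
    named CL1, CL2, ... in ascending order of label.
    """
    if len(queries) != len(labels):
        raise ValueError("queries and labels must have the same length")
    members_by_label = defaultdict(list)
    for query, label in zip(queries, labels):
        members_by_label[label].append(query)
    return {
        f"CL{index + 1}": members
        for index, (_, members) in enumerate(sorted(members_by_label.items()))
    }


queries = [f"Q1-{n}" for n in range(1, 13)]
# Label 0 matches cluster CL1 in the text (Q1-1, Q1-2, Q1-3, Q1-7, Q1-8).
# Labels 1 and 2 are hypothetical assignments for the other queries.
labels = [0, 0, 0, 1, 1, 1, 0, 0, 2, 2, 2, 2]
cluster_info = build_cluster_info(queries, labels)
print(cluster_info["CL1"])
```

Because each label is positionally tied to its query, no lookup between vectors and queries is needed; keeping the two lists in the same order is the only requirement.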

Here, the distributed representation of character information generated using the first model M1 includes feature information indicating the user's search intent when that character information is entered as a search query. For example, the distributed representation QV1-1 of the query Q1-1 includes feature information indicating the user's search intent when the character information "vehicle type T11" corresponding to the query Q1-1 is entered as a search query. Likewise, the distributed representation QV1-2 of the query Q1-2 includes feature information indicating the user's search intent when the character information "vehicle type T12" corresponding to the query Q1-2 is entered as a search query.

Moreover, similarity between the distributed representation of given character information output from the first model M1 and the distributed representation of other character information means that the user's search intent when the given character information is entered as a search query is similar to the user's search intent when the other character information is entered as a search query. For example, similarity between the distributed representation QV1-1 of the character information "vehicle type T11" and the distributed representation QV1-2 of the character information "vehicle type T12" means that the user's search intent when "vehicle type T11" is entered as a search query is similar to the user's search intent when "vehicle type T12" is entered as a search query.

Further, in general, data classified into the same cluster by a clustering method such as the k-means method have similar features. For example, vectors classified into the same cluster by the k-means method have similar features. Therefore, in the present invention, the fact that, for example, the distributed representation QV1-1 and the distributed representation QV1-2 are classified into one cluster (cluster CL1) by the k-means method means that the user's search intention when the character information "vehicle type T11" corresponding to QV1-1 is input as a search query is similar to the user's search intention when the character information "vehicle type T12" corresponding to QV1-2 is input as a search query. Thus, in the present invention, pieces of character information (for example, queries) whose distributed representations are classified into the same cluster have similar user search intentions when input as search queries. For example, the queries Q1-1, Q1-2, Q1-3, Q1-7, and Q1-8 classified into the same cluster CL1 have mutually similar user search intentions when input as search queries. That is, the information processing device 100 can classify a plurality of queries into the specified number of clusters according to the user's search intention when each query is input as a search query.
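The classification described above can be illustrated with a minimal sketch. It assumes each query already has a distributed representation (here a short list of floats) and applies a plain k-means loop to obtain the specified number of clusters. The function name `cluster_queries`, the deterministic farthest-point initialisation, and the fixed iteration count are assumptions made for illustration only; the source does not specify these details.

```python
import math


def cluster_queries(query_vectors, num_clusters, iterations=20):
    """Group query IDs by running k-means on their vectors.

    query_vectors maps a query ID to its distributed representation.
    Returns a dict mapping a cluster index to the list of query IDs in it.
    """
    queries = list(query_vectors)
    vectors = [query_vectors[q] for q in queries]
    if not 1 <= num_clusters <= len(vectors):
        raise ValueError("num_clusters must be between 1 and the number of queries")

    # Deterministic farthest-point initialisation (an assumption, not from the source).
    centroids = [list(vectors[0])]
    while len(centroids) < num_clusters:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))

    def nearest(v):
        # Index of the centroid closest to vector v (Euclidean distance).
        return min(range(num_clusters), key=lambda c: math.dist(v, centroids[c]))

    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        labels = [nearest(v) for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(num_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    clusters = {c: [] for c in range(num_clusters)}
    for q, v in zip(queries, vectors):
        clusters[nearest(v)].append(q)
    return clusters
```

Because each query is carried along with its vector, the cluster assigned to a vector is also the cluster assigned to the corresponding query, which mirrors the way the description maps QV1-1 back to Q1-1.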

Note that, even for the same set of queries, clustering according to search intention and ordinary clustering may assign the queries to different clusters. For example, suppose that "vehicle type T11" and "vehicle type T14" are sports cars, "vehicle type T12" and "vehicle type T15" are family wagons, and "vehicle type T13" and "vehicle type T16" are light cars (kei cars). In this case, ordinary clustering may classify the sports cars "vehicle type T11" and "vehicle type T14" into one cluster, the family wagons "vehicle type T12" and "vehicle type T15" into another, and the light cars "vehicle type T13" and "vehicle type T16" into a third, because each pair belongs to the same vehicle category. In the clustering according to search intention of the present invention, however, vehicles of the same category are not necessarily classified into the same cluster, and vehicles of different categories may be classified into the same cluster. For example, when the user's search intentions are similar to one another when the sports car "vehicle type T11", the family wagon "vehicle type T12", and the light car "vehicle type T13" are input as search queries, these vehicles are classified into the same cluster CL1 even though their categories differ. Likewise, when the user's search intentions are similar to one another when the sports car "vehicle type T14", the family wagon "vehicle type T15", and the light car "vehicle type T16" are input as search queries, these vehicles are classified into the same cluster CL2 even though their categories differ. Because the information processing device 100 enables clustering according to search intention in this way, new insight based on such clustering can be obtained.

Further, the information processing device 100 classifies the distributed representations QV1-4, QV1-5, QV1-6, QV1-9, and QV1-10 into one cluster (cluster CL2) different from cluster CL1. Since the information processing device 100 has classified the distributed representation QV1-4 into cluster CL2, it classifies the query Q1-4 corresponding to QV1-4 into cluster CL2. Since it has classified the distributed representation QV1-5 into cluster CL2, it classifies the query Q1-5 corresponding to QV1-5 into cluster CL2. Similarly, since it has classified the distributed representations QV1-6, QV1-9, and QV1-10 into cluster CL2, it classifies the query Q1-6 corresponding to QV1-6, the query Q1-9 corresponding to QV1-9, and the query Q1-10 corresponding to QV1-10 into cluster CL2. In this way, the information processing device 100 generates cluster information on cluster CL2, into which the queries Q1-4, Q1-5, Q1-6, Q1-9, and Q1-10 are classified. The queries Q1-4, Q1-5, Q1-6, Q1-9, and Q1-10 classified into the same cluster CL2 have mutually similar user search intentions when input as search queries.

Further, the information processing device 100 classifies the distributed representations QV1-11 and QV1-12 into one cluster (cluster CL3) different from clusters CL1 and CL2. Since the information processing device 100 has classified the distributed representation QV1-11 into cluster CL3, it classifies the query Q1-11 corresponding to QV1-11 into cluster CL3. Since it has classified the distributed representation QV1-12 into cluster CL3, it classifies the query Q1-12 corresponding to QV1-12 into cluster CL3. In this way, the information processing device 100 generates cluster information on cluster CL3, into which the queries Q1-11 and Q1-12 are classified. The queries Q1-11 and Q1-12 classified into the same cluster CL3 have mutually similar user search intentions when input as search queries.

Here, an example of how the clustering result produced by the information processing device 100 may be interpreted will be described. For example, cluster CL1 contains both queries indicating product names of automobile manufacturer #1 ("vehicle type T11", "vehicle type T12", "vehicle type T13") and queries indicating product names of automobile manufacturer #2 ("vehicle type T21", "vehicle type T22"). Cluster CL2 likewise contains both queries indicating product names of automobile manufacturer #1 ("vehicle type T14", "vehicle type T15", "vehicle type T16") and queries indicating product names of automobile manufacturer #2 ("vehicle type T23", "vehicle type T24"). Cluster CL3, on the other hand, contains no query indicating a product name of automobile manufacturer #1 and contains only queries indicating product names of automobile manufacturer #2 ("vehicle type T25", "vehicle type T26"). These results show that automobile manufacturer #1 has no product that is searched for with the search intention corresponding to cluster CL3. From this, automobile manufacturer #1 can, for example, obtain the analysis result that it is weaker in the market than automobile manufacturer #2 with respect to products searched for with the search intention corresponding to cluster CL3. Having obtained this result, automobile manufacturer #1 can use it in its marketing policy, for example by advancing the development of products that are searched for with the search intention corresponding to cluster CL3. In this way, the information processing device 100 can provide a client with market-analysis information that reflects users' search intentions. For example, the information processing device 100 can provide insight into the strengths and weaknesses of the client's own company (or of other companies) in a market that reflects users' search intentions.
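The interpretation above amounts to checking, for each cluster, which manufacturers have no query in it. The following is a minimal sketch of that check under stated assumptions: the function name, the dictionary layout, and the labels "maker#1" and "maker#2" are hypothetical and chosen only to mirror the example of clusters CL1 to CL3.

```python
def manufacturers_missing_from_clusters(clusters, maker_of):
    """For each cluster, list the manufacturers that have no query in it.

    clusters maps a cluster ID to the queries classified into it.
    maker_of maps each query (a product name) to its manufacturer.
    """
    all_makers = set(maker_of.values())
    return {
        cluster_id: sorted(all_makers - {maker_of[q] for q in queries})
        for cluster_id, queries in clusters.items()
    }


# Example data mirroring the description (labels are illustrative).
clusters = {
    "CL1": ["T11", "T12", "T13", "T21", "T22"],
    "CL2": ["T14", "T15", "T16", "T23", "T24"],
    "CL3": ["T25", "T26"],
}
maker_of = {q: "maker#1" if q.startswith("T1") else "maker#2"
            for qs in clusters.values() for q in qs}
gaps = manufacturers_missing_from_clusters(clusters, maker_of)
```

With this data, only cluster CL3 reports a missing manufacturer, which corresponds to the weakness of automobile manufacturer #1 discussed in the text.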

Subsequently, upon generating the cluster information on the three clusters CL1 to CL3, the information processing device 100 transmits the generated cluster information to the terminal device 10. Upon acquiring the cluster information on the three clusters CL1 to CL3, the terminal device 10 controls the screen display so that the cluster information on clusters CL1 to CL3 is displayed in the display areas F21 to F23, respectively, included in the partial content C12 of the content C1.

For example, the information processing device 100 controls the screen display so that the name "cluster 1" of cluster CL1 is displayed above the display area F21, in which the cluster information on cluster CL1 is displayed. The information processing device 100 also controls the screen display so that the name "cluster 2" of cluster CL2 is displayed above the display area F22, in which the cluster information on cluster CL2 is displayed. Likewise, the information processing device 100 controls the screen display so that the name "cluster 3" of cluster CL3 is displayed above the display area F23, in which the cluster information on cluster CL3 is displayed.

As described above, the information processing device 100 receives a plurality of pieces of target information indicating classification targets, and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified. The information processing device 100 then generates specified cluster information on the clusters generated by classifying the plurality of pieces of target information into the specified number of clusters, and outputs the generated specified cluster information. As a result, the information processing device 100 can extract, from a large amount of target information indicating classification targets, cluster information for the specified number of clusters, which can be regarded as a summary of the features of that target information. In other words, the information processing device 100 can summarize the features of a large amount of target information into cluster information for the specified number of clusters and provide it as a summary with a small amount of information. Therefore, the information processing device 100 makes it possible to obtain useful information.

[2. Configuration of the information processing device]
Next, the configuration of the information processing device 100 according to the embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram showing a configuration example of the information processing device 100 according to the embodiment. As shown in FIG. 3, the information processing device 100 includes a communication unit 110, a storage unit 120, and a control unit 130. The information processing device 100 may also include an input unit (for example, a keyboard or a mouse) that receives various operations from an administrator or the like of the information processing device 100, and a display unit (for example, a liquid crystal display) for displaying various information.

(Communication unit 110)
The communication unit 110 is realized by, for example, a NIC (Network Interface Card). The communication unit 110 is connected to a network by wire or wirelessly, and transmits and receives information to and from, for example, the terminal device 10, the search server 20, and the generation device 50.

(Storage unit 120)
The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. As shown in FIG. 3, the storage unit 120 includes a query information storage unit 121, a vector information storage unit 122, a cluster information storage unit 123, and a model information storage unit 124.

(Query information storage unit 121)
The query information storage unit 121 stores various information on the queries received from users. FIG. 4 shows an example of the query information storage unit according to the embodiment. In the example shown in FIG. 4, the query information storage unit 121 has items such as "user ID", "date and time", "specified number of clusters", "query", and "query ID".

The "user ID" indicates identification information for identifying a user. The "date and time" indicates the date and time at which the query was received from the user. The "specified number of clusters" indicates the number of clusters specified by the user as the number of clusters into which the plurality of queries received from the user are to be classified. The "query" indicates a query received from the user. The "query ID" indicates identification information for identifying the query received from the user.

In the example shown in the first record of FIG. 4, the query identified by the query ID "Q1-1" (query Q1-1) corresponds to the query "vehicle type T11" shown in FIG. 1. The specified number of clusters "3" indicates that the user specified "3" as the number of clusters into which the twelve queries Q1-1 to Q1-12 received from the user are to be classified.

(Vector information storage unit 122)
The vector information storage unit 122 stores various information on the vectors that are the distributed representations of queries. FIG. 5 shows an example of the vector information storage unit according to the embodiment. In the example shown in FIG. 5, the vector information storage unit 122 has items such as "vector ID", "query ID", and "vector information".

The "vector ID" indicates identification information for identifying a vector that is the distributed representation of a query. The "query ID" indicates identification information for identifying the query corresponding to the vector. The "vector information" indicates the N-dimensional (for example, 128-dimensional) vector that is the distributed representation of the query.

In the example shown in the first record of FIG. 5, the vector identified by the vector ID "QV1-1" (vector QV1-1) corresponds to the vector QV1-1 shown in FIG. 1, which is the distributed representation of the query Q1-1. The query ID "Q1-1" indicates that the query corresponding to the vector QV1-1 is the query Q1-1. The vector information "QVDT1-1" indicates the N-dimensional vector that is the distributed representation of the query Q1-1.

(Cluster information storage unit 123)
The cluster information storage unit 123 stores various information on clusters. FIG. 6 shows an example of the cluster information storage unit according to the embodiment. In the example shown in FIG. 6, the cluster information storage unit 123 consists of a plurality of data tables, each of which stores the data for one piece of cluster information generated by classifying the plurality of queries received from a user into the specified number of clusters received from that user. Each data table has items such as "cluster ID", "cluster name", and "query ID".

The "cluster ID" indicates identification information for identifying a cluster. The "cluster name" indicates the name of the cluster. The "query ID" indicates identification information for identifying a query.

In the example shown in the first record of FIG. 6, the cluster identified by the cluster ID "CL1" (cluster CL1) corresponds to the cluster CL1 shown in FIG. 1. The cluster name "cluster 1" indicates that the name of cluster CL1 is cluster 1. The record also indicates that the query identified by the query ID "Q1-1" (query Q1-1), the query identified by the query ID "Q1-2" (query Q1-2), the query identified by the query ID "Q1-3" (query Q1-3), the query identified by the query ID "Q1-7" (query Q1-7), and the query identified by the query ID "Q1-8" (query Q1-8) are classified into cluster CL1.
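The three storage units described so far can be pictured as simple record types linked by the query ID. The sketch below is one possible in-memory layout, not the layout used by the embodiment: the class and field names are assumptions derived from the item names in FIGS. 4 to 6, and the user ID, timestamp, and vector values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    user_id: str
    received_at: str              # date and time the query was received
    specified_cluster_count: int  # number of clusters specified by the user
    query: str
    query_id: str


@dataclass(frozen=True)
class VectorRecord:
    vector_id: str
    query_id: str
    vector: tuple                 # N-dimensional distributed representation


@dataclass(frozen=True)
class ClusterRecord:
    cluster_id: str
    cluster_name: str
    query_ids: tuple


def queries_in_cluster(cluster, query_records):
    """Resolve the query IDs stored for a cluster into the query strings."""
    text_by_id = {r.query_id: r.query for r in query_records}
    return [text_by_id[qid] for qid in cluster.query_ids]


# Placeholder rows; only the IDs and "vehicle type T11" come from the description.
query_rows = [
    QueryRecord("U1", "2019-07-01T00:00:00", 3, "vehicle type T11", "Q1-1"),
    QueryRecord("U1", "2019-07-01T00:00:00", 3, "vehicle type T12", "Q1-2"),
]
vector_rows = [VectorRecord("QV1-1", "Q1-1", (0.1, 0.2))]
cl1 = ClusterRecord("CL1", "cluster 1", ("Q1-1", "Q1-2"))
```

Keeping the query ID as the shared key lets a cluster record stay small while the query text and vector are looked up only when needed.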

(Model information storage unit 124)
The model information storage unit 124 stores various information on the learning models generated by the generation device 50. FIG. 7 shows an example of the model information storage unit according to the embodiment. In the example shown in FIG. 7, the model information storage unit 124 has items such as "model ID" and "model data".

The "model ID" indicates identification information for identifying a learning model generated by the generation device 50. The "model data" indicates the model data of the learning model generated by the generation device 50. For example, the "model data" stores data for converting a query into a distributed representation.

In the example shown in the first record of FIG. 7, the learning model identified by the model ID "M1" corresponds to the first model M1 shown in FIG. 1. The model data "MDT1" indicates the model data (model data MDT1) of the first model M1 generated by the generation device 50.

The model data MDT1 may include an input layer to which a query is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and the weight of the first element, and may cause the generation device 50 to function so as to output, from the output layer, a distributed representation of the query input to the input layer in response to that query.

Here, suppose that the model data MDT1 is realized by a regression model expressed as "y = a1*x1 + a2*x2 + ... + ai*xi". In this case, the first element included in the model data MDT1 corresponds to the input data (xi), such as x1 and x2. The weight of the first element corresponds to the coefficient ai associated with xi. The regression model can be regarded as a simple perceptron having an input layer and an output layer. When the model is regarded as a simple perceptron, the first element corresponds to one of the nodes of the input layer, and the second element can be regarded as the node of the output layer.
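As a small illustration of the regression model viewed as a simple perceptron, the sketch below computes the single output node (the second element) as the weighted sum of the input nodes (the first elements). The function name and the example values are assumptions for illustration only.

```python
def simple_perceptron(inputs, weights):
    """Compute y = a1*x1 + a2*x2 + ... + ai*xi.

    inputs are the first elements (x1, x2, ...), weights are their
    coefficients (a1, a2, ...), and the return value is the second element.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(a * x for a, x in zip(weights, inputs))


y = simple_perceptron([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
```

Here `y` is 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 = 4.5.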

Alternatively, suppose that the model data MDT1 is realized by a neural network having one or more intermediate layers, such as a DNN (Deep Neural Network). In this case, the first element included in the model data MDT1 corresponds to one of the nodes of the input layer or an intermediate layer. The second element corresponds to a node in the next stage, that is, a node to which a value is transmitted from the node corresponding to the first element. The weight of the first element corresponds to a connection coefficient, which is the weight applied to the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
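The relationship between first elements, connection coefficients, and second elements can be sketched as a forward pass through a small fully connected network. This is only an illustrative sketch: the ReLU activation, the absence of bias terms, and the function name are assumptions, since the description does not specify them.

```python
def forward(inputs, layers):
    """Propagate values from the input layer to the output layer.

    layers is a list of weight matrices. weights[j][i] is the connection
    coefficient from node i of the current layer (a first element) to
    node j of the next layer (a second element).
    """
    values = list(inputs)
    for weights in layers:
        # Each next-stage node is the ReLU of the weighted sum of the
        # values transmitted from the nodes of the current layer.
        values = [max(0.0, sum(w * v for w, v in zip(row, values)))
                  for row in weights]
    return values


hidden = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]  # 2 input nodes -> 3 hidden nodes
output = [[1.0, 1.0, 1.0]]                      # 3 hidden nodes -> 1 output node
result = forward([1.0, 2.0], [hidden, output])
```

With these weights the hidden layer evaluates to [1.0, 0.0, 1.5], so the single output node is 2.5.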

The generation device 50 calculates distributed representations using a model having an arbitrary structure, such as the regression model or the neural network described above. Specifically, the coefficients of the model data MDT1 are set so that a distributed representation is output when a query is input. The generation device 50 calculates distributed representations using such model data MDT1.

In the above example, the model data MDT1 is a model (hereinafter referred to as model X1) that outputs a distributed representation of a query when the query is input. However, the model data MDT1 according to the embodiment may be a model generated based on results obtained by repeatedly inputting data to and outputting data from the model X1. For example, the model data MDT1 may be a model (hereinafter referred to as model Y1) trained with, as input, the distributed representation that the model X1 outputs when given a query. Alternatively, the model data MDT1 may be a model trained to take a query as input and to produce the output value of the model Y1 as output.

Further, when the generation device 50 performs estimation processing using GAN (Generative Adversarial Networks), the model data MDT1 may be a model that forms part of the GAN.

(Control unit 130)
Returning to the description of FIG. 3, the control unit 130 is a controller and is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs (corresponding to an example of the information processing program) stored in a storage device inside the information processing device 100, using the RAM as a work area. The control unit 130 may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

As shown in FIG. 3, the control unit 130 includes a reception unit 131, an acquisition unit 132, a generation unit 133, and an output unit 134, and realizes or executes the information processing operations described below. The internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 3, and may be any other configuration that performs the information processing described later.

(Reception unit 131)
The reception unit 131 receives a distribution request for the content C1 from the terminal device 10. The reception unit 131 receives a distribution request for the partial content C11 of the content C1 from the terminal device 10. Specifically, the reception unit 131 receives a distribution request for the partial content C11, which includes an input field F11 in which the number of clusters can be entered, an input field F12 in which a plurality of queries can be entered, and a send button B11 for transmitting the information entered in the input fields to the information processing device 100.

Subsequently, upon receiving the distribution request for the content C1, the reception unit 131 distributes the content C1 to the terminal device 10. Upon receiving the distribution request for the partial content C11 of the content C1, the reception unit 131 distributes the partial content C11 to the terminal device 10. Specifically, the reception unit 131 distributes to the terminal device 10 the partial content C11, which includes the input field F11 in which the number of clusters can be entered, the input field F12 in which a plurality of queries can be entered, and the send button B11 for transmitting the information entered in the input fields to the information processing device 100.

Further, the reception unit 131 receives a plurality of pieces of target information indicating classification targets, and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified. Specifically, the reception unit 131 receives, from the terminal device 10, a plurality of queries indicating classification targets and the specified number of clusters into which the queries are to be classified. Upon receiving the plurality of queries and the specified number of clusters, the reception unit 131 stores them in the query information storage unit 121 in association with each other.

(Acquisition unit 132)
The acquisition unit 132 acquires various types of information. The acquisition unit 132 acquires various information from external information processing devices, for example from other information processing devices such as the generation device 50.

Further, the acquisition unit 132 acquires various information from the storage unit 120. The acquisition unit 132 acquires various information from the query information storage unit 121, the vector information storage unit 122, the cluster information storage unit 123, and the model information storage unit 124.

Specifically, the acquisition unit 132 acquires a model. More specifically, when the reception unit 131 receives a plurality of queries indicating classification targets and the specified number of clusters into which the queries are to be classified, the acquisition unit 132 acquires the first model M1 from the generation device 50. The acquisition unit 132 acquires the model data MDT1 of the first model M1 from the generation device 50. Then, upon acquiring the first model M1 (model data MDT1), the acquisition unit 132 stores it in the model information storage unit 124.

（生成部１３３）
生成部１３３は、種々の情報を生成する。生成部１３３は、クエリの分散表現を生成する。具体的には、生成部１３３は、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 (Generation unit 133)
The generation unit 133 generates various information. The generation unit 133 generates a distributed representation of the query. Specifically, the generation unit 133 generates a distributed representation of a plurality of queries received by the reception unit 131.

また、生成部１３３は、同一のユーザによって所定の時間内に入力された複数の検索クエリが類似する特徴を有するものとして、複数の検索クエリが有する特徴を学習した学習モデルを用いて、複数の対象情報に含まれるそれぞれの対象情報である文字情報に対応する分散表現を生成する。具体的には、生成部１３３は、同一のユーザによって所定の時間内に入力された複数の検索クエリが類似する特徴を有するものとして、複数の検索クエリが有する特徴を学習した第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 In addition, the generation unit 133 generates a distributed representation corresponding to the character information that is each piece of target information included in the plurality of pieces of target information, using a learning model that has learned the characteristics of a plurality of search queries on the assumption that search queries input by the same user within a predetermined time have similar characteristics. Specifically, the generation unit 133 generates distributed representations of the plurality of queries received by the reception unit 131, using the first model M1, which has learned the characteristics of a plurality of search queries on the assumption that search queries input by the same user within a predetermined time have similar characteristics.

また、生成部１３３は、入力情報として所定の検索クエリが入力された際に、出力情報として所定の検索クエリの分散表現を出力する学習モデルを用いて、分散表現を生成する。具体的には、生成部１３３は、入力情報として所定の検索クエリが入力された際に、出力情報として所定の検索クエリの分散表現を出力する第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 Further, the generation unit 133 generates a distributed expression by using a learning model that outputs a distributed expression of the predetermined search query as output information when a predetermined search query is input as input information. Specifically, the generation unit 133 receives the input information by the reception unit 131 using the first model M1 that outputs the distributed representation of the predetermined search query as the output information when the predetermined search query is input. Generate a distributed representation of multiple queries.

また、生成部１３３は、所定の時間内に続けて入力された一対の検索クエリの分散表現が類似するように学習することで、複数の検索クエリが有する特徴を学習した学習モデルを用いて、分散表現を生成する。具体的には、生成部１３３は、所定の時間内に続けて入力された一対の検索クエリの分散表現が類似するように学習することで、複数の検索クエリが有する特徴を学習した第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 In addition, the generation unit 133 uses a learning model that learns the characteristics of a plurality of search queries by learning so that the distributed expressions of a pair of search queries input consecutively within a predetermined time are similar. Generate a distributed representation. Specifically, the generation unit 133 learns the characteristics of a plurality of search queries by learning so that the distributed expressions of a pair of search queries input consecutively within a predetermined time are similar to each other. Using M1, a distributed representation of a plurality of queries received by the reception unit 131 is generated.

また、生成部１３３は、同一のユーザによって所定の時間内に入力された複数の検索クエリとして、所定の区切り文字で区切られた文字列を含む複数の検索クエリが類似する特徴を有するものとして学習することで、複数の検索クエリが有する特徴を学習した学習モデルを用いて、分散表現を生成する。具体的には、生成部１３３は、同一のユーザによって所定の時間内に入力された複数の検索クエリとして、所定の区切り文字で区切られた文字列を含む複数の検索クエリが類似する特徴を有するものとして学習することで、複数の検索クエリが有する特徴を学習した第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 Further, the generation unit 133 learns that as a plurality of search queries input by the same user within a predetermined time, a plurality of search queries including a character string delimited by a predetermined delimiter have similar characteristics. By doing so, a distributed representation is generated using a learning model that has learned the characteristics of a plurality of search queries. Specifically, the generation unit 133 has a feature that a plurality of search queries including a character string separated by a predetermined delimiter are similar as a plurality of search queries input by the same user within a predetermined time. By learning as a thing, a distributed representation of a plurality of queries received by the reception unit 131 is generated by using the first model M1 that has learned the characteristics of the plurality of search queries.

また、生成部１３３は、ランダムに抽出された複数の検索クエリが相違する特徴を有するものとして学習することで、複数の検索クエリが有する特徴を学習した学習モデルを用いて、分散表現を生成する。具体的には、生成部１３３は、ランダムに抽出された複数の検索クエリが相違する特徴を有するものとして学習することで、複数の検索クエリが有する特徴を学習した第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 In addition, the generation unit 133 generates a distributed representation using a learning model that has learned the characteristics of a plurality of search queries by learning on the assumption that randomly extracted search queries have different characteristics. Specifically, the generation unit 133 generates distributed representations of the plurality of queries received by the reception unit 131, using the first model M1, which has learned the characteristics of a plurality of search queries by learning on the assumption that randomly extracted search queries have different characteristics.

また、生成部１３３は、ランダムに抽出された一対の検索クエリの分散表現が相違するように学習することで、複数の検索クエリが有する特徴を学習した学習モデルを用いて、分散表現を生成する。具体的には、生成部１３３は、ランダムに抽出された一対の検索クエリの分散表現が相違するように学習することで、複数の検索クエリが有する特徴を学習した第１モデルＭ１を用いて、受付部１３１によって受け付けられた複数のクエリの分散表現を生成する。 Further, the generation unit 133 generates a distributed representation using a learning model that has learned the characteristics of a plurality of search queries by learning so that the distributed representations of a pair of randomly extracted search queries differ from each other. Specifically, the generation unit 133 generates distributed representations of the plurality of queries received by the reception unit 131, using the first model M1, which has learned the characteristics of a plurality of search queries by learning so that the distributed representations of a pair of randomly extracted search queries differ from each other.
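
The two training signals described above (pairs entered consecutively within a predetermined time should have similar distributed representations, randomly extracted pairs should have dissimilar ones) can be illustrated with a per-pair loss. The following is a minimal sketch only: the description does not state a loss function, so the cosine-based form, the function names `cosine` and `pair_loss`, and the `margin` parameter are all assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pair_loss(u: Sequence[float], v: Sequence[float],
              same_session: bool, margin: float = 0.0) -> float:
    """Hypothetical loss for one pair of query vectors.

    same_session=True : the pair was entered consecutively within the
                        predetermined time, so the loss falls as the
                        vectors become more similar.
    same_session=False: the pair was extracted at random, so the loss is
                        positive only while similarity exceeds `margin`.
    """
    sim = cosine(u, v)
    if same_session:
        return 1.0 - sim
    return max(0.0, sim - margin)
```

Minimizing such a loss over many pairs would pull session pairs together and push random pairs apart; the actual training procedure of the first model M1 may differ.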

より具体的には、生成部１３３は、受付部１３１によって分類対象を示す複数のクエリと、複数のクエリをクラスタに分類する際のクラスタ数である指定クラスタ数とが受け付けられると、モデル情報記憶部１２４を参照して、第１モデルＭ１を取得する。また、生成部１３３は、受付部１３１によって分類対象を示す複数のクエリと、複数のクエリをクラスタに分類する際のクラスタ数である指定クラスタ数とが受け付けられると、クエリ情報記憶部１２１を参照して、受付部１３１によって受け付けられた複数のクエリを取得する。図１に示す例では、生成部１３３は、受付部１３１によって受け付けられたクエリＱ１-１〜Ｑ１-１２を取得する。 More specifically, when the reception unit 131 receives a plurality of queries indicating the classification target and the designated number of clusters, which is the number of clusters into which the plurality of queries are to be classified, the generation unit 133 refers to the model information storage unit 124 and acquires the first model M1. Likewise, upon that reception, the generation unit 133 refers to the query information storage unit 121 and acquires the plurality of queries received by the reception unit 131. In the example shown in FIG. 1, the generation unit 133 acquires the queries Q1-1 to Q1-12 received by the reception unit 131.

続いて、生成部１３３は、複数のクエリを取得すると、複数のクエリの中から一のクエリを取得する。図１に示す例では、生成部１３３は、クエリＱ１-１〜Ｑ１-１２を取得すると、クエリＱ１-１〜Ｑ１-１２の中から一のクエリＱ１-１を取得する。続いて、生成部１３３は、第１モデルＭ１の入力情報として、取得した一のクエリを第１モデルＭ１に入力する。図１に示す例では、生成部１３３は、第１モデルＭ１の入力情報として、取得した一のクエリＱ１-１を第１モデルＭ１に入力する。また、生成部１３３は、第１モデルＭ１の出力情報として、第１モデルＭ１に入力されたクエリの分散表現を第１モデルＭ１から出力する。図１に示す例では、生成部１３３は、第１モデルＭ１の出力情報として、第１モデルＭ１に入力されたクエリＱ１-１の分散表現ＱＶ１-１を第１モデルＭ１から出力する。このようにして、生成部１３３は、クエリの分散表現を生成する。同様に、生成部１３３は、取得したクエリの全てについて、各クエリの分散表現を生成する。図１に示す例では、生成部１３３は、取得したクエリＱ１-１〜Ｑ１-１２について、各クエリＱ１-１〜Ｑ１-１２の分散表現ＱＶ１-１〜ＱＶ１-１２を生成する。続いて、生成部１３３は、各クエリの分散表現を生成すると、生成した各クエリの分散表現を各クエリと対応付けてベクトル情報記憶部１２２に格納する。 Subsequently, upon acquiring the plurality of queries, the generation unit 133 acquires one query from among them. In the example shown in FIG. 1, upon acquiring the queries Q1-1 to Q1-12, the generation unit 133 acquires one query Q1-1 from among them. Subsequently, the generation unit 133 inputs the acquired query to the first model M1 as the input information of the first model M1. In the example shown in FIG. 1, the generation unit 133 inputs the acquired query Q1-1 to the first model M1 as the input information of the first model M1. The generation unit 133 then obtains, as the output information of the first model M1, the distributed representation of the query input to the first model M1. In the example shown in FIG. 1, the generation unit 133 obtains the distributed representation QV1-1 of the query Q1-1 as the output information of the first model M1. In this way, the generation unit 133 generates a distributed representation of the query. Similarly, the generation unit 133 generates a distributed representation for each of the acquired queries. In the example shown in FIG. 1, the generation unit 133 generates the distributed representations QV1-1 to QV1-12 of the acquired queries Q1-1 to Q1-12. Subsequently, the generation unit 133 stores each generated distributed representation in the vector information storage unit 122 in association with its query.
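
The per-query loop described above (input one query, take the model output as its distributed representation, then store the pair) can be sketched as follows. This is an illustration under stated assumptions: `embed_all` and `toy_model` are hypothetical names, a plain dictionary stands in for the vector information storage unit 122, and `toy_model` is a placeholder that is not the first model M1.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def embed_all(queries: Sequence[str],
              model: Callable[[str], Vector]) -> Dict[str, Vector]:
    """Feed each query to the model one at a time and keep query -> vector."""
    store: Dict[str, Vector] = {}
    for query in queries:
        # Input: one query. Output: its distributed representation.
        store[query] = model(query)
    return store


def toy_model(query: str) -> Vector:
    """Placeholder model (an assumption): maps text to a 2-dimensional vector."""
    return [float(len(query)), float(sum(map(ord, query)) % 7)]
```

Any callable that maps a query string to a vector could be passed as `model`; the association between each query and its vector is what later steps rely on.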

また、生成部１３３は、クラスタを生成する。具体的には、生成部１３３は、複数の対象情報を指定クラスタ数のクラスタに分類することにより生成したクラスタを生成する。例えば、生成部１３３は、受付部１３１によって受け付けられた複数のクエリを指定クラスタ数のクラスタに分類することにより生成したクラスタを生成する。 In addition, the generation unit 133 generates a cluster. Specifically, the generation unit 133 generates a cluster generated by classifying a plurality of target information into clusters having a specified number of clusters. For example, the generation unit 133 generates a cluster generated by classifying a plurality of queries received by the reception unit 131 into clusters having a specified number of clusters.

また、生成部１３３は、複数の対象情報に含まれる一の対象情報が検索クエリとして入力された際の検索意図と、複数の対象情報に含まれる他の対象情報が検索クエリとして入力された際の検索意図との類似性に基づいて、複数の対象情報を指定クラスタ数のクラスタに分類する。具体的には、生成部１３３は、複数の対象情報に含まれる一の対象情報である文字情報に対応する分散表現と、複数の対象情報に含まれる他の対象情報である文字情報に対応する分散表現との類似度に基づいて、複数の対象情報を指定クラスタ数のクラスタに分類する。より具体的には、生成部１３３は、第１モデルＭ１を用いて生成された一のクエリの分散表現と、第１モデルＭ１を用いて生成された他のクエリの分散表現との類似度に基づいて、受付部１３１によって受け付けられた複数のクエリを指定クラスタ数のクラスタに分類する。 In addition, the generation unit 133 classifies the plurality of pieces of target information into the designated number of clusters based on the similarity between the search intention when one piece of target information included in the plurality of pieces of target information is input as a search query and the search intention when another piece of target information included in the plurality of pieces of target information is input as a search query. Specifically, the generation unit 133 classifies the plurality of pieces of target information into the designated number of clusters based on the degree of similarity between the distributed representation corresponding to the character information that is one piece of target information and the distributed representation corresponding to the character information that is another piece of target information. More specifically, the generation unit 133 classifies the plurality of queries received by the reception unit 131 into the designated number of clusters based on the degree of similarity between the distributed representation of one query generated using the first model M1 and the distributed representation of another query generated using the first model M1.

また、生成部１３３は、クラスタに関するクラスタ情報を生成する。具体的には、生成部１３３は、複数の対象情報を指定クラスタ数のクラスタに分類することにより、各対象情報が分類されるクラスタに関するクラスタ情報を生成する。例えば、生成部１３３は、受付部１３１によって受け付けられた複数のクエリを指定クラスタ数のクラスタに分類することにより、各クエリが分類されるクラスタに関する指定クラスタ情報を生成する。 In addition, the generation unit 133 generates cluster information regarding the cluster. Specifically, the generation unit 133 generates cluster information regarding the cluster in which each target information is classified by classifying a plurality of target information into clusters having a specified number of clusters. For example, the generation unit 133 generates designated cluster information regarding the cluster into which each query is classified by classifying the plurality of queries received by the reception unit 131 into clusters having a specified number of clusters.

また、生成部１３３は、複数の対象情報に含まれる一の対象情報が検索クエリとして入力された際の検索意図と、複数の対象情報に含まれる他の対象情報が検索クエリとして入力された際の検索意図との類似性に基づいて、指定クラスタ情報を生成する。具体的には、生成部１３３は、複数の対象情報に含まれる一の対象情報である文字情報に対応する分散表現と、複数の対象情報に含まれる他の対象情報である文字情報に対応する分散表現との類似度に基づいて、指定クラスタ情報を生成する。より具体的には、生成部１３３は、第１モデルＭ１を用いて生成された一のクエリの分散表現と、第１モデルＭ１を用いて生成された他のクエリの分散表現との類似度に基づいて、受付部１３１によって受け付けられた複数のクエリを指定クラスタ数のクラスタに分類することにより生成したクラスタに関する指定クラスタ情報を生成する。 In addition, the generation unit 133 generates the designated cluster information based on the similarity between the search intention when one piece of target information included in the plurality of pieces of target information is input as a search query and the search intention when another piece of target information included in the plurality of pieces of target information is input as a search query. Specifically, the generation unit 133 generates the designated cluster information based on the degree of similarity between the distributed representation corresponding to the character information that is one piece of target information and the distributed representation corresponding to the character information that is another piece of target information. More specifically, based on the degree of similarity between the distributed representation of one query generated using the first model M1 and the distributed representation of another query generated using the first model M1, the generation unit 133 generates designated cluster information regarding the clusters obtained by classifying the plurality of queries received by the reception unit 131 into the designated number of clusters.

図１に示す例では、生成部１３３は、受付部１３１によってクラスタ数「３」と１２個のクエリＱ１-１〜Ｑ１-１２が受け付けられると、第１モデルＭ１を用いて生成されたクエリＱ１-１〜Ｑ１-１２の分散表現ＱＶ１-１〜ＱＶ１-１２を取得する。具体的には、生成部１３３は、クエリ情報記憶部１２１とベクトル情報記憶部１２２を参照して、第１モデルＭ１を用いて生成されたクエリＱ１-１〜Ｑ１-１２の分散表現ＱＶ１-１〜ＱＶ１-１２を取得する。 In the example shown in FIG. 1, when the reception unit 131 receives the number of clusters “3” and the 12 queries Q1-1 to Q1-12, the generation unit 133 acquires the distributed representations QV1-1 to QV1-12 of the queries Q1-1 to Q1-12 generated using the first model M1. Specifically, the generation unit 133 refers to the query information storage unit 121 and the vector information storage unit 122 and acquires the distributed representations QV1-1 to QV1-12 of the queries Q1-1 to Q1-12 generated using the first model M1.

続いて、生成部１３３は、分散表現ＱＶ１-１〜ＱＶ１-１２を取得すると、取得した分散表現ＱＶ１-１〜ＱＶ１-１２をｋ−ｍｅａｎｓ法を用いてクラスタ数「３」のクラスタに分類する。なお、生成部１３３は、取得した分散表現ＱＶ１-１〜ＱＶ１-１２をクラスタ数「３」のクラスタに分類可能であれば、ｋ−ｍｅａｎｓ法に限らず、どのようなクラスタリング手法を用いてもよい。 Subsequently, upon acquiring the distributed representations QV1-1 to QV1-12, the generation unit 133 classifies them into “3” clusters using the k-means method. Note that the generation unit 133 is not limited to the k-means method and may use any clustering method, as long as the acquired distributed representations QV1-1 to QV1-12 can be classified into “3” clusters.
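
The k-means step with a designated number of clusters can be sketched in plain Python as below. This is a minimal illustration, not the implementation of the embodiment: the function name `kmeans`, the squared Euclidean distance, the iteration cap, and the deterministic farthest-first choice of initial centroids (used here instead of random initialization so the result is reproducible) are all assumptions.

```python
from typing import List, Sequence


def _sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors: Sequence[Sequence[float]], k: int, iters: int = 100) -> List[int]:
    """Return one cluster index (0..k-1) per vector using Lloyd iterations."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Assumed initialization: start from the first vector, then repeatedly
    # add the vector farthest from all centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(_sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        new_labels = [min(range(k), key=lambda c: _sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Calling `kmeans(vectors, 3)` on the twelve vectors QV1-1 to QV1-12 would correspond to the designated number of clusters “3” in the example; any other clustering method that yields the designated number of clusters could be substituted.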

また、生成部１３３は、クエリＱ１-１〜Ｑ１-１２に対応する分散表現ＱＶ１-１〜ＱＶ１-１２をクラスタ数「３」のクラスタに分類することによって、各分散表現に対応する各クエリをクラスタ数「３」のクラスタに分類する。このように、生成部１３３は、各分散表現に対応する各クエリをクラスタ数「３」のクラスタに分類することにより、各クエリが分類されるクラスタに関するクラスタ情報を生成する。 In addition, the generation unit 133 classifies the distributed expressions QV1-1 to QV1-12 corresponding to the queries Q1-1 to Q1-12 into clusters having the number of clusters "3", so that each query corresponding to each distributed expression is classified. Classify into clusters with the number of clusters "3". In this way, the generation unit 133 generates cluster information regarding the cluster to which each query is classified by classifying each query corresponding to each distributed representation into clusters having the number of clusters “3”.

例えば、生成部１３３は、分散表現ＱＶ１-１と分散表現ＱＶ１-２と分散表現ＱＶ１-３と分散表現ＱＶ１-７と分散表現ＱＶ１-８を一つのクラスタ（クラスタＣＬ１）に分類する。生成部１３３は、分散表現ＱＶ１-１をクラスタＣＬ１に分類したので、分散表現ＱＶ１-１に対応するクエリＱ１-１をクラスタＣＬ１に分類する。また、生成部１３３は、分散表現ＱＶ１-２をクラスタＣＬ１に分類したので、分散表現ＱＶ１-２に対応するクエリＱ１-２をクラスタＣＬ１に分類する。同様に、生成部１３３は、分散表現ＱＶ１-３と分散表現ＱＶ１-７と分散表現ＱＶ１-８をクラスタＣＬ１に分類したので、分散表現ＱＶ１-３に対応するクエリＱ１-３と分散表現ＱＶ１-７に対応するクエリＱ１-７と分散表現ＱＶ１-８に対応するクエリＱ１-８をクラスタＣＬ１に分類する。このようにして、生成部１３３は、クエリＱ１-１とクエリＱ１-２とクエリＱ１-３とクエリＱ１-７とクエリＱ１-８が分類されるクラスタＣＬ１に関するクラスタ情報を生成する。 For example, the generation unit 133 classifies the distributed representation QV1-1, the distributed representation QV1-2, the distributed representation QV1-3, the distributed representation QV1-7, and the distributed representation QV1-8 into one cluster (cluster CL1). Since the generation unit 133 has classified the distributed expression QV1-1 into the cluster CL1, the query Q1-1 corresponding to the distributed expression QV1-1 is classified into the cluster CL1. Further, since the generation unit 133 classifies the distributed expression QV1-2 into the cluster CL1, the query Q1-2 corresponding to the distributed expression QV1-2 is classified into the cluster CL1. Similarly, since the generation unit 133 classifies the distributed representation QV1-3, the distributed representation QV1-7, and the distributed representation QV1-8 into the cluster CL1, the query Q1-3 and the distributed representation QV1- corresponding to the distributed representation QV1-3. The query Q1-7 corresponding to 7 and the query Q1-8 corresponding to the distributed representation QV1-8 are classified into cluster CL1. In this way, the generation unit 133 generates cluster information regarding the cluster CL1 in which the query Q1-1, the query Q1-2, the query Q1-3, the query Q1-7, and the query Q1-8 are classified.

また、生成部１３３は、分散表現ＱＶ１-４と分散表現ＱＶ１-５と分散表現ＱＶ１-６と分散表現ＱＶ１-９と分散表現ＱＶ１-１０をクラスタＣＬ１とは異なる一つのクラスタ（クラスタＣＬ２）に分類する。生成部１３３は、分散表現ＱＶ１-４をクラスタＣＬ２に分類したので、分散表現ＱＶ１-４に対応するクエリＱ１-４をクラスタＣＬ２に分類する。また、生成部１３３は、分散表現ＱＶ１-５をクラスタＣＬ２に分類したので、分散表現ＱＶ１-５に対応するクエリＱ１-５をクラスタＣＬ２に分類する。同様に、生成部１３３は、分散表現ＱＶ１-６と分散表現ＱＶ１-９と分散表現ＱＶ１-１０をクラスタＣＬ２に分類したので、分散表現ＱＶ１-６に対応するクエリＱ１-６と分散表現ＱＶ１-９に対応するクエリＱ１-９と分散表現ＱＶ１-１０に対応するクエリＱ１-１０をクラスタＣＬ２に分類する。このようにして、生成部１３３は、クエリＱ１-４とクエリＱ１-５とクエリＱ１-６とクエリＱ１-９とクエリＱ１-１０が分類されるクラスタＣＬ２に関するクラスタ情報を生成する。 Further, the generation unit 133 classifies the distributed representations QV1-4, QV1-5, QV1-6, QV1-9, and QV1-10 into one cluster (cluster CL2) different from the cluster CL1. Since the generation unit 133 has classified the distributed representation QV1-4 into the cluster CL2, it classifies the query Q1-4 corresponding to the distributed representation QV1-4 into the cluster CL2. Likewise, since it has classified the distributed representation QV1-5 into the cluster CL2, it classifies the query Q1-5 corresponding to the distributed representation QV1-5 into the cluster CL2. Similarly, since the generation unit 133 has classified the distributed representations QV1-6, QV1-9, and QV1-10 into the cluster CL2, it classifies the corresponding queries Q1-6, Q1-9, and Q1-10 into the cluster CL2. In this way, the generation unit 133 generates cluster information regarding the cluster CL2, into which the queries Q1-4, Q1-5, Q1-6, Q1-9, and Q1-10 are classified.

また、生成部１３３は、分散表現ＱＶ１-１１と分散表現ＱＶ１-１２をクラスタＣＬ１およびクラスタＣＬ２とは異なる一つのクラスタ（クラスタＣＬ３）に分類する。生成部１３３は、分散表現ＱＶ１-１１をクラスタＣＬ３に分類したので、分散表現ＱＶ１-１１に対応するクエリＱ１-１１をクラスタＣＬ３に分類する。また、生成部１３３は、分散表現ＱＶ１-１２をクラスタＣＬ３に分類したので、分散表現ＱＶ１-１２に対応するクエリＱ１-１２をクラスタＣＬ３に分類する。このようにして、生成部１３３は、クエリＱ１-１１とクエリＱ１-１２が分類されるクラスタＣＬ３に関するクラスタ情報を生成する。 Further, the generation unit 133 classifies the distributed representation QV1-11 and the distributed representation QV1-12 into one cluster (cluster CL3) different from the cluster CL1 and the cluster CL2. Since the generation unit 133 has classified the distributed representation QV1-11 into the cluster CL3, the query Q1-11 corresponding to the distributed representation QV1-11 is classified into the cluster CL3. Further, since the generation unit 133 classifies the distributed expression QV1-12 into the cluster CL3, the query Q1-12 corresponding to the distributed expression QV1-12 is classified into the cluster CL3. In this way, the generation unit 133 generates cluster information regarding the cluster CL3 in which the query Q1-11 and the query Q1-12 are classified.
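
Because each distributed representation corresponds to exactly one query, the cluster assigned to a vector carries over directly to its query. The sketch below groups queries by the cluster label of their vectors; the function name `build_cluster_info` and the dictionary layout are assumptions, while the cluster assignments in the usage example are the ones given for FIG. 1 above.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def build_cluster_info(queries: Sequence[str],
                       labels: Sequence[Hashable]) -> Dict[Hashable, List[str]]:
    """Group queries by the cluster label assigned to their vectors.

    queries[i] and labels[i] must refer to the same query/vector pair.
    """
    if len(queries) != len(labels):
        raise ValueError("queries and labels must have the same length")
    clusters: Dict[Hashable, List[str]] = defaultdict(list)
    for query, label in zip(queries, labels):
        clusters[label].append(query)
    return dict(clusters)


# Usage with the assignments described for FIG. 1.
queries = [f"Q1-{i}" for i in range(1, 13)]
labels = ["CL1", "CL1", "CL1", "CL2", "CL2", "CL2",
          "CL1", "CL1", "CL2", "CL2", "CL3", "CL3"]
cluster_info = build_cluster_info(queries, labels)
```

Here `cluster_info["CL3"]` holds the queries Q1-11 and Q1-12, matching the cluster CL3 described above.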

また、生成部１３３は、指定クラスタ情報として、クラスタごとに、クラスタに分類される対象情報を視認可能な情報を生成する。例えば、生成部１３３は、指定クラスタ情報として、クラスタごとに、クラスタに分類される対象情報を視認可能なコンテンツを生成する。 In addition, the generation unit 133 generates information in which the target information classified into the clusters can be visually recognized as the designated cluster information for each cluster. For example, the generation unit 133 generates content in which the target information classified into the clusters can be visually recognized as the designated cluster information for each cluster.

図１に示す例では、生成部１３３は、クラスタＣＬ１に関するクラスタ情報が表示される表示領域Ｆ２１の上方にクラスタＣＬ１の名称「クラスタ１」が表示される部分コンテンツＣ１２を生成してもよい。また、生成部１３３は、クラスタＣＬ２に関するクラスタ情報が表示される表示領域Ｆ２２の上方にクラスタＣＬ２の名称「クラスタ２」が表示されるが表示される部分コンテンツＣ１２を生成してもよい。また、生成部１３３は、クラスタＣＬ３に関するクラスタ情報が表示される表示領域Ｆ２３の上方にクラスタＣＬ３の名称「クラスタ３」が表示されるが表示される部分コンテンツＣ１２を生成してもよい。 In the example shown in FIG. 1, the generation unit 133 may generate the partial content C12 in which the name “cluster 1” of the cluster CL1 is displayed above the display area F21 in which the cluster information regarding the cluster CL1 is displayed. Further, the generation unit 133 may generate the partial content C12 in which the name “cluster 2” of the cluster CL2 is displayed above the display area F22 in which the cluster information regarding the cluster CL2 is displayed. Further, the generation unit 133 may generate the partial content C12 in which the name “cluster 3” of the cluster CL3 is displayed above the display area F23 in which the cluster information regarding the cluster CL3 is displayed.

（出力部１３４）
出力部１３４は、生成部１３３によって生成された指定クラスタ情報を出力する。具体的には、出力部１３４は、生成部１３３によって生成された指定クラスタ情報を端末装置１０に送信する。 (Output unit 134)
The output unit 134 outputs the designated cluster information generated by the generation unit 133. Specifically, the output unit 134 transmits the designated cluster information generated by the generation unit 133 to the terminal device 10.

〔３．端末装置の構成〕
次に、図８を用いて、実施形態に係る端末装置１０の構成について説明する。図８は、実施形態に係る端末装置１０の構成例を示す図である。図８に示すように、端末装置１０は、通信部１１と、入力部１２と、表示部１３と、記憶部１４と、制御部１５とを有する。 [3. Terminal device configuration]
Next, the configuration of the terminal device 10 according to the embodiment will be described with reference to FIG. FIG. 8 is a diagram showing a configuration example of the terminal device 10 according to the embodiment. As shown in FIG. 8, the terminal device 10 includes a communication unit 11, an input unit 12, a display unit 13, a storage unit 14, and a control unit 15.

（通信部１１）
通信部１１は、例えば、ＮＩＣ等によって実現される。そして、通信部１１は、ネットワークＮと有線または無線で接続され、情報処理装置１００との間で情報の送受信を行う。 (Communication unit 11)
The communication unit 11 is realized by, for example, a NIC or the like. Then, the communication unit 11 is connected to the network N by wire or wirelessly, and transmits / receives information to / from the information processing device 100.

（入力部１２、表示部１３）
入力部１２は、利用者から各種操作を受け付ける入力装置である。入力部１２は、表示部１３を介して各種情報が入力される。例えば、入力部１２は、キーボードやマウスや操作キー等によって実現される。表示部１３は、各種情報を表示するための表示装置であり、すなわち、画面である。例えば、表示部１３は、液晶ディスプレイ等によって実現される。表示部１３は、記憶部１４に記憶された情報を表示する。表示部１３は、受信部１５１によって受信された情報を表示する。表示部１３は、表示制御部１５２による制御に応じて、各種情報を表示する。なお、端末装置１０にタッチパネルが採用される場合には、入力部１２と表示部１３とは一体化される。また、以下の説明では、表示部１３を画面と記載する場合がある。 (Input unit 12, display unit 13)
The input unit 12 is an input device that receives various operations from the user. Various information is input to the input unit 12 via the display unit 13. For example, the input unit 12 is realized by a keyboard, a mouse, operation keys, or the like. The display unit 13 is a display device for displaying various information, that is, a screen. For example, the display unit 13 is realized by a liquid crystal display or the like. The display unit 13 displays the information stored in the storage unit 14. The display unit 13 displays the information received by the receiving unit 151. The display unit 13 displays various information according to the control by the display control unit 152. When a touch panel is adopted for the terminal device 10, the input unit 12 and the display unit 13 are integrated. Further, in the following description, the display unit 13 may be referred to as a screen.

図１の例では、表示部１３は、受信部１５１によって受信されたコンテンツＣ１を表示する。表示部１３は、受信部１５１によって受信されたコンテンツＣ１の部分コンテンツＣ１１を表示する。具体的には、表示部１３は、複数の第１クエリを入力可能な入力フィールドＦ１１と、複数の第２クエリを入力可能な入力フィールドＦ１２と、入力フィールドに入力されたクエリを情報処理装置１００に送信する送信ボタンＢ１１とを含む部分コンテンツＣ１１を表示する。 In the example of FIG. 1, the display unit 13 displays the content C1 received by the receiving unit 151. The display unit 13 displays the partial content C11 of the content C1 received by the receiving unit 151. Specifically, the display unit 13 displays the partial content C11, which includes an input field F11 in which a plurality of first queries can be input, an input field F12 in which a plurality of second queries can be input, and a transmission button B11 for transmitting the queries entered in the input fields to the information processing device 100.

また、表示部１３は、受信部１５１によって受信されたコンテンツＣ１の部分コンテンツＣ１２を表示する。表示部１３は、受信部１５１によって受信された棒グラフＧ２１〜Ｇ２３に関する情報を表示する。具体的には、表示部１３は、受信部１５１によって受信されたコンテンツＣ１の部分コンテンツＣ１２に含まれる表示領域Ｆ２１〜Ｆ２３のそれぞれに棒グラフＧ２１〜Ｇ２３に関する情報を表示する。 Further, the display unit 13 displays the partial content C12 of the content C1 received by the receiving unit 151. The display unit 13 displays information about the bar graphs G21 to G23 received by the receiving unit 151. Specifically, the display unit 13 displays information about the bar graphs G21 to G23 in the respective display areas F21 to F23 included in the partial content C12 of the content C1 received by the receiving unit 151.

（記憶部１４）
記憶部１４は、例えば、ＲＡＭ、フラッシュメモリ等の半導体メモリ素子、または、ハードディスク、光ディスク等の記憶装置によって実現される。記憶部１４は、情報の表示に用いる各種情報を記憶する。記憶部１４は、受信部１５１によって受信された情報を記憶する。図１に示す例では、記憶部１４は、受信部１５１によって受信されたコンテンツＣ１を記憶する。また、記憶部１４は、受信部１５１によって受信されたコンテンツＣ１の部分コンテンツＣ１１を記憶する。また、記憶部１４は、受信部１５１によって受信されたコンテンツＣ１の部分コンテンツＣ１２を記憶する。また、記憶部１４は、受信部１５１によって受信された棒グラフＧ２１〜Ｇ２３に関する情報を記憶する。 (Storage unit 14)
The storage unit 14 is realized by, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as a hard disk or an optical disk. The storage unit 14 stores various information used for displaying information. The storage unit 14 stores the information received by the receiving unit 151. In the example shown in FIG. 1, the storage unit 14 stores the content C1 received by the receiving unit 151. Further, the storage unit 14 stores the partial content C11 of the content C1 received by the receiving unit 151. Further, the storage unit 14 stores the partial content C12 of the content C1 received by the receiving unit 151. Further, the storage unit 14 stores information regarding the bar graphs G21 to G23 received by the receiving unit 151.

（制御部１５）
制御部１５は、ＣＰＵやＭＰＵ等によって、端末装置１０内部の記憶装置に記憶されている各種プログラム（例えば、コンテンツＣ１等を画面に表示させる表示制御プログラムの一例に相当）がＲＡＭを作業領域として実行されることにより実現される。例えば、この各種プログラムは、ウェブブラウザと呼ばれるアプリケーションプログラムに該当する。また、制御部１５は、例えば、ＡＳＩＣやＦＰＧＡ等の集積回路により実現される。 (Control unit 15)
The control unit 15 is realized by a CPU, an MPU, or the like executing various programs stored in a storage device inside the terminal device 10 (for example, a display control program that displays the content C1 and the like on the screen), using the RAM as a work area. For example, these programs correspond to an application program called a web browser. Alternatively, the control unit 15 is realized by, for example, an integrated circuit such as an ASIC or an FPGA.

図８に示すように、制御部１５は、受信部１５１と、表示制御部１５２と、受付部１５３と、送信部１５４とを有し、以下に説明する情報処理の機能や作用を実現または実行する。なお、制御部１５の内部構成は、図８に示した構成に限られず、後述する情報処理を行う構成であれば他の構成であってもよい。 As shown in FIG. 8, the control unit 15 includes a receiving unit 151, a display control unit 152, a reception unit 153, and a transmission unit 154, and realizes or executes the information processing functions and operations described below. The internal configuration of the control unit 15 is not limited to the configuration shown in FIG. 8, and may be another configuration as long as it performs the information processing described later.

（受信部１５１）
受信部１５１は、各種情報を受信する。受信部１５１は、外部の情報処理装置から各種情報を受信する。受信部１５１は、情報処理装置１００等の他の情報処理装置から各種情報を受信する。図１の例では、受信部１５１は、情報処理装置１００からコンテンツＣ１を受信する。また、受信部１５１は、情報処理装置１００からコンテンツＣ１の部分コンテンツＣ１１を受信する。また、受信部１５１は、情報処理装置１００からコンテンツＣ１の部分コンテンツＣ１２を受信する。また、受信部１５１は、情報処理装置１００からクラスタＣＬ１〜ＣＬ３に関するクラスタ情報を受信する。 (Receiving unit 151)
The receiving unit 151 receives various information. The receiving unit 151 receives various information from an external information processing device. The receiving unit 151 receives various information from other information processing devices such as the information processing device 100. In the example of FIG. 1, the receiving unit 151 receives the content C1 from the information processing device 100. Further, the receiving unit 151 receives the partial content C11 of the content C1 from the information processing device 100. Further, the receiving unit 151 receives the partial content C12 of the content C1 from the information processing device 100. Further, the receiving unit 151 receives the cluster information regarding the clusters CL1 to CL3 from the information processing device 100.

（表示制御部１５２）
表示制御部１５２は、各種表示を制御する。表示制御部１５２は、表示部１３の表示を制御する。表示制御部１５２は、受信部１５１による受信に応じて、表示部１３の表示を制御する。表示制御部１５２は、受信部１５１により受信された情報に基づいて、表示部１３の表示を制御する。表示制御部１５２は、受付部１５３により受け付けられた情報に基づいて、表示部１３の表示を制御する。表示制御部１５２は、受付部１５３による受付けに応じて、表示部１３の表示を制御する。表示制御部１５２は、表示部１３にコンテンツが表示されるように表示部１３の表示を制御する。 (Display control unit 152)
The display control unit 152 controls various displays. The display control unit 152 controls the display of the display unit 13. The display control unit 152 controls the display of the display unit 13 in response to reception by the receiving unit 151. The display control unit 152 controls the display of the display unit 13 based on the information received by the receiving unit 151. The display control unit 152 controls the display of the display unit 13 based on the information accepted by the reception unit 153. The display control unit 152 controls the display of the display unit 13 in response to acceptance by the reception unit 153. The display control unit 152 controls the display of the display unit 13 so that the content is displayed on the display unit 13.

図１の例では、表示制御部１５２は、コンテンツＣ１が表示されるように表示部１３の表示を制御する。また、表示制御部１５２は、コンテンツＣ１の部分コンテンツＣ１１が表示されるように表示部１３の表示を制御する。また、表示制御部１５２は、コンテンツＣ１の部分コンテンツＣ１２が表示されるように表示部１３の表示を制御する。また、表示制御部１５２は、コンテンツＣ１の部分コンテンツＣ１２に含まれる表示領域Ｆ２１〜Ｆ２３のそれぞれにクラスタＣＬ１〜ＣＬ３に関するクラスタ情報が表示されるように表示部１３の表示を制御する。 In the example of FIG. 1, the display control unit 152 controls the display of the display unit 13 so that the content C1 is displayed. Further, the display control unit 152 controls the display of the display unit 13 so that the partial content C11 of the content C1 is displayed. Further, the display control unit 152 controls the display of the display unit 13 so that the partial content C12 of the content C1 is displayed. Further, the display control unit 152 controls the display of the display unit 13 so that the cluster information regarding the clusters CL1 to CL3 is displayed in each of the display areas F21 to F23 included in the partial content C12 of the content C1.

（受付部１５３）
受付部１５３は、各種情報を受け付ける。例えば、受付部１５３は、入力部１２を介してユーザによる入力を受け付ける。受付部１５３は、ユーザによる操作を受け付ける。受付部１５３は、表示部１３により表示された情報に対するユーザの操作を受け付ける。受付部１５３は、ユーザによる文字入力を受け付ける。受付部１５３は、ユーザによるクラスタ数の入力を受け付ける。受付部１５３は、コンテンツＣ１の部分コンテンツＣ１１に含まれる入力フィールドＦ１１への文字入力により、クラスタ数の入力を受け付ける。受付部１５３は、ユーザにより入力フィールドＦ１１に入力された数字をクラスタ数として受け付ける。 (Reception unit 153)
The reception unit 153 receives various information. For example, the reception unit 153 receives an input by the user via the input unit 12. The reception unit 153 accepts operations by the user. The reception unit 153 accepts the user's operation on the information displayed by the display unit 13. The reception unit 153 accepts character input by the user. The reception unit 153 accepts the input of the number of clusters by the user. The reception unit 153 accepts the input of the number of clusters by inputting characters into the input field F11 included in the partial content C11 of the content C1. The reception unit 153 accepts the number input by the user in the input field F11 as the number of clusters.

また、受付部１５３は、ユーザによる複数のクエリの入力を受け付ける。受付部１５３は、コンテンツＣ１の部分コンテンツＣ１１に含まれる入力フィールドＦ１２への文字入力により、クエリの入力を受け付ける。受付部１５３は、ユーザにより入力フィールドＦ１２に入力された文字列をクエリとして受け付ける。また、受付部１５３は、区切り文字で区切られた各文字列を各クエリとして受け付ける。 In addition, the reception unit 153 accepts input of a plurality of queries by the user. The reception unit 153 accepts the input of the query by inputting characters into the input field F12 included in the partial content C11 of the content C1. The reception unit 153 accepts the character string input by the user in the input field F12 as a query. Further, the reception unit 153 accepts each character string separated by a delimiter as each query.
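The handling of the two input fields described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names are hypothetical, and because the text only says "a delimiter", the choice of commas, semicolons, and line breaks is an assumption.

```python
import re

# Assumed delimiters: the description does not say which characters separate
# queries, so commas, semicolons, and line breaks are used for illustration.
DELIMITERS = re.compile(r"[,;\n]+")


def parse_cluster_count(field_f11: str) -> int:
    """Interpret the text typed into input field F11 as the number of clusters."""
    count = int(field_f11.strip())
    if count < 1:
        raise ValueError("the number of clusters must be at least 1")
    return count


def parse_queries(field_f12: str) -> list[str]:
    """Split the text typed into input field F12 into individual queries."""
    parts = (part.strip() for part in DELIMITERS.split(field_f12))
    # Drop empty strings produced by leading, trailing, or doubled delimiters.
    return [part for part in parts if part]
```

Each non-empty string between delimiters becomes one query, which matches the statement that each delimited character string is accepted as a separate query.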

また、受付部１５３は、ユーザによる送信ボタンＢ１１の選択操作を受け付ける。受付部１５３は、入力部１２によって受け付けられた送信ボタンＢ１１の選択操作を受け付ける。 In addition, the reception unit 153 accepts the user's selection operation of the transmission button B11. The reception unit 153 accepts the selection operation of the transmission button B11 received by the input unit 12.

図１の例では、受付部１５３は、クラスタ数である数字「３」の入力を受け付ける。受付部１５３は、コンテンツＣ１の部分コンテンツＣ１１に含まれる入力フィールドＦ１１への文字入力により、クラスタ数「３」の入力を受け付ける。 In the example of FIG. 1, the reception unit 153 accepts the input of the number “3” which is the number of clusters. The reception unit 153 accepts the input of the number of clusters "3" by inputting characters into the input field F11 included in the partial content C11 of the content C1.

また、受付部１５３は、区切り文字で区切られた１２個の文字列である１２個のクエリＱ１-１〜Ｑ１-１２の入力を受け付ける。受付部１５３は、コンテンツＣ１の部分コンテンツＣ１１に含まれる入力フィールドＦ１２への文字入力により、１２個のクエリＱ１-１〜Ｑ１-１２の入力を受け付ける。受付部１５３は、利用者Ｕ１１により入力フィールドＦ１２に入力された１２個の文字列である１２個の文字列「車種Ｔ１１」〜文字列「車種Ｔ２６」の入力を受け付ける。 In addition, the reception unit 153 accepts inputs of 12 queries Q1-1 to Q1-12, which are 12 character strings separated by delimiters. The reception unit 153 accepts the input of 12 queries Q1-1 to Q1-12 by inputting characters into the input field F12 included in the partial content C11 of the content C1. The reception unit 153 accepts input of 12 character strings "vehicle type T11" to character strings "vehicle type T26", which are 12 character strings input by the user U11 in the input field F12.

また、受付部１５３は、利用者Ｕ１１による送信ボタンＢ１１の選択操作を受け付ける。受付部１５３は、入力部１２によって受け付けられた送信ボタンＢ１１の選択操作を受け付ける。受付部１５３は、表示部１３を介して受け付けられた送信ボタンＢ１１の選択操作を受け付ける。 In addition, the reception unit 153 accepts the selection operation of the transmission button B11 by the user U11. The reception unit 153 accepts the selection operation of the transmission button B11 received by the input unit 12. The reception unit 153 accepts the selection operation of the transmission button B11 received via the display unit 13.

（送信部１５４）
送信部１５４は、外部の情報処理装置へ各種情報を送信する。例えば、送信部１５４は、情報処理装置１００等の他の情報処理装置へ各種情報を送信する。送信部１５４は、記憶部１４に記憶された情報を送信する。また、送信部１５４は、情報処理装置１００等の他の情報処理装置からの情報に基づいて、各種情報を送信する。送信部１５４は、記憶部１４に記憶された情報に基づいて、各種情報を送信する。送信部１５４は、受付部１５３によって受け付けられた情報を送信する。 (Transmission unit 154)
The transmission unit 154 transmits various information to an external information processing device. For example, the transmission unit 154 transmits various information to another information processing device such as the information processing device 100. The transmission unit 154 transmits the information stored in the storage unit 14. Further, the transmission unit 154 transmits various information based on information from another information processing device such as the information processing device 100. The transmission unit 154 transmits various types of information based on the information stored in the storage unit 14. The transmission unit 154 transmits the information received by the reception unit 153.

送信部１５４は、受付部１５３によって受け付けられたクラスタ数と複数のクエリを情報処理装置１００に送信する。具体的には、送信部１５４は、受付部１５３によって送信ボタンＢ１１の選択操作が受け付けられると、受付部１５３によって受け付けられたクラスタ数と複数のクエリを情報処理装置１００に送信する。 The transmission unit 154 transmits the number of clusters received by the reception unit 153 and a plurality of queries to the information processing device 100. Specifically, when the reception unit 153 accepts the selection operation of the transmission button B11, the transmission unit 154 transmits the number of clusters accepted by the reception unit 153 and a plurality of queries to the information processing device 100.

図１の例では、送信部１５４は、受付部１５３によって送信ボタンＢ１１の選択操作が受け付けられると、受付部１５３によって受け付けられたクラスタ数「３」を情報処理装置１００に送信する。送信部１５４は、受付部１５３によって受け付けられた数字であるクラスタ数「３」を情報処理装置１００に送信する。 In the example of FIG. 1, when the reception unit 153 accepts the selection operation of the transmission button B11, the transmission unit 154 transmits the number of clusters "3" accepted by the reception unit 153 to the information processing device 100. The transmission unit 154 transmits the number of clusters “3”, which is a number received by the reception unit 153, to the information processing device 100.

また、送信部１５４は、受付部１５３によって送信ボタンＢ１１の選択操作が受け付けられると、受付部１５３によって受け付けられた１２個のクエリＱ１-１〜Ｑ１-１２を情報処理装置１００に送信する。送信部１５４は、受付部１５３によって受け付けられた１２個の文字列である１２個のクエリＱ１-１〜Ｑ１-１２を情報処理装置１００に送信する。 Further, when the reception unit 153 accepts the selection operation of the transmission button B11, the transmission unit 154 transmits the 12 queries Q1-1 to Q1-12 accepted by the reception unit 153 to the information processing device 100. The transmission unit 154 transmits the 12 queries Q1-1 to Q1-12, which are the 12 character strings accepted by the reception unit 153, to the information processing device 100.

〔４．情報処理のフロー〕
次に、図９を用いて、実施形態に係る情報処理の手順について説明する。図９は、実施形態に係る情報処理手順を示すフローチャートである。図９に示す例では、情報処理装置１００は、複数のクエリと指定クラスタ数とを端末装置１０から受け付けたか否かを判定する（ステップＳ１０１）。情報処理装置１００は、複数のクエリと指定クラスタ数とを受け付けなかった場合（ステップＳ１０１；Ｎｏ）、複数のクエリと指定クラスタ数とを受け付けるまで待機する。 [4. Information processing flow]
Next, the procedure of information processing according to the embodiment will be described with reference to FIG. 9. FIG. 9 is a flowchart showing an information processing procedure according to the embodiment. In the example shown in FIG. 9, the information processing apparatus 100 determines whether or not a plurality of queries and a specified number of clusters have been received from the terminal apparatus 10 (step S101). When the information processing apparatus 100 has not received a plurality of queries and a specified number of clusters (step S101; No), it waits until they are received.

続いて、情報処理装置１００は、複数のクエリと指定クラスタ数とを受け付けた場合（ステップＳ１０１；Ｙｅｓ）、複数のクエリを指定クラスタ数に分類することにより生成したクラスタに関するクラスタ情報を生成する（ステップＳ１０２）。 Subsequently, when the information processing apparatus 100 has received a plurality of queries and a specified number of clusters (step S101; Yes), it generates cluster information regarding the clusters generated by classifying the plurality of queries into the specified number of clusters (step S102).

続いて、情報処理装置１００は、クラスタ情報を生成すると、生成したクラスタ情報を端末装置１０に出力する（ステップＳ１０３）。 Subsequently, when the information processing apparatus 100 generates the cluster information, the information processing apparatus 100 outputs the generated cluster information to the terminal apparatus 10 (step S103).
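The flow of steps S101 to S103 can be sketched as below. This is only an outline under stated assumptions: `handle_request`, the dictionary layout of the cluster information, and `round_robin_clusters` are hypothetical names introduced for illustration, and the clustering itself is passed in as a function because this section does not fix how the classification is performed.

```python
from typing import Callable, Optional

# A clustering function takes the queries and a cluster count and returns
# one list of queries per cluster. Its internals are outside this sketch.
ClusterFn = Callable[[list[str], int], list[list[str]]]


def handle_request(queries: Optional[list[str]],
                   specified_count: Optional[int],
                   cluster_fn: ClusterFn) -> Optional[dict]:
    # Step S101: have a set of queries and a specified cluster count arrived?
    if not queries or specified_count is None:
        return None  # S101; No: the caller keeps waiting

    # Step S102: classify the queries into the specified number of clusters
    # and build cluster information about the result.
    clusters = cluster_fn(queries, specified_count)
    cluster_info = {
        "specified_count": specified_count,
        "clusters": [
            {"id": f"CL{i}", "queries": members}
            for i, members in enumerate(clusters, start=1)
        ],
    }

    # Step S103: output the cluster information to the terminal device.
    return cluster_info


def round_robin_clusters(queries: list[str], count: int) -> list[list[str]]:
    """Placeholder only: deals queries out in turn; not a real clustering."""
    return [queries[i::count] for i in range(count)]
```

Keeping the clustering behind a function parameter lets the same flow be exercised with any classifier that returns the requested number of groups.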

〔５．変形例〕
上述した実施形態に係る情報処理システム１は、上記実施形態以外にも種々の異なる形態にて実施されてよい。そこで、以下では、情報処理システム１の他の実施形態について説明する。なお、実施形態と同一部分には、同一符号を付して説明を省略する。 [5. Modification example]
The information processing system 1 according to the above-described embodiment may be implemented in various different forms other than the above-described embodiment. Therefore, another embodiment of the information processing system 1 will be described below. The same parts as those in the embodiment are designated by the same reference numerals, and the description thereof will be omitted.

〔５−１．指定クラスタ数よりも少ないクラスタ数のクラスタ情報〕
情報処理装置１００は、数の対象情報を指定クラスタ数よりも少ない数のクラスタに分類することにより生成したクラスタに関するクラスタ情報を生成する。また、情報処理装置１００は、生成したクラスタ情報を指定クラスタ情報と比較可能に出力する。 [5-1. Cluster information for clusters less than the specified number of clusters]
The information processing apparatus 100 generates cluster information regarding clusters generated by classifying a plurality of pieces of target information into fewer clusters than the designated number of clusters. Further, the information processing apparatus 100 outputs the generated cluster information so that it can be compared with the designated cluster information.

具体的には、生成部１３３は、複数の対象情報を指定クラスタ数よりも少ない数のクラスタに分類することにより生成したクラスタに関するクラスタ情報を生成する。続いて、出力部１３４は、生成部１３３によって生成されたクラスタ情報を指定クラスタ情報と比較可能に出力する。具体的には、生成部１３３は、指定クラスタ情報とクラスタ情報とを対比可能に上下に並べて表示する部分コンテンツＣ１２を生成する。例えば、生成部１３３は、図１に示すような指定クラスタ数「３」の３つのクラスタＣＬ１〜クラスタＣＬ３に関する指定クラスタ情報を生成する。また、生成部１３３は、指定クラスタ数「３」より少ないクラスタ数「２」の２つのクラスタＣＬ１（Ｑ１-１、Ｑ１-２、Ｑ１-３、Ｑ１-７、Ｑ１-８、Ｑ１-１１）とクラスタＣＬ２（Ｑ１-４、Ｑ１-５、Ｑ１-６、Ｑ１-９、Ｑ１-１０、Ｑ１-１２）に関するクラスタ情報を生成する。続いて、生成部１３３は、指定クラスタ情報とクラスタ情報とを対比可能に上下に並べて表示する部分コンテンツＣ１２を生成する。続いて、出力部１３４は、生成部によって生成された部分コンテンツＣ１２を出力する。 Specifically, the generation unit 133 generates cluster information regarding clusters generated by classifying the plurality of pieces of target information into fewer clusters than the designated number of clusters. Subsequently, the output unit 134 outputs the cluster information generated by the generation unit 133 so that it can be compared with the designated cluster information. Specifically, the generation unit 133 generates the partial content C12, which displays the designated cluster information and the cluster information one above the other so that they can be compared. For example, the generation unit 133 generates the designated cluster information regarding the three clusters CL1 to CL3 for the designated number of clusters "3", as shown in FIG. 1. In addition, for the number of clusters "2", which is less than the designated number of clusters "3", the generation unit 133 generates cluster information regarding two clusters: cluster CL1 (Q1-1, Q1-2, Q1-3, Q1-7, Q1-8, Q1-11) and cluster CL2 (Q1-4, Q1-5, Q1-6, Q1-9, Q1-10, Q1-12). Subsequently, the generation unit 133 generates the partial content C12, which displays the designated cluster information and the cluster information one above the other so that they can be compared. Subsequently, the output unit 134 outputs the partial content C12 generated by the generation unit 133.

また、生成部１３３は、指定クラスタ情報とクラスタ情報との相違点（差分）に相当するクエリの文字色を変えた情報を生成してもよい。例えば、生成部１３３は、指定クラスタ数「３」の場合にクラスタＣＬ３に分類されていた（Ｑ１-１１、Ｑ１-１２）の文字色を他のクエリとは異なる色（例えば、赤色）に着色した情報を生成する。続いて、生成部１３３は、指定クラスタ情報とクラスタ情報とを対比可能に上下に並べて表示する部分コンテンツＣ１２を生成する。続いて、出力部１３４は、生成部１３３によって生成された部分コンテンツＣ１２を出力する。 Further, the generation unit 133 may generate information in which the character color of the queries corresponding to the differences between the designated cluster information and the cluster information is changed. For example, the generation unit 133 generates information in which the queries (Q1-11, Q1-12) that were classified into cluster CL3 when the designated number of clusters was "3" are shown in a color different from the other queries (for example, red). Subsequently, the generation unit 133 generates the partial content C12, which displays the designated cluster information and the cluster information one above the other so that they can be compared. Subsequently, the output unit 134 outputs the partial content C12 generated by the generation unit 133.
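One way to find the queries to be colored is sketched below. It assumes, purely for illustration, that cluster identifiers are stable between the two results, so that a query is highlighted when its cluster in the designated result has no counterpart in the reduced result; the function name is hypothetical. The three-cluster membership shown is inferred from the example above (only Q1-11 and Q1-12 change cluster when the count drops from 3 to 2) and is not stated explicitly in this section.

```python
def queries_to_highlight(designated: dict[str, list[str]],
                         reduced: dict[str, list[str]]) -> list[str]:
    """Return queries whose designated cluster is absent from the reduced result."""
    highlighted = []
    for cluster_id, members in designated.items():
        if cluster_id not in reduced:
            highlighted.extend(members)
    return highlighted


# Designated result (3 clusters); membership inferred from the example.
designated = {
    "CL1": ["Q1-1", "Q1-2", "Q1-3", "Q1-7", "Q1-8"],
    "CL2": ["Q1-4", "Q1-5", "Q1-6", "Q1-9", "Q1-10"],
    "CL3": ["Q1-11", "Q1-12"],
}
# Result for the smaller cluster count (2 clusters), as given in the text.
reduced = {
    "CL1": ["Q1-1", "Q1-2", "Q1-3", "Q1-7", "Q1-8", "Q1-11"],
    "CL2": ["Q1-4", "Q1-5", "Q1-6", "Q1-9", "Q1-10", "Q1-12"],
}
highlight = queries_to_highlight(designated, reduced)
```

The resulting list can then be used when rendering the partial content C12, for example by drawing those queries in red.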

〔６．第１モデルの生成処理〕
次に、図１０を用いて、第１モデルの生成処理の流れについて説明する。図１０は、実施形態に係る第１モデルの生成処理の一例を示す図である。図１０に示す例では、生成装置５０は、同一の利用者Ｕ１によって所定の時間内に連続して入力された「六本木パスタ」という検索クエリＱ１１と「六本木イタリアン」という検索クエリＱ１２とから成る一対の検索クエリを抽出する（ステップＳ１１）。 [6. First model generation process]
Next, the flow of the generation process of the first model will be described with reference to FIG. 10. FIG. 10 is a diagram showing an example of the generation process of the first model according to the embodiment. In the example shown in FIG. 10, the generation device 50 extracts a pair of search queries consisting of the search query Q11 "Roppongi pasta" and the search query Q12 "Roppongi Italian", which were input consecutively by the same user U1 within a predetermined time (step S11).

続いて、生成装置５０は、抽出した検索クエリＱ１１を第１モデルＭ１に入力して、検索クエリＱ１１の分散表現であるベクトルＢＱＶ１１を出力する。ここで、ベクトルＢＱＶ１１は、第１モデルＭ１の出力層から出力されたばかりの検索クエリＱ１１の分散表現であって、第１モデルＭ１にフィードバックをかける前（学習前）の分散表現を示す。また、生成装置５０は、抽出した検索クエリＱ１２を第１モデルＭ１に入力して、検索クエリＱ１２の分散表現であるベクトルＢＱＶ１２を出力する。ここで、ベクトルＢＱＶ１２は、第１モデルＭ１の出力層から出力されたばかりの検索クエリＱ１２の分散表現であって、第１モデルＭ１にフィードバックをかける前（学習前）の分散表現を示す。このようにして、生成装置５０は、検索クエリＱ１１の分散表現であるベクトルＢＱＶ１１と、検索クエリＱ１２の分散表現であるベクトルＢＱＶ１２とを出力する（ステップＳ１２）。 Subsequently, the generation device 50 inputs the extracted search query Q11 into the first model M1 and outputs a vector BQV11 which is a distributed representation of the search query Q11. Here, the vector BQV11 is a distributed representation of the search query Q11 just output from the output layer of the first model M1, and shows a distributed representation before giving feedback to the first model M1 (before learning). Further, the generation device 50 inputs the extracted search query Q12 into the first model M1 and outputs a vector BQV12 which is a distributed representation of the search query Q12. Here, the vector BQV12 is a distributed representation of the search query Q12 just output from the output layer of the first model M1, and shows a distributed representation before giving feedback to the first model M1 (before learning). In this way, the generation device 50 outputs the vector BQV11 which is the distributed representation of the search query Q11 and the vector BQV12 which is the distributed representation of the search query Q12 (step S12).

続いて、生成装置５０は、同一の利用者Ｕ１によって所定の時間内に連続して入力された検索クエリＱ１１（「六本木パスタ」）と検索クエリＱ１２（「六本木イタリアン」）とから成る一対の検索クエリは、所定の検索意図（例えば、「ある場所で飲食店を探す」という検索意図）で入力された検索クエリであると推定されるため、相互に類似する特徴を有するものとして、検索クエリＱ１１の分散表現（ベクトルＱＶ１１）と、検索クエリＱ１１と対となる検索クエリＱ１２の分散表現（ベクトルＱＶ１２）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。例えば、第１モデルＭ１にフィードバックをかける前（学習前）の検索クエリＱ１１の分散表現であるベクトルＢＱＶ１１と検索クエリＱ１２の分散表現であるベクトルＢＱＶ１２とのなす角度の大きさをΘとする。また、第１モデルＭ１にフィードバックをかけた後（学習後）の検索クエリＱ１１の分散表現であるベクトルＱＶ１１と検索クエリＱ１２の分散表現であるベクトルＱＶ１２とのなす角度の大きさをΦとする。この時、生成装置５０は、ΘよりもΦが小さくなるように、第１モデルＭ１を学習させる。例えば、生成装置５０は、ベクトルＢＱＶ１１とベクトルＢＱＶ１２のコサイン類似度の値を算出する。また、生成装置５０は、ベクトルＱＶ１１とベクトルＱＶ１２のコサイン類似度の値を算出する。続いて、生成装置５０は、ベクトルＢＱＶ１１とベクトルＢＱＶ１２のコサイン類似度の値よりも、ベクトルＱＶ１１とベクトルＱＶ１２のコサイン類似度の値が大きくなるように（値が１に近づくように）第１モデルＭ１を学習させる。このように、生成装置５０は、一対の検索クエリに対応する一対の分散表現である２つのベクトルが分散表現空間上で類似するように第１モデルＭ１を学習させることで、検索クエリから分散表現を出力する第１モデルＭ１を生成する（ステップＳ１３）。なお、生成装置５０は、コサイン類似度に限らず、ベクトル間の距離尺度として適用可能な指標であれば、どのような指標に基づいて分散表現の間の類似度を算出してもよい。また、生成装置５０は、ベクトル間の距離尺度として適用可能な指標であれば、どのような指標に基づいて第１モデルＭ１を学習させてもよい。例えば、生成装置５０は、分散表現同士のユークリッド距離や双曲空間等の非ユークリッド空間中での距離、マンハッタン距離、マハラノビス距離等といった所定の距離関数の値を算出する。続いて、生成装置５０は、分散表現同士の所定の距離関数の値（すなわち、分散表現空間における距離）が小さくなるように第１モデルＭ１を学習させてもよい。 Subsequently, the generation device 50 presumes that the pair of search queries consisting of the search query Q11 ("Roppongi pasta") and the search query Q12 ("Roppongi Italian"), which were input consecutively by the same user U1 within a predetermined time, were input with a predetermined search intention (for example, the search intention of "searching for a restaurant in a certain place"). Treating the two queries as having mutually similar characteristics, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q11 (vector QV11) and the distributed representation of the search query Q12 paired with it (vector QV12) are similar in the distributed representation space.
For example, let Θ be the magnitude of the angle formed by the vector BQV11, which is the distributed representation of the search query Q11 before feedback is applied to the first model M1 (before learning), and the vector BQV12, which is the distributed representation of the search query Q12 before learning. Further, let Φ be the magnitude of the angle formed by the vector QV11, which is the distributed representation of the search query Q11 after feedback is applied to the first model M1 (after learning), and the vector QV12, which is the distributed representation of the search query Q12 after learning. At this time, the generation device 50 trains the first model M1 so that Φ is smaller than Θ. For example, the generation device 50 calculates the cosine similarity between the vector BQV11 and the vector BQV12. Further, the generation device 50 calculates the cosine similarity between the vector QV11 and the vector QV12. Subsequently, the generation device 50 trains the first model M1 so that the cosine similarity between the vector QV11 and the vector QV12 is larger than the cosine similarity between the vector BQV11 and the vector BQV12 (so that the value approaches 1). In this way, the generation device 50 trains the first model M1 so that the two vectors, which are the pair of distributed representations corresponding to the pair of search queries, are similar in the distributed representation space, thereby generating the first model M1, which outputs a distributed representation from a search query (step S13). The generation device 50 is not limited to cosine similarity and may calculate the similarity between distributed representations based on any index that is applicable as a distance measure between vectors. Further, the generation device 50 may train the first model M1 based on any index that is applicable as a distance measure between vectors.
For example, the generation device 50 calculates the value of a predetermined distance function, such as the Euclidean distance between distributed representations, a distance in a non-Euclidean space such as a hyperbolic space, the Manhattan distance, or the Mahalanobis distance. Subsequently, the generation device 50 may train the first model M1 so that the value of the predetermined distance function between the distributed representations (that is, the distance in the distributed representation space) becomes small.
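The relationship between the angles Θ and Φ and the cosine similarity can be checked with a short calculation. The sketch below uses only the standard library; the two-dimensional vectors are made-up values chosen for illustration and do not come from the embodiment, and the helper names are hypothetical.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between u and v (1.0 means the same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle(u: list[float], v: list[float]) -> float:
    """Angle between u and v in radians."""
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(cos)


# Illustrative values only: representations before learning (BQV) ...
bqv11, bqv12 = [1.0, 0.0], [0.0, 1.0]
# ... and after learning (QV), where the pair has been pulled together.
qv11, qv12 = [1.0, 0.2], [0.8, 0.6]

theta = angle(bqv11, bqv12)  # angle before learning
phi = angle(qv11, qv12)      # angle after learning
```

With these values Φ is smaller than Θ, and correspondingly the cosine similarity after learning is larger than before, which is the direction of change the training aims for.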

次に、図１１を用いて、第１モデルの生成処理の流れについてより詳しく説明する。なお、図１１の説明では、図９の説明と重複する部分は、適宜省略する。図１１は、実施形態に係る第１モデルの生成処理を示す図である。図１１に示す例では、生成装置５０が生成した第１モデルＭ１によって出力された分散表現が分散表現空間にマッピングされる様子が示されている。生成装置５０は、所定の検索クエリの分散表現と所定の検索クエリと対となる他の検索クエリの分散表現とが分散表現空間上で近くにマッピングされるように第１モデルＭ１のトレーニングを行う。 Next, the flow of the generation process of the first model will be described in more detail with reference to FIG. 11. In the description of FIG. 11, parts that overlap with the description of FIG. 9 will be omitted as appropriate. FIG. 11 is a diagram showing the generation process of the first model according to the embodiment. The example shown in FIG. 11 illustrates how the distributed representations output by the first model M1 generated by the generation device 50 are mapped to the distributed representation space. The generation device 50 trains the first model M1 so that the distributed representation of a predetermined search query and the distributed representation of another search query paired with the predetermined search query are mapped close to each other in the distributed representation space.

図１１の上段に示す例では、生成装置５０は、同一の利用者Ｕ１によって所定の時間内に連続して入力された４個の検索クエリである検索クエリＱ１１（「六本木パスタ」）、検索クエリＱ１２（「六本木イタリアン」）、検索クエリＱ１３（「赤坂パスタ」）、検索クエリＱ１４（「麻布パスタ」）を抽出する。生成装置５０は、同一の利用者Ｕ１によって各検索クエリが入力された時間の間隔が所定の時間内である４個の検索クエリを抽出する。生成装置５０は、同一の利用者Ｕ１によって後述する各検索クエリのペアが入力された時間の間隔が所定の時間内である複数の検索クエリを抽出する。生成装置５０は、検索クエリが入力された順番に並べると、検索クエリＱ１１、検索クエリＱ１２、検索クエリＱ１３、検索クエリＱ１４の順番で入力された４個の検索クエリを抽出する。生成装置５０は、４個の検索クエリを抽出すると、時系列的に隣り合う２つの検索クエリを一対の検索クエリとして、３対の検索クエリのペアである（検索クエリＱ１１、検索クエリＱ１２）、（検索クエリＱ１２、検索クエリＱ１３）、（検索クエリＱ１３、検索クエリＱ１４）を抽出する（ステップＳ２１−１）。なお、生成装置５０は、同一の利用者Ｕ１によって全ての検索クエリが所定の時間内に入力された複数の検索クエリを抽出してもよい。そして、生成装置５０は、時系列的に隣り合うか否かに関わらず、抽出した複数の検索クエリの中から２つの検索クエリを選択して、選択した２つの検索クエリを一対の検索クエリとして抽出してもよい。 In the example shown in the upper part of FIG. 11, the generation device 50 extracts four search queries input consecutively by the same user U1 within a predetermined time: the search query Q11 ("Roppongi pasta"), the search query Q12 ("Roppongi Italian"), the search query Q13 ("Akasaka pasta"), and the search query Q14 ("Azabu pasta"). The generation device 50 extracts four search queries for which the interval between the times at which the same user U1 input the individual search queries is within a predetermined time. The generation device 50 extracts a plurality of search queries for which the interval between the input times of the two search queries in each pair described below is within a predetermined time. Arranged in the order in which they were input, the four extracted search queries are the search query Q11, the search query Q12, the search query Q13, and the search query Q14. Having extracted the four search queries, the generation device 50 treats each two search queries that are adjacent in time series as a pair and extracts three pairs of search queries: (search query Q11, search query Q12), (search query Q12, search query Q13), and (search query Q13, search query Q14) (step S21-1).
The generation device 50 may instead extract a plurality of search queries that were all input by the same user U1 within a predetermined time. The generation device 50 may then select two search queries from the extracted search queries, regardless of whether they are adjacent in time series, and extract the two selected search queries as a pair.
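The two pairing strategies described above can be sketched as follows. The log format (a list of `(timestamp_seconds, query)` tuples for one user), the function names, the timestamps, and the threshold values are all assumptions made for illustration; the text speaks only of "a predetermined time".

```python
from itertools import combinations


def adjacent_pairs(log: list[tuple[float, str]],
                   max_gap_seconds: float) -> list[tuple[str, str]]:
    """Pair each query with the next one if the gap between them is small enough."""
    ordered = sorted(log)  # sort by timestamp
    pairs = []
    for (t_prev, q_prev), (t_next, q_next) in zip(ordered, ordered[1:]):
        if t_next - t_prev <= max_gap_seconds:
            pairs.append((q_prev, q_next))
    return pairs


def all_pairs_within_window(log: list[tuple[float, str]],
                            window_seconds: float) -> list[tuple[str, str]]:
    """Pair every two queries, adjacent or not, if all fall within one window."""
    ordered = sorted(log)
    if not ordered or ordered[-1][0] - ordered[0][0] > window_seconds:
        return []
    return list(combinations([query for _, query in ordered], 2))


# Hypothetical timestamps (seconds) for the four queries of user U1.
log_u1 = [
    (0, "Roppongi pasta"),     # Q11
    (40, "Roppongi Italian"),  # Q12
    (90, "Akasaka pasta"),     # Q13
    (130, "Azabu pasta"),      # Q14
]
pairs_adjacent = adjacent_pairs(log_u1, max_gap_seconds=60)
pairs_all = all_pairs_within_window(log_u1, window_seconds=300)
```

For four queries, the adjacent strategy yields the three pairs named in step S21-1, while the all-pairs variant yields every two-query combination from the same window.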

続いて、生成装置５０は、抽出した検索クエリＱ１ｋ（ｋ＝１、２、３、４）を第１モデルＭ１に入力して、検索クエリＱ１ｋ（ｋ＝１、２、３、４）の分散表現であるベクトルＢＱＶ１ｋ（ｋ＝１、２、３、４）を出力する。ここで、ベクトルＢＱＶ１ｋ（ｋ＝１、２、３、４）は、第１モデルＭ１の出力層から出力されたばかりの検索クエリＱ１ｋ（ｋ＝１、２、３、４）の分散表現であって、第１モデルＭ１にフィードバックをかける前（学習前）の分散表現を示す（ステップＳ２２−１）。 Subsequently, the generation device 50 inputs each extracted search query Q1k (k = 1, 2, 3, 4) into the first model M1 and outputs the vector BQV1k (k = 1, 2, 3, 4), which is the distributed representation of the search query Q1k. Here, the vector BQV1k (k = 1, 2, 3, 4) is the distributed representation of the search query Q1k as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning) (step S22-1).

続いて、生成装置５０は、同一の利用者Ｕ１によって所定の時間内に連続して入力された一対の検索クエリは、所定の検索意図（例えば、「ある場所（東京都港区付近）で飲食店を探す」という検索意図）で入力された検索クエリであると推定されるため、相互に類似する特徴を有するものとして、検索クエリＱ１１の分散表現（ベクトルＱＶ１１）と、検索クエリＱ１１と対となる検索クエリＱ１２の分散表現（ベクトルＱＶ１２）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。また、生成装置５０は、検索クエリＱ１２の分散表現（ベクトルＱＶ１２）と、検索クエリＱ１２と対となる検索クエリＱ１３の分散表現（ベクトルＱＶ１３）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。また、生成装置５０は、検索クエリＱ１３の分散表現（ベクトルＱＶ１３）と、検索クエリＱ１３と対となる検索クエリＱ１４の分散表現（ベクトルＱＶ１４）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。このように、生成装置５０は、一対の検索クエリに対応する一対の分散表現である２つのベクトルが分散表現空間上で類似するように第１モデルＭ１を学習させることで、検索クエリから分散表現を出力する第１モデルＭ１を生成する（ステップＳ２３−１）。 Subsequently, the generation device 50 presumes that a pair of search queries input consecutively by the same user U1 within a predetermined time were input with a predetermined search intention (for example, the search intention of "searching for a restaurant in a certain place (near Minato-ku, Tokyo)"). Treating the paired queries as having mutually similar characteristics, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q11 (vector QV11) and the distributed representation of the search query Q12 paired with it (vector QV12) are similar in the distributed representation space. Further, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q12 (vector QV12) and the distributed representation of the search query Q13 paired with it (vector QV13) are similar in the distributed representation space. Further, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q13 (vector QV13) and the distributed representation of the search query Q14 paired with it (vector QV14) are similar in the distributed representation space.
In this way, the generation device 50 trains the first model M1 so that the two vectors, which are the pair of distributed representations corresponding to a pair of search queries, are similar in the distributed representation space, thereby generating the first model M1, which outputs a distributed representation from a search query (step S23-1).

図１１の上段に示す情報処理の結果として、検索クエリＱ１ｋ（ｋ＝１、２、３、４）の分散表現であるベクトルＱＶ１ｋ（ｋ＝１、２、３、４）が分散表現空間の近い位置にクラスタＣＬ１１としてマッピングされる様子が示されている。例えば、検索クエリＱ１ｋ（ｋ＝１、２、３、４）は、利用者Ｕ１によって「ある場所（東京都港区付近）で飲食店を探す」という検索意図の下で検索された検索クエリの集合であると推定される。すなわち、検索クエリＱ１ｋ（ｋ＝１、２、３、４）は、「ある場所（東京都港区付近）で飲食店を探す」という検索意図の下で検索された検索クエリであるという点で、相互に類似する特徴を有する検索クエリであると推定される。ここで、生成装置５０は、「ある場所（東京都港区付近）で飲食店を探す」という検索意図で入力された所定の検索クエリが第１モデルに入力されると、クラスタＣＬ１１の位置にマッピングされるような分散表現を出力することができる。これにより、例えば、生成装置５０は、クラスタＣＬ１１の位置にマッピングされる分散表現に対応する検索クエリを抽出することにより、「ある場所（東京都港区付近）で飲食店を探す」という検索意図に応じた検索クエリを抽出することができる。したがって、生成装置５０は、検索クエリの意味を適切に解釈可能とすることができる。 As a result of the information processing shown in the upper part of FIG. 11, the vectors QV1k (k = 1, 2, 3, 4), which are the distributed representations of the search queries Q1k (k = 1, 2, 3, 4), are shown mapped to nearby positions in the distributed representation space as the cluster CL11. For example, the search queries Q1k (k = 1, 2, 3, 4) are presumed to be a set of search queries searched by the user U1 with the search intention of "searching for a restaurant in a certain place (near Minato-ku, Tokyo)". That is, the search queries Q1k (k = 1, 2, 3, 4) are presumed to have mutually similar characteristics in that they were all searched with that search intention. Here, when a predetermined search query input with the search intention of "searching for a restaurant in a certain place (near Minato-ku, Tokyo)" is input to the first model, the generation device 50 can output a distributed representation that is mapped to the position of the cluster CL11. As a result, for example, by extracting the search queries corresponding to the distributed representations mapped to the position of the cluster CL11, the generation device 50 can extract search queries that match the search intention of "searching for a restaurant in a certain place (near Minato-ku, Tokyo)".
Therefore, the generation device 50 makes it possible to interpret the meaning of a search query appropriately.

図１１の下段に示す例では、生成装置５０は、同一の利用者Ｕ２によって所定の時間内に連続して入力された３個の検索クエリである検索クエリＱ２１（「冷蔵庫４００Ｌ」）、検索クエリＱ２２（「冷蔵庫中型」）、検索クエリＱ２３（「冷蔵庫中型おすすめ」）を抽出する。生成装置５０は、検索クエリが入力された順番に並べると、検索クエリＱ２１、検索クエリＱ２２、検索クエリＱ２３の順番で入力された３個の検索クエリを抽出する。生成装置５０は、３個の検索クエリを抽出すると、時系列的に隣り合う２つの検索クエリを一対の検索クエリとして、２対の検索クエリのペアである（検索クエリＱ２１、検索クエリＱ２２）、（検索クエリＱ２２、検索クエリＱ２３）を抽出する（ステップＳ２１−２）。 In the example shown in the lower part of FIG. 11, the generation device 50 extracts three search queries input consecutively by the same user U2 within a predetermined time: the search query Q21 ("refrigerator 400L"), the search query Q22 ("refrigerator medium size"), and the search query Q23 ("refrigerator medium size recommended"). Arranged in the order in which they were input, the three extracted search queries are the search query Q21, the search query Q22, and the search query Q23. Having extracted the three search queries, the generation device 50 treats each two search queries that are adjacent in time series as a pair and extracts two pairs of search queries: (search query Q21, search query Q22) and (search query Q22, search query Q23) (step S21-2).

続いて、生成装置５０は、抽出した検索クエリＱ２ｍ（ｍ＝１、２、３）を第１モデルＭ１に入力して、検索クエリＱ２ｍ（ｍ＝１、２、３）の分散表現であるベクトルＢＱＶ２ｍ（ｍ＝１、２、３）を出力する。ここで、ベクトルＢＱＶ２ｍ（ｍ＝１、２、３）は、第１モデルＭ１の出力層から出力されたばかりの検索クエリＱ２ｍ（ｍ＝１、２、３）の分散表現であって、第１モデルＭ１にフィードバックをかける前（学習前）の分散表現を示す（ステップＳ２２−２）。 Subsequently, the generation device 50 inputs the extracted search query Q2m (m = 1, 2, 3) into the first model M1, and a vector that is a distributed representation of the search query Q2m (m = 1, 2, 3). BQV2m (m = 1, 2, 3) is output. Here, the vector BQV2m (m = 1, 2, 3) is a distributed representation of the search query Q2m (m = 1, 2, 3) just output from the output layer of the first model M1, and is the first model. The distributed expression before giving feedback to M1 (before learning) is shown (step S22-2).

続いて、生成装置５０は、同一の利用者Ｕ２によって所定の時間内に連続して入力された一対の検索クエリは、所定の検索意図（例えば、「中型の冷蔵庫を調べる」という検索意図）で入力された検索クエリであると推定されるため、相互に類似する特徴を有するものとして、検索クエリＱ２１の分散表現（ベクトルＱＶ２１）と、検索クエリＱ２１と対となる検索クエリＱ２２の分散表現（ベクトルＱＶ２２）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。また、生成装置５０は、検索クエリＱ２２の分散表現（ベクトルＱＶ２２）と、検索クエリＱ２２と対となる検索クエリＱ２３の分散表現（ベクトルＱＶ２３）とが、分散表現空間上で類似するように第１モデルＭ１を学習させる。このように、生成装置５０は、一対の検索クエリに対応する一対の分散表現である２つのベクトルが分散表現空間上で類似するように第１モデルＭ１を学習させることで、検索クエリから分散表現を出力する第１モデルＭ１を生成する（ステップＳ２３−２）。 Subsequently, the generation device 50 presumes that a pair of search queries input consecutively by the same user U2 within a predetermined time were input with a predetermined search intention (for example, the search intention of "looking into medium-sized refrigerators"). Treating the paired queries as having mutually similar characteristics, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q21 (vector QV21) and the distributed representation of the search query Q22 paired with it (vector QV22) are similar in the distributed representation space. Further, the generation device 50 trains the first model M1 so that the distributed representation of the search query Q22 (vector QV22) and the distributed representation of the search query Q23 paired with it (vector QV23) are similar in the distributed representation space. In this way, the generation device 50 trains the first model M1 so that the two vectors, which are the pair of distributed representations corresponding to a pair of search queries, are similar in the distributed representation space, thereby generating the first model M1, which outputs a distributed representation from a search query (step S23-2).

図１１の下段に示す情報処理の結果として、検索クエリＱ２ｍ（ｍ＝１、２、３）の分散表現であるベクトルＱＶ２ｍ（ｍ＝１、２、３）が分散表現空間の近い位置にクラスタＣＬ２１としてマッピングされる様子が示されている。例えば、検索クエリＱ２ｍ（ｍ＝１、２、３）は、利用者Ｕ２によって「中型の冷蔵庫を調べる」という検索意図の下で検索された検索クエリの集合であると推定される。すなわち、Ｑ２ｍ（ｍ＝１、２、３）は、「中型の冷蔵庫を調べる」という検索意図の下で検索された検索クエリであるという点で、相互に類似する特徴を有する検索クエリであると推定される。ここで、生成装置５０は、「中型の冷蔵庫を調べる」という検索意図で入力された所定の検索クエリが第１モデルに入力されると、クラスタＣＬ２１の位置にマッピングされるような分散表現を出力することができる。これにより、例えば、生成装置５０は、クラスタＣＬ２１の位置にマッピングされる分散表現に対応する検索クエリを抽出することにより、「中型の冷蔵庫を調べる」という検索意図に応じた検索クエリを抽出することができる。したがって、生成装置５０は、検索クエリの意味を適切に解釈可能とすることができる。 As a result of the information processing shown in the lower part of FIG. 11, the vectors QV2m (m = 1, 2, 3), which are the distributed representations of the search queries Q2m (m = 1, 2, 3), are shown mapped to nearby positions in the distributed representation space as the cluster CL21. For example, the search queries Q2m (m = 1, 2, 3) are presumed to be a set of search queries searched by the user U2 with the search intention of "looking into medium-sized refrigerators". That is, the search queries Q2m (m = 1, 2, 3) are presumed to have mutually similar characteristics in that they were all searched with that search intention. Here, when a predetermined search query input with the search intention of "looking into medium-sized refrigerators" is input to the first model, the generation device 50 can output a distributed representation that is mapped to the position of the cluster CL21. As a result, for example, by extracting the search queries corresponding to the distributed representations mapped to the position of the cluster CL21, the generation device 50 can extract search queries that match the search intention of "looking into medium-sized refrigerators". Therefore, the generation device 50 makes it possible to interpret the meaning of a search query appropriately.

In addition, the generation device 50 according to the present invention trains the first model M1 on the assumption that randomly extracted search queries have mutually different features, in that they were searched under different search intentions. Specifically, the generation device 50 trains the first model M1 so that the distributed representation of a given search query and the distributed representation of a search query extracted at random, independently of the given search query, are mapped far apart in the distributed representation space. In the example shown in FIG. 11, suppose that the generation device 50 extracts a search query at random, independently of search query Q11, and that search query Q21 is extracted. In this case, the generation device 50 trains the first model M1 so that the distributed representation of search query Q11 (vector QV11) and the distributed representation of the randomly extracted search query Q21 (vector QV21) are mapped far apart in the distributed representation space. As a result, cluster CL11, which contains the vectors QV1k (k = 1, 2, 3, 4) that are the distributed representations of the search queries Q1k (k = 1, 2, 3, 4) searched with the search intention of "looking for a restaurant in a certain place (near Minato-ku, Tokyo)", and cluster CL21, which contains the vectors QV2m (m = 1, 2, 3) that are the distributed representations of the search queries Q2m (m = 1, 2, 3) searched with the search intention of "checking a medium-sized refrigerator", are mapped far apart in the distributed representation space. That is, by training the first model M1 so that the distributed representations of randomly extracted search queries differ from one another, the generation device 50 according to the present invention can output the distributed representations of search queries with different search intentions at distant positions in the distributed representation space.
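The two training signals described so far (paired queries pulled together, randomly extracted queries pushed apart) can be combined in a single per-example loss. The sketch below is an assumption for illustration only: the patent does not specify a loss function, so a margin-based hinge on cosine similarity is used here, and `pair_loss`, `margin`, and the two-dimensional vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pair_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss (assumed form): zero once the paired query is more similar
    to the anchor than the random query by at least `margin`."""
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)

# Hypothetical vectors: QV11 and QV12 come from the same session, QV21 is random.
qv11, qv12, qv21 = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
good_layout_loss = pair_loss(qv11, qv12, qv21)  # paired query close, random query far
bad_layout_loss = pair_loss(qv11, qv21, qv12)   # roles swapped: should be penalized
```

Minimizing such a loss over many (anchor, paired, random) triples is one way to obtain the cluster layout described for CL11 and CL21; other contrastive objectives would serve equally well.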

Note that when the distributed representations output by the first model M1 generated by the generation device 50 are mapped into the distributed representation space, clusters other than clusters CL11 and CL21 described above are also formed, such as cluster CL12 and cluster CL22, each of which is a set of distributed representations of multiple search queries input by the same user within a predetermined time.

As described above, the generation device 50 acquires search queries input by users. By learning on the assumption that, among the acquired search queries, multiple search queries input by the same user within a predetermined time have similar features, the generation device 50 generates a first model that predicts the feature information of a given search query from that search query. That is, the generation device 50 according to the present invention trains the first model on the assumption that multiple search queries input in succession within a predetermined time have mutually similar features, in that they were searched under a given search intention. Specifically, the generation device 50 trains the first model so that the distributed representations of multiple search queries input by the same user within a predetermined time are similar, thereby generating a first model that outputs, from a given search query, a distributed representation containing the feature information of that search query. In other words, by training the first model M1 so that the distributed representations of multiple search queries input in succession within a predetermined time are similar, the generation device 50 can output the distributed representations of search queries searched under a given search intention at nearby positions in the distributed representation space. This allows the generation device 50 to output (interpret) the meaning (search intention) of a search query according to the context of the user who input it. The generation device 50 therefore makes it possible to interpret the meaning of a search query appropriately.

In addition, by extracting the search queries whose distributed representations are mapped in the vicinity of the distributed representation containing the feature information of a given search query, the generation device 50 can extract search queries that match the search intention behind the given search query. That is, the generation device 50 makes it possible to analyze users' search trends while taking into account the search intention and context of the user who input a search query. The generation device 50 can therefore improve the accuracy of search trend analysis. The first model M1 generated by the generation device 50 can also be made to function as part of a search system. Alternatively, the generation device 50 can provide the distributed representation of a search query output by the first model M1 as input information to another system (for example, a search engine) that uses the feature information of the search query predicted by the first model M1. The search system can then select the content to be output as search results based on the feature information of the search query predicted by the first model M1, that is, while taking into account the search intention and context of the user who input the search query. Furthermore, based on the feature information of the search query predicted by the first model M1, the search system can calculate the similarity between the distributed representation of the text contained in each content item to be output as a search result and the distributed representation of the search query. The search system can then determine the display order of the content output as search results based on the calculated similarity, again taking into account the search intention and context of the user who input the search query. The generation device 50 can therefore improve usability in a search service.
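The display-order step described above can be sketched as a sort by similarity between the query's distributed representation and that of each content item. This is a minimal illustration, assuming cosine similarity as the measure and assuming content vectors are already available; `rank_contents`, the content IDs, and the vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_contents(query_vector, content_vectors):
    """Return content IDs ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), content_id)
              for content_id, vec in content_vectors.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [content_id for _, content_id in scored]

# Hypothetical distributed representations of a query and three content items.
query_vector = [1.0, 0.0]
content_vectors = {"c1": [0.0, 1.0], "c2": [1.0, 0.1], "c3": [1.0, 1.0]}
display_order = rank_contents(query_vector, content_vectors)
```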

[7. Generation device configuration]
Next, the configuration of the generation device 50 according to the embodiment will be described with reference to FIG. 12. FIG. 12 is a diagram showing a configuration example of the generation device 50 according to the embodiment. As shown in FIG. 12, the generation device 50 includes a communication unit 51, a storage unit 53, and a control unit 52. The generation device 50 may also include an input unit (for example, a keyboard and a mouse) that receives various operations from an administrator or the like of the generation device 50, and a display unit (for example, a liquid crystal display) for displaying various information.

(Communication unit 51)
The communication unit 51 is realized by, for example, a NIC (Network Interface Card). The communication unit 51 is connected to the network by wire or wirelessly and, for example, transmits and receives information to and from the terminal device 10 and the search server 20.

(Storage unit 53)
The storage unit 53 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. As shown in FIG. 12, the storage unit 53 includes a query information storage unit 531, a vector information storage unit 532, and a model information storage unit 533.

(Query information storage unit 531)
The query information storage unit 531 stores various information about search queries input by users. FIG. 13 shows an example of the query information storage unit according to the embodiment. In the example shown in FIG. 13, the query information storage unit 531 has the items "user ID", "date and time", "search query", and "search query ID".

"User ID" indicates identification information for identifying the user who input the search query. "Date and time" indicates the date and time at which the search server received the search query from the user. "Search query" indicates the search query input by the user. "Search query ID" indicates identification information for identifying the search query input by the user.

In the example shown in the first record of FIG. 13, the search query identified by the search query ID "Q11" (search query Q11) corresponds to search query Q11 shown in FIG. 10. The user ID "U1" indicates that the user who input search query Q11 is the user identified by the user ID "U1" (user U1). The date and time "2018/9/1 PM17:00" indicates that the search server received search query Q11 from user U1 at 17:00 (5:00 p.m.) on September 1, 2018. The search query "Roppongi pasta" indicates search query Q11 input by user U1. Specifically, the search query "Roppongi pasta" is character information in which "Roppongi", indicating a place name, and "pasta", indicating a type of food, are separated by a space serving as a delimiter.
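One record of the query information storage unit can be modeled as a small immutable record type. The sketch below mirrors the four items of FIG. 13; the class name `QueryRecord` and the field names are hypothetical, and the in-memory representation is an assumption (the patent only lists the items, not a storage format).

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class QueryRecord:
    """One row of the query information storage unit (field names are illustrative)."""
    user_id: str           # "user ID"
    received_at: datetime  # "date and time" the search server received the query
    query: str             # "search query" text as entered
    query_id: str          # "search query ID"

# First record of FIG. 13: user U1, 2018/9/1 17:00, "Roppongi pasta", Q11.
record = QueryRecord(
    user_id="U1",
    received_at=datetime(2018, 9, 1, 17, 0),
    query="Roppongi pasta",
    query_id="Q11",
)
# The query is space-delimited: a place name followed by a food type.
terms = record.query.split(" ")
```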

(Vector information storage unit 532)
The vector information storage unit 532 stores various information about vectors, which are the distributed representations of search queries. FIG. 14 shows an example of the vector information storage unit according to the embodiment. In the example shown in FIG. 14, the vector information storage unit 532 has the items "vector ID", "search query ID", and "vector information".

"Vector ID" indicates identification information for identifying a vector that is the distributed representation of a search query. "Search query ID" indicates identification information for identifying the search query corresponding to the vector. "Vector information" indicates an N-dimensional vector that is the distributed representation of the search query. The vector is, for example, a 128-dimensional vector.

In the example shown in the first record of FIG. 14, the vector identified by the vector ID "QV11" (vector QV11) corresponds to vector QV11, the distributed representation of search query Q11 shown in FIG. 10. The search query ID "Q11" indicates that the search query corresponding to vector QV11 is search query Q11. The vector information "QVDT11" indicates the N-dimensional vector that is the distributed representation of search query Q11.

(Model information storage unit 533)
The model information storage unit 533 stores various information about the learning models generated by the generation device 50. FIG. 15 shows an example of the model information storage unit according to the embodiment. In the example shown in FIG. 15, the model information storage unit 533 has the items "model ID" and "model data".

"Model ID" indicates identification information for identifying a learning model generated by the generation device 50. "Model data" indicates the model data of the learning model generated by the generation device 50. For example, "model data" stores the data used to convert a search query into a distributed representation.

In the example shown in the first record of FIG. 15, the learning model identified by the model ID "M1" corresponds to the first model M1 shown in FIG. 1. The model data "MDT1" indicates the model data (model data MDT1) of the first model M1 generated by the generation device 50.

The model data MDT1 may include an input layer to which a search query is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and the weight of the first element. The model data MDT1 may cause the generation device 50 to function so as to output, from the output layer, the distributed representation of the search query input to the input layer.

The generation device 50 calculates distributed representations using a model of any structure, such as the regression model or neural network described above. Specifically, the coefficients of the model data MDT1 are set so that a distributed representation is output when a search query is input. The generation device 50 calculates distributed representations using this model data MDT1.

The example above describes the case where the model data MDT1 is a model that outputs the distributed representation of a search query when the search query is input (hereinafter, model X1). However, the model data MDT1 according to the embodiment may instead be a model generated based on results obtained by repeatedly inputting data to and outputting data from model X1. For example, the model data MDT1 may be a model trained by taking as input the distributed representation that model X1 outputs for a search query (hereinafter, model Y1). Alternatively, the model data MDT1 may be a model trained to take a search query as input and produce the output value of model Y1 as output.

(Control unit 52)
Returning to FIG. 12, the control unit 52 is a controller and is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs stored in a storage device inside the generation device 50 (corresponding to an example of a generation program), using the RAM as a work area. The control unit 52 may also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

Through information processing in accordance with the first model M1 (model data MDT1) stored in the model information storage unit 533, the control unit 52 causes the computer to function as follows: for a search query input to the input layer, it treats each element belonging to each layer other than the output layer as a first element, performs an operation based on the first element and the weight of the first element, and outputs the distributed representation from the output layer.

Similarly, through information processing in accordance with the second model M2 (model data MDT2) stored in the model information storage unit 533, the control unit 52 causes the computer to function as follows: for a search query input to the input layer, it treats each element belonging to each layer other than the output layer as a first element, performs an operation based on the first element and the weight of the first element, and outputs from the output layer the probability that the search query belongs to each category.
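The "operation based on the first element and the weight of the first element" amounts to a weighted sum per layer, and the per-category probabilities of the second model can be obtained by normalizing the output layer. The sketch below assumes a single dense layer followed by a softmax; the patent does not name the layer type or the normalization, and the weights, biases, and input values are hypothetical.

```python
import math

def dense(inputs, weights, biases):
    """Each output element is the weighted sum of the input elements plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Normalize raw scores into probabilities that sum to 1."""
    peak = max(logits)                         # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-element query representation and a 3-category output layer.
query_features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # one row of weights per category
biases = [0.0, 0.0, 0.0]
category_probabilities = softmax(dense(query_features, weights, biases))
```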

As shown in FIG. 12, the control unit 52 includes an acquisition unit 521, an extraction unit 522, and a generation unit 523, and realizes or executes the information processing described below. The internal configuration of the control unit 52 is not limited to the configuration shown in FIG. 12 and may be any other configuration that performs the information processing described later.

(Acquisition unit 521)
The acquisition unit 521 acquires various information. Specifically, the acquisition unit 521 acquires search queries input by users from the search server 20. On acquiring a search query input by a user, the acquisition unit 521 stores it in the query information storage unit 531.

The acquisition unit 521 also acquires vector information about vectors that are the distributed representations of search queries. On acquiring vector information, the acquisition unit 521 stores it in the vector information storage unit 532.

The acquisition unit 521 also acquires information about the first model. Specifically, the acquisition unit 521 acquires information about the first model generated by the generation unit 523. On acquiring this information, the acquisition unit 521 stores it in the model information storage unit 533.

(Extraction unit 522)
The extraction unit 522 extracts various information. Specifically, from the search queries acquired by the acquisition unit 521, the extraction unit 522 extracts multiple search queries input by the same user within a predetermined time. For example, the extraction unit 522 extracts multiple search queries for which the interval between successive inputs by the same user is within the predetermined time. Next, from the multiple search queries input by the same user within the predetermined time, the extraction unit 522 extracts pairs of search queries that the same user input in succession within the predetermined time. For example, from the search queries acquired by the acquisition unit 521, the extraction unit 522 extracts four search queries input in succession by the same user U1 within the predetermined time: search query Q11 ("Roppongi pasta"), search query Q12 ("Roppongi Italian"), search query Q13 ("Akasaka pasta"), and search query Q14 ("Azabu pasta"). Arranged in input order, the four search queries are Q11, Q12, Q13, and Q14. Having extracted the four search queries, the extraction unit 522 treats each two search queries adjacent in time as a pair and extracts three pairs: (search query Q11, search query Q12), (search query Q12, search query Q13), and (search query Q13, search query Q14). Alternatively, the extraction unit 522 may extract multiple search queries all of which were input by the same user within the predetermined time. It may then select any two search queries from the extracted search queries, regardless of whether they are adjacent in time, and extract the selected two as a pair of search queries.
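The adjacent-pair extraction described above can be sketched as a grouping by user followed by a scan over time-ordered neighbors. This is a minimal illustration under assumptions: the function name, the `(user_id, timestamp, query_id)` tuple layout, the 10-minute threshold, and every timestamp other than Q11's 17:00 are hypothetical, since the patent leaves the "predetermined time" unspecified.

```python
from datetime import datetime, timedelta

def extract_consecutive_pairs(log, max_gap):
    """Pair each query with the next query by the same user if the two
    were received no more than `max_gap` apart."""
    by_user = {}
    for user_id, received_at, query_id in log:
        by_user.setdefault(user_id, []).append((received_at, query_id))
    pairs = []
    for entries in by_user.values():
        entries.sort()  # chronological order within one user
        for (t1, q1), (t2, q2) in zip(entries, entries[1:]):
            if t2 - t1 <= max_gap:
                pairs.append((q1, q2))
    return pairs

# Illustrative log: four queries by U1 in quick succession, two by U2 far apart.
log = [
    ("U1", datetime(2018, 9, 1, 17, 0), "Q11"),
    ("U1", datetime(2018, 9, 1, 17, 2), "Q12"),
    ("U1", datetime(2018, 9, 1, 17, 5), "Q13"),
    ("U1", datetime(2018, 9, 1, 17, 9), "Q14"),
    ("U2", datetime(2018, 9, 1, 18, 0), "Q21"),
    ("U2", datetime(2018, 9, 1, 19, 30), "Q22"),
]
pairs = extract_consecutive_pairs(log, max_gap=timedelta(minutes=10))
```

For the variant that ignores adjacency, the inner loop could be replaced with `itertools.combinations(entries, 2)` restricted to entries inside one time window.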

The extraction unit 522 also extracts, from the search queries acquired by the acquisition unit 521, a given search query and another search query unrelated to it. For example, the extraction unit 522 first extracts a given search query from the search queries acquired by the acquisition unit 521. It then extracts another search query at random from the acquired search queries, independently of the given search query.
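Random extraction of an unrelated search query can be sketched as uniform sampling over the stored query IDs. One assumption beyond the text is labeled in the code: the sketch excludes the given query itself from the candidates, which the patent does not state. The function name and the seeded generator are hypothetical.

```python
import random

def sample_unrelated_query(given_id, all_query_ids, rng):
    """Pick another query uniformly at random, independently of the given one.
    Assumption: the given query itself is excluded from the candidates."""
    candidates = [qid for qid in all_query_ids if qid != given_id]
    return rng.choice(candidates)

all_query_ids = ["Q11", "Q12", "Q13", "Q14", "Q21", "Q22", "Q23"]
rng = random.Random(0)  # seeded only so the example is repeatable
unrelated_id = sample_unrelated_query("Q11", all_query_ids, rng)
```

Because sampling is uniform, the drawn query may occasionally share the given query's intention (for example Q12 for Q11); with a large query log this is presumably rare enough not to matter.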

(Generation unit 523)
The generation unit 523 generates various information. Specifically, by learning on the assumption that, among the search queries acquired by the acquisition unit 521, multiple search queries input by the same user within a predetermined time have similar features, the generation unit 523 generates a learning model that predicts the feature information of a given search query from that search query. More precisely, the generation unit 523 trains the learning model so that the distributed representations of multiple search queries input by the same user within a predetermined time are similar. For example, the generation unit 523 generates the learning model by learning so that the distributed representations of a pair of search queries input one after the other within a predetermined time are similar. To do so, the generation unit 523 calculates the similarity between the distributed representations of the pair of search queries before learning and the similarity between their distributed representations after learning, and trains the learning model so that the similarity after learning is greater than the similarity before learning. By training the learning model in this way, so that the two vectors forming the pair of distributed representations for a pair of search queries are similar in the distributed representation space, the generation unit 523 generates a learning model that outputs a distributed representation from a search query. More specifically, the generation unit 523 generates the learning model using DSSM technology in which an LSTM, a type of RNN, is used to generate the distributed representations. For example, as the correct-answer data for the learning model, the generation unit 523 regards a pair of search queries input by the same user within a predetermined time as having similar features, and learns so that the distributed representation of a given search query and the distributed representation of the other search query paired with it lie close to each other in the distributed representation space. On generating the first model, the generation unit 523 stores the generated first model (model data MDT1) in the model information storage unit 533 in association with identification information that identifies the first model.

[8. Example of the first model]
Next, an example of the first model generated by the generation device 50 will be described with reference to FIG. 16. FIG. 16 is a diagram showing an example of the first model according to the embodiment. In the example shown in FIG. 16, the first model M1 generated by the generation device 50 is composed of a three-layer LSTM-RNN. In this example, the extraction unit 522 extracts a pair of search queries consisting of search query Q11 ("Roppongi pasta") and search query Q12 ("Roppongi Italian"), input in succession by the same user U1 within a predetermined time. The generation unit 523 inputs search query Q11 extracted by the extraction unit 522 to the input layer of the first model M1 (step S41).

Next, the generation unit 523 outputs, from the output layer of the first model M1, the 256-dimensional vector BQV11, which is the distributed representation of search query Q11. The generation unit 523 also inputs search query Q12 extracted by the extraction unit 522 to the input layer of the first model M1, and then outputs, from the output layer of the first model M1, the 256-dimensional vector BQV12, which is the distributed representation of search query Q12 (step S42).

Subsequently, the generation unit 523 generates the first model M1, which outputs a distributed representation from a search query, by training it so that the distributed representations of two consecutively input search queries become similar (step S43). For example, let Θ be the angle between the vector BQV11, the distributed representation of the search query Q11 before feedback is applied to the first model M1 (before learning), and the vector BQV12, the distributed representation of the search query Q12 before learning. Likewise, let Φ be the angle between the vector QV11, the distributed representation of the search query Q11 after feedback is applied to the first model M1 (after learning), and the vector QV12, the distributed representation of the search query Q12 after learning. The generation unit 523 trains the first model M1 so that Φ becomes smaller than Θ. For example, the generation unit 523 calculates the cosine similarity between the vectors BQV11 and BQV12 and the cosine similarity between the vectors QV11 and QV12, and trains the first model M1 so that the latter is larger than the former (that is, closer to 1). In this way, the generation unit 523 generates the first model M1, which outputs a distributed representation from a search query, by training it so that the two vectors that are the distributed representations of a pair of search queries become similar in the distributed representation space. The generation unit 523 is not limited to cosine similarity; it may calculate the similarity between distributed representations based on any index that is applicable as a distance measure between vectors, and it may train the first model M1 based on any such index. For example, the generation unit 523 may calculate the value of a predetermined distance function, such as the Euclidean distance between distributed representations, a distance in a non-Euclidean space such as hyperbolic space, the Manhattan distance, or the Mahalanobis distance, and train the first model M1 so that this value (that is, the distance in the distributed representation space) becomes small.
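The relationship between the angle and the cosine similarity described above can be illustrated with a minimal Python sketch. The three-dimensional vectors below are hypothetical stand-ins for the 256-dimensional vectors BQV11, BQV12, QV11, and QV12; their values are invented for illustration and do not come from the embodiment.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle(u, v):
    """Angle in radians between two vectors (clamped against rounding error)."""
    c = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(c)


# Hypothetical representations before learning (BQV11, BQV12).
bqv11 = [1.0, 0.0, 0.0]
bqv12 = [0.0, 1.0, 0.0]

# Hypothetical representations after learning (QV11, QV12).
qv11 = [1.0, 0.2, 0.0]
qv12 = [0.9, 0.3, 0.0]

theta = angle(bqv11, bqv12)  # angle before learning
phi = angle(qv11, qv12)      # angle after learning

# Training is meant to make phi smaller than theta, which is the same as
# making the cosine similarity larger (closer to 1).
print(phi < theta)
print(cosine_similarity(qv11, qv12) > cosine_similarity(bqv11, bqv12))
```

Because the cosine decreases monotonically on the interval from 0 to π, requiring Φ to be smaller than Θ and requiring the cosine similarity to increase are the same condition.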

Further, the generation unit 523 generates the first model by learning that, among the search queries input by the same user within a predetermined time, search queries containing character information separated by a predetermined delimiter have similar characteristics. For example, the generation unit 523 generates the first model by learning that the search query "Roppongi pasta", in which "Roppongi" (a place name) and "pasta" (a type of food) are separated by a space as the delimiter, and the search query "Roppongi Italian", in which "Roppongi" (a place name) and "Italian" (a type of cuisine) are separated by a space as the delimiter, have similar characteristics.

Further, the generation unit 523 generates the first model by learning that search queries randomly extracted from the search queries acquired by the acquisition unit 521 have different characteristics. Specifically, the generation unit 523 generates the first model by training it so that the distributed representations of a randomly extracted pair of search queries differ. For example, the generation unit 523 trains the first model M1 so that the distributed representation of a predetermined search query extracted by the extraction unit 522 and the distributed representation of a search query extracted at random, independently of the predetermined search query, are mapped far apart in the distributed representation space.
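One way to express both training signals (pulling paired queries together and pushing randomly extracted queries apart) in a single objective is a softmax-style contrastive loss over cosine similarities, as is commonly used in DSSM-type training. The embodiment does not state the loss function, so the following Python sketch is an assumption: the function name, the scaling factor `gamma`, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax_contrastive_loss(query_vec, positive_vec, negative_vecs, gamma=10.0):
    """Negative log-probability of the paired query among all candidates.

    positive_vec:  representation of the query paired with query_vec
                   (same user, within the predetermined time).
    negative_vecs: representations of randomly extracted queries.
    gamma:         assumed smoothing factor applied to the similarities.
    """
    sims = [cosine_similarity(query_vec, positive_vec)]
    sims += [cosine_similarity(query_vec, n) for n in negative_vecs]
    scaled = [gamma * s for s in sims]
    # Numerically stable log-sum-exp over all candidates.
    m = max(scaled)
    log_denominator = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_denominator - scaled[0]


query = [1.0, 0.0]

# Paired query is close, random query is far: the loss is small.
good = softmax_contrastive_loss(query, [0.9, 0.1], [[0.0, 1.0]])

# Paired query is far, random query is close: the loss is large.
bad = softmax_contrastive_loss(query, [0.0, 1.0], [[0.9, 0.1]])

print(good < bad)
```

Minimizing a loss of this shape increases the similarity of the paired representations and decreases the similarity to the randomly extracted ones at the same time.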

[9. Flow of the first model generation process]
Next, the procedure of the first model generation process according to the embodiment will be described with reference to FIG. 17. FIG. 17 is a flowchart showing the procedure of the first model generation process according to the embodiment.

In the example shown in FIG. 17, the generation device 50 acquires search queries input by users (step S1001).

Subsequently, the generation device 50 extracts a plurality of search queries input by the same user within a predetermined time (step S1002).

Subsequently, the generation device 50 generates the first model, which predicts the feature information of a predetermined search query from that search query, by learning that the extracted search queries have similar characteristics (step S1003).
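Steps S1001 and S1002 can be sketched as follows. This is only an illustration: the log format (user ID, timestamp in seconds, query), the 120-second window, and the function name `extract_query_pairs` are assumptions, since the embodiment says only "within a predetermined time". The queries other than "Roppongi pasta" and "Roppongi Italian" are invented.

```python
from collections import defaultdict


def extract_query_pairs(log, window_seconds=120):
    """Return pairs of consecutive queries entered by the same user
    no more than window_seconds apart.

    log: iterable of (user_id, timestamp_seconds, query) tuples.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, query in log:
        by_user[user_id].append((timestamp, query))

    pairs = []
    for entries in by_user.values():
        entries.sort()  # order each user's queries by time
        for (t1, q1), (t2, q2) in zip(entries, entries[1:]):
            if t2 - t1 <= window_seconds and q1 != q2:
                pairs.append((q1, q2))
    return pairs


# Hypothetical search log (step S1001).
log = [
    ("U1", 0, "Roppongi pasta"),
    ("U1", 45, "Roppongi Italian"),
    ("U2", 10, "Akasaka pasta"),
    ("U1", 4000, "weather tomorrow"),
]

# Pairs treated as having similar characteristics (step S1002).
pairs = extract_query_pairs(log)
print(pairs)
```

The extracted pairs would then serve as the positive examples for training in step S1003.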

[10. Effects]
As described above, the information processing device 100 according to the embodiment includes a reception unit 131, a generation unit 133, and an output unit 134. The reception unit 131 accepts a plurality of pieces of target information indicating classification targets and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified. The generation unit 133 generates specified cluster information regarding the clusters generated by classifying the plurality of pieces of target information into the specified number of clusters. The output unit 134 outputs the specified cluster information generated by the generation unit 133.

As a result, the information processing device 100 can extract, from a large amount of target information indicating classification targets, cluster information for the specified number of clusters, which amounts to a summary of the features of that target information. The information processing device 100 can thus condense the features of a large amount of target information into a summary with a small amount of information and provide it. Therefore, the information processing device 100 can obtain useful information.

Further, the generation unit 133 generates the specified cluster information based on the similarity between the search intent with which one piece of target information included in the plurality of pieces of target information is input as a search query and the search intent with which another piece of target information included in the plurality of pieces of target information is input as a search query.

As a result, the information processing device 100 can provide specified cluster information regarding the specified number of clusters classified based on the similarity of the search intents with which the target information is input as search queries.

Further, the generation unit 133 generates the specified cluster information based on the similarity between the distributed representation corresponding to the character information that is one piece of target information included in the plurality of pieces of target information and the distributed representation corresponding to the character information that is another piece of target information included in the plurality of pieces of target information.

As a result, the information processing device 100 can provide the result of classifying queries into clusters based on the similarity of their search intents in a form that can be grasped objectively through numerical values.

Further, the generation unit 133 generates, as the specified cluster information, information that makes the target information classified into each cluster visually recognizable for each cluster.

As a result, the information processing device 100 can provide the results of analyzing the similarity among a large number of high-dimensional distributed representations in a form that is easy to grasp at a glance.

Further, the generation unit 133 generates cluster information regarding clusters generated by classifying the plurality of pieces of target information into a number of clusters smaller than the specified number of clusters. The output unit 134 outputs the cluster information generated by the generation unit 133 so that it can be compared with the specified cluster information.

As a result, the information processing device 100 can provide the cluster information regarding clusters generated by classifying the plurality of pieces of target information into fewer clusters than the specified number in a form that can be compared with the specified cluster information.

Further, the generation unit 133 classifies the plurality of pieces of target information into the specified number of clusters based on the similarity between the search intent with which one piece of target information included in the plurality of pieces of target information is input as a search query and the search intent with which another piece of target information included in the plurality of pieces of target information is input as a search query.

As a result, the information processing device 100 can generate the specified number of clusters based on the similarity of the search intents with which the target information is input as search queries.

Further, the generation unit 133 classifies the plurality of pieces of target information into the specified number of clusters based on the similarity between the distributed representation corresponding to the character information that is one piece of target information included in the plurality of pieces of target information and the distributed representation corresponding to the character information that is another piece of target information included in the plurality of pieces of target information.
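As an illustration of classifying target information into a specified number of clusters by the similarity of distributed representations, the following Python sketch uses average-link agglomerative clustering on cosine similarity. The clustering algorithm is an assumption, as this section does not name one, and the three-dimensional vectors and the queries other than "Roppongi pasta" and "Roppongi Italian" are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_queries(vectors, num_clusters):
    """Merge the two most similar clusters until num_clusters remain.

    vectors: dict mapping each query to its distributed representation.
    """
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")

    clusters = [[name] for name in vectors]

    def link(a, b):
        # Average pairwise cosine similarity between two clusters.
        sims = [cosine_similarity(vectors[x], vectors[y]) for x in a for y in b]
        return sum(sims) / len(sims)

    while len(clusters) > num_clusters:
        i, j = max(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical distributed representations of five queries.
vectors = {
    "Roppongi pasta": [0.9, 0.1, 0.0],
    "Roppongi Italian": [0.8, 0.2, 0.0],
    "Roppongi parking": [0.1, 0.9, 0.1],
    "Roppongi car park": [0.0, 0.8, 0.2],
    "weather tomorrow": [0.0, 0.1, 0.9],
}

clusters_3 = cluster_queries(vectors, 3)  # specified number of clusters
clusters_2 = cluster_queries(vectors, 2)  # fewer clusters, for comparison
print(clusters_3)
print(clusters_2)
```

Running the same function with a smaller number, such as 2, yields the coarser grouping that can be output alongside the result for the specified number of clusters so that the two can be compared.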

Further, the generation unit 133 generates a distributed representation corresponding to the character information that is each piece of target information included in the plurality of pieces of target information, by using a learning model that has learned the characteristics of a plurality of search queries on the assumption that search queries input by the same user within a predetermined time have similar characteristics.

As a result, the information processing device 100 can generate a distributed representation that reflects the search intent with which the character information serving as the target information was input as a search query.

Further, the generation unit 133 generates the distributed representation by using a learning model that, when a predetermined search query is input as input information, outputs the distributed representation of the predetermined search query as output information.

As a result, the information processing device 100 can generate a distributed representation that reflects the search intent with which the character information serving as the target information was input as a search query.

Further, the generation unit 133 generates the distributed representation by using a learning model that has learned the characteristics of a plurality of search queries by being trained so that the distributed representations of a pair of search queries input consecutively within a predetermined time are similar.

Further, the generation unit 133 generates the distributed representation by using a learning model that has learned the characteristics of a plurality of search queries by learning that, among the search queries input by the same user within a predetermined time, search queries containing character strings separated by a predetermined delimiter have similar characteristics.

Further, the generation unit 133 generates the distributed representation by using a learning model that has learned the characteristics of a plurality of search queries by learning that randomly extracted search queries have different characteristics.

Further, the generation unit 133 generates the distributed representation by using a learning model that has learned the characteristics of a plurality of search queries by being trained so that the distributed representations of a randomly extracted pair of search queries differ.

[11. Hardware configuration]
The information processing device 100, the generation device 50, and the terminal device 10 according to the embodiment described above are each realized by, for example, a computer 1000 configured as shown in FIG. 18. FIG. 18 is a hardware configuration diagram showing an example of a computer that realizes the functions of the information processing device 100, the generation device 50, or the terminal device 10. The computer 1000 includes a CPU 1100, a RAM 1200, a ROM 1300, an HDD 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

The CPU 1100 operates based on a program stored in the ROM 1300 or the HDD 1400 and controls each unit. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 starts up, programs that depend on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via a predetermined communication network and sends it to the CPU 1100, and transmits data generated by the CPU 1100 to other devices via the predetermined communication network.

The CPU 1100 controls output devices such as a display and a printer and input devices such as a keyboard and a mouse via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600 and outputs generated data to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in the recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the information processing device 100, the generation device 50, or the terminal device 10, the CPU 1100 of the computer 1000 realizes the functions of the control unit 130, the control unit 52, or the control unit 15, respectively, by executing a program loaded onto the RAM 1200. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them, but as another example, it may acquire these programs from another device via a predetermined communication network.

Some embodiments of the present application have been described above in detail with reference to the drawings, but they are examples. The present invention can be carried out in other forms to which various modifications and improvements have been applied based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention.

[12. Others]
Of the processes described in the above embodiments and modifications, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified. For example, the various kinds of information shown in each figure are not limited to the illustrated information.

Each component of each illustrated device is a functional concept and does not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The embodiments and modifications described above can be combined as appropriate to the extent that the processing contents do not contradict each other.

The "unit (section, module, unit)" described above can be read as "means", "circuit", or the like. For example, the generation unit can be read as a generation means or a generation circuit.

1 Information processing system
10 Terminal device
20 Search server
50 Generation device
100 Information processing device
110 Communication unit
120 Storage unit
121 Query information storage unit
122 Vector information storage unit
123 Cluster information storage unit
124 Model information storage unit
130 Control unit
131 Reception unit
132 Acquisition unit
133 Generation unit
134 Output unit

Claims

1. An information processing device comprising:
a reception unit that accepts a plurality of pieces of target information indicating classification targets and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified;
a generation unit that generates specified cluster information regarding the clusters generated by classifying the plurality of pieces of target information into the specified number of clusters; and
an output unit that outputs the specified cluster information generated by the generation unit.

2. The information processing device according to claim 1, wherein the generation unit generates the specified cluster information based on the similarity between the search intent with which one piece of target information included in the plurality of pieces of target information is input as a search query and the search intent with which another piece of target information included in the plurality of pieces of target information is input as a search query.

3. The information processing device according to claim 1 or 2, wherein the generation unit generates the specified cluster information based on the similarity between the distributed representation corresponding to the character information that is one piece of target information included in the plurality of pieces of target information and the distributed representation corresponding to the character information that is another piece of target information included in the plurality of pieces of target information.

4. The information processing device according to any one of claims 1 to 3, wherein the generation unit generates, as the specified cluster information, information that makes the target information classified into each cluster visually recognizable for each cluster.

5. The information processing device according to any one of claims 1 to 4, wherein
the generation unit generates cluster information regarding clusters generated by classifying the plurality of pieces of target information into a number of clusters smaller than the specified number of clusters, and
the output unit outputs the cluster information generated by the generation unit so that it can be compared with the specified cluster information.

6. The information processing device according to any one of claims 1 to 5, wherein the generation unit classifies the plurality of pieces of target information into the specified number of clusters based on the similarity between the search intent with which one piece of target information included in the plurality of pieces of target information is input as a search query and the search intent with which another piece of target information included in the plurality of pieces of target information is input as a search query.

7. The information processing device according to any one of claims 1 to 6, wherein the generation unit classifies the plurality of pieces of target information into the specified number of clusters based on the similarity between the distributed representation corresponding to the character information that is one piece of target information included in the plurality of pieces of target information and the distributed representation corresponding to the character information that is another piece of target information included in the plurality of pieces of target information.

8. The information processing device according to any one of claims 1 to 7, wherein the generation unit generates a distributed representation corresponding to the character information that is each piece of target information included in the plurality of pieces of target information, by using a learning model that has learned the characteristics of a plurality of search queries on the assumption that search queries input by the same user within a predetermined time have similar characteristics.

9. The information processing device according to claim 8, wherein the generation unit generates the distributed representation by using a learning model that, when a predetermined search query is input as input information, outputs the distributed representation of the predetermined search query as output information.

10. The information processing device according to claim 8 or 9, wherein the generation unit generates the distributed representation by using a learning model that has learned the characteristics of the plurality of search queries by being trained so that the distributed representations of a pair of search queries input consecutively within the predetermined time are similar.

11. The information processing device according to any one of claims 8 to 10, wherein the generation unit generates the distributed representation by using a learning model that has learned the characteristics of the plurality of search queries by learning that, among the search queries input by the same user within a predetermined time, search queries containing character strings separated by a predetermined delimiter have similar characteristics.

12. The information processing device according to any one of claims 8 to 11, wherein the generation unit generates the distributed representation by using a learning model that has learned the characteristics of the plurality of search queries by learning that randomly extracted search queries have different characteristics.

13. The information processing device according to any one of claims 8 to 12, wherein the generation unit generates the distributed representation by using a learning model that has learned the characteristics of the plurality of search queries by being trained so that the distributed representations of a randomly extracted pair of search queries differ.

14. An information processing method executed by a computer, the method comprising:
a reception step of accepting a plurality of pieces of target information indicating classification targets and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified;
a generation step of generating specified cluster information regarding the clusters generated by classifying the plurality of pieces of target information into the specified number of clusters; and
an output step of outputting the specified cluster information generated in the generation step.

15. An information processing program that causes a computer to execute:
a reception procedure of accepting a plurality of pieces of target information indicating classification targets and a specified number of clusters, which is the number of clusters into which the plurality of pieces of target information are to be classified;
a generation procedure of generating specified cluster information regarding the clusters generated by classifying the plurality of pieces of target information into the specified number of clusters; and
an output procedure of outputting the specified cluster information generated by the generation procedure.