JP2020129192A

JP2020129192A - Information processing device, information processing method, and information processing program

Info

Publication number: JP2020129192A
Application number: JP2019020767A
Authority: JP
Inventors: Kota Tsubouchi; Toru Shimizu; Junji Saikawa; Nobuyuki Shimizu; Hayato Kobayashi; Bhattacharjee Anupam
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2019-02-07
Filing date: 2019-02-07
Publication date: 2020-08-27
Anticipated expiration: 2039-02-07
Also published as: JP7071304B2

Abstract

To recommend appropriate information to a user.

SOLUTION: An information processing device comprises an extraction unit and a determination unit. The extraction unit extracts feature information indicating the features of a prescribed query by using a learning model that has learned the features of a plurality of queries, on the assumption that a plurality of queries entered by the same user within a prescribed time have similar features. The determination unit determines, on the basis of the feature information extracted by the extraction unit, recommendation information to be recommended to the user who entered the prescribed query.

SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing device, an information processing method, and an information processing program.

Techniques for recommending goods and services that match a user's interests are known. For example, a technique has been proposed in which an input natural-language request is semantically analyzed to generate context information including the user's intention, and candidate items to be presented to the user are then ranked based on the generated context information.

JP 2016-91535 A

However, the conventional technique described above cannot always recommend appropriate information to the user. It merely performs semantic analysis on the input natural-language request to generate context information, and this alone does not ensure that appropriate information is recommended to the user.

The present application has been made in view of the above, and an object thereof is to provide an information processing device, an information processing method, and an information processing program that can recommend appropriate information to a user.

The information processing device according to the present application includes: an extraction unit that extracts feature information indicating the features of a predetermined query by using a learning model that has learned the features of a plurality of search queries, on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features; and a determination unit that determines, based on the feature information extracted by the extraction unit, recommendation information to be recommended to the user who input the predetermined query.

According to one aspect of the embodiment, appropriate information can be recommended to the user.

FIG. 1 is a diagram illustrating an example of information processing according to the first embodiment.
FIG. 2 is a diagram illustrating a configuration example of the information processing system according to the first embodiment.
FIG. 3 is a diagram illustrating a configuration example of the information processing device according to the first embodiment.
FIG. 4 is a diagram illustrating an example of the model information storage unit according to the first embodiment.
FIG. 5 is a diagram illustrating an example of the vector information storage unit according to the first embodiment.
FIG. 6 is a diagram illustrating an example of the search information storage unit according to the first embodiment.
FIG. 7 is a flowchart showing a generation processing procedure according to the first embodiment.
FIG. 8 is a flowchart showing an information processing procedure according to the first embodiment.
FIG. 9 is a diagram illustrating an example of information processing according to a modification.
FIG. 10 is a diagram illustrating an example of information processing according to a modification.
FIG. 11 is a diagram illustrating an example of a process in which the user terminal according to a modification switches content.
FIG. 12 is a diagram illustrating a configuration example of the information processing device according to the second embodiment.
FIG. 13 is a diagram illustrating an example of the category information storage unit according to the second embodiment.
FIG. 14 is a flowchart showing a prediction processing procedure according to the second embodiment.
FIG. 15 is a flowchart showing an information processing procedure according to the second embodiment.
FIG. 16 is a diagram illustrating an example of the first learning model generation process according to the embodiment.
FIG. 17 is a diagram illustrating an example of the first learning model generation process according to the embodiment.
FIG. 18 is a diagram illustrating an example of the second learning model generation process according to the embodiment.
FIG. 19 is a diagram illustrating a configuration example of the generation device according to the embodiment.
FIG. 20 is a diagram illustrating an example of the query information storage unit according to the embodiment.
FIG. 21 is a diagram illustrating an example of the vector information storage unit according to the embodiment.
FIG. 22 is a diagram illustrating an example of the classification definition storage unit according to the embodiment.
FIG. 23 is a diagram illustrating an example of the category information storage unit according to the embodiment.
FIG. 24 is a diagram illustrating an example of the model information storage unit according to the embodiment.
FIG. 25 is a diagram illustrating an example of the first learning model according to the embodiment.
FIG. 26 is a diagram illustrating an example of the second learning model according to the embodiment.
FIG. 27 is a flowchart showing a procedure for generating the first learning model according to the embodiment.
FIG. 28 is a flowchart showing a procedure for generating the second learning model according to the embodiment.
FIG. 29 is a diagram illustrating an example of the hardware configuration of a computer that executes a program.

Hereinafter, modes for implementing the information processing device, the information processing method, and the information processing program according to the present application (hereinafter referred to as "embodiments") will be described in detail with reference to the drawings. The information processing device, the information processing method, and the information processing program according to the present application are not limited by these embodiments. In each of the following embodiments, the same parts are denoted by the same reference numerals, and overlapping descriptions are omitted.

[1. First Embodiment]
[1-1. Example of information processing]
First, an example of information processing according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of information processing according to the first embodiment. The information processing shown in FIG. 1 is performed by the user terminal 10, the search server 20 (see FIG. 2), the generation device 50 (see FIG. 2), and the information processing device 100.

The user terminal 10 is an information processing device used by a user. The user terminal 10 is realized by, for example, a smartphone, a tablet terminal, a notebook PC (Personal Computer), a mobile phone, or a PDA (Personal Digital Assistant). In the following, the user terminal 10 may be treated as equivalent to the user. That is, in the following, "user" can also be read as "user terminal 10".

In the following, the user identified by the user ID "U11" may be referred to as "user U11". Likewise, "user U*" (where * is an arbitrary number) denotes the user identified by the user ID "U*". For example, "user U21" is the user identified by the user ID "U21".

In the following, the user terminal 10 may also be described as user terminal 10-1 or 10-2, depending on the user who uses it. For example, the user terminal 10-1 is the user terminal 10 used by the user U11, and the user terminal 10-2 is the user terminal 10 used by the user U21. When the user terminals 10-1 and 10-2 are described without particular distinction, they are referred to as the user terminal 10.

The search server 20 (see FIG. 2) is a server device that provides a search service. For example, the search service provided by the search server 20 is a general search service that can search for any kind of information. The search server 20 stores information about the search queries input by users. Specifically, the search server 20 stores information about each user's search history.

The generation device 50 (see FIG. 2) is a server device that generates the first learning model. An outline of the generation process is given here; the details are described later. Specifically, the generation device 50 acquires, from the search server 20, information about the search queries input by users. The generation device 50 then extracts, from the acquired search queries, a plurality of search queries input by the same user within a predetermined time. Here, the generation device 50 treats the entire character string entered in the search box for a single search as one search query. For example, when the user U1 enters a search query containing multiple character strings, such as "Roppongi pasta", in the search box in a single search, the generation device 50 treats "Roppongi pasta" as a whole as one search query. The generation device 50 extracts, as the plurality of search queries input by the same user within the predetermined time, search queries for which the interval between the times at which the same user input them is within the predetermined time (for example, within 2 minutes).
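The extraction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the log layout `(user_id, query, timestamp)`, the function name `extract_query_pairs`, the pairing of consecutive queries only, and the skipping of identical consecutive queries are all assumptions made for the example. The 2-minute gap is the example value given in the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def extract_query_pairs(log, max_gap=timedelta(minutes=2)):
    """Return (query, next_query) pairs issued by the same user within max_gap.

    `log` is an iterable of (user_id, query, timestamp) tuples (assumed layout).
    Each query is the whole search-box string, e.g. "Roppongi pasta".
    """
    by_user = defaultdict(list)
    for user_id, query, ts in log:
        by_user[user_id].append((ts, query))

    pairs = []
    for events in by_user.values():
        events.sort()  # chronological order per user
        for (t0, q0), (t1, q1) in zip(events, events[1:]):
            # Assumption: only consecutive queries are paired, and exact
            # repeats of the same query are skipped.
            if t1 - t0 <= max_gap and q0 != q1:
                pairs.append((q0, q1))
    return pairs


log = [
    ("U1", "Roppongi pasta", datetime(2019, 2, 7, 12, 0, 0)),
    ("U1", "Roppongi Italian", datetime(2019, 2, 7, 12, 1, 30)),
    ("U1", "weather tomorrow", datetime(2019, 2, 7, 12, 30, 0)),
    ("U2", "Akasaka pasta", datetime(2019, 2, 7, 12, 0, 10)),
]
pairs = extract_query_pairs(log)
```

Here only the first two queries of user U1 fall within the 2-minute gap, so they form the single training pair; the third query by U1 and the lone query by U2 produce no pairs.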

Subsequently, the generation device 50 generates a first learning model that predicts, from a predetermined search query, the feature information of that search query, by learning on the assumption that the extracted search queries have similar features. Specifically, the generation device 50 trains the first learning model so that the distributed representations of the extracted search queries become similar, thereby generating a first learning model that outputs, from a predetermined search query, a distributed representation (vector) containing the feature information of that search query. More specifically, the generation device 50 generates the first learning model, which outputs a distributed representation (vector) from a search query, by using the DSSM (Deep Structured Semantic Model) technique with an LSTM (Long Short-Term Memory), a type of RNN (Recurrent Neural Network), for generating the distributed representations. For example, as the correct-answer data for the first learning model, the generation device 50 assumes that a pair of search queries input by the same user within a predetermined time have similar features, and trains the model so that the distributed representation (vector) of a predetermined search query and the distributed representation (vector) of the other search query paired with it lie close to each other in the distributed representation space. Training two vectors to lie close to each other in the distributed representation space can be rephrased as training the two vectors to be similar in that space.
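The idea of training paired queries to lie close together can be illustrated with a small loss function. The text names DSSM with an LSTM encoder but does not state the loss, so the following is only a sketch under assumptions: it uses a softmax over scaled cosine similarities against sampled non-paired queries, which is one common DSSM-style objective. The names `pair_loss`, `negatives`, and the scaling factor `gamma` are hypothetical, and the LSTM encoder is omitted (vectors are given directly as lists).

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(anchor, positive, negatives, gamma=10.0):
    """Negative log-probability of the paired query under a softmax over
    scaled cosine similarities (assumed objective, not taken from the text).

    The loss is small when `positive` is closer to `anchor` than every
    vector in `negatives`, and large otherwise.
    """
    logits = [gamma * cosine(anchor, positive)]
    logits += [gamma * cosine(anchor, n) for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


anchor = [1.0, 0.0]
near = [0.9, 0.1]   # vector of a query from the same session (assumed)
far = [0.0, 1.0]    # vector of an unrelated query (assumed)

good = pair_loss(anchor, near, [far])  # paired query is already close
bad = pair_loss(anchor, far, [near])   # paired query is far away
```

Minimizing such a loss over many session pairs pushes the vectors of paired queries together, which matches the stated goal that paired queries be similar in the distributed representation space.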

The information processing device 100 is a server device that provides a real estate information search service (hereinafter referred to as "real estate information search service R1" as appropriate). The information processing device 100 acquires the model data of the first learning model from the generation device 50. In the following, the model data of the first learning model may be referred to simply as the first learning model. Using the first learning model, the information processing device 100 recommends, as recommended areas, other real estate areas whose features are similar to those of the real estate area corresponding to the place name received from the user.

The flow of information processing will now be described with reference to FIG. 1. In FIG. 1, the information processing device 100 uses the first learning model to generate distributed representations (vectors) corresponding to character strings indicating place names and station names throughout the country (hereinafter referred to as "place name queries" as appropriate) (step S1). The balloon drawn with a dotted line on the right side of FIG. 1 shows how the distributed representations (vectors) corresponding to the place name queries generated by the information processing device 100 are mapped into the distributed representation space. For example, the point indicating the distributed representation (vector) for the place name query "place name #11" and the points indicating the distributed representations (vectors) for the place name queries "place name #12" to "place name #14" are located relatively close to one another in the distributed representation space. That is, the figure indicates that the place name query "place name #11" and the place name queries "place name #12" to "place name #14" have similar features. In contrast, the point for the place name query "place name #11" and the points for the place name queries "place name #21" and "place name #22" are located relatively far apart in the distributed representation space. That is, the figure indicates that the place name query "place name #11" and the place name queries "place name #21" and "place name #22" have different features. In FIG. 1, place name queries are represented by abstract symbols such as "place name #11" for the sake of explanation. When the invention is actually implemented, specific station names such as "Musashi-Kosugi" and "Kichijoji" and specific place names such as "Minato Ward" and "Tokyo" are used as place name queries.

The information processing device 100 receives the place name query "place name #11" from the user U11 via content C11, which includes a search box for entering a search query to look up a town the user wants to know about (step S2). On receiving the place name query "place name #11", the information processing device 100 calculates the similarity between the previously generated distributed representation (vector) for "place name #11" and the distributed representation (vector) for each place name query other than "place name #11" (step S3). Instead of calculating the similarity each time a place name query is received, the information processing device 100 may calculate the similarities between place name queries in advance. After calculating a similarity, the information processing device 100 determines whether it exceeds a predetermined threshold. When it determines that the similarity exceeds the predetermined threshold, the information processing device 100 extracts the other place name query as a similar query, that is, a query having features similar to those of the place name query "place name #11" (step S4).

For example, the information processing device 100 calculates the similarity between the distributed representation (vector) for the place name query "place name #11" and the distributed representation (vector) for the place name query "place name #12" to be 0.9. The information processing device 100 then determines whether this similarity exceeds a predetermined threshold (for example, 0.8). Since the similarity of 0.9 exceeds the threshold of 0.8, the information processing device 100 extracts the place name query "place name #12" as a similar query of the place name query "place name #11". In the same way, for every place name query other than "place name #11", the information processing device 100 calculates the similarity to the distributed representation (vector) for "place name #11" and determines whether that similarity exceeds the predetermined threshold.
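Steps S3 and S4 can be sketched as a threshold filter over precomputed vectors. The text does not specify the similarity measure; cosine similarity is assumed here, and the function name `similar_queries`, the dictionary layout, and the two-dimensional toy vectors are hypothetical. The threshold of 0.8 is the example value from the text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similar_queries(target, vectors, threshold=0.8):
    """Return {query: similarity} for every other query whose similarity
    to `target` strictly exceeds `threshold`."""
    base = vectors[target]
    result = {}
    for name, vec in vectors.items():
        if name == target:
            continue  # compare only against the other place name queries
        score = cosine(base, vec)
        if score > threshold:
            result[name] = score
    return result


# Toy vectors standing in for the output of the first learning model.
vectors = {
    "place name #11": [1.0, 0.0],
    "place name #12": [0.9, 0.1],
    "place name #21": [0.0, 1.0],
}
hits = similar_queries("place name #11", vectors)
```

With these toy vectors, only "place name #12" clears the threshold. As the text notes, the pairwise similarities could also be computed ahead of time rather than per request.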

After extracting the similar queries, the information processing device 100 determines, based on them, the recommended areas to recommend to the user U11 (step S5). Specifically, the information processing device 100 decides to recommend, as recommended areas, the real estate areas corresponding to the place name queries extracted as similar queries. For example, the information processing device 100 decides to recommend to the user U11, as a recommended area, the real estate area corresponding to the place name query "place name #12", which was extracted as a similar query of the place name query "place name #11". The information processing device 100 may further narrow the real estate areas corresponding to the similar queries down to those whose place name queries rank within a predetermined number from the top in similarity, and recommend those as the recommended areas.
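The optional narrowing to the highest-ranked similar queries in step S5 can be sketched as a sort followed by a cut. The names `recommend_areas`, `area_of`, and `top_n`, as well as the mapping from place name queries to real estate areas, are assumptions for illustration.

```python
def recommend_areas(similar, area_of, top_n=3):
    """Pick the real estate areas for the top_n most similar queries.

    `similar` maps place name query -> similarity (already filtered by the
    threshold); `area_of` maps place name query -> real estate area.
    Both layouts are assumed for this sketch.
    """
    ranked = sorted(similar.items(), key=lambda kv: kv[1], reverse=True)
    return [area_of[q] for q, _ in ranked[:top_n] if q in area_of]


similar = {
    "place name #12": 0.90,
    "place name #13": 0.85,
    "place name #14": 0.95,
}
area_of = {
    "place name #12": "area #12",
    "place name #13": "area #13",
    "place name #14": "area #14",
}
recommended = recommend_areas(similar, area_of, top_n=2)
```

With `top_n=2`, the two most similar queries are kept, in descending order of similarity.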

After determining the recommended areas, the information processing device 100 transmits information about them (for example, information on real estate properties in the recommended areas) to the user U11 (step S6).

As described above, the information processing device 100 according to the first embodiment extracts feature information indicating the features of a predetermined query by using a learning model that has learned the features of a plurality of search queries, on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features. The information processing device 100 also determines, based on the extracted feature information, recommendation information to be recommended to the user who input the predetermined query. The information processing device 100 can thereby recommend, to a user who is interested in a predetermined search query, information based on the feature information indicating the features of that search query. That is, the information processing device 100 can recommend information that matches the user's interests. Therefore, the information processing device 100 can recommend appropriate information to the user.

In general, a user who is interested in a particular field but knows little about it, such as a user visiting a search service, often cannot think of an appropriate search query when trying to gain knowledge by searching. Based on a search query entered by a user who has little knowledge of what to search for, the information processing device 100 according to the present invention can recommend information based on an appropriate search query that fits the user's search intention. Therefore, the information processing device 100 can recommend appropriate information to the user.

Techniques related to concept search are also known. For example, techniques for constructing concepts from logs of user behavior are known. One such technique focuses on the correlation between the act of turning on an air conditioner and the temperature, and constructs the concept of "hot" (high temperature), for example as a search query, from a log showing that the user turned on the air conditioner. Conventionally, however, a human registered the rule that the act of turning on an air conditioner is related to the concept of being hot (a rule-based approach). When machine learning was used instead of a rule-based approach, training data had to be created by, for example, manually labeling a large number of queries. As a result, only concept search for queries in a narrow sense was possible. That is, concept search could be performed only for queries to which a human had assigned the answer concept in advance (queries in a narrow sense). The information processing device 100 according to the present invention therefore models the sequence of queries in a user's search session without manually labeling a large number of queries. This allows the information processing device 100 to learn the association between a query and its search intention even for niche queries (long-tail queries) that are searched by only a very small number of users. That is, the information processing device 100 according to the present invention can perform concept search for queries in a broad sense, covering even niche queries that users enter freely. Therefore, the information processing device 100 can recommend appropriate information to the user.

[1-2. Information processing system configuration]
Next, the configuration of the information processing system according to the first embodiment will be described with reference to FIG. 2. FIG. 2 is a diagram illustrating a configuration example of the information processing system according to the first embodiment. As shown in FIG. 2, the information processing system 1 includes the user terminal 10, the search server 20, the generation device 50, and the information processing device 100. The user terminal 10, the search server 20, the generation device 50, and the information processing device 100 are connected via a predetermined network N so that they can communicate by wire or wirelessly. The information processing system 1 shown in FIG. 2 may include any number of user terminals 10, any number of search servers 20, any number of generation devices 50, and any number of information processing devices 100.

The user terminal 10 transmits the search query input by the user to the search server 20. Specifically, in accordance with the user's operation, the user terminal 10 acquires from the search server 20 a search page that includes a search box for entering a search query. When the user enters characters in the search box and then performs an operation to send the search query, the user terminal 10 transmits the characters entered in the search box to the search server 20 as a search query via the search page. For example, when the user enters characters in the search box and then presses the send button for the search query or presses the Enter key, the user terminal 10 transmits the characters entered in the search box to the search server 20 as a search query via the search page.

When the search server 20 receives a search query from the user terminal 10, it selects content that corresponds to the received search query and is to be output as a search result. The search server 20 then delivers a search result page including the selected content to the user terminal 10. The content delivered by the search server 20 is not limited to web pages displayed by a web browser. For example, it may be content displayed by a dedicated application installed on the user terminal 10. The content delivered by the search server 20 may be any kind of content, such as music content, image content (including moving images as well as still images), text content (including news articles and articles posted on an SNS (Social Networking Service)), content combining images and text, or game content.

In addition, when the search server 20 receives a search query from the user terminal 10, it registers in a database the received search query, the user ID identifying the user who sent the search query, and the date and time at which the search query was sent, in association with one another. In response to a request from the generation device 50, the search server 20 transmits information about the search queries entered by users to the generation device 50.

The generation device 50 generates the first learning model. Details of the generation processing of the first learning model by the generation device 50 will be described later.

The user terminal 10 also transmits a search query entered by the user to the information processing device 100. Specifically, in accordance with an operation by the user, the user terminal 10 acquires from the information processing device 100 the content C11, which includes a search box for entering a search query that searches for a town the user wants to know about. When the user enters characters in the search box and then performs an operation to send the search query, the user terminal 10 transmits the characters entered in the search box, such as a place name or a station name, to the information processing device 100 as a search query via the content C11. The send operation is, for example, pressing the send button for the search query or pressing the Enter key.

The information processing device 100 is a server device that performs the information processing described with reference to FIG. 1. On the assumption that a plurality of search queries entered by the same user within a predetermined time have similar features, the information processing device 100 uses the first learning model, which has learned the features of those search queries, to extract a similar query as feature information indicating the features of a predetermined query. A similar query is a search query having features similar to those of the predetermined query. The information processing device 100 then determines, based on the similar query extracted as the feature information, the recommendation information to be recommended to the user who entered the predetermined query.

[1-3. Configuration of information processing device]
Next, the configuration of the information processing device 100 according to the first embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram illustrating a configuration example of the information processing device 100 according to the first embodiment. As illustrated in FIG. 3, the information processing device 100 includes a communication unit 110, a storage unit 120, and a control unit 130. The information processing device 100 may also include an input unit (for example, a keyboard and a mouse) that receives various operations from an administrator or the like of the information processing device 100, and a display unit (for example, a liquid crystal display) for displaying various kinds of information.

(Communication unit 110)
The communication unit 110 is realized by, for example, a NIC (Network Interface Card). The communication unit 110 is connected to the network by wire or wirelessly, and transmits and receives information to and from, for example, the user terminal 10, the search server 20, and the generation device 50.

(Storage unit 120)
The storage unit 120 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. As shown in FIG. 3, the storage unit 120 includes a model information storage unit 121, a vector information storage unit 122, a search information storage unit 123, and a content storage unit 124.

(Model information storage unit 121)
The model information storage unit 121 stores various kinds of information regarding the learning model generated by the generation device 50. FIG. 4 shows an example of the model information storage unit according to the first embodiment. In the example shown in FIG. 4, the model information storage unit 121 has items such as "model ID" and "model data".

The "model ID" indicates identification information for identifying a learning model generated by the generation device 50. The "model data" indicates the model data of a learning model generated by the generation device 50. For example, the "model data" stores data for converting a search query into a distributed representation.

In the example shown in the first record of FIG. 4, the learning model identified by the model ID "M1" corresponds to the first learning model M1 shown in FIG. 1. The model data "MDT1" indicates the model data (model data MDT1) of the first learning model M1 generated by the generation device 50.

The model data MDT1 includes an input layer to which a search query is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and the weight of the first element. The model data MDT1 may cause the generation device 50 to function so that, in response to a search query input to the input layer, a distributed representation of that search query is output from the output layer.

Here, suppose that the model data MDT1 is realized by a regression model expressed as "y = a1*x1 + a2*x2 + ... + ai*xi". In this case, the first element included in the model data MDT1 corresponds to input data (xi) such as x1 and x2, and the weight of the first element corresponds to the coefficient ai associated with xi. The regression model can be regarded as a simple perceptron having an input layer and an output layer. When the model is regarded as a simple perceptron, the first element corresponds to one of the nodes of the input layer, and the second element can be regarded as a node of the output layer.
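
The correspondence between the regression model and a simple perceptron can be illustrated with a minimal Python sketch. The function name `linear_output` and the numeric values are assumptions made for illustration only and do not appear in the embodiment.

```python
def linear_output(inputs, weights):
    """Compute the second element (output node) as y = a1*x1 + ... + ai*xi."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(a * x for a, x in zip(weights, inputs))


x = [1.0, 2.0, 3.0]      # first elements: input-layer nodes (illustrative values)
a = [0.5, -1.0, 2.0]     # weights of the first elements: coefficients ai
y = linear_output(x, a)  # second element: the single output-layer node
```

With these illustrative values, each input node contributes its value multiplied by its coefficient, and the output node holds the sum of those contributions.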

Alternatively, suppose that the model data MDT1 is realized by a neural network having one or more intermediate layers, such as a DNN (Deep Neural Network). In this case, the first element included in the model data MDT1 corresponds to one of the nodes of the input layer or an intermediate layer. The second element corresponds to a next-stage node, that is, a node to which a value is transmitted from the node corresponding to the first element. The weight of the first element corresponds to a connection coefficient, which is the weight applied to the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
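
As a rough sketch of this node-to-node relationship, the following Python code propagates values through one intermediate layer. The helper `dense_layer`, the layer sizes, the tanh activation, and all weights are assumptions chosen for illustration; the embodiment does not specify them.

```python
import math


def dense_layer(values, weights, activation):
    """Propagate values from one layer to the next-stage nodes.

    values:  outputs of the nodes in the current layer (first elements).
    weights: weights[j][i] is the connection coefficient from node i
             to next-stage node j (second element).
    """
    return [activation(sum(w * x for w, x in zip(row, values))) for row in weights]


# Input layer (2 nodes) -> intermediate layer (2 nodes) -> output layer (1 node).
hidden = dense_layer([1.0, 2.0], [[0.5, 0.5], [1.0, -1.0]], math.tanh)
output = dense_layer(hidden, [[1.0, 0.0]], lambda z: z)
```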

The generation device 50 calculates the distributed representation using a model having an arbitrary structure, such as the regression model or neural network described above. Specifically, the coefficients of the model data MDT1 are set so that a distributed representation is output when a search query is input. The generation device 50 calculates the distributed representation using such model data MDT1.

In the above example, the model data MDT1 is a model (hereinafter referred to as model X1) that outputs the distributed representation of a search query when the search query is input. However, the model data MDT1 according to the embodiment may be a model generated based on results obtained by repeatedly inputting data to and outputting data from the model X1. For example, the model data MDT1 may be a model (hereinafter referred to as model Y1) trained by taking as input the distributed representation that the model X1 outputs for a search query. Alternatively, the model data MDT1 may be a model trained to take a search query as input and to produce the output value of the model Y1 as output.

When the generation device 50 performs estimation processing using a GAN (Generative Adversarial Network), the model data MDT1 may be a model that forms part of the GAN.

(Vector information storage unit 122)
The vector information storage unit 122 stores various kinds of information regarding vectors that are distributed representations of search queries. FIG. 5 shows an example of the vector information storage unit according to the first embodiment. In the example shown in FIG. 5, the vector information storage unit 122 has items such as "search query" and "vector information".

The "search query" indicates a search query entered by a user. The "vector information" indicates an N-dimensional vector that is the distributed representation of the search query. The vector is, for example, a 128-dimensional vector.

In the example shown in the first record of FIG. 5, the search query "place name #11" corresponds to the place name query "place name #11" shown in FIG. 1. The vector information "V11" indicates the distributed representation (vector) corresponding to the place name query "place name #11" shown in FIG. 1.

(Search information storage unit 123)
The search information storage unit 123 stores various kinds of information regarding users' search histories in the real estate information search service R1 provided by the information processing device 100. FIG. 6 shows an example of the search information storage unit according to the first embodiment. In the example shown in FIG. 6, the search information storage unit 123 has items such as "user ID", "date and time", and "search query".

The "user ID" indicates identification information for identifying the user who entered the search query. The "date and time" indicates the date and time at which the information processing device 100 received the search query from the user. The "search query" indicates the search query entered by the user.

In the example shown in the first record of FIG. 6, the search query "place name #11" corresponds to the place name query "place name #11" shown in FIG. 1. The user ID "U11" indicates that the user who entered the place name query "place name #11" is the user identified by the user ID "U11" (user U11). The date and time "2019/1/1 PM 17:00" indicates that the information processing device 100 received the place name query "place name #11" from the user U11 at 17:00 on January 1, 2019.

(Content storage unit 124)
The content storage unit 124 stores various kinds of information regarding content. Specifically, the content storage unit 124 stores content related to the real estate information search service R1 provided by the information processing device 100. For example, the content storage unit 124 stores the content C11 shown in FIG. 1, which includes a search box for entering a search query that searches for a town the user wants to know about.

(Control unit 130)
Returning to the description of FIG. 3, the control unit 130 is a controller and is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs (corresponding to an example of an information processing program) stored in a storage device inside the information processing device 100, using a RAM as a work area. Alternatively, the control unit 130 is a controller realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

As illustrated in FIG. 3, the control unit 130 includes an acquisition unit 131, a generation unit 132, a provision unit 133, a calculation unit 134, an extraction unit 135, and a determination unit 136, and realizes or executes the information processing operations described below. Note that the internal configuration of the control unit 130 is not limited to the configuration shown in FIG. 3 and may be any other configuration that performs the information processing described later.

(Acquisition unit 131)
The acquisition unit 131 acquires the first learning model. More specifically, the acquisition unit 131 acquires, from the generation device 50, the first learning model generated by the generation device 50. On acquiring the first learning model, the acquisition unit 131 stores it in the model information storage unit 121.

(Generation unit 132)
On the assumption that a plurality of search queries entered by the same user within a predetermined time have similar features, the generation unit 132 generates a distributed representation of a predetermined search query using a learning model that has learned the features of those search queries. Specifically, the generation unit 132 inputs a search query into the first learning model acquired by the acquisition unit 131 and generates the distributed representation (vector) corresponding to the search query. For example, the generation unit 132 acquires list data of place names and station names from all over the country from an open database, a dictionary, or the like. In this way, the generation unit 132 acquires place name queries, which are character strings indicating place names and station names nationwide. The generation unit 132 then inputs each place name query into the first learning model acquired by the acquisition unit 131 and generates the distributed representation (vector) corresponding to the place name query. After generating a distributed representation (vector), the generation unit 132 stores its vector information in the vector information storage unit 122 in association with the search query.
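
The generate-and-store step can be sketched in Python as follows. The first learning model itself is not reproduced here: `toy_embed` is a hypothetical, hash-based stand-in that only mimics the interface "query string in, fixed-length vector out", and `build_vector_store`, `DIM`, and the sample queries are likewise illustrative assumptions.

```python
import hashlib

DIM = 8  # the text gives 128 dimensions as an example; 8 keeps the sketch small


def toy_embed(query, dim=DIM):
    """Hypothetical stand-in for the first learning model (NOT a learned model).

    Maps a query string deterministically to a fixed-length vector.
    """
    digest = hashlib.sha256(query.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def build_vector_store(place_name_queries, embed=toy_embed):
    """Associate each query with its vector, as in the vector information storage unit."""
    return {query: embed(query) for query in place_name_queries}


store = build_vector_store(["place name #11", "place name #12"])
```

In a real system the `embed` argument would be replaced by the trained model; the dictionary merely plays the role of the "search query" and "vector information" items of FIG. 5.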

(Provision unit 133)
The provision unit 133 provides the real estate information search service R1. Specifically, the provision unit 133 delivers content related to the real estate information search service R1 to the user terminal 10. For example, the provision unit 133 delivers the content C11, which includes a search box for entering a search query that searches for a town the user wants to know about. The provision unit 133 also receives search queries from users via the content C11. For example, the provision unit 133 receives a place name query from a user. On receiving a search query from a user, the provision unit 133 stores the received search query in the search information storage unit 123 in association with the date and time at which the query was received and the user ID of the user who sent the query.

The provision unit 133 provides a service that recommends, as a recommended area, another real estate area having features similar to those of the real estate area corresponding to the place name received from the user. Specifically, the provision unit 133 transmits information on the recommended area determined by the determination unit 136 to the user terminal 10.

(Calculation unit 134)
The calculation unit 134 calculates the similarity between the distributed representation of a predetermined search query generated by the generation unit 132 and the distributed representation of another search query, different from the predetermined search query, also generated by the generation unit 132. Specifically, when the provision unit 133 receives a predetermined search query, the calculation unit 134 calculates the similarity between the distributed representation (vector) corresponding to the predetermined search query, generated in advance by the generation unit 132, and the distributed representation (vector) corresponding to each search query other than the received one. For example, the calculation unit 134 calculates the cosine similarity between the distributed representations (vectors). The calculation unit 134 is not limited to cosine similarity and may calculate the similarity between distributed representations (vectors) based on any index applicable as a distance measure between vectors. For example, the calculation unit 134 may calculate the value of a predetermined distance function between the distributed representations (vectors), such as the Euclidean distance, a distance in a non-Euclidean space such as a hyperbolic space, the Manhattan distance, or the Mahalanobis distance. Instead of calculating the similarity each time the provision unit 133 receives a predetermined search query, the calculation unit 134 may calculate the similarities between the distributed representations (vectors) corresponding to search queries in advance. For example, the calculation unit 134 calculates in advance the similarities between the distributed representations (vectors) corresponding to place name queries.
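
Three of the measures named above can be written in a few lines of Python. This is a minimal sketch using only the standard library; the function names and the handling of zero vectors are choices made here, not details given in the embodiment.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (larger means more similar)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v (smaller means more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (smaller means more similar)."""
    return sum(abs(a - b) for a, b in zip(u, v))
```

Note the opposite orientation of the two kinds of measure: a higher cosine similarity indicates closer vectors, whereas a lower distance does, which is why the threshold comparison differs between them.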

(Extraction unit 135)
On the assumption that a plurality of search queries entered by the same user within a predetermined time have similar features, the extraction unit 135 extracts feature information indicating the features of a predetermined query using a learning model that has learned the features of those search queries. Specifically, the extraction unit 135 extracts, as the feature information, a similar query, which is a search query having features similar to those of the predetermined query. More specifically, when the calculation unit 134 has calculated the similarity between distributed representations (vectors), the extraction unit 135 determines whether the calculated similarity exceeds a predetermined threshold. For example, the extraction unit 135 determines whether the cosine similarity between the distributed representations (vectors) calculated by the calculation unit 134 exceeds a predetermined threshold. When the extraction unit 135 determines that the similarity calculated by the calculation unit 134 exceeds the predetermined threshold, it extracts the other place name query as a similar query having features similar to those of the predetermined search query. The calculation unit 134 may instead determine whether the value of a predetermined distance function between the distributed representations (vectors), that is, the distance in the distributed representation space, falls below a predetermined threshold. In that case, when the distance calculated by the calculation unit 134 is determined to fall below the predetermined threshold, the extraction unit 135 extracts the other place name query as a similar query having features similar to those of the predetermined search query.
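
The threshold test on cosine similarity might look like the following Python sketch. The function `extract_similar_queries`, the two-dimensional vectors, the threshold of 0.8, and the ordering of results by descending similarity are all assumptions for illustration.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_similar_queries(target, vectors, threshold):
    """Return (query, similarity) for every other query whose cosine
    similarity to `target` exceeds `threshold`, most similar first."""
    target_vec = vectors[target]
    scored = [
        (query, cosine_similarity(target_vec, vec))
        for query, vec in vectors.items()
        if query != target
    ]
    hits = [(query, score) for query, score in scored if score > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


vectors = {
    "place name #11": [1.0, 0.0],
    "place name #12": [0.9, 0.1],
    "place name #13": [0.0, 1.0],
}
similar = extract_similar_queries("place name #11", vectors, threshold=0.8)
```

For a distance function the comparison would be reversed, keeping the queries whose distance falls below the threshold.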

The extraction unit 135 also extracts a similar query that shares an attribute with the predetermined query. For example, as a similar query sharing an attribute with a predetermined query that indicates a real estate area, the extraction unit 135 extracts a search query that has features similar to those of the predetermined query and that also indicates a real estate area.

The extraction unit 135 also extracts the feature information using a learning model that, when a predetermined search query is input as input information, outputs the distributed representation of that search query as output information. For example, the extraction unit 135 extracts a similar query as the feature information using the first learning model, which outputs the distributed representation of a predetermined search query when that search query is input.

The extraction unit 135 also extracts the feature information using a learning model that has learned the features of a plurality of search queries by being trained so that the distributed representations of a pair of search queries entered consecutively within a predetermined time become similar. For example, the extraction unit 135 extracts a similar query as the feature information using the first learning model trained in this way.
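
One way to obtain such pairs from a search log is sketched below in Python. The record layout (user ID, timestamp, query) follows the items of the search information storage unit, but the function `consecutive_pairs`, the 30-minute window, and the sample records are illustrative assumptions.

```python
from datetime import datetime, timedelta


def consecutive_pairs(log, window):
    """Collect pairs of queries entered back to back by the same user
    with a gap of at most `window` between them.

    log: iterable of (user_id, timestamp, query) records.
    """
    by_user = {}
    for user_id, ts, query in sorted(log, key=lambda r: (r[0], r[1])):
        by_user.setdefault(user_id, []).append((ts, query))
    pairs = []
    for entries in by_user.values():
        for (t1, q1), (t2, q2) in zip(entries, entries[1:]):
            if t2 - t1 <= window:
                pairs.append((q1, q2))
    return pairs


log = [
    ("U11", datetime(2019, 1, 1, 17, 0), "place name #11"),
    ("U11", datetime(2019, 1, 1, 17, 5), "place name #12"),
    ("U11", datetime(2019, 1, 2, 9, 0), "place name #13"),
    ("U12", datetime(2019, 1, 1, 17, 1), "place name #14"),
]
pairs = consecutive_pairs(log, window=timedelta(minutes=30))
```

Here only the first two queries of user U11 fall within the window, so they form the single pair whose representations would be trained to be similar.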

The extraction unit 135 also extracts the feature information using a learning model that has learned the features of a plurality of search queries by treating, as the plurality of search queries entered by the same user within a predetermined time, a plurality of search queries each including character strings separated by a predetermined delimiter, and learning that these have similar features. For example, the extraction unit 135 extracts a similar query as the feature information using the first learning model trained in this way.

The extraction unit 135 also extracts the feature information using a learning model that has learned the features of a plurality of search queries by learning that a plurality of randomly extracted search queries have different features. For example, the extraction unit 135 extracts a similar query as the feature information using the first learning model trained in this way.

The extraction unit 135 also extracts the feature information using a learning model that has learned the features of a plurality of search queries by being trained so that the distributed representations of a randomly extracted pair of search queries become different. For example, the extraction unit 135 extracts a similar query as the feature information using the first learning model trained in this way.
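
The text here does not state the objective function used for this training. Purely as an illustration, the Python sketch below draws random pairs and scores them with a simple cosine-based loss that is small when consecutive pairs are close and random pairs are far apart. The functions `sample_random_pairs` and `pair_loss`, the `margin` parameter, and the loss form itself are assumptions, not the method of the embodiment.

```python
import math
import random


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sample_random_pairs(queries, n_pairs, rng):
    """Draw pairs of distinct queries at random; these are treated as dissimilar."""
    return [tuple(rng.sample(queries, 2)) for _ in range(n_pairs)]


def pair_loss(u, v, is_similar, margin=0.0):
    """Illustrative objective (an assumption): pull similar pairs together
    and push random pairs apart until their cosine drops to `margin`."""
    cos = cosine_similarity(u, v)
    return 1.0 - cos if is_similar else max(0.0, cos - margin)


rng = random.Random(0)
queries = ["place name #11", "place name #12", "place name #13"]
negatives = sample_random_pairs(queries, 4, rng)
```

A training loop would sum `pair_loss` over the consecutive pairs (with `is_similar=True`) and the random pairs (with `is_similar=False`) and adjust the model's coefficients to reduce the total.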

(Determination unit 136)
The determination unit 136 determines the recommendation information to be recommended to the user who entered the predetermined query, based on the feature information extracted by the extraction unit 135. Specifically, the determination unit 136 determines the recommendation information based on the similar query extracted by the extraction unit 135. More specifically, the determination unit 136 determines information regarding a real estate area as the recommendation information, based on the similar query extracted by the extraction unit 135. For example, the determination unit 136 determines that the real estate area indicated by the similar query extracted by the extraction unit 135 is to be recommended as a recommended area.

[1-4. Flow of generation processing]
Next, the procedure of the generation processing according to the first embodiment will be described with reference to FIG. 7. FIG. 7 is a flowchart showing the generation processing procedure according to the first embodiment. In the example shown in FIG. 7, the information processing device 100 acquires search queries and the first learning model (step S101). Having acquired them, the information processing device 100 uses the first learning model to generate a distributed representation (vector) of each search query (step S102).

[1-5. Flow of information processing]
Next, the procedure of the information processing according to the first embodiment will be described with reference to FIG. 8. FIG. 8 is a flowchart showing the information processing procedure according to the first embodiment. In the example shown in FIG. 8, the information processing device 100 determines whether a search query has been received (step S201). If no search query has been received (step S201; No), the information processing device 100 waits until one is received.

On the other hand, when a search query has been received (step S201; Yes), the information processing device 100 calculates the similarity between the distributed representations (vectors) corresponding to search queries (step S202). Specifically, the information processing device 100 calculates the similarity between the vector corresponding to the received search query and the vector corresponding to each of the other search queries.

After calculating the similarity between the vectors, the information processing device 100 determines whether the calculated similarity exceeds a predetermined threshold (step S203). If the calculated similarity does not exceed the predetermined threshold (step S203; No), the information processing device 100 ends the processing.

On the other hand, if the calculated similarity exceeds the predetermined threshold (step S203; Yes), the information processing device 100 extracts similar queries, that is, queries having features similar to those of the received search query (step S204). Specifically, when the similarity between the vector corresponding to the received search query and the vector corresponding to another search query exceeds the predetermined threshold, the information processing device 100 extracts that other search query as a similar query. Having extracted the similar queries, the information processing device 100 determines a recommended area based on them (step S205).
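Steps S202 to S204 can be sketched as follows. This is a minimal illustration rather than the device's actual implementation: this section does not name the similarity measure, so cosine similarity is an assumption, and the function names, the default threshold, and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_similar_queries(received, vectors, threshold=0.8):
    """Return (query, similarity) pairs for the other queries whose vectors
    exceed `threshold` against the received query (steps S202 to S204).
    An empty list corresponds to the "No" branch of step S203."""
    target = vectors[received]
    hits = []
    for query, vec in vectors.items():
        if query == received:
            continue  # compare only against the *other* search queries
        sim = cosine_similarity(target, vec)   # step S202
        if sim > threshold:                    # step S203
            hits.append((query, sim))          # step S204
    # Most similar first, so a caller could pick the top recommended area.
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy vectors standing in for the output of the first learning model.
toy_vectors = {
    "place name #1": [1.0, 0.0],
    "place name #2": [0.9, 0.1],
    "place name #3": [0.0, 1.0],
}
similar = extract_similar_queries("place name #1", toy_vectors)
```

With these toy values only "place name #2" is extracted; step S205 would then determine the recommended area from that result.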

[1-6. Modifications]
The information processing system 1 according to the first embodiment described above may be implemented in various forms other than the embodiment above. Other embodiments of the information processing system 1 are therefore described below. Parts identical to those of the embodiment are given the same reference numerals, and their description is omitted.

[1-6-1. Recommending real estate areas based on conceptual queries]
Next, information processing according to a modification will be described with reference to FIG. 9. FIG. 9 is a diagram showing an example of information processing according to the modification. In FIG. 9, the generation unit 132 uses the first learning model to generate distributed representations (vectors) corresponding to conceptual keywords expected in real estate searches (hereinafter referred to as "concept queries" as appropriate) (step S1-A). Examples of such conceptual keywords include "safe neighborhood", "academic town", "good sunlight", "beautiful fireworks", and "spacious floor plan".

The balloon drawn with a dotted line on the right side of FIG. 9 shows the distributed representations (vectors) of the concept queries generated by the generation unit 132 in FIG. 9 being mapped into the distributed representation space, in addition to the distributed representations (vectors) of the place-name queries generated by the generation unit 132 in FIG. 1. For example, the point representing the distributed representation (vector) of the concept query "safe neighborhood" and the point representing that of the place-name query "place name #21" are located relatively close to each other in the distributed representation space. That is, the figure indicates that the concept query "safe neighborhood" and the place-name query "place name #21" have similar features. In contrast, the point representing the distributed representation (vector) of the concept query "academic town" and the point representing that of the place-name query "place name #21" are located relatively far apart in the distributed representation space. That is, the figure indicates that the concept query "academic town" and the place-name query "place name #21" have different features.

The providing unit 133 receives the concept query "safe neighborhood" from the user U21 via content C21, which includes a search box for entering a free-word search query (step S2-A). When the providing unit 133 receives the concept query "safe neighborhood", the calculation unit 134 calculates the similarity between the previously generated distributed representation (vector) of that concept query and the distributed representation (vector) of a place-name query (step S3-A). Once the similarity has been calculated, the extraction unit 135 determines whether it exceeds a predetermined threshold. If the extraction unit 135 determines that the similarity exceeds the threshold, it extracts that place-name query as a similar query having features similar to those of the concept query "safe neighborhood" (step S4-A).

For example, the calculation unit 134 calculates the similarity between the distributed representation (vector) of the concept query "safe neighborhood" and the distributed representation (vector) of the place-name query "place name #21" as 0.9. The extraction unit 135 then determines whether this similarity exceeds a predetermined threshold (for example, 0.8). Because the similarity of 0.9 exceeds the threshold of 0.8, the extraction unit 135 extracts the place-name query "place name #21" as a similar query for the concept query "safe neighborhood". In the same way, the calculation unit 134 calculates, for every place-name query, the similarity to the distributed representation (vector) of the concept query "safe neighborhood", and the extraction unit 135 determines, for every place-name query, whether the similarity exceeds the predetermined threshold.

When similar queries have been extracted by the extraction unit 135, the determination unit 136 determines, based on them, a recommended area to recommend to the user U21 (step S5-A). Specifically, the determination unit 136 decides to recommend, as a recommended area, the real estate area corresponding to a place-name query extracted as a similar query. For example, the determination unit 136 decides to recommend to the user U21 the real estate area corresponding to the place-name query "place name #21", which was extracted as a similar query for the concept query "safe neighborhood".

When the recommended area has been determined by the determination unit 136, the providing unit 133 transmits information on that recommended area (for example, information on real estate properties in the recommended area) to the user U21 (step S6-A).
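The point of this modification is that concept queries and place-name queries share one distributed representation space, while only place-name queries are candidates for recommendation. The sketch below illustrates steps S3-A to S5-A under assumptions: cosine similarity as the measure, a `kind` tag on each stored vector, and invented vector values chosen so that "safe neighborhood" lands near "place name #21". None of these details come from the text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical shared space: query text -> (kind, vector).
SPACE = {
    "safe neighborhood": ("concept", [0.9, 0.1]),
    "academic town": ("concept", [0.1, 0.9]),
    "place name #21": ("place", [0.8, 0.2]),
    "place name #22": ("place", [0.2, 0.8]),
}


def recommend_areas(concept_query, space, threshold=0.8):
    """Return the place-name queries whose similarity to the concept query
    exceeds the threshold; other concept queries are never candidates."""
    kind, target = space[concept_query]
    if kind != "concept":
        raise ValueError("expected a concept query")
    return [
        name
        for name, (candidate_kind, vec) in space.items()
        if candidate_kind == "place" and cosine(target, vec) > threshold
    ]
```

The same function shape would apply to the refinement-condition variant by tagging re-search queries with their own kind and filtering on that kind instead.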

[1-6-2. Recommending refinement conditions based on conceptual queries]
Next, information processing according to a modification will be described with reference to FIG. 10. FIG. 10 is a diagram showing an example of information processing according to the modification. In FIG. 10, the determination unit 136 determines, based on the similar queries extracted by the extraction unit 135, candidate queries for re-searching as the recommendation information. Specifically, the generation unit 132 uses the first learning model to generate distributed representations (vectors) corresponding to keywords that indicate the refinement conditions a user applies when narrowing down properties in a real estate search (hereinafter referred to as "re-search queries" as appropriate) (step S1-B). Examples of such keywords include "high-rise condominium" and "low-rise condominium", which indicate features of a property, and "riverside" and "within a 5-minute walk of a station", which indicate the location of a property.

The balloon drawn with a dotted line on the right side of FIG. 10 shows the distributed representations (vectors) of the re-search queries generated by the generation unit 132 in FIG. 10 being mapped into the distributed representation space, in addition to the distributed representations (vectors) of the place-name queries generated in FIG. 1 and of the concept queries generated in FIG. 9. For example, the point representing the distributed representation (vector) of the concept query "beautiful fireworks" and the point representing that of the re-search query "high-rise condominium" are located relatively close to each other in the distributed representation space. That is, the figure indicates that the concept query "beautiful fireworks" and the re-search query "high-rise condominium" have similar features. In contrast, the point representing the concept query "beautiful fireworks" and the point representing the re-search query "low-rise condominium" are located relatively far apart in the distributed representation space. That is, the figure indicates that the concept query "beautiful fireworks" and the re-search query "low-rise condominium" have different features.

The providing unit 133 receives the concept query "beautiful fireworks" from the user U31 via content C21, which includes a search box for entering a free-word search query (step S2-B). When the providing unit 133 receives the concept query "beautiful fireworks", the calculation unit 134 calculates the similarity between the previously generated distributed representation (vector) of that concept query and the distributed representation (vector) of a re-search query (step S3-B). Once the similarity has been calculated, the extraction unit 135 determines whether it exceeds a predetermined threshold. If the extraction unit 135 determines that the similarity exceeds the threshold, it extracts that re-search query as a similar query having features similar to those of the concept query "beautiful fireworks" (step S4-B).

For example, the calculation unit 134 calculates the similarity between the distributed representation (vector) of the concept query "beautiful fireworks" and the distributed representation (vector) of the re-search query "high-rise condominium" as 0.9. The extraction unit 135 then determines whether this similarity exceeds a predetermined threshold (for example, 0.8). Because the similarity of 0.9 exceeds the threshold of 0.8, the extraction unit 135 extracts the re-search query "high-rise condominium" as a similar query for the concept query "beautiful fireworks". In the same way, the calculation unit 134 calculates, for every re-search query, the similarity to the distributed representation (vector) of the concept query "beautiful fireworks", and the extraction unit 135 determines, for every re-search query, whether the similarity exceeds the predetermined threshold.

When similar queries have been extracted by the extraction unit 135, the determination unit 136 determines, based on them, the refinement conditions to recommend to the user U31 (step S5-B). Specifically, the determination unit 136 decides to recommend the refinement condition corresponding to a re-search query extracted as a similar query. For example, the determination unit 136 decides to recommend to the user U31 the refinement condition corresponding to the re-search query "high-rise condominium", which was extracted as a similar query for the concept query "beautiful fireworks".

When the refinement conditions to recommend have been determined by the determination unit 136, the providing unit 133 transmits information on those refinement conditions (for example, content in which the check boxes corresponding to the recommended refinement conditions are already checked) to the user U31 (step S6-B).

Next, content switching processing according to the modification will be described with reference to FIG. 11. FIG. 11 is a diagram illustrating an example of processing in which the user terminal according to the modification switches content. The left part of FIG. 11 shows an example of the content C21 shown in FIG. 10. When the search button B1 displayed in the content C21 is pressed through an operation by the user U31, the user terminal 10 transmits the concept query "beautiful fireworks" to the information processing device 100. The information processing device 100 receives the concept query "beautiful fireworks" from the user U31 (step S2-B shown in FIG. 10). The information processing device 100 then executes the processing of steps S3-B to S6-B shown in FIG. 10.

The right part of FIG. 11 shows an example of the content C22 that the information processing device 100 transmits to the user U31 in step S6-B shown in FIG. 10. The information processing device 100 transmits to the user U31 the content C22, in which, among the refinement conditions of the conditional search, the check boxes corresponding to the recommended refinement conditions are already checked. On receiving the content C22, the user terminal 10 displays it on the screen. When the search button B2 displayed in the content C22 is pressed through an operation by the user U31, the user terminal 10 transmits to the information processing device 100 a search request to search for real estate information under the refinement conditions whose check boxes are checked.
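The switch from content C21 to content C22 can be pictured as building a list of check boxes in which the recommended refinement conditions are pre-checked, and then turning the checked boxes into a search request when button B2 is pressed. The data layout and function names below are hypothetical; the text does not describe how the content is represented.

```python
def build_refinement_content(all_conditions, recommended):
    """Build a C22-like structure: one check box per refinement condition,
    pre-checked when the condition was recommended (step S6-B)."""
    recommended = set(recommended)
    return [
        {"label": condition, "checked": condition in recommended}
        for condition in all_conditions
    ]


def build_search_request(content):
    """Collect the conditions whose boxes are checked when the user presses
    the search button (B2 in the text)."""
    return [box["label"] for box in content if box["checked"]]


conditions = [
    "high-rise condominium",
    "low-rise condominium",
    "riverside",
    "within a 5-minute walk of a station",
]
content_c22 = build_refinement_content(conditions, ["high-rise condominium"])

# The user may still change the boxes before searching.
content_c22[2]["checked"] = True
request = build_search_request(content_c22)
```

Keeping the recommendation as a default state of the form, rather than running the refined search immediately, leaves the final choice of conditions with the user, which matches the two-button flow described for FIG. 11.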

[1-6-3. Application to fields other than real estate]
The examples above describe the information processing device 100 recommending a real estate area to a user who entered a predetermined query in a real estate search service. The information processing device 100 is not limited to these examples: it also extracts feature information indicating the features of a predetermined query for content in fields other than real estate in general, such as products, videos, music, restaurants, food, and companies (stock prices, job hunting).

Specifically, in a search service that searches content in a field other than real estate, such as products, videos, music, restaurants, food, or companies (stock prices, job hunting), the information processing device 100 acquires a predetermined query relating to that field. The information processing device 100 then extracts similar queries having features similar to those of the predetermined query, using the first learning model, which has learned the features of a plurality of search queries on the assumption that search queries entered by the same user within a predetermined time have similar features. The information processing device 100 then determines recommendation information based on the extracted similar queries.

[2. Second embodiment]
[2-1. Example of information processing]
Next, a second embodiment will be described. The first embodiment described above is an example of information processing in which the information processing device 100 extracts similar queries resembling a predetermined query received from a user and determines recommendation information based on the extracted similar queries. The second embodiment is an example of information processing in which the information processing device 100A extracts the category to which a predetermined query received from a user belongs and determines recommendation information based on the extracted category. In the second embodiment, components identical to those of the first embodiment are given the same reference numerals, and their description is omitted.

The generation device 50 generates the second learning model. An outline of this generation processing is given here; the details are described later. Specifically, the generation device 50 uses the first learning model to generate a second learning model that predicts, from a predetermined search query, the category to which that search query belongs. More specifically, after generating the first learning model, the generation device 50 acquires it (the model data MDT1 of the first learning model M1). Having acquired the first learning model M1, the generation device 50 uses it to generate the second learning model M2. By re-training the first learning model M1, the generation device 50 generates a second learning model M2 whose connection coefficients, that is, the weights of the learning model, differ from those of the first learning model M1. For example, the generation device 50 trains the model so that, when a search query is input to the learning model, the classification result for the distributed representation output by the model corresponds to the category to which the search query belongs. This yields the second learning model M2, which predicts from a predetermined search query the category to which it belongs.
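The category-prediction idea can be sketched with a softmax classification layer trained by cross-entropy on top of query vectors. This is a deliberate simplification and an assumption: the text describes re-training the first learning model itself so that its connection coefficients change, whereas the sketch below keeps the query vectors fixed and updates only an added linear layer. All names, the learning rate, and the toy data are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, vec):
    """Category probabilities for one query vector."""
    logits = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    return softmax(logits)


def train_step(weights, vec, label, lr=0.5):
    """One gradient-descent step on the cross-entropy loss (updates in place)."""
    probs = predict(weights, vec)
    for k, row in enumerate(weights):
        grad = probs[k] - (1.0 if k == label else 0.0)
        for j in range(len(row)):
            row[j] -= lr * grad * vec[j]


# Toy data: two query vectors, each labeled with a category index.
weights = [[0.0, 0.0], [0.0, 0.0]]          # one row per category
samples = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
for _ in range(50):
    for vec, label in samples:
        train_step(weights, vec, label)
```

After training, `predict(weights, [1.0, 0.0])` assigns most of the probability to category 0, which plays the role of the per-subcategory probabilities that the second learning model outputs.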

The information processing device 100A is a server device that provides the real estate information search service R1. The information processing device 100A acquires the model data of the second learning model from the generation device 50. In the following, the model data of the second learning model may be referred to simply as the second learning model. For example, using the second learning model, the information processing device 100A recommends, as recommended properties, properties belonging to the real estate area that corresponds to the category into which a predetermined place-name query received from the user is classified.

[2-2. Configuration of the information processing device]
Next, the configuration of the information processing device 100A according to the second embodiment will be described with reference to FIG. 12. FIG. 12 is a diagram showing a configuration example of the information processing device 100A according to the second embodiment. As shown in FIG. 12, the information processing device 100A includes a communication unit 110, a storage unit 120A, and a control unit 130A. The information processing device 100A may also include an input unit (for example, a keyboard or a mouse) that accepts various operations from an administrator or the like of the information processing device 100A, and a display unit (for example, a liquid crystal display) for displaying various kinds of information.

(Storage unit 120A)
The storage unit 120A is realized by, for example, a semiconductor memory element such as a RAM or a flash memory, or a storage device such as a hard disk or an optical disc. As shown in FIG. 12, the storage unit 120A includes a model information storage unit 121, a category information storage unit 122A, a search information storage unit 123, and a content storage unit 124.

(Category information storage unit 122A)
The category information storage unit 122A stores various kinds of information on the categories to which search queries belong. Specifically, the category information storage unit 122A stores various kinds of information on the categories that the trained second learning model outputs when a search query is input to it. FIG. 13 shows an example of the category information storage unit according to the second embodiment. In the example shown in FIG. 13, the category information storage unit 122A has the items "search query", "major category", "subcategory", and "probability (%)".

"Search query" indicates a search query entered by a user. "Major category" indicates the major category into which the search query is classified. "Subcategory" indicates the subcategory into which the search query is classified. "Probability (%)" indicates the probability for each subcategory that the trained second learning model outputs when the search query is input to it.

In the example shown in FIG. 13, the major category "find a real estate area" indicates that the search intent behind the queries classified into it is to find a real estate area. In this example, the major category "find a real estate area" has four subcategories. The subcategory "find an upscale residential area" belongs to the major category "find a real estate area" and indicates that a search query classified into it was entered by the user with the intent of finding an upscale residential area. Likewise, the subcategory "find a shitamachi (old downtown) area" belongs to the same major category and indicates that the query was entered with the intent of finding a shitamachi area. The subcategory "find a bay area" belongs to the same major category and indicates that the query was entered with the intent of finding a bay area. The subcategory "find a suburban area" belongs to the same major category and indicates that the query was entered with the intent of finding a suburban area.

In the example shown in FIG. 13, the probability (%) “90” for the search query “place name #11” indicates a 90% probability that the query is classified as a query input with the intention of searching for a high-class residential area. Similarly, the probability (%) “0” indicates a 0% probability that the search query “place name #11” is classified as a query input with the intention of searching for a downtown area, the probability (%) “10” indicates a 10% probability that it is classified as a query input with the intention of searching for a bayside area, and the probability (%) “0” indicates a 0% probability that it is classified as a query input with the intention of searching for a suburban area.

(Control unit 130A)
Returning to the description of FIG. 12, the control unit 130A is a controller realized by, for example, a CPU or an MPU executing various programs (corresponding to an example of the information processing program) stored in a storage device inside the information processing device 100A, using a RAM as a work area. The control unit 130A is also a controller realized by, for example, an integrated circuit such as an ASIC or an FPGA.

As illustrated in FIG. 12, the control unit 130A includes an acquisition unit 131, a generation unit 132, a provision unit 133, a calculation unit 134A, an extraction unit 135A, and a determination unit 136A, and realizes or executes the information processing operations described below. Note that the internal configuration of the control unit 130A is not limited to the configuration shown in FIG. 12 and may be any other configuration that performs the information processing described later.

(Acquisition unit 131)
The acquisition unit 131 acquires the second learning model. More specifically, the acquisition unit 131 acquires, from the generation device 50, the second learning model generated by the generation device 50. Upon acquiring the second learning model, the acquisition unit 131 stores it in the model information storage unit 121.

(Calculation unit 134A)
The calculation unit 134A calculates, for each category, the probability that a search query belongs to that category. Specifically, when the provision unit 133 accepts a predetermined search query, the calculation unit 134A inputs the search query into the second learning model acquired by the acquisition unit 131 and calculates, for each category, the probability that the query belongs to that category. For example, when the provision unit 133 accepts a predetermined place name query, the calculation unit 134A inputs the accepted place name query into the second learning model acquired by the acquisition unit 131 and calculates, for each category, the probability that the place name query belongs to that category. For example, the calculation unit 134A calculates the probability that the accepted place name query belongs to each of the four categories (minor classifications) “search for a high-class residential area”, “search for a downtown area”, “search for a bayside area”, and “search for a suburban area”.
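The per-category probability calculation can be illustrated with a minimal sketch. The description does not state how the second learning model turns its outputs into probabilities; the softmax used here, the function name `category_probabilities`, and the raw scores are all assumptions made for illustration only.

```python
import math

def category_probabilities(scores):
    """Convert raw per-category scores into probabilities with a softmax.

    Assumption: the second learning model yields one raw score per minor
    classification. The description does not specify this mechanism.
    """
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {category: math.exp(score - top) for category, score in scores.items()}
    total = sum(exps.values())
    return {category: value / total for category, value in exps.items()}

# Hypothetical raw scores for one place name query.
scores = {
    "search for a high-class residential area": 3.0,
    "search for a downtown area": -2.0,
    "search for a bayside area": 0.8,
    "search for a suburban area": -2.0,
}
probabilities = category_probabilities(scores)
```

The resulting dictionary has one probability per minor classification, and the probabilities sum to 1, which matches the shape of the FIG. 13 table.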

(Extraction unit 135A)
The extraction unit 135A extracts, as the feature information, the category to which a predetermined query belongs. For example, the extraction unit 135A determines, for each category (minor classification), whether the probability calculated by the calculation unit 134A exceeds a predetermined threshold. Then, when the probability that the accepted search query belongs to a given category exceeds the predetermined threshold, the extraction unit 135A extracts that category as the category into which the accepted search query is classified. For example, when the probability that the accepted place name query belongs to the “search for a high-class residential area” category is 90% and the predetermined threshold is 80%, the extraction unit 135A extracts the “search for a high-class residential area” category as the category into which the accepted place name query is classified.
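The threshold test described for the extraction unit 135A can be sketched as follows. The function name `extract_categories` is hypothetical; the probabilities are those of “place name #11” in FIG. 13, and the 80% threshold is the one used in the example above.

```python
def extract_categories(probabilities, threshold=0.8):
    """Return every category whose probability strictly exceeds the threshold."""
    return [category for category, p in probabilities.items() if p > threshold]

# Probabilities for "place name #11" as listed in FIG. 13.
place_name_11 = {
    "search for a high-class residential area": 0.90,
    "search for a downtown area": 0.00,
    "search for a bayside area": 0.10,
    "search for a suburban area": 0.00,
}

extracted = extract_categories(place_name_11)
```

With these inputs only the “search for a high-class residential area” category is extracted. If no category exceeds the threshold, the list is empty.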

In addition, the extraction unit 135A extracts the feature information using a learning model that, when a predetermined search query is input as input information, outputs a distributed representation of that search query as output information. For example, the extraction unit 135A extracts the category to which the predetermined query belongs as the feature information, using the second learning model generated from the first learning model, which outputs a distributed representation of a predetermined search query when that search query is input.

In addition, the extraction unit 135A extracts the feature information using a learning model that has learned the features of a plurality of search queries by being trained so that the distributed representations of a pair of search queries input consecutively within a predetermined time become similar. For example, the extraction unit 135A extracts the category to which the predetermined query belongs as the feature information, using the second learning model generated from a first learning model trained in this way.

In addition, the extraction unit 135A extracts the feature information using a learning model that has learned the features of a plurality of search queries by treating, as the plurality of search queries input by the same user within a predetermined time, search queries that contain character strings separated by a predetermined delimiter as having similar features. For example, the extraction unit 135A extracts the category to which the predetermined query belongs as the feature information, using the second learning model generated from a first learning model trained in this way.

In addition, the extraction unit 135A extracts the feature information using a learning model that has learned the features of a plurality of search queries by treating randomly extracted search queries as having differing features. For example, the extraction unit 135A extracts the category to which the predetermined query belongs as the feature information, using the second learning model generated from a first learning model trained in this way.

In addition, the extraction unit 135A extracts the feature information using a learning model that has learned the features of a plurality of search queries by being trained so that the distributed representations of a randomly extracted pair of search queries differ. For example, the extraction unit 135A extracts the category to which the predetermined query belongs as the feature information, using the second learning model generated from a first learning model trained in this way.
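The two training signals described above (pairs input within a predetermined time should have similar representations, randomly extracted pairs should have differing ones) can be expressed as a per-pair loss. The sketch below is only one possible formulation: the cosine-based loss, the `margin` parameter, and the function names are assumptions, not something the description specifies.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pair_loss(u, v, is_positive, margin=0.0):
    """Loss for one pair of distributed representations.

    Positive pair (input by the same user within the predetermined time):
    the loss falls as the cosine similarity approaches 1.
    Negative pair (randomly extracted): the loss is zero once the
    similarity is at or below the (assumed) margin.
    """
    similarity = cosine(u, v)
    if is_positive:
        return 1.0 - similarity
    return max(0.0, similarity - margin)
```

Minimizing this loss over many pairs pulls the representations of co-occurring queries together and pushes those of random pairs apart.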

(Determination unit 136A)
The determination unit 136A determines the recommendation information to be recommended to the user who input the predetermined query, based on the category extracted by the extraction unit 135A. For example, based on the “search for a high-class residential area” category extracted by the extraction unit 135A, the determination unit 136A determines that properties located in high-class residential areas are to be recommended to the user who input the predetermined place name query.

[2-3. Prediction processing flow]
Next, the procedure of the prediction process according to the second embodiment will be described with reference to FIG. 14. FIG. 14 is a flowchart showing the prediction processing procedure according to the second embodiment. In the example illustrated in FIG. 14, the information processing device 100A acquires a search query and the second learning model (step S301). Having acquired the search query and the second learning model, the information processing device 100A then uses the second learning model to estimate the category into which the search query is classified (step S302).

[2-4. Information processing flow]
Next, the procedure of the information processing according to the second embodiment will be described with reference to FIG. 15. FIG. 15 is a flowchart showing the information processing procedure according to the second embodiment. In the example illustrated in FIG. 15, the information processing device 100A determines whether a search query has been accepted (step S401). When no search query has been accepted (step S401; No), the information processing device 100A waits until a search query is accepted.

On the other hand, when a search query has been accepted (step S401; Yes), the information processing device 100A calculates, for each category, the probability that the search query belongs to that category (step S402).

Subsequently, having calculated the probability that the search query belongs to a given category, the information processing device 100A determines whether the calculated probability exceeds a predetermined threshold (step S403). When the calculated probability does not exceed the predetermined threshold (step S403; No), the information processing device 100A ends the process.

On the other hand, when the calculated probability exceeds the predetermined threshold (step S403; Yes), the information processing device 100A extracts the category into which the accepted search query is classified (step S404). Specifically, when the probability that the accepted search query belongs to a given category exceeds the predetermined threshold, the information processing device 100A extracts that category as the category into which the accepted search query is classified. Having extracted the category, the information processing device 100A then determines the recommendation information based on the extracted category (step S405).
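Steps S402 to S405 can be sketched end to end as follows. Everything named here is hypothetical: `fig13_model` is a stand-in that returns the FIG. 13 probabilities instead of running the second learning model, `RECOMMENDATIONS` is an illustrative mapping, and choosing the highest-probability category when several exceed the threshold is an assumption the description does not state.

```python
def fig13_model(query):
    """Stand-in for the second learning model (assumption): returns the
    per-category probabilities listed for "place name #11" in FIG. 13."""
    return {
        "search for a high-class residential area": 0.90,
        "search for a downtown area": 0.00,
        "search for a bayside area": 0.10,
        "search for a suburban area": 0.00,
    }

# Hypothetical mapping from an extracted category to recommendation information.
RECOMMENDATIONS = {
    "search for a high-class residential area": "properties in high-class residential areas",
}

def process_query(query, model, recommendations, threshold=0.8):
    """Run steps S402 to S405 for one accepted search query."""
    probabilities = model(query)                                        # S402
    extracted = [c for c, p in probabilities.items() if p > threshold]  # S403, S404
    if not extracted:
        return None                                                     # S403; No
    best = max(extracted, key=probabilities.get)  # assumed tie-handling rule
    return recommendations.get(best)                                    # S405

result = process_query("place name #11", fig13_model, RECOMMENDATIONS)
```

When no category exceeds the threshold, the function returns `None`, which corresponds to ending the process at step S403.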

[3. Learning model generation processing]
[3-1. Generation processing of first learning model]
Next, the flow of the generation process of the first learning model will be described with reference to FIG. 16. FIG. 16 is a diagram illustrating an example of the first learning model generation process according to the embodiment. In the example illustrated in FIG. 16, the generation device 50 extracts a pair of search queries consisting of the search query Q11 “Roppongi pasta” and the search query Q12 “Roppongi Italian”, which were input consecutively by the same user U1 within a predetermined time (step S11).

Subsequently, the generation device 50 inputs the extracted search query Q11 into the first model M1, which outputs a vector BQV11 as the distributed representation of the search query Q11. Here, the vector BQV11 is the distributed representation of the search query Q11 as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning). The generation device 50 also inputs the extracted search query Q12 into the first model M1, which outputs a vector BQV12 as the distributed representation of the search query Q12. Here, the vector BQV12 is the distributed representation of the search query Q12 as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning). In this way, the generation device 50 outputs the vector BQV11, which is the distributed representation of the search query Q11, and the vector BQV12, which is the distributed representation of the search query Q12 (step S12).

Subsequently, because the pair of search queries consisting of the search query Q11 (“Roppongi pasta”) and the search query Q12 (“Roppongi Italian”), input consecutively by the same user U1 within a predetermined time, is presumed to have been input with a given search intent (for example, the intent “search for a restaurant in a certain place”), the generation device 50 treats the two queries as having mutually similar features and trains the first model M1 so that the distributed representation of the search query Q11 (vector QV11) and the distributed representation of the paired search query Q12 (vector QV12) become similar in the distributed representation space.
For example, let Θ be the angle between the vector BQV11, which is the distributed representation of the search query Q11 before feedback is applied to the first model M1 (before learning), and the vector BQV12, which is the corresponding distributed representation of the search query Q12. Let Φ be the angle between the vector QV11, which is the distributed representation of the search query Q11 after feedback is applied to the first model M1 (after learning), and the vector QV12, which is the corresponding distributed representation of the search query Q12. The generation device 50 trains the first model M1 so that Φ becomes smaller than Θ. For example, the generation device 50 calculates the cosine similarity between the vectors BQV11 and BQV12 and the cosine similarity between the vectors QV11 and QV12. The generation device 50 then trains the first model M1 so that the cosine similarity between the vectors QV11 and QV12 becomes larger than the cosine similarity between the vectors BQV11 and BQV12 (that is, so that the value approaches 1). In this way, by training the first model M1 so that the two vectors that are the pair of distributed representations corresponding to a pair of search queries become similar in the distributed representation space, the generation device 50 generates the first model M1, which outputs a distributed representation (vector) from a search query (step S13).
Note that the generation device 50 is not limited to cosine similarity and may calculate the similarity between distributed representations (vectors) based on any index applicable as a distance measure between vectors. Likewise, the generation device 50 may train the first model M1 based on any index applicable as a distance measure between vectors. For example, the generation device 50 may calculate the value of a predetermined distance function between distributed representations (vectors), such as the Euclidean distance, a distance in a non-Euclidean space such as a hyperbolic space, the Manhattan distance, or the Mahalanobis distance. The generation device 50 may then train the first model M1 so that the value of the predetermined distance function between the distributed representations (that is, their distance in the distributed representation space) becomes small.
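The relationship between the angles Θ and Φ and the cosine similarity can be illustrated numerically. The sketch below simplifies the setting considerably: it treats the two vectors as free parameters and applies a single gradient-ascent step on their cosine similarity, whereas in the description the vectors are outputs of the first model M1 and the feedback is applied to the model. The vector values, the learning rate, and all function names are assumptions.

```python
import math

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))

def angle(u, v):
    """Angle in radians between u and v (clamped against rounding error)."""
    return math.acos(max(-1.0, min(1.0, cosine(u, v))))

def cosine_grad(u, v):
    """Gradient of cosine(u, v) with respect to u."""
    nu, nv, c = norm(u), norm(v), cosine(u, v)
    return [b / (nu * nv) - c * a / (nu * nu) for a, b in zip(u, v)]

def pull_together(u, v, lr=0.1):
    """One gradient-ascent step on the cosine similarity of u and v."""
    grad_u, grad_v = cosine_grad(u, v), cosine_grad(v, u)
    new_u = [a + lr * g for a, g in zip(u, grad_u)]
    new_v = [b + lr * g for b, g in zip(v, grad_v)]
    return new_u, new_v

# Illustrative "before learning" vectors standing in for BQV11 and BQV12.
bqv11, bqv12 = [1.0, 0.0], [0.0, 1.0]
qv11, qv12 = pull_together(bqv11, bqv12)

theta = angle(bqv11, bqv12)  # angle before the update
phi = angle(qv11, qv12)      # angle after the update
```

After the step, the cosine similarity is larger and Φ is smaller than Θ, which is the direction of change the training is described as producing.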

Next, the flow of the generation process of the first learning model will be described in more detail with reference to FIG. 17. In the description of FIG. 17, portions that overlap the description of FIG. 16 are omitted as appropriate. FIG. 17 is a diagram showing the generation process of the first learning model according to the embodiment. In the following, the first learning model is referred to as the first model (or the first model M1) where appropriate. The example illustrated in FIG. 17 shows how the distributed representations (vectors) output by the first model M1 generated by the generation device 50 are mapped into the distributed representation space. The generation device 50 trains the first model M1 so that the distributed representation of a predetermined search query and the distributed representation of another search query paired with it are mapped close to each other in the distributed representation space.

In the example illustrated in the upper part of FIG. 17, the generation device 50 extracts four search queries input consecutively by the same user U1 within a predetermined time: the search query Q11 (“Roppongi pasta”), the search query Q12 (“Roppongi Italian”), the search query Q13 (“Akasaka pasta”), and the search query Q14 (“Azabu pasta”). The generation device 50 extracts four search queries for which the interval between the times at which the same user U1 input each search query is within the predetermined time. That is, the generation device 50 extracts a plurality of search queries for which the interval between the inputs of each pair of search queries described below is within the predetermined time. Arranged in order of input, the four extracted search queries are the search query Q11, the search query Q12, the search query Q13, and the search query Q14. Having extracted the four search queries, the generation device 50 treats two chronologically adjacent search queries as a pair and extracts three pairs of search queries: (search query Q11, search query Q12), (search query Q12, search query Q13), and (search query Q13, search query Q14) (step S21-1).
Note that the generation device 50 may instead extract a plurality of search queries all of which were input by the same user U1 within a predetermined time. The generation device 50 may then select two search queries from among the extracted search queries, regardless of whether they are chronologically adjacent, and extract the selected two search queries as a pair.
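Both pairing strategies can be sketched against a small per-user query log. The log format (a time-sorted list of `(timestamp_seconds, query)` tuples), the timestamps, the 300-second limit, and the function names are all assumptions introduced for illustration.

```python
from itertools import combinations

def adjacent_pairs(log, max_gap):
    """Pair chronologically adjacent queries whose input interval is within max_gap.

    `log` is a time-sorted list of (timestamp_seconds, query) for one user.
    """
    return [
        (q1, q2)
        for (t1, q1), (t2, q2) in zip(log, log[1:])
        if t2 - t1 <= max_gap
    ]

def all_pairs(log, window):
    """Pair any two queries, adjacent or not, input within `window` seconds of each other."""
    return [
        (q1, q2)
        for (t1, q1), (t2, q2) in combinations(log, 2)
        if t2 - t1 <= window
    ]

# Hypothetical log for user U1; the timestamps are invented for illustration.
log_u1 = [
    (0, "Roppongi pasta"),      # Q11
    (60, "Roppongi Italian"),   # Q12
    (120, "Akasaka pasta"),     # Q13
    (180, "Azabu pasta"),       # Q14
]

pairs_adjacent = adjacent_pairs(log_u1, max_gap=300)
pairs_all = all_pairs(log_u1, window=300)
```

With this log, `adjacent_pairs` yields the three pairs (Q11, Q12), (Q12, Q13), and (Q13, Q14) of step S21-1, while `all_pairs` yields all six combinations of the four queries.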

Subsequently, the generation device 50 inputs each extracted search query Q1k (k = 1, 2, 3, 4) into the first model M1, which outputs a vector BQV1k (k = 1, 2, 3, 4) as the distributed representation of the search query Q1k. Here, the vector BQV1k (k = 1, 2, 3, 4) is the distributed representation of the search query Q1k as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning) (step S22-1).

Subsequently, because a pair of search queries input consecutively by the same user U1 within a predetermined time is presumed to have been input with a given search intent (for example, the intent “search for a restaurant in a certain place (near Minato-ku, Tokyo)”), the generation device 50 treats the two queries as having mutually similar features and trains the first model M1 so that the distributed representation of the search query Q11 (vector QV11) and the distributed representation of the paired search query Q12 (vector QV12) become similar in the distributed representation space. The generation device 50 likewise trains the first model M1 so that the distributed representation of the search query Q12 (vector QV12) and the distributed representation of the paired search query Q13 (vector QV13) become similar in the distributed representation space, and so that the distributed representation of the search query Q13 (vector QV13) and the distributed representation of the paired search query Q14 (vector QV14) become similar in the distributed representation space.
In this way, by training the first model M1 so that the two vectors that are the pair of distributed representations corresponding to a pair of search queries become similar in the distributed representation space, the generation device 50 generates the first model M1, which outputs a distributed representation (vector) from a search query (step S23-1).

The upper part of FIG. 17 shows, as a result of this processing, how the vectors QV1k (k = 1, 2, 3, 4), which are the distributed representations of the search queries Q1k (k = 1, 2, 3, 4), are mapped to nearby positions in the distributed representation space as a cluster CL11. For example, the search queries Q1k (k = 1, 2, 3, 4) are presumed to be a set of search queries that the user U1 input with the search intent “search for a restaurant in a certain place (near Minato-ku, Tokyo)”. In other words, the search queries Q1k are presumed to have mutually similar features in that they were all input with that search intent. Here, when a predetermined search query input with the search intent “search for a restaurant in a certain place (near Minato-ku, Tokyo)” is input to the first model, the generation device 50 can output a distributed representation that is mapped to the position of the cluster CL11. Thus, for example, by extracting the search queries corresponding to the distributed representations mapped to the position of the cluster CL11, the generation device 50 can extract search queries that match the search intent “search for a restaurant in a certain place (near Minato-ku, Tokyo)”.
Therefore, the generation device 50 makes it possible to interpret the meaning of a search query appropriately.

In the example illustrated in the lower part of FIG. 17, the generation device 50 extracts three search queries input consecutively by the same user U2 within a predetermined time: the search query Q21 (“refrigerator 400L”), the search query Q22 (“refrigerator medium-sized”), and the search query Q23 (“refrigerator medium-sized recommended”). Arranged in order of input, the three extracted search queries are the search query Q21, the search query Q22, and the search query Q23. Having extracted the three search queries, the generation device 50 treats two chronologically adjacent search queries as a pair and extracts two pairs of search queries: (search query Q21, search query Q22) and (search query Q22, search query Q23) (step S21-2).

Subsequently, the generation device 50 inputs each extracted search query Q2m (m = 1, 2, 3) into the first model M1, which outputs a vector BQV2m (m = 1, 2, 3) as the distributed representation of the search query Q2m. Here, the vector BQV2m (m = 1, 2, 3) is the distributed representation of the search query Q2m as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning) (step S22-2).

Subsequently, because a pair of search queries input consecutively by the same user U2 within a predetermined time is presumed to have been input with a given search intention (for example, the intention of "looking into medium-sized refrigerators"), the generation device 50 regards the paired queries as having mutually similar features. It therefore trains the first model M1 so that the distributed representation of search query Q21 (vector QV21) and the distributed representation of search query Q22, which is paired with search query Q21 (vector QV22), are similar in the distributed representation space. Likewise, the generation device 50 trains the first model M1 so that the distributed representation of search query Q22 (vector QV22) and the distributed representation of search query Q23, which is paired with search query Q22 (vector QV23), are similar in the distributed representation space. In this way, by training the first model M1 so that the two vectors corresponding to each pair of search queries are similar in the distributed representation space, the generation device 50 generates a first model M1 that outputs a distributed representation (vector) from a search query (step S23-2).

The lower part of FIG. 17 shows, as the result of this information processing, the vectors QV2m (m = 1, 2, 3), which are the distributed representations of the search queries Q2m (m = 1, 2, 3), mapped to nearby positions in the distributed representation space as cluster CL21. For example, the search queries Q2m (m = 1, 2, 3) are presumed to be a set of queries that user U2 searched with the intention of "looking into medium-sized refrigerators." That is, the search queries Q2m (m = 1, 2, 3) are presumed to have mutually similar features in that they were all searched under this intention. Consequently, when a given search query input with the intention of "looking into medium-sized refrigerators" is input to the first model, the generation device 50 can output a distributed representation that maps to the position of cluster CL21. Thus, for example, by extracting the search queries whose distributed representations map to the position of cluster CL21, the generation device 50 can extract search queries that match the intention of "looking into medium-sized refrigerators." The generation device 50 therefore makes it possible to interpret the meaning of a search query appropriately.

In addition, the generation device 50 according to the present invention trains the first model M1 by regarding a plurality of randomly extracted search queries as having mutually different features, in that they were searched under different search intentions. Specifically, the generation device 50 trains the first model M1 so that the distributed representation of a given search query and the distributed representation of a search query extracted at random, independently of the given search query, are mapped far apart in the distributed representation space. In the example illustrated in FIG. 17, suppose that the generation device 50 extracts a search query at random, independently of search query Q11, and that search query Q21 is extracted. In this case, the generation device 50 trains the first model M1 so that the distributed representation of search query Q11 (vector QV11) and the distributed representation of the randomly extracted search query Q21 (vector QV21) are mapped far apart in the distributed representation space. As a result, cluster CL11, which contains the vectors QV1k (k = 1, 2, 3, 4), the distributed representations of the search queries Q1k (k = 1, 2, 3, 4) searched with the intention of "looking for a restaurant in a certain place (near Minato-ku, Tokyo)," and cluster CL21, which contains the vectors QV2m (m = 1, 2, 3), the distributed representations of the search queries Q2m (m = 1, 2, 3) searched with the intention of "looking into medium-sized refrigerators," are mapped far apart in the distributed representation space. That is, by training the first model M1 so that the distributed representations of randomly extracted search queries differ from one another, the generation device 50 according to the present invention can output the distributed representations of search queries with different search intentions at distant positions in the distributed representation space.
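The two training signals described so far (paired queries pulled together, randomly extracted queries pushed apart) can be illustrated with a triplet-style margin objective. The description does not state which loss function is used, so the cosine similarity, the margin value, and the function names below are assumptions chosen only to make the idea concrete.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_loss(anchor: Sequence[float], positive: Sequence[float],
              negative: Sequence[float], margin: float = 0.5) -> float:
    """Assumed margin loss: zero once the paired query is more similar to the
    anchor than the randomly extracted query by at least `margin`."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


# Toy 2-D vectors standing in for QV11, QV12 (paired) and QV21 (random).
qv11, qv12, qv21 = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
good = pair_loss(qv11, qv12, qv21)               # already well separated
bad = pair_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])  # pair far, random near
```

In a real system the model weights would be updated to reduce this loss; here the loss is zero for the well-separated configuration and positive for the inverted one, which is the direction of the feedback the description implies.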

Note that, as a result of mapping the distributed representations (vectors) output by the first model M1 generated by the generation device 50 into the distributed representation space, clusters other than the above-described cluster CL11 and cluster CL21 are also generated, such as cluster CL12 and cluster CL22, each of which is a set of distributed representations (vectors) of a plurality of search queries input by the same user within a predetermined time.

As described above, the generation device 50 acquires search queries input by users. By learning on the premise that, among the acquired search queries, a plurality of search queries input by the same user within a predetermined time have similar features, the generation device 50 generates a first model that predicts the feature information of a given search query from that search query. That is, the generation device 50 according to the present invention trains the first model by regarding a plurality of search queries input consecutively within a predetermined time as having mutually similar features, in that they were searched under a given search intention. Specifically, the generation device 50 trains the first model so that the distributed representations of a plurality of search queries input by the same user within a predetermined time are similar, thereby generating a first model that outputs, from a given search query, a distributed representation containing the feature information of that search query. By training the first model M1 in this way, the generation device 50 can output the distributed representations of search queries searched under a given search intention at nearby positions in the distributed representation space. This allows the generation device 50 to output (interpret) the meaning (search intention) of a search query according to the context of the user who input it. The generation device 50 therefore makes it possible to interpret the meaning of a search query appropriately.

Furthermore, by extracting the search queries whose distributed representations map to the neighborhood of the distributed representation containing the feature information of a given search query, the generation device 50 can extract search queries that match the search intention under which the given search query was searched. That is, the generation device 50 makes it possible to analyze users' search trends in consideration of the search intention and context of the user who input the search query. The generation device 50 can therefore improve the accuracy of search trend analysis.

The first model M1 generated by the generation device 50 can also be made to function as part of a search system. Alternatively, the generation device 50 can provide the distributed representation of a search query output by the first model M1 as input information to another system (for example, a search engine) that uses the feature information of the search query predicted by the first model M1. The search system can then select the content to be output as search results based on the feature information of the search query predicted by the first model M1; that is, it can select that content in consideration of the search intention and context of the user who input the search query. Furthermore, based on the same feature information, the search system can calculate the similarity between the distributed representation of a character string contained in the content to be output as a search result and the distributed representation of the search query. The search system can then determine the display order of the content based on the calculated similarity; that is, it can determine the display order in consideration of the search intention and context of the user who input the search query. The generation device 50 can therefore improve usability in the search service.
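As a rough illustration of how a search system might order content by its similarity to the query's distributed representation, the sketch below ranks content vectors by cosine similarity. The description does not specify the similarity measure, and the content identifiers, vectors, and the function name `rank_contents` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def rank_contents(query_vec: Sequence[float],
                  content_vecs: Dict[str, Sequence[float]]) -> List[str]:
    """Return content ids ordered from most to least similar to the query vector."""
    scored = [(cosine(query_vec, vec), cid) for cid, vec in content_vecs.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [cid for _, cid in scored]


# Toy vectors: the query vector would come from the first model M1, and the
# content vectors from character strings contained in each content item.
query_vec = [1.0, 0.2]
content_vecs = {
    "doc_a": [0.9, 0.1],
    "doc_b": [0.0, 1.0],
    "doc_c": [0.5, 0.5],
}
ranking = rank_contents(query_vec, content_vecs)
```

The display order then simply follows `ranking`; any other monotone similarity (for example, negative Euclidean distance) could be substituted without changing the structure.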

[3-2. Second learning model generation process]
Next, the flow of the second learning model generation process is described with reference to FIG. 18. FIG. 18 is a diagram illustrating an example of the second learning model generation process according to the embodiment. In the following, the second learning model is referred to as the second model (or the second model M2) where appropriate. In the example illustrated in the upper part of FIG. 18, the generation device 50 extracts four search queries input consecutively by the same user U1 within a predetermined time: search query Q11 ("Roppongi pasta"), search query Q12 ("Roppongi Italian"), search query Q13 ("Akasaka pasta"), and search query Q14 ("Azabu pasta"). The generation device 50 extracts a plurality of search queries for which the interval between the times at which the same user U1 input the individual search queries is within a predetermined time. The generation device 50 also extracts a plurality of search queries for which the interval between the times at which the same user U1 input each pair of search queries is within a predetermined time. Here, the four search queries are assumed to have been input by user U1 in the order of search query Q11, search query Q12, search query Q13, and search query Q14, each within the predetermined time. Having extracted the four search queries, the generation device 50 treats each two search queries that are adjacent in time series as a pair, and extracts three pairs of search queries: (search query Q11, search query Q12), (search query Q12, search query Q13), and (search query Q13, search query Q14). Having extracted the three pairs, the generation device 50 inputs the extracted search queries Q1k (k = 1, 2, 3, 4) into the first model M1 (step S31). Note that the generation device 50 may instead extract a plurality of search queries all of which were input by the same user U1 within a predetermined time. The generation device 50 may then select two search queries from among the extracted search queries, regardless of whether they are adjacent in time series, and extract the selected two search queries as a pair.

Subsequently, the generation device 50 outputs vectors BQV1k (k = 1, 2, 3, 4), which are the distributed representations of the search queries Q1k (k = 1, 2, 3, 4), as the output data of the first model M1 (step S32). Here, each vector BQV1k (k = 1, 2, 3, 4) is the distributed representation of search query Q1k (k = 1, 2, 3, 4) as just output from the output layer of the first model M1, that is, the distributed representation before feedback is applied to the first model M1 (before learning).

Here, the search queries Q1k (k = 1, 2, 3, 4) input consecutively by the same user U1 within a predetermined time are presumed to be, for example, a set of search queries that user U1 searched with the intention of "looking for a restaurant in a certain place (near Minato-ku, Tokyo)." That is, the search queries Q1k (k = 1, 2, 3, 4) are presumed to have mutually similar features in that they were all searched under this intention. The generation device 50 therefore learns on the premise that consecutively input search queries have similar features, thereby generating a first model that predicts the feature information of a given search query from that search query (step S33). Specifically, the generation device 50 learns on the premise that the distributed representations of consecutively input search queries are similar, thereby generating a first model M1 that predicts the distributed representation of a given search query from that search query. For example, the generation device 50 trains the first model M1 so that the distributed representation of search query Q11 (vector QV11) and the distributed representation of search query Q12, which is paired with search query Q11 (vector QV12), are similar in the distributed representation space. The generation device 50 also trains the first model M1 so that the distributed representation of search query Q12 (vector QV12) and the distributed representation of search query Q13, which is paired with search query Q12 (vector QV13), are similar in the distributed representation space. The generation device 50 further trains the first model M1 so that the distributed representation of search query Q13 (vector QV13) and the distributed representation of search query Q14, which is paired with search query Q13 (vector QV14), are similar in the distributed representation space.

The right side of the upper part of FIG. 18 shows, as the output of the trained first model M1, the vectors QV1k (k = 1, 2, 3, 4), which are the distributed representations of the search queries Q1k (k = 1, 2, 3, 4) input by the same user U1 within a predetermined time, mapped as cluster CL11 in the distributed representation space. In this way, the generation device 50 generates the first learning model M1, which has learned the features of a plurality of search queries input by the same user within a predetermined time.

After generating the first model M1, the generation device 50 acquires the generated first model M1 (model data MDT1 of the first model M1). Having acquired the first model M1, the generation device 50 uses it to generate the second learning model M2. Specifically, the generation device 50 retrains the first model M1 to generate a second model M2 whose connection coefficients, which are the weights of the learning model, differ from those of the first model M1. More specifically, the generation device 50 uses the first model M1 to generate a second learning model M2 that predicts, from a given search query, the category to which that search query belongs (step S34).

In the example illustrated in the lower part of FIG. 18, the generation device 50 generates a second model M2 that, when a search query is input to it, predicts which of four categories the search query belongs to: CAT11 ("search for a restaurant"), CAT12 ("search for a product"), CAT13 ("reserve a restaurant"), or CAT14 ("purchase a product"). Specifically, the generation device 50 generates a second model M2 that, when a search query is input as input information, outputs as output information, for each category, the probability that the search query belongs to that category. For example, the generation device 50 learns, as correct answer data for the second model M2, sets each consisting of a search query and the category to which it belongs (one of CAT11 to CAT14).

Note that a search query belonging to CAT11 ("search for a restaurant") means that the search query was input with the intention of searching for a restaurant. A search query belonging to CAT12 ("search for a product") means that the search query was input with the intention of searching for a product. A search query belonging to CAT13 ("reserve a restaurant") means that the search query was input with the intention of reserving a restaurant. A search query belonging to CAT14 ("purchase a product") means that the search query was input with the intention of purchasing a product.

Specifically, the generation device 50 learns so that, when a search query is input to the learning model, the classification result for the distributed representation output by the learning model corresponds to the category to which the search query belongs, thereby generating a second model M2 that predicts, from a given search query, the category to which that search query belongs. The generation device 50 then generates, for example, a second model M2 that, when a search query is input as input information, outputs as output information the probability that the search query belongs to each of the categories CAT11 to CAT14.

For example, when search query Q11 ("Roppongi pasta") is input to the second model M2 as input information (step S35), the generation device 50 outputs vector BQV11, the distributed representation of search query Q11 ("Roppongi pasta"), as output information. Here, vector BQV11 is the distributed representation of search query Q11 as just output from the output layer of the second model M2, that is, the distributed representation before feedback is applied to the second model M2 (before learning). Suppose that the correct category to which search query Q11 ("Roppongi pasta") belongs is CAT11 ("search for a restaurant"). In this case, the generation device 50 trains the second model M2 so that the probability that vector BQV11, the output distributed representation of search query Q11 ("Roppongi pasta"), is classified into CAT11 ("search for a restaurant") exceeds a predetermined threshold.

The generation device 50 trains the second model using correct answer data prepared in advance. Alternatively, the generation device 50 may itself generate the correct answer data for the second model M2 and train the second model M2 using the generated data. Specifically, the generation device 50 determines the correct category to which a search query belongs based on the post-search behavior of the users who searched with that search query. More specifically, among the users who searched with a given search query, the generation device 50 determines, as the action corresponding to the correct category, the action for which the ratio of users who took that action after the search exceeds a predetermined threshold. For example, suppose that, among the users who searched with search query Q11 ("Roppongi pasta"), the ratio of users who searched for a restaurant after the search was 90%, the ratio who searched for a product was 0%, the ratio who reserved a restaurant was 10%, and the ratio who purchased a product was 0%. In this case, because the ratio of users who searched for a restaurant exceeds the predetermined threshold (for example, 90%), the generation device 50 determines the action of searching for a restaurant to be the action corresponding to the correct category of search query Q11 ("Roppongi pasta"). Having determined that the action corresponding to the correct category is the action of searching for a restaurant, the generation device 50 determines the correct category to which search query Q11 ("Roppongi pasta") belongs to be CAT11 ("search for a restaurant").
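The behavior-based labeling rule above can be sketched as a small function. The function name and data layout are hypothetical. Note also that the worked example in the text uses 90% both as the observed ratio and as the example threshold while saying the ratio "exceeds" it; this sketch therefore treats the threshold as inclusive, which is an assumption made so that the example reproduces.

```python
from typing import Dict, Optional


def correct_category(action_ratios: Dict[str, float],
                     threshold: float = 0.9) -> Optional[str]:
    """Return the category whose post-search action ratio reaches the threshold.

    `action_ratios` maps a category id to the fraction of users who took the
    corresponding action after searching. The inclusive comparison (>=) is an
    assumption; returns None when no action is dominant enough.
    """
    best = max(action_ratios, key=lambda cat: action_ratios[cat])
    return best if action_ratios[best] >= threshold else None


# Ratios from the "Roppongi pasta" example in the text.
ratios_q11 = {"CAT11": 0.90, "CAT12": 0.0, "CAT13": 0.10, "CAT14": 0.0}
label_q11 = correct_category(ratios_q11)

# A query with no dominant post-search action gets no label in this sketch.
label_mixed = correct_category({"CAT11": 0.6, "CAT13": 0.4})
```

Queries for which no action reaches the threshold are simply left unlabeled here; how the actual system treats such queries is not stated in the description.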

For example, suppose that, when search query Q11 ("Roppongi pasta") is input to the second model M2 before learning, the generation device 50 outputs a probability of 80% that vector BQV11, the distributed representation, is classified into CAT11 ("search for a restaurant"), 0% for CAT12 ("search for a product"), 20% for CAT13 ("reserve a restaurant"), and 0% for CAT14 ("purchase a product"). In this case, the generation device 50 trains the second model M2 so that the probability that vector BQV11 is classified into CAT11 ("search for a restaurant") exceeds a predetermined threshold (for example, 90%). In step with raising that probability above the predetermined threshold (for example, 90%), the generation device 50 also trains the second model M2 so that the probability that vector BQV11 is classified into the other category CAT13 ("reserve a restaurant") falls to 10%.

In this way, the generation device 50 trains the second model so that, when a given search query is input as input information, the probability output as output information that the distributed representation of that search query is classified into the correct category exceeds a predetermined threshold. Then, when a given search query is input as input information, the generation device 50 outputs, as the category of that search query, the category for which the probability that the distributed representation of the search query belongs to it exceeds the predetermined threshold. For example, when search query Q11 ("Roppongi pasta") is input as input information to the trained second model M2, the probability that vector BQV11, the distributed representation of search query Q11 ("Roppongi pasta"), belongs to category CAT11 ("search for a restaurant") exceeds 90%, so the generation device 50 outputs CAT11 ("search for a restaurant") as the category to which the search query belongs (step S36). In this way, by learning sets each consisting of a search query and its correct category, the generation device 50 generates a second model that predicts the category of a given search query from that search query (step S37).
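The thresholded category output can be sketched as follows. The description only says that the second model outputs a probability per category; turning raw scores into probabilities with a softmax is a common choice but is an assumption here, as are the score values and the function names.

```python
import math
from typing import List, Optional, Sequence

CATEGORIES = ["CAT11", "CAT12", "CAT13", "CAT14"]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (subtracting the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(scores: Sequence[float], threshold: float = 0.9) -> Optional[str]:
    """Return the category whose probability exceeds the threshold, else None."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CATEGORIES[best] if probs[best] > threshold else None


# Hypothetical per-category scores for "Roppongi pasta" after training.
confident = predict_category([5.0, 0.0, 1.0, 0.0])
# Scores that leave the model undecided between CAT11 and CAT13.
undecided = predict_category([1.0, 0.0, 0.8, 0.0])
```

Training would then typically minimize a cross-entropy loss against the correct category, which raises the correct-category probability and lowers the others together, consistent with the 80%/20% to 90%/10% example in the text; that choice of loss is likewise not stated in the source.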

In general, a user is thought to perform several searches with a single intention in mind, so it is reasonable to assume that search queries input consecutively within a predetermined time have similar search intentions. The generation device 50 according to the present invention therefore trains the first model M1 by regarding a plurality of search queries input consecutively within a predetermined time as having mutually similar features, in that they were searched under a given search intention. This allows the generation device 50 to have the first model M1 learn features of search queries that reflect search intention. The generation device 50 can then use the first model M1, which has learned such features, to efficiently generate a second model that predicts the category of a given search query from that search query. This allows the generation device 50 to classify search queries into categories that reflect the search intention of the user who input them.

Conventionally, classifying search queries into categories with high accuracy has required a sufficient amount of correct answer data. However, search queries are highly diverse and long-tailed, so labeling correct categories for a large number of search queries is laborious and difficult. Here, instead of labeling correct categories, the generation device 50 can train the second model that predicts the category of a search query by using the user's search intention (the context of the user who input the search query) as a kind of correct answer. This allows the generation device 50 to train the second model without manually labeling the correct categories of search queries. That is, the generation device 50 can obtain sufficient classification accuracy even when little correct answer data is available, and can obtain still higher classification accuracy when a large amount of correct answer data is available. The generation device 50 can therefore improve the classification accuracy of search queries.

[3-3. Configuration of information processing device]
Next, the configuration of the generation device 50 according to the embodiment will be described with reference to FIG. 19. FIG. 19 is a diagram illustrating a configuration example of the generation device 50 according to the embodiment. As illustrated in FIG. 19, the generation device 50 includes a communication unit 51, a storage unit 53, and a control unit 52. The generation device 50 may also include an input unit (for example, a keyboard or a mouse) that receives various operations from an administrator of the generation device 50, and a display unit (for example, a liquid crystal display) for displaying various information.

(Communication unit 51)
The communication unit 51 is realized by, for example, a NIC (Network Interface Card). The communication unit 51 is connected to the network by wire or wirelessly, and transmits and receives information to and from, for example, the user terminal 10 and the search server 20.

(Storage unit 53)
The storage unit 53 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. As shown in FIG. 19, the storage unit 53 includes a query information storage unit 531, a vector information storage unit 532, a classification definition storage unit 533, a category information storage unit 534, and a model information storage unit 535.

(Query information storage unit 531)
The query information storage unit 531 stores various information regarding search queries input by users. FIG. 20 shows an example of the query information storage unit according to the embodiment. In the example illustrated in FIG. 20, the query information storage unit 531 has items such as “user ID”, “date and time”, “search query”, and “search query ID”.

The “user ID” indicates identification information for identifying the user who input the search query. The “date and time” indicates the date and time at which the search server accepted the search query from the user. The “search query” indicates the search query input by the user. The “search query ID” indicates identification information for identifying the search query input by the user.

In the example shown in the first record of FIG. 20, the search query identified by the search query ID “Q11” (search query Q11) corresponds to the search query Q11 shown in FIG. 16. The user ID “U1” indicates that the user who input the search query Q11 is the user identified by the user ID “U1” (user U1). The date and time “2018/9/1 PM 17:00” indicates that the search server accepted the search query Q11 from the user U1 at 17:00 on September 1, 2018. The search query “Roppongi pasta” indicates the search query Q11 input by the user U1. Specifically, the search query “Roppongi pasta” is a character string in which “Roppongi”, indicating a place name, and “pasta”, indicating a type of food, are separated by a space serving as a delimiter.

(Vector information storage unit 532)
The vector information storage unit 532 stores various information regarding vectors that are distributed representations of search queries. FIG. 21 shows an example of the vector information storage unit according to the embodiment. In the example illustrated in FIG. 21, the vector information storage unit 532 has items such as “vector ID”, “search query ID”, and “vector information”.

The “vector ID” indicates identification information for identifying a vector that is the distributed representation of a search query. The “search query ID” indicates identification information for identifying the search query corresponding to the vector. The “vector information” indicates an N-dimensional vector that is the distributed representation of the search query. The vector that is the distributed representation of a search query is, for example, a 128-dimensional vector.

In the example shown in the first record of FIG. 21, the vector identified by the vector ID “QV11” (vector QV11) corresponds to the vector QV11 that is the distributed representation of the search query Q11 shown in FIG. 16. The search query ID “Q11” indicates that the search query corresponding to the vector QV11 is the search query Q11. The vector information “QVDT11” indicates the N-dimensional vector that is the distributed representation of the search query Q11.

(Classification definition storage unit 533)
The classification definition storage unit 533 stores various information regarding the definitions of the categories into which search queries are classified. FIG. 22 shows an example of the classification definition storage unit according to the embodiment. In the example shown in FIG. 22, the classification definition storage unit 533 has items such as “large classification ID”, “large classification”, “small classification ID”, and “small classification”.

The “large classification” indicates the large classification of the categories into which search queries are classified. The “large classification ID” indicates identification information for identifying the large classification. In the example illustrated in FIG. 22, the large classification “purchasing behavior type” corresponds to the large classification described in the example in the lower part of FIG. 1. The large classification “purchasing behavior type” is a large classification of categories that classify search queries based on the user's purchasing behavior. In the example shown in FIG. 22, the large classification “purchasing behavior type” further has four small classifications. The large classification ID “CAT1” indicates identification information for identifying the large classification “purchasing behavior type”.

The “small classification” indicates the small classification of the categories into which search queries are classified. The “small classification ID” indicates identification information for identifying the small classification. In the example shown in FIG. 22, the small classification “search for restaurants” belongs to the large classification “purchasing behavior type” and indicates that the search queries classified into it were input by users with the intention of searching for a restaurant. The small classification ID “CAT11” indicates identification information for identifying the small classification “search for restaurants”.

The small classification “search for products” belongs to the large classification “purchasing behavior type” and indicates that the search queries classified into it were input by users with the intention of searching for a product. The small classification ID “CAT12” indicates identification information for identifying the small classification “search for products”.

The small classification “reserve a restaurant” belongs to the large classification “purchasing behavior type” and indicates that the search queries classified into it were input by users with the intention of reserving a restaurant. The small classification ID “CAT13” indicates identification information for identifying the small classification “reserve a restaurant”.

The small classification “purchase a product” belongs to the large classification “purchasing behavior type” and indicates that the search queries classified into it were input by users with the intention of purchasing a product. The small classification ID “CAT14” indicates identification information for identifying the small classification “purchase a product”.
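The two-level classification definition described for FIG. 22 could be held in memory roughly as below. This is only an illustrative sketch: the class name `ClassificationRow`, the helper `small_ids_of`, and the English labels are assumptions, while the IDs CAT1 and CAT11 to CAT14 come from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClassificationRow:
    """One row of the classification definition table (hypothetical layout)."""
    large_id: str
    large_name: str
    small_id: str
    small_name: str


CLASSIFICATION: List[ClassificationRow] = [
    ClassificationRow("CAT1", "purchasing behavior type", "CAT11", "search for restaurants"),
    ClassificationRow("CAT1", "purchasing behavior type", "CAT12", "search for products"),
    ClassificationRow("CAT1", "purchasing behavior type", "CAT13", "reserve a restaurant"),
    ClassificationRow("CAT1", "purchasing behavior type", "CAT14", "purchase a product"),
]


def small_ids_of(large_id: str) -> List[str]:
    """List the small classification IDs that belong to a large classification."""
    return [row.small_id for row in CLASSIFICATION if row.large_id == large_id]


print(small_ids_of("CAT1"))
```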

(Category information storage unit 534)
The category information storage unit 534 stores various information regarding the categories to which search queries belong. Specifically, the category information storage unit 534 stores various information regarding the categories output by the second learning model when a search query is input to the trained second learning model. FIG. 23 shows an example of the category information storage unit according to the embodiment. In the example illustrated in FIG. 23, the category information storage unit 534 has items such as “search query ID”, “large classification ID”, “small classification ID”, and “probability (%)”.

The “search query ID” indicates identification information for identifying the search query input by the user. In the example shown in FIG. 23, the search query identified by the search query ID “Q11” (search query Q11) corresponds to the search query Q11 shown in FIG. 18.

The “large classification ID” indicates identification information for identifying the large classification. The “small classification ID” indicates identification information for identifying the small classification. The “probability (%)” indicates the probability, for each small classification, output by the second learning model when a search query is input to the trained second learning model. In the example shown in FIG. 23, the probability (%) “90” indicates that the probability that the search query Q11 is classified into the category CAT11 is 90%.

(Model information storage unit 535)
The model information storage unit 535 stores various information regarding the learning models generated by the generation device 50. FIG. 24 shows an example of the model information storage unit according to the embodiment. In the example shown in FIG. 24, the model information storage unit 535 has items such as “model ID” and “model data”.

In the example shown in the first record of FIG. 24, the learning model identified by the model ID “M1” corresponds to the first model M1 shown in FIG. 1. The model data “MDT1” indicates the model data of the first model M1 generated by the generation device 50 (model data MDT1).

In the example shown in the second record of FIG. 24, the learning model identified by the model ID “M2” corresponds to the second model M2 shown in FIG. 1. The model data “MDT2” indicates the model data of the second model M2 generated by the generation device 50 (model data MDT2).

The model data MDT2 may include an input layer to which a search query is input, an output layer, a first element belonging to any layer from the input layer to the output layer other than the output layer, and a second element whose value is calculated based on the first element and the weight of the first element, and may cause the generation device 50 to function so as to output from the output layer, in response to a search query input to the input layer, the probability that the search query belongs to each category.

Here, suppose that the model data MDT2 is realized by a regression model expressed as “y=a1*x1+a2*x2+...+ai*xi”. In this case, the first element included in the model data MDT2 corresponds to input data (xi) such as x1 and x2, and the weight of the first element corresponds to the coefficient ai for xi. The regression model can be regarded as a simple perceptron having an input layer and an output layer. When each model is regarded as a simple perceptron, the first element corresponds to one of the nodes in the input layer, and the second element can be regarded as a node in the output layer.
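The regression form above can be written out directly. The following is a minimal sketch: the function name `linear_output` and the numeric weights and inputs are illustrative assumptions, not values from the embodiment.

```python
from typing import Sequence


def linear_output(weights: Sequence[float], inputs: Sequence[float]) -> float:
    """Compute y = a1*x1 + a2*x2 + ... + ai*xi.

    Each input xi plays the role of a "first element" (input-layer node),
    each coefficient ai is its weight, and y is the "second element"
    (output-layer node).
    """
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(a * x for a, x in zip(weights, inputs))


# Made-up coefficients and inputs: 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0
y = linear_output([0.5, -1.0, 2.0], [1.0, 2.0, 3.0])
print(y)
```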

Alternatively, suppose that the model data MDT2 is realized by a neural network having one or more intermediate layers, such as a DNN (Deep Neural Network). In this case, the first element included in the model data MDT2 corresponds to one of the nodes in the input layer or an intermediate layer. The second element corresponds to a node in the next stage, that is, a node to which a value is transmitted from the node corresponding to the first element. The weight of the first element corresponds to the connection coefficient, which is the weight applied to the value transmitted from the node corresponding to the first element to the node corresponding to the second element.
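To make the roles of the first element, the second element, and the connection coefficient concrete, here is a small forward pass through one intermediate layer. It is a sketch under stated assumptions: the ReLU activation, the softmax output, the layer sizes, and all weight values are choices made for this example and are not specified in the description.

```python
import math
from typing import List

Matrix = List[List[float]]


def dense(weights: Matrix, values: List[float]) -> List[float]:
    """Each next-stage node sums the incoming values times their connection coefficients."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def relu(values: List[float]) -> List[float]:
    """Assumed activation for the intermediate layer."""
    return [max(0.0, v) for v in values]


def softmax(values: List[float]) -> List[float]:
    """Assumed output normalization so that the outputs form probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x: List[float], w_hidden: Matrix, w_out: Matrix) -> List[float]:
    """Input layer -> one intermediate layer -> output layer of category probabilities."""
    hidden = relu(dense(w_hidden, x))
    return softmax(dense(w_out, hidden))


# Toy 2-input, 2-hidden, 2-category network with made-up connection coefficients.
probs = forward([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]])
print(probs)
```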

The generation device 50 calculates the probability that a search query belongs to each category using a model having an arbitrary structure, such as the regression model or neural network described above. Specifically, the coefficients of the model data MDT2 are set so that, when a search query is input, the probability that the search query belongs to each category is output. The generation device 50 uses such model data MDT2 to calculate the probability that a search query belongs to each category.

The above example describes the case in which the model data MDT2 is a model that outputs the distributed representation of a search query when the search query is input (hereinafter referred to as model X2). However, the model data MDT2 according to the embodiment may be a model generated based on results obtained by repeatedly inputting data to and outputting data from the model X2. For example, the model data MDT2 may be a model trained on the distributed representations that the model X2 outputs when search queries are input (hereinafter referred to as model Y2). Alternatively, the model data MDT2 may be a model trained to take a search query as input and produce the output value of the model Y2 as output.

When the generation device 50 performs estimation processing using GAN (Generative Adversarial Networks), the model data MDT2 may be a model that forms part of the GAN.

(Control unit 52)
Returning to the description of FIG. 19, the control unit 52 is a controller and is realized by, for example, a CPU (Central Processing Unit) or an MPU (Micro Processing Unit) executing various programs stored in a storage device inside the generation device 50 (corresponding to an example of the generation program), using the RAM as a work area. The control unit 52 is also realized by, for example, an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

Through information processing in accordance with the first model M1 (model data MDT1) stored in the model information storage unit 535, the control unit 52 causes a computer to function so that, for a search query input to the input layer, an operation based on the first element and the weight of the first element is performed with each element belonging to each layer other than the output layer as the first element, and the distributed representation is output from the output layer.

Similarly, through information processing in accordance with the second model M2 (model data MDT2) stored in the model information storage unit 535, the control unit 52 causes a computer to function so that, for a search query input to the input layer, an operation based on the first element and the weight of the first element is performed with each element belonging to each layer other than the output layer as the first element, and the probability that the search query belongs to each category is output from the output layer.

As illustrated in FIG. 19, the control unit 52 includes an acquisition unit 521, an extraction unit 522, and a generation unit 523, and realizes or executes the information processing operations described below. The internal configuration of the control unit 52 is not limited to the configuration shown in FIG. 19 and may be any other configuration that performs the information processing described later.

(Acquisition unit 521)
The acquisition unit 521 acquires various information. Specifically, the acquisition unit 521 acquires search queries input by users from the search server 20 and stores the acquired search queries in the query information storage unit 531. The acquisition unit 521 also acquires vector information regarding vectors that are distributed representations of search queries and stores the acquired vector information in the vector information storage unit 532. In addition, the acquisition unit 521 acquires classification definition information that defines the classification of search queries and the categories to which they belong, and stores the acquired classification definition information in the classification definition storage unit 533. The acquisition unit 521 also acquires category information regarding the categories to which search queries belong and stores the acquired category information in the category information storage unit 534.

(Extraction unit 522)
The extraction unit 522 extracts various information. Specifically, the extraction unit 522 extracts, from the search queries acquired by the acquisition unit 521, a plurality of search queries input by the same user within a predetermined time. For example, the extraction unit 522 extracts a plurality of search queries for which the interval between the times at which the same user input each search query is within the predetermined time. Subsequently, from the plurality of search queries input by the same user within the predetermined time, the extraction unit 522 extracts pairs of search queries that the same user input consecutively within the predetermined time. For example, the extraction unit 522 extracts a plurality of search queries for which the interval between the times at which the same user input the two search queries of each pair is within the predetermined time. For example, the extraction unit 522 extracts, from the search queries acquired by the acquisition unit 521, four search queries input consecutively by the same user U1 within the predetermined time: search query Q11 (“Roppongi pasta”), search query Q12 (“Roppongi Italian”), search query Q13 (“Akasaka pasta”), and search query Q14 (“Azabu pasta”). Arranged in input order, the four extracted search queries were input in the order Q11, Q12, Q13, Q14. Having extracted the four search queries, the extraction unit 522 then treats each two search queries adjacent in time series as a pair and extracts three pairs: (search query Q11, search query Q12), (search query Q12, search query Q13), and (search query Q13, search query Q14). The extraction unit 522 may instead extract a plurality of search queries all of which were input by the same user within the predetermined time. The extraction unit 522 may then select any two search queries from the extracted search queries, regardless of whether they are adjacent in time series, and extract the two selected search queries as a pair.
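The adjacent-pair extraction described above can be sketched as follows. This is an illustration under assumptions: the record layout `(user_id, timestamp, query_id)`, the function name `adjacent_pairs`, the 10-minute gap, the timestamps other than 17:00 for Q11, and the second user U2 with query Q21 are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Record = Tuple[str, datetime, str]  # (user_id, timestamp, query_id), hypothetical layout


def adjacent_pairs(records: List[Record], max_gap: timedelta) -> List[Tuple[str, str]]:
    """Pair each query with the next query from the same user if both fall within max_gap."""
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    pairs = []
    for prev, cur in zip(ordered, ordered[1:]):
        same_user = prev[0] == cur[0]
        if same_user and cur[1] - prev[1] <= max_gap:
            pairs.append((prev[2], cur[2]))
    return pairs


log = [
    ("U1", datetime(2018, 9, 1, 17, 0), "Q11"),  # "Roppongi pasta"
    ("U1", datetime(2018, 9, 1, 17, 2), "Q12"),  # "Roppongi Italian"
    ("U2", datetime(2018, 9, 1, 17, 3), "Q21"),  # unrelated user (made up)
    ("U1", datetime(2018, 9, 1, 17, 5), "Q13"),  # "Akasaka pasta"
    ("U1", datetime(2018, 9, 1, 17, 9), "Q14"),  # "Azabu pasta"
]

pairs = adjacent_pairs(log, max_gap=timedelta(minutes=10))
print(pairs)
```

Sorting by user and then by time keeps each user's queries contiguous, so a single pass over neighboring records is enough. The variant in which any two queries from the same window are paired would instead enumerate combinations within each window.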

The extraction unit 522 also extracts, from the search queries acquired by the acquisition unit 521, a predetermined search query and another search query unrelated to it. For example, the extraction unit 522 extracts a predetermined search query from the search queries acquired by the acquisition unit 521. The extraction unit 522 then randomly extracts another search query from the search queries acquired by the acquisition unit 521, independently of the predetermined search query.
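Random extraction of an unrelated query might look like the sketch below. The function name `sample_unrelated` is hypothetical, and excluding the anchor query itself from the candidates is an assumption of this example; the description only states that the other query is drawn at random, independently of the predetermined query.

```python
import random
from typing import List


def sample_unrelated(anchor: str, all_queries: List[str], rng: random.Random) -> str:
    """Draw one query at random from the log, skipping the anchor query itself."""
    candidates = [q for q in all_queries if q != anchor]
    if not candidates:
        raise ValueError("no other query available to sample")
    return rng.choice(candidates)


# Query IDs from the description plus one made-up ID (Q21) from another user.
query_log = ["Q11", "Q12", "Q13", "Q14", "Q21"]
other = sample_unrelated("Q11", query_log, random.Random(0))
print(other)
```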

(Generation unit 523)
The generation unit 523 generates various information. Specifically, the generation unit 523 generates a learning model that predicts the feature information of a predetermined search query from that search query, by learning on the assumption that, among the search queries acquired by the acquisition unit 521, a plurality of search queries input by the same user within a predetermined time have similar features. More specifically, the generation unit 523 generates this learning model by training it so that the distributed representations of a plurality of search queries input by the same user within a predetermined time become similar. For example, the generation unit 523 generates the learning model by learning so that the distributed representations of a pair of search queries input consecutively within a predetermined time become similar. For example, the generation unit 523 calculates the similarity between the distributed representations (vectors) of the pair of search queries before learning, and also calculates the similarity between their distributed representations (vectors) after learning. The generation unit 523 then trains the learning model so that the similarity after learning is larger than the similarity before learning. In this way, the generation unit 523 generates a learning model that outputs a distributed representation (vector) from a search query, by training the learning model so that the two vectors that are the distributed representations of a pair of search queries become similar in the distributed representation space. More specifically, the generation unit 523 generates the learning model that outputs a distributed representation (vector) from a search query using the DSSM technique, with an LSTM, a type of RNN, used to generate the distributed representations. For example, using as correct answer data for the learning model the assumption that a pair of search queries input by the same user within a predetermined time have similar features, the generation unit 523 learns so that the distributed representation (vector) of a predetermined search query and the distributed representation (vector) of the other search query paired with it lie close to each other in the distributed representation space. When the generation unit 523 has generated the first learning model, it stores the generated first learning model (model data MDT1) in the model information storage unit 535 in association with identification information that identifies the first learning model.
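The before-and-after similarity comparison described above can be illustrated with cosine similarity, taken here as one possible similarity measure. The sketch below does not train anything: the two-dimensional vectors are made-up stand-ins for the distributed representations of a query pair before and after learning, and the function name `cosine_similarity` is an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means they point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy representations of one query pair (values are illustrative only).
before = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # orthogonal before learning
after = cosine_similarity([1.0, 0.2], [0.8, 0.4])   # closer after learning

# Training is considered to have moved in the intended direction if this holds.
print(after > before)
```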

[3-4. Example of first learning model]
Here, an example of the first learning model generated by the generation device 50 will be described with reference to FIG. 25. FIG. 25 is a diagram illustrating an example of the first learning model according to the embodiment. In the example shown in FIG. 25, the first learning model M1 generated by the generation device 50 is composed of a three-layer LSTM-RNN. In the example illustrated in FIG. 25, the extraction unit 522 extracts a pair of search queries consisting of search query Q11 (“Roppongi pasta”) and search query Q12 (“Roppongi Italian”), which were input consecutively by the same user U1 within a predetermined time. The generation unit 523 inputs the search query Q11 extracted by the extraction unit 522 into the input layer of the first learning model M1 (step S41).

Subsequently, the generation unit 523 outputs, from the output layer of the first learning model M1, the 256-dimensional vector BQV11, which is the distributed representation of the search query Q11. The generation unit 523 also inputs the search query Q12 extracted by the extraction unit 522 into the input layer of the first learning model M1, and outputs, from the output layer of the first learning model M1, the 256-dimensional vector BQV12, which is the distributed representation of the search query Q12 (step S42).

Subsequently, the generation unit 523 generates the first learning model M1, which outputs a distributed representation (vector) from a search query, by learning so that the distributed representations (vectors) of two consecutively input search queries become similar (step S43). For example, let Θ be the angle between the vector BQV11, which is the distributed representation of the search query Q11, and the vector BQV12, which is the distributed representation of the search query Q12, before feedback is applied to the first learning model M1 (before learning). Likewise, let Φ be the angle between the vector QV11, which is the distributed representation of the search query Q11, and the vector QV12, which is the distributed representation of the search query Q12, after feedback is applied to the first learning model M1 (after learning). The generation unit 523 then trains the first learning model M1 so that Φ becomes smaller than Θ. For example, the generation unit 523 calculates the cosine similarity between the vectors BQV11 and BQV12 and the cosine similarity between the vectors QV11 and QV12, and trains the learning model M1 so that the latter is larger than the former (that is, so that the value approaches 1). In this way, by training the first learning model M1 so that the two vectors, which are the pair of distributed representations corresponding to the pair of search queries, become similar in the distributed representation space, the generation unit 523 generates the first learning model M1 that outputs a distributed representation (vector) from a search query. Note that the generation unit 523 is not limited to cosine similarity and may calculate the similarity between distributed representations (vectors) based on any index that is applicable as a distance measure between vectors. The generation unit 523 may likewise train the learning model M1 based on any such index. For example, the generation unit 523 may calculate the value of a predetermined distance function between distributed representations (vectors), such as the Euclidean distance, a distance in a non-Euclidean space such as a hyperbolic space, the Manhattan distance, or the Mahalanobis distance, and may train the learning model M1 so that this value (that is, the distance in the distributed representation space) becomes small.
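The relationship between the angles Θ and Φ and the cosine similarity can be illustrated with a minimal Python sketch. The three-dimensional vectors below are hypothetical stand-ins for the 256-dimensional vectors BQV11, BQV12, QV11, and QV12, and the helper names are assumptions made for illustration; the sketch only checks the condition described in the text and does not reproduce the training procedure itself.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def angle_between(u, v):
    """Angle in radians; the clamp guards against rounding outside [-1, 1]."""
    c = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(c)

# Hypothetical toy vectors (assumed values, not taken from the source).
bqv11, bqv12 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]  # before learning
qv11, qv12 = [1.0, 0.2, 0.0], [0.8, 0.6, 0.0]    # after learning

theta = angle_between(bqv11, bqv12)  # angle before learning
phi = angle_between(qv11, qv12)      # angle after learning

# Learning is judged to have moved in the intended direction when the angle
# shrinks, which is the same as the cosine similarity moving toward 1.
training_improved = (
    phi < theta
    and cosine_similarity(qv11, qv12) > cosine_similarity(bqv11, bqv12)
)
```

Because the cosine is monotonically decreasing on the interval from 0 to π, requiring Φ to be smaller than Θ and requiring the cosine similarity to increase are equivalent conditions.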

Further, the generation unit 523 generates the first learning model by learning on the assumption that, among the plurality of search queries input by the same user within a predetermined time, a plurality of search queries each containing character strings separated by a predetermined delimiter have similar features. For example, the generation unit 523 generates the first learning model by learning on the assumption that the search query “Roppongi pasta”, in which “Roppongi” (a place name) and “pasta” (a type of food) are separated by a space serving as the delimiter, and the search query “Roppongi Italian”, in which “Roppongi” (a place name) and “Italian” (a type of cuisine) are separated by a space serving as the delimiter, have similar features.

In addition, the generation unit 523 generates the first learning model by learning on the assumption that a plurality of search queries extracted at random from the search queries acquired by the acquisition unit 521 have different features. Specifically, the generation unit 523 generates the first learning model by learning so that the distributed representations of a pair of search queries extracted at random from the search queries acquired by the acquisition unit 521 differ from each other. For example, the generation unit 523 trains the first learning model M1 so that the distributed representation of a predetermined search query extracted by the extraction unit 522 and the distributed representation of a search query extracted at random, independently of the predetermined search query, are mapped far apart in the distributed representation space.
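One way to picture the combination of similar pairs and randomly extracted dissimilar queries is a margin-based (triplet) loss. The source does not name a loss function, so the hinge form, the margin of 0.5, the toy two-dimensional vectors, and the function names below are all assumptions made for illustration.

```python
import math
import random

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sample_negative(all_queries, anchor, positive, rng):
    """Pick a query at random, independently of the anchor, skipping the pair itself."""
    candidates = [q for q in all_queries if q not in (anchor, positive)]
    return rng.choice(candidates)

def triplet_loss(anchor_vec, positive_vec, negative_vec, margin=0.5):
    """Zero once the paired query is more similar than the random one by `margin`."""
    pos = cosine_similarity(anchor_vec, positive_vec)
    neg = cosine_similarity(anchor_vec, negative_vec)
    return max(0.0, margin - pos + neg)

# Hypothetical toy distributed representations (assumed values).
embeddings = {
    "Roppongi pasta": [0.9, 0.1],
    "Roppongi Italian": [0.8, 0.3],
    "laptop battery": [0.1, 0.9],
    "train timetable": [-0.5, 0.7],
}

rng = random.Random(0)  # seeded so that the sketch is repeatable
negative = sample_negative(list(embeddings), "Roppongi pasta", "Roppongi Italian", rng)
loss = triplet_loss(
    embeddings["Roppongi pasta"],
    embeddings["Roppongi Italian"],
    embeddings[negative],
)
```

With these toy vectors the paired queries are already much closer to each other than to either random query, so the loss is zero; a model whose vectors violated that ordering would receive a positive loss and be pushed to separate them.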

The generation unit 523 also generates a second learning model. Specifically, the generation unit 523 refers to the model information storage unit 535 and acquires the first learning model (the model data MDT1 of the first learning model M1) that it generated. Using the acquired first learning model, the generation unit 523 then generates a second learning model that predicts, from a predetermined search query, the category to which that search query belongs. That is, upon acquiring the first model M1, the generation unit 523 uses it to generate the second learning model M2. By retraining the first model M1, the generation unit 523 generates the second model M2, whose connection coefficients (the weights of the learning model) differ from those of the first model M1. Specifically, the generation unit 523 generates the second model M2, which predicts from a predetermined search query the category to which that search query belongs, by learning so that, when a search query is input to the learning model, the classification result of the distributed representation output by the learning model corresponds to the category to which the search query belongs.

Specifically, the generation unit 523 generates the second learning model, which predicts from a predetermined search query the category to which that search query belongs, by learning so that, when a search query is input to the learning model, the classification result of the distributed representation output by the learning model corresponds to the category to which the search query belongs. The generation unit 523 generates a second learning model that, when a search query is input as input information, outputs as output information the probability that the search query belongs to each category. For example, using the first model M1, the generation unit 523 generates a second model M2 that, when a predetermined search query is input as input information, outputs for each category the probability that the distributed representation of the search query is classified into that category. When a predetermined search query is input as input information, the generation unit 523 trains the second model so that the probability that the distributed representation of that search query is classified into the correct category exceeds a predetermined threshold. The generation unit 523 thereby generates a second model M2 that, when a predetermined search query is input as input information, outputs as the category of that search query the category for which the probability that its distributed representation belongs to the category exceeds the predetermined threshold. Further, upon generating the second learning model, the generation unit 523 stores the generated second learning model (model data MDT2) in the model information storage unit 535 in association with identification information that identifies the second learning model.

For example, the generation unit 523 refers to the model information storage unit 535 illustrated in FIG. 24 and acquires the first model M1 (the model data MDT1 of the first model M1). The generation unit 523 then refers to the classification definition storage unit 533 illustrated in FIG. 22 and selects a large classification of the categories into which search queries are classified. Having selected the large classification, the generation unit 523 learns, as training data for the second model M2, pairs of a search query and the small classification to which that search query belongs.

For example, assume that the correct category to which the search query Q11 (“Roppongi pasta”) belongs is CAT11 (“Find a restaurant”). When the search query Q11 (“Roppongi pasta”) is input to the second model M2 as input information, the generation unit 523 outputs, from the output layer of the second model M2, the vector BQV11, which is the distributed representation of the search query Q11 (“Roppongi pasta”). Here, the vector BQV11 is the distributed representation of the search query Q11 as just output from the output layer of the second model M2, that is, the distributed representation before feedback is applied to the second model M2 (before learning). In this case, the generation unit 523 trains the second model M2 so that the probability that the vector BQV11, which is the output distributed representation of the search query Q11 (“Roppongi pasta”), is classified into the correct category CAT11 (“Find a restaurant”) exceeds a predetermined threshold.

For example, assume that when the search query Q11 (“Roppongi pasta”) is input to the second model M2 before learning, the generation unit 523 outputs a probability of 80% that the vector BQV11, which is the distributed representation, is classified into CAT11 (“Find a restaurant”), 0% for CAT12 (“Find a product”), 20% for CAT13 (“Reserve a restaurant”), and 0% for CAT14 (“Purchase a product”). In this case, the generation unit 523 trains the second model M2 so that the probability that the vector BQV11 is classified into CAT11 (“Find a restaurant”) exceeds a predetermined threshold (for example, 90%). Along with this training, the generation unit 523 also trains the second model M2 so as to lower the probability that the vector BQV11 is classified into the other category CAT13 (“Reserve a restaurant”) to 10%. Subsequently, when the search query Q11 (“Roppongi pasta”) is input as input information to the trained second model M2, the probability that the vector BQV11, which is the distributed representation of the search query Q11 (“Roppongi pasta”), belongs to the category CAT11 (“Find a restaurant”) exceeds 90%, so the generation unit 523 outputs CAT11 (“Find a restaurant”) as output information indicating the category to which the search query belongs.
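The threshold rule in this example can be sketched in a few lines of Python. The probabilities before learning are the ones given in the text; the raw scores used for the trained model, the softmax conversion, and the function names are assumptions made for illustration, since the source does not state how the per-category probabilities are computed.

```python
import math

CATEGORIES = ["CAT11", "CAT12", "CAT13", "CAT14"]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_category(probabilities, threshold=0.9):
    """Return the category whose probability exceeds the threshold, else None."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] > threshold:
        return CATEGORIES[best]
    return None

# Probabilities before learning, as given in the example (80%, 0%, 20%, 0%).
before = [0.80, 0.00, 0.20, 0.00]

# Hypothetical raw scores of the trained model for the same query.
after = softmax([4.0, -2.0, 1.0, -2.0])

category_before = predict_category(before)  # no category clears 90%
category_after = predict_category(after)    # CAT11 clears 90%
```

Before learning, no category exceeds the 90% threshold, so no category is output; after learning, the probability for CAT11 is about 95% and CAT11 is output.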

Note that the generation unit 523 may select any number of large classifications. The generation unit 523 may then generate a second model M2 that, when a search query is input to it as input information, outputs as output information, for each small classification, the probability that the search query belongs to each of the small classifications under the selected large classifications. The generation unit 523 may also select all of the large classifications, and may then generate a second model M2 that, when a search query is input to it, outputs the probability of belonging to each small classification for all of the small classifications.

[3-5. Example of second learning model]
Here, an example of the second learning model generated by the generation device 50 is described with reference to FIG. 26. FIG. 26 is a diagram illustrating an example of the second learning model according to the embodiment. In the example illustrated in FIG. 26, the second learning model M2 generated by the generation device 50 is generated using the first learning model M1. That is, by retraining the first learning model M1, the generation device 50 generates the second learning model M2, whose connection coefficients (the weights of the learning model) differ from those of the first learning model M1.

More specifically, the second learning model M2 generated by the generation device 50 is composed of a three-layer LSTM RNN, like the first learning model M1. In the example illustrated in FIG. 26, the extraction unit 522 inputs the search query Q11 “Roppongi pasta”, which was input by the user U1, into the input layer of the second learning model M2 (step S51).

Subsequently, the generation unit 523 outputs, from the output layer of the second learning model M2, the 256-dimensional vector BQV11, which is the distributed representation of the search query Q11 (step S52).

Subsequently, the generation unit 523 outputs the probability that the vector BQV11, which is the distributed representation of the search query Q11, is classified into each category (step S53).

Subsequently, the generation unit 523 generates the second model, which predicts the category of a search query from the search query, by training the second learning model M2 so as to increase the probability that the vector BQV11, which is the distributed representation of the search query Q11, is classified into the correct category (step S54).
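Step S54 can be pictured with a deliberately simplified Python sketch: a single gradient-descent step on the cross-entropy loss of a linear softmax layer placed on top of a fixed vector. This is an assumption made for illustration only. The source retrains the whole LSTM-based model and does not name the loss function or the optimizer, and the two-dimensional vector, the learning rate, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(weights, vector):
    """Probability of each category for one distributed representation."""
    scores = [sum(w * x for w, x in zip(row, vector)) for row in weights]
    return softmax(scores)

def cross_entropy_step(weights, vector, correct_index, learning_rate=0.5):
    """One gradient-descent step on the cross-entropy loss of the linear layer."""
    probs = class_probabilities(weights, vector)
    updated = []
    for k, row in enumerate(weights):
        # The gradient of the loss with respect to score k is the predicted
        # probability minus the one-hot target for the correct category.
        grad = probs[k] - (1.0 if k == correct_index else 0.0)
        updated.append([w - learning_rate * grad * x for w, x in zip(row, vector)])
    return updated

vector = [1.0, 0.5]                       # hypothetical stand-in for BQV11
weights = [[0.0, 0.0] for _ in range(4)]  # four categories, untrained layer

p_before = class_probabilities(weights, vector)[0]
weights = cross_entropy_step(weights, vector, correct_index=0)
p_after = class_probabilities(weights, vector)[0]
```

Starting from an untrained layer, each of the four categories has probability 0.25; after one step the probability of the correct category (index 0) has risen, which is the direction of change that step S54 describes.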

[3-6. Flow of first learning model generation processing]
Next, the procedure of the first learning model generation processing according to the embodiment is described with reference to FIG. 27. FIG. 27 is a flowchart illustrating the procedure of the first learning model generation processing according to the embodiment.

In the example illustrated in FIG. 27, the generation device 50 acquires the search queries input by users (step S1001).

Subsequently, the generation device 50 extracts a plurality of search queries input by the same user within a predetermined time (step S1002).

Subsequently, the generation device 50 generates a first learning model that predicts the feature information of a predetermined search query from that search query, by learning on the assumption that the extracted plurality of search queries have similar features (step S1003).
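Steps S1001 and S1002 can be sketched as follows. The log format (user ID, timestamp in seconds, query string), the 300-second window, and the function name are assumptions made for illustration; the source specifies only "the same user" and "a predetermined time".

```python
from collections import defaultdict

def extract_query_pairs(log, window_seconds=300):
    """Pair consecutive queries of the same user that fall within the window.

    `log` is an iterable of (user_id, timestamp_seconds, query) tuples.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, query in log:
        by_user[user_id].append((timestamp, query))

    pairs = []
    for entries in by_user.values():
        entries.sort()  # chronological order within each user
        for (t1, q1), (t2, q2) in zip(entries, entries[1:]):
            if t2 - t1 <= window_seconds and q1 != q2:
                pairs.append((q1, q2))
    return pairs

# Hypothetical search log (assumed values).
log = [
    ("U1", 0, "Roppongi pasta"),
    ("U1", 40, "Roppongi Italian"),
    ("U2", 50, "laptop battery"),
    ("U1", 4000, "train timetable"),
]

pairs = extract_query_pairs(log)
```

Only the two queries that user U1 entered 40 seconds apart form a pair; the query entered much later and the query from a different user are left out. Pairs extracted in this way would serve as the similar pairs used in step S1003.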

[3-7. Flow of second learning model generation processing]
Next, the procedure of the second learning model generation processing according to the embodiment is described with reference to FIG. 28. FIG. 28 is a flowchart illustrating the procedure of the second learning model generation processing according to the embodiment.

In the example illustrated in FIG. 28, the generation device 50 acquires the first learning model (the model data MDT1 of the first learning model M1) (step S2001).

Subsequently, the generation device 50 uses the first learning model to generate a second learning model that predicts the category of a predetermined search query from that search query (step S2002).

[4. Effects]
As described above, the information processing device 100 according to the first embodiment includes the extraction unit 135 and the determination unit 136. The extraction unit 135 extracts feature information indicating the features of a predetermined query by using a learning model that has learned the features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features. The determination unit 136 determines, based on the feature information extracted by the extraction unit 135, recommendation information to be recommended to the user who input the predetermined query.

This enables the information processing device 100 to recommend, to a user who is interested in a predetermined search query, information based on the feature information indicating the features of that search query. That is, the information processing device 100 can recommend information that matches the user's interests. In general, users who are interested in a specific field but know little about it, such as users visiting a search service, face the problem that they cannot come up with an appropriate search query when they try to gain knowledge through search. Based on a search query input by a user with little knowledge, the information processing device 100 according to the present invention can recommend recommendation information that is based on an appropriate search query matching the search intention. Therefore, the information processing device 100 can recommend appropriate information to the user.

Further, the extraction unit 135 extracts, as the feature information, a similar query, which is a search query having features similar to those of the predetermined query. The determination unit 136 determines, based on the similar query extracted by the extraction unit 135, recommendation information to be recommended to the user who input the predetermined query.

This enables the information processing device 100 to recommend, to a user who is interested in a predetermined search query, information based on a similar query having features similar to those of that search query. That is, the information processing device 100 can recommend information that matches the user's interests. Therefore, the information processing device 100 can recommend appropriate information to the user.

Further, the extraction unit 135 extracts a similar query that shares an attribute with the predetermined query. The determination unit 136 determines, based on the similar query extracted by the extraction unit 135, information on the similar query as the recommendation information.

This enables the information processing device 100 to recommend, to a user who is interested in a predetermined search query, information based on a similar query that shares an attribute with the predetermined query. Therefore, the information processing device 100 can recommend more appropriate information to the user.

Further, the extraction unit 135 extracts, as a similar query that shares an attribute with the predetermined query, a similar query that indicates a real estate area and is a search query having features similar to those of a predetermined query indicating a real estate area. The determination unit 136 determines, based on the similar query extracted by the extraction unit 135, information on the real estate area as the recommendation information.

This enables the information processing device 100 to recommend, to a user who is interested in a predetermined real estate area, a real estate area having features similar to those of the predetermined real estate area. Therefore, the information processing device 100 can recommend an appropriate real estate area to the user.

Further, the determination unit 136 determines, based on the similar query extracted by the extraction unit 135, candidate queries for re-searching as the recommendation information.

This enables the information processing device 100 to recommend appropriate candidate narrowing-down conditions to a user who is interested in a predetermined search query but does not know specific search conditions.

The information processing device 100 further includes a generation unit 132 and a calculation unit 134. The generation unit 132 generates a distributed representation of a predetermined search query by using a learning model that has learned the features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features. The calculation unit 134 calculates the similarity between the distributed representation of the predetermined search query generated by the generation unit 132 and the distributed representation, also generated by the generation unit 132, of another search query different from the predetermined search query. The extraction unit 135 extracts, as a similar query, another search query for which the similarity calculated by the calculation unit 134 exceeds a predetermined threshold.
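The interplay of the calculation unit 134 and the extraction unit 135 can be sketched as a thresholded similarity search. Cosine similarity, the threshold of 0.8, the toy two-dimensional vectors, the query “Azabu pasta”, and the function names are assumptions made for illustration; the source states only that the similarity must exceed a predetermined threshold.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def extract_similar_queries(target, embeddings, threshold=0.8):
    """Return (query, similarity) for other queries above the threshold, best first."""
    target_vec = embeddings[target]
    scored = [
        (other, cosine_similarity(target_vec, vec))
        for other, vec in embeddings.items()
        if other != target
    ]
    similar = [(query, score) for query, score in scored if score > threshold]
    return sorted(similar, key=lambda pair: pair[1], reverse=True)

# Hypothetical distributed representations (assumed values).
embeddings = {
    "Roppongi pasta": [0.9, 0.1],
    "Roppongi Italian": [0.8, 0.3],
    "Azabu pasta": [0.7, 0.2],
    "laptop battery": [0.1, 0.9],
}

similar = extract_similar_queries("Roppongi pasta", embeddings)
similar_names = [query for query, _ in similar]
```

With these toy vectors, the two restaurant-related queries are extracted as similar queries and the unrelated query falls below the threshold.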

The information processing device 100 can recommend, to a user who is interested in a predetermined search query, information based on the feature information indicating the features of that search query. That is, the information processing device 100 can recommend information that matches the user's interests. Therefore, the information processing device 100 can recommend appropriate information to the user.

Further, the extraction unit 135 extracts the feature information by using a learning model that, when a predetermined search query is input as input information, outputs the distributed representation of that search query as output information. The extraction unit 135 also extracts the feature information by using a learning model that has learned the features of a plurality of search queries by learning so that the distributed representations of a pair of search queries input consecutively within a predetermined time become similar. The extraction unit 135 also extracts the feature information by using a learning model that has learned the features of a plurality of search queries by learning on the assumption that, among the plurality of search queries input by the same user within a predetermined time, a plurality of search queries each containing character strings separated by a predetermined delimiter have similar features. The extraction unit 135 also extracts the feature information by using a learning model that has learned the features of a plurality of search queries by learning on the assumption that a plurality of search queries extracted at random have different features. The extraction unit 135 also extracts the feature information by using a learning model that has learned the features of a plurality of search queries by learning so that the distributed representations of a pair of search queries extracted at random differ from each other.

This enables the information processing device 100 to extract appropriate feature information in consideration of the user's search intention. Therefore, the information processing device 100 can recommend appropriate information to the user.

The information processing device 100A according to the second embodiment includes an extraction unit 135A and a determination unit 136A. The extraction unit 135A extracts, as the feature information, the category to which a predetermined query belongs. The determination unit 136A determines, based on the category extracted by the extraction unit 135A, recommendation information to be recommended to the user who input the predetermined query.

As a result, the information processing apparatus 100A can recommend, to a user who is interested in a predetermined search query, information based on the category to which that search query belongs. That is, the information processing apparatus 100A can recommend information that matches the user's interests. The information processing apparatus 100 can therefore recommend appropriate information to the user.
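The flow of the second embodiment, in which a category is extracted for a query and recommendation information is then chosen from that category, might look like the following Python sketch. The names `determine_recommendations`, `keyword_classifier`, and `catalog`, as well as the keyword rules, are hypothetical. In particular, the keyword lookup is only a stand-in for whatever learned model actually assigns the category.

```python
def keyword_classifier(query):
    """Hypothetical stand-in for a learned model that assigns a category."""
    rules = {"pasta": "restaurants", "hotel": "travel"}
    for keyword, category in rules.items():
        if keyword in query:
            return category
    return "unknown"


def determine_recommendations(query, classify, catalog, limit=3):
    """Choose recommendation items from the category the query belongs to."""
    category = classify(query)  # role played by the extraction unit 135A
    # Role played by the determination unit 136A: pick items of that category.
    return catalog.get(category, [])[:limit]
```

Passing the classifier in as a parameter keeps the category extraction and the recommendation decision separate, mirroring the split between the extraction unit 135A and the determination unit 136A.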

[5. Hardware configuration]
The information processing apparatus 100 according to the first embodiment, the information processing apparatus 100A according to the second embodiment, and the generation device 50 according to the embodiment described above are implemented by, for example, a computer 1000 configured as illustrated in FIG. 29. FIG. 29 is a hardware configuration diagram illustrating an example of a computer that implements the functions of the information processing apparatus 100, the information processing apparatus 100A, and the generation device 50. The computer 1000 includes a CPU 1100, a RAM 1200, a ROM 1300, an HDD 1400, a communication interface (I/F) 1500, an input/output interface (I/F) 1600, and a media interface (I/F) 1700.

The CPU 1100 operates based on a program stored in the ROM 1300 or the HDD 1400 and controls each unit. The ROM 1300 stores a boot program executed by the CPU 1100 when the computer 1000 starts up, programs that depend on the hardware of the computer 1000, and the like.

The HDD 1400 stores programs executed by the CPU 1100, data used by those programs, and the like. The communication interface 1500 receives data from other devices via a predetermined communication network and sends it to the CPU 1100, and transmits data generated by the CPU 1100 to other devices via the predetermined communication network.

The CPU 1100 controls output devices such as a display and a printer, and input devices such as a keyboard and a mouse, via the input/output interface 1600. The CPU 1100 acquires data from the input devices via the input/output interface 1600. The CPU 1100 also outputs generated data to the output devices via the input/output interface 1600.

The media interface 1700 reads a program or data stored in a recording medium 1800 and provides it to the CPU 1100 via the RAM 1200. The CPU 1100 loads the program from the recording medium 1800 onto the RAM 1200 via the media interface 1700 and executes the loaded program. The recording medium 1800 is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or a PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the information processing apparatus 100, the information processing apparatus 100A, or the generation device 50, the CPU 1100 of the computer 1000 implements the functions of the control unit 130, the control unit 130A, or the control unit 52 by executing a program loaded onto the RAM 1200. The CPU 1100 of the computer 1000 reads these programs from the recording medium 1800 and executes them; as another example, it may acquire these programs from another device via a predetermined communication network.

Several embodiments of the present application have been described above in detail with reference to the drawings, but these are examples. The present invention can be carried out in other forms to which various modifications and improvements have been applied based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention section.

[6. Other]
Of the processes described in the above embodiments and modifications, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by a known method. In addition, the processing procedures, specific names, and information including various data and parameters shown in the above description and in the drawings can be changed arbitrarily unless otherwise specified. For example, the various kinds of information shown in each drawing are not limited to the illustrated information.

The components of each illustrated device are functional and conceptual, and do not necessarily have to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of each device can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

The embodiments and modifications described above can be combined as appropriate to the extent that the processing contents do not contradict each other.

The term "unit" (section, module, unit) used above can be read as "means", "circuit", or the like. For example, the extraction unit can be read as extraction means or an extraction circuit.

1 Information processing system
10 User terminal
20 Search server
50 Generation device
100 Information processing apparatus
110 Communication unit
120 Storage unit
121 Model information storage unit
122 Vector information storage unit
123 Search information storage unit
124 Content storage unit
130 Control unit
131 Acquisition unit
132 Generation unit
133 Providing unit
134 Calculation unit
135 Extraction unit
136 Determination unit

Claims

An information processing apparatus comprising:
an extraction unit that extracts feature information indicating a feature of a predetermined query by using a learning model that has learned features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features; and
a determination unit that determines, based on the feature information extracted by the extraction unit, recommendation information to be recommended to the user who input the predetermined query.

The information processing apparatus according to claim 1, wherein
the extraction unit extracts, as the feature information, a similar query that is a search query having features similar to those of the predetermined query, and
the determination unit determines, based on the similar query extracted by the extraction unit, the recommendation information to be recommended to the user who input the predetermined query.

The information processing apparatus according to claim 2, wherein
the extraction unit extracts a similar query having the same attribute as the predetermined query, and
the determination unit determines, based on the similar query extracted by the extraction unit, information regarding the similar query as the recommendation information.

The information processing apparatus according to claim 3, wherein
the extraction unit extracts, as the similar query having the same attribute as the predetermined query, a similar query that is a search query having features similar to those of the predetermined query indicating a real estate area and that itself indicates a real estate area, and
the determination unit determines, based on the similar query extracted by the extraction unit, information regarding the real estate area as the recommendation information.

The information processing apparatus according to any one of claims 1 to 4, wherein
the determination unit determines, based on a similar query extracted by the extraction unit, a candidate query for re-search as the recommendation information.

The information processing apparatus according to any one of claims 2 to 5, further comprising:
a generation unit that generates a distributed representation of a predetermined search query by using a learning model that has learned features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features; and
a calculation unit that calculates a similarity between the distributed representation of the predetermined search query generated by the generation unit and the distributed representation, generated by the generation unit, of another search query different from the predetermined search query, wherein
the extraction unit extracts, as the similar query, another search query whose similarity calculated by the calculation unit exceeds a predetermined threshold.

The extraction unit extracts the feature information by using a learning model that, when a predetermined search query is input as input information, outputs a distributed representation of the predetermined search query as output information. The information processing apparatus described in any one of 1.

The information processing apparatus according to any one of claims 1 to 7, wherein
the extraction unit extracts the feature information by using a learning model that has learned the features of the plurality of search queries by being trained so that the distributed representations of a pair of search queries input in succession within the predetermined time become similar.

The information processing apparatus according to any one of claims 1 to 8, wherein
the extraction unit extracts the feature information by using a learning model that has learned the features of the plurality of search queries by treating, as the plurality of search queries input by the same user within the predetermined time, a plurality of search queries that contain character strings separated by a predetermined delimiter as having similar features.

The information processing apparatus according to claim 1, wherein
the extraction unit extracts the feature information by using a learning model that has learned the features of the plurality of search queries by being trained on the assumption that a plurality of randomly extracted search queries have different features.

The information processing apparatus according to claim 1, wherein
the extraction unit extracts the feature information by using a learning model that has learned the features of the plurality of search queries by being trained so that the distributed representations of a pair of randomly extracted search queries differ.

The extraction unit extracts, as the feature information, a category to which the predetermined query belongs, and
the determination unit determines, based on the category extracted by the extraction unit, the recommendation information to be recommended to the user who input the predetermined query. apparatus.

An information processing method executed by a computer, the method comprising:
an extraction step of extracting feature information indicating a feature of a predetermined query by using a learning model that has learned features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features; and
a determination step of determining, based on the feature information extracted in the extraction step, recommendation information to be recommended to the user who input the predetermined query.

An information processing program that causes a computer to execute:
extraction means for extracting feature information indicating a feature of a predetermined query by using a learning model that has learned features of a plurality of search queries on the assumption that a plurality of search queries input by the same user within a predetermined time have similar features; and
determination means for determining, based on the feature information extracted by the extraction means, recommendation information to be recommended to the user who input the predetermined query.
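
The similarity-based extraction recited in the claims above, in which a calculation unit computes a similarity between distributed representations and other search queries exceeding a predetermined threshold are extracted as similar queries, could be sketched in Python as follows. The claims do not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the dictionary of precomputed vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_similar_queries(target, vectors, threshold):
    """Return (query, similarity) for other queries above the threshold.

    `vectors` maps each query string to its distributed representation
    (hypothetical precomputed output of the learning model).
    """
    base = vectors[target]
    scored = [(q, cosine(base, v)) for q, v in vectors.items() if q != target]
    hits = [(q, s) for q, s in scored if s > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

The threshold value itself would be a tuning choice. The claims state only that it is predetermined.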