JP6501936B1

JP6501936B1 - INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Info

Publication number: JP6501936B1
Application number: JP2018032915A
Authority: JP
Inventors: 剛塚原
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2018-02-27
Filing date: 2018-02-27
Publication date: 2019-04-17
Anticipated expiration: 2038-02-27
Also published as: JP2019148962A

Abstract

[Problem] To provide an information processing apparatus, an information processing method, and a program capable of extracting a relatively broad set of users who may be interested in a given keyword. [Solution] An information processing apparatus comprising: a first extraction unit that extracts a second keyword that is likely to co-occur with a first keyword in the behavior histories on the network of a plurality of users; a second extraction unit that extracts users whose behavior history on the network includes the first keyword or the second keyword; and a third extraction unit that extracts, from among the plurality of users, one or more users whose behavior history on the network is similar to that of the users extracted by the second extraction unit, as users corresponding to the first keyword. [Selected drawing] FIG. 1

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

A user classification device is known that includes: search keyword information extraction means for extracting search keyword information contained in the search history information of each user of an information search engine; class generation means for generating, based on the extracted search keyword information, classes that indicate trends in the keywords entered by each user; classification means for classifying each user into the generated classes; and classification result presentation means for presenting at least the classification result (see Patent Document 1).

This device further includes core keyword extraction means for extracting a core keyword that represents each class generated by the class generation means, and the classification result presentation means presents the extracted core keywords together with the user classification result. The core keyword extraction means extracts the core keywords by comparing word co-occurrence information within the summaries included in the Web search results for each user's input keywords with word co-occurrence information within the classified class.

Patent Document 1: JP 2009-43125 A

With the conventional technique, it was sometimes impossible to secure a sufficient number of users having similar characteristics. As a result, the users were sometimes narrowed down too far for uses such as marketing.

The present invention has been made in view of these circumstances, and one of its objects is to provide an information processing apparatus, an information processing method, and a program capable of extracting a relatively broad set of users who may be interested in a given keyword.

One aspect of the present invention is an information processing apparatus including: a first extraction unit that extracts a second keyword that is likely to co-occur with a first keyword in the behavior histories on the network of a plurality of users; a second extraction unit that extracts users whose behavior history on the network includes the first keyword or the second keyword; and a third extraction unit that extracts, from among the plurality of users, one or more users whose behavior history on the network is similar to that of the users extracted by the second extraction unit, as users corresponding to the first keyword.

According to one aspect of the present invention, a relatively broad set of users who may be interested in a given keyword can be extracted.

A diagram showing an example of the configuration and usage environment of the service server 100 that uses the information processing apparatus.
A diagram showing an example of the content of the user information 194.
A diagram showing an example of the content of the search log 196.
A diagram for explaining the processing of the learning model generation unit 154.
A flowchart showing an example of the flow of processing executed by the information processing apparatus.
A diagram showing an example of the configuration and usage environment of the shopping server 200 that uses the information processing apparatus.
A diagram showing an example of the content of the product data 292.
A diagram showing an example of the content of the purchase log 296.

Embodiments of the information processing apparatus, information processing method, and program of the present invention are described below with reference to the drawings.

The information processing apparatus is realized by one or more processors. It is used to extract, over a relatively wide range, users who may be interested in a certain keyword (the first keyword). For example, the information processing apparatus of the present invention is used when a business operator wants to deliver electronic coupons or advertisements only to users who may be interested in a keyword representing a product or service that the operator provides.

The information processing apparatus extracts a second keyword that is likely to co-occur with the first keyword, and extracts users whose behavior history on the network includes the first keyword or the second keyword (primary extracted users). It then performs machine learning using the behavior histories of the primary extracted users as correct-answer data, and generates a learning model that, given the behavior history on the network of a target user as input, outputs information indicating similarity to the primary extracted users. Because primary extracted users are highly likely to be interested in the first keyword, a user whom the learning model judges to be highly similar to the primary extracted users can be inferred to be possibly interested in the first keyword. The learning model therefore outputs information from which it can be determined whether the target user is interested in the first keyword. The method is not limited to generating a learning model; any method may be used as long as it can extract users whose behavior history on the network is similar to that of the primary extracted users.

If only the primary extracted users were extracted, the number of extracted users would often be limited. The information processing apparatus of the present invention instead generates a learning model to widen the set of target users, and can thereby extract a relatively broad set of users who may be interested in the given first keyword.

The behavior history on the network is, for example, a history of queries entered for searches, or a history of products or services (hereinafter, "products etc.") purchased through electronic commerce that sells them (purchase history). The first embodiment describes the former, and the second embodiment describes the latter.

The information processing apparatus may be a standalone device that realizes its functions by itself, or a virtual device contained in a device that has other functions (such as a web server or an application server). In the following description, the information processing apparatus is assumed to be contained in a service server that provides content to users' terminal devices, or in a shopping server that provides a shopping site.

First Embodiment
[Configuration]
FIG. 1 is a diagram showing an example of the configuration and usage environment of the service server 100 that uses the information processing apparatus. In the illustrated example, one or more terminal devices 10 and a request source server 300 are connected to the service server 100 via a network NW. The network NW includes, for example, the Internet, a WAN (Wide Area Network), a LAN (Local Area Network), provider terminals, a wireless communication network, wireless base stations, and dedicated lines. Each component shown in FIG. 1 is assumed to have a communication interface for connecting to the network NW or other networks. The communication interface includes a network card such as a NIC (Network Interface Card), a wireless communication module, and the like.

[Terminal device]
The terminal device 10 is, for example, a mobile phone such as a smartphone, a tablet terminal, or any of various personal computers. On the terminal device 10, a UA (User Agent) such as a browser or an application program runs and sends requests to the service server 100 according to what the user enters. The UA also displays various images based on information acquired from the service server 100.

[Service server]
The service server 100 is a web server that provides web pages to the terminal device 10 in response to requests from a browser, or an application server that provides images and audio to the terminal device 10 in response to requests from an application program.

The services provided by the service server 100 are offered in a form customized more closely to the individual user when, for example, the user logs in by entering a user ID and password. When the service server 100 is a web server, web pages can be received without logging in, but in that case a generic web page that is not customized for each user is provided.

The service server 100 includes, for example, a content providing unit 110, a user management unit 120, a search execution unit 130, a first extraction unit 150, a second extraction unit 152, a learning model generation unit 154, a third extraction unit 156, and a benefit granting unit 170. Some or all of these components may be realized by hardware (a circuit unit; including circuitry) such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit), or by software and hardware working together. The program may be stored in advance in a storage device such as an HDD (Hard Disk Drive) or flash memory, or it may be stored on a removable storage medium such as a DVD or CD-ROM and installed when the storage medium is loaded into a drive device. Of these components, the "information processing apparatus" includes at least the first extraction unit 150, the second extraction unit 152, the learning model generation unit 154, and the third extraction unit 156.

The service server 100 may also include a storage unit 190. The storage unit 190 may be an external storage device, such as NAS (Network Attached Storage), that the service server 100 can access via the network NW. The storage unit 190 stores information such as content data 192, user information 194, and a search log 196.

The content providing unit 110 provides content based on the content data 192 to the terminal device 10. The content data 192 is, for example, news articles, videos, still images, audio, and the like, or reference information for referring to them (for example, a URL; Uniform Resource Locator). The content provided by the content providing unit 110 includes a function for entering a query and requesting a search. The content providing unit 110 passes the query entered on the terminal device 10 to the search execution unit 130 to run the search, and provides the terminal device 10 with a screen showing the search results. Queries may include suggested queries in addition to queries entered directly.

The user management unit 120 uses the user information 194 to manage the users who log in to the service provided by the service server 100 (content provision in this example). FIG. 2 is a diagram showing an example of the content of the user information 194. The user information 194 is, for example, information in which age, gender, birthday, occupation, and other information are associated with a user ID, which is the identification information of a user.

The search execution unit 130 executes a search on the network NW in response to an instruction from the content providing unit 110. Various techniques for searching a network have already been published, so a detailed description of the specific method is omitted. Each time it performs a search, the search execution unit 130 registers the entered query in the search log 196. FIG. 3 is a diagram showing an example of the content of the search log 196. The search log 196 is, for example, information in which queries and search times are associated with each user ID. Hereinafter, the set of queries associated with the user ID of a given user may be referred to as the "user's query history". Information indicating session boundaries may be added to the search log 196. A session is, for example, the validity period of a state management mechanism such as a cookie. For example, the period from when a web page in a website is accessed until a predetermined time elapses (timeout) is treated as one session. A session may also be the period from when a web page in a website is accessed until the user switches to another web page in that website or to a web page in another website, or the period from when a web page in a website is accessed until the web browser displaying that web page is closed. A browsing log may be stored separately from the search log 196, or the search log 196 may be held in the storage unit 190 as part of the browsing log.
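The timeout-based session definition above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_into_sessions`, the 30-minute timeout, and the sample timestamps are hypothetical and do not come from the publication.

```python
from datetime import datetime, timedelta


def split_into_sessions(timestamps, timeout=timedelta(minutes=30)):
    """Group search times into sessions.

    A session opens at its first access and ends once `timeout` has elapsed
    since that first access; the next access then opens a new session.
    The 30-minute default is an assumed value.
    """
    sessions = []
    session_start = None
    for ts in sorted(timestamps):
        if session_start is None or ts - session_start >= timeout:
            sessions.append([])  # open a new session
            session_start = ts
        sessions[-1].append(ts)
    return sessions


# Hypothetical search times for one user ID.
times = [
    datetime(2018, 2, 1, 9, 0),
    datetime(2018, 2, 1, 9, 10),
    datetime(2018, 2, 1, 9, 45),
]
sessions = split_into_sessions(times)
```

With these sample times, the first two searches fall into one session and the third opens a second one.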

(Information processing apparatus)
The components that make up the information processing apparatus are described below. The processing of each component may cover the behavior history without any time limit, or may be restricted to a period such as one year, several months, one month, or a single session.

The first extraction unit 150 acquires the first keyword from the request source server 300. The business operator that runs the request source server 300 specifies a keyword and requests the extraction of users for a purpose such as "we want to grant benefits such as electronic coupons to users who may be interested in this keyword." The first keyword is the keyword specified in this request.

The first extraction unit 150 extracts second keywords that are likely to co-occur with the first keyword (for example, with high probability) in the search log 196, which is one example of the behavior histories on the network of a plurality of users. An example of this extraction process is described below, but the process is not limited to this example; any method may be used as long as it can extract second keywords with a similar tendency. For example, the first extraction unit 150 selects second keyword candidates in turn from, in principle, all the queries contained in the search log 196 (queries that appear only a few times may be excluded). The population of second keyword candidates is not limited to the queries contained in the search log 196 and may be keywords contained in some dictionary or website.

The first extraction unit 150 obtains, relative to all users, the proportion of users whose query history includes both the first keyword and the second keyword (co-occurring user ratio PP). The co-occurring user ratio PP is expressed, for example, by the conditional probability in equation (1). The first extraction unit 150 also obtains, relative to all users, the proportion of users whose query history includes the first keyword but not the second keyword (non-co-occurring user ratio PN). The non-co-occurring user ratio PN is expressed, for example, by the conditional probability in equation (2).
PP = P(first query | second query) ... (1)
PN = P(first query | not second query) ... (2)

Then, based on the difference between the co-occurring user ratio PP and the non-co-occurring user ratio PN, the first extraction unit 150 extracts second keywords that have a high probability of co-occurring with the first keyword. For example, it computes the difference obtained by subtracting PN from PP, or the quotient obtained by dividing PP by PN, and extracts as second keywords those candidates for which the result is at or above a threshold, or ranks high among all second keyword candidates. There is no particular restriction on the number of queries extracted as second keywords; the first extraction unit 150 may extract any number of queries.
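As a minimal sketch of this scoring step, the code below computes PP and PN per candidate following equations (1) and (2) and keeps candidates whose difference PP - PN reaches a threshold. The function names, the `min_diff` value of 0.5, and the sample query histories are assumptions made for illustration; the publication leaves the threshold and the choice between difference and quotient open.

```python
def cooccurrence_ratios(histories, first_keyword):
    """Return {candidate: (PP, PN)} for every query other than the first keyword.

    histories maps a user ID to the set of queries that user has entered.
    PP = P(first | candidate), PN = P(first | not candidate).
    """
    candidates = set().union(*histories.values()) - {first_keyword}
    ratios = {}
    for cand in candidates:
        with_cand = [h for h in histories.values() if cand in h]
        without_cand = [h for h in histories.values() if cand not in h]
        pp = sum(first_keyword in h for h in with_cand) / len(with_cand)
        # If every user entered the candidate, PN is undefined; use 0.0 here.
        pn = (sum(first_keyword in h for h in without_cand) / len(without_cand)
              if without_cand else 0.0)
        ratios[cand] = (pp, pn)
    return ratios


def extract_second_keywords(histories, first_keyword, min_diff=0.5):
    """Keep candidates whose PP - PN is at least min_diff (assumed threshold)."""
    ratios = cooccurrence_ratios(histories, first_keyword)
    return sorted(c for c, (pp, pn) in ratios.items() if pp - pn >= min_diff)


# Hypothetical query histories.
histories = {
    "u1": {"baseball", "hawks"},
    "u2": {"baseball", "hawks", "weather"},
    "u3": {"hawks", "recipes"},
    "u4": {"weather", "recipes"},
    "u5": {"weather"},
}
second_keywords = extract_second_keywords(histories, "baseball")
```

In this sample, "hawks" has PP = 2/3 and PN = 0, so it is the only candidate that clears the assumed threshold. A quotient-based variant would need to guard against PN = 0.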

The second extraction unit 152 extracts users whose query history includes at least one of the first keyword and the second keywords. Suppose that, among the queries illustrated in FIG. 3, "baseball" is specified as the first keyword and "Hawks" is extracted as one of the second keywords. In this case, the second extraction unit 152 extracts the set of users who entered "baseball" or "Hawks" as a query. In the example of FIG. 3, the users whose user IDs are "AAA", "BBB", and "CCC" are extracted by the second extraction unit 152. Hereinafter, the users extracted by the second extraction unit 152 are referred to as "primary extracted users".
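The primary extraction amounts to a set-intersection test per user, which can be sketched as below. The function name and the query sets are hypothetical; in particular, the queries shown are made up and are not the actual content of FIG. 3. Only the user IDs and the resulting selection follow the example in the text.

```python
def extract_primary_users(histories, keywords):
    """Return the IDs of users whose query history contains any of the keywords."""
    return sorted(uid for uid, queries in histories.items() if queries & keywords)


# Hypothetical query histories keyed by user ID.
histories = {
    "AAA": {"baseball", "weather"},
    "BBB": {"Hawks"},
    "CCC": {"baseball", "Hawks"},
    "DDD": {"ramen", "tantanmen"},
}
primary_users = extract_primary_users(histories, {"baseball", "Hawks"})
```

Here "DDD" is not selected because that history contains neither keyword.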

The learning model generation unit 154 performs machine learning using the feature quantities of the primary extracted users as correct-answer data, and generates a learning model for extracting users whose feature quantities are close to those of the primary extracted users (hereinafter, secondary extracted users).

FIG. 4 is a diagram for explaining the processing of the learning model generation unit 154. The feature quantity handled by the learning model generation unit 154 is, for example, a vector in which each element corresponds to one of a plurality of queries drawn from the same population as the second keyword candidates (for example, a 10,000-dimensional vector with 10,000 elements). An element takes the value 1 if the corresponding query is one that the user has entered in the past (not limited to the first keyword or the second keywords), and 0 otherwise. Because the queries a user enters are not exhaustive, this vector is sometimes called a sparse vector. Hereinafter, it is referred to as a user vector. FIG. 4 illustrates the user vector of the user whose user ID in FIG. 3 is "DDD".
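A user vector of this kind can be built as sketched below. The five-word vocabulary and the query set are hypothetical stand-ins (the text mentions a 10,000-dimensional vector as an example), and the function names are not from the publication. Since most elements are zero, the sketch also shows an index-list form that stores only the positions of the ones.

```python
def build_user_vector(vocabulary, user_queries):
    """Return a 0/1 vector with one element per vocabulary query."""
    return [1 if query in user_queries else 0 for query in vocabulary]


def to_sparse_indices(vector):
    """Return only the positions whose value is 1 (a compact sparse form)."""
    return [i for i, value in enumerate(vector) if value == 1]


# Hypothetical vocabulary and query history.
vocabulary = ["baseball", "Hawks", "ramen", "tantanmen", "weather"]
user_vector = build_user_vector(vocabulary, {"ramen", "tantanmen"})
sparse = to_sparse_indices(user_vector)
```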

The learning model generation unit 154 uses machine learning to generate some learning model that yields a high score value when the user vector of a primary extracted user is input. The learning model is built from components such as a neural network, for example a DNN (Deep Neural Network), and activation functions. Specifically, the learning model is a data structure (software) that includes the node information and weight values of the neural network, the parameters of the activation functions, and the like.

Once such a learning model has been generated, inputting the user vector of a user who is not a primary extracted user but has a query history similar to that of the primary extracted users yields a high score value, just as inputting the user vector of a primary extracted user does. A secondary extracted user who obtains such a high score value has a user vector close to those of the primary extracted users in the vector space, and can therefore be inferred to be a user who may be interested in the first keyword, which was the key for extracting the primary extracted users.
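The behavior described here can be illustrated with a deliberately small stand-in model. The sketch below trains a single-layer logistic regression by stochastic gradient descent instead of the DNN named in the text, purely to keep the example short. The vocabulary, the training vectors, the use of sampled non-primary users as negative examples, and the hyperparameters are all assumptions; the publication does not specify how negative examples are formed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(vectors, labels, epochs=500, lr=0.5):
    """Fit weights and a bias so that vectors labeled 1 receive high scores."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def score(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Vocabulary order (hypothetical):
# baseball, Hawks, ramen, tantanmen, weather, recipes
primary = [
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
]
others = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]
model = train(primary + others, [1] * len(primary) + [0] * len(others))

# Held-out users: neither one entered "baseball" or "Hawks".
ramen_user = [0, 0, 1, 1, 0, 0]
weather_user = [0, 0, 0, 0, 1, 1]
ramen_score = score(model, ramen_user)
weather_score = score(model, weather_user)
```

Because the queries shared with the primary extracted users ("ramen", "tantanmen") only ever receive positive weight updates, the held-out ramen user scores higher than the held-out weather user, even though neither entered the first or second keyword.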

Suppose there is a tendency in the world such that "users who are interested in baseball like ramen." In this case, even if "ramen" is not extracted as a second keyword, the query histories of the primary extracted users are more likely to contain queries such as "ramen", "chicken bone soup", "tantanmen", and "chuka soba". The learning model learns not only the first keyword and the second keywords but also tendencies of this kind. As a result, inputting into the learning model the user vector of a user whose query history contains ramen-related queries yields a high score value. The user with the user ID "DDD" illustrated in FIG. 4 has neither "baseball" nor "Hawks" in the query history, but may be extracted as a secondary extracted user who may be interested in "baseball". In this way, the information processing apparatus can extract a broader set of users who may be interested in the given first keyword than it could by simply extracting users whose query history contains the first keyword or the second keywords.

The third extraction unit 156 extracts, as secondary extracted users, one or more users for whom the learning model outputs a high score value.
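One way to read "a high score value" is a fixed cutoff, sketched below. The threshold of 0.8, the function name, and the score values are hypothetical; the publication does not state how "high" is decided, and a top-N selection would fit the description equally well.

```python
def extract_secondary_users(scores, primary_users, threshold=0.8):
    """Return non-primary users whose model score is at least the threshold."""
    return sorted(
        uid for uid, s in scores.items()
        if s >= threshold and uid not in primary_users
    )


# Hypothetical model outputs per user ID.
scores = {"AAA": 0.97, "BBB": 0.95, "CCC": 0.96, "DDD": 0.88, "EEE": 0.12}
primary_users = {"AAA", "BBB", "CCC"}
secondary_users = extract_secondary_users(scores, primary_users)

# Users treated as possibly interested in the first keyword.
target_users = sorted(primary_users | set(secondary_users))
```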

The benefit granting unit 170 grants benefits such as electronic coupons and lottery entry rights to the primary extracted users and the secondary extracted users. The content of a granted benefit is announced to the logged-in user within the content provided by the content providing unit 110. A user who has been granted a benefit can use it in various situations.

(Processing flow)
FIG. 5 is a flowchart showing an example of the flow of processing executed by the information processing apparatus. First, the first extraction unit 150 acquires the first keyword from the request source server 300 (S100). The first extraction unit 150 refers to the search log 196 (S102) and calculates the co-occurring user ratio PP and the non-co-occurring user ratio PN for each second keyword candidate (for example, each query contained in the search log 196) (S104). The first extraction unit 150 then extracts, as second keywords, the queries for which the co-occurring user ratio PP exceeds the non-co-occurring user ratio PN by at least a reference amount (S106).

Next, the second extraction unit 152 extracts, as primary extracted users, the users whose query history contains the first keyword or a second keyword (S108).

Next, the learning model generation unit 154 generates a user vector for each user whose user ID is contained in the user information 194 (S110). The learning model generation unit 154 then performs machine learning so that the score values of the primary extracted users become high, and generates the learning model (S112).

Next, the third extraction unit 156 (or another functional unit) inputs the user vector of each user whose user ID is contained in the user information 194 into the learning model (S114). The third extraction unit 156 then extracts, as secondary extracted users, the users whose user vectors yield high score values (S116). The information processing apparatus outputs the primary extracted users and the secondary extracted users as users who may be interested in the first keyword (S118).

According to the first embodiment described above, the information processing apparatus extracts a second keyword that is likely to co-occur with the first keyword in the query histories (action histories on the network) of a plurality of users, and extracts users whose query history includes the first keyword or the second keyword as primary extracted users. It then performs machine learning using the user vectors (feature amounts) of the primary extracted users as correct data, thereby generating a learning model that, when the user vector (feature amount) of a target user is input, outputs information indicating similarity to the primary extracted users. Finally, it extracts, from among the plurality of users, one or more users whose query history is similar to that of the primary extracted users as users corresponding to the first keyword. As a result, users who may be interested in the given first keyword can be extracted relatively broadly.

<Second Embodiment>
The second embodiment will be described below. In the first embodiment, the action history on the network is the query history. In the second embodiment, the action history on the network is a history of, for example, the names of products or services purchased through electronic commerce via a shopping site or an auction site. In the following description, the information processing apparatus is assumed to be included in a shopping server that provides a shopping site.

[Configuration]
FIG. 6 is a diagram showing an example of the configuration and usage environment of the shopping server 200 using the information processing apparatus. The shopping server 200 is a web server that provides web pages to the terminal device 10 in response to requests from a browser, or an application server that provides images and sounds to the terminal device 10 in response to requests from an application program.

The services of the shopping server 200 are provided, for example, after the user logs in by entering a user ID and a password.

In the configuration shown in FIG. 6, the functions of the content providing unit 210 and the user management unit 220 are the same as those of the content providing unit 110 and the user management unit 120 in the first embodiment, so their description is omitted. The user information 294 may be the same information as the user information 194 in the first embodiment.

The sales management unit 230 refers to the product etc. data 292 and composes a product introduction section to be embedded in the shopping site provided by the content providing unit 210. FIG. 7 is a diagram showing an example of the contents of the product etc. data 292. The product etc. data 292 is, for example, information in which items such as a hierarchically expressed category, a title, and a detailed description are associated with a product etc. ID, which is identification information of a product or the like.

In addition, each time a product or the like is purchased on the shopping site, the sales management unit 230 registers information on the purchased product or the like in the purchase log 296. FIG. 8 is a diagram showing an example of the contents of the purchase log 296. The purchase log 296 is, for example, information in which, for each user ID, information on a purchased product or the like is associated with a search time. A browsing log may be stored separately from the purchase log 296, or the purchase log 296 may be held in the storage unit 290 as part of the browsing log.

Instead of queries, the information processing apparatus according to the second embodiment uses keywords extracted from the information on products and the like registered in the purchase log 296 as candidates for the second keyword. Such keywords are, for example, the lowest-layer category information and proper nouns extracted from the title by morphological analysis. The keywords extracted in this way for each user are referred to as the purchase history.
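As a rough illustration, the following Python sketch builds a per-user purchase history from purchase records. The record layout (a `category` list ordered from the top layer to the lowest layer, and a `title` string) is an assumption and is not the actual layout of the purchase log 296. Morphological analysis is outside the scope of the sketch, so `capitalized_words` is only a toy stand-in that would be replaced by a real analyzer in practice.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def capitalized_words(title: str) -> List[str]:
    """Toy stand-in for a morphological analyzer: keep capitalized tokens."""
    return [token for token in title.split() if token[:1].isupper()]


def purchase_history(
    purchases: Iterable[Tuple[str, dict]],
    extract_proper_nouns: Callable[[str], List[str]] = capitalized_words,
) -> Dict[str, Set[str]]:
    """Map each user ID to the keywords extracted from purchased items."""
    history: Dict[str, Set[str]] = {}
    for user_id, item in purchases:
        keywords = history.setdefault(user_id, set())
        # Lowest layer of the hierarchical category.
        keywords.add(item["category"][-1])
        # Proper nouns taken from the title (stand-in extraction by default).
        keywords.update(extract_proper_nouns(item["title"]))
    return history
```

The resulting mapping has the same shape as a per-user query history, which is why the later steps can run unchanged on it.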

The other aspects are the same as in the first embodiment, so detailed description is omitted. Although the types of their input data differ, the first extraction unit 250, the second extraction unit 252, the learning model generation unit 254, and the third extraction unit 256 perform the same processing as the first extraction unit 150, the second extraction unit 152, the learning model generation unit 154, and the third extraction unit 156 of the first embodiment, respectively. Like the privilege imparting unit 170, the privilege imparting unit 270 grants privileges such as electronic coupons and lottery rights to the primary extracted users and the secondary extracted users.

According to the second embodiment described above, the information processing apparatus extracts a second keyword that is likely to co-occur with the first keyword in the purchase histories (action histories on the network) of a plurality of users, and extracts users whose purchase history includes the first keyword or the second keyword as primary extracted users. It then performs machine learning using the user vectors (feature amounts) of the primary extracted users as correct data, thereby generating a learning model that, when the user vector (feature amount) of a target user is input, outputs information indicating similarity to the primary extracted users. Finally, it extracts, from among the plurality of users, one or more users whose purchase history is similar to that of the primary extracted users as users corresponding to the first keyword. As a result, users who may be interested in the given first keyword can be extracted relatively broadly.

<Other Embodiments>
In the second embodiment, candidates for the second keyword are extracted from the user's purchase history of products and the like. Instead, they may be extracted from the history of transitions from a shopping site or auction site to product introduction screens, that is, from the browsing history of introduction screens for products and the like. Like the purchase history, the browsing history consists of, for example, the lowest-layer category information of the products whose introduction screens were browsed and proper nouns extracted from their titles by morphological analysis.

Alternatively, instead of the query history, the purchase history, or the browsing history of product introduction screens, candidates for the second keyword may be extracted from the history of news articles browsed by the user. In this case, the information processing apparatus may use representative nouns included in the news articles (for example, nouns with high tf-idf values) as candidates for the second keyword.
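The selection of representative nouns can be sketched in Python as below. The sketch assumes that nouns have already been extracted from each article, and it uses one common tf-idf variant (term frequency within the article multiplied by the natural logarithm of the number of articles divided by the number of articles containing the noun). The description does not specify the weighting, so this variant, the function name `top_tfidf_nouns`, and the `top_k` cutoff are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def top_tfidf_nouns(
    articles: Dict[str, List[str]], top_k: int = 2
) -> Dict[str, List[str]]:
    """Return the top_k nouns per article ranked by tf-idf."""
    n_docs = len(articles)
    # Document frequency: number of articles in which each noun appears.
    df = Counter(noun for nouns in articles.values() for noun in set(nouns))
    result: Dict[str, List[str]] = {}
    for article_id, nouns in articles.items():
        tf = Counter(nouns)
        scores = {
            noun: (count / len(nouns)) * math.log(n_docs / df[noun])
            for noun, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda noun: (-scores[noun], noun))
        result[article_id] = ranked[:top_k]
    return result
```

Nouns that appear in every article receive a score of zero under this variant, so generic words are unlikely to become second keyword candidates.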

Although modes for carrying out the present invention have been described above using embodiments, the present invention is not limited to these embodiments in any way, and various modifications and substitutions can be made without departing from the gist of the present invention.

10 terminal device
100 service server
110, 210 content providing unit
120, 220 user management unit
130 search execution unit
150, 250 first extraction unit
152, 252 second extraction unit
154, 254 learning model generation unit
156, 256 third extraction unit
170, 270 privilege imparting unit
190, 290 storage unit
192 content data
194, 294 user information
196 search log
200 shopping server
230 sales management unit
292 product etc. data
296 purchase log
300 request source server

Claims

An information processing apparatus comprising:
a first extraction unit that extracts a second keyword that is likely to co-occur with a first keyword in the action histories on a network of a plurality of users;
a second extraction unit that extracts, from among the plurality of users, users having an action history on the network that includes the first keyword or the second keyword; and
a third extraction unit that extracts, from among the plurality of users, one or more users whose action history on the network is similar to that of the users extracted by the second extraction unit, as users corresponding to the first keyword.

The information processing apparatus according to claim 1, further comprising:
a generation unit that performs machine learning using, as correct data, feature amounts related to the action histories on the network of the users extracted by the second extraction unit, and thereby generates a learning model that, when a feature amount related to the action history on the network of a target user is input, outputs information indicating similarity to the users extracted by the second extraction unit,
wherein the third extraction unit extracts, from among the plurality of users, one or more users for whom the learning model outputs information indicating high similarity to the users extracted by the second extraction unit.

The information processing apparatus according to claim 1, wherein the first extraction unit extracts the second keyword based on the degree of difference between the probability that the first keyword is included in the action history of users whose action history includes a keyword that is a candidate for the second keyword, and the probability that the first keyword is included in the action history of users whose action history does not include the keyword that is a candidate for the second keyword.

The information processing apparatus according to claim 2, wherein the feature amount is a vector whose elements are values indicating whether each of a plurality of keywords related to the action history on the network is included.

The information processing apparatus according to any one of claims 1 to 4, wherein the action history on the network is a query history.

The information processing apparatus according to any one of claims 1 to 4, wherein the action history on the network is a purchase history in electronic commerce.

An information processing method in which a computer:
extracts a second keyword that is likely to co-occur with a first keyword in the action histories on a network of a plurality of users;
extracts users having an action history on the network that includes the first keyword or the second keyword; and
extracts, from among the plurality of users, one or more users whose action history on the network is similar to that of the extracted users, as users corresponding to the first keyword.

A program that causes a computer to:
extract a second keyword that is likely to co-occur with a first keyword in the action histories on a network of a plurality of users;
extract users having an action history on the network that includes the first keyword or the second keyword; and
extract, from among the plurality of users, one or more users whose action history on the network is similar to that of the extracted users, as users corresponding to the first keyword.