JP2021103339A

JP2021103339A - Device, method, and program for analyzing customer attribute information

Info

Publication number: JP2021103339A
Application number: JP2018059214A
Authority: JP
Inventors: Muneaki Masuda (増田 宗昭)
Original assignee: Culture Convenience Club Co Ltd
Current assignee: Culture Convenience Club Co Ltd
Priority date: 2018-03-27
Filing date: 2018-03-27
Publication date: 2021-07-15
Anticipated expiration: 2038-03-27
Also published as: JP7198591B2

Abstract

To improve the accuracy and efficiency of data analysis of customer attribute information.
SOLUTION: A device includes: an attribute database connection unit connected to an attribute database for storing a plurality of attribute values corresponding to a plurality of attributes for each of a plurality of target persons; an attribute prediction model generation unit that uses the attribute database to generate a plurality of first attribute prediction models, each of which predicts an attribute value of a first prediction target attribute based on an attribute value of at least one attribute, among the plurality of attributes, other than the first prediction target attribute; and an attribute prediction model selection unit that selects, based on a prediction error of each of the plurality of first attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a device, a method, and a program for analyzing customer attribute information.

Conventionally, systems that collect and analyze customer data are known (see, for example, Patent Document 1). Data analyzed by such systems is used for customer recommendations, market research, and the like.
Patent Document 1: Japanese Unexamined Patent Publication No. 2015-76076

In recent years, expectations for analyzing and making use of customer data have continued to grow, and further improvements in the accuracy and efficiency of data analysis are desired.

To solve the above problem, a first aspect of the present invention provides a device comprising: an attribute database connection unit connected to an attribute database for storing, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes; an attribute prediction model generation unit that uses the attribute database to generate a first plurality of attribute prediction models, each of which predicts the attribute value of a first prediction target attribute based on the attribute value of at least one attribute, among the plurality of attributes, other than the first prediction target attribute; and an attribute prediction model selection unit that selects, based on the prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.

A second aspect of the present invention provides a method comprising: an attribute prediction model generation step in which a computer uses an attribute database for storing, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes, to generate a first plurality of attribute prediction models, each of which predicts the attribute value of a first prediction target attribute based on the attribute value of at least one attribute, among the plurality of attributes, other than the first prediction target attribute; and an attribute prediction model selection step in which the computer selects, based on the prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.

A third aspect of the present invention provides a program that is executed by a computer and causes the computer to function as: an attribute database connection unit connected to an attribute database for storing, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes; an attribute prediction model generation unit that uses the attribute database to generate a first plurality of attribute prediction models, each of which predicts the attribute value of a first prediction target attribute based on the attribute value of at least one attribute, among the plurality of attributes, other than the first prediction target attribute; and an attribute prediction model selection unit that selects, based on the prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.

Note that the above summary of the invention does not enumerate all the necessary features of the present invention. Sub-combinations of these feature groups may also constitute inventions.

The drawings show the following:
The configuration of the system 100 according to the present embodiment, together with the terminal 112, the terminal 152, and the terminal 192.
An example of the data structure stored in the attribute DB 122 and the attribute DB 167 according to the present embodiment.
The operation flow of the prediction model generation device 150 according to the present embodiment.
An example of the evaluation results of the attribute prediction models generated by the attribute prediction model generation unit 180 according to the present embodiment.
An example of the attribute prediction model selection results generated by the attribute prediction model selection unit 185 according to the present embodiment.
The attribute information acquisition flow in the attribute prediction device 110 according to the present embodiment.
The attribute prediction flow in the attribute prediction device 110 according to the present embodiment.
An example of dependencies among prediction target attributes.
The attribute addition flow in the system 100 according to the present embodiment.
An example of the configuration of the computer 1900 according to the present embodiment.

The present invention is described below through embodiments of the invention, but the following embodiments do not limit the invention according to the claims. Moreover, not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

FIG. 1 shows the configuration of the system 100 according to the present embodiment, together with the terminal 112, the terminal 152, and the terminal 192. The system 100 registers and manages, in an attribute database (attribute DB), the attribute data (also referred to as "attribute information") of each target person, such as a subscriber or member of a point service or the like. The system 100 generates attribute prediction models that predict attributes in the attribute DB, mainly those for which attribute data has not been acquired, and predicts their attribute values. The system 100 then makes the attribute DB, including the predicted attribute values, available for recommending products or services (hereinafter collectively referred to as "products or the like"), market research, and similar uses.

The system 100 includes an attribute prediction device 110, a prediction model generation device 150, and a recommendation processing device 190. The system 100 may be implemented by a single computer such as a server computer, by a system of multiple computers, or by a distributed system of multiple geographically distributed computers. The system 100 manages the attribute information of target persons, such as the members of a point system provided by a single business operator, a common point system shared across multiple business operators, a credit card service, an electronic money service, or any other membership service.

The attribute prediction device 110 receives and registers the attribute information of target persons and, based on the registered attribute information, predicts mainly unknown attribute information. So that attribute values can be compared across target persons regardless of whether the attribute information is known or unknown, the attribute prediction device 110 according to the present embodiment also predicts attribute information for target persons whose attribute information is already known. The attribute prediction device 110 includes an attribute information acquisition unit 115, an attribute DB connection unit 120, an attribute DB 122, a dimension reduction unit 125, a reduction DB connection unit 130, a reduction DB 132, an attribute prediction unit 135, an attribute value update unit 140, and an attribute predicted value update unit 142.

The attribute information acquisition unit 115 acquires the attribute information of target persons and stores it in the attribute DB 122 via the attribute DB connection unit 120. The sources of a target person's attribute information may be, for example, at least one of the following: information that the target person fills in or enters, as required or optional items, at new member registration; the target person's responses to member questionnaires; payment information generated when the target person purchases products or the like at a store; payment information generated when the target person purchases products or the like on an e-commerce site; information on websites the target person has accessed; information on Internet advertisements (web advertisements) the target person has clicked on a website; and information on television programs the target person has watched. This information is provided with the target person's consent. The attribute information acquisition unit 115 may also actively collect unknown attribute values by accessing the attribute DB 122 via the attribute DB connection unit 120 at an arbitrary timing, such as periodically, searching for target persons for whom the attribute value of at least one attribute is unknown, and conducting a member questionnaire via a website or the like.

The attribute DB connection unit 120 is connected to the attribute DB 122 and handles access to the attribute DB 122 from each unit in the system 100. The attribute DB 122 stores, for each of a plurality of target persons, attribute information including a plurality of attribute values corresponding to a plurality of attributes. The attribute DB 122 may be implemented by at least part of the storage area of an external storage device, such as a hard disk drive, connected to a computer that performs the processing of the system 100, or by a storage device external to the system 100, provided for example by a cloud storage service.

The dimension reduction unit 125 reduces the dimensions of the plurality of attributes based on the attribute values of each of the plurality of target persons stored in the attribute DB 122. As one example, the dimension reduction unit 125 uses a topic model to calculate, for each target person in the attribute DB 122, a plurality of values representing the degree or probability to which that person's attribute information corresponds to each of a plurality of topics. The dimension reduction unit 125 then adds the calculated values to each target person's attribute information as a plurality of attributes for dimension reduction. In this way, the dimension reduction unit 125 can reduce the large number of attributes that make up a target person's attribute information to a smaller number of per-topic attributes. Accordingly, the dimension reduction unit 125 may exclude some or all of the attributes other than those added for dimension reduction from the explanatory variables of the attribute prediction models. The dimension reduction unit 125 applies this dimension reduction processing to the attribute DB 122 to convert it into the reduction DB 132.
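
The per-topic attributes described above can be sketched as follows. This is a minimal illustration only: the topic weights in `TOPICS` are hypothetical and hard-coded, whereas a real system would obtain them from a fitted topic model, and the weighted-overlap score is a simplified stand-in for actual topic model inference. All names (`TOPICS`, `topic_scores`, `reduce_dimensions`, the attribute names) are assumptions and do not appear in the source.

```python
from typing import Dict

# Hypothetical topics: each maps raw attribute names to weights.
# A fitted topic model would supply these; they are hard-coded for illustration.
TOPICS: Dict[str, Dict[str, float]] = {
    "topic_outdoor": {"bought_tent": 0.6, "bought_boots": 0.4},
    "topic_cinema": {"rented_movie": 0.7, "bought_popcorn": 0.3},
}


def topic_scores(person: Dict[str, float]) -> Dict[str, float]:
    """Return one score per topic, normalized to sum to 1 (all zeros if no match)."""
    raw = {
        topic: sum(w * person.get(attr, 0.0) for attr, w in weights.items())
        for topic, weights in TOPICS.items()
    }
    total = sum(raw.values())
    if total == 0:
        return {topic: 0.0 for topic in raw}
    return {topic: score / total for topic, score in raw.items()}


def reduce_dimensions(attribute_db: Dict[str, Dict[str, float]],
                      keep_raw: bool = False) -> Dict[str, Dict[str, float]]:
    """Build a reduced DB whose rows hold the per-topic attributes.

    With keep_raw=False the raw attributes are dropped, mirroring the option of
    excluding them from the explanatory variables.
    """
    reduced = {}
    for person_id, attrs in attribute_db.items():
        row = dict(attrs) if keep_raw else {}
        row.update(topic_scores(attrs))
        reduced[person_id] = row
    return reduced


attribute_db = {
    "p1": {"bought_tent": 1.0, "bought_boots": 1.0},
    "p2": {"rented_movie": 1.0},
    "p3": {},
}
reduced_db = reduce_dimensions(attribute_db)
```

Each row of `reduced_db` has exactly one entry per topic, however many raw attributes the original row contained.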

The reduction DB connection unit 130 is connected to the reduction DB 132 and handles write access to the reduction DB 132 from the dimension reduction unit 125 and read access to the reduction DB 132 from the attribute prediction unit 135. The reduction DB 132 stores, as attribute information for each of the plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes, at least some of which are the attributes reduced by the dimension reduction unit 125. Like the attribute DB 122, the reduction DB 132 may be implemented by an external storage device connected to a computer that performs the processing of the system 100, or by a storage device provided by a cloud storage service or the like external to the system 100.

The attribute prediction unit 135 receives from the prediction model generation device 150 the attribute prediction model selected for each of the one or more attributes to be predicted in the attribute DB 122 (referred to as "prediction target attributes"). The attribute prediction unit 135 then uses the received attribute prediction models to predict the attribute value of each prediction target attribute for each of the plurality of target persons. More specifically, for all target persons or at least some of them, the attribute prediction unit 135 runs the selected attribute prediction model for each prediction target attribute and calculates the predicted value of that attribute based on each target person's attribute information stored in the reduction DB 132. The attribute prediction unit 135 stores the calculated predicted values in the attribute DB 122 via the attribute DB connection unit 120.

The attribute DB 122 according to the present embodiment stores, in association with each of the plurality of attributes, both the known attribute value based on the attribute information acquired by the attribute information acquisition unit 115 and the predicted value predicted by the attribute prediction unit 135; both are attribute values in the broad sense. In the present embodiment, for convenience of explanation, the attribute DB 122 is assumed to store known attribute values and predicted values in the same representation format. In this case, the attribute information acquisition unit 115 converts the representation format of the raw data of the acquired attribute information, and the attribute DB 122 stores the converted attribute value as the known attribute value. For example, the attribute information acquisition unit 115 may convert an attribute value acquired as raw data on an absolute scale into an attribute value on a relative scale, and the attribute DB 122 may store the relative attribute value. One such example is the case where, instead of a target person's age expressed on an absolute scale, such as 35 years old, the attribute DB 122 stores an attribute value converted into a relative value among all target persons whose age is known (for example, a deviation value, that is, a standard score, or a percentile ranking). Another example is the case where the attribute DB 122 stores, for a target person's gender, an attribute value converted into the probability of being male, or a deviation value of the likelihood of being male. By storing predicted values in the same representation format as the converted known attribute values, such as a deviation value, a percentile ranking, or a probability, the attribute DB 122 makes known attribute values and predicted values interchangeable. This allows the system 100 to use known attribute values as explanatory variables when generating an attribute prediction model, and to use predicted values as explanatory variables when applying the model. As a result, the attribute prediction device 110 can use an attribute prediction model to predict the attribute value of the prediction target attribute even for a target person for whom known attribute values are not available for at least some of the attributes serving as the model's explanatory variables.
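
The conversion from an absolute value to a relative value can be sketched as follows. This is a minimal illustration, assuming the common definition of the deviation value as 50 + 10 × (x − mean) / standard deviation and a percentile rank defined as the share of subjects at or below a value; the source does not specify either formula, and the function names and sample ages are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict


def deviation_scores(values: Dict[str, float]) -> Dict[str, float]:
    """Convert absolute values to deviation values: 50 + 10 * (x - mean) / std."""
    data = list(values.values())
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation over subjects with known values
    if sigma == 0:
        return {k: 50.0 for k in values}
    return {k: 50.0 + 10.0 * (v - mu) / sigma for k, v in values.items()}


def percentile_ranks(values: Dict[str, float]) -> Dict[str, float]:
    """Percentage of subjects whose value is less than or equal to each subject's value."""
    data = list(values.values())
    n = len(data)
    return {k: 100.0 * sum(1 for o in data if o <= v) / n for k, v in values.items()}


# Only subjects whose age is known take part in the conversion.
known_ages = {"p1": 25, "p2": 35, "p3": 45}
age_dev = deviation_scores(known_ages)
age_pct = percentile_ranks(known_ages)
```

A predicted age stored on the same 50-centered scale can then be used wherever a known age would be, which is the interchangeability the paragraph above relies on.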

The attribute DB 122 may also be implemented in various other forms, such as one in which known attribute values are stored in the pre-conversion representation format and predicted values are stored in the post-conversion representation format. Even in such a case, each time the system 100 refers to a predicted value in place of a known attribute value, it can convert the predicted value from its own representation format into that of the known attribute value, and thereby achieve the same behavior as when known attribute values and predicted values are stored in the same representation format. In addition to the known attribute values whose representation format has been converted, the attribute DB 122 may also separately store the raw data of the attribute information acquired by the attribute information acquisition unit 115.

The attribute value update unit 140 updates the attribute value of an attribute based on its predicted value, on condition that the certainty of the predicted value of that attribute stored in the attribute DB 122 is equal to or greater than a threshold. In this way, when the accuracy of a predicted value is sufficiently high, an attribute value that has not been acquired can be set based on the predicted value.

When the attribute value of an attribute stored in the attribute DB 122 is known, the attribute predicted value update unit 142 updates the predicted value of that attribute based on the known attribute value in response to the predicted value deviating from the known attribute value by a reference amount or more. In this way, when the predicted value deviates greatly from the known attribute value for some target persons, the predicted value can be rewritten to a more accurate value.
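
The two update rules of units 140 and 142 can be sketched as follows. This is a minimal illustration under stated assumptions: the `AttributeCell` structure, the `confidence` field, and the default values of `threshold` and `max_gap` are all hypothetical, and "update based on" is simplified here to copying the value, which the source does not require. The sketch also applies the first rule only to attributes with no known value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttributeCell:
    """One attribute of one target person (hypothetical structure)."""
    known: Optional[float] = None      # known attribute value, if acquired
    predicted: Optional[float] = None  # value predicted by the selected model
    confidence: float = 0.0            # assumed certainty of the prediction, in [0, 1]


def update_attribute_value(cell: AttributeCell, threshold: float = 0.9) -> None:
    """Rule of unit 140: adopt the prediction when its certainty reaches the threshold."""
    if cell.known is None and cell.predicted is not None and cell.confidence >= threshold:
        cell.known = cell.predicted


def update_predicted_value(cell: AttributeCell, max_gap: float = 10.0) -> None:
    """Rule of unit 142: replace a prediction that strays too far from the known value."""
    if cell.known is not None and cell.predicted is not None:
        if abs(cell.predicted - cell.known) >= max_gap:
            cell.predicted = cell.known


confident = AttributeCell(predicted=60.0, confidence=0.95)
update_attribute_value(confident)   # known value is now set from the prediction

outlier = AttributeCell(known=40.0, predicted=60.0)
update_predicted_value(outlier)     # prediction is pulled back to the known value
```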

The attribute prediction device 110 may further include an attribute data acquisition unit 144 and an attribute addition unit 146. The attribute data acquisition unit 144 acquires, for at least some of the plurality of target persons, known information indicating known attribute values of an additional attribute to be added to the attribute database. For example, the attribute data acquisition unit 144 receives the known information via the terminal 112 from a user of the system 100 or of the attribute prediction device 110.

The attribute addition unit 146 adds the additional attribute to the plurality of attributes of the attribute DB 122. For each target person included in the known information, the attribute addition unit 146 sets the known attribute value of the additional attribute based on the attribute value indicated in the known information acquired by the attribute data acquisition unit 144. For target persons for whom the known information does not indicate a known attribute value, the attribute addition unit 146 does not set a known attribute value.
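
The attribute addition step can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary as the attribute DB and `None` as the marker for "no known value"; the function name `add_attribute` and the sample attribute `owns_car` are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical in-memory attribute DB: person id -> attribute name -> value (None = unknown).
AttributeDB = Dict[str, Dict[str, Optional[float]]]


def add_attribute(db: AttributeDB, name: str, known_info: Dict[str, float]) -> None:
    """Add a new attribute to every row; set it only where the known information has a value."""
    for person_id, row in db.items():
        row[name] = known_info.get(person_id)  # None leaves the value unset


db: AttributeDB = {
    "p1": {"age": 48.0},
    "p2": {"age": 52.0},
    "p3": {"age": None},
}
add_attribute(db, "owns_car", {"p1": 1.0, "p3": 0.0})
```

The rows left at `None` are the ones for which the new attribute would then be treated as a prediction target.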

Once the additional attribute has been added to the attribute DB 122, the system 100 may handle it as a prediction target attribute. That is, the prediction model generation device 150 may treat the additional attribute as a prediction target attribute and have the attribute prediction model generation unit 180 generate one or more attribute prediction models that predict its attribute value. When a plurality of attribute prediction models have been generated, the prediction model generation device 150 may have the attribute prediction model selection unit 185 select, based on the prediction error of each model, the attribute prediction model to be used for predicting the attribute value of the prediction target attribute.

The prediction model generation device 150 generates attribute prediction models that predict each of the one or more attributes to be predicted in the attribute DB 122. The prediction model generation device 150 includes a model update instruction unit 155, a sampling unit 160, an attribute DB connection unit 165, an attribute DB 167, a dimension reduction unit 170, a reduction DB connection unit 175, a reduction DB 177, an attribute prediction model generation unit 180, and an attribute prediction model selection unit 185.

The model update instruction unit 155 instructs an update of the attribute prediction models used by the attribute prediction device 110 in response to the elapse of a predetermined period. For example, when a period such as one day, one week, or one month has elapsed since the attribute prediction models were last updated, the model update instruction unit 155 instructs the update by, for example, triggering the sampling unit 160 to start processing. Alternatively, the model update instruction unit 155 may instruct the attribute prediction model selection unit 185 to transmit, once the period has elapsed, the attribute prediction models already prepared within the prediction model generation device 150 during that period to the attribute prediction device 110.
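
The elapsed-period check can be sketched as follows. This is a minimal illustration; the function name `update_due` and the one-week default are assumptions, and how the last update time is recorded is not specified in the source.

```python
from datetime import datetime, timedelta


def update_due(last_update: datetime, now: datetime,
               period: timedelta = timedelta(weeks=1)) -> bool:
    """Return True once the predetermined period has elapsed since the last model update."""
    return now - last_update >= period


last_update = datetime(2024, 1, 1)
should_update = update_due(last_update, datetime(2024, 1, 8))  # exactly one week later
```

A scheduler could call such a check and, when it returns True, either start sampling or release models that were prepared in advance.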

The sampling unit 160 samples some of the target persons whose attribute information is stored in the attribute DB 122, and stores or copies the attribute information stored in the attribute DB 122 for the sampled target persons into the attribute DB 167 via the attribute DB connection unit 165. Here, the sampling unit 160 may extract, from among the target persons registered in the attribute DB 122 whose attribute value for the prediction target attribute is known, as many target persons as the number of samples used for generating the attribute prediction models. Alternatively, the sampling unit 160 may randomly extract that number of target persons from among all target persons registered in the attribute DB 122.
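
The two sampling options can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary as the attribute DB with `None` marking unknown values; the function name `sample_subjects`, its parameters, and the sample data are hypothetical.

```python
import random
from typing import Dict, Optional

Row = Dict[str, Optional[float]]


def sample_subjects(db: Dict[str, Row], target: str, n_samples: int,
                    known_only: bool = True,
                    seed: Optional[int] = None) -> Dict[str, Row]:
    """Copy up to n_samples rows into a training DB.

    known_only=True draws only from subjects whose target attribute is known;
    known_only=False draws at random from all registered subjects.
    """
    candidates = [pid for pid, row in db.items()
                  if not known_only or row.get(target) is not None]
    rng = random.Random(seed)
    picked = rng.sample(candidates, min(n_samples, len(candidates)))
    return {pid: dict(db[pid]) for pid in picked}  # copies, so the source DB is untouched


people = {
    "p1": {"age": 41.0, "income": 55.0},
    "p2": {"age": None, "income": 48.0},
    "p3": {"age": 62.0, "income": None},
    "p4": {"age": None, "income": 51.0},
    "p5": {"age": 38.0, "income": 60.0},
}
training_db = sample_subjects(people, target="age", n_samples=2, seed=0)
```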

The attribute DB connection unit 165 is connected to the attribute DB 167 and handles access to the attribute DB 167 from each unit in the prediction model generation device 150. The attribute DB 167 stores, for each of the target persons sampled by the sampling unit 160 from among those registered in the attribute DB 122, attribute information including a plurality of attribute values corresponding to a plurality of attributes. Like the attribute DB 122, the attribute DB 167 may be implemented by an external storage device connected to a computer that performs the processing of the system 100, or by a storage device provided by a cloud storage service or the like external to the system 100.

The dimension reduction unit 170 reduces the dimensions of the plurality of attributes based on the attribute values of each of the plurality of target persons stored in the attribute DB 167. The dimension reduction unit 170 may apply to the attribute DB 167 the same dimension reduction processing as the dimension reduction unit 125, converting it into the reduction DB 177.

The reduction DB connection unit 175 is connected to the reduction DB 177 and handles write access to the reduction DB 177 from the dimension reduction unit 170 and read access to the reduction DB 177 from the attribute prediction model generation unit 180. The reduction DB 177 stores, for each of the target persons sampled by the sampling unit 160, attribute information having a plurality of attribute values corresponding to a plurality of attributes, at least some of which are the attributes reduced by the dimension reduction unit 170. Like the attribute DB 122, the reduction DB 177 may be implemented by an external storage device connected to a computer that performs the processing of the system 100, or by a storage device provided by a cloud storage service or the like external to the system 100.

The attribute prediction model generation unit 180 uses the attribute DB 167 to generate a plurality of attribute prediction models, each of which predicts the attribute value of a prediction target attribute based on the attribute value of at least one attribute, among the plurality of attributes, other than the prediction target attribute. By referring to the attribute DB 167, the attribute prediction model generation unit 180 according to the present embodiment generates the plurality of attribute prediction models using the attribute values associated with the subset of target persons sampled from the attribute DB 122. This reduces the amount of computation and the computation time compared with generating each of the attribute prediction models from the attribute information of all target persons stored in the attribute DB 122. If sufficient computing capacity is available, the attribute prediction model generation unit 180 may instead generate the plurality of attribute prediction models from the attribute information of every target person stored in the attribute DB 122, without sampling.

Further, in the present embodiment, the attribute prediction model generation unit 180 may use the attribute DB 167 indirectly by referring to the contraction DB 177 converted from the attribute DB 167. This allows the attribute prediction model generation unit 180 to predict the attribute value of the prediction target attribute from the attribute value of at least one of the dimension-reduced attributes. Here, by excluding from the explanatory variables of the attribute prediction model all, or at least some, of the attributes other than those that the dimension reduction unit 170 added for dimension reduction, the attribute prediction model generation unit 180 can make effective use of the dimension reduction result and generate an attribute prediction model whose explanatory variables are only those attributes, out of a large number of attributes, that can affect the prediction of the prediction target attribute. As a result, the attribute prediction model generation unit 180 can reduce the amount of calculation and the calculation time required to generate the plurality of attribute prediction models.

As described above, the attribute prediction model generation unit 180 can generate, for each of a plurality of prediction target attributes, a plurality of attribute prediction models that predict that prediction target attribute. Here, the attribute prediction model generation unit 180 makes it possible to select different attribute prediction models as the first attribute prediction model used to predict the attribute value of a first prediction target attribute and the second attribute prediction model used to predict the attribute value of a second prediction target attribute. Each of these attribute prediction models may differ from the other attribute prediction models in at least one of its prediction algorithm and its hyperparameters, which are not updated by training. For each of the plurality of attribute prediction models, the attribute prediction model generation unit 180 trains the learnable parameters of that model so as to minimize the prediction error with which the model predicts the attribute value of the prediction target attribute.

For each of the plurality of prediction target attributes, the attribute prediction model selection unit 185 selects the attribute prediction model used to predict the attribute value of that prediction target attribute, based on the prediction error of each of the plurality of attribute prediction models generated by the attribute prediction model generation unit 180. For example, the attribute prediction model selection unit 185 selects, for each prediction target attribute, the attribute prediction model with the smallest prediction error from among the plurality of attribute prediction models. In doing so, the attribute prediction model selection unit 185 may select, for different prediction target attributes, attribute prediction models that differ in at least one of the hyperparameters and the prediction algorithm. The attribute prediction model selection unit 185 provides the attribute prediction model selected for each prediction target attribute to the attribute prediction unit 135 in the attribute prediction device 110. As an example, the attribute prediction model selection unit 185 provides the attribute prediction unit 135 with identification information specifying the prediction algorithm used for the attribute prediction model, together with a set of hyperparameter values.
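The per-attribute selection described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `CandidateModel` record, the `select_models` function, the algorithm names, and the error values are all hypothetical, and the prediction error is assumed to be a single number where lower is better.

```python
from typing import Dict, List, NamedTuple


class CandidateModel(NamedTuple):
    """One trained candidate (hypothetical record, not from the source)."""
    algorithm_id: str        # identifies the prediction algorithm
    hyperparameters: dict    # values that training does not update
    prediction_error: float  # assumed: lower is better


def select_models(
    candidates: Dict[str, List[CandidateModel]],
) -> Dict[str, CandidateModel]:
    """For each prediction target attribute, keep the lowest-error candidate."""
    return {
        attribute: min(models, key=lambda m: m.prediction_error)
        for attribute, models in candidates.items()
        if models  # skip attributes with no candidates
    }


# Made-up candidates for two prediction target attributes.
candidates = {
    "owns_car": [
        CandidateModel("logistic_regression", {"C": 1.0}, 0.21),
        CandidateModel("random_forest", {"n_trees": 100}, 0.17),
    ],
    "likes_beer": [
        CandidateModel("logistic_regression", {"C": 0.1}, 0.12),
        CandidateModel("random_forest", {"n_trees": 50}, 0.15),
    ],
}
selected = select_models(candidates)
```

In this sketch the two target attributes end up with different algorithms, which corresponds to the case where the selected models differ per prediction target attribute. Only `algorithm_id` and `hyperparameters` would need to be handed to the prediction side.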

The recommendation processing device 190 decides, based on the attribute value of the prediction target attribute for each of the plurality of target persons, whether to recommend to that target person a product or the like associated with the prediction target attribute. Here, for each of the plurality of target persons, the recommendation processing device 190 may decide to recommend a specific product or the like to a target person when the attribute value of the prediction target attribute indicating preference for that product or the like indicates a preference at or above a threshold. The recommendation processing device 190 may also decide to recommend a product or the like based on the attribute values of prediction target attributes that indicate at least part of one or more basic attributes, living attributes, and orientations (for example, place of residence, whether the person owns a car, and whether the person is luxury-oriented or frugality-oriented).
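The threshold rule above can be illustrated with a short sketch. The function name, the person IDs, and the 0-to-1 preference scale are assumptions made for the example; the source only states that a recommendation is decided when the preference is at or above a threshold.

```python
from typing import Dict, List, Optional


def select_recommendation_targets(
    preference_by_person: Dict[str, Optional[float]],
    threshold: float,
) -> List[str]:
    """Return the IDs of persons whose preference value meets the threshold.

    A value of None stands for "no known or predicted value" and is skipped.
    """
    return sorted(
        person_id
        for person_id, value in preference_by_person.items()
        if value is not None and value >= threshold
    )


# Hypothetical predicted preference values for one product.
targets = select_recommendation_targets(
    {"p1": 0.9, "p2": 0.4, "p3": None, "p4": 0.7},
    threshold=0.7,
)
```

Raising or lowering `threshold` is one simple way to widen or narrow the set of recommendation targets.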

The recommendation processing device 190 then recommends the product or the like to those target persons, among the plurality of target persons, for whom it has decided to make a recommendation. As an example, the recommendation processing device 190 may provide the target person with e-mail, direct mail, Internet advertisements, or the like containing an advertisement for the product or the like; may provide a TV commercial containing an advertisement for the product or the like to an audience that includes the target person; or may provide services that favor the purchase of the product or the like, such as coupons, discounts, and point awards.

The terminal 112 is connected to the attribute prediction device 110 directly or indirectly via a network, and provides a user interface for: statistical processing of the attribute information of the plurality of target persons stored in the attribute DB 122 or the contraction DB 132 of the attribute prediction device 110; database processing such as adding and deleting attributes, setting attribute values, and narrowing down the target persons for whom a set of at least one attribute satisfies a specified extraction condition; and evaluating the attribute prediction model for each prediction target attribute.

The terminal 152 is connected to the prediction model generation device 150 directly or indirectly via a network, and provides a user interface for: adding, changing, and deleting the prediction algorithms that the attribute prediction model generation unit 180 can generate; specifying ranges and values of the hyperparameters that the attribute prediction model generation unit 180 gives to each attribute prediction model; checking and evaluating the trained parameters and prediction results of each of the plurality of attribute prediction models generated by the attribute prediction model generation unit 180, and comparing attribute prediction models with one another; manually specifying or assisting the selection of an attribute prediction model by the attribute prediction model selection unit 185; database processing on the attribute DB 167, which is a subset of the attribute DB 122, such as adding and deleting attributes, setting attribute values, and narrowing down the target persons for whom a set of at least one attribute satisfies a specified extraction condition; and statistical processing of the attribute information of the plurality of target persons stored in the attribute DB 167 or the contraction DB 177 of the prediction model generation device 150.

The terminal 192 is connected to the recommendation processing device 190 directly or indirectly via a network, and provides a user interface for managing the recommendation processing in the recommendation processing device 190. As an example, the terminal 192 provides a user interface for setting and adjusting the conditions used to narrow down recommendation targets according to the advertising budget and the budget for discounts or point awards on products or the like, checking the result of narrowing down the recommendation targets, setting the recommendation method, and instructing execution of the recommendation.

The terminal 112, the terminal 152, and the terminal 192 may be desktop computers, or may be mobile terminals such as tablets and smartphones.

According to the system 100 described above, a plurality of attribute prediction models that predict attribute values can be generated for each prediction target attribute, based on the attribute information of the plurality of target persons stored in the attribute DB 122 or in the attribute DB 167, which is a subset of the attribute DB 122, and a suitable attribute prediction model can be selected for each prediction target attribute. The system 100 can therefore further improve the prediction accuracy for each prediction target attribute.

FIG. 2 shows an example of the data structure stored in the attribute DB 122 and the attribute DB 167 according to the present embodiment. For each of the plurality of target persons, the attribute DB 122 and the attribute DB 167 store personal identification information (personal ID) that identifies the individual, and attribute information on the plurality of attributes that the individual has.

The "personal ID" is an identifier for identifying each individual target person in the system 100, for example a membership number or login ID of a service provided by the system 100. Alternatively, the attribute DB 122 and the attribute DB 167 may use, as the "personal ID", the target person's name, e-mail address, address, telephone number, identification information of a mobile terminal owned by the target person, or information generated from a combination of at least one of these.

The "attribute information" consists of the attribute values of the various attributes that a target person has. The attributes stored in the attribute DB 122 and the attribute DB 167 according to the present embodiment are broadly classified into general-purpose attribute data, purchasing potential data, and recommendation potential data.

The "general-purpose attribute data" is a set of attributes indicating the characteristics of each target person, and in particular indicates the characteristics of the target person himself or herself in a general-purpose manner. The "general-purpose attribute data" may include at least one of: one or more attributes classified as basic attributes, one or more attributes classified as living attributes, and one or more attributes classified as orientation.

The "basic attributes" are the basic information of each target person. The one or more attributes classified as basic attributes include, for example, at least one of name, date of birth, age or age group, gender, address, telephone number, and the like. The "basic attributes" are mainly entered when a target person is newly registered or changes his or her registered details, but at least some of these attributes may be optional at registration and may be prediction targets.

The "living attributes" are information on the target person's way of living. The one or more attributes classified as "living attributes" may include, for example, at least one of marital status (married or unmarried), type of residence, annual household income, annual personal income, occupation, whether the person owns a car, whether the person owns a home, and the like. Attribute values for the "living attributes" may be collected at new registration or the like, may be collected by various methods such as questionnaires, and may be prediction targets.

"Orientation" is information indicating the target person's orientations, tendencies, and/or preferences. The one or more attributes classified as "orientation" may include, for example, at least one of: for clothing, being quality-oriented, challenge-oriented, conservative, or brand-oriented; for food, being luxury-oriented, frugality-oriented, or discount-oriented; for housing, being convenience-oriented, city-oriented, or community-oriented; and, in addition, being health-oriented, career-oriented, or globally oriented. Orientation may also include, as attributes relating to the target person's preferences, at least one of: the presence or degree of preference for various hobbies such as driving, gourmet food, travel, and sports; the presence or degree of preference for various products or the like; and the presence or degree of preference for various websites or the like. Attributes relating to "orientation" may be added in various surveys according to the purpose of the survey. Attribute values for "orientation" may be collected at new registration or the like, may be collected by various methods such as questionnaires, and may be prediction targets.

The "purchasing potential data" is a set of attributes indicating how likely each target person is to purchase each of a plurality of products or the like, or each of a plurality of product groups or service groups. As an example, the "purchasing potential data" may include, for each genre, type, or category of products or the like, such as entertainment, food, and daily necessities, attributes associated with each product or service belonging to that genre or the like. Each attribute of the "purchasing potential data" may be a preference attribute indicating the target person's preference for the product or the like associated with that attribute.

As an example, the "purchasing potential data" includes an attribute for each of the large number of products or the like that are subject to sales management by the member service provided by the system 100. The "purchasing potential data" may include attributes corresponding to each code value of, for example, the JAN code that identifies each product or the like; in this case, one or more attributes are assigned to each individual product or the like to which a JAN code is assigned. When the member service provided by the system 100 offers a common point system spanning a plurality of businesses, the purchasing potential data may include hundreds of thousands to millions of attributes, or more. The "purchasing potential data" may also have attributes corresponding to product groups or service groups, for example product groups such as beer, alcoholic beverages, and/or beverages.

For each target person, the "purchasing potential data" may store, as known attribute values, the purchase history (whether a purchase was made, purchase quantity, purchase time, purchase location, and the like) and/or a degree of preference or the like obtained by quantifying the target person's response to product questionnaires or advertisements. The "purchasing potential data" may also store values predicted using an attribute prediction model as at least some of the attribute values.

The "recommendation potential data" is a set of attributes indicating the characteristics of each target person with respect to recommendations. As an example, the "recommendation potential data" may include at least one of: one or more attributes relating to media response, one or more attributes relating to incentive response, and one or more attributes relating to likelihood of churn.

The "media response" includes, as an example, for each medium used for recommendations, such as direct mail, e-mail, advertisements printed on receipts, Internet advertisements, and TV advertisements, one or more attributes indicating the effectiveness, evaluation value, or the like of a recommendation made to the target person through that medium. For example, when a target person does not respond to direct mail, the recommendation processing device 190 may decrease the attribute value of the attribute relating to the effectiveness or evaluation value of direct mail, or set a lower attribute value. When a target person clicks on Internet advertisements at or above a reference frequency, or purchases a product via an Internet advertisement, the recommendation processing device 190 may increase the attribute value of the attribute relating to the effectiveness or evaluation value of Internet advertisements, or set a higher attribute value.

The "incentive response" includes, as an example, for each incentive such as a discount, a coupon, a point award, a point increase, or the provision of a promotional item, one or more attributes indicating the effectiveness, evaluation value, or the like of a recommendation made to the target person using that incentive. For example, when a target person does not go on to purchase a product or the like even though a discount was offered, the recommendation processing device 190 may decrease the attribute value of the attribute relating to the effectiveness or evaluation value of discounts, or set a lower attribute value. When a target person purchases a product or the like for which a point award was offered, the recommendation processing device 190 may increase the attribute value of the attribute relating to the effectiveness or evaluation value of point awards, or set a higher attribute value.

The "likelihood of churn" includes, as an example, for each retailer and/or each product manufacturer or service provider, one or more attributes indicating the likelihood that the target person will stop using the products or the like of that business. For example, when a reference period or longer has elapsed since a target person last purchased a product or the like from a retailer, the attribute prediction device 110 regards the target person as having churned from that retailer and stores an attribute value indicating this in the attribute DB 122. Alternatively, the attribute prediction device 110 may store in the attribute DB 122, as the attribute value, a churn evaluation value calculated according to the period elapsed since the target person last used that retailer.
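The two ways of deriving a churn attribute described above can be sketched as follows. The function names are hypothetical, and the linear, clipped score is only one assumed way of computing an evaluation value from the elapsed period; the source does not specify the formula.

```python
from datetime import date


def churn_flag(last_purchase: date, today: date, reference_days: int) -> bool:
    """True when the reference period or longer has elapsed since the last purchase."""
    return (today - last_purchase).days >= reference_days


def churn_score(last_purchase: date, today: date, reference_days: int) -> float:
    """Assumed evaluation value: elapsed days scaled by the reference period,
    clipped to the range 0.0 to 1.0 (an illustrative choice, not from the source)."""
    elapsed = max((today - last_purchase).days, 0)
    return min(elapsed / reference_days, 1.0)


# 85 days separate 2018-01-01 and 2018-03-27.
flag = churn_flag(date(2018, 1, 1), date(2018, 3, 27), reference_days=60)
score = churn_score(date(2018, 1, 1), date(2018, 3, 27), reference_days=170)
```

The flag corresponds to storing a value indicating that the person has churned, and the score corresponds to storing a graded evaluation value instead.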

FIG. 3 shows the operation flow of the prediction model generation device 150 according to the present embodiment. In step S310, the model update instruction unit 155 instructs an update of the attribute prediction models used by the attribute prediction device 110 in response to the elapse of a predetermined period. Instead of or in addition to this, the model update instruction unit 155 may instruct an update of the attribute prediction models upon receiving an instruction from a user, an administrator, or the like of the system 100.

Here, the prediction model generation device 150 may treat all attributes in the attribute DB 122 or the attribute DB 167 as prediction targets; may treat every attribute except some, such as the target person's name, as prediction targets; may treat every attribute except the basic attributes included in the general-purpose attribute data as prediction targets; or may treat the purchasing potential data and/or the recommendation potential data as prediction targets. The prediction model generation device 150 may exclude from the prediction targets the attributes that the dimension reduction unit 170 added for dimension reduction. The prediction model generation device 150 may also treat as prediction targets only those attributes, among all attributes in the attribute DB 122 or the attribute DB 167, that are specified by a user or administrator of the system 100.

The prediction model generation device 150 may instruct updates of the attribute prediction models at timings that differ depending on the prediction target attribute. For example, the prediction model generation device 150 may instruct an update periodically for attributes belonging to the general-purpose attribute data; may, as an example, instruct an update for attributes belonging to the purchasing potential data each time a predetermined quantity of purchase data for the corresponding product or the like is received; and may, as an example, instruct an update for attributes belonging to the recommendation potential data each time a predetermined number of recommendations has been made. The prediction model generation device 150 may also make the update frequency of the attribute prediction models that predict attributes belonging to the purchasing potential data lower than the update frequency of the attribute prediction models that predict attributes belonging to the general-purpose attribute data. Because the purchasing potential data includes attributes corresponding to individual products or the like, it usually has more attributes than the general-purpose attribute data, so lowering the update frequency of the attribute prediction models that predict attributes belonging to the purchasing potential data can significantly reduce the amount of calculation performed by the prediction model generation device 150.
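The category-dependent update timing described above can be sketched as a small policy table. Everything concrete here is an assumption for illustration: the category keys, the `UpdateState` counters, and the limits of 30 days, 10,000 purchase records, and 500 recommendations are not from the source.

```python
from dataclasses import dataclass


@dataclass
class UpdateState:
    """Counters accumulated since the last model update (hypothetical)."""
    days_since_update: int = 0
    new_purchase_records: int = 0
    recommendations_made: int = 0


# Assumed policy: which counter triggers an update, and at what value.
UPDATE_POLICY = {
    "general_purpose": ("days_since_update", 30),
    "purchasing_potential": ("new_purchase_records", 10_000),
    "recommendation_potential": ("recommendations_made", 500),
}


def needs_update(category: str, state: UpdateState) -> bool:
    """Return True when the trigger for this attribute category has been reached."""
    counter_name, limit = UPDATE_POLICY[category]
    return getattr(state, counter_name) >= limit


state = UpdateState(days_since_update=31, new_purchase_records=2_500)
due = [category for category in UPDATE_POLICY if needs_update(category, state)]
```

With the counters above, only the periodically updated general-purpose models are due, while the far more numerous purchasing potential models wait until enough purchase data has arrived.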

In S320, the sampling unit 160 samples some of the target persons from the attribute DB 122 and stores the attribute information held in the attribute DB 122 for the sampled target persons in the attribute DB 167 via the attribute DB connection unit 165.
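A minimal sketch of this sampling step follows, assuming the attribute DB can be represented as a dictionary keyed by personal ID and that uniform random sampling is used; the source does not state the sampling method, and the function name and `fraction` parameter are hypothetical.

```python
import random
from typing import Dict


def sample_target_persons(
    attribute_db: Dict[str, dict],
    fraction: float,
    seed: int = 0,
) -> Dict[str, dict]:
    """Copy the attribute information of a random subset of persons.

    Uniform sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    size = min(len(attribute_db), max(1, round(len(attribute_db) * fraction)))
    chosen = rng.sample(sorted(attribute_db), size)
    return {person_id: dict(attribute_db[person_id]) for person_id in chosen}


# Toy stand-in for the full attribute DB.
full_db = {f"p{i}": {"age": 20 + i} for i in range(10)}
sampled_db = sample_target_persons(full_db, fraction=0.3)
```

The sampled dictionary plays the role of the smaller attribute DB used for model generation, while the full one is left unchanged.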

In S330, the dimension reduction unit 170 reduces the dimensions of the attribute DB 167 using, as an example, a topic model such as latent Dirichlet allocation (LDA) or probabilistic latent semantic analysis (PLSA). When a topic model is used, the dimension reduction unit 170 may perform modeling with each of several topic counts, for example 10, 20, 30, ..., 100, and store the attributes generated for each topic count in the contraction DB 177 as attributes for dimension reduction. This allows the attribute prediction model generation unit 180 to treat the number of topics used as a hyperparameter of the attribute prediction model, and allows the attribute prediction model selection unit 185 to select the attribute prediction model with the optimal number of topics.
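One way to picture storing topic attributes for several topic counts side by side, and then treating the topic count as a hyperparameter, is sketched below. No topic model is actually fitted: uniform placeholder weights stand in for real LDA or PLSA output, and the column naming scheme `topics{K}_{i}` and both function names are assumptions made for the example.

```python
from typing import Dict, List

TOPIC_COUNTS = list(range(10, 101, 10))  # 10, 20, ..., 100


def contraction_row(topic_weights_by_count: Dict[int, List[float]]) -> Dict[str, float]:
    """Flatten one person's topic weights for every topic count into one row."""
    row: Dict[str, float] = {}
    for count, weights in topic_weights_by_count.items():
        if len(weights) != count:
            raise ValueError(f"expected {count} weights, got {len(weights)}")
        for index, weight in enumerate(weights):
            row[f"topics{count}_{index}"] = weight
    return row


def explanatory_columns(topic_count: int) -> List[str]:
    """Columns a model uses when its topic-count hyperparameter is topic_count."""
    return [f"topics{topic_count}_{index}" for index in range(topic_count)]


# Placeholder weights; a real system would take these from the fitted topic model.
row = contraction_row({k: [1.0 / k] * k for k in TOPIC_COUNTS})
features_for_20 = [row[column] for column in explanatory_columns(20)]
```

A candidate model with the topic count set to 20 would read only the 20 columns returned by `explanatory_columns(20)`, so comparing candidates across topic counts needs no re-run of the dimension reduction.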

Here, the dimension reduction unit 170 may use all attributes in the attribute DB 122 or the attribute DB 167 as inputs to topic generation, or may instead use only some of those attributes. For example, the dimension reduction unit 170 may use the attributes included in the general-purpose attribute data for topic generation and not use the attributes included in the purchasing potential data and the recommendation potential data. As another example, the dimension reduction unit 170 may use for topic generation only the attributes belonging to some of the basic attributes, living attributes, and orientation included in the general-purpose attribute data. As an example, the dimension reduction unit 170 may use only the attributes included in the basic attributes and the living attributes for topic generation and not use the attributes included in orientation. When all attributes are used for topic generation, even small differences between target persons (for example, whether a specific product was purchased) are taken into account and can be reflected in the attribute values after dimension reduction, but the amount of calculation required for the dimension reduction increases. When only some attributes are used for topic generation, the amount of calculation can be kept down by using only the attributes that the user, administrator, or the like of the system 100 considers important.

The dimension reduction unit 170 may also reduce the dimensions of the attribute DB 167 by another method, instead of or in addition to the topic model. For example, the dimension reduction unit 170 may exclude from the explanatory variables of the attribute prediction model any attribute for which the proportion of target persons having a known attribute value stored is below a reference proportion.
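The coverage-based exclusion above can be sketched as follows, assuming each target person is a dictionary in which `None` or a missing key marks an unknown attribute value. The function name and the example reference proportion of 0.5 are hypothetical.

```python
from typing import Dict, List, Optional


def attributes_to_keep(
    rows: List[Dict[str, Optional[float]]],
    min_known_ratio: float,
) -> List[str]:
    """Keep attributes whose share of known values is at least min_known_ratio."""
    if not rows:
        return []
    names = sorted({name for row in rows for name in row})
    kept = []
    for name in names:
        known = sum(1 for row in rows if row.get(name) is not None)
        if known / len(rows) >= min_known_ratio:
            kept.append(name)
    return kept


# "age" is known for 3 of 4 persons, "income" for only 1 of 4.
rows = [
    {"age": 30, "income": None},
    {"age": 41, "income": None},
    {"age": None, "income": 520},
    {"age": 25, "income": None},
]
kept = attributes_to_keep(rows, min_known_ratio=0.5)
```

Attributes that fall below the reference proportion are simply left out of the explanatory variables; their stored values are not modified.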

Next, the prediction model generation device 150 repeats the processing between S340 and S380 (S350 to S370) for each prediction target attribute.

In S350, the attribute prediction model generation unit 180 generates one or more attribute prediction models that predict the attribute value of the prediction target attribute. First, the attribute prediction model generation unit 180 extracts, from the attribute information of each target person stored in the attribute DB 167, the attribute information of a plurality of target persons to be used for modeling, that is, for generating the attribute prediction models. As the attribute information for modeling, it extracts attribute information in which the attribute value of the prediction target attribute is known, that is, attribute information for which an attribute value has actually been obtained for the prediction target attribute. Here, the attribute prediction model generation unit 180 may preferentially extract attribute information in which a larger proportion of the attributes that may be used as explanatory variables (that is, the attributes whose values are input to the attribute prediction model as explanatory variable candidates) have known attribute values. When the number of target persons for modeling is equal to or less than a predetermined reference number, the attribute prediction model generation unit 180 may increase the number of samples by a technique such as boost sampling.

Next, the attribute prediction model generation unit 180 generates a plurality of attribute prediction models using the extracted attribute information as learning attribute information. The models differ from one another in at least one of the prediction algorithm and the hyperparameters, which are not updated by learning. The attribute prediction model generation unit 180 may select the prediction algorithm from various machine learning algorithms including, for example, random forest, gradient boosting, logistic regression, neural networks, and support vector machines (SVM). It may also obtain, for each selected prediction algorithm, a plurality of mutually different hyperparameter sets by independently selecting each of the one or more hyperparameters that can be set for the algorithm from a set of candidate values. Examples of such hyperparameters are the number of topics for dimension reduction in the dimension reduction unit 170, the depth of the decision trees in a random forest, the tree depth in gradient boosting, the normalization parameter in logistic regression, and the number of neurons and the number of layers in a neural network.
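
The independent selection of each hyperparameter from its candidate values amounts to enumerating a grid per algorithm. The sketch below illustrates this under assumptions: the algorithm names, hyperparameter names, and candidate values in SEARCH_SPACE are hypothetical and do not come from the embodiment.

```python
from itertools import product
from typing import Dict, List

# Hypothetical search space: algorithm -> hyperparameter -> candidate values.
# "n_topics" stands in for the number of topics used for dimension reduction.
SEARCH_SPACE: Dict[str, Dict[str, List[float]]] = {
    "random_forest": {"max_depth": [4, 8], "n_topics": [10, 20]},
    "logistic_regression": {
        "normalization_parameter": [0.1, 1.0, 10.0],
        "n_topics": [10, 20],
    },
}


def enumerate_candidates(search_space):
    """List every (algorithm, hyperparameter set) combination in the search space."""
    candidates = []
    for algorithm, grid in search_space.items():
        names = sorted(grid)
        # Each hyperparameter is chosen independently from its own candidate values.
        for values in product(*(grid[name] for name in names)):
            candidates.append(
                {"algorithm": algorithm, "hyperparameters": dict(zip(names, values))}
            )
    return candidates


candidates = enumerate_candidates(SEARCH_SPACE)
```

Each entry of candidates differs from every other in the algorithm, in at least one hyperparameter, or in both, which matches the condition stated above for the plurality of attribute prediction models.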

The attribute prediction model generation unit 180 optimizes, by learning with the learning attribute information, each of the plurality of attribute prediction models obtained as described above, which differ from one another in at least one of the prediction algorithm and the hyperparameter set. That is, for each piece of learning attribute information, the attribute prediction model generation unit 180 updates the learnable parameters of the attribute prediction model so that, when the attribute values other than the prediction target attribute are input to the model, the model outputs a predicted value closer to the known attribute value of the prediction target attribute in that learning attribute information. In this way, the attribute prediction model generation unit 180 trains and optimizes the learnable parameters of each attribute prediction model. Here, "optimization" does not necessarily mean minimizing the prediction error of the attribute prediction model (that is, setting every learnable parameter to its optimum value). It means the state in which the attribute prediction model generation unit 180 has finished the learning process that reduces the prediction error of the model (that is, the state in which the practically feasible learning process has been completed).

Here, the attribute prediction model generation unit 180 may restrict the explanatory variables of the attribute prediction model to known attribute values only. In this case, the number of samples usable as learning attribute information is limited, but the likelihood of generating a more accurate attribute prediction model based on actual attribute information increases. Alternatively, the attribute prediction model generation unit 180 may use predicted attribute values as some or all of the explanatory variables. That is, it may generate one or more second attribute prediction models that predict the attribute value of a second prediction target attribute using the predicted value of the first prediction target attribute predicted by the first attribute prediction model. Because the predicted value of the first prediction target attribute, predicted by the already generated first attribute prediction model, can be used as an explanatory variable when generating the second attribute prediction models, the attribute prediction model generation unit 180 can also use learning attribute information in which some known attribute values are missing, and can thus improve the accuracy of the attribute prediction models by using more learning attribute information.

As the prediction error, the attribute prediction model generation unit 180 may use a value obtained by accumulating, over all the attribute information subject to evaluation, the difference between the predicted value of the prediction target attribute predicted by the attribute prediction model from each piece of attribute information and the expected value. Various known methods can be applied to calculate such an error, including, as one example, summing the squares of the differences between the predicted values and the expected values. Alternatively, the attribute prediction model generation unit 180 may use a value that decreases as the AUC (Area Under the Curve), described later, increases (for example, 1 - AUC) as an index of the prediction error.
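
As a small illustration of the sum-of-squares example mentioned above, the sketch below accumulates the squared difference between predicted and expected values. The function name and the sample values are assumptions made for the example.

```python
from typing import Sequence


def sum_squared_error(predicted: Sequence[float], expected: Sequence[float]) -> float:
    """Accumulate the squared difference between predicted and expected values."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    return sum((p - e) ** 2 for p, e in zip(predicted, expected))


# Hypothetical predicted values for three target persons whose expected
# (known) values of a binary attribute are 1, 0, and 1.
error = sum_squared_error([0.9, 0.2, 0.6], [1.0, 0.0, 1.0])
```

The three squared differences are 0.01, 0.04, and 0.16, so the accumulated error is approximately 0.21.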

Here, the term "prediction error" also covers any index that changes according to the prediction error. For example, an index of prediction accuracy also counts as an index of prediction error: higher prediction accuracy corresponds to lower prediction error, and lower prediction accuracy corresponds to higher prediction error. Accordingly, the attribute prediction model generation unit 180 may use the AUC, which grows as prediction accuracy increases, as an index of prediction error (an index that shrinks as the prediction error grows). In this case, a high prediction error means that the value of the index is small.

In S360, the attribute prediction model generation unit 180 evaluates each of the trained attribute prediction models that were generated. It may evaluate the attribute prediction models based on the prediction error alone, or may also include other conditions as evaluation parameters, such as a smaller amount of calculation and/or a smaller number of referenced attributes. The attribute prediction model generation unit 180 may also evaluate an attribute prediction model using the value of the objective function that is used to optimize the prediction algorithm and that contains a term related to the prediction error. Here, the attribute prediction model generation unit 180 may calculate the prediction error of each attribute prediction model by cross-validation. In this case, it trains the attribute prediction models using part of the extracted attribute information for modeling, and calculates the prediction error of each model using the attribute information for modeling that was not used for training.
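
The cross-validation step can be sketched with a plain k-fold loop: each model is trained on part of the modeling data and scored on the part it did not see. The two toy model families below (a mean predictor and a line through the origin) are stand-ins chosen for brevity, not the algorithms of the embodiment, and the sample data are hypothetical.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (explanatory value, known target value)
Model = Callable[[float], float]


def k_fold_error(samples: List[Sample], fit: Callable[[List[Sample]], Model], k: int) -> float:
    """Mean squared error measured only on samples held out from training."""
    total = 0.0
    for fold in range(k):
        held_out = [s for i, s in enumerate(samples) if i % k == fold]
        training = [s for i, s in enumerate(samples) if i % k != fold]
        model = fit(training)
        total += sum((model(x) - y) ** 2 for x, y in held_out)
    return total / len(samples)


def fit_mean(training: List[Sample]) -> Model:
    """Toy model: always predict the mean of the training targets."""
    mean = sum(y for _, y in training) / len(training)
    return lambda x: mean


def fit_line_through_origin(training: List[Sample]) -> Model:
    """Toy model: least-squares line y = slope * x."""
    slope = sum(x * y for x, y in training) / sum(x * x for x, _ in training)
    return lambda x: slope * x


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
errors = {
    "mean": k_fold_error(samples, fit_mean, k=2),
    "line": k_fold_error(samples, fit_line_through_origin, k=2),
}
```

Because the error is computed on held-out samples, a model that merely memorizes the training data gains no advantage in the comparison.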

In S370, the attribute prediction model selection unit 185 selects the attribute prediction model to be used for predicting the attribute value of the prediction target attribute, based on evaluation results such as the prediction error of each of the plurality of attribute prediction models. As one example, the attribute prediction model selection unit 185 selects, from the plurality of attribute prediction models, the one with the smallest prediction error (that is, the highest prediction accuracy). Alternatively, it may select the attribute prediction model with the highest evaluation, using evaluation results that also include conditions other than the prediction error as evaluation parameters. The attribute prediction model selection unit 185 may further select a second attribute prediction model to be used for predicting the attribute value of the second prediction target attribute, based on evaluation results such as the prediction error of each of the one or more second attribute prediction models that, as described above, use predicted values from another attribute prediction model as explanatory variables.
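
The selection in S370 can be illustrated as taking the minimum of a score over the evaluated models. In this sketch the score is the prediction error plus an optional penalty on the number of referenced attributes. The linear penalty, the cost_weight parameter, and the evaluation records are assumptions for illustration, since the text does not specify how the other conditions are combined with the prediction error.

```python
from typing import Dict, List


def select_model(evaluations: List[Dict], cost_weight: float = 0.0) -> Dict:
    """Pick the evaluation with the lowest score.

    With cost_weight == 0 this is selection by smallest prediction error.
    A positive cost_weight (an assumed, illustrative penalty) also favors
    models that reference fewer attributes.
    """
    return min(
        evaluations,
        key=lambda e: e["prediction_error"] + cost_weight * e["n_attributes"],
    )


# Hypothetical evaluation results for one prediction target attribute.
evaluations = [
    {"model_id": "A", "prediction_error": 0.20, "n_attributes": 50},
    {"model_id": "B", "prediction_error": 0.22, "n_attributes": 5},
    {"model_id": "C", "prediction_error": 0.30, "n_attributes": 3},
]
best_by_error = select_model(evaluations)
best_with_cost = select_model(evaluations, cost_weight=0.001)
```

With the error alone, model A is selected. Once the attribute count is weighted in, model B scores lowest, because its slightly larger error is offset by referencing far fewer attributes.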

By performing the processing of S350 to S370 for each prediction target attribute, the prediction model generation device 150 can select an attribute prediction model individually optimized for each prediction target attribute. The prediction model generation device 150 can thereby further improve the prediction accuracy for each prediction target attribute.

FIG. 4 shows an example of the evaluation results of the attribute prediction models generated by the attribute prediction model generation unit 180 according to the present embodiment. For each of the one or more attribute prediction models generated for each prediction target attribute, the attribute prediction model generation unit 180 generates an evaluation result including attribute identification information (attribute ID), dimension reduction parameters, prediction algorithm identification information (prediction algorithm ID), hyperparameters, prediction error, and model evaluation.

The "attribute ID" is the identification information of the prediction target attribute. The "dimension reduction parameter" is a parameter, such as the number of topics, that the dimension reduction unit 125 and the dimension reduction unit 170 use for dimension reduction. When there are multiple dimension reduction parameters, the attribute prediction model generation unit 180 records the set value of each parameter. A dimension reduction parameter can also be regarded as a kind of hyperparameter.

The "prediction algorithm ID" is identification information that identifies the prediction algorithm selected in the entry. The "hyperparameter" field indicates the set value of each hyperparameter of the selected prediction algorithm. The "prediction error" indicates the prediction error of the attribute prediction model generated using the dimension reduction parameters, prediction algorithm, and hyperparameters specified in the entry. The "model evaluation" indicates the evaluation result of the attribute prediction model generated for the entry. When the attribute prediction model generation unit 180 evaluates the attribute prediction models based on the prediction error alone, a separate "model evaluation" column need not be provided.

FIG. 5 shows an example of the selection results of the attribute prediction models generated by the attribute prediction model selection unit 185 according to the present embodiment. Using the evaluation results of the attribute prediction models produced by the attribute prediction model generation unit 180, the attribute prediction model selection unit 185 selects, for each prediction target attribute, the attribute prediction model to be used for predicting the attribute value. For each prediction target attribute, the attribute prediction model selection unit 185 generates a selection result including an attribute ID, dimension reduction parameter selection values, a selected prediction algorithm ID, and hyperparameter selection values.

The "attribute ID" is the identification information of the prediction target attribute corresponding to the entry. The "dimension reduction parameter selection value" includes the set value of each of the one or more dimension reduction parameters used for predicting the prediction target attribute corresponding to the entry. It corresponds to the dimension reduction parameter of the selected attribute prediction model in FIG. 4. The "selected prediction algorithm ID" is identification information that identifies the prediction algorithm used for predicting the prediction target attribute corresponding to the entry. It corresponds to the prediction algorithm ID of the selected attribute prediction model in FIG. 4. The "hyperparameter selection value" is the set value of each hyperparameter given to the prediction algorithm used for predicting the prediction target attribute corresponding to the entry. It corresponds to the hyperparameter of the selected attribute prediction model in FIG. 4.

By supplying the selection result of the attribute prediction model for each prediction target attribute to the attribute prediction unit 135, the attribute prediction model selection unit 185 can set, in the attribute prediction unit 135, the dimension reduction parameters, prediction algorithm, and hyperparameters to be used for predicting each prediction target attribute. The attribute prediction model selection unit 185 may also supply the prediction error and/or the evaluation result of the selected attribute prediction model to the attribute prediction unit 135.

FIG. 6 shows an attribute information acquisition flow in the attribute prediction device 110 according to the present embodiment. In S610, the attribute information acquisition unit 115 acquires the attribute information of a target person. In S620, the attribute information acquisition unit 115 writes the acquired attribute information to the attribute DB 122 via the attribute DB connection unit 120.

When the attribute DB 122 stores the known attribute value and the predicted value separately for the attribute to be written, the attribute information acquisition unit 115 may write the acquired attribute information to the attribute DB 122 as a known attribute value. When the attribute DB 122 and the attribute DB 167 store the known attribute value in the same representation format as the predicted value for the attribute to be written, the attribute information acquisition unit 115 converts the raw data of the acquired attribute information into the same representation format as the predicted value and stores it in the attribute DB 122.

FIG. 7 shows an attribute prediction flow in the attribute prediction device 110 according to the present embodiment. In step S710, the dimension reduction unit 125, similarly to the sampling unit 160, reduces the dimensions of the attribute DB 122 using a topic model such as latent Dirichlet allocation or probabilistic latent semantic analysis, as one example. Here, the dimension reduction unit 125 refers to the dimension reduction parameter selection values corresponding to the attribute prediction model selected for each prediction target attribute, and stores in the reduction DB 132 the dimension-reduced attributes that correspond to the dimension reduction parameter selection values specified for at least one prediction target attribute. The dimension reduction unit 125 also copies, as attributes of the reduction DB 132, those attributes in the attribute DB 122 that are used as explanatory variables in at least one attribute prediction model selected by the attribute prediction model selection unit 185.

The attribute prediction device 110 repeats the processing between S720 and S790 (S730 to S780) for each prediction target attribute.

In S730, the attribute prediction unit 135 predicts the attribute value of the prediction target attribute using the attribute prediction model that the attribute prediction model selection unit 185 selected for that attribute. For each target person, the attribute prediction unit 135 acquires, via the reduction DB connection unit 130, the attribute values of the attributes that correspond to the explanatory variables of the attribute prediction model from the attribute information stored in the reduction DB 132, and inputs these attribute values to the attribute prediction model to calculate the predicted value of the prediction target attribute. For a second attribute prediction model that predicts the value of a second prediction target attribute using the predicted value of a first prediction target attribute produced by a first attribute prediction model, the attribute prediction unit 135 inputs the predicted value of the first prediction target attribute to the second attribute prediction model as the attribute value of that explanatory variable, and uses the second attribute prediction model to further predict the second prediction target attribute for each of the plurality of target persons.

When the known attribute value and the predicted value of the first prediction target attribute are expressed in the same representation format, or when one can be converted into the other, the attribute prediction unit 135 may, for each of the plurality of target persons, predict the value of the second prediction target attribute using the known attribute value if the attribute value of the first prediction target attribute is known, and using the predicted value of the first prediction target attribute if its attribute value is unknown. In this way, the attribute prediction unit 135 can predict the second prediction target attribute with higher accuracy, using the known attribute value, for target persons whose first prediction target attribute is known, while still making predictions for target persons whose first prediction target attribute is unknown.
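
The choice between the known and the predicted value of the first prediction target attribute can be sketched as a simple fallback. The record layout and the function name explanatory_value are assumptions made for the example, and the sketch assumes both values already share one representation format.

```python
from typing import Dict, Optional


def explanatory_value(known: Optional[float], predicted: Optional[float]) -> Optional[float]:
    """Prefer the known attribute value; fall back to the predicted value if unknown."""
    return known if known is not None else predicted


# Hypothetical first-attribute values for two target persons (None = unknown).
persons: Dict[str, Dict[str, Optional[float]]] = {
    "p1": {"known": 1.0, "predicted": 0.4},
    "p2": {"known": None, "predicted": 0.7},
}
inputs = {
    pid: explanatory_value(values["known"], values["predicted"])
    for pid, values in persons.items()
}
```

The resulting inputs would then be passed to the second attribute prediction model as the value of the first prediction target attribute.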

In S740, the attribute prediction unit 135 stores the predicted value of the prediction target attribute in the attribute DB 122 via the attribute DB connection unit 120. When the predicted value of the prediction target attribute could be calculated, the attribute prediction unit 135 may store the predicted value in the attribute DB 122 even if a known attribute value of the prediction target attribute is already stored there.

In S750, when the attribute value of the prediction target attribute is known for one or more of the target persons registered in the attribute DB 122, the attribute prediction value update unit 142 determines whether the predicted value of the prediction target attribute deviates from the known attribute value by a criterion or more. This criterion may be predetermined by the designer, system builder, user, and/or administrator of the system 100, or the attribute prediction value update unit 142 may calculate it, for example by multiplying the variance or standard deviation of the known attribute values of the prediction target attribute by a constant.

When the predicted value of the prediction target attribute deviates from the known attribute value by the criterion or more (Yes in S750), the attribute prediction value update unit 142 may, in S760, update the predicted value of the prediction target attribute based on the known attribute value. That is, when the known attribute value and the predicted value of the prediction target attribute have the same representation format, the attribute prediction value update unit 142 may rewrite the predicted value with the known attribute value. When they have different representation formats, the attribute prediction value update unit 142 may convert the representation format of the known attribute value and rewrite the predicted value with the result. The criterion used to judge whether the values deviate may be set in advance by the designer, system builder, user, and/or administrator of the system 100.

In this way, for target persons whose attribute value of the prediction target attribute is known, the attribute prediction value update unit 142 can prevent other attribute prediction models that use that attribute as an explanatory variable from predicting other prediction target attributes with a badly wrong predicted value. For example, even if a target person is wrongly predicted to be "married", the attribute prediction value update unit 142 can update the predicted value to "unmarried" when the person is known to be unmarried, which improves the prediction accuracy of other attributes predicted with this attribute as an explanatory variable.
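
Steps S750 and S760 can be sketched as follows, assuming the known and predicted values share one numeric representation. The criterion is derived here from the standard deviation of the known values times a constant, which is one of the options the text mentions. The function names, the factor of 1.0, and the sample values are assumptions made for the example.

```python
from statistics import pstdev
from typing import Optional, Sequence


def deviation_criterion(known_values: Sequence[float], factor: float) -> float:
    """Criterion = constant factor times the standard deviation of the known values."""
    return factor * pstdev(known_values)


def reconcile_prediction(predicted: float, known: Optional[float], criterion: float) -> float:
    """Overwrite the predicted value with the known value when they deviate too far."""
    if known is not None and abs(predicted - known) >= criterion:
        return known
    return predicted


# Hypothetical binary attribute (1.0 = married, 0.0 = unmarried).
criterion = deviation_criterion([0.0, 0.0, 1.0, 1.0], factor=1.0)
corrected = reconcile_prediction(predicted=0.9, known=0.0, criterion=criterion)
unchanged = reconcile_prediction(predicted=0.3, known=0.0, criterion=criterion)
no_known = reconcile_prediction(predicted=0.9, known=None, criterion=criterion)
```

In the first call the person is known to be unmarried but was predicted as almost certainly married, so the predicted value is replaced by the known value before other models use it as an explanatory variable.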

The attribute value update unit 140 may also update the attribute value of the prediction target attribute based on its predicted value (S780), on condition that the certainty of the prediction is equal to or greater than a threshold (Yes in S770). When the known attribute value and the predicted value of the prediction target attribute have the same representation format, the attribute value update unit 140 may rewrite the known attribute value with the predicted value. When they have different representation formats, the attribute value update unit 140 may convert the representation format of the predicted value and rewrite the attribute value with the result. The threshold used to judge the certainty of the prediction may be set in advance by the designer, system builder, user, and/or administrator of the system 100.

In this way, even for target persons whose attribute value of the prediction target attribute is known, the attribute value update unit 140 can replace the known attribute value based on the predicted value once the predicted value has become sufficiently certain, adapting to circumstances such as an attribute value that was obtained long ago or a change in the target person's situation. As a result, it is possible to avoid, for example, leaving a target person who has started buying diapers with the "no children" attribute, or leaving a target person who has started buying gasoline frequently with the "no car" attribute.
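
The update in S770 and S780 reduces to a thresholded overwrite, sketched below. The function name, the label strings, and the certainty and threshold values are hypothetical, and the sketch assumes the known and predicted values share one representation format.

```python
def refresh_known_value(known: str, predicted: str, certainty: float, threshold: float) -> str:
    """Replace the stored known value only when the prediction is certain enough."""
    if certainty >= threshold:
        return predicted
    return known


# Hypothetical cases: a person who started buying diapers, and a person
# whose car-ownership prediction is still too uncertain to act on.
updated = refresh_known_value("no_children", "has_children", certainty=0.93, threshold=0.9)
kept = refresh_known_value("no_car", "has_car", certainty=0.6, threshold=0.9)
```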

Here, as the certainty of prediction, the attribute value update unit 140 may use the AUC (Area Under the Curve) based on the predicted values of all target persons whose prediction target attribute is known, all target persons included in the learning attribute information, or all target persons included in the learning attribute information for cross-validation. As an example, the attribute value update unit 140 arranges the target persons whose prediction target attribute is known along the horizontal axis in descending order of predicted value and plots, for each target person on the horizontal axis, the proportion of target persons satisfying the prediction target attribute among all target persons whose predicted value is equal to or higher than that person's. It then calculates the area under the resulting ROC curve (Receiver Operating Characteristic curve) as the AUC. In doing so, the attribute value update unit 140 may also normalize the horizontal axis of the ROC curve to 1, so that the maximum value the AUC can take is normalized to 1. Alternatively, the attribute value update unit 140 may use, as the certainty of prediction, the value of a parameter that increases as the prediction error decreases, such as (constant - prediction error).
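As a rough illustration of an AUC normalized to a maximum of 1, the sketch below computes the conventional ROC AUC as the fraction of (positive, negative) pairs that the predicted values rank correctly. The curve construction in the text is worded somewhat differently, so treating it as the conventional ROC AUC is an assumption made here; the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Conventional ROC AUC via pairwise ranking (ties count as half).

    labels: truthy for target persons known to satisfy the attribute.
    scores: predicted values for the same target persons.
    The result lies in [0, 1]; 1 means every positive outranks every negative.
    """
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The pairwise form is quadratic in the number of target persons; it is chosen here for readability rather than speed.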

FIG. 8 shows an example of dependencies among prediction target attributes. The attribute prediction model A810 predicts the attribute value of prediction target attribute c based on the attribute values of attribute a and attribute b. The attribute prediction model B820 predicts the attribute value of prediction target attribute e based on the attribute values of attribute c and attribute d. The attribute prediction model C830 predicts the attribute value of prediction target attribute b based on the attribute values of attribute e and attribute f.

In this example, the attribute prediction model A810 takes the attribute value of attribute b as input and outputs the attribute value of attribute c, the attribute prediction model B820 takes the attribute value of attribute c as input and outputs the attribute value of attribute e, and the attribute prediction model C830 takes the attribute value of attribute e as input and outputs the attribute value of attribute b. Consequently, the chain of explanatory variables and objective variables across the attribute prediction models contains a cyclic dependency among two or more prediction target attributes: attribute b → attribute c → attribute e → attribute b.
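One way to detect such a cycle is a depth-first search over the graph whose edges run from each model's input attributes to its output attribute. The sketch below is illustrative only; the dictionary layout and the function name `find_cycle` are assumptions, not part of the specification.

```python
def find_cycle(models):
    """Return a list of attributes forming a cyclic dependency, or None.

    models: dict mapping a model name to (input_attributes, output_attribute).
    """
    graph = {}
    for inputs, output in models.values():
        for attr in inputs:
            graph.setdefault(attr, set()).add(output)

    state = {}  # attribute -> "active" (on the current path) or "done"
    path = []

    def visit(attr):
        state[attr] = "active"
        path.append(attr)
        for nxt in sorted(graph.get(attr, ())):
            if state.get(nxt) == "active":
                # nxt is already on the current path, so the path loops back.
                return path[path.index(nxt):]
            if nxt not in state:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        state[attr] = "done"
        return None

    for attr in sorted(graph):
        if attr not in state:
            cycle = visit(attr)
            if cycle:
                return cycle
    return None


# The three models of FIG. 8.
fig8 = {
    "A810": (["a", "b"], "c"),
    "B820": (["c", "d"], "e"),
    "C830": (["e", "f"], "b"),
}
```

Applied to `fig8`, the search reports the attributes b, c, and e as forming the cycle; removing any one of the three models makes it return `None`.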

Here, the cyclic dependency among prediction target attributes can be eliminated if the attribute prediction model generation unit 180, the attribute prediction model selection unit 185, and/or the attribute prediction unit 135 arrange, for example, for the attribute prediction model B820 to use the known attribute value of attribute c as an explanatory variable and not the predicted value of attribute c. The attribute prediction model B820 then no longer takes as input the predicted value of attribute c output by the attribute prediction model A810. That is, when a cyclic dependency of input/output relationships to attribute prediction models exists among two or more prediction target attributes, the attribute prediction model generation unit 180, the attribute prediction model selection unit 185, and/or the attribute prediction unit 135 may generate or select attribute prediction models such that at least one attribute prediction model in the cycle takes a known attribute value as input for an attribute included in the cycle.

However, when known attribute values are used as input, the attribute prediction unit 135 can no longer predict the attribute value of the prediction target attribute for target persons who have no known attribute value, or its prediction accuracy for them deteriorates. The attribute prediction unit 135 may therefore also adopt a configuration that permits cyclic dependencies among two or more prediction target attributes.

In this case, in response to a cyclic dependency existing among two or more prediction target attributes, the attribute prediction unit 135 may determine the prediction order of those attributes based on at least one of (i) the certainty of prediction of the other prediction target attributes used to predict each of them and (ii) the degree of contribution of the other prediction target attributes to each of them. For example, if in the example of this figure the certainty of prediction is highest for attribute c, followed by attribute b and then attribute e, the attribute prediction unit 135 may determine the prediction order so that the attribute prediction model B820, which takes the most certain attribute c as input, is executed first with the highest priority. The attribute prediction unit 135 may then execute the remaining attribute prediction models in descending order of the certainty of prediction of the cyclically dependent attribute that each takes as input. Alternatively, once the attribute prediction model B820 has been executed, the attribute prediction unit 135 may execute the subsequent models along the direction of dependency among the prediction target attributes: first the attribute prediction model C830, which takes as input the predicted value output by the attribute prediction model B820, and then the attribute prediction model A810, which takes as input the predicted value output by the attribute prediction model C830. In this way, the attribute prediction unit 135 can predict each dependent prediction target attribute while giving priority to predicted values that have a high certainty of prediction and are therefore assumed to be closer to the true attribute values, which makes it possible to raise the accuracy of the predicted values sooner.
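The certainty-based ordering can be sketched as a simple sort. The certainty figures below are made-up values chosen only to reproduce the ordering c, b, e from the example; the function name and data layout are likewise hypothetical.

```python
def order_by_certainty(cyclic_input, certainty):
    """Order models so that the one whose cyclic input attribute has the
    highest certainty of prediction runs first.

    cyclic_input: model name -> the cyclically dependent attribute it takes as input.
    certainty: attribute -> certainty of prediction (e.g. an AUC).
    """
    return sorted(cyclic_input, key=lambda m: certainty[cyclic_input[m]], reverse=True)


cyclic_input = {"A810": "b", "B820": "c", "C830": "e"}
certainty = {"c": 0.9, "b": 0.8, "e": 0.7}  # hypothetical values
order = order_by_certainty(cyclic_input, certainty)
```

With these values `order` starts with B820, matching the example in which the model that takes attribute c as input is executed first.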

As another example, suppose that in the example of this figure the contribution of attribute c to the prediction of prediction target attribute e, the contribution of attribute b to the prediction of prediction target attribute c, and the contribution of attribute e to the prediction of prediction target attribute b decrease in this order. The attribute prediction unit 135 may then determine the prediction order so that the attribute prediction model C830, which takes the least-contributing attribute e as input, is executed first with the highest priority. The attribute prediction unit 135 may then execute the remaining attribute prediction models in ascending order of the contribution that the cyclically dependent input attribute makes to the prediction of the next prediction target attribute. Alternatively, once the attribute prediction model C830 has been executed, the attribute prediction unit 135 may execute the subsequent models along the direction of dependency among the prediction target attributes: first the attribute prediction model A810, which takes as input the predicted value output by the attribute prediction model C830, and then the attribute prediction model B820, which takes as input the predicted value output by the attribute prediction model A810. In this way, the attribute prediction unit 135 can predict each dependent prediction target attribute while giving priority to predicted values that contribute little to the prediction of the downstream attribute prediction model and are therefore less likely to propagate the effect of the cyclic dependency, which makes it possible to attenuate that effect sooner.

The attribute prediction unit 135 may determine the prediction order of the two or more prediction target attributes based on both the certainty of prediction of the other prediction target attributes used to predict each of them and the degree of contribution of the other prediction target attributes to each of them, and may additionally take other conditions into account. Here, for an attribute included in the cyclic dependency, the attribute prediction unit 135 gives higher priority to executing the attribute prediction model that takes that attribute as input when the certainty of prediction of the attribute is higher or when the contribution of the attribute to the prediction is lower. As an example, the attribute prediction unit 135 may determine the priority from a weighted sum of the certainty (or uncertainty) of prediction of the attribute used for prediction and the contribution of that attribute to the prediction.

As another example, the attribute prediction unit 135 may determine the prediction order of the predicted values of the two or more prediction target attributes based, for each of them, on the sum of products of the uncertainty of prediction of the other prediction target attributes and the contribution of those attributes. That is, for each of the attribute prediction models (the attribute prediction model A810, the attribute prediction model B820, the attribute prediction model C830, and so on), the attribute prediction unit 135 may, for example, take the sum over the input attributes included in the cyclic dependency of each attribute's uncertainty multiplied by its contribution, and give higher execution priority to attribute prediction models with a smaller sum.

As the uncertainty of prediction, the attribute prediction unit 135 may use the value of a parameter that decreases as the certainty of prediction increases. For example, the attribute prediction unit 135 may use a value obtained by subtracting from 1 a certainty of prediction based on an AUC normalized to a maximum of 1, such as (1 - AUC).
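The sum-of-products priority with (1 - AUC) as the uncertainty can be sketched as below. The AUC and contribution numbers are invented for illustration, and the names `product_sum_order`, `auc`, and `contribution` are hypothetical.

```python
def product_sum_order(cyclic_inputs, auc, contribution):
    """Order models by ascending sum of (1 - AUC) * contribution over the
    cyclically dependent attributes each model takes as input.

    cyclic_inputs: model name -> list of cyclic input attributes.
    auc: attribute -> AUC normalized to a maximum of 1.
    contribution: (model name, attribute) -> contribution of that input.
    """
    def score(model):
        return sum((1.0 - auc[a]) * contribution[(model, a)] for a in cyclic_inputs[model])

    return sorted(cyclic_inputs, key=score)


cyclic_inputs = {"A810": ["b"], "B820": ["c"], "C830": ["e"]}
auc = {"b": 0.8, "c": 0.9, "e": 0.7}  # hypothetical values
contribution = {("A810", "b"): 0.5, ("B820", "c"): 0.6, ("C830", "e"): 0.1}  # hypothetical
priority = product_sum_order(cyclic_inputs, auc, contribution)
```

Here C830 comes first even though its input attribute e is the least certain, because that input contributes so little to the prediction that its uncertainty has little downstream effect.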

As described above, by putting the prediction order of the prediction target attributes included in a cyclic dependency into a more appropriate order, the attribute prediction unit 135 can raise the accuracy of the cyclically computed predicted values of the two or more prediction target attributes sooner and/or make those predicted values converge sooner. Thus, even when the execution cycle of each attribute prediction model becomes long because the attribute DB 122 and the attribute DB 167 hold many attributes, the predicted values can be optimized sooner.

FIG. 9 shows an attribute addition flow in the system 100 according to the present embodiment. In S900, the attribute data acquisition unit 144 acquires, for at least some of the plurality of target persons, known information indicating known attribute values of an additional attribute to be added to the attribute database. For example, so that end users of the system 100 such as marketers can easily predict an arbitrary characteristic of each target person, the terminal 112 provides a user interface for easily adding an attribute to the attribute DB 122 and calculating the predicted value of that attribute for each target person. As an example, in response to known information being dragged and dropped onto the icon of the application that provides this user interface, the terminal 112 may provide a user interface that performs at least part of the following: adding an attribute corresponding to the known information, setting the known attribute values, and generating and selecting attribute prediction models.

The known information acquired by the attribute data acquisition unit 144 may include, for each of at least some of the plurality of target persons, a pair consisting of personal identification information and the known attribute values of one or more additional attributes. When the additional attribute is a binary attribute indicating whether or not a target person has a particular characteristic (for example, "whether or not the person bought product A"), the known information may include at least one of a list of the personal identification information of each target person for whom the additional attribute is true and a list of the personal identification information of each target person for whom the additional attribute is false.
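For a binary additional attribute, turning the true and false ID lists into per-person known values might look like the following sketch. The function name and the use of `None` for "not known" are assumptions made for illustration.

```python
def binary_values_from_lists(all_ids, true_ids=(), false_ids=()):
    """Map each person ID to True, False, or None (value not known).

    Either list may be omitted, since the known information needs to
    contain only at least one of the two lists.
    """
    true_set, false_set = set(true_ids), set(false_ids)
    overlap = true_set & false_set
    if overlap:
        raise ValueError(f"IDs listed as both true and false: {sorted(overlap)}")
    values = {}
    for pid in all_ids:
        if pid in true_set:
            values[pid] = True
        elif pid in false_set:
            values[pid] = False
        else:
            values[pid] = None  # left for the attribute prediction model to predict
    return values
```

Target persons mapped to `None` are the ones whose attribute value would later be filled in by prediction.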

In S910, the attribute addition unit 146 adds the additional attribute to the plurality of attributes of the attribute DB 122. For example, the attribute addition unit 146 adds a column for the additional attribute to the attribute DB 122 via the attribute DB connection unit 120 and, based on the known attribute value of each target person included in the known information, sets the known attribute values to be stored for the additional attribute in the attribute DB 122. In doing so, the attribute addition unit 146 may set the known attribute values given in the known information as they are, or may convert them into the representation format used in the attribute DB 122 before setting them as the known attribute values of the additional attribute.
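As a minimal sketch of S910, assuming for illustration that the attribute DB is a single SQLite table, adding the column and storing the known values could look like this. The table name `attributes`, the key column `person_id`, and the attribute name are hypothetical; the specification does not say how the attribute DB 122 is implemented.

```python
import sqlite3


def add_attribute(conn, attribute, known_values, convert=lambda v: v):
    """Add a column for `attribute` and store the known values in it.

    `convert` maps a known value into the representation used in the table.
    Rows without a known value keep NULL in the new column.
    """
    if not attribute.isidentifier():
        raise ValueError("attribute name must be a plain identifier")
    conn.execute(f'ALTER TABLE attributes ADD COLUMN "{attribute}"')
    conn.executemany(
        f'UPDATE attributes SET "{attribute}" = ? WHERE person_id = ?',
        [(convert(value), pid) for pid, value in known_values.items()],
    )
    conn.commit()


# Usage with an in-memory database standing in for the attribute DB.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attributes (person_id TEXT PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO attributes VALUES (?, ?)", [("p1", 34), ("p2", 51)])
add_attribute(conn, "bought_product_a", {"p1": True}, convert=int)
rows = conn.execute(
    "SELECT person_id, bought_product_a FROM attributes ORDER BY person_id"
).fetchall()
```

The identifier check is there because a column name cannot be passed as a bound SQL parameter and therefore has to be validated before being placed in the statement text.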

In S920, the prediction model generation device 150 in the system 100 generates one or more attribute prediction models with the additional attribute as the prediction target attribute. Here, in response to the additional attribute having been added to the attribute DB 122 by the attribute data acquisition unit 144, the model update instruction unit 155 may instruct generation of the attribute prediction models for the additional attribute promptly, without waiting for a predetermined period to elapse. The processing for generating one or more attribute prediction models with the additional attribute as the prediction target attribute may be the same as the processing of S320 to S350 in the operation flow of FIG. 3.

In S930, the prediction model generation device 150 in the system 100 selects, based on the prediction error of each of the plurality of attribute prediction models, the attribute prediction model to be used for predicting the attribute value of the prediction target attribute, that is, the additional attribute. The processing for selecting this attribute prediction model from among the plurality of attribute prediction models may be the same as the processing of S360 to S370 in FIG. 3.
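In its simplest reading, selection based on prediction error picks the candidate with the smallest error. The sketch below assumes exactly that; the candidate names and error values are hypothetical, and the actual criterion of S360 to S370 is not described in this section.

```python
def select_model(prediction_errors):
    """Return the name of the candidate model with the smallest prediction error.

    prediction_errors: model name -> prediction error measured on held-out data.
    """
    if not prediction_errors:
        raise ValueError("no candidate models")
    return min(prediction_errors, key=prediction_errors.get)


candidates = {"model_1": 0.21, "model_2": 0.17, "model_3": 0.25}  # hypothetical
chosen = select_model(candidates)
```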

After the processing described above, the attribute prediction device 110 in the system 100 may execute the attribute prediction flow shown in FIG. 7 to predict the attribute value of the additional attribute for each target person or for all target persons. In this way, the system 100 can train an attribute prediction model that predicts the attribute value of the additional attribute, using as teacher data the known attribute values of the subset of target persons included in the known information, and can calculate predicted values of the additional attribute for the other target persons or for all target persons.

In the present embodiment, the attribute data acquisition unit 144 and the attribute addition unit 146 are provided in the attribute prediction device 110 and add the additional attribute to the attribute DB 122, which is the master. Alternatively, the attribute data acquisition unit 144 and the attribute addition unit 146 may be provided in the prediction model generation device 150 and add the additional attribute to the attribute DB 167, which is a subset.

FIG. 10 shows an example of a computer 1900 in which aspects of the present invention may be embodied in whole or in part. A program installed on the computer 1900 can cause the computer 1900 to function as operations associated with a device according to an embodiment of the present invention or as one or more sections of that device, or to execute those operations or sections, and/or can cause the computer 1900 to execute a process according to an embodiment of the present invention or steps of that process. Such a program may be executed by the CPU 2000 to cause the computer 1900 to perform specific operations associated with some or all of the blocks of the flowcharts and block diagrams described herein.

The computer 1900 according to the present embodiment includes a CPU peripheral section having a CPU 2000, a RAM 2020, a graphics controller 2075, and a display device 2080, which are interconnected by a host controller 2082; an input/output section having a communication interface 2030, a hard disk drive 2040, and a DVD drive 2060, which are connected to the host controller 2082 by an input/output controller 2084; and a legacy input/output section having a ROM 2010, a flash memory drive 2050, and an input/output chip 2070, which are connected to the input/output controller 2084.

The host controller 2082 connects the RAM 2020 with the CPU 2000 and the graphics controller 2075, which access the RAM 2020 at a high transfer rate. The CPU 2000 operates based on programs stored in the ROM 2010 and the RAM 2020 and controls each section. The graphics controller 2075 acquires image data that the CPU 2000 or the like generates in a frame buffer provided in the RAM 2020 and displays it on the display device 2080. Alternatively, the graphics controller 2075 may internally include a frame buffer for storing the image data generated by the CPU 2000 or the like.

The input/output controller 2084 connects the host controller 2082 with the communication interface 2030, the hard disk drive 2040, and the DVD drive 2060, which are relatively high-speed input/output devices. The communication interface 2030 communicates with other devices over a network by wire or wirelessly, and functions as the hardware that performs communication. The hard disk drive 2040 stores programs and data used by the CPU 2000 in the computer 1900. The DVD drive 2060 reads programs or data from a DVD 2095 and provides them to the hard disk drive 2040 via the RAM 2020.

The ROM 2010 and the relatively low-speed input/output devices, namely the flash memory drive 2050 and the input/output chip 2070, are also connected to the input/output controller 2084. The ROM 2010 stores a boot program that the computer 1900 executes at startup and/or programs that depend on the hardware of the computer 1900. The flash memory drive 2050 reads programs or data from a flash memory 2090 and provides them to the hard disk drive 2040 via the RAM 2020. The input/output chip 2070 connects the flash memory drive 2050 to the input/output controller 2084 and also connects various input/output devices to the input/output controller 2084 via, for example, a parallel port, a serial port, a keyboard port, or a mouse port.

A program provided to the hard disk drive 2040 via the RAM 2020 is stored on a recording medium such as the flash memory 2090, the DVD 2095, or an IC card and supplied by a user. The program is read from the recording medium, installed on the hard disk drive 2040 in the computer 1900 via the RAM 2020, and executed by the CPU 2000. The information processing described in these programs is read by the computer 1900 and brings about cooperation between the software and the various types of hardware resources described above. A device or method may be constituted by realizing the operation or processing of information in accordance with the use of the computer 1900.

As an example, when communication is performed between the computer 1900 and an external device or the like, the CPU 2000 executes a communication program loaded on the RAM 2020 and, based on the processing described in the communication program, instructs the communication interface 2030 to perform communication processing. Under the control of the CPU 2000, the communication interface 2030 reads transmission data stored in a transmission buffer area or the like provided on a storage device such as the RAM 2020, the hard disk drive 2040, the flash memory 2090, or the DVD 2095 and transmits it to the network, or writes reception data received from the network into a reception buffer area or the like provided on the storage device. In this way, the communication interface 2030 may transfer transmission and reception data to and from the storage device by DMA (direct memory access). Alternatively, the CPU 2000 may transfer the data by reading it from the source storage device or communication interface 2030 and writing it to the destination communication interface 2030 or storage device.

The CPU 2000 also causes all or the necessary part of the files, databases, or the like stored in external storage devices such as the hard disk drive 2040, the DVD drive 2060 (DVD 2095), and the flash memory drive 2050 (flash memory 2090) to be read into the RAM 2020 by DMA transfer or the like, and performs various kinds of processing on the data in the RAM 2020. The CPU 2000 then writes the processed data back to the external storage device by DMA transfer or the like. Because the RAM 2020 can be regarded as temporarily holding the contents of the external storage device in such processing, the RAM 2020, the external storage devices, and the like are collectively referred to in the present embodiment as a memory, a storage unit, a storage device, or the like.

The various kinds of information in the present embodiment, such as programs, data, tables, and databases, are stored on such a storage device and subjected to information processing. The CPU 2000 can also hold part of the RAM 2020 in a cache memory and read from and write to the cache memory. Because the cache memory performs part of the function of the RAM 2020 in this form as well, in the present embodiment the cache memory is regarded as included in the RAM 2020, the memory, and/or the storage device, except where it is distinguished explicitly.

The CPU 2000 also performs, on data read from the RAM 2020, the various kinds of processing specified by the instruction sequence of the program, including the various operations, information processing, condition judgments, and information search and replacement described in the present embodiment, and writes the results back to the RAM 2020. For example, when making a condition judgment, the CPU 2000 determines whether a variable described in the present embodiment satisfies a condition such as being greater than, less than, greater than or equal to, less than or equal to, or equal to another variable or constant, and, if the condition is satisfied (or not satisfied), branches to a different instruction sequence or calls a subroutine.

The CPU 2000 can also search for information stored in files, databases, or the like in the storage device. For example, when the storage device stores a plurality of entries in each of which an attribute value of a second attribute is associated with an attribute value of a first attribute, the CPU 2000 can search those entries for an entry whose attribute value of the first attribute matches a specified condition and read the attribute value of the second attribute stored in that entry, thereby obtaining the attribute value of the second attribute associated with a first attribute that satisfies the predetermined condition.
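The entry search described above amounts to filtering on the first attribute and reading out the second. A minimal in-memory sketch follows; the function name and the sample entries are hypothetical.

```python
def lookup_second_attribute(entries, condition):
    """Return the second-attribute values of all entries whose first-attribute
    value satisfies `condition`.

    entries: iterable of (first_attribute_value, second_attribute_value) pairs.
    condition: predicate applied to the first-attribute value.
    """
    return [second for first, second in entries if condition(first)]


entries = [(25, "no_car"), (41, "has_car"), (58, "has_car")]  # hypothetical (age, car) pairs
over_40 = lookup_second_attribute(entries, lambda age: age >= 40)
```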

Further, when a plurality of elements are listed in the description of the embodiment, elements other than those listed may also be used. For example, where it is stated that "X executes Y using A, B, and C", X may execute Y using D in addition to A, B, and C.

The present invention has been described above using an embodiment, but the technical scope of the present invention is not limited to the scope described in the above embodiment. It is apparent to those skilled in the art that various changes or improvements can be made to the above embodiment. It is apparent from the claims that forms to which such changes or improvements are made can also be included in the technical scope of the present invention.

It should be noted that the operations, procedures, steps, stages, and other processes of the devices, systems, programs, and methods shown in the claims, the specification, and the drawings can be executed in any order unless the order is explicitly indicated by "before", "prior to", or the like, and unless the output of an earlier process is used in a later process. Even if an operation flow in the claims, the specification, or the drawings is described using "first", "next", and the like for convenience, this does not mean that the processes must be carried out in that order.

100 System
110 Attribute prediction device
112 Terminal
115 Attribute information acquisition unit
120 Attribute DB connection unit
122 Attribute DB
125 Dimension reduction unit
130 Contraction DB connection unit
132 Contraction DB
135 Attribute prediction unit
140 Attribute value update unit
142 Attribute predicted value update unit
144 Attribute data acquisition unit
146 Attribute addition unit
150 Prediction model generation device
152 Terminal
155 Model update instruction unit
160 Sampling unit
165 Attribute DB connection unit
167 Attribute DB
170 Dimension reduction unit
175 Contraction DB connection unit
177 Contraction DB
180 Attribute prediction model generation unit
185 Attribute prediction model selection unit
190 Recommendation processing device
192 Terminal
810 Attribute prediction model A
820 Attribute prediction model B
830 Attribute prediction model C
1900 Computer
2000 CPU
2010 ROM
2020 RAM
2030 Communication interface
2040 Hard disk drive
2050 Flash memory drive
2060 DVD drive
2070 Input/output chip
2075 Graphics controller
2080 Display device
2082 Host controller
2084 Input/output controller
2090 Flash memory
2095 DVD

Claims

A device comprising:
an attribute database connection unit connected to an attribute database that stores, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes;
an attribute prediction model generation unit that uses the attribute database to generate a first plurality of attribute prediction models, each of which predicts an attribute value of a first prediction target attribute, which is a prediction target, based on an attribute value of at least one attribute other than the first prediction target attribute among the plurality of attributes; and
an attribute prediction model selection unit that selects, based on a prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.
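The generation and selection recited above can be illustrated with a small sketch. It is offered only as one possible reading, under stated assumptions: the candidate models (a mean predictor and single-feature least-squares fits), the use of mean squared error on held-out target persons as the "prediction error", and every name in the code (`fit_mean`, `fit_linear`, `select_model`, the attributes `visits`, `age`, `spend`) are hypothetical and are not taken from the specification.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]          # attribute name -> attribute value for one person
Model = Callable[[Row], float]  # predicts the target attribute from a row


def fit_mean(train: List[Row], target: str) -> Model:
    """Candidate model: always predict the training mean of the target."""
    m = mean(r[target] for r in train)
    return lambda row: m


def fit_linear(train: List[Row], target: str, feature: str) -> Model:
    """Candidate model: least-squares line on a single other attribute."""
    xs = [r[feature] for r in train]
    ys = [r[target] for r in train]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    intercept = my - slope * mx
    return lambda row: slope * row[feature] + intercept


def mse(model: Model, rows: List[Row], target: str) -> float:
    """Prediction error, assumed here to be mean squared error."""
    return mean((model(r) - r[target]) ** 2 for r in rows)


def select_model(
    candidates: Dict[str, Model], validation: List[Row], target: str
) -> Tuple[str, float]:
    """Select the candidate with the smallest validation error."""
    errors = {name: mse(m, validation, target) for name, m in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]


# Synthetic attribute data: "spend" depends on "visits"; "age" is unrelated.
rows = [
    {"visits": float(i), "age": 20.0 + (i * 5) % 7, "spend": 2.0 * i + 1.0}
    for i in range(20)
]
train, validation = rows[0::2], rows[1::2]

candidates = {
    "mean": fit_mean(train, "spend"),
    "linear_visits": fit_linear(train, "spend", "visits"),
    "linear_age": fit_linear(train, "spend", "age"),
}
best_name, best_error = select_model(candidates, validation, "spend")
```

Measuring the error on persons that were not used for fitting is a design choice of this sketch; it keeps a model from being selected merely because it memorised the training rows.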

The device according to claim 1, further comprising a sampling unit that samples a part of the plurality of target persons from the attribute database,
wherein the attribute prediction model generation unit generates the first plurality of attribute prediction models by using the attribute values associated with the sampled part of the target persons.

The device according to claim 1 or 2, further comprising a dimension reduction unit that reduces the dimensions of the plurality of attributes based on the plurality of attribute values of each of the plurality of target persons stored in the attribute database,
wherein the attribute prediction model generation unit generates the first plurality of attribute prediction models, each of which predicts the attribute value of the first prediction target attribute from an attribute value of at least one of the plurality of dimension-reduced attributes.

The device according to any one of claims 1 to 3, wherein the attribute prediction model selection unit is capable of selecting different attribute prediction models as the first attribute prediction model used for predicting the attribute value of the first prediction target attribute and as a second attribute prediction model used for predicting an attribute value of a second prediction target attribute.

The device according to any one of claims 1 to 4, wherein the attribute prediction model generation unit learns learnable parameters in each of the first plurality of attribute prediction models.

The device according to any one of claims 1 to 5, wherein each of the first plurality of attribute prediction models differs from the other attribute prediction models in at least one of a hyperparameter that is not updated by learning and a prediction algorithm.

The device according to any one of claims 1 to 6, further comprising an attribute prediction unit that predicts the attribute value of the first prediction target attribute for each of the plurality of target persons using the first attribute prediction model.

The device according to claim 7, further comprising an attribute predicted value update unit that, when the attribute value of the first prediction target attribute is known for one of the plurality of target persons, updates the predicted value of the first prediction target attribute based on the known attribute value in response to the predicted value deviating from the known attribute value by a reference amount or more.
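The update condition recited above can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: the deviation is assumed to be the absolute difference, "based on the known attribute value" is assumed to mean replacing the predicted value with the known value, and the function name `update_predicted_value` and parameter `reference` are hypothetical.

```python
from typing import Optional


def update_predicted_value(
    predicted: float, known: Optional[float], reference: float
) -> float:
    """Return the predicted value, corrected when a known value exists and
    the prediction deviates from it by `reference` or more.

    Assumption: the correction simply adopts the known value.
    """
    if known is not None and abs(predicted - known) >= reference:
        return known
    return predicted


# Known value exists and the deviation (0.7) reaches the reference (0.5).
corrected = update_predicted_value(predicted=0.2, known=0.9, reference=0.5)

# Known value exists but the deviation (0.1) is below the reference.
unchanged = update_predicted_value(predicted=0.8, known=0.9, reference=0.5)

# No known value for this target person, so the prediction is kept.
kept = update_predicted_value(predicted=0.2, known=None, reference=0.5)
```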

The device according to claim 7 or 8, wherein:
the attribute prediction model generation unit generates a second plurality of attribute prediction models, each of which predicts an attribute value of a second prediction target attribute, which is a prediction target, using the predicted value of the first prediction target attribute;
the attribute prediction model selection unit further selects, based on a prediction error of each of the second plurality of attribute prediction models, a second attribute prediction model to be used for predicting the attribute value of the second prediction target attribute; and
the attribute prediction unit further predicts the second prediction target attribute for each of the plurality of target persons by using the second attribute prediction model.

The device according to claim 9, wherein, for each of the plurality of target persons, the attribute prediction unit predicts the predicted value of the second prediction target attribute of the target person using the known attribute value on condition that the attribute value of the first prediction target attribute is known, and predicts the predicted value of the second prediction target attribute of the target person using the predicted value of the first prediction target attribute on condition that the attribute value of the first prediction target attribute is unknown.

The device according to claim 7, further comprising an attribute value update unit that updates the attribute value of the first prediction target attribute based on the predicted value of the first prediction target attribute, on condition that the prediction accuracy of the predicted value of the first prediction target attribute is equal to or greater than a threshold value.

The device according to any one of claims 7 to 11, wherein, in response to the existence of a circular dependency among two or more prediction target attributes, the attribute prediction unit determines a prediction order of the two or more prediction target attributes based on at least one of the certainty of prediction of the other prediction target attributes used for the prediction of each of the two or more prediction target attributes and the contribution of the other prediction target attributes to each of the two or more prediction target attributes.

The device according to claim 12, wherein the attribute prediction unit determines the prediction order of the predicted values of the two or more prediction target attributes based on, for each of the two or more prediction target attributes, the sum of the uncertainties of prediction of the other prediction target attributes and the contributions of the other prediction target attributes.

The device according to any one of claims 7 to 13, wherein the first prediction target attribute is a preference attribute indicating a target person's preference for a product or service associated with the first prediction target attribute.

The device according to claim 14, further comprising a recommendation processing unit that selects, based on the attribute value of the first prediction target attribute, whether or not to recommend the product or service associated with the first prediction target attribute to the target person.

The device according to any one of claims 1 to 15, further comprising a model update instruction unit that instructs an update of the first attribute prediction model in response to the elapse of a predetermined period.

The device according to any one of claims 1 to 16, further comprising:
a known information acquisition unit that acquires known information indicating known attribute values, for at least a part of the plurality of target persons, of an additional attribute to be added to the attribute database; and
an attribute addition unit that adds the additional attribute to the plurality of attributes in the attribute database,
wherein the attribute prediction model generation unit generates a plurality of attribute prediction models using the additional attribute as a prediction target attribute, and
the attribute prediction model selection unit selects, based on a prediction error of each of the plurality of attribute prediction models, an attribute prediction model to be used for predicting the attribute value of the prediction target attribute.

The device according to claim 17, wherein the known information indicates the presence or absence of the additional attribute for each of at least a part of the plurality of target persons.

A method comprising:
an attribute prediction model generation step in which a computer, using an attribute database that stores, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes, generates a first plurality of attribute prediction models, each of which predicts an attribute value of a first prediction target attribute, which is a prediction target, based on an attribute value of at least one attribute other than the first prediction target attribute among the plurality of attributes; and
an attribute prediction model selection step in which the computer selects, based on a prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.

A program that is executed by a computer and causes the computer to function as:
an attribute database connection unit connected to an attribute database that stores, for each of a plurality of target persons, a plurality of attribute values corresponding to a plurality of attributes;
an attribute prediction model generation unit that uses the attribute database to generate a first plurality of attribute prediction models, each of which predicts an attribute value of a first prediction target attribute, which is a prediction target, based on an attribute value of at least one attribute other than the first prediction target attribute among the plurality of attributes; and
an attribute prediction model selection unit that selects, based on a prediction error of each of the first plurality of attribute prediction models, a first attribute prediction model to be used for predicting the attribute value of the first prediction target attribute.