JP6535564B2

JP6535564B2 - INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Info

Publication number: JP6535564B2
Application number: JP2015191937A
Authority: JP
Inventors: 佐藤 新; 後藤 大樹; 森本 俊彦
Original assignee: NTT Data Corp
Current assignee: NTT Data Corp
Priority date: 2015-09-29
Filing date: 2015-09-29
Publication date: 2019-06-26
Anticipated expiration: 2035-09-29
Also published as: JP2017068483A

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

There are techniques that, in response to an analysis request from an analyst, analyze users' behavioral tendencies using data indicating user attribute information, such as the behavior histories and preference information of users (for example, website visitors). A behavior history is, for example, a website browsing history or a history of product purchases made through a website. There are also techniques that, based on the results of such an analysis, present the analyst with an analysis result showing the behavioral tendencies of a group of users who have specific user attribute information. Based on the analysis result obtained, the analyst then carries out the tasks that are the purpose of the analysis, for example selecting the users to whom an advertisement is to be delivered.
For example, the product purchase data processing system described in Patent Document 1 analyzes purchasing tendencies according to customer attributes from the purchase histories of customers who have specific attributes.

Patent Document 1: JP 2001-216369 A

Decision tree analysis is sometimes used as a method for analyzing users' behavioral tendencies. Decision tree analysis computes a decision tree, expressed as tree-structured data, from data indicating the user attribute information, behavior histories, and the like of the users to be analyzed. A decision tree is a tree structure made up of nodes, which represent classifications, and rules (branches), which represent the set of features leading to each classification. Decision tree analysis classifies users by computing a decision tree, and can visualize and present analysis result data showing the behavioral tendencies of each class of users. The results of such an analysis are sometimes used to support decisions about what services to provide to users and how to provide them.

The analyst selects nodes in the computed decision tree that the analyst judges to be characteristic, and carries out the task that is the purpose of the analysis (for example, advertisement delivery) for the group of users belonging to the classification indicated by each selected node. The nodes are preferably selected so that the effect on that task is large. The effect is, for example, acquiring new users, raising the revisit rate of existing users, or increasing sales.
However, when data indicating the user attribute information, behavior histories, and the like of the users to be analyzed is used for analysis, it is not easy to know which period of data will produce a large effect on the task that is the purpose of the analysis. For example, whether an analysis using the most recent month of data or one using the most recent year of data is more effective depends on the task that is the purpose of the analysis and on the industry being analyzed.

The present invention has been made in view of the above, and provides an information processing apparatus, an information processing method, and a program capable of presenting, from among a plurality of analysis results with different analysis target periods, an analysis result in which a feature appears more strongly.

(1) The present invention has been made to solve the above problem. One aspect of the present invention is an information processing apparatus comprising: a decision tree calculation unit that acquires data indicating user attributes of each of a plurality of users and calculates, for each of a plurality of different periods, a decision tree that classifies each user into one of a plurality of nodes based on the user attributes in the data corresponding to that period; a representative index calculation unit that calculates the value of a representative index, which is an index based on the number of users classified into each node; a verification evaluation unit that selects one of the plurality of decision trees calculated by the decision tree calculation unit, based on the representative index values calculated by the representative index calculation unit; and an output unit that outputs analysis result information including a decision tree calculated based on data indicating the decision tree selected by the verification evaluation unit and data indicating the representative index values calculated by the representative index calculation unit.

(2) In one aspect of the present invention, in the information processing apparatus according to (1), the verification evaluation unit calculates, for each of the decision trees for the plurality of different periods, the variance of the per-node representative index values calculated by the representative index calculation unit, and selects the decision tree with the largest variance, thereby selecting the decision tree in which the representative index values show a more pronounced feature.
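The selection rule in (2) can be illustrated with a minimal Python sketch. The dictionary layout, the period labels, and the index values are hypothetical, and the use of the population variance (`statistics.pvariance`) is an assumption, since the text does not say which variance estimator is meant.

```python
from statistics import pvariance


def select_tree(index_values_by_period):
    """Return the period whose per-node representative index values have the largest variance.

    index_values_by_period maps a period label to the list of representative
    index values of the nodes of the tree built for that period.
    """
    return max(index_values_by_period,
               key=lambda period: pvariance(index_values_by_period[period]))


# Hypothetical per-node purchase ratios for trees built from three periods.
index_values_by_period = {
    "last_week": [0.10, 0.12, 0.11],
    "last_month": [0.02, 0.30, 0.08],
    "last_year": [0.09, 0.15, 0.10],
}
best_period = select_tree(index_values_by_period)
```

In this made-up data the one-month tree separates users into nodes with very different ratios, so it is the one selected; a tree whose nodes all have similar ratios would tell the analyst little about which group to target.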

(3) In one aspect of the present invention, in the information processing apparatus according to (1) or (2), the representative index is the ratio of the number of users who performed the behavior under analysis to the number of users classified into the node.

(4) In one aspect of the present invention, in the information processing apparatus according to any one of (1) to (3), the representative index is the value obtained by multiplying the ratio of the number of users who performed the behavior under analysis to the number of users classified into the node by the average sales per user for the users classified into the node.

(5) One aspect of the present invention is an information processing method using a computer, comprising: a decision tree calculation step in which a decision tree calculation unit of the computer acquires data indicating user attributes of each of a plurality of users and calculates, for each of a plurality of different periods, a decision tree that classifies each user into one of a plurality of nodes based on the user attributes in the data corresponding to that period; a representative index calculation step in which a representative index calculation unit of the computer calculates the value of a representative index calculated based on the number of users classified into each node; a verification evaluation step in which a verification evaluation unit of the computer selects one of the plurality of decision trees calculated in the decision tree calculation step, based on the representative index values calculated in the representative index calculation step; and an output step in which an output unit of the computer outputs analysis result information indicating a decision tree calculated based on data indicating the decision tree selected in the verification evaluation step and data indicating the representative index values calculated in the representative index calculation step.

(6) One aspect of the present invention is a program for causing a computer to execute: a decision tree calculation step of acquiring data indicating user attributes of a plurality of users and calculating, for each of a plurality of different periods, a decision tree that classifies each user into one of a plurality of nodes based on the user attributes in the data corresponding to that period; a representative index calculation step of calculating the value of a representative index calculated based on the number of users classified into each node; a verification evaluation step of selecting one of the plurality of decision trees calculated in the decision tree calculation step, based on the representative index values calculated in the representative index calculation step; and an output step of outputting analysis result information indicating a decision tree calculated based on data indicating the decision tree selected in the verification evaluation step and data indicating the representative index values calculated in the representative index calculation step.

According to the present invention, it is possible to present, from among a plurality of analysis results with different analysis target periods, an analysis result in which a feature appears more strongly.

BRIEF DESCRIPTION OF THE DRAWINGS
A schematic diagram showing an overview of the information processing system according to the present embodiment.
A block diagram showing the functional configuration of the information processing apparatus according to the present embodiment.
A diagram showing an example of the node selection record table stored in the threshold value storage unit of the information processing apparatus according to the present embodiment.
A diagram showing an example of an image output by the output unit of the information processing apparatus according to the present embodiment.
A flowchart showing the operation of the information processing apparatus according to the present embodiment.

(Embodiment)
An overview of the information processing system according to the present embodiment is described below with reference to the drawings.
FIG. 1 is a schematic diagram showing an overview of the information processing system according to the present embodiment.
The information processing system according to the present embodiment is a web advertisement delivery support system that supports the decision making of a person in charge of delivering web advertisements. The information processing system supports the decision making of the person in charge of web advertisement delivery (hereinafter, the analyst) in selecting the users to whom web advertisements are delivered.

As illustrated, the information processing system includes an information processing apparatus 1, a web server 2, and a web advertisement distributor terminal 3.
The information processing apparatus 1 is a web advertisement delivery support apparatus that supports the analyst's decision making in selecting the users to whom web advertisements are delivered. The information processing apparatus 1 is installed, for example, on the premises of the company to which the analyst belongs. The information processing apparatus 1 includes, for example, a personal computer or a general-purpose computer.

The web server 2 is a server device that publishes a website over a communication network (for example, the Internet). The website is, for example, an EC (Electronic Commerce) site. Only one web server 2 is shown in FIG. 1 to simplify the description, but there may be more than one.
The plurality of web servers 2 may include, in addition to a server device that publishes a website operated by the company to which the analyst belongs (hereinafter also called the in-house website), server devices that publish websites operated by companies other than the one to which the analyst belongs (hereinafter also called other-company websites).

The web advertisement distributor terminal 3 is a terminal device that delivers web advertisements to users in accordance with a web advertisement delivery instruction based on data input from the information processing apparatus 1. The web advertisement distributor terminal 3 is installed on the premises of a web advertisement distributor and is connected to the information processing apparatus 1 over a communication network. A web advertisement distributor is, for example, a DSP operator that runs a DSP (Demand-Side Platform; a platform that helps advertisers maximize advertising effectiveness in online advertising). The web advertisement distributor terminal 3 includes, for example, a personal computer or a general-purpose computer.

The information processing apparatus 1 acquires, from the web server 2 that publishes the in-house website, log data indicating the in-house website browsing history of each visitor to the in-house website (hereinafter also called a user). The information processing apparatus 1 also acquires, from the web servers 2 that publish other-company websites, log data indicating each user's browsing history of the other-company websites.

The information processing apparatus 1 analyzes users' behavioral tendencies in response to an analysis request based on data input by the analyst. An analysis request from the analyst is, for example, a request to analyze the behavioral tendencies of users who purchased a specific product (for example, product A).
The information processing apparatus 1 analyzes the users' user attribute information based on the log data acquired from the web server 2, namely the log data indicating each user's in-house website browsing history and the log data indicating each user's other-company website browsing history. The user attribute information includes preference information and outcome information.

Preference information is data indicating things the user is interested in, for example data indicating the genres (such as sports or movies) of the content of the websites the user has viewed. Outcome information is data indicating the user's behavior on a website, for example whether a product was purchased or whether materials were requested.

Based on the user attribute information, the information processing apparatus 1 outputs analysis result data indicating groups of users classified by user attribute. For example, the information processing apparatus 1 divides users into a group that viewed web pages in the "science" genre 20 or more times during a specific period and a group that viewed them fewer than 20 times. The information processing apparatus 1 then analyzes, for each user group, for example the proportion of users who purchased product A, and outputs analysis result data.
The information processing apparatus 1 according to the present embodiment classifies users using decision tree analysis; the details are described later.
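The "science" example above amounts to a single split followed by a per-group ratio. A minimal sketch in Python, assuming hypothetical field names (`science_views`, `bought_a`) and made-up user records:

```python
def purchase_ratio_by_group(users, threshold=20):
    """Split users on their "science" page-view count and return, for each group,
    the proportion of users who bought product A (None for an empty group)."""
    groups = {"at_least": [], "below": []}
    for user in users:
        key = "at_least" if user["science_views"] >= threshold else "below"
        groups[key].append(user["bought_a"])
    return {key: (sum(flags) / len(flags) if flags else None)
            for key, flags in groups.items()}


# Hypothetical users: page-view count in the period and whether product A was bought.
users = [
    {"science_views": 25, "bought_a": True},
    {"science_views": 40, "bought_a": True},
    {"science_views": 22, "bought_a": False},
    {"science_views": 3, "bought_a": False},
    {"science_views": 10, "bought_a": True},
    {"science_views": 0, "bought_a": False},
    {"science_views": 5, "bought_a": False},
]
ratios = purchase_ratio_by_group(users)
```

A decision tree repeats this kind of split on several attributes, choosing the attributes and thresholds from the data rather than fixing them in advance.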

The information processing apparatus 1 accepts input of data indicating the user group selected by the analyst from among the plurality of user groups included in the analysis result data. The information processing apparatus 1 transmits the data indicating that user group to the web advertisement distributor terminal 3.
The web advertisement distributor terminal 3 delivers web advertisements to the terminals of the users in the user group indicated by the data input from the information processing apparatus 1.
In the present embodiment the web advertisement distributor terminal 3 delivers web advertisements directly to users' terminals, but the configuration is not limited to this. For example, the web advertisement distributor terminal 3 may deliver advertisements to users by e-mail. That is, the web advertisement distributor terminal 3 may send, to a user's e-mail address, an e-mail containing an advertisement or an e-mail containing the URL (Uniform Resource Locator) of a web page on which a web advertisement is posted.

(Configuration of the information processing apparatus)
The functional configuration of the information processing apparatus 1 according to the embodiment of the present invention is described in detail below with reference to the drawings.
FIG. 2 is a block diagram showing the functional configuration of the information processing apparatus 1 according to the present embodiment.
The information processing apparatus 1 includes a control unit 10, an input unit 11, a history storage unit 12, a site preference definition unit 13, a simulation verification evaluation unit 14, a decision tree calculation unit 15, a representative index calculation unit 16, a threshold value storage unit 17, an output unit 18, an advertisement delivery unit 19, and a display method storage unit 20.

The control unit 10 controls the various processes of the information processing apparatus 1. The control unit 10 includes, for example, a CPU (Central Processing Unit).
The input unit 11 accepts analysis requests input by the analyst. The input unit 11 outputs data indicating the input analysis request to the control unit 10. The input unit 11 includes, for example, a mouse, a keyboard, or a touch panel.

The history storage unit 12 stores log data indicating each user's in-house website browsing history. The history storage unit 12 acquires this log data from the web server 2. The in-house website here is a website published by the web server 2 over a communication network. The web server 2 is, for example, the web server of an EC site operated by the company to which the analyst belongs. The in-house website browsing history includes data such as the names of the web pages accessed within the in-house website, the access dates and times, and outcomes (for example, whether a product was purchased or whether materials were requested).

The history storage unit 12 also stores log data indicating each user's other-company website browsing history. The history storage unit 12 acquires this log data from the web server 2, for example by referring to referrer information stored on the web server 2. An other-company website here is a website operated by another company (that is, for example, a company other than the one to which the analyst belongs). The other-company website browsing history includes the names of the web pages accessed within the other-company website, the access dates and times, and the like, identified on the basis of information such as the accessing user's cookie.

The history storage unit 12 includes a storage medium, for example a hard disk drive (HDD), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), ROM (Read Only Memory), RAM (Random Access read/write Memory), or any combination of these.

The site preference definition unit 13 stores in advance attribute information for web pages within other-company websites, associated with information identifying those web pages.
The attribute information for a web page within an other-company website is, for example, data indicating the genre (for example, sports or movies) of the content on the website. This attribute information is used as the user's preference information in decision tree analysis.
The information identifying a web page within an other-company website is, for example, the name or URL of that web page.

The site preference definition unit 13 includes a storage medium, for example a hard disk drive, flash memory, EEPROM, ROM, RAM, or any combination of these.

The history storage unit 12 links the identification information of the web pages a user accessed within other-company websites to the identification information of the other-company web pages stored in the site preference definition unit 13. This allows the history storage unit 12 to obtain the attribute information of the web pages the user accessed within other-company websites. That is, the history storage unit 12 can obtain, for example, data indicating the genre to which the content of each other-company web page accessed by each user belongs.
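The linking step described above is essentially a lookup join between the access log and the site preference definitions. A minimal sketch in Python, in which the URLs, genre labels, user IDs, and the decision to skip undefined pages are all illustrative assumptions:

```python
from collections import Counter

# Hypothetical site preference definitions: page identifier (here a URL) -> genre.
SITE_PREFERENCES = {
    "https://news.example.com/sports/1": "sports",
    "https://news.example.com/movies/7": "movies",
    "https://lab.example.org/physics": "science",
}


def genre_counts(access_log, preferences):
    """Count, per user, how many accessed pages fall into each genre.

    access_log is an iterable of (user_id, page_url) pairs. Pages with no
    entry in the preference definitions are skipped.
    """
    counts = {}
    for user_id, url in access_log:
        genre = preferences.get(url)
        if genre is None:
            continue
        counts.setdefault(user_id, Counter())[genre] += 1
    return counts


access_log = [
    ("u1", "https://lab.example.org/physics"),
    ("u1", "https://lab.example.org/physics"),
    ("u1", "https://news.example.com/sports/1"),
    ("u2", "https://news.example.com/movies/7"),
    ("u2", "https://unknown.example.net/"),
]
counts = genre_counts(access_log, SITE_PREFERENCES)
```

The per-user genre counts produced this way are the kind of preference attribute that a later split such as "viewed science pages 20 or more times" can be evaluated against.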

When data indicating an analysis request from the analyst is input from the input unit 11, the control unit 10 outputs that data to the simulation verification evaluation unit 14.
When data indicating an analysis request from the analyst is input from the control unit 10, the simulation verification evaluation unit 14 acquires from the history storage unit 12 log data indicating each user's user attribute information, that is, outcome information and preference information (for example, the genres of the other-company web pages the user has viewed).

The simulation verification evaluation unit 14 extracts, from the log data acquired from the history storage unit 12, the log data recorded within a specific period. The simulation verification evaluation unit 14 outputs the extracted log data to the decision tree calculation unit 15 via the control unit 10 and has the decision tree calculation unit 15 calculate a decision tree based on that log data. The simulation verification evaluation unit 14 acquires the data indicating the decision tree calculated by the decision tree calculation unit 15 via the control unit 10.

The simulation verification evaluation unit 14 repeats the extraction of log data recorded within a specific period, described above, for a plurality of different periods.
The periods from which log data is extracted are, for example, periods of different lengths, such as "the last week", "the last month", and "the last year". Alternatively, they may be periods with different start and end dates, such as "from 10 days ago to yesterday", "from 20 days ago to 11 days ago", and "from 30 days ago to 21 days ago". They may also be a combination of periods of different lengths and periods with different start and end dates.
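The two kinds of period described above can be generated and applied to log records as in the following Python sketch. The function names, the record layout, and the convention that a "last N days" window ends yesterday are assumptions made for illustration.

```python
from datetime import date, timedelta


def trailing_window(today, days):
    """Window covering the `days` days that end yesterday (inclusive bounds)."""
    return (today - timedelta(days=days), today - timedelta(days=1))


def sliding_windows(today, width, count):
    """Consecutive windows of equal width: 10 days ago to yesterday,
    20 days ago to 11 days ago, and so on (for width=10)."""
    return [(today - timedelta(days=width * (i + 1)),
             today - timedelta(days=width * i + 1))
            for i in range(count)]


def filter_log(records, window):
    """Keep the records whose date falls inside the window (inclusive)."""
    start, end = window
    return [r for r in records if start <= r["date"] <= end]


today = date(2015, 9, 29)
different_lengths = [trailing_window(today, d) for d in (7, 30, 365)]
different_offsets = sliding_windows(today, width=10, count=3)

records = [
    {"user": "u1", "date": date(2015, 9, 27)},
    {"user": "u2", "date": date(2015, 9, 12)},
    {"user": "u3", "date": date(2015, 8, 1)},
]
recent = filter_log(records, different_offsets[0])
```

Each resulting window would be passed to the decision tree calculation separately, producing one tree per period for the later comparison.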

The simulation verification evaluation unit 14 outputs the log data extracted for each of the plurality of periods to the decision tree calculation unit 15 via the control unit 10, and has the decision tree calculation unit 15 calculate a decision tree for each period based on that period's log data. The simulation verification evaluation unit 14 acquires the data indicating the decision tree for each of the plurality of periods via the control unit 10.

The decision tree calculation unit 15 performs decision tree analysis based on the per-user outcome information and preference information input from the simulation verification evaluation unit 14 via the control unit 10, and calculates a decision tree. The calculated decision tree contains, for each node, values for "number", "ratio", and "occupancy". "Number" is the number of users who satisfy the condition corresponding to the node. "Ratio" is the ratio of each segment under the condition corresponding to the node. "Occupancy" is the volume of the segment that satisfies the condition corresponding to the node.
The simulation verification evaluation unit 14 outputs the data indicating the decision trees for each of the plurality of periods, acquired from the decision tree calculation unit 15 via the control unit 10, to the representative index calculation unit 16 via the control unit 10.

The representative index calculation unit 16 calculates a representative index for each node of the decision tree of each of the plurality of periods, based on the data input from the simulation verification and evaluation unit 14 via the control unit 10. The representative index indicates the magnitude of the expectation with respect to the objective that has been set. In the present embodiment, the representative index is the "ratio" value of each node itself, but it is not limited to this. The representative index is preferably an index obtained by a calculation method suited to the purpose of the analysis.

For example, if the purpose of the analysis is to improve the product purchase rate, the "ratio" value of each node itself is used as the representative index. In this case, the representative index is the expected product purchase rate of the users included in each node.
If the purpose of the analysis is to increase expected sales, the representative index is, for example, the value obtained by multiplying the "ratio" of each node by the average sales per user for the users included in that node. In this case, the representative index is the expected average sales per user for the users included in each node.
Depending on the purpose of the analysis, other measures for the users included in each node may also be used as the representative index, such as the repeat purchase rate, the time spent visiting the website, the number of Web pages viewed, or the interval since the previous purchase (visit).
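The two example indices above can be written out as a small sketch. The function names are hypothetical, and treating the displayed "ratio" as a percentage that is converted to a fraction before multiplying is an assumption; the specification only says that the ratio is multiplied by the average sales per user.

```python
def purchase_rate_index(purchase_ratio_percent):
    """Representative index when the goal is a higher purchase rate:
    the node's "ratio" value is used as is."""
    return purchase_ratio_percent

def expected_sales_index(purchase_ratio_percent, avg_sales_per_user):
    """Representative index when the goal is higher expected sales.
    Assumption: the ratio is a percentage, so it is scaled to a fraction first."""
    return purchase_ratio_percent / 100.0 * avg_sales_per_user

# A node where 20.0% of users purchased, with hypothetical average sales of 5000 per user
rate_index = purchase_rate_index(20.0)
sales_index = expected_sales_index(20.0, 5000.0)
```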

The representative index calculation unit 16 outputs, to the simulation verification and evaluation unit 14 via the control unit 10, data indicating the representative index value of each node of the decision tree for each of the plurality of periods, together with data indicating the decision tree for each of the plurality of periods.
Based on the representative index value of each node of the decision tree for each of the plurality of periods, the simulation verification and evaluation unit 14 selects the decision tree in which the characteristics of the representative index values appear most clearly. For example, the simulation verification and evaluation unit 14 calculates, for the decision tree of each period, the variance of the representative index values across the nodes of that tree. It then selects the decision tree with the largest variance.
The simulation verification and evaluation unit 14 outputs data indicating the selected decision tree to the representative index calculation unit 16 via the control unit 10.
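The variance-based selection described above can be sketched as follows. The period labels and index values are made-up illustration data, `select_tree` is a hypothetical name, and the use of population variance (rather than sample variance) is an assumption, since the specification does not say which is meant.

```python
from statistics import pvariance

def select_tree(index_values_by_period):
    """Return the period whose decision tree has the largest variance of
    per-node representative index values."""
    return max(index_values_by_period,
               key=lambda period: pvariance(index_values_by_period[period]))

# Hypothetical per-node representative index values for two candidate trees
index_values_by_period = {
    "last week": [1.0, 1.2, 0.9, 1.1],
    "last month": [1.0, 20.0, 100.0, 0.9],
}
selected = select_tree(index_values_by_period)
```

A tree whose nodes all have similar index values separates users poorly, so a larger spread is taken here as a sign that the tree distinguishes promising nodes from the rest.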

The cutoff value storage unit 17 stores, for example, the "number" values of nodes selected by skilled analysts in past analyses (hereinafter referred to as cutoff values). The cutoff value storage unit 17 includes a storage medium, for example, a hard disk drive, flash memory, EEPROM, ROM, RAM, or any combination thereof.

FIG. 3 is a diagram showing an example of the node selection result table stored in the cutoff value storage unit 17 of the information processing apparatus 1 according to the present embodiment.
As illustrated, the node selection result table is two-dimensional tabular data with two columns: "analyst's skill level" and "number of users included in the selected node". Each row of the node selection result table corresponds to one past node selection made by an analyst.

For example, the first (topmost) data row of the node selection result table shown in FIG. 3 stores the data "high" and "54". This indicates that the skill level of the analyst who made the node selection represented by this row is "high" (that is, the analyst is highly skilled), and that the node selected in that node selection contained 54 users.
Similarly, the second data row of the node selection result table shown in FIG. 3 stores the data "standard" and "42". This indicates that the skill level of the analyst who made the node selection represented by this row is "standard" (that is, the analyst has a standard skill level), and that the node selected in that node selection contained 42 users.

For each node of the decision tree selected by the simulation verification and evaluation unit 14 as described above, the representative index calculation unit 16 sets the representative index value of the node to 0 if the node's "number" value is below the "number" value of the nodes selected by skilled analysts. That is, if the number of users in a node is less than the cutoff value, the representative index value of that node becomes 0.
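A minimal sketch of this cutoff step is shown below, using the two rows of the FIG. 3 example. How a single cutoff value is derived from several past selections is not stated in the specification; taking the smallest count among skilled analysts is an assumption made only for illustration, and all function names are hypothetical.

```python
# Rows of the node selection result table: (analyst skill level, users in selected node)
selection_records = [("high", 54), ("standard", 42)]

def cutoff_value(records, skilled_levels):
    """Derive a cutoff from past selections by skilled analysts.
    Assumption: the smallest node size a skilled analyst ever selected."""
    counts = [count for level, count in records if level in skilled_levels]
    return min(counts)

def apply_cutoff(node_counts, node_index, cutoff):
    """Set the representative index to 0 for nodes with fewer users than the cutoff."""
    return {node: (0.0 if node_counts[node] < cutoff else value)
            for node, value in node_index.items()}

cutoff = cutoff_value(selection_records, {"high"})   # only "high" counts as skilled here
node_counts = {"nd2": 10, "nd4": 15550}              # users per node, from the FIG. 4 example
node_index = {"nd2": 20.0, "nd4": 1.5}               # "ratio" of purchasers per node
adjusted = apply_cutoff(node_counts, node_index, cutoff)
```

With these inputs, node nd2 keeps its high purchase ratio in the raw data but loses it in `adjusted`, because its 10 users fall below the cutoff.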

Which skill levels count as a "skilled analyst" is set in advance: for example, only analysts whose skill level is "high", or analysts whose skill level is "high" together with analysts whose skill level is "standard".

In principle, the larger the "ratio" value (the representative index) of the node whose users receive an advertisement, the higher the response rate that can be expected from those users, and so the more efficient the advertisement distribution. However, even if the "ratio" value of a node selected by the analyst is large, when the node contains too few users, the number (volume) of users who respond to the distributed advertisement will also be small. Distributing advertisements to such an undersized user group is not efficient.

However, the minimum number of recipients at which advertisement distribution becomes effective depends, for example, on the content of the advertisement (for example, the product being advertised). The judgment of how many recipients are needed at minimum for effective distribution is therefore preferably made based on, for example, the advertisement distribution experience of a skilled analyst.
For this reason, the representative index calculation unit 16 of the information processing apparatus 1 according to the present embodiment cuts off nodes with few users by means of the "cutoff value", which is determined based on judgments made by skilled analysts in past analyses. The information processing apparatus 1 varies the output of the decision tree that is the analysis result (the display state of a node, for example, the color density inside the node frame) according to the cutoff, which makes cut-off nodes harder for the analyst to select. An output example of the decision tree is described later.

The control unit 10 generates data (analysis result information) representing an image that visualizes the decision tree, based on the data indicating the decision tree calculated by the decision tree calculation unit 15, the representative index value of each node calculated by the representative index calculation unit 16, and the display method information stored in the display method storage unit 20 described later, and causes the output unit 18 to output the image.
The output unit 18 includes a display, for example, a liquid crystal display or an organic EL (electroluminescence) display.

(Example output of decision tree)
An example of the decision tree image output by the output unit 18 is described below.
FIG. 4 is a diagram showing an example of an image output by the output unit 18 of the information processing apparatus 1 according to the present embodiment.
As illustrated, the decision tree dt1 in this example includes 13 nodes (node nd0, node nd1, node nd2, ..., node nd12). Each of the 12 nodes other than the top-level node nd0 (node nd1, node nd2, node nd3, ..., node nd12) is connected to the node one level above it by a rule (rule ed1, rule ed2, rule ed3, ..., rule ed12). The decision tree dt1 thus forms a tree structure.

Below node nd0, the top-level node, "whole" is displayed. The text displayed below each node represents the classification condition used to classify the analysis target data. That is, node nd0 represents the result of aggregating over all analysis target users.
In the upper row of the frame of node nd0, "number: 99000:1000" is displayed. These are the numbers of users obtained by classifying the analysis target users according to whether they have performed the behavior under analysis (in this example, whether they have a purchase history for product A). The left-hand number is the number of users without a purchase history, and the right-hand number is the number of users with a purchase history. That is, the upper row of the frame of node nd0 indicates that 99000 users have no purchase history for product A and 1000 users have one. As these numbers show, the total number of analysis target users in this example is the sum of 99000 and 1000, that is, 100000.

In the lower row of the frame of node nd0, "ratio: 99.0:1.0" is displayed. This is the ratio of the numbers of users obtained by the same classification (in this example, by whether they have a purchase history for product A). The left-hand number is the percentage of users without a purchase history, and the right-hand number is the percentage of users with a purchase history. That is, the lower row of the frame of node nd0 indicates that, of the users included in node nd0 (that is, all analysis target users), 99.0% have no purchase history for product A and 1.0% have one.

Node nd0 is connected by rule eg1 to node nd1, which is displayed to its lower left, and by rule eg2 to node nd2, which is displayed to its lower right. Nodes nd1 and nd2 are one level below node nd0 in the hierarchy.

Below node nd1, "science < 20" is displayed. As described above, the text displayed below each node represents the classification condition used to classify the analysis target data. That is, node nd1 represents the result of aggregating over those analysis target users who satisfy "science < 20". In this example, the classification condition "science < 20" is based on the number of times a user has viewed science-related Web pages. Node nd1 therefore represents the result of aggregating over those analysis target users who have viewed science-related Web pages fewer than 20 times.
Similarly, "science >= 20" is displayed below node nd2. That is, node nd2 represents the result of aggregating over those analysis target users who have viewed science-related Web pages 20 or more times.

In this way, the decision tree dt1 divides the user group included in a node into two user groups according to a specific classification condition (for example, whether or not the user has viewed science-related Web pages 20 or more times), and displays the node corresponding to each resulting user group as a node one level lower in the hierarchy.
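The binary split performed at each node can be illustrated with a short sketch. The user records, the `science` field name, and the function `split_users` are hypothetical; only the threshold of 20 views comes from the example in the text.

```python
def split_users(users, attribute, threshold):
    """Split a node's users into two child groups:
    those below the threshold and those at or above it."""
    below = [u for u in users if u.get(attribute, 0) < threshold]
    at_or_above = [u for u in users if u.get(attribute, 0) >= threshold]
    return below, at_or_above

# Hypothetical users with their science-related page view counts
users = [
    {"id": 1, "science": 3},
    {"id": 2, "science": 25},
    {"id": 3, "science": 20},
]
left, right = split_users(users, "science", 20)   # "science < 20" and "science >= 20"
```

Every user lands in exactly one of the two child groups, which is why the child counts in FIG. 4 add up to the parent count.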

In the upper row of the frame of node nd1, "number: 98992:998" is displayed. These are the numbers of users obtained by classifying the users included in node nd1 (that is, the analysis target users who have viewed science-related Web pages fewer than 20 times) according to whether they have a purchase history for product A. That is, the upper row of the frame of node nd1 indicates that, of the users included in node nd1, 98992 have no purchase history for product A and 998 have one. As these numbers show, the number of users included in node nd1 is the sum of 98992 and 998, that is, 99990.

In the middle row of the frame of node nd1, "ratio: 99.0:1.0" is displayed. This is the ratio of the numbers of users obtained by classifying the users included in node nd1 according to whether they have a purchase history for product A. That is, the middle row of the frame of node nd1 indicates that, of the users included in node nd1 (that is, the analysis target users who have viewed science-related Web pages fewer than 20 times), 99.0% have no purchase history for product A and 1.0% have one.

In the lower row of the frame of node nd1, "occupancy: 100.0:99.8" is displayed. The left-hand number, "100.0", is the share that the users in node nd1 who have not performed the behavior under analysis (in this example, purchasing product A) make up of all users who have not performed that behavior. That is, "100.0" is the percentage of all users without a purchase history for product A who have viewed science-related Web pages fewer than 20 times.

As described above, the total number of users without a purchase history for product A is 99000, and the number of those users who have viewed science-related Web pages fewer than 20 times is 98992, so the occupancy rate is "99.9919...%". The decision tree dt1 in this example displays values to one decimal place. Accordingly, the lower row of the frame of node nd1 displays "100.0", the value obtained by rounding "99.9919...%".

Similarly, the right-hand number in the lower row of the frame of node nd1, "99.8", is the share that the users in node nd1 who have performed the behavior under analysis (in this example, purchasing product A) make up of all users who have performed that behavior. That is, "99.8" is the percentage of all users with a purchase history for product A who have viewed science-related Web pages fewer than 20 times.

Next, in the upper row of the frame of node nd2, "number: 8:2" is displayed. That is, of the users included in node nd2, 8 have no purchase history for product A and 2 have one. As these numbers show, the number of users included in node nd2 is the sum of 8 and 2, that is, 10.

In the middle row of the frame of node nd2, "ratio: 80.0:20.0" is displayed. That is, of the users included in node nd2 (that is, the analysis target users who have viewed science-related Web pages 20 or more times), 80.0% have no purchase history for product A and 20.0% have one.

In the lower row of the frame of node nd2, "occupancy: 0.0:0.2" is displayed. The left-hand number, "0.0", is the percentage of all users without a purchase history for product A who have viewed science-related Web pages 20 or more times.

As described above, the total number of users without a purchase history for product A is 99000, and the number of those users who have viewed science-related Web pages 20 or more times is 8, so the occupancy rate is "0.0080...%"; the rounded value "0.0" is displayed.

Similarly, the right-hand number in the lower row of the frame of node nd2, "0.2", is the percentage of all users with a purchase history for product A who have viewed science-related Web pages 20 or more times.

Next, in the upper row of the frame of node nd3, which is one level below node nd1, "number: 83682:758" is displayed. That is, of the users included in node nd3 (that is, the analysis target users who have viewed science-related Web pages fewer than 20 times and have viewed health-related Web pages fewer than 1 time, in other words not at all), 83682 have no purchase history for product A and 758 have one. As these numbers show, the number of users included in node nd3 is the sum of 83682 and 758, that is, 84440.

In the middle row of the frame of node nd3, "ratio: 99.1:0.9" is displayed. That is, of the analysis target users who have viewed science-related Web pages fewer than 20 times and have not viewed health-related Web pages, 99.1% have no purchase history for product A and 0.9% have one.

In the lower row of the frame of node nd3, "occupancy: 84.5:75.8" is displayed. The left-hand number, "84.5", is the percentage of all users without a purchase history for product A who have viewed science-related Web pages fewer than 20 times and have not viewed health-related Web pages.

Similarly, the right-hand number in the lower row of the frame of node nd3, "75.8", is the percentage of all users with a purchase history for product A who have viewed science-related Web pages fewer than 20 times and have not viewed health-related Web pages.

Next, in the upper row of the frame of node nd4, the other node one level below node nd1, "number: 15310:240" is displayed. That is, of the users included in node nd4 (that is, the analysis target users who have viewed science-related Web pages fewer than 20 times and have viewed health-related Web pages 1 or more times), 15310 have no purchase history for product A and 240 have one. As these numbers show, the number of users included in node nd4 is the sum of 15310 and 240, that is, 15550.

In the middle row of the frame of node nd4, "ratio: 98.5:1.5" is displayed. That is, of the analysis target users who have viewed science-related Web pages fewer than 20 times and have viewed health-related Web pages 1 or more times, 98.5% have no purchase history for product A and 1.5% have one.

In the lower row of the frame of node nd4, "occupancy: 15.5:24.0" is displayed. The left-hand number, "15.5", is the percentage of all users without a purchase history for product A who have viewed science-related Web pages fewer than 20 times and have viewed health-related Web pages 1 or more times.

Similarly, the right-hand number in the lower row of the frame of node nd4, "24.0", is the percentage of all users with a purchase history for product A who have viewed science-related Web pages fewer than 20 times and have viewed health-related Web pages 1 or more times.
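The three values shown in each node frame can be recomputed from the counts given for FIG. 4. The sketch below is an illustration only: `node_metrics` is a hypothetical name, and Python's built-in `round` uses round-half-to-even rather than conventional rounding, which makes no difference for these particular values.

```python
TOTAL_NO_PURCHASE = 99000   # users without a purchase history for product A (node nd0)
TOTAL_PURCHASE = 1000       # users with a purchase history for product A (node nd0)

def node_metrics(no_purchase, purchase):
    """Return (number of users, ratio pair in %, occupancy pair in %) for one node."""
    size = no_purchase + purchase
    ratio = (round(100 * no_purchase / size, 1),
             round(100 * purchase / size, 1))
    occupancy = (round(100 * no_purchase / TOTAL_NO_PURCHASE, 1),
                 round(100 * purchase / TOTAL_PURCHASE, 1))
    return size, ratio, occupancy

nd1 = node_metrics(98992, 998)
nd2 = node_metrics(8, 2)
nd3 = node_metrics(83682, 758)
nd4 = node_metrics(15310, 240)
```

The ratio is computed within the node, while the occupancy is computed against the whole population, which is why node nd2 combines a high purchase ratio (20.0) with a very small occupancy (0.2).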

The meaning of the content displayed in the remaining nodes (node nd5, node nd6, node nd7, ..., node nd12) is the same as that described above for nodes nd3 and nd4, so its description is omitted.

As illustrated in FIG. 4, the rules (branches) connecting the nodes of the decision tree dt1 are displayed with different thicknesses. In the present embodiment, the decision tree dt1 uses the thickness of each rule to express visually how many users are included in each node.
For example, each rule is displayed with a thickness proportional to the number of users included in the lower-level node that the rule connects to. The rule eg1 connecting node nd0 and node nd1 is drawn with a thickness corresponding to "99990", the number of users included in node nd1. Similarly, the rule eg2 connecting node nd0 and node nd2 is drawn with a thickness corresponding to "10", the number of users included in node nd2. Rule eg1 is therefore displayed thicker than rule eg2.
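One way to map user counts to rule thickness is sketched below. The maximum and minimum widths are arbitrary illustration values, and clamping to a minimum width so that very small nodes remain visible is an assumption that goes beyond the strictly proportional mapping described in the text.

```python
def edge_width(child_users, root_users, max_width=20.0, min_width=0.5):
    """Thickness of the rule leading to a child node, proportional to the
    child's share of all users, but never thinner than min_width (assumption)."""
    return max(min_width, max_width * child_users / root_users)

eg1_width = edge_width(99990, 100000)   # rule from nd0 to nd1
eg2_width = edge_width(10, 100000)      # rule from nd0 to nd2
```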

As also illustrated in FIG. 4, the color inside the node frame is displayed at a different density depending on the node. For example, the frames of nodes nd6 and nd10 are displayed in a dark color, and the frames of nodes nd2 and nd12 are displayed in a slightly dark color. In the present embodiment, the decision tree dt1 uses the color density inside each node frame to express visually the magnitude of the representative index value described above.

For example, as described above, the representative index in the present embodiment is the value of "ratio" itself. In the decision tree dt1 shown in FIG. 4, a node whose representative index is 50 or more has its frame filled with a dark color. A node whose representative index is 20 or more (but less than 50) has its frame filled with a slightly dark color. A node whose representative index is less than 20 has its frame filled with white.
The display method storage unit 20 shown in FIG. 2 stores display method information indicating how a node is displayed according to the value of the representative index. The display method information is, for example, two-dimensional tabular data with two columns: an item indicating the value of the representative index and an item indicating the display method of the node. The display method storage unit 20 includes a storage medium, for example a hard disk drive, flash memory, EEPROM, ROM, RAM, or any combination thereof.

Here, the value of "ratio" is the right-hand value in the ratio field of each node, that is, the proportion of users in the node who have purchased product A. For example, the middle row of the frame of node nd2 shows "ratio: 80.0 : 20.0". The right-hand number (that is, the value of the representative index) is "20.0", which is 20 or more, so the frame of node nd2 is filled with a slightly dark color. Likewise, the middle row of the frame of node nd6 shows "ratio: 0.0 : 100.0". The right-hand number (that is, the value of the representative index) is "100.0", which is 50 or more, so the frame of node nd6 is filled with a dark color.
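The two-column display method information and the lookup it implies can be sketched as below. The table layout, the fill names, and the helper names `node_fill` and `parse_ratio` are hypothetical; only the boundaries 50 and 20 come from the description of FIG. 4.

```python
# Hypothetical display-method table: (lower bound of the representative
# index, fill name), ordered from the highest bound to the lowest.
DISPLAY_METHOD = [(50.0, "dark"), (20.0, "slightly_dark")]
DEFAULT_FILL = "white"


def node_fill(representative_index: float) -> str:
    """Return the fill for a node from its representative index."""
    for lower_bound, fill in DISPLAY_METHOD:
        if representative_index >= lower_bound:
            return fill
    return DEFAULT_FILL


def parse_ratio(label: str) -> float:
    """Extract the right-hand value from a label such as 'ratio: 80.0 : 20.0'."""
    return float(label.split(":")[-1])


nd2_fill = node_fill(parse_ratio("ratio: 80.0 : 20.0"))
nd6_fill = node_fill(parse_ratio("ratio: 0.0 : 100.0"))
```

Keeping the boundaries in a table rather than in branching code mirrors the idea that the display method is stored as data and can be changed without touching the drawing logic.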

Note that at each node of the decision tree dt1, if the number of users who have purchased product A (that is, the number displayed on the right side of the upper row in the node's frame) is less than or equal to the threshold value described above, the node's frame is not filled with a dark or slightly dark color even if the value of "ratio" is high (it is filled with white).
For example, the decision tree dt1 illustrated in FIG. 4 is the decision tree obtained when the threshold value is "1". For nodes nd2, nd6, nd10, and nd12, whose frames are filled with a dark or slightly dark color, the number of users who have purchased product A (that is, the number displayed on the right side of the upper row in each node's frame) is greater than 1, so their frames are not filled with white. If the threshold value were "3", the frames of nodes nd2, nd6, and nd12, in which the number of users who have purchased product A is 3 or fewer, would be filled with white.
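A minimal sketch of how the threshold value overrides the ratio-based coloring follows. The function name and fill names are hypothetical. The purchaser count of 2 for node nd2 is derived here from the 10 users and the 20.0 ratio quoted in the text, and is an assumption about what FIG. 4 shows.

```python
def node_fill_with_threshold(representative_index: float,
                             purchasers: int,
                             threshold: int) -> str:
    """Return the fill for a node, suppressing emphasis for small nodes.

    A node whose purchaser count is at or below the threshold is always
    white, regardless of how high its ratio is.
    """
    if purchasers <= threshold:
        return "white"
    if representative_index >= 50.0:
        return "dark"
    if representative_index >= 20.0:
        return "slightly_dark"
    return "white"


# Node nd2: ratio 20.0, assumed 2 purchasers (20.0% of 10 users).
fill_threshold_1 = node_fill_with_threshold(20.0, purchasers=2, threshold=1)
fill_threshold_3 = node_fill_with_threshold(20.0, purchasers=2, threshold=3)
```

With a threshold of 1 the node keeps its slightly dark fill; raising the threshold to 3 turns it white, matching the example in the text.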

As described above, because the thickness of each rule visually indicates the number of users included in each node, the analyst can easily recognize nodes that contain many users.
In addition, because the frame of a node with a large representative index is highlighted with a denser color, the analyst can easily recognize nodes with a large representative index. The analyst can thus readily identify nodes whose "ratio", the representative index, is high, that is, nodes containing a group of users with a high purchase rate for product A.

Furthermore, a node whose number of users falls below the threshold value, which is set based on the number of users included in nodes selected by experts in the past, is not filled with a dense color. Such a node is therefore less likely to draw the analyst's attention (or, put differently, the analyst can easily see that it is a node an expert would be unlikely to select).

Referring again to FIG. 2, the remaining functional configuration of the information processing apparatus 1 will be described.
The output unit 18 outputs an image including the decision tree dt1, thereby presenting the decision tree dt1 to the analyst. The analyst selects a node from the decision tree dt1 in the image displayed on the output unit 18 and inputs data indicating the selected node through the input unit 11.

The input unit 11 outputs data indicating the node selected by the analyst to the advertisement distribution unit 19.
The advertisement distribution unit 19 outputs data indicating the group of users included in the node identified by the data input from the input unit 11 to the Web advertisement distributor terminal 3.

The input unit 11 also outputs the node selected by the analyst, together with data indicating the analyst's skill level, to the control unit 10. Alternatively, the input unit 11 may output data indicating the selected node to the control unit 10 only when the analyst who input that data is an expert.
It is assumed that the information processing apparatus 1 can recognize the analyst's skill level in advance. For example, a table (not shown) indicating the skill level of each analyst may be stored in advance in the information processing apparatus 1, and the analyst may authenticate when starting to use the apparatus. This allows the information processing apparatus 1 to recognize the analyst's skill level (for example, whether or not the analyst is an expert).

The control unit 10 acquires, from the decision tree calculation unit 15, the value of "number" corresponding to the node identified by the data input from the input unit 11, and stores it in the threshold value storage unit 17 via the representative index calculation unit 16. The information processing apparatus 1 can thereby accumulate the number of users included in nodes selected by experts, and can learn the threshold value described above from the accumulated data.

The threshold value may be, for example, the minimum of the "number" values stored in the threshold value storage unit 17, or the average of the "number" values stored in the threshold value storage unit 17.

When a skilled analyst selected a plurality of nodes in a past analysis, the value of "number" stored in the threshold value storage unit 17 may be the minimum of the user counts of the selected nodes, or it may be the average or the total of those user counts.
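The accumulation and learning of the threshold value can be sketched as follows. The class name `ThresholdStore`, its methods, and the default of 0 when nothing has been recorded yet are assumptions for illustration; the text specifies only that expert-selected counts are stored and that the minimum or the average may serve as the threshold value.

```python
from statistics import mean


class ThresholdStore:
    """Hypothetical stand-in for the threshold value storage unit 17."""

    def __init__(self) -> None:
        self._counts: list[int] = []

    def record(self, user_count: int, is_expert: bool) -> None:
        """Store the user count of a selected node, for experts only."""
        if is_expert:
            self._counts.append(user_count)

    def threshold(self, mode: str = "min") -> float:
        """Derive the threshold value from the stored counts."""
        if not self._counts:
            return 0  # assumed default: no suppression before any history
        if mode == "min":
            return min(self._counts)
        if mode == "mean":
            return mean(self._counts)
        raise ValueError(f"unknown mode: {mode}")


store = ThresholdStore()
store.record(120, is_expert=True)
store.record(40, is_expert=True)
store.record(5, is_expert=False)  # non-expert selections are ignored
```

Using the minimum is the more permissive choice, since it suppresses only nodes smaller than anything an expert has ever picked; the average suppresses more nodes.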

(Operation of the information processing apparatus)
Hereinafter, the operation of the information processing apparatus 1 will be described with reference to the drawings.
FIG. 5 is a flowchart showing the operation of the information processing apparatus 1 according to the present embodiment.
The process shown in this flowchart starts when data indicating an analysis request from an analyst is input to the input unit 11 of the information processing apparatus 1.

(Step S101) The input unit 11 receives an analysis request from the analyst. The input unit 11 outputs the data indicating the analysis request, input by the analyst, to the control unit 10. The process then proceeds to step S102.

(Step S102) When the data indicating the analysis request is input from the input unit 11, the control unit 10 outputs the data to the simulation verification evaluation unit 14. The simulation verification evaluation unit 14 acquires analysis target data from the history storage unit 12. The analysis target data includes user attribute information for each user, for example, the purchase history of product A, the Web pages browsed, and the genres to which the content of the browsed Web pages belongs (that is, data indicating the user's preferences). From the acquired analysis target data, the simulation verification evaluation unit 14 extracts the analysis target data for each of a plurality of different periods and outputs each extract to the decision tree calculation unit 15 via the control unit 10. The process then proceeds to step S103.

(Step S103) The decision tree calculation unit 15 calculates a decision tree for each of the plurality of different periods, based on the analysis target data input from the simulation verification evaluation unit 14 via the control unit 10. The decision tree calculation unit 15 outputs data indicating each calculated decision tree to the simulation verification evaluation unit 14 via the control unit 10. The simulation verification evaluation unit 14 outputs the data indicating the decision trees for the plurality of different periods to the representative index calculation unit 16 via the control unit 10. The process then proceeds to step S104.

(Step S104) The representative index calculation unit 16 calculates a representative index for each node of each decision tree indicated by the data input from the control unit 10. The process then proceeds to step S105.

(Step S105) The representative index calculation unit 16 outputs, to the simulation verification evaluation unit 14 via the control unit 10, data indicating the representative index value of each node of the decision tree for each period, together with data indicating the decision tree for each period.
Based on the representative index values of the nodes of the decision tree for each period, the simulation verification evaluation unit 14 selects the decision tree in which the characteristics of the representative index values appear most clearly. The simulation verification evaluation unit 14 outputs data indicating the selected decision tree to the representative index calculation unit 16 via the control unit 10. The process then proceeds to step S106.

(Step S106) For each node of the decision tree selected by the simulation verification evaluation unit 14, if the node's value of "number" is below the "threshold value" (a value derived from the "number" values of nodes selected by skilled analysts), the representative index calculation unit 16 sets the representative index of that node to 0. In other words, the representative index of any node whose number of users does not reach the threshold value is set to 0. The representative index calculation unit 16 outputs data indicating the decision tree and data indicating the representative index of each node to the control unit 10. The process then proceeds to step S107.

(Step S107) Based on the data indicating the decision tree and the data indicating the representative index of each node, both input from the representative index calculation unit 16, the control unit 10 generates data indicating an image that includes the decision tree. The control unit 10 outputs the generated image data to the output unit 18. The process then proceeds to step S108.

(Step S108) The output unit 18 outputs the image including the decision tree based on the data input from the control unit 10. The image including the decision tree is thereby presented to the analyst. The process then proceeds to step S109.

(Step S109) When the analyst selects a node in the decision tree (that is, a node containing the group of users to be targeted for advertisement distribution) and the input unit 11 receives data indicating the selected node, the process proceeds to step S110. Otherwise, the process remains at step S109.

(Step S110) The input unit 11 outputs data indicating the selected node to the advertisement distribution unit 19. The advertisement distribution unit 19 distributes a Web advertisement, via the Web advertisement distributor, to the group of users included in the node identified by the data input from the input unit 11. The process then proceeds to step S111.

(Step S111) If the analyst who input the data indicating the selected node is an expert, the input unit 11 proceeds to step S112. Otherwise, the process of this flowchart ends.

(Step S112) The input unit 11 outputs data indicating the node selected by the analyst to the control unit 10. The control unit 10 acquires, from the decision tree calculation unit 15, the value of "number" corresponding to the node identified by the data input from the input unit 11. The control unit 10 stores that value of "number" in the threshold value storage unit 17 via the representative index calculation unit 16. The information processing apparatus 1 thereby accumulates the number of users included in nodes selected by experts and learns the threshold value. This completes the process of this flowchart.

As described above, the information processing apparatus 1 according to the present embodiment sets a plurality of analysis target periods and calculates a decision tree for each period from the log data (user attribute information) of that period. The information processing apparatus 1 calculates the representative index value of each node of each calculated decision tree. Based on these representative index values, the information processing apparatus 1 selects the decision tree in which the characteristics of the representative index values appear most clearly. By selecting that decision tree and presenting it to the analyst, the information processing apparatus 1 can present a decision tree that is likely to have a greater effect on the task that the analysis is intended to support.
Accordingly, among a plurality of analysis results with different analysis target periods, the information processing apparatus 1 according to the present embodiment can present the analysis result in which the characteristics appear more strongly.
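One concrete reading of "the characteristics appear most clearly" is the criterion stated in the claims: compute the variance of the per-node representative index values for each period's decision tree and pick the tree with the largest variance. The sketch below assumes each tree is reduced to a list of its nodes' representative index values; the period names and the sample values are invented for illustration.

```python
from statistics import pvariance


def select_tree(trees: dict[str, list[float]]) -> str:
    """Return the key of the tree whose node indices have the largest variance.

    `trees` maps a period label to the representative index values of the
    nodes of the decision tree calculated for that period.
    """
    return max(trees, key=lambda period: pvariance(trees[period]))


# Hypothetical representative index values per analysis target period.
trees = {
    "last_7_days": [18.0, 22.0, 20.0],    # nodes look alike
    "last_30_days": [0.0, 100.0, 20.0],   # nodes differ sharply
}
selected = select_tree(trees)
```

Population variance is used here; whether the embodiment uses population or sample variance is not stated, and the ranking can differ between the two when the trees have different numbers of nodes.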

Although an embodiment of the present invention has been described in detail above, the specific configuration is not limited to the one described, and various design changes and the like can be made without departing from the gist of the present invention.

Part or all of the information processing apparatus 1 in the embodiment described above may be realized by a computer. In that case, a program for realizing the control function may be recorded on a computer-readable recording medium, and the function may be realized by having a computer system read and execute the program recorded on the recording medium.

The "computer system" referred to here is a computer system built into the information processing apparatus 1 and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into the computer system.

The "computer-readable recording medium" may also include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as the volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

The information processing apparatus 1 in the embodiment described above may also be realized as an integrated circuit such as an LSI (Large Scale Integration). Each functional block of the information processing apparatus 1 may be implemented as an individual processor, or some or all of the blocks may be integrated into a single processor. The method of circuit integration is not limited to LSI and may be realized with a dedicated circuit or a general-purpose processor. If advances in semiconductor technology produce an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

1 ... information processing apparatus, 2 ... Web server, 3 ... Web advertisement distributor terminal, 10 ... control unit, 11 ... input unit, 12 ... history storage unit, 13 ... site preference definition unit, 14 ... simulation verification evaluation unit, 15 ... decision tree calculation unit, 16 ... representative index calculation unit, 17 ... threshold value storage unit, 18 ... output unit, 19 ... advertisement distribution unit, 20 ... display method storage unit

Claims

An information processing apparatus comprising:
a decision tree calculation unit that acquires data indicating user attributes of each of a plurality of users and calculates, a plurality of times while changing a specific period, a decision tree that classifies the users into any of a plurality of nodes based on the user attributes in the portion of the data corresponding to the specific period;
a representative index calculation unit that calculates a value of a representative index, which is an index based on the number of users classified into each of the nodes;
a verification and evaluation unit that selects one decision tree from among the plurality of decision trees calculated by the decision tree calculation unit, based on the value of the representative index calculated by the representative index calculation unit; and
an output unit that outputs analysis result information including a decision tree calculated based on data indicating the decision tree selected by the verification and evaluation unit and data indicating the value of the representative index calculated by the representative index calculation unit.

The information processing apparatus according to claim 1, wherein the verification and evaluation unit calculates, for each of the decision trees based on the plurality of different periods, the variance of the representative index values calculated for the nodes by the representative index calculation unit, and selects the decision tree with the largest variance, thereby selecting a decision tree that is more strongly characterized by the value of the representative index.

The information processing apparatus according to claim 1 or 2, wherein the representative index is the ratio of the number of users who performed the action under analysis to the number of users classified into the node.

The information processing apparatus according to any one of claims 1 to 3, wherein the representative index is a value obtained by multiplying the average number of sales per user among the users classified into the node by the ratio of the number of users who performed the action under analysis to the number of users classified into the node.

An information processing method using a computer, the method comprising:
a decision tree calculation step in which a decision tree calculation unit included in the computer acquires data indicating user attributes of each of a plurality of users and calculates, a plurality of times while changing a specific period, a decision tree that classifies the users into any of a plurality of nodes based on the user attributes in the portion of the data corresponding to the specific period;
a representative index calculation step in which a representative index calculation unit included in the computer calculates a value of a representative index calculated based on the number of users classified into each of the nodes;
a verification and evaluation step in which a verification and evaluation unit included in the computer selects one decision tree from among the plurality of decision trees calculated in the decision tree calculation step, based on the value of the representative index calculated in the representative index calculation step; and
an output step in which an output unit included in the computer outputs analysis result information indicating a decision tree calculated based on data indicating the decision tree selected in the verification and evaluation step and data indicating the value of the representative index calculated in the representative index calculation step.

A program for causing a computer to execute:
a decision tree calculation step of acquiring data indicating user attributes of each of a plurality of users and calculating, a plurality of times while changing a specific period, a decision tree that classifies the users into any of a plurality of nodes based on the user attributes in the portion of the data corresponding to the specific period;
a representative index calculation step of calculating a value of a representative index calculated based on the number of users classified into each of the nodes;
a verification and evaluation step of selecting one decision tree from among the plurality of decision trees calculated in the decision tree calculation step, based on the value of the representative index calculated in the representative index calculation step; and
an output step of outputting analysis result information indicating a decision tree calculated based on data indicating the decision tree selected in the verification and evaluation step and data indicating the value of the representative index calculated in the representative index calculation step.