JP6130977B1 - Information processing apparatus, information processing method, information processing system, and program

Info

Publication number: JP6130977B1
Application number: JP2016555631A
Authority: JP
Inventors: 伊隆岡部
Original assignee: Mitsui Knowledge Industry Co Ltd
Current assignee: Mitsui Knowledge Industry Co Ltd
Priority date: 2016-05-24
Filing date: 2016-05-24
Publication date: 2017-05-17
Anticipated expiration: 2036-05-24
Also published as: JPWO2017203601A1; WO2017203601A1

Abstract

To suitably predict a future objective variable by multivariate analysis. An information processing apparatus according to one aspect of the present invention includes an acquisition unit that acquires data, a selection unit that selects an objective variable and one or more explanatory variable candidates from the data, and a construction unit that constructs a time series model for each explanatory variable candidate, wherein the construction unit constructs a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of each explanatory variable candidate by the time series model.

Description

The present invention relates to an information processing apparatus, an information processing method, an information processing system, and a program for data prediction.

Efforts are under way to collect large amounts of data, analyze how the data tends to vary over time, and apply the results to business. Multivariate analysis (for example, multiple regression analysis) is widely used for such analysis.

For example, a technique has been proposed for constructing a prediction model that predicts time series data that changes over time (also called an objective variable) by using other time series data (also called explanatory variables) (Patent Document 1).

Patent Document 1: JP 2013-152656 A

However, predicting a future objective variable requires taking future explanatory variables into account, and future explanatory variables cannot be obtained until that point in time arrives. For this reason, conventional multivariate analysis methods cannot suitably predict a future objective variable.

The present invention has been made in view of this point, and one of its objects is to provide an information processing apparatus, an information processing method, an information processing system, and a program capable of suitably predicting a future objective variable by multivariate analysis.

An information processing apparatus according to one aspect of the present invention includes: an acquisition unit that acquires data; a selection unit that selects an objective variable and one or more explanatory variable candidates from the data; and a construction unit that constructs a time series model for each explanatory variable candidate using data of a first period among the data, and calculates the prediction accuracy of the time series model of each explanatory variable candidate using data of a second period among the data. The construction unit constructs a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of the time series model of each explanatory variable candidate.

According to the present invention, a future objective variable can be suitably predicted by multivariate analysis.

A diagram showing an example of the schematic configuration of an information processing system according to an embodiment of the present invention.
A conceptual explanatory diagram of an objective variable prediction method according to an embodiment of the present invention.
A diagram showing an example of a flowchart of the objective variable prediction method according to an embodiment of the present invention.
A diagram showing an example of the explanatory variable candidate selection screen for step S12 of FIG. 3.
A diagram showing an example of the display screen for graphs of the explanatory variables and the objective variable for step S18 of FIG. 3.
A diagram showing an example of the functional configuration of a server according to an embodiment of the present invention.
A diagram showing an example of the hardware configuration of the server and the device according to an embodiment of the present invention.

The inventor conceived of analyzing the relationship between the objective variable and the explanatory variables with a multivariate model while predicting each explanatory variable with a time series model. The inventor then found that the explanatory variables adopted in the multivariate model can be selected appropriately based on the accuracy with which the time series models predict those explanatory variables.

This makes it possible to keep explanatory variables with poor prediction accuracy from being used to predict the objective variable, and thus to suppress situations in which the prediction accuracy of the objective variable deteriorates significantly.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

(Information processing system)
First, an information processing system to which the present invention is applied will be described. FIG. 1 is a diagram showing an example of the schematic configuration of an information processing system according to an embodiment of the present invention. The information processing system 1 shown in FIG. 1 includes a server 10 and a device 20.

In an embodiment of the present invention, in the configuration of the information processing system shown in FIG. 1, the server 10 performs multivariate analysis on data based on an instruction from the device 20 and transmits the analysis result to the device 20.

The server 10 is an information processing apparatus having a function of analyzing data based on an instruction from the device 20. The server 10 may be composed of a plurality of servers.

The device 20 is an information processing apparatus that executes an application such as a browser in response to a user operation and communicates with the server 10. The device 20 may be a portable terminal (mobile communication terminal) such as a mobile phone, a smartphone, or a tablet terminal, or may be a fixed communication terminal such as a personal computer (PC).

An example of the functional configuration and hardware configuration of each apparatus, such as the device 20 and the server 10, will be described later.

Note that this system configuration is an example and is not limiting. For example, although FIG. 1 shows one of each apparatus, the number of each apparatus is not limited to this, and a plurality may exist. In the information processing system 1, the function of a given apparatus may also be realized by a plurality of apparatuses.

(Information processing method)
An information processing method (objective variable prediction method) according to an embodiment of the present invention will be described below. Each information processing method may be applied to the information processing system described above.

FIG. 2 is a conceptual explanatory diagram of an objective variable prediction method according to an embodiment of the present invention. As a premise, the server 10 shown in FIG. 1 (hereinafter simply "the server") holds the actual values of the data to be predicted (the objective variable) and the actual values of time series data assumed to influence that data (the explanatory variables).

The server constructs a multivariate model (that is, quantifies the relationship among the data) based on the actual values of the objective variable and the actual values of one or more explanatory variables. A multivariate model formulates (models) the relationship between the objective variable and the explanatory variables, and can be constructed by, for example, multiple regression analysis, discriminant analysis, logistic regression analysis, or nonlinear analysis. In this specification, a multiple regression model is used as the example of a multivariate model, but the present invention can also be applied to other models (that is, the term "multiple regression model" may be read as "multivariate model").

The multiple regression model is expressed, for example, by the following Formula 1.
(Formula 1)
Objective variable = A x_1 + B x_2 + C x_3 + ... + Z
Here, A, B, and C are coefficients obtained by multiple regression analysis, x_1, x_2, and x_3 are explanatory variables, and Z is the intercept (a constant) obtained by multiple regression analysis.
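
As an illustration of Formula 1, the following is a minimal sketch of fitting the coefficients A, B, C and the intercept Z by ordinary least squares. The patent does not specify a fitting procedure; the normal-equation approach and the function names `solve_linear_system`, `fit_multiple_regression`, and `predict` are assumptions made for this example.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (no collinear explanatory variables).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Fit y = A*x_1 + B*x_2 + ... + Z; returns (coefficients, intercept).

    rows: one list of explanatory variable values per time step.
    """
    design = [list(r) + [1.0] for r in rows]  # trailing 1.0 models the intercept Z
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return beta[:-1], beta[-1]


def predict(coefficients, intercept, row):
    """Evaluate Formula 1 for one set of explanatory variable values."""
    return sum(c * v for c, v in zip(coefficients, row)) + intercept
```

With three explanatory variables, `fit_multiple_regression` returns three coefficients (A, B, C) and the intercept Z, and `predict` evaluates the right-hand side of Formula 1.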

Although FIG. 2 assumes that three explanatory variables are given, the number of explanatory variables used is not limited to this. When only one explanatory variable is selected, a simple regression model may be used.

The server also constructs a time series model for each explanatory variable based on the actual values of that explanatory variable. A time series model is a model that can predict future movements from past patterns and trends; details are described later.

In the present embodiment, construction of the time series models is linked to construction of the multiple regression model described above. Specifically, the server narrows down the explanatory variables used in the multiple regression model based on the constructed time series models. For this reason, the server may reconstruct (update) the multiple regression model as appropriate.

Once the time series models and the multiple regression model are finalized, the server calculates a predicted value for each explanatory variable from its time series model and inputs these values into the multiple regression model to calculate the predicted value of the objective variable.

A specific processing flow is described below with reference to FIG. 3. FIG. 3 is a diagram showing an example of a flowchart of the objective variable prediction method according to an embodiment of the present invention.

The server prepares the data used for processing (step S11). The predetermined data (also called source data) may be uploaded to the server from the device 20, or the server may acquire it separately (for example, by downloading it from a predetermined server, not shown). For example, the source data may be expressed in CSV (Comma-Separated Values) format.

The server creates variables based on the acquired source data (that is, it processes the source data). Each variable may be used as an objective variable or as an explanatory variable. For example, when the source data is daily data, the server may convert it into annual data. The server may also merge a plurality of data items (variables).
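
The conversion from daily to annual data mentioned above can be sketched as follows. The text does not say how daily values are combined, so the aggregation function (sum by default) and the name `to_annual` are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date


def to_annual(daily, aggregate=sum):
    """Group daily observations by calendar year and aggregate each group.

    daily: mapping from datetime.date to a numeric value.
    aggregate: how to combine a year's values (an assumed choice; a mean or
    a last-value rule may suit other kinds of data better).
    """
    buckets = defaultdict(list)
    for day, value in sorted(daily.items()):
        buckets[day.year].append(value)
    return {year: aggregate(values) for year, values in buckets.items()}
```

For example, passing `aggregate=max` would keep each year's peak value instead of its total.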

Next, the server sets the environment (conditions) used for constructing each model (step S12). For example, the server selects the objective variable to be predicted from among the variables created in step S11. The server also selects the explanatory variable candidates used to predict the selected objective variable. For example, the server may associate (link) variables with one another in step S11; in that case, in step S12, the explanatory variables used to predict the objective variable can be selected from the variables associated with that objective variable.

In step S12, the server also sets a prediction reference date, a learning period, an evaluation period, and a prediction period. The prediction reference date is the reference date for model construction and prediction. The learning period is the period containing the data used for model construction (learning data, or training data). The evaluation period is the period over which the accuracy of the constructed model is evaluated; in other words, data belonging to the evaluation period is used as test data. The prediction period is the period that is actually to be predicted using the constructed model.

The learning period can be a period going back from the prediction reference date, and the evaluation period can be a period on or after the prediction reference date. For example, when the prediction reference date is April 2014, the learning period is 60 months, and the evaluation period is 12 months, the learning period corresponds to March 2009 through March 2014, and the evaluation period runs from April 2014 to April 2015. Note that these periods may also be defined differently from the above.

In step S12, the server also selects candidate algorithms for constructing the time series models that predict the explanatory variable candidates. The candidates may be, for example, the following six: single exponential smoothing, double exponential smoothing, triple exponential smoothing (additive), triple exponential smoothing (multiplicative), autoregressive integrated moving average (ARIMA), and seasonal ARIMA (SARIMA). At least one of these algorithms may be omitted, and other algorithms may be included.
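
To make the simplest of these candidates concrete, the following is a minimal sketch of single exponential smoothing. The smoothing factor `alpha`, the initialization with the first observation, and the function names are assumptions; the patent does not describe how any of the algorithms is parameterized.

```python
def smoothed_level(series, alpha):
    """Return the final smoothed level of single exponential smoothing.

    alpha in (0, 1] weights the newest observation; the level is initialized
    with the first observation (one common convention, assumed here).
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def forecast_ses(series, horizon, alpha=0.5):
    """Forecast `horizon` future steps; SES yields a flat forecast at the last level."""
    return [smoothed_level(series, alpha)] * horizon
```

Because single exponential smoothing has no trend or seasonal term, its forecast is constant over the horizon, which is one reason the text lists double and triple smoothing and (S)ARIMA as alternative candidates.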

The server also sets reference parameters for constructing the models of the explanatory variables and the objective variable. The reference parameters may be, for example, the following (1) to (6), but are not limited to these:
(1) a threshold for information on prediction accuracy when evaluating explanatory variable candidates (for example, a threshold for the mean absolute percentage error (MAPE)),
(2) a threshold for the correlation coefficient when evaluating explanatory variable candidates,
(3) an upper limit on the number of explanatory variables to be selected,
(4) whether to perform a multicollinearity check (if enabled, the variance inflation factor (VIF) value used for the check is specified),
(5) whether to consider lag correlation (described later) (if enabled, the maximum slide period, for example a number of months, used when deriving lag correlation is specified),
(6) whether to stationarize the data (if enabled, the stationarization process, such as mixed, first-order differencing, or second-order differencing, is also specified).

Default values may be set for these reference parameters. For example, the threshold in (2) may be 0.2, the VIF value in (4) may be 10, and the maximum slide period in (5) may be 12 months. The default values are not limited to these. For (4) to (6), either enabled or disabled may be specified by default.
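
The reference parameters (1) to (6) and the example defaults above could be held in a small configuration object such as the sketch below. The class and field names are hypothetical. The defaults 0.2, 10, and 12 come from the text; the enabled/disabled defaults for (4) to (6) are arbitrary choices for this example, and no default is given for (1) or (3) because the text states none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceParameters:
    # (1) MAPE threshold in percent; the text gives no default, so it is required.
    mape_threshold: float
    # (2) Threshold on the absolute correlation coefficient (example default from the text).
    correlation_threshold: float = 0.2
    # (3) Upper limit on the number of explanatory variables; None means no limit (assumed).
    max_explanatory_variables: Optional[int] = None
    # (4) Multicollinearity check and its VIF value (example default from the text).
    vif_check: bool = True
    vif_threshold: float = 10.0
    # (5) Lag correlation and its maximum slide period in months (example default from the text).
    lag_correlation: bool = True
    max_slide_months: int = 12
    # (6) Data stationarization and the chosen process, e.g. "first_difference" (assumed label).
    stationarize: bool = False
    stationarize_method: Optional[str] = None
```

A caller would then override only what differs from the defaults, for example `ReferenceParameters(mape_threshold=10.0, max_slide_months=6)`.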

When the data (actual values) of an explanatory variable candidate and/or the objective variable has missing values in the learning period and/or the evaluation period, the server preferably performs imputation. For example, time series prediction or the least squares method may be used for imputation. When data stationarization is enabled in (6) above, the server applies the stationarization process to at least one of the objective variable and the explanatory variable candidates.
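
Of the stationarization processes named in (6), first-order and second-order differencing can be sketched as below. The function name `difference` is an assumption made for this example.

```python
def difference(series, order=1):
    """Apply differencing `order` times: order=1 gives x[t] - x[t-1].

    Each pass shortens the series by one element.
    """
    for _ in range(order):
        series = [later - earlier for earlier, later in zip(series, series[1:])]
    return series
```

For the series 1, 4, 9, 16, first-order differencing yields 3, 5, 7 and second-order differencing yields 2, 2, removing the quadratic trend.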

The server obtains the correlation (specifically, the correlation coefficient) over the learning period between the objective variable selected in step S12 and each explanatory variable candidate (step S13). When derivation of lag correlation is enabled, the server slides (shifts) the explanatory variable candidate along the time axis (for example, in the negative, or past, direction) by a given slide period and checks the correlation. Lag correlation is suitable when the value of the objective variable changes a certain period (lag) after the value of the explanatory variable changes. Note that the correlation may be obtained using the evaluation period in addition to, or instead of, the learning period.

When deriving lag correlation, the server slides each explanatory variable candidate from 0 months (no slide) up to the maximum slide period set in step S12 and calculates the correlation for each slide. For example, when 80 explanatory variable candidates are selected, the maximum slide period is 12 months, and the data is monthly, the correlation check in step S13 is performed 80 × (12 + 1) = 1040 times. An explanatory variable candidate to which a slide has been applied may be used in subsequent modeling as a variable distinct from the original candidate without the slide.
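
The lag correlation check can be sketched as follows: for each slide from 0 to the maximum, the objective variable at time t is paired with the candidate at time t minus the lag, and the Pearson correlation coefficient is computed. The pairing direction, the use of Pearson correlation, and the function names are assumptions for this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def lag_correlations(objective, candidate, max_lag):
    """Map each lag in 0..max_lag to corr(objective[t], candidate[t - lag]).

    Both series must have the same length and cover the same period.
    """
    n = len(candidate)
    return {
        lag: pearson(objective[lag:], candidate[: n - lag])
        for lag in range(max_lag + 1)
    }


def lags_passing(correlations, threshold=0.2):
    """Keep the lags whose absolute correlation is at or above the threshold."""
    return [lag for lag, r in correlations.items() if abs(r) >= threshold]
```

Each call to `lag_correlations` performs `max_lag + 1` correlation checks, which matches the 80 × (12 + 1) = 1040 count in the example above when run for 80 candidates with a 12-month maximum slide.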

When the absolute value of a calculated correlation coefficient is greater than or equal to (or greater than) the threshold set in step S12, the server performs the processing of step S14 on the explanatory variable candidate corresponding to that correlation coefficient. In other words, when the absolute value of the correlation coefficient is less than (or less than or equal to) the threshold, the corresponding explanatory variable candidate is excluded so that it is not used in constructing the multiple regression model.

Next, for each explanatory variable candidate that passed the correlation check in step S13, the server constructs a time series model from the learning period data with each of the algorithms selected in step S12, then predicts the evaluation period and evaluates the accuracy (step S14). For example, when six algorithms are selected in step S12 and 30 explanatory variable candidates pass the correlation check in step S13, 30 × 6 = 180 prediction calculations are performed. Information on prediction accuracy (for example, the prediction error rate) can be calculated from the prediction results and the actual values of the evaluation period (the test data).
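
The evaluation in step S14 can be sketched as a hold-out test: fit on the learning period, forecast the evaluation period, and score the forecast with MAPE. The function names and the naive last-value forecaster used as a stand-in algorithm are assumptions for this example.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def naive_last_value(train, horizon):
    """Stand-in forecaster: repeat the last observed value."""
    return [train[-1]] * horizon


def evaluate(series, n_eval, forecaster):
    """Split off the last n_eval points as the evaluation period and return the MAPE."""
    train, test = series[:-n_eval], series[-n_eval:]
    predictions = forecaster(train, len(test))
    return mape(test, predictions)
```

Running `evaluate` once per candidate and per algorithm gives the 30 × 6 = 180 prediction calculations in the example above.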

For each explanatory variable candidate, the server determines the most accurate algorithm (for example, the one with the lowest error rate) and selects the time series model constructed with that algorithm as the prediction model for the candidate. Time series models based on different algorithms may be selected for different explanatory variable candidates. The error rate here is assumed to be MAPE, but another index may be used, in which case a threshold for that index may be set in step S12.

However, when the prediction accuracy of every algorithm for a given explanatory variable candidate is worse than the prediction accuracy threshold set in step S12 (for example, when the error rate exceeds the threshold set in step S12), that candidate is excluded so that it is not used in constructing the multiple regression model. In other words, through steps S13 and S14, the explanatory variables adopted in the multiple regression model are selected based on the prediction accuracy of the explanatory variable candidates by the time series models.
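
The selection and exclusion rule of step S14 can be sketched as follows: for each candidate, pick the algorithm with the lowest MAPE, and drop the candidate if even that best score exceeds the threshold. The function name, the table layout, and the algorithm labels are hypothetical.

```python
def select_models(mape_table, threshold):
    """Return {candidate: best_algorithm} for candidates that meet the threshold.

    mape_table: {candidate_name: {algorithm_name: mape_percent}}.
    A candidate is excluded when all of its algorithms exceed the threshold,
    which is equivalent to its best (lowest) MAPE exceeding the threshold.
    """
    selected = {}
    for candidate, scores in mape_table.items():
        best_algorithm = min(scores, key=scores.get)
        if scores[best_algorithm] <= threshold:
            selected[candidate] = best_algorithm
    return selected
```

For instance, with a threshold of 10, a candidate whose scores are 12.0 and 4.0 is kept with the second algorithm, while a candidate whose scores are 30.0 and 25.0 is excluded.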

The server constructs a multiple regression model of the objective variable using the explanatory variable candidates that have not been excluded in the preceding steps (step S15). The multiple regression model can be constructed by using a technique such as the stepwise method, forward selection, backward elimination, or exhaustive (all-subsets) search to determine the combination of explanatory variables that minimizes a predetermined criterion (for example, Akaike's Information Criterion (AIC)). Processing that constructs a model using AIC (that is, processing that selects the explanatory variables used for prediction) may be called an AIC check.

As the predetermined criterion (an index representing the usefulness of explanatory variables), the F value, the Bayesian Information Criterion (BIC), or the like may also be used.
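
The AIC check of step S15 can be sketched with the exhaustive search variant: fit a least-squares model for every non-empty subset of candidates and keep the subset with the smallest AIC. The Gaussian-error form AIC = n ln(RSS/n) + 2k (constant terms dropped), the normal-equation solver, and all function names are assumptions for this example; an exhaustive search is practical only for a small number of candidates.

```python
from itertools import combinations
from math import log


def solve(a, b):
    """Gaussian elimination with partial pivoting; assumes a non-singular matrix."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def residual_sum_of_squares(columns, y):
    """Fit y on the given columns plus an intercept and return the RSS."""
    rows = [[col[i] for col in columns] + [1.0] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y))


def aic(columns, y):
    """AIC under Gaussian errors, up to an additive constant."""
    n, k = len(y), len(columns) + 1  # k counts coefficients plus the intercept
    rss = max(residual_sum_of_squares(columns, y), 1e-12)  # guard against log(0)
    return n * log(rss / n) + 2 * k


def best_subset(candidates, y):
    """Return the tuple of candidate names whose model minimizes AIC."""
    names = list(candidates)
    best_score, best_names = None, None
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            score = aic([candidates[name] for name in subset], y)
            if best_score is None or score < best_score:
                best_score, best_names = score, subset
    return best_names
```

Swapping the penalty term `2 * k` for `k * log(n)` would turn the same search into a BIC-based selection.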

The server calculates the VIF for the multiple regression model constructed in step S15 and checks the correlation among the explanatory variables (VIF check) (step S16). When the VIF calculated for a pair of explanatory variables is greater than or equal to the VIF value set in step S12 (for example, 10), one of the two explanatory variables is excluded. This suppresses the multicollinearity problem. When the multicollinearity check is disabled in step S12, step S16 can be skipped.
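
For a pair of explanatory variables, the VIF can be computed from their correlation coefficient r as 1 / (1 − r²). The sketch below flags pairs at or above the threshold; the pairwise formulation follows the wording of the text, while the function names are assumptions for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def pairwise_vif(a, b):
    """VIF for two variables: 1 / (1 - r^2); infinite when perfectly correlated."""
    r_squared = pearson(a, b) ** 2
    if r_squared >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - r_squared)


def collinear_pairs(variables, threshold=10.0):
    """Return the name pairs whose VIF is at or above the threshold."""
    return [
        (p, q)
        for p, q in combinations(variables, 2)
        if pairwise_vif(variables[p], variables[q]) >= threshold
    ]
```

With the example threshold of 10, a pair is flagged once r² reaches 0.9, that is, once the absolute correlation is roughly 0.95 or higher.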

From the combination of explanatory variables defining the multiple regression model determined in step S15, the server removes the explanatory variables excluded in step S16, again determines the combination of explanatory variables that minimizes the predetermined criterion, and reconstructs the multiple regression model (step S17). For example, for a pair of explanatory variables that did not pass the VIF check in step S16, the server may perform the AIC check on the model with one of the two removed and on the model with the other removed, and adopt whichever model has the smaller AIC.

Performing the processing in the order of steps S13 to S17 successively reduces the number of explanatory variable candidates, so the amount of computation required for model construction is suitably reduced while prediction accuracy is maintained.

The server inputs the predicted values calculated by the time series models of the explanatory variables constructed in step S14 into the multiple regression model completed in step S17, and calculates the predicted values of the objective variable for the prediction period (step S18).
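
Step S18 amounts to evaluating the regression equation once per future time step, with each explanatory variable replaced by its time series forecast. A minimal sketch follows, in which the function name and the dictionary-based layout are assumptions.

```python
def predict_objective(coefficients, intercept, forecasts):
    """Combine per-variable forecasts into objective variable predictions.

    coefficients: {variable_name: regression coefficient}
    forecasts: {variable_name: [forecast for each step of the prediction period]}
    All forecast lists must have the same length.
    """
    names = list(coefficients)
    horizon = len(forecasts[names[0]])
    return [
        intercept + sum(coefficients[name] * forecasts[name][t] for name in names)
        for t in range(horizon)
    ]
```

Because the inputs are themselves forecasts, any error in an explanatory variable's time series model carries into the objective variable prediction, which is the motivation for the accuracy-based exclusion in step S14.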

The server may transmit to the device 20 information on the multiple regression model completed in step S17, the time series model of each explanatory variable (candidate) created in step S14, and the predicted values of the explanatory variables and/or the objective variable calculated in step S18. For example, the server may transmit to the device 20 a CSV file containing the predicted values of the explanatory variables (candidates) and/or the objective variable.

The server may also generate a graph displaying the actual values and predicted values of the explanatory variables (or explanatory variable candidates) and/or the objective variable, and transmit it to the device 20 for display. When the above prediction processing is executed with data stationarization enabled, the stationarized data may be restored to the original data for display, or may be displayed as stationarized.

In steps S15 to S17, the server may construct a plurality of multiple regression models, in which case it may obtain the prediction result for each model in step S18.

The server may communicate with the device 20 so that the device displays an operation screen for the flow of FIG. 3, and may control each process in response to user input and the like on the device 20. FIG. 4 is a diagram showing an example of the explanatory variable candidate selection screen for step S12 of FIG. 3. FIG. 4 shows a display screen 400 displayed on the device 20.

The display screen 400 may show a tab section 401 for switching screens. When a given tab is selected by a user operation, the information display and operation area 411 may switch to the corresponding content. In FIG. 4, the explanatory variable selection tab is selected.

The area 411 lists the explanatory variable candidates (parameters) used to predict the objective variable (here, data called commodity futures A). Through user operation, all of these candidates may be selected so that they are considered for model construction, or some candidates may be left unselected so that they are excluded from the outset.

FIG. 5 is a diagram showing an example of a display screen for graphs of the explanatory variables and the objective variable in step S18 of FIG. 3. In the tab portion 401, the prediction result tab is selected.

The area 421 is used to display reduced-size views of the actual and predicted values of the explanatory variables (E) and the objective variable (O). Through user operation, one or more variables can be displayed enlarged in the area 422. In FIG. 5, the check box for commodity future A, the objective variable, is checked, and the area 422 shows the predicted and actual values of commodity future A.

According to the embodiment of the information processing method described above, explanatory variables with poor prediction accuracy can be kept from being used to predict the objective variable, which suppresses situations in which the prediction accuracy of the objective variable deteriorates significantly.

The server may also construct a prediction model by a method other than the objective variable prediction method of this embodiment (time series model + multiple regression model). For example, prediction model construction based on a state space model, a neural network model (for example, a recurrent neural network), or the like may be configured.

(Device configuration)
FIG. 6 is a diagram showing an example of the functional configuration of a server according to an embodiment of the present invention. The server 10 includes a control unit 11, a storage unit 12, a communication unit 13, an input unit 14, and an output unit 15. This example mainly shows the functional blocks characteristic of this embodiment; the server 10 may also have other functional blocks needed for other processing, and some of the functional blocks may be omitted.

The control unit 11 controls the server 10. The control unit 11 can be constituted by a controller, a control circuit, or a control device as described based on common understanding in the technical field of the present invention. The control unit 11 can constitute the acquisition unit, the selection unit, the construction unit, and the like according to an embodiment of the present invention.

For example, the control unit 11 may function as an acquisition unit that acquires data via the storage unit 12, the communication unit 13, the input unit 14, and the like. The control unit 11 may also function as a selection unit that selects an objective variable and one or more explanatory variable candidates from the acquired data, and as a construction unit that constructs a time series model for each explanatory variable candidate.

The control unit (construction unit) 11 may construct a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of each candidate's time series model.

The control unit (construction unit) 11 may calculate the prediction accuracy of the time series model of a given explanatory variable candidate using test data and, when that prediction accuracy is worse than a predetermined threshold, may refrain from using that candidate as an explanatory variable of the multivariate model.

When predetermined information on the prediction accuracy (for example, MAPE, the mean absolute percentage error) exceeds a predetermined threshold, the control unit (construction unit) 11 may refrain from using the corresponding explanatory variable candidate as an explanatory variable of the multivariate model.
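The accuracy-based exclusion described above can be sketched as follows. This is an illustrative outline, not the patented implementation: it assumes single exponential smoothing as the time series model, a hold-out of the last `n_test` points as the test data, and hypothetical names (`filter_candidates`, `mape_threshold`, `alpha`).

```python
def simple_exp_smoothing_forecast(train, alpha=0.5):
    """Fit single exponential smoothing and return its flat forecast (last level)."""
    level = train[0]
    for x in train[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def filter_candidates(candidates, n_test, mape_threshold):
    """Keep only candidates whose time series model has a MAPE within the threshold."""
    kept = {}
    for name, series in candidates.items():
        train, test = series[:-n_test], series[-n_test:]  # first period / second period
        forecast = simple_exp_smoothing_forecast(train)
        score = mape(test, [forecast] * len(test))
        if score <= mape_threshold:
            kept[name] = score
    return kept


candidates = {
    "stable": [100, 101, 99, 100, 100, 101, 100, 99],
    "volatile": [100, 10, 200, 5, 300, 20, 400, 50],
}
kept = filter_candidates(candidates, n_test=2, mape_threshold=10.0)
```

With this toy data the hard-to-forecast "volatile" series is dropped before the multivariate model is built, so its forecast error cannot propagate into the prediction of the objective variable.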

The control unit (selection unit) 11 may select the explanatory variable candidates from data whose correlation coefficient with the data corresponding to the objective variable has an absolute value equal to or greater than a predetermined value.

The control unit (selection unit) 11 may also obtain the correlation after sliding the data corresponding to an explanatory variable candidate in the time direction relative to the data corresponding to the objective variable.
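A minimal sketch of correlation-based candidate selection with time sliding is given below. It assumes equal-length series, the Pearson correlation coefficient, and a candidate that leads the objective variable by 0 to `max_lag` steps; the function names and the parameters `max_lag` and `min_abs_r` are illustrative, not taken from the patent.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_lagged_correlation(candidate, objective, max_lag):
    """Pair candidate[t] with objective[t + lag]; return the lag with the largest |r|."""
    best_lag, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        x = candidate[: len(candidate) - lag]
        y = objective[lag:]
        if len(x) < 3:  # too few overlapping points to be meaningful
            break
        r = pearson(x, y)
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, r
    return best_lag, best_r


def select_candidates(series_by_name, objective, max_lag, min_abs_r):
    """Keep series whose best lagged |correlation| reaches the threshold."""
    selected = {}
    for name, series in series_by_name.items():
        lag, r = best_lagged_correlation(series, objective, max_lag)
        if abs(r) >= min_abs_r:
            selected[name] = (lag, r)
    return selected


leader = [1, 3, 2, 5, 4, 7, 6, 9, 8, 10]
objective = [0, 0, -2, -6, -4, -10, -8, -14, -12, -18]  # -2 * leader, delayed by 2 steps
selected = select_candidates({"leader": leader, "flat": [3] * 10}, objective, 3, 0.9)
```

Using the absolute value keeps strongly negative relationships such as the one above, which a signed threshold would discard.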

The control unit (construction unit) 11 may construct the multivariate model based on a predetermined criterion (for example, AIC, the Akaike information criterion).
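One way AIC can guide the construction of a multiple regression model is sketched below. The exhaustive subset search, the Gaussian form AIC = n ln(RSS/n) + 2k (constants dropped), and all function names are assumptions made for illustration; the text does not state which search procedure is used, and stepwise selection would be more practical with many candidates. The solver also assumes the chosen columns are not perfectly collinear.

```python
import itertools
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_rss(columns, y):
    """Fit y ~ intercept + columns by least squares; return the residual sum of squares."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant; rss is floored to avoid log(0)."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * n_params


def best_subset_by_aic(variables, y):
    """Return (aic, subset) for the non-empty variable subset with the lowest AIC."""
    best = None
    names = sorted(variables)
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            rss = ols_rss([variables[v] for v in subset], y)
            score = aic(rss, len(y), len(subset) + 1)  # +1 for the intercept
            if best is None or score < best[0]:
                best = (score, subset)
    return best


x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 3, 8, 1, 9, 2, 7, 4]
noise = [0.1, -0.2, 0.15, 0.05, -0.1, 0.2, -0.15, -0.05]
y = [3 * a + e for a, e in zip(x1, noise)]
score, subset = best_subset_by_aic({"x1": x1, "x2": x2}, y)
```

The 2k penalty term is what discourages adding explanatory variables that reduce the residual only marginally.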

After first constructing a multivariate model based on a predetermined criterion (for example, AIC), the control unit (construction unit) 11 may exclude at least one variable from each combination of explanatory variables whose VIF (variance inflation factor) is equal to or greater than a predetermined VIF value, and then reconstruct the multivariate model.
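The VIF-based exclusion can be sketched as below. This assumes the pairwise form VIF = 1 / (1 - r^2) for each pair of explanatory variables, a threshold of 10, and a rule that drops the later-listed variable of an offending pair; these choices and the function names are illustrative assumptions, not details confirmed by the text.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def pairwise_vif(x, y):
    """Variance inflation factor for a pair of variables: 1 / (1 - r^2)."""
    r2 = pearson(x, y) ** 2
    return math.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)


def drop_collinear(variables, vif_threshold=10.0):
    """Return variable names to keep; for each offending pair, drop the later one."""
    kept = list(variables)
    for a, b in itertools.combinations(list(variables), 2):
        if a in kept and b in kept:
            if pairwise_vif(variables[a], variables[b]) >= vif_threshold:
                kept.remove(b)
    return kept


variables = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],  # nearly 2 * x1, so strongly collinear
    "x3": [3, 1, 4, 1, 5, 9],
}
kept = drop_collinear(variables)
```

The multivariate model would then be refitted on the remaining variables only, which stabilizes the regression coefficients.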

The control unit 11 is preferably able to perform transaction data processing and/or analytical data processing (online analysis) in real time; for example, it preferably performs in-memory data processing.

The storage unit 12 stores (holds) information used by the server 10. For example, the storage unit 12 sequentially stores the data from which the objective and explanatory variables are derived. The storage unit 12 can be constituted by, for example, a memory, storage, or a storage device as described based on common understanding in the technical field of the present invention.

The communication unit 13 communicates various kinds of information with the device 20. The communication unit 13 can be constituted by a transmitter/receiver, a transmission/reception circuit, or a transmission/reception device as described based on common understanding in the technical field of the present invention. The communication unit 13 may be composed of a transmission unit and a reception unit.

The input unit 14 accepts input through user operation. The input unit 14 may also be connected to a predetermined device or storage medium and accept data input. The input unit 14 may output the input result to, for example, the control unit 11.

The input unit 14 can be constituted by an input device such as a keyboard, a mouse, or buttons, or by input/output terminals, an input/output circuit, or the like, as described based on common understanding in the technical field of the present invention. The input unit 14 may also be integrated with a display unit (for example, as a touch panel).

The output unit 15 outputs various kinds of information in a form the user can perceive. For example, the output unit 15 may include a display unit that displays images, an audio output unit that outputs audio, and the like. The display unit can be constituted by, for example, a display device such as a display or a monitor, and the audio output unit by an output device such as a speaker, each as described based on common understanding in the technical field of the present invention.

The output unit 15 can include, for example, an arithmetic unit, an arithmetic circuit, an arithmetic device, a player, an image/video/audio processing circuit, an image/video/audio processing device, an amplifier, and the like, as described based on common understanding in the technical field of the present invention.

The device 20 may also have the same configuration as in FIG. 6. The server 10 may be implemented on a cloud platform, and may provide services referred to as SaaS (Software as a Service), PaaS (Platform as a Service), IaaS (Infrastructure as a Service), and the like.

(Hardware configuration)
The block diagrams used to describe the above embodiment show blocks in functional units. These functional blocks (components) are realized by any combination of hardware and/or software, and the means of realizing each functional block is not particularly limited. That is, each functional block may be realized by a single physically integrated device, or by two or more physically separate devices connected by wire or wirelessly.

For example, the server 10 and the like in an embodiment of the present invention may function as a computer that performs the processing of the information processing method of the present invention. FIG. 7 is a diagram showing an example of the hardware configuration of a server and a device according to an embodiment of the present invention. The server 10, the device 20, and the like described above may be physically configured as a computer apparatus including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.

In the following description, the term "apparatus" can be read as a circuit, a device, a unit, or the like. The hardware configuration of the server 10, the device 20, and the like may include one or more of each apparatus shown in the figure, or may omit some of them.

For example, although only one processor 1001 is illustrated, there may be a plurality of processors. Processing may be executed by a single processor, or by one or more processors simultaneously, sequentially, or in some other manner.

Each function of the server 10, the device 20, and the like is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, so that the processor 1001 performs computation and controls communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.

The processor 1001, for example, runs an operating system to control the computer as a whole. The processor 1001 may be constituted by a central processing unit (CPU) that includes an interface with peripheral devices, a control device, an arithmetic device, registers, and the like. Units such as the control unit 11 described above may be realized by the processor 1001. The processor 1001 may be implemented with one or more chips.

The processor 1001 also reads programs (program code), software modules, and data from the storage 1003 and/or the communication device 1004 into the memory 1002, and executes various kinds of processing in accordance with them. The program used is one that causes a computer to execute at least part of the operations described in the above embodiment. For example, the control unit 11 may be realized by a control program that is stored in the memory 1002 and runs on the processor 1001, and the other functional blocks may be realized in the same way.

The memory 1002 is a computer-readable recording medium and may be constituted by at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), a RAM (Random Access Memory), and other suitable storage media. The memory 1002 may be referred to as a register, a cache, a main memory (main storage device), or the like. The memory 1002 can store programs (program code), software modules, and the like that are executable to carry out the information processing method according to an embodiment of the present invention.

The storage 1003 is a computer-readable recording medium and may be constituted by at least one of, for example, a flexible disk, a floppy (registered trademark) disk, a magneto-optical disk (for example, a compact disc (such as a CD-ROM (Compact Disc ROM)), a digital versatile disc, or a Blu-ray (registered trademark) disc), a removable disk, a hard disk drive, a smart card, a flash memory device (for example, a card, stick, or key drive), a magnetic stripe, a database, a server, and other suitable storage media. The storage 1003 may be referred to as an auxiliary storage device. The storage unit 12 described above may be realized by the memory 1002 and/or the storage 1003.

The communication device 1004 is hardware (a transmission/reception device) for communication between computers over a wired and/or wireless network, and is also referred to as, for example, a network device, a network controller, a network card, or a communication module. The communication unit 13 described above may be realized by the communication device 1004.

The input device 1005 is an input device (for example, a keyboard or a mouse) that accepts input from outside. The output device 1006 is an output device (for example, a display or a speaker) that performs output to the outside. The input device 1005 and the output device 1006 may be integrated (for example, as a touch panel). The input unit 14 and the output unit 15 described above may be realized by the input device 1005 and the output device 1006, respectively.

The apparatuses such as the processor 1001 and the memory 1002 are connected by a bus 1007 for communicating information. The bus 1007 may be a single bus, or different buses may be used between different apparatuses.

The server 10, the device 20, and the like may also include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), and some or all of the functional blocks may be realized by that hardware. For example, the processor 1001 may be implemented with at least one of these kinds of hardware.

(Modifications)
The terms described in this specification and/or the terms needed to understand this specification may be replaced with terms having the same or similar meanings.

The information, parameters, and the like described in this specification may be expressed as absolute values, as values relative to a predetermined value, or by other corresponding information. The names used for parameters and the like in this specification are not limiting in any respect.

The information, signals, and the like described in this specification may be represented using any of a variety of different technologies. For example, the data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be mentioned throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.

Information, signals, and the like may be input and output via a plurality of network nodes. Input and output information, signals, and the like may be stored in a specific location (for example, a memory) or managed in a table. Input and output information, signals, and the like may be overwritten, updated, or appended to. Output information, signals, and the like may be deleted. Input information, signals, and the like may be transmitted to another apparatus.

Notification of predetermined information (for example, notification that "X holds") is not limited to explicit notification and may be performed implicitly (for example, by not performing notification of that predetermined information).

Software, whether referred to as software, firmware, middleware, microcode, or hardware description language, or by any other name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.

Software, instructions, information, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using wired technology (such as coaxial cable, optical fiber cable, twisted pair, or digital subscriber line (DSL)) and/or wireless technology (such as infrared or microwave), these wired and/or wireless technologies are included within the definition of a transmission medium.

The terms "system" and "network" as used in this specification are used interchangeably.

Each aspect/embodiment described in this specification may be used alone, in combination, or switched during execution. The order of the processing procedures, sequences, flowcharts, and the like of each aspect/embodiment described in this specification may be changed as long as no contradiction arises. For example, the methods described in this specification present the elements of the various steps in an exemplary order and are not limited to the specific order presented.

As used in this specification, the phrase "based on" does not mean "based only on" unless expressly stated otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on".

Although the present invention has been described in detail above, it is apparent to those skilled in the art that the present invention is not limited to the embodiments described in this specification. The present invention can be implemented with modifications and changes without departing from the spirit and scope of the invention as defined by the claims. Accordingly, the description in this specification is for illustrative purposes and has no restrictive meaning with respect to the present invention.

Claims

An information processing apparatus comprising:
an acquisition unit that acquires data;
a selection unit that selects an objective variable and one or more explanatory variable candidates from the data; and
a construction unit that constructs a time series model of each explanatory variable candidate using data of a first period of the data, and calculates the prediction accuracy of the time series model of each explanatory variable candidate using data of a second period of the data,
wherein the construction unit constructs a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of the time series model of each explanatory variable candidate.

The information processing apparatus according to claim 1, further comprising a control unit that inputs, as the one or more explanatory variables, predicted values calculated by the time series model of each explanatory variable into the multivariate model and calculates a predicted value of the objective variable.

The information processing apparatus according to claim 1, wherein, when the prediction accuracy of the time series model of a given explanatory variable candidate is worse than a predetermined threshold, the construction unit does not use that explanatory variable candidate as an explanatory variable of the multivariate model.

The information processing apparatus according to any one of claims 1 to 3, wherein the selection unit selects each explanatory variable candidate from data whose correlation coefficient with the data corresponding to the objective variable has an absolute value equal to or greater than a predetermined value.

The information processing apparatus according to claim 4, wherein the selection unit obtains the correlation by sliding the data corresponding to each explanatory variable candidate in the time direction relative to the data corresponding to the objective variable.

The information processing apparatus according to claim 1, wherein the time series model is at least one of single exponential smoothing, double exponential smoothing, triple exponential smoothing (additive), triple exponential smoothing (multiplicative), autoregressive integrated moving average (ARIMA), and seasonal autoregressive integrated moving average (SARIMA: Seasonal ARIMA).

The information processing apparatus according to claim 1, wherein the multivariate model is a multiple regression model.

The information processing apparatus according to any one of claims 1 to 7, wherein the construction unit constructs the multivariate model based on the Akaike information criterion (AIC).

The information processing apparatus according to claim 8, wherein the construction unit first constructs the multivariate model based on the AIC, then excludes at least one variable from each combination of explanatory variables whose variance inflation factor (VIF) is equal to or greater than a predetermined VIF value, and reconstructs the multivariate model.

An information processing method comprising:
a step of acquiring data;
a step of selecting an objective variable and one or more explanatory variable candidates from the data;
a step of constructing a time series model of each explanatory variable candidate using data of a first period of the data, and calculating the prediction accuracy of the time series model of each explanatory variable candidate using data of a second period of the data; and
a step of constructing a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of the time series model of each explanatory variable candidate.

An information processing system including an information processing apparatus,
wherein the information processing apparatus comprises:
an acquisition unit that acquires data;
a selection unit that selects an objective variable and one or more explanatory variable candidates from the data; and
a construction unit that constructs a time series model of each explanatory variable candidate using data of a first period of the data, and calculates the prediction accuracy of the time series model of each explanatory variable candidate using data of a second period of the data,
and wherein the construction unit constructs a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of the time series model of each explanatory variable candidate.

A program causing a computer to execute:
a procedure of acquiring data;
a procedure of selecting an objective variable and one or more explanatory variable candidates from the data;
a procedure of constructing a time series model of each explanatory variable candidate using data of a first period of the data, and calculating the prediction accuracy of the time series model of each explanatory variable candidate using data of a second period of the data; and
a procedure of constructing a multivariate model of the objective variable using one or more explanatory variables selected from the explanatory variable candidates based on the prediction accuracy of the time series model of each explanatory variable candidate.