JP7139625B2

JP7139625B2 - Factor analysis system, factor analysis method and program

Info

Publication number: JP7139625B2
Application number: JP2018037841A
Authority: JP
Inventors: 直人石橋
Original assignee: Fuji Electric Co Ltd
Current assignee: Fuji Electric Co Ltd
Priority date: 2017-08-04
Filing date: 2018-03-02
Publication date: 2022-09-21
Anticipated expiration: 2038-03-02
Also published as: JP2019032807A

Description

The present invention relates to a factor analysis system, a factor analysis method, and a program.

In the recent reform of the electric power system, the "planned-value balancing system" (a scheme that requires planned and actual values to match in each period) has become a topic, for example. Under this system, accurate demand forecasting is important. However, conventional demand forecasting by energy companies (electric power companies and the like) is often operated using a model that predicts the power demand of the company's own area (territory) by taking fluctuation factors such as "weather information" and "calendar information" as input data.

However, with the implementation of the above electric power system reform, electric power retailers, so-called "new power" companies (hereinafter referred to as new power providers), can now participate as new energy companies in addition to the existing electric power companies.

The customer power demand that a new power provider forecasts fluctuates greatly from day to day depending on the contract status. The operator (hereinafter referred to as the user) therefore needs to analyze the fluctuation factors of the provider's power demand periodically and to forecast new power demand based on input data that reflects the analysis results. This forecasting of new power demand is necessary not only for new power providers but also for existing electric power companies.

A correlation structure model based on correlation coefficients, as used in conventional power demand forecasting, cannot obtain the pure relationship between explanatory variables and the objective variable because of apparent correlations known as "spurious correlations".

As a related technique for maximum power forecasting, it has been proposed to analyze in advance, by correlation analysis, the demand factors that are likely to be related to power demand, and to select the input data used for power demand forecasting from the results (see, for example, Non-Patent Document 1).

Correlation analysis is a method that quantifies the relationship between factors in the range of -1 to 1 using correlation coefficients. However, when there are many factors to analyze, it may not be possible to decide which correlation-coefficient threshold should determine the variables used as input data.

In recent years, a technique called graphical modeling, which uses partial correlation coefficients, has been applied to demand forecasting (see, for example, Patent Document 2). A correlation structure model based on graphical modeling can remove the "spurious correlations" described above, so appropriate factors can be selected when forecasting demand.

Graphical modeling assumes conditional independence, sets the partial correlation coefficient between certain factors to 0 (uncorrelated), and estimates the remaining partial correlation coefficients. Factors whose partial correlation coefficient is 0 (uncorrelated) in the output can therefore be excluded from the input data for demand forecasting, which reduces the number of factors and narrows them down.
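The difference between a plain correlation coefficient and a partial correlation coefficient can be illustrated with three variables. The following is only a minimal sketch, not the patent's graphical modeling procedure: the function names `pearson` and `partial_corr` and the sample data are assumptions made for illustration, and the first-order partial correlation formula is used in place of a full covariance selection step.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data: z drives both x and y, so x and y look strongly related.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [zi + e for zi, e in zip(z, [0.5, -0.5] * 4)]
y = [zi + e for zi, e in zip(z, [0.5, 0.5, -0.5, -0.5] * 2)]

print("correlation(x, y)      =", round(pearson(x, y), 3))
print("partial corr(x, y | z) =", round(partial_corr(x, y, z), 3))
```

In this made-up data the plain correlation between `x` and `y` is high only because both follow `z`; once `z` is controlled for, the partial correlation is close to zero, which is the kind of spurious relationship the text describes.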

Takeshi Haida, Shoichi Mutoh, "Development of Maximum Prediction Support System Based on Multiple Regression Method", Operations Research, Vol. 41, No. 9, pp. 476-480 (1996-9)
Eitaro Kurata, Hiroyuki Mori, "Input variable selection method for short-term power load forecasting using graphical modeling", 2007 Institute of Electrical Engineers of Japan Division B Conference, No. 49 (2007-9)

The correlation analysis and graphical modeling described above are both statistics-based analysis methods. They are effective for linear factors, but the accuracy of the factor analysis decreases when non-linear factors are analyzed.

Accordingly, an object of the present invention is to improve the accuracy of factor analysis.

To solve the above problem, one aspect of the present invention comprises:
an input processing unit that performs input processing on non-linear analysis target data for factor analysis;
a data preprocessing unit that performs predetermined processing on the analysis target data;
a linear part extraction unit that extracts linear-part data from the analysis target data on which the input processing unit has performed input processing, or from the analysis target data processed by the data preprocessing unit; and
a factor analysis unit that analyzes, for the linear-part data extracted by the linear part extraction unit, the relationship of factors to the prediction target, and performs control to display the analysis results quantitatively,
wherein the factor analysis unit performs variable selection based on analysis results obtained by applying a multivariate analysis method to the linear part extracted by the linear part extraction unit.

In the above, the analysis target data includes at least one of prediction target information of an energy company, weather information, calendar information, and event information.

In the above, the data preprocessing unit visualizes the relationship between the prediction target and the factors.

In any of the above, the data preprocessing unit performs one or both of removing abnormal data and interpolating missing data contained in the analysis target data.

In the above, the linear part extraction unit extracts the linear part from the analysis target data using a decision tree or a clustering method.

In the above, the factor analysis unit applies graphical modeling as the multivariate analysis method.

In the above, the factor analysis unit divides the plurality of factors contained in the analysis target data into a plurality of groups, performs the graphical modeling on the one or more factors belonging to each group, and then integrates the results.

To solve the above problem, another aspect of the present invention comprises:
performing input processing on non-linear analysis target data for factor analysis;
performing predetermined processing on the analysis target data;
extracting linear-part data from the analysis target data that has undergone the input processing, or from the analysis target data that has undergone the predetermined processing; and
analyzing, for the extracted linear-part data, the relationship of factors to the prediction target, and performing control to display the analysis results quantitatively,
wherein analyzing the relationship of the factors includes performing variable selection based on analysis results obtained by applying a multivariate analysis method to the extracted linear part.

To solve the above problem, still another aspect of the present invention is a program that causes a computer to execute processing comprising:
performing input processing on non-linear analysis target data for factor analysis;
performing predetermined processing on the analysis target data;
extracting linear-part data from the analysis target data that has undergone the input processing, or from the analysis target data that has undergone the predetermined processing; and
analyzing, for the extracted linear-part data, the relationship of factors to the prediction target, and performing control to display the analysis results quantitatively, wherein analyzing the relationship of the factors includes performing variable selection based on analysis results obtained by applying a multivariate analysis method to the extracted linear part.

According to the present invention, the accuracy of factor analysis can be improved.

A diagram showing an example of the hardware configuration of a factor analysis system according to an embodiment of the present invention.
A diagram showing an example of the data structure of the analysis target data in the input unit of the factor analysis system according to the embodiment.
A diagram showing an example of the hardware configuration of a general computer.
A diagram showing a specific example of the data visualization processing and the data removal/interpolation processing according to the embodiment.
A diagram showing an example of data processing in the data preprocessing unit according to the embodiment.
A diagram showing an example of a decision tree in the linear part extraction unit according to the embodiment.
A diagram showing an example of linear part extraction by a clustering method according to the embodiment.
A diagram showing an example of analysis results, with quantitative indices, produced by the factor analysis unit of the factor analysis system according to the embodiment.
A diagram showing an example of a report based on the analysis results of the factor analysis unit of the factor analysis system according to the embodiment.
A flow diagram explaining the operation of the factor analysis unit of the factor analysis system according to the embodiment.
A conceptual diagram showing the process by which the factor analysis unit divides the analysis target data in the factor analysis system according to the embodiment.
A diagram showing a specific example of the analysis target data shown in FIG. 11 before division.
A diagram showing a specific example of divided data 1 after division of the analysis target data shown in FIG. 11.
A diagram showing a specific example of divided data 2 after division of the analysis target data shown in FIG. 11.
A conceptual diagram showing how the factor analysis unit integrates analysis results in the factor analysis system according to the embodiment.
A diagram showing a specific example of the analysis results corresponding to divided data 1, shown in FIG. 12B, before integration.
A diagram showing a specific example of the analysis results corresponding to divided data 2, shown in FIG. 12C, before integration.
A diagram showing a specific example of the analysis results obtained by integrating the analysis results of divided data 1 shown in FIG. 14A and the analysis results of divided data 2 shown in FIG. 14B.
A flowchart explaining the algorithm of the graphical modeling according to the embodiment.
A flowchart explaining the algorithm of the decision tree according to the embodiment.
A flow diagram explaining the operation of the factor analysis unit of a factor analysis system according to another embodiment of the present invention.
A conceptual diagram showing how the factor analysis unit removes discrete-value factors from the analysis target data in the factor analysis system according to the other embodiment.

Embodiments of the present invention are described in detail below with reference to the drawings. In the following embodiments, the factor analysis system is described as a system that analyzes fluctuation factors for power demand forecasting. However, the factor analysis system may also be used to analyze various factors other than the fluctuation factors of power demand forecasting.

Because the maximum power demand fluctuates with various factors such as holidays, weekdays, and weather conditions, the data analyzed by the factor analysis system of the embodiment (hereinafter, the analysis target data) is normally non-linear in nature. The analysis target data in the embodiment is therefore described as non-linear data. Consequently, a linear method such as those of the prior art cannot be applied as is when analyzing the analysis target data.

FIG. 1 is a diagram showing an example of the hardware configuration of a factor analysis system according to an embodiment of the present invention. The hardware of the factor analysis system 100 is shown as functional blocks and includes: an input unit 10 that inputs data for analysis (the analysis target data described above); an arithmetic device 20 that divides the input analysis target data into variable items and stores them in a storage device 30, and that selects and processes variable-item data retrieved from the storage device 30; the storage device 30, which stores the input analysis target data divided into variable items; and an output data storage/output unit 40 that stores the data output from the arithmetic device 20 and displays the output data.

A display or the like may be used as the output data storage/output unit 40 that displays the output data. The data shown on the display or the like may be modifiable, for example, through a GUI (Graphical User Interface).

Next, the factor analysis system shown in FIG. 1 is described in more detail. The input unit 10 inputs to the arithmetic device 20 data that can be analyzed (analysis target data), such as demand information managed by the energy company, weather information provided by the Meteorological Agency or the like, calendar information, and information on events planned by the energy company. The input unit 10 may be a keyboard or the like. The input unit 10 may also input to the arithmetic device 20 analysis target data acquired from equipment such as a LAN (Local Area Network), not shown, or measurement sensors. The arithmetic device 20 that has received the analysis target data divides it into variable items (for example, demand information, weather information, calendar information, and event information) and stores them in the storage device 30.

The arithmetic device 20 then retrieves the stored data from the storage device 30, sets the objective variable data and explanatory variable data as the analysis target data, and performs the factor analysis.

The analysis target data, whether or not it is ultimately used as input data for demand forecasting, may include past actual values or forecast values of the analysis target, qualitative variables on a nominal scale that represent differences in category (for example, day-of-week information), discrete values (0, 1) as dummy variables, and the like.

The analysis target data contains demand (objective variable data) and factors (explanatory variable data). The factors are factors related to demand. For example, the analysis target data is stored in the storage device 30 with the data structure shown in FIG. 2. The storage device 30 may be a database.

The analysis target data in the example of FIG. 2 covers a data acquisition period from July 1, 2015 to March 31, 2016. It contains the maximum power demand of the day (Y1) as the objective variable data, and the maximum power demand of the previous day (X1), the average temperature of the day (X2), the solar radiation at 13:00 on the day (X3), the event of the day (X4), and so on as the explanatory variable data. The analysis target data is not limited to the example of FIG. 2.

For example, the number of explanatory variable data items (the number of factors) in the example of FIG. 2 may be three or fewer, or five or more. In the example of FIG. 2, the explanatory variable data further contains other factors (Xn, where n is an integer of 5 or more).
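One way to picture the data structure described for FIG. 2 is as a list of daily records with one objective variable and several explanatory variables. The sketch below is only an illustration under assumptions: the class name `DailyRecord`, the field names, and the sample values are hypothetical and do not come from the patent's figure.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

@dataclass
class DailyRecord:
    """One row of analysis target data (field names are illustrative)."""
    day: date
    y1_max_demand: Optional[float]           # objective variable Y1
    x1_prev_day_max_demand: Optional[float]  # explanatory variable X1
    x2_avg_temperature: Optional[float]      # explanatory variable X2
    x3_solar_radiation_13h: Optional[float]  # explanatory variable X3
    x4_event: int                            # explanatory variable X4 (0/1 dummy)

def split_target_and_factors(
    records: List[DailyRecord],
) -> Tuple[List[Optional[float]], List[List[Optional[float]]]]:
    """Separate the objective variable from the explanatory variables."""
    targets = [r.y1_max_demand for r in records]
    factors = [
        [r.x1_prev_day_max_demand, r.x2_avg_temperature,
         r.x3_solar_radiation_13h, float(r.x4_event)]
        for r in records
    ]
    return targets, factors

# Hypothetical sample values; None marks a missing measurement.
records = [
    DailyRecord(date(2015, 7, 1), 1200.0, 1180.0, 24.5, 2.1, 0),
    DailyRecord(date(2015, 7, 2), 1250.0, 1200.0, None, 2.4, 1),
]
targets, factors = split_target_and_factors(records)
print(targets)
print(factors)
```

Keeping missing values as `None` rather than dropping rows leaves the decision about removal or interpolation to a later preprocessing step.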

Thus, the arithmetic device 20 according to the embodiment of the present invention not only extracts the data at a specified time from the analysis target data stored in the storage device 30, but may also calculate statistical values such as the maximum power demand and the average temperature before performing the factor analysis.
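The two kinds of derived values mentioned above, a value at a specified time and a daily statistic, can be sketched as follows. This is a minimal illustration assuming hourly readings indexed by hour 0 to 23; the function name `daily_features`, the dictionary keys, and the sample numbers are hypothetical.

```python
from statistics import mean

def daily_features(hourly_demand, hourly_temperature, hourly_solar, hour=13):
    """Derive one day's features from 24 hourly readings (index = hour)."""
    return {
        "max_demand": max(hourly_demand),             # statistic: daily maximum
        "avg_temperature": mean(hourly_temperature),  # statistic: daily average
        "solar_at_hour": hourly_solar[hour],          # value at a specified time
    }

# Hypothetical hourly readings for a single day.
demand = [500 + 10 * h for h in range(24)]
temperature = [20.0] * 12 + [26.0] * 12
solar = [0.0] * 24
solar[13] = 2.5

features = daily_features(demand, temperature, solar)
print(features)
```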

The arithmetic device 20 shown in FIG. 1 includes an input/selection processing unit 21, a data preprocessing unit 22, a linear part extraction unit 23, and a factor analysis unit 24. The function of each unit of the arithmetic device 20 may be implemented, for example, by a computer executing a predetermined control program.

The storage device 30 stores the analysis target data input via the input unit 10 in the form of a database. The storage device 30 may also accumulate data other than the above as well as intermediate calculation results.

The output data storage/output unit 40 has a function of storing the quantitative analysis results produced by the factor analysis unit 24 and a function of outputting them to a display or the like to show them to the user.

For the data stored in the output data storage/output unit 40, the analysis may be repeated periodically with only the analysis target period changed according to the user's settings, and when the result differs from past analysis results, the difference may be presented visually to the user as a report.

FIG. 9 is a diagram showing an example of a report based on the analysis results of the factor analysis unit 24 of the factor analysis system according to the embodiment of the present invention. In the example of FIG. 9, the analysis results for July to August 2015 and those for September to October 2015 are shown as bar graphs.

In the report of FIG. 9, the vertical axis shows the quantitative index and the horizontal axis shows the explanatory variable data. The example of FIG. 9 shows that the important variables (for example, the average temperature of the day (X2), the solar radiation at 13:00 on the day (X3), the event of the day (X4), and so on) have changed between the analysis results for July to August 2015 and those for September to October 2015. These analysis results give the user material for judging whether the input data used for the factor analysis needs to be changed.
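The period-to-period comparison behind such a report can be sketched as a simple difference check on the quantitative indices. This is an illustrative sketch only: the function name `changed_factors`, the threshold, and the index values are assumptions and are not taken from FIG. 9.

```python
def changed_factors(previous, current, threshold):
    """Return factor names whose quantitative index moved by at least threshold."""
    names = sorted(set(previous) | set(current))
    return [
        name for name in names
        if abs(current.get(name, 0.0) - previous.get(name, 0.0)) >= threshold
    ]

# Hypothetical quantitative indices per explanatory variable for two periods.
jul_aug = {"X1": 0.6, "X2": 0.5, "X3": 0.1, "X4": 0.0}
sep_oct = {"X1": 0.6, "X2": 0.2, "X3": 0.4, "X4": 0.3}

flagged = changed_factors(jul_aug, sep_oct, threshold=0.2)
print("Factors to review:", flagged)
```

A factor that is missing from one period is treated as having an index of 0.0, so a newly appearing factor is flagged as well.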

As described above, the arithmetic device 20 is built from general computer hardware. A series of arithmetic operations is executed on the data retrieved from the storage device 30, following the processing flow of the arithmetic device 20 (see the arrows in FIG. 1), and the data output by the final operation is stored and displayed by the output data storage/output unit 40.

FIG. 3 is a diagram showing an example of the hardware configuration of a general computer. In the computer 300, a CPU (Central Processing Unit) 302, a memory 304, an input device 306, an output device 308, an external storage device 312, a medium drive device 314, a network connection device 318, and the like are connected via a bus 310. The configuration of the computer 300 is applied to the arithmetic device 20 of this embodiment.

The CPU 302 is an arithmetic processing unit that controls the operation of the computer 300 as a whole. The memory 304 is a storage unit that stores in advance the program controlling the operation of the computer 300 and is used as a work area as needed when the program is executed. The memory 304 is, for example, a RAM (Random Access Memory) or a ROM (Read Only Memory).

The input device 306 is a device that, when operated by the user, acquires the various information input by the user that is associated with the operation and sends the acquired input information to the CPU 302. It is, for example, a keyboard or a mouse.

The output device 308 is a device that outputs the processing results of the computer 300 and includes a display device and the like. The display device, for example, displays text and images according to the display data sent by the CPU 302.

The external storage device 312 is a storage device such as a hard disk and stores the various control programs executed by the CPU 302, acquired data, and the like.

The medium drive device 314 is a device for writing to and reading from a portable recording medium 316. The CPU 302 can perform various control processes by reading, via the medium drive device 314, a predetermined control program recorded on the portable recording medium 316 and executing it.

Examples of the portable recording medium 316 include a CD (Compact Disc)-ROM, a DVD (Digital Versatile Disc), and a USB (Universal Serial Bus) memory.

The network connection device 318 is an interface device that manages the exchange of various data with external parties over wired or wireless connections. The input unit 10 of this embodiment may communicate with the network connection device 318. The bus 310 is a communication path that connects the above devices to one another and carries the data exchanged between them.

Returning to FIG. 1, the arithmetic device 20 is described further. The input/selection processing unit 21 can either store the analysis target data acquired from the input unit 10 in the storage device 30 via the arithmetic device 20, or pass the analysis target data to the data preprocessing unit 22 without storing it in the storage device 30.

The input/selection processing unit 21 selects the objective variable data or the explanatory variable data from the analysis target data retrieved from the storage device 30 and passes it to the data preprocessing unit 22 described below. The input/selection processing unit 21 functions as the input processing unit.

The data preprocessing unit 22 displays the analysis target data stored in the storage device 30 via the input/selection processing unit 21 on the display device of the arithmetic device 20 (for example, the output device 308), visualizes the relationship between demand (objective variable data) and factors (explanatory variable data), and performs arithmetic processing to remove abnormal data and interpolate missing data.

FIG. 4 is a diagram showing a specific example of the data visualization processing and the data removal/interpolation processing according to the embodiment of the present invention.

In the data visualization processing executed by the data preprocessing unit 22 as described above, the relationship between the prediction target and the factors is displayed on the display device of the arithmetic device 20 as a scatter diagram, a time-series diagram, or the like, and the user (operator) can access the arithmetic device 20 through GUI operations.

When the data preprocessing unit 22 recognizes a GUI operation that removes abnormal values, interpolates missing data, or the like on the data displayed as a scatter diagram, a time-series diagram, or the like by the data visualization processing, it performs the corresponding processing. The data preprocessing unit 22 may also perform this processing automatically.

More specifically, in the scatter diagram example of FIG. 4, the data is displayed on the screen with the maximum power demand of the day (Y1) on the vertical axis and the maximum power demand of the previous day (X1) on the horizontal axis. Data that deviates greatly on the scatter diagram can be identified as an "abnormal value", and places where data that should exist is absent can be identified as missing data. When the data preprocessing unit 22 recognizes a GUI operation that removes abnormal values, interpolates missing data, or the like, it performs the processing corresponding to the recognized operation.

また図４の時系列図の例では、当日最大電力需要(Y1)を縦軸に、時系列である日時を横軸にして、画面上に時系列のデータを表示し、時系列上で大きくずれたデータが表示された場合には“異常値”と把握でき、また本来データが存在するにもかかわらずデータが不連続(データの欠損)と把握できる。この場合、上述したＧＵＩ操作に基づいて、または自動的に、データ前処理部２２は、異常値の除去やデータの補間等を行なう。 In the time-series diagram example of FIG. 4, time-series data is displayed on the screen with the current day's maximum power demand (Y1) on the vertical axis and the date and time on the horizontal axis. Data that deviates greatly along the time series can be identified as an "abnormal value", and a discontinuity where data should exist can be identified as missing data. In this case, the data preprocessing unit 22 removes abnormal values, interpolates data, and so on, either in response to the GUI operation described above or automatically.

図５は、本発明の実施形態に係るデータ前処理部におけるデータ加工の一例を示す図である。図２と図５を比較すると、図２はデータ前処理部２２によるデータ加工がまだ施されていないため、異常値やデータ欠損がテーブル中のレコードに含まれているのに対して、図５はデータ前処理部２２によるデータ加工が施された後であるため、テーブル中の異常値やデータ欠損は修復されている（図５の脚注(１)～(３)参照）。 FIG. 5 is a diagram showing an example of data processing in the data preprocessing unit according to the embodiment of the present invention. Comparing FIG. 2 with FIG. 5, FIG. 2 shows the data before processing by the data preprocessing unit 22, so the records in the table contain abnormal values and missing data, whereas FIG. 5 shows the data after that processing, so the abnormal values and missing data in the table have been repaired (see footnotes (1) to (3) in FIG. 5).

なお、異常を示すデータの除去又はデータの補間処理は、例えば、上下限フィルタやデータ検定，スプライン補間等の統計処理に基づいてデータ前処理部２２が処理することで、異常を示すデータ部分であると判定されたデータの除去並びにデータの補間を自動的に行なうこともできる。この場合、上記処理は、自動的に行なわれるため、ユーザによるＧＵＩ操作は要しない。 The removal of data indicating an abnormality and the interpolation of data can also be performed automatically: the data preprocessing unit 22 applies statistical processing such as an upper/lower limit filter, a statistical test on the data, or spline interpolation, removes the data judged to be abnormal, and interpolates the data. Because this processing is automatic, no GUI operation by the user is required.
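The automatic path described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the limits, and the sample readings are hypothetical, and linear interpolation is swapped in for the spline interpolation mentioned in the text so that the sketch needs only the standard library.

```python
def limit_filter(values, lower, upper):
    """Upper/lower limit filter: readings outside [lower, upper] become None (missing)."""
    return [v if v is not None and lower <= v <= upper else None for v in values]


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between the nearest valid neighbours.

    Gaps at the start or end are filled with the nearest valid value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical daily maximum demand: one spike (950.0) and one missing reading.
demand = [100.0, 102.0, 950.0, None, 108.0]
cleaned = interpolate_missing(limit_filter(demand, lower=0.0, upper=500.0))
```

The spike is first turned into a gap by the limit filter, so the same interpolation step repairs both abnormal values and genuinely missing data.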

図１に戻って演算装置２０内の線形部分抽出部２３について説明する。線形部分抽出部２３は、データ前処理部２２で加工した分析対象データから線形部分を抽出する。本実施形態において、線形部分抽出部２３は、決定木またはクラスタリング等の手法を適用して、非線形データである分析対象データから線形部分を抽出する。図６は、本発明の実施形態に係る線形部分抽出部における決定木の一例を示す図である。 Returning to FIG. 1, the linear part extractor 23 in the arithmetic unit 20 will be described. The linear part extraction unit 23 extracts a linear part from the analysis target data processed by the data preprocessing unit 22 . In this embodiment, the linear part extraction unit 23 applies a technique such as a decision tree or clustering to extract a linear part from the analysis target data, which is nonlinear data. FIG. 6 is a diagram showing an example of a decision tree in the linear partial extraction unit according to the embodiment of the present invention.

図６において決定木による抽出処理が開始されると、分岐ノードを示す菱形の右肩部に示された非線形の元データが、分岐ノードにおけるif-thenルールにより線形部分を抽出することができる。線形部分が抽出された場合には、図６に示すように、線形部分１、線形部分２のような長方形のターミナルノードが形成されて出力される。 When extraction by the decision tree starts in FIG. 6, linear parts are extracted from the non-linear original data, shown at the upper right of the diamond representing a branch node, by the if-then rule at that branch node. When a linear part is extracted, a rectangular terminal node such as linear part 1 or linear part 2 is formed and output, as shown in FIG. 6.

これにより、長方形のターミナルノード、すなわち、線形部分１、線形部分２に示されるように、線形部分を示すデータを自動的に抽出することができる。 In this way, the data representing the linear parts can be extracted automatically, as indicated by the rectangular terminal nodes, that is, linear part 1 and linear part 2.

なお本実施例では図６の右肩部に示される非線形の元データから、線形部分１、線形部分２に示す線形データが抽出される例を示しているが、非線形の元データが図６の右肩部に示されるものと異なる場合であっても、抽出される分岐ノードに対してif-thenルールを適用することで線形部分を抽出できる。 In this embodiment, the linear data shown in the linear part 1 and the linear part 2 are extracted from the non-linear original data shown in the right shoulder of FIG. Linear parts can be extracted by applying if-then rules to extracted branch nodes, even if they differ from those shown in the right shoulder.
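A minimal sketch of the idea, assuming a single explanatory variable and a single if-then rule of the form "if x <= threshold then linear part 1, else linear part 2". The V-shaped data, the threshold, and the helper names are hypothetical and chosen only so that the effect is easy to see.

```python
def fit_line(points):
    """Ordinary least-squares fit y = a*x + b; returns (a, b, sum of squared errors)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx if sxx else 0.0
    b = mean_y - a * mean_x
    sse = sum((y - (a * x + b)) ** 2 for x, y in points)
    return a, b, sse


def split_by_rule(points, threshold):
    """Branch node: if x <= threshold then linear part 1, else linear part 2."""
    part1 = [p for p in points if p[0] <= threshold]
    part2 = [p for p in points if p[0] > threshold]
    return part1, part2


# Non-linear (V-shaped) original data: y = |x - 5|.
data = [(float(x), abs(x - 5.0)) for x in range(11)]
whole_sse = fit_line(data)[2]            # a single line fits poorly
part1, part2 = split_by_rule(data, threshold=5.0)
slope1, _, sse1 = fit_line(part1)        # each part is fitted well by its own line
slope2, _, sse2 = fit_line(part2)
```

A single straight line leaves a large residual on the whole data set, while each terminal node on its own is fitted almost exactly, which is what makes a linear technique applicable afterwards.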

次に、本発明の実施形態に係るクラスタリング手法による線形部分抽出の一例を図７に示す。図７に示されるように、非線形の元データを任意のクラスタ数に分割する（本実施例ではクラスタ１，２の二つのクラスタに分割される）ことで、線形部分のデータを自動的に抽出することができる。 Next, FIG. 7 shows an example of linear part extraction by the clustering method according to the embodiment of the present invention. As shown in FIG. 7, the data of the linear parts can be extracted automatically by dividing the non-linear original data into an arbitrary number of clusters (two clusters, cluster 1 and cluster 2, in this example).

クラスタ数は、画面表示されている非線形データについてユーザがＧＵＩによって設定してもよいし、ＡＩＣ(Akaike’s Information Criterion:赤池情報量基準)等の情報基準量を用いて自動的に決めても良い。また線形部分を精度よく抽出するために距離関数に“マハラノビス距離”を用いて決めるようにしてもよい。 The number of clusters may be set by the user using a GUI for the nonlinear data displayed on the screen, or may be automatically determined using an information criterion such as AIC (Akaike's Information Criterion). Also, in order to extract the linear portion with high accuracy, the distance function may be determined using the "Mahalanobis distance".
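The following sketch illustrates why the Mahalanobis distance may suit linear part extraction: it measures distance relative to the spread of a cluster, so a point lying along an elongated, line-like cluster counts as close even when it is far away in Euclidean terms. The two-dimensional restriction, the function name, and the sample points are assumptions made for illustration.

```python
def mahalanobis_2d(point, cluster):
    """Mahalanobis distance from `point` to the mean of a 2-D cluster."""
    n = len(cluster)
    mean_x = sum(p[0] for p in cluster) / n
    mean_y = sum(p[1] for p in cluster) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mean_x) ** 2 for p in cluster) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in cluster) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in cluster) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, with the 2x2 inverse written out.
    d_squared = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d_squared ** 0.5


# Hypothetical cluster stretched along the line y = x.
cluster = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9), (4.0, 4.1)]
on_trend = (4.0, 4.0)    # far from the mean, but on the linear trend
off_trend = (2.0, 4.0)   # nearer in Euclidean terms, but off the trend
```

With these points, `on_trend` is much closer to the cluster than `off_trend` under the Mahalanobis distance, although the Euclidean ordering is the opposite.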

線形部分抽出部２３は、入力・選択処理部２１から、分析対象データを取得してもよい。この場合、当該分析対象データは、データ前処理部２２による加工（異常を示すデータの除去又はデータの補間処理等）が行なわれていない。 The linear part extraction unit 23 may acquire analysis target data from the input/selection processing unit 21 . In this case, the data to be analyzed has not been processed by the data preprocessing unit 22 (removal of data indicating abnormality, interpolation of data, etc.).

従って、線形部分抽出部２３は、上記の加工が行なわれた分析対象データから線形部分を抽出することが好ましい。ただし、線形部分抽出部２３は、上記の加工が行なわれていない分析対象データから線形部分を抽出することもできる。 Therefore, it is preferable that the linear part extracting unit 23 extracts the linear part from the analysis target data processed as described above. However, the linear part extracting unit 23 can also extract the linear part from the analysis target data that has not been processed as described above.

再び図１に戻って演算装置２０内の要因分析部２４について説明する。要因分析部２４では、線形部分抽出部２３で抽出した線形部分毎に、多変量解析手法を用いて要因選択を考慮した要因分析を定量的に行い、その結果をデータ保存・出力部４０に表示する。なお多変量解析手法のいずれかの手法を用いるかはユーザが自由に設定できる。 Returning again to FIG. 1, the factor analysis unit 24 in the arithmetic unit 20 will be described. For each linear part extracted by the linear part extraction unit 23, the factor analysis unit 24 quantitatively performs factor analysis that takes factor selection into account using a multivariate analysis method, and displays the result on the data storage/output unit 40. The user can freely choose which multivariate analysis method to use.

図８は、本発明の実施形態に係る要因分析システムの要因分析部２４に基づく定量的指標が示された分析結果例を示す図である。 FIG. 8 is a diagram showing an analysis result example showing quantitative indicators based on the factor analysis unit 24 of the factor analysis system according to the embodiment of the present invention.

図８に示す例では、説明変数データとして、前日最大電力需要（X1），当日平均気温（X2），当日13時日射量（X3），当日イベント（X4）・・・について、7月－8月,9月－10月のそれぞれ２月分について定量的指標を算出して分析結果を出力したものである。図９は、図８の例の分析結果を、棒グラフで示したレポートである。当該レポートは、上述したように、画面表示などにより、ユーザに提示される。 In the example shown in FIG. 8, quantitative indicators are calculated for the explanatory variable data, namely the previous day's maximum power demand (X1), the current day's average temperature (X2), the current day's solar radiation at 13:00 (X3), the current day's event (X4), and so on, for each of the two-month periods July to August and September to October, and the analysis results are output. FIG. 9 is a report showing the analysis results of the example of FIG. 8 as a bar graph. As described above, the report is presented to the user by screen display or the like.

演算装置２０は、説明変数データに対する分析結果を定量的指標に基づいてユーザに示すだけでなく、当該分析結果の情報から棒グラフや折れ線グラフ等を用いて表示してもよい。棒グラフや折れ線グラフ等が表示されることにより、分析結果をユーザにわかりやすく提示することができる。 The computing device 20 may not only show the analysis result of the explanatory variable data to the user based on the quantitative index, but may also display the analysis result information using a bar graph, a line graph, or the like. By displaying a bar graph, a line graph, or the like, the analysis results can be presented to the user in an easy-to-understand manner.

また、線形部分抽出部２３で複数の線形部分が抽出された場合、それぞれの結果の分析結果を平均や重み付き平均等の統計処理によって１つの結果として表示してもよい。要因分析の詳細な手法については後で触れることにする。 Further, when a plurality of linear parts are extracted by the linear part extracting unit 23, the analysis result of each result may be displayed as one result by statistical processing such as averaging or weighted averaging. The detailed method of factor analysis will be touched on later.
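One way to combine the per-part results into a single result is a weighted average, sketched below. The choice of weights (here the number of samples in each linear part), the indicator values, and the function name are hypothetical; the text only states that an average or weighted average may be used.

```python
def combine_results(results, weights=None):
    """Combine per-part indicator dictionaries into one by (weighted) averaging.

    results: list of {factor name: quantitative indicator}, one per linear part.
    weights: optional list of non-negative weights; equal weights if omitted.
    """
    if weights is None:
        weights = [1.0] * len(results)
    if len(weights) != len(results):
        raise ValueError("one weight is required per result")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    factors = results[0].keys()
    return {
        f: sum(w * r[f] for w, r in zip(weights, results)) / total
        for f in factors
    }


# Hypothetical indicators for two linear parts with 30 and 10 samples.
part1 = {"X1": 0.8, "X2": 0.2}
part2 = {"X1": 0.4, "X2": 0.6}
combined = combine_results([part1, part2], weights=[30, 10])
plain = combine_results([part1, part2])
```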

図１０は、本発明の実施形態に係る要因分析システムの要因分析部２４の動作を説明するフロー図である。要因分析部２４は、(１)分析対象データの分割、(２)要因選択を考慮した分析手法の適用、および、(３)分析結果の統合の各プロセスを含む。以下、図１０に沿って各プロセスについて順に説明する。 FIG. 10 is a flowchart explaining the operation of the factor analysis unit 24 of the factor analysis system according to the embodiment of the present invention. The factor analysis unit 24 includes the processes of (1) division of the data to be analyzed, (2) application of an analysis method considering factor selection, and (3) integration of the analysis results. Each process is described in order below with reference to FIG. 10.

［１］分析対象データの分割（ステップS11） [1] Division of the data to be analyzed (step S11)
分析対象となる要因を削減するため、分析対象データ（図１２Ａ参照）の要因を任意の数で分割する。元の分析対象データの説明変数データ（要因）の数をＴとしたとき、任意の分割数Ｂで分析対象データを分割する。このとき、分割される分析対象データの目的変数は常に同じとする。 To reduce the number of factors analyzed at once, the factors of the data to be analyzed (see FIG. 12A) are divided into an arbitrary number of groups. With T denoting the number of explanatory variable data (factors) in the original data to be analyzed, the data is divided using an arbitrary division number B. The objective variable of every piece of divided data is always the same.

図１１は、本発明の実施形態に係る要因分析システムにおける要因分析部の分析対象データの分割プロセスの様子を示すイメージ図である。図１１の左端に示す元の分析対象データにおいては、目的変数はY1のみで、説明変数(要因)はX1・・・XTまで有るものとする。 FIG. 11 is an image diagram showing the process of dividing the analysis target data of the factor analysis unit in the factor analysis system according to the embodiment of the present invention. In the original data to be analyzed shown on the left side of FIG. 11, the objective variable is Y1 only, and the explanatory variables (factors) are X1 . . . XT.

中央部に示す分割データ１では、目的変数Y1は同じで、説明変数(要因)はX1・・・XBに分割され、また右端に示す分割データ２では、目的変数Y1は同じで、説明変数(要因)はXB+1・・・X2B、・・・に分割される。なおデータ取得期間Ｎは分割プロセス中の全てのデータで同じである。 In divided data 1, shown in the center, the objective variable Y1 is unchanged and the explanatory variables (factors) are X1 ... XB; in divided data 2, shown at the right end, the objective variable Y1 is again unchanged and the explanatory variables (factors) are XB+1 ... X2B, and so on. The data acquisition period N is the same for all data in the division process.

図１２Ａは、図１１に示した分析対象データの分割前のデータの具体例を示す図である。すなわち、図１２Ａには、データ取得期間として、2015年7月～2015年10月末、目的変数データとして、当日最大電力需要（Y1）、説明変数データとして、前日最大電力需要（X1），当日平均気温（X2），当日13時日射量（X3），当日イベント（X4）について、数値が具体的に埋め込まれた分割前の分析対象データの例が示されている。 FIG. 12A is a diagram showing a specific example of the analysis target data of FIG. 11 before division. Specifically, FIG. 12A shows an example of the analysis target data before division, filled in with concrete values, where the data acquisition period is July 2015 to the end of October 2015, the objective variable data is the current day's maximum power demand (Y1), and the explanatory variable data are the previous day's maximum power demand (X1), the current day's average temperature (X2), the current day's solar radiation at 13:00 (X3), and the current day's event (X4).

以下の説明において、要因分析部２４は、上述したＴおよびＢが「Ｔ／Ｂ＝２」となるように、図１２Ａで示される分析対象データを分割する例について説明する。従って、分割後のグループ（分割データ）に属する説明変数データの数（要因数）は、２つとなる。分割後のグループに属する説明変数データの数は、１つであってもよい。 In the following description, an example will be described in which the factor analysis unit 24 divides the analysis target data shown in FIG. 12A such that T and B are "T/B=2". Therefore, the number of explanatory variable data (the number of factors) belonging to the divided group (divided data) is two. The number of explanatory variable data belonging to the group after division may be one.

図１２Ｂは、図１１に示した分析対象データの分割後の分割データ１の具体例を示す図である。図１２Ｂの例に示される分割後の分割データ１は、２つの説明変数データ（前日最大電力需要（X1）および当日平均気温（X2））を含む。 FIG. 12B is a diagram showing a specific example of divided data 1 after the analysis target data shown in FIG. 11 is divided. Divided data 1 in the example of FIG. 12B includes two explanatory variable data (the previous day's maximum power demand (X1) and the current day's average temperature (X2)).

図１２Ｃは、図１１に示した分析対象データの分割後の分割データ２の具体例を示す図である。図１２Ｃの例に示される分割後の分割データ２は、２つの説明変数データ（当日13時日射量（X3））および当日イベント（X4））を含む。 FIG. 12C is a diagram showing a specific example of divided data 2 after the analysis target data shown in FIG. 11 is divided. Divided data 2 after division shown in the example of FIG. 12C includes two explanatory variable data (insolation at 13:00 on the current day (X3) and event on the current day (X4)).
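The division of FIGS. 12A to 12C can be sketched as below: the explanatory variables are cut into consecutive groups, and every group keeps the same objective variable Y1. The parameter is named `group_size` here because the text uses B both for the division number and for the last index in the first group (X1 ... XB); in the worked example both readings give 2. All data values are hypothetical.

```python
def split_factors(target, factors, group_size):
    """Split explanatory variables into consecutive groups that share one objective variable.

    target: list of objective-variable values (Y1).
    factors: dict mapping factor name to its list of values, in column order.
    group_size: number of factors per divided data set.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    names = list(factors)
    groups = []
    for start in range(0, len(names), group_size):
        chunk = names[start:start + group_size]
        groups.append({"target": target, "factors": {n: factors[n] for n in chunk}})
    return groups


# Hypothetical three-day excerpt with T = 4 factors.
y1 = [510.0, 498.0, 530.0]
factors = {
    "X1": [500.0, 510.0, 498.0],   # previous day's maximum power demand
    "X2": [27.5, 26.1, 29.0],      # current day's average temperature
    "X3": [2.9, 2.4, 3.1],         # current day's solar radiation at 13:00
    "X4": [0, 0, 1],               # current day's event
}
divided = split_factors(y1, factors, group_size=2)
```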

［２］要因選択を考慮した分析手法の適用(ステップS13) [2] Application of an analysis method considering factor selection (step S13)
図１０に戻って要因選択を考慮した分析手法の適用(ステップS13)では、ステップS11で分割したデータ毎に要因選択を考慮した分析手法を、設定した分割回数に成るまで適用し、要因分析を行なう。 Returning to FIG. 10, in the application of an analysis method considering factor selection (step S13), the analysis method considering factor selection is applied to each piece of data divided in step S11, repeating until the set number of divisions is reached, and factor analysis is performed.

［３］分析結果の統合(ステップS14) [3] Integration of the analysis results (step S14)
図１０に示す分析結果の統合(ステップS14)では、要因選択を考慮した分析手法を適用したステップS13で算出した複数の分析結果を１つに統合する。 In the integration of the analysis results (step S14) shown in FIG. 10, the multiple analysis results calculated in step S13, where the analysis method considering factor selection was applied, are integrated into one.

図１３は、本発明の実施形態に係る要因分析システムにおける要因分析部の分析結果の統合の様子を示すイメージ図である。図１３に示されるように、図１２Ｂで分割された分割データ１の出力結果と、図１２Ｃで分割された分割データ２の出力結果とが統合されて、図１１の左端に示された分割前の分析対象データにおける説明変数データの構造と殆ど同じ形式であっても定量的指標が新たに付された分析結果を得ることができる。 FIG. 13 is an image diagram showing how the analysis results of the factor analysis unit are integrated in the factor analysis system according to the embodiment of the present invention. As shown in FIG. 13, the output result for divided data 1 of FIG. 12B and the output result for divided data 2 of FIG. 12C are integrated, yielding an analysis result that has almost the same format as the structure of the explanatory variable data in the pre-division analysis target data shown at the left end of FIG. 11, but with quantitative indicators newly attached.

図１４Ａは、図１２Ｂに示された統合前の分割データ１に対応する分析結果の具体例を示す図であり、図１３の左端に示されたイメージを具現化したものである。 FIG. 14A is a diagram showing a specific example of the analysis result corresponding to divided data 1 of FIG. 12B before integration, and is a concrete instance of the image shown at the left end of FIG. 13.

図１４Ｂは、図１２Ｃに示された統合前の分割データ２に対応する分析結果の具体例を示す図であり、図１３の中央に示されたイメージを具現化したものである。 FIG. 14B is a diagram showing a specific example of the analysis result corresponding to divided data 2 of FIG. 12C before integration, and is a concrete instance of the image shown in the center of FIG. 13.

図１４Ｃは、図１４Ａに示す分割データ１の分析結果及び図１４Ｂに示す分割データ２の分析結果を統合して得た分析結果の具体例を示す図であり、図１３の右端に示されたイメージを具現化したものである。 FIG. 14C is a diagram showing a specific example of the analysis result obtained by integrating the analysis result of divided data 1 shown in FIG. 14A and the analysis result of divided data 2 shown in FIG. 14B, and is a concrete instance of the image shown at the right end of FIG. 13.
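Since each factor belongs to exactly one piece of divided data, the integration step can be pictured as a merge of the per-group results back into one table keyed by factor. The sketch below assumes each result is a dictionary of quantitative indicators; the indicator values and the function name are hypothetical, and the actual contents of FIGS. 14A to 14C are not reproduced here.

```python
def integrate_results(results_per_group):
    """Merge per-group analysis results into one result covering all factors."""
    merged = {}
    for result in results_per_group:
        overlap = merged.keys() & result.keys()
        if overlap:
            # A factor should appear in exactly one divided data set.
            raise ValueError(f"factor analysed in more than one group: {sorted(overlap)}")
        merged.update(result)
    return merged


# Hypothetical indicators for divided data 1 and divided data 2.
result_1 = {"X1": 0.62, "X2": 0.35}
result_2 = {"X3": 0.18, "X4": 0.05}
integrated = integrate_results([result_1, result_2])
```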

上述したように、要因分析部２４は、多変量解析を用いて、要因分析を行なう。多変量解析手法の１つに、グラフィカルモデリングがある。要因分析部２４が、グラフィカルモデリングを適用して、要因分析を行なう場合、分析対象の要因数（説明変数データの数）が多くなると、計算量が膨大になる。例えば、グラフィカルモデリングの場合、要因数が１つ増えるに応じて、計算量が二乗になる。 As described above, the factor analysis unit 24 performs factor analysis using multivariate analysis. One of the multivariate analysis methods is graphical modeling. When the factor analysis unit 24 applies graphical modeling to perform factor analysis, as the number of factors to be analyzed (the number of explanatory variable data) increases, the amount of calculation becomes enormous. For example, in the case of graphical modeling, the amount of computation is squared as the number of factors increases by one.

そこで、要因分析部２４は、上述したように、分析対象データの説明変数データを分割することで、要因分析を行なう際の計算量が減少する。従って、説明変数データの数が多くなったとしても、適正な計算量で、要因分析を行なうことができる。 Therefore, the factor analysis unit 24 divides the explanatory variable data of the analysis target data as described above, thereby reducing the amount of calculation when performing the factor analysis. Therefore, even if the number of explanatory variable data increases, factor analysis can be performed with an appropriate amount of calculation.

上述したように、例えば、線形部分抽出部２３が、非線形の分析対象データから、線形部分を抽出し、要因分析部２４は、抽出された線形部分に対して、グラフィカルモデリング等を適用して、要因分析を行なう。従って、分析精度が向上する。 As described above, for example, the linear part extraction unit 23 extracts a linear part from the nonlinear analysis target data, and the factor analysis unit 24 applies graphical modeling or the like to the extracted linear part, Conduct factor analysis. Therefore, analysis accuracy is improved.

またグラフィカルモデリングは、条件付き独立を仮定することで、ある要因間の偏相関係数を0（無相関）とし、他の偏相関係数を推定するため、出力結果として、偏相関係数が0（無相関）の要因は需要予測の入力データとして除外することで要因の数を減少させて要因を絞り込むことができる。 In addition, graphical modeling assumes that the partial correlation coefficient between certain factors is 0 (uncorrelated) and estimates the other partial correlation coefficients by assuming conditional independence. By excluding 0 (uncorrelated) factors as input data for demand forecast, the number of factors can be reduced and the factors can be narrowed down.

本発明の実施形態に係る要因分析部２４が、線形部分抽出部２３で抽出した線形データ毎に、多変量解析手法を用いて要因選択を考慮した要因分析を定量的に行なうことについては既述したとおりなので、ここでは要因分析部２４における要因選択を考慮した多変量解析手法の例について説明する。 It has already been described that the factor analysis unit 24 according to the embodiment of the present invention quantitatively performs factor analysis considering factor selection using a multivariate analysis method for each linear data extracted by the linear part extraction unit 23. As described above, an example of a multivariate analysis method considering factor selection in the factor analysis unit 24 will be described here.

以下、多変量解析手法の一つの手法として、上述したグラフィカルモデリングについて説明する。多変量解析手法には、例えば、主成分分析や重回帰分析等の手法が適用されてもよい。 The above-described graphical modeling will be described below as one method of multivariate analysis. Methods such as principal component analysis and multiple regression analysis may be applied to the multivariate analysis method, for example.

グラフィカルモデリングは、変数間の擬似相関を除去した偏相関係数をグラフで表現する手法として、音声認識、画像処理、マーケティングリサーチ等の分野で使用されている。 Graphical modeling is used in fields such as speech recognition, image processing, and marketing research as a method of graphically expressing partial correlation coefficients from which pseudo-correlation is removed between variables.

図１５は、本発明の実施形態に係る要因分析システムにおけるグラフィカルモデリングのアルゴリズムを説明するフローチャートである。 FIG. 15 is a flow chart explaining a graphical modeling algorithm in the factor analysis system according to the embodiment of the present invention.

図１５では、グラフィカルモデリングのアルゴリズムを、Step21～Step23に分けて説明している。すなわち、 In FIG. 15, the graphical modeling algorithm is described in three parts, Step 21 to Step 23, as follows.
Step21：偏相関係数行列の算出 Step 21: Calculation of the partial correlation coefficient matrix
目的変数データと説明変数データを１つの学習データの行列として、その行列の相関係数行列Rから偏相関係数行列Pを算出する。偏相関係数とは、２変数間の相関に対して、他に関連している変数の影響を除去した相関係数のことである。一般化した偏相関係数の算出式を次式(1)に示す。 The objective variable data and the explanatory variable data are treated as one matrix of learning data, and the partial correlation coefficient matrix P is calculated from the correlation coefficient matrix R of that matrix. A partial correlation coefficient is the correlation between two variables with the influence of the other related variables removed. The generalized formula for calculating the partial correlation coefficient is shown in the following equation (1).
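Equation (1) is not reproduced in this text. The sketch below uses the standard identity that the partial correlation between variables i and j, given all other variables, equals -r^{ij} / sqrt(r^{ii} r^{jj}), where r^{ij} are the elements of the inverse of R; whether this matches equation (1) term for term is an assumption. The matrix inversion is a plain Gauss-Jordan routine written out to keep the example self-contained, and the correlation values are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(R):
    """Partial correlation matrix P from correlation matrix R (standard identity)."""
    inv = invert(R)
    n = len(R)
    return [
        [1.0 if i == j else -inv[i][j] / (inv[i][i] * inv[j][j]) ** 0.5 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical case: variables 0 and 1 are correlated (0.64) only through variable 2.
R = [
    [1.00, 0.64, 0.80],
    [0.64, 1.00, 0.80],
    [0.80, 0.80, 1.00],
]
P = partial_correlations(R)
```

In this example the ordinary correlation between variables 0 and 1 is 0.64, but their partial correlation is essentially zero, which is the kind of spurious correlation that the partial correlation coefficient removes.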

Step22：共分散選択による偏相関係数行列の推定 Step 22: Estimation of the partial correlation coefficient matrix by covariance selection
偏相関係数行列の中で絶対値が最小のものを条件付独立(i,j)とし、次式(2)により相関係数行列を更新する。相関係数行列をSとし、Dempsterの定理から分割逆行列の公式を用いることで条件付独立での相関係数行列Mを推定する。 The element of the partial correlation coefficient matrix with the smallest absolute value is taken as the conditionally independent pair (i, j), and the correlation coefficient matrix is updated by the following equation (2). With S denoting the correlation coefficient matrix, the correlation coefficient matrix M under conditional independence is estimated using the partitioned inverse matrix formula, based on Dempster's theorem.

推定した相関係数行列Mから偏相関係数行列を算出する。複数の条件付独立が存在するとき、先に条件付独立とした係数はほとんどの場合0ではなくなるため、選択した条件付独立すべてが0に収束するまでこの仮定を逐次的に繰り返す。これを繰り返すことで0とみなせる収束判断基準を設定することができる。 A partial correlation coefficient matrix is calculated from the estimated correlation coefficient matrix M. When several pairs are assumed to be conditionally independent, the coefficients set to 0 earlier usually become non-zero again, so this assumption is applied repeatedly until all of the selected conditionally independent coefficients converge to 0. A convergence criterion below which a coefficient is regarded as 0 can be set for this iteration.

Step23：モデルの評価 Step 23: Model evaluation
共分散選択の打ち切り基準を赤池情報基準量（AIC）によるモデル評価により判断するため、モデルの適合度をAICが最小となるときを共分散選択打ち切り条件とし、そうでなければStep21に戻る。 The stopping criterion for covariance selection is judged by model evaluation using the Akaike Information Criterion (AIC): covariance selection stops when the AIC, used as the measure of model fit, reaches its minimum; otherwise the process returns to Step 21.

次に、多変量解析手法の別法としての決定木について説明する。図１６は、本発明の実施形態に係る要因分析システムにおける決定木のアルゴリズムを説明するフローチャートである。 Next, a decision tree as another method of multivariate analysis will be described. FIG. 16 is a flow chart explaining the decision tree algorithm in the factor analysis system according to the embodiment of the present invention.

決定木は、大量のデータの中に隠れている有用な情報、知識やルールを抽出する方法論であるデータマイニング手法の一つであり、入出力関係をif-thenルールによる木構造で表現するものである。if-thenルールは、一般には前提又は条件を表すif部と、if部が真である場合に実行される結論又は動作を表すthen部とから構成される規則ruleと定義される。 A decision tree is a data mining method that is a methodology for extracting useful information, knowledge, and rules hidden in a large amount of data, and expresses input/output relationships in a tree structure based on if-then rules. is. An if-then rule is generally defined as a rule consisting of an if part representing a premise or condition and a then part representing a conclusion or action to be taken if the if part is true.

図１６に示される決定木のアルゴリズムは、Step31～Step34に分けて説明される。すなわち、 The decision tree algorithm shown in FIG. 16 is described in four parts, Step 31 to Step 34, as follows.
Step31：木の生長 Step 31: Tree growing
木の生長は、親ノード内のデータを２つの子ノードに分割することで、木を生長させる。まず、要因である変数に対して、対象データとなる親ノードのデータに対して生じる誤差が最も減少する分岐条件を選択し、木を構築する。すべての入力変数の改善度を次式(４)により算出し、その中で最も大きい値のものを最良分岐条件とする。そのときの入力変数を分岐入力変数とし、その分割した左右の平均を分岐値とする。この作業を繰り返し行なうことで決定木をこれ以上分割できない最大木まで生長させる。 In tree growing, the tree is grown by splitting the data in a parent node into two child nodes. First, for the variables that are factors, the branching condition that most reduces the error arising for the data of the parent node (the target data) is selected, and the tree is constructed. The degree of improvement is calculated for all input variables by the following equation (4), and the one with the largest value is taken as the best branching condition. The input variable at that point is the branch input variable, and the average of the left and right sides of the split is the branch value. Repeating this operation grows the decision tree to the maximal tree, which cannot be split any further.
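Equation (4) is not reproduced in this text. The sketch below assumes the degree of improvement is the reduction in the sum of squared errors achieved by a split, as is usual for regression trees in CART, and it reads "the average of the left and right sides" as the midpoint between the two adjacent values on either side of the split. Both readings, the function names, and the sample data are assumptions.

```python
def sum_squared_error(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(xs, ys):
    """Best single split on one input variable.

    Returns (branch value, improvement); branch value is None if no split helps.
    """
    pairs = sorted(zip(xs, ys))
    parent_error = sum_squared_error([y for _, y in pairs])
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical input values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = parent_error - sum_squared_error(left) - sum_squared_error(right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain


# Hypothetical data: demand jumps once the temperature exceeds roughly 22 degrees.
temperature = [10.0, 12.0, 14.0, 30.0, 32.0, 34.0]
demand = [50.0, 52.0, 51.0, 90.0, 91.0, 92.0]
threshold, improvement = best_split(temperature, demand)
```

Growing a full tree would apply `best_split` to every input variable at every node, keep the variable with the largest improvement, and recurse on the two child nodes.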

Step32：木の剪定 Step 32: Tree pruning
木構造を簡略化するため、一旦最大木まで生長した木に対して枝の剪定を行なう。各分岐ノードにおいて、そのノードよりも下層にある部分木のノード数あたりの誤差を求める。次に、得られた値において、最も小さな値となる分岐ノードをターミナルノードに置き換える。最後に、全ての分岐ノードがターミナルノードになるまで繰り返す。以下の手順により最大木を一旦最小木まで剪定を行なう。次式(５)に分岐ノードの誤差を複雑度パラメータとして定義する。 To simplify the tree structure, the branches of the tree that has been grown to the maximal tree are pruned. At each branch node, the error per node of the subtree below that node is obtained. Next, the branch node with the smallest resulting value is replaced with a terminal node. This is repeated until all branch nodes have become terminal nodes. Through this procedure, the maximal tree is pruned all the way down to the minimal tree. The following equation (5) defines the branch node error as a complexity parameter.

Step33：最良木の選択 Step 33: Selection of the best tree
木の剪定を行なう過程において、CART(Classification And Regression Trees)では、決定木の誤差推定法として交差検証法を用いる。交差検証法は、モデル構築の際に、学習データが十分でない場合もしくは、学習の偏りを小さくするための学習法である。最初に、学習データをν個のグループに分割し、その中の（ν―１）個のグループをモデル構築の学習データとして用い、残りの１グループを誤差推定用のテストデータとして用いる。次式(６)に交差検証法とテストデータの誤差の式を示す。 In the course of pruning, CART (Classification And Regression Trees) uses cross-validation to estimate the error of the decision tree. Cross-validation is a learning method used when the learning data is insufficient for model construction, or to reduce learning bias. First, the learning data is divided into ν groups; (ν−1) of these groups are used as learning data for model construction, and the remaining group is used as test data for error estimation. The following equation (6) gives the cross-validation error on the test data.

上記式(６)により、剪定毎に交差検証法を用いることで、剪定後の誤差を求める。CARTでは、交差検証誤差に最良木選択ルールを用いることで最良木を選択する。次式(７)にCARTで用いるSEルールを示す。SEルールによって得られた最良候補木の中で最もノードが少ない決定木を最良木とする。 The post-pruning error is obtained by using the cross-validation method for each pruning according to the above equation (6). CART selects the best tree by using the best tree selection rule on the cross-validation error. The SE rule used in CART is shown in the following equation (7). A decision tree with the smallest number of nodes among the best candidate trees obtained by the SE rule is taken as the best tree.
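Equations (6) and (7) are not reproduced in this text. The sketch below assumes ordinary ν-fold cross-validation with a mean squared error, and the commonly used one-standard-error form of the SE rule: every pruned tree whose cross-validation error is within one standard error of the minimum is a candidate, and the candidate with the fewest nodes is chosen. The `fit` and `predict` callables, the candidate figures, and all names are hypothetical.

```python
def fold_indices(n_samples, n_folds):
    """Yield (train indices, test indices) for each of the n_folds groups."""
    folds = [list(range(k, n_samples, n_folds)) for k in range(n_folds)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


def cross_validation_error(xs, ys, n_folds, fit, predict):
    """Mean squared error over all held-out samples."""
    total = 0.0
    for train, test in fold_indices(len(xs), n_folds):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(model, xs[i])) ** 2 for i in test)
    return total / len(xs)


def select_best_tree(candidates):
    """One-standard-error rule on (node count, CV error, standard error) tuples."""
    lowest = min(candidates, key=lambda c: c[1])
    limit = lowest[1] + lowest[2]
    eligible = [c for c in candidates if c[1] <= limit]
    return min(eligible, key=lambda c: c[0])


# A trivial stand-in model that predicts the training mean.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


cv_error = cross_validation_error([0, 1, 2, 3], [1.0, 2.0, 3.0, 4.0], 2, fit_mean, predict_mean)

# Hypothetical pruning sequence: (number of nodes, CV error, standard error).
pruned_trees = [(15, 0.20, 0.03), (9, 0.21, 0.03), (5, 0.22, 0.03), (3, 0.30, 0.04)]
best_tree = select_best_tree(pruned_trees)
```

In this example the 15-node tree has the lowest error, but the 5-node tree is within one standard error of it and is therefore preferred as the simpler model.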

Step34：変数重要度の算出 Step 34: Calculation of variable importance
変数重要度は、決定木構築の際の入力変数の度合いを明確にした指標である。最良木での分岐ノードに使用した変数の改善度を用いる。変数重要度は、これを変数毎に合計した値であり、次式(８)に示す。変数重要度は、予測対象に最も重要である変数を100とし、他の変数の重要度を量的に表すことができる。 Variable importance is an index that makes explicit how much each input variable contributed when the decision tree was built. It uses the degree of improvement of the variables used at the branch nodes of the best tree. The variable importance is the sum of these improvements for each variable, as shown in the following equation (8). With the variable most important to the prediction target set to 100, the importance of the other variables can be expressed quantitatively.

本発明は、以上の実施の形態に限定されるものでなく、本発明の要旨を逸脱しない範囲内で種々の改良、変更が可能である。例えば、上述の実施形態を以下のように改良、変更してもよい。 The present invention is not limited to the above embodiments, and various improvements and modifications are possible without departing from the gist of the present invention. For example, the above-described embodiments may be improved and changed as follows.

上述の実施形態では、入力部１０又は記憶装置３０を介して演算装置２０が取得した分析対象データは、非線形データであり、離散値である説明変数データ（要因データ）を含み得る。離散値である要因データとは、当該要因のデータ数と比較して、当該要因の項目数或いは当該要因のデータ値のばらつきが少ないデータを指す。例えば、平日を０、休日を１で表す暦データでは、項目数は２であり、データ数と比較して項目数が少なくなり得るため、暦データは、離散値である要因データに該当し得る。一方、例えば、気温データでは、項目に相当する各データ値が異なり得、データ数に比例して項目数も多くなり得るため、気温データは、離散値である要因データに該当しない可能性がある。線形部分抽出部２３は、決定木やクラスタリング等の非線形的手法を用いて、上述したような特徴を含み得る分析対象データから線形部分を抽出する。そして、要因分析部２４は、グラフィカルモデリングや相関分析等の線形的手法を用いて、抽出した線形部分に対して要因分析を行う。 In the above-described embodiment, the analysis target data acquired by the arithmetic device 20 via the input unit 10 or the storage device 30 is non-linear data, and may include discrete explanatory variable data (factor data). Discrete value factor data refers to data in which the number of items of the factor or the data value of the factor has less variation than the number of data of the factor. For example, calendar data representing weekdays as 0 and holidays as 1 has 2 items, and the number of items can be smaller than the number of data items. . On the other hand, for example, in temperature data, each data value corresponding to an item may be different, and the number of items may increase in proportion to the number of data. Therefore, temperature data may not correspond to discrete factor data. . The linear part extracting unit 23 uses a non-linear technique such as decision tree or clustering to extract a linear part from the data to be analyzed that may include the features described above. Then, the factor analysis unit 24 performs factor analysis on the extracted linear part using linear techniques such as graphical modeling and correlation analysis.

ある要因と、データが離散値であり得る他の要因とは相互に関連性を有し得る。そこで、線形部分抽出部２３は、線形部分を適切に抽出するために、離散値である要因データを含む分析対象データから線形部分を抽出することが望ましい。一方、要因分析部２４が要因分析に用いる線形的手法において分析対象データに離散値が含まれると正確な分析結果が得られない可能性がある。そこで、別の実施形態では、要因分析部２４は、分析対象データである線形部分から離散値の要因データを除去した後に、要因分析を行ってもよい。 Certain factors may be correlated with other factors whose data may be discrete values. Therefore, in order to appropriately extract the linear part, the linear part extracting unit 23 preferably extracts the linear part from the analysis object data including the factor data that are discrete values. On the other hand, if the data to be analyzed includes discrete values in the linear method used by the factor analysis unit 24 for factor analysis, there is a possibility that an accurate analysis result cannot be obtained. Therefore, in another embodiment, the factor analysis unit 24 may perform factor analysis after removing factor data of discrete values from the linear portion that is the data to be analyzed.

具体的には、図１７に示すように、要因分析部２４は、分析対象データを分割する処理（Step S11）に先立ち、分析対象データである線形部分から離散値の要因データを除去する（Step S11´）。図１７は、本発明の別の実施形態に係る要因分析システムの要因分析部の動作を説明するフロー図である。 Specifically, as shown in FIG. 17, prior to the process of dividing the analysis target data (Step S11), the factor analysis unit 24 removes the discrete-valued factor data from the linear part that is the analysis target data (Step S11´). FIG. 17 is a flowchart explaining the operation of the factor analysis unit of the factor analysis system according to another embodiment of the present invention.

Step S11´において、要因分析部２４は、線形部分である分析対象データの各説明変数（要因）に対して離散値比率を算出する。そして、要因分析部２４は、算出した離散値比率を予め設定した閾値と比較することで、離散値である要因データを分析対象データから除去する。 In Step S11', the factor analysis unit 24 calculates a discrete value ratio for each explanatory variable (factor) of the data to be analyzed, which is a linear part. Then, the factor analysis unit 24 removes factor data, which are discrete values, from the analysis target data by comparing the calculated discrete value ratio with a preset threshold value.

例えば、要因分析部２４は、次の式(９)に示すように、当該要因の項目数を当該要因のデータ数で除算することによって離散値比率を算出する。或いは、要因分析部２４は、次の式(１０)に示すように、当該要因のデータ数から当該要因の項目数を減算した値を当該要因のデータ数で除算することによって離散値比率を算出する。 For example, the factor analysis unit 24 calculates the discrete value ratio by dividing the number of items of the factor by the number of data of the factor, as shown in the following equation (9). Alternatively, the factor analysis unit 24 calculates the discrete value ratio by dividing the value obtained by subtracting the number of items of the factor from the number of data of the factor by the number of data of the factor, as shown in the following equation (10).

離散値比率の算出に式（９）を用いた場合、要因分析部２４は、算出した離散値比率が閾値を下回る要因データを分析対象データから除去する。また、離散値比率の算出に式（１０）を用いた場合、要因分析部２４は、算出した離散値比率が閾値を上回る要因データを分析対象データから除去する。例えば、図１８に示すように、要因分析部２４は、２値の離散値である要因X2のデータと、３値の離散値である要因X4のデータとを分析対象データから除去する。図１８は、本発明の別の実施形態に係る要因分析システムにおける要因分析部の分析対象データからの離散値要因除去の様子を示すイメージ図である。 When Equation (9) is used to calculate the discrete value ratio, the factor analysis unit 24 removes factor data for which the calculated discrete value ratio is below the threshold from the analysis target data. Further, when Expression (10) is used to calculate the discrete value ratio, the factor analysis unit 24 removes factor data whose calculated discrete value ratio exceeds the threshold from the analysis target data. For example, as shown in FIG. 18, the factor analysis unit 24 removes the data of factor X2, which is a binary discrete value, and the data of factor X4, which is a ternary discrete value, from the data to be analyzed. FIG. 18 is an image diagram showing how a factor analysis unit removes discrete value factors from analysis object data in a factor analysis system according to another embodiment of the present invention.
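The two ratios and the threshold test described above can be sketched as below, taking "number of items" to mean the number of distinct values of a factor, in line with the calendar example (weekday 0, holiday 1, two items). The function names, the threshold of 0.5, and the sample data are hypothetical.

```python
def discrete_ratio_eq9(values):
    """Equation (9) as described: number of items / number of data."""
    return len(set(values)) / len(values)


def discrete_ratio_eq10(values):
    """Equation (10) as described: (number of data - number of items) / number of data."""
    return (len(values) - len(set(values))) / len(values)


def remove_discrete_factors(factors, threshold):
    """Drop factors whose equation (9) ratio falls below the threshold."""
    return {
        name: values
        for name, values in factors.items()
        if discrete_ratio_eq9(values) >= threshold
    }


# Hypothetical ten-day sample: X1 is continuous, X2 is a binary weekday/holiday flag.
factors = {
    "X1": [20.5, 21.5, 22.5, 23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5],
    "X2": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}
kept = remove_discrete_factors(factors, threshold=0.5)
```

With equation (10) the comparison is reversed: the binary flag scores 0.8, so it would be removed when the ratio exceeds the threshold.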

Step S12以降の処理は前述した処理と同様である。要因分析部２４は、離散値である要因を含まない分析結果を出力データ保存・出力部４０へ出力し、出力データ保存・出力部４０は、分析結果を保存すると共に、ディスプレイ等に分析結果を出力してユーザに表示する。 The processing after Step S12 is the same as the processing described above. The factor analysis unit 24 outputs analysis results that do not include factors that are discrete values to the output data storage/output unit 40, and the output data storage/output unit 40 stores the analysis results and displays the analysis results on a display or the like. Output and display to the user.

Thus, in this other embodiment, the factor analysis unit 24 removes discrete-valued factor data from the analysis target data before performing factor analysis on that data, which is the linear part, using a linear technique such as graphical modeling or correlation analysis. As a result, factor analysis can be performed more accurately.

Together with the analysis results that contain no discrete-valued factors, the factor analysis unit 24 may also output to the output data storage/output unit 40 the analysis results that the linear part extraction unit 23 obtained for the discrete-valued factors when it extracted the linear part of the analysis target data. One example of such a result is the degree of influence that a discrete-valued factor had on the construction of the decision tree for the factors that are not discrete-valued. The output data storage/output unit 40 may store the analysis results received from the factor analysis unit 24 and output them to a display or the like to present them to the user. With this configuration, the user can also recognize the influence that discrete-valued factors have on demand (the objective variable data).

REFERENCE SIGNS LIST
10 Input unit
20 Arithmetic unit
21 Input/selection processing unit
22 Data preprocessing unit
23 Linear part extraction unit
24 Factor analysis unit
30 Storage device
40 Output data storage/output unit
100 Factor analysis system

Claims

A factor analysis system comprising:
an input processing unit that performs input processing on non-linear analysis target data related to factor analysis;
a data preprocessing unit that performs predetermined processing on the analysis target data;
a linear part extraction unit that extracts data of a linear part from the analysis target data that has undergone input processing by the input processing unit or from the analysis target data processed by the data preprocessing unit; and
a factor analysis unit that analyzes, for the data of the linear part extracted by the linear part extraction unit, the relationship of factors to a prediction target, and performs control to display the analysis results quantitatively,
wherein the factor analysis unit performs variable selection based on analysis results obtained by applying a multivariate analysis method to the linear part extracted by the linear part extraction unit.

The factor analysis system according to claim 1,
wherein the analysis target data includes at least one of prediction target information of an energy supplier, weather information, calendar information, and event information.

The factor analysis system according to claim 1,
wherein the data preprocessing unit visualizes the relationship between the prediction target and the factors.

The factor analysis system according to any one of claims 1 to 3,
wherein the data preprocessing unit performs one or both of removal of abnormal data and interpolation of missing data included in the analysis target data.

The factor analysis system according to claim 1,
wherein the linear part extraction unit extracts the linear part from the analysis target data using a decision tree or a clustering method.

The factor analysis system according to claim 1,
wherein the factor analysis unit removes discrete-valued factor data from the linear part extracted by the linear part extraction unit.

The factor analysis system according to claim 6,
wherein the factor analysis unit outputs, together with the analysis result for the linear part from which the discrete-valued factor data has been removed, the analysis result obtained for the discrete-valued factors when the linear part extraction unit extracted the linear part.

The factor analysis system according to claim 1,
wherein the factor analysis unit applies graphical modeling as the multivariate analysis method.

The factor analysis system according to claim 8,
wherein the factor analysis unit divides the plurality of factors included in the analysis target data into a plurality of groups, performs the graphical modeling separately for the one or more factors belonging to each of the groups, and integrates the results.

A factor analysis method in which a computer:
performs input processing on non-linear analysis target data related to factor analysis;
performs predetermined processing on the analysis target data;
extracts data of a linear part from the analysis target data that has undergone the input processing or from the analysis target data that has undergone the predetermined processing; and
analyzes, for the extracted data of the linear part, the relationship of factors to a prediction target, and performs control to display the analysis results quantitatively,
wherein analyzing the relationship of the factors includes performing variable selection based on analysis results obtained by applying a multivariate analysis method to the extracted linear part.

A program that causes a computer to execute a process comprising:
performing input processing on non-linear analysis target data related to factor analysis;
performing predetermined processing on the analysis target data;
extracting data of a linear part from the analysis target data that has undergone the input processing or from the analysis target data that has undergone the predetermined processing; and
analyzing, for the extracted data of the linear part, the relationship of factors to a prediction target, and performing control to display the analysis results quantitatively,
wherein analyzing the relationship of the factors includes performing variable selection based on analysis results obtained by applying a multivariate analysis method to the extracted linear part.