JP2017123088A

JP2017123088A - Prediction program, device and method using decision tree learning algorithm

Info

Publication number: JP2017123088A
Application number: JP2016002285A
Authority: JP
Inventors: 雅昭泉; Masaaki Izumi; 浩平有吉; Kohei Ariyoshi; 秀一郎古瀬; Hideichiro Furuse
Original assignee: Yaskawa Information Systems Co Ltd
Current assignee: YE Digital Co Ltd
Priority date: 2016-01-08
Filing date: 2016-01-08
Publication date: 2017-07-13
Anticipated expiration: 2036-01-08
Also published as: JP6609808B2

Abstract

PROBLEM TO BE SOLVED: To provide a prediction program, device, and method that can improve the prediction accuracy of a prediction model built by supervised learning, such as a decision tree learning algorithm, even when the teacher data of the most recent short period has changed relative to the huge amount of teacher data from a long past period.

SOLUTION: In a learning stage, a decision tree learning algorithm calculates the contribution of each explanatory variable using the entire explanatory variable group as teacher data, and a first predetermined number of explanatory variables are extracted in descending order of contribution as a first explanatory variable group. Next, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series over the most recent short period is acquired. Next, the contribution of each explanatory variable is calculated by the decision tree learning algorithm using the second explanatory variable group as teacher data, and a second predetermined number of explanatory variables are extracted in descending order of contribution as a third explanatory variable group. Finally, a prediction model of the decision tree learning algorithm is created using the third explanatory variable group as teacher data.

SELECTED DRAWING: Figure 2

Description

The present invention relates to a technique for making predictions using a decision tree learning algorithm.

In recent years, in the field of machine learning, packages such as R (from The R Foundation for Statistical Computing) have been distributed as free software under the GNU General Public License (GPL). As a result, machine learning has been applied to a wide range of solution systems without requiring specialized knowledge. Within machine learning, supervised learning such as a decision tree learning algorithm can make use of huge amounts of past data that have been tuned by human intuition, experience, and know-how.

One conceivable solution system is the water treatment system of a water purification plant. In its water treatment process, a water purification plant controls the injection rates of chemicals such as PAC (polyaluminum chloride) for coagulation and hypochlorous acid for sterilization. This allows coagulated flocs to be removed in the filtration basin and brings the quality of the treated water to its target value.

In the prior art, there are techniques for the coagulation treatment in a water treatment system that control the PAC injection rate and the hypochlorous acid injection rate in proportion to the turbidity of the raw water (see, for example, Patent Documents 1 and 2). There are also techniques that determine the chemical injection rate for a given water quality experimentally, in what is called a jar test, and control the injection rate according to that correspondence between water quality and injection rate rather than by simple proportional control.

In existing water treatment systems, the chemical injection amount is adjusted on the basis of the sensing data detected in each process, relying on the intuition, experience, and know-how that operators (skilled engineers) have accumulated over a long time. In contrast, there is also a technique that uses a neural network to predict, under a plurality of injection conditions, the quality of treated water based on the PAC injection rate and the activated carbon injection rate (see, for example, Patent Document 3). This technique concerns a membrane filtration method, in which raw water is passed through a membrane with ultrafine pores by means of a pressure difference so that impurities in the raw water larger than the pore size are separated and removed.

Patent Document 1: JP 2007-61800 A
Patent Document 2: JP 2007-185610 A
Patent Document 3: JP 2012-213759 A

With supervised learning such as a decision tree learning algorithm, a prediction model can be built from a huge amount of teacher data covering a long past period, and future data can be predicted with that model. However, even when the data of the most recent short period changes during actual operation, the prediction accuracy still depends on the model built from the long-term past teacher data. Conversely, a prediction model built only from the most recent short-period data cannot reflect the behavior of the past teacher data.

A decision tree learning algorithm also has the problem that prediction accuracy does not improve when there are many data items (explanatory variables) unrelated to the prediction. Furthermore, when only a huge amount of long-term past teacher data is used, gaps in part of that data make it impossible to build an optimal prediction model.

These problems are also pronounced when machine learning is applied to a water treatment system. Even if a prediction model is built with a decision tree learning algorithm from the huge amount of chemical injection rate data accumulated over a long period of operation, it cannot cope with sudden changes in raw water turbidity caused by heavy rain, algae growth, and the like. Whether the chemical injection rate is controlled in proportion to the raw water turbidity or a neural network is used, this problem is extremely difficult to solve.

Accordingly, an object of the present invention is to provide a prediction program, device, and method that can improve the prediction accuracy of a prediction model, even with supervised learning such as a decision tree learning algorithm and even when the teacher data of the most recent short period has changed relative to the huge amount of teacher data from a long past period. In particular, when the invention is applied to a water treatment system, the chemical injection rate can be determined in response to sudden changes in raw water turbidity caused by heavy rain, algae growth, and the like, even though the prediction model is built from the huge amount of chemical injection rate data accumulated over a long period of operation.

According to the present invention, a prediction program stores an entire group of explanatory variables of different types acquired in time series over a long past period and causes a computer to calculate a predicted value of a prediction target type at the next time point, the program causing the computer to execute,
as a learning stage,
a first step of calculating the contribution of each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first predetermined number of explanatory variables in descending order of contribution as a first explanatory variable group,
a second step of acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series over the most recent short period,
a third step of calculating the contribution of each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a second predetermined number of explanatory variables in descending order of contribution as a third explanatory variable group, and
a fourth step of creating a prediction model of the decision tree learning algorithm using the third explanatory variable group as teacher data,
and, as an operation stage, to input the most recent short-period explanatory variable group corresponding to the third explanatory variable group into the decision tree learning algorithm based on the prediction model and to calculate the predicted value of the prediction target type at the next time point.
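The learning-stage steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`abs_correlation`, `top_k`, `lagged`, `select_variables`) are hypothetical, and the contribution score is replaced by the absolute Pearson correlation with the target as a stand-in for the importance that a decision tree learner such as a random forest would report. Any scorer with the same signature could be plugged in. The fourth step (training the final model on the returned columns) is left to the learner of choice.

```python
from math import sqrt
from typing import Callable, Dict, List, Tuple

Columns = Dict[str, List[float]]  # variable name -> hourly time series
Scorer = Callable[[Columns, List[float]], Dict[str, float]]


def abs_correlation(columns: Columns, target: List[float]) -> Dict[str, float]:
    """Stand-in contribution score (assumption): |Pearson r| with the target."""
    def score(xs: List[float]) -> float:
        n = len(xs)
        mx, my = sum(xs) / n, sum(target) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, target))
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in target)
        return 0.0 if sxx == 0 or syy == 0 else abs(sxy) / sqrt(sxx * syy)
    return {name: score(xs) for name, xs in columns.items()}


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Names of the k highest-scoring variables, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def lagged(columns: Columns, names: List[str], hours: int) -> Columns:
    """Expand each named series into `hours` lagged columns.

    Output row j corresponds to original row j + hours - 1; column
    "<name>_<lag>h" holds the reading taken lag - 1 rows before that row.
    """
    out: Columns = {}
    for name in names:
        xs = columns[name]
        for lag in range(1, hours + 1):
            out[f"{name}_{lag}h"] = xs[hours - lag: len(xs) - lag + 1]
    return out


def select_variables(columns: Columns, target: List[float], n1: int, n2: int,
                     hours: int, scorer: Scorer = abs_correlation
                     ) -> Tuple[List[str], Columns]:
    """Steps 1 to 3: return the first group and the third group's columns."""
    first = top_k(scorer(columns, target), n1)             # step 1
    second = lagged(columns, first, hours)                 # step 2
    trimmed = target[hours - 1:]                           # align rows with lags
    third = top_k(scorer(second, trimmed), n2)             # step 3
    return first, {name: second[name] for name in third}   # input to step 4
```

Here `target[i]` is assumed to be the value to be predicted for the time point following row `i`, so that rows of the explanatory variables and the target are already aligned.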

According to another embodiment of the prediction program of the present invention, it is also preferable that the decision tree learning algorithm is a random forest.

According to another embodiment of the prediction program of the present invention, it is also preferable that
the explanatory variable is a measured value, an integrated value of a plurality of measured values, or an average value of a plurality of measured values,
the long past period is based on a measurement period in units of years, and
the most recent short period is based on a measurement period in units of hours going back from the present time.

According to another embodiment of the prediction program of the present invention, it is also preferable that
the explanatory variables are sensing data and chemical injection data in a water treatment system, and
the predicted value is the chemical injection data at the next time point.

According to another embodiment of the prediction program of the present invention, it is also preferable that
the sensing data is one or more of raw water temperature, raw water turbidity, pretreated water turbidity, pretreated water pH, intermediate-treatment free residual chlorine, filtration basin water pressure, filtration basin filtration time, purified water turbidity, and distribution reservoir water level, and
the chemical injection data is one or more of the pre-hypochlorous acid injection rate, activated carbon injection rate, PAC (polyaluminum chloride) injection rate, intermediate hypochlorous acid injection rate, aluminum sulfate injection rate, and post-hypochlorous acid injection rate.

According to another embodiment of the prediction program of the present invention, it is also preferable that,
when sensing data or chemical injection data of any type has not been recorded at a given time, the explanatory variable at that time is interpolated with the average of the values of that type measured in the short period closest to that time.
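The interpolation rule above can be illustrated with a short sketch. The function name `fill_missing`, the use of `None` to mark an unrecorded reading, and the choice of a window covering only the readings that precede the gap are all assumptions made for illustration; the text does not specify the window length or whether later readings may also be used.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]], window: int = 6) -> List[Optional[float]]:
    """Fill each None with the mean of the measured values in the preceding window.

    Only originally measured values are averaged (earlier filled values are
    not reused). A gap with no measured value in its window stays None.
    """
    filled: List[Optional[float]] = []
    for i, value in enumerate(series):
        if value is None:
            recent = [v for v in series[max(0, i - window): i] if v is not None]
            value = sum(recent) / len(recent) if recent else None
        filled.append(value)
    return filled
```

For example, with hourly turbidity readings `[1.0, 2.0, None, 4.0]` and `window=2`, the missing third reading is replaced by the mean of the two readings before it.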

According to the present invention, a water treatment prediction server that executes the above-described prediction program on a computer further comprises
data receiving means for sequentially receiving sensing data and chemical injection data from sensors or controllers installed in each process of the water treatment system, and
prediction result transmitting means for transmitting the predicted chemical injection data to a terminal operated by a user.

According to the present invention, a prediction device stores an entire group of explanatory variables of different types acquired in time series over a long past period and calculates a predicted value of a prediction target type at the next time point, the device comprising,
for a learning stage,
first explanatory variable group extraction means for calculating the contribution of each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first predetermined number of explanatory variables in descending order of contribution as a first explanatory variable group,
second explanatory variable group extraction means for acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series over the most recent short period,
third explanatory variable group extraction means for calculating the contribution of each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a second predetermined number of explanatory variables in descending order of contribution as a third explanatory variable group, and
prediction model creation means for creating a prediction model of the decision tree learning algorithm using the third explanatory variable group as teacher data,
wherein, in an operation stage, the most recent short-period explanatory variable group corresponding to the third explanatory variable group is input into the decision tree learning algorithm based on the prediction model, and the predicted value of the prediction target type at the next time point is calculated.

According to the present invention, in a prediction method for a device that stores an entire group of explanatory variables of different types acquired in time series over a long past period and calculates a predicted value of a prediction target type at the next time point,
the device executes,
as a learning stage,
a first step of calculating the contribution of each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first predetermined number of explanatory variables in descending order of contribution as a first explanatory variable group,
a second step of acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series over the most recent short period,
a third step of calculating the contribution of each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a second predetermined number of explanatory variables in descending order of contribution as a third explanatory variable group, and
a fourth step of creating a prediction model of the decision tree learning algorithm using the third explanatory variable group as teacher data,
and, as an operation stage, inputs the most recent short-period explanatory variable group corresponding to the third explanatory variable group into the decision tree learning algorithm based on the prediction model and calculates the predicted value of the prediction target type at the next time point.

According to the prediction program, device, and method of the present invention, the prediction accuracy of a prediction model can be improved, even with supervised learning such as a decision tree learning algorithm and even when the teacher data of the most recent short period has changed relative to the huge amount of teacher data from a long past period. In particular, when the invention is applied to a water treatment system, the chemical injection rate can be determined in response to sudden changes in raw water turbidity caused by heavy rain, algae growth, and the like, even though the prediction model is built from the huge amount of chemical injection rate data accumulated over a long period of operation.

FIG. 1 is a system configuration diagram of a water treatment system in which the prediction server of the present invention is installed.
FIG. 2 is a functional configuration diagram of the prediction server of the present invention.
FIG. 3 is a flowchart showing the prediction method of the present invention.
FIG. 4 is a time series chart showing the learning time point and the prediction time point in the present invention.
FIG. 5 is a functional configuration diagram of a random forest.
FIG. 6 is an explanatory diagram showing learning-stage data in the water treatment system.
FIG. 7 is a screen view showing the chemical injection guidance displayed on the operator's terminal in the present invention.
FIG. 8 is a graph showing the prediction accuracy of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a system configuration diagram of a water treatment system in which the prediction server of the present invention is installed.

The water treatment system (water purification plant) in FIG. 1 uses the rapid filtration method. In rapid filtration, fine turbidity, bacteria, and the like in the water are coagulated and settled with chemicals, and the supernatant is then filtered by passing it at a rate of 120 to 180 m/day through a filtration basin lined with gravel and sand. The method is suited to treating relatively turbid river and lake water and can treat a large volume of water even on a small site.

According to FIG. 1, sensing data is acquired and chemical injection rates are controlled in the following water treatment processes.
[Raw water]
  (Raw water temperature sensor) -> sensing data to the prediction server
  (Raw water turbidity sensor) -> sensing data to the prediction server
[Biological treatment basin]
  (Pretreated water turbidity sensor) -> sensing data to the prediction server
  (Pretreated water pH sensor) -> sensing data to the prediction server
[Activated carbon mixing basin]
  <- (Pre-hypochlorite injection device) <- chemical injection rate from the prediction server
     -> chemical injection data to the prediction server
  <- (Activated carbon injection device) <- chemical injection rate from the prediction server
     -> chemical injection data to the prediction server
[Coagulation mixing basin]
  <- (PAC injection device) <- chemical injection rate from the prediction server
     -> chemical injection data to the prediction server
  <- (Intermediate hypochlorite injection device) <- chemical injection rate from the prediction server
     -> chemical injection data to the prediction server
  (Intermediate-treatment free residual chlorine sensor) -> sensing data to the prediction server
[Rapid filtration basin]
  (Filtration basin water pressure sensor) -> sensing data to the prediction server
  (Filtration basin filtration time meter) -> sensing data to the prediction server
  (Purified water turbidity sensor) -> sensing data to the prediction server
[Clear water basin]
[Distribution reservoir]
  (Distribution reservoir water level sensor) -> sensing data to the prediction server
Besides the chemical injection rate, the water intake volume or the water distribution volume may also be a controlled quantity.

With the water treatment system of FIG. 1, a prediction model suited to the particular water purification plant can be created by having the prediction server learn the operational data (sensing data and chemical injection data) of a long past period. This makes it possible to predict, for example, the chemical injection rate one hour from the present time. The approach can be applied regardless of the structure or scale of the plant, and the chemical injection rate can be controlled using the experience that the operators have accumulated over a long past period.

In operating the water purification plant, the analog signals of the various sensors are aggregated by a telemeter or a PLC (Programmable Logic Controller) serving as a controller. The processes of the water treatment system are handled by SCADA (Supervisory Control And Data Acquisition), an industrial control system that performs computer-based system monitoring and process control. SCADA is linked to an HMI (Human Machine Interface) and presents the operational data to on-site operators. Because big data analysis requires sensing values to be sent promptly to a management server in the cloud, the data is collected from the PLC and sent to the cloud server in a way that does not affect the existing equipment.

The prediction server, program, and method of the present invention are not limited to the water treatment system embodiment and can be applied to solution systems for a variety of uses. In the following, the embodiment is described as applied to a water treatment system.

The prediction server 1 of the present invention operates in two distinct stages, a learning stage and an operation stage.
In the learning stage, a prediction model for the decision tree learning engine is created from teacher data (explanatory variables) of a long past period.
In the operation stage, the decision tree learning engine with the created prediction model predicts future data from the data (explanatory variables) of the most recent short period.

FIG. 2 is a functional configuration diagram of the prediction server of the present invention.
FIG. 3 is a flowchart showing the prediction method of the present invention.
FIG. 4 is a time series chart showing the learning time point and the prediction time point in the present invention.

According to FIG. 2, the prediction server 1 has, for the operation stage, a data receiving unit 111, a decision tree learning engine 110, and a prediction result transmitting unit 112. For the learning stage, it has a database 120, a first explanatory variable extraction unit 121, a second explanatory variable extraction unit 122, a third explanatory variable extraction unit 123, and a prediction model creation unit 124. These functional units are realized by executing a program that causes the computer mounted on the device to function. The processing flow of these functional units can also be understood as a prediction method.

[Decision Tree Learning Engine 110]
The decision tree learning engine 110 uses a random forest as its machine learning algorithm. A random forest is an ensemble learning algorithm that uses decision trees as weak learners and builds a large number of decision trees, each trained on randomly sampled explanatory variables. The random forest is used because it can calculate the contribution (importance) of each explanatory variable. In the present invention, any decision tree learning engine may be used, provided that its algorithm can calculate the contribution (importance) of the explanatory variables.

FIG. 5 is a functional configuration diagram of a random forest.

A random forest divides a large number of explanatory variables into B subsamples by bootstrap sampling. For each subsample, a learning model is generated with a decision tree machine learning algorithm, which has a low computational load. The outputs of the B resulting models are then combined by a simple majority vote or by averaging. This yields a machine learning model that is robust to noise and performs well. Bootstrap sampling produces a group of weakly correlated decision trees, which avoids bias in learning. A random forest works well even with a very large number of explanatory variables. In addition, because the decision tree for each subsample is trained completely independently, the training can be parallelized and executed at high speed.
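The bootstrap-and-combine idea described above can be sketched in a few lines. This is a deliberately simplified illustration, not a full random forest: each "tree" is a one-split regression stump on a single randomly chosen feature, and the names `fit_stump`, `fit_forest`, and `predict` are hypothetical. It is meant only to show B bootstrap samples, independent per-sample training, and averaging of the outputs.

```python
import random
from statistics import mean
from typing import List, Tuple

Row = List[float]
Stump = Tuple[int, float, float, float]  # (feature, threshold, left mean, right mean)


def fit_stump(rows: List[Row], ys: List[float], feature: int) -> Stump:
    """One-split regression tree on one feature, minimising squared error."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, ys) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, ys) if r[feature] > threshold]
        if not right:
            continue  # the largest value cannot split the sample
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (feature, threshold, lm, rm))
    if best is None:  # feature is constant in this sample: predict the mean
        m = mean(ys)
        return (feature, float("inf"), m, m)
    return best[1]


def fit_forest(rows: List[Row], ys: List[float], n_trees: int = 25,
               seed: int = 0) -> List[Stump]:
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(n_features)         # random feature for this tree
        forest.append(fit_stump([rows[i] for i in idx],
                                [ys[i] for i in idx], feature))
    return forest


def predict(forest: List[Stump], row: Row) -> float:
    """Average the trees' outputs; a classifier would take a majority vote."""
    return mean(lm if row[f] <= t else rm for f, t, lm, rm in forest)
```

Because each stump depends only on its own bootstrap sample, the loop in `fit_forest` could be distributed across workers without changing the result, which is the independence property the paragraph above relies on.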

For the random forest decision tree learning algorithm, the randomForest function of R's {randomForest} package can be used, for example. Even when the random forest is a classification model, the contribution (importance) of each explanatory variable can be calculated just as for regression. The importance function can be used for this.

<Learning stage S12>
[Database 120]
The database 120 stores the entire group of explanatory variables of different types acquired in time series over a long past period. Specifically, the explanatory variables are operational data (sensing data and chemical injection rates) measured and controlled every hour, and the learning period is, for example, the past year (a measurement period in units of years).

An explanatory variable is a measured value, an integrated value of a plurality of measured values, or an average value of a plurality of measured values.
When sensing data or chemical injection data of any type has not been recorded at a given time, the explanatory variable at that time is interpolated with the average of the values of that type measured in the short period closest to that time.

[First explanatory variable extraction unit 121 (S121)]
The first explanatory variable extraction unit 121 calculates the contribution of each explanatory variable by the decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracts a first predetermined number of explanatory variables in descending order of contribution as the first explanatory variable group.

FIG. 6 is an explanatory diagram showing learning-stage data in the water treatment system.

According to FIG. 6(a), the entire explanatory variable group consists of the sensing data and chemical injection data in the water treatment system of FIG. 1.
The sensing data is one or more of raw water temperature, raw water turbidity, pretreatment water turbidity, pretreatment water pH value, intermediate treatment free residual chlorine value, filtration pond water pressure, filtration pond filtration time, clean water turbidity, and distribution pond water level.
The chemical injection data is one or more of pre-hypochlorous acid injection rate, activated carbon injection rate, PAC (polyaluminum chloride) injection rate, intermediate hypochlorous acid injection rate, aluminum sulfate (sulfate band) injection rate, and post-hypochlorous acid injection rate.
These explanatory variables are input to the decision tree learning engine 110, which calculates the contribution of each explanatory variable. The predicted value here is the chemical injection rate at the next time point.
According to FIG. 6(b), a first explanatory variable group of, for example, 20 variables (the first predetermined number) is extracted from the top in descending order of contribution.
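Taking the top entries in descending order of contribution can be sketched as below. The function name `top_k_by_contribution`, the variable names, and the contribution scores are placeholders invented for illustration; in the described system the scores would come from the decision tree learning engine 110.

```python
from typing import Dict, List


def top_k_by_contribution(contributions: Dict[str, float], k: int) -> List[str]:
    """Return the k variable names with the highest contribution, best first."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Placeholder scores, not values from the patent.
scores = {
    "raw_water_turbidity": 0.31,
    "raw_water_temperature": 0.12,
    "pac_injection_rate": 0.27,
    "distribution_pond_level": 0.02,
}
print(top_k_by_contribution(scores, k=2))
# ['raw_water_turbidity', 'pac_injection_rate']
```

With the first predetermined number set to 20, the call would be `top_k_by_contribution(scores, k=20)` over the entire explanatory variable group.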

[Second explanatory variable extraction unit 122 (S122)]
The second explanatory variable extraction unit 122 acquires, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series during the most recent short period. The most recent short period is based on a measurement period in units of hours, going back from the present time.

According to FIG. 6(c), for each of the 20 explanatory variables in the first explanatory variable group selected in FIG. 6(b), the hourly values acquired over the past 6 hours (the most recent short period) are acquired as explanatory variables. For example, for "raw water turbidity" included in the first explanatory variable group, six explanatory variables are acquired, one for each hour of the most recent 6 hours.
Raw water turbidity -> Raw water turbidity 1h, Raw water turbidity 2h, Raw water turbidity 3h,
Raw water turbidity 4h, Raw water turbidity 5h, Raw water turbidity 6h
If the first explanatory variable group contains 20 variables, taking 6 hours of values for each yields a second explanatory variable group of 20 variables × 6 hours = 120 variables.
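The expansion of each selected variable into hourly lagged variables can be sketched as follows. The function name `expand_lags` and the `name_Nh` naming scheme are assumptions for this sketch, modeled on the "Raw water turbidity 1h ... 6h" example above.

```python
from typing import List


def expand_lags(variables: List[str], hours: int = 6) -> List[str]:
    """Create one lagged variable per hour of the recent period for each variable."""
    return [f"{name}_{h}h" for name in variables for h in range(1, hours + 1)]


print(expand_lags(["raw_water_turbidity"]))
# ['raw_water_turbidity_1h', 'raw_water_turbidity_2h', 'raw_water_turbidity_3h',
#  'raw_water_turbidity_4h', 'raw_water_turbidity_5h', 'raw_water_turbidity_6h']

# 20 selected variables x 6 hours gives 120 lagged variables.
first_group = [f"var{i:02d}" for i in range(1, 21)]  # placeholder names
print(len(expand_lags(first_group)))  # 120
```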

[Third explanatory variable extraction unit 123 (S123)]
The third explanatory variable extraction unit 123 calculates the contribution of each explanatory variable by the decision tree learning algorithm, using the second explanatory variable group as teacher data, and extracts a third explanatory variable group consisting of a second predetermined number of explanatory variables taken from the top in descending order of contribution.

The second explanatory variable group of 120 variables in FIG. 6(c) is input to the decision tree learning engine 110, which calculates the contribution of each explanatory variable.
According to FIG. 6(d), a third explanatory variable group of, for example, 40 variables (the second predetermined number) is extracted from the top in descending order of contribution.

[Prediction model creation unit 124 (S124)]
The prediction model creation unit 124 creates a prediction model in the decision tree learning algorithm, using the third explanatory variable group as teacher data.

The numbers of explanatory variables preferably satisfy, for example, the following relationships.
Number in the entire explanatory variable group > Number in the first explanatory variable group
Number in the first explanatory variable group << Number in the second explanatory variable group
Number in the second explanatory variable group >> Number in the third explanatory variable group
Number in the first explanatory variable group < Number in the third explanatory variable group
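As a quick arithmetic check, the example counts given in this description (about 150 types in the entire group, 20 in the first group, 120 in the second, 40 in the third) satisfy these relationships. The sketch below treats "<<" and ">>" as plain inequalities, since the description gives no numeric threshold for "much smaller" or "much larger"; the dictionary keys and the function name are labels chosen for this sketch.

```python
from typing import Dict

# Example counts taken from the description.
sizes = {"all": 150, "first": 20, "second": 120, "third": 40}


def satisfies_preferred_relations(s: Dict[str, int]) -> bool:
    """Check the four preferred size relationships between the variable groups."""
    return (
        s["all"] > s["first"]
        and s["first"] < s["second"]   # "<<" treated as "<" (assumption)
        and s["second"] > s["third"]   # ">>" treated as ">" (assumption)
        and s["first"] < s["third"]
    )


print(satisfies_preferred_relations(sizes))  # True
```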

<Operation stage S11>
[Data receiving unit 111 (S111)]
The data receiving unit 111 receives data (explanatory variables). In the prediction server 1 for a water treatment system, sensing data and chemical injection rates are received sequentially as explanatory variables from sensors or controllers installed in each process of the water treatment system. The data is then output to the decision tree learning engine 110 and the database 120.

[Decision Tree Learning Engine 110 (S110)]
In the operation stage, the decision tree learning engine 110 receives in real time, as input, the explanatory variable group of the most recent short period corresponding to the third explanatory variable group. The decision tree learning algorithm is based on the prediction model created in the learning stage S12. Referring to S110 in FIG. 4, the third explanatory variable group corresponding to the past six hours before the current measurement time point (operation time point) is input.

The predicted value of the prediction target type at the next time point, one hour later, is then calculated. The prediction target type is the "chemical injection rate" in the water treatment system. Specifically, the chemical injection rate at a future prediction time point can be predicted using past sensing data and chemical injection rates.
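The operation-stage input can be sketched as follows: the most recent hourly measurements are arranged into the lagged variables selected during learning and passed to the prediction model. Everything named here is hypothetical (`build_input`, `predict_next_hour`, the `name_Nh` key format, and the sample values), and the `model` argument is a placeholder callable standing in for the trained decision tree model, which is not reproduced here.

```python
from typing import Callable, Dict, List, Sequence


def build_input(history: Dict[str, Sequence[float]], selected: List[str]) -> List[float]:
    """Pick the lagged values named in `selected` from hourly histories.

    history[name] holds hourly values, oldest first; "name_1h" is taken to be
    the most recent value, "name_2h" the one before it, and so on (assumption).
    """
    row = []
    for key in selected:
        name, lag = key.rsplit("_", 1)      # "raw_water_turbidity_3h" -> name, "3h"
        hours_back = int(lag.rstrip("h"))
        row.append(history[name][-hours_back])
    return row


def predict_next_hour(model: Callable[[List[float]], float],
                      history: Dict[str, Sequence[float]],
                      selected: List[str]) -> float:
    """Feed the selected lagged variables to the model and return its prediction."""
    return model(build_input(history, selected))


# Invented six-hour histories and a small selected subset.
history = {
    "raw_water_turbidity": [3.0, 3.5, 4.0, 6.0, 9.0, 12.0],
    "pac_injection_rate": [20.0, 20.0, 21.0, 23.0, 26.0, 30.0],
}
selected = ["raw_water_turbidity_1h", "raw_water_turbidity_2h", "pac_injection_rate_1h"]
print(build_input(history, selected))  # [12.0, 9.0, 30.0]

# Placeholder model: repeats the latest injection rate (not a decision tree).
print(predict_next_hour(lambda row: row[-1], history, selected))  # 30.0
```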

[Prediction Result Transmitter 112 (S112)]
The prediction result transmission unit 112 transmits the predicted chemical injection data (chemical injection rate) to the terminal 4 operated by the user. By viewing the chemical injection rate displayed on the terminal 4, an operator of the water treatment system can run the plant at the level of an experienced operator even without being a skilled engineer. In addition, transmitting the prediction result to a controller that injects chemicals automatically makes it possible to control the chemical injection rate.

FIG. 7 is a screen diagram showing the chemical injection guidance displayed on the operator terminal in the present invention.
According to FIG. 7, the screen of the terminal 4 displays the injection rate obtained as a prediction result for each of "PAC injection rate", "activated carbon injection rate", "pre-hypochlorous acid injection rate", and "intermediate hypochlorous acid injection rate".

FIG. 8 is a graph showing the prediction accuracy in the present invention.

FIG. 8 shows, for a water purification plant with a treatment capacity of about 20,000 m³/day, that controlling the chemical injection rate according to the present invention, based on one year of past operation data, yields predicted values close to the actual values. The entire explanatory variable group comprises about 150 types, and the learning data is sampled at one-hour intervals. The chemical injection rate one hour after the present time is predicted. Three types of chemical injection rate are predicted.

For example, the prediction accuracy of a conventional regression model is about 90%. In contrast, according to the present invention, the prediction accuracy is about 97%, which is sufficient to cope with sudden changes in turbidity. The prediction accuracy was calculated by comparing past operation values with the prediction results of the present invention, using the following formula.
Prediction accuracy = 100 − |actual value − predicted value| / actual value × 100
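The formula above translates directly into code for a single pair of values. The function name `prediction_accuracy` and the sample numbers are illustrative, and the guard against a zero actual value is an addition of this sketch; the description does not say how per-point accuracies are aggregated into the reported figures.

```python
def prediction_accuracy(actual: float, predicted: float) -> float:
    """Prediction accuracy in percent: 100 - |actual - predicted| / actual * 100."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return 100.0 - abs(actual - predicted) / actual * 100.0


# Illustrative values: an actual injection rate of 20 and a prediction of 19
# differ by 5% of the actual value, giving an accuracy of about 95%.
print(prediction_accuracy(actual=20.0, predicted=19.0))
```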

As described in detail above, according to the prediction program, device, and method of the present invention, the prediction accuracy of the prediction model can be improved even in supervised learning such as a decision tree learning algorithm, and even when the teacher data of the most recent short period changes relative to the huge amount of teacher data of the long past period. In particular, when applied to a water treatment system, even if the prediction model is built on a huge amount of chemical injection rates from a long past period, the chemical injection rate can be determined in response to sudden changes in raw water turbidity caused by heavy rain, algae growth, and the like.

With respect to the various embodiments of the present invention described above, various changes, modifications, and omissions within the scope of the technical idea and viewpoint of the present invention can easily be made by those skilled in the art. The above description is merely an example and is not intended to be restrictive. The present invention is limited only as defined in the claims and their equivalents.

DESCRIPTION OF SYMBOLS
1 Prediction server
110 Decision tree learning engine
111 Data receiving unit
112 Prediction result transmission unit
120 Database
121 First explanatory variable extraction unit
122 Second explanatory variable extraction unit
123 Third explanatory variable extraction unit
124 Prediction model creation unit
2 Water treatment system
21 Biological treatment pond
22 Activated carbon mixing pond
23 Coagulation mixing pond
24 Rapid filtration pond
25 Clean water pond
26 Distribution pond
3 Controller
4 Operator terminal

Claims

A prediction program that causes a computer, which stores an entire explanatory variable group of different types of explanatory variables acquired in time series over a long past period, to calculate a predicted value of a prediction target type at the next time point, the program causing the computer to execute:
as a learning stage,
a first step of calculating a contribution for each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first explanatory variable group of a first predetermined number from the top in descending order of contribution;
a second step of acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series during the most recent short period;
a third step of calculating a contribution for each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a third explanatory variable group of a second predetermined number from the top in descending order of contribution; and
a fourth step of creating a prediction model in the decision tree learning algorithm using the third explanatory variable group as teacher data; and
as an operation stage, inputting the explanatory variable group of the most recent short period corresponding to the third explanatory variable group into the decision tree learning algorithm based on the prediction model, and calculating the predicted value of the prediction target type at the next time point.

The prediction program according to claim 1, wherein the program causes the computer to function such that the decision tree learning algorithm is a random forest.

The prediction program according to claim 1, wherein the program causes the computer to function such that:
the explanatory variable is a measured value, an integrated value of a plurality of measured values, or an average value of a plurality of measured values;
the long past period is based on a measurement period in units of years; and
the most recent short period is based on a measurement period in units of hours going back from the present time.

The prediction program according to any one of claims 1 to 3, wherein the program causes the computer to function such that:
the explanatory variables are sensing data and chemical injection data in a water treatment system; and
the predicted value is chemical injection data at the next time point.

The prediction program according to claim 4, wherein the program causes the computer to function such that:
the sensing data is one or more of raw water temperature, raw water turbidity, pretreatment water turbidity, pretreatment water pH value, intermediate treatment free residual chlorine value, filtration pond water pressure, filtration pond filtration time, clean water turbidity, and distribution pond water level; and
the chemical injection data is one or more of pre-hypochlorous acid injection rate, activated carbon injection rate, PAC (polyaluminum chloride) injection rate, intermediate hypochlorous acid injection rate, aluminum sulfate (sulfate band) injection rate, and post-hypochlorous acid injection rate.

The prediction program according to claim 4, wherein the program causes the computer to function such that, when the sensing data or the chemical injection data of any type is not recorded at a predetermined time, the explanatory variable of that type at the predetermined time is interpolated with the average of the measured values of that type in the most recent short period relative to the predetermined time.

A prediction server for water treatment in which a computer executes the prediction program according to any one of claims 4 to 6, the prediction server further comprising:
data receiving means for sequentially receiving the sensing data and the chemical injection data from sensors or controllers installed in each process of the water treatment system; and
prediction result transmission means for transmitting the predicted chemical injection data to a user-operated terminal.

A prediction device that stores an entire explanatory variable group of different types of explanatory variables acquired in time series over a long past period and calculates a predicted value of a prediction target type at the next time point, the prediction device comprising:
as a learning stage,
first explanatory variable group extraction means for calculating a contribution for each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first explanatory variable group of a first predetermined number from the top in descending order of contribution;
second explanatory variable group extraction means for acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series during the most recent short period;
third explanatory variable group extraction means for calculating a contribution for each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a third explanatory variable group of a second predetermined number from the top in descending order of contribution; and
prediction model creation means for creating a prediction model in the decision tree learning algorithm using the third explanatory variable group as teacher data,
wherein, as an operation stage, an explanatory variable group corresponding to the third explanatory variable group is input to the decision tree learning algorithm based on the prediction model, and the predicted value of the prediction target type at the next time point is calculated.

A prediction method for a device that stores an entire explanatory variable group of different types of explanatory variables acquired in time series over a long past period and calculates a predicted value of a prediction target type at the next time point, wherein
the device executes:
as a learning stage,
a first step of calculating a contribution for each explanatory variable by a decision tree learning algorithm using the entire explanatory variable group as teacher data, and extracting a first explanatory variable group of a first predetermined number from the top in descending order of contribution;
a second step of acquiring, for each explanatory variable included in the first explanatory variable group, a second explanatory variable group acquired in time series during the most recent short period;
a third step of calculating a contribution for each explanatory variable by the decision tree learning algorithm using the second explanatory variable group as teacher data, and extracting a third explanatory variable group of a second predetermined number from the top in descending order of contribution; and
a fourth step of creating a prediction model in the decision tree learning algorithm using the third explanatory variable group as teacher data; and
as an operation stage, inputting an explanatory variable group corresponding to the third explanatory variable group into the decision tree learning algorithm based on the prediction model, and calculating the predicted value of the prediction target type at the next time point.